
THE PSYCHOLOGY
OF LANGUAGE
Now in full color, this fully revised edition of the best-selling textbook provides an up-to-date and
comprehensive introduction to the psychology of language for undergraduates, postgraduates, and researchers.
It contains everything the student needs to know about how we acquire, understand, produce, and store
language.
Whilst maintaining both the structure of the previous editions and the emphasis on cognitive
processing, this fourth edition has been thoroughly updated to include:
• the latest research, including recent results from the fast-moving field of brain imaging and studies
• updated coverage of key ideas and models
• an expanded glossary
• more real-life examples and illustrations.
The Psychology of Language, Fourth Edition is praised for describing complex ideas in a clear and
approachable style, and assumes no prior knowledge other than a grounding in the basic concepts of
cognitive psychology. It will be essential reading for advanced undergraduate and graduate students
of cognition, psycholinguistics, or the psychology of language. It will also be useful for those on
speech and language therapy courses.
The book is supported by a companion website featuring a range of helpful supplementary resources
for both students and lecturers.
Trevor A. Harley is Dean of Psychology and Chair of Cognitive Psychology at the University of Dundee,
Scotland. He was an undergraduate at the University of Cambridge, where he was also a PhD student,
completing a thesis on slips of the tongue and what they tell us about speech production. He moved to
Dundee from the University of Warwick in 1996. His research interests include speech production, how
we represent meaning, and the effects of aging on language.
THE
PSYCHOLOGY
OF LANGUAGE
FROM DATA TO THEORY
FOURTH EDITION
TREVOR A. HARLEY
Psychology Press
Taylor & Francis Group
LONDON AND NEW YORK
Fourth edition published 2014
by Psychology Press
27 Church Road, Hove, East Sussex BN3 2FA
and by Psychology Press
711 Third Avenue, New York, NY 10017
Psychology Press is an imprint of the Taylor & Francis Group, an informa business
© 2014 Psychology Press
The right of Trevor A. Harley to be identified as author of this work has been asserted by him in
accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by
any electronic, mechanical, or other means, now known or hereafter invented, including photocopying
and recording, or in any information storage or retrieval system, without permission in writing from the
publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
First edition published by Psychology Press 1995
Third edition published by Psychology Press 2008
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
Harley, Trevor A.
The psychology of language: from data to theory / Trevor A. Harley.—Fourth edition.
pages cm
Includes bibliographical references and index.
1. Psycholinguistics. I. Title.
BF455.H2713 2014
401′.9—dc23
2013022343
ISBN: 978-1-84872-088-6 (hbk)

ISBN: 978-1-84872-089-3 (pbk)
ISBN: 978-1-315-85901-9 (ebk)
Typeset in Times
by Book Now Ltd, London
CONTENTS

Preface to the fourth edition ix
Illustration credits xi
How to use this book xiii

SECTION A: INTRODUCTION 1

1. The study of language 3
    Introduction 3
    Why study language and why is it so difficult? 4
    What is language? 5
    How has language changed over time? 7
    What is language for? 9
    The history and methods of psycholinguistics 9
    Models in psycholinguistics 16
    Language and the brain 17
    Themes and controversies 22
    Summary 27
    Further reading 28

2. Describing language 30
    Introduction 30
    How to describe speech sounds 30
    Consonants 33
    Vowels 35
    Syllables 35
    Linguistic approaches to syntax 36
    Summary 46
    Questions to think about 46
    Further reading 47

SECTION B: THE BIOLOGICAL AND DEVELOPMENTAL BASES OF LANGUAGE 49

3. The foundations of language 51
    Introduction 51
    Where did language come from? 51
    Do animals have language? 54
    The biological basis of language 67
    Is there a critical period for language development? 73
    The cognitive basis of language 80
    The social basis of language 83
    The language development of visually and hearing-impaired children 85
    What is the relation between language and thought? 88
    Summary 100
    Questions to think about 101
    Further reading 101

4. Language development 104
    Introduction 104
    What drives language development? 105
    The language acquisition device 111
    How children develop language 118
    Phonological development 120
    Lexical and semantic development 125
    Syntactic development 136
    Summary 150
    Further reading 151

5. Bilingualism and second language acquisition 153
    Introduction 153
    Bilingualism 153
    Second language acquisition 158
    Evaluation of work on bilingualism and second language acquisition 162
    Summary 162
    Questions to think about 163
    Further reading 163

SECTION C: WORD RECOGNITION 165

6. Recognizing visual words 167
    Introduction 167
    Basic methods and findings 168
    What makes word recognition easier (or harder)? 171
    Attentional processes in visual word recognition 177
    Do different tasks give consistent results? 180
    Is there a dedicated visual word recognition system? 183
    Meaning-based facilitation of visual word recognition 185
    Processing morphologically complex words 190
    Models of visual word recognition 192
    Coping with lexical ambiguity 198
    Summary 207
    Questions to think about 208
    Further reading 208

7. Reading 209
    Introduction 209
    The writing system 209
    A preliminary model of reading 210
    The processes of normal reading 212
    The neuroscience of adult reading disorders 220
    Models of word naming 227
    Connectionist models of dyslexia 233
    Comparison of models 237
    Summary 239
    Questions to think about 240
    Further reading 240

8. Learning to read and spell 241
    Introduction 241
    Normal reading development 241
    Phonological awareness 243
    How should reading be taught? 247
    Learning to spell 248
    Developmental dyslexia 249
    Summary 256
    Questions to think about 256
    Further reading 256

9. Understanding speech 258
    Introduction 258
    Recognizing speech 258
    Models of speech recognition 267
    The neuroscience of spoken word recognition 281
    Summary 282
    Questions to think about 283
    Further reading 283

SECTION D: MEANING AND USING LANGUAGE 285

10. Understanding the structure of sentences 287
    Introduction 287
    Dealing with structural ambiguity 288
    Early work on parsing 291
    Processing structural ambiguity 295
    Gaps, traces, and unbounded dependencies 310
    The neuroscience of parsing 312
    Summary 316
    Questions to think about 317
    Further reading 317

11. Word meaning 319
    Introduction 319
    Classic approaches to semantics 321
    Semantic networks 322
    Semantic features 325
    Family resemblance models 333
    Combining concepts 336
    Figurative language 337
    The neuroscience of semantics 339
    Connectionist approaches to semantics 351
    Summary 357
    Questions to think about 358
    Further reading 358

12. Comprehension 360
    Introduction 360
    Memory for text and inferences 362
    Reference and ambiguity 372
    Models of text processing 377
    Individual differences in comprehension skills 386
    The neuroscience of text processing 388
    Summary 390
    Questions to think about 391
    Further reading 391

SECTION E: PRODUCTION AND OTHER ASPECTS OF LANGUAGE 393

13. Language production 395
    Introduction 395
    Slips of the tongue 396
    Syntactic planning 402
    Lexicalization 410
    Phonological encoding 426
    The analysis of hesitations 430
    The neuroscience of speech production 433
    Writing and agraphia 444
    Summary 446
    Questions to think about 447
    Further reading 447

14. How do we use language? 449
    Introduction 449
    Making inferences in conversation 449
    The structure of conversation 453
    Collaboration in dialog 454
    Sound and vision 456
    Summary 458
    Questions to think about 459
    Further reading 459

15. The structure of the language system 460
    Introduction 460
    What are the modules of language? 461
    How many lexicons are there? 462
    Language and short-term memory 468
    Summary 473
    Questions to think about 474
    Further reading 474

16. New directions 475
    Introduction 475
    Themes in psycholinguistics revisited 475
    Some growth areas? 477
    Conclusion 480

Appendix: Connectionism 481
    Interactive activation models 481
    Back-propagation 483
    Further reading 485

Glossary 486
Example of sentence analysis 494
References 495
Author index 569
Subject index 590
PREFACE TO THE FOURTH
EDITION
I started writing this fourth edition with mixed feelings. On the positive side, it is an honor and
a delight to be able to write the fourth edition of something. It must also mean that someone is
reading it. I also welcomed the chance to make the book better in every way. On the less positive
side, it is a huge amount of work.

Apart from updating references and key ideas and models, I have two main aims in this new edition.

Students often find cognition in general difficult and say it is the part of their psychology degree
that they like least, but the psychology of language in particular is feared and disliked. I have, I’m
almost ashamed to say, only really come to appreciate how much many students dislike it over the
last few years. I can’t help feeling a bit responsible for this fear: one fair criticism of previous
editions of this book is that students find it difficult. It contains a lot of material, perhaps too
much. (For those struggling I am biased, but I recommend reading my own book Talking the Talk
(Harley, 2010) first.) What is more, there is a balance to be had between making texts informative
with respect to sources (and of course avoiding plagiarism and giving due credit) but making them so
reference dense it puts the student off. I fear earlier editions have been reference dense, so I’ve
tried to be lighter in this edition. (This strategy is not without its risk, so if any author or
researcher feels I have slighted them, please let me know.)

Therefore my first aim is to make this edition easier and more approachable, and to try to stimulate
students into finding psycholinguistics interesting and important. I try to do this explicitly in the
first chapter, but you can’t persuade someone something is good just by telling them; you have to
show it. The resulting book is a compromise between making the subject fun and relevant and depth
and perhaps even rigor of coverage. I have learned that you can’t please all reviewers, so though
some teachers will approve the easier approach, others might bemoan the lack of detail that was in
the earlier editions.

Why do students dislike the subject and find it difficult? I think there are several reasons. First,
it seems very abstract. I have tried to point out as many applications of the subject as possible,
and to give as many concrete examples as I can. Second, they think the subject is full of jargon,
which it is. I am surprised to discover how many students are unclear what a noun is, so no wonder
they find parsing difficult. I have therefore tried to reduce the jargon and make sure all terms are
explained. There is a glossary that should help. Third, perhaps most oddly, they don’t like or see
the point in models, and psycholinguistics has more models per square page than any other
discipline I know. Fourth, psycholinguists rarely come to definitive conclusions: usually at any one
time in any one area there are two opposing models out there battling it out. I’ve tried to stress
why models are important, and point out that in cutting-edge science we sometimes have to live with
uncertainty.

The field has changed a great deal over the last few years as a result of results from brain
imaging, particularly fMRI studies. My second aim therefore is to incorporate as much as is possible
of this exciting new research into the book where relevant. Some might know that I am skeptical
about what brain imaging can offer cognitive psychology; I have tried not to let this skepticism
affect this revision. Most researchers believe that brain imaging has greatly advanced our
understanding of psycholinguistics over the last decade.

Technology has changed for the better, too, making writing books much easier. Writing the first
edition involved constant trips to the library and much photocopying. In this edition I could read
every reference I wanted at the luxury of my desk thanks to Google and electronic journals. I wrote
the first draft of this book using the wonderful Scrivener 2.0 on a Mac, and then finished it in
Pages.

There is a website associated with this book. It contains links to other pages, details of
important recent work, and a “hot link” to contact me. It is to be found at:
http://www.psypress.com/cw/harley. I still welcome any corrections, suggestions for the next
edition, or discussion on any topic. My email address is now: t.a.harley@dundee.ac.uk. Suggestions
on topics I have omitted or under-represented would be particularly welcome. The hardest bit of
writing this book has been deciding what to leave out. I am sure that people running other courses
will cover some material in much more detail than has been possible to provide here. I would be
interested to hear, however, of any major differences of emphasis. If the new edition is as
successful as the third, I will be looking forward (in a strange sort of way) to producing the fifth
edition in five years’ time.

I would like to thank all those who have made suggestions about one or more of the previous
editions, particularly Jeanette Altarriba, Gerry Altmann, Elizabeth Bates, Paul Bloom, Helen Bown,
Peer Broeder, Gordon Brown, Hugh Buckingham, Annette de Groot, Lynne Duncan, the Dundee
Psycholinguistics Discussion Group, Andy Ellis, Gerry Griffin, Zenzi Griffin, Francois Grosjean,
Evan Heit, Laorag Hunter, Lesley Jessiman, Barbara Kaup, Alan Kennedy, Kathryn Kohnert, Annukka
Lindell, Nick Lund, Siobhan MacAndrew, Nadine Martin, Randi Martin, Elizabeth Maylor, Don Mitchell,
Wayne Murray, Lyndsey Nickels, Jane Oakhill, Padraig O’Seaghdha, Shirley-Anne Paul, Martin
Pickering, Julian Pine, Ursula Pool, Eleanor Saffran, Lynn Santelmann, Marcus Taft, Jeremy Tree,
Roger van Gompel, Carel van Wijk, Alan Wilkes, Beth Wilson, Suzanne Zeedyk, and Pienie Zwitserlood.
I would also like to thank several anonymous reviewers for their comments; hopefully you know who
you are. Numerous people pointed out minor errors and asked questions: I thank them all. George
Dunbar created the sound spectrogram for Figure 2.1 using MacSpeechLab. Lila Gleitman gave me the
very first line; thanks! Katie Edwards, Pam Miller, and Denise Jackson helped me to obtain a great
deal of material, often at very short notice. This book would be much worse without the help of all
these people. I am of course responsible for any errors or omissions that remain. If there is anyone
else I have forgotten, please accept my apologies. Many people have suggested things that I have
thought about and decided not to implement, and many people have suggested things (more
connectionism, less connectionism, leave that in, take that out, move that bit there, leave it
there) that are the opposite of what others have suggested.

In particular the writing of this edition was made immeasurably easier by spending time in the
glorious environment of the University of California, San Diego. I wish to thank everyone there from
the bottom of my heart, particularly my hosts Tamar Gollan and Vic Ferreira.

I would also like to thank Psychology Press for all their help and enthusiasm for this project.
Finally, I would like to thank Brian Butterworth, who supervised my PhD. He probably doesn’t realize
how much I appreciated his help; without him, this book might never have existed.

Finally, I hope that any bias there is in this book will appear to be the consequence of the
consideration of evidence rather than of prejudice.

Professor Trevor A. Harley
School of Psychology
University of Dundee
Dundee DD1 4HN
Scotland
t.a.harley@dundee.ac.uk
February 2013
ILLUSTRATION CREDITS
Chapter 1
Page 6 (top): © Lily Rosen-Zohar/Shutterstock.com. Page 9: © Bettmann/Corbis. Page 14: © Underwood &
Underwood/Corbis. Page 19 (left): Photo supplied by Professor Peter Mitchell, University of
Nottingham. Page 21 (top): © Geoff Tompkinson/Science Photo Library. Page 21 (bottom): © University
of Durham/Simon Fraser/Science Photo Library.

Chapter 2
Page 33 (top): © Shaun Jeffers/Shutterstock.com. Page 36: © Rick Friedman/Corbis.

Chapter 3
Page 52: © David Gifford/Science Photo Library. Page 55 (right): Shutterstock.com. Page 57: © Jill
Lang/Shutterstock.com. Page 61: © Susan Kuklin/Science Photo Library. Page 62: From Savage-Rumbaugh
et al. (1983). Copyright © 1983 by the American Psychological Association. Reprinted with
permission. Page 64: From Savage-Rumbaugh and Lewin (1994). Copyright © 1994 Wiley. Page 66: © Nagel
Photography/Shutterstock.com. Page 71: © Wellcome Dept. of Cognitive Neurology/Science Photo
Library. Page 72: From Hickok and Poeppel (2004). Copyright © 2004. Reproduced by permission of
Elsevier. Page 74: © Maslov Dmitry/Shutterstock.com. Page 78: © Bettmann/Corbis. Page 82:
Shutterstock.com. Page 84: © M. Dominik/zefa/Corbis. Page 86: © Gabe Palmer/Corbis. Page 88: © Louis
Quail/Corbis. Page 92: © Galen Rowell/Corbis.

Chapter 4
Page 107: Shutterstock.com. Page 110: © Darama/Corbis. Page 119: © Sovereign, ISM/Science Photo
Library. Page 123: Shutterstock.com. Page 126: Shutterstock.com. Page 128 (top): © Philip
Date/Shutterstock.com. Page 135: © John Austin/Shutterstock.com. Page 145 (bottom): From Berko
(1958). Reproduced with permission from the International Linguistic Association. Page 149: Photo
supplied by SR Research Ltd.

Chapter 5
Page 158: © Zephyr/Science Photo Library. Page 159 (top): © J. Gerard Sidaner/Science Photo Library.

Chapter 6
Page 167: Shutterstock.com. Page 174 (top): © Thomas M. Perkins/Shutterstock.com. Page 184 (left): ©
Colin Cuthburt/Science Photo Library. Page 184 (right): © Sovereign, ISM/Science Photo Library. Page
193 (top): © Filip Fuxa/Shutterstock.com. Page 197: From McClelland and Rumelhart (1981). Copyright
© 1981 by the American Psychological Association. Reprinted with permission.

Chapter 7
Page 210 (bottom): Shutterstock.com. Page 218: Shutterstock.com. Page 224: © Sovereign, ISM/Science
Photo Library. Page 226 (right): © tamir niv/Shutterstock.com. Page 230 (left): From Harm and
Seidenberg (2001). © 2001 Taylor & Francis.

Chapter 8
Page 248: © Robert Maass/Corbis. Page 249: © Will & Deni Mcintyre/Science Photo Library. Page 255: ©
Seila Terry/Science Photo Library.

Chapter 9
Page 259: © Lane V. Erickson/Shutterstock.com. Page 273: From McClelland, Rumelhart, and the PDP
Research Group (1986). Copyright © 1986 Massachusetts Institute of Technology, by permission of the
MIT Press. Page 275: From McClelland, Rumelhart, and the PDP Research Group (1986). Copyright © 1986
Massachusetts Institute of Technology, by permission of the MIT Press. Page 279: From Norris
(1994b). Copyright © 1994 Elsevier. Reprinted with permission.

Chapter 10
Page 290: © Claudia Steininger/Shutterstock.com. Page 316: From Friederici (2002). Copyright © 2002
Elsevier. Reprinted with permission.

Chapter 11
Page 327: © Anton_Ivanov/Shutterstock.com. Page 330: © Bozena Fulawka/Shutterstock.com. Page 337: ©
Mogens Trolle/Shutterstock.com. Page 339: © PR Michel Zanca/ISM/Science Photo Library. Page 343:
From Sitton, Mozer, and Farah (2000). Copyright © 2000 by the American Psychological Association.
Reprinted with permission. Page 344: From Snodgrass and Vanderwart (1980). Copyright © 1980 by the
American Psychological Association. Reprinted with permission. Page 348: © Alfred Pasieka/Science
Photo Library.

Chapter 12
Page 361: © Tomasz Trojanowski/Shutterstock.com. Page 362: © Bettmann/Corbis. Page 365: From
Bransford and Johnson (1973). Copyright © 1973 Academic Press. Reproduced by permission of Elsevier.
Page 372: © Tim Pannell/Corbis. Page 379: © Roy McMahon/Corbis.

Chapter 13
Page 397: © Bettmann/Corbis. Page 413 (top): From Indefrey and Levelt (2004). Copyright © 2004.
Reproduced by permission of Elsevier. Page 413 (bottom): © Wellcome Dept. of Cognitive
Neurology/Science Photo Library. Page 414: © image100/Corbis. Page 416: From Caramazza (1997).
Copyright © 1997 Psychology Press. Page 419: From Levelt et al. (1991). Copyright © 1991 by the
American Psychological Association. Reprinted with permission. Page 424: From Dell (1986). Copyright
© by the American Psychological Association. Reprinted with permission. Page 436: Reprinted from
Grodzinsky and Friederici (2006). Copyright © 2006, with permission from Elsevier. Page 439: © Bsip,
Mendil/Science Photo Library. Page 441: From Martin et al. (1994). Copyright © 1994 by Academic
Press. Reproduced by permission of Elsevier.

Chapter 14
Page 450 (top): © Mike Watson Images/Corbis. Page 454: © Don Hammond/Design Pics/Corbis. Page 456:
Adapted from Ferreira et al. (2005). Copyright © 2005, with permission from Elsevier.

Chapter 15
Page 463: © Wellcome Dept. of Cognitive Neurology/Science Photo Library.

Chapter 16
Page 476: © Geoff Tompkinson/Science Photo Library. Page 478: © James King-Holmes/Science Photo
Library.
HOW TO USE THIS BOOK
This book is intended to be a stand-alone introduction to the psychology of language. It is my hope
that anyone could pick it up and finish reading it with a rich understanding of how humans use
language. Nevertheless, it would probably be advantageous to have some knowledge of basic cognitive
psychology. (Some suggestions for books to read are given in the “Further reading” section at the
end of Chapter 1.) For example, you should be aware that psychologists have distinguished between
short-term memory (which has limited capacity and can store material for only short durations) and
long-term memory (which is virtually unlimited). I have tried to assume that the reader has no
knowledge of linguistics, although I hope that most readers will be familiar with such concepts as
nouns and verbs. The psychology of language is quite a technical area full of rather daunting
terminology. I have defined technical terms and italicized them when they first appear. There is
also a glossary with short definitions of the technical terms.

Connectionist modeling is now central to modern cognitive psychology. Unfortunately, it is also a
topic that most people find extremely difficult to follow. It is impossible to understand the
details of connectionism without some mathematical sophistication. I have provided an appendix that
covers the basics of connectionism in more mathematical detail than is generally necessary to
understand the main text. The general principles of connectionism can, however, probably be
appreciated without this extra depth, although it is probably a good idea to look at the appendix.

In my opinion and experience, the material in some chapters is more difficult than others. I do not
think that there is anything much that can be done about this, but to persevere. Sometimes
comprehension might be assisted by later material, and sometimes a number of readings might be
necessary to comprehend the material fully. Fortunately, the study of the psychology of language
gives us clues about how to facilitate understanding. Chapters 7 and 11 will be particularly useful
in this respect. It should also be remembered that in some areas researchers do not agree on the
conclusions or on what should be the appropriate method to investigate a problem. Therefore it is
sometimes difficult to say what the “right answer,” or the correct explanation of a phenomenon,
might be. In this respect the psychology of language is still a very young subject.

The book is divided into sections, each covering an important aspect of language. Section A is an
introduction. It describes what language is, and provides essential background for describing
language. It should not be skipped. Section B is about the biological basis of language, the
relationship of language to other cognitive processes, and language development. Section C is about
how we recognize words. Section D is about comprehension: how we understand sentences and discourse.
Section E is about language production, and also about how language interacts with memory. It also
examines the grand design or architecture of the language system. This final section concludes with
a brief look at some possible new directions in the psychology of language.

Each chapter begins with an introduction outlining what the chapter is about and the main problems
faced in each area. Each introduction ends with a summary of what you should know by the end of the
chapter. Each chapter concludes with a list of bullet points that gives a one-sentence summary of
each section in that chapter. This is followed by questions that you can think about either to test
your understanding of the material, or to go beyond what is covered, usually with an emphasis on
applying the material. If you want to follow a topic up in more detail than is covered in the text
(which I think is quite richly referenced, and should be the first place to look), then there are
suggestions for further reading at the very end of each chapter.

One way of reading this book is like a novel: start here and go to the end. Section A should
certainly be read before the others because it introduces many important terms, without which later
going would be very difficult. I certainly recommend starting with Chapter 1. After that,
alternative orders are possible, however. I have tried to make each chapter as self-contained as
possible, so there is no reason why the chapters cannot be read in a different order. Similarly, you
might choose to omit some chapters altogether. In each case you might find you have to refer to the
glossary more often than if you just begin at the beginning. Unless you are interested in just a few
topics, however, I advise reading the whole book through at least once. Each chapter looks at a
major chunk of the study of the psychology of language.

OVERVIEW

Chapter 1 tells you about the subject of the psychology of language. It covers its history and
methods. Chapter 2 provides some important background on language, telling you how we can describe
sounds and the structure of sentences. In essence it is a primer on phonology and syntax.

Chapter 3 is about how language is related to biological and cognitive processes. It looks at the
extent to which language depends on the presence and operation of certain biological, cognitive, and
social precursors in order to be able to develop normally. We will also look at whether animals use
language, or whether they can be taught to do so. This will also help clarify what we mean by
language. We will look at how language is founded in the brain, and how damage to the brain can lead
to distinct types of impairment in language. We will look in detail at the more general role of
language, by examining the relation between language and thought. We will also look at what can be
learned from language acquisition in exceptional circumstances, including the effects of linguistic
deprivation.

Chapter 4 examines how children acquire language, and how language develops throughout childhood.
Chapter 5 examines how bilingual children learn to use two languages.

We will then look in Chapter 6 at what appear to be the simplest or lowest level processes and work
towards more complex ones. Hence we will first look at how we recognize and understand single words.
Although these chapters are largely about recognizing words in isolation in the sense that in most
of the experiments we discuss only one word is present at a time, the influence of the context in
which they are found is an important consideration, and we will look at this also.

Chapter 7 looks at how we recognize words and how we access their meanings. Although the emphasis is
upon visually presented word recognition, many of the findings described in this chapter are
applicable to recognizing spoken words as well. Chapter 8 examines how we read and pronounce words,
and looks at disorders of reading (the dyslexias). It also looks at how we learn to read. Chapter 9
looks at the speech system and how we process speech and identify spoken words.

We then move on to how words are ordered to form sentences. Chapter 10 looks at how we make use of
word order information in understanding sentences. These are issues to do with syntax and parsing.
Chapter 11 examines how we represent the meaning of words. Chapter 12 examines how we comprehend and
represent beyond the sentence level; these are the larger units of discourse or text. In particular,
how do we integrate new information with old to create a coherent representation? How do we store
what we have heard and read?

In Chapter 13 we consider the process in reverse, and examine language production and its disorders.
By this stage we will have an understanding of the processes involved in understanding language, and
these processes must be looked at in a wider context (Chapter 14).

In Chapter 15 we will look at the structure of the language system as a whole, and the relation
between the parts. Finally, Chapter 16 looks at some possible new directions in psycholinguistics.
SECTION A
INTRODUCTION
This section describes what the rest of the book is about, discusses some important themes in the
psychology of language, and examines some important concepts used to describe language. You should
read this section before the others.

Chapter 1, The study of language, looks at the functions of language and how the study of language
plays a major role in helping to understand human behavior. We look at what language is and what it
is used for. After a brief look at the history and methods of psycholinguistics, the chapter covers
some current themes and controversies in modern psycholinguistics, including modularity, innateness,
and the usefulness of brain imaging, and studies involving people with brain damage, for looking at
language.

Chapter 2, Describing language, looks at the building blocks of language: sounds, words, and
sentences. The chapter then examines Chomsky’s approaches to syntax and how these have evolved over
the years.
CHAPTER 1
THE STUDY OF LANGUAGE
INTRODUCTION

What’s the best joke you’ve heard? I find it difficult to remember any (and very few that can be put
into print), but a search through Google of “best joke in the world” throws up this gem:

    A couple of New Jersey hunters are out in the woods when one of them falls to the ground. He
    doesn’t seem to be breathing, his eyes are rolled back in his head. The other guy whips out his
    cell phone and calls the emergency services. He gasps to the operator: “My friend is dead! What
    can I do?” The operator, in a calm soothing voice, says: “Just take it easy. I can help. First,
    let’s make sure he’s dead.” There is a silence, then a shot is heard. The guy’s voice comes back
    on the line. He says: “OK, now what?”

Well, I must admit that one did make me laugh. Why is it funny? Notice how much the joke depends on
language, in every way.

What was the last thing you said? The last thing you heard? The last thing you read? And the last
thing you wrote? How did your brain do these things?

Think of the steps involved in communicating with other people. We obviously must have the necessary
biological hardware: We need an articulatory apparatus that enables us to make the right sort of
sounds, and of course we also need a brain to decide what to say, how to say it, and to make the
components of the articulatory apparatus move at just the right time. We also need a language
complex enough to convey any possible message. We need to know the words and how to put the words in
the right order. Young children somehow acquire this language. Finally, we have to be aware of the
social setting in which we produce and understand these messages: We need to be aware of the
knowledge and beliefs of other people, and have some idea of how they will interpret our utterances.
The subject matter of this book is the psychological processing involved in this sort of behavior.

Although we usually take language for granted, a moment’s reflection will show how important it is
in our lives. In some form or another it so dominates our social and cognitive activity that it
would be difficult to imagine what life would be like without it. Indeed, most of us consider
language to be an essential part of what it means to be human, and it is largely what sets us apart
from other animals. Our culture and technology depend on it. Crystal (2010) describes several
functions of language. The primary purpose of language is of course to communicate, but we can also
use it simply to express emotion (e.g., by swearing), for social interaction (e.g., by saying “bless
you!” when someone sneezes), to make use of its sounds (e.g., in various children’s games), to
attempt to control the environment (e.g., magical spells), to record facts, to think with, and to
express identity (e.g., chanting in demonstrations). We even play with language. Much humor,
particularly punning, depends on being able to manipulate language (Crystal, 1998).
4 A. INTRODUCTION
It is not surprising then that understanding language is an important part of understanding human behavior, with different areas of scientific study emphasizing different aspects of language processing. The study of the anatomy of language emphasizes the components of the articulatory tract, such as the tongue and voice box. Neuroscience examines the role of different parts of the brain in behavior. Linguistics examines language itself. Psycholinguistics is the study of the psychological processes involved in language. Psycholinguists study understanding, producing, and remembering language, and hence are concerned with listening, reading, speaking, writing, and memory for language. They are also interested in how we acquire language, and the way in which it interacts with other psychological systems. Many people think that “psycholinguistics” has a rather dated feel, emphasizing the role of linguistics too much. Although the area might once have been about the psychology of linguistic theory, it is now much more. Still, there is currently no better term, so it will have to do.
One reason why we take language for granted is that we usually use it so effortlessly, and most of the time, so accurately. Indeed, when you listen to someone speaking, or look at this page, you normally cannot help but understand what has been said or what is printed on the page in front of you. It is only in exceptional circumstances that we might become aware of the complexity involved: if for example we are searching for a word but cannot remember it; if a relative or colleague has had a stroke that has affected their language; if we observe a child acquiring language; if we try to learn a second language ourselves as an adult; or if we are visually impaired or hearing impaired, or if we meet someone else who is. And, of course, if you find this book so difficult to understand that you have to keep reading and rereading it to make any sense of it. As we shall see, all of these examples of what might be called “language in exceptional circumstances” reveal much about the processes involved in speaking, listening, writing, and reading. But given that language processes are normally so automatic, we also need to carry out careful experiments to understand what is happening. Modern psycholinguistics is therefore closely related to other areas of cognitive psychology, and relies to a large extent on the same sort of experimental methods. We construct models of what we think is going on from our experimental results; we use observational and experimental data to construct theories. This book will examine some of the experimental findings in psycholinguistics, and the theories that have been proposed to account for them. Generally the phenomena and data to be explained will precede discussion of the models, but it is not always possible to neatly separate data and theories, particularly when experiments are tests of particular theories. I’ll be talking a bit more about models and theories later.
This book has a cognitive emphasis. It is concerned with understanding the processes involved in using and acquiring language. This is not just my personal bias; I believe that all our past experience has shown that the problems of studying human behavior have yielded, and will continue to yield, to investigation by the methods of cognitive psychology and neuroscience.

WHY STUDY LANGUAGE AND WHY IS IT SO DIFFICULT?

Even before I get on to saying what language is, I want to ask why we should study it. Some people (mostly psycholinguists) think the answer is obvious, but in practice many students are often perplexed as to why so much of their psychology course is devoted to the subject. What’s more I’ve noticed that students often find the psychology of language the most difficult part of psychology. It’s often the part they like least (and often actively dislike). So why should we study language?
Well, you’re reading this book right now, aren’t you? Reading words and sentences and making sense of them (or trying to); that’s part of psycholinguistics, for starters. It’s a good bet that you’re pretty good at reading, but you probably know someone who has had some difficulty in learning to read, or even now finds reading and spelling difficult (that is, they have dyslexia). Perhaps you know someone who has had a stroke and now finds reading difficult. More psycholinguistics!
But I bet you’ve listened to the radio or TV today, or listened to music with words (talking,
more psycholinguistics). I’ll be a little surprised if you’ve not talked to anyone at all (speaking, listening; even yet more psycholinguistics). You’ve probably written something too (you get the idea).
But even if by some miracle you haven’t, I bet you’ve heard a voice in your head. The voice in your head probably uses words. In fact it’s hard (I find it impossible) to think about human thought without thinking about language. So thinking, the essence of being human, is completely intertwined with language.
What is more we transmit our learning and culture by language. The major reason civilization has reached its heights, that we live in centrally heated houses with thin computers and cell phones, using social networking sites, is because we have built up a culture and a technology that would have been completely impossible without language. For this reason the evolutionary biologist Martin Nowak (2006) says that language is “the most interesting invention of the last 600 million years” (p. 250). He says that the impact of language is comparable with only a few other events in biological history, such as the evolution of life and the evolution of multi-celled animals.
So here is my list of reasons why the study of the psychology of language is so important:

1. We use language nearly all the time; technology and our cultures would be impossible without it.
2. We usually think in language.
3. Some people have difficulty learning spoken or written language (developmental disorders), or have difficulty with language as a consequence of brain damage (acquired disorders).

We can agree then that studying language is important; but why do so many students find it hard? I think there are several reasons. First, the importance and applications of language are not always made as clear as they might be. If I told you that I could teach you to read a textbook in a way that would guarantee you’d remember it and understand it and get an A in an exam, you’d probably pay attention. (Sadly I can’t, otherwise I would be very rich, although later I will give you some tips.) So in this book I’ve tried to emphasize the importance of the applications of the psychology of language. Second, the subject seems to have a lot of jargon in it, and teachers sometimes forget this or overestimate their students’ knowledge. How can you be expected to understand what a reduced relative clause is when you don’t know what a clause is? Or even aren’t that clear what a noun is? I’ve tried to make life as easy as possible by defining all technical terms, trying to keep jargon to a minimum, and providing a glossary which contains a simple definition of every technical term I can think of. Third, psycholinguists are an argumentative bunch, and rarely seem to agree on anything. Sometimes they can’t even agree whether they agree or not. So there are few situations when we can say “now THAT’s the answer.” And people like answers. They don’t like to be left with the conclusion “it could be this or it could be that and it all depends,” and that’s going to be my conclusion most of the time. But life is full of uncertainties, so get over it and live with it. And the final reason that people find psycholinguistics difficult is because it’s full of models. A colleague once told me that she overheard some students talking in front of her (yes, we love to eavesdrop) and one said to the other “language: it’s just all these models.” Models are the most important thing in science; they’re the closest we get to an explanation. I’ll talk about models below.

WHAT IS LANGUAGE?

It might seem natural at this point to say exactly what is meant by “language,” but to do so is much harder than it first appears. We all have some intuitive notion of what language is; a simple definition might be that it is “a system of symbols and rules that enable us to communicate.” Symbols are things that stand for other things: Words, either written or spoken, are symbols. The rules specify how words are ordered to form sentences. However, providing a strict definition of language is not straightforward. Consider other systems that many think are related to human spoken language. Are the communication systems of monkeys a language? What about the “language” of dolphins, or the “dance” of honey bees that communicates the location of sources of nectar to other bees in the hive? How
does the signing of people with hearing impairment resemble or differ from spoken language? Because of these sorts of complications, many psychologists and linguists think that providing a formal definition of language is a waste of time. We look at whether animals have language and at the characteristics of language in more detail in Chapter 2.

[Photo: Are these elephants communicating using a language?]

We can describe language in a variety of ways: for example, we can talk about the sounds of the language, or the meaning of words, or the grammar that determines which sentences of a language are legitimate. These types of distinctions are fundamental in linguistics, and these different aspects of language have been given special names. We can distinguish between semantics (the study of meaning), syntax (the study of word order), morphology (the study of words and word formation), pragmatics (the study of language use), phonetics (the study of raw sounds), and phonology (the study of how sounds are used within a language) (see Figure 1.1).
Syntax will be described in detail in the next chapter, and semantics in Chapter 11. Morphology is concerned with the way that complex words are made up of simpler units, called morphemes. There are two types of morphology: inflectional morphology, which is concerned with changes to a word that do not alter its underlying meaning or syntactic category, and derivational morphology, which is concerned with changes that do. Examples of inflectional changes are pluralization (e.g., “house” becoming “houses,” and “mouse” becoming “mice”) and verb tense changes (e.g., “kiss” becoming “kissed,” and “run” becoming “ran”). Examples of derivational changes are “develop” becoming “development,” “developmental,” or “redevelop.” The distinction between phonetics and phonology, which are both ways of studying sounds, will also be examined in more detail in Chapter 2.
The idea of “a word” also merits consideration. Like the word “language,” the word “word” turns out on closer examination to be a somewhat slippery customer. The dictionary definition of a word is “a unit of language,” but in fact there

LINGUISTICS
    PHONETICS (the study of raw sounds)
    PHONOLOGY (the study of how sounds are used within a language)
    SEMANTICS (the study of meaning)
    SYNTAX (the study of word order)
    PRAGMATICS (the study of language use)
    MORPHOLOGY (the study of words and word formation)
        INFLECTIONAL MORPHOLOGY (concerned with changes to a word that do not alter its underlying meaning)
        DERIVATIONAL MORPHOLOGY (concerned with changes to a word that alter its underlying meaning)
FIGURE 1.1

are many other language units (e.g., sounds and sentences). Crystal (2010, p. 461) defines a word as “the smallest unit of grammar that can stand on its own as a complete utterance, separated with spaces in written language.” Hence “pigs” is a word, but the word ending “-ing” by itself is not. A word can in turn be analyzed at a number of levels. At the lowest level, it is made up of sounds, or letters if written down. Sounds combine together to form syllables. Hence the word “cat” has three sounds and one syllable; “houses” has two syllables; “syllable” has three syllables.
Words can also be analyzed in terms of the morphemes they contain. Consider a word like “ghosts.” This is made up of two units of meaning: the idea of “ghost,” and then the plural ending or inflection (“-s”), which conveys the idea of number: in this case that there is more than one ghost. Therefore we say that “ghosts” is made up of two morphemes, the “ghost” morpheme and the plural morpheme “-s.” The same can be said of past tense endings or inflections: “Kissed” is also made up of two morphemes, “kiss” plus the “-ed” past tense inflection which signifies that the event happened in the past. There are two sorts of inflection, regular forms that follow some rule, and irregular forms that do not. Irregular forms, which do not obey the general rule of forming plurals by adding an “-s” to the end of a noun, or of forming the past tense by adding a “-d” or “-ed” to the end of a verb, also contain at least two morphemes. Hence “house,” “mouse,” and “do” are made up of one morpheme, but “houses,” “mice,” and “does” are made up of two. “Rehoused” is made up of three morphemes: “house” plus “re-” added through mechanisms of derivational morphology, and “-ed” added by inflection. Every child’s favorite word “antidisestablishmentarianism” is made up of six morphemes.
Psychologists believe that we store representations of words in a mental dictionary. We call this mental dictionary the lexicon. The lexicon contains all the information (or at least pointers to all of the information) that we know about a word, including its sounds (phonology), meaning (semantics), written appearance (orthography), and the syntactic roles the word can adopt. The lexicon must be huge: estimates vary greatly, but a reasonable estimate is that an adult knows about 70,000 words (Nagy & Anderson, 1984; but by “greatly” I mean that the estimates range between 15,000 and 150,000; see Bryson, 1990). Recognizing a word is rather like looking it up in a dictionary; when we know what the word is, we have access to all the information about it, such as what it means and how to spell it. So when we see or hear a word, how do we access its representation within the lexicon? How do we know whether an item is stored there or not? What are the differences between understanding speech and understanding visually presented words? Psycholinguists are particularly interested in the processes of lexical access and how things are represented.

HOW HAS LANGUAGE CHANGED OVER TIME?

Language must have changed enormously over time, and one obvious consequence of these changes is that there are now many different languages in the world. Depending on exactly what counts as a separate language, there are now thought to be around 5,000–6,000 (but the number is getting smaller as languages, like species, become extinct), although estimates vary between 2,700 and 10,000. We do not even know whether all human languages are descended from one common ancestor, or whether they are derived from a number of ancestors (my bet is on one). However, it is apparent that many languages are related to each other. This relation is apparent in the similarity of many of the words of some languages (e.g., “mother” in English is “Mutter” in German, “moeder” in Dutch, “mère” in French, “maht” in Russian, and “mata” in Sanskrit). More detailed analyses like this have shown that most of the languages of Europe, and parts of west Asia, derive from a common source called proto-Indo-European. All the languages that are derived from this common source are called Indo-European. We can gather ideas about where the speakers of the ancestral language came from, by looking at the words that are shared in the descendant languages. For example, all Indo-European languages have similar words for horses and sheep, but not for palm tree or vine. Hence the
original language must have been spoken somewhere where it was easy to find horses and sheep, but where palms and vines could not be found. Such observations suggest that the speakers of proto-Indo-European probably spread out from Anatolia (approximately modern-day Turkey) with the expansion of agriculture about 9,000 years ago (Bouckaert et al., 2012). Indo-European has a number of main branches: the Romance (such as French, Italian, and Spanish), the Germanic (such as German, English, and Dutch), and the Indian languages (see Figure 1.2). There are some languages that are European but that are not part of the Indo-European family. Finnish and Hungarian are from the Finno-Ugric branch of the Uralic family of languages. There are many other language families in addition to Indo-European, including Afro-Asiatic (covering north Africa and the Arabian peninsula), Niger-Congo, Japanese, Sino-Tibetan, and families of languages spoken in and around the Pacific and in north and south America. Altogether linguists have identified over 100 language families, although a few languages, such as Basque, do not seem to be part of any family. The extent to which these large families may be related further back in time is unknown.

INDO-EUROPEAN LANGUAGES
    ROMANCE (e.g., French, Italian, Spanish)
    GERMANIC (e.g., English, German, Dutch)
    INDIAN (e.g., Hindi, Punjabi, Urdu)
FIGURE 1.2

Languages also change over relatively short time spans. Chaucerian and Elizabethan English are obviously different from modern English, and even Victorian speakers would sound decidedly archaic to us today, my dear old bean. Even listening to 1970s sitcoms can be disconcerting at times. We coin new words or new uses of old words when necessary (e.g., “computer,” “television,” “internet,” “rap”). Whole words drop out of usage (“thee” and “thou”), and we lose the meanings of some words, sometimes over short time spans (rather sadly I can’t remember the last time I had to give a measurement in rods or chains). We borrow (or perhaps steal is a better word) words from other languages (“café” from French, “potato” from Haiti, and “shampoo” from India). Sounds change in words (“sweetard” becomes “sweetheart”). Words are sometimes even created by error: “pea” was back-formed from “pease” as people started to think (incorrectly) that “pease” was plural (Bryson, 1990).

[Photo: Chaucerian language seems archaic and verbose in comparison to modern English.]

We most definitely should not gloss over differences between languages. Although they have arisen over a relatively short time compared with the evolution of humans, we cannot assume that speakers of different languages process them in the same basic way. Whereas it is likely that most of the mechanisms involved are the same, there might be some differences, particularly in the processing of written or printed words. Writing is a recent development compared with speech, and as we shall see in Chapters 7 and 8, there
are important differences in the way that different written languages turn written symbols into sounds. Nevertheless, there is an important core of psychological mechanisms that appear to be common to the processing of all languages.

WHAT IS LANGUAGE FOR?

The question of what language is used for now is intimately linked with its origin and evolution. It is a reasonable assumption that the factors that prompted its origin in humans are still of fundamental importance. Primary among these is the fact that language is used for communication. Although this might seem obvious, we can sometimes lose sight of this point, particularly when we consider some of the more complicated experiments described later in this book. Nevertheless, language is a social activity, and as such is a form of joint action where people collaborate to achieve a common aim (Clark, 1996). We do not speak or write in a vacuum; we speak to communicate, and to ensure that we succeed in communicating we take the point of view of others into account. We look at this idea in detail in Chapter 14.

[Photo: Spoken words can have a powerful influence on the listener’s state of mind.]

Although the primary function of language is communication, it might have acquired (or even originated from) other functions. In particular, language might have come to play a role in other, originally non-linguistic, cognitive processes. The extreme version of this idea is that the form of our language shapes our perception and cognition, a view known as the Sapir–Whorf hypothesis. Indeed, some have argued that language evolved to allow us to think, and communication turned out to be a useful side effect. As I noted above, technology and culture would be impossible without language. I examine these ideas in more detail in Chapter 3.

THE HISTORY AND METHODS OF PSYCHOLINGUISTICS

Now we know something about what language is, let us look at how modern psycholinguistics studies it. We will begin by looking briefly at the history of the subject.

A brief history of psycholinguistics

Given the importance of language, it is surprising that the history of psycholinguistics is a relatively recent one. The beginning of the scientific study of the psychology of language is often traced to a conference held at Cornell University, USA, in the summer of 1951, and the word “psycholinguistics” was first used in Osgood and Sebeok’s (1954) book describing that conference. Nevertheless, the psychology of language had been studied before then. For example, in 1879 Francis Galton studied how people form associations between words. In Germany at the end of the nineteenth century, Meringer and Mayer (1895) analyzed slips of the tongue in a remarkably modern way, and Freud (1901/1975) tried to explain the origin of speech errors in terms of his psychodynamic theory (see Chapter 13). If we place the infancy of modern psycholinguistics sometime around the American linguist Noam Chomsky’s (1959) review of Skinner’s book Verbal Behavior, its adolescence would correspond to the period in the
early and mid-1960s when psycholinguists tried to relate language processing to Chomsky’s transformational grammar. Since then psycholinguistics has left its linguistic home and achieved independence, flourishing on all fronts.
As its name implies, psycholinguistics has its roots in the two disciplines of psychology and linguistics, and particularly in Chomsky’s approach to linguistics. Linguistics is the study of language itself, the rules that describe it, and our knowledge about the rules of language. The primary concerns of early linguistics were rather different from what they are now. Comparative linguistics was concerned with comparing and tracing the origins of different languages. In particular, the American tradition of the linguist Leonard Bloomfield (1887–1949) emphasized comparative studies of indigenous North American Indian languages, leading to an emphasis on what is called structuralism: A primary goal of linguistics was taken to be providing an analysis of the appropriate categories of description of the units of language (Harris, 1951).
In modern linguistics the primary data used by linguists are intuitions about what is and is not an acceptable sentence. For example, we know that the string of words in (1) is acceptable, and we know that (2) is ungrammatical. How do we make these decisions? Can we formulate general rules to account for our intuitions? (An asterisk conventionally marks an ungrammatical construction.)

(1) What did the pig give to the donkey?
(2) *What did the pig sleep to donkey?

This emphasis on our knowledge led to greater emphasis on what humans do with language, rather than just on its structure.
Early psychological approaches to language saw the language processor as a simple device that could generate and understand sentences by moving from one state to another. There are two strands in this early work, derived from information theory and behaviorism. Information theory (Shannon & Weaver, 1949) emphasized the role of probability and redundancy in language, and developed out of the demands of the fledgling telecommunications industry. Working out what was the most likely continuation of a sentence from a particular point onwards was central to this approach. Information theory was also important because of its influence in the development of cognitive psychology. In the middle part of the twentieth century, the dominant tradition in psychology was behaviorism, which emphasized the relation between an input (or stimulus) and output (response), and how conditioning and reinforcement formed these associations. Intermediate constructs (such as the mind) were considered unnecessary to provide a full account of behavior. For behaviorists, the only valid subject matter for psychology was behavior, and language was behavior just like any other sort. Its acquisition and use could therefore be explained by standard techniques of reinforcement and conditioning. This approach perhaps reached its acme in 1957 with the publication of B. F. Skinner’s famous (or to linguists, infamous) book Verbal Behavior.

Psycholinguistic tests of Chomsky’s linguistic theory

Attitudes changed very quickly: in part this change was due to a devastating review of Skinner’s book by Chomsky (1959). The American linguist Noam Chomsky (b. 1928) has had more influence on how we understand language than any other person. Unusually, the book review came to be more influential than the book it reviewed. Chomsky showed that behaviorism was incapable of dealing with natural language. He argued that a new type of linguistic theory called transformational grammar provided both an account of the underlying structure of language and also of people’s knowledge of their language (see Chapter 2 for more details).
Psycholinguistics blossomed in attempting to test the psychological implications of this linguistic theory, and the influence of linguistics peaked in the late 1960s and early 1970s. The enterprise was not wholly successful, and experimental results suggested that, although linguistics might tell us a great deal about our knowledge of our language and about the constraints on children’s acquisition of language, it is limited in what it can tell us about the processes involved in speaking and understanding.
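To make the information-theoretic idea from the brief history above more concrete (working out the most likely continuation of a sentence from a particular point onwards), here is a minimal sketch in Python. It is only an illustration under stated assumptions: the toy corpus is invented, the function names train_bigrams and most_likely_next are hypothetical, and counting which word follows which (a bigram count) is a deliberately simple stand-in, not a reconstruction of any model used by the researchers cited in this chapter.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def most_likely_next(following, word):
    """Return (next_word, estimated_probability), or None if no successor was ever seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    nxt, n = counts.most_common(1)[0]
    return nxt, n / sum(counts.values())


# Toy corpus invented purely for this illustration (an assumption, not data).
CORPUS = [
    "the pig gave the apple to the donkey",
    "the pig saw the donkey",
    "the donkey saw the pig",
]

model = train_bigrams(CORPUS)
print(most_likely_next(model, "saw"))      # the word that most often follows "saw"
print(most_likely_next(model, "unicorn"))  # a word the toy corpus never contains
```

The estimated probability is simply the share of times a continuation was observed, so highly predictable (redundant) continuations get values near 1. A realistic model would need far more text and would have to handle ties and unseen words more carefully.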
The rest of this section is rather technical and applied. They are therefore the active, affirmative,
can be skipped on the first reading. You might like declarative forms of English sentences.
to return to it before or after reading Chapter 10 Miller and McKean (1964) tested the idea
on parsing. that the more transformations there are in a sen-
What can the linguistic approach contribute tence, the more difficult it is to process. They
to our understanding of the processes involved looked at detransformation reaction times to sen-
in producing and understanding syntactic struc- tences such as (5) to (9). Participants were told
tures? When Chomsky’s work first appeared, there that they would have to make a particular trans-
was great optimism that it would also provide an formation on a sentence, and then press a button
account of these processes. Two ideas attracted par- when they found this transformed sentence in a
ticular interest and were considered easily testable: list of sentences through which they had to search.
these were the derivational theory of complexity Miller and McKean measured these times.
(DTC), and the autonomy of syntax. The idea of
the derivational theory of complexity is that the (5) The robot shoots the ghost. (0 transforma-
more complex the formal syntactic derivation of tions: active affirmative form)
a sentence—that is, the more transformations that (6) The ghost is shot by the robot. (1 transforma-
are necessary to form it—the more complex the tion: passive)
psychological processing necessary to understand (7) The robot does not shoot the ghost. (1 trans-
or produce it, meaning that transformationally formation: negative)
complex sentences should be harder to process than (8) The ghost is not shot by the robot. (2 transfor-
less complex sentences. This additional processing mations: passive + negative)
complexity should be detectable by an appropri- (9) Is the ghost not shot by the robot? (3 transfor-
ate measure such as reaction times. The psycho- mations: passive + negative + question)
logical principle of the autonomy of syntax takes
Chomsky’s assertion that syntactic rules should be We can derive increasingly complex sentences
specified independently of other constraints fur- from the kernel (5). For example, (9) is derived from
ther, to mean that syntactic processes operate inde- (5) by the application of three transformations: pas-
pendently of other ones. In practice this means that sivization, negativization, and question formation.
syntactic processes should be autonomous with Miller and McKean found that the time it took to
respect to semantic processes. detransform sentences with transformations back
Chomsky (1957) distinguished between to the kernel was linearly related to the number of
optional and obligatory transformations. Obligatory transformations in them. That is, the more transfor-
transformations were those without which the sen- mations a participant has to make, the longer it takes
tence would be ungrammatical. Examples include them to do it. This was interpreted as supporting the
transformations introduced to cope with number psychological reality of transformational grammar.
agreement between nouns and verbs, and the intro- Other experiments around the same time sup-
duction of “do” into negatives and questions. Other ported this idea. Savin and Perchonock (1965)
transformations were optional. For example, the found that sentences with more transformations
passivization transformation takes the active form in them took up more memory space. The more
of a sentence and turns it into a passive form, for transformationally complex a sentence was, the
instance turning (3) into (4): fewer items participants could simultaneously
remember from a list of unrelated words. Mehler
(3) Boris applauded Agnes. (1963) found that when participants made errors
(4) Agnes was applauded by Boris. in remembering sentences, they tended to do it in
the direction of forgetting transformational tags,
Chomsky defined a subset of sentences that he rather than adding them. It was as though partici-
called kernel sentences. Kernel sentences are those pants remembered sentences in the form of “kernel
to which only obligatory transformations have been plus transformation.”
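The linear relation reported by Miller and McKean can be written as a one-line model. The following is a minimal sketch only: the function name `predicted_time` and the two timing constants are hypothetical values chosen for illustration, not figures from the original study. The transformation counts are those given for sentences (8) and (9).

```python
# Illustrative sketch of the derivational theory of complexity.
# The timing constants are made-up placeholders, NOT Miller and McKean's data.

# Transformations needed to get from each sentence back to its kernel.
SENTENCES = {
    "The ghost is not shot by the robot.": ["passive", "negative"],
    "Is the ghost not shot by the robot?": ["passive", "negative", "question"],
}


def predicted_time(transformations, base=1.0, per_transformation=0.5):
    """Predict detransformation time (in seconds) as a linear function
    of the number of transformations separating a sentence from its kernel."""
    return base + per_transformation * len(transformations)


for sentence, transformations in SENTENCES.items():
    print(f"{predicted_time(transformations):.1f} s  {sentence}")
```

On this view every additional transformation adds the same fixed cost, which is the prediction that the later reversibility findings called into question.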
12 A. INTRODUCTION
Problems with the psychological interpretation of transformational grammar
The tasks that supported the psychological reality of transformational grammar all used indirect measures of language processing. If we ask participants explicitly to detransform sentences, it is not surprising that the time it takes to do this reflects the number of transformations involved. However, this is not a task that we necessarily routinely do in language comprehension. Memory measures are not an on-line measure of what is happening in sentence processing; at best they are reflecting a side effect. What we remember of a sentence need have no relation with how we actually processed that sentence. Indeed, other findings that were difficult to fit into this framework soon emerged.
Slobin (1966a) performed an experiment similar to the original detransformation experiment of Miller and McKean. Slobin examined the processing of what are called reversible and irreversible passive sentences. A reversible passive is one where the subject and object of the sentence can be reversed and the sentence still makes pragmatic sense. An irreversible passive is one that does not make sense after this reversal. If you swap the subject and object in (10) you get (12), which makes perfect sense, whereas if you do this to (11) you get (13), which, although not ungrammatical, is rather odd: it is semantically anomalous.

(10) The ghost was chased by the robot.
(11) The flowers were watered by the robot.
(12) The robot was chased by the ghost.
(13) ? The robot was watered by the flowers.

In the case of an irreversible passive, you can work out what is the subject of the sentence and what is the object by semantic clues alone. With a reversible passive, you have to do some syntactic work. Slobin found that Miller and McKean’s results could only be obtained for reversible passives. Hence detransformational parsing only appears to be necessary when there are not sufficient semantic cues to the meaning of the sentence from elsewhere. This result means that the derivational theory of complexity does not always obtain. Slobin’s finding that the depth of syntactic processing is affected by semantic considerations such as reversibility is also counter to the idea of the autonomy of syntax, although this proved more controversial. Using different materials and a different task (judging whether the sentence was grammatical or not), Forster and Olbrei (1973) found no effect of reversibility, and more recently Ferreira (2003) found that there was always some cost to processing a passive sentence, even irreversible ones. Taken together, these results mean that what we observe depends on the details of the tasks used, but both syntactic and semantic factors have an effect on the difficulty of sentences.
Wason (1965) examined the relation between the structure of a sentence and its meaning. He measured how long it took participants to complete sentences describing an array of eight colored circles, seven of which were red and one of which was blue. It is more natural to use a negative in a context of “plausible denial”; that is, it is more appropriate to say “this circle is not red” of the exception than of each of the others “this circle is not blue.” In other words, the time it takes to process a syntactic construction such as negative-formation depends on the semantic context.
In summary, early claims supporting the ideas of derivational complexity in linguistic performance that were derived from Chomsky’s formulation of grammar were at best premature, and perhaps just wrong. As we shall see in later chapters, the degree to which syntactic and semantic processes are independent turns out to be one of the most important and controversial topics in psycholinguistics.
Linguistic approaches have given us a useful terminology for talking about syntax. They also illuminate how powerful the grammar that underlies human language must be. Chomsky’s theory of transformational grammar also had a major influence on the way in which psychological syntactic processing was thought to take place. In spite of their initial promise, later experiments provided little support for the psychological reality of transformational grammar. Chomsky had a retreat available: Linguistic theories describe our linguistic competence, our abstract knowledge of language, rather than our linguistic performance,
what we actually do. That is, transformational grammar is a description of our knowledge of our linguistic competence, and the constraints on language acquisition, rather than an account of the processes involved in parsing on a moment-to-moment basis. This has effectively led to a separation of linguistics and psycholinguistics, with each pursuing these different goals. Miller, who first provided apparent empirical support for the psychological reality of transformational grammar, later came to believe that all the time taken up in sentence processing was used in semantic operations.

Psycholinguistics and information processing
Psycholinguistics was largely absorbed into mainstream cognitive psychology in the 1970s. In this approach, the information processing or computational metaphor reigned supreme. Information processing approaches to cognition view the mind as rather like a computer. The mind uses rules to translate an input such as speech or vision into a symbolic representation: cognition is symbolic processing. This approach can perhaps be seen at its clearest in a computational account of vision, such as that of Marr (1982), where the representation of the visual scene becomes more and more abstract from the retinal level through increasingly sophisticated representations. Processing could be represented as flow diagrams, in the same way that complex tasks could be represented as flow diagrams before being turned into a computer program. Flow diagrams illustrate levels of processing, and much work during this time attempted to show how one level of representation of language is transformed into another. The computational metaphor is clearly influential in modern psycholinguistics, as most models are phrased in terms of the description of levels of processing, and the rules or processes that determine what happens in between. We will see this type of approach throughout this book. Many traditional psycholinguistic models are specified as “box-and-arrow” diagrams, with boxes referring to processing levels, and the arrows being the means of getting from one box to another (see Chapters 7 and 13 in particular for examples). This approach is sometimes called, rather derogatorily, “boxology.” It is certainly not unique to psycholinguistics, and such an approach is not as bad as is sometimes hinted. It at least gives rise to an understanding of the architecture of the language system: what the “boxes” of the language system are, and how they are related to others.
As a consequence of the influence of the computational metaphor, and with the development of suitable experimental techniques, psycholinguistics gained an identity independent of linguistics. Modern psycholinguistics is primarily an experimental science, and as in much of cognitive psychology, experiments measuring reaction times have been particularly important (especially in word recognition and comprehension; see Chapters 6 through 12). Psychologists try to break language processing down into its components, and show how those components relate to each other.

The “cognitive science” approach
The term “cognitive science” is used to cover the multidisciplinary approach to the study of the mind, with the disciplines including adult and developmental psychology, philosophy, linguistics, anthropology, neuroscience, and artificial intelligence (AI). We have already seen how linguistics influenced early psycholinguistics, particularly early work on syntax. Philosophy has played an important role in our understanding of meaning. AI involves getting computers to do things that appear to need intelligence, such as understanding a story, or understanding speech. Apart from the obvious technological uses of AI, the hope has been that it will increase our understanding of how humans do such tasks. AI involves computer modeling of human behavior emphasizing rules, plans, and goals. Implementing a model in a computer program has the great advantage that you have to be explicit about everything. There are no hidden assumptions in models that are implemented as programs. On the other hand, this compulsory explicitness has a significant disadvantage: Because modelers have to be explicit about everything, they often have to make unwarranted assumptions about processes and representations, and simplify the context or the scope of the model to a degree where its
usefulness becomes debatable. Nevertheless, computational models have many advantages and have become increasingly important.
If you have seen Kubrick’s film 2001: A Space Odyssey, you will probably remember HAL, the talking, listening, all-seeing, thinking computer. AI researchers hope someday to emulate HAL’s linguistic abilities. Some early successes made researchers optimistic that this goal might soon be realized. Among the earliest of these successes was Weizenbaum’s (1966) program, called ELIZA. The program simulated a non-directive, Rogerian therapist. In the following example, the output of the computer (the ELIZA program) is in capitals.

Men are all alike.
IN WHAT WAY
They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I’m depressed much of the time.
I AM SORRY TO HEAR THAT YOU ARE DEPRESSED
It’s true that I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.
WHAT WOULD IT MEAN IF YOU GOT SOME HELP
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY

The program worked in a very simple way, using a few techniques such as matching sentences to pre-stored structures or templates, producing predetermined responses to a few keywords (such as “mother”), and repeating what was input to it back with a few appropriate changes in word order. Nevertheless, people who used the program became very attached to it, even if they knew it was only a machine producing the output. Although some people even wanted to consult regularly with ELIZA in private, this perhaps says more about human concerns than it does about how language is produced. There were a number of variants of ELIZA, such as PARRY (Colby, 1975), which “simulated” the output of a paranoid schizophrenic.
The next influential program was called SHRDLU (Winograd, 1972; the name came from the letters of one row of a typesetting machine and was often used by typesetters to flag a mistake). This program could answer questions about an imaginary world called “blocksworld.” Blocksworlds are occupied by objects such as small red pyramids sitting on top of big blue cubes. SHRDLU’s success in being able to “understand” sentences such as “move the small red pyramid on top of the blue cube” was much hailed at the time.
The concept of a computer that thinks and talks like a human has existed in science fiction for some time. The smooth-talking HAL from 2001: A Space Odyssey, a scene from which is depicted here, is one of the more ominous and disturbing creations. The name HAL stands for “Heuristically programmed ALgorithmic computer.”
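The three techniques attributed to ELIZA (keyword responses, template matching, and echoing the input back with small changes) can be sketched in a few lines. This is a toy illustration under stated assumptions, not Weizenbaum’s actual script: the rule list, the `reflect` and `respond` function names, and the fallback reply are all hypothetical.

```python
import re

# Words swapped when the user's input is echoed back (illustrative list only).
PRONOUN_SWAPS = {"i": "you", "me": "you", "my": "your", "am": "are"}

# Keyword rules, checked in order. These are assumptions made for this sketch,
# not the contents of the original ELIZA script.
KEYWORD_RULES = [
    (re.compile(r"\b(mother|father|family)\b", re.I), "TELL ME MORE ABOUT YOUR FAMILY"),
    (re.compile(r"\balike\b", re.I), "IN WHAT WAY"),
    (re.compile(r"\balways\b", re.I), "CAN YOU THINK OF A SPECIFIC EXAMPLE"),
]


def reflect(text):
    """Echo the text back with first-person words turned into second person."""
    words = re.findall(r"[\w']+", text.lower())
    return " ".join(PRONOUN_SWAPS.get(word, word) for word in words)


def respond(user_input):
    """Return a canned keyword reply, an echoed template, or a fallback."""
    for pattern, reply in KEYWORD_RULES:
        if pattern.search(user_input):
            return reply
    # Template: anything from "my ..." onward is repeated back to the user.
    match = re.search(r"\bmy\b.*", user_input, re.I)
    if match:
        return reflect(match.group(0)).upper()
    return "PLEASE GO ON"  # hypothetical fallback when nothing matches


print(respond("Well, my boyfriend made me come here."))
```

Nothing in the sketch computes syntactic structure or meaning; it only matches surface strings, which is why such a program can appear responsive while understanding nothing.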
However, SHRDLU could only “understand” inasmuch as it could give an appropriate response to an instruction, and most people would say that there is much more to understanding than this. Furthermore, these early demonstrations worked only for very simple, limited domains. SHRDLU could not answer questions about elephants, or even say what “block” means. Its knowledge was limited to the role of blocks within blocksworld.
These early attempts did have the virtue of demonstrating the enormity of the task in understanding language. They also revealed the main problems that have to be solved before we can talk of computers truly understanding language. There are an infinite number of sentences, of varying degrees of complexity. We can talk about and understand potentially anything. The roles that context and world knowledge play in understanding are very important: potentially any piece of information we know could be necessary to understand a particular sentence. The conventional AI approach has had some influence on psycholinguistic theorizing, particularly on how we understand syntax and how we make inferences in story comprehension.
ELIZA and SHRDLU had extremely primitive syntactic processing abilities. ELIZA used templates for sentence recognition, and did not compute the underlying syntactic structure of sentences (a process known as parsing). SHRDLU was a little more sophisticated, and did contain a syntactic processor, but the processor was dedicated to the extraction of the limited semantic information necessary to move around “blocksworld.” Early AI parsers lacked the computational power necessary to analyze human language.
The influence of AI on psycholinguistics peaked in the 1970s. More recently an approach called connectionism (but also known as parallel distributed processing, or neural networks) has become influential in all areas of psycholinguistics. Connectionist networks involve many very simple, richly interconnected neuron-like units working together without an explicit governing plan. Instead, rules and behavior emerge from the interactions between these many simple units. The principles of connectionist models are described more fully in the Appendix.
One concept that is central in many types of model, including connectionist models, is the idea of activation. The idea has been around for a long time. Activation is a continuously varying quantity, and can be thought of as a property rather like heat. We talk of how activation can spread from one unit or word or point in a network to another, rather like electricity flowing around a circuit board. Suppose we hear a word such as “ghost.” If we assume there is a unit corresponding to that word, it will have a very high level of activation. But a word related in meaning (e.g., “vampire”) or sound (e.g., “goal”) might also have a small amount of activation, whereas a completely unrelated word (e.g., “pamphlet”) will have a very low level of activation. The idea that the mind uses something like activation, and that the activation level of units (such as those representing words) can influence the activation levels of similar items, is an important one.

The methods of modern psycholinguistics
Psycholinguistics uses many types of evidence. We will use examples of observational studies and linguistic intuitions, and make use of the errors people make. Much has been learned from computer modeling. Recently, neuroscience has contributed greatly to our understanding. But the bulk of our data, as you will see if you just quickly skim through the rest of this book, comes from traditional psychology experiments, particularly those that generate reaction times. For example, how long does it take to read out a word? What can we do to make the process faster or slower? Do words differ in the speed with which we can read them out depending on their properties? The advantage of this type of experiment is that it is now very easy to run on modern computers. In many experiments, the collection of data can be completely automated. There are a number of commercial (and free) experimental packages available for both PC and Macintosh computers that will help run your experiments for you, or you can program the computer yourself.
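Returning to the idea of activation introduced above, the “ghost” example can be sketched as a single step of spreading activation. This is a deliberately simplified illustration: the link strengths and the `spread` function are hypothetical values invented for the example, and real connectionist models (see the Appendix) involve many units and repeated updating.

```python
# Hypothetical link strengths from the unit for "ghost" to other word units.
# The numbers are invented for illustration; only their ordering matters.
LINKS = {
    "ghost": {
        "vampire": 0.30,   # related in meaning
        "goal": 0.20,      # related in sound
        "pamphlet": 0.01,  # unrelated
    },
}


def spread(source, activation=1.0):
    """Return activation levels after one step of spreading from `source`.

    The heard word receives the full activation; each linked unit receives
    a share proportional to the strength of its link to the source.
    """
    levels = {source: activation}
    for neighbor, strength in LINKS.get(source, {}).items():
        levels[neighbor] = activation * strength
    return levels


for word, level in spread("ghost").items():
    print(f"{word:10s} {level:.2f}")
```

The heard word ends up most active, related words receive a smaller amount, and the unrelated word stays near zero, matching the ordering described in the text.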
One of the most popular experimental techniques is called priming. Priming has been used in almost all areas of psycholinguistics. The general idea is that if two things are similar to each other and involved together in processing, they will either assist with or interfere with each other, but if they are unrelated, they will have no effect. For example, it is easier to recognize a word (e.g., BREAD) if you have just seen a word that is related in meaning (e.g., BUTTER). This effect is called semantic priming. If priming causes processing to be speeded up, we talk about facilitation; if priming causes it to be slowed down, we talk of inhibition.
Most psycholinguistic research has been carried out on healthy monolingual English-speaking college students, in the visual modality (i.e., with printed words). Psycholinguistic research does not differ from other types of psychology in this bias, but it does have consequences: for example, it has meant that there has been a great deal of research on reading when, for most people, speaking and listening are the main language activities in their lives. Fortunately, in recent years this situation has changed dramatically, and we are now seeing the fruits of research on speech recognition, on language production, on speakers of different languages, on bilingual speakers, on people with brain damage, and on people across the full range of the lifespan. A lot of this work has been spurred by recent developments in brain imaging, which over the last few years has revolutionized how we understand language.

MODELS IN PSYCHOLINGUISTICS
What do we do when we have a lot of data? We have to explain it. We do this by constructing a model of the data. A good model is an account of the data that provides an explanation of why the data are as they are and that makes novel, testable predictions. Psycholinguistics is full of models, and they’re very important.
At this point it is useful to explain what is meant by the words “data,” “theory,” “model,” and “hypothesis.” Data are the pieces of evidence that have to be explained. Types of data include experimental results, case studies of people with brain damage, brain scans, and observations of people using language correctly or incorrectly. A theory is a general explanation of how something works. A model is rather more specific: For example, computer simulations are models of processes that are particular instances or parts of more abstract theories. The distinction between a model and a theory is a bit fuzzy though, so don’t worry about it too much. A hypothesis is a very specific idea that can be tested. An experimental test that confirms the hypothesis is support for the particular theory from which the hypothesis was derived. If the hypothesis is not confirmed, then some change to the theory is necessary. It need not be necessary to reject the theory completely, but as long as the hypothesis is derived fairly from the theory, then some modification will be necessary. Testing theories by making predictions and trying to falsify them is a fundamental part of science. And that’s why psycholinguistics is a part of science.
What’s an explanation then? An explanation simplifies. If you carry out an experiment and make one hundred observations, an explanation of those observations is something simpler than those hundred data points. Suppose you could summarize why you got those observations in a sentence or one mathematical equation; that would be a good explanation (and the equation would also serve as a model). Explanations should also avoid being circular. A circular explanation is one that explains itself in terms of itself; for example, we could say children learn language because they have a language acquisition module, and define the language acquisition module as what enables children to learn languages. Good explanations transcend levels; complex phenomena are explained in terms of simpler descriptions, and may involve different areas.
Good models also make use of converging evidence: evidence from different sources that come together. A model of some behavior that is expressed as a computer model and makes novel, falsifiable predictions about real human behavior is a good one, particularly if it is supported by evidence from other areas such as the study of the brain.
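To make the vocabulary of data, hypothesis, and test concrete, here is a small sketch using the priming technique described earlier. The hypothesis is that a related prime speeds up recognition; the data are reaction times. The function names and the reaction times are hypothetical, and a real analysis would apply a proper statistical test rather than simply comparing means.

```python
from statistics import mean


def priming_effect(unrelated_rts, related_rts):
    """Mean reaction time after unrelated primes minus mean reaction time
    after related primes, in milliseconds. Positive means faster when primed."""
    return mean(unrelated_rts) - mean(related_rts)


def classify(effect_ms):
    """Name the direction of a priming effect using the terms in the text."""
    if effect_ms > 0:
        return "facilitation"
    if effect_ms < 0:
        return "inhibition"
    return "no effect"


# Hypothetical reaction times (ms) for recognizing BREAD, invented for this sketch.
after_unrelated_prime = [600, 620]   # e.g., preceded by an unrelated word
after_related_prime = [560, 580]     # e.g., preceded by BUTTER

effect = priming_effect(after_unrelated_prime, after_related_prime)
print(effect, classify(effect))
```

If the hypothesis predicted facilitation and the data showed inhibition or no effect, the theory from which the hypothesis was derived would need some modification.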
LANGUAGE AND THE BRAIN
Cognitive neuroscience studies how the brain and behavior are related. For a long time we were restricted to exploring how language and the brain were related by looking at the effects of brain damage on language. More recently advances in neuroimaging have enabled us to look at the brain in action during normal processing.

Lesion studies
The brain is very vulnerable to damage (which is why this precious organ is encased in a thick padded skull). Sites of damage to the brain are called lesions. Some unfortunate individuals suffer brain damage in a variety of ways, including strokes, brain surgery, and trauma from accidents (e.g., car crashes) or poisoning.
Lesion studies involve examining the effects of brain damage on performance, and have made enormous contributions to understanding how psychological processes are related to the brain. A particular approach to using lesion studies, called cognitive neuropsychology, has led to great advances in our understanding of psycholinguistics over the last 30 years or so. Traditional neurology and neuropsychology have been concerned primarily with questions about which parts of the brain control different sorts of behavior (i.e., with the localization of function), and with working out how complex behaviors map onto the flow of information through brain structures (see Figure 1.3). In one of the best-known traditional neuropsychological models of language, the Wernicke–Geschwind model, language processes basically flow from the back of the left hemisphere to the front, with high-level planning and semantic processes towards the back, in what is called Wernicke’s area, and low-level sound retrieval and articulation towards the front, in what is called Broca’s area, with the two regions connected by a tract of fibers called the arcuate fasciculus (see Figure 1.4). The emphasis of cognitive neuropsychology is rather different: the goal is to relate brain-damaged behavior to models of normal processing.

FIGURE 1.3 A cross-sectional view of the brain. (Labeled structures: cerebral cortex, cingulate gyrus, parietal lobe, hippocampus, frontal lobe, corpus callosum, thalamus, suprachiasmatic nucleus, optic chiasm, occipital lobe, pituitary, cerebellum, hypothalamus, pineal gland, midbrain, locus coeruleus, brain stem, pons, medulla.)

Shallice (1988) argued that cognitive neuropsychology can be distinguished from traditional neuropsychology in three crucial respects. First, it has made a theoretical advance in relating neuropsychological disorders to cognitive models. Second, it has made a methodological advance in
emphasizing the importance of single-case studies, rather than group studies of neuropsychological impairment. That is, the emphasis is on providing a detailed description and explanation of individual patients, rather than on comparing groups of patients who might not have the same underlying deficit. Third, it has contributed a research program, in that it emphasizes how models of normal processing can be informed by studying brain-damaged behavior. Cognitive neuropsychology has contributed a great deal to our understanding of language.

FIGURE 1.4 The location of Wernicke’s area (1) and Broca’s area (3). When someone speaks a word, activation proceeds from Wernicke’s area through the arcuate fasciculus (2) to Broca’s area. (Other labeled regions: motor cortex, primary auditory cortex.)

Shallice went on to argue that sometimes this approach has been taken too far, and called this extreme position ultra-cognitive neuropsychology. First, it has gone too far in arguing that group studies cannot provide any information appropriate for constructing cognitive models. This proposal led to heated controversy (e.g., Bates, McDonald, MacWhinney, & Appelbaum, 1991; Caramazza, 1986, 1991; McCloskey & Caramazza, 1988). Second, it has gone too far in claiming that information about the localization of function is irrelevant to our understanding of behavior (e.g., Morton, 1984). Third, it has undervalued clinical information about patients.
Seidenberg (1988) pointed to another problem, which is that cognitive neuropsychology places too much emphasis on uncovering the functional architecture of the systems involved. That is, the organization of the components involved (specifying levels of processing and how they are connected to each other) is emphasized at the cost of exploring the processes actually involved, leading to the construction of box-and-arrow diagrams with little advance in our understanding of what goes on inside the boxes, or how we get from one box to another. More emphasis is now being placed on what happens inside the components, particularly since connectionist modeling has been applied to cognitive neuropsychology.
A concept important in both traditional and cognitive neuropsychology is that of the double dissociation. Consider two patients, A and B, given two tasks, I and II. Patient A performs normally on task I but cannot perform task II. Patient B displays the reverse pattern of behavior, in performing normally on task II but not on task I (see Figure 1.5). In such a situation the two tasks are said to be doubly dissociated. The traditional interpretation of a double dissociation is that different processes underlie each task. If we then find that patients A and B have lesions to different parts of the brain, we will be further tempted to draw a conclusion about where these processes are localized. To anticipate an example of a double dissociation, we will see in Chapter 7 that some patients are unable to read nonwords (e.g., SPUKE), but they can read words with irregular spelling (e.g., STEAK). Other patients can read nonwords, but are unable to read irregular words.

FIGURE 1.5 Illustration of a hypothetical double dissociation. (Axes: performance of Patient A and Patient B on tasks I and II.)
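The logic of a double dissociation can be stated as a simple check on two patients’ performance. The sketch below is an illustration only: the function name and the encoding of performance as True (normal) or False (impaired) are assumptions made for the example, and real cases involve graded performance rather than all-or-none scores.

```python
def is_double_dissociation(patient_a, patient_b, task_1, task_2):
    """Return True if the two patients show opposite patterns on the two tasks.

    Each patient is a dict mapping a task name to True (normal performance)
    or False (impaired performance). This encoding is a simplifying assumption.
    """
    a_spared_1 = patient_a[task_1] and not patient_a[task_2]
    a_spared_2 = patient_a[task_2] and not patient_a[task_1]
    b_spared_1 = patient_b[task_1] and not patient_b[task_2]
    b_spared_2 = patient_b[task_2] and not patient_b[task_1]
    # One patient must be normal only on task 1, the other only on task 2.
    return (a_spared_1 and b_spared_2) or (a_spared_2 and b_spared_1)


# The reading example anticipated in the text, with hypothetical patients.
patient_a = {"nonwords": False, "irregular words": True}
patient_b = {"nonwords": True, "irregular words": False}

print(is_double_dissociation(patient_a, patient_b, "nonwords", "irregular words"))
```

The check establishes only the behavioral pattern; whether that pattern implies separate underlying processes is a further inference.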
Although the traditional interpretation of a double dissociation is that two separate routes are involved in a process, connectionist modeling has shown that this might not always be the case. Apparent double dissociations can emerge in complex, distributed, single-route systems (e.g., Plaut & Shallice, 1993a; Seidenberg & McClelland, 1989; both are described in Chapter 7). At the very least, we should be cautious about inferring that the routes involved are truly distinct and do not interact (Ellis & Humphreys, 1999).
Some more general care is necessary when making inferences from neuropsychological data. Some researchers have questioned the whole enterprise of trying to understand normal processing by studying brain-damaged behavior. Gregory (1961) made an analogy of attempting to discover how a radio set works by removing its components. If we did this, we might conclude that the function of a capacitor (an electrical component) was to inhibit loud wailing sounds! Furthermore, the categories of disorder that we will discuss are not always clearly recognizable in the clinical setting. There is often much overlap between patients, with the more pure cases usually associated with smaller amounts of brain damage. Finally, things are not usually in a fixed state as a result of brain damage; intact processes reorganize, and some recovery of function often occurs, even in adults.

Neuroimaging
Reaction times enable us to infer how the mind works; lesion studies enable us to infer which part of the brain does what; suppose we could look directly at how the brain works? New techniques of brain imaging are gradually becoming more accurate and more accessible. As a consequence brain imaging has been one of the most widely used and important techniques in psycholinguistics in the last few years.
Traditional X-rays are of limited use to us because the skull blocks the view of the brain and, in any case, there is little variation in the density of the brain. Hence neuroscientists have had to use even more ingenious techniques. These are based on measuring the brain’s electrical activity, or creating images of brain activity. Ideally, we would like both good temporal (being able to separate and time events very accurately) and spatial (being able to localize very accurately in space in the brain) resolution.
EEGs (electroencephalograms) and ERPs (event-related potentials) both measure the electrical activity of the brain by putting electrodes on the scalp. ERPs measure voltage changes on the scalp associated with the presentation of a stimulus (see Figure 1.6). The peaks of an ERP are labeled according to their polarity (positive or negative voltage) and latency in milliseconds (thousandths of a second) after the stimulus begins (Kutas & van Petten, 1994). The N400 is a much-studied peak occurring after a semantically incongruent sentence dog (Kutas & Hillyard, 1980). Of course, that previous sentence should have ended with “sentence completion,” and “dog” should therefore have generated a large N400 in you. P300 peaks are elicited by any stimuli requiring a binary decision (yes/no). The contingent negative variation (CNV) is a slow negative potential that develops on the scalp when a person is preparing to make a motor action or to process sensory stimuli.

FIGURE 1.6 An EEG (left) measures electrical potentials in the brain by means of electrodes placed across the scalp. An ERP (example on the right) is a complex electrical waveform related in time to a specific event. (Axes: voltage, with a +1 μV scale mark; time in milliseconds, from −100 to 300.)
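The naming scheme for ERP peaks (polarity plus latency) can be sketched as a small function. This is an illustration only: the function name is hypothetical, and rounding the latency to the nearest 100 ms is an assumption made for the example. In practice, names such as N400 are conventional labels for components rather than exact measurements.

```python
def label_peak(amplitude_uv, latency_ms):
    """Build a label such as 'N400' from a peak's polarity and latency.

    amplitude_uv: peak voltage in microvolts (assumed nonzero);
                  negative values give 'N', positive values give 'P'.
    latency_ms:   time of the peak after stimulus onset, in milliseconds.
    Rounding to the nearest 100 ms is an assumption of this sketch.
    """
    polarity = "P" if amplitude_uv > 0 else "N"
    nominal_latency = int(round(latency_ms / 100.0)) * 100
    return f"{polarity}{nominal_latency}"


# A hypothetical negative-going peak 412 ms after an incongruent final word.
print(label_peak(-3.2, 412))
```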
EEG and ERP have very good temporal resolution: they can currently resolve the timing of events to
within a millisecond or so. Their spatial resolution, however, is very poor. MEG
(magnetoencephalography) is a recent development that measures the magnetic activity of the brain.
MEG has the advantage of both very good temporal and spatial (within 3 mm) resolution, but is more
difficult to carry out and much more expensive to run, needing superconducting devices called
SQUIDs, extreme cooling using liquid helium, and magnetic shielding.
CAT (computerized axial tomography) produces medium-resolution images from integrating large
numbers of X-ray pictures taken from many different angles around the head (see Figure 1.7). MRI
(magnetic resonance imaging) uses radio-frequency waves rather than X-rays and produces higher
resolution images than CAT. These techniques enable neuroscientists to study the structure of the
brain. PET (positron emission tomography) scans produce pictures of the brain’s activity. A
radioactive form of glucose, the metabolic fuel that the brain uses, is injected into the blood, and
detectors around the head measure where the glucose is being used up. In this way we can find out
which parts of the brain are most active when it is carrying out a particular task.
In recent years fMRI (functional magnetic resonance imaging) has become widely accessible, and
“brain scans” derived from fMRI have become one of the most important sources of data in
psychology. fMRI was developed in the 1990s. It measures the energy released by hemoglobin
molecules in the blood, and then works out the areas of the brain receiving the greatest amounts of
blood and oxygen. It therefore tells us which parts of the brain are most active at any time. It
provides much better temporal (about 1–5 seconds) and spatial (within 1 mm) resolution than PET,
although the temporal resolution is still clearly inferior to EEG. fMRI is now the most widely used
imaging technique in psycholinguistics, and its importance to the field has grown dramatically in
the last few years.
Another recently developed tool is TMS (transcranial magnetic stimulation). TMS is in some ways the
reverse of imaging: rather than observing the brain, we make part of it do something. A very
powerful set of magnets is used to directly stimulate part of the cortex of a participant, and we
then record what that participant does or experiences.
FIGURE 1.7 In a CAT scanner, X-rays pass through the brain in a narrow beam. X-ray detectors are
arranged in an arc and feed information to a computer that generates the scan image. (Diagram
labels: X-ray tube, X-rays, X-ray detector.)
Functional magnetic resonance imaging (fMRI) scans have become an important source of data in
psychology.
This participant is undergoing transcranial magnetic stimulation (TMS), which is used to map brain
function. A figure-of-eight coil is placed over the participant’s skull and an electric current is
passed through it, producing a magnetic field that induces an electric current within a discrete
area of the brain.
These techniques could potentially tell us a number of things. They could tell us a great deal
about the time course of processes, and when different sources of information are used. Imaging
could be particularly revealing about the extent to which mental processes interact with other
processes. Suppose that in a brain scan taken during the production of a single word, we find that
the area responsible for processing the meaning of words becomes active, and then some time after
this a different area responsible for processing the sound of words becomes active. This would
suggest that, when speaking, processes involving meaning and sound do not overlap. On the other
hand, we might find that activity in the meaning and sound areas overlaps, with both becoming active
almost simultaneously. This result would suggest that meaning and sound processing interact. In
effect, we could plot the graphs of the time course of processing and how different types of
information interact.
Brain imaging is still relatively expensive, and the spatial and particularly temporal resolution
of even fMRI still leaves something to be desired, although both are improving rapidly all the
time. A more significant problem with current brain imaging is that the results are often difficult
to interpret. It is hard to be sure exactly what is causing any activity. Imaging will tell us where
something is happening, but in itself it does not tell us how, what, or why. Looking at how the
brain works is not the same thing as looking at how the mind works. In the context of a theory of
language processing and brain structure, however, imaging might provide us with important clues as
to what is going on. The main method used in brain imaging is called subtraction: the participant
carries out one task (e.g., reading aloud) and then a variant of that task (e.g., reading silently),
and the images of one are subtracted from the images of the other. You then identify where the
critical difference between the two is located (in this case the vocalizing component of reading
aloud). The subtraction method may sound straightforward, but in practice it is often difficult to
find suitable comparison conditions. Quite often the difference between the two conditions is a
subtle one that needs theoretical interpretation (Bub, 2000). Furthermore, imaging techniques often
show activation of non-overlapping cortical areas for similar tasks, which again is difficult to
interpret (Poeppel, 1996). Imaging studies also suggest that cognitive processes are more localized
than is indicated by other sorts of methods (such as the study of people with brain damage), because
imaging techniques reveal the many areas that are active in a task, regardless of whether or not
those areas are carrying out an important role (Howard,
1997). Also, group studies using imaging techniques average brain images across people, when
functions might be localized inconsistently in different parts of their brains (Howard, 1997). It
is also easy to get carried away with focusing on where in the brain things happen, rather than on
the underlying processes (see Harley, 2004a, 2004b; Loosemore & Harley, 2010).
In general, imaging techniques do not tell us in any straightforward way what high activity in
different parts of the brain means in processing terms. Suppose we see during sentence processing
that the parsing and semantic areas are active at the same time. This could be a result of
interaction between these processes, or it could reflect the parsing of one part of the sentence
and the semantic integration of earlier material. It might even reflect the participant parsing a
sentence and thinking dimly about what’s for dinner that night. It might be possible to tease them
apart, but we need clever experiments to do this. Imaging data now play an important role as part
of the converging evidence for a particular model, or even distinguishing between competing
accounts. Imaging already plays an important diagnostic role in investigating the effects of brain
damage and brain disease. More optimistically, in the more distant future, imaging will play a more
important role in treatment and therapy.
THEMES AND CONTROVERSIES
Ten themes recur throughout this book (see Figure 1.8). The first theme is to discover the actual
processes involved in producing and understanding language. The second theme is the question of
whether apparently different language processes are related to one another. For example, to what
extent are the processes involved in reading also involved in speaking? The third theme is whether
or not processes in language operate independently of one another, or whether they interact. This
is the issue of modularity, and we look at it in more detail below. Fourth, what is innate about
language? Fifth, do we need to refer to explicit rules when considering language processing? Sixth,
are the processes we examine specific to language, or are they aspects of general cognitive
processing sometimes recruited for language? Seventh, how sensitive are the results of our
experiments to the particular techniques employed? That is, do we get different answers to the same
question if we do our experiments in slightly different ways? To anticipate, the answers we get
sometimes do depend on the way we get those answers, which obviously can make the interpretation of
findings quite complex. One consequence is that we find that the experimental techniques themselves
come under close scrutiny. In this respect, the distinction between data and theory can become very
blurred. Eighth, what can be learned from looking at the language of people with damage to the
parts of the brain that control language? Ninth, what difference does it make speaking a different
language? We have already seen that there are many thousands of languages in the world. Many
countries have more than one language, and some (e.g., Papua New Guinea) have hundreds. Some
languages have hundreds of millions of speakers; some just a few hundred. There are important
differences between languages that may have significant implications for the way in which speakers
process language. It is sometimes easy to forget this, given the domination of English in
experimental psycholinguistics. Some people speak more than one language. How they do this, how
they learn the two languages, and how they translate between them are all important questions, the
answers to which have wider implications for understanding cognitive processing.
Finally, we should be able to apply psycholinguistic research to everyday life and problems.
Although language comes naturally to most humans most of the time, there are many occasions when it
does not: for example, in learning to read, in overcoming language disabilities, in rehabilitating
patients with brain damage, and in developing computer systems that can understand and produce
language. Advances in the theory of any subject such as psycholinguistics should have practical
applications. For example, in Chapters 6 and 7 we will examine research on visual word recognition
and reading. Learning to read is a remarkably difficult task. A good theory of reading should cast
light on how it should best be taught. It should indicate
FIGURE 1.8 Diagram with the central label STUDY OF LANGUAGE surrounded by ten questions: What are
the processes involved in producing and understanding language? Are language processes related to
one another (e.g., reading and speaking)? Do processes in language operate independently or interact
(modularity)? What is innate about language? Do we need specific rules for language processing? Are
language processes specific to language or are they aspects of general cognitive processing? How
sensitive are the results of experiments to the techniques employed? What can be learned from the
language of patients with brain damage? How do languages differ? How can the study of language be
applied to everyday life?
the best strategies that can be used to overcome difficulties in learning to read, and thereby help
children who find learning to read particularly difficult. A good theory should specify the best
methods of dealing with adult illiteracy. Furthermore, it should help in the rehabilitation of
adults who have difficulty in reading as a consequence of brain damage, showing what remedial
treatment would be most useful and which strategies would maximize any preserved reading skills.
Let us look at some of these themes in more detail.
How modular is the language system?
The concept of modularity is an important one in psycholinguistics. Most researchers agree that
psychological processing can be best described in terms of a number of levels. Processing begins
with an input that is acted on by one or more intervening levels of processing to produce an
output. For example, when we name a word, we have to identify and process the visual form of the
word, and access the sounds of the word. There is much less agreement on the way in which these
levels of processing are connected to each other. For a particular process, at what stage does any
kind of context have an influence? When do different types of information have their effects? For
example, does the meaning of a sentence help in recognizing the sounds of a word or in making
decisions about the sentence structure?
A module is a self-contained set of processes: it converts an input to an output, without any
outside help for what goes on in between; we say that the processes inside a module are independent
of processes outside the module. Yet another way of describing it is to say that processing is
purely data-driven. Models in which processing occurs in this way are called autonomous.
The opposing view is that processing is interactive. Interaction involves the influence of one
level of processing on the operation of another, but there are two intertwined notions involved.
First, there is the question of overlap of processing between stages. Are the processing stages
temporally discrete or do they overlap? In a discrete stage model, a level of processing can only
begin its work when the previous one has finished its own work. In a cascade model, information is
allowed to flow from one level to the following level before it has completed its processing
(McClelland, 1979). If the stages overlap, then multiple candidates might become activated at the
lower level of processing. An analogy should make this clear. Discrete models are like those water
wheels made up of a series of tipping buckets; each bucket only tips up when
it is full of water. Cascading models on the other hand are like a series of waterfalls.
The second aspect of interaction is whether there is a reverse flow of information, or feedback,
when information from a lower level feeds back to the prior level. For example, does knowledge
about what a word might be influence the recognition of its component sounds or letters? Does the
context of the sentence help to make identifying the constituent words easier? A natural waterfall
is purely top-down; water doesn’t flow from the bottom back up to the top. But suppose we introduce
a pump. Then we can pump water back up to earlier levels. There is scope for confusion with the
terms “bottom-up” and “top-down,” as they depend on the direction of processing. So a
non-interactive model of word recognition would be one that is purely bottom-up (from the
perceptual representation of the word to the mental representation), but a non-interactive model of
word production would be one that is purely top-down (from the mental representation to the sound
of the word). “Data-driven” is a better term than “bottom-up,” but the latter is in common use. The
important point is that models that permit feedback have both bottom-up and top-down information
flow.
Fodor (1983) argued that many psychological processes are modular. To what extent are the processes
of language self-contained, or do they interact with one another? According to many researchers, we
should start with the assumption that processes are modular or non-interactive unless there is a
very good reason to think otherwise. There are two main reasons for this assumption. First, modular
models are generally simpler: they involve fewer processes and connections between systems. Second,
it is widely believed that evolution favors a modular system. On the other hand, there is no
consensus on how good a “very good reason” has to be before we dump the modularity hypothesis. It
is always possible to come up with a saving or auxiliary hypothesis that can be used to modify and
hence save the modularity hypothesis (Lakatos, 1970). We will observe many instances of auxiliary
hypotheses introduced to save the main hypothesis that processing is modular. In theories of word
recognition researchers have introduced the idea of post-access processes; in syntax and parsing
they have proposed parallel processing with deferred decision making; and in word production they
have proposed an editor, or emphasized the role of working memory, or claimed that some kinds of
data (e.g., picture-naming times) are more fundamental than others (e.g., speech errors).
Researchers can get very hot under the collar about the role of interaction. Both Fodor (1983,
1985) and Pinker (1994), who are leading exponents of the view that language is highly modular and
has a significant innate basis, give a broader philosophical view: modularity is inconsistent with
relativism, the idea that everything is relative to everything else and that anything goes
(particularly in the social sciences). Modules provide a fixed framework in which to study the mind.
The existence of a neuropsychological dissociation between two processes is often taken as evidence
of the modularity of the processes involved. When we consider the neuroscience of modularity, we
can talk both about physical modularity (are psychological processes localized in one part of the
brain?) and processing modularity (in principle a set of processes might be distributed across the
brain yet have a modular role in the processing model). It is plausible that the two types of
modularity are related, so that cognitive modules correspond to neuropsychological modules.
However, Farah (1994) criticized this “locality” assumption, and argued that neuropsychological
dissociations were explicable in terms of distributed, connectionist systems.
To what extent is the whole language system a big, self-contained module (or set of modules)? Is it
just a special module for interfacing between social processes and cognition? Or does it provide a
true window onto wider cognitive processes? On the one hand, Chomsky (1975) argued that language is
a special faculty that cannot be reduced to cognitive processes. On the other, Piaget (1923) argued
that language is a cognitive process just like any other, and that linguistic development depends
on general cognitive development. We will return to this question in Chapter 3 when we consider the
relation between language and thought. In addition to there being a separate module for language,
there are some obvious candidates for subsystems being modules, such as the syntax module, the
speech processing module, and the word recognition module. But even if language is a big,
self-contained
module, it has to interact with the rest of the cognitive system. We talk about what we think
about, our thoughts are often in verbal form (what we call inner speech), and we integrate what we
hear with the rest of the information in our long-term memory. As we will see (particularly in
Chapters 12 and 15), language plays a central role in our working memory, the short-term repository
of information.
In each case where modularity arises as an issue, you need to examine the data, and ask whether the
auxiliary hypothesis is more plausible than the non-modular alternative. You also need to think
about whether data from experimental and imaging sources converge. Often, with existing data, it is
impossible to decide.
Is any part of language innate?
There are broader implications of modularity, too. Generally, those researchers most committed to
the claim that language processes are highly modular also argue that a significant proportion of
our language abilities is innate. The argument is essentially that nice, clean-cut modules must be
built into the brain, or hard-wired, and therefore innately programmed, and that complex, messy
systems reflect the effects of learning.
Obviously there are some prerequisites to the acquisition of language, if only a general learning
ability. The question is, how much has to be innate? Are we just talking about general learning
principles, or language-specific knowledge? To what extent is the innate information specifically
linguistic? A related issue is the extent to which the innate components are only found in humans.
We will look at these questions in more detail in Chapters 3 and 4. Connectionist modeling
(discussed below) suggests ways in which general properties of the learning system can serve the
role of innate, language-specific knowledge, and shows how behavior emerges from the interaction of
nature and nurture at all levels (Elman et al., 1996).
Does the language system make use of rules?
To what extent does the language-processing system make use of linguistic rules? In traditional
linguistics, much knowledge is encapsulated in the form of explicit rules. For example, we will see
in Chapter 2 that we can describe the syntax of language in terms of rules such as “a sentence can
comprise a noun phrase followed by a verb phrase.” Similarly, we can formulate a rule that the
plural of a noun is formed by adding an “-s” to its end, except in a limited number of irregular
forms, which we would need to store separately. Clearly then we can describe language with a system
of rules, but do we actually make use of such rules when speaking and listening?
Until quite recently, the answer was thought to be “yes.” Many researchers, particularly those with
a more linguistic orientation, still believe this. For many other researchers, connectionist
modeling has provided an alternative view.
Connectionism has revolutionized psycholinguistics over the last 25 years. In connectionist models,
processing takes place in the interaction of many simple, massively interconnected units.
Connectionist models that can learn are particularly important. In these models, information is
learned by repeated presentation; the connections between units change to encode regularities in
the environment. The general idea underlying learning can be summarized in the aphorism, based on
the work of Donald Hebb (1949), that “cells that fire together, wire together”: the simultaneous
activation of cells (or units) leads to an increase in synaptic (or connection) strength.
What does the “model” part of “connectionist model” mean? A few years ago I built a model rocket.
It was only a foot high, and made out of plastic, but it did take off (eventually), and went a few
hundred feet in the air. It differed from a “real” rocket in many ways other than scale; the rocket
propellant was very different from that used in real rockets, and many aspects of it were
decorative rather than functional. It was also, needless to say, greatly simplified. Yet it did
illustrate many important principles of rocket flight, and you can learn a lot about real rocketry
by playing with such models. Computational models of mind are very similar. They are scaled-down
models of the mind, or parts of it, made from different materials, but which illustrate important
principles of how the mind works. What is more, we can learn from
them. Their behavior is not always totally predictable, in the same way as it is difficult to
predict exactly how the model rocket is going to behave in different conditions on the basis of
limited knowledge about its raw materials. Modeling then is a very important idea in modern
psycholinguistics.
What makes connectionist models so attractive? First, unlike traditional AI, at first sight they
are more neurally plausible. They are loosely based on a metaphor of the brain, which is a
structure made up out of many massively interconnected neurons, each one of which is relatively
simple. It is important not to get too carried away with this metaphor, but at least we have the
feeling that we are starting off with the right sorts of models. Second, connectionist modelers
usually try to minimize the amount of information hard-wired into the system, emphasizing looking
at what emerges from the model. Third, just like traditional AI, connectionism has the virtue that
writing a computer program forces you to be explicit about your assumptions.
There have been three major consequences from the success of connectionist modeling.
First, it has led to a focus on the processes that take place inside the boxes of our models. In
some cases (e.g., the acquisition of the past tense), this new focus has led to a detailed
re-examination of the evidence motivating the models. The second consequence is that connectionism
has forced us to consider in detail the representations used by the language system. In particular,
connectionist approaches can be contrasted with rule-based approaches. In connectionist models
rules are not explicitly encoded, but instead emerge as a consequence of statistical
generalizations in the input data. Examples of this include the grapheme–phoneme correspondence
rules of the dual-route model of reading (see Chapter 7), and the acquisition of the past tense
(see Chapter 4). It is important to realize that this point is controversial, and we shall see
throughout the book that the role of explicit rules is still a matter of substantial debate among
psycholinguists. Third, the shift of emphasis from learning rules to learning through many repeated
specific instances has led to an increase in probabilistic models of language acquisition and
processing (Chater & Manning, 2006). Probabilistic models have proved particularly influential in
language acquisition, where children are thought to learn language by statistical or distributional
analysis of what they hear rather than learning explicit rules (see Chapter 4).
Are language processes specific to language?
Does language depend on very specific processes that have evolved to do nothing else, or does it
make use of more general cognitive processes? For example, when we understand sentences, do we make
use of a general-purpose working memory store, or do we have dedicated stores that can store only
information about language? Do children learn language using general-purpose learning rules, or do
they make use of information restricted to the linguistic domain?
The ideas of innateness, modularity, rules, and language-specific processing are related. There is
a divide in psycholinguistics between those who argue for innate language-specific modules that
make extensive use of rules, and those who argue that much or all of language processing is the
adaptation of more general cognitive processes.
Are we certain of anything in psycholinguistics?
One important point to note is that there are very few topics in psycholinguistics where we can say
that we know the answer to questions with complete certainty. Time after time you will notice that
even when there is consensus, or when we appear to agree on what happens, there are dissenting
voices. Uncertainty is a fact of life when trying to understand the psychology of language. The
discipline is still relatively young, and we have a lot to learn. It’s not like physics, which has
hundreds of years of solid research to stand on. Imagine being a physicist debating experiments and
models in seventeenth-century Europe. That’s a bit like where we’re at now. So I’m sorry; as I said
earlier, sometimes I’ll just have to throw my hands up and say “sorry, we don’t know,” and you’ll
have to leave it at that.
SUMMARY
• Language is a communication system that enables us to talk about anything, irrespective of time
  and space.
• Psycholinguistics arose after the Second World War as a result of interaction between the
  disciplines of information theory and linguistics, and as a reaction against behaviorism.
• Later experiments revealed a number of problems with a purely linguistic approach to
  understanding language.
• Two ideas from Chomsky’s original work that were picked up by early psycholinguists were the
  derivational theory of complexity and the autonomy of syntax.
• The earliest experiments supported the idea that the more transformationally complex a sentence,
  the longer it took to process; however, experiments using psychologically more realistic tasks
  failed to replicate these findings.
• Although linguistic theory influenced early accounts of parsing, linguistics and psycholinguistics
  soon parted ways.
• Modern psycholinguistics uses a number of approaches, including experiments, computer
  simulation, linguistic analysis, brain imaging, and neuropsychology.
• Early artificial intelligence (AI) approaches to language such as ELIZA and SHRDLU gave the
  impression of comprehending language, but had no real understanding of language and were
  limited to specific domains.
• Language processes can be broken down into a number of levels of processing.
• Psychologists have different views on the extent to which the mind can be divided into discrete
  modules.
• The use of brain imaging is becoming particularly important in the study of language.
• There is considerable debate about whether language processing is interactive or autonomous.
• An important question, particularly for the study of how we acquire language, is the extent to
  which language is innate.
• Whereas traditional approaches, based on linguistics, state that much of our knowledge of
  language is encoded in terms of explicit rules, more recent approaches based on connectionist
  modeling state that our knowledge arises from the statistical properties of language.
• Double dissociations are important in the neuropsychological study of language.
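As a concrete illustration of the subtraction method used in brain imaging and described earlier in
this chapter, the following Python sketch subtracts one activation map from another and reports
where the largest difference lies. It is a toy example under stated assumptions: the 3 x 3 grids of
activation values are invented, the function names subtract_images and peak_location are
hypothetical, and real analyses involve steps (such as averaging over many trials and statistical
testing) that are omitted here.

```python
def subtract_images(task, control):
    """Voxel-by-voxel difference between two activation maps of equal shape."""
    return [
        [t - c for t, c in zip(task_row, control_row)]
        for task_row, control_row in zip(task, control)
    ]


def peak_location(difference):
    """Return (row, column) of the largest value in a 2D difference map."""
    best = None
    for r, row in enumerate(difference):
        for c, value in enumerate(row):
            if best is None or value > difference[best[0]][best[1]]:
                best = (r, c)
    return best


# Made-up activation values on a tiny 3 x 3 grid (not real imaging data).
reading_aloud = [
    [2, 3, 2],
    [3, 9, 4],
    [2, 3, 8],
]
reading_silently = [
    [2, 3, 2],
    [3, 9, 4],
    [2, 3, 3],
]

difference = subtract_images(reading_aloud, reading_silently)
print(difference)                 # activity left over after subtraction
print(peak_location(difference))  # where the two conditions differ most
```

Note that the strongly active central cell disappears after subtraction because it is equally
active in both conditions. This is the point of the method, and it is also why the choice of
comparison condition matters so much.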
QUESTIONS TO THINK ABOUT
1. What are the methodological difficulties involved for linguists who study people’s intuitions
about language?
2. What are the advantages and disadvantages of using brain imaging to study language?
3. What are the advantages of a modular system? Are there any disadvantages that you can think of?
4. What are the disadvantages of group experiments in neuropsychology?
5. Are there any limits to what single-case studies of the effects of brain damage on language
might tell us?
6. What is the difference between neuropsychology and neuroscience?
7. How would you define language? What do you think are its most important characteristics?
8. Which do you think is going to tell us more about how humans use language: experiments or
computational modeling? Which would you prefer to do, and why?
9. What does knowing where something happens in the brain tell us about what is happening?
10. What is the difference between linguistics and psycholinguistics, and does the distinction matter?
FURTHER READING
There are many textbooks that offer an introduction to cognitive psychology. Any introductory text
on psychology will provide you with rich material. If you want more detail, try Anderson (2010),
Eysenck and Keane (2010), or Quinlan and Dyson (2008).
For a summary of the early history of psycholinguistics, see Fodor, Bever, and Garrett (1974),
and of linguistics, Lyons (1977a). If you wish to find out more about linguistics, you might try Fromkin,
Rodman, and Hyams (2011). Crystal (2010) is a complete reference work on language. Clark’s
(1996) book is about language as communication. For an amusing read on the history of English,
and much more besides, see Bryson (1990).
Thagard (2005) provides a general survey of cognitive science. There are many introductory
textbooks on traditional AI, including Negnevitsky (2004). Introductions to connectionism include
Bechtel and Abrahamsen (2001) and Ellis and Humphreys (1999)—the latter emphasizes the impact
of connectionism on cognitive psychology.
Kolb and Whishaw (2009) describe traditional neuropsychology and the Wernicke–Geschwind
model in detail; see also Andrewes (2001), Banich (2004), or Stirling (2002) for recent introduc-
tions to neuropsychology. For a more advanced source on neuropsychology and language, try Hillis
(2002). Notice that these references are now getting rather dated; that’s because the emphasis has
switched from pure neuropsychology to neuroscience. Gazzaniga, Ivry, and Mangun (2008) and
Ward (2010) are good general introductions to imaging and cognitive neuroscience.
Chalmers (1999) is a good introduction to the methods and philosophy of science.
Altmann (1997) and Pinker (1994) are introductions to the psychology of language that take the
same general approach as this book. There are some recent handbooks and encyclopedias of psy-
cholinguistics that will provide you with more detailed coverage of the topics in this book, including
Gaskell (2007), Spivey, McRae, and Joanisse (2012), and Traxler and Gernsbacher’s (2006) second
edition of the Handbook of Psycholinguistics. As already mentioned, Crystal (2010) is a very good
reference for linguistics.
A number of journals cover the field of psycholinguistics. Many relevant experimental articles
can be found in journals such as the Journal of Experimental Psychology (particularly the sections
entitled General; Learning, Memory, and Cognition; and, for lower level processes such as speech
perception and aspects of visual word recognition, Human Perception and Performance), the Quar-
terly Journal of Experimental Psychology, Cognition, Cognitive Psychology, Cognitive Science, and
Memory and Cognition. Three journals with a particularly strong language bias are the Journal
of Memory and Language (formerly called the Journal of Verbal Learning and Verbal Behavior),
Language and Cognitive Processes, and the Journal of Psycholinguistic Research. Theoretical and
review papers can often be found in Psychological Review, Psychological Bulletin, and Behavioral
and Brain Sciences. The last of these includes critical commentaries on the target article, plus a reply to
those commentaries, which can be most revealing. Articles on connectionist and AI approaches to
language are often found in Cognitive Science again, and sometimes in Artificial Intelligence. Many
relevant neuroscience papers can be found in Brain and Language, Cognitive Neuropsychology,
the Journal of Cognitive Neuroscience, Neurocase, and sometimes in journals such as Brain and
Cortex. Papers with a biological or connectionist angle on language can sometimes also be found
in the Journal of Cognitive Neuroscience. Journals rich in good papers on language acquisition are
the Journal of Experimental Child Psychology, Journal of Child Language, and First Language; see
also Child Development.
As we will see, designing psycholinguistics experiments can be a tricky business. It is vital to
control for a number of variables that affect language processing (see Chapter 6 for more detail). For
example, more familiar words are recognized more quickly than less familiar ones. We therefore need
easy access to measures of variables such as familiarity. There are a number of databases that provide
this information, including the Oxford Psycholinguistic Database (Quinlan, 1992) and the Nijmegen
CELEX lexical database for several languages on CD-ROM (Baayen, Piepenbrock, & Gulikers, 1995).
There is a website for this book. It contains links to other pages, details of important recent
work, and a means of contacting me electronically. The URL is http://www.psypress.com/cw/harley.
CHAPTER 2
DESCRIBING LANGUAGE
INTRODUCTION
This chapter introduces the building blocks of language: sounds, words, and sentences. It describes how we make sounds and form words, and how we order words to form sentences. The chapter also provides means of describing sounds and sentence structure.
The study of syntax often comes across as being rather technical, with what appears at first sight to be a lot of jargon and some daunting symbols. However, it is worth persevering, because linguistics provides us with a valuable means of describing sentence structure and a way of showing how sentences are related to each other. By the end of this chapter you should:
• Know how the sounds of language can be categorized.
• Understand how we make different sounds.
• Understand how syntactic rules describe the structure of a language.
• Be able to construct parse trees of simple sentences.
• Understand the importance of the work of the linguist Chomsky.
HOW TO DESCRIBE SPEECH SOUNDS
Acoustics is the name of the study of the physical properties of sounds. Acoustic information about sounds can be depicted in a number of ways. One of the most commonly used is a sound spectrogram (see Figure 2.1). A spectrogram shows the amount of energy present in a sound when frequency is plotted against time. The peaks of energy at particular frequencies are called formants. Formant structure is an important characteristic of speech sounds. All vowels and some consonants have formants, but the pattern of formants is particularly important in distinguishing vowels.
We can describe the sounds of speech at two levels. Phonetics describes the acoustic detail of speech sounds (their physical properties) and how they are articulated, while phonology describes the sound categories each language uses to divide up the space of possible sounds. An example should make this clear. Consider the sound “p” in the English words “pin” vs “spin.” The actual sounds are different; you can tell this by putting your hand up to your mouth as you say them. You should be able to feel a breath of air going out as you say “pin,” but not as you say “spin.” The “p” sound in “pin” is said to be aspirated, and that in “spin” unaspirated. In English, even though the sounds are different, it does not make any difference to the meaning of the word that you use. If you could manage to say “pin” with an unaspirated “p” it might sound a little odd, but to your listeners it would still have the same meaning as “pin” when said normally. But in some languages aspiration does make a difference to the meaning of words. In Thai, “paa” (unaspirated) means “forest,” while “paa” (aspirated) means “to split.”
A phoneme is a basic unit of sound in a particular language. In English the two sorts of “p”
are the same phoneme, whereas in Thai they are different phonemes. The two “p” sounds are phonetically different: they are said to be different phones. Two phones are said to be an instance of the same phoneme in a particular language if the difference between them never makes a difference to the meaning of words. Different phones that are understood as the same phoneme in a language are called allophones. Hence in English the aspirated and unaspirated “p” sounds are allophones: whether or not a “p” is aspirated never makes a difference to the meaning of a word. To take another example, the sounds “l” and “r” are clearly different phones, and in English they are also different phonemes. In Japanese they are just allophones of the same phoneme. On the other hand the sounds at the beginning of “game,” “dame,” “fame,” and “same” are different phonemes in English: switch them around and you change the meaning of the words.
FIGURE 2.1 Sound spectrogram for the word hospital (frequency in kilohertz, from 1 kHz to 4 kHz, plotted against time). The burst of noise across a wide range of frequencies corresponds to /s/; the noticeable gaps are the stop consonants /p/ and /t/. In normal speech the final vowel is barely represented.
A special notation is used for distinguishing between phones and phonemes. Square brackets are used to designate [phones], whereas slanting lines are used for /phonemes/. Broadly speaking phonetics is the study of phones, and phonology is the study of phonemes. There are three types of phonetics depending on what is emphasized: articulatory (which emphasizes how sounds are made), auditory or perceptual (which emphasizes how sounds are perceived), and acoustic (which emphasizes the sound waveform and physical properties).
Two words in a language that differ by just one sound are called minimal pairs. Examples of minimal pairs are “dog” and “cog,” “bat” and “pat,” “fog” and “fop.” We can also talk about minimal sets of words (e.g., “pat,” “bat,” “cat,” “hat”), all of which differ by only one phoneme, in the same position. As we have just seen, substituting one phoneme for another by definition leads to a change in the meaning, whereas just changing one phone for another (e.g., aspirated for unaspirated [p]) need not necessarily lead to a change in meaning.
In many languages, such as English, there is not a perfect correspondence between letters and sounds. The letter “o” represents a number of different sounds (such as in the words “mock,” “moon,” and “mow”). The sound “ee” can be spelled by an “i” or a “y.” It is convenient to have a system of representing individual sounds with specific symbols, but letters are not suitable because of these ambiguities. The International Phonetic Alphabet (or IPA for short) is the standard method of representing sounds. The symbols of the IPA and examples of words containing the English phonemes they represent are shown in Box 2.1.
Note that the ways in which these words are pronounced can vary greatly, both between and within countries speaking the same language. These examples are based on “Received Pronunciation” in English. Received Pronunciation (RP) is the supposedly high-prestige, educated accent that gives no clue to the regional origin of the speaker within Britain; examples of RP can often be found by listening to national news broadcasts, particularly from 50 years ago! (It is important to note that these examples do not mean that these are the correct ways of pronouncing these words.) Vowel sounds are often very different between British English and American English.
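The definition of a minimal pair given above (two words that differ by exactly one phoneme in the same position) can be checked mechanically. The following is a minimal sketch, assuming each word has already been transcribed by hand into a list of phoneme symbols; the function name `is_minimal_pair` and the rough transcriptions are illustrative assumptions, not part of the original text.

```python
def is_minimal_pair(a, b):
    """Return True if two phoneme sequences differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


# Hand-made broad transcriptions (assumed for illustration only).
dog = ["d", "ɒ", "g"]
cog = ["k", "ɒ", "g"]
fog = ["f", "ɒ", "g"]
fop = ["f", "ɒ", "p"]
spin = ["s", "p", "ɪ", "n"]

print(is_minimal_pair(dog, cog))   # True: only the first phoneme differs
print(is_minimal_pair(fog, fop))   # True: only the last phoneme differs
print(is_minimal_pair(dog, fop))   # False: two phonemes differ
print(is_minimal_pair(dog, spin))  # False: different lengths
```

Note that the comparison works on phonemes, not letters: “dog” and “cog” differ by one letter as well, but it is the contrast between /d/ and /k/ that makes them a minimal pair.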
Box 2.1 The International Phonetic Alphabet (IPA)

Consonants
Symbol        Examples
p             pat, pie
b             bat, babble
t             tie, tot
k             kid, kick
d             did, deed
g             get, keg
s             sun, psychology
z             razor, peas
f             field, laugh
v             vole, drove
m             mole, mum
n             not, nun
ŋ             sing, think
θ             thigh, moth
ð             the, then
ʃ (š)         she, shield
ʒ (ž)         vision, measure
l             lie, lead
w             we, witch
w             when, whale
r             rat, ran
j             you, young
h             hit, him
tʃ (č, tš)    cheese, church
dʒ (ĵ, dž)    judge, religion
x             loch (Scottish pronunciation)
ʔ             bottle (glottal pronunciation)

Vowels
British English   Examples         American English
i                 reed, beat       i
ɛ                 bed, said        ɛ
ɪ (I)             did, bit         ɪ
æ                 rat, anger       æ
ɔ                 saw, author      ɔ (in saw)
a (ɑ)             hard, car        ɑr
ɒ                 pot, got         ɑ
u                 who, boot        u
ʊ                 could, foot      ʊ
ə                 above, sofa*     ə
ʌ                 hut, tough       ʌ

Diphthongs (vowel–vowel combinations)
British English   Examples         American English
aɪ (ay)           rise, bite       aɪ
aʊ (æʊ)           cow, about       aʊ
ɔɪ (ɔy)           boy, coy         ɔɪ
eɪ (e)            may, bait        eɪ
oʊ                go, boat         ou
ɪə                here, mere       ɪr
ɛə                mare, rare       er
aɪɔ               hire, fire       aɪr
ju                new; French tu

Frequently used alternative symbols shown in parentheses. Main examples are for most speakers of British English; the far right symbols for vowels and diphthongs are for most speakers of American English.
* This is the schwa, a weak, neutral vowel often used to replace unstressed vowels.
There are also many specific differences between British and American pronunciations; for example, American English tends to drop the initial /h/ in “herbs.” (There are also different words for the same thing, of course, such as “sidewalk” for “pavement,” and “trash” for “rubbish.”) Different systems of pronunciations within a language are known as dialects. Dialects mostly differ in their vowel sounds. One advantage of the IPA is that it is possible to represent these different ways of pronouncing the same thing.
We produce speech by moving parts of the vocal tract, including the lips, teeth, tongue, mouth, and voice box or larynx (see Figure 2.2). The basic source of sounds is the larynx, which modifies the flow of air from the lungs and produces a range of higher frequencies called harmonics. Different sounds are then made by changing the shape of the vocal tract. There are two different major types of sounds. Vowels (such as a, e, i, o, and u) are made by modifying the shape of the vocal tract, which remains more or less open
while the sound is being produced. The position of the tongue modifies the range of harmonics produced by the larynx. Consonants (such as p, b, t, d, k, g) are made by closing or restricting some part of the vocal tract at the beginning or end of a vowel. Most consonants cannot be produced without some sort of vowel. This description suggests that one way to examine the relation between sounds is to look at their place of articulation, that is, the place where the vocal tract is closed or restricted. The contrasting features needed to describe sounds are known as distinctive features.
FIGURE 2.2 The structure of the human vocal tract. (Labeled parts: hard palate, alveolar ridge, nasal cavity, velum (soft palate), uvula, tongue, lips, vocal cords, teeth, epiglottis, larynx, esophagus, glottis, trachea.)
Received Pronunciation (RP) has long been perceived as the most prestigious spoken form of the English language. RP belies the origins of its speaker, and is sometimes referred to as the “Queen’s English,” as it is spoken by the monarch.
CONSONANTS
Consonants are made by closing or restricting some part of the vocal tract as air flows through it. We classify consonants according to their place of articulation, whether or not they are voiced, and their manner of articulation (see Table 2.1).
The place of articulation is the part of the vocal tract that is closed or constricted during articulation. For example, /p/ and /b/ are called bilabial sounds and are made by closing the mouth at the lips, whereas /t/ and /d/ are made by putting the tongue to the back of the teeth. To understand the difference between /b/ and /p/, we need to introduce a concept called voicing. In one case (/b/), the vocal cords are closed and vibrating from the moment the lips are released; the consonants are said to be pronounced with voice, or just voiced. In the other case (/p/), there is a short delay, as the vocal cords are spread apart as air is first passed between them; hence they take some time to start vibrating. These
consonants are said to be voiceless (also produced without voice or unvoiced). The time between the release of the constriction of the airstream when we produce a consonant, and when the vocal cords start to vibrate, is called the voice onset time (VOT). Voicing also distinguishes between the consonants /d/ (voiced) and /t/ (voiceless). The sounds /d/ and /t/ are made by putting the front of the tongue on the alveolar ridge (the bony ridge behind the upper teeth). Hence these are called alveolars. Dentals such as /θ/ and /ð/ are formed by putting the tongue tip behind the upper front teeth. Labiodentals such as /f/ and /v/ are formed by putting the lower lip to the upper teeth. Postalveolar sounds (e.g., /ʃ/, /ʒ/, formerly called alveopalatals) are made by putting the tongue towards the front of the hard part of the roof of the mouth, the palate, near the alveolar ridge. Palatal sounds (e.g., /j/, /y/) are made by putting the tongue to the middle of the palate. Further back in the mouth is a soft area called the soft palate or velum, and velars (e.g., /k/, /g/) are produced by putting the tongue to the velum. Finally, some sounds are produced without the involvement of the tongue. The glottis is the name of the space between the vocal cords in the larynx. Constriction of the larynx at the glottis produces a voiceless glottal fricative (/h/). When the glottis is completely closed and then released, a glottal stop (/ʔ/) is made. Glottal stops do not occur in the Received Pronunciation of English, but are found in some dialects and in other languages. (The glottal stop can be heard, for example, in some dialects of the south-east of England in the middle of words like “bottle,” replacing the /t/ sound.)
The other important dimension used to describe consonants is the manner of articulation. Stops are formed when the airflow is completely interrupted for a short time (e.g., /p/, /b/, /t/, /d/). Not all consonants are made by completely closing the vocal tract at some point; in some it is merely constricted. Fricatives are formed by constricting the airstream so that air rushes through with a hissing sound (e.g., /f/, /v/, /s/). Affricatives are a combination of a brief stopping of the airstream followed by a constriction (e.g., /dʒ/, /tʃ/). Liquids are produced by allowing air to flow around the tongue as it touches the alveolar ridge (e.g., /l/, /r/). Most sounds are produced orally, with the velum raised to prevent airflow from entering the nasal cavity. If the velum is lowered and air is allowed to flow out through the nose we get nasal sounds (e.g., /m/, /n/). Glides or semi-vowels are transition sounds produced as the tongue moves from one vowel position to another (e.g., /w/, /y/).
TABLE 2.1 English consonants as combinations of distinguishing phonological features.
Columns give the MANNER OF ARTICULATION; +V = voiced, –V = voiceless.

PLACE OF        stop      fricative  affricative  nasal     lateral      approximant
ARTICULATION                                                approximant
                +V  –V    +V  –V     +V  –V       +V  –V    +V  –V       +V  –V
bilabial        b   p                             m                      w
labiodental               v   f
dental                    ð   θ
alveolar        d   t     z   s                   n         l
postalveolar              ʒ   ʃ      dʒ  tʃ                              r
velar           g   k                             ŋ
glottal             ʔ         h
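The idea behind Table 2.1, that each consonant is a bundle of three distinctive features (place, manner, and voicing), can be sketched as a small lookup table. The following is an illustrative sketch only: it covers a subset of the table, and the names `Consonant`, `CONSONANTS`, and `differing_features` are assumptions introduced here rather than anything from the original text.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    place: str
    manner: str
    voiced: bool


# A subset of the consonants in Table 2.1 (feature values read off the table).
CONSONANTS = {
    "p": Consonant("bilabial", "stop", False),
    "b": Consonant("bilabial", "stop", True),
    "t": Consonant("alveolar", "stop", False),
    "d": Consonant("alveolar", "stop", True),
    "k": Consonant("velar", "stop", False),
    "g": Consonant("velar", "stop", True),
    "f": Consonant("labiodental", "fricative", False),
    "v": Consonant("labiodental", "fricative", True),
    "s": Consonant("alveolar", "fricative", False),
    "z": Consonant("alveolar", "fricative", True),
    "m": Consonant("bilabial", "nasal", True),
    "n": Consonant("alveolar", "nasal", True),
}


def differing_features(a, b):
    """List the feature names on which two consonants differ."""
    fa, fb = CONSONANTS[a], CONSONANTS[b]
    return [name for name in Consonant._fields
            if getattr(fa, name) != getattr(fb, name)]


print(differing_features("p", "b"))  # ['voiced']
print(differing_features("p", "t"))  # ['place']
print(differing_features("b", "s"))  # ['place', 'manner', 'voiced']
```

The first line of output restates the point made in the text: /p/ and /b/ share place and manner of articulation and are distinguished by voicing alone.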
So we can describe consonants in terms of the articulatory distinctive features, place of articulation, manner of articulation, and voicing. It should be noted that some languages produce consonants (such as clicks) that are not found in European languages.
VOWELS
Vowels are made with a relatively free flow of air. The nature of the vowel is determined by the way in which the shape of the tongue modifies the airflow. Table 2.2 shows how vowels can be classified depending on the position (which can be raised, medium, or lower) of the front, central, or rear portions of the tongue. For example, the /i/ sound in “meat” is an example of a high front vowel because the air flows through the mouth with the front part of the tongue in a raised (high) position.
TABLE 2.2 Vowels as combinations of distinguishing phonological features.
        Front   Central   Back
High    i                 u
        ɪ                 ʊ
                ə
Mid     e                 o
        ɛ                 ɔ
Low     æ       ʌ         ɑ
Two vowel sounds can be combined to form a diphthong. Examples are the sounds in “my,” “cow,” “go,” and “boy.”
Whereas the pronunciation of consonants is relatively constant across dialects, that of vowels can differ greatly.
SYLLABLES
Words are divided into rhythmic units called syllables. One way of determining the number of syllables in a word is to try singing it: each syllable will need a different note (Radford, Atkinson, Britain, Clahsen, & Spencer, 1999). For example, the word syl–la–ble has three syllables. Many words are monosyllabic: they only have one syllable. Syllables can be analyzed in terms of a hierarchical structure (see Figure 2.3). The syllable onset is an initial consonant or cluster (e.g., /cl/); the rime consists of a nucleus, which is the central vowel, and a coda, which comprises the final consonants. Hence in the word “clumps,” “cl-” is the onset and “-umps” the rime, which in turn can be analyzed into a nucleus, which is the central vowel (“u”), and coda (“mps”). In English, all of these components are optional, apart from the nucleus (all words have to have at least a central vowel). The rules that describe how component syllables combine with each other differ across languages; for example, Japanese words do not have codas, and in Cantonese only nasal sounds and glottal stops are possible codas.
FIGURE 2.3 Hierarchical structure of syllables. (A syllable divides into an onset and a rime; the rime divides into a nucleus and a coda.)
Features of words and syllables that may span more than one phoneme, such as pitch, stress, and the rate of speech, are called suprasegmental features. For example, a falling pitch pattern indicates a statement, whereas a rising pitch pattern indicates that the speaker is asking a question. Try saying “it’s raining” as a statement, “it’s raining?” as a question, and “it’s raining!” as a statement of surprise. Stress varies within a word, as some syllables receive more stress than others, and within a sentence, as some words are emphasized more than others. Taken together, pitch and stress determine the rhythm of the language. Languages differ in their use of rhythm. In English, stressed syllables are produced at approximately equal periods of time; English is said to be a stress-timed language. In French, syllables are produced in a steady flow; it is said to be a syllable-timed language.
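The onset, nucleus, and coda analysis of “clumps” described in the discussion of syllable structure can be sketched in a few lines. This is a deliberately crude illustration: it works on spelling rather than on phonemes, treats only the letters a, e, i, o, and u as vowels, and handles a single syllable at a time. The function name `split_syllable` is an assumption made for this example.

```python
VOWEL_LETTERS = set("aeiou")  # simplification: orthographic vowels only


def split_syllable(syllable):
    """Split one written syllable into (onset, nucleus, coda)."""
    # Find the first vowel letter: everything before it is the onset.
    first = next((i for i, ch in enumerate(syllable) if ch in VOWEL_LETTERS), None)
    if first is None:
        raise ValueError("a syllable needs a nucleus (a vowel)")
    # Extend the nucleus over any immediately following vowel letters.
    last = first
    while last + 1 < len(syllable) and syllable[last + 1] in VOWEL_LETTERS:
        last += 1
    onset = syllable[:first]
    nucleus = syllable[first:last + 1]
    coda = syllable[last + 1:]
    return onset, nucleus, coda


onset, nucleus, coda = split_syllable("clumps")
print(onset, nucleus, coda)     # cl u mps
print("rime:", nucleus + coda)  # rime: umps
print(split_syllable("a"))      # ('', 'a', '')
```

The last call reflects the point that only the nucleus is obligatory: a syllable can have an empty onset and an empty coda.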
In English, although we can use pitch to draw attention to a particular word, or convey additional information about it, different pitches do not change the meaning of the word (“mouse” spoken with a high or low pitch still means mouse). In some languages pitch is more important. In the Nigerian language Nupe, [ba] spoken with a high pitch means “to be sour,” but [ba] spoken with a low pitch means “to count.” Languages that use pitch to contrast meanings are called tone languages.
LINGUISTIC APPROACHES TO SYNTAX
Linguistics provides us with a language for describing syntax. In particular, the work of the American linguist Noam Chomsky (b. 1928) has been influential in indicating constraints on how powerful human language must be, and how it should best be described. We looked at his influence on the development of psycholinguistics in Chapter 1.
The American linguist Noam Chomsky argued that language is innate, species-specific, and biologically pre-programmed.
The linguistic theory of Chomsky
Chomsky’s work is based on two related ideas: first, the relations between language and the brain, and how children acquire language, and second, a technical description of the structure of language. I examine his views on the relation between language and thought and on language acquisition in Chapters 3 and 4. Chomsky argued that language is a special feature that is innate, species-specific, and biologically pre-programmed, and that is a faculty independent of other cognitive structures. Here we are primarily concerned with the more technical aspect of his theory.
For Chomsky, the goal of the study of syntax is to describe the set of rules, or grammar, that enables us to produce and understand language. Chomsky (1968) argued that it is important to distinguish between our idealized linguistic competence, and our actual linguistic performance. Our linguistic competence is what is tapped by our intuitions about which are acceptable sentences of our language, and which are ungrammatical strings of words. We know that the sentence “The vampire the ghost loved ran away” is grammatical, even if we have never heard it before, while we also know that the string of words “The vampire sleep the ghost ran away” is ungrammatical. Competence concerns our abstract knowledge of our language. It is about the judgments we would make about language if we had sufficient time and memory capacity. In practice, of course, our actual linguistic performance (the sentences that we actually produce) is greatly limited by these factors. Furthermore, the sentences we actually produce often use the more simple grammatical constructions. Our speech is full of false starts, hesitations, speech errors, and corrections. The actual ways in which we produce and understand sentences are also in the domain of performance.
In his more recent work, Chomsky (1986) distinguished between externalized language (E-language) and internalized language (I-language). For Chomsky, E-language linguistics is about collecting samples of language and understanding their properties; in particular it is about describing the regularities of a language in the form of a grammar. I-language linguistics is about what speakers know about their language. For Chomsky, the primary aim of modern linguistics should be to specify I-language: it is to produce a grammar that describes our knowledge of the language, not the sentences we actually produce.
Another way of putting this is that I-language is about mental phenomena, whereas E-language is about social phenomena (Cook & Newson, 2007). Competence is an aspect of I-language.
As a crude generalization, we can say that psycholinguists are more interested in our linguistic performance, and linguists in our competence. Nevertheless, many of the issues of competence are relevant to psychologists. In particular, linguistics provides a framework for describing and thinking about syntax, and its theories place possible constraints on language acquisition.
Let us look at the notion of a grammar in more detail. A grammar uses a finite number of rules that in combination can generate all the sentences of a language; hence we talk of generative grammar. Obviously we could produce a device that could emit words randomly, and although this might, like monkeys typing away with infinite time to spare, produce the occasional sentence, it will mainly produce garbage. For example, “dog vampire cat chase” is a non-sentence in English. It is an important constraint that although our grammar must be capable of generating all the sentences of a language, it should also never generate non-sentences. (Of course, from time to time we erroneously produce non-sentences, but this is an aspect of performance; remember we are concerned only with linguistic competence here.) Chomsky further argued that a grammar must give an account of the underlying syntactic structure of sentences. The sentence structures that the grammar creates should capture our intuitions about how sentences and fragments of sentences are related. We know that “the vampire kissed the ghost” and “the ghost was kissed by the vampire” are related in some way. Finally, linguistic theory should also explain how children acquire these rules.
Chomsky’s linguistic theory has evolved greatly over the years. The first version was described in a book called Syntactic Structures (1957). The 1965 version became known as the “standard theory”; this was followed in turn by the “extended standard theory,” “revised extended standard theory,” and then “government and binding (or GB) theory” (Chomsky, 1981). The latest version is called minimalism (Chomsky, 1995). Nevertheless, the central theme is that language is rule-based, and that our knowledge of syntax can be captured in a finite number of syntactic rules. A moment’s reflection should show that language involves rules, even if we are not always aware of them. How else would we know that “Vlad bought himself a new toothbrush” is acceptable English but “Vlad bought himself toothbrush new a” is not?
Describing syntax and phrase-structure grammar
How should we describe the rules of grammar? Chomsky proposed that phrase-structure rules are an essential component of our grammar, although he went on to argue that they are not the only component. An important aspect of language is that we can construct sentences by combining words according to rules. Phrase-structure rules describe how words can be combined, and provide a method of describing the structure of a sentence. The central idea is that sentences are built up hierarchically from smaller units using rewrite rules. The set of rewrite rules constitutes a phrase-structure grammar. Rewrite rules are simply rules that translate a symbol on the left-hand side of the rule into those on the right-hand side. For example, (1) is a rewrite rule that says “a sentence (S) can be rewritten as a noun phrase (NP) followed by a verb phrase (VP)”:
(1) S → NP + VP
In a phrase-structure grammar, there are two main types of symbol: terminal elements (consisting of vocabulary items or words) and non-terminal elements (everything else). It is important to realize that the rules of grammar do not deal with particular words, but with categories of words that share grammatical properties. Words fall into classes such as nouns (words used to name objects and ideas, both concrete and abstract, such as “pig,” or “truth”), adjectives (words used to describe, such as “pink,” or “lovely”), verbs (words describing actions or states, or an assertion, such as “kiss,” or “modify”), adverbs (words qualifying verbs, such as “quickly”), determiners (words determining the number of nouns they modify, such as “the,” “a,” and “some”), prepositions (words such as “in,” “to,” and “at”), conjunctions (words such as “and,” “because,” and “so”), pronouns (“he,”
“she,” “it”), and so on. Box 2.2 is an example of a phrase-structure grammar that accounts for a fragment of English.

Box 2.2 A grammar for a fragment of English
S → NP VP (A)
NP → DET N (B)
NP → N (C)
VP → V NP (D)
VP → V (E)
N → Vlad, Boris, poltergeist, vampire, werewolf, ghost …
V → loves, hates, likes, bites, is …
DET → the, a, an …
Abbreviations
S    sentence       N    noun
NP   noun phrase    V    verb
VP   verb phrase    DET  determiner

We can distinguish two types of word. Content words do most of the semantic work of the language, and function words do most of the grammatical work. Content words include nouns, adjectives, verbs, and most adverbs. Function words include determiners, conjunctions, prepositions, and pronouns. Function words tend to be short and used very frequently. Whereas the number of content words is very large and changing (we often coin new content words, such as “television” and “computer”), the number of function words is small and fixed (at about 360). For this reason, content words are sometimes called open-class words, and function words closed-class items.
Words combine to make phrases, which express a single idea. For instance, “Vlad,” “the vampire,” “the old vampire,” and “the grouchy old vampire” are all examples of noun phrases: they can all take the part of nouns in sentences. They all make acceptable beginnings to the sentence fragment “__ bought a new toothbrush.” Phrases are constituents that can generally be systematically replaced by a single word while maintaining the same sentence structure. Hence in the sentence “The nasty vampire laughed at the poor ghost,” “The nasty vampire” is a phrase (as it can be replaced by, for example, “Vlad”), whereas “The nasty” is not; “laughed at the poor ghost” is a phrase (for example, it can be replaced by just “laughed”), but “at the” is not.
Phrases combine to make clauses. Clauses contain a subject (used to mention something), and a predicate (the element of the clause that gives information about the subject). Every clause has a verb. Sentences contain at least one clause but may contain many more. The essential idea of a phrase-structure grammar is the analysis of the sentence into its lower level constituents, such as noun phrases, verb phrases, nouns, and verbs. Indeed, this approach is sometimes called constituent analysis. Constituents are components of larger constructions.
Two other important syntactic notions are the subject and the object of a sentence. The subject of a sentence is the noun phrase that is immediately dominated by the highest-level element, the sentence node. An easy test to discover the subject of a sentence is to turn the sentence into a question that can be answered by “yes” or “no” (Burton-Roberts, 1997). The phrase that functions as the subject is the one required to change its position in forming the question. So from (2) “the vampire” is forced to change position (relative to “is”) to form the question in (3); hence “the vampire” is the subject:
(2) The vampire is kissing the witch.
(3) Is the vampire kissing the witch?
There are different types of verbs, each requiring different syntactic roles to create acceptable structures. Transitive verbs require a single noun phrase called a direct object. “Kisses” is a transitive verb. In (4) “the vampire” is the subject and “the witch” is the object. Intransitive verbs do not require any further noun phrase; in (5) “laughs” is an intransitive verb. Ditransitive verbs require two noun phrases called the direct object and the indirect object; in (6) “the vampire” is the subject, “the ring” is the direct object, and “the witch” is the indirect object.
(4) The vampire kisses the witch.
(5) The vampire laughs.
(6) The vampire gives the ring to the witch.
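To make the idea of a generative phrase-structure grammar concrete, the rules of Box 2.2 can be written down as data and used to produce word strings. The following is a minimal sketch, not a claim about how any linguist implements grammars: the dictionary names `GRAMMAR` and `LEXICON` and the function `generate` are assumptions made for this example, and the choice between alternative rules is simply random.

```python
import random

# Rewrite rules (A)-(E) from Box 2.2: each non-terminal maps to its possible expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],          # rule (A)
    "NP": [["DET", "N"], ["N"]],  # rules (B) and (C)
    "VP": [["V", "NP"], ["V"]],   # rules (D) and (E)
}

# Terminal elements (words) for each word class in Box 2.2.
LEXICON = {
    "N": ["Vlad", "Boris", "poltergeist", "vampire", "werewolf", "ghost"],
    "V": ["loves", "hates", "likes", "bites", "is"],
    "DET": ["the", "a", "an"],
}


def generate(symbol, rng):
    """Expand a symbol into a list of words by applying rewrite rules."""
    if symbol in LEXICON:
        return [rng.choice(LEXICON[symbol])]
    expansion = rng.choice(GRAMMAR[symbol])
    words = []
    for child in expansion:
        words.extend(generate(child, rng))
    return words


rng = random.Random(0)  # fixed seed so that repeated runs give the same strings
for _ in range(3):
    print(" ".join(generate("S", rng)))
```

Running the sketch quickly shows how limited this grammar is: because the rules know nothing about proper names or about the choice between “a” and “an,” strings such as “an ghost hates the Vlad” can be produced alongside well-formed sentences.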
Because each sentence must contain at least

one clause, and each clause must have a subject, it follows that every sentence must have a subject. Not all sentences have an object, however. Sentences containing just intransitive verbs, such as (5), contain only a subject.
You might think by now that the subject is that which is doing the action, and the object is having something done to it. This type of description is a semantic analysis in terms of semantic roles or themes. While this generalization is true for many sentences (called active sentences), it is not always true. Consider sentence (7):

(7) The vampire is being kicked by the witch.
(8) S → The vampire + verb phrase + prepositional phrase.

Now which is the grammatical subject of this sentence and which is the grammatical object? If we apply the yes–no question test, we form “Is the vampire being kicked by the witch?,” with “the vampire” moving position. “The witch” stays where it is. In addition, the structure of (7) is outlined in (8). Clearly “the vampire” is immediately dominated by the sentence node. Hence “the vampire” is the subject of this sentence, even though “the witch” is doing the action and “the vampire” is having the action done to him. This type of sentence structure is called a passive. The object in the active form of the sentence has become the grammatical subject of the passive form. We will examine passives in more detail later.
The simple grammar in Box 2.2 can be used to generate a number of simple sentences. Let us start by applying some of these rewrite rules to show how we can generate a sentence (9). The goal is to show how a sentence can be made up from terminal elements:

(9) Starting with S, rule (A) from Box 2.2 gives us NP + VP.
Rule (B) gives us DET + N + VP.
Rule (D) gives us DET + N + V + NP.
Rule (C) gives us DET + N + V + N.

Then the substitution of words gives us, for example, the following sentence: “The vampire loves Boris.”
We desire more of a grammar than that it should merely be able to generate sentences: we need a way to describe the underlying syntactic structure of sentences. This is particularly useful for syntactically ambiguous sentences. These are sentences that have more than one interpretation, such as the sentence “I saw the witches flying to America.” This could be paraphrased as either “When I was flying to America, I saw the witches,” or “There I was standing on the ground when I looked up and there were the witches flying off to America.” A phrase-structure grammar also enables us to describe the syntactic structure of a sentence by means of a tree diagram, as shown for the sentence “The vampire loves Boris” in Figure 2.4. The points on the tree corresponding to constituents are called nodes. The node at the top of the tree is the sentence or S node; at the bottom are terminal nodes corresponding to words; in between are non-terminal nodes corresponding to constituents such as NP and VP.

FIGURE 2.4 Parse tree for the sentence “The vampire loves Boris.” [S [NP [DET The] [N vampire]] [VP [V loves] [NP [N Boris]]]]

Tree diagrams are very important in the analysis of syntax, and it is important to be clear about what they mean. The underlying structure of a sentence or a phrase is sometimes called its phrase structure or phrase marker. It should be reiterated that the important idea is capturing the underlying syntactic structure of sentences; it is not our goal here to explain how we actually produce or understand them. Furthermore, at this stage directionality is not important; the directions of the arrows in Box 2.2 do not mean that we are limited to talking about sentence production. Our discussion at present applies equally to production and comprehension.
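The derivation in (9) can be traced mechanically. The following is a minimal sketch, not a definitive implementation: the rules (A) to (D) are written as they can be inferred from the steps of (9) rather than copied from Box 2.2, and the small lexicon and the helper names (`apply_rule`, `derive`, `substitute`) are assumptions made for illustration.

```python
# Rewrite rules as inferred from derivation (9); labels follow Box 2.2.
RULES = {
    "A": ("S", ["NP", "VP"]),
    "B": ("NP", ["DET", "N"]),
    "C": ("NP", ["N"]),
    "D": ("VP", ["V", "NP"]),
}

# Hypothetical lexicon: just enough words for the example sentence.
LEXICON = {"DET": {"the"}, "N": {"vampire", "boris"}, "V": {"loves"}}


def apply_rule(symbols, label):
    """Rewrite the leftmost occurrence of the rule's left-hand symbol."""
    lhs, rhs = RULES[label]
    i = symbols.index(lhs)  # raises ValueError if the rule cannot apply
    return symbols[:i] + rhs + symbols[i + 1:]


def derive(order):
    """Apply the named rules in order, starting from S, keeping every step."""
    steps = [["S"]]
    for label in order:
        steps.append(apply_rule(steps[-1], label))
    return steps


def substitute(categories, words):
    """Replace each terminal category with a word of that category."""
    if len(categories) != len(words):
        raise ValueError("need exactly one word per category")
    for category, word in zip(categories, words):
        if word.lower() not in LEXICON[category]:
            raise ValueError(f"{word!r} is not listed as {category}")
    return " ".join(words) + "."


steps = derive(["A", "B", "D", "C"])
for step in steps:
    print(" + ".join(step))
print(substitute(steps[-1], ["The", "vampire", "loves", "Boris"]))
```

Each printed line corresponds to one step of (9), ending with DET + N + V + N and then the sentence itself. Applying the rules in a different order, or choosing rule (C) rather than (B) for the first NP, yields other strings that the same small grammar licenses.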
Phrase-structure rules provide us with the underlying syntactic structure of sentences we both produce and comprehend.
Clearly, this is an extremely limited grammar. One obvious omission is that we cannot construct more complex sentences with more than one clause in them. However, we could do this by introducing conjunctions. A slightly more complex example would be using a relative clause with a relative pronoun (such as “which,” “who,” or “that”) to produce sentences such as (10):

(10) The vampire who loves Boris is laughing.

Natural language could only be described by a much more complex phrase-structure grammar that contained many more rules. We would also need to specify detailed restrictions on when particular rules could and could not be applied. We would then have a description of a grammar that could generate all of the sentences of a language and none of the non-sentences. Obviously another language, such as French or German, would have a different set of phrase-structure rules.
Although these grammars might be very large, they will still contain a finite number of rules. In real languages there is potentially an infinite number of sentences. How can we get an infinite number of sentences from a finite number of rules and words? We can do this because of special rules based on what are known as recursion and iteration. Recursion occurs when a rule uses a version of itself in its definition. Recursive rules enable phrases to contain examples of the same sort of phrase, such as in the old song “Little does she know that I know that she knows that I know …” (Kursaal Flyers, 1976). One of the most important uses of recursion is to embed a sentence within another sentence, producing center-embedded sentences. Examples (12) and (13) are based on (11):

(11) The vampire loved the ghoul.
(12) The vampire the werewolf hated loved the ghoul.
(13) The vampire the werewolf the ghost scared hated loved the ghoul.
(14) *The vampire who the werewolf who the ghost had scared loved the ghoul.

This process of center-embedding could potentially continue forever, and most linguists would argue that the sentence would still be perfectly well-formed; that is, it would still be grammatical. Of course, we would soon have difficulty in understanding such sentences, for we would lose track of who scared whom and who loved what. Many people have difficulty with sentence (13), and many people find constructions such as (14) grammatically acceptable, although it is missing a verb (Gibson & Thomas, 1999). Although we might rarely or never produce center-embedded sentences, our grammar must be capable of producing them, or at least of deciding that they are grammatical. Given a piece of paper and sufficient time, you could still understand sentences of this type. This observation reflects the distinction between competence and performance mentioned earlier: We have the competence to understand these sentences, even if we never produce them in actual performance. (Remember that judgments of grammatical acceptability are based on intuitions, and these might vary. Not everyone would agree that sentences with a large number of center-embeddings are grammatical. Indeed, there is some controversy in linguistics about their status; see Hawkins, 1990.) Nevertheless, most people think that recursion is a central property of language and perhaps human thought (Fitch, Hauser, & Chomsky, 2005).
Iteration enables us to carry on repeating the same rule, potentially for ever. For example, we can use iteration to produce sentences such as (15).

(15) The nice vampire loves the ghost and the ghost loves the vampire and the friendly ghost loves the vampire and …

There are different types of phrase-structure grammar. Context-free grammars contain only rules that are not specified for particular contexts, whereas context-sensitive grammars can have rules that can only be applied in certain circumstances. In a context-free rule, the left-hand symbol can always be rewritten as the right-hand one regardless of the context in which it occurs. A context-sensitive rule, by contrast, applies only in a particular context: for example, whether a verb is written in its singular or plural form depends on the preceding noun phrase.
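Recursion and iteration as described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the recursive rule written in the docstring (a noun phrase that may contain another noun phrase followed by a verb) is a simplification invented here to reproduce examples (11) to (13), and the function names are hypothetical.

```python
def noun_phrase(nouns, embedded_verbs):
    """Hypothetical recursive rule: NP -> the N | the N NP V.

    nouns[0] is the head noun; each further noun is the subject of a
    clause embedded inside the phrase before it.
    """
    head = "the " + nouns[0]
    if len(nouns) == 1:
        return head  # base case: no embedded clause
    # Recursive case: the rule uses a version of itself in its definition.
    inner = noun_phrase(nouns[1:], embedded_verbs[1:])
    return f"{head} {inner} {embedded_verbs[0]}"


def sentence(nouns, embedded_verbs):
    """Build a center-embedded variant of example (11)."""
    if len(embedded_verbs) != len(nouns) - 1:
        raise ValueError("each embedded noun needs exactly one verb")
    text = f"{noun_phrase(nouns, embedded_verbs)} loved the ghoul."
    return text[0].upper() + text[1:]


def conjoin(clauses):
    """Iteration: the same clause pattern repeated with 'and'."""
    return " and ".join(clauses)


print(sentence(["vampire"], []))
print(sentence(["vampire", "werewolf"], ["hated"]))
print(sentence(["vampire", "werewolf", "ghost"], ["hated", "scared"]))
print(conjoin(["The nice vampire loves the ghost",
               "the ghost loves the vampire",
               "the friendly ghost loves the vampire"]))
```

The first three calls print sentences (11), (12), and (13); the last prints the beginning of (15). Nothing in the functions limits the depth of embedding or the number of conjoined clauses, which is the sense in which a finite set of rules yields an unbounded set of sentences.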
Transformations

Chomsky argued that a phrase-structure grammar is not capable of capturing our linguistic competence. Although it can produce any sentence of the language while not producing any non-sentences, and although it can provide an account of the structure of sentences, it cannot explain the relation between related sentences. Consider sentences (16) and (17):

(16) The vampire chases the ghost.
(17) The ghost is chased by the vampire.

Clearly our linguistic intuitions tell us that sentence (16) is related to sentence (17), but how can we capture this relation in our grammar? Phrase-structure grammars are not capable of capturing some relations. Chomsky (1957) showed that knowledge of such relations could be flagged by the introduction of special rewrite rules known as transformations. Transformations are so central to the theory that the whole approach became known as transformational grammar. A normal rewrite rule takes a single symbol on the left-hand side (e.g., S, NP, or VP), and rewrites it as something else more complex. A transformation is a special type of rewrite rule that takes a string of symbols (i.e., more than one symbol) on the left-hand side, and rewrites this string as another string on the right-hand side. Sentences (16) and (17) are related to each other by what is called the passivization transformation; (17) is the passive form of the active form (16). The transformation that achieves this change looks like (18):

(18) NP1 + V + NP2 → NP2 + auxiliary + V* + by + NP1

An auxiliary verb is a special verb (here, “is”), and the asterisk indicates that it is necessary to change the form of the main verb, here by changing the “-s” ending to an “-ed” ending.
Chomsky postulated many other types of transformations. For example, we can turn the affirmative declarative form of a sentence (16) into an interrogative or question form (19), or into a negative form (20). We can also combine transformations, for example to form a negative question, as in (21). The sentence that formed the basis of all the transformed versions (here, 16) was called the kernel sentence.

(19) Does the vampire chase the ghost?
(20) The vampire does not chase the ghost.
(21) Does the vampire not chase the ghost?

Not only do transformations capture our intuitions about how sentences are related, but they also enable the grammar to be simplified, primarily because rules that enable us to rewrite strings as other strings capture many of the aspects of the dependencies between words (particularly the context-sensitive aspect described earlier).
Of course, in a fully fledged grammar the rules would be much more numerous and much more complex. For example, we have not looked at the details of changes to the form of the verb, or specified the types of sentences to which passivization can be applied.

Surface and deep structure
Chomsky (1965) presented a major revision of the theory, usually called the standard theory. The changes were primarily concerned with the structure of the linguistic system and the nature of the syntactic rules. In the new model, there were now three main components. First, a semantic system (which had no real counterpart in the earlier model) assigned meaning to the syntactic strings; second, a phonological component turned syntactic strings into phonological strings; and third, a syntactic component was concerned with word ordering. The syntactic component in turn had two components, a set of base rules (roughly equivalent to the earlier phrase-structure rules), and transformational rules.
Perhaps the most important extension of this later theory was the introduction of the distinction between deep structure and surface structure (now called d-structure and s-structure). To some extent this distinction was implicit in the earlier model with the concept of kernel sentences, but the revised model went further, in that every sentence was stipulated to have a deep structure and a surface structure. Furthermore, there was no
longer a distinction between optional and obligatory transformations. In a sense all transformations became obligatory, in that markers for them are represented in the deep structure.
In the standard theory, the syntactic component generated a deep structure and a surface structure for every sentence. The deep structure was the output of the base rules and the input to the semantic component; the surface structure was the output of the transformational rules and the input to the phonological rules. Describing sentences in terms of their deep structure has two main advantages. First, some surface structures are ambiguous in that they have two different deep structures. Second, what is the subject and what is the object of the sentence is often unclear in the surface structure. Sentence (22) is ambiguous in its surface structure. However, there is no ambiguity in the corresponding deep structures, which can be paraphrased as (23) and (24):

(22) The hunting of the vampires was terrible.
(23) The way in which the vampires hunted was terrible.
(24) It was terrible that the vampires were hunted.

Sentences (25) and (26) have the same surface structure, yet completely different deep structures:

(25) Vlad is easy to please.
(26) Vlad is eager to please.

In (25), Vlad is the deep structure object of please; in (26), Vlad is the deep structure subject of please. This difference becomes apparent when we try to paraphrase the sentences: (25) can be restated in the form of (27), but (26) cannot be restated in the same way, as (28) is clearly ungrammatical. (The ungrammaticality is conventionally indicated by an asterisk.)

(27) It is easy to please Vlad.
(28) *It is eager to please Vlad.

Principles and parameters theory, and minimalism
As Chomsky’s theory continued to develop, many of the features of the grammars changed, although the basic goals of linguistics remained the same.
The new “standard version of the theory” was originally known as Government and Binding (GB) theory (Chomsky, 1981), but the term principles and parameters theory is now more widely used. This name emphasizes the central idea that there are principles that are common to all languages and parameters that vary from language to language (see Chapter 4).
There have been a number of important changes in the more recent versions of the theory. First, with time, the number of transformations steadily dwindled. Second, related to this, the importance of deep structure has also dwindled (Chomsky, 1991). Third, when constituents are moved from one place to another, they are hypothesized as leaving a trace in their original position. (This has nothing to do with the TRACE model of spoken word recognition that will be described in Chapter 9.) Fourth, special emphasis is given to the most important word in each phrase. For example, in the noun phrase “the vampire with the garlic,” the most important noun is clearly “vampire,” not “garlic.” (This should be made clear by the observation that the whole noun phrase is about the vampire, not about the garlic.) The noun “vampire” is said to be the head of the noun phrase.
Fifth, the revised theory permits units intermediate in size between nouns and noun phrases, and verbs and verb phrases. The rules are phrased in terms of what is called X̄ (pronounced “X-bar”) syntax (Jackendoff, 1977; Kornai & Pullum, 1990). The intermediate units are called N̄ (pronounced noun-bar) and V̄ (verb-bar), and are made up of the head of a phrase plus any essential arguments or role players. Consider the phrase “the king of Transylvania with a lisp.” Here “king” is an N and the head of the phrase; “the king of Transylvania” an N̄ (because Transylvania is the argument of “king,” the place that the king is king of); and “the king of Transylvania with a lisp” an NP. This approach distinguishes between essential arguments (such as “of Transylvania”) and optional adjuncts or modifiers (such as “with a lisp”). The same type of argument applies to verbs, which also have obligatory arguments (even if they are not always stated) and optional modifiers. The advantage of this description is that it captures new generalizations, such as if a noun
phrase contains both argument and adjunct, the argument must always be closer to the head than the adjunct: “The king with a lisp of Transylvania” is distinctly odd. It is an important task of linguistics to capture and explain such generalizations. This method of description also enables the specification of a very general rule such as (29):

(29) X̄ → X, ZP*

That is, any phrase (X-bar) contains a head with any number of modifiers (ZP*). Such an abstract rule is an elegant blueprint for the treatment of both noun phrases and verb phrases, and captures the underlying similarity between them.
English is a head-first language. Japanese, on the other hand, is a head-last language. Nevertheless, both languages distinguish between heads and modifiers; this is an example of a very general rule that Chomsky argues must be innate. This general rule is an example of a parameter. The setting of the parameter that specifies head-first or head-last is acquired through exposure to a particular language (Pinker, 1994). I examine parameters and their role in language acquisition in Chapter 4.
In the most recent reworking of his ideas, the minimalist program aims to simplify the grammar as much as possible (Chomsky, 1995). The Principle of Economy requires that all linguistic representations and processes should be as economical as possible; the theoretical and descriptive apparatus necessary to describe language should be minimized (Radford, 1997). The less complex a grammar, the easier it should be to learn. Although this principle sounds simple, its implications for the detailed form of the theory are vast. In minimalism, the role of abstract, general grammatical rules is virtually abolished. Instead, the lexicon incorporates many aspects of the grammar. For example, information about how transitive verbs take on syntactic roles is stored with the verbs in the lexicon, rather than stored as an abstract grammatical rule. Instead of phrase-structure rules, categories are merged to form larger categories. The lexical representations of words specify grammatical features that control the merging of categories. These ideas are echoed by modern accounts of parsing.
Chomsky is the most influential figure in the history of linguistics, with his central idea being that the goal of linguistics is to specify the rules of a grammar that captures our linguistic competence. Later I look at the implications of this idea for psycholinguistics.

Optimality Theory and Cognitive Linguistics
Although Chomsky’s earlier work had great influence on the psycholinguistics of the time, this influence has waned. Minimalism, although important for linguists, has had no impact on psycholinguistics. Many of the key ideas of modern psycholinguistics are reflected in other branches of linguistics, particularly Optimality Theory (McCarthy, 2001). Optimality Theory has been applied to phonology, morphology, semantics, and syntax; its main idea is that the surface form of an expression results from the resolution of conflicts between underlying representations. It shares much with connectionist approaches to language. As we shall see in Chapter 10, one important approach to understanding sentences is that of constraint satisfaction; we try to satisfy as many constraints as possible, and make sure that we satisfy all the important ones. We choose the best interpretation available in the context on the basis of all data.
Cognitive Linguistics is the name given to the general approach that emphasizes language as one aspect of general cognition. In contrast with Chomsky’s generative grammar approach, cognitive linguists do not believe there is a separate faculty of language, and argue that we process language using the same sorts of cognitive process as we use in every other aspect of cognition. We learn language using general cognitive processes, rather than language-specific ones. These ideas are reflected in psycholinguistic approaches to language acquisition that emphasize the importance of general learning mechanisms (see Chapter 4).

The formal power of grammars
This part is relatively technical and can be skipped, but the ideas discussed in it are useful
for understanding how powerful a grammar must be if it is to be able to describe natural language. The study of different types of grammar and the devices that are necessary to produce them is part of the branch of mathematical linguistics or computational theory (a subject that combines logic, linguistics, and computer science) called automata theory. Automata theory also reveals something of the difficulty of the task confronting the child who is trying to learn language. An automaton is a device that embodies a grammar and that can produce sentences that are in accordance with that grammar. It takes an input and performs some elementary operations, according to some previously specified instructions, to produce an output. The topic is of some importance because if we know how complex natural language is, we might expect this to place some constraints on the power of the grammar necessary to cope with it.
We have already defined a grammar as a device that can generate all the sentences of a language, but no non-sentences. A language is not restricted to natural language: it can be an artificial language (such as a programming language), or a formal language such as mathematics. In fact, there are many possible grammars that fall into a small number of distinct categories, each with different power. Each grammar corresponds to a particular type of automaton, and each type produces languages of different complexity.
We cannot produce all the sentences of natural language simply by listing them, because there are an infinite number of grammatically acceptable sentences. To be able to produce all these sentences, our grammar must incorporate recursive and iterative rules. Some rules need to be sensitive with respect to the context in which the symbols they manipulate occur. Context-free and context-sensitive languages differ in whether they need rules that can be specified independently of the context in which the elements occur. How complex is natural language, and how powerful must the grammar be that produces it?
The simplest type of automaton is known as a finite-state device. This is a simple device that moves from one state to another depending only on its current state and current input, and produces what is known as a Type 3 language. The current state of a finite-state device is determined by some finite number of previous symbols (words). Type 3 grammars are also known as right-linear grammars, because every rewrite rule can only be of the form A → x or A → x B, where x is a terminal element and A and B are non-terminal symbols. This produces right-branching tree structures. For example, if you use the rules in (30) you can produce sentences such as in (31). Just substitute the appropriate letters; the vertical separator | separates alternatives.

(30) S → the A | a A
A → green A | vicious A
A → ghost B | vampire B
B → chased C | loved C | kissed C
C → the D | a D
D → witch | werewolf
(31) The vicious vampire chased the witch. A green vicious ghost kissed the werewolf.

The corresponding finite-state device is depicted in Figure 2.5. The finite-state device always starts in the S state, and then reads a word from the appropriate category to move from each state on to the next. It finishes producing sentences when it reaches the end state. We can produce even longer sentences if we allow iteration with a rule such as (32), which will enable us to produce sentences of the form (33).

(32) D → and S
(33) The vicious vampire chased the witch and a green vicious ghost kissed the werewolf.

Next up in power from a finite-state device is a push-down automaton. This is more powerful than a finite-state device because it has a memory; the memory is limited, however, in that it is only a push-down stack. A push-down stack is a special type of memory where only the last item stored on the stack can be retrieved; if you want to get at something stored before the last thing, everything stored since will be lost. It is like a pile of plates. It produces Type 2 grammars that can parse context-free languages. Next in power is a linear-bounded automaton, which has a limited memory, but can retrieve anything from this memory. It produces Type 1 grammars, parsing context-sensitive languages. Finally, the most powerful automaton, a
Turing machine, has no limitations, and produces a Type 0 grammar.

FIGURE 2.5 An example of a finite-state device. Panel (a): states S, A, B, C, D, and END, with transitions labeled “the, a” (S to A), “green, vicious” (A to A), “vampire, ghost” (A to B), “chased, loved, kissed” (B to C), “the, a” (C to D), and “witch, werewolf” (D to END). Panel (b): the same device with an additional transition labeled “and” that allows iteration.

Chomsky (1957) showed that natural language cannot be characterized by a finite-state device. In particular, a finite-state device cannot produce arbitrarily long sequences of multiple center-embedded structures, where the sequence of embedding could carry on for ever. You can only produce these sorts of sentences if the automaton has a memory to keep track of what it has produced so far. Recursion is necessary to account for this type of complexity, and recursion is beyond the scope of finite-state devices. At the time this conclusion was surprising: Theories of language were dominated by behaviorism and information theory, and it was thought that knowledge of the previous states was all that was necessary to account for human language. In effect, Chomsky showed that no matter how many previous words were taken into account, a finite-state device cannot produce or understand natural language. An important extension of this argument is that children cannot learn language simply by conditioning.
Chomsky went further and argued that neither context-free nor context-sensitive grammars provided an account of human language. He argued that it is necessary to add transformations to a phrase-structure grammar; the resulting grammar is then a Type 0 grammar, and can only be produced by a Turing machine. Chomsky thought that transformations were needed to show how sentences are related to each other. They also simplify the phrase-structure rules necessary and provide a more elegant treatment of the language. Finally, there is some linguistic evidence that appeared to show that no context-free or context-sensitive grammar can account for certain constructions found in natural language. For example, Postal (1964) argued that the Mohawk language contains intercalated dependencies, in which words are cross-related (such as a₁ a₂ … aₙ b₁ b₂ … bₙ, where a₁ relates to b₁, and so on). Hence it seems that natural human language can only be produced by the most powerful of all types of grammar.
Although this conclusion was accepted for a long time, it has been disproved. First, it is not clear that all the complex dependencies between words described by Chomsky and Postal are necessarily grammatical. Second, there is a surprising formal demonstration by Peters and Ritchie (1973) that context can be taken into account without exceeding the power of a context-free grammar. Third, Gazdar, Klein, Pullum, and Sag (1985) showed that context-free languages can account for the phenomena of natural language thought to necessitate context sensitivity if more complex syntactic categories are incorporated into the grammar. So while a finite-state device is too weak to describe human language, a Turing machine might be unnecessarily powerful.
Finally, it is worth reiterating that, although most of the examples in this chapter are in English, the same basic principles will apply to other languages. The rules and descriptions will differ from language to language, but we can use the same underlying approaches (e.g., describing sounds by their method of articulation, or grammars in terms of phrase-structure rules) to describe all languages.
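The finite-state device of Figure 2.5 and the push-down stack described in this section can be sketched in a few lines. This is a hedged illustration rather than a formal treatment. The transition table is read off rules (30); the placement of the “and” arc (from END back to S) is an assumption about panel (b); and reducing center-embedding to a run of nouns followed by an equal run of verbs is a deliberate simplification. All function names are hypothetical.

```python
# Transition table read off rules (30): state -> {word: next state}.
TRANSITIONS = {
    "S": {"the": "A", "a": "A"},
    "A": {"green": "A", "vicious": "A", "ghost": "B", "vampire": "B"},
    "B": {"chased": "C", "loved": "C", "kissed": "C"},
    "C": {"the": "D", "a": "D"},
    "D": {"witch": "END", "werewolf": "END"},
    # Assumption: the "and" arc of Figure 2.5(b) leads from END back to S.
    "END": {"and": "S"},
}


def accepts(sentence):
    """Finite-state device: the next state depends only on state and word."""
    state = "S"
    for word in sentence.lower().rstrip(".").split():
        state = TRANSITIONS[state].get(word)
        if state is None:
            return False  # no transition for this word in this state
    return state == "END"


def center_embedded(categories):
    """Push-down sketch: every noun pushed must later be matched by a verb.

    `categories` is a list of "N" and "V" labels; the pattern accepted is
    n nouns followed by n verbs, for any n >= 1.
    """
    stack = []
    seen_verb = False
    for category in categories:
        if category == "N" and not seen_verb:
            stack.append(category)  # remember an unfinished clause
        elif category == "V" and stack:
            seen_verb = True
            stack.pop()  # the most recent noun is matched first
        else:
            return False
    return seen_verb and not stack


print(accepts("The vicious vampire chased the witch."))   # True
print(accepts("The vampire chased"))                      # False
print(center_embedded(["N", "N", "N", "V", "V", "V"]))    # True
print(center_embedded(["N", "N", "V"]))                   # False
```

The table in `accepts` has a fixed number of states, so it cannot count how many clauses are still open. `center_embedded` handles any depth because its stack can grow without limit, which is the extra memory that the text attributes to the push-down automaton.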
SUMMARY
x The basic sounds of a language are called phonemes.

x Different languages use different phonemes, and languages vary in the differences in sounds that
are important.
x Phonetics describes the acoustic detail of speech sounds and how they are articulated; phonology
describes the sound categories each language uses to divide up the space of possible sounds.
x The IPA (International Phonetic Alphabet) provides a notation for sounds and a way of classifying
them.
x Consonants are made by almost closing the vocal tract, whereas vowels are made by modifying
its shape; in both cases the place of constriction determines the sound we make.
x Consonants further depend on the manner of articulation and whether voicing is present.
x Words can be divided into syllables, and syllables into onset and rimes.
x Syntactic rules specify the permissible orders of words in a language.
x Parsing is the process of computing the syntactic structure of language.
x Sentences can be analyzed by parse trees.
x The most influential work on linguistic theories of syntax has been that of Noam Chomsky.
x Chomsky distinguished between actual linguistic performance and idealized linguistic compe-
tence; the goal of linguistics is to provide a theory of competence.
x According to Chomsky, a complete linguistic theory will be able to generate all of the sentences
of a language and none of the non-sentences, will provide an account of people’s intuitions about
the knowledge of their language, and will explain how children can acquire language.
x The generative power of language is given by recursion and iteration.
x In his early work, Chomsky argued that sentences are generated by the operation of transforma-
tional rules on a deep structure representation generated by phrase-structure rules, resulting in a
surface structure representation.
x Chomsky later argued that important generalizations about language are best explained by a set of
principles and parameters; language acquisition involves setting these parameters to the appropri-
ate value given exposure to particular languages.
x In his more recent minimalist work Chomsky has attempted to simplify the grammar by incorpo-
rating many of its aspects into the lexicon.
x Automata theory provides a formal account of the power of artificial and natural languages;
Chomsky argued that only the most powerful automaton (the Turing machine) could cope with
natural language.
1. To what extent have linguistics and psycholinguistics converged or diverged?

2. What might psycholinguistics have to offer people trying to develop computer systems that
understand natural language?
3. Think about the different languages you know. What are their similarities and dissimilarities?
4. How would you describe samples of different dialects of your language (e.g., regions of Britain
or the USA) in terms of the IPA?
FURTHER READING
Crystal (2010) and Fromkin et al. (2011) provide excellent detailed introductions to phonetics and
phonology, and in particular give much more detail about languages other than English.
Fabb (1994) is a workbook of basic linguistic and syntactic concepts, and makes the meaning of
grammatical terms very clear, although most of the book avoids using the notion of a verb phrase, on
the controversial grounds that verb phrases are not as fundamental as other types of phrases. For a
more detailed account see Burton-Roberts (1997). Also try Tarshis (1992) for a friendly introduction
to grammatical rules in English. For a more advanced review, see Crocker (1999).
Pinker (1994) gives a brief and accessible description of Chomsky’s theory of syntax. Borsley
(1991) provides excellent coverage of contemporary linguistic approaches to syntax, and Radford
(1981) provides detailed coverage of the linguistic aspects of Chomsky’s extended theory. Radford
(1997) provides an excellent introduction to the minimalist approach; be warned, however, that this
is a very technical topic. An excellent, detailed yet approachable coverage of Chomsky’s theory,
which emphasizes principles and parameters theory, is Cook and Newson (2007). See also references
to his ideas on the development of language at the end of Chapter 3.
If you want to find out more about the relation between linguistics and psycholinguistics, read
the debate between Berwick and Weinberg (1983a, 1983b), Garnham (1983a), Johnson-Laird (1983),
and the articles by Stabler (1983) and Jackendoff (2003), with the subsequent peer commentaries.
An introduction to automata theory is provided in Johnson-Laird (1983) and Sanford (1985); a more
detailed and highly mathematical treatment can be found in Wall (1972).
See Fauconnier and Turner (2003) for a general account of cognition and language in the
cognitive linguistics vein.
SECTION B
THE BIOLOGICAL AND
DEVELOPMENTAL BASES OF LANGUAGE
Chapter 3, The foundations of language, asks where language came from, whether language is unique to humans, and what we can learn from attempts to teach human language to animals. Next we examine the biological basis of language and what mechanisms are necessary for its development. We look at the cognitive and social basis of human language development. Finally, we examine the relation between language and thought.
Chapter 4, Language development, is concerned with how language develops from infancy to adolescence. Do children have an innate device that enables them to acquire language from input that is often impoverished? How do infants learn to associate words with the objects they see in the world around them? How do they learn the rules that govern word order?
Chapter 5, Bilingualism and second language acquisition, asks what cognitive processes are involved when a child is brought up using two languages, and whether these differ from the situation of an adult learning a second language. How should languages be taught?
CHAPTER 3
THE FOUNDATIONS OF
LANGUAGE
INTRODUCTION
Children acquire language without apparent effort. This chapter examines the requirements for language acquisition. What biological, cognitive, and social precursors are necessary for us to acquire language normally? How are language processes related to structures in the brain? Is language unique to humans? What mechanisms need to be in place before language development can begin? What affects the rate of linguistic development? What are the consequences of different types of impairment or deprivation for language? The chapter also examines how language is related to other cognitive processes. By the end of this chapter, you should:
• Know how language might have evolved.
• Know about animal communication systems and be able to say how they differ from human language.
• Be able to describe attempts to teach languages to apes and to evaluate how successful these have been.
• Know to what extent language functions are localized in the human brain.
• Know how lateralization develops.
• Understand what is meant by a critical period for language development.
• Understand the effects of different types of deprivation on linguistic development.
• Understand the relation between language and thought.
WHERE DID LANGUAGE COME FROM?
There is a rich archeological record available to help us understand the evolution of the hands and the development of the use of tools. There is no such record available when examining the evolution of language, so at first sight it might seem to be a wholly speculative undertaking. Indeed, in 1866 the Linguistic Society of Paris famously banned all debate on the origins of language.
We have no idea what the first language was like. Some words might have been onomatopoeic; that is, they sound like the things to which they refer. For example, “cuckoo” sounds like the call of the bird, “hiss” sounds like the noise a snake makes, and “ouch” sounds like the exclamation we make when there is a sudden pain. The idea that language evolved from mimicry or imitation has been called the “ding-dong,” “heave-ho,” or “bow-wow” theory. However, such similarities can only be attributed to a very few words, and many words take very different forms in different languages.
Perhaps the most obvious idea about how language came into being is that it evolved as a beneficial adaptation shaped by natural selection. However, even this hypothesis is controversial. The alternative is that language arose as a side effect of the evolution of something else, such as an increase in overall brain size and an increase in general intelligence (e.g., Chomsky, 1988; Hauser, Chomsky, & Fitch, 2002; Piattelli-Palmarini, 1989). Several arguments have been
proposed in favor of the side-effect theory. First, many researchers believe that there has not been enough time for something so complex as language to evolve since the evolution of humans diverged from that of other primates. Second, a grammar cannot exist in any intermediate form (we either have a grammar or we don’t). Third, as possessing a complex grammar confers no obvious selective advantage, evolution could not have selected for it.
In recent years, however, the hypothesis that language evolved by Darwinian natural selection as an advantageous adaptation has largely won, partly because it provides a well-understood general mechanism (indeed, the only mechanism understood) for how language could have arisen (natural selection), and partly because the objections do not hold much water. It is now apparent that there was indeed sufficient time for grammar to evolve, that it evolved to communicate existing cognitive representations, and that the ability to communicate using a grammar-based system confers a big evolutionary advantage. For example, it obviously makes a big difference to your survival if an area has animals that you can eat, or animals that can eat you, and if you are able to communicate this distinction to someone else (Fitch, Hauser, & Chomsky, 2005; Jackendoff & Pinker, 2005; Pinker, 2003; Pinker & Bloom, 1990; Pinker & Jackendoff, 2005).
The capacity for language and symbol manipulation must have arisen as the human brain increased in size and complexity when Homo sapiens became differentiated from other species, between 2 million and 300,000 years ago. Study of the fossil evidence suggests that a structure corresponding to Broca’s area, a region of the brain clearly associated with language in modern humans, was present in the brains of early hominids as long as 2 million years ago. The shape of the human skull has changed significantly over time, enabling better control of speech: Neanderthals would not have been capable of controlling their tongues sufficiently to be able to articulate as clearly as we do. The articulatory apparatus has not changed significantly over the last 60,000 years. The evolution of language has come at a cost: the structures in the throat that enable us to control the production of sounds also make us more likely than other primates to choke on our food. Obviously the evolutionary advantages conferred by language must outweigh the disadvantage of this increased risk.
[Figure: This picture illustrates the stages in human evolution that have occurred over the last 35 million years. As the physical form and the brain changed, language also developed.]
We do not know whether language existed in some intermediate form, although it seems unlikely that early humans went from communicating through a few grunts to a rich language that used grammar. Bickerton (1990, 2003) has controversially championed the idea of a protolanguage that was intermediate between primate communication systems and human language. Protolanguage arose with the evolution of Homo erectus about 1.6 million years ago. Protolanguage has vocal labels attached to concepts, but does not
have a proper syntax; it is distinguished from language by the power of syntax (Chapter 2). The idea of a protolanguage is a powerful one: primates taught sign language (this chapter), very young children (Chapter 4), children deprived of early linguistic input (this chapter), and speakers of pidgin language (this chapter) could all be said to use a protolanguage rather than language.
What pressures selected for language? The social set-up of early humans must have played a role in the evolution of language, but many other animals, particularly primates, have complex social organizations, and although primates also have a rich repertoire of alarm calls and gestures, they did not develop language. In a rich social environment an adaptation that enables rich communication confers a huge evolutionary advantage on that species.
It is unlikely that language evolved in one step, or depends on a single gene. However, recent evidence suggests that important aspects of language, especially grammar, may be associated with a specific gene, called the FOXP2 gene. In animals, the FOXP2 gene seems to be involved in coordinating sensory and motor information, and skilled complex movements (Fisher & Marcus, 2006). Damage to the FOXP2 gene in humans leads to difficulty in acquiring language normally. The evidence suggests that the current structure of the FOXP2 gene in humans arose through a mutation within the last 100,000 years (Corballis, 2004), leading to greater development of Broca’s region and an enhanced ability to coordinate complex sequences of movement (Fisher & Marcus, 2006). Corballis argues that the flowering of human culture, art, and technology, and the expansion of Homo sapiens about 40,000 years ago, were all associated with the FOXP2 mutation and the development of language. The mutation meant that speech could become fully autonomous in the sense that it no longer relied on gestures; this autonomy at once freed the hands and enabled better communication. A hundred thousand years is a long time in evolution: A mutation giving a 1% gain in fitness would increase in frequency in the population from 0.1% to 99.9% in just 4,000 generations (Haldane, 1927). However, it is likely that the Neanderthals (a branch of the genus Homo that became extinct about 30,000 years ago) also carried the FOXP2 mutation and used some form of language, although these results are controversial because they might just reflect interbreeding between Homo sapiens and Homo neanderthalensis. We examine what the FOXP2 gene may control in more detail in Chapter 4.
The extent to which the evolution of language depended on the hands, and whether grammar arose from the use of manual gestures, is still controversial. Paget (1930) was the first to propose that language evolved in intimate connection with the use of hand gestures, so that vocal gestures developed to expand the available repertoire. Corballis (1992, 2003, 2004) argued that the evolution of language freed the hands from having to make gestures to communicate, so that tools could be made and used simultaneously with communication. Corballis argues that language arose not from primate calls, but from primate gestures. Additional evidence that language evolved from gestures comes from imaging studies that show that the brains of great apes are specialized in a very similar way to humans (Cantalupo & Hopkins, 2001). Chimpanzees and gorillas, like humans, show an asymmetry between the left and right hemispheres of the brain, with what is called Brodmann’s area 44 being particularly enlarged on the left. This area is probably involved with the production of gestures; furthermore, it corresponds to Broca’s region in humans, a key part of the brain involved in producing speech. One plausible explanation of this finding is that the brains of great apes became specialized to enable the production of sophisticated gestures, but this specialization continued in humans with speech arising from these gestures. Mirror neurons in this region play a particular role in imitating gestures; they fire when an animal performs a specific action or sees another animal performing the same action (Rizzolatti, Fadiga, Fogassi, & Gallese, 1996). They have been argued to play a particular role in the evolution of language (Stamenov & Gallese, 2002), with manual gestures rather than vocal communication driving evolution. The mirror neuron system for grasping enabled imitation, which in turn allowed early manual signs to develop (Arbib, 2005). Although many species (including birds and frogs) show left-hemisphere dominance for
producing sounds, only humans show very strong right-handedness dominance; in other animals gesture production is bilateral across the population. (Although individual nonhuman primates, dogs, cats, and even rats tend to favor one paw, there is no systematic preference for left or right within these species.) As the gesture-based language evolved, vocalizations became incorporated into the gesture system, leading to the specialization and lateralization of the language and gesture systems and the right-handed preference in humans.
Of course, the relation between evolution and language might have been more complex than this. Elman (1999) argued that language arose from a communication system through many interacting “tweaks and twiddles.” Deacon (1997) proposed that language and the brain co-evolved in an interactive way, converging towards a common solution for the cognitive and sensorimotor problems facing the organism. Symbolic gestures and vocalization preceded fully blown language. As the frontal cortex of humans grew larger, symbolic processing became more important, and linguistic skills became necessary to manage symbol processing, leading to the development of speech apparatus to implement these skills, which in turn would demand and enable further symbolic processing abilities. Fisher and Marcus (2006) propose that language was not a single wholesale innovation, but a complex reconfiguration of several systems that became adapted to form language. Such a conclusion is similar to that of Christiansen and Chater (2008), who see language itself as an evolving system that has made use of pre-existing brain structures.
DO ANIMALS HAVE LANGUAGE?
Is language an ability that is uniquely human? I examine both naturally occurring animal communication systems, and attempts to teach a human-like language to animals, particularly chimpanzees. There are a number of reasons why this topic is important. First, it provides a focus for the issue of what we mean by the term language. Second, it informs debate about the extent to which aspects of language might be innate in humans and have a genetic basis. Third, it might tell us about which other social and cognitive processes are necessary for a language to develop. Finally, of course, the question is of great intellectual interest. The idea of being able to “talk to the animals” like the fictional Dr. Dolittle fascinates both adults and children alike. It can become an emotive subject, as it touches on the issue of animal rights, and the extent to which humans are distinct from other animals.
Animal communication systems
Many animals possess rich communication systems; even insects communicate. Communication is much easier to define than language: it is the transmission of a signal that conveys information, often such that the sender benefits from the recipient’s response (Pearce, 2008). The signal is the means that conveys the information (e.g., a sound or a smell). It is useful to distinguish between communicative and informative signals: communicative signals have an element of design or intentionality in them, whereas signals that are merely informative do not. If I cough, this might inform you that I have a cold, but it is not a communication; but telling you that I have a cold is.
A wide range of methods is used to convey information. Ants rely on chemical messengers called pheromones. Honey bees produce a complex “waggle dance” (see Figure 3.1) in a figure-of-eight shape to other members of the hive (von Frisch, 1950, 1974). The direction of the straight part of the dance (or the axis of the figure-of-eight) represents the direction of the nectar relative to the sun, and the rate at which the bee waggles during the dance represents distance.
Primates use visual, auditory, tactile, and olfactory signals to communicate with each other. They use a wide variety of calls to symbolize a range of features of the environment and their emotional states. For example, a vervet monkey produces one particular “chutter” to warn others that a snake is nearby, a different call when an eagle is overhead, and yet another distinct call
to warn of approaching leopards. Each type of call elicits different responses from other nearby vervets (Demers, 1988). However, the signals are linked to particular stimuli and are only produced in their presence. Primates communicate about stimuli for which they do not already possess signals, suggesting that their communicative system has an element of creativity.
FIGURE 3.1 The waggle dance.
It is a widespread belief that whales and dolphins possess a language. However, the research does not support this belief. There is currently no evidence to suggest that dolphins employ sequences of sub-units that convey particular messages, in the same way as we combine words to form sentences to convey messages. Early research suggesting that dolphins were communicating with each other to carry out cooperative tasks to obtain fish turned out to be explicable in terms of conditioning; the dolphins carried on making sounds in the obvious absence of other dolphins (Evans & Bastian, 1969). Hump-backed whale song consists of ordered sub-parts, but their function is unknown (Demers, 1988).
[Figure: Research shows that dolphins do not possess a language in terms of the intentional structuring of sub-units to deliver intelligible communications. However, this prompts the question: at what juncture do we decide that communication can be classed as a language?]
How would we decide if an animal communication system had crossed the boundary to be counted as a language?
Defining language
“Language” is a difficult word to define. The dictionary defines language as “human speech … an artificial system of signs and symbols, with rules for forming intelligible communications for use, e.g., in a computer” (Chambers Twentieth Century Dictionary, 1998). Many introductions to the study of language avoid giving a definition, or consider it to be so obvious that it does not need to be defined. To some extent the aim of modern theoretical linguistics is to offer an answer to this question (Lyons, 1977a). Perhaps the difference between an animal communication system and a language is just a matter of degree?
Design features
Hockett (1960) attempted to sidestep the thorny issue of defining language by listing 16 general properties or design features of spoken human language (see Box 3.1). The emphasis of his design features is very much on the physical characteristics of spoken languages. Clearly, these are not all necessary defining characteristics: human written language does not display “rapid fading,” yet clearly written language is a form of language. Nevertheless, design features provide a useful framework for thinking about how animal communication systems differ from human language.
Box 3.1 Hockett’s (1960) “design features” of human spoken language
1. Vocal-auditory channel (communication occurs by the producer speaking and the receiver hearing)
2. Broadcast transmission and directional reception (a signal travels out in all directions from the speaker but can be localized in space by the hearer)
3. Rapid fading (once spoken, the signal rapidly disappears and is no longer available for inspection)
4. Interchangeability (adults can be both receivers and transmitters)
5. Complete feedback (speakers can access everything about their productions)
6. Specialization (the amount of energy in the signal is unimportant; a word means the same whether it is whispered or shouted)
7. Semanticity (signals mean something: they relate to the features of the world)
8. Arbitrariness (these symbols are abstract; except with a few onomatopoeic exceptions, they do not resemble what they stand for)
9. Discreteness (the vocabulary is made of discrete units)
10. Displacement (the communication system can be used to refer to things remote in time and space)
11. Openness (the ability to invent new messages)
12. Tradition (the language can be taught and learned)
13. Duality of patterning (only combinations of otherwise meaningless units are meaningful; this can be seen as applying both at the level of sounds and words, and words and sentences)
14. Prevarication (language provides us with the ability to lie and deceive)
15. Reflectiveness (we can communicate about the communication system itself, just as this book is doing)
16. Learnability (the speaker of one language can learn another)
Which features do animal communication systems possess? All communication systems possess some of the features. For example, the red belly of a breeding stickleback is an arbitrary sign. Some of the characteristics are more important than others; we might single out semanticity, arbitrariness, displacement, openness, tradition, duality of patterning, prevarication, and reflectiveness. These features all relate to the fact that language is about meaning, and provide us with the ability to communicate about anything. We might add other features to this list that emphasize the creativity and meaning-related aspects of language. Marshall (1970) pointed out the important fact that language is under our voluntary control; we intend to convey a particular message. The creativity of language stems from our ability to use syntactic rules to generate a potentially infinite number of messages from a finite number of words using iteration and recursion (see Chapter 2).
Syntax has five important properties (Kako, 1999a; Pinker, 2002). First, language is a discrete combinatorial system. When words are combined, we create a new meaning: the meanings of the words do not just blend into each other, but retain their identity. Second, well-ordered sentences depend on ordering syntactic categories of words (such as nouns and verbs) in correct sequences. Third, sentences are built round verbs, which specify what goes with what (e.g., you give something to someone). Fourth, we can distinguish words that do the semantic work of the language (content words; see Chapter 2) from words that assist in the syntactic work of the language (function words). Fifth, recursion (phrases containing examples of themselves) enables us to construct an infinite number of sentences from a finite number of rules. No animal communication system has these properties.
We can use language to communicate about anything, however remote in time and space.
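The fifth property of syntax, recursion, can be made concrete with a short sketch. The toy rule used here (a noun phrase is either "the N" or "the N that chased" followed by another noun phrase) and the three-word vocabulary are illustrative assumptions, not a grammar taken from the text; the point is only that one rule that refers to itself yields a new, longer sentence at every depth.

```python
# Illustrative toy grammar (an assumption made for this sketch):
#   NP -> "the" N
#   NP -> "the" N "that chased" NP    (an NP containing another NP)
NOUNS = ["cat", "rat", "dog"]  # hypothetical finite vocabulary


def noun_phrase(depth: int) -> str:
    """Return a noun phrase with `depth` embedded relative clauses."""
    head = f"the {NOUNS[depth % len(NOUNS)]}"
    if depth == 0:
        # Base case: the non-recursive rule NP -> "the" N
        return head
    # Recursive case: the phrase contains a smaller phrase of the same kind
    return f"{head} that chased {noun_phrase(depth - 1)}"


def sentence(depth: int) -> str:
    """Wrap the noun phrase in a fixed frame to form a full sentence."""
    return f"We saw {noun_phrase(depth)}."


if __name__ == "__main__":
    for d in range(3):
        print(sentence(d))
```

Every increase in depth produces a sentence that has not been generated before, although the rules and the vocabulary stay fixed.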
Hence, although a parrot uses the vocal-auditory channel and the noises it makes satisfy most of the design characteristics up to number 13, it cannot lie, or reflect about its communication system, or talk about the past. Whereas monkeys are limited to chattering and squeaking about immediate threats such as snakes in the grass and eagles overhead, we can express novel thoughts; we can make up sentences that convey new ideas. This cannot be said of other animal communication systems. Bees will never dance a book about the psychology of the bee dance. We can talk about anything and effortlessly construct sentences that have never been produced before.
In summary, many animals possess rich symbolic communication systems that enable them to convey messages to other members of the species, that affect their behavior, that serve an extremely useful purpose, and that possess many of Hockett’s design features. On the other hand, these communication systems lack the richness of human language. This richness is manifested in our limitless ability to talk about anything using a finite number of words and rules to combine those words. However difficult “language” may be to define, the difference between animal communication systems and human language is not just one of degree. All nonhuman communication systems are quite different from language (Deacon, 1997).
Can we teach language to animals?
Perhaps some animals have the biological and cognitive apparatus to acquire language, but have not needed to do so in their evolutionary niche. The alternative view is that only humans possess the necessary capabilities: that other animals are in principle incapable of learning language.
Most people think that dogs and parrots “know” some aspects of language. Dogs respond to instructions. One border collie called Rico knew the labels of over 200 items (Kaminski, Call, & Fischer, 2004), being able to fetch items with different names from around the house, even when he could not see the owner (thereby eliminating the possibility of the “Clever Hans” effect, which is that animals that appear to know language are in fact just picking up cues from their owner). When faced with a new name, he would infer that the name applied to a novel object, rather than being another name for an object with which he was familiar; this “novel name equals nameless category” principle is one that children use to learn some new words. However, unlike children, Rico’s knowledge was restricted to the names of physical objects, and he showed no understanding of how the meanings of words might be related (e.g., that doll and ball are both types of toy). Nevertheless, this performance is impressive, and also suggests that general (rather than language-specific) learning mechanisms might go some way to explaining early word learning in children.
[Figure: Pepperberg’s (1981) African grey parrot, Alex, showed evidence of being able to combine discrete categories and possibly to use syntactic categories appropriately.]
Everyone knows that parrots can be taught to mimic human speech. Pepperberg (1981, 1983, 1987, 2009) took this idea further and embarked on an elaborate formal program of training of her African grey parrot (Psittacus erithacus) called Alex. After 13 years, Alex had a vocabulary of about 80 words, including object names, adjectives, and some verbs. He could even produce and understand short sequences of words. Alex could classify 40 objects according to their color and what they were made of, understand the concepts of same and different, and count up to six. Alex showed evidence of being able to combine discrete categories and use syntactic categories appropriately. However,
he knew few verbs, showed little evidence of being able to relate objects to verbs, and knew very few function words (Kako, 1999a). Hence Alex’s linguistic abilities are extremely limited.
Herman, Richards, and Wolz (1984) taught two bottle-nosed dolphins, Phoenix and Akeakamai, artificial languages. One language was visually based, using gestures of the trainer’s arms and legs (see Figure 3.2), and the other was acoustically based, using computer-generated sounds transmitted through underwater speakers. However, this research tested only the animals’ comprehension of the artificial language, not their ability to produce it. From the point of view of answering our questions on language and animals it is clearly important to examine both comprehension and production. Even so, the dolphins’ syntactic ability was limited, and they showed no evidence of being able to use function words (Kako, 1999a).
FIGURE 3.2 Some of the gestures used to communicate with Akeakamai the dolphin (panels: TAIL-TOUCH, MOUTH, LEFT, WATER). Adapted from Herman et al. (1984).
Most of the work on teaching language to animals involves other primates, particularly chimpanzees, as they are highly intelligent, social animals and are our closest genetic neighbors. In the following discussion it is useful to bear in mind the distinction between teaching word meaning and syntax. Remember that an essential feature of human language is that it involves both associating a finite number of words with particular meanings or concepts, and using a finite number of rules to combine those words into a potentially infinite number of sentences. Before we can conclude that apes have learned a language we need to show that they can do both of these things.
What are the other cognitive abilities of chimpanzees?
We have seen that primates have a rich communication system that they use in the wild. The cognitive abilities of a chimpanzee named Viki aged 3½ years were generally comparable to those of a child of a similar age on a range of perceptual tasks such as discriminating and matching similar items, but broke down on tasks involving counting (Hayes & Nissen, 1971). Experiments on another chimp
named Sarah also suggested that she performed at levels close to that of a young child on tasks such as conserving quantity, as long as she could see the transformation occurring. For example, she understood that pouring water from a tall, thin glass into a short, fat glass did not change the amount of water. Hence the cognitive abilities of apes are broadly similar to those of young children, apart from the latter’s linguistic abilities. This decoupling of linguistic and other cognitive abilities in children and apes has important implications. First, it suggests that for many basic cognitive tasks language is not essential. Second, it suggests that there are some non-cognitive prerequisites to linguistic development. Third, it suggests that cognitive limitations in themselves might not be able to account for the failure of apes to acquire language.
Talking chimps
The earliest attempt to teach apes language was that of Kellogg and Kellogg (1933), who raised a female chimpanzee named Gua along with their own son. (This type of rearing is called cross-fostering or cross-nurturing.) Gua only understood a few words, and never produced any that were recognizable. Hayes (1951) reared a chimp named Viki as a human child and attempted to teach her to speak. This attempt was also unsuccessful, as after 6 years the chimpanzee could produce just four poorly articulated words (“mama,” “papa,” “up,” and “cup”) using her lips. Even then, Viki could only produce these in a guttural croak, and only the Hayes family could understand them easily. With a great deal of training she understood more words, and some combinations of words.
These early studies have a fundamental limitation. The vocal tracts of chimps are physiologically unsuited to producing speech, and this difference alone could account for their lack of progress (see Figure 3.3). Nothing can be concluded about the general language abilities of primates from these early failures.
FIGURE 3.3 Compare the adult vocal tract of a human (left) with that of a chimpanzee (right). Labeled structures: palate, nasal cavity, velum, rear pharyngeal wall, lips, tongue, epiglottis, larynx (vocal cords). Adapted from Lieberman (1975).
Washoe
Although the design of the vocal tracts of chimps is unsuited to speaking, chimps are manually very dexterous. Later attempts at teaching apes language were based on systems using either a type of sign language, or involving manipulating artificially created symbols. Perhaps the most famous example of trying to teach language to an ape is that of Washoe. Washoe is a female chimpanzee who was caught in the wild when she was approximately 1 year old. She was then brought up as a human child, doing things such as eating, toilet training, playing, and other social activities
(Gardner & Gardner, 1969, 1975). In this context, Sarah

she was taught American Sign Language (ASL, sometimes called AMESLAN). ASL is the standard sign language used by people with hearing impairment in North America. Just like spoken language, it has words and syntax.

At the age of 4, Washoe could produce about 85 signs, and comprehend more; a few years later her vocabulary had increased to approximately 150–200 signs (Fouts, Shapiro, & O’Neil, 1978). These signs came from many syntactic categories, including nouns, verbs, adjectives, negatives, and pronouns. Her carers argued that she made over-generalization errors similar to those of young children (for example, in using the sign for “flower” to stand for flower-like smells, or “hurt” to refer to a tattoo). It was further claimed that when she did not know a sign, she could create a new one. When she first saw a duck and had not learned a sign for it, she coined a phrase combining two signs she did have, producing “water bird.” Furthermore, she combined signs and used them correctly in strings up to five items long. Examples of Washoe’s signing include: “Washoe sorry,” “Baby down,” “Go in,” “Hug hurry,” and “Out open please hurry.” She could answer some questions that use what are called WH-words (so called because in English most of the words that are used to start questions begin with “wh,” such as “what,” “where,” “when,” or “who”). She displayed some sensitivity to word order in that she could distinguish between “You tickle me” and “I tickle you.”

Do chimps who have been taught language go on to teach their offspring, or can the offspring learn language by observing their parents? These are important questions, because there is little evidence that human children are explicitly taught language by their parents. Researchers observed that Washoe’s adopted son Loulis both spontaneously acquired signs from Washoe and was also seen to be taught by Washoe. Although this is a clear indication of what is known as cultural transmission, it is unclear whether it is a language that has been transmitted, or just a sophisticated communication system (Fouts, Fouts, & van Cantfort, 1989; Fouts, Hirsch, & Fouts, 1982).

At first sight Washoe appears to have acquired the use of words and their meanings, and at least some sensitivity to word order in both production and comprehension.

A different approach was taken by Premack (1971, 1976a, 1976b, 1985, 1986a). Sarah was a chimpanzee trained in a laboratory setting to manipulate small plastic symbols that varied in shape, size, and texture. The symbols could be ordered in certain ways according to rules. Together, the symbols and the rules form a language called Premackese. One advantage of this set-up is that less memory load is required, as the array is always in front of the animal. Sarah produced mainly simple lexical concepts (strings of items together describing simple objects or actions), and could produce novel strings of symbols. These, however, were generally only at the level of substituting one word for another. For example (with the Premackese translated into English), “Randy give apple Sarah” was used as the basis of producing “Randy give banana Sarah.” She produced sentences that were syntactically quite complex (for example, producing logical connectives such as “if … then”), and showed metalinguistic awareness (reflectiveness) in that she could talk about the language system itself using symbols that meant “… is the name of.” However, there was little evidence that Sarah was grouping strings of symbols together to form proper syntactic units. (Also see Figure 3.4.)

FIGURE 3.4 Here we see another of Premack’s chimpanzees, Elizabeth. The message on the board says “Elizabeth give apple Amy.” Adapted from Premack (1976a).
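
The one-word substitution described for Sarah can be pictured as slot-filling in a fixed frame. The following Python sketch is purely illustrative: the slot names (agent, action, object, recipient) and the function `substitute` are assumptions introduced here, not part of Premack’s own analysis. It shows how a “novel” string can be produced from a learned one without any grouping of symbols into larger syntactic units.

```python
# Hypothetical four-slot frame; the slot names are an assumption made for illustration.
FRAME = ("agent", "action", "object", "recipient")


def substitute(base, slot, new_item):
    """Return a copy of `base` with the item in the named slot replaced."""
    index = FRAME.index(slot)
    return base[:index] + (new_item,) + base[index + 1:]


# The learned string from the text, translated into English.
base_string = ("Randy", "give", "apple", "Sarah")

# A "novel" string obtained by swapping a single item in a single slot.
novel_string = substitute(base_string, "object", "banana")
print(" ".join(novel_string))  # Randy give banana Sarah
```

Nothing in this sketch represents phrases or rules for combining them; only one position in a flat string changes. That is one way to see why substitution of this kind is weak evidence for syntax.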
Nim and others

Terrace, Petitto, Sanders, and Bever (1979) described the linguistic progress of a chimpanzee named Nim Chimpsky (a pun on Noam Chomsky). They taught Nim Chimpsky a language based on ASL. Nim learned about 125 signs, and the researchers recorded over 20,000 utterances in 2 years, many of them of two or more signs in combination. They found that there was regularity of order in two-word utterances (for example, place was usually the second thing mentioned), but that this broke down with longer utterances. Longer utterances were largely characterized by more repetition (“banana me eat banana eat”), rather than displaying real syntactic structure. Terrace et al. were far more pessimistic about the linguistic abilities of apes than were either the Gardners or Premack. Unlike children, Nim rarely signed spontaneously; about 90% of his utterances were in reply to his trainers and concerned immediate activities such as eating, drinking, and playing, and 40% of his utterances were simply repetitions of signs that had just been made by his trainers. However, O’Sullivan and Yeager (1989) pointed out that the type of training Nim received might have limited his linguistic skills. They found that he performed better in a conversational setting than in a formal training session.

Student teacher Joyce Butler with Nim Chimpsky the chimpanzee, named after American linguist, philosopher, cognitive scientist, and political activist Noam Chomsky. Joyce is showing Nim the sign configuration for “drink” and Nim is imitating her. Photographed during project Nim, an extended study of animal language acquisition conducted in the 1970s.

There have been other famous attempts to teach language to primates. Savage-Rumbaugh, Rumbaugh, and Boysen (1978) reported attempts to teach the chimpanzees Lana, Sherman, and Austin language, using a computer-controlled display of symbols structured according to an invented syntax called Yerkish. The symbols that serve as words are called lexigrams (see Figure 3.5). The linguistic abilities of other primates such as gorillas have also been studied (e.g., Koko, reported by Patterson, 1981).

FIGURE 3.5 The arrangement of lexigrams on a keyboard. Blank spaces were non-functioning keys, or displayed photographs of trainers. From Savage-Rumbaugh, Pate, Lawson, Smith, and Rosenbaum (1983).

Evaluation of early attempts to teach language to apes

At first sight, these attempts to teach chimps language might look quite convincing. The important design features of Hockett all appear to be present. Specific signs are used to represent particular words (discreteness), and apes can refer to objects that are not in view (displacement). The issue of semanticity, whether or not the signs have meaning for the apes, is a controversial one to which we shall return. At the very least we can say that they have learned associations between objects and events and responses. Sarah could discuss the symbol system itself (reflectiveness). Signs could be combined in novel ways (openness). The reports of apes passing sign language on to their young satisfy the feature of tradition. Most importantly, it is claimed that the signs are combined according to specified syntactic rules of ordering: that is, they have apparently acquired
a grammar. Maybe, then, these animals can learn language, and the difference between apes and humans is only a matter of degree?

Unfortunately, there are many problems with some of this research, particularly the early, pioneering work. The literature is full of argument and counter-argument, making it difficult to arrive at a definite conclusion. There have been two sources of debate: methodological criticisms of the training methods and the testing procedures used, and argument over how the results should be interpreted.

What are the methodological criticisms? First, one criticism was that ASL is not truly symbolic, in that many of the signs are icons standing for what is represented in a non-arbitrary way (Savage-Rumbaugh et al., 1978; Seidenberg & Petitto, 1979). For example, the symbol for “give” looks like a motion of the hand towards the body reminiscent of receiving a gift, and “drive” is a motion rather like turning a steering wheel. If this were true, then this research could be dismissed as irrelevant because the chimps are not learning a symbolic language. Clearly it is not true; not all the attempts mentioned earlier used ASL; Premack’s plastic symbols, for example, are very different. In addition, the force of this objection can be largely dismissed on the grounds that although some ASL signs are iconic, many of them are not, and that deaf people clearly use ASL in a symbolic way. No one would say that deaf people using ASL are not using a language (Petitto, 1987). Nevertheless, ASL is different from spoken language in that it is more condensed (articles such as “the” and “a” are omitted), and this clearly might affect the way in which animals use the language. And in Washoe’s case at least, a great proportion of her signing seemed to be based on signs that resemble natural gestures. It is also possible that her trainers over-interpreted her gestures, first incorrectly identifying some gestures as signs, or thinking that a particular movement was indeed an appropriate sign. Deaf native signers observed a marked discrepancy between what they thought Washoe had produced (which was very little), and what the trainers claimed (Pinker, 1994). Again, these criticisms are hard to
justify against the lexigram-based studies, although Brown (1973) noted that Sarah’s performance deteriorated with a different trainer.

In these early studies, reporting of signing behavior was anecdotal, or limited to cumulative vocabulary counts and lists. No one ever produced a complete corpus of all the signs of a signing ape in a predetermined period of time, with details of the context in which the signs occurred (Seidenberg & Petitto, 1979). The limited reporting has a number of consequences that make interpretation difficult. For example, the “water bird” example would be less interesting if Washoe had spent all day randomly making signs such as “water shoe,” “water banana,” “water refrigerator,” and so on. In addition, the data presented are reduced so as to eliminate the repetition of signs, thus producing summary data. Repetition in signing is quite common, leading to long sequences such as “me banana you banana me give,” which is a less impressive syntactic accomplishment than “you banana me give,” and not at all like the early sequences produced by human children. The chimps produced many imitations of the signs that had just been produced by the humans, while truly creative signing in the absence of something to imitate is rare. Thompson and Church (1980) produced a computer program to simulate Lana’s acquisition of Yerkish. They concluded that all she had done was to learn to associate objects and events with lexigrams, and to use one of a few stock sentences depending on situational cues. There was no evidence of real understanding of word meaning or syntactic structure. (For details of these methodological problems, see Bronowski & Bellugi, 1970; Fromkin et al., 2011; Gardner, 1990; Pinker, 1994; Seidenberg & Petitto, 1979; and Thompson & Church, 1980.)

TABLE 3.1 Differences between apes’ and children’s language behavior.
Apes | Children
Utterances are mainly in the here-and-now | Utterances can involve temporal displacement
Lack of syntactic structure | Clear syntactic structure and consistency
Little comprehension of syntactic relationships between units | Ability to pick up syntactic relationships between units
Need explicit training to use symbols | Do not need explicit training to use symbols
Cannot reject ill-formed sentences | Can reject ill-formed sentences
Rarely ask questions | Frequently ask questions
No spontaneous referential use of symbols | Spontaneous referential use of symbols

There are also a number of differences between the behavior of apes using language and of children of about the same age, or with the same vocabulary size (see Table 3.1). The utterances made by chimps are tied to the here-and-now, with those involving temporal displacement (talking about things remote in time) particularly rare. There is a lack of syntactic structure and the word order used is inconsistent, particularly with longer utterances. Fodor et al. (1974) pointed out that there appeared to be little comprehension of the syntactic relations between units, and that it was difficult to produce a syntactic analysis of their utterances. There was little evidence that “acquiring” a sentence structure as in the string of words “Insert apple dish” would help, or transfer to, producing the new sentence “Insert apple red dish.” Unlike humans, these chimpanzees could not reject ill-formed sentences. They rarely asked questions, an obvious characteristic of the speech of young children. Children use language to find out more about language; chimpanzees do not. Chimps do not spontaneously use symbols referentially; that is, they need explicit training to go beyond merely associating a particular symbol or word in a particular context; young children behave quite differently. Finally, it
is not clear that these chimps used language to help them to reason.

These criticisms have not gone unchallenged (e.g., Premack, 1976a, 1976b). Savage-Rumbaugh (1987) pointed out that it is important not to generalize from the failure of one ape to the behavior of others. Furthermore, many of these early studies were pioneering and later studies learned from their failures and difficulties. Broadly, however, much of the early work is of limited value because it is not clear that it tells us anything about the linguistic abilities of apes; if anything, it suggests that they are rather limited.

Kanzi

Sue Savage-Rumbaugh holds a board displaying some of the lexigrams with which she and Kanzi communicate. From Savage-Rumbaugh and Lewin (1994).

The major challenge to the critical point of view comes from more recent studies involving pygmy chimpanzees. Strong claims have been made about the performance of Kanzi (Greenfield & Savage-Rumbaugh, 1990; Savage-Rumbaugh & Lewin, 1994; Savage-Rumbaugh, McDonald, Sevcik, Hopkins, & Rupert, 1986). Whereas earlier studies used the common chimpanzee (Pan troglodytes), comparative studies of animals suggest that the bonobo or pygmy chimpanzee (Pan paniscus) is more intelligent, has a richer social life, and a more extensive natural communicative repertoire. Kanzi is a pygmy chimpanzee, and many believe he has made a vital step in spontaneously acquiring the understanding that symbols refer to things in the world, behaving like a child. Unlike other apes, Kanzi did not receive formal training by reinforcement with food on production of the correct symbol. He first acquired symbols by observing the training of his mother (called Matata) on the Yerkish system of lexigrams. He then interacted with people in normal daily activities, and was exposed to English. His ability to comprehend English as well as Yerkish was studied and compared with the ability of young children (Savage-Rumbaugh, Murphy, Sevcik, Brakke, Williams, & Rumbaugh, 1993). On a number of measures, Kanzi performed as well as or better than a 2-year-old child. By the age of 30 months, Kanzi had learned at least seven symbols (orange, peanut, banana, apple, bedroom, chase, and
Austin); by the age of 46 months he had learned just under 50 symbols and had produced about 800 combinations of them. He was sensitive to word order, and understood verb meanings: for example, he could distinguish between “get the rock” and “take the rock,” and between “put the hat on your ball” and “put the ball on your hat.” Spontaneous utterances, rather than those that were prompted or imitations, formed more than 80% of his output.

Both Kanzi’s semantic and syntactic abilities have been questioned. Seidenberg and Petitto (1987) argued that Kanzi understands names in a different way from humans. Take Kanzi’s use of the word “strawberry.” He uses “strawberry” as a name, as a request to travel to where the strawberries grow, as a request to eat strawberries, and so on. Furthermore, Kanzi’s acquisition of apparent grammatical skills was much slower than that of humans, and his sentences did not approach the complexity displayed by a 3-year-old child. In reply, Savage-Rumbaugh (1987) and Nelson (1987) argued that the critics underestimated the abilities of the chimpanzees, and overestimated the appropriate linguistic abilities of very young children. Kako (1999a) argued that Kanzi shows no signs of possessing any function words. He does not appear to be able to use morphology: he does not modify his language according to number, as we do when we form plurals. And there is no clear evidence that Kanzi uses recursive grammatical structures.

Kanzi is by far the best case for language-like abilities in apes. Why is Kanzi so successful? Although bonobos might be better linguistic students, another possibility is that he was very young when first exposed to language (Deacon, 1997). Perhaps early exposure to language is as important for apes as it appears to be for humans.

Evaluation of work on teaching apes language

Most people would agree that in these studies researchers have taught some apes something, but what exactly? Clearly apes can learn to associate names with actions and objects, but there is more to language than this. In a recent analysis of a large (3,448) corpus of signs made to humans by five chimpanzees (Pan troglodytes) with a long history of sign use, Rivas (2005) found that the chimpanzees used mainly signs for actions and objects. Furthermore, they showed little evidence of either syntactic or semantic structure in their signing, showing instead much repetition and simple concatenation of signs, mostly with the goal of acquiring food or some other object. Rivas concluded that the signing of apes showed many differences from the early language of children.

Let us consider word meaning in more detail. How do we use names: in what way is language different from simple association? Pigeons can be taught to respond differentially to pictures of trees and water (Herrnstein, Loveland, & Cable, 1977), so it is an easy step to imagine that we could condition pigeons to respond in one way (e.g., pecking once) to one printed word, and in another way (e.g., pecking twice) to a different word, and so on. We could go so far as to suggest that these pigeons would be “naming” the words. So in what way is this “naming” behavior different from ours? One obvious difference is that we do more than name words: we also know their meaning. We know that a tree has leaves and roots, that an oak is a tree, that a tree is a plant, and that they need soil to grow in. We know that the word “leaf” goes with the word “tree” more than the word “pyramid.” That is, we know how the word “tree” is conceptually related to other words (see Chapter 11 for more detail). We also know what a tree looks like. Consider what might happen if we present the printed word “tree” to a pigeon. By examining its pecking behavior, we might infer that the best a trained pigeon could manage is to indicate that the word “tree” looks more like the word “tee” than the word “horse.”

Is the use of signs by chimpanzees more like that of pigeons or of humans? There are two key questions that would clearly have to be answered “yes” before most psycholinguists would agree that these primates are using words like us. First, can apes spontaneously learn that names refer to objects in a way that is constant across contexts? We know that a strawberry is a strawberry whether it’s in front of us in a bowl covered in cream and sugar, or in a field attached to a strawberry plant half covered in soil. We do not need different words for each, or restrict our usage to just one context. Second, do these primates have the same understanding of word meaning as we do? Despite the promising work with
Kanzi, there are no unequivocal answers to these questions. For example, Nim could sign “apple” or “banana” correctly if these fruits were presented to him one at a time, but was unable to respond correctly if they were presented together. This suggests that he did not understand the meaning of the signs in the same way that humans do. On the other hand, Sherman and Austin could group lexigrams into the proper superordinate categories even when the objects to which they referred were absent. For example, they could group “apple,” “banana,” and “strawberry” together as “fruit,” although this claim is controversial (Savage-Rumbaugh, 1987; Seidenberg & Petitto, 1987).

In summary, whereas chimpanzees have clearly learned associations between symbols and the world, and between symbols, it is debatable whether they have learned the meaning of the symbols in the way that we know the meanings of words. Nevertheless, they can sometimes learn very effectively, in a manner akin to children (Lyn & Savage-Rumbaugh, 2000). Kanzi and another bonobo chimpanzee (called Panbanisha), also reared in a naturalistic environment, could learn new words naming objects very quickly, with only a few exposures to novel items (at a rate similar to that of language-delayed children). In addition, the chimpanzees could sometimes learn by observation, rather than having to have the object pointed out to them each time its name was presented.

Let us now look at chimps’ syntactic abilities. Has it been demonstrated that apes can combine symbols in a rule-governed way to form sentences? In as much as they might appear to do so, it has been proposed that the “sentences” are simply generated by “frames.” That is, it is nothing more than a sophisticated version of conditioning, and does not show the creative use of word-ordering rules. It is as though we have now trained our pigeons to respond to whole sentences rather than just individual words. Such pigeons would not be able to recognize that the sentence “The cat chased the dog” is related in meaning to “The dog is chased by the cat,” or has the same structure as “A vampire loved a ghost.”

The cotton-top tamarin performs well on a range of language-like tasks; for example, they can learn which sequences of sounds tend to occur often together.

We have a finite number of grammatical rules and a finite number of words, but combine them to produce an infinite number of sentences (Chomsky, 1957). We have seen that recursion, where phrases can include phrases of the same type, is an essential feature of human language. There is no evidence that apes can use recursion. More recent research reinforces this view. Monkeys can learn very simple grammars, but they cannot learn more sophisticated, human-like grammars that use hierarchical structures where there are long-distance dependencies between words (e.g., the word “if” is usually followed by “then,” but any number of words can intervene; we can embed sentences within others, such as in “the cat the rat bit died”). Cotton-top tamarins perform well at a range of language-like tasks. They can, for example, like young children (see Chapter 4), learn which sequences of sounds tend to occur often together (essentially, they can discriminate words from nonwords; see Hauser, Newport, & Aslin, 2001). We can study their abilities to learn grammars by their ability to discriminate instances of strings of sounds that follow a syntactic rule from strings that violate that rule; essentially, we are asking them to make what we call grammaticality judgments. When the monkeys hear a string that violates the rules they tend to look at the loudspeaker; we could say that they “look surprised.” The monkeys can be taught simple invented grammars (e.g., that produce a string of sounds corresponding to an ABABAB syllable structure), but are unable to learn more sophisticated artificial grammars that use hierarchical structure (e.g., that produce a string
of sounds corresponding to AAABBB; Fitch & Hauser, 2004). The generation of hierarchical structures such as these depends on the ability to use recursion, and only humans can use recursion.

Hauser et al. (2002) and Fitch et al. (2005) go so far as to claim that recursion is the only uniquely human component of language, yet an immensely powerful one. Pinker and Jackendoff (2005) and Jackendoff and Pinker (2005) take issue with this extreme claim, arguing that there are many more aspects of language, including properties of words and grammar, and the anatomy and control of the vocal tract, that are unique to humans. In addition, the FOXP2 gene (see Chapters 1 and 4) is unique to humans and is involved in the control of speech and language, but does not seem to involve recursion. And furthermore, the Pirahã language of the Amazon does not seem to use any recursion, yet is clearly a human language (Everett, 2005).

In summary, some higher animals can learn the names of objects and simple syntactic rules. However, they do not develop sophisticated representations of meaning as do humans, and they cannot learn complex, more human-like grammars.

There is disagreement on how well apes come out of a comparison of chimps and children. One problem is that it is unclear with which age group of children the chimpanzees should be compared. When there is more work on linguistic apes bringing up their own offspring, the picture should be clearer. However, this research is difficult to carry out, expensive, and difficult to obtain funding for, so we might have to wait some time for these answers.

At present we can conclude that chimps can learn some symbols and some ways of combining them, but they cannot acquire a human-like syntax. At best, they have acquired a protolanguage.

Why is the issue so important?

As we saw earlier, there is more to the issue of a possible animal language than simple intellectual interest. First, the debate has led to a deeper insight into the nature of language and what is important about it. We can see what makes human language so very different from vervets “chattering” when they see a snake. Second, it is worth noting that although the cognitive abilities of young children and chimpanzees are not very different, their linguistic abilities are. This suggests that language processes are to some degree independent of other cognitive processes. Third, following on from this, Chomsky claimed that human language is a special faculty, which is independent of other cognitive processes, has a specific biological basis, and has evolved only in humans (e.g., Chomsky, 1968). Language arose because the brain passed a threshold in size, and only human children can learn language because only they have the special innate equipment necessary to do so. This hypothesis is summed up by the phrase “language is species-specific and has an innate basis.” (Although as Kako, 1999a, observes, a better statement might be, “some components of language are species-specific.”) In particular, Chomsky argued that only humans possess a language acquisition device (LAD) that enables us to acquire language; without this device we would be stuck forever at the level of a protolanguage (see Chapter 1). In particular, the ability to use recursive syntactic rules, which is what gives human language its full power, is unique to humans (Hauser et al., 2002). Even Premack (1985, 1986a, 1990) has become far less committed to the claim that apes can learn language just like human children. Indeed, he also has come to the conclusion that there is a major discontinuity between the linguistic and cognitive abilities of children and chimpanzees, with children possessing innate, “hard-wired” abilities that other animals lack. At the very least we can say that whereas children acquire language, apes have to be taught it.

THE BIOLOGICAL BASIS OF LANGUAGE

What are the biological precursors of language? How is language development related to the development of brain functions? How do biological processes interact with social factors?

Are language functions localized?

The brain is not a homogeneous mass; parts of it are specialized for specific tasks. How do we know this? In the past most of our knowledge
about how brain and behavior are related came from lesion studies combined with an autopsy: neuropsychologists would discover which part of the brain had been damaged, and relate that information to behavior. Now we have brain-imaging techniques available, particularly fMRI (see Chapter 1), which can also be used with non-brain-damaged speakers. These techniques indicate which parts of the brain are active when we do tasks such as reading or speaking.

Most people know that the brain is divided into two hemispheres (see Kolb & Whishaw, 2009). The two hemispheres of the brain are partly specialized for different tasks: broadly speaking, in most right-handed people the left hemisphere is particularly concerned with analytic, time-based processing, while the right hemisphere is particularly concerned with holistic, spatially based processing. For the great majority (96%) of right-handed people, language functions are predominately localized in the left hemisphere. We say that this hemisphere is dominant. According to Rasmussen and Milner (1977), even 70% of left-handed people are left-hemisphere dominant. This localization of function is not tied to the speech modality; imaging studies show that just the same left-hemisphere brain regions are activated in people producing sign language with both hands (Corina, Jose-Robertson, Guillermin, High, & Braun, 2003).

Early work on the localization of language

How do we know which bits of the brain do what? In the 1950s, Penfield and Roberts (1959) studied the effects of electrical stimulation directly on the brains of patients undergoing surgical treatment for epilepsy. More recently, a number of techniques for brain imaging have become available, including PET and CAT scans (see Chapter 1). These techniques all show that there are specific parts of the brain responsible for specific language processes.

Most of the evidence on the localization of language functions comes from studies of the effects of brain damage. An impairment in language production or comprehension as a result of brain damage is called aphasia. The French neurologist Paul Broca carried out some of the earliest and most famous work on the effects of brain damage on behavior in the 1860s. Broca observed several patients where damage to the cortex of the left frontal lobes resulted in an impairment in the ability to speak, despite the vocal apparatus remaining intact and the ability to understand language apparently remaining unaffected. (We look at this again in Chapter 13.) This pattern of behavior, or syndrome, has become known as Broca’s aphasia, and the part of the brain that Broca identified as responsible for speech production has become known as Broca’s area (see Figure 3.6).

FIGURE 3.6 Location of Broca’s area. (Diagram labels: Broca’s area, frontal lobe, parietal lobe, temporal lobe, occipital lobe.)

A few years later, in 1874, the German neurologist Carl Wernicke identified another area of the brain involved in language, this time further back in the left hemisphere, in the part of the temporal lobe known as the temporal gyrus. Damage to Wernicke’s area (Figure 3.7) results in Wernicke’s aphasia, characterized by fluent language that makes little sense, and a great impairment in the ability to comprehend language, although hearing is unaffected.

The Wernicke–Geschwind model

Wernicke also advanced one of the first models of how language is organized in the brain. He argued that the “sound images” of object names are stored in Wernicke’s area of the left upper temporal cortex of the brain. When we speak, this information is sent along a pathway of fibers known as the arcuate fasciculus to Broca’s
area, in the left lower frontal cortex, where these sound images are translated into movements for controlling speech. Although modern models are more detailed, they essentially still follow Wernicke’s scheme. The Wernicke–Geschwind model (Figure 3.8; sometimes called the Wernicke–Lichtheim–Geschwind model) is an elaboration of Wernicke’s scheme. Geschwind (1972) described how language generation flows from areas at the back to the front of the left hemisphere. When we hear a word, information is transmitted from the part of the cortex responsible for processing auditory information to Wernicke’s area. If we then speak that word, information flows to Broca’s area where articulatory information is activated, and is then passed on to the motor area responsible for speech. If the word is to be spelled out, the auditory pattern is transmitted to a structure known as the angular gyrus. If we read a word, the visual area of the cortex activates the angular gyrus and then Wernicke’s area. Wernicke’s area plays a central role in language comprehension. Damage to the arcuate fasciculus results in difficulties repeating language, while comprehension and production remain otherwise unimpaired. This pattern is an example of a disconnection syndrome. Disconnection occurs when the connection between two areas of the brain is damaged without damage to the areas themselves. The angular gyrus plays a central role in mediating between visual and auditory language.

FIGURE 3.7 Location of Wernicke’s area. (Diagram labels: Wernicke’s area, frontal lobe, parietal lobe, temporal lobe, occipital lobe.)

This model is now known to be too simple for several reasons (Kolb & Whishaw, 2009; Poeppel & Hickok, 2004). First, although for most people language functions are predominantly localized in the left hemisphere, they are not restricted to it. Some important language functions take place in the right hemisphere. Some researchers have suggested that the right hemisphere plays an important role in an acquired disorder known as deep dyslexia (see Chapter 7), that it carries out important aspects of visual word recognition, and that it is involved with aspects of speech production, particularly prosody (regarding the loudness, rhythm, pitch, and intonation of speech); see Lindell (2006) for a review. Subcortical regions of the brain might play a role in language. For example, Ullman et al. (1997) found that although people with Parkinson’s disease (which affects subcortical regions of the brain) could successfully inflect irregular verbs (presumably because these are stored as specific instances rather than generated by a rule), they had difficulty with regular verbs, suggesting that subcortical regions play some role in rule-based aspects of language. However, subcortical damage is usually also accompanied by cortical damage (e.g., see Olsen, Bruhn, & Öberg, 1986), and diseases such as Parkinson’s lead to damage to the cortical regions of the brain to which these subcortical regions project, so claims that subcortical regions play a critical role in language need to be treated with some caution. The right cerebellum becomes significantly activated when we process the meaning of words (Marien, Enggelborghs, Fabbro, & De Deyn, 2001; Noppeny & Price, 2002; Paquier & Marien, 2005; Petersen, van Mier, Fiez, & Raichle, 1998). Second, even within the left cortex it is clear that brain regions outside the traditional Wernicke–Broca areas play an important role in language. In particular, the whole of the superior temporal gyrus (of which Wernicke’s region is just part) is important. Third, brain damage does not have such a clear-cut effect as the model predicts. Complete destruction of areas central to the model rarely results in permanent aphasias of the expected types. Furthermore, we rarely find the expected clear-cut distinction between expressive (production) and receptive
(comprehension) disorders. For example, people with damage to Broca’s region often have difficulty understanding sentences. Different types of aphasia have variable clusters of symptoms that tend to go together, and that are not as clearly related to regions such as Broca’s or Wernicke’s as the model predicts. Fourth, virtually all people with aphasia have some anomia (difficulty in finding the names of things) regardless of the site of damage. Finally, electrical stimulation of different regions of the brain often has the same effect, and selective stimulation of Broca’s and Wernicke’s areas does not produce the simple, different effects that we might expect.
FIGURE 3.8 The top diagram (“Speaking a heard word”; labels: primary auditory area, Wernicke’s area, Broca’s area, motor area) shows the sequence of events when a spoken word is presented and the individual repeats the word in spoken form. The bottom diagram (“Speaking a written word”; labels: primary visual area, angular gyrus, Wernicke’s area, Broca’s area, motor area) shows the sequence of events when a written word is presented and the individual repeats the word in spoken form.
More recent models of how language is related to the brain
Ullman (2004) proposed a model, called the D/P (declarative/procedural) model, of how language relates to the brain. He argued that language depends on two brain systems. The mental dictionary, or lexicon, depends on a declarative memory system based mainly in the left temporal lobe. The mental grammar, which depends primarily on procedural memory, is based on a distinct neural system involving the frontal lobes, basal ganglia, cerebellum, and regions of the left parietal lobe. Essentially this distinction is one between linguistic rules, or syntax, and words. The distinction will recur throughout this book, so it is important to remember that there is some anatomical justification for this distinction. Another important idea here is that language processing makes some use of cognitive processes and brain structures that are not just dedicated to language.
[Photo caption] Colored PET scans of the areas of the brain active while understanding language. (The fronts of these human brains are to the left.) Active parts of the cerebral left hemispheres are red/orange. Various language areas of the extra sylvian temporal region are active in the scan on the left, which shows activity associated with working out the meaning of words. The scan on the right shows activity associated with understanding sentences.
Recent work has used imaging to explore the exact role of Broca’s area in language, and one result is that its precise role has become much more controversial. The fact that damage to Broca’s area leads to aphasia shows that it plays an important role, but is it dedicated to language specifically, or does it just involve more general processes that underpin language? Are other regions of the brain involved in processing syntax? The answer to the latter question is almost certainly yes, and to the former, maybe. Imaging suggests that Broca’s area may play a role in general phonological working memory rather than syntactic manipulation as such (Rogalsky & Hickok, 2011; but see also Fedorenko & Kanwisher, 2011). There is even debate as to the exact language-related processes that Broca’s area computes, including phonological short-term memory (Rogalsky & Hickok, 2011), building a hierarchical structure (Friederici, 2002; Friederici, Bahlmann, Heim, Schubotz, & Anwander, 2006), linearizing a hierarchical structure (Bornkessel-Schlesewsky, Schlesewsky, & von Cramon, 2009), and unifying concepts into a planned sentence (Hagoort, 2008). Quite a list!
Some portions of the brain are more important for language functions than others, but it is difficult to localize specific processes in specific brain structures or areas. It is likely that multiple routes in the brain are involved in language production and comprehension. Modern brain-imaging techniques show that much larger regions of the brain may be involved in language processing than were once thought. For example, the temporal gyrus seems to play an important role in language comprehension (Dronkers, Wilkins, van Valin, Redfern, & Jaeger, 2004). A wide-ranging account of the relation between language and the brain is provided by Hickok and Poeppel (2004), who, drawing on data from brain imaging and lesion studies, focus on auditory comprehension. They argue that early stages of speech perception involve the superior temporal gyrus bilaterally (on both sides,
although more on the left). The cortical processing system then diverges into dorsal (towards the back and top of the brain) and ventral (towards the front and bottom of the brain) streams (see Figure 3.9). The ventral stream is mainly concerned with turning sound into meaning. The dorsal stream is concerned with mapping sound onto a representation involving articulation, and relates speech perception to speech production. Most of what we traditionally think of as “speech perception” takes place in the ventral stream. The output of the dorsal stream is an integration of auditory and motor information, and the stream is important when we focus on the sounds of the words involved (e.g., in learning to make speech sounds, or in analyzing the sounds of words, or repeating back nonwords).
FIGURE 3.9 (A) Hickok and Poeppel’s proposed framework for the functional anatomy of language. (B) General locations of the model components shown on a lateral view of the brain. From Hickok and Poeppel (2004).
[Panel A labels: auditory input; acoustic-phonetic speech codes; dorsal stream (auditory–motor interface, articulatory-based speech codes); ventral stream (sound–meaning interface). Panel B labels: STG (bilateral), acoustic-phonetic speech codes; Area Spt (left), auditory–motor interface; pIF/dPM (left), articulatory-based speech codes; pITL (left), sound–meaning interface.]
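The two streams of the Hickok and Poeppel framework can be summarized as a small lookup structure. The sketch below is only an illustration: the region labels and their roles are copied from Figure 3.9, but the dictionaries, the `route` helper, the two example tasks, and the ordering of stages within each stream are assumptions made here for the sake of the example, not part of the published model.

```python
# Illustrative encoding of the components labelled in Figure 3.9.
# The dictionaries and the route() helper are hypothetical names.
ROLES = {
    "STG (bilateral)": "acoustic-phonetic speech codes",
    "pITL (left)": "sound-meaning interface",
    "Area Spt (left)": "auditory-motor interface",
    "pIF/dPM (left)": "articulatory-based speech codes",
}

# Assumed ordering: both streams start from the shared early stage in the
# superior temporal gyrus and then diverge.
STREAMS = {
    "ventral": ["STG (bilateral)", "pITL (left)"],
    "dorsal": ["STG (bilateral)", "Area Spt (left)", "pIF/dPM (left)"],
}

# Assumed example tasks, matched to the stream the text says they rely on.
TASK_TO_STREAM = {
    "understand a spoken word": "ventral",  # turning sound into meaning
    "repeat a nonword": "dorsal",           # mapping sound onto articulation
}


def route(task):
    """Return the (region, role) stages this sketch associates with a task."""
    stream = TASK_TO_STREAM[task]
    return [(region, ROLES[region]) for region in STREAMS[stream]]


for task in TASK_TO_STREAM:
    print(task, "->", " -> ".join(region for region, _ in route(task)))
```

Keeping the roles separate from the stream orderings makes it easy to see that the two streams share their first stage and differ only in where the acoustic-phonetic codes are sent next.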
In summary, although we can point to specific regions of the brain (particularly in the left frontal and temporal lobes) that play particularly important roles in language, lesion and imaging studies show that the neural systems underlying language are variable, flexible, and distributed over many brain regions (Corina et al., 2003).
In a recent synthesis, Friederici (2012) describes how the cortical regions of the brain involved in language are connected by ventral and dorsal pathways. The ventral pathway is involved in auditory-to-meaning mapping, and the dorsal pathway is involved in auditory-to-motor mapping. The dorsal pathway might also be involved in syntactic processing, particularly with syntactically complex sentences. She argues that these two functions are so dissimilar that we should distinguish two dorsal streams on the basis of function and structure. The ventral pathway supports sound-to-meaning mapping and local syntactic structure building.
Sex differences and language
Girls appear to have greater verbal ability than boys, while boys appear to be better than girls at mathematical and spatial tasks (Kolb & Whishaw, 2009; Maccoby & Jacklin, 1974). It is probably too simplistic, however, to characterize this difference as simply “verbal versus visual,” as this summary does not capture all the differences involved: females tend to have superior visual memory, for example. It is also difficult to establish the direction of causality for findings in this area, as some differences may be attributable to cultural rather than biological causes. Nevertheless, there is plenty of evidence that from an early age girls are superior to boys on at least some verbal tasks (Baron-Cohen, 2003). On average, girls start talking about 1 month earlier than boys. They have better verbal memories, and are better readers and spellers.
Some researchers have found that males show greater lateralization than females (Baron-Cohen, 2003; Kolb & Whishaw, 2009). Males show a greater right-ear left-hemisphere advantage for perceiving speech sounds, while females suffer relatively less aphasia after damage to the left hemisphere, and they recover faster. Brain imaging has shown that Broca’s area is activated differently in boys and girls when they carry out the language task of deciding whether two nonwords rhyme or not. Girls tend to show activation in both the left and right prefrontal cortex, while with boys activation is limited to the left hemisphere (Shaywitz et al., 1995). It seems that the less lateralized brain leads to an advantage for language processing, perhaps because both hemispheres can be used.
There are also sex differences in language use in later life. Doubtless there are some cultural factors. Anderson and Leaper (1998) report a meta-analysis of gender differences in the use of interruptions. They found that men are significantly more likely to interrupt than women, and women are more likely to be interrupted than men. However, women also tend to be more fluent, producing more words, longer sentences, and fewer errors in a given time, and men are much more likely to suffer from clinical disorders such as stuttering.
IS THERE A CRITICAL PERIOD FOR LANGUAGE DEVELOPMENT?
It is widely believed that the ability to acquire language declines with increasing age, with very young children particularly well-adapted for language acquisition. The critical period hypothesis of Lenneberg (1967) comprises two related ideas. The first idea is that certain biological events related to language development can only happen in an early critical period. In particular, hemispheric specialization takes place during the critical period, and during this time children possess a degree of flexibility that is lost when the critical period ends. The second component of the critical period hypothesis is that certain linguistic events must happen to the child during this period for development to proceed normally. Proponents of this hypothesis argue that language is acquired most efficiently during the critical period.
The idea of a critical period for the development of particular processes is not unique to humans. Songbirds display hemispheric specialization in that only one hemisphere controls singing (Demers, 1988). Many birds such as the chaffinch
are born with the rudiments of a song, but must be exposed to the male song of their species between the ages of 10 and 15 days in order to acquire it normally. Evidence for a critical period for human linguistic development comes from many sources.
Evidence from the development of lateralization
The structure of the brain is not completely fixed at birth. A considerable amount of development continues after birth and throughout childhood (and indeed perhaps in adolescence); this process of development is called maturation. Furthermore, the brain (primarily the cortex) shows some degree of plasticity, in the sense that after damage it can to some extent recover and reorganize, or can adapt in response to pronounced changes in input, even in adulthood. It is now known that the brain is much more flexible even in adulthood than was once thought (Begley, 2007).
We are not born with our two hemispheres completely lateralized in function; instead, lateralization emerges throughout childhood. The most striking evidence for this claim is that damage to the left hemisphere in childhood does not always lead to the permanent disruption of language abilities.
There are three accounts of how lateralization takes place (Bates & Roe, 2001; Thomas, 2003). The equipotentiality hypothesis states that the two hemispheres are similar at birth with respect to language capability, each able in principle to acquire the processes responsible for language, with the left hemisphere maturing to become specialized for language functions. The irreversible determinism (or invariance) hypothesis states that the left hemisphere is specialized for language at birth and the right hemisphere only takes over language functions if the left is damaged over a wide area, involving both the anterior and posterior regions (Rasmussen & Milner, 1975; Woods & Carey, 1979). Irreversible determinism says that language has an affinity for the left hemisphere because of innate anatomical organization, and will not abandon it unless an entire center is destroyed. The critical difference between the equipotentiality and irreversible determinism hypotheses is that in the former either hemisphere can become specialized for language, but in the latter the left hemisphere becomes specialized for language unless there is a very good reason otherwise. The emergentist account brings together these two extremes, saying that the two hemispheres of the brain are characterized at birth by innate biases in types of information processing that are not specific to language processing (e.g., the left hemisphere is better at processing complex sequences), such that the left hemisphere is better suited to being dominant, although both hemispheres play a role in acquiring language (Lidzba & Krägeloh-Mann, 2005).
The critical period hypothesis is the best-known version of the equipotentiality hypothesis. Lenneberg (1967) argued that at birth the left and right hemispheres of the brain are equipotential. There is no cerebral asymmetry at birth; instead lateralization occurs as a result of maturation. The process of lateralization develops rapidly between the ages of 2 and 5 years, then slows down, being complete by puberty. Lenneberg argued that the brain possesses a degree of flexibility early on, in that, if necessary, brain functions can develop in alternative locations.
Lenneberg examined how a child’s age affected recovery after brain damage. Damage to the left hemisphere of the adult brain leads to significant and usually permanent language impairment. Lenneberg’s key finding was that the linguistic abilities of young children recover much better after brain damage than those of adults after brain damage,
and the younger the child, the better the chances of a complete recovery. Indeed, the entire function of the left hemisphere can be taken over by the right if the child is young enough. There are a number of cases of complete hemidecortication, where an entire hemisphere is removed as a drastic treatment for exceptionally severe epilepsy. Such an operation on an adult would almost totally destroy language abilities. If it is performed on children who are young enough (that is, during their critical periods), they seem able to recover almost completely. Another piece of evidence supporting the critical period hypothesis is that crossed aphasia, where damage to the right hemisphere leads to a language deficit, appears to be more common in children (Woods & Teuber, 1973). These findings suggest that the brain is not lateralized at birth, but that lateralization emerges gradually throughout childhood as a consequence of maturation. This period of maturation is the critical period.
On the other hand, Dennis and Whitaker (1976, 1977) found that children who had had the whole left cortex removed subsequently had particular difficulties in understanding complex syntax, compared with children who had had the whole right cortex removed. One explanation of this finding is that the right hemisphere cannot completely accommodate all the language functions of the left hemisphere, although Bishop (1983) in turn presented methodological criticisms of this work. She observed that the number of participants was very small, and that it is important to match for IQ to ensure that any observed differences are truly attributable to the effects of hemidecortication. When IQ is controlled for, there is a large overlap with normal performance. It is not clear that non-decorticated individuals of the same age would have performed any better.
Evidence from studies of lateralization in very young children
Contrary to the critical period hypothesis, there is evidence that some lateralization is present at a very early age, if not from birth. Entus (1977) studied 3-week-old infants using a sucking habituation paradigm. Exploring the cognitive and perceptual abilities of very young infants is obviously difficult, so we need to use clever experimental paradigms. In this task, the experimenter monitors changes in the infant’s sucking rate as stimuli are presented. Rapid sucking is an innate response to stimulation; when the infant gets bored with, or habituated to, the stimulus, the sucking rate drops. If a new stimulus is presented, and if the infant can detect the change, the sucking rate increases again. Hence monitoring sucking rate is a very useful way of telling whether an infant can detect change. Entus found a more marked change in the sucking rate when speech stimuli were presented to the right ear (and therefore a left-hemisphere advantage, as the right ear projects on to the left hemisphere), and an advantage for non-speech stimuli when presented to the left ear (indicating a right-hemisphere advantage). Molfese (1977) measured evoked potentials (a measure of the brain’s electrical activity) and found hemispheric differences to speech and non-speech in infants as young as 1 week, with a left-hemisphere preference for speech. Very young children also show a sensitive period for phonetic perception that is more or less over by 10–12 months (B. Harley & Wang, 1997; Werker & Tees, 1983).
Mills, Coffey-Corina, and Neville (1993, 1997) examined changes in patterns of ERPs (event-related potentials) in the electrical activity of the brain in infants aged between 13 and 20 months. They compared the ERPs as children listened to words whose meanings they knew with ERPs for words whose meanings they did not know. These two types of word elicited different patterns of ERP, but whereas at 13–17 months the differences were spread all over the brain, by 20 months the differences were restricted to the more central regions of the left hemisphere. Clearly some specialization is occurring here, but still considerably before the window of the critical period originally hypothesized by Lenneberg. These data also suggest that the right hemisphere plays an important role in early language acquisition. In particular, unknown words elicit electrical activity across the right hemisphere, perhaps reflecting the processing of novel but meaningful stimuli. The same idea could explain
the observation that focal brain injury to the right hemisphere of very young children (10–17 months) is more likely to result in a delay in the development of word comprehension skills than damage to the left hemisphere (Goldberg & Costa, 1981; Thal et al., 1991).
Differences in early asymmetry may be linked with later language abilities. Infants who show early left-hemisphere processing of phonological stimuli show better language abilities several years later (Mills et al., 1997; Molfese & Molfese, 1994).
Hence there does seem to be a critical period in which lateralization occurs, but the period starts earlier than Lenneberg envisaged. As there is considerable evidence for some lateralization from birth, the data also support the idea that the left hemisphere has a special affinity for language, rather than the view that the two hemispheres are truly equipotential.
Evidence from second language acquisition
The critical period hypothesis has traditionally been used to explain why second language acquisition is difficult for older children and adults. Johnson and Newport (1989) examined the way in which the critical period hypothesis might account for second language acquisition. They distinguished two hypotheses, both of which assume that humans have a superior capacity for learning language early in life. According to the maturational state hypothesis, this capacity disappears or declines as maturation progresses, regardless of other factors. The exercise hypothesis further states that unless this capacity is exercised early, it is lost. Both hypotheses predict that children will be better than adults in acquiring the first language. The exercise hypothesis predicts that as long as a child has acquired a first language during childhood, the ability to acquire other languages will remain intact and can be used at any age. The maturational hypothesis predicts that children will be superior at second language learning, because the capacity to acquire language diminishes with age. However, it is possible under the exercise hypothesis that, all other things being equal, adults might be better than children because of their better learning skills. Research has addressed the issue of whether there is an age-related block on second language learning.
Are children in fact better than adults at learning language? The evidence is not as clear-cut as is usually thought. Snow (1983) concluded that, contrary to popular opinion, adults are in fact no worse than young children at learning a second language, and indeed might even be better. We often think children are better at learning the first and second languages, but they spend much more time being exposed to and learning language than adults, which makes a comparison very difficult. Snow and Hoefnagel-Höhle (1978) compared English children with English adults in their first year of living in the Netherlands learning to speak Dutch. The young children (3–4 years old) performed worst of all. In addition, a great deal of the advantage for young children usually attributed to the critical period may be explicable in terms of differences in the type and amount of information available to learners (Bialystok & Hakuta, 1994). There is also a great deal of variation: Some adults are capable of near-native performance on a second language, whereas some children are less successful (B. Harley & Wang, 1997). Although ability in conversational syntax correlates with duration of exposure to the second language, this just suggests that total time spent learning the second language is important, and the younger you start, the more time you tend to have (Cummins, 1991). The conclusion is that there is little evidence for a dramatic cut-off in language-learning abilities at the end of puberty.
Adults learning a second language tend to have a persistent foreign accent, and hence phonological (sound) development might be one area for which there is a critical period (Flege & Hillenbrand, 1984). And, although adults seem to have an initial advantage in learning a second language, the eventual attainment level of children appears to be better (see Krashen, Long, & Scarcella, 1982, for a review).
Johnson and Newport (1989) carried out one of the most detailed studies of the possible effects of a critical period on syntactic development. They found some evidence for a critical period for the acquisition of the syntax of a second language. They
examined native Korean and Chinese immigrants to the USA, and found a large advantage in making judgments about whether a sentence was grammatically correct for immigrants who arrived at a younger age. In adults who had arrived in the USA when they were aged between 0 and 16 years, there was a large negative linear correlation between age of arrival and language ability (on this measure). Adults who arrived between the ages of 16 and 40 showed no significant relation between age of arrival and ability, although later arrivers generally performed slightly less well than early arrivers. The variance in the language ability of the later arrivers was very high. Johnson and Newport concluded that different factors operate on language acquisition before and after 16 years of age. They proposed that there is a change in maturational state, from plasticity to a steady state, at about age 16. Other researchers place the age of discontinuity much earlier, at around 5 (see Birdsong & Molis, 2001).
There is some controversy about whether Johnson and Newport’s data really represent a change at 16 from plastic to fixed state. Is there a real discontinuity? Elman et al. (1996) showed that the distribution of performance scores can also be fitted by a curvilinear function nearly as well as two linear ones, suggesting that there is a gradual decline in performance rather than a strong discontinuity. Nevertheless, the younger a person is, the better they seem to acquire a second language. Furthermore, Birdsong and Molis (2001) replicated the original Johnson and Newport (1989) study, using Spanish speakers learning English. Contrary to the original findings, and contrary to the critical period hypothesis, Birdsong and Molis found no learning discontinuity around 16. Furthermore, some late learners (starting to learn the second language after the presumed end of the critical period) achieved near-native performance on it, something that should not be possible if the critical period hypothesis is correct.
In summary, there is evidence for a critical period for some aspects of syntactic development and, even more strongly, for phonological development. However, rather than any dramatic discontinuity, decline seems to be gradual. Second language acquisition is not a perfect test of the hypothesis, though, because the speakers have usually acquired at least some of a first language. What happens if we cannot acquire a first language during the critical period?
Evidence from hearing children of hearing-impaired parents
In principle, the language of hearing children of deaf parents should provide a test of the critical period hypothesis. However, linguistic deprivation is never total. Sachs, Bard, and Johnson (1981) described the case of “Jim,” a hearing child of deaf parents whose only exposure to spoken language until he entered nursery at the age of 3 was the television. Although his parents signed to each other, they did not sign towards him. They believed that as he had normal hearing it would be inappropriate for him to learn signing. Jim’s intonation was abnormally flat, his articulation very poor, with some utterances being unintelligible, and his grammar very idiosyncratic. For example, Jim produced utterances such as “House. Two house. Not one house. That two house.” This example shows that Jim acquired the concept of plurality but not that it is usually marked by an “-s” inflection, although normally this is one of the earliest grammatical morphemes a child learns. Utterances such as “Going house a fire truck” suggest that Jim constructed his own syntactic rules based on stating a phrase followed by specifying the topic of that phrase, the opposite of the usual word order in English. Although this is an incorrect rule, it does emphasize the drive to create syntactic rules (see Chapter 4). Jim’s comprehension of language was also very poor. After intervention, within a few months Jim’s language use was almost normal. Jim’s case suggests that exposure to language alone is not sufficient to acquire language normally: it must be in an appropriate social, interactional context. It also emphasizes humans’ powerful urge to use language.
People exposed to sign language (e.g., ASL) early achieve a better level of ultimate competence (Newport, 1990). In particular, late learners have difficulty using signs to represent complex verbs. These observations also support the critical period hypothesis.
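The dispute over Johnson and Newport’s (1989) data is at heart a model comparison: do two straight lines with a break at age 16 fit the scores much better than one smooth curve? The sketch below shows the shape of that comparison. It is a minimal illustration under stated assumptions: the scores are synthetic (they are not Johnson and Newport’s data), the curvilinear function is taken to be score = a + b ln(age), and the function names are invented here; Elman et al. (1996) may have used a different curve and fitting procedure.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, residual sum of squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, rss


def rss_two_lines(ages, scores, breakpoint=16):
    """Fit separate lines to arrivals up to and after `breakpoint`; return the total RSS."""
    early = [(a, s) for a, s in zip(ages, scores) if a <= breakpoint]
    late = [(a, s) for a, s in zip(ages, scores) if a > breakpoint]
    total = 0.0
    for group in (early, late):
        xs, ys = zip(*group)
        total += fit_line(xs, ys)[2]
    return total


def rss_smooth_curve(ages, scores):
    """Fit one curvilinear function, score = a + b * ln(age); return its RSS."""
    return fit_line([math.log(a) for a in ages], scores)[2]


# Synthetic scores only (NOT real data): a smooth, gradual decline plus noise.
random.seed(0)
ages = list(range(3, 40))
scores = [280 - 25 * math.log(a) + random.gauss(0, 4) for a in ages]

print("two lines RSS:", round(rss_two_lines(ages, scores), 1))
print("one curve RSS:", round(rss_smooth_curve(ages, scores), 1))
```

A lower residual sum of squares means a closer fit. If the single curve fits about as well as the two lines, which use more free parameters, the scores give little reason to posit a sharp break at 16, which is the logic of the argument for a gradual decline.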
What happens if children are deprived of linguistic input during the critical period?
In a very early psycholinguistic experiment, King James IV of Scotland reputedly abandoned two children in the wild (around the year 1500). Later he claimed that they grew up spontaneously learning to speak “very good Hebrew.” What really happens to children who grow up in the absence of linguistic stimulation?
The other important idea of the critical period hypothesis is that unless children receive linguistic input during the critical period, they will be unable to acquire language normally. The strongest version of the hypothesis is of course that without input during this period children cannot acquire language at all. Supporting evidence comes from reports of wild or feral children who have been abandoned at birth and deprived of language in childhood. Feral children often have no language at all when found, but more surprisingly, appear to find language difficult to acquire despite intensive training. “Wolf children” are so named because they have reputedly been cross-fostered by wolves and reared as wolf cubs (such as the Romulus and Remus of Roman legend). One of the most famous of these cases was the “Wild Boy of Aveyron,” a child found in isolated woods in the south of France in 1800. Despite attempts by an educationalist named Dr. Itard to socialize the boy, given the name Victor, and to teach him language, he never learned more than two words. (This story was subsequently turned into a film by François Truffaut, called L’enfant sauvage, and is described by Shattuck, 1980.) More recent reports of feral children involving apparent cross-fostering include the wolf children of India (Singh & Zingg, 1942) and the monkey boy of Burundi (Lane & Pillard, 1978). In each case, attempts to teach the children language and social skills were almost complete failures. These cases describe events that happened some time ago, and what actually happened is usually unclear. Furthermore, we do not know why these children were originally abandoned. It is certainly conceivable that they were developmentally delayed before abandonment, and therefore might have been language-impaired, whatever the circumstances.
[Photo caption] Ramu was a young boy who appeared to have been reared by wolves. He was discovered in India in 1960. At the time of his death, aged about 10, he had still not learned to speak. The picture shows Ramu being examined by a doctor.
It is less easy to apply this argument to the unfortunate child known as “Genie” (Curtiss, 1977; Fromkin, Krashen, Curtiss, Rigler, & Rigler, 1974). Genie was a child who was apparently normal at birth, but suffered severe linguistic deprivation. From the age of 20 months, until she was taken into protective custody by the Los Angeles police when she was 13 years 9 months, she had been isolated in a small room, most of the time strapped into a potty chair. Her father was extraordinarily intolerant of noise, so there was virtually no speech in the house, not even overheard from a radio or television. Genie was punished if she made any sounds. The only contact she had with other people was a few minutes each day when her mother fed her baby food, and occasionally when her father and older brother barked at her like dogs. Clearly this is extreme social, physical, nutritional, and linguistic deprivation. Not surprisingly, Genie’s linguistic
abilities were virtually non-existent. At the age of nearly 14 the critical period should be finished or
almost finished, so could Genie learn language? With training, Genie learned some language skills.
However, her syntactic development was always impaired relative to her vocabulary. She used few
question words, far fewer grammatical words, and tended to form negatives only by adding
negatives to the start of sentences. She failed to acquire the use of inflectional morphology (the
ability to use word endings to modify the number of nouns and the tense of verbs), the ability to
transform active syntactic constructions into passive ones (e.g., turning “the vampire chased the
ghost” into “the ghost was chased by the vampire”), and the use of auxiliary verbs (e.g., “be”).
Furthermore, unlike most right-handed children, she showed a left-ear, right-hemisphere advantage
for speech sounds. There could be a number of reasons for this, such as left-hemisphere
degeneration, the inhibition of the left hemisphere by the right, or the left hemisphere taking over
some other function.
Because of financial and legal difficulties, research on Genie did not continue for as long as
might have been hoped, and hence many questions remain unanswered. (Genie is now in an
adult foster home.) In summary, Genie’s case shows that it is possible to learn some language
outside the critical period, but also that syntax appears to have some privileged role. The amount
of language that can be learned after the critical period seems very limited.
Of course, the other types of deprivation (such as malnutrition and social deprivation) to which
Genie was subjected might have played a part in her later linguistic difficulties. Indeed, Lenneberg
discounted the case because of the extreme emotional trauma Genie had suffered. Furthermore,
there has been no agreement over whether Genie was developmentally delayed before her period of
confinement. Indeed, her father locked her away precisely because he considered her to be severely
developmentally delayed, in the belief that he was protecting her. Contrary to this, there is some
evidence that aspects of Genie’s non-linguistic development proceeded normally following her
rescue (Rymer, 1993).
Some children might be able to recover completely from early linguistic deprivation as long
as they are given exposure to language and training at an early enough age. “Isabelle” was kept
from infancy, with minimum attention, in seclusion with her deaf-mute mother until the age of
6½ (Davis, 1947; Mason, 1942). Her measured intelligence was about that of a 2-year-old and she
possessed no spoken language. But with exposure to spoken language she passed through the
normal phases of language development at a greatly accelerated rate, and after 18 months her
intelligence was in the normal range and she was highly linguistically competent.
In summary, the evidence from linguistic deprivation is not as clear-cut as we might expect.
Children appear able to recover from it as long as they receive input early enough. If deprivation
continues, language development, particularly syntactic development, is impaired. A major
problem is that linguistic deprivation is invariably accompanied by other sorts of deprivation, and it
is difficult to disentangle the effects of these.

Evaluation of the critical period hypothesis
There are two reasons for rejecting a strong version of the critical period hypothesis. Children can
acquire some language outside of it, and lateralization does not occur wholly within it. In particular,
some lateralization is present from birth or before. Nevertheless, it is possible to defend a weakened
version of the hypothesis. A critical period appears to be involved in early phonological development
and the development of syntax. The weakened version is often called a sensitive period hypothesis.
The evidence supports the weaker version. There is a sensitive period for language acquisition, but
it seems confined to complex aspects of syntactic processing (Bialystok & Hakuta, 1994).
The critical period does not apply only to spoken language. Newport (1990) found evidence
of a critical period for congenitally deaf people learning ASL, particularly concerning the use of
morphologically inflected signs. She also found a continuous linear decline in learning ability rather
than a sudden drop-off at puberty. Of course adults can learn sign language, but it is argued they
learn it less efficiently.
Why should there be a critical period for language? There are three types of explanation. The
nativist explanation is that there is a critical period because the brain is pre-programmed to acquire
language early in development. Bever (1981) argued that it is a normal property of growth, arising
from a loss of plasticity as brain cells and processes become more specialized and more independent.
Along similar lines, Locke (1997) argues that a sensitive period arises because of the interplay of
developing specialized neural systems, early perceptual experience, and discontinuities in linguistic
development. Lack of appropriate activation during development acts like physical damage to some
areas of the brain.
The maturational explanation is that certain advantages are lost as the child’s cognitive and
neurological system matures. In particular, what might first appear to be a limitation of the immature
cognitive system might turn out to be an advantage for the child learning language. For example, it
might be advantageous to be able to hold only a limited number of items in short-term memory, to be
unable to remember many specific word associations, and to remember only the most global
correspondences. That is, there might be an advantage to “starting small,” because it enables the
children to see the wood for the trees (Deacon, 1997; Elman, 1993; Kersten & Earles, 2001;
Newport, 1990). It is possible that the limited cognitive resources of the child are actually
advantageous to children (an idea called “less is more”), as it means they can only process
limited amounts of language at any one time. They can then get the small segments right before they
start on the larger and more complex units, without being overwhelmed from the beginning. A related
variant of the maturational answer is that, as the brain develops, it uses up its learning capacity by
dedicating specialist circuits to particular tasks. Connectionist modeling of the acquisition of the
past tense of verbs suggests that networks do indeed become less plastic the more they learn
(Marchman, 1993).
The main differences between these answers are the extent to which the constraints underlying
the critical period are linguistic or more general, and the extent to which the timing of the
acquisition process is genetically controlled (Elman et al., 1996). With insights from connectionist
modeling, the maturational answer has recently received the most attention. However, perhaps the
two approaches are not really contradictory. A system that matures and is more efficient for learning
language will have an evolutionary advantage.

THE COGNITIVE BASIS OF LANGUAGE
Jean Piaget is one of the most influential figures in developmental psychology. According to Piaget,
development takes place in a sequence of well-defined stages. In order to reach a certain stage of
development, the child must have gone through all the preceding stages. Piaget identified four
principal stages of cognitive development (see Figure 3.10). At birth, he argued that the child
possesses only innate reflexes. In the first stage of development, which Piaget called the
sensorimotor period, behavior is organized around sensory and motor processes. This stage lasts
through infancy until the child is about 2 years old. A primary development in this period is the
attainment of the concept of object permanence: that is, realizing that objects have continual
existence and do not disappear as soon as they go out of view. Indeed, Piaget divided the
sensorimotor period into six sub-stages depending on the progress made towards object permanence.
Next comes the pre-operational stage, which lasts until the age of about 6 or 7. This stage is
characterized by egocentric thought, which means that these children are unable to adopt alternative
viewpoints to their own and are unable to change their point of view. The concrete operations stage
lasts until the age of about 12. The child is now able to adopt alternative viewpoints, as shown by the
conservation task. In this task water is poured from a short wide glass to a tall thin glass, and the
child is asked if the amounts of water are the same. A pre-operational child will reply that the tall
glass has more water in it; a concrete operational child will correctly say that they both contain
the same amount. Nevertheless the child is still limited to reasoning about concrete objects. In the
formal operations stage, the adolescent is not limited to concrete thinking, and is able to reason
abstractly and logically. Piaget proposed that the main mechanisms of cognitive development are
assimilation and accommodation. Assimilation is the way in which
information is abstracted from the world to fit existing cognitive structures; accommodation is the
way in which cognitive structures are adjusted in order to accommodate otherwise incompatible
information.
According to Piaget, there is nothing special about language. Unlike Chomsky, he did not see it
as a special faculty, but as a social and cognitive process just like any other. It therefore clearly has
cognitive prerequisites; it is dependent on other cognitive, motor, and perceptual processes, and its
development clearly follows the cognitive stages of development. Adult speech is socialized and has
communicative intent, whereas early language is egocentric. Piaget (1923/1955) went on to
distinguish three different types of early egocentric speech: repetition or echolalia (where children
simply repeat their own or others’ utterances); monologues (when children talk to themselves,
apparently just speaking their thoughts out loud); and group or collective monologues (where two or
more children appear to be taking appropriate turns in a conversation but actually just produce
monologues). For Piaget, cognitive and social egocentrism were related.
The cognition hypothesis is a statement of Piaget’s ideas about language, and says that language
needs certain cognitive precursors in order to develop (Sinclair-de-Zwart, 1973). For example, the
child has to attain the stage of object permanence in order to be able to acquire concepts of objects
and names. Hence an observed explosion in vocabulary size at around 18 months is related to the
attainment of object permanence. However, Corrigan (1978) showed that there was no correlation
between the development of object permanence and linguistic development once the child’s age was
taken into account. Furthermore, infants comprehend names as much as 6 months before the stage of
object permanence is complete. Indeed, having unique names available for objects may help children
acquire object permanence. Xu (2002) found that having two distinct labels available for two distinct
objects (e.g., a toy duck and a ball) facilitated the discrimination abilities of 9-month-old children,
but having one label, or two distinct tones, or two facial expressions, did not.
There is some evidence that language acquisition is related to the development of object
permanence in a more complex way. An important, though at first small, class of early words are
relational words (e.g., “no,” “up,” “more,” “gone”). The first relational words should depend on the
emergence of knowledge about how objects can be transformed from one state to another, at the end
of the sensorimotor period. These words do indeed tend to enter as a group near the end of the
sensorimotor period (McCune-Nicolich, 1981). Words that relate to changes in the state of objects
still present in the visual field (e.g., “up,” “move”) emerge before those (e.g., “all gone”) that relate
to absent objects (Tomasello & Farrar, 1984, 1986).

FIGURE 3.10 Piaget’s four stages of cognitive development
Sensorimotor stage (0–2 years): Intelligence in action; child interacts with environment by
manipulating objects.
Pre-operational stage (preconceptual, 2–4 years; intuitive, 4–7 years): Thinking dominated by
perception, but child becomes more and more capable of symbolic functioning; language development
occurs; child still unduly influenced by own perception of environment.
Concrete operations (7–11/12 years): Logical reasoning can only be applied to objects that are
real or can be seen.
Formal operations (11/12 years upwards): Individual can think logically about potential events or
abstract ideas.

Language development in children with learning difficulties
An obvious test of the cognition hypothesis is to
examine the linguistic abilities of children with
learning difficulties. If cognitive development
drives linguistic development, then impaired
cognitive development should be reflected in
slow linguistic development. The evidence is
mixed but suggests that language and cognition
are to some extent decoupled.
Although some children with Down’s syndrome become fully competent in their language,
most do not (Fowler, Gelman, & Gleitman, 1994). At first, these children’s language development is
simply delayed. Up to the age of 4, their language age is consistent with their mental age (although
it is obviously behind their chronological age). After this, language age starts to lag behind mental
age. Lexical development is slow, and grammatical development is especially slow (Hoff-Ginsberg,
1997). Most people with Down’s syndrome never become fully competent with complex syntax and
morphology.

Some people with Down’s syndrome may have impaired linguistic abilities, whereas others become
fully competent. It seems that cognitive and linguistic abilities are distinct, and a person with
Down’s syndrome may show greater abilities in their cognition than in linguistic ability, or vice
versa.

On the other hand, there are several types of impaired cognitive development that do not lead to
such clear-cut linguistic impairments. Laura was a girl who showed severe and widespread cognitive
impairments (her IQ was estimated at 41), yet appeared unimpaired at complex syntactic
constructions (Yamada, 1990). Furthermore, factors that caused problems for Laura in cognitive
tasks did not do so in linguistic tasks; for example, while non-linguistic tasks involving reasoning
about hierarchies were very difficult for Laura, her ability to produce sentences with grammatical
hierarchies was intact. Although her short-term memory was very poor, she could still produce
complex syntactic constructions. Yamada concluded that cognitive and linguistic processes are
distinct, and that as normal language could develop when there is severe general cognitive
impairment, cognitive precursors are not essential for linguistic development. The situation is not
straightforward, however, as not all Laura’s linguistic abilities were spared. For example, she had
difficulty with complex morphological forms. In another case study, Smith and Tsimpli (1995)
described a man who had a non-verbal IQ beneath 70, and was unable to live independently, yet who
had a normal verbal IQ and could speak several foreign languages.
Williams syndrome is a rare genetic disorder that leads to physical abnormalities (affected
children have an “elfin-faced” appearance) and a very low IQ, typically around 50. However, the
speech of such people is very fluent and grammatically correct (Bellugi, Bihrle, Jernigan, Trauner,
& Doherty, 1991). Indeed, they are particularly fond of unusual words. Their ability to acquire new
words and to
repeat nonwords is also good (Barisnikov, van der Linden, & Poncelet, 1996). This dissociation
between severe cognitive impairment and normal (in some respects, better than normal) language
skills makes Williams syndrome particularly interesting and important for thinking about how
language and cognition are related.
Finally, children with autism find social communication difficult, and their language use is often
idiosyncratic. The things they talk about are different, for example, and they use some words in
unusual ways (Tager-Flusberg, 1999). Their peculiarities of language use probably arise from their
lack of a “theory of mind” about how other people think and feel, and are unlikely to be attributable
to straightforward deficits in linguistic processing (Bishop, 1997). Their grammatical skills are
relatively unimpaired.
Cases such as these pose difficulty for any position that argues either for interaction between
cognitive and linguistic development, or for the primacy of cognitive factors. The evidence favors
a partial, but not complete, separation of language skills and general cognitive abilities such as
reasoning and judgment.

Evaluation of the cognition hypothesis
The cognition hypothesis says that cognitive development drives linguistic development.
However, there is no clear evidence for a strong version of the cognition hypothesis. Children
acquire some language abilities before they obtain object permanence. Indeed, Bruner (1964) argued
that aspects of cognitive performance are facilitated by language. The possibility that linguistic
training would improve performance of the conservation task was tested by Sinclair-de-Zwart
(1969), who found that language training only had a small effect. Linguistic training does not affect
basic cognitive processes, but helps in description and in focusing on the relevant aspects of the task.
Cognitive processes obviously continue to develop beyond infancy. For example, working
memory capacity increases through childhood from about 2 items at age 2–3, to the adult span of
7 plus or minus 2 in late childhood, and there might also be changes in the structure of memory
(McShane, 1991). Young children also rehearse less than older children do. It is possible that
changes in working memory might have consequences for some linguistic processes, particularly
comprehension and learning vocabulary (see Chapter 15).
There is currently little active research on the Piagetian approach to language. The emphasis has
instead shifted to the communicative precursors of language and the social interactionist account
(see below). However, to be effective communicators children need to develop the ability to adopt
others’ point of view. An essential component of this development is the acquisition of a “theory of
mind.” Although this might be driven by cognitive development, it might also be driven linguistically.
The acquisition of verbs such as “know,” “believe,” “think,” and “want,” and the development of
linguistic structures that enable us to express complex statements about beliefs, truth, and falsehood
in a relatively simple way, are almost certainly driving forces as well (de Villiers & de Villiers,
2000; Shatz, Diesendruck, Martinez-Beck, & Akar, 2003).

THE SOCIAL BASIS OF LANGUAGE
We noted earlier that it is difficult to disentangle the specific effects of linguistic deprivation in
feral children from the effects of social deprivation. Cases such as that of Jim, the hearing child of
deaf parents, suggest that children need to be exposed to language in a socially meaningful situation
(Sachs et al., 1981). It is clearly not enough to be exposed to language; something more is necessary.
Adults tend to talk to children about objects that are in view and about events that have just
happened: the “here-and-now.” The usefulness of this is obvious (for example, in associating names
with objects), and it is clear that learning language just by watching television is going to be very
limited in this respect. Furthermore, such situations involve the child having to both comprehend and
produce language. To be effective, early language learning must involve interaction; it must take
place in a social setting. Social interactionists emphasize the importance of the development of
language through interaction with other people (Bruner, 1983; Durkin, 1987; Farrar, 1990; Gleason,
Hay, & Crain, 1989). According to social interactionists,
although biological and cognitive processes
may be necessary for language development, they
are not sufficient. Language development must
occur in the context of meaningful social interaction.
Bruner (1975, 1983) emphasized the importance
of the social setting in acquiring language. In many
respects his views are similar to those of Piaget, but
Bruner placed more emphasis on social development
than on cognitive development. Bruner stressed the
importance of the social setting of the mother–child
dyad in helping children to work out the meaning
of utterances to which they are exposed. Although
child-directed speech is an important mechanism,
the social dyad achieves much more than a particular
way of talking. For example, the important distinc-
tion between agents (who are performing actions) and
objects (who are having actions carried out on them)
is first made clear in turn-taking (and games based
on turn-taking) with the mother. As its name implies,
turn-taking is rather like a conversation; participants
appear to take it in turns to do things, although obvi-
ously the conscious intent on the part of the infant in
this may be limited. Processes such as mutual gaze, when the adult and child look at the same thing,
and joint attention to objects and actions, are important in enabling the child to discover the
referents of words. Bruner suggested that some of these social skills, or the way in which they are
used in learning language, may be innate. Bruner described language development as taking place
within the context of a LASS (language acquisition socialization system).

Bruner emphasized the importance of the mother–child dyad in acquiring language. For
example, processes such as mutual gaze and joint attention to objects are important in enabling the
child to discover the referents of words.

Other aspects of the social setting are important for linguistic development. Feedback from
adults about children’s communicative efficiency plays a vital role in development. For example,
the social-communicative setting can be central to acquiring word meanings by restricting the
domain of discourse of what is being talked about (Tomasello, 1992b). Along the same lines, the
social-communicative setting may also facilitate the task of learning the grammar of the language.
There has been a great deal of debate about the role of negative evidence (for example, whether
children have to be told that certain strings of words are ungrammatical) in language acquisition,
and its limitations have been used to justify the existence of innate principles. Although parents
might not provide explicit negative evidence (in the form of explicit correction), they do provide
implicit negative evidence (Sokolov & Snow, 1994). For example, parents tend to repeat more
ill-formed utterances than well-formed ones, and tend to follow ill-formed utterances with a question
rather than a continuation of the topic. There are also regional and class differences: rural southern
working-class mothers in the USA provide more explicit corrections than do middle-class mothers
(Sokolov & Snow, 1994). Clearly the development of communicative competence is an essential
prerequisite of language acquisition.

Turn-taking in early conversation
There is more to learning to use language than just learning the meanings of words and a grammar:
we also have to learn how to use language. Conversations
have a structure (see Chapter 14). Clearly we do not always talk all at once; we appear to take turns
in conversations. At the very least children have to learn to listen and to pay some attention when
they are spoken to. How does this ability to interact in conversational settings develop? There is
some evidence that it appears at a very early age. Schaeffer (1975) proposed that the origins of
turn-taking lie in feeding. In feeding, sucking occurs in bursts interspersed with pauses that appear
to act as a signal to mothers to play with the baby, to cuddle it, or to talk to it. He also noted that
mothers and babies rarely vocalize simultaneously. Snow (1977) observed that mothers respond to
their babies’ vocalizations as if their yawns and burps were utterances. Hence the precursors of
conversation are present at an early age and might emerge from other activities. The gaze of mother
and child also seems to be correlated; in particular, the mother quickly turns her gaze to whatever
the baby is looking at. Hence again cooperation emerges at an early age. Although in these cases it is
the mother who is apparently sensitive to the pauses of the child, there is further evidence that
babies of just a few weeks old are differentially sensitive to features of their environment.
Trevarthen (1975) found that babies visually track and try to grab inanimate objects, but they make
other responses to people, including waving and what he called pre-speech: small movements of the
mouth, rather like a precursor of talking. The exact role of this pre-speech is unclear, but certainly
by the end of the first 6 months the precursors of social and conversational skills are apparent, and
infants have developed the ability to elicit communicative responses.

Evaluation of social interactionist accounts
Few would argue with the central theme of the social interactionist approach: To be effective,
language acquisition must take place in a meaningful social setting. But can this approach by
itself account for all features of language acquisition? We will see in Chapter 4 that there is
considerable evidence that language development relies on some innate knowledge. One particular
disadvantage of the social interactionist approach is that until recently these accounts were often
vague about the details of how social interactions influence development. Cognitive processes
mediate social interactions, and the key to a sophisticated theory is in detailing this relation.

Disorders of the social use of language
There are several developmental disorders of using language in a social context. Bishop (1997)
describes semantic-pragmatic disorder, which is a language impairment that looks like a very mild
version of autism. Children with semantic-pragmatic disorder often have difficulty in conversations
where they have to draw inferences. They give very literal answers to questions, failing to take the
preceding conversational and social context into account, as in the following (from Bishop, 1997,
p. 221):

Adult: Can you tell me about your party?
Child: Yes.

Although semantic-pragmatic disorder is poorly understood, it is clear that its origins are complex.
Whereas related disorders might be explicable in terms of memory limitations or social neglect,
semantic-pragmatic disorder is probably best explained in terms of these children having difficulty
in representing other people’s mental states. This in turn is probably the result of an innate or
developmental brain abnormality. This deficit illustrates how difficult it can be to disentangle
biological, cognitive, and social factors from each other.

THE LANGUAGE DEVELOPMENT OF VISUALLY AND HEARING-IMPAIRED CHILDREN
One way of attempting to disentangle the development of language and cognition is to examine
language development in special circumstances. If cognition drives language development, then
visually impaired children, who are likely to show some differences in cognitive development
compared with sighted children, should also show differences
in linguistic development. If language drives cognitive development, then hearing-impaired and
non-hearing-impaired children should differ in their cognitive development.
The cognitive development of blind or visually impaired children is slower than that of
sighted children. The smaller range of experiences available to the child, the relative lack of
mobility, the decreased opportunity for social contact, and the decreased control of the child’s own
body and environment all take their toll (Lowenfeld, 1948). The reliance of the development of the
concept of object permanence on the senses of hearing and touch leads to a delay in attaining it, and
necessarily leads to a different type of concept.
Early studies suggested that the language development of blind children differed from that of
sighted children in that their speech was more egocentric, stereotypic, and less creative. Cutsford
(1951) went so far as to claim that blind children’s words were meaningless. It is now known that
these are overgeneralizations, and are probably totally wrong.
Some (but not all) blind children may take longer to say their first words, although this is
controversial (Lewis, 1987). Bigelow (1987) found that blind children acquired the first 50 words
between the mean ages of 1 year 4 months and 1 year 9 months, compared with the 1 year 3 months
to 1 year 8 months Nelson (1973) observed for sighted children. The earliest words seem to be
similar to those first used by sighted children, although there appears to be a general reduction in the
use of object names (Bigelow, 1987). Not surprisingly, unlike with sighted children, names do not
refer to objects that
are salient in the visual world, particularly those that
cannot be touched (e.g., “moon”). Blind children use
far fewer animal names in early speech than sighted
children (8% compared with 20%; see Mulford,
1988). Instead, they refer to objects salient in the
auditory and tactile domains (e.g., “drum,” “dirt,”
and “powder”). Blind children also use more action
words than sighted children do, and tend to refer to
their own actions rather than the actions of others.
The earliest words also seem to be used rather
differently. They appear to be used to comment on
the child’s own actions, in play or in imitation, rather
than for communication or referring to objects or
events. Indeed, Dunlea (1984) argued that as blind
children were not using words beyond the context
in which they were first learned, the symbolic use of
words was delayed. Furthermore, vocabulary acqui-
sition is generally slower. The understanding of par-
ticular words is bound to be different: Landau and
Gleitman (1985) described the case of a 3-year-old
child who, when asked to look up, reached her arms
over her head. Nevertheless, Landau and Gleitman
demonstrated that blind children can come to learn
the meanings of words such as “look” and “see”
without direct sensory experience. It is possible
that children infer the meanings of these words by
observing their positions in sentences and the words
that occur with them.

There is a difference in the rate of development of linguistic abilities in blind and visually impaired
children compared with non-impaired children, due to their different experience of the world.

There is considerable controversy about the use of pronouns by blind children. Whereas some
researchers have found late acquisition of pronouns
and many errors with them (e.g., using “you” for “me”; Dunlea, 1989), better controlled studies have found no such difference (Pérez-Pereira, 1999).
There are differences in phonological development: Blind children make more errors than sighted children in producing sounds that have highly visible movements of the lips (e.g., /b/), suggesting that visual information about the movement of lips normally contributes to phonological development (Mills, 1987). Nevertheless, older blind children show normal use of speech sounds, suggesting that acoustic information can eventually be used in isolation to achieve the correct pronunciation (Pérez-Pereira & Conti-Ramsden, 1999).
Syntactic development is marked by far more repetition than is normally found, and the use of repeated phrases carries over into later development. Furthermore, blind children do not ask so many questions of the type “what’s that?” or “what?,” or use modifiers such as “quite” or “very” (which account for the earliest function words of sighted children). This observation might reflect the fact that their parents adapt their own language to the needs of the children, providing more spontaneous labeling. There is also a delay in the acquisition of auxiliary verbs such as “will” and “can” (Landau & Gleitman, 1985). Again this is probably because of differences in the speech of the caregivers. Mothers of blind children use more direct commands (“Take the doll”) than questions involving auxiliaries (“Can you take the doll?”) when speaking to their children. The other curious finding is that the children’s use of function words (which do the grammatical work of the language) is much less common early on (Bigelow, 1987).
Hence the linguistic development of blind children is different from that of sighted children, but the differences are mostly the obvious ones that one would expect given the nature of the disability. There is little clear evidence to support the idea that an impairment of cognitive processing causes an impairment of syntactic processing, and therefore we cannot conclude that cognitive processes precede linguistic ones. Neither is there much evidence to support the idea that blind children’s early language is deficient relative to that of sighted children. Indeed, behavior that was once thought to be maladaptive in some way may in fact provide blind children with alternative communicative strategies (Pérez-Pereira & Conti-Ramsden, 1999). For example, repetition and stereotypic speech are used to serve a social function of keeping in contact with people. Blind children use verbal play to a greater extent than sighted children, and may have better verbal memory. It should also be noted that work on blind children is methodologically complex and tends to involve small numbers of participants; many studies might have underestimated their linguistic abilities (Pérez-Pereira & Conti-Ramsden, 1999).
In any case, even if blind children were to show an unambiguous linguistic deficit, it would be very difficult to attribute any deficit just to differences in cognitive development. For example, the development of mutual gaze and the social precursors of language will necessarily be different; and sighted parents of blind children still tend to talk about objects that are visually prominent. However, caregivers try to adapt their speech to the needs of their children, resulting in subtle differences in linguistic development.
On the other hand, it is obvious that the development of spoken language is impaired in deaf or hearing-impaired children. There is some evidence that deaf children spontaneously start using and combining increasingly complex gestures in the absence of sign language (e.g., Mohay, 1982). This finding shows that there is a strong need for humans to attempt to communicate in some way. However, given adequate tuition, the time course of the acquisition of sign language runs remarkably parallel to that of normal spoken language development. Meier (1991) argued that deaf children using sign language pass the same linguistic “milestones” at about the same ages as hearing children (and some milestones perhaps before hearing children).
Research on the cognitive consequences of deafness has given mixed results. In one early experiment, Conrad and Rush (1965) found differences in coding in memory tasks between hearing and deaf children. This result is not surprising given the involvement of acoustic or phonological processing in short-term or working memory (Baddeley, 1990). If rigorous enough controls are used, it can be demonstrated that these indeed reflect differences in the memory systems rather than inferiority of the hearing-impaired systems (Conrad, 1979). Furth
(1966, 1971) found that compared with hearing children, deaf children’s performance on Piagetian tasks was relatively normal. A review of results on tasks such as conservation gave a range of results, from no impairment to 1–2 years’ delay; the evidence was mixed. Furth (1973) found that deaf adolescents had more difficulty with symbolic logic reasoning tasks than did hearing children. Furth interpreted these data as evidence for the Piagetian hypothesis that language is not necessary for normal cognitive development. Any differences between deaf and hearing children arise out of the lack of experiences and training of the deaf children.
However, most deaf children learn some kind of sign language at a very early age, so it is difficult to reach any strong conclusions about the effects of lack of language. Deaf children with deaf parents acquire sign language at the same rate as other children acquire spoken language (Messer, 2000). Best (1973) found that the more exposure to sign language that deaf children had, the better their performance on the Piagetian tasks.
[Photo caption: According to Messer (2000), deaf children with deaf parents acquire sign language at the same rate as other children acquire spoken language.]

Evaluation of evidence from deaf and blind children
There are clearly differences in cognitive development between hearing-impaired and non-hearing-impaired children, but it is not obviously the case that the linguistic performance of one group is superior to that of the other. The cognitive development of deaf children generally proceeds better than it should if language were primary, and the linguistic development of blind children generally proceeds better than it should if cognition were primary. Deaf children learn a sign language, and blind children acquire excellent coping strategies and acquire spoken language remarkably well. Indeed, the linguistic development of deaf children and the cognitive development of blind children both proceed better than we would expect if one were driving the other. There is little supporting evidence for the cognition hypothesis from an examination of children with learning difficulties or a comparison of deaf and blind children. If anything, these findings support Chomsky’s position that language is an independent faculty. Nevertheless, social factors are clearly important. Biological, cognitive, and social factors work together in language development, and deficits in one of these areas can often be compensated for by the others.

WHAT IS THE RELATION BETWEEN LANGUAGE AND THOUGHT?
In this section we examine the relation between language and other cognitive and biological processes. Does the form of our language influence the way in which we think, or is the form of our language dependent on general cognitive factors?
Many animals are clearly able to solve some problems without language, suggesting that language cannot be essential for problem solving and thought. Although this may seem obvious, it has not always been considered so. Among the early approaches to examining the relation between language and thought, the behaviorists believed that thought was nothing more than speech. Young children speak their thoughts aloud; this becomes internalized, with the result that thought is covert speech: thought is just small motor movements of the vocal apparatus. Watson (1913) argued that thought processes are nothing more than motor habits in the larynx. Jacobsen (1932) found some evidence for this belief because thinking often is
accompanied by covert speech. He detected electrical activity in the throat muscles when participants were asked to think. But is thought possible without these small motor movements? Smith, Brown, Thomas, and Goodman (1947) used curare to temporarily paralyze all the voluntary muscles of a volunteer (Smith, who clearly deserved to be first author on this paper). Despite being unable to make any motor movement of the speech apparatus, Smith later reported that he had been able to think and solve problems. Hence there is more to thought than moving the vocal apparatus.
Perhaps language sets us apart from animals because it enables new and more advanced forms of thought? We need to distinguish how language and thought might affect each other developmentally, and in the fully developed adult state.
We can list the possible alternatives; each of them has been championed at some time. First, cognitive development determines the course of language development. This viewpoint was adopted by Piaget and his followers. Second, language and cognition are independent faculties (Chomsky’s position). Third, language and cognition originate independently but become interdependent; the relation is complex (Vygotsky’s position). Fourth, the idea that language determines cognition is known as the Sapir–Whorf hypothesis. The final two of these approaches both emphasize the influence of language in cognition.

The interdependence of language and thought
The Russian psychologist Vygotsky (1934/1962) argued that the relation between language and thought was a complex one. He studied inner speech, egocentric speech, and child monologues. He proposed that speech and thought have different ontogenetic roots (that is, different origins within an individual). Early on, in particular, speech has a pre-intellectual stage. In this stage, words are not symbols for the objects they denote, but are actual properties of the objects. Speech sounds are not attached to thought. At the same time early thought is non-verbal, so up to some point in development, when the child is about 3 years of age, speech and thought are independent; after this, they become connected. At this point speech and thought become interdependent: thought becomes verbal, and speech becomes representational. When this happens, children’s monologues are internalized to become inner speech.
Vygotsky contrasted his theory with that of Piaget, using experiments that manipulated the strength of social constraints (see Figure 3.11). Unlike Piaget, Vygotsky considered that later cognitive development was determined in part by language. Piaget argued that egocentric speech arises because the child has not yet become fully socialized, and withers away as the child learns to communicate by taking into account the point of view of the listener. For Vygotsky the reverse was the case. Egocentric speech serves the function of self-guidance that eventually becomes internalized as inner speech, and is only vocalized because the child has not yet learned how to internalize it. The boundaries between child and listener are confused, so that self-guiding speech can only be produced in a social context. Vygotsky found that the amount of egocentric speech decreased when the child’s feeling of being understood lessened (such as when the listener was at another table). He claimed that this was the reverse of what Piaget would predict. However, these experiments are difficult to evaluate because Vygotsky omitted many procedural details and measurements from his reports that are necessary for a full evaluation. It is surprising that the studies have not been repeated under more stringent conditions. Until then, and until this type of theory is more fully specified, it is difficult to evaluate the significance of Vygotsky’s ideas.

FIGURE 3.11 Comparison of Piaget’s and Vygotsky’s theories.
PIAGET: Development precedes learning. Egocentric speech represents child thinking aloud. This gives way to social speech once the child recognizes speech as a means of communication. Thought determines language. Having begun in the individual, it is transferred into the social arena via language.
VYGOTSKY: Learning precedes development. Language is a SOCIAL phenomenon, even at the earliest stages of development, although a child’s early speech is egocentric. Thought develops within a social context. Language and thought have different origins. Prelinguistic child thinks independently of language. Language is acquired from the child’s social grouping. Merging of thought and language as child learns language.

The Sapir–Whorf hypothesis
In George Orwell’s novel Nineteen Eighty-Four, language restricted the way in which people thought. The rulers of the state deliberately used “Newspeak,” the official language of Oceania, so that the people thought what they were required to think. “This statement … could not have been sustained by reasoned argument, because the
necessary words were not available” (Orwell, 1949, p. 249, in the appendix, “The principles of Newspeak”). Orwell’s idea is a version of the Sapir–Whorf hypothesis.
The central idea of the Sapir–Whorf hypothesis is that the form of our language determines the structure of our thought processes. Language affects the way we remember things and the way in which we perceive the world. It was originally proposed by a linguist, Edward Sapir, and a fire insurance engineer and amateur linguist, Benjamin Lee Whorf (see Whorf, 1956a, 1956b). Although Whorf is most closely associated with anthropological evidence based on the study of American Indian languages, the idea came to him from his work in fire insurance. He noted that accidents sometimes happened because, he thought, people were misled by words, as in the case of a worker who threw a cigarette end into what he considered to be an “empty” drum of petrol. Far from being empty, the drum was full of petrol vapor, with explosive results.
The Sapir–Whorf hypothesis comprises two related ideas. First, linguistic determinism is the idea that the form and characteristics of our language determine the way in which we think, remember, and perceive. Second, linguistic relativism is the idea that as different languages map onto the world in different ways, different languages will generate different cognitive structures.
Miller and McNeill (1969) distinguished between three versions of the Sapir–Whorf hypothesis. In the strong version, language determines thought. In a weaker version, language affects only perception. In the weakest version, language differences affect processing on certain tasks where linguistic encoding is important. It is the weakest version that has proved easiest to test, and for which there is the most support. It is important to consider what is meant by “perception” here. It is often unclear whether what is being talked about is low-level sensory processing or classification.
Anthropological evidence
The anthropological evidence concerns the inter-translatability of languages. Whorf analyzed Native American Indian languages such as Hopi, Nootka, Apache, and Aztec. He argued that each language imposes its own “world view” on its speakers. For example, he concluded that as Hopi contains no words or grammatical constructions that refer to time, Hopi speakers must have a different conception of time from us. Whorf’s data are now considered highly unreliable (Malotki, 1983). Furthermore, translation can be very misleading. Take as an example Whorf’s (1940/1956b, p. 214) analysis of “clear dripping spring” in the following quote:
    We might isolate something in nature by saying “It is a dripping spring.” Apache erects the statement on a verb ga: “be white (including clear, uncolored, and so on).” With a prefix no- the meaning of downward motion enters: “whiteness moves downward.” Then to, meaning both “water” and “spring,” is prefixed. The result corresponds to our “dripping spring,” but synthetically it is “as water, or springs, whiteness moves downward.” How utterly unlike our way of thinking!
In fact, Whorf’s translation was very idiosyncratic, so it is far from clear that speakers of Apache actually dissect the world in different ways (Clark & Clark, 1977; Pinker, 1994). For example, both languages have separate elements for “clear,” “spring,” and “moving downwards.” Why should the expression not have been translated “It is a clear dripping spring”? The appeal of such translations is further diminished when it is realized that Whorf based his claim not on interviews with Apache speakers, but on an analysis of their recorded grammar. Lenneberg and Roberts (1956) pointed out the circularity in the reasoning that, because languages differ, thought patterns differ because of the differences in the languages. An independent measure of thought patterns is necessary before a causal conclusion can be drawn.

Vocabulary differentiation
The way in which different languages have different vocabularies has been used to support the Whorfian hypothesis, in that researchers believe that cultures must view the world differently because some cultures have single words available for concepts that others may take many words to describe. For example, Boas (1911) reported that Eskimo (or Inuit) language has four different words for snow; there are 13 Filipino words for rice. An amusing debunking of some of these claims can be found in Pullum (1989): Whorf (1940/1956b) inflated the number of words for snow to seven, and drew a comparison with English, which he said has only one word for snow regardless of whether it is falling, on the ground, slushy, dry or wet, and so on. The number of types of snow the Inuit were supposed to have then varied with subsequent indirect reporting, apparently reaching its maximum in an editorial in the New York Times on February 9, 1984, with “one hundred” to “two hundred” in a Cleveland television weather forecast. In fact, it is unclear how many words Inuit has for snow, but it is certainly not that many. It probably only has two words or roots for types of snow: “qanik” for “snow in the air” or “snowflake”; and “aput” for “snow on the ground.” It is unclear whether you should count the words derived from these roots as separate. This story reinforces the importance of knowing how you define a “word,” and also of always checking sources! Speakers of English also in fact have several words for different types of snow (snow, slush, sleet, and blizzard).
[Photo caption: Opinion on the exact number of Inuit words for “snow” has varied wildly, depending on the source of the figure, and on the parameters that have been adopted in determining what does and does not constitute another word for “snow.” This illustrates the need for clarity when deciding how to define a “word.”]
Vocabulary differences are unlikely to have any significant effects on perception, although again it is important to bear in mind what perception might cover. We can learn new words for snow: people learning to ski readily do so, and while this does not apparently change the quality of the skiers’ perception of the world, it certainly changes the way in which they classify snow types and respond to them. For example, you might choose not to go skiing on certain types of snow. Vocabulary differences reflect differences in experience and expertise. They do not seem to cause significant differences in
perception, but do aid classification and other cognitive processes. Not having words available for certain concepts does seem to have a detrimental effect. Members of the Piraha tribe from the Amazon basin have words for the numbers “one” and “two,” and then just “many.” Their performance on a range of numerical tasks was very poor for quantities greater than three (Gordon, 2004). Whereas we can count above two and assign precise numbers to quantities, members of the Piraha tribe just seem to be able to estimate. Not having a word available for a concept does appear to limit their cognitive abilities.

Grammatical differences between languages
Carroll and Casagrande (1958) examined the cognitive consequences of grammatical differences in the English and Navaho languages. The form of the class of verbs concerning handling used in Navaho depends on the shape and rigidity of the object being handled. Endings for the verb corresponding to “carry,” for example, vary depending on whether a rope or a stick is being carried. Carroll and Casagrande therefore argued that speakers of Navaho should pay more attention to the properties of objects that determine the endings than do English speakers, and in particular they should group instances of objects according to their form. As all the children in the study were bilingual, the comparison was made between more Navaho-dominant and more English-dominant Navaho children. The more Navaho-dominant children did indeed group objects more by form than by color, compared to the English-dominant group. However, a control group of non-Native American English-speaking children grouped even more strongly according to form, behaving as the Navaho children were predicted to behave! It is therefore not clear what conclusions about the relation between language and thought can be drawn from this study.
A second example is that English speakers use the subjunctive mood to easily encode counter-factuals such as “If I had gone to the library, I would have met Dirk.” Chinese does not have a subjunctive mood. Bloom (1981, 1984) found that Chinese speakers find it harder to reason counter-factually, and attributed this to their lack of a subjunctive construction. Their memories are more easily overloaded than those of speakers of languages that support these forms. There has been some dispute about the extent to which sentences
used by Bloom were good idiomatic Chinese. It is also possible to argue counter-factually in Chinese using more complex constructions, such as (translated into English) “Mrs. Wong does not know English; if Mrs. Wong knew English, she would be able to read the New York Times” (Au, 1983, 1984; Liu, 1985). Nevertheless, Chinese speakers do seem to find counter-factual reasoning more difficult than English speakers. If this is because the form of the construction needed for counter-factual reasoning is longer than the English subjunctive, then this is evidence of a subtle effect of linguistic form on reasoning abilities.
A third example is that of grammatical gender. Although English does not mark grammatical gender, many languages do. Italian, for example, marks nouns as masculine or feminine, and German marks them as masculine, feminine, or neuter. Vigliocco, Vinson, Paganelli, and Dworzynski (2005) found that effects of gender on thought were highly constrained. They were found in Italian (a two-gender language), but only with tasks that require verbalization and only with certain semantic categories (animals) and not others (artifacts). For example, when participants are asked to judge which two of three words are most similar (e.g., donkey–elephant–giraffe), grammatical gender affected similarity judgments for animals but not for artifacts. There were no effects at all in German, a language with an additional neuter gender. The likely reason for this difference is that in two-gender languages gender is a reliable cue to sex, but of course this rule is inapplicable with artifacts. The conclusion is consistent with a weak version of the Sapir–Whorf hypothesis: language can affect performance on some tasks that use language.
FIGURE 3.12 Examples of stimuli and responses, showing the effect of verbal labels. Carmichael et al.’s study involved two groups of participants who were shown the drawings in the central column. One group were given the description on the left, and the other group were given the description on the right. For example, one group were told an object was a gun and the other that it was a broom. Later the participants were asked to reproduce the drawings from memory. Their sketches matched the description they were given, not the original drawings, demonstrating that perceptual recall is not influenced solely by the stimulus, but is also affected by knowledge.
Description on the left / Description on the right
Curtains in a window / Diamond in a rectangle
Bottle / Stirrup
Eyeglass / Dumb-bell
Kidney bean / Canoe
Crescent moon / Letter “C”
Two / Eight
Ship’s wheel / Sun
Hour glass / Table
Beehive / Hat
Pine tree / Trowel
Gun / Broom
Seven / Four
Indirect effects of language on cognition
There is more evidence that language can have an indirect effect on cognition, particularly on tasks where linguistic encoding is important. Carmichael, Hogan, and Walter (1932) looked at the effects of learning a verbal label on participants’ memory for nonsense pictures (see Figure 3.12). They found that the label that the participants associated with the pictures affected the recall of the pictures. Santa and Ranken (1972) showed that having an arbitrary verbal label available aided the recall of nonsense shapes.
Duncker (1945) explored the phenomenon known as functional fixedness, using the “box and candle” problem (see Figure 3.13) where participants have to construct a device using a collection of commonplace materials so that a candle can burn down to its bottom while attached to the wall. The easiest solution is to use the box containing the materials as a support; however, participants take a long time to think of this, because they fixate on the box’s function as container. Glucksberg and Weisberg (1966) showed that the explicit labeling of objects could strengthen or weaken the functional fixedness effect depending on the appropriateness of the label. This demonstrates a linguistic influence on reasoning.
In an experiment by Hoffman, Lau, and Johnson (1986), Chinese–English bilinguals read descriptions of people, and were later asked to provide descriptions of the people they’d read about. The descriptions had been prepared so as to conform to Chinese or English personality stereotypes. Bilingual people thinking in Chinese used the Chinese stereotype, whereas bilingual people thinking in English used the English stereotype. The language used influenced the stereotypes used, and therefore the inferences made and what was remembered.
Hence work on memory and problem solving supports the weakest version of the Whorfian hypothesis. Language can facilitate or hinder performance on some cognitive tasks, particularly those where linguistic encoding is routinely important.

Number systems
Hunt and Agnoli (1991) examined how different languages impose different memory burdens on their speakers. English has a complex system for naming numbers: we have 13 primitive terms (0–12), then special complex names for the “teens,” then more general rule-based names for the numbers between 20 and 100, and then more special names beyond that. On the other hand, the number naming system in Chinese is much more simple, necessitating only that the child has to remember 11 basic terms (0–10), and three special terms for 100, 1,000, and 10,000. For example, “eleven” is simply “ten plus one.” English-speaking children have difficulty learning to count in the teen range, whereas Chinese-speaking children do not (Miller & Stigler, 1987). Hence the form of the language has subtle influences on arithmetical ability, a clear example of language influencing cognition.
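The contrast between the two naming systems can be made concrete with a small sketch. It is illustrative only: English glosses stand in for the Chinese number words, the function names (regular_name, english_name, vocabulary) are hypothetical, and the range is limited to 0–99 so that the special terms for 100 and above are not needed. The rule assumed for the regular system is the one described above, in which “eleven” is “ten plus one” and, by extension, “twenty” is “two ten.”

```python
# Illustrative sketch: English glosses stand in for the Chinese number words.
BASIC = ["zero", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]  # the 11 basic terms (0-10)

# English adds irregular forms on top of its 13 primitive terms (0-12).
ENGLISH_PRIMITIVES = BASIC + ["eleven", "twelve"]
ENGLISH_TEENS = {13: "thirteen", 14: "fourteen", 15: "fifteen", 16: "sixteen",
                 17: "seventeen", 18: "eighteen", 19: "nineteen"}
ENGLISH_DECADES = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
                   6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}


def regular_name(n: int) -> str:
    """Name 0-99 from the 11 basic terms only, in 'ten plus one' style."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n <= 10:
        return BASIC[n]
    tens, units = divmod(n, 10)
    # 11-19 are "ten" + unit; 20-99 are tens digit + "ten" (+ unit).
    words = ([BASIC[tens]] if tens > 1 else []) + ["ten"]
    if units:
        words.append(BASIC[units])
    return " ".join(words)


def english_name(n: int) -> str:
    """Name 0-99 the English way: primitives, teens, then decade words."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0-99")
    if n <= 12:
        return ENGLISH_PRIMITIVES[n]
    if n <= 19:
        return ENGLISH_TEENS[n]
    tens, units = divmod(n, 10)
    decade = ENGLISH_DECADES[tens]
    return f"{decade} {BASIC[units]}" if units else decade


def vocabulary(namer) -> set:
    """Distinct word forms needed to name every number from 0 to 99."""
    return {word for n in range(100) for word in namer(n).split()}


print(regular_name(11), "|", english_name(11))   # ten one | eleven
print(regular_name(25), "|", english_name(25))   # two ten five | twenty five
print(len(vocabulary(regular_name)), len(vocabulary(english_name)))  # 11 28
```

Under these assumptions the regular scheme names every number up to 99 with the 11 basic terms, whereas the English scheme needs 28 distinct word forms. The figure of 28 is a count produced by this sketch, not a number reported in the studies cited.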
Although Welsh numbers have the same num-
ber of syllables as their English counterparts, the
vowel sounds are longer and so they take longer
to say (Ellis & Hennelly, 1980). Hence bilingual
participants had worse performance on digit-span
tests in Welsh compared with English digit names,
and also slightly worse performance and higher
error rates in mental arithmetic tasks when using
Welsh digit names.
FIGURE 3.13 The objects presented to participants in the “box and candle” problem.
Key evidence comes from the Piraha people of the Amazon (Everett & Madora, 2011; Gordon, 2004). The Piraha lack precise numerical terms, and seem to have great difficulty on
tasks involving numbers greater than three. It appears that in order to count accurately we need to have linguistic number terms available.

Color coding and memory for color
The most fruitful way of investigating the strong version of the Sapir–Whorf hypothesis has proved to be analysis of the way in which we name and remember colors. Brown and Lenneberg (1954) examined memory for “color chips” differing in hue, brightness, and saturation. Codable colors, which correspond to simple color names, are remembered more easily (e.g., an ideal red is remembered more easily than a poor example of red). Lantz and Stefflre (1964) argued that the similar notion of communication accuracy best determines success: People best remember colors that are easy to describe.
This early work seemed to support the Sapir–Whorf hypothesis, but there is a basic assumption that the division of the color spectrum into labeled colors is completely arbitrary. This means that, but for historical accident, we might have developed other color names, like “bled” for a name of a color between red and blue, and “grue” for a name of a color between green and blue, rather than red, blue, and green. Is this assumption correct?
Berlin and Kay (1969) compared the basic color terms used by different languages. Basic color terms are defined by being made up from only one morpheme (so “red,” but not “blood red”), not being contained within another color (so “red,” but not “scarlet”), not having restricted usage (hence “blond” is excluded), and being common and generally known and not usually derived from the name of an object (hence “yellow” but not “saffron”). Languages differ in the number of color terms they have available. For example, Gleason (1961) compared the division of color hues by speakers of English with that of the languages Shona and Bassa (see Figure 3.14). Berlin and Kay found that across languages basic color terms were present in a hierarchy (see Figure 3.15). If a language only has two basic color terms available, they must correspond to “black” and “white”; if they have three then they must be these two plus “red”; if they have four then they must be the first three plus one of the next group, and so on. English has names for all 11 basic color terms (black, white, red, yellow, green, blue, brown, purple, pink, orange, and gray). Berlin and Kay also showed that the typical colors referred to by the basic color terms, called the focal colors, tend to be constant across languages.

FIGURE 3.14 Comparison of color hue division in English, Shona, Bassa, and Dani (based on Gleason, 1961).
English: purple, blue, green, yellow, red, orange
Shona: cipswuka, citema, cicena, cipswuka
Bassa: hui, ziza
Dani: mili, mola

Heider (1972) examined people’s memory for focal colors in more detail. Focal colors are the best examples of colors corresponding to basic color terms: they can be thought of as the best example of a color such as red, green, or blue. The Dani tribe of New Guinea have just two basic color terms, “mili” (for black and dark colors) and “mola” (for white and light colors), although subsequently there has been some doubt as to whether this really is the case. Heider taught the Dani made-up names for other colors. They learned names more easily for other focal colors than for non-focal colors, even though they had no name for those focal colors. They could also
remember focal colors more easily than non-focal
colors, again even those for which they did not
have a name. Three-year-old children also prefer
focal colors: they match them more accurately,
attend to them more, and are more likely to choose
them as exemplars of a color than non-focal colors
(Heider, 1971). In a similar way, English speakers
attend to differences between light and dark blue
in exactly the same way as Russian speakers, even
though the latter have names for these regions of
the color spectrum while English speakers do not
(Davies et al., 1991; Laws, Davies, & Andrews,
1995; note that there has been considerable debate
about whether these are basic color names).

FIGURE 3.15 Hierarchy of color names (based on Berlin & Kay, 1969):
BLACK, WHITE → RED → YELLOW, GREEN, BLUE → BROWN → PURPLE, PINK, ORANGE, GRAY

At first sight then, the division of the color
spectrum is not arbitrary, but is based on the phys-
iology of the color vision system. The six most
sensitive points of the visual system correspond
to the first six focal colors of the Berlin and Kay
hierarchy. Further evidence that differences are
biological and have nothing to do with language
comes from work on prelinguistic children by
Bornstein (1985). Children aged 4 months habitu-
ate more readily to colors that lie centrally in the
red and blue categories than to colors that lie at
the boundaries.
Bornstein (1973) found an environmental
influence on the take-up of these color terms. He
noted that with increasing proximity of societies
to the equator, color names for short wavelengths
(blue and green) become increasingly identified
with each other and, in the extreme, with black.
He argued that the eyes of peoples in equatorial
regions have evolved to have protection from
excessive ultraviolet light. In particular, there is
greater yellow pigmentation in the eyes, which
protects the eye from short-wave radiation, at a
cost of decreased sensitivity to blue and green.
Brown (1976) discussed the revised inter-
pretation of these color-naming data and their
consequences for the Sapir–Whorf hypothesis.
He concluded that these later studies show that
color naming does not tell us very much about the
Sapir–Whorf hypothesis. If anything, it appeared
to emphasize the importance of biological factors
in language development. There are some prob-
lems with some of these studies, however. Of the
20 languages originally described in the Berlin
and Kay (1969) study, 19 were in fact obtained
from bilingual speakers living in San Francisco,
and the use of color categories by bilingual speak-
ers differs systematically from that of monolingual
speakers. In particular, the color categorization
of bilingual people comes to resemble that of
monolingual speakers of their second language,
whatever their first language. This in itself would
give rise to an artifactual universality in color
categorization. There are also methodological
problems with the expanded set of 98 languages
studied later by Berlin and Kay (Cromer, 1991;
Hickerson, 1971). The criteria Berlin and Kay
(1969) used for naming basic color terms are sus-
pect (Michaels, 1977). The criteria seem to have
been inconsistently applied, and it is possible that
the basic color terms of many languages were
omitted (Hickerson, 1971).
There were also problems with the materials
used in the original studies by Heider. The focal
colors are perceptually more discriminable than
the non-focal colors used in Berlin and Kay’s
array in that they were perceptually more distant
from their neighbors. When the materials are cor-
rected for this artifact, Lucy and Shweder (1979)
found that focal colors were not remembered any
better than non-focal colors. On the other hand,
a measure of communication accuracy did pre-
dict memory performance. This finding suggests
that having a convenient color label can indeed
assist color memory. Kay and Kempton (1984)
showed that although English speakers display
categorical perception of colors that lie on either
side of a color name boundary, such as blue and
green, speakers of the Mexican Indian language
Tarahumara, who do not have names for blue and
green, do not. Hence having an available name
can at least accentuate the difference between two
categories. These more recent findings suggest
that there are indeed linguistic effects on color
perception.
There are limitations on the extent to which
biological factors constrain color categorization,
and it is likely that there is some linguistic influ-
ence. The Berinmo, a hunter-gatherer tribe also
from New Guinea, have five basic color terms.
The Berinmo do not mark the distinction between
blue and green, but instead have a color bound-
ary between the colors they call “nol” and “wor,”
which does not have any correspondence in the
English color-naming scheme. English speakers
show a memory advantage across the blue–green
category boundary but not across the nol–wor one,
whereas Berinmo speakers showed the reverse
pattern (Davidoff, Davies, & Roberson, 1999a,
1999b). In a further series of experiments using
more sensitive statistical techniques, Roberson,
Davies, and Davidoff (2000) were unable to repli-
cate Heider’s earlier results with the Dani with the
Berinmo. They found no recognition advantage
for focal stimuli, no facilitation of learning focal
colors, and found that color recognition
was affected by color vocabulary.
It is now also apparent that even within
English not all basic color terms are equal.
“Brown” and “gray” are acquired later than other
basic color terms, are the two least preferred
colors, and are used less frequently in adult speech
to children than other color terms (Pitchford &
Mullen, 2005).
In summary there appear to be effects of biolog-
ical and linguistic constraints on memory for colors.
Perhaps color naming is not such a good test of the
Sapir–Whorf hypothesis after all. First, the task is
clearly very sensitive to the details of the experimen-
tal procedures and materials used. Second, the more
basic the cognitive or perceptual process, the less
scope there is likely to be for the top-down influ-
ence of language, and color perception, a mecha-
nism shared with many nonhuman species, is pretty
low level. As Pinker (1994) observes, no matter how
influential language might be, it is preposterous to
think that it could rewire the ganglion cells. Third,
in any case, there do appear to be effects of language
on color perception: Roberson et al. found effects of
categorical perception for colors, but aligned with
linguistic categories rather than more biologically
based categories.

Linguistic differences in the coding of space and time

In a recent review, Boroditsky (2003) concludes
that there are several instances where encoding
differences between languages lead to differ-
ences in performance by speakers of those lan-
guages. For example, different languages encode
spatial relations in different ways. Most lan-
guages (such as English) use relative terms (e.g.,
front of, back of, left of, right of) to encode
spatial relations. Languages such as Tzeltal (a
Mayan language) use an absolute system (similar
to our system of describing compass points, e.g.,
to the north). Speakers of Dutch (which uses the
relative system) and Tzeltal interpret and perform
very differently on a non-linguistic orientation
task. In this task, people see an arrow pointing in
one direction, to the left or right. The viewpoint
is then rotated 180 degrees, and people are asked
which is most like the one they had originally
seen: an arrow pointing in relatively the same
way, or absolutely the same way. Preferences
depend on whether the language uses an absolute
or a relative coding system, with the Dutch speak-
ers preferring the right-pointing arrow if they had
seen that previously, but the Tzeltal speakers pre-
ferring the left-pointing arrow (Levinson, 1996a).
This is because “what is north” does not vary with
rotation, but “what is left” does. Different spatial
frames of reference are acquired with ease by
children from different cultures using different
languages: the absolute and relative systems are
acquired equally easily (Majid, Bowerman, Kita,
Haun, & Levinson, 2004). Different languages
encode time in different ways: in English we
mainly use a front–back metaphor (look ahead,
falling behind, move meetings forward), while
Mandarin speakers systematically use vertical
metaphors (with up corresponding roughly to last
and down to next). Mandarin speakers are more
likely to construct vertical timelines to think about
time, while English speakers are more likely to
construct horizontal ones. For example, Mandarin
speakers are faster to confirm that March comes
before April if they have just seen a vertical
array of objects than if they had seen a horizon-
tal one. English speakers showed the reverse pat-
tern (Boroditsky, 2001). Similar differences in
performance can be found for the way in which
languages encode object shape and grammatical
gender (Boroditsky, 2003).
Languages differ in the way in which they
encode movement: do these linguistic differ-
ences lead to cognitive differences? English
encodes the direction of motion with a modifier
(“to,” “from”) and the manner of motion in the
verb (“walk,” “run”), whereas in Greek the oppo-
site is the case: the verb encodes the direction of
motion, while the manner is encoded by a modi-
fier. Papafragou, Massey, and Gleitman (2002)
tested Greek and English children on two types of
task involving motion: one involving non-linguistic
tasks (remembering and categorizing motion in
pictures of animals moving around), the other
involving linguistic description. They found
a difference in performance only on the linguistic tasks.
There has recently been debate about whether
these linguistic differences reflect the presence or
absence of external cues, and whether they affect
performance on all tasks, or just linguistic tasks.
Li and Gleitman (2002) argued that the results
of the studies by Levinson and colleagues on
spatial frames of reference described above were
artifactual. Li and Gleitman suggested that the
results depend on the presence of environmen-
tal cues. They tested a group of native English
speakers, and found that they could make them
perform using relative or absolute frames of ref-
erence depending on the presence of landmark
cues in the environment. When participants
could not see the outside world (the blinds of the
testing room were down), the speakers tended
to use a relative frame; when they could see the
outside world (the blinds were up), they were
more likely to use an absolute frame of refer-
ence. On the other hand, Levinson, Kita, Haun,
and Rasch (2002) were unable to replicate these
results, arguing that the purpose of the task was
too apparent to Li and Gleitman’s participants.
They also pointed out that their groups were
tested with equal amounts of environmental cues
available, being tested equally indoors and out.
In summary, there is evidence that the way
in which different languages encode distinctions
such as time, space, motion, shape, and gender
influences the way in which speakers of those
languages think. These differences suggest that
our language may determine how we perform on
tasks that at first sight do not seem to involve
language at all, although this claim remains
controversial.

Evaluation of the Sapir–Whorf hypothesis

The weak version of the Sapir–Whorf hypoth-
esis has enjoyed a resurgence. There is now a
considerable amount of evidence suggesting
that linguistic factors can affect cognitive pro-
cesses. Even color perception and memory, once
thought to be completely biologically determined,
show some influence of language. Furthermore,
research on perception and categorization has
shown that high-level cognitive processes can
influence the creation of low-level visual features
early in visual processing (Schyns, Goldstone, &
Thibaut, 1998). This is entirely consistent with the
idea that, in at least some circumstances, language
might be able to influence perception.
Indeed, it is hardly surprising that if a thought
expressible in one language cannot be expressed
so easily in another, then that difference will have
consequences for the ease with which cognitive
processes can be acquired and carried out. Having
one word for a concept instead of having to use a
whole sentence might reduce memory load. The
differences in number systems between languages
form one example of how linguistic differences
can lead to slight differences in cognitive style.
We will see in later chapters that different
languages exemplify different properties that are
bound to have cognitive consequences. For exam-
ple, the complete absence of words with irregular
pronunciations in languages such as Serbo-Croat
and Italian is reflected in differences between
their reading systems and those of speakers of
languages such as English. Furthermore, differ-
ences between languages can lead to differences
in the effects of brain damage.
The extent to which people find the Sapir–
Whorf hypothesis plausible depends on the extent
to which they view language as an evolutionarily
late mechanism that merely translates our thoughts
into a format suitable for communication, rather
than a rich symbolic system that underlies most
of cognition (Lucy, 1996). It is also more plausi-
ble in a cognitive system with extensive feedback
from later to earlier levels of processing.

Language and thought: Conclusion

Perhaps the main conclusion about how language
and thought are related is that there is a relation-
ship, but it is a complex one. Environment and
biology jointly determine our basic cognitive
architecture. Within the constraints set by this
architecture, languages are free to vary in how
they dissect the world and in what they empha-
size. These differences can then feed back to
affect aspects of perception and cognition.
We noted above that paralyzing overt
speech does not stop us being able to think.
Nevertheless, language is clearly an important medium of
thought and conceptualization. Although there
is a great deal of individual variation, a signifi-
cant proportion of our mental life is conducted
in language (Carruthers, 2002); we hear “inner
speech,” which often seems to be expressing or
guiding our thoughts, or which sometimes is the
product of reading. The extent to which inner
speech or language plays a real role in thinking
is unclear and controversial (Carruthers, 2002).
A strong view is that language is essential for
conceptual thought and is the medium in which
it is conducted; a weaker view is that language
is the medium of conscious propositional (as
opposed to visual) thought; an even weaker
view is that language is necessary to acquire
many concepts, and influences cognition in
ways that we have seen above; yet another
view is that there is essentially no relation at
all (although language can clearly express
thoughts). Carruthers presents evidence from a
range of sources to justify his claim that inner
speech is the glue that sticks cognition together,
and enables the modules of the mind to com-
municate: that is, language is the medium of
conscious thought.
Even here we must note that there might be
cultural differences. In the West, it is assumed
that language and inner speech assist thinking;
in the East, it is assumed that talking interferes
with thinking. These cultural differences affect
performance: thinking out loud helped European
Americans to solve reasoning problems, but hin-
dered Asian Americans (Kim, 2002; Nisbett,
2003).
The influence of language on thought has
some important consequences. For example,
does sexist language really influence the way in
which people think? Spender (1980) proposed
some of the strongest arguments for non-sexist
language. For example, she argued that using the word
“man” to refer to all humanity has the associa-
tion that males are more important than females;
or that using a word like “chairman” (rather than
a more gender-neutral term such as “chair” or
“chairperson”) encourages the expectation that
the person will be a man. These expectations
do have real effects. Gender-stereotyped nouns
(e.g., “surgeon,” “nurse”) are those to which
many people have a strong initial expectation
of the gender of the person (surgeon as male,
nurse as female). Readers take longer to read
a subsequent pronoun referring to the noun if
the pronoun is in conflict with the stereotyping
(such as using “she” to refer to a surgeon rather
than “he”; e.g., Kennison & Trofe, 2004). Such
a theory is a form of the Sapir–Whorf hypoth-
esis, although there has been surprisingly little
empirical work in this area.
As Gleitman and Papafragou (2005) con-
clude, clearly we can have thought without
language: some animals clearly reason and
solve problems; prelinguistic infants have rich
cognitive abilities; people with brain damage
destroying most of their language abilities dis-
play rich cognitive abilities. Yet there is also
much evidence that language and culture can
affect our ways of thinking. Language and
thought are related, but in a complex way.
SUMMARY
• Language must have conferred an evolutionary advantage on early humans.
• Many animals, including even insects, have surprisingly rich communication systems.
• Animal communication systems in the wild are nevertheless tied to the here-and-now, and
animals can only communicate about a very limited number of topics (mainly food, threat,
and sex).
• Hockett described “16 design features” that he thought characterized human spoken language.
• Early attempts to teach apes to talk failed because the apes lack the necessary articulatory
apparatus.
• Later attempts to teach apes to communicate using signs (e.g., Washoe and Kanzi) show at least
that apes can use combinations of signs in the appropriate circumstances, although it is unclear
whether they are using words and grammatical rules in the same way as we do.
• Some language processes are localized in specific parts of the brain, particularly the left cortex.
• Broca’s area is particularly important for producing speech, while Wernicke’s area is particularly
important for dealing with the meaning of words.
• Damage to particular areas of the brain leads to identifiable types of disrupted language.
• We are not born with functions fully lateralized in the two cortical hemispheres; instead, much
specialization takes place in the early years of life.
• There are sex differences in language use and lateralization from an early age; females tend to
have better linguistic skills.
• There is a sensitive period for language development during which we need exposure to socially
meaningful linguistic input.
• The stronger notion of a critical period for language acquisition between the ages of 2 and 7
cannot be correct because there is clear evidence that lateralization is present from birth, and
that older children and adults are surprisingly good at acquiring language.
• The acquisition of syntax by the left hemisphere is particularly susceptible to disruption during
the sensitive period.
• The relation between language and cognitive processes in development is complex.
• Infants do not need to attain object permanence before they can start naming objects.
• The cognitive development of deaf children proceeds better than it should if language underlies
cognition, and the linguistic development of blind children proceeds better than it should if cogni-
tion underlies language.
• Language use has important social precursors; in particular, parents appear to have “conversa-
tions” with infants well before the infants start to use language.
• Parents adapt their language to the needs of their children, and the way that caregivers speak
to blind children leads to differences in their grammatical development compared with sighted
children.
• The Sapir–Whorf hypothesis states that differences in languages between cultures will lead to
their speakers perceiving the world in different ways and having different cognitive structures.
• The most important sources of evidence in evaluating the Sapir–Whorf hypothesis are studies of
color naming.
• Color naming and memory studies show that although biological factors play the most important
role in dividing up the color spectrum, there is some linguistic influence on memory for colors.
• There is evidence that language can affect performance on some perceptual, memory, and
conceptual tasks.
1. Why might early humans have needed language while chimpanzees did not?
2. What do you think is the most important way in which human language can be differentiated
from the way in which Washoe used language?
3. What would convince you that a chimpanzee was using a language like humans?
4. How easy is it to separate features that are universal to language from features that are univer-
sal to our environment?
5. One reason why second language acquisition might be so difficult for adults is that it is not
“taught” in the way that children acquire their first language. How then could the teaching of
a second language be facilitated?
6. How might individual differences play a part in the extent to which people use language to
“think to themselves”?
7. Compare and contrast the language of Genie with the “language” of Washoe.
8. What ethical issues are involved in trying to teach animals language?
9. Clearly the alleged experiment on creating wild children reputed to have been carried out by
King James IV was extremely unethical. What ethical issues do you think might be involved
in cases such as Genie’s?
10. How could you tell whether sex differences in language use result from biological or cultural
factors (or both)?
11. Can you find any examples of sexist language in magazines, newspapers, or official docu-
ments? Has it influenced your understanding of the roles people play?
12. Can you think of any examples of when your cognition has been affected by the words
you use?
FURTHER READING
For more on the origins and evolution of language, see Aitchison (1996), Deacon (1997), Harley
(2010), and Jackendoff (1999). Christiansen and Kirby (2003) is a more advanced but still
approachable recent edited collection about language evolution; start with the chapter by Pinker
for an overview. Dennett (1991) discusses the evolution of language, and its possible relation
to consciousness.
For a more detailed review of animal communication systems and their cognitive abilities, see
Pearce (2008). A detailed summary of early attempts to teach apes language is provided by Premack
(1986a). Gardner, van Cantfort, and Gardner (1992) report more recent analyses of Washoe’s signs.
Premack’s later stance is critically discussed in reviews by Carston (1987) and Walker (1987);
see also the debate between Premack (1986b) and Bickerton (1986) in the journal Cognition. A
popular and contemporary account of Kanzi is given by Savage-Rumbaugh and Lewin (1994). See
also Deacon (1997) for more on animal communication systems and the evolution of language.
See Klima and Bellugi (1979) for more on sign language in humans. Aitchison (1998) is a good
description of attempts to teach language to animals and the biological basis of language. There
is a special issue of the journal Cognitive Science on primate cognition (2000, volume 24, part 3,
July–September). See Pepperberg (1999) and Shanker, Savage-Rumbaugh, and Taylor (1999) for
replies to Kako’s (1999a) criticisms; and Kako (1999b) for replies to them.
Most textbooks on neuropsychology and neuroscience have at least one chapter on language
and the brain (e.g., Gazzaniga et al., 2008; Kolb & Whishaw, 2009). See Poeppel and Hickok
(2004) and the rest of the special issue of the journal Cognition for a recent review of work on
the biology and anatomy of language.
Muller (1997) is an article with commentaries about the innateness of language, species-specificity,
and brain development. He argues that the brain is less localized for language and that language is less
precisely genetically determined than many people think. The article is also a good source of further
references on these topics.
An excellent source of readings on the critical period and how language develops in exceptional
circumstances is Bishop and Mogford (1993). Bishop (1997) provides a comprehensive review of
how comprehension skills develop in unusual circumstances. For a more detailed review of the
critical period and second language hypothesis see McLaughlin (1984). Bishop also describes spe-
cific language impairment (SLI) and semantic-pragmatic disorder in detail; see also Bishop (1989).
Gopnik (1992) also reviews SLI, emphasizing the role genetics plays in its occurrence. A popular
account of Genie and other attic children plus an outline of their importance is given by Rymer
(1993). See Shattuck (1980) for a detailed description of the “Wild Boy of Aveyron” and Curtiss
(1989) for a description of another linguistically deprived person called “Chelsea.” Description
of the neurology of hemispheric specialization can be found in Kolb and Whishaw (2009). Skuse
(1993) discusses other cases of linguistic deprivation. Cases of hearing children of deaf parents
and their implications are reviewed by Schiff-Myers (1993). See Harris (1982) for a full review of
cognitive prerequisites to language. Social precursors of language are discussed in more detail in
Harris and Coltheart (1986).
Gleason and Ratner (1993) give an overview of language development covering many of
the topics in this and the next chapter. See Cottingham (1984) for a discussion of rationalism and
empiricism. A general overview of cognitive development is provided by Flavell, Miller, and
Miller (1993) and by McShane (1991). Piattelli-Palmarini (1980) edited a collection of papers that
arose from the famous debate between Chomsky and Piaget on the relation between language and
thought, and the contributions of nativism versus experience, at the Royaumont Abbey near Paris
in 1975. Piattelli-Palmarini (1994) summarized and updated this debate. Lewis (1987) discusses
general issues concerning the effects of different types of disability on linguistic and cognitive
development. For more on language acquisition in the blind, see the collection of papers in Mills
(1983). Kyle and Woll (1985) is a textbook on sign language and the consequences of its use on
cognitive development. Cromer’s (1991) book provides a good critical overview of this area,
and indeed of many of the topics in this chapter. Gallaway and Richards (1994) is a collection of
papers covering research on child-directed speech and the role of the environment; the final chap-
ter by Richards and Gallaway (1994) provides an overview.
For more on the early language of blind children, see Dunlea (1989) and Pérez-Pereira and
Conti-Ramsden (1999), and for more on language in deaf, blind, and handicapped children, Cromer
(1991). For the effects of linguistic training on cognitive performance, see Dale (1976). Leonard
(2000) is a review of work on SLI. For a good review of the critical period hypothesis, see B. Harley
and Wang (1997). For a review of the biology of sex differences see Baron-Cohen (2003).
See Gleitman and Papafragou (2005) for an overview of the relation between language and
thought. Gumperz and Levinson (1996) is an edited volume about linguistic relativity. Dale (1976)
also discusses the Sapir–Whorf hypothesis in detail. See Levinson (1996b) for cross-cultural work
on differences in the use of spatial terms, and how they might affect cognition. Fodor (1972) and
Newman and Holzman (1993) review the work of Vygotsky and its impact. For a detailed review of
the Sapir–Whorf hypothesis in general and the experiments on color coding in particular, see Lucy
(1992). Clark and Clark (1977) provide an extensive review of the relation between language and
thought, with particular emphasis on developmental issues. Nisbett (2003) discusses cultural differ-
ences in language and cognition.
CHAPTER 4
LANGUAGE DEVELOPMENT
INTRODUCTION

This chapter examines how language develops
from infancy to adolescence. How do children
acquire language? What is the time course of
development? Although there is a clear progres-
sion in the course of language development, it
is contentious whether or not discrete stages are
involved. Are there stages in development?
Children are not born silent; they make
vegetative sounds from birth: they cry, burp,
and make sucking noises. Around 6 weeks of
age they start cooing, and from about 16 weeks
old they start to laugh. Between 16 weeks and
6 months they engage in vocal play (Stark,
1986). This involves making speech-like sounds.
Vowels emerge before consonants. From about
the age of 6–9 months, infants start babbling.
Babbling is distinguished from vocal play by
the presence of true syllables (consonants plus
vowels), often repeated. Around 9 months the
infant starts noticing that particular strings
of sounds co-occur with particular situations
(Jusczyk, 1982; MacKain, 1982). For example,
whenever the sounds “ball” are heard, a ball is
there. Infants might even understand some words
as early as 6 months if they refer to very sali-
ent, animated figures, such as parents (Tincoff
& Jusczyk, 1999). Children start producing their
first words around the age of 10 or 11 months.
The single words are sometimes thought of as
forming single-word utterances. Around the
age of 18 months, there is a rapid explosion in
vocabulary size, and around this time two-word
sentences emerge. This vocabulary explosion
and the onset of two-word speech are strongly
correlated (Bates, Bretherton, & Snyder, 1988;
Nelson, 1973). At this point children may be
learning 40 new words a week. Before children
produce utterances that are grammatically cor-
rect by adult standards, they produce what is
called telegraphic speech. Telegraphic speech
contains a number of words but with many
grammatical elements absent (Brown & Bellugi,
1964). As grammatical elements appear, they do
so in a relatively fixed order for any particular
language. From the age of approximately 2 years
6 months, the child produces increasingly com-
plex sentences (see Figure 4.1). Grammatical
development carries on throughout childhood,
and we never stop learning new words. It has
been estimated that the average young teen-
ager is still learning over 10 new words a day
(Landauer & Dumais, 1997).
Carrying out controlled experiments on large
numbers of young children to examine their lin-
guistic development can be quite difficult.
One commonly used technique is known as the
sucking habituation paradigm. In this procedure,
experimenters measure the sucking rate of infants
on an artificial teat. Babies prefer novel stimuli,
and as they become habituated to the stimulus pre-
sented, their rate of sucking declines. If they then
detect a change in the stimulus, their sucking rate
will increase again. In this way it is possible to
measure whether the infants can detect differences
between pairs of stimuli. In the preferential-looking
technique, researchers examine what children
look at when they see scenes depicting sentences
they are hearing; children spend longer looking
at scenes that are consistent with what they hear.
In the conditioned head turn technique, infants
are taught to turn their heads (by rewarding
them with a visual reinforcer, for example,
a brightly lit toy bunny playing drums) whenever
there is a change in the stimulus; the conditioning
phase is followed by a testing phase that tests what
distinctions these infants are capable of making.
Cross-sectional studies look at the performance of
a group of children at particular ages. One problem
with the cross-sectional methodology is that there
is enormous linguistic variation between children
of the same age. Not only are some children lin-
guistically more advanced, there are also differ-
ences in linguistic style between children. Because
of this, observational and diary studies have also
been important methodologies. Longitudinal
studies of individual children, often the experi-
menters’ own, have been particularly influential.
One consequence of this is that most of the litera-
ture concerns a surprisingly small number of chil-
dren, and one possible consequence of this is that
variation between individuals in development may
have been underestimated.

FIGURE 4.1 The development of language
Vegetative sounds (0–6 weeks)
Cooing (6 weeks)
Laughter (16 weeks)
Vocal play (16 weeks–6 months)
Babbling (6–10 months)
Single-word utterances (10–18 months)
Two-word utterances (18 months)
Telegraphic speech (2 years)
Full sentences (2 years 6 months)

In Chapter 2 we saw that Chomsky distin-
guished between a speaker’s competence, their
knowledge of their language, and their actual lin-
guistic performance. For linguists such as Chomsky,
the most interesting question about development
is how children acquire competence: the ability
to judge what is and what is not grammatical. For
psycholinguists, the most interesting question is
about how children acquire performance: how they
acquire the ability to produce and understand words
and sentences. These are different goals, and might
be influenced by different factors; our main interest
is how children acquire performance.
By the end of this chapter you should:
• Know the time course of language development.
• Understand the difference between rationalism
and empiricism.
• Know what drives language development.
• Understand what is meant by a Language
Acquisition Device and by Universal Grammar.
• Understand the nature and importance of child-
directed speech.
• Know how babbling develops.
• Understand how children learn names for
things.
• Understand how children come to learn syntactic
categories.
• Know how syntax develops.

WHAT DRIVES LANGUAGE DEVELOPMENT?

What makes language development happen? What
transforms a non-speaking, non-comprehending
infant into a linguistically competent individual?
One of the most important issues in the study
of language development is the extent to which
our language abilities are innate. There are two
contrasting philosophical views on how humans
obtain knowledge. The rationalists (such as Plato and Descartes) maintained that certain fundamental ideas are innate, that is, they are present from birth. The empiricists (such as Locke and Hume) rejected this doctrine of innate ideas, maintaining that all knowledge is derived from experience. Locke (1690/1975) was one of the most influential empiricists. He argued that all knowledge held by the rationalists to be innate could be acquired through experience. According to Locke, the mind at birth is a tabula rasa (a “blank sheet of paper”) on which sensations write and determine future behavior. The rationalist–empiricist controversy is alive today: it is often called the nature–nurture debate. Chomsky’s work in general and his views on language acquisition are in the rationalist camp, and there are strong empiricist threads in Piaget. (Piaget argued that cognitive structures themselves are not innate, but can arise from innate dispositions.) Behaviorists, who argued that language was entirely learned, are clearly empiricists. Although we must be wary of simplifying the debate by trying to label contrasting views as rationalist or empiricist, the questions of which processes are innate, and which processes must be in place for language to develop, are of fundamental importance. Nevertheless, we must not forget that behavior ultimately results from the interaction of nature and nurture. Work in connectionism has focused attention on the nature of nurture and the way in which learning systems change with experience (Elman et al., 1996).
We should be wary of seeking any simple answer to the question “what drives language development?” The answer is almost certainly that many factors do. It should also be remembered that language development is a complex process that involves the development of many skills, and processes that may be important for syntactic development, for example, might be of less importance in phonological, morphological, or semantic development. Nevertheless, we can tease apart some likely important contributions.

Imitation
The simplest theory of language development is that children learn language by imitating adult language. Although children clearly imitate some aspects of adult behavior, it is clear that imitation cannot by itself be a primary driving force of early language development, and particularly of syntactic development. A cursory examination of the sentences produced by younger children shows that they do not often imitate adults. Children make types of mistakes that adults do not. Furthermore, when children try to imitate what they hear, they are unable to do so unless they already have the appropriate grammatical construction (see examples that follow). Nevertheless, imitation of adult speech (and that of other children) plays an important role in acquiring accent, in the manner of speech, and in the choice of particular vocabulary items. It might also be more important in older children, as we will see below.
Box 4.1 How do humans obtain language?

Rationalist perspective
• originated from the ideas of Plato and Descartes
• based on the premise that certain fundamental ideas are innate
• language capacity is present from birth
• favors nature in the nature–nurture debate
• developed into Chomskian viewpoint

Empiricist perspective
• originated from the ideas of Locke and Hume
• based on the premise that all knowledge is derived from experience
• the newborn is a “tabula rasa,” a blank slate
• favors nurture in the nature–nurture debate
• developed into the behaviorist viewpoints and plays an important role in the Piagetian perspective

Conditioning
To what extent can language development be explained by learning alone, using just the processes of reinforcement and conditioning? Skinner’s (1957) book Verbal Behavior was the classic statement of the behaviorist approach to language. Skinner argued that language was acquired by the same mechanisms of conditioning and reinforcement that were thought at the time to govern all other aspects of animal and human behavior (see Chapter 1). However, there is much evidence against this position.
First, adults (generally) correct only the truth and meaning of children’s utterances, not the syntax (Brown & Hanlon, 1970; see Example 1). Indeed, attempts by adults to correct incorrect syntax and phonology usually make no difference. Examples (2) and (3) are from the work of de Villiers and de Villiers (1979). At the age of 18 months their son Nicholas went from correctly producing the word “turtle” to pronouncing it “kurka,” in spite of all attempts at correction and clearly being able to produce the constituent sounds. In Example 3 the mother does not correct a blatant grammatical solecism because the meaning is apparent and correct. Parents rarely correct grammar, and if they try to do so the corrections have little effect (see Example 4, from Cazden, 1972). Finally, Example 5 (Fromkin et al., 2011) shows that in some circumstances children are unable to imitate adult language unless they already possess the necessary grammatical constructions.
(1) Child: Doggie [pointing at a horse].
    Adult: No, that’s a horsie [stressed].
(2) Adult: Say “Tur.”
    Child: Tur.
    Adult: Say “Tle.”
    Child: Tle.
    Adult: Say “Turtle.”
    Child: Kurka.
(3) Child: Mama isn’t boy, he a girl.
    Adult: That’s right.
(4) Child: My teacher holded the rabbits and we patted them.
    Adult: Did you say teacher held the baby rabbits?
    Child: Yes.
    Adult: What did you say she did?
    Child: She holded the baby rabbits and we patted them.
    Adult: Did you say she held them tightly?
    Child: No, she holded them loosely.
(5) Adult: He’s going out.
    Child: He go out.
    Adult: Adam, say what I say: Where can I put them?
    Child: Where I can put them?

[Photo caption: Parents will sometimes correct grammatical errors, particularly by repeating the child’s utterance in a grammatically correct form, or by asking a follow-up question that helps the child to rephrase.]

Parents do not always completely ignore grammatically incorrect utterances. They may provide some sort of feedback, in that certain parent–child discourse patterns vary in frequency depending on the grammaticality of the child’s utterances (Bohannon, MacWhinney, & Snow, 1990; Bohannon & Stanowicz, 1988; Demetras, Post, & Snow, 1986;
Hirsh-Pasek, Treiman, & Schneiderman, 1984; Moerk, 1991; Morgan & Travis, 1989). For example, parents are more likely to repeat the child’s incorrect utterance in a grammatically correct form, or to ask a follow-up question (Saxton, 1997). Example (4) exemplifies this. On the other hand, if the child’s utterance is grammatically correct, the adults just continue the conversation (Messer, 2000). People from different cultures also respond differently to grammatically incorrect utterances, with some appearing to place more emphasis on correctness (Ochs & Schieffelin, 1995).
Whether this type of feedback is strong enough to have any effect on the course of acquisition is controversial (Marcus, 1993; Morgan & Travis, 1989; Pinker, 1989). Such feedback is probably too infrequent to be effective, although others argue that occasional contrast between the child’s own incorrect speech and the correct adult version does enable developmental change (Saxton, 1997). Evidence in favor of this argument is that children are more likely to repeat adults’ expansions of their utterances than other utterances, suggesting that they pay particular attention to them (Farrar, 1992). The debate about whether or not children receive sufficient negative evidence (sometimes called the no negative evidence problem), such as information about which strings of words are not grammatical, is important because without negative feedback it is a challenge to specify how children learn to produce only correct utterances. One possible solution is that they rely on mechanisms such as innate principles to help them learn the grammar.
Second, the pattern of acquisition of irregular past verb tenses and irregular plural nouns cannot be predicted by learning theory. Some examples of irregular forms given by children are “gived” for “gave,” and “mouses” for “mice.” The sequence observed is: correct production, followed by incorrect production, and then later correct production again (Brown, 1973; Kuczaj, 1977). The original explanation for this pattern (but see later) is that the children begin by learning specific instances. They then learn a general rule (e.g., “form past tenses by adding ‘-ed’”; “form plurals by adding ‘-s’”) but apply it incorrectly by using it in all instances. Only later do they learn the exceptions to the rule. This is an example of what is called U-shaped development: performance starts off at a good level, but then becomes worse, before improving again. U-shaped development is suggestive of a developing system that has to learn both rules and exceptions to those rules. We examine this type of development in detail later.
The third piece of evidence against a conditioning theory of language learning is that some words (such as “no!”) are clearly understood before they are ever produced. Fourth, Chomsky (1959) argued that theoretical considerations of the power and structure of language mean that it cannot be acquired simply by conditioning (see Chapter 2). Finally, in phonological production, babbling is not random, and imitation is not important: The hearing babies of hearing-impaired parents babble normally. In general, language development appears to be strongly based on learning rules rather than simply on learning associations and instances.

Box 4.2 Arguments against the learning theory of language development
• Adults correct mainly the truth and meaning of a child’s utterances, rarely the syntax
• Some words are understood before they are produced
• The pattern of acquisition of irregular past tense verbs and irregular plural nouns is U-shaped
• Aspects of the structure of language mean it cannot be acquired simply by conditioning
• In phonological production, babbling is not random and imitation is not important

Poverty of the stimulus
Can children learn language from what they hear? Chomsky showed that children acquire a set of linguistic rules or grammar. He further argued that they could not learn these rules by environmental exposure alone (Chomsky, 1965). The language children hear was thought to be inadequate in two ways. First, they hear what has been called a degenerate input. The speech children hear is full of
slips of the tongue, false starts, and hesitations, and sounds run into one another so that the words are not clearly separated. Second, there does not seem to be enough information in the language that children hear for them to be able to learn the grammar. They are not normally exposed to a sufficient number of examples of grammatical constructions that would enable them to deduce the grammar. In particular, they do not hear grammatically defective sentences that are labeled as defective (e.g., “listen, Boris, this is wrong: ‘the witch chased to a cave’”). These obstacles to learning language constitute the poverty of the stimulus argument (Berwick, Pietroski, Yankama, & Chomsky, 2011).

Child-directed speech
Adults (particularly mothers) have a special way of talking to children (Snow, 1972, 1994). This special way of talking to children was originally called motherese, but is now called child-directed speech (CDS for short), because its use is clearly not limited to mothers. It is commonly known as “baby talk.” Adults talk in a simplified way to children, taking care to make their speech easily recognizable. The sentences are to do with the “here-and-now”; they are phonologically simplified (baby words such as “moo-moo” and “gee-gee”); there are more pauses, the utterances are shorter, there is more redundancy, the speech is slower, and it is clearly segmented. There are fewer word endings than in normal speech, the vocabulary is restricted, sentences are shorter, and prosody is exaggerated (Dockrell & Messer, 1999). There is a great deal of repetition in the speech of mothers to their children, and they focus on shared activities (Messer, 1980). Carers are more likely to use nouns at the most common or basic level of description (e.g., “dog” rather than “animal”; Hall, 1994). They are also more likely to use words that refer to whole objects (Masur, 1997; Ninio, 1980). Speech is specifically directed towards the child and marked by a high pitch (Garnica, 1977). Furthermore, these differences are more marked the younger the child; hence adults reliably speak in a higher pitch to 2-year-olds than to 5-year-olds. The most important words in sentences receive special emphasis. Although mothers use CDS more, fathers use it too (Hladik & Edwards, 1984). Mothers using sign language also use a form of CDS when signing to their infants, repeating signs, exaggerating them, and presenting them at a slower rate (Masataka, 1996). Even 4-year-old children use CDS when speaking to infants (Shatz & Gelman, 1973). In turn, infants prefer to listen to CDS rather than to normal speech (Fernald, 1991). There appears to be some feedback between the language of the adult carer and that of the child: the vocabulary of carers becomes modified by exposure to the language of the child. The same is not true of syntax, however, suggesting that the adult’s CDS directly and causally influences the syntactic development of the child (Huttenlocher, Waterfall, Vasilyeva, Vevea, & Hedges, 2010).
What determines the level of simplification used in CDS? Cross (1977) proposed a linguistic feedback hypothesis, which states that mothers tailor the amount of simplification they provide depending on how much the child appears to need. Counter to this, Snow (1977) pointed out that mothers produce child-directed speech before infants are old enough to produce any feedback on the level of simplification. Instead, she proposed a conversational hypothesis in which what is important is the mother’s expectation of what the child needs to know and can understand. Cross, Johnson-Morris, and Nienhuys (1980) found that the form of CDS used to hearing-impaired children suggested that a number of factors might be operating, and that elements of both the feedback and the conversational hypothesis are correct. The form of CDS also interacts in a complex way with the social setting: Maternal speech contains more nouns during toy play, but more verbs during non-toy play (Goldfield, 1993). The nature of CDS also varies with the socioeconomic status of the family, with higher status mothers saying more, using more variety in their language, and using longer utterances. These differences in CDS correlate with subsequent vocabulary development in the child (Hoff, 2003), and might be one reason why the vocabulary and language skills of children from high-status families grow more quickly than those of children from low-status families. (Of course, we cannot rule out genetic factors, as mother and child are genetically very similar.)
[Photo caption: According to Goldfield (1993), motherese, or child-directed speech (CDS), tends to contain more nouns during toy play, but more verbs during non-toy play.]

The use of child-directed speech gradually fades as the child gets older. It is sensitive to the child’s comprehension level rather than production level (Clarke-Stewart, Vanderstoep, & Killian, 1979). Hence speech intended for children is specially marked in order to make it stand out from background noise, and is simplified so that the task of discovering the referents of words and understanding the syntactic structure is easier than it would otherwise be. In this respect Chomsky’s claim about children only being exposed to an inadequate input does not hold up to scrutiny.
However, there is some controversy about the difference that CDS actually makes to development. Do children require a syntactically and phonologically simplified input in order to be able to acquire language? The evidence suggests not, although the data are not entirely consistent. Although the use of CDS is widespread, it is not universal across all cultures (Heath, 1983; Ochs & Schieffelin, 1995; Pye, 1986). Furthermore, there is great variation in the styles of social interaction and the form of CDS across different cultures (Lieven, 1994). On the other hand, it is possible that these cultures compensate for the lack of CDS by simplifying language development in other ways, such as emphasizing everyday communal life (Ochs & Schieffelin, 1995; Snow, 1995). Another problem is that the rate of linguistic development is not correlated with the complexity of the children’s input (Ellis & Wells, 1980). What seems to be important about CDS is not merely the form of what is said to the children but, perhaps not surprisingly, the content. In particular, the children who learn fastest are those who receive most encouragement and acknowledgment of their utterances. Questioning and directing children’s attention to the environment, and particularly to features of the environment that are salient to the child (such as repeated household activities), are also good facilitators of language development. Cross (1978) demonstrated the value of extended replies by adults that amplify the comments of the children. The children who showed the most rapid linguistic development were those whose mothers both asked their children more questions and gave more extensive replies to their children’s questions (Howe, 1980).
If the form of CDS makes little difference to linguistic development, why is CDS so widespread? One possibility is that it serves some other function, such as creating and maintaining a bond between the adult and child. Child-directed speech helps establish joint focus. Harris and Coltheart (1986) proposed that the syntactic simplification of CDS is just a side effect of simplifying and restricting content. Needless to say, all these factors might be operative.
In summary, even though CDS might not be necessary for language development, it might nevertheless facilitate it (Pine, 1994b). A child acquiring language on the basis of CDS is going to have a less impoverished input than one not exposed to CDS. If CDS is not necessary, then how do children learn a language on the basis of a degenerate and impoverished input? Chomsky considered it to be impossible that a child could deduce the structure of the grammar solely on the basis of hearing normal language. Something additional is necessary. He
argued that the additional factor is that the design of the grammar is innate: Some aspects of syntax must be built into the mind.

THE LANGUAGE ACQUISITION DEVICE
What might be innate in language? Chomsky (1965, 1968, 1986) argued that language acquisition must be guided by innate constraints, and that language is a special faculty not dependent on other cognitive or perceptual processes. It is acquired, he argued, at a time when the child is incapable of complex intellectual achievements, and therefore could not be dependent on intelligence, cognition, or experience. Because the language they hear is impoverished and degenerate, children cannot acquire a grammar by exposure to language alone. Assistance is provided by the innate structure called the language acquisition device (LAD). In Chomsky’s later work the LAD is replaced by the idea of universal grammar. This is a theory of the primitives and rules of inference that enable the child to learn any natural grammar. In Chomsky’s terminology, it is the set of principles and parameters that constrain language acquisition (see Chapter 2). For Chomsky, language is not learned, but grows.
Obviously languages vary, and children are faced with the task of acquiring the particular details of their language. For Chomsky (1981), this is the process of parameter setting. A parameter is a universal aspect of language that can take on one of a small number of positions, rather like a switch. The parameters are set by the child’s exposure to a particular language. Another way of looking at it is that the LAD does not prescribe details of particular languages, but rather sets boundaries on what acquired languages can look like; languages are not free to vary in every possible way, but are restricted. For example, no language yet discovered forms questions by inverting the order of words from the primary (declarative) form of the sentence. The LAD can be thought of as a set of switches that constrain the possible shape of the grammars the child can acquire; exposure to a particular language sets these switches to a particular position. If exposure to the language does not cause these switches to go to a particular position, they stay in the neutral one. Parameters set the core features of languages. Thus this approach sees language acquisition as parameter setting.
Let us look at a simple example. In languages like Italian, it is possible to drop the pronoun of sentences. For example, it is possible just to say “parla” (speaks). In languages such as English and French, it is not grammatical just to say “speaks”; you must use the pronoun, and say “he speaks.” Whether or not you can drop the pronoun in a particular language is an example of a parameter; it is called the pro-drop parameter. English and French are non-pro-drop languages, whereas Italian and Arabic are pro-drop languages. But once the pro-drop parameter is specified, other aspects of the language fall into place. For example, in a pro-drop language such as Italian you can construct subjectless sentences such as “cade la notte” (“falls the night”); in non-pro-drop languages, you cannot. Instead, you must use the standard word order with an explicit subject (“the rain falls”). Pro-drop languages always permit subjectless sentences, so pro-drop is a generalization about languages (Cook & Newson, 2007).

Is language learning parameter setting?
Is learning language setting parameters? For Chomsky and others who view language acquisition as a process of acquiring a grammar, the basis of which is innate, acquiring a language involves putting the built-in switches (parameters) into the correct positions. One obvious problem with this view is that language development is a slow process, full of errors. Why does it take so long to set these switches? There are two explanations. The continuity hypothesis says that all the principles and parameters are available from birth, but they cannot all be used immediately because of other factors. For example, the child has first to identify words as belonging to particular categories, and be able to hold long sentences in memory for long enough to process them (Clahsen, 1992). The second explanation is that the children do not have immediate access to all their innate knowledge. Instead, it only becomes gradually available over time as a consequence of maturation (Felix, 1992) (see Figure 4.2). There is little agreement about which of these provides the best account of language development.

FIGURE 4.2 Why does it take so long for a child to acquire a grammar?
Continuity hypothesis: all the parameters are available from birth, but they cannot be used until other difficulties have been overcome.
Maturation hypothesis: children do not have immediate access to all their innate knowledge, but it becomes available over time.
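
The switch metaphor used for parameter setting can be made concrete with a toy sketch. It is only an illustration under strong simplifying assumptions, not a model proposed by Chomsky or in this chapter: the names `ProDrop`, `Clause`, and `set_pro_drop` are hypothetical, each clause the child hears is reduced to a verb plus an optional overt subject, and a single subjectless clause is treated as enough evidence to flip the switch. With no exposure at all, the switch stays in its neutral position.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class ProDrop(Enum):
    """Positions of the (hypothetical) pro-drop switch."""
    NEUTRAL = "neutral"            # no exposure yet
    PRO_DROP = "pro-drop"          # subject pronoun may be omitted (e.g., Italian)
    NON_PRO_DROP = "non-pro-drop"  # subject must be overt (e.g., English)


@dataclass
class Clause:
    """A heavily simplified clause: a verb and an optional overt subject."""
    verb: str
    subject: Optional[str] = None


def set_pro_drop(heard: Iterable[Clause]) -> ProDrop:
    """Return the switch position after exposure to the clauses in `heard`.

    Illustrative assumption: one subjectless clause is sufficient evidence
    for pro-drop; input containing only overt subjects sets the switch to
    non-pro-drop; an empty input leaves it neutral.
    """
    setting = ProDrop.NEUTRAL
    for clause in heard:
        if clause.subject is None:
            return ProDrop.PRO_DROP
        setting = ProDrop.NON_PRO_DROP
    return setting


# Toy input: Italian allows "parla" alone; English requires "he speaks".
italian_input = [Clause("parla", subject="lui"), Clause("parla")]
english_input = [Clause("speaks", subject="he"), Clause("falls", subject="the rain")]

italian_setting = set_pro_drop(italian_input)
english_setting = set_pro_drop(english_input)
no_exposure_setting = set_pro_drop([])
```

The sketch deliberately ignores noisy input and the slow, error-prone course of real acquisition; it shows only the logic of a switch with a small number of positions that is set by exposure.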
Another problem is that it has proved difficult to find examples of particular parameters clearly being set in different languages (Maratsos, 1998). In telegraphic speech, English-speaking children often omit pronouns. One possible explanation for this is that they have incorrectly set the parameter for whether or not pronouns should be included in their utterances. At first sight this makes the language look like Italian, but this comparison fails because Italian verbs specify the subject, whereas English ones provide much less information.
Other problems for the parameter-setting theory include how deaf children manage to acquire sign language. There are some indications that similar processes underlie both sign language and spoken language. First, all the milestones in both types of language occur at about the same sort of time. Originally it was thought that because the manual system matures more quickly than the language system, the first signs appeared before the first spoken words (Newport & Meier, 1985; Schlesinger & Meadow, 1972). However, it is possible that people tend to over-interpret gestures by young children, and that in fact signed and spoken words emerge at about the same time (Petitto, 1988). Second, signing children make the same sorts of systematic errors as speaking children at the same time (Petitto, 1987). Hence, although spoken and signed language develop in very similar ways, it is unclear how sign language gestures can be matched to the innate principles and parameters of verbal language. It is also problematic how bilingual children manage to acquire two languages at the same time, when the languages involved might need to have parameters set to different positions (Messer, 2000).
These are difficult problems for the theory of principles and parameters. To counter them, Chomsky toned down the idea that grammatical rules are abstract, and generally reduced their importance in language acquisition (e.g., Chomsky, 1995).

Linguistic universals
Constraints must be general enough to apply across all languages: clearly innate constraints cannot be specific to a particular language. Instead, there must be aspects of language that are universal. Chomsky argued that there are substantial similarities between languages, and the differences between them are actually quite superficial. Pinker (1994, p. 232), perhaps controversially, suggested that “a visiting Martian would surely conclude that aside from their mutually unintelligible vocabularies, Earthlings speak a single language.” Although there are 6,000 languages in the world, they all share the same basic structure, and this basic structure is universal grammar.
Linguistic universals are features that can be found in most languages. Chomsky (1968) distinguished between substantive and formal universals. Substantive universals include the categories of syntax, semantics, and phonology that are common to all languages. The presence of the noun and verb categories is an example of a substantive
universal, as all languages make this distinction. It is so fundamental that it can arise in the absence of linguistic input. “David,” a deaf child with no exposure to sign language, used one type of gesture corresponding to nouns, and another type for verbs (Goldin-Meadow, Butcher, Mylander, & Dodge, 1994). A formal universal concerns the general form of syntactic rules that manipulate these categories. These are universal constraints on the form of syntactic rules. One of the goals of universal grammar is to specify these universals.
An interesting example of a linguistic universal relates to word order. Greenberg (1963) examined word order and morphology in 30 very different languages and found 45 universals, focusing on the normal order of subject, object, and verb (English is an SVO language: its dominant order is subject–verb–object). He noted that we do not appear to find all possible combinations; in particular, there seems to be an aversion to placing the object first. The proportions found are shown in Table 4.1. (Note that in general OVS and VOS languages are very rare, comprising less than 1% of all languages, and although some linguists believe that there are a few OSV languages, there is no consensus; see Pullum, 1981.) Even more striking is the way in which the primary word order has implications for other aspects of a language: it is an example of a parameter. Once primary word order is fixed, other aspects of the language are also fixed. For example, if a language is SVO it will put question words at the beginning of the sentence (“Where is … ?”); if it is SOV, it will put them at the end. SVO languages put prepositions before nouns (“to the dog”), while SOV languages use postpositions after the noun.

TABLE 4.1 Different word orders, as percentages of languages (based on Clark & Clark, 1977).
Word order             Percentage of languages
subject object verb    44%
subject verb object    35%
verb subject object    19%
verb object subject    2%
object verb subject    0%
object subject verb    0%

There are four possible reasons why universals might exist. First, some universals might be part of the innate component of the grammar. There is some evidence for this claim in the way in which parameters set apparently unrelated features of language. For example, at first sight there is no obvious reason why all SVO languages must also put question words at the beginning of a sentence. Second, some universals might be part of an innate component of cognition, which then makes them more likely to be incorporated in some or all languages. For example, 5-month-old infants are sensitive to the conceptual distinction between things that fit tightly and things that fit loosely. Using the standard dishabituation paradigm, infants start to pay attention when there is a change from cylinders in a narrow container to cylinders in a wider container (Bloom, 2004; Hespos & Spelke, 2004). That is, they are sensitive to the conceptual contrast. Some languages (e.g., Korean, which uses different verbs when referring to things fitting tightly compared with things fitting loosely) mark this contrast linguistically, and some (e.g., English) do not. Hurford (2003) argues that the predicate-argument distinction has a neural basis, reflecting distinctions such as that between the “what” and “where” visual processing pathways. Of course, the wider view is that neural systems have evolved to interact with the physical laws of the universe, such as a distinction between mass and movement. Language learning is a process of linking words to universal, pre-existing concepts that enable animals to navigate the world. Third, constraints on syntactic processing make some word orders easier to process than others (Hawkins, 1990). Languages evolve so that they are easy to understand. Fourth, universals might result from strong features of the environment that are imposed on us from birth, and make their presence felt in all languages. Languages make use of important distinctions in the environment. Different languages might pick up on some differences rather than others. In practice it might be very difficult to distinguish between these alternatives. Finally, it should be noted that the notion that there are true universals common to all languages has recently been criticized; instead, it has been argued, there is variation
across languages in all ways in which variation is possible (Evans & Levinson, 2009).

The commonly accepted view is that innate mechanisms make themselves apparent very early in
development, whereas aspects of grammar that have to be learned develop slowly. Wexler (1998)
argued that this does not have to be so. Some parameters are set by exposure to language at a
very early age, whereas some innate, universal properties of language can emerge quite late, as
a consequence of genetically driven maturation. As evidence for early parameter setting, Wexler
observed that children know a great deal about the inflectional structure of their language when
they enter the two-word stage (around 18 months). Furthermore, the parameter of word
order—whether the verb precedes or follows the object, and all that follows from it—is set from
the earliest observable stage.

Pidgins and creoles

Further evidence that there is a strong biological drive to learn syntax comes from the study of
pidgin and creole languages. Pidgins are simplified languages that were created for communication
between speakers of different languages who were forced into prolonged contact, for example as a
result of slavery in places like the Caribbean, the South Pacific, and Hawaii. A creole is a
pidgin language that has become the native tongue of the children of the pidgin speakers. Whereas
pidgins are highly simplified syntactically, creole languages are syntactically rich. They are the
spontaneous creation of the first generation of children born into mixed linguistic communities
(Bickerton, 1981, 1984). Creoles are not restricted to spoken language: hearing-impaired children
develop a creole sign language if exposed to a signing pidgin. A community of deaf children in
Nicaragua developed their own sign language from scratch (Kegl, Senghas, & Coppola, 1999).
Furthermore, the grammars that different creoles develop are very similar. Deaf children who are
not exposed to sign language (because they have non-signing hearing parents) nevertheless
spontaneously develop a gesture system that seems to have its own syntax (Goldin-Meadow,
Mylander, & Butcher, 1995). They also develop within-gesture structures analogous to
characteristics of word morphology. It is as though there is a biological drive to develop
syntax, even if it is not present in the adult form of communication to which a child is exposed.
Bickerton calls this idea the language bioprogram hypothesis: children have an innate drive to
create a grammar that will make a language even in the absence of environmental input.

Genetic linguistics

More evidence that aspects of language are innate comes from studies of the genetic basis of
language, genetic linguistics. Specific language impairment, or SLI, is a disorder that affects
about 5% of the population. SLI is marked by significant problems with spoken language without
any obvious accompanying brain damage or problems with hearing, and those affected have IQs in
the normal range. Importantly, it runs in families (Gopnik, 1990a, 1990b; Gopnik & Crago, 1991;
Leonard, 1989, 2000; Pinker, 2001; Vargha-Khadem, Watkins, Alcock, Fletcher, & Passingham,
1995). For example, the “KE” family of London is a large family spanning three generations where
about half the members have some speech or language disorder. Affected members have difficulty
controlling their tongues and making speech sounds, but they also have trouble identifying speech
sounds, understanding speech, and making judgments about grammatical acceptability. They have
particular difficulty with regular inflections (e.g., forming the plural of nouns by adding an
“s” at the end), and a study of the heritability of the disorder suggests that a single dominant
gene is involved (Hurst, Baraitser, Auger, Graham, & Norell, 1990). Their language is replete
with grammatical errors, particularly involving pronouns. They have difficulty in learning new
vocabulary. The speech of the affected people is slow and effortful, and they have difficulty in
controlling their facial muscles. Contrary to the earlier reports that were based on quite a
small number of items, affected members of the family also have difficulty with irregular
inflections. SLI can also cause severe difficulties in language comprehension (Bishop, 1997).
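
The inheritance reasoning behind the single-dominant-gene suggestion can be illustrated with a
small worked example. The sketch below is not taken from the studies cited; it assumes, purely
for illustration, one affected parent carrying a single copy of a dominant allele (written "A")
and one unaffected parent ("aa"), and it enumerates the equally likely allele combinations a
child could inherit. The function name and the allele labels are hypothetical.

```python
from fractions import Fraction
from itertools import product


def offspring_affected_fraction(parent1, parent2, dominant="A"):
    """Fraction of equally likely offspring genotypes with at least one dominant allele.

    Each parent is written as a two-letter genotype such as "Aa"; a child
    inherits one allele from each parent. This is textbook Mendelian
    arithmetic, used here only as an illustrative assumption.
    """
    genotypes = list(product(parent1, parent2))  # one allele from each parent
    affected = [g for g in genotypes if dominant in g]
    return Fraction(len(affected), len(genotypes))


# Hypothetical cross: affected heterozygous parent ("Aa") with unaffected parent ("aa").
print(offspring_affected_fraction("Aa", "aa"))  # 1/2
```

Under these assumptions half of the children would be expected to carry the dominant allele,
which is in line with the observation that about half the members of the KE family are affected.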
The distribution of the disorder in the family suggests it is caused by a dominant gene (or a set
of linked genes) on a non-sex chromosome; the most likely candidate is a segment of chromosome 7
labeled SPCH1 (Fisher, Vargha-Khadem, Watkins, Monaco, & Pembrey, 1998). Study of another person
with SLI enabled the disorder to be tied to a specific gene, called FOXP2 (Lai, Fisher, Hurst,
Vargha-Khadem, & Monaco, 2001—see also Chapter 3). The FOXP2 gene seems to play some causal role
in the brain circuitry underlying normal language development, including Broca’s area; in
particular, it seems to be involved in controlling fine movements of the face and articulatory
system (Fisher & Marcus, 2006).

Clearly, then, genetic factors affect language proficiency, although there is considerable
disagreement about just how specific the grammatical impairment in the KE family actually is. As
noted above, Vargha-Khadem and colleagues showed that in fact affected members of the KE family
performed poorly on many other language tasks in addition to regular inflection formation
(Leonard, 1989; Vargha-Khadem & Passingham, 1990; Vargha-Khadem et al., 1995). Furthermore,
systems other than language might also be involved. For example, Tallal, Townsend, Curtiss, and
Wulfeck (1991) proposed that children who tended to neglect word endings and other morphological
elements did so because of difficulties in temporal processing. There is also debate about
whether people with SLI have near-normal IQ on tests of non-verbal performance. Affected members
of the KE family scored 18 points lower on performance IQ tests than unaffected members
(Vargha-Khadem et al., 1995). Although SLI might have a genetic basis, it is nevertheless to some
extent treatable. Members of the KE family learned to compensate for their difficulty in
generating syntactically complex sentences by memorizing structures, and by consciously applying
rules most of us apply unconsciously.

An alternative view is that SLI is not primarily a disorder of grammar, but arises from impaired
sound processing (Joanisse & Seidenberg, 1998). Children with SLI who have syntactic deficits
also have difficulty in tasks such as repeating nonwords (such as “slint”), and tasks of
phonological awareness, such as recognizing the sound in common in words (“b” in “ball” and
“bat”). Joanisse and Seidenberg argued that normal syntactic development has an important
phonological component. For example, in order to be able to form the past tense of verbs
correctly, you have to be able to accurately identify the final sound of the word. If the final
sound of a present tense verb is a voiceless consonant, then you form the past by adding a /t/
sound (“rip” becomes “ripped”). But if it is a voiced consonant then you must add a /d/ sound
(“file” becomes “filed”), and if it is an alveolar stop you must add an unstressed vowel as well
as a /d/ (“seed” becomes “seeded”). Hence these morphological rules have an important
phonological component. Watkins, Dronkers, and Vargha-Khadem (2002) argued that the core deficit
in SLI is in sequencing sounds, with the problems with inflections and syntactic sequencing
secondary to that deficit.

The argument about the theoretical importance of SLI hinges on the extent to which these
impairments are truly specific to language or to knowledge of grammar. On balance, the evidence
suggests that language difficulties can “run in families,” but that these difficulties are quite
general and not limited to innate knowledge about linguistic rules. The mapping between genes and
language is a complex one, but the FOXP2 gene clearly plays an important role.

Formal approaches to language learning

How do children learn the rules of grammar? Most accounts stress the importance of induction in
learning rules: Induction is the process of forming a rule by generalizing from specific
instances. One aspect of the poverty of the stimulus argument is that children come to learn
rules that could not be learned from the input they receive (Lightfoot, 1982). Gold (1967) showed
that the mechanism of induction is not by itself sufficiently powerful to enable a language to be
learned; the proof of this is known as Gold’s theorem. If language learners are presented only
with positive data, they can only learn a very limited type of language (known as a Type 3
language—see Chapter 2). They would then not be able to construct sentences with an
unlimited number of center embeddings. Human language is substantially more powerful than a
Type 3 language. This observation means that, in principle, human language cannot be acquired by
induction only from positive exemplars of sentences of the language.

If children cannot learn a language as powerful as human language from positive exemplars of
sentences alone, what else do they need? One possibility might be that language learners use
negative evidence. This means that the child must generate ungrammatical sentences that must then
be explicitly corrected by the parent, or that the parent provides the child with utterances such
as “The following sentence is not grammatical: ‘The frog kiss the princess.’” As we have seen,
the extent to which children use negative data is questionable, and few parents spontaneously
produce this type of utterance. Hence Gold’s theorem seems to suggest that induction cannot be
the only mechanism of language acquisition. The explanation given most frequently is that it is
supplemented with innate information. The area of research that examines the processes of how
language learning might occur is known as learnability theory or formal learning theory.

Pinker (1984) attempted to apply learnability theory to language development. He placed a number
of constraints on acquisition. First, the acquisition mechanisms must begin with no specific
knowledge of the child’s native language—that is, the particular language to be learned. Pinker
emphasized the continuity between the grammar of the child and the adult grammar. He argued that
the child is innately equipped with a large number of the components of the grammar, including
parameters that are set by exposure to a particular language. He also argued that the categories
“noun” and “verb” are innate, as is a predisposition to induce rules. Even though children are
supplied with these categories, they still have to assign words to them, which is not a trivial
problem. Pinker argued that the linking rule that links a syntactic category such as “noun” to a
thematic role—the role the word is playing in the meaning of the sentence—must be innate.

More general innate accounts

Other researchers agree that the child must come to language learning with innate help, but this
assistance need not be the language-specific information incorporated in universal grammar.
Slobin (1970, 1973, 1985) argued that children are not born with structural constraints such as
particular categories, but with a system of processing strategies that guide their inductions. He
emphasized the role of general cognitive development. Slobin examined a great deal of
cross-cultural evidence, and proposed a number of processing strategies that could account for
this acquisition process (see Box 4.3). For Slobin, certain cognitive functions are privileged;
for example, the child tries to map speech first onto objects and events. In a similar vein,
Taylor and Taylor (1990) listed a number of factors that characterize language acquisition
(Box 4.4). These principles apply to learning other skills as well. Of course, other factors
(whether biological, cognitive, or social) may in turn underlie these principles.

Box 4.3 Some general principles of acquisition (based on Slobin, 1973)
1. Pay attention to the ends of words
2. The phonological form of words can be systematically modified
3. Pay attention to the order of morphemes and words
4. Avoid interruption or rearrangement of units
5. Underlying semantic relations should be clearly marked
6. Avoid exceptions
7. The use of grammatical markers should make semantic sense

Problems with innate accounts of language acquisition

The controversy about innateness concerns how much of language is innate, and how
language-specific the innate information has to be. A study of a large
number of same-sex twins found that vocabulary and grammatical abilities are correlated at the
ages of 2 and 3, suggesting that the same genetic factors influence both abilities (Dionne, Dale,
Boivin, & Plomin, 2003). Such results suggest that the innate basis of language is very general.
To some extent the debate is no longer simply about whether nature or nurture is more important,
but about the precise mechanisms involved, and the extent to which general cognitive or
biological constraints determine the course of language development.

Box 4.4 Pragmatic factors affecting acquisition (based on Taylor & Taylor, 1990)
• Simple and short before complex and long
• Gross before subtle distinctions
• Perceptually salient (in terms of size, color, etc.) first
• Personal before non-personal
• Here and now before those displaced in time and space
• Concrete before abstract
• Frequent and familiar before less frequent and unfamiliar
• Regular before irregular forms (though interacts with frequency)
• Items in isolation before capturing relationships
• Whole first, then analyzed into parts, then mature whole

Many people consider that there is something unsatisfactory about specific innate principles.
Having to resort to saying that something is innate is rather negative, because it is easy to
fall back on a nativist explanation if it is not easy to see a non-nativist alternative. This is
not always a fair criticism, but it is important to be explicit about which principles are innate
and how they operate. Innate principles are also difficult to prove. The best way of countering
those researchers who see this as a negative approach would be to show where the principles come
from and how they work: for example, by showing which genes control language development and
how. As Braine (1992) asked, exactly how do we get from genes laid down at conception to
syntactic categories 2½ years later? We are a long way away from being able to answer this
question.

Nativist accounts tend not to give enough emphasis to the importance of the social precursors of
language. It is possible that social factors can do a great deal of the work for which innate
principles have been proposed. Researchers who are opposed to nativist theories argue that the
learning environment is much richer than the nativists suppose: in particular, children are
presented with feedback. Deacon (1997) argues that the structure of language itself facilitates
learning it: Language has evolved so that it has become easy to learn.

An alternative to innate knowledge: Distributional information

The alternative to innate knowledge about language is that there is sufficient information in the
input for children to be able to learn language. Connectionist modeling provides an alternative
account of these phenomena, showing how complex behavior can emerge from the interaction of many
simpler processes without the need to specify innate language-specific knowledge (e.g., Elman,
1999; Elman et al., 1996). Modeling emphasizes the role of the actual linguistic input to which
children are exposed. In addition, as we will see, there is now a considerable amount of evidence
that infants make use of information about the distribution of sounds and words in what they
hear. The central idea is that children make use of general-purpose associative learning
mechanisms (Gomez & Gerken, 2000). Often they seem able to learn a great deal about linguistic
form without knowing the meaning of what they are listening to. (This idea is particularly
apparent in the studies that show children can learn patterns and rules in artificial languages
that do not have meaning.) This finding suggests that meaning need not precede form.
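
One way to make the notion of distributional information concrete is to count how often one
syllable follows another. The sketch below is only an illustration: the text does not commit to a
particular statistic, so the use of transitional probabilities here is an assumption, and the
syllable stream and the two "words" in it (bi-da and ku-po) are invented.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for each adjacent pair in a stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (every position but the last).
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Invented stream made of two hypothetical "words", bi-da and ku-po, in varying order.
stream = "bi da ku po bi da bi da ku po ku po bi da".split()
tp = transitional_probabilities(stream)

for (first, second), p in sorted(tp.items()):
    print(f"{first} -> {second}: {p:.2f}")
```

In this toy stream, syllable pairs inside a "word" always occur together, whereas pairs that span
a word boundary are less predictable, so a learner tracking such statistics would in principle
have a cue to where words begin and end without knowing what anything means.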
Elman (1993) showed that networks could learn grammars with some of the complexities of English.
In particular, the networks could learn to analyze embedded sentences, but only if they were
first trained on non-embedded sentences, or were given a limited initial working memory that was
gradually increased. This modeling shows the importance of starting on small problems that
reflect the types of sentences to which young children are in practice exposed. It also provides
support for Newport’s (1990) idea, called the less-is-more theory, that initially limited
cognitive resources might actually help children to acquire language, rather than hinder them. In
a study involving how easily adults learned an artificial language, Kersten and Earles (2001)
found that adults learned the artificial language better when they were initially presented with
only small segments of the language than when they were exposed to the full complexity of the
language from the beginning. On the other hand, making the task more realistic by introducing
semantic information into the modeling suggests that starting small provides less of an advantage
than when syntactic information alone is considered. Indeed “starting small,” or “less is more,”
might actually hinder development with more naturalistic inputs to the learning system (Rohde &
Plaut, 1999). In any case, connectionist modeling shows that explicit negative syntactic
information might not be needed to acquire a grammar in the absence of innate information—there
might after all be sufficient information in the sentences children actually hear.

It should be pointed out, however, that these connectionist networks have so far modeled only
grammars that approach the complexity of natural language. In general, it is debatable whether
the constraints necessary to acquire language in the face of Gold’s theorem need to arise from
innate language-specific information, or can be satisfied by more general constraints on the
developing brain, or by the social and linguistic environment (Elman et al., 1996).

Nevertheless, adults and children are able to extract at least some syntactic structure on the
basis of exposure to statistical information alone. Saffran (2001, 2002) tested adults and
6–9-year-old children on an artificial language and then asked them to decide whether test items
followed the rules of the language or not. Both groups learned the structure of the language
(e.g., that an A phrase consists of an A-type word plus a B-type word) by extracting predictive
dependencies—that some things consistently go with other things. Interestingly, similar results
were found with non-linguistic sounds and even in the visual modality, suggesting that these
learning mechanisms are not specific to language. Very young children are also able to extract
structure from what they hear. Seven-month-old infants attend longer to sentences with unfamiliar
structures than to sentences with familiar structures (Marcus, Vijayan, Rao, & Vishton, 1999).
Marcus et al. tested children on sequences in an artificial language where simple counting or
statistical mechanisms would not suffice to learn the rule generating the sequence because they
heard new items. For example, suppose you hear items like “ga ti ga” and “li na li” repeated
several times. You then hear the new item “wo fe wo”; this item does not generate surprise,
because it conforms to the rule you have induced (sequences must be of the form ABA). If,
however, you hear “wo fe fe” you might be surprised, and pay more attention, because this
stimulus does not conform to the rule. Marcus et al. found that the 7-month-olds behaved in the
same way. So very young children are able to extract abstract rules from very little input. There
is, however, some debate as to what counts as a “rule,” and the extent to which connectionist
networks can model this behavior using only simple statistical mechanisms (Christiansen & Curtin,
1999; Seidenberg & Elman, 1999; see Marcus, 1999, for a reply).

HOW CHILDREN DEVELOP LANGUAGE

Many things drive language development: genes, the environment, and particularly social
interaction. The main issue is the extent to which children need genetically encoded
language-specific information, rather than general-purpose learning mechanisms. We should note
that learning mechanisms change as the child grows: Connectionist modeling has focused attention
on the way in which learning systems change with experience. Finally, we should remember that the
balance of the driving forces for phonological, syntactic, semantic, and pragmatic development
might be very different.
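
The difference between learning from the particular items one has experienced and more abstract,
rule-like knowledge can be illustrated with the ABA sequences of Marcus et al. (1999) described
earlier. The following sketch is not a model of infant learning or of any published simulation;
the function names are hypothetical, and it simply contrasts a learner that memorizes the items
it has heard with one that checks the abstract pattern.

```python
def heard_before(item, training_items):
    """Item-memorizing learner: accepts only strings it has already heard."""
    return item in training_items


def follows_aba(item):
    """Pattern-based learner: accepts any three-syllable item of the form ABA."""
    first, second, third = item.split()
    return first == third and first != second


training = ["ga ti ga", "li na li"]

for test_item in ["wo fe wo", "wo fe fe"]:
    print(test_item,
          "| heard before:", heard_before(test_item, training),
          "| follows ABA:", follows_aba(test_item))
```

Both test items are new, so the memorizing learner treats them alike; only the pattern-based
check separates “wo fe wo” from “wo fe fe”, which is the distinction the infants appeared to make.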
Do children learn any language in the womb?

Children do not start speaking at birth because they need some exposure to language before they
can start using it, and because other processes (e.g., sound perception, vision, brain
maturation, and social interaction) have to reach some level of ability first. But children do
start learning language before birth. The mother’s womb provides shelter, but it does not exclude
all stimuli from the outside world. Sounds including language penetrate the uterus, and the baby
in the womb can hear those sounds, although speech sounds different; in particular, the amniotic
fluid prevents the higher frequencies from reaching the baby. Indeed, only sounds up to 1,000 Hz
(cycles per second) will get through to the baby. In comparison, people with normal hearing can
hear frequencies up to 20,000 Hz; speech contains sounds in the range of 100 to 4,000 Hz; and
telephones only convey sounds up to 3,000 Hz: so the speech the fetus hears will sound very
muffled (Altmann, 1997).

In spite of the impoverished nature of the sounds that reach the baby in the womb, there is a
substantial amount of evidence that there is still sufficient information for the baby to be able
to learn something from those sounds (Gomez & Gerken, 2000). DeCasper and Spence (1986) asked a
group of pregnant women to read aloud a short story every day for the final 6 weeks of their
pregnancies. After the babies were born, DeCasper and Spence tested the babies to see if they
could distinguish the story that they had heard in the womb from another story. Discovering what
very young infants can and cannot do, and what they want and do not want to do, is obviously very
difficult. You cannot just ask a newborn baby “have you heard this story before?” One of the most
commonly used techniques to investigate the preferences of young infants is called non-nutritive
sucking. In this technique, the infant sucks on a teat that controls the presentation of a
stimulus. Babies learn very quickly to adapt their rate of sucking to control the presentation of
the stimulus. They might have to suck quickly to obtain one
stimulus, and slowly to obtain another. DeCasper
and Spence showed that the infants preferred to
listen to the story to which they had been exposed
in the womb, rather than a new story. Importantly,
they preferred the story that they had heard before
even if it was spoken by someone other than their
mother. So in the womb they must have learned some
characteristic of the language, rather than just
having become familiar with a particular voice.
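
The frequency figures quoted earlier in this section give a rough sense of how much of the speech
signal survives in the womb. The sketch below is simple arithmetic on those figures. It treats
the cut-offs as sharp and measures the surviving band on a linear frequency scale, both of which
are simplifying assumptions made here for illustration (hearing is not linear in frequency), and
the function name is hypothetical.

```python
def fraction_of_band_passed(band, cutoff_hz):
    """Share of a frequency band (low, high) in Hz that lies at or below a cut-off."""
    low, high = band
    passed = max(0, min(high, cutoff_hz) - low)
    return passed / (high - low)


SPEECH_BAND = (100, 4000)  # range of speech sounds given in the text, in Hz

for label, cutoff in [("womb", 1000), ("telephone", 3000), ("normal hearing", 20000)]:
    share = fraction_of_band_passed(SPEECH_BAND, cutoff)
    print(f"{label}: {share:.0%} of the speech band")
```

On this crude measure less than a quarter of the speech band reaches the fetus, compared with
roughly three quarters over a telephone, which fits the description of the speech as very
muffled.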
Another study by DeCasper, Lecanuet,
Maugais, Granier-Deferre, and Busnel (1994)
supports the same conclusion. The mothers read
aloud a story every day between the 34th and
38th weeks of pregnancy. The experimenters then
played a story to the fetus directly through the
mother’s abdomen (so the mother was unaware of
what was played). They monitored changes in the
heart rate of the fetus, and found that it decreased
when the familiar story was played, but not when an unfamiliar story was played.

[Figure: Fetus’ brain. Colored magnetic resonance imaging (MRI) scan of a coronal section through
the brain (center) of a 25-week-old fetus in its mother’s womb. At 25 weeks the connections
within the fetus’ brain are developing, especially in the areas responsible for emotions,
perception, and conscious thought. The fetus is also able to hear at this stage.]

These experiments show that infants in the womb learn something about the spoken language around
them. Given the muffled nature of what they hear, it is unlikely to be anything very specific. So
what might it be? One possibility
is suggested by a study by Mehler et al. (1988). These researchers played tapes of French speech
to 4-day-old babies. They used a variant of the sucking habituation technique, and found that the
babies sucked until the novelty wore off. When the sucking rate fell, they switched to playing
Russian speech. The speech was recorded from a bilingual French–Russian speaker, so there were no
differences in voice. The sucking rate increased again. The same result was found if the tapes
were played the other way round. Hence the newborn babies could detect the change of language.

What characteristics of the languages do babies pay attention to such that they can detect
changes? In another experiment, Mehler et al. (1988) played the babies tapes of language that had
been filtered to remove high frequencies. We depend on the high-frequency sounds to be able to
recognize individual sounds. The babies still detected the change. So it is unlikely that they
were distinguishing the languages just on the basis of the repertoire of sounds. Instead, the
babies must have detected the different prosodies of the two languages. Prosody is the collective
name given to all the information about languages that spans individual sounds. One important
aspect of prosody is stress, which determines the rhythm and emphasis of speech. Another
important aspect is intonation, the way in which the pitch of speech rises and falls, which
determines the melody of language. When we ask a question, we use a different intonation than
when we make a statement—the pitch rises at the end of questions. These studies show that babies
in the womb, and at birth, can detect changes in prosody. Sensitivity to prosody is important
because it later helps children distinguish and identify the sounds of language.

PHONOLOGICAL DEVELOPMENT

Infants appear to be sensitive to speech sounds from a very early age. As we saw in Chapter 3,
there is some evidence that the infant brain is lateralized to some degree from birth. How does
the child’s ability to hear and produce language sounds develop?

Early speech perception

Even though they have not yet started to talk, babies have surprisingly sophisticated
speech-recognition abilities. Prelinguistic infants have complex perceptual systems that can make
subtle phonetic distinctions. Using the techniques described at the start of the chapter, it has
been shown that from birth children are sensitive to speech sounds, as distinct from non-speech
sounds. Indeed, it has been argued that infants between 1 and 4 months of age, and perhaps even
younger, are sensitive to all the acoustic differences later used to signal phonetic distinctions
(Eimas, Miller, & Jusczyk, 1987). For example, they are capable of the categorical perception of
voicing, place, and manner of articulation (see Chapter 2). Cross-linguistic studies, which
compare the abilities of infants growing up with different linguistic backgrounds, show common
categorizations by infants, even when there are differences in the phonologies of the adult
language. Eimas, Siqueland, Jusczyk, and Vigorito (1971) showed that infants as young as 1 month
old could distinguish between two syllables that differed in only one distinctive phonological
feature (e.g., whether or not the vocal cords vibrate, as in the sound [ba] compared with the
sound [pa]). Eimas et al. played the different sounds and found they could elicit changes in
sucking rate. Furthermore, they found that perception was categorical, as the infants were only
sensitive to changes in voice onset time that straddled the adult boundaries: that is, the
categories used by the babies were the same as those used by adults. This suggests that these
perceptual mechanisms might be innate.

From an early age, infants discriminate sounds from each other regardless of whether or not these
sounds are to be found in the surrounding adult language. The innate perceptual abilities are
then modified by exposure to the adult language. For example, Werker and Tees (1984) showed that
infants born into English-speaking families in Canada could make phonetic distinctions present in
Hindi at the age of 6 months, but this ability declined rapidly over the next 2 months. A second
example is that 2-month-old Kikuyu infants in Africa can distinguish between [p] and [b]. If not
used in the language into which they are growing up, this ability is lost by about the age of
1 year or even less (Werker & Tees, 1984). (Adults can learn to make these distinctions again, so
these findings are more likely to reflect a reorganization of processes rather than complete loss
of ability.)

Infants are sensitive to features of speech other than phonetic discriminations. Neonates
(newborn infants) aged 3 days prefer the mother’s voice to that of others (DeCasper & Fifer,
1980; see above). From an early age, infants can distinguish languages as long as they are
rhythmically distinct enough; newborn French infants can distinguish British English from
Japanese, but not from Dutch (Nazzi, Bertoncini, & Mehler, 1998).

The sensitivity of babies to language extends beyond simple sound perception. Infants aged 8
months are sensitive to cues such as the location of important syntactic boundaries in speech
(Hirsh-Pasek et al., 1987). Hirsh-Pasek et al. inserted pauses into speech recorded from a mother
speaking to her child. Infants oriented longer to speech where the pauses had been inserted at
important syntactic boundaries than when the pauses had been inserted within the syntactic units.
The infant appears early on to be identifying acoustic correlates of clauses (such as their
prosodic form—the way in which intonation rises and falls, and stress is distributed).

One of the major difficulties facing children learning language is how to segment fluent speech
they hear into words. Words run together in speech; they are rarely delineated from each other by
pauses. Young children probably make use of several strategies in order to be able to segment the
speech stream. Child-directed speech may help the child learn how to segment speech. For example,
carers put more pauses in between words in speech to young children than in speech to other
adults. Children are further aided by the great deal of information present in the speech stream.
Distributional information about phonetic segments is an important cue in learning to segment
speech (Cairns, Shillcock, Chater, & Levy, 1997; Christiansen, Allen, & Seidenberg,

begins with a sequence like /mp/ because this is not a legitimate string of sounds at the start
of English words. Similarly the sounds within words such as “laughing” and “loudly” frequently
co-occur by virtue of these being words; the sounds “ingloud” occur much less frequently
together—only when words like “laughing loudly” are spoken adjacently. This type of low
co-occurrence information provides a way of dividing the speech stream. On the other hand, the
sounds making up “mother” co-occur very frequently; hence the way in which sounds cluster
together is another important cue. Cairns et al. (1997) and Batchelder (2002) showed that it is
relatively straightforward to construct a computational model that learns to segment English and
other languages using distributional information. Of course, once a child has successfully
segmented a few words, it becomes progressively easier to segment the rest of the speech stream.
This idea of using a little information to uncover more of the same is known as bootstrapping—by
analogy to the idea of trying to pull yourself up by your own bootstraps. Bootstrapping is an
important theme in language acquisition. Batchelder’s computational model (called BootLex) shows
how useful bootstrapping is. Furthermore, infants do seem to be sensitive to this sort of
distributional information. Saffran, Aslin, and Newport (1996) found that 8-month-old infants
very quickly learn to discriminate words in a stream of syllables on the basis of which sounds
tend to occur together regularly. Once they have learned the words, they then listen longer to
novel stimuli than to the words presented in the stream of syllables. Children probably use both
divisional and clustering distributional information at some time.

Although children can segment speech on the basis of statistical information alone, their
performance is much better if they can make use of other types of information. Eight-month-old
babies also make use of speech-specific information, including phonotactic cues such as
co-articulation—the way in which sounds change in the presence of other sounds (Johnson
1998). Distributional information concerns the & Jusczyk, 2001; Mattys & Jusczyk, 2001).
way in which sounds co-occur in a language. For For example, Mattys and Jusczyk found that
example, we do not segment speech so that a word 9-month-old infants turned and looked longer
at the source of a sound producing consonant–vowel–consonant triplets with good phonotactic cues
to a word boundary than triplets without these cues. For example, the triplet “gaffe” stands out
more if it is preceded by “bean” (the good phonotactic cue) than “fang” (the neutral cue). A
single, isolated consonant is not a viable word; hence adults segment speech in such a way as to
avoid creating isolated consonants. Measuring the time children spent listening to stimuli,
Johnson, Jusczyk, Cutler, and Norris (2003) found that 12-month-old children use the same
strategy. Hence, from an early age children segment speech so as to avoid creating isolated units
that could not be words. In addition, very young infants also seem to be sensitive to the prosody
of language. Prosodic information concerns the pitch of the voice, its loudness, and the length of
sounds. Neonates prefer to listen to parental rather than non-parental speech. Using the sucking
habituation technique, Mehler et al. (1988) showed that infants as young as 4 days old can
distinguish languages from one another. Infants prefer to listen to the language spoken by their
parents. For example, six babies born to French-speaking mothers preferred to listen to French
rather than Russian. The likely explanation for this is that the child learns the prosodic
characteristic of the language in the womb. Sensitivity to prosody helps the infant to identify
legal syllables of their language (Altmann, 1997). After some months’ exposure to a language,
infants learn to make use of knowledge of lexical stress in identifying words; for example,
children growing up exposed to English adopt a stress initial syllable strategy, enabling them to
identify when a new word is starting (Curtin, Mintz, & Christiansen, 2005; Thiessen & Saffran,
2007).

Just because some mechanisms of speech perception are innate, it does not follow that they are
necessarily language- or even species-specific. All children need is a general-purpose learning
algorithm that helps them detect statistical regularities. Kuhl (1981) showed that chinchillas (a
type of South American rodent) display categorical perception of syllables such as “da” and “ta”
in the same way as humans do. The cotton-top tamarin, a type of New World monkey, can segment a
sequence of sounds based on distributional information, with some sequences being more common than
others, just like human infants (Hauser, Newport, & Aslin, 2001). However, even if animals can
perform these perceptual distinctions, it does not necessarily follow that the perceptual
mechanisms they employ are identical to those of humans, and, furthermore, humans possess language
abilities that go far beyond categorical perception and speech-stream segmentation.

Finally, for a while children actually regress in their speech perception abilities (Gerken,
1994): The ability of young children to discriminate sounds is worse than that of infants. In part
this regression might be an artifact of using more stringent tasks to test older children: Tests
for infants just involve discriminating new sounds from old ones, but tests for older children
require them to match particular sounds. It might also occur because of a change in focus of the
child’s language-perception system. Infants aged 14 months do not attend to fine phonetic detail
(e.g., “bih” versus “dih”) when learning new words, though children aged 8 months are capable of
discriminating these sounds in a perception task (Stager & Werker, 1997). When children know only
a few words, it might be possible to represent them in terms of rather gross characteristics;
indeed, limiting the amount of detail to which you need to attend might be advantageous. But as
children grow older and acquire more words, they are forced to represent words in terms of their
detailed sound structure. Hence, early on (perhaps up to a vocabulary size of about 50 words)
detailed sound contrasts are not yet needed by the child (Gerken, 1994). Perceptual skills,
experience, and the task at hand all interact to determine performance.

Young children quickly become very good at speech recognition. Children aged 18 months can
identify a large number of words without having to hear the whole word: the first 300 ms is
sufficient, as shown by studies looking at children’s eye movements to pictures of objects while
listening to speech (Fernald, Swingley, & Pinto, 2001). Once children have made a start on
segmentation, “bootstrapping” can come into play: they can use their existing knowledge to
facilitate the acquisition of new knowledge (Werker & Yeung, 2005). PRIMIR (Processing
Rich Information from Multidimensional Interactive Representations) is a model that emphasizes the
role of bootstrapping in early word learning (Werker & Curtin, 2005). Although children continue
to perceive phonetic variations in the speech stream, by 17 months old they have learned a
sufficient number of word–object pairings to enable them to focus on the phonological distinctions
that are important for distinguishing new words.

Babbling

From about the age of 6 months to 10 months, before infants start speaking, they make speech-like
sounds known as babbling. Babbling is clearly more language-like than other early vocalizations
such as crying and cooing, and consists of strings of vowels and consonants combined into
sometimes lengthy series of syllables, usually with a great deal of repetition, such as “bababa
gugugu,” sometimes with an apparent intonation contour. There are two types of babbling (Oller,
1980). Reduplicated babble is characterized by repetition of consonant–vowel syllables, often
producing the same pair for a long time (e.g., “bababababa”). Non-reduplicated or variegated
babble is characterized by strings of non-repeated syllables (e.g., “bamido”). Babbling lasts for
6–9 months, fading out as the child produces the first words. It appears to be universal: deaf
infants also babble (Sykes, 1940), although it is now known that they produce slightly different
babbling patterns. This suggests that speech perception plays some role in determining what is
produced in babbling (Oller, Eilers, Bull, & Carney, 1985). Across many languages, the 12 most
frequent consonants constitute 95% of babbled consonants (Locke, 1983), although babbling patterns
differ slightly across languages, again suggesting that speech perception determines some aspects
of babbling (de Boysson-Bardies, Halle, Sagart, & Durand, 1989; de Boysson-Bardies, Sagart, &
Durand, 1984).

What is the relation between babbling and later speech? The continuity hypothesis (Mowrer, 1960)
states that babbling is a direct precursor of language: in babbling the child produces all of the
sounds that are to be found in all of the world’s languages. This range of sounds is then
gradually narrowed down, by reinforcement by parents and others of some sounds but not others (and
by the lack of exposure to sounds not present within a particular language), to the set of sounds
in the relevant language. (The extreme version of this of course is the behaviorist account of
language development discussed earlier: Words are acquired by the processes of reinforcement and
shaping of random babbling sounds.) For example, a parent might give the infant extra food when he
or she makes a “ma” sound, and progressively encourages the child to make increasingly accurate
approximations to sounds and words in their language. There are a number of problems with the
continuity hypothesis. Many sounds, such as consonant clusters, are not produced at all in
babbling, and also parents are not that selective about what they reinforce in babbling: they
encourage all vocalization (Clark & Clark, 1977). Nor does there appear to be much of a gradual
shift towards the sounds particular to the language to which the child is exposed (Locke, 1983).

According to Mowrer (1960), babbling is a direct precursor of language. The range of babbling
sounds is gradually narrowed down over time by reinforcement by the carer of some sounds but not
others.

The discontinuity hypothesis states that babbling bears no simple relation to later development.
Jakobson (1968) postulated two stages in the development of sounds. In the first stage children
babble, producing a wide range of sounds that do not emerge in any particular order and that are
not obviously related to later development. The second
stage is marked by the sudden disappearance of many sounds that were previously in their
repertoires. Some sounds are dropped temporarily, re-emerging perhaps many months later, whereas
some are dropped altogether. Jakobson argued that it is only in this second stage that children
are learning the phonological contrasts appropriate to their particular language, and these
contrasts are acquired in an invariant order. However, the idea that from the beginning babbling
contains the sounds of all the world’s languages is not true: the early babbling repertoire is
quite limited (Hoff-Ginsberg, 1997). For example, the first consonants tend to be just the velar
ones (/k/ and /g/). Furthermore, although Jakobson observed that there was a silent period between
babbling and early speech, there is probably some overlap (Menyuk, Menn, & Silber, 1986). Indeed,
there seem to be some phonological sequences that are repeated that are neither clearly babbling
nor words. These can be thought of as protowords. Early words might be embedded in variegated
babble. There are preferences for certain phonetic sequences that are found later in early speech
(Oller, Wieman, Doyle, & Ross, 1976). This points to some continuity between babbling and early
speech.

Thus there is no clear evidence for either the continuity or the discontinuity hypothesis. What
then is the function of babbling? Perhaps babbling has a motor origin, for example, in practice at
gaining control over the articulatory tract (Clark & Clark, 1977; MacNeilage & Davis, 2000).
Perhaps infants are learning to produce the prosody of their language rather than particular
sounds (Crystal, 1986; de Boysson-Bardies et al., 1984). It is worth noting that children exposed
to sign language “babble” on their hands, reinforcing the view that there is a strong biological
drive to produce babble, and that babbling does more than enable motor control over the mouth and
jaw (Petitto & Marentette, 1991). Interestingly, hearing babies who are exposed just to sign
language produce a different pattern of sign-babbling from those exposed to sign language and
speech (Petitto, Holowka, Sergio, Levy, & Ostry, 2004). This difference suggests that babbling
does have some specifically linguistic component to its origin, allowing babies to discover how
sounds are related and contrasted to each other.

Later phonological development

Early speech uses fewer sounds compared with the babbling of just a few months before, but it
contains some sounds that were only rarely or not at all produced then (particularly clusters of
consonants, e.g., “str”). Words are also often changed after they have been mastered. Children
appear to be hypothesis testing, with each new hypothesis necessitating a change in the
pronunciation of words already mastered, either directly as a consequence of trying out a new
rule, or indirectly as a result of a shift of attention to other parts of the word.

Jakobson (1968) proposed that the way in which children learn the contrasts between sounds is
related to the sound structure of languages. For example, the sounds /p/ and /b/ are contrasted by
the time the vocal cords start to vibrate after the lips are closed. He argued that children learn
the contrasts in a universal order across languages. He also argued that the order of acquisition
of the contrasts is predictable from a comparison of the languages of the world: The phonological
contrasts that are most widespread are acquired first, whereas those that are to be found in only
a few languages are acquired last.

One weakness of this approach is that because the theory emphasizes the acquisition of contrasts,
other features of phonological development are missed or cannot be explained (Clark & Clark, 1977;
Kiparsky & Menn, 1977). For example, even when children have acquired the contrast between one
pair of voiced and unvoiced consonants (/p/ and /b/) and between a labial and a velar consonant
(/p/ and /k/), they are often unable to combine these contrasts to produce the voiced velar
consonant (/g/). So just knowing the contrasts does not seem to be enough. There are also
exceptions that counter any systematic simplification of a child’s phonological structure.
Children can often produce a word containing a particular phonological string when all other
similar words are simplified or omitted. For example, Hildegard could say the word “pretty” when
she simplified all her other words and used no consonant clusters (such as “pr”) at all (Clark &
Clark, 1977; Leopold, 1939–1949).
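
The point about combining contrasts can be made concrete with a small sketch. It assumes, purely for illustration, that each of the four stop consonants mentioned above is a bundle of two features, voicing and place of articulation; this two-feature encoding and the function names are hypothetical and are not Jakobson's own formalism. A learner who only needed the contrasts would be predicted to produce /g/ as soon as the /p/–/b/ and /p/–/k/ contrasts are mastered, which is exactly what children often fail to do.

```python
# Hypothetical two-feature encoding of four stop consonants (illustration only).
FEATURES = {
    "p": {"voiced": False, "place": "labial"},
    "b": {"voiced": True, "place": "labial"},
    "k": {"voiced": False, "place": "velar"},
    "g": {"voiced": True, "place": "velar"},
}


def contrast(a, b):
    """Return the names of the features on which two phonemes differ."""
    return {name for name in FEATURES[a] if FEATURES[a][name] != FEATURES[b][name]}


def predicted_inventory(mastered_pairs):
    """Phonemes that a purely contrast-based learner should be able to produce.

    Collects every feature value that occurs in the mastered pairs, then
    returns all phonemes built only from those feature values.
    """
    seen = {}
    for a, b in mastered_pairs:
        for phoneme in (a, b):
            for name, value in FEATURES[phoneme].items():
                seen.setdefault(name, set()).add(value)
    return sorted(
        phoneme
        for phoneme, feats in FEATURES.items()
        if all(value in seen.get(name, set()) for name, value in feats.items())
    )


mastered = [("p", "b"), ("p", "k")]
print(contrast("p", "b"))             # {'voiced'}
print(contrast("p", "k"))             # {'place'}
print(predicted_inventory(mastered))  # ['b', 'g', 'k', 'p']
```

On this toy encoding /g/ falls out of the two mastered contrasts automatically, so the observation that children often cannot yet produce it suggests that something beyond knowledge of the contrasts is involved.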
Output simplification

Young children simplify the words they produce (see Figure 4.3). Smith (1973) described four ways
in which children do this, with a general tendency towards producing shorter strings. Young
children often omit the final consonant, they reduce consonant clusters, they omit unstressed
syllables, and they repeat syllables. For example, “ball” becomes “ba,” “stop” becomes “top,” and
“tomato” becomes “mado.” Younger children often substitute easier sounds (such as those in the
babbling repertoire) for more difficult sounds (those not to be found in the babbling repertoire).
Simplification is found in all languages.

FIGURE 4.3 Children’s simplification of words: substitution of easier sounds for more difficult
sounds; omitting the final consonant; reducing consonant clusters; omitting unstressed syllables;
repeating syllables.

Why do young children simplify words? There are a number of possible explanations. The memory of
young children is not so limited that this degree of simplification is necessary (Clark & Clark,
1977). Children must have some representation of the correct sounds, because they can still
correctly perceive the sounds they cannot yet produce (Smith, 1973). Jakobson (1968) argued that
one reason why this happens is because the child has not yet learned the appropriate phonological
contrasts. For example, a child might sometimes produce “fis” instead of “fish” because he or she
has not yet mastered the distinction between alveolar and postalveolar fricatives (which captures
the distinction between the /s/ and /sh/ sounds). This explanation cannot be the complete story,
because there are too many exceptions, and because children are at least aware of the contrasts
even if they cannot always apply them.

A second explanation of output simplification is that children are using phonological rules to
change the perceived forms into ones that they can produce (Menn, 1980; Smith, 1973). As children
sometimes alternate between different forms of simplification, the rules they use would have to be
applied non-deterministically. A third possibility is that simplification is a by-product of the
development of the speech production system (Gerken, 1994). It is likely that all of these factors
play some role.

LEXICAL AND SEMANTIC DEVELOPMENT

Words are produced from the age of about 1 year. New words are added slowly in the first year, so
that by the age of 18–24 months the child has a vocabulary of about 50 words. Around this point
the vocabulary explosion occurs. Nelson (1973) examined the first 10 words produced by children
and found that the categories most commonly referred to were important person names, animals,
food, and toys. However, children differ greatly in their earliest words. Indeed, Nelson was able
to divide the children into two broad groups based on the types of early words produced: children
in the
“expressive style” group emphasize people and feelings, while children in the “referential style”
group emphasize objects. These differences probably arise for several reasons. Nelson argued that
they arise because of differences in what children think language is for: Children who think
language is primarily for labeling objects are likely to be referential, while those who think it
is for social interaction are likely to be more expressive. The differences also probably reflect
differences in language use by the parents; some parents spend a great deal of time producing
object labels for their children, and such children tend to fall in the referential style group
(Pine, 1994a). It was once thought that the referential style led to faster language development;
however, when you take into account factors such as vocabulary size and the age at which children
produce the first word (both types of children reach 50 words at the same age, but as the
referential children tend to produce their first word later, they appear to rush faster towards
that limit), there is no obvious difference in subsequent development (Bates et al., 1994;
Hoff-Ginsberg, 1997).

Some children’s first words tend to refer to objects (“referential”) whereas some children’s are
more likely to refer to people and feelings (“expressive”).

Greenfield and Smith (1976) found that early words may refer to many different roles, not just
objects, and further proposed that the first utterances may always name roles. For example, the
early word “mama” might be used to refer to particular actions carried out by the mother, rather
than to the mother herself. Generally, the earliest words can be characterized as referring either
to things that move (such as people, animals, vehicles) or things that can be moved (such as food,
clothes, toys). Moving things tend to be named before movable things. Places and the instruments
of actions are very rarely named.

There is some debate as to whether the earliest referential words may differ in their use and
representation from later ones (McShane, 1991). In particular, the child’s earliest use of
reference (what things refer to) appears to be qualitatively different from later use. The
youngest children name objects spontaneously or give names of objects in response to questions
quite rarely, in marked contrast to their behavior at the end of the second year.

It would be surprising if children got the meanings of words right every time. Consider the size
of the task facing very young children. A mother says to a baby sitting in a pram and looking out
of the window: “Isn’t the moon pretty?” How, from all the things in the environment, does the
child pick out the correct referent for “moon”? That is, how does the child know what the word
goes with in the world? It is not even immediately obvious that the referent is both an object and
an object the infant can see. Even when the child has picked out the appropriate referent,
substantial problems remain. He or she has to learn that “moon” refers to the object, not some
property such as “being silver colored” or “round.” What are the properties of the visual object
that are important? The child has to learn that the word “moon” refers to the same thing, even
when its shape changes (from crescent to full moon). The task, then, of associating names with
objects and actions is an enormous one, and it is surprising that children are as good at
acquiring language as they are. Errors are therefore only to be expected. Sentences (6) and (7)
are examples of errors in acquiring meaning from Clark and Clark (1977):

(6) Mother pointed out and named a dog “bow-wow.”
    Child later applies “bow-wow” to dogs, but also to cats, cows, and horses.
(7) Mother says sternly to child: “Young man, you did that on purpose.”
    When asked later what “on purpose” means, child says: “It means you’re looking at me.”

What are the features that determine the child’s first guess at the meaning of words? How do the
first guesses become corrected so that they converge on the way adults use words? The errors that
children make turn out to be a rich source of evidence about how they learn word meaning.

Clark and Clark (1977) argued that, in the very earliest stages of development, the child must
start with two assumptions about the purpose of language: Language is for communication, and
language makes sense in context. From then on they can form hypotheses about what the words mean,
and develop strategies for using and refining those meanings.

The emergence of early words

Children’s semantic development is dependent on their conceptual development. They can only map
meanings into the concepts they have available at that time. In this respect, linguistic
development must follow cognitive development. Of course, not all concepts may be marked by simple
linguistic distinctions. We don’t have different words for brown dogs as opposed to black dogs.
There must surely be some innate processes, if only to categorize objects, so the child is born
with the ability to form concepts. Quinn and Eimas (1986) suggest that categorization is part of
the innate architecture of cognition.

However, children’s early vocabularies cannot be predicted just on the basis of the words they
hear. Their vocabularies contain many more names for objects than are present in the speech
directed towards them (Bloom, 2001a). Clearly there is some bias in learning, and one of the goals
of understanding semantic development is to work out how this bias arises.

The first words emerge out of situations where an exemplar of the category referred to by the word
is present in the view of parent and child (see Chapter 3 on the social precursors of language).
However, there are well-known philosophical objections to a simple “look and name,” or ostensive
model of learning the first words (Quine, 1960). Ostensive means pointing; this conveys the idea
of acquiring simple words by a parent pointing at a dog and saying “dog,” and the child then
simply attaching the name to the object. The problem is simply that the child does not know which
attribute of input is being labeled. For all the child knows, it could be that the word “dog” is
supposed to pick out just the dog’s feet, or the whole category of animals, or its brown color, or
the barking sound it makes, or its smell, or the way it is moving, and so on. This is often called
the mapping problem. One thing that makes the task slightly easier is that adults stress the most
important words, and children selectively attend to the stressed parts of the speech they hear
(Gleitman & Wanner, 1982). Nevertheless, the problem facing the child is an enormous one.

After the first few words, vocabulary development is very fast and very efficient. Young children
are able to associate new words with objects after only one exposure, an ability called
fast-mapping. How can the child learn so quickly? Researchers have proposed a number of solutions
to the mapping problem.

Constraints on learning names for things

Perhaps the cognitive system is constrained in its interpretations? The developing child makes use
of a number of lexical principles to help to establish the meaning of a new word (Golinkoff,
Hirsh-Pasek, Bailey, & Wenger, 1992; Golinkoff, Mervis, & Hirsh-Pasek, 1994). The idea of lexical
principles as general constraints on how children attach names to objects and their properties is
an important one. Several main constraints have been proposed.
First, the cognitive system may be constrained so that it tends to treat ostensive definitions as
labels for whole objects. This is the whole-object assumption (Markman, 1990; Taylor & Gelman,
1988; Waxman & Markow, 1995). There is some evidence that adults are sensitive to this constraint.
Ninio (1980) found that adults talking to children almost wholly use ostensive definition to label
whole objects rather than parts or attributes. When adults deviate from this, they try to make it
clear, for example by mentioning the name of the whole object as well. Children make errors that
suggest that they are using this constraint. They sometimes think that adjectives are labels for
objects (e.g., thinking that “pretty” refers to a flower). Where does this important constraint
come from? In fact the whole-object bias is not limited to words (Bloom, 2001a). Prelinguistic
infants are strongly biased to split the world up into discrete objects (Spelke, 1994).

The taxonomic constraint is that a word refers to a category of similar things. For example, if a
child hears the word “cat” in the presence of a cat, they will first assume that the word labels
the whole cat (by the whole-object assumption) and then that all similar things will also be
called “cat” (Markman, 1989) (see Figure 4.4). Children prefer to use new words to associate
things that are taxonomically related rather than thematically related (e.g., a dog with dog
food), even though they often prefer to group things thematically in other circumstances (Markman
& Hutchinson, 1984). Of course, we still have to solve the problem of how children identify how
objects are taxonomically related. Children begin word learning expecting that new words pick out
commonalities between objects, and these commonalities are fine-tuned by further experience
(Waxman, 1999; Waxman & Booth, 2001). For example, 14-month-old children recognize that nouns and
adjectives are different types of word, and pick out different aspects of relations among objects
(membership of a category of similar objects or properties of objects; Waxman & Booth, 2001).

The taxonomic constraint (Markman, 1989) predicts that when a child hears a word in the presence
of an object, they will go on to label all similar things with that same word. Hence, a dog may be
called a “cat” and vice versa.
FIGURE 4.4 A significant problem for a child when learning a new word is that the thing it refers to can appear
in many different forms. For example, the word “building” can be used to name many different types of structure.
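
One way to picture how the first two constraints ease the mapping problem is as successive filters over a space of candidate meanings. The sketch below is only an illustration under stated assumptions: the candidate list for “dog,” the `Hypothesis` fields, and the two filter functions are hypothetical simplifications introduced here, not a model proposed by the researchers cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One candidate meaning for a new word heard in the presence of an object."""
    description: str
    kind: str   # "whole_object", "part", "property", or "thematic"
    scope: str  # "category" (all similar things) or "individual"


# Hypothetical candidate meanings for "dog", echoing the mapping problem.
CANDIDATES = [
    Hypothesis("the kind of thing this is (dogs)", "whole_object", "category"),
    Hypothesis("this one particular animal", "whole_object", "individual"),
    Hypothesis("the dog's feet", "part", "category"),
    Hypothesis("its brown color", "property", "category"),
    Hypothesis("the barking sound it makes", "property", "category"),
    Hypothesis("the dog together with its food", "thematic", "category"),
]


def whole_object(hypotheses):
    """Whole-object assumption: keep only labels for whole objects."""
    return [h for h in hypotheses if h.kind == "whole_object"]


def taxonomic(hypotheses):
    """Taxonomic constraint: keep labels for a category of similar things,
    excluding thematic groupings such as a dog with dog food."""
    return [h for h in hypotheses if h.scope == "category" and h.kind != "thematic"]


remaining = taxonomic(whole_object(CANDIDATES))
for h in remaining:
    print(h.description)  # the kind of thing this is (dogs)
```

In this toy setting the two filters reduce six candidate meanings to one. The sketch says nothing about how the child decides which things count as similar, which remains an open problem, as noted above.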
A third possible constraint is the mutual exclusivity assumption, whereby each object can only
have one label (Markman & Wachtel, 1988): That is, (unilingual) children do not usually like more
than one name for things.

As children acquire words, new strategies become available. For example, they may be biased to
assign words to objects for which they do not already have names (the novel name–nameless category
or N3C principle; Mervis & Bertrand, 1994). There are syntactic cues to meaning; if we talk about
“I see Wolf” we are probably talking about a proper noun, but if we say “I see the wolf” we are
talking about a common noun (Bloom, 2001a). Later on, when children’s vocabulary is larger and
their linguistic abilities more sophisticated, explicit definition becomes possible. Hence
superordinate and subordinate terms can be explicitly defined by constructions such as “Tables,
chairs, and sofas are all types of furniture.”

Other solutions to the mapping problem

Other solutions have been proposed to the mapping problem. There might be an innate basis to the
hypotheses children make (Fodor, 1981): We might have evolved such that we are more likely to
attach the word “dog” to the object “dog,” rather than to its color, or some even more abstruse
concept such as “the hairy thing I see on Mondays.”

It is likely that social factors play an important role in learning the meanings of early words.
Joint attention between adult and infant is an important factor in early word learning. Parents
usually take care to talk about what their children are interested in at the time. Even at 16
months of age, children are sensitive to what the speaker is attending to and can work out whether
novel labels refer to those things (Baldwin, 1991; Woodward & Markman, 1998). Early words may be
constrained so that they are only used in particular discourse settings (Levy & Nelson, 1994;
Nelson, Hampson, & Shaw, 1993). The social setting is important in learning new words as a
supplement or an alternative to innate or lexical constraints. Tomasello (1992b) argued that
social and pragmatic factors could have an important influence on language development. The
problem of labeling objects would be greatly simplified if the adult and child establish through
any available communicative means that the discourse is focusing on a particular dimension of an
object. For example, if it has been established that the domain of discourse is “color,” then the
word “pink” will not be used to name a pig, but its color. Adults and children interact in
determining the focus of early conversation. Tomasello and Kruger (1992) demonstrated the
importance of pragmatic and communicative factors. They showed that young children are
surprisingly better at learning new verbs when adults are talking about actions that have yet to
happen than when the verbs are used ostensively to refer to actions that are ongoing. This must be
because the impending action contains a great deal of pragmatic information that the infant can
use, and the infant’s attention can be drawn to this. In summary, the social setting can serve the
same role as innate principles in enabling the child to determine the reference without knowing
the language. Joint attention with adults, or intersubjectivity, is an essential component of
learning a language, particularly early in development. Variability in experience of joint
attention at 9–18 months may be one of the most important determinants of variability in early
lexical development. Nevertheless, there is a limit to what social-pragmatic factors and joint
attention can achieve, and as the child gets older the availability and nature of the linguistic
input become increasingly important (Hoff & Naigles, 2002). In a study of 63 children, Hoff and
Naigles found that, at the age of 24 months, variation in the extent to which mother and child
mutually engage in conversation has little effect on the richness of the vocabulary of the child;
on the other hand, variation in the lexical richness and syntactic complexity of the mother’s
utterances does have an effect.

Children appear to vary in the importance they assign to different concepts, and this leads to
individual differences and preferences for learning words. The first use of “dog” varies
from four-legged mammal-shaped objects, to
all furry objects (including inanimate objects such as coats and hats), to all moving objects (Clark & Clark, 1977). In each case the same basic principle is operating: a child forms a hypothesis about the meaning of a word and tries it out. The hypotheses formed differ from child to child.

Brown (1958) was among the earliest to suggest that children start using words at what was later known as the basic level (see Chapter 11). The basic level is the default level of usage. For example, “dog” is a more useful label than “animal” or “terrier.” The bulk of early words are basic-level terms (Hall, 1993; Hall & Waxman, 1993; Richards, 1979; Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976). Superordinate concepts, above the basic level, seem particularly difficult to acquire (Markman, 1989). Taxonomic hierarchies begin to develop only after the constraint biasing children to acquire basic-level terms weakens. Later on, particular cues become important. Mass nouns (which represent substances or classes of things, such as “water” or “furniture”) in particular seem to aid children in learning hierarchical taxonomies, as they often flag superordinate category names (Markman, 1985, 1989). As such, they are syntactically restricted, which is apparent when we try to substitute one for another. Hence although we can say “this is a table,” it is incorrect to say “this is a furniture”; similarly “this is a ring” but not “this is a jewelry”; and “this is a dollar” but not “this is a money.”

The properties of objects themselves might constrain the types of label that are considered appropriate for them. Soja, Carey, and Spelke (1992) argued that the sorts of inferences children make vary according to the type of object being labeled. For example, if the speaker is talking about a solid object, the child assumes the word is the name of the whole object, but if the speaker is talking about a non-solid substance, then the child infers that the word is the name of parts or properties of the substance.

Finally, there are syntactic cues to word meaning. Brown (1958) proposed that children may use part-of-speech as a cue to meaning. For example, 17-month-olds are capable of attending to the difference between noun phrase syntax as in “This is Sib” and count noun syntax as in “This is a sib.” This is obviously a useful cue for determining whether the word is a proper name or stands for a category of things. The ability to use syntactic knowledge to learn meaning is called syntactic bootstrapping (Gleitman, 1990; Gleitman, Cassidy, Nappa, Papafragou, & Trueswell, 2005; Landau & Gleitman, 1985; Lidz, Gleitman, & Gleitman, 2003). Children use the structure of the sentences they hear in combination with what they perceive in the world to interpret the meanings of new words. For example, they use the syntax to help them infer the meanings of new verbs by working out the types of relation that are permissible between the nouns involved (Naigles, 1990). For instance, suppose a child does not understand the verb “bringing” in the sentence “Are you bringing me the doll?” The syntactic structure of the sentence suggests that “bring” is a verb whose meaning involves transfer, thus ruling out possible contending meanings such as “carrying,” “holding,” or “playing.” Even children as young as 2 years old can use information about transitive and intransitive verbs to infer the meanings of verbs (Naigles, 1996).

There are a number of reasons why some words are easier to learn than others. First, and most obviously, children are exposed to some words more often in the language and in the environment. Second, some concepts might be more accessible. Conceptual structures change as the child develops, and understanding words like “know,” “think,” and “believe” might depend on the child having a sophisticated conceptual structure and a theory of mind (Gopnik & Meltzoff, 1997; Huttenlocher, Smiley, & Charney, 1983). Third, the information change model says that the type of information available to the child changes and increases over time, and not all words are acquired in the same way (Gleitman et al., 2005). Of course all of these factors might operate, although Gleitman et al. argue that information change is more important than conceptual change; certain words and syntactic structures have to be learned before others can be successfully acquired.
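The syntactic cue that distinguishes “This is Sib” from “This is a sib” can be made concrete with a small sketch. The code below is only a toy illustration, not a model from the literature: the determiner list, the function name, and the category labels are all assumptions made for the example.

```python
# Toy illustration only: the determiner list and the labels are assumptions.
DETERMINERS = {"a", "an", "the"}

def classify_novel_word(utterance, novel_word):
    """Guess whether a novel word is being used as a proper name or a count noun."""
    tokens = [t.strip('.,!?"').lower() for t in utterance.split()]
    target = novel_word.lower()
    if target not in tokens:
        return "unknown"
    position = tokens.index(target)
    # A determiner directly before the word signals count noun syntax.
    if position > 0 and tokens[position - 1] in DETERMINERS:
        return "count noun"
    return "proper name"

print(classify_novel_word("This is Sib.", "Sib"))    # proper name
print(classify_novel_word("This is a sib.", "sib"))  # count noun
```

The sketch uses only one cue, the presence of a determiner; real learners presumably combine several sources of information.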
Evaluation of work on how children acquire early names

Approaches that make use of constraints on how children relate words to the world have some problems. First, we are still faced with the problem of where these constraints themselves come from. Are they innate, and part of the language acquisition device? Second, they are biases rather than constraints, as children sometimes go against them (Nelson, 1988, 1990). In particular, very early words (those used before the vocabulary explosion) often violate the constraints (Barrett, 1986). For example, Bloom (1973) noted that a young child used “car” to refer to cars, but only when watched from a certain location. The constraints only appear to come into operation at around 18 months, which is difficult to explain if they are indeed innate or a component of the language acquisition device. (It is of course possible that the attainment of the concept of object permanence interacts with this.) Third, whereas it is relatively easy to think of constraints that apply to concrete objects and substances, it is less easy to do so for abstract objects and actions.

Nelson (1988, 1990) argued that language development is best seen as a process of social convergence between adult and child, emphasizing communicability. The role of social and pragmatic constraints in early acquisition might have been greatly underestimated. In conclusion, it is likely that a number of factors play a role in how children come to name objects.

Errors in representing meaning

One useful way of discovering how children acquire meaning is to examine the errors children make. Children’s early meanings can relate to adult meanings in four ways: the early meaning might be exactly the same as the adult meaning; it might overlap but go beyond it; it might be too restricted; or there might be no overlap at all. Words that have no overlap with adult usage get abandoned very quickly: Bloom (1973) observed that in the earliest stages of talking, inappropriate names are sometimes used for objects and actions, but these are soon dropped, because words that have no overlap in meaning with the adult usage are likely to receive no reinforcement in communication.

Over-extensions and under-extensions

E. Clark (1973) was one of the first researchers to look at over-extensions (sometimes called over-generalizations) in detail. When children over-extend a word, they use it in a broader way than the adult usage. Table 4.2 gives
TABLE 4.2 Examples of over-extensions (based on Clark & Clark, 1977).
Object               Domain of application
moon                 cakes, round marks on window, round postcards, letter “O”
ball                 apples, grapes, eggs, anything round
bars of cot          toy abacus, toast rack with parallel bars, picture of columned building
stick                cane, umbrella, ruler, all stick-like objects
horse                cow, calf, pig, all four-legged animals
toy goat on wheels   anything that moves
fly                  specks of dirt, dust, all small insects, toes
scissors             all metal objects
sound of train       steaming coffee pot, anything that makes a noise

some examples of early over-extensions. Over-extensions are very common in early language and appear to be found across all languages. Rescorla (1980) found that one third of the first 75 words were over-extended, including some early high-frequency words.

As we can observe from Table 4.2, over-extensions are often based on perceptual attributes of the object. Although shape is particularly important, the examples show that over-extensions are also possible on the basis of the properties of movement, size, texture, and the sound of the objects referred to. Although Nelson (1974) proposed that functional attributes are more important than perceptual ones, Bowerman (1978) and E. Clark (1973) both found that appearance usually takes precedence over function. That is, children over-extend based on a perceptual characteristic such as shape even when the objects in the domain of application clearly have different functions.

McShane and Dockrell (1983) pointed out that many reports of over-extensions failed to distinguish persistent from occasional errors. They argued that occasional errors tell us little about the child’s semantic representation, perhaps arising only from filling a transient difficulty in accessing the proper word with the most available one. Such transient over-generalizations are more akin to adult word substitution speech errors (see Chapter 13), and as such would tell us little about normal semantic development. Hence it is important to show that words involved in real over-extensions are permanently over-extended, and also that the same words are over-extended in comprehension. If a word is over-extended because the representation of its meaning is incomplete, the pattern of comprehension of that word by the child should reflect this. To this end, Thomson and Chapman (1977) showed that young children over-extended the meanings of words in comprehension as well as in production. They found that many words that were over-extended in production by a group of 21- to 27-month-old children were also over-extended in comprehension. However, not all words that were over-extended in production were over-extended in comprehension. Most children chose the appropriate adult referent for about half the words they over-extended in production.

There is some controversy surrounding these findings. Fremgen and Fay (1980) argued that the results of Thomson and Chapman (1977) were an experimental artifact. They pointed out that the children were repeatedly tested on the same words, and this might have led to the children changing their response either out of boredom or to please the experimenter. When Fremgen and Fay tested children only once on each word, they failed to find comprehension over-extensions in words over-extended in production. The situation is complex, however, as Chapman and Thomson (1980) showed that in their original sample there was no evidence of an increase in the number of over-extensions across trials, which would have been expected if Fremgen and Fay’s hypothesis was correct. Behrend (1988) also found over-extensions in comprehension in children as young as 13 months.

Clark and Clark (1977) hypothesized that over-extensions develop in two stages. In the earliest stage, the child focuses on an attribute, usually perceptual, and then uses the new word to refer to that attribute. However, with more exposure they realize that the word has a more specific meaning, but they do not know the other words that would enable them to be more precise. In this later stage, then, they use the over-extended word rather as shorthand for “like it.” Hence the child might know that there is more to being a ball than being round, yet when confronted with an object like the moon, not having the word “moon” they might call it “ball,” meaning “the-thing-with-the-same-shape-as-a-ball.”

We should also bear in mind that, like adults, children might sometimes just make mistakes. They might be using words as an analogy: the moon is like a ball. Or they might just be being mischievous (Bloom, 2001a).

Under-extensions occur when words are used more specifically than their meaning, such as using the word “round” to refer only to balls. The number of under-extensions might be dramatically under-recorded, because usually the construction will appear to be true. For example, if a child points at the moon and says “round,” this utterance is clearly correct, even if the child thinks that this is the name of the moon.
Three types of theory have been proposed to account for these data. The accounts are all based on the idea that over-extensions occur because of a lexical representation that is incomplete compared to that of the adult, whereas under-extensions occur because the developing representation is more specific than that of the adult.

The semantic feature hypothesis (E. Clark, 1973) is based on a decompositional theory of lexical semantics. This approach states that the meaning of a word can be specified in terms of a set of smaller units of meaning, called semantic features (see Chapter 11). Over- and under-extensions occur as a result of a mismatch between the features of the word as used by the child compared with the complete adult representation. The child samples from the features, primarily on perceptual grounds. Over-extensions occur when the set of features is incomplete; under-extensions occur when additional spurious features are developed (such as the meaning of “round” including something like [silvery white and in the sky]). Semantic development consists primarily of acquiring new features and reducing the mismatch by restructuring the lexical representations until the features used by the adult and child converge. The features are acquired in an order from most to least general.

Atkinson (1982) and Barrett (1978) discussed problems with this approach. Any theory of lexical development based on a semantic feature theory of meaning will inherit the same problems as the original theory, and there are serious problems with the semantic feature theory (see Chapter 11). In particular, we must be able to point to plausible, simple features in all domains, and this is not always easy, even for the kind of concrete objects and actions that young children talk about. Atkinson (1982) in particular pointed to the central problem that the features proposed to account for the data are arbitrary. The developmental theory cannot easily be related to any plausible general semantic theory, or to an independent theory of perceptual development.

In Nelson’s (1974) functional core hypothesis, generalization is not restricted to perceptual similarity; instead, functional features are also emphasized. In other respects this is similar to the featural account and suffers from the same problems.

The prototype hypothesis (Bowerman, 1978) states that lexical development consists of acquiring a prototype that corresponds to the adult version.
Box 4.5 Theoretical accounts of over- and under-extensions

Semantic feature hypothesis (E. Clark)
• The meaning of words can be specified in terms of smaller units of meaning (“semantic features”)
• When there is a mismatch between features of the word used by the child and the complete representation used by the adult, an over- or under-extension occurs
• Over-extensions occur when a set of features is incomplete
• Under-extensions occur when additional spurious features are developed
• Semantic development involves acquiring new features and reducing mismatch between adult and child features
• Features are acquired from the most general to the least general

Functional core hypothesis (Nelson)
• Generalization is not restricted to perceptual similarity; functional features are also emphasized
• In other ways, similar to the semantic feature hypothesis

Prototype hypothesis (Bowerman)
• A prototype is an average member of a category
• Lexical development consists of acquiring a prototype that corresponds to the adult version
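The feature-mismatch idea behind the semantic feature hypothesis can be illustrated with a short sketch that treats a word meaning as a set of features. This is only one simple reading of the hypothesis under stated assumptions: the feature names, the adult entries, and both function names are invented for the example and do not come from Clark (1973).

```python
# Hypothetical adult feature sets, invented purely for illustration.
ADULT_FEATURES = {
    "ball": {"round", "small", "toy", "can be thrown"},
    "round": {"round"},
}

def classify_mismatch(child_features, adult_features):
    """Compare a child's feature set for a word with the adult's."""
    missing = adult_features - child_features    # features not yet acquired
    spurious = child_features - adult_features   # extra features absent from the adult meaning
    if not missing and not spurious:
        return "adult-like"
    if missing and not spurious:
        return "over-extension"
    if spurious and not missing:
        return "under-extension"
    return "mixed mismatch"

def applies_to(word_features, object_properties):
    """A word applies to an object if the object has every feature in the word's set."""
    return word_features <= object_properties

moon = {"round", "silvery white", "in the sky"}
child_ball = {"round"}                                  # incomplete set
child_round = {"round", "silvery white", "in the sky"}  # spurious extra features

print(classify_mismatch(child_ball, ADULT_FEATURES["ball"]))    # over-extension
print(applies_to(child_ball, moon))                             # True: the moon gets called "ball"
print(classify_mismatch(child_round, ADULT_FEATURES["round"]))  # under-extension
```

With an incomplete set the child's word covers more objects than the adult's; with spurious extra features it covers fewer. The sketch inherits the difficulty noted in the text: the features themselves have to be stipulated.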
A prototype is an average member of a category. Over-extensions can probably be explained better in terms of concept development and basic category use. Kay and Anglin (1982) found prototypicality effects in over- and under-extensions. The more an object was prototypical of a category, the more likely it was that the conceptual prototype name would be extended to include the object. Words are less likely to be extended for more peripheral category members. This suggests that the concepts are not fully developed but clustered around just a few prototypical exemplars. Once again, a significant problem with this approach is that it inherits the problems of the semantic theory on which it is founded (see Chapter 11).

In summary, the strengths and weaknesses of these developmental theories are the same as those of the corresponding adult theories. There is surely scope for connectionist modeling here, which may yet show that a variant of the semantic feature hypothesis is along the right lines.

The contrastive hypothesis

Once children have a few names for things, how do they accommodate the many new words to which they are exposed? Barrett (1978) argued that the key features in learning the meaning of a word are those that differentiate it from related words. For example, the meaning of “dog” is learned by attaining the contrast between dogs and similar animals (such as cats) rather than simply by learning the important features of dogs. In the revised version of this model (Barrett, 1982), although contrasts are still important, they are not what are acquired first. Instead words are initially mapped onto prototypical representations; the most salient prototypical features are used to group the word with words sharing similar features, and contrastive features are then used to distinguish between semantically similar words.

This emphasis on contrast has come to be seen as very important (E. Clark, 1987, 1993, 1995). The contrastive hypothesis is a pragmatic principle that simply says that different words have different meanings. It is very similar to the lexical constraint of mutual exclusivity. However, the child is still faced with significant problems. When new word meanings are acquired, because features are contrasted with the features of existing word meanings, the meaning should not overlap with that of existing words: the new word’s meaning should fill a gap. Children do not like two labels for the same thing.

Unfortunately, young children are sometimes happy with two labels for the same object (Gathercole, 1987). Contrast appears to be used later rather than earlier as an organizing principle of semantic development. Neither is it likely to be the only principle driving semantic development. There comes a point when it is no longer useful for semantic development to make a contrast (for example, between black cats and white cats), and the contrastive hypothesis says nothing about this. It seems just as likely that when children hear someone use a new word, they assume it must refer to something new because otherwise the speaker would have used the original word instead (Gathercole, 1989; Hoff-Ginsberg, 1997).

Summary of work on early semantic development

It is unlikely that only one principle is operating in semantic development. Children have to learn appropriate contrasts between words, but they must not learn inappropriate or too many contrasts. As this is just the sort of domain where the learning of regularities and the relation between many complex inputs and outputs is important, computational modeling should make a useful contribution here; however, as yet there has been no research on this topic. One obvious problem is that it is most unclear how to model the input to semantic development. How should the salient perceptual and functional attributes of objects and actions be encoded? Finally, we should not underestimate the importance of the social setting of language development.

The later development of meaning

Children largely stop over-extending at around the age of 2½ years. At this point they start asking questions such as “What’s the name of
that?” and vocabulary develops quickly from then on. From this point, a good guide to the order of acquisition of words is the semantic complexity of the semantic domain under consideration. Words with simpler semantic representations are acquired first. For example, the order of acquisition of dimensional terms used to describe size matches their relative semantic complexity. These terms are acquired in the sequence shown in (8):

(8) big–small
    tall–short, long–short
    high–low
    thick–thin
    wide–narrow, deep–shallow

“Big” and “small” are the most general of these terms, and so these are acquired first. “Wide” and “narrow” are the most specific terms, and are also used to refer to the secondary dimension of size, and hence these are acquired later on. The other terms are intermediate in complexity and are acquired in between (Bierwisch, 1970; Clark & Clark, 1977; Wales & Campbell, 1970).

Nouns are acquired more easily than verbs. One explanation for this might be that verbs are more cognitively complex than nouns, in that whereas nouns label objects, verbs label relations between objects (Gentner, 1978). An alternative although related view is that their acquisition depends on the prior acquisition of some nouns and some information about how syntax operates at the clause level (Gillette, Gleitman, Gleitman, & Lederer, 1999). That is, verb acquisition depends on acquiring knowledge about linguistic context. Gillette et al. presented adults with video clips of adults speaking to children. Some words on the soundtrack were replaced with beeps or made-up words such as “gorp.” The adults had to identify the meanings of the beeps and made-up words. The extralinguistic context was surprisingly uninformative: adults found it quite difficult to identify the meanings of words on the basis of environmental information alone. They were particularly poor, however, at identifying verbs relative to nouns, and extremely bad at identifying verbs relating to mental states (e.g., “think,” “see”). Performance increased markedly when syntactic cues were available. As an example, the “gorp” in “Vlad is gorping” is more likely to mean “sneeze” than “kick,” but in “Vlad is gorping the snaggle” it is more likely to mean “kick” than “sneeze.” In summary, environmental context might be less powerful than was once thought, while linguistic context provides powerful cues. Verbs are more difficult to acquire than nouns because of their greater reliance on complex linguistic context. Later semantic development sees much interplay between lexical and syntactic factors.
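The way a syntactic frame narrows down the meaning of a made-up verb such as “gorp” can be sketched as follows. This is a deliberately crude illustration and not the procedure used by Gillette et al. (1999): the two-entry lexicon, the function names, and the rule that any word after the verb signals a transitive frame are all assumptions made for the example. The text describes the cue as making one meaning more likely; the sketch simplifies this to an all-or-none filter.

```python
# Hypothetical mini-lexicon: the frames each candidate meaning is assumed to allow.
CANDIDATE_FRAMES = {
    "sneeze": {"intransitive"},
    "kick": {"transitive"},
}

def frame_of(utterance, novel_verb_form):
    """Label the frame transitive if any word follows the novel verb (a crude simplification)."""
    tokens = [t.strip('.,!?"').lower() for t in utterance.split()]
    position = tokens.index(novel_verb_form.lower())  # raises ValueError if the verb is absent
    return "transitive" if position < len(tokens) - 1 else "intransitive"

def plausible_meanings(utterance, novel_verb_form):
    """Keep only the candidate meanings compatible with the observed frame."""
    frame = frame_of(utterance, novel_verb_form)
    return sorted(m for m, frames in CANDIDATE_FRAMES.items() if frame in frames)

print(plausible_meanings("Vlad is gorping", "gorping"))              # ['sneeze']
print(plausible_meanings("Vlad is gorping the snaggle", "gorping"))  # ['kick']
```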
[Photograph caption: At around the age of 2½ years, children start to ask questions such as “What’s the name of that?” This marks the onset of a period of accelerated vocabulary development.]

Does comprehension always precede production?

Comprehension usually precedes production for the obvious reason that the child has to more or less understand (or think they understand) a concept before producing it. Quite often contextual cues are strong enough for the child to get the gist of an utterance without perhaps being able to understand the details. In such cases there is no question of the child being able to produce language immediately after being first exposed to a particular word or structure. Furthermore, as we have seen, even when a child starts producing a word or structure, it might not be used in the same way as an adult would use it (e.g., children over-extend words). There is more to development than a simple lag, however. The order of comprehension and
production is not always preserved: words that are comprehended first are not always those that are produced first (Clark & Hecht, 1983). Early comprehension and production vocabularies may differ quite markedly (Benedict, 1979). There are even cases of words being produced before there is any comprehension of their meaning (Leonard, Newhoff, & Fey, 1980).

SYNTACTIC DEVELOPMENT

We have seen that a stage of single-word speech (called holophrastic speech) precedes a stage of two-word utterances. After this, early speech is telegraphic, in that grammatical morphemes may be omitted. We can broadly distinguish between continuous and discontinuous theories. In continuous theories, children are believed to have knowledge of grammatical categories from the very earliest stages (e.g., Bloom, 1994; Brown & Bellugi, 1964; Menyuk, 1969; Pinker, 1984). The child’s goal is to attach particular words to the correct grammatical categories, and then use them with the appropriate syntactic rules. In discontinuous theories, early multiword utterances are not governed by adult-like rules (Bowerman, 1973; Braine, 1963; Maratsos, 1983). Theoretical approaches also vary depending on the extent to which they emphasize the semantic richness of the early utterances.

How do children learn syntactic categories?

One of the most basic requirements of understanding and using language is identifying the major syntactic categories to which words belong. Is a word a noun, a verb, an adverb, or an adjective? How do children learn these categories, and which words belong to them?

Are syntactic categories innate?

How do children begin to work out the meaning of what they hear before they acquire the rules of the grammar? Accounts differ in the extent to which they posit the need for innate knowledge. One important approach says that knowledge about the basic syntactic categories is innate (Pinker, 1984, 1989). Children know that nouns refer to objects and verbs refer to actions. Pinker argued that the child first learns the meaning of some content words, and uses these to construct semantic representations of some simple input sentences. With the surface structure of a sentence and knowledge about its meaning, the child is in a position to make an inference about its underlying structure. Children start off with their innate knowledge of syntactic categories and a set of innate linking rules that relate them to the semantic categories of thematic roles. Thematic roles are a way of labeling who did what in a sentence. For example, in the sentence “Vlad kissed Agnes,” Vlad is the agent (the person or thing initiating the action) and Agnes the patient (the person or thing being acted on by the agent). An innate linking rule relates the syntactic categories of subject and object to the semantic categories of agent and patient, respectively. So on exposure to language, all the child has to do is identify the agents in utterances, and this information then provides knowledge about the syntactic structure. This process is known as semantic bootstrapping (see Figure 4.5).

FIGURE 4.5 Semantic bootstrapping theory (Pinker, 1984, 1989). Flowchart with four steps:
1. Child has innate knowledge of syntactic categories and linking rules
2. Child learns meaning of some content words
3. Child uses these to construct semantic representations of some simple input sentences
4. Semantic bootstrapping takes place where child makes an inference about underlying structure of sentence based on its surface structure and knowledge about its meaning

Although nativist accounts have the advantage of providing a simple explanation for many
otherwise mysterious phenomena, they have a number of disadvantages. The predictions they make are not always borne out by the data.

First, the theory depends on the child hearing plenty of utterances early on that contain easily identifiable agents and actions relating to what the child is looking at that can be mapped onto nouns and verbs. However, it can sometimes be very difficult to work out the meaning of new words, particularly verbs (Gillette et al., 1999; Gleitman, 1990).

Second, Bowerman (1990) showed that there was little difference in the order of acquisition of verbs that the semantic bootstrapping account predicts should be easiest for children to map onto thematic roles, compared with those that should be more difficult. For example, verbs where the theme maps onto the subject (as is the case with many verbs, such as “fall,” “chased”) should be easier to acquire than verbs where the location, goal, or source maps onto the subject and the theme onto the object (such as “have,” “got,” and “lose”). Instead Bowerman, in an analysis of the speech of her two children, Christy and Eva, found that the two types of verb are acquired at the same time. In general, children do not produce sentences corresponding to the basic structure “agent–action–patient” any earlier than other types of structure.

Third, Braine (1988a, 1988b), in detailed reviews of Pinker’s theory, questioned the need for semantic bootstrapping, and examined the evidence against the existence of very early phrase-structure rules. He argued that semantic information is sufficient for children to be able to learn syntactic categories.

Finally, postulating the possession of specific innate knowledge is very powerful, perhaps too powerful. After all, the processes of language development are slow and full of errors. There is a fine balance to be struck: the developmental system must be innately constrained, as Pinker proposed, and yet unconstrained enough to accommodate all these false starts.

Does semantics come first?

From the constructivist-semantic or meaning-first view, grammatical classes are first constructed on a semantic basis (e.g., Gleitman, 1981; Macnamara, 1972, 1982), which means that the very earliest stages of language development are asyntactic (Goodluck, 1991). A gross distinction is that nouns correspond to objects, adjectives to attributes, and verbs to actions. But although many nouns do indeed refer to objects, others are used to refer to salient abstract concepts (e.g., “sleep,” “truth,” “time,” “love,” “happiness”). So one of the major failings of a semantic approach to early grammar is that semantics alone cannot provide a direct basis for syntax. It is possible, however, that early semantic categories could underlie syntactic categories (McShane, 1991); after all, children learn about objects before they learn about truth and time. Perhaps the category of “noun” is based on a semantic category of objecthood (Gentner, 1982; Slobin, 1981).

According to Schlesinger’s (1988) semantic assimilation theory (see Figure 4.6), early semantic categories develop into early syntactic categories without any abrupt transition. At an early age children use an “agent–action” sentence schema. This can be used to analyze new NP–VP sequences. The important point is that it is possible to give an account of early syntactic development without having to assume that syntactic categories are innate.

FIGURE 4.6 Semantic assimilation theory (Schlesinger, 1988). Flowchart with four steps:
1. No innate structures
2. Early semantic categories
3. Early syntactic categories
4. Child uses an agent–action sentence schema to analyze new NP–VP sequences

Macnamara (1972) proposed that the child focuses at first on individual content words so that a small lexicon is acquired. Information pertaining to word order is ignored at this stage. The child combines the meanings of the individual words with the context to determine the speaker’s intended meaning. For example, a child who
sees Mommy drop a ball knows the meaning of the words “Mommy,” “drop,” and “ball,” and on hearing the sentence “Mommy dropped the ball” can work out the intended meaning of that utterance. In doing so, the child can also take the first steps towards mapping words onto roles in sentences. One of the earliest observations is that the default sentence order (in English at least, as we have seen) is subject (or agent), action, and object (or person or thing acted on). The nature of child-directed speech (in referring to the here-and-now and using syntactically simplified constructions) facilitates this process.
The acquisition of verbs is more difficult to account for in this way. Although many verbs do describe actions, a large number of important early verbs do not (e.g., “love,” “think,” “want,” “need,” “see,” “stop”). Many early verbs refer to states, but many adjectives also describe states (e.g., “hungry,” “nice”). Hence, if the early syntactic prototype for verbs is based on the semantic notions of actions and states, one might occasionally expect errors where adjectives get used as verbs (e.g., “I hungries”). However, such constructional errors are never found (McShane, 1991). Therefore it seems unlikely that children are inducing the early verb concept from a pure semantic notion.
Maratsos (1982) proposed that early syntactic categories are formed on the basis of shared grammatical properties. For example, in English nouns can occupy first positions in declarative sentences. Once one category has been formed, bootstrapping facilitates the acquisition of subsequent ones: adjectives come before and modify or specify nouns, verbs come between nouns, and so on. Maratsos also proposed that the types of modifications that a word can undergo indicate its syntactic category. For example, if a word can be modified by adding “-ed” and “-ing” to the end, then it must be a verb. Bates and MacWhinney (1982) proposed that abstract nouns later become assimilated to the category because the words behave in the same way as the more typical nouns; for example, they occupy the same sorts of positions in sentences. That is, children might again be making use of distributional information.

Distributional analysis
An alternative view has emerged that children can acquire syntactic categories from a very early age with very little or no semantic information (Bloom, 1994; Levy & Schlesinger, 1988). This approach exemplifies how children might view language as a rule-governed “puzzle” that has to be solved. Children as young as 2 easily acquire gender inflections in languages such as Hebrew, even though these syntactic constructions have very little semantic basis and contribute little to the meaning of the message (Levy, 1983, 1988). Gender may play an important role in marking word boundaries, and may be particularly prominent to children if they are viewing language as a puzzle. Children acquiring Hebrew attend to syntactic regularities before they attend to semantic regularities (Levy, 1988). Syntactic cues are far more effective than semantic cues for acquiring the distinction between count nouns (which can represent single objects, such as “broomstick”) and mass nouns (e.g., “water”). It is possible to say “a broomstick,” but not “a furniture”; similarly we can say “much furniture,” but not “much broomstick.” We can form plurals of count nouns (“broomsticks” is acceptable) but not of mass nouns (“furnitures” is not acceptable). Children seem to acquire the distinction not by noting that count and mass nouns can correspond to objects versus substances, but by making use of these syntactic cues (Gathercole, 1985; Gordon, 1985). Children do not miscategorize nouns whose semantic properties are inappropriate, but instead make use of the syntactic information.
This new approach to acquiring syntactic categories claims that children perform a distributional analysis on the input data (Gathercole, 1985; Levy & Schlesinger, 1988; Valian, 1986). This means that children essentially search for syntactic regularities with very little semantic information. Distributional analysis shows that many aspects of children’s early utterances, including the errors they make, can be accounted for by the statistical properties of the language they hear, without recourse to innate knowledge.
Connectionist modeling of distributional analysis demonstrates that knowledge about categories can be acquired on a statistical basis
alone (Elman, 1990; Finch & Chater, 1992; Mintz, 2003; Redington & Chater, 1998). This approach shows how syntactic categories can be acquired without explicit knowledge of syntactic rules or semantic information. Instead, all that is necessary is statistical information about how words tend to cluster together. This approach also answers the criticism that a distributional analysis of syntactic categories is beyond children’s computational abilities (Pinker, 1984). In particular, some words are ambiguous and belong to multiple syntactic categories. A child hearing the first three sentences might conclude on the basis of distributional analysis alone that the fourth sentence is also acceptable:

(9) Vlad eats fish.
(10) Vlad eats rabbits.
(11) Vlad can fish.
(12) *Vlad can rabbits.

However, computer modeling shows that statistical distributional analysis in fact works very well. MOSAIC is a computer model that has no built-in syntactic knowledge and learns by the distributional analysis of an input of child-directed speech (Freudenthal, Pine, & Gobet, 2005, 2006). It accounts for a range of data in English, Dutch, Italian, and Spanish, fitting the errors that children make and how those errors change over time in the light of further input. Mintz (2003) shows how exposure to words in frequent frames produces extremely accurate categories. To give a very simple example, any word in the X position in “the X laughs” must be a noun.
Researchers currently disagree about how much innate knowledge is necessary before distributional learning can successfully take place. The current trend in research is to show how less knowledge must be innate because the input with which children work is richer than was once realized. For example, Redington and Chater (1998) pointed out that children have access to distributional information in addition to co-occurrence information. For instance, morphology varies regularly with syntactic category and this provides a strong cue to the syntactic function of a word. Words that take the suffixes -s and -ed are typically verbs, but words that only take the suffix -s are typically nouns (Maratsos, 1988). In English bisyllabic words, nouns tend to have stress on the first syllable, but verbs have stress on the second syllable (Kelly, 1992).

Evaluation of work on learning syntactic categories
In summary, the relation between the development of syntax and the development of semantics is likely to be a complex one. Early work emphasized the importance of semantic information in the acquisition of syntactic categories, but more recent work has shown how these categories can be acquired with little or no semantic information. Children probably learn syntactic categories through a distributional analysis of the language, and connectionist modeling has been very useful in understanding how this occurs. It is unlikely that innate principles are needed to learn syntactic categories.

Two-word grammars
Soon after the vocabulary explosion, the first two-word utterances appear. There is a gradation between one-word and two-word utterances in the form of two single words juxtaposed (Bloom, 1973). Children remain in the two-word phase for some time.
Early research focused on uncovering the grammar that underlies early language. It was hoped that detailed longitudinal studies of a few children would reveal the way in which adult grammar was acquired. Early multiword speech is commonly said to be telegraphic in that it consists primarily of content words, with many of the function words absent (Brown & Bellugi, 1964; Brown & Fraser, 1963).
It would be a mistake to characterize telegraphic speech as consisting only of semantically meaningful content words. Braine (1963) studied three children from when they started to form two-word utterances (at about the age of 20 months). He identified a small number of what he called pivot words. These were words that were used frequently and always occurred in the same fixed position in every sentence. Pivot
words were not used alone and were not found in conjunction with other pivot words. Most pivot words (called P1 words) were to be found in the initial position, although a smaller group (the P2 words) were to be found in the second position. There was a larger group of what Braine called open words that were used less frequently and that varied in the position in which they were used, but were usually placed second. This idea that sentences are formed from a small number of pivot words is called pivot grammar. Hence most two-word sentences were of the form (P1 + open) (e.g., “pretty boat,” “pretty fan,” “other milk,” “other bread”) with a smaller number of (open + P2) forms (e.g., “push it”). Some (open + open) constructions (e.g., “milk cup”) and some utterances consisting only of single open words are also found.
Brown (1973) took a similar longitudinal approach with three children named “Adam,” “Eve,” and “Sarah.” Samples of their speech were recorded over a period of years from when they started to speak until the production of complex multiword utterances. Brown observed that the children appeared to be using different rules from adults, but rules nevertheless. This idea that children learn rules but apply them inappropriately is an important concept. The children produced utterances such as “more nut,” “a hands,” and “two sock.” Brown proposed a grammar similar in form to pivot grammar, whereby noun phrases were to be rewritten according to the rule NP → (modifier + noun). The category of “modifier” did not correspond to any single adult syntactic category, containing articles, numbers, and some (demonstrative) adjectives and (possessive) nouns. As the children grew older, however, these distinctions emerged, and the grammar became more complex.

Problems with the early grammar approaches
Bowerman (1973) reviewed language development across a number of cultures, particularly English and Finnish. She concluded that the rules of pivot grammar were far from universal. Indeed, they did not fully capture the speech of American children. She confirmed that young children use a small number of words in relatively fixed positions, but not the other properties ascribed to pivot words. On closer analysis she found that the open class was not undifferentiated; instead, the children used a number of classes. Harris and Coltheart (1986) suggested that the children in the Bowerman study might have been linguistically more advanced than those of the earlier studies, and therefore more likely to show increased syntactic differentiation.
Bloom (1970) argued that these early grammatical approaches failed to capture the semantic richness of these simple utterances because they placed too much emphasis on their syntactic structure. The alternative approach, that of placing more emphasis on the context and content of children’s utterances rather than just on their form, became known as rich interpretation. It soon became apparent that two-word utterances with the same form could be used in different ways. In one famous example, Bloom noted that the utterance “mommy sock,” uttered by a child named Kathryn, was used on one occasion to refer to the mother’s sock, and on another to refer to the action of the child having her sock put on by the mother. Bloom argued that it was essential to observe the detailed context of each utterance.
The rich interpretation methodology has its own problems. In particular, the observation of an appropriate context and the attribution of an intended meaning to a particular utterance in that context is a subjective judgment by the observer. It is difficult to be certain, for example, that the child really did have two different meanings in mind for the “mommy sock” utterance.
In summary, it is difficult to uncover a simple grammar for early development that is based on syntactic factors alone. An additional problem is that the order of words in early utterances is not always consistent.

Semantic approaches to early syntactic development
The apparent failure of pure syntactic approaches to early development, and the emerging emphasis on the semantic richness of early utterances, led to an emphasis on semantic accounts of early grammars (Schlesinger, 1971; Slobin, 1970). Aspects
of Brown’s (1973) grammar were also derived from this: for instance, he observed that 75% of two-word utterances could be described in terms of only 11 semantic relations (see Box 4.6 for examples).

Box 4.6 Eleven important early semantic relations and examples (based on Brown, 1973)
Attributive “big house”
Agent–Action “Daddy hit”
Action–Object “hit ball”
Agent–Object “Daddy ball”
Nominative “that ball”
Demonstrative “there ball”
Recurrence “more ball”
Non-existence “all-gone ball”
Possessive “Daddy chair”
Entity + Locative “book table”
Action + Locative “go store”

There is some appeal to the semantic approach in the way in which it de-emphasizes syntax and innate structures, and emphasizes mechanisms such as bootstrapping, but it has its problems. First, there is a lack of agreement on which semantic categories are necessary. Second, it is unclear whether children are conceptually able to make these distinctions. Third, this approach does not give any account of the other 25% of Brown’s observed utterances. Fourth, the order of acquisition and the emergence of rules differ across children. Finally, Braine (1976) argued that this approach was too general: the evidence is best described by children learning rules about specific words rather than general semantic categories. For example, when children learn the word “more,” is this a case of learning that the word “more” specifically combines with entities, or is it more generally the case that they understand that it represents the idea of “recurrence plus entities”? If the latter is the case, then when children learn the word “more” they should be able to use other available recurrence terms (e.g., “another”) freely in similar ways; however, they do not. Hence the child appears to be learning specific instances rather than just semantic categories. Braine gives the example of a child who learned to use “other” mostly only with nouns denoting food and clothing. He concluded that children use a combination of general and specific rules.

The acquisition of verb-argument structure
An important aspect of learning syntax is to learn the appropriate argument structure of verbs. For example, we know that “hits” is a transitive verb that takes an object, that “falls” is an intransitive verb that does not take an object, and that some verbs are more complex in that they can have direct and indirect objects (“Boris gives the ball to Agnes”). How do children learn this important aspect of language?
The acquisition of verb-argument structure follows a U-shaped function: performance is good, then poor, then good again. Young children tend to produce the correct forms; they then go through a period where they produce incorrect forms, particularly making over-generalization errors. For example, they tend to use intransitive verbs in transitive ways (“Adam fall toy”), because they are developing structures where the link between causal actions and transitive verbs is inappropriately generalized to intransitive verbs. Finally, they become adult-like in producing the correct form of complex verbs (Akhtar, 1999; Alishahi & Stevenson, 2005). Clearly this pattern is a clue as to how children are learning verb-argument structure.
Perhaps children come to use semantic information about which sorts of verbs can and cannot participate in certain verb-argument structures (Pinker, 1989). For example, verbs that convey information about motion in a specified direction (fall, climb, ascend, descend) can only occur in intransitive constructions. This idea is called the semantic verb class hypothesis. Children make over-generalization errors when they have not yet learned the precise semantic representations of the verbs.
A second idea is that particular importance is attached to the acquisition of certain key verbs. Children learn some verbs and the particular
ways in which they are used. These early verbs that form the basis of utterances are called “verb islands” (Akhtar & Tomasello, 1997; Tomasello, 1992a, 2000, 2003). Tomasello (2000, 2003) questioned the continuity assumption (the idea that a child’s grammar is adult-like, using the same sort of grammatical rules as adults and with an adult-like linguistic competence). He argued that young children’s syntactic abilities have been greatly overestimated: in particular, they produce far fewer novel utterances than is usually attributed to them. Instead, their language development proceeds in a piecemeal fashion that is based on particular items (mainly verbs), with little evidence of using general structures such as syntactic categories. Lieven, Pine, and Baldwin (1997) found that virtually all of the young children in their sample (1–3 years old) used verbs in only one type of construction, suggesting that their syntax was built around these particular lexical items. Tomasello emphasizes the importance of syntactic development by analogy-making based on verb islands. The verb-island hypothesis accounts for the data because children are learning some specific high-frequency examples (giving the correct pattern in the first instance) that are then used to form generalizations; however, the application of these generalizations sometimes leads to errors. Eventually the child realizes that both rules and exceptions are necessary.
The verb-island hypothesis has generated considerable debate, particularly about whether or not there is a paradox in accounts of early child language. Naigles (2002) argues that at first sight there is a paradox: infants seem to be very good at statistical learning and abstracting general patterns from specific instances, while toddlers are very poor, dealing instead with non-abstract, item-specific information (e.g., the key verb of verb islands). It is as though, as they get older, children actually lose their ability for abstraction. She argues that this difference arises in part from differences in methodologies: Studies on younger children tend to test comprehension, and find more evidence of abstraction, while studies on older children tend to test production, and find more evidence of the use of specific instances. As we have noted before, production is usually more difficult than comprehension. Furthermore, most of the stimuli that test early comprehension tend to involve nonsense words or artificial languages, whereas later production studies usually involve real language where word meaning is involved. Naigles suggests that the patterns the younger children extract are not yet tied to meaning. Toddlers do not lose these early abstractions, but their specific use of them is very limited until they can integrate them with meaning. As she says, learning form is easy, but learning meaning is hard. She argues that there is no reason to suppose that very young children are not making abstractions across syntactic structures, so she resolves the paradox by saying that toddlers do use abstraction. Young children have difficulty extending meaning, not frames. Tomasello and Akhtar (2003) continued the debate (see Naigles, 2003, for a reply), arguing that there is no paradox. They contended that there is converging evidence that up to the age of 3 young children are unable to abstract across syntactic structures, focusing instead on specific items and expressions, and using a few specific syntactic frames. Tomasello and Akhtar argued that diary studies of spontaneous speech, and the production studies where children are taught novel verbs, produce particularly compelling data that toddlers do not form abstract syntactic representations.
If adults hear a particular syntactic structure, they are more likely to use that structure in production in the immediate future, a phenomenon known as structural priming (see Chapter 13 for details). For example, you are more likely to produce a passive construction if you have just heard a passive sentence than if you have just heard an active one. Children over 4 show this structural priming effect; however, children under 4 do not (Savage, Lieven, Theakston, & Tomasello, 2003). One explanation for this finding is that young children have no general syntactic structures to prime, but the finding might also suggest that imitation plays some role in older children.
A third solution is that repeated instances of a verb in particular constructions cause the child to
make a probabilistic inference that the verb is only associated with a particular verb-argument structure. The more often children hear a verb used in a particular construction, the less often they should generalize it to a novel input. This idea is called the entrenchment hypothesis (Braine & Brooks, 1995; Theakston, 2004). The more often children hear a verb being used, the less likely they should be to get it wrong. Therefore verb frequency is particularly important here, with over-generalization errors particularly likely on low-frequency verbs. Hence children are more likely to (incorrectly) say that “She arrived her to the park” is grammatical than the similar construction containing the higher frequency verb in “She came me to the school” (Theakston, 2004).
Of course word frequency and the amount of exposure to semantic information are confounded.
An alternative account combines the above accounts. It dispenses with rules and exceptions, and argues that children carry out a type of distributional analysis of verb structures, with semantic information playing an important role (Alishahi & Stevenson, 2005). In this model the acquisition of verb-argument structure is probabilistic. Children learn the argument structures of each specific verb over many specific instances, as well as the more general semantic characteristics of that type of verb. Early on children imitate specific forms, but increasingly rely on generalizations based on general patterns. At first this general information overwhelms the specific information, but as the child encounters more examples of infrequent verbs they come to be able to use those less frequent verbs correctly.
The study of the acquisition of verb-argument structures enables us to make a more general point about how children learn syntax. Clearly an important part of learning is to abstract information out of specific instances. After the age of 3, children are able to combine novel verbs with the appropriate syntactic structures with ease. For example, consider the sentences “Agnes kicked Vlad” and “Agnes kissed Vlad.” There are similarities between these sentences (for example, both are transitive sentences involving agents and objects, as opposed, say, to kickers and things being kicked), but to recognize these similarities requires a level of syntactic abstraction. How early does this abstraction happen? According to late-syntax theories abstraction happens relatively late, suggesting that syntax takes time to be learned and is acquired through abstracted experience, with children early on interpreting sentences with lexical or verb-specific knowledge (Braine, 1992; Lieven, Pine, & Baldwin, 1997; Tomasello, 2003). According to early-syntax theories, abstraction happens relatively early (Fisher, 2002; Naigles, 2002; Pinker, 1984). If abstraction happens early, children must be making use of some additional information, which might be innate (Pinker, 1984), or might arise from the structure of the general cognitive architecture used to learn language (Chang, Dell, & Bock, 2006; Saffran, 2002). Unfortunately different methodologies give different results and support different theories (Chang et al., 2006). Results using elicited production (getting children to speak) support the late-syntax theory, while results examining comprehension support the early-syntax theory. Even different comprehension tasks give different results. Tasks in which comprehension is assessed by children acting out sentences find that children under 3 do not seem to use word order to comprehend who is acting on whom (Akhtar & Tomasello, 1997). On the other hand, tasks using the preferential-looking technique find that children under 3 do use word order information (Fernandes, Marcus, Di Nubila, & Vouloumanos, 2006; Gertner, Fisher, & Eisengart, 2006). Chang et al. show that a connectionist model that learns and predicts sequences from repeated exposure to grammatical strings of words, and which also makes use of information about the meaning of utterances, can account for the data from both sorts of methodology. The model can simulate both the elicited production and preferential-looking data. Children appear to understand complex structures early on with the preferential-looking task because it provides a choice between two interpretations. The system develops partial structural representations before it can produce correct whole structures. In effect, it has enough information to be able to understand when alternatives are provided, but not enough to be able to produce from scratch.
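The entrenchment hypothesis described earlier in this section lends itself to a small numerical illustration. The following Python sketch is only a toy, not a model from the literature: the function names, the add-alpha smoothing scheme, the value of alpha, and the input counts are all assumptions made for illustration. It shows how the estimated probability of accepting a verb in an unattested construction can shrink as the verb is heard more often in its usual construction.

```python
from collections import Counter, defaultdict


def count_frames(observations):
    """Tally how often each verb was heard in each construction."""
    counts = defaultdict(Counter)
    for verb, construction in observations:
        counts[verb][construction] += 1
    return counts


def p_generalize(counts, verb, construction, alpha=1.0, n_constructions=2):
    """Estimated probability that the learner accepts `verb` in `construction`.

    Add-alpha smoothing (an assumption of this sketch): an unattested
    construction keeps a share of probability that shrinks as the verb
    is heard more often in other constructions.
    """
    total = sum(counts[verb].values())
    return (counts[verb][construction] + alpha) / (total + alpha * n_constructions)


# Hypothetical input: "come" is heard far more often than "arrive",
# and both are heard only in intransitive constructions.
observations = [("come", "intransitive")] * 50 + [("arrive", "intransitive")] * 5
counts = count_frames(observations)

p_come = p_generalize(counts, "come", "transitive")
p_arrive = p_generalize(counts, "arrive", "transitive")
print(f"come in a transitive frame:   {p_come:.3f}")
print(f"arrive in a transitive frame: {p_arrive:.3f}")
```

With these made-up counts the low-frequency verb receives the higher probability of being over-generalized, which is the direction of the effect that the entrenchment hypothesis predicts.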
A more general way of phrasing these questions was put by Lidz et al. (2003): Is word learning driven by observation of the outside world, or is it driven by properties already inside the child? Causative verbs make a particularly good arena for testing this question. In English, causativity and transitivity are entwined: Causative verbs (whose meanings contain some notion of causation) are transitive. For example, the causative verb “kill” (meaning “cause to die”) is transitive: it can take an object (“Vlad kills Boris”); “swim” is not causative and is an intransitive verb: it cannot take an object. In the Dravidian language Kannada (spoken in the subcontinent of India), however, transitivity is not the best predictor of causativity: There is a causative morpheme which is never present unless the verb is a causative one. How do children come to learn verbs in such a language? The emergentist theory, which says that learning is driven by observation, predicts that for the child the most reliable cue (which will not be transitivity, but the presence of the causative morpheme) will be associated with causativity. The syntactic universalist theory, however, where learning is driven by the properties of the syntax already present in the child, predicts that they should still make most use of transitivity. Lidz et al. found that 3-year-old children largely ignore the causative morphology and make most use of the less useful transitive structures when understanding verbs.

Evaluation of work on early syntactic development
Can early syntactic development be both non-syntactic and non-semantic? The identification of early syntactic categories might occur without much semantic help, and without being based on the acquisition of an explicit grammar. Instead, children seem to learn grammatical categories by distributional analysis. Can this type of approach be extended to account for how children produce two-word and early multiword utterances?
Perhaps children’s early productions are much more limited than has frequently been thought (Messer, 2000). Perhaps their early multiword utterances just statistically reflect the most common types of utterance they hear? According to this view, children have a much less formal grammar than is commonly supposed. Evidence for this comes from the observation that early language use is much less flexible than it would be if children were using explicit grammatical rules (Pine & Lieven, 1997).
In general, the idea that there is a syntax module that drives language development is becoming less popular. It is clear that language development must be seen within the context of social development and the way language is used (Messer, 2000). The shift is also mirrored in Chomsky’s more recent work (1995), where the importance of grammatical rules is much reduced.
Perhaps there is no straightforward way of separating grammatical and lexical development; the two are intertwined (Bates & Goodman, 1997, 1999). For example, grammatical development is related to vocabulary size: The best predictor of grammatical development at 28 months is vocabulary size at 20 months, suggesting that the two share something important (Bates & Goodman, 1999; Fenson et al., 1994). Furthermore, there is no evidence for a dissociation between grammatical and vocabulary development in either early or late talkers: We cannot identify children with normal grammatical development but with very low or high vocabulary scores for their age. Neither is there any evidence of any clear dissociations between grammatical and lexical development in language in special circumstances (such as Williams syndrome and Down’s syndrome). Bates and Goodman (1999) concluded that there is little support for the idea of a separate module for grammar.
In conclusion, recent work tends to downplay the role of an innate grammatical module and the attribution of adult-like grammatical competence to young children.

Later syntactic development
Brown (1973) suggested that the mean length of utterance (MLU) is a useful way of charting the
TABLE 4.3 Mean length of utterance (MLU) and language development. Based on Brown (1973).
Stage I MLU < 2.25 many omissions, few grammatical words and inflections
Stage II 2.25–2.75 much variation
Stage III 2.75–3.5 (c. 3 years) pluralization, most basic syntactic rules
Stage IV 3.5–4 increasing syntactic sophistication
Stage V 4+ imperatives, negatives, questions, reflexives, passives (5–7 years), in that order
progress of syntactic development. This is the people doing odd actions. For example, she
mean length of an utterance measured in mor- would point to a drawing and say: “This is a
phemes averaged over many words. Brown wug. This is another one. Now there are two __”
divided early development into five stages based (see Figure 4.7). The children would fill in the
on MLU. Naturally MLU increases as the child gap with the appropriate plural ending “wugs.”
gets older; we find an even better correlation with In fact, they could use rules to generate posses-
age if single-word utterances are omitted from the sives (“the bik’s hat”), past tenses (“he ricked
analysis (Klee & Fitzgerald, 1985). This approach yesterday”), and number agreement in verbs
is rather descriptive and there is little correla- (“he ricks every day”).
tion between MLU and age after the age of 5. The development of order of acquisition of
Nevertheless, it is a convenient and much-used grammatical morphemes is relatively constant
measure (see Table 4.3). across children (James & Khan, 1982). The ear-
The rule-based nature of linguistic develop- liest acquired is the present progressive (e.g.,
ment is clear from the work of Berko (1958). “kissing”), followed by spatial prepositions, plu-
She argued that if children used rules, their use rals, possessives, articles, and the past tense in
should be apparent even with words the children different forms.
had not used before. They should be able to use
appropriate word endings even for imaginary
words. In a famous study, Berko used nonsense Inflecting verbs: Acquiring the past
words to name pictures of strange animals and tense
The development of the past tense has come under
particular scrutiny. Brown (1973) observed that the
youngest children use verbs in uninflected forms
(“look,” “give”). He argued that children seem to
be aware of the meaning of the different syntactic
roles before they could use the inflections. That
is, the youngest children use the simplest form
to convey all of the syntactic roles. They learn to use the appropriate inflections very quickly:
past tenses to convey the sense of time (usually marked by adding “-ed”), the use of the “-ing”
ending, number modification, and modification by combination with auxiliaries. However, although
regular verbs can be modified by applying a simple rule (e.g., form the past tense by adding “-ed”),
a large number of verbs are irregular.
FIGURE 4.7 “This is a wug. Now there is another one. There are two of them. There are two ________.”
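The contrast drawn above between a simple regular rule and a set of irregular exceptions can be
illustrated with a short Python sketch. It is a toy under stated assumptions, not a model from the
literature: the names `IRREGULAR_PAST` and `past_tense` are hypothetical, the rule works on spelling
rather than sound, and the only spelling adjustment is for verbs ending in “e.”

```python
# Toy illustration only: a spelling-based past-tense former.
# IRREGULAR_PAST and past_tense are hypothetical names, not taken from the text.

IRREGULAR_PAST = {"give": "gave", "go": "went", "throw": "threw", "hit": "hit"}


def past_tense(verb, known_irregulars=None):
    """Return a stored irregular form if one is known, otherwise apply the "-ed" rule."""
    irregulars = IRREGULAR_PAST if known_irregulars is None else known_irregulars
    if verb in irregulars:
        return irregulars[verb]
    # Regular rule, with one spelling adjustment for verbs that already end in "e".
    return verb + "d" if verb.endswith("e") else verb + "ed"


if __name__ == "__main__":
    print(past_tense("look"))      # looked
    print(past_tense("rick"))      # ricked (a nonsense verb, as in Berko's study)
    print(past_tense("give"))      # gave
    print(past_tense("give", {}))  # gived (no stored exception, so the rule applies)
```

Because the rule applies to any string, it handles a nonsense verb such as “rick” as readily as a
familiar one. Passing an empty exception list shows what the rule alone produces for an irregular verb.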
The time course of development of irregular verbs and nouns is an example of U-shaped development.
Behavior changes from good performance, to poor performance, before improving again. Early on,
children produce both regular and irregular forms. Importantly, in the poor performance phase,
children make a large number of over-regularization errors (e.g., Brown, 1973; Cazden, 1968; Kuczaj,
1977). Later on they can produce both the regular and irregular forms once again.
One explanation of this pattern is that the youngest children have just learned specific instances.
They then learn a rule by induction (e.g., form the past tense by adding -ed to verbs, form plurals
by adding -s to nouns) and apply this in all cases. Only later do they start to learn the exceptions
to the rule. Hence children develop a past-tense formation system with two separate routes: a
symbolic system that uses a rule to generate regular forms, and a route accessing a separate listing
of irregular forms (Pinker, 1994, 1999). Evidence for a dual-route model comes from several
dissociations of performance on regular and irregular verbs. Patients with fluent aphasia (see
Chapter 13) tend to be worse at reading and producing irregular forms than regular forms, while
patients with non-fluent aphasia tend to be relatively worse at processing the regular forms. Imaging
data suggest the processing of regular and irregular forms involves different parts of the brain.
PET imaging suggests that only Broca’s area is activated when processing regular past tenses, but the
temporal lobes of the brain are involved in processing irregular past tenses (Jaeger et al., 1996).
fMRI data suggest that while the posterior temporal lobes are involved in processing both regular and
irregular forms, only regular forms produce activation around the frontal gyrus (Pinker & Ullman,
2002). There is also evidence that regular and irregular plurals are processed in different ways.
Clahsen (1999) argued that experimental and neuroimaging work on plural formation in German suggests
that the language system is divided into a lexicon and a computational system that, among other
things, generates regular forms. Patients with Alzheimer’s disease are relatively worse at irregular
forms, while patients with Parkinson’s disease are relatively worse at regular forms. More
controversially, children with Williams syndrome may fare worse with irregular forms, while children
with specific language impairment (SLI) fare worse with regular forms (Pinker, 1994, 1999; see Thomas
& Karmiloff-Smith, 2003, for a review). A problem with the dual-route model is that regular and
irregular forms coexist; the proportion of over-regularizations never rose above 46% in 14 children
studied by Kuczaj (1977), suggesting that a very general, powerful rule is not learned.
An alternative account, connectionist modeling of the acquisition of the past tense, has generated
substantial controversy. The basic idea of these models is that we do not need two distinct routes to
produce regular and irregular forms; instead, knowledge of regular forms comes from knowledge about
phonological regularities, whereas knowledge of irregular forms comes from lexical-semantic
knowledge. fMRI imaging data suggest that it is the phonological characteristics of the past tense
forms that are important for determining which brain regions are activated: Irregular forms that
sound as if they could be regular forms (e.g., “slept,” “sold”) produce a pattern of activation
similar to regular forms (Joanisse & Seidenberg, 2005). Rumelhart and McClelland (1986) simulated the
acquisition of the past tense using back-propagation. The input consisted of the root form of the
verb, and the output consisted of the inflected form. The training schedule was particularly
important, as it was designed to mimic the type of exposure that children have to verbs. At first the
model was trained on 10 of the highest frequency words, 8 of which happened to be irregular. After 10
training cycles, 410 medium-frequency verbs were introduced for another 190 learning trials. Finally
86 low-frequency verbs were introduced. The model behaved as children do: it initially produced the
correct output, but then began to over-regularize. Rumelhart and McClelland pointed out that the
model behaved in a rule-like way, without explicitly learning or having been taught a rule. Instead,
the behavior emerged as a consequence of the statistical properties of the input. If true, this might
be an important general point about language development.
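The three-phase training schedule described above can be laid out as a short Python sketch. The
counts are those given in the text; the function name `vocabulary_size` is hypothetical, and the
sketch assumes that each new block of verbs is added to, rather than replaces, the verbs already in
training. It describes only the schedule, not the network itself.

```python
# Sketch of the training schedule only (not the network).
# Assumption: each new block of verbs is added to the verbs already being trained.

HIGH_FREQ_VERBS = 10        # first phase; 8 of these were irregular
HIGH_FREQ_IRREGULAR = 8
MEDIUM_FREQ_VERBS = 410     # introduced after the first 10 training cycles
LOW_FREQ_VERBS = 86         # introduced last
PHASE_1_CYCLES = 10
PHASE_2_CYCLES = 190


def vocabulary_size(cycle):
    """Number of verbs in the training set at a given training cycle (counted from 1)."""
    if cycle < 1:
        raise ValueError("cycle must be 1 or greater")
    if cycle <= PHASE_1_CYCLES:
        return HIGH_FREQ_VERBS
    if cycle <= PHASE_1_CYCLES + PHASE_2_CYCLES:
        return HIGH_FREQ_VERBS + MEDIUM_FREQ_VERBS
    return HIGH_FREQ_VERBS + MEDIUM_FREQ_VERBS + LOW_FREQ_VERBS


if __name__ == "__main__":
    for cycle in (10, 11, 200, 201):
        print(cycle, vocabulary_size(cycle))
    # Share of irregular verbs in the first phase.
    print(HIGH_FREQ_IRREGULAR / HIGH_FREQ_VERBS)
```

Under these assumptions the training vocabulary jumps from 10 to 420 verbs between cycle 10 and
cycle 11, and the first phase is 80% irregular.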
What are the problems with this account? Pinker and Prince (1988) made the most substantial
criticisms of this work. They noted that irregular verbs are not really totally irregular. It is
possible to predict which verbs are likely to be irregular, and the way in which they will be
irregular. This is because irregular verbs still obey the general phonological constraints of the
language. Hence it is possible that irregular forms are derived by general phonological rules. In
addition, the way in which some verbs have both regular and irregular past tenses, and the way in
which they are inflected, depends on the semantic context (“hang” and “hanged” and “hung,” and “ring”
and “ringed” and “rung,” for example). The network also made errors of a type that children never
produce (e.g., “membled” for the past tense of “mail”). Pinker and Prince also pointed out that there
is no explicit representation for a word in Rumelhart and McClelland’s (1986) model. Instead, it is
represented as a distributed pattern of activation. However, words as explicit units play a vital
role in the acquisition process. Pinker and Prince also argued that the simulation’s U-shaped
development resulted directly from its training schedule. The drop in performance of the model
occurred when the number of regular verbs in the training vocabulary was suddenly increased. There is
no such discontinuity in the language to which young children are exposed. Obtaining the U-shaped
curve also depended on having a disproportionately large number of irregular verbs in the initial
training phase. This is not mirrored by what children are actually exposed to. Finally, the way in
which the medium-frequency, largely regular verbs are all introduced in one block on trial 11 is
quite unlike what happens to children, where exposure is cumulative and gradual (McShane, 1991).
Plunkett and Marchman (1991, 1993) argued that connectionist networks can model the acquisition of
verb morphology, but many more factors have to be taken into account. In particular, they proposed
that the training set must more realistically reflect what happens with children. Rather than present
all the verbs to be learned in one go, or with a sudden discontinuity as in the original Rumelhart
and McClelland model, they gradually increased the number of verbs the system must learn, to simulate
the gradual increase in children’s vocabulary size. They concluded that a network could display
U-shaped learning even when there are no discontinuities in the training. MacWhinney and Leinbach
(1991) reached similar conclusions. Nevertheless, some problems remain (Clahsen, 1999; Marcus, 1995).
Obtaining the U-shaped curve in modeling seems to depend on presenting the training stimuli in a
certain way; in particular, it depends on sudden changes in the training regime, in contrast to the
smooth changes of input that children are faced with. Furthermore, connectionist models make more
irregularization errors than children. It is possible that the single-route mechanism actually fits
the child data better than rule-based accounts (Marchman, 1997). In particular, children are more
likely to regularize irregular verbs that are similar to other verbs that behave in a regular way.
For example, “throw” forms an irregular past tense as “threw.” There are other verbs like it,
however, that form their past tenses in a regular way (e.g., “flow,” “show”). An irregular verb like
“hit,” however, has no competing enemies. As the connectionist constraint-based model predicts,
children are more likely to produce “throwed” than “hitted.”
One outcome of the modeling work by Rumelhart and McClelland has been to focus attention on the
details of how children acquire skills such as forming the past tense (e.g., Marchman & Bates, 1994;
Marcus et al., 1992). We now know much more than we did before. A general problem with the
connectionist accounts is that these models need explicit feedback in order to learn. As we have
seen, the extent and influence of explicit feedback in real language development is limited. One
frequent counter to this objection is that the modeling is merely demonstrating the principle that
association and statistical regularities in the language can account for the phenomena without
recourse to explicit rules, and the details of the learning mechanisms involved are not important in
this respect. Another possibility is that as
children listen to speech, they make predictions about what comes next. They can then match the
predictions to the actual input. However, there is presently little evidence that this happens
(Messer, 2000).
Finally, computational modeling shows how developmental disruption to past-tense acquisition can
account for the apparent dissociation between the patterns of acquisition shown in Williams syndrome
and SLI (Thomas & Karmiloff-Smith, 2003). Rather than a static model, whereby children come with two
routes, one of which is either spared or destroyed, high-level deficits (past-tense formation) can
arise from relatively low-level deficits (phonological processing and the lexical-semantic system) in
conjunction with the effects of development and compensation.
Individual differences in language development
The way in which adults talk to children appears to have an effect as the child gets older: There are
large individual differences in the ability of preschool children to form and understand
syntactically complex sentences, and the quality of what children hear correlates highly with these
differences (Huttenlocher, Vasilyeva, Cymerman, & Levine, 2002). Children who hear complex structures
master them earlier. Even here, it is difficult to be certain about what is causal. The most
important source of input for young children is their parents, so we cannot rule out genetic factors:
Syntactic complexity in parent and child might reflect parent–child genetic similarity. However, the
language of teachers also comes to have an effect: The syntactic abilities of children taught by
teachers who use syntactically more complex speech develop faster than those of children taught by
teachers who use simpler constructions (Huttenlocher et al., 2002). Hence language input does play a
role.
Cross-linguistic differences in language development
Languages differ in their syntactic complexity. For example, English is relatively constrained in its
use of word order, whereas other languages (such as Russian) are more highly inflected and have freer
word order. Not surprisingly, these differences lead to differences in the detail of language
development.
What is perhaps surprising is the amount of uniformity in language development across languages. For
example, stage 1 speech (covering the period with the first multiword utterances, up to MLU of 2.0)
seems largely uniform across the world (Dale, 1976; Slobin, 1970). There are of course some
differences: Young Finnish children do not produce yes–no questions (Bowerman, 1973). This is because
you cannot form questions by rising intonation in Finnish, so speakers must rely on an interrogative
inflection. Some differences emerge in later development. Plural marking is an extremely complex
process in Arabic, but relatively simple in English. Hence plural marking is acquired early in
English-speaking children, but is not entirely mastered until the teenage years for Arabic-speaking
children (see McCarthy & Prince, 1990; Prasada & Pinker, 1993). In complex inflectional languages
such as Russian, development generally progresses from the most concrete (e.g., plurals) first to the
most abstract later (e.g., gender usually has no systematic semantic basis; see Slobin, 1966b).
The development of syntactic comprehension
More complicated syntactic constructions naturally provide the child with a number of challenges. The
youngest children have difficulty with passives because they are inappropriately applying the
standard canonical order strategy, which simply says that the subject of the sentence is the agent.
Older children (around 3 years old) start to map the roles of passives as adults do, but they make
mistakes depending on the semantic context of the utterance. Children have particular difficulty with
reversible passives, when the subject and object can be reversed and the sentence still makes sense
(such as “Vlad was kissed by Agnes”).
Here there are no straightforward semantic revising their initial interpretations if they turn
cues available to assist them. M. Harris (1978) out to be wrong. Five-year-old children did not
showed that animacy is an important cue in use context to resolve ambiguous structures
the development of understanding passives. and were unable to revise their initial interpre-
Animate things tend to get placed earlier in tation. Children always preferred the “destina-
the sentence. Hence, in a picture description tion” interpretation (put the frog on the napkin)
task, when the object being acted on was ani- rather than the “modifier” interpretation (take
mate (such as a boy being run over by a car), the frog that is on the napkin and put it in the
a passive construction tended to be used to put box), regardless of the visual context. Young
the animate object first (“the boy was run over children therefore use different principles to
by the car”). The type of verb also matters: understand sentences; little is known about the
Young children find passives with action verbs way in which these principles turn into their
easier to manipulate than stative verbs such as adult equivalent.
“remember” (Sudhalter & Braine, 1985). The development of comprehension skills
More recently eye-tracking has been used is a long and gradual process with no clear-cut
to investigate how children understand sen- end point (Hoff-Ginsberg, 1997). Markman
tences. Trueswell, Sekerina, Hill, and Logrip (1979) found that a significant number of
(1999) used head-mounted eye-trackers to dis- 12-year-olds erroneously judged that (13)
cover where children looked in a scene as they made sense (I had to read it twice myself to
responded to ambiguous spoken instructions to find the problem):
move objects about that scene. As we shall see
in Chapters 10 and 14, adults can make use of (13) There is absolutely no light at the bottom of
many sources of information to resolve ambig- the ocean. Some fish that live at the bottom
uous instructions such as “Put the frog on the of the ocean know their food by its color.
napkin in the box,” and are also very good at They will only eat red fungus.
An eye-tracker can be
used to record and store
information about an
observer’s eye fixations.
Trueswell et al. (1999) used
this method to discover
where children looked in
a scene as they responded
to instructions to move
objects about that scene.
SUMMARY
x Rationalists believed that knowledge was innate, whereas empiricists argued that it arose from
experience.
x An analysis of the effects of correcting speech on young children shows that language acquisition
cannot be driven just by imitation or reinforcement.
x Because the linguistic input that children hear does not seem to contain sufficient information (it is
an impoverished input), Chomsky proposed that they have an innate Language Acquisition Device.
x In particular, he argued that we are born with a fixed set of switches (parameters), the positions of
which are set by exposure to particular languages.
x In practice it has proved difficult to identify these parameters, and to explain how bilingual
children and children using sign language use them.
x Human languages have a surprising amount in common; this might be because they are all derived
from the same universal grammar.
x There are different types of linguistic universals; some show how a particular aspect of language
may have implications for other features.
x The drive to use language in general and rules of word order in particular is so great that children
develop them even if they are absent from their input.
x Young children move from babbling to one-word or holophrastic speech, through abbreviated or
telegraphic speech, before they master the full syntactic complexity of their language.
x Correcting children’s errors makes surprisingly little difference to their speech patterns.
x Adults speak to young children in a special way; this child-directed speech (CDS for short; some-
times called “motherese”) simplifies the child’s task in acquiring language.
x CDS is clear, and what is being talked about is usually obvious from the context.
x As CDS is not used by all cultures it may not be necessary for language development, although
it might facilitate it.
x There are specific language impairments (SLIs) that are genetically marked, although the precise
nature of the impairment is disputed.
x All young children go through a stage of babbling, but it is not clear how the sounds they make
are related to the sounds of the language to which they are exposed.
x Infants are born with rich speech-perception abilities.
x It is likely that babbling serves to enable infants to practice articulatory movements and to learn
to produce the prosody of their language.
x There is an explosion in children’s vocabulary at around 18 months.
x There have been a number of proposals for how children learn to associate the right word with
things in the world, including lexical constraints, innate concepts, syntactic cues, and social-
pragmatic cues.
x Young children make errors in the use of words; in particular, they occasionally over-extend them
inappropriately.
x A number of models have been proposed to account for over-extensions; one of the most influential
has been the idea that the child has not yet acquired the appropriate semantic features for a word.
x Later semantic development depends on conceptual and syntactic factors.
x A number of mechanisms have been proposed for how children learn the syntactic categories of words.
x One view is that knowledge of syntactic categories and how objects and actions are mapped onto
nouns and verbs is innate.
x Once children have learned a few correspondences, their progress can be much faster because of
bootstrapping.
x According to the constructivist or meaning-first view, there is an early asyntactic phase of devel-
opment, which is driven only by semantic factors.
x More recent approaches have focused on the idea that children monitor the distribution of words
and use co-occurrence information to derive syntactic categories.
x Braine proposed that two-word grammars were founded on a small number of “pivot” words that
were always used in the same position in sentences.
x Purely grammatical approaches to early speech have difficulty in explaining all the utterances
children make, and ignore the semantic context in which the utterances are made.
x The acquisition of past tenses is best described by a U-shaped pattern, as performance goes from
perfect performance on irregular verbs through a phase of incorrectly regularizing them, before
using the correct irregular forms again.
x There has been much debate as to whether the learning of the past tense is best explained by the
acquisition of specific rules or by constraint-based models based on connectionist modeling.
1. What cognitive processes do you think need to be innate for language development to occur?
2. Throughout this chapter we have talked of “language development” or “language acquisition”
rather than (first) language learning. What is the advantage of avoiding the term “language
learning”?
3. To what extent are the errors that children make like the errors adult speakers routinely make?
(You might need to read Chapter 13 before attempting this question.)
4. Consider the first words made by someone you know. (You might be able to discover your
own.) What do you think accounts for them?
5. Produce a detailed summary of the time course of language development.
6. To what extent is the telegraphic speech of young children like the agrammatic speech of some
aphasics (see Chapter 13)?
7. In some studies with young infants, children pay attention for longer to easy or familiar stimuli,
whereas in others they attend longer to unfamiliar material. What might determine when each
of these happens?
FURTHER READING
Many texts describe language development in far more detail than can be attempted in a single chapter:
see, for example, Hoff-Ginsberg (1997) and Owens (2004) for an introductory approach.
Hoff-Ginsberg includes very good descriptions of language development in special circumstances.
Messer (2000) is a very short review of the main themes. See Bloom (1998) for another good
review with an emphasis on the effect of the context of development.
See Werker and Yeung (2005) for a review of early speech perception and word learning. Bloom
(2001a) reviews work on how children learn the meaning of words; Bloom (2001b) is a summary of
the book, with a commentary. See also Hollich, Hirsh-Pasek, and Golinkoff (2000) for word learn-
ing. Although we have focused on nouns and verbs, we should not forget that there are other catego-
ries of words; see Mintz and Gleitman (2002) for work on how children learn adjectives.
There are several introductions to Chomsky’s work that cover his ideas on language, language
development, syntax (see Chapter 2), and sometimes his political ideas as well. See Cogswell and
Gordon (1996), Lyons (1991), and Maher and Groves (1999). A convincing defense of the position
that language has an important innate component is presented in a very approachable way by Pinker
(1994); see Pinker (1989) for more on formal approaches to language development. See Leonard
(2000) for a review of SLI. For more on language development as parameter setting, see Stevenson
(1988). Cook and Newson (2007) provide a great deal of material on Chomsky’s work, with
particular emphasis on language development. In particular, they provide a very clear account of the
poverty of the stimulus argument. See McClelland and Seidenberg (2000) and Seidenberg and Elman
(1999) for critiques of nativism. For more on early phonological and segmentation skills, see Saffran,
Werker, and Werner (2006). See Vihman (1996) for more on phonological development.
MacWhinney (1999) is an edited collection with an emphasis on how language is an emergent
property. Elman et al. (1996) discuss how connectionism has changed our view of what it means
for something to be innate. Their emphasis is on how behavior arises from the interactions between
nature and nurture. Plunkett and Elman (1997) provide practical examples of connectionist modeling
relevant to this in a simulation environment called tlearn. See Deacon (1997) for a review of the
biological basis of language, how it might have evolved, how humans differ from animals, and how
language might constrain language learning.
Broeder and Murre (2000) present a collection of articles that emphasizes computational modeling
of language development.
For a review of work on past-tense formation, see Clahsen (1999). Altmann (1997) has a good
section on the phonological skills of infants.
CHAPTER 5
BILINGUALISM AND SECOND
LANGUAGE ACQUISITION
INTRODUCTION
Oddly enough for someone who has written several books on language, languages were my worst subject
at school. My worst exam performance by far was in French, where I could literally hardly understand
a word. I of course blame the teaching.
Many people believe that it is more difficult for older children and adults to learn another
language. Given the same amount of exposure in the same way in both languages, is this assumption
correct? This chapter examines the topic of second language acquisition in more detail. How does
second language acquisition differ from first? How do children and adults store the two sets of words
in their lexicons? How do the children manage to keep the languages apart? How do they learn to
recognize that two distinct languages are involved? By the end of this chapter you should:
x Know how young children can acquire two languages simultaneously.
x Understand how we can learn a second language in adulthood.
x Have some idea about how a second language should best be taught.
BILINGUALISM
If a speaker is fluent in two languages, then they are said to be bilingual. The commonly held image
of a bilingual person is of someone brought up in a culture where they are exposed to two languages
from birth. It is not necessary for them to be equally fluent in both languages, but at least they
should be very competent in the second one. Some people are trilingual, or even multilingual. This
definition of bilingualism is a little vague as it depends on what we mean by “fluent.” It is perhaps
best to think of proficiency in multiple languages as lying on a continuum, rather than being an
either–or idea. Some authorities (e.g., Bialystok, 2001) distinguish between productive bilingualism
(speakers can produce and understand both languages) and receptive bilingualism (speakers can
understand both languages, but have more limited production abilities).
Bilingualism is common in some parts of the world (to mention just a few examples: North Wales and
Welsh–English; Canada and French–English; and places where there are many ethnic minorities within a
culture). By convention the language learned first is called L1 and the language learned second is
called L2. Sometimes, however, the two languages are learned simultaneously, and sometimes the
language that is learned first turns out to be the secondary language of use in later life. We can
distinguish between simultaneous bilingualism (L1 and L2 learned about the same time), early
sequential bilingualism (L1 learned first, but L2 learned relatively early, in childhood), and late
(in adolescence onwards) bilingualism (Bialystok & Hakuta, 1994). Early sequential bilinguals form
the largest group world-wide, and the number is increasing, particularly in countries with large
immigration rates.
Box 5.1 Categories of bilingualism
x Simultaneous bilingualism: L1 and L2 learned at the same time.
x Early sequential bilingualism: L1 learned first, but L2 learned relatively early in childhood.
x Late bilingualism: L2 learned later, in adolescence or after.
A number of factors determine which language people use in a bilingual society. Naturally the
speaker’s home background is very important, as is to whom the person is speaking. Some societies may
have a history of attempting to impose one language as being higher in prestige than others. Using a
particular language may be a signal of solidarity with or distance from others. For example, in
Paraguay, Spanish is the language used in more formal situations, while Guarani is the language of
intimacy, signaling solidarity with the other person. Courtship frequently begins in Spanish and ends
in Guarani (Crystal, 2010; Rubin, 1968).
What can we learn from the study of bilingualism? First, it is clearly of practical importance to
many societies. Second, psycholinguistics should inform us about the best way of teaching people a
second language. Third, how do people represent the two languages? Do they have a separate lexicon
(mental dictionary) for each one, or just separate entries for each word form but a shared conceptual
representation? And how do people translate between the two languages? Finally, the study of
bilingualism is a useful tool for examining other cognitive processes: for example, it casts light on
the critical period for language (see Chapter 3).
One of the earliest detailed studies of bilingualism was the diary study of Leopold (1939–1949).
Leopold was a German linguist, whose daughter Hildegard had an American mother and lived from an early
age in the USA. German was used in the home at first, but this soon gave way to English, the
environment language. The diary showed that young children can quickly (within 6 months) forget the
old language and pick up a new one if they move to another country. Initially the two languages are
mixed up, but differentiation quickly emerges (Vihman, 1985). We observe language mixing when words
combine, such as an English suffix added to a German root, or English words put into a French
syntactic structure, or responding to questions in one language with answers in another (Redlinger &
Park, 1980; Swain & Wesche, 1975). Code switching (also called language switching) is the name given
to the tendency of bilinguals when speaking to other bilinguals to switch from one language to
another, often to more appropriate words or phrases. This process is highly variable between
individuals.
What happens if a child has already become moderately proficient in L1 when they start learning L2?
Although we saw in our discussion of the critical period in Chapter 3 that the duration of exposure
to L2 (which is often the length of residence in the new country) is important, other factors are
also vital. These include the personality and cognitive attributes of the person learning L2
(Cummins, 1991). Proficiency in L1 is extremely important: the development of L1 and L2 is
interdependent. Children who have attained a high level of skill at L1 are also likely to do so at
L2, particularly on relatively academic measures of language performance.
The advantages of being bilingual
Bilingual children suffer no obvious linguistic disadvantages from learning two languages
simultaneously (Snow, 1993). There might be some initial delay in learning vocabulary items in one
language, but this delay is soon made up, and of course the total bilingual vocabulary of the
children is much greater.
Bilingualism also has costs and benefits for other aspects of cognitive processing. Bilingual people
tend to have a slight deficit in cognitive processing and working memory for tasks that are carried
out in L2. On the other hand, they show clear gains in metalinguistic awareness and cognitive
flexibility, and superior verbal fluency (Ben-Zeev, 1977; Bialystok, 2001; Cook, 1997;
Peal & Lambert, 1962). For example, Lambert, and connected directly together (Paivio, Clark, &
Tucker, and d’Anglejan (1973) found that children Lambert, 1988). This model is supported by evi-
in the Canadian immersion program (for learning dence that semantic priming produces facilitation
French) tended to score more highly on tests of between languages (e.g., Chen & Ng, 1989; Jin,
creativity than monolinguals. Bilingual children, 1990; Schwanenflugel & Rey, 1986; see Altarriba,
compared with monolingual children, show an 1992, and Altarriba & Mathis, 1997, for a review).
advantage in knowing that a word is an arbitrary Studies that minimize the role of attentional pro-
name for something (Hakuta & Diaz, 1985). cessing and participants’ strategies, and that maxi-
Although some researchers have argued mize automatic processing (e.g., by masking the
that there is no obvious processing cost attached stimulus, or by varying the proportion of related
to being bilingual (e.g., see Nishimura, 1986), pairs—see Chapter 6), suggest that equivalent
others have found indications of interference words share an underlying semantic representation
between L1 and L2 (see B. Harley & Wang, that can mediate priming between the two words
1997, for a review). For example, increasing pro- (Altarriba, 1992). Most of the evidence now tends
ficiency in L2 by immigrant children is associ- to favor the common-store hypothesis. However,
ated with reduced speed of access to L1 (Magiste, early and late learners show different patterns of
1986). B. Harley and Wang (1997, p. 44) con- cross-language priming, with late learners showing
clude that “monolingual-like attainment in each much less priming (Silverberg & Samuel, 2004),
of a bilingual’s two languages is probably a myth suggesting once again that age-of-acquisition is
(at any age).” critical in how bilinguals represent and access
On the other hand, there is now an over- words, with late learners having separate lexicons
whelming body of research showing that bilin- mediated at the conceptual levels.
gualism confers a general cognitive advantage Another possibility is that some people use a
in the form of enhanced flexibility. There is even mixture of common and separate stores (Taylor &
evidence that being bilingual protects people Taylor, 1990). For example, concrete words, cog-
to some extent against developing Alzheimer’s nates (words in different languages that have the
disease by helping to build up the mind’s “cog- same root and meaning and which look similar),
nitive reserve” that slows down cognitive aging and culturally similar words act as though they
(Bialystok, Craik, & Luk, 2012). are stored in common, whereas abstract and other
words act as though they are in separate stores.
Also steering between the common- and separate-
Bilingual language processing stores models, Grosjean and Soares (1986) argued
How many lexicons does a bilingual speaker pos- that the language system is flexible in a bilingual
sess? Is there a separate store for each language, or speaker, and that its behavior depends on the cir-
just one common store? In separate-store models, cumstances. In unilingual mode, when the input
there are separate lexicons for each language. These and output are limited to only one of the available
are connected at the semantic level (Potter, So, von languages, and perhaps when the other speakers
Eckardt, & Feldman, 1984). Evidence for the sep- involved are unilingual in that language, inter-
arate-stores model comes from the finding that the action between the language systems is kept to
amount of facilitation gained by repeating a word a minimum; the bilingual tries to switch off the
(a technique called repetition priming) is much second language. In the bilingual mode, both lan-
greater and longer lasting within than between language systems are active and interact. How speakers
guages (Kirsner, Smith, Lockhart, King, & Jain, have strategic control over their language systems
1984), although repetition priming might not be is a topic that largely remains to be explored.
tapping semantic processes (Scarborough, Gerard, What happens when a bilingual speaker
& Cortese, 1984). In common-store models, there hears or sees a word? How do they prevent the
is just one lexicon and one semantic memory sys- two languages from interfering with one another?
tem, with words from both languages stored in it Bilingual speakers must have mechanisms in place
to prevent interference. In an event-related potential (ERP) study, bilingual Spanish–Catalan
speakers were instructed to press a button when they saw a word in one of the languages, and to
ignore words in the other (Rodriguez-Fornells, Rotte, Heinze, Nosselt, & Munte, 2002). The brain
potentials of the participants showed that they were not sensitive to the frequency of the words in
the ignored language, suggesting that the words did not reach a high level of processing. However,
fMRI activation had a lot in common with the way in which we process nonwords. This pattern of
results suggests that speakers use quite low-level information to block words in the non-target
language at a very early stage, such that the meanings of these words do not become activated.
Further evidence for this low-level blocking of the non-target language comes from an
electrophysiological study of very fluent Italian–Slovenian bilinguals. The pattern of activation
while reading suggested that discrimination between the two languages is taking place at a very
early stage (Proverbio, Cok, & Zani, 2002).

Bilingual syntactic processing
There has been much less research on how bilingual people process syntax than there has on how they
process individual words. The issues are much the same: for languages that use similar sorts of
construction, do people store syntactic knowledge separately for each language, or just once, in a
shared store? A study of Spanish–English bilingual speakers found that a particular syntactic
structure in one language could make it easier to use the same structure in the second language,
supporting the “shared syntax” idea (Hartsuiker, Pickering, & Veltkamp, 2004). Similarly, Loebell
and Bock (2003) found that production of German datives primed the subsequent use of English
datives, and vice versa. Similar results have been found in Dutch–English bilinguals (Salamoura &
Williams, 2006).

Translating between languages
How do we translate between two languages? As we might remember from school, or from our last
foreign holiday, translating a foreign language can be fraught with difficulties. I remember once
complimenting a chef in Spanish on his swimming pool (rather than his fish).

Kroll and Stewart (1994) proposed that translation by second-language novices is an asymmetric
process. They argued that we translate words from our first language into the second language
(called forward translation) by conceptual mediation. This means that we must access the meaning of
a word in order to translate it. In contrast, we translate from the second language into the first
(called backward translation) by word association; that is, we use direct links between items in
the lexicon (see Figure 5.1). The evidence for this asymmetry is that semantic factors (such as the
items to be translated being presented in semantically arranged lists) have a profound effect on
forward translation, but little or no effect on backward translation. In addition, backward
translation is usually faster than forward translation.

Having said this, there is some evidence that backward translation (from L2 to L1) might also be
semantically mediated. De Groot, Dannenburg, and van Hell (1994) found that semantic variables such
as imageability affect translation times in backward translation, although to a lesser extent than
in forward translation. La Heij, Hooglander, Kerling, and van der Velden (1996) found that backward
translation was facilitated by the presence of congruent pictures and hindered by incongruent
pictures, suggesting that the translation involves accessing semantics. Hence it is likely that
translation in both directions involves going through the semantic representations of the words. It
is also probable that the extent of conceptual mediation increases as the speaker becomes more
proficient in L2.

FIGURE 5.1 Translation between L1 and L2 (Kroll & Stewart, 1994): forward translation (L1 to L2)
via conceptual mediation; backward translation (L2 to L1) via word association.
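The asymmetry proposed by Kroll and Stewart (1994) can be made concrete with a small sketch. The
Python toy below is an illustration only, not a model taken from the literature: the two-word
English–Spanish lexicon, the `Lexicon` class, and the two route functions are all assumptions made
for the example. Forward translation (L1 to L2) passes through a shared concept, whereas backward
translation (L2 to L1) follows a direct word-to-word link.

```python
from dataclasses import dataclass, field


@dataclass
class Lexicon:
    """Toy bilingual lexicon (hypothetical data: English as L1, Spanish as L2)."""
    # Conceptual route: L1 word -> concept, then concept -> L2 word.
    l1_to_concept: dict = field(
        default_factory=lambda: {"fish": "FISH", "table": "TABLE"})
    concept_to_l2: dict = field(
        default_factory=lambda: {"FISH": "pescado", "TABLE": "mesa"})
    # Word-association route: direct lexical links from L2 words to L1 words.
    l2_to_l1: dict = field(
        default_factory=lambda: {"pescado": "fish", "mesa": "table"})


def forward_translate(lex, l1_word):
    """L1 -> L2 by conceptual mediation: the word's meaning is accessed first."""
    concept = lex.l1_to_concept[l1_word]
    return lex.concept_to_l2[concept], ["L1 word", "concept", "L2 word"]


def backward_translate(lex, l2_word):
    """L2 -> L1 by word association: one direct link, no access to meaning."""
    return lex.l2_to_l1[l2_word], ["L2 word", "L1 word"]


lex = Lexicon()
word, path = forward_translate(lex, "fish")
word_back, path_back = backward_translate(lex, "pescado")
print(word, path)
print(word_back, path_back)
```

In this sketch the backward route is shorter and never touches the concept level, which is one way
of picturing why semantic manipulations would affect forward translation more and why backward
translation would tend to be faster. The later findings that backward translation is also
semantically mediated would correspond to routing both directions through the concept level.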
Picture–word interference studies suggest that in production only words of the target language are
ever considered for selection. Many studies have shown that words in different languages interfere
with one another (e.g., Ehri & Ryan, 1980). For example, it takes Catalan–Spanish bilinguals longer
to name the picture of a table in Catalan if the Spanish word for chair is the distractor rather
than an unrelated word. Costa, Miozzo, and Caramazza (1999) presented Catalan–Spanish bilinguals
with pictures to name in Catalan. In their experiment, the name of the picture (not the name of a
word related in meaning) was printed on top of the picture either in Catalan (same-language pairs)
or Spanish (different-language pairs). The critical condition is the different-language pair. If
choosing a word is not language-specific, the different-language condition should cause a great
deal of interference, as the word written in Spanish and the name of the picture in Catalan will
compete with each other. But if choosing a word is language-specific, then the Spanish distractor
name should not be able to compete with the Catalan word. Instead, if anything, it should
facilitate the production of the Catalan name through the intermediary of its meaning. Costa et al.
found the latter: Having the name of the picture printed above the target picture in the
non-response language led to facilitation. This finding suggests that only words of the target
language are ever considered for output.

A different picture holds for auditory comprehension. Eye-tracking studies suggest that both
languages are automatically considered. When bilingual people look at visual scenes searching for
particular items in the first language, they also look at items with a name starting the same in
the second, irrelevant language (Marian & Spivey, 2003; Spivey & Marian, 1999). For example, when
an English–Russian bilingual looks for a “spear” in a visual array, they will also glance at a box
of matches, because its name in Russian (“spichki”) overlaps substantially with the English word.

Models of bilingualism
The most influential model of bilingualism that attempts to tell a complete story of the
psychological processes involved is the Bilingual Interactive Activation Plus (BIA+) model (a
development of the original BIA model to include phonological and sublexical levels of processing;
see Dijkstra & van Heuven, 2002; Dijkstra, van Heuven, & Grainger, 1998). The model attempts to
bring together all types of evidence concerning the orthographic processing of two languages, but
makes particular use of how we recognize cognates, words that look the same (or very similar) in
the two languages (such as “silence” in English and French, or “animal” in English and Spanish). In
the BIA+ model, lexical access is non-language specific in its earliest stages, so words from both
languages are activated, whatever the input. The model comprises a network of nodes at each level
of representation (e.g., words, phonemes), connected together by facilitatory and inhibitory
connections. The model is purely bottom-up in the sense that word recognition cannot be affected by
the particular task (e.g., naming, lexical decision) being carried out. The model is characterized
by “language” nodes, which tag representations according to the language to which they belong. The
“language” nodes can receive activation from words (bottom-up) but can also send top-down
inhibition. Recent work has centered on how bilingual processing is localized in the brain (e.g.,
Moreno & Kutas, 2009).

The neuroscience of bilingualism
There is some evidence that bilinguals with right-hemisphere damage show more aphasia (crossed
aphasia) than monolinguals (Albert & Obler, 1978; Hakuta, 1986). Crossed aphasia might arise
because the right hemisphere is involved in L2 acquisition, particularly if L2 is acquired
relatively late (Martin, 1998; Obler, 1981; Vaid, 1983), or because language is less asymmetrically
represented in the two hemispheres in bilingual speakers, although this is highly controversial
(Obler & Hannigan, 1996; Paradis, 1997). An ERP study of responses to words in 19–22-month-old
English–Spanish bilingual children showed that the more dominant language becomes lateralized
before the less dominant one (Conboy & Mills, 2006). In addition to the types of aphasia shown by
individuals who speak only one language, brain damage sometimes causes additional disorders in
people who speak two languages. For example, we can observe pathological switching and mixing of
languages, and difficulties in translating between the languages.
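To make the architecture described under “Models of bilingualism” more tangible, here is a
deliberately tiny interactive-activation sketch in Python. It is an assumption-laden toy, not the
published BIA+ implementation: the four-word English/Dutch lexicon, the letter-overlap measure of
bottom-up support, and every parameter value (`EXCITE`, `LATERAL`, `TOP_DOWN`) are invented for
illustration. It shows only the points made in the text: the input activates candidate words from
both languages, word nodes inhibit one another, and language nodes collect activation from their
own words and send inhibition down to words of the other language.

```python
# Hypothetical mini-lexicon: each four-letter word is tagged with its language.
LEXICON = {"work": "EN", "word": "EN", "werk": "NL", "wolk": "NL"}

EXCITE = 0.4     # bottom-up support scaled by letter overlap (assumed value)
LATERAL = 0.1    # inhibition between competing word nodes (assumed value)
TOP_DOWN = 0.05  # inhibition from a language node to the other language's words (assumed)


def letter_overlap(stimulus, word):
    """Proportion of letter positions shared by two equal-length strings."""
    return sum(a == b for a, b in zip(stimulus, word)) / len(stimulus)


def clamp(x):
    """Keep an activation value inside the range 0..1."""
    return max(0.0, min(1.0, x))


def recognize(stimulus, cycles=10):
    """Run a few update cycles and return word and language-node activations."""
    words = {w: 0.0 for w in LEXICON}
    langs = {"EN": 0.0, "NL": 0.0}
    for _ in range(cycles):
        total = sum(words.values())
        updated = {}
        for w, act in words.items():
            bottom_up = EXCITE * letter_overlap(stimulus, w)   # language-blind input
            competition = LATERAL * (total - act)              # from all other words
            other_langs = [l for l in langs if l != LEXICON[w]]
            top_down = TOP_DOWN * sum(langs[l] for l in other_langs)
            updated[w] = clamp(act + bottom_up - competition - top_down)
        words = updated
        # Language nodes receive activation from the words tagged with that language.
        langs = {l: clamp(sum(a for w, a in words.items() if LEXICON[w] == l))
                 for l in langs}
    return words, langs


word_acts, lang_acts = recognize("work")
for w, a in sorted(word_acts.items(), key=lambda item: -item[1]):
    print(f"{w} ({LEXICON[w]}): {a:.2f}")
print(lang_acts)
```

Because the bottom-up term ignores the language tag, the Dutch neighbors of an English input become
active alongside the English ones, which is the non-selective access the text describes. There is
no task parameter anywhere in the update rule, in keeping with the claim that recognition in the
model is not affected by the task being carried out.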
Colored computed tomography (CT) scans of horizontal sections through different levels of a stroke
victim’s brain. (The front of the brain is at the top in each image.) The stroke has resulted in
internal bleeding (white/orange). The mass of blood (hematoma) extends up and down in the brain as
well as across the left hemisphere, and has ruptured the ventricles (black) that carry the brain’s
cerebrospinal fluid. This brain damage caused aphasia as well as paralysis of one side of the body.

The most interesting issue is the extent to which processing of different languages tends to be
localized in different parts of the brain. One of the first reports of this was by Scoresby-Jackson,
describing the case of an Englishman who, after a blow to the head, selectively lost his knowledge
of Greek. Since then there have been a number of reports of the selective impairment of one
language following brain damage, and many more of differential recovery of the two languages (see
Fabbro, 2001; Obler & Hannigan, 1996; Paradis, 1997). The evidence is consistent with two
independent language systems connected at the conceptual level.

Imaging suggests that the time of acquisition most affects the grammatical aspects of language. The
lexicons of both early and late bilinguals are organized similarly. However, individuals who
acquire the second language after the age of 7 show different organization (Fabbro, 2001). In
particular, in early-acquisition bilinguals, closed- and open-class words are stored in different
parts of the brain; in late-acquisition bilinguals closed-class words are stored with open-class
words. There are other differences in comprehension between monolinguals and bilinguals. Bilinguals
are generally slower to respond to linguistic stimuli, regardless of what language the stimuli are
in (Green, 1986; Proverbio et al., 2002). Electrophysiological measures show complex differences in
reading and comprehension (Proverbio et al., 2002).

SECOND LANGUAGE ACQUISITION
Second language acquisition happens when a child or an adult has already become competent at a
language and then attempts to learn another. We should distinguish between learning a second
language naturalistically (e.g., when a child or person moves to a new country) and class-based
instruction.

There are a number of reasons why a person might find learning a second language difficult. First,
we saw in Chapter 3 that some aspects of language learning, particularly involving syntax, are more
difficult outside the critical period. Second, older children and adults often have less
time and motivation to learn a second language. Third, there will of course be similarities and
differences between the first (L1) and second (L2) languages. The contrastive hypothesis (Lado,
1957) says that the learner will experience difficulty when L1 and L2 differ. In general, the more
idiosyncratic a feature is in a particular language relative to other languages, the more difficult
it will be to acquire (Eckman, 1977). This cannot be the whole story, however, as not all
differences between languages cause problems. For example, Duskova (1969) found that many errors
made by Czech speakers learning English were made on syntactic constructions in which the two
languages do not differ.

There is some evidence that the time course of L2 acquisition follows a U-shaped curve: initial
learning is good, but then there is a decline in performance before the learner becomes more
skilled (McLaughlin & Heredia, 1996). The decline in performance is associated with the
substitution of more complex internal representations for less complex ones. That is, the learner’s
knowledge becomes restructured. For example, as learners move from learning by rote to using
syntactic rules, utterances tend to become shorter.

A number of methods have been used to teach a second language (see Figure 5.2). The traditional
method is based on translation from one to another, with lectures in grammar in the primary
language. Direct methods (such as the Berlitz method) on the other hand carry out all teaching in
L2, with emphasis on conversational skills. The audiolingual method emphasizes speaking and
listening before reading and writing. The immersion method teaches a group of learners exclusively
through the medium of the foreign language. In the more extreme submersion method, the learner is
surrounded exclusively by speakers of L2, usually in the foreign country, and the learner has to
“sink or swim.”

A number of methods can be used to teach a second language. One of these is the audiolingual
method, which emphasizes speaking and listening before reading and writing.

FIGURE 5.2 Methods used to teach a second language.
Traditional method: direct translations from L1 to L2; lectures in grammar in L1.
Direct method: all teaching done in L2, with emphasis on conversational skills.
Audiolingual method: speaking and listening are emphasized rather than reading and writing.
Immersion method: learner taught exclusively through medium of L2.
Submersion method: learner is surrounded exclusively by speakers of L2, usually in a social setting
or foreign country.

The work of Krashen (1982) has proved influential, if controversial, in understanding how we might
better teach languages. He proposed five hypotheses concerning language acquisition that together
form the monitor model of second language learning (see Figure 5.3). Central to his approach is a
distinction between language learning (which is what traditional methods emphasize) and language
acquisition (which is more akin to what children do naturally). Learning emphasizes explicit
knowledge of grammatical rules, whereas acquisition emphasizes their unconscious use. Although
learning has its role, to be more successful, second language acquisition should place more
emphasis on acquisition.

The first of the five hypotheses is the acquisition and learning distinction hypothesis: children
acquire their first language largely unconsciously and automatically; they do not learn it. Earlier
views that emphasized the importance of the critical period maintained that adults could only learn
a second language consciously and effortfully. Krashen argued that adults could indeed acquire the
second language. The second hypothesis is the natural order in acquisition hypothesis. The order of
acquisition of syntactic rules, and the types of errors of generalization made, are the same in
both languages.

The third and fourth hypotheses are central to Krashen’s approach. The third hypothesis is the
monitor hypothesis. It states that the acquisition processes create sentences in the second
language, but learning enables the development of a monitoring process to check and edit this
output. This can only happen if there is sufficient time in the interaction; hence it is difficult
to employ the monitor in spontaneous conversation. The monitor uses knowledge of the rules rather
than the rules themselves (in a way reminiscent of Chomsky’s distinction between competence and
performance). The fourth hypothesis is the comprehensible input hypothesis. In order to move from
one stage to the next, the acquirer must understand the meaning and the form of the input. This
hypothesis emphasizes the role of comprehension. Krashen argues that production does not need to be
explicitly taught: it emerges by itself in time, given understanding, and the input at the next
highest level need not contain only information from that level. Finally, the affective filter
hypothesis says that attitude and emotional factors are important in second language acquisition,
and that they account for a lot of the apparent difference in the facility with which adults and
children can learn a second language.

FIGURE 5.3 Monitor model (Krashen, 1982): acquisition and learning distinction hypothesis; natural
order in acquisition hypothesis; monitor hypothesis; comprehensible input hypothesis; affective
filter hypothesis.

Krashen’s approach provides a useful framework, and has proved to be one of the most influential
theoretical approaches to teaching a second language. More recent work has moved away from the idea
that acquisition and learning are so very different, emphasizing the practicalities of how learners
can best acquire novel material, and exploring the role of attention and covert learning in
language learning (see Doughty & Long, 2005).

In addition to teaching method, individual differences between second language learners play some
role in how easily people acquire L2 (Robinson, 2001). In a classic study, Carroll (1981)
identified four sources of variation in people’s ability to learn a new language. These were:
phonetic coding ability (the ability to identify new sounds and form associations between them, an
aspect of what is called phonological awareness); grammatical sensitivity (the ability to recognize
the grammatical functions of words and other syntactic structures); rote-learning ability; and
inductive learning ability (the ability to infer rules from data). Working memory plays an
important role in foreign language vocabulary learning (Papagno, Valentine, & Baddeley, 1991), and
it is possible to recast Carroll’s four components of language learning in terms of the size,
speed, and efficiency of working memory functions (McLaughlin & Heredia, 1996). Motivation, of
course, also plays a significant role; people who want or need to learn will do better (Dörnyei,
1990).
How can we make second language acquisition easier?
Second language acquisition is often characterized by a phase or phases of silent periods when few
productions are offered despite obvious development of comprehension. Classroom teaching methods
that force students to speak in these silent periods might be doing more harm than good. Newmark
(1966) argued that this has the effect of forcing the speaker back onto the rules of the first
language. Hence silent periods should be respected.

Krashen (1982) argued we should make second language acquisition more like first language
acquisition by providing sufficient comprehensible input. The immersion method, involving complete
exposure to L2, exemplifies these ideas. Whole schools in Montreal, Canada, contain
English-speaking children who are taught in French in all subjects from their first year (Bruck,
Lambert, & Tucker, 1976). Immersion seems to have no deleterious effects, and if anything might be
beneficial for other areas of development (e.g., mathematics). The French acquired is very good but
not perfect: there is a slight accent, and syntactic errors are sometimes made.

There might be limits, however, to how much immersion is ideal. Recall the “less-is-more” theory
from Chapter 4: that starting small is an advantage to children learning language. Kersten and
Earles (2001) found that adults learned an artificial language better when they were initially
presented with only small segments of the language than when they were exposed to the full
complexity of the language from the beginning. Perhaps children learn the new language in spite of
the immersion rather than because of it. Immersion might be particularly counter-productive for
adults who, without the cognitive limitations of childhood, will have great difficulty in applying
a “less-is-more” strategy.

Sharpe (1992) identified what he called the “four Cs” of successful modern language teaching (see
Figure 5.4). These are communication (the main purpose of learning a language is aural
communication, and successful teaching emphasizes this); culture (which means learning about the
culture of the speakers of the language and de-emphasizing direct translation); context (which is
similar to providing comprehensible input); and giving the learners confidence. These points may
seem obvious, but they are often neglected in traditional, grammar-based methods of teaching
foreign languages.

Finally, some particular methods of learning second languages are of course better than others.
Ellis and Beaton (1993) reviewed what facilitates learning foreign language vocabulary. They
concluded that simple rote repetition is best for learning to produce the new words, but that using
keywords is best for comprehension. Naturally, learners want to be able to do both, so a
combination of techniques is the optimum strategy.

FIGURE 5.4 Four Cs of successful modern language teaching.
Communication: emphasis on aural communication.
Culture: learning about the culture and de-emphasizing direct translation.
Context: providing comprehensible input.
Confidence: given to learners.
EVALUATION OF WORK ON BILINGUALISM AND SECOND LANGUAGE ACQUISITION
The study of bilingualism and second language acquisition is an increasingly important topic in
psycholinguistics. First, the way in which bilingual people represent and process two languages is
of great interest to psycholinguists. Second, it is clearly important that we should be able to
teach a second language in the most efficient way. Third, it provides us with an additional tool
for investigating language and cognition. For example, Altarriba and Soltano (1996) used knowledge
of how bilinguals store language to investigate the phenomenon known as repetition blindness
(Kanwisher, 1987). Repetition blindness refers to the observation that people are very poor at
recalling repeated words when the words are presented rapidly. For example, when given the sentence
“she ate salad and fish even though the fish was raw,” participants showed very poor recall of the
second presentation of the word “fish.” The explanation of repetition blindness is that the
repeated word is not recognized as a distinct event and somehow becomes assimilated with the first
presentation of the word. It appears to be the visual and phonological (sound) similarity that is
important in generating repetition blindness: Words that sound the same (e.g., “won” and “one”)
produce repetition blindness, whereas words that are similar in meaning (e.g., “autumn” and “fall”)
do not (Bavelier & Potter, 1992; Kanwisher & Potter, 1990). Altarriba and Soltano confirmed that
meaning plays no part in repetition blindness using non-cognate translation equivalents. These are
words in different languages that have the same meaning but different physical forms (e.g.,
“nephew” and “sobrino” in English and Spanish). They found that a sentence such as (1) generated
repetition blindness in fluent Spanish–English participants (people had very poor recall for the
second instance of “ant”) but (2) did not:

(1) I thought we had killed the ants but there were ants in the kitchen.
(2) I thought we had killed the ants pero habian hormigas en la cocina.

Clearly similarity in meaning cannot be responsible for the repetition blindness effect. The
results also show that conceptual access in translation is very rapid for bilingual speakers, and
also that bilingualism may facilitate some aspects of memory.

Learning and using one language is an impressive achievement; learning and managing several is
incredible.
SUMMARY
• Second language acquisition in adulthood and later childhood is difficult because it is not like
  first language acquisition.
• There are probably both costs and benefits of learning two languages at once. There might be
  some general cognitive advantages.
• There has been much debate as to how we translate words between languages; in particular,
  whether or not there are direct links between words in our mental dictionaries, or whether the
  entries are mediated by semantic links.
• Translation probably does involve conceptual mediation.
• Bilingualism is a useful tool for studying other language processes.
1. How would you suggest teaching a second language based on psycholinguistic principles?
2. How would your answer differ if you were teaching (a) 3-year-olds; (b) 10-year-olds;
(c) 20-year-olds?
3. What are the advantages of knowing more than one language? What are the disadvantages?
FURTHER READING
There are many reference works on bilingualism and second language acquisition. Examples of more
detailed reviews include Kilborn (1994) and Klein (1986). Books covering the area in greater depth
include Bialystok and Hakuta (1994), de Groot and Kroll (1997), Ritchie and Bhatia (1996)—particularly
the review chapter by Romaine—and Romaine (1995). For a review of research on code switching, see
Grosjean (1997). Altarriba (1992) reviews work on bilingual memory. The book by Fabbro (1999) pro-
vides an introduction to the neuropsychology of bilingualism; see also Fabbro (2001). See McLaughlin
(1987) for a discussion of Krashen’s work. For a cognitive approach to second language learning, see
Skehan (1998). Doughty and Long’s Handbook of Second Language Acquisition (2005) provides a fairly
recent review of all the main topics in the area.
SECTION C
WORD RECOGNITION
This section examines how we recognize printed (or written) and spoken words, and how we turn
printed words into sound. It also examines disorders of reading, and how children learn to read.

Chapter 6, Recognizing visual words, examines the process that takes place when we recognize a
written word. How do we decide on the meaning of a word, or even whether we know the word or not?
What methods are available to psycholinguists to study phenomena involved in word recognition, and
what models best explain them?

Chapter 7, Reading, looks at how human beings access sound and meaning from a written text. What
can studies of people with brain damage tell us about this process?

Chapter 8, Learning to read and spell, looks at how children learn to read. What is the best method
of teaching this vital skill? How do children learn to spell? Why do some children find reading
difficult to learn?

Chapter 9, Understanding speech, turns to the question of how we recognize the sounds we hear as
speech. How do we decide where one word ends and another begins in the stream of sound that is
spoken language? How can context help, and what models have been suggested to explain how spoken
word recognition operates?
CHAPTER 6
RECOGNIZING VISUAL WORDS
INTRODUCTION
How do we recognize written or printed words? When we see or hear a word, how do we access its
representation and meaning within the lexicon? How do we know whether an item is stored there or
not? If there are two or more meanings for the same word (e.g., “bank”), how do we know which
meaning is intended?

Although recognition involves identifying an item as familiar, we are not only interested in
discovering how we decide if a printed string of letters is familiar or not, but also how all the
information that relates to a word becomes available. For example, when you see the string of
letters “g h o s t,” you know more than that they make up a word. You know what the word means,
that it is a noun and can therefore occupy certain roles in sentences but not others, and how the
word is pronounced. You further know that its plural is formed regularly as “ghosts.” In lexical
access, we access the representation of an item from its perceptual representation and then this
sort of information becomes available.

In this chapter we focus on how lexical access takes place, how we assess a word’s familiarity, how
we recognize it, and how we access its meaning. In the next chapter, we concentrate on how we
pronounce the word, and on the relation between accessing its sound and accessing its meaning.

Is there a gap between recognizing a word and accessing its meaning? Balota (1990) called the point
in time when a person recognizes a word and accesses its meaning “the magic moment.” In models with
a magic moment, a word’s meaning can only be accessed after it has been recognized. Johnson-Laird
(1975) proposed that the depth of lexical access may vary. He noted that sometimes we retrieve
hardly any information for a word. Gerrig (1986) extended this idea, arguing that there are
different “modes of lexical access” in different contexts. It is an intuitively appealing idea,
fitting with our introspection that sometimes when we read we are getting very little sense from
what we are reading.

Gerrig (1986) argued that there are different “modes of lexical access.” This fits with our feeling
that sometimes we get very little sense from what we are reading.

Although the processing of spoken language has a great deal in common with the processing of visual
language, one important difference is that the speech signal is only available for a short
168 C. WORD RECOGNITION
time, whereas under normal conditions a written BASIC METHODS AND

word is available for as long as the reader needs
it. Nevertheless, many of the processes involved
FINDINGS
in accessing the meaning of words are common Six main methods have been used to explore
to both visual and spoken word recognition. We visual word recognition. These are brain imag-
will look at spoken word recognition in Chapter 9, ing (see Chapter 1); examining eye movements;
although many of the findings in the present chap- measuring naming, lexical decision, and catego-
ter also apply to the way we understand spoken rization times; and tachistoscopic identification.
words. For example, facilitation of recognition
by words related in meaning is found in stud-
ies of both spoken and visual word recognition.
Studying eye movements
Selecting the appropriate meaning of an ambigu- The study of eye movements has become impor-
ous word is a problem for both spoken and visual tant in helping us understand both how we rec-
word recognition. ognize words and how we process larger units of
While the great majority of human beings printed language. There are a number of different
have used spoken language for a very long techniques available for investigating eye move-
time, literacy is a relatively recent development. ments. One simple technique is called limbus
There has been a great deal of research on visual tracking. An infra-red beam is bounced off the
word recognition, in part because of conveni- eyeball and tracks the boundary between the iris
ence. Although written language might not be and the white of the eye (the limbus). Although
as fundamental as spoken language, it is excep- this system is good at tracking horizontal eye
tionally useful. Literacy is an important fea- movements, it is relatively poor at tracking verti-
ture of modern civilization. The study of word cal movements. Therefore one of the most com-
recognition should have many implications for monly used techniques is the Purkinje system,
teaching children to read, for the remediation which is accurate at tracking both horizontal and
of illiteracy, and for the rehabilitation of peo- vertical movements. It takes advantage of the fact
ple with reading difficulties. By the end of this that there are several sources of reflection from
chapter you should: the eye, such as the cornea and the back of the
lens. The system computes the movements of the
x Appreciate how word recognition is related to exact center of the pupil from this information.
other cognitive processes. When we read, we do not move our eyes
x Know that recognizing a word occurs when we smoothly. Instead, the eyes travel in jumps called
access its representation in the mental lexicon. saccades of about 20 to 60 ms in duration, with
x Know what makes word recognition easier or intervals of around 200 to 250 ms when the eye is
more difficult. still (Rayner, 1998). These still periods are called
x Understand the phenomenon of semantic prim- fixations (see Figure 6.1). Very little information is
ing and how it occurs. taken in while the eye is moving in a saccade. The
x Know how the various tasks used to study information that can be taken in within a fixation is
word recognition might give different results. limited—15 characters to the right and only 3–4 to
x Appreciate that different aspects of a word’s the left in English speakers (McConkie & Rayner,
meaning are accessed over time. 1976; Rayner, Well, & Pollatsek, 1980). This asym-
x Know how we process morphologically com- metry is reversed for Hebrew readers, who read
plex words. from right to left (Pollatsek, Bolozky, Well, &
x Know about the serial search, logogen, and Rayner, 1981). Skilled readers may be able to take in
Interactive Activation and Competition (IAC) more information in one fixation—that is, they have
models of word recognition. a larger span—than less skilled readers (Martin,
x Understand how we cope with lexical ambigu- 2004). Information from the more distal regions of
ity, when a word can have two meanings. the span is used to guide future eye movements.
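The asymmetric window of information available during a fixation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `perceptual_span` and its parameters are hypothetical, the defaults (4 characters to the left, 15 to the right) take the figures quoted above for readers of English, and character indices are treated as visual positions counted from the left edge of the line.

```python
def perceptual_span(text, fixation, left=4, right=15, right_to_left=False):
    """Return the characters assumed to be available during one fixation.

    `fixation` is the index of the fixated character, counted as a visual
    position from the left edge of the line. The default window sizes are
    illustrative values taken from the text, not exact constants.
    For a right-to-left script the asymmetry is mirrored.
    """
    if right_to_left:
        left, right = right, left
    start = max(0, fixation - left)
    end = min(len(text), fixation + right + 1)
    return text[start:end]


line = "Roadside joggers endure sweat, pain, and angry drivers"
window = perceptual_span(line, fixation=12)
print(repr(window))  # ' joggers endure swea'
```

Mirroring the two window sizes is the simplest way to express the reversed asymmetry reported for Hebrew readers; a fuller model would also let the span vary with reading skill.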
FIGURE 6.1 Diagram showing a typical progression of fixations and variations in saccade length. The dots indicate the place of the fixation; the first number below the dot indicates its position in the sequence (note the “overshoot” phenomenon at fixation 20, in which the first fixation on a new line often falls too far into a sentence and a regression is required). The second number below the dot indicates the duration of each fixation in milliseconds.
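As a small worked example, the numbers printed in Figure 6.1 can be summarized in Python. The durations below are the eight values shown for the first line of the figure's sample text, and the ordering list is the left-to-right sequence of fixation numbers on the line containing the overshoot at fixation 20. The helper name `find_regressions` is hypothetical, and treating the printed order as spatial position is an assumption about how the figure is laid out.

```python
from statistics import mean

# Durations (ms) of the first eight fixations as printed in Figure 6.1.
first_line_durations = [286, 221, 246, 277, 256, 233, 216, 188]

# Fixation sequence numbers in left-to-right spatial order on the line
# of the sample text that contains the "overshoot" at fixation 20.
overshoot_line_order = [21, 20, 22, 23, 24, 25, 26, 27]


def find_regressions(spatial_order):
    """Return (from, to) pairs of consecutive fixations that moved leftward.

    `spatial_order` lists fixation sequence numbers from left to right.
    """
    position = {fix: i for i, fix in enumerate(spatial_order)}
    temporal = sorted(spatial_order)
    return [
        (a, b)
        for a, b in zip(temporal, temporal[1:])
        if position[b] < position[a]
    ]


mean_duration = mean(first_line_durations)
regressions = find_regressions(overshoot_line_order)
print(mean_duration)  # 240.375
print(regressions)    # [(20, 21)]
```

The mean falls inside the 200 to 250 ms range quoted for typical fixations, and the single leftward move from fixation 20 to fixation 21 is the regression that corrects the overshoot.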
The fovea is the most sensitive part of the visual field, and corresponds to the central seven characters or so of average-size text, subtending the central 2° of vision. The fovea is surrounded by the parafovea (extending 5° either side of the fixation point) where visual acuity is poorer; beyond this is the periphery, where visual acuity is even poorer. We extract most of the meaning of what we read from the foveal region. Rayner and Bertera (1979) displayed text to readers with a moving mask that creates a moving blindspot. If the foveal region was masked, reading was possible from the parafoveal region (just outside the fovea), but at a greatly reduced rate (only 12 words a minute). If both the foveal and parafoveal regions were masked, virtually no reading was possible. Participants knew that there were strings of letters outside the masked portion of text, could report the occasional grammatical function word such as “and,” and could sometimes obtain information about the starts of words. For example, one participant read “The pretty bracelet attracted much attention” as “The priest brought much ammunition.”

Sometimes we make mistakes, or need to check previous material, and have to look backwards. These eye movements back to previous material, called regressions, are sometimes so brief that we are not aware of them. As we will see in Chapter 10, the study of these regressive eye movements provides important information about how we disambiguate ambiguous material.

There has been considerable debate as to which measure from eye movements is the most informative (Inhoff, 1984; Rayner, 1998). Should it be first fixation duration (the amount of time the eye spends looking at a region in the first fixation), or should it be total gaze time (which also includes the time spent looking at a region in any later regression)? Most researchers now select regions of the text for detailed analysis and report a number of measures for that region.

How are eye movements controlled when reading: what determines where the eyes look and when? The most influential model of eye-movement control is the E-Z Reader model (Reichle, Rayner, & Pollatsek, 1999, 2003). In the E-Z Reader model, attention, visual processing, and oculomotor control jointly determine when and where the eyes move when we are reading. The central idea of this model is that, when we read, we fixate on a point, and then visual attention progresses across the line of text until a point is reached where the acuity limitations of the visual system then make it difficult to extract more information and recognize new words. Attention then shifts and an eye movement is programmed into the oculomotor system to move to the point of difficulty. A saccade then takes place to the new location, and the process is repeated. Saccades are programmed in two stages: there is an early labile stage when the planned saccade can be canceled if it turns out that it is no longer necessary (e.g., because we have managed to identify the word in the proposed target location); after this initial labile stage saccades cannot be canceled. The central, and the most controversial, assumption of the model is that attention is allocated to one word after another in a strictly serial fashion, shifting only after each word is identified. This assumption ensures that words are processed in the correct order. Word “identification” occurs in two stages: the first stage is a familiarity check (do I know this word? Am I likely to be able to use it?). Completion of the first stage can trigger the programming of a saccade. The second stage is full lexical access, where meaning is retrieved and the representation of the word integrated with the emerging linguistic structure. Completion of the second stage triggers the shift in attention to the next word along. Hence saccades and attention are decoupled in this model, and have different sources of control (familiarity and identification). Linguistic processing can affect eye movements; for example, if an analysis turns out to be wrong, we might return to an earlier location. In the model, higher level processes intervene in the general drive forward only when something goes wrong.

Reaction time measures

In the naming task, participants are visually presented with a word that they then have to name, and the time it takes a participant to start to pronounce the word aloud (the naming latency) is measured. Naming latencies are typically in the order of 500 ms from the onset of the presentation of the word.

In the lexical decision task the participant must decide whether a string of letters is a word or nonword. In the more common visual presentation method, the letter string is displayed on a computer screen (there is also an auditory version of this task). For example, the participant should press one key in response to the word “nurse” and another key in response to the nonword “murse.” The experimenter measures reaction times and error rates. One problem with this task is that experimenters must be sensitive to speed–accuracy trade-offs (the faster participants respond, the more errors they make; Pachella, 1974), and therefore researchers must be careful about the precise instructions the participants are given. Encouraging participants to be accurate tends to make them respond accurately but more slowly; encouraging them to be fast tends to make them respond faster at the cost of making more mistakes. Researchers therefore usually analyze both reaction times and error rates (although usually these show the same pattern of results). Response times vary, depending on many factors, but are typically in the order of 500 ms to 1 second.

In experiments measuring reaction time, the absolute time taken to respond is not particularly useful: we are usually concerned with differences between conditions. We assume that our experimental manipulations change only particular aspects of processing, and everything else remains constant and therefore cancels out. For example, we assume that the time participants take to locate the word on the screen and turn their attention to it is constant (unless of course we are deliberately trying to manipulate it).

In tachistoscopic identification, participants are shown words for very short presentation times. Researchers in the past used a piece of equipment called a tachistoscope; now computers are used instead, but the name is still used to refer to the general methodology. The experimenter records the thresholds at which participants can no longer confidently identify items. If the presentation is short enough, or if the normal perceptual processes are interfered with by presenting a second stimulus very quickly after the first, we sometimes find what is commonly known as subliminal perception. In this case participants’ behavior is affected although they are unaware that anything has been presented.
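The logic of reaction time experiments described above (report both reaction times and error rates, and interpret the difference between conditions rather than the absolute times) can be sketched as follows. The trial data are invented purely for illustration, the function name `summarize` is hypothetical, and computing mean reaction time over correct trials only is a common convention assumed here rather than something stated in the text.

```python
from statistics import mean

# Hypothetical lexical decision trials: (condition, rt_ms, correct).
trials = [
    ("word", 520, True), ("word", 560, True),
    ("word", 610, False), ("word", 540, True),
    ("nonword", 650, True), ("nonword", 700, True),
    ("nonword", 720, False), ("nonword", 690, True),
]


def summarize(trials, condition):
    """Return (mean RT on correct trials, error rate) for one condition."""
    rows = [t for t in trials if t[0] == condition]
    correct_rts = [rt for _, rt, ok in rows if ok]
    errors = sum(1 for _, _, ok in rows if not ok)
    return mean(correct_rts), errors / len(rows)


word_rt, word_err = summarize(trials, "word")
nonword_rt, nonword_err = summarize(trials, "nonword")

# The quantity of interest is the difference between conditions.
difference = nonword_rt - word_rt
print(word_rt, nonword_rt, difference)  # 540 680 140
print(word_err, nonword_err)            # 0.25 0.25
```

Here the error rates are equal across conditions, so the 140 ms difference cannot be attributed to a speed–accuracy trade-off in this toy data set.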
The semantic categorization task requires the participant to make a decision that taps semantic processes. For example, is the word “apple” a “fruit” or a “vegetable”? Is the object referred to by the word smaller or bigger than a chair?

Different techniques do not always give the same results. They tap different aspects of processing, an important consideration to which we will return.

One of the most important ideas in word recognition is that of priming. This involves presenting material before the word to which a response has to be made. One of the most common paradigms involves presenting one word prior to the target word to which a response (such as naming or lexical decision) has to be made. The first word is called the prime, and the word to which a response has to be made is called the target. The time between when the prime is first presented (its onset) and the start of the target is called the stimulus–onset asynchrony, or SOA. We then observe what effect the prime has on subsequent processing. By manipulating the relation between the prime and the target, and by varying the SOA, we can learn a great deal about visual word recognition. The prime does not have to be a single word: it can be a whole sentence, and does not even have to be linguistic (e.g., it could be a picture).

WHAT MAKES WORD RECOGNITION EASIER (OR HARDER)?

Next we will look at some of the main findings on visual word recognition. You should bear in mind that many of these phenomena also apply to spoken word recognition. In particular, frequency effects and semantic priming are found in both spoken and visual word recognition.

Interfering with identification

We can slow down word identification by making it harder to recognize the stimulus. One way of doing this is by degrading its physical appearance. This is called stimulus degradation and can be achieved by breaking up the letters that form the word, by reducing the contrast between the word and the background, or by rotating the word to an unusual angle.

Presenting another stimulus immediately after the target interferes with the recognition process. This is called backwards masking (see Figure 6.2). There are two different ways of doing this. If the masking stimulus is unstructured (for example, if it is just a patch of randomly positioned black dots, or just a burst of light), then we call it energy (or brightness, or random noise) masking. If the masking stimulus is structured (for example, if it comprises letters or random parts of letters) then we call it pattern masking (or feature masking). These two types of mask have very different effects (Turvey, 1973). Energy masks operate on the visual feature detection level by causing a visual feature shortage and making feature identification difficult. Feature masks cause interference at the letter level and limit the time available for processing.

FIGURE 6.2 Backwards masking. Energy masking (the masking stimulus is unstructured, e.g., a burst of light) affects the visual feature detection level: it causes a visual feature shortage and makes feature identification difficult. Pattern masking (the masking stimulus is structured, e.g., random parts of letters) causes interference at the letter level and limits the time available for processing.

Masking is used in studies of one of the greatest of all psycholinguistic controversies, that of perception without awareness. Perception without awareness is a form of subliminal perception. Researchers such as Allport (1977) and Marcel (1983a, 1983b) found that words that have been masked, to the extent that participants report they are not aware of their presence, can nevertheless produce activation through the word identification system, even to the level of semantic processing. That is, we can access semantic information about an item without any conscious awareness of that item. The techniques involved are notoriously difficult; the results have been questioned by, among others, Ellis and Marshall (1978) and Williams and Parkin (1980). Holender (1986), in critically reviewing this field, pointed out methodological problems with the early experiments. He emphasized ensuring that participants are equally dark-adapted during the preliminary establishing of individual thresholds and the main testing phase of the experiment. Otherwise we cannot be sure that information is not reaching conscious awareness in the testing phase, even though we think we might have set the time for which the target is presented to a sufficiently short interval. The window between presenting a word quickly enough for it not to be available to consciousness, and so quickly that participants really do see nothing at all, is very small. As yet it is unclear whether we can identify and access meaning-related information about words without conscious awareness, although the balance of evidence is probably that we can. Such a finding does not pose any real problem for our models of lexical access.

Another informative way in which we can interfere with word recognition is to present a word, but delay the presentation of one or two letters at the beginning of the word by backward masking of those letters. What causes most disruption when we do this? In English, after 60 ms it doesn’t make much difference, but before that, delaying a consonant disrupts visual word recognition much more than delaying a vowel (Lee, Rayner, & Pollatsek, 2001). Early on, then, consonant identification is particularly important for recognizing a word. In English, consonants have a more regular mapping from visual appearance to sound than vowels do. In Italian, which has a much more regular mapping for vowels, there is no early advantage for consonants. Hence readers in different languages make differential early use of information that is most likely to help them identify a word.

Frequency, familiarity, and age-of-acquisition

The frequency of a word is a very important factor in word recognition. Commonly used words are easier to recognize and are responded to more quickly than less commonly used words. The frequency effect was first demonstrated in tachistoscopic recognition (Howes & Solomon, 1951), but has since been demonstrated for a wide range of tasks. Whaley (1978) showed that frequency is the single most important factor in determining the speed of responding in the lexical decision task. Forster and Chambers (1973) found a frequency effect in the naming task.

The effect of frequency is not just a result of differences between frequent and very infrequent words (e.g., “year” versus “heresy”), where you would obviously expect a difference, but also between common and slightly less common words (e.g., “rain” versus “puddle”). It is therefore essential to control for frequency in psycholinguistic
experiments, ensuring that different conditions are matched. There are a number of norms of frequency counts available; in the past, Kucera and Francis (1967; see also Francis & Kucera, 1982) was one of the most popular of these, listing the occurrence per million of a large number of words in many samples of printed language. Kucera and Francis is based on written American English. Clearly there are differences between versions of English (e.g., “pavement” and “sidewalk”) and between written and spoken word frequency. For example, the pronoun “I” is 10 times more common in the spoken word corpus than the written one (Dahl, 1979; Fromkin et al., 2011). Another popular choice is the CELEX database (Baayen, Piepenbrock, & Gulikers, 1995), which is stored electronically and is therefore easily searchable, making it particularly useful for making up lists of materials with very specific characteristics. The Internet has made possible the collection and analysis of very large samples of text.

Gernsbacher (1984) pointed out that corpora of printed word frequencies are only an approximation to experiential familiarity. This approximation may break down, particularly for low-frequency words. For example, psychologists might be very familiar with a word such as “behaviorism,” even though it has quite a low frequency in the general language. People also rate some words with recorded low frequency (such as “mumble,” “giggle,” and “drowsy”) as more familiar than others of similar frequency (such as “cohere,” “rend,” and “char”). The printed-frequency corpora might not be very accurate for low-frequency words, and language use has changed since many of the corpora were composed. If it is possible to obtain ratings of the individual experiential familiarity of words, they should prove to be a more reliable measure in processing tasks than printed word frequency.

Several other variables correlate with frequency. For example, common words tend to be shorter. If you wish to demonstrate an unambiguous effect of frequency, you must be careful to control for these other factors.

Frequency is particularly entangled with age-of-acquisition (AOA). The age-of-acquisition of a word is the age at which you first learn it (Carroll & White, 1973a; Gilhooly, 1984). On the whole, children learn more common words first, but there are exceptions: for example, “giant” is generally learned early although it is a relatively low-frequency word. Words that are learned early in life are named more quickly and more accurately than ones learned late, across a range of tasks including object naming, word naming, and lexical decision (Barry, Morrison, & Ellis, 1997; Brown & Watson, 1987; Carroll & White, 1973a; Morrison, Ellis, & Quinlan, 1992). The later the age-of-acquisition of a name, the more difficult it will be for someone with brain damage to produce (Hirsh & Ellis, 1994). Frequency and AOA may be correlated, but statistical techniques such as multiple regression enable us to tease them apart. Early-learned items tend to be higher in frequency, although estimates of the size of the correlation have varied from 0.68 (Carroll & White, 1973b) to as low as 0.38 (between an objective measure of AOA, when a word first enters a child’s vocabulary, and the logarithm of the spoken word frequency, as in Ellis & Morrison, 1998). It has been suggested that all frequency effects are really AOA effects (e.g., Morrison & Ellis, 1995). On the other hand, it has also been suggested that studies reporting AOA effects have not controlled adequately for frequency; in particular, these studies might not have taken into account cumulative frequency, that is, how often words have been encountered throughout the lifespan (Zevin & Seidenberg, 2002). Measures of frequency such as Kucera and Francis and the CELEX database are quite small (even a million words is small relative to the number we come across in real life), and, as we have seen with familiarity (Gernsbacher, 1984), might not accurately reflect the true occurrence of words in the language. Even then, they just provide a snapshot of adult usage. Importantly, they might particularly underestimate the frequency of words we are exposed to in childhood. However, a large-scale study of French showed that AOA effects persist even when cumulative frequency is controlled for (Bonin, Barry, Méot, & Chalard, 2004). It is probable that both frequency and AOA have effects on word processing (Morrison & Ellis, 2000). Different tasks might differ in their sensitivity to AOA and different measures of frequency; AOA
particularly affects word reading, while cumulative frequency has an effect in all tasks (Bonin et al., 2004). On the other hand, Zevin and Seidenberg (2002) provide simulations that show that tasks involving redundancy and regularity in the input–output mappings (e.g., reading, where letters map onto sounds in a predictable way) are less prone to AOA effects, and are sensitive only to cumulative frequency, but tasks with less redundancy and regularity (such as learning the names of objects or faces) do show AOA effects.

[Photo caption: Generally speaking, children learn more common words first, although some low-frequency words are also learned early on, through storytelling, for example.]

Age-of-acquisition effects might arise as a consequence of a loss of plasticity in developing systems (Ellis & Lambon Ralph, 2000; Monaghan & Ellis, 2010). Rather than train a connectionist network to learn all items simultaneously, Ellis and Lambon Ralph introduced items into the training regime at different times. Items learned early possess an advantage independently of their frequency of occurrence. As a network learns more items, it becomes less plastic, and late items are not as efficiently or as strongly represented as those learned early, because they are more difficult to differentiate from items that have already been learned. Early-learned items have a head start that enables them to develop stronger representations in the network. Late-learned items can only develop strong representations if they are presented with a very high frequency.

Word length

Gough (1972) argued that during word recognition letters are taken out of a short-term visual buffer one by one at a rate of 15 ms per letter. The transfer rate is slower for poor readers. Therefore it would not be at all surprising if long words were harder to identify than short words. However, a length effect that is independent of frequency has proved surprisingly elusive. One complication is that there are three different ways of measuring word length: how many letters there are in a word, how many syllables, and how long it takes you to say the word (see Figure 6.3).

Although Whaley (1978) found some word length effects on lexical decision, Henderson (1982) did not. However, Chumbley and Balota (1984) found length effects in lexical decision when the words and nonwords were matched for length and the regularity of their pronunciation.
FIGURE 6.3 Length of a word can be measured by: number of letters; number of syllables; how long it takes to say the word; number of phonemes.
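Gough's (1972) proposal that letters are read out of a visual buffer one at a time, at about 15 ms per letter, implies a simple linear prediction for letter-count length. The sketch below only illustrates that arithmetic: the function name `predicted_transfer_time_ms` is hypothetical, and the slower rate used in the last line is an arbitrary illustrative value, not a figure from the text.

```python
def predicted_transfer_time_ms(word, ms_per_letter=15):
    """Time to read every letter out of the buffer, one letter at a time."""
    return len(word) * ms_per_letter


for word in ["cat", "ghost", "abhorrence"]:
    print(word, predicted_transfer_time_ms(word))
# cat 45
# ghost 75
# abhorrence 150

# A slower, purely illustrative transfer rate for a less skilled reader.
print(predicted_transfer_time_ms("ghost", ms_per_letter=20))  # 100
```

On this account a ten-letter word should take more than three times as long to transfer as a three-letter word, which is why the elusive nature of frequency-independent length effects is surprising.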
For some time it was thought that there was clear evidence that longer words take longer to pronounce (Forster & Chambers, 1973). Weekes (1997) found that word length (measured in letters) had little effect on naming words when other properties of words (such as the number of words similar to the target word) were controlled for (although length had some effect on reading nonwords). It seems that the number of letters in a word has little effect for short words, but has some effect on words between 5 and 12 letters long. Furthermore, word length effects in naming words probably reflect the larger number of similar words with similar pronunciations found for shorter words.

Naming time increases as a function of the number of syllables in a word (Eriksen, Pollack, & Montague, 1970). There is at least some contribution from preparing to articulate these syllables in addition to any perceptual effect. We find a similar effect in picture naming. We take longer to name pictures of objects depicted by long words compared with pictures of objects depicted by short words, and longer to read numbers that have more syllables in their pronunciation, such as the number 77 compared with the number 16 (Klapp, 1974; Klapp, Anderson, & Berrian, 1973).

Neighborhood effects

Some words have a large number of other words that look like them (e.g., “mine” has “pine,” “line,” “mane,” among others), whereas other words of similar frequency have few that look like them (e.g., “much”). Coltheart, Davelaar, Jonasson, and Besner (1977) defined the N-statistic as the number of words that can be created by changing one letter of a target word. Hence “mine” has a large N (29): It is said to have many orthographic neighbors (e.g., “pine,” “mane,” “mire”), but “much” has a low N (5) and few neighbors. The word “bank” has an N-value of 20, but “abhorrence” only has an N-value of 1. (The related word is “abhorrency,” which oddly enough my spell-checker doesn’t like!) N is a measure of neighborhood size (or density).

Neighborhood size affects visual word recognition, making words with a high N easy to recognize when other factors have been controlled for, although clear benefits are only found for low-frequency words: Performance on naming and lexical decision tasks is faster for low-frequency words that have many orthographic neighbors (Andrews, 1989; Grainger, 1990; McCann & Besner, 1987). The rime parts of neighbors seem to be particularly important in producing the facilitation (Peereman & Content, 1997).

In addition to neighborhood size, the frequency of the neighbors might also be important, although in a review of the literature Andrews (1997) concluded that neighborhood size has more effect than neighborhood frequency. On the other hand, it is surprising that having many neighbors produces facilitation at all, rather than competition (Andrews, 1997).

Word or nonword?

Words are generally responded to faster than nonwords. Less plausible nonwords are rejected faster than more plausible nonwords (Coltheart et al., 1977). Hence in a lexical decision task we are relatively slow to reject a nonword like “siant” (which might have been a word, and indeed which looks like one, “saint”), but very quick to reject one such as “tnszv.” Nonwords that are plausible (that is, that follow the rules of word formation of the language in that they do not contain illegal strings of letters) are sometimes called pseudowords.

Repetition priming

Once you have identified a word, it is easier to identify it the next time you see it. The technique of facilitating recognition by repeating a word is known as repetition priming. Repetition facilitates both the accuracy of perceptual identification (Jacoby & Dallas, 1981) and lexical decision response times (Scarborough, Cortese, & Scarborough, 1977). Repetition has a surprisingly long-lasting effect. It is perhaps obvious that having just seen a word will make it easier to recognize straight away, but periods of facilitation caused by repetition have been reported over several hours or even longer.
Repetition interacts with frequency. In a lexical decision task, repetition priming effects are stronger for low-frequency words than for high-frequency ones, an effect known as frequency attenuation (Forster & Davis, 1984). Forster and Davis also pattern-masked the prime in an attempt to wipe out any possible episodic memory of it. They concluded that repetition effects have two components: a very brief lexical access effect, and a long-term episodic effect, with only the latter sensitive to frequency.

There has been considerable debate as to whether repetition priming arises because of the activation of an item’s stored representation (e.g., Morton, 1969; Tulving & Schacter, 1990) or because of the creation of a record of the entire processing event in episodic memory (e.g., Jacoby, 1983). An important piece of evidence that supports the episodic view is the finding that we generally obtain facilitation by repetition priming only within a domain (such as the visual or auditory modality), but semantic priming (by meaning or association) also works across domains (see Roediger & Blaxton, 1987).

Form-based priming

We might expect that seeing a word like CONTRAST should make it easier to recognize CONTRACT, because there is overlap between their physical forms. As they share letters, they are said to be orthographically related, and this phenomenon is known as orthographic priming or form-based priming. In fact, form-based priming is very difficult to demonstrate. Humphreys, Besner, and Quinlan (1988) found that form-based priming was only effective with primes masked at short SOAs so that the prime is not consciously perceived. Forster and Veres (1998) further showed that the efficacy of form-based primes depends on the exact make-up of the materials in the task. Form-related primes can even have an inhibitory effect, slowing down the recognition of the target (Colombo, 1986). One explanation for these findings is that visually similar words are in competition during the recognition process, so that in some circumstances similar-looking words inhibit each other. Form-based priming is much easier to obtain if the prime is masked, perhaps because masked priming is a more “pure” form of priming that has no contribution from conscious processing (Davis & Lupker, 2006; Forster & Davis, 1984; Forster, Davis, Schoknecht, & Carter, 1987).

Semantic priming

For over a century, it has been known that identification of a word can be facilitated by prior exposure to a word related in meaning (Cattell, 1888/1947). Meyer and Schvaneveldt (1971) provided a more recent demonstration of what is one of the most robust and important findings about word recognition. They showed that the identification of a word is made easier if it is immediately preceded by a word related in meaning. They used a lexical decision task, but the effect can be found, with differing magnitudes of effect, across many tasks, and is not limited to visual word recognition (although the lexical decision task shows the largest semantic priming effect; Neely, 1991). For example, we are faster to say that “doctor” is a word if it is preceded by the word “nurse” than if it is preceded by a word unrelated in meaning, such as “butter,” or if it is presented in isolation. This phenomenon is known as semantic priming.

The word priming is best reserved for the methodology of investigating what happens when one word precedes another. The first word (the prime) might speed up recognition of the second word (the target), in which case we talk of facilitation. Sometimes the prime slows down the identification of the target, in which case we talk of inhibition.

With very short time intervals, priming can occur if the prime follows the target. Kiger and Glass (1983) placed the primes immediately after the target in a lexical decision task. If the target was presented for 50 ms, followed 80 ms later by the prime, there was no facilitation of the target, but if the target was presented for only 30 ms, and followed only 35 ms later by the prime, there was significant backwards priming of the target. This finding suggests that words are to some extent processed in parallel if the time between them is short enough.
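The distinction between facilitation and inhibition can be expressed as a simple difference score. The following is a minimal sketch: it subtracts the mean response time after a prime from the mean response time in a baseline condition, so a positive value indicates facilitation and a negative value indicates inhibition. The response times are invented for illustration, the function names and the `tolerance_ms` parameter are hypothetical, and a real analysis would also test whether the difference is statistically reliable.

```python
from statistics import mean


def priming_effect(baseline_rts, primed_rts):
    """Mean baseline RT minus mean primed RT, in ms (positive = faster after the prime)."""
    return mean(baseline_rts) - mean(primed_rts)


def label_effect(effect_ms, tolerance_ms=0.0):
    """Name the direction of a priming effect; `tolerance_ms` is an arbitrary dead zone."""
    if effect_ms > tolerance_ms:
        return "facilitation"
    if effect_ms < -tolerance_ms:
        return "inhibition"
    return "no effect"


# Invented lexical decision times (ms) to the target "doctor".
baseline = [620, 640, 630]         # target presented in isolation
after_related = [580, 590, 600]    # target preceded by "nurse"
after_slowing = [650, 660, 655]    # a hypothetical prime that slows responding

print(label_effect(priming_effect(baseline, after_related)))   # facilitation
print(label_effect(priming_effect(baseline, after_slowing)))   # inhibition
```

Note that the sign of the effect depends on the baseline chosen.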
Semantic priming is a type of context effect. One can see that the effect might have some advantages for processing. Words are rarely read (or heard) in isolation, and neither are words randomly juxtaposed. Words related in meaning sometimes co-occur in sentences. Hence processing might be speeded up if words related to the word you are currently reading are somehow made more easily available, as they are more likely to come next than random words. How does this happen? We shall return to this question throughout this chapter.

Other factors that affect word recognition

The ease of visual word recognition is affected by a number of variables (most of which have similar effects on spoken word recognition). There are others that should be mentioned, including the grammatical category to which a word belongs (West & Stanovich, 1986). The imageability, meaningfulness, and concreteness of a word may also have an effect on its identification (see Paivio, Yuille, & Madigan, 1968). In a review of 51 properties of words, Rubin (1980) concluded that frequency, emotionality, and pronunciability were the best predictors of performance on commonly used experimental tasks. Whaley (1978) concluded that frequency, meaningfulness, and the number of syllables had most effect on lexical decision times, although recently age-of-acquisition has come to the fore as an important variable. In a study of a large number of words, Balota, Cortese, Sergent-Marshall, Spieler, and Yap (2004) compared the effects of phonological (e.g., the first sound), lexical (e.g., frequency, length, neighborhood size), and semantic (e.g., imageability) variables on speeded visual word naming and lexical decision. They found that the contribution of the variables was highly task dependent. Semantic variables are especially important, particularly in lexical decision. Finally, the syntactic context affects word recognition. Wright and Garrett (1984) found a strong effect of syntactic environment on lexical decision times. In (1) and (2) the preceding context can be continued with a verb, but not with a noun. In (2) this syntactic constraint is violated:

(1) If your bicycle is stolen, you must [formulate]
(2) If your bicycle is stolen, you must [batteries]

In both cases the target word (shown here in brackets) is semantically unpredictable from the context, yet Wright and Garrett found that syntactic context affected lexical decision times so that people were significantly slower to respond to the noun (“batteries”) in this context than the verb (“formulate”).

ATTENTIONAL PROCESSES IN VISUAL WORD RECOGNITION

Reading is a mandatory process. When you see a word, you cannot help but read it. Evidence to support this introspection comes from the Stroop task: Naming the color in which a word is written is made more difficult if the color name and the word conflict (e.g., “red” written in green ink) (see Figure 6.4).

FIGURE 6.4 The Stroop effect (the figure shows the words BLACK and BLUE)

FIGURE 6.5 Automatic processing vs. attentional processing

How many mechanisms are involved in priming? In a classic experiment, Neely (1977) argued that there were two different attentional modes of priming. His findings relate to a distinction made by Posner and Snyder (1975) and Schneider and Shiffrin (1977) between automatic and attentional (or controlled) processing. Automatic processing is fast, parallel, not prone to interference from other tasks, does not demand working memory space, cannot be prevented, and is not directly available to consciousness. Attentional (or controlled) processing is slow, serial, sensitive to interference from competing tasks, does
use working memory space, can be prevented or inhibited, and its results are often (but not necessarily) directly available to consciousness (see Figure 6.5).

Neely used the lexical decision task to investigate attentional processes in semantic priming. He manipulated four variables. The first was whether or not there was a semantic relation between the prime and target, so that in the related condition a category name acting as prime preceded the target. Second, he manipulated the participants’ conscious expectancies. Third, he varied whether or not participants’ attention had to be shifted from one category to another between the presentation of the prime and the presentation of the target. Finally, he varied the stimulus onset asynchrony, between 250 ms (a very short SOA) and 2,000 ms (a very long SOA).

Importantly, in this experiment there was a discrepancy between what participants were led to expect from the instructions given to them before the experiment started, and what actually happened. Participants were told, for example, that whenever the prime was “BIRD,” they should expect that a type of bird would follow, but that whenever the prime was “BODY,” a part of a building would follow. Hence their conscious expectancies determined whether they had to expect to shift or not shift their attention from one category name to members of another category. Examples of stimuli in the key conditions are given in Box 6.1.

Box 6.1 Materials from Neely’s (1977) experiment

   Prime   Target    Relation   Expectancy   Shift
1. BIRD    ROBIN     R          E            NS
2. BODY    DOOR      UR         E            S
3. BIRD    ARM       UR         UE           NS
4. BODY    SPARROW   UR         UE           S
5. BODY    HEART     R          UE           S
6. CONTROL: to measure the baseline, use XXXX–ROBIN

R    semantically related
UR   semantically unrelated
E    as expected from instructions
UE   unexpected from instructions
S    shift of attention from one category to another
NS   no shift of attention from one category to another

Neely found that the pattern of results depended on the SOAs. The crucial condition is what happens after “BODY.” At short SOAs, an unexpected but semantically related word such as “HEART” was facilitated relative to the baseline condition, whereas participants took about as long to respond to the expected but unrelated “DOOR” as the baseline. At long SOAs, “HEART” was inhibited; that is, participants were actually slower to respond to it than they were to the baseline condition, whereas “DOOR” was facilitated.

Neely interpreted these results as showing that two different processes are operating at short and long SOAs. At short SOAs, there is fast-acting, short-lived facilitation of semantically related items, which cannot be prevented, irrespective of the participants’ expectations. This facilitation is based on semantic relations between words. There is no inhibition of any sort at short SOAs. This is called automatic priming. “BODY” primes “HEART,” regardless of what the participants are trying to do. But at long SOAs, there is a slow build-up of facilitation that is dependent on your expectancies. This leads to
the inhibition of responses to unexpected items, with the cost that if you do have to respond to them, then responding will be retarded. This is attentional priming. Normally, these two types of priming work together. In a semantic priming task at intermediate SOAs (around 400 ms) both automatic and attentional priming will be cooperating to speed up responding. One can also conclude from this experiment, on the basis of the unexpected–related condition, that the meanings of words are accessed automatically.

Further evidence for a two-process priming model

The details of the way in which two processes are involved in priming have changed a little since Neely’s original experiment, although the underlying principle remains the same. Whereas Neely used category–instance associations (e.g., “BODY–ARM”), which are not particularly informative (any part of the body could follow “BODY”), Antos (1979) used instance–category associations (e.g., “ARM–BODY”), which are highly predictive. He then found evidence of inhibition (relative to the baseline) in the unexpected but semantically related condition at shorter SOAs (at 200 ms), suggesting that inhibition may not just arise from attentional processes, but may also have an automatic component. Antos also showed the importance of the baseline condition, a conclusion supported by de Groot (1984). A row of Xs, as used by Neely, is a conservative baseline, and tends to delay responding; it is as though participants are waiting for the second word before they respond. It may be more appropriate to use a neutral word (such as “BLANK” or “READY”) as the neutral condition. When this is done we observe inhibition at much shorter SOAs. Antos also argued that even Neely found evidence of cost at short SOAs, but that this was manifested in an increase in the error rate rather than in a slowing of reaction time. This is evidence of a speed–error trade-off in the data. Generally, in psycholinguistic reaction time experiments, it is always important to check for differences in the error rate as well as reaction times across conditions.

A second source of evidence for attentional effects in priming comes from studies manipulating the predictive validity (sometimes called the cue validity) of the primes. The amount of priming observed increases as the proportion of related words used in the experiment increases (Den Heyer, 1985; Den Heyer, Briand, & Dannenbring, 1983; Tweedy, Lapinski, & Schvaneveldt, 1977).
This is called the proportion effect. If priming were wholly automatic, then the amount found should remain constant across all proportions of associated word pairs. The proportion effect reflects the effect of manipulating the participants’ expectancies by varying the proportion of valid primes. If there are a lot of primes that are actually unrelated to the targets, participants quickly learn that they are not of much benefit. This will then attenuate the contribution of attentional priming. Nevertheless, in those cases where primes are related to the target, automatic priming still occurs. The more related primes there are in an experiment, the more participants come to recognize their usefulness, and the contribution of attentional priming increases.

Evaluation of attentional processes in word recognition

There are two processes operating in semantic priming: a short-lived, automatic, facilitatory process that we cannot prevent from happening, and an attentional process that depends on our expectancies and that is much slower to get going. However, the benefits of priming are not without their costs; attentional priming certainly involves inhibition of unexpected alternatives, and if one of these is indeed the target then recognition will be delayed. There is probably also an inhibitory cost associated with automatic priming. Automatic priming probably operates through spreading activation.

We can extend our distinction between automatic and attentional processes to word recognition itself. As we have seen, there must be an automatic component to recognition, because this processing is mandatory. Intuition suggests that there is also an attentional component. If we misread a sentence, we might consciously choose to go back and reread a particular word. To take this further, if we provisionally identify a word that seems incompatible with the context, we might check that we have indeed correctly identified it. These attentional processes operate after we have first contacted the lexicon, and hence we also talk about automatic lexical access and non-automatic post-access effects. Attentional processes are important in word recognition, and may play different roles in the tasks used to study it.

DO DIFFERENT TASKS GIVE CONSISTENT RESULTS?

Experiments on word recognition are difficult to interpret because different experimental tasks sometimes give different results. When we use lexical decision or naming, we are not just studying pure word recognition: we are studying word recognition plus the effects of the measurement task. Worse still, the tasks interact with what is being studied. It is rather like using a telescope to judge the color of stars when the glass of the telescope lens changes color depending on the distance of the star, and we don’t realize it.

By far the most controversy surrounds the naming and lexical decision tasks. Which of these better taps the early, automatic processes involved in word recognition?

Lexical decision has been particularly criticized as being too sensitive to post-access effects. In particular, it has been argued that it reflects too much of participants’ strategies rather than the automatic processes of lexical access (e.g., Balota & Lorch, 1986; Neely, Keefe, & Ross, 1989; Seidenberg, Waters, Sanders, & Langer, 1984). This is because it measures participant decision-making times in addition to the pure lexical access times (Balota & Chumbley, 1984; Chumbley & Balota, 1984). Participants do not always respond as soon as lexical access occurs; instead, attentional or strategic factors may come into operation, which delay responding. Participants need not be aware of these post-access mechanisms, as not all attentional processes are directly available to consciousness. Participants might use one or both of two types of strategy. First, as we have seen, participants have expectancies that affect processing. In a lexical decision experiment, participants usually notice that some of the prime–target word pairs are related. So when they see the prime, they can generate a set of possible targets. Hence they can make the “word” response faster if the actual target matches one of their generated
words than if it does not. The second is a postlexical or post-access checking strategy. Participants might use information subsequent to lexical access to aid their decision. The presence of a semantic relation between the prime and target suggests that the target must be a word, and hence they respond “word” faster in a lexical decision task, as there can be no semantic relation between a word and nonword. That is, using postlexical checking, participants might respond on the basis of an estimate of the semantic relation between prime and target, and not directly on the results of trying to access the lexicon. Strategic factors might even lead some participants, some of the time, to respond before they have recognized a word (that is, they guess, or respond to stimuli on very superficial characteristics).

What is the evidence that word naming is less likely to engage participant strategies than lexical decision? First, inhibitory effects are small or non-existent in naming (Lorch, Balota, & Stamm, 1986; Neely et al., 1989). As we have seen, inhibition is thought to arise from attentional processes, so its absence in the naming task suggests that naming does not involve attentional processing. Second, mediated priming is found much more reliably in the naming task than in lexical decision (Balota & Lorch, 1986; de Groot, 1983; Seidenberg, Waters, Sanders, et al., 1984). Mediated priming is facilitation between pairs of words that are connected only through an intermediary (e.g., “dog” primes “cat,” which primes “mouse” for the prime–target pair “dog–mouse”). It is much more likely to be automatic than expectancy-driven because participants are unlikely to be able to generate a sufficient number of possible target words from the prime in sufficient time by any other means. Mediated priming is not usually found in lexical decision because normally participants speed up processing by using post-access checking. It is possible to demonstrate mediated priming in lexical decision by manipulating the experimental materials and design so that post-access checking is discouraged (McNamara & Altarriba, 1988). For example, we observe mediated priming if all the related items are only mediated (“dog” and “mouse”), with no directly related semantic pairs (e.g., “dog” and “cat”) mixed in. Nevertheless, lexical decision does seem to routinely involve post-access checking. Third, backwards semantic priming of words that are only associated in one direction but not another (see later) is found in the lexical decision task but is not normally found in naming (Seidenberg, Waters, Sanders, et al., 1984). This type of priming again more plausibly arises through post-access checking than through the automatic spread of activation.

These results suggest that the naming task is less sensitive to postlexical processes. The naming task, however, has a production component in a way that lexical decision does not (Balota & Chumbley, 1985). In particular, naming involves assembling a pronunciation for the word that might bypass the lexicon altogether (using what is known as a sublexical route, discussed in detail in Chapter 7). There are also some possible strategic effects in naming: People are unwilling to utter words that may be incorrect in some way; for example, they may hesitate if they are unsure of the word’s pronunciation (O’Seaghdha, 1997).

Clearly both lexical decision and naming have their disadvantages. For this reason, many researchers now prefer to use analysis of eye movements. Fortunately, the results from different methods often converge. Schilling, Rayner, and Chumbley (1998) found that although the lexical decision task is more sensitive to word frequency than naming and gaze duration, there is nevertheless a significant correlation between the frequency effect and response time in all three tasks. We either need to place more stress on results on which the three techniques converge, or have a principled account of why they differ.

The locus of the frequency effect

At what stage does frequency have its effect? Is it inherent in the way that words are stored, or does it merely affect the way in which participants respond in experimental tasks? An experiment by Goldiamond and Hawkins (1958) suggested the latter. The first part of this experiment was a training phase. Participants were exposed to nonwords (such as “lemp” and
“stunch”). Frequency was simulated by giving a lot of exposure to some words (mimicking high frequency), and less to others (mimicking low frequency). For example, if you see “lemp” a lot of times relative to “stunch,” then it becomes a higher frequency item for you, even though it is a nonword. In the second part of the experiment, participants were tested for tachistoscopic recognition at very short intervals. Although the participants were told to expect the words on which they were trained, only a blurred stimulus that they had not seen before was in fact presented. Nevertheless, participants not only generated the trained nonwords even though they were not present, but also did so with the same frequency distribution on which they were trained. That is, they responded with the more frequent words more often, even though nothing was actually present. It can be argued from this that frequency does not have an effect on the perception or recognition of a word, only on the later output processes. That is, frequency creates a response bias. This leads to what is sometimes called a guessing model. This type of experiment only shows that frequency can affect the later, response stages. It does not show that it does not involve the earlier recognition processes as well. Indeed, Morton (1979a) used mathematical modeling to show that sophisticated guessing cannot explain the word frequency effect alone.

A frequency effect could arise in two ways. A word could become more accessible because we see (or hear) frequent words more than we see (or hear) less frequent ones, or because we speak (or write) frequent words more often. Of course, most of the time these two possibilities are entangled; we use much the same words in speaking as we are exposed to as listeners. Another way of putting this is to ask if frequency effects arise through recognition or generation. Morton (1979a) disentangled these two factors. He concluded that the data are best explained by models whereby the advantage of high-frequency words is that they need less evidence to reach some threshold for identification. The effect of repeated exposure to a word is therefore to lower this threshold. The later recognition of a word is facilitated every time we are exposed to it, whether through speaking, writing, listening, or reading. Hence frequency of experience and frequency of generation are both important.

Most accounts of the frequency effect assume that it arises as a kind of practice: the more often we do something, the better we get at it. This idea has been challenged recently by Murray and Forster (2004), who show that the time it takes to identify words is linearly related to frequency, rather than varying as a logarithmic function, as you would expect if frequency was based on learning that in turn was based on multitudinous repetitions. (Eventually you get diminishing returns from repeating things more times.) They argue that the frequency effect is better accounted for by searching serially through lists of words, where all that matters is relative frequency rather than absolute frequency. We examine the serial search model in more detail below.

There has been considerable debate about whether the naming and lexical decision tasks are differentially sensitive to word frequency (Balota & Chumbley, 1984, 1985, 1990; Monsell, Doyle, & Haggard, 1989). Balota and Chumbley argued that word frequency has no effect on semantic categorization. This is a task that must involve accessing the meaning of the target word. They concluded that when frequency has an effect on word recognition, it does so because of post-access mechanisms, such as checking in lexical decision, and preparing for articulation in naming. They also showed that the magnitude of the frequency effect depended on subtle differences in the stimulus materials in the experiment (such as length differences between words and nonwords). This can be explained if the effect is mediated by participants’ strategies. Furthermore, the magnitude of the frequency effect is much greater in lexical decision than naming. The argument is that this is because the frequency effect has a large attentional, strategic component, with any automatic effect being small or non-existent. Lexical decision is more sensitive to strategic factors; therefore lexical decision is more sensitive to frequency.
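The contrast drawn earlier in this section between a serial search account, in which only relative frequency matters, and a practice account tied to absolute frequency can be illustrated with a toy calculation. The sketch below is not Murray and Forster’s model. It assumes that “relative frequency” can be read as a word’s rank in a frequency-ordered list, and all counts, parameter values, and function names are invented for illustration.

```python
import math


def rank_by_frequency(counts):
    """Map each word to its rank (1 = most frequent) in a frequency-ordered list."""
    ordered = sorted(counts, key=lambda word: (-counts[word], word))
    return {word: position for position, word in enumerate(ordered, start=1)}


def serial_search_time(word, counts, base_ms=400.0, step_ms=5.0):
    """Toy serial search: time grows linearly with the word's rank in the list."""
    return base_ms + step_ms * rank_by_frequency(counts)[word]


def log_frequency_time(word, counts, base_ms=700.0, slope_ms=50.0):
    """Toy practice-style account: time falls with the log of absolute frequency."""
    return base_ms - slope_ms * math.log10(counts[word])


# Invented occurrence counts; doubling them changes absolute but not relative frequency.
counts = {"the": 60000, "bank": 200, "mire": 4}
doubled = {word: 2 * count for word, count in counts.items()}

for label, table in (("original", counts), ("doubled", doubled)):
    print(label,
          serial_search_time("bank", table),
          round(log_frequency_time("bank", table), 1))
```

Doubling every count leaves each word’s rank, and so the serial search prediction, unchanged, whereas the prediction based on absolute frequency shifts. This is the sense in which only relative frequency matters under a serial search.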
However, most researchers believe that frequency does have an automatic, lexical effect on word recognition. Monsell et al. (1989) found that frequency effects in naming can be inflated to a similar level to that found in lexical decision by manipulating the regularity of the pronunciation of words; participants must access the lexical representation of irregular words to pronounce them. It is possible that frequency effects are absorbed by other components of the naming task (Bradley & Forster, 1987). Furthermore, delaying participants’ responses virtually eliminates the frequency effect (Forster & Chambers, 1973; Savage, Bradley, & Forster, 1990). Delaying responding eliminates preparation and lexical access effects, but not articulation. This casts doubt on the claim that there is a major articulatory component to the effect of frequency on naming, and suggests that the effect must be occurring earlier.

Grainger (1990; see also Grainger & Jacobs, 1996; Grainger, O’Regan, Jacobs, & Segui, 1989) reported experiments that addressed both the locus of the frequency effect and also task differences between lexical decision and naming. He showed that response times to words are also sensitive to the frequency of the neighbors of the target words. The neighbors of a word are those that are similar to it in some way; in the case of visually presented words, it is visual or orthographic similarity that is important. For example, there is much overlap in the letters and visual appearance of “blue” and “blur.” Grainger found that when the frequency of the lexical neighborhood of a word is controlled, the magnitude of the effect of frequency in lexical decision is reduced to that of the naming task. Responses to words with a high-frequency neighbor were slowed in the lexical decision task and facilitated in the naming task. He argued that as low-frequency targets necessarily tend to have more high-frequency neighbors, previous studies had confounded target frequency with neighborhood frequency. Furthermore, he argued that the finding that frequency effects are stronger in lexical decision than naming cannot necessarily be attributed to task-specific post-access processes, and that they arise instead because of this confound with neighborhood frequency. Hence the extent of post-access processes in lexical decision might be less than originally thought.

Evaluation of task differences

Throughout this section we have seen that different variables have different effects on performance, depending on which measure is used. In particular, lexical decision and word naming do not always give the same results. The differences arise because both tasks include aspects of non-automatic processing. Naming times include assembling a phonological code and articulation; lexical decision times include response preparation and post-access checking. Hence the differences in reaction times between the tasks may reflect differing amounts of post-access rather than access processes. Given that the goal of reading is to extract meaning, the extent to which either lexical decision or naming gets at this is questionable.

IS THERE A DEDICATED VISUAL WORD RECOGNITION SYSTEM?

How might our ability to read have come about? Although there has been plenty of time for speech to evolve (see Chapter 1), reading is a much more recent development. It is therefore unlikely that a specific system has had time to evolve for visual word processing. It seems more likely that the word recognition system must be tacked onto other cognitive and perceptual processes. However, words are unusual: We are exposed to them a great deal, they have a largely arbitrary relation with their meaning, and most importantly, in alphabetic writing systems at least, they are composed of units that correspond to sounds.

Is the word-processing system distinct from other recognition systems? This can be examined most simply in the context of naming pictures of objects, the picture-naming task. One important way of looking at this is to examine the extent to which the presentation of printed

words affects the processing of other types of
material, such as pictures. Pictures facilitate
semantically related words in a lexical decision
task (Carr, McCauley, Sperber, & Parmalee,
1982; McCauley, Parmalee, Sperber, & Carr,
1980; Sperber, McCauley, Ragain, & Weil, 1979;
Vanderwart, 1984). However, the magnitude of
the priming effect is substantially less than the
size of the within-modality priming effect (pic-
tures priming pictures, or words priming words).
These findings suggest that the picture-nam-
ing and word recognition systems are distinct,
although this is controversial (Glaser, 1992). The
Brain activity during the reading of words. This
results are sensitive to the particulars of the tasks is a composite of a 3-D magnetic resonance
used. Morton (1985) discussed differences in the imaging (MRI) scan (blue) of the brain, overlaid
details of experimental procedures that might with positron emission tomography (PET) scan
account for different findings. For example, in data (red/green) showing brain activity. The
experiments such as those of Durso and Johnson brain is seen from the side, with the front of
the brain at left. In this test, words are being
(1979) the pictures were presented very clearly, read, and the occipital lobe (far right) is active.
whereas in those of Warren and Morton (1982) This is the brain’s visual center. Also active
they were presented very briefly. Very brief pres- is an area of the temporal lobe (lower right),
entation acts in a similar way to degrading the which is associated with comprehension of
stimulus, and produces a processing bottleneck words.
not present in other experiments.
Parts of the left ventral visual cortex around
the fusiform gyrus respond more to words and pseudowords than strings of consonants. fMRI
imaging studies show that this area is sensitive to the
orthographic rather than the perceptual properties of
words; strings of letters where the case is alternated
(cAsE) are perceptually unfamiliar, but still activate
this brain region (Cohen & Dehaene, 2004; Polk &
Farah, 2002). These imaging data suggest that there
is a dedicated brain region, often called the visual
word form area, that processes words at an abstract
level of representation. Given that the region also
responds to pseudowords, but not strings of con-
sonants, the region must be picking something up
involving the orthographic regularity of a sequence
of abstract letters. The idea of a dedicated visual
word form area is disputed, however, because the
area does respond to word-like nonwords and to
other familiar objects (Price & Devlin, 2003).
[Figure: A man taking part in a word recognition experiment. The speed with which he can name images representing common or rare words is being recorded. Photographed at Newcastle University, England.]
Farah (1991) argued that two fundamental visual recognition processes underlie all types of visual processing. These are the holistic processing of non-decomposed perceptual representations and the parallel processing of complex, multiple parts. She proposed that recognizing faces depends just on holistic processing, whereas recognizing words depends on part processing. Recognizing other types of objects involves both sorts of processing to different degrees, depending on the specific object concerned.
Farah’s proposal makes specific predictions about the co-occurrence of neuropsychological deficits. Because object recognition depends on both holistic and part processing, you should never find a deficit of object recognition (called agnosia) without either a deficit of face recognition (called prosopagnosia) or word recognition (dyslexia). Similarly, if a person has both prosopagnosia and dyslexia, then they should also have agnosia.
Although this is an interesting proposal, it is not clear-cut that face perception is holistic, that object recognition is dependent on both wholes and parts, and that word recognition depends on just parts. Furthermore, Humphreys and Rumiati (1998) described the case of MH, a woman showing signs of general cortical atrophy. MH was very poor at object recognition, yet relatively good at word and face processing. This is the pattern that Farah predicted should never occur. Humphreys and Rumiati concluded that there are some differences between word and object processing: for example, there is much more variation in the spatial positions of parts in objects than letters in words. Words are two-dimensional and objects three-dimensional. Lambon Ralph, Sage, and Ellis (1996) describe a case study of a patient who can recognize words and objects (as familiar or unfamiliar, by a lexical or object decision task), but who is selectively impaired at retrieving the meanings of words. This behavior can be explained if there is a specific visual word form area, but it has become disconnected from the semantic system.
In summary, there is considerable evidence that a dedicated brain region processes information about visual words.
MEANING-BASED FACILITATION OF VISUAL WORD RECOGNITION
We have seen that semantic priming is one of the most robust effects on word recognition. It turns out that there are different types of semantic priming, and they have different effects.
Types of “semantic” priming
One obvious question is whether all types of semantic relation are equally successful in inducing priming. The closer the meanings of the two words, the larger the size of the priming effect observed. We can also distinguish between associative priming and non-associative semantic priming.
Two words are said to be associated if participants produce one in response to the other in a word association task. This can be measured by word association norms such as those of Postman and Keppel (1970). Norms such as these list the frequency of responses to a number of words in response to the instruction “Say the first word that comes to mind when I say … doctor.” If you try this, you will probably find words such as “nurse” and “hospital” come to mind. It is important to note that not all associations are equal in both directions. “Bell” leads to “hop” but not vice versa: hence “bell” facilitates “hop,” but “hop” does not facilitate “bell.” Some words are produced as associates of words that are not related in meaning: an example might be “waiting” generated in response to “hospital.” Priming by associates is called associative priming; the two associates might or might not also be semantically related.
Non-associative semantically related words are those that still have a relation in terms of meaning to the target, but that are not produced as associates. Consider the words “dance” and “skate.” They are clearly related in meaning, but “skate” is rarely produced as an associate of “dance.” “Bread” and “cake” are an example of another pair of semantically related but unassociated words. Superordinate category names (e.g., “animal”) and category instances (e.g., “fox”) are clearly semantically related, but are not always strongly associated. Members of the same category (e.g., “fox” and “camel” are both animals) are clearly related, but are not always associated. Priming by words that are semantically but not associatively related is called non-associative semantic priming.
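The distinction between associated and non-associated pairs can be made concrete with a small sketch. It treats word association norms as simple response counts and computes how often a cue elicits a given response, which also shows why association can run in one direction only. The response counts, the function names, and the 5% cutoff are all invented for illustration; they are not taken from Postman and Keppel (1970) or any other published norms.

```python
from collections import Counter

# Invented first responses from ten imaginary participants per cue.
# These are NOT real norms; they only mimic the examples in the text.
RESPONSES = {
    "doctor": ["nurse"] * 6 + ["hospital"] * 3 + ["patient"],
    "bell": ["hop"] * 4 + ["ring"] * 6,
    "hop": ["jump"] * 7 + ["skip"] * 3,
    "dance": ["music"] * 6 + ["sing"] * 4,
    "skate": ["ice"] * 8 + ["board"] * 2,
}


def association_strength(cue, target, responses=RESPONSES):
    """Proportion of responses to `cue` that were `target` (0.0 for an unknown cue)."""
    given = responses.get(cue, [])
    if not given:
        return 0.0
    return Counter(given)[target] / len(given)


def is_associated(prime, target, threshold=0.05, responses=RESPONSES):
    """Hypothetical cutoff: the prime must elicit the target in at least 5% of responses."""
    return association_strength(prime, target, responses) >= threshold


if __name__ == "__main__":
    for prime, target in [("bell", "hop"), ("hop", "bell"), ("dance", "skate")]:
        print(prime, "->", target, association_strength(prime, target))
```

With these made-up counts, "bell" counts as an associate of "hop" in the forward direction only, and "dance" and "skate" come out as unassociated even though they are related in meaning, which is the kind of pair used to test for non-associative semantic priming.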
Most studies of semantic priming have looked at word pairs that are both associatively and semantically related. However, some studies have examined the differential contributions of association and pure semantic relatedness to priming. In particular, to what extent are these types of priming automatic? The evidence for automatic associative priming is fairly clear-cut, and most of the research effort has focused on the question of whether or not we can find automatic non-associative semantic priming.
Many early studies found no evidence of automatic pure semantic facilitation. Lupker (1984) found virtually no semantic priming of non-associated words in a naming task. The word pairs were related in his experiment by virtue of being members of the same semantic category, but were not commonly associated (e.g., “ship” and “car” are related by virtue of both being types of vehicles, but are not associated). Shelton and Martin (1992) showed that automatic priming is obtained only for associatively related word pairs in a lexical decision task, and not for words that are semantically related but not associated. This result suggests that automatic priming occurs only within the lexicon by virtue of associative connections between words that frequently co-occur. Moss and Marslen-Wilson (1993) found that semantic associations (e.g., chicken–hen) and semantic properties (e.g., chicken–beak) have different priming effects in a cross-modal priming task. (In a cross-modal task, the prime is presented in one modality, e.g., auditorily, and the target in another, e.g., visually.) Associated targets were primed context-independently, whereas semantic-property targets were affected by the context of the whole surrounding sentence. Moss and Marslen-Wilson concluded that associative priming does not reflect the operation of semantic representations, but is a low-level, intra-lexical automatic process.
On the other hand, Hodgson (1991) found no priming for semantically related pairs in a naming task, but significant priming for the same pairs in a lexical decision task. It is possible that the instructions in his lexical decision task encouraged non-automatic processing (Shelton & Martin, 1992). Both Fischler (1977) and Lupker (1984) found some priming effect of semantic relation without association, also in a lexical decision task. The lexical decision task seems to be a less pure measure of automatic processing than naming, and hence this priming might have arisen through non-automatic means. Although Shelton and Martin (1992) also used a lexical decision task, they designed their experiment to minimize attentional processing. Rather than passively reading a prime and then responding to the target, participants made rapid successive lexical decisions to individual words. On a small proportion of trials two successive words would be related, and the amount of priming to the second word could be recorded. This technique of minimizing non-automatic processing produced priming only for the associated words, and not for the non-associated related words.
These results suggest that automatic priming in low-level visual word recognition tasks that tap the processes of lexical access can be explained by associations between words, rather than by mediation based on word meaning. “Doctor” primes “nurse” because these words frequently co-occur, leading to the strengthening of connections in the lexicon, rather than because of an overlap in their meaning, or the activation of an item at a higher level of representation. Indeed, co-occurrence might not even be necessary for words to become associated: it might be sufficient that two words tend to be used in the same sort of contexts. For example, both “doctor” and “nurse” tend to be used in the context of “hospital,” so they might become associated even if they do not directly co-occur (Lund, Burgess, & Atchley, 1995; Lund, Burgess, & Audet, 1996).
McRae and Boisvert (1998) questioned this conclusion. They argued that the studies that failed to find automatic semantic priming without association (most importantly, Shelton & Martin, 1992) failed to do so because the items used in these experiments were not sufficiently closely related (e.g., “duck” and “cow,” “nose” and “hand”). McRae and Boisvert used word pairs that were more closely related but still not
associated (e.g., “mat” and “carpet,” “yacht” and “ship”). With these materials McRae and Boisvert found clear facilitation even at very short (250 ms) SOAs. It now seems likely that at least some aspects of semantic relation can cause automatic facilitation.
The pattern of results observed also depends on the precise nature of the semantic relations involved. Moss, Ostrin, Tyler, and Marslen-Wilson (1995) found that both semantically and associatively related items produced priming of targets in an auditory lexical decision task. Furthermore, semantically related items produced a “boost” in the magnitude of priming if they were associatively related as well. However, a different pattern of results was observed in a visual lexical decision version of the task (which was also probably the version of the task that minimized any involvement of attentional processing). Here, whether or not pure (non-associative) semantic priming was observed depended on the type of semantic relation. Category coordinates (e.g., “pig–horse”) did not produce automatic priming without association, whereas instrument relations (e.g., “broom–floor”) did. This suggests that information about the use and purpose of an object is immediately and automatically activated.
Moss, McCormick, and Tyler (1997) also showed that some semantic properties of words are available before others. Using a cross-modal priming task, they found significant early priming for information about the function and design of artifacts, but not for information about their physical form. There are grounds to suppose (see Chapter 11 on the neuropsychology of semantics) that a different pattern of results would be obtained with other semantic categories. In particular, information about perceptual attributes might be available early for living things.
Finally, it should be pointed out that semantic priming may have different results in word recognition and word production. For example, Bowles and Poon (1985) showed that semantic priming has an inhibitory effect on retrieving a word given its definition (a production task), whereas we have just seen that in lexical decision (a recognition task) semantic priming has a facilitatory effect.
Does sentence context affect visual word recognition?
Priming from sentence context is the amount of priming contributed over and above that of the associative effects of individual words in the sentence. The beginning of the sentence “It is important to brush your teeth every single __” facilitates the recognition of a word such as “day,” which is a highly predictable continuation of the sentence, compared with a word such as “year,” which is not. The sentence context facilitates recognition even though there is no semantic relation between “day” and other words in the sentence. Can sentence context cause facilitation?
Schuberth and Eimas (1977) were the first to appear to demonstrate sentence context effects in visual word recognition. They presented incomplete context sentences followed by a word or nonword to which participants had to make a lexical decision. Response times were faster if the target word was congruent with the preceding context. West and Stanovich (1978) demonstrated similar facilitation by congruent contexts on word naming. Later studies have revealed limitations with regard to when and how much contextual facilitation can occur.
Fischler and Bloom (1979) used a paradigm similar to that of Schuberth and Eimas. They showed that facilitation only occurs if the target word is a highly probable continuation of the sentence. For example, consider the sentence “She cleaned the dirt from her __.” The word “shoes” is a highly predictable continuation here; the word “hands” is an unlikely but not anomalous continuation; “terms” would clearly be an anomalous ending. (We do not need to rely on our intuitions for this; we can ask a group of other participants to give a word to end the sentence and count up the numbers of different responses.) We find that an appropriate context has a facilitatory effect on the highly predictable congruent words (“shoes”) relative to the congruent but unlikely word (e.g., “hands”), and an inhibitory effect on the anomalous
words (e.g., “terms”). As there is no direct associative relation between “shoes” and other words in the sentence, this seems to be attributable to priming from sentence context.
Stanovich and West (1979, 1981; see also West & Stanovich, 1982) found that contextual effects are larger for words that are harder to recognize in isolation. Contextual facilitation was much larger when the targets were degraded by reduced contrast. In clear conditions, we find mainly contextual facilitation of likely words; in conditions of target degradation, we find contextual inhibition of anomalous words. Children, who of course are less skilled at reading words in isolation than adults, also display more contextual inhibition. Different tasks yield different results. Naming tasks tend to elicit more facilitation of congruent words, whereas lexical decision tasks tend to elicit more inhibition of incongruent words. The inhibition is most likely to arise because lexical decision is again tapping post-access, attentional processes. It is likely that these processes involve integrating the meanings of the words accessed with a higher level representation of the sentence.
West and Stanovich (1982) argued that the facilitation effects found in the naming task arise through simple associative priming from preceding words in the sentence. It is very difficult to construct test materials that eliminate all associative priming from the other words in the sentence to the target. If this explanation is correct, any facilitation found is simply a result of associative priming from the other words in the sentence. On this view, sentence context operates by the post-access inhibition of words incongruent with the preceding context, and this is most likely to be detected with tasks such as lexical decision that are more sensitive to post-access mechanisms. One problem with this conclusion is that lexical relatedness is not always sufficient in itself to produce facilitation in sentence contexts (O’Seaghdha, 1997; Sharkey & Sharkey, 1992; Williams, 1988). This suggests that the facilitation observed comes from the integration of material into a higher text-level representation. Forster (1981) noted that the use of context may be very demanding of cognitive resources. This suggests that contextual effects should at least sometimes be non-automatic. Perhaps the potential benefit is too small for it to be worth the language processor routinely using context. Sentence context may only be of practical help in difficult circumstances, such as when the stimulus is degraded.
As naming does not necessitate integration of the target word into the semantic structure, the analysis of eye movements is revealing here. Schustack, Ehrlich, and Rayner (1987) found evidence of the effects of higher level context in the analysis of eye movements, but not of naming times. Inhoff (1984) had participants read short passages of text from Alice in Wonderland. A visual pattern mask moved in synchrony with the readers’ eyes. Ease of lexical access was manipulated by varying word frequency, and ease of conceptual processing was manipulated by varying how predictable the word was in context. Analysis of eye movements suggested that lexical access and context-dependent conceptual processing could not be separated in the earliest stages of word processing. The mask affected frequency and predictability differentially, suggesting that there is an early automatic component to lexical access, and a later non-automatic, effortful process involving context. So context may have some early effects, but lexical access and conceptual processing later emerge as two separate processes. This experiment is also further support for the idea that early lexical processing is automatic, whereas later effects of context involve an attentional component.
Van Petten (1993) examined event-related potentials (ERPs) to semantically anomalous sentences. One advantage of the ERP technique is that it enables the time course of word recognition to be examined before an overt response (such as uttering a word or pressing a button) is made. The effects of lexical and sentence context were distinguishable in the ERP data, and the effects of sentence context were more prolonged. Van Petten concluded that there was indeed an effect of sentence context that could not be attributed to lexical priming. Furthermore, the priming effects appear to start at the same time, which argues against a strict serial model where lexical priming precedes sentence context priming. Similarly,
Kutas (1993) found that lexical and sentence context had very similar effects on ERPs. Both give rise to N400s (a large negative wave present 400 ms after the stimulus) whose amplitudes vary with the strength of the association or sentence context. Finally, Altarriba, Kroll, Sholl, and Rayner (1996) examined naming times and eye movements in an experiment where fluent English–Spanish bilinguals read mixed-language sentences. They found that sentence context operated both through intra-lexical priming and high-level priming. Contextual constraints still operate across languages, although the results were moderated by a lexical variable, word frequency.
Clearly the results are variable, and seem to be task-dependent. It is possible that processing in discourse is different from the processing of word lists such as are typically used in semantic priming experiments. Hess, Foss, and Carroll (1995) manipulated global and local context in a task where participants heard discourse over headphones, and then had to name the concluding target word, which appeared on a screen in front of them. The most important conditions were where the target word was globally related to the context but locally unrelated to the immediately preceding words (3), and globally unrelated but locally related (4):
(3) The computer science major met a woman who he was very fond of. He had admired her for a while but wasn’t sure how to express himself. He always got nervous when trying to express himself verbally so the computer science major wrote the poem.
(4) The English major was taking a computer science class that she was struggling with. There was a big project that was due at the end of the semester which she had put off doing. Finally, last weekend the English major wrote the poem.
Hess et al. found that only global context facilitated naming the target word “poem.” This result does not show that automatic semantic priming does not occur: we certainly observe it with isolated items presented rapidly together. The experiment does show that in real discourse the effects of global context may be more important.
Morris and Harris (2002) argue that the RSVP (rapid serial visual presentation) technique is particularly suited to investigating the effects of sentence context because it resembles normal reading in that a whole sentence has to be read and processed, in contrast to tasks that involve responding to one particular word in a sentence. In the RSVP task, words are displayed one at a time in the same location, each new word overwriting the previous one. Readers tend to misread the word “rice” in sentences such as “She ran her best time yet in the rice last week” as “race” when the items are presented using RSVP (Potter, Moryadas, Abrams, & Noel, 1993). Clearly here sentence context is causing the misperception, but at what stage? The early, interactive accounts state that sentence context is one factor interacting with all others to determine the activation of a word, and affects recognition; the late, modular accounts state that “rice” is indeed selected, and corrected later as a result of postperceptual processing, or recall. Morris and Harris combined the RSVP task with repetition blindness, whereby people seeing a word repeated very soon after its first instance tend to omit the repetition in the reports of what they have seen; that is, they are blind to the repetition (Kanwisher, 1987). Repetition blindness can be so strong that people might report having seen “When she spilled the ink there was all over,” which doesn’t make sense, when they actually saw “When she spilled the ink there was ink all over.” The preponderance of evidence (e.g., from ERP studies) suggests that repetition blindness has an early, perceptual effect.
What happens if we combine RSVP with corrected words and repetition blindness in a misreading repetition blindness paradigm? Suppose we present participants with “race” very soon after the sentence “She ran her best time yet in the rice last week.” If the perceptual account of the correction is correct, “rice” should be “perceived” as “race,” and therefore we should get repetition blindness for the “second” “race.” If the postperceptual account is correct, people really do “see” “rice,” and therefore this case should not cause repetition blindness. Morris and Harris found that the perceptual account fitted the data better: reconstructions cause repetition blindness.
In summary, sentence context can have either an early perceptual effect or a late postperceptual effect. We can observe early effects, but only in certain tasks, particularly ones that resemble reading of whole sentences and discourse rather than responding to isolated words.
Summary of meaning-based priming studies
We can distinguish between associative semantic priming, associative non-semantic priming, and non-associative semantic priming. All sorts of priming have both automatic and attentional components, although there has been considerable debate as to the status of automatic non-associative semantic priming. Attentional processes include checking that the item accessed is the correct one, using conscious expectancies, and integrating the word with higher level syntactic and semantic representations of the sentence being analyzed. The remaining question is the extent to which sentence context has an automatic component. Researchers are divided on this, but there is a reasonable amount of evidence that it does. Schwanenflugel and LaCount (1988) suggested that sentential constraints determine the semantic representations generated by participants as they read sentences. The more specific the constraints, the more specific the expected semantic representations generated. Connectionist modeling also suggests a mechanism whereby sentence context could have an effect. In an interactive system, sentence context provides yet another constraint that operates on word recognition in the same way as lexical variables, facilitating the recognition of more predictable words.
How does priming occur? The dominant theory says that semantic priming occurs by the spread of activation. Activation is a continuous property, rather like heat, that spreads around a network. Items that are closely related will be close together in the network. Retrieving something from memory corresponds to activating the appropriate items. Items that are close to an item in the network will receive activation by its spread from the source unit. The farther away other items are from the source, the less activation they will receive.
A few researchers argue that activation does not spread, and instead propose a compound-cue theory (e.g., Ratcliff & McKoon, 1981, 1988; see also Hodgson, 1991). The central idea of spreading activation, which Ratcliff and McKoon disputed, is that activation can permeate some distance through a network, and that this permeation takes time. The further activation travels, the more time should pass, and it can be very difficult to detect some of these very small effects. Instead, according to compound-cue theory, priming involves the search of memory with a compound cue that contains both the prime and the target. This theory predicts that priming can only occur if two items are directly linked in memory. It therefore cannot account for mediated priming where two items that are not directly linked can be primed through an intermediary (see McNamara, 1992, 1994). Furthermore, there is now evidence that time elapses while activation spreads, and the more distantly related two things are, the longer the time that elapses (McNamara, 1992; McNamara & Altarriba, 1988).
PROCESSING MORPHOLOGICALLY COMPLEX WORDS
So far we have mainly looked at morphologically simple words. How are morphologically complex words stored in the lexicon? Is there a full listing of all the inflected forms of a word, so that there are entries for “kiss,” “kissed,” “kisses,” and “kissing”? We call this the full-listing hypothesis. Or do we just list the stem (“kiss-”), and produce or decode the inflected items by applying a rule (you add “-ed” to the stem of a word to form the past tense)? As English contains a large number of irregular forms (e.g., “ran,” “ate,” “mice,” “sheep”), we would then have to list the exceptions separately, so we would store a general rule and a list of exceptions. We call this the obligatory decomposition hypothesis (Smith & Sterling, 1982; Taft, 1981, 2004). There is an intermediate position, called the dual-pathway hypothesis. Although it is uneconomical to list all inflected words, some frequent and common inflected words do have their own listing (Monsell, 1985; Sandra, 1990).
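The three storage hypotheses can be contrasted as three lookup strategies over a toy lexicon. The sketch below is only an illustration under strong simplifying assumptions: the word lists, the four-suffix rule, and the function names are all hypothetical, and the choice to try the whole-word entry before decomposition in the dual-pathway function is an assumption of this sketch rather than a claim about any published model.

```python
# Toy lexicon. Every entry, and the short suffix list, is an illustrative assumption.
STEMS = {"kiss", "mend", "seem"}
FULL_LISTING = {"kiss", "kissed", "kisses", "kissing"}
EXCEPTIONS = {"ran": "run", "ate": "eat", "mice": "mouse", "sheep": "sheep"}
SUFFIXES = ("ing", "ed", "es", "s")


def lookup_full_listing(word):
    """Full-listing hypothesis: every form of a word has its own entry."""
    return word in FULL_LISTING


def lookup_by_decomposition(word):
    """Obligatory decomposition: a stem list, a suffix rule, and listed exceptions.

    Returns (stem, suffix), or None if no analysis succeeds.
    """
    if word in EXCEPTIONS:
        return (EXCEPTIONS[word], "irregular")
    if word in STEMS:
        return (word, "")
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in STEMS:
            return (stem, suffix)
    return None


def lookup_dual_pathway(word, whole_forms):
    """Dual pathway: use a whole-word entry if one exists, otherwise decompose."""
    if word in whole_forms:
        return ("whole-word", word)
    analysis = lookup_by_decomposition(word)
    return ("decomposed", analysis) if analysis else None
```

For example, if “kissed” is assumed to be frequent enough to have its own entry, `lookup_dual_pathway("kissed", {"kissed"})` takes the whole-word route, whereas `lookup_dual_pathway("mended", {"kissed"})` falls back on the stem “mend” plus the suffix rule.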
According to the obligatory decomposition hypothesis, to recognize a morphologically complex word we must first strip off its affix, a process known as affix stripping (Taft & Forster, 1975; see also Taft, 1985, 1987). In a lexical decision task, words that look as though they have a prefix, but in fact do not (e.g., “interest,” “result”), take longer to recognize than control words (Taft, 1979, 1981). It is as though participants are trying to strip these words of their affixes but are then unable to find a match in the lexicon and have to reanalyze. In a task where participants were asked to judge whether a visually presented word was pronounced identically to another word (i.e., the word was a homophone), Taft (1984) observed that people have difficulty with words such as “fined” that have a morphological structure different from their homophonic partner (here “find”). Taft argued that the difficulty with such words arises from the fact that inflected words are represented in the lexicon as stems plus their affix. Finally, consider words like “seeming” and “mending.” They have very similar surface frequencies; that is, those particular forms occur with about equal frequency in the language. However, the stems have very different base frequencies: “seem” and all its variants (seems, seemed) are much more frequent than “mend” and its variants (mends, mended). Which determines the ease of recognition: surface or base frequency? It turns out that on the whole lexical decision is much faster, and there are fewer errors, for words with high base frequencies, again suggesting that complex words are decomposed and recognized by their stem (Taft, 1979, 2004). However, the base frequency effect is not found for all words; for some common words there is no effect of base frequency but there is one of surface frequency (Baayen, Dijkstra, & Schreuder, 1997; Bertram, Schreuder, & Baayen, 2000; Schreuder & Baayen, 1997). This finding is evidence for the dual-pathway hypothesis, although the debate is ongoing, with Taft arguing that base- and surface-frequency effects arise at different stages of processing, so that the lack of a base-frequency effect is not evidence against obligatory decomposition.
Compound words whose meanings are not transparent from their components (e.g., “buttercup”) will also be stored separately (Sandra, 1990). Hence neither “milk” nor “spoon” will facilitate the recognition of “buttercup.”
Marslen-Wilson, Tyler, Waksler, and Older (1994) examined how we process derivationally complex words in English. Marslen-Wilson et al. used a cross-modal lexical decision task to examine what we decompose morphologically complex words into, and therefore the sorts of words that they can influence. For example, a participant would hear a spoken prime (e.g., “happiness”) and then immediately have to make a lexical decision to a visual probe (e.g., “happy”). The cross-modal nature of the task is important because it eliminates any possible phonological priming between similar words. Instead, any priming that occurs must result from lexical access.
The pattern of results was complicated and showed that the extent of priming found depends on phonological transparency and semantic transparency. The relation between two morphologically related words is said to be phonologically transparent if the shared part sounds the same. Hence the relation in “friendly” and “friendship” is phonologically transparent (“friend” sounds the same in each word), but in “sign” and “signal” it is not (the “sign” components have different pronunciations). (Phonological transparency is really a continuum rather than a dichotomy, with some word pairs, such as “pirate” and “piracy,” in between the extremes.) A morphologically complex word is semantically transparent if its meaning is obvious from its parts: hence “unhappiness” is semantically transparent, being made up in a predictable fashion from “un-,” “happy,” and “-ness.” A word like “department,” even though it contains recognizable morphemes, is not semantically transparent. The meaning of “depart” in “department” is not obviously related to the meaning of “depart” in “departure.” It is semantically opaque.
Semantic and phonological transparency affect the way in which words are identified. Semantically transparent forms are morphologically decomposed, regardless of whether or not they are phonologically transparent. Semantically opaque words, however, are not decomposed. Furthermore, suffixed and prefixed words behave differently. Suffixed and
prefixed derived words prime each other, but pairs of suffixed words produce interference. This is because when we hear a suffixed word, we hear the stem first. All the suffixed forms then become activated, but as soon as there is evidence for just one of them, the others are suppressed. Therefore, if one of them is subsequently presented, we observe inhibition.

The experiment of Marslen-Wilson et al. shows that in English there is a level of lexical representation that is modality-independent (because we observe cross-modal priming), and that it is morphologically structured for semantically transparent words (because of the pattern of facilitation shown). More recent studies have found that morphological priming effects are independent of meaning similarity; that is, there is no difference in the priming effects for semantically transparent and opaque derivations in several languages, including English (Rastle, Davis, & New, 2004), French (Longtin, Segui, & Halle, 2003), and Hebrew (Frost, Forster, & Deutsch, 1997). These results suggest that morphological priming in general is obtained because of morphological structure rather than because of semantic overlap between similar items.

MODELS OF VISUAL WORD RECOGNITION

In this section, we examine some models of visual lexical access. They all take as input a perceptual representation of the word, and output desired information such as meaning, sound, and familiarity. The important question of how we access a word’s phonological form will be examined in the next chapter.

All models of word recognition have to address four main questions. First, is processing autonomous or interactive; in particular, are there top-down effects on word recognition? Second, is lexical access a serial or a parallel process? Third, can activation cascade from one level of processing to a later one, or must processing by the later stage wait until that of the earlier one is complete? Fourth, how do we find items? Do we find them by searching through the lexicon, or can we locate them because their storage location is defined by their content, a feature called content addressability?

Carr and Pollatsek (1985) use the term lexical instance models for models that have in common that there is simply perceptual access to a memory system, the lexicon, where representations of the attributes of individual words are stored, and they do not have any additional rule-based component that converts individual letters into sounds. We can distinguish two main types of lexical instance model. These differ in whether they employ serial search through a list, or the direct, multiple activation of units. The best known instance of a search model is the serial search model. Direct access, activation-based models include the logogen model, localist connectionist models, as well as the cohort model of spoken word recognition (see Chapter 9). More difficult to fit into this simple scheme are hybrid or verification models (which combine direct access and serial search), and distributed connectionist models (which although very similar to the logogen model do not have simple lexical units at all).

Forster’s autonomous serial search model

Imagine how you might try to find a word by searching through a dictionary; you search through the entries, which are arranged to facilitate search on the basis of visual characteristics (that is, they are in alphabetical order), until you find the appropriate entry. The entry in the dictionary gives you all the information you need about the word: its meaning, pronunciation, and its syntactic class. A commonly used analogy here is that of searching through a catalog to find the location of a book in the library. The model is a two-stage one; you can use the catalog to find out where the book is, but you still have to go to the shelf, find the book’s actual location, and extract information from it. Forster (1976, 1979) proposed that we identify words by a serial search through the lexicon. In this model the catalog system corresponds to what are called access files, and the shelf full of books to the master file.

In the serial search model, perceptual processing is followed by the sequential search of access files that point to an entry in the lexicon. Access files
are modality-specific: there are different ones for orthographic, phonological, and syntactic–semantic (used in speech production) sources. These access files give pointers to a master file in the lexicon that stores all information to do with the word, including its meaning. To speed up processing, these access files are subdivided into separate bins on the basis of the first syllable or the first few letters of a word. Items within these bins are then ordered in terms of frequency, such that the more frequent items are examined first. Hence more frequent items will be accessed before less frequent ones. This frequency-based searching is an important characteristic of the model. Semantic priming arises as the result of cross-references between entries in the master file. The model is shown in Figure 6.6.

[Photograph caption] Forster (1976) proposed that we identify words by a serial search through the lexicon. A library is a useful analogous tool, whereby the library’s catalog system corresponds to access files, and the shelf full of books to the master file.
FIGURE 6.6 Forster’s serial search model of lexical access (based on Forster, 1976). Diagram labels: analysis of visual input, to be used in search and to compute the probable bin; orthographic, phonological, and syntactic–semantic access files, with bins arranged in order of decreasing frequency (example entries “pig,” /pig/, PIG); master file (lexicon), with cross-referencing between entries such as COW and PIG.
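The bin-and-frequency organization of the access files can be made concrete with a small sketch. This is only an illustration under stated assumptions, not Forster’s implementation: the class name `SerialSearchLexicon`, the use of the first two letters as the bin key, the toy words, and their frequency values are all invented for the example, and only an orthographic access file is modeled.

```python
from collections import defaultdict


class SerialSearchLexicon:
    """Toy lexicon: access-file bins searched serially, most frequent first."""

    def __init__(self, entries, prefix_len=2):
        # entries: (word, frequency, meaning) triples; the values used below are made up.
        self.prefix_len = prefix_len
        self.master_file = {}            # word -> full entry (stands in for the master file)
        bins = defaultdict(list)         # bin key -> [(word, frequency), ...]
        for word, frequency, meaning in entries:
            self.master_file[word] = {"meaning": meaning, "frequency": frequency}
            bins[word[:prefix_len]].append((word, frequency))
        # Order the items in each bin by decreasing frequency.
        self.access_file = {
            key: sorted(items, key=lambda item: item[1], reverse=True)
            for key, items in bins.items()
        }

    def lookup(self, letter_string):
        """Return (master-file entry or None, number of comparisons made)."""
        candidates = self.access_file.get(letter_string[: self.prefix_len], [])
        comparisons = 0
        for word, _frequency in candidates:
            comparisons += 1
            if word == letter_string:
                # The access file only points to the entry in the master file.
                return self.master_file[word], comparisons
        # No match: every item in the bin (if there was one) has been examined.
        return None, comparisons


lexicon = SerialSearchLexicon([
    ("pig", 50, "farm animal"),
    ("pin", 30, "small fastener"),
    ("pit", 20, "hole in the ground"),
    ("cow", 60, "farm animal that gives milk"),
])

entry, steps = lexicon.lookup("pig")              # first item in its bin
rare_entry, rare_steps = lexicon.lookup("pit")    # last item in the same bin
missing, missing_steps = lexicon.lookup("pik")    # not a word: whole bin searched
no_bin, no_bin_steps = lexicon.lookup("zzq")      # no bin exists, nothing searched
```

In this sketch the cost of finding a word is simply its position within its bin, so a more frequent item is always reached before a less frequent one in the same bin, however large or small the difference between their frequencies.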
Search is not affected by syntactic or semantic information, which is why the search is said to be autonomous. The only type of context that can operate on lexical access is associative priming within the master file. There is no early role for the effect of sentence context; sentence context can only have an effect through post-access mechanisms such as checking the output and integrating it with higher level representations. Repetition can temporarily change the order of items within bins, which is why we observe repetition priming. Illegal nonwords can be rejected early on in the bin selection process, but legal nonwords are only rejected after the exhaustive search of the appropriate bin.

Evaluation of the serial search model

The most significant criticism of the serial search model concerns the plausibility of a serial search mechanism. Although introspection suggests that word recognition is direct rather than involving serial search, we cannot rely on these sorts of data. Making a large number of serial comparisons will take a long time, but word recognition is remarkably fast. The model accounts for the main data in word recognition, and makes a strong prediction that priming effects should be limited to associative priming within the lexicon. There should be no top-down involvement of extra-lexical knowledge in word recognition. Finally, the model does not convincingly account for how we pronounce nonwords.

Forster (1994) addressed some of these problems. In particular, he introduced an element of parallelism by suggesting that all bins are searched simultaneously. The subdivision of the system into bins greatly speeds up the search, and it makes it possible to conclude that a string of letters is a nonword much more quickly than if the whole lexicon has to be searched.

The serial search model also provides an account of the effects of word frequency on lexical access. It was originally thought that the effect of frequency is roughly logarithmic, so that the difference in access times between a common and a slightly less common word is much less than between a rare and a slightly more rare word (Howes & Solomon, 1951; Murray & Forster, 2004). In the serial search model only the relative frequency of words within a bin has an effect on access time, not the absolute frequency. This idea is called the rank hypothesis (Murray & Forster, 2004). Suppose you have two bins; in one bin the absolute frequency of the first item is 100,000 and of the second item just 10, while in the second bin the frequency of the first item is just 20 and of the second 10. Hence in the first bin there is a big absolute difference in frequency between the two items, and in the second bin a small absolute difference. But in each case the relative frequencies are the same: the first item compared with the second item. Most of the evidence suggests that relative frequency is more important in determining access time than absolute frequency. Detailed experimental analysis of lexical decision times and error rates for words with a wide range of frequencies shows that reaction times fit better to a linear rank function (as predicted by the rank hypothesis where all that matters is relative frequency) than to a logarithmic function (where absolute frequency matters). In particular, the extremes of the distribution do not behave as expected: Both very high frequency and very low frequency words are responded to more slowly and inaccurately than the logarithmic function predicts.

The serial search model has proved very influential and is a standard against which to compare other models. Can we justify using lexical access mechanisms more complex than serial search?

The logogen model

In this model every word we know has its own simple feature counter called a logogen corresponding to it. A logogen accumulates evidence until its individual threshold level is reached. When this happens, the word is recognized. Lexical access is therefore direct, and occurs simultaneously and in parallel for all words. Proposed by Morton (1969, 1970), the logogen model was related to the information processing idea of features and demons, as described in Lindsay and Norman’s classic (1977) textbook, where “demons” monitor the perceptual input for specific “features”; the more evidence there is for a particular feature in the perceptual
input, the louder the associated demon shouts. The model was originally formulated to explain how context affects word recognition with very brief exposure to the word, but has been extended to account for many word recognition phenomena. The full mathematical model is presented in Morton (1969), but a simplified account can be found in Morton (1979a).

Each logogen unit has a resting level of activation. As it receives corroborating evidence that it corresponds to the stimulus presented, its activation level increases. Hence if a “t” letter is identified in the input, the activation levels of all logogens that correspond to words containing a “t” will increase. If the activation level manages to pass a threshold, the logogen “fires” and the word is “recognized.” Both perceptual and contextual evidence will increase the activation level. That is, there is no distinction between evidence for a word from external and internal sources. Context increases a logogen’s activation level just as relevant sensory data do. Any use of the logogen will give rise to subsequent facilitation by lowering the threshold of that logogen. More frequent items have lower thresholds. Nonwords will be rejected if no logogen has fired by the time a deadline has passed. Logogens compute phonological codes from auditory and visual word analysis, and also pass input after detection to the cognitive system. The cognitive system does all the other work, such as using semantic information. The connections are bidirectional, as semantic and contextual information from the cognitive system can affect logogens. (See Figure 6.7 for a depiction of the early version of the logogen model.)

FIGURE 6.7 The original logogen model of lexical access (based on Morton, 1979b). Diagram labels: visual word analysis; auditory word analysis; logogen system; cognitive system; phonological output.

Problems with the original logogen model

In the original logogen model, a single logogen carried out all language tasks for a particular word, regardless of modality. That is, the same logogen would be used for recognizing speech and visually presented words, for speaking, and for writing. The model predicts that the modality of the source of activation of a logogen should not matter. For example, visual recognition of a word should be facilitated as much by a spoken prime as by a visual prime. Subsequent experiments contradicted this prediction.

Winnick and Daniel (1970) showed that the prior reading aloud of a printed word facilitated tachistoscopic recognition of that word. However, naming a picture or producing a word in response to a definition produced no subsequent facilitation of tachistoscopic recognition of those words. That is, different modalities produce different amounts of facilitation. Indeed, Morton (1979b) reported replications of these results, clearly indicating that the logogen model needed revision. (For further details of the experiments, see also Clarke & Morton, 1983; Warren & Morton, 1982.) Hence Morton divided the word recognition system into different sets of logogens for different modalities (e.g., input and output). Morton (1979b) also showed that although the modality of response appeared to be immaterial (reading or speaking a word in the training phase), the input modality did matter. The model was revised so that instead of one logogen for each word, there were two modality-specific ones (see Figure 6.8). This change ensured that only visual inputs could facilitate subsequent visual identification of words, and that auditorily presented primes would not facilitate visually presented targets in tachistoscopic recognition. Subsequent evidence suggests that four logogen systems are necessary: one for reading, one for writing, one for listening, and one for speaking.
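The accumulate-to-threshold mechanism at the heart of the logogen model can be sketched in a few lines. This is a minimal illustration, not Morton’s mathematical model: the linear link between frequency and threshold, the position-by-position letter matching, the numerical values, and the names `Logogen`, `letter_matches`, and `recognize` are all assumptions made for the example, and only a single (visual) set of logogens is modeled.

```python
class Logogen:
    """One evidence counter per word (a sketch, not Morton's equations)."""

    def __init__(self, word, frequency, base_threshold=10.0):
        self.word = word
        # Assumption: frequency is scaled 0..1 and lowers the threshold linearly.
        self.threshold = base_threshold - 4.0 * frequency
        self.activation = 0.0


def letter_matches(word, stimulus):
    """Perceptual evidence: number of letters shared in the same position."""
    if len(word) != len(stimulus):
        return 0
    return sum(1 for a, b in zip(word, stimulus) if a == b)


def recognize(logogens, stimulus, context=None, deadline=3):
    """Return (word or None, cycle); None means no logogen fired in time."""
    context = context or {}
    for logogen in logogens:
        logogen.activation = 0.0                  # reset to the resting level
    for cycle in range(1, deadline + 1):
        fired = []
        for logogen in logogens:                  # conceptually, all in parallel
            evidence = letter_matches(logogen.word, stimulus)
            # Contextual evidence is added exactly like perceptual evidence.
            evidence += context.get(logogen.word, 0.0)
            logogen.activation += evidence
            if logogen.activation >= logogen.threshold:
                fired.append(logogen)
        if fired:
            # If several fire on the same cycle, take the one furthest past threshold.
            best = max(fired, key=lambda lg: lg.activation - lg.threshold)
            return best.word, cycle
    return None, deadline                          # deadline passed: treat as a nonword


logogens = [Logogen("doctor", 0.8), Logogen("nurse", 0.5), Logogen("purse", 0.3)]

no_context = recognize(logogens, "nurse")
with_context = recognize(logogens, "nurse", context={"nurse": 4.0})
nonword = recognize(logogens, "blick")
```

With these illustrative numbers, “nurse” is recognized one cycle earlier when context supplies extra evidence, and a letter string that supports no logogen is rejected when the deadline passes.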
FIGURE 6.8 The revised logogen model of lexical access (based on Morton, 1979b). Diagram labels: visual word analysis; auditory word analysis; visual logogens; auditory logogens; cognitive system; phonological output. Two items in the diagram are marked “?”.

Some have argued that Morton was too hasty in giving up the simpler model, arguing that the possible ways in which the primes and targets are represented in the tachistoscopic results mean that no firm conclusion can be drawn (P. Brown, 1991), or that the precise way in which the facilitation effect occurs is unclear (Besner & Swan, 1982). Neuropsychological evidence (see Chapter 15 for details) supports the splitting of the logogen system, and this is currently the dominant view.

Interaction of variables in the logogen model

The effects of context and stimulus quality (whether or not the stimulus is degraded) should interact if the logogen model is correct. Furthermore, frequency and context are handled in the same way in the logogen model, and hence they should show similar patterns of interaction with any other variable (Garnham, 1985). For example, stimulus quality should have the same effects when combined with manipulations of context and frequency. Less perceptual information is required to recognize a high-frequency word than a low-frequency one, and less information is required to recognize a word in context than out of context. The findings are complex and contradictory. Some researchers find an interaction; Meyer, Schvaneveldt, and Ruddy (1974) found that the less legible the stimuli, the more beneficial the effects of context. Others have found them to be additive (Becker & Killion, 1977; Stanners, Jastrzembski, & Westwood, 1975). Later experiments by Norris (1984) clarified these results. He found that frequency and stimulus quality could interact, but that the interaction between stimulus quality and context is larger and more robust.

In summary, it is very difficult to draw conclusions from this research. The issues involved are complex and the experimental results often contradictory. Morton (1979a) proposed that frequency does not affect the logogen system itself, but rather the cognitive systems to which it outputs at the end of the recognition process. The implications of this revision make the interpretation of these data yet more complex.

The logogen model has been overtaken by connectionist models of word recognition, and in many respects it can be seen as a precursor of them.

Interactive activation models of word recognition

McClelland and Rumelhart (1981) and Rumelhart and McClelland (1982) developed a model called interactive activation and competition (IAC). It is one of the earliest of all connectionist models. (If you haven’t studied connectionist models before, I strongly advise you to read the Appendix carefully at this point.)

The original purpose of this model was to account for word context effects on letter identification. Reicher (1969) and Wheeler (1970) showed that, in tachistoscopic recognition, letters are easier to recognize in words than when seen as isolated letters. This is known as the word superiority effect. However, the model can be seen as a component of a general model of word recognition. We will only look at the general principles of the model here.

FIGURE 6.9 Fragment of an interactive activation network of letter recognition. Arrows show excitatory connections; filled circles, inhibitory connections. From McClelland and Rumelhart (1981). Diagram labels: word units ABLE, TRIP, TIME, TRAP, TAKE, CART; letter units A, N, T, G, S.

The IAC model consists of many simple processing units arranged in three levels. There is an input level of visual feature units, a level where units correspond to individual letters, and an output level where each unit corresponds to a word. Each unit is connected to each unit in the level immediately before and after it. Each of these
connections is either excitatory (that is, positive or facilitatory), if it is an appropriate one, or inhibitory (negative), if it is inappropriate. For example, the letter “T” would excite the word units “TAKE” and “TASK” in the level above it, but would inhibit “CAKE” and “CASK.” Excitatory connections make the destination units more active, while inhibitory connections make them less active. Each unit is connected to each other unit within the same level by an inhibitory connection. This introduces the element of competition. The network is shown in Figure 6.9.

When a unit becomes activated, it sends activation in parallel along the connections to all the other units to which it is connected. If it is connected by a facilitatory connection, it will have the effect of increasing activation at the unit at the other end of the connection, whereas if it is connected by an inhibitory connection, it will have the effect of decreasing the activation at the other end. Hence if the unit corresponding to the letter “T” in the initial letter position becomes activated, it will increase the activation level of the word units corresponding to “TAKE” and “TASK,” but decrease the activation level of “CAKE.” But because units are connected to all other units at the same level by inhibitory connections, as soon as a unit (e.g., a word) becomes activated, it starts inhibiting all the other units at that level. Hence if the system “sees” a “T,” then “TAKE,” “TASK,” and “TIME” will become activated, and immediately start inhibiting words without a “T” in them, like “CAKE,” “COKE,” and “CASK.” As activation is also sent back down to lower levels, all letters in words beginning with “T” will become a little bit activated and hence “easier” to “see.” Furthermore, as letters in the context of a word receive activation from the word units above them, they are easier to see in the context of a word than when presented in isolation, when they receive no supporting top-down activation; hence the word superiority effect. Equations described in the Appendix determine the way in which activation flows between units, is summed by units, and is used to change the activation level of each unit at each time step.

Suppose the next letter to be presented is an “A.” This will activate “TAKE” and “TASK” but inhibit “TIME,” which will then also be inhibited in turn by within-level inhibition from “TAKE” and “TASK.” The “A” will of course also activate
“CASK” and “CAKE,” but these will already be some way behind the two words starting with “T.” If the next letter is a “K,” then “TAKE” will be the clear leader. Time is divided into a number of slices called processing cycles. Over time, the pattern of activation settles down or relaxes into a stable configuration so that only “TAKE” remains activated, and hence is the word “seen” or recognized.

The interactive activation model of letter and word recognition has been highly influential. As the name implies, this type of model is heavily interactive; hence any evidence that appears to place a restriction on the role of context is problematic for it. The scope of the model is limited, and gives no account of the roles of meaning and sound in visual word processing. Connection strengths have to be coded by hand. Models where the connection strengths are learned have become more popular. We will examine a connectionist learning model of word recognition and naming in the next chapter.

Hybrid models

Hybrid models combine parallelism (as in the logogen and connectionist models) with serial search (as in Forster’s model). In Becker’s (1976, 1980) verification model, bottom-up, stimulus-driven perceptual processes cannot recognize a word on their own. A process of top-down checking or verification has the final say. Rough perceptual processing generates a candidate or sensory set of possible lexical items. This sensory set is ordered by frequency. Context generates a contextual or semantic set of candidate items. Both the sensory and the semantic set are compared and verified by detailed analysis against the visual characteristics of the word. The semantic set is verified first; verification is serial. If a match is not found, then the matching process proceeds to the sensory set. This process will generate a clear advantage for words presented in an appropriate context. The less specific the context, the larger the semantic set, and the slower the verification process. As the context precedes the target word, the semantic set is ready before the sensory set is ready. Paap, Newsome, McDonald, and Schvaneveldt (1982) also presented a version of the verification model.

Verification models can be extended to include any model where there is verification or checking that the output of the bottom-up lexical access processes is correct. Norris (1986) argued that a post-access checking mechanism checks the output of lexical access against context and resolves any ambiguity.

Comparison of models

There are two dichotomies that could be used to classify these models. The first is between interactive and autonomous models. The second dichotomy is between whether words are accessed directly or through a process of search. The logogen and interactive activation models are both interactive direct access models; the serial search model is autonomous and obviously search-based. Most researchers agree that the initial stages of lexical access involve parallel direct access, although serial processes might subsequently be involved in checking prepared responses. There is less agreement on the extent to which context affects processing. All these models can explain semantic priming, but the serial search model has no role for sentence context.

COPING WITH LEXICAL AMBIGUITY

Ambiguity in language arises in a number of ways. There are ambiguities associated with the segmentation of speech. Consider the spoken phrases “gray tape” with “great ape,” and “ice cream” with “I scream”: in normal speech they sound the same. Some sentences have more than one acceptable syntactic interpretation. Although this chapter is primarily about visual word recognition, in this section we will look at lexical ambiguity for both visual and spoken words.

There are a number of types of lexical ambiguity. Homophones are words with different meanings that sound the same. Some examples of pure homophones are “bank” (a place for money, or a place beside a river) and “pen” (a writing instrument or a place to keep animals). Heterographic homophones sound the same but
are spelled differently (e.g., “knight” and “night,” and “weight” and “wait”). Homographs are ambiguous when written down, and some of these may be disambiguated when pronounced (such as “lead,” as in “dog lead” and “lead” the metal). Most interesting of all are polysemous words, which have multiple meanings. There are many examples of polysemous words in English, such as “bank,” “straw,” “ball,” and “letter.” Consider sentences (5) to (8). Some words are also syntactically ambiguous: “bank” can operate as a verb as well as a noun, as in (7) or (8):

(5) The fisherman put his catch on the bank.
(6) The businessman put his money in the bank.
(7) I wouldn’t bank on it if I were you.
(8) The plane is going to bank suddenly to one side.

Frazier and Rayner (1990) distinguished between words with multiple meanings, where the meanings are unrelated (e.g., the meanings of “bank” or “ball”), and words with multiple senses, where the senses are related (e.g., a “film” can be the physical reel or the whole thing that is projected on a screen or watched on television, “twist” can be a coil, or to operate something by turning, or to sprain an ankle, or to distort the meaning of something; all the meanings are related). It is not always easy to decide whether a word has multiple meanings or senses.

We are faster to make lexical decisions about ambiguous words compared with matched unambiguous words; this advantage is called the ambiguity advantage (Jastrzembski, 1981). However, the advantage is only found for lexical decision. For other tasks there is no advantage or even a disadvantage (e.g., on eye-movement measures; see Rayner, 1998). Perhaps ambiguous words benefit from having multiple entries in the lexicon. This observation needs qualification: while multiple senses of a word confer an advantage, distinct multiple meanings do not (Rodd, Gaskell, & Marslen-Wilson, 2002).

Most of the time we are probably not even aware of the ambiguity of ambiguous words; we have somehow used the context of the sentence to disambiguate the sentence, that is, to select the appropriate sense. The two main processing questions are: How do we resolve the ambiguity (that is, how do we choose the appropriate meaning or reading)? And at what stage is context used?

Early work on lexical ambiguity

Early research on lexical ambiguity used a variety of tasks to examine at what point we select the appropriate meaning of an ambiguous word. Most of these tasks were off-line, in the sense that they used indirect measures that tap processing some time after the ambiguity has been resolved.

Early models of lexical ambiguity

When we come across an ambiguous word, do we immediately select the appropriate sense, or do we access all of the senses and then choose between them, either in some sequence or in parallel? Early researchers worked within the framework of three types of model of resolving lexical ambiguity.

We can call the first model the context-guided single-reading lexical access model (Glucksberg, Kreuz, & Rho, 1986; Schvaneveldt, Meyer, & Becker, 1976; Simpson, 1981). According to this model, the context somehow restricts the access process so that only the relevant meaning is ever accessed. One problem with this model is that it is unclear how context can provide such an immediate constraint.

The second model is called the ordered-access model (Hogaboam & Perfetti, 1975). All of the senses of a word are accessed in order of their individual meaning frequencies. For example, the “writing instrument” sense of “pen” is more frequent than the “agricultural enclosure for animals” sense. Each sense is then checked serially against the context to see if it is appropriate. We check the most common sense against the context first to see if it is consistent. Only if it is not do we try the less common meaning.

The third model is called the multiple-access model (Onifer & Swinney, 1981; Swinney, 1979; Tanenhaus, Leiman, & Seidenberg, 1979). According to this model, when an ambiguous word is encountered, all its senses are activated, and the appropriate one is chosen when the context permits.
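The difference between the ordered-access and multiple-access accounts can be shown with a small sketch. It is only an illustration under strong assumptions: the sense frequencies, the cue words attached to each sense, and the idea that a sense “fits” the context when they share at least one word are all invented here, as are the names `SENSES`, `ordered_access`, and `multiple_access`. The context-guided single-reading model is left out because, as noted above, it is unclear how context would impose its constraint.

```python
# Hypothetical sense inventory: (sense, relative frequency, context cue words).
SENSES = {
    "pen": [
        ("writing instrument", 0.9, {"ink", "write", "accountant", "paper"}),
        ("enclosure for animals", 0.1, {"sheep", "farmer", "pig", "fence"}),
    ],
}


def ordered_access(word, context_words, lexicon=SENSES):
    """Check senses one at a time, most frequent first; stop at the first fit."""
    context = set(context_words)
    checked = []
    for sense, _frequency, cues in sorted(lexicon[word], key=lambda s: -s[1]):
        checked.append(sense)
        if cues & context:          # assumed test of fit: any shared cue word
            return sense, checked
    return None, checked


def multiple_access(word, context_words, lexicon=SENSES):
    """Activate every sense, then keep the ones the context supports."""
    context = set(context_words)
    activated = [sense for sense, _frequency, _cues in lexicon[word]]
    selected = [sense for sense, _frequency, cues in lexicon[word] if cues & context]
    return activated, selected


farm_context = ["the", "farmer", "put", "the", "sheep", "in", "the"]
office_context = ["the", "accountant", "filled", "it", "with", "ink"]

farm_result = ordered_access("pen", farm_context)
office_result = ordered_access("pen", office_context)
parallel_result = multiple_access("pen", office_context)
```

In the sketch, ordered access reaches the less frequent sense only after the dominant sense has been tried and rejected, whereas multiple access activates both senses whatever the context and leaves the choice to a later selection step.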
Early experiments on processing lexical ambiguity

Early experiments appeared to show that we routinely access all the meanings of ambiguous words. This interpretation is based on the premise that if an ambiguous word is harder to process according to some measure than a control unambiguous word, even in a strongly biasing context, then this suggests that at some level the language-processing system has detected the ambiguity. For example, MacKay (1966) used a sentence-completion task whereby participants have to complete an initial sentence fragment (9 or 10) with an appropriate ending:

(9) After taking the right turn at the intersection, I…
(10) After taking the left turn at the intersection, I…

Participants take longer to complete (9) than (10) because of the ambiguity of the word “right.” (It could mean “right” in the sense of “the opposite of left,” or “right” in the sense of “correct.”) This finding suggests that both senses are being considered, and the delay arises because the participant is making a choice.

In these sentences the ambiguity is unresolved by the context: both senses of “right” are appropriate here. Do we find that ambiguous words are more difficult even when the context biases us to one interpretation? Consider sentences (11) and (12). Here the context of “farmer” is strongly biasing towards the farmyard sense of “straw” rather than the sense of short drinking implement. Foss (1970) used a technique called phoneme monitoring to show that ambiguous words take longer to process even when they are strongly biased by context. In this task, participants have to monitor spoken speech for a particular sound or phoneme, and press a button when they detect it. In these sentences the target is /b/. Participants are slower to detect the /b/ in (11) than in (12), presumably because they are slowed down by disambiguating the preceding word.

(11) The farmer put his straw beside the machine.
(12) The farmer put his hay beside the machine.

One problem is that the phoneme monitoring task is sensitive to other linguistic variables, such as the length of the preceding word. Short words leave us little time to process them, whereas long words are often identified and processed before their end; it is as though processing of short words has to continue into the next word. This processing carry-over delays identification of the phoneme for which participants are monitoring. Mehler, Segui, and Carey (1978) showed that this effect disappears if the ambiguous words are properly controlled for length. It so happens that in English ambiguous words tend to be shorter than non-ambiguous words.

In the dichotic-listening task, different messages are presented to the left and right ears (see Figure 6.10). Participants are told to attend to one ear and ignore the other. In experiments by Lackner and Garrett (1972) and MacKay (1973) the attended message was (13), and the unattended message either (14) or (15):

(13) The spy put out the torch as a signal to attack.
(14) The spy extinguished the torch in the window.
(15) The spy displayed the torch in the window.

Afterwards participants were asked to paraphrase the attended message. Their interpretation was affected by the unattended message that disambiguated the ambiguous phrase “put out.”

The experiments discussed so far suggest that all meanings of an ambiguous word are accessed in parallel. Hogaboam and Perfetti (1975) showed that the time taken to access meaning depends on frequency of use. They used an ambiguity detection task, which simply measures the time that participants take to detect the ambiguity. People are slow to detect ambiguity when the word occurs in its most frequent sense (16 rather than 17). This is because in (16) participants access the common reading of “pen” automatically, integrate it with the context, and afterwards have to reanalyze to detect the ambiguity. In (17) participants try the most common sense of the word, fail to integrate it with the context, and then access the second sense. Hence in this case the ambiguity is detected in routine processing.
FIGURE 6.10 Dichotic listening task: different words are presented to each ear. Participants are instructed to ignore material presented to a particular ear (here the right ear), and to shadow the material presented just to the left ear. See text for further information. Diagram labels: headphones; ignored message, “The spy extinguished the torch in the window”; attended message, “The spy put out the torch as a signal to attack”; speech output, “The spy put out the torch as a signal to attack.”
(16) The accountant filled his pen with ink.
(17) The farmer put the sheep in the pen.

Schvaneveldt et al. (1976) employed a successive lexical decision task, in which participants see individual words presented in a stream, and have to make lexical decisions for each word. In this case participants become far less aware of relations between successive words. The lexical decision time to triads of words such as (18), (19), and (20) is the main experimental concern:

(18) save bank money
(19) river bank money
(20) day bank money

The fastest reaction time to “money” was in (18) where the appropriate meaning of “bank” had been primed by the first word (“save”). Reaction time was intermediate in control condition (20), but slowest in (19) where the incorrect sense had been primed. If all senses of “bank” had been automatically accessed when it was first encountered, then “money” should have been primed by “bank” whatever the first word. This result therefore supports selective access.

Swinney’s (1979) experiment
Some of the early evidence supported multiple access, and some selective access. The results we find are very task-dependent. Furthermore, the tasks are either off-line, in the sense that they reflect processing times well after the ambiguity has been processed (such as ambiguity detection, dichotic listening, and sentence completion), or are on-line tasks such as phoneme monitoring that are very sensitive to other variables. We need a task that tells us what is happening immediately when we come across an ambiguous word. Swinney (1979) carried out such an experiment. He used a cross-modal priming technique in which participants have to respond to a visual lexical decision task while listening to correlated auditory material.

(21) Rumor had it that, for years, the government building had been plagued with problems. The man was not surprised when he found several (spiders, roaches, and other) bugs[1] in the cor[2]ner of his room.

In (21) the ambiguous word is “bugs.” The phrase “spiders, roaches, and other” is a disambiguating context that strongly biases participants towards the “insect” sense of “bugs” rather than
the “electronic” sense. Only half the participants heard this strongly disambiguating phrase. There was a visually presented lexical decision task either immediately after (at point 1) or slightly later (three syllables after the critical word, at point 2). The target in the lexical decision was either “ant” (associated with the biased sense), “spy” (associated with the irrelevant sense), or “sew” (a neutral control). Swinney found facilitation at point 1 for both meanings of “bugs,” including the irrelevant meaning, but facilitation only for the relevant meaning at point 2. This suggests that when we first come across an ambiguous word, we automatically access all its meanings. We then use context to make a very fast decision between the alternatives, leaving only the consistent sense active.
Swinney’s experiment showed that semantic context cannot restrict initial access. Tanenhaus et al. (1979) performed a similar experiment based on a naming task rather than lexical decision. They used words that were syntactically ambiguous (e.g., “watch,” which can be a verb or a noun). Tanenhaus et al. found that both senses of the word were initially activated in sentences such as “Boris began to watch” and “Boris looked at his watch.” Again, the context-independent meaning faded after about 200 ms. Hence syntactic context cannot constrain initial access either. Tanenhaus and Lucas (1987) argued that there are good reasons to expect that initial lexical access should not be restricted by syntactic context. Set-membership feedback is of little use in deciding whether or not a word belongs to a particular syntactic category: put another way, the likelihood of correctly guessing what word is presented given just its syntactic category is very low.
In summary, the data so far suggest that when we hear or see an ambiguous word, we unconsciously access all the meanings immediately, but use the context to very quickly reject all inappropriate senses. This process can begin after approximately 200 ms. Less frequent meanings take longer to access because more evidence is needed to cross their threshold for being considered appropriate to the context. This suggests that the processes of lexical access are autonomous, or informationally encapsulated, in that all senses of the ambiguous word are output, but then semantic information is utilized very quickly to select the appropriate sense. This in turn suggests that the construction of the semantic representation of the sentence is happening more or less on a word-by-word basis.
McClelland (1987) argued that these findings are consistent with interactive theories. He argued that context might have an effect very early on, but the advantage it confers is so small that it does not show up in these experiments. This approach is difficult to falsify, so for now the best interpretation of these experiments is that we access all the meanings.

The effects of meaning frequency and prior context
There is now agreement that when we encounter an ambiguous word, all meanings are activated and context is subsequently used to very quickly select the correct meaning. Recent research has used on-line techniques, primarily cross-modal priming and eye-movement measures, to refine these ideas. Research has focused on three main issues. First, what effect does the relative frequency of the different meanings of the ambiguous word have on processing? Second, what is the effect of presenting strong disambiguating context before the ambiguous word? Third, how does context affect the access of semantic properties of words?
There is controversy about whether the relative frequencies of meanings affect initial access. On the one hand, Onifer and Swinney (1981) replicated Swinney’s experiment using materials with an asymmetry in the frequency of the senses of the ambiguous word, so that one meaning was much more frequent than the other meaning. Nevertheless, they still observed that all meanings were initially activated, regardless of the biasing context. However, the dominant meaning may be activated more strongly and perhaps sooner than less frequent ones (Simpson & Burgess, 1985). Extensive use has been made recently of studying eye movements, which are thought to reflect on-line processing. Studies making use of this technique showed that the
time participants take gazing at ambiguous words depends on whether the alternative meanings of the ambiguous word are relatively equal or highly discrepant in frequency. Simpson (1994) called the two types of ambiguous words balanced and unbalanced respectively.
In most of the studies we have examined so far, the disambiguating context comes after the ambiguous word. The evidence converges on the idea that all meanings are immediately accessed but that the context is quickly used to select one of them. What happens when the disambiguating context comes before the ambiguous words? Three models have been proposed to account for what happens.
According to the selective access model, prior disambiguating material constrains access so that only the appropriate meaning is accessed.
According to the reordered access model, prior disambiguating material affects the access phase in that the availability of the appropriate meaning of the word is increased (Duffy, Morris, & Rayner, 1988; Rayner, Pacht, & Duffy, 1994). It is a hybrid model between autonomous and interactive models, where the influence that context can have is limited. Duffy et al. (1988) examined the effect of prior context on balanced or unbalanced ambiguous words, with the unbalanced words always biased by the context to their less common meaning. Processing times for balanced words and their controls were the same, but participants spent longer looking at unbalanced words than the control words. Duffy et al. argued that the prior disambiguating context increased availability of appropriate meanings for both balanced and unbalanced words. In the case of the balanced words, the meaning indicated by the context was accessed before the other meanings. In the case of the unbalanced words with the biasing context, the two meanings were accessed at the same time, with additional processing time then needed to select the appropriate subordinate meaning. This additional time is called the subordinate bias effect (Rayner et al., 1994). A biasing context can reorder the availability of the meanings so that the subordinate meaning becomes available at the same time as the dominant meaning.
According to the autonomous access model, prior context has no effect on access; meanings are accessed exhaustively. In a version of this called the integration model, the successful integration of one meaning with prior context terminates the search for alternative meanings of that word (Rayner & Frazier, 1989). Hence there is selective (single meaning) access when the integration of the dominant meaning is fast (due to the context) but identification of a subordinate meaning is slow.
Dopkins, Morris, and Rayner (1992) carried out an experiment to distinguish between the reordered access and integration models. In their experiment, an ambiguous word was both preceded and followed by context relevant to the meaning of the word. The context that followed the ambiguous word always conclusively disambiguated it. The main manipulation in this experiment was the extent to which the prior context was consistent with the meanings of the ambiguous word. In the positive condition, the ambiguous word was preceded by material that highlighted an aspect of its subordinate meaning, although the context was also consistent with the dominant meaning (e.g., 22). In the negative condition, the word was preceded by material that was inconsistent with the dominant meaning but did not contain any strong bias to the subordinate meaning (e.g., 23). In the neutral condition, the ambiguous word was preceded by context that provided support for neither of its meanings (e.g., 24).

(22) Having been examined by the king, the page was soon marched off to bed. [positive condition]
(23) Having been hurt by the bee-sting, the page was soon marched off to bed. [negative condition]
(24) Just as Henrietta had feared, the page was soon marched off to bed. [neutral condition]

What do the two models predict? The critical condition is the positive condition. The integration model predicts that context has no effect on the initial access phase. The meanings of ambiguous words will be accessed in a strict temporal sequence that is independent of
the context, with the dominant meaning always accessed first. If this meaning can be integrated with the context, it will be selected; if not, the processor will try to integrate the next meaning with the context, and so on. In the positive and neutral conditions, the context will contain no evidence that the dominant meaning is inappropriate, so the processor will succeed in integrating this meaning, halt before the subordinate meaning is accessed, and move on. When the subsequent material is encountered, the processor realizes its mistake and has to backtrack. In the negative condition, the preceding context indicates that the dominant meaning is inappropriate, so the processor will then have to spend time accessing the subordinate meaning. The later context will provide no conflict. The integration model predicts that processing times for the ambiguous word will be longer in the negative condition than in the positive and neutral conditions, but processing time for the later disambiguating context will be longer in the positive and neutral conditions than in the negative.
The reordered access model predicts that the preceding context will have an effect on the initial access of the ambiguous word in the positive condition but not in the negative or neutral conditions. In the positive condition, the context will lead to the subordinate meaning being accessed early. This means that when the context after the word is encountered, the processor will not have to recompute anything, so processing in the disambiguating region will be fast. In the negative and neutral conditions the preceding context contains no evidence for the subordinate meaning and the predictions are similar to the integration model.
The key condition, then, is the positive condition, which favors the subordinate meaning but is also consistent with the dominant meaning. The reordered access model predicts that processing times in the subsequent disambiguation region will be relatively fast, whereas the integration model predicts that they will be relatively slow. The results supported the reordered access model. Dopkins et al. found that reading times for the disambiguating material were indeed relatively fast in the positive condition.
The reordered access model finds further support from an experiment by Folk and Morris (1995). They examined reading fixation times and naming times when reading words that were semantically ambiguous (e.g., “calf”), had the same pronunciation but different meanings and orthographies (e.g., “break” and “brake”), or had multiple semantic and phonological codes (e.g., “tear”). They found that semantic, phonological, and orthographic constraints all had an early effect, influencing the order of availability of the meanings.
So far, then, the data support a reordered access model over a strictly autonomous one such as the integration model. Contextual information can be used to restrict the access of meanings. In the reordered access model, however, the role of context is restricted by meaning frequency. In particular, the subordinate-biased context cannot inhibit the dominant meaning from becoming available. Recent research has examined the extent to which this is true. An alternative model is the context-sensitive model (Simpson, 1994; Vu, Kellas, & Paul, 1998), where meaning frequency and biasing context operate together, dependent on contextual strength. This is the degree of constraint that the context places on an ambiguous word. According to this model, the subordinate bias effect that motivated the reordered access model only arises in weakly biasing contexts. If the context is sufficiently strong, the subordinate meaning alone can become available.
If the context-sensitive model is correct, then a sufficiently strong context should abolish the subordinate bias effect whereby we spend longer looking at an ambiguous word when its less frequent meaning is indicated by the context. This idea was tested in an experiment by Martin, Vu, Kellas, and Metcalf (1999). Martin et al. varied the strength of the discourse context: (25) is a weakly biasing context towards the subordinate meaning, but (26) is a strongly biasing context to the subordinate meaning; (27) and (28) show the control contexts for the dominant meanings.

(25) The scout patrolled the area. He reported the mine to the commanding officer. [weak context favoring subordinate meaning]
(26) The gardener dug a hole. She inserted the bulb carefully into the soil. [strong context favoring subordinate meaning]
(27) The farmer saw the entrance. He reported the mine to the survey crew. [weak context favoring dominant meaning]
(28) The custodian fixed the problem. She inserted the bulb into the empty socket. [strong context favoring dominant meaning]

According to the reordered access model, the dominant meaning will always be generated regardless of context, so time will be needed to resolve the competition. Hence there will be a subordinate bias effect, and the reading times on the ambiguous word should be the same, and longer than the reading time for the dominant meanings, regardless of the strength of the context. According to the context-sensitive model, there should only be conflict and therefore a subordinate bias effect in the weak context condition; therefore reading times of the ambiguous word should be faster with the strong biasing context compared with the weak context. The data from a self-paced reading task supported the context-sensitive model. A sufficiently strong context can eliminate the subordinate bias effect so that reading times on a word with either the subordinate or the dominant meaning strongly indicated are the same.
Rayner, Binder, and Duffy (1999) criticized the materials in this experiment. They argued that many of the items were unsuitable. For example, some items appeared to be more balanced than biased, and some contexts were consistent with the same meaning. They also argued that the reordered access model predicts that in very strong contexts the subordinate meaning might be accessed before the dominant meaning. Nevertheless, access is exhaustive: the dominant meaning is still always accessed, unless the context contains a strong associate of the intended meaning, as in Seidenberg, Tanenhaus, Leiman, and Bienkowski (1982). Hence, Rayner et al. (1999) argue, the data from Martin et al. are not contrary to the reordered access model. In reply, Vu and Kellas (1999), while admitting that there were problems with some of their stimuli, claim that these problems could not have led to erroneous results.

Accessing selective properties of words
Tabossi (1988a, 1988b) used a cross-modal priming task to show that sentence context that specifically constrains a property of the prime word leads to selective facilitation. She argued for a modified version of context-dependency: not all aspects of semantic-pragmatic context can constrain the search through the possible meanings, but semantic features constraining specific semantic properties can provide such constraints. For example, the context in (29) clearly suggests the “sour” property of “lemon.” Tabossi observed facilitation when the target “sour” was presented visually in a lexical decision task immediately after the prime (“lemon”), relative both to the same context but with a different noun (30) and a different context with the same noun (31).

(29) The little boy shuddered eating the lemon.
(30) The little boy shuddered eating the popsicle.
(31) The little boy rolled on the floor a lemon.

In effect, Tabossi argued that there are large differences in the effectiveness of different types of contextual cues. If the context is weakly constraining, we observe exhaustive access, but if it is very strongly constraining, we observe selective access. However, Moss and Marslen-Wilson (1993) pointed out that the acoustic offset of the prime word might be too late to measure an effect, given that initial lexical access occurs very early, before words are completed. Tabossi used two-syllable-long words, and it is possible that these words were long enough to permit initial exhaustive access with selection occurring before presentation of the target. Tabossi and Zardon (1993) examined this possibility in a cross-modal lexical decision task by presenting the target 100 ms before the end of the ambiguous prime. They still found that only the dominant, relevant meaning was activated when the context was strongly biasing towards that meaning. Tabossi and Zardon also found that if the context strongly biases the interpretation to the less frequent meaning, both the dominant meaning (because of its dominance) and less
dominant meaning (because of the effect of context) are active after 100 ms (see also Simpson & Krueger, 1991).
Moss and Marslen-Wilson (1993) also explored the way in which aspects of meaning can be selectively accessed. They measured lexical access very early on, before the presentation of the prime had finished. Semantically associated targets were primed independent of context, whereas access to semantic-property targets was affected by the semantic context. Semantic properties were not automatically accessed whenever heard, but could be modulated by prior context, even at the earliest probe position. Hence this finding again indicates that neither exhaustive nor selective access models may be quite right, in that what we find depends on the detailed relation between the context and the meanings of the word.

Evaluation of work on lexical ambiguity
Early on, there were two basic approaches to how we eventually select the appropriate sense of ambiguous words. According to the autonomous view, we automatically access all the multiple senses of a word, and use the context to select the appropriate reading. Semantic contextual information is then used to access the appropriate sense of the word. On the interactive view, the context enables selective access of the appropriate sense of the ambiguous word. The experiments used in this area are very sensitive to properties of the target and context length. When we get context-sensitive priming in these cross-modal experiments depends on the details of the semantic relation between the target and prime. Early experiments using off-line tasks found contradictory results for both multiple and context-specific selective access. Later experiments using more sophisticated cross-modal priming indicated multiple access with rapid resolution.
More recent experiments suggest that the pattern of access depends on the relative frequencies of the alternative senses of the ambiguous word and the extent to which the disambiguating context constrains the alternatives. All recent models of disambiguation incorporate an element of interactivity: the question now is the extent to which it is restricted. Can a sufficiently constraining semantic context prevent the activation of the less dominant meaning of a word? Hence the way in which we deal with lexical ambiguity depends on both the characteristics of the ambiguous word and the type of disambiguating context.
A number of questions remain to be answered. In particular, how does context exert its influence in selecting the right meaning? How does semantic integration occur? MacDonald, Pearlmutter, and Seidenberg (1994b) address this issue, and also address the relation between lexical and syntactic ambiguity. They propose that the two are resolved using similar mechanisms based on an enriched lexicon. Kawamoto (1993) constructed a connectionist model of lexical ambiguity resolution. The model showed that, even in an interactive system, multiple candidates become active, even when the context clearly favors one meaning. (This happens because the relation between a word’s perceptual form and its meanings is much stronger than the relation between the meaning and the context.) This suggests that multiple access is not necessarily diagnostic of modularity.
Although ambiguous words appear to cause difficulty for the language system, there are some circumstances where ambiguous words have an advantage. We may be quicker to name ambiguous words compared with unambiguous words, and they have an advantage in lexical decision (e.g., Balota, Ferraro, & Conner, 1991; Jastrzembski, 1981; Kellas, Ferraro, & Simpson, 1988; Millis & Button, 1989; but see Borowsky & Masson, 1996). There are a number of explanations for this possible advantage, but they all center around the idea that having multiple target meanings speeds up processing of the word. For example, if each word meaning corresponds to a detector such as a logogen, then a word with two meanings will have two detectors. The probability of an ambiguous word activating one of its multiple detectors will be higher than the probability of an unambiguous word activating its only detector.
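The detector argument can be made concrete with a small calculation. The sketch below is only an illustration and is not taken from any of the studies cited: it assumes that each detector fires independently and with the same probability p, and the function name and the value 0.6 are hypothetical. Under those assumptions, a word with n detectors activates at least one of them with probability 1 - (1 - p)^n, which is never lower than p.

```python
def p_any_detector_fires(p_single: float, n_detectors: int) -> float:
    """Probability that at least one of n detectors fires.

    Assumption (not from the text): detectors fire independently,
    each with the same probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability between 0 and 1")
    if n_detectors < 1:
        raise ValueError("a word needs at least one detector")
    # Complement rule: 1 minus the chance that every detector stays silent.
    return 1.0 - (1.0 - p_single) ** n_detectors


if __name__ == "__main__":
    p = 0.6  # hypothetical firing probability for a single detector
    unambiguous = p_any_detector_fires(p, 1)  # one meaning, one detector
    ambiguous = p_any_detector_fires(p, 2)    # two meanings, two detectors
    print(f"unambiguous word: {unambiguous:.2f}")
    print(f"ambiguous word:   {ambiguous:.2f}")
```

With these illustrative numbers the ambiguous word reaches a detector with probability 0.84 against 0.60 for the unambiguous word. The size of the gap depends entirely on the assumed value of p and on the independence assumption.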
SUMMARY
• Word recognition is distinct from object and face recognition.
• Recognizing a word occurs when we uniquely access its representation in the mental lexicon.
• Eyes fixate on material that is being read for 200–250 ms, with movements between fixations called saccades.
• Lexical access is affected by repetition, frequency, age-of-acquisition, word length, the existence of similar words, the physical and semantic similarity of preceding items, and stimulus quality.
• Semantic priming is the facilitation of word recognition by prior presentation of an item related in meaning.
• Semantic priming has a fast, automatic, mandatory, facilitatory component, and a slow, attentional component that inhibits unexpected candidates.
• The lexical decision and naming tasks sometimes give different results, with lexical decision more prone to contamination by post-access processes such as response checking, and naming prone to contamination by the processes involved in assembling a word’s pronunciation.
• Semantic priming has an automatic component based on association, and an attentional component involving non-associative semantic relations.
• Some types of non-associative semantic relations may give rise to automatic facilitation; instrumental semantic priming at least is automatic.
• Different aspects of a word’s meaning are accessed over time, with functional information about artifacts becoming available before perceptual information.
• Sentence-based contextual priming operates through expectancy-based attentional mechanisms, but may also have an early automatic component.
• In English, morphologically complex words are decomposed into their stems by affix stripping, but morphologically complex high-frequency words may have their own lexical listing.
• There is a level of lexical representation that is modality-independent (because we observe cross-modal priming), and that is morphologically structured for semantically transparent words in English.
• Compound words whose meanings are not transparent from their components (e.g., “buttercup”) will also be stored separately.
• Forster’s model of word recognition is based on serial search through frequency-ordered bins.
• Morton’s logogen model proposes that each word has an individual feature counter (a logogen) associated with it that accumulates evidence until a threshold is exceeded.
• IAC (Interactive Activation and Competition) networks are connectionist networks with excitatory connections between letters and words to which the letters belong, and inhibitory connections elsewhere.
• Lexical ambiguity is when a word can have two meanings.
• How we access the meaning of ambiguous words depends on the relative frequencies of the alternative senses of the ambiguous word, and the extent to which the disambiguating context constrains the alternatives.
• When we come across an ambiguous word, all its meanings are activated, but the context is very quickly used to select the appropriate sense.
1. What might be different about reading in languages such as Hebrew that read from right to left?
2. Is the lexicon really like a dictionary?
3. Compare and contrast two models of word recognition.
4. How many types of priming are there?
5. What are the differences between naming, recognition, lexical access, and accessing the meaning?
What might neuropsychology tell us about these processes?
FURTHER READING
For a collection of papers surveying the field, see Andrews (2006). For reviews of the eye-movement
literature, see van Gompel, Fischer, Murray, and Hill (2006), and the collection edited by Henderson
and Ferreira (2004). For a detailed discussion of the latest version of the E-Z Reader model (version
7), and a comparison with several other important models of eye-movement control in reading, with
peer commentary, see Reichle, Rayner, and Pollatsek (2003). In addition to the E-Z Reader, there are
other recent models of eye-movement control in reading. See McDonald, Carpenter, and Shillcock
(2005) for the SERIF model. The SERIF model emphasizes the way in which information from each
half of the visual field is transmitted to the contralateral visual cortex. See Legge, Klitz, and Tjan
(1997) for the Mr. Chips model, and Martin (2004) for the Encoder model.
See Dean and Young (1996) for a review of work on repetition priming, and experimental evi-
dence that is troublesome for the episodic view. Morrison, Chappell, and Ellis (1997) provide age-
of-acquisition norms for a large set of object names.
More recent work on perception without awareness can be found in the papers by Doyle and
Leach (1988) and Dagenbach, Carr, and Wilhelmsen (1989). Humphreys (1985) reviewed the litera-
ture on attentional processes in priming. Neely (1991) provides a wide-ranging review of semantic
priming. For discussion of whether associative priming occurs through a mechanism of spreading
activation or some more complex process, see McNamara (1992, 1994). Plaut and Booth (2000)
present a connectionist model that incorporates both facilitation and inhibition using a single mecha-
nism. See Kinoshita and Lupker (2003) for a review of work on masked priming.
An excellent review of models of word recognition is Carr and Pollatsek (1985); they provide a
useful diagram showing the relation of all types of recognition model. See Garnham (1985) for more
detail on the interactions between frequency, context, and stimulus quality.
CHAPTER 7
READING
INTRODUCTION
In Chapter 6 we looked at how we recognize words; this chapter is about how we read them. How do we gain access to the sounds and meanings of words? We also examine the effects of brain damage on reading (giving rise to acquired dyslexia), and show how reading disorders can be related to a model of reading. The next chapter looks at how children learn to read.
Reading aloud and reading to oneself are clearly different, but related, tasks. When we read aloud (or name words), we must retrieve the sounds of words. When we read to ourselves, we read to obtain the meaning, but most of us, most of the time, experience the sounds of the words as “inner speech.” Is it possible to go to the meaning of a word when reading without also accessing its sounds? By the end of this chapter you should:

• Know how different languages translate words into sounds, and understand the alphabetic principle.
• Understand the motivation for the dual-route model of reading, and know about its strengths and weaknesses.
• Appreciate how different types of dyslexia relate to the dual-route model, and also the problems they pose for it.
• Know about connectionist models of reading and how they account for dyslexia.

THE WRITING SYSTEM
The basic unit of written language is the letter. The name grapheme is given to the letter or combination of letters that represents a phoneme. For example, the word “ghost” contains five letters and four graphemes (“gh,” “o,” “s,” and “t”), representing four phonemes. There is much more variability in the structure of written languages than there is in spoken languages. Whereas all spoken languages utilize a basic distinction between consonants and vowels, there is no such common thread to the world’s written languages. The sorts of written language most familiar to speakers of English and other European languages are alphabetic scripts. English uses an alphabetic script. In alphabetic scripts, the basic unit represented by a grapheme is essentially a phoneme. However, the nature of this correspondence can vary. In transparent languages such as Serbo-Croat and Italian there is a one-to-one grapheme–phoneme correspondence, so that every grapheme is realized by only one phoneme and every phoneme is realized by only one grapheme. In languages such as English this relation can be one-to-many in both directions. A phoneme can be realized by different graphemes (e.g., compare “to,” “too,” “two,” and “threw”), and a grapheme can be realized by many different phonemes (e.g., the letter “a” in the words “fate,” “pat,” and “father”). Some languages lie between these extremes. In French, correspondences between graphemes and phonemes are
TABLE 7.1 Types of written languages.
Script type                     Examples                            Features
Alphabetic script               English, other European languages   The basic unit represented by a grapheme is essentially a phoneme.
Consonantal script              Hebrew, Arabic                      Not all sounds are represented, as vowels are not written down.
Syllabic script                 Cherokee, Japanese kana             Written units represent syllables.
Logographic/ideographic script  Chinese, Japanese kanji             Each symbol represents a whole word.
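The contrast between one-to-one and one-to-many grapheme–phoneme correspondence can be made concrete with a small sketch. The two correspondence tables below are toy fragments invented for illustration (they are not complete descriptions of any real orthography), the phoneme symbols are informal labels, and the function names `invert` and `is_transparent` are hypothetical.

```python
from collections import defaultdict


def invert(g2p):
    """Map each phoneme to the set of graphemes that can spell it."""
    p2g = defaultdict(set)
    for grapheme, phonemes in g2p.items():
        for phoneme in phonemes:
            p2g[phoneme].add(grapheme)
    return dict(p2g)


def is_transparent(g2p):
    """True if each grapheme has one phoneme and each phoneme one grapheme."""
    one_phoneme_each = all(len(ps) == 1 for ps in g2p.values())
    one_grapheme_each = all(len(gs) == 1 for gs in invert(g2p).values())
    return one_phoneme_each and one_grapheme_each


# Toy fragment of a transparent script (assumption for illustration only).
TRANSPARENT_TOY = {"p": {"/p/"}, "t": {"/t/"}, "m": {"/m/"}}

# Toy fragment of English, using the examples from the text (assumption:
# the grapheme boundaries chosen here are simplified).
ENGLISH_TOY = {
    "a": {"/eɪ/", "/æ/", "/ɑː/"},  # "fate", "pat", "father"
    "o": {"/uː/"},                 # "to"
    "oo": {"/uː/"},                # "too"
    "wo": {"/uː/"},                # "two"
    "ew": {"/uː/"},                # "threw"
}

print(is_transparent(TRANSPARENT_TOY))
print(is_transparent(ENGLISH_TOY))
print(sorted(invert(ENGLISH_TOY)["/uː/"]))
```

In the English fragment the relation fails in both directions: the grapheme “a” maps to several phonemes, and one vowel phoneme can be spelled by several graphemes.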
quite regular, but a phoneme may have different graphemic realizations (e.g., the graphemes “o,” “au,” “eau,” “aux,” and “eaux” all represent the same sounds). In consonantal scripts, such as Hebrew and Arabic, not all sounds are represented, as vowels are not written down at all. In syllabic scripts (such as Cherokee and the Japanese script kana), the written units represent syllables. Finally, some languages do not represent any sounds. In ideographic languages (sometimes also called logographic languages), such as Chinese and the Japanese script kanji, each symbol is equivalent to a morpheme (see Table 7.1).
One consequence of this variation in writing systems is that there must be differences in processing between readers of different languages. Hence this chapter should be read with the caution in mind that some conclusions may be true of English and many other writing systems, but not necessarily of all of them.
[Figure: the Hebrew alphabet.] There is much more variability in the structure of written languages than there is in spoken languages. In consonantal scripts, such as Hebrew (above) and Arabic, not all sounds are represented.
Unlike speech, reading and writing are a relatively recent development. Writing emerged independently in Sumer and Mesoamerica, and perhaps also in Egypt and China. The first writing system was the cuneiform script printed on clay in Sumer, which appeared just before 3000 BC. The emergence of the alphabetic script can be traced to ancient Greece in about 1000 BC. The development of the one-to-many correspondence in English orthography primarily arose between the fifteenth and eighteenth centuries as a consequence of the development of the printing press and the activities of spelling “reformers” who tried to make the Latin and Greek origins of words more apparent in their spellings (see Ellis, 1993, for more detail). Therefore it is perhaps not surprising that reading is actually quite a complex cognitive task. There is a wide variation in reading abilities, and many different types of reading disorder arise as a consequence of brain damage.
A PRELIMINARY MODEL OF READING
Introspection can provide us with a preliminary model of reading. Consider how we might name or pronounce the word “beef.” Words like this are said to have a regular spelling-to-sound correspondence. That is, the graphemes map onto
7. READING 211
phonemes in a totally regular way; you need no special knowledge about the word to know how to pronounce it. If you had never seen the word “beef” before, you could still pronounce it correctly. Some other examples of regular word pronunciations include “hint” and “rave.” In these words, there are alternative pronunciations (as in “pint” and “have”), but “hint” and “rave” are pronounced in accordance with the most common pronunciations. These are all regular words, because all the graphemes have the standard pronunciation.
Not all words are regular, however. Some are irregular or exception words. Consider the word “steak.” This has an irregular spelling-to-sound (or grapheme-to-phoneme) correspondence: the grapheme “ea” is not pronounced in the usual way, as in “streak,” “sneak,” “speak,” “leak,” and “beak.” Other exceptions to a rule include “have” (an exception to the rule that leads to the regular pronunciations “gave,” “rave,” “save,” and so forth) and “vase” (in British English, an exception to the rule that leads to the regular pronunciations “base,” “case,” and so forth). English has many irregular words. Some words are extremely irregular, containing unusual patterns of letters that have no close neighbors, such as “island,” “aisle,” “ghost,” and “yacht.” These words are sometimes called lexical hermits.
Finally, we can pronounce strings of letters such as “nate,” “smeak,” “fot,” and “datch,” even though we have never seen them before. These letter strings are all pronounceable nonwords or pseudowords. Therefore, even though they are novel, we can still pronounce them, and we all tend to agree on how they should be pronounced. If you hear nonwords like these, you can spell them correctly; you assemble their pronunciations from their constituent graphemes. (Of course, not all nonwords are pronounceable, e.g., “xzhgh.”)
Our ability to read nonwords on the one hand and irregular words on the other suggests the possibility of a dual-route model of naming. We can assemble pronunciations for words or nonwords we have never seen before, yet also pronounce correctly irregular words that must need information specific to those words (that is, lexical information). The classic dual-route model (see Figure 7.1) has two routes for turning words into sounds. There is a direct access or lexical route, which is needed for irregular words. This must at least in some way involve a direct link between print and sound. That is, the lexical route takes us directly to a word’s entry in the lexicon and we are then able to retrieve the sound of a word. There is also a grapheme-to-phoneme conversion (GPC) route (also called the indirect or non-lexical or sublexical route), which is used for reading nonwords. This route carries out what is called phonological recoding. It does not involve lexical access at all. The non-lexical route was first proposed in the early 1970s (e.g., Gough, 1972; Rubenstein, Lewis, & Rubenstein, 1971). Another important justification for a grapheme-to-phoneme conversion route is that it is useful for children learning to read by sounding out words letter by letter.
Given that neither route can in itself adequately explain reading performance, it seems that we must use both. Modern dual-route theorists see reading as a “race” between these routes. When
FIGURE 7.1 The simplified version of the dual-route model of reading. [Diagram labels: Print; Lexicon; Grapheme–phoneme conversion rules; Pronunciation.]
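The “race” between the two routes of the simplified dual-route model can be sketched in a few lines of code. This is only an illustration under stated assumptions: the mini-lexicon, the lookup times, the fixed assembly time, the conflict window, the rule table, and the informal pronunciation spellings (such as “hayv” and “pynt”) are all invented for the example and are not taken from any published model.

```python
# Hypothetical mini-lexicon: word -> (stored pronunciation, lookup time in ms).
# Common words are assumed to be retrieved faster than rare ones.
LEXICON = {
    "have": ("hav", 300),   # common exception word
    "pint": ("pynt", 430),  # rarer exception word
    "hint": ("hint", 500),  # rarer regular word
}

# Hypothetical grapheme-phoneme conversion rules (longest match is tried first).
GPC_RULES = {"ave": "ayv", "int": "int", "h": "h", "p": "p", "t": "t"}

GPC_TIME_MS = 450         # assumed fixed assembly time for the GPC route
CONFLICT_WINDOW_MS = 100  # assumed window within which the two outputs compete


def gpc_route(letters):
    """Assemble a pronunciation by rule; return None if no rule applies."""
    out, i = [], 0
    while i < len(letters):
        for size in (3, 2, 1):
            chunk = letters[i:i + size]
            if chunk in GPC_RULES:
                out.append(GPC_RULES[chunk])
                i += len(chunk)
                break
        else:
            return None  # unpronounceable string
    return "".join(out)


def name_word(letters):
    """Return (pronunciation, winning route, conflict flag) for a letter string."""
    assembled = gpc_route(letters)
    entry = LEXICON.get(letters)
    if entry is None:
        return assembled, "gpc", False  # nonwords can only be assembled
    stored, lexical_time = entry
    if assembled is None:
        return stored, "lexical", False
    if lexical_time <= GPC_TIME_MS:
        winner = (stored, "lexical")
    else:
        winner = (assembled, "gpc")
    # Conflict: the routes disagree and finish at about the same time.
    conflict = (assembled != stored
                and abs(lexical_time - GPC_TIME_MS) < CONFLICT_WINDOW_MS)
    return winner[0], winner[1], conflict


for item in ("have", "pint", "hint", "tave"):
    print(item, name_word(item))
```

With these made-up numbers, the common exception word is named by the lexical route with no conflict, the rarer exception word still wins lexically but is flagged as conflicting because the regularized output arrives at about the same time, and the nonword can only be assembled by rule.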
we see a word, both routes start processing it. For skilled readers, most of the time the direct route is much faster, so it will usually win the race and the word will be pronounced the way that it recommends. The indirect route will only be apparent in exceptional circumstances, such as when we see a very unfamiliar word; in that case, if the direct route is slower than normal, then the direct and GPC routes will produce different pronunciations at about the same time, and these words might be harder to pronounce.
In the previous chapter we examined a number of models of word recognition. These can all be seen as theories of how the direct, lexical access reading route operates. The dual-route is the simplest version of a range of possible multi-route or parallel coding models, some of which posit more than two reading routes. Do we really need a non-lexical route at all for routine reading? Although we appear to need it for reading nonwords, it seems a costly procedure. We have a mechanism ready to use for something we rarely do: pronouncing new words or nonwords. Perhaps it is left over from the development of reading, or perhaps it is not as costly as it first appears. We will see later that the non-lexical route is also apparently needed to account for the neuropsychological data. Indeed, whether or not two routes are necessary for reading is a central issue of the topic of reading. Models that propose that we can get away with only one (such as connectionist models) must produce a satisfactory account of how we can pronounce nonwords.
Of course, except for reading aloud, the primary goal of reading is not getting the sound of a word, but getting the meaning. As we shall see in Chapter 8, in the early stages of learning to read children get to the meaning through the sound; that is, they spell out the sound of the words, and then access meaning as they recognize those sounds. Some researchers believe that even skilled adults primarily get to meaning by going from print to phonology and then to meaning, an idea called phonological mediation (discussed in more detail below). Most researchers, however, believe that in skilled adults, most of the time, there is a direct route from print to semantics. Indeed, as we shall see below, most researchers believe that there is a direct route from print to sound, and a direct route via semantics; what is debated is the role of the indirect route in normal reading (see Taft & van Graan, 1998, for further discussion of these issues).
THE PROCESSES OF NORMAL READING
According to the dual-route model, there are two independent routes when naming a word and accessing the lexicon: a lexical or direct access route and a sublexical or grapheme–phoneme conversion route. This section looks at how we name nonwords and words.
Reading nonwords
It sounds odd to start a section on “normal reading” by talking about how we can read nonwords, but they’re very revealing. According to the dual-route model, the pronunciation of all nonwords should be assembled using the GPC route. This means that all pronounceable nonwords should be alike and their similarity to words should not matter. However, pronounceable nonwords are not all alike.
The pseudohomophone effect
Pseudohomophones are pronounceable nonwords that sound like words when pronounced (such as “brane,” which sounds like the word “brain” when spoken). The behavior of the pseudohomophone “brane” can be compared with the very similar nonword “brame,” which does not sound like a word when it is spoken. Rubenstein et al. (1971) showed that pseudohomophones are more confusable with words than other types of nonwords are. Participants are faster to name them, but slower to reject them as nonwords than control nonwords.
Is the effect caused by the phonological or visual similarity between the nonword and word? Martin (1982) and Taft (1982) argued that it is visual similarity that is important. Pseudohomophones are more confusable with words than other nonwords are because they look more similar to words than non-pseudohomophones, rather than because they sound the same. Pring (1981) alternated the
case of letters within versus across graphemes, such as the “AI” in “grait,” to produce “GraIT” or “GRaiT.” These strings look different but still sound the same. Alternating letter cases within a grapheme or spelling unit (aI) eliminates the pseudohomophone effect; alternating letters elsewhere in the word (aiT) does not. Hence we are sensitive to the visual appearance of spelling units of words.
The pseudohomophone effect suggests that not all nonwords are processed in the same way. The importance of the visual appearance of the nonwords further suggests that something else apart from phonological recoding is involved here. It remains to be seen whether the phonological recoding route is still necessary, but if it is, then it must be more complex than we first thought.
Glushko’s (1979) experiment: Lexical effects on nonword reading
Glushko (1979) performed a very important experiment on the effect of the regularity of the word-neighbors of a nonword on its pronunciation. Consider the nonword “taze.” Its word-neighbors include “gaze,” “laze,” and “maze”; these are all themselves regularly pronounced words. Now consider the word-neighbors of the nonword “tave.” These also include plenty of regular words (e.g., “rave,” “save,” and “gave”) but there is an exception word-neighbor (“have”). As another example, compare the nonwords “feal” and “fead”: both have regular neighbors (e.g., “real,” “seal,” “deal,” and “bead”) but the pronunciation of “fead” is influenced by its irregular neighbor “dead.” Glushko (1979) showed that naming latencies to nonwords such as “tave” were significantly slower than to ones such as “taze.” That is, reaction times to nonwords that have orthographically irregular spelling-to-sound correspondence word-neighbors are slower than to other nonword controls. Also, people make pronunciation “errors” with such nonwords: “pove” might be pronounced to rhyme with “love” rather than “cove”; and “heaf” might be pronounced to rhyme with “deaf” rather than “leaf.” In summary, Glushko found that the pronunciation of nonwords is affected by the pronunciation of similar words, and that nonwords are not the same as each other. Subsequent research has shown that the proportion of regular pronunciations of nonwords increases as the number of orthographic neighbors increases (McCann & Besner, 1987). In summary, there are lexical effects on nonword processing.
More on reading nonwords
The nonword “yead” can be pronounced to rhyme with “bead” or “head.” Kay and Marcel (1981) showed that its pronunciation can be affected by the pronunciation of a preceding prime word: “bead” biases a participant to pronounce “yead” to rhyme with it, whereas the prime “head” biases participants to the alternative pronunciation. Rosson (1983) primed the nonword by a semantic relative of a phonologically related word. The task was to pronounce “louch” when preceded either by “feel” (which is associated with “touch”) or by “sofa” (which is associated with “couch”). In both cases “louch” tended to be pronounced to rhyme with the appropriate relative.
Finally, nonword effects in complex experiments are sensitive to many factors, such as the pronunciation of the surrounding words in the list. This also suggests that nonword pronunciation involves more than just grapheme-to-phoneme conversion.
Evaluation of research on reading nonwords
These data do not fit the simple version of the dual-route model. The pronunciation of nonwords is affected by the pronunciation of visually similar words. That is, there are lexical effects in nonword processing; the lexical route seems to be affecting the non-lexical route.
Reading words
According to the dual-route model, words are accessed directly by the direct route. This means that all words should be treated the same in respect of the regularity of their spelling-to-sound correspondences. An examination of the data reveals that this prediction does not stand up.
One problem for the simple dual-route model is that pronunciation regularity affects response
times, although in a complex way. Baron and Strawson (1976) provided an early demonstration of this problem, finding that a list of regular words was named faster than a list of frequency-matched exception words (e.g., “have”). This task is a simplified version of the naming task, with response time averaged across many items rather than taken from each one individually. There have been many other demonstrations of the influence of regularity on naming time (e.g., Forster & Chambers, 1973; Frederiksen & Kroll, 1976; Stanovich & Bauer, 1978). A well-replicated finding is that of an interaction between regularity and frequency: regularity has little effect on the pronunciation of high-frequency words, but low-frequency regular words are named faster than low-frequency irregular words (e.g., Andrews, 1982; Seidenberg, Waters, Barnes, & Tanenhaus, 1984), even when we control for age-of-acquisition (Monaghan & Ellis, 2002). Jared (1997b) found that high-frequency words can be sensitive to regularity, but the effect of regularity is moderated by the number and frequencies of their “friends” and “enemies” (words with similar or conflicting pronunciations). That is, it is important to control for the neighborhood characteristics of the target words as well as their regularity in order to observe the interaction. On the other hand, it is not clear whether there are regularity effects on lexical decision. They have been obtained by, for example, Stanovich and Bauer (1978), but not by Coltheart et al. (1977), or Seidenberg et al. (1984). In particular, a word such as “yacht” looks unusual, as well as having an irregular pronunciation. The letter pairs “ya” and “ht” are not frequent in English; we say they have a low bigram frequency. Obviously the visual appearance of words is going to affect the time it takes for direct access, so we need to control for this when searching for regularity effects. Once we control for the generally unusual appearance of irregular words, regularity and consistency only seem to affect naming times, not lexical decision times. Age-of-acquisition has a similar effect to frequency, and gives rise to a similar interaction: Consistency has a much bigger impact on naming time for late-acquired than early-acquired words (Monaghan & Ellis, 2002). Why do late-acquired and low-frequency inconsistent words stand out? One possibility is that late-acquired low-frequency consistent words can make use of the network structure of other consistent words; inconsistent items cannot, and need new associations to be learned between input and output (Monaghan & Ellis, 2002).
In general, regularity effects are more likely to be found when participants have to be more conservative, such as when accuracy rather than speed is emphasized. The finding that regularity affects naming might appear problematic for the dual-route model, but makes sense if there is a race between the direct and indirect routes. Remember that there is an interaction between regularity and frequency. The pronunciation of common words is directly retrieved before the indirect route can construct any conflicting pronunciation. Conflict arises when the lexical route is slow, as when retrieving low-frequency words, and when the pronunciation of a low-frequency word generated by the lexical route conflicts with that generated by the non-lexical route (Norris & Brown, 1985).
Glushko’s (1979) experiment: Results from words
Glushko (1979) also found that words behave in a similar way to nonwords, in that the naming times of words are affected by the phonological consistency of neighbors. The naming of a regular word is slowed down relative to that of a control word of similar frequency if the test word has irregular neighbors. For example, the word “gang” is regular, and all its neighbors (such as “bang,” “sang,” “hang,” and “rang”) are also regular. Consider on the other hand “base”; this itself has a regular pronunciation (compare it with “case”), but it is inconsistent, in that it has one irregular neighbor, “vase” (in British English pronunciation). We could say that “vase” is an enemy of “base.” This leads to a slowing of naming times. In addition, Glushko found true naming errors of over-regularization: for example, “pint” was sometimes given its regular pronunciation, to rhyme with “dint.”
Pronunciation neighborhoods
Continuing this line of research, Brown (1987) argued that the number of consistently pronounced neighbors (friends) determines naming times, rather
than whether a word has enemies (that is, whether or not it is regular). It is now thought that the number of both friends and enemies affects naming times (Brown & Watson, 1994; Jared, McRae, & Seidenberg, 1990; Kay & Bishop, 1987).
Andrews (1989) found effects of neighborhood size in both the naming and the lexical decision tasks. Responses to words with large neighborhoods were faster than words with small neighborhoods (although this may be moderated by frequency, as suggested by Grainger, 1990). Not all readers produce the same results. Barron (1981) found that good and poor elementary school readers both read regular words more quickly than irregular words. However, once he controlled for neighborhood effects, he found that there was no longer any regularity effect in the good readers, although it persisted in the poor readers.
Parkin (1982) found more of a continuum of ease-of-pronunciation than a simple division between regular and irregular words. All this work suggests that a binary division into words with regular and irregular pronunciations is no longer adequate. Patterson and Morton (1985) provided a more satisfactory but complex categorization rather than a straightforward dichotomy between regular and irregular words (see Table 7.2). This classification reflects two factors: first, the regularity of the pronunciation with reference to spelling-to-sound correspondence rules; second, the agreement with other words that share the same body. (This is the end of a monosyllabic word, comprising the central vowel plus final consonant or consonant cluster; e.g., “aint” in “saint” or “us” in “plus.”) We need to consider not only whether a word is regular or irregular, but also whether its neighbors are regular or irregular. The same classification scheme can be applied to nonwords.
In summary, just as not all nonwords behave in the same way, neither do all words. The regularity of pronunciation of a word affects the ease with which we can name it. In addition, the pronunciation of a word’s neighbors can affect its naming. The number of friends and enemies affects how easy it is to name a word.
The role of sound in accessing meaning: Phonological mediation
There is some experimental evidence suggesting that a word’s sound may have some influence on accessing the meaning (Frost, 1998; van Orden,
TABLE 7.2 Classification of word pronunciations depending on regularity and consistency (based on Patterson & Morton, 1985).
Word type               Example  Characteristics
Consistent              gaze     All words receive the same regular pronunciation of the body
Consensus               lint     All words with one exception receive the same regular pronunciation
Heretic                 pint     The irregular exception to the consensus
Gang                    look     All words with one exception receive the same irregular pronunciation
Hero                    spook    The regular exception to the gang
Gang without a hero     cold     All words receive the same irregular pronunciation
Ambiguous: conformist   cove     Regular pronunciation with many irregular exemplars
Ambiguous: independent  love     Irregular pronunciation with many regular exemplars
Hermit                  yacht    No other word has this body
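The notions of “friends” and “enemies” that underlie this kind of classification can be illustrated with a short sketch. It counts, for a given word, the neighbors that share its body and pronounce it the same way (friends) or differently (enemies). The mini-lexicon, the informal labels for how each body is pronounced, and the function names are assumptions made for illustration; the three-way outcome below is a deliberate simplification and does not reproduce the full scheme in Table 7.2, which also depends on regularity.

```python
# Hypothetical mini-lexicon: word -> (body, informal label for the body's sound).
BODIES = {
    "gaze": ("aze", "ayz"),
    "laze": ("aze", "ayz"),
    "maze": ("aze", "ayz"),
    "lint": ("int", "int"),
    "hint": ("int", "int"),
    "mint": ("int", "int"),
    "pint": ("int", "ynt"),
    "yacht": ("acht", "ot"),
}


def friends_and_enemies(word, lexicon=BODIES):
    """Return (friends, enemies): neighbors sharing the body of the word."""
    body, sound = lexicon[word]
    friends, enemies = [], []
    for other, (other_body, other_sound) in lexicon.items():
        if other == word or other_body != body:
            continue
        (friends if other_sound == sound else enemies).append(other)
    return sorted(friends), sorted(enemies)


def describe(word, lexicon=BODIES):
    """Coarse label: 'hermit', 'consistent', or 'inconsistent'."""
    friends, enemies = friends_and_enemies(word, lexicon)
    if not friends and not enemies:
        return "hermit"        # no other word has this body
    if not enemies:
        return "consistent"    # every neighbor agrees
    return "inconsistent"      # at least one neighbor disagrees


for item in ("gaze", "lint", "pint", "yacht"):
    print(item, friends_and_enemies(item), describe(item))
```

On this toy lexicon “lint” has two friends and one enemy (“pint”), whereas “pint” has no friends and three enemies, which is the asymmetry that the consensus and heretic rows of the table describe.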

1987; van Orden, Johnstone, & Hale, 1988; van Orden, Pennington, & Stone, 1990). In a category decision task, participants have to decide if a visually presented target word is a member of a particular category. For example, given “A type of fruit” you would respond “yes” to “pear,” and “no” to “pour.” If the “no” word is a homophone of a “yes” word (e.g., “pair”), participants make a lot of false positive errors; that is, they respond “yes” instead of “no.” Participants seem confused by the sound of the word, and category decision clearly involves accessing the meaning. The effect is most noticeable when participants have to respond quickly. Lesch and Pollatsek (1998) found evidence of interference between homophones in a semantic relatedness task (e.g., SAND–BEECH). We take longer to respond to homophones in a lexical decision task (e.g., MAID), presumably because the homophones are generating confusion in lexical access, perhaps through feedback from phonology to orthography (Pexman, Lupker, & Jared, 2001; Pexman, Lupker, & Reggin, 2002).
Hence there is considerable evidence that the recognition of a word can be influenced by its phonology. The dominant view is that this influence arises through the indirect route, although word recognition is primarily driven by the direct route (or routes), a view that has been labeled the weak phonological perspective (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Rastle & Brysbaert, 2006). Most of the models described in this chapter subscribe to the weak phonological view. The alternative, strong phonological view, that we primarily get to the meaning through sound, is called phonological mediation. The most extreme form of this idea is that visual word recognition cannot occur in the absence of computing the sound of the word.
There is a great deal of controversy about the status of phonological mediation. Other experiments support the idea. Folk (1999) examined eye movements as participants read sentences containing either “soul” or “sole.” Folk found that the homophones were read with longer gaze duration (that is, they were processed as though they were lexically ambiguous) even though the orthography should have prevented this. This result is only explicable if the phonology is in some way interfering with the semantic access.
On the other hand, Jared and Seidenberg (1991) showed that prior phonological access only happens with low-frequency homophones. In an examination of proof-reading and eye movements, Jared, Levy, and Rayner (1999) also found that phonology only plays a role in accessing the meanings of low-frequency words. In addition, they found that poor readers are more likely to have to access phonology in order to access semantics, whereas good readers primarily activate semantics first. Daneman, Reingold, and Davidson (1995) reported eye fixation data on homophones that suggested the meaning of a word is accessed first whereas the phonological code is accessed later, probably post-access. They found that gaze duration times were longer on an incorrect homophone (e.g., “brake” was in the text when the context demanded “break”), and that the fixation times on the incorrect homophone were about the same as on a spelling control (e.g., “broke”). This means that the appropriate meaning must have been activated before the decision to move the eyes, and that the phonological code is not activated at this time. (If the phonological code had been accessed before meaning then the incorrect homophone would sound all right in the context, and gaze durations should have been about the same.) The phonological code is accessed later, however, and influences the number of regressions (when the eyes look back to earlier material) to the target word. (However, see Rayner, Pollatsek, & Binder, 1998, for different conclusions. It is clear that these experiments are very sensitive to the materials used.)
Taft and van Graan (1998) used a semantic categorization task to examine phonological mediation. Participants had to decide whether or not words belonged to a category of “words with definable meanings” (e.g., “plank,” “pint”) or the category of “given names” (e.g., “Pam,” “Phil”). There was no difference in the decision times between regular definable words (e.g., “plank”) and irregular definable words (e.g., “pint”), although a regularity effect was shown in a word naming task. This suggests that the sound of a word does not need to be accessed on the route to accessing its meaning.
A number of studies have tried to decide between the strong and weak phonological views
using masked phonological priming. In this technique, targets (e.g., “clip”) are preceded by phonologically identical nonword primes (e.g., “klip”). Responses to the targets are faster and more accurate than when the target is preceded by an unrelated word. Several studies have found priming effects occur even when the primes have been masked and presented so briefly that they cannot be consciously observed and reported, suggesting that the phonological stimulus must occur automatically and extremely quickly (e.g., Lukatela & Turvey, 1994a, 1994b; Perfetti, Bell, & Delaney, 1988). While some researchers interpret masked phonological priming as supporting phonological mediation (why else should early phonological activation happen so early unless it is essential?), other researchers point out that these effects are very sensitive to environmental conditions, and are not always reliably found (see Rastle & Brysbaert, 2006, for a review). In a meta-analysis of the literature, Rastle and Brysbaert (2006) do find small but significant masked phonological priming effects.
These data suggest that the sound of a word is usually accessed at an early stage. However, there is much evidence suggesting that phonological recoding cannot be obligatory in order to access the word’s meaning (Ellis, 1993). For example, some dyslexics cannot pronounce nonwords, yet can still read many words. Hanley and McDonnell (1997) described the case of a patient, PS, who understood the meaning of words in reading without being able to pronounce them correctly. Critically, PS did not have a preserved inner phonological code that could be used to access the meaning. Some patients have preserved inner phonology and preserved reading comprehension, but make errors in speaking aloud (Caplan & Waters, 1995b). Hanley and McDonnell argued that PS did not have access to his phonological code because he was unable to access both meanings of a homophone from seeing just one in print. Thus PS could not produce the phonological forms of words aloud correctly, and did not have access to an internal phonological representation of those words, yet he could still understand them when reading them. For example, he could give perfect definitions of printed words. In general, a review of the neuropsychological literature suggests that people can recognize words in the absence of phonology (Coltheart, 2004). Hence it is unlikely that phonological recoding is an obligatory component of visual word recognition (Rastle & Brysbaert, 2006).
How then can we explain the data showing phonological mediation? There are a number of alternative explanations. First, although phonological recoding prior to accessing meaning may not be obligatory, it might occur in some circumstances. Given there is a race between the lexical and sublexical routes in the dual-route model, if for some reason the lexical route is slow in producing an output, the sublexical route might have time to assemble a conflicting phonological representation. Second, there might be feedback from the speech production system to the semantic system, or the direct access route causes inner speech that interferes with processing. Third, it is possible that lexical decision is based on phonological information (Rastle & Brysbaert, 2006).
Silent reading and inner speech
Although it seems unlikely that we have to access sound before meaning, we do routinely seem to access some sort of phonological code after accessing meaning in silent reading. Subjective evidence for this is the experience of “inner speech” while reading. Tongue-twisters such as (1) take longer to read silently than sentences where there is variation in the initial consonants (Haber & Haber, 1982). This suggests that we are accessing some sort of phonological code as we read.
(1) Boris burned the brown bread badly.
However, this inner speech cannot involve exactly the same processes as overt speech because we can read silently much faster than we can read aloud (Rayner & Pollatsek, 1989), and because overt articulation does not prohibit inner speech while reading. Furthermore, although most people who are profoundly deaf read very poorly, some read quite well (Conrad, 1972). Although this might suggest that eventual phonological coding is optional, it is likely that
these able deaf readers are converting printed words into some sign language code (Rayner &
Pollatsek, 1989). Evidence for this is that deaf people are troubled by the silent reading
of word strings that correspond to hand-twisters (Treiman & Hirsh-Pasek, 1983).
(Interestingly, deaf people also have some difficulty with signing phonological
tongue-twisters, suggesting that difficulty can arise from lip-reading sounds.)

Hence, when we read we seem to access a phonological code that we experience as inner
speech. That is, when we gain access to a word’s representation in the lexicon, all its
attributes become available. The activation of a phonological code is not confined to
alphabetic languages. On-line experimental data using priming and semantic judgment tasks
suggest that phonological information about ideographs is automatically activated in both
Chinese (Perfetti & Zhang, 1991, 1995) and Japanese kanji (Wydell, Patterson, &
Humphreys, 1993).

Inner speech seems to assist comprehension; if it is reduced, comprehension suffers for
all but the easiest material (Rayner & Pollatsek, 1989). McCutchen and Perfetti (1982)
argued that whichever route is used for lexical access in reading, at least part of the
phonological code of each word is automatically accessed; in particular, we access the
sounds of beginnings of words. Although there is some debate about the precise nature of
the phonological code and how much of it is activated, it does seem that silent reading
necessarily generates some sort of phonological code (Rayner & Pollatsek, 1989). This
information is used to assist comprehension, primarily by maintaining items in sequence in
working memory.

[Photograph. Caption: The experience of “inner speech” while reading demonstrates that we
can access some sort of phonological code after accessing meaning in silent reading.]

The role of meaning in accessing sound
Phonological mediation means that we might access meaning via sound. Sometimes we need to
access the meaning before we can access a word’s sound. Words such as “bow,” “row,” and
“tear” have two different pronunciations. This type of word is called a homograph. How do
we select the appropriate pronunciation? Consider sentences (2) and (3):

(2) When his shoelace came loose, Vlad had to tie a bow.
(3) At the end of the play, Dirk went to the front of the stage to take a bow.

Clearly here we need to access the word’s meaning before we can select the appropriate
pronunciation. Further evidence that semantics can affect reading is provided by a study by
Strain, Patterson, and Seidenberg (1995). They showed that there is an effect of
imageability on skilled reading such that there is a three-way interaction between
frequency, imageability, and spelling consistency. People are particularly slow and make
more errors when reading low-frequency exception words with abstract meanings (e.g.,
“scarce”). Although a subsequent study by Monaghan and Ellis (2002) suggests that this
semantic effect might be at least in part the result of a confound with age-of-acquisition,
as abstract low-frequency exception words tend to have late AOA, this interaction is still
found when we control for AOA (Strain, Patterson, & Seidenberg, 2002). Hence, at least
some of the time, we need to access a word’s semantic representation before we can access
its phonology.

Does speed reading work?
Occasionally you might notice advertisements in the press for techniques for improving your
reading speed. The most famous of these techniques
is known as “speed reading.” Proponents of speed reading claim that you can increase your
reading speed from the average of 200–350 words a minute to 2,000 words a minute or even
faster, yet retain the same level of comprehension. Is this possible? Unfortunately, the
preponderance of psychological research suggests not. As you increase your reading speed
above the normal rate, comprehension declines. Just and Carpenter (1987) compared the
understanding of speed readers and normal readers on an easy piece of text (an article from
Reader’s Digest) and a difficult piece of text (an article from Scientific American). They
found that normal readers scored 15% higher on comprehension measures than the speed
readers across both passages. In fact, the speed readers performed only slightly better
than a group of people who skimmed through the passages. The speed readers did as well as
the normal readers on the general gist of the text, but were worse at details. In
particular, speed readers could not answer questions when the answers were located in
places where their eyes had not fixated.

Speed reading, then, is not as effective as normal reading. Eye movements are the key to
why speed reading confers limited advantages (Rayner & Pollatsek, 1989). For a word to be
processed properly, its image has to land close to the fovea and stay there for a
sufficient length of time. Speed reading is nothing more than skimming through a piece of
writing (Carver, 1972). This is not to say that readers obtain nothing from skimming: if
you have sufficient prior information about the material, your level of comprehension can
be quite good. If you speed read and then read normally, your overall level of
comprehension and retention might be better than if you had just read the text normally. It
is also a useful technique for preparing to read a book or article in a structured way (see
Chapter 12). Finally, associated techniques such as relaxing before you start to read might
well have beneficial effects on comprehension and retention.

Evaluation of experiments on normal reading
There are two major problems with a simple dual-route model. First, we have seen that there
are lexical effects on reading nonwords, which should be read by a non-lexical route that
is insensitive to lexical information. Second, there are effects of regularity of
pronunciation on reading words, which should be read by a direct, lexical route that is
insensitive to phonological recoding.

A race model fares better. Regularity effects arise when the direct and indirect routes
produce an output at about the same time, so that conflict arises between the irregular
pronunciation proposed by the lexical route and the regular pronunciation proposed by the
sublexical route. However, it is not clear how a race model where the indirect route uses
grapheme–phoneme conversion can explain lexical effects on reading nonwords. Neither is it
clear how semantics can guide the operation of the direct route.

Skilled readers have a measure of attentional or strategic control over the lexical and
sublexical routes such that they can attend selectively to lexical or sublexical
information (Baluch & Besner, 1991; Monsell, Patterson, Graham, Hughes, & Milroy, 1992;
Zevin & Balota, 2000). For example, Monsell et al. found that the composition of word lists
affected naming performance. High-frequency exception words were pronounced faster when
they were in pure blocks than when they were mixed with nonwords. Monsell et al. argued
that this was because participants allocated more attention to lexical information when
reading the pure blocks. Participants also made fewer regularization errors when the words
were presented in pure blocks (when they can rely solely on lexical processing) than in
mixed blocks (when the sublexical route has to be involved).

At first sight, then, this experiment suggests that in difficult circumstances people seem
able to change their emphasis in reading from using lexical information to sublexical
information. However, Jared (1997a) argued that people need not change the extent to which
they rely on sublexical information, but instead might be responding at different points in
the processing of the stimuli. She argued that the faster pronunciation latencies found in
Monsell et al.’s experiment in the exception-only condition could just be the result of a
general increase in response speed, rather than a reduction in reliance on the non-lexical
route.

However, there is further evidence for strategic effects in the choice of route when
reading.
Using a primed naming task, Zevin and Balota (2000) found that nonword primes produce a
greater dependence on sublexical processing, but low-frequency exception word primes
produce a greater dependence on lexical processing. Coltheart and Rastle (1994) suggested
that lexical access is performed so quickly for high-frequency words that there is little
scope for sublexical involvement, but with low-frequency words or in difficult conditions
people can devote more attention to one route or the other.

THE NEUROSCIENCE OF ADULT READING DISORDERS

What can studies of people with brain damage tell us about reading? This section is
concerned with disorders of processing written language. We must distinguish between
acquired disorders (which, as a result of head trauma such as stroke, operation, or head
injury, lead to disruption of processes that were functioning normally beforehand) and
developmental disorders (which do not result from obvious trauma, and which disrupt the
development of a particular function). Disorders of reading are called the dyslexias;
disorders of writing are called the dysgraphias. Damage to the left hemisphere will
generally result in dyslexia, but as the same sites are involved in speaking, dyslexia is
often accompanied by impairments to spoken language processing.

We can distinguish central dyslexias, which involve central, high-level reading processes,
from peripheral dyslexias, which involve lower level processes. Peripheral dyslexias
include visual dyslexia, attentional dyslexia, letter-by-letter reading, and neglect
dyslexia, all of which disrupt the extraction of visual information from the page. As our
focus is on understanding the central reading process, we will limit discussion here to the
central dyslexias. In addition, we will only look at acquired disorders in this section,
and defer discussion of developmental dyslexia until our examination of learning to read.

If the dual-route model of reading is correct, then we should expect to find a double
dissociation of the two reading routes. That is, we should find some patients have damage
to the lexical route but can still read by the non-lexical route only, whereas we should be
able to find other patients who have damage to the non-lexical route but can read by the
lexical route only. The existence of a double dissociation is a strong prediction of the
dual-route model, and a real challenge to any single-route model.

Surface dyslexia
People with surface dyslexia have a selective impairment in the ability to read irregular
(exception) words. Hence they would have difficulty with “steak” compared with a similar
regular relative word such as “speak.” Marshall and Newcombe (1973) and Shallice and
Warrington (1980) described some early case histories. Surface dyslexics often make
over-regularization errors when trying to read irregular words aloud. For example, they
pronounce “broad” as “brode,” “steak” as “steek,” and “island” as “eyesland.” On the other
hand, their ability to read regular words and nonwords is intact. In terms of the
dual-route model, the most obvious explanation of surface dyslexia is that these patients
can only read via the indirect, non-lexical route: that is, it is an impairment of the
lexical (direct access) processing route. The comprehension of word meaning is intact in
these patients. They still know what an “island” is, even if they cannot read the word, and
they can still understand it if you say the word to them.

The effects of brain damage are rarely localized to highly specific systems, and, in
practice, patients do not show such clear-cut behavior as the ideal of totally preserved
regular word and nonword reading, and the total loss of irregular words. The clearest case
yet reported is that of a patient referred to as MP (Bub, Cancelliere, & Kertesz, 1985).
She showed completely normal accuracy in reading nonwords, and hence her non-lexical route
was totally preserved. She was not the best possible case of surface dyslexia, however,
because she could read some irregular words (with an accuracy of 85% on high-frequency
items, and 40% on low-frequency exception words). This means that her lexical route must
have been partially intact. The pure cases are rarely found. Other patients show
considerably less clear-cut reading than this, with even better performance on irregular
words, and some deficit in reading regular words.

If patients were reading through a non-lexical route, we would not expect lexical variables
to affect the likelihood of reading success. Kremin (1985) found no effect of word
frequency, part of speech (noun versus adjective versus verb), or whether or not it is easy
to form a mental image of what is referred to (called imageability), on the likelihood of
reading success. Although patients such as MP, from Bub et al. (1985), show a clear
frequency effect in that they make few regularizations of high-frequency words, other
patients, such as HTR, from Shallice, Warrington, and McCarthy (1983), do not. Patients
also make homophone confusions (such as reading “pane” as “to cause distress”).

Surface dyslexia may not be a unitary category. Shallice and McCarthy (1985) distinguished
between Type I and Type II surface dyslexia. Patients of both types are poor at reading
exception words. The more pure cases, known as Type I patients, are highly accurate at
naming regular words and pseudowords. Other patients, known as Type II, also show some
impairment at reading regular words and pseudowords. The reading performance of Type II
patients may be affected by lexical variables such that they are better at reading
high-frequency, high-imageability words, better at reading nouns than adjectives and at
reading adjectives than verbs, and better at reading short words than long. Type II
patients must have an additional, moderate impairment to the non-lexical route, but the
dual-route model can nevertheless still explain this pattern.

Phonological dyslexia
People with phonological dyslexia have a selective impairment in the ability to read
pronounceable nonwords, called pseudowords (such as “sleeb”), while their ability to read
matched words (e.g., “sleep”) is preserved. Phonological dyslexia was first described by
Shallice and Warrington (1975, 1980), Patterson (1980), and Beauvois and Derouesné (1979).
Phonological dyslexics find irregular words no harder to read than regular ones. These
symptoms suggest that these patients can only read using the lexical route, and therefore
that phonological dyslexia is an impairment of the non-lexical (GPC) processing route. As
with surface dyslexia, the “perfect patient,” who in this case would be able to read all
words but no nonwords, has yet to be discovered. The clearest case yet reported is that of
patient WB (Funnell, 1983), who could not read nonwords at all; hence the non-lexical GPC
route must have been completely abolished. He was not the most extreme case possible of
phonological dyslexia, however, because there was also an impairment to his lexical route;
his performance was about 85% correct on words.

For those patients who can pronounce some nonwords, nonword reading is improved if the
nonwords are pseudohomophones (such as “nite” for “night,” or “brane” for “brain”). Those
patients who also have difficulty in reading words have particular difficulty in reading
the function words that do the grammatical work of the language. Low-frequency,
low-imageability words are also poorly read, although neither frequency nor imageability
seems to have any overwhelming role in itself. These patients also have difficulty in
reading morphologically complex words, that is, those that have syntactic modifications
called inflections. They sometimes make what are called derivational errors on these words,
where they read a word as a grammatical relative of the target, such as reading
“performing” as “performance.” Finally, they also make visual errors, in which a word is
read as another with a similar visual appearance, such as reading “perform” as “perfume.”

There are different types of phonological dyslexia. Derouesné and Beauvois (1979) suggested
that phonological dyslexia can result from disruption of either orthographic or
phonological processing. Some patients are worse at reading graphemically complex nonwords
(e.g., CAU, where a phoneme is represented by two letters; hence this nonword requires more
graphemic parsing) than graphemically simple nonwords (e.g., IKO, where there is a
one-to-one mapping between
letters and graphemes), but show no advantage for pseudohomophones. These patients suffer
from a disruption of graphemic parsing. Another group of patients are better at reading
pseudohomophones than non-pseudohomophones, but show no effect of orthographic complexity.
These patients suffer from a disruption of phonological processing. Friedman (1995)
distinguished phonological dyslexia arising from an impairment of
orthographic-to-phonological processing (characterized by relatively poor function word
reading but good nonword repetition) from that arising from an impairment of general
phonological processing (characterized by the reverse pattern).

Following this, a three-stage model of sublexical processing has emerged (Beauvois &
Derouesné, 1979; Coltheart, 1985; Friedman, 1995). First, a graphemic analysis stage parses
the letter string into graphemes. Second, a print-to-sound conversion stage assigns
phonemes to graphemes. Third, in the phonemic blending stage the sounds are assembled into
a phonological representation. There are patients whose behavior can best be explained in
terms of disruption of each of these stages (Lesch & Martin, 1998). MS (Newcombe &
Marshall, 1985) suffered from disruption to graphemic analysis. Patients with disrupted
graphemic analysis find nonwords in which each grapheme is represented by a single letter
easier to read than nonwords with multiple correspondences. WB (Funnell, 1983) suffered
from disruption in the print-to-sound conversion stage; here nonword repetition is intact.
ML (Lesch & Martin, 1998) was a phonological dyslexic who could carry out tasks of
phonological assembly on syllables, but not on sub-syllabic units (onsets, bodies, and
phonemes). MV (Bub, Black, Howell, & Kertesz, 1987) suffered from disruption to the
phonemic stage.

Why do some people with phonological dyslexia have difficulty reading function words? One
possibility is that function words are difficult because they are so abstract (Friedman,
1995). However, patient MC (Druks & Froud, 2002) had great difficulty in reading nonwords,
morphologically complex words, and function words in isolation. Crucially, he could read
highly abstract content words, so it cannot be the abstractness of the function words that
caused his problems. Nevertheless, he could understand the meaning of function words that
he could not read, and his deficit was confined to reading single words. His reading of
function words in continuous text was much better. It is likely that MC at least has a
problem with syntactic processing such that when producing words in isolation he is unable
to access syntactic information.

People with phonological dyslexia show complex phonological problems that have nothing to
do with orthography. Indeed, it has been proposed that phonological dyslexia is a
consequence of a general problem with phonological processing (Farah, Stowe, & Levinson,
1996; Harm & Seidenberg, 2001; Patterson, Suzuki, & Wydell, 1996). If phonological dyslexia
arises solely because of problems with the ability to translate orthography into phonology,
then there must be brain tissue dedicated to this task. This implies that this brain tissue
becomes dedicated by school-age learning, which is an unappealing prospect. The alternative
view is that phonological dyslexia is just one aspect of a general impairment of
phonological processing. This impairment will normally be manifested in performance on
non-reading tasks such as rhyming, nonword writing, phonological short-term memory, nonword
repetition, and tasks of phonological synthesis (“what does ‘c–a–t’ spell out?”) and
phonological awareness (“what word is left if you take the ‘p’ sound out of ‘spoon’?”).
This proposal also explains why pseudohomophones are read better than
non-pseudohomophones. An important piece of evidence in favor of this hypothesis is that
phonological dyslexia is never observed in the absence of a more general phonological
deficit (but see Coltheart, 1996, for a dissenting view). A general phonological deficit
makes it difficult to assemble pronunciations for nonwords. Words are spared much of this
difficulty because of support from other words and top-down support from their semantic
representations. Repeating words and nonwords is facilitated by support from auditory
representations, so some phonological dyslexics can still repeat some nonwords. However, if
the repetition task is made more difficult so that patients can no longer gain support from
the
auditory representations, repetition performance declines markedly (Farah et al., 1996).
This idea that phonological dyslexia is caused by a general phonological deficit is central
to the connectionist account of dyslexia, discussed later.

Deep dyslexia
At first sight, surface and phonological dyslexia appear to exhaust the possibilities of
the consequences of damage to the dual-route model. There is, however, another even more
surprising type of dyslexia called deep dyslexia. Marshall and Newcombe (1966, 1973) first
described deep dyslexia in two patients, GR and KU, although it is now recognized that the
syndrome had been observed in patients before this (Marshall & Newcombe, 1980). In many
respects deep dyslexia resembles phonological dyslexia. Patients have great difficulty in
reading nonwords, and considerable difficulty in reading the grammatical, function words.
Like phonological dyslexics, they make visual and derivational errors. However, the
defining characteristic of deep dyslexia is the presence of semantic reading errors or
semantic paralexias, when people produce a word related in meaning to the target instead of
the target, as in examples (4) to (7):

(4) DAUGHTER → “sister”
(5) PRAY → “chapel”
(6) ROSE → “flower”
(7) KILL → “hate”

The imageability of a word is an important determinant of the probability of reading
success in deep dyslexia. The easier it is to form a mental image of a word, the easier it
is to read. Note that an imageability effect in reading does not by itself mean that
patients with deep dyslexia are better at all tasks involving more concrete words. Indeed,
Newton and Barry (1997) described a patient (LW) who was much better at reading
high-frequency concrete words than abstract words, but who showed no impairment in
comprehending those same abstract words.

Coltheart (1980) listed 12 symptoms commonly shown by deep dyslexics: They make semantic
errors, they make visual errors, they substitute incorrect function words for the target,
they make derivational errors, they can’t pronounce nonwords, they show an imageability
effect, they find nouns easier to read than adjectives, they find adjectives easier to read
than verbs, they find function words more difficult to read than content words, their
writing is impaired, their auditory short-term memory is impaired, and their reading
ability depends on the context of a word (e.g., FLY is easier to read when it is a noun in
a sentence than a verb).

There has been some debate about the extent to which deep dyslexia is a syndrome (a
syndrome is a group of symptoms that cluster together). Coltheart (1980) argued that the
clustering of symptoms is meaningful, in that they suggest a single underlying cause.
However, although these symptoms tend to occur in many patients, they do not apparently
necessarily do so. For example, AR (Warrington & Shallice, 1979) did not show concreteness
and content word effects and had intact writing and auditory short-term memory. A few
patients make semantic errors but very few visual errors (Caramazza & Hillis, 1990). Such
patients suggest that it is unlikely that there is a single underlying deficit. Like
phonological dyslexics, deep dyslexics obviously have some difficulty in obtaining
non-lexical access to phonology via grapheme–phoneme recoding, but they also have some
disorder of the semantic system. We nevertheless have to explain why these symptoms are so
often associated. One possibility is that the different symptoms of deep dyslexia arise
because of an arbitrary feature of brain anatomy: Different but nearby parts of the brain
control processes such as writing and auditory short-term memory, so that damage to one is
often associated with damage to another. As we will see, a more satisfying account is
provided by connectionist modeling.

Shallice (1988) argued that there are three subtypes of deep dyslexia that vary in the
precise impairments involved. Input deep dyslexics have difficulties in reaching the exact
semantic representations of words in reading. In these patients, auditory comprehension is
superior to reading. Central deep dyslexics have a severe auditory comprehension deficit in
addition to their reading difficulties. Output deep dyslexics can process words up to their
semantic representations, but then have difficulty producing the appropriate phonological
output. In practice it can be difficult to assign particular patients to these subtypes,
and it is not clear what precise impairment of the reading systems is necessary to produce
each subtype (Newton & Barry, 1997).
The right-hemisphere hypothesis

Does deep dyslexia reflect attempts by a greatly
damaged system to read normally, as has been
argued by Morton and Patterson (1980), among
others? Or does it instead reflect the operation of
an otherwise normally suppressed system coming
through? Perhaps deep dyslexics do not always
use the left hemisphere for reading. Instead, peo-
ple with deep dyslexia might use a reading sys-
tem based in the right hemisphere that is normally
suppressed (Coltheart, 1980; Saffran, Bogyo,
Schwartz, & Marin, 1980; Zaidel & Peters, 1981).
This right-hemisphere hypothesis is supported by the observation that the more of the left
hemisphere that is damaged, the more severe the deep dyslexia observed (Jones & Martin,
1985; but see Marshall & Patterson, 1985). Furthermore, the reading performance of deep
dyslexics resembles that of split-brain patients when words are presented to the left
visual field, and therefore to the right hemisphere. Under such conditions they also make
semantic paralexias, and have an advantage for concrete words. Finally, Patterson,
Vargha-Khadem, and Polkey (1989) described the case of a patient called NI, a 17-year-old
girl who had had her left hemisphere removed for the treatment of severe epilepsy. After
recovery she retained some reading ability, but her performance resembled that of deep
dyslexics.

[Figure. Caption: Brain activity during reading aloud in a normal (top) and dyslexic
(bottom) subject. These are composites of 3-D magnetic resonance imaging (MRI) scans of the
brain, with positron emission tomography (PET) scans overlaid to show active areas
(orange). The most active areas are in the left cerebral hemisphere (right), site of the
brain’s language centers. In the dyslexic, there is an abnormal area of activity in the
globus pallidus (just left of center) of the right cerebral hemisphere.]

In spite of these points in its favor, the right-hemisphere reading hypothesis has never
won wide acceptance. In part this is because the hypothesis is considered a negative one,
in that if it were correct, deep dyslexia would tell us nothing about normal reading. In
addition, people with deep dyslexia read much better than split-brain patients who are
forced to rely on the right hemisphere for reading. The right-hemisphere advantage for
concrete words is rarely found, and the imageability of the target words used in these
experiments might have been confounded with length (Ellis & Young, 1988; Patterson &
Besner, 1984). Finally, Roeltgen (1987) described a patient who suffered from deep dyslexia
as a result of a stroke in the left hemisphere. He later suffered from a second
left-hemisphere stroke, which had the effect of destroying his residual reading ability. If
the deep dyslexia had been a consequence of right-hemisphere reading, it should not have
been affected by the second stroke in the left hemisphere.
Summary of research on deep dyslexia Non-semantic reading

There has been debate as to whether the term
“deep dyslexia” is a meaningful label. The cru- Schwartz, Marin, and Saffran (1979), and
cial issue is whether or not its symptoms must Schwartz, Saffran, and Marin (1980a), described
necessarily co-occur because they have the WLP, an elderly patient suffering from progres-
same underlying cause. Are semantic paralexias sive dementia. WLP had a greatly impaired abil-
always found associated with impaired non- ity to retrieve the meaning of written words; for
word reading? So far they seem to be; in all example, she was unable to match written animal
reported cases semantic paralexias have been names to pictures. She could read those words out
associated with all the other symptoms. How loud almost perfectly, getting 18 out of 20 cor-
then can deep dyslexia be explained by one rect and making only minor errors, even on low-
underlying disorder? In terms of the dual-route frequency words. She could also read irregular
model, there would need to be damage to both words and nonwords. In summary, WLP could
the semantic system (to explain the semantic read words without any comprehension of their
paralexias and the imageability effects) and meaning. Coslett (1991) described a patient, WT,
the non-lexical route (to explain the difficulties who was virtually unable to read nonwords, sug-
with nonwords). We would also then have to gesting an impairment of the indirect route of
specify that for some reason damage to the first the dual-route model, but who was able to read
is always associated with damage to the second (e.g., because of an anatomical accident that the neural tissue supporting both processes is in adjoining parts of the brain). This is inelegant. As we shall see, connectionist models have cast valuable light on this question. A second issue is whether we can make inferences from deep dyslexia about the processes of normal reading, as we can for the other types of acquired dyslexia. We have seen that the dual-route model readily explains surface and phonological dyslexia, and that their occurrence is as expected if we were to lesion that model by removing one of the routes. Hence, it is reasonable to make inferences about normal reading on the basis of data from such patients. There is some doubt, however, as to whether we are entitled to do this in the case of deep dyslexia; if the right-hemisphere hypothesis were correct, deep dyslexia would tell us little about normal reading. The balance of evidence is at present that deep dyslexia does not reflect right-hemisphere reading, but does reflect reading by a greatly damaged left hemisphere. Deep dyslexia suggests that normally we can in some way read through meaning; that is, we use the semantic representation of a word to obtain its phonology. This supports our earlier observation that with homographs (e.g., “bow”) we use the meaning to select the appropriate pronunciation.

irregular words quite proficiently, even though she could not understand those words. These case studies suggest that we must have a direct access route from orthography to phonology that does not go through semantics.

Summary of the interpretation of the acquired dyslexias

We have looked at four main types of adult central dyslexia: surface, phonological, deep, and non-semantic reading. We have seen how a dual-route model explains surface dyslexia as an impairment of the lexical, direct access route, and explains phonological dyslexia as an impairment of the non-lexical, phonological recoding route. The existence of non-semantic reading suggests that the simple dual-route model needs refinement. In particular, the direct route must be split into two. There must be a non-semantic direct access route that retrieves phonology given orthography, but which does not pass through semantics first, and a semantic direct access route that passes through semantics and allows us to select the appropriate sounds of non-homophonic homographs (e.g., “wind”). In non-semantic reading, the semantic direct route has been abolished but the non-semantic direct route is intact. An analysis of acquired dyslexia by Coltheart (1981) is shown in Figure 7.2.
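The lesioning logic of the simple dual-route account can be illustrated with a toy program. This is only a minimal sketch under stated assumptions: the three-word lexicon, the ad hoc transcriptions (“eI” for the vowel in “gave”), the single “silent e” rule, and the function names are all hypothetical and chosen for illustration, and the lexical route is simply given priority rather than racing the non-lexical route.

```python
# Toy dual-route reader. The word list, transcriptions, and rules below are
# illustrative assumptions, not data or rules taken from the text.

# Lexical (direct) route: whole-word lookup of a stored pronunciation.
LEXICON = {"gave": "geIv", "save": "seIv", "have": "hav"}

# Non-lexical route: a tiny set of letter-to-sound correspondences.
CONSONANTS = {"g": "g", "h": "h", "m": "m", "s": "s", "v": "v"}
SHORT_VOWELS = {"a": "a"}
LONG_VOWELS = {"a": "eI"}


def nonlexical_route(letters):
    """Assemble a pronunciation from letter-to-sound rules alone."""
    # Assumed rule: a final silent "e" lengthens the preceding vowel.
    silent_e = len(letters) >= 3 and letters.endswith("e")
    body = letters[:-1] if silent_e else letters
    phonemes = []
    for letter in body:
        if letter in CONSONANTS:
            phonemes.append(CONSONANTS[letter])
        elif letter in SHORT_VOWELS:
            table = LONG_VOWELS if silent_e else SHORT_VOWELS
            phonemes.append(table[letter])
        else:
            return None  # no rule available for this letter
    return "".join(phonemes)


def read_aloud(letters, lexical_intact=True, nonlexical_intact=True):
    """Name a letter string; either route can be 'lesioned' by a flag."""
    if lexical_intact and letters in LEXICON:
        return LEXICON[letters]
    if nonlexical_intact:
        return nonlexical_route(letters)
    return None  # neither route can produce a response


if __name__ == "__main__":
    for item in ("gave", "have", "mave"):
        print(item,
              read_aloud(item),
              read_aloud(item, lexical_intact=False),
              read_aloud(item, nonlexical_intact=False))
```

With the lexical route switched off, the exception word “have” is regularized to rhyme with “gave”, as in surface dyslexia; with the non-lexical route switched off, known words are still read but the nonword “mave” gets no response, as in phonological dyslexia.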
FIGURE 7.2 Analyzing acquired dyslexia (adapted from Coltheart, 1981).
1. Is naming a letter much harder when it is accompanied by other, irrelevant letters? Yes: attentional dyslexia. No: go to 2.
2. When words are misread, are the errors usually confined to one half of the word? Yes: neglect or positional dyslexia. No: go to 3.
3. Are words often read letter by letter? Yes: letter-by-letter reading. No: go to 4.
4. Are semantic errors made in reading aloud? Yes: deep dyslexia. No: go to 5.
5. Is reading aloud of nonwords very bad or impossible? Yes: phonological dyslexia. No: go to 6.
6. Are regular words read aloud much better than exception words? Yes: surface dyslexia.

Acquired dyslexia in other languages

Languages such as Italian, Spanish, or Serbo-Croat, which have totally transparent or shallow alphabetic orthographies (that is, where every grapheme is in a one-to-one relation with a phoneme) can show phonological and deep dyslexia, but not surface dyslexia, defined as an inability to read exception words (Patterson, Marshall, & Coltheart, 1985a, 1985b). However, we can find the symptoms that can co-occur with an impairment of exception word reading, such as homophone confusions, in the languages that permit them (Masterson, Coltheart, & Meara, 1985).

Chinese (shown here) is a logographic or ideographic script, providing no information on word pronunciation.

Whereas languages such as English have a single, alphabetic script, Japanese has two different scripts, kana and kanji (see Coltheart, 1980; Sasanuma, 1980). Kana is a syllabic script, and kanji is a logographic or ideographic script. Therefore words in kanji convey no information on how a word should be pronounced. While kana allows sublexical processing, kanji must be accessed through a direct, lexical route. The right hemisphere is better at dealing with kanji, and the left hemisphere is better at reading kana (Coltheart, 1980). Reading of briefly presented kana words is more accurate when they are presented to the right visual field (left hemisphere), but reading of kanji words is better when they are presented to the left visual field (right hemisphere). The analog of surface dyslexia is found in patients where there is a selective impairment of reading kanji, but the reading of kana is preserved. The analog of phonological dyslexia is an ability to read both kana and
kanji, but a difficulty in reading Japanese nonwords. The analog of deep dyslexia is a selective impairment of reading kana, while the reading of kanji is preserved. For example, patient TY could read words in both kanji and kana almost perfectly, but she had great difficulty with nonwords constructed from kana words (Sasanuma, Ito, Patterson, & Ito, 1996).

Chinese is an ideographic language. Butterworth and Wengang (1991) reported evidence of two routes in reading in Chinese. Ideographs can be read aloud either through a route that associates the symbol with its complete pronunciation, or through one that uses parts of the symbol. (Although Chinese is non-alphabetic, most symbols contain some sublexical information on pronunciation.) Each route can be selectively impaired by brain damage, leading to distinct types of reading disorder.

The study of other languages that have different means of mapping orthography onto phonology is still at a relatively early stage, but it is likely to greatly enhance our understanding of reading mechanisms. The findings suggest that the neuropsychological mechanisms involved in reading are universal, although there are obviously some differences related to the unique features of different orthographies.

MODELS OF WORD NAMING

Both the classic dual-route and the single-route, lexical-instance models face a number of problems. First, there are lexical effects for nonwords and regularity effects for words, and therefore reading cannot be a simple case of automatic grapheme-to-phoneme conversion for nonwords, and automatic direct access for all words. Single-route models, on the other hand, appear to provide no account of nonword pronunciation, and it remains to be demonstrated how neighborhood effects affect a word’s pronunciation. Second, any model must also be able to account for the pattern of dissociations found in dyslexia. While surface and phonological dyslexia indicate that two reading mechanisms are necessary, other disorders suggest that these alone will not suffice. At first sight it is not obvious how a single-route model could explain these dissociations at all.

Theorists have taken two different approaches depending on their starting point. One possibility is to refine the dual-route model. Another is to show how word-neighborhoods can affect pronunciation, and how pseudowords can be pronounced in a single-route model. This led to the development of analogy models. More recently, a connectionist model of reading has been developed that takes the single-route, analogy-based approach to the limit.

The revised dual-route model

We can save the dual-route model by making it more complex. Morton and Patterson (1980) and Patterson and Morton (1985) described a three-route model (see Figure 7.3). First, there is a non-lexical route for assembling pronunciations from sublexical grapheme–phoneme conversion. The non-lexical route now consists of two subsystems. A standard grapheme–phoneme conversion mechanism is supplemented with a body subsystem that makes use of information about correspondences between orthographic and phonological rimes. This is needed to explain lexical effects on nonword pronunciation. Second, the direct route is split into a semantic and a non-semantic direct route.

The three-route model accounts for the data as follows. The lexical effects on nonwords and regularity effects on words are explained by cross-talk between the lexical and non-lexical routes. Two types of interaction are possible: interference during retrieval, and conflict in resolving multiple phonological forms after retrieval. The two subsystems of the non-lexical route also give the model greater power. Surface dyslexia is the loss of the ability to make direct contact with the orthographic lexicon, and phonological dyslexia is the loss of the indirect route. Non-semantic reading is a loss of the lexical-semantic route. Deep dyslexia remains rather mysterious. First, we have to argue that these patients can only read through the lexical-semantic route. While accounting for the symptoms that resemble
FIGURE 7.3 The original and revised dual-route models of reading.
Panels: Original dual-route model; Revised dual-route model.
Box labels recovered from the figure: Printed language; Graphemes; Lexicon (visual input logogens); Grapheme conversion; Non-semantic reading; Sublexical recoding (graphemes and bodies); Semantic system; Phonology (speech output logogens); Speech.
phonological dyslexia, it still does not explain the semantic paralexias. One possibility is that this route is used normally, but not always successfully, and that it needs additional information (such as from the non-lexical and non-semantic direct route) to succeed. So when this information is no longer available it functions imperfectly. It gets us to the right semantic area, but not necessarily to the exact item, hence giving paralexias. This additional assumption seems somewhat arbitrary. An alternative idea is that paralexias are the result of additional damage to the semantic system itself. Hence a complex pattern of impairments is still necessary to explain deep dyslexia, and there is no reason to suggest that these are not dissociable.

Multi-route models are becoming increasingly complicated as we find out more about the reading process (for example, see Carr & Pollatsek, 1985). Another idea is that multiple levels of spelling-to-sound correspondences combine in determining the pronunciation of a word. In Norris’s (1994a) multiple-levels model, different levels of spelling-to-sound information, including phoneme, rime (the final part of the word giving rise to the words with which it rhymes, e.g., “eak” in “speak”), and word-level correspondences, combine in an interactive activation network to determine the final pronunciation of a word. Such an approach develops earlier models that make use of knowledge at multiple levels, such as those of Brown (1987), Patterson and Morton (1985), and Shallice, Warrington, and McCarthy (1983).

The most recent version of the dual-route model is the dual-route cascaded, or DRC, model (Coltheart, Curtis, Atkins, & Haller, 1993; Coltheart & Rastle, 1994; Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001). This is a computational model based on the architecture of the dual-route model, although it is in fact misleadingly so called, as it is really based on the three-route model, with a non-lexical grapheme–phoneme rule system and a lexical system, which in turn is divided into one route that passes through the semantic system and a non-semantic route that does not. The model makes use of cascaded processing, in that as soon as there is any activation at the letter level, activation is passed on to the word level. The computational model can simulate performance on both lexical decision and naming tasks, showing appropriate effects of frequency, regularity, pseudohomophones, neighborhood, and priming. Regularity is now a central motivation of the model; words are either regular,
or they are not. Irregular words take longer to pronounce than regular ones because the lexical and non-lexical routes produce conflicting pronunciations. The model accounts for surface dyslexia by making entries in the orthographic lexicon less available, and for phonological dyslexia by damaging the grapheme–phoneme conversion route.

There is not uniform agreement that it is necessary to divide the direct route into two. In the summation model (Hillis & Caramazza, 1991b; Howard & Franklin, 1988), the only direct route is reading through semantics. How does this model account for non-semantic reading? The idea is that access to the semantic system is not completely obliterated. Activation from the sublexical route combines (or is “summated”) with activation trickling down from the damaged direct semantic route to ensure the correct pronunciation.

It is difficult to distinguish between these variants of the original dual-route model, although the three-route version provides the more explicit account of the dissociations observed in dyslexia. There is also some evidence against the summation hypothesis. EP (Funnell, 1996) could read irregular words that she could not name, and priming the name with the initial letter did not help her naming, contrary to the prediction of the summation hypothesis. Many aspects of the dual-route model have been subsumed by the triangle model that serves as the basis of connectionist models of reading. The situation is complicated even more by the apparent co-occurrence of the loss of particular word meanings in dementia and surface dyslexia (see later).

The analogy model

The analogy model arose in the late 1970s when the extent of lexical effects on nonword reading and differences between words became apparent (Glushko, 1979; Henderson, 1982; Kay & Marcel, 1981; Marcel, 1980). It is a form of single-route model that provides an explicit mechanism for how we pronounce nonwords. It proposes that we pronounce nonwords and new words by analogy with other words. When a word (or nonword) is presented, it activates its neighbors, and these all influence its pronunciation. For example, “gang” activates “hang,” “rang,” “sang,” and “bang”; these are all consistent with the regular pronunciation of “gang,” and hence assembling a pronunciation is straightforward. When presented with “base,” however, “case” and “vase” are activated; these conflict, and hence the assembly of a pronunciation is slowed down until the conflict is resolved. A nonword such as “taze” is pronounced by analogy with the consistent set of similar words (“maze,” “gaze,” “daze”). A nonword such as “mave” activates “gave,” “rave,” and “save,” but it also activates the conflicting enemy “have,” which hence slows down pronunciation of “mave.” In order to name by analogy, you have to find candidate words containing appropriate orthographic segments (like “-ave”); obtain the phonological representation of the segments; and assemble the complete phonology (“m + ave”).

Although attractive in the way they deal with regularity and neighborhood effects, early versions of analogy models suffered from a number of problems. First, the models did not make clear how the input is segmented in an appropriate way. Second, the models make incorrect predictions about how some nonwords should be pronounced. Particularly troublesome are nonwords based on gangs; “pook” should be pronounced by analogy with the great preponderance of the gang comprising “book,” “hook,” “look,” and “rook,” yet it is given the “hero” pronunciation (see Table 7.2), which is in accordance with grapheme–phoneme correspondence rules, nearly 75% of the time (Kay, 1985). Analogy theory also appears to make incorrect predictions about how long it takes us to make regularization errors (Patterson & Morton, 1985). Finally, it is not clear how analogy models account for the dissociations found in acquired dyslexia. Nevertheless, in some ways the analogy model was a precursor of connectionist models of reading.

Connectionist models: Seidenberg and McClelland’s (1989) model of reading

The original Seidenberg and McClelland (1989) model evolved in response to criticisms that I will examine after describing the original model. The

Seidenberg and McClelland (1989) model (often abbreviated to SM) shares many features with the interactive activation model of letter recognition discussed in Chapter 6. The SM model provides an account of how readers recognize letter strings as words and pronounce them. This first model simulated one route of a more general model of lexical processing (see Figure 7.4). Reading and speech involve three types of code: orthographic, meaning, and phonological. These are connected with feedback connections. The shape of the model has given it the name of the triangle model. As in the revised dual-route model, there is a route from orthography to phonology by way of semantics. The key feature of the model is that there is only one other route from orthography to phonology; there is no route involving grapheme–phoneme correspondence rules.

FIGURE 7.4 Seidenberg and McClelland’s (1989) “triangle model” of word recognition. Implemented pathways are shown in bold. Reproduced with permission from Harm and Seidenberg (2001).
Labels recovered from the figure: Context; Meaning; Orthography (MAKE); Phonology (/mAK/).

Seidenberg and McClelland (1989) just simulated the orthographic-to-phonology part of the overall triangle model. The model has three levels, each containing many simple units. These are the input, hidden, and output layers (see Figure 7.5). Each of the units in these layers has an activation level, and each unit is connected to all the units in the next level by a weighted connection, which can be either excitatory or inhibitory. An important characteristic of this type of model is that the weights on these connections are not set by the modelers, but are learned. This network learns to associate a phonological output with an orthographic input by being given repeated exposure to word-pronunciation pairs. It learns using an algorithm called back-propagation. This involves slowly reducing the discrepancy between the desired and actual outputs of the network by changing the weights on the connections. (See the Appendix for more information.)

FIGURE 7.5 The layers of Seidenberg and McClelland’s (1989) model of word recognition (simplified; see text for details). Based on Seidenberg and McClelland (1989).
Labels recovered from the figure: Output layer (phonological units): /h/ /s/ /a/ /v/ /e/; Hidden layer; Input layer (visual units): H S A V E.

Seidenberg and McClelland used 400 units to code orthographic information for input and 460 units to code phonological information for output, mediated by 200 hidden units. Phonemes and graphemes were encoded as a set of triples, so that each grapheme or phoneme was specified with its flanking grapheme or phoneme. This is a common trick to represent position-specificity (Wickelgren, 1969). For example, the word “have” was represented by the triples “#ha,” “hav,” “ave,” and “ve#,” with “#” representing a blank space. A non-local representation was used: The graphemic representations were encoded as a pattern of activation across the orthographic units rather than corresponding directly to particular graphemes. Each phoneme triple was encoded as a pattern of activation distributed over a set of units representing phonetic features, a representation known as a Wickelfeature. The underlying architecture was not a simple feedforward one, in that the
hidden units fed back to the orthographic units, mimicking top-down word-to-letter connections in the IAC model of word recognition. However, there was no feedback from the phonological to the hidden units, so phonological representations could not directly influence the processing of orthographic-level representations.

The training corpus comprised all 2,897 uninflected monosyllabic words of three or more letters in the English language present in the Kucera and Francis (1967) word corpus. Each trial consisted of the presentation of a letter string that was converted into the appropriate pattern of activation over the orthographic units. This in turn fed forward to the phonological units by way of the hidden units. In the training phase, words were presented a number of times with a probability proportional to the logarithm of their frequency. This means that the ease with which a word is learned by the network, and the effect it has on similar words, depends to some extent on its frequency. About 150,000 learning trials were needed to minimize the differences between the desired and actual outputs.

After training, the network was tested by presenting letter strings and computing the orthographic and phonological error scores. The error score is a measure of the average difference between the actual and desired output of each of the output units, across all patterns. Phonological error scores were generated by applying input to the orthographic units, and measured by the output of the phonological units; they were interpreted as reflecting performance on a naming task. Orthographic error scores were generated by comparing the pattern of activation input to the orthographic units with the pattern produced through feedback from the hidden units, and were interpreted as a measure reflecting the performance of the model in a lexical decision task. Orthographic error scores are therefore a measure of orthographic familiarity. Seidenberg and McClelland showed that the model fitted human data on a wide range of inputs. For example, regular words (such as “gave”) were pronounced faster than exception words (such as “have”).

Note that the Seidenberg and McClelland model uses a single mechanism to read nonwords and exception words. There is only one set of hidden units, and only one process is used to name regular, exception, and novel items. As the model uses a distributed representation, there is no one-to-one correspondence between hidden units and lexical items; each word is represented by a pattern of activation over the hidden units. According to this model, lexical memory does not consist of entries for individual words. Orthographic neighbors do not influence the pronunciation of a word directly at the time of processing; instead, regularity effects in pronunciation derive from statistical regularities in the words of the training corpus (all the words we have learned), as implemented in the weights of connections in the simulation. Lexical processing therefore involves the activation of information, and is not an all-or-none event.

Evaluation of the original SM model

Coltheart et al. (1993) criticized important aspects of the Seidenberg and McClelland (SM) model. They formulated six questions about reading that any account of reading must answer:

• How do skilled readers read exception words aloud?
• How do skilled readers read nonwords aloud?
• How do participants make visual lexical decision judgments?
• How does surface dyslexia arise?
• How does phonological dyslexia arise?
• How does developmental dyslexia arise?

Coltheart et al. then argued that Seidenberg and McClelland’s model only answered the first of these questions.

Besner, Twilley, McCann, and Seergobin (1990) provided a detailed critique of the Seidenberg and McClelland model, although a reply by Seidenberg and McClelland (1990) answered some of these points. First, Besner et al. argued that in a sense the model still possesses a lexicon, where instead of a word corresponding to a unit, it corresponds to a pattern of activation. Second, they pointed out that the model “reads” nonwords rather poorly, certainly much
less well than a skilled reader. In particular, it only produced the “correct,” regular pronunciation of a nonword under 70% of the time. This contrasts with the model’s excellent performance on its original training set. Hence the model’s performance on nonwords is impaired from the beginning. In reply, Seidenberg and McClelland (1990) pointed out that their model was trained on only 2,897 words, as opposed to the 30,000 words that people know, and that this may be responsible for the difference. Hence the model simulates the direct lexical route rather better than it simulates the indirect grapheme–phoneme route. Therefore any disruption of the model will give a better account of disruption to the direct route, that is, of surface dyslexia. The model’s account of lexical decision is inadequate in that it makes far too many errors; in particular, it accepts too many nonwords as words (Besner et al., 1990; Fera & Besner, 1992). The model did not perform as well as people do on nonwords, in particular on nonwords that contain unusual spelling patterns (e.g., JINJE, FAIJE). In addition, the model’s account of surface dyslexia was problematic and its account of phonological dyslexia non-existent.

Forster (1994) evaluated the assumptions behind connectionist modeling of visual word recognition. He made the point that showing that a network model can successfully learn to perform a complex task such as reading does not mean that that is the way humans actually do it. Finally, Norris (1994b) argued that a major stumbling block for the Seidenberg and McClelland model was that it could not account for the ability of readers to shift strategically between reliance on lexical and sublexical information.

The revised connectionist model: PMSP

A revised connectionist model performs much better at pronouncing nonwords and at lexical decision than the original (Plaut, 1997; Plaut & McClelland, 1993; Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg, Petersen, MacDonald, & Plaut, 1996; Seidenberg, Plaut, Petersen, McClelland, & McRae, 1994). The model, called PMSP for short, used more realistic input and output representations. Phonological representations were based on phonemes with phonotactic constraints (that constrain which sounds occur together in the language), and orthographic representations were based on graphemes with graphotactic constraints (that constrain which letters occur together in the language). The original SM model performed badly on nonwords because Wickelfeatures disperse spelling–sound regularities. For example, in GAVE, the A is represented in the context of G and V, and has nothing in common with the A in SAVE (represented in the context of S and V). In the revised PMSP model, letters and phonemes activate the same units irrespective of context. A mathematical analysis showed that a response to a letter string input is a function that depends positively on the frequency of exposure to the pattern, positively on the sum of the frequencies of its friends, and negatively on the sum of the frequencies of its enemies. The response to a letter string is non-linear, in that there are diminishing returns: For example, regular words are so good they gain little extra benefit from frequency. This explains the interaction we observe between word consistency and frequency. As we shall see, the revised model also gives a much better account of dyslexia.

Accessing semantics

Of course the goal of reading is to access the meaning of words. The PMSP model simulates the orthography–phonology side of the triangle. Clearly, according to the model, we can access semantics either directly (OS: orthography–semantics) or indirectly (OPS: orthography–phonology–semantics, which we have also called phonological mediation). Hence there is a division of labor between the two routes. Harm and Seidenberg (2004) model the access of semantics. In the full model, all parts of the system operate simultaneously and contribute to the activation of meaning. The Harm and Seidenberg model is a complete implementation of the triangle model. It is trained to produce the correct pattern of activation across a set of semantic features given an orthographic input. In the first phase, the model is trained for a while on the phonology–semantics side of the triangle, to simulate the knowledge of
young children who cannot yet read, but who know what words mean. These weights are then frozen. In the second phase, the orthography–phonology and orthography–semantics sides of the triangle are trained.

How does the trained model perform? Perhaps not surprisingly, in simulations resembling the skilled reader in normal conditions, the OS route is normally faster, with the OPS route lagging somewhat behind. Nevertheless, analysis of how activation of the input determines activation of the output shows that activation of the semantic system is driven by both pathways. Even if the OPS path is slower, it still always contributes to the final output. In addition, because of interactivity in the system, activation of the semantic system activates corresponding phonological representations, which in turn affect the semantic system. Simulations show that the relative contributions of the two pathways (OS and OPS) are modulated by a number of factors, including skill (phonological information is more important early on in training, corresponding to less skilled readers) and word frequency (for high-frequency words the OS pathway is more efficient). The model also simulates the response times of van Orden (1987), where people are slow to say “no” to “Is it a flower? ROWS.”

CONNECTIONIST MODELS OF DYSLEXIA

Modeling surface dyslexia

Over the last few years connectionist modeling has contributed to our understanding of deep and surface dyslexia. Patterson, Seidenberg, and McClelland (1989) artificially damaged or “lesioned” the Seidenberg and McClelland (1989) network after the learning phase by destroying hidden units or connection weights, and then observed the behavior of the model. Its performance resembled the reading of a surface dyslexic.

Patterson et al. (1989) explored three main types of lesion: damage to the connections between the orthographic input and hidden units (called early weights); damage to the connections between the hidden and output (phonological) units (called late weights); and damage to the hidden units themselves. Damage was inflicted by probabilistically resetting a proportion of the weights or units to zero. The greater the amount of damage being simulated, the higher the proportion of weights that was changed. The consequences were measured in two ways. First, the damage was measured by the phonological error score, which as we have seen reflects the difference between the actual and target activation values of the phonological output units. Obviously, high error scores reflect impaired performance. Second, the damage was measured by the reversal rate. This corresponds to a switch in pronunciation by the model, so that a regular pronunciation is given to an exception item (for example, “have” is pronounced to rhyme with “gave”).

Increasing damage at each location produces near-linear increases in the phonological error scores of all types of word. On the whole, though, the lesioned model performed better with regular than with exception words. The reversal rate increased as the degree of damage increased, but nevertheless there were still more reversals occurring on exception words than on regular words. Damage to the hidden units in particular produced a large number of instances where exception words were produced with a regular pronunciation; this is similar to the result whereby surface dyslexics over-regularize their pronunciations. However, the number of regularized pronunciations that were produced by the lesioned model was significantly lower than that produced by surface dyslexic patients. No lesion made the model perform selectively worse on nonwords. Hence the behavior of the lesioned model resembles that of a surface dyslexic.

Patterson et al. also found that word frequency was not a major determinant of whether a pronunciation reversed or not. (It did have some effect, so that high-frequency words were generally more robust to damage.) As we have seen, some surface dyslexics show frequency effects on reading, while others do not. Patterson et al. found that the main determinant of reversals was the number of vowel features by which the regular pronunciation differs from the correct pronunciation, a finding verified from the neuropsychological data.
An additional point of interest is that the lesioned model produced errors that have traditionally been interpreted as “visual” errors. These are mispronunciations that are not over-regularizations and that were traditionally thought to result from an impairment of early graphemic analysis. If this analysis is correct, then Patterson et al. should only have found such errors when there was damage to the orthographic units involved. In contrast, they found them even when the orthographic units were not damaged. This is an example of a particular strength of connectionist modeling; the same mechanism explains what were previously considered to be disparate findings. Here visual errors result from the same lesion that causes other characteristics of surface dyslexia, and it is unnecessary to resort to more complex explanations involving additional damage to the graphemic analysis system.
There are three main problems with this particular account. First, we have already seen that the original Seidenberg and McClelland model was relatively bad at producing nonwords before it was lesioned. We might say that the original model is already operating as a phonological dyslexic. Yet surface dyslexics are good at reading nonwords. Second, the model does not really over-regularize, it just changes the vowel sound of words. Third, Behrmann and Bub (1992) reported data that are inconsistent with this model. In particular, they showed that the performance of the surface dyslexic MP on irregular words does vary as a function of word frequency. They interpreted this frequency effect as problematic for connectionist models. Patterson et al. (1989) were quite explicit in simulating only surface dyslexia; their model does not address phonological dyslexia.
Exploring semantic involvement in reading
The revised model, abbreviated to PMSP, provides a better account of dyslexia. The improvements come about because the simulations implement both pathways of the triangle model in order to explain semantic effects on reading.
Surface dyslexia arises in the progressive neurological disease dementia (see Chapter 11 on semantics for details of dementia). Importantly, people with dementia find exception words difficult to pronounce and repeat if they have lost the meaning of those words (Hodges, Patterson, Oxbury, & Funnell, 1992; Patterson & Hodges, 1992; but see Funnell, 1996). Patterson and Hodges proposed that the integrity of lexical representations depends on their interaction with the semantic system: Semantic representations bind phonological representations together with a semantic glue; hence this is called the semantic glue hypothesis. As the semantic system gradually dissolves in dementia, so the semantic glue gradually comes unstuck, and the lexical representations lose their integrity. Patients are therefore forced to rely on a sublexical or grapheme–phoneme correspondence reading route, leading to surface dyslexic errors. Furthermore, they have difficulty in repeating irregular words for which they have lost the meaning, if the system is sufficiently stressed (by repeating lists of words), but they can repeat lists of words for which the meaning is intact (Patterson, Graham, & Hodges, 1994; but see Funnell, 1996, for a patient who does not show this difference).
PMSP showed that a realistic model of surface dyslexia depends on involving semantics in reading. Support from semantics normally relieves the phonological pathway from having to master low-frequency exception words by itself. In surface dyslexia the semantic pathway is damaged, and the isolated phonological pathway reveals itself as surface dyslexia.
Plaut (1997) further examined the involvement of semantics in reading. He noted that some patients have substantial semantic impairments but can read exception words accurately (e.g., DC of Lambon Ralph, Ellis, & Franklin, 1995; DRN of Cipolotti & Warrington, 1995; WLP of Schwartz, Marin, & Saffran, 1979). To explain why some patients with semantic impairments cannot read exception words but some can, Plaut suggested that there are individual differences in the division of labor between semantic and phonological pathways. Although the majority
of patients with semantic damage show surface dyslexia (Graham, Hodges, & Patterson, 1994), some exceptions are predicted. He also argued that people use a number of strategies in performing lexical decision, one of which is to use semantic familiarity as a basis for making judgments. The revised model therefore takes into account individual differences between speakers, and shows how small differences in reading strategies can lead to different consequences after brain damage.
Modeling phonological dyslexia
The triangle model provides the best connectionist account of phonological dyslexia. It envisages reading as taking place through the three routes conceptualized in the original SM model. The routes are orthography to phonology, orthography to semantics, and semantics to phonology (Figure 7.4). This approach sees phonological dyslexia as nothing other than a general problem with phonological processing (Farah et al., 1996; Sasanuma et al., 1996). Phonological dyslexia arises through impairments to representations at the phonological level, rather than to grapheme–phoneme conversion. This is called the phonological impairment hypothesis. People with phonological dyslexia can still read words because their weakened phonological representations can be accessed through the semantic level. (Hence this approach is also a development of the semantic glue hypothesis.) We have already noted that the original Seidenberg and McClelland (1989) model performed rather like a phonological dyslexic patient, in that it performed relatively poorly on nonwords. Consistent with the phonological deficit hypothesis, the explanation for this poor performance was that the source of these errors was the impoverished phonological representations used by the model.
An apparent problem with the phonological deficit hypothesis is that it is not clear that it would correctly handle the way in which people with phonological dyslexia read pseudohomophones better than other types of nonwords (Coltheart, 1996). Furthermore, patient LB of Derouesné and Beauvois (1985) showed an advantage for pseudohomophones, but no obvious general phonological impairment. There have also been effects of orthographic complexity and visual similarity, suggesting that there is also an orthographic impairment present in phonological dyslexia (Derouesné & Beauvois, 1985; Howard & Best, 1996). For example, Howard and Best showed that their patient Melanie-Jane read pseudohomophones that were visually similar to the related word (e.g., GERL) better than pseudohomophones that were visually more distant (e.g., PHOCKS). There was no effect of visual similarity for control nonwords. However, Harm and Seidenberg (2001) show how phonological impairment in a connectionist model can give rise to such effects. A phonological impairment magnifies the ease with which different types of stimuli are read.
Modeling deep dyslexia
Hinton and Shallice (1991) lesioned another connectionist model to simulate deep dyslexia. Their model was trained by back-propagation to associate the written forms of words with a representation of the meaning of words. This model is particularly important, because it shows that one type of lesion can give rise to all the symptoms of deep dyslexia, particularly both paralexias and visual errors.
The underlying semantic representation of a word is specified as a pattern of activation across semantic feature units (which Hinton and Shallice called sememes). These correspond to semantic features or primitives such as “main-shape-2D,” “has-legs,” “brown,” and “mammal.” These can be thought of as atomic units of meaning (see Chapter 11). The architecture of the Hinton and Shallice (1991) model comprised 28 graphemic input units and 68 semantic output units with an intervening hidden layer containing 40 intermediate units. The model was trained to produce an appropriate output representation given a particular orthographic input using back-propagation. The model was trained on 40 uninflected monosyllabic words.
The structure of the output layer is quite complex. First, there were interconnections
between some of the semantic units. The 68 semantic feature units were divided into 19 groups depending on their interpretation, with inhibitory connections between appropriate members of the group. For example, in the group of semantic features that define the size of the object denoted by the word, there are three semantic features: “max-size-less-foot,” “max-size-foot-to-two-yards,” and “max-size-greater-two-yards.” Each of these features inhibits the others in the group, because obviously an object can only have one size. Second, an additional set of hidden units called cleanup units was connected to the semantic units. These permit more complex interdependencies between the semantic units to be learned, and have the effect of producing structure in the output layer. This results in a richer semantic space where there are strong semantic attractors. An attractor can be seen as a point in semantic space to which neighboring states of the network are attracted; it resembles the bottom of a valley or basin, so that objects positioned on the sides of the basin tend to migrate towards the lowest point. This corresponds to the semantic representation ultimately assigned to a word.
As in Patterson et al.’s (1989) simulation of surface dyslexia, different types of lesion were possible. There are two dimensions to remember: one is what is lesioned, the other is how it is lesioned. The connections involved were the grapheme–intermediate, intermediate–sememe, and sememe–cleanup. Three methods of lesioning the network were used. First, each set of connections was taken in turn, and a proportion of their weights was set to zero (effectively disconnecting units). Second, random noise was added to each connection. Third, the hidden units (the intermediate and cleanup units) were ablated by destroying a proportion of them.
The results showed that the closer the lesion was to the semantic system, the more effect it had. The lesion type and site interacted in their effects; for example, the cleanup circuit was more sensitive to added noise than to disconnections. Lesions resulted in four types of error: semantic (where an input gave an output word that was semantically but not visually close to the target; these resemble the classic semantic paralexias of deep dyslexics); visual (words visually but not semantically similar); mixed (where the output is both semantically and visually close to the target); and others. All lesion sites and types (except for that of disconnecting the semantic and cleanup units) produced the same broad pattern of errors. Finally, on some occasions the lesions were so severe that the network could not generate an explicit response. In these cases, Hinton and Shallice tested the below-threshold information left in the system by simulating a forced-choice procedure. They achieved this by comparing the residual semantic output to a set of possible outputs corresponding to a set of words, one of which was the target semantic output. The model behaved above chance on this forced-choice test, in that its output semantic representation tended to be closer to that of the target than to the alternatives.
Hence the lesioned network behaves like a deep dyslexic patient, in particular in making semantic paralexias. The paralexias occur because semantic attractors cause the accessing of feature clusters close to the meanings of words that are related to the target. A “landscape” metaphor may be useful. Lesions can be thought of as resulting in the destruction of the ridges that separate the different basins of attraction. The occurrence of such errors does not seem to be crucially dependent on the particular lesion type or site under consideration. Furthermore, this account provides an explanation of why different error types, particularly semantic and visual errors, nearly always co-occur in such patients. Two visually similar words can point in the first instance to nearby parts of semantic space, even though their ultimate meanings in the basins may be far apart; if you start off on top of a hill, going downhill in different directions will take you to very different ultimate locations. Lesions modify semantic space so that visually similar words are then attracted to different semantic attractors.
Hinton and Shallice’s account is important for cognitive neuropsychologists for a number of reasons. First, it provides an explicit
mechanism whereby the characteristics of deep dyslexia can be derived from a model of normal reading. Second, it shows that the actual site of the lesion is not of primary importance. This is mainly because of the “cascade” characteristics of these networks. Each stage of processing is continually activating the next, and is not dependent on the completion of processing by its prior stage (McClelland, 1979). Therefore, effects of lesions at one network site are very quickly passed on to surrounding sites. Third, it shows why symptoms that were previously considered to be conceptually distinct necessarily co-occur. Semantic and visual errors can result from the same lesion. Fourth, it thus revives the importance of syndromes as a neuropsychological concept. If symptoms co-occur as a result of any lesion to a particular system, then it makes sense to look for and study such co-occurrences.
Plaut and Shallice (1993a) extended this work to examine the effect of word abstractness on lesioned reading performance. As we have seen, the reading performance of deep dyslexic patients is significantly better on more imageable than on less imageable words. Plaut and Shallice showed that the richness of the underlying semantic representation of a word is an analog of imageability. They hypothesized that the semantic representations of abstract words contain fewer semantic features than those of concrete words; that is, the more concrete a word is, the richer its semantic representation. Jones (1985) showed that it was possible to account for imageability effects in deep dyslexia by recasting them as ease-of-predication effects. Ease-of-predication is a measure of how easy it is to generate things to say about a word, or predicates, and is obviously closely related to the richness of the underlying semantic representation. It is easier to find more things to say about more imageable words than about less imageable words. Plaut and Shallice (1993a) showed that when an attractor network similar to that of Hinton and Shallice (1991) is lesioned, concrete words are read better than abstract words. One exception was that severe lesions of the cleanup system resulted in better performance on abstract words. Plaut and Shallice argue that this is consistent with patient CAV (Warrington, 1981), who showed such an advantage. Hence this network can account for both the usual better performance of deep dyslexic patients on concrete words, and also the rare exception where the reverse is the case. They also showed that lesions closer to the grapheme units tended to produce more visual errors, whereas lesions closer to the semantic units tended to produce more semantic errors. The model also provides an account of the behavior of normal participants reading degraded words (McLeod, Shallice, & Plaut, 2000). If words are presented very rapidly to people, they make both visual and semantic errors. The data fit the connectionist model well.
Connectionist modeling has advanced our understanding of deep dyslexia in particular, and neuropsychological deficits in general. The finding that apparently unrelated symptoms can necessarily co-occur as a result of a single lesion is of particular importance. It suggests that deep dyslexia may after all be a unitary condition. However, there is one fly in the ointment. The finding that at least some patients show imageability effects in reading but not in comprehension is troublesome for all models that posit a disturbance of semantic representations as the cause of deep dyslexia (Newton & Barry, 1997). Instead, in at least some patients, the primary disturbance may be to the speech production component of reading.
COMPARISON OF MODELS
A simple dual-route model provides an inadequate account of reading, and needs at least an additional lexical route through imageable semantics. The more complex a model becomes, the greater the worry that routes are being introduced on an arbitrary basis to account for particular findings. Analogy models have some attractive features, but their detailed workings are vague and they do not seem able to account for all the data. Connectionist modeling has provided an explicit, single-route model that covers most of the main findings, but has its
problems. At the very least it has clarified the issues involved in reading. Its contribution goes beyond this, however. It has set the challenge that only one route is necessary in reading words and nonwords, and that regularity effects in pronunciation arise out of statistical regularities in the words of the language. It may not be a complete or correct account; however, it is certainly a challenging one.
Currently we are faced with two serious alternatives: a connectionist model such as the triangle model, and a variant of the dual-route model such as the dual-route cascaded model. The literature is full of claim and counter-claim, and it would be presumptuous for a text like this to say that one is clearly right and the other wrong. There are many studies providing support for and against one or the other of the models. Many of them focus on how we read nonwords (Besner et al., 1990; Seidenberg et al., 1994), because the division of labor in the DRC model between a lexical route with knowledge of individual words and a non-lexical route with spelling rules is absent in connectionist models, and this difference is the key one between the two sorts of models. The DRC emphasizes regularity (does the word obey the rule?), which is a categorical concept: either the word obeys the spelling–sound rules or it does not, with nonwords having to be pronounced by the rule. The triangle model emphasizes consistency of rimes and other units (how often is -AVE pronounced in a certain way?), which is a statistical concept. According to Zevin and Seidenberg (2006), consistency effects such as those shown in Glushko’s (1979) and Jared’s (1997b) studies are the critical test between models. Words like PAVE are regular but inconsistent; according to the DRC model they should be as easy to pronounce as regular and consistent words such as PANE; according to the triangle model they should not. Now of course we know from Glushko’s study that regular inconsistent words are slower to pronounce than regular consistent ones, but Coltheart et al. (2001) argue that these differences are an artifact arising from several confounding factors (e.g., the presence of exception words in the materials, and an increase in the number of times it is necessary to reanalyze inconsistent words as we read them from left to right). Zevin and Seidenberg (2006) argued that graded sensitivity to consistency effects in nonwords provides the critical test between the models, with only connectionist models correctly predicting the presence of such effects, and being able to account for individual differences in nonword pronunciation. However, doubtless this debate will run and run.
Perhaps the choice between the triangle and the dual-route cascaded model comes down to which one values most: explaining a wide range of data, or parsimony in design.
Balota (1990) asked if there is a magic moment when we recognize a word but do not yet have access to its meaning. He argued that the tasks most commonly used to study word processing (lexical decision and word naming) are both sensitive to post-access processes. This makes interpretation of data obtained using these tasks difficult (although not, as we have seen, impossible). Furthermore, deep dyslexia (discussed earlier) suggests that it is possible to access meaning without correctly identifying the word, while non-semantic reading suggests that we can recognize words without necessarily accessing their meaning. Whereas unique lexical access is a prerequisite of activating meaning in models such as the logogen and the serial search model, cascading connectionist models permit the gradual activation of semantic information while evidence is still accumulating from perceptual processing. A model such as the triangle model (Patterson et al., 1996; Plaut et al., 1996) seems best able to accommodate all these constraints.
Finally, all of these models, particularly the connectionist ones, are limited in that they have focused on the recognition of morphologically simple, often monosyllabic words. Rastle and Coltheart (2000) have developed a rule-based model of reading bisyllabic words, emphasizing how we produce the correct stress, and Ans, Carbonnel, and Valdois (1998) have developed a connectionist model of reading polysyllabic words.
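The contrast drawn above between regularity (a categorical concept) and consistency (a statistical concept) can be illustrated with a small calculation. The following Python sketch is only an illustration under stated assumptions: the eight-word toy lexicon, the function name rime_consistency, and the simple type-count measure (the proportion of words sharing a spelling rime that also share its pronunciation) are all hypothetical choices, not the measure used in any of the studies cited. Real studies typically use much larger word sets and may weight neighbors by frequency.

```python
def rime_consistency(word, lexicon):
    """Proportion of words sharing `word`'s spelling rime that pronounce it the same way."""
    spelling, sound = lexicon[word]
    neighbor_sounds = [s for sp, s in lexicon.values() if sp == spelling]
    friends = sum(1 for s in neighbor_sounds if s == sound)
    return friends / len(neighbor_sounds)


# Hypothetical toy lexicon: word -> (spelling of the rime, pronunciation of the rime).
TOY_LEXICON = {
    "PAVE": ("AVE", "eɪv"),
    "GAVE": ("AVE", "eɪv"),
    "SAVE": ("AVE", "eɪv"),
    "WAVE": ("AVE", "eɪv"),
    "HAVE": ("AVE", "æv"),   # the exception word that makes -AVE inconsistent
    "PANE": ("ANE", "eɪn"),
    "LANE": ("ANE", "eɪn"),
    "CANE": ("ANE", "eɪn"),
}

print(rime_consistency("PAVE", TOY_LEXICON))  # 0.8: regular but inconsistent
print(rime_consistency("PANE", TOY_LEXICON))  # 1.0: regular and consistent
print(rime_consistency("HAVE", TOY_LEXICON))  # 0.2: an exception word
```

On a purely categorical view PAVE and PANE are both simply regular, whereas the graded measure separates them because HAVE is a neighbor of PAVE. Note that in this sketch each word counts as its own neighbor, which is one of several possible conventions.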
SUMMARY
• Different languages use different principles to translate words into sounds; languages such as English use the alphabetic principle.
• Regular words have a regular grapheme-to-phoneme correspondence, but exception words do not.
• According to the dual-route model, words can be read through a direct lexical route or a sublexical route; in adult skilled readers the lexical route is usually faster.
• The sublexical route was originally thought to use grapheme–phoneme conversion, but now it is considered to use correspondences across a range of sublexical levels.
• There are effects of lexical similarity in reading certain nonwords (pseudohomophones), while not all words are read with equal facility (the consistency of the regularity of a word’s neighbors affects its ease of pronunciation).
• It might be necessary to access the phonological code of a word before we can access its meaning; this process is called phonological mediation.
• Phonological mediation is most likely to be observed with low-frequency words and with poor readers.
• Readers have some attentional control over which route they emphasize in reading.
• Access to some phonological code is mandatory, even in silent reading, but normally does not precede semantic access.
• Increasing reading speed above about 350 words a minute (by speed reading, for example) leads to reduced comprehension.
• Surface dyslexia is difficulty in reading exception words; it corresponds to an impairment of the lexical route in the dual-route model.
• Phonological dyslexia is difficulty in reading nonwords; it corresponds to an impairment of the sublexical route in the dual-route model.
• Deep dyslexic readers display a number of symptoms including making visual errors, but the most important characteristic is the presence of semantic reading errors or paralexias.
• There has been some debate as to whether deep dyslexia is a coherent syndrome.
• Non-semantic readers can pronounce irregular words even though they do not know their meaning.
• The revised dual-route model uses multiple sublexical correspondences and permits direct access through a semantic lexical route and a non-semantic lexical route.
• The dual-route cascaded model allows activation to trickle through levels before processing is necessarily completed at any level.
• Seidenberg and McClelland (SM) produced an important connectionist model of reading; however, it performed poorly on nonwords and pseudohomophones.
• Lesioning the SM network gives rise to behavior resembling surface dyslexia, but its over-regularizations differ from those made by humans.
• The revised version of this model, PMSP, gives a much better account of normal reading and surface dyslexia; it uses a much more realistic representation for input and output than the original model.
• There are clear semantic influences on normal and impaired reading, and recent connectionist models are trying to take these into account.
• The triangle model accounts for phonological dyslexia as an impairment to the phonological representations: this is the phonological impairment hypothesis.
• Deep dyslexia has been modeled by lesioning semantic attractors; the lesioned model shows how the apparently disparate symptoms of deep dyslexia can arise from one type of lesion.
• More imageable words are relatively spared because they have richer semantic representations.
• There has been considerable debate as to whether developmental dyslexia is qualitatively different from very poor normal reading, and whether there are subtypes that correspond to acquired dyslexias; the preponderance of evidence suggests that developmental dyslexia is on a continuum with normal reading.
• Connectionist modeling shows how two distinct types of damage can lead to a continuum of impairment between developmental surface and phonological dyslexia extremes.
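The forced-choice test that Hinton and Shallice applied to their lesioned attractor network, in which a weak residual semantic output is compared with the semantic patterns of a set of candidate words, can be sketched in a few lines of Python. Everything specific here is an assumption for illustration only: the four feature names are taken from the examples in the chapter, but the candidate words, their feature values, the residual activations, and the use of squared Euclidean distance are hypothetical and are not the original simulation's materials or criteria.

```python
FEATURES = ["has-legs", "brown", "mammal", "max-size-less-foot"]

# Hypothetical target semantic patterns for three candidate words (1 = feature present).
CANDIDATES = {
    "CAT": [1, 0, 1, 1],
    "BED": [1, 1, 0, 0],
    "MUD": [0, 1, 0, 0],
}


def distance(a, b):
    """Squared Euclidean distance between two activation patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def forced_choice(residual_output, candidates):
    """Pick the candidate word whose semantic pattern is closest to the residual output."""
    return min(candidates, key=lambda word: distance(residual_output, candidates[word]))


# Hypothetical residual output after a severe lesion: every unit is only weakly
# active, so no explicit response could be read off, yet the pattern still
# leans towards the target CAT.
residual = [0.4, 0.05, 0.4, 0.35]
print(forced_choice(residual, CANDIDATES))  # CAT
```

The point of the sketch is that a pattern too weak to support an explicit response can still carry enough information to choose correctly among alternatives, which is what above-chance forced-choice performance means.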
1. Is there a “magic moment” when we recognize a word?

2. Why might reading errors occur? Keep a record of any errors you make and try to relate them
to what you have learned in this and the previous chapter.
3. What practical tips could help adult dyslexic readers to read more effectively?
4. Do we make errors in inner speech?
FURTHER READING
Many of the references at the end of Chapter 6 will also be relevant here. There are a number of
works that describe the orthography of English, and discuss the rules whereby certain spelling-to-
sound correspondences are described as regular and others as irregular. One of the best known of
these is Venezky (1970). For an example of work on reading in a different orthographic system, see
Kess and Miyamoto (1999).
For a general introduction to reading, writing, spelling, and their disorders, see Ellis (1993).
For more discussion of dyslexia, including peripheral dyslexias, see Ellis and Young (1988). Two volumes (entitled Deep Dyslexia, 2nd ed., by Coltheart, Patterson, & Marshall, 1987, and Surface Dyslexia, by Patterson, Marshall, & Coltheart, 1985b) cover much of the relevant material. A special issue of the journal Cognitive Neuropsychology (1996, volume 13, part 6) was devoted to phonological dyslexia.
For recent overviews of reading, see Andrews (2006) and Snowling and Hulme (2007).
CHAPTER 8
LEARNING TO READ AND SPELL
INTRODUCTION
How do we learn to read? Unlike speaking and listening, reading and writing are clearly not easy tasks to learn, as shown by the large number of people who find them difficult, and the amount of explicit tuition apparently necessary. The complexities of English spelling make the task facing the learner a difficult one. Here we will concentrate on the most fundamental aspect of reading development, that of how we learn to read words. Reading development is closely associated with skills such as spelling, and we will also examine this. Finally, disproportionate difficulties in learning to read and spell (developmental dyslexia and dysgraphia) are relatively common, and we will examine these in the context of a model of normal reading development. Developmental dyslexias can be categorized in a similar way to acquired dyslexia, which has been used as further justification for a dual-route model of reading.
By the end of this chapter you should:
• Know the course of normal reading development.
• Understand the importance of the alphabetic principle.
• Understand the importance of phonological awareness in learning to read.
• Know how reading should best be taught.
• Know about developmental reading disorders.
• Know how poor readers can be helped to read better.
NORMAL READING DEVELOPMENT
I remember being taught reading at school: the letters of the alphabet were written in capitals on separate pieces of card, with an appropriate picture accompanying each letter (apple for A, cat for C; I can’t remember what X and Z were). Great pride was associated with being able to recite the alphabet backwards.
Nearly all children at some point go through a stage of alphabetic reading where they make use of grapheme–phoneme correspondences, yet skilled readers eventually end up using some sort of direct route to sound and meaning that makes little use of rule-based correspondences. Hence, learning skilled reading involves a developmental shift away from reading by a reliance on phonological recoding to a more direct route from print to meaning. How does this shift occur? There is general agreement that children learn to read alphabetic languages by discovering the principles of phonological recoding (Jorm & Share, 1983; Share, 1995).
Children probably learn to read in a series of stages, although as Rayner and Pollatsek (1989) point out, it is likely that these stages reflect the use of increasingly sophisticated skills and strategies, rather than the biologically and environmentally driven sequence of stages that might underlie cognitive development. A number of broadly similar developmental sequences have been proposed (e.g., Ehri, 1992, 1997a, 1997b; Frith, 1985; Marsh, Desberg, & Cooper, 1977; Marsh, Friedman, Welch, & Desberg,

1981). Frith (1985) described three stages. First, in the logographic stage, the child recognizes individual words by particular salient characteristics of the word; hence the child cannot read new words or nonwords. Second, in the alphabetic stage, the child learns to read by grapheme–phoneme correspondences. Third, in the orthographic stage, the child has acquired an adult-like reading system, being able to recognize whole words without having to decode each individual grapheme. Nevertheless the child can still use the grapheme–phoneme conversion system for new words and nonwords.
Ehri (1992, 1997a, 1997b) prefers the term “phase” to stage, as it has fewer implications about how discrete the boundary between phases is. Ehri described four phases of reading development (see Figure 8.1). During the pre-alphabetic phase, children know little about letter–sound correspondences, so they read by rote, learning direct links between the visual appearances of words and their meanings. For example, the word “yellow” might be remembered because it “has two tall bits together in the middle.” In some cases at least, children are remembering the concept associated with the visual pattern rather than the word: Harste, Burke, and Woodward (1982) describe how one child read “Crest” (the name of a brand of toothpaste) as “toothpaste” on one occasion and “brush teeth” on another. This phase is short, and might not happen with all children. Although this is a version of direct access, it is very different from the direct access of skilled readers. There are no systematic relationships and no detailed processing, with the child relying on arbitrary, salient cues. Knowledge about sounds is important from a very early stage.
FIGURE 8.1 Ehri’s (1992) four stages of reading development:
PRE-ALPHABETIC PHASE (very little knowledge of letter–sound correspondences; reading by rote)
PARTIAL ALPHABETIC READING PHASE (partial knowledge of spelling–pronunciation correspondences, but unable to segment all sounds in a word’s pronunciation)
FULL ALPHABETIC PHASE (complete connections between letters and sounds)
CONSOLIDATED ALPHABETIC PHASE (reading like an adult; can operate with multi-letter units, e.g., syllables, rimes, morphemes)
In the partial alphabetic reading phase, young readers use their partial knowledge of letter names and sounds to form partial correspondences between spellings and pronunciations. Some letters are associated with sounds. Ehri proposed that the first and final letters are the ones that are often first associated with sounds because they are easiest to pick out. The connections are only partial because children at this stage are unable to segment the word’s pronunciation into all of its sounds.
In the full alphabetic phase, complete connections are made between letters and sounds. At this stage children can read words they have never seen before. Gradually, as children practice reading words often enough, words become known by sight. They can then be read aloud by the direct route without the need for letter–sound conversion. Sight-word reading has the advantage that it is much faster than letter–sound conversion. Finally, in the consolidated alphabetic phase, the child reads like an adult. Letter patterns that recur across words become familiar, so the child can operate with multi-letter units such as syllables, rimes, and morphemes. The rime is the end part of a word that produces the rhyme (e.g., the rime in “rant” is “ant”); it is the VC or VCC (vowel–consonant or vowel–consonant–consonant) part of a word, the phonological equivalent of the orthographic body of a monosyllabic word. As we will see, rimes may play an important part in learning to read.
Poor readers never get far beyond the second stage because they have poor phonological recoding skills. Competent readers have two types of knowledge about spelling: they know about the alphabetic system, and they know about the spellings of specific words (Ehri, 1997a). Words are difficult to spell if they violate the alphabetic
principle or if they place a heavy load on memory. Hence words containing graphemes with irregular pronunciations, phonemes with many graphemic options, and graphemes with no phonological correspondences will all be difficult to spell.
In this scheme, then, there is an initial phase of direct access based only on visual cues. Barron and Baron (1977) showed that concurrent articulation had no effect on extracting the meaning of a printed word. However, this initial phase of visual access is very short. There is some evidence that phonetic information is used from a very early stage (Ehri, 1992; Ehri & Wilce, 1985; Rack, Hulme, Snowling, & Wightman, 1994). Early readers set up partial associations between sounds and the letters for which they stand, even though these partial associations are not the same as conscious letter-by-letter decoding. Ehri and Wilce (1985) showed that children who could not yet use phonological decoding still found it easier to learn the simplified spelling cue “jrf,” which bears some phonetic resemblance to the target word “giraffe,” than “wbc,” which is visually very distinctive but bears no phonological relation to the target. Semantic factors also influence very early reading: Laing and Hulme (1999) found that children performed better at associating spelling cues with words when they were clearer about the meanings of words. They also performed better on more imageable words. So even children in the earliest stages of reading are sensitive to spelling–sound correlations, but semantic factors also play a role.
PHONOLOGICAL AWARENESS
Phonological awareness (the awareness of the sounds of a word) is important when learning to read. It is one aspect of more general knowledge of our cognitive abilities (called metacognitive knowledge) that is thought to play an essential role in cognitive development (Karmiloff-Smith, 1986). Many tasks have been used to test phonological awareness (see Table 8.1 for some examples). Phonological awareness is just one aspect of our knowledge of language. Gombert (1992) distinguished between epilinguistic knowledge (implicit knowledge about our language processes that is used unconsciously) and metalinguistic knowledge (explicit knowledge about our language processes of which we are aware and can report, and of which we can make deliberate use). This distinction is reflected in the tasks that have been used to test phonological awareness (e.g., those in Table 8.1).
TABLE 8.1 Some tasks used to assess phonological awareness (based on Yopp, 1988).
Task                          Example
Sound-to-word matching        Is there a /f/ in “calf”?
Word-to-word matching         Do “pen” and “pipe” begin the same?
Recognition of rhyme          Does “sun” rhyme with “run”?
Isolating sounds              What is the first sound in “rose”?
Phoneme segmentation          What sounds do you hear in “hot”?
Phoneme counting              How many sounds do you hear in “cake”?
Phoneme blending              Combine these sounds: /k/ /a/ /t/
Phoneme deletion              What would be left if you took /t/ out of “stand”?
Specifying deleted phoneme    What sound do you hear in “meat” that’s missing in “eat”?
Phoneme reversal              Say “as” with the first sound last and last sound first.
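To make concrete what several of the tasks in Table 8.1 ask a child to do, the following minimal Python sketch treats each word as a list of phonemes and expresses some of the tasks as operations on that list. It is an illustration only, not a model of how children perform the tasks: the toy lexicon, its rough ASCII transcriptions, and all function names are assumptions introduced here and do not come from Yopp (1988).

```python
# Toy lexicon: each word is a list of phoneme symbols.
# The transcriptions are rough ASCII approximations (an assumption for illustration).
LEXICON = {
    "calf": ["k", "A", "f"],
    "rose": ["r", "oU", "z"],
    "hot": ["h", "o", "t"],
    "cake": ["k", "eI", "k"],
    "cat": ["k", "a", "t"],
    "stand": ["s", "t", "a", "n", "d"],
    "meat": ["m", "i", "t"],
    "eat": ["i", "t"],
    "as": ["a", "z"],
}

def sound_in_word(word, phoneme):
    """Sound-to-word matching: is the phoneme present in the word?"""
    return phoneme in LEXICON[word]

def first_sound(word):
    """Isolating sounds: return the first phoneme of the word."""
    return LEXICON[word][0]

def segment(word):
    """Phoneme segmentation: list every phoneme in order."""
    return list(LEXICON[word])

def count_phonemes(word):
    """Phoneme counting: how many phonemes does the word contain?"""
    return len(LEXICON[word])

def blend(phonemes):
    """Phoneme blending: find the lexicon word made of exactly these phonemes."""
    for word, sounds in LEXICON.items():
        if sounds == list(phonemes):
            return word
    return None

def delete_phoneme(word, phoneme):
    """Phoneme deletion: remove the first occurrence of the phoneme."""
    sounds = list(LEXICON[word])
    sounds.remove(phoneme)  # raises ValueError if the phoneme is absent
    return sounds

def deleted_phoneme(longer, shorter):
    """Specifying the deleted phoneme: which single sound is missing from `shorter`?"""
    full, rest = LEXICON[longer], LEXICON[shorter]
    for i, sound in enumerate(full):
        if full[:i] + full[i + 1:] == rest:
            return sound
    return None

def swap_first_last(word):
    """Phoneme reversal: exchange the first and last phonemes."""
    sounds = list(LEXICON[word])
    sounds[0], sounds[-1] = sounds[-1], sounds[0]
    return sounds
```

In this sketch, counting and isolating only inspect the list, whereas deletion and reversal require the list to be held and then changed.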
Although it was first thought that these tasks may all measure the same thing, it is now agreed that they do not. In an analysis of 10 commonly used tests of phonological awareness, Yopp (1988) identified two related factors, one to do with manipulating single sounds and another to do with holding sounds in memory while performing operations on them. Muter, Hulme, Snowling, and Taylor (1998) identified distinct factors in tests of phonological awareness, one to do with segmentation skills and one with rhyming skills. The underlying ability to determine that two words have a sound in common (phoneme constancy) might be a particularly important phonological skill for learning to read (Byrne, 1998).
Phonological awareness and literacy are closely related. Illiterate adults (from an agricultural area of south Portugal) performed poorly on phonological awareness tasks, particularly those involving manipulating phonemes (e.g., adding phonemes to, or deleting them from, the starts of nonwords). Ex-illiterate adults, who had received some literacy training in adulthood, performed much better (Morais, Bertelson, Cary, & Alegria, 1986; Morais, Cary, Alegria, & Bertelson, 1979). Speakers of Chinese, who use a non-alphabetic writing system where there is no correspondence between written symbols and individual sounds, seem less aware of individual phonemes. Chinese adult speakers who were literate in both an alphabetic and a non-alphabetic system could readily perform tasks such as deleting or adding consonants in spoken Chinese words; speakers who were literate only in the non-alphabetic system found the deletion and addition tasks extremely difficult (Read, Zhang, Nie, & Ding, 1986). These studies show that the relation between literacy and phonological awareness works in both directions: literacy in alphabetic scripts can itself lead to phonological awareness.
Where phonological awareness tasks have been applied systematically to all levels of the syllable from small units (phonemes) through intermediate-size units (onsets and rimes) to large units (syllables), researchers have found a sequence of phonological development. Implicit awareness is measured by tasks such as matching sounds (e.g., finding rimes) and detecting oddities; explicit awareness is measured by tasks such as isolating, segmenting, and manipulating sounds as evidenced by production. Implicit awareness follows a large-to-small developmental sequence, as indicated by early performance in matching tasks (Treiman & Zukowski, 1996), but this has little controlling effect on learning to read. Explicit awareness follows a small-to-large unit sequence and reflects the demands of learning to read using letter–sound correspondences. For example, beginning readers’ explicit awareness of rimes and onsets can be poor, while implicit knowledge of rhyming can be good (Duncan, Seymour, & Hill, 1997, 2000). Younger children were best at finding the common unit in sounds when the units were small (e.g., initial consonants, as in “face” and “food” rather than “boat” and “goat”). Thus, although they were able to make the implicit judgment that “boat” and “goat” rhymed, they were poor at explicitly identifying the common sound in those words. As children grow older they are more sensitive to the rimes of words and better able to generate word analogies for nonwords (e.g., “door” for “goor”).
Early work suggested that rime-level awareness could predict later reading ability in longitudinal studies (Goswami, 1993; Goswami & Bryant, 1990); more recent studies have claimed that phoneme-level segmentation skill and letter-name knowledge are strong predictors of level of reading ability, while rhyming skill is only a weak predictor (Muter et al., 1998), although there is some controversy about the effects of the specific instructions given to children (Bryant, 1998; Hulme, Muter, & Snowling, 1998).
Beginning readers have difficulty with phonological awareness tasks, but their performance improves with age. Developing phonological awareness improves reading skills and, as children learn to read, their phonological awareness increases. Phonological awareness plays a driving role in reading development (Rayner & Pollatsek, 1989). Training on phonological awareness can lead to an improvement in segmenting and reading skills in general (Bradley & Bryant, 1983) if it is linked to reading (Hatcher, Hulme, & Ellis, 1994; see Bus & van Ijzendoorn, 1999, for a review). Laing and Hulme (1999) showed that phonological awareness correlates with the ability
of young children to learn to associate phonetic cues with words (e.g., “bfr” for “beaver,” as in the Ehri & Wilce, 1985, task described earlier). A recent meta-analysis of studies of learning to read demonstrates the importance of phonological awareness in learning to read, and how an impairment in phonological awareness is associated with reading difficulties (Melby-Lervåg, Lyster, & Hulme, 2012).
As we have noted several times before, different languages map spelling onto sounds in different ways. How do these differences affect the development of phonological awareness? Before children learn to read, we would expect children from different language communities to show broadly the same features of phonological awareness. After they learn to read, however, their knowledge of how letters map onto sounds in their particular language might lead to particularities in their phonological awareness skill that might differ from those of children learning other languages. Experimental results support this idea (Goswami, Ziegler, & Richardson, 2005). English and German are very similar in the sorts of sounds they use, and pre-literate children have very similar phonological awareness. However, German is much more consistent in its spelling–sound correspondences, while as we know English is highly variable. After the first year of reading instruction there are clear differences in the phonological awareness skills of children learning to read these two languages. In particular, English children pay more attention to the rime of a word than do the German children. German children on the other hand develop awareness of the role of individual phonemes relatively more quickly (because small reading units have more regular correspondences in German). These results also show that phonological awareness and reading development have a reciprocal relationship: learning to read changes our phonological awareness.
In summary, phonological awareness is a central concept in reading, and is absent or impoverished in unskilled readers. The ability to manipulate phonemes and knowledge of letter–sound correspondences are particularly important. Phonological awareness and literacy must be interrelated, because impaired phonological awareness leads to difficulty in reading (see later), but the absence of literacy leads to poor performance on tasks of phonological awareness. However, not all researchers accept that it has yet been conclusively shown that phonological awareness skills precede and play a causal role in learning to read, rather than being just correlated with, or a consequence of, reading development. Longitudinal studies reveal correlations, while there are potential difficulties with the training studies that are most likely to reveal causal links between phonological awareness and reading skill (Castles & Coltheart, 2004). For example, several studies that train phonological awareness also trained other skills (e.g., letter names), and virtually all studies have used children who could already read, and for whom therefore the phonological awareness training might have reinforced some pre-existing reading skill. So although it is clear that phonological awareness and reading development are related, there remains controversy as to whether phonological awareness is the cause or consequence of literacy (Castles & Coltheart, 2004; Hulme, Snowling, Caravolas, & Carroll, 2005).
The size of early reading units
Do children have to learn phonological decoding before they can become skilled readers and use processes such as reading by analogy? There has been considerable debate about the progression in reading development. Do beginning readers start with large units and then move to small, or do they start with small units and then move to large? Although Goswami (1993) argued that the correspondences between sounds and the rimes of syllables are probably the first to be acquired, it is now generally agreed that grapheme–phoneme correspondences are learned first.
Goswami (1986, 1988, 1993) argued that young children read words by analogy, before they are able to use phonological recoding. It is harder for beginning readers to sound out and blend phonemes than to sound out and blend larger subunits such as onsets and rimes. Children’s ability to detect rhyming words in a sequence is strongly
predictive of their later analogical reading performance. Goswami presented children with a clue word (e.g., “beak”) and asked them to read several other words and nonwords, some of which were analogs of the clue word (e.g., “bean,” “beal,” “peak,” and “lake”). She found that the children read the analog words better than the control words, suggesting that they were making use of the rime to read by analogy. For Goswami, children start to read by identifying large units (onset and rime) first, and only later identify small units such as phonemes.
Most studies, however, have found that beginning readers need some grapheme–phoneme decoding skill in order to be able to read words by analogy (see Brown & Deavers, 1999; Coltheart & Leahy, 1992; Duncan, Seymour, & Hill, 2000; Ehri, 1992; Ehri & Robbins, 1992; Laxon, Masterson, & Coltheart, 1991; Marsh et al., 1981; Savage, 1997). That is, beginning readers start by identifying how letters correspond to sounds. For example, beginning readers are more adept at segmenting words into phonemes than into onsets and rimes (Seymour & Evans, 1994). The differences between these results are probably attributable to the materials and tasks Goswami used. Her control words might have been more difficult to read than the analogs. Muter, Snowling, and Taylor (1994) pointed out that the majority of these tasks involved the simultaneous presentation of clue words and target words, which might have provided additional information that might not be available in normal reading. Along these lines, Savage (1997) showed that there was no privileged role for onsets and rimes in the absence of the concurrent prompts. Ehri and Robbins (1992) showed that children could only read words by analogy in natural reading if they already possessed grapheme–phoneme recoding skills. Brown and Deavers (1999) showed that reading strategy varied depending on the reading age of the child. Although less skilled readers (with a mean reading age of 8 years 8 months) could make use of rime-based correspondences (that is, read by analogy), they preferred to read by grapheme–phoneme correspondences. Children with a higher reading age (11 years 6 months) were more likely to read by analogy, with the rime being particularly important. Using a clue word increased the amount of reading-by-analogy in all age groups, again suggesting that the child’s reading strategy is task-dependent. Hence learning to read involves a process of learning through several different reading routes (Grainger, Lété, Bertrand, Dufau, & Ziegler, 2012).
Given that different languages map spelling onto sound in different ways, it is perhaps not surprising that languages differ in the preferred size of the key unit that emerges while learning to read. We have just seen that in English the rime emerges as a key reading unit. In languages such as German, Greek, and Spanish, which are much more regular in their spelling–sound correspondences, it is possible to make systematic use of smaller units and hence older children come to rely on simple grapheme–phoneme conversion without needing to develop reading by analogy based on rimes. Speakers of orthographically regular languages do not need to make use of larger units. The data support this idea. There are many words in English and German that are orthographically identical (sand, zoo). However, as we saw in Chapter 7, the ease of pronunciation of a target word in English depends on the number of words that share the same rime with the target: a word like “start” has many neighbors and is easier to pronounce, while a word such as “storm” has fewer neighbors and is more difficult. In German, this effect in adult speakers is much less pronounced, while the effect of length is stronger (Ziegler, Perry, Jacobs, & Braun, 2001). The idea that different languages make use of different-sized preferred reading units is called the psycholinguistic grain size theory (Ziegler & Goswami, 2005).
In summary, in natural situations younger reading-age children tend to read using grapheme–phoneme correspondences, and older reading-age children tend to read by analogy based mainly on rime. They are sensitive to task demands, however, and younger children can be encouraged to read by analogy by the clue word technique.
There is evidence that once children know something about reading (that is, once they have acquired the basics of phonological recoding) they in part teach themselves to read (Share, 1995). Bowey and Muller (2005) gave third-grade children (about 8 years old) short stories to read silently. The stories contained nonwords, and in a subsequent test the children were asked to read lists of words containing
those nonwords. They pronounced these nonwords more quickly than control nonwords.
HOW SHOULD READING BE TAUGHT?
When should reading be taught? The age at which children start to learn to read seems to be relatively unimportant; indeed, even when it is delayed until age 7 there are no serious or permanent side effects (Rayner & Pollatsek, 1989). In fact, older children learn to read more quickly in comparison with younger children (Feitelson, Tehori, & Levinberg-Green, 1982). As a corollary of this, very early tuition does not provide any obvious long-term benefits, as late starters catch up so easily.
The main question then is how should reading be taught? There are two traditional approaches to teaching children how to read (see Figure 8.2). These correspond to emphasizing one of the two routes in the dual-route model. In the look-and-say or whole word method, children learn to associate the sound of a word with a particular visual pattern. This corresponds to emphasizing the lexical or direct access route. In the alternative phonic method, children are taught to associate sounds with letters and letter sequences, and use these associations to build up the pronunciations of words. This method therefore emphasizes the non-lexical, grapheme–phoneme conversion route.
It is generally agreed that the phonic method gives much better results (Adams, 1990). A meta-analysis (which is a method of combining the results of two or more, often many, experiments) of the reading literature showed that systematic training on phonics produced a strong beneficial effect on learning to read (Ehri, Nunes, Stahl, & Willows, 2001). Indeed, many studies show that discovering the alphabetic principle (that letters correspond systematically to sounds) is the key to learning to read (see Backman, 1983; Bradley & Bryant, 1978, 1983; Byrne, 1998; Rayner & Pollatsek, 1989; Share, 1995). Other methods do not work anywhere near as well. Seymour and Elder (1986) examined the reading performance of a class of young children (aged 4½ to 5½ years) who were taught to “sight read” with relatively little emphasis on the alphabetic principle. They found that the children were limited to reading only words that they had been taught. They made many reading errors, and their performance in some ways resembled that of people with deep and phonological dyslexia.
Hence the most efficient way of learning to read in an alphabetic language is to learn which phonemes letters correspond to. In the absence of tuition, however, children try to assign letters to words rather than sounds, although most children soon realize that this will not work (Byrne, 1998; Ferreiro, 1985). Anything that expedites this realization facilitates reading. Teaching the alphabetic principle explicitly does this, and, as we have seen, training on phonological awareness improves reading skills, presumably by focusing on phonemes and preparing the way to showing how they can be mapped onto letters. As Byrne (1998, p. 144) concludes, “if we want children to know something, we would be advised to teach it explicitly.”
There are two types of phonics instruction. Analytic phonics is generally taught after reading
FIGURE 8.2 Approaches to learning to read. Look-and-say/whole word method: children learn to associate the sound of a word with a particular visual pattern. Alphabetic/phonic method: children learn to associate sounds with letters, and use this to build up pronunciations of words.
has begun. Letter sounds are introduced gradually; reading is practiced using sets of words that share common sounds (e.g., dog and dig). Analytic phonics is currently the most common method of teaching reading in the United Kingdom. In synthetic phonics, children are taught all the letters and letter sounds before anything else. Teaching emphasizes word-building activities involving the blending together of constituent sounds. Recent work in Clackmannanshire in Scotland suggests that being taught by synthetic phonics is greatly preferable to being taught by analytic phonics (Johnston & Watson, 2004, 2005). A 7-year longitudinal study showed that children who were taught by synthetic phonics learned to read and spell faster than children who were taught by other methods. The advantages of learning to read by synthetic phonics appear to be long-lasting, with children taught by this method showing a reading advantage several years later.
In the phonic method, children are taught to associate sounds with letters in order to build up the pronunciation of whole words.
Finally, mere exposure to print has beneficial effects. Stanovich, West, and Harrison (1995) showed that exposure to print was a significant predictor of vocabulary size and declarative knowledge even after other factors such as working memory differences, educational level, and general skill were taken into account. It is particularly important for adults to involve young children actively with print, rather than leaving them merely passively exposed to it (Levy, Gong, Hessels, Evans, & Jared, 2006). Hence games and activities that get children to manipulate letters and words and involve them in carrying out some early form of reading are highly desirable. Indeed, lack of exposure to print can lead to a developmental delay in reading, and may even be one factor causing developmental surface dyslexia (Stanovich, Siegel, & Gottardo, 1997).
LEARNING TO SPELL
Spelling is an important skill associated with the emergence of phonological awareness and learning to read. Spelling can be thought of as the reverse of reading: Instead of having to turn letters into sounds, you have to turn sounds into letters. Indeed the classic model of spelling is a dual-route one based on the dual-route model of reading (Brown & Ellis, 1994). In this model, there is a sound-to-spelling, or assembled or non-lexical, route, which can only work for regular words, and a direct, or addressed or lexical, route, which will work for all words. The crucial determinant in spelling development is the acquisition of phonological representations of words (Brown & Ellis, 1994).
Given the similarities between reading and spelling, it is no surprise that the same sorts of issues are found in spelling research as in reading research, and that the two areas are closely connected longitudinally. Spelling errors are a rich source of information about how children spell. In the earliest stages of spelling, around the age of 3, children know that writing is different from drawing, but do not yet understand the alphabetic principle. Young children believe that the written forms of words should reflect their
meanings; hence they think that the names of large objects such as “whale” should be spelled with more letters than the names of small objects such as “mosquito” (Lundberg & Tornéus, 1978; Treiman, 1997). Gradually, children’s spelling becomes motivated by their realization of the importance of the alphabetic principle (that letters correspond to sounds). At first the application of this principle might be sporadic, but eventually it comes to dominate. Early spelling errors often reflect the over-application of the alphabetic principle. For example, “Trevor” (age 6) spelled “eat” as “et,” with two letters, because it only has two sounds (Treiman, 1997). Early errors may also reflect the fact that sometimes children’s analyses of words into phonemes do not match those of adults; hence “dragon” becomes “jragin” (Read, 1975). Another source of error is that young children are over-rigorous about applying letter names. This is a particular problem with vowels: because the name of “e” is /i/, children make errors such as spelling “clean” as “clen” and “happy” as “hape” (Treiman, 1994).
Very young children may use groups of sounds that are larger than a phoneme. In particular, they may try to spell with a letter for each syllable (Ferreiro & Teberosky, 1982). For example, 5-year-old “Bobby” spelled monosyllabic words with one letter each: “be” became “b” and “should” became “c.” Consonant clusters may be spelled with just one letter: “street” becomes “set” (Treiman, 1993).
Children also soon become sensitive to the distributional information about the orthographic patterns to which they have been exposed. For example, in English the string of letters “ck” can occur in the middle and at the end of words, but not at the beginning. Early spellers seem aware of this: they make few errors such as “ckak” (for “cake”) that violate these constraints (Treiman, 1997). Young children do however produce some orthographically illegal strings in error: “hr” for “her” is quite a common error.
As children grow older, they use information in addition to the alphabetic principle. They learn to spell irregular words, and learn that morphemes are spelled in regular ways; for example, that the past tense ending of regular verbs is always spelled “ed,” no matter how it is pronounced (Treiman, 1997).
DEVELOPMENTAL DYSLEXIA
Developmental dyslexia is an impairment in developing reading abilities: whereas acquired dyslexia involves damage to reading systems that were known to be functionally normal before the brain trauma, developmental dyslexic children grow up such that the normal acquisition of reading is impaired. In the popular press, the term is often used to refer to difficulties with writing and poor spelling; strictly speaking, these symptoms should be called developmental dysgraphia, although naturally developmental dyslexia and dysgraphia usually occur together. To qualify for developmental dyslexia, the child’s reading age must be below what would be expected from their age and IQ, and the child’s IQ, home background, and level of education must reach certain levels of attainment (Ellis, 1993). Estimates of the incidence of developmental dyslexia range from 10% to as high as 30% (Freberg, 2006).
Developmental dysgraphia (difficulty with writing and poor spelling) and developmental dyslexia (an impairment in developing reading abilities) usually occur together.
There are several important issues in the study of developmental dyslexia. Although developmental dyslexia is a convenient label, there has been considerable debate as to whether it represents one end of a continuum of reading skills, or whether it is a distinct deficit with a single underlying cause (or causes if there is more than one type). Neither is there agreement that there are clear-cut subtypes of developmental dyslexia that
correspond to the acquired dyslexias. Identifying developmental dyslexic children is complex: By definition, they read less well than age-matched controls, but how much less well do you have to read to be a developmental dyslexic, rather than just a poor reader?
A problem that arises when trying to infer the properties of the reading system from cases of developmental dyslexia is that the developing reading system may be very different from the adult system. For example, grapheme–phoneme conversion might play a larger role in children’s reading. Furthermore, the nature of the child’s reading system will depend on the way in which the child is being taught to read. The look-and-say method emphasizes the role of the direct access route, and the phonic method emphasizes grapheme–phoneme conversion.
The biology of developmental dyslexia
The relation between developmental dyslexia and other cognitive abilities is complicated (Ellis, 1993). Some developmental dyslexic children have other language problems, such as in speaking or object naming. It is often thought that dyslexic children are clumsier than average, but it is unclear whether this is really the case. Some children with surface developmental dyslexia might similarly have impaired visual memory (Goulandris & Snowling, 1991), although not all do. “Allan” (Hanley, Hastie, & Kay, 1992) performed extremely well on tests of visual short-term and long-term memory. People with developmental dyslexia are slightly more likely to be left-handed or ambidextrous than people without (Eglinton & Annett, 1994). There is some evidence that the oscillatory brain activity of people with developmental dyslexia is abnormal, associated with aberrant lateralization and leading to problems with phonological processing and memory (Kraus, 2012).
Many studies have also found developmental dyslexia to be associated with visual deficits (e.g., Lovegrove, Martin, & Slaghuis, 1986). In particular, the magnocellular visual pathway, involving large cells that respond quickly and are sensitive to contrast and movement, seems to be affected. Deficits in the magnocellular system lead to problems with controlling and fixating the eyes, giving rise to the sensation that letters are moving around the page (Stein, 2003). Deficits in the magnocellular pathway are unlikely to be the sole cause of developmental dyslexia, however, because many individuals without dyslexia have the same visual deficits in this pathway as individuals with dyslexia (Skoyles & Skottun, 2004); indeed, most individuals with this visual deficit do not show dyslexia. Furthermore, not all people with dyslexia have this visual deficit (Lovegrove et al., 1986). We need to look elsewhere for a widespread underlying cause.
Reading disabilities tend to run in families, and recent work shows that dyslexia has a significant genetic component, with a number of chromosomal loci identified (Eckert, Lombardino, & Leonard, 2001; Fisher et al., 1999). There is some uncertainty (and perhaps variation) about how these genetic abnormalities are ultimately manifest at the level of brain structure. Imaging studies suggest that the thalamus, frontal lobes, and cerebellum all play some role, although the left planum temporale, a structure at the heart of Wernicke’s area, plays a particularly important role in the origin of developmental dyslexia (see Figure 8.3). The planum temporale is usually larger in the left hemisphere than in the right; the difference in size is much less in individuals with developmental dyslexia (Beaton, 1997). At a processing level, damage to these brain areas seems to be manifest primarily as a disturbance to phonological skills (see below). An autopsy of four men with developmental dyslexia found this abnormal symmetry of the planum temporale, but also found neuronal ectopias (abnormal clusters of neurons) and dysplasias (abnormally oriented neurons), both conditions associated with abnormalities in the migration phase of brain development in the fetus, when neurons move to their eventual location (Galaburda, Sherman, Rosen, Aboitiz, & Geschwind, 1985). Neurons tend to be smaller in the left medial geniculate nucleus, an important part of the brain for relaying auditory information, than in the right in people with developmental dyslexia
Anterior
Broca’s
area
Planum
temporale
Planum (right)
Anterior Posterior temporale
(left)
Wernicke’s
area
Posterior
FIGURE 8.3 An axial cross-section of the brain to show the planum temporale. As here, the left planum
temporale is usually larger than the right. In an individual with developmental dyslexia the size difference would be
much less.
(Galaburda, Menard, & Rosen, 1994). Imaging studies also reveal that the occipital regions of the brain show increased activity, probably because people are using additional visual strategies to cope with their phonological deficits (Casey, Thomas, & McCandliss, 2001).

Clearly genetic and brain abnormalities play an important role in determining a child’s reading ability. However, given the variation observed in orthographies and dyslexia, it is unlikely that a single biological factor can account for all types of reading difficulty (Hadzibeganovic et al., 2010; Seidenberg, 2011).

Are there subtypes of developmental dyslexia?

There has been some controversy about whether or not there are different types of developmental dyslexia. Frith (1985) emphasized the importance of progressing from the logographic stage to the alphabetic stage, arguing that classic developmental dyslexics fail to make this progression. Less severely affected are those readers who are arrested at the alphabetic stage and cannot progress to the orthographic stage. Less severe still is what is called type-B spelling disorder, where there is a failure of orthographic access for spelling but not for reading.

Bryant and Impey (1986) reported a comparison of dyslexic and reading-age-matched control children and found that the “normal” children made exactly the same types of reading error as the dyslexic children. If dyslexic and normal children make the same types of error then this weakens the argument that developmental dyslexia arises from the same type of brain damage as acquired dyslexia. In addition, we find large differences in normal young readers. Bryant and Impey suggest that there are many different reading styles, and some children adopt styles that lead them into difficulty. Indeed, Baron and Strawson (1976) found that some adult normal readers were particularly good at phonological skills but relatively poor at orthographic skills (they called these Phoenicians; they correspond to a very mild version of surface dyslexia). Others were particularly good at orthographic skills but relatively poor at phonological skills (Baron and Strawson called these Chinese readers, corresponding to phonological dyslexia). Baron and Strawson proposed that these were the ends of a continuum of individual differences in the normal population. Developmental dyslexics would lie at the extremes of this continuum (but see also Coltheart, 1987, and Temple, 1987, for detailed replies). Olson, Kliegel, Davidson, and Foltz (1984) also found that individual differences in reading skills in their participants fell
along a normally distributed continuum rather than into distinct subtypes.

A number of researchers have pointed out that there are similarities between acquired and developmental dyslexia. Jorm (1979) compared developmental dyslexia with deep dyslexia. In both cases grapheme–phoneme conversion is impaired, which leads to a particular difficulty with nonwords. He concluded that the same part of the parietal lobe of the brain was involved in each case; it was damaged in deep dyslexia, and failed to develop normally in developmental dyslexia. However, Baddeley, Ellis, Miles, and Lewis (1982) found that although the phonological encoding of people with developmental dyslexia was greatly impaired, they could do some tasks that necessitate it. For example, they could read nonwords at a much higher level than deep dyslexics, although of course nowhere near as well as age-matched controls.

Most people with developmental dyslexia rarely make semantic paralexias, so perhaps they resemble phonological dyslexics rather more? Campbell and Butterworth (1985), and Butterworth, Campbell, and Howard (1986), describe the case of RE, a successful university student, who resembled a phonological dyslexic. RE could only read a new word once she had heard someone else say it. She could not inspect the phonological form of words, and could not “hear words in the head.” Such a skill may be necessary for the development of the phonological recoding route. In addition, she had an abnormally low digit span. A similar case is that of JM, a person of superior intelligence whose reading age was consistently 2 years less than his chronological age (Hulme & Snowling, 1992; Snowling & Hulme, 1989). At the age of 15 his word reading was comparable to that of reading-age-matched controls, but he was completely unable to read two-syllable nonwords. He also had a severely reduced short-term memory span and difficulty with other tests of phonology such as nonword repetition. Howard and Best (1996) described the case of “Melanie-Jane,” an 85-year-old person with developmental phonological dyslexia. Melanie-Jane was highly impaired at nonword reading, but read words with normal accuracy and latencies. She reported that she had experienced no difficulties in learning to read or write at school. She never experienced any difficulty in “real-life” reading. Like all these people, Melanie-Jane had difficulty with other tasks involving phonology (e.g., assembly and segmentation). In summary, many developmental dyslexics resemble people with acquired phonological dyslexia.

Castles and Coltheart (1993) examined the reading of 56 developmental dyslexics, and argued that they did not form a homogeneous population, showing instead a clear dissociation between surface and phonological dyslexic reading patterns. They concluded that such a dissociation is the norm in developmental dyslexia. In this interpretation, the types of developmental dyslexia correspond to a failure to “acquire” normally one of the two routes of the dual-route model. One subgroup was relatively skilled at sublexical processing (as they were good at reading nonwords and poor at reading exception words) and another relatively skilled at lexical processing (as they showed the reverse pattern). Hence Castles and Coltheart concluded that there are surface and phonological subtypes of developmental dyslexia. Subsequent work looking at the heritability of developmental dyslexia among twins suggests that although both types are significantly inheritable, the genetic contribution is much larger in developmental phonological dyslexia (Castles, Datta, Gayan, & Olson, 1999).

An important consideration in studying developmental dyslexia is selecting an appropriate control group. Snowling (1983, 2000) urged caution in comparing types of acquired and developmental dyslexia. In particular, she argued that the best comparison in understanding what has gone wrong is not between developmental and acquired dyslexics, but between developmental dyslexics and reading-age-matched controls. That is, if someone with a chronological age of 14 has a reading age of 10, they should be compared with normal readers of 10. The study by Castles and Coltheart did not use appropriate reading-age-matched controls, and did not control for IQ (Snowling, Bryant, & Hulme, 1996; Stanovich, Siegel, Gottardo, Chiappe, & Sidhu, 1997). It is
therefore possible that any apparent differences between the two types of developmental dyslexics just reflect individual differences in normal readers of a lower reading age. When compared with children at the same reading age (rather than chronological age), the two groupings disappear, because children at different reading ages differ in the difficulty they have with exception words and nonwords.

The consensus of opinion is that most impairments in developmental dyslexia lie on a continuum, rather than falling into two neat categories, with phonological developmental dyslexics and surface developmental dyslexics at the ends of the continuum (Manis, Seidenberg, Doi, McBride-Chang, & Petersen, 1996; Seymour, 1987, 1990; Wilding, 1990). Those developmental dyslexics near the surface dyslexia end are poor at reading irregular words but are not so troubled by nonwords, whereas those at the phonological dyslexia end have severe nonword reading problems and make many phonological errors while reading. Children at the phonological dyslexic end of the continuum are impaired on tasks of phonological awareness, while children at the surface dyslexic end do not differ from age-matched controls on such tasks (Manis et al., 1996).

It seems then that those at the surface dyslexic end of the continuum read and perform very similarly to reading-age-matched controls, suggesting that a general developmental delay is the root of the problem, rather than a deviant reading pattern. Clearly problems with phonology play a central role in the deviant reading pattern shown in developmental phonological dyslexia. Most people with developmental dyslexia are indeed worse at tasks involving both nonword reading and phonological awareness (Bradley & Bryant, 1983; Goswami & Bryant, 1990; Metsala, Stanovich, & Brown, 1998; Rack, Snowling, & Olson, 1992; Siegel, 1998; Snowling, 1987). Bradley and Bryant (1978) showed that people with developmental dyslexia perform less well than reading-age-matched control children at picking out a phonologically distinct word from a group of four (e.g., cat, fat, hat, net). These difficulties with phonological awareness may be related to difficulties with phonological short-term memory (Campbell & Butterworth, 1985; Hulme & Snowling, 1992; Snowling, Stackhouse, & Rack, 1986). There is also evidence of a speech perception deficit in children at the developmental phonological dyslexia extreme. Manis et al. (1997) showed that dyslexics with low phonological awareness were poor at distinguishing between the sounds “p” and “b.”

Harm and Seidenberg (1999) argued that while children at the developmental phonological dyslexia end of the continuum share a core deficit in phonological processing, children at the developmental surface dyslexia end are like beginner readers, who are also much worse at reading exception words than sounding out nonwords. They therefore concluded that surface developmental dyslexics are delayed readers. They showed how both surface and phonological developmental dyslexia can be generated by different types of damage to an attractor connectionist network. This model also shows that it is possible to have a phonological deficit that is severe enough to interfere with reading development but not severe enough to interfere with speech perception and production. Developmental phonological dyslexia arises as a consequence of damage to phonological representations before the model is trained to read. Developmental surface dyslexia can arise in several ways, including less training (corresponding to less experience of reading), making technical changes to the way in which the model learns so that it does not obtain the normal benefits from the same amount of learning, reducing the number of hidden units that mediate between orthography and phonology, and degrading the orthographic input to the model (corresponding to visual-perceptual deficits). Relatively pure examples of phonological and surface dyslexia (corresponding to the extremes of the continuum) were associated with mild forms of impairment; more severe impairments created a mixed pattern of nonword and exception word impairment that lies somewhere along the continuum.

This work therefore shows how two distinct types of damage to a connectionist model can give rise to a continuum of impairments. As we noted above, this phonological deficit
arises as a consequence of damage to specific brain areas, and has both biological and environmental causes. Phonological awareness skills can “run in the family.” Pennington and Lefly (2001) found that members of one family had a much higher incidence than expected of developmental dyslexia. Children who later developed dyslexia showed deficits on a range of phonological measures from an early age. Furthermore, the scores on phonological tasks were continuous, with some children in the family scoring worse than control children, yet not developing full-blown dyslexia. Snowling, Gallagher, and Frith (2003) similarly found that the risk of developing dyslexia within a high-risk family is continuous, and that good general language skills, particularly good early vocabulary development, can sometimes partly compensate for a specific phonological deficit. Even children classified as normal readers had some difficulties spelling and reading nonwords. Hence, within a family at genetic risk of dyslexia, apparently unaffected members do in fact have subtle reading and phonological deficits. These results also reiterate the conclusion that although developmental dyslexia may have a genetic component, the causes are complex, and environmental factors also play a role.

Although most of the studies reported have examined reading difficulties in English, a phonological processing deficit is present in poor readers of other languages, whether they have more regular grapheme–phoneme correspondences (e.g., French and Portuguese), use a different script (e.g., Arabic and Hebrew), or are non-alphabetic (e.g., Chinese). In each case there is a core phonological deficit (Siegel, 1998). In fact a difficulty in segmenting phonemes, arising from a deficit in analyzing the rhythmic properties of speech, appears to be a universal problem, found at least in children with dyslexia speaking English, Spanish, and Chinese (Goswami et al., 2011).

In summary, there is a consensus that a phonological processing deficit, measured by a deficit in phonological awareness, underlies developmental dyslexia. This account is called the phonological deficit model of reading disabilities. This deficit, measurable on phonological awareness tasks, has a partial genetic basis. The deficit also lies on a continuum, with children at the less severe extreme being able to learn to read relatively normally. However, not all developmental reading disorders can be accounted for in terms of a phonological deficit. Some children show poor comprehension (as measured by semantic tasks such as synonymy judgment, e.g., do “boat” and “ship” have similar meanings?) yet appear unimpaired at tasks involving phonology, such as nonword reading. These children are worst at reading low-frequency irregular words, probably because they are receiving inadequate support from semantics (Nation & Snowling, 1998). Note though that a correlation does not of course imply causality: just because we observe a phonological deficit in every case of dyslexia does not mean that the deficit causes dyslexia; it could be that both are caused by some third factor. Some researchers argue that both dyslexia and phonological awareness deficits arise from an early problem in visual attention (Facoetti et al., 2010; Vidyasagar & Pammer, 2010).

How can we improve the reading of people with developmental dyslexia?

We have seen that a lack of phonological awareness plays an important role in developmental dyslexia. Therefore one obvious technique to improve reading is to improve phonological awareness from as early an age as possible (Snowling, 1987). There is some evidence that training in sound categorization might assist reading and spelling development in developmental dyslexia as well as normal reading development. In an experiment by Bradley and Bryant (1983), a group of children who had previously been shown to be poor at a rhyme judgment task were given training on categorizing words on the basis of the similarity of their sounds. For example, they had to put “hat” with “cat” on the basis of shared rimes, but with “hen” on the basis of shared initial sounds. This training was given individually and weekly for

2 years. After 4 years, the experimental group who had received sound training were much better at reading and spelling than the control groups. The effects of the training were specific to reading and spelling development; it had no carry-over into other educational skills such as mathematics. It has since been shown that training on a range of phonological skills linked to the teaching of reading is the best way of improving the reading of poor readers (Hatcher et al., 1994).

Other techniques focus on training the ability to segment words into onsets and rimes (Snowling, 1987). Related words with the same rime can then be read by analogy (e.g., “rain,” “pain,” and “stain” are all pronounced in similar ways). In contrast, the Orton–Gillingham–Stillman multisensory method emphasizes the systematic and explicit teaching of individual grapheme–phoneme rules. The multisensory aspect of the technique is probably important in its success: Children see, say, write, and even feel new spelling patterns (Fernald, 1943). There is some evidence that multisensory techniques improve poor reading: Hulme (1981) showed that poor readers remembered strings of letters better if they were allowed to trace them.

Broom and Doctor (1995a, 1995b) suggested that it is possible to provide specific remedial therapy if the locus of the deficit in the model of reading can be located. They described the case of DF, a 10-year-old boy with poor reading skills that resembled surface dyslexia. They argued that DF had become arrested at the alphabetic reading stage, and they therefore improved his orthographic reading by focusing on low-frequency irregular words. They similarly showed that it was possible to improve the reading skills of SP, an 11-year-old boy with phonological developmental dyslexia, by training on phonological reading skills.

For those for whom visual deficits play an obvious and significant role, improving the performance of the visual magnocellular system by training eye fixations can lead to improvements in reading (Stein, 2003). Using yellow filters might also lead to some improvement for those with visual problems, although there have been very few controlled studies, and much of the evidence of improvement is anecdotal (Stein, 2003; Wilkins & Neary, 1991). Increasing letter spacing might help (Zorzi, Barbiero, Facoetti, & Ziegler, 2012).

[Photo caption] A dyslexic girl wearing yellow glasses while reading. These filters might improve the function of the brain cells concerned with visual perception (the magnocellular system) in those with dyslexia, helping them to read.

Successful techniques for improving poor reading, then, possess two features. First, they provide explicit training on the skills in which the person is deficient. This means improving poor phonological awareness in developmental dyslexia at the phonological dyslexia end of the continuum, and establishing an orthographic lexicon in people at the surface dyslexia end. Second, they use techniques well known from studies of memory and mnemonics to improve memory for spelling patterns.
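The idea of reading related words by analogy through a shared rime can be illustrated with a short Python sketch. This is a toy under stated assumptions, not a procedure taken from any of the studies cited: it treats the letters before the first vowel letter as the onset and the remaining letters as the rime, and the function names split_onset_rime and group_by_rime are hypothetical. It works only on the spelling of one-syllable words, so it would wrongly group pairs such as “said” and “paid,” which share a written rime but not a pronunciation.

```python
from collections import defaultdict

VOWEL_LETTERS = set("aeiou")  # assumption: a crude, spelling-based notion of a vowel


def split_onset_rime(word):
    """Split a one-syllable written word at its first vowel letter.

    Returns (onset, rime). This is an orthographic approximation only.
    """
    for i, letter in enumerate(word):
        if letter in VOWEL_LETTERS:
            return word[:i], word[i:]
    return word, ""  # no vowel letter found: treat the whole word as onset


def group_by_rime(words):
    """Group words into families that share the same written rime."""
    families = defaultdict(list)
    for word in words:
        _, rime = split_onset_rime(word.lower())
        families[rime].append(word)
    return dict(families)


families = group_by_rime(["rain", "pain", "stain", "cat", "hat", "hen"])
for rime, members in families.items():
    print(rime, members)
```

In this sketch “rain,” “pain,” and “stain” fall into one family, which is the grouping a learner would exploit when reading a new word by analogy with a known one.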
SUMMARY
• Acquiring the alphabetic principle is an important part of learning to read.
• Phonological awareness is an awareness of the sounds of words; phonological awareness is essential for the development of skilled reading.
• Reading is best taught by the phonic method because this emphasizes grapheme–phoneme correspondences.
• There has been considerable debate as to whether developmental dyslexia is qualitatively different from very poor normal reading, and whether there are subtypes that correspond to acquired dyslexias; the preponderance of evidence suggests that developmental dyslexia is on a continuum with normal reading.
• Developmental dyslexia is associated with impaired phonological awareness.
• Connectionist modeling shows how two distinct types of damage can lead to a continuum of impairment between developmental surface and phonological dyslexia extremes.
• We can produce improvements in the reading skills of people with developmental dyslexia using techniques derived from our theories.
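The reading-age-matched comparison that bears on the continuum debate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Reader record, the reading_age_controls function, the invented participants, and the half-year tolerance are all hypothetical and are not taken from any of the studies cited. The sketch shows the core idea that someone with a chronological age of 14 and a reading age of 10 is compared with typical readers whose reading age is about 10, whatever their chronological age.

```python
from dataclasses import dataclass


@dataclass
class Reader:
    name: str
    chronological_age: float  # in years
    reading_age: float        # in years, e.g. from a standardized reading test


def reading_age_controls(target, pool, tolerance=0.5):
    """Select controls whose reading age is within `tolerance` years of the target's.

    Chronological age is deliberately ignored; the tolerance is an assumption.
    """
    return [r for r in pool if abs(r.reading_age - target.reading_age) <= tolerance]


# Invented example data, for illustration only.
child = Reader("D1", chronological_age=14, reading_age=10)
pool = [
    Reader("C1", chronological_age=10, reading_age=10.2),
    Reader("C2", chronological_age=14, reading_age=14.1),
    Reader("C3", chronological_age=9.5, reading_age=9.8),
    Reader("C4", chronological_age=12, reading_age=12.0),
]

matched = reading_age_controls(child, pool)
print([r.name for r in matched])
```

Note that C2, who matches the child on chronological age, is excluded, while the younger readers C1 and C3 are selected because they read at a similar level.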
1. What are the differences between good and poor readers?

2. What does psycholinguistics say about the best way of teaching children how to read? Which
words are likely to cause children the most difficulty?
3. Find out how you learned to read. Did you face any particular difficulties? Have these affected
your subsequent experience of reading?
4. Some authorities recommend that exam questions should be printed on colored paper for people
with reading disabilities. How could this make a difference to reading performance?
5. How should students with dyslexia be compensated in assessment?
6. Why do you think dyslexia is so commonly thought to be to do with problems spelling and
writing? Is it fair to say that poor memory and concentration skills are part of the dyslexia
syndrome?
FURTHER READING
See McBride-Chang (2004) for an introduction to literacy development. Ellis (1993) includes an
excellent description of developmental dyslexia. Snowling (2000) is a very approachable review
of work on developmental dyslexia, and Olson (1994) provides an up-to-date review. For general
overviews of learning to read, with emphasis on individual differences in reading ability, see
Goswami and Bryant (1990), McShane (1991), Oakhill (1994), and Perfetti (1994). For a popular
account of connectionist models of reading, see Hinton (1992), and Hinton, Plaut, and Shallice
(1993). Brown and Ellis (1994) review research on spelling. Harris and Hatano (1999) provide a
cross-linguistic perspective on learning to read and write.
For an excellent recent review of the whole area, with emphasis on phonological awareness, see
Ziegler and Goswami (2005). For more detail see Snowling and Hulme (2007).
CHAPTER 9
UNDERSTANDING SPEECH

INTRODUCTION

Speech is at the heart of language. This chapter is about how we understand spoken words and continuous speech.

Speech perception is about how we identify or perceive the sounds of language, while spoken word recognition is about the higher level process of recognizing the words that the sounds make up. This convenient distinction is perhaps artificial. It could be that we do not identify all the sounds of a word and then put them together to recognize the word; perhaps knowing the word helps us to identify the constituent sounds. We may not even need to hear all the sounds of a word before we can identify it. The effect of word-level knowledge on sound perception is an important and controversial topic. By the end of this chapter you should:

• Understand how we segment speech.
• Know how context is used in recognizing speech.
• Appreciate that we recognize a word at its recognition point, but that the recognition point does not have to correspond to when the word is first uniquely distinguishable from other, similar-sounding words.
• Know about the cohort and TRACE models of word recognition.
• Understand how brain damage can affect speech recognition.

RECOGNIZING SPEECH

What sort of representations are used to access our mental dictionary, the lexicon? What units are involved? We can distinguish the prelexical code, which is the sound representation used prior to the identification of a word, from the postlexical code, which is information that is only available after lexical access. An important task for understanding speech recognition is to specify the nature of the prelexical code. Among the important topics here are whether or not phonemes are represented explicitly in this representation, and the role of syllables in speech perception.

Why is speech perception difficult?

There are obvious differences between spoken and visual word perception. The most important difference between the tasks is that spoken words are present only very briefly, whereas a written or printed word is there in front of you for however long you want to analyze it. You only get one chance with a spoken word, but you can usually go back and check a visually presented word as many times as you like. Furthermore, there is not such an easy segmentation of words into component sounds as words into letters; sounds and even whole words tend to slur into one another.

In spite of these difficulties, we are rather good at recognizing speech. The process is automatic; when you hear speech, you cannot make yourself not understand it. Most of the time it happens effortlessly and with little apparent difficulty. Speech perception is fast (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). When people are given sequences of sounds consisting of a buzz, hiss, tone, and vowel, they can only distinguish the order of the sounds if they are presented at
a rate slower than 1.5 sounds per second (Clark & Clark, 1977; Warren, Obusek, Farmer, & Warren, 1969). Yet we can understand speech at a rate of 20 phonemes per second, and sometimes much faster. We can identify spoken words in context from about 200 ms after their onset (Marslen-Wilson, 1984). Furthermore, speech sounds seem to be at an advantage over non-speech sounds when heard against background noise. Miller, Heise, and Lichten (1951) found that the more words there are to choose from a predetermined set (as they put it, the greater the information transmitted per word), the louder the signal had to be relative to the noise for the participants to identify them equally well. Bruce (1958) showed that words in a meaningful context are recognized better against background noise than words out of context, and we take nearly twice as long to recognize a word if it is presented in isolation, out of the context of the sentence in which it occurs (Lieberman, 1963).

In summary, there is clearly some advantage to recognizing speech in context compared with speech out of context and non-speech sounds. What is this advantage?

[Photo caption] Although spoken words are transient in terms of their availability for analysis, recognizing speech is automatic and usually effortless.

Acoustic signals and phonetic segments: How do we segment speech?

The acoustic properties of phonemes are not fixed. They vary with the context they are in, and they even vary acoustically depending on the speaking rate (Miller, 1981). The “b” sounds in “ball,” “bill,” “able,” and “rob” are acoustically distinct. This sort of acoustic variability makes phoneme identification a complex task, as it means that they cannot be identified by comparison with a “perfect exemplar” of that phoneme, called a template. There is an analogy with recognizing letters; there are lots of perfectly acceptable ways of writing the same letter. This variation is most clear in the context of different phones that are the same phoneme in a language, such as aspirated and unaspirated /p/ (see Chapter 2). Yet we successfully map these different phones onto one phoneme.

If we look at the physical acoustic signal and the sounds conveyed by the signal, it is apparent that the relation between the two is a complex one. In their review of speech perception, Miller and Jusczyk (1989) pointed out that this complexity arises because of two main features that must act as major constraints on theories of speech perception. These features are both facets of the lack of identity or isomorphism between the acoustic and phonemic levels of language, and are called the segmentation and invariance problems. The invariance problem is that the same phoneme can in fact sound different depending on the context in which it occurs. The segmentation problem is that sounds slur together and cannot easily be separated. Let us look at these problems in more detail.

The invariance problem arises because the details of the realization of a phoneme vary depending on the context of its surrounding phonemes. This means that phonemes take on some of the acoustic properties of their neighbors, a process known as assimilation. Hence the /I/ phoneme is usually produced without any nasal quality, but in words such as “pin” and “sing” the way in which the vocal tract anticipates the shape it needs to adopt for the next phoneme means that /I/ takes on a nasal quality. That is, there are co-articulation effects, in that as we produce one sound our vocal apparatus has just moved into position from making another sound, and is preparing to change position again to make the subsequent sound. Co-articulation has advantages for both the speaker and the listener. For the speaker, it means that speech can be produced more quickly than if each phoneme
had to be clearly and separately articulated. For the listener, co-articulation has the advantage that information about the identity of phonetic segments may be spread over several acoustic segments. Although this has the apparent disadvantage that phonemes vary slightly depending on the context, it also has the advantage that we do not gather information about only one phoneme at any one time; they provide us with some information about the surrounding sounds (a feature known as parallel transmission). For example, the /b/ phonemes in “bill,” “ball,” “bull,” and “bell” are all slightly different acoustically, and tell us about what is coming next.

The segmentation problem is that it is not easy to separate sounds in speech, as they run together (except for stop consonants and pauses). This problem does not just apply to sounds within words; in normal conditions, words also run into each other. To take a famous example, in normal speech the strings “I scream” and “ice cream” sound indistinguishable. The acoustic segments visible in spectrographic displays do not map in any easy way into phonetic segments. One obvious constraint on segmenting speech is that we prefer to segment speech so that each speech segment is accounted for by a possible word. This is called the possible-word constraint: We do not like to segment speech so that it leaves parts of syllables unattached to words (Norris, McQueen, Cutler, & Butterfield, 1997). Any segmentation of the speech string that results in impossible words (such as isolated consonants) is likely to be rejected. Hence, other things being equal, the segmentation of “fill a green bucket” will be preferred to “filigree n bucket” because the latter results in an unattached “n” sound.

Other strategies that we develop to segment speech depend on our exposure to a particular language. Strong syllables bear stress and are never shortened to unstressed neutral vowel sounds; weak syllables do not bear stress and are often shortened to unstressed neutral vowel sounds. In English, strong syllables are likely to be the initial syllables of main content-bearing words, while weak syllables are either not word-initial, or start a function word (Cutler & Butterfield, 1992; Cutler & Norris, 1988). A strategy that uses this type of information is called the metrical segmentation strategy. It is possible to construct experimental materials that violate these expectations, and these reliably induce mishearings in listeners. For example, Cutler and Butterfield described how one participant, given the unpredictable words “conduct ascents uphill” presented very faintly, reported hearing “The doctor sends the bill,” and another “A duck descends some pill.” The listeners have erroneously inserted word boundaries before the strong syllables and deleted the boundaries before the weak syllables. This type of segmentation procedure, whereby listeners segment speech by identifying stressed syllables, is called stress-based segmentation. An alternative mechanism, which is based on detecting syllables and is used in languages such as French that have very clear and unambiguous syllables, is called syllable-based segmentation. In stress-based languages such as English, syllable boundaries can be unclear, and identifying the syllables is not reliable. Hence the form of the listener’s language determines the precise segmentation strategy used (Cutler, Mehler, Norris, & Segui, 1986).

How do bilingual speakers segment languages? They do not simply mimic the monolingual speakers of the language. Their segmentation strategy is determined by which is their dominant language. Cutler, Mehler, Norris, and Segui (1992) tested English–French bilingual speakers on segmenting English and French materials, using a syllable monitoring task where the participants had to respond as quickly as possible if they heard a particular sequence of sounds. The French words “balance” and “balcon” (meaning “balance” and “balcony”) begin with different syllables (“ba” in “balance” and “bal” in “balcon”). Native French speakers find it easy to detect “ba” in “balance” and “bal” in “balcon.” On the other hand, they take longer to find the “bal” in “balance” and “ba” in “balcon” because although these sounds are present, they do not correspond to the syllables. The syllable structure of the English word “balance” is far less clear; people are uncertain to which syllable the “l” sound belongs. Hence the time it takes English speakers to detect “ba” and “bal” does not vary with the syllable structure of the word they hear (“balance”
or “balcony”). French makes use of syllables, but English does not.

In Cutler et al.’s experiment, the English–French bilingual speakers segmented depending on their primary or dominant language: English-dominant speakers showed stress-based segmentation with English language materials, and never showed syllable-based segmentation, whereas French-dominant speakers showed syllabic segmentation, and only with French materials. It is as though the segmentation strategy is fixed at an early age, and only that strategy is developed further. Hence all bilingual speakers are monolingual at the level of segmentation. This is not as big a disadvantage as it might seem: Efficient bilinguals are able to discard ineffective segmentation processes and use other, more general, analytical processes instead (Cutler et al., 1986, 1992).

Categorical perception
Even though there is all this variation in the way in which phonemes can sound, we rarely, if ever, notice these differences. We classify speech sounds as one phoneme or another; there is no halfway house. This phenomenon is known as the categorical perception of phonemes (first demonstrated by Liberman, Harris, Hoffman, & Griffith, 1957). Liberman et al. used a speech synthesizer to create a continuum of artificial syllables that differed in the place of articulation. In spite of the continuum, participants placed these syllables into three quite distinct categories beginning with /b/, /d/, and /g/. Another example of categorical perception is voice onset time (abbreviated to VOT). In the voiced consonants (e.g., /b/ and /d/), the vocal cords start vibrating as soon as the vocal tract is closed, whereas in the unvoiced consonants (e.g., /p/ and /t/), there is a delay of about 60 ms. The pairs /p/ and /b/, and /t/ and /d/, differ only in this minimal feature of voicing. Voicing lies on a continuum; it is possible to create sounds with a VOT of, for example, 30 ms. Although this is midway between the two extremes, we actually categorize such sounds as being either simply voiced or unvoiced; exactly which may differ from time to time and from person to person, and people can actually be biased towards one end of the continuum or the other. It is possible to fatigue the feature detectors hypothesized to be responsible for categorical perception by repeated exposure to a sound, and to shift perception towards the other end of the continuum (Eimas & Corbit, 1973). This technique is called selective adaptation. For example, repeated presentation of the syllable “ba” makes people less sensitive to the voicing feature of the /b/. This means that immediately afterwards the boundary between /b/ and /p/ shifts towards the /p/ end of the continuum. Hence, even though speech stimuli may be physically continuous, perception is categorical.

The boundaries between categories are not fixed, but are sensitive to contextual factors such as the rate of speech. The perceptual system seems able to adjust to fast rates of speech so that, for example, a sound with a short VOT that should be perceived as /b/ is instead perceived as /p/. In effect, an absolutely short interval can be treated as a relatively long one if the surrounding speech is rapid enough (Summerfield, 1981). This is not necessarily learned, as infants are also sensitive to speech rate. They are able to interpret the relative duration of different frequency components of speech depending on the rate of speech (Eimas & Miller, 1980; Miller & Jusczyk, 1989; see Altmann, 1997, for more detail).

At first, researchers thought that listeners were actually unable to distinguish between slightly different members of a phoneme category. However, this does not appear to be the case. Pisoni and Tash (1974) found that participants were faster to say that two /ba/ syllables were the same if the /b/ sounds in each were acoustically identical, than if the /b/ sounds differed slightly in VOT. Participants are in fact sensitive to differences within a category. Hence the importance of categorical perception has recently come into question. It is possible that many phenomena in speech perception are better described in terms of continuous rather than categorical perception, and although our phenomenal experience of speech identification is that sounds fall into distinct categories, the evidence that early sensory processing is really categorical is much weaker (Massaro, 1987, 1994). Massaro argued that the apparent poor discrimination within categories
does not result from early perceptual processing, but instead just arises from a bias of participants to say that items from the same category are identical. Nevertheless, the idea of categorical perception remains popular in psycholinguistics.

What is the nature of the prelexical code?
Do we need to identify phonemes before we identify spoken words? Savin and Bever (1970) asked participants to respond as soon as they heard a particular unit, which was either a single phoneme or a syllable. They found that participants responded more slowly to phoneme targets than to syllable targets, and concluded that phoneme identification is subsequent to the perception of syllables. They proposed that phonemes are not perceptually real in the sense that syllables are: we do not recognize words through perceiving their individual phonemes, but instead can only recognize them through perceiving some more fundamental unit, such as the syllable. Foss and Swinney (1973) queried this conclusion, arguing that the phoneme and syllable monitoring task used by Savin and Bever did not directly tap into the perception process. That is, just because we can become consciously aware of a higher unit first does not mean that it is processed perceptually earlier.

Foss and Blank (1980) proposed a dual-code theory where speech processing employs both a prelexical (or phonetic) code and a postlexical (or phonemic) code. The prelexical code is computed directly from the perceptual analysis of the input acoustic information, whereas the postlexical code is derived from information from higher level units such as words. In the phoneme monitoring task, participants have to press a button as soon as they hear a particular sound. Foss and Blank showed that phoneme monitoring times to target phonemes in words and nonwords were approximately the same. In this case, the participants must have been responding to the phonetic code, as nonwords cannot have phonological codes. Foss and Blank also found that the frequency of the target word does not affect phoneme monitoring times. On the other hand, manipulating the semantic context of a word leads to people responding on the basis of the postlexical code. (For example, people are faster to respond to the word-initial “b” in the predictable word “book” than in the less predictable word “bill” in the context of “He sat reading a book/bill until it was time to go home for his tea.”) Foss and Blank argued that people respond to the prelexical code when the phoneme monitoring task is made easy, but to the postlexical code when the task is difficult (such as when the target word is contextually less likely). Subsequently Foss and Gernsbacher (1983) failed to find experimental support for the dual-code model. Increasing the processing load of the participants (e.g., by requiring them to monitor for multiple targets) did not shift them towards responding on the basis of the postlexical code. They concluded that people generally respond in the phoneme monitoring task on the basis of the prelexical code, and only in exceptional circumstances make use of a postlexical code. These results suggest that phonemes form part of the prelexical code.

Marslen-Wilson and Warren (1994) provided extensive experimental evidence on a range of tasks that phoneme classification does not have to be finished before lexical activation can begin. Nonwords that are constructed from words are more difficult to reject in an auditory lexical decision task than nonwords constructed from nonwords. In this experiment, you start off with “smog” (a word) and “smod” (a nonword). In each case you then take off the final consonant and splice on a new one, “b,” to give you a new nonword, “smob.” Although they might initially sound the same, the version made from “smog” is more difficult to reject as a nonword because the co-articulation information from the vowel is consistent with a word. Furthermore, the effects were also found across a number of different tasks. If the phonetic representation of the vowel had been translated into a phoneme before lexical access, then the co-articulation information would have been lost and the two types of nonword would have been equally difficult. Marslen-Wilson and Warren argued that lexical representations are directly accessed from featural information in the sound signal. Co-articulation information from vowels is used early to identify the following consonant and therefore a word.
In summary, there is controversy about whether or not we need to identify phonemes before recognizing a word. Most data suggest that while phonemes might be computed during word recognition, we do not need to complete phoneme identification before word recognition can begin. The research on phonological awareness described in Chapter 8 suggests that we seem to be less aware of phonemes than other phonological constituents of speech, such as syllables. Morais and Kolinsky (1994) proposed that there are two quite distinct representations of phonemes: an unconscious system operating in speech recognition and production, and a conscious system developed in the context of the development of literacy (reading and writing).

What role does context play in identifying sounds?
The effect of context on speech recognition is of central importance, and has been hotly debated. Is speech recognition a purely bottom-up process, or can top-down information influence its outcome? If we can show that the word in which a sound occurs, or indeed the meaning of the whole sentence, can influence the recognition of that particular sound, then we will have shown a top-down influence on sound perception. In this case, we will have shown that speech perception is in part at least an interactive process; knowledge about whole words is influencing our perception of their component sounds. Of course, different types of context could have an effect at every level of phonological processing, and in principle the effects might be different at each level.

The first piece of relevant evidence is based on the categorical perception of sounds varying along a continuum. For example, although /p/ and /b/ typically differ in VOT between 0 and 60 ms, sounds in between will be assigned to one or the other category. Word context affects where the boundary between the two lies. Ganong (1980) varied an ambiguous phoneme along the appropriate continuum (e.g., /k/ to /g/), inserted this in front of a context provided by a word ending (e.g., “-iss”), and found that context affected the perceptual changeover point. That is, participants are willing to put a sound into a category they would not otherwise choose if the result makes a word: “kiss” is a word, but “giss” is not, and this influences our categorical perception of the ambiguous phoneme. This is known as lexical identification shift. In this respect, word context is influencing our categorization of sounds. Findings using this technique, developed by Connine and Clifton (1987), further strengthen the argument that lexical knowledge (information about words) is available to the categorical perception of ambiguous stimuli. They showed that other processing advantages accrue to the ambiguous stimuli when this lexical knowledge is invoked, but not at the ends of the continuum, where perceptual information alone is sufficient to make a decision. Later studies using a method of analysis known as signal detection also suggest that the lexical identification shift in a categorical perception task is truly perceptual. Signal detection theory provides a means of describing the identification of imperfectly discriminable stimuli. Lexical context is not sensitive to manipulations (primarily the extent to which correct responses are rewarded and incorrect ones punished) known to influence postperceptual processes (Pitt, 1995a, 1995b; but see Massaro & Oden, 1995, for a reply). Connine (1990) found that sentential context (provided by the meaning of the whole sentence) behaves differently from lexical context (the context provided by the word in which the ambiguous phoneme occurs). In particular, sentential context has a similar effect to the obviously postperceptual effect of the amount of monetary payoff, where certain responses lead to greater rewards. She therefore concluded that sentential context has postperceptual effects.

A classic psycholinguistic finding known as the phoneme restoration effect appears at first sight to be evidence of contextual involvement in sound identification (Obusek & Warren, 1973; Warren, 1970; Warren & Warren, 1970). Participants were presented with sentences such as “The state governors met with their respective legi*latures convening in the capital city.” At the point marked with an asterisk *, a 0.12-second portion of speech corresponding to the /s/ phoneme had been cut out and replaced with a cough. Nevertheless, participants could not detect that a sound was missing
from the sample. That is, they appear to restore the /s/ phoneme to the word “legislatures.” The effect is quite dramatic. Participants continue to report that the deleted phoneme is perceptually restored even if they know it is missing. Moreover, participants cannot correctly locate the cough in the speech. The effect can still be found if an even larger portion of the word is deleted (as in le***latures). Warren and his colleagues argued that participants are using semantic and syntactic information far beyond the individual phonemes in their processing of speech. The actual sound used is not critical; a buzz or a tone elicits the effect as successfully as a cough. There are limits on what can be restored, however; replacing a deleted phoneme with a short period of silence is easily detectable and does not elicit the effect.

In an even more dramatic example, participants were presented with sentences (1) to (4) (Warren & Warren, 1970):

(1) It was found that the *eel was on the orange.
(2) It was found that the *eel was on the axle.
(3) It was found that the *eel was on the shoe.
(4) It was found that the *eel was on the table.

The participants listened to tapes that had been specially constructed so that the only thing that differed between the four sentences was the last word. In each case, a different final word was spliced onto a physically identical beginning. This is important because it means that there can be no subtle phonological or intonational differences between the sentences that might cue participants. Once again, the phoneme at the beginning of *eel was replaced with a cough. It was found that the phoneme that participants restored depended on the semantic context given by the final word of the sentence. Participants restored a phoneme that would make an appropriate word for that context. These are “peel” in (1), “wheel” in (2), “heel” in (3), and “meal” in (4).

Although at first sight it seems that the perception of speech is constrained by higher level information such as semantic and syntactic constraints, it is unclear in these experiments how the restoration is occurring. Do participants really perceive the missing phoneme? Fodor (1983) asked whether the restoration occurs at the phonological processing level, or at some higher level. Perhaps it is just the case, for example, that participants guess the deleted phoneme. The guessing does not even need to be conscious. Another way of putting this issue is, does the context affect the actual perception or some later process?

There is evidence that in some circumstances phoneme restoration is a true perceptual effect. Samuel (1981, 1987, 1990, 1996) examined the effects of adding noise to the segment instead of just replacing the segment with noise. If phoneme restoration is truly perceptual, participants should not be able to detect any difference between these conditions; in each case they will think they hear a phoneme plus sound. On the other hand, if the effect is postperceptual, there should be good discrimination between the two conditions. Samuel concluded that lexical context does indeed lead to true phoneme restoration and that the effect was prelexical. On the other hand, he concluded that sentence context does not affect phoneme recognition, and affects only postlexical processing. Consider the sentences in (5) and (6):

(5) The travelers found horrible bats in the cavern/tavern when they visited it.
(6) The travelers found horrible food in the cavern/tavern when they visited it.

In (5) the sentential context supports “cavern” more than “tavern”; in (6) the reverse is the case. If sentence context has an effect, we should therefore get stronger phoneme restoration of the deleted initial phoneme for “cavern” than “tavern” in (5), and the opposite way round in (6). This was not the case. In conclusion, only information about particular words affects the identification of words; information about the meaning of the sentence affects a later processing stage.

Samuel (1997) investigated the suggestion that people just guess the phoneme in the restoration task, rather than truly restore it at a perceptual level. He combined the phoneme restoration technique with the selective adaptation technique of Eimas and Corbit (1973). Listeners identified sounds from the /bI/–/dI/ continuum where the sounds that were acting as adaptors were the third syllable
of words beginning either with /b/ or /d/ (e.g., “alphabet” and “academic”). After repeated presentation of the adaptor (e.g., /b/, by listening to the word “alphabet” 40 times), participants were less likely to classify a subsequent sound as /b/. Crucially, this adaptation occurred even if the critical phoneme in the adaptor word was replaced with a loud burst of noise (e.g., “alpha*et,” with * signifying the noise). The adaptation only occurred when the critical phonemes were replaced with a burst of noise, but not when they were replaced with silence.

At first sight this study suggests that restored phonemes can act like real ones and cause adaptation. Others, however, have argued that these findings can be explained without interaction if the restored phonological code is created by top-down lexical context rather than just provided by the lexical code. The lexical context does not seem to be improving the perceptibility of the phoneme (the sensitivity), but just affects how participants respond (the bias). To this extent top-down information is not really affecting the sensitivity of word recognition. Perhaps listeners come to learn to recognize the noise as an instance of a “b” sound, and hence it causes adaptation in the same way that a “real” “b” would (Norris, McQueen, & Cutler, 2000, 2003).

The balance of the data here, and as discussed later in the description of the TRACE model, suggests that top-down context has at best a limited role in sound identification. In particular, there is little evidence that sentential context affects speech processing.

The time course of spoken word recognition
The terms “word recognition” and “lexical access” are often used in the spoken word recognition literature to refer to different processes (Tanenhaus & Lucas, 1987), and so it is best to be clear in advance about what our terms mean. We can identify three stages of identification: initial contact, lexical selection, and word recognition (Frauenfelder & Tyler, 1987) (see Figure 9.1). These stages might overlap; whether they do or not is an empirical question, and is an aspect of our concern with modularity.

FIGURE 9.1 Three stages of identification (Frauenfelder & Tyler, 1987): INITIAL CONTACT (some representation of the sensory input makes initial contact with the lexicon); LEXICAL SELECTION (sensory input continues to accumulate until one lexical entry is selected); WORD RECOGNITION (word is recognized and the recognition point usually occurs before the complete word has been heard).

Recognizing a spoken word begins when some representation of the sensory input makes initial contact with the lexicon, called the initial contact phase. Once lexical entries begin to match the contact representation, they change in some way; they become “activated.” The activation might be all-or-none (as is the case in the original cohort model described later), or the relative activation levels might depend on properties of the words (such as word frequency), or words may be activated in proportion to the current goodness of fit with the sensory data (as in the more recent cohort model, or in the connectionist TRACE model). In the selection phase, activation accumulates until one lexical entry is selected. Word recognition is the end point of the selection phase.

In the simplest case, the word recognition point corresponds to its uniqueness point, where the word’s initial sequence is common to that word and no other. Often recognition will be delayed until after the uniqueness point, and in principle
we might recognize a word before its uniqueness point (in strongly biasing contexts, for example). If this happens, the point at which this occurs is called the isolation point. This is the point in a word where a proportion of listeners identify the word correctly, even though they may not be confident about this decision (Grosjean, 1980; Tyler & Wessels, 1983). By the isolation point, the listener has isolated a word candidate; they then continue to monitor the sensory input until some level of confidence is reached; this is the recognition point. Lexical access refers to the point at which all the information about a word (phonological, semantic, syntactic, pragmatic) becomes available following its recognition. The process of integration that then follows is the start of the comprehension process proper, where the semantic and syntactic properties of the word are integrated into the higher level sentence representation.

When does frequency affect spoken word recognition?
Frequency has a very early effect on spoken word recognition. Dahan, Magnuson, and Tanenhaus (2001) examined people’s eye movements while they looked at pictures on a computer screen. The participants had to follow spoken instructions about which object in the scene they had to click with their mouse. Participants tended to look at objects with the higher frequency name first, compared with a competitor picture with a lower frequency name but the same initial sounds (e.g., the spoken word was “bench,” and alongside the picture of a bench were pictures of a bed, a high-frequency competitor, and a bell, a low-frequency competitor). Participants also needed to look for less time at targets with higher frequency names. A detailed analysis of how these effects unfolded over time showed that word frequency is important from the very earliest stages of processing, and that these effects persisted for some time.

Context effects on word recognition
Does context affect spoken word recognition? The context is all of the information not in the immediate sensory signal. It includes information available from the previous sensory input (the prior context) and from higher knowledge sources (e.g., lexical, syntactic, semantic, and pragmatic information). The nature of the context being discussed also depends on the level of analysis. For example, we might have word-level context operating on phoneme identification, and sentence-level context operating on word identification. To show that context affects recognition, we need to demonstrate top-down influences on the bottom-up processing of the acoustic signal. We have already examined whether context affects low-level perceptual processing; here we are concerned with the possible effects of context on word identification. The issues involved are complex. Even if there are some contextual effects, we would still need to determine which types of context have an effect, at what stage or stages they have an effect, and how they have this effect.

We have already noted that there are two opposing positions on the role of context in recognition, which can be called the autonomous and interactionist positions. The autonomous position says that context cannot have an effect prior to word recognition. It can only contribute to the evaluation and integration of the output of lexical processing, not its generation. However, the lateral flow of information is permitted in these models. For example, information flow is allowed between words within the lexicon, but not from the lexicon to lower level processes such as phoneme identification. On the other hand, interactive models allow different types of information to interact with one another. In particular, there may be feedback from later levels of processing to earlier ones. For example, information about the meaning of the sentence or the pragmatic context might affect perception.

This description is the simplest way of putting the autonomous–interactive distinction. However, perhaps the autonomous and interactive models should be looked at as the extreme ends of a continuum of possible models rather than as the two poles of a dichotomy. There might be some restrictions on permitted interaction in interactive models. For example, context can propose candidates for what word the stimulus might be before
sensory processing has begun (Morton, 1969), or it might be restricted to disposing of candidates and not proposing them (Marslen-Wilson, 1987). Because there are such huge differences between models it can be difficult to test between them. Strong evidence for the interactionist view would be a finding that context has an effect before or during the access and selection phases. In an autonomous model, context can only have an influence after a word has emerged as the best fit to the sensory input.

Frauenfelder and Tyler (1987) distinguished between two types of context: non-structural and structural. Non-structural context can be thought of as information from the same level of processing as that which is currently being processed. An example is facilitation in processing arising from intra-lexical context, such as an associative relation between two words like “doctor” and “nurse.” It can be explained in terms of relations within a single level of processing, and hence need not violate the principle of autonomy, in terms of the spread of activation within the lexicon. Alternatively, associative facilitation can be thought of as occurring because of hard-wired connections between similar things at the same level. According to autonomy theorists such as Fodor (1983) and Forster (1981), this is the only type of context that affects processes prior to recognition.

Structural context affects the combination of words into higher level units, and it involves higher level information. It is top-down processing. There are a number of possible types of structural context. Word knowledge (lexical context) might be used to help identify phonemes, and sentence-level knowledge (sentence and syntactic context) might be used to help identify individual words. The most interesting types of structural context are those based on meaning. Frauenfelder and Tyler (1987) distinguished two subtypes: semantic and interpretative. Semantic context is based on word meanings. There is much evidence that this affects word processing. Words that are appropriate for the context are responded to faster than those that are not, across a range of tasks which I discuss in more detail later, such as phoneme monitoring, shadowing, naming, and gating (e.g., Marslen-Wilson, 1984; Marslen-Wilson & Tyler, 1980; Tyler & Wessels, 1983). But it is not clear whether non-structural and semantic structural context effects can be distinguished, or at which stages they operate. Furthermore, these effects must be studied using tasks that minimize the chance of postperceptual factors operating. For this reason the delay between the stimulus and the response cannot be too long; otherwise participants would have a chance to reflect on and maybe alter their decisions, which would obviously reflect late-stage, post-access mechanisms. Interpretative structural context involves more high-level information, such as pragmatic information, discourse information, and knowledge about the world.

There is some evidence that non-linguistic context can have an effect on word recognition. Tanenhaus, Spivey-Knowlton, Eberhard, and Sedivy (1995) studied people’s eye movements as they examined a visual scene while following instructions. They found that visual context can facilitate spoken word recognition. For example, the words “candy” and “candle” sound similar until about halfway through. Following the instruction “pick up the candle,” participants were faster to move their eyes to the object mentioned if only a candle was in the scene than if both a candle and candy were present. Indeed, when no confusion object was present participants identified the object before hearing the end of the word. This result suggests that interpretative structural context can affect word recognition.

MODELS OF SPEECH RECOGNITION

Before we can start to access the lexicon, we have to translate the output of the auditory nerves from the ear into an appropriate format. Speech perception is concerned with this early stage of processing. It is obviously an important topic for the machine recognition of speech, as there are many obvious advantages to computers and other machines being able to understand speech.

Early models of speech recognition examined the possibility that word recognition occurred by template matching. Target words
are stored as templates, and identification occurs when a match is found. A template is an
exact description of the sound or the word for which we are searching. However, there is far
too much variation in speech for this to be a plausible account except in the most restricted
domains. Speakers differ in their dialect, basic pitch, basic speed of talking, and in many
other ways. One person can produce the same phoneme in many different ways: you might be
speaking loudly, or more quickly than normal, or have a cold, for example. The number of
templates that would have to be stored would be prohibitively large. Generally, template models
are not considered as plausible accounts in psycholinguistics.

One early model of speech perception was that of analysis-by-synthesis (Halle & Stevens, 1962;
Liberman et al., 1967; Stevens, 1960). The basis of analysis-by-synthesis is that we recognize
speech by reference to the actions necessary to produce a sound. The important idea underlying
this model was that when we hear speech, we produce or synthesize a succession of speech sounds
until we match what we hear. The synthesizer does not randomly generate candidates for matching
against the input; it creates an initial best guess constrained by acoustic cues in the input,
and then attempts to minimize the difference between this and the input. This approach had a
few advantages. First, it uses our capacity for speech production to cope with speech
recognition as well. Second, it copes easily with intra-speaker differences, because the
listeners are generating their own candidates. Third, it is easy to show how constraints of all
levels might have an effect; the synthesizer only generates candidates that are plausible. It
will not, for example, generate sequences of sounds that are illegitimate within that language.
One variant of the model, the motor theory, proposes that the speech synthesizer models the
articulatory apparatus and motor movements of the speaker. It effectively computes which motor
movements would have been necessary to create those sounds. Evidence for this model is that the
way sounds are made provides a perfect description of them; for example, all /d/s are made by
tapping the tongue against the alveolar ridge. Note that the specification of the motor
movements must be quite abstract; mute people can understand speech perfectly well (Lenneberg,
1962), and we can understand speech we cannot ourselves produce (e.g., that of people with
stutters, or foreign accents).

Analysis-by-synthesis models suffer from two substantial problems. First, there is no apparent
way of translating the articulatory hypothesis generated by the production system into the same
format as the heard speech in order for the potential match to be assessed. Second, we are
extremely adept at recognizing clearly articulated words that are improbable in their context,
which suggests that speech recognition is primarily a data-driven process. In summary, Clark
and Clark (1977) argued that this theory is underspecified and has little predictive power.
Nevertheless, in recent years motor theories of perception have seen something of a resurgence.
They do have the advantage that matching the auditory signal to motor representations for
producing our own speech provides a means for categorizing the acoustic signal; indeed, some
researchers go so far as to argue that these motor representations have a privileged role in
language processing, and that perceiving speech resembles perceiving motor gestures, in the
sense that the goal of speech perception is recognizing which vocal tract movements could give
rise to the sounds, rather than the more abstract identification of the sounds themselves
(Galantucci, Fowler, & Turvey, 2006; Liberman & Whalen, 2000). Imaging data show that the motor
areas of the brain become activated during speech perception (Watkins & Paus, 2004), although
of course this activation does not mean that the motor areas play a causal role in perception.
Although analysis-by-synthesis cannot be the whole story of speech perception, it does seem as
though motor processes play some role.

We are left with two basic types of model of word recognition. The cohort model of
Marslen-Wilson and colleagues emphasizes the bottom-up nature of word recognition. The
connectionist model TRACE emphasizes its interactive nature, and allows feedback between levels
of processing. Partly in response to TRACE, Marslen-Wilson modified the cohort model, so we
should distinguish between early and late versions of it.
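
The analysis-by-synthesis idea discussed earlier in this section (generate candidate sound sequences, compare each with the input, and keep the closest match) can be illustrated with a very small sketch. Everything in it is an assumption made for illustration: the two-number "acoustic" prototypes, the tiny set of legal sequences, and the names `PROTOTYPES`, `LEGAL_SEQUENCES`, `synthesize`, `mismatch`, and `analyse_by_synthesis` are hypothetical and do not come from the original models. The sketch also searches every legal candidate exhaustively, whereas the model as described starts from a best guess and refines it.

```python
# Toy analysis-by-synthesis loop. All numbers and names are illustrative
# assumptions, not values taken from the analysis-by-synthesis literature.

# Hypothetical two-dimensional "acoustic" prototype for each phoneme.
PROTOTYPES = {
    "p": (0.1, 0.9),
    "b": (0.3, 0.9),
    "t": (0.1, 0.4),
    "i": (0.8, 0.2),
    "a": (0.6, 0.5),
}

# The synthesizer only proposes sequences that are legal in this toy language.
LEGAL_SEQUENCES = ["pit", "bit", "pat", "bat", "tip", "tap"]


def synthesize(sequence):
    """Return the acoustic pattern the listener would produce for a sequence."""
    return [PROTOTYPES[phoneme] for phoneme in sequence]


def mismatch(candidate, heard):
    """Sum of squared differences between a synthesized and a heard pattern."""
    return sum(
        (c - h) ** 2
        for cand_vec, heard_vec in zip(candidate, heard)
        for c, h in zip(cand_vec, heard_vec)
    )


def analyse_by_synthesis(heard):
    """Pick the legal sequence whose synthesized pattern is closest to the input."""
    candidates = [s for s in LEGAL_SEQUENCES if len(s) == len(heard)]
    return min(candidates, key=lambda s: mismatch(synthesize(s), heard))


# A slightly noisy token of "bit": no stored prototype matches it exactly.
noisy_bit = [(0.28, 0.85), (0.75, 0.25), (0.12, 0.45)]
print(analyse_by_synthesis(noisy_bit))
```

Unlike a template matcher, the sketch never needs an exact stored copy of the input; it only needs the candidate that minimizes the difference. It also shows the first problem noted above in miniature: the comparison works here only because the synthesized candidates and the heard input are assumed to share one format.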
The cohort model

The cohort model of spoken word recognition was proposed by Marslen-Wilson and Welsh (1978;
Marslen-Wilson, 1984, 1987) (see Figure 9.2). The central idea of the model is that as we hear
speech, we set up a cohort of possible items the word could be. Items are then eliminated from
this set until only one is left. This is then taken as the word currently being recognized. We
should distinguish an early version of the model (Marslen-Wilson, 1984), which permitted more
interaction, from a later version (Marslen-Wilson, 1989, 1990), where processing was more
autonomous and the recognition system was better able to recover if the beginnings of words
were degraded.

There are three stages of processing in the cohort model. First, in the access stage the
perceptual representation is used to activate lexical items, and thereby to generate a
candidate set of items. This set of candidates is known as the cohort. The beginning of the
word is particularly important in generating the cohort. Second, there is a selection stage
when one item only is chosen from this set. Third, there is an integration stage in which the
semantic and syntactic properties of the chosen word are utilized, for example in integrating
the word into a complete representation of the whole sentence. The access and selection stages
are prelexical, and the integration stage is postlexical. Like Morton’s logogen model (see
Chapter 6), the original cohort model is based on parallel, interactive, direct access, but
whereas logogens passively accumulate positive evidence, words in the cohort actively seek to
eliminate themselves. On the presentation of the beginning of a word, a “word-initial cohort”
of candidate words is set up. These are then actively eliminated by all possible means,
including further phonological evidence, and semantic and syntactic context. In particular, as
we hear increasing stretches of the word, candidates are eliminated.

Remember that the uniqueness point is the point at which a word can be distinguished uniquely
from all similar words. It is around this point that the most intense processing activity
occurs. Consider the following increasing segments of a word (7–12). Obviously when we hear /t/
alone (7) there are many possible words; the cohort will be very large. The next segment (8)
reduces the cohort somewhat, but it will still be very large. With more information (9) the
cohort of possible items is reduced still further, but there are still a number of items the
word might be (e.g., “trespass,” “trestle,” “trend,” “trench”). The next phoneme (in 10)
reduces the cohort to only three (“trespass,” “tress,” and “trestle”), but it is only at (12)
that the cohort is reduced to one word (or more properly, one root morpheme): “trespass.” This
point is called the uniqueness point.
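
The narrowing of the cohort as more of the word is heard can be sketched in a few lines. This is a minimal illustration under loud assumptions: the seven-word lexicon is invented, spelling stands in for phonemes, and the names `LEXICON`, `cohort`, and `first_unique_prefix` are hypothetical. Because the lexicon is so small, the point at which a single candidate remains here should not be read as the true uniqueness point of any word.

```python
# Toy cohort narrowing. The mini-lexicon is an illustrative assumption, and
# spelling is used as a stand-in for phonemes.
LEXICON = ["tap", "tree", "trend", "trench", "trespass", "tress", "trestle"]


def cohort(heard_so_far, lexicon=LEXICON):
    """Return the words still consistent with the input heard so far."""
    return [word for word in lexicon if word.startswith(heard_so_far)]


def first_unique_prefix(word, lexicon=LEXICON):
    """Return the shortest prefix of `word` that leaves it alone in the cohort.

    Returns None if the word never becomes the only candidate.
    """
    for end in range(1, len(word) + 1):
        if cohort(word[:end], lexicon) == [word]:
            return word[:end]
    return None


# Present increasing stretches of the word and watch the cohort shrink.
for prefix in ["t", "tr", "tre", "tres", "tresp"]:
    print(prefix, cohort(prefix))

print(first_unique_prefix("trespass"))
```

In this sketch elimination is all-or-none and driven purely by the input, which corresponds most closely to the bottom-up part of the original model; neither context nor graded activation is represented.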
(7) /t/
(8) /tr/
(9) /tre/
(10) /tress/
(11) /tresp/
(12) /trespass/

FIGURE 9.2 Cohort model of word recognition (Marslen-Wilson). Prelexical: ACCESS STAGE
(perceptual representation used to activate lexical items, thus generating a candidate set of
items; the cohort), followed by SELECTION STAGE (one item only is chosen from this set).
Postlexical: INTEGRATION STAGE (in which the semantic and syntactic properties of the chosen
word are utilized).

It is important to note that the recognition point does not have to coincide with the
uniqueness point. Suppose we heard the start of a sentence “The poacher ignored the sign not to
tres-.” In the early version of this model, at this point the context might be sufficiently
strong to eliminate all other words apart from “trespass” from the cohort. Hence it could be
recognized before its uniqueness point. The early version of the model
was very interactive in this respect; context is clearly affecting the prelexical selection
stage. The cost of all this is that sometimes strong contextual bias might lead to error. On
the other hand, if the sensory information is poor, the recognition point might not be until
well after a word’s uniqueness point. Indeed, the uniqueness point and recognition point of a
word are only likely to coincide in the case of a very clear, isolated word.

In a revision of the basic model (e.g., Marslen-Wilson, 1989), context only affects the
integration stage. The model has bottom-up priority, meaning that context cannot be used to
restrict which items form the initial cohort. Bottom-up priority is a feature of both the early
and late versions of the cohort model, but in the later version, context cannot be used to
eliminate members of the cohort before the uniqueness point. This change was motivated by
experimental data (from the gating task to be discussed later) that suggested that the role of
context is more limited than was originally thought: Context cannot be used to eliminate
candidates at an early stage. Another important modification in the later version of the cohort
model is that the elimination of candidates from the cohort no longer becomes all-or-none. This
counters one objection to the original model: What happens if the start of a word is distorted
or misperceived? This would have prevented the correct item from being in the word-initial
cohort, yet we can sometimes overcome distortions even at the start of a word. Suppose we hear
a word like “bleasant” (e.g., as in “the dinner was very bleasant”). Although we might be
slowed down, we can still recover to identify the word as “pleasant.” (For example, a model
such as TRACE, described later, will successfully identify “bleasant” as “pleasant” because the
degree of overlap is high and there is no better word candidate.) Hence, in the revised model
degree of overlap is important, although the beginnings of words are particularly important in
generating the cohort. Also in the revised cohort model, in the absence of further positive
information, candidates gradually decay back down to their normal resting state. They can be
revived again by subsequent positive information. The activation level of contextually
inappropriate candidates decays: context disposes, not proposes. Lexical candidates that are
contextually appropriate are integrated into the higher level representation of the sentence.
Sentential context cannot override perceptual hypotheses, but only has a late effect when one
candidate is starting to emerge as the likely winner. The frequency of a word affects the
activation level of candidates in the early stages of lexical access. The rate of gain of
activation is greater for higher frequency words. There are relative frequency effects within
the initial cohort, so that being in the cohort is not all-or-none, but instead items vary
along a continuum of activation. The most recent version of the model (Marslen-Wilson & Warren,
1994) emphasizes the direct access of lexical entries on the basis of an acoustic analysis of
the incoming speech signal.

Experimental tests of the cohort model
Marslen-Wilson and his colleagues have used a number of experimental tasks to gather evidence
for the cohort model. Marslen-Wilson and Welsh (1978) used a technique known as shadowing to
examine how syntax and semantics interact in word recognition. In this task, participants have
to listen to continuous speech and repeat it back as quickly as possible (typically after a 250
ms delay). The speech samples have deliberate mistakes in them: distorted sounds so that
certain words are mispronounced. Participants are not told that there are mispronunciations,
but are told they have to repeat back the passage of speech as they hear it. But Marslen-Wilson
and Welsh found that participants often (about 50% of the time) repeat these back as they
should be rather than as they actually are, and without any audible disruption to the fluency
of their speech. That is, we find what are called fluent restorations, such as producing
“travedy” as “tragedy.” (On a small proportion of trials participants restored words after a
hesitation; these non-fluent hesitations, along with errors, were excluded from further
analysis.) The more distorted a sound is, the more likely you are to get an exact repetition.

In Marslen-Wilson and Welsh’s experiment there were three variables of interest. The first
variable was the size of the discrepancy between
the target and the erroneous word. This discrepancy was measured in terms of the number of
distinctive features changed in the deliberate error (either one feature, as in “trachedy,” or
three features, as in “travedy”). The second variable was the lexical constraint, which
reflected the number of candidates available at different positions in the word by manipulating
the syllable position on which the error was located (first or third syllable). The third
variable was the context (the word involved was a probable or improbable continuation of the
start of the sentence). An example of a high-constraint context was “Still, he wanted to smoke
a cigarette,” and of a low-constraint case, “It was his misfortune that they were stationary.”

Marslen-Wilson and Welsh found that most of the fluent restorations were made when the
distortion was slight, when the distortion was in the final syllable, and when the word was
highly predictable from its context. On the other hand, most of the exact reproductions occur
with greater distortion when the word is relatively unconstrained by context. In a suitable
constraining context, listeners make fluent restorations, even when deviations are very
prominent. These results were interpreted as demonstrating that the immediate percept is the
product of both bottom-up perceptual input and top-down contextual constraints. Shadowing
experiments showed that both syntactic and semantic analyses of speech start to happen almost
instantaneously, and are not delayed until a whole clause has been heard (Marslen-Wilson, 1973,
1975, 1976).

We do not pay attention equally to all parts of a word. The beginning of the word, particularly
the first syllable, is especially salient. This was demonstrated by the listening for
mispronunciations task (Cole, 1973; Cole & Jakimik, 1980). In this task participants listen to
speech where a sound is distorted (e.g., “boot” is changed to “poot”), and detect these
changes. Consistent with the shadowing task, participants are more sensitive to changes to the
beginning of the words.

Indeed, word fragments that match a word from the onset are nearly as effective a prime as the
word itself. For example, “capt-” is almost as good a prime of the word “ship” as the word
“captain” (Marslen-Wilson, 1987; Zwitserlood, 1989). On the other hand, rhyme fragments of
words produce very little priming. For example, neither a word (“cattle”) nor a derived nonword
(“yattle”) prime “battle” (Marslen-Wilson, 1993; Marslen-Wilson & Zwitserlood, 1989).
(Marslen-Wilson, 1993, argued on this basis that the cohort model gives a better account than
that of the TRACE model described later. According to TRACE, “cattle” should compete with
“battle” through the lateral inhibition connections, but as there is no word match for “yattle”
it should not compete, and may even facilitate.)

The gating task (Grosjean, 1980; Tyler, 1984; Tyler & Wessels, 1983) involves presenting
gradually increasing amounts of a word, as in examples (7) to (12) given earlier. This enables
the isolation points of words to be found: This is the mean time it takes from the onset of a
word for listeners to be able to guess it correctly. This task demonstrates the importance of
context: Participants need an average of 333 ms to identify a word in isolation, but only 199
ms in an appropriate context, such as “At the zoo, the kids rode on the” for the word “camel”
(Grosjean, 1980). On the other hand, these studies also showed that candidates are generated
that are compatible with the perceptual representation up to that point, but that are not
compatible with the context. Strong syntactic and semantic constraints do not prevent the
accessing, at least early on, of word candidates that are compatible with the sensory input but
not with the context. Hence sentential context does not appear to have an early effect.

In a visual equivalent of the gating task, participants looked at a computer screen showing
pictures of a clown, cloud, dog, and parrot, and were instructed to “click on the cloud”
(Allopenna, Magnuson, & Tanenhaus, 1998). On hearing the onset “cl-” participants were equally
likely to look at the picture of the cloud and that of the clown, but then as soon as they
heard further disambiguating information they looked at just the target picture.

Although context might not be able to affect the generation of candidates, it might be able to
remove them. A technique known as cross-modal priming enables the measurement of contextual
effects at different times in recognizing a word
(Zwitserlood, 1989). This technique necessitates participants listening to speech over
headphones while simultaneously looking at a computer screen to perform a lexical decision task
to visually presented words. The relation between the word on the screen and the speech, and
the precise time relation between the two, can be systematically varied. Zwitserlood showed
that context can assist in selecting semantically appropriate candidates before the word’s
recognition point. Consider the word “captain.” (Zwitserlood’s experiment actually used Dutch
materials, where the equivalent item is “kapitein.”) Participants heard differing amounts of
the word before either a related or a control word appeared on a computer screen. At the point
of hearing just “cap,” the word is not yet unique. It is consistent with a number of
continuations, including the word “captain” but also a competitor, “capital.” Zwitserlood found
facilitation for the recognition of both relatives of the target (e.g., “ship”) and competitors
(“money” for “capital”). By the end of the word, however, only relatives of the target could be
primed. There was also more priming by the more frequent candidate than by less frequent
candidates, as predicted by the cohort model. Importantly, constraining context did not have
any effect early on in the word: Even if context strongly favors a word so that its competitors
are implausible (e.g., as in “With dampened spirits the men stood around the grave. They
mourned the loss of their captain”), they nevertheless still prime their neighbors. After a
word’s isolation point, however, we do find effects of context. Context then has the effect of
boosting the word’s activation level relative to its competitors. These results support the
ideas that context cannot override perceptual hypotheses, and that sentential context has a
late effect, on interpreting a word and integrating it with the syntax and semantics of the
sentence. Context speeds up this process of integration.

Recent imaging data support the idea that semantics plays a role in selecting among candidates.
In a lexical decision task, high imageability words generated stronger activation than low
imageability words, in competitive contexts (Zhuang, Randall, Stamatakis, Marslen-Wilson, &
Tyler, 2011). The imaging work now shows that selection is not driven purely by the phonetic
properties of the incoming words.

The influence of lexical neighborhoods
In the cohort model, the evaluation of competitors to the target word takes place in parallel,
and hence the number of competitors (the cohort size) at any time should not have any effect on
the recognition of the target (Marslen-Wilson, 1987). However, data from Goldinger, Luce, and
Pisoni (1989) and Luce, Pisoni, and Goldinger (1990) suggest that cohort size does affect the
time course of word recognition. Luce et al. found that the structure of a word’s neighborhood
affects the speed and accuracy of auditory word recognition on a range of tasks, including
identifying words and performing an auditory lexical decision task. The number and
characteristics of a word’s competitors (such as their frequency) are very important. For
example, we are less able to identify high-frequency words that have many high-frequency
neighbors than words with fewer neighbors or low-frequency neighbors. Luce and his colleagues
argue that the number of competitors, what they call the neighborhood density, influences the
decision. Words with many neighbors take longer to identify and produce more errors because of
competition.

Marslen-Wilson (1990) examined the effect of the frequency of competitors on recognizing words.
He found that the time it takes you to recognize a word such as “speech” does not just depend
on the relative uniqueness points of competitors (such as “speed” and “specious”) in the
cohort, but also on the frequency of those words. Hence, you are faster to identify a
high-frequency word that only has low-frequency neighbors than vice versa. The rise in
activation of a high-frequency word is much greater than for a low-frequency one.

Phonological neighborhood is not the only factor that can affect auditory recognition.
Orthographic neighborhood can also affect auditory recognition, but does so in a facilitatory
fashion. That is, spoken words with many visually similar neighbors are faster to identify than
spoken words with few neighbors (Ziegler, Muneaux, & Grainger, 2003). Somehow the printed word
can sometimes affect spoken word recognition, presumably because somewhere in the system
sublexical units, or word units, or both, for different modalities are linked.

Evaluation of the cohort model
The cohort model has changed over the years, and in the light of more recent data it places
less emphasis on the role of context. In the early version of the model, context cannot affect
the access stage, but it can affect the selection and integration stages. In the later version
of the model, context cannot affect selection but only affects integration. In the revised
version (Marslen-Wilson, 1987), elements are not either “on” or “off,” but have an activation
level proportional to the goodness-of-fit between the element and the acoustic input, so that a
number of candidates may then be analyzed further in parallel. This permits a gradual decay of
candidates rather than immediate elimination. The model does not distinguish between
provisional and definite identification; there are some probabilistic aspects to word
recognition (Grosjean, 1980). The later version, by replacing all-or-none elimination from the
cohort with gradual elimination, also better accounts for the ability of the system to recover
from errors. A continuing problem for the cohort model is its reliance on knowing when words
start without having any explicit mechanism for finding the starts of words.

TRACE and related models
TRACE is a highly interactive model of spoken word recognition (McClelland & Elman, 1986),
derived from the McClelland and Rumelhart (1981) interactive activation model of letter and
visual word identification (see Chapter 6). Here I will outline only the principal features of
the model. (Once again, if you haven’t studied connectionist models before, I strongly advise
you to read the Appendix carefully at this point if you haven’t already done so.) The most
important characteristic of TRACE is that it emphasizes the role of top-down processing
(context) on word recognition. Hence, lexical context can directly assist acoustic-perceptual
processing, and information above the word level can directly influence word processing.

FIGURE 9.3 Architecture of the TRACE model of speech recognition. Each rectangle represents one
unit. Units at different levels span different portions of the speech trace. In this example,
the phrase “tea cup” has been presented to the model. Its input values on three phonetic
features are illustrated by the blackened histograms. From McClelland, Rumelhart, and the PDP
Research Group (1986).
[Figure panels, top to bottom: Words; Phonemes; Features (Vocalic, Diffuseness, Acuteness, each
on a scale from lo to hi).]

TRACE is a connectionist model, and so consists of many simple processing units connected
together. These units are arranged in three levels of processing. It assumes some early, fairly
sophisticated perceptual processing of the acoustic
signal. The level of input units represents phonological features; these are connected to
phoneme units, which in turn are connected to the output units that represent words (see Figure
9.3). Input units are provided with energy or “activated,” and this energy or activation
spreads along the connections in a manner described in the Appendix, with the result that
eventually only one output unit is left activated. The winner in this stable configuration is
the word that the network “recognizes.” Units on different levels that are mutually consistent
have excitatory connections. All connections between levels are bidirectional, in that
information flows along them in both directions. This means that both bottom-up and top-down
processing can occur. There are inhibitory connections between units within each level, which
has the effect that once a unit is activated, it tends to inhibit its competitors. This
mechanism therefore emphasizes the concept of competition between units at the same level. The
model deals with time by simulating it as discrete slices. Units are represented independently
in each time-slot. The model is implemented in the form of computer simulations, and runs of
the simulations are compared with what happens in normal human speech processing. The model
shows how lexical knowledge can aid perception; for example, if an input ambiguous between /p/
and /b/ is given followed by the ending corresponding to -LUG, then /p/ is “recognized” by the
model. Categorical perception arises in the model as a consequence of within-level inhibition
between the phoneme units. As activation provided by an ambiguous input cycles through time,
mutual inhibition between the phoneme units results in the input being classified as at one or
other end of the continuum. TRACE accounts for position effects in word recognition
(word-initial sounds play a particularly important role) because input unfolds over time, so
that word-initial sounds contribute much more to the activation of word nodes than word-final
sounds do (see Figures 9.4 and 9.5).

FIGURE 9.4 Response strengths in TRACE of the units for several words relative to the response
strength of the unit for product, as a function of time relative to the peak of the first
phoneme that fails to match the word. The successive curves coming off the horizontal line
representing the normalized response strength of product are for the words trot, possible,
priest, progress, and produce, respectively.
[Plot: relative response strength (y-axis, 0.00 to 2.00) against processing cycles (x-axis, 0
to 72), with the input phonemes p r a d k t marked along the top.]

FIGURE 9.5 Categorical phoneme perception in TRACE. The top panel shows the level of bottom-up
activation to the phoneme units /g/ and /k/, for each of 12 stimuli (shown on the x-axis). The
lower panel shows the activation for the same phoneme units after cycle 60. Stimuli 3 and 9
correspond to canonical /g/ and /k/, respectively. At cycle 60, the boundary between the
phonemes is much sharper. From McClelland, Rumelhart, and the PDP Research Group (1986).
[Top panel: initial excitation (y-axis) against stimulus number (x-axis). Lower panel: phoneme
node activation (y-axis) against stimulus number (x-axis).]

Evaluation of the TRACE model
TRACE handles context effects in speech perception very well. It can cope with some acoustic
variability, and gives an account of findings such as the phoneme restoration effect and
co-articulation effects. TRACE gives a very good account of lexical context effects. It is good
at finding word boundaries and copes extremely well with noisy input, which is a considerable
advantage, given the noise present in natural language. An attractive aspect of TRACE is that
features that are a problem for older models, such as co-articulation effects in template
models, actually facilitate processing, just as they clearly do in humans, through top-down
processing. As with all computer models, TRACE has the advantage of being explicit.

There are several problems with TRACE, however. There are many parameters that can be
manipulated in the model, and it is possible to level the criticism that TRACE is too powerful,
in that it can accommodate any result. By adjusting some of the parameters, can the model be
made to simulate any data from speech recognition experiments, whatever they show? Moreover,
the way in which the model deals with time, simulating it as discrete slices, is implausible.

Massaro (1989) pointed out a number of problems with the TRACE model. He carried out an
experiment in which listeners had to make a forced-choice decision about which phoneme they
heard, when the sound they heard was on the continuum between /l/ and /r/. The sounds occurred
in the contexts of /s_i/, /p_i/, and /t_i/. The first context favors the identification of /l/,
as there are a number of English words that begin with /sli-/ but no words that begin /sri-/.
The third context favors /r/ because there are words beginning with /tri-/ but not /tli-/.
Finally, the second context favors both phonemes approximately equally, as there are words
beginning with both /pli-/ and /pri-/. Massaro found that the context biases performance so
that, for example, listeners were more likely to classify an ambiguous phoneme as /l/ in the
/s_i/ context and /r/ in the /t_i/ context. The behavior of humans in this task differed from
the behavior of the TRACE network. In particular, in TRACE context has the biggest effect when
the speech signal is most ambiguous, and has less effect when the signal is less ambiguous.
With humans, the effects of context are constant with respect to the ambiguity of the speech
signal. Although McClelland’s (1991) reply accepted many of Massaro’s points, and tried making
the model’s output probabilistic (or stochastic), Massaro and Cohen (1991) found that the
problems persisted even after this modification. Massaro’s work is important in that it shows
that it is possible to make falsifiable predictions about connectionist models such as TRACE.
Massaro argues for a
model where phonetic recognition uses features that serve as an input to a decision strategy
involving variable conjunctions of perceptual features called fuzzy prototypes (see Klatt, 1989,
for more detail). Choosing between these models is difficult, and it is not clear that they are
addressing precisely the same issues: TRACE is concerned with the time course of lexical access,
whereas the fuzzy logic model is more concerned with decision making and output processes
(McClelland, 1991).
The main problem with TRACE is that it is based on the idea that top-down context permeates the
recognition process. The extent to which top-down context influences speech perception is
controversial. In particular, there is also experimental evidence against the types of top-down
processing that TRACE predicts occur in speech processing: Context effects are only really observed
with perceptually degraded stimuli (Burton, Baum, & Blumstein, 1989; McQueen, 1991; Norris, 1994b).
In support of TRACE, Elman and McClelland (1988) reported an experiment showing interactive effects
on speech recognition of the sort predicted by TRACE. They argued that they had demonstrated that
between-level processes can affect within-level processes at a lower level. In particular, they
showed that illusory phonemes created by top-down, lexical knowledge (in a manner analogous to
phoneme restoration) can affect co-articulation (the influence of one sound on a neighboring sound)
operating at the basic sound perception level in the way predicted by simulations in TRACE. Consider
word pairs such as “English dates/gates” or “copious dates/gates,” where the initial phoneme of the
second word was ambiguous, lying on the continuum between /d/ and /g/. The co-articulatory effects
of the final sound of the first word affect the precise way in which we produce the first sound of
the second word. Listeners are sensitive to these co-articulation effects in speech: the effect is
called compensation for co-articulation. In particular, we are more likely to identify the ambiguous
phoneme as a /d/ when it follows a /sh/, as in “English,” but more likely to identify it as a /g/
when following /s/, as in “copious.” So listeners should tend to report hearing “English dates” but
“copious gates.” Elman and McClelland showed that this compensation effect was obtained even when
the final sounds of “English” and “copious” were replaced with a sound halfway between /s/ and /sh/.
At first sight then, the data of Elman and McClelland (1988) support an interactive model rather
than an autonomous one. The lexicon appears to be influencing a prelexical effect (compensation).
There are, however, accounts of the data compatible with the autonomous model. First, it is not
necessary after all to invoke lexical knowledge. Connectionist simulations using strictly bottom-up
processing can learn the difference between /g/ after /s/ and /sh/, and also that /s/ is more likely
to follow one vowel and /sh/ another. That is, there are sequential dependencies between phonemes
that mean that we do not need to invoke lexical knowledge: Some sequences of phonemes are just more
likely (Cairns, Shillcock, Chater, & Levy, 1995; Norris, 1993). Pitt and McQueen (1998) demonstrated
that this sequential information can be used in speech perception. They found compensation for
co-articulation effects on the categorization of stop consonants when they were preceded by
ambiguous fricative sounds at the end of nonwords. For example, the sequence of phonemes in the
nonword “der?” is biased towards an /s/ conclusion, while the sequence in “nai?” is biased towards
a /sh/ conclusion. (In both cases the final sound in fact was halfway between /s/ and /sh/.) The
nonwords were followed by a word beginning with a stop consonant sound along the /t/ to /k/
continuum, from “tapes” to “capes.” The identification of the stop consonant was influenced by the
preceding ambiguous fricative differently depending on the nonword context of the fricative. As the
preceding item was a nonword, lexical knowledge could not be used. The fact that compensation is
still obtained suggests that sequential knowledge about which phonemes co-occur is being used.
TRACE is also poor at detecting mispronunciations. TRACE is a single-outlet model (Cutler, Mehler,
Norris, & Segui, 1987): The only way TRACE can identify phonemes is to see which phonemes are
identified at the phoneme level. However, suppose a mispronounced word is presented. The phonemes
will activate the best match word. This word node will then feed back activation to the phoneme
level, so that the phonemes in the best match
will become activated: The incorrect phonemes will be corrected. But mispronunciations are not
overlooked; they have a distinct adverse effect on performance (Gaskell & Marslen-Wilson, 1998).
Single-outlet models can be contrasted with multiple-outlet models, such as the Race model (Cutler &
Norris, 1979), where two sources of information, the stored and maintained prelexical analysis of
the word, and a word’s lexical entry, compete for output. The decision is made on the basis of
which route produces the answer first, hence the race aspect. Because there are two outlets,
prelexical and lexical, it should be possible to emphasize one rather than the other by shifting
attention. Lexical effects on phoneme processing should be maximized when people pay particular
attention to the lexical outlet, and minimized when they pay particular attention to the prelexical
outlet. This pattern is exactly what is observed, and is difficult for single-outlet models such as
TRACE to account for (Cutler et al., 1987; Norris et al., 2000). For example, the magnitude of the
lexical effect in phoneme monitoring tasks depends on the composition of the other filler items
used in the experiment.
In their review of the literature on context effects on speech recognition, Norris et al. (2000)
argued that feedback is never necessary in speech recognition. Indeed, top-down feedback, they
argue, would hinder recognition. Feedback cannot improve accuracy in processing (indeed, it can
override the detection of mispronunciations and can actually decrease accuracy); it can only speed
up processing. The cost to this increase in speed is a trade-off with accuracy. The crux of the
argument is whether or not there is lexical involvement in phonemic decision making, which are all
tasks where listeners are required to make decisions about sounds, such as phoneme monitoring,
phoneme restoration, and phonetic categorization.
Finally, there is experimental evidence against other assumptions of the model. Frauenfelder,
Segui, and Dijkstra (1990) found no evidence of top-down inhibition on phonemes in a task involving
phoneme monitoring of unexpected phonemes late in a word compared with control nonwords. TRACE
predicts that once a word is accessed, phonemes that are not in it should be subject to top-down
inhibition. TRACE also predicts that targets (e.g., t) in nonwords derived from changed words
(e.g., vocabutary) should be identified more slowly than targets in control nonwords (e.g.,
socabutary) because the actual phoneme competes with the phoneme in the real word (l) because of
top-down feedback. However, there was no difference between the two nonword conditions. Cutler et
al. (1987) found that phoneme monitoring latencies were faster to word-initial phonemes than to
phonemes at the start of nonwords. According to the TRACE model there should be no difference for
phonemes at the start of words and nonwords as activation will not have had time to build up and
feed back to the phoneme level.
TRACE is also unable to account for the findings from subcategorical mismatch experiments
(Marslen-Wilson & Warren, 1994). This task involves cross-splicing the initial consonants and
consonant clusters from matched pairs of words (e.g., “job” and “smob”). Marslen-Wilson and Warren
examined the effect of splicing on lexical decision (is it a word?) and phoneme categorization
(what sort of sound did you hear?). The effect of the cross-splice on nonwords was much greater
when the spliced material came from a word (e.g., an item like “smob,” where the “sm-” component
came from the word “smog”), such that performance was poorer when the cross-spliced nonword came
from a word, but the splicing made little difference to the processing of words. These data are
difficult for many models. They are difficult for independent race models because decisions about
nonwords can only be made by the prelexical route, and therefore should be unaffected by the
lexical status of the items from which the materials are derived. They are difficult for TRACE
because simulations in TRACE show that words should be affected as well as nonwords, and in
nonwords the inhibitory effect should be greater than it actually is. TRACE does poorly because it
cannot use data about the mismatch between two items.
TRACE is successful in accounting for a number of phenomena in speech recognition, and is
particularly good at explaining context effects. Its weakness is that the extent to which its
predictions are supported by data is questionable.
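The mispronunciation argument against single-outlet models can be made concrete with a toy
sketch. The Python below is not the TRACE model; it is a minimal illustration under stated
assumptions: letters stand in for phonemes, the three-word lexicon is invented, and the
`feedback` weight is a hypothetical parameter. It shows how strong lexical feedback to the
phoneme level would overwrite the changed segment in an item like “vocabutary,” so that a single
phoneme-level outlet could no longer report what was actually heard.

```python
# Toy sketch, not the TRACE model: letters stand in for phonemes, and the
# lexicon and feedback weight are hypothetical values chosen for illustration.
LEXICON = ["vocabulary", "voluntary", "vocation"]


def best_match(heard, lexicon):
    """Return the word sharing the most position-aligned segments with the input."""
    return max(lexicon, key=lambda word: sum(a == b for a, b in zip(heard, word)))


def phoneme_level_output(heard, lexicon, feedback):
    """Return the segment string read off a single phoneme-level outlet.

    Each heard segment receives 1.0 of bottom-up activation; the segment that
    the best-matching word expects at that position receives `feedback` extra.
    """
    word = best_match(heard, lexicon)
    output = []
    for position, segment in enumerate(heard):
        activation = {segment: 1.0}
        if position < len(word):
            expected = word[position]
            activation[expected] = activation.get(expected, 0.0) + feedback
        # The most active segment wins at this position.
        output.append(max(activation, key=activation.get))
    return "".join(output)


print(phoneme_level_output("vocabutary", LEXICON, feedback=0.0))  # no feedback
print(phoneme_level_output("vocabutary", LEXICON, feedback=1.5))  # strong feedback
```

With no feedback the outlet reports the string as heard; with the feedback weight set to 1.5 the
lexically expected segment wins and the output is “vocabulary.” The sketch only restates the
logic of the criticism and makes no claim about the activation dynamics of the real model.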
Other connectionist models of speech recognition
Recent networks use recurrent connections from the hidden layer to a context layer to store
information about previous states of the network (Elman, 1990) (see Figure 9.6). This modification
enables networks to encode information about time. Hence, they give a much more plausible account
of the time-based nature of speech processing than does TRACE, which uses fixed time-based units
and therefore finds it difficult to cope with variations in speech rate.
FIGURE 9.6 Architecture of a recurrent network (labels: output units, hidden units, input units,
context units).
Gaskell and Marslen-Wilson (1997, 1998, 2002) extended the cohort model to model the process that
maps between phonological and lexical information. They constructed a connectionist model that
emphasized the distributed nature of lexical representations (unlike TRACE, which uses local
representation) so that information about any one word is distributed across a large number of
processing units. The other important way in which it differed from other connectionist models
such as TRACE is that low-level speech information, represented by phonetic features, is mapped
directly onto lexical forms. There are no additional levels of phonological processing involved
(although there is a layer of hidden units mediating between the feature inputs and the semantic
and phonological output layers).
Gaskell and Marslen-Wilson’s model simulated several important aspects of speech processing.
First, it gave a good account of the time course of lexical access. It showed that multiple
candidates could become activated in parallel. The target word only becomes strongly differentiated
from its competitors close to its uniqueness point. Second, the model successfully simulated the
experimental data of Marslen-Wilson and Warren (1994). Third, unlike other connectionist models
such as TRACE, and like humans, their model shows very little tolerance. As in Marslen-Wilson and
Warren’s (1994) experiment, a nonword such as “smob” that matches a word quite closely (“smog”)
except for the place of articulation of the final segment, and which is constructed so that the
vowels are consistent with the proper target, does not in fact activate the lexical representation
of the word (“smog”) very much. The network requires a great deal of phonetic detail to access
words, just like humans. Gaskell and Marslen-Wilson propose that this feature of the model is a
consequence of the realistic way in which the inputs are presented (with words embedded in a
stream of speech), and the training of the network on a large number of similar phonological
forms. These features force the network to be intolerant about the classification of inputs.
Fourth, because words are represented in a way such that similar items overlap in their
representations, competition between similar items is an essential part of processing. The
simultaneous activation of more than one candidate creates conflict. Gaskell and Marslen-Wilson
present a series of experiments using cross-modal priming that show that competition reduces the
magnitude of the semantic
priming effect. When a word is still ambiguous, for example “capt-,” which could be either
“captain” or “captive,” it is not particularly effective at priming “ship”; it only becomes
effective relatively late, after we have reached the word’s uniqueness point. Note though that
“capt-” still produces some priming; you can access meaning prior to the uniqueness point, which
allows some facilitation of semantically related words, but as you cannot get complete access,
semantic priming is weaker than after the uniqueness point. Finally, the model accounts for the
different pattern of effects found in cross-modal repetition priming and cross-modal semantic
priming. Gaskell and Marslen-Wilson argue that the amount of competition between words depends on
the coherence of the competing set. The candidates activated by a partial sound input will
necessarily sound similar (e.g., captain and captive): the candidate set is coherent. In contrast
the semantic properties of the candidate words will be unrelated. Hence repetition priming can make
direct use of the set of lexical candidates directly activated by the input (e.g., “capt-” is
closely related to “captain” and “captive”). Semantic priming cannot do so, as it generates
multiple unrelated candidate items; the candidate words related to the prime “capt-” include “ship”
and “prisoner,” which are unrelated; this set is incoherent. Furthermore, with incoherent candidate
sets, the more candidates there are, the more competition there will be, while with coherent sets,
the number of candidates matters much less, and hence priming should be less affected by the cohort
set size. Hence competition effects should be much more prominent in cross-modal semantic priming
than in repetition priming, and more sensitive to cohort set size, which is just what was found.
Norris (1990) showed that recurrent networks can identify spoken words at their uniqueness points,
and can also cope with variations in speech rate. However, he noted that, unlike TRACE, recurrent
networks cannot recover if they misidentify parts of words. They have no way of undoing early
decisions about parts of words in a way that TRACE manages to do through competition between whole
words. Norris’s (1994b) SHORTLIST model tries to combine the best of both approaches, with a hybrid
architecture where a recurrent network provides input to an interactive activation network (see
Figure 9.7). The SHORTLIST model is entirely bottom-up and is based on a vocabulary of tens of
thousands of words. Essentially the model views spoken word recognition as a bottom-up race between
similar words. A competition network is created “on the fly” from the output of a bottom-up
recognition network in which candidates detected in the incoming speech stream are allowed to
compete with each other. Only a few words are active enough to be used in the list (hence the
name). The main drawback of this approach concerns the plausibility of creating a new competitive
network at each time step (Protopapas, 1999).
FIGURE 9.7 The pattern of inhibitory connections between candidate words in the SHORTLIST model
(Norris, 1994b). Only the subset of candidates that completely match the input are shown
(candidate words shown: catalog, cattle, cat, log, at). From Norris (1994b).
Given that they argue there is no top-down feedback in speech recognition, Norris, McQueen, and
Cutler (2000) propose a purely data-driven model. They call this model MERGE. MERGE is a
competition-activation model similar to SHORTLIST. In the MERGE model, activation flows from the
prelexical level to the lexicon and to phoneme-decision nodes. Crucially, there is no feedback
between the lexical nodes and the prelexical nodes. However, lexical information can influence the
phoneme-decision nodes. Decisions are made on the basis of merging
these two inputs. Norris et al. provide simulations that show that such a model does a good job of
accounting for a wide range of experimental data. Critics (see commentary in Norris et al.) argue
that merging is a form of interaction, as the phoneme-decision nodes are influenced by lexical
information, and that MERGE is a model specifically about phoneme-decision tasks rather than a
general model of speech recognition.
Comparison of models of spoken word recognition
Let us look again at the three phases of speech recognition we identified and see what the
different models we have examined so far have to say about them. When we hear speech, we have to
do two things. We have to segment the speech stream into words, and we have to recognize those
words. The amount of speech needed to compute the contact representation determines when initial
contact can occur. According to Klatt (1989), contact can be made after the first 10 ms. Models
that use syllables to locate possible word onsets, and which need larger units of speech, will
obviously take longer before they can access the lexicon. Different models also emphasize how
representations make contact with the lexicon. Hence in the cohort model, the beginning of the word
(the first 150 ms) is used to make first contact. In other models (e.g., Grosjean & Gee, 1987), the
more salient or reliable parts of the word, such as the most stressed syllable, are used. All of
these models where initial contact is used to generate a subset of lexical entries have the
disadvantage that it is difficult to recover from a mistake (e.g., a mishearing). Models such as
TRACE, where there is not a unique contact for each word, do not suffer from these problems. Each
identified phoneme (the whole word) contributes to the set of active lexical entries. The cost of
this is that these sets may be very large and this might be computationally costly.
The revised cohort model negates the problem of recovering from catastrophic early mistakes by
allowing gradual activation of candidates rather than all-or-none activation. Furthermore, we have
seen that while the beginnings of words are important in lexical access, the rhyme parts produce no
priming. On the other hand, the evidence for the amount of interaction that TRACE entails is
limited.
The Gaskell and Marslen-Wilson model is very similar to the SHORTLIST model of Norris (1994b). Both
models differ from TRACE in making less use of top-down inhibition and more use of bottom-up
information. SHORTLIST combines the advantages of recurrent nets and TRACE. At present, these types
of connectionist model show how models of spoken word recognition are likely to develop, although
SHORTLIST currently suffers from the problem that it is not clear how interactive activation
networks can be set up quickly “on the fly.”
Virtually all models of word recognition view spoken word recognition as incorporating an element
of competition between the target word and its neighbors. Therefore priming a word should retard
recognition of another sharing the same initial sounds (Monsell & Hirsh, 1998). Unfortunately, the
bulk of the research has shown either facilitation or no effect of priming phonologically related
items, rather than the expected inhibition. Why might this be? Monsell and Hirsh pointed out that
in these studies the lag between the prime and the probe is very brief. It is possible that any
inhibitory effects are cancelled out by short-acting facilitatory effects generated by other
factors, such as processing shared sublexical constituents (such as phonemes or rimes). If this is
the case, then inhibition should be apparent at longer time lags, when the short-lived facilitatory
effects have had time to die away. This is what Monsell and Hirsh observed. In an auditory lexical
decision task, with time lags of 1–5 minutes between prime and target, the response time for a
monosyllabic word preceded by a word sharing its onset and vowel (e.g., “chat” and “chap”)
increased relative to an unprimed control. Similarly, response time increased for polysyllabic
words preceded by another sharing the first syllable (e.g., “beacon” and “beaker”). The effect was
limited to word primes; nonword primes (e.g., “chass” and “beacal”) did not produce this
inhibition. Hence priming phonological competitors does indeed retard
the subsequent recognition of items, but the effect is only manifest when other short-term
facilitatory effects have died down.
Finally, we make use of other types of information when understanding speech. Even people with
normal hearing can make some use of lip-reading. McGurk and MacDonald (1976) showed participants a
video of someone saying “ga” repeatedly, but gave them a soundtrack with “ba” repeated.
Participants reported hearing “da,” apparently blending the visual and auditory information. This
effect suggests that speech perception is the result of the best guess of the whole perceptual
system, using multiple sources of information, among which speech is usually the most important.
THE NEUROSCIENCE OF SPOKEN WORD RECOGNITION
Some difficulty in speech recognition is quite common in adults with a disturbance of language
functions following brain damage. Varney (1984) reported that 18% of such patients had some problem
in discriminating speech sounds. Brain damage can affect most levels of the word recognition
process, including access to the prelexical and the postlexical codes.
There are many cases of patients who have difficulty in constructing the prelexical code. Caplan
(1992) reviews these. For example, brain damage can affect the earliest stages of acoustic-phonetic
processing of features such as voice onset time, or the later stages involving the identification
of sounds based on these features (Blumstein, Cooper, Zurif, & Caramazza, 1977).
Neuropsychological evidence suggests that vowels and consonants are processed by different systems.
Caramazza, Chialant, Capasso, and Miceli (2000) describe two Italian-speaking aphasic patients who
show selective difficulties in producing vowels and consonants. Patient AS produced mainly errors
on vowels, while patient IFA produced mainly errors on consonants. These differences remained even
when other possible confounding factors (such as the degree of sonority, essentially the amount of
acoustic energy in a sound) were taken into account.
Patients with pure word deafness can speak, read, and write quite normally, but cannot understand
speech, even though their hearing is otherwise normal (see Saffran, Marin, & Yeni-Komshian, 1976,
for a case history). Patients with pure word deafness cannot repeat speech and have extremely poor
auditory comprehension. They are impaired at tasks such as distinguishing stop consonants from each
other (e.g., /pa/ from /ba/ and /ga/ from /ka/). On the other hand Saffran et al.’s patient could
identify musical instruments and non-speech noises, and could identify the gender and language of a
recorded voice. This pattern of performance suggests that these people suffer from disruption to a
prelexical, acoustic processing mechanism. A very rare and controversial variant of this is called
word meaning deafness. Patients with word meaning deafness show the symptoms of pure word deafness
but have intact repetition abilities. The most famous case of this was a patient living in
Edinburgh in the 1890s (Bramwell, 1897/1984), although more recent cases have been reported by
Franklin, Howard, and Patterson (1994), and Kohn and Friedman (1986). Pure word deafness shows that
we can produce words without necessarily being able to understand them.
Only one patient (EDE) clearly showed intact acoustic-phonetic processing (and therefore the
ability to construct a prelexical code), but also then had difficulties with lexical access (Berndt
& Mitchum, 1990). This patient performed well on all tests of phoneme discrimination and acoustic
processing, yet made many errors in deciding whether a string of sounds made up a word or not
(e.g., “horse” is a word, but “hort” is not). Nevertheless EDE generally performed well on routine
language comprehension tasks, and Berndt and Mitchum interpreted her difficulties with this
particular task in terms of a short-term memory deficit rather than of lexical access. As yet there
have been no reports of patients who have completely intact phonetic processing but who cannot
access the postlexical code. This might be because so far we have not looked hard enough, or
perhaps have just been unlucky.
SUMMARY
• We can recognize meaningful speech faster and more efficiently than we can identify non-speech
sounds.
• Sounds run together (the segmentation problem), and vary depending on the context in which they
occur (the invariance problem).
• The way in which we segment speech depends on the language we speak.
• We use a number of strategies to segment speech; stress-based segmentation is particularly
important in English.
• Consonants are classified categorically, but it is unclear how early in perception this effect
arises, because listeners are sensitive to differences between sounds within a category.
• The lexicon is our mental dictionary.
• The prelexical code is the sound representation used to access the lexicon.
• There is controversy about whether phonemes are represented directly in the prelexical code, or
whether they are constructed after we access the lexicon.
• Studies of co-articulation effects in words and nonwords suggest that a low-level phonetic
representation is used to access the lexicon directly.
• The lexical identification shift of ambiguous phonemes varies depending on the lexical context.
• Phonemes masked by noise can be restored by an appropriate context.
• There has been debate about whether the lexical identification shift and phoneme restoration
effects are truly perceptual effects or instead reflect later processing.
• Word recognition can be divided into initial contact, lexical selection, and word recognition
phases.
• A spoken word’s uniqueness point is when the stream of sounds is finally unambiguously
distinguishable from all other words.
• We recognize the word at its recognition point; the recognition point does not have to correspond
to the uniqueness point.
• Although the extent to which top-down sentential context has an effect on the early stages of
word recognition is controversial, the preponderance of evidence suggests that context only has
its effects after lexical access.
• Early models of speech recognition included template matching and analysis-by-synthesis.
• According to the cohort model of word recognition, when we hear a word a group of candidates (the
cohort) is set up; as further evidence arrives, the cohort is reduced until only one word remains.
• Later revisions of the cohort model introduced the idea of graded activation rather than
all-or-none membership of the cohort, and reduced the role of contextual effects.
• Evidence for the cohort model comes from studies of fluent restorations in speech, listening for
mispronunciations, and studies using the gating and cross-modal priming techniques.
• The lexical neighborhood comprises all words that sound like a particular word, and can have
effects on its recognition.
• TRACE is a highly interactive connectionist model of spoken word recognition.
• The main difficulty with TRACE is that it assumes more interaction than there is evidence for.
• Models such as SHORTLIST show how bottom-up, data-driven connectionist models can account
for most of the major findings of speech processing research.
• Vowels and consonants are processed by different systems.
• People with pure word deafness cannot understand speech even though their hearing is otherwise
unimpaired and they can read and write quite well.
• People with the rare disorder known as word meaning deafness cannot understand speech even
though they can repeat it back.
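The summary points on the cohort and the uniqueness point can be illustrated with a short
sketch. It is only an approximation under stated assumptions: letters stand in for phonemes, the
four-word lexicon is invented for the example, and the all-or-none matching corresponds to the
original cohort idea rather than to the graded activation of later revisions. The function names
`cohort` and `uniqueness_point` are hypothetical.

```python
# Hypothetical mini-lexicon; letters stand in for phonemes.
LEXICON = ["captain", "captive", "capital", "cat"]


def cohort(heard_so_far, lexicon):
    """Return the candidates still consistent with the input heard so far."""
    return [word for word in lexicon if word.startswith(heard_so_far)]


def uniqueness_point(word, lexicon):
    """Return the 1-based position at which `word` is the only candidate left.

    Returns None if another word still matches the complete input, for
    example when the word is the beginning of a longer word.
    """
    for position in range(1, len(word) + 1):
        if cohort(word[:position], lexicon) == [word]:
            return position
    return None


print(cohort("capt", LEXICON))               # candidates after hearing "capt-"
print(uniqueness_point("captain", LEXICON))  # position where "captain" stands alone
```

In this toy lexicon the fragment “capt-” leaves “captain” and “captive” in the cohort, and
“captain” becomes unique at its fifth segment.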
1. What particular processing problems might people with a different dialect cause a listener?
2. Why might mishearings occur?
3. What sort of special problems might code switching by bilinguals create for speech recognition
by their listeners?
4. What are the main differences between the cohort and SHORTLIST models of spoken word
recognition?
FURTHER READING
Luce (1993) is an introduction to acoustics, the low-level processes of hearing, and how the ear
works. See MacMillan and Creelman (1991) for an introduction to signal detection theory. See Ward
(2010, Chapter 10) for a description of the neuroscience of auditory processing. Remez and Pisoni
(2005) is an edited collection that covers the whole field of speech perception and spoken word
recognition.
The classic textbook by Clark and Clark (1977) has a good description of the earlier models of
speech perception, particularly analysis-by-synthesis. The paper by Frauenfelder and Tyler (1987)
in a special issue on spoken word recognition in the journal Cognition is an introduction to the
issues involved in spoken word recognition. Two collections of papers on speech processing are to
be found in Altmann (1990) and Altmann and Shillcock (1993). Altmann (1997) provides excellent
coverage of speech perception, particularly on the importance of sound perception by infants and
other species.
Ellis and Humphreys (1999) review connectionist models of speech processing. Massaro (1989)
provides a critique of connectionist models in general and TRACE in particular. Norris (1994b) is a
good summary of the problems with TRACE, and see Protopapas (1999) for a review of connection-
ist models of speech perception. Grosjean and Frauenfelder (1996) review the methods commonly
used to study spoken word recognition. For a review of the literature on speech recognition, with the
conclusion that speech perception is bottom-up and data-driven, see Norris, McQueen, and Cutler
(2000), with commentaries.
SECTION D
MEANING AND USING LANGUAGE
This section examines the processes of comprehension. How do we extract meaning from what we read
or hear and make use of word order information? How do we represent and make use of the meaning of
words and sentences?
Chapter 10, Understanding the structure of sentences, tackles the complexities of sentence
interpretation and parsing. Once we have recognized words, how do we decide between all the
different roles the words can take: who is doing what to whom? (You may find it useful to read
Chapter 2 again before starting Chapter 10.)
Chapter 11, Word meaning, examines issues involved in the study of semantics, in particular how we
represent the meanings of individual words. Categorization, associations between words, use of
metaphor and idiom, and connectionist modeling of semantics are among the topics addressed.
Chapter 12, Comprehension, looks at what follows after we have identified words and built the
syntactic structure of a sentence. What do we remember of text that we read or hear? How do we know
when to draw inferences or move beyond the literal meaning of the text? This chapter also addresses
the specific problems inherent in understanding spoken conversation.
CHAPTER 10
UNDERSTANDING THE STRUCTURE
OF SENTENCES
INTRODUCTION

I’m going to be honest here; most students find this chapter difficult, and many say they can’t see
the point of parsing. But how do you tell the difference between “Vlad killed Boris” and “Vlad
was killed by Boris”? And when you hear “I saw the Pennines flying to Dundee,” why don’t you
think, “Cor, those Pennines are overhead on their way to Dundee again”? And when you come
across sentences such as “The cop shot the burglar the gun,” how do you know just who had a gun?
These are details that give language its fantastic expressive power.

So far we have largely been concerned with the processing of individual words. What happens
after we recognize a word? When we access the lexical entry for a word, two major types of
information become available: information about the word’s meaning, and information about the
syntactic and thematic roles that the word can take. The goal of sentence interpretation is to
assign thematic roles to words in the sentence being processed: who is doing what to whom (see
Box 10.1). One of the most important guides to thematic roles comes from an analysis of the verb’s
argument structure (sometimes called subcategorization frame). For example, the verb “give” has
the structure AGENT gives THEME to RECIPIENT (e.g., “Vlad gave the ring to Agnes”). Hence verbs
and their argument structures play a central role in parsing. Indeed, people are likely to identify
sentences as being similar on the basis of the main verb rather than on the basis of the subject of
the sentence, with argument structure being particularly important (Bencini & Goldberg, 2000;
Healy & Miller, 1970). To assign thematic roles, at least some of the time we must compute the
syntactic structure of the sentence, a process known as parsing. The first step in parsing is to
determine the syntactic category to which each word in the sentence belongs (e.g., noun, verb,
adjective, adverb, and so on). We then combine those categories to form phrases. An important
step in parsing is to determine the subject of the sentence (what the sentence is about). From
such information about individual words we start to construct a representation of the meaning of
the sentence we are reading or hearing. This chapter is about the process of assembling this
representation.

Box 10.1 Thematic roles
Agent       The instigator of an action (corresponding to the subject, usually animate)
Theme       The thing that has a particular location or change of location
Recipient   The person receiving the theme
Location    Where the theme is
Source      Where the theme is coming from
Goal        Where the theme is moving to
Time        Time of the event
Instrument  The thing used in causing the event
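As a rough illustration of how a verb’s argument structure could guide the assignment of the roles
in Box 10.1, here is a minimal Python sketch. The lexicon ARGUMENT_STRUCTURES, the function
assign_roles, and the choice to label the thing being chased as a THEME are all assumptions made
for this example; they are not part of any model described in this chapter. The sketch also takes
the constituents (subject, object, prepositional phrases) as already identified, which is exactly
the work that parsing has to do.

```python
# Hypothetical argument-structure lexicon (illustrative only). Each verb maps
# grammatical positions onto the thematic roles it assigns.
ARGUMENT_STRUCTURES = {
    "gave": {"subject": "AGENT", "object": "THEME", "to": "RECIPIENT"},
    "chased": {"subject": "AGENT", "object": "THEME"},
}


def assign_roles(subject, verb, obj=None, prep_phrases=None, passive=False):
    """Map already-identified constituents onto thematic roles for one verb."""
    frame = ARGUMENT_STRUCTURES[verb]
    prep_phrases = dict(prep_phrases or {})
    if passive:
        # In a passive, the surface subject is the underlying object and the
        # "by" phrase (if present) is the underlying subject.
        subject, obj = prep_phrases.pop("by", None), subject
    roles = {}
    if subject is not None:
        roles[frame["subject"]] = subject
    if obj is not None:
        roles[frame["object"]] = obj
    for preposition, noun_phrase in prep_phrases.items():
        if preposition in frame:
            roles[frame[preposition]] = noun_phrase
    return roles


# "Vlad gave the ring to Agnes"
give_roles = assign_roles("Vlad", "gave", "the ring", {"to": "Agnes"})
# "The ghost chased the vampire" and "The vampire was chased by the ghost"
active_roles = assign_roles("the ghost", "chased", "the vampire")
passive_roles = assign_roles(
    "the vampire", "chased", prep_phrases={"by": "the ghost"}, passive=True
)
print(give_roles)
print(active_roles == passive_roles)
```

The active and passive calls differ in word order but yield the same role assignment, which is the
sense in which word order alone does not fix who is doing what to whom.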
When we hear and understand a sentence, information about the word order is often crucial (at
least in languages such as English). This is information about the syntax of the sentence.
Sentences (1) and (2) have the same word order structure but different meanings; (1) and (3) have
different word order structures but the same meaning:

(1) The ghost chased the vampire.
(2) The vampire chased the ghost.
(3) The vampire was chased by the ghost.

A number of important questions arise about parsing and the human sentence parsing mechanism.
How does parsing operate? Why are some sentences more difficult to parse than others? What
happens to the syntactic representation after parsing? Why are sentences assigned the structures
that they are? How many stages of parsing are there? What principles guide the operation of these
stages? What happens if there is a choice of possible structures at any point? At what stage is
non-structural (semantic, discourse, and frequency-based) information used? This last question is
another manifestation of the issue of whether language processes are modular or not. Is there an
enclosed syntactic module that uses only syntactic information to parse a sentence, or can other
types of information guide the parsing process? Any account of parsing must be able to specify why
sentences are assigned the structure that they are, why we are biased to parse structurally
ambiguous sentences in a certain way, and why some sentences are harder to parse than others.

We should distinguish between autonomous and interactive models of parsing, and one-stage and
two-stage models. In autonomous models, the initial stages of parsing at least can only use
syntactic information to construct a syntactic representation. According to interactive models,
other sources of information (e.g., semantic information) can influence the syntactic processor at
an early stage.

In one-stage models, syntactic and semantic information are both used to construct the syntactic
representation in one go. In two-stage models, the first stage is invariably seen as an autonomous
stage of syntactic processing. Semantic information is used only in the second stage. Hence the
question about the number of stages is really the same question as whether parsing is modular or
interactive.

The goal of understanding is to extract the meaning from what we hear or read. Syntactic
processing is only one stage in doing this, but it is nevertheless an important one. Whether it is
always an essential one is an important issue. There is, however, another reason why we should
study syntax. Fodor (1975) argued that there is a “language of thought” that bears a close
resemblance to our surface language. In particular, the syntax that governs the language of
thought may be very similar or identical to that of external language. Studying syntax may
therefore provide a window onto fundamental cognitive processes.

Different languages use different syntactic rules. English in particular is a strongly
configurational language whose interpretation depends heavily on word order. In inflectional
languages such as German, word order is less important. It is therefore possible that the
predominance of studies that have examined parsing in English may have given a misleading view of
how human parsing operates. For this reason, an important recent development has been the study
of parsing in languages other than English. Most psycholinguists hope and expect that the
important parsing mechanisms will be common to speakers of all languages.

By the end of this chapter you should:

• Know that parsing is incremental.
• Understand how we assign syntactic structures to ambiguous sentences.
• Be able to evaluate the extent to which parsing is autonomous or interactive.
• Understand the importance of verbs in parsing.
• Understand how brain damage can disrupt parsing.

DEALING WITH STRUCTURAL AMBIGUITY

My local newspaper, The Dundee Courier, recently had a headline that read “Police seek
orange attackers.” Do you think that the headline meant “Police seek attackers who are orange,”
“Police seek attackers of an orange,” or “Police seek attackers who attacked with an orange”? (It
was meant to be the last of these.) Here is another example: “Enraged cow injures farmer with
axe.” In this example the ambiguity arises because the prepositional phrase “with axe” could be
attached to either “farmer” or “injures”; that is, there are two possible structures for this
sentence. So, as well as being poorly written, these sentences are ambiguous.

It is difficult to discern the operations of the processor when all is working well. For this
reason, most research on parsing has involved syntactic ambiguity because ambiguity causes
processing difficulty. Studying syntactic ambiguity is an excellent way of discovering how
sentence processing works.

There are different types of ambiguity involving more than one word. We have the bracketing
ambiguity of example (4), which could be interpreted either in the sense of (5) or in the sense
of (6):

(4) old men and women leave first
(5) ([old men] and women)
(6) (old [men and women])

More complex are structural ambiguities associated with parsing, such as in sentence (7). What
was done yesterday: Boris saying or Vlad finishing? Although both structures are equally
plausible in (7), this is not the case in (8):

(7) Boris said that Vlad finished it yesterday.
(8) I saw the Alps flying to Romania.

Many of us would not initially recognize a sentence such as (8) as ambiguous. On consideration,
this might be because one of its two meanings is so semantically anomalous (the interpretation
that I looked up and saw a mountain range in the sky flying to a country) that it does not appear
even to be considered. But psychology has shown us many times that we cannot rely on our
intuitions. Recording eye movements has been particularly important in studying parsing. The bulk
of evidence shows that we spend no longer reading the ambiguous regions of sentences than the
unambiguous regions of control sentences, but we often spend longer in reading the
disambiguation region.

The central issue in parsing is when different types of information are used. In principle there
are two alternative parse trees that could be constructed for (8). We could construct one of them
on purely syntactic grounds, and then decide using semantic information whether it makes sense or
not. If it does, we accept that representation; if it does not, we go back and try again. This is
a serial autonomous model. Alternatively, we could construct all possible syntactic
representations in parallel, again using solely syntactic information, and then use semantic or
other information to choose the most appropriate one (Mitchell, 1994). This would be a parallel
autonomous model. Or we could use semantic information from the earliest stages to guide parsing
so that we only construct semantically plausible syntactic representations. Or we could activate
representations of all possible analyses, with the level of activation affected by the
plausibility of each. The final two are versions of an interactive model.

So far we have just looked at examples of permanent (also called global) ambiguity. In these
cases, when you get to the end of the sentence it is still syntactically ambiguous. Many
sentences are locally (or temporarily, or transiently) ambiguous, but the ambiguity is
disambiguated (or resolved) by subsequent material (the disambiguation region). We are sometimes
made forcefully aware of temporary ambiguity when we appear to have chosen an incorrect syntactic
representation. Consider (9) from Bever (1970). The verb “raced” is ambiguous in that it could be
a main verb (the most frequent sense) or a past participle (a word derived from a verb acting as
an adjective):

(9) The horse raced past the barn fell.
(10) The log floated past the bridge sank.
(11) The ship sailed round the Cape sank.
(12) The old man the boats.

When you hear or read a sentence like (9), it can be interpreted in a straightforward way until
the final unexpected word “fell.” When we come
across the last word we realize that we have been led up the garden path. We realize that our
original analysis was wrong and we have to go back and reanalyze. We have the experience of
having to backtrack. We then arrive at the interpretation of “The horse that was raced past the
barn was the one that fell.” (Some people take some time to work out what the correct
interpretation is.) That is, we initially try to parse it as a simple noun phrase followed by a
verb phrase. In fact, it contains a reduced relative clause. (A relative clause is one that
modifies the main noun, and it is “reduced” because it lacks the relative pronoun “which” or
“that.”) Examples (10), (11), and (12) should also lead you up the garden path. Garden path
sentences are favorite tools of researchers interested in parsing.

[Photo caption: Garden path sentences, such as “The horse raced past the barn fell,” are favorite
tools of researchers interested in parsing.]

Many people might think that garden path sentences are rather odd: Often there would be pauses in
normal speech and commas in written language, which, although strictly optional, are usually
there to prevent the ambiguity in the first place. For example, Rayner and Frazier (1987)
intentionally omitted punctuation in order to mislead the participants’ processors. Deletion of
the complementizer “that” can also produce misleading results (Trueswell, Tanenhaus, & Kello,
1993). In such cases it might be possible that these sentences are not telling us as much about
normal parsing as we think. In fact, reduced relatives are surprisingly common; “that” was
omitted in 33% of sentences containing relative clauses in a sample from the Wall Street Journal
(Elsness, 1984; Garnsey, Pearlmutter, Myers, & Lotocky, 1997; McDavid, 1964; Thompson & Mulac,
1991). There is evidence that appropriate punctuation such as commas can reduce (but not
obliterate) the magnitude of the garden path effect by enhancing the reader’s awareness of the
phrasal structure (Hill & Murray, 2000; Mitchell & Holmes, 1985). In real life, speakers give
prosodic cues to provide disambiguating information, and listeners are sensitive to this type of
information; for example, speakers tend to emphasize the direct-object nouns, and insert pauses
akin to punctuation (Snedeker & Trueswell, 2003). Similarly, disfluencies influence the way in
which people interpret garden path sentences. When an interruption (saying “uh”) comes before an
unambiguous noun phrase, listeners are more likely to think that the noun phrase is the subject
of a new clause rather than the object of an old one (Bailey & Ferreira, 2003). Disfluencies can
help, but only as long as they are in the right place. They are helpful in (13) where they
correctly flag a new subject, but not in (14), where they do not.

(13) Vlad bumped into the ghost and the (um) ghoul told him to be careful.
(14) Vlad bumped into the (um) ghost and the ghoul told him to be careful.

However, just because speakers give prosodic cues, and listeners make use of these cues, does not
mean that speakers always mean to give these cues for the express purpose of helping the listener
(what has been called the audience design hypothesis). Speakers are not always aware that what
they are saying is ambiguous, and they tend to produce the same cues even when there is no
audience (Kraljic & Brennan, 2005). Prosody and pauses probably reflect both the planning needs
of the speaker (see Chapter 13) and a deliberate attempt to provide information to aid the
listener.

Perhaps even more tellingly, McKoon and Ratcliff (2003) showed that sentences with reduced
relatives with verbs like “race” (e.g., (9)) occur in natural language with near-zero
probability. So, although such sentences might technically be syntactically correct, most people
find these
sorts of sentence unacceptable. Indeed, McKoon and Ratcliff go so far as to argue that sentences
with reduced relatives with verbs similar to “race” are ungrammatical. Hence considerable caution
is necessary when drawing conclusions about the syntactic processor from studies of garden path
sentences.

At first sight, our experience of garden path sentences is evidence for a serial autonomous
processor. But what has led us up the garden path? We could have been taken there by either
semantic or syntactic factors. There has been a great deal of research on trying to decide which.
According to the serial autonomy model, we experience the garden path effect because the single
syntactic representation we are constructing on syntactic grounds turns out to be incorrect.
According to the parallel autonomy model, one representation is much more active than the others
because of the strength of the syntactic cues, but this turns out to be wrong. According to the
interactive model, various sources of information support the analysis more than its alternative.
However, later information is inconsistent with these initial activation levels.

EARLY WORK ON PARSING

Early models of parsing were based on Chomsky’s theory of generative grammar. In particular,
psychologists tested the idea that understanding sentences involved retrieving their deep
structure. As it became apparent that this could not provide a complete account of parsing,
emphasis shifted to examining strategies based on the surface structure of sentences.

For early psycholinguists still influenced by ideas from transformational grammar such as the
autonomy of syntax, the process of language understanding was a simple story (e.g., Fodor, Bever,
& Garrett, 1974). First, we identify the words on the basis of perceptual data. Recognition and
lexical access give us access to the syntactic category of the words. We can use this information
to build a parse tree for each clause. It is only when each clause is completely analyzed that we
finally start to build a semantic representation of the sentence. It is often said that “syntax
proposes; semantics disposes.” The simplest approach treats syntax as an independent or
autonomous processing module: Only syntactic information is used to construct the parse tree. Is
this true?

What size are the units of parsing?
What are the constituents used in parsing, and how big are they? Jarvella (1971) showed that
listeners only begin to purge memory of the details of syntactic constituents after a sentence
boundary has been passed (see Chapter 12 for more details). Once a sentence has been processed,
verbatim memory for it fades away very quickly. Hence, perhaps not surprisingly, the sentence is a
major processing unit. Beneath this, the clause also turns out to be an important unit. A clause
is a part of a sentence that has both a subject and predicate. Furthermore, people find material
easier to read a line at a time if each line corresponds to a major constituent (Anderson, 2010;
Graf & Torrey, 1966). There is a clause boundary effect in recalling words: it is easiest to
recall words from within the clause currently being processed, independent of the number of words
in the clause (Caplan, 1972). The processing load is highest at the end of the clause, and eye
fixations are longer on the final word of a clause (Just & Carpenter, 1980).

One of the first techniques used to explore the size of the syntactic unit in parsing was the
click displacement technique (Fodor & Bever, 1965; Garrett, Bever, & Fodor, 1966). The basic idea
was that major processing units resist interruption: We finish what we are doing, and then
process other material at the first suitable opportunity. Participants heard speech over
headphones in one ear, and at certain points in the sentence, extraneous clicks were presented in
the other ear. Even if the click falls in the middle of a real constituent, it should be
perceived as falling at a constituent boundary. That is, the clicks should appear to migrate
according to listeners’ reports. This is what was observed:

(15) That he was* happy was evident from the way he smiled.
For example, a click presented at * in (15) migrated to after the end of the word “happy.” This
is at the end of a major constituent, at the end of the clause. The original study claimed to
show that the clause is a major perceptual unit. The same results were found when all
non-syntactic perceptual cues, such as intonation and pauses, were removed. This suggests that
the clause is a major unit of perceptual and syntactic processing.

However, this interpretation is premature. The participants’ task is a complex one: They have to
perceive the sentence, parse it, understand it, remember it, and give their response. Click
migration could occur at any of these points, not just perception or parsing. Reber and Anderson
(1970) carried out a variant of the technique in which participants listened to sentences that
actually had no clicks at all. They were told that it was an experiment on subliminal perception,
and were asked to say where they thought the clicks occurred. Participants still placed the
non-existent clicks at constituent boundaries. This suggested that click migration occurs in the
response stage: Participants are intuitively aware of the existence of constituent boundaries and
have a response bias to put clicks there. Wingfield and Klein (1971) showed that the size of the
migration effect is greatly reduced if participants can point to places in the sentence on a
visual display at the same time as they hear them, rather than having to remember them. It was
also unclear whether intonation and pausing are as unimportant in determining structural
boundaries as was originally claimed.

Hence these early studies probably reflect the operations of memory rather than the operations of
syntactic processing. It is now agreed that parsing is largely an incremental process: we try to
build structures on a word-by-word basis. That is, we do not sit idly by while we wait for the
clause to finish. The experiments of Marslen-Wilson (1973, 1975) and Marslen-Wilson and Welsh
(1978; see Chapter 9 for details) demonstrate that we try to integrate each word into a semantic
representation as soon as possible. Many studies have shown that syntactic and semantic analysis
is incremental (Just & Carpenter, 1980; Tyler & Marslen-Wilson, 1977). For example, Traxler and
Pickering (1996) found that readers’ processing was disrupted immediately after they read the
word “shot” in (16). The immediate disruption means that they must have processed the sentence
syntactically and semantically up to that point. However, syntactic effects are often delayed so
that they occur a few words later.

(16) That is the very small pistol with which the heartless killer shot the hapless man yesterday
afternoon.

Not only do people construct the representation incrementally, they try to anticipate what is
coming next. In an experiment with Dutch speakers, van Berkum, Brown, Zwitserlood, Kooijman, and
Hagoort (2005) examined the ERPs of people listening to stories. The stories led people to expect
specific nouns. However, if participants then heard a gender-marked adjective immediately before
the expected noun, and the gender was not the right match for the expected noun, the inconsistent
adjectives elicited a marked ERP.

Indeed, people even anticipate properties of upcoming words in the sentence, so that, for
example, the argument structure of a verb can be used to anticipate the subsequent theme (Altmann
& Kamide, 1999). For example, the verb “drink” requires that the direct object is something
drinkable; this information is used to predict what is coming next, and people only pay attention
to drinkable things thereafter (as measured by their eye movements while looking at a picture).
That is, people make anticipatory eye movements towards probable upcoming objects. In a related
experiment, Kamide, Altmann, and Haywood (2003) tracked the eye movements of people looking at a
visual scene. They found that people anticipated a great deal of information, even with more
complex verb structures. For example, given a picture containing a man and a slice of bread, on
hearing “The woman will spread the butter –” people make anticipatory eye movements to the bread
when they hear butter, but to the man when they hear “The woman will slide the butter –.”

In general, language processing interacts with the representation of a visual scene so linguistic
information can determine where we look next (Altmann & Kamide, 2009). The conclusion is
that the processor draws on different sources of information, some of them non-linguistic, at the
earliest opportunity, to construct as full an interpretation as possible.

We saw earlier that Chomsky’s description of language placed great emphasis on the hierarchical
and recursive nature of syntactic structure. There is, however, debate as to which hierarchical
structure is actually used in cognitive processing. In line with the incremental models, Frank
and Bod (2011) found that reading times are best predicted by purely sequential models; people do
not appear to use hierarchical structure information to predict what word is coming next.

In summary, the language processor operates incrementally: It rapidly constructs a syntactical
analysis for a sentence fragment, assigns it a semantic interpretation, and relates this
interpretation to world knowledge (Pickering, 1999). Any delay in this process is usually very
slight. Incremental analysis makes a lot of sense from a processing point of view: Imagine having
to wait until the sentence finishes or the other person stops speaking before you can begin
analyzing what you have seen or heard.

Parsing strategies based on surface-structure cues
The surface structure of the sentence often provides a number of obvious cues to the underlying
syntactic representation. One obvious approach is to use these cues and a number of simple
strategies that enable us to compute the syntactic structure. The earliest detailed expositions
of this idea were by Bever (1970) and Fodor and Garrett (1967). These researchers detailed a
number of parsing strategies that used only syntactic cues. Perhaps the simplest example is that
when we see or hear a determiner such as “the” or “a,” we know a noun phrase has just started. A
second example is based on the observation that although word order is variable in English, and
transformations such as passivization can change it, the common structure noun–verb–noun often
maps on to what is called the canonical sentence structure SVO (subject–verb–object). That is, in
most sentences we hear or read, the first noun is the subject, and the second one the object. In
fact, if we made use of this strategy we could get a long way in comprehension. This is called
the canonical sentence strategy. We try the simpler strategies first, and if these do not work,
we try other ones. If the battery of surface structure strategies becomes exhausted by a
sentence, we must try something else.

Fodor, Bever, and Garrett (1974) developed this type of approach in one of the most influential
works in the history of psycholinguistics. They argued that the goal of parsing was to recover
the underlying, deep structure of a sentence. As it had been shown that this was not done by
explicitly undoing transformations, it must be done by perceptual heuristics; that is, using our
surface structure cues. However, there is little evidence that deep structure is represented
mentally independently of meaning (Johnson-Laird, 1983). Nevertheless, the general principle that
when we parse we use surface structure cues has remained influential, and has been increasingly
formalized.

Two early accounts of parsing
Kimball (1973) also argued that surface structure provides cues that enable us to uncover the
underlying syntactic structure. He proposed seven principles of parsing to explain the behavior
of the human sentence parsing mechanism. He argued that we initially compute the surface
structure of a sentence guided by rules that are based on psychological constraints such as
minimizing memory load. He argued that these principles explained why sentences are assigned the
structure that they are, why some sentences are harder to parse than others, and why we are
biased to parse many structurally ambiguous sentences in a certain way.

The first principle is that parsing is top-down, except when a conjunction (such as “and”) is
encountered. It means that we start from the sentence node and predict constituents. To avoid an
excessive amount of backtracking, the processor employs limited lookahead of one or two words.
For example, if you see that the first word of the next constituent is “the,” then you know that
you are parsing a noun phrase.

The second principle is called right association, which is that new words are preferentially
attached to the lowest possible node in the structure constructed so far. This places less of a
load on memory. Consider (17):

(17) Vlad figured that Boris wanted to take the pet rat out.

Here we attach “out” to the right-most available constituent, “take” rather than “figured.” This
means that although this structure is potentially ambiguous, we prefer the interpretation “take
out” to “figured out” (see Figure 10.1). Right association gives English its typically
right-branching structure, and it also explains why structures that are not right-branching are
more difficult to understand (e.g., “the ghost who Vlad expected to leave’s ball”).

Kimball’s third principle was new nodes. Function words signal a new phrase. The fourth principle
is that the processor can only cope with nodes associated with two sentence nodes at any one
time. For example, center-embedding splits up noun phrases and verb phrases associated with the
sentences so that they have to be held in memory. When there are two embedded clauses, three
sentence nodes will have to be kept active at once. Hence sentences of this sort, such as (18),
will be difficult, but corresponding right-branching paraphrases such as (19) cause no
difficulty, because the sentence nodes do not need to be kept open in memory:

(18) The vampire the ghost the witch liked loved died.
(19) The witch liked the ghost that loved the vampire that died.

The fifth principle is that of closure, which says that the processor prefers to close a phrase
as soon as possible. The sixth principle is called fixed structure. Having closed a phrase, it is
computationally costly to reopen it and reorganize the previously closed constituents, and so
this is avoided if possible. This principle explains our difficulty with garden path sentences.
The final principle is the principle of processing. When a phrase is closed it exits from
short-term memory and is passed on to a second stage of deeper, semantic processing. Short-term
memory has limited capacity, and details of the syntactic structure of a sentence are very
quickly forgotten.
[Figure: tree diagram with node labels S, NP, VP, V, and PART over the words “Vlad figured that
Boris wanted to take the pet rat out”]

FIGURE 10.1 Alternative structures for the sentence “Vlad figured that Boris wanted to take the
pet rat out,” showing how right association leads us to attach “out” to the right-most verb
phrase node (“take”) rather than to the higher verb node (“figured”). S = sentence; NP = noun
phrase; VP = verb phrase; V = verb; PART = particle.
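Right association can be made concrete with a small Python sketch that attaches an incoming word
to the lowest suitable node on the right edge of the tree built so far, using sentence (17). The
class Node and the functions right_edge, head_verb, and attach_by_right_association are
hypothetical names invented for this illustration, and the tree is a simplified version of
Figure 10.1 that leaves out “that” and “to.” This is only one way of expressing the idea, not
Kimball’s own formalization.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # syntactic category, e.g. "S", "NP", "VP"
    word: str = ""                  # filled in for leaves only
    children: list = field(default_factory=list)


def right_edge(root):
    """Return the nodes along the right edge of the tree, highest first."""
    path, node = [root], root
    while node.children:
        node = node.children[-1]
        path.append(node)
    return path


def head_verb(vp):
    """Return the verb directly under a VP node (empty string if none)."""
    return next((c.word for c in vp.children if c.label == "V"), "")


def attach_by_right_association(root, new_node, host_label="VP"):
    """Attach new_node to the LOWEST host_label node on the right edge."""
    hosts = [n for n in right_edge(root) if n.label == host_label]
    if not hosts:
        raise ValueError("no attachment site on the right edge")
    hosts[-1].children.append(new_node)
    return hosts[-1]


# Simplified tree for "Vlad figured (that) Boris wanted (to) take the pet rat"
take_vp = Node("VP", children=[Node("V", "take"), Node("NP", "the pet rat")])
want_vp = Node("VP", children=[Node("V", "wanted"), take_vp])
lower_s = Node("S", children=[Node("NP", "Boris"), want_vp])
figure_vp = Node("VP", children=[Node("V", "figured"), lower_s])
tree = Node("S", children=[Node("NP", "Vlad"), figure_vp])

# The incoming word "out" could attach to any VP on the right edge.
host = attach_by_right_association(tree, Node("PART", "out"))
print(head_verb(host))  # take
```

Three verb phrases lie on the right edge (“figured,” “wanted,” and “take”), so “out” could in
principle attach to any of them; choosing the last one in the list is what yields the “take out”
reading rather than “figured out.”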
Kimball’s principles do a good job of explaining a number of properties of the processor.
However, given that the principle of processing underlies so many of the others, perhaps the
model can be simplified to reflect this? In addition, there are some problems with particular
strategies. For example, the role of function words in parsing might not be as essential as
Kimball thought. Eye fixation research shows that we may not always gaze directly at some
function words: Very short words are frequently skipped (Rayner & McConkie, 1976; although we
might be able to process them parafoveally, that is, we could still extract information from them
even though they are not centrally located in our visual field; see Kennedy, 2000, and Rayner &
Pollatsek, 1989).

Frazier and Fodor (1978) simplified Kimball’s account by proposing a model they called the
“sausage machine,” because it divides the language input into something that looks like a link of
sausages. The sausage machine is a two-stage model of parsing. The first stage is called the
preliminary phrase packager, or PPP. This is followed by the sentence structure supervisor, or
SSS. The PPP has a limited viewing window of about six words, and cannot attach words to
structures that reflect dependencies longer than this. The SSS assembles the packets produced by
the PPP, but cannot undo the work of the PPP. The idea of the limited length of the PPP, and a
second stage of processing that cannot undo the work of the first, operationalizes Kimball’s
principle of processing. The PPP can only make use of syntactic knowledge and uses syntactic
heuristics, such as preferring simpler syntactic structures if there is a choice of structures
(known as minimal attachment).

Wanner (1980) pointed out a number of problems with the sausage machine model. For example, there
are some six-word sentences that are triply embedded, but because they are so short, should fit
easily into the PPP window, such as (20). Nevertheless, we still find them difficult to
understand. There are also some six-word sentences where right association operates when minimal
attachment is unable to choose between the alternatives, as they are both of equal complexity
(21). Here we prefer the interpretation “cried yesterday” to “said yesterday.” The sausage
machine cannot account for the preference for right association in some six-word sentences.

(20) Vampires werewolves rats kiss love sleep.
(21) Vlad said that Boris cried yesterday.

Fodor and Frazier (1980) conceded that right association does not arise directly from the sausage
machine’s architecture. They added a new principle that governs the performance of the sausage
machine, which says that right association operates when minimal attachment cannot determine
where a constituent should go. The sausage machine evolved into one of the most influential
models of parsing, the garden path model.

PROCESSING STRUCTURAL AMBIGUITY

One of the major foci of current work on parsing is on trying to understand how we process
syntactic ambiguity, because this gives us an important tool in evaluating alternative models of
how the syntactic processor operates.

Two models have dominated research on parsing. The garden path model is an autonomous two-stage
model, while the constraint-based model is an interactive one-stage model. Choosing between the
two depends on how early discourse context, frequency, and other semantic information can be
shown to influence parsing choices. Is initial attachment (the way in which syntactic
constituents are attached to the growing parse tree) made on the basis of syntactic knowledge
alone, or is it influenced by semantic factors?

The garden path model
According to the garden path model (e.g., Frazier, 1987a), parsing takes place in two stages. In
the first stage, the processor draws only on syntactic information. If the incoming material is
ambiguous, only one structure is created. Initial attachment is determined only by syntactic
preferences dictated by the two principles of minimal attachment and late closure. If the results
of the first pass turn
out to be incompatible with further syntactic, pragmatic, or semantic and thematic information generated by an independent thematic processor, then a second pass is necessary to revise the parse tree. In the garden path model, thematic information about semantic roles can only be used in the second stage of parsing (Rayner, Carlson, & Frazier, 1983).
Two fundamental principles of parsing determine initial attachment, called minimal attachment and late closure. According to minimal attachment, incoming material should be attached to the phrase marker being constructed using the fewest nodes possible. According to late closure, incoming material should be incorporated into the clause or phrase currently being processed. If there is a conflict between these two principles, then minimal attachment takes precedence.
Constraint-based models of parsing
A type of interactive model called the constraint-based approach has become very popular (e.g., Boland, Tanenhaus, & Garnsey, 1990; MacDonald, 1994; MacDonald, Pearlmutter, & Seidenberg, 1994a; Tanenhaus, Carlson, & Trueswell, 1989; Taraban & McClelland, 1988; Trueswell et al., 1993). On this account, the processor uses multiple sources of information, including syntactic, semantic, discourse, and frequency-based, called constraints. The construction that is most strongly supported by these multiple constraints is most activated, although less plausible alternatives might also remain active. Garden paths occur when the correct analysis of a local ambiguity receives little activation.
Evidence for autonomy in syntactic processing
The garden path model says that we resolve ambiguity using minimal attachment and late closure, without semantic assistance. As (22) is consistent with late closure, it does not cause the processor any problem; (23) is not ultimately consistent with late closure, however, and the processor tries in the first instance to attach the NP “a mile and a half” to the first verb. When we come to “seems” it is apparent that this structure is incorrect: we have been led up a garden path. In an eye-movement study, Frazier and Rayner (1982) found that the reading time was longer for (23) than (22), and in (23) the first fixation in the disambiguating region was longer.
(22) Since Jay always jogs a mile and a half this seems a short distance to him.
(23) Since Jay always jogs a mile and a half seems a very short distance to him.
Rayner and Frazier (1987) monitored participants’ eye movements while they read sentences such as (24) and (25).
(24) The criminal confessed his sins harmed many people.
(25) The criminal confessed that his sins harmed many people.
When we start to read (24), minimal attachment leads to the adoption of the structure that contains the fewest nodes. Hence when we get to “his sins” the simplest analysis is that “his sins” is the object of “confessed,” rather than the more complex analysis that it is the subject of the complement clause (as later turns out to be the case). Readers should therefore be led up the garden path in (24), and will then be forced to reanalyze when they come to “harmed.” However, (25) should not lead to a garden path, because “that” blocks the object analysis of the sentence. Rayner and Frazier found that participants did indeed experience difficulty when they reached “harmed” in (24) but not in (25).
Ferreira and Clifton (1986) described an experiment that suggests that semantic factors cannot prevent us from being garden-pathed. Garden path theory predicts that, because of minimal attachment, when we come across the word “examined” we should take it to be the main verb in (26) and (27) rather than the verb in a reduced relative clause:
(26) The defendant examined by the lawyer turned out to be unreliable.
(27) The evidence examined by the lawyer turned out to be unreliable.
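As an aside, the minimal attachment preference described for (24) can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the two tree shapes are simplified guesses at the competing analyses of "The criminal confessed his sins", and the names `count_nodes` and `minimal_attachment` are hypothetical helpers, not part of any published parsing model.

```python
# Toy illustration of minimal attachment: prefer the candidate parse
# that uses the fewest nodes.
# Trees are nested tuples: (label, child, child, ...); plain strings are words.
# The tree shapes are simplified assumptions made for illustration only.

def count_nodes(tree):
    """Count the labeled (non-terminal) nodes in a nested-tuple tree."""
    if isinstance(tree, str):  # a word, not a node
        return 0
    return 1 + sum(count_nodes(child) for child in tree[1:])

def minimal_attachment(candidates):
    """Return the name of the candidate analysis with the fewest nodes."""
    return min(candidates, key=lambda name: count_nodes(candidates[name]))

# Two analyses of the fragment "The criminal confessed his sins ..."
direct_object = (
    "S",
    ("NP", "the criminal"),
    ("VP", ("V", "confessed"), ("NP", "his sins")),
)
sentence_complement = (
    "S",
    ("NP", "the criminal"),
    ("VP", ("V", "confessed"), ("S", ("NP", "his sins"))),
)

candidates = {
    "direct object": direct_object,
    "sentence complement": sentence_complement,
}
preferred = minimal_attachment(candidates)
print(preferred)
```

Under these assumed trees the direct-object analysis has one node fewer, so the sketch selects it. That is the analysis that later has to be revised at "harmed" in (24).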
Consider what sorts of structure we might have generated by the time we get to the word “examined” in (26) and (27). “Examined” requires an agent. In (26), “the defendant” is animate and can therefore fulfill the role of agent, as in “the defendant examined the evidence”; but of course, “the defendant” can also be what is examined, so the syntactic structure is ambiguous between a reduced relative clause and a main verb analysis. In (27) “the evidence” is inanimate and therefore cannot fulfill the role of the agent; it must be what is examined, and therefore this structure can only be a reduced relative. However, analysis of eye-movement evidence suggested that the semantic evidence available in sentences such as (27) did not prevent participants from getting garden-pathed. Instead, we still appear to construct the initial interpretation to be the syntactically most simple according to minimal attachment. Ferreira and Clifton argued that semantic information does not prevent or cause garden-pathing, but can hasten recovery from it. The difficulty caused by the ambiguity is very short in duration, and is resolved while reading the word following the verb, “by” (Clifton & Ferreira, 1989).
Mitchell (1987), on the basis of data from a self-paced reading task (where participants read a computer display and press a key every time they are ready for a new word or phrase), concluded that the initial stage only makes use of part-of-speech information, and that detailed information from the verb only affects the second, evaluative, stage of processing. Consider sentences (28) and (29). In (28), according to garden path theory, the processor prefers to assign the phrase “the doctor” as direct object of “visited” (to comply with late closure, keeping the first phrase open for as long as possible). As expected, participants were garden-pathed by (28). However, if semantic and thematic information about verbs is available from an early stage, then in (29) thematic information should tell the processor that “sneezed” cannot take a direct object (a process called lexical guidance). Nevertheless, participants are still led up the garden path with (29); hence the initial parse must be ignoring verb information.
(28) After the child had visited the doctor prescribed a course of injections.
(29) After the child had sneezed the doctor prescribed a course of injections.
Van Gompel and Pickering (2001) came to the same conclusion using an eye-movement methodology: readers experience difficulty after “sneezed.” These experiments suggest that the first stage of parsing is short-sighted and does not use semantic or thematic information. Similarly, Ferreira and Henderson (1990) examined data from eye movements and word-by-word self-paced reading of ambiguous sentences, concluding that verb information does not affect the initial parse, although it might guide the second stage of reanalysis.
We can manipulate the semantic relatedness of nouns and verbs in contexts where they are either syntactically appropriate or inappropriate. Their different effects can then be teased out in lexical decision and naming tasks (O’Seaghdha, 1997). The results suggest that syntactic analysis precedes semantic analysis and is independent of it. Consider (30) and (31):
(30) The message that was shut.
(31) The message of that shut.
In (30), the target word “shut” is syntactically appropriate but semantically anomalous. In (31), the target is both syntactically and semantically anomalous. In the lexical decision task, in (30) we observe meaning-based inhibition relative to a baseline. In (31), we do not observe any inhibition. In the naming task, there is no sensitivity to semantic anomaly, but there is sensitivity to the syntactic inappropriateness of the target in (31). O’Seaghdha suggested that the inhibition occurs in (30) in the lexical decision task because of a difficulty in integrating the target word into a high-level text representation. We do not get that far in (31) because the failure to construct a syntactic representation blocks any semantic integration. The results look as though they support interactivity because the lexical decision task is sensitive to post-access integration processes. The naming data are less contaminated by post-access processing and suggest
that syntactic analysis is prior to semantic integration and independent of it.
Evidence from neuroscience suggests that semantic and syntactic processing are independent. Breedin and Saffran (1999) described a patient, DM, who had a significant and pervasive loss of semantic knowledge as a result of dementia. For example, he found it very difficult to match a picture of an object to another appropriate picture (e.g., knowing that a pyramid is associated with a palm tree rather than a pine tree). Yet his semantic deficit had no apparent effect on his syntactic abilities. He performed extremely well at detecting grammatical violations (e.g., he knew that “what did the exhausted young woman sit?” was ungrammatical). He also had no difficulty in assigning semantic roles in a sentence. For example, he could correctly identify who was being carried in the sentence “The tiger is being carried by the lion,” even though he had difficulty in recognizing lions and tigers by name.
Brain-imaging studies are also useful here. A negative event-related potential (ERP) found 400 ms after an event (and hence called the N400) is thought to be particularly sensitive to semantic processing, and is particularly indicative of violations of semantic expectancy (Batterink, Karns, Yamada, & Neville, 2010; Kounios & Holcomb, 1992; Kutas & Hillyard, 1980; Nigram, Hoffman, & Simons, 1992). A sentence such as (32) generates a semantic anomaly:
(32) Boris noticed a puncture and got out to change the wheel on the castle.
The N400 occurs 400 ms after the anomalous word “castle.”
There is also a positive wave found 600 ms after a syntactic violation (Hagoort, Brown, & Groothusen, 1993; Osterhout & Holcomb, 1992; Osterhout, Holcomb, & Swinney, 1994). A P600 would be observed with (33):
(33) Boris persuaded to fly.
These anomalies can be used to map the time course of syntactic and semantic processing. These ERP data suggest that syntactic and semantic processing are distinct (Ainsworth-Darnell, Shulman, & Boland, 1998; Friederici, 2002; Neville, Nicol, Barss, Forster, & Garrett, 1991; Ni et al., 2000; Osterhout & Nicol, 1999). For example, Ainsworth-Darnell et al. examined ERPs when people heard sentences that contained a syntactic anomaly, a semantic anomaly, or both. The sentences that contained both types of anomaly still provoked both an N400 and a P600. Ainsworth-Darnell et al. concluded that different parts of the brain automatically become involved when syntactic and semantic anomalies are present, and therefore that these processes are represented separately. Osterhout and Nicol (1999) gave participants sentences with different types of anomaly to read (34)–(37):
(34) The cats won’t eat the food that Mary leaves them. (non-anomalous)
(35) The cats won’t bake the food that Mary leaves them. (semantic anomaly)
(36) The cats won’t eating the food that Mary leaves them. (syntactic anomaly)
(37) The cats won’t baking the food that Mary leaves them. (doubly anomalous)
As expected, semantically anomalous sentences, such as (35), elicited the N400, and syntactically anomalous sentences, such as (36), elicited the P600. Doubly anomalous sentences, such as (37), elicited both an N400 and a P600, with the magnitude of each effect being about the same as if each anomaly were present in isolation. The brain responds differently to syntactic and semantic anomalies, and the response to each type of anomaly is unaffected by the presence of the other type. Osterhout and Nicol concluded that syntactic and semantic processes are separable and independent.
There has been some debate as to the strength of this claim. It is useful to distinguish between representational modularity and processing modularity (Pickering, 1999; Trueswell, Tanenhaus, & Garnsey, 1994). Representational modularity says that semantic and syntactic knowledge are represented separately. That is, there are distinct types of linguistic representation, which might be stored or processed in different parts of the brain.
This is relatively uncontroversial. Most of the debate is about processing modularity: Is initial processing restricted to syntactic information, or can all sources of information influence the earliest stages of processing?
Evidence for interaction in syntactic processing
The experiments discussed so far suggest that the first stage of parsing only makes use of syntactic preferences based on minimal attachment and late closure, and does not use semantic or thematic information. On the interactive account, however, semantic factors influence whether or not we get garden-pathed. What is the evidence that semantic factors play an early role in parsing?
Perhaps the syntactic principles of minimal attachment and late closure can be better explained by semantic biases? Taraban and McClelland (1988) compared self-paced reading times for sentences such as (38) and (39) (see Figure 10.2):
(38) The thieves stole all the paintings in the museum while the guard slept.
(39) The thieves stole all the paintings in the night while the guard slept.
FIGURE 10.2 Noun phrase and verb phrase attachment structures in Taraban and McClelland (1988).
(38), 7 nodes: [S [NP The thieves] [VP [V stole] [NP [NP all the paintings] [PP in the museum]]]]
(39), 6 nodes: [S [NP The thieves] [VP [V stole] [NP all the paintings] [PP in the night]]]
Sentence (39) is a minimal attachment structure but (38) is not. In (38) the phrase “in the museum” must be formed into a noun phrase with “paintings”; in (39) the phrase “in the night” must be formed into a verb phrase with “stole.” The noun phrase attachment in (38) produces a grammatically more complex structure than the verb phrase attachment in (39). Nevertheless, Taraban and McClelland found that (38) is read faster than (39). They argued that this is because all the words up to “museum” and “night” create a semantic bias for the non-minimal interpretation. They concluded that violations of the purely syntactic process of the attachment of words to the developing structural representation do not slow down reading, but violations of the semantic process of assigning words to thematic roles do. Taraban and McClelland also concluded that previous studies that had appeared to support minimal attachment had in fact confounded syntactic simplicity with semantic bias.
Why do we find garden-pathing on some occasions but not others? Milne (1982) was one of the first to argue that semantic factors rather than syntactic factors lead us up the garden path. Consider the three sentences (40)–(42). Only (40) causes difficulty, because it sets up semantic expectancies that are then violated:
(40) The granite rocks during the earthquake.
(41) The granite rocks were by the seashore.
(42) The table rocks during the earthquake.
How can semantic factors explain our difficulty with reduced relatives?
Crain and Steedman (1985) used a speeded grammaticality judgment task to show that an appropriate semantic context can eliminate syntactic garden paths. In this task, participants see a string of words and have to decide as quickly as possible whether the string is grammatical or not. Participants in this task on the whole are more likely to misidentify garden path sentences as
non-grammatical than non-garden path sentences. Sentence (43) was incorrectly judged ungrammatical far more often than the structurally identical but semantically more plausible sentence (44):
(43) The teachers taught by the Berlitz method passed the test.
(44) The children taught by the Berlitz method passed the test.
Crain and Steedman argued that there is no such thing as a truly neutral semantic context. Even when semantic context is apparently absent from the sentence, participants bring prior knowledge and expectations to the experiment. They argued that all syntactic parsing preferences can be explained semantically. All syntactic alternatives are considered in parallel, and semantic considerations then rapidly select among them. Semantic difficulty is based on the amount of information that has to be assumed: The more assumptions that have to be made, the harder the sentence is to process. Hence sentences such as (45) are difficult compared with (46), where the existence of only one horse is assumed. This assumption is incompatible with the semantic representation needed to understand (45): that there are a number of horses but it was the one that was raced past the barn that was the one that fell. That is, if the processor encounters a definite noun phrase in the absence of any context, only one entity (e.g., one horse) is postulated, and therefore no modifier is necessary. If one is present, processing difficulty ensues.
(45) The horse raced past the barn fell.
(46) The horse raced past the barn quickly.
Altmann and Steedman (1988) measured reading times on sentences such as (47) and (48):
(47) The burglar blew open the safe with the dynamite and made off with the loot.
(48) The burglar blew open the safe with the new lock and made off with the loot.
These sentences are ambiguous: the prepositional phrases “with the dynamite” and “with the new lock” can modify either the noun phrase “the safe” or the verb phrase “blew open the safe.” Altmann and Steedman presented the participants with prior discourse context that disambiguated the sentences. A prior context sentence referred to either one or two safes. (“Once inside he saw that there was a safe with a new lock and a strongbox with an old lock” versus “Once inside he saw that there was a safe with a new lock and a safe with an old lock.”) If the context sentence mentioned only one safe, then the complex noun phrase “the safe with the new lock” is redundant, and causes extra processing difficulty. Hence the prepositional phrase in (48) took relatively longer to read. If the context sentence mentioned two safes, then the simple noun phrase “the safe” in (47) fails to identify a particular safe, so the prepositional phrase “with the dynamite” in (47) took relatively longer to read.
Altmann and Steedman (1988) emphasized that the processor constructs a syntactic representation incrementally, on a word-by-word basis. At each word, alternative syntactic interpretations are generated in parallel, and then a decision is made using context. Altmann and Steedman called this “weak” interaction, as opposed to strong interaction, where context actually guides the parsing process so that only one alternative is generated. This approach is called the referential theory of parsing. The processor constructs analyses in parallel and uses discourse context to disambiguate them immediately. It is the immediate nature of this disambiguation that distinguishes the referential theory from garden path models. As many factors guide parsing, it must be semantic considerations that in this case lead us up the garden path.
Is it possible to distinguish between the referential and the constraint-based theories? The theories are similar in that each denies that parsing is restricted to using syntactic information. In constraint-based theories, all sources of semantic information, including general world knowledge, are used to disambiguate, but in referential theory only referential complexity within the discourse model is important. Ni, Crain, and Shankweiler (1996) tried to separate the effects of these different types of knowledge by studying reading times and eye movements when reading ambiguous sentences. The results suggested
that semantic-referential information is used immediately, but more general world knowledge takes longer to become available. Furthermore, world knowledge was dependent on working memory capacity, whereas use of semantic-referential principles was not. (In general, people with larger working memory spans are better able to maintain multiple syntactic representations and therefore will be more effective at processing ambiguous sentences; see MacDonald, Just, & Carpenter, 1992; Pearlmutter & MacDonald, 1995.) Ni et al. argued that the focus operator “only” presupposes the existence of more than one vampire (in this example), and therefore a modifier is needed to select one of them. Consider (49) and (50):
(49) The vampires loaned money at low interest were told to record their expenses.
(50) Only vampires loaned money at low interest were told to record their expenses.
Sentence (49) provokes a garden path effect but (50) does not. Analysis suggested that these referential principles were used immediately to resolve ambiguity. Information about semantic plausibility of interpretations was used later. However, as Pickering (1999) noted, referential theory cannot be a complete account of parsing, because it can only be applied to ambiguities involving simple and complex noun phrases. There is also more to context than discourse analysis. Referential theory was an early version of a constraint-based theory, applied to a limited type of syntactic structure. Nevertheless, the idea that discourse information can be used to influence parsing decisions is one essential component of constraint-based theories.
Altmann, Garnham, and Dennis (1992) used eye-movement measures to investigate how context affects garden-pathing. Consider sentence (51):
(51) The fireman told the man that he had risked his life for to install a smoke detector.
Garden path theory predicts that (51) should always lead to a garden path. We always start to parse “the man” as a simple noun phrase because this has a simpler structure than the alternative (which turns out to be the correct analysis), in which the noun is the head of a complex noun phrase. According to referential theory, the resolution of ambiguities in context depends on whether a unique referent can be found. The context can bias the processor towards or away from garden-pathing. The null context induces a garden path in (51). However, some contexts will bias the processor towards a relative clause interpretation and prevent garden-pathing. Such a biasing context can be obtained by preceding the ambiguous relative structure with a relative-supporting referential context. One way of doing this is to provide more than one possible referent for “the man.” (For example, “A fireman braved a dangerous fire in a hotel. He rescued one of the guests at great danger to himself. A crowd of men gathered around him.”) Eye-movement measurements verified this prediction. Measurements of difficulty associated with garden-pathing were reflected in longer average reading times per character in the ambiguity region, and an increased probability of regressive eye movements. When syntactic information leads to ambiguity and a garden path is possible, then the processor proceeds to construct a syntactic representation on the basis of the best semantic bet.
Further evidence for constraint-based models comes from the finding that thematic information can be used to eliminate the garden path effect in these reduced relative sentences (MacDonald et al., 1994a; Trueswell & Tanenhaus, 1994; Trueswell et al., 1994). For example, consider the ambiguous sentence fragments (52) and (53):
(52) The fossil examined –
(53) The archeologist examined –
The fragments are ambiguous because they are consistent with two sentence constructions: the most frequent order, the unreduced structure, where the first NP is the agent (e.g., “The archeologist examined the fossil”), and with a reduced relative clause (“The fossil examined by the archeologist was important”). However, consider the thematic roles associated with the verb “examine.” It has the roles of agent,
best fitted by an animate entity, and a theme, best fitted by an inanimate object (Trueswell & Tanenhaus, 1994). So semantic considerations associated with thematic roles suggest that (52) is likely to be a reduced relative structure, and (53) a simple sentence structure. Difficulty ensues if subsequent material conflicts with these interpretations, or if the context provided by the nouns is not sufficiently biasing. Trueswell et al. (1994) examined eye movements to investigate how people understood sentences such as (52) and (53). They found that if semantic constraints were sufficiently strong, reduced relative clauses were no more difficult than the unreduced constructions. Remember that, in contrast, Ferreira and Clifton (1986) found evidence of increased difficulty with very similar materials, (26) and (27). Why is there a discrepancy? Trueswell et al. argued that the semantic bias in Ferreira and Clifton’s experiment was too weak. If the semantic constraint is not strong enough, we will be garden-pathed. McRae, Spivey-Knowlton, and Tanenhaus (1998) found that strong plausibility can also overcome garden-pathing. On the other side of the coin, people are reluctant to abandon plausible analyses in favor of implausible ones, even when the plausible analysis is turning out to be wrong (Pickering & Traxler, 1998).
An important idea in constraint-based models is that of verb bias (Garnsey et al., 1997; Trueswell et al., 1993). This is the idea that although some verbs can appear in a number of syntactic structures, some of their syntactic structures are more common than others. The relative frequencies of alternative interpretations of verbs predict whether or not people have difficulty in understanding reduced relatives (MacDonald, 1994; Trueswell, 1996). Hence, although the verb “read” can appear with sentence complements (“the ghost read the book had been burned”), it is most commonly followed by a direct object (as in simply, “the ghost read the book during the plane journey”). Direct-object verbs are those where the most frequent continuation is the direct object; sentence-complement verbs are those where the most frequent continuation is the sentence complement.
According to constraint-based models, verb-bias information becomes available immediately the verb is recognized. Trueswell et al. (1993) found evidence for the immediate availability of verb-bias information across a range of tasks (priming, self-paced reading, and eye movements). They found that verbs with a sentence-complement bias did not cause processing difficulty, whereas verbs with direct-object bias did. Furthermore, the more frequently a sentence complement verb appears in the language without a complementizer (“that”), the less likely it is to lead to processing difficulty in sentence-complement constructions. Using a carefully controlled set of materials combined with eye-movement and self-paced reading analyses, Garnsey et al. (1997) also found that people’s prior experience with particular verbs guides their interpretation of temporary ambiguity. Verb bias guides readers to a sentence-complement interpretation with sentence-complement verbs. This information is available very quickly (certainly by the word following the verb). Furthermore, verb-bias information interacts with how plausible the temporarily ambiguous noun is as a direct object. For example, “the decision” is more plausible as a direct object than “the reporter.” This result is best explained by constraint-based models, as according to the garden path model there should be no early effect of plausibility and verb bias.
Note though that there is controversy over whether verb-bias effects are real: Some studies have found no effect of verb-frequency information. For example, using an eye-tracking methodology, Pickering, Traxler, and Crocker (2000) found that readers experienced difficulty with temporarily ambiguous sentence-complement clauses even when the verbs were biased towards that analysis. Consider the sentence beginning (54).
(54) The young athlete realized her potential –
There are now two possible analyses: the object analysis (simply, “The young athlete realized her potential”), and the sentence-complement analysis (as in “The young athlete realized her potential might one day make her a world class athlete”). The sentence-complement analysis is the most common
for the verb “realized,” so readers should adopt that and not the object analysis. However, they do not. People preferred to attach noun phrases as arguments of verbs, regardless of whether or not this analysis was likely to be correct. Kennison (2001) similarly found that ambiguous structures caused difficulty regardless of the verb bias. Pickering and van Gompel (2006) concluded that verb-bias information has some influence on syntactic processing, but often not enough to prevent us having difficulty with temporarily ambiguous sentences.
In constraint-based models, syntactic ambiguity is eventually resolved by competition (MacDonald et al., 1994a, 1994b). The constraints activate different analyses to differing degrees; if two or more analyses are highly activated, competition is strong and there are severe processing difficulties. Tabor and Tanenhaus (1999; see also Tabor, Juliano, & Tanenhaus, 1997) proposed that the competition is resolved by settling into a basin of attraction in an attractor network similar to those postulated to account for word recognition (Hinton & Shallice, 1991; see Chapter 7). Along similar lines, McRae et al. (1998) proposed a connectionist-like model of ambiguity resolution called competition-integration. Competition between alternative structures plays a central role in a parsing process that essentially checks its preferred structure after each new word.
Evidence for parallel competition models comes from studies that show that the more committed people become to a parsing choice, the more difficult it is for them to recover, an effect called digging-in (Tabor & Hutchins, 2004). For example, increasing the gap between the ambiguity and the disambiguating information causes the comprehenders to “dig in” as they become more committed to the wrong analysis (e.g., (55) is easier than (56); materials from Ferreira & Henderson, 1991). Once they have dug in, alternative interpretations (including the correct one) become less activated.
(55) After the Martians invaded the town was evacuated.
(56) After the Martians invaded the town that the city bordered was evacuated.
Another important aspect of constraint-based models is that syntactic and lexical ambiguity are resolved in similar ways because of the importance of lexical constraints in parsing (MacDonald et al., 1994a, 1994b). Syntactic ambiguities arise because of ambiguities at the lexical level. For example, “raced” is an ambiguous word, with one sense of a past tense, and another of a past participle. In (57), only the past tense sense is consistent with the preceding context. This information eventually constrains the processor to a particular syntactic interpretation. But in (58), both senses are consistent with the context. Although contextual constraints are rarely strong enough to restrict activation to the appropriate alternative, they provide useful information for distinguishing between alternative candidates. In this type of approach, a syntactic representation of a sentence is computed through links between items in a rich lexicon (MacDonald et al., 1994a).
(57) The horse who raced –
(58) The horse raced –
Part of the difficulty in distinguishing between the autonomous and interactive constraint-based theories is in obtaining evidence about what is happening in the earliest stages of comprehension. Tanenhaus et al. (1995) examined the eye movements of participants who were following instructions to manipulate real objects. Analysis of the eye movements suggested that people processed the instructions incrementally, making eye movements to objects immediately after the relevant instruction. People typically made an eye movement to the target object 250 ms after the end of the word that uniquely specified the object. With more complex instructions, participants’ eyes moved around the array looking for possible referents.
The best evidence for the independence of parsing comes from reading studies of sentences with brief syntactic ambiguities, where readers have clear preferences for particular interpretations, even when the preceding linguistic context supports the alternative interpretation. Tanenhaus et al. pointed out that in this sort of experiment the context may not be immediately available because it has to be retrieved from memory. They examined the interpretation of temporarily
ambiguous sentences in the context of a visual array so that information is immediately available. They auditorily presented participants with the sentence (59) with one of two visual contexts.

(59) Put the apple on the towel in the box.

In the one-referent condition there was just one apple on a towel and another towel without an apple on it. In the two-referent condition there were two possible referents for the apple, one on a towel and one on a napkin. According to modular theories, “on the towel” should always be initially interpreted as the destination (where the apple should be put, because this is structurally simplest). However, analysis of the eye movements across the scene showed that “on the towel” was initially interpreted as the destination only in the one-referent condition. In the two-referent condition, “on the towel” was interpreted as the modifier of “apple.” In the one-referent condition, participants looked at the incorrect destination (the irrelevant towel) 55% of the time; in the two-referent condition, they rarely did so. This experiment is strong evidence that people use contextual information immediately to establish reference and to process temporarily ambiguous sentences.

A similar experiment by Sedivy, Tanenhaus, Chambers, and Carlson (1999) showed that people very quickly take context into account when interpreting adjectives. On the basis of these findings, Sedivy et al. argued that syntactic processing is incremental; that is, a semantic representation is constructed with very little lag following the input. People immediately try to integrate adjectives into a semantic model even when they do not have a stable core meaning (e.g., tall is a scalar adjective: it is a relative term and depends on the noun it is modifying; tall in “a tall glass” means something different from tall in “a tall building”). They do this by establishing contrasts between possible referents in the visual array (or memory).

Brain-imaging fMRI studies show that the brain processes ambiguous and unambiguous sentences differently (Mason, Just, Keller, & Carpenter, 2003). Higher levels of brain activation are shown for ambiguous sentences, but also while reading more complex structures and unpreferred structures (those where the reduced relative reading is the correct one). Furthermore, and contrary to the reading time results, higher activation was shown while reading ambiguous sentences when the ambiguity was resolved in favor of the preferred syntactic construction. The higher workload was spread among the superior temporal gyrus (including Wernicke’s area) and the inferior frontal gyrus (including Broca’s area), hinting that multiple processes are involved in ambiguity resolution (see Figure 10.3). In particular, Broca’s area might be involved in generating abstract syntactic frames, and Wernicke’s in interpreting and elaborating them with semantic information. These findings are more consistent with parallel models where multiple parses are kept open at the same time.

FIGURE 10.3 [Figure labels: Inferior frontal gyrus; Broca’s area; Superior temporal gyrus; Wernicke’s area]

There is also recent electrophysiological evidence that shows that people predict what is coming next (DeLong, Urbach, & Kutas, 2005; see also Kutas, DeLong, & Smith, 2011). DeLong et al. examined the phonological regularity in the English indefinite article (“a” before a consonant, “an” before a vowel) using ERP, and concluded that people pre-activate words in a graded fashion.

Cross-linguistic differences in attachment

A final point concerns the extent to which any parsing principles apply to languages other than English. Cuetos and Mitchell (1988) examined the extent to which speakers of English and Spanish used the late-closure strategy to interpret the same
sorts of sentences. They found that although the interpretations of the English speakers could be accounted for by late closure, this was not true of the Spanish speakers. For example, given (60), English speakers prefer to attach the relative clause (“who had the accident”) to “the colonel,” because that is the phrase currently being processed. We can find this out simply by asking readers “Who had the accident?”

(60) The journalist interviewed the daughter of the colonel who had the accident.

Spanish speakers, on the other hand, given the equivalent sentence (61), seem to follow a strategy of early closure. That is, they attach the relative clause to the first noun phrase.

(61) El periodista entrevistó a la hija del coronel que tuvo el accidente.

Other languages also show a preference for attaching the relative clause to the first noun phrase, including French (Zagar, Pynte, & Rativeau, 1997) and Dutch (Brysbaert & Mitchell, 1996). These results suggest that late closure may not be a general strategy common to all languages. Instead, the parsing preferences may reflect the frequency of different structures found within a language (Mitchell, Cuetos, Corley, & Brysbaert, 1995). These cross-linguistic differences call into question the idea that late closure is a process-generated principle that confers advantages on the comprehender, such as minimizing processing load. Frazier (1987b) proposed that late closure is advantageous because if a constituent is kept open as long as possible, it avoids the processing cost incurred by closing it, opening it, and closing it again.

The results of this study can be explained in one of three ways. First, late closure may not originate because of processing advantages, and the choice of strategy (early versus late closure) is essentially an arbitrary choice in different languages. Second, late closure may have a processing advantage and may be the usual strategy, but in some languages, in some circumstances, other strategies may dominate (Cuetos & Mitchell, 1988). Third, as constraint-based models advocate, parsing does not make use of linguistic principles at all. The results of interpretation depend on the interaction of many constraints that are relevant in sentence processing. Whatever the answer, it is clear that if we limit our studies of parsing to English then we miss out on a great deal of potentially important data.

Constraint-based models contain a probabilistic element in that the most strongly activated analysis can vary depending on the circumstances. Another example of a probabilistic model is the tuning hypothesis (Brysbaert & Mitchell, 1996; Mitchell, 1994; Mitchell et al., 1995). The tuning hypothesis emphasizes the role of exposure to language. Parsing decisions are influenced by the frequency with which alternative analyses are used. Put another way, people resolve ambiguities in a way that has been successful in the past (Sturt, Costa, Lombardo, & Frasconi, 2003). Given the reasonable assumption that people vary in their exposure to different analyses, their preferred initial attachments will also vary. Attachment preferences may vary from language to language, and from person to person, and indeed might even vary within a person across time. Brysbaert and Mitchell (1996) used a questionnaire to examine attachment preferences in Dutch speakers, and individual differences in these preferences.

Comparison of garden path and constraint-based theories

When do syntax and semantics interact in parsing? This has proved to be the central question in parsing, as well as one of the most difficult to answer. In serial two-stage models, such as the garden path model, the initial analysis is constrained by using only syntactic information and preferences, and a second stage then uses semantic information. In parallel constraint-based models, multiple analyses are active from the beginning, and both syntactic and non-syntactic information is used in combination to activate alternative representations. Unfortunately, there is little consensus about which model gives the better account. Different techniques seem to give different answers, and the results are sensitive to the
materials used. Proponents of the garden path model argue that the effects that are claimed to support constraint-based models arise because the second stage of parsing begins very quickly, and that many experiments that are supposed to be looking at the first stage are in fact looking at the second stage of parsing. Any interaction observed is occurring at this second stage, which starts very early in processing. They argue that experiments supporting constraint-based models are methodologically flawed, and that constraint-based models fail to account for the full range of data (Frazier, 1995). On the other hand, proponents of the constraint-based models argue that researchers favoring the garden path model use techniques that are not sensitive enough to detect the interactions involved, or that the non-syntactic constraints used are too weak.

Other models of parsing

Is there any way out of this dilemma? Alternative approaches to garden path and constraint-based theories have recently come to the fore.

The first alternative may be called the unrestricted-race model. To understand the basis of this model, we must consider exactly how syntactic ambiguity is resolved. We also need to distinguish between models that always adopt the same analysis of a particular ambiguity and those that do not (van Gompel, Pickering, & Traxler, 2000, 2001).

The garden path model can be described as a fixed-choice two-stage model. It is fixed choice in that it has no probabilistic element in its decision making. Given a particular structure, the same syntactic structure will always be generated on the basis of late closure and minimal attachment. Either the correct analysis is chosen on syntactic grounds from the beginning, or, if the initial syntactic analysis becomes implausible, reanalysis is needed.

Constraint-based models are variable-choice one-stage models. In constraint-based models, syntactic ambiguity is resolved by competition. When there are alternative analyses of similar activation, competition is particularly intense, causing considerable processing difficulty. Competition might continue for a long time. In the competition-integration model (McRae et al., 1998; Spivey & Tanenhaus, 1998), competition is long-lasting but decreases as the sentence unfolds.

So do we resolve ambiguity by reanalysis or competition? Van Gompel et al. (2001) examined how we resolve ambiguity. They constructed sentences such as (62) to (64):

(62) The hunter killed only the poacher with the rifle not long after sunset.
(63) The hunter killed only the leopard with the rifle not long after sunset.
(64) The hunter killed only the leopard with the scars not long after sunset.

The prepositional phrase (“with the rifle/scars”) can be attached either to “killed” (a VP attachment analysis: the hunter killed with the rifle/scars) or to “poacher/leopard” (an NP attachment: the poacher/leopard had the rifle/scars). In (63), only the VP attachment is plausible (that the hunter killed with the rifle, rather than that the leopard had the rifle); this is the VP condition. In (64), only the NP attachment is plausible (that the leopard had the scars, as you cannot kill with scars); this is the NP condition. In (62), both the VP and NP attachments are plausible; this is called the ambiguous condition.

What do the different theories predict? The garden path model (an example of a fixed-choice two-stage model where ambiguity is resolved by reanalysis) predicts, on the basis of minimal attachment, that the processor will always initially adopt the VP analysis, because this generates the simpler structure. (It creates a structure with fewer nodes than the NP analysis; see Chapter 2.) The processor only reanalyzes if the VP attachment turns out subsequently to be implausible. Hence (62) should be as difficult as (63), but (64) should cause more difficulty. Constraint-based theories predict little competition in (64), because plausibility supports only the NP interpretation. In (63) there should be little competition, because the semantic plausibility information supports only the VP analysis. Crucially, in this experiment there was no syntactic preference for VP or NP attachment. The ambiguity was balanced (usually
VP/NP ambiguities are biased towards VP attachment). In (62), however, there should be competition because both interpretations are plausible. In summary, garden path theory predicts that (62) and (63) should be equally easy, but (64) should be difficult; constraint-based theory predicts that (63) and (64) should be easy, but (62) should be difficult.

Van Gompel et al. examined readers’ eye movements to discover when these sentences caused difficulty. They found that an inspection of reading difficulty favored neither pattern of results. Instead, they found that the ambiguous condition was easier to read than the two disambiguated ones. That is, (62) was easy but (63) and (64) were difficult.

Neither garden path nor constraint-based theories seem able to explain this pattern of results. Van Gompel et al. argue that only a variable-choice two-stage model can account for this pattern of results. The unrestricted-race model is such a model (Traxler, Pickering, & Clifton, 1998; van Gompel et al., 2000, 2001). As in constraint-based models, all sources of information, both syntactic and semantic, are used to select among alternative syntactic structures (hence it is unrestricted). The alternatives are constructed in parallel and engaged in a race. The winner is the analysis that is constructed fastest, and this is adopted as the syntactic interpretation of the fragment. So in contrast to constraint-based theories, only one analysis is adopted at a time. If this analysis is inconsistent with later information, the processor has to reanalyze, at considerable cost; hence it is also a two-stage model. It is also a variable-choice model, as the initial analysis is affected by the particular characteristics of the sentence fragment (as well as by individual differences resulting from differences in experience).

Let us consider how the unrestricted-race model accounts for these data. Because there is no particular bias for NP or VP in (62)–(64), people will adopt one of these as their initial preference on about half the trials. In (62), people will never have to reanalyze, because either preference turns out to be plausible, but (63) and (64) will both cause difficulty on those occasions when the initial preference turns out to be wrong, and the processor will be forced to reanalyze. The critical and surprising finding that only a variable-choice two-stage model such as the unrestricted-race model seems able to explain is that sometimes ambiguous sentences cause less difficulty than disambiguated sentences.

Need detailed syntactic processing necessarily precede semantic analysis? In a second alternative approach Bever, Sanz, and Townsend (1998) suggest that semantics comes first. In an extension of the idea that probabilistic, statistical considerations play an important role in comprehension, Bever et al. argue that statistically based strategies are used to propose an initial semantic representation. This then constrains the detailed computation of the syntactic representation. They argued that the frequency with which syntactic representations occur constrains the initial stage of syntactic processing. At any one time, the processor assigns the statistically most likely interpretation to the incoming material. Bever et al. argued that a principle such as minimal attachment cannot explain why we find reduced relatives so very difficult, but the statistical rarity of this sort of construction can. On this account, the role of the processor is reduced to checking that everything is accounted for, and that the initial semantic representation indeed corresponds with the detailed syntactic representation.

Do we always construct a complete, idealized syntactic structure? Christianson, Hollingworth, Halliwell, and Ferreira (2001) argue that we do not. They focus on what people understand after they have read garden path sentences such as “While the man hunted the deer ran into the woods.” This emphasis on comprehension (for example, asking people what they thought were the subjects, objects, and actions of clauses, and how confident they were about these judgments) is different from that of most of the other studies we have looked at, which emphasize on-line measures of what is happening when we process individual words while looking at garden path sentences. They found that people do not always completely reanalyze sentences, and often retain a mistaken interpretation derived from the initial misanalysis. They concluded that people do not
strive towards perfect analyses, but instead are happy with interpretations that seem to work; they settle for “good enough.” In a related return to the early idea of surface cues, some researchers now think that people use simple heuristics when processing language, in addition to detailed and complete syntactic processing (Ferreira, 2003). Comprehenders start out with the assumption that a sentence is in canonical, NVN form, and sentences that violate this heuristic (e.g., passives) are more difficult to understand.

A different approach is taken by McKoon and Ratcliff (2002, 2003). They argue that syntactic constructions themselves carry meaning, beyond the meaning of their constituent words. A passive sentence provides a different emphasis from its corresponding active, and therefore has a different meaning. Sentences (65) and (66), although superficially similar, convey different meanings.

(65) Boris loaded the truck with hay.
(66) Boris loaded hay onto the truck.

Here, sentence (65) conveys the notion that the truck is completely full of hay, but (66) does not. A difference in syntax conveys a difference in meaning. Reduced relative constructions convey a particular meaning. McKoon and Ratcliff argue that this meaning restricts the construction to particular sorts of nouns and verbs. The reduced relative can only be used to talk about particular sorts of things: The main noun participates in an event caused by some force or other entity external to itself. The main verb has to convey this sense of external participation. A sentence such as (67) satisfies this constraint, but a sentence such as (68) does not.

(67) Cars and trucks abandoned in a terrifying scramble for safety.
(68) The horse raced past the barn fell.

“Abandoned” conveys this sense of external causation (“something caused cars and trucks to be abandoned”), but “raced” does not (because it is the horse itself that is doing the racing). McKoon and Ratcliff propose that reduced relatives with verbs denoting internally caused events really are ungrammatical, which is why people have so much difficulty with them. A study of a large corpus of natural speech confirms that people only produce reduced relatives with these external-causation verbs. With verbs where the control is internal, in real life speakers use non-reduced constructions (“the horse that was raced past the barn fell”).

McKoon and Ratcliff call this approach, where syntactic constructions convey particular meanings that restrict what sorts of nouns and verbs can be used with them, and particularly what sort of verb-argument structures can be used, meaning through syntax (MTS). They further argue that MTS conflicts with constraint-based theories. According to constraint-based theories, the language processor knows about statistics of usage, not meanings and rules, whereas according to MTS, the language processor knows about meanings and rules, but not statistics. McKoon and Ratcliff found that statistical information about verbs derived from an actual corpus of speech does not predict reading times of sentences containing those verbs.

The MTS approach is criticized by McRae, Hare, and Tanenhaus (2005), who argue that the difficulty of reduced relatives is best accounted for not by the internal–external distinction, but by temporary processing difficulty resulting from ambiguity. Furthermore, the syntactic constructions can on occasion force, or coerce, a particular interpretation regardless of the meaning of the verb: We can still understand a sentence such as “Boris sneezed the tissue off the table” even though “sneezed” does not normally imply causation. Sentence constructions do carry meaning independently of their constituent verbs. In summary, it is difficult to see how the MTS approach can replace alternative theories of parsing difficulty. Indeed, instead of replacing constraint-based theories, the internal–external causation distinction may be just one more constraint.

Processing syntactic-category ambiguity

One type of lexical ambiguity that is of particular importance for processing syntax is lexical-category
ambiguity, where a word can be from more than one syntactic category (e.g., a noun or a verb, as in “trains” or “watches”). This type of ambiguity provides a useful test of the idea that lexical and syntactic ambiguity are aspects of the same thing and are processed in similar ways.

According to serial-stage models such as garden path theory, lexical and syntactic ambiguity are quite distinct, because lexical representations are already computed but syntactic representations must be computed (Frazier & Rayner, 1987). According to Frazier (1989), distinct mechanisms are needed to resolve lexical-semantic, syntactic, and lexical-category ambiguity. Lexical-semantic ambiguity is resolved in the manner described in Chapter 6: The alternative semantic interpretations are generated in parallel, and one meaning is rapidly chosen on the basis of context and meaning frequency. Syntactic ambiguity is dealt with by the garden path model in that only one analysis is constructed at any one time; if this turns out to be incorrect, then reanalysis is necessary. Lexical-category ambiguity is dealt with by a delay mechanism. When we encounter a syntactically ambiguous word, the alternative meanings are accessed in parallel, but no alternative is chosen immediately. Instead, the processor delays selection until definitive disambiguating information is encountered later in the sentence. The advantage of the delay strategy is that it saves extensive computation because usually the word following a lexical-category ambiguity provides sufficient disambiguating information.

Frazier and Rayner (1987) provided some experimental support for the delay strategy. They examined how we process two-word phrases containing lexical-category ambiguities, such as “desert trains.” After the word “desert,” two interpretations are possible. The first word can either be a noun to be followed by a verb (in which case “desert” will be the subject of the verb “trains”; this is the NV interpretation), or it can be a modifier noun that precedes a head noun (in which case “desert” will be the modifying noun and “trains” the head noun; this is the NN interpretation). Frazier and Rayner examined eye movements in ambiguous and unambiguous sentences. The ambiguous sentences started with “the” (“the desert trains”), which permits both NV and NN interpretations, and the unambiguous controls started with “this” (giving “this desert trains” for an unambiguous NV interpretation) or with “these” (giving “these desert trains” for an unambiguous NN interpretation). The rest of the sentence provided disambiguating information, as shown in the full sentences (69) and (70):

(69) I know that the desert trains young people to be especially tough.
(70) I know that the desert trains are especially tough on young people.

Frazier and Rayner found that reading times in the critical, ambiguous region (“desert trains”) were shorter in the ambiguous (“the”) condition than the unambiguous (“this”/“these”) conditions. However, in the ambiguous condition, reading times were longer in the disambiguating material later in the sentence. They proposed that when the processor encounters the initial ambiguity, very little analysis takes place. Instead, processing is delayed until subsequent disambiguating information is reached, when additional work is necessary.

According to constraint-based theories, there is no real difference between lexical-semantic ambiguity and lexical-category ambiguity. In each case, alternatives are activated in parallel depending on the strength of support they receive from multiple sources of information. Hence multiple factors, such as context and the syntactic bias of the ambiguous word (that is, whether it is more frequently encountered as a noun or a verb), immediately affect interpretation.

How can constraint-based theories account for Frazier and Rayner’s findings that we seem to delay processing lexical-category ambiguities until the disambiguating region is reached? MacDonald (1993) suggested that the control condition in their experiment provided an unsuitable baseline, in that it introduced an additional factor. The determiners “this” and “these” serve a deictic function, in that they point the comprehender to a previously mentioned discourse entity. When there is no previous entity, they sound quite odd. Hence Frazier and Rayner’s control sentences (71) and (72) in isolation read awkwardly:
(71) I know that this desert trains young people to be especially tough.
(72) I know that these desert trains are especially tough on young people.

Therefore, MacDonald suggested, the relatively fast reading times in the ambiguous region of the experimental condition arose because the comparable reading times in the control condition were quite slow, as readers were taken aback by the infelicitous use of “this” and “these.” MacDonald therefore used an additional type of control sentence. Rather than using different determiners, she used the unambiguous phrases “deserted trains” and “desert trained.” She found that “this” and “these” did indeed slow down processing, even in the unambiguous version (“I know that these deserted trains could resupply the camp” compared with “I know that the deserted trains could resupply the camp”).

MacDonald went on to test the effects of the semantic bias of the categorically ambiguous word. The semantic bias is the interpretation that people give to the ambiguity in isolation. It can turn out either to be correct if it is supported by the context, such as in (73), which normally has a noun–verb interpretation, or to be incorrect if it is not, as in (74), where “warehouse fires” normally has a noun–noun interpretation:

(73) The union told reporters that the corporation fires many workers each spring without giving them notice.
(74) The union told reporters that the warehouse fires many workers each spring without giving them notice.

According to the delay model, even a strong semantic bias should not affect initial resolution, because all decisions are delayed until the disambiguation region: Reading times should be the same whether the bias is supported or not. According to the constraint-based model, a strong semantic bias should have an immediate effect. If the interpretation favored by the semantic bias turns out to be correct, ambiguous reading times should not differ from the unambiguous control condition. It is only when the interpretation favored by the semantic bias turns out to be incorrect that reading times of the ambiguous sentence should increase. The pattern of results favored the constraint-based model. Semantic bias has an immediate effect.

(75) She saw her duck –

What happens when we encounter an ambiguous fragment such as (75)? In this situation, the continuation using “duck” in its sense as a verb (e.g., “She saw her duck and run”) is statistically more likely than that as a noun (e.g., “She saw her duck and chickens”). It is possible to bias the interpretation with a preceding context sentence (e.g., “As they walked round, Agnes looked at all of Doris’s pets”). Boland (1997), using analysis of reading times, showed that whereas probabilistic lexical information is used immediately to influence the generation of syntactic structures, background information is used later to guide the selection of the appropriate structure. These findings support the constraint-based approach: When we identify a word, we do not just access its syntactic category, we activate other knowledge that plays an immediate role in parsing, such as the knowledge about the frequency of alternative syntactic structures. However, the finding that context sometimes has a later effect requires modification of standard constraint-based theories.

GAPS, TRACES, AND UNBOUNDED DEPENDENCIES

Syntactic analysis of sentences suggests that sometimes constituents have been deleted or moved. Compare (76) and (77):

(76) Vlad was selling and Agnes was buying.
(77) Vlad was selling and Agnes _ buying.

Sentence (77) is perfectly grammatically well formed. The verb (“was”) has been deleted to avoid repetition, but it is still there, implicitly. Its deletion has left a gap in the location marked.
Parts of a sentence can be moved elsewhere in the sentence. When they are moved they leave a special type of gap called a trace. There is no trace in (78), but in (79) “sharpen” is a transitive verb demanding an object; the object “sword” has been moved, leaving a trace (indicated by t). This type of structure is called an unbounded dependency, because closely associated constituents are separated from each other (and can, in principle, be infinitely far apart).

(78) Which sword is sharpest?
(79) Which sword did Vlad sharpen [t] yesterday?

Gaps and traces may be important in the syntactic analysis of sentences, but is there any evidence that they affect parsing? If so, the gap has to be located and then filled with an appropriate filler (here “the sword”).

There is some evidence that we fill gaps when we encounter them. First, traces place a strain on memory: The dislocated constituent has to be held in memory until the trace is reached. Second, processing of the trace can be detected in measurements of the brain’s electrical activity (Garnsey, Tanenhaus, & Chapman, 1989; Kluender & Kutas, 1993), although it is difficult to disentangle the additional effects of plausibility and working memory load in these studies. Third, all languages seem to employ a recent-filler strategy, whereby in cases of ambiguity a gap is filled with the most recent grammatically plausible filler. For example, Frazier, Clifton, and Randall (1983) noted that sentences of the form of (80) are understood 100 ms faster (as measured by reading times) than sentences such as (81):

(80) This is the girl the teacher wanted [t1] to talk to [t2].
(81) This is the girl the teacher wanted [t] to talk.

One possibility is that when the processor detects a gap it fills it with the most active item, and is prepared to reanalyze if necessary. This is the active-filler strategy (Frazier & Flores d’Arcais, 1989). Another possibility is that the processor detects a gap, and fills it with a filler, that is, the most recent potential dislocated constituent. This is the recent-filler strategy. This leads to the correct outcome in (80): Here the constituent “the teacher” goes into the gap t1, leaving “the girl” to go into t2. In (81), however, it is “the girl” that should go into the gap t, and not the most recent constituent (“the teacher”). This delays processing, leading to the slower reading times. These two strategies can be quite difficult to distinguish, but in each case trace-detection plays an important role in parsing.

Finally, at first sight some of the strongest evidence for the processing importance of traces is the finding that traces appear able to prime the recognition of the dislocated constituents or antecedents with which they are associated. That is, the filler of the gap becomes semantically reactivated at the point of the gap. There is significant priming of the NP filler at the gap (Nicol, 1993; Nicol & Swinney, 1989). In a sentence such as (82), the NP “astute lawyer” is the antecedent of the trace [t], as the “astute lawyer” is the underlying subject who is going to argue during the trial (Bever & McElree, 1988). In the superficially similar control sentence (83) no constituent has been moved, and therefore there is no trace.

(82) The astute lawyer, who faced the female judge, was certain [t] to argue during the trial.
(83) The astute lawyer, who faced the female judge, hated the long speeches during the trial.

We find that the gap in (82) does indeed facilitate the recognition of a probe word from the antecedent (e.g., “astute”). The control sentence (83) produces no such facilitation. Hence, when we find a trace, we appear to retrieve its associated antecedent, a process known as binding the dislocated constituent to the trace, thereby making it more accessible.

On the other hand, there is other research suggesting that traces are not important in on-line processing. McKoon, Ratcliff, and Ward (1994) failed to replicate the studies that show wh-traces (traces formed by question formation) can prime their antecedents (e.g., Nicol & Swinney, 1989). Although unable to point to any conclusive theoretical reasons why it should be the case, they found that the choice of control
words in the lexical decision was very important; a choice of different words could obliterate the effect. They found no priming when the control words were chosen from the same set of words as the test words, yet priming was reinstated when the control words were from a different set of words than the test words. In addition, when they found priming, they found it for locations both after and before the verb. This should not be expected if the trace is reinstating the antecedent, as the trace is only activated by the verb. Clearly what is happening here is poorly understood.

An alternative view to the idea that we activate fillers when we come to a gap is that interpretation is driven by the verbs rather than the detection of the gaps, so that we postulate expected arguments to a verb as soon as we reach it (Boland, Tanenhaus, Carlson, & Garnsey, 1989). In the earlier sentences where there was evidence of semantic reactivation, the traces were adjacent to the verbs, so the two approaches make the same prediction. What happens if they are separated? Consider sentence (84):

(84) Which bachelor did Boris grant the maternity leave to [t]?

This sentence is semantically anomalous, but when does it become implausible? If the process of gap postulation and filling is driven by the syntactic process of trace analysis, it should only become implausible when people reach the trace at the end of the sentence. The role of “bachelor” can only be assigned after the preposition “to.” But if the process is verb-driven, the role of “bachelor” can be determined as soon as “maternity leave” is assigned to the role of the direct object of “grant”; hence “bachelor” is the recipient. So the anomaly will be apparent here. This is what Boland et al. found. Hence the postulation and filling of gaps are immediate and are driven by the verbs (for similar results see Altmann, 1999; Boland, Tanenhaus, Garnsey, & Carlson, 1995; Nicol, 1993; Pickering & Barry, 1991; Tanenhaus, Boland, Mauner, & Carlson, 1993). For example, consider (85) from Traxler and Pickering (1996):

(85) That is the very small pistol in which the heartless killer shot the hapless man [t] yesterday afternoon.

Clearly this sentence is implausible, but when do readers experience difficulty? Here the gap location is after “man” (because in the plausible version the word order should be “the heartless killer shot the hapless man with the very small pistol yesterday afternoon”), but the readers experience processing difficulty immediately on reading “shot.” The unbounded dependency has been formed before the gap location is reached. The parsing mechanism seems to be using all sources of information to construct analyses as soon as possible.

Similarly, Tanenhaus et al. (1989) presented participants with sentences such as (86) and (87):

(86) The businessman knew which customer the secretary called [t] at home.
(87) The businessman knew which article the secretary called [t] at home.

At what point do people detect the anomaly in (87)? Analysis of reading times showed that participants detect the anomaly before the gap, when they encounter the verb “called.” ERP studies confirm that the detection of the anomaly is associated with the verb (Garnsey et al., 1989).

In summary, the preponderance of evidence suggests that fillers are postulated by activating the argument structure of verbs.

THE NEUROSCIENCE OF PARSING

As we would expect of a complex process such as parsing, it can be disrupted as a consequence of brain damage. Deficits in parsing, however, might not always be apparent, because people can often rely on semantic cues to obtain meaning. The deficit becomes apparent when these cues are removed and the patient is forced to rely on syntactic processing.

There is some evidence that syntactic functions take place in specific, dedicated parts of the
brain. The evidence includes the differing effects of brain damage to regions of the brain such as Broca’s and Wernicke’s areas (see Chapters 3 and 13), and studies of brain imaging (e.g., Dogil, Haider, Schaner-Wolles, & Husman, 1995; Friederici, 2002; Neville et al., 1991).

The comprehension abilities of agrammatic aphasics

The disorder of syntactic processing that follows damage to Broca’s area is called agrammatism. The most obvious feature of agrammatism is impaired speech production (see Chapter 13), but many people with agrammatism also have difficulty in understanding syntactically complex sentences. The ability of people with agrammatism to match sentences to pictures when semantic cues are eliminated is impaired (Caramazza & Berndt, 1978; Caramazza & Zurif, 1976; Saffran, Schwartz, & Marin, 1980). These patients are particularly poor at understanding reversible passive constructions (e.g., “The dog was chased by the cat” compared with “The flowers were watered by the girl”) and object relative constructions (e.g., “The cat that the dog chased was black” compared with “The flowers that the girl watered were lovely”) in the absence of semantic cues.

One explanation for these people’s difficulty is that brain damage has disrupted their parsing ability. One suggestion is that these patients are unable to access grammatical elements correctly (Pulvermüller, 1995). Another idea is that this difficulty arises because syntactic traces are not processed properly, and the terminal nodes in the parse trees that correspond to function words are not properly formed (Grodzinsky, 1989, 1990; Zurif & Grodzinsky, 1983). Grodzinsky (2000) spelled out the trace-deletion hypothesis. This hypothesis states that people with an agrammatic comprehension deficit have difficulty in computing the relation between elements of a sentence that have been moved by a grammatical transformation and their origin (trace), as well as in constructing the higher parts of the parse tree. One problem with this view is that, as we have seen, the evidence for the existence of traces in parsing is questionable.

Some evidence against the idea that people with agrammatism have some impairment in parsing comes from the grammaticality judgment task. This task simply involves asking people whether a string of words forms a proper grammatical sentence or not. Linebarger, Schwartz, and Saffran (1983) showed that the patients are much more sensitive to grammatical violations than one might expect from their performance on sentence comprehension tasks. They performed poorly in a few conditions containing structures that involve making comparisons across positions in the sentence (such as being insensitive to violations like “*the man dressed herself” and “*the people will arrive at eight o’clock didn’t they?”). It appears, then, that these patients can compute the constituent structure of a sentence, but have difficulty using that information, both for detecting certain kinds of violation and for thematic role assignment. Schwartz, Linebarger, Saffran, and Pate (1987) showed that agrammatic patients could isolate the arguments of the main verb in sentences that were padded with extraneous material, but had difficulty using the syntax for the purpose of thematic role assignment. These studies suggest that these patients have not necessarily lost syntactic knowledge, but are unable to use it properly. The mapping hypothesis proposes instead that the comprehension impairment arises because, although low-level parsing processes are intact, agrammatics are limited in what they can do with the results of these processes. In particular, they have difficulty with thematic role assignment (Linebarger, 1995; Linebarger et al., 1983). They compensate, at least in part, by making use of semantic constraints, although Saffran, Schwartz, and Linebarger (1998) have shown that reliance on these constraints may sometimes lead them astray. Thus these patients failed to detect anomalies such as “*The cheese ate the mouse” and “*The children were watched by the movie” approximately 50% of the time.

Some types of patient that we might expect to find have so far never been observed. In particular, no one has (yet) described a case of a person who knows the meaning of words but who is
unable to assign them to thematic roles (Caplan, 1992; although Schwartz, Saffran, & Marin, 1980b, describe a patient who comes close).

A completely different approach postulates that the syntactic comprehension deficit results from an impairment of general memory. According to this idea, the pattern of impairment observed depends on the degree of reduction of language capacity, and the structural complexity of the sentence being processed (Miyake, Carpenter, & Just, 1994). (Somewhat confusingly, although Miyake et al. talk of a reduction in working memory capacity, they mean a reduction in the capacity of a component of the central executive of Baddeley’s 1990 conception of working memory that serves language comprehension; see Just & Carpenter, 1992.) In particular, these limited computational resources mean that people with a syntactic comprehension deficit suffer from restricted availability of the materials. Miyake et al. simulated agrammatism in normal comprehenders with varying memory capacities by increasing computational demands using very rapid presentation of words (120 ms a word). Along similar lines, Blackwell and Bates (1995) created an agrammatic performance profile in normal participants who had to make grammaticality judgments about sentences while carrying a memory load. In other words, people with a syntactic comprehension deficit are just at one end of a continuum of central executive capacity compared with the normal population. Syntactic knowledge is still intact, but cannot be used properly because of this working memory impairment. Grammatical elements are not processed in dedicated parts of the brain, but are particularly vulnerable to a global reduction in computational resources. Further evidence for this idea comes from self-reports from aphasic patients suggesting that they have limited computational resources (“other people talk too fast”; Rolnick & Hoops, 1969) and conversely that slower speech facilitates syntactic comprehension in some aphasic patients (e.g., Blumstein, Katz, Goodglass, Shrier, & Dworetzky, 1985). Increased time provides more opportunity for using the limited resources of the central executive. Indeed, time shortage or the rapid decay of the results of syntactic processing might play a causal role in the syntactic comprehension deficit and in agrammatic production (Kolk, 1995).

This is an interesting idea that has provoked a good deal of debate. The extent to which the comprehension deficit is related to limited computational resources is debatable. For example, giving these patients unlimited time to process sentences does not lead to an improvement in processing (Martin, 1995; Martin & Feher, 1990). The degree to which Miyake et al. simulated aphasic performance has also been questioned (Caplan & Waters, 1995a). In particular, the performance of even their lowest-span participants was much better than that of the aphasic comprehenders. Caplan and Waters pointed out that rapid presentation might interfere with the perception of words rather than syntactic processing. Furthermore, patients with Alzheimer’s disease (AD) with restricted working memory capacity show little effect of syntactic complexity, but do show large effects of semantic complexity (Rochon, Waters, & Caplan, 1994). Addressing these concerns, Dick et al. (2001) compared the syntactic comprehension abilities of agrammatic patients with those of college students working under a variety of stressful conditions (e.g., with the speech masked by noise, or by compressing the speech). The two groups then performed similarly.

Finally, if there is a reduction in processing capacity involved in syntactic comprehension deficits, it might be a reduction specifically in syntactic processing ability, rather than a reduction in general verbal memory capacity (Caplan, Baker, & Dehaut, 1985; Caplan & Hildebrandt, 1988; Caplan & Waters, 1999). The extent to which this is the case, or whether general verbal working memory is used in syntactic processing (the capacity theory), is still a hotly debated topic with few signs of settling on any agreement (Caplan & Waters, 1996, 1999; Just & Carpenter, 1992; Just, Carpenter, & Keller, 1996; Waters & Caplan, 1996; see also Chapter 15). On balance it looks as though a general reduction in working memory capacity cannot cause the syntactic deficit in agrammatism.
Are content and function words processed differently?

Remember that content words do the semantic work of the language and include nouns, verbs, adjectives, and most adverbs, while function words, which are normally short, common words, do the grammatical work of the language. Are content and function words processed in different parts of the brain?

Content words are sensitive to frequency in a lexical decision task, but function words are not. For a while it was thought that this pattern is not observed in patients with agrammatism (Bradley, Garrett, & Zurif, 1980). Instead, agrammatic patients appeared to be sensitive to the frequency of function words, as well as to the frequency of content words. The explanation offered was that the brain damage means that function words can no longer be accessed by the special set of processes and have to be accessed in the same way as content words. Perhaps the comprehension difficulties of these patients arise from difficulty in activating function words? Unfortunately, the exact interpretation of these results has proved very controversial, and the original studies have not been replicated (see, for example, Gordon & Caramazza, 1982; Swinney, Zurif, & Cutler, 1980). Caplan (1992) concluded that there is no clear neuropsychological evidence that function words are treated specially in parsing.

Is automatic or attentional processing impaired in agrammatism?

Most of the tasks used in the studies described so far (e.g., sentence–picture matching tasks, anomaly detection, and grammaticality judgment) are off-line, in that they do not tap parsing processes as they actually happen. Therefore, the results obtained might reflect the involvement of some later variable (such as memory). So do these impairments reflect deficits of automatic parsing processes, or deficits of some subsequent attentional process?

Tyler (1985) provided an indication that at least some deficits in some patients arise from a deficit of attentional processing. She examined aphasic comprehension of syntactic and semantic anomalies, comparing performance on an on-line measure (monitoring for a particular word) with that on an off-line measure (detecting an anomaly at the end of the sentence). She found patients who performed normally on the on-line task but very poorly on the off-line task. This suggests that the automatic parsing processes were intact, but the attentional processes were impaired.

This is a complex issue that has spawned a great deal of research (e.g., Friederici & Kilborn, 1989; Haarmann & Kolk, 1991; Martin, Wetzel, Blossom-Stach, & Feher, 1989; Milberg, Blumstein, & Dworetzky, 1987; Tyler, Ostrin, Cooke, & Moss, 1995). Clearly at least some of the deficits we observe arise from attentional factors: the question remaining is, how many?

Evaluation of work on the neuroscience of parsing

Although there has been a considerable amount of work on the neuropsychology of parsing, it is much more difficult to relate this work to the psychological processes involved in parsing. Much of the work is technical in nature and relates to linguistic theories of syntactic representation. It is also unlikely that there is a single cause for the range of deficits observed (Tyler et al., 1995).

Friederici (2002) describes a model of sentence processing where the left temporal regions identify sounds and words; the left frontal cortex is involved in sequencing and the formation of structural and semantic relations; and the right hemisphere is involved in identifying prosody (see Figure 10.4). She argues that imaging and electrophysiological data suggest that sentence processing takes place in three phases. In Phase 1 (100–300 ms) the initial syntactic structure is formed on the basis of information about word category. In Phase 2 (300–500 ms) lexical-syntactic processes take place, resulting in thematic role assignment. In Phase 3 (500–1,000 ms) the different types of information are integrated. She argues that syntactic and semantic processes only interact in Phase 3.
FIGURE 10.4 Brodmann areas in the left hemisphere. The inferior frontal gyrus (IFG) is shown in green, the superior temporal gyrus (STG) in red, and the middle temporal gyrus (MTG) in blue. From Friederici (2002).
SUMMARY
• The clause is an important unit of syntactic processing.
• In autonomous models, only syntactic information is used to construct and select among alternative syntactic structures; in interactive models non-syntactic information is used in the selection process.
• Psycholinguists have particularly studied how we understand ambiguous sentences, such as garden path constructions.
• One of the most studied types of garden path sentence is the reduced relative, as in the well-known sentence “The horse raced past the barn fell.”
• Early models of parsing focused on parsing strategies using syntactic cues.
• Kimball proposed seven surface structure parsing strategies.
• The sausage machine of Frazier and Fodor comprised a limited window preliminary phrase packager (PPP) and a sentence structure supervisor (SSS).
• The principle of minimal attachment says that we prefer the simplest construction, where simple means the structure that creates the minimum number of syntactic nodes.
• The principle of late closure says that we prefer to attach incoming material to the clause or phrase currently being processed.
• Languages may differ in their attachment preferences.
• The garden path model of parsing is still a two-stage model, where only syntactic information can affect the first stage.
• The referential model of parsing explains the garden path effect in terms of discourse factors such as the number of entities presupposed by the alternative constructions.
• In constraint-based models, all types of information (e.g., thematic information about verbs) are used to select among alternative structures.
• The experimental evidence for and against the autonomous garden path and interactive constraint-based models is conflicting.
• In constraint-based models, lexical and syntactic ambiguity are considered to be fundamentally the same thing, and resolved by similar mechanisms.
• Statistical preferences may have some role in parsing.
• Some recent models have questioned whether syntax needs to precede semantic analysis.
• Gaps are filled by the semantic reactivation of their fillers.
• Gaps may be postulated as soon as we encounter particular verb forms.
• Verbs play a central role in parsing.
• ERP studies show that people try and predict what is coming next.
• Some aphasics show difficulties in parsing when they cannot rely on semantic information.
• There is no clear neuropsychological evidence that content and function words are processed differently in parsing.
• Some off-line techniques might be telling us more about memory limitations or semantic integration than about what is actually happening at the time of parsing.
• Electrophysiological and imaging data suggest that sentence comprehension takes place in three phases, and different components of processing are identifiable with distinct regions of the brain.
1. What does the evidence from the study of language development tell us about the relation
between syntax and other language processes? (You may need to look at Chapters 2 and 3 again
in order to be able to answer this question.)
2. What do studies of parsing tell us about some of the differences between good and poor readers?
3. Is the following statement true: “Syntax proposes, semantics disposes”?
4. How does the notion of “interaction” in parsing relate to the notion of “interaction” in word
recognition?
5. Which experimental techniques discussed in this chapter are likely to give the best insight into
what is happening at the time of parsing? How would you define “best”?
FURTHER READING
Fodor, Bever, and Garrett (1974) is the classic work on much of the early research on the possi-
ble application of Chomsky’s research to psycholinguistics, including deep structure and the deri-
vational theory of complexity. Greene (1972) covers the early versions of Chomsky’s theory, and
detailed coverage of early psycholinguistic experiments relating to it. See Clark and Clark (1977)
for a detailed description of surface structure parsing cues. Johnson-Laird (1983) discusses different
types of parsing systems with special reference to garden path sentences.
For reviews on parsing work see Pickering and van Gompel (2006) and van Gompel and Pickering
(2007). For a model based on a rational analysis of what parsing involves, see Hale (2010).
As Mitchell (1994) pointed out, most of the work in parsing has examined a single language.
There are exceptions, including work on Dutch (Frazier, 1987b; Frazier, Flores d’Arcais, & Coolen,
1993; Mitchell, Brysbaert, Grondelaers, & Swanepoel, 2000), French (Holmes & O’Regan, 1981),
East Asian languages (Special Issue of Language and Cognitive Processes, 1999, volume 14, parts 5
and 6), German (Bach, Brown, & Marslen-Wilson, 1986; Hemforth & Konieczny, 1999), Hungarian
(MacWhinney & Pleh, 1988), Japanese (Mazuka, 1991), and Spanish (Cuetos & Mitchell, 1988), but
the great preponderance of the work has been on English alone. It is possible that this is giving us at
best a restricted view of parsing, and at worst a misleading view.
See Caplan (1992; the paperback edition is 1996) for a detailed review of work on the neuro-
psychology of parsing. See Haarmann, Just, and Carpenter (1997) for a computer simulation of the
resource-deficit model of syntactic comprehension deficits.
CHAPTER 11
WORD MEANING
INTRODUCTION

How do we represent the meaning of words? How do we organize our knowledge of the world? These are questions about the study of meaning, or semantics. In Chapter 10 we saw how the sentence-processing mechanism constructs a representation of the syntactic relations between words. Important as this stage might be, it is only an intermediate step towards the final goal of comprehension, which is constructing a representation of the meaning of the sentence that can be used for the appropriate purpose. Derivation of meaning is hence the ultimate goal of language processing, and meaning is the start of the production process. Having some effective means of being able to represent meaning is practically important, too: effective translation between languages depends on meaning, as does effective information storage and retrieval (as in intelligent search engines). In this chapter, I examine how the meanings of individual words are represented. In Chapter 12, we will see how we combine these meanings to form a representation of the meaning of the sentence and beyond.

The discussion of non-semantic reading in Chapter 7 showed that the phonological and orthographic representations of words can be dissociated from their meanings. There is further intuitive evidence to support this dissociation (Hirsh-Pasek, Reeves, & Golinkoff, 1993). First, we can translate words from one language to another, even though not every word meaning is represented by a simple, single word in every language. Second, there is an imperfect mapping between words and their meanings such that some words have more than one meaning (ambiguity), while some words have the same meaning as each other (synonymy). Third, the meaning of words depends to some extent on the context. Hence a big ant is very different in size from a big elephant, and the red in “the red sunset” is a different color from the red in “she blushed and turned red.”

Tulving (1972) distinguished between episodic and semantic memory. Episodic memory is our memory for events and particular episodes; semantic memory is, in simple terms, our general knowledge. Hence my knowledge that the capital of France is Paris is stored in semantic memory, while my memory of a trip to Paris is an instance of an episodic memory. Semantic memory develops from or is abstracted from episodes that may be repeated many times. I cannot now recall when I learned the name of the capital of France, but clearly I must have been exposed to it at least once. We have seen that our mental dictionary is called the lexicon, and similarly our store of semantic knowledge is called our mental encyclopedia. Clearly there is a close relation between the two, both in developing and developed systems. Neuropsychology reveals important dissociations in this respect. We have seen that words and their meanings can be dissociated; but we must be wary of confusing a loss of semantic information with the inability to access or use that information. This problem is particularly important when we consider semantic neuropsychological deficits. Although the distinction between semantic and
episodic memory is a useful one, the extent to which they involve different memory processes is less clear (McKoon, Ratcliff, & Dell, 1986).

The notion of meaning is closely bound to that of categorization. A concept determines how things are related or categorized. It is a mental representation of a category. It enables us to group things together, so that instances of a category all have something in common. Thus concepts somehow specify category membership. All words have an underlying concept, but not all concepts are labeled by a word. For example, we do not have a special word for brown dogs. In English we have a word “dog” that we can use about certain things in the world, but not about others. There are two fundamental questions here. The philosophical question is how does the concept of “dog” relate to the members of the category dog? The psychological question is how is the meaning of “dog” represented and how do we pick out instances of dogs in the environment?

In principle we could have a word, say “brog,” to refer to brown dogs. We do not have such a term, probably because it is not a particularly useful one. Rosch (1978) pointed out that the way in which we categorize the world is not arbitrary, but determined by two important features of our cognitive system. First, the categories we form are determined in part by the way in which we perceive the structure of the world. Perceptual features are tied together because they form objects and have a shared function. How the categories we form are determined by biological factors is an important topic, about which little is known, although we know how color names relate to perceptual constraints (see Chapter 3). Second, the structure of categories might be determined by cognitive economy. This means that semantic memory is organized so as to avoid excessive duplication. There is a trade-off between economy and informativeness: A memory system organized with just the categories “animal,” “plant,” and “everything else” would be economical but not very informative (Eysenck & Keane, 2010). We may also need to make distinctions between members of some categories more often than others. Another disadvantage of cognitive economy might be increased retrieval time, as we need to search our memories for where the appropriate facts are stored.

It should be obvious that the study of meaning therefore necessitates capturing the way in which words refer to things that are all members of the same category and have something in common, yet are different from non-members. (Of course something can belong to two categories at once: We can have a category labeled by the word “ghost,” and another by the word “invisible,” and indeed we can join the two to form the category of invisible ghosts labeled by the words “invisible ghosts.”) There are two issues here. What distinguishes items of one category from items of another? And how are hierarchical relations between categories to be captured? There are category relations between words. For example, the basic-level category “dog” has a large number of category superordinate levels above it (such as “mammal,” “animal,” “animate thing,” and “object”) and subordinates (such as “terrier,” “Rottweiler,” and “German shepherd”; these are said to be category coordinates of each other). Hierarchical relations between categories are one clear way in which words can be related in meaning, but there are other ways that are equally important. Some words refer to associates of a thing (e.g., “dog” and “lead”). Some words (antonyms) are opposites in meaning (e.g., “hot” and “cold”). We can attempt to define many words: for example, we might offer the definition “unmarried man” for “bachelor.” A fundamental issue for semantics concerns how we should capture all these relations.

Semantics concerns more than associations (see Chapter 6). Words can be related in meaning without being associated (e.g., “yacht” and “ship”), so any theory of word meaning cannot rely simply on word association. Words with similar meanings tend to occur in similar contexts. Lund, Burgess, and Atchley (1995) showed that semantically similar words (e.g., “bed” and “table”) are interchangeable within a sentence; the resulting sentence, while maybe pragmatically implausible, nevertheless makes sense. Consider (1) and (2). If “table” is substituted for the semantically related word “bed” the sentence still makes sense. Word pairs that are only associated (e.g.,
“baby” and “cradle”) result in meaningless sentences. If we substitute “baby” for its associate “cradle” in (3), we end up with the anomalous sentence (4).

(1) The child slept on the bed.
(2) The child slept on the table.
(3) The child slept in the cradle.
(4) *The child slept in the baby.

Associations arise from words regularly occurring together, while semantic relations arise from shared contexts and higher level relations. One task of research in semantics is to capture how contexts can be shared and how these higher level relations should be specified.
Semantics is also the interface between language and the rest of perception and cognition. This relation is made explicit in the work of Jackendoff (1983), who proposed a theory of the connection between semantics and other cognitive, perceptual, and motor processes. He proposed two constraints on a general theory of semantics. The grammatical constraint says that we should prefer a semantic theory that explains otherwise arbitrary generalizations about syntax and the lexicon. Some aspects of syntax will be determined by semantics. Some AI theories and theories based on logic (in particular, a form of logic known as predicate calculus) fail this constraint. In order to work, they have to make up entities that do not correspond to anything involved in cognitive processing, and they break up the semantic representation of single words across several constituents. This constraint says that syntax and semantics should be related in a sensible way. The cognitive constraint says that there is a level of representation where semantics must interface with other psychological representations, such as those derived from perception. There is some level of representation where linguistic, motor, and sensory information are compatible. Connectionist models in particular show how this constraint can be satisfied.
This chapter focuses on a number of related topics. How do we represent the meaning of words? In particular, how does a model of meaning deal with the issues we have just raised? Is a word decomposed into more elemental units of meaning or not? How are words related to each other by their meanings? This deals with issues such as priming, and how word meanings are related. What does the neuropsychology of meaning tell us about its representation and its relation with the encyclopedia? In the next chapter, we will examine how word meanings are combined to form representations of the meaning of sentences and large units of language. By the end of this chapter you should:

• Understand the difference between sense and reference.
• Know how semantic networks might represent meaning.
• Know about the strengths and weaknesses of representing word meaning in terms of smaller units of meaning.
• Understand how we store information about categories.
• Appreciate how brain damage can affect how meaning is represented.
• Know whether we have one or more semantic memory systems.
• Understand the importance of the difference between perceptual and functional information.
• Know how semantic information breaks down in dementia.
• Be able to evaluate the importance of connectionist modeling of semantic memory.

CLASSIC APPROACHES TO SEMANTICS

It is useful to distinguish immediately between a word’s denotation and its connotation. The denotation of a word is its core, essential meaning. The connotations of a word are all of its secondary implications, or emotional or evaluative associations. For example, the denotation of the word “dog” is its core meaning: it is the relation between the word and the class of objects to which it can refer. The connotations of “dog” might be “nice,” “frightening,” or “smelly.” Put another way, people agree on the denotation, but the connotations differ from person to person. In this chapter I am primarily concerned with the denotational aspect
of meaning, although the distinction can become quite hazy.
Ask a person on the street what the meaning of “dog” is, and they might well point to one. This theory of meaning, that words mean what they refer to, is one of the oldest, and is called the referential theory of meaning. There are two major problems with this lay theory, however. First, it is not at all clear how such a theory treats abstract concepts. How can you point to “justice” or “truth,” let alone point to the meaning of a word such as “whomsoever”? Second, there is a dissociation between a word and the things to which it can refer. Consider the words “Hesperus” (Greek for “The Evening Star”) and “Phosphorus” (Greek for “The Morning Star”). They have the same referent in our universe, namely the planet Venus, but they have different senses. The ancients did not know that Hesperus and Phosphorus were the same thing, so even though the words actually refer to the same thing (the planet Venus), the words have different senses (Johnson-Laird, 1983). The sense of “Hesperus” is the planet you can see in the evening sky, but the sense of “Phosphorus” is the one in the morning sky. This distinction was made explicit in the work of Frege (1892/1952), who distinguished between the sense (often called the intension) of a word and its reference (often called its extension). The intension is a word’s sense: It is its abstract specification that determines how it is related in meaning to other words. It specifies the properties an object must have to be a member of the class. The extension is what the word stands for in the world; that is, the objects picked out by that intension. These notions can be extended from words or descriptive phrases to expressions or sentences. Frege stated that the reference of a sentence was its truth value (which is simply whether it is true or not), while its sense was derived by combining the intensions of the component words, and specified the conditions that must hold for the sentence to be true.

Referential theory (which proposes that words mean what they refer to) can be problematic. “Hesperus” (Greek for “The Evening Star”) and “Phosphorus” (Greek for “The Morning Star”) both refer to the planet Venus, but they have altogether different senses.

Logicians have developed this formal semantics approach of building logical models of meaning into complex systems of meaning known as model-theoretic semantics. (Because of the importance of truth in these theories, they are sometimes known as truth-theoretic semantics.) Although the original idea was to provide an account of logic, mathematics, and computing languages, logicians have tried to apply it to natural language. But, although formal approaches to semantics help refine what meaning might be, they appear to say little about how we actually represent or compute it (Johnson-Laird, 1983).

SEMANTIC NETWORKS

One of the most influential of all processing approaches to meaning is based on the idea that the meaning of a word is given by how it is embedded within a network of other meanings. Some of the earliest theories of meaning, from those of Aristotle to those of the behaviorists, viewed meaning as deriving from a word’s association. From infancy, we are exposed to many episodes involving the word “dog.” For the behaviorists, the meaning of the word “dog” was simply the sum of all our associations to the
word: It obtains its meaning by its place in a network of associations. The meaning of “dog” might involve an association with “barks,” “four legs,” “furry,” and so on. It soon became apparent that association in itself was insufficiently powerful to be able to capture all aspects of meaning. There is no structure in an associative network, with no relation between words, no hierarchy of information, and no cognitive economy. In a semantic network, this additional power is obtained by making the connections between items do something: they are not merely associations representing frequent co-occurrence, but themselves have a semantic value. That is, in a semantic network the links between concepts themselves have meaning.

The Collins and Quillian semantic network model
Perhaps the best-known example of a semantic network is that of Collins and Quillian (1969). This work arose from an attempt to develop a “teachable language comprehender” to assist machine translation between languages.
A semantic network is particularly useful for representing information about natural kind terms. These are words that denote naturally occurring categories and their members, such as types of animal, or metal, or precious stone. The scheme attributes fundamental importance to their inherently hierarchical nature: For example, a bald eagle is a type of eagle, an eagle is a type of bird of prey, a bird of prey is a bird, and a bird is a type of animal. This is a very economical method of storing information. If you store the information that birds have wings at the level of bird, you do not need to repeat it at the level of particular instances (e.g., eagles, bald eagles, and robins). An example of a fragment of such a network is shown in Figure 11.1. In the network, nodes are connected by links that specify the relation between the linked nodes; the most common link is an ISA link which means that the lower level node “is a” type of the higher level node. Attributes are stored at the highest possible node at which they are true of all lower nodes in the network. For example, not all animals have wings, but all birds do, so “has wings” is stored at the level of birds.

FIGURE 11.1 Example of a hierarchical semantic network (based on Collins & Quillian, 1969). Nodes and their attributes: ANIMAL (breathes, has lungs); BIRD, ISA ANIMAL (has wings, lays eggs, flies); MAMMAL, ISA ANIMAL (bears live young); ROBIN, ISA BIRD (has red breast); PENGUIN, ISA BIRD (swims, cannot fly); PIG, ISA MAMMAL (farm animal, pink skin).
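
The storage scheme described above can be sketched in a few lines of Python. This is only an illustrative toy, not Collins and Quillian's program: the dictionary layout and the function names `isa_chain` and `find_attribute` are assumptions made here, and the nodes and attributes are those of Figure 11.1. Each attribute is stored once, and lower nodes inherit it by following ISA links upward.

```python
# Toy hierarchical semantic network (nodes and attributes follow Figure 11.1).
# The data layout and function names are illustrative assumptions.
ISA = {"robin": "bird", "penguin": "bird", "pig": "mammal",
       "bird": "animal", "mammal": "animal", "animal": None}

ATTRIBUTES = {
    "animal": {"breathes", "has lungs"},
    "bird": {"has wings", "lays eggs", "flies"},
    "mammal": {"bears live young"},
    "robin": {"has red breast"},
    "penguin": {"swims", "cannot fly"},
    "pig": {"farm animal", "pink skin"},
}

def isa_chain(node):
    """Return the node followed by every node reachable via ISA links."""
    chain = []
    while node is not None:
        chain.append(node)
        node = ISA[node]
    return chain

def find_attribute(node, attribute):
    """Return the number of ISA links climbed to find the attribute, or None."""
    for links, ancestor in enumerate(isa_chain(node)):
        if attribute in ATTRIBUTES[ancestor]:
            return links
    return None

print(find_attribute("robin", "has red breast"))  # 0: stored at "robin" itself
print(find_attribute("robin", "has wings"))       # 1: inherited from "bird"
print(find_attribute("robin", "has lungs"))       # 2: inherited from "animal"
```

In this sketch “has wings” appears only once in the data, yet it can be retrieved for both “robin” and “penguin,” which is the economy the scheme is designed to provide.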
The sentence verification task
One of the most commonly used tasks in early semantic memory research was sentence verification. Participants are presented with simple “facts” and have to press one button if the sentence is true, another if it is false. The reaction time is an index of how difficult the decision was. Collins and Quillian (1969) presented participants with sentences such as (5) to (8):

(5) A robin is a robin.
(6) A robin is a bird.
(7) A robin is an animal.
(8) A robin is a fish.

Sentence (5) is trivially true, but it obviously still takes participants some time to respond “yes”; clearly they have to read the sentence and initiate a response. This sentence therefore provides a baseline measure. The response time to (5) is less than that to (6), which in turn is less than that to (7). Furthermore, the difference between the reaction times is about the same; that is, there is a linear relation. Sentence (8) is of course false.
Why do we get these results? According to this model, participants produce responses by starting off from the node in the network that is the subject in the sentence (here “robin”), and traveling through the network until they find the necessary information. As this traveling takes a fixed amount of time for each link, the farther away the information is, the slower the response time. To get from “robin” to “bird” involves traveling along only one link, but to get from “robin” to “animal” necessitates traveling along two links. That is, the semantic distance between “robin” and “animal” is greater than that between “robin” and “bird.” If the information is not found, the “no” response is made.
The characteristic of property inheritance also shows the same pattern of response times, as we have to travel along links to retrieve the property from the appropriate level. Hence reaction times are fastest to (9), as the “red-breasted” attribute is stored at the “robin” node, slower to (10), as “has wings” is stored at the “bird” level above “robin,” and slowest to (11), as this information is stored two levels above “robin” at the “animal” level.

(9) A robin has a red breast.
(10) A robin has wings.
(11) A robin has lungs.

These data from early sentence verification experiments therefore supported the Collins and Quillian model.

Problems with the Collins and Quillian model
A number of problems with this model soon emerged. First, clearly not all information is easily represented in hierarchical form. What is the relation between “truth,” “justice,” and “law,” for example? A second problem is that the materials in the sentence verification task that appear to support the hierarchical model confound semantic distance with what is called conjoint frequency. This is exemplified by the words “bird” and “robin”; these words appear together in the language (for example, they are used in the same sentence) far more often than “bird” and “animal” occur together. Conjoint frequency is a measure of how frequently two words co-occur. When you control for conjoint frequency, the linear relation between semantic distance and time is weakened (Conrad, 1972; Wilkins, 1971). In particular, hierarchical effects can no longer be found for verifying statements about attributes (“a canary has lungs”), although they persist for class inclusion (“a canary is an animal”). These findings suggest that an alternative interpretation of the sentence verification results is that the sentences that give the faster verification times contain words that are more closely associated. Another possible confound in the original sentence verification experiments is with category size. The class of “animals” is by definition bigger than the class of “birds,” so perhaps it takes longer to search (Landauer & Freedman, 1968).
Third, the hierarchical model makes some incorrect predictions. We find that a sentence such as (12) is verified much faster than (13), even though “animal” is higher in the hierarchy than “mammal” (Rips, Shoben, & Smith, 1973). This suggests that memory structure does not always reflect logical category structure.

(12) A cow is an animal.
(13) A cow is a mammal.
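
The failed prediction can be made concrete with a small sketch of the link-counting account. The network fragment, the function names, and the timing constants (`BASE_MS`, `PER_LINK_MS`) are hypothetical values chosen for illustration, not estimates from the experiments. Counting ISA links predicts that (13) should be verified faster than (12), which is the opposite of what Rips et al. observed.

```python
# Link-counting account of class-inclusion verification (illustrative only).
ISA = {"cow": "mammal", "mammal": "animal", "animal": None}

BASE_MS = 1000      # hypothetical time to read the sentence and respond
PER_LINK_MS = 75    # hypothetical fixed cost of traversing one ISA link

def links_between(instance, category):
    """Count ISA links from instance up to category; None if not reachable."""
    links, node = 0, instance
    while node is not None:
        if node == category:
            return links
        node = ISA[node]
        links += 1
    return None

def predicted_rt(instance, category):
    """Predicted time in ms for a true sentence 'An <instance> is a <category>'."""
    links = links_between(instance, category)
    if links is None:
        raise ValueError("category not reachable: the model answers 'no'")
    return BASE_MS + PER_LINK_MS * links

print(predicted_rt("cow", "mammal"))  # 1075: one link, sentence (13)
print(predicted_rt("cow", "animal"))  # 1150: two links, sentence (12)
```

Whatever constants are chosen, a fixed cost per link always makes “a cow is a mammal” the faster of the two, so the observed advantage for (12) cannot be produced by this mechanism alone.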
We do not reject all untrue statements equally slowly. Sentence (14) is rejected faster than (15), even though both are equally untrue (Schaeffer & Wallace, 1969, 1970; Wilkins, 1971). This is called the relatedness effect: The more related two things are, the harder it is to disentangle them, even if they are not ultimately from the same class.

(14) A pine is a church.
(15) A pine is a flower.

Neither are all true statements involving the same semantic distance responded to equally quickly. Sentence (16) is verified faster than (17), even though both involve only one semantic link (Rips et al., 1973), and a “robin” is judged to be a more typical bird than a “penguin” or an “ostrich” (Rosch, 1973). This advantage for more typical items is called the prototypicality effect.

(16) A robin is a bird.
(17) A penguin is a bird.

A related observation is that non-necessary features are involved in classification. When people are asked to list features of a concept, they include properties that are not possessed by all instances of the concept (e.g., “flies” is not true of all birds). Feature listings correlate with categorization times: We are faster to categorize instances the more features they share with a concept (Hampton, 1979).
In summary there are too many problematical findings from sentence verification experiments to accept the hierarchical network model in its original form. We shall see that of these troublesome findings, the prototypicality and relatedness effects are particularly important. Semantic networks do not capture the graded nature of semantic knowledge (Anderson, 2010).

Revisions to the semantic network model
Collins and Loftus (1975) proposed a revision of the model based on the idea of spreading activation. The structure of the network became more complex, with the links between nodes varying in strength or distance (see Figure 11.2). Hence “penguin” is more distant from “bird” than is “robin.” The structure is no longer primarily hierarchical, although hierarchical relations still form parts of the network. Access and priming in the network occur through a mechanism of spreading activation. The concepts of activation traveling along links of different strengths, and of many simple units connected together in complex ways, are of course important concepts in connectionist models. The problem with this model is that it is very difficult to test: It is hard to see what sorts of experiments could falsify it. Nevertheless, the idea of activation spreading around a network has proved influential in more recent models of meaning (e.g., connectionist models).

FIGURE 11.2 Example of a spreading activation semantic network. It should be noted that two dimensions cannot do justice to the necessary complexity of the network. Based on Collins and Loftus (1975). Nodes shown: SWIMS, CANARY, HAS WINGS, PENGUIN, BIRD, EATS FISH, ROBIN, YELLOW, RED.

SEMANTIC FEATURES

Another approach to semantic memory views the meaning of a word as determined not by the position of the word in a network of meaning, but by
its decomposition into smaller units of meaning. These smaller units of meaning are called semantic features (or sometimes semantic attributes, or semantic markers). Theories that make use of semantic features are often called decompositional theories. The alternative is that each word is represented by its own concept that is not decomposed further.
Semantic features work very well in some simple domains where there is a clear relation between the terms. One such domain, much studied by anthropologists, is that of kinship terms. A simple example is shown in Table 11.1. Here the meanings of the four words, “mother,” “father,” “son,” and “daughter,” are captured by combinations of the three features “human,” “male” or “female,” and “older” or “younger.” We could provide a hierarchical arrangement of these features (e.g., human → young and old; young → male or female; and old → male or female), but it would either be totally unprincipled (there is no reason why adult/young should come before male/female, or vice versa), or would involve duplication (if we store both hierarchical forms). Instead, we can list the meaning in terms of a list of features, so that father is (+ human, − female, + older).

TABLE 11.1 Decomposition of kinship terms.
Feature   Father   Mother   Daughter   Son
Human     +        +        +          +
Older     +        +        −          −
Female    −        +        +          −

We can take the idea of semantic features further, and represent the meanings of all words in terms of combinations of as few semantic features as possible. When we use features in this way it is as though they become “atoms of meaning,” and are called semantic primitives. This approach has been particularly influential in AI. For example, Schank (1972, 1975) argued that the meaning of sentences could be represented by the conceptual dependencies between the semantic primitives underlying the words in the sentence. All common verbs can be analyzed in terms of 12 primitive actions that concern the movement of objects, ideas, and abstract relations. For example, there are five physical actions (called “expel,” “grasp,” “ingest,” “move,” and “propel”), and two abstract ones (“attend” and “speak”). Their names are fairly self-explanatory, and it is not necessary to go into detail of their meanings here. Wilks (1976) described a semantic system where the meaning of 600 words in the simulation can be reduced to combinations of only 80 primitives. In this system the action sense of “beat” is denoted by (“strike” [subject: human] [object: animate] [instrument: thing]). The semantic representation and syntactic roles in which the word can partake are intimately linked. In a similar vein, Wierzbicka (2004) argues that in spite of their apparent diversity, all natural languages share a common core of about 60 conceptual primitives present in all languages. Other word meanings can be built up by combining these primitives (e.g., a plant is a living thing that cannot feel or do). Of course, just because we can reduce the meaning of all words to a relatively small number of primitives does not mean that is how we do actually represent them.
One possibility is that all words are represented in terms of combinations of only semantic primitives. In addition to these AI models, the model of Katz and Fodor (1963), described later, is of this type. Another possibility is that words are represented as combinations of features not all of which need be primitives. These non-primitive features might eventually be represented elsewhere in semantic memory as combinations of primitives. For example, the meaning of “woman” might include “human” but not “object,” because the meaning of “human” might include “animal,” and eventually the meaning of “animal” includes “object” (McNamara & Miller, 1989). This idea is similar to the principle of economy incorporated into hierarchical semantic networks. Jackendoff (1983) and Johnson-Laird (1983) described models of this type.

Early decompositional theories
One of the earliest decompositional theories was that of Katz and Fodor (1963). This theory
showed how the meanings of sentences could be derived by combining the semantic features of each individual word in the sentence. It emphasized how we understand ambiguous words. Consider examples (18) and (19). A different sense of “ball” is used in each sentence. Then consider (20), which is semantically anomalous:

(18) The witches played around on the beach and kicked the ball.
(19) The witches put on their party frocks and went to the ball.
(20) ? The rock kicked the ball.

There are no syntactic cues to be made use of here, so how do the meanings of the words in the sentence combine to resolve the ambiguity in (18) and (19) and identify the anomaly in (20)? First, Katz and Fodor postulated a decompositional theory of meaning so that the meanings of individual words in the sentence are broken down into their component semantic features (called semantic markers by Katz and Fodor). Second, the combination of features across words is governed by particular constraints called selection restrictions. There is a selection restriction on the meaning of “kick” such that it must take an animate subject and an optional object, but if there is an object then it must be a physical object. An ambiguous word such as “ball” has two sets of semantic features, one of which will be specified as something like (sphere, small, used in games, physical object … ), the other as (dance, event … ). Only one of these contains the “physical object” feature, so “kick” picks out that sense. Similarly there is a selection restriction on the verb “went” such that it picks out locations and events, which contradicts the “physical object” sense of “ball.” Finally, the selection restriction on “kick” that specifies an animate subject is incompatible with the underlying semantic features of “rock.” As there are no other possible subjects in this sentence, we consider it anomalous. As we shall see, one of the problems with this type of approach is that for most words it is impossible to provide an exhaustive listing of all of its features.

Feature-list theories and sentence verification
We have seen that decompositional theories of meaning enable us to list the meanings of words as lists of semantic features. What account does such a model give of performance on the sentence verification task, and in particular what account does it give of the problems to which hierarchical network models fall prey? Rips et al. (1973) proposed that there are two types of semantic feature. Defining features are essential to the underlying meaning of a word, and relate to properties that things must have to be a member of that category (for example, a bird is living, it is feathered, lays eggs, and so forth). Characteristic features are usually true of instances of a category, but are not necessarily true (for example, most birds can fly, but penguins and ostriches cannot).

Characteristic features are not necessarily relevant to all members of a given category; most birds can fly, for example, but penguins cannot.

According to Rips et al., sentence verification involves making comparisons of the feature lists representing the meaning of the words involved in two stages. For this reason this particular approach is called the feature-comparison theory. In the first stage, the overall featural similarity of the two words is compared, including both the defining and characteristic features. If there is very high overlap, we respond “true”; if there is very low overlap, we respond “false.” If we compare “robin” and “bird,” there is much overlap and no conflict in the complete list of features, so we can respond “true” very easily. With “robin”
and “pig,” there is very little overlap and a great deal of conflict, so we can respond “false” very quickly. However, if the amount of overlap is neither very high nor very low, we then have to go on to a second stage of comparison, where we consider only the defining features. This obviously takes additional time. An exact match on these is then necessary to respond “true.” For example, when we compare “penguin” and “bird,” there is a moderate amount of overlap and some conflict (on flying, for example). An examination of the defining features of “penguin” then reveals that it is, after all, a type of bird. The advantage of the first stage is that although the comparison is not detailed, it is very quick. We do not always need to make detailed comparisons.
One problem with the feature-list model is that it is very closely tied to the sentence verification paradigm. A more general problem is that many words do not have obvious defining features. Smith and Medin (1981) extended and modernized the feature theory with the probabilistic feature model. In this approach there is an important distinction between the core description and the identification procedures of a concept (see Figure 11.3). The core description comprises the essential defining features of the concept and captures the relations between concepts, while the identification procedures concern those aspects of meaning that are related to identifying instances of the concept. For physical objects, perceptual features form an important part of the identification procedure. Semantic features are weighted according to a combination of how salient they are and the probability of their being true of a category. For example, the feature “has four limbs” has a large weighting because it relates to something that is perceptually salient and is true of all mammals. “Bears live young” has a lower weighting because although true of almost all mammals it is less salient, while “eats meat” is even lower because it is not even true of most mammals. In a sentence verification task, a candidate instance is accepted as an instance of the category if it exceeds some critical weighted sum of features. For example, “a robin is a bird” is accepted quickly because the features of “robin” that correspond to “bird” easily exceed “bird’s” threshold.

FIGURE 11.3 Smith and Medin’s (1981) probabilistic feature model. CORE DESCRIPTION: the essential defining features of the concepts; the relations between concepts. IDENTIFICATION PROCEDURES: the aspects of meaning related to identifying instances of the concept.

The revised model has the advantage of emphasizing the relation between meaning and identification, and can account for all the verification time data. Because identifying an exemplar of a category only involves passing a threshold rather than examining the possession of defining features, categories that have “fuzzy” or unclear boundaries are no longer problematic. At this point it becomes difficult to distinguish empirically between this model and the prototype model described later.

Evaluation of decompositional theories
There is evidence for and against decompositional theories. It is a difficult area in which to carry out experiments. Indeed, Hollan (1975)
argued that it is impossible to devise an experiment to distinguish between feature-list and semantic network theories because they are formally equivalent, in that it is impossible to find a prediction that will distinguish between them (but see Rips, Smith, & Shoben, 1975, for a reply). Hence for all intents and purposes we can consider network models to be a type of decompositional model.
On the one hand, decompositional theories have an intuitive appeal, and they make explicit how we make inferences based on the meaning of words in the sentence verification task. In reducing meaning to a small number of primitives, they are very economical. On the other hand, it is difficult to construct decompositional representations for even some of the most common words. Some categories do not have any obvious defining features that are common to all their members. The most famous example of this was provided by Wittgenstein (1953), who asked what all games have in common, and therefore how “game” should be defined; that is, how it should be decomposed into its semantic primitives. There is no clear complete definition; instead, it is as though there are many different “games,” which have in common a family resemblance. If you consider some examples of games (e.g., boxing, chess, football, ring-a-ring-a-roses, solitaire), all have some of the important features (competition, recreation, teams, winners and losers), but none has all. So if we cannot define an apparently simple concept such as this, how are we going to cope with more complex examples? A glance at the examples we mentioned earlier should reveal another problem: Even when we can apparently define words, the features we come up with are not particularly appealing or intuitively obvious; one suspects that an alternative set could be generated with equal facility. It is not even clear that our definitions are complete: Often it is as though we have to anticipate all possible important aspects of meaning in advance. Bolinger (1965) criticized Katz and Fodor’s theory because of its inability to provide an explanation of the way in which we understand examples such as (21):

(21) He became a bachelor.

The word “bachelor” is ambiguous between the senses of “unmarried man who has never been married” and “a person with a university degree.” Why do we select the second interpretation in the case of (21)? You might say that it is because we know that you cannot become an unmarried man who has never been married. So does that mean that “impossible to become” is part of the underlying meaning of this sense of bachelor (that this is one of its semantic features)? This seems very implausible. Generally, the interpretation of word meaning is very sensitive to world knowledge. Is it part of the meaning of “pig” that it does not have a trunk? This also seems most unlikely. We could suggest that these problems are solved by making inferences rather than just accessing semantic memory, but then the problem becomes much more complex. Finally, we have more knowledge about word meaning than can be represented as a list of features. We also know relationships between features. For example, if something flies and builds a nest, it usually lays eggs; if a living thing has four legs, it gives birth (with a few exceptions) to live young. We say that features are intercorrelated.
The feature-comparison theory has additional problems. First, it is very specific to the sentence verification task. Second, there are some methodological problems with the Smith, Shoben, and Rips (1974) experiments. Semantic relatedness and stimulus familiarity were confounded in the original experimental materials (McCloskey, 1980). Moreover, Loftus (1973) showed that if you reverse the order of the nouns in the sentences used in sentence verification, you find effects not predicted by the theory. If we only compare lists of features for the instance and class nouns, their order should not matter. Hence (22) should be verified in the same time as (23):

(22) Is a robin a bird?
(23) Is a bird a robin?

Loftus found that noun order is important. For sentences such as (22), the verification times were a function of how often the category was mentioned given a particular instance, but
for sentences such as (23) the times were a func-

tion of how often the instance was given for the
category. However, the task involved in verifying
sentences such as (23) seems unnatural compared
with that of (22). Third, Holyoak and Glass (1975)
showed that people may have specific strategies
for disconfirming sentences, such as thinking of a
specific counter-example, rather than carrying out
extensive computation. Finally, and most tellingly,
it is not easy to distinguish empirically between
defining and characteristic features. Hampton
(1979) showed that in practice defining features
do not always define category membership. The
model still cannot easily account for the finding
that some categories have unclear or fuzzy bound-
aries. For example, for many people it is unclear
whether a tomato is a fruit or a vegetable, or both.
McCloskey and Glucksberg (1978) showed that
although participants agree on many items as mem-
bers of categories, they also disagree on many. For
example, although all participants agree that “can-
cer” is a disease and “happiness” is not, half think
that “stroke” is a disease and about half think that
it is not. Similarly, about 50% of participants think
that “pumpkin” is a type of fruit and 50% do not.
Labov (1973) showed that there is no clear bound-
ary between membership and non-membership
of a category for a simple physical object like a
“cup”: “cup” and “bowl” vary along a continuum,
and different participants put the cut-off point in
different places. Furthermore, asking participants
to focus on different aspects of the object can alter
this point. If they are asked to imagine an object
that is otherwise half-way between a cup and a
bowl as containing mashed potato, participants are
more likely to think of it as a bowl.

Fruit or vegetable? In practice, defining features
do not always define category membership. Some
categories have unclear or fuzzy boundaries.

Finally, it is important to remember that
semantic features or primitives need not have ready
linguistic counterparts. We obviously use examples
that are easy to put into words. Some semantic fea-
tures might be perceptual, or at least non-verbal. We
will return to this important point when we examine
connectionist models of meaning.
Is semantic decomposition obligatory?
From a psychological perspective, there are two
important issues (McNamara & Miller, 1989).
The first is whether we represent the meanings of
words in terms of features. The other is whether
we make use of those features in comprehension.
So is the decomposition of a word into its
component semantic features obligatory? That
is, when we see a word like “bachelor,” is the
retrieval of its features an automatic process? In
featural terms, the meaning of the unmarried man
sense of “bachelor” must clearly contain features
that correspond to (+unmarried, +man), although
these in turn might summarize decomposition into
yet more primitive features, or there might also
be others (see earlier). In any case, on the decom-
positional account, when you see or hear or think
the word “bachelor,” you automatically have to
decompose it. Therefore you will automatically
draw all the valid inferences that are implied by
its featural representation; for example, the fea-
ture (+unmarried) automatically becomes avail-
able in all circumstances.
Obligatory automatic decomposition is a very
difficult theory to test experimentally. However,
Fodor, Fodor, and Garrett (1975) observed that
some words have a negative implicit in their
definition. They called these pure definitional
negatives (PDNs for short). For example, the
word “bachelor” has such an implicit negative in
(+unmarried), which is equivalent to (not mar-
ried). It is well known that double negatives,
two negatives together, are harder to process than
one alone. Fodor et al. compared sentences (24),
(25), and (26):
(24) The bachelor married Sybil.
(25) The bachelor did not marry Sybil.
(26) The widow did not marry Sybil.

According to decompositional theories, (24)
contains an implicit negative in the form of the
PDN in “bachelor.” If this is correct, and such
features are accessed automatically, then (25) is
implicitly a double negative and should be harder
to understand than a control sentence such as
(26), which contains only an explicit negative
and no PDN. Fodor et al. could find no process-
ing difference between sentences of the types
(25) and (26). They concluded that features are
not accessed automatically, and instead pro-
posed a non-decompositional account in which
the meaning of words is represented as a whole.
(Hence Fodor had completely changed his view
of decomposition from the earlier Katz and Fodor
work.) They argued that to draw an inference such
as “a bachelor is unmarried,” you have to make a
special type of inference (called a meaning postu-
late). We do this only when required. A problem
with this study is that it is difficult to make up
good controls (for example, sentences matched
for length and syntactic complexity) for this type
of experiment (see Katz, 1977).
Fodor, Garrett, Walker, and Parkes (1980)
examined the representation of words called lexi-
cal causatives. These are verbs that bring about or
cause new states of affairs. In a decompositional
analysis such verbs would contain this feature in
their semantic representation. For example, “kill”
would be represented as something like (cause to
die), although this is obviously a far from perfect
decomposition. In Figure 11.4, (a) shows the sur-
face structure for the two sentences with the appar-
ently similar verbs “kiss” and “kill.” For the control
verb “kiss,” the deep structure analysis is the same,
but if “kill” is indeed decomposed into “cause to
die,” its deep structure should be like that of (b).
Fodor et al. asked participants to rate the perceived
relatedness between words in these sentences. In
(b), “Vlad” and “Agnes” are farther apart than
they are in the deep structure of “kissed,” as there
are more intervening nodes. Therefore “Vlad” and
“Agnes” should be rated as less related in the sen-
tence with the causative verb “Vlad killed Agnes”
than with a non-causative verb as in “Vlad kissed
Agnes.” However, Fodor et al. found no difference
in the perceived relatedness ratings in these sen-
tences, and therefore no evidence that participants
decompose lexical causatives.

FIGURE 11.4 Examples of analysis of semantics of a causative verb showing different deep structure distances. Based on Fodor et al. (1980). (a) [S [NP [N Vlad]] [VP [V kissed / killed] [NP Agnes]]]. (b) [S [NP [N Vlad]] [VP [V cause] [NP [N Agnes]] [VP [V die]]]].

Gergely and Bever (1986) questioned this
finding. In particular, they questioned whether
perceived relatedness between words truly is a
function of their structural distance. They pro-
vided experimental evidence to support their
contention, concluding that the technique of intui-
tions about the relatedness of words cannot be
used to test the relative underlying complexity
of semantic representations. The conclusion also
depends on a failure to show a difference rather
than on obtaining a difference, which is always
less satisfactory.
Some studies have concluded that complex
sentences that are hypothesized to contain more
semantic primitives are no less memorable or
harder to process than simpler sentences that pre-
sumably contain fewer primitives (Carpenter &
Just, 1977; Kintsch, 1974; Thorndyke, 1975). On
the other hand, these experiments confounded the
number of primitives with other factors (Gentner,
1981), particularly syntactic complexity (as
pointed out by Gentner, 1981, and McNamara &
Miller, 1989).
Although Fodor et al. (1980) argued that
semantic complexity should slow processing
down, it is more likely that it speeds processing
up. In Hinton and Shallice’s (1991) model of deep
dyslexia, highly imageable words have rich fea-
tural representations that make them more robust
(see Chapter 7). Features also provide scope for
interconnections. Sentences that contain features
that facilitate interconnections between their ele-
ments are recalled better than those that do not
(Gentner, 1981). For example, “give” decom-
poses into the notion of transferring the ownership
of objects between participants in the sentence,
while “sold” decomposes into the notion of trans-
ferring the ownership of objects plus an exchange
of money between participants. Hence sentences
of the type “Vlad sold the wand to Agnes” are
remembered more accurately than sentences of
the type “Vlad gave the wand to Agnes” because
the verb has a more complex underlying structure
(Gentner, 1981; see also Coleman & Kay, 1981).
Although memory tasks do not always pro-
vide an accurate reflection of what is happening
at the time of processing, there is further evidence
in favor of semantic decomposition. People with
aphasia tend to be more successful at retrieving
verbs with rich semantic representations com-
pared with verbs with less rich representations
(Breedin, Saffran, & Schwartz, 1998). For exam-
ple, the verb “hurry” has a richer representation
than “go” because it includes the meaning of
“go” with the additional features representing
“quickly.” Semantically related word substitu-
tion speech errors (see Chapter 13) always show
a featural relation between the target and occur-
ring words. Finally, much of the work on semantic
development (see Chapter 4) is best explained in
terms of some sort of featural representation.
In summary, it is likely that we represent the
meanings of words as combinations of semantic
features, although these ideas are fiendishly difficult
to test. McNamara and Miller (1989) suggested
that young children automatically decompose
early words into semantic primitives, but as they
get older, they mainly decompose them into non-
primitive features. Eventually words themselves
might act as features in the semantic system.
There has recently been a resurgence of inter-
est in semantic features. This has come from the
interplay between connectionist modeling and
neuropsychological studies of semantic memory.
Vigliocco, Vinson, Lewis, and Garrett (2004)
describe an updated feature-based model called
the Featural and Unitary Semantic Space hypoth-
esis. They argue that object and action words at
least are represented by combinations of features
grounded in perception and organized according
to modality. These ideas of grounding and modality-
specific organization are important ones to which
we will return later.
FAMILY RESEMBLANCE MODELS
We have seen that one of the major problems with
the decompositional theory of semantics is that it
is surprisingly difficult to come up with an intui-
tively appealing list of semantic features for many
words. Many categories seem to be defined by a
family resemblance between their members rather
than the specification of defining features that all
members must possess. How can we account for
the wooliness of concepts?
Prototype theories
A prototype is an average family member
(Rosch, 1978). Potential members of the category
are identified by how closely they resemble the
prototype or category average. Some instances
of a category are judged to be better exemplars
than other instances. The prototype is the “best
example” of a concept, and is often a non-exist-
ent, composite example. For example, a blackbird
(or alternatively, American robin) is very close to
being a prototypical bird; it is of average size, has
wings and feathers, can fly, and has average fea-
tures in every respect. A penguin is a long way
from being a prototypical bird, and hence we take
longer to verify that it is indeed a member of the
bird category.
The idea of a prototype arose from many dif-
ferent areas of psychology. Posner and Keele
(1968) showed participants abstract patterns of
dots. Unknown to the participants, the patterns
were distortions of just one underlying pattern of
dots that the participants did not actually see. The
underlying pattern of dots corresponds to the cate-
gory prototype. Even though participants never saw
this pattern, they later treated it as the best example,
responding to it better than the patterns they did see.
I considered the related work of Rosch on proto-
types and color naming earlier, in Chapter 3.
A prototype is a special type of schema. A
schema is a frame for organizing knowledge that
can be structured as a series of slots plus fillers
(see Chapter 12). A prototype is a schema with all
the slots filled in with average values. For exam-
ple, the schema for “bird” comprises a series of
slots such as “can fly?” (“yes” for blackbird and
robin, “no” for penguin and emu), “bill length”
(“short” for robin, “long” for curlew), and “leg
length” (“short” for robin, “long” for stork). The
bird prototype will have the most common or
average values for all these slots (can fly, short
bill, short legs). Hence a robin will be closer to
the prototype than an emu. Category boundaries
are unclear or “fuzzy.” For some items, it is not
clear which category they should belong in; and
in some extreme cases, some instances may be in
two categories (for example, a tomato may be cat-
egorized as both a vegetable and a fruit).
There is a wealth of evidence supporting
prototype theory over feature theory. Rosch and
Mervis (1975) measured family resemblance
among instances of concepts such as fruit, fur-
niture, and vehicles by asking participants to list
their features. Although some features were given
by all participants for particular concepts, these
were not technically defining features, as they did
not distinguish the concept from other concepts.
For example, all participants might say of “birds”
that “they’re alive,” but then so are all other ani-
mals. The more specific features that were listed
were not shared by all instances of a concept; for
example, not all birds fly.
A number of results demonstrate the pro-
cessing advantage of a prototype over particu-
lar instances (see for example Mervis, Catlin,
& Rosch, 1975). Sentence verification time is
faster for prototypical members of a category.
Prototypical members can substitute for category
names in sentences, whereas non-prototypical
members cannot. Words for typical objects are
learned before words for atypical ones. In a free
recall task, adults retrieve typical members before
atypical ones (Kail & Nippold, 1984). Prototypes
share more features with other instances of the
category, but minimize the featural overlap with
related categories (Rosch & Mervis, 1975). Hence,
for most people, “apple” is very close to the pro-
totype of “fruit” (Battig & Montague, 1969), and
is similar to other fruit and dissimilar to “veg-
etables,” but “tomato” is a peripheral member
and indeed overlaps with “vegetable.” There are
prototypes that possess an advantage over other
members of the category even when they are all
formally identical. Participants consider the num-
ber “13” to be a better “odd number” than “23” or
“501” (Armstrong, Gleitman, & Gleitman, 1983),
and “mother” is a better example of “female” than
“waitress.” We have already seen that these typi-
cality effects can also be found in sentence verifi-
cation times. Generally, the closer an item is to the
prototype, the easier we process it.
Prototype theories are not necessarily incon-
sistent with feature theories. According to
prototype theories, word meaning is not only
represented by essential features; non-essential
features also play a role. Theories based on fea-
tures have the additional attractive property that
they can explain how we acquire new concepts,
such as “liberty” or “hypocrisy”: we merely com-
bine existing features. Network models can also
form new concepts, by adding new nodes to the
network with appropriate connections to exist-
ing nodes. As we have seen, it is unclear whether
this is a meaningful distinction in practice. On the
other hand, new concepts are problematical for
non-decompositional theories. One suggestion
is that all concepts, including complex ones, are
innate (Fodor, 1981).
Basic levels
Rosch (1978) argued that a compromise between
cognitive economy and maximum informative-
ness results in a basic level of categorization that
tends to be the default level at which we catego-
rize and think, unless there is particular reason to
do otherwise. In general, we use the basic level of
“chairs,” rather than the lower level of “armchairs”
or the higher level of “furniture.” That is, there is
a basic level of categorization that is particularly
psychologically salient (Rosch et al., 1976). The
basic level is the level that has the most distinc-
tive attributes and provides the most economical
arrangement of semantic memory. There is a large
gain in distinctiveness from the basic level to lev-
els above, but only a small one to levels below.
For example, there seems to be a large jump from
“chairs” to “furniture” and to other types of fur-
niture such as “tables,” but a less obvious differ-
ence between different types of chair. Objects at
the basic level are readily distinguished from each
other, but objects in levels beneath the basic level
are not so easily distinguished from each other.
Nevertheless, objects at the same basic level share
perceptual contours; they resemble each other
more than they resemble members of other simi-
lar categories. It is the level at which we think, in
the sense that those are the labels we choose in the
absence of any particular need to do otherwise.
The basic level is the most general category for
which a concrete image of the whole category can
be formed (Rosch et al., 1976).
Rosch et al. (1976) showed that basic levels
have a number of advantages over other catego-
ries. Participants can easily list most of the attrib-
utes of the basic level; it is the level of description
most likely to be spontaneously used by adults;
sentence verification time is faster for basic-level
terms; and children typically acquire the basic
level first. We can also name objects at the basic
level faster than at the superordinate or subordi-
nate levels (Jolicoeur, Gluck, & Kosslyn, 1984).
Problems with the prototype model
Hampton (1981) pointed out that not all types
of concepts appear to have prototypes: Abstract
concepts in particular are difficult to fit into this
scheme. What does it mean, for example, to talk
about the prototype for “truth”? The prototype
model does not explain why categories cohere.
Lakoff (1987) points to some examples of very
complex concepts for which it is far from obvious
how there could be a prototype: the Australian
Aboriginal language Dyirbal has a coherent cat-
egory of “women, fire, and dangerous things”
marked by the word “balan.” Furthermore, the
prototype model cannot explain why typicality
judgments vary systematically depending on the
context (Barsalou, 1985). Any theory of catego-
rization that relies on similarity risks being circu-
lar: Items are in the same category because they
are similar to each other, and they are similar to
each other because they are in the same category
(Murphy & Medin, 1985; Quine, 1977). It is nec-
essary to explain how items are similar, and proto-
type theories do not do a good job of this. Finally,
the characterization of the basic level as the most
psychologically fundamental is not as clear-cut
as at first sight (Komatsu, 1992). The amount of
information we can retrieve about subordinate
levels varies with our expertise (Tanaka & Taylor,
1991). Birdwatchers, for example, know nearly as
much about subordinate members such as black-
birds, jays, and olivaceous warblers, as they do
about the basic level. Nevertheless, although
expertise increases the knowledge available at
other levels, the original basic level retains a priv-
ileged status (Johnson & Mervis, 1997).
Instance theories
Is abstraction an essential component of concep-
tual representation? An alternative view is that
of representing exemplars without abstraction:
Each concept represents a particular, previ-
ously encountered instance. We make semantic
judgments by comparison with specific stored
instances. This is the instance approach (Komatsu,
1992), also called the exemplar theory. There
are different varieties of the instance approach,
depending on how many instances are stored, and
on the quality of these instances. The instance
approach provides greater informational richness
at the expense of cognitive economy.
It is quite difficult to distinguish between
prototype and instance-based theories. Many
of the phenomena explained by prototype theo-
ries can also be accounted for by instance-based
theories. Both theories predict that people pro-
cess central members of the category better than
peripheral members (Anderson, 2010). Prototype
theories predict this because central mem-
bers are closer to the abstract prototype, while
instance-based theories predict this because cen-
tral instances are more similar to other instances
of the category. Instance-based theories predict
that specific instances should affect the process-
ing of other instances regardless of whether or
not they are close to the central tendency, and
this has been observed (Medin & Schaffer, 1978;
Nosofsky, 1991). For example, although the aver-
age dog barks, if we experience an odd-looking
one that does not, we will expect similar-looking
ones not to (Anderson, 2010). On the other hand,
abstraction theories correctly predict that people
infer tendencies that are not found in any specific
instance (Elio & Anderson, 1981). The predic-
tive power of instance-based models increases
as the number of instances considered increases
(Storms, De Boeck, & Ruts, 2000). Both abstrac-
tion-based theories (Gluck & Bower, 1988) and
instance-based theories (in the Jets and Sharks
model of McClelland, 1981; see also Kruschke,
1992) have been implemented in connectionist
models. Across a range of tasks involving natural
language categories, instance-based models give
a slightly better account than prototype models
(Storms et al., 2000). The instantiation principle
might be one possible resolution to this conflict
(Heit & Barsalou, 1996). According to this prin-
ciple, a category includes detailed information
about its range of instances. Although it is clearly
implemented in instance-based theories, it is pos-
sible to incorporate it into prototype theories. This
idea represents a shift from emphasizing cogni-
tive economy in our theories. This might not be as
disadvantageous as it first seems. Nosofsky and
Palmeri (1997) suggested that category member-
ship decisions are made by retrieving instances
one at a time from semantic memory until a deci-
sion can be made. In this case, the more instances
you have stored, the faster you can respond.
Theory theories
A final theory of classification and concept repre-
sentation has emerged from work on how children
represent natural kind categories (e.g., Carey,
1985; Markman, 1989), on judgments of similar-
ity (Rips & Collins, 1993), and on how catego-
ries cohere (Murphy & Medin, 1985). According
to theory theories, people represent categories as
miniature theories (mini-theories) that describe
facts about those categories and why the members
cohere (Murphy & Medin, 1985; Rips, 1995). A
theory underlying a concept is thought to be very
similar to the type of theory a scientist uses, say
to decide what sort of insect a particular specimen
might be. Mini-theories are sets of beliefs about
what makes instances members of categories, and
an idea about what normal properties an
instance of a category should possess. They look
rather like encyclopedia entries. Concept devel-
opment throughout childhood is a case of the
child evolving theories of categories that become
increasingly like those used by adults.
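The contrast drawn in this chapter between prototype (abstraction) and instance-based accounts can be made concrete with a small sketch. It is only an illustration, not a model taken from the literature cited here: the slot names, the similarity measure (a simple count of matching slot fillers), and most of the slot values are assumptions. Only the fillers mentioned earlier for the “bird” schema (robin and blackbird can fly, penguin and emu cannot; robin has a short bill and short legs; curlew has a long bill; stork has long legs) come from the text.

```python
from collections import Counter

# Stored category members. Values not mentioned in the text (for example,
# the penguin's bill or the curlew's legs) are assumptions for illustration.
STORED_BIRDS = {
    "robin":     {"can_fly": "yes", "bill": "short", "legs": "short"},
    "blackbird": {"can_fly": "yes", "bill": "short", "legs": "short"},
    "curlew":    {"can_fly": "yes", "bill": "long",  "legs": "long"},
    "stork":     {"can_fly": "yes", "bill": "long",  "legs": "long"},
    "penguin":   {"can_fly": "no",  "bill": "short", "legs": "short"},
}


def overlap(a, b):
    """Number of slots on which two items have the same filler."""
    return sum(1 for slot in a if a[slot] == b.get(slot))


def build_prototype(instances):
    """Abstraction: fill each slot with its most common value."""
    slots = next(iter(instances.values())).keys()
    return {
        slot: Counter(item[slot] for item in instances.values()).most_common(1)[0][0]
        for slot in slots
    }


def prototype_score(item, instances):
    """Prototype account: similarity to the single abstracted prototype."""
    return overlap(item, build_prototype(instances))


def exemplar_score(item, instances):
    """Instance account: mean similarity to every stored instance."""
    return sum(overlap(item, stored) for stored in instances.values()) / len(instances)


robin = STORED_BIRDS["robin"]
emu = {"can_fly": "no", "bill": "short", "legs": "long"}  # assumed values

for name, item in [("robin", robin), ("emu", emu)]:
    print(name, prototype_score(item, STORED_BIRDS), exemplar_score(item, STORED_BIRDS))
```

With these toy values both scores rank the robin above the emu, which is one way of seeing why the two kinds of theory are hard to tell apart on typicality effects alone.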
Evaluation of work on classification
The current battleground on how we classify
objects is between instance-based theories and
theory theories. Other accounts can be seen as
special cases of these. For example, schema theo-
ries are just a version of theory theory, and as we
have seen, prototypes can be difficult to distin-
guish from instance-based theories, but also can
be thought of as theory-like entities (Rips, 1995).
Instance-based theories have particular difficulty
in accounting for how we understand novel con-
cepts formed by combining words, whereas the-
ory theories do rather better.
COMBINING CONCEPTS
So far we have largely been concerned with how
we represent the meanings of individual words.
How do we combine concepts and understand
novel phrases such as “green house”?
Rips (1995) points out that instance-based
theories run into obvious difficulties in provid-
ing an account of how we combine concepts. We
can still understand novel phrases even though we
might have no instances of them. We would still
be able to decide whether a particular house is an
instance of “green house” or not. Novel phrases
and sentences enable us to express an infinite
number of novel concepts whose comprehension
is beyond the reach of a finite number of already
encountered specific instances.
Theory theories have less difficulty in
accounting for concept combination, but still
face some difficulties (Rips, 1995). How do
mini-theories actually get combined? What is
the relation between the new mini-theory and
past mini-theories when they become revised in
the light of new information? Rips (1995) argued
that mini-theories alone cannot account for how
we combine concepts. They must be combined
with some other mechanism. He proposes a dual
approach combining mini-theories with a fixed
atomic symbol for each category. He calls this a
“word-like entity in the language of thought.” A
dual approach enables us to keep track of changes
to mini-theories and provides the power to be able
to recognize when some of our beliefs conflict
with each other.
How do we understand noun–noun combina-
tions (e.g., “boar burger,” “robin hawk”)? How do
we know that “corn oil” means “oil made from
corn” but that “baby oil” means “oil rubbed on
babies” (Wisniewski, 1997)? People’s interpreta-
tions of a novel phrase like “robin hawk” fall into
three categories. In one there is a thematic rela-
tion between the two entities: “a hawk that preys
on robins.” In another there is a property link
between the two: “a hawk with a red breast like a
robin.” A third, less frequent category is hybridi-
zation, where the compound is a combination
or conjunction of the constituents (e.g., a “robin
canary” is a cross between the two, and a “musi-
cian painter” refers to someone who is both).
Most of the research has been carried out on
thematic and property interpretations. There is a
general assumption in the research literature that
people try the thematic relation first, and only if
this fails to generate a plausible combination do
they attempt a property interpretation. Property
interpretations appear to be rare in natural, com-
municative contexts (Downing, 1977). One rea-
son for this bias is that relation interpretations
preserve the meaning of each noun in the combi-
nation, whereas property interpretations use just
one property of the noun that is acting as a modi-
fier (e.g., the red breast of the robin). People pre-
fer to assume that combinations involve the usual
meanings of their constituents, so they prefer to
use this strategy first. This is called the last resort
strategy.
However, Wisniewski and Love (1998)
showed that in certain circumstances people
prefer to comprehend noun combinations on
the basis of property relations. High similarity
between the constituents of a combination facili-
tates the production of property relations. People
then look for a critical difference between them
that can act as the basis of the interpretation. For
example, consider “zebra horse.” “Zebra” and
“horse” are close in meaning, and the critical dif-
ference “has stripes” can easily be used to gener-
ate the property relation “a horse with stripes.”
However, no such relation exists for “tree zebra,”
relations are by no means rare, and in some cir-

cumstances form the strategy of preference.
Combining categories presents formidable
difficulties for the way we understand language,
which have yet to be resolved.
FIGURATIVE LANGUAGE
So far we have been concerned with how we pro-
cess literal language—that is, where the intended
meaning corresponds exactly to the meanings of
Wisniewski and Love (1998) showed that people
the words. Humans make extensive use of non-literal
often prefer noun combinations based on property or figurative language. In this we go beyond the
relations. For example, a “zebra horse” is easily literal meanings of the words involved, for humor,
interpreted as a horse with stripes. effect, politeness, to play, to be creative—and for
a mixture of these and other reasons. There are
three main types of figurative language.
so we might generate a thematic relation like “a zebra that lives in trees.” In a survey of familiar noun–noun combinations, 71% of combinations had thematic relation meanings and 29% property meanings.

People are also influenced by what might be called a noun’s combinatorial history, that is, the way in which a particular word has combined with other words before. For example, when “mountain” is used in compound nouns, it usually indicates a location relation (e.g., “mountain stream,” “mountain goat,” and “mountain resort”). Hence when we come across a new combination involving “mountain” (e.g., “mountain fish”) we tend to interpret it in the same way. The modifying (first) noun of the pair is the most important in determining this (Shoben & Gagne, 1997; Wisniewski, 1997). Further evidence that experience matters is that exposure to a word pair related in a similar way makes it easier to understand a new word pair. For example, prior exposure to the word pair “glass eye” makes people faster to understand “copper horse,” when the same conceptual relation (second word is made of the first) is instantiated (Estes & Jones, 2006).

Hence the interpretation of compound nouns depends on a number of factors, including past experience, similarity, and whether plausible relations between the stimuli exist. Although there might be some bias towards understanding them on the basis of thematic relations, property

First, we use what can broadly be called metaphor. This involves making a comparison, or drawing a resemblance. A metaphor is a special type of conceptual combination, where we combine two concepts that are not normally thought of as being related for some special effect. There are many types of metaphor, depending on the relation between the words actually used and the intended meaning. Here are a few examples:

(27) Vlad fought like a tiger. (Simile)
(28) Vlad exploded with fury. (Strict metaphor)
(29) All hands on deck. (Synecdoche)

Cacciari and Glucksberg (1994) argued that there is no dichotomy between literal and metaphoric usage: rather, there is a continuum. How do we process metaphorical utterances? The standard theory is that we process non-literal language in three stages (Clark & Lucy, 1975; Searle, 1979). First, we derive the literal meaning of what we hear. Second, we test the literal meaning against the context to see if it is consistent with it. Third, if the literal meaning does not make sense with the context, we seek an alternative, metaphorical meaning (see Figure 11.5). fMRI data suggest that, to understand more abstract metaphors, people activate regions of the brain involved in general reasoning and thinking, involving working memory and executive processing (Prat, Mason, & Just, 2012).
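The three stages of the standard theory can be sketched as a strictly serial procedure. The following is a minimal illustration in Python, not an implementation taken from the literature: the function names, the toy lexicons, and the reduction of "context" to a set of plausible meanings are all assumptions made for the example.

```python
from typing import Callable, Optional, Tuple


def three_stage_interpret(
    utterance: str,
    derive_literal: Callable[[str], str],
    fits_context: Callable[[str], bool],
    seek_figurative: Callable[[str], Optional[str]],
) -> Tuple[str, int]:
    """Serial three-stage sketch; returns (interpretation, last stage reached)."""
    # Stage 1: derive the literal meaning of what we hear.
    literal = derive_literal(utterance)
    # Stage 2: test the literal meaning against the context.
    if fits_context(literal):
        return literal, 2  # Stage 3 is never reached.
    # Stage 3: the literal meaning fails, so seek a metaphorical meaning.
    figurative = seek_figurative(utterance)
    if figurative is None:
        return literal, 3  # Nothing better found; keep the literal reading.
    return figurative, 3


# Hypothetical toy data standing in for the listener's knowledge.
LITERAL = {
    "Vlad exploded with fury.": "Vlad physically blew up",
    "Vlad opened the door.": "Vlad opened the door",
}
FIGURATIVE = {"Vlad exploded with fury.": "Vlad became extremely angry"}
PLAUSIBLE_IN_CONTEXT = {"Vlad opened the door"}


def interpret(utterance: str) -> Tuple[str, int]:
    """Run the sketch with the toy lexicons above."""
    return three_stage_interpret(
        utterance,
        derive_literal=LITERAL.__getitem__,
        fits_context=PLAUSIBLE_IN_CONTEXT.__contains__,
        seek_figurative=FIGURATIVE.get,
    )


print(interpret("Vlad opened the door."))     # stops after stage 2
print(interpret("Vlad exploded with fury."))  # proceeds to stage 3
```

In this sketch the figurative lookup is consulted only after the literal reading has failed the context test. That serial ordering is the commitment that makes the standard theory testable.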
FIGURE 11.5 Stages of non-literal language processing: derive literal meaning; test literal meaning against context; if literal meaning makes no sense in context, seek alternative, metaphorical meaning.

One prediction of this three-stage model is that people should ignore the non-literal meanings of statements whenever the literal meaning makes sense, because they never need to proceed to the third stage. However, there is some evidence that people are unable to ignore non-literal meanings. Glucksberg, Gildea, and Bookin (1982) found that when good metaphoric interpretations of literally false sentences were available (e.g., “Some jobs are jails”), people take longer to decide that such sentences are literally false. That is, the metaphoric meaning seems to be processed at the same time as the literal meaning.

Is additional processing always brought into play whenever we recognize the falsity of what we read or hear (Glucksberg, 1991)? For example, in (28) we recognize that Vlad did not actually explode. The problem with this view is that not all metaphors are literally false (e.g., “no man is an island,” “my husband’s an animal”). Cacciari and Glucksberg (1994) concluded that metaphors are interpreted through a pragmatic analysis in the same way that we process conversational implicatures (see Chapter 14): We assume that what we read is maximally informative. The class-inclusion model claims that metaphors are meant to be taken literally as assertions of category membership. For example, in the metaphor “That desk is a junkyard,” the topic (desk) is intended to be interpreted as a member of the vehicle for the metaphor, in this case “junkyard” (Glucksberg & Keysar, 1990). When we are faced with a metaphor such as this, we use the vehicle (junkyard) to create the category “all things that are cluttered”; we then include the topic (desk) in this category to generate an interpretation of the metaphor (Jones & Estes, 2005). Jones and Estes confirmed this idea by showing that priming with a metaphor (e.g., “that lie is a boomerang”) increases the probability that a person will judge the topic (lie) to be an actual member of the vehicle category (boomerang), compared with a similar but literal prime (“that lie was about a boomerang”).

Second, idioms can be thought of as frozen metaphors. Whereas we make metaphors up as we go along, idioms have a fixed form and are in general use. The meaning of an idiom is usually quite unrelated to the meaning of its component words. Examples include “to kick the bucket” and “fly off the handle.” Gibbs (1980), using reading times, found that participants take less time to comprehend conventional uses of idioms than unconventional, literal uses, suggesting that people analyze the idiomatic senses of expressions before deriving the literal, unconventional interpretation. Swinney and Cutler (1979) also found that people are as fast to understand familiar idioms as they are comparable phrases used non-idiomatically. They suggested that people store idioms like single lexical items.

The meaning we intend to convey goes beyond what we actually say. When Vlad says (30), he isn’t really asking if the listener has the ability to get a glass of milk:

(30) Can you get me a glass of milk?

Instead, he is making an indirect request, asking for a glass of milk. Indirect requests are seen as more polite than the corresponding direct request (31):

(31) Get me a glass of milk!

In addition to indirect requests, we frequently expect listeners to draw inferences that go well beyond what we say. Indirect requests and inferences in conversation are discussed in more detail in Chapter 12.

How do we construct new metaphors? There is obviously an essential creative component to this; we must be able to see new connections.
Nevertheless, there are constraints. The meaning of the words cannot be either too similar or too dissimilar. Neither (32) nor (33), examples given by Aitchison (1994), is memorable:

(32) Jam is honey.
(33) Her cheeks were typewriters.

Clearly we have to generate just the right amount of overlap: the words must share an appropriate but minor characteristic. Little is known about how we can generate just the right amount of overlap. Producing metaphors and jokes is an aspect of our metalinguistic ability: our ability to reflect on and manipulate language, of which phonological awareness (see Chapter 7) is just one component.

THE NEUROSCIENCE OF SEMANTICS

What can we learn about the representation of meaning from examining the effects of brain damage? Obviously, just because a person cannot name a word or object does not mean that the semantic representation of that word has been lost or damaged. People can fail to access the phonology of a word while they still have access to its semantic representation. There are a number of reasons why this must be the case. Some people who are having difficulty accessing the whole phonological form might be able to access part of it. These people might be able to comprehend the word in speech. They might be able to produce the word in spontaneous speech. Importantly, they know how to use the objects, and they can group pictures together appropriately. In these cases we can conclude that the word meanings are intact, and that such people are having difficulty with later stages of processing. Nevertheless there are some instances where the semantic representation is clearly disrupted.

Magnetic resonance imaging (MRI) scans of the brain of a woman with a tumor (center right of scans) in the left temporal lobe. (Front of the brain is at top.) Six views are seen showing transverse sections through different levels of the brain. Language areas within the brain are seen to be active (colored areas) during sentence generation from a list of verbs. The temporal lobe is important for the processing of language meaning (semantics). Damage in this region can create problems with language processes.

Can we distinguish between a “central” semantic deficit, when a concept is truly lost (or at least when its representation is degraded), and an “access” semantic impairment (sometimes called a refractory semantic deficit), when there is difficulty in gaining access to the concept? Shallice (1988; see also Warrington & Cipolotti, 1996, and Warrington & Shallice, 1979) discussed five
criteria that could distinguish problems associated with the loss of a representation from problems of accessing it. First, performance should be consistent across trials. If an item is permanently lost, it should never be possible to access it. If an item is available on some trials rather than on others, the difficulty must be one of access. Second, for both degraded stores and access disorders, it should be easier to obtain the superordinate category than to name the item, because that information is very strongly represented; but once the superordinate is obtained, it will be very difficult to obtain any further information in a degraded store. Warrington (1975) found that superordinate information (e.g., that a lion is an animal) may be preserved when more specific information is lost. She proposed that the destruction of semantic memory occurs hierarchically, with lower levels storing specific information being lost before higher levels storing more general information. Hence, information about superordinates tends to be better preserved than information about specific instances. Impaired access should affect all levels equally. Third, low-frequency items should be lost first. Low-frequency items should be more susceptible to loss, whereas problems of access should affect all levels equally. Fourth, priming should no longer be effective, as an item that is lost obviously cannot be primed. Fifth, if the knowledge is lost then performance should be independent of the presentation rate, whereas disturbances of access should be sensitive to the rate of presentation of the material.

There has been considerable debate about how reliably these criteria distinguish access disorders from loss disorders, and how many patients show all of these features (Rapp & Caramazza, 1993). To be confident that items have been lost from semantic memory we need to observe at least consistent failure to access items across tasks. However, a number of semantic-access deficit patients have now been clearly identified (Warrington & Cipolotti, 1996; Warrington & Crutch, 2004), and other patients show elements of semantic-access deficit (e.g., Forde & Humphreys, 1995, 1997). Gotts and Plaut (2002) present a connectionist model that suggests that central and access deficits result from different types of underlying neurological damage. One type involves damage to a neuromodulatory system that normally functions to maintain and enhance neuronal signals, while the second involves damage to the neuronal system that encodes semantic information. Hence the idea is that “refractoriness,” a reduction in the ability to use the semantic system in the same way for a period of time following the initial response, builds up abnormally. (The idea is similar to that of the refractory period in neuronal firing.)

Studies of the neuropsychology of semantics cast light on a number of important issues. In particular, how many semantic memory systems are there, and how is semantic memory organized?

How many semantic systems are there?

Do we have separate semantic memory systems for each input modality? So far we have discussed semantic information as though there is only one semantic store. This is called the unimodal store hypothesis. It is the idea, perhaps held by most people, that we have one central store of meaning that we can access from different modalities (vision, taste, sound, touch, and smell). However, perhaps each modality has its own store of information? In practice, we are most concerned with a distinction between a store of visual semantic information and a store of verbal semantic information. Paivio (1971) proposed a dual-code hypothesis of semantic representation, with a perceptual code encoding the perceptual characteristics of a concept, and a verbal code encoding the abstract, non-sensory aspects of a concept. Experimental tests of this hypothesis produced mixed results (Snodgrass, 1984). For example, participants are often faster to access abstract information from pictures than from words (see for example Banks & Flora, 1977). Some support for the dual-code hypothesis is that brain-imaging studies show that concrete and abstract words are processed differently (Kounios & Holcomb, 1994).

The idea of multiple or modality-specific semantic stores, whereby verbal material (words) and non-verbal material (pictures) are separated, has enjoyed something of a resurgence owing
to data from brain-damaged participants. There are three main reasons for this (Caplan, 1992). First, priming effects have been found that are limited to verbal material. Second, some case studies show impairments limited to one sensory modality. For example, patient TOB (McCarthy & Warrington, 1988) had difficulty in understanding living things, but only when they were presented as spoken names. He could name their pictures without difficulty. Patient EM (Warrington, 1975) was generally much more impaired at verbal tasks than at visual tasks. Third, patients with semantic deficits are not always equally impaired for verbal and visual material (e.g., Warrington, 1975). Warrington and Shallice’s (1979) patient AR showed a much larger benefit from cuing when reading a written word than when naming the corresponding picture. They interpreted this finding as evidence for separate verbal and visual conceptual systems.

Coltheart, Inglis, Cupples, Michie, Bates, and Budd (1998) described the case of AC, who was unable to access visual semantic attributes, but could access other sensory semantic attributes as well as non-sensory attributes. This was observed independently of the modality of testing and of the semantic category tested. Coltheart et al. proposed that semantic memory is organized into subsystems. There is a subsystem for each sensory modality, and a subsystem for non-sensory semantic knowledge. This non-sensory subsystem is in turn divided into subsystems for semantic categories such as living and non-living things. This approach takes the fractionation of semantic memory to the extreme.

Evaluation of multiple-stores models

Alternative explanations have been offered for these studies. Riddoch, Humphreys, Coltheart, and Funnell (1988) argued that patients who perform better on verbal material might have a subtle impairment of complex visual processing. This idea is supported by the finding that the disturbance in processing pictures is greater for categories with many visually similar members (e.g., fruit and vegetables). The reverse dissociation of better performance on visual material may arise because of the abundance of indirect visual cues in pictures. For example, the presence of a large gaping mouth and heavy paws in the picture of a lion is an excellent indirect cue to how to answer a comprehension question such as “is it dangerous?,” even if you do not know it is a picture of a lion (Caplan, 1992).

Nevertheless, some research is more difficult to explain away. Bub, Black, Hampson, and Kertesz (1988) describe the case of MP, who showed very poor comprehension of verbal material, did not show automatic semantic priming, but did show much better comprehension of the meaning of pictures. The nature of the detailed information MP was able to provide about the objects in the pictures, such as the color of a banana from a black-and-white line drawing, could not easily be inferred from perceptual cues without access to semantic information about the object. Warrington and Shallice (1984) found high item consistency in naming performance as long as the modality was held constant, again suggesting different semantic systems were involved. Lauro-Grotto, Piccini, and Shallice (1997) described a patient with semantic dementia (a type of degenerative dementia where semantic memory is lost while episodic memory is relatively well preserved) who was much better at tasks involving visual input than verbal input.

Finally, supportive evidence comes from modality-specific anomia, in which the naming disorder is confined to one modality. For example, in the disorder known as optic aphasia (Beauvois, 1982; Coslett & Saffran, 1989), patients are impaired at the naming of visually presented stimuli, but without general visual anomia or agnosia. They are unable to name objects presented visually, but can name them if they are exposed to them through other modalities (e.g., patients cannot name a cat by sight, but can if they hear it mew, or if they are given one to touch), or if they are given a definition of the word. Hence the names of objects must still be intact, showing there is no general anomia. Patients can also mime the use of objects, or sort pictures into appropriate categories, showing there is no general agnosia.

The interpretation of these data is controversial. The most obvious interpretation of optic aphasia, for example, is that we can
access different modality-specific stores, with one of the stores being wiped out. Riddoch and Humphreys (1987) argued that optic aphasia is a disorder of accessing a unitary semantic system through the visual system, rather than disruption to a visual modality-specific semantic system. Much hangs on the interpretation of gestures made by the patient. Do they indeed reflect preserved visual semantics (so that patients understand the objects they see), with disruption of verbal semantics, or are they merely inferences made from the perceptual attributes of objects? Riddoch and Humphreys’ patient JV produced only the most general of gestures to objects, and other experiments indicated a profound disturbance of comprehension of visual objects. Of course, we must remember the caveat that different patients display different behaviors, and one must be wary of drawing too general a conclusion from a single patient.

Caramazza, Hillis, Rapp, and Romani (1990) argued that there is some confusion about what the terms “semantics” and “semantic stores” mean when used in neuropsychological contexts. Is semantic information general knowledge about objects and events, or just something that mediates between input and output? They distinguished four versions of the multiple-stores hypothesis. In the input account, the same semantic system, containing everything (both visual and verbal), is duplicated for each modality of input. There is little evidence for this idea. In the modality-specific content hypothesis, there is a semantic store for each input modality. Each store contains information relevant to that modality, but in an abstract or modality-neutral format. The modality-specific format hypothesis is similar to this, but the store is in the format of the input (e.g., visual information for vision, verbal for verbal). In the modality-specific context hypothesis, visual and verbal semantics refer to the information acquired in the context of visually presented objects or words. For example, if you acquired “tigers have stripes” through verbal exposure, that information is stored verbally rather than visually. These hypotheses are difficult to distinguish, but appear to make three predictions. First, they predict that access from a particular modality always activates the appropriate semantic store first. Second, they predict that activation of phonological and orthographic representations is mediated by verbal semantics. Third, they predict that information can only be accessed directly through the appropriate input modality. Caramazza et al. argued that the data do not really support these predictions. All the data really motivate is that there is a relation between input modality and semantic content type; it does not have to be in a modality-specific format. They proposed an alternative model of the semantic system that they called OUCH (short for organized unitary content hypothesis). In this model, pictures of objects have privileged access to a unimodal store. This is because a picture of an object has a more direct relationship to the object itself than a word denoting the object. A fork is a fork because you can eat with it, and you can eat with it because it has tines and a handle. Some semantic connections are more important than others. This idea is attractively simple, but OUCH cannot explain patients who have more trouble with pictures than words (e.g., FRA of McCarthy & Warrington, 1986). Finally, it is not clear that a distinction between a semantic system and subsystem is a meaningful one. Perhaps they amount to the same thing (Shallice, 1993).

How can we explain optic aphasia? There are several accounts (Sitton, Mozer, & Farah, 2000). Optic aphasia shows that the simple canonical model of meaning, where we go from sensory input to semantics, and then to name, cannot be correct, because in optic aphasia people have accessed the semantics and therefore should always be able to access the name. The modality-specific multiple-stores model accounts for optic aphasia by positing a disconnection between verbal semantics and visual semantics, with producing the correct name depending on access to verbal semantics. According to OUCH (Hillis & Caramazza, 1995), we observe optic aphasia when the semantic representation that is computed from visual input is enough to support action patterns (mimes), but not naming. Shallice (1993) pointed out that this would make optic aphasia indistinguishable from visual associative agnosia. In a similar vein Riddoch and Humphreys (1987) also hypothesize an impairment from vision to
semantics, but argue that a direct pathway from vision to gesture is preserved. Both ideas note that visual objects have affordances (Gibson, 1979); the shape of a chair encourages or creates the idea of sitting in it. Finally, Sitton et al. (2000) argue that instead of optic aphasia arising from damage to multiple semantic systems or multiple pathways, it arises from damage at multiple sites in a unitary model. They argue that lesions to the pathways mapping visual input to semantics, and also semantics to naming, can account for optic aphasia if those lesions are what they call super-additive. Super-additive means that a task requiring both pathways (naming a visually presented object) gives a much higher failure rate than would be expected on the basis of the error rates on tasks involving just one of the paths (e.g., gesturing from semantics). They present a connectionist model that shows that super-additivity can occur and that damage to a system with a single semantic store and with visual and auditory inputs and name and gesture outputs (see Figure 11.6) gives rise to a pattern of performance similar to optic aphasia. Essentially, brain damage in two parts of the brain is particularly damaging for some tasks, while leaving performance on tasks that involve just one part close to normal.

FIGURE 11.6 A schematic depiction of the super-additive impairment account of optic aphasia (Farah, 1990). From Sitton, Mozer, and Farah (2000). Diagram labels: visual and auditory (inputs); semantic; name and gesture (outputs).

Caplan (1992) proposed a compromise between the multiple-stores and unitary store theories in which only a subset of semantic information is dedicated to specific modalities. This has become known as the identification semantics hypothesis (Chertkow, Bub, & Caplan, 1992). The perceptual information necessary to identify and name an object is only a subset of the meaning of a concept. If this information is intact and the amodal associative store is impaired, a person will still be able to name an object, but will not be able to access the other verbal semantic information about the object. One argument against this hypothesis is that patient RM of Lauro-Grotto et al. (1997) had much better preserved semantic abilities than we would expect, given that she was impaired at tasks involving verbal semantics. In particular, she still had knowledge about visual contextual contiguity (knowing what items tend to occur together visually, such as a windshield wiper and a car tax disc, which in the UK is displayed in the corner of the windshield) and even functional contextual contiguity (the way objects tend to be used in the same function, such as a screwdriver and a screw). Lauro-Grotto et al. argue that these types of information are stored in visual semantics rather than being an amodal component of semantic memory.

In summary, most researchers currently believe that there are multiple semantic systems. Most importantly, there are distinct systems for verbal and visual semantics. However, it is important to note that the representations and mechanisms used by these systems need to be spelled out, and it can be quite difficult to distinguish between different theories.

Category-specific semantic disorders

Perhaps the most intriguing and hotly debated phenomena in this area are category-specific disorders. Sometimes brain damage disrupts knowledge about particular semantic categories, leaving other related ones intact. For example, Warrington and Shallice’s (1984) patient JBR performed much better at naming inanimate objects than animate objects. He also had a relative comprehension deficit for living things. At first sight this suggests that semantic memory is divided into animate and inanimate categories. On this account, JBR’s brain damage caused the loss of the animate category. The picture is more complicated, however. JBR was good at naming parts of the body, even though these are
parts of living things. He was also poor at naming musical instruments, foodstuffs, types of cloth, and precious stones, even though these are clearly all inanimate things. Difficulties with a particular semantic category are not restricted to naming pictures of its members. They arise across a range of tasks, including picture naming, picture–name matching, answering questions, and carrying out gestures appropriate to the object (Warrington & Shallice, 1984).

Even more specific semantic disorders have been observed. Hart, Berndt, and Caramazza (1985) reported a patient, MD, who also had specific difficulties in naming fruit and vegetables; PC (Semenza & Zettin, 1988) had selective difficulty with proper names; BC (Crosson, Moberg, Boone, Rothi, & Raymer, 1997) just had difficulty with medical instruments. Knowledge about nouns and verbs seems to be processed by different parts of the brain (Caramazza & Hillis, 1991; Hillis, Tuffiash, & Caramazza, 2002; Shapiro & Caramazza, 2003). It is unlikely that this dissociation can be reduced to the effects of semantic variables because of the report of a patient by Rapp and Caramazza (2002) who has greater difficulty speaking nouns than verbs, but greater difficulty writing verbs than nouns.

Methodological issues in investigating category-specific deficits

There are a number of methodological problems in studying category-specific semantic disorders. Funnell and Sheridan (1992) reported an apparent category-specific effect whereby their patient, SL, appeared to show a selective deficit in naming pictures and defining words for living versus non-living things. When they controlled for the familiarity of the stimulus, this effect disappeared. They made a general observation about the materials used for these types of experiment. Most experiments use as stimulus materials a set of black-and-white line drawings from Snodgrass and Vanderwart (1980). Some examples are given in Figure 11.7. Funnell and Sheridan (1992) showed that within this set there were more pictures of low-frequency animate objects than there were of low-frequency inanimate objects. There were few low-familiarity non-living things and few high-familiarity living things. That is, randomly selected pictures of animate things are likely to be less familiar than a random sample of inanimate objects. Hence, if frequency is important in brain-damaged naming, an artifactual effect will show up unless care is taken to control for frequency across the categories. Furthermore, there were two anomalous subcategories. SL was poor at naming human body parts (high familiarity but a subcategory of living things) and musical instruments (low frequency but inanimate). These were the two anomalous categories mentioned by Warrington and Shallice (1984) in their description of JBR.

FIGURE 11.7 Examples of line drawings from the Snodgrass and Vanderwart (1980) set.

Stewart, Parkin, and Hunkin (1992) also argued that there had been a lack of control of word name frequency, but pointed out in addition that the complexity and familiarity of the pictures used in these experiments varied between categories. Gaffan and Heywood (1993) showed that pictures of living things are visually more similar to each other than pictures of non-living things. With very brief presentation times, normal participants make more errors on living things. In reply to these criticisms, Sartori, Miozzo, and Job (1993) concluded that their patient “Michelangelo” had a real category-specific deficit for living things, even when these factors were controlled for. The
debate was continued by Parkin and Stewart (1993), and Job, Miozzo, and Sartori (1993). One conclusion is that it is important to measure and control the familiarity, visual featural complexity, and visual similarity of pictures.

On the other hand, we cannot explain all category-specific effects by these methodological problems. Studies are now careful to control for the potential confounding variables, yet category-specific deficits persist. Some patients are poor at tasks involving living things that do not involve picture naming, such as comprehension and definition (e.g., Warrington & Shallice, 1984). Most importantly, we observe a double dissociation between the categories of living and non-living things. Warrington and McCarthy (1983, 1987) describe patients who are the reverse of JBR in that they perform better on living objects than on inanimate objects. Their patient YOT, for example, who generally had an impairment in naming inanimate objects relative to animate ones, on closer examination could identify large outdoor objects such as buildings and vehicles. There also appears to be a distinction between small and large artifacts. CW also found non-living things and body parts harder to name than living things (Sacchett & Humphreys, 1992). Hillis and Caramazza (1991a) examined two patients, JJ and PS, who exemplified this double dissociation when tested on the same stimuli. Although there are fewer patients who show selective difficulties with non-living things, there are enough of them to be very convincing. The performance of these patients cannot be explained away as experimental artifacts, as they are having difficulty with members of the category that should prove easiest to process if all that matters is visual complexity and familiarity.

What explains the living–non-living dissociation?

There are three possible explanations for category-specific disorders. The first is that different types of semantic information are located at different sites in the brain, so that brain damage destroys some types and not others. On this view, information about fruit and vegetables is stored specifically in one part of the brain. If this explanation is correct then category-specific disorders are important because they reveal the structure of the categories as represented by the brain. Hence the distinction between living and non-living things would be a fundamental organizing principle in semantic memory. Farah (1994) argued that this approach would go against what we know about the organization of the brain. More importantly, this idea does not explain why deficits to particular categories tend to co-occur. Why are impairments on naming living things associated with impairments on naming gems, cloths, foodstuffs, and musical instruments, and why are impairments on naming non-living things associated with impairments on naming body parts? It is also difficult to reconcile with the observation that patients impaired at naming animals perform worse on tasks involving perceptual properties (Saffran & Schwartz, 1994; Sartori & Job, 1988; Silveri & Gainotti, 1988). The second possible explanation is that the categories that are disrupted share some incidental property that makes them susceptible to loss. Riddoch et al. (1988) proposed that categories that tend to be lost also tend to include many similar and confusable items. However, it is not clear that these patients have any perceptual disorder (Caplan, 1992). The third possible explanation is that the differences between the categories are mediated by some other variable so that the items that are lost share some more abstract property. We will look at this idea in detail.

The sensory–functional theory

Non-living things are distinguished from one another primarily in terms of their functional properties, whereas living things tend to be differentiated primarily in terms of their perceptual properties (Warrington & McCarthy, 1987; Warrington & Shallice, 1984). That is, the representation of living things depends on what they look like, but the representation of most non-living things depends on what they are used for. Hence JBR, who generally showed a deficit for living things, also performed poorly on naming musical instruments, precious stones, and fabrics. What these things all have in common is that, like living things, they are recognized primarily in terms of their perceptual characteristics, rather than
being distinguished from each other on largely functional terms. This distinction is also consistent
with the organization of the brain, which has distinct processing pathways for perceptual and motor
information (Farah, 1994).
Farah, Hammond, Mehta, and Ratcliff (1989) showed that control participants were poor at answering
questions on the perceptual features of both living and non-living objects (e.g., “Are the hind legs
of kangaroos larger than their front legs?”). If visual attributes are more difficult to process than
functional ones, then categories that depend more on them would be more susceptible to loss. This
explains why we observe loss of information about living things more frequently than loss of
information about non-living things.
There is some support from neuroimaging work for this hypothesis. There is no obvious difference in
the blood flow in the temporal lobes with responses to living and non-living things, but there is a
difference with the processing of perceptual and functional information (Lee, Graham, Simons, &
Hodges, 2002), with more activation of the posterior regions of the left temporal cortex when we are
dealing with perceptual information, and more activation of the middle regions when dealing with
functional information.
Modality-specific and category-specific effects
Is there any relation between the findings of modality-specific and category-specific effects? Farah
and McClelland (1991) argued that there is. They constructed a connectionist model and showed that
damage to a modality-specific semantic memory system can lead to category-specific deficits. The
architecture of their model comprised three “pools” of units: verbal input and output units
(corresponding to name units), visual input and output units (picture units), and semantic memory
units (divided into visual and functional units). Farah and McClelland asked students to rate
dictionary definitions of living and non-living things according to the number of sensory and
functional elements each definition contained. The meaning of each word in the model was based on
these findings. For living things, the ratio of perceptual to functional features active for each
word was 7.7:1. For non-living things, it was only 1.4:1. The network was then taught to associate
the correct semantic and name pattern when presented with each picture pattern, and to produce the
correct semantic and picture pattern when presented with each name pattern.
Farah and McClelland then lesioned the network. They found that damage to visual semantic units
primarily impaired knowledge of living things, whereas damage to functional semantic units primarily
impaired knowledge about non-living things. Furthermore, when a category was impaired, knowledge of
both types of attribute was lost. This is because of the distributed nature of the semantic
representations. Lesioning the model results in a loss of support between parts of the
representation. The elements of the representation remaining after damage do not have sufficient
critical mass to become activated.
In summary, the sensory–functional theory says knowledge of animate objects is derived primarily from
visual information, whereas knowledge of inanimate objects is derived primarily from functional
information. Non-living things do not necessarily have more functional attributes than perceptual
attributes, but they have relatively more than living things.
Challenges to the sensory–functional theory
Caramazza and Shelton (1998) challenged the prevalent view that the living–non-living distinction
merely reflects an underlying differential dependence on sensory and functional information. They
focused on the pattern of associated categories in category-specific disorders. They argued that if
the sensory–functional theory is correct, then a patient with an impairment on living things should
be impaired at tasks involving all types of living things, and also always impaired on the associated
categories of musical instruments, fabrics, foodstuffs, and gemstones. They pointed out that this is
not the case. Some patients are impaired at tasks involving animals but not foodstuffs (e.g., KR of
Hart & Gordon, 1992; JJ of Hillis & Caramazza, 1991a), whereas others are impaired at tasks involving
food but not animals (e.g., PS of Hillis & Caramazza, 1991a). Some
patients impaired at tasks involving animals are good at musical instruments (e.g., Felicia of De
Renzi & Lucchelli, 1994). Animals can be spared or damaged independently of plants (Hillis &
Caramazza, 1991a), and the category of plants can be damaged independently of animals (e.g., TU of
Farah & Wallace, 1992). It is of course possible that some types of perceptual feature are more
important for some categories than for others. For example, animals might depend on shape, while
foodstuffs might depend on color (Warrington & McCarthy, 1987). These further dissociations would
then reflect selective loss of particular types of sensory feature, rather than of all of them.
Caramazza and Shelton argue that there is no independent evidence for this approach.
The sensory–functional hypothesis also appears to predict that people with a selective impairment
for living things should show a disproportionate difficulty with visual properties. Although this has
been observed sometimes, studies that have carefully controlled for the level of difficulty of the
different types of question have not always found it to be the case (Funnell & de Mornay Davies,
1996; Laiacona, Barbarotto, & Capitani, 1993; Sheridan & Humphreys, 1993). However, Farah and
McClelland’s (1991) simulations showed that when a category was impaired, knowledge of both types of
attribute was lost. This is because of the distributed nature of the semantic representations.
In addition PET and fMRI imaging suggests that knowledge about animals and tools is indeed stored in
separate, identifiable parts of the brain (Caramazza & Shelton, 1998; Vigliocco et al., 2004). To
summarize, knowledge about animals is stored in occipital-temporal areas, while knowledge about tools
is stored in lateral temporal-parietal-occipital areas (see Figure 11.8).
Caramazza and Shelton also argued that the concept of functional information is poorly defined. In
the dictionary rating experiment of Farah and McClelland, participants were told that “it was what
things are for.” But it is possible that much other non-sensory verbal information is really involved
(for example, a lion is a carnivore and it lives in a jungle). Biological function information (such
as animals breathe and can see) is preserved in RC, even though other types of functional information
(what an animal eats or where it lives) are impaired (Tyler & Moss, 1997, 2001).
FIGURE 11.8 Imaging studies suggest that knowledge about animals is stored in the occipital-temporal
areas, whereas knowledge about tools is stored in lateral temporal-parietal-occipital areas. (Two
panels, Animals and Tools, each labeled with the frontal, parietal, occipital, and temporal lobes.)
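The lesioning result of Farah and McClelland (1991) described above can be made concrete with a toy
calculation. The sketch below is not their trained network: it only counts how much of each concept's
feature set survives when half of the visual or half of the functional semantic units are removed.
The pool sizes, the number of concepts, and the feature counts per concept are assumptions chosen so
that the perceptual-to-functional ratios are roughly 7.7:1 for living things and 1.4:1 for
non-living things, as in the text.

```python
import random

# Hypothetical pool sizes (not the values used in the published model).
N_VISUAL, N_FUNCTIONAL = 60, 20


def make_concept(n_vis, n_fun, rng):
    """Return a concept as two sets of active unit indices: (visual, functional)."""
    visual = set(rng.sample(range(N_VISUAL), n_vis))
    functional = set(rng.sample(range(N_FUNCTIONAL), n_fun))
    return visual, functional


def surviving_fraction(concept, dead_visual, dead_functional):
    """Proportion of a concept's active features that remain after a lesion."""
    visual, functional = concept
    alive = len(visual - dead_visual) + len(functional - dead_functional)
    return alive / (len(visual) + len(functional))


def mean_survival(concepts, dead_visual, dead_functional):
    """Average surviving fraction over a list of concepts."""
    total = sum(surviving_fraction(c, dead_visual, dead_functional) for c in concepts)
    return total / len(concepts)


rng = random.Random(0)

# 23:3 is roughly 7.7:1 (living things); 14:10 is 1.4:1 (non-living things).
living = [make_concept(23, 3, rng) for _ in range(10)]
nonliving = [make_concept(14, 10, rng) for _ in range(10)]

# "Lesion" half of one pool at a time by marking random units as dead.
dead_visual = set(rng.sample(range(N_VISUAL), N_VISUAL // 2))
dead_functional = set(rng.sample(range(N_FUNCTIONAL), N_FUNCTIONAL // 2))

vis_lesion_living = mean_survival(living, dead_visual, set())
vis_lesion_nonliving = mean_survival(nonliving, dead_visual, set())
fun_lesion_living = mean_survival(living, set(), dead_functional)
fun_lesion_nonliving = mean_survival(nonliving, set(), dead_functional)

print("visual lesion:     living", round(vis_lesion_living, 2),
      "non-living", round(vis_lesion_nonliving, 2))
print("functional lesion: living", round(fun_lesion_living, 2),
      "non-living", round(fun_lesion_nonliving, 2))
```

Because living things in this toy set depend far more heavily on visual units, removing visual units
leaves them with a smaller surviving fraction than non-living things, and the reverse holds for a
functional lesion. The sketch does not capture the further finding that both types of attribute are
lost together, which depends on the mutual support between units in a trained, distributed network.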
Caramazza and Shelton proposed an alternative explanation of the data, which they called the
domain-specific knowledge hypothesis (DSKH). They argued that specific, innate neural mechanisms for
distinguishing between living and non-living things have evolved because of the importance of this
distinction. They cite two lines of evidence for this. First, very young children (within the first
few months) can distinguish between living and non-living things (Bertenthal, 1993; Quinn & Eimas,
1996). The presence of this ability so soon after birth suggests that it is innate. Second, studies
of lesion sites and recent studies using brain imaging both suggest that different parts of the brain
might, after all, be dedicated to processing living and non-living things. Living things are
generally associated with the temporal lobe, while artifacts tend to be more associated with the
dorsal region of the temporal, parietal, and frontal lobes.
It is too early to evaluate these alternative approaches. Currently most researchers in the field
subscribe to the sensory–functional hypothesis. Time will tell whether the domain-specific knowledge
hypothesis will be preferred. Imaging data suggest that while knowledge about animals and tools might
be stored in different parts of the brain, this might be because of an underlying dependence on some
other factor. While animals are associated with activation of the lateral fusiform gyrus, and tools
with activation of the medial fusiform gyrus, some non-living things (e.g., chairs) cause activation
of areas outside that associated with tools (Chao, Haxby, & Martin, 1999; Vigliocco et al., 2004).
The structure of semantic memory: Evidence from studies of dementia
Dementia is a general label for the widespread decay of cognitive functioning, generally found in old
age. The ultimate causes of dementia are unknown, although it is likely that both genetic and
environmental factors play some role, and it is clear that there are several subtypes, the most
common of which is Alzheimer’s disease (AD). In dementia, memory and semantic information are
particularly prone to disruption. A subtype of dementia called semantic dementia is particularly
interesting: In semantic dementia, the loss of semantic information is disproportionately great
relative to the loss of other cognitive functions, such as episodic memory (Hodges et al., 1992;
Mayberry, Sage, & Lambon Ralph, 2011; Snowden, Goulding, & Neary, 1989; Warrington, 1975). This
selective disturbance of semantic information makes it particularly useful for studying how we
represent meaning. Alzheimer’s disease and semantic dementia reflect damage (at least initially) to
different brain regions: Neuroimaging studies show that Alzheimer’s disease typically begins with
medial temporal lobe atrophy, including the hippocampus, with more advanced cases showing global
atrophy. Semantic dementia on the other hand is marked by atrophy beginning particularly in the left
anterior temporal region of the brain, with much less early damage to the hippocampus. Patients with
semantic dementia show impaired word naming and a loss of word meaning, but preserved syntax. Imaging
results suggest that the left middle and inferior temporal cortex of the brain play a particularly
important role in accessing and representing meaning (Chan et al., 2001; Garrard & Hodges, 2000).
An Alzheimer brain scan (left) compared with a normal brain (right). The Alzheimer’s diseased brain
is considerably atrophied, due to the degeneration and death of nerve cells. Apart from a decrease
in brain volume, the surface of the brain is often more deeply folded.
Semantic memory disturbances in dementia
There is a huge body of work indicating problems with semantic processing in dementia (see Nebes,
1989, and Harley, 1998, for reviews). Here are just a few examples of these findings. People with
dementia are often impaired on the category fluency task, where they have to list as many members as
possible of a particular category (e.g., Martin & Fedio, 1983). They have difficulty listing
attributes that are shared by all members of a category (Martin & Fedio, 1983; Warrington, 1975).
They have difficulty in differentiating between items from the same semantic category (Martin &
Fedio, 1983). They tend to classify items as being similar to different items more than controls do
(Chan et al., 1993a, 1993b). They are also poor at judging the semantic coherence of simple
statements: For example, they are more likely to judge “The door is asleep” to be a sensible
statement than controls (Grossman, Mickanin, Robinson, & d’Esposito, 1996).
Difficulties with picture naming
People with dementia often have difficulty in naming things. There is evidence that the semantic
deficit is involved in picture naming. Most of the naming errors in dementia involve the production
of semantic relatives of the target (e.g., Hodges, Salmon, & Butters, 1991). The extent of the naming
impairment is correlated with the extent of the more general semantic difficulties (Diesfeldt, 1989).
Naming performance in dementia is sometimes affected by the semantic variable of imageability. With
other types of neuropsychological damage, patients usually find high-imageable items easier than
low-imageable items (e.g., Coltheart, Patterson, & Marshall, 1987; Nickels & Howard, 1994; Plaut &
Shallice, 1993b; for an exception, see Warrington, 1981). On the other hand, Warrington (1975)
described how AB was worse at defining concrete words than abstract words, while EM, with the same
diagnosis, showed the reverse and more typical pattern. Breedin, Saffran, and Coslett (1994)
described a patient, DM, who showed a relative sparing of abstract nouns relative to concrete nouns.
Clearly problems with semantic processing are implicated in the naming difficulty of people with
dementia, but might other levels of processing also be disrupted?
There is some evidence that visual processing is impaired in dementia, and that sufferers have
difficulty in recognizing objects. (See Figure 11.9 for a model of object naming.) Rochford (1971)
found a high proportion of perceptual errors in a naming task (e.g., calling an anchor a “hammer”).
Kirshner, Webb, and Kelly (1984) manipulated the perceptual difficulty of the target stimuli, by
presenting them either as a masked line drawing, a line drawing, a black and white photograph, or the
object. They found that the perceptual clarity of the stimuli affected naming performance. It is
unlikely that difficulties with visual processing of stimuli can account for all the naming problems,
because people with dementia clearly have many other deficits that do not involve visual processing.
In particular, they show a clear deficit on tasks involving the same materials presented in the
auditory modality.
FIGURE 11.9 An outline of a model of object naming. (See Chapter 13 for more detail.) Stages shown,
in order: visual input, visual representation system, semantic memory, lemma, phonological form,
speech output.
Two main lines of evidence suggest that people with dementia also have a deficit at the phonological
level. First, they have particular difficulty in naming low-frequency objects (e.g., Barker & Lawson,
1968). Jescheniak and Levelt (1994) argued that the word frequency effect in speech production arises
from differences in the thresholds of phonological forms. Second, phonological priming of the target
improves their naming (Martin & Fedio, 1983).
There are three possible explanations of these findings (Tippett & Farah, 1994). First, there might
be heterogeneity among patients. If dementia affects each patient in a different way, then each
patient might have a different locus of impairment, depending on the precise effects of their
dementia. Each type of impairment might result in a naming deficit. Second, there might be multiple
loci of impairments within each patient, such that dementia leads to disruption of the perceptual,
semantic, and lexical systems. Third, a single locus of impairment might give rise to all the
impairments observed. According to this hypothesis, damage to the semantic system in some way results
in additional perceptual and lexical deficits.
Connectionist modeling supports the single locus hypothesis. Tippett and Farah (1994) described a
computational model of important aspects of naming in dementia. In particular, they showed how
apparent visual and lexical deficits can arise solely from damage to semantic memory. In their model,
bidirectional links connect visual input units to visual hidden units, which connect to semantic
units, which connect to name hidden units, which in turn connect to name input units (see Figure
11.10). The meaning of a word is encoded as a distributed pattern of activation across the semantic
units, such that each unit corresponds to a semantic feature. The bidirectional links, together with
the cascading activation, mean that the model is highly interactive. The model was first trained so
that the application of a pattern to one of the input layers produced correct outputs at the two
layers. Dementia was simulated by removing random subsets of the semantic units.
FIGURE 11.10 Functional architecture of Tippett and Farah’s model of naming in Alzheimer’s disease.
The numbers refer to the number of units. Layers shown: semantic units (32); name hidden units (16)
and visual hidden units (16); name input units (16) and visual input units (16).
The main finding was that damage to the semantic units alone rendered the network more sensitive to
manipulations at the visual and name levels. The bidirectional links mean that damage to one level
has consequences at other levels too. Lesioning the semantic level meant that the network became more
sensitive to visual degradation of the visual input units. Visual degradation was simulated by
reducing the overall strength of the visual inputs. The lesioned network also had more difficulty in
producing low-frequency names than high-frequency names. Lexical frequency was simulated by giving
more training to some pairs than others. Finally, naming after damage was improved by phonological
priming. This was simulated by presenting part of the target phonological output pattern at the start
of the test phase.
In summary, the Tippett and Farah model shows that damage to the semantic system alone can account
for the range of semantic, visual, and lexical impairments shown in dementia. This is because, in a
highly interactive network, damage at one level may have consequences at all the others.
Evaluation of research on the neuroscience of semantic memory
In the last few years the study of the neuroscience of semantics has contributed greatly to our
understanding of the area. Although there is still considerable disagreement in the field, it has
indicated what the important questions in the psychology
of meaning are. What are the types of feature that underlie word meaning? How are categories
organized by the brain? How does our semantic system relate to input and output systems?
CONNECTIONIST APPROACHES TO SEMANTICS
Connectionism has made an impact on semantic memory, just as it did in earlier years on lower level
processes such as word recognition. We saw in Chapter 7 how Hinton and Shallice (1991) and Plaut and
Shallice (1993a) incorporated the semantic representation of words into a model of the semantic route
of word recognition. This approach gives rise to the idea that semantic memory depends on semantic
microfeatures.
Note that this approach is not necessarily a competitor to other theories such as prototypes; one
instance of a category might cause one pattern of activation across the semantic units, another
instance will cause another similar pattern, and so on. We can talk of the prototype that defines a
category as the average pattern of activation of all the instances.
Semantic microfeatures
In the connectionist models we have examined, a semantic representation does not correspond to a
particular semantic unit, but to a pattern of activation across all of the semantic units. For
example, in Tippett and Farah’s model the meaning of each word or object was represented as a pattern
of activation over 32 semantic units, each representing a semantic microfeature. A microfeature is an
individual, active unit; the prefix “micro” emphasizes that these units are involved in low-level
processes rather than explicit symbolic processing (Hinton, 1989), but there really isn’t much
difference between a feature and a microfeature. Connectionist models suppose that human semantic
memory is based on microfeatures. A semantic microfeature is really just a semantic feature, but the
prefix “micro” is added in computational modeling to emphasize their low-level nature. They mediate
between perception, action, and language, and do not necessarily have any straightforward linguistic
counterparts. While semantic microfeatures might correspond to simple semantic features, they might
correspond to something far more abstract. There is no reason to assume that the semantic
microfeatures that we develop will correspond to any straightforward linguistic equivalent (such as a
word or an attribute), in much the same way that hidden units in a connectionist network do not
always acquire an easily identifiable, specific function. In support of this idea, there is evidence
that the loss of specific semantic information can affect a set of related concepts (Gainotti, di
Betta, & Silveri, 1996). Hence semantic microfeatures might encode knowledge at a very low level of
semantic representation, or in a very abstract way that has no straightforward linguistic
correspondence (Harley, 1998; Jackendoff, 1983; McNamara & Miller, 1989). The encoding of visual
information by at least some of the semantic microfeatures is yet another reason to expect lesions to
the semantic system to result in visual errors and perceptual processing difficulty in naming with
dementia.
In Hinton and Shallice’s (1991) model of deep dyslexia, meaning was represented as a pattern of
activation across a number of semantic feature units, or sememes, such as “hard,” “soft,”
“maximum-size-less-foot,” “made-of-metal,” and “used-for-recreation.” No one claims that such
semantic features are necessarily those that humans use, but there is some evidence for this sort of
approach from data on word naming by Masson (1995). In Hinton and Shallice’s model the semantic
features are grouped together so that features that are mutually excluded inhibit each other, and
only one can be active at any one time. For example, an object cannot be both “hard” and “soft,” or
“maximum-size-less-foot” and “maximum-size-greater-two-yards,” at the same time. In addition, another
set of units called “cleanup” units modulate the activation of the semantic units. These features
allow combinations of semantic units to influence each other. We saw in Chapter 7 that semantic
memory can be thought of as a landscape with many hills and valleys. The bottom of each valley
corresponds to a particular word
meaning. Words that are similar in meaning will be in valleys that are close together. The initial
pattern of activation produced by a word when it first activates the network might be very different
from its ultimate semantic representation, but as long as you start somewhere along the sides of the
right valley, you will eventually find its bottom. The valley bottoms, which correspond to particular
word meanings, are called attractors. This type of network is called an attractor network.
If meanings are represented as a pattern of activation distributed over many microfeatures, then it
makes less sense to talk about loss of individual items in the model. Instead, the loss of units will
result in the loss of microfeatures. This will result in a general degradation in performance.
Explaining language loss in people with Alzheimer’s disease: The semantic microfeature loss hypothesis
What happens if a disease such as dementia results in the loss of semantic microfeatures? The effect
will be to distort semantic space so that some semantic attractors might be lost altogether, while
others might become inaccessible on some tasks because of the erosion of the boundaries of the
attractor basins. Damage to a subset of microfeatures will lead to a probabilistic decline in
performance. Depending on the importance of the microfeature lost to a particular item in a particular
patient, the pattern of performance observed will vary from patient to patient and from task to task.
Different tasks will give different results because they will provide differing amounts of residual
activation to the damaged system. Thus, although microfeatures are permanently lost in dementia, when
tested experimentally this loss will sometimes look like loss of information, but will at other
times look like difficulty in accessing information.
Consider response consistency, usually taken as the clearest indication of item loss. If a unit
corresponding to the meaning of the word “vampire” is lost, the meaning of that word is always going
to be unavailable. Similarly, if the unit corresponding to the attribute “bites” is lost, then that
attribute will always be unavailable. If, however, a unit corresponding to more abstract information
that is not easily linguistically encoded is lost, then the consequences might be less apparent in
any linguistic task. The loss of a feature may mean that the higher level, linguistically encoded
units become permanently unavailable, but alternatively it might just mean that the higher level
units become more difficult to access. Hence there is a probabilistic aspect to whether a word or an
attribute will be consistently unavailable. So an increasing number of linguistically encoded units
should become permanently unavailable as the severity of dementia increases and more microfeatures
are lost, as is observed (e.g., Schwartz & Chawluk, 1990).
Tippett and Farah (1994) pointed out that experimental tasks differ in the degree of constraint
provided on possible responses. Connectionist models are sensitive to multiple constraints: If one
sort of constraint is lost, other consistent ones might still be able to facilitate the correct
output. For example, in Tippett and Farah’s model, phonological priming provided an additional
constraint. Hence the availability of items will depend on the degree to which tasks provide
constraints. Patients with Alzheimer’s disease perform relatively well in highly constrained tasks.
Modeling category-specific disorders in dementia
Connectionist models of category-specific disorders in dementia are also interesting because they
tell us both about the progress of the disease and about the structure of semantic memory. Dementia
generally causes more global damage to the brain than the very specific lesioning effects of herpes
simplex that typically cause category-specific disorders. Therefore category-specific deficits are
more elusive in dementia. There is also the question of which semantic categories are more prone to
disruption in dementia. Gonnerman, Andersen, Devlin, Kempler, and Seidenberg (1997) found that
sufferers show selective impairments on tasks involving both living things and artifacts,
depending on the level of severity of the disease. Early on there is a slight relative deficit of
naming artifacts, followed later by a deficit on naming living things, followed by poor naming
performance across all categories.
What explains the way in which category-specificity varies with severity? To understand this, we need
to look more closely at semantic features. In an important study, McRae, de Sa, and Seidenberg (1997)
argued that there are different types of semantic feature, depending on the extent to which each
feature is related to other ones (see Figure 11.11). Intercorrelated features tend to occur together:
For example, most things that have beaks can also fly, and most things that have fur often have tails
and claws. Living things tend to be represented by many intercorrelated features. Semantic features
also differ in the extent to which they enable us to distinguish among things. Some features are more
important than others. Distinctive (sometimes called distinguishing) features enable members of a
category to be distinguished: For example, a leopard can be distinguished from other large cats
because it has spots. Many members of a natural kind category will share intercorrelated features,
but distinguishing features are exclusive to single items within the category. Artifacts tend not to
be represented by many intercorrelated features, but rather by many distinguishing features. Using a
primed semantic verification task (e.g., “is an apple used to make cider?”), Cree, McNorgan, and
McRae (2006) showed that distinctive features hold a privileged status in semantic memory; they are
activated more strongly than shared, non-distinctive features.
FIGURE 11.11
INTERCORRELATED FEATURES: tend to occur together; living things tend to be represented by many
intercorrelated features; many members of a natural kind category will share intercorrelated features.
DISTINGUISHING FEATURES: enable us to distinguish among things; exclusive to single items within a
category; artifacts tend to be represented by many distinguishing features.
As intercorrelated features are particularly common in the category of living things, a small amount
of damage to the semantic network, characteristic of early dementia, will have little effect on
living things. This is because the richly interconnected intercorrelated features support each other
(Devlin, Gonnerman, Andersen, & Seidenberg, 1998). Hence, early on in the progression of dementia,
tasks involving living things will appear not to be affected. Beyond a critical amount of damage,
however, this support will no longer be available. When a critical mass of distinguishing features is
lost, there will be catastrophic failure of the memory system. Then, whole categories will suddenly
become unavailable. Artifacts, however, tend not to be represented by many intercorrelated features,
but by relatively many informative distinguishing features. The loss of just a few of these features
might result in the loss of a specific item. Increasing damage then results in the gradual loss of an
increasing number of items across categories, rather than the catastrophic loss observed with living
things.
It is important to emphasize the probabilistic nature of this loss. If a distinguishing feature for
an animal happens to be lost early on, then that animal will be confused with other animals from that
point on (Gonnerman et al., 1997). However, there are more intercorrelated than non-correlated
distinguishing features within the living things category. Hence an intercorrelated feature is more
likely to be affected, but usually with no obvious consequence, than a distinguishing feature. This
type of approach is promising, but we must be wary about the relatively limited amount of data on
which this sort of
model is based. For example, Garrard, Patterson, Watson, and Hodges (1998) failed to find any
interaction between disease severity and the direction of dissociation. Instead, they found a group
advantage for artifacts, with a few individuals showing an advantage for living things.
Latent semantic analysis
We have seen that connectionism represents meaning by a pattern of activation distributed over many
simple semantic features. In these models, the features are hand-coded; they are not learned, but are
built into the simulations. How do humans learn these features? Connectionist models suggest one
means: connectionist models are particularly good at picking out statistical regularities in data, so
it is possible that we abstract them from many exposures to words. A closely related approach makes
explicit the role of co-occurrence information in acquiring knowledge. This technique is called latent
semantic analysis (LSA) (Landauer & Dumais, 1997; Landauer, Foltz, & Laham, 1998; see Burgess &
Lund, 1997, for the similar HAL, or hyperspace analog to language, model). Latent semantic analysis
needs no prior linguistic knowledge. Instead, a mathematical procedure abstracts dimensions of
similarity from a large corpus of items based on analysis of the context in which words occur. We saw
earlier how Lund et al. (1995) showed that semantically similar words are interchangeable within a
sentence. This means that the context in which words can (and cannot) occur provides a powerful
constraint on how word meanings are represented. Latent semantic analysis makes use of this context
to acquire knowledge about words. At first sight these constraints might not seem particularly strong,
but there are a huge number of them, and we are exposed to them many times. Constraints on the
co-occurrence of words provide a vast number of interrelations that facilitate semantic development.
LSA learns about these interrelations through a mechanism of induction. The mathematical techniques
involved are too complex to describe here, but essentially the algorithm tries to minimize the number
of dimensions necessary to represent all the co-occurrence information. Indeed, this type of model is
often called the HDM (high-dimensional memory) approach.
Landauer and Dumais examined how latent semantic analysis might account for aspects of vocabulary
acquisition. After exposure to a large amount of text, the model performed well at a multiple-choice
test of selecting the appropriate synonym of a target word. It also acquired vocabulary at the same
rate as children. (To give some idea of the complexity of the task, and to provide another
demonstration of the importance of computers in modern psycholinguistics, 300 dimensions were
necessary to represent relations among 4.6 million words of text taken from an encyclopedia.) This
statistical sort of approach is very good at accounting for later vocabulary learning, where direct
instruction is very rare. Instead, we infer the meanings of new words from the context. LSA also shows
how we can reach agreement on the usage of words without any external referent. This observation is
particularly useful in explaining how we acquire words describing private mental experiences. How do
you know that I mean the same thing by “I’m sad today” as you do? The answer is in the context in
which these words repeatedly do and do not occur.
One criticism of the HDM models is that they are overly concerned with the context in which words
occur, so that words are related to other words, rather than to the world, and therefore these models
find it difficult to cope with novel situations (Glenberg & Robertson, 2000; see Burgess, 2000, for a
reply). For example, we know that it makes sense to use a newspaper to protect our head from the
wind, but not a matchbox. We will return to how meaning is connected to perception at the end of this
chapter.
Evaluation of connectionist models of semantic memory
Throughout this chapter we have seen how connectionist modeling has indicated how apparently
disparate theories and phenomena (here the time course of dementia, modality-specific stores,
functional versus perceptual attributes, and category-specific memory) may be subsumed under one
model. Connectionist modeling of neuropsychological deficits is particularly promising. The data and
modeling work suggest
that the language deficits shown in diseases such as dementia result from the gradual loss of semantic
microfeatures.
Grounding: Connecting language to the world
Language and meaning are not a closed system. Meaning is a way of mapping language onto the external
world. At some point the semantic system has to interface with the perceptual systems; this
interfacing is sometimes called grounding (see Jackendoff, 1987, 2002, 2003; Roy, 2005; Vigliocco et
al., 2004). How does grounding occur? Rogers et al. (2004) describe a connectionist model of semantic
memory that provides an account of how language and perception are connected. They constructed a
model that maps between modality-specific representations of objects and their verbal descriptions
(see Figure 11.12). Semantic representations mediate between these two output representations. In
their model, a semantic level mediates between visual features (e.g., is round) and verbal descriptors,
which in turn comprise names (e.g., bird), perceptual descriptors (e.g., has wings), functional
descriptors (e.g., can fly), and encyclopedic descriptors (e.g., lives in Africa). The model learns to
associate inputs with outputs. The internal semantic structure is constrained by both visual and verbal
outputs; hence visually similar inputs give rise to similarly structured internal representations. As
noted above, the semantic representations do not necessarily encode semantic features (e.g., has eyes)
directly; they just have to be “good enough” to do the job (e.g., giving a name to an object, answering
a question such as “does a chair have eyes?”).
As Rogers et al. note, such a computational approach, although broadly similar to the feature-based
model, has several advantages. First, we no longer have to be worried about what features we should
use and whether they are arbitrary; features emerge to do the job. We no longer have to worry about
whether a dog’s bark and a cow’s moo are the same or different features. Second, the computational
model forces us to be explicit about how every semantic or perceptual task is carried out. Third, the
model provides an account of semantic dementia. Semantic dementia was simulated by removing a
proportion of the weights; increasing severity was modeled by removing a larger proportion of the
weights. The behavior of the lesioned model resembles that of patients with semantic dementia. For
example, in both the model and the patients, as severity increases so does the proportion of omission
and superordinate errors, while the production of semantic substitutions initially increases but then
declines. With a little damage, the model first confuses similar items, but with increasing damage it
becomes unable to generate any information that distinguishes one item from another, and whole
categories merge together. Hence, although individual names may not be accessible, superordinate
categories remain so. With yet more damage, even broad categories may become indistinguishable. The
model gives a similarly good account of other semantic tasks, such as sorting words and pictures,
drawing, copying after a delay, and matching words to pictures. The model makes some specific
predictions: Although fruits share some properties with animals (e.g., they are living, or at least not
man-made), they have many visual attributes in common with man-made objects. And patients do indeed
treat fruit differently, sorting them with
Verbal descriptions
– Names, e.g., bird
– Perceptual, e.g., has wings
– Functional, e.g., can fly
– Encyclopedic, e.g., lives in Africa
Semantics
Visual features
e.g., is round
FIGURE 11.12 Rogers et al.’s (2004) connectionist model of semantic memory. Adapted from Rogers et al. (2004).
artifacts. The simulations also predicted that more omission errors should be made when naming
artifacts and more substitution errors when naming living things, because of the greater structure in
the domain of living things, a prediction verified by the data from patients with semantic dementia.
This kind of computational approach does not contradict the HAL (hyperspace analog to language) model
of Burgess and Lund (1997) or the LSA (latent semantic analysis) model of Landauer and Dumais (1997).
Indeed, all these approaches show that we extract and abstract semantic information from large bodies
of information. However, while HAL and LSA are reliant on verbal input, this computational approach
links verbal and perceptual information. The computational model also links semantic processing to
neuropsychology.
The semantic representation is unitary and amodal, although different modalities will provide
different inputs to the mediating semantic representation. In that respect the model resembles OUCH.
Indeed, semantic memory might better be seen as a system that mediates different perceptual systems,
rather than a store of propositional facts. The anterior regions of the temporal lobes play a
particularly important role in this process.
The idea that our internal representations are grounded in our perceptions, actions, and feelings is
an important one: put another way, our cognition is embedded in the world. Concepts have very direct
links to the world (Barsalou, 2003, 2008; Glenberg, 2007). Our minds don’t work in isolation; they are
situated within the world. According to this view, concepts and meaning aren’t just abstract things:
thinking about real-world objects, for example, involves the visual perceptual system. Furthermore,
according to the situated cognition idea, concepts are less stable than has usually been thought,
varying depending on the context and situation. Barsalou (2003) had people perform two tasks
simultaneously: using their hands to imagine performing some manual operations, and identifying the
properties of concepts. Sometimes the actions being performed were relevant to the concepts being
described, in which case the participants were more likely to mention related aspects of the concepts.
For example, if they were performing the action of opening a drawer, they were more likely to mention
clothes likely to be found inside a clothes dresser than otherwise.
There is evidence that our mental situation in the world takes a very concrete form, in that there are
direct links between representations of perceptions and actions. What happens in the brain when we
hear the word “kick”? Using brain imaging, we see Wernicke’s region, the part of the left temporal
lobe of the brain that we know plays a vital role in accessing word meanings, become highly activated.
We also see some activation in Broca’s area, a region towards the front of the left hemisphere that we
know to be involved in producing speech. What is even more surprising is that fMRI scans show that
there is activation in the parts of the brain that deal with motor control, and particularly the motor
control of the leg (Glenberg, 2007; Hauk, Johnsrude, & Pulvermüller, 2004). It’s as though when we
hear “kick,” we give a mental kick. Similarly, if we hear a word such as “catch,” we see activation in
the parts of the brain that control the movements of the hand, and if we hear “I eat an apple,” we see
activation of the parts that control the mouth (Tettamanti et al., 2005). This motor activity peaks
extremely quickly: within 20 ms of the peak activation in the parts of the brain traditionally thought
to be involved in recognizing words and processing meaning (Pulvermüller, Shtyrov, & Ilmoniemi, 2003),
which is so fast that it rules out the explanation that people are just consciously reflecting on or
rehearsing what they’ve just heard. This idea that thinking or understanding language causes
activation in the parts of the brain to do with how the body deals with these concepts is called
embodiment. Language is grounded in the world, and grounding happens in the parts of the brain that
deal with perception and action (Willems & Casasanto, 2011).
Brain imaging studies reinforce the view that wide areas of the brain are involved in processing
meaning at many different levels, initially involving modality-specific sensory and motor systems, and
then increasingly abstract representations that tap into a variety of other cognitive, emotional, and
social processes carried out by the brain (Binder & Desai, 2011).
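The lesioning idea described for the Rogers et al. (2004) model in this section (damage simulated by
removing a proportion of the weights, with greater severity modeled by removing a larger proportion)
can be sketched in a few lines of Python. This is only an illustration of the general technique, not
the authors' implementation: the 10 x 10 weight matrix, its random values, and the names lesion and
count_zeros are assumptions made for the example.

```python
import random


def lesion(weights, proportion, rng):
    """Return a copy of `weights` with a random `proportion` of connections set to 0.0.

    `weights` is a list of rows (a hypothetical connection matrix); the
    original matrix is left unchanged.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    positions = [(i, j) for i, row in enumerate(weights) for j in range(len(row))]
    n_remove = round(proportion * len(positions))
    removed = set(rng.sample(positions, n_remove))
    return [
        [0.0 if (i, j) in removed else w for j, w in enumerate(row)]
        for i, row in enumerate(weights)
    ]


def count_zeros(weights):
    """Count the connections whose weight is exactly 0.0."""
    return sum(1 for row in weights for w in row if w == 0.0)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical intact network: 100 nonzero weights (values are arbitrary).
intact = [[rng.uniform(0.1, 1.0) for _ in range(10)] for _ in range(10)]

mild = lesion(intact, 0.1, rng)    # mild damage: 10% of weights removed
severe = lesion(intact, 0.5, rng)  # severe damage: 50% of weights removed

print(count_zeros(intact), count_zeros(mild), count_zeros(severe))
```

In a full simulation the damaged matrix would then be used to run the naming or sorting tasks, so that
error patterns could be compared across levels of severity; that step depends on the details of the
trained network and is left out here.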
SUMMARY
x Semantics is the study of meaning.

x Episodic memory is memory for events, and semantic memory is memory for general knowledge.
x Association alone cannot explain how semantic memory is organized.
x Semantics is the interface between language processing and the rest of cognition.
x The denotation of a word is its core meaning, and its connotations are its associations.
x The reference (extension) of a word is what it refers to in the world, and its sense (intension) is
the underlying concept that specifies how it is related to its referents.
x Semantic networks encode semantic information in the form of networks of linked nodes.
x The Collins and Quillian network emphasizes hierarchical relations and cognitive economy; it
attempted to give an account of sentence verification times.
x Hierarchical networks could not explain similarity and relatedness effects.
x Spreading activation networks can account for similarity and relatedness effects, but the theory
is difficult to falsify.
x The meaning of words can be decomposed into smaller units of meaning called semantic features.
x The idea that word meanings can be split up into smaller units is called semantic decomposition.
x Katz and Fodor showed how sentence meanings could be derived from the combination of seman-
tic features for each word in the sentence, and in particular how this information could be used to
select the appropriate sense of ambiguous words.
x Feature-list theories account for sentence verification times by postulating that we compare lists
of defining and characteristic features.
x A major problem for early decompositional theories is that it is not always possible to specify the
features necessary to encode word meaning; that is, not all words are easily defined.
x A number of experiments have been carried out on whether semantic decomposition is obligatory;
on balance, the results suggest that it is.
x A prototype is an abstraction that represents the average member of a category.
x The basic level of a category is the one that is maximally informative and which we prefer to use
unless there are good reasons to use more general or specific levels.
x In contrast to abstraction theories, in instance-based theories each instance is represented individu-
ally, and comparisons are made with specific instances rather than with an abstract central tendency.
x We probably have different memory systems for visual and verbal semantics.
x Semantic categories can be selectively impaired by brain damage; in particular, performance on
tasks involving living and non-living things can be selectively disrupted.
x These impairments cannot be explained away in terms of methodological artifacts because we
observe a double dissociation.
x According to the sensory–functional theory, category-specific semantic impairments for living
and non-living things arise because living things are represented primarily in terms of perceptual
knowledge, but non-living things are represented primarily in terms of functional knowledge.
x According to the domain-specific knowledge hypothesis, category-specific semantic impairments
for living and non-living things arise because knowledge about living and non-living things is
stored in different parts of the brain.
x Dementia is a progressive degeneration of the brain resulting in deteriorating performance across
a range of tasks; in semantic dementia, semantic knowledge is disproportionately impaired.
x People with probable Alzheimer’s disease have difficulty with picture naming; this can be
explained in terms of their underlying semantic deficit.
x In connectionist modeling, word meaning is represented as a pattern of activation distributed
across many semantic features; this pattern corresponds to a semantic attractor.
x Semantic features (called microfeatures in computational modeling) do not necessarily have
straightforward perceptual or linguistic correspondences.
x Semantic dementia can be explained as the progressive loss of semantic features.
x Living things tend to be represented by many shared intercorrelated features, whereas non-living
things are represented primarily by distinctive features.
x The pattern of category-specificity displayed in dementia depends on the level of severity of the
disease.
x Connectionist modeling shows how the differential dependence of living and non-living things on
intercorrelated and distinctive features explains the interaction between performance on different
semantic categories and severity of dementia.
x Latent semantic analysis shows how co-occurrence information is used to acquire knowledge.
x Grounding is how symbols are connected to perceptual representations.
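As a rough illustration of the summary point about co-occurrence information, the following Python
sketch builds co-occurrence vectors from a tiny invented corpus and compares words by the cosine of
their vectors. It is not Landauer and Dumais's procedure: it omits the dimension-reduction step that
LSA applies to the co-occurrence counts, and the corpus, the stop-word list, the window size, and all
function names are assumptions made for the example. Words that appear in the same contexts (doctor
and nurse, dog and puppy) end up with similar vectors even though they never occur in the same
sentence.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus invented for illustration; real models use millions of words.
CORPUS = [
    "the doctor treated the patient in the hospital",
    "the nurse treated the patient in the hospital",
    "the doctor examined the patient in the clinic",
    "the nurse examined the patient in the clinic",
    "the dog chased the ball in the park",
    "the puppy chased the ball in the park",
]
STOP_WORDS = {"the", "in"}  # assumption: ignore very frequent function words


def cooccurrence_vectors(sentences, window=2):
    """For every content word, count the content words within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = [t for t in sentence.split() if t not in STOP_WORDS]
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


VECTORS = cooccurrence_vectors(CORPUS)


def similarity(word_a, word_b):
    """Similarity of two words, judged only by the contexts they occur in."""
    return cosine(VECTORS[word_a], VECTORS[word_b])


for pair in [("doctor", "nurse"), ("dog", "puppy"), ("doctor", "dog")]:
    print(pair, round(similarity(*pair), 3))
```

The design choice to drop function words is only there to keep the toy vectors readable; with a
realistic corpus, weighting schemes and dimension reduction do that job instead.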
1. Can introspection tell us anything about how we represent meaning?

2. To what extent are feature-based theories of meaning concerned with a level of representation
beneath prototype and instance-based theories of concepts?
3. How would you explain case studies showing the loss of knowledge about very specific seman-
tic categories (e.g., medical terms)?
4. What sort of categories might a cat or dog possess?
FURTHER READING
A recent review of the psychology of semantics is Vigliocco and Vinson (2009). The classic lin-
guistics work on semantics is Lyons (1977a, 1977b). Johnson-Laird (1983) provides an excellent
review of a number of approaches to semantics, including the relevance of the more philosophical
approaches.
General problems with network models are discussed by Johnson-Laird, Herrman, and Chaffin
(1984). Chang (1986) and Smith (1988) review the experimental support for psychological models
of semantic memory. Kintsch (1980) is a good review of the early experimental work on semantic
memory, particularly on the sentence verification task.
For more on definitional versus non-definitional theories of meaning, see the debate between
J. A. Fodor (1978, 1979) and Johnson-Laird (1978; Miller & Johnson-Laird, 1976). For more on
instance-based theories, see Hintzman (1986), Murphy and Medin (1985), Nosofsky (1991), Smith
and Medin (1981), and Whittlesea (1987). For an important overview of connectionist approaches to
semantics, see Rogers and McClelland (2004).
Aitchison (1994) is a good introduction to processing figurative language.
For an excellent brief review of the neuropsychology of semantics, see Saffran and Schwartz
(1994). Caplan (1992) also provides an extensive review of the neuropsychology of semantic mem-
ory. For a review of optic aphasia see Sitton et al. (2000). See Vinson (1999) for an introductory
review of language in dementia; Harley (1998) for a review of work about naming and dementia; and
Schwartz (1990) for an edited volume of work on dementia with a cognitive bias.
HAL is another latent semantic analysis model (Lund et al., 1995, 1996); it produces an account
of semantic priming that is similar to McRae and Boisvert (1998; see Chapter 6).
CHAPTER 12
COMPREHENSION
INTRODUCTION
This chapter is about the higher level processes of comprehension. What happens after we have
identified the words and built the syntactic structure of a sentence? How do we build up a
representation of the meaning of the whole sentence, given the meanings of the individual words? How
do we combine sentences to construct a representation of the whole conversation or text? And how do we
use the meaning of what we have processed?
Comprehension is the stage of processing that follows word recognition and parsing. As a result of
identifying words (Chapters 6, 7, and 9) and parsing the sentence, we have identified their thematic
roles (Chapter 10) and accessed their individual meanings (Chapter 11). The task now facing the reader
or listener is to integrate these different aspects into a representation of the sentence, to
integrate it with what has gone on before, and to decide what to do with this representation.
One of the central themes in the study of comprehension is whether it is a constructive process or a
minimal process. How far do we go beyond the literal meaning of the sentence? Do we construct a
complex model of what is being communicated, or do we do as little work as possible, just enough so as
to be able to make out the sense? We go beyond the literal material when we make inferences. When and
how do we make them? In comprehension, we construct a model of what we think is being communicated.
How do we work out what the words in the incoming sentences refer to in this model? We use the model
that we are constructing to help us make sense of the material. We shall see that it is possible to
take this idea of comprehension as construction too far: We do not do more work than is necessary
during comprehension. If comprehension for meaning is like building a house to live in, we do not
build an extravagant mansion.
Text is printed or written material, usually longer than a sentence. A story is a particular,
self-contained type of text, although a story in a psycholinguistic experiment might only be two
sentences long. Discourse is the spoken equivalent of text. Conversations are spoken interchanges
where the topic may change as the conversation unfolds. Conversations have their own particular
mechanisms for controlling who is talking at any time. It should be noted that most of the research
has been carried out on text comprehension rather than discourse comprehension. Of course there may be
many things in common in representing and understanding spoken and written language, but there are
also important differences. Time is less of a constraint on processing written language, and we also
have the advantage with written language that the text is there for us to reread if we should so wish.
Comprehending spoken language is affected by the transience of the speech signal and the time
constraints this imposes. However, apart from the final section on conversation, most of what will be
discussed in this chapter applies to both written and spoken language.
Conversations are spoken interchanges which have a clearly defined structure, even though the topic
may change as the conversation unfolds.
It is useful to distinguish between semantic and referential processing (Garnham & Oakhill, 1992).
Semantic processing concerns working out what words and sentences mean, whereas referential processing
concerns working out their role in the model: what must the world be like for a sentence to be true?
In general, semantic processing precedes referential processing. In incremental parsing models (such
as Altmann & Steedman, 1988), semantic and referential processing occur on a word-by-word basis. So
not only have we got to work out the meaning of what we hear or read, we also have to relate this
information to a model of the world. Everything new changes or adds to this model in some way. What is
the nature of this model?
An important characteristic of text and discourse is that it is coherent. The material has a topic and
forms a semantically integrated whole. Gernsbacher (1990) proposed four sources of coherence.
Referential coherence refers to consistency in who or what is being talked about. Temporal coherence
refers to consistency in when the events occur. Locational coherence refers to consistency in where
the events occur. Causal coherence refers to consistency in why events happen. Text is also cohesive,
in that the same entities are referred to in successive sentences (Bishop, 1997). When we read or
listen, we strive to maintain coherence and cohesion. We generally assume that what we are processing
is coherent and makes sense. We assume that pronouns are referring to things that have previously been
introduced. These are powerful constraints on processing, and we will see that we maintain coherence
in a number of ways.
Throughout this chapter we will come across a number of findings that point to factors that can make
comprehension easier or more difficult. Some of them are perhaps not surprising: For example, it is
difficult to remember material if the sense of a story is jumbled. Thorndyke and Hayes-Roth (1979)
showed that the structure of individual sentences could affect the recall of the whole story. In
particular, they showed that repetition of the same sentence structure improves recall when the
content of the sentences changes, as long as not too much information is presented using the same
sentence structure (that is, if it is not repeated too many times with different content). Throughout
this chapter, measures of how much we remember of a story are often used to tell us how difficult the
material is. It is assumed that good memory equals good comprehension. This is rather different from
the other measures we have considered in previous chapters, which have tended to be on-line in the
sense that they measure processing at the time of presentation. Memory may reflect processing
subsequent to initial comprehension.
The organization of this chapter is as follows. First, I look at what makes comprehension easy or
difficult. Then I examine what determines what we remember of text. Next, I examine the process of
inference-making in comprehension in detail, with particular emphasis on the problems of deciding what
words refer to in our model. Then I review some influential theories of text comprehension. By the end
of this chapter you should:
x Know how we integrate new material with previous information.
x Understand how reliable our memory really is for what we have read or heard.
x Appreciate how we make inferences about what we read or hear.
x Know how we make inferences as we process language.
x Know how we represent text.
x Understand about the story grammar, schema, propositional network, mental model, and
construction–integration models of text comprehension.
x Know what differentiates skilled from less able comprehenders.
x Know the best way to try to understand difficult material.
When testifying to the Watergate Committee in June 1973, John Dean’s recall of the specific
conversations in Nixon’s office was inaccurate.
MEMORY FOR TEXT AND INFERENCES
Like eyewitness testimony, literal, verbatim memory is notoriously unreliable. If we needed to be
reminded of this, Neisser (1981) discussed the case study of the memory of John Dean, who was an aide
of President Richard Nixon at the time of the Watergate cover-up and scandal in the early 1970s.
Unknown to Dean, the conversations in Nixon’s office were tape-recorded, so his recall of them when
testifying to the Watergate Committee in June 1973 could be checked against the tape-recordings, 9
months after the original events. His recall was highly inaccurate. Nixon did not say many of the
things that Dean attributed to him, and much was omitted. Dean’s recall was only really accurate at a
very general thematic level: The people involved did discuss the cover-up, but not in the precise way
Dean said that they had. It seems that Dean’s attitudes influenced what he remembered. For example,
Dean said that he wanted to warn the President the cover-up might fall apart, but in fact he did not;
at the hearings, he said that he thought he had uttered this warning. Assuming that Dean was being
truthful about his recall of the events, we see that in spite of their belief to the contrary,
speakers only remember the gist of previous conversations. We see a tendency to abstract information,
and to “remember” things that never actually happened. These findings have been replicated many times,
so abstraction is clearly an important feature of memory. It is also well known that eyewitness
testimony is often unreliable, and can easily be influenced by many factors, and that our memory can
easily be led astray by misleading questions (Loftus, 1996). So what determines what we remember and
what we forget, and can we ever remember material verbatim?
People generally forget the details of word order very quickly. We remember only the meaning of what
we read or hear, not the details of the syntax. Sachs (1967) presented participants with a sentence
such as (1) embedded in a story. She later tested their ability to distinguish it from possible
confusion sentences (2) to (4):

(1) He sent a letter about it to Galileo, the great Italian scientist. (original)
(2) He sent Galileo, the great Italian scientist, a letter about it. (formal word order change)
(3) A letter about it was sent to Galileo, the great Italian scientist. (syntactic change)
(4) Galileo, the great Italian scientist, sent him a letter about it. (semantic change)

Sachs tested recognition after 0, 80, or 160 intervening syllables (which are equal to approximately
0, 25, or 50 second delays respectively), and found that the participants’ ability to detect changes
to word order and syntax decreased very quickly. Participants could not tell the difference between
the original and the changed sentences (2) and (3). They were however sensitive to changes in meaning
(such as (4)). Generally, we
remember the gist of text, and very quickly dump the details of word order.
We start to purge our memory of the details of what we hear after sentence boundaries. Jarvella (1971)
presented participants with sentences such as (5) and (6) embedded in a story:

(5) The tone of the document was threatening. Having failed to disprove the charges, Taylor was later
fired by the President.
(6) The document had also blamed him for having failed to disprove the charges. Taylor was later fired
by the President.

The participants were then tested on what they remembered. They remembered the clause “having failed
to disprove the charges” more accurately in (5) than (6), presumably because in (5) it was part of the
final sentence before the interruption.
The way in which we describe what we recall from immediate memory can be influenced by the syntactic
structure of what we have just read or heard. Potter and Lombardi (1998) found that the tendency to
use the same syntactic structure in material recalled from immediate memory results from syntactic
priming by the target material (see Chapter 13 for more details). That is, we tend to reuse the same
words and sentence structures in the material we recall because they were there in the original
material. Potter and Lombardi showed that it was possible to change the way people phrased the
material they recalled by priming them with an alternative sentence structure. This is consistent with
the idea that immediate recall involves generation from a meaning-level representation, rather than
true verbatim memory (Potter & Lombardi, 1990, 1998).
The details of surface syntactic form are not always lost. Yekovich and Thorndyke (1981) showed that
we can sometimes recognize exact wording up to at least 1 hour after presentation. Bates, Masling, and
Kintsch (1978) tested participants’ recognition memory for conversations in television soap operas. As
expected, memory for meaning straight after the program was nearly perfect, but participants could
also remember the detailed surface form when it had some significance.
Kintsch and Bates (1977) studied students’ memory of lectures. They found that verbatim memory was
good after 2 days but was greatly reduced after 5 days. Extraneous remarks were remembered best: We
remember the precise wording of jokes and announcements particularly well. Perhaps surprisingly, there
were no differences in literal memory for sentences that were centrally related to the topic compared
with those concerned with detail. A depressing result for teachers is that memory was worst for
central topic statements and overall conclusions. These studies show that there are differences
between coherent naturalistic conversation, and isolated artificial sentences and other materials
constructed just for psycholinguistic experiments. In real conversation (counting soap operas as
examples of real conversation), quite often what might be considered surface detail serves a
particular function. For example, the way in which we use pronouns or names depends in part on factors
like how much attention we want to draw to what is being referred to. This result accords with our
intuitions: Although we often remember only the gist of what is said to us, on occasion we can
remember the exact wording, particularly if it is important or emotionally salient.
Items and properties that become incorporated into our model of what we hear are more memorable than
those that do not. Consider these two sentences, (7) and (8):

(7) Vlad was relieved that Agnes was wearing her pink dress.
(8) Vlad was relieved that Agnes was not wearing her pink dress.

Both sentences mention the word “pink,” but while in the first sentence there is a pink dress in our
representation of the sentence, in the second there is not. We are explicitly told that there is no
pink dress present. How does this affect the memorability of the word “pink”? Suppose we present the
word “pink” after hearing these two sentences, and ask participants whether or not the word was
present. What we find depends on the delay between the sentence and presenting the probe word
(“pink”). After 500 ms, “pink” is equally
accessible in both sentences, but after 1,500 ms, participants respond faster if the item is present (7) compared with when it is not present (8). That is, immediately after hearing a sentence, linguistic structure and content determine memory; after a longer delay, linguistic structure is less important than discourse structure (Kaup & Zwaan, 2003).
Exactly why we sometimes remember the exact surface form is not currently known. Is a decision taken to store it permanently, and if so when? Neither is the relation between our memory for surface form and the structure of the parser well understood. Clearly we can routinely remember more than one clause, even if there has been subsequent interfering material, so it cannot be simply that we always immediately discard surface form. Clearly the parser can process one sentence while we are storing details of another.

Importance
Not surprisingly, people are more likely to remember what they consider to be the more important aspects of text. Johnson (1970) showed that participants were more likely to recall ideas from a story that had been rated as important by another group of participants. Keenan, MacWhinney, and Mayhew (1977) examined memory for a linguistics seminar, and compared sentences that were considered to be HIC (high interactional content—which is material having personal significance) and sentences with LIC (low interactional content—which is material having little personal significance).

(9) I think you’ve made a fundamental error in this study.
(10) I think there are two fundamental tasks in this study.

Sentences with high interactional content, such as (9), were more likely to be recalled by the appropriate participants in the seminar than sentences with low interactional content, such as (10).
Although it may not be surprising that more important information is recalled better, there are a number of reasons why it might be so. We might spend longer reading more important parts of the text; indeed, eye-movement research suggests this is in part the case. In this case the better memory would simply reflect more processing time. However, Britton, Muth, and Glynn (1986) restricted the time participants could spend reading parts of the text so they spent equal amounts of time reading the more and the less important parts of a story, and found that they still remembered the important parts better. Hence there is a real effect of the role the material plays in the meaning of the story. Important material must be flagged in comprehension and memory in some way.
The importance of an idea relative to the rest of the story also affects its memorability (Bower, Black, & Turner, 1979; Kintsch & van Dijk, 1978; Thorndyke, 1977). As you would expect, the more important a proposition is, the more likely it is to be remembered. Text processing theories should predict why some ideas are more “important” than others. One suggestion is that important ideas are those that receive more processing because themes in the text are more often related to important ideas than less important ones are.

What effect does prior knowledge have?
The effect of prior knowledge on what we remember and on the processes of comprehension was explored in an important series of experiments by Bransford and his colleagues. For example, Bransford and Johnson (1973, p. 392) read participants the following story (11):

(11) “If the balloons popped, the sound wouldn’t be able to carry far, since everything would be too far away from the correct floor. A closed window would also prevent the sound from carrying, since most buildings tend to be well insulated. Since the whole operation depends upon a steady flow of electricity, a break in the middle of the wire would also cause problems. Of course, the fellow could shout, but the human voice is not loud enough to carry that far. An additional problem is that a string could break on the instrument. Then there could be no accompaniment to the message. It is clear that the best situation would involve less distance. Then there would be fewer potential problems. With face-to-face contact, the least number of things could go wrong.”
This story was specially designed to be abstract and unfamiliar. Bransford and Johnson measured participants’ ratings of the comprehensibility of the story and also the number of ideas recalled. Participants were divided into three groups, called “no context,” “context before,” and “context after.” The context here was provided in the form of a picture that makes sense of the story (see Figure 12.1). Bransford and Johnson found that this context was only useful if it was presented before the story: the “no context” group recalled an average of 3.6 ideas out of a maximum of 14, the “context after” group also recalled 3.6 ideas, but the “context before” group recalled an average of 8.0 ideas. Hence context must provide more than just retrieval cues; it must also improve our comprehension, and this improvement in comprehension then leads to an improvement in recall. Context provides a frame for understanding text. The role of context and background information is a recurring theme in addressing how we understand and remember text, and its importance cannot be overestimated.
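
To make the size of this effect concrete, the group means quoted above can be expressed as proportions of the 14 scorable ideas. The short Python sketch below is only an illustration: the variable names are our own, and the only data used are the three means reported in the text.

```python
# Mean number of ideas recalled (out of 14) for the balloon story,
# as quoted in the text above.
MAX_IDEAS = 14
mean_recall = {
    "no context": 3.6,
    "context after": 3.6,
    "context before": 8.0,
}

# Proportion of the available ideas recalled by each group.
proportions = {group: mean / MAX_IDEAS for group, mean in mean_recall.items()}

for group, p in proportions.items():
    print(f"{group:15s} {p:.0%}")

# If the picture worked only as a retrieval cue, seeing it after the story
# should still help. The first difference is zero, so it did not.
cue_only_benefit = mean_recall["context after"] - mean_recall["no context"]
encoding_benefit = mean_recall["context before"] - mean_recall["no context"]
```

On these figures the “context before” group recalled about 57% of the ideas, against roughly 26% in each of the other two groups. The whole of the benefit therefore comes from having the context available while the story is being understood.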

FIGURE 12.1 Picture context for the “balloon story” (11). Figure from Bransford and Johnson (1973).

In this experiment, the story and the context were novel. Bransford and Johnson (1973, p. 400) also showed that a familiar context could facilitate comprehension. They presented participants with the following story (12):

(12) “The procedure is actually quite simple. First you arrange things into two different groups. Of course, one pile may be sufficient depending on how much there is to do. If you have to go somewhere else due to lack of facilities, that is the next step; otherwise you are pretty well set. It is important not to overdo things. That is, it is better to do fewer things at once than too many. In the short run this might not seem important, but complications can easily arise. A mistake can be expensive as well. At first the whole procedure will seem complicated. Soon, however, it will become just another facet of life. It is difficult to foresee any end to the necessity for this task in the immediate future, but then one can never tell. After the procedure is completed, one arranges the material into different groups again. Then they can be put into their appropriate places. Eventually they will be used once more, and the whole cycle will then have to be repeated. However, that is part of life.”

When you know that this is called the “clothes washing” story, it probably all makes sense. Those who read the passage without this context later recalled an average of only 2.8 out of a maximum of 18 ideas; those who had the
context after reading it also only recalled on average 2.7 ideas. However, those participants given the context before the story recalled an average of 5.8 ideas. These experiments suggest that background knowledge by itself is not sufficient: you must recognize when it is applicable.
Appropriate context may be as little as the title of a story. Dooling and Lachman (1971, p. 218) showed the effect of providing participants with a title that helped them make sense of what was read, but once again it had to be given before reading the story (13):

(13) “With hocked gems financing him, our hero bravely defied all scornful laughter that tried to prevent his scheme. ‘Your eyes deceive,’ he had said. ‘An egg, not a table, correctly typifies this unexplored planet.’ Now three sturdy sisters sought proof. Forging along, sometimes through vast calmness, yet more often over turbulent peaks and valleys, days became weeks as doubters spread fearful rumours about the edge. At last, from nowhere, welcome winged creatures appeared signifying monumental success.”

Without the title of “Christopher Columbus’s discovery of America,” the story makes little sense. In fact, “three sturdy sisters” refers to the three ships, the “turbulent peaks and valleys” to the waves, and “the edge” refers to the supposed edge of a flat earth.
It might reasonably be objected that all these stories so far have been designed to be obscure, without a title or context, and are not representative of normal texts. What happens with less obtuse stories?
This can be seen in an experiment by Anderson and Pichert (1978), who showed how a shift in perspective provides different retrieval cues. Participants read a story summarized in (14)—a more colloquial British term for “playing hooky” is “playing truant,” or “skiving”:

(14) Two boys play hooky from school. They go to the home of one of the boys because his mother is never there on a Thursday. The family is well off. They have a fine old home which is set back from the road and which has attractive grounds. But since it is an old house it has some defects: for example, it has a leaky roof, and a damp and musty cellar. Because the family is wealthy, they have a lot of valuable possessions—such as ten-speed bike, a color television, and a rare coin collection.

The story was 373 words long and identified by the experimenters as containing 72 main ideas. Other participants had previously rated the main ideas of the story according to their relevance to a potential house buyer or a potential burglar. For example, a leaky roof and a damp basement are important features of a house to house buyers but not to burglars, whereas valuable possessions and the fact that no one is in on Thursday are more relevant to burglars. The participants in the experiment read the story from either a “house buying” or a “burglar” perspective in advance. Not surprisingly, the perspective influenced the ideas the participants recalled. Half the participants were then told the other perspective, while a control group of the other half of the participants just had the first repeated. The shift in perspective improved recall: participants could recall things they had previously forgotten. This is because the new perspective provides a plan for searching memory.
At first sight the findings of this experiment appear to contradict those of Bransford and Johnson. Bransford and Johnson showed that context has little effect when it is presented after a story, but Anderson and Pichert showed that changing the perspective after the story—which of course is a form of context—can improve recall. The difference is that, unlike the Bransford and Johnson experiments, the Anderson and Pichert story was easy to understand. It is hard to encode difficult material in the first place, let alone recall it later. With easier material the problem is in recalling it, not encoding it. People encode information from both perspectives, but the perspective biases what people recall. In an extension of this study, Baillet and Keenan (1986) looked at what happens if perspective is shifted after reading but before recall. Participants who recalled the material immediately depended on the retrieval
perspective; however, participants who recalled it after a much longer interval (1 week) were not affected by the retrieval perspective—only the perspective given at encoding mattered.
There is a huge amount of potentially relevant background knowledge. Almost anything we know can be brought to bear on understanding text. (Indeed, one way to improve our memory for text is to construct as many connections as possible between new and old material.) Culture-specific information also influences comprehension (Altarriba, 1993; Altarriba & Forsythe, 1993). For example, in an experiment by Steffensen, Joag-dev, and Anderson (1979), groups of American and Indian participants read two passages, one describing a typical American wedding and the other a typical Indian wedding. Participants read the passage appropriate to their native culture more rapidly and remembered more of it, and distorted more information from the culturally inappropriate passage. Culture does not mean just nationality: religious affiliation can affect reading comprehension. Lipson (1983) showed that children from strongly Catholic or strongly Jewish backgrounds showed faster comprehension of and better recall for text that was appropriate to their affiliation.
In summary, prior knowledge has a large effect on our ability to understand and remember language. The more we know about a topic, the better we can comprehend and recall new material. The disadvantage of this is that sometimes prior knowledge can lead us astray.

Inferences
We make an inference when we go beyond the literal meaning of the text. An inference is the derivation of additional knowledge from facts already known; this might involve going beyond the text to maintain coherence, or to elaborate on what was actually presented. Inferences do not always lead to the correct conclusion, however. Prior knowledge and context are mixed blessings. Although they can help us to remember material that we would otherwise have forgotten, they can also make us think we have “remembered” material that was never presented in the first place! For example, Sulin and Dooling (1974, p. 256) showed that background knowledge could also be a source of errors if it is applied inappropriately. Consider the following story (15):

(15) “Gerald Martin strove to undermine the existing government to satisfy his political ambitions. Many of the people of his country supported his efforts. Current political problems made it relatively easy for Martin to take over. Certain groups remained loyal to the old government and caused Martin trouble. He confronted these groups directly and so silenced them. He became a ruthless, uncontrollable dictator. The ultimate effect of his rule was the downfall of his country.”

Half of the participants in their experiment read this story as given here, with the main actor in the story called “Gerald Martin.” The other half read it with the name “Adolf Hitler” instead. Participants in the “Hitler” condition afterwards were more likely to believe incorrectly that they had read a sentence “He hated the Jews particularly and so persecuted them,” than a neutral control sentence such as “He was an intelligent man but had no sense of human kindness.” That is, they made inferences from their background world knowledge that influenced their memory of the story. Here the prior knowledge was a source of errors. Participants in the fictitious character condition were of course unable to use this background information.
There are three main types of inference, called logical, bridging, and elaborative inferences. Logical inferences follow from the meanings of words. For example, hearing “Vlad is a bachelor” enables us to infer that Vlad is male. Bridging inferences (sometimes called backward inferences) help us relate new to previous information (Clark, 1977a, 1977b). Another way of putting this is that texts have coherence in a way that randomly jumbled sentences do not have. We strive to maintain this coherence, and make inferences to do so. One of the major tasks in comprehension is sorting out what pronouns refer to. Sometimes even more cognitive work
is necessary to make sense of what we read or hear. How can we make sense of (16)? We can if we assume that the moat refers to a moat around the castle mentioned in the first sentence. This is an example of how we maintain coherence: We comprehend on the basis that there is continuity in the material that we are processing, and that it is not just a jumble of disconnected ideas. Bridging inferences provide links among ideas to maintain coherence.

(16) Vlad looked around the castle. The moat was dry.

We make elaborative inferences when we extend what is in the text with world knowledge. The Gerald Martin example is an (unwarranted) elaborative inference. This type of inference proves to be very difficult for AI simulations of text comprehension, and is known as the frame problem. Our store of general world knowledge is enormous, and potentially any of it can be brought to bear on a piece of text, to make both bridging and elaborative inferences. How does text elicit relevant world knowledge? This is a significant problem for all theories of text processing. Bridging and elaborative inferences have sometimes been called backward and forward inferences respectively, as backward inferences require us to go back from the current text to previous information, whereas forward inferences allow us to predict the future. As we shall see, there are reasons to think that different mechanisms are responsible for these two types of inference. Taken together, all inferences that are not logical are sometimes called pragmatic inferences.
As we have seen, people make inferences on the basis of their world knowledge. We have also seen that we only remember the gist of what we read or hear, not the detailed form. Taken together, these suggest that we should find it very difficult to distinguish the inferences we make from what we actually hear. Bransford, Barclay, and Franks (1972) demonstrated this experimentally. They showed that after a short delay the target sentence (17) could not be distinguished from the valid inference (18):

(17) Three turtles rested on a floating log and a fish swam beneath them.
(18) Three turtles rested on a floating log and a fish swam beneath it.

If you swim beneath a log with a turtle on it, then you must swim beneath the turtle. If you change “on” to “beside,” then participants are very good at detecting this change, because the inference is no longer true and therefore not one likely to be made.

When are inferences made?
In the past, most researchers subscribed to a constructionist view that inferences are involved in constructing a representation of the text. Comprehenders are more likely to make inferences related to the important components of a story and not incidental details (Seifert, Robertson, & Black, 1985). The important components are the main characters and their goals, and actions relating to the main plan of the story. According to constructionists, text processing is driven on a “need to know” basis. The comprehender forms goals when processing text or discourse, and these goals determine the inferences that are made, what is understood and what is remembered about the material, and the type of model constructed.
The alternative view is the minimalist hypothesis (McKoon & Ratcliff, 1992). According to the minimalist hypothesis, we automatically make bridging inferences, but we keep the number of elaborative inferences to a minimum. Those that are made are kept as simple as possible and use only information that is readily available. Most elaborative inferences are made at the time of recall. According to the minimalist approach, text processing is data-driven. Comprehension is enabled by the automatic activation of what is in memory: it is therefore said to be memory-based. In part the issue comes down to when the inferences are made. Is a particular inference made automatically at the time of comprehension, or is it made with prompting during recall?
The studies that show that we make elaborative inferences look at our memory for text. Memory measures are indirect measures of comprehension, and may give a distorting picture of
the comprehension process. In particular, this may have led us to overestimate the role of construction in comprehension. The most commonly used on-line measure is reading time, assuming that making an automatic inference takes time, necessitating us to look at the guilty material for longer. For an inference to be made automatically, appropriate supporting associative semantic information must be present in the text. For example, McKoon and Ratcliff (1986, 1989) showed that in a lexical decision task, the recognition of a word that is likely to be inferred in a “strong association predicting context,” for example the word “sew” in (19), is facilitated much more than the word that might be inferred in a “weak association context,” the word “dead” in (20).

(19) The housewife was learning to be a seamstress and needed practice so she got out the skirt she was making and threaded her needle.
(20) The director and cameraman were ready to shoot close-ups when suddenly the actress fell from the 14th floor.

In both cases the target word is part of a valid inference from the original sentence, but whereas “sew” is a semantic associate of the words “seamstress,” “threaded,” and “needle” in (19), the word “dead” needs an inference to be made in (20). The actress does not have to die as a result of this accident, and this conclusion is not supported by a strong associative link between the words of the sentence (as would be the case if the material said, “the actress was murdered”). Such inferences do not therefore have to be drawn automatically, and indeed may not ever be made. (This is why this viewpoint is known as minimalist.)
Singer (1994) also provided evidence that bridging inferences are made automatically, but elaborative inferences are not. He presented sentences (21), (22), and (23), and then asked participants to verify whether “A dentist pulled a tooth.”

(21) The dentist pulled the tooth painlessly. The patient liked the method.
(22) The tooth was pulled painlessly. The dentist used a new method.
(23) The tooth was pulled painlessly. The patient liked the new method.

In (21) the statement to be verified is explicitly stated, so people are fast to verify the probe statement. In (22) a bridging inference that the dentist is pulling the tooth is necessary to maintain coherence; people are as fast to verify the probe as they are when it is explicitly stated in (21). This suggests that the bridging inference has been made automatically in the comprehension process. But in (23) people are about 250 ms slower to verify the statement; this suggests that the elaborative inference has not been drawn automatically.
It now seems likely that only bridging or reference-related inferences necessary to maintain the coherence of the text are made automatically during comprehension, and elaborative inferences are generally only made later, during recall. Evidence supporting this is that people make more intrusion inferences (the sort of elaborative inference where people think that something was in the study material when it was not) the longer the delay between study and test (Dooling & Christiaansen, 1977; Spiro, 1977). This is because people’s memory for the original material deteriorates with time, and they have to do more reconstruction. Corbett and Dosher (1978) found that the word “scissors” was an equally good cue for recalling each of the sentences (24)–(26):

(24) The athlete cut out an article with scissors for his friend.
(25) The athlete cut out an article for his friend.
(26) The athlete cut out an article with a razor blade for his friend.

The mention of a “razor blade” in sentence (26) blocks any inference being drawn then about the use of scissors. One explanation of the finding that “scissors” is just as effective a cue is that participants are working backwards at recall from the cue to an action, and then retrieving the sentence. A problem with this sort of experiment, however, is that subsequent recall might not give an accurate reflection of what happens when people first read the material.
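
The minimalist position summarized above can be caricatured as a simple decision rule. The Python sketch below is a toy illustration only, not a model taken from the literature: the `Inference` structure, the function name, and the idea of reducing "strong associative support" to a single true/false flag are our own simplifying assumptions, loosely based on the contrast between (19) and (20) and on Singer's sentences (22) and (23).

```python
from dataclasses import dataclass


@dataclass
class Inference:
    """A candidate inference a reader might draw (hypothetical structure)."""
    description: str
    kind: str                  # "bridging" or "elaborative"
    strongly_associated: bool  # backed by readily available word associations?


def when_drawn(inference: Inference) -> str:
    """Toy version of the minimalist rule for when an inference is made."""
    # Bridging inferences are needed for coherence, so they are automatic.
    if inference.kind == "bridging":
        return "during comprehension"
    # Elaborative inferences are automatic only when strong associations
    # make the relevant information readily available.
    if inference.kind == "elaborative" and inference.strongly_associated:
        return "during comprehension"
    # Otherwise the inference is left until it is needed, such as at recall.
    return "later, if at all"


examples = [
    Inference("the dentist pulled the tooth, from (22)", "bridging", False),
    Inference("the seamstress will sew, from (19)", "elaborative", True),
    Inference("the actress is dead, from (20)", "elaborative", False),
    Inference("a dentist pulled the tooth, from (23)", "elaborative", False),
]

for inf in examples:
    print(f"{inf.description}: {when_drawn(inf)}")
```

A constructionist version of the same sketch would instead return "during comprehension" whenever the inference served the reader's current goals, which is one way of seeing that the two views differ mainly in when, rather than whether, elaborative inferences are drawn.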
Dooling and Christiaansen (1977) carried out an experiment similar to the Sulin and Dooling (1974) study with the “Gerald Martin” text. They tested the participants after 1 week, telling them that Gerald Martin was really Adolf Hitler. People still made intrusion errors that in this case could not have been made at the time of study. These results suggest that elaborative and reconstructive inferences are made at the time of test and recall, and when readers are reflecting about material they have just read (Anderson, 2010).
Garrod and Terras (2000) distinguished between two types of information that might assist in making a bridging inference. Consider the story in (27):

(27) Vlad drove to Memphis yesterday. The car kept overheating.

To maintain coherence, we make the inference that “the car” must be the one that Vlad drove to Memphis—even though the car has not yet been mentioned. “The car” is said to fill an open discourse role, and is linked to previous material by a bridging inference that maintains coherence. There are two types of information to do this that might be used here. First, there are lexical-semantic factors: “drive” implies using a vehicle of some sort. Second, there might be more general background contextual information. Garrod and Terras tried to tease apart the influence of these two factors in a study where they examined eye movements of participants reading stories such as (28) and (29):

(28) The teacher was busy writing a letter of complaint to a parent.
(29) The teacher was busy writing an exercise on the blackboard.

The discourse context in (28) is consistent with the instrumental filler “pen,” but in (29) it is consistent with “chalk.” In both cases, however, the lexical-semantic context of “write” is much more strongly associated with “pen” than with “chalk.” Now consider what happens when (28) and (29) are followed by the continuation (30):

(30) However, she was disturbed by a loud scream from the back of the class and the chalk/pen dropped on the floor.

What happens when the reader comes to the word “chalk” or “pen”? The analysis of eye movements indicates when readers are experiencing difficulty by telling us how long they are looking at particular items and whether they are looking back to re-examine earlier information. If role resolution is dominated by lexical-semantic context, then “pen” will be suggested by the lexical-semantic context of “write,” regardless of the discourse context it is in. This is what Garrod and Terras observed. People spent no longer looking at “pen” in either the appropriate or the inappropriate context, although the first-pass reading time of “chalk,” which is not so lexically constrained as “pen,” was affected by the context. That is, “writing on a blackboard” is just as good as “writing a letter.” The appropriateness of the discourse context does have a subsequent effect, however, in that inappropriate context has a delayed effect that makes people re-examine earlier material in both cases.
To account for these data, Garrod and Terras propose a two-stage model of how people resolve open discourse roles. The first stage is called bonding. In this stage, items that are suggested by the lexical context (e.g., “pen”) are automatically activated and bound with the verb. In the second stage of resolution the link between proposed filler and verb is tested against the discourse context. A non-dominant filler, such as “chalk,” cannot be automatically bound to the verb in the first stage, and causes some initial processing difficulty. The resolution process is a combination of automatic, bottom-up processing and non-automatic, contextual processing. Inference-making in comprehension involves both types of process.

Practical implications of research on inferences
Of course, there are some obvious implications for everyday life if we are continually making inferences on the basis of what we read and hear. Much social interaction is based on making inferences from other people’s conversation—and we have seen that these inferences are not always drawn
correctly. There are two main applied areas where elaborative inferences are particularly important, and those are eyewitness testimony and methods of advertising.
The work of Loftus (1975, 1996) on eyewitness testimony is very well known. She showed how unreliable eyewitness testimony actually is, and how inferences based on the wording of questions could prejudice people’s answers. For example, the choice of either an indefinite article (“a”) or a definite article (“the”) influences comprehension. The first time something is mentioned, we usually use an indefinite article; after that, we can use the definite article. Sentence (31) is straightforward, but (32) is distinctly odd:

(31) A pig chased a cow. They went into a river. The pig got very wet.
(32) ? The pig chased a cow. They went into a river. A pig got very wet.

When we come across a definite article we make an inference that we already know something about what follows. Sometimes this can lead to memory errors. Loftus and Zanni (1975) showed participants a film of a car crash. Some participants were asked (33), while others were asked (34):

(33) Did you see a broken headlight?
(34) Did you see the broken headlight?

In fact, there was no broken headlight. Participants were more likely to respond “yes” incorrectly to question (34) than to question (33), because the definite article presupposes that a broken headlight exists. Loftus and Palmer (1974) also showed participants a film of a car crash. They asked some of the participants (35) and others (36) (see Figure 12.2):

(35) About how fast were the cars going when they hit each other?
(36) About how fast were the cars going when they smashed into each other?

Participants asked (36) reliably estimated the speed of the cars to be higher than those asked (35). A week later the participants that had been asked (36) were much more likely to think that they had seen broken glass than those asked (35), although broken glass had not been mentioned. The way a question is phrased can influence the inferences people make and therefore the answers that they give.

FIGURE 12.2 Loftus and Palmer (1974) found that assessment of speed of a videotaped car crash and recollection of whether there was broken glass present were affected by the verb used to ask the question. Use of the verbs “hit” and “smash” have different connotations as shown in (a) and (b). Adapted from Loftus and Palmer (1974).

R. Harris (1978) simulated a jury listening to courtroom witnesses, and found that although participants were more likely to accept directly asserted statements as true than only implied statements for which they had to make an inference, there was still a strong tendency to accept the implied statements. Instructions to participants telling them to be careful to distinguish between asserted and implied information did not help either. Furthermore, this test took place only 5 minutes after hearing the statements, whereas in a real courtroom the delays can be weeks, and the potential problem much worse. Harris (1977) similarly found that people find it difficult
to distinguish between assertions and implications in advertising claims. Participants are best at distinguishing assertions from implications if they have been warned to do so before hearing the claim, and are asked about it immediately afterwards. Deviation from this pattern leads to a rapid impairment of our ability to distinguish fact from implication.

Harris’ (1978) jury demonstrated a strong tendency to accept the implied statements, despite instructions to be careful to distinguish between asserted and implied information.

REFERENCE AND AMBIGUITY

An important part of comprehension is working out what things refer to: this is called reference. In (37) both “Vlad” and “he” refer to the same thing—Vlad the vampire, or at least our mental representation of him. We call the case when two linguistic expressions refer to the same thing (e.g., “Vlad” and “he”) co-reference. A common example of co-reference involves the use of pronouns such as “she,” “her,” “he,” “him,” and “it,” such as in (38). Often we find that we cannot determine the reference of a linguistic expression without referring to another linguistic expression, called the antecedent; this case, and the material that we cannot identify in isolation, is called anaphor. In (38) “Vlad” and “knife” are the antecedents of the anaphors “he” and “it,” respectively. Co-reference does not have to involve pronouns; it can also involve other nouns referring to the same thing—“the vampire” in (39), an example of definite noun phrase anaphor—or verbs—“does” in (40).

(37) Vlad put the knife on the table. Then he forgot where it was.
(38) After he had finished with the knife, Vlad put it on the table.
(39) Vlad went to the cinema. The vampire really enjoyed the film.
(40) Vlad loves Boris and so does Dirk.

Comprehenders must work out what anaphors refer to—what their antecedents are. This process is called resolution. Anaphor resolution is a backward inference that we carry out to maintain a coherent representation of the text.

How do we resolve anaphoric ambiguity?

In many cases anaphor resolution can be straightforward. In a story such as (41) there is only one possible antecedent:

(41) Vlad was happy. He laughed.

What makes anaphor resolution difficult is that often it is not obvious what the antecedent of the anaphor is. The anaphor is ambiguous when there is more than one possible antecedent, such as in (42):

(42) Vlad stuck a dagger in the corpse. It was made out of silver. It oozed blood.

In this case we have no apparent difficulty in understanding what each “it” refers to. How do we do this? In more complex cases there might be a number of alternatives, or background or world knowledge is necessary to disambiguate.

We cope with anaphoric ambiguity by using a number of coping strategies. Whether or not these strategies are used to guide an explicit search process, or to exclude items from a search set, or both, or even to avoid an explicit search altogether, is at present unclear.

One strategy for anaphor resolution is called parallel function (Sheldon, 1974). We prefer to
match anaphors to antecedents in the same relevant position. Anaphor resolution is more difficult when the expectations generated by this strategy are flouted. In (43) and (44) the appropriate order of antecedents and pronouns differs. In (43) “he” refers to “Vlad,” which comes first in “Vlad sold Dirk,” but in (44) “he” refers to “Dirk,” which comes second. Therefore (44) is harder to understand than (43).

(43) Vlad sold Dirk his broomstick because he hated it.
(44) Vlad sold Dirk his broomstick because he needed it.

We can distinguish two groups of further strategies: those dependent on the meaning of the actual words used, or their role in the sentence; and those dependent on the emergent discourse model.

Of the strategies dependent on the words used, one of the most obvious is the use of gender (Corbett & Chang, 1983):

(45) Agnes won and Vlad lost. He was sad and she was glad.

In (45) it is clear that “he” must refer to Vlad, and “she” to Agnes. Most of the evidence suggests that gender information is used automatically. Other experiments show that the effects of gender are more complicated and depend on what other referents are accessible at the time of reading. Arnold, Eisenband, Brown-Schmidt, and Trueswell (2000) examined eye movements to investigate how gender information is used. Participants examined pictures of familiar cartoon characters while listening to text. Arnold et al. found that gender information about the pronoun was accessed very rapidly (within 200 ms after the pronoun). If the picture contained both a female and a male character (e.g., Minnie Mouse and Donald Duck), participants were able to use the gender cue (“she” or “he”) very quickly to look at the appropriate picture. If the pictures were of same-sex characters (e.g., Mickey Mouse and Donald Duck), gender was no longer a cue, and participants took longer to converge on the picture that referred to the pronoun. However, Arnold et al. also manipulated order of mention, and this interacted with gender so that there was only evidence of an effect of gender on pronoun resolution for the less-accessible second-mentioned character. For the first-mentioned character, people looked quickly at the target no matter whether the gender was ambiguous or not. In summary, the effects of gender can only really be observed when we take into account what other information influences pronoun resolution. Rigalleau and Caplan (2000) found that people are slower to say the pronoun “he” when it is inconsistent with the only noun in the discourse (46) compared with when it is consistent (47):

(46) Agnes paid without being asked; he had a sense of honor.
(47) Boris cried in front of the grave; he had a tissue.

Rigalleau and Caplan suggest that pronouns become immediately and automatically related to possible antecedents. The resolution process that ultimately determines which of the possible antecedents is finally attached to the pronoun might depend on other factors. Resolution only involves attentional processing if the initial automatic processes fail to converge on a single noun as the antecedent, or if pragmatic information makes the selected noun an unlikely antecedent. Some techniques are better at establishing the time course of anaphor resolution than others. In particular, the use of probes, as used in the earlier studies, might disrupt the comprehension process, giving a misleading picture of what is happening.

Different verbs carry different implications about how the actors involved should be assigned to roles. If participants are asked to complete the sentences (48) and (49), they usually produce continuations in which “he” refers to the subject (Vlad) in (48), and the object (Boris) in (49). Verbs such as “sell” are called NP1 verbs, because causality is usually attributed to the first, subject, noun phrase; verbs such as “blame” are called NP2 verbs, because causality is usually attributed to the second, object, noun phrase (Grober, Beardsley, & Caramazza, 1978).
(48) Vlad sold his broomstick to Boris because he . . .
(49) Vlad blamed Boris because he . . .

When does implicit causality have its effect? Is it early, enabling us to focus on the appropriate antecedent, or late, facilitating the integration of material? The difference between the two possible time courses is whether or not causality information affects the initial processing of the “he” in (48) and (49). An experiment by Stewart, Pickering, and Sanford (2000) suggests that implicit causality only has a late effect. Stewart et al. manipulated information about the cause of an action, and about the type of anaphor used. They manipulated the implicit cause (through verb bias) and the explicit cause, which is derived from the whole sentence. The two types of cause could be either congruent or in conflict, a condition they called incongruent. They also manipulated whether the anaphor was a pronoun or a proper name. They measured ease of processing using self-paced reading. Sentence (50) is an example of a congruent condition with names, and (51) is an incongruent condition with pronouns—note that “apologize” is usually an NP1-bias verb.

(50) Daniel apologized to Arnold because Daniel had been behaving selfishly.
(51) Daniel apologized to Arnold because he didn’t deserve the criticism.

The pronoun “he” is ambiguous, whereas the name is not. The early-focus account predicts that we should determine the antecedent of the pronoun on the basis of the implicit causality bias of the verb. In incongruent sentences with pronouns, therefore, the early-focus account predicts conflict and therefore reading difficulty; this difficulty should not be present in the sentences with the unambiguous names instead of pronouns. So, if the early-focus account is correct, there should be an interaction between congruence and type of anaphor. Stewart et al. found no such interaction, a result that supports the late-integration account. Indeed, they found congruence mattered for just repeated names, suggesting that implicit bias has a late effect.

The second group of anaphor resolution strategies are those dependent on the perceived prominence of possible referents in the emergent text model. We might be biased, for example, to select the referent in the model that is most frequently mentioned. Antecedents are generally easier to locate when they are close to their referents than when they are farther away, in terms of the number of intervening words (Murphy, 1985; O’Brien, 1987). In more complicated examples alternatives can sometimes be eliminated using background knowledge and elaborative inferences, as in (52). Exactly how this background knowledge is used is unclear. In this case we infer that becoming a vegetarian would not make someone want to buy piglets, but more likely to sell them, as they would be less likely to have any future use for them.

(52) Vlad sold his piglets to Dirk because he had become a vegetarian.

Pronouns are read more quickly when the referent of the antecedent is still in the focus of the situation being discussed than when the situation has changed so that it is no longer in focus (Garrod & Sanford, 1977; Sanford & Garrod, 1981). Items in explicit focus are said to be foregrounded and have been explicitly mentioned in the preceding text. Such items can be referred to pronominally. Items in implicit focus are only implied by what is in explicit focus. For example, in (53) Vlad is in explicit focus, but the car is in implicit focus. It sounds natural to continue with “he was thirsty,” but not with “it broke down.” Instead, we would need to bring the car into explicit focus with a sentence like “his car broke down.”

(53) Vlad was driving to Philadelphia.

Experiments on reading time suggest that implicit focus items are harder to process. Items are likely to stay in the foreground if they are an important theme in the discourse, and these items are likely to be maintained in working memory. Pronouns with antecedents in the foreground, or topic antecedents, are read quickly, regardless of the distance between the pronoun and referent (Clifton & Ferreira, 1987). In conversation, we
do not normally start using pronouns for referents that we have not mentioned for some time. In general, unstressed pronouns are used to refer to the most salient discourse entity—the one at the center of focus—while definite noun phrase anaphors (e.g., “the intransigent vampire”) are used to refer to non-salient discourse entities—those out of focus.

In general, then, the more salient an entity is in discourse, the less information is contained in the anaphoric expression that refers to it. Almor (1999) proposed that NP anaphor processing is determined by informational load: this is the amount of information an anaphor contains. The informational load of an anaphor with respect to its antecedent should either aid the identification of the antecedent, or add new information about it, or both. The processing of anaphors is a balance between the benefits of maximum informativeness and the cost of minimizing working memory load. This idea that anaphor processing is a balance between informativeness and processing cost leads to several predictions. For example, anaphors with a high informational load with respect to their antecedent, but which do not add new information about them, will be difficult to process when the antecedent is in focus. Hence repetitive NP anaphors such as (54) will be difficult:

(54) It was the bird that ate the fruit. The bird seemed very satisfied.
(55) What the bird ate was the fruit. The bird seemed very satisfied.

Here the antecedent (“a bird”) is in focus and the default antecedent, so a pronoun (“it”) will do. The NP anaphor (“the bird”) has a high informational load, so it is not justified. It is only justified when the antecedent is out of focus (55), because then it aids the identification of the antecedent. Almor verified this prediction in a self-paced reading task. “The bird” was read slower when the antecedent was in focus (54) than when it was out of it (55). Hence the use and processing of pronominal and NP anaphors is a complex trade-off between informativeness, focus, and working memory load.

Given that there are a number of strategies for interpreting anaphors, how do we choose the best one? Badecker and Straub (2002) argue that all potential cues contribute to the selection of the appropriate antecedent. They propose an interactive parallel constraint model, where the multiple constraints influence the activation of the candidate entities. The more conflict there is, the more candidates there are, and the more plausible they are, the more difficult choosing an antecedent will be.

Accessibility

Some items are more accessible than others. We are faster at retrieving the referent of more accessible antecedents. At this stage some caution is necessary to avoid a circular definition of accessibility. Accessibility is a concept related both to anaphora and to the work on sentence memory. It can be measured by recording how long it takes participants to detect whether a word presented while participants are reading sentences is present in the sentence.

Common ground is shared information between participants in a conversation (Clark, 1996; Clark & Carlson, 1982). A piece of information is in the common ground if it is mutually believed by the speakers, and if all the speakers believe that all the others believe it to be shared. Information that is in the common ground should have particular importance in determining reference. The restricted search hypothesis states that the initial search for referents is restricted to entities in the common ground, whereas the unrestricted search hypothesis places no such restriction. That is, according to the restricted search hypothesis, things in the common ground should be more accessible than things that are not. The evidence currently favors the unrestricted search hypothesis (Keysar, Barr, Balin, & Paek, 1998). Consider (56) (Keysar et al., 1998, p. 5):

(56) “It is evening, and Boris’ young daughter is playing in the other room. Boris, who lives in Chicago, is thinking of calling his lover in Europe. He decides not to call because she is probably asleep given the transatlantic time difference. At that moment his wife returns home and asks, ‘Is she asleep?’”
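The contrast between the two hypotheses can be made concrete with a toy sketch applied to (56). It is only an illustration under stated assumptions, not the procedure used by Keysar et al. (1998): the `Entity` class, the gender filter, and the `in_common_ground` flag are hypothetical names introduced here, and the lover is marked as outside the common ground because the wife does not know about her.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    gender: str             # "f" or "m" (assumed coding)
    in_common_ground: bool  # shared by speaker and listener (assumed flag)


def candidate_referents(pronoun_gender, entities, restricted):
    """Return the names of entities initially considered for a pronoun.

    restricted=True  -> only entities in the common ground are searched.
    restricted=False -> every gender-matching entity is searched.
    """
    matches = [e for e in entities if e.gender == pronoun_gender]
    if restricted:
        matches = [e for e in matches if e.in_common_ground]
    return [e.name for e in matches]


# Entities available to Boris in (56) when his wife asks "Is she asleep?"
entities = [
    Entity("Boris", "m", True),
    Entity("daughter", "f", True),
    Entity("lover", "f", False),  # not known to the wife
]

restricted = candidate_referents("f", entities, restricted=True)
unrestricted = candidate_referents("f", entities, restricted=False)
print(restricted)    # ['daughter']
print(unrestricted)  # ['daughter', 'lover']
```

Under the restricted version only the daughter is considered at first; under the unrestricted version the lover is also a candidate, which is the situation in which interference from the lover could arise.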
How does Boris search for the referent of “she”? If the restricted search hypothesis were correct, and search is restricted to possible referents in the common ground, the lover should not be considered, as the wife is not informed about the lover. However, entities that are not in the common ground still interfere with reference resolution, as measured by error rates, verification times, and eye-movement measures. Although common ground might not restrict which possible referents are initially checked, it almost certainly plays an important later role in checking, monitoring, and correcting the results of the initial search. Conversants take into account what each other knows when establishing common ground (Knutsen & Le Bigot, 2012).

Generally we are biased toward referring back to the subject of a sentence; there is also an advantage to first mention. This means that participants that are mentioned first in a sentence are more accessible than those mentioned second. Gernsbacher and Hargreaves (1988) showed that there was an advantage for first mention, independent of other factors such as whether the words involved were subject or object. Gernsbacher, Hargreaves, and Beeman (1989) explored the apparent contradiction between first mention and recency, in that items that are more recent should also be more accessible. Gernsbacher and Hargreaves explained this with a constructionist, structure-building account: The goal of comprehension is to build a coherent model of what is being comprehended. Comprehenders represent each clause of a multiclause sentence with a separate substructure, and have easiest access to the substructure they are currently working on. However, at some point the earlier information becomes more accessible because it serves as a foundation for the whole sentence-level representation. So it is only as the representation is being developed that recency is important. Recency is detectable only when accessibility is measured immediately after the second clause; elsewhere first mention is important, and has the more long-lasting effect. This explanation is reminiscent of Kintsch’s propositional model, discussed later, and shows how it is possible to account for anaphor resolution in terms of the details of the emergent comprehension model.

The given–new contract

One of the most important factors that determines comprehensibility and coherence is the order in which new information is presented relative to what we know already. Clearly this affects the ease with which we can integrate the new information into the old. It has been argued that there is a “contract” between the writer and the reader, or participants in a conversation, to present new information so that it can easily be assimilated with what people already know. This is called the given–new contract (Clark & Haviland, 1977; Haviland & Clark, 1974). It takes less time to understand a new sentence when it explicitly contains some of the same ideas as an earlier sentence than when the relation between the content of the sentences has to be inferred.

Utterances are linked together in discourse so that they link back to previous material and forward to material that can potentially be the focus of future utterances. Centering theory, developed in AI models of text processing, provides a means of describing these links (Gordon, Grosz, & Gilliom, 1993; Grosz, Joshi, & Weinstein, 1995). According to centering theory, each utterance in coherent discourse has a single backward-looking center that links to the previous utterance, and one or more forward-looking centers that offer potential links to the next utterance. People prefer to realize the backward-looking center as a pronoun. The forward-looking centers are ranked in order of prominence, according to factors such as the position in the sentence and the stress. The reading times of sentences increase if these rules are violated. For example, people actually take longer to read stories where proper names are repeated compared with sentences where appropriate pronouns are used.

Summary of work on memory, inferences, and anaphora

Any model of comprehension must be able to explain the following characteristics. We read for gist, and very quickly forget details of surface form. Comprehension is to some extent a constructive process: We build a model of what we are processing, although the level of detail
involved is controversial. At the very least, we make inferences to maintain coherence. One of the most important mechanisms involved in this is anaphor resolution. Inferences soon become integrated into our model as we go along, and we are very soon unable to distinguish our inferences from what we originally heard. There is a foreground area of the model containing important and recent items, so that they are more accessible.

MODELS OF TEXT PROCESSING

We now examine some models of how we represent and process text. AI has heavily influenced models of comprehension. Although the ideas thus generated are interesting and explicit, there is a disadvantage that the specific mechanisms we use are unlikely to be exactly the same as the explicit mechanisms used to implement the AI concepts.

Propositional network models of representing text

The meaning of sentences and text can be represented by a network where the intersections (or nodes) represent the meaning of words, and the connections represent the relations between words. This approach is related to Fillmore’s (1968) theory of case grammar, which in turn was derived from generative semantics, a grammatical theory that emphasized the importance of semantics. Case grammar emphasizes the roles, or cases, played by what the words refer to in the sentence. It emphasizes the relation between verbs and the words associated with them. (Cases are more or less the same as thematic roles; see Box 10.1 for some examples.) One disadvantage of case grammar is that there is little agreement over exactly what the cases that describe the language should be, or even how many cases there are. This lack of agreement about the basic units involved is a common problem with models of comprehension.

In network models, sentences are first analyzed into propositions. A proposition is the smallest unit of meaning that can be put in predicate-argument form (with a verb operating on a noun). A proposition has a truth value—that is, we can say whether it is true or false. For example, the words “witch” and “cackle” are not propositions: They are unitary and have no internal structure, and it is meaningless to talk of individual words being true or false. On the other hand, “the witch cackles” contains a proposition. This can be put in the predicate-argument form “cackle(witch),” which does have a truth value: the witch is either cackling or she isn’t.

Propositions are connected together in propositional networks, as in Figure 12.3. The model of Anderson and Bower (1973) was particularly influential. Originally known as HAM (short for Human Associative Memory), the model evolved first into ACT (short for Adaptive Control of Thought; see Anderson, 1976) and later ACT* (pronounced “ACT-star”; Anderson, 1983). These models include a spreading activation model of semantic memory, combined later with a production system for executing higher level operations. A production system is a series of if–then rules: if x happens, then do y. ACT* gives a good account of fact retrieval from short stories. For example, the more facts there are associated with a concept, the slower the retrieval of any one of those facts.

FIGURE 12.3 An example of a simplified propositional network underlying the sentence “Vlad gave his yellow broomstick to the old witch.” [Network nodes shown: Vlad, give, past, broomstick, witch, yellow, own, old.]
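The retrieval slowdown just described can be illustrated with a small sketch. It stores facts as predicate-argument propositions and shares a fixed amount of activation equally among all the facts attached to a concept, so that each fact receives less as more facts are added. The equal split, the value of `TOTAL_ACTIVATION`, and the function names are assumptions made for illustration; this is not the ACT* implementation.

```python
from collections import defaultdict

TOTAL_ACTIVATION = 1.0  # assumed fixed budget per concept (illustrative value)


def build_network(propositions):
    """Index each predicate-argument proposition under every concept it mentions."""
    network = defaultdict(list)
    for predicate, *arguments in propositions:
        for concept in arguments:
            network[concept].append((predicate, *arguments))
    return network


def activation_per_fact(network, concept):
    """Split the fixed activation budget equally among a concept's facts."""
    facts = network[concept]
    if not facts:
        return 0.0
    return TOTAL_ACTIVATION / len(facts)


# Propositions drawn from the examples in the text, e.g. cackle(witch).
propositions = [
    ("cackle", "witch"),
    ("old", "witch"),
    ("give", "Vlad", "broomstick", "witch"),
    ("yellow", "broomstick"),
]
network = build_network(propositions)

for concept in ("Vlad", "broomstick", "witch"):
    print(concept, len(network[concept]), activation_per_fact(network, concept))
```

In this toy network “witch” takes part in three facts and “Vlad” in only one, so each fact about the witch receives a smaller share of activation than the single fact about Vlad.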
This is known as the fan effect (Anderson, 1974, 2010). When you are presented with a stimulus, activation spreads to all its associates. There is a limit to the total amount of activation, however, so the more items it spreads to, the less each individual item can receive.

Another influential network model has been the conceptual dependency theory of Schank (1975). This starts off with the idea that meaning can be decomposed into small, atomic units. Text is represented by decomposing the incoming material into these atomic units, and by building a network that relates them. An important intermediate step is that the atomic units of meaning are combined into conceptualizations that specify the actors involved in the discourse and the actions that relate them. Once again, this approach has the advantage that, as it has been implemented (in part) as a computer simulation, its assumptions and limitations are very clear.

Evaluation of propositional network models

Just as there is little agreement on the cases to use, so there is little agreement on the precise types of roles and connections to use. If we measure propositional networks against the requirements listed for memory, inferences, and anaphora, we can see that they satisfy some of the requirements, but leave a lot to be desired as models of discourse processing. Most propositional network models show how knowledge might be represented, but they have little to say about when or how we make inferences, or how some items are maintained in the foreground, or how we extract the gist from text (Johnson-Laird et al., 1984; Woods, 1975). Propositional networks by themselves are inadequate as a model of comprehension, but form the basis of more complex models. Kintsch’s construction–integration model (see later) is based on a propositional model, but includes explicit mechanisms for dealing with the foreground and making inferences.

Story grammars

Stories possess a structure: they have a beginning, a middle, and an end. The structure present in stories is the basis of story grammars, which are analogous with sentence grammar. Stories have an underlying structure, and the purpose of comprehension is to reconstruct this underlying structure. This structure includes settings, themes, plots, and how the story turns out (see Mandler, 1978; Mandler & Johnson, 1977; Rumelhart, 1975, 1977; and Thorndyke, 1977, for examples).

Like sentence grammars, story grammars are made out of phrase-structure rules (see the example in Box 12.1). The nature of the syntactic rules in Box 12.1 is expanded by a corresponding semantic rule: for example, once you have a setting then an episode is possible. You can draw tree structures just as with sentences, hence emphasizing their hierarchical structure. The basic units, corresponding to individual words in sentence grammars, are propositions, which are eventually assigned the lowest-level slots.

Box 12.1 Example of a fragment of a story grammar (based on Rumelhart, 1975)
Story → Setting + theme + plot + resolution
Setting → Characters + location + time
Theme → (Event)* + goal
Plot → Episode*
Episode → Subgoals + attempt* + outcome
Asterisks show the element can be repeated.

In the recall, paraphrasing, and summarizing of stories, the less important details are omitted. According to story grammars, humans compute the importance of a sentence or a fact by its height in the hierarchy. Cirilo and Foss (1980) showed that participants spend more time reading sentences high in the structure than those low down in the structure. However, any sensible theory of text processing should predict that we pay more attention to the important elements of a story.

Thorndyke (1977) presented participants with one of two simple stories. The story “Circle
Island” was about building a canal on an island, and the second story was about an old farmer trying to put his donkey into a shed. One group of participants heard these stories in their normal straightforward form. A second group heard a modified version of the stories where the story structure had been tampered with. The modifications included putting the theme at the end of the story (rather than before its plot, where it is most effective), deleting the goal of the story, or, in its most extreme version, presenting the component sentences in random order. Thorndyke found that the more the story structure was tampered with, the less of the story participants could subsequently recall. Hence jumbled stories are harder to understand and remember than originals. According to story grammar theory, this is because jumbling a story destroys its structure. However, jumbling also destroys referential continuity. Garnham, Oakhill, and Johnson-Laird (1982) restored referential continuity in jumbled stories; this greatly reduced the difficulty participants had with them (as measured by memory for the stories, and the readers’ ratings of comprehensibility). For example, (57) is the original story, and (58) the same story with the sentences unchanged but in a random order, but in (59) some of the noun phrases have been changed so as to re-establish referential continuity:

(57) David was playing with his big, colored ball in the garden. He bounced it so hard that it went right over the fence. The people next door were out so he climbed over to get it. He found his ball and threw it back. David carried on with his game.
(58) He found his ball and threw it back. The people next door were out so he climbed over to get it. David carried on with his game. He bounced it so hard that it went right over the fence. David was playing with his big, colored ball in the garden.
(59) David found his big, colored ball and threw it back. The people next door were out so he climbed over to get it. He carried on with his game. He bounced his ball so hard that it went right over the fence. David was playing with it in the garden.

Stories have a structure which includes settings, themes, plots, and conclusion. Thorndyke (1977) found that the more this story structure was tampered with, the less of the story participants could subsequently recall.

Evaluation of story grammars

A major problem with story grammars is in getting agreement on what their elements, rules, and terminal elements should be. In a sentence grammar, the meaning of non-terminal elements such as “noun” and “verb” is independent of their content, and well defined. This is not true of story grammars. Neither are there formally agreed criteria for specifying a finite, well-specified set of terminal elements—there are a finite number of words, but an infinite number of propositions. We therefore cannot (as we can with words) make a list of propositions and the categories to which they belong. Furthermore, propositions might belong to different categories, depending on the context. There is no agreement on story structure: virtually every story grammatician has proposed a different grammar. Story grammars only provide a limited account of a subset of all possible stories. Furthermore, the analogy of story categories, with formal syntactic grammars such as NP, VP, and their rules of combination, is very weak. There is much variation with stories, and, unlike
sentences, the analysis of stories is content-dependent. Story grammars fail to provide an account of how stories are actually produced or understood. (See Black & Wilensky, 1979; Garnham, 1983b; Johnson-Laird, 1983; and Wilensky, 1983, for details of these criticisms.) Given the fundamental nature of some of these difficulties, story grammars are no longer influential in comprehension research.

Schema-based theories

The idea of a schema (the plural can be either schemata or schemas) was originally introduced by Bartlett (1932). Bartlett argued that memory is determined not only by what is presented, but also by the prior knowledge a person brings to the story. He presented people with stories that conflicted with prior knowledge, and observed that over time people’s memory for the story became increasingly distorted in the direction of fitting in with their prior knowledge.

A schema is an organized packet of knowledge that enables us to make sense of new knowledge. It is related to ideas in both AI on visual object recognition (Minsky, 1975) and experimental psychology (Posner & Keele, 1968). The schema gives knowledge-organizing activation that means that the whole is greater than the sum of its parts. It can be conceptualized as a series of slots that can be filled with particular values. Anderson (2010) gives the following example (60) of a possible schema for a house.

(60) House schema:
     Isa: building
     Parts: rooms
     Materials: bricks, stone, wood
     Function: human dwelling
     Shape: rectilinear, triangular
     Size: 100–10,000 square feet

There are four central processes involved in schema formation. First, the appropriate aspects of the incoming stimuli must be selected. Second, the meaning must be abstracted, and syntactic and lexical details dispensed with. Third, appropriate prior knowledge must be activated to interpret this meaning. Finally, the information must be integrated to form a single holistic representation.

The idea of a schema cannot in itself account for text processing, but it is a central concept in many theories. Although it provides a means of organization of knowledge, and explains why we remember the gist of text, it does not explain how we make inferences, how material is foregrounded, or why we sometimes remember the literal meaning. To solve these problems the notion must be supplemented in some way.

Scripts

A script is a special type of schema (Schank & Abelson, 1977). Scripts represent our knowledge of routine actions and familiar repeated sequences. Scripts include information about the usual roles, objects, and the sequence of events to be found in an action; they enable plans to be made and allow us to draw inferences about what is not explicitly mentioned. Two famous examples are the “restaurant script” and the “attending a lecture script” (see Table 12.1).

Psychological evidence for the existence of scripts comes from an experiment by Bower et al. (1979). Bower et al. asked participants to list about 20 events in activities such as visiting a restaurant, attending a lecture, getting up in the morning, visiting the doctor, or going shopping. Some examples are shown in Table 12.1. Items labeled (1) were mentioned by the most participants and are considered the most important actions in a script; items labeled (2) were mentioned by fewer participants; and items labeled (3) were mentioned by the fewest participants. These are considered the least important parts of the script. The events are shown in the order in which they were usually mentioned. All of these events were mentioned by at least 25% of the participants. Hence participants agree about the central features that constitute a script, and their relative importance.

Scripts are useful in explaining some results of experiments on anaphoric reference. Walker and Yekovich (1987) showed that a central concept of a script (such as a “table” in the restaurant script) was comprehended faster (regardless of whether it was explicitly mentioned in the story)
than a peripheral concept. Peripheral concepts of scripts were dealt with particularly slowly when their antecedents were only implied. That is, we find it easier to assign referents to the important elements of scripts.

TABLE 12.1 Examples of scripts (based on Bower et al., 1979).

Visiting a restaurant script  Attending a lecture script
Open door               3     Enter room                1
Enter                   2     Look for friends          2
Give reservation name   2     Find seat                 1
Wait to be seated       3     Sit down                  1
Go to table             3     Settle belongings         3
Be seated               1     Take out notebook         1
Order drinks            2     Look at other students    2
Put napkins on lap      3     Talk                      2
Look at menu            1     Look at lecturer          3
Discuss menu            2     Listen to lecturer        1
Order meal              1     Take notes                1
Talk                    2     Check time                1
Drink water             3     Ask questions             3
Eat salad or soup       2     Change position in seat   3
Meal arrives            3     Daydream                  3
Eat food                1     Look at other students    3
Finish meal             3     Take more notes           3
Order dessert           2     Close notebook            2
Eat dessert             2     Gather belongings         2
Ask for bill            3     Stand up                  3
Bill arrives            3     Talk                      3
Pay bill                1     Leave                     1
Leave tip               2
Get coats               3
Leave                   1

Items labeled (1) are considered most important, (3) least important.

Occasionally events happen that are not in the script: for example, the waiter might spill the soup on you. Schank and Abelson (1977) referred to such interruptions as obstacles or distractions, because they get in the way of the main purpose of the script (here, eating). Bower et al. made predictions about two types of event in stories relating to scripts. First, distractions that interrupt the purpose of the script should be more salient than the routine events, and should therefore be more likely to be remembered. Second, events that are irrelevant to the purpose of the script (such as the color of the waiter’s shoes) should be poorly remembered. Both of these predictions were verified.
Schank (1982) pointed out that most of life is not governed by predetermined, over-learned sequences such as those encapsulated by a script. Knowledge structures need to be flexible. Dissatisfied with this limitation of scripts, Schank focused on the role of reminding in memory. He argued that memory is a dynamic structure driven by its failures. Memory is organized into different levels, starting at the lower end with scenes. Examples of these in what would earlier have been called a “going to the doctor script” include “reception room scene,” “waiting scene,” and “surgery scene.” Scenes are organized into memory organization packets or MOPs, which are all linked by being related to a particular goal. In any enterprise, more than one MOP might be active at once. MOPs are themselves organized into meta-MOPs if a number of MOPs have something in common (for example, all MOPs involving going on a trip). At a higher level than MOPs and meta-MOPs are thematic organization points or TOPs, which deal with abstract information independent of particular physical or social contexts.
There is some support for MOPs from a series of experiments by McKoon, Ratcliff, and Seifert (1989) and Seifert, McKoon, Abelson, and Ratcliff (1986). They showed that elements of MOPs could prime the retrieval of other elements from the same MOP. Participants read a number
of stories, some of which shared the same MOPs, as construed by the experimenters. They then had to make “old” or “new” recognition judgments about a number of test sentences, some of which had been in the original stories, and some of which had not. A priming phrase from the same story as the test sentence always produced facilitation for the subsequent test sentence. However, a priming phrase that had originally been in a different story from the test sentence also produced facilitation if it was in the same MOP as the test sentence. The amount of facilitation found was the same whether the original phrase was from a different story or from the same story. There was no facilitation if the priming phrase was from a different story and a different MOP to the test sentence.

Evaluation of schema and script-based approaches
The primary accusation against schema and script-based approaches is that they are nothing more than redescriptions of the data. This is quite difficult to rebuff. Ross and Bower (1981) suggested that schemas have an organizing ability beyond their constituents, but this has yet to be demonstrated of scripts, although the data from McKoon et al. could be interpreted in this way. It is also unclear how particular scripts get activated. It cannot just be by word association. For example, the phrase “the five-hour journey from London to New York” should activate the “plane flight script,” yet no single word in this utterance is capable of doing so (Garnham, 1985).
Although there are some experimental findings that support the idea that knowledge is organized around schema-like structures, they cannot as yet provide a complete account of text processing. They can only give an account of stereotyped knowledge. They show how such knowledge might be organized, and what kinds of inference we can make, but at present they have little to say about how these inferences are made, how anaphors are resolved, or which items are foregrounded. To do this we must consider not only how knowledge is represented in memory, but also the processes that operate on that knowledge, and relate it to incoming information.

Mental models
Comprehenders construct a model as they go along to represent what they hear and read. If the information is represented in a form analogous to what is being represented, this type of representation is called a mental model (Johnson-Laird, 1983; see also Garnham, 1985, 1987). If the information is represented propositionally, this type of representation is called a situation model (van Dijk & Kintsch, 1983), although many researchers do not distinguish between the two terms. A mental model directly represents the situation in the text. Its structure is not arbitrary in the way that a propositional representation is, but directly mirrors what is represented. We form mental models of specific things, and these models can give rise to mental images. Whereas schemas contain general information abstracted over many instances, in the mental models approach a specific model is constructed to represent new information from general information of space, time, causality, and human intentionality (Brewer, 1987). Mental models are not just used in the short term in working memory to interpret text: people have long-term memory for the models they construct, as well as some memory for the surface text (Baguley & Payne, 2000).
The application of mental models is most apparent in representing spatial information. There is some evidence that the spatial layout of what is represented in the text affects processing. For example, Morrow, Bower, and Greenspan (1989) argued that readers construct a mental model representing the actors involved and their relative spatial locations. They showed that the accessibility of objects mentioned in the text depended on the relative spatial location of the objects and the actors, rather than on the accessibility of the hypothesized propositions that might be used to represent those locations. Ehrlich and Johnson-Laird (1982) examined how we might form a mental model of a text describing spatial information. The “turtles story” of Bransford et al. (1972), given earlier in (17) and (18), also suggests that we construct a spatial layout of some text. The accessibility of referents also depends on the spatial distance from the focus of attention
in the model (Glenberg, Meyer, & Lindem, 1987; Rinck & Bower, 1995). In Rinck and Bower’s experiment, participants memorized a diagram of a building and then read a story describing characters’ activities in the building. The reading times of sentences increased with the number of rooms between the room containing an object mentioned in the sentence and the room where the protagonist of the story was currently located.
Mental models represent more than spatial information, however. There is agreement that they are multidimensional and represent five kinds of information: spatial, causal, and temporal information, information about people’s goals, and information about the characteristics of people and objects (Zwaan & Radvansky, 1998). There is some evidence that different aspects of the mental model are maintained independently in working memory. Friedman and Miyake (2000) had people read short stories while responding to spatial and causal probe questions. They found that the spatial measures were influenced by the spatial demands of the texts, but not the causal demands, whereas the causal measures were only influenced by the causal demands. Spatial aspects of the text become encoded in spatial memory, but the causal aspects become encoded in verbal memory.
The mental models approach is an extreme version of a constructionist approach. Indeed, Brewer (1987) distinguished mental models from other approaches by saying that rather than accessing pre-existing structures, mental models are specific knowledge structures constructed to represent each new situation, using general information such as knowledge of spatial relations and general knowledge. Exactly how this construction takes place, and the precise nature of the representations involved, is sometimes unclear.

Updating the model
Text processing is dynamic. As people comprehend text, and new material becomes available, they have to update their mental representations. Zwaan and Madden (2004) distinguish two approaches to how updating occurs. According to the here-and-now model, information that is currently relevant to the protagonist of the text is more available than less relevant material (Morrow, Greenspan, & Bower, 1987; Morrow et al., 1989; Zwaan & Radvansky, 1998). According to the resonance model, new information resonates with all information in memory, even with information that is not apparently immediately relevant or up-to-date (Myers & O’Brien, 1998). Importantly, passive reactivation of old material cannot be prevented: all immediately irrelevant information will become active as long as it is related. Zwaan and Madden show that comprehenders can update situation models with new information that is consistent with the current situation, but inconsistent with the prior situation, as easily as material that was never inconsistent with the prior situation. This finding suggests that the most important determinant of updating is what is currently available, and new information does not resonate with all information in memory. However, the findings in this sort of experiment are very sensitive to the details of the materials used, and this conclusion is controversial (e.g., see O’Brien, Cook, & Peracchi, 2004; O’Brien, Rizzella, Albrecht, & Halleran, 1998).
Time is clearly an important determinant of how we construct models. In addition to the absolute time (the time at which information becomes available in real time), relative time in a story is also important. A story unfolds in time, with the focus continually shifting. As a consequence, some events are immediate, some are in the recent past, and some are perhaps quite a long time away. Relative time can affect the accessibility of entities in a model. Entities are less accessible when the temporal distance between the “now” point and the past is long rather than short: readers need to take more time to access entities remote in time. However, the effect of relative time only applies to consecutive events (Kelter, Kaup, & Claus, 2004). The critical comparison is the difference between sentences such as (61) and (62).

(61) She then goes to the hairdresser and buys hairspray.
(62) She then goes to the hairdresser and gets a perm.

There is no difference in utterance length here, but more time is likely to elapse in (62) than
in (61). Entities mentioned before these critical sentences take longer to access after (62) than after (61).
Given the importance of relative time in the model, people pay particular attention to words and phrases that indicate relative time, particularly those that indicate a shift of time in the narrative. Words and phrases such as “later” or “two days later” are called segmentation markers; they tell the reader that there is a temporal discontinuity and a potential shift of topic. People take longer to read the first sentence after a shift of topic (an effect called the boundary effect), but this penalty is lessened if the shift is flagged by a segmentation marker (Bestgen & Vonk, 2000).

Kintsch’s construction–integration model
Kintsch (1988) described a detailed and plausible model of spoken and written text comprehension known as the construction–integration model. This model emphasizes how texts are represented in memory and understanding, and how they are integrated into the comprehender’s general knowledge base. The construction–integration model combines aspects of the network, schema-based, and mental model approaches. Text is represented at a number of levels and processed cyclically in two phases. A text base is created from the linguistic input and from the comprehender’s knowledge base in the form of a propositional network. The text base is used to form the situation model (which can also be represented propositionally), where the individuality of the text has been lost, and the text has been integrated with other information to form a model of the whole situation described in the text.
The early version of the model (Kintsch, 1979; Kintsch & van Dijk, 1978) is a sophisticated propositional network. The input is dealt with in processing cycles. Short-term memory acts as a buffer to store incoming material. We build up a representation of a story given two inputs: the story itself, and the goals of the reader. The goals and knowledge of the reader are represented by the goal schema, which does things such as stating what is relevant, setting expectations, and demanding that certain inferences be drawn if needed facts are not explicitly stated.
Text is represented in the form of a network of connected propositions or facts called a coherence graph. The coherence graph is built up hierarchically. This text base has both a microstructure and a macrostructure. The microstructure is this network of connected propositions. In processing text, we work through it in input cycles that usually correspond to a sentence, with an average size of seven propositions. In each cycle the overlap of the proposition arguments is noted; propositions are semantically related when they share arguments. If there is no overlap between the incoming propositions and the propositions currently in working memory, then there must be a time-consuming process of inference involving a reinstatement search (search of long-term memory). If there is overlap, then the new propositions are connected to the active part of the coherence graph by coherence rules. The macrostructure concerns the higher level of description and the processes operating on that. Relevant schemas are retrieved in parallel from long-term memory. The knowledge base in long-term memory is stored in an associative network. Rules called macrorules provide operations that delete propositions from microstructure, summarize propositions, and construct inferences (e.g., to fill gaps in the text). Script-like information would be retrieved at this stage. The final situation model represents the text, but in it the individuality of the text has been lost, and the text has been integrated with other information into a larger structure. Temporality, causality, and spatiality are particularly important in the situation model (Gernsbacher, 1990). Reading time studies suggest that comprehenders pay particular attention to these aspects of text (Zwaan, Magliano, & Graesser, 1995).
As the text is being processed, certain propositions will be stored in working memory. As this has a limited capacity, what determines what goes into this buffer? First, recency is important. Second, the level at which a proposition is stored is important, with propositions higher in the coherence graph more likely to receive more processing cycles in working memory.
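The bookkeeping of the input cycles described above can be made concrete with a small sketch. It is an illustration under stated assumptions only, not Kintsch and van Dijk's implementation: a proposition is reduced to a predicate plus a set of arguments, the buffer simply keeps the most recent propositions (the preference for higher-level propositions is left out), and the long-term memory search itself is not simulated, only counted. The names Proposition, shares_argument, count_reinstatement_searches, and buffer_size are hypothetical, and the three-sentence text is invented.

```python
from typing import NamedTuple


class Proposition(NamedTuple):
    predicate: str
    arguments: frozenset  # the concepts the proposition is about


def shares_argument(prop, others):
    """True if prop has at least one argument in common with any proposition in others."""
    return any(prop.arguments & other.arguments for other in others)


def count_reinstatement_searches(cycles, buffer_size=2):
    """Count the input cycles whose propositions overlap with nothing in the buffer.

    Assumption: the buffer keeps only the most recent propositions;
    the role of a proposition's level in the coherence graph is ignored.
    """
    buffer, searches = [], 0
    for incoming in cycles:
        if buffer and not any(shares_argument(p, buffer) for p in incoming):
            searches += 1  # no overlap: long-term memory would have to be searched
        buffer = (buffer + list(incoming))[-buffer_size:]
    return searches


# Invented text, one input cycle per sentence.
cycles = [
    [Proposition("enter", frozenset({"mary", "restaurant"})),
     Proposition("sit", frozenset({"mary", "table"}))],
    [Proposition("bring", frozenset({"waiter", "menu"}))],
    [Proposition("order", frozenset({"mary", "soup"}))],
]

print(count_reinstatement_searches(cycles, buffer_size=2))
print(count_reinstatement_searches(cycles, buffer_size=1))
```

With these invented data the two-proposition buffer needs one search and the one-proposition buffer needs two, because the smaller buffer has already dropped the proposition about Mary by the time the third sentence arrives.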
The construction–integration model itself (Kintsch, 1988) keeps most of the features of the earlier model (see Figure 12.4). In the construction phase of processing, word meanings are activated, propositions formed, and inferences made, by the mechanisms described earlier. The initial stages of processing are bottom-up. In the integration phase, the network of interrelated items is integrated into a coherent structure. The text base constructed in the construction phase may contain contradictions or may be incorrect, but any contradictions are resolved in the integration phase by a process of spreading activation around the network until a stable, consistent structure is attained. Information is represented at four levels: the microstructure of the text; the local structure (sentence-by-sentence information integrated with information retrieved from long-term memory); the macrostructure (the hierarchically ordered set of propositions derived from the microstructure); and the situation model (the integration of the text base, that is, microstructure and macrostructure together, with the results of inferences).

Evaluation of the construction–integration model
The construction–integration model explains many experimental findings. First, the more propositions there are in a passage, the longer it takes to read per word (Kintsch & Keenan, 1973). Second, as we have seen, there is a levels effect in the importance of a proposition owing to the multiple processing of high-level propositions. They are held in working memory longer, and elaborated more. Whenever a proposition is selected from working memory, its probability of being reproduced increases. Kintsch and van Dijk (1978) found that the higher the level of a proposition, the more likely it is to be recalled in a free recall task.
Inferences are confused with original material because the propositions created as a result of the inferences are stored along with explicitly presented propositions. The two sorts of proposition are indistinguishable in the representation. That this depends on the operation of goal and other schemas also explains why material can be hard to understand and remember if we do not know what it is about. We remember different things if we change perspective because different goal schemas become active.
The model also explains readability effects and the difficulty of the text. Kintsch and van Dijk (1978) defined the readability of a story as the number of propositions in the story divided by the time it takes to read it. The best predictors of readability turned out to be the frequency of words in the text and the number of reinstatement searches that have to be made, as predicted from the model. Kintsch and Vipond (1979) confirmed that readability is not determined solely
by the text, but is an interaction between the text and the readers. The most obvious example is that reinstatement searches are only necessary when a proposition is not in working memory, and obviously the greater the capacity of an individual’s working memory, the less likely such a reader is to need to make reinstatement searches. Daneman and Carpenter (1980) found that individual differences in working memory size can affect reading performance. So if you want to write easily readable text, you should use short words, and try to write so as to avoid the reader having to retrieve a lot of material from long-term memory.
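As a worked illustration of the readability definition given above (propositions divided by reading time), the following sketch compares two passages. The figures are invented, the unit of time (seconds) is an assumption because the text does not specify one, and readability is a hypothetical helper name.

```python
def readability(n_propositions: int, reading_time_s: float) -> float:
    """Propositions per second of reading time, following the definition in the text.

    Assumption: reading time is measured in seconds.
    """
    if reading_time_s <= 0:
        raise ValueError("reading time must be positive")
    return n_propositions / reading_time_s


# Two invented passages with the same number of propositions.
quick_passage = readability(40, 80.0)   # read in 80 seconds
slow_passage = readability(40, 160.0)   # read in 160 seconds

print(quick_passage, slow_passage)
```

On this measure a passage that conveys the same number of propositions in half the time is twice as readable, so anything that slows reading, such as rare words or extra reinstatement searches, lowers the score.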
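FIGURE 12.4 Construction–integration model (Kintsch, 1988), with a construction phase followed by an integration phase.
The model can explain some differences between good and poor readers. Vipond (1980)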
presented readers with passages containing technical material. Comprehension ease could be predicted from the number of times a reader must make a reinstatement, by the number of propositions reinstated, by the number of inferences and reorganizations required to keep the network coherent, and by the number of levels in the network required to represent the text. Vipond examined how these variables operate at the microlevel (to do with sentences) and the macrolevel (to do with paragraphs). He found that involvement of microprocesses predicts the reading performance of less skilled readers, whereas the involvement of macroprocesses predicts the reading performance of better readers. He argued that microprocesses have greater influence in question answering, recognition, and locating information in text, whereas macroprocesses have greater influence in integration and long-term retention.
Fletcher (1986) examined eight strategies that participants might use to keep propositions in the short-term buffer. Four were local strategies (“most recent proposition”; “most recent topical,” that is, the first agent or object mentioned in a story; “most recent containing the most frequent argument”; and “leading edge,” a combination of the most recent and the most important proposition) and four were global strategies (“follow a script”; “correspond to the major categories of a story grammar”; “indicate a character’s goal or plan”; “are part of the most recent discourse topic”). These were tested against 20 texts in a recall task and a “think-aloud protocol” task, where participants had to read the story and elaborate out loud. There was no clear preference for local versus global strategies, although the “plan/goal” strategy was top in both tasks, and story structure was also important. There were large task differences: for example, frequency was bottom in recall but third most used in the protocol task.
Finally, the model also predicts how good memory is for text and how prior knowledge affects the way in which people answer questions (Kintsch, Welsch, Schmalhofer, & Zimny, 1990). The more background knowledge comprehenders have, the more likely they are to answer questions based on their situation model. Comprehenders with less prior knowledge rely more on the surface detail in the text base to answer questions.

Comparison of models
Story grammars suffer from a number of problems: In particular, it is difficult to agree on what the terminal and non-terminal categories and rules of the grammar should be. Propositional networks and schema models, while providing useful constructs, are not in themselves sufficient to account for all the phenomena of text processing. Of these models, Kintsch’s is the most detailed and promising, and as a consequence has received the most attention. It combines the best of schema-based and network-based approaches to form a well-specified mental model theory.

INDIVIDUAL DIFFERENCES IN COMPREHENSION SKILLS
Throughout this book, we have seen that there are individual differences in reading skills, and the same is true of text processing: people differ in their ability to process text effectively. There are a number of ways in which people differ in comprehension abilities, and a number of reasons for these differences. For example, less skilled comprehenders draw fewer inferences when processing text or discourse, and are also less well able to integrate meaning across utterances (Oakhill, 1994; Yuill & Oakhill, 1991). Working memory plays a role in these difficulties, but is unlikely to be the sole reason.
Working or short-term memory is used for storing currently active ideas, and for the short-term storage of mental computations (Baddeley, 1990). Differences in working memory span have a number of consequences for the ability to understand text (Singer, 1994). For example, a high span will enable an antecedent to be kept active in memory for longer, and will enable more elaborative inferences to be drawn. A useful measure of working memory capacity for text processing is reading span as defined by Daneman and Carpenter (1980). People hear or read sets of unrelated sentences, and after each set attempt to recall the last word of each
sentence. Reading span is the largest size set for which a participant can correctly recall all the last words. Reading span correlates with the ability to answer questions about texts, with pronoun resolution accuracy, and even with general measures of verbal intelligence such as SAT scores (a standardized test of academic and intellectual achievement in the USA, standing for Scholastic Assessment Test). Daneman and Carpenter argued that reading span gives a much better measure of comprehension ability than traditional word span scores. On the other hand, it has proved much harder to find effects of memory capacity on elaborative inferences, perhaps because optional elaborations are not always reliably inferred by readers (Singer, 1994). Less able readers are also more prone to mind wandering when reading (McVay & Kane, 2012), suggesting that attentional control and executive processing also play an important role in skilled reading, in addition to working memory capacity.
We saw earlier that prior knowledge influences comprehension. Possessing prior knowledge can be advantageous. In general, the more you know about a subject, the easier it is to understand and remember related text. (You can easily verify this for yourself by picking up a book or an article on a topic you know nothing about.) Prior knowledge provides a framework for understanding new material, activates appropriate concepts more easily, and affects the processing of inferences. It helps us to decide what is important and relevant in material and what is less so. The effects of prior knowledge can be quite specific (Singer, 1994). Although experts are more accurate and faster than novices at making judgments about statements related to their expertise, this advantage does not carry over to material in the same text that is not related to their expertise, and does not help in making complicated elaborative inferences.
Skilled comprehenders are also better able to suppress irrelevant and inappropriate material (Gernsbacher, 1997). Suppression can be distinguished from the related attentional process of inhibition that is important in attentional expectancy-based priming (see Chapter 6). Suppression is the attenuating of activation, whereas inhibition is the blocking of activation (Gernsbacher, 1997). Suppression requires that material becomes activated before it can be suppressed. Reading activates a great deal of material, and skilled comprehenders are better able to suppress material that is less relevant to the task at hand. Suppression reduces interference. Less skilled comprehenders are less efficient at suppressing the inappropriate meaning of homonyms such as SPADE (Gernsbacher, Varner, & Faust, 1990). When presented with the test word “ace” 850 ms after the sentence “He dug with a spade,” skilled comprehenders showed no interference, but less skilled comprehenders took longer to reject the test word. Less skilled comprehenders are also less efficient at suppressing the activation of related pictures when reading words. They are even less good at processing puns, because they are less able to quickly suppress the contextually appropriate meaning of a pun (Gernsbacher, 1997).
Finally, although there has been considerable debate as to the exact mechanisms involved, some cognitive abilities decline with normal aging (Woodruff-Pak, 1997). There is experimental evidence that young people are more effective at relating ideas in text (Cohen, 1979; Singer, 1994). Healthy elderly people are less efficient at suppressing irrelevant material than young people.

How to become a better reader
We saw in Chapter 7 that increases in reading speed are at the cost of impaired comprehension. However, psycholinguistics has suggested a number of tips about how one’s level of comprehension of text can be improved.
You can improve your reading ability by providing yourself with a framework. One of the best known methods for studying is called the PQ4R method (Anderson, 2010; Thomas & Robinson, 1972) (see Figure 12.5). This method emphasizes identifying the key points of what you are reading, and adopting an active approach to the material. In terms of Kintsch’s model, this enables appropriate goal schema to be activated right from the start. It also enables you to process the material more deeply, and think about its implications. Material should also be related to prior knowledge. The technique also maximizes memory retention. Making up questions and answering
them is known to improve memory, with question- x Reflect. Reflect on the material as you read it.
making the more effective of the two (Anderson, Try to think of examples, and try to relate the
2010). Finally, elaborative processing of material material to prior knowledge. Try to understand
is highly beneficial; we saw earlier that we tend it. If you don’t understand it all the first time,
to remember our inferences. The PQ4R method don’t worry. Some difficult material takes sev-
makes incidental use of all of these insights. The eral readings.
method goes like this. It can be applied either to a x Recite. After finishing a section, try to recall
whole book or to just one chapter in a book: the information that was in it. Try answering
the questions you made up earlier. If you can-
x Preview. Survey the material to determine not, reread the difficult material and the parts
what is discussed in it. Examine the contents relevant to the questions you could not answer.
list. If the book or chapter has an introduction x Review. After you have finished, go through it
or summary, read it. Read the conclusions. mentally, recalling the main points. Again try
Look at the figures and tables to get a feel for answering the questions you made up. A few
what the material is about. Identify the sections minutes after you have finished this process,
to be read as single units. Apply the next four flick through the material once more. If pos-
steps to each section. sible, repeat this an hour or so later.
x Questions. Make up questions for each section.
Try to make your questions related to your You might need to repeat the whole process
goals in reading the material. You can some- if you want to approach the material with a differ-
times turn section headings into questions. ent emphasis. This method is not always appropri-
(I’ve already tried to do this where possible in ate, of course. I wouldn’t like to read a novel by
this book.) the PQ4R method, for example. But if you have to
x Read. Read the material carefully, trying to study material for some purpose—such as this text-
answer the questions you made up. book for an exam—it is much better to rely on psy-
cholinguistic principles than to read it like a novel.
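The order of the PQ4R steps can be laid out as a simple study plan. The following is only a minimal sketch, assuming that Preview and Review are applied once to the whole text while the four middle steps are repeated for each section; the function name and the section titles are hypothetical.

```python
# Minimal sketch of the PQ4R sequence. The function name and the example
# section titles are illustrative assumptions, not part of the method.

PER_SECTION_STEPS = ("Questions", "Read", "Reflect", "Recite")


def pq4r_plan(sections):
    """Return an ordered list of (step, target) pairs for one pass of PQ4R."""
    plan = [("Preview", "whole text")]       # survey everything first
    for section in sections:                 # the next four steps, per section
        for step in PER_SECTION_STEPS:
            plan.append((step, section))
    plan.append(("Review", "whole text"))    # recall the main points at the end
    return plan


for step, target in pq4r_plan(["Introduction", "Slips of the tongue"]):
    print(f"{step}: {target}")
```

Repeating the whole process with a different emphasis would simply mean generating and working through the plan again with a new set of questions.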
FIGURE 12.5 The PQ4R method (Anderson, 2010; Thomas & Robinson, 1972). Steps shown in order: PREVIEW, QUESTIONS, READ, REFLECT, RECITE, REVIEW.
THE NEUROSCIENCE OF TEXT PROCESSING
Much less is known about the neuropsychology of text processing than about the neuropsychology of many other language processes. This is because text processing and semantic integration really comprise many processes, at least some of which are not specific to language, and involve much of the cortex. It is much more straightforward to track down the effects of brain damage on modular processes. Many types of brain damage will lead to some impairment of comprehension ability. For example, people with receptive aphasia have difficulty in understanding the meaning of words; this obviously impairs their ability to follow coherent text and conversation. People with syntactic processing impairments have difficulty in parsing sentences (see also Chapter 10). However, it has proved much more difficult to find deficits that are
restricted to text processing. Some patients with Wernicke’s aphasia have difficulty in maintaining the coherence of discourse; they might repeat ideas or introduce irrelevant ones (Christiansen, 1995). Children with SLI (specific language impairment) are poor at story comprehension and making inferences. The source of their comprehension difficulty is uncertain: Limited working memory span might play a role, and it is also possible that ability to suppress information is impaired. It may also be that the difficulties arise because these children spend so much time processing individual words and parsing sentences, as they have a host of other difficulties (Bishop, 1997). Of course, all of these factors might play a part.
In spite of this difficulty, there are reports of people with an impaired ability to understand discourse, but without other language impairments. Most of these reports involve (right-handed) people with right-hemisphere damage (Caplan, 1992). For example, such patients have some difficulty in understanding jokes (Brownell & Gardner, 1988; Brownell, Michel, Powelson, & Gardner, 1983). Consider the following joke (63) with three possible punchlines (from Brownell et al., 1983, selected by Caplan, 1992):
(63) The quack was selling a potion which he claimed would make men live to a great age. He claimed he himself was hale and hearty and over 300 years old.
     “Is he really as old as that?” asked a listener of the youthful assistant.
     “I can’t say,” said the assistant, “X.”
     Which best fits X?
     A. Correct punchline: “I’ve only worked with him for 100 years.”
     B. Coherent non-humorous ending: “I don’t know how old he is.”
     C. Incoherent ending: “There are over 300 days in a year.”
Brownell et al. found that right-hemisphere patients were not very good at picking the correct punchline. They often chose the incoherent ending. They knew that the ending of a joke should be surprising, but were unable to maintain coherence.
Right-hemisphere patients also find some discourse inferences difficult to make (Brownell, Potter, Bihrle, & Gardner, 1986). In particular, while they are able to draw straightforward inferences from discourse, they are unable to revise them in the light of new information that should make them inappropriate (Caplan, 1992).
We saw in Chapter 3 that children with semantic-pragmatic disorder have difficulty in conversations where they have to draw inferences (Bishop, 1997). They give very literal answers to questions, and fail to take the preceding conversational and social context into account. Semantic-pragmatic disorder is best explained in terms of these children having difficulty in representing other people’s mental states.
Many people with short-term memory impairments show comprehension impairments. We saw earlier that reading span tends to be lower in people with poor comprehension skills. Brain damage can dramatically reduce short-term memory span (to just one or two digits). Patient BO had particular difficulty understanding sentences with three or more noun phrases (Caplan, 1992). McCarthy and Warrington (1987b) described a patient who had difficulty in translating commands into actions. People with dementia have difficulty in keeping track of the referents of pronouns; this is likely to be because of their impaired working memory (Almor, Kempler, MacDonald, Andersen, & Tyler, 1999). Vallar and Baddeley (1987) described a patient with impaired short-term memory who could not detect anomalies involving reference. Although short-term memory seems to play little role in parsing (Chapter 10), it is important in integration and maintaining a discourse representation.
We saw earlier that one aspect of being a skilled comprehender is to suppress irrelevant material. People with dementia are very inefficient at suppressing irrelevant material (Faust, Balota, Duchek, Gernsbacher, & Smith, 1997). This leads to a reduced ability to understand text and conversation. Furthermore, the more severe the dementia, the less efficient the suppression. People with dementia also seem to change the topic of conversation more often and more unexpectedly than people without dementia, and are generally less able to maintain coherence in conversation (Garcia & Joanette, 1997).
SUMMARY
• In comprehension, we go beyond word meaning and syntactic structure to integrate the semantic roles into a larger representation that integrates the text or discourse with previous material and with background information.
• Text has a structure and coherence that makes it easy to understand.
• People try to make new information as easy to assimilate as possible for the listener.
• Literal memory is normally very unreliable.
• People generally forget the syntactic and lexical details of what they hear or read, and just remember the gist.
• We can remember some of the literal form, particularly where the wording matters, and for incidental material such as jokes.
• We have better memory for what we consider to be important material.
• Prior knowledge is important; it helps us to understand and remember material.
• Changing perspective can help you remember additional information if the story was easy to understand in the first place.
• As we read or listen, we make inferences.
• Eyewitness testimony can be quite unreliable, as people confuse inference with what originally happened, and can be misled by the wording of questions.
• Bridging inferences enable us to maintain the coherence of text, elaborative inferences to go beyond the text.
• We find it difficult to distinguish our inferences from the original material.
• According to the constructionist viewpoint, we construct a detailed model of the discourse, using many elaborative inferences; according to the minimalist viewpoint, we make only those inferences we need to maintain the coherence of the representation.
• The number of inferences we make at the time of comprehension might be quite minimal; we make only those necessary to make sense of the text and keep it coherent.
• Many elaborative inferences are made at the time of recall.
• Resolving anaphoric reference involves working out who or what (the antecedent) should be associated with pronouns and referring phrases.
• Gender is an important cue for resolving anaphoric ambiguity.
• Some topics are more accessible than others; they are said to be in the foreground.
• Common ground refers to items that are mutually known by participants in conversations, when the participants know that the others know about these things too.
• Factors such as common ground cannot restrict the initial search for possible referents, but may be an important constraint in selecting among alternatives.
• Propositions are units of meaning relating two things.
• Propositional networks form a useful basis for representing text, but cannot be sufficient in themselves, because they do not show how we make inferences, or how some items are kept in the foreground.
• According to story grammars, stories have a structure analogous to that of a sentence; however, unlike sentence grammars, there is no agreement on how stories should be analyzed, or on what the appropriate units should be.
• Schemas are organized packets of knowledge that have been abstracted from many instances; they are particularly useful for representing stereotypical sequences (such as going to a restaurant).
• A mental model is a structure that represents what the text is about, particularly preserving spatial information.
• The construction–integration model combines propositional networks, schema theory, and mental models to provide a detailed account of how we understand text.
• Working memory span is an important constraint on comprehension ability.
• Skilled comprehenders are better able to suppress irrelevant material.
• The PQ4R method is a powerful method for approaching difficult material.
• People with right-hemisphere brain damage have difficulty in understanding jokes and drawing appropriate inferences.
• Children with semantic-pragmatic disorder have difficulty following conversations because they cannot represent other people’s mental states.
• Impaired short-term memory disrupts the ability to comprehend text and discourse.
• Dementia reduces the ability to comprehend text and discourse and to maintain a coherent conversation.
1. What makes some stories easier to follow than others?
2. How is watching a film like reading a book? In what ways does it differ?
3. Many of the experiments on parsing involved analyses of reaction times. In contrast, experiments such as those of Bransford and Johnson (1973; see Figure 12.1) necessitate a more qualitative analysis that involves dividing a story up into “ideas.” How easy is it to identify an idea?
4. What determines how easy it is to assign an antecedent to an anaphoric expression?
5. What has psychology told us about how comprehension skills should be taught?
6. To what extent are the same sorts of processes considered to be automatic in word recognition, parsing, and comprehension?
FURTHER READING
Fletcher (1994) reviews the classic literature on text memory. See Altarriba (1993) for a review of cultural effects in comprehension.
There are many references on the debate between minimalism and constructionism (e.g., Graesser, Singer, & Trabasso, 1994; McKoon, Gerrig, & Greene, 1996; Potts, Keenan, & Golding, 1988; Singer, 1994; Singer & Ferreira, 1983; Singer, Graesser, & Trabasso, 1994).
Kintsch (1994) reviews models of text processing. Another early influential propositional network model was that of Norman and Rumelhart (1975). Brewer (1987) compares the mental model and schema approaches to memory. See Mandler and Johnson (1980) and Rumelhart (1980) for replies to critics of story grammars. See Eysenck and Keane (2010) for more on schemas. Wilkes (1997) describes how knowledge is represented.
See Bishop (1997) for a review of developmental discourse disorders, including semantic-pragmatic disorder.
SECTION E
PRODUCTION AND OTHER
ASPECTS OF LANGUAGE
This section looks at how we produce language. It also examines the structure of the language system, with emphasis on how we repeat words and the role of memory in language processing. It ends with a brief look at the main themes outlined in Chapter 1, and some possible future issues.
Chapter 13, Language production, looks at the process involved in deciding what we want to say, and how we turn these words into sounds. Where does comprehension end and production begin? Writing is another way of producing language that is examined here.
Chapter 14, How do we use language?, looks at how we use language. The chapter examines conversation and pragmatics, and the relation between language and the visual world.
Chapter 15, The structure of the language system, draws together issues from the rest of the book, looking at how the components of the system interrelate, particularly with reference to memory.
Chapter 16, New directions, evaluates the present status of psycholinguistics and the ways in which the themes introduced in Chapter 1 may be developed in the future.
CHAPTER 13
LANGUAGE PRODUCTION
INTRODUCTION
This chapter examines how we produce language. There has been less research on language production than on language comprehension. Consider the amount of space devoted to these topics in this book: several chapters on input and only one on output. Clearly we do not spend disproportionately more time listening or reading than we do speaking, so why is there this imbalance of research? The investigation of production is perceived to be more difficult than the investigation of comprehension, primarily because it is difficult to control the input in experiments on production. It is relatively easy to control the frequency, imageability, and visual appearance (or any other aspect that is considered important) of the materials of word recognition experiments, but our thoughts are much harder to control experimentally.
FIGURE 13.1 Speech production processes (Levelt, 1989). Stages shown in order: CONCEPTUALIZATION (MESSAGE LEVEL OF REPRESENTATION), FORMULATION, ARTICULATION.
The processes of speech production fall into three broad areas called conceptualization, formulation, and execution (Levelt, 1989). At the highest level, the processes of conceptualization involve determining what to say. These are sometimes also called message-level processes. The processes of formulation involve translating this conceptual representation into a linguistic form. Finally, the processes of articulation involve
detailed phonetic and articulatory planning (see Figure 13.1).
During conceptualization, speakers conceive an intention and select relevant information from memory or the environment in preparation for the construction of the intended utterance. The product of conceptualization is a preverbal message. This is called the message level of representation. To some extent, the message level is the forgotten level of speech production. A problem with talking about intention and meaning, as Wittgenstein (1958) observed, is that they induce “a mental cramp.” Very little is known about the processes of conceptualization and the format of the message level. Obviously the message level involves interfacing with the world (particularly with other speakers), and with semantic memory. The start of the production process must have a great deal in common with the end point of the comprehension process. When we talk, we have an intention to achieve something with our language. How do we decide on the illocutionary force of what we want to say? Levelt (1989) distinguished between macroplanning and microplanning conceptualization processes. Macroplanning involves the elaboration of a communicative goal into a series of subgoals and the retrieval of appropriate information. Microplanning involves assigning the right propositional shape to these chunks of information, and deciding on matters such as what the topic or focus of the utterance will be.
There are two major components of formulation: We have to select the individual words that we want to say (lexicalization), and we have to put them together to form a sentence (syntactic planning). It might not always be necessary to construct a syntactic representation of a sentence in order to derive its meaning. Clearly this is not an option when speaking. Given this, it is perhaps surprising that more attention has not been paid to syntactic encoding in production, but the difficulties of controlling the input are substantial.
Finally, the processes of phonological encoding involve turning words into sounds in the right order, spoken at the correct speed, with the appropriate prosody (intonation, pitch, loudness, and rhythm). The sounds must be produced in the correct sequence and specify how the muscles of the articulatory system should be moved.
What types of evidence have been used to study production? First, researchers have analyzed transcripts of how speakers choose what to say and how to say it (Beattie, 1983). For example, Brennan and Clark (1996) found that speakers cooperate in conversation so that they come to agree on the same names for objects. Computer simulations and connectionist modeling, as in other areas of psycholinguistics, have become very influential. Much has been learned by the analysis of the distribution of hesitations or pauses in speech. Until fairly recently the most influential data were spontaneously occurring speech errors, or slips of the tongue, but in recent years experimental studies, often based on picture naming, have become important. By the end of this chapter you should:
• Know about the different types of speech error and why we make them.
• Know the difference between conceptualization, formulation, and execution.
• Understand how we plan the syntax of what we say.
• Appreciate how we retrieve words when we speak.
• Know about Garrett’s model and the interactive activation models of speech production.
• Know why we pause when we speak.
• Understand how brain damage affects language production.
• Know how we plan what we write.
SLIPS OF THE TONGUE
Until fairly recently, models of speech production were primarily based on analyses of spontaneously occurring speech errors. Casual examination of our speech will reveal (in the unlikely event that you do not know this already) that it is far from perfect, and rife with errors. Analysis of these errors is one of the oldest research topics in psycholinguistics. Speech errors are frequently commented on in everyday life. The case of the Reverend Dr. Spooner is quite commonly known; indeed, he gave his name to a particular type of
error involving the exchange of initial consonants between words, the spoonerism. Some of Reverend Spooner’s alleged spoonerisms are shown in examples (1) to (3). (See Potter, 1980, for a discussion of whether Reverend Spooner’s errors were in fact so frequent as to suggest an underlying pathology.)
(1) Utterance: You have hissed all my mystery lectures.
    Target: … missed all my history lectures.
(2) Utterance: In fact, you have tasted the whole worm.
    Target: … wasted the whole term.
(3) Utterance: The Lord is a shoving leopard to his flock.
    Target: … a loving shepherd.
Most people have heard of the Freudian slip. In part of a general treatise on action slips or errors of action called parapraxes, Freud (1901/1975) noted the occurrence of slips of the tongue, and proposed that they revealed our repressed thoughts. In one example he gives, a professor said in a lecture, “In the case of female genitals, in spite of many Versuchungen (temptations) – I beg your pardon, Versuche (experiments) …” Not all Freudian slips need arise from a repressed sexual thought. In another example he gives, the President of the Lower House of the Austrian Parliament opened a meeting with “Gentlemen, I take notice that a full quorum of members is present and herewith declare the sitting closed!” (instead of open). Freud interpreted this as revealing the President’s true thoughts, that he secretly wished a potentially troublesome meeting closed. However, Freud was not the first to study speech errors; a few years before, Meringer and Mayer (1895) provided what is now considered to be a more traditional analysis. Ellis (1980) reanalyzed Freud’s collection of speech errors in terms of a modern process-oriented account of speech production.
[Caption: Freud (1901) proposed that slips of the tongue revealed our repressed thoughts.]
The most common method of analyzing speech errors is to collect a large corpus of errors by recording as many as possible. Usually the researcher will interrupt the speaker when he or she detects the error, and ask the speaker what was the intended target, why they thought the error was made, and so on. Although this method introduces the possibility of observer bias, this appears to be surprisingly weak, if present at all. A comparison of error corpora against a smaller sample taken from a rigorous transcription of a
sample of tape-recorded conversation (Garnham, Shillcock, Brown, Mill, & Cutler, 1982) suggests that the types and proportion of errors are very similar. For example, word substitution errors and sound anticipation and substitution errors are particularly common. Furthermore, it is possible to induce slips of the tongue artificially by, for example, getting participants to read words out at speed (Baars, Motley, & MacKay, 1975). The findings from such studies corroborate the naturalistic data.
There are many different types of speech error. We can categorize them by considering the linguistic units involved in the error (for example, at the phonological feature, phoneme, syllable, morpheme, word, phrase, or sentence levels) and the error mechanism involved (such as the blend, substitution, addition, or deletion of units). Fromkin (1971/1973) argued that the existence of errors involving a particular unit shows that these units are psychologically real. Table 13.1 gives some examples of speech errors from my own corpus to illustrate these points. In any error there was the target that the speaker had in mind, and the erroneous utterance as actually produced; the erroneous part of the utterance is in italics.
TABLE 13.1 Examples of speech errors classified by unit and mechanism.
Type                     Utterance                         Target
Feature perseveration    Turn the knop                     knob
Phoneme anticipation     The mirst of May                  first
Phoneme perseveration    God rest re merry gentlemen       ye
Phoneme exchange         Do you reel feally bad?           feel really bad
Affix deletion           The chimney catch fire            catches fire
Phoneme deletion         Backgound lighting                background
Word blend               The chung of today                children / young
Word exchange            Guess whose mind came to name?    whose name came to mind
Morpheme exchange        I randomed some samply            I sampled some randomly
Word substitution        Get me a fork                     spoon
Phrase blend             Miss you a very much              very much / a great deal
What can speech errors tell us?
Let us now analyze a speech error in more detail to see what can be learned from such errors. Consider the famous example of (4) from Fromkin (1971/1973):
(4) a weekend for MANIACS → a maniac for WEEKENDS
The capital letters indicate the primary stress and the italics secondary stress. The first thing to notice is that the sentence stress was left unchanged by the error, suggesting that stress is generated independently of the particular words involved. Even more strikingly, the plural morpheme “-s” was left at the end of the second word where it was originally intended to be in the first place: it did not move with “maniac.” We say it was stranded. Furthermore, this plural morpheme was realized in sound as /z/ not as /s/. That is, the plural ending sounds consistent with the word that actually came before it, not with the word that was originally intended to come before it. (Plural endings are voiced “/z/” if the final consonant of the word to which it is attached is voiced, as in “weekend,” but are unvoiced “/s/” if
the final consonant is unvoiced, as in “maniac.”) This is an example of accommodation to the phonological environment.
Such examples tell us a great deal about speech production. Garrett’s model, described next, is based on a detailed analysis of such examples. On the other hand, Levelt et al. (1991a) argued that too much emphasis has been placed on errors, and that error analysis needs to be supported by experimental data. If these two approaches give conflicting results, we should place more emphasis on the experimental data, as the error data are only telling us about aberrant processing. There are three points that can be made in response to this. First, a complete model should be able to account for both experimental and speech error data. Second, the lines of evidence converge rather than giving conflicting results (Harley, 1993a). Third, it is possible to simulate spontaneously occurring speech errors experimentally, and these experimental simulations lead to the same conclusion as the natural errors. Using a technique they called SLIP, Baars et al. (1975) required participants to rapidly read pairs of words such as “big dog,” “blocked drain,” and then “dart board.” If participants have to read these pairs from right to left, the priming effect of the preceding pairs leads them to make many spoonerisms on “dart board.” Furthermore, participants are more likely to produce a spoonerism that yields two real words, such as “barn door” (from “darn bore”), than one that yields nonwords, such as “bart doard” (from “dart board”), an instance of the bias towards lexical outcomes also displayed in the naturalistic data. On the other hand, using the same technique, speakers are less likely to make exchanges that result in taboo words (e.g., from “hit shed”; work it out) than ones that do not. Furthermore, galvanic skin responses were elevated on these taboo trials, suggesting that speakers generated the spoonerism internally, but are in some way monitoring their output (Motley, Camden, & Baars, 1982).
We should note that we sometimes correct our speech errors, which shows that we are monitoring our speech. Sometimes we notice the error before we speak it and can prevent it from being made; sometimes we notice the error as we are speaking and can correct, or repair, it; sometimes we notice it only after we have finished speaking. Often we never notice we have made an error. The idea of a monitor plays an important role in the WEAVER++ model of speech production, discussed below.
Naming errors probably do not arise from people rushing their preparation, or, in the case of naming, from insufficient word preparation, or a failure to check names against objects. Griffin (2004) examined people’s eye movements while they described a visual scene. People tend to gaze at objects while they are preparing their names. If errors arise from rushed preparation, they should spend less time looking at an object just before naming it incorrectly (e.g., saying “hammer” when looking at an axe); however, they do not. Instead they spend just as long gazing at a referent before uttering errors as they do before uttering correct names. Indeed, if they corrected their utterance (“ham – axe”), they spent longer looking at the object after making their error, presumably because they were preparing their repair.
Garrett’s model of speech production
In an important series of papers based primarily on speech error analysis, Garrett (1975, 1976, 1980a, 1980b, 1982, 1988, 1992) argued that we produce speech through a series of discrete levels of processing. In Garrett’s model, processing is serial, in that at any one stage of processing only one thing is happening. Of course, more than one thing is happening at different processing levels, because obviously even as we speak we might be planning what we are going to say next. However, these levels of processing do not interact with one another. The model distinguishes two major stages of syntactic planning (see Figure 13.2). At the functional level, word order is not yet explicitly represented. The semantic content of words is specified and assigned to syntactic roles such as subject and object. At the positional level, words are explicitly ordered. There is a dissociation between syntactic planning and lexical retrieval. Garrett argued that content and function words play very different roles in language production.
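The stranding and accommodation seen in example (4) can be reproduced in a toy program built around the idea that content-word stems are slotted into an ordered frame, and that grammatical elements such as the plural ending are given their sound only afterwards. What follows is a rough sketch under stated assumptions and not an implementation of Garrett's model: the mini-lexicon, the frame format, and all function names are hypothetical, and the plural rule covers only the /s/ versus /z/ contrast described for example (4).

```python
# Toy sketch only: the lexicon, frame format, and function names are
# illustrative assumptions, not part of Garrett's model as published.

VOICELESS = {"p", "t", "k", "f"}

# Hypothetical mini-lexicon: content-word stem -> its final sound.
FINAL_SOUND = {"weekend": "d", "maniac": "k", "plate": "t"}

# An ordered planning frame. Plain strings are function words that belong
# to the frame itself; tuples are (slot, affix) positions awaiting a stem.
FRAME = ["a", ("noun1", None), "for", ("noun2", "PLURAL")]


def plural_ending(stem):
    """Choose the plural sound from the stem actually present in the slot."""
    return "/s/" if FINAL_SOUND[stem] in VOICELESS else "/z/"


def exchange_stems(assignment, slot_a, slot_b):
    """Simulate a slip in which two stems end up in each other's slots."""
    swapped = dict(assignment)
    swapped[slot_a], swapped[slot_b] = assignment[slot_b], assignment[slot_a]
    return swapped


def insert_stems(assignment, frame):
    """Fill the frame with stems; each affix stays attached to its slot."""
    filled = []
    for item in frame:
        if isinstance(item, str):
            filled.append((item, None))
        else:
            slot, affix = item
            filled.append((assignment[slot], affix))
    return filled


def spell_out(filled):
    """Specify grammatical elements last, after the stems are in place."""
    words = []
    for form, affix in filled:
        if affix == "PLURAL":
            words.append(f"{form}+{plural_ending(form)}")
        else:
            words.append(form)
    return " ".join(words)


intended = {"noun1": "weekend", "noun2": "maniac"}
slipped = exchange_stems(intended, "noun1", "noun2")

print(spell_out(insert_stems(intended, FRAME)))  # a weekend for maniac+/s/
print(spell_out(insert_stems(slipped, FRAME)))   # a maniac for weekend+/z/
```

Because the plural marker belongs to the frame rather than to the stem, it stays in the second noun slot when the stems swap, and because its sound is chosen only after the stems are in place, it surfaces as /z/ after "weekend". The sketch deliberately takes no position on the processing level at which such an exchange arises.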
(Remember that content words are nouns, verbs, adjectives, and adverbs, and do the semantic work of the language, while function words are a small number of words that do most of the syntactic work.) Content words are selected at the functional level, whereas function words are not selected until the positional level.
FIGURE 13.2 Garrett’s model of speech production, showing how the stages correspond to the processes of conceptualization, formulation, and articulation. Based on Garrett (1975, 1976). Stages shown: Conceptualization (message level); Formulation (functional level, positional level, sound level); Articulation (articulatory instructions).
Box 13.1 provides an example of how we generate a sentence. We start with an intention to say a particular message; here, for example, about someone doing the washing up. This happens at the message level (A). As we saw earlier, there has been surprisingly little research on the message level; often it is just shown as a cartoon-like “thoughts” bubble. We choose what we are going to say and the general way in which we are going to say it. The task of speech production is then to produce these parallel ideas one at a time: that is, they must be linearized. We go on to form an abstract semantic specification where functional relations are specified. These are then mapped onto syntactic functions. This is the functional level (B). Word exchanges occur at this stage. As only the functional roles of the words, and not their absolute positions, have been specified at this point, word exchanges are constrained by syntactic category, but much less so by the distance between the exchanging words. Next we generate a syntactic frame for the planned sentence (C). Function words are said to be immanent in this frame; they are an inherent part of it, fully implied by its description. The phonological representations of content words are then accessed from the lexicon using the semantic representation (D). These are then inserted into the syntactic planning frame where final positions are specified. This is the positional level (E). The function words and other grammatical elements are then phonologically specified to give the sound-level representation (F). Sound exchanges occur at this stage and, as their absolute position is now specified, are constrained by distance. Other tidying up might then occur, as this is translated into a series of phonological features that then drive the articulatory apparatus (G).
Box 13.1 An example of how we produce an utterance based on Garrett’s (1975, 1976) model of speech production
(A) Message level: intention to convey particular meaning activates appropriate propositions
(B) SUBJECT “mother concept,” VERB “wipe concept,” OBJECT “plate concept”
    TIME past
    NUMBER OF OBJECTS MANY
(C) (DETERMINER) N1 V [PAST] (DETERMINER) N2 [PLURAL]
(D) /mother/ /wipe/ /plate/
(E) (DETERMINER) /mother/ /wipe/ [PAST] (DETERMINER) /plate/ [PLURAL]
(F) /the/ /mother/ /wiped/ /the/ /plates/
(G) Low-level phonological processing and articulation
Evidence for Garrett’s model of speech production
In morpheme exchanges such as (4), it is clear that the root or stem morpheme (“maniac”) has been accessed independently of its plural affix, in this
case the plural ending “-s.” (In English, affixes are either prefixes, which come before a word, or
suffixes, which come after, and are always bound morphemes, in that they cannot occur without a
stem; morphemes that can be found as words by themselves are called free morphemes. Bound
morphemes can be either derivational or inflectional—see Chapter 1.) Because the bound morpheme
has been left in its original place while the free morpheme has moved, this type of exchange is
called morpheme stranding. Content words behave differently from the grammatical elements, which
include inflectional bound morphemes and function words. This suggests that they are involved in
different processing stages. In (4) the plural suffix was produced correctly for the sentence as it
was actually uttered, not as it was planned. This accommodation to the phonological environment
suggests that the phonological specification of grammatical elements occurs rather late in speech
production, at least after the phonological forms of content words have been retrieved. This
dissociation between specifying the sounds of content words and specifying the grammatical
elements is of fundamental importance in the theory of speech production, and is an issue that will
recur in our discussions of its pathology. Furthermore, in word exchange errors, the sentence
stress is left unchanged, suggesting that this is specified independently of the content words.

Error analysis suggests that when we speak we specify a syntactic plan or frame for a sentence that
consists of a series of slots into which content words are inserted. Word exchanges occur when
content words are put into the wrong slot. Grammatical elements are part of the syntactic frame,
but their detailed phonological forms must be specified late.

This model predicts that when parts of a sentence interact to produce a speech error, they must be
elements of the same processing vocabulary. That is, things only exchange if they are involved in
the same processing level. Therefore certain types of error should never be found. Garrett observed
that content words almost always only exchange with other content words, and that function words
exchange with other function words. This is an extraordinarily robust finding: In my corpus of
several thousand speech errors, there is not a single instance of a content word exchanging with a
function word. This supports the idea that content and function words are from computationally
distinct vocabularies that are processed at different levels.

There are also different constraints on word and sound exchange errors. Sounds only exchange across
small distances, whereas words can exchange across phrases; words that exchange tend to come from
the same syntactic class, whereas this is not a consideration in sound errors, which swap with
words regardless of their syntactic class. In summary, word exchange errors involve content words
and are constrained by syntactic factors; sound errors are constrained by distance.

Evaluation of Garrett’s model

Garrett’s model accounts for a great deal of the speech error evidence, but a number of findings
subsequently have suggested that some aspects of it might not be correct. First, it is not at all
clear that speech production is a serial process. There is clearly some evidence for at least local
parallel processing in that we find word blend errors, which must be explained by two (or more)
words being simultaneously retrieved from the lexicon, as in (5) for example. More problematically,
we find blends of phrases and sentences, such as in (6). Furthermore, the locus of these blends is
determined phonologically (Butterworth, 1982), so that the two phrases cross over where they sound
most alike. This suggests that two alternative messages are processed in parallel from the message
to the phonological levels.

We also observe two types of cognitive intrusion errors where material extraneous to the utterance
being produced intrudes into it. The message level can intrude into the utterance and lower levels
of processing, producing errors called non-plan-internal errors, such as in (7). These errors are
often phonologically facilitated. Phonological facilitation means that errors are more likely to
occur if the target word and intrusion sound alike.
We find that targets and intrusions in non-plan-internal errors sound more alike than would be
expected by chance alone, although special care is necessary to determine what the intended
utterance was (Harley, 1984).

(5) Utterance: It’s difficult to valify. (Targets: validate + verify)
(6) Target 1: I’m making some tea.
    Target 2: I’m putting the kettle on.
    Utterance: I’m making the kettle on.
(7) Target: I’ve read all my library books.
    Utterance: I’ve eaten all my library books.
    Context: The speaker reported that he was hungry and was thinking of getting something to eat.

The names of objects or words in the outside environment can also intrude into speech, producing
environmental contamination. Consider (8) from Harley (1984). The intruding item (“Clark’s”) sounds
similar to the target. Again, we find that phonological facilitation of these intrusions occurs
more often than one would expect on a chance basis (although to a lesser degree than with other
cognitive intrusions).

(8) Target: Get out of the car.
    Utterance: Get out of the clark.
    Context: The speaker was looking at a shopfront in the background that had the name “Clark’s”
    printed on it. The speaker reported that he was not aware of this at the time of speaking.

These cognitive intrusions clearly have a high-level or message-level source. Hence speech
production can involve parallel processing, with high-level processes constrained by low-level
processes such as phonological similarity.

Word substitution speech errors are also constrained by the similarity of the target and the
intrusion, and tend to result in familiar outcomes; the results are discussed in more detail later.
Bock (1982; see later) found that the availability of words affects syntactic planning, further
suggesting that levels of processing interact in syntactic planning. These findings all suggest
that the levels of processing cannot be independent of one another but must interact. These data
drive the interactive models of lexicalization described later.

A final problem, about which little can be done, is that the distinction between content and
function words is confounded with frequency (Stemberger, 1985), in that function words include some
of the most common words of the language (for example, “the,” “a”). Processing differences may
reflect this, rather than their being processed by different systems. However, the observation that
bound morphemes behave like function words supports Garrett’s hypothesis, as does
neuropsychological data, discussed later.

SYNTACTIC PLANNING

Garrett’s model tells us a great deal about the relative stages of syntactic planning, but says
little about the syntactic processes themselves. Bock and her colleagues examined these in an
elegant series of experiments based on a technique of seeing whether participants can be biased to
produce particular constructions. An important finding is that word order in speech is determined
by a number of factors that interact (Bock, 1982). For example, animate nouns tend to be the
subjects of transitive sentences (McDonald, Bock, & Kelly, 1993), and conceptually more accessible
items (e.g., as measured by concreteness) tend to be placed early in sentences (Bock, 1987; Bock &
Warren, 1985; Kelly, Bock, & Keil, 1986). In general, these experiments show that the grammatical
role assignment component of syntactic planning is controlled by semantic-conceptual factors rather
than by properties of words such as word lengths. Speakers also construct sentences so that they
provide “given” before “new” information (Bock & Irwin, 1980). Generally, ease of lexical access
can affect syntactic planning.

Studies of eye movements in the visual world paradigm (see also Chapters 10 and 14) tell us
something about how people formulate descriptions of visual scenes. Speakers gaze at referents in
the visual scene as they prepare words to refer to them (Griffin, 2001; Meyer,
Sleiderink, & Levelt, 1998). They also gaze at the referents of direct-object nouns while producing
the subject; if they are uncertain which argument to produce immediately after the verb, their gaze
moves between the alternative referents (Griffin & Bock, 2000). Gaze is a reliable indicator of
what and when people are thinking and planning. Indeed, as is often said, the eyes can give us
away; speakers will look at the intended referent of an object even if they are preparing to “lie”
by giving an intentionally inaccurate label for it (Griffin & Oppenheimer, 2006).

Syntactic priming

We reuse words and sentence structures within conversation (Schenkein, 1980). The repetition of
syntactic structure is called structural priming or syntactic persistence (Bock, 1986). Structural
priming suggests that we can separate meaning and form, because we can prime sentence structures
independently of sentence meaning.

Syntactic persistence is one aspect of the more general phenomenon of syntactic priming, whereby
processing of a particular syntactic structure influences processing of subsequently presented
sentences. Syntactic priming is wholly facilitatory, and has been observed in comprehension, in
production, and bidirectionally between comprehension and production (Branigan, Pickering,
Liversedge, Stewart, & Urbach, 1995). One common method used to study syntactic priming is to get
participants to repeat a prime sentence that contains the syntactic structure of interest, and then
to describe a picture. Syntactic priming studies show that speakers use a particular word order if
the prime sentence used that order (Bock, 1986, 1989; Bock & Loebell, 1990; Branigan et al., 1995;
Hartsuiker, Kolk, & Huiskamp, 1999). Suppose we have to describe a picture of a vampire handing a
hat to a ghost. A preceding prepositional-object structure prime, such as (9), steers us towards
producing a prepositional-object construction in our description: for example, we might say (11);
while a double-object prime, such as (10), steers us towards producing a double-object construction
such as (12):

(9) The ghoul sold a vacuum cleaner to a werewolf.
(10) The ghoul sold a werewolf a vacuum cleaner.
(11) The vampire handed a hat to the ghost.
(12) The vampire handed the ghost a hat.

Importantly, syntactic priming does not depend on superficial similarities between the prime and
utterance. It does not depend on reusing words (lexical priming) or on repeating thematic roles,
but instead reflects the more general construction of syntactic constituent structures. Similarly,
the magnitude of the priming effect shown by verbs does not depend on the tense, number, or aspect
of the verb (Pickering & Branigan, 1998). For example, a prime sentence such as (13) was just as
effective as the prime sentence (14) in eliciting a prepositional-object construction involving
the word “to” (Bock, 1989). Put more generally, prepositional-object sentences prime descriptions
to use prepositional-object constructions regardless of the preposition (e.g., “to” and “for”) used
in the prime sentences. However, repeating the verb (regardless of tense, aspect, or number) does
enhance priming, an effect Pickering and Branigan call the lexical boost. The lexical boost is
important because it suggests that the verb has a special role in production. Priming is also
enhanced by the repetition of word order between prime and target (Hartsuiker & Westenberg, 2000;
Pickering, Branigan, & McLean, 2002). In summary, we can prime abstract syntactic structures, but
the magnitude of the priming effect is greater if we repeat word order and the verb. Indeed, a verb
prime alone may be sufficient to bias speakers’ subsequent productions (Melinger & Dobel, 2005).

(13) The werewolf baked a cake for the witch.
(14) The werewolf took a cake to the witch.

Along similar lines, Bock and Loebell (1990) showed that only sentences like (15) produce priming
of the prepositional-object description (17). A construction such as (16) does not, even though it
is superficially very similar to (15). It has similar words (most noticeably, it contains the
word “to”) and has a similar stress pattern. However, it has a very different syntactic structure
(“a book to study” is a noun phrase, not a prepositional-object phrase). Hence it is the underlying
syntactic structure that is important in obtaining syntactic priming, not the surface form of the
words. Syntactic priming has been demonstrated for a variety of syntactic structures.

(15) Vlad brought a book to Boris.
(16) Vlad brought a book to study.
(17) The witch is handing a paintbrush to the ghost.

Syntactic persistence can continue for quite some time. Bock and Griffin (2000) found that the
structural priming could persist over as long as 10 intervening sentences (although the priming
effect can be short-lived—Levelt & Kelter, 1982). Such persistence suggests that the priming is due
to more than short-term memory, and may have some long-term learning component.

Speakers also tend to reuse the syntactic constructions of other speakers (Branigan, Pickering, &
Cleland, 2000). For example, speakers will use a complex noun phrase (e.g., “the square that’s
red”) more often after hearing a syntactically similar noun phrase than a simple one (“the red
square”), and are particularly likely to do so if the main noun (“square”) is repeated (Cleland &
Pickering, 2003). We find this priming effect on noun-phrase structure if the prime and target noun
are semantically related (“sheep” and “goat”), but not if they are phonologically related (e.g.,
“sheep” and “ship”), suggesting that while syntactic encoding is unsurprisingly affected by the
semantic representation, it is not affected by feedback from the phonological representation
(Cleland & Pickering, 2003).

Syntactic priming does more than just influence descriptions. Potter and Lombardi (1998) showed
that immediate recall can be affected by syntactic persistence. In their experiment, participants
silently read words presented one at a time and at a fast rate on a computer screen. They then
performed another distractor task before being asked to repeat the sentence out aloud. This task is
quite difficult, and speakers sometimes changed the syntactic structure of what they had just read.
In particular, people tend to reuse previous syntactic structures: that is, they recalled the
sentence just presented with the syntactic structure of a previous item. So syntactic priming
influences our memory, too. It can also lead us to produce ungrammatical utterances, when we are
erroneously influenced by a structure we have just heard (Ivanova, Pickering, McLean, Costa, &
Branigan, 2012).

It is also possible to prime the productions of patients who have an impairment of syntactic
planning in speech production, although not all types of sentence structure are primed as easily as
others (Hartsuiker & Kolk, 1998; Saffran & Martin, 1997). The number of passives (e.g., “the cat
was chased by the dog”) was increased by passive primes, but the production of dative constructions
(e.g., “give the food to the dog”) showed no immediate increase after the primes. Some of the newly
generated constructions were morphologically deviant, suggesting that although phrase structure and
closed-class elements are normally closely linked in production (as in Garrett’s model), they can
be separated.

At first sight, the way in which syntactic frames can be primed independently of meaning points to
a separation of meaning and form. Greater overlap in meaning does not generally lead to a larger
amount of priming; in most cases all that matters is the overlap in surface syntax. This finding
suggests that sentence frames are independent syntactic representations, and in particular that
they have some existence independent of the meaning of what they encode. It also points to a
probabilistic element in syntactic planning, where the precise form of the words we choose is
affected by environmental factors such as what we have just heard. Chang, Dell, and Bock (2006)
describe a connectionist model of sentence production that can account for the structural priming
data. In their model, sequencing in production makes use of two types of information. A sequencing
system uses a recurrent connectionist model that uses statistical information to predict what is
coming next. However, the model also makes use of semantic information about events and the
message to be produced.
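The idea of a probabilistic choice that is nudged by what we have just heard can be made concrete
with a toy sketch. It is not the recurrent network of Chang, Dell, and Bock (2006); it is a far
simpler stand-in which assumes that a single adjustable bias decides between the
prepositional-object (PO) and double-object (DO) constructions, and that each prime leaves a small
change that is stored rather than left to decay. The class name, the learning rate, and the
starting bias are all hypothetical choices made for illustration.

```python
import math


class StructureChooser:
    """Toy speaker choosing between two dative constructions.

    Hypothetical illustration only: one bias value stands in for
    everything a real model of sentence production would learn.
    """

    def __init__(self, bias=0.0, learning_rate=0.2):
        # bias > 0 favours the prepositional-object (PO) form,
        # bias < 0 favours the double-object (DO) form.
        self.bias = bias
        self.learning_rate = learning_rate

    def p_prepositional_object(self):
        """Probability of producing a PO sentence (logistic of the bias)."""
        return 1.0 / (1.0 + math.exp(-self.bias))

    def hear(self, structure):
        """Process one sentence; a PO or DO prime leaves a lasting change."""
        if structure == "PO":
            self.bias += self.learning_rate
        elif structure == "DO":
            self.bias -= self.learning_rate
        # Any other structure (a filler sentence) leaves the bias untouched.


speaker = StructureChooser()
baseline = speaker.p_prepositional_object()

speaker.hear("PO")  # a PO prime such as sentence (9)
after_prime = speaker.p_prepositional_object()

for _ in range(10):  # ten unrelated filler sentences
    speaker.hear("FILLER")
after_fillers = speaker.p_prepositional_object()
```

Because the change is stored, the raised preference for the PO form survives the ten fillers, which
is the kind of persistence reported by Bock and Griffin (2000). The model of Chang et al. goes well
beyond this sketch, in particular by combining the sequencing system with semantic information
about the message.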
The model has two advantages. First, there are some recent data that suggest that meaning can have
some effect on priming. Chang, Bock, and Goldberg (2003) found that similar thematic roles can
cause priming even when the surface syntax is held constant (e.g., “The man sprayed water on the
wall” has the theme (water) before the location (wall), and “The man sprayed the wall with water”
has the location before the theme; but both sentences have the same surface structure of
NP–V–NP–PP). Chang et al.’s model can account for this result because of the meaning-based route.

Syntactic priming probably serves two main functions. First, it enables speakers in a conversation
to coordinate or align information. Using the same words and syntax helps conversants to
collaborate more efficiently. Second, it results from implicit learning of how people use syntax to
convey meaning—people unconsciously adjust how they convey information on the basis of experience.
The finding that syntactic priming can be persistent over surprisingly long periods of time is
consistent with the idea that it results from learning rather than just reflecting transient
activation of syntactic structures.

Coping with dependencies

How do we cope with dependencies between words? One particular problem facing speakers is ensuring
number agreement between subjects and verbs. For instance, we must ensure that we say “the woman
does” and “the men do,” and not “the woman do” or “the men does.” We do not always get agreement
right; number agreement errors are fairly common in speech. We particularly have a tendency to make
attraction errors such as (18), where we make the verb (here “were” instead of “was”) agree with a
noun (“unions”) that is closer to the verb than the subject (“membership”) with which it should
agree (Bock & Eberhard, 1993).

(18) Membership in these unions were voluntary.

In an important series of experiments, Bock and her colleagues used a sentence-completion task
designed to elicit agreement errors (e.g., Bock & Cutting, 1992; Bock & Eberhard, 1993; Bock &
Miller, 1991). These experiments look at what type of factors cause number agreement errors.
Consider the sentence fragments (19)–(21) from Bock and Eberhard (1993):

(19) The player on the court –
(20) The player on the courts –
(21) The player on the course –

A suitable continuation for this might be “was very good.” A continuation containing an agreement
error might be “were very good.” Which of these fragments causes agreement errors? Sentence (19) is
very straightforward; both nouns are singular. As we might expect, this type of fragment produces
no agreement errors. In (20) the noun closest to the verb is plural, while the noun that should
determine number (“player”) is singular. In this condition we observe many errors. What about (21)?
Although the local noun (“course”) is singular, it is a pseudoplural, because the end of the word is
an /s/ sound. (Remember that regular plurals in English are formed by adding an -s to the end of the
singular form of the noun.) So if the plural sound alone were important in determining agreement,
we would expect sentences like (21) to generate many agreement errors. In fact, they generate none.
Hence agreement cannot be determined by the sound of surrounding words (in particular, whether they
sound as though they have plural endings) but by something more fundamental. Further evidence for
this is that regular (“boys”) and irregular (“men”) versions of nouns cause equal numbers of
agreement errors, as do individual (“ship”) and collective (e.g., “audience,” “fleet,” and “herd”).
At first sight what seems to be important in determining number agreement is only the syntactic
number of the nouns, suggesting that syntactic planning is modular.

More recent work has challenged this idea that syntactic processing is feedforward and modular.
Distributive noun phrases, such as “the label on the bottles,” where the semantics of the phrase
implies the existence of multiple labels, leads speakers in several languages to produce plural
verbs (Eberhard, 1999; Vigliocco, Butterworth, & Garrett, 1996). It now seems likely that whether
or not we find semantic effects on verb agreement depends on subtle factors such as the precise
materials we use in the experiments. Haskell and MacDonald (2003) showed that number agreement can
be accounted for in terms of constraint satisfaction. This approach is similar to that in language
comprehension, and makes use of the constraint-satisfaction idea that several sources of
information interact to determine output. If the different sources of information conflict, then
processing time increases. If one of the sources strongly predicts singular or plural, then
additional weak factors have little additional cost, but if the sources of information are
approximately equal, then competition is maximal and the cost to processing time greatest (Haskell
& MacDonald, 2003). For example, ordinary singular nouns (e.g., horse, ship) are very good
predictors that a singular verb is necessary, and produce little competition. Collective nouns
(e.g., family, fleet, team) share characteristics of both singulars and plurals. Although they
should strictly generate singular verbs, their plural characteristics induce some competition
between plural and singular verb forms, leading to longer processing times and more variability in
output.

Similar experimental methods also show that number agreement takes place within the clause (Bock &
Cutting, 1992). Analysis of number agreement also provides further evidence that syntactic
structure is generated before words are assigned to their final positions. Vigliocco and Nicol
(1998) note that grammatical encoding has three functions: assigning grammatical functions (e.g.,
assigning the agent of an action to the subject of the sentence), building syntactic hierarchical
constituent structures to reflect this (e.g., turning the subject into a NP), and arranging the
constituents in linear order. We have seen that speech error data clearly separate the first and
third functions (that is, the functional and positional stages of Garrett’s model), but can we
distinguish building abstract hierarchical structures from the final serial ordering of words?
Vigliocco and Nicol argued that we can. They showed that number agreement errors do not so much
depend on the surface or linear proximity of the local noun to the verb, as on its proximity in the
underlying syntactic structure. In one experiment participants had to generate sentences from a
sentence beginning and an adjective, e.g., (22). A correct continuation would be (23), and one with
an agreement error (24):

(22) The helicopter for the flights + safe.
(23) The helicopter for the flights is safe.
(24) The helicopter for the flights are safe.

In a second experiment participants had to generate questions from (22), such as (25):

(25) Is the helicopter for the flights safe?
(26) Are the helicopter for the flights safe?

Participants made about the same number of agreement errors as in the first experiment, e.g., (26),
even though here the “local noun” (“flights”) is much farther away in terms of the number of
intervening words. This is because, according to linguistic theory, the declarative sentence (23)
and the question (25) have the same underlying syntactic structure.

According to Bock, Eberhard, and Cutting (2004) we need two processes to ensure that number
agreement proceeds smoothly. First, we need a specification that takes into account the number of
things we are talking about in the message. Bock et al. call this processing marking. For example,
if we are talking about one helicopter, then the verb is marked as singular. Now suppose we are
talking about one pair of scissors. With regard to the message content, the verb will be marked as
singular. But we treat “scissors” as a plural noun, even if we are only talking about one of them.
We say “the scissors are,” never “the scissor is.” Hence we need to override the syntactic process
of marking with a process that takes account of the morphology of the subject. This second process
is called morphing. This overriding process can lead to attraction errors, where the verb
erroneously comes to agree with the number of a neighboring noun phrase that is not in fact that
verb’s controller, as in (27). Pronouns are more vulnerable to the number of their controllers,
leading to agreement errors such as (28). This difference suggests that
number agreement might involve different processes for pronouns and verbs (Eberhard, Cutting, &
Bock, 2005). Verbs are particularly controlled by the grammatical number—a syntactic process, while
pronouns are controlled by what is called notional number—the speaker’s initial, fleeting
perspective on the number of things involved, and which involves lexical processes (e.g., our first
impression of the word “fleet” is that it is plural). Eberhard et al. provide a detailed model of
marking and morphing in number agreement that accounts for a wide range of data.

(27) The time to find the scissors are now.
(28) The key to the cabinets disappeared. They were never found again.

Is syntactic planning incremental?

Word exchange speech errors suggest that the broad syntactic content is sketched out in
clause-sized chunks. This idea is supported by picture–word interference studies that suggest that
before we start uttering phrases and short sentences containing two names, we select the nouns
(technically, we select the lemma—see later) and the sound form of the first noun. Meyer (1996)
presented participants with pictures of pairs of objects that they then had to name (“the arrow and
the bag”), or place in short sentences (“the arrow is next to the bag”). At the same time, the
participants heard an auditory distractor that could be related in meaning or sound to the first or
second noun, or to both. She found that the time it took participants to initiate speaking was
longer when the distractor was semantically related to either the first or the second noun, but the
phonological distractor only had an effect (by facilitating initiation) when it was related to the
first noun. This pattern of results suggests that we prepare the meaning of short phrases and
select the appropriate words before we start speaking, but only retrieve the sound of the first
word. (This finding is also evidence that lexical access takes place in speech production in two
stages; see later.)

Schriefers, Teruel, and Meinshausen (1998) used a picture–word interference technique to show that
the detailed selection of a verb is not an obligatory component of advance planning—even though the
verb clearly must play a central role in syntactic planning. They showed that semantic interference
between the verb and a distractor was only obtained for verbs at the very beginning of German
sentences. Therefore, in sentence-final positions it could not have been retrieved by the time the
participants started speaking.

Smith and Wheeldon (1999) had participants describe moving pictures. They found longer onset
latencies for single clause sentences beginning with a complex noun phrase (e.g., “the dog and the
kite move above the house”) than for similar sentences beginning with a simple phrase (e.g., “the
dog moves above the kite and the house”). Participants also take longer to initiate double clause
sentences (e.g., “the dog and the foot move up and the kite moves down”) than single clause
sentences. These results suggest that people do not plan the entire syntactic structure of complex
sentences in advance. They suggest that when people start speaking they have completed lemma access
for the first phrase of an utterance, and started but not completed processing the remainder.

Schnur, Costa, and Caramazza (2006) used a picture–word interference design to examine how far we
plan ahead. Participants produced sentences while ignoring words that were phonologically related
or unrelated to the verb of the sentence. Schnur et al. found that the time to begin producing the
sentence was faster in the presence of the phonologically related distractor, even if the sentence
the speaker was producing was relatively long. These results suggest that phonological planning
extends some way ahead, and can in some circumstances (if the verb is primed) cross phrase
boundaries.

On the other hand, there is a great deal of evidence that suggests that syntactic planning is
incremental—that is, we make it up as we go along. Ferreira (1996) found that speakers find
production easier when they have more syntactic options available to continue what they are saying,
presumably because they can be flexible and pick the most suitable or available continuation one at
any time. If we make up a detailed plan before we start speaking, the number of options shouldn’t
matter, or might even get in the way, as we choose between them.
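The logic of this contrast can be illustrated with a small simulation. It assumes, purely for
illustration, that each syntactic option becomes available after an independent random delay, that
an incremental speaker starts with whichever option is ready first, and that a speaker who weighs
every option before starting has to wait for the slowest one. The exponential delays, the function
names, and the number of trials are hypothetical choices, not details of Ferreira's (1996) study.

```python
import random
from statistics import mean


def retrieval_delays(n_options, rng):
    """Hypothetical delays (arbitrary units) before each option is ready."""
    return [rng.expovariate(1.0) for _ in range(n_options)]


def incremental_latency(delays):
    """Start speaking with whichever continuation is available first."""
    return min(delays)


def plan_all_latency(delays):
    """Assumed alternative: wait until every option has been considered."""
    return max(delays)


def mean_latency(strategy, n_options, trials=20000, seed=1):
    """Average time to start speaking over many simulated utterances."""
    rng = random.Random(seed)  # fixed seed so the runs are repeatable
    return mean(strategy(retrieval_delays(n_options, rng)) for _ in range(trials))


one_option_incremental = mean_latency(incremental_latency, 1)
two_options_incremental = mean_latency(incremental_latency, 2)
one_option_plan_all = mean_latency(plan_all_latency, 1)
two_options_plan_all = mean_latency(plan_all_latency, 2)
```

Under these assumptions a second option shortens the average latency of the incremental speaker
and lengthens that of the speaker who considers everything first, which is the direction of the
argument above.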
Ferreira and Swets (2002) also found evidence for incremental planning. They had speakers answer
arithmetic sums of differing difficulty in different sorts of syntactic construction (e.g.,
complete “the answer to 49 plus 73 is . . .”). When speakers were encouraged to speak and plan
simultaneously—that is, incrementally—by trying to beat a deadline, both latency to begin speaking
and utterance duration were affected by the difficulty of the problem. The more difficult the
problem, the longer people took to produce the sentence, suggesting that they did not know the
answer—and therefore what they were going to say—before they started speaking.

Why the discrepancy in results? One explanation is that evidence of detailed advance planning comes
from the study of either phrases or very short, simple sentences. Perhaps these are dealt with
differently from more complex constructions. Another explanation is that the verb in the
experiments suggesting that there is considerable advance planning is a simple linking verb (“is”).
Or perhaps the demands of the task affect how much participants plan in detail before they start
speaking. Speech production probably involves both preparation and planning ahead and incremental
planning; which wins the day depends on the particular circumstances of the utterance.

How does this incremental planning relate to semantic and syntactic processing? Solomon and
Pearlmutter (2004) contrast two approaches to planning production and coordinating multiple
phrases, serial and parallel. They argue that serial systems must rely on memory to shift
representations in and out of memory. Memory-shifting should be easier for phrases where the
constituents are tightly integrated, with the consequence that there should be fewer errors in such
phrases. Parallel systems rely on the parallel activation of multiple representations
simultaneously maintained in memory. Parallel activation means that more integrated phrases will be
processed together and will be active simultaneously, leading to interference, with the consequence
that there should be more errors in tightly integrated phrases. Solomon and Pearlmutter used a
sentence-completion task comparing sentences such as “The drawing of the flower” (where the two
nouns are tightly integrated) with sentences such as “The drawing with the flower” (where the two
nouns are less closely integrated semantically). More errors were made in the completions in the
“of” condition, where the components were tightly integrated, supporting the parallel model. Hence
when we speak we maintain multiple components of the sentence in memory; we plan and speak
simultaneously; and we make it up as we go along, rather than planning one chunk at a time and only
producing it when planning is complete.

Producing morphologically complex words

You will remember from Chapter 2 that words can be morphologically modified in two ways: We can
derive new words from existing ones (e.g., forming “entertainment” from “entertain”), and we can
inflect words to change noun number or verb tense (e.g., “mouse/mice,” “run/ran”). The new part of
the word (e.g., “-ment”) is called an affix. Speech errors cast some light on how affixes are
represented in speech production. We find errors where stems of lexical items can become separated
from their affixes (e.g., the morpheme stranding errors discussed earlier). Affixes are also
sometimes added incorrectly, anticipated, or deleted. Indeed, Garrett’s speech production model
rests on a dissociation between content words and grammatical elements that are accessed at
different times. The neuropsychological evidence from affix loss in Broca’s-type disorders, and
affix addition to neologisms in jargon aphasia (described later), also suggests that affixes are
added to stems. But how?

You will remember that while most inflections are regular (we form the plural by adding “s” to the
end of the noun, and the past tense by adding -ed to the verb), some (usually common) words are
formed in an irregular way (e.g., mice, sheep, ran, did). How do we produce these irregular forms?
One plausible model is that we know a rule for producing the regular versions, and learn by rote a
list of exceptions for dealing with the irregular ones, stored in our lexicon. Evidence for this
dual-mechanism model comes
from the observation that while we are happy to form English compound nouns with either singular
or plural irregular modifying nouns (both “mouse-eating” and “mice-eating” sound acceptable to
us), we only form compound nouns with singular regular nouns (hence “a rat-eating man” sounds
acceptable, but “a rats-eating man” does not). It seems that inflected forms generated by a rule
cannot be used as a modifier in a compound noun. How do we come to know what is acceptable and
what is not? One possibility is that the child has some innate knowledge of grammar (Pinker,
1999).

There is also neuropsychological evidence for a dual-mechanism model. Ullman et al. (1997)
reported a double dissociation between performance on sentence completion and reading on words
with regular and irregular past tenses. Patients with what is called fluent aphasia (described
in more detail below, but arising from damage to the rear of the left hemisphere) were better at
producing the past tense of regular verbs, whereas patients with non-fluent aphasia (arising
from damage to the more frontal regions of the left hemisphere) were better at producing
irregular past tenses. One explanation for this result is that we make use of a rule-based
mechanism for generating regular forms, and this mechanism is located in the front of the left
hemisphere (and left intact in fluent aphasia), and a lexicon for storing irregular verbs,
located in more posterior regions (and left intact in non-fluent aphasia).

There is an alternative explanation for this double dissociation, which is that regular and
irregular verbs are processed by the same system, but the processing of regular verbs depends
more on phonological information, while the processing of irregular verbs depends more on
semantics (McClelland & Patterson, 2002). Regular past verbs tend to be more phonologically
complicated and less distinct than irregular ones—they tend to be longer, for example, and sound
and look more like their associated stems. When we control for phonological complexity, the
relative disadvantage shown by non-fluent aphasic patients on regular past tenses disappears
(Bird, Lambon Ralph, Seidenberg, McClelland, & Patterson, 2003). The access of regular words,
because of their greater phonological complexity, is more affected in non-fluent patients (with
damage in Broca’s area), who have a central phonological deficit (see Chapter 7). These
non-fluent patients also showed deficits on other phonological tasks, such as making judgments
about whether words rhyme, and segmenting words. On the other hand, damage to the semantic
system leads to more difficulty with irregular verbs, where phonology receives support from the
semantic system (Joanisse & Seidenberg, 1999). Patient AW is problematic for this account. While
having a selective deficit in producing irregular forms of verbs, he performed perfectly on a
range of tasks involving semantics (Miozzo, 2003).

Haskell, MacDonald, and Seidenberg (2003) tackled the observations on the acceptability of noun
modifiers. One problem for the dual-mechanism account is that there are many exceptions to the
central observation (we have “awards ceremony” and “sports announcer”). Why should some
exceptions be acceptable? Haskell et al. proposed that acceptability is decided by a
multiple-constraint satisfaction process, where semantic, phonological, and other factors come
together to decide acceptability. These processes are acquired by children through
general-purpose learning algorithms. There is no need, they argued, for two different innately
specified mechanisms.

Evaluation of work on syntactic planning

In recent years there has been a notable increase in the amount of research examining syntactic
planning. This has largely been due to the evolution of new experimental techniques,
particularly syntactic priming, scene description, and sentence completion. Although much
remains to be done, we now know a considerable amount about how we translate thoughts into
sentences. In particular, it is clear that there is a syntactic module used in production that
generates syntactic structures that are, to some extent at least, independent of the meaning of
what they convey. It is also clear that there is a probabilistic aspect to production. Syntactic
planning is quite inertial, and tends to reuse whatever is easily available.
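
The dual-mechanism account of inflection discussed under "Producing morphologically complex words" (a rote-learned list of irregular exceptions plus a default rule for regular forms) can be illustrated with a minimal sketch. The function name, the three-entry exception table, and the simplified "-ed" spelling rule are assumptions made for illustration; they are not taken from any of the models cited above.

```python
# Minimal sketch of a dual-mechanism past-tense generator.
# Assumption: IRREGULAR_PAST is a tiny hand-picked sample of stored exceptions,
# and the spelling rule for regular verbs is deliberately simplified.
IRREGULAR_PAST = {"run": "ran", "do": "did", "go": "went"}


def past_tense(verb: str) -> str:
    """Check the stored exceptions first; otherwise apply the regular rule."""
    if verb in IRREGULAR_PAST:      # route 1: rote-learned lexical entry
        return IRREGULAR_PAST[verb]
    if verb.endswith("e"):          # route 2: default rule, add -(e)d
        return verb + "d"
    return verb + "ed"


print(past_tense("run"))    # ran
print(past_tense("walk"))   # walked
print(past_tense("bake"))   # baked
```

A single-mechanism account of the kind proposed by McClelland and Patterson would replace both routes with one learned mapping that draws on phonology and semantics; the sketch makes no attempt to model that alternative.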
LEXICALIZATION

Lexicalization is the process in speech production whereby we turn the thoughts underlying words
into sounds: We translate a semantic representation (the meaning) of a content word into its
phonological representation of form (its sound). There are three main questions to answer here.
First, how many steps or stages are involved? Second, what is the time course of the processes
involved? Third, are these stages independent, or do they interact with one another?

How many stages are there in lexicalization?

There is widespread agreement that lexicalization is a two-stage process, with the first stage
being meaning-based, and the second phonologically based. When we produce a word, we first go
from the semantic level to an intermediate level of individual words. Choosing the word is
called lexical selection. We then retrieve the phonological forms of these words in a stage of
phonological encoding.

Although there is consensus about these two stages, there is disagreement about what happens at
the level of lexical representation (Rapp & Goldrick, 2000). All theories assume there is at
least one stage of lexical representation in production where there are units that correspond to
words, but there is disagreement about the nature and functions of this representation.
According to the best known lemma theory (e.g., Levelt, 1989), each word is represented by a
lemma. Lemmas are specified syntactically and semantically but not phonologically. The stage of
specifying in a pre-phonological, abstract way the word that we are just about to say is called
lemma selection; the second stage of specifying the actual concrete phonological form of the
word is called lexeme or phonological form selection (see Figure 13.3). Hence in the lemma
account there are two layers of lexical representation. Lemmas are amodal—that is, the level of
representation mediating semantics and phonology takes no account of modality. A consequence of
their syntactic specification is that access to lexical syntax must occur before access to the
phonological form.

While there is a great deal of evidence supporting the general two-stage hypothesis, the
evidence for the existence of lemmas is more debatable.

FIGURE 13.3 Two-stage model of lexicalization. Conceptual representation → Lemma → Phonological
word form.

Evidence from speech errors

Fay and Cutler (1977) presented one of the earliest models of how we produce words and why we
make word substitutions. They observed that there were two distinct types of whole word
substitution speech error: semantic substitutions, such as examples (29) and (30), and
form-based substitutions, such as examples (31) and (32). Form-based word substitutions are
sometimes called phonologically related word substitution errors or malapropisms. (The word
“malapropism” originally came from a character called Mrs. Malaprop in Sheridan’s play The
Rivals, who was always using words incorrectly, such as saying “reprehend” for “apprehend” and
“epitaphs” for “epithets.” Note that while Mrs. Malaprop produced these substitutions out of
ignorance, the term is used slightly confusingly in psycholinguistics to refer to errors where
the speaker knows perfectly well what the target should be.)

(29) fingers → toes
(30) husband → wife
(31) equivalent → equivocal
(32) historical → hysterical

Fay and Cutler argued that the occurrence of these two types of word substitution suggests that
the processes of word production and comprehension use the same lexicon, but in opposite
directions. Items in the lexicon are arranged phonologically for recognition, so
that words that sound similar are close together. The lexicon is accessed in production by
traversing a semantic network or decision tree (see Figure 13.4). Semantic errors occur when
traversing the decision tree, and phonological errors occur when the final phonological form is
selected. As we shall see in Chapter 15, the argument that there is a single lexicon for
comprehension and production is very controversial. If this is not the case, then some other
mechanism will be necessary to account for the existence of malapropisms. The important idea of
Fay and Cutler’s model is that phonological and semantic word substitutions happen as a result
of mistakes in different parts of the word retrieval process.

Butterworth (1982) formulated word retrieval explicitly in terms of a two-stage process. In
Butterworth’s model an entry in a semantic lexicon is first selected, which gives a pointer to
an entry in a separate phonological lexicon. In general, in the two-stage model semantic and
phonological substitutions occur at different levels. The Fay and Cutler (1977) model predicts
that semantic and phonological processes should be independent.

Word substitution errors, while supporting the two-stage model in general, say nothing about the
existence of amodal, syntactically specified lemmas.

FIGURE 13.4 An example of a search-based single lexicon model. Based on Fay and Cutler (1977).
[Decision tree of yes/no questions (“object?”, “man-made object?”, “musical instrument?”,
“stringed?”, “vehicle?”) leading to the phonological forms /ukelele/, /eucalyptus/, /trumpet/,
and /truck/.]

Experimental evidence

The earliest experimental evidence for the division of lexical access into two stages came from
studies of the description of simple scenes (Kempen & Huijbers, 1983). They analyzed the time
people take before they start speaking when describing these scenes, and argued that people do
not start speaking until the content to be expressed has been fully identified. The selection of
several lemmas for a multiword sentence can take place simultaneously. We cannot produce the
first word of an utterance until we have accessed all the lemmas (at least for these short
utterances) and at least the first phonological word form. Individual word difficulty affects
only word form retrieval times.

Further experimental evidence for two stages in lexicalization comes from Wheeldon and Monsell’s
(1992) investigation of repetition priming in lexicalization. Like repetition priming in visual
word recognition, this effect lasts a long time, spanning over 100 intervening naming trials.
Wheeldon and Monsell showed that naming a picture is facilitated by recently having produced the
name in giving a definition or reading aloud. Prior production of a homophone (e.g., “weight”
for “wait”) is not an effective prime, so the source of the facilitation cannot be
phonologically mediated. Instead, it must be semantic or lemma-based. Evidence from speeded
picture naming suggests that repetition priming arises from residual activation in the
connections between semantics and lemmas (Vitkovitch & Humphreys, 1991).

Monsell, Matthews, and Miller (1992) looked at this effect in Welsh–English bilinguals. There
was facilitation within a language, but not across (as long as the phonological forms of the
words differed). Taken together the experiments show that both the meaning and the phonological
forms have to be activated for repetition priming in production to occur. Repetition priming
occurs as a result of the strengthening of the connections between the lemmas and phonological
forms.

Evidence for a phase of early semantic activation in lexical selection and a later phase of
phonological activation in phonological encoding comes
from picture–word interference studies (Levelt et al., 1991a; Schriefers, Meyer, & Levelt,
1990). These experiments, discussed in more detail later in the section on the time course of
lexicalization, used a picture–word interference paradigm in which participants see pictures
that they have to name as quickly as possible. At about the same time they are given an
auditorily presented word for which they have to make a lexical decision. Words prime semantic
neighbors early on, whereas late on they prime phonological neighbors. This suggests that there
is an early stage when semantic candidates are active (this is the lemma stage), and a late
stage when phonological forms are active.

The semantic-interference paradigm provides evidence for two stages, and furthermore, that the
lexical items activated by the first stage compete against each other (Starreveld & La Heij,
1995, 1996). In semantic-interference studies, participants have to name pictures which have
superimposed distractor words that they have to ignore; naming times are longer when the picture
and the word are related. The distractors lead to the activation of semantic competitors that
slow down the selection of the lexical target. In the related word translation task,
semantically related words induce semantic interference; however, related pictures produce
facilitation (Bloem & La Heij, 2003). The SOA is, however, critical; if the interfering words
are presented 200 ms after the target, we observe semantic interference, but if they are
presented 400 ms before the target, we observe semantic facilitation (Bloem, van den Boogaard, &
La Heij, 2004). Bloem and La Heij proposed a model of lexical access in which semantic
facilitation is localized at the conceptual level, semantic interference is localized at the
lexical level, and only one concept is selected for lexicalization. They called this the
Conceptual Selection Model (CSM). They account for the effects of SOA with the assumption that
lexical representations decay faster than conceptual representations.

Whether or not we observe facilitation or inhibition in the picture–word interference paradigm
depends on the details of the experimental set-up. In the most famous example of picture–word
interference, the Stroop task (naming the color in which a word is printed when the word spells
out a color name), there is striking inhibition. Usually we find interference with semantically
related pairs from the same category, and facilitation with phonologically related pairs.
Schriefers et al. (1990) found that inhibition disappears if participants have to press buttons
instead of naming pictures, suggesting that the interference reflects competition among lexical
items at the stage of lemma selection. The details of the task and the timings involved are also
critical (Bloem & La Heij, 2003; Bloem et al., 2004).

Evidence from neuroscience

Different regions of the brain become activated in sequence as we produce words (Indefrey &
Levelt, 2000, 2004). Conceptual selection of a word in picture naming is associated with
activation of the mid-part of the left middle temporal gyrus; accessing a word’s phonological
code is associated with activation of Wernicke’s area; and phonological encoding, in terms of
the preparation of syllables, sounds, and the prosody of the word, is associated with activation
around Broca’s area. As we shall see, lesions to these areas lead to different types of
impairment to word naming, with damage to more posterior regions of the brain resulting in
difficulty in accessing the meanings of words, and damage to more frontal regions resulting in
difficulty in accessing the sounds of words. A survey of the imaging literature also reveals the
timings of word retrieval in naming an object (Indefrey & Levelt, 2004): Visual and conceptual
processing take on average 175 ms; the best-fitting lexical item, or lemma, is retrieved between
150 and 225 ms; the phonological representations are retrieved between 250 and 330 ms; and the
details of the sounds of the word at around 450 ms (see Figure 13.5).

Electrophysiological evidence also supports the two-stage model (van Turenout, Hagoort, & Brown,
1998). Dutch-speaking participants were shown colored pictures and had to name them with a
simple noun phrase (e.g., “red table”). At the same time the participants had to push buttons
depending on the grammatical gender of the noun, and
FIGURE 13.5 Time taken (in ms) for different processes to occur in picture naming. The specific
processes are shown on the right and the relevant brain regions are shown on the left. Reprinted
from Indefrey and Levelt (2004).
Processing sequence shown in the figure:
Picture (0 ms)
↓ Conceptual preparation
Lexical concept (175 ms)
↓ Lemma retrieval
Multiple lemmas
↓ Lemma selection
Target lemma (250 ms)
↓ Phonological code retrieval
Lexical phonological output code
↓ Segmental spell-out
Segments (350 ms)
↓ Syllabification
Phonological word (455 ms)
↓ Phonetic encoding
Articulatory scores (600 ms)
↓
Articulation
Self-monitoring is shown as a separate process. Time windows (in ms) marked on the brain-region
side of the figure: 150–225, 200–400, 275–400, 400–600.
PET scans of human brain areas which are active while speaking and listening. Top
left—monitoring imagined speech lights up the auditory cortex. Top right—working out the meaning
of heard words activates other areas of the temporal lobe. Bottom left—repeating words activates
Wernicke’s area for language comprehension (right), Broca’s area for speech generation (left),
and a motor region producing speech. Bottom right—monitoring speech activates the auditory
cortex.
on whether or not it began with a particular sound. The electrophysiological data for the
preparation of the motor movements suggested that the syntactic properties were accessed before
the phonological information. However, the time delay between the two was very short—in the
order of 40 ms.

Evidence from the tip-of-the-tongue phenomenon

The tip-of-the-tongue (TOT) state is a noticeable temporary difficulty in lexical access. It is
an extreme form of a pause, where the word takes a noticeable time to come out (sometimes
several weeks!). You are almost certainly familiar with this phenomenon: You know that you know
what the word is, yet you are unable to get the sounds out. TOTs are accompanied by strong
“feelings of knowing” what the word is. They appear to be universal; they have even been
observed in children as young as 2 (Elbers, 1985). The incidence of TOTs increases with old age
(Burke, MacKay, Worthley, & Wade, 1991), and TOTs are more common in bilingual speakers (Gollan
& Acenas, 2004; Gollan & Brown, 2006). They are also not confined to speech; deaf signers
experience “tip-of-the-finger” states (Thompson, Emmorey, & Gollan, 2005).

The tip-of-the-tongue (TOT) state is an extreme form of a pause, where the word takes a
noticeable time to come out.

Brown and McNeill (1966) were the first to examine the TOT state experimentally. They induced
TOTs in participants by reading them definitions of low-frequency words, such as (33):

(33) “A navigational instrument used in measuring angular distances, especially the altitude of
the sun, moon, and stars at sea.”

Stop and try to name the item defined by (33). You may experience a TOT.

Example (33) defines the word “sextant.” Brown and McNeill found that a proportion of the
participants will be placed in a TOT state by this task. Furthermore, they found that lexical
retrieval is not an all-or-none affair. Partial information, such as the number of syllables,
the initial letter or sound, and the stress pattern, can be retrieved. Participants also often
output near phonological neighbors like “secant,” “sextet,” and “sexton.” These other words that
come to mind are called interlopers. TOTs show us that we can be aware of the meaning of a word
without being aware of its component sounds; and furthermore, that phonological representations
are not unitary entities.

There are two theories of the origin of TOTs. These are called the partial activation and
blocking (or interference) hypotheses. Brown (1970) first proposed the partial activation
hypothesis. This says that the target items are inaccessible because they are only weakly
represented in the system. Burke et al. (1991) provided evidence in favor of this model from
both an experimental and a diary study involving a group of young and old participants. They
argued that the retrieval deficit involves weak links between the semantic and the phonological
systems: there is a transmission deficit in getting between the two. A broadly similar approach
by Harley and MacAndrew (1992) localized the deficit within a two-stage model of lexical access,
between the abstract lexical units and the phonological forms. At first sight Kohn et al. (1987)
provided evidence contrary to the partial activation hypothesis in the form of a free
association task. They showed that the partial information provided by participants does not in
time narrow or converge on the target. However, A. S. Brown (1991) pointed out that participants
might not say out loud the interlopers in the order in which they came to mind. Furthermore, in
a noisy system there is
no reason why each attempt at retrieval should give the same incorrect answer.

The blocking hypothesis, first suggested by Woodworth (1938), states that the target item is
actively suppressed by a stronger competitor. Jones and Langford (1987) used a variant of the
Brown and McNeill task known as phonological blocking to test this idea. They presented a
phonological neighbor of the target word and showed that this increases the chance of a TOT
state occurring, whereas presenting a semantic neighbor does not. They interpreted this as
showing that TOTs primarily arise as a result of competition. Jones (1989) further showed that
the blocker is only effective if it is presented at the time of retrieval rather than just
before. However, Perfect and Hanley (1992) and Meyer and Bock (1992) discussed methodological
problems with these experiments. Exactly the same results are found with these materials when
the blockers are not presented, suggesting that the original results were an artifact of the
materials. In fact, prior processing of phonologically related words actually decreases the
chance of being in a tip-of-the-tongue state, and increases the probability of retrieving the
target word (James & Burke, 2000), a finding consistent with the insufficient activation
hypothesis—TOTs arise because there is a deficit in transmitting activation from the semantic to
the phonological level. The finding that bilingual speakers are more prone to TOTs is also best
explained by the insufficient activation idea—presumably the semantic–phonological links are
weaker in bilingual speakers because they speak each language only some of the time (Gollan &
Acenas, 2004).

Harley and Bown (1998) showed that TOTs are more likely to arise on low-frequency words that
have few close phonological neighbors. For example, the words “ball” and “growth” are
approximately equal in their frequency of occurrence. There are a lot of other words that sound
like “ball” (e.g., “call,” “fall,” “bore”), but few that are close to “growth.” These data fit a
partial activation model of the origin of TOTs rather than an interference model. Indeed,
phonological neighbors appear to play a supporting rather than a blocking role in lexical
access. Additional evidence for this claim comes from the finding that pictures with names in
sparse phonological neighborhoods are named more slowly than pictures with names in dense
neighborhoods, where there are many similar sounding words (Vitevitch, 2002).

The TOT data best support the partial activation hypothesis. They also suggest that the levels
of semantic and phonological processing in lexical retrieval are distinct. The tip-of-the-tongue
state is readily explained as success of the first stage of lexicalization but failure of the
second. There is some evidence that supports this idea. Vigliocco, Antonini, and Garrett (1997)
showed that grammatical gender can be preserved in tip-of-the-tongue states in Italian. That is,
even though speakers cannot retrieve the phonological form of a word, they can retrieve some
syntactic information about it.

There is also evidence from preservation of gender in an Italian person, called Dante, who
suffered from word-finding difficulties or anomia (Badecker, Miozzo, & Zanuttini, 1995). Dante
could give details about the grammatical gender of words that he could not produce. Information
about grammatical gender is part of the lexical-semantic and syntactic information encoded by
lemmas, such as knowing that a word is a noun. Hence Dante had access to the lemmas, but was
then unable to access the associated phonological forms. It is important to note that for many
Italian words grammatical gender is not predictable from semantics. Furthermore, Dante could
retrieve the gender for both regular and exception words, which suggests that Dante could not
just have used partial phonological information to predict grammatical gender. However, while
Dante’s performance is entirely compatible with the two-stage account, it is also compatible
with an account where such information is stored elsewhere. Gender can be put with other
syntactic information in the lexicon, such that it is stored with words. In that case, how could
gender be preferentially lost? We have a choice of only three genders, but of many more
phonological forms. It is possible that in an interactive activation network we would be able to
retrieve the correct gender without the network being able to
settle down enough to select the appropriate one of many phonological forms.

Further evidence that TOTs are associated with a difficulty in retrieving the phonological forms
of words comes from brain imaging. Shafto, Burke, Stamatakis, Tam, and Tyler (2007) had people
aged 19–88 name pictures of famous people. The number of TOTs increased with age and with
atrophy of the left insula, a region of the brain known to be involved (among other things) in
phonological production.

Problems with the lemma model

Although most researchers favor the two-stage model of lexicalization, there is less agreement
on the need for lemmas as a level of amodal, syntactically specified representations mediating
between concepts and phonological forms (Caramazza, 1997; Caramazza & Miozzo, 1997, 1998; Miozzo
& Caramazza, 1997).

One point is that it is not clear that the need for lemmas is strongly motivated by the data.
Most of the evidence really only demands a distinction between the semantic and the phonological
levels. The strongest evidence for lemmas comes from the finding that gender can be retrieved
when in the tip-of-the-tongue state, although this interpretation has been disputed. It should
not be possible to retrieve phonological information for a word without retrieving the syntactic
information for that word such as gender, as the phonological stage can only be reached through
the lemma stage. Tip-of-the-tongue data suggest, however, that syntactic
FIGURE 13.6 A detailed representation of Caramazza’s (1997) model. The flow of information is
from semantic to lexeme and syntactic networks and then on to segmental information. N = noun;
V = verb; Adj = adjective; M = masculine; F = feminine; Cn = count noun; Ms = mass noun. Dotted
lines indicate weak activation. Links within a network are inhibitory. Reproduced with
permission from Caramazza (1997).
[Figure labels: lexical–semantic network; syntactic network; phonological forms (/sedia/,
/tavolo/, /tigre/); segments (o, l, v, t, r, d, e, s, i, a, g); activation flow.]
and phonological information are independent a representation, the frequency of the lexeme
(Caramazza & Miozzo, 1997, 1998; Miozzo & representation will be the sum of the two homo-
Caramazza, 1997): Italian speakers can some- phones. Hence a less frequent word like “nun”
times retrieve partial phonological information will behave like a more frequent word, assum-
when they cannot retrieve the gender of the ing that frequency operates at the lexeme level.
word, and vice versa. Importantly, there was no Some studies find that frequency effects appear
correlation between the retrieval of gender and to reflect total-homophone frequency rather
phonological information; people are no better than word-specific frequency (Levelt, Roelofs,
at recalling gender when they correctly recall the & Meyer, 1999). For example, Jescheniak and
initial phoneme of the target in a TOT state than when they fail to do so. Hence, phonological
retrieval does not necessarily depend on syntactic retrieval, and therefore these results do not
support the idea of syntactic mediation. Arguing that lemmas are unnecessary complications,
Caramazza (1997) dispenses with them. He proposes that lexical access in production involves
the interaction of a semantic network, a syntactic network, and phonological forms (see Figure
13.6). Semantic representations activate both appropriate nodes in the syntactic network and
the phonological network.
If lemmas exist, given they are amodal and are syntactically specified, then grammatical
impairments involving words should not be modality-specific. However, we find patients who are
selectively impaired in producing words of one grammatical class in only one output modality
(Caramazza, 1997; Caramazza & Miozzo, 1998). For example, patient SJD has difficulty in
producing verbs in writing but not in speaking; she can produce nouns equally well in writing
and speaking (Caramazza & Hillis, 1991). Although her errors include semantic substitutions,
SJD does not have a central semantic impairment because she has no difficulty with
comprehending these words, and because her difficulties are restricted to one output modality.
It is difficult to account for this pattern of results with the lemma model (but see Roelofs,
Meyer, & Levelt, 1998, for an attempt).
Another way of distinguishing between the two accounts is to examine how we produce
homophones. Consider the words “none” and “nun.” According to the lemma model, these two words
have shared lexeme representations but separate lemma representations. The alternative is that
they just have two distinct lexeme representations. If homophones share

Levelt (1994) found that the translation speeds of a word like “nun” by Dutch–English
bilinguals depended on total-homophone frequency (the rather large “none” plus “nun”) rather
than word-specific frequency (the rather low frequency of just “nun”) compared with control
words. In contrast Caramazza, Costa, Miozzo, and Bi (2001) found that naming latencies in a
range of experimental tasks were determined just by word-specific frequency (i.e., “nun”
behaves like a low-frequency word rather than a high-frequency word).
Clearly there is conflict in the data here, and it is unclear how this conflict is best
resolved (Bonin & Fayol, 2002; Caramazza, Bi, Costa, & Miozzo, 2004; Jescheniak, Meyer, &
Levelt, 2003). Whether we find word-specific or total-homophone frequency effects depends on
the number and type of materials, the controls used, and where frequency effects operate. There
is now, for example, a considerable amount of evidence that frequency affects lexical selection
(the retrieval of lemmas), rather than just the retrieval of phonological forms. For example,
Navarrete, Basagni, Alario, and Costa (2006) found effects of frequency (in the form of faster
response times for high-frequency items) on tasks in Spanish that require the retrieval of
gender but not phonological properties. For example, they found frequency effects in a gender
decision task, and in a task where participants had to describe pictures using pronouns rather
than the name of the object.
Perhaps the best conclusion is that no firm conclusion can be drawn from these translation
tasks, although picture-naming data suggest that specific-word frequency best predicts naming
times (Caramazza et al., 2004). So in spite of initial optimism, homophone production does not
provide clear evidence for the two-stage model.
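The two frequency measures contrasted in the homophone studies can be made concrete with a minimal sketch. The frequency counts and the control word "owl" are hypothetical, invented purely for illustration; they are not taken from any of the studies cited.

```python
# Hypothetical per-million frequency counts, invented for illustration only.
FREQUENCY = {"none": 120.0, "nun": 3.0, "owl": 3.0}  # "owl": hypothetical matched control
HOMOPHONES = {"nun": ["none"], "none": ["nun"]}


def word_specific_frequency(word):
    """Frequency that should matter if each homophone has its own lexeme."""
    return FREQUENCY[word]


def total_homophone_frequency(word):
    """Frequency that should matter if homophones share a single lexeme."""
    return FREQUENCY[word] + sum(FREQUENCY[h] for h in HOMOPHONES.get(word, []))


for word in ("nun", "owl"):
    print(word, word_specific_frequency(word), total_homophone_frequency(word))
```

On the shared-lexeme account, "nun" inherits the frequency of "none" and should be produced faster than the matched control; on the separate-lexeme account, "nun" and the control should behave alike.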
In summary, although there is some dissent about the nature of the two stages, and the
extent to which there are amodal, syntactically specified lemmas, there is consensus that
lexical retrieval takes place in two stages, with a semantic-lexical stage followed by a
lexical-phonological stage.

Is lexicalization interactive?

Given that there are two stages involved in lexicalization, how do they relate to each other?
Interaction involves the influence of one level of processing on the operation of another. It
comprises two ideas. First, there is the notion of temporal discreteness. Are the processing
stages temporally discrete or do they overlap, as they would if information or activation is
allowed to cascade from one level to the following one before it has completed its processing?
The case when processing levels overlap, in that one level can pass on information to the next
before it has completed its processing, is known as processing in cascade (McClelland, 1979).
If the stages overlap, then multiple candidates will be activated at the second stage. For
example, many lemmas will become partially activated while activation is accruing at the
target. Activation will then cascade down to the phonological level. The result is that on the
overlap hypothesis we get leakage between levels so that non-target lemmas become
phonologically activated. We can examine this by looking at the time course of lexicalization.
Second, there is the notion of the reverse flow of information. In this case, information from
a lower level feeds back to the prior level. For example, phonological activation might feed
back from the phonological forms to the lemmas. Overlap and reverse flow of information are
logically distinct aspects of interaction. We could have overlap without reverse flow (but
reverse flow without overlap would not make much sense).

The time course of lexicalization: Discrete or cascaded processing?

How do the two stages of lexicalization relate to one another in time? Are they independent, or
do they overlap? That is, does the second stage of phonological specification only begin when
the first stage of lemma retrieval is complete, or does it begin while the first stage is still
going on? The speech error evidence of the existence of mixed whole word substitutions
indicates overlap or interaction between the two stages. To make the distinction between
independent and overlapping models concrete, suppose that you want to say the word “sheep.”
According to the two-stage hypothesis, you formulate the semantic representation underlying
sheep, and use this to activate a number of competing abstract lexical items. Obviously in the
first instance these will all be semantic relatives (like “sheep,” “goat,” “cow,” etc.). The
independence issue is this: Before you start choosing the phonological form of the target word,
how many of these competing units are left? According to the independence (modular) theory,
only one item is left active before we start accessing phonological forms. This is of course
the target word, “sheep.” According to the interactive theory, any number of them might be. So
according to the interactive theory, when you intend to say “sheep,” you might also be thinking
of the phonological form /gout/, and this will in turn have an effect on the selection of
“sheep.” Another way of putting this is that according to the discrete models, the
semantic-lexical and lexical-phonological stages cannot overlap, but according to the
interactive model, they can. The issues involved are exactly the same as those discussed in
word recognition.
Levelt et al. (1991a) performed an elegant experiment to test between these two hypotheses.
They looked for what is called a mediated priming effect: When you say “sheep,” it facilitates
the recognition of the word “goat” (which obviously is a semantic relative of “sheep”); but
does “goat” then go on to facilitate in turn one of its phonological neighbors, such as “goal”?
Levelt et al. argue that the interactive model suggests that this mediated priming effect
should occur, whereas the independence model states that it should not. The participants’ task
was this: They were shown simple pictures of objects (such as a sheep) and had to name these
objects as quickly as possible. This typically takes most people approximately 500 to 800 ms
to do. When we see
a picture or an object, we typically spend the first 150 ms doing visual processing and
activating the appropriate concept. We then spend another 125 ms or so selecting the lemma.
Phonological
[Figure 13.7 diagram: concept nodes (llama, goat, linked to features such as wool, hooves,
animal) connect to lemma nodes (category: noun; gender: lama, masc.; chèvre, fem.), which
connect to lexemes (/la:ma/, /goUt/) and their phonemes. The panel text follows.]

The Image
The first step on the path where thoughts flow into words can be thinking about or seeing the
image of the thing you want to talk about, like a llama.

The Lexical Level, or Concept
The image activates the lexical module, or node, for llama, carrying all the information the
brain has stored about llamas: an animal with hooves, wool, etc. Each node is believed to be a
widely distributed network of connected neurons in the brain. Adjacent lexical nodes for
related words, like sheep, goat, animal, etc., are also activated; the information is passed on
to the next module for processing.

The Lemma Level
Activation from all theoretical concepts is passed on to this level, where proper syntax is
assigned to each one. These rules of language include word order, gender if appropriate, case
markings, and other grammatical features. Meanwhile, the various activated lemmas compete;
usually the most highly activated wins, but the more competing lemmas interfere, the longer it
takes to generate the desired word.

The Lexeme Level
Turning the desired concept into a spoken word requires matching the syntactical elements from
the lemma level to the sounds that make up a language; not just syllables but stresses,
rhythms, and intonation. A word that is known but that is not frequently used will take more
time to recall. This is where the tip-of-the-tongue phenomenon occurs, perhaps because a given
lexical node was not sufficiently activated to make it to the lexeme level.

FIGURE 13.7 Processes involved in naming an object in a picture, according to the two-stage model of lexicalization.
From Levelt et al. (1991).
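The levels in Figure 13.7 can be used to sketch the difference between discrete and cascaded processing. This is a toy illustration only, not any published model: the activation values, the transmission weight, and the phoneme codings are all hypothetical.

```python
# Hypothetical lemma activations when the intended word is "sheep".
LEMMA_ACTIVATION = {"sheep": 1.0, "goat": 0.6, "cow": 0.5}

# Simplified, hypothetical phoneme codings for each word form.
FORMS = {"sheep": ["sh", "ee", "p"], "goat": ["g", "oa", "t"], "cow": ["c", "ow"]}


def phoneme_activation(lemma_activation, cascaded, weight=0.5):
    """Return the activation reaching each phoneme from the lemma level.

    Discrete: only the selected (most active) lemma passes activation on.
    Cascaded: every active lemma passes on activation in proportion to its level.
    """
    if cascaded:
        senders = lemma_activation
    else:
        winner = max(lemma_activation, key=lemma_activation.get)
        senders = {winner: lemma_activation[winner]}

    phonemes = {}
    for lemma, activation in senders.items():
        for phoneme in FORMS[lemma]:
            phonemes[phoneme] = phonemes.get(phoneme, 0.0) + weight * activation
    return phonemes


discrete = phoneme_activation(LEMMA_ACTIVATION, cascaded=False)
cascade = phoneme_activation(LEMMA_ACTIVATION, cascaded=True)
print("discrete /g/:", discrete.get("g", 0.0))
print("cascade  /g/:", cascade.get("g", 0.0))
```

In the discrete version the sounds of “goat” receive no activation at all, so a phonological neighbor such as “goal” could not be primed; in the cascaded version they receive some, which is the precondition for mediated priming.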
encoding starts around 275 ms and we usually start uttering the name from 600 ms.
In the interval between presentation and naming, subjects were given a word in an acoustic
form through headphones (e.g., “goal”). The participants had to press a button as soon as they
decided whether this second item was a word or not. That is, it was an auditory lexical
decision task. There were two critical results. First, Levelt et al. found that “sheep” did not
facilitate “goal”: “sheep” affected the subsequent processing of “goat,” but not of “goal.”
That is, there was no mediated priming. Hence they argued that no interaction occurred. Second,
in a separate experiment, they showed that “sheep” only affects the access of semantic
neighbors (e.g., “goat”) early on, whereas late on it only affects the access of phonological
neighbors (e.g., “sheet”). That is, there was no late semantic priming. The priming effects
were inhibitory: that is, related items slowed down processing through interference. Levelt et
al. concluded that, in picture naming and lexicalization, there is an early stage when semantic
candidates are active (this is the lemma selection stage), and a late stage when phonological
forms are active (see Figure 13.7). Furthermore, these two stages are temporally discrete and
do not overlap or interact.
Dell and O’Seaghdha (1991) showed with simulations that a model that incorporated local
interaction (between adjacent stages) could appear to be globally modular. This is because, in
these types of model, different types of information need not spread very far (but see Levelt
et al., 1991b). Only very weak mediated priming would be predicted here, insufficient to be
detected by this task. Harley (1993a) showed that a model based on interactive activation could
indeed produce exactly this time course while permitting interaction between levels.
Levelt et al.’s findings have also been questioned by the results of an experiment by Peterson
and Savoy (1998). They did find mediated priming. They showed that “soda” is activated when we
retrieve “couch”: “couch” activates its near synonym “sofa,” whose phonological form in turn
primes “soda.” The difference between their experiment and that of Levelt et al. is that
whereas Levelt et al. used categorical associates (“sheep” and “goat”), Peterson and Savoy used
near synonyms (“couch” and “sofa”). It is likely that categorical associates are too weakly
activated to produce measurable activation of their corresponding phonological forms. Near
synonyms, though, are very closely semantically related and therefore highly activated.
Whereas Peterson and Savoy used targets and distractors that had a very strong semantic
relation, Cutting and Ferreira (1999) used distractors that had a very strong phonological
relation to the target picture. Participants had to name pictures that had homophonic names
(e.g., “ball”). Auditory distractor words were presented 150 ms before the picture onset.
Homophones have the strongest phonological relation possible, because by definition the sound
of the two meanings (round toy and formal dance) is identical. If the discrete stage model is
correct, at such an early SOA only an appropriate-meaning semantic distractor (e.g., “game”)
should have an effect. But if the cascade model is correct, then the phonological form of the
inappropriate-meaning distractor (e.g., “dance”) should also have an effect. The results
supported the cascade model. The appropriate-meaning distractor produced inhibition relative to
an unrelated control (“hammer”), but the inappropriate-meaning distractor produced significant
facilitation. The phonologically related distractor affects picture naming at the same early
stage as a semantically related distractor. Similarly, Morsella and Miozzo (2002) presented
participants with two superimposed pictures, and asked them to name one but ignore the other.
They found that naming was faster when the two pictures were phonologically related (e.g., a
picture of a bed and a bell, compared with a bed and a pin). This finding again suggests that
activation from the unselected lexical node still trickles down to the phonological level.
Further support for cascade models of lexicalization comes from a study by Griffin and Bock
(1998). They examined how long it took participants to name pictures embedded in sentences.
They varied the degree of constraint of the
sentences and the frequency of the picture names. For example, (34) highly constrains the
following target picture name, whereas (35) produces very little constraint.

(34) Boris taught his son to drive a –
(35) Boris drew his son a picture of a –

Griffin and Bock found that the effects of constraint and frequency interacted in determining
naming times. High-constraint sentences show reduced frequency effects compared with
low-constraint sentences. In discrete stage models there is no means for the constraint present
in the lemma selection stage to influence the effect of word frequency in the separate and
subsequent stage of phonological encoding. However, this finding is exactly what cascade models
predict.
Data from bilingual speakers also support the cascade model. Costa, Caramazza, and
Sebastian-Galles (2000) examined the naming times of pictures whose names are cognates in
Catalan and Spanish (words that sound and look similar in both languages, e.g., “gat” in
Catalan and “gato” in Spanish, both meaning “cat”). For bilingual speakers, if activation does
indeed cascade from unselected lexical nodes, then the activation levels of the phonemes /g/
/a/ /t/ should be very high because they are receiving activation from two lexical nodes: the
selected Spanish target word and the non-selected Catalan node. Costa et al. indeed found that
naming times for cognate words were shorter in bilingual speakers (but not in monolingual
speakers). Only the cascaded-processing model clearly predicts this result. In the cascade
model, activation cascades down from non-selected lexical nodes (the cognates) to their
phonological segments, as well as from the target nodes. The result of this additional
activation of the phonological segments is to speed up naming.
In summary, these experiments show that word selection precedes phonological encoding. There is
much evidence that the two stages of lexicalization overlap, and little unambiguous evidence
against this idea.

Is there feedback in lexicalization?

Is there reverse information flow when we choose words? Models based primarily on speech error
data see speech production as primarily an interactive process involving feedback, mainly
because speech errors show evidence of multiple constraints such as a lexical bias and
similarity effects (Dell, 1986; Dell & Reich, 1981; Harley, 1984; Stemberger, 1985).
A familiarity bias is the tendency for errors to produce familiar sequences of phonemes. In
particular, lexical bias is the tendency for sound-level speech errors such as spoonerisms to
result in a word rather than a nonword (e.g., “barn door” being produced as “darn bore”) more
often than chance would predict. Of course, we would expect some sound errors to form words
sometimes by chance, but Dell and Reich showed that word outcomes happen far more often than is
expected by chance. This, then, is evidence of an interaction between lexical and phonological
processes. This bias has been shown both for naturally occurring speech errors (Dell, 1985,
1986; Dell & Reich, 1981) and in artificially induced spoonerisms (Baars et al., 1975), and in
languages other than English (e.g., in Spanish; Hartsuiker, Anton-Méndez, Roelstraete, &
Costa, 2006). Some aphasic speakers show clear lexical bias in their errors (Blanken, 1998).
Similarity effects arise when the error is more similar to the target according to some
criterion than would be expected by chance. In mixed substitutions the intrusion is both
semantically and phonologically related to the target, such as in (36) and (37). Obviously we
will find some mixed errors by chance, but we find them far more often than would be expected
by chance alone (Dell & Reich, 1981; Harley, 1984; Shallice & McGill, 1978). Obviously we need
a formal definition of phonological similarity; here both the target and the intrusion start
with the same consonant, and contain the same number of syllables. We also find similar results
in artificially induced speech errors (e.g., Baars et al., 1975; Motley & Baars, 1976) and in
errors arising in complex naming tasks (Martin, Weisberg, & Saffran, 1989). Laine and Martin
(1996) discuss the effect of task training on a severely anomic patient, IL. They found a
strong phonological relatedness effect.
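What “more often than would be expected by chance” means for mixed errors can be sketched with the similarity criterion given in the text (same initial consonant, same number of syllables). The toy lexicon, the set of semantic relatives, and the assumption that every semantic relative is an equally likely intruder are all hypothetical simplifications for illustration.

```python
# Hypothetical toy lexicon: word -> (initial sound, number of syllables).
LEXICON = {
    "comma": ("k", 2),
    "colon": ("k", 2),
    "period": ("p", 3),
    "hyphen": ("h", 2),
    "bracket": ("b", 2),
    "dash": ("d", 1),
}


def phonologically_similar(word_a, word_b):
    """Criterion from the text: same initial sound and same syllable count."""
    return LEXICON[word_a] == LEXICON[word_b]


def chance_mixed_rate(target, semantic_relatives):
    """Proportion of semantic relatives that also happen to sound like the target.

    Assumes (hypothetically) that each relative is equally likely to intrude.
    """
    similar = [w for w in semantic_relatives if phonologically_similar(target, w)]
    return len(similar) / len(semantic_relatives)


relatives = ["colon", "period", "hyphen", "bracket", "dash"]
print(chance_mixed_rate("comma", relatives))
```

If semantic substitutions for “comma” in a real corpus turned out to be phonologically related to it much more often than this baseline proportion, that excess would be the mixed-error effect the text describes.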
(36) comma → colon
(37) calendar → catalogue

Similarity effects are problematical for serial models such as Fay and Cutler’s. At the very
least, then, the basic model must be modified, and this can be done in two ways. Butterworth
(1982) proposed that a filter or editor checks the output to see that it is plausible; it is
less likely to detect an error if the word output sounds like the target should have been, or
is related in meaning. Such a mechanism, although it might be related to comprehension
processes, is not parsimonious (see Stemberger, 1983).

Horizontal information flow

We can call the type of information flow while speaking we have discussed so far “vertical
information”; we have been concerned with how information flows from the conceptual level to
the sound level for individual words. We have seen that the evidence favors a cascade model;
words are simultaneously active at multiple levels of information. We have also seen that
speech production is an incremental process; we plan as we speak. And we have seen that
information about lexical items can affect the syntax of the sentence; for example, we
construct sentences such that more accessible items are placed earlier in the sentence.
Given all this, it would not be surprising if words affect other words in the sentence, which
is called horizontal information flow. Smith and Wheeldon (2004) used a picture–word
interference task to demonstrate that information does indeed flow horizontally as well as
vertically in speech production. They used a modified version of the picture–word interference
task where participants produced sentences describing a moving scene on a computer screen. For
example, they might see a picture of a saw moving above the printed word “axe,” and would have
to say “The saw moves above the axe.” They found that two semantically related nouns produced
interference even if they were in different phrases of a sentence (as in “the saw moves above
the axe”). As we might expect from the differences in scope of word and sound exchange errors,
two phonologically related nouns produced facilitation, but only if they were in the same
phrase (as in “the watch and the wand move down”). Rapp and Samuel (2000) asked participants to
complete sentence fragments, finding a missing word in a sentence such as (38) or (39):

(38) The neighbors were shocked to hear Vlad had killed her. He had an argument with his
wife and had returned with a –.
(39) The neighbors were shocked to hear Vlad had killed her. He had an argument with his
spouse and had returned with a –.

Participants were much more likely to complete (38) with the word “knife” than (39). The
completions reflected both the semantic and phonological prior context. Taken further,
horizontal flow enables us to write humorously or lyrically: puns and poetry depend on
horizontal flow.

The interactive activation model of lexicalization

There is an emerging consensus among speech production theorists that lexicalization can be
described by spreading activation in a model similar to the interactive activation model of
context effects in letter identification proposed by McClelland and Rumelhart (1981). Different
versions of the same basic model have been described by Dell (1986), Dell and O’Seaghdha
(1991), Harley (1993a), and Stemberger (1985). For example, in Harley’s model lexicalization
proceeds in two stages. The meaning of a word is represented as a set of semantic features (see
discussion of the Hinton & Shallice, 1991, and Plaut & Shallice, 1993a, model of deep dyslexia
in Chapter 7, and the discussion of semantic representation in Chapter 11). These feed into a
level of representation where abstract lexical representations equivalent to the lemmas are
stored, and these in turn activate the phonological representations equivalent to lexemes. The
basic architecture of the model is shown in Figure 13.8; the rules that govern the behavior of
the network are similar to those in the McClelland and Rumelhart model of word recognition and
the TRACE model
of speech perception. As we have just noted, computer simulations based on this model can also
explain the picture-naming data of Levelt et al. (1991a).

[Figure 13.8 diagram, levels from top to bottom: external semantic input, semantic units,
lexical units, phonological units, phonological output.]
FIGURE 13.8 Architecture of an interactive activation model of lexicalization. Arrows show
excitatory connections; filled circles show inhibitory connections. The semantic within-level
connections are more complex, with partial connectivity, as indicated by the unfilled circle.

Dell’s (1986) interactive model of lexicalization

Dell (1986) proposed an interactive model of lexicalization based on the mechanism of spreading
activation. Items are slotted into frames at each level of processing. Processing units specify
the syntactic, morphological, and phonological properties of words. Activation spreads down
from the sentence level, where items are coded for syntactic properties, through a
morphological level, to a phonological level. At each level, the most highly activated item is
inserted into the currently active slot in the frame. For example, the sentence frame might be
quantifier–noun–verb. The morphological frame might be stem plus affix. The phonological frame
might be onset–nucleus–coda. The final output is a series of phonemes coded for position (e.g.,
/s/ in word-onset position). The flow of activation throughout the network is time-dependent,
so that the first noun in a sentence is activated before the second noun.
The model (see Figure 13.9) gives a good account of speech errors. Several units may be active
at each level of representation at any one time. If there is sufficient random noise an item
might be substituted for another one. As items are coded for syntactic category and position in
a word, the other units that are active at any one time tend to be similar to the target in
these respects. There is feedback between levels. The feedback between the phonological and
lexical levels gives rise to lexical bias and similarity constraints.
A related issue that has recently arisen is the degree to which there is competition within a
level between similar units. Recall that in the IAC model of letter recognition there are
within-level inhibitory links leading to competition between similar units. The key issue
therefore is whether the time to produce a word is affected by the activation of similar words.
This issue is currently unresolved, with some researchers arguing for competition, others
against it, while yet others claim that the data can be accounted for by an internal monitor
checking planned productions against internal goals (Dhooge & Hartsuiker, 2012; Melinger &
Rahman, 2013).

Evaluation of work on speech production

There is consensus (although by no means universal) that lexicalization in speech production
occurs in two stages. There is certainly plenty of evidence that information cascades between
levels. Strict serial models can only account for the data by introducing additional
assumptions (e.g., a non-serial component of the comprehension process interacting with speech
production in the picture–word interference task; allowing multiple selection of lemmas in
limited special circumstances). There is less consensus on whether the stages are discrete, and
on whether they interact.
It is possible to construct cascade models that re-create the pattern of performance shown by
the mediated priming experiments of Levelt and his colleagues. One possible weakness of the
interactive models is that they have many free parameters, and hence could potentially explain
[Figure 13.9 diagram: tactic frames (tree structures, left) and the lexical network (right) at
three levels, labeled SYNTAX, MORPHOLOGY, and PHONOLOGY, for the words SOME, SWIMMER, SINK, and
DROWN.]
FIGURE 13.9 Dell’s (1986) connectionist model of speech production. This figure depicts the momentary
activation present in the production of the sentence “Some swimmers sink.” On the left there are tree structures
analogous to the representation at each level of the model. The numbered slots have already been filled in and
the “flag” indicates each node in the network that stands for an item filling a slot (the number indicates the order
and the c flag indicates the current node on each level). The ? indicates the slot in each linguistic frame that is
currently being filled. The highlighting on each unit indicates the current activation level. Each node is labeled
for membership of some category. Syntactic categories are: Q for quantifier; N for noun; V for verb; plural
marker. Morphological categories are: S for stem; Af for affix. Phonological categories are: On for onset; Nu for
nucleus; Co for coda. Many nodes have been left out to simplify the network, including nodes for syllables, syllabic
constituents, and features. From Dell (1986).
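The slot-filling idea in Dell's model (at each level, the most highly activated item of the right category goes into the current slot) can be sketched for the syntactic level alone. This is a deliberately static toy, not Dell's simulation: the activation values are hypothetical, and time-dependent spreading, decay, and feedback are all left out.

```python
# Hypothetical activation levels: word -> (syntactic category, activation).
LEXICAL_NETWORK = {
    "some": ("Q", 0.9),
    "swimmer": ("N", 0.8),
    "sink": ("V", 0.7),
    "drown": ("V", 0.4),
}

FRAME = ["Q", "N", "V"]  # quantifier-noun-verb sentence frame


def fill_frame(frame, network):
    """Fill each slot with the most active unused item of the slot's category.

    Raises ValueError if no item of the required category is available.
    """
    output = []
    used = set()
    for category in frame:
        candidates = {
            word: activation
            for word, (word_category, activation) in network.items()
            if word_category == category and word not in used
        }
        winner = max(candidates, key=candidates.get)
        output.append(winner)
        used.add(winner)
    return output


print(fill_frame(FRAME, LEXICAL_NETWORK))

# Hypothetical noise pushes the competitor "drown" above the target "sink".
noisy_network = dict(LEXICAL_NETWORK)
noisy_network["drown"] = ("V", 0.75)
print(fill_frame(FRAME, noisy_network))
```

Because a slot only accepts items of its own category, the intruder in the noisy case is another verb, which mirrors the observation that substituted items tend to match the target in syntactic category.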
any pattern of results. It has become difficult to distinguish empirically between the cascade
and discrete models.
Hence all the data are consistent with non-discrete, cascading models. Levelt et al. (1991a)
argued that real-time picture-naming experiments
present a more accurate view of the normal lexicalization process. Nevertheless, any complete
model of lexicalization should also provide an account of the speech error data. Feedback
explains similarity and lexical biases, but it is of course most unlikely that feedback
connections should exist just to give phonological facilitation in speech errors (Levelt,
1989). One reason feedback links might exist is that the system is used in speech production
and comprehension, but this is implausible given experimental and neuropsychological evidence
for a separation (see Chapter 15). Hence models with feedback are in some respects
problematical.
One possibility is to explain the speech error data away. Given that the main evidence for
interaction is facilitation and lexical bias, perhaps these phenomena can be explained by other
mechanisms. An alternative explanation is the use of monitors (Baars et al., 1975; Butterworth,
1982; Levelt, 1989; Postma, 2000). Of course we monitor our speech; we sometimes detect errors
and correct them. The idea that we make use of a comprehension system to monitor what we say is
called the perceptual-loop hypothesis. Postma (2000) discusses three ways in which a monitor
might operate: It might be completely perceptual, having access only to our speech output; it
might have access to levels of processing prior to output, comparing intermediate levels of
representation against the conceptual message; or it might make use of relative information
about activation levels (e.g., if two lemmas are simultaneously very highly activated, a
warning light might flash). It is, however, difficult to distinguish between these
alternatives, and indeed all might well be true.
The use of a monitor to edit some slips adds complexity to the system (Stemberger, 1985). We
also observe aphasic speakers with error patterns that contradict the editor hypothesis. For
example, Blanken (1998) describes a patient who makes errors that come from different syntactic
categories on some occasions, but not on others. The editor should be very good at detecting
syntactic category violations and should be consistent. So, although the monitor might
sometimes prevent some types of error more than others, it is unlikely to be able to do so to
the extent that can account for the number of mixed errors actually found. More generally, the
existence of aphasic speakers with comprehension deficits who nevertheless show good error
detection is a problem for the perceptual-loop hypothesis. Instead, it might be that speech
error detection arises from the ability of the speech production system to detect conflicts
between planned output and intention, using mechanisms located in the anterior cingulate cortex
of the brain (Nozari, Dell, & Schwartz, 2011).
What role does feedback serve? Feedback is unlikely to be the same mechanism that is used in
comprehension: speech production is not just comprehension in reverse. (For detailed
justification of this statement, see Chapter 12.) Any increase in processing speed that
feedback provides is likely to be marginal, and feedback is most unlikely to exist just to
ensure that errors are words. One possibility is that it plays a role in monitoring speech and
detecting and preventing errors.
Connectionist modeling provides an alternative explanation to feedback. In Chapter 7 we saw how
mixed errors can arise in a feedforward architecture, as one of the properties of an attractor
network (Hinton & Shallice, 1991). Perhaps in a similar way we can talk about phonological
attractors. More work is necessary on this topic.
Rapp and Goldrick (2000) reviewed the literature on discreteness and interactivity, paying
particular attention to the pattern of errors made by normal and brain-damaged people. This
review provoked a lively debate (Rapp & Goldrick, 2004; Roelofs, 2004a, 2004b). Rapp and
Goldrick (2000) argued that the degree of bias towards mixed errors and the lexical bias in
errors made by normal individuals can only plausibly be accounted for by the presence of
feedback in the system. Furthermore, brain damage can disrupt language production at either the
semantic or the post-semantic level, and yet lead to only semantic errors. However, individuals
with brain damage show the mixed-error effect only if the locus of damage is post-semantic; a
semantic locus of impairment leads to semantic errors but no larger number of mixed errors than
would be expected
by chance. For example, patient KE has a semantic deficit, as indicated by a profound difficulty in understanding the meaning of words, and made only pure semantic errors and no mixed errors (or no more than chance). In contrast, patient PW had a post-semantic deficit, as indicated by his excellent comprehension, and made semantic and mixed errors (but no form-related errors). Using computer simulations of different types of production architecture, Rapp and Goldrick conclude that there is cascading activation and feedback between semantic and phonological processing levels, but that it is restricted. Modeling shows that too much feedback actually makes production more difficult and does not fit the range of data. They argue in particular that the pattern of data from brain-damaged people shows that the amount of feedback from the lexical to the semantic level must be minimal or zero.
Given that some feedback is necessary, the key question is, whereabouts in the production system does it occur? Is it production-internal, in the form of feedback connections between phonology and lemmas, as Rapp and Goldrick argue, or is it comprehension-based, in the form of a monitor checking the output of a pure feedforward system, as Roelofs argues? In Rapp and Goldrick’s RIA (restricted interaction account) model, there is a limited amount of feedback within the production network. In Roelofs’ WEAVER++ model, a purely feedforward production network generates an output that then acts as an input for a purely feedforward comprehension network. There is unfortunately no critical evidence that enables us to distinguish between these two alternative accounts, so at present the debate continues. However, it would be difficult to argue with the conclusion reached by Vigliocco and Hartsuiker (2002), who, in their review of the literature on speech production, conclude that the traditional serial model, where information is encapsulated within levels and where information flow between levels is limited, needs revision. Vigliocco and Hartsuiker argue for a maximalist approach, where there is feedback (a bidirectional flow of information) and maximal input (each level of information receives as much input as early as possible, that is, cascading activation, from as many sources as possible).
PHONOLOGICAL ENCODING
The main problem in phonological encoding is ensuring that the sounds of words come out in the appropriate order, with the appropriate prosody. Four solutions to this problem have been proposed.
The first account of phonological encoding is based on a distinction between structure and content. This approach is the simplest and most commonly used method for ensuring correct sequencing. Linguistic structures create frames with slots, and we then retrieve linguistic content to fill these slots. A frame is stored for each word we know. One of the best known versions of this approach is the scan-copier mechanism (Shattuck-Hufnagel, 1979). The sound segments are retrieved separately from this frame and inserted into the appropriate slots in a syllabic frame. When we speak, we produce an abstract frame for the upcoming phrase that is copied into a buffer. The frame specifies the syllabic structure of the phrase (in terms of onset, nucleus, and coda). A scan-copier device works through a syllabic frame in left-to-right serial order selecting phonemes to insert into each position of the frame. As a phoneme is selected, it is checked off. Disruption of this mechanism leads to difficulty in sequencing sounds in words (Buckingham, 1986). For example, if the scan-copier selects an incorrect phoneme but incorrectly marks off that phoneme as used, we will end up with a phoneme exchange speech error. If the scan-copier selects an incorrect phoneme but fails to mark that phoneme as used, we get a perseveration or exchange error. Garrett’s model of syntactic planning uses the same idea. Frame-based models are very good at accounting for sound-level speech errors. For example, a sophisticated frame-based model can account for how the proportions of anticipatory (e.g., “heft hemisphere”) and perseveratory (e.g., “left lemisphere”) sound-level speech errors vary with age and speech rate (Dell, Burger, & Svec, 1997). Schwartz, Saffran, Bloch, and Dell (1994) distinguished between “good” and “bad” error patterns. The good pattern is that found in normal speech: Errors are relatively rare, they
tend to create words, and the majority of them are anticipations. The bad pattern is when there are many errors and the proportion of perseverations is high. The bad pattern is found with some types of aphasia, in childhood when the material is less familiar, and with a faster speech rate. Frame-based models are very good at accounting for these sorts of data. Decreasing the available time and weakening connection strengths in the model both lead to an increase in the bad error pattern.
The second account, competitive queuing (Hartley & Houghton, 1996), is a connectionist model that also uses a frame, but which provides an explicit mechanism for inserting segments into slots. The segments to be inserted form an ordered queue controlled by processes of activation and inhibition. There are two control units, an initiation and an end unit. Sounds that belong at the start of a word have strong connections to a unit that controls the initiation of speech, while sounds at the ends of words have strong connections to a unit that controls the end of the sequence. The strength of connections of other sounds in a word to these control units varies as a function of their position in a word. After a sound is selected, it is temporarily suppressed. Failure to do this properly leads to perseveration errors. Although this model was originally formulated to account for serial order effects in remembering lists, it can be extended to account for all of speech production. It has the advantage of being able to learn how to order items.
Connectionist models suggest that the frame–filler distinction does not have to be represented explicitly, but that it can emerge from the phonological structure of the language (Dell, Juliano, & Govindjee, 1993). Dell et al. used a type of connectionist network called a recurrent network to associate words with their phonological representations in sequence, without any explicit representation of the structure–content distinction. Recurrent networks are very good at learning sequences of patterns. Dell et al.’s model incorporated two kinds of feedback. External feedback copied the output of the most recent segment, and therefore provided the model with memory of the past phonological states of the model. Internal feedback copied the past state of the hidden units of the network, and therefore provided the model with memory of its past internal structure. When the model made errors, it exhibited four properties observed in human sound speech errors. First, it obeyed the phonotactic constraint: errors result in sound sequences that occur in the language spoken. Second, consonants exchanged with other consonants, and vowels exchanged with other vowels. Third, the syllabic constituent effect is that vowel–consonant errors are less common than consonant–vowel errors. Finally, initial consonants are more likely to slip than non-initial ones.
Phonological encoding in the lemma model
The final account of phonological encoding is provided by the WEAVER++ model of Levelt, Roelofs, and colleagues (e.g., Levelt, 2001; Levelt, Roelofs, & Meyer, 1999; Roelofs, 1992, 1997a, 1997b, 2002, 2004a, 2004b; see Figure 13.10). WEAVER++ is a discrete two-stage model without any interaction between levels. Concepts select lemmas by enhancing the activation level of the concept dominating the lemma. Activation spreads through the network, with the important restriction that cascaded processing is not permitted, so that activation of the corresponding word form can only begin after a unique lemma has been selected. A phonological code is retrieved for each lemma; for multimorphemic words the phonological code is retrieved for each of the morphemes (e.g., if the target is “horses,” we retrieve “horse” and “-z”). The phonological codes are spelled out as ordered sets of phonemes. The phonological code is retrieved for the word as a whole; in picture–word interference studies, priming by parts of words facilitates the naming of the target (e.g., naming a hammer is facilitated by presenting “mer” as a distractor), suggesting that all the parts of the word have been retrieved in one go (Levelt, 2001; Roelofs, 1997a, 1997b). These ordered sets of phonemes are then incrementally strung together to form syllables, a process known as syllabification. Syllables are not stored in the lexicon; rather, we create them as
FIGURE 13.10 The WEAVER++ computational model. Adapted from Levelt et al. (1999). Stages and their outputs: STAGE 1 Conceptual preparation (lexical concept); STAGE 2 Lexical selection (lemma); STAGE 3 Morphological encoding (morpheme or word form); STAGE 4 Phonological encoding (phonological word); STAGE 5 Phonetic encoding (phonetic gestural score); STAGE 6 Articulation (sound wave). Self-monitoring is also shown.
we go along, depending on the context. As syllables are composed, they form the input to the final step of encoding, that of phonetic encoding, which forms the details of the sounds and acts as an input to the articulatory apparatus.
An important concept in phonetic encoding is the mental syllabary. The syllabary is a store of highly practiced syllabic “gestures” that can drive articulation; as syllabification proceeds, the corresponding syllabic patterns are retrieved from the syllabary for execution (Levelt, 2001; Levelt et al., 1999). Evidence for the existence of the syllabary comes from the finding that, when word frequency is controlled for, syllable frequency affects naming times (Cholin, Levelt, & Schiller, 2006; Levelt & Wheeldon, 1994). Although English has more than 10,000 different syllables, 80% of the time we use just 500 (Levelt, 2001). It makes sense to make use of these highly overlearned motor patterns to speed up production.
These models are perhaps not as mutually exclusive as they might first appear. They represent evolution in theorizing, and also emphasize different aspects of phonological encoding. The main difference is once again the extent to which information has to be explicitly encoded in the model, or whether it emerges as a consequence of the statistical regularities of the language. At present, frame-based models are better able to account for how we can produce novel sequences of sounds (Dell, Schwartz, Martin, Saffran, & Gagnon, 1997).
The role of syllables in phonological encoding
One major difference between many of the connectionist and WEAVER++ models concerns
the role of the syllable. Most connectionist models make use of metrical frames that specify the number, order, and structure of syllables and their stress pattern; syllables are then inserted into this metrical frame. In contrast, in the WEAVER++ model the metrical frame specifies only the stress pattern, and does not contain syllable information.
We can test this distinction, although the experiments are complex. Roelofs and Meyer (1998) examined whether we store the structure of syllables in the metrical frame. They used an implicit priming paradigm. Participants had to produce one word out of a small set of words as quickly as possible. The sets of words were either homogeneous, when all the words in the set had the same word-initial segments, or heterogeneous, when they did not. They found that priming depended on the words having the same number of syllables and the same stress pattern, but not the same syllable structure (the same number of consonants and vowels). Roelofs and Meyer concluded that the lack of priming suggests that syllable structure is not stored in the metrical frame. Cholin, Schiller, and Levelt (2004) used the same paradigm, and concluded that syllable frames are not stored with a word and retrieved during encoding, but instead are generated “on the fly.” The general idea with these studies is that if syllables are not explicitly stored in the lexicon, there should be no syllable-specific priming effect, which is what these studies find. Hence they support the view that syllables are made up only when necessary, as in the WEAVER++ model.
Other studies come to a different conclusion. Costa and Sebastian-Gallés (1998) used a picture–word interference paradigm: Participants had to name a picture while a word was presented 150 ms later. The results showed that participants were faster to name the picture when the target and the distractor shared the same abstract structure. For example, “cuña” (meaning “wedge”) has a CV.CV (consonant–vowel consonant–vowel) structure. “Cuña” primes the target word “mono” (monkey), which has the same syllabic structure (CV.CV), but no overlap in actual sounds (segmental content), relative to a control item (e.g., “culpa,” meaning fault, which is structurally and segmentally unrelated). This result suggests that abstract syllabic structures are used in phonological encoding.
It is difficult to come to any firm conclusion about the existence of pre-stored, abstract syllabic structures on the basis of the current contradictory findings (see Cholin et al., 2006, for a summary).
How far do we plan ahead?
What is the main unit of planning at the phonological level? According to Levelt (1989), we have to prepare the phonological word before we can start speaking. The phonological word is the smallest prosodic unit of speech: it is a stressed (strong) syllable and any associated unstressed (weak) syllables (Levelt, 1989; Sternberg, Knoll, Monsell, & Wright, 1988; Wheeldon & Lahiri, 1997). For example, “the vampire” is one phonological word; “the bad vampire” is two. The phonological word is prepared prior to rapid execution. Wheeldon and Lahiri showed that when all other factors are controlled for (e.g., syntactic structure, number of lexical items, and number of syllables), the time it takes us to prepare a sentence (as measured by the time it takes us to begin speaking the prepared material) is a function of the number of phonological words in it.
In addition to content words, phonological words can contain function words, although in some circumstances function words can form phonological words in themselves if we decide to stress them (e.g., “you CAN do that”). Further evidence for the importance of phonological words in phonological planning is that resyllabification occurs within phonological words, but not across them. This means that sounds from the end of one syllable can migrate to form the beginning of the next syllable. Consider (40) from Wheeldon and Lahiri (1997):
(40) Get me a beer, if the beer is cold.
A final /r/ sound has been added explicitly to the end of the second “beer,” and this has then resyllabified to become the onset of the following “is,” so that it is pronounced “beea-riz.” No such resyllabification can occur with the first “beer,” however, because the following /I/ is in a different phonological word.
On the other hand, some more recent work suggests that we do plan farther ahead than one
phonological word. For example, Costa and Caramazza (2002) used a picture–word interference design to examine how we produce noun phrases in English and Spanish. They asked speakers to produce simple (determiner noun) and complex (determiner adjective noun) constructions while ignoring phonological distractors. They found that the distractors phonologically related to the noun produced faster naming latencies, regardless of the type of construction and the position of the noun. This result shows that the level of activation of the phonological forms of the lexical nodes outside the first phonological word affects naming latency, meaning that the second phonological word of the noun phrase (the noun, in the complex construction) is activated before articulation begins (because it is facilitated by the prime). Hence, in at least some circumstances, phonological encoding extends beyond a phonological word (see also Alario, Costa, & Caramazza, 2002a, 2002b; Levelt, 2002).
One possible resolution of these apparently discrepant findings is that the phonological representations of words are activated in a graded way as we speak; the closer to output an item is, the more it is activated (Jescheniak, Schriefers, & Hantsch, 2003).
THE ANALYSIS OF HESITATIONS
Hesitation analysis is concerned with the distribution of pauses and other dysfluencies in speech (see Figure 13.11). An unfilled pause is simply a moment of silence. A filled hesitation can be a filled pause (where a gap in the flow of words is filled with a sound such as “uh” or “um”), a repetition, a false start, or a parenthetical remark (such as “well” or “I mean”). People often start what they are saying, hesitate when they discover that they haven’t really worked out what to say or how to say it, and repeat their start when they have (Clark & Wasow, 1998). Unfilled pauses are easier to detect mechanically by the equipment used to measure pause duration, so analysis has focused on them. It has been argued that pauses represent two types of difficulty: one in what might be called microplanning (due to retrieving particularly difficult words), and a second in macroplanning (due to planning the syntax and content of a sentence). The theoretical emphasis in the past has been that pauses predominantly reflect semantic planning.
Pauses and lexicalization
Goldman-Eisler (1958, 1968) examined the distribution of unfilled pauses (defined variously as longer than 200 or 250 ms) across time, using a device nicknamed the “pauseometer.” Obviously there are gaps between speakers’ “turns” in conversation, known as switching pauses, but there are many pauses within a single conversational turn. They tend to occur every five to eight words.
Goldman-Eisler (1958, 1968) showed that pauses are more likely to occur, and to be of
FIGURE 13.11 Speech dysfluencies: unfilled pause (due to microplanning, i.e., retrieving difficult words, or due to macroplanning, i.e., planning the syntax and content of a sentence), filled pause, and other dysfluencies (false start, repetition, parenthetical remark).
longer duration, before words that are less predictable in the context of the preceding speech. Predictability reflects a number of notions, including word frequency and familiarity, and the preceding semantic and syntactic context. Pauses before less predictable words are hypothesized to reflect microplanning and to correspond to a transient difficulty in lexical access. We know the meaning of the word we want to say but we cannot immediately retrieve its sound. Of course, not all hesitations precede less predictable words, and not all less predictable words are preceded by pauses. Sections of repeated speech behave differently from pauses, tending to follow unpredictable words rather than preceding them, as though they are used to check that the speaker has selected the correct word (Tannenbaum, Williams, & Hillier, 1965).
Beattie and Butterworth (1979) attempted to disentangle the effects of word frequency from contextual probability. They showed that the relation between pausing and predictability did not appear to be attributable simply to word frequency, and concluded that the main component of predictability that determined hesitations was difficulty in semantic planning. However, their study did not rule out possible contributions from syntactic difficulty (Petrie, 1987).
People often use appropriate gestures during these hesitations (Butterworth & Beattie, 1978). Suppose you are having difficulty in retrieving the word “telephone.” You pause just before you say it, and in that pause make a gesture appropriate to a telephone (such as holding your fist to the side of your head, with thumb and little finger extended). This suggests that you know the meaning of what you want to say; that is, that the difficulty lies elsewhere than in semantic planning. It suggests a two-stage model of lexical access in production. We first formulate a semantic specification of what we want to say, and phonological retrieval follows this. On this account the pause reflects a successful first stage but a delay in the second stage, that of retrieving the particular phonological form of the word. This account ties in with the evidence from tip-of-the-tongue states, which can be seen as extreme examples of microplanning pauses.
Pauses and sentence planning
Goldman-Eisler (1958, 1968) argued that in some pauses we plan the content of what we are about to say. She found that the difficulty of the speaking task affected the number of pauses a speaker makes, with more difficult tasks (for example, interpreting a cartoon rather than simply describing a cartoon) leading to more pauses in speech. She argued that speakers were using these additional pauses to carry out additional planning.
Pauses cast some light on the size of planning units in speech. Maclay and Osgood (1959) argued that the planning units must be larger than a single word because false starts involve corrections of the grammatical words associated with the unintended content-bearing words. We tend to produce corrections such as “The dog – the cat was …” Boomer (1965) argued on the basis of hesitations that an appropriate unit of analysis corresponds to a phonemic clause that essentially has only one major stressed element within it, and which corresponds to a clause of the surface structure. He argued that the clause is planned in the hesitation at the start of the clause. Ford and Holmes (1978) used dual-task performance to monitor cognitive load during speech production, whereby the participant had to speak while monitoring for a tone over headphones. They argued that planning does not span sentences because reaction times to the tone were no longer at the ends of sentences, suggesting that people are not planning the next sentence at the end of the previous one. On the other hand, Holmes (1988) asked participants to read several sentences that began a story, and then produce a one-sentence continuation. She found that, contrary to instructions, some speakers produced more than one sentence, and when they did so a pause was more likely at the start of their speech than when they produced only one sentence. Different tasks seem to indicate that different units are the fundamental unit. Nevertheless, the clause does seem to be an important unit of planning.
What exactly is planned in the pauses? In particular, is the planning syntactic or semantic in nature, or both? Goldman-Eisler (1968)
claimed that pause time was not affected by the syntactic complexity of the utterances being produced, and concluded that planning is primarily semantic rather than syntactic. This conclusion is now considered controversial (see Petrie, 1987). One problem concerns what measure should be taken of syntactic complexity. At this stage it would be premature to rule out the possibility that macroplanning pauses represent planning both the semantic and the syntactic content of a clause.
Henderson, Goldman-Eisler, and Skarbek (1966) proposed that there were cognitive cycles in the planning of speech. In particular, phases of highly hesitant speech alternate with phases of more fluent speech. The hesitant phases also contain more filled pauses, and more false starts, than the fluent phases. It is thought that most of the planning takes place in the hesitant phase, and in the fluent phase we merely say what we have just planned in the preceding hesitant phase (see Figure 13.12). Butterworth (1975, 1980) argued that a cycle corresponds to an idea. He asked independent judges to divide other speakers’ descriptions of their routes home into semantic units, and compared these with hesitation cycles. An idea lasts for several clauses. Roberts and Kirsner (2000) found that new cycles are associated with topic shifts in conversation.
One problem with this work is the way in which the units were identified by inspection of plots of unfilled pauses against articulation time. Jaffe, Breskin, and Gerstman (1972) showed that apparently cyclic patterns could be generated completely randomly. However, other phenomena (such as filled hesitations) also cluster within the planning phase of a cognitive cycle. For example, speakers tend to gaze less at their listeners during the planning phase, maintaining more eye contact during the execution phase (Beattie, 1980; Kendon, 1967). The use of gestures also depends on the phase of speech (Beattie, 1983). Speakers tend to use more batonic gestures (gestures used only for emphasis) in the hesitant phases, and more iconic gestures (gestures that in some way resemble the associated object, such as the one described earlier when about to say “telephone”) in the fluent phase (particularly before less predictable words). The observation that several features cluster together in hesitant phases suggests that these cycles are indeed psychologically real. Finally, Roberts and Kirsner (2000) used the statistical technique of time series analysis to find further support for the existence of temporal cycles.
FIGURE 13.12 Planning and execution phases: Cognitive cycles in speech production. Axes: total amount of pause time against time.
Evaluation of research on dysfluencies
Some dysfluencies might do more than just indicate temporary processing difficulty. Sometimes speakers deliberately (though perhaps usually unconsciously) put pauses into their speech to make the listener’s job easier, perhaps aiding them to segment speech, or to give them time to parse the speech (see also Chapter 14, on audience design). Lounsbury (1954) distinguished between hesitation pauses, which reflect planning by the speaker, and juncture pauses, which are put in by the speaker to mark major syntactic boundaries, perhaps for the convenience of the listener. Good and Butterworth (1980) provided experimental evidence that hesitations might be used to achieve some interactional goal, as well as reflecting the speaker’s cognitive load. They found that speakers paused more when giving descriptions of their route into work when the experimenter asked them to appear to be more thoughtful. Listeners do make use of
dysfluencies when parsing the input (Ferreira & Bailey, 2004). For example, filled pauses and repetitions are more common at the start than at the end of clauses; the parser could therefore make use of this information to decide on clause boundaries when there are alternative constructions (e.g., in garden path sentences). The use of “oh” indicates to the listener that the following utterance is not connected to the immediately preceding information, but to something earlier in the conversation (Fox Tree & Schrock, 1999). “Uh” and “um” may serve different functions in speech, with “uh” signaling a short delay, and “um” a longer delay, in speaking (Clark & Fox Tree, 2002). Hence dysfluencies do more than just reflect processing difficulty; they convey information to the listener. Of course, it is quite possible that any one particular dysfluency might serve more than one function.
Different types of pause might have different causes. Goldman-Eisler (1958) argued that micropauses (those shorter than 250 ms) merely reflect articulation difficulties rather than planning time; however, this view has been challenged (see, for example, Hieke, Kowal, & O’Connell, 1983). There is some measure of interchangeability between different types of hesitations. Beattie and Bradbury (1979) showed that if speakers were dissuaded from making many lengthy pauses (by being “punished” by the appearance of a red light every time they paused for longer than 600 ms), their pause rate did indeed go down, but the number of repeats they made went up instead.
Although the early work was originally interpreted as showing that pausing reflected semantic planning, this is far from clear. It is likely that microplanning difficulties arise in retrieving the phonological forms and planning propositions, whereas macroplanning pauses reflect both semantic and syntactic planning of larger chunks of language. It is possible that macroplanning and microplanning may conflict (Levelt, 1989); if we spend too much time on macroplanning, there will be fewer resources available for microplanning, leading to an increase in pausing and decreased fluency as we struggle for particular words.
THE NEUROSCIENCE OF SPEECH PRODUCTION
What else does neuroscience tell us about speech production?
Aphasia
In the past, researchers placed a great deal of emphasis on the distinction between Broca’s and Wernicke’s aphasias. These terms refer to what were once considered to be syndromes, or symptoms that cluster together, resulting from damage to different parts of the left hemisphere. Broca’s area is toward the front
FIGURE 13.13 Pathways showing the processes involved in speaking a heard word. Activation flows from Wernicke’s area (1), through the arcuate fasciculus (2), to Broca’s area (3). Labeled regions: motor cortex, Broca’s area, primary auditory cortex, and Wernicke’s area.
of the brain, in the frontal lobe, and Wernicke’s area is toward the rear, in the posterior temporal lobe (see Figure 13.13). These terms are also still meaningful for clinicians and neurologists, and they are still acceptable terms in those literatures.
Broca’s aphasia
Broca’s aphasics have non-fluent speech, characterized by slow, laborious, hesitant speech, with little intonation (called dysprosody), and with obvious articulation difficulties (called speech apraxia). There is also an obvious impairment in the ability to order words. At the most general level, Broca’s-type patients have difficulty with sequencing units of the language. An example of Broca’s aphasia is given in (41) (from Goodglass, 1976, p. 238), where the dots indicate long pauses. Although all Broca’s patients suffer from different degrees of speech apraxia, not all obviously have a syntactic disorder.
(41) “Ah … Monday … ah Dad and Paul … and Dad … hospital. Two … ah … doctors … and ah … thirty minutes … and yes … ah … hospital. And er Wednesday … nine o’clock. And er Thursday, ten o’clock … doctors. Two doctors … and ah … teeth.”
Wernicke’s aphasia
Damage to Wernicke’s area, which is in the left temporal-parietal cortex, results in the production of fluent but often meaningless speech. This is called Wernicke’s (sometimes sensory) aphasia. As far as one can tell, patients speak in well-formed sentences, with copious grammatical elements and with normal prosody. Comprehension is noticeably poor, and there are obvious major content word-finding difficulties, with many word substitutions and made-up words. Zurif, Caramazza, Myerson, and Galvin (1974) found that patients were unable to pick the two most similar words from triads such as “shark, mother, husband.” An example of the speech of someone with Wernicke’s aphasia is given in (42) (from Goodglass & Geschwind, 1976, p. 410):
(42) “Well this is … mother is away here working her work out o’here to get her better, but when she’s looking, the two boys looking in the other part. One their small tile into her time here. She’s working another time because she’s getting, too …”
For Wernicke, this type of aphasia resulted from the disruption of the “sensory images” of words. Clearly aspects of word meaning processing are disrupted in this type of aphasia, while syntactic processing is left relatively intact.
Comparison of Broca’s and Wernicke’s aphasias
Broca’s and Wernicke’s aphasias are not really mirror images. They are distinguished on two dimensions: intact versus impaired comprehension, and the availability or unavailability of the syntactic components of language (see Figure 13.14). This categorization relates more to the links between the characteristics of the impaired speech and anatomical regions of the brain, while currently the emphasis is on developing more functional descriptions relating to psycholinguistic models of the impairments. It is now considered more useful to distinguish between fluent aphasia, which is characterized by fluent (though sometimes meaningless) speech, and non-fluent aphasia. At the same time we can also distinguish between those patients who can comprehend language and those who have a comprehension deficit. Traditional Broca’s-type aphasics are non-fluent with no obvious comprehension deficit, whereas traditional Wernicke’s-type aphasics are fluent with an obvious comprehension deficit. Bear in mind that no classification scheme for neuropsychological disorders of language is perfect: there are always exceptions and patients who appear to cut across categories (see Schwartz, 1984). Furthermore, all patients have some degree of anomia (word-finding difficulties, discussed in more detail below), even agrammatic Broca’s aphasics (Dick et al., 2001).
Agrammatism
The syntactic disorder of non-fluent patients tells us a great deal about the processes involved in speech production. In traditional neuropsychology terms, such patients suffer from what has been labeled agrammatism.
FIGURE 13.14 Comparison between Broca’s and Wernicke’s aphasias.
Broca’s aphasia (expressive aphasia): speech comprehension relatively good; anomia, agrammatism, mispronunciation; damage to Broca’s area within the left frontal lobe AND in adjacent parts of frontal lobe and subcortical white matter.
Wernicke’s aphasia (receptive aphasia): pure word deafness, word comprehension difficulties; fluent but meaningless speech; damage to Wernicke’s area, within the left temporal lobe.
Agrammatism has three components. First, there is a sentence construction deficit, such that patients have an impaired ability to output correctly ordered words. The words do not always form sentences, but look as though they are being output one at a time. In some cases, simple sentences can be generated (e.g., a patient might repeat “the old man is washing the window” as “the man is washing window. The man is old”; Ostrin & Schwartz, 1986). The disorder extends to sentence repetition, where complex phrases are simplified. Second, some parts of speech are better preserved than others. In particular, there is a selective impairment of grammatical elements, such that content words are best preserved, and function words and word endings (bound inflectional morphemes) are least well preserved. Third, although for some time it was thought that their comprehension was spared, some people with agrammatism also have difficulty in understanding syntactically complex sentences (see Chapter 10). It is also possible that certain differences between agrammatic speakers reflect different adaptations to the deficit. For example, some people show better retention of bound morphemes, and others of free grammatical morphemes.
Whether or not these components are dissociable is an important question. There has been considerable debate as to whether terms such as Broca’s aphasia and agrammatism have any place in modern neuropsychology. The debate centers on whether agrammatism is a coherent deficit: Do people with agrammatism show symptoms that consistently cluster together, and hence, is there a single underlying deficit that can account for them? If it is a meaningful syndrome, we should find that the sentence construction deficit, grammatical element loss, and a syntactic comprehension deficit always co-occur. A number of single-case studies have found dissociations between these impairments (Caplan et al., 1985; Goodglass & Menn, 1985; Miceli, Mazzucci, Menn, & Goodglass, 1983; Nespoulous et al., 1988; Saffran et al., 1980; Schwartz et al., 1987).
These dissociations suggest that there is a syntax module in the brain, but that the module itself has neurologically distinct components. This idea is supported by recent neuroimaging data (Grodzinsky & Friederici, 2006). Grodzinsky and Friederici identify different sorts of syntactic processing, and indicate where they might take place in the brain (see Figure 13.15). Broca’s area is particularly important for identifying how different constituents in the sentence are related to each other, with regions in the superior temporal gyrus (including Wernicke’s area) more involved in syntactic integration. Imaging suggests that even parts of the right hemisphere play some role in syntactic processing.
FIGURE 13.15 The main brain areas involved in syntactic processing. Pink areas (frontal operculum and anterior superior temporal gyrus) are involved in the build-up of local phrase structures; the yellow area (BA44/45) is involved in the computation of dependency relations between sentence components; the striped area (posterior superior temporal gyrus and sulcus) is involved in integration processes. Reprinted from Grodzinsky and Friederici (2006).
More recently it has been found that agrammatism can be observed in a wide range of aphasic patients, and is not restricted to non-fluent aphasics (Dick et al., 2001). Agrammatism can even be observed in neurologically intact people under stress.
If there is no such syndrome as agrammatism, it is meaningless to perform group experiments on what is in fact a functionally disparate group of patients. Instead, one should only perform single-case studies (Badecker & Caramazza, 1985). In reply Caplan (1986) argued that at the very least agrammatism is a convenient label. Although there might be subtypes, there is still a meaningful underlying deficit. This issue sparked considerable debate, both on the status of agrammatism (see Badecker & Caramazza, 1986, for a reply to Caplan) and on the methodology of single-case studies (see Bates et al., 1991; Caramazza, 1991; McCloskey & Caramazza, 1988).

Explanations of agrammatism
One explanation of agrammatism is that the patients’ articulation difficulties play a causal role. It might be that patients find articulation so difficult that they drop function words in an attempt to conserve resources. But agrammatism is much more than a loss of grammatical morphemes, as there is also a sentence construction deficit and, in most cases, a syntactic comprehension deficit.
Other theories attempt to find a single underlying cause for the three components. One obvious suggestion is that Broca’s area is responsible for processing function words and other grammatical elements (see also Chapter 10). We saw earlier that content and function words suffer very different constraints in normal speech production: for example, they never exchange with each other in word exchange speech errors. There is also some neuropsychological evidence that content and function words are served by different processing routines. French-speaking agrammatic patients made more phonological errors on reading function words than matched content words (Biassou, Obler, Nespoulous, Dordain, & Harris, 1997), a finding often observed in deep dyslexia, which often co-occurs with agrammatism. Probabilistic difficulty in accessing grammatical elements will lead to difficulty in understanding complex syntactic constructions, and deficits in syntactic production (Pulvermüller, 1995). Along these lines, Kean (1977) proposed a single phonological deficit hypothesis, later revised by Lapointe (1983), based on the assignment of stress to a syntactic frame. Kean argued that agrammatic patients omit items that are unstressed components of phonological words (see earlier). Hence content words tend to be preserved, and affixes and function words are lost. This hypothesis sparked considerable debate (see Caplan, 1992; Grodzinsky, 1984, 1990; Kolk, 1978). The main problem is that although it explains grammatical element loss, it does not account so well for the other components of the disorder (particularly the sentence construction deficit), nor for the patterns of dissociation that we can observe, in particular the patients’ ability to make judgments about the grammaticality of sentences. Furthermore, as we saw in Chapter 10, the conclusion that function
and content words are processed differently is questionable.
Stemberger (1984) compared agrammatic errors with normal speech errors. He proposed that in agrammatic patients there is an increase in random noise, and an increase in the threshold that it is necessary to exceed for access to occur. In these conditions substitutions and omissions, particularly of low-frequency items, occur. He argued that agrammatism is a differential exacerbation of problems found in normal speech; this idea, that aphasic behavior is just an extreme version of normal speech errors, is one frequently mentioned. Harley (1990) made a similar proposal for the origin of paragrammatisms. These are errors involving misconstructed grammatical frames, and can be explained in terms of excessive substitutions. Again, however, these approaches do not explain all the characteristics of agrammatism. Although uninflected words are more common than inflected forms, the high-frequency function words are more likely to be lost than content words, which are of lower frequency, on average. Stemberger argued that the syntactic structures that involve function words are less frequent than structures that do not.
Schwartz (1987) related agrammatism to Garrett’s model. Consider what would happen in this model if there were a problem translating from the functional level to the positional level. No sentence frame would be constructed, and no grammatical elements would be retrieved. This is what is observed. This does not provide an account of the comprehension deficit, which would arise from damage to other systems. The dissociation between the sentence construction deficit and grammatical element loss suggests that different processes must be responsible in Garrett’s model for constructing the sentence frame and retrieving grammatical elements. Although lacking detail, this line of thought both supports and extends Garrett’s model, and shows how neuropsychological impairments can be related to a model of normal processing.
We saw that reduced computational resources might play some role in the syntactic comprehension deficit. Similarly, limited memory might play some role in agrammatic production. However, any role is a complicated one, as severely reduced short-term memory (STM) does not necessarily lead to agrammatism (Kolk & van Grunsven, 1985; Shallice & Butterworth, 1977). Hence any impairment would have to be to some component of memory other than the phonological loop. This could be to a specialist store for syntactic planning, or perhaps to a special part of the central executive component of working memory. Nevertheless, reduced computational resources may play some role in the production deficits in agrammatism (Blackwell & Bates, 1995; Kolk, 1995). If this is so, one possibility is that grammatical elements are particularly susceptible to loss when computational resources are greatly reduced.

Jargon aphasia
Jargon aphasia is an extreme type of fluent aphasia in which syntax is primarily intact, but speech is marked by gross word-finding difficulties. People with jargon aphasia often have difficulty in recognizing that their speech is aberrant, and may become irritated when people fail to understand them, indicating a problem with self-monitoring (Marshall, Robson, Pring, & Chiat, 1998).
The word-finding difficulties in jargon aphasia are marked by content-word substitutions (paraphasias) and made-up words (neologisms). Paraphasias include unrelated verbal paraphasias, such as (43), semantic paraphasias (44), form-based or formal paraphasias (45) (all from Martin & Saffran, 1992, and Martin, Dell, Saffran, & Schwartz, 1994), and phonemic paraphasias (46) (from Ellis, 1985). Of particular interest are neologisms, which are made-up words not to be found in a dictionary. There are a number of types of neologisms, including distortions of real words, for example (47) and (48) (from Ellis, 1985), and abstruse paraphasias with no discernible relatives, where it is often difficult to discern the intended word (49) (from Butterworth, 1979). As an example, consider the description (50) of connected speech. This is a description by patient CB (from Buckingham, 1981, p. 54) of the famous Boston “cookie theft” picture, which depicts a mother washing plates while the sink overfills, while in the background a little boy and girl steal the cookies.
(43) thermometer → typewriter
(44) scroll → letters
(45) pencil → pepper
(46) swan → swom
(47) octopus → opupkus
(48) whistle → swizl
(49) ? → kwailai
(50) “You mean like this boy? I mean [noy], and this, uh, [neoy]. This is a [kaynit], [kahken]. I don’t say it, I’m not getting anything from it. I’m getting, I’m [dime] from it, but I’m getting from it. These were [eksprehsez], [ahgrashenz] and with the type of [mahkanic] is standing like this … and then the … I don’t know what she [goin] other than. And this is [deli] this one is the one and this one and this one and … I don’t know.”
Butterworth (1985) noted that jargon aphasia changes over time as patients recover some of their abilities. A typical progression is from undifferentiated strings of phonemes, to neologistic speech, to word paraphasias, and then perhaps to circumlocutory phrases.
Butterworth (1979) examined hesitations before neologisms in the speech of patient KC, and found that they resembled those made by normal speakers before less predictable words. KC was more likely to hesitate before a neologism or a paraphasia than before a real word. The presence of pauses before neologisms argues against any account of neologisms relying on disinhibition, that is, the idea that the lexical retrieval system is overactive. Butterworth instead argued that such errors arise when the patient is unable to activate any phonological form, and instead uses a random phoneme generation device to produce a pseudoword.
Butterworth, Swallow, and Grimston (1981) examined the gestures in the pauses preceding the neologisms. They found that KC’s use of gestures was generally the same as that of normal speakers, and they therefore concluded that the semantic system was intact in this patient. However, many gestures produced just before neologisms were incomplete. Iconic gestures are thought to be generated at the semantic level. Butterworth (1985) argued that the first stage of lexical access (what we have called lemma retrieval) functions correctly, but the retrieval of the phonological forms fails. He also suggested that aphasic errors are accentuated normal slips of the tongue, and pointed to a large number of instances of word blend errors in KC’s speech, combined with or perhaps caused by a failure in the mechanisms that normally check speech output. Ellis, Miller, and Sin (1983) found that the main determinant of probability of successful retrieval in jargon is word frequency. We would expect to have particular difficulty in retrieving low-frequency items.
Buckingham (1986) provided an account of jargon aphasia in terms of the traditional Garrett model of speech production. Buckingham posited disruption of the functioning of the device known as a scan-copier that is responsible for outputting the phonemes of a word into the syntactic frame in the correct order (Shattuck-Hufnagel, 1979). Buckingham (1981) pointed out that neologisms may actually have many different sources, but also invoked the notion of a random syllable generator.
Neologisms display appropriate syntactic accommodation, and their affixes appear correct for their syntactic environment (Butterworth, 1985). This is further support for the Garrett model, as content words are retrieved independently from their syntactic frames and inflections, and jargon is a disorder of lexical retrieval. All Wernicke’s-type deficits can be seen as problems with the semantic-phonological access system. It is as yet unclear whether there are two subtypes, one involving a semantic impairment and one involving only a problem in the retrieval of phonological forms, although given the two-stage model, such a division would be expected.

Anomia
Anomia is an impairment of retrieving the names of objects and pictures of objects, and can be found in isolation, or accompanying other disorders such as Wernicke’s-type or Broca’s aphasia. In fact virtually all types of aphasia are marked by some degree of anomia. The two-stage model of lexicalization suggests that there are two things that could go wrong in naming. We can have difficulty in retrieving the lemma from the semantic specification, or we could have difficulty in retrieving the phonological
form of a word after we have accessed its lemma. Therefore we observe two types of anomia.

Lexical-semantic anomia
Perhaps the most striking evidence for involvement of the semantic level in naming disorders is when patients can name members of one semantic category (such as inanimate objects) better than another (such as animate objects). We examined these category-specific semantic disorders in detail in Chapter 11. In many of these patients, however, the deficit is a central one: The central semantic store (or stores) is disrupted, as performance is poor in comprehension as well as production (Warrington & Shallice, 1984).
Lexical-semantic anomia is an inability to use the semantic representation to select the correct lemma. Howard and Orchard-Lisle (1984) described patient JCU, who had a general semantic disorder. Her naming of all types of object was substantially impaired (she could correctly name only 3% of pictures without help), and she made many semantic errors. Her naming performance could be improved if she was given a phonological cue to the target, such as its initial phoneme. However, these phonological cues could lead her astray; if she was given a cue to a close semantic relative of the target she would produce that. For example, the cue “l” would lead her to say “lion” in response to a picture of a tiger. Howard and Orchard-Lisle concluded that her processes of object recognition were normal. JCU scored highly on the pyramids and palm trees test. In this task the participant has to match a picture of an object to an associate. In the eponymous trial, the participant must match a picture of a pyramid to a picture of a palm tree rather than to one of a deciduous tree. This pattern of performance suggests that in JCU both object recognition processes and the underlying conceptual representation were intact. JCU performed less well on a picture categorization task where there were close semantic distractors (such as matching an onion to a peapod rather than to an apple), although performance was still above chance. Howard and Orchard-Lisle concluded that JCU suffered from a semantic impairment such that there was interference between close semantic relatives. She was led to the approximate semantic domain so the target word was distinguishable from semantically unrelated words, but the semantic representation was too impoverished to enable her to home in any more precisely. JCU could only access incomplete semantic information.
Some patients make semantic errors yet have apparently intact semantic processing. Howard and Franklin (1988) described the case of a patient known as MK who had a moderate comprehension deficit. MK was poor at naming, producing semantic relatives of the target, yet performed well at the pyramids and palm trees task. For example, MK named a caterpillar as “slug,” yet had no difficulty in associating a picture of a caterpillar with a picture of a butterfly rather than with a picture of a dragonfly. Hence, although the semantics were intact, MK still made semantic paraphasias. MK probably had problems getting from an intact semantic system to the lemma.

Phonological anomia
Kay and Ellis (1987) described the case of EST. (See Laine & Martin, 1996, for a description of a similar sort of patient, IL.) This patient knew the precise meaning of words and was good at all semantic tasks, but was very poor at retrieving any phonological information about the target. For example, he performed normally on the pyramids and palm trees test, and often offered detailed semantic information about the word that he could not retrieve. He was not prone to interference from close semantic distractors. He was much better at retrieving high-frequency words than low-frequency ones. He had full and clear understanding of the items he was trying to name, but he still could not retrieve targets, although he sometimes had partial phonological information, and could produce associated semantic information such as the superordinate category of the word and a functional description of the associated object. Phonological cuing of the target helped only a little, and unlike JCU, EST could not be misled into producing a category coordinate. This type of anomia is reminiscent of the tip-of-the-tongue state. EST’s problems appeared to arise at the phonological level rather than at the semantic level.

Evaluation of anomia research
The existence of two types of anomia supports a distinction between semantic and phonological processing in speech production. Although it is usually adduced as evidence for the two-stage model of lexicalization, it might also be consistent with a one-stage model. In the two-stage account, lexical-semantic anomia can be explained as difficulty in retrieving the lemma, whereas non-semantic impairment can be explained as difficulty in retrieving the phonological representation after the lemma has been successfully accessed. In the one-stage model, lexical-semantic anomia could arise from the failure of the semantic system, while phonological anomia could arise from failures of accessing the word forms (as do jargon and neologisms). As we saw earlier, the best evidence for two stages in lexicalization comes from the study of anomic patients in languages with gender.
Another complication is that the effects of phonological priming are more complex than described above. Wilshire and Saffran (2005) gave two fluent aphasic patients with anomia auditory primes just before naming a picture. They found that patient IG, who made many semantic and phonological substitutions, was helped only by word-initial phonological priming (e.g., ferry–feather). Patient GL, who made phonological errors and substitutions, only benefited from word-final primes (e.g., brother–feather). It is likely that word-initial and word-final primes have effects at different stages, with word-initial information becoming available very early, while the lemma is being selected, whereas word-final information is only available later, after the lemma has been selected and the detailed phonological form of the word is being retrieved.

Connectionist modeling of aphasia
Connectionist modeling of aphasia has focused on difficulties with lexical retrieval.
Harley and MacAndrew (1992) lesioned a model of normal lexicalization with the aim of producing some of the characteristics of aphasic paraphasias. They tested four hypotheses. First, Martin and Saffran (1992) proposed that a pathological increase in the rate of decay leads to increased paraphasias and neologisms. Second, Harley (1993b) argued that the loss of within-level inhibitory connections would lead to impaired processing; if lexical units were involved, neologisms and paraphasias would result, whereas if connections between syntactic units were lost, paragrammatisms (Butterworth & Howard, 1987) would result. Third, Stemberger (1985) argued that normal speech errors result from noise in an interactive activation network; perhaps aphasic errors result from excessive random noise. Finally, Miller and Ellis (1987) argued that neologisms result from the weakening of the connections between the semantic and lexical units. Harley and MacAndrew concluded that weakened semantic–lexical connections best fit the error data: Weakening the value of the parameter that governs the rate of spread of activation from semantic to lexical units often results in target and competing lexical items having similar high activation levels.
Currently the most comprehensive computational model of aphasia is based on Dell’s (1986) model of speech production. Martin and Saffran (1992) reported the case of patient NC, a young man who suffered a left hemisphere aneurysm that resulted in a pathological short-term memory span and a disorder known as deep dysphasia. This is an aphasic analog of deep dyslexia; it is a relatively rare disorder marked by an inability to repeat nonwords and the production of semantic errors in the repetition of single words (see Howard & Franklin, 1988). Additionally, in word naming NC produced a relatively high rate of formal paraphasias (sound-related word substitutions, such as producing “schools” for “skeleton”) (see Figure 13.16). Martin and Saffran argued that the semantic errors in word repetition and the formal paraphasias in production arise because of a pathological increase in the rate at which the activation of units decays. In naming, formal paraphasias arise because when the lexical unit corresponding to the target is activated, activation spreads to the appropriate phonological units. Feedback connections from the phonological to the lexical level ensure that lexical units corresponding to words that are phonologically similar to the target word become activated. Martin and Saffran argued that if the activation of lexical units decays pathologically quickly, then the target lexical unit (as well as semantically related lexical units primed by earlier feedforward activation) will be no more highly activated than other phonologically related lexical units that have been activated later by phonological–lexical feedback. Repetition errors are accounted for by a similar, but reversed, mechanism. The target and phonologically related lexical units are primed early by feedforward activation from auditory input, and suffer more from decay. This activation feeds forward to semantic feature units that in turn feed back to the lexical network to refresh the activation of the decaying target unit. At the same time, this feedback primes semantically related units. Because they are primed
FIGURE 13.16 Proportion of naming errors by deep dysphasic patient, NC, and the lesioned version of Dell’s (1986) model of speech production (where the lesion led to abnormal decay of activation). Panels: NC (n = 172) and Model, q = 0.92 (n = 1,000 trials); each plots proportion of responses (0.00 to 0.50) against response type. The response categories are: C = correct; S = semantic error; F = formal paraphasia; N = neologism (nonsense word); S→F = formal paraphasia on a semantic error; S→N = neologism on a semantic error. q is decay rate. Figure from Martin et al. (1994).
later, the semantic competitors suffer less from the cumulative effects of the decay impairment, and thus the likelihood increases that they will be selected instead of the target and phonologically related words. It is difficult to sustain the activation of the target lexical unit given rapid decay, particularly when it is hindered in other ways (such as when the target is low frequency, or is supported by impoverished semantic representations).
The idea that a pathological rate of decay and impaired activation processes play a central role in word retrieval deficits has been developed further. Dell, Schwartz, Martin, Saffran, and Gagnon (1997) simulated these deficits with Dell’s computational model of speech production. The basic model (called the DSMSG model after the authors) is the interactive two-stage model described earlier: Activation flows from the semantic level through the lemma level to the phoneme level. There are feedback connections between levels. Dell et al. impaired the functioning of the network by reducing the connection weights or increasing the decay rate of the model (or both). These changes were made globally: the same parameter determines processing at each level. Decreasing the connection strength produces a large increase in the number of nonword errors, and a small increase in the number of semantic and phonological word substitutions. Increasing the decay rate at first increases the number of semantic and phonological word substitutions, although eventually more nonword errors are created. The most important dimension determining performance is the severity of damage: Aphasic naming performance lies on a continuum between normal performance and a completely random pattern. As damage becomes severe, the error pattern becomes more random. The model also accounts for the pattern of recovery shown by aphasic speakers with time by gradually resetting the decay variable to its normal value. The model described the naming errors of 21 fluent aphasic patients. It can also account for the pattern of performance shown by two brothers with a degenerative brain disease called progressive aphasia (Croot, Patterson, & Hodges, 1999). The language of one brother (RB) can best be explained by reduced connection strength, while the language of the other (CB) is best explained by an abnormally high decay rate.
Modeling work suggests that the performance of patients with impaired lexical access is better accounted for by impairments to two parameters, semantic weight and phonological weight, rather than by one weight-decay parameter (Foygel & Dell, 2000). These two parameters are measures of the weights, or connection strengths, between the semantic and the lexical (lemma) units, and between the lemma and the phonological units. Damage is simulated in the model by varying these weights. The new model fits the patient data slightly better than the weight-decay model. For example, some patients (e.g., PW of Rapp & Caramazza, 1998; DP of Cuetos, Aguado, & Caramazza, 2000) make exclusively semantic errors, and some patients (e.g., JBN of Hillis, Boatman, Hart, & Gordon, 1999; DM of Caramazza, Papagno, & Ruml, 2000) make exclusively phonological errors. These types of patients were not present in the sample modeled by the original DSMSG model, but can be modeled by the Foygel and Dell model. The new model provides an extremely good fit to the naming and repetition performance of a large (94-participant) group of aphasic patients (Dell, Martin, & Schwartz, 2007; Schwartz, Dell, Martin, Gahl, & Sobel, 2006). Finally, the new model fits in very simply with the two-stage model of lexicalization: We can account for the pattern of all types of lexical access failure in terms of the structure of the two-stage model without introducing new parameters (such as decay). Hence the model is more parsimonious than its predecessor.
Although these two models are based on sound psycholinguistic principles, there has been considerable debate about how well their outputs fit a wide range of patient data, and about the extent to which aphasic errors can all result from global damage to all levels of a system, as is the case with pathological decay, an idea called the globality assumption (Ruml & Caramazza, 2000; Ruml, Caramazza, Shelton, & Chialant, 2000). One reason why it is difficult to draw any firm conclusions from this controversy is that there is no agreement on how well a computational model has to fit the data for it to be a good model (Dell, Schwartz, Martin, Saffran, & Gagnon, 2000; Ruml & Caramazza, 2000).
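To make the lesion parameters discussed in this section concrete, here is a minimal, noise-free sketch of spreading activation in a three-level network. It is not the DSMSG or the Foygel and Dell implementation: the three-word lexicon, the feature numbering, the jolt size, the number of time steps, and the linear update rule (each unit keeps a fraction 1 − q of its activation and adds weighted input from its neighbors) are all assumptions made for illustration. The names s and p stand for the semantic–lemma and lemma–phoneme weights, and q for the decay rate.

```python
# Toy lexicon (hypothetical): "dog" shares 3 of 10 semantic features with
# "cat"; "mat" shares two phonemes with "cat" but no semantic features.
LEXICON = {
    "cat": {"features": range(0, 10), "phonemes": ("k", "ae", "t")},
    "dog": {"features": range(7, 17), "phonemes": ("d", "o", "g")},
    "mat": {"features": range(20, 30), "phonemes": ("m", "ae", "t")},
}

def build_links(s, p):
    """Bidirectional links: semantic<->lemma (weight s), lemma<->phoneme (weight p)."""
    links = []
    for word, entry in LEXICON.items():
        for f in entry["features"]:
            links.append((("sem", f), ("lem", word), s))
        for ph in entry["phonemes"]:
            links.append((("lem", word), ("pho", ph), p))
    return links

def lemma_activations(target, s, p, q, steps=8, jolt=10.0):
    """Spread activation from the target's semantic features; return lemma levels."""
    links = build_links(s, p)
    act = {}
    for a, b, _ in links:
        act[a] = 0.0
        act[b] = 0.0
    features = list(LEXICON[target]["features"])
    for f in features:
        act[("sem", f)] = jolt / len(features)  # share the jolt across features
    for _ in range(steps):
        # Decay first, then add input computed from the previous time step.
        new = {unit: value * (1.0 - q) for unit, value in act.items()}
        for a, b, w in links:
            new[b] += w * act[a]  # feedforward
            new[a] += w * act[b]  # feedback
        act = new
    return {word: act[("lem", word)] for word in LEXICON}

normal = lemma_activations("cat", s=0.1, p=0.1, q=0.5)
weak_semantic = lemma_activations("cat", s=0.02, p=0.1, q=0.5)  # reduced s only
fast_decay = lemma_activations("cat", s=0.1, p=0.1, q=0.9)      # raised q only
for label, result in (("normal", normal), ("weak s", weak_semantic), ("high q", fast_decay)):
    print(label, {word: round(value, 4) for word, value in result.items()})
```

Because the sketch has no noise and no selection step, the target lemma always remains the most active unit; what the lesions change is how much activation the target and its semantic ("dog") and formal ("mat") neighbors retain. Any error pattern in a fuller model would depend on the noise and selection mechanisms that are omitted here.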
Other types of aphasia

We have concentrated on three major categories of aphasia because of what they tell us about
normal speech production. However, there are other types. In global aphasia, spontaneous speech,
naming, and repetition are all severely affected. In crossed aphasia, disorders of language arise
from damage to the right hemisphere, even in right-handed people.
In conduction aphasia, repetition is relatively impaired, while production and comprehension are
relatively good. This dissociation is clear evidence that the processes of repetition can be
distinguished from the processes of production and comprehension. There are two subtypes of
conduction aphasia. In reproduction conduction aphasia, repetition is poor because of poor
phonological encoding. People with reproduction conduction aphasia show impairments in all
language production tasks, including speaking, repetition, reading, and writing (Kohn, 1984).
Repetition of longer and less familiar words is particularly poor. When reproduction conduction
aphasics attempt to correct their errors, they make repeated attempts to produce a word that
progressively approximates to the target, a phenomenon known as conduit d’approche (Martin, 2001).
In particular, an output phonological buffer is thought to be impaired in reproduction conduction
aphasia (Caramazza, Miceli, & Villa, 1986; Shallice, Rumiati, & Zadini, 2000). In STM conduction
aphasia, repetition is poor because of an impairment of input auditory short-term memory; these
patients make few errors in spontaneous speech production, but repetition of strings of short
familiar words is poor (Shallice & Warrington, 1977; Warrington & Shallice, 1969).
On the other hand, people with transcortical aphasia can repeat words relatively well. There are
two types of transcortical aphasia, depending on the precise site of the lesion. In transcortical
sensory aphasia, comprehension is impaired, output is fluent and may even include jargon, but
repetition is relatively good. There are two subtypes of transcortical sensory aphasia (Coslett,
Roeltgen, Rothi, & Heilman, 1987): one type where both lexical and non-lexical repetition are
preserved, and another where only repetition through a non-lexical route is intact. Patients
differ in the types of error they make in naming and spontaneous speech, and on the types of
reading error they make. Impairment of the lexical route leads to many word substitutions in
speech and surface dyslexia. In transcortical motor aphasia, comprehension and repetition are very
good, but there is very little spontaneous speech output.
These disorders can be related to a more detailed model of normal production, but a full account
of this depends on an understanding of the relation between language and short-term memory. This
topic is covered in Chapter 15.

Evaluation of the contribution of aphasia research to understanding normal processing

At first sight then there is a double dissociation between word-finding and the production of
grammatical forms, with these processes located in different brain regions. Broca’s patients have
difficulty producing grammatical forms, yet have relatively well-preserved word-finding.
Wernicke’s patients have severe word-finding difficulties, yet have relatively well-preserved
syntax. Just as we would expect from Garrett’s model, there is a double dissociation between
syntactic planning and grammatical element retrieval on the one hand and content word retrieval on
the other. This apparent double dissociation supports the main principles of the model.
The types of disorder observed support dissociations between the production of syntax and the
retrieval of lexical forms, between the generation of syntax and the access of grammatical
morphemes, and the retrieval of the phonology of content words. Garrett argued that content and
function words are from different computational vocabularies, and this is confirmed by the
neuropsychological work. Schwartz (1987) interpreted agrammatism and jargon aphasia within the
framework of Garrett’s model. At present, this approach identifies the broad modules found in
production rather than the detailed mechanisms involved.
Although syntactic production and comprehension deficits tend to co-occur, the dissociation of
grammatical impairments in comprehension and production suggests that at some level there
are distinct syntactic processes in production and comprehension. That is, some agrammatic
patients have no comprehension impairments, and some people with comprehension deficits do not
have any production impairments. Furthermore, there is no correlation between the severity of the
production and comprehension syntactic deficits that patients exhibit (Caplan, 1992). The parser
and the syntactic planner are to a large degree separable.
There is a problem with this double dissociation: it might be an artifact of considering just
people who speak English. Cross-linguistic studies of speakers of languages that are much more
richly inflected show different types of breakdown (Dick et al., 2001). In particular, patients
with damage to Wernicke’s area make many more grammatical errors, making many grammatical
substitutions (something for which there is little scope in English). Dick et al. argue that
Broca’s aphasics tend to omit things and Wernicke’s aphasics tend to substitute things, not
because of underlying grammatical reasons, but simply because of the differing speech rates of the
two groups. When speech is very slow, many items fail to reach a critical level of activation,
meaning that weakly represented elements are omitted. Substitution errors increase with speech
rate, but in English there is little scope for grammatical substitution. Hence it looks as though
people with Broca’s aphasia are making grammatical errors, and those with Wernicke’s aphasia
lexical errors, but really the two disorders lie on a continuum of omission and substitution
errors, with the nature of English limiting the sort of errors that can occur. Dick et al. argue
that their results show that grammar is not localized in one specific brain region (such as
Broca’s area), but instead makes use of many regions. Damage to Broca’s area has serious
consequences for grammatical processing, but in a more distributed account it does not necessarily
mean that grammar is located there.

WRITING AND AGRAPHIA

There has been even less work on writing than there has been on speaking. Obviously writing and
speaking are similar, but there are also significant differences. In Chapter 15 I will show that
the neuropsychological evidence suggests that speaking and writing use different lexical systems.
We have much more time available when writing compared with when speaking. We also (usually) speak
to another person, but write alone (even if for an audience). This leads to two major differences
between spoken and written language (Chafe, 1985). First, written language is more integrated and
syntactically complex than spoken language. We take more time to write, and can plan and edit our
output more easily. Second, writing involves little interaction with other people, and as a result
shows less personal involvement than speech. This has important consequences for teaching writing
skills (Czerniewska, 1992).
Hayes and Flower (1980, 1986) identified three stages of writing. The first is the planning stage.
Here goals are set, ideas are generated, and information is retrieved from long-term memory and
organized into a plan for what to write. The second is the translation stage. Here written
language is produced from the representation in memory. The plan has to be turned into sentences.
In the third stage, reviewing, the writer reads and edits what has been written.
Collins and Gentner (1980) described the planning stage in some detail. They distinguished between
the initial generation of ideas, and their subsequent manipulation into a form suitable for
translation into the final text. They suggested several means of generating ideas: Writing down
all the ideas you have on a topic, keeping a journal of interesting ideas, brainstorming in a
group, looking in books and journals, getting suggestions from other people, and trying to explain
your ideas to somebody. Although these ideas must be put down in tangible form, at this stage it
is important not to get too carried away with translation into text. Collins and Gentner
identified several methods of manipulating ideas into a form suitable for translation. These
include identifying dependent variables, generating critical cases, comparing similar cases,
contrasting dissimilar cases, simulating, categorizing, and imposing structure.
A number of factors are known to distinguish good from less able writers. Differences
at the planning stage are particularly important (Eysenck & Keane, 2010). Better writers can
manipulate the knowledge they have, rather than just telling it (Bereiter & Scardamalia, 1987).
Better writers are more able to construct suitable plans than less able writers. They are more
flexible about their plans, changing them as new information becomes available or as it becomes
apparent that the original plan is unsatisfactory (Hayes & Flower, 1986). Indeed, one of the most
serious errors that novice writers can commit is to confuse idea generation and planning with
translation into text, so that text constraints enter at too early a stage (Collins & Gentner,
1980). If this happens the writer loses track of the desired content and spends too much time
editing text that is then often discarded. Text that is undesirable may be kept in just because
the writer is reluctant to discard it given all the effort that has been put into it.
Although the planning stage is particularly important, there are also differences at the other two
levels. Good writers can generate longer sentence parts: They seem to think in larger “writing
chunks” (Kaufer, Hayes, & Flower, 1986). Good writers can also readily produce appealing text:
They know that good text must be enticing, comprehensible, memorable, and persuasive (Collins &
Gentner, 1980). When they revise their material, good writers are more likely than less skilled
writers to change the meaning of what they have written (Faigley & Witte, 1983).
Finally, although producing outlines improves the quality of the final work, perhaps surprisingly,
producing more detailed rough drafts does not (Eysenck & Keane, 2010; Kellogg, 1988). This is
because planning is the most important and difficult part of writing, and producing an outline
assists this stage. Producing a rough draft confers very little additional advantage to this
(Eysenck & Keane, 2010).
Where should you begin when writing? Quite often the beginning might not be the best place. Some
people recommend starting with the section that you think will be easiest to write (e.g., Rosnow &
Rosnow, 1992). It probably doesn’t matter too much; the important thing is to have constructed a
plan of what you need to write before you start.

The neuroscience of writing

The phonic mediation theory says that, when we write, we first retrieve the spoken sounds of words
and then produce the written word (Luria, 1970). Neuropsychological data show that the phonic
mediation theory is almost certainly wrong (Ellis & Young, 1988/1996). There are patients who can
spell words that they cannot speak (e.g., Bub & Kertesz, 1982b; Caramazza, Berndt, & Basili, 1983;
Ellis et al., 1983; Levine, Calvanio, & Popovics, 1982). That is, inner speech is not necessary
for writing.
Brain damage can affect writing to produce dysgraphia. There are types of dysgraphia similar to
the types of dyslexia. Shallice (1981) described the case of PR, who was a patient with
phonological dysgraphia. This patient could spell many familiar words, but could not generate
spellings from sounds. That is, he could spell words but not nonwords. (This is also further
evidence against the phonic mediation theory.) Beauvois and Derouesné (1981) reported RG, who
could spell nonwords but who would regularize irregular words, a condition called surface
dysgraphia. Finally, there are examples of people with deep dysgraphia who make semantic errors in
writing (e.g., writing “star” as “moon”; see Bub & Kertesz, 1982a; Newcombe & Marshall, 1980;
Saffran, Schwartz, & Marin, 1976).
Degenerative diseases such as Alzheimer’s (see Chapter 11) can affect the high-level processes
involved in writing. It is possible to detect very early changes in writing style as a consequence
of the disease. The acclaimed British writer Iris Murdoch won the Booker Prize in 1978 with her
novel The Sea, the Sea. Her final novel, Jackson’s Dilemma (published in 1995), met with an
unenthusiastic response from literary critics. She originally attributed her writing difficulties
to “writer’s block,” but showed a general cognitive decline around this time, and was diagnosed
with Alzheimer’s disease in 1996. She died in 1999. A detailed analysis of her early and midperiod
novels compared with her final novel shows that although there were few differences in syntax,
there were large differences in the choice of words (Garrard, Maloney, Hodges, & Patterson, 2005).
Words in the final novel tended to be much higher in frequency, and chosen from a much more
restricted vocabulary.
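One simple way to quantify how restricted a vocabulary is uses the type-token ratio: the number of
distinct words divided by the total number of words. The sketch below is only an illustration. It
is not claimed to be the measure used by Garrard et al., the function names are hypothetical, and
the two sample sentences are invented for the example rather than taken from any novel.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text: str) -> float:
    """Distinct words (types) divided by total words (tokens)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Invented samples (assumptions for illustration only).
varied_sample = "The grey sea heaved beneath a bruised and restless sky"
restricted_sample = "The man went to the house and the man went to the door"

varied_ttr = type_token_ratio(varied_sample)
restricted_ttr = type_token_ratio(restricted_sample)
print(f"varied: {varied_ttr:.2f}  restricted: {restricted_ttr:.2f}")
```

Because the ratio tends to fall as a text gets longer, samples of equal length should be compared.
A lower ratio indicates that the same words are being reused more often.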
SUMMARY
x Speech production has been studied less than language comprehension because of the difficulty
in controlling the input (our thoughts).
x Speech production can be divided into conceptualization, formulation, and execution.
x Formulation comprises syntactic planning and lexicalization.
x Lexicalization is the process of retrieving the sound of a word given its meaning.
x Speech errors are an important source of data in speech production, and can be described in terms
of the units and mechanisms involved.
x One of the best known models of formulation is Garrett’s; Garrett argues that speech error evidence
suggests there is a distinction between a functional level of planning and a positional level of planning.
x Explicit serial order information is not encoded at the functional level of Garrett’s model.
x The distinction between function and content words is central in speech production, as they never
exchange with each other in speech errors.
x Syntactic persistence is the phenomenon whereby we tend to reuse syntactic structures; hence we
can facilitate and direct production with appropriate prime sentences.
x Number agreement is determined by the underlying number of the subject noun.
x Production and syntactic planning have an incremental component.
x The strong version of Garrett’s model, in which the stages are discrete and do not interact, is
undermined by phonologically facilitated cognitive intrusions, blends of phrases merging at the
point of maximum phonological similarity, and similarity and familiarity biases in speech errors.
x In the two-stage model of lexicalization, a meaning-based stage is followed by a phonologically
based stage.
x Tip-of-the-tongue (TOT) states are noticeable pauses in retrieving a word; they arise because of
insufficient activation of the words in the lexicon.
x Evidence for two stages comes from an analysis of speech errors and TOTs, and of anomia in
languages that have gender.
x Lemmas are syntactically and semantically specified, amodal lexical representations.
x The amodal nature and syntactic mediation function of lemmas are debatable.
x Experimental studies of picture naming do not always find mediated semantic-phonological prim-
ing. Although this result suggests that the two stages of processing are discrete, simulations show
that it is not inconsistent with a cascade model, and other evidence suggests that the two stages
are accessed in cascade.
x Speech errors show lexical (familiarity) and similarity biases; these findings suggest that lexicali-
zation is interactive.
x Models such as that of Dell provide an interactive account of lexicalization.
x It is not clear why feedback connections exist, but connectionist models based on phonological
attractors can in principle still account for the data.
x The main problem for phonological encoding is ensuring that we produce the sounds in the correct
sequence.
x One important method of ensuring correct sequencing is to make a distinction between frames
and content.
x The phonological word is the basic unit of phonological planning.
x Hesitations reflect planning by the speaker, although they may also serve social and segmentation
functions.
x Microplanning pauses indicate transient difficulty in retrieving the phonological forms of less
predictable words, whereas macroplanning pauses indicate both semantic and syntactic planning.
x We sometimes hesitate before less predictable words, suggesting that we are having a temporary
difficulty in retrieving them.
x We tend to pause between major syntactic units of speech, and in these pauses we plan the content
of what we want to say.
x Speech falls in planning cycles, with fluent execution phases following hesitant planning phases
in which we do a relatively large amount of planning, each cycle corresponding to an idea.
x Aphasia is an impairment of language processing following brain damage.
x Broca’s aphasia patients are not fluent, often with some deficit in syntactic comprehension,
whereas Wernicke’s aphasics are fluent, usually with very poor comprehension.
x Agrammatism is a controversial label covering a number of aspects of impaired syntactic process-
ing, including a sentence construction deficit, the loss of grammatical elements of speech, and
impaired syntactic comprehension.
x Jargon aphasia is a disorder of lexical retrieval characterized by paraphasias and neologisms.
x Lexical-semantic anomia arises because of an impairment of semantic processing, whereas pho-
nological anomia arises because of difficulty in accessing phonological word forms.
x Naming errors can be modeled by manipulating connection strengths and the rate of decay of activation.
x Writing is less constrained by time than speech production, and is less cooperative than speech.
x Writing involves planning, translation, and reviewing; of these, planning is the most difficult.
x There are types of dysgraphia analogous to the types of dyslexia.
1. How does writing differ from reading?
2. How similar are speech errors to other sorts of action slip (e.g., intending to switch a light off
when it has already been switched off)?
3. Collect your own speech errors for 2 weeks. How well can they be accounted for by models of
speech production?
4. Observe when you pause and hesitate when speaking. Relate these observations to what you
have learned in this chapter.
5. Models of speech production have largely used connectionist architectures based on interactive
activation networks, whereas models of word recognition have largely used feedforward networks
trained with back-propagation. Can you think of any reason for this difference in emphasis?
FURTHER READING
See Wheeldon (2000) and Alario, Costa, Pickering, and Ferreira (2006) for collections of papers
covering all aspects of language production. A classic reference is Levelt (1989). Levelt discusses
what might happen at the message level. Dennett (1991) speculates about how the conceptualizer
might work.
In addition to number agreement, in many languages it is important that gender is matched
between adjectives, articles, and nouns. For work in this area, see Alario and Caramazza (2002);
Costa, Kovacic, Fedorenko, and Caramazza (2003); Schiller and Caramazza (2003); Schiller and
Costa (2006); Schriefers, Jescheniak, and Hantsch (2005); and Schriefers and Teruel (2000).
For more on hesitations and pauses, see Beattie (1983) for a sympathetic review and Petrie
(1987) for a critical review. For a review of the role of interaction in lexicalization and syntactic
planning, see Vigliocco and Hartsuiker (2002).
See Meyer (2004) for a review of work on the visual world and speech production.
For further information on the neuropsychology of language, see Kolb and Whishaw (2009).
For more on what cognitive neuropsychology tells us about normal speech production, see Caplan
(1992, with a paperback edition in 1996). See Roelofs, Meyer, and Levelt (1998) for a response to
Caramazza on the necessity of lemmas. See Rapp and Goldrick (2005) for a review of the literature
on the neuropsychology of word production.
See Vinson (1999) for an introductory review of language in aphasia. The methodological issues
involved in cognitive neuropsychology have spawned a large literature of their own. Indeed, a spe-
cial issue of the journal Cognitive Neuropsychology (1988, volume 5, issue 5) is completely devoted
to this topic. Much of the emphasis in this area has been on the status of agrammatism. For a more
detailed discussion, see also Shallice (1988). The nature of agrammatism has always been central in
this debate. See Hale (2002) for an account of what it must be like to lose language after a stroke, and
how the loss affects the family of the person.
See Emmorey (2001) for a review of the production of sign language. Sign language breaks
down after brain damage in interesting ways. Ellis and Young (1988) review the literature on the
neuropsychology of sign languages and gestures.
For more on the dual versus single route models of how we generate regular and irregular verbs,
see the debate in Trends in Cognitive Sciences (Marslen-Wilson & Tyler, 2003; McClelland &
Patterson, 2002, 2003; Pinker & Ullman, 2002).
For excellent overviews of research on writing, see Ellis (1993) and Eysenck and Keane (2010).
The latter covers the Hayes and Flower model in detail. Flower and Hayes (1980) discuss the plan-
ning process in more detail. Ellis (1993) also has a section on disorders of writing, the dysgraphias.
Ellis and Young (1988) also cover peripheral dysgraphias, which affect the lower levels of writing. See
Czerniewska (1992) for information on learning how to write, and how writing should best be taught.
CHAPTER 14
HOW DO WE USE LANGUAGE?
INTRODUCTION

There is more to being a skilled language user than just understanding and producing language. The
study of pragmatics looks at how we deal with those aspects of language that go beyond the simple
meaning of what we hear and say. One obvious way of doing this is by making inferences. Pragmatics
is concerned with how we get things done with language and how we work out what the purpose is
behind the speaker’s utterance.
Furthermore, much of what we have been concerned with so far is either how a comprehender
understands language, or how a speaker produces language. But usually we use language in a social
setting: we engage in dialog. It is possible that the sorts of theory we have considered so far
offer limited theories of language processing (Pickering & Garrod, 2004).
This chapter is about how we use language. The study of pragmatics can be divided into two
interrelated topics. The first is how we as hearers and speakers go beyond the literal meaning of
what we hear to make and draw inferences. (Of course, not all inferences are always intended!) For
example, if I say “Can you pass the salt?” I am usually not really asking you whether you have the
ability to pass the salt; it is an indirect, polite way of saying, “Please pass the salt.” Except
perhaps in psycholinguistics experiments, we do not produce random utterances; we are trying to
achieve particular goals when speaking. So how do we get things done with language? Clark (1996)
calls this type of behavior layering. In practice, language has multiple layers of meaning.
The second topic is how we maintain conversations. To get things done, we have to collaborate. For
example, we clearly do not want to talk all at the same time. How do we avoid this? Do
conversations have a structure that helps us to prevent this? And can we draw any inferences from
apparent transgressions of conversational structure?
Language use is a huge topic with many textbooks devoted to it, and I can only consider the most
important ideas here. A central theme here is that people are always making inferences at all
levels on the basis of what they hear. Our utterances interact with the context in which they are
uttered to give them their full meaning.
By the end of this chapter you should:
x Understand how we use language.
x Understand how we go beyond literal meaning.
x Understand how we manage conversations.
x Know how researchers use the visual world to investigate language processing.

MAKING INFERENCES IN CONVERSATION

We have seen that inferences play an important part in understanding text, and are just as
important in conversation. We make inferences not just from what people say, but also from how
they say it, and even from what they do not say. In conversation, though, we have an additional
resource: we can ask the other person. Conversation is a cooperative act.
Speech acts
When we speak, we have goals, and it is the lis-
tener’s task to discover those goals. According to
Austin (1962/1976) and Searle (1969), every time
we speak we perform a speech act. That is, we
are trying to get things done with our utterances.
Austin (1976) began with the goal of explor-
ing sentences containing performative verbs.
These verbs perform an act in their very utterance,
such as “I hereby pronounce you man and wife”
(as long as the circumstances are appropriate—
such as that I have the authority to do so; such circumstances are called the felicity
conditions). Austin concluded that all sentences are performative, though mostly in an indirect
way. That is, all sentences are doing something, if only stating a fact. For example, the
statement “My house is terraced” can be analyzed as “I hereby assert that my house is terraced.”
Austin distinguished three effects or forces that each sentence possesses (see Figure 14.1). The
locutionary force of an utterance is its literal meaning. The illocutionary force is what the
speaker is trying to get done with the utterance. The perlocutionary force is the effect the
utterance actually has on the actions and beliefs of the listener. For example, if I say (1) the
literal meaning is that I am asking you whether you have the ability to pass the gin. The
illocutionary force is that I hereby request you to pass the gin. The utterance might have the
perlocutionary force of making you think that I drink too much.

(1) Can you pass the gin?

FIGURE 14.1 Three forces possessed by a sentence (Austin, 1976): locutionary force, illocutionary
force, and perlocutionary force.

“Top me up!” This directive speech act may be interpreted beyond its literal meaning and have the
perlocutionary effect of making fellow diners think that she has had quite enough wine to drink
already!

According to Searle (1969, 1975), when we speak we make speech acts. Every speech act falls into
one of five categories (see Figure 14.2):

x Representatives. The speaker is asserting a fact and conveying his or her belief that a statement
is true. (“Boris rides a bicycle.”)
x Directives. The speaker is trying to get the listener to do something. (In asking the question
“Does Boris ride a bicycle?” the speaker is trying to get the hearer to give information.)
x Commissives. The speaker commits him or herself to some future course of action. (“If Boris
doesn’t ride a bicycle, I will give you a present.”)
x Expressives. The speaker wishes to reveal his or her psychological state. (“I’m sorry to hear
that Boris only rides a bicycle.”)
x Declaratives. The speaker brings about a new state of affairs. (“Boris, you’re fired for riding
a bicycle!”)

Different theorists specify different categories of speech acts. For example, D’Andrade and Wish
(1985) described seven types. They distinguished between assertions and reactions (such as “I
agree”) as different types of representatives, and they distinguished requests for information
from other request directives. The lack of agreement and the lack of detailed criteria of what
constitutes any type of speech act are obvious problems here. Furthermore, some utterances might
be ambiguous, and if so, how do we select the appropriate speech act analysis? A further challenge
is that it needs to be made explicit how the listener uses the context to assign the utterance to
the appropriate speech act type.
Direct speech acts are straightforward utterances where the intention of the speaker is revealed
in the words. Indirect speech acts require some work on the part of the listener. The most famous
example is “Can you pass the salt?,” as analyzed earlier. Speech acts can become increasingly
indirect (“Is the salt at your end of the table?” to “This food is a bit bland”), often with
increasing politeness. The less conventional they are, the more computational work is required by
the listener. Over 90% of requests are indirect in English (Gibbs, 1986b). Indirectness serves a
function: it is an important mechanism for conveying politeness in conversation (Brown & Levinson,
1987). It also enables the speaker to be strategic in their language: for example, if you are
offering someone a bribe, you might want to do so indirectly, so you can fall back on the direct
meaning should they turn out to be more honest than you: “I never meant it that way!” (Lee &
Pinker, 2010).
The meanings of indirect speech acts are not always immediately apparent. Searle (1979) proposed a
two-stage mechanism for computing the intended meaning. First, the listener tries the literal
meaning to see if it makes sense in context, and it is only if it does not that he or she will do
the additional work of finding a non-literal meaning. There is an opposing one-stage model where
people derive the non-literal meaning either instead of or as well as the literal one (Keysar,
1989). The evidence is conflicting, but certainly the non-literal meaning is understood as fast as
or faster than the literal meaning, which favors a one-stage model. For example, Gibbs (1986a)
found that in an appropriate context
FIGURE 14.2 Categories of speech act (Searle, 1969, 1975): representative, directive, commissive,
expressive, and declarative.
participants took no longer to understand the sarcastic sense of “You’re a fine friend!” than the
literal sense in a context where that was appropriate.
Clark (1994) detailed examples of many kinds of layering that can occur in conversation. We can be
ironic, sarcastic, or humorous, we can tease, we can ask rhetorical questions that do not demand
answers, and so on. Although we probably understand these types of utterance using similar sorts
of mechanisms as with indirect speech acts, much work remains to be done in this area.

How to run a conversation: Grice’s maxims

Grice (1975) proposed that in conversations speakers and listeners cooperate to make the
conversation meaningful and purposeful. That is, we adhere to a cooperative principle. To comply
with this, according to Grice, you must make your conversational contribution such as is required,
when it is required. This is achieved by use of four conversational maxims (see Figure 14.3):

x Maxim of quantity. Make your contributions as informative as is required, but no more.
x Maxim of quality. Make your contribution true. Do not say anything that you believe to be false,
or for which you lack sufficient evidence.
x Maxim of relevance. Make your contribution relevant to the aims of the conversation.
x Maxim of manner. Be clear: Avoid obscurity, ambiguity, wordiness, and disorder in your language.

Subsequently there has been some debate on whether there is any redundancy in these maxims. Sperber
and Wilson (1986) argued that relevance is primary among them and that the others can be deduced
from it.
Conversations quickly break down when we deviate from these maxims without purpose. However, we
usually try to make sense of conversations that appear to deviate from them. We assume that overall
the speaker is following the cooperative principle. To do this, we make a particular type of
inference known as a conversational implicature. Consider the following conversational exchange (2).

(2) Vlad: Do you think my nice new expensive gold fillings suit me?
Boris: Gee, it’s hot in here.

Boris’s utterance clearly violates the maxim of relevance. How can we explain this? Most of us
would make the conversational implicature that in refusing to answer the question, Boris is
FIGURE 14.3 Conversational maxims (Grice, 1975): quantity, quality, relevance, and manner.
implying that he dislikes Vlad’s new fillings and doesn’t think they suit him at all, but for some reason doesn’t want to say so to his face. Indeed, face management is a common reason for violating the maxim of relevance (Goffman, 1967; Holtgraves, 1998): People do not want to hurt or be hurt. The listeners’ recognition of this plays an important role in how they make inferences that make sense of remarks that apparently violate relevance (Holtgraves, 1998).
There are other ways in which speakers cooperate in conversations. Garrod and Anderson (1987) observed people cooperating in an attempt to solve a computer-generated maze game. The pairs of speakers very quickly adopted similar forms of description, a phenomenon called lexical entrainment. For example, we could call a picture of a dog “a dog,” “a poodle,” “a white poodle,” or even “an animal.” The frequency and recency of name selection can override other factors that influence lexical choice, such as informativeness, accessibility, and being at the basic level. Brennan and Clark (1996) proposed that in conversations speakers jointly make conceptual pacts about which names to use. Conceptual pacts are dynamic: They evolve over time, can be simplified, and even abandoned for new conceptualizations.
Of course, sometimes we don’t want to cooperate in conversations. Some people sometimes want to lie; frequently we want to keep things to ourselves. This privacy can sometimes be very difficult to maintain. Readers of a certain age might remember an episode of the UK television program “Dad’s Army,” where Captain Mainwaring, desperate to keep Corporal Pike’s name from the invaders, says “Don’t tell him (your name), Pike.” (Here is another example: DON’T think of a pink elephant.) Often it seems that the harder we try to keep something private, the more likely it is to pop out. An experiment carried out by Wardlow Lane, Groisman, and Ferreira (2006) showed that this impression is correct. Speakers described simple objects (e.g., triangles) to other people. Some information was known only to the speakers (e.g., that there was also another, larger triangle in the scene concealed from the listeners). Wardlow Lane et al. call this type of information privileged information. They found that if speakers were told to keep this privileged information secret, they were in fact more likely to refer to the concealed objects. Wardlow Lane et al. explain the results in terms of our monitoring speech; monitoring can bring things that we are trying to avoid into awareness, increasing the chance that they are in fact produced. Freud (1975) would talk in terms of repression; the two explanations are not a million miles apart.
The right hemisphere of the brain plays an important role in processing some pragmatic aspects of language (see Lindell, 2006, for a review). We saw in Chapter 12 that patients with right-hemisphere damage have difficulty in understanding jokes; more generally, the right hemisphere is involved in non-literal processing. Patients with right-hemisphere damage have difficulty in understanding jokes, idioms, metaphors, and proverbs. Imagine the sort of literal image provoked by the phrase “crying your eyes out” (Lindell, 2006; Winner & Gardner, 1977).
THE STRUCTURE OF CONVERSATION
There are two different approaches to analyzing the way in which conversations are structured (Levinson, 1983). Discourse analysis uses the general methods of linguistics. It aims to discover the basic units of discourse and the rules that relate them. The most extreme version of this is the attempt to find a grammar for conversation in the same way as there are sentence and story grammars. Labov and Fanshel (1977), in one of the most famous examples of the analysis of discourse, looked at the structure of psychotherapy episodes. Utterances are segmented into units such as speech acts, and conversational sequences are regulated by a set of sequencing rules that operate over these units. Conversation analysis is much more empirical, aiming to uncover general properties of the organization of conversation without applying rules. Conversation analysis was pioneered by ethnomethodologists, who examine social behavior in its natural setting. The data consist of tape-recordings and more latterly videos of transcripts of naturally occurring conversations.
In a conversation, speaker A says something, speaker B has a turn, speaker A then has another turn, and so on; we call this aspect of conversation turn-taking. A turn varies in length, and might contain more than one idea. Other speakers might speak during a turn in the form of back-channel communication, making sounds (“hmm hmm”), words (“yep”), or gestures (e.g., nodding) to show that the listener is still listening, is understanding, agrees, or whatever (Duncan & Niederehe, 1974; Yngve, 1970). Turn structure is made explicit by the example of adjacency pairs (such as question–answer pairs, or greeting–greeting pairs, or offer–acceptance pairs). The exact nature of the turns and their length depend on the social settings: seminars are different from spontaneous drunken conversation. Nevertheless speakers manage to control conversations remarkably accurately. Less than 5% of conversation consists of the overlap of two speakers talking at once, and the average gap between turns is just a few tenths of a second (Ervin-Tripp, 1979).
Speakers must use quite a sophisticated mechanism for ensuring that turn-taking proceeds smoothly. Sacks, Schegloff, and Jefferson (1974) proposed that the minimal turn-constructional unit from which a turn is constructed is determined by syntactic and semantic structure, and by the intonational contour of the utterance (over which the speaker has a great deal of control). A speaker is initially assigned just one of these minimal units, and then a transition relevance place where a change of speaker might arise. Sacks et al. discussed a number of rules that govern whether or not speakers actually do change at this point. Gaze is important: We tend to look at our listeners when we are coming to the end of a turn. Hand gestures might be used to indicate that the speaker wishes to continue. As important as visual cues might be, they cannot be the whole story, as we have no difficulty in ensuring smooth turn transitions in telephone conversations. Filled pauses indicate a wish to continue speaking. Speakers might deliberately invite a change of speakers by asking a question; otherwise a complex sequence of social rules comes into play. The advantage of the system discussed by Sacks et al. is that it can predict other characteristics of conversation, such as when overlaps (competing starts of turns, or where transitional relevance places have been misidentified) or gaps do occur.
Wilson and Wilson (2005) propose a more biological model of the control of turn-taking. They argue that during conversation endogenous oscillators in the brains of speaker and listener become synchronized, or entrained. Endogenous oscillators are groups of neurons that fire together in a periodic way and hence act like clocks in the brain. The driving force of this synchronization is the speaker’s rate of syllable production. A cyclic pattern develops, with the probability of one of the conversants initiating speech at any time being out of phase with the other, so minimizing the likelihood that the two people will start speaking at the same time. The two key ideas of this proposal are that biological clocks ensure we do not speak simultaneously, and these clocks obtain their timing from the speech stream.
Visual cues, such as gaze and hand gestures, are important in ensuring smooth turn-taking. However, they play only a small part in the complex sequence of social rules that govern conversation.
COLLABORATION IN DIALOG
Conversation is a collaborative enterprise, and speakers collaborate with listeners to ensure that
their utterances are understood (Clark & Wilkes-Gibbs, 1986; Schober & Clark, 1989). People go to considerable lengths to take the other person’s point of view in dialog, sometimes regardless of the cognitive load necessitated (Duran, Dale, & Kreuz, 2011). The idea that speakers tailor their utterances to the particular needs of the addressees is called audience design (Clark, 1996).
In Chapter 12 we saw how readers and listeners construct representations of incoming language. Conversation is a process of communicating these representations, of trying to make the representation of the speaker and the listener the same, almost of filling in gaps. (Of course, there are exceptions; when someone is lying, or deliberately withholding information, they are trying to make sure that the gaps are not filled in.) Pickering and Garrod (2004) call this process of trying to make the language representations of speakers and listeners coincide alignment. In their interactive alignment model, during dialog the linguistic representations of the participants become aligned at many levels (including the overall mental model of what is going on, the syntactic level, and the lexical level). They argue alignment occurs by means of four types of largely automatic mechanism: priming, inference, the use of routine expressions, and the monitoring and repair of language output. Such alignment of linguistic representations leads to the alignment of the speaker’s and the listener’s situation models (Zwaan & Radvansky, 1998). Perhaps the most important of these alignment mechanisms is priming. We have examined priming in several contexts (e.g., lexical priming in Chapter 6, syntactic priming in Chapter 13). Priming of words and syntactic structures ensures that linguistic representations become aligned at a number of levels. This account assumes much less explicit reasoning about one’s interlocutor than alternative views such as that of Clark (1996). Pickering and Garrod (2006) further emphasize the way in which listeners make predictions in conversations, and that these predictions are made by the speech production system: Comprehension draws on production, particularly in difficult circumstances.
There are other reasons for supposing that audience design is an emergent, interactive process. Horton and Gerrig (2005) found that the memory requirements of a task influence speakers. They used a task in which “Directors” gave instructions about manipulating an array of cards to “Matchers.” They found that the Directors were much better able to take the needs of the Matchers into account when their own memory demands produced by the task were lower. If speakers have a lot to remember, they find it difficult to take the needs of the listeners and the detailed past history of their conversational interaction into account.
Audience design
The idea that speakers tailor their productions to address the specific needs of their listeners is called audience design. An example of audience design is child-directed speech (see Chapter 4), when adults modify their utterances when speaking to infants and children.
We also saw in Chapter 10 that speakers sometimes use prosody and pausing to help listeners disambiguate what they say. Speakers also seem to monitor what they say with the goal of reducing ambiguity. While speakers sometimes avoid linguistic ambiguity (e.g., ambiguous words, of the type we examined in Chapter 6, or temporarily ambiguous structures, of the sort we examined in Chapter 10), they go out of their way to avoid non-linguistic ambiguity (Ferreira, Slevc, & Rogers, 2005). Non-linguistic ambiguity arises when there are multiple instances of similar meanings; for example, if there are several instances of the same object in the visual scene, or several instances that could be described by the same word. If there are two apples in front of us, one red and one green, we are unlikely to say just “give me the apple.” In their experiment, speakers described target objects (e.g., the flying mammal “bat”) in contexts where there were other objects that could cause linguistic (a baseball bat) or non-linguistic (a larger flying mammal) ambiguity (see Figure 14.4). Ferreira et al.’s results showed that speakers monitor their speech and can sometimes detect and avoid linguistic ambiguity before producing it, but almost always
avoid non-linguistic ambiguity. Speakers are much better at dealing with non-linguistic ambiguity than with linguistic ambiguity. A related study looking at dialog between two speakers engaged in moving objects on a grid found that when the visual context was potentially ambiguous, speakers tried to disambiguate their utterances (Haywood, Pickering, & Branigan, 2005). Hence speakers do pay some attention to the needs of the listener.
There are, however, limits to how far a speaker will go to make the listener’s life easier. Ferreira and Dell (2000) examined the extent to which speakers used optional complementizers (e.g., “that,” which is optional in “the vampire knew [that] you hated blood,” a structure that is ambiguous up until the word “hated”). If speakers are trying to produce structures that are as easy to understand and as unambiguous as possible, they should frequently include these optional words in sentences that would otherwise be ambiguous. However, they do not. Instead they choose structures that are easy to produce and that enable them to produce the main content words as early as possible. Speech production proceeds with quickly selected lemmas being produced as soon as possible. In addition, while speakers produce prosodic cues (such as lengthening words and inserting pauses) to syntactic boundaries, and listeners do pay attention to these cues, speakers tend to produce them regardless of whether or not the listener really needs them. For example, speakers provide disambiguating cues to the syntactic structure of instructions such as “Put the dog in the basket on the star” regardless of whether or not the referential situation is actually ambiguous (Kraljic & Brennan, 2005). This finding suggests that there are limitations to audience design. What is more, speakers overestimate how good they are at conveying information (Keysar & Henly, 2002). Keysar and Henly looked at 40 speakers producing syntactically ambiguous sentences such as “Boris shot the man with the gun” and lexically ambiguous sentences such as “The typist tried to read the letter without her glasses.” Nearly half (46%) of the time that the speaker thought the listeners had correctly understood the sentence, in fact they had not. So not only are there limits to how much speakers tailor their productions to their listeners, they do not always do so correctly even when they try.
SOUND AND VISION
We saw in Chapter 3 that human language is so powerful because we can talk about anything: we can talk about things remote in time and space, and about very abstract notions. However, just because we can do these things, it doesn’t mean we do them all the time. In fact a great deal of the time we talk literally about what is in front of our eyes. For much of everyday life we converse about the “here-and-now.” Not surprisingly, therefore, the study of how language interacts with the visual world has become of considerable importance over the last few years. Perhaps the only surprise is why it has taken so long
FIGURE 14.4 Sample displays used by Ferreira et al., in which the object labeled 3 had to be named so as to discriminate it from the other objects. In (a), there is linguistic ambiguity (baseball bat vs. mammal bat); in (b), there is non-linguistic ambiguity (large vs. small bat). Performance was much worse with linguistic than with non-linguistic ambiguity. Adapted from Ferreira et al. (2005).
for this topic to become so prominent. The answer to this question is that the study of how we interact with the visual world requires sophisticated eye-movement technology, and such technology has only recently become available.
A second reason why the study of the visual world has become so important is that it provides us with a new tool for studying how we understand language and speech. We can now see in real time how people make use of external, visual information when processing language. The visual world paradigm has recently proved very popular for investigating sentence processing (see many studies in Chapter 10) and speech production (see Chapter 13).
While adults make considerable use of the visual world, similar studies show that children do so to a much lesser extent (Snedeker & Trueswell, 2004; Trueswell, Sekerina, Hill, & Logrip, 1999). Five-year-old children rely exclusively on verb-bias information. Highly reliable cues, such as lexical bias, emerge first in development, with referential information gradually being used as the child gets older. Furthermore, although referential information may not determine which structures young children construct, it may reduce the time it takes to construct them (Snedeker & Trueswell, 2004).
Using visual information in comprehension
We saw in Chapter 10 that many sources of information are used to help us construct a syntactic representation and to resolve syntactic ambiguity. While adult readers rely mostly on lexical information to generate alternative syntactic structures, adult listeners make a great deal of use of the visual world in front of them. In particular, people can use referential information from the visual scene at which they are looking to override very strong lexical biases (Tanenhaus et al., 1995).
The role of the visual world in comprehension has since been demonstrated in several experiments. For example, Spivey, Tanenhaus, Eberhard, and Sedivy (2002) monitored the eye movements of participants following spoken instructions about picking up and moving objects in a visual workspace. The eye movements were closely linked to the associated referential expressions (phrases describing objects) in the instructions. What happens when people are given temporarily ambiguous sentences, such as (3), which contains a temporarily ambiguous prepositional phrase?
(3) Put the apple on the towel in the box.
The normally preferred initial interpretation is the goal-argument analysis (put the apple on the towel); the less usual initial interpretation is the noun-phrase modifier (the apple that is already on the towel should be put somewhere else). The answer depends on the visual context. If there was just one apple in the visual scene, people would go with the usual preferred
FIGURE 14.5 Display conditions (a) and (b) used by Spivey et al. (2002). In scene (a) participants spent time looking at the target destination (the empty towel), whereas in scene (b) they spent less time looking at the empty towel. Based on Spivey et al. (2002).
analysis, and spend time looking at the supposed (but incorrect) target destination (an empty towel). If there was more than one apple, however, participants assumed the less usual modification analysis, and did not spend much time looking at the empty towel (see Figure 14.5). Eye movements showed that the initial interpretation was the one consistent with the visual context.
Using a similar sort of design, Chambers, Tanenhaus, and Magnuson (2004) showed that properties of objects in the visual world can influence parsing. They gave participants temporarily ambiguous sentences such as “Pour the egg in the bowl over the flour.” The eggs in the scene could be in a liquid form, or whole. You cannot pour whole eggs, so people spend little time looking at them given the start of this instruction. Listeners restrict their attention to objects that are physically compatible with what they hear. If all you can see is one egg, in a bowl, in liquid form, you will analyze the sentence from the beginning with the structure of “pour the egg that’s in the bowl,” and your eyes will give you away. Hence real-world properties of objects constrain the referential domain, and this information in turn is used from a very early stage to influence parsing.
The results show that language processing immediately takes into account relevant non-linguistic context, and argue against models where initial syntactic decisions are guided solely by syntactic information.
One particular sort of visual information is information from the speaker themselves. We have seen in Chapter 9 that people’s recognition of speech can be influenced by the lip movements of the speaker (the McGurk effect). Lip-readers clearly make extensive use of this sort of information. The eye movements of the speaker (see also Chapter 13) provide another rich source of information for listeners. We tend to look at what the speaker is looking at; indeed, eye movements can be used to flag attention or a particular referent. When a speaker is describing a scene to a listener, the speaker naturally looks over the scene, and their eye movements relate to what they are describing. The eye movements of the listener come to match the eye movements of the speaker; they move over the scene in the same way, but with a delay of 2 seconds (Richardson & Dale, 2005).
SUMMARY
• Pragmatics is concerned with what people do with meaning.
• When we speak we do things with language; we make speech acts.
• In an indirect speech act, the intended meaning has to be inferred from the literal meaning.
• Indirectness is an important mechanism for maintaining politeness.
• Grice proposed that conversations are maintained by the four maxims of quantity, quality, relevance, and manner; of these, relevance is the most important.
• If an utterance appears to flout one of these maxims, we make a conversational implicature to make sense of it.
• Conversations have a structure; we take turns to speak, and use many cues (such as gaze) to ensure smooth transitions between turns.
• Audience design is the process of speakers modifying their utterances to take the needs of listeners into account.
• In conversation speakers come to align their internal representations at all levels.
• Language processing takes relevant non-linguistic context into account.
1. Keep a record for a few days of what you talk about. How much is to do with the here-and-now?
2. When do people interrupt others?
3. When you talk, what do you look at? Why?
4. When you talk to someone, how much attention do you pay to whether or not they are following you?
5. How would you modify your speech if a tourist who is obviously a poor speaker of your native
language stops you in the street and asks you for directions? What does this example tell us
about audience design?
FURTHER READING
Sperber and Wilson (1987) is a summary of their book on relevance, with a peer commentary. Clark
(1996) is a classic work on using language. See Henderson and Ferreira (2004) for an edited collec-
tion on language and the visual world.
CHAPTER 15
THE STRUCTURE OF THE LANGUAGE SYSTEM
INTRODUCTION
This penultimate chapter draws together many issues from the rest of the book. The architecture of a building indicates what it looks like, how its parts are arranged, its style, and how to get from one room to another. What is the architecture of the language system? How are its modules arranged, and how do they interact with one another? How many lexicons are there? This chapter examines how the components of the language system relate to one another. In particular, it will look at how different types of word recognition and production interrelate. A final question is the extent to which language processes depend on other cognitive processes. Although this issue was also considered in Chapter 3, we focus here on the relation between language processing and memory.
Caplan (1992) described four main characteristics of the language-processing system, based on Fodor’s (1983) classic account of the modularity of mind. First, the language system is not a unitary structure, but is divided into a number of modules. Fodor (1983) said that modules are informationally encapsulated: Each module takes only one particular representation as input, and delivers only one type of output. For example, the syntactic processor only takes a word-level representation and does not accept input directly from the acoustic level. We would have to revise this assumption if we found evidence for interaction or leakage between modules. Many researchers believe that a completely modular system is the most economical one, and it is parsimonious to believe that the language system comprises encapsulated modules unless there is good evidence to the contrary. We have seen throughout this book that language is highly modular, and these modules can be located in distinct brain regions, but the extent to which the modules are encapsulated is often highly controversial.
Second, processes within a module are mandatory and automatic in that if there is an input to the module, subsequent processing is obligatory. For example, normally we cannot help but read a word and access its meaning, even when it is to our advantage not to do so (as in the Stroop task, where we cannot ignore the meaning of the word whose ink color we are trying to name). Nevertheless, our views on what is automatic do change. Imaging studies show that when the attentional system is overloaded, the brain cannot distinguish between random letters and meaningful words; that is, in situations of extreme overload, reading a word is not mandatory (Rees, Russell, Frith, & Driver, 1999).
Third, language processes generally operate unconsciously. Indeed, the detailed lower level processes are not even amenable to conscious introspection. Finally, Caplan observed that most language processing takes place very quickly and with great accuracy. Taking these final points together, much of language processing is characteristic of automatic processing (Posner & Snyder, 1975; Shiffrin & Schneider, 1977). Obviously most real-life language tasks involve a number of modules, and trying to coordinate them might be slow and error-prone, as is the case with speech production.
Throughout the book, it has become obvious that the extent to which language processes interact is very controversial. As a very general conclusion, we have observed that the earlier in processing a process is, the more likely it is to be autonomous. By the end of this chapter and book you should:
• Know about the components of the language system and how they relate to each other.
• Understand the extent to which language processes are interactive.
• Appreciate some differences between reading and listening.
• Understand how we repeat words, and how repetition can be affected by brain damage.
• Understand the role that working memory plays in language processing.
WHAT ARE THE MODULES OF LANGUAGE?
What modules of the language system can we identify? When we see, hear, or produce a sentence, we have to recognize or produce the words (Chapters 6, 7, 9, and 13), and decode or encode the syntax of the sentence (Chapters 10 and 13). All of these tasks involve specific language modules. Little is known at present about the relation between the syntactic encoder and decoder, although the evidence described in Chapter 13 suggests that they are distinct. But does semantic information direct syntactic modules to do particular analyses (strong interaction), or just to reject implausible analyses and cause reanalysis? We looked at this in the chapter on parsing and syntactic ambiguity (Chapter 10).
The semantic-conceptual system is responsible for organizing and accessing our world knowledge and for interacting with the perceptual system. We discussed the way in which word meanings might be represented in Chapter 11. Most researchers currently think that they are decomposed into semantic features, some of which might be fairly abstract. Initial contact with the conceptual system is probably made through modality-specific stores. The meanings of words can be connected together to form a propositional network that is operated on by schemata (in comprehension; see Chapter 12) and the conceptualizer (in production; see Chapter 13).
Throughout this book we have seen how neuropsychological case studies show us that brain damage can affect some components of language while leaving others intact. We have seen dissociations in reading and speech production. Some patients have preserved lexical access but impaired syntactic processing, while others show the reverse pattern. The pattern of performance of people with Parkinson’s disease and Alzheimer’s disease is quite different, leading some researchers to conclude that specific instances are stored in the mental lexicon in one part of the brain, while general grammatical rules are processed elsewhere, although again this is controversial (see Chapters 3 and 13).
There are obviously enormous differences between language processing in the visual and the auditory modalities, given the very different natures of the inputs. Even if there is phonological recoding in reading, it is unlikely to be obligatory to gain access to meaning in languages with deep orthography that have many irregular words, such as English. In addition, the temporal demands of spoken and visual word recognition are very different. In normal circumstances, we have access to a visual stimulus for much longer than an acoustic stimulus. We can backtrack while reading, but we are unable to do this when listening. It is even possible that fundamental variables have different effects in the two modalities. It is more difficult to find frequency effects in spoken language recognition than in visual word recognition (Bradley & Forster, 1987). Nevertheless, in normal circumstances the reading and listening systems develop closely in tandem: Except for very young children, there is a very high correlation between auditory and visual comprehension skills (Palmer, MacLeod, Hunt, & Davidson, 1985). Differences between the modalities may extend beyond word recognition. Kennedy, Murray, Jennings, and Reid (1989) argued that parsing differs in the two modalities. With written language, we have the opportunity to go back to it, but access to spoken language is more transient.
We saw in Chapter 9 that the data strongly suggest that speech recognition is a data-driven, purely bottom-up process. In contrast, we saw in Chapter 13 that the data suggest that speech production is a non-modular process involving feedback. Is there a contradiction here? Why should recognition involve no feedback, but production a great deal of it? The tasks are very different: In speech recognition, the goal is to extract the correct meaning as quickly as possible; the speech signal fades rapidly; and there is some redundancy in the input. And while we need to get at the meaning and truth of what we are hearing, we do not need to construct detailed representations of everything. In production, however, we need to be accurate. We do need to produce every word in full and construct every syntactic representation in detail. We need to make sure that one part of the sentence agrees with all the others. Traditionally, language production and language comprehension have been treated as distinct modules; however, recent thinking is that they are much more intertwined (Pickering & Garrod, 2013).

Inner speech

What about inner speech, that little voice we often hear in our head telling us what to do? Clearly inner speech is produced by the speech production system, but it stops short of full articulation. How short? Vigliocco and Hartsuiker (2002) argue that inner speech is in a phonetic code; that is, it is relatively late. There are two main pieces of relevant evidence. The first is that articulatory suppression (speaking out aloud) stops the inner voice, and articulatory suppression interferes with the phonetic code. Second, levels of representation are not accessible to consciousness prior to the phonetic code (we have no sense of knowing what a lemma or a phonological code is), but clearly inner speech is accessible to consciousness.

Recent research on getting people to mentally recite tongue twisters has shown that people make speech errors in inner speech showing phonological effects resembling those made in overt speech, such as the lexical bias effect. However, opinion is divided as to whether the errors show that inner speech is specified as far as the sound featural level, or whether it is only specified at some more abstract level, as there is uncertainty about the degree of phonological similarity effects found in these internal errors (Corley, Brocklehurst, & Moat, 2011; Oppenheim, 2012; Oppenheim & Dell, 2008). Finally, we saw in Chapter 7 that reading often results in inner speech.

HOW MANY LEXICONS ARE THERE?

We have seen how some researchers believe that there are multiple semantic memory systems, one for each input modality. How many lexicons are there? When we recognize a word, do we make contact with the same lexicon regardless of whether we are listening to speech or reading written language? Do we have just one mental dictionary, or is it fractionated, with a separate one for each modality? Clearly the peripheral features of lexical processing (letters versus sounds, for example) must differ depending on the modality, so the question should be rephrased as: Is there one lexicon of lemmas (abstract lexical units; see Chapter 13), or multiple systems of lemmas, one for each modality? In Levelt’s original conception of lemmas they are modality neutral, but is that actually the case? In fact lemmas, although an important idea in speech production, are rarely mentioned in the word recognition literature.

The most parsimonious arrangement is that there is only one lexicon, used for the four tasks of reading, listening, writing, and speaking. Alternatively, we may have four lexicons, one each for the tasks of writing, reading, speaking, and listening. It is also plausible that there are two lexicons: One possibility is that there are separate lexicons for written (visual) language and spoken (verbal) language (each covering input and output tasks, that is, recognition and production), and another is that there are separate lexicons for input and output (each covering written and spoken language).

Note that to some extent the answers to these questions depend on how we define our terms. If by “lexicon” we just mean “the complete mental dictionary,” there can be only one lexicon, but perhaps with a number of subsystems and peripheral stores. If on the other hand we mean a discrete system used to access semantics, we could have a number of them. Differences in the use of terminology like this make lexical architecture a difficult and confusing topic.
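The candidate arrangements discussed in this section (one lexicon, two lexicons split by modality, two lexicons split by direction, or four lexicons) can be made concrete with a small sketch. The Python below is purely illustrative and is not taken from any published model: the arrangement names, the `ARRANGEMENTS` table, and the helper functions are hypothetical labels introduced here. Each arrangement simply maps the four tasks onto lexicon labels, so that we can ask which pairs of tasks would draw on the same store.

```python
from itertools import combinations

TASKS = ("reading", "listening", "writing", "speaking")

# Hypothetical encoding: each arrangement assigns every task to a lexicon label.
ARRANGEMENTS = {
    "one_lexicon": {task: "shared" for task in TASKS},
    "by_modality": {
        "reading": "written", "writing": "written",
        "listening": "spoken", "speaking": "spoken",
    },
    "by_direction": {
        "reading": "input", "listening": "input",
        "writing": "output", "speaking": "output",
    },
    "four_lexicons": {task: task for task in TASKS},
}


def share_lexicon(arrangement, task_a, task_b):
    """Return True if the two tasks use the same lexicon under this arrangement."""
    mapping = ARRANGEMENTS[arrangement]
    return mapping[task_a] == mapping[task_b]


def shared_pairs(arrangement):
    """List every pair of tasks that shares a lexicon under this arrangement."""
    return [pair for pair in combinations(TASKS, 2)
            if share_lexicon(arrangement, *pair)]


def lexicon_count(arrangement):
    """Count the distinct lexicons that the arrangement posits."""
    return len(set(ARRANGEMENTS[arrangement].values()))
```

For instance, `shared_pairs("by_direction")` pairs reading with listening and writing with speaking, whereas `shared_pairs("four_lexicons")` is empty. On the simplifying assumption that tasks sharing a lexicon should be affected together, each arrangement predicts a different set of task pairs that can dissociate.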
Experimental data
Fay and Cutler (1977) interpreted form-based
word substitution speech errors as evidence that a
single lexicon was accessed in two different direc-
tions for speech production and comprehension
(Chapter 13). We saw, however, that malaprop-
isms can readily be explained without recourse
to a common lexicon in an interactive two-stage
model of lexicalization. In fact, most of the data
argue against a single lexicon used for both recog-
nition and production.
Winnick and Daniel (1970) showed that tachistoscopic recognition of a printed word was facilitated by the prior reading aloud of that word, whereas naming a picture or producing a word in response to a definition did not facilitate subsequent tachistoscopic recognition of those words (Chapter 6). Furthermore, priming in the visual modality produces much more facilitation on a test in the visual modality than auditory priming does, and vice versa (Morton, 1979b). In response, Morton (1979b) revised the logogen model, so that instead of one logogen for each word, logogen stores were modality-specific. He further distinguished between input and output systems. In support of this fractionation, Shallice, McLeod, and Lewis (1985) found that having to monitor a list of auditorily presented words for a target created little interference on reading words aloud. Furthermore, listening to a word does not activate the same areas of the brain that are activated by reading a word aloud and word repetition, as shown by PET (positron emission tomography) brain imaging (Petersen, Fox, Posner, Mintun, & Raichle, 1989). These pieces of evidence suggest that the speech input and output pathways are different.

[Figure: Listening to and repeating words. Color positron emission tomography (PET) scan showing areas of the human brain involved in word recognition. The active areas are highlighted in red and yellow. At top, the subject is listening to words only. The part of the brain activated is the auditory region as word sounds are heard. At bottom, the subject is both listening to words, and repeating them. The auditory (hearing) region is activated as well as a small motor control area (yellow, above the auditory region) involved in speech. Active areas show cerebral blood flow detected by PET, superimposed onto an image of the brain.]

Dell (1988) suggested that the feedback connections in his interactive model of lexicalization may be necessary for word recognition. Hence interactions in speech production arise through leakage along the comprehension route. As we have seen, evidence favors the view that the production and comprehension lexicons are distinct. Perhaps the role of feedback is limited or nonexistent in both production and recognition, but both involve attractor networks, giving rise to the observed interactions.
As we saw in Chapter 11, it can be difficult to distinguish between problems of access and problems of storage. Allport and Funnell (1981) argued that perhaps we do not need separate lexicons, just distinct access pathways to one lexicon. On the other hand, we have seen that semantic memory is split into multiple, modality-specific stores. It seems uneconomical to have four access pathways (for reading, writing, speaking, and listening) going to and from one lexicon, and then to four semantic systems. Indeed, the most plausible arrangement is that there are distinct lexical systems. Language processes split early in processing, and do not converge again until quite late.

Monsell (1987) examined whether the same set of lexical units is used in both production and recognition. He compared the effects of priming word recognition in an auditory lexical decision task by perceiving a word or generating a word. He found that generating a word facilitated its recognition, suggesting that producing a word activates some representation that is also accessed in recognition. This suggests that production and recognition use the same lexicon or separate networks that are connected in some way. Further evidence that the input and output phonological pathways cannot be completely separate is that there are sublexical influences of speech production on speech perception. For example, Gordon and Meyer (1984) found that preparing to speak influences speech perception, so there must be some sharing of common mechanisms. Monsell tentatively argued that the interconnection between the speech production and recognition systems happens at a sublexical level such as the phonological buffer used in memory-span tasks.

In summary, experimental data from people without brain damage suggest that spoken and visual word recognition make use of different mechanisms. There are distinct input and output stores, perhaps sharing some sublexical mechanisms.

Neuropsychological data and lexical architecture

There are very many neuropsychological dissociations found between reading, writing, and visual and spoken word recognition. In this section I examine data from patients whose behavior is consistent with damage to some routes of a model of lexical processing while other routes are intact. Several theorists, drawing on many sources, have tried to bring all this material together to form some idea of the overall structure of the language system (e.g., Ellis & Young, 1988; Kay, Lesser, & Coltheart, 1992; Patterson & Shewell, 1987). One such arrangement is shown in Figure 15.1. The neuropsychological data strongly suggest that there are four different lexicons, one each for speaking, writing, and spoken and visual word recognition, although these systems must clearly communicate in normal circumstances. This conclusion is consistent with the data from experiments on people without brain damage.

FIGURE 15.1 The overall structure of the language system. It is possible that the speech lemma system is unnecessary, or that other links to lemmas need to be introduced. [Diagram components: speech, pictures and seen objects, and print as inputs; auditory phonological analysis; abstract letter identification; phonological input buffer; phonological input lexicon; visual object recognition system; orthographic input lexicon; visual semantic system; verbal semantic system; lemmas; acoustic-to-phonological conversion; letter-to-sound rules; sound-to-letter rules; phonological output lexicon; orthographic output lexicon; phonological output buffer; orthographic output buffer; speech and print as outputs.]

At the heart of the model is a system where word meanings are stored and that interfaces with the other cognitive processes. This is the semantic system (or systems, with the multiple-semantics view). The four most important language behaviors are speaking, listening, reading, and writing.

Speaking involves going from the semantic system to a store of the sounds of words. This is the phonological output store. Understanding speech necessitates the auditory analysis of incoming speech in order to access a representation of stored spoken word forms. This is the phonological input store.

People with anomia have difficulty in retrieving the names for objects, yet can show perfect comprehension of those words. EE was consistently unable to name particular words, yet he had no impairment of the auditory recognition or comprehension of the words that he could not name (Howard, 1995). This finding suggests that the input and output phonological lexicons are distinct.

Some patients show a disorder called pure word deafness. People with pure word deafness can speak, read, and write quite normally, but cannot understand speech (Chapter 9). These patients also cannot repeat speech back. However, there are a few patients with word deafness who still have intact repetition, a condition called word meaning deafness. Word meaning deafness is rare, but has been reported by Bramwell (1897/1984) and Kohn and Friedman (1986). This shows that word repetition need not depend on lexical access. Indeed, if Figure 15.1 is correct, then we should be able to repeat speech using three routes (in a manner analogous to the three-route model of reading). First, there is a repetition route through semantics. Second, there is a lexical repetition route from the input phonological lexicon to the output phonological lexicon. Third, there is a sublexical repetition route from the input phonological buffer to the output phonological buffer that bypasses lexical systems altogether. Disentangling precisely which is impaired depends on the pattern of word and nonword repetition performance, along with the effect of semantic variables such as imageability. Obviously (assuming that they can be distinguished), if either the input or output buffer is disrupted, repetition should be impaired; I examine this idea later. We should also be able to see disruptions resulting from selective damage to and preservation of our three repetition routes.

If both the sublexical and the lexical routes are destroyed, then the person will be forced to rely on repetition through the semantic route. If the semantic route is intact, there will be an imageability effect in repetition, with more
imageable words repeated more readily. If there is also some damage to the semantic route, patients will make semantic errors in repetition (for example, repeating “reflection” as “mirror”). This is called deep dysphasia. Howard and Franklin (1988) described the case of MK, who was good at speech production. He was severely impaired at single-word and nonword repetition, but was good at the matching span task. He made semantic errors in repetition. Howard and Franklin concluded that MK had preserved input and output phonological systems, total loss of the sublexical repetition route, partial impairment of the lexical repetition route, and partial impairment of the semantic repetition route.

If only the lexical repetition route is left intact, then patients will be able to repeat words but not nonwords (as nonwords do not have a lexical entry). They will not be able to comprehend the words they repeat (as there is no link with semantics), and they will probably have difficulty in understanding and producing speech (because of the disruption to semantics). Nor should they show the effects of semantic variables such as imageability in repetition. They might also make lexicalization errors (repeating nonwords as close words, e.g., repeating “sleeb” as “sleep”). Dr. O (Franklin, Turner, Lambon Ralph, Morris, & Bailey, 1996) was close to this pattern. He could understand written words, but could not understand spoken words. He could, however, repeat spoken words quite well (80%) but was very poor at nonword repetition (7%).

If only the sublexical repetition route is left intact, patients will be able to repeat both words and nonwords, but will have no comprehension of the meaning of the words. Transcortical sensory aphasia fits this pattern (Chapter 13).

There are other possible combinations, of course. Patients might have damage to only one of the routes, leaving two intact. Damage to the sublexical route alone would lead to an impairment of repetition, with particularly poor repetition of nonwords, as they cannot be repeated through the direct and semantic repetition routes. This is the pattern observed in conduction aphasia (Martin, 2001). As damage to the lexical route alone should result in relatively good repetition of both words and nonwords (through the sublexical repetition route) and good comprehension (through the semantic route), a deficit of this type will be difficult to detect. The important conclusion, however, is that the patterns of repetition impairment found can be explained by this sort of model.

Different lexical systems are involved in reading and writing. Bramwell’s patient could not comprehend spoken words, but could still write even irregular words to dictation. This is incompatible with any general system mediating lexical stores, and with obligatory phonological mediation of orthographic-to-cognitive codes. We also saw in Chapter 7 that phonological mediation does not appear to be necessary for writing single words.

There is a great deal of neuropsychological evidence that there are distinct phonological and orthographic output stores. Beauvois and Derouesné (1981) reported a patient showing impaired spelling yet intact lexical reading. MH was severely anomic in speech but had much less severe written word-finding difficulties (Bub & Kertesz, 1982b). Patient WMA produced inconsistent oral and written naming responses. When given a picture of peppers, he wrote “tomato” but said “artichoke” (Miceli, Benvegnu, Capasso, & Caramazza, 1997). If a single lexicon were used for both speaking and writing, WMA would have given the same (erroneous) response in both cases. Some patients are better at written picture naming than spoken picture naming (Rapp, Benzing, & Caramazza, 1997; Shelton & Weinrich, 1997). The existence of patients such as PW who can write the names of words that they can neither define nor name aloud is evidence for the independence of these systems, and argues against obligatory phonological mediation in writing (Rapp et al., 1997). Rapp and Caramazza (2002) describe a patient who has more difficulty speaking nouns than verbs but greater difficulty writing verbs than nouns. This evidence suggests that different output stores are involved in speaking and writing, and that writing does not require the generation of a phonological representation of the word. Although there are some dissenting voices (e.g., Behrmann & Bub, 1992), most studies suggest that multiple lexical systems are involved.
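The predictions for the three repetition routes (semantic, lexical, and sublexical) set out earlier in this section can be summarized as a small lookup. What follows is a minimal sketch under strong simplifying assumptions: each route is treated as either fully intact or fully lost, partial damage (as in the case of MK) and the phonological buffers are ignored, and the function name and dictionary keys are hypothetical labels chosen here rather than terms from the literature.

```python
ROUTES = frozenset({"semantic", "lexical", "sublexical"})


def predict_repetition(intact):
    """Predict the repetition profile given the set of intact routes.

    All-or-none simplification: a route either works or it does not.
    """
    intact = frozenset(intact)
    unknown = intact - ROUTES
    if unknown:
        raise ValueError(f"unknown routes: {sorted(unknown)}")
    return {
        # Any surviving route can carry a real word.
        "repeats_words": bool(intact),
        # Nonwords have no lexical entry and no meaning, so in this
        # sketch only the sublexical route can repeat them.
        "repeats_nonwords": "sublexical" in intact,
        # Comprehension of the repeated word needs the link to semantics.
        "comprehends_words": "semantic" in intact,
        # Assumption: an imageability effect is predicted only when
        # repetition is forced to rely on the semantic route alone.
        "imageability_effect": intact == {"semantic"},
    }
```

Calling `predict_repetition({"lexical"})` yields word repetition without nonword repetition or comprehension, the profile that Dr. O approximated; `predict_repetition({"sublexical"})` yields repetition of words and nonwords without comprehension; and `predict_repetition({"semantic", "sublexical"})` is indistinguishable from the fully intact profile on these four measures, which is why damage to the lexical route alone is hard to detect.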
However, it is likely that these lexical systems interact, as damage to word meaning usually leads to comparable difficulties in both written and spoken output (Miceli & Capasso, 1997).

Sketch of a model

As we saw in Chapter 7, reading makes use of a number of routes. The exact number is controversial, as connectionist models suggest that the direct and indirect lexical routes should be combined. Figure 15.1 shows the traditional model incorporating an indirect reading route; the figure shows the maximum sophistication necessary in a model of lexical architecture. The direct route goes from abstract letter identification to an orthographic input store and then to the semantic system. The direct lexical reading route then goes straight on to the phonological output store. The indirect or sublexical route (which as we saw in Chapter 7 might in turn be quite complex) bypasses the orthographic input store and the semantic system, giving us a direct link between letter identification and speech. We saw that non-semantic reading means that the semantic system can sometimes be bypassed. We can also read out aloud a language with a regular orthography (e.g., Italian) without being able to understand it. Allport and Funnell (1981) argued that we cannot have a separate amodal lexicon mediating between systems. They reviewed evidence from word meaning deafness, phonological dyslexia, and deep dyslexia. They described a number of studies of patients that argue for a dissociation of cognitive and lexical functions. The semantic paraphasias of deep dyslexics rule out any model where translation to a phonological code is a necessary condition to be able to access a semantic code (as these patients can access meaning without retrieving sound).

Writing and speaking produce output across time. It makes sense to retrieve a word in one go rather than having to access the lexicon afresh each time we need to produce a letter or sound. This means that we have to store the word while we speak out its constituent sounds, or write out its constituent letters in order. This in turn means that we also need phonological and orthographic buffers. Writing involves going from the semantic system to print through the orthographic output store. We can also write nonwords to dictation, so there must be an additional connection between the phonological output buffer and the orthographic output buffer that provides sound-to-letter rules.

Of course, we can do other things as well. We can name objects. Most people think that we access the names of objects through the semantic system from a system of visual object recognition. We saw in Chapter 11 that some people think that different semantic systems are used for words and objects, so we might have to split the semantic system in two. There is also some controversial evidence from the study of dementia that at least one patient (DT) can name objects and faces without going through semantics (Brennen, David, Fluchaire, & Pellat, 1996; but see Hodges & Greene, 1998, and Brennen, 1999, for a reply). In this case, we need to add an additional route from the visual object recognition system that bypasses semantics to get to the phonological output store.

Note that there is no direct connection between the orthographic input store and the orthographic output store. Are there patients who can copy words (but not nonwords) without understanding them? Finding such patients would suggest that such a link will be necessary. We would need to find these sorts of patient to be certain about these links. There is also some question about whether we need distinct input and output phonological buffers, or whether one will suffice. We examine these issues in more detail later.

By the time we add lemmas and a non-semantic object-naming route, we end up with a model that is even more complicated than Figure 15.1, just to produce single words! Remember that this is the most complex model necessary. Connectionist modeling may show how routes (in addition to the lexical and sublexical reading routes) may be combined without loss of explanatory adequacy.

A final point on lexical organization is that it is not too important for the architecture of this model whether words are represented in the lexicon in a local or a distributed representation. In a distributed representation, words correspond to
patterns of activation over units rather than to individual units (see the discussion of the Seidenberg & McClelland, 1989, model in Chapter 7). Hence the visual input store corresponds to the hidden units in their model. In practice, it is very difficult, perhaps impossible, to distinguish between these possibilities. Given that individual words clearly do not correspond to individual neurons, they must be distributed to some extent. The important issue is the extent to which these distributed representations overlap. Of course, this is not to say that the processes that happen in the boxes in Figure 15.1 are not important; they are crucial. But we can nevertheless identify the general components of the language system and the way that information flows through it.

There are two complications with this type of neuropsychological data. The first is distinguishing between having two separate stores and having one store with two separate input and output pathways. This is a fundamental problem, first raised in Chapter 11 and earlier in this chapter. It is not always straightforward to address. The second complication is distinguishing between impairments to connections between input and output stores, and input and output phonological buffers.

LANGUAGE AND SHORT-TERM MEMORY

What is the relation between language processes and short-term memory (STM)? The role short-term memory plays in processes as diverse as speech perception, word repetition, parsing, comprehension, and learning to speak and read has inspired much research on both impaired and unimpaired speakers. Language plays an important role in short-term memory, and the contents of short-term memory are often linguistic.

Psychological research on short-term memory suggests that it is not a unitary structure. Baddeley and Hitch (1974) called the set of structures involved working memory. Working memory comprises a central executive (which is an attentional system), a visuo-spatial sketch pad (for short-term storage of spatial information), and a phonological loop (see Figure 15.2). Both the central executive and the phonological loop are important in language processing. The central executive plays an important role in semantic integration and comprehension, while the phonological loop plays a role in phonological processes in language. It is debatable, however, whether any component of general verbal memory plays a role in parsing.

FIGURE 15.2 A simplified representation of the working memory model (based on Baddeley & Hitch, 1974). [Diagram components: visuo-spatial sketch pad, central executive, phonological loop.]

We saw in Chapter 12 that reading span predicts performance on a range of reading and comprehension measures (Daneman & Carpenter, 1980). People hear or read sets of unrelated sentences, and after each set attempt to recall the last word of each sentence. Reading span is the largest size set for which a participant can correctly recall all the last words. This measure of reading span correlates with the ability to answer questions about texts, with pronoun resolution accuracy, and even with general measures of verbal intelligence such as SAT scores. Poor comprehenders often have a reduced reading span. Reading span is determined by the size or efficacy of working memory capacity. (Daneman & Carpenter have a different notion of working memory from Baddeley: Their “working memory” is most equivalent to the language-related components of the central executive in Baddeley’s scheme.) Low skills on a range of complex working memory tasks are associated with language
disabilities in childhood, particularly difficulty in learning to read (Gathercole, Alloway, Willis, & Adams, 2006).

Short-term verbal memory and lexical processing

In Baddeley’s (1990) conception of working memory, the phonological loop comprises a passive phonological store that is linked with speech perception, and an articulatory control process linked with speech production, which can maintain and operate on the contents of the phonological store. The effectiveness of the phonological loop is measured by auditory short-term memory (ASTM) tasks. These tasks measure our memory for digits and words in a number of ways. In the single-word repetition task, a person has to repeat single words (or digits) back aloud. In the two-word repetition task, the person has to repeat pairs of words back. In the pointing span task, the person hears a sequence of words or digits and then has to point in sequence to pictures corresponding to those items. In the matching span task, the person just has to say whether two lists are the same or different. No overt repetition is needed in the matching span tasks, but items have to be maintained in the input phonological buffer. Note that differing ways of measuring span size might give different results; for example, the pointing span task requires activation of semantic information in a way that repetition does not (Martin & Ayala, 2004).

One plausible idea is that the passive phonological store of the working memory system is the phonological buffer of the system we described earlier. A reduction in the size of the phonological store as a consequence of brain damage therefore should have consequences for language processing, but the consequences are less dramatic than you might at first suppose.

Reduced ASTM capacity should hinder language comprehension, because material cannot be stored in the phonological buffer for very long. However, very often this does not matter because we can access the syntactic and semantic properties of words very quickly. We can then maintain lexical representations (e.g., by support from semantics) without the support of the phonological loop (Martin, 1993). However, impairment of the phonological loop does hinder the ability to repeat words, and particularly nonwords. Patients with impaired ASTM show an effect of word length in repetition. Patients with a repetition disorder (Shallice & Warrington, 1977) show very little impairment in language production and comprehension, but are still impaired at repeating words (Saffran, 1990).

The extent to which phonological short-term memory impairments are accompanied by speech perception impairments is controversial. Some studies (e.g., Campbell & Butterworth, 1985; Vallar & Baddeley, 1984) report patients with very reduced ASTM span, yet with apparently normal phonological processing, whereas others (e.g., Allport, 1984) argued that many patients have a subtle phonological processing deficit that can only be detected by difficult tests. One resolution of this conflict may be that phonological short-term memory impairments may involve damage to either the input or the output phonological buffer, but not to phonological processing.

The degree of ASTM impairment influences language function. A span reduced to just one or two items can have profound consequences for language processing, including single-word processing. At spans of two or three, single-word processing is usually intact, but performance on longer sequences of words can be impaired (Martin, Saffran, & Dell, 1996). Even when naming and word repetition are relatively spared, performance on nonword repetition tasks might still be impaired, because nonwords cannot receive support from semantic representations.

Can lexical processing be damaged leaving ASTM intact? Martin, Lesch, and Bartha (1999) argued that it cannot. They proposed that memory buffers for phonological, lexical, and semantic processing contain those items in long-term memory structures that have been recently activated. Damage to semantic representations will have consequences for maintaining the integrity of lexical representations.
Some patients with mild speech perception deficits do not show impairments of ASTM (Martin & Breedin, 1992). In cases of mild speech perception impairment, lexical items will still be able to become activated.

Martin and Saffran (1990) examined the repetition abilities of a patient (ST) with transcortical sensory aphasia. They showed that their patient could not repeat more than two words without losing information about the earlier items in the input (here the first word of two words). People with a semantic impairment cannot maintain items at the beginning of a sequence. Word repetition is supported by phonological processes, but these processes are of short duration without the feedback support of semantic processes. Items at the beginning of the sequence get lost because their maintenance depends on activation spreading to semantics. Items at the end of the sequence benefit from the recency of phonological activation, and are not dependent on that semantic feedback at the time of recall. So, although good repetition is characteristic of transcortical sensory aphasia, even that ability is limited. Martin and Saffran (1997) found similar associations between the occurrence of semantic and phonological deficits and serial position effects in single-word repetition. Semantic deficits are associated with errors on the initial portion of the word, while phonological deficits are associated with errors on the final part of the word. This again points to the integrity of the language and memory systems.

Are there separate input and output phonological buffers?

Can we distinguish between the input and output phonological buffers? If Figure 15.1 is correct then we should be able to do so, and there is some neuropsychological evidence that we can. Shallice and Butterworth (1977) described the case of JB. On tasks probing memory span, JB performed poorly, suggesting an impaired input phonological buffer, but she had normal speech production, suggesting a preserved output phonological buffer. Nickels and Howard (1995) found no correlation between the number of phonological errors made in production by their sample of aphasic speakers and three measures of input phonological buffer processing (phoneme discrimination, lexical decision, and synonym judgments). However, Martin and Saffran (1998) found a negative relation between the proportion of target-related nonword errors in a naming task and the patient’s ability to discriminate phonemes. One possible resolution of this disagreement is that the two buffers are interconnected.

Other evidence also supports the existence of separate input and output phonological buffers. Romani (1992) described a patient with poor sentence and word repetition but good performance on immediate probe recognition, suggesting an impaired output buffer but an intact input buffer. Similarly, R. C. Martin et al. (1999) describe the case of an anomic patient, MS, who showed a different pattern of performance on tasks involving the input and output phonological buffers. In particular, his performance was poor on STM tasks that required verbal output, but normal on STM tasks that did not require verbal output but required the retention of verbal input. The pattern of performance suggests that separate input and output phonological buffers are involved. Shallice et al. (2000) described a patient (LT) with reproduction conduction aphasia. LT was impaired across a range of language output tasks; remember that the best explanation for such a pattern of performance is an impairment to the phonological buffer. Yet LT had an intact short-term memory span, suggesting that the input phonological buffer was spared but the output phonological buffer was damaged. Finally, patients with impaired ASTM fall into clusters in performance on visual homophone judgment, pseudohomophone judgment, and auditory and visual rhyme decision tasks in a way that can best be accounted for by separate input and output phonological buffers (Nickels, Howard, & Best, 1997). In particular, some patients showed evidence of damage to the input buffer, in being impaired on all tasks apart from homophone judgment. Other patients showed evidence of damage to the output buffer, in that they were impaired on all tasks other than auditory rhyme judgments.
Furthermore, some patients showed evidence of a lesion to the link between the output and the input buffers, in that they could perform homophone and auditory rhyme judgments well, but were poor at pseudohomophone detection and visual rhyme detection.
The phonological loop and vocabulary learning
We have seen that because we can access the meaning of words so quickly, damage to the phonological loop has surprisingly few consequences for language processing. The main role for the phonological loop is now thought to be limited to learning new words (Baddeley, Gathercole, & Papagno, 1998). Verbal short-term memory also plays a role in vocabulary acquisition in children (Gathercole & Baddeley, 1989, 1990, 1993; Gupta & MacWhinney, 1997). The size of verbal STM and vocabulary size are strongly correlated, and early nonword-repetition ability predicts later vocabulary size. Nonword repetition skills also predict success at foreign language vocabulary acquisition (Papagno et al., 1991). Patients with impaired short-term phonological memory (e.g., PV of Baddeley et al., 1998) find it difficult to learn a new language. Phonological memory is used to sustain novel phonological forms so that they can be built into more permanent representations.
Working memory and parsing
Although short-term memory plays some role in integration and maintaining a discourse representation, the extent to which an impairment of STM affects parsing is controversial. Early models of parsing considered the minimization of STM demands to be a primary constraint on parsing (e.g., Kimball, 1973). With a conception of working memory as a phonological loop and central executive, the phonological representations of words are stored in the phonological buffer of the loop, and the semantic representations of focal components of the discourse are handled by the central executive. The central executive might play a role in parsing, in computing parsing processes, and in manipulating the intermediate results of computations.
The idea that a central memory capacity is used in language comprehension is known as the capacity theory of comprehension (Just & Carpenter, 1992). Just and Carpenter argued that working memory constrains language comprehension. Individual differences in linguistic working memory capacity lead to differences in reading ability, and reduction of working memory capacity through aging or brain damage leads to language comprehension deficits. As we saw in Chapter 10, some researchers have put forward the controversial view that the deficits observed in syntactic comprehension are best explained by a reduction in central executive capacity (Blackwell & Bates, 1995; Miyake et al., 1994). Waters and Caplan (1996) criticized the capacity theory, arguing that language processing makes use of two distinct working memory systems, one dedicated to controlled, verbally mediated tasks, and one dedicated to automatic, obligatory “routine” language processing. They call this the domain-specific view of working memory (Caplan & Waters, 1999).
There is some evidence against the domain-specific view suggesting that working memory is involved in parsing. Gibson (1998) examined the relation between working memory and sentence processing. He argued that comprehension has two sorts of demands on available computational resources: a cost associated with integrating components, and a cost associated with keeping track of syntactic structures. The costs increase the longer a unit must be kept in memory before it can be integrated into the developing representation of the sentence. Gibson argued that the human parsing mechanism prefers the structure that incurs the least memory load. More recent dual-task studies show that parsing is impaired if people have to remember additional related material; the more syntactically complex the material, the greater the cost of remembering additional words. The key to observing interference is that the additional items that must be kept active in memory must be related to the material participants are trying to understand, rather than being unrelated digits, for example (Fedorenko, Gibson, & Rohde, 2006; Gordon, Hendrick, & Levine, 2002). As noted in Chapter 10, the debate about whether or
not language comprehension uses general working memory or a dedicated store is important, but is unresolved and ongoing.
Does parsing involve the phonological loop in particular? On the one hand, some researchers argue that the phonological loop maintains some words in short-term memory to assist in parsing, particularly when parsing is difficult (e.g., Baddeley, Vallar, & Wilson, 1987; Vallar & Baddeley, 1984, 1987). Although some patients with STM deficits have impaired syntactic comprehension abilities (e.g., Vallar & Baddeley, 1987), others crucially do not (e.g., Butterworth, Campbell, & Howard, 1986; Howard & Butterworth, 1989; Waters, Caplan, & Hildebrandt, 1991). For example, TB (a patient with a digit span of only two) showed increasing problems with comprehension as sentence length increased (Baddeley & Wilson, 1988). On the other hand, other researchers have argued that the phonological loop plays no role in parsing, but is involved in later processing after the sentence has been interpreted syntactically and semantically (e.g., McCarthy & Warrington, 1987a, 1987b; Warrington & Shallice, 1969). This later processing includes checking the meaning against the pragmatic context, making some inferences, and aspects of semantic integration. For example, patient BO had a memory span of only two or three items, yet had excellent comprehension of syntactically complex sentences, including those with dependencies spanning more than three words (Caplan, 1992; Waters et al., 1991). RE was a highly literate young woman with a greatly reduced digit span. Although she displayed phonological dyslexia and impaired sentence repetition, her syntactic analysis and comprehension abilities appeared to be intact (Butterworth et al., 1986; Campbell & Butterworth, 1985; but see Vallar & Baddeley, 1989). McCarthy and Warrington (1987a) observed a double dissociation, with some patients showing an impairment to a passive phonological store involved in unrelated word list repetition, but who were good at repeating sentences, and others showing an impairment to a memory system involving meaningful sentence repetition, but who could repeat lists of unrelated words. Rochon et al. (1994) examined syntactic processing in a group of patients with Alzheimer’s-type dementia. Although the participants’ working memory capacity was reduced, there was little effect of syntactic complexity, although semantic complexity was affected. Such results suggest that STM is not involved directly in parsing. Such patients can still display a variety of comprehension difficulties (such as turning commands into actions, or detecting discourse anomalies), suggesting that limited STM can affect later integrative processing.
Hence it seems likely that if there is a reduction in processing capacity involved in syntactic comprehension deficits, it is a reduction specifically in syntactic processing ability, rather than a reduction in general verbal memory capacity (Caplan et al., 1985; Caplan & Hildebrandt, 1988; Caplan & Waters, 1996, 1999). Parsing uses a specific mechanism that does not draw on verbal working memory. However, these more general processes may become involved later in the comprehension process. This topic is hotly debated (Caplan & Waters, 1996, 1999; Just & Carpenter, 1992; Just et al., 1996; Waters & Caplan, 1996). The conclusions to be drawn from all this depend on exactly how syntactic complexity is to be defined, and on the range of sentence types, tasks, patient categories, and language examined (Bates, Dick, & Wulfeck, 1999).
MacDonald and Christiansen (2002) take a totally different approach to the idea of working memory as a separate store. They adopt a connectionist perspective, arguing that the capacity limitations arise from the architecture of the language system, and from individual differences in reading experience. In particular, there is no separate working memory in the sense that there is a box into which the results of linguistic computations are put. Capacity and knowledge are inseparable. Instead, capacity limitations arise from the behavior of the whole system, rather than from one component of it. MacDonald and Christiansen provided a connectionist model to simulate individual differences in language comprehension, showing how these differences can arise from differences in the amount of training the networks receive. This alternative approach has generated considerable controversy (Caplan & Waters, 2002; Just & Varma, 2002). Nevertheless the idea is pleasingly simple and parsimonious.
Evaluation of work on language and memory
Experimental and neuropsychological research points to the integrity of the language and memory systems: The phonological loop is the phonological buffer of language processing. Hence disruptions to language and short-term memory are intimately related. Auditory short-term memory is therefore involved in many linguistic tasks. Nevertheless, because we access the syntactic and semantic properties of words so quickly, damage to the phonological loop has surprisingly few consequences for language processing, apart from impairments in repetition ability and vocabulary acquisition. So all aphasics have span impairments, but not everyone with span impairments is aphasic.
The contents of the phonological stores are those items in long-term representations that have become highly activated. Indeed, all levels of linguistic processing may correspond to components of working memory. This is the multiple components idea. Evidence supporting it comes from neuropsychological studies that show that some span-reduced patients are worse at tasks involving semantic information than those involving phonological information, whereas other patients show the reverse pattern (Hanten & Martin, 2000; Martin, Shelton, & Yaffee, 1994). The components of lexical processing are the components of the memory system (see Martin & Saffran, 1992; R. C. Martin & Lesch, 1996). Phonological STM deficits are linked with damage to the temporal-parietal region of the brain, while semantic STM deficits are linked with damage to frontal regions (Romani & Martin, 1999).
Language research is revealing about the structure of working memory. In particular, it suggests that there are two phonological stores involved in the phonological loop: an input and an output buffer, each of which can be selectively disrupted.
SUMMARY
• Most language processes are fast, mandatory, and mostly automatic.
• Language processing involves a number of modules; the extent to which these modules are independent of each other is hotly debated.
• The lack of permanence of the auditory input in listening leads to a number of differences between reading and listening.
• Experimental data from people without brain damage suggest that there are different lexical systems for language production and language comprehension; this conclusion is supported by brain-imaging studies.
• The neuropsychological data suggest that there are distinct lexical systems for reading, writing, listening, and speaking.
• It is possible to create a model of the architecture of word processing (see Figure 15.1).
• There are three routes that can be used for word repetition, and these routes can be selectively lost or spared.
• Working memory has a number of components; the most important for language is the phonological loop, comprising a passive phonological store and an articulatory control process.
• It is likely that there are separate input and output phonological stores (buffers).
• Impairment to auditory short-term memory (ASTM) has significant consequences for language processing.
• Working memory is involved in language comprehension and integration, but the extent to which it is involved in parsing is very questionable.
• The phonological loop is important in first- and second-language vocabulary acquisition.
1. Do you think that a language system with multiple semantic stores is more plausibly combined
with separate or unitary lexical systems?
2. Are there kinds of patients that we should not observe, if Figure 15.1 is correct?
3. What role does the central executive play in language?
4. How do we decide whether or not two words rhyme?
5. What is a lexicon?
6. How does the content of what we know about the structure of the language system relate to what
we have learnt about the brain in earlier chapters?
FURTHER READING
For reviews of picture naming, see Glaser (1992) and Morton (1985). Allport and Funnell (1981) review many of the issues concerning lexical fractionation; they argue for the separability of cognitive and lexical codes. Monsell (1987) is a comprehensive review of the literature on the fractionation of the lexicon. Ellis and Young (1988, Ch. 8) provide a detailed discussion of the neuropsychological evidence for their proposed architecture of the language system. See also the PALPA test battery (Kay, Lesser, & Coltheart, 1992). See Shelton and Caramazza (1999) for a review and discussion of how lexical architecture relates to semantic memory. Bradley and Forster (1987) review the differences between spoken and visual word recognition.
See Baddeley (2007) and Eysenck and Keane (2010) for more on working memory. Howard and Franklin (1988) give a detailed single-case study of a patient (MK) with a repetition disorder, and Martin (2001) is an excellent review of repetition disorders in aphasia. For more on the role of the phonological loop in vocabulary learning, see the debate between Bowey (1996, 1997) and Gathercole and Baddeley (1997). See Meyer, Wheeldon, and Krott (2006) for a collection that examines which language processes might be automatic and which might require resources.
CHAPTER 16
NEW DIRECTIONS
INTRODUCTION
In this chapter we re-examine the themes raised in the first chapter. I also summarize the present status of the psychology of language, and indicate where it might go in the future.
By now I hope you have been convinced that psycholinguists have made great progress in understanding the processes involved in language. Since the birth of modern psycholinguistics, sometime around Chomsky’s (1959) review of Skinner’s book Verbal Behavior, it has achieved independence from linguistics and has flourished on all fronts. I also hope you have been convinced that the cognitive approach to psycholinguistics in particular has taught us a very great deal indeed. Many questions remain, and in some respects the more we learn, the more questions are raised.
THEMES IN PSYCHOLINGUISTICS REVISITED
I raised ten main themes in Chapter 1. Let’s look at them all again.
The first theme was to discover the processes involved in producing and understanding language. Modern psycholinguistics is founded on data. Careful behavioral and neuroscience experiments have clearly told us a great deal about the processes involved in language. However, as in all science, there are two main ways of doing things. These can be called the bottom-up and top-down approaches to science. In the bottom-up mode, psycholinguists are driven by empirical findings. Perhaps there is a novel finding, or a prediction from a theory that does not come out as predicted. A model is then constructed to account for these findings. Alternatively, a theory might be bolstered by having its predictions verified. Either way, experimental results drive theoretical advances. A top-down approach does not necessarily worry too much about the data in the first instance (although it obviously cannot afford to avoid them), but instead tries to develop a theoretical framework that can then be used to make sense of the data. Predictions are derived from these frameworks and then tested. In the past, examples of top-down approaches have included linguistics and symbolic AI, and currently the most influential top-down approach is connectionism. Of course these modes of thought are not exclusive. We know a great deal about language processes from both experiments and modeling. Progress is a process of interaction between the bottom-up and top-down approaches.
The second theme is the question of whether apparently different language processes are related to one another. For example, to what extent are the processes involved in reading also involved in speaking? We have seen that while there is some overlap, there is also a great deal of separation (e.g., see Figure 15.1).
The third theme is whether or not processes in language operate independently of one another, or whether they interact. In modular models, the boxes of the diagrams used to represent the structure of the language systems carry out their computations
independently of the others, and other boxes only get access to the final output. Interactive models allow boxes to fiddle around with the contents of other boxes while they are still processing, or they are allowed to start processing on the basis of an early input rather than having to wait for the preceding stage to complete its processing. This issue has recurred through every chapter on adult psycholinguistics. Although there is a great deal of disagreement among psycholinguists, the preponderance of evidence, in my opinion, suggests that language processing is strongly interactive, although there are constraints. There may be modules, but they are leaky ones: modules need not be informationally encapsulated. The debate has now largely moved on from simply whether language processes are modular or interactive, to examining the detailed time course of processing. When does interaction occur? What types of information interact? Can interactions be prevented? Psycholinguists have started to dispense with broad, general considerations, and to focus on the details of what happens. Context can have different effects at different levels of processing.
Fourth, what is innate about language? We have seen that there is still disagreement about whether the developing child needs innate, language-specific content in order to acquire language. Connectionist modeling has shown how language might be an emergent process, the development of which depends on general constraints, although this remains controversial.
Fifth, do we need to refer to explicit rules when considering language processing? There is currently little agreement on this, with researchers in the connectionist camp against much explicit rule-based processing, and traditionalists in favor of it (e.g., see the debates on past tense acquisition in Chapter 4 and on dual-route models of reading in Chapter 7). There is considerable evidence that children make much use of statistical learning of distributional information when acquiring language. A recent study, for example, has found a correlation between children’s statistical learning skills and reading ability (Arciuli & Simpson, 2012).
The sixth theme is the extent to which language processes are specific to language. We have seen how this issue has proved very controversial, particularly in language development. There is a divide between those who argue that children need language-specific information (which is usually thought to have some innate origin) to acquire language, and those who argue that acquisition needs no more than general learning principles, such as the ability to make use of distributional information.
Seventh, how sensitive are the results of our experiments to the particular techniques employed? Our results are sometimes very sensitive to the techniques used, and this means that in addition to having a theory about the principal object of study, we need to have a theory about the tools themselves. Perhaps this is most clearly exemplified by the debate about lexical decision and naming, and whether they measure the same thing.
Eighth, a great deal can be learned by examining the language of people with damage to the parts of the brain that control language. In recent years, cognitive neuroscience imaging data has provided some of the most interesting and important contributions to psycholinguistics.
[Figure: Cognitive neuropsychology has provided some interesting and important contributions to the study of language.]
Ninth, language is cross-cultural. Studies of processing in different languages have told us a great deal about topics such as language development, reading, parsing, language production, and neuropsychology. The results suggest that while the same basic architecture is used to process different languages, it is exploited in different ways. That is, we all share the same hard-wired
modules, but they vary slightly in what they do. Hence there are some important cross-linguistic differences, and these differences are of theoretical interest.
Finally, we should be able to apply psycholinguistic research to everyday problems. We can discern five key applications: First, we now know a great deal about reading and comprehension, and this can be applied to improving methods of teaching reading (Chapters 8 and 12). Second, these techniques should also be of use in helping children with language disabilities; for example, the study of developmental dyslexia has aroused much interest (Chapter 8). Third, psycholinguistics helps us to improve the way in which foreign languages can be acquired by children and adults (Chapter 5). Fourth, we have greatly increased our understanding of how language can be disrupted by brain damage. This has had consequences for the treatment and rehabilitation of brain-damaged patients (e.g., see Howard & Hatfield, 1987). Fifth, there are obvious advantages if we can develop computers that can understand and produce language. This is a complex task, but an examination of how humans perform these tasks has been revealing. Generally, computers are better at lower level tasks such as word recognition. Higher level, integrative processes involve a great deal of context (Chapters 12 and 13), and this has proved a major stumbling block for work in the area.
In addition to these ten themes, we noted in Chapter 1 that modern psycholinguistics is eclectic. In particular, we have made use of data from cognitive neuropsychology and techniques of connectionist modeling.
We have seen that the study of impairments to the language system has cast light on virtually every aspect of psycholinguistics. For example, it has provided a major motivation for the dual-route model of reading (Chapter 7); it has enhanced our understanding of the development of reading and spelling (Chapter 8); it has provided interesting if complex data that any theory of semantics must explain (Chapter 11); it has bolstered the two-stage model of lexicalization (Chapter 13); and it has been revealing about the nature of parsing and syntactic planning (Chapters 10 and 13).
Connectionism has made many important contributions to psycholinguistics over the last 30 years. What are its virtues that have made it so attractive? First, as we have seen, unlike traditional AI it is more brain-like, in that processing takes place in lots of simple, massively interconnected neuron-like units. It is important not to get too carried away with this metaphor, but at least we have the feeling that we are starting off with the right sort of models. Second, just like traditional AI, connectionism has the virtue that modeling forces us to be totally explicit about our theories. This explicitness has had three major consequences. First, recall that many psycholinguistic models are specified as box-and-arrow diagrams (e.g., Figure 15.1). This approach is sometimes called, rather derogatorily, “boxology.” It is certainly not unique to psycholinguistics, and such an approach is not as bad as is sometimes hinted. It at least gives rise to an understanding of the architecture of the language system: what the modules of the language system are, and how they are related to others. However, connectionism has meant that we have had to focus on the processes that take place inside the boxes of our models. In some cases (such as the acquisition of past tense), this has led to a detailed re-examination of the evidence. Second, connectionism has forced us to consider in detail the representations used by the language system. This has led to a healthy debate, even if the first representations used by connectionist modelers turned out later not to be the correct ones (e.g., see Chapter 7 and the debate on using Wickelfeatures as a representation of phonology in the input to the reading system). Third, the emphasis on learning in many connectionist models focuses on the developmental aspect that is hopefully leading to an integration of adult and developmental psycholinguistics.
SOME GROWTH AREAS?
Students of any subject are obviously interested primarily in where a subject has been, whereas researchers naturally focus on where a subject is
going, and on helping it to get there. The study of the psychology of language has traveled an enormous distance since its beginning. It came into its own with the realization that there are psychological processes to be studied that are independent of linguistic knowledge. The proliferation of research in the area is enormous. Even in the years since the first edition of this book, the subject has been transformed, mostly by the influence of computational modeling and cognitive neuroscience. So, where is it going? Unfortunately, as Einstein once remarked, “it is difficult to predict, especially the future.” In every edition I have tried to predict where psycholinguistics will be in five years’ time; and every time I have been wrong.
There is no particular reason to expect a revolution in the way we examine or understand language. To some extent, the next five years are likely to see progress in solving the same sorts of problems using the same sorts of techniques. Of course, our models will be more sophisticated, and our experimental techniques more refined. My list of likely developments is rather arbitrary and perhaps personal, and some of these points have been covered in greater detail in earlier chapters. Nevertheless, this selection gives some flavor of global trends in the subject. Generally, the trend is towards more inclusive models covering more complex phenomena. For example, now our processing of morphologically simple words is relatively well understood, interest is growing in words that are morphologically more complex.
First, new techniques in neuroscience have become more accurate and more accessible, and will continue to do so. Imaging might tell us a great deal about the time course of processes, and when we make use of different sources of information (see Chapter 1). Brain scans are being increasingly presented in the case study literature. The continued increased use of imaging is perhaps the single most likely thing we can predict for the next five years.
[Figure: Techniques in brain imaging are becoming more accurate. This photo shows a child in a magnetoencephalographic scanner. This detects the magnetic fields generated by neural activity in different parts of the brain when stimulated by auditory signals received through the tubes to the ears and by visual information shown on a screen. The reactions within his brain are recorded by the scanner, and the eye-tracking device in front of the subject gives data on the gaze related to the brain activity.]
Second, we will develop new computational models of language. The use of straightforward connectionist models has more or less been exhausted, but there are other types of mathematical and computational models being used. Bayesian models are becoming increasingly popular in cognitive psychology in general, and will probably expand into areas of language processing.
Third, we can expect the more widespread use of developmental data to help resolve adult questions. For example, the study of how children learn and represent word meanings might be revealing with regard to adult semantic representation. There is also much more scope for computational modeling in this area. Developmental
models will become more process orientated, and we will form a clearer understanding of how processing changes throughout childhood before settling on the adult form.
Fourth, a full understanding of psycholinguistics would entail an understanding of the nature of all the components of the language processor, and how they are related to each other. We saw in Chapter 15 (Figure 15.1) how a start has been made on the word recognition system. One important goal of any integrative theory is to specify how the language system interfaces with other cognitive systems. That is, what is the final output of comprehension and the initial input to production? It is likely these are the same. In Chapter 12 we saw how currently the most likely proposal about the form of the output is a propositional representation associated with the activation of goals and other schemata (see the description of Kintsch’s model in that chapter). In Chapter 13, we saw that the conceptualizer that creates the input to the production system has been much neglected. In Chapter 11, we saw that the work of people like Jackendoff (1983) puts restrictions on the interface between the semantic and cognitive systems. There is a move to integrating research across areas. For example, the connectionist model of Chang, Dell, and Bock (2006) accounts for data from adult speech production (structural priming) and language acquisition (verb-argument structures). The Chang et al. model shows how language acquisition and adult speech production make use of the same mechanisms. A related question is how production and comprehension are related (e.g., Ferreira & Bailey, 2004; Pickering & Garrod, 2013). Clearly much remains to be done in this important area.
Fifth, the Internet and social networking make possible the use of very large corpora of language. We have seen in the study of semantics how HAL makes use of the co-occurrence information of a very large sample of text. Watts (2012) explores what we can learn about behavior and communication using Twitter and Facebook. For the first time we have readily available millions of samples of language in actual use. So, for example, rather than estimating things such as
corpus of actual usage. The Internet also makes it relatively easy through crowdsourcing to collect large-scale norms. We can also carry out “megastudies” using a huge number of participants and items. The challenge is developing new tools that will enable us to extract meaningful conclusions from these very large samples (see for example Bestgen & Vincze, 2012).
Finally, psycholinguistics will explore other participant groups and more naturalistic settings in greater detail. Recent years have seen an enormous diversification in who is being studied. We saw in the first chapter that many experiments have been on the visual processing of language by healthy monolingual college-aged participants. That is changing. One particularly important aspect of this is the cross-linguistic study of language. Most of the experiments described in this book have been on speakers of the English language. This does not just reflect my bias, because most of the work carried out has been on English. Research is also driven by the assumption that the underlying processing architecture is shared by languages, although there may be some important differences. There is likely to be more emphasis on how we process natural speech in more natural settings, away from the single word presented on a computer screen. The “visual world” paradigm (discussed particularly in Chapters 10, 13, and 14) has become particularly important in this respect, and is likely to become even more so.
Chomsky’s ideas have been very influential in this respect. You will remember that according to his position language is an innate faculty specified by the language acquisition device (LAD). All languages, because they are governed by the form of the LAD, are similar at some deep level. Variation between languages boils down to differences in vocabulary, and the parameters set by the exposure to a particular language. An alternative view is the connectionist one that similar constraints from general development and inherent in the data lead to similarities in development and processing across languages. In general, cross-linguistic comparisons help us to constrain the nature of this architecture and to explain the important differences. What are the consequences of these differences? Much
word frequency, we can specify it for a very large research remains to be done on this.
There are many areas where it is useful to compare languages. First, the observation that there are similar constraints on syntactic rules has been used to motivate the concept of universal grammar (Chapters 2, 3, and 4). To what extent can the connectionist view that language is an emergent process give an account of these findings? Second, we also saw in Chapter 10 that examining a single language (English) might have given us a distorted view of the parsing process. Third, similarities and differences in languages have consequences for language development (Chapter 4). For example, the cross-linguistic analysis of the development of gender argues against a semantic basis for the development of syntactic categories. Finally, what can analysis of different languages that map orthography onto phonology in different ways tell us about reading (Chapter 7)? And do different languages break down in different ways after brain damage?
Related to cross-linguistic studies, much remains to be learned about bilingualism, which is still receiving increasing attention in the research literature, both as a subject in its own right, and as a means to investigate underlying language processes.

CONCLUSION

The eventual goal of psycholinguistics is a detailed and unified theory of language and how it relates to other cognitive processes. The more we know, in some ways the harder it is to carry out psycholinguistics experiments. Cutler (1981) observed that the list of variables that had to be controlled in psycholinguistics experiments was large and growing, and there were many that were rarely considered. Here is Cutler’s (adapted) list for experiments on single words: syntactic class, ambiguity, frequency, frequency of related morphological and phonological forms, length, associations, age of acquisition, autobiographical associations, categorizability, concreteness, bigram frequency, imagery, letter frequency, number of meanings, orthographic regularity, meaningfulness, emotionality, recognition threshold, regularity, position of recognition point, and morphological complexity. Since then we have discovered that not only are the neighbors of words important, but also the properties of the neighbors are important. And this is before we have begun to consider constraints on processing units larger than a word. Cutler, writing in 1981, asked, “will we be able to run any psycholinguistic experiments at all in 1990?” The year 1990 has passed and we are still doing experiments in 2012, so the answer is obviously “yes,” but it is getting more difficult, and we have to make a number of carefully justified assumptions. It is apparent that we have to be particularly careful about how we choose our materials. Controlling variables we know about might not be enough. Forster (2000) showed that skilled psycholinguists have a great deal of implicit knowledge about language. When asked to make predictions about which word would be responded to fastest on a lexical decision task from word pairs controlled for known predictor variables, skilled researchers performed above chance. Hence it is always possible that researchers are unconsciously constructing their materials in a particular way. The remedy for this problem is making more use of random sampling of materials.
There still remains a great deal to do in psycholinguistics. It should be clear from reading this book that there is much we don’t know, and many occasions when there are competing interpretations of the data. If this book has inspired any reader to investigate further and even actually contribute to the subject, it has more than served its purpose.
1. How will this book be different in 2025?

2. Will the psychology of language (contrasted with the neuroscience of language) still be taught
in universities in 2050?
APPENDIX
CONNECTIONISM
This appendix provides a more formal and detailed description of connectionism than is given in the main text. I hope it is comprehensible to anyone with some knowledge of basic algebra. If you find the mathematics daunting, it is worth persevering, as many of the most important models in current psycholinguistics are types of connectionist models. See the suggestions for further reading for more detailed and comprehensive coverage.
Connectionism has become the preferred term to describe a class of models that all have in common the principle that processing occurs through the action of many simple, interconnected units; parallel distributed processing (PDP) and neural networks are other commonly used terms that are almost synonymous. There are three very important concepts underpinning all connectionist models. The first basic idea of connectionism is that there are many simple processing units connected together. These units don’t do very much other than modify and pass on activation (one number). The second basic idea is that energy or activation spreads around the network in a way determined by the strengths of the connections between units. Strong positive weights magnify the output of units; strong negative weights produce a large negative, inhibitory value. Units have activation levels that are modified by the amount of activation they receive from other units. The third idea is that high-level, complex “intelligent” behavior emerges from the interaction and cooperation of these many simple “dumb” units.
There are many types of connectionist model. One important distinction is between models that do not learn and models that do. In psychology the most important examples of the models that do not learn are those based on interactive activation and competition (IAC), and of models that do learn, those trained using back-propagation. We should distinguish the architecture of a network, which describes the layout of the network (how many units there are and how they are connected to each other), the algorithm that determines how activation spreads around the network, and the learning rule, if appropriate, that specifies how the network learns.
We look here at two approaches that have been the most influential in psycholinguistics. Other important learning algorithms that have been used include Hebbian learning and the Boltzmann machine (Hinton & Sejnowski, 1986); see the suggestions for further reading for details of these.

INTERACTIVE ACTIVATION MODELS

I’ll start with the interactive activation model because historically it was the first connectionist type model to have an impact on psychology, and because it’s relatively easy to understand. McClelland and Rumelhart (1981) and Rumelhart and McClelland (1982) presented the interactive activation and competition (IAC) model to account for word context effects on letter identification. The TRACE model of spoken word recognition (McClelland & Elman, 1986) is an IAC model.
The model consists of many simple processing units arranged in three levels. There is an input level of visual feature units, a level where
units correspond to individual letters, and an output level where each unit corresponds to a word. Each unit is connected to each unit in the level immediately before and after it. Each of these connections is either excitatory (that is, positive or facilitatory) or inhibitory (negative). Excitatory connections make the units at the end of the connection more active, whereas inhibitory connections make the units at the end less active. Each unit is connected to each other unit within the same level by an inhibitory connection. See Figure 6.9 for a graphical representation of this architecture.
When a unit becomes activated, it sends off energy, or activation, simultaneously along the connections to all the other units to which it is connected. If it is connected by a facilitatory connection, it will increase the activation of the unit at the other end of the connection, whereas if it is connected by an inhibitory connection, it will decrease the activation at the other end. Consider the IAC model of Figure 6.9. If the unit corresponding to the letter “T” in the initial letter position becomes activated, it will increase the activation level of the word units corresponding to “TAKE” and “TASK,” because they start with a “T,” but will decrease the activation level of “CAKE,” because it does not. Because units are connected to all other units within the same level by inhibitory connections, as soon as a unit becomes activated, it starts inhibiting all the other units at that level. The equations summarized in the next section determine the way in which activation flows between units, is summed by units, and is used to change the activation level of each unit at each time step. Over time, the pattern of activation settles down or relaxes into a stable configuration so that only one word remains active.

Basic equations of the interactive activation model

As we have seen, in the IAC model activation spreads from each unit to neighboring units along excitatory or inhibitory connections. Connections have numbers or weights that determine how much activation spreads along that connection, and hence how quickly activation builds up at the unit at the end of the connection. The total activation, called $net_i$, arriving at each unit i from j connections is shown in equation (A.1). Put in words, this equation means that the activation arriving at a unit is the sum (Σ) of the products of the output activation ($a_j$) of all the j units that input to it and the weights (w) on the connection between the input and receiving unit ($w_{ji}$). You just multiply all the output of connecting units by the strength of the appropriate weight and add them up.

$$net_i = \sum_j a_j \cdot w_{ji} \qquad \text{(A.1)}$$

An example should make this clear. Figure A.1 shows part of a very simple network. There are four input units to one destination unit. We say that the input vector is [1 0 1 1]. (Here we are assuming that the input units are either simply “on” [with a value of 1] or “off” [with a value of 0]. That is, we are restricting them to binary values. In principle, we could let the input units be “on” to different extents, e.g., 0.3.) The total amount of activation arriving at the destination unit will be the sum of all the products of the outputs of the units that input to it with the appropriate weights on the connections: that is, ((1 × +0.2) + (0 × −0.5) + (1 × +0.7) + (1 × −0.1)) = +0.8. The underlying idea is that this equation is a simplified model of a neuron: Neurons become excited or inhibited by all the other neurons that contact them. Found that bit of arithmetic tedious? Imagine doing it hundreds or thousands, or more, times. No wonder connectionism relies on computers to do the computation.

FIGURE A.1 A simplified connectionist unit or “neuron.” This unit takes multiple inputs (derived from the weighted output of other units) and converts them to a single output. The four inputs shown are on (weight +0.2), off (weight −0.5), on (weight +0.7), and on (weight −0.1).

Finally, a further equation is needed to determine what happens to a unit in each processing cycle after it receives an input. In the IAC model, each unit changes its activation level depending on how much or how little input it receives, and whether that input is overall positive (excitatory) or negative (inhibitory). In each cycle the change in the activation level of a unit i, $\Delta a_i$, is given by equations (A.2) and (A.3):

$$\Delta a_i = (\mathit{max} - a_i)\, net_i - \mathit{decay}\,(a_i - \mathit{rest}) \quad \text{if } net_i > 0 \qquad \text{(A.2)}$$

$$\Delta a_i = (a_i - \mathit{min})\, net_i - \mathit{decay}\,(a_i - \mathit{rest}) \quad \text{otherwise} \qquad \text{(A.3)}$$

where rest is the unit’s resting level, decay is a parameter that makes the unit tend to decay back to its resting level in the absence of new input, max is the unit’s maximum permitted level of activation, and min is the unit’s minimum permitted level of activation. So if absolutely nothing happens, eventually the activation of the unit will decay back to its resting level.
Processing takes place in cycles to represent the passage of time. At the end of each cycle, the activation levels of all the units in the network are updated. In the next cycle the process is repeated using the new activation levels. Processing continues until some criterion (e.g., a certain number of processing cycles or a certain level of stability) is attained.

BACK-PROPAGATION

Back-propagation is the most widely used connectionist learning rule. It enables networks to learn to associate input patterns with output patterns. It is called an error-reduction learning method because it is an algorithm that enables networks to be trained to reduce the error between what the network actually outputs given a particular input, and what it should output given that input.
The simplest type of network architecture that can be trained by back-propagation has three layers or levels. Again, each typically contains many simple units. These are called the input, hidden, and output levels (see Figure 7.5 for an example). As in the IAC model, each of the units in these layers has an activation level, and each unit is connected to all the units in the next level by a weighted connection, which can be either excitatory or inhibitory. These networks learn to associate an input pattern with an output pattern using a learning rule called back-propagation. The most important difference between IAC and back-propagation networks is that in the case of the latter the weights on connections are learned rather than hand-coded at the start.
How does the network learn? The connections in the network all start off with random weights. Suppose we want the model to learn to pronounce the printed word “DOG”; that is, we want to train the network to associate the input pattern of graphemes D O G with the output pattern of sounds or phonemes /d/ /o/ /g/. One pattern of activation over the input units corresponds to “DOG.” In Figure 7.5 I have for simplicity made the representation a local one; that is, for example, one unit corresponds to “D,” one to “O,” one to “G,” and so on. In more realistic models these patterns are usually distributed so that DOG is represented by a pattern of activation over the input units with no one single unit corresponding to any one single letter. Hence DOG might be represented by input unit 1 on, input unit 2 off, input unit 3 on, and so on. These units then pass activation on to the hidden units according to the values of the connections between the input and the hidden units. Activation is then summed by each unit in the hidden unit layer in just the same way as in the interactive activation model. In models that learn using back-propagation, the output of a unit is a complex function of its input: For reasons that we can skip, there must be a non-linear relation between the two, given by a special type of function called the logistic function. The output $o_u$ of a unit u is related to its input by equation (A.4). Here $netinput_u$ is the total input to the unit u from all the other units that input to it, and e is the exponential constant (the base of natural logarithms, with a value of about 2.718).
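As a rough illustration, the weighted sum of equation (A.1) and the logistic function just described can be sketched in a few lines of Python. This is only a sketch: the function names `net_input` and `logistic` are illustrative choices rather than part of any particular simulator, and the inputs and weights are those of the unit in Figure A.1.

```python
import math

def net_input(activations, weights):
    """Weighted sum of incoming activations, as in equation (A.1)."""
    return sum(a * w for a, w in zip(activations, weights))

def logistic(x):
    """Logistic function: squashes any net input into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

inputs = [1, 0, 1, 1]               # on, off, on, on (Figure A.1)
weights = [0.2, -0.5, 0.7, -0.1]    # connection weights (Figure A.1)

net = net_input(inputs, weights)
out = logistic(net)
print(round(net, 2), round(out, 2))  # prints: 0.8 0.69
```

Changing an input from 0 to a graded value such as 0.3 shows how the same two functions handle units that are "on" to different extents.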

$$o_u = \frac{1}{1 + e^{-netinput_u}} \qquad \text{(A.4)}$$

As an example, let us take the unit shown in Figure A.1 once again. The total input to that unit, $netinput_u$, is [(1 × 0.2) + (0 × −0.5) + (1 × 0.7) + (1 × −0.1)] = 0.8. Hence the output $o_u$ for this unit is $1/(1 + e^{-0.8}) = 0.69$.
Each unit has an individual threshold level or bias. (This is usually implemented by attaching an additional unit, the bias unit, which is always on, to each principal unit. The value of the weights between the bias and other units can be learned like any other weights.)
Activation is then passed on from the hidden to the output units, and so eventually the output units end up with activation values. But as we started off with totally random values, they are extremely unlikely to be the correct ones. As the target output, we wanted the most activated output units to correspond to the phonemes /d/ /o/ /g/, but the actual output is going to be totally random, maybe something close to /k/ /i/ /j/. What the learning rule does then is to modify the connections in the network so that the output will be a bit less like what it actually produced, and a bit more like what it should be. It does this in a way that is very like what happens in calculating the mean squared error in an analysis of variance. The difference between the actual and the target outputs is computed, and the values of all the weights from the hidden to the output units are adjusted slightly to try to make this difference smaller. This process is then “back-propagated” to change the weights on the connections between the input and the hidden units. The whole process can then be repeated for a different input–output (e.g., grapheme–phoneme) pair. Eventually, the weights of the network converge on values that give the best output (that is, the least difference between desired and actual output) averaged across all input–output pairs.
The back-propagation learning rule is based on the generalized delta rule. The rule for changing the weights following the presentation of a particular pattern p is given by equation (A.5), where j and i index adjacent upper and lower layers in the network, $t_{pj}$ is the jth component of the desired target pattern, $o_{pj}$ is the corresponding jth component of the actual output pattern p, and $i_{pi}$ is the ith component of the input pattern.

$$\Delta_p w_{ij} = (t_{pj} - o_{pj}) \cdot i_{pi} \qquad \text{(A.5)}$$

The error for the output units is given by equation (A.6), and that for the hidden units by equation (A.7), where l and m are connecting layers. The weight change is given by equation (A.8).

$$\delta_{pj} = (t_{pj} - o_{pj}) \cdot o_{pj} \cdot (1 - o_{pj}) \qquad \text{(A.6)}$$

$$\delta_{pl} = o_{pl} \cdot (1 - o_{pl}) \cdot \sum_m \delta_{pm} \cdot w_{lm} \qquad \text{(A.7)}$$

$$\Delta w_{ij}(n+1) = \eta \cdot (\delta_{pj} \cdot o_{pi}) + \alpha \cdot \Delta w_{ij}(n) \qquad \text{(A.8)}$$

There are two new constants in equation (A.8): η is the learning rate, which determines how quickly the network learns, and α is the momentum term, which stops the network changing too much and hence overshooting on any learning cycle. (The dots “·” mean the same as “multiply,” but make the equations easier to read.) Needless to say, this training process cannot be completed in a single step. It has to be repeated many times, but gradually the values of actual and desired outputs converge. You can modify the training set in a number of ways to make the task more realistic. For example, if you are interested in word frequency, you have to encode it in the training in some way, perhaps by presenting input–output pairings of frequent words more often.
Networks trained by back-propagation show some interesting properties. Most interestingly, if you present a trained network with an item that it has not seen before, it can often manage to produce the appropriate output quite well. For example, in the case of the model learning to read, although the network has not been taught any explicit rules of pronunciation, it behaves as though it has learned them, and can generalize appropriately.
One of the most important and commonly used modifications to the simple feedforward architecture is to introduce recurrent connections from one layer (usually the hidden layer) to another layer (called the context layer). For example, if the context layer stores the past state of the hidden unit layer, then the network can learn to encode sequential information, that is, what follows what in a sequence (Elman, 1990).
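The training procedure of equations (A.4) and (A.6) to (A.8) can be sketched as a small Python program. This is a minimal sketch under stated assumptions, not a reconstruction of any published model: the network size (3 input, 2 hidden, 2 output units), the two toy input–output pairs, the values of the learning rate and momentum, and all function names are hypothetical choices made for illustration. The bias is implemented, as described above, as an extra unit that is always on.

```python
import math
import random

def logistic(x):
    """Equation (A.4)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(pattern, w_ih, w_ho):
    """Pass activation from input to hidden to output; 1.0 is the bias unit."""
    inp = pattern + [1.0]
    hidden = [logistic(sum(i * w for i, w in zip(inp, row))) for row in w_ih]
    hid = hidden + [1.0]
    out = [logistic(sum(h * w for h, w in zip(hid, row))) for row in w_ho]
    return inp, hid, out

def train_pattern(pattern, target, w_ih, w_ho, prev_ih, prev_ho, eta, alpha):
    """One presentation of one input-output pair."""
    inp, hid, out = forward(pattern, w_ih, w_ho)
    # (A.6): error signal for each output unit
    delta_out = [(t - o) * o * (1 - o) for t, o in zip(target, out)]
    # (A.7): error signal for each hidden unit, computed before weights change
    delta_hid = [
        hid[l] * (1 - hid[l])
        * sum(delta_out[m] * w_ho[m][l] for m in range(len(out)))
        for l in range(len(w_ih))
    ]
    # (A.8): new change = eta * delta * sending activation + alpha * last change
    for deltas, sending, weights, prev in (
        (delta_out, hid, w_ho, prev_ho),
        (delta_hid, inp, w_ih, prev_ih),
    ):
        for j, row in enumerate(weights):
            for i in range(len(row)):
                change = eta * deltas[j] * sending[i] + alpha * prev[j][i]
                row[i] += change
                prev[j][i] = change

def total_error(patterns, w_ih, w_ho):
    """Sum of squared differences between target and actual outputs."""
    return sum(
        (t - o) ** 2
        for pattern, target in patterns
        for t, o in zip(target, forward(pattern, w_ih, w_ho)[2])
    )

random.seed(0)  # fixed seed so the random starting weights are repeatable
n_in, n_hid, n_out = 3, 2, 2
w_ih = [[random.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
w_ho = [[random.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
prev_ih = [[0.0] * (n_in + 1) for _ in range(n_hid)]
prev_ho = [[0.0] * (n_hid + 1) for _ in range(n_out)]

# Two toy input-output pairs (hypothetical, for illustration only)
patterns = [([1.0, 0.0, 1.0], [1.0, 0.0]), ([0.0, 1.0, 0.0], [0.0, 1.0])]

error_before = total_error(patterns, w_ih, w_ho)
for epoch in range(2000):
    for pattern, target in patterns:
        train_pattern(pattern, target, w_ih, w_ho, prev_ih, prev_ho,
                      eta=0.5, alpha=0.5)
error_after = total_error(patterns, w_ih, w_ho)
print(error_before, error_after)
```

The error is computed before and after training so that the gradual convergence of actual and desired outputs can be inspected directly. Presenting one pair more often than the other within each epoch would be one simple way of encoding frequency in the training set.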
You should bear in mind that this description is a simplification. Why you need hidden units in such a model, what happens if you do not have them, and how you select how many units to have are all important issues. Furthermore, there are other learning algorithms that are sometimes used; of these, Hebbian learning and Boltzmann machines are those that you are most likely to come across in psycholinguistics. The general principles involved are much the same, and furthermore, the end result is generally the same. We are usually most interested in the behavior of the trained network, and how it is trained is usually not relevant.
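Looking back at the interactive activation model, the processing cycle of equations (A.1) to (A.3) can also be sketched in Python. This is a minimal sketch under several assumptions that are not in the text: the parameter values (max 1.0, min −0.2, rest −0.1, decay 0.1), the three-unit network and its weights, the clamping of the letter unit, and the function name `iac_step` are all illustrative. It also assumes that only units with positive activation send output to their neighbors, which equation (A.1) does not spell out.

```python
def iac_step(acts, weights, max_a=1.0, min_a=-0.2, rest=-0.1, decay=0.1):
    """One processing cycle: all units are updated together from the old state.

    weights[j][i] is the weight on the connection from unit j to unit i.
    """
    n = len(acts)
    new_acts = []
    for i in range(n):
        # (A.1): net input; assumption: only positive activations are sent
        net = sum(max(acts[j], 0.0) * weights[j][i] for j in range(n))
        a = acts[i]
        if net > 0:
            delta = (max_a - a) * net - decay * (a - rest)   # (A.2)
        else:
            delta = (a - min_a) * net - decay * (a - rest)   # (A.3)
        new_acts.append(a + delta)
    return new_acts

# Unit 0: letter "T" in first position; unit 1: word TAKE; unit 2: word CAKE.
weights = [
    [0.0, 0.1, -0.1],   # "T" excites TAKE and inhibits CAKE
    [0.0, 0.0, -0.2],   # TAKE inhibits CAKE (same level)
    [0.0, -0.2, 0.0],   # CAKE inhibits TAKE (same level)
]

acts = [1.0, -0.1, -0.1]       # "T" is on; both words start at rest
for cycle in range(50):
    acts = iac_step(acts, weights)
    acts[0] = 1.0              # keep the letter unit clamped on (assumption)

print([round(a, 3) for a in acts])
```

After the cycles have run, the TAKE unit has settled well above its resting level and the CAKE unit below it, which is the kind of stable configuration described in the section on interactive activation models.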
FURTHER READING
Bechtel and Abrahamsen (2001) is an excellent textbook on connectionism. Ellis and Humphreys
(1999) is a text that emphasizes the role of connectionism in cognitive psychology. The two-volume
set Parallel Distributed Processing (Rumelhart, McClelland, & the PDP Research Group, 1986;
McClelland, Rumelhart, & the PDP Research Group, 1986) is a classic. Caudill and Butler (1992),
Dawson (2005), McClelland and Rumelhart (1988), Orchard and Phillips (1991), and Plunkett and
Elman (1997) provide exercises and simulation environments. Plunkett and Elman is a companion
volume to Elman et al. (1996) and includes a simulation environment called tlearn that runs on both
Macintosh and Windows platforms.
There are a number of popular books about emergent systems, attractors, chaos, and complexity,
including Gleick (1987), Stewart (1989), and Waldrop (1992).
GLOSSARY
Acoustics: the study of the physical properties of sounds.
Acquired disorder: a disorder caused by brain damage is acquired if it affects an ability that was previously intact (contrasted with developmental disorder).
Activation: can be thought of as the amount of energy possessed by something. The more highly activated something is, the more likely it is to be output.
Adjective: a describing word (e.g., “red”).
Adverb: a type of word that modifies a verb (e.g., “quickly”).
Affix: a bound morpheme that cannot exist on its own, but that must be attached to a stem (e.g., re-, -ing). It can come before the main word, when it is a prefix, or after, when it is a suffix.
Agent: the thematic role describing the entity that instigates an action.
Agnosia: disorder of object recognition.
Agrammatism: literally, “without grammar”; a type of aphasia distinguished by an impairment of syntactic processing (e.g., difficulties in sentence formation, inflection formation, and parsing). There has been considerable debate about the extent to which agrammatism forms a syndrome.
Allophones: phonetic variants of phonemes. For example, in English the phoneme /p/ has two variants, an aspirated (breathy) and unaspirated (non-breathy) form. You can feel the difference if you say the words “pit” and “spit” with your hand a few inches from your mouth.
Alzheimer’s disease (AD): Alzheimer’s disease or dementia; often there is some uncertainty about the diagnosis, so this is really shorthand for “probable Alzheimer’s disease” or “dementia of the Alzheimer’s type.”
American Sign Language (ASL): American Sign Language (sometimes called AMESLAN).
Anaphor: a linguistic expression for which the referent can only be determined by taking another linguistic expression into account, namely the anaphor’s antecedent (e.g., “Vlad was happy; he loved the vampire”; here he is the anaphor and Vlad is the antecedent).
Aneurysm: dilation of blood vessel (e.g., in the brain), where a sac in the blood vessel is formed and presses on surrounding tissue.
Anomia: difficulty in naming objects.
Antecedent: the linguistic expression that must be taken into account in order to determine the referent of an anaphor (“Vlad was happy; he loved the vampire”; here he is the anaphor and Vlad the antecedent). Often the antecedent is the thing for which a pronoun is being substituted.
Aphasia: a disorder of language, including a defect or loss of expressive (production) or receptive (comprehension) aspects of written or spoken language as a result of brain damage.
Apraxia: an inability to plan movements, in the absence of paralysis. Of particular relevance is speech apraxia, an inability to carry out properly controlled movements of the articulatory apparatus. Compare with dysarthria.
Articulatory apparatus: the parts of the body responsible for making speech sounds, such as the larynx, tongue, teeth, and lips.
Aspect: the use of verb forms to show whether something is finished, continuing, or repeated. English has two aspects: progressive (e.g., “we are cooking dinner”) versus non-progressive (e.g., “we cook dinner”), and perfect, involving forms of the auxiliary “have” (e.g., “we have cooked dinner”), versus non-perfect (without the auxiliary).
Aspirated: a sound that is produced with an audible breath (e.g., at the start of “pin”).
Assimilation: the influence of one sound on the articulation of another, so that the two sounds become slightly more alike.
Attachment: attachment concerns how phrases are connected together to form syntactic structures. In “the vampire saw the ghost with the binoculars” the prepositional phrase (“with the binoculars”) can be attached to either the first noun phrase (“the vampire”) or the second (“the ghost”).
Attentional (or controlled) processing: processing requiring central resources. It is non-obligatory, generally uses working memory space, is prone to dual-task interference, is relatively slow, and may be accessible to consciousness. (The opposite is automatic processing.)
Attractor: a point in the connectionist attractor network to which related states are attracted.
Audience design: the idea that speakers tailor their productions to address the specific needs of their listeners.
Auditory short-term memory (ASTM): a short-term store for spoken material.
Automatic processing: processing that is unconscious, fast, obligatory, facilitatory, does not involve working memory space, and is generally not susceptible to dual-task interference. (The opposite is attentional processing.)
Auxiliary verb: a linking verb used with other verbs (e.g., in “You must have done that,” “must” and “have” are auxiliaries).
Babbling: an early stage of language, starting at the age of about 5 or 6 months, where the child babbles, repetitively combining consonants and vowels into syllable-like sequences (e.g., “bababababa”).
Back-propagation: an algorithm for learning input–output pairs in connectionist networks. It works by alternately reducing the error between the actual output and the desired output of the network.
Basic level: the level of representation in a hierarchy that is the default level (e.g., “dog” rather than “terrier” or “animal”).
Bilingual: speaking two languages.
Bilingualism: having the ability to speak two languages. There are three types depending on when L2 (the second language) is learned relative to L1: simultaneous (L1 and L2 learned about the same time), early sequential (L1 learned first but L2 learned relatively early, in childhood), and late (in adolescence onwards).
Body: the same as a rime, that is, the final vowel and terminal consonants.
Bootstrapping: the way in which children can increase their knowledge when they have some, such as inferring syntax when they have semantics.
Bottom-up: processing that is purely data-driven.
Bound morphemes: a morpheme that cannot exist on its own (e.g., un, ent).
Brain imaging: techniques for looking at what the brain is doing when we carry out some activity.
Broca’s aphasia: a type of aphasia that follows from damage to Broca’s region of the brain, characterized by many dysfluencies, slow, laborious speech, difficulties in articulation, and by agrammatism.
Cascade model: a type of processing where information can flow from one level of processing to the next before the first has finished processing; contrast with discrete stage model.
Categorical perception: perceiving things that lie along a continuum as belonging to one distinct category or another.
Child-directed speech (CDS): the speech of carers to young children that is modified to make it easier to understand (sometimes called “motherese”).
Class: the grammatical class of a word is the major grammatical category to which a word belongs (e.g., noun, adjective, verb, adverb, determiner, preposition, pronoun).
Clause: a group of related words containing a subject and a verb.
Closed-class item: same as function word.
Co-articulation: the way in which the articulatory apparatus takes account of the surrounding sounds when a sound is articulated; as a result, a sound conveys information about its neighbors.
Cognates: words in different languages that have developed from the same root (e.g., many English and French words have developed from the same Latin root: “horn” [and “cornet”] and “corne” are derived from the Latin “cornu”); occasionally used for
words that have the same form in two languages (e.g., “oblige” in English and French).
Competence: our knowledge of our language, as distinct from our linguistic performance.
Complementizer: a category of words (e.g., “that”) used to introduce a subordinate clause.
Conjunction: a part of speech that connects words within a sentence (e.g., “and,” “because”).
Connectionism: an approach to cognition that involves computer simulations with many simple processing units, and where knowledge comes from learning statistical regularities rather than explicitly presented rules.
Connectionist: a computational model involving many simple, neuron-like units connected together by weighted links.
Consonant: a sound produced with some constriction of the airstream, unlike a vowel.
Constituent: a linguistic unit that is part of a larger linguistic unit.
Content word: one of the enormous number of words that convey most of the meaning of a sentence (nouns, verbs, adjectives, adverbs). Content words are the same as open-class words. Contrasted with function word.
Conversational maxim: a rule that helps us to make sense of conversation.
Co-reference: two or more noun phrases with the same reference. For example, in “There was a vampire in the kitchen; Boris was scared to death when he saw him,” the co-referential noun phrases are vampire and him.
Creole: a pidgin that has become the language of a community through an evolutionary process known as “creolization.”
Cross-linguistic: involving a comparison across languages.
Deep dyslexia: disorder of reading characterized by semantic reading errors.
Deep dysphasia: disorder of repetition characterized by semantic repetition errors.
Derivational morphology: the study of derivational inflections.
Determiner: a grammatical word that determines the number of a noun (e.g., “the,” “a,” “an,” “some”).
Developmental disorder: a disorder where the normal development or acquisition of a process (e.g., reading) is affected.
Diphthong: a type of vowel that combines two vowel sounds (e.g., in “boy,” “cow,” and “my”).
Discourse: linguistic units composed of several sentences.
Discrete stage model: a processing model where information can only be passed to the next stage when the current one has completed its processing (contrast with cascade model).
Dissociation: a process is dissociable from other processes if brain damage can disrupt it, while leaving the others intact.
Distributional information: information about what tends to co-occur with what; for example, the knowledge that the letter “q” is almost always followed by the letter “u,” or that the word “the” is always followed by a noun, are instances of distributional information.
Double dissociation: a pattern of dissociations whereby one patient can do one task but not another, whereas another patient shows the reverse pattern.
Dysarthria: difficulty with executing motor movements. In addition to difficulties with executing speech plans, there are problems with automatic activities such as eating. Compare with apraxia, which is a deficit limited to motor planning.
Dysgraphia: disorder of writing.
Dyslexia: disorder of reading.
Dysprosody: a disturbance of prosody.
EEG: electroencephalography, a means of measuring electrical potentials in the brain by placing electrodes across the scalp.
Episodic memory: knowledge of specific episodes (e.g., what I had for breakfast this morning, or what happened in the library yesterday).
ERP: event-related potential, the electrical activity in the brain after a particular event. An ERP is a complex electrical waveform related in time to a specific event, measured by EEG.
Expressive: a form of aphasia to do with producing language, primarily speaking.
Facilitation: making processing faster, usually as a result of priming. It is the opposite of inhibition.
Figurative speech: speech that contains non-literal material, such as metaphors and similes (e.g., “he ran like a leopard”).
Filler: what fills a gap.
fMRI: functional magnetic resonance imaging, a modern method of mapping the brain’s activity by recording blood flow in real time.
Formal paraphasia: substitution in speech of a word that sounds like another word (e.g., “caterpillar” for “catapult”). Sometimes called a form-related paraphasia.
Formant: a concentration of acoustic energy in a sound.
Function word: one of the limited number of words that do the grammatical work of the language (e.g., determiners, prepositions, conjunctions, such as “the,” “a,” “to,” “in,” “and,” “because”). Contrasted with content word.
Gap: an empty part of the syntactic construction that is associated with a filler.
Garden path sentence: a type of sentence where the syntactic structure leads you to expect a different conclusion from that which it actually has (e.g., “the horse raced past the barn fell”).
Gating task: a task that involves presenting increasing amounts of a word.
Gender: some languages (e.g., French and Italian) distinguish different cases depending on their gender: male, female, or neuter.
Generative grammar: a finite set of rules that will produce or generate all the sentences of a language (but no non-sentences).
Glottal stop: a sound produced by closing and opening the glottis (the opening between the vocal folds); an example is the sound that replaces the /t/ sound in the middle of “bottle” in some dialects of English (e.g., in parts of London).
Grammar: the set of syntactic rules of a language.
Grammatical element: a collective name for function words and inflections.
Grapheme: a unit of written language that corresponds to a phoneme (e.g., “steak” contains four graphemes, s t ea k, corresponding to the four component sounds).
Hemidecortication: complete removal of the cortex of one side of the brain.
Heterographic homophones: two words with different spellings that sound the same (e.g., “soul” and “sole”; “night” and “knight”).
Hidden units: units from the hidden layer of a connectionist network that enable the network to learn complex input–output pairs by the back-propagation algorithm. The hidden layer forms a layer between the input and output layers.
Homographs: different words that are spelled the same; they may or may not be pronounced differently, e.g., “lead” (as in what you use to take a dog for a walk) and “lead” (as in the metal).
Homophone: two words that sound the same.
Idioms: expressions particular to a language, whose meanings cannot be derived from their parts (e.g., “kick the bucket”).
Imageability: a semantic variable concerning how easy it is to form a mental image of a word: “rose” is more imageable than “truth.”
Implicature: an inference that we make in conversations to maintain the sense and relevance of the conversation.
Independent models: models in which processing occurs without reference to any external processes or information (e.g., purely bottom-up).
Inference: the derivation of additional knowledge from facts already known; this might involve going beyond the text to maintain coherence or to elaborate on what was actually presented.
Inflection: a grammatical change to a verb (changing its tense, e.g., -ed) or noun (changing its number, e.g., -s, or “mice”).
Inflectional morphology: the study of inflections.
Inhibition: this has two uses. In terms of processing it means slowing processing down. In this sense priming may lead to inhibition. Inhibition is the opposite of facilitation. In comprehension it is closely related to the idea of suppression. In terms of networks it refers to how some connections decrease the amount of activation of the target unit.
Inner speech: that voice we hear in our head; speech that is not overtly articulated.
Interactive models: models where different sorts of information are allowed to influence current processing (e.g., a mixture of bottom-up and top-down).
Intransitive verb: a verb that does not take an object (e.g., “The man laughs”).
Invariance: the same phoneme can in fact sound different depending on the context in which it occurs.
L1: the language learned first by bilingual people.
L2: the language learned second by bilingual people.
Language acquisition device (LAD): Chomsky argued that children hear an impoverished language input and therefore need the assistance of an innate language acquisition device in order to acquire language.
Lemma: a level of representation of a word between its semantic and phonological representations; it is syntactically specified, but does not yet contain sound-level information; it is the intermediate stage of two-stage models of lexicalization.
Lesion: damage to a particular part of the brain.
Lexeme: the phonological word form, in a format where phonology is represented.
Lexical access: accessing a word’s entry in the lexicon.
Lexicalization: in speech production, going from semantics to sound.
Lexicon: our mental dictionary.
LSA: latent semantic analysis, a means of acquiring knowledge from the co-occurrence of information.
Malapropisms: a type of speech error where a similar-sounding word is substituted for the target (e.g., saying “restaurant” instead of “rhapsody”).
Manner of articulation: the way in which the airstream is constricted in speaking (e.g., stop).
Maturation: the sequential unfolding of characteristics, usually governed by instructions in the genetic code.
Mediated priming: (facilitatory) priming through a semantic intermediary (e.g., “lion” to “tiger” to “stripes”).
MEG: magnetoencephalography, a technique for mapping the brain’s electrical activity by recording the magnetic field produced by the brain.
Metaphor: a figure of speech that works by association, comparison, or resemblance (e.g., “he’s a tiger in a fight,” “the leaves swam around the lake”).
Minimal pair: a pair of words that differ in meaning when only one sound is changed (e.g., “pear” and “bear”).
Model: an account of the data that provides an explanation of why the data are as they are and that makes novel, testable predictions.
Modifier: a part of speech that is dependent on another, which it modifies or qualifies in some way (e.g., adjectives modify nouns).
Modularity: the idea that the mind is built up from discrete modules; its resurgence is associated with the American philosopher Jerry Fodor, who said that modules cannot tinker around with the insides of other modules. A further step is to say that the modules of the mind correspond to identifiable neural structures in the brain.
Monosyllabic: a word having just one syllable.
Morpheme: the smallest unit of meaning (e.g., “dogs” contains two, dog + plural s).
Morphology: the study of how words are built up from morphemes.
Nativist: the idea that knowledge is innate.
Natural kind: a category of naturally occurring things (e.g., animals, trees).
Neologism: a “made-up word” that is not in the dictionary. Neologisms are usually common in the speech of people with jargon aphasia.
Nonword: a string of letters that does not form a word. Although most of the time nonwords mentioned in psycholinguistics refer to pronounceable nonwords (pseudowords), not all nonwords need be pronounceable.
Noun: the syntactic category of words that can act as names and can all be subjects or objects of a clause; all things are nouns.
Noun phrase: a grammatical phrase based on a noun (e.g., “the red house”), abbreviated to NP.
Number: the number of a verb is whether one or more subjects are doing the action (e.g., “the ghost was” but “the ghosts were”).
Object: the person, thing, or idea that is acted on by the verb. In the sentence “The cat chased the dog,” “cat” is the subject, “chased” the verb, and “dog” is the object. Objects can be either direct or indirect: in the sentence “She gave the dog to the man,” “dog” is the direct object and “the man” is the indirect object.
Onset: the beginning of something. It has two meanings. The onset of a stimulus is when it is first presented. The onset of a printed word is its initial consonant cluster (e.g., “sp” in “speak”).
Open-class word: same as content word.
Ostensive: you can define an object ostensively by pointing to it.
Over-extension: when a child uses a word to refer to things in a way that is based on particular attributes of the word, so that many things can be named using that word (e.g., using “moon” to refer to all round things, or “stick” to all long things, such as an umbrella).
Parameter: a component of Chomsky’s theory that governs aspects of language, and that is set in childhood by exposure to a particular language.
Paraphasia: a spoken word substitution.
Parsing: analyzing the grammatical structure of a sentence.
Participle: a type of verbal phrase where a verb is turned into an adjective by adding -ed or -ing to the verb: “we live in an exciting age.”
Patient: the thematic role of a person or thing acted on by the agent.
Performance: our actual language ability, limited by our cognitive capacity, distinct from our competence.
Phoneme: a sound of the language; changing a phoneme changes the meaning of a word.
Phonetics: the acoustic detail of speech sounds and how they are articulated.
Phonological awareness: awareness of sounds, measured by tasks such as naming the common sound in words (e.g., “bat” and “ball”), and deleting a sound from a word (e.g., “take the second sound of bland”); thought to be important for reading development but probably other aspects of language too.
Phonological dyslexia: a type of dyslexia where people can read words quite well but are poor at reading nonwords.
Phonology: the study of sounds and how they relate to languages; phonology describes the sound categories each language uses to divide up the space of possible sounds.
Phrase: a group of words forming a grammatical unit beneath the level of a clause (e.g., “up a tree”). A phrase does not contain both a subject and a predicate. In general, if you can replace a sequence of words in a sentence with a single word without changing the overall structure of the sentence, then that sequence of words is a phrase.
Pidgin: a type of language, with reduced structure and form, without any native speakers of its own, and which is created by the contact of two peoples who do not speak each other’s native languages.
Place of articulation: where the airstream in the articulatory apparatus is constricted.
Polysemous words: words that have more than one meaning.
Pragmatics: the aspects of meaning that do not affect the literal truth of what is being said; these concern things such as choice from words with the same meaning, implications in conversation, and maintaining coherence in conversation.
Predicate: the part of the clause that gives information about the subject (e.g., in “The ghost is laughing,” “the ghost” is the subject and “is laughing” is the predicate).
Prefix: an affix that comes before the stem (e.g., dis-interested). Contrast with suffix, which comes after the stem.
Preposition: a grammatical word expressing a relation (e.g., “to,” “with,” “from”).
Prepositional phrase: a phrase beginning with a preposition (e.g., “with the telescope,” “up the chimney”).
Priming: affecting a response to a target by presenting a related item prior to it; priming can have either facilitatory or inhibitory effects.
Pronouns: a grammatical class of words that can stand for nouns or noun phrases (e.g., “she,” “he,” “it”).
Proposition: the smallest unit of knowledge that can stand alone: it has a truth value; that is, a proposition can be either true or false.
Prosody: the way in which speech is stressed and intoned to give it a rhythm.
Prototype: an abstraction that is the best example of a category.
Pseudohomophone: a nonword that sounds like a word when pronounced (e.g., “nite”).
Pseudoword: a string of letters that forms a pronounceable nonword (e.g., “smeak”).
Psycholinguist: someone who does psycholinguistics.
Psycholinguistics: the psychology of language.
Receptive aphasia: a form of aphasia to do with understanding language.
Recognition point: the point at which we recognize a word.
Recurrent network: a type of connectionist network that is designed to learn sequences. It does this by means of an additional layer of units called context units that store information about past states of the network.
Reduced relative: a relative clause that has been reduced by removing the relative pronoun and “was” (“The horse raced past the barn fell”).
Reference: what things refer to.
Referent: the object or concept to which a pronoun refers.
Refractory period: after firing, a unit, cell, or organ is much less likely to fire again during the refractory period, until it has recovered.
Relative clause: a clause normally introduced by a relative pronoun that modifies the main noun (“The horse that was raced past the barn fell”; here the relative clause is “that was raced past the barn”).
Repetition priming: (facilitatory) priming by repeating a stimulus.
Rime: the end part of a word that produces the rhyme (e.g., the rime constituent in “rant” is “ant,” or “eak” in “speak”): more formally, it is the VC or VCC (vowel–consonant or vowel–consonant–consonant) part of a word.
Saccade: a fast movement of the eye, for example to change the fixation point when reading.
Schema: a means for organizing knowledge.
Script: a schema for procedural information (e.g., going to the doctor’s).
Segmentation: splitting speech up into constituent phonemes.
Semantic bootstrapping: the idea that the meaning of a word provides a cue as to the syntactic category to which that word belongs.
Semantic feature: a unit that represents part of the meaning of a word.
Semantic memory: a memory system for the long-term storage of facts (e.g., a robin is a bird; Paris is the capital of France).
Semantic paralexia: a reading error based on a word’s meaning.
Semantic priming: priming, usually facilitatory, obtained by the prior presentation of a stimulus related in meaning (e.g., “doctor” – “nurse”).
Semantics: the study of meaning.
Sentence: a group of words that expresses a complete thought, indicated in writing by the capitalization of the first letter, and ending with a period (full stop). Sentences contain a subject and a predicate (apart from a very few exceptions, notably one-word sentences such as “Stop!”).
Sequential bilingualism: L2 acquired after L1; this can be either early in childhood or later.
Short-term memory: a limited capacity memory store that holds incoming information for short periods of time only.
Simultaneous bilingualism: L1 and L2 acquired simultaneously.
SOA: short for stimulus–onset asynchrony, the time between the onset (beginning) of the presentation of one stimulus and the onset of another. The time between the offset (end) of the presentation of the first stimulus and the onset of the second is known as stimulus offset–onset asynchrony.
Span: the number of items (e.g., digits) that a person can keep in short-term memory.
Specific language impairment: a developmental disorder affecting just language.
Speech act: an utterance defined in terms of the intentions of the speaker and the effect that it has on the listener.
Spoonerism: a type of speech error where the initial sounds of two words get swapped (named after the Reverend William A. Spooner, who is reported as saying things such as “you have tasted the whole worm” instead of “you have wasted the whole term”).
Stem: the root morpheme to which other bound morphemes can be added.
Stochastic: probabilistic.
Subject: the word or phrase that the sentence is about, the clause about which something is predicated (stated). The subject of the verb: who or what is doing something. More formally it is the grammatical category of the noun phrase that is immediately beneath the sentence node in the phrase-structure tree; the thing about which something is stated.
Sublexical: correspondences in spelling and sound beneath the level of the whole word.
Sucking habituation paradigm: a method for examining whether or not very young infants can discriminate between two stimuli. The child sucks on a special piece of apparatus; as the child habituates to the stimulus, their sucking rate drops, but if a new stimulus is presented, the sucking rate increases again, but only if the child can detect that the stimulus is different from the first.
Suffix: a morpheme added to the end of a word to form a derivative (e.g., -ed, -ing, -s).
Suppression: in comprehension, suppression is closely related to inhibition. Suppression is the attenuation of activation, while inhibition is the blocking of activation. Material must be activated before it can be suppressed.
Syllable: a rhythmic unit of speech (e.g., po-lo contains two syllables); it can be analyzed in terms of onset and rime (or rhyme), with the rime further being analyzable into nucleus and coda. Hence in “speaks,” “sp” is the onset, “ea” the nucleus, and “ks” the coda; together “eaks” forms the rime.
Syndrome: a medical term for a cluster of symptoms that cohere as a result of a single underlying cause.
Syntactic bootstrapping: the idea that the syntactic frame associated with a verb provides a cue as to the word’s meaning.
Syntax: the rules of word order of a language.
Tachistoscope: a device for presenting materials (e.g., words) for extremely short durations; tachistoscopic presentation therefore means an item that is presented very briefly.
Telegraphic speech: a type of speech used by young children, marked by syntactic simplification, particularly in the omission of function words.
Tense: the tense of a verb is whether it is in the past, present, or future (e.g., “she gave,” “she gives,” and “she will give”).
Thematic roles: the set of semantic roles in a sentence that conveys information about who is doing what to whom, as distinct from the syntactic roles of subject and object. Examples include agent and theme.
Theme: the thing that is being acted on or being moved.
Tip-of-the-tongue (TOT): when you know that you know a word, but you cannot immediately retrieve it (although you might know its first sound, or how many syllables it has).
TMS: transcranial magnetic stimulation, producing activity in certain brain regions using locally applied magnetic fields.
Top-down: processing that involves knowledge coming from higher levels (such as predicting a word from the context).
Transcortical aphasia: a type of language disturbance following brain damage characterized by relatively good repetition but poor performance in other aspects of language.
Transformation: a grammatical rule for transforming one syntactic structure into another (e.g., turning an active sentence into a passive one).
Transformational grammar: a system of grammar based on transformations, introduced by Chomsky.
Transitive verb: a verb that takes an object (e.g., “The cat hit the dog”).
Unaspirated: a sound that is produced without an audible breath (e.g., the /p/ in “spin”).
Uniqueness point: the point at which a word is unique and differs from all its neighbors.
Universal grammar: the core of the grammar that is universal to all languages, and which specifies and restricts the form that individual languages can take.
Unvoiced: a sound that is produced without vibration of the vocal cords, such as /p/ and /t/; the same as voiceless and without voice.
Verb: a syntactic class of words expressing actions, events, and states, and which have tenses.
Verb-argument structure: the set of possible themes associated with a verb (e.g., a person gives something to someone, or agent–theme–goal).
Voice onset time (VOT): the time between the release of the constriction of the airstream when we produce a consonant, and when the vocal cords start to vibrate.
Voicing: consonants produced with vibration of the vocal cords.
Vowel: a speech sound produced with very little constriction of the airstream, unlike a consonant.
Wernicke’s aphasia: a type of aphasia resulting from damage to Wernicke’s area of the brain, characterized by poor comprehension and fluent, often meaningless speech with clear word-finding difficulties.
Word: the smallest unit of grammar that can stand alone.
Working memory: in the USA, often used as a general term for short-term memory. According to the British psychologist Alan Baddeley, working memory has a particular structure comprising a central executive, a short-term visual store, and a phonological loop.

EXAMPLE OF SENTENCE ANALYSIS

The vampire chased the ghost from the cupboard to the big cave.

Syntactic analysis
Determiner noun verb determiner noun preposition determiner noun preposition determiner adjective noun
Subject, verb, direct object, indirect object 1, indirect object 2

Verb-argument structure
Chase: Agent CHASE Theme Source Goal

Thematic role assignment
Agent: the vampire
Theme: the ghost
Source: the cupboard
Goal: the big cave
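
The analysis above can also be written out as a small program. The following is only a minimal sketch: the function names are hypothetical, the word-class tags are copied from the example rather than computed, and the role rules (the noun phrase before the verb is the Agent, the one directly after the verb is the Theme, the ones after “from” and “to” are the Source and Goal) are assumptions that happen to fit this one sentence. It is not a general parser.

```python
# Toy illustration of the worked example; all names here are hypothetical.
WORDS = "The vampire chased the ghost from the cupboard to the big cave".split()
TAGS = ("determiner noun verb determiner noun preposition determiner noun "
        "preposition determiner adjective noun").split()


def noun_phrases(words, tags):
    """Group each determiner (adjective)* noun run into one phrase.

    Returns a list of (start_index, phrase) pairs.
    """
    phrases, current, start = [], [], None
    for i, (word, tag) in enumerate(zip(words, tags)):
        if tag in ("determiner", "adjective", "noun"):
            if not current:
                start = i
            current.append(word.lower())
            if tag == "noun":  # a noun closes the phrase
                phrases.append((start, " ".join(current)))
                current = []
        else:
            current = []
    return phrases


def assign_roles(words, tags):
    """Assign thematic roles using position rules assumed for this sentence."""
    roles = {}
    verb_index = tags.index("verb")
    for start, phrase in noun_phrases(words, tags):
        previous = words[start - 1].lower() if start > 0 else None
        if start < verb_index:
            roles["Agent"] = phrase
        elif previous == "from":
            roles["Source"] = phrase
        elif previous == "to":
            roles["Goal"] = phrase
        else:
            roles["Theme"] = phrase
    return roles


if __name__ == "__main__":
    for role, phrase in assign_roles(WORDS, TAGS).items():
        print(f"{role}: {phrase}")
```

The sketch keeps the two levels of the example apart: `noun_phrases` works only on syntactic categories, while `assign_roles` maps the resulting phrases onto the Agent, Theme, Source, and Goal slots of the verb-argument structure for “chase.”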
S RE EC FT EI OR EN N C E S
Adams, M. J. (1990). Beginning to read: Thinking recognition: Evidence for continuous mapping models.
and learning about print. Cambridge, MA: MIT Press. Journal of Memory and Language, 38, 419–439.
Ainsworth-Darnell, K., Shulman, H. G., & Allport, D. A. (1977). On knowing the meaning
Boland, J. E. (1998). Dissociating brain responses to of words we are unable to report: The effects of
syntactic and semantic anomalies: Evidence from event- visual masking. In S. Dornic (Ed.), Attention and
related potentials. Journal of Memory and Language, performance VI (pp. 505–534). Hillsdale, NJ:
38, 112–130. Lawrence Erlbaum Associates, Inc.
Aitchison, J. (1994). Words in the mind: An Allport, D. A. (1984). Auditory short-term memory
introduction to the mental lexicon (2nd ed.). Oxford: and conduction aphasia. In H. Bouma & D. Bouwhis
Blackwell. (Eds.), Attention and performance X (pp. 313–326).
Aitchison, J. (1996). The seeds of speech: Language Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
origin and evolution. Cambridge: Cambridge Allport, D. A., & Funnell, E. (1981). Components of
University Press. the mental lexicon. Philosophical Transactions of the
Aitchison, J. (1998). The articulate mammal (4th ed.). Royal Society of London, Series B, 295, 397–410.
London: Routledge. Almor, A. (1999). Noun-phrase anaphora and focus:
Akhtar, N. (1999). Acquiring basic word order: The informational load hypothesis. Psychological
Evidence for data-driven learning of syntactic Review, 106, 748–765.
structure. Journal of Child Language, 26, 339–356. Almor, A., Kempler, D., MacDonald, M. C.,
Akhtar, N., & Tomasello, M. (1997). Young Andersen, E. S., & Tyler, L. K. (1999). Why do
children’s productivity with word order and verb Alzheimer patients have difficulty with pronouns?
morphology. Developmental Psychology, 33, 952–965. Working memory, semantics, and reference in
Alario, F.-X., & Caramazza, A. (2002). The comprehension and production in Alzheimer’s disease.
production of determiners: Evidence from French. Brain and Language, 67, 202–227.
Cognition, 82, 179–223. Altarriba, J. (1992). The representation of translation
Alario, F.-X., Costa, A., & Caramazza, A. (2002a). equivalents in bilingual memory. In R. J. Harris (Ed.),
Frequency effects in noun phrase production: Cognitive processing in bilinguals (pp. 157–174).
Implications for models of lexical access. Language Amsterdam: North-Holland.
and Cognitive Processes, 17, 299–319. Altarriba, J. (Ed.). (1993). Cognition and culture: A
Alario, F.-X., Costa, A., & Caramazza, A. (2002b). cross-cultural approach to psychology. Amsterdam:
Hedging one’s bets too much? A reply to Levelt North-Holland.
(2002). Language and Cognitive Processes, 17, Altarriba, J., & Forsythe, W. J. (1993). The role
673–682. of cultural schemata in reading comprehension. In
Alario, F.-X., Costa, A., Pickering, M., & Ferreira, V. J. Altarriba (Ed.), Cognition and culture: A cross-
(2006). Language production. Hove, UK: Psychology cultural approach to psychology (pp. 145–155).
Press. Amsterdam: North-Holland.
Albert, M. L., & Obler, L. K. (1978). The bilingual Altarriba, J., Kroll, J. F., Sholl, A., & Rayner, K.
brain: Neuropsychological and neurolinguistic aspects (1996). The influence of lexical and conceptual
of bilingualism. New York: Academic Press. constraints on reading mixed-language sentences:
Alishahi, A., & Stevenson, S. (2005). A probabilistic Evidence from eye fixations and naming times.
model of early argument structure acquisition. Memory and Cognition, 24, 477–492.
Proceedings of the 27th Annual Conference of the Altarriba, J., & Mathis, K. E. (1997). Conceptual
Cognitive Science Society, Stresa, Italy. and lexical development in second language
Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. acquisition. Journal of Memory and Language, 36,
(1998). Tracking the time course of spoken word 550–568.
496 REFERENCES
Altarriba, J., & Soltano, E. G. (1996). Repetition Andrews, S. (1997). The effect of orthographic
blindness and bilingual memory: Token individuation similarity on lexical retrieval: Resolving neighborhood
for translation equivalents. Memory and Cognition, 24, conflicts. Psychonomic Bulletin and Review, 4,
700–711. 439–461.
Altmann, G. T. M. (Ed.). (1990). Cognitive models of Andrews, S. (2006). From inkmarks to ideas. Hove,
speech processing. Cambridge, MA: MIT Press. UK: Psychology Press.
Altmann, G. T. M. (1997). The ascent of Babel: An Ans, B., Carbonnel, S., & Valdois, S. (1998). A
exploration of language, mind, and understanding. connectionist multiple-trace memory model for
Oxford: Oxford University Press. polysyllabic word reading. Psychological Review, 105,
Altmann, G. T. M. (1999). Thematic role assignment 678–723.
in context. Journal of Memory and Language, 41, Antos, S. J. (1979). Processing facilitation in a lexical
124–145. decision task. Journal of Experimental Psychology:
Altmann, G. T. M., Garnham, A., & Dennis, Y. Human Perception and Performance, 5, 527–545.
(1992). Avoiding the garden path: Eye movements Arbib, M. A. (2005). From monkey-like action
in context. Journal of Memory and Language, 31, recognition to human language: An evolutionary
685–712. framework for neurolinguistics. Behavioral and Brain
Altmann, G. T. M., & Kamide, Y. (1999). Sciences, 28, 105–167.
Incremental interpretation at verbs: Restricting the Arciuli, J., & Simpson, I. C. (2012). Statistical
domain of subsequent reference. Cognition, 73, learning is related to reading ability in children and
247–264. adults. Cognitive Science, 36, 286–304.
Altmann, G. T. M., & Kamide, Y. (2009). Discourse- Armstrong, S., Gleitman, L. R., & Gleitman, H.
mediation of the mapping between language and (1983). What some concepts might not be. Cognition,
the visual world: Eye movements and mental 13, 263–274.
representation. Cognition, 111, 55–71. Arnold, J. E., Eisenband, J. G., Brown-Schmidt, S.,
Altmann, G. T. M., & Shillcock, R. C. (Eds.). (1993). & Trueswell, J. C. (2000). The rapid use of gender
Cognitive models of speech processing. Hove, UK: information: Evidence of the time course of pronoun
Lawrence Erlbaum Associates. resolution from eyetracking. Cognition, 76, B13–B36.
Altmann, G. T. M., & Steedman, M. J. (1988). Atkinson, M. (1982). Explanations in the study of
Interaction with context during human sentence child language development. Cambridge: Cambridge
processing. Cognition, 30, 191–238. University Press.
Anderson, J. R. (1974). Retrieval of propositional Au, T. K. (1983). Chinese and English
information from long-term memory. Cognitive counterfactuals: The Sapir Whorf hypothesis revisited.
Psychology, 6, 451–474. Cognition, 15, 155–187.
Anderson, J. R. (1976). Language, memory, and thought. Au, T. K. (1984). Counterfactuals: In reply to Alfred
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Bloom. Cognition, 17, 289–302.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R. (2010). Cognitive psychology and its implications (7th ed.). New York: Worth.
Anderson, J. R., & Bower, G. H. (1973). Human associative memory. Washington, DC: Winston & Sons.
Anderson, K. J., & Leaper, C. (1998). Meta-analyses of gender effects on conversational interruption: Who, what, when, where, and how. Sex Roles, 39, 225–252.
Anderson, R. C., & Pichert, J. W. (1978). Recall of previously unrecallable information following a shift in perspective. Journal of Verbal Learning and Verbal Behavior, 17, 1–12.
Andrewes, D. (2001). Neuropsychology: From theory to practice. Hove, UK: Psychology Press.
Andrews, S. (1982). Phonological recoding: Is the regularity effect consistent? Memory and Cognition, 10, 565–575.
Andrews, S. (1989). Frequency and neighborhood effects on lexical access: Activation or search? Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 802–814.
Austin, J. L. (1976). How to do things with words (2nd ed.). Oxford: Oxford University Press. [First edition published 1962.]
Baars, B. J., Motley, M. T., & MacKay, D. G. (1975). Output editing for lexical status from artificially elicited slips of the tongue. Journal of Verbal Learning and Verbal Behavior, 14, 382–391.
Baayen, R. H., Dijkstra, T., & Schreuder, R. (1997). Singulars and plurals in Dutch: Evidence for a parallel dual route model. Journal of Memory and Language, 37, 94–117.
Baayen, R. H., Piepenbrock, R., & Gulikers, L. (1995). The CELEX lexical database [CD-ROM]. Philadelphia: Linguistic Data Consortium, University of Pennsylvania.
Bach, E., Brown, C., & Marslen-Wilson, W. (1986). Crossed and nested dependencies in German and Dutch: A psycholinguistic study. Language and Cognitive Processes, 1, 249–262.
Backman, J. E. (1983). Psycholinguistic skills and reading acquisition: A look at early readers. Reading Research Quarterly, 18, 466–479.
Baddeley, A. D. (1990). Human memory: Theory and practice. Hove, UK: Lawrence Erlbaum Associates.
Baddeley, A. D. (2007). Working memory, thought, and action. Oxford: Oxford University Press.
Baddeley, A. D., Ellis, N. C., Miles, T. R., & Lewis, V. J. (1982). Developmental and acquired dyslexia: A comparison. Cognition, 11, 185–199.
Baddeley, A. D., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105, 158–173.
Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47–90). London: Academic Press.
Baddeley, A. D., Vallar, G., & Wilson, B. (1987). Comprehension and the articulatory loop: Some neuropsychological evidence. In M. Coltheart (Ed.), Attention and performance XII (pp. 509–530). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Baddeley, A. D., & Wilson, B. (1988). Comprehension and working memory: A single case neuropsychological study. Journal of Memory and Language, 27, 479–498.
Badecker, W., & Caramazza, A. (1985). On considerations of method and theory governing the use of clinical categories in neurolinguistics and cognitive neuropsychology: The case against agrammatism. Cognition, 20, 97–125.
Badecker, W., & Caramazza, A. (1986). A final brief in the case against agrammatism: The role of theory in the selection of data. Cognition, 24, 277–282.
Badecker, W., Miozzo, M., & Zanuttini, R. (1995). The two-stage model of lexical retrieval: Evidence from a case of anomia with selective preservation of gender. Cognition, 57, 193–216.
Badecker, W., & Straub, K. (2002). The processing role of structural constraints on the interpretation of pronouns and anaphors. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 748–769.
Baguley, T., & Payne, S. J. (2000). Long-term memory for spatial and temporal mental models includes construction processes and model structure. Quarterly Journal of Experimental Psychology, 53A, 479–512.
Bailey, K. G. D., & Ferreira, F. (2003). Disfluencies affect the parsing of garden-path sentences. Journal of Memory and Language, 49, 183–200.
Baillet, S. D., & Keenan, J. M. (1986). The role of encoding and retrieval processes in the recall of text. Discourse Processes, 9, 247–268.
Baldwin, D. A. (1991). Infants’ contributions to the achievement of joint reference. Child Development, 62, 875–890.
Balota, D. A. (1990). The role of meaning in word recognition. In D. A. Balota, G. B. Flores d’Arcais, & K. Rayner (Eds.), Comprehension processes in reading (pp. 9–32). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Balota, D. A., & Chumbley, J. I. (1984). Are lexical decisions a good measure of lexical access? The role of word frequency in the neglected decision stage. Journal of Experimental Psychology: Human Perception and Performance, 10, 340–357.
Balota, D. A., & Chumbley, J. I. (1985). The locus of word-frequency effects in the pronunciation task: Lexical access and/or production? Journal of Memory and Language, 24, 89–106.
Balota, D. A., & Chumbley, J. I. (1990). Where are the effects of frequency on visual word recognition tasks? Right where we said they were! Comment on Monsell, Doyle, and Haggard (1989). Journal of Experimental Psychology: General, 119, 231–237.
Balota, D. A., Cortese, M. J., Sergent-Marshall, S. D., Spieler, D. H., & Yap, M. J. (2004). Visual word recognition of single-syllable words. Journal of Experimental Psychology: General, 133, 283–316.
Balota, D. A., Ferraro, F. R., & Conner, L. T. (1991). On the early influence of meaning in word recognition: A review of the literature. In P. J. Schwanenflugel (Ed.), The psychology of word meanings (pp. 187–222). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Balota, D. A., & Lorch, R. F. (1986). Depth of automatic spreading activation: Mediated priming effects in pronunciation but not in lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 336–345.
Baluch, B., & Besner, D. (1991). Visual word recognition: Evidence for strategic control of lexical and nonlexical routines in oral reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 644–652.
Banich, M. T. (2004). Cognitive neuroscience and neuropsychology. Boston, MA: Houghton Mifflin.
Banks, W. P., & Flora, J. (1977). Semantic and perceptual processing in symbolic comparison. Journal of Experimental Psychology: Human Perception and Performance, 3, 278–290.
Barisnikov, K., van der Linden, M., & Poncelet, M. (1996). Acquisition of new words and phonological working memory in Williams syndrome: A case study. Neurocase, 2, 395–404.
Barker, M. G., & Lawson, J. S. (1968). Nominal aphasia in dementia. British Journal of Psychiatry, 114, 1351–1356.
Baron, J., & Strawson, C. (1976). Use of orthographic and word-specific knowledge in reading words aloud. Journal of Experimental Psychology: Human Perception and Performance, 2, 386–393.
Baron-Cohen, S. (2003). The essential difference. Harmondsworth, UK: Penguin.
Barrett, M. D. (1978). Lexical development and overextension in child language. Journal of Child Language, 5, 205–219.
Barrett, M. D. (1982). Distinguishing between prototypes: The early acquisition of the meaning of object names. In S. A. Kuczaj (Ed.), Language development: Vol. 1. Syntax and semantics (pp. 313–334). New York: Springer-Verlag.
Barrett, M. D. (1986). Early semantic representations and early word-usage. In S. A. Kuczaj & M. D. Barrett (Eds.), The development of word meaning: Progress in cognitive development research (pp. 39–67). New York: Springer-Verlag.
Barron, R. W. (1981). Reading skills and reading strategies. In C. A. Perfetti & A. M. Lesgold (Eds.), Interactive processes in reading (pp. 299–328). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Barron, R. W., & Baron, J. (1977). How children get meaning from printed words. Child Development, 48, 587–594.
Barry, C., Morrison, C. M., & Ellis, A. W. (1997). Naming the Snodgrass and Vanderwart pictures: Effects of age of acquisition, frequency, and name agreement. Quarterly Journal of Experimental Psychology, 50A, 560–585.
Barsalou, L. W. (1985). Ideals, central tendency, and frequency of instantiation as determinants of graded structure in categories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 629–654.
Barsalou, L. W. (2003). Situated simulation in the human conceptual system. Language and Cognitive Processes, 18, 513–562.
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Bartlett, F. C. (1932). Remembering: A study in experimental and social psychology. Cambridge: Cambridge University Press.
Batchelder, E. O. (2002). Bootstrapping the lexicon: A computational model of infant speech segmentation. Cognition, 83, 167–206.
Bates, E., Bretherton, I., & Snyder, L. (1988). From first words to grammar: Individual differences and dissociable mechanisms. Cambridge: Cambridge University Press.
Bates, E., Dick, F., & Wulfeck, B. (1999). Not so fast: Domain-general factors can account for selective deficits in grammatical processing. Behavioral and Brain Sciences, 22, 96–97.
Bates, E., & Goodman, J. C. (1997). On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia and real-time processing. Language and Cognitive Processes, 12, 507–586.
Bates, E., & Goodman, J. C. (1999). On the emergence of grammar from the lexicon. In B. MacWhinney (Ed.), The emergence of language (pp. 29–79). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Bates, E., & MacWhinney, B. (1982). Functionalist approaches to grammar. In E. Wanner & L. R. Gleitman (Eds.), Language acquisition: The state of the art (pp. 173–218). Cambridge: Cambridge University Press.
Bates, E., Marchman, V., Thal, D., Fenson, L., Dale, P. S., Reznick, J. S., et al. (1994). Developmental and stylistic variation in the composition of early vocabulary. Journal of Child Language, 21, 85–123.
Bates, E., Masling, M., & Kintsch, W. (1978). Recognition memory for aspects of dialog. Journal of Experimental Psychology: Human Learning and Memory, 4, 187–197.
Bates, E., McDonald, J., MacWhinney, B., & Applebaum, M. (1991). A maximum likelihood procedure for the analysis of group and individual data in aphasia research. Brain and Language, 40, 231–265.
Bates, E., & Roe, K. (2001). Language development in children with unilateral brain damage. In C. A. Nelson & M. Luciana (Eds.), Handbook of developmental cognitive neuroscience (pp. 309–318). Cambridge, MA: MIT Press.
Batterink, L., Karns, C. M., Yamada, Y., & Neville, H. (2010). The role of awareness in semantic and syntactic processing: An ERP attentional blink study. Journal of Cognitive Neuroscience, 22, 2514–2529.
Battig, W. F., & Montague, W. E. (1969). Category norms for verbal items in 56 categories: A replication and extension of the Connecticut category norms. Journal of Experimental Psychology Monograph, 80, 1–46.
Bavelier, D., & Potter, M. C. (1992). Visual and phonological codes in repetition blindness. Journal of Experimental Psychology: Human Perception and Performance, 18, 134–147.
Beaton, A. A. (1997). The relation of planum temporale asymmetry and morphology of the corpus callosum to handedness, gender, and dyslexia: A review of the evidence. Brain and Language, 60, 252–322.
Beattie, G. W. (1980). The role of language production processes in the organisation of behaviour in face-to-face interaction. In B. Butterworth (Ed.), Language production: Vol. 1. Speech and talk (pp. 69–107). London: Academic Press.
Beattie, G. W. (1983). Talk: An analysis of speech and non-verbal behaviour in conversation. Milton Keynes, UK: Open University Press.
Beattie, G. W., & Bradbury, R. J. (1979). An experimental investigation of the modifiability of the temporal structure of spontaneous speech. Journal of Psycholinguistic Research, 8, 225–247.
Beattie, G. W., & Butterworth, B. (1979). Contextual probability and word frequency as determinants of pauses and errors in spontaneous speech. Language and Speech, 22, 201–211.
Beauvois, M.-F. (1982). Optic aphasia: A process of interaction between vision and language. Philosophical Transactions of the Royal Society of London, Series B, 298, 35–47.
Beauvois, M.-F., & Derouesné, J. (1979). Phonological alexia: Three dissociations. Journal of Neurology, Neurosurgery and Psychiatry, 42, 1115–1124.
Beauvois, M.-F., & Derouesné, J. (1981). Lexical or orthographic agraphia. Brain, 104, 21–49.
Bechtel, W., & Abrahamsen, A. (2001). Connectionism and the mind: Parallel processing, dynamics and evolution in networks. Oxford: Blackwell.
Becker, C. A. (1976). Allocation of attention during visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 2, 556–566.
Becker, C. A. (1980). Semantic context effects in visual word recognition: An analysis of semantic strategies. Memory and Cognition, 8, 493–512.
Becker, C. A., & Killion, T. H. (1977). Interaction of visual and cognitive effects in word recognition. Journal of Experimental Psychology: Human Perception and Performance, 3, 389–407.
Begley, S. (2007). Train your mind, change your brain. New York: Ballantine Books.
Behrend, D. A. (1988). Overextensions in early language comprehension: Evidence from a signal detection approach. Journal of Child Language, 15, 63–75.
Behrmann, M., & Bub, D. (1992). Surface dyslexia and dysgraphia: Dual routes, single lexicon. Cognitive Neuropsychology, 9, 209–251.
Bellugi, U., Bihrle, A., Jernigan, T., Trauner, D., & Doherty, S. (1991). Neuropsychological, neurological, and neuroanatomical profile of Williams syndrome. American Journal of Medical Genetics Supplement, 6, 115–125.
Bencini, G. L., & Goldberg, A. E. (2000). The contribution of argument structure constructions to sentence meaning. Journal of Memory and Language, 43, 640–651.
Benedict, H. (1979). Early lexical development: Comprehension and production. Journal of Child Language, 6, 183–200.
Ben-Zeev, S. (1977). The influence of bilingualism on cognitive strategy and cognitive development. Child Development, 48, 1009–1018.
Bereiter, C., & Scardamalia, M. (1987). The psychology of written composition. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Berko, J. (1958). The child’s learning of English morphology. Word, 14, 150–177.
Berlin, B., & Kay, P. (1969). Basic color terms: Their universality and evolution. Berkeley: University of California Press.
Berndt, R. S., & Mitchum, C. C. (1990). Auditory and lexical information sources in immediate recall: Evidence from a patient with a deficit to the phonological short-term store. In G. Vallar & T. Shallice (Eds.), Neuropsychological implications of short-term memory (pp. 115–144). Cambridge: Cambridge University Press.
Bertenthal, B. I. (1993). Infants’ perceptions of biomechanical motions: Intrinsic image and knowledge-based constraints. In C. Granrud (Ed.), Visual perception and cognition in infancy (pp. 175–214). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Bertram, R., Schreuder, R., & Baayen, R. H. (2000). The balance of storage and computation in morphological processing: The role of word formation type, affixal homophony, and productivity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 489–511.
Berwick, R. C., Pietroski, P., Yankama, B., & Chomsky, N. (2011). Poverty of the stimulus revisited. Cognitive Science, 35, 1207–1242.
Berwick, R. C., & Weinberg, A. S. (1983a). The role of grammars in models of language use. Cognition, 13, 1–61.
Berwick, R. C., & Weinberg, A. S. (1983b). Reply to Garnham. Cognition, 15, 271–276.
Besner, D., & Swan, M. (1982). Models of lexical access in visual word recognition. Quarterly Journal of Experimental Psychology, 34A, 313–325.
Besner, D., Twilley, L., McCann, R. S., & Seergobin, K. (1990). On the connection between connectionism and data: Are a few words necessary? Psychological Review, 97, 432–446.
Best, B. J. (1973). Classificatory development in deaf children: Research on language and cognitive development. Occasional Paper No. 15, Research, Development and Demonstration Center in Education of Handicapped Children, University of Minnesota.
Bestgen, Y., & Vincze, N. (2012). Checking and bootstrapping lexical norms by means of word similarity indexes. Behavior Research Methods, 44, 998–1006.
Bestgen, Y., & Vonk, W. (2000). Temporal adverbials as segmentation markers in discourse comprehension. Journal of Memory and Language, 42, 74–87.
Bever, T. G. (1970). The cognitive basis for linguistic structures. In J. R. Hayes (Ed.), Cognition and the development of language (pp. 279–362). New York: Wiley.
Bever, T. G. (1981). Normal acquisition processes explain the critical period for language learning. In K. C. Diller (Ed.), Individual differences and universals in language aptitude (pp. 176–198). Rowley, MA: Newbury House.
Bever, T. G., & McElree, B. (1988). Empty categories access their antecedents during comprehension. Linguistic Inquiry, 19, 35–45.
Bever, T. G., Sanz, M., & Townsend, D. J. (1998). The emperor’s psycholinguistics. Journal of Psycholinguistic Research, 27, 261–284.
Bialystok, E. (2001). Metalinguistic aspects of bilingual processing. Annual Review of Applied Linguistics, 21, 169–181.
Bialystok, E., Craik, F. I. M., & Luk, G. (2012). Bilingualism: Consequences for mind and brain. Trends in Cognitive Sciences, 16, 240–250.
Bialystok, E., & Hakuta, K. (1994). In other words: The science and psychology of second-language acquisition. New York: Basic Books.
Biassou, N., Obler, L. K., Nespoulous, J.-L., Dordain, M., & Harris, K. S. (1997). Dual processing of open- and closed-class words. Brain and Language, 57, 360–373.
Bickerton, D. (1981). Roots of language. Ann Arbor, MI: Karoma.
Bickerton, D. (1984). The language bioprogram hypothesis. Behavioral and Brain Sciences, 7, 173–221.
Bickerton, D. (1986). More than nature needs? A reply to Premack. Cognition, 23, 73–79.
Bickerton, D. (1990). Language and species. Chicago: University of Chicago Press.
Bickerton, D. (2003). Symbol and structure: A comprehensive framework for language evolution. In M. H. Christiansen & S. Kirby (Eds.), Language evolution (pp. 77–93). Oxford: Oxford University Press.
Bierwisch, M. (1970). Semantics. In J. Lyons (Ed.), New horizons in linguistics (Vol. 1, pp. 166–185). Harmondsworth, UK: Penguin.
Bigelow, A. (1987). Early words of blind children. Journal of Child Language, 14, 47–56.
Binder, J. R., & Desai, R. H. (2011). The neurobiology of semantic memory. Trends in Cognitive Sciences, 15, 527–536.
Bird, H., Lambon Ralph, M. A., Seidenberg, M. S., McClelland, J. L., & Patterson, K. E. (2003). Deficits in phonology and past-tense morphology: What’s the connection? Journal of Memory and Language, 48, 502–526.
Birdsong, D., & Molis, M. (2001). On the evidence for maturational constraints in second-language acquisition. Journal of Memory and Language, 44, 235–249.
Bishop, D. (1983). Linguistic impairment after left hemidecortication for infantile hemiplegia? A reappraisal. Quarterly Journal of Experimental Psychology, 35A, 199–207.
Bishop, D. (1989). Autism, Asperger’s syndrome and semantic-pragmatic disorder: Where are the boundaries? British Journal of Disorders of Communication, 24, 107–121.
Bishop, D. (1997). Uncommon understanding: Development and disorders of language comprehension in children. Hove, UK: Psychology Press.
Bishop, D., & Mogford, K. (Eds.). (1993). Language development in exceptional circumstances. Hove, UK: Lawrence Erlbaum Associates.
Black, J. B., & Wilensky, R. (1979). An evaluation of story grammars. Cognitive Science, 3, 213–229.
Blackwell, A., & Bates, E. (1995). Inducing agrammatic profiles in normals: Evidence for the selective vulnerability of morphology under cognitive resource limitation. Journal of Cognitive Neuroscience, 7, 228–257.
Blanken, G. (1998). Lexicalisation in speech production: Evidence from form-related word substitutions in aphasia. Cognitive Neuropsychology, 15, 321–360.
Bloem, I., & La Heij, W. (2003). Semantic facilitation and semantic interference in word translation: Implications for models of lexical access in language production. Journal of Memory and Language, 48, 468–488.
Bloem, I., van den Boogaard, S., & La Heij, W. (2004). Semantic facilitation and semantic interference in language production: Further evidence for the conceptual selection model of lexical access. Journal of Memory and Language, 51, 307–323.
Bloom, A. H. (1981). The linguistic shaping of thought: A study in the impact of thinking in China and the West. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Bloom, A. H. (1984). Caution—the words you use may affect what you say: A response to Au. Cognition, 17, 275–287.
Bloom, L. (1970). Language development: Form and function in emerging grammars. Cambridge, MA: MIT Press.
Bloom, L. (1973). One word at a time: The use of single word utterances before syntax. The Hague: Mouton.
Bloom, L. (1998). Language acquisition in its developmental context. In W. Damon, D. Kuhn, & R. S. Siegler (Eds.), Handbook of child psychology (Vol. 2, 5th ed., pp. 309–370). New York: Wiley.
Bloom, P. (1994). Recent controversies in the study of language acquisition. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 741–780). San Diego, CA: Academic Press.
Bloom, P. (2001a). How children learn the meanings of words. Cambridge, MA: MIT Press.
Bloom, P. (2001b). Précis of How children learn the meanings of words. Behavioral and Brain Sciences, 24, 1095–1103.
Bloom, P. (2004). Children think before they speak. Nature, 430, 411–412.
Blumstein, S. E., Cooper, W. E., Zurif, E. B., & Caramazza, A. (1977). The perception and production of voice-onset time in aphasia. Neuropsychologia, 15, 19–30.
Blumstein, S. E., Katz, B., Goodglass, H., Shrier, R., & Dworetzky, B. (1985). The effects of slowed speech on auditory comprehension in aphasia. Brain and Language, 24, 246–265.
Boas, F. (1911). Introduction to The Handbook of North American Indians (Vol. 1). Bureau of American Ethnology Bulletin, 40 (Part 1).
Bock, J. K. (1982). Toward a cognitive psychology of syntax: Information processing contributions to sentence formulation. Psychological Review, 89, 1–47.
Bock, J. K. (1986). Syntactic persistence in language production. Cognitive Psychology, 18, 355–387.
Bock, J. K. (1987). An effect of accessibility of word forms on sentence structure. Journal of Memory and Language, 26, 119–137.
Bock, J. K. (1989). Closed-class immanence in sentence production. Cognition, 31, 163–186.
Bock, J. K., & Cutting, J. C. (1992). Regulating mental energy: Performance units in language production. Journal of Memory and Language, 31, 99–127.
Bock, J. K., & Eberhard, K. M. (1993). Meaning, sound and syntax in English number agreement. Language and Cognitive Processes, 8, 57–99.
Bock, J. K., Eberhard, K. M., & Cutting, J. C. (2004). Producing number agreement: How pronouns equal verbs. Journal of Memory and Language, 51, 251–278.
Bock, J. K., & Griffin, Z. M. (2000). The persistence of structural priming: Transient activation or implicit learning? Journal of Experimental Psychology: General, 129, 177–192.
Bock, J. K., & Irwin, D. E. (1980). Syntactic effects of information availability in sentence production. Journal of Verbal Learning and Verbal Behavior, 19, 467–484.
Bock, J. K., & Loebell, H. (1990). Framing sentences. Cognition, 35, 1–39.
Bock, J. K., & Miller, C. A. (1991). Broken agreement. Cognitive Psychology, 23, 45–93.
Bock, J. K., & Warren, R. K. (1985). Conceptual accessibility and syntactic structure in sentence formulation. Cognition, 21, 47–67.
Bohannon, J. N., MacWhinney, B., & Snow, C. E. (1990). No negative evidence revisited: Beyond learnability or who has to prove what to whom. Developmental Psychology, 26, 221–226.
Bohannon, J. N., & Stanowicz, L. (1988). The issue of negative evidence: Adult responses to children’s language errors. Developmental Psychology, 24, 684–689.
Boland, J. E. (1997). Resolving syntactic category ambiguities in discourse context: Probabilistic and discourse constraints. Journal of Memory and Language, 36, 588–615.
Boland, J. E., Tanenhaus, M. K., Carlson, G. N., & Garnsey, S. M. (1989). Lexical projection and the interaction of syntax and semantics in parsing. Journal of Psycholinguistic Research, 18, 563–576.
Boland, J. E., Tanenhaus, M. K., & Garnsey, S. M. (1990). Evidence for the immediate use of verb control information in sentence processing. Journal of Memory and Language, 29, 413–432.
Boland, J. E., Tanenhaus, M. K., Garnsey, S. M., & Carlson, G. N. (1995). Verb argument structure in parsing and interpretation: Evidence from wh-questions. Journal of Memory and Language, 34, 774–806.
Bolinger, D. L. (1965). The atomization of meaning. Language, 41, 555–573.
Bonin, P., Barry, C., Méot, A., & Chalard, M. (2004). The influence of age of acquisition in word reading and other tasks: A never ending story? Journal of Memory and Language, 50, 456–476.
Bonin, P., & Fayol, M. (2002). Frequency effects in the written and spoken production of homophonic picture names. European Journal of Cognitive Psychology, 14, 289–313.
Boomer, D. S. (1965). Hesitations and grammatical encoding. Language and Speech, 8, 148–158.
Bornkessel-Schlesewsky, I., Schlesewsky, M., & von Cramon, D. Y. (2009). Word order and Broca’s region: Evidence for a supra-syntactic perspective. Brain and Language, 111, 125–139.
Bornstein, M. H. (1973). Color vision and color naming: A psychophysiological hypothesis of cultural difference. Psychological Bulletin, 80, 257–285.
Bornstein, S. (1985). On the development of colour naming in young children: Data and theory. Brain and Language, 26, 72–93.
Boroditsky, L. (2001). Does language shape thought? Mandarin and English speakers’ conceptions of time. Cognitive Psychology, 43, 1–22.
Boroditsky, L. (2003). Linguistic relativity. In L. Nadel (Ed.), Encyclopedia of cognitive science (Vol. 2, pp. 917–921). London: Nature Publishing Group.
Borowsky, R., & Masson, M. E. J. (1996). Semantic ambiguity effects in word identification. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 63–85.
Borsley, R. D. (1991). Syntactic theory: A unified approach. London: Edward Arnold.
Bouckaert, R., Lemey, P., Dunn, M., Greenhill, S. J., Alekseyenko, A. V., Drummond, A. J., et al. (2012). Mapping the origins and expansion of the Indo-European language family. Science, 337, 957–960.
Bower, G. H., Black, J. B., & Turner, T. J. (1979). Scripts in memory for text. Cognitive Psychology, 11, 177–220.
Bowerman, M. (1973). Learning to talk: A cross linguistic study of early syntactic development, with special reference to Finnish. Cambridge: Cambridge University Press.
Bowerman, M. (1978). The acquisition of word meanings: An investigation into some current conflicts. In N. Waterson & C. E. Snow (Eds.), The development of communication (pp. 263–287). Chichester, UK: Wiley.
Bowerman, M. (1990). Mapping thematic roles onto syntactic functions: Are children helped by innate linking rules? Linguistics, 28, 1253–1289.
Bowey, J. A. (1996). On the association between phonological memory and receptive vocabulary in five-year-olds. Journal of Experimental Child Psychology, 63, 44–78.
Bowey, J. A. (1997). What does nonword repetition measure? A reply to Gathercole and Baddeley. Journal of Experimental Child Psychology, 67, 295–301.
Bowey, J. A., & Muller, D. (2005). Phonological recoding and rapid orthographic learning in third-graders’ silent reading: A critical test of the self-teaching hypothesis. Journal of Experimental Child Psychology, 92, 203–219.
Bowles, N. L., & Poon, L. W. (1985). Effects of priming in word retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 272–283.
Bradley, D. C., & Forster, K. I. (1987). A reader’s view of listening. Cognition, 25, 103–134.
Bradley, D. C., Garrett, M. F., & Zurif, E. B. (1980). Syntactic deficits in Broca’s aphasia. In D. Caplan (Ed.), Biological studies of mental processes (pp. 269–286). Cambridge, MA: MIT Press.
Bradley, L., & Bryant, P. (1978). Difficulties in auditory organization as a possible cause of reading backwardness. Nature, 271, 746–747.
Bradley, L., & Bryant, P. (1983). Categorizing sounds and learning to read—A causal connection. Nature, 301, 419–421.
Braine, M. D. S. (1963). The ontogeny of English phrase structure: The first phase. Language, 39, 1–13.
Braine, M. D. S. (1976). Children’s first word combinations. Monographs of the Society for Research in Child Development, 41 (Serial No. 164).
Braine, M. D. S. (1988a). Review of Language learnability and language development by S. Pinker. Journal of Child Language, 15, 189–219.
Braine, M. D. S. (1988b). Modeling the acquisition of linguistic structure. In Y. Levy, I. M. Schlesinger, & M. D. S. Braine (Eds.), Categories and processes in language acquisition (pp. 217–259). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Braine, M. D. S. (1992). What sort of innate structure is needed to “bootstrap” into syntax? Cognition, 45, 77–100.
Braine, M. D. S., & Brooks, P. J. (1995). Verb argument structure and the problem of avoiding an overgeneral grammar. In M. Tomasello & W. E. Merriman (Eds.), Beyond names for things: Young children’s acquisition of verbs (pp. 352–376). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Bramwell, B. (1897). Illustrative cases of aphasia. Lancet, 1, 1256–1259. [Reprinted in Cognitive Neuropsychology (1984), 1, 249–258.]
Branigan, H. P., Pickering, M. J., & Cleland, A. A. (2000). Syntactic co-ordination in dialogue. Cognition, 75, B13–B25.
Branigan, H. P., Pickering, M. J., Liversedge, S. P., Stewart, A. J., & Urbach, T. P. (1995). Syntactic priming: Investigating the mental representation of language. Journal of Psycholinguistic Research, 24, 489–506.
Bransford, J. D., Barclay, J. R., & Franks, J. J. (1972). Sentence memory: A constructive versus interpretive approach. Cognitive Psychology, 3, 193–209.
Bransford, J. D., & Johnson, M. K. (1973). Consideration of some problems of comprehension. In W. G. Chase (Ed.), Visual information processing (pp. 383–438). New York: Academic Press.
Breedin, S. D., & Saffran, E. M. (1999). Sentence processing in the face of semantic loss: A case study. Journal of Experimental Psychology: General, 128, 547–562.
Breedin, S. D., Saffran, E. M., & Coslett, H. B. (1994). Reversal of the concreteness effect in a patient with semantic dementia. Cognitive Neuropsychology, 11, 617–660.
Breedin, S. D., Saffran, E. M., & Schwartz, M. (1998). Semantic factors in verb retrieval: An effect of complexity. Brain and Language, 63, 1–35.
Brennan, S. E., & Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1482–1493.
Brennen, T. (1999). Face naming in dementia: A reply to Hodges and Greene (1998). Quarterly Journal of Experimental Psychology, 52A, 535–541.
Brennen, T., David, D., Fluchaire, I., & Pellat, J. (1996). Naming faces and objects without comprehension: A case study. Cognitive Neuropsychology, 13, 93–110.
Brewer, W. F. (1987). Schemas versus mental models in human memory. In P. Morris (Ed.), Modelling cognition (pp. 187–197). Chichester, UK: J. Wiley & Sons.
Britton, B. K., Muth, K. D., & Glynn, S. M. (1986). Effects of text organization on memory: Test of a cognitive effect hypothesis with limited exposure time. Discourse Processes, 9, 475–487.
Broeder, P., & Murre, J. (Eds.). (2000). Models of language acquisition: Inductive and deductive approaches. Oxford: Oxford University Press.
Bronowski, J., & Bellugi, U. (1970). Language, name, and concept. Science, 168, 669–673.
Broom, Y. M., & Doctor, E. A. (1995a). Developmental phonological dyslexia: A case study of
the efficacy of a remediation programme. Cognitive Brown, R., & Lenneberg, E. H. (1954). A study in
Neuropsychology, 12, 725–766. language and cognition. Journal of Abnormal and
Broom, Y. M., & Doctor, E. A. (1995b). Social Psychology, 49, 454–462.
Developmental surface dyslexia: A case study of Brown, R., & McNeill, D. (1966). The “tip of the
the efficacy of a remediation programme. Cognitive tongue” phenomenon. Journal of Verbal Learning and
Neuropsychology, 12, 69–110. Verbal Behavior, 5, 325–337.
Brown, A. S. (1991). A review of the tip-of-the-tongue Brownell, H. H., & Gardner, H. (1988).
experience. Psychological Bulletin, 109, 204–223. Neuropsychological insights into humour. In J. Durant
Brown, G. D. A. (1987). Resolving inconsistency: & J. Miller (Eds.), Laughing matters (pp. 17–34).
A computational model of word naming. Journal of Harlow, UK: Longman.
Memory and Language, 26, 1–23. Brownell, H. H., Michel, D., Powelson, J. A., &
Brown, G. D. A., & Deavers, R. P. (1999). Units of Gardner, H. (1983). Surprise but not coherence:
analysis in nonword reading: Evidence from children Sensitivity to verbal humor in right hemisphere
and adults. Journal of Experimental Child Psychology, patients. Brain and Language, 18, 20–27.
73, 208–242. Brownell, H. H., Potter, H. H., Bihrle, A. M., &
Brown, G. D. A., & Ellis, N. C. (1994). Issues in Gardner, H. (1986). Inference deficits in right
spelling research: An overview. In G. D. A. Brown brain-damaged patients. Brain and Language, 27,
& N. C. Ellis (Eds.), Handbook of spelling: Theory, 310–321.
process and intervention (pp. 3–25). London: John Bruce, D. J. (1958). The effects of listeners’
Wiley & Sons. anticipations in the intelligibility of heard speech.
Brown, G. D. A., & Watson, F. L. (1987). First Language and Speech, 1, 79–97.
in, first out: Word learning age and spoken word Bruck, M., Lambert, W. E., & Tucker, G. R. (1976).
frequency as predictors of word familiarity and word Cognitive and attitudinal consequences of bilingual
naming latency. Memory and Cognition, 15, 208–216. schooling: The St. Lambert project through grade six.
Brown, G. D. A., & Watson, F. L. (1994). Spelling- International Journal of Psycholinguistics, 6, 13–33.
to-sound effects in single-word reading. British Bruner, J. S. (1964). The course of cognitive growth.
Journal of Psychology, 85, 181–202. American Psychologist, 19, 1–15.
Brown, P. (1991). DEREK: The direct encoding Bruner, J. S. (1975). From communication to
routine for evolving knowledge. In D. Besner & language—a psychological perspective. Cognition, 3,
G. W. Humphreys (Eds.), Basic processes in reading: 255–287.
Visual word recognition (pp. 104–147). Hillsdale, NJ: Bruner, J. S. (1983). Child’s talk: Learning to use
Lawrence Erlbaum Associates, Inc. language. New York: W. W. Norton.
Brown, P., & Levinson, S. (1987). Politeness: Some Bryant, P. (1998). Sensitivity to onset and rhyme
universals in language usage. Cambridge: Cambridge does predict young children’s reading: A comment on
University Press. Muter, Hulme, Snowling, and Taylor (1997). Journal
Brown, R. (1958). Words and things. New York: Free of Experimental Child Psychology, 71, 29–37.
Press. Bryant, P., & Impey, L. (1986). The similarity
Brown, R. (1970). Psychology and reading: between normal readers and developmental and
Commentary on chapters 5 to 10. In H. Levin & acquired dyslexics. Cognition, 24, 121–137.
J. P. Williams (Eds.), Basic studies on reading Brysbaert, M., & Mitchell, D. C. (1996). Modifier
(pp. 164–187). New York: Basic Books. attachment in sentence parsing: Evidence from Dutch.
Brown, R. (1973). A first language: The early stages. Quarterly Journal of Experimental Psychology, 49A,
London: George Allen & Unwin. 664–695.
Brown, R. (1976). In memorial tribute to Eric Bryson, B. (1990). Mother tongue. Harmondsworth,
Lenneberg. Cognition, 4, 125–154. UK: Penguin Books.
Brown, R., & Bellugi, U. (1964). Three processes Bub, D. (2000). Methodological issues confronting
in the acquisition of syntax. Harvard Educational PET and fMRI studies of cognitive function. Cognitive
Review, 34, 133–151. Neuropsychology, 17, 467–484.
Brown, R., & Fraser, C. (1963). The acquisition of Bub, D., Black, S., Hampson, E., & Kertesz, A.
syntax. In C. Cofer & B. Musgrave (Eds.), Verbal (1988). Semantic encoding of pictures and words:
behavior and learning: Problems and processes (pp. Some neuropsychological observations. Cognitive
158–209). New York: McGraw-Hill. Neuropsychology, 5, 27–66.
Brown, R., & Hanlon, C. (1970). Derivational Bub, D., Black, S., Howell, J., & Kertesz, A. (1987).
complexity and order of acquisition in child Speech output processes and reading. In
speech. In J. R. Hayes (Ed.), Cognition and the M. Coltheart, G. Sartori, & R. Job (Eds.), The
development of language (pp. 11–53). New York: cognitive neuropsychology of language (pp. 79–110).
John Wiley & Sons. Hove, UK: Lawrence Erlbaum Associates.
Bub, D., Cancelliere, A., & Kertesz, A. (1985). the tongue and language production (pp. 73–108).
Whole-word and analytic translation of spelling to Amsterdam: Mouton.
sound in a non-semantic reader. In K. E. Patterson, Butterworth, B. (1985). Jargon aphasia: Processes
J. C. Marshall, & M. Coltheart (Eds.), Surface and strategies. In S. Newman & R. Epstein (Eds.),
dyslexia: Neuropsychological and cognitive studies Current perspectives in dysphasia (pp. 61–96).
of phonological reading (pp. 15–34). Hove, UK: Edinburgh: Churchill Livingstone.
Lawrence Erlbaum Associates. Butterworth, B., & Beattie, G. W. (1978). Gesture
Bub, D., & Kertesz, A. (1982a). Deep agraphia. Brain and silence as indicators of planning in speech. In
and Language, 17, 146–165. R. N. Campbell & P. T. Smith (Eds.), Recent advances
Bub, D., & Kertesz, A. (1982b). Evidence for in the psychology of language: Vol. 4. Formal and
logographic processing in a patient with preserved experimental approaches (pp. 347–360). London:
written over oral single word naming. Brain, 105, Plenum Press.
697–717. Butterworth, B., Campbell, R., & Howard, D.
Buckingham, H. W. (1981). Where do neologisms (1986). The uses of short-term memory: A case study.
come from? In J. W. Brown (Ed.), Jargon-aphasia (pp. Quarterly Journal of Experimental Psychology, 38A,
39–62). New York: Academic Press. 705–737.
Buckingham, H. W. (1986). The scan-copier Butterworth, B., & Howard, D. (1987).
mechanism and the positional level of language Paragrammatisms. Cognition, 26, 1–37.
production: Evidence from phonemic paraphasia. Butterworth, B., Swallow, J., & Grimston, M.
Cognitive Science, 10, 195–217. (1981). Gestures and lexical processes in
Burgess, C. (2000). Theory and operational jargonaphasia. In J. Brown (Ed.), Jargonaphasia (pp.
definitions in computational memory models: A 113–124). New York: Academic Press.
response to Glenberg and Robertson. Journal of Butterworth, B., & Wengang, Y. (1991). The
Memory and Language, 43, 402–408. universality of two routines for reading: Evidence
Burgess, C., & Lund, K. (1997). Representing from Chinese dyslexia. Proceedings of the Royal
abstract words and emotional connotation in high- Society of London, Series B, 245, 91–95.
dimensional memory space. In Proceedings of the Byrne, B. (1998). The foundation of literacy: The
Cognitive Science Society (pp. 61–66). Hillsdale, NJ: child’s acquisition of the alphabetic principle. Hove,
Lawrence Erlbaum Associates, Inc. UK: Psychology Press.
Burke, D., MacKay, D. G., Worthley, J. S., & Wade, E. Cacciari, C., & Glucksberg, S. (1994).
(1991). On the tip of the tongue: What causes word Understanding figurative language. In M. A.
finding failures in young and older adults? Journal of Gernsbacher (Ed.), Handbook of psycholinguistics (pp.
Memory and Language, 30, 237–246. 447–477). San Diego, CA: Academic Press.
Burton, M. W., Baum, S. R., & Blumstein, S. E. Cairns, P., Shillcock, R., Chater, N., & Levy,
(1989). Lexical effects on the phonetic categorization J. (1995). Bottom-up connectionist modelling
of speech: The role of acoustic structure. Journal of of speech. In J. P. Levy, D. Bairaktaris, J. A.
Experimental Psychology: Human Perception and Bullinaria, & P. Cairns (Eds.), Connectionist models
Performance, 15, 567–575. of memory and language (pp. 289–310). London:
Burton-Roberts, N. (1997). Analysing sentences: UCL Press.
An introduction to English syntax (2nd ed.). London: Cairns, P., Shillcock, R., Chater, N., & Levy, J.
Longman. (1997). Bootstrapping word boundaries: A bottom-up
Bus, A. G., & van IJzendoorn, M. H. (1999). corpus-based approach to segmentation. Cognitive
Phonological awareness and early reading: A meta- Psychology, 33, 111–153.
analysis of experimental training studies. Journal of Campbell, R., & Butterworth, B. (1985).
Educational Psychology, 91, 403–414. Phonological dyslexia and dysgraphia: A
Butterworth, B. (1975). Hesitation and semantic developmental case with associated deficits of
planning in speech. Journal of Psycholinguistic phonemic processing and awareness. Quarterly
Research, 4, 75–87. Journal of Experimental Psychology, 37A, 435–475.
Butterworth, B. (1979). Hesitation and the production Cantalupo, C., & Hopkins, W. D. (2001).
of neologisms in jargon aphasia. Brain and Language, Asymmetric Broca’s area in great apes. Nature, 414,
8, 133–161. 505.
Butterworth, B. (1980). Evidence from pauses in Caplan, D. (1972). Clause boundaries and recognition
speech. In B. Butterworth (Ed.), Language production: latencies. Perception and Psychophysics, 12, 73–76.
Vol. 1. Speech and talk (pp. 155–176). London: Caplan, D. (1986). In defense of agrammatism.
Academic Press. Cognition, 24, 263–276.
Butterworth, B. (1982). Speech errors: Old data in Caplan, D. (1992). Language: Structure, processing,
search of new theories. In A. Cutler (Ed.), Slips of and disorders. Cambridge, MA: MIT Press.
Caplan, D., Baker, C., & Dehaut, F. (1985). Caramazza, A., & Hillis, A. E. (1990). Where do
Syntactic determinants of sentence comprehension in semantic errors come from? Cortex, 26, 95–122.
aphasia. Cognition, 21, 117–175. Caramazza, A., & Hillis, A. E. (1991). Lexical
Caplan, D., & Hildebrandt, N. (1988). Disorders of organization of nouns and verbs in the brain. Nature,
syntactic comprehension. Cambridge, MA: Bradford 349, 788–790.
Books. Caramazza, A., Hillis, A. E., Rapp, B. C., &
Caplan, D., & Waters, G. S. (1995a). Aphasic Romani, C. (1990). The multiple semantics
disorders of syntactic comprehension and working hypothesis: Multiple confusions? Cognitive
memory capacity. Cognitive Neuropsychology, 12, Neuropsychology, 7, 161–189.
637–649. Caramazza, A., Miceli, G., & Villa, G. (1986). The
Caplan, D., & Waters, G. S. (1995b). On the nature role of the (output) phonological buffer in reading,
of the phonological output planning processes writing, and repetition. Cognitive Neuropsychology, 3,
involved in verbal rehearsal: Evidence from aphasia. 37–76.
Brain and Language, 48, 191–220. Caramazza, A., & Miozzo, M. (1997). The relation
Caplan, D., & Waters, G. S. (1996). Syntactic between syntactic and phonological knowledge in
processing in sentence comprehension under dual- lexical access: Evidence from the “tip-of-the-tongue”
task conditions in aphasic patients. Language and phenomenon. Cognition, 64, 309–343.
Cognitive Processes, 11, 525–551. Caramazza, A., & Miozzo, M. (1998). More is not
Caplan, D., & Waters, G. S. (1999). Verbal working always better: A response to Roelofs, Meyer, and
memory and sentence comprehension. Behavioral and Levelt. Cognition, 69, 231–241.
Brain Sciences, 22, 77–126. Caramazza, A., Papagno, C., & Ruml, W. (2000).
Caplan, D., & Waters, G. S. (2002). Working The selective impairment of phonological processing
memory and connectionist models of parsing: Reply to in speech production. Brain and Language, 75,
MacDonald and Christiansen. Psychological Review, 428–450.
109, 66–74. Caramazza, A., & Shelton, J. R. (1998). Domain-
Caramazza, A. (1986). On drawing inferences about specific knowledge systems in the brain: The
the structure of normal cognitive systems from the animate–inanimate distinction. Journal of Cognitive
analysis of patterns of impaired performance. Brain Neuroscience, 10, 1–34.
and Cognition, 5, 41–66. Caramazza, A., & Zurif, E. B. (1976). Dissociation
Caramazza, A. (1991). Data, statistics, and theory: of algorithmic and heuristic processes in language
A comment on Bates, McDonald, MacWhinney, and comprehension: Evidence from aphasia. Brain and
Applebaum’s “A maximum likelihood procedure for Language, 3, 572–582.
the analysis of group and individual data in aphasia Carey, S. (1985). Conceptual change in childhood.
research.” Brain and Language, 41, 43–51. Cambridge, MA: MIT Press.
Caramazza, A. (1997). How many levels of Carmichael, L., Hogan, H. P., & Walter, A. A.
processing are there in lexical access? Cognitive (1932). An experimental study of the effect of
Neuropsychology, 14, 177–208. language on the reproduction of visually presented
Caramazza, A., & Berndt, R. S. (1978). Semantic forms. Journal of Experimental Psychology, 15,
and syntactic processes in aphasia: A review of the 73–86.
literature. Psychological Bulletin, 85, 898–918. Carpenter, P. A., & Just, M. A. (1977). Reading
Caramazza, A., Berndt, R. S., & Basili, A. G. comprehension as eyes see it. In M. A. Just &
(1983). The selective impairment of phonological P. A. Carpenter (Eds.), Cognitive processes in
processing: A case study. Brain and Language, 18, comprehension (pp. 109–140). Hillsdale, NJ:
128–174. Lawrence Erlbaum Associates, Inc.
Caramazza, A., Bi, Y., Costa, A., & Miozzo, M. (2004). Carr, T. H., McCauley, C., Sperber, R. D., &
What determines the speed of lexical access: Homophone Parmalee, C. M. (1982). Words, pictures and priming:
or specific-word frequency? A reply to Jescheniak et al. On semantic activation, conscious identification and
(2003). Journal of Experimental Psychology: Learning, the automaticity of information processing. Journal
Memory, and Cognition, 30, 278–282. of Experimental Psychology: Human Perception and
Caramazza, A., Chialant, D., Capasso, R., & Miceli, G. Performance, 8, 757–777.
(2000). Separable processing of consonants and vowels. Carr, T. H., & Pollatsek, A. (1985). Recognizing
Nature, 403, 428–430. printed words: A look at current models. In D. Besner,
Caramazza, A., Costa, A., Miozzo, M., & Bi, Y. T. J. Waller, & C. E. MacKinnon (Eds.), Reading
(2001). The specific-word frequency effect: research: Advances in theory and practice (Vol. 5, pp.
Implications for the representation of homophones. 1–82). New York: Academic Press.
Journal of Experimental Psychology: Learning, Carroll, J. B. (1981). Twenty-five years of research
Memory, and Cognition, 27, 1430–1450. on foreign language aptitude. In K. C. Diller (Ed.),
Individual differences and universals in language Chalmers, A. F. (1999). What is this thing called
learning aptitude (pp. 83–118). Rowley, MA: science? (3rd ed.). Milton Keynes, UK: Open
Newbury House. University Press.
Carroll, J. B., & Casagrande, J. B. (1958). The Chambers Twentieth Century Dictionary. (1998).
function of language classifications in behavior. In Edinburgh: Chambers Harrap.
E. E. Maccoby, T. M. Newcomb, & E. L. Hartley Chambers, C. G., Tanenhaus, M. K., &
(Eds.), Readings in social psychology (3rd ed., pp. Magnuson, J. S. (2004). Actions and affordances
18–31). New York: Holt, Rinehart & Winston. in syntactic ambiguity resolution. Journal of
Carroll, J. B., & White, M. N. (1973a). Word Experimental Psychology: Learning, Memory, and
frequency and age-of-acquisition as determiners Cognition, 30, 687–696.
of picture-naming latency. Quarterly Journal of Chan, A. S., Butters, N., Paulsen, J. S., Salmon, D. P.,
Experimental Psychology, 25, 85–95. Swenson, M. R., & Maloney, L. T. (1993a). An
Carroll, J. B., & White, M. N. (1973b). Age-of- assessment of the semantic network in patients with
acquisition norms for 220 picturable nouns. Journal Alzheimer’s disease. Journal of Cognitive Neuroscience,
of Verbal Learning and Verbal Behavior, 12, 5, 254–261.
563–576. Chan, A. S., Butters, N., Salmon, D. P., &
Carruthers, P. (2002). The cognitive functions McGuire, K. A. (1993b). Dimensionality and clustering
of language. Behavioral and Brain Sciences, 25, in the semantic network of patients with Alzheimer’s
657–726. disease. Psychology and Aging, 8, 411–419.
Carston, R. (1987). Review of Gavagai! or the future Chang, F., Bock, K., & Goldberg, A. E. (2003). Can
history of the animal language controversy, by David thematic roles leave traces of their places? Cognition,
Premack. Mind and Language, 2, 332–349. 90, 29–49.
Carver, R. P. (1972). Speed readers don’t read: They Chang, F., Dell, G. S., & Bock, K. (2006).
skim. Psychology Today, 22–30. Becoming syntactic. Psychological Review, 113,
Casey, B. J., Thomas, K. M., & McCandliss, B. 234–272.
(2001). Applications of magnetic resonance imaging Chang, T. M. (1986). Semantic memory: Facts and
to the study of development. In C. A. Nelson & M. models. Psychological Bulletin, 99, 199–220.
Luciana (Eds.), Handbook of developmental cognitive Chao, L. L., Haxby, J. V., & Martin, A. (1999).
neuroscience (pp. 137–147). Cambridge, MA: MIT Attribute-based neural substrates in temporal cortex
Press. for perceiving and knowing about objects. Nature
Castles, A., & Coltheart, M. (1993). Varieties of Neuroscience, 2, 913–919.
developmental dyslexia. Cognition, 47, 149–180. Chapman, R. S., & Thomson, J. (1980). What is
Castles, A., & Coltheart, M. (2004). Is there a the source of overextension errors in comprehension
causal link from phonological awareness to success in testing of two-year-olds? A reply to Fremgen and Fay.
learning to read? Cognition, 91, 77–111. Journal of Child Language, 7, 575–578.
Castles, A., Datta, H., Gayan, J., & Olson, R. K. Chater, N., & Manning, C. D. (2006). Probabilistic
(1999). Varieties of developmental reading disorder: models of language processing and acquisition. Trends
Genetic and environmental influences. Journal of in Cognitive Sciences, 10, 335–344.
Experimental Child Psychology, 72, 73–94. Chen, H.-C., & Ng, M.-L. (1989). Semantic
Cattell, J. M. (1947). On the time required for facilitation and translation priming effects in
recognizing and naming letters and words, pictures Chinese–English bilinguals. Memory and Cognition,
and colors. In James McKeen Cattell, Man of science 17, 454–462.
(Vol. 1, pp. 13–25). Lancaster, PA: Science Press. Chertkow, H., Bub, D., & Caplan, D. (1992).
[Originally published 1888.] Constraining theories of semantic memory processing:
Caudill, M., & Butler, C. (1992). Understanding Evidence from dementia. Cognitive Neuropsychology,
neural networks: Computer explorations (Vols. 1 & 2). 9, 327–365.
Cambridge, MA: MIT Press. Cholin, J., Levelt, W. J. M., & Schiller, N. O. (2006).
Cazden, C. B. (1968). The acquisition of noun and Effects of syllable frequency in speech production.
verb inflections. Child Development, 39, 433–448. Cognition, 99, 205–235.
Cazden, C. B. (1972). Child language and education. Cholin, J., Schiller, N. O., & Levelt, W. J. M.
New York: Holt, Rinehart & Winston. (2004). The preparation of syllables in speech
Chafe, W. L. (1985). Linguistic differences produced production. Journal of Memory and Language, 50,
by differences between speaking and writing. In 47–61.
D. R. Olson, N. Torrance, & A. Hildyard (Eds.), Chomsky, N. (1957). Syntactic structures. The Hague:
Literacy, language and learning: The nature and Mouton.
consequences of reading and writing (pp. 105–123). Chomsky, N. (1959). Review of “Verbal behavior” by
Cambridge: Cambridge University Press. B. F. Skinner. Language, 35, 26–58.
Chomsky, N. (1965). Aspects of the theory of syntax. Clark, E. V. (1973). What’s in a word? On the child’s
Cambridge, MA: MIT Press. acquisition of semantics in his first language. In
Chomsky, N. (1968). Language and mind. New York: T. E. Moore (Ed.), Cognitive development and the
Harcourt Brace. acquisition of language (pp. 65–110). New York:
Chomsky, N. (1975). Reflections on language. New Academic Press.
York: Pantheon. Clark, E. V. (1987). The principle of contrast: A
Chomsky, N. (1981). Lectures on government and constraint on language acquisition. In B. MacWhinney
binding. Dordrecht: Foris. (Ed.), Mechanisms of language acquisition (pp. 1–33).
Chomsky, N. (1986). Knowledge of language. New Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
York: Praeger Special Studies. Clark, E. V. (1993). The lexicon in acquisition.
Chomsky, N. (1988). Language and problems of Cambridge: Cambridge University Press.
knowledge: The Managua lectures. Cambridge, MA: Clark, E. V. (1995). Later lexical development and
MIT Press. word formation. In P. Fletcher & B. MacWhinney
Chomsky, N. (1991). Linguistics and cognitive (Eds.), The handbook of child language (pp. 393–412).
science: Problems and mysteries. In A. Kasher (Ed.), Oxford: Blackwell.
The Chomskyan turn (pp. 26–53). Oxford: Blackwell. Clark, E. V., & Hecht, B. F. (1983). Comprehension
Chomsky, N. (1995). Bare phrase structure. In G. and production. Annual Review of Psychology, 34,
Webelhuth (Ed.), Government and binding theory and 325–349.
the minimalist programme (pp. 383–400). Oxford: Clark, H. H. (1977a). Bridging. In P. N. Johnson-
Blackwell. Laird & P. C. Wason (Eds.), Thinking: Readings
Christiansen, J. A. (1995). Coherence violations and in cognitive science (pp. 411–420). Cambridge:
propositional usage in the narratives of fluent aphasics. Cambridge University Press.
Brain and Language, 51, 291–317. Clark, H. H. (1977b). Inferences in comprehension.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. In D. LaBerge & S. J. Samuels (Eds.), Basic processes
(1998). Learning to segment speech using multiple in reading: Perception and comprehension (pp. 243–
cues: A connectionist model. Language and Cognitive 263). Hillsdale, NJ: Lawrence Erlbaum Associates,
Processes, 13, 221–268. Inc.
Christiansen, M. H., & Chater, N. (2008). Language Clark, H. H. (1994). Discourse in production. In
as shaped by the brain. Behavioral and Brain Sciences, M. A. Gernsbacher (Ed.), Handbook of psycholinguistics
31, 489–509. (pp. 985–1022). San Diego, CA: Academic Press.
Christiansen, M. H., & Curtin, S. (1999). Transfer Clark, H. H. (1996). Using language. Cambridge:
of learning: Rule acquisition or statistical learning? Cambridge University Press.
Trends in Cognitive Sciences, 3, 289–290. Clark, H. H., & Carlson, T. (1982). Speech
Christiansen, M. H., & Kirby, S. (Eds.). (2003). acts and hearers’ beliefs. In N. V. Smith (Ed.),
Language evolution. Oxford: Oxford University Press. Mutual knowledge (pp. 1–59). London: Academic
Christianson, K., Hollingworth, A., Halliwell, J. F., Press.
& Ferreira, F. (2001). Thematic roles assigned along Clark, H. H., & Clark, E. V. (1977). Psychology and
the garden path linger. Cognitive Psychology, 42, language: An introduction to psycholinguistics. New
368–407. York: Harcourt Brace Jovanovich.
Chumbley, J. I., & Balota, D. A. (1984). A word’s Clark, H. H., & Fox Tree, J. E. (2002). Using uh and
meaning affects the decision in lexical decision. um in spontaneous speaking. Cognition, 84, 73–111.
Memory and Cognition, 12, 590–606. Clark, H. H., & Haviland, S. E. (1977).
Cipolotti, L., & Warrington, E. K. (1995). Semantic Comprehension and the given-new contract. In
memory and reading abilities: A case report. Journal of R. O. Freedle (Ed.), Discourse production and
the International Neuropsychological Society, 1, 104–110. comprehension (pp. 1–40). Norwood, NJ: Ablex.
Cirilo, R. K., & Foss, D. J. (1980). Text structure and Clark, H. H., & Lucy, P. (1975). Understanding what
reading time for sentences. Journal of Verbal Learning is meant from what is said: A study in conversationally
and Verbal Behavior, 19, 96–109. conveyed requests. Journal of Verbal Learning and
Clahsen, H. (1992). Learnability theory and the Verbal Behavior, 14, 56–72.
problem of development in language acquisition. In Clark, H. H., & Wasow, T. (1998). Repeating words in
J. Weissenborn, H. Goodluck, & T. Roeper (Eds.), spontaneous speech. Cognitive Psychology, 37, 201–242.
Theoretical issues in language acquisition (pp. 53–76). Clark, H. H., & Wilkes-Gibbs, D. (1986). Referring
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. as a collaborative process. Cognition, 22, 1–39.
Clahsen, H. (1999). Lexical entries and rules of Clarke, R., & Morton, J. (1983). Cross modality
language: A multidisciplinary study of German facilitation in tachistoscopic word recognition.
inflection. Behavioral and Brain Sciences, 22, Quarterly Journal of Experimental Psychology, 35A,
991–1060. 79–96.
Clarke-Stewart, K., Vanderstoep, L., & Killian, G. Coltheart, M. (1985). Cognitive neuropsychology and
(1979). Analysis and replication of mother–child the study of reading. In M. I. Posner & O. S. M. Marin
relations at 2 years of age. Child Development, 50, (Eds.), Attention and performance XI (pp. 3–37).
777–793. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Cleland, A. A., & Pickering, M. J. (2003). The Coltheart, M. (1987). Varieties of developmental
use of lexical and syntactic information in language dyslexia: A comment on Bryant and Impey. Cognition,
production: Evidence from the priming of noun-phrase 27, 97–101.
structure. Journal of Memory and Language, 49, Coltheart, M. (1996). Phonological dyslexia: Past and
214–230. future issues. Cognitive Neuropsychology, 13, 749–762.
Clifton, C., & Ferreira, F. (1987). Discourse Coltheart, M. (2004). Are there lexicons? Quarterly
structure and anaphora: Some experimental results. In Journal of Experimental Psychology, 57A, 1153–1171.
M. Coltheart (Ed.), Attention and performance XII: Coltheart, M., Curtis, B., Atkins, P., & Haller, M.
The psychology of reading (pp. 635–654). Hove, UK: (1993). Models of reading aloud: Dual-route
Lawrence Erlbaum Associates. and parallel-distributed-processing approaches.
Clifton, C., & Ferreira, F. (1989). Ambiguity in Psychological Review, 100, 589–608.
context. Language and Cognitive Processes, 4, Coltheart, M., Davelaar, E., Jonasson, J. T., &
77–103. Besner, D. (1977). Access to the internal lexicon. In
Cogswell, D., & Gordon, P. (1996). Chomsky for S. Dornic (Ed.), Attention and performance VI (pp.
beginners. London: Readers & Writers Ltd. 535–555). London: Academic Press.
Cohen, G. (1979). Language comprehension in old Coltheart, M., Inglis, L., Cupples, L., Michie, P.,
age. Cognitive Psychology, 11, 412–429. Bates, A., & Budd, B. (1998). A semantic subsystem
Cohen, L., & Dehaene, S. (2004). Specialization of visual attributes. Neurocase, 4, 353–370.
within the ventral stream: The case for the visual word Coltheart, M., Patterson, K. E., & Marshall, J. C.
form area. NeuroImage, 22, 466–476. (Eds.). (1987). Deep dyslexia (2nd ed.). London:
Colby, K. M. (1975). Artificial paranoia. New York: Routledge & Kegan Paul. [1st ed., 1980.]
Pergamon Press. Coltheart, M., & Rastle, K. (1994). Serial processing
Cole, R. A. (1973). Listening for mispronunciations: in reading aloud: Evidence for dual-route models of
A measure of what we hear during speech. Perception reading. Journal of Experimental Psychology: Human
and Psychophysics, 13, 153–156. Perception and Performance, 20, 1197–1211.
Cole, R. A., & Jakimik, J. (1980). A model of speech Coltheart, M., Rastle, K., Perry, C., Langdon, R.,
perception. In R. A. Cole (Ed.), Perception and & Ziegler, J. (2001). DRC: A dual route cascaded
production of fluent speech (pp. 133–163). Hillsdale, model of visual word recognition and reading aloud.
NJ: Lawrence Erlbaum Associates, Inc. Psychological Review, 108, 204–256.
Coleman, L., & Kay, P. (1981). Prototype semantics. Coltheart, V., & Leahy, J. (1992). Children’s and
Language, 57, 26–44. adults’ reading of nonwords: Effects of regularity and
Collins, A., & Gentner, D. (1980). A framework for consistency. Journal of Experimental Psychology:
a cognitive theory of writing. In L. W. Gregg & E. R. Learning, Memory, and Cognition, 18, 718–729.
Steinberg (Eds.), Cognitive processes in writing (pp. Conboy, B. T., & Mills, D. L. (2006). Two languages,
51–72). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. one developing brain: Event-related potentials to
Collins, A. M., & Loftus, E. F. (1975). A spreading- words in bilingual toddlers. Developmental Science, 9,
activation theory of semantic processing. F1–F12.
Psychological Review, 82, 407–428. Connine, C. M. (1990). Effects of sentence context
Collins, A. M., & Quillian, M. R. (1969). Retrieval and lexical knowledge in speech processing. In
time from semantic memory. Journal of Verbal G. T. M. Altmann (Ed.), Cognitive models of speech
Learning and Verbal Behavior, 8, 240–247. processing (pp. 281–294). Cambridge, MA: MIT Press.
Colombo, L. (1986). Activation and inhibition Connine, C. M., & Clifton, C. (1987). Interactive use
with orthographically similar words. Journal of of lexical information in speech perception. Journal
Experimental Psychology: Human Perception and of Experimental Psychology: Human Perception and
Performance, 12, 226–234. Performance, 13, 291–319.
Coltheart, M. (1980). Deep dyslexia: A right Conrad, C. (1972). Cognitive economy in semantic
hemisphere hypothesis. In M. Coltheart, K. E. memory. Journal of Experimental Psychology, 92,
Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 149–154.
326–380). London: Routledge & Kegan Paul. [2nd ed., Conrad, R. (1979). The deaf school child: Language
1987.] and cognitive function. London: Harper & Row.
Coltheart, M. (1981). Disorders of reading and their Conrad, R., & Rush, M. L. (1965). On the nature of
implications for models of normal reading. Visible short-term memory encoding by the deaf. Journal of
Language, 15, 245–286. Speech and Hearing Disorders, 30, 336–343.
Cook, V. (1997). The consequences of bilingualism morphemes: Evidence from Croatian. Journal of
for cognitive processing. In A. M. B. de Groot Experimental Psychology: Learning, Memory, and
& J. F. Kroll (Eds.), Tutorials in bilingualism: Cognition, 29, 1270–1282.
Psycholinguistic perspectives (pp. 279–299). Mahwah, Costa, A., Miozzo, M., & Caramazza, A. (1999).
NJ: Lawrence Erlbaum Associates, Inc. Lexical selection in bilinguals: Do words in the
Cook, V. J., & Newson, M. (2007). Chomsky’s bilingual’s two lexicons compete for selection?
universal grammar: An introduction (3rd ed.). Oxford: Journal of Memory and Language, 41, 365–397.
Blackwell. Costa, A., & Sebastian-Gallés, N. (1998). Abstract
Corballis, M. C. (1992). On the evolution of language phonological structure in language production:
and generativity. Cognition, 44, 197–226. Evidence from Spanish. Journal of Experimental
Corballis, M. C. (2003). From mouth to hand: Psychology: Learning, Memory, and Cognition, 24,
Gesture, speech, and the evolution of right- 886–903.
handedness. Behavioral and Brain Sciences, 26, Cottingham, J. (1984). Rationalism. London: Paladin.
199–260. Crain, S., & Steedman, M. J. (1985). On not being
Corballis, M. C. (2004). On the origins of modernity: led up the garden path: The use of context by the
Was autonomous speech the critical factor? psychological parser. In D. Dowty, L. Karttunen, &
Psychological Review, 111, 543–552. A. Zwicky (Eds.), Natural language parsing
Corbett, A. T., & Chang, F. (1983). Pronoun (pp. 320–358). Cambridge: Cambridge University Press.
disambiguation: Accessing potential antecedents. Cree, G. S., McNorgan, C., & McRae, K. (2006).
Memory and Cognition, 11, 383–394. Distinctive features hold a privileged status in the
Corbett, A. T., & Dosher, B. A. (1978). Instrument computation of word meaning: Implications for
inferences in sentence encoding. Journal of Verbal theories of semantic memory. Journal of Experimental
Learning and Verbal Behavior, 17, 479–492. Psychology: Learning, Memory, and Cognition, 32,
Corina, D. P., Jose-Robertson, L., Guillermin, A., 643–658.
High, J., & Braun, A. R. (2003). Language Crocker, M. W. (1999). Mechanisms for sentence
lateralization in a bimanual language. Journal of processing. In S. Garrod & M. J. Pickering (Eds.),
Cognitive Neuroscience, 15, 718–730. Language processing (pp. 191–232). Hove, UK:
Corley, M., Brocklehurst, P. H., & Moat, H. S. Psychology Press.
(2011). Error biases in inner and overt speech: Cromer, R. F. (1991). Language and thought in
Evidence from tongue twisters. Journal of normal and handicapped children. Oxford: Blackwell.
Experimental Psychology: Learning, Memory, and Croot, K., Patterson, K. E., & Hodges, J. R. (1999).
Cognition, 37, 162–175. Familial progressive aphasia: Insights into the nature
Corrigan, R. (1978). Language development as and deterioration of single word processing. Cognitive
related to stage 6 object permanence development. Neuropsychology, 16, 705–747.
Journal of Child Language, 5, 173–189. Cross, T. G. (1977). Mothers’ speech adjustments:
Coslett, H. B. (1991). Read but not write “idea”: The contribution of selected child listener variables.
Evidence for a third reading mechanism. Brain and In C. E. Snow & C. A. Ferguson (Eds.), Talking to
Language, 40, 425–443. children: Language input and acquisition (pp. 151–
Coslett, H. B., Roeltgen, D. P., Rothi, L. G., & 188). Cambridge: Cambridge University Press.
Heilman, K. M. (1987). Transcortical sensory Cross, T. G. (1978). Mother’s speech and its
aphasia: Evidence for subtypes. Brain and Language, association with rate of linguistic development in
32, 362–378. young children. In N. Waterson & C. E. Snow (Eds.),
Coslett, H. B., & Saffran, E. M. (1989). Preserved The development of communication (pp. 199–216).
object recognition and reading comprehension in optic Chichester, UK: Wiley.
aphasia. Brain, 112, 1091–1100. Cross, T. G., Johnson-Morris, J. E., & Nienhuys, T. G.
Costa, A., & Caramazza, A. (2002). The production of (1980). Linguistic feedback and maternal speech:
noun phrases in English and Spanish: Implications for Comparisons of mothers addressing hearing and
the scope of phonological encoding in speech production. hearing-impaired children. First Language, 1,
Journal of Memory and Language, 46, 178–198. 163–189.
Costa, A., Caramazza, A., & Sebastian-Galles, N. Crosson, B., Moberg, P. J., Boone, J. R., Rothi, L. J. G.,
(2000). The cognate facilitation effect: Implications & Raymer, A. (1997). Category-specific naming deficit
for models of lexical access. Journal of Experimental for medical terms after dominant thalamic/capsular
Psychology: Learning, Memory, and Cognition, 26, hemorrhage. Brain and Language, 60, 407–442.
1283–1296. Crystal, D. (1986). Prosodic development. In P.
Costa, A., Kovacic, D., Fedorenko, E., & Fletcher & M. Garman (Eds.), Language acquisition
Caramazza, A. (2003). The gender-congruency (2nd ed., pp. 174–197). Cambridge: Cambridge
effect and the selection of freestanding and bound University Press.
Crystal, D. (1998). Language play. Harmondsworth, UK: Penguin Books.
Crystal, D. (2010). The Cambridge encyclopedia of language (3rd ed.). Cambridge: Cambridge University Press.
Cuetos, F., Aguado, G., & Caramazza, A. (2000). Dissociation of semantic and phonological errors in naming. Brain and Language, 75, 451–460.
Cuetos, F., & Mitchell, D. C. (1988). Cross-linguistic differences in parsing: Restrictions on the use of the late closure strategy in Spanish. Cognition, 30, 73–105.
Cummins, J. (1991). Interdependence of first- and second-language proficiency in bilingual children. In E. Bialystok (Ed.), Language processing in bilingual children (pp. 70–89). Cambridge: Cambridge University Press.
Curtin, S., Mintz, T. H., & Christiansen, M. H. (2005). Stress changes the representational landscape: Evidence from word segmentation. Cognition, 96, 233–262.
Curtiss, S. (1977). Genie: A psycholinguistic study of a modern-day “wild child.” London: Academic Press.
Curtiss, S. (1989). The independence and task-specificity of language. In M. H. Bornstein & J. Bruner (Eds.), Interaction in human development (pp. 105–137). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Cutler, A. (1981). Making up materials is a confounded nuisance, or: Will we be able to run any psycholinguistic experiments at all in 1990? Cognition, 10, 65–70.
Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218–236.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1986). The syllable’s differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385–400.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1987). Phoneme identification and the lexicon. Cognitive Psychology, 19, 141–177.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1992). The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology, 24, 381–410.
Cutler, A., & Norris, D. (1979). Monitoring sentence comprehension. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 113–134). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Cutler, A., & Norris, D. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113–121.
Cutsford, T. D. (1951). The blind in school and society. New York: American Foundation for the Blind.
Cutting, J. C., & Ferreira, V. (1999). Semantic and phonological information flow in the production lexicon. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 318–344.
Czerniewska, P. (1992). Learning about writing. Oxford: Blackwell.
D’Andrade, R. G., & Wish, M. (1985). Speech act theory in quantitative research on interpersonal behavior. Discourse Processes, 8, 229–259.
Dagenbach, D., Carr, T. H., & Wilhelmsen, A. (1989). Task-induced strategies and near-threshold priming: Conscious influences on unconscious perception. Journal of Memory and Language, 28, 412–443.
Dahan, D., Magnuson, J. S., & Tanenhaus, M. K. (2001). Time course of frequency effects in spoken-word recognition: Evidence from eye movements. Cognitive Psychology, 42, 317–367.
Dahl, H. (1979). Word frequencies of spoken American English. Essex, CT: Verbatim.
Dale, P. S. (1976). Language development: Structure and function (2nd ed.). New York: Holt, Rinehart & Winston.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.
Daneman, M., Reingold, E. M., & Davidson, M. (1995). Time course of phonological-activation during reading: Evidence from eye fixations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 884–898.
Davidoff, J., Davies, I., & Roberson, D. (1999a). Colour categories in a stone-age tribe. Nature, 398, 203–204.
Davidoff, J., Davies, I., & Roberson, D. (1999b). Addendum: Colour categories in a stone-age tribe. Nature, 402, 604.
Davies, I., Corbett, G., Laws, G., McGurk, H., Moss, A., & Smith, M. W. (1991). Linguistic basicness and colour information processing. International Journal of Psychology, 26, 311–327.
Davis, C. J., & Lupker, S. J. (2006). Masked inhibitory priming in English: Evidence for lexical inhibition. Journal of Experimental Psychology: Human Perception and Performance, 32, 668–687.
Davis, K. (1947). Final note on a case of extreme social isolation. American Journal of Sociology, 52, 432–437.
Dawson, M. (2005). Connectionism: A hands-on approach. Oxford: Blackwell.
de Boysson-Bardies, B., Halle, P., Sagart, L., & Durand, C. (1989). A cross-linguistic investigation of vowel formants in babbling. Journal of Child Language, 16, 1–17.
de Boysson-Bardies, B., Sagart, L., & Durand, C. (1984). Discernible differences in the babbling of infants according to target language. Journal of Child Language, 11, 1–15.
de Groot, A. M. B. (1983). The range of automatic spreading activation in word priming. Journal of Verbal Learning and Verbal Behavior, 22, 417–436.
de Groot, A. M. B. (1984). Primed lexical decision: Combined effects of the proportion of related prime–target pairs and the stimulus onset asynchrony of prime and target. Quarterly Journal of Experimental Psychology, 36A, 253–280.
de Groot, A. M. B., Dannenburg, L., & van Hell, J. G. (1994). Forward and backward translation by bilinguals. Journal of Memory and Language, 33, 600–629.
de Groot, A. M. B., & Kroll, J. F. (Eds.). (1997). Tutorials in bilingualism: Psycholinguistic perspectives. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
de Renzi, E., & Lucchelli, F. (1994). Are semantic systems separately represented in the brain? The case of living category impairment. Cortex, 30, 3–25.
de Villiers, J. G., & de Villiers, P. A. (2000). Linguistic determination and the understanding of false beliefs. In P. Mitchell & K. J. Riggs (Eds.), Children’s reasoning and the mind (pp. 191–228). Hove, UK: Psychology Press.
de Villiers, P. A., & de Villiers, J. G. (1979). Early language. London: Fontana/Open Books.
Deacon, T. (1997). The symbolic species. Harmondsworth, UK: Penguin Books.
Dean, M. P., & Young, A. W. (1996). An item-specific locus of repetition priming. Quarterly Journal of Experimental Psychology, 49A, 269–294.
DeCasper, A. J., & Fifer, W. P. (1980). Of human bonding: Newborns prefer their mothers’ voices. Science, 208, 1174–1176.
DeCasper, A. J., Lecanuet, J. P., Maugais, R., Granier-Deferre, C., & Busnel, M. C. (1994). Fetal reactions to recurrent maternal speech. Infant Behavior and Development, 17, 159–164.
DeCasper, A. J., & Spence, M. J. (1986). Prenatal maternal speech influences newborns’ perception of speech sounds. Infant Behavior and Development, 9, 133–150.
Dell, G. S. (1985). Positive feedback in hierarchical connectionist models: Applications to language production. Cognitive Science, 9, 3–23.
Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93, 283–321.
Dell, G. S. (1988). The retrieval of phonological forms in production: Tests of predictions from a connectionist model. Journal of Memory and Language, 27, 124–142.
Dell, G. S., Burger, L. K., & Svec, W. R. (1997). Language production and serial order: A functional analysis and a model. Psychological Review, 104, 123–147.
Dell, G. S., Juliano, C., & Govindjee, A. (1993). Structure and content in language production: A theory of frame constraints in phonological speech errors. Cognitive Science, 17, 149–195.
Dell, G. S., Martin, N., & Schwartz, M. F. (2007). A case-series test of the interactive two-step model of lexical access: Predicting word repetition from picture naming. Journal of Memory and Language, 56, 490–520.
Dell, G. S., & O’Seaghdha, P. G. (1991). Mediated and convergent lexical priming in language production: A comment on Levelt et al. (1991). Psychological Review, 98, 604–614.
Dell, G. S., & Reich, P. A. (1981). Stages in sentence production: An analysis of speech error data. Journal of Verbal Learning and Verbal Behavior, 20, 611–629.
Dell, G. S., Schwartz, M. F., Martin, N., Saffran, E. M., & Gagnon, D. A. (1997). Lexical access in aphasic and nonaphasic speakers. Psychological Review, 104, 801–838.
Dell, G. S., Schwartz, M. F., Martin, N., Saffran, E. M., & Gagnon, D. A. (2000). The role of computational models in the cognitive neuropsychology of language: A reply to Ruml and Caramazza. Psychological Review, 107, 635–645.
DeLong, K. A., Urbach, T. P., & Kutas, M. (2005). Probabilistic word pre-activation during language comprehension inferred from electrical brain activity. Nature Neuroscience, 8, 1117–1121.
Demers, R. A. (1988). Linguistics and animal communication. In F. J. Newmeyer (Ed.), Linguistics: The Cambridge survey: Vol. 3. Language: Psychological and biological aspects (pp. 314–335). Cambridge: Cambridge University Press.
Demetras, M. J., Post, K. N., & Snow, C. E. (1986). Feedback to first language learners: The role of repetitions and clarification questions. Journal of Child Language, 13, 275–292.
Den Heyer, K. (1985). On the nature of the proportion effect in semantic priming. Acta Psychologica, 60, 25–38.
Den Heyer, K., Briand, K., & Dannenbring, G. L. (1983). Strategic factors in a lexical decision task: Evidence for automatic and attention driven processes. Memory and Cognition, 10, 358–370.
Dennett, D. C. (1991). Consciousness explained. Harmondsworth, UK: Penguin.
Dennis, M., & Whitaker, H. A. (1976). Language acquisition following hemidecortication: Linguistic superiority of the left over the right hemisphere. Brain and Language, 3, 404–433.
Dennis, M., & Whitaker, H. A. (1977). Hemispheric equipotentiality and language acquisition. In S. J. Segalowitz & F. A. Gruber (Eds.), Language development and neurological theory (pp. 93–106). New York: Academic Press.
Derouesné, J., & Beauvois, M.-F. (1979). Phonological processing in reading: Data from dyslexia. Journal of Neurology, Neurosurgery and Psychiatry, 42, 1125–1132.
Derouesné, J., & Beauvois, M.-F. (1985). The “phonemic” stage in the non-lexical reading process: Evidence from a case of phonological alexia. In K. Patterson, M. Coltheart, & J. C. Marshall (Eds.), Surface dyslexia (pp. 399–457). Hove, UK: Lawrence Erlbaum Associates.
Devlin, J. T., Gonnerman, L. M., Andersen, E. S., & Seidenberg, M. S. (1998). Category specific semantic deficits in focal and widespread brain damage: A computational account. Journal of Cognitive Neuroscience, 10, 77–94.
Dhooge, E., & Hartsuiker, R. J. (2012). Lexical selection and verbal self-monitoring: Effects of lexicality, context, and time pressure in picture-word interference. Journal of Memory and Language, 66, 163–176.
Dick, F., Bates, E., Wulfeck, B., Utman, J. A., Dronkers, N., & Gernsbacher, M. A. (2001). Language deficits, localization, and grammar: Evidence for a distributive model of language breakdown in aphasic patients and neurologically intact individuals. Psychological Review, 108, 759–788.
Diesfeldt, H. F. A. (1989). Semantic impairment in senile dementia of the Alzheimer type. Aphasiology, 3, 41–54.
Dijkstra, A., & van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5, 175–197.
Dijkstra, T., van Heuven, W. J. B., & Grainger, J. (1998). Simulating cross-language competition with the bilingual interactive activation model. Psychologica Belgica, 38, 177–196.
Dionne, G., Dale, P. S., Boivin, M., & Plomin, R. (2003). Genetic evidence for bidirectional effects of early lexical and grammatical development. Child Development, 74, 394–412.
Dockrell, J., & Messer, D. J. (1999). Children’s language and communication difficulties: Understanding, identification, and intervention. London: Cassell.
Dogil, G., Haider, H., Schaner-Wolles, C., & Husman, R. (1995). Radical autonomy of syntax: Evidence from transcortical sensory aphasia. Aphasiology, 9, 577–602.
Dooling, D. J., & Christiaansen, R. E. (1977). Episodic and semantic aspects of memory for prose. Journal of Experimental Psychology: Human Learning and Memory, 3, 428–436.
Dooling, D. J., & Lachman, R. (1971). Effects of comprehension on retention of prose. Journal of Experimental Psychology, 88, 216–222.
Dopkins, S., Morris, R. K., & Rayner, K. (1992). Lexical ambiguity and eye fixations in reading: A test of competing models of lexical ambiguity resolution. Journal of Memory and Language, 31, 461–476.
Dörnyei, Z. (1990). Conceptualizing motivation in foreign language learning. Language Learning, 40, 45–78.
Doughty, C. J., & Long, M. H. (Eds.). (2005). The handbook of second language acquisition. Oxford: Blackwell.
Downing, P. (1977). On the creation and use of English compound nouns. Language, 53, 810–842.
Doyle, J. R., & Leach, C. (1988). Word superiority in signal detection: Barely a glimpse, yet reading nonetheless. Cognitive Psychology, 20, 283–318.
Dronkers, N. F., Wilkins, D. P., van Valin, R. D., Redfern, B. B., & Jaeger, J. J. (2004). Lesion analysis of the brain areas involved in language comprehension. Cognition, 95, 145–177.
Druks, J., & Froud, K. (2002). The syntax of single words: Evidence from a patient with a selective function word reading deficit. Cognitive Neuropsychology, 19, 207–244.
Duffy, S. A., Morris, R. K., & Rayner, K. (1988). Lexical ambiguity and fixation times in reading. Journal of Memory and Language, 27, 429–446.
Duncan, L. G., Seymour, P. H. K., & Hill, S. (1997). How important are rhyme and analogy in beginning reading? Cognition, 63, 171–208.
Duncan, L. G., Seymour, P. H. K., & Hill, S. (2000). A small-to-large unit progression in metaphonological awareness and reading? Quarterly Journal of Experimental Psychology, 53A, 1081–1104.
Duncan, S. E., & Niederehe, G. (1974). On signaling that it’s your turn to speak. Journal of Experimental Social Psychology, 10, 234–247.
Duncker, K. (1945). On problem-solving. Psychological Monographs, 58 (5, Whole No. 270).
Dunlea, A. (1984). The relation between concept formation and semantic roles: Some evidence from the blind. In L. Feagans, C. Garvery, & R. M. Golinkoff (Eds.), The origins and growth of communication (pp. 224–243). Norwood, NJ: Ablex.
Dunlea, A. (1989). Vision and the emergence of meaning: Blind and sighted children’s early language. Cambridge: Cambridge University Press.
Duran, N. D., Dale, R., & Kreuz, R. J. (2011). Listeners invest in an assumed other’s perspective despite cognitive cost. Cognition, 121, 22–40.
Durkin, K. (1987). Minds and language: Social cognition, social interaction and the acquisition of language. Mind and Language, 2, 105–140.
Durso, F. T., & Johnson, M. K. (1979). Facilitation in naming and categorizing repeated pictures and words. Journal of Experimental Psychology: Human Learning and Memory, 5, 449–459.
Duskova, L. (1969). On sources of errors in foreign language learning. International Review of Applied Linguistics, 7, 11–36.
Eberhard, K. M. (1999). The accessibility of conceptual number to the processes of subject–verb
agreement in English. Journal of Memory and Language, 30, 210–233.
Eberhard, K. M., Cutting, J. C., & Bock, J. K. (2005). Making syntax of sense: Number agreement in sentence production. Psychological Review, 112, 531–559.
Eckert, M. A., Lombardino, L. J., & Leonard, C. M. (2001). Planar asymmetry tips the phonological playground and environment raises the bar. Child Development, 72, 988–1002.
Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330.
Eglinton, E., & Annett, M. (1994). Handedness and dyslexia: A meta-analysis. Perceptual Motor Skills, 79, 1611–1616.
Ehri, L. C. (1992). Reconceptualizing the development of sight word reading and its relationship to recoding. In P. Gough, L. Ehri, & R. Treiman (Eds.), Reading acquisition (pp. 107–143). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Ehri, L. C. (1997a). Sight word learning in normal readers and dyslexics. In B. A. Blachman (Ed.), Foundations of reading acquisition and dyslexia: Implications for early intervention (pp. 163–189). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Ehri, L. C. (1997b). Learning to read and learning to spell are one and the same, almost. In C. A. Perfetti, L. Rieben, & M. Fayol (Eds.), Learning to spell: Research, theory, and practice across languages (pp. 237–269). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Ehri, L. C., Nunes, S. R., Stahl, S. A., & Willows, D. M. (2001). Systematic phonics instruction helps students learn to read: Evidence from the National Reading Panel’s meta-analysis. Review of Educational Research, 71, 393–447.
Ehri, L. C., & Robbins, C. (1992). Beginners need some decoding skill to read words by analogy. Reading Research Quarterly, 27, 13–26.
Ehri, L. C., & Ryan, E. B. (1980). Performance of bilinguals in a picture–word interference task. Journal of Psycholinguistic Research, 9, 285–302.
Ehri, L. C., & Wilce, L. S. (1985). Movement into reading: Is the first stage of printed word learning visual or phonetic? Reading Research Quarterly, 20, 163–179.
Ehrlich, K., & Johnson-Laird, P. N. (1982). Spatial descriptions and referential continuity. Journal of Verbal Learning and Verbal Behavior, 21, 296–306.
Eimas, P. D., & Corbit, L. (1973). Selective adaptation of linguistic feature detectors. Cognitive Psychology, 4, 99–109.
Eimas, P. D., & Miller, J. L. (1980). Contextual effects in infant speech perception. Science, 209, 1140–1141.
Eimas, P. D., Miller, J. L., & Jusczyk, P. W. (1987). On infant speech perception and the acquisition of language. In S. Harnad (Ed.), Categorical perception (pp. 161–195). New York: Cambridge University Press.
Eimas, P. D., Siqueland, E. R., Jusczyk, P. W., & Vigorito, J. (1971). Speech perception in infants. Science, 171, 303–306.
Elbers, L. (1985). A tip-of-the-tongue experience at age two? Journal of Child Language, 12, 353–365.
Elio, R., & Anderson, J. R. (1981). The effects of category generalizations and instance similarity on schema abstraction. Journal of Experimental Psychology: Human Learning and Memory, 7, 397–417.
Ellis, A. W. (1980). On the Freudian theory of speech errors. In V. A. Fromkin (Ed.), Errors in linguistic performance (pp. 123–132). New York: Academic Press.
Ellis, A. W. (1985). The production of spoken words: A cognitive neuropsychological perspective. In A. W. Ellis (Ed.), Progress in the psychology of language (Vol. 2, pp. 107–145). Hove, UK: Lawrence Erlbaum Associates.
Ellis, A. W. (1993). Reading, writing and dyslexia: A cognitive analysis (2nd ed.). Hove, UK: Lawrence Erlbaum Associates.
Ellis, A. W., & Lambon Ralph, M. A. (2000). Age of acquisition effects in adult lexical processing reflect loss of plasticity in maturing systems: Insights from connectionist networks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1103–1123.
Ellis, A. W., & Marshall, J. C. (1978). Semantic errors or statistical flukes: A note on Allport’s “On knowing the meaning of words we are unable to report.” Quarterly Journal of Experimental Psychology, 30, 569–575.
Ellis, A. W., Miller, D., & Sin, G. (1983). Wernicke’s aphasia and normal language processing: A case study in cognitive neuropsychology. Cognition, 15, 111–144.
Ellis, A. W., & Morrison, C. M. (1998). Real age-of-acquisition effects in lexical retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 515–523.
Ellis, A. W., & Young, A. W. (1988). Human cognitive neuropsychology. Hove, UK: Lawrence Erlbaum Associates. [Augmented edition with readings, 1996.]
Ellis, N. C., & Beaton, A. (1993). Factors affecting the learning of foreign language vocabulary: Imagery keyword mediators and phonological short-term memory. Quarterly Journal of Experimental Psychology, 46A, 533–558.
Ellis, N. C., & Hennelly, R. A. (1980). A bilingual word-length effect: Implications for intelligence testing and the relative ease of mental calculations in Welsh and English. British Journal of Psychology, 71, 43–52.
Ellis, R., & Humphreys, G. W. (1999). Connectionist psychology: A text with readings. Hove, UK: Psychology Press.
Ellis, R., & Wells, G. (1980). Enabling factors in adult–child discourse. First Language, 1, 46–62.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179–211.
Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71–99.
Elman, J. L. (1999). The emergence of language: A conspiracy theory. In B. MacWhinney (Ed.), The emergence of language (pp. 1–27). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: Bradford Books.
Elman, J. L., & McClelland, J. L. (1988). Cognitive penetration of the mechanisms of perception: Compensation for coarticulation of lexically restored phonemes. Journal of Memory and Language, 27, 143–165.
Elsness, J. (1984). That or zero? A look at the choice of object relative clause connective in a corpus of American English. English Studies, 65, 519–533.
Emmorey, K. (2001). Language, cognition, and the brain: Insights from sign language research. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Entus, A. K. (1977). Hemispheric asymmetry in processing of dichotically presented speech sounds. In S. J. Segalowitz & F. A. Gruber (Eds.), Language development and neurological theory (pp. 63–73). New York: Academic Press.
Eriksen, C. W., Pollack, M. D., & Montague, W. E. (1970). Implicit speech: Mechanisms in perceptual encoding? Journal of Experimental Psychology, 84, 502–507.
Ervin-Tripp, S. (1979). Children’s verbal turntaking. In E. Ochs & B. B. Schieffelin (Eds.), Developmental pragmatics (pp. 391–414). New York: Academic Press.
Estes, Z., & Jones, L. L. (2006). Priming via relational similarity: A COPPER HORSE is faster when seen through a GLASS EYE. Journal of Memory and Language, 55, 89–101.
Evans, N., & Levinson, S. C. (2009). The myth of language universals: Language diversity and its importance for cognitive science. Behavioral and Brain Sciences, 32, 429–448.
Evans, W. E., & Bastian, J. (1969). Marine mammal communication: Social and ecological factors. In H. T. Andersen (Ed.), The biology of marine mammals (pp. 425–475). New York: Academic Press.
Everett, C., & Madora, K. (2011). Quantity recognition among speakers of an anumeric language. Cognitive Science, 36, 130–141.
Everett, D. L. (2005). Cultural constraints on grammar and cognition in Piraha. Current Anthropology, 46, 521–646.
Eysenck, M. W., & Keane, M. T. (2010). Cognitive psychology: A student’s handbook (6th ed.). Hove, UK: Psychology Press.
Fabb, N. (1994). Sentence structure. London: Routledge & Kegan Paul.
Fabbro, F. (1999). The neurolinguistics of bilingualism: An introduction. Hove, UK: Psychology Press.
Fabbro, F. (2001). The bilingual brain: Cerebral representation of languages. Brain and Language, 79, 211–222.
Facoetti, A., Trussardi, A. N., Ruffino, M., Lorusso, M. L., Cattaneo, C., Galli, R., et al. (2010). Multisensory spatial attention deficits are predictive of phonological decoding skills in developmental dyslexia. Journal of Cognitive Neuroscience, 22, 1011–1025.
Faigley, L., & Witte, S. (1983). Analysing revision. College Composition and Communication, 32, 400–414.
Farah, M. J. (1990). Visual agnosia: Disorders of object recognition and what they tell us about normal vision. Cambridge, MA: MIT Press.
Farah, M. J. (1991). Patterns of co-occurrence among the associative agnosias: Implications for visual object recognition. Cognitive Neuropsychology, 8, 1–19.
Farah, M. J. (1994). Neuropsychological inference with an interactive brain: A critique of the “locality” assumption [with commentaries]. Behavioral and Brain Sciences, 17, 43–104.
Farah, M. J., Hammond, K. M., Mehta, Z., & Ratcliff, G. (1989). Category-specificity and modality-specificity in semantic memory. Neuropsychologia, 27, 193–200.
Farah, M. J., & McClelland, J. L. (1991). A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General, 120, 339–357.
Farah, M. J., Stowe, R. M., & Levinson, K. L. (1996). Phonological dyslexia: Loss of a reading-specific component of the cognitive architecture? Cognitive Neuropsychology, 13, 849–868.
Farah, M. J., & Wallace, M. A. (1992). Semantically-bounded anomia: Implications for the neural implementation of naming. Neuropsychologia, 30, 609–621.
Farrar, M. J. (1990). Discourse and the acquisition of grammatical morphemes. Journal of Child Language, 17, 607–624.
Farrar, M. J. (1992). Negative evidence and grammatical morpheme acquisition. Developmental Psychology, 28, 90–98.
Fauconnier, G., & Turner, M. (2003). The way we think. New York: Basic Books.
Faust, M. E., Balota, D. A., Duchek, J. A., Gernsbacher, M. A., & Smith, S. D. (1997). Inhibitory control during sentence processing in
individuals with dementia of the Alzheimer type. Brain and Language, 57, 225–253.
Fay, D., & Cutler, A. (1977). Malapropisms and the structure of the mental lexicon. Linguistic Inquiry, 8, 505–520.
Fedorenko, E., Gibson, E., & Rohde, D. (2006). The nature of working memory capacity in sentence comprehension: Evidence against domain-specific working memory resources. Journal of Memory and Language, 54, 541–553.
Fedorenko, E., & Kanwisher, N. (2011). Some regions within Broca’s area do respond more strongly to sentences than to linguistically degraded stimuli: A comment on Rogalsky and Hickok (2011). Journal of Cognitive Neuroscience, 23, 2632–2635.
Feitelson, D., Tehori, B. Z., & Levinberg-Green, D. (1982). How effective is early instruction in reading? Experimental evidence. Merrill-Palmer Quarterly, 28, 458–494.
Felix, S. (1992). Language acquisition as a maturational process. In J. Weissenborn, H. Goodluck, & T. Roeper (Eds.), Theoretical issues in language acquisition (pp. 25–51). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Fenson, L., Dale, P., Reznick, J., Bates, E., Thal, D., & Pethick, S. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development, 59 (5, Serial No. 242).
Fera, P., & Besner, D. (1992). The process of lexical decision: More words about a parallel distributed processing model. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 749–764.
Fernald, A. (1991). Prosody and focus in speech to infants and adults. Annals of Child Development, 8, 43–80.
Fernald, A., Swingley, D., & Pinto, J. P. (2001). When half a word is enough: Infants can recognize spoken words using partial phonetic information. Child Development, 72, 1003–1015.
Fernald, G. M. (1943). Remedial techniques in basic school subjects. New York: McGraw-Hill.
Fernandes, K. J., Marcus, G. F., Di Nubila, J. A., & Vouloumanos, A. (2006). From semantics to syntax and back again: Argument structure in the third year of life. Cognition, 100, B10–B20.
Ferreira, F. (2003). The misinterpretation of noncanonical sentences. Cognitive Psychology, 47, 164–203.
Ferreira, F., & Bailey, K. G. D. (2004). Disfluencies and human language comprehension. Trends in Cognitive Sciences, 8, 231–237.
Ferreira, F., & Clifton, C. (1986). The independence of syntactic processing. Journal of Memory and Language, 25, 348–368.
Ferreira, F., & Henderson, J. M. (1990). Use of verb information in syntactic parsing: Evidence from eye movements and word-by-word self-paced reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 555–568.
Ferreira, F., & Henderson, J. M. (1991). Recovery from misanalyses of garden-path sentences. Journal of Memory and Language, 30, 725–745.
Ferreira, V. S. (1996). Is it better to give than to donate? Syntactic flexibility in language production. Journal of Memory and Language, 35, 724–755.
Ferreira, V. S., & Dell, G. S. (2000). Effect of ambiguity and lexical availability on syntactic and lexical production. Cognitive Psychology, 40, 296–340.
Ferreira, V. S., Slevc, L. R., & Rogers, E. S. (2005). How do speakers avoid ambiguous linguistic expressions? Cognition, 96, 263–284.
Ferreira, V. S., & Swets, B. (2002). How incremental is language production? Evidence from the production of utterances requiring the computation of arithmetic sums. Journal of Memory and Language, 46, 57–84.
Ferreiro, E. (1985). Literacy development: A psychogenetic perspective. In D. R. Olson, N. Torrance, & A. Hildyard (Eds.), Literacy, language, and learning: The nature and consequences of reading and writing (pp. 217–228). Cambridge: Cambridge University Press.
Ferreiro, E., & Teberosky, A. (1982). Literacy before schooling. New York: Heinemann.
Fillmore, C. J. (1968). The case for case. In E. Bach & R. T. Harms (Eds.), Universals of linguistic theory (pp. 1–90). New York: Holt, Rinehart & Winston.
Finch, S., & Chater, N. (1992). Bootstrapping syntactic categories. In Proceedings of the 14th Annual Conference of the Cognitive Science Society (pp. 820–825). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Fischler, I. (1977). Semantic facilitation without association in a lexical decision task. Memory and Cognition, 5, 335–339.
Fischler, I., & Bloom, P. A. (1979). Automatic and attentional processes in the effects of sentence contexts on word recognition. Journal of Verbal Learning and Verbal Behavior, 18, 1–20.
Fisher, C. (2002). The role of abstract syntactic knowledge in language acquisition: A reply to Tomasello (2000). Cognition, 82, 259–278.
Fisher, S. E., & Marcus, G. F. (2006). The eloquent ape: Genes, brains and the evolution of language. Nature Reviews Genetics, 7, 9–20.
Fisher, S. E., Marlow, A. J., Lamb, J., Maestrini, E., Williams, D. F., Richardson, A. J., et al. (1999). A quantitative-trait locus on chromosome 6p influences different aspects of developmental dyslexia. American Journal of Human Genetics, 64, 146–156.
Fisher, S. E., Vargha-Khadem, F., Watkins, K. E., Monaco, A. P., & Pembrey, M. E. (1998). Localisation of a gene implicated in a severe speech and language disorder. Nature Genetics, 18, 168–170.
Fitch, W. T., & Hauser, M. D. (2004). Computational constraints on syntactic processing in a nonhuman primate. Science, 303, 377–380.
Fitch, W. T., Hauser, M. D., & Chomsky, N. (2005). The evolution of the language faculty: Clarifications and implications. Cognition, 97, 179–210.
Flavell, J. H., Miller, P. H., & Miller, S. (1993). Cognitive development (3rd ed.). Englewood Cliffs, NJ: Prentice Hall.
Flege, J. E., & Hillenbrand, J. (1984). Limits on phonetic accuracy in foreign language speech production. Journal of the Acoustical Society of America, 76, 708–721.
Fletcher, C. R. (1986). Strategies for the allocation of short-term memory during comprehension. Journal of Memory and Language, 25, 43–58.
Fletcher, C. R. (1994). Levels of representation in memory for discourse. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 589–608). San Diego, CA: Academic Press.
Flower, L. S., & Hayes, J. R. (1980). The dynamics of composing: Making plans and juggling constraints. In L. W. Gregg & E. R. Sternberg (Eds.), Cognitive processes in writing (pp. 31–50). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Fodor, J. A. (1972). Some reflections on L. S. Vygotsky’s thought and language. Cognition, 1, 83–95.
Fodor, J. A. (1975). The language of thought. Hassocks, UK: Harvester Press.
Fodor, J. A. (1978). Tom Swift and his procedural grandmother. Cognition, 6, 229–247.
Fodor, J. A. (1979). In reply to Philip Johnson-Laird. Cognition, 7, 93–95.
Fodor, J. A. (1981). The present status of the innateness controversy. In J. A. Fodor, Representations (pp. 257–316). Brighton, UK: Harvester Press.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Fodor, J. A. (1985). Précis and multiple book review of the Modularity of mind. Behavioral and Brain Sciences, 8, 1–42.
Fodor, J. A., & Bever, T. G. (1965). The psychological reality of linguistic segments. Journal of Verbal Learning and Verbal Behavior, 4, 414–420.
Fodor, J. A., Bever, T. G., & Garrett, M. F. (1974). The psychology of language. New York: McGraw-Hill.
Fodor, J. A., & Garrett, M. F. (1967). Some syntactic determinants of sentential complexity. Perception and Psychophysics, 2, 289–296.
Fodor, J. A., Garrett, M. F., Walker, E. C. T., & Parkes, C. H. (1980). Against definitions. Cognition, 8, 263–367.
Fodor, J. D., Fodor, J. A., & Garrett, M. F. (1975). The psychological unreality of semantic representations. Linguistic Inquiry, 6, 515–531.
Fodor, J. D., & Frazier, L. (1980). Is the human sentence parsing mechanism an ATN? Cognition, 8, 418–459.
Folk, J. R. (1999). Phonological codes are used to access the lexicon during silent reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 892–906.
Folk, J. R., & Morris, R. K. (1995). Multiple lexical codes in reading: Evidence from eye movements, naming time, and oral reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1412–1429.
Ford, M., & Holmes, V. M. (1978). Planning units and syntax in sentence production. Cognition, 6, 35–53.
Forde, E. M. E., & Humphreys, G. W. (1995). Refractory semantics in global aphasia: On semantic organization and the access–storage distinction in neuropsychology. Memory, 3, 265–308.
Forde, E. M. E., & Humphreys, G. W. (1997). A semantic locus for refractory behaviour: Implications for access–storage distinctions and the nature of semantic memory. Cognitive Neuropsychology, 14, 367–402.
Forster, K. I. (1976). Accessing the mental lexicon. In R. J. Wales & E. C. T. Walker (Eds.), New approaches to language mechanisms (pp. 257–287). Amsterdam: North Holland.
Forster, K. I. (1979). Levels of processing and the structure of the language processor. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 27–85). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Forster, K. I. (1981). Priming and effects of sentence and lexical contexts on naming time: Evidence of autonomous lexical processing. Quarterly Journal of Experimental Psychology, 33A, 465–495.
Forster, K. I. (1994). Computational modeling and elementary process analysis in visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 20, 1292–1310.
Forster, K. I. (2000). The potential for experimenter bias effects in word recognition experiments. Memory and Cognition, 28, 1109–1115.
Forster, K. I., & Chambers, S. M. (1973). Lexical access and naming time. Journal of Verbal Learning and Verbal Behavior, 12, 627–635.
Forster, K. I., & Davis, C. (1984). Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 680–698.
Forster, K. I., Davis, C., Schoknecht, C., & Carter, R. (1987). Masked priming with graphemically related forms: Repetition or partial activation? Quarterly Journal of Experimental Psychology, 39, 211–251.
Forster, K. I., & Olbrei, I. (1973). Semantic heuristics and syntactic analysis. Cognition, 2, 319–347.
Forster, K. I., & Veres, C. (1998). The prime lexicality effect: Form-priming as a function of prime
awareness, lexical status, and discrimination difficulty. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 498–514.
Foss, D. J. (1970). Some effects of ambiguity upon sentence comprehension. Journal of Verbal Learning and Verbal Behavior, 9, 699–706.
Foss, D. J., & Blank, M. A. (1980). Identifying the speech codes. Cognitive Psychology, 12, 1–31.
Foss, D. J., & Gernsbacher, M. A. (1983). Cracking the dual code: Toward a unitary model of phoneme identification. Journal of Verbal Learning and Verbal Behavior, 22, 609–632.
Foss, D. J., & Swinney, D. A. (1973). On the psychological reality of the phoneme: Perception, identification, and consciousness. Journal of Verbal Learning and Verbal Behavior, 12, 246–257.
Fouts, R. S., Fouts, D. H., & van Cantfort, T. E. (1989). The infant Loulis learns signs from cross-fostered chimpanzees. In R. A. Gardner, B. T. Gardner, & T. E. van Cantfort (Eds.), Teaching sign language to chimpanzees (pp. 280–292). Albany, NY: SUNY Press.
Fouts, R. S., Hirsch, A. D., & Fouts, D. H. (1982). Cultural transmission of a human language in a chimpanzee mother–infant relationship. In H. E. Fitzgerald, J. A. Mullins, & P. Gage (Eds.), Child nurturance (Vol. 3, pp. 159–193). New York: Plenum.
Fouts, R. S., Shapiro, G., & O’Neil, C. (1978). Studies of linguistic behaviour in apes and children. In P. Siple (Ed.), Understanding language through sign language research (pp. 163–185). London: Academic Press.
Fowler, A. E., Gelman, R., & Gleitman, L. R. (1994). The course of language learning in children with Down syndrome: Longitudinal and language level comparisons with young normally developing children. In H. Tager-Flusberg (Ed.), Constraints on language acquisition: Studies of atypical children (pp. 91–140). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Fox Tree, J. E., & Schrock, J. C. (1999). Discourse markers in spontaneous speech: Oh what a difference an oh makes. Journal of Memory and Language, 40, 280–295.
Foygel, D., & Dell, G. S. (2000). Models of impaired lexical access in speech production. Journal of Memory and Language, 43, 182–216.
Francis, W. N., & Kucera, H. (1982). Frequency analysis of English usage. Boston, MA: Houghton Mifflin.
Frank, S. L., & Bod, R. (2011). Insensitivity of the human sentence-processing system to hierarchical structure. Psychological Science, 22, 829–834.
Franklin, S., Howard, D., & Patterson, K. E. (1994). Abstract word deafness. Cognitive Neuropsychology, 11, 1–34.
Franklin, S., Turner, J., Lambon Ralph, M. A., Morris, J., & Bailey, P. J. (1996). A distinctive case of word meaning deafness? Cognitive Neuropsychology, 13, 1139–1162.
Frauenfelder, U., Segui, J., & Dijkstra, T. (1990). Lexical effects in phonemic processing: Facilitatory or inhibitory? Journal of Experimental Psychology: Human Perception and Performance, 16, 77–91.
Frauenfelder, U. H., & Tyler, L. K. (1987). The process of spoken word recognition: An introduction. Cognition, 25, 1–20.
Frazier, L. (1987a). Sentence processing: A tutorial review. In M. Coltheart (Ed.), Attention and performance XII: The psychology of reading (pp. 559–586). Hove, UK: Lawrence Erlbaum Associates.
Frazier, L. (1987b). Syntactic processing: Evidence from Dutch. Natural Language and Linguistic Theory, 5, 519–560.
Frazier, L. (1989). Against lexical generation of syntax. In W. Marslen-Wilson (Ed.), Lexical representation and process (pp. 505–528). Cambridge, MA: MIT Press.
Frazier, L. (1995). Constraint satisfaction as a theory of sentence processing. Journal of Psycholinguistic Research, 24, 437–468.
Frazier, L., Clifton, C., & Randall, J. (1983). Filling gaps: Decision principles and structure in sentence comprehension. Cognition, 13, 187–222.
Frazier, L., & Flores d’Arcais, G. B. (1989). Filler driven parsing: A study of gap filling in Dutch. Journal of Memory and Language, 28, 331–344.
Frazier, L., Flores d’Arcais, G. B., & Coolen, R. (1993). Processing discontinuous words: On the interface between lexical and syntactic processing. Cognition, 47, 219–249.
Frazier, L., & Fodor, J. D. (1978). The sausage machine: A new two-stage parsing model. Cognition, 6, 291–325.
Frazier, L., & Rayner, K. (1982). Making and correcting errors during sentence comprehension: Eye movements in the analysis of structurally ambiguous sentences. Cognitive Psychology, 14, 178–210.
Frazier, L., & Rayner, K. (1987). Resolution of syntactic category ambiguities: Eye movements in parsing lexically ambiguous sentences. Journal of Memory and Language, 26, 505–526.
Frazier, L., & Rayner, K. (1990). Taking on semantic commitments: Processing multiple meanings versus multiple senses. Journal of Memory and Language, 29, 181–200.
Freberg, L. A. (2006). Discovering biological psychology. Boston, MA: Houghton Mifflin.
Frederiksen, J. R., & Kroll, J. F. (1976). Spelling and sound: Approaches to the internal lexicon. Journal of Experimental Psychology: Human Perception and Performance, 2, 361–379.
Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und Philosophische Kritik, 100, 25–50. [Translated in P. T. Geach &
M. Black (Eds.), Philosophical writings of Gottlob Frege (1952). Oxford: Blackwell.]
Fremgen, A., & Fay, D. (1980). Overextensions in production and comprehension: A methodological clarification. Journal of Child Language, 7, 205–211.
Freud, S. (1975). The psychopathology of everyday life (Trans. A. Tyson). Harmondsworth, UK: Penguin. [Originally published 1901.]
Freudenthal, D., Pine, J., & Gobet, F. (2005). Simulating the cross-linguistic development of optional infinitive errors in MOSAIC. In B. G. Bara, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th Annual Meeting of the Cognitive Science Society (pp. 702–707). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Freudenthal, D., Pine, J. M., & Gobet, F. (2006). Modelling the development of children’s use of optional infinitives in English and Dutch using MOSAIC. Cognitive Science, 30, 277–310.
Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6, 78–84.
Friederici, A. D. (2012). The cortical language circuit: From auditory perception to sentence comprehension. Trends in Cognitive Sciences, 16, 262–268.
Friederici, A. D., Bahlmann, J., Heim, S., Schubotz, R. I., & Anwander, A. (2006). The brain differentiates human and non-human grammars: Functional localization and structural connectivity. Proceedings of the National Academy of Sciences of the United States of America, 103, 2458–2463.
Friederici, A. D., & Kilborn, K. (1989). Temporal constraints on language processing: Syntactic priming in Broca’s aphasia. Journal of Cognitive Neuroscience, 1, 262–272.
Friedman, N. P., & Miyake, A. (2000). Differential roles for visuospatial and verbal working memory in situation model construction. Journal of Experimental Psychology: General, 129, 61–83.
Friedman, R. B. (1995). Two types of phonological alexia. Cortex, 31, 397–403.
Frith, U. (1985). Beneath the surface of developmental dyslexia. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia (pp. 301–330). Hove, UK: Lawrence Erlbaum Associates.
Fromkin, V. A. (1971/1973). The non-anomalous nature of anomalous utterances. Language, 51, 696–719. [Reprinted in V. A. Fromkin (Ed.) (1973), Speech errors as linguistic evidence (pp. 215–242). The Hague: Mouton.]
Fromkin, V. A., Krashen, S., Curtiss, S., Rigler, D., & Rigler, M. (1974). The development of language in Genie: A case of language acquisition beyond the “Critical Period.” Brain and Language, 1, 81–107.
Fromkin, V. A., Rodman, R., & Hyams, N. (2011). An introduction to language (9th ed.). Boston, MA: Thomson Heinle.
Frost, R. (1998). Toward a strong phonological theory of visual word recognition: True issues and false trails. Psychological Bulletin, 123, 71–99.
Frost, R., Forster, K. I., & Deutsch, A. (1997). What can we learn from the morphology of Hebrew? A masked priming investigation of morphological representation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 829–856.
Funnell, E. (1983). Phonological processes in reading: New evidence from acquired dyslexia. British Journal of Psychology, 74, 159–180.
Funnell, E. (1996). Response biases in oral reading: An account of the co-occurrence of surface dyslexia and semantic dementia. Quarterly Journal of Experimental Psychology, 49A, 417–446.
Funnell, E., & de Mornay Davies, P. (1996). JBR: A reassessment of concept familiarity and a category-specific disorder for living things. Neurocase, 2, 461–474.
Funnell, E., & Sheridan, J. (1992). Categories of knowledge? Unfamiliar aspects of living and non-living things. Cognitive Neuropsychology, 9, 135–153.
Furth, H. (1966). Thinking without language. London: Macmillan.
Furth, H. (1971). Linguistic deficiency and thinking: Research with deaf subjects 1964–69. Psychological Bulletin, 75, 58–72.
Furth, H. (1973). Deafness and learning: A psychosocial approach. Belmont, CA: Wadsworth.
Gaffan, D., & Heywood, C. A. (1993). A spurious category-specific visual agnosia for living things in normal human and nonhuman primates. Journal of Cognitive Neuroscience, 5, 118–128.
Gainotti, G., di Betta, A. M., & Silveri, M. C. (1996). The production of specific and generic associates of living and nonliving, high- and low-familiarity stimuli in Alzheimer’s disease. Brain and Language, 54, 262–274.
Galaburda, A. M., Menard, M. T., & Rosen, G. D. (1994). Evidence for aberrant auditory anatomy in developmental dyslexia. Proceedings of the National Academy of Sciences, 91, 8010–8013.
Galaburda, A. M., Sherman, G. F., Rosen, G. D., Aboitiz, F., & Geschwind, N. (1985). Developmental dyslexia: Four consecutive patients with cortical anomalies. Annals of Neurology, 18, 222–233.
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin and Review, 13, 361–377.
Gallaway, C., & Richards, B. J. (Eds.). (1994). Input and interaction in language acquisition. Cambridge: Cambridge University Press.
Galton, F. (1879). Psychometric experiments. Brain, 2, 149–162.
Ganong, W. F. (1980). Phonetic categorization in auditory word perception. Journal of Experimental
Psychology: Human Perception and Performance, 6, 110–125.
Garcia, L. J., & Joanette, Y. (1997). Analysis of conversational topic shifts: A multiple case study. Brain and Language, 58, 92–114.
Gardner, M. (1990). Science: Good, bad, and bogus. Loughton, UK: Prometheus Books.
Gardner, R. A., & Gardner, B. T. (1969). Teaching sign language to a chimpanzee. Science, 165, 664–672.
Gardner, R. A., & Gardner, B. T. (1975). Evidence for sentence constituents in the early utterances of child chimpanzee. Journal of Experimental Psychology: General, 104, 244–267.
Gardner, R. A., van Cantfort, T. E., & Gardner, B. T. (1992). Categorical replies to categorical questions by cross-fostered chimpanzees. American Journal of Psychology, 105, 27–57.
Garnham, A. (1983a). Why psycholinguists don’t care about DTC: A reply to Berwick and Weinberg. Cognition, 15, 263–270.
Garnham, A. (1983b). What’s wrong with story grammars. Cognition, 15, 145–154.
Garnham, A. (1985). Psycholinguistics: Central topics. London: Methuen.
Garnham, A. (1987). Mental models as representation of discourse and text. Chichester, UK: Horwood.
Garnham, A., & Oakhill, J. (1992). Discourse processing and text representation from a “mental models” perspective. Language and Cognitive Processes, 7, 193–204.
Garnham, A., Oakhill, J., & Johnson-Laird, P. N. (1982). Referential continuity and the coherence of discourse. Cognition, 11, 29–46.
Garnham, A., Shillcock, R. C., Brown, G. D. A., Mill, A. I. D., & Cutler, A. (1982). Slips of the tongue in the London–Lund corpus of spontaneous conversation. In A. Cutler (Ed.), Slips of the tongue and language production (pp. 251–263). Amsterdam: Mouton.
Garnica, O. (1977). Some prosodic and paralinguistic features of speech to young children. In C. E. Snow & C. A. Ferguson (Eds.), Talking to children: Language input and acquisition (pp. 63–88). Cambridge: Cambridge University Press.
Garnsey, S. M., Pearlmutter, N. J., Myers, E., & Lotocky, M. A. (1997). The contributions of verb bias and plausibility to the comprehension of temporarily ambiguous sentences. Journal of Memory and Language, 37, 58–93.
Garnsey, S. M., Tanenhaus, M. K., & Chapman, R. M. (1989). Evoked potentials and the study of sentence comprehension. Journal of Psycholinguistic Research, 18, 51–60.
Garrard, P., & Hodges, J. R. (2000). Semantic dementia: Clinical, radiological and pathological perspectives. Journal of Neurology, 247, 409–422.
Garrard, P., Maloney, L. M., Hodges, J. R., & Patterson, K. E. (2005). The effects of very early Alzheimer’s disease on the characteristics of writing by a renowned author. Brain, 128, 250–260.
Garrard, P., Patterson, K., Watson, P. C., & Hodges, J. R. (1998). Category specific semantic loss in dementia of Alzheimer’s type: Functional–anatomical correlations from cross-sectional analyses. Brain, 121, 633–646.
Garrett, M. F. (1975). The analysis of sentence production. In G. Bower (Ed.), The psychology of learning and motivation (Vol. 9, pp. 133–177). New York: Academic Press.
Garrett, M. F. (1976). Syntactic processes in sentence production. In R. J. Wales & E. C. T. Walker (Eds.), New approaches to language mechanisms (pp. 231–255). Amsterdam: North Holland.
Garrett, M. F. (1980a). Levels of processing in sentence production. In B. Butterworth (Ed.), Language production: Vol. 1. Speech and talk (pp. 177–220). London: Academic Press.
Garrett, M. F. (1980b). The limits of accommodation. In V. Fromkin (Ed.), Errors in linguistic performance: Slips of the tongue, ear, pen, and hand (pp. 263–271). New York: Academic Press.
Garrett, M. F. (1982). Production of speech: Observations from normal and pathological language use. In A. W. Ellis (Ed.), Normality and pathology in cognitive functions (pp. 19–76). London: Academic Press.
Garrett, M. F. (1988). Processes in language production. In F. J. Newmeyer (Ed.), Linguistics: The Cambridge survey: Vol. 3. Language: Psychological and biological aspects (pp. 69–96). Cambridge: Cambridge University Press.
Garrett, M. F. (1992). Disorders of lexical selection. Cognition, 42, 143–180.
Garrett, M. F., Bever, T. G., & Fodor, J. A. (1966). The active use of grammar in speech perception. Perception and Psychophysics, 1, 30–32.
Garrod, S., & Anderson, A. (1987). Saying what you mean in dialogue: A study in conceptual and semantic co-ordination. Cognition, 27, 181–218.
Garrod, S. C., & Sanford, A. J. (1977). Interpreting anaphoric relations: The integration of semantic information while reading. Journal of Verbal Learning and Verbal Behavior, 16, 77–90.
Garrod, S. C., & Terras, M. (2000). The contribution of lexical and situational knowledge to resolving discourse roles: Bonding and resolution. Journal of Memory and Language, 42, 526–544.
Gaskell, G. (Ed.). (2007). Oxford handbook of psycholinguistics. Oxford: Oxford University Press.
Gaskell, M. G., & Marslen-Wilson, W. D. (1997). Integrating form and meaning: A distributed model of speech perception. Language and Cognitive Processes, 12, 613–656.
Gaskell, M. G., & Marslen-Wilson, W. D. (1998). Mechanisms of phonological inference in speech
perception. Journal of Experimental Psychology: Human Perception and Performance, 24, 380–396.
Gaskell, M. G., & Marslen-Wilson, W. D. (2002). Representation and competition in the perception of spoken words. Cognitive Psychology, 45, 220–266.
Gathercole, S. E., Alloway, T. P., Willis, C., & Adams, A. (2006). Working memory in children with reading disabilities. Journal of Experimental Child Psychology, 93, 265–281.
Gathercole, S. E., & Baddeley, A. D. (1989). Evaluation of the role of phonological STM in the development of vocabulary in children: A longitudinal study. Journal of Memory and Language, 28, 200–213.
Gathercole, S. E., & Baddeley, A. D. (1990). Phonological memory deficits in language disordered children: Is there a causal connection? Journal of Memory and Language, 29, 336–360.
Gathercole, S. E., & Baddeley, A. D. (1993). Working memory and language. Hove, UK: Lawrence Erlbaum Associates.
Gathercole, S. E., & Baddeley, A. D. (1997). Sense and sensitivity in phonological memory and vocabulary development: A reply to Bowey (1996). Journal of Experimental Child Psychology, 67, 290–294.
Gathercole, V. C. (1985). “He has too much hard questions”: The acquisition of the linguistic mass-count distinction in much and many. Journal of Child Language, 12, 395–415.
Gathercole, V. C. (1987). The contrastive hypothesis for the acquisition of word meaning: A reconsideration of the theory. Journal of Child Language, 14, 493–531.
Gathercole, V. C. (1989). Contrast: A semantic constraint? Journal of Child Language, 16, 685–702.
Gazdar, G., Klein, E., Pullum, G. K., & Sag, I. A. (1985). Generalized phrase structure grammar. Oxford: Blackwell.
Gazzaniga, M. S., Ivry, R. B., & Mangun, G. R. (2008). Cognitive neuroscience: The biology of the mind (3rd ed.). New York: Norton.
Gentner, D. (1978). On relational meaning: The acquisition of verb meaning. Child Development, 49, 988–998.
Gentner, D. (1981). Verb structures in memory for sentences: Evidence for componential representation. Cognitive Psychology, 13, 56–83.
Gentner, D. (1982). Why nouns are learned before verbs: Linguistic relativity vs. natural partitioning. In S. A. Kuczaj (Ed.), Language development: Vol. 2. Language, thought, and culture (pp. 301–334). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Gergely, G., & Bever, T. G. (1986). Related intuitions and the mental representation of causative verbs in adults and children. Cognition, 23, 211–277.
Gerken, L. (1994). Child phonology: Past research, present questions, future direction. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 781–820). San Diego, CA: Academic Press.
Gernsbacher, M. A. (1984). Resolving 20 years of inconsistent interactions between lexical familiarity and orthography, concreteness, and polysemy. Journal of Experimental Psychology: General, 113, 256–281.
Gernsbacher, M. A. (1990). Language comprehension as structure building. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Gernsbacher, M. A. (1997). Group differences in suppression skill. Aging, Neuropsychology, and Cognition, 4, 175–184.
Gernsbacher, M. A., & Hargreaves, D. J. (1988). Accessing sentence participants: The advantage of first mention. Journal of Memory and Language, 27, 699–717.
Gernsbacher, M. A., Hargreaves, D. J., & Beeman, M. (1989). Building and accessing clausal representations: The advantage of first mention versus the advantage of clause recency. Journal of Memory and Language, 28, 735–755.
Gernsbacher, M. A., Varner, K. R., & Faust, M. (1990). Investigating differences in general comprehension skill. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 430–445.
Gerrig, R. (1986). Processes and products of lexical access. Language and Cognitive Processes, 1, 187–196.
Gertner, Y., Fisher, C., & Eisengart, J. (2006). Learning words and rules: Abstract knowledge of word order in early sentence comprehension. Psychological Science, 17, 684–691.
Geschwind, N. (1972). Language and the brain. Scientific American, 226, 76–83.
Gibbs, R. W. (1980). Spilling the beans on understanding and memory for idioms in conversation. Memory and Cognition, 8, 149–156.
Gibbs, R. W. (1986a). On the psycholinguistics of sarcasm. Journal of Experimental Psychology: General, 115, 3–15.
Gibbs, R. W. (1986b). What makes some indirect speech acts conventional? Journal of Memory and Language, 25, 181–196.
Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68, 1–76.
Gibson, E., & Thomas, J. (1999). Memory limitations and structural forgetting: The perception of complex ungrammatical sentences as grammatical. Language and Cognitive Processes, 14, 225–248.
Gibson, J. J. (1979). The ecological approach to perception. Boston, MA: Houghton Mifflin.
Gilhooly, K. J. (1984). Word age-of-acquisition and residence time in lexical memory as factors in word naming. Current Psychological Research, 3, 24–31.
Gillette, J., Gleitman, H., Gleitman, L., & Lederer, A. (1999). Human simulations of vocabulary learning. Cognition, 73, 135–176.
Glaser, W. R. (1992). Picture naming. Cognition, 42, 61–105.
Gleason, H. A. (1961). An introduction to descriptive linguistics. New York: Holt, Rinehart & Winston.
Gleason, J. B., Hay, D., & Crain, L. (1989). The social and affective determinants of language development. In M. Rice & R. Schiefelbusch (Eds.), The teachability of language (pp. 171–186). Baltimore, MD: Paul Brookes.
Gleason, J. B., & Ratner, N. B. (1993). Language development in children. In J. B. Gleason & N. B. Ratner (Eds.), Psycholinguistics (pp. 301–350). Fort Worth, TX: Harcourt Brace Jovanovich.
Gleick, J. (1987). Chaos. London: Sphere Books.
Gleitman, L. R. (1981). Maturational determinants of language growth. Cognition, 10, 105–113.
Gleitman, L. R. (1990). The structural sources of word meaning. Language Acquisition, 1, 3–55.
Gleitman, L. R., Cassidy, K., Nappa, R., Papafragou, A., & Trueswell, J. C. (2005). Hard words. Language Learning and Development, 1, 23–64.
Gleitman, L. R., & Papafragou, A. (2005). Language and thought. In K. J. Holyoak & R. Morrison (Eds.), The Cambridge handbook of thinking and reasoning. Cambridge: Cambridge University Press.
Gleitman, L. R., & Wanner, E. (1982). Language acquisition: The state of the state of the art. In E. Wanner & L. R. Gleitman (Eds.), Language acquisition: The state of the art (pp. 3–48). Cambridge: Cambridge University Press.
Glenberg, A. (2007). Language and action: Creating sensible combinations of ideas. In M. G. Gaskell (Ed.), Oxford handbook of psycholinguistics (pp. 362–370). Oxford: Oxford University Press.
Glenberg, A. M., Meyer, M., & Lindem, K. (1987). Mental models contribute to foregrounding during text comprehension. Journal of Memory and Language, 26, 69–83.
Glenberg, A. M., & Robertson, D. A. (2000). Symbol grounding and meaning: A comparison of high-dimensional and embodied theories of meaning. Journal of Memory and Language, 43, 379–401.
Gluck, M. A., & Bower, G. H. (1988). From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, 8, 37–50.
Glucksberg, S. (1991). Beyond literal meanings: The psychology of allusion. Psychological Science, 2, 146–152.
Glucksberg, S., Gildea, P., & Bookin, H. B. (1982). On understanding nonliteral speech: Can people ignore metaphors? Journal of Verbal Learning and Verbal Behavior, 21, 85–98.
Glucksberg, S., & Keysar, B. (1990). Understanding metaphorical comparisons: Beyond similarity. Psychological Review, 97, 3–18.
Glucksberg, S., Kreuz, R. J., & Rho, S. H. (1986). Context can constrain lexical access: Implications for models of language comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 323–335.
Glucksberg, S., & Weisberg, R. W. (1966). Verbal behavior and problem solving: Some effects of labeling in a functional fixedness problem. Journal of Experimental Psychology, 71, 659–664.
Glushko, R. J. (1979). The organization and activation of orthographic knowledge in reading aloud. Journal of Experimental Psychology: Human Perception and Performance, 5, 674–691.
Goffman, E. (1967). Interaction ritual: Essays on face to face behavior. Garden City, NY: Anchor Books.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 16, 447–474.
Goldberg, E., & Costa, L. D. (1981). Hemisphere differences in the acquisition and use of descriptive systems. Brain and Language, 14, 144–173.
Goldfield, B. A. (1993). Noun bias in maternal speech to one-year-olds. Journal of Child Language, 20, 85–99.
Goldiamond, I., & Hawkins, W. F. (1958). Vexierversuch: The logarithmic relationship between word-frequency and recognition obtained in the absence of stimulus words. Journal of Experimental Psychology, 56, 457–463.
Goldin-Meadow, S., Butcher, C., Mylander, C., & Dodge, M. (1994). Nouns and verbs in a self-styled gesture system: What’s in a name? Cognitive Psychology, 27, 259–319.
Goldin-Meadow, S., Mylander, C., & Butcher, C. (1995). The resilience of combinatorial structure at the word level: Morphology in self-styled gesture systems. Cognition, 56, 195–262.
Goldinger, S. D., Luce, P. A., & Pisoni, D. B. (1989). Priming lexical neighbours of spoken words: Effects of competition and inhibition. Journal of Memory and Language, 28, 501–518.
Goldman-Eisler, F. (1958). Speech production and the predictability of words in context. Quarterly Journal of Experimental Psychology, 10, 96–106.
Goldman-Eisler, F. (1968). Psycholinguistics: Experiments in spontaneous speech. London: Academic Press.
Golinkoff, R. M., Hirsh-Pasek, K., Bailey, L. M., & Wenger, N. R. (1992). Young children and adults use lexical principles to learn new nouns. Developmental Psychology, 28, 99–108.
Golinkoff, R. M., Mervis, C. B., & Hirsh-Pasek, K. (1994). Early object labels: The case for lexical principles. Journal of Child Language, 21, 125–155.
Gollan, T. H., & Acenas, L. R. (2004). What is a TOT? Cognate and translation effects on tip-of-the-tongue states in Spanish–English and Tagalog–English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 246–269.
Gollan, T. H., & Brown, A. S. (2006). From tip-of- Gordon, P. C., Hendrick, R., & Levine, W. H.
the-tongue (TOT) data to theoretical implications in (2002). Memory-load interference in syntactic
two steps: When more TOTs means better retrieval. processing. Psychological Science, 13, 425–430.
Journal of Experimental Psychology: General, 135, Gordon, P. C., & Meyer, D. E. (1984). Perceptual-
462–483. motor processing of phonetic features. Journal of
Gombert, J. E. (1992). Metalinguistic development Experimental Psychology: Human Perception and
(Trans. T. Pownall, originally published 1990). Performance, 10, 153–178.
London: Harvester Wheatsheaf. Goswami, U. (1986). Children’s use of analogy in
Gomez, R. L., & Gerken, L. (2000). Infant artificial learning to read: A developmental study. Journal of
language learning and language acquisition. Trends in Experimental Child Psychology, 42, 73–83.
Cognitive Sciences, 4, 178–186. Goswami, U. (1988). Orthographic analogies
Gonnerman, L. M., Andersen, E. S., Devlin, J. T., and reading development. Quarterly Journal of
Kempler, D., & Seidenberg, M. S. (1997). Double Experimental Psychology, 40A, 239–268.
dissociation of semantic categories in Alzheimer’s Goswami, U. (1993). Towards an interactive
disease. Brain and Language, 57, 254–279. analogy model of reading development: Decoding
Good, D. A., & Butterworth, B. (1980). Hesitancy vowel graphemes in beginning reading. Journal of
as a conversational resource: Some methodological Experimental Child Psychology, 56, 443–475.
implications. In H. W. Dechert & M. Raupach (Eds.), Goswami, U., & Bryant, P. (1990). Phonological
Temporal variables in speech (pp. 145–152). The skills and learning to read. Hove, UK: Lawrence
Hague: Mouton. Erlbaum Associates.
Goodglass, H. (1976). Agrammatism. In H. Whitaker Goswami, U., Wang, H., Cruz, A., Fosker, T.,
& H. A. Whitaker (Eds.), Studies in neurolinguistics Mead, N., & Huss, M. (2011). Language-universal
(Vol. 1, pp. 237–260). New York: Academic Press. sensory deficits in developmental dyslexia: English,
Goodglass, H., & Geschwind, N. (1976). Language Spanish, and Chinese. Journal of Cognitive
disorders (aphasia). In E. C. Carterette & M. P. Neuroscience, 23, 325–337.
Friedman (Eds.), Handbook of perception: Vol. VII. Goswami, U., Ziegler, J. C., & Richardson, U.
Language and speech (pp. 389–428). New York: (2005). The effects of spelling consistency on
Academic Press. phonological awareness: A comparison of English and
Goodglass, H., & Menn, L. (1985). Is agrammatism German. Journal of Experimental Child Psychology,
a unitary phenomenon? In M.-L. Kean (Ed.), 92, 345–365.
Agrammatism (pp. 1–26). New York: Academic Press. Gotts, S. J., & Plaut, D. C. (2002). The impact
Goodluck, H. (1991). Language acquisition: A of synaptic depression following brain damage: A
linguistic introduction. Oxford: Blackwell. connectionist account of “access/refractory” and
Gopnik, M. (1990a). Dysphasia in an extended family. “degraded-store” semantic impairments. Cognitive,
Nature, 344, 715. Affective, and Behavioral Neuroscience, 2, 187–213.
Gopnik, M. (1990b). Feature blindness: A case study. Gough, P. B. (1972). One second of reading. In
Language Acquisition, 1, 139–164. J. F. Kavanagh & I. G. Mattingly (Eds.), Language
Gopnik, M. (1992). A model module? Cognitive by ear and by eye (pp. 331–358). Cambridge, MA:
Neuropsychology, 9, 253–258. MIT Press.
Gopnik, M., & Crago, M. B. (1991). Familial Goulandris, A., & Snowling, M. (1991). Visual
aggregation of a developmental language disorder. memory deficits: A plausible case of developmental
Cognition, 39, 1–50. dyslexia? Evidence from a single case study. Cognitive
Gopnik, M., & Meltzoff, A. N. (1997). Words, Neuropsychology, 8, 127–154.
thoughts, and theories. Cambridge, MA: MIT Press. Graesser, A. C., Singer, M., & Trabasso, T.
Gordon, B., & Caramazza, A. (1982). Lexical (1994). Constructing inferences during narrative text
decision for open and closed-class words: Failure to comprehension. Psychological Review, 101, 371–395.
replicate differential frequency sensitivity. Brain and Graf, P., & Torrey, J. W. (1966). Perception of phrase
Language, 15, 143–160. structure in written language. American Psychological
Gordon, P. (1985). Evaluating the semantic categories Association Convention Proceedings, 83–88.
hypothesis: The case of the count/mass distinction. Graham, K. S., Hodges, J. R., & Patterson, K.
Cognition, 20, 209–242. (1994). The relationship between comprehension
Gordon, P. (2004). Numerical cognition without and oral reading in progressive fluent aphasia.
words: Evidence from Amazonia. Science, 306, Neuropsychologia, 32, 299–316.
496–499. Grainger, J. (1990). Word frequency and
Gordon, P. C., Grosz, B. J., & Gilliom, L. A. (1993). neighborhood frequency effects in lexical decision
Pronouns, names, and the centering of attention in and naming. Journal of Memory and Language, 29,
discourse. Cognitive Science, 17, 311–347. 228–244.
Grainger, J., & Jacobs, A. M. (1996). Orthographic Grodzinsky, Y. (1984). The syntactic characterization
processing in visual word recognition: A multiple of agrammatism. Cognition, 16, 88–120.
read-out model. Psychological Review, 103, 518–565. Grodzinsky, Y. (1989). Agrammatic comprehension
Grainger, J., Lété, B., Bertrand, D., Dufau, S., & of relative clauses. Brain and Language, 37, 480–499.
Ziegler, J. C. (2012). Evidence for multiple routes in Grodzinsky, Y. (1990). Theoretical perspectives on
learning to read. Cognition, 123, 280–292. language deficits. Cambridge, MA: MIT Press.
Grainger, J., O’Regan, K., Jacobs, A. M., & Grodzinsky, Y. (2000). The neurology of syntax:
Segui, J. (1989). On the role of competing word Language use without Broca’s area. Behavioral and
units in visual word recognition: The neighbourhood Brain Sciences, 23, 1–71.
frequency effect. Perception and Psychophysics, 45, Grodzinsky, Y., & Friederici, A. D. (2006).
189–195. Neuroimaging of syntax and syntactic processing.
Green, D. W. (1986). Control, activation, and Current Opinion in Neurobiology, 16, 240–246.
resource: A framework and a model for the control Grosjean, F. (1980). Spoken word recognition
of speech in bilinguals. Brain and Language, 27, processes and the gating paradigm. Perception and
210–223. Psychophysics, 28, 267–283.
Greenberg, J. H. (1963). Some universals of grammar Grosjean, F. (1997). Processing mixed languages:
with particular reference to the order of meaningful Issues, findings, and models. In A. de Groot & J. Kroll
elements. In J. H. Greenberg (Ed.), Universals of (Eds.), Tutorials in bilingualism: Psycholinguistic
language (pp. 58–90). Cambridge, MA: MIT Press. perspectives (pp. 225–254). Mahwah, NJ: Lawrence
Greene, J. (1972). Psycholinguistics. Harmondsworth, Erlbaum Associates, Inc.
UK: Penguin. Grosjean, F., & Frauenfelder, U. H. (1996). A guide
Greenfield, P. M., & Savage-Rumbaugh, E. S. (1990). to spoken word recognition paradigms: Introduction.
Grammatical combinations in Pan paniscus: Processes of Language and Cognitive Processes, 11, 553–558.
learning and invention in the evolution and development Grosjean, F., & Gee, J. P. (1987). Prosodic structure
of language. In S. T. Parker & K. R. Gibson (Eds.), and spoken word recognition. Cognition, 25, 135–155.
“Language” and intelligence in monkeys and apes: Grosjean, F., & Soares, C. (1986). Processing mixed
Comparative developmental perspectives (pp. 540–578). language: Some preliminary findings. In J. Vaid (Ed.),
New York: Cambridge University Press. Language processing in bilinguals: Psycholinguistic
Greenfield, P. M., & Smith, J. H. (1976). The and neuropsychological perspectives (pp. 145–179).
structure of communication in early language Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
development. New York: Academic Press. Grossman, M., Mickanin, J., Robinson, K. M.,
Gregory, R. L. (1961). The brain as an engineering & d’Esposito, M. (1996). Anomaly judgements of
problem. In W. H. Thorpe & O. L. Zangwill (Eds.), subject–predicate relations in Alzheimer’s disease.
Current problems in animal behaviour (pp. 547–565). Brain and Language, 54, 216–232.
London: Methuen. Grosz, B. J., Joshi, A. K., & Weinstein, S. (1995).
Grice, H. P. (1975). Logic and conversation. In Centering: A framework for modeling the local
P. Cole & J. Morgan (Eds.), Syntax and semantics: Vol. coherence of discourse. Computational Linguistics, 21,
3. Speech acts (pp. 41–58). New York: Academic Press. 203–225.
Griffin, Z. M. (2001). Gaze durations during speech Gumperz, J. J., & Levinson, S. C. (1996). Rethinking
reflect word selection and phonological encoding. linguistic relativity. Cambridge: Cambridge University
Cognition, 82, B1–B14. Press.
Griffin, Z. M. (2004). The eyes are right when the Gupta, P., & MacWhinney, B. (1997). Vocabulary
mouth is wrong. Psychological Science, 15, 814–821. acquisition and verbal short-term memory:
Griffin, Z. M., & Bock, K. (1998). Constraint, Computational and neural bases. Brain and Language,
word frequency, and the relationship between lexical 59, 267–333.
processing levels in spoken word production. Journal Haarmann, H. J., Just, M. A., & Carpenter, P. A.
of Memory and Language, 38, 313–338. (1997). Aphasic sentence comprehension as a
Griffin, Z. M., & Bock, K. (2000). What the eyes say resource deficit: A computational approach. Brain and
about speaking. Psychological Science, 11, 274–279. Language, 59, 76–120.
Griffin, Z. M., & Oppenheimer, D. M. (2006). Haarmann, H. J., & Kolk, H. H. J. (1991). Syntactic
Speakers gaze at objects while preparing intentionally priming in Broca’s aphasics: Evidence for slow
inaccurate labels for them. Journal of Experimental activation. Aphasiology, 5, 247–263.
Psychology: Learning, Memory, and Cognition, 32, Haber, R. N., & Haber, L. R. (1982). Does silent
943–948. reading involve articulation? Evidence from tongue-
Grober, E. H., Beardsley, W., & Caramazza, A. twisters. American Journal of Psychology, 95, 409–419.
(1978). Parallel function in pronoun assignment. Hadzibeganovic, T., van den Noort, M., Bosch, P.,
Cognition, 6, 117–133. Perc, M., van Kralingen, R., Mondt, K., &
Coltheart, M. (2010). Cross-linguistic neuroimaging Groot & J. F. Kroll (Eds.), Tutorials in bilingualism:
and dyslexia: A critical view. Cortex, 46, 1312–1316. Psycholinguistic perspectives (pp. 19–51). Mahwah,
Hagoort, P. (2008). Should psychology ignore NJ: Lawrence Erlbaum Associates, Inc.
the language of the brain? Current Directions in Harley, T. A. (1984). A critique of top-down
Psychological Science, 17, 96–101. independent levels models of speech production:
Hagoort, P., Brown, C. M., & Groothusen, J. Evidence from non-plan-internal speech production.
(1993). The syntactic positive shift as an ERP-measure Cognitive Science, 8, 191–219.
of syntactic processing. Language and Cognitive Harley, T. A. (1990). Paragrammatisms: Syntactic
Processes, 8, 439–483. disturbance or failure of control? Cognition, 34,
Hakuta, K. (1986). Mirror of language. New York: 85–91.
Basic Books. Harley, T. A. (1993a). Phonological activation of
Hakuta, K., & Diaz, R. (1985). The relationship between semantic competitors during lexical access in speech
degree of bilingualism and cognitive ability: A critical production. Language and Cognitive Processes, 8,
discussion and some new longitudinal data. In K. E. 291–309.
Nelson (Ed.), Children’s language (Vol. 5, pp. 319–344). Harley, T. A. (1993b). Connectionist approaches to
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. language disorders. Aphasiology, 7, 221–249.
Haldane, J. B. S. (1927). A mathematical theory of Harley, T. A. (1998). The semantic deficit in
natural and artificial selection, Part V: Selection and dementia: Connectionist approaches to what goes
mutation. Proceedings of the Cambridge Philosophical wrong in picture naming. Aphasiology, 12, 299–318.
Society, 23, 838–844. Harley, T. A. (2004a). Does cognitive neuropsychology
Hale, J. T. (2010). What a rational parser would do. have a future? Cognitive Neuropsychology, 21 (Special
Cognitive Science, 35, 399–443. Issue; Lead article), 3–16.
Hale, S. (2002). The man who lost his language. Harley, T. A. (2004b). Promises, promises. Cognitive
Harmondsworth, UK: Penguin Books. Neuropsychology, 21 (Special Issue; Reply to
Hall, D. G. (1993). Basic-level individuals. Cognition, commentators), 51–56.
48, 199–221. Harley, T. A. (2010). Talking the talk: Language,
Hall, D. G. (1994). How mothers teach basic-level psychology and science. Hove, UK: Psychology Press.
and situation-restricted count nouns. Journal of Child Harley, T. A., & Bown, H. E. (1998). What causes
Language, 21, 391–414. a tip-of-the-tongue state? Evidence for lexical
Hall, D. G., & Waxman, S. R. (1993). Assumptions neighbourhood effects in speech production. British
about word meaning: Individuation and basic-level Journal of Psychology, 89, 151–174.
kinds. Child Development, 64, 1550–1570. Harley, T. A., & MacAndrew, S. B. G. (1992).
Halle, M., & Stevens, K. N. (1962). Speech Modelling paraphasias in normal and aphasic speech.
recognition: A model and a program for research. IRE In Proceedings of the 14th Annual Conference of the
Transactions of the Professional Group on Information Cognitive Science Society (pp. 378–383). Hillsdale,
Theory, 8, 155–159. NJ: Lawrence Erlbaum Associates, Inc.
Hampton, J. A. (1979). Polymorphous concepts in Harm, M. W., & Seidenberg, M. S. (1999).
semantic memory. Journal of Verbal Learning and Phonology, reading acquisition, and dyslexia: Insights
Verbal Behavior, 18, 441–461. from connectionist models. Psychological Review,
Hampton, J. A. (1981). An investigation of the 106, 491–528.
nature of abstract concepts. Memory and Cognition, 9, Harm, M. W., & Seidenberg, M. S. (2001). Are there
149–156. orthographic impairments in phonological dyslexia?
Hanley, J. R., Hastie, K., & Kay, J. (1992). Cognitive Neuropsychology, 18, 71–92.
Developmental surface dyslexia and dysgraphia: Harm, M. W., & Seidenberg, M. S. (2004).
An orthographic processing impairment. Quarterly Computing the meanings of words in reading:
Journal of Experimental Psychology, 44A, 285–320. Cooperative division of labor between visual
Hanley, J. R., & McDonnell, V. (1997). Are reading and phonological processes. Psychological Review,
and spelling phonologically mediated? Evidence 111, 662–720.
from a patient with a speech production impairment. Harris, M. (1978). Noun animacy and the passive
Cognitive Neuropsychology, 14, 3–33. voice: A developmental approach. Quarterly Journal
Hanten, G., & Martin, R. C. (2000). Contributions of Experimental Psychology, 30, 495–504.
of phonological and semantic short-term memory Harris, M., & Coltheart, M. (1986). Language
to sentence processing: Evidence from two cases of processing in children and adults. London: Routledge
closed head injury in children. Journal of Memory and & Kegan Paul.
Language, 43, 335–361. Harris, M., & Hatano, G. (Eds.). (1999). Learning
Harley, B., & Wang, W. (1997). The critical period to read and write: A cross-linguistic perspective.
hypothesis: Where are we now? In A. M. B. de Cambridge: Cambridge University Press.
Harris, P. L. (1982). Cognitive prerequisites to Hauk, O., Johnsrude, I., & Pulvermüller, F. (2004).
language? British Journal of Psychology, 73, 187–195. Somatotopic representation of action words in human
Harris, R. J. (1977). Comprehension of pragmatic motor and premotor cortex. Neuron, 41, 301–307.
implications in advertising. Journal of Applied Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002).
Psychology, 63, 603–608. The faculty of language: What is it, who has it, and
Harris, R. J. (1978). The effect of jury size and how did it evolve? Science, 298, 1569–1579.
judge’s instructions on memory for pragmatic Hauser, M. D., Newport, E. L., & Aslin, R. N.
implications from courtroom testimony. Bulletin of the (2001). Segmentation of the speech stream in a non-
Psychonomic Society, 11, 129–132. human primate: Statistical learning in cotton-top
Harris, Z. S. (1951). Methods in structural linguistics. tamarins. Cognition, 78, B53–B64.
Chicago: University of Chicago Press. Haviland, S. E., & Clark, H. H. (1974). What’s
Harste, J., Burke, C., & Woodward, V. (1982). new? Acquiring new information as a process of
Children’s language and world: Initial encounters with comprehension. Journal of Verbal Learning and
print. In J. Langer & M. Smith-Burke (Eds.), Bridging Verbal Behavior, 13, 515–521.
the gap: Reader meets author (pp. 105–131). Newark, Hawkins, J. A. (1990). A parsing theory of word order
DE: International Reading Association. universals. Linguistic Inquiry, 21, 223–261.
Hart, J., Berndt, R. S., & Caramazza, A. (1985). Hayes, C. (1951). The ape in our house. New York:
Category-specific naming deficit following cerebral Harper.
infarction. Nature, 316, 439–440. Hayes, J. R., & Flower, L. S. (1980). Identifying
Hart, J., & Gordon, B. (1992). Neural subsystems for the organisation of writing processes. In L. W. Gregg
object knowledge. Nature, 359, 60–64. & E. R. Steinberg (Eds.), Cognitive processes in
Hartley, T., & Houghton, G. (1996). A linguistically writing (pp. 3–30). Hillsdale, NJ: Lawrence Erlbaum
constrained model of short-term memory for nonwords. Associates, Inc.
Journal of Memory and Language, 35, 1–31. Hayes, J. R., & Flower, L. S. (1986). Writing
Hartsuiker, R. J., Antón-Méndez, I., Roelstraete, B.,
& Costa, A. (2006). Spoonish Spanerisms: A lexical 1106–1113.
bias effect in Spanish. Journal of Experimental Hayes, K. J., & Nissen, C. H. (1971). Higher mental
Psychology: Learning, Memory, and Cognition, 32, functions of a home-raised chimpanzee. In
949–953. A. M. Schrier & F. Stollnitz (Eds.), Behaviour of
Hartsuiker, R. J., & Kolk, H. H. J. (1998). Syntactic nonhuman primates (Vol. 4, pp. 60–115). New York:
facilitation in agrammatic sentence production. Brain Academic Press.
and Language, 62, 221–254. Haywood, S. L., Pickering, M. J., & Branigan,
Hartsuiker, R. J., Kolk, H. H. J., & Huiskamp, P. H. P. (2005). Do speakers avoid ambiguities during
(1999). Priming word order in sentence production. dialogue? Psychological Science, 16, 362–366.
Quarterly Journal of Experimental Psychology, 52A, Healy, A., & Miller, G. (1970). The verb as the
129–147. main determinant of sentence meaning. Psychonomic
Hartsuiker, R. J., Pickering, M. J., & Veltkamp, E. Science, 20, 372.
(2004). Is syntax separate or shared between Heath, S. B. (1983). Ways with words. Cambridge:
languages? Cross-linguistic syntactic priming in Cambridge University Press.
Spanish/English bilinguals. Psychological Science, 15, Hebb, D. O. (1949). The organization of behavior.
409–414. New York: Wiley.
Hartsuiker, R. J., & Westenberg, C. (2000). Heider, E. R. (1971). “Focal” color areas and
Word order priming in written and spoken sentence the development of color names. Developmental
production. Cognition, 75, B27–B39. Psychology, 4, 447–455.
Haskell, T. R., & MacDonald, M. C. (2003). Heider, E. R. (1972). Universals in colour naming
Conflicting cues and competition in subject–verb and memory. Journal of Experimental Psychology, 93,
agreement. Journal of Memory and Language, 48, 10–20.
760–778. Heit, E., & Barsalou, L. W. (1996). The instantiation
Haskell, T. R., MacDonald, M. C., & Seidenberg, M. S. principle in natural categories. Memory, 4, 413–451.
(2003). Language learning and innateness: Some Hemforth, B., & Konieczny, L. (1999). German
implications of compounds research. Cognitive sentence processing. Dordrecht: Kluwer Academic
Psychology, 47, 119–163. Publishers.
Hatcher, P. J., Hulme, C., & Ellis, A. W. (1994). Henderson, A., Goldman-Eisler, F., & Skarbek, A.
Ameliorating early reading failure by integrating (1966). Sequential temporal patterns in speech.
the teaching of reading and phonological skills: The Language and Speech, 8, 236–242.
phonological linkage hypothesis. Child Development, Henderson, J. M., & Ferreira, F. (Eds.). (2004).
65, 41–57. The interface of language, vision, and action:
Eye movements and the visual world. New York: Hinton, G. E., Plaut, D. C., & Shallice, T. (1993).
Psychology Press. Simulating brain damage. Scientific American, 269,
Henderson, L. (1982). Orthography and word 58–65.
recognition in reading. London: Academic Press. Hinton, G. E., & Sejnowski, T. J. (1986). Learning
Herman, L. M., Richards, D. G., & Wolz, J. P. and relearning in Boltzmann machines. In
(1984). Comprehension of sentences by bottlenosed D. E. Rumelhart, J. L. McClelland, & the PDP Research
dolphins. Cognition, 16, 129–219. Group, Parallel distributed processing: Explorations
Herrnstein, R., Loveland, D., & Cable, C. (1977). in the microstructure of cognition: Vol. 1. Foundations
Natural concepts in pigeons. Journal of Experimental (pp. 282–317). Cambridge, MA: MIT Press.
Psychology: Animal Learning and Memory, 2, Hinton, G. E., & Shallice, T. (1991). Lesioning an
285–302. attractor network: Investigations of acquired dyslexia.
Hespos, S. J., & Spelke, E. (2004). Conceptual Psychological Review, 98, 74–95.
precursors to language. Nature, 430, 453–456. Hintzman, D. L. (1986). “Schemata abstraction” in a
Hess, D. J., Foss, D. J., & Carroll, P. (1995). Effects multiple-trace memory model. Psychological Review,
of global and local context on lexical processing 93, 411–428.
during language comprehension. Journal of Hirsh, K. W., & Ellis, A. W. (1994). Age of
Experimental Psychology: General, 124, 62–82. acquisition and lexical processing in aphasia: A case
Hickerson, N. P. (1971). Review of “Basic study. Cognitive Neuropsychology, 11, 435–458.
Color Terms.” International Journal of American Hirsh-Pasek, K., Kemler-Nelson, D. G., Jusczyk, P. W.,
Linguistics, 37, 257–270. Cassidy, K. W., Druss, B., & Kennedy, L. (1987).
Hickok, G., & Poeppel, D. (2004). Dorsal and Clauses are perceptual units for young infants. Cognition,
ventral streams: A framework for understanding 26, 269–286.
aspects of the functional anatomy of language. Hirsh-Pasek, K., Reeves, L. M., & Golinkoff, R. M.
Cognition, 92, 67–99. (1993). Words and meaning: From primitives to
Hieke, A. E., Kowal, S. H., & O’Connell, D. C. complex organisation. In J. Berko Gleason & N. B.
(1983). The trouble with “articulatory” pauses. Ratner (Eds.), Psycholinguistics (pp. 134–199). Fort
Language and Speech, 26, 203–214. Worth, TX: Harcourt Brace.
Hill, R. L., & Murray, W. S. (2000). Commas and Hirsh-Pasek, K., Treiman, R., & Schneiderman, M.
spaces: Effects of punctuation on eye movements and (1984). Brown and Hanlon revisited: Mothers’
sentence parsing. In A. Kennedy, R. Radach, D. Heller, sensitivity to ungrammatical forms. Journal of Child
& J. Pynte (Eds.), Reading as a perceptual process Language, 11, 81–88.
(pp. 565–589). Oxford: Elsevier. Hladik, E. G., & Edwards, H. T. (1984). A
Hillis, A. (2002). Handbook of adult language comparative analysis of mother–father speech
disorders. Hove, UK: Psychology Press. in the naturalistic home environment. Journal of
Hillis, A. E., Boatman, D., Hart, J., & Gordon, B. Psycholinguistic Research, 13, 321–332.
(1999). Making sense out of jargon: A neurolinguistic Hockett, C. F. (1960). The origin of speech. Scientific
and computational account of jargon aphasia. American, 203, 89–96.
Neurology, 53, 1813–1824. Hodges, J. R., & Greene, J. D. W. (1998). Knowing
Hillis, A. E., & Caramazza, A. (1991a). Category- about people and naming them: Can Alzheimer’s
specific naming and comprehension impairment: A disease patients do one without the other? Quarterly
double dissociation. Brain, 114, 2081–2094. Journal of Experimental Psychology, 51A, 121–134.
Hillis, A. E., & Caramazza, A. (1991b). Mechanisms Hodges, J. R., Patterson, K. E., Oxbury, S., &
for accessing lexical representations for output: Funnell, E. (1992). Semantic dementia: Progressive
Evidence from a category-specific semantic deficit. fluent aphasia with temporal lobe atrophy. Brain, 115,
Brain and Language, 40, 106–144. 1783–1806.
Hillis, A. E., & Caramazza, A. (1995). Representation Hodges, J. R., Salmon, D. P., & Butters, N. (1991).
of grammatical categories of words in the brain. The nature of the naming deficit in Alzheimer’s and
Journal of Cognitive Neuroscience, 7, 396–407. Huntington’s disease. Brain, 114, 1547–1558.
Hillis, A. E., Tuffiash, E., & Caramazza, A. (2002). Hodgson, J. M. (1991). Informational constraints
Modality-specific deterioration in naming verbs in on pre-lexical priming. Language and Cognitive
nonfluent primary progressive aphasia. Journal of Processes, 6, 169–205.
Cognitive Neuroscience, 14, 1099–1108. Hoff, E. (2003). The specificity of environmental
Hinton, G. E. (1989). Deterministic Boltzmann influence: Socioeconomic status affects early
learning performs steepest descent in weight-space. vocabulary development via maternal speech. Child
Neural Computation, 1, 143–150. Development, 74, 1368–1378.
Hinton, G. E. (1992). How neural networks learn Hoff, E., & Naigles, L. (2002). How children use input
from experience. Scientific American, 267, 105–109. to acquire a lexicon. Child Development, 73, 418–433.
Hoff-Ginsberg, E. (1997). Language development. case of a global aphasic. Cognitive Neuropsychology,
Pacific Grove, CA: Brooks/Cole. 1, 163–190.
Hoffman, C., Lau, I., & Johnson, D. R. (1986). The Howe, C. (1980). Language learning from mothers’
linguistic relativity of person cognition. Journal of replies. First Language, 1, 83–97.
Personality and Social Psychology, 51, 1097–1105. Howes, D. H., & Solomon, R. L. (1951). Visual
Hogaboam, T. W., & Perfetti, C. A. (1975). Lexical duration threshold as a function of word probability.
ambiguity and sentence comprehension: The common Journal of Experimental Psychology, 41, 401–410.
sense effect. Journal of Verbal Learning and Verbal Hulme, C. (1981). Reading retardation and
Behavior, 14, 265–275. multisensory learning. London: Routledge & Kegan
Holender, D. (1986). Semantic activation without Paul.
conscious identification in dichotic listening, Hulme, C., Muter, V., & Snowling, M. (1998).
parafoveal vision, and visual masking: A survey Segmentation does predict early progress in learning
and appraisal. Behavioral and Brain Sciences, 9, to read better than rhyme: A reply to Bryant. Journal
1–23. of Experimental Child Psychology, 71, 39–44.
Hollan, J. D. (1975). Features and semantic memory: Hulme, C., & Snowling, M. (1992). Deficits in
Set-theoretic or network model? Psychological output phonology: An explanation of reading failure.
Review, 82, 154–155. Cognitive Neuropsychology, 9, 47–72.
Hollich, G., Hirsh-Pasek, K., & Golinkoff, R. Hulme, C., Snowling, M., Caravolas, M., & Carroll, J.
(2000). Breaking the language barrier: An emergentist (2005). Phonological skills are (probably) one cause of
coalition of word learning. Oxford: Blackwell. success in learning to read. Scientific Studies of Reading,
Holmes, V. M. (1988). Hesitations and sentence 9, 351–366.
planning. Language and Cognitive Processes, 3, Humphreys, G. W. (1985). Attention, automaticity,
323–361. and autonomy in visual word processing. In D. Besner,
Holmes, V. M., & O’Regan, J. K. (1981). Eye T. G. Waller, & G. E. MacKinnon (Eds.), Reading
fixation patterns during the reading of relative clause research: Advances in theory and practice (Vol. 5, pp.
sentences. Journal of Verbal Learning and Verbal 253–310). New York: Academic Press.
Behavior, 20, 417–430. Humphreys, G. W., Besner, D., & Quinlan, P. T.
Holtgraves, T. (1998). Interpreting indirect replies. (1988). Event perception and the word repetition
Cognitive Psychology, 37, 1–27. effect. Journal of Experimental Psychology: General,
Holyoak, K. J., & Glass, A. L. (1975). The role of 117, 51–67.
contradictions and counter-examples in the rejection of Humphreys, G. W., & Rumiati, R. I. (1998).
false sentences. Journal of Verbal Learning and Verbal Agnosia without prosopagnosia or alexia: Evidence for
Behavior, 14, 215–239. stored visual memories specific to objects. Cognitive
Horton, W. S., & Gerrig, R. J. (2005). The impact of Neuropsychology, 15, 243–277.
memory demands on audience design during language Hunt, E., & Agnoli, F. (1991). The Whorfian
production. Cognition, 96, 127–142. hypothesis: A cognitive psychology perspective.
Howard, D. (1995). Lexical anomia: Or the case Psychological Review, 98, 377–389.
of the missing lexical entries. Quarterly Journal of Hurford, J. R. (2003). The neural basis of predicate-
Experimental Psychology, 48A, 999–1023. argument structure. Behavioral and Brain Sciences,
Howard, D. (1997). Language in the brain. In M. D. 26, 261–316.
Rugg (Ed.), Cognitive neuroscience (pp. 277–304). Hurst, J. A., Baraitser, M., Auger, E., Graham,
Hove, UK: Psychology Press. F., & Norell, S. (1990). An extended family with a
Howard, D., & Best, W. (1996). Developmental dominantly inherited speech disorder. Developmental
phonological dyslexia: Real word reading can be Medicine and Child Neurology, 32, 347–355.
completely normal. Cognitive Neuropsychology, 13, Huttenlocher, J., Smiley, P., & Charney, R. (1983).
887–934. The emergence of action categories in the child:
Howard, D., & Butterworth, B. (1989). Short- Evidence from verb meanings. Psychological Review,
term memory and sentence comprehension: A 90, 72–93.
reply to Vallar and Baddeley, 1987. Cognitive Huttenlocher, J., Vasilyeva, M., Cymerman, E., &
Neuropsychology, 6, 455–463. Levine, S. (2002). Language input and child syntax.
Howard, D., & Franklin, S. (1988). Missing the Cognitive Psychology, 45, 337–374.
meaning? Cambridge, MA: MIT Press. Huttenlocher, J., Waterfall, H., Vasilyeva, M.,
Howard, D., & Hatfield, F. M. (1987). Aphasia Vevea, J., & Hedges, L. V. (2010). Sources of
therapy: Historical and contemporary issues. Hove, variability in children’s language growth. Cognitive
UK: Lawrence Erlbaum Associates. Psychology, 61, 343–365.
Howard, D., & Orchard-Lisle, V. (1984). On the Indefrey, P., & Levelt, W. J. M. (2000). The neural
origin of semantic errors in naming: Evidence from the correlates of language production. In M. Gazzaniga
(Ed.), The new cognitive neurosciences (2nd ed., pp. James, S. L., & Khan, L. M. L. (1982). Grammatical
845–865). Cambridge, MA: MIT Press. morpheme acquisition: An approximately invariant
Indefrey, P., & Levelt, W. J. M. (2004). The order? Journal of Psycholinguistic Research, 11,
spatial and temporal signatures of word production 381–388.
components. Cognition, 92, 101–144. Jared, D. (1997a). Evidence that strategy effects in
Inhoff, A. W. (1984). Two stages of word processing word naming reflect changes in output timing
during eye fixations in the reading of prose. Journal of rather than changes in processing route. Journal of
Verbal Learning and Verbal Behavior, 23, 612–624. Experimental Psychology: Learning, Memory, and
Ivanova, I., Pickering, M. J., McLean, J. F., Costa, A., Cognition, 23, 1424–1438.
& Branigan, H. P. (2012). How do people produce Jared, D. (1997b). Spelling–sound consistency affects
ungrammatical utterances? Journal of Memory and the naming of high-frequency words. Journal of
Language, 67, 355–370. Memory and Language, 36, 505–529.
Jackendoff, R. (1977). X-bar syntax: A study of Jared, D., Levy, B. A., & Rayner, K. (1999). The
phrase structure. Cambridge, MA: MIT Press. role of phonology in the activation of word meanings
Jackendoff, R. (1983). Semantics and cognition. during reading: Evidence from proofreading and eye
Cambridge, MA: MIT Press. movements. Journal of Experimental Psychology:
Jackendoff, R. (1987). On beyond zebra: The relation General, 128, 219–264.
of linguistic and visual information. Cognition, 26, Jared, D., McRae, K., & Seidenberg, M. S. (1990).
89–114. The basis of consistency effects in word naming.
Jackendoff, R. (1999). Possible stages in the Journal of Memory and Language, 29, 687–715.
evolution of the language capacity. Trends in Cognitive Jared, D., & Seidenberg, M. S. (1991). Does word
Sciences, 3, 272–279. identification proceed from spelling to sound to
Jackendoff, R. (2002). Foundations of language. meaning? Journal of Experimental Psychology:
Oxford: Oxford University Press. General, 120, 358–394.
Jackendoff, R. (2003). Précis of Foundations of Jarvella, R. J. (1971). Syntactic processing of
language: Brain, meaning, grammar, evolution. connected speech. Journal of Verbal Learning and
Behavioral and Brain Sciences, 26, 651–707. Verbal Behavior, 10, 409–416.
Jackendoff, R., & Pinker, S. (2005). The nature of Jastrzembski, J. E. (1981). Multiple meanings,
the language faculty and its implications for evolution number of related meanings, frequency of occurrence,
of language (Reply to Fitch, Hauser, and Chomsky). and the lexicon. Cognitive Psychology, 13, 278–305.
Cognition, 97, 211–225. Jescheniak, J. D., & Levelt, W. J. M. (1994). Word
Jacobsen, E. (1932). The electrophysiology of mental frequency effects in speech production: Retrieval
activities. American Journal of Psychology, 44, of syntactic information and of phonological form.
677–694. Journal of Experimental Psychology: Learning,
Jacoby, L. L. (1983). Perceptual enhancement: Memory, and Cognition, 20, 824–843.
Persistent effects of an experience. Journal of Jescheniak, J. D., Meyer, A. S., & Levelt, W. J. M.
Experimental Psychology: Learning, Memory, and (2003). Specific-word frequency is not all that counts
Cognition, 15, 930–940. in speech production: Comments on Caramazza,
Jacoby, L. L., & Dallas, M. (1981). On the Costa, et al. (2001) and new experimental data.
relationship between autobiographical memory Journal of Experimental Psychology: Learning,
and perceptual learning. Journal of Experimental Memory, and Cognition, 29, 432–438.
Psychology: General, 110, 306–340. Jescheniak, J. D., Schriefers, H., & Hantsch, A.
Jaeger, J. J., Lockwood, A. H., Kemmerer, D., van (2003). Utterance format affects phonological priming
Valin, R. D., Murphy, B. W., & Khalak, H. G. in the picture–word task: Implications for models of
(1996). A positron emission tomographic study of phonological encoding in speech production. Journal
regular and irregular verb morphology in English. of Experimental Psychology: Human Perception and
Language, 72, 451–497. Performance, 29, 441–454.
Jaffe, J., Breskin, S., & Gerstman, L. J. (1972). Jin, Y.-S. (1990). Effects of concreteness on cross-
Random generation of apparent speech rhythms. language priming in lexical decisions. Perceptual and
Language and Speech, 15, 68–71. Motor Skills, 70, 1139–1154.
Jakobson, R. (1968). Child language, aphasia and Joanisse, M. F., & Seidenberg, M. S. (1998).
phonological universals. The Hague: Mouton. Specific language impairment: A deficit in grammar or
James, L. E., & Burke, D. M. (2000). Phonological processing? Trends in Cognitive Sciences, 2, 240–247.
priming effects on word retrieval and tip-of-the-tongue Joanisse, M. F., & Seidenberg, M. S. (1999).
experiences in young and older adults. Journal of Impairments in verb morphology after brain injury:
Experimental Psychology: Learning, Memory, and A connectionist model. Proceedings of the National
Cognition, 26, 1378–1391. Academy of Sciences USA, 96, 7592–7597.
Joanisse, M. F., & Seidenberg, M. S. (2005). Jones, G. V. (1989). Back to Woodworth: Role of
Imaging the past: Neural activation in frontal and interlopers in the tip-of-the-tongue phenomenon.
temporal regions during regular and irregular past Memory and Cognition, 17, 69–76.
tense processing. Cognitive, Affective and Behavioral Jones, G. V., & Langford, S. (1987). Phonological
Neuroscience, 5, 282–296. blocking in the tip of the tongue state. Cognition, 26,
Job, R., Miozzo, M., & Sartori, G. (1993). On the 115–122.
existence of category-specific impairments: A reply to Jones, G. V., & Martin, M. (1985). Deep dyslexia
Parkin and Stewart. Quarterly Journal of Experimental and the right-hemisphere hypothesis for semantic
Psychology, 46A, 511–516. paralexia: A reply to Marshall and Patterson.
Johnson, E. K., & Jusczyk, P. W. (2001). Word Neuropsychologia, 23, 685–688.
segmentation by 8-month-olds: When speech cues Jones, L. L., & Estes, Z. (2005). Metaphor
count more than statistics. Journal of Memory and comprehension as attributive categorization. Journal
Language, 44, 548–567. of Memory and Language, 53, 110–124.
Johnson, E. K., Jusczyk, P. W., Cutler, A., & Jorm, A. F. (1979). The cognitive and neurological
Norris, D. (2003). Lexical viability constraints on basis of developmental dyslexia: A theoretical
speech segmentation by infants. Cognitive Psychology, framework and review. Cognition, 7, 19–32.
46, 65–97. Jorm, A. F., & Share, D. L. (1983). Phonological
Johnson, J. S., & Newport, E. L. (1989). Critical recoding and reading acquisition. Applied
period effects in second language learning: The Psycholinguistics, 4, 103–147.
influence of maturational state on the acquisition of Jusczyk, P. W. (1982). Auditory versus phonetic
English as a second language. Cognitive Psychology, coding of speech signals during infancy. In J. Mehler,
21, 60–99. E. C. T. Walker, & M. Garrett (Eds.), Perspectives on
Johnson, K. E., & Mervis, C. B. (1997). Effects mental representation (pp. 361–387). Hillsdale, NJ:
of varying levels of expertise on the basic level of Lawrence Erlbaum Associates, Inc.
categorization. Journal of Experimental Psychology: Just, M. A., & Carpenter, P. A. (1980). A theory
General, 126, 248–277. of reading: From eye fixations to comprehension.
Johnson, R. E. (1970). Recall of prose as a function Psychological Review, 87, 329–354.
of the structural importance of the linguistic units. Just, M. A., & Carpenter, P. A. (1987). The
Journal of Verbal Learning and Verbal Behavior, 9, psychology of reading and language comprehension.
12–20. Newton, MA: Allyn & Bacon.
Johnson-Laird, P. N. (1975). Meaning and the mental Just, M. A., & Carpenter, P. A. (1992). A capacity
lexicon. In A. Kennedy & A. Wilkes (Eds.), Studies theory of comprehension: Individual differences in
in long-term memory (pp. 123–142). London: John working memory. Psychological Review, 99, 122–149.
Wiley. Just, M. A., Carpenter, P. A., & Keller, T. A. (1996).
Johnson-Laird, P. N. (1978). What’s wrong with The capacity theory of comprehension: New frontiers
Grandma’s guide to procedural semantics: A reply to of evidence and arguments. Psychological Review,
Jerry Fodor. Cognition, 6, 249–261. 103, 773–780.
Johnson-Laird, P. N. (1983). Mental models. Just, M. A., & Varma, S. (2002). A hybrid architecture
Cambridge: Cambridge University Press. for working memory: Reply to MacDonald and
Johnson-Laird, P. N., Herrman, D. J., & Chaffin, R. Christiansen. Psychological Review, 109, 55–65.
(1984). Only connections: A critique of semantic Kail, R., & Nippold, M. A. (1984). Unconstrained
networks. Psychological Bulletin, 96, 292–315. retrieval from semantic memory. Child Development,
Johnston, R. S., & Watson, J. E. (2004). 55, 944–951.
Accelerating the development of reading, spelling and Kako, E. (1999a). Elements of syntax in the systems
phonemic awareness skills in initial readers. Reading of three language-trained animals. Animal Learning
and Writing, 17, 327–357. and Behavior, 27, 1–14.
Johnston, R. S., & Watson, J. E. (2005). The effects Kako, E. (1999b). Response to Pepperberg; Herman
of synthetic phonics teaching on reading and spelling and Uyeyama; and Shanker, Savage-Rumbaugh, and
attainment: A seven year longitudinal study. The Taylor. Animal Learning and Behavior, 27, 26–27.
Scottish Executive, available at http://www.scotland. Kamide, Y., Altmann, G. T. M., & Haywood, S. L.
gov.uk/Publications/2005/02/20688/52449. (2003). The time-course of prediction in incremental
Jolicoeur, P., Gluck, M. A., & Kosslyn, S. M. (1984). sentence processing: Evidence from anticipatory eye
Pictures and names: Making the connection. Cognitive movements. Journal of Memory and Language, 49,
Psychology, 16, 243–275. 133–156.
Jones, G. V. (1985). Deep dyslexia, imageability, Kaminski, J., Call, J., & Fischer, J. (2004). Word
and ease of predication. Brain and Language, 24, learning in a domestic dog: Evidence for “fast
1–19. mapping.” Science, 304, 1682–1683.
Kanwisher, N. (1987). Repetition blindness: Type Keenan, J. M., MacWhinney, B., & Mayhew, D.
recognition without token individuation. Cognition, (1977). Pragmatics in memory: A study of natural
27, 117–143. conversation. Journal of Verbal Learning and Verbal
Kanwisher, N., & Potter, M. C. (1990). Repetition Behavior, 16, 549–560.
blindness: Levels of processing. Journal of Kegl, J., Senghas, A., & Coppola, M. (1999).
Experimental Psychology: Human Perception and Creations through contact: Sign language emergence
Performance, 16, 30–47. and sign language change in Nicaragua. In M.
Karmiloff-Smith, A. (1986). From meta-process to DeGraff (Ed.), Comparative grammatical change: The
conscious access: Evidence from metalinguistic and intersection of language acquisition, Creole genesis,
repair data. Cognition, 23, 95–147. and diachronic syntax (pp. 179–237). Cambridge,
Katz, J. J. (1977). The real status of semantic MA: MIT Press.
representations. Linguistic Inquiry, 8, 559–584. Kellas, G., Ferraro, F. R., & Simpson, G. B.
Katz, J. J., & Fodor, J. A. (1963). The structure of a (1988). Lexical ambiguity and the time-course of
semantic theory. Language, 39, 170–210. attentional allocation in word recognition. Journal
Kaufer, D., Hayes, J. R., & Flower, L. S. (1986). of Experimental Psychology: Human Perception and
Composing written sentences. Research in the Performance, 14, 601–609.
Teaching of English, 20, 121–140. Kellogg, R. T. (1988). Attentional overload and
Kaup, B., & Zwaan, R. A. (2003). Effects of negational writing performance. Journal of Experimental
and situational presence on the accessibility of text Psychology: Learning, Memory, and Cognition, 14,
information. Journal of Experimental Psychology: 355–365.
Learning, Memory, and Cognition, 29, 439–446. Kellogg, W. N., & Kellogg, L. A. (1933). The ape and
Kawamoto, A. (1993). Nonlinear dynamics in the the child. New York: McGraw-Hill.
resolution of lexical ambiguity: A parallel distributed Kelly, M. H. (1992). Using sound to solve syntactic
processing account. Journal of Memory and problems: The role of phonology in grammatical
Language, 32, 474–516. category assignments. Psychological Review, 99,
Kay, D. A., & Anglin, J. M. (1982). Overextension 349–364.
and underextension in the child’s expressive and Kelly, M. H., Bock, J. K., & Keil, F. C. (1986).
receptive speech. Journal of Child Language, 9, Prototypicality in a linguistic context: Effects on
83–98. sentence structure. Journal of Memory and Language,
Kay, J. (1985). Mechanisms of oral reading: A critical 25, 59–74.
appraisal of cognitive models. In A. W. Ellis (Ed.), Kelter, S., Kaup, B., & Claus, B. (2004).
Progress in the psychology of language (Vol. 2, pp. Representing a described sequence of events: A
73–105). Hove, UK: Lawrence Erlbaum Associates. dynamic view of narrative comprehension. Journal
Kay, J., & Bishop, D. (1987). Anatomical differences of Experimental Psychology: Learning, Memory, and
between nose, palm, foot. Or, the body in question: Cognition, 30, 451–464.
Further dissection of the processes of sub-lexical Kempen, G., & Huijbers, P. (1983). The
spelling–sound translation. In M. Coltheart (Ed.), lexicalization process in sentence production and
Attention and performance XII: The psychology of naming: Indirect election of words. Cognition, 14,
reading (pp. 449–469). Hove, UK: Lawrence Erlbaum 185–209.
Associates. Kendon, A. (1967). Some functions of gaze direction
Kay, J., & Ellis, A. W. (1987). A cognitive in social interaction. Acta Psychologica, 26, 22–63.
neuropsychological case study of anomia: Implications Kennedy, A. (2000). Parafoveal processing in word
for psychological models of word retrieval. Brain, 110, recognition. Quarterly Journal of Experimental
613–629. Psychology, 53A, 429–455.
Kay, J., Lesser, R., & Coltheart, M. (1992). Kennedy, A., Murray, W. S., Jennings, F., & Reid, C.
Psycholinguistic assessments of language processing (1989). Parsing complements: Comments on the
in aphasia (PALPA): An introduction. Hove, UK: generality of the principle of minimal attachment.
Lawrence Erlbaum Associates. Language and Cognitive Processes, 4, 51–76.
Kay, J., & Marcel, A. J. (1981). One process, not two Kennison, S. M. (2001). Limitations on the use of
in reading aloud: Lexical analogies do the work of verb information during sentence comprehension.
nonlexical rules. Quarterly Journal of Experimental Psychonomic Bulletin and Review, 8, 132–138.
Psychology, 33A, 397–414. Kennison, S. M., & Trofe, J. L. (2004).
Kay, P., & Kempton, W. (1984). What is the Sapir– Comprehending pronouns: A role for word-
Whorf hypothesis? American Anthropologist, 86, 65–79. specific gender stereotype information. Journal of
Kean, M.-L. (1977). The linguistic interpretation of Psycholinguistic Research, 32, 355–378.
aphasic syndromes: Agrammatism in Broca’s aphasia, Kersten, A. W., & Earles, J. L. (2001). Less really
an example. Cognition, 5, 9–46. is more for adults learning a miniature artificial
language. Journal of Memory and Language, 44, Kintsch, W., & Vipond, D. (1979). Reading
250–273. comprehension and readability in educational practice
Kess, J. F., & Miyamoto, T. (1999). The Japanese and psychological theory. In L. G. Nilsson (Ed.),
mental lexicon: Psycholinguistic studies of Kana and Perspectives in memory research (pp. 329–366).
Kanji processing. Amsterdam: John Benjamins. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Keysar, B. (1989). On the functional equivalence of Kintsch, W., Welsch, D., Schmalhofer, F., & Zimny, S.
literal and metaphorical interpretations of discourse. (1990). Sentence memory: A theoretical analysis.
Journal of Memory and Language, 28, 375–385. Journal of Memory and Language, 29, 133–159.
Keysar, B., Barr, D. J., Balin, J. A., & Paek, T. S. Kiparsky, P., & Menn, L. (1977). On the acquisition of
(1998). Definite reference and mutual knowledge: phonology. In J. Macnamara (Ed.), Language learning
Process models of common ground in comprehension. and thought (pp. 47–78). New York: Academic Press.
Journal of Memory and Language, 39, 1–20. Kirshner, H. S., Webb, W. G., & Kelly, M. P. (1984).
Keysar, B., & Henly, A. S. (2002). Speakers’ The naming disorder of dementia. Neuropsychologia, 22,
overestimation of their effectiveness. Psychological 23–30.
Science, 13, 207–212. Kirsner, K., Smith, M., Lockhart, R. S., King, M. L.,
Kiger, J. I., & Glass, A. L. (1983). The facilitation of & Jain, M. (1984). The bilingual lexicon: Language-
lexical decisions by a prime occurring after the target. specific units in an integrated network. Journal of
Memory and Cognition, 11, 356–365. Verbal Learning and Verbal Behavior, 23, 519–539.
Kilborn, K. (1994). Learning language late: Second Klapp, S. T. (1974). Syllable-dependent pronunciation
language acquisition in adults. In M. A. Gernsbacher latencies in number naming, a replication. Journal of
(Ed.), Handbook of psycholinguistics (pp. 917–944). Experimental Psychology, 102, 1138–1140.
San Diego, CA: Academic Press. Klapp, S. T., Anderson, W. G., & Berrian, R.
Kim, H. S. (2002). We talk, therefore we think? A cultural (1973). Implicit speech in reading, reconsidered. Journal
analysis of the effect of talking on thinking. Journal of of Experimental Psychology, 100, 368–374.
Personality and Social Psychology, 83, 828–842. Klatt, D. H. (1989). Review of selected models
Kimball, J. (1973). Seven principles of surface of speech perception. In W. Marslen-Wilson (Ed.),
structure parsing in natural language. Cognition, 2, Lexical representation and process (pp. 169–226).
15–47. Cambridge, MA: MIT Press.
Kinoshita, S., & Lupker, S. J. (2003). Masked Klee, T., & Fitzgerald, M. D. (1985). The relation
priming: State of the art. New York: Psychology Press. between grammatical development and mean length of
Kintsch, W. (1974). The representation of meaning in utterance in morphemes. Journal of Child Language,
memory. Hillsdale, NJ: Lawrence Erlbaum Associates, 12, 251–269.
Inc. Klein, W. (1986). Second language acquisition.
Kintsch, W. (1979). On modelling comprehension. Cambridge: Cambridge University Press.
Educational Psychologist, 14, 3–14. Klima, E. S., & Bellugi, U. (1979). The signs of
Kintsch, W. (1980). Semantic memory: A tutorial. language. Cambridge, MA: Harvard University Press.
In R. S. Nickerson (Ed.), Attention and performance Kluender, R., & Kutas, M. (1993). Bridging the gap:
VIII (pp. 595–620). Hillsdale, NJ: Lawrence Erlbaum Evidence from ERPs on the processing of unbounded
Associates, Inc. dependencies. Journal of Cognitive Neuroscience, 5,
Kintsch, W. (1988). The use of knowledge in 196–214.
discourse processing: A construction-integration Knutsen, D., & Le Bigot, L. (2012). Managing
model. Psychological Review, 95, 163–182. dialogue: How information availability affects
Kintsch, W. (1994). The psychology of discourse collaborative reference production. Journal of Memory
processing. In M. A. Gernsbacher (Ed.), Handbook and Language, 67, 326–341.
of psycholinguistics (pp. 721–740). San Diego, CA: Kohn, S. E. (1984). The nature of the phonological
Academic Press. disorder in conduction aphasia. Brain and Language,
Kintsch, W., & Bates, E. (1977). Recognition 23, 97–115.
memory for statements from a classroom lecture. Kohn, S. E., & Friedman, R. B. (1986). Word-
Journal of Experimental Psychology: Human Learning meaning deafness: A phonological–semantic
and Memory, 3, 187–197. dissociation. Cognitive Neuropsychology, 3, 291–308.
Kintsch, W., & Keenan, J. M. (1973). Reading Kohn, S. E., Wingfield, A., Menn, L., Goodglass, H.,
rate and retention as a function of the number of Gleason, J. B., & Hyde, M. (1987). Lexical
propositions in the base structure of sentences. retrieval: The tip-of-the-tongue phenomenon. Applied
Cognitive Psychology, 5, 257–274. Psycholinguistics, 8, 245–266.
Kintsch, W., & van Dijk, T. A. (1978). Toward Kolb, B., & Whishaw, I. Q. (2009). Fundamentals of
a model of text comprehension and production. human neuropsychology (6th ed.). New York: W. H.
Psychological Review, 85, 363–394. Freeman & Co.
Kolk, H. H. J. (1978). The linguistic interpretation of Kuhl, P. K. (1981). Discrimination of speech by non-
Broca’s aphasia: A reply to M.-L. Kean. Cognition, 6, human animals: Basic auditory sensitivities conducive
353–361. to the perception of speech-sound categories. Journal
Kolk, H. H. J. (1995). A time-based approach to of the Acoustical Society of America, 70, 340–349.
agrammatic production. Brain and Language, 50, Kursaal Flyers (1976). Little does she know/Drinking
282–303. socially. CBS 4689. Producer: Mike Batt.
Kolk, H. H. J., & van Grunsven, M. (1985). Kutas, M. (1993). In the company of other words:
Agrammatism as a variable phenomenon. Cognitive Electrophysiological evidence for single-word and
Neuropsychology, 2, 347–384. sentence context effects. Language and Cognitive
Komatsu, L. K. (1992). Recent views of conceptual Processes, 8, 533–572.
structure. Psychological Bulletin, 112, 500–526. Kutas, M., DeLong, K. A., & Smith, N. J. (2011).
Kornai, A., & Pullum, G. K. (1990). The X-bar A look around at what lies ahead: Prediction and
theory of phrase structure. Language, 66, 24–50. predictability in language processing. In M. Bar (Ed.),
Kounios, J., & Holcomb, P. J. (1992). Structure and Predictions in the brain: Using our past to generate
process in semantic memory: Evidence from event- a future (pp. 190–207). Oxford: Oxford University
related brain potentials and reaction times. Journal of Press.
Experimental Psychology: General, 121, 459–479. Kutas, M., & Hillyard, S. A. (1980). Reading
Kounios, J., & Holcomb, P. J. (1994). Concreteness senseless sentences: Brain potentials reflect semantic
effects in semantic processing: ERP evidence incongruity. Science, 207, 203–205.
supporting dual-coding theory. Journal of Kutas, M., & van Petten, C. (1994).
Experimental Psychology: Learning, Memory, and Psycholinguistics electrified: Event-related brain
Cognition, 20, 804–823. potential investigations. In M. A. Gernsbacher (Ed.),
Kraljic, T., & Brennan, S. E. (2005). Prosodic Handbook of psycholinguistics (pp. 83–143). San
disambiguation of syntactic structure: For the speaker Diego, CA: Academic Press.
or for the addressee? Cognitive Psychology, 50, Kyle, J. G., & Woll, B. (1985). Sign language: The
194–231. study of deaf people and their language. Cambridge:
Krashen, S. D. (1982). Principles and practices in Cambridge University Press.
second language acquisition. Oxford: Pergamon. La Heij, W., Hooglander, A., Kerling, R., & van
Krashen, S. D., Long, M., & Scarcella, R. (1982). der Velden, E. (1996). Nonverbal context effects
Age, rate, and eventual attainment in second language in forward and backward translation: Evidence for
acquisition. In S. D. Krashen, R. Scarcella, & M. Long concept mediation. Journal of Memory and Language,
(Eds.), Child–adult differences in second language 35, 648–665.
acquisition (pp. 161–172). Rowley, MA: Newbury Labov, W. (1973). The boundaries of words and their
House. meanings. In C.-J. Bailey & R. W. Shuy (Eds.), New
Kraus, N. (2012). Atypical brain oscillations: A ways of analyzing variations in English (pp. 340–373).
biological basis for dyslexia? Trends in Cognitive Washington, DC: Georgetown University Press.
Sciences, 16, 12–13. Labov, W., & Fanshel, D. (1977). Therapeutic
Kremin, H. (1985). Routes and strategies in surface discourse: Psychotherapy as conversation. New York:
dyslexia and dysgraphia. In K. E. Patterson, J. C. Academic Press.
Marshall, & M. Coltheart (Eds.), Surface dyslexia: Lackner, J. R., & Garrett, M. F. (1972). Resolving
Neuropsychological and cognitive studies of ambiguity: Effects of biasing context in the unattended
phonological reading (pp. 105–137). Hove, UK: ear. Cognition, 1, 359–372.
Lawrence Erlbaum Associates. Lado, R. (1957). Linguistics across cultures. Ann
Kroll, J. F., & Stewart, E. (1994). Category Arbor: University of Michigan Press.
interference in translation and picture naming: Lai, C. S. L., Fisher, S. E., Hurst, J. A., Vargha-
Evidence for asymmetric connections between Khadem, F., & Monaco, A. P. (2001). A forkhead-
bilingual memory representations. Journal of Memory domain gene is mutated in a severe speech and
and Language, 33, 149–174. language disorder. Nature, 413, 519–523.
Kruschke, J. K. (1992). ALCOVE: An exemplar- Laiacona, M., Barbarotto, R., & Capitani, E.
based connectionist model of category learning. (1993). Perceptual and associative knowledge in
Psychological Review, 99, 22–44. category specific impairment of semantic memory: A
Kucera, H., & Francis, W. N. (1967). Computational study of two cases. Cortex, 29, 727–740.
analysis of present-day American English. Providence, Laine, M., & Martin, N. (1996). Lexical retrieval
RI: Brown University Press. deficit in picture naming: Implications for word
Kuczaj, S. A. (1977). The acquisition of regular and production models. Brain and Language, 53, 283–314.
irregular past tense forms. Journal of Verbal Learning Laing, E., & Hulme, C. (1999). Phonological and
and Verbal Behavior, 16, 589–600. semantic processes influence beginning readers’ ability
to learn to read words. Journal of Experimental Child Lee, H., Rayner, K., & Pollatsek, A. (2001). The
Psychology, 73, 183–207. relative contribution of consonants and vowels to word
Lakatos, I. (1970). Falsification and the methodology identification during reading. Journal of Memory and
of scientific research programmes. In I. Lakatos & Language, 44, 189–205.
A. Musgrave (Eds.), Criticism and the growth of Lee, J. J., & Pinker, S. (2010). Rationales for
knowledge (pp. 91–196). Cambridge: Cambridge indirect speech: The theory of the strategic speaker.
University Press. Psychological Review, 117, 785–807.
Lakoff, G. (1987). Women, fire, and dangerous things. Legge, G. E., Klitz, T. S., & Tjan, B. S. (1997).
Chicago: University of Chicago Press. Mr Chips: An ideal-observer model of reading.
Lambert, W. E., Tucker, G. R., & d’Anglejan, A. Psychological Review, 104, 524–553.
(1973). Cognitive and attitudinal consequences Lenneberg, E. H. (1962). Understanding language
of bilingual schooling. Journal of Educational without ability to speak: A case report. Journal of
Psychology, 65, 141–159. Abnormal and Social Psychology, 65, 419–425.
Lambon Ralph, M. A., Ellis, A. W., & Franklin, S. Lenneberg, E. H. (1967). The biological foundations
(1995). Semantic loss without surface dyslexia. of language. New York: Wiley.
Neurocase, 1, 363–369. Lenneberg, E. H., & Roberts, J. M. (1956).
Lambon Ralph, M. A., Sage, K., & Ellis, A. W. The language of experience. Memoir 13, Indiana
(1996). Word meaning blindness: A new form of University Publications in Anthropology and
acquired dyslexia. Cognitive Neuropsychology, 13, Linguistics.
617–639. Leonard, L. B. (1989). Language learnability and
Landau, B., & Gleitman, L. R. (1985). Language specific language impairment in children. Applied
and experience: Evidence from the blind child. Psycholinguistics, 10, 179–202.
Cambridge, MA: Harvard University Press. Leonard, L. B. (2000). Children with specific
Landauer, T. K., & Dumais, S. T. (1997). A solution language impairment. Cambridge, MA: MIT Press.
to Plato’s problem: The latent semantic analysis Leonard, L. B., Newhoff, M., & Fey, M. E. (1980).
theory of acquisition, induction, and representation of Some instances of word usage without comprehension.
knowledge. Psychological Review, 104, 211–240. Journal of Child Language, 7, 186–196.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). Leopold, W. F. (1939–1949). Speech development of a
An introduction to latent semantic analysis. Discourse bilingual child: A linguist’s record (5 vols.). Evanston,
Processes, 25, 259–284. IL: Northwestern University Press.
Landauer, T. K., & Freedman, J. L. (1968). Lesch, M. F., & Martin, R. C. (1998). The
Information retrieval from long-term memory: representation of sublexical orthographic–
Category size and recognition time. Journal of Verbal phonological correspondences: Evidence from
Learning and Verbal Behavior, 7, 291–295. phonological dyslexia. Quarterly Journal of
Lane, H., & Pillard, R. (1978). The wild boy of Experimental Psychology, 51, 905–938.
Burundi. New York: Random House. Lesch, M. F., & Pollatsek, A. (1998). Evidence
Lantz, D., & Stefflre, V. (1964). Language and for the use of assembled phonology in accessing
cognition revisited. Journal of Abnormal Psychology, the meaning of words. Journal of Experimental
69, 472–481. Psychology: Learning, Memory, and Cognition, 24,
Lapointe, S. (1983). Some issues in the linguistic 573–592.
description of agrammatism. Cognition, 14, 1–39. Levelt, W. J. M. (1989). Speaking: From intention to
Lauro-Grotto, R., Piccini, C., & Shallice, T. (1997). articulation. Cambridge, MA: MIT Press.
Modality-specific operations in semantic dementia. Levelt, W. J. M. (2001). Spoken word production: A
Cortex, 33, 593–622. theory of lexical access. Proceedings of the National
Laws, G., Davies, I., & Andrews, C. (1995). Academy of Sciences, 98, 13464–13471.
Linguistic structure and non-linguistic cognition: Levelt, W. J. M. (2002). Picture naming and word
English and Russian blues compared. Language and frequency. Language and Cognitive Processes, 17,
Cognitive Processes, 10, 59–94. 663–671.
Laxon, V., Masterson, J., & Coltheart, V. (1991). Levelt, W. J. M., & Kelter, S. (1982). Surface
Some bodies are easier to read: The effect of form and memory in question answering. Cognitive
consistency and regularity on children’s reading. Psychology, 14, 78–106.
Quarterly Journal of Experimental Psychology, 43A, Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999).
793–824. A theory of lexical access in speech production.
Lee, A. C. H., Graham, K. S., Simons, J. S., & Behavioral and Brain Sciences, 22, 1–75.
Hodges, J. (2002). Regional brain activations differ Levelt, W. J. M., Schriefers, H., Vorberg, D.,
for semantic features but not categories. Neuroreport, Meyer, A. S., Pechmann, T., & Havinga, J.
13, 1497–1501. (1991a). The time course of lexical access in speech
production: A study of picture naming. Psychological Liberman, A. M., & Whalen, D. H. (2000). On the
Review, 98, 122–142. relation of speech to language. Trends in Cognitive
Levelt, W. J. M., Schriefers, H., Vorberg, D., Sciences, 4, 187–196.
Meyer, A. S., Pechmann, T., & Havinga, J. (1991b). Lidz, J., Gleitman, H., & Gleitman, L. (2003).
Normal and deviant lexical processing: Reply to Dell Understanding how input matters: Verb learning and
and O’Seaghdha (1991). Psychological Review, 98, the footprint of universal grammar. Cognition, 87,
615–618. 151–178.
Levelt, W. J. M., & Wheeldon, L. (1994). Do Lidzba, K., & Krägeloh-Mann, I. (2005).
speakers have access to a mental syllabary? Cognition, Development and lateralization of language in the
50, 239–269. presence of early brain lesions. Developmental
Levine, D. N., Calvanio, R., & Popovics, A. Medicine and Child Neurology, 47, 724.
(1982). Language in the absence of inner speech. Lieberman, P. (1963). Some effects of semantic and
Neuropsychologia, 20, 391–409. grammatical context on the production and perception
Levinson, S. (1983). Pragmatics. Cambridge: of speech. Language and Speech, 6, 172–187.
Cambridge University Press. Lieberman, P. (1975). On the origins of language.
Levinson, S. (1996a). Frames of reference and New York: Macmillan.
Molyneux’s question: Crosslinguistic evidence. In P. Lieven, E. (1994). Crosslinguistic and crosscultural
Bloom & M. Peterson (Eds.), Language and space aspects of language addressed to children. In
(pp. 109–169). Cambridge, MA: MIT Press. C. Gallaway & B. J. Richards (Eds.), Input and
Levinson, S. (1996b). Language and space. Annual interaction in language acquisition (pp. 56–73).
Review of Anthropology, 25, 353–382. Cambridge: Cambridge University Press.
Levinson, S. C., Kita, S., Haun, D. B. M., & Lieven, E., Pine, J., & Baldwin, G. (1997).
Rasch, B. H. (2002). Returning the tables: Language Lexically-based learning and early grammatical
affects spatial reasoning. Cognition, 84, 155–188. development. Journal of Child Language, 24,
Levy, B. A., Gong, Z., Hessels, S., Evans, M. A., 187–220.
& Jared, D. (2006). Understanding print: Early Lightfoot, D. (1982). The language lottery: Toward a
reading development and the contributions of home biology of grammars. Cambridge, MA: MIT Press.
literacy experiences. Journal of Experimental Child Lindell, A. K. (2006). In your right mind: Right
Psychology, 93, 63–93. hemisphere contributions to language processing
Levy, E., & Nelson, K. (1994). Words in discourse: A and production. Neuropsychology Review, 16,
dialectical approach to the acquisition of meaning and 131–148.
use. Journal of Child Language, 21, 367–389. Lindsay, P. H., & Norman, D. A. (1977). Human
Levy, Y. (1983). It’s frogs all the way down. information processing (2nd ed.). New York:
Cognition, 15, 75–93. Academic Press.
Levy, Y. (1988). The nature of early language: Linebarger, M. C. (1995). Agrammatism as evidence
Evidence from the development of Hebrew about grammar. Brain and Language, 50, 52–91.
morphology. In Y. Levy, I. M. Schlesinger, & Linebarger, M. C., Schwartz, M. F., & Saffran, E. M.
M. D. S. Braine (Eds.), Categories and processes (1983). Sensitivity to grammatical structure in so-called
in language acquisition (pp. 73–98). Hillsdale, NJ: agrammatic aphasics. Cognition, 13, 361–392.
Lawrence Erlbaum Associates, Inc. Lipson, M. Y. (1983). The influence of religious
Levy, Y., & Schlesinger, I. M. (1988). The child’s affiliation on children’s memory for text information.
early categories: Approaches to language acquisition Reading Research Quarterly, 18, 448–457.
theory. In Y. Levy, I. M. Schlesinger, & Liu, L. G. (1985). Reasoning counter-factually in
M. D. S. Braine (Eds.), Categories and processes in Chinese: Are there any obstacles? Cognition, 21,
language acquisition (pp. 261–276). Hillsdale, NJ: 239–270.
Lawrence Erlbaum Associates, Inc. Locke, J. (1690). Essay concerning human
Lewis, V. (1987). Development and handicap. Oxford: understanding (Ed. P. M. Nidditch, 1975). Oxford:
Blackwell. Clarendon.
Li, P., & Gleitman, L. (2002). Turning the tables: Locke, J. L. (1983). Phonological acquisition and
Language and spatial reasoning. Cognition, 83, 265–294. change. New York: Academic Press.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., Locke, J. L. (1997). A theory of neurolinguistic
& Studdert-Kennedy, M. (1967). Perception of the development. Brain and Language, 58, 265–326.
speech code. Psychological Review, 74, 431–461. Loebell, H., & Bock, K. (2003). Structural priming
Liberman, A. M., Harris, K. S., Hoffman, H. S., across languages. Linguistics, 41, 791–824.
& Griffith, B. C. (1957). The discrimination of Loftus, E. F. (1973). Category dominance, instance
speech sounds within and across phoneme boundaries. dominance, and categorization time. Journal of
Journal of Experimental Psychology, 53, 358–368. Experimental Psychology, 97, 70–74.
Loftus, E. F. (1975). Leading questions and the from associative priming by words, homophones,
eyewitness report. Cognitive Psychology, 7, 560–572. and pseudohomophones. Journal of Experimental
Loftus, E. F. (1996). Eyewitness testimony (reprint Psychology: General, 123, 107–128.
edition with new preface). Cambridge, MA: Harvard Lukatela, G., & Turvey, M. T. (1994b). Visual
University Press. lexical access is initially phonological: 2. Evidence
Loftus, E. F., & Palmer, J. C. (1974). Reconstruction from phonological priming by homophones and
of automobile destruction: An example of the pseudohomophones. Journal of Experimental
interaction between language and memory. Journal of Psychology: General, 123, 331–353.
Verbal Learning and Verbal Behavior, 13, 585–589. Lund, K., Burgess, C., & Atchley, R. A. (1995).
Loftus, E. F., & Zanni, G. (1975). Eyewitness Semantic and associative priming in high-dimensional
testimony: The influence of the wording of a question. semantic space. Proceedings of the 17th Annual
Bulletin of the Psychonomic Society, 5, 86–88. Conference of the Cognitive Science Society, 660–665.
Longtin, C. M., Segui, J., & Halle, P. A. (2003). Lund, K., Burgess, C., & Audet, C. (1996).
Morphological priming without morphological Dissociating semantic and associative word
relationship. Language and Cognitive Processes, 18, relationships using high-dimensional semantic space.
313–334. Proceedings of the 18th Annual Conference of the
Loosemore, R., & Harley, T. A. (2010). Brains Cognitive Science Society, 603–608.
and minds. In S. J. Hanson & M. Bunzl (Eds.), Lundberg, I., & Tornéus, M. (1978). Nonreaders’
Foundational issues in human brain mapping (pp. awareness of the basic relationship between spoken
217–240). Cambridge, MA: MIT Press. and written words. Journal of Experimental Child
Lorch, R. F., Balota, D. A., & Stamm, E. G. (1986). Psychology, 25, 404–412.
Locus of inhibition effects in the priming of lexical Lupker, S. J. (1984). Semantic priming without
decisions: Pre- or post-lexical access. Memory and association: A second look. Journal of Verbal Learning
Cognition, 9, 587–598. and Verbal Behavior, 23, 709–733.
Lounsbury, F. G. (1954). Transitional probability, Luria, A. R. (1970). Traumatic aphasia. The Hague:
linguistic structure and systems of habit-family Mouton.
hierarchies. In C. E. Osgood & T. A. Sebeok (Eds.), Lyn, H., & Savage-Rumbaugh, E. S. (2000).
Psycholinguistics: A survey of theory and research Observational word learning in two bonobos (Pan
problems (pp. 93–101). Bloomington: Indiana paniscus): Ostensive and non-ostensive contexts.
University Press. [Reprinted 1965.] Language and Communication, 20, 255–273.
Lovegrove, W., Martin, F., & Slaghuis, W. (1986). Lyons, J. (1977a). Semantics (Vol. 1). Cambridge:
A theoretical and experimental case for a visual Cambridge University Press.
deficit in specific reading disability. Cognitive Lyons, J. (1977b). Semantics (Vol. 2). Cambridge:
Neuropsychology, 3, 225–267. Cambridge University Press.
Lowenfeld, B. (1948). Effects of blindness on the Lyons, J. (1991). Chomsky (3rd ed.). London:
cognitive functions of children. Nervous Child, 7, Fontana. [First edition 1970.]
45–54. Maccoby, E., & Jacklin, C. (1974). The psychology
Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1990). of sex differences. Stanford, CA: Stanford University
Similarity neighbourhoods of spoken words. In Press.
G. T. M. Altmann (Ed.), Cognitive models of speech MacDonald, M. C. (1993). The interaction of lexical
processing (pp. 122–147). Cambridge, MA: MIT and syntactic ambiguity. Journal of Memory and
Press. Language, 32, 692–715.
Luce, R. D. (1993). Sound and hearing: A conceptual MacDonald, M. C. (1994). Probabilistic constraints
introduction. Hillsdale, NJ: Lawrence Erlbaum and syntactic ambiguity resolution. Language and
Associates, Inc. Cognitive Processes, 9, 157–201.
Lucy, J. A. (1992). Language diversity and thought. MacDonald, M. C., & Christiansen, M. H. (2002).
Cambridge: Cambridge University Press. Reassessing working memory: Comment on Just
Lucy, J. A. (1996). The scope of linguistic relativity: and Carpenter (1992) and Waters and Caplan (1996).
An analysis and review of empirical research. In Psychological Review, 109, 35–54.
J. J. Gumperz & S. C. Levinson (Eds.), Rethinking MacDonald, M. C., Just, M. A., & Carpenter, P. A.
linguistic relativity (pp. 37–69). Cambridge: (1992). Working memory constraints on the processing
Cambridge University Press. of syntactic ambiguity. Cognitive Psychology, 24,
Lucy, J. A., & Shweder, R. A. (1979). Whorf and his 56–98.
critics: Linguistic and nonlinguistic influences on colour MacDonald, M. C., Pearlmutter, N. J., &
memory. American Anthropologist, 81, 581–615. Seidenberg, M. S. (1994a). Syntactic ambiguity
Lukatela, G., & Turvey, M. T. (1994a). Visual resolution as lexical ambiguity resolution. In C.
lexical access is initially phonological: 1. Evidence Clifton, L. Frazier, & K. Rayner (Eds.), Perspectives
on sentence processing (pp. 123–153). Hillsdale, NJ: Mandler, J. M., & Johnson, N. S. (1980). On
Lawrence Erlbaum Associates, Inc. throwing out the baby with the bathwater: A reply to
MacDonald, M. C., Pearlmutter, N. J., & Black and Wilensky’s evaluation of story grammars.
Seidenberg, M. S. (1994b). The lexical nature of Cognitive Science, 4, 305–312.
syntactic ambiguity resolution. Psychological Review, Manis, F. R., McBride-Chang, C., Seidenberg, M. S.,
101, 676–703. Keating, P., Doi, L. M., Munson, B., et al. (1997). Are
MacKain, C. (1982). Assessing the role of experience speech perception deficits associated with developmental
in infant speech discrimination. Journal of Child dyslexia? Journal of Experimental Child Psychology,
Language, 9, 323–350. 66, 211–235.
MacKay, D. G. (1966). To end ambiguous sentences. Manis, F. R., Seidenberg, M. S., Doi, L. M.,
Perception and Psychophysics, 1, 426–436. McBride-Chang, C., & Petersen, A. (1996). On the
MacKay, D. G. (1973). Aspects of the theory of bases of two subtypes of developmental dyslexia.
comprehension, memory and attention. Quarterly Cognition, 58, 157–195.
Journal of Experimental Psychology, 25, 22–40. Maratsos, M. (1982). The child’s construction of
Maclay, H., & Osgood, C. E. (1959). Hesitation grammatical categories. In E. Wanner &
phenomena in spontaneous English speech. Word, 15, L. R. Gleitman (Eds.), Language acquisition: The
19–44. state of the art (pp. 240–266). Cambridge: Cambridge
Macmillan, N. A., & Creelman, C. D. (1991). University Press.
Detection theory: A user’s guide. Cambridge: Maratsos, M. (1983). Some current issues in the study
Cambridge University Press. of the acquisition of grammar. In J. H. Flavell &
Macnamara, J. (1972). Cognitive basis of language E. M. Markman (Eds.), Handbook of child psychology:
learning in infants. Psychological Review, 79, 1–13. Vol. 3. Cognitive development (pp. 707–786)
Macnamara, J. (1982). Names for things: A study of (P. H. Mussen, Series Editor). New York: Wiley.
human learning. Cambridge, MA: MIT Press. Maratsos, M. (1988). The acquisition of formal word
MacNeilage, P. F., & Davis, B. L. (2000). On the classes. In Y. Levy, I. M. Schlesinger, &
origin of internal structure of word forms. Science, M. D. S. Braine (Eds.), Categories and processes
288, 527–531. in language acquisition. Hillsdale, NJ: Lawrence
MacWhinney, B. (Ed.). (1999). The emergence Erlbaum Associates, Inc.
of language. Mahwah, NJ: Lawrence Erlbaum Maratsos, M. (1998). The acquisition of grammar.
Associates, Inc. In W. Damon, D. Kuhn, & R. S. Siegler (Eds.),
MacWhinney, B., & Leinbach, J. (1991). Handbook of child psychology (Vol. 2, 5th ed., pp.
Implementations are not conceptualizations: Revising 421–466). New York: Wiley.
the verb learning model. Cognition, 40, 121–157. Marcel, A. J. (1980). Surface dyslexia and beginning
MacWhinney, B., & Pleh, C. (1988). The processing reading: A revised hypothesis of the pronunciation of print
of restrictive relative clauses in Hungarian. Cognition, and its impairments. In M. Coltheart, K. E. Patterson, &
29, 95–141. J. C. Marshall (Eds.), Deep dyslexia (pp. 227–258).
Magiste, E. (1986). Selected issues in second and London: Routledge & Kegan Paul. [2nd ed., 1987.]
third language learning. In J. Vaid (Ed.), Language Marcel, A. J. (1983a). Conscious and unconscious
processing in bilinguals: Psycholinguistic and perception: Experiments on visual masking and word
neuropsychological perspectives (pp. 97–121). recognition. Cognitive Psychology, 15, 197–237.
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Marcel, A. J. (1983b). Conscious and unconscious
Maher, J., & Groves, J. (1999). Introducing perception: An approach to the relations between
Chomsky. Cambridge: Icon Books. [Originally phenomenal experience and perceptual processes.
published as Chomsky for beginners, 1996.] Cognitive Psychology, 15, 238–300.
Majid, A., Bowerman, M., Kita, S., Haun, D. B. M., Marchman, V. (1993). Constraints on plasticity in a
& Levinson, S. C. (2004). Can language restructure connectionist model of the English past tense. Journal
cognition? The case for space. Trends in Cognitive of Cognitive Neuroscience, 5, 215–234.
Sciences, 8, 108–113. Marchman, V. (1997). Children’s productivity in the
Malotki, E. (1983). Hopi time: A linguistic analysis English past tense: The role of frequency, phonology,
of temporal concepts in the Hopi language. Berlin: and neighborhood structure. Cognitive Science, 21,
Mouton. 283–304.
Mandler, J. M. (1978). A code in the node: The use Marchman, V., & Bates, E. (1994). Continuity in
of a story schema in retrieval. Discourse Processes, 1, lexical and morphological development: A test of the
14–35. critical mass hypothesis. Journal of Child Language,
Mandler, J. M., & Johnson, N. S. (1977). 21, 339–366.
Remembrance of things parsed: Story structure and Marcus, G. F. (1993). Negative evidence in language
recall. Cognitive Psychology, 9, 111–151. acquisition. Cognition, 46, 53–85.
Marcus, G. F. (1995). The acquisition of English Marshall, J. C., & Newcombe, F. (1966). Syntactic
past tense in children and multilayered connectionist and semantic errors in paralexia. Neuropsychologia, 4,
networks. Cognition, 56, 271–279. 169–176.
Marcus, G. F. (1999). Reply to Seidenberg and Marshall, J. C., & Newcombe, F. (1973). Patterns
Elman. Trends in Cognitive Sciences, 3, 289. of paralexia: A psycholinguistic approach. Journal of
Marcus, G. F., Ullman, M., Pinker, S., Hollander, Psycholinguistic Research, 2, 175–199.
M., Rosen, T. J., & Xu, F. (1992). Overregularization Marshall, J. C., & Newcombe, F. (1980). The
in language acquisition. Monographs of the Society for conceptual status of deep dyslexia: An historical
Research in Child Development, 57 (Serial No. 228). perspective. In M. Coltheart, K. E. Patterson, &
Marcus, G. F., Vijayan, S., Rao, S. B., & Vishton, P. M. J. C. Marshall (Eds.), Deep dyslexia (pp. 1–21).
(1999). Rule learning by seven-month-old infants. London: Routledge & Kegan Paul. [2nd ed., 1987.]
Science, 283, 77–80. Marshall, J. C., & Patterson, K. E. (1985). Left is
Marian, V., & Spivey, M. (2003). Bilingual and still left for semantic paralexias: A reply to Jones and
monolingual processing of competing lexical items. Martin. Neuropsychologia, 23, 689–690.
Applied Psycholinguistics, 24, 173–193. Marslen-Wilson, W. D. (1973). Linguistic structure
Marien, P., Engelborghs, S., Fabbro, F., & and speech shadowing at very short latencies. Nature,
De Deyn, P. P. (2001). The lateralized linguistic 244, 522–523.
cerebellum: A review and a new hypothesis. Brain and Marslen-Wilson, W. D. (1975). Sentence perception
Language, 79, 580–600. as an interactive parallel process. Science, 189,
Markman, E. M. (1979). Realizing that you don’t 226–228.
understand: Elementary school children’s awareness of Marslen-Wilson, W. D. (1976). Linguistic
inconsistencies. Child Development, 50, 643–655. descriptions and psychological assumptions in the
Markman, E. M. (1985). Why superordinate category study of sentence perception. In R. J. Wales &
terms can be mass nouns. Cognition, 19, 311–353. E. C. T. Walker (Eds.), New approaches to language
Markman, E. M. (1989). Categorization and naming mechanisms (pp. 203–230). Amsterdam: North
in children. Cambridge, MA: MIT Press. Holland.
Markman, E. M. (1990). Constraints children place Marslen-Wilson, W. D. (1984). Spoken word
on word meanings. Cognitive Science, 14, 57–77. recognition: A tutorial review. In H. Bouma &
Markman, E. M., & Hutchinson, J. E. (1984). D. G. Bouwhuis (Eds.), Attention and performance X:
Children’s sensitivity to constraints on word Control of language processes (pp. 125–150). Hove,
meaning: Taxonomic vs. thematic relations. Cognitive UK: Lawrence Erlbaum Associates.
Psychology, 16, 1–27. Marslen-Wilson, W. D. (1987). Functional
Markman, E. M., & Wachtel, G. F. (1988). parallelism in spoken word recognition. Cognition, 25,
Children’s use of mutual exclusivity to constrain 71–102.
the meaning of words. Cognitive Psychology, 20, Marslen-Wilson, W. D. (Ed.). (1989). Lexical
121–157. representation and process. Cambridge, MA: MIT
Marr, D. (1982). Vision: A computational Press.
investigation into the human representation and Marslen-Wilson, W. D. (1990). Activation,
processing of visual information. San Francisco: W. H. competition, and frequency in lexical access. In
Freeman. G. T. M. Altmann (Ed.), Cognitive models of speech
Marsh, G., Desberg, P., & Cooper, J. (1977). processing (pp. 148–172). Cambridge, MA: MIT Press.
Developmental changes in strategies of reading. Marslen-Wilson, W. D. (1993). Issues of process
Journal of Reading Behaviour, 9, 391–394. and representation in lexical access. In G. Altmann
Marsh, G., Friedman, M. P., Welch, V., & Desberg, P. & R. Shillcock (Eds.), Cognitive models of speech
(1981). A cognitive-developmental theory of reading processing (pp. 187–210). Hove, UK: Lawrence
acquisition. In T. G. Waller & G. E. Mackinnon Erlbaum Associates.
(Eds.), Reading research: Advances in theory and Marslen-Wilson, W. D., & Tyler, L. K. (1980). The
practice (Vol. 3, pp. 199–221). New York: Academic temporal structure of spoken language understanding.
Press. Cognition, 8, 1–71.
Marshall, J., Robson, J., Pring, T., & Chiat, S. Marslen-Wilson, W. D., & Tyler, L. K. (2003).
(1998). Why does monitoring fail in jargon aphasia? Capturing underlying differentiation in the human
Comprehension, judgement, and therapy evidence. language system. Trends in Cognitive Sciences, 7,
Brain and Language, 63, 79–107. 62–63.
Marshall, J. C. (1970). The biology of Marslen-Wilson, W. D., Tyler, L. K., Waksler, R.,
communication in man and animals. In J. Lyons (Ed.), & Older, L. (1994). Morphology and meaning in the
New horizons in linguistics (Vol. 1, pp. 229–242). English mental lexicon. Psychological Review, 101,
Harmondsworth, UK: Penguin. 3–33.
Marslen-Wilson, W. D., & Warren, P. (1994). Levels Martin, N., Weisberg, R. W., & Saffran, E. M.
of perceptual representation and process in lexical (1989). Variables influencing the occurrence of naming
access: Words, phonemes, and features. Psychological errors: Implications for models of lexical retrieval.
Review, 101, 653–675. Journal of Memory and Language, 28, 462–485.
Marslen-Wilson, W. D., & Welsh, A. (1978). Martin, R. C. (1982). The pseudohomophone effect:
Processing interactions and lexical access during The role of visual similarity in non-word decisions.
word recognition in continuous speech. Cognitive Quarterly Journal of Experimental Psychology, 34A,
Psychology, 10, 29–63. 395–410.
Marslen-Wilson, W. D., & Zwitserlood, P. (1989). Martin, R. C. (1993). Short-term memory and
Accessing spoken words: The importance of word sentence processing: Evidence from neuropsychology.
onsets. Journal of Experimental Psychology: Human Memory and Cognition, 21, 176–183.
Perception and Performance, 15, 576–585. Martin, R. C. (1995). Working memory doesn’t work:
Martin, A., & Fedio, P. (1983). Word production A critique of Miyake et al.’s capacity theory of aphasic
and comprehension in Alzheimer’s disease: The comprehension deficits. Cognitive Neuropsychology,
breakdown of semantic knowledge. Brain and 12, 623–636.
Language, 19, 124–141. Martin, R. C., & Breedin, S. D. (1992). Dissociations
Martin, C., Vu, H., Kellas, G., & Metcalf, K. between speech perception and phonological short-
(1999). Strength of discourse context as a determinant term memory deficits. Cognitive Neuropsychology, 9,
of the subordinate bias effect. Quarterly Journal of 509–534.
Experimental Psychology, 52A, 813–839. Martin, R. C., & Feher, E. (1990). The consequences
Martin, G. L. (2004). Encoder: A connectionist model of reduced memory span for the comprehension of
of how learning to visually encode fixated text images semantic versus syntactic information. Brain and
improves reading fluency. Psychological Review, 111, Language, 38, 1–20.
617–639. Martin, R. C., & Lesch, M. F. (1996). Associations
Martin, G. N. (1998). Human neuropsychology. and dissociations between language impairment and
London: Prentice Hall. list recall: Implications for models of STM. In S. E.
Martin, N. (2001). Repetition disorders in aphasia: Gathercole (Ed.), Models of short-term memory (pp.
Theoretical and clinical implications. In R. S. Berndt 149–178). Hove, UK: Psychology Press.
(Ed.), Handbook of neuropsychology (Vol. 3, 2nd ed., Martin, R. C., Lesch, M. F., & Bartha, M. C.
pp. 137–155). Amsterdam: Elsevier Science. (1999). Independence of input and output phonology
Martin, N., & Ayala, J. (2004). Measurements of in word processing and short-term memory. Journal of
auditory-verbal STM in aphasia: Effects of task, Memory and Language, 41, 3–29.
item and word processing impairment. Brain and Martin, R. C., Shelton, J. R., & Yaffee, L. S.
Language, 89, 464–483. (1994). Language processing and working
Martin, N., Dell, G. S., Saffran, E. M., & memory: Neuropsychological evidence for separate
Schwartz, M. F. (1994). Origins of paraphasia in phonological and semantic capacities. Journal of
deep dysphasia: Testing the consequences of a decay Memory and Language, 33, 83–111.
impairment to an interactive spreading activation model Martin, R. C., Wetzel, W. F., Blossom-Stach, C.,
of lexical retrieval. Brain and Language, 47, 609–660. & Feher, E. (1989). Syntactic loss versus processing
Martin, N., & Saffran, E. M. (1990). Repetition and deficit: An assessment of two theories of agrammatism
verbal STM in transcortical sensory aphasia: A case and syntactic comprehension deficits. Cognition, 32,
study. Brain and Language, 39, 254–288. 157–191.
Martin, N., & Saffran, E. M. (1992). A computational Masataka, N. (1996). Perception of motherese in
account of deep dysphasia: Evidence from a single case a signed language by 6-month-old deaf infants.
study. Brain and Language, 43, 240–274. Developmental Psychology, 32, 874–879.
Martin, N., & Saffran, E. M. (1997). Language and Mason, M. K. (1942). Learning to speak after six and
auditory-verbal short-term memory impairments: one half years silence. Journal of Speech and Hearing
Evidence for common underlying processes. Cognitive Disorders, 7, 295–304.
Neuropsychology, 14, 641–682. Mason, R. A., Just, M. A., Keller, T. A., &
Martin, N., & Saffran, E. M. (1998). The Carpenter, P. A. (2003). Ambiguity in the brain:
relationship between input and output phonology: What brain imaging reveals about the processing
Evidence from aphasia. Brain and Language, of syntactically ambiguous sentences. Journal of
65, 225–228. Experimental Psychology: Learning, Memory, and
Martin, N., Saffran, E. M., & Dell, G. S. (1996). Cognition, 29, 1319–1338.
Recovery in deep dysphasia: Evidence for a relation Massaro, D. W. (1987). Speech perception by ear and
between auditory-verbal STM and lexical errors in eye: A paradigm for psychological enquiry. Hillsdale,
repetition. Brain and Language, 52, 83–113. NJ: Lawrence Erlbaum Associates, Inc.
Massaro, D. W. (1989). Testing between the TRACE of a single case. Journal of Neurology, Neurosurgery,
model and the fuzzy logical model of speech and Psychiatry, 49, 1233–1240.
perception. Cognitive Psychology, 21, 398–421. McCarthy, R. A., & Warrington, E. K. (1987a). The
Massaro, D. W. (1994). Psychological aspects of double dissociation of short-term memory for lists and
speech perception: Implications for research and sentences: Evidence from aphasia. Brain, 110, 1545–1563.
theory. In M. A. Gernsbacher (Ed.), Handbook of McCarthy, R. A., & Warrington, E. K. (1987b).
psycholinguistics (pp. 219–264). San Diego, CA: Understanding: A function of short-term memory?
Academic Press. Brain, 110, 1565–1578.
Massaro, D. W., & Cohen, M. M. (1991). Integration McCarthy, R. A., & Warrington, E. K. (1988).
versus interactive activation: The joint influence Evidence for modality-specific meaning systems in the
of stimulus and context in perception. Cognitive brain. Nature, 334, 428–430.
Psychology, 23, 558–614. McCauley, C., Parmelee, C. M., Sperber, R. D., &
Massaro, D. W., & Oden, G. C. (1995). Carr, T. H. (1980). Early extraction of meaning from
Independence of lexical context and phonological pictures and its relation to conscious identification.
information in speech perception. Journal of Journal of Experimental Psychology: Human
Experimental Psychology: Learning, Memory, and Perception and Performance, 6, 265–276.
Cognition, 21, 1053–1064. McClelland, J. L. (1979). On the time relations of mental
Masson, M. E. J. (1995). A distributed memory processes: An examination of systems of processes in
model of semantic priming. Journal of Experimental cascade. Psychological Review, 86, 287–330.
Psychology: Learning, Memory, and Cognition, 21, McClelland, J. L. (1981). Retrieving general and
3–23. specific information from stored knowledge of
Masterson, J., Coltheart, M., & Meara, P. (1985). specifics. Proceedings of the 3rd Annual Conference of
Surface dyslexia in a language without irregularly spelled the Cognitive Science Society, 170–172.
words. In K. E. Patterson, J. C. Marshall, & M. Coltheart McClelland, J. L. (1987). The case for interactions in
(Eds.), Surface dyslexia: Neuropsychological and language processing. In M. Coltheart (Ed.), Attention
cognitive studies of phonological reading (pp. 215–223). and performance XII: The psychology of reading (pp.
Hove, UK: Lawrence Erlbaum Associates. 3–36). Hove, UK: Lawrence Erlbaum Associates.
Masur, E. F. (1997). Maternal labelling of novel McClelland, J. L. (1991). Stochastic interactive
and familiar objects: Implications for children’s processes and the effect of context on perception.
development of lexical constraints. Journal of Child Cognitive Psychology, 23, 1–44.
Language, 24, 427–439. McClelland, J. L., & Elman, J. L. (1986). The
Mattys, S. L., & Jusczyk, P. W. (2001). Phonotactic TRACE model of speech perception. Cognitive
cues for segmentation of fluent speech by infants. Psychology, 18, 1–86.
Cognition, 78, 91–121. McClelland, J. L., & Patterson, K. E. (2002). Rules
Mayberry, E. J., Sage, K., & Lambon Ralph, M. A. or connections in past-tense inflections: What does
(2011). At the edge of semantic space: The breakdown the evidence rule out? Trends in Cognitive Sciences, 6,
of coherent concepts in semantic dementia is 465–472.
constrained by typicality and severity but not modality. McClelland, J. L., & Patterson, K. E. (2003).
Journal of Cognitive Neuroscience, 23, 2240–2251. Differentiation and integration in human language.
Mazuka, R. (1991). Processing of empty categories Trends in Cognitive Sciences, 7, 63–64.
in Japanese. Journal of Psycholinguistic Research, 20, McClelland, J. L., & Rumelhart, D. E. (1981). An
215–232. interactive activation model of context effects in letter
McBride-Chang, C. (2004). Children’s literacy perception: Part 1. An account of the basic findings.
development. London: Arnold. Psychological Review, 88, 375–407.
McCann, R. S., & Besner, D. (1987). Reading McClelland, J. L., & Rumelhart, D. E. (1988).
pseudohomophones: Implications for models of Explorations in parallel distributed processing.
pronunciation assembly and the locus of word frequency Cambridge, MA: MIT Press.
effects in naming. Journal of Experimental Psychology: McClelland, J. L., Rumelhart, D. E., & the PDP
Human Perception and Performance, 13, 14–24. Research Group. (1986). Parallel distributed
McCarthy, J. J. (2001). A thematic guide to processing: Vol. 2. Psychological and biological
optimality theory. Cambridge: Cambridge University models. Cambridge, MA: MIT Press.
Press. McClelland, J. L., & Seidenberg, M. S. (2000).
McCarthy, J. J., & Prince, A. (1990). Foot and word Words and rules—the ingredients of language by
in prosodic morphology: The Arabic broken plural. Pinker, S. Science, 287, 47–48.
Natural Language and Linguistic Theory, 8, 209–283. McCloskey, M. (1980). The stimulus familiarity
McCarthy, R. A., & Warrington, E. K. (1986). problem in semantic memory research. Journal of
Visual associative agnosia: A clinico-anatomical study Verbal Learning and Verbal Behavior, 19, 485–504.
McCloskey, M., & Caramazza, A. (1988). Theory structures in story understanding. Journal of Memory
and methodology in cognitive neuropsychology: A and Language, 28, 711–734.
response to our critics. Cognitive Neuropsychology, 5, McKoon, G., Ratcliff, R., & Ward, G. (1994).
583–623. Testing theories of language processing: An empirical
McCloskey, M., & Glucksberg, S. (1978). Natural investigation of the on-line lexical decision task.
categories: Well-defined or fuzzy sets? Memory and Journal of Experimental Psychology: Learning,
Cognition, 6, 462–472. Memory, and Cognition, 20, 1219–1228.
McConkie, G. W., & Rayner, K. (1976). Asymmetry McLaughlin, B. (1984). Second language acquisition
of the perceptual span in reading. Bulletin of the in childhood (2nd ed.). Hillsdale, NJ: Lawrence
Psychonomic Society, 8, 365–368. Erlbaum Associates, Inc.
McCune-Nicolich, L. (1981). The cognitive bases of McLaughlin, B. (1987). Theories of second-language
relational words in the single word period. Journal of learning. London: Arnold.
Child Language, 8, 15–34. McLaughlin, B., & Heredia, R. (1996). Information-
McCutchen, D., & Perfetti, C. A. (1982). The visual processing approaches to research on second language
tongue-twister effect: Phonological activation in acquisition and use. In W. C. Ritchie & T. K. Bhatia
silent reading. Journal of Verbal Learning and Verbal (Eds.), Handbook of second language acquisition (pp.
Behavior, 21, 672–687. 213–228). London: Academic Press.
McDavid, V. (1964). The alternation of that and zero McLeod, P., Shallice, T., & Plaut, D. C. (2000).
in noun clauses. American Speech, 39, 102–113. Attractor dynamics in word recognition: Converging
McDonald, J. L., Bock, J. K., & Kelly, M. H. (1993). evidence from errors by normal subjects, dyslexic
Word order and world order: Semantic, phonological, patients and a connectionist model. Cognition, 74,
and metrical determinants of serial position. Cognitive 91–113.
Psychology, 25, 188–230. McNamara, T. P. (1992). Theories of priming: I.
McDonald, S. A., Carpenter, R. H. S., & Associative distance and lag. Journal of Experimental
Shillcock, R. C. (2005). An anatomically constrained, Psychology: Learning, Memory, and Cognition, 18,
stochastic model of eye movement control in reading. 1173–1190.
Psychological Review, 112, 814–840. McNamara, T. P. (1994). Theories of priming: II.
McGurk, H., & MacDonald, J. (1976). Hearing lips Types of prime. Journal of Experimental Psychology:
and seeing voices. Nature, 264, 746–748. Learning, Memory, and Cognition, 20, 507–520.
McKoon, G., Gerrig, R. J., & Greene, S. B. McNamara, T. P., & Altarriba, J. (1988). Depth of
(1996). Pronoun resolution without pronouns: Some spreading activation revisited: Semantic mediated
consequences of memory-based text processing. priming occurs in lexical decisions. Journal of
Journal of Experimental Psychology: Learning, Memory and Language, 27, 545–559.
Memory, and Cognition, 22, 919–932. McNamara, T. P., & Miller, D. L. (1989). Attributes
McKoon, G., & Ratcliff, R. (1986). Inferences of theories of meaning. Psychological Bulletin, 106,
about predictable events. Journal of Experimental 355–376.
Psychology: Learning, Memory, and Cognition, 12, McQueen, J. (1991). The influence of the lexicon on
82–91. phonetic categorisation: Stimulus quality and word-
McKoon, G., & Ratcliff, R. (1989). Semantic final ambiguity. Journal of Experimental Psychology:
associations and elaborative inference. Journal of Human Perception and Performance, 17, 433–443.
Experimental Psychology: Learning, Memory, and McRae, K., & Boisvert, S. (1998). Automatic semantic
Cognition, 15, 326–338. similarity priming. Journal of Experimental Psychology:
McKoon, G., & Ratcliff, R. (1992). Inference during Learning, Memory, and Cognition, 24, 558–572.
reading. Psychological Review, 99, 440–466. McRae, K., de Sa, V. R., & Seidenberg, M. S.
McKoon, G., & Ratcliff, R. (2002). Event templates (1997). On the nature and scope of featural
in the lexical representations of verbs. Cognitive representations of word meaning. Journal of
Psychology, 45, 1–44. Experimental Psychology: General, 126, 99–130.
McKoon, G., & Ratcliff, R. (2003). Meaning through McRae, K., Hare, M., & Tanenhaus, M. K. (2005).
syntax: Language comprehension and the reduced Meaning through syntax is insufficient to explain
relative clause construction. Psychological Review, comprehension of sentences with reduced relative
110, 490–525. clauses: Comment on McKoon and Ratcliff (2003).
McKoon, G., Ratcliff, R., & Dell, G. S. (1986). Psychological Review, 112, 1022–1031.
A critical evaluation of the semantic–episodic McRae, K., Spivey-Knowlton, M. J., & Tanenhaus,
distinction. Journal of Experimental Psychology: M. K. (1998). Modeling the influence of thematic
Learning, Memory, and Cognition, 12, 295–306. fit (and other constraints) in on-line sentence
McKoon, G., Ratcliff, R., & Seifert, C. M. (1989). comprehension. Journal of Memory and Language,
Making the connection: Generalized knowledge 38, 283–312.
McShane, J. (1991). Cognitive development. Oxford: category norms, and word frequency. Bulletin of the
Blackwell. Psychonomic Society, 7, 283–284.
McShane, J., & Dockrell, J. (1983). Lexical and Messer, D. (1980). The episodic structure of maternal
grammatical development. In B. Butterworth (Ed.), speech to young children. Journal of Child Language,
Speech production: Vol. 2. Development, writing, and other language processes (pp. 51–99). London: Academic Press.
McVay, J. C., & Kane, M. J. (2012). Why does working memory capacity predict variation in reading comprehension? On the influence of mind wandering and executive attention. Journal of Experimental Psychology: General, 141, 302–320.
Medin, D. L., & Schaffer, M. M. (1978). A context theory of classification learning. Psychological Review, 85, 207–238.
Mehler, J. (1963). Some effects of grammatical transformations on the recall of English sentences. Journal of Verbal Learning and Verbal Behavior, 2, 346–351.
Mehler, J., Jusczyk, P. W., Lambertz, G., Halsted, N., Bertoncini, J., & Amiel-Tison, C. (1988). A precursor of language acquisition in young infants. Cognition, 29, 143–178.
Mehler, J., Segui, J., & Carey, P. W. (1978). Tails of words: Monitoring ambiguity. Journal of Verbal Learning and Verbal Behavior, 17, 29–35.
Meier, R. P. (1991). Language acquisition by deaf children. American Scientist, 79, 60–70.
Melby-Lervåg, M., Lyster, S.-A. H., & Hulme, C. (2012). Phonological skills and their role in learning to read: A meta-analytic review. Psychological Bulletin, 138, 322–352.
Melinger, A., & Dobel, C. (2005). Lexically-driven syntactic priming. Cognition, 98, B11–B20.
Melinger, A., & Rahman, R. A. (2013). Lexical selection is competitive: Evidence from indirectly activated semantic associates during picture naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39, 348–364.
Menn, L. (1980). Phonological theory and child phonology. In G. H. Yeni-Komshian, J. F. Kavanagh, & C. A. Ferguson (Eds.), Child phonology (Vol. 1, pp. 23–41). New York: Academic Press.
Menyuk, P. (1969). Sentences children use. Cambridge, MA: MIT Press.
Menyuk, P., Menn, L., & Silber, R. (1986). Early strategies for the perception and production of words and sounds. In P. Fletcher & M. Garman (Eds.), Language acquisition (2nd ed., pp. 198–222). Cambridge: Cambridge University Press.
Meringer, R., & Mayer, K. (1895). Versprechen und Verlesen: Eine Psychologisch-Linguistische Studie. Stuttgart: Göschen.
Mervis, C. B., & Bertrand, J. (1994). Young children and adults use lexical principles to learn new nouns. Child Development, 65, 1646–1662.
Mervis, C. B., Catlin, J., & Rosch, E. (1975). Relationships among goodness-of-example,
7, 29–40.
Messer, D. (2000). State of the art: Language acquisition. The Psychologist, 13, 138–143.
Metsala, J. L., Stanovich, K. E., & Brown, G. D. A. (1998). Regularity effects and the phonological deficit model of reading disabilities: A meta-analytic review. Journal of Educational Psychology, 90, 279–293.
Meyer, A. S. (1996). Lexical access in phrase and sentence production: Results from picture–word interference experiments. Journal of Memory and Language, 35, 477–496.
Meyer, A. S. (2004). The use of eye tracking in studies of sentence generation. In J. M. Henderson & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world (pp. 191–211). Hove, UK: Psychology Press.
Meyer, A. S., & Bock, K. (1992). The tip-of-the-tongue phenomenon: Blocking or partial activation? Memory and Cognition, 20, 715–726.
Meyer, A. S., Sleiderink, A., & Levelt, W. J. M. (1998). Viewing and naming objects: Eye movements during noun phrase production. Cognition, 66, B25–B33.
Meyer, A. S., Wheeldon, L., & Krott, A. (Eds.). (2006). Automaticity and control in language processing. Hove, UK: Psychology Press.
Meyer, D. E., & Schvaneveldt, R. W. (1971). Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227–235.
Meyer, D. E., Schvaneveldt, R. W., & Ruddy, M. G. (1974). Loci of contextual effects on visual word recognition. In P. M. A. Rabbitt & S. Dornic (Eds.), Attention and performance V (pp. 98–118). New York: Academic Press.
Miceli, G., Benvegnu, B., Capasso, R., & Caramazza, A. (1997). The independence of phonological and orthographic lexical forms: Evidence from aphasia. Cognitive Neuropsychology, 14, 35–69.
Miceli, G., & Capasso, R. (1997). Semantic errors as neuropsychological evidence for the independence and the interaction of orthographic and phonological word forms. Language and Cognitive Processes, 12, 733–764.
Miceli, G., Mazzucci, A., Menn, L., & Goodglass, H. (1983). Contrasting cases of Italian agrammatic aphasia without comprehension disorder. Brain and Language, 19, 65–97.
Michaels, D. (1977). Linguistic relativity and color terminology. Language and Speech, 20, 333–343.
Milberg, W., Blumstein, S. E., & Dworetzky, B. (1987). Processing of lexical ambiguities in aphasia. Brain and Language, 31, 138–150.
Miller, D., & Ellis, A. W. (1987). Speech and writing errors in “neologistic jargonaphasia”: A lexical activation hypothesis. In M. Coltheart, G. Sartori, & R. Job (Eds.), The cognitive neuropsychology of language (pp. 235–271). Hove, UK: Lawrence Erlbaum Associates.
Miller, G. A., Heise, G. A., & Lichten, W. (1951). The intelligibility of speech as a function of the context of the test materials. Journal of Experimental Psychology, 41, 329–355.
Miller, G. A., & Johnson-Laird, P. N. (1976). Language and perception. Cambridge: Cambridge University Press.
Miller, G. A., & McKean, K. E. (1964). A chronometric study of some relations between sentences. Quarterly Journal of Experimental Psychology, 16, 297–308.
Miller, G. A., & McNeill, D. (1969). Psycholinguistics. In G. Lindzey & E. Aronson (Eds.), The handbook of social psychology (Vol. 3, pp. 666–794). Reading, MA: Addison-Wesley.
Miller, J. L. (1981). Effects of speaking rate on segmental distinctions. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech (pp. 39–74). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Miller, J. L., & Jusczyk, P. W. (1989). Seeking the neurobiological bases of speech perception. Cognition, 33, 111–137.
Miller, K. F., & Stigler, J. (1987). Counting in Chinese: Cultural variations in a basic cognitive skill. Cognitive Development, 2, 279–305.
Millis, M. L., & Button, S. B. (1989). The effect of polysemy on lexical decision time: Now you see it, now you don’t. Memory and Cognition, 17, 141–147.
Mills, A. E. (Ed.). (1983). Language acquisition in the blind child: Normal and deficient. London: Croom Helm.
Mills, A. E. (1987). The development of phonology in the blind child. In B. Dodd & R. Campbell (Eds.), Hearing by eye: The psychology of lip-reading (pp. 145–162). Hove, UK: Lawrence Erlbaum Associates.
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. (1993). Language acquisition and cerebral specialization in 20-month-old infants. Journal of Cognitive Neuroscience, 5, 317–334.
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. (1997). Language comprehension and cerebral specialization from 13 to 20 months. Developmental Neuropsychology, 13, 397–445.
Milne, R. W. (1982). Predicting garden path sentences. Cognitive Science, 6, 349–373.
Minsky, M. (1975). A framework for representing knowledge. In P. H. Winston (Ed.), The psychology of computer vision (pp. 211–277). New York: McGraw-Hill.
Mintz, T. H. (2003). Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90, 91–117.
Mintz, T. H., & Gleitman, L. R. (2002). Adjectives really do modify nouns: The incremental and restricted nature of early adjective acquisition. Cognition, 84, 267–293.
Miozzo, M. (2003). On the processing of regular and irregular forms of verbs and nouns: Evidence from neuropsychology. Cognition, 87, 101–127.
Miozzo, M., & Caramazza, A. (1997). Retrieval of lexical-syntactic features in tip-of-the-tongue states. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1410–1423.
Mitchell, D. C. (1987). Reading and syntactic analysis. In J. R. Beech & A. M. Colley (Eds.), Cognitive approaches to reading (pp. 87–112). Chichester, UK: John Wiley & Sons Ltd.
Mitchell, D. C. (1994). Sentence parsing. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 375–410). San Diego, CA: Academic Press.
Mitchell, D. C., Brysbaert, M., Grondelaers, S., & Swanepoel, P. (2000). Modifier attachment in Dutch: Testing aspects of construal theory. In A. Kennedy, R. Radach, D. Heller, & J. Pynte (Eds.), Reading as a perceptual process (pp. 493–516). Oxford: Elsevier.
Mitchell, D. C., Cuetos, F., Corley, M. M. B., & Brysbaert, M. (1995). Exposure-based models of human parsing: Evidence for the use of coarse-grained (nonlexical) statistical records. Journal of Psycholinguistic Research, 24, 469–488.
Mitchell, D. C., & Holmes, V. M. (1985). The role of specific information about the verb in parsing sentences with local structural ambiguity. Journal of Memory and Language, 24, 542–559.
Miyake, A., Carpenter, P. A., & Just, M. A. (1994). A capacity approach to syntactic comprehension disorders: Making normal adults perform like aphasic patients. Cognitive Neuropsychology, 11, 671–717.
Moerk, E. (1991). Positive evidence for negative evidence. First Language, 11, 219–251.
Mohay, H. (1982). A preliminary description of the communication systems evolved by two deaf children in the absence of a sign language model. Sign Language Studies, 34, 73–90.
Molfese, D. L. (1977). Infant cerebral asymmetry. In S. J. Segalowitz & F. A. Gruber (Eds.), Language development and neurological theory (pp. 21–35). New York: Academic Press.
Molfese, D. L., & Molfese, V. J. (1994). Short-term and long-term developmental outcomes: The use of behavioral and electrophysiological measures in early infancy as predictors. In G. Dawson & K. W. Fischer (Eds.), Human behavior and the developing brain (pp. 493–517). New York: Guilford Press.
Monaghan, J., & Ellis, A. W. (2002). What exactly interacts with spelling–sound consistency in word naming? Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 183–206.
Monaghan, P., & Ellis, A. W. (2010). Modeling reading development: Cumulative, incremental learning in a computational model of word naming. Journal of Memory and Language, 63, 506–525.
Monsell, S. (1985). Repetition and the lexicon. In A. W. Ellis (Ed.), Progress in psychology of language (Vol. 2, pp. 147–195). Hove, UK: Lawrence Erlbaum Associates.
Monsell, S. (1987). On the relation between lexical input and output pathways for speech. In A. Allport, D. MacKay, W. Prinz, & E. Scheerer (Eds.), Language perception and production: Shared mechanisms in listening, speaking, reading, and writing (pp. 273–311). London: Academic Press.
Monsell, S., Doyle, M. C., & Haggard, P. N. (1989). Effects of frequency on visual word recognition tasks: Where are they? Journal of Experimental Psychology: General, 118, 43–71.
Monsell, S., & Hirsh, K. W. (1998). Competitor priming in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1495–1520.
Monsell, S., Matthews, G. H., & Miller, D. C. (1992). Repetition of lexicalization across languages: A further test of the locus of priming. Quarterly Journal of Experimental Psychology, 44A, 763–783.
Monsell, S., Patterson, K. E., Graham, A., Hughes, C. H., & Milroy, R. (1992). Lexical and sublexical translations of spelling to sound: Strategic anticipation of lexical status. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 452–467.
Morais, J., Bertelson, P., Cary, L., & Alegria, J. (1986). Literacy training and speech segmentation. Cognition, 24, 45–64.
Morais, J., Cary, L., Alegria, J., & Bertelson, P. (1979). Does awareness of speech as a sequence of phones arise spontaneously? Cognition, 7, 323–331.
Morais, J., & Kolinsky, R. (1994). Perception and awareness in phonological processing: The case of the phoneme. Cognition, 50, 287–297.
Moreno, E. M., & Kutas, M. (2009). Processing semantic anomaly in two languages: An electrophysiological exploration in both languages of Spanish–English bilinguals. Cognitive Brain Research, 22, 205–220.
Morgan, J. L., & Travis, L. L. (1989). Limits on negative information in language input. Journal of Child Language, 16, 531–552.
Morris, A. L., & Harris, C. L. (2002). Sentence context, word recognition, and repetition blindness. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 962–982.
Morrison, C. M., Chappell, T. D., & Ellis, A. W. (1997). Age of acquisition norms for a large set of object names and their relation to adult estimates and other variables. Quarterly Journal of Experimental Psychology, 50A, 528–559.
Morrison, C. M., & Ellis, A. W. (1995). Roles of word frequency and age of acquisition in word naming and lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 116–133.
Morrison, C. M., & Ellis, A. W. (2000). Real age of acquisition effects in word naming. British Journal of Psychology, 91, 167–180.
Morrison, C. M., Ellis, A. W., & Quinlan, P. T. (1992). Age of acquisition, not word frequency, affects object naming, not object recognition. Memory and Cognition, 20, 705–714.
Morrow, D. G., Bower, G. H., & Greenspan, S. L. (1989). Updating situation models during narrative comprehension. Journal of Memory and Language, 28, 292–312.
Morrow, D. G., Greenspan, S. L., & Bower, G. H. (1987). Accessibility and situation models in narrative comprehension. Journal of Memory and Language, 26, 165–187.
Morsella, E., & Miozzo, M. (2002). Evidence for a cascade model of lexical access in speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 555–563.
Morton, J. (1969). Interaction of information in word recognition. Psychological Review, 76, 165–178.
Morton, J. (1970). A functional model for human memory. In D. A. Norman (Ed.), Models of human memory (pp. 203–260). New York: Academic Press.
Morton, J. (1979a). Word recognition. In J. Morton & J. C. Marshall (Eds.), Psycholinguistics series: Vol. 2. Structures and processes (pp. 107–156). London: Paul Elek.
Morton, J. (1979b). Facilitation in word recognition: Experiments causing change in the logogen model. In P. A. Kolers, M. E. Wrolstad, & H. Bouma (Eds.), Processing of visible language (pp. 259–268). New York: Plenum.
Morton, J. (1984). Brain-based and non-brain-based models of language. In D. Caplan, A. R. Lecours, & A. Smith (Eds.), Biological perspectives on language (pp. 40–64). Cambridge, MA: MIT Press.
Morton, J. (1985). Naming. In S. Newman & R. Epstein (Eds.), Current perspectives in dysphasia (pp. 217–230). Edinburgh: Churchill Livingstone.
Morton, J., & Patterson, K. E. (1980). A new attempt at an interpretation, or, an attempt at a new interpretation. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 91–118). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Moss, H. E., & Marslen-Wilson, W. D. (1993). Access to word meanings during spoken language comprehension: Effects of sentential semantic context. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 1254–1276.
Moss, H. E., McCormick, S. F., & Tyler, L. K. (1997). The time course of activation of semantic
information during spoken word recognition. Language and Cognitive Processes, 12, 695–731.
Moss, H. E., Ostrin, R. K., Tyler, L. K., & Marslen-Wilson, W. D. (1995). Accessing different types of lexical semantic information: Evidence from priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 863–883.
Motley, M. T., & Baars, B. J. (1976). Semantic bias effects on the outcomes of verbal slips. Cognition, 4, 177–187.
Motley, M. T., Camden, C. T., & Baars, B. J. (1982). Covert formulation and editing of anomalies in speech production: Evidence from experimentally elicited slips of the tongue. Journal of Verbal Learning and Verbal Behavior, 21, 578–594.
Mowrer, O. H. (1960). Learning theory and symbolic processes. New York: John Wiley & Sons.
Mulford, R. (1988). First words of the blind child. In M. D. Smith & J. L. Locke (Eds.), The emergent lexicon: The child’s development of a linguistic vocabulary (pp. 293–338). New York: Academic Press.
Muller, R.-A. (1997). Innateness, autonomy, universality? Neurobiological approaches to language. Behavioral and Brain Sciences, 19, 611–675.
Murphy, G. L. (1985). Processes of understanding anaphora. Journal of Memory and Language, 24, 290–303.
Murphy, G. L., & Medin, D. L. (1985). The role of theories in conceptual coherence. Psychological Review, 92, 289–316.
Murray, W. S., & Forster, K. I. (2004). Serial mechanisms in lexical access: The rank hypothesis. Psychological Review, 111, 721–756.
Muter, V., Hulme, C., Snowling, M., & Taylor, S. (1998). Segmentation, not rhyming, predicts early progress in learning to read. Journal of Experimental Child Psychology, 71, 3–27.
Muter, V., Snowling, M. J., & Taylor, S. (1994). Orthographic analogies and phonological awareness: Their role and significance in early reading development. Journal of Child Psychology and Psychiatry, 35, 293–310.
Myers, J. L., & O’Brien, E. J. (1998). Accessing the discourse representation during reading. Discourse Processes, 26, 131–157.
Nagy, W., & Anderson, R. (1984). The number of words in printed school English. Reading Research Quarterly, 19, 304–330.
Naigles, L. R. (1990). Children use syntax to learn verb meanings. Journal of Child Language, 17, 357–374.
Naigles, L. R. (1996). The use of multiple frames in verb learning via syntactic bootstrapping. Cognition, 58, 221–251.
Naigles, L. R. (2002). Form is easy, meaning is hard: Resolving a paradox in early child language. Cognition, 86, 157–199.
Naigles, L. R. (2003). Paradox lost? No, paradox found! Reply to Tomasello and Akhtar (2003). Cognition, 88, 325–329.
Nation, K., & Snowling, M. J. (1998). Semantic processing and the development of word-recognition skills: Evidence from children with reading comprehension difficulties. Journal of Memory and Language, 39, 85–101.
Navarrete, E., Basagni, B., Alario, F.-X., & Costa, A. (2006). Does word frequency affect lexical selection in speech production? Quarterly Journal of Experimental Psychology, 59, 1681–1690.
Nazzi, T., Bertoncini, J., & Mehler, J. (1998). Language discrimination by newborns: Towards an understanding of the role of rhythm. Journal of Experimental Psychology: Human Perception and Performance, 24, 756–766.
Nebes, R. D. (1989). Semantic memory in Alzheimer’s disease. Psychological Bulletin, 106, 377–394.
Neely, J. H. (1977). Semantic priming and retrieval from lexical memory: Roles of inhibitionless spreading activation and limited capacity attention. Journal of Experimental Psychology: General, 106, 226–254.
Neely, J. H. (1991). Semantic priming effects in visual word recognition: A selective review of current findings and theories. In D. Besner & G. W. Humphreys (Eds.), Basic processes in reading: Visual word recognition (pp. 264–336). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Neely, J. H., Keefe, D. E., & Ross, K. (1989). Semantic priming in the lexical decision task: Roles of prospective prime-generated expectancies and retrospective relation-checking. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1003–1019.
Negnevitsky, M. (2004). Artificial intelligence: A guide to intelligent systems. Reading, MA: Addison-Wesley.
Neisser, U. (1981). John Dean’s memory: A case study. Cognition, 9, 1–22.
Nelson, K. (1973). Structure and strategy in learning to talk. Monographs of the Society for Research in Child Development, 38 (Serial No. 149).
Nelson, K. (1974). Concept, word, and sentence: Inter-relations in acquisition and development. Psychological Review, 81, 267–285.
Nelson, K. (1987). What’s in a name? Reply to Seidenberg and Petitto. Journal of Experimental Psychology: General, 116, 293–296.
Nelson, K. (1988). Constraints on word meaning? Cognitive Development, 3, 221–246.
Nelson, K. (1990). Comment on Behrend’s “Constraints and development.” Cognitive Development, 5, 331–339.
Nelson, K., Hampson, J., & Shaw, L. K. (1993). Nouns in early lexicons: Evidence, explanations and implications. Journal of Child Language, 20, 61–84.
Nespoulous, J.-L., Dordain, M., Perron, C., Ska, B., Bub, D., Caplan, D., et al. (1988). Agrammatism in sentence production without comprehension deficits: Reduced availability of syntactic structures and/or of grammatical morphemes? A case study. Brain and Language, 33, 273–295.
Neville, H., Nicol, J. L., Barss, A., Forster, K. I., & Garrett, M. F. (1991). Syntactically based sentence processing classes: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience, 3, 151–165.
Newcombe, F., & Marshall, J. C. (1980). Transcoding and lexical stabilization in deep dyslexia. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 176–188). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Newcombe, F., & Marshall, J. C. (1985). Reading and writing by letter sounds. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia (pp. 34–51). Hove, UK: Lawrence Erlbaum Associates.
Newman, F., & Holzman, L. (Eds.). (1993). Lev Vygotsky: Revolutionary scientist. London: Routledge.
Newmark, L. (1966). How not to interfere with language learning. International Journal of American Linguistics, 32, 77–83.
Newport, E. L. (1990). Maturational constraints on language learning. Cognitive Science, 14, 11–28.
Newport, E. L., & Meier, R. P. (1985). The acquisition of American Sign Language. In D. I. Slobin (Ed.), The cross-linguistic study of language acquisition: Vol. 1. The data (pp. 882–938). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Newton, P. K., & Barry, C. (1997). Concreteness effects in word production but not word comprehension in deep dyslexia. Cognitive Neuropsychology, 14, 481–509.
Ni, W., Constable, R. T., Mencl, W. E., Pugh, K. R., Fulbright, R. K., Shaywitz, S. E., et al. (2000). An event-related neuroimaging study distinguishing form and content in sentence processing. Journal of Cognitive Neuroscience, 12, 120–133.
Ni, W., Crain, S., & Shankweiler, D. (1996). Sidestepping garden paths: Assessing the contributions of syntax, semantics, and plausibility in resolving ambiguities. Language and Cognitive Processes, 11, 283–334.
Nickels, L., & Howard, D. (1994). A frequent occurrence? Factors affecting the production of semantic errors in aphasic naming. Cognitive Neuropsychology, 11, 289–320.
Nickels, L., & Howard, D. (1995). Phonological errors in aphasic naming: Comprehension monitoring and lexicality. Cortex, 31, 209–237.
Nickels, L., Howard, D., & Best, W. (1997). Fractionating the articulatory loop: Dissociations and associations in phonological recoding in aphasia. Brain and Language, 56, 161–182.
Nicol, J. (1993). Reconsidering reactivation. In G. Altmann & R. Shillcock (Eds.), Cognitive models of speech processing (pp. 321–347). Hove, UK: Lawrence Erlbaum Associates.
Nicol, J., & Swinney, D. (1989). The role of structure in coreference assignment during sentence comprehension. Journal of Psycholinguistic Research, 18, 5–9.
Nigam, A., Hoffman, J. E., & Simons, R. F. (1992). N400 to semantically anomalous pictures and words. Journal of Cognitive Neuroscience, 4, 15–22.
Ninio, A. (1980). Ostensive definition in vocabulary teaching. Journal of Child Language, 7, 565–573.
Nisbett, R. E. (2003). The geography of thought. London: Nicholas Brealey.
Nishimura, M. (1986). Intrasentential codeswitching: The case of language assignment. In J. Vaid (Ed.), Language processing in bilinguals (pp. 123–143). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Noppeney, U., & Price, C. J. (2002). A PET study of stimulus- and task-induced semantic processing. NeuroImage, 15, 927–935.
Norman, D. A., & Rumelhart, D. E. (1975). Memory and knowledge. In D. A. Norman, D. E. Rumelhart, & the LNR Research Group (Eds.), Explorations in cognition (pp. 3–32). San Francisco: Freeman.
Norris, D. (1984). The effects of frequency, repetition, and stimulus quality in visual word recognition. Quarterly Journal of Experimental Psychology, 36A, 507–518.
Norris, D. (1986). Word recognition: Context effects without priming. Cognition, 22, 93–136.
Norris, D. (1990). A dynamic-net model of human speech recognition. In G. T. M. Altmann (Ed.), Cognitive models of speech processing (pp. 87–104). Cambridge, MA: MIT Press.
Norris, D. (1993). Bottom-up connectionist models of “interaction.” In G. Altmann & R. Shillcock (Eds.), Cognitive models of speech processing (pp. 211–234). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Norris, D. (1994a). A quantitative multiple-levels model of reading aloud. Journal of Experimental Psychology: Human Perception and Performance, 20, 1212–1232.
Norris, D. (1994b). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189–234.
Norris, D., & Brown, G. D. A. (1985). Race models and analogy theories: A dead heat? Reply to Seidenberg. Cognition, 20, 155–168.
Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299–370.
Norris, D., McQueen, J. M., & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47, 204–238.
Norris, D., McQueen, J. M., Cutler, A., & Butterfield, S. (1997). The possible-word constraint in the segmentation of continuous speech. Cognitive Psychology, 34, 191–243.
Nosofsky, R. M. (1991). Tests of an exemplar model for relating perceptual classification and recognition memory. Journal of Experimental Psychology: Human Perception and Performance, 17, 3–27.
Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based random walk model of speeded classification. Psychological Review, 104, 266–300.
Nowak, M. A. (2006). Evolutionary dynamics. Cambridge, MA: Harvard University Press.
Nozari, N., Dell, G. S., & Schwartz, M. F. (2011). Is comprehension necessary for error detection? A conflict-based account of monitoring in speech production. Cognitive Psychology, 63, 1–33.
Oakhill, J. (1994). Individual differences in children’s text comprehension. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 821–848). San Diego, CA: Academic Press.
Obler, L. (1981). Right hemisphere participation in second language acquisition. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 53–64). Rowley, MA: Newbury House.
Obler, L. K., & Hannigan, S. (1996). Neurolinguistics of second language acquisition and use. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition (pp. 509–523). London: Academic Press.
O’Brien, E. J. (1987). Antecedent search processes and the structure of text. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 278–290.
O’Brien, E. J., Cook, A. E., & Peracchi, K. A. (2004). Updating situation models: Reply to Zwaan and Madden (2004). Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 289–291.
O’Brien, E. J., Rizzella, M. L., Albrecht, J. E., & Halleran, J. G. (1998). Updating a situation model: A memory-based text processing view. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1200–1210.
Obusek, C. J., & Warren, R. M. (1973). Relation of the verbal transformation and the phonemic restoration effects. Cognitive Psychology, 5, 97–107.
Ochs, E., & Schieffelin, B. (1995). The impact of language socialization on grammatical development. In P. Fletcher & B. MacWhinney (Eds.), Handbook of child language (pp. 73–94). Oxford: Blackwell.
Oller, D. K. (1980). The emergence of sounds of speech in infancy. In G. H. Yeni-Komshian, J. F. Kavanagh, & C. A. Ferguson (Eds.), Child phonology (Vol. 1, pp. 93–112). New York: Academic Press.
Oller, D. K., Eilers, R. E., Bull, D. H., & Carney, A. E. (1985). Prespeech vocalizations of a deaf infant: A comparison with normal metaphonological processes. Journal of Speech and Hearing Research, 28, 47–63.
Oller, D. K., Wieman, L. A., Doyle, W. J., & Ross, C. (1976). Infant babbling and speech. Journal of Child Language, 3, 1–11.
Olsen, T. S., Bruhn, P., & Öberg, R. (1986). Cortical hypoperfusion as a possible cause of “subcortical aphasia.” Brain, 109, 393–410.
Olson, R. K. (1994). Language deficits in “specific” reading disability. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 895–916). San Diego, CA: Academic Press.
Olson, R. K., Kliegl, R., Davidson, B. J., & Foltz, G. (1984). Individual and developmental differences in reading disability. In G. E. MacKinnon & T. G. Waller (Eds.), Reading research: Advances in theory and practice (Vol. 4, pp. 1–64). New York: Academic Press.
Onifer, W., & Swinney, D. A. (1981). Accessing lexical ambiguities during sentence comprehension: Effects of frequency of meaning and contextual bias. Memory and Cognition, 9, 225–236.
Oppenheim, G. M. (2012). The case for subphonemic attenuation in inner speech: Comment on Corley, Brocklehurst, and Moat (2011). Journal of Experimental Psychology: Learning, Memory, and Cognition, 38, 502–512.
Oppenheim, G. M., & Dell, G. S. (2008). Inner speech slips exhibit lexical bias, but not the phonemic similarity effect. Cognition, 106, 528–537.
Orchard, G. A., & Phillips, W. A. (1991). Neural computation: A beginner’s guide. Hove, UK: Lawrence Erlbaum Associates.
Orwell, G. (1949). Nineteen eighty-four. Harmondsworth, UK: Penguin.
O’Seaghdha, P. G. (1997). Conjoint and dissociable effects of syntactic and semantic context. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 807–828.
Osgood, C. E., & Sebeok, T. A. (Eds.). (1954). Psycholinguistics: A survey of theory and research problems (pp. 93–101). Bloomington: Indiana University Press. [Reprinted 1965.]
Osterhout, L., & Holcomb, P. J. (1992). Event-related potentials elicited by syntactic anomaly. Journal of Memory and Language, 31, 785–806.
Osterhout, L., Holcomb, P. J., & Swinney, D. A. (1994). Brain potentials elicited by garden-path sentences: Evidence of the application of verb information during parsing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 786–803.
Osterhout, L., & Nicol, J. (1999). On the distinctiveness, independence, and time course of the brain responses to syntactic and semantic anomalies. Language and Cognitive Processes, 14, 283–317.
Ostrin, R. K., & Schwartz, M. F. (1986). Reconstructing from a degraded trace: A study
of sentence repetition in agrammatism. Brain and Language, 28, 328–345.
O’Sullivan, C., & Yeager, C. P. (1989). Communicative context and linguistic competence: The effects of social setting on a chimpanzee’s conversational skills. In R. A. Gardner & T. E. Van Cantfort (Eds.), Teaching sign language to chimpanzees (pp. 269–279). Albany, NY: SUNY Press.
Owens, R. E., Jr. (2004). Language development: An introduction (6th ed.). Columbus, OH: Merrill.
Paap, K. R., Newsome, S., McDonald, J. E., & Schvaneveldt, R. W. (1982). An activation-verification model for letter and word recognition: The word superiority effect. Psychological Review, 89, 573–594.
Pachella, R. G. (1974). The interpretation of reaction time in information processing research. In B. H. Kantowitz (Ed.), Human information processing: Tutorials in performance and cognition (pp. 41–82). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Paget, R. (1930). Human speech. New York: Harcourt Brace.
Paivio, A. (1971). Imagery and verbal processes. London: Holt, Rinehart & Winston.
Paivio, A., Clark, J. M., & Lambert, W. E. (1988). Bilingual dual-coding theory and semantic-repetition effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 163–172.
Paivio, A., Yuille, J. C., & Madigan, S. (1968). Concreteness, imagery, and meaningfulness values of 925 nouns. Journal of Experimental Psychology Monographs, 76, 1–25.
Palmer, J., MacLeod, C. M., Hunt, E., & Davidson, J. E. (1985). Information processing correlates of reading. Journal of Verbal Learning and Verbal Behavior, 24, 59–88.
Papafragou, A., Massey, C., & Gleitman, L. (2002). Shake, rattle, ’n’ roll: The representation of motion in language and cognition. Cognition, 84, 189–219.
Papagno, C., Valentine, T., & Baddeley, A. (1991). Phonological short-term memory and foreign-language vocabulary learning. Journal of Memory and Language, 30, 331–347.
Paquier, P. F., & Marien, P. (2005). A synthesis of the role of the cerebellum in cognition. Aphasiology, 19, 3–19.
Paradis, M. (1997). The cognitive neuropsychology of bilingualism. In A. M. B. de Groot & J. F. Kroll (Eds.), Tutorials in bilingualism: Psycholinguistic perspectives (pp. 331–354). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Parkin, A. J. (1982). Phonological recoding in lexical decision: Effects of spelling-to-sound regularity depend on how regularity is defined. Memory and Cognition, 10, 43–53.
Parkin, A. J., & Stewart, F. (1993). Category-specific impairments? No. A critique of Sartori et al. Quarterly Journal of Experimental Psychology, 46A, 505–509.
Patterson, F. (1981). The education of Koko. New York: Holt, Rinehart & Winston.
Patterson, K. E. (1980). Derivational errors. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 286–306). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Patterson, K. E., & Besner, D. (1984). Is the right hemisphere literate? Cognitive Neuropsychology, 3, 341–367.
Patterson, K. E., Graham, N., & Hodges, J. R. (1994). The impact of semantic memory loss on phonological representations. Journal of Cognitive Neuroscience, 6, 57–69.
Patterson, K. E., & Hodges, J. R. (1992). Deterioration of word meaning: Implications for reading. Neuropsychologia, 30, 1025–1040.
Patterson, K. E., Marshall, J. C., & Coltheart, M. (1985a). Surface dyslexia in various orthographies: Introduction. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading (pp. 209–214). Hove, UK: Lawrence Erlbaum Associates.
Patterson, K. E., Marshall, J. C., & Coltheart, M. (Eds.). (1985b). Surface dyslexia: Neuropsychological and cognitive studies of phonological reading. Hove, UK: Lawrence Erlbaum Associates.
Patterson, K. E., & Morton, J. (1985). From orthography to phonology: An attempt at an old interpretation. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading (pp. 335–359). Hove, UK: Lawrence Erlbaum Associates.
Patterson, K. E., Seidenberg, M. S., & McClelland, J. L. (1989). Connections and disconnections: Acquired dyslexia in a computational model of reading processes. In R. G. M. Morris (Ed.), Parallel distributed processing: Implications for psychology and neurobiology (pp. 131–181). Oxford: Clarendon Press.
Patterson, K. E., & Shewell, C. (1987). Speak and spell: Dissociations and word-class effects. In M. Coltheart, G. Sartori, & R. Job (Eds.), The cognitive neuropsychology of language (pp. 273–294). Hove, UK: Lawrence Erlbaum Associates.
Patterson, K. E., Suzuki, T., & Wydell, T. N. (1996). Interpreting a case of Japanese phonological alexia: The key is phonology. Cognitive Neuropsychology, 13, 803–822.
Patterson, K. E., Vargha-Khadem, F., & Polkey, C. E. (1989). Reading with one hemisphere. Brain, 112, 39–63.
Pearce, J. M. (2008). Animal learning and cognition (3rd ed.). Hove, UK: Lawrence Erlbaum Associates.
Peal, E., & Lambert, W. E. (1962). The relation of bilingualism to intelligence. Psychological Monographs, 76 (27, Whole No. 546).
Pearlmutter, N. J., & MacDonald, M. C. (1995). Individual differences and probabilistic constraints in syntactic ambiguity resolution. Journal of Memory and Language, 34, 521–542.
Peereman, R., & Content, A. (1997). Orthographic and phonological neighbours in naming: Not all neighbours are equally influential in orthographic space. Journal of Memory and Language, 37, 382–410.
Penfield, W., & Roberts, L. (1959). Speech and brain mechanisms. Princeton, NJ: Princeton University Press.
Pennington, B. F., & Lefly, D. L. (2001). Early reading development in children at family risk for dyslexia. Child Development, 72, 816–833.
Pepperberg, I. M. (1981). Functional vocalizations by an African grey parrot (Psittacus erithacus). Zeitschrift für Tierpsychologie, 55, 139–160.
Pepperberg, I. M. (1983). Cognition in the African grey parrot: Preliminary evidence for auditory/vocal comprehension of the class concept. Animal Learning and Behavior, 11, 179–185.
Pepperberg, I. M. (1987). Acquisition of the same/different concept by an African grey parrot (Psittacus erithacus): Learning with respect to categories of color, shape, and material. Animal Learning and Behavior, 15, 423–432.
Pepperberg, I. M. (1999). Rethinking syntax: A commentary on E. Kako’s “Elements of syntax in the systems of three language-trained animals.” Animal Learning and Behavior, 27, 15–17.
Pepperberg, I. M. (2009). Alex & me: How a scientist and a parrot discovered a hidden world of animal intelligence—and formed a deep bond in the process. New York: Harper Perennial.
Pérez-Pereira, M. (1999). Deixis, personal reference, and the use of pronouns by blind children. Journal of Child Language, 26, 655–680.
Pérez-Pereira, M., & Conti-Ramsden, G. (1999). Language development and social interaction in blind children. Hove, UK: Psychology Press.
Perfect, T. J., & Hanley, J. R. (1992). The tip-of-the-tongue phenomenon: Do experimenter-presented interlopers have any effect? Cognition, 45, 55–75.
Perfetti, C. A. (1994). Psycholinguistics and reading ability. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 849–886). San Diego, CA: Academic Press.
Perfetti, C. A., Bell, L. C., & Delaney, S. M. (1988). Automatic (prelexical) phonetic activation in silent word reading: Evidence from backward masking. Journal of Memory and Language, 27, 59–70.
Perfetti, C. A., & Zhang, S. (1991). Phonological processes in reading Chinese characters. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 633–643.
Perfetti, C. A., & Zhang, S. (1995). Very early phonological activation in Chinese reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 24–33.
Peters, P. S., & Ritchie, R. W. (1973). Context-sensitive immediate constituent analysis: Context-free language revisited. Mathematical Systems Theory, 6, 324–333.
Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. (1989). Positron emission tomographic studies of the processing of single words. Journal of Cognitive Neuroscience, 1, 153–170.
Petersen, S. E., van Mier, H., Fiez, J. A., & Raichle, M. E. (1998). The effects of practice on the functional anatomy of task performance. Proceedings of the National Academy of Sciences USA, 95, 853–860.
Peterson, R. R., & Savoy, P. (1998). Lexical selection and phonological encoding during language production: Evidence for cascaded processing. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 539–557.
Petitto, L. (1987). On the autonomy of language and gesture: Evidence from the acquisition of personal pronouns in American Sign Language. Cognition, 27, 1–52.
Petitto, L. (1988). “Language” in the prelinguistic child. In F. S. Kessel (Ed.), The development of language and language disorders (pp. 187–222). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Petitto, L. A., Holowka, S., Sergio, L. E., Levy, B., & Ostry, D. J. (2004). Baby hands that move to the rhythm of language: Hearing babies acquiring sign languages babble silently on the hands. Cognition, 93, 43–73.
Petitto, L. A., & Marentette, P. F. (1991). Babbling in the manual mode: Evidence for the ontogeny of language. Science, 251, 1493–1496.
Petrie, H. (1987). The psycholinguistics of speaking. In J. Lyons, R. Coates, M. Deuchar, & G. Gazdar (Eds.), New horizons in linguistics (Vol. 2, pp. 336–366). Harmondsworth, UK: Penguin.
Pexman, P. M., Lupker, S. J., & Jared, D. (2001). Homophone effects in lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 139–156.
Pexman, P. M., Lupker, S. J., & Reggin, L. D. (2002). Phonological effects in visual word recognition: Investigating the impact of feedback activation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 572–584.
Piaget, J. (1923). The language and thought of the child (Trans. M. Gabain, 1955). Cleveland, OH: Meridian.
Piattelli-Palmarini, M. (Ed.). (1980). Language and learning: The debate between Jean Piaget and Noam Chomsky. London: Routledge & Kegan Paul.
Piattelli-Palmarini, M. (1989). Evolution, selection, and cognition: From “learning” to parameter setting in biology and the study of language. Cognition, 31, 1–44.
Piattelli-Palmarini, M. (1994). Ever since language and learning: Afterthoughts on the Piaget–Chomsky debate. Cognition, 50, 315–346.
Pickering, M. J. (1999). Sentence comprehension. In S. Garrod & M. J. Pickering (Eds.), Language processing (pp. 123–153). Hove, UK: Psychology Press.
Pickering, M. J., & Barry, G. (1991). Sentence processing without empty categories. Language and Cognitive Processes, 6, 229–259.
Pickering, M. J., & Branigan, H. P. (1998). The representation of verbs: Evidence from syntactic priming in language production. Journal of Memory and Language, 39, 633–651.
Pickering, M. J., Branigan, H. P., & McLean, J. F. (2002). Constituent structure is formulated in one stage. Journal of Memory and Language, 46, 586–605.
Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27, 169–226.
Pickering, M. J., & Garrod, S. (2006). Do people use language production to make predictions during comprehension? Trends in Cognitive Sciences, 11, 105–110.
Pickering, M. J., & Garrod, S. (2013). An integrated theory of language production and comprehension. Behavioral and Brain Sciences, 36, 329–347.
Pickering, M. J., & Traxler, M. J. (1998). Plausibility and recovery from garden paths: An eye-tracking study. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 940–961.
Pickering, M. J., Traxler, M. J., & Crocker, M. W. (2000). Ambiguity resolution in sentence processing: Evidence against frequency-based accounts. Journal of Memory and Language, 43, 447–475.
Pickering, M. J., & van Gompel, R. P. G. (2006). Syntactic parsing. In M. J. Traxler & M. A. Gernsbacher (Eds.), The handbook of psycholinguistics (2nd ed., pp. 455–503). San Diego, CA: Elsevier.
Pine, J. M. (1994a). Environmental correlates of variation in lexical style: Interactional style and the structure of the input. Applied Psycholinguistics, 15, 355–370.
Pine, J. M. (1994b). The language of primary caregivers. In C. Gallaway & B. J. Richards (Eds.), Input and interaction in language acquisition (pp. 15–37). Cambridge: Cambridge University Press.
Pine, J. M., & Lieven, E. (1997). Lexically-based learning and early grammatical development. Journal of Child Language, 24, 187–219.
Pinker, S. (1984). Language learnability and language development. Cambridge, MA: MIT Press.
Pinker, S. (1989). Learnability and cognition. Cambridge, MA: MIT Press.
Pinker, S. (1994). The language instinct. Harmondsworth, UK: Allen Lane.
Pinker, S. (1999). Words and rules. London: Weidenfeld & Nicolson.
Pinker, S. (2001). Talk of genetics and vice versa. Nature, 413, 465–466.
Pinker, S. (2002). The blank slate. Harmondsworth: Penguin.
Pinker, S. (2003). Language as an adaptation to the cognitive niche. In M. H. Christiansen & S. Kirby (Eds.), Language evolution (pp. 16–37). Oxford: Oxford University Press.
Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13, 707–784.
Pinker, S., & Jackendoff, R. (2005). The faculty of language: What’s special about it? Cognition, 95, 201–236.
Pinker, S., & Prince, A. (1988). On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition, 28, 59–108.
Pinker, S., & Ullman, M. T. (2002). The past and future of the past tense. Trends in Cognitive Sciences, 6, 456–463, and Reply, 472–474.
Pisoni, D. B., & Tash, J. (1974). Reaction times to comparisons within and across phonetic categories. Perception and Psychophysics, 15, 285–290.
Pitchford, N., & Mullen, K. (2005). The role of perception, language, and preference in the developmental acquisition of basic colour terms. Journal of Experimental Child Psychology, 90, 275–302.
Pitt, M. A. (1995a). The locus of the lexical shift in phoneme identification. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1037–1052.
Pitt, M. A. (1995b). Data fitting and detection theory: Reply to Massaro and Oden. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1065–1067.
Pitt, M. A., & McQueen, J. M. (1998). Is compensation for coarticulation mediated by the lexicon? Journal of Memory and Language, 39, 347–370.
Plaut, D. C. (1997). Structure and function in the lexical system: Insights from distributed models of word reading and lexical decision. Language and Cognitive Processes, 12, 765–805.
Plaut, D. C., & Booth, J. R. (2000). Individual and developmental differences in semantic priming: Empirical and computational support for a single-mechanism account of lexical processing. Psychological Review, 107, 786–823.
Plaut, D. C., & McClelland, J. L. (1993). Generalizing with componential attractors: Word and nonword reading in an attractor network. In W. Kintsch (Ed.), Proceedings of the 15th Annual Conference of the Cognitive Science Society (pp. 824–829). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Plaut, D. C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. E. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56–115.
Plaut, D. C., & Shallice, T. (1993a). Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology, 10, 377–500.
Plaut, D. C., & Shallice, T. (1993b). Perseverative and semantic influences on visual object naming errors in optic aphasia: A connectionist account. Journal of Cognitive Neuroscience, 5, 89–117.
Plunkett, K., & Elman, J. L. (1997). Exercises in rethinking innateness: A handbook for connectionist simulations. Cambridge, MA: Bradford Books.
Plunkett, K., & Marchman, V. (1991). U-shaped learning and frequency effects in a multilayered perceptron: Implications for child language acquisition. Cognition, 38, 43–102.
Plunkett, K., & Marchman, V. (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition, 48, 21–69.
Poeppel, D. (1996). A critical review of PET studies of phonological processing. Brain and Language, 55, 317–351.
Poeppel, D., & Hickok, G. (2004). Towards a new functional anatomy of language. Cognition, 92, 1–12.
Polk, T. A., & Farah, M. J. (2002). Functional MRI evidence for an abstract, not perceptual, word-form area. Journal of Experimental Psychology: General, 131, 65–72.
Pollatsek, A., Bolozky, S., Well, A. D., & Rayner, K. (1981). Asymmetries in the perceptual span for Israeli readers. Brain and Language, 14, 174–180.
Posner, M. I., & Keele, S. W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353–363.
Posner, M. I., & Snyder, C. R. R. (1975). Facilitation and inhibition in the processing of signals. In P. M. A. Rabbitt & S. Dornic (Eds.), Attention and performance V (pp. 669–682). New York: Academic Press.
Postal, P. (1964). Constituent structure: A study of contemporary models of syntactic description. Bloomington, IN: Research Center for the Language Sciences.
Postma, A. (2000). Detection of errors during speech production: A review of speech monitoring models. Cognition, 77, 97–131.
Postman, L., & Keppel, G. (1970). Norms of word associations. New York: Academic Press.
Potter, J. M. (1980). What was the matter with Dr. Spooner? In V. A. Fromkin (Ed.), Errors in linguistic performance (pp. 13–34). New York: Academic Press.
Potter, M. C., & Lombardi, L. (1990). Regeneration in the short-term recall of sentences. Journal of Memory and Language, 29, 633–654.
Potter, M. C., & Lombardi, L. (1998). Syntactic priming in immediate recall of sentences. Journal of Memory and Language, 38, 265–282.
Potter, M. C., Moryadas, A., Abrams, I., & Noel, A. (1993). Word perception and misperception in context. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 3–22.
Potter, M. C., So, K. F., von Eckardt, B., & Feldman, L. B. (1984). Lexical and conceptual representation in beginning and proficient bilinguals. Journal of Verbal Learning and Verbal Behavior, 23, 23–38.
Potts, G. R., Keenan, J. M., & Golding, J. M. (1988). Assessing the occurrence of elaborative inferences: Lexical decision versus naming. Journal of Memory and Language, 27, 399–415.
Prasada, S., & Pinker, S. (1993). Generalisation of regular and irregular morphological patterns. Language and Cognitive Processes, 8, 1–56.
Prat, C. S., Mason, R. A., & Just, M. A. (2012). An fMRI investigation of analogical mapping in metaphor comprehension: The influence of context and individual cognitive capacities on processing demands. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38, 282–294.
Premack, D. (1971). Language in chimpanzee? Science, 172, 808–822.
Premack, D. (1976a). Intelligence in ape and man. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Premack, D. (1976b). Language and intelligence in ape and man. American Scientist, 64, 674–683.
Premack, D. (1985). “Gavagai!” or the future history of the animal language controversy. Cognition, 19, 207–296.
Premack, D. (1986a). Gavagai! or the future history of the animal language controversy. Cambridge, MA: MIT Press.
Premack, D. (1986b). Pangloss to Cyrano de Bergerac: “Nonsense, it’s perfect!” A reply to Bickerton. Cognition, 23, 81–88.
Premack, D. (1990). Words: What are they, and do animals have them? Cognition, 37, 197–212.
Price, C. J., & Devlin, J. T. (2003). The myth of the visual word form area. NeuroImage, 19, 473–481.
Pring, L. (1981). Phonological codes and functional spelling units: Reality and implications. Perception and Psychophysics, 30, 573–578.
Protopapas, A. (1999). Connectionist modeling of speech perception. Psychological Bulletin, 125, 410–436.
Proverbio, A. M., Cok, B., & Zani, A. (2002). Electrophysiological measures of language processing in bilinguals. Journal of Cognitive Neuroscience, 14, 994–1017.
Pullum, G. K. (1981). Languages with object before subject: A comment and a catalogue. Linguistics, 19, 147–155.
Pullum, G. K. (1989). The great Eskimo vocabulary hoax. Natural Language and Linguistic Theory, 7, 275–281.
Pulvermüller, F. (1995). Agrammatism: Behavioral description and neurobiological explanation. Journal of Cognitive Neuroscience, 7, 165–181.
Pulvermüller, F., Shtyrov, Y., & Ilmoniemi, R. J. (2003). Spatio-temporal patterns of neural language processing: An MEG study using minimum-norm current estimates. NeuroImage, 20, 1020–1025.
Pye, C. (1986). Quiché Mayan speech to children. Journal of Child Language, 13, 85–100.
Quine, W. V. O. (1960). Word and object. Cambridge, MA: MIT Press.
Quine, W. V. O. (1977). Natural kinds. In S. P. Schwartz (Ed.), Naming, necessity, and natural kinds (pp. 155–175). Ithaca, NY: Cornell University Press.
Quinlan, P. T. (1992). The Oxford psycholinguistic database. Oxford: Oxford University Press.
Quinlan, P. T., & Dyson, B. (2008). Cognitive psychology. Harlow, Essex: Pearson Education.
Quinn, P. C., & Eimas, P. D. (1986). On categorization in early infancy. Merrill-Palmer Quarterly, 32, 331–363.
Quinn, P. C., & Eimas, P. D. (1996). Perceptual organization and categorization in young infants. In C. Rovee-Collier & L. P. Lipsitt (Eds.), Advances in infancy research (Vol. 10, pp. 2–36). Norwood, NJ: Ablex.
Rack, J. P., Hulme, C., Snowling, M. J., & Wightman, J. (1994). The role of phonology in young children learning to read words: The direct-mapping hypothesis. Journal of Experimental Child Psychology, 57, 42–71.
Rack, J. P., Snowling, M. J., & Olson, R. K. (1992). The nonword reading deficit in developmental dyslexia: A review. Reading Research Quarterly, 27, 29–43.
Radford, A. (1981). Transformational syntax: A student’s guide to Chomsky’s extended standard theory. Cambridge: Cambridge University Press.
Radford, A. (1997). Syntax: A minimalist introduction. Cambridge: Cambridge University Press.
Radford, A., Atkinson, M. A., Britain, D., Clahsen, H., & Spencer, A. (1999). Linguistics. Cambridge: Cambridge University Press.
Rapp, B., Benzing, L., & Caramazza, A. (1997). The autonomy of lexical orthography. Cognitive Neuropsychology, 14, 71–104.
Rapp, B., & Caramazza, A. (1993). On the distinction between deficits of access and deficits of storage: A question of theory. Cognitive Neuropsychology, 10, 113–141.
Rapp, B., & Caramazza, A. (1998). A case of selective difficulty in writing verbs. Neurocase, 4, 127–140.
Rapp, B., & Caramazza, A. (2002). Selective difficulties with spoken nouns and written verbs: A single case study. Journal of Neurolinguistics, 15, 373–402.
Rapp, B., & Goldrick, M. (2000). Discreteness and interactivity in spoken word production. Psychological Review, 107, 460–499.
Rapp, B., & Goldrick, M. (2004). Feedback by any other name is still interactivity: A reply to Roelofs (2004). Psychological Review, 111, 573–578.
Rapp, B., & Goldrick, M. (2005). Speaking words: Contributions of cognitive neuropsychological research. Cognitive Neuropsychology, 22, 1–34.
Rapp, D. N., & Samuel, A. G. (2000). A reason to rhyme: Phonological and semantic influences on lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 564–571.
Rasmussen, T., & Milner, B. (1975). Clinical and surgical studies of the cerebral speech areas in man. In K. J. Zulch, O. Creutzfeldt, & G. C. Galbraith (Eds.), Cerebral localization (pp. 238–257). New York: Springer-Verlag.
Rasmussen, T., & Milner, B. (1977). The role of early left brain injury in determining lateralization of cerebral speech functions. Annals of the New York Academy of Sciences, 299, 355–369.
Rastle, K., & Brysbaert, M. (2006). Masked phonological priming effects in English: Are they real? Do they matter? Cognitive Psychology, 53, 97–145.
Rastle, K., & Coltheart, M. (2000). Lexical and nonlexical print-to-sound translation of disyllabic words and nonwords. Journal of Memory and Language, 42, 342–364.
Rastle, K., Davis, M. H., & New, B. (2004). The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition. Psychonomic Bulletin and Review, 11, 1090–1098.
Ratcliff, R., & McKoon, G. (1981). Does activation really spread? Psychological Review, 88, 454–462.
Ratcliff, R., & McKoon, G. (1988). A retrieval theory of priming in memory. Psychological Review, 95, 385–408.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Rayner, K., & Bertera, J. H. (1979). Reading without a fovea. Science, 206, 468–469.
Rayner, K., Binder, K. S., & Duffy, S. A. (1999). Contextual strength and the subordinate bias effect: Comment on Martin, Vu, Kellas, and Metcalf. Quarterly Journal of Experimental Psychology, 52A, 841–852.
Rayner, K., Carlson, M., & Frazier, L. (1983). The interaction of syntax and semantics during sentence processing: Eye movements in the analysis of semantically biased sentences. Journal of Verbal Learning and Verbal Behavior, 22, 358–374.
Rayner, K., & Frazier, L. (1987). Parsing temporarily ambiguous complements. Quarterly Journal of Experimental Psychology, 39A, 657–673.
Rayner, K., & Frazier, L. (1989). Selection mechanisms in reading lexically ambiguous words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 779–790.
Rayner, K., & McConkie, G. W. (1976). What governs a reader’s eye movements? Vision Research, 16, 829–837.
Rayner, K., Pacht, J. M., & Duffy, S. A. (1994). Effects of prior encounter and global discourse bias on the processing of lexically ambiguous words. Journal of Memory and Language, 33, 527–544.
Rayner, K., & Pollatsek, A. (1989). The psychology of reading. Englewood Cliffs, NJ: Prentice Hall.
Rayner, K., Pollatsek, A., & Binder, K. S. (1998). Phonological codes and eye movements in reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 476–497.
Rayner, K., Well, A. D., & Pollatsek, A. (1980). Asymmetry of the effective visual field in reading. Perception and Psychophysics, 27, 537–544.
Read, C. (1975). Children’s categorization of speech sounds in English. Urbana, IL: National Council of Teachers of English.
Read, C., Zhang, Y., Nie, H., & Ding, B. (1986). The ability to manipulate speech sounds depends on knowing alphabetic writing. Cognition, 24, 31–44.
Reber, A. S., & Anderson, J. R. (1970). The perception of clicks in linguistic and nonlinguistic messages. Perception and Psychophysics, 8, 81–89.
Redington, M., & Chater, N. (1998). Connectionist and statistical approaches to language acquisition: A distributional perspective. Language and Cognitive Processes, 13, 129–191.
Redlinger, W., & Park, T. Z. (1980). Language mixing in young bilinguals. Journal of Child Language, 7, 337–352.
Rees, G., Russell, C., Frith, C. D., & Driver, J. (1999). Inattentional blindness versus inattentional amnesia for fixated but ignored words. Science, 286, 2504–2507.
Reicher, G. M. (1969). Perceptual recognition as a function of meaningfulness of stimulus materials. Journal of Experimental Psychology, 81, 274–280.
Reichle, E. D., Rayner, K., & Pollatsek, A. (1999). Eye movement control in reading: Accounting for initial fixation locations and refixations within the E-Z Reader model. Vision Research, 39, 4403–4411.
Reichle, E. D., Rayner, K., & Pollatsek, A. (2003). The E-Z Reader model of eye-movement control in reading: Comparisons to other models. Behavioral and Brain Sciences, 26, 445–526.
Remez, R., & Pisoni, D. (Eds.). (2005). Handbook of speech perception. Oxford: Blackwell.
Rescorla, L. (1980). Overextension in early language development. Journal of Child Language, 7, 321–335.
Richards, B. J., & Gallaway, C. (1994). Conclusions and directions. In C. Gallaway & B. J. Richards (Eds.), Input and interaction in language acquisition (pp. 253–269). Cambridge: Cambridge University Press.
Richards, M. M. (1979). Sorting out what’s in a word from what’s not: Evaluating Clark’s semantic features acquisition theory. Journal of Experimental Child Psychology, 27, 1–47.
Richardson, D. C., & Dale, R. (2005). Looking to understand: The coupling between speakers’ and listeners’ eye movements and its relationship to discourse comprehension. Cognitive Science, 29, 1045–1060.
Riddoch, M. J., & Humphreys, G. W. (1987). Visual object processing in optic aphasia: A case of semantic access agnosia. Cognitive Neuropsychology, 4, 131–185.
Riddoch, M. J., Humphreys, G. W., Coltheart, M., & Funnell, E. (1988). Semantic systems or system? Neuropsychological evidence re-examined. Cognitive Neuropsychology, 5, 3–25.
Rigalleau, F., & Caplan, D. (2000). Effects of gender marking in pronominal coindexation. Quarterly Journal of Experimental Psychology, 53A, 23–52.
Rinck, M., & Bower, G. H. (1995). Anaphora resolution and the focus of attention in situation models. Journal of Memory and Language, 34, 110–131.
Rips, L. J. (1995). The current status of research on concept combination. Mind and Language, 10, 72–104.
Rips, L. J., & Collins, A. (1993). Categories and resemblance. Journal of Experimental Psychology: General, 122, 468–486.
Rips, L. J., Shoben, E. J., & Smith, E. E. (1973). Semantic distance and the verification of semantic relations. Journal of Verbal Learning and Verbal Behavior, 12, 1–20.
Rips, L. J., Smith, E. E., & Shoben, E. J. (1975). Set-theoretic and network models reconsidered: A comment on Hollan’s “Features and semantic memory.” Psychological Review, 82, 156–157.
Ritchie, W. C., & Bhatia, T. K. (Eds.). (1996). Handbook of second language acquisition. London: Academic Press.
Rivas, E. (2005). Recent use of signs by chimpanzees. Journal of Comparative Psychology, 119, 404–417.
Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3, 131–141.
Roberson, D., Davies, I., & Davidoff, J. (2000). Color categories are not universal: Replications and new evidence from a stone-age culture. Journal of Experimental Psychology: General, 129, 369–398.
Roberts, B., & Kirsner, K. (2000). Temporal cycles in speech production. Language and Cognitive Processes, 15, 129–157.
Robinson, P. (2001). Individual differences, cognitive abilities and aptitude complexes. Second Language Research, 17, 368–392.
Rochford, G. (1971). Study of naming errors in dysphasic and in demented patients. Neuropsychologia, 9, 437–443.
Rochon, E., Waters, G. S., & Caplan, D. (1994). Sentence comprehension in patients with Alzheimer’s disease. Brain and Language, 46, 329–349.
Rodd, J., Gaskell, G., & Marslen-Wilson, W. (2002). Making sense of semantic ambiguity: Semantic competition in lexical access. Journal of Memory and Language, 46, 245–266.
Rodriguez-Fornells, A., Rotte, M., Heinze, H. J., Nosselt, T., & Munte, T. (2002). Brain potential and functional MRI evidence for how to handle two languages with one brain. Nature, 415, 1026–1029.
Roediger, H. L., & Blaxton, T. A. (1987). Retrieval modes produce dissociations in memory for surface information. In D. S. Gorfein & R. R. Hoffman (Eds.), Memory and cognitive processes (pp. 349–377). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition, 42, 107–142.
Roelofs, A. (1997a). Syllabification in speech production: Evaluation of WEAVER. Language and Cognitive Processes, 12, 657–693.
Roelofs, A. (1997b). The WEAVER model of word-form encoding in speech production. Cognition, 64, 249–284.
Roelofs, A. (2002). Spoken language planning and the initiation of articulation. Quarterly Journal of Experimental Psychology, 55A, 465–483.
Roelofs, A. (2004a). Error biases in spoken word planning and monitoring by aphasic and nonaphasic speakers: Comment on Rapp and Goldrick (2000). Psychological Review, 111, 561–572.
Roelofs, A. (2004b). Comprehension-based versus production-internal feedback in planning spoken words: A rejoinder to Rapp and Goldrick (2000). Psychological Review, 111, 579–580.
Roelofs, A., & Meyer, A. S. (1998). Metrical structure in planning the production of spoken words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 922–939.
Roelofs, A., Meyer, A. S., & Levelt, W. J. M. (1998). A case for the lemma/lexeme distinction in models of speaking: Comment on Caramazza and Miozzo (1997). Cognition, 69, 219–230.
Roeltgen, D. P. (1987). Loss of deep dyslexic reading ability from a second left-hemisphere lesion. Archives of Neurology, 44, 346–348.
Rogalsky, C., & Hickok, G. (2011). The role of Broca’s area in sentence comprehension. Journal of Cognitive Neuroscience, 23, 1664–1680.
Rogers, T. T., Lambon Ralph, M. A., Garrard, P., Bozeat, S., McClelland, J. L., Hodges, J. R., et al. (2004). Structure and deterioration of semantic memory: A neuropsychological and computational investigation. Psychological Review, 111, 205–235.
Rogers, T. T., & McClelland, J. L. (2004). Semantic cognition: A parallel distributed processing approach. Cambridge, MA: MIT Press.
Rohde, D. L. T., & Plaut, D. C. (1999). Language acquisition in the absence of explicit negative evidence: How important is starting small? Cognition, 72, 67–109.
Rolnick, M., & Hoops, H. R. (1969). Aphasia as seen by the aphasic. Journal of Speech and Hearing Disorders, 34, 48–53.
Romaine, S. (1995). Bilingualism (2nd ed.). Oxford: Blackwell.
Romani, C. (1992). Are there distinct input and output buffers? Evidence from an aphasic patient with an impaired output buffer. Language and Cognitive Processes, 7, 131–162.
Romani, C., & Martin, R. C. (1999). A deficit in the short-term retention of lexical-semantic information: Forgetting words but remembering a story. Journal of Experimental Psychology: General, 128, 56–77.
Rosch, E. (1973). Natural categories. Cognitive Psychology, 4, 328–350.
Rosch, E. (1978). Principles of categorization. In E. Rosch & B. Lloyd (Eds.), Cognition and categorization (pp. 27–48). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573–605.
Rosch, E., Mervis, C. B., Gray, W., Johnson, D., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.
Rosnow, R. L., & Rosnow, M. (1992). Writing papers in psychology (2nd ed.). New York: Wiley.
Ross, B. H., & Bower, G. H. (1981). Comparisons of models of associative recall. Memory and Cognition, 9, 1–16.
Rosson, M. B. (1983). From SOFA to LOUCH: Lexical contributions to pseudoword pronunciation. Memory and Cognition, 11, 152–160.
Roy, D. (2005). Grounding words in perception and action: Computational insights. Trends in Cognitive Sciences, 9, 389–395.
Rubenstein, H., Lewis, S. S., & Rubenstein, M. A. (1971). Evidence for phonemic recoding in visual word recognition. Journal of Verbal Learning and Verbal Behavior, 10, 645–658.
Rubin, D. C. (1980). 51 properties of 125 words: A unit analysis of verbal behavior. Journal of Verbal Learning and Verbal Behavior, 19, 736–755.
Rubin, J. (1968). National bilingualism in Paraguay. The Hague: Mouton.
Rumelhart, D. E. (1975). Notes on a schema for stories. In D. G. Bobrow & A. M. Collins (Eds.), Representation and understanding: Studies in cognitive science (pp. 211–236). New York: Academic Press.
Rumelhart, D. E. (1977). Understanding and summarizing brief stories. In D. LaBerge & S. J. Samuels (Eds.), Basic processes in reading: Perception and comprehension (pp. 265–303). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Rumelhart, D. E. (1980). On evaluating story grammars. Cognitive Science, 4, 313–316.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89, 60–94.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tense of English verbs. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group, Parallel distributed processing: Vol. 2. Psychological and biological models (pp. 216–271). Cambridge, MA: MIT Press.
Rumelhart, D. E., McClelland, J. L., & the PDP Research Group. (1986). Parallel distributed processing: Vol. 1. Foundations. Cambridge, MA: MIT Press.
Ruml, W., & Caramazza, A. (2000). An evaluation of a computational model of lexical access: Comment on Dell et al. (1997). Psychological Review, 107, 609–634.
Ruml, W., Caramazza, A., Shelton, J. R., & Chialant, D. (2000). Testing assumptions in computational theories of aphasia. Journal of Memory and Language, 43, 217–248.
Rymer, R. (1993). Genie. London: Joseph.
Sacchett, C., & Humphreys, G. W. (1992). Calling a squirrel a squirrel but a canoe a wigwam: A category-specific deficit for artefactual objects and body parts. Cognitive Neuropsychology, 9, 73–86.
Sachs, J., Bard, B., & Johnson, M. L. (1981). Language learning with restricted input: Case studies of two hearing children of deaf parents. Applied Psycholinguistics, 2, 33–54.
Sachs, J. S. (1967). Recognition memory for syntactic and semantic aspects of connected discourse. Perception and Psychophysics, 2, 437–442.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking in conversation. Language, 50, 696–735.
Saffran, E. M. (1990). Short-term memory impairments and language processing. In A. Caramazza (Ed.), Cognitive neuropsychology and neurolinguistics (pp. 137–168). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Saffran, E. M., Bogyo, L. C., Schwartz, M. F., & Marin, O. S. M. (1980). Does deep dyslexia reflect right hemisphere reading? In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 381–406). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Saffran, E. M., Marin, O. S. M., & Yeni-Komshian, G. H. (1976). An analysis of speech perception in word deafness. Brain and Language, 3, 209–228.
Saffran, E. M., & Martin, N. (1997). Effects of structural priming on sentence production in aphasia. Language and Cognitive Processes, 12, 877–882.
Saffran, E. M., & Schwartz, M. (1994). Of cabbages and things: Semantic memory from a neuropsychological perspective – a tutorial review. In C. Umilta & M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing (pp. 507–536). Cambridge, MA: MIT Press.
Saffran, E. M., Schwartz, M. F., & Linebarger, M. C. (1998). Semantic influences on thematic role assignment: Evidence from normals and aphasics. Brain and Language, 62, 255–297.
Saffran, E. M., Schwartz, M. F., & Marin, O. S. M. (1976). Semantic mechanisms in paralexia. Brain and Language, 3, 255–265.
Saffran, E. M., Schwartz, M. F., & Marin, O. S. M. (1980). Evidence from aphasia: Isolating the components of a production model. In B. Butterworth (Ed.), Language production: Vol. 1. Speech and talk (pp. 221–241). London: Academic Press.
Saffran, J. R. (2001). The use of predictive dependencies in language learning. Journal of Memory and Language, 44, 493–515.
Saffran, J. R. (2002). Constraints on statistical language learning. Journal of Memory and Language, 47, 172–196.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928.
Saffran, J. R., Werker, J. F., & Werner, L. A. (2006). The infant’s auditory world: Hearing, speech, and the beginnings of language. In R. Siegler & D. Kuhn (Eds.), Handbook of child development (6th ed., pp. 58–108). New York: Wiley.
Salamoura, A., & Williams, J. N. (2006). Lexical activation of cross-language syntactic priming. Bilingualism: Language and Cognition, 9, 299–307.
Samuel, A. G. (1981). Phonemic restoration: Insights from a new methodology. Journal of Experimental Psychology: General, 110, 474–494.
Samuel, A. G. (1987). The effect of lexical uniqueness on phonemic restoration. Journal of Memory and Language, 26, 36–56.
Samuel, A. G. (1990). Using perceptual-restoration effects to explore the architecture of perception. In G. T. M. Altmann (Ed.), Cognitive models of speech processing (pp. 295–314). Cambridge, MA: MIT Press.
Samuel, A. G. (1996). Does lexical information influence the perceptual restoration of phonemes? Journal of Experimental Psychology: General, 125, 28–51.
Samuel, A. G. (1997). Lexical activation produces potent phonemic percepts. Cognitive Psychology, 32, 97–127.
Sandra, D. (1990). On the representation and processing of compound words: Automatic access to constituent morphemes does not occur. Quarterly Journal of Experimental Psychology, 42A, 529–567.
Sanford, A. J. (1985). Cognition and cognitive psychology. London: Weidenfeld & Nicolson.
Sanford, A. J., & Garrod, S. C. (1981). Understanding written language. Chichester, UK: John Wiley.
Santa, J. L., & Ranken, H. B. (1972). Effects of verbal coding on recognition memory. Journal of Experimental Psychology, 93, 268–278.
Sartori, G., & Job, R. (1988). The oyster with four legs: A neuropsychological study on the interaction of visual and semantic information. Cognitive Neuropsychology, 5, 105–132.
Sartori, G., Miozzo, M., & Job, R. (1993). Category-specific impairments? Yes. Quarterly Journal of Experimental Psychology, 46A, 489–504.
Sasanuma, S. (1980). Acquired dyslexia in Japanese: Clinical features and underlying mechanisms. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 48–90). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Sasanuma, S., Ito, H., Patterson, K., & Ito, T. (1996). Phonological alexia in Japanese: A case study. Cognitive Neuropsychology, 13, 823–848.
Savage, C., Lieven, E., Theakston, A., & Tomasello, M. (2003). Testing the abstractness of young children’s linguistic representations: Lexical and structural priming of syntactic constructions. Developmental Science, 6, 557–567.
Savage, G. R., Bradley, D. C., & Forster, K. I. (1990). Word frequency and the pronunciation task: The contribution of articulatory fluency. Language and Cognitive Processes, 5, 203–236.
Savage, R. S. (1997). Do children need concurrent prompts in order to use lexical analogies in reading? Journal of Child Psychology and Psychiatry, 38, 235–246.
Savage-Rumbaugh, E. S. (1987). Communication, symbolic communication, and language: A reply to Seidenberg and Petitto. Journal of Experimental Psychology: General, 116, 288–292.
Savage-Rumbaugh, E. S., & Lewin, R. (1994). Kanzi: At the brink of the human mind. New York: Wiley.
Savage-Rumbaugh, E. S., McDonald, K., Sevcik, R. A., Hopkins, W. D., & Rupert, E. (1986). Spontaneous symbol acquisition and communicative use by pygmy chimpanzees (Pan paniscus). Journal of Experimental Psychology: General, 115, 211–235.
Savage-Rumbaugh, E. S., Murphy, J., Sevcik, R. A., Brakke, K. E., Williams, S. L., & Rumbaugh, D. M. (1993). Language comprehension in ape and child. Monographs of the Society for Research in Child Development, 58 (Whole Nos. 3–4).
Savage-Rumbaugh, E. S., Pate, J. L., Lawson, J., Smith, T., & Rosenbaum, S. (1983). Can a chimpanzee make a statement? Journal of Experimental Psychology: General, 112, 457–492.
Savage-Rumbaugh, E. S., Rumbaugh, D. M., & Boysen, S. (1978). Linguistically mediated tool use and exchange by chimpanzees. Behavioral and Brain Sciences, 1, 539–554.
Savin, H. B., & Bever, T. G. (1970). The nonperceptual reality of the phoneme. Journal of Verbal Learning and Verbal Behavior, 9, 295–302.
Savin, H. B., & Perchonock, E. (1965). Grammatical structure and the immediate recall of English sentences. Journal of Verbal Learning and Verbal Behavior, 4, 348–353.
Saxton, M. (1997). The contrast theory of negative input. Journal of Child Language, 24, 139–161.
Scarborough, D. L., Cortese, C., & Scarborough, H. S. (1977). Frequency and repetition effects in lexical memory. Journal of Experimental Psychology: Human Perception and Performance, 3, 1–17.
Scarborough, D. L., Gerard, L., & Cortese, C. (1984). Independence of lexical access in bilingual word recognition. Journal of Verbal Learning and Verbal Behavior, 23, 84–99.
Schaeffer, B., & Wallace, R. (1969). Semantic similarity and the comprehension of word meanings. Journal of Experimental Psychology, 82, 343–346.
Schaeffer, B., & Wallace, R. (1970). The comparison of word meanings. Journal of Experimental Psychology, 86, 144–152.
Schaeffer, H. R. (1975). Social development in infancy. In R. Lewin (Ed.), Child alive (pp. 32–39). London: Temple Smith.
Schank, R. C. (1972). Conceptual dependency: A theory of natural language understanding. Cognitive Psychology, 3, 552–631.
Schank, R. C. (1975). Conceptual information processing. Amsterdam: North Holland.
Schank, R. C. (1982). Dynamic memory. Cambridge: Cambridge University Press.
Schank, R. C., & Abelson, R. (1977). Scripts, plans, goals and understanding. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Schenkein, J. (1980). A taxonomy for repeating action sequences in natural conversation. In B. Butterworth (Ed.), Language production: Vol. 1. Speech and talk (pp. 21–48). London: Academic Press.
Schiff-Myers, N. (1993). Hearing children of deaf parents. In D. Bishop & K. Mogford (Eds.), Language development in exceptional circumstances (pp. 47–61). Hove, UK: Lawrence Erlbaum Associates.
Schiller, N. O., & Caramazza, A. (2003). Grammatical feature selection in noun phrase production: Evidence from German and Dutch. Journal of Memory and Language, 48, 169–194.
Schiller, N. O., & Costa, A. (2006). Different selection principles of free-standing and bound morphemes in language production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 1201–1207.
Schilling, H. E. H., Rayner, K., & Chumbley, J. I. (1998). Comparing naming, lexical decision, and eye fixation times: Word frequency effects and individual differences. Memory and Cognition, 26, 1270–1281.
Schlesinger, H. S., & Meadow, K. P. (1972). Sound and sign: Childhood deafness and mental health. Berkeley: University of California Press.
Schlesinger, I. M. (1971). Production of utterances and language acquisition. In D. I. Slobin (Ed.), The ontogenesis of grammar (pp. 63–102). New York: Academic Press.
Schlesinger, I. M. (1988). The origin of relational categories. In Y. Levy, I. M. Schlesinger, & M. D. S. Braine (Eds.), Categories and processes in language acquisition (pp. 121–178). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: I. Detection, search and attention. Psychological Review, 84, 1–66.
Schnur, T. T., Costa, A., & Caramazza, A. (2006). Planning at the phonological level during sentence production. Journal of Psycholinguistic Research, 35, 189–213.
Schober, M. F., & Clark, H. H. (1989). Understanding by addressees and overhearers. Cognitive Psychology, 21, 211–232.
Schreuder, R., & Baayen, R. H. (1997). How complex simplex words can be. Journal of Memory and Language, 37, 118–139.
Schriefers, H., Jescheniak, J. D., & Hantsch, A. (2005). Selection of gender-marked morphemes in speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 159–168.
Schriefers, H., Meyer, A. S., & Levelt, W. J. M. (1990). Exploring the time course of lexical access in language production: Picture–word interference studies. Journal of Memory and Language, 29, 86–102.
Schriefers, H., & Teruel, E. (2000). Grammatical gender in noun phrase production: The gender interference effect in German. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1368–1377.
Schriefers, H., Teruel, E., & Meinshausen, R. M. (1998). Producing simple sentences: Results from picture–word interference experiments. Journal of Memory and Language, 39, 609–632.
Schuberth, R. E., & Eimas, P. D. (1977). Effects of context on the classification of words and nonwords. Journal of Experimental Psychology: Human Perception and Performance, 3, 27–36.
Schustack, M. W., Ehrlich, S. F., & Rayner, K. (1987). The complexity of contextual facilitation in reading: Local and global influences. Journal of Memory and Language, 26, 322–340.
Schvaneveldt, R. W., Meyer, D. E., & Becker, C. A. (1976). Lexical ambiguity, semantic context, and visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 2, 243–256.
Schwanenflugel, P. J., & LaCount, K. L. (1988). Semantic relatedness and the scope of facilitation for upcoming words in sentences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 344–354.
Schwanenflugel, P. J., & Rey, M. (1986). Interlingual semantic facilitation: Evidence for a common representational system in the bilingual lexicon. Journal of Memory and Language, 25, 605–618.
Schwartz, M. F. (1984). What the classical aphasia categories can’t do for us, and why. Brain and Language, 21, 3–8.
Schwartz, M. F. (1987). Patterns of speech production deficit within and across aphasia syndromes: Application of a psycholinguistic model. In M. Coltheart, G. Sartori, & R. Job (Eds.), The cognitive neuropsychology of language (pp. 163–199). Hove, UK: Lawrence Erlbaum Associates.
Schwartz, M. F. (Ed.). (1990). Modular deficits in Alzheimer-type dementia. Cambridge, MA: MIT Press.
Schwartz, M. F., & Chawluk, J. B. (1990). Deterioration of language in progressive aphasia: A case study. In M. F. Schwartz (Ed.), Modular deficits in Alzheimer-type dementia (pp. 245–296). Cambridge, MA: MIT Press.
Schwartz, M. F., Dell, G. S., Martin, N., Gahl, S., & Sobel, P. (2006). A case-series test of the interactive two-step model of lexical access: Evidence from picture naming. Journal of Memory and Language, 54, 228–264.
Schwartz, M. F., Linebarger, M., Saffran, E., & Pate, D. (1987). Syntactic transparency and sentence interpretation in aphasia. Language and Cognitive Processes, 2, 85–113.
Schwartz, M. F., Marin, O. S. M., & Saffran, E. M. (1979). Dissociations of language function in dementia: A case study. Brain and Language, 7, 277–306.
Schwartz, M. F., Saffran, E. M., Bloch, D. E., & Dell, G. S. (1994). Disordered speech production in aphasic and normal speakers. Brain and Language, 47, 52–88.
Schwartz, M. F., Saffran, E. M., & Marin, O. S. M. (1980a). Fractionating the reading process in dementia: Evidence for word-specific print-to-sound associations. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 259–269). London: Routledge & Kegan Paul.
Schwartz, M. F., Saffran, E. M., & Marin, O. S. M. (1980b). The word order problem in agrammatism I: Comprehension. Brain and Language, 10, 249–262.
Scoresby-Jackson, R. E. (1867). Case of aphasia with right hemiplegia. Edinburgh Medical Journal, 12, 696–706.
Schyns, P. G., Goldstone, R. L., & Thibaut, J.-P. (1998). The development of features in object concepts. Behavioral and Brain Sciences, 21, 1–53.
Searle, J. R. (1969). Speech acts. Cambridge: Cambridge University Press.
Searle, J. R. (1975). Indirect speech acts. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics: Vol. 3. Speech acts (pp. 59–82). New York: Academic Press.
Searle, J. R. (1979). Metaphor. In A. Ortony (Ed.), Metaphor and thought (pp. 92–123). Cambridge: Cambridge University Press.
Sedivy, J. C., Tanenhaus, M. K., Chambers, C. G., & Carlson, G. N. (1999). Achieving incremental semantic interpretation through contextual representation. Cognition, 71, 109–147.
Seidenberg, M. S. (1988). Cognitive neuropsychology and language: The state of the art. Cognitive Neuropsychology, 5, 403–426.
Seidenberg, M. S. (2011). What causes dyslexia? Comment on Goswami. Trends in Cognitive Sciences, 15, 2.
Seidenberg, M. S., & Elman, J. L. (1999). Networks are not “hidden rules.” Trends in Cognitive Sciences, 3, 288–289.
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed developmental model of word recognition. Psychological Review, 96, 523–568.
Seidenberg, M. S., & McClelland, J. L. (1990). More words but still no lexicon. Reply to Besner et al. (1990). Psychological Review, 97, 447–452.
Seidenberg, M. S., Petersen, A., MacDonald, M. C., & Plaut, D. C. (1996). Pseudohomophone effects and models of word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 48–62.
Seidenberg, M. S., & Petitto, L. A. (1979). Signing behavior in apes: A critical review. Cognition, 7, 177–215.
Seidenberg, M. S., & Petitto, L. A. (1987). Communication, symbolic communication, and language: Comment on Savage-Rumbaugh, McDonald, Sevcik, Hopkins, and Rupert (1986). Journal of Experimental Psychology: General, 116, 279–287.
Seidenberg, M. S., Plaut, D. C., Petersen, A., McClelland, J. L., & McRae, K. (1994). Nonword pronunciation and models of word recognition. Journal of Experimental Psychology: Human Perception and Performance, 20, 1177–1196.
Seidenberg, M. S., Tanenhaus, M. K., Leiman, J. M., & Bienkowski, M. (1982). Automatic access of the meanings of ambiguous words in context: Some limitations of knowledge-based processing. Cognitive Psychology, 14, 489–537.
Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23, 383–404.
Seidenberg, M. S., Waters, G. S., Sanders, M., & Langer, P. (1984). Pre- and post-lexical loci of contextual effects on word recognition. Memory and Cognition, 12, 315–328.
Seifert, C. M., McKoon, G., Abelson, R. P., & Ratcliff, R. (1986). Memory connections between thematically similar episodes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 220–231.
Seifert, C. M., Robertson, S. P., & Black, J. B. (1985). Types of inference generated during reading. Journal of Memory and Language, 24, 405–422.
Semenza, C., & Zettin, M. (1988). Generating proper names: A case of selective inability. Cognitive Neuropsychology, 5, 711–721.
Seymour, P. H. K. (1987). Individual cognitive analysis of competent and impaired reading. British Journal of Psychology, 78, 483–506.
Seymour, P. H. K. (1990). Developmental dyslexia. In M. W. Eysenck (Ed.), Cognitive psychology: An international review (pp. 135–196). Chichester, UK: John Wiley.
Seymour, P. H. K., & Elder, L. (1986). Beginning reading without phonology. Cognitive Neuropsychology, 3, 1–36.
Seymour, P. H. K., & Evans, H. M. (1994). Levels of phonological awareness and learning to read. Reading and Writing, 6, 221–250.
Shafto, M., Burke, D., Stamatakis, E., Tam, P., & Tyler, L. (2007). On the tip-of-the-tongue: Neural correlates of increased word-finding failures in normal aging. Journal of Cognitive Neuroscience, 19, 2060–2070.
Shallice, T. (1981). Phonological agraphia and the lexical route in writing. Brain, 104, 413–429.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge: Cambridge University Press.
Shallice, T. (1993). Multiple semantics: Whose confusions? Cognitive Neuropsychology, 10, 251–261.
Shallice, T., & Butterworth, B. (1977). Short-term memory impairment and spontaneous speech. Neuropsychologia, 15, 729–735.
Shallice, T., & McCarthy, R. (1985). Phonological reading: From patterns of impairment to possible procedure. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading (pp. 361–397). Hove, UK: Lawrence Erlbaum Associates.
Shallice, T., & McGill, J. (1978). The origins of mixed errors. In J. Requin (Ed.), Attention and performance VII (pp. 193–208). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Shallice, T., McLeod, P., & Lewis, K. (1985). Isolating cognitive modules with the dual task paradigm: Are speech perception and production separate processes? Quarterly Journal of Experimental Psychology, 37A, 507–532.
Shallice, T., Rumiati, R. I., & Zadini, A. (2000). The selective impairment of the phonological output buffer. Cognitive Neuropsychology, 17, 517–546.
Shallice, T., & Warrington, E. K. (1975). Word recognition in a phonemic dyslexic patient. Quarterly Journal of Experimental Psychology, 27, 187–199.
Shallice, T., & Warrington, E. K. (1977). Auditory-verbal short-term memory impairment and conduction aphasia. Brain and Language, 4, 479–491.
Shallice, T., & Warrington, E. K. (1980). Single and multiple component central deep dyslexic syndromes. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 199–245). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Shallice, T., Warrington, E. K., & McCarthy, R. (1983). Reading without semantics. Quarterly Journal of Experimental Psychology, 35A, 111–138.
Shanker, S. G., Savage-Rumbaugh, E. S., & Taylor, T. J. (1999). Kanzi: A new beginning. Animal Learning and Behavior, 27, 24–25.
Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communication. Urbana: University of Illinois Press.
Shapiro, K., & Caramazza, A. (2003). The representation of grammatical categories in the brain. Trends in Cognitive Sciences, 7, 201–206.
Share, D. L. (1995). Phonological recoding and self-teaching: Sine qua non of reading acquisition. Cognition, 55, 151–218.
Sharkey, A. J. C., & Sharkey, N. E. (1992). Weak contextual constraints in text and word priming. Journal of Memory and Language, 31, 543–572.
Sharpe, K. (1992). Communication, culture, context, confidence: The four Cs of primary modern language teaching. Language Learning Journal, 6, 13–14.
Shattuck, R. (1980). The forbidden experiment. New York: Kodansha International.
Shattuck-Hufnagel, S. (1979). Speech errors as evidence for a serial ordering mechanism in speech production. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 295–342). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Shatz, M., Diesendruck, G., Martinez-Beck, I., & Akar, D. (2003). The influence of language and socioeconomic status on children’s understanding of false belief. Developmental Psychology, 39, 717–729.
Shatz, M., & Gelman, R. (1973). The development of communication skills: Modifications in the speech of young children as a function of the listener. Monographs of the Society for Research in Child Development, 152.
Shaywitz, B. A., Shaywitz, S. E., Pugh, K. R., Constable, R. T., Skudlarski, P., Fulbright, R. K., et al. (1995). Sex differences in the functional organization of the brain for language. Nature, 373, 607–609.
Sheldon, A. (1974). The role of parallel function in the acquisition of relative clauses in English. Journal of Verbal Learning and Verbal Behavior, 13, 272–281.
Shelton, J. R., & Caramazza, A. (1999). Deficits in lexical and semantic processing: Implications for models of normal language. Psychonomic Bulletin and Review, 6, 5–27.
Shelton, J. R., & Martin, R. C. (1992). How semantic is automatic semantic priming? Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1191–1209.
Shelton, J. R., & Weinrich, M. (1997). Further evidence of a dissociation between output phonological and orthographic lexicons: A case study. Cognitive Neuropsychology, 14, 105–129.
Sheridan, J., & Humphreys, G. W. (1993). A verbal-semantic category-specific recognition impairment. Cognitive Neuropsychology, 10, 143–184.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Shoben, E. J., & Gagne, C. L. (1997). Thematic relations and the creation of combined concepts. In T. B. Ward, S. M. Smith, & J. Vaid (Eds.), Creative thought: An investigation of creative structures and processes (pp. 31–50). Washington, DC: American Psychological Association.
Siegel, L. S. (1998). Phonological processing deficits and reading disabilities. In J. L. Metsala & L. C. Ehri (Eds.), Word recognition and beginning literacy (pp. 141–160). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Silverberg, S., & Samuel, A. G. (2004). The effect of age of second language acquisition on the representation and processing of second language words. Journal of Memory and Language, 51, 381–398.
Silveri, M. C., & Gainotti, G. (1988). Interaction between vision and language in category-specific semantic impairment. Cognitive Neuropsychology, 5, 677–709.
Simpson, G. B. (1981). Meaning dominance and semantic context in the processing of lexical ambiguity. Journal of Verbal Learning and Verbal Behavior, 20, 120–136.
Simpson, G. B. (1994). Context and the processing of ambiguous words. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 359–374). San Diego, CA: Academic Press.
Simpson, G. B., & Burgess, C. (1985). Activation and solution processes in the recognition of ambiguous
words. Journal of Experimental Psychology: Human Perception and Performance, 11, 28–39.
Simpson, G. B., & Krueger, M. A. (1991). Selective access of homograph meanings in sentence context. Journal of Memory and Language, 30, 627–643.
Sinclair-de-Zwart, H. (1969). Developmental psycholinguistics. In D. Elkind & J. H. Flavell (Eds.), Studies in cognitive development (pp. 315–366). Oxford: Oxford University Press.
Sinclair-de-Zwart, H. (1973). Language acquisition and cognitive development. In T. E. Moore (Ed.), Cognitive development and the acquisition of language (pp. 9–26). New York: Academic Press.
Singer, M. (1994). Discourse inference processes. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 479–516). San Diego, CA: Academic Press.
Singer, M., & Ferreira, F. (1983). Inferring consequences in story comprehension. Journal of Verbal Learning and Verbal Behavior, 22, 437–448.
Singer, M., Graesser, A. C., & Trabasso, T. (1994). Minimal or global inference in comprehension. Journal of Memory and Language, 33, 421–441.
Singh, J. A. L., & Zingg, R. M. (1942). Wolf children and feral man. Hamden, CT: Shoe String Press. [Reprinted 1966, New York: Harper & Row.]
Sitton, M., Mozer, M. C., & Farah, M. J. (2000). Superadditive effects of multiple lesions in connectionist architecture: Implications for the neuropsychology of optic aphasia. Psychological Review, 107, 709–734.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Skinner, B. F. (1957). Verbal behavior. New York: Appleton-Century-Crofts.
Skoyles, J., & Skottun, B. C. (2004). On the prevalence of magnocellular deficits in the visual system of non-dyslexic individuals. Brain and Language, 88, 79–82.
Skuse, D. H. (1993). Extreme deprivation in early childhood. In D. Bishop & K. Mogford (Eds.), Language development in exceptional circumstances (pp. 29–46). Hove, UK: Lawrence Erlbaum Associates.
Slobin, D. I. (1966a). Grammatical transformations and sentence comprehension in childhood and adulthood. Journal of Verbal Learning and Verbal Behavior, 5, 219–227.
Slobin, D. I. (1966b). The acquisition of Russian as a native language. In F. Smith & G. A. Miller (Eds.), The genesis of a language: A psycholinguistic approach (pp. 129–248). Cambridge, MA: MIT Press.
Slobin, D. I. (1970). Universals of grammatical development in children. In G. Flores d’Arcais & W. J. M. Levelt (Eds.), Advances in psycholinguistics (pp. 174–186). Amsterdam: North Holland.
Slobin, D. I. (1973). Cognitive prerequisites for the development of grammar. In C. A. Ferguson & D. I. Slobin (Eds.), Studies of child language development (pp. 175–208). New York: Holt, Rinehart & Winston.
Slobin, D. I. (1981). The origins of grammatical encoding of events. In W. Deutsch (Ed.), The child’s construction of language (pp. 185–199). London: Academic Press.
Slobin, D. I. (1985). Crosslinguistic evidence for the language-making capacity. In D. I. Slobin (Ed.), The crosslinguistic study of language acquisition: Vol. 2. Theoretical issues (pp. 1157–1249). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Smith, E. E. (1988). Concepts and thought. In R. J. Sternberg (Ed.), The psychology of human thought (pp. 19–49). Cambridge: Cambridge University Press.
Smith, E. E., & Medin, D. L. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.
Smith, E. E., Shoben, E. J., & Rips, L. J. (1974). Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review, 81, 214–241.
Smith, M., & Wheeldon, L. (1999). High level processing scope in spoken sentence production. Cognition, 73, 205–246.
Smith, M., & Wheeldon, L. (2004). Horizontal information flow in spoken sentence production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 675–686.
Smith, N., & Tsimpli, I.-M. (1995). The mind of a savant: Language learning and modularity. Oxford: Blackwell.
Smith, N. V. (1973). The acquisition of phonology: A case study. Cambridge: Cambridge University Press.
Smith, P. T., & Sterling, C. M. (1982). Factors affecting the perceived morphophonemic structure of written words. Journal of Verbal Learning and Verbal Behavior, 21, 704–721.
Smith, S. M., Brown, H. O., Thomas, J. E. P., & Goodman, L. S. (1947). The lack of cerebral effects of d-tubocurarine. Anesthesiology, 8, 1–14.
Snedeker, J., & Trueswell, J. C. (2003). Using prosody to avoid ambiguity: Effects of speaker awareness and referential context. Journal of Memory and Language, 48, 103–130.
Snedeker, J., & Trueswell, J. C. (2004). The developing constraints on parsing decisions: The role of lexical biases and referential scenes in child and adult sentence processing. Cognitive Psychology, 49, 238–299.
Snodgrass, J. G. (1984). Concepts and their surface representation. Journal of Verbal Learning and Verbal Behavior, 23, 3–22.
Snodgrass, J. G., & Vanderwart, M. (1980). A standardised set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6, 174–215.
Snow, C. E. (1972). Mothers’ speech to children learning language. Child Development, 43, 549–565.
Snow, C. E. (1977). The development of conversation between mothers and babies. Journal of Child Language, 4, 1–22.
Snow, C. E. (1983). Age differences in second language acquisition: Research findings and folk psychology. In K. Bailey, M. Long, & S. Peck (Eds.), Second language acquisition studies (pp. 141–150). Rowley, MA: Newbury House.
Snow, C. E. (1993). Bilingualism and second language acquisition. In J. B. Gleason & N. B. Ratner (Eds.), Psycholinguistics (pp. 391–416). Fort Worth, TX: Harcourt Brace Jovanovich.
Snow, C. E. (1994). Beginning from baby talk: Twenty years of research on input and interaction. In C. Gallaway & B. J. Richards (Eds.), Input and interaction in language acquisition (pp. 3–12). Cambridge: Cambridge University Press.
Snow, C. E. (1995). Issues in the study of input: Finetuning, universality, individual and developmental differences, and necessary causes. In P. Fletcher & B. MacWhinney (Eds.), The handbook of child language (pp. 180–193). Oxford: Blackwell.
Snow, C. E., & Hoefnagel-Hohle, M. (1978). The critical period for language acquisition: Evidence from second language learning. Child Development, 49, 1114–1128.
Snowden, J. S., Goulding, P. J., & Neary, D. (1989). Semantic dementia: A form of circumscribed cerebral atrophy. Behavioural Neurology, 2, 167–182.
Snowling, M. J. (1983). The comparison of acquired and developmental disorders of reading. Cognition, 14, 105–118.
Snowling, M. J. (1987). Dyslexia: A cognitive development perspective. Oxford: Blackwell.
Snowling, M. J. (2000). Dyslexia (2nd ed.). Oxford: Blackwell.
Sokolov, J. L., & Snow, C. E. (1994). The changing role of negative evidence in theories of language development. In C. Gallaway & B. J. Richards (Eds.), Input and interaction in language acquisition (pp. 38–55). Cambridge: Cambridge University Press.
Solomon, E. S., & Pearlmutter, N. J. (2004). Semantic integration and syntactic planning in language production. Cognitive Psychology, 49, 1–46.
Spelke, E. S. (1994). Initial knowledge: Six suggestions. Cognition, 50, 443–447.
Spender, D. (1980). Man made language. London: Routledge & Kegan Paul.
Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Oxford: Blackwell.
Sperber, D., & Wilson, D. (1987). Précis of Relevance: Communication and cognition. Behavioral and Brain Sciences, 10, 697–754.
Sperber, R. D., McCauley, C., Ragain, R. D., & Weil, C. M. (1979). Semantic priming effects on picture and word processing. Memory and Cognition, 7, 339–345.
Spiro, R. J. (1977). Constructing a theory of reconstructive memory: The state of the schema approach. In R. C. Anderson, R. J. Spiro, & W. E. Montague (Eds.), Schooling and the acquisition of knowledge (pp. 137–177). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Spivey, M. J., & Marian, V. (1999). Crosstalk between native and second languages: Partial activation of an irrelevant lexicon. Psychological Science, 10, 281–284.
Spivey, M. J., McRae, K., & Joanisse, M. F. (2012). The Cambridge handbook of psycholinguistics. Cambridge: Cambridge University Press.
Spivey, M. J., & Tanenhaus, M. K. (1998). Syntactic ambiguity resolution in discourse: Modeling the
Snowling, M. J., Bryant, P. E., & Hulme, C. (1996). effects of referential context and lexical frequency.
Theoretical and methodological pitfalls in making Journal of Experimental Psychology: Learning,
comparisons between developmental and acquired Memory, and Cognition, 24, 1521–1543.
dyslexia: Some comments on A. Castles and M. Spivey, M. J., Tanenhaus, M. K., Eberhard, K. M.,
Coltheart (1993). Reading and Writing, 8, 443–451. & Sedivy, J. C. (2002). Eye movements and spoken
Snowling, M. J., Gallagher, A., & Frith, U. (2003). language comprehension: Effects of visual context on
Family risk of dyslexia is continuous: Individual syntactic ambiguity resolution. Cognitive Psychology,
differences in the precursors of reading skill. Child 45, 447–481.
Development, 74, 358–373. Stabler, E. P. (1983). How are grammars represented?
Snowling, M. J., & Hulme, C. (1989). A longitudinal Behavioral and Brain Sciences, 6, 391–421.
case study of developmental phonological dyslexia. Stager, C. L., & Werker, J. F. (1997). Infants listen
Cognitive Neuropsychology, 6, 379–401. for more phonetic detail in speech perception than in
Snowling, M. J., & Hulme, C. (Eds.). (2007). The word-learning tasks. Nature, 388, 381–382.
science of reading: A handbook. Oxford: Blackwell. Stamenov, M. I., & Gallese, V. (Eds.). (2002). Mirror
Snowling, M. J., Stackhouse, J., & Rack, J. neurons and the evolution of brain and language
(1986). Phonological dyslexia and dysgraphia: A (Advances in consciousness research 42). Amsterdam:
developmental analysis. Cognitive Neuropsychology, John Benjamins.
3, 309–339. Stanners, R. F., Jastrzembski, J. E., & Westwood, A.
Soja, N. N., Carey, S., & Spelke, E. S. (1992). (1975). Frequency and visual quality in a word–
Perception, ontology, and word meaning. Cognition, nonword classification task. Journal of Verbal
45, 101–107. Learning and Verbal Behavior, 14, 259–264.
Stanovich, K. E., & Bauer, D. W. (1978). Experiments on the spelling-to-sound regularity effect in word recognition. Memory and Cognition, 6, 410–415.
Stanovich, K. E., Siegel, L. S., & Gottardo, A. (1997). Converging evidence for phonological and surface subtypes of reading disability. Journal of Educational Psychology, 89, 114–127.
Stanovich, K. E., Siegel, L. S., Gottardo, A., Chiappe, P., & Sidhu, R. (1997). Subtypes of developmental dyslexia: Differences in phonological and orthographic coding. In B. A. Blachman (Ed.), Foundations of reading acquisition and dyslexia: Implications for early intervention (pp. 115–141). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Stanovich, K. E., & West, R. F. (1979). Mechanisms of sentence context effects in reading: Automatic activation and conscious attention. Memory and Cognition, 6, 115–123.
Stanovich, K. E., & West, R. F. (1981). The effect of sentence context on ongoing word recognition: Tests of a two-process theory. Journal of Experimental Psychology: Human Perception and Performance, 7, 658–672.
Stanovich, K. E., West, R. F., & Harrison, M. R. (1995). Knowledge growth and maintenance across the life span: The role of print exposure. Developmental Psychology, 31, 811–826.
Stark, R. E. (1986). Prespeech segmental feature development. In P. Fletcher & M. Garman (Eds.), Language acquisition (2nd ed., pp. 149–173). Cambridge: Cambridge University Press.
Starreveld, P. A., & La Heij, W. (1995). Semantic interference, orthographic facilitation, and their interaction in naming tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 686–698.
Starreveld, P. A., & La Heij, W. (1996). Time course analysis of semantic and orthographic context effects in picture naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 896–918.
Steffensen, M. S., Joag-dev, C., & Anderson, R. C. (1979). A cross-cultural perspective on reading comprehension. Reading Research Quarterly, 15, 10–29.
Stein, J. (2003). Visual motion sensitivity and reading. Neuropsychologia, 41, 1785–1793.
Stemberger, J. P. (1983). Distant context effects in language production: A reply to Motley et al. Journal of Psycholinguistic Research, 12, 555–560.
Stemberger, J. P. (1984). Structural errors in normal and agrammatic speech. Cognitive Neuropsychology, 1, 281–313.
Stemberger, J. P. (1985). An interactive activation model of language production. In A. W. Ellis (Ed.), Progress in the psychology of language (Vol. 1, pp. 143–186). Hove, UK: Lawrence Erlbaum Associates.
Sternberg, S., Knoll, R. L., Monsell, S., & Wright, C. E. (1988). Motor programs and hierarchical organization in the control of rapid speech. Phonetica, 45, 175–197.
Stevens, K. N. (1960). Toward a model for speech recognition. Journal of the Acoustical Society of America, 32, 47–55.
Stevenson, R. (1988). Models of language development. Milton Keynes, UK: Open University Press.
Stewart, A. J., Pickering, M. J., & Sanford, A. J. (2000). The time course of the influence of implicit causality information: Focusing versus integration account. Journal of Memory and Language, 42, 423–443.
Stewart, F., Parkin, A. J., & Hunkin, N. M. (1992). Naming impairments following recovery from herpes simplex encephalitis: Category-specific? Quarterly Journal of Experimental Psychology, 44A, 261–284.
Stewart, I. (1989). Does God play dice? The new mathematics of chaos. Harmondsworth, UK: Penguin.
Stirling, J. (2002). Introducing neuropsychology. Hove, UK: Psychology Press.
Storms, G., De Boeck, P., & Ruts, W. (2000). Prototype and exemplar-based information in natural language categories. Journal of Memory and Language, 42, 51–73.
Strain, E., Patterson, K., & Seidenberg, M. S. (1995). Semantic effects in single-word naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1140–1154.
Strain, E., Patterson, K., & Seidenberg, M. S. (2002). Theories of word naming interact with spelling–sound consistency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 207–214.
Sturt, P., Costa, F., Lombardo, V., & Frasconi, P. (2003). Learning first-pass structural attachment preferences with dynamic grammars and recursive neural networks. Cognition, 88, 133–169.
Sudhalter, V., & Braine, M. D. S. (1985). How does comprehension of passives develop? Journal of Child Language, 12, 455–470.
Sulin, R. A., & Dooling, D. J. (1974). Intrusion of a thematic idea in retention of prose. Journal of Experimental Psychology, 103, 255–262.
Summerfield, Q. (1981). Articulatory rate and perceptual constancy in phonetic perception. Journal of Experimental Psychology: Human Perception and Performance, 7, 1074–1095.
Swain, M., & Wesche, M. (1975). Linguistic interaction: Case study of a bilingual child. Language Sciences, 17, 17–22.
Swinney, D. A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18, 545–569.
Swinney, D. A., & Cutler, A. (1979). The access and processing of idiomatic expressions. Journal of Verbal Learning and Verbal Behavior, 18, 523–534.
Swinney, D. A., Zurif, E. B., & Cutler, A. (1980). Effects of sentential stress and word class upon comprehension in Broca’s aphasics. Brain and Language, 10, 132–144.
Sykes, J. L. (1940). A study of the spontaneous vocalizations of young deaf children. Psychological Monograph, 52, 104–123.
Tabor, W., & Hutchins, S. (2004). Evidence for self-organised sentence processing: Digging-in effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 431–450.
Tabor, W., Juliano, C., & Tanenhaus, M. K. (1997). Parsing in a dynamical system: An attractor-based account of the interaction of lexical and structural constraints in sentence processing. Language and Cognitive Processes, 12, 211–271.
Tabor, W., & Tanenhaus, M. K. (1999). Dynamical models of sentence processing. Cognitive Science, 23, 491–515.
Tabossi, P. (1988a). Accessing lexical ambiguity in different types of sentential context. Journal of Memory and Language, 27, 324–340.
Tabossi, P. (1988b). Effects of context on the immediate interpretation of unambiguous words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 153–162.
Tabossi, P., & Zardon, F. (1993). Processing ambiguous words in context. Journal of Memory and Language, 32, 359–372.
Taft, M. (1979). Recognition of affixed words and the word frequency effect. Memory and Cognition, 7, 263–272.
Taft, M. (1981). Prefix stripping revisited. Journal of Verbal Learning and Verbal Behavior, 20, 289–297.
Taft, M. (1982). An alternative to grapheme–phoneme conversion rules? Memory and Cognition, 10, 465–474.
Taft, M. (1984). Evidence for abstract lexical representation of word structure. Memory and Cognition, 12, 264–269.
Taft, M. (1985). The decoding of words in lexical access: A review of the morphographic approach. In D. Besner, T. G. Waller, & G. E. MacKinnon (Eds.), Reading research: Advances in theory and practice (Vol. 5, pp. 83–123). Orlando, FL: Academic Press.
Taft, M. (1987). Morphographic processing: The BOSS re-emerges. In M. Coltheart (Ed.), Attention and performance XII: The psychology of reading (pp. 265–279). Hove, UK: Lawrence Erlbaum Associates.
Taft, M. (2004). Morphological decomposition and the reverse base frequency effect. Quarterly Journal of Experimental Psychology, 57A, 745–765.
Taft, M., & Forster, K. I. (1975). Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior, 14, 638–647.
Taft, M., & van Graan, F. (1998). Lack of phonological mediation in a semantic categorization task. Journal of Memory and Language, 38, 203–224.
Tager-Flusberg, H. (1999). Language development in atypical children. In M. Barrett (Ed.), The development of language (pp. 311–348). Hove, UK: Psychology Press.
Tallal, P., Townsend, J., Curtiss, S., & Wulfeck, B. (1991). Phenotypic profiles of language-impaired children based on genetic/family history. Brain and Language, 41, 81–95.
Tanaka, J. W., & Taylor, M. (1991). Object categories and expertise: Is the basic level in the eye of the beholder? Cognitive Psychology, 23, 457–482.
Tanenhaus, M. K., Boland, J. E., Mauner, G. A., & Carlson, G. N. (1993). More on combinatory lexical information: Thematic structure in parsing and interpretation. In G. Altmann & R. Shillcock (Eds.), Cognitive models of speech processing (pp. 297–319). Hove, UK: Lawrence Erlbaum Associates.
Tanenhaus, M. K., Carlson, G. N., & Trueswell, J. C. (1989). The role of thematic structure in interpretation and parsing. Language and Cognitive Processes, 4, 211–234.
Tanenhaus, M. K., Leiman, J. M., & Seidenberg, M. S. (1979). Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. Journal of Verbal Learning and Verbal Behavior, 18, 427–440.
Tanenhaus, M. K., & Lucas, M. (1987). Context effects in lexical processing. Cognition, 25, 213–234.
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632–1634.
Tannenbaum, P. H., Williams, F., & Hillier, C. S. (1965). Word predictability in the environments of hesitations. Journal of Verbal Learning and Verbal Behavior, 4, 134–140.
Taraban, R., & McClelland, J. L. (1988). Constituent attachment and thematic role assignment in sentence processing: Influences of content-based expectations. Journal of Memory and Language, 27, 597–632.
Tarshis, B. (1992). Grammar for smart people. New York: Pocket Books.
Taylor, I., & Taylor, M. M. (1990). Psycholinguistics: Learning and using language. Englewood Cliffs, NJ: Prentice Hall International.
Taylor, M., & Gelman, S. A. (1988). Adjectives and nouns: Children’s strategies for learning new words. Child Development, 59, 411–419.
Temple, C. M. (1987). The nature of normality, the deviance of dyslexia and the recognition of rhyme: A reply to Bryant and Impey (1986). Cognition, 27, 103–108.
Terrace, H. S., Petitto, L. A., Sanders, R. J., & Bever, T. G. (1979). Can an ape create a sentence? Science, 206, 891–902.
Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., et al. (2005). Listening to action-related sentences activates fronto-parietal motor circuits. Journal of Cognitive Neuroscience, 17, 273–281.
Thagard, P. (2005). Mind: An introduction to cognitive science (2nd ed.). Cambridge, MA: MIT Press.
Thal, D., Marchman, V. A., Stiles, J., Aram, D., Trauner, D., Nass, R., et al. (1991). Early lexical development in children with focal brain injury. Brain and Language, 40, 491–527.
Theakston, A. L. (2004). The role of entrenchment in children’s and adults’ performance of grammaticality-judgement tasks. Cognitive Development, 19, 15–34.
Thiessen, E. D., & Saffran, J. R. (2007). Learning to learn: Infants’ acquisition of stress-based strategies for word segmentation. Language Learning and Development, 3, 73–100.
Thomas, E. L., & Robinson, H. A. (1972). Improving reading in every class: A sourcebook for teachers. Boston, MA: Allyn & Bacon.
Thomas, M. S. C. (2003). Limits on plasticity. Journal of Cognition and Development, 4, 95–121.
Thomas, M. S. C., & Karmiloff-Smith, A. (2003). Modeling language acquisition in atypical phenotypes. Psychological Review, 110, 647–682.
Thompson, C. R., & Church, R. M. (1980). An explanation of the language of a chimpanzee. Science, 208, 313–314.
Thompson, R., Emmorey, K., & Gollan, T. H. (2005). “Tip of the fingers” experiences by deaf signers. Psychological Science, 16, 856–860.
Thompson, S., & Mulac, A. (1991). The discourse conditions for the use of the complementizer that in conversational English. Journal of Pragmatics, 15, 237–251.
Thomson, J., & Chapman, R. S. (1977). Who is “Daddy” revisited: The status of two-year-olds’ overextended words in use and comprehension. Journal of Child Language, 4, 359–375.
Thorndyke, P. W. (1975). Conceptual complexity and imagery in comprehension. Journal of Verbal Learning and Verbal Behavior, 14, 359–369.
Thorndyke, P. W. (1977). Cognitive structures in comprehension and memory of narrative discourse. Cognitive Psychology, 9, 77–110.
Thorndyke, P. W., & Hayes-Roth, B. (1979). The use of schemata in the acquisition and transfer of knowledge. Cognitive Psychology, 11, 82–106.
Tincoff, R., & Jusczyk, P. W. (1999). Some beginnings of word comprehension in 6-month-olds. Psychological Science, 10, 172–175.
Tippett, L. J., & Farah, M. J. (1994). A computational model of naming in Alzheimer’s disease: Unitary or multiple impairments? Neuropsychology, 8, 1–11.
Tomasello, M. (1992a). First verbs: A case study of early grammatical development. Cambridge: Cambridge University Press.
Tomasello, M. (1992b). The social bases of language acquisition. Social Development, 1, 67–87.
Tomasello, M. (2000). Do young children have adult syntactic competence? Cognition, 74, 209–253.
Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Tomasello, M., & Akhtar, N. (2003). What paradox? A response to Naigles. Cognition, 88, 317–323.
Tomasello, M., & Farrar, M. J. (1984). Cognitive bases of lexical development: Object permanence and relational words. Journal of Child Language, 11, 477–493.
Tomasello, M., & Farrar, M. J. (1986). Object permanence and relational words: A lexical training study. Journal of Child Language, 13, 495–505.
Tomasello, M., & Kruger, A. (1992). Joint attention on actions: Acquiring verbs in ostensive and non-ostensive contexts. Journal of Child Language, 19, 311–333.
Traxler, M., & Gernsbacher, M. A. (Eds.). (2006). Handbook of psycholinguistics (2nd ed.). Burlington, MA: Academic Press.
Traxler, M. J., & Pickering, M. J. (1996). Plausibility and the processing of unbounded dependencies: An eye-tracking study. Journal of Memory and Language, 35, 454–475.
Traxler, M. J., Pickering, M. J., & Clifton, C. (1998). Adjunct attachment is not a form of lexical ambiguity resolution. Journal of Memory and Language, 39, 558–592.
Treiman, R. (1993). Beginning to spell: A study of first-grade children. New York: Oxford University Press.
Treiman, R. (1994). Sources of information used by beginning spellers. In G. D. A. Brown & N. C. Ellis (Eds.), Handbook of spelling: Theory, process and intervention (pp. 75–91). London: John Wiley & Sons Ltd.
Treiman, R. (1997). Spelling in normal children and dyslexics. In B. A. Blachman (Ed.), Foundations of reading acquisition and dyslexia: Implications for early intervention (pp. 191–218). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Treiman, R., & Hirsh-Pasek, K. (1983). Silent reading: Insights from second-generation deaf readers. Cognitive Psychology, 15, 39–65.
Treiman, R., & Zukowski, A. (1996). Children’s sensitivity to syllables, onsets, rimes, and phonemes. Journal of Experimental Child Psychology, 61, 193–215.
Trevarthen, C. (1975). Early attempts at speech. In R. Lewin (Ed.), Child alive (pp. 62–80). London: Temple Smith.
Trueswell, J. C. (1996). The role of lexical frequency in syntactic ambiguity resolution. Journal of Memory and Language, 35, 566–585.
Trueswell, J. C., Sekerina, I., Hill, N., & Logrip, M. (1999). The kindergarten-path effect: Studying online sentence processing in young children. Cognition, 73, 89–134.
Trueswell, J. C., & Tanenhaus, M. K. (1994). Toward a lexicalist framework for constraint-based syntactic ambiguity resolution. In C. Clifton, L. Frazier, & K. Rayner (Eds.), Perspectives on sentence processing (pp. 155–179). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Trueswell, J. C., Tanenhaus, M. K., & Garnsey, S. M. (1994). Semantic influences on parsing: Use of thematic role information in syntactic disambiguation. Journal of Memory and Language, 33, 285–318.
Trueswell, J. C., Tanenhaus, M. K., & Kello, C. (1993). Verb-specific constraints in sentence processing: Separating effects of lexical preference from garden paths. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 528–553.
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–403). New York: Academic Press.
Tulving, E., & Schacter, D. L. (1990). Priming and human memory systems. Science, 247, 301–306.
Turvey, M. T. (1973). On peripheral and central processes in vision. Psychological Review, 80, 1–52.
Tweedy, J. R., Lapinski, R. H., & Schvaneveldt, R. W. (1977). Semantic-context effects on word recognition: Influence of varying the proportion of items presented in an appropriate context. Memory and Cognition, 5, 84–89.
Tyler, L. K. (1984). The structure of the initial cohort. Perception and Psychophysics, 36, 415–427.
Tyler, L. K. (1985). Real-time comprehension processes in agrammatism: A case study. Brain and Language, 26, 259–275.
Tyler, L. K., & Marslen-Wilson, W. D. (1977). The on-line effects of semantic context on syntactic processing. Journal of Verbal Learning and Verbal Behavior, 16, 683–692.
Tyler, L. K., & Moss, H. E. (1997). Functional properties of concepts: Studies of normal and brain-damaged patients. Cognitive Neuropsychology, 14, 511–545.
Tyler, L. K., & Moss, H. E. (2001). Towards a distributed account of conceptual knowledge. Trends in Cognitive Sciences, 5, 244–252.
Tyler, L. K., Ostrin, R. K., Cooke, M., & Moss, H. E. (1995). Automatic access of lexical information in Broca’s aphasics: Against the automaticity hypothesis. Brain and Language, 48, 131–162.
Tyler, L. K., & Wessels, J. (1983). Quantifying contextual contributions to word-recognition processes. Perception and Psychophysics, 34, 409–420.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92, 231–270.
Ullman, M. T., Corkin, S., Coppola, M., Hickok, G., Growdon, J. H., Koroshetz, W. J., et al. (1997). A neural dissociation within language: Evidence that the mental dictionary is part of declarative memory, and that grammatical rules are processed by the procedural system. Journal of Cognitive Neuroscience, 9, 266–276.
Vaid, J. (1983). Bilingualism and brain lateralization. In S. Segalowitz (Ed.), Language functions and brain organization (pp. 315–339). New York: Academic Press.
Valian, V. (1986). Syntactic categories in the speech of young children. Developmental Psychology, 22, 562–579.
Vallar, G., & Baddeley, A. D. (1984). Phonological short-term store, phonological processing and sentence comprehension: A neuropsychological case study. Cognitive Neuropsychology, 1, 121–142.
Vallar, G., & Baddeley, A. D. (1987). Phonological short-term store and sentence processing. Cognitive Neuropsychology, 4, 417–438.
Vallar, G., & Baddeley, A. D. (1989). Developmental disorders of verbal short-term memory and their relation to sentence comprehension: A reply to Howard and Butterworth. Cognitive Neuropsychology, 6, 465–473.
van Berkum, J. J. A., Brown, C., Zwitserlood, P., Kooijman, V., & Hagoort, P. (2005). Anticipating upcoming words in discourse: Evidence from ERPs and reading times. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 443–467.
van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse representation. New York: Academic Press.
van Gompel, R. P. G., Fischer, M. H., Murray, W. S., & Hill, R. L. (2006). Eye-movement research: An overview of current and past developments. In R. P. G. van Gompel, M. H. Fischer, W. S. Murray, & R. L. Hill (Eds.), Eye movements: A window on mind and brain. Oxford: Elsevier Science.
van Gompel, R. P. G., & Pickering, M. J. (2001). Lexical guidance in sentence processing: A note on Adams, Clifton, and Mitchell (1998). Psychonomic Bulletin and Review, 8, 851–857.
van Gompel, R. P. G., & Pickering, M. J. (2007). Syntactic parsing. In G. Gaskell (Ed.), The Oxford handbook of psycholinguistics. Oxford: Oxford University Press.
van Gompel, R. P. G., Pickering, M. J., & Traxler, M. J. (2000). Unrestricted race: A new model of syntactic ambiguity resolution. In A. Kennedy, R. Radach, D. Heller, & J. Pynte (Eds.), Reading as a perceptual process (pp. 621–648). Oxford: Elsevier.
van Gompel, R. P. G., Pickering, M. J., & Traxler, M. J. (2001). Reanalysis in sentence processing: Evidence against constraint-based and two-stage models. Journal of Memory and Language, 45, 225–258.
van Orden, G. C. (1987). A rows is a rose: Spelling, sound and reading. Memory and Cognition, 15, 181–198.
van Orden, G. C., Johnston, J. C., & Hale, B. L. (1988). Word identification in reading proceeds from spelling to sound to meaning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 371–386.
van Orden, G. C., Pennington, B. F., & Stone, G. O. (1990). Word identification in reading and the promise of subsymbolic psycholinguistics. Psychological Review, 97, 488–522.
van Petten, C. (1993). A comparison of lexical and sentence-level context effects in event-related potentials. Language and Cognitive Processes, 8, 485–531.
van Turenout, M., Hagoort, P., & Brown, C. M. (1998). Brain activity during speaking: From syntax to phonology in 40 milliseconds. Science, 280, 572–574.
Vanderwart, M. (1984). Priming by pictures in lexical decision. Journal of Verbal Learning and Verbal Behavior, 23, 67–83.
Vargha-Khadem, F., & Passingham, R. (1990). Speech and language defects. Nature, 346, 226.
Vargha-Khadem, F., Watkins, K., Alcock, K., Fletcher, P., & Passingham, R. (1995). Praxic and nonverbal cognitive deficits in a large family with a genetically transmitted speech and language disorder. Proceedings of the National Academy of Sciences, 92, 930–933.
Varney, N. L. (1984). Phonemic imperception in aphasia. Brain and Language, 21, 85–94.
Venezky, R. L. (1970). The structure of English orthography. The Hague: Mouton.
Vidyasagar, T. R., & Pammer, K. (2010). Dyslexia: A deficit in visuo-spatial attention, not in phonological processing. Trends in Cognitive Sciences, 14, 57–63.
Vigliocco, G., Antonini, T., & Garrett, M. F. (1997). Grammatical gender is on the tip of Italian tongues. Psychological Science, 8, 314–317.
Vigliocco, G., Butterworth, B., & Garrett, M. F. (1996). Subject–verb agreement in Spanish and English: Differences in the role of conceptual constraints. Cognition, 61, 261–298.
Vigliocco, G., & Hartsuiker, R. J. (2002). The interplay of meaning, sound, and syntax in sentence production. Psychological Bulletin, 128, 442–472.
Vigliocco, G., & Nicol, J. (1998). Separating hierarchical relations and word order in language production: Is proximity concord syntactic or linear? Cognition, 68, B13–B29.
Vigliocco, G., & Vinson, D. P. (2009). Semantic representation. In G. Gaskell (Ed.), The Oxford handbook of psycholinguistics (pp. 195–216). Oxford: Oxford University Press.
Vigliocco, G., Vinson, D. P., Lewis, W., & Garrett, M. F. (2004). Representing the meanings of object and action words: The featural and unitary semantic space hypothesis. Cognitive Psychology, 48, 422–488.
Vigliocco, G., Vinson, D. P., Paganelli, F., & Dworzynski, K. (2005). Grammatical gender effects on cognition: Implications for language learning and language use. Journal of Experimental Psychology: General, 134, 501–520.
Vihman, M. M. (1985). Language differentiation by the bilingual infant. Journal of Child Language, 12, 297–324.
Vihman, M. M. (1996). Phonological development. Oxford: Blackwell.
Vinson, B. P. (1999). Language disorders across the lifespan: An introduction. San Diego, CA: Singular Publishing Group.
Vipond, D. (1980). Micro and macroprocesses in text comprehension. Journal of Verbal Learning and Verbal Behavior, 19, 276–296.
Vitevitch, M. S. (2002). The influence of phonological similarity neighborhoods on speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 735–747.
Vitkovitch, M., & Humphreys, G. W. (1991). Perseverant responding in speeded naming of pictures: It’s in the links. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 664–680.
Von Frisch, K. (1950). Bees, their vision, chemical senses, and language. Ithaca, NY: Cornell University Press.
Von Frisch, K. (1974). Decoding the language of bees. Science, 185, 663–668.
Vu, H., & Kellas, G. (1999). Contextual strength modulates the subordinate bias effect: Reply to Rayner, Binder, and Duffy. Quarterly Journal of Experimental Psychology, 52A, 853–855.
Vu, H., Kellas, G., & Paul, S. T. (1998). Sources of sentence constraint on lexical ambiguity resolution. Memory and Cognition, 26, 979–1001.
Vygotsky, L. (1934). Thought and language (Trans. E. Hanfman & G. Vakar, 1962). Cambridge, MA: MIT Press.
Waldrop, M. M. (1992). Complexity: The emerging science at the edge of order and chaos. London: Penguin Books.
Wales, R. J., & Campbell, R. (1970). On the development of comparison and the comparison of development. In G. B. Flores d’Arcais & W. J. M. Levelt (Eds.), Advances in psycholinguistics (pp. 373–396). Amsterdam: North Holland.
Walker, C. H., & Yekovich, F. R. (1987). Activation and use of script-based antecedents in anaphoric reference. Journal of Memory and Language, 26, 673–691.
Walker, S. (1987). Review of Gavagai! or the future history of the animal language controversy, by David Premack. Mind and Language, 2, 326–332.
Wall, R. (1972). Introduction to mathematical linguistics. Englewood Cliffs, NJ: Prentice Hall.
Wanner, E. (1980). The ATN and the sausage machine: Which one is baloney? Cognition, 8, 209–225.
Ward, J. (2010). The student’s guide to cognitive neuroscience (2nd ed.). Hove, UK: Psychology Press.
Wardlow Lane, L., Groisman, M., & Ferreira, V. S. (2006). Don’t talk about pink elephants! Psychological Science, 17, 273–277.
Warren, C., & Morton, J. (1982). The effects of priming on picture recognition. British Journal of Psychology, 73, 117–129.
Warren, R. M. (1970). Perceptual restoration of missing speech sounds. Science, 167, 392–393.
Warren, R. M., Obusek, C. J., Farmer, R. M., & Warren, R. P. (1969). Auditory sequence: Confusion of patterns other than speech or music. Science, 164, 586–587.
Warren, R. M., & Warren, R. P. (1970). Auditory illusions and confusions. Scientific American, 223, 30–36.
Warrington, E. K. (1975). The selective impairment of semantic memory. Quarterly Journal of Experimental Psychology, 27, 635–657.
Warrington, E. K. (1981). Concrete word dyslexia. British Journal of Psychology, 72, 175–196.
Warrington, E. K., & Cipolotti, L. (1996). Word comprehension: The distinction between refractory and storage impairments. Brain, 119, 611–625.
Warrington, E. K., & Crutch, S. J. (2004). A circumscribed refractory access disorder: A verbal semantic impairment sparing visual semantics. Cognitive Neuropsychology, 21, 299–315.
Warrington, E. K., & McCarthy, R. (1983). Category specific access dysphasia. Brain, 106, 859–878.
Warrington, E. K., & McCarthy, R. (1987). Categories of knowledge: Further fractionation and an attempted integration. Brain, 110, 1273–1296.
Warrington, E. K., & Shallice, T. (1969). The selective impairment of auditory verbal short-term memory. Brain, 92, 885–896.
Warrington, E. K., & Shallice, T. (1979). Semantic access dyslexia. Brain, 102, 43–63.
Warrington, E. K., & Shallice, T. (1984). Category-specific semantic impairments. Brain, 107, 829–854.
Wason, P. C. (1965). The contexts of plausible denial. Journal of Verbal Learning and Verbal Behavior, 4, 7–11.
Waters, G. S., & Caplan, D. (1996). The capacity theory of sentence comprehension: Critique of Just and Carpenter (1992). Psychological Review, 103, 761–772.
Waters, G. S., Caplan, D., & Hildebrandt, N. (1991). On the structure of verbal short-term memory and its functional role in sentence comprehension: Evidence from neuropsychology. Cognitive Neuropsychology, 8, 81–126.
Watkins, K. E., Dronkers, N. F., & Vargha-Khadem, F. (2002). Behavioural analysis of an inherited speech and language disorder: Comparison with acquired aphasia. Brain, 125, 452–464.
Watkins, K. E., & Paus, T. (2004). Modulation of motor excitability during speech perception: The role of Broca’s area. Journal of Cognitive Neuroscience, 16, 978–987.
Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20, 158–177.
Watts, D. (2012). Why everything is obvious (once you know the answer). New York: Atlantic Books.
Waxman, S. R. (1999). Specifying the scope of 13-month-olds’ expectations for novel words. Cognition, 70, B35–B50.
Waxman, S. R., & Booth, A. E. (2001). Seeing pink elephants: Fourteen-month-olds’ interpretations of novel nouns and adjectives. Cognitive Psychology, 43, 217–242.
Waxman, S. R., & Markow, D. B. (1995). Words as invitations to form categories: Evidence from 12- to 13-month-old infants. Cognitive Psychology, 29, 257–303.
Weekes, B. S. (1997). Differential effects of number of letters on word and nonword naming latency. Quarterly Journal of Experimental Psychology, 50A, 439–456.
Weizenbaum, J. (1966). ELIZA: A computer program for the study of natural language communication between man and machine. Communications of the Association for Computing Machinery, 9, 36–45.
Werker, J., & Curtin, S. (2005). PRIMIR: A developmental framework of infant speech processing. Language Learning and Development, 1, 197–234.
Werker, J. F., & Tees, R. C. (1983). Developmental changes across childhood in the perception of non-native speech sounds. Canadian Journal of Psychology, 37, 278–286.
Werker, J. F., & Tees, R. C. (1984). Crosslanguage speech development: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 7, 49–63.
Werker, J. F., & Yeung, H. H. (2005). Infant speech perception bootstraps word learning. Trends in Cognitive Sciences, 9, 519–527.
West, R. F., & Stanovich, K. E. (1978). Automatic contextual facilitation in readers of three ages. Child Development, 49, 717–727.
West, R. F., & Stanovich, K. E. (1982). Source of inhibition in experiments on the effect of sentence context on word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 385–399.
West, R. F., & Stanovich, K. E. (1986). Robust Williams, J. N. (1988). Constraints upon semantic
effects of syntactic structure on visual word activation during sentence comprehension. Language
processing. Memory and Cognition, 14, 104–112. and Cognitive Processes, 3, 165–206.
Wexler, K. (1998). Very early parameter setting and Williams, P. C., & Parkin, A. J. (1980). On
the unique checking constraint: A new explanation of knowing the meaning of words we are unable to
the optional infinitive stage. Lingua, 106, 23–79. report—confirmation of a guessing explanation.
Whaley, C. P. (1978). Word–nonword classification Quarterly Journal of Experimental Psychology, 32,
time. Journal of Verbal Learning and Verbal Behavior, 101–107.
17, 143–154. Wilshire, C. E., & Saffran, E. M. (2005). Contrasting
Wheeldon, L. (Ed.). (2000). Aspects of language effects of phonological priming in aphasic word
production. Hove, UK: Psychology Press. production. Cognition, 95, 31–71.
Wheeldon, L., & Lahiri, A. (1997). Prosodic units in Wilson, M., & Wilson, T. P. (2005). An oscillator
speech production. Journal of Memory and Language, model of the timing of turn-taking. Psychonomic
37, 356–381. Bulletin and Review, 12, 957–968.
Wheeldon, L. R., & Monsell, S. (1992). The locus of Wingfield, A., & Klein, J. F. (1971). Syntactic
repetition priming of spoken word production. Quarterly structure and acoustic pattern in speech perception.
Journal of Experimental Psychology, 44A, 723–761. Perception and Psychophysics, 9, 23–25.
Wheeler, D. (1970). Processes in word recognition. Winner, E., & Gardner, H. (1977). The
Cognitive Psychology, 1, 59–85. comprehension of metaphor in brain-damaged
Whittlesea, B. W. A. (1987). Preservation of patients. Brain, 100, 717–729.
specific experiences in the representation of general Winnick, W. A., & Daniel, S. A. (1970). Two kinds
knowledge. Journal of Experimental Psychology: of response priming in tachistoscopic recognition.
Learning, Memory, and Cognition, 13, 3–17. Journal of Experimental Psychology, 84, 74–81.
Whorf, B. L. (1956a). Language, thought, and reality: Winograd, T. A. (1972). Understanding natural
Selected writings of Benjamin Lee Whorf. New York: language. New York: Academic Press.
Wiley. Wisniewski, E. J. (1997). When concepts combine.
Whorf, B. L. (1956b). Science and linguistics. In Psychonomic Bulletin and Review, 4, 167–183.
J. B. Carroll (Ed.), Language, thought and reality: Wisniewski, E. J., & Love, B. C. (1998). Relations
Selected writings of Benjamin Lee Whorf (pp. 207–219). versus properties in conceptual combination. Journal
Cambridge, MA: MIT Press. [Originally published 1940.] of Memory and Language, 38, 177–202.
Wickelgren, W. A. (1969). Context-sensitive coding, Wittgenstein, L. (1953). Philosophical investigations
associative memory, and serial order in (speech) (Trans. G. E. M. Anscombe). Oxford: Blackwell.
behavior. Psychological Review, 76, 1–15. Wittgenstein, L. (1958). The blue and brown books.
Wierzbicka, A. (2004). Conceptual primes in Oxford: Blackwell.
human languages and their analogues in animal Woodruff-Pak, D. S. (1997). The neuropsychology of
communication and cognition. Language Sciences, 26, aging. Oxford: Blackwell.
413–441. Woods, B. T., & Carey, S. (1979). Language deficits
Wilding, J. (1990). Developmental dyslexics do not after apparent clinical recovery from childhood
fit in boxes: Evidence from the case studies. European aphasia. Annals of Neurology, 6, 405–409.
Journal of Cognitive Psychology, 2, 97–131. Woods, B. T., & Teuber, H.-L. (1973). Early onset of
Wilensky, R. (1983). Story grammars versus story complementary specialization of cerebral hemispheres
points. Behavioral and Brain Sciences, 6, 579–623. in man. Transactions of the American Neurological
Wilkes, A. L. (1997). Knowledge in minds: Individual Association, 98, 113–117.
and collective processes in cognition. Hove, UK: Woods, W. A. (1975). What’s in a link? Foundations
Psychology Press. for semantic networks. In D. G. Bobrow &
Wilkins, A. J. (1971). Conjoint frequency, category A. M. Collins (Eds.), Representation and
size, and categorization time. Journal of Verbal understanding: Studies in cognitive science (pp.
Learning and Verbal Behavior, 10, 382–385. 35–82). New York: Academic Press.
Wilkins, A. J., & Neary, G. (1991). Some visual, Woodward, A. L., & Markman, E. M. (1998). Early
optometric and perceptual effects of coloured glasses. word learning. In W. Damon, D. Kuhn, & R. S. Siegler
Ophthalmic and Physiological Optics, 11, 163–171. (Eds.), Handbook of child psychology (Vol. 2, 5th ed.,
Wilks, Y. (1976). Parsing English II. In E. Charniak pp. 371–420). New York: Wiley.
& Y. Wilks (Eds.), Computational semantics (pp. Woodworth, R. S. (1938). Experimental psychology.
155–184). Amsterdam: North Holland. New York: Holt.
Willems, R. M., & Casasanto, D. (2011). Flexibility Wright, B., & Garrett, M. (1984). Lexical decision
in embodied language understanding. Frontiers in in sentences: Effects of syntactic structure. Memory
Psychology, 2, 1–11. and Cognition, 12, 31–45.
Wydell, T. K., Patterson, K. E., & Humphreys, G. W. interaction of lexical semantics and cohort competition
(1993). Phonologically mediated access to meaning in spoken word recognition: An fMRI study. Journal
for kanji: Is rows still a rose in Japanese kanji? Journal of Cognitive Neuroscience, 23, 3778–3790.
of Experimental Psychology: Learning, Memory, and Ziegler, J. C., & Goswami, U. (2005). Reading
Cognition, 19, 1082–1093. acquisition, developmental dyslexia, and skilled
Xu, F. (2002). The role of language in acquiring object reading across languages: A psycholinguistic grain size
kind concepts in infancy. Cognition, 85, 223–250. theory. Psychological Bulletin, 131, 3–29.
Yamada, J. E. (1990). Laura: A case for the Ziegler, J. C., Muneaux, M., & Grainger, J. (2003).
modularity of language. Cambridge, MA: MIT Press. Neighborhood effects in auditory word recognition:
Yekovich, F. R., & Thorndyke, P. W. (1981). An Phonological competition and orthographic
evaluation of alternative models of narrative schema. facilitation. Journal of Memory and Language, 48,
Journal of Verbal Learning and Verbal Behavior, 20, 779–793.
454–469. Ziegler, J. C., Perry, C., Jacobs, A. M., & Braun, M.
Yngve, V. (1970). On getting a word in edgewise. (2001). Identical words are read differently in different
Papers from the Sixth Regional Meeting of the languages. Psychological Science, 12, 379–384.
Chicago Linguistic Society, 6, 567–577. Zorzi, M., Barbiero, C., Facoetti, A., &
Yopp, H. K. (1988). The validity and reliability Ziegler, J. C. (2012). Extra-large letter spacing
of phonemic awareness tests. Reading Research improves reading in dyslexia. Proceedings of the
Quarterly, 23, 159–177. National Academy of Sciences USA, 109, 11455–11459.
Yuill, N., & Oakhill, J. (1991). Children’s problems Zurif, E. B., Caramazza, A., Myerson, P., &
in text comprehension. Cambridge: Cambridge Galvin, J. (1974). Semantic feature representations for
University Press. normal and aphasic language. Brain and Language, 1,
Zagar, D., Pynte, J., & Rativeau, S. (1997). Evidence 167–187.
for early closure attachment on first-pass reading Zurif, E. B., & Grodzinsky, Y. (1983). Sensitivity to
times in French. Quarterly Journal of Experimental grammatical structure in agrammatic aphasics: A reply
Psychology, 50A, 421–438. to Linebarger, Schwartz, & Saffran. Cognition, 15,
Zaidel, E., & Peters, A. M. (1981). Phonological 207–214.
encoding and ideographic reading by the disconnected Zwaan, R. A., & Madden, C. J. (2004). Updating
right hemisphere. Brain and Language, 14, 205–234. situation models. Journal of Experimental
Zevin, J. D., & Balota, D. A. (2000). Priming and Psychology: Learning, Memory, and Cognition, 30,
attentional control of lexical and sublexical pathways 283–288.
during naming. Journal of Experimental Psychology: Zwaan, R. A., Magliano, J. P., & Graesser, A. C.
Learning, Memory, and Cognition, 26, 121–135. (1995). Dimensions of situation model construction
Zevin, J. D., & Seidenberg, M. S. (2002). Age of in narrative comprehension. Journal of Experimental
acquisition effects in word reading and other tasks. Psychology: Learning, Memory, and Cognition, 21,
Journal of Memory and Language, 47, 1–29. 386–397.
Zevin, J. D., & Seidenberg, M. S. (2006). Simulating Zwaan, R. A., & Radvansky, G. A. (1998). Situation
consistency effects and individual difference in models in language comprehension and memory.
nonword naming: A comparison of current models. Psychological Bulletin, 123, 162–185.
Journal of Memory and Language, 54, 145–160. Zwitserlood, P. (1989). The locus of the effects
Zhuang, J., Randall, B., Stamatakis, E. A., of sentential-semantic context in spoken-word
Marslen-Wilson, W. D., & Tyler, L. K. (2011). The processing. Cognition, 32, 25–64.
AUTHOR INDEX
A Antos, S.J. 179

Anwander, A. 71
Abelson, R.P. 380, 381
Aboitiz, F. 250 Applebaum, M. 18, 436
Abrams, I. 189 Aram, D. 76
Acenas, L.R. 414, 415 Arbib, M.A. 53
Adams, A. 469 Arciuli, J. 476
Adams, M.J. 247 Armstrong, S. 334
Agnoli, F. 94 Arnold, J.E. 373
Aguado, G. 442 Aslin, R.N. 66, 121, 122
Ainsworth-Darnell, K. 298 Atchley, R.A. 186, 320, 354
Aitchison, J. 339 Atkins, P. 228, 231
Akar, D. 83 Atkinson, M. 133
Akhtar, N. 141, 142, 143 Atkinson, M.A. 35
Alario, F.-X. 417, 430 Au, T.K. 93
Albert, M.L. 157 Audet, C. 186
Albrecht, J.E. 383 Auger, E. 114
Alcock, K. 114, 115 Austin, J.L. 450
Alegria, J. 244 Ayala, J. 469
Alekseyenko, A.V. 8
Alishahi, A. 141, 143 B
Allen, J. 121 Baars, B.J. 398, 399, 421, 425
Allopenna, P.D. 271 Baayen, R.H. 173, 191
Alloway, T.P. 469 Backman, J.E. 247
Allport, D.A. 171, 464, 467, 469 Baddeley, A. 160, 471
Almor, A. 375, 389 Baddeley, A.D. 87, 252, 386, 389, 468, 469, 471, 472
Altarriba, J. 155, 162, 181, 189, 190, 367 Badecker, W. 375, 415, 436
Altmann, G.T.M. 119, 122, 261, 292, 300, 301, 312, 361 Baguley, T. 382
Amiel-Tison, C. 120, 122 Bahlmann, J. 71
Andersen, E.S. 352, 353, 389 Bailey, K.G.D. 290, 433, 479
Anderson, A. 453 Bailey, L.M. 127
Anderson, J.R. 291, 292, 325, 335, 370, 377, 378, 380, Bailey, P.J. 466
387, 388 Baillet, S.D. 366
Anderson, K.J. 73 Baker, C. 314, 435, 472
Anderson, R. 7 Baldwin, D.A. 129
Anderson, R.C. 366, 367 Baldwin, G. 142, 143
Anderson, W.G. 175 Balin, J.A. 375
Andrews, C. 96 Balota, D.A. 167, 174, 177, 180, 181, 182, 206, 219,
Andrews, S. 175, 214, 215 220, 238, 389
Anglin, J.M. 134 Baluch, B. 219
Annett, M. 250 Banks, W.P. 340
Ans, B. 238 Baraitser, M. 114
Anton-Méndez, I. 421 Barbarotto, R. 347
Antonini, T. 415 Barbiero, C. 255
Barclay, J.R. 368, 382 Berwick, R.C. 109
Bard, B. 77, 83 Besner, D. 175, 176, 196, 213, 214, 219, 224, 231,
Barisnikov, K. 83 232, 238
Barker, M.G. 350 Best, B.J. 88
Barnes, M.A. 181, 214 Best, W. 235, 252, 470
Baron, J. 214, 243, 251 Bestgen, Y. 384, 479
Baron-Cohen, S. 73 Bever, T.G. 61, 63, 80, 262, 289, 291, 293, 307, 311,
Barr, D.J. 375 332
Barrett, M.D. 131, 133, 134 Bi, Y. 417
Barron, R.W. 215, 243 Bialystok, E. 76, 79, 153, 154, 155
Barry, C. 173, 174, 223, 224, 237 Biassou, N. 436
Barry, G. 312 Bickerton, D. 52, 114
Barsalou, L.W. 334, 335, 356 Bienkowski, M. 205
Barss, A. 298, 313 Bierwisch, M. 135
Bartha, M.C. 469, 470 Bigelow, A. 86, 87
Bartlett, F.C. 380 Bihrle, A. 82
Basagni, B. 417 Bihrle, A.M. 389
Basili, A.G. 445 Binder, J.R. 356
Bastian, J. 55 Binder, K.S. 205, 216
Batchelder, E.O. 121 Bird, H. 409
Bates, A. 341 Birdsong, D. 77
Bates, E. 18, 74, 104, 126, 138, 144, 147, 314, 363, Bishop, D. 75, 83, 85, 114, 215, 361, 389
434, 436, 437, 444, 471, 472 Black, J.B. 364, 368, 380, 381
Bates, E.A. 25, 80, 106, 117, 118 Black, S. 222, 341
Batterink, L. 298 Blackwell, A. 314, 437, 471
Battig, W.F. 333 Blank, M.A. 262
Bauer, D.W. 214 Blanken, G. 421, 425
Baum, S.R. 276 Blaxton, T.A. 176
Bavelier, D. 162 Bloch, D.E. 426
Beardsley, W. 373 Bloem, I. 412
Beaton, A. 161 Bloom, A.H. 92
Beaton, A.A. 250 Bloom, L. 131, 139
Beattie, G.W. 396, 431, 432, 433 Bloom, P. 52, 113, 127, 128, 129, 132, 136, 138, 140
Beauvois, M.-F. 221, 222, 235, 341, 445, 466 Bloom, P.A. 187
Becker, C.A. 196, 198, 199, 201 Blossom-Stach, C. 315
Beeman, M. 376 Blumstein, S.E. 276, 281, 314, 315
Begley, S. 74 Boas, F. 91
Behrend, D.A. 132 Boatman, D. 442
Behrmann, M. 234, 466 Bock, J.K. 402, 403, 404, 405, 406, 407
Bell, L.C. 217 Bock, K. 143, 156, 403, 404, 415, 420, 479
Bellugi, U. 63, 82, 104, 136, 139 Bod, R. 293
Ben-Zeev, S. 154 Bogyo, L.C. 224, 435
Bencini, G.L. 287 Bohannon, J.N. 107
Benedict, H. 136 Boisvert, S. 186
Benvegnu, B. 466 Boivin, M. 117
Benzing, L. 466 Boland, J.E. 296, 298, 310, 312
Bereiter, C. 445 Bolinger, D.L. 329
Berko, J. 145 Bolozky, S. 168
Berlin, B. 95, 96 Bonin, P. 173, 174, 417
Berndt, R.S. 281, 313, 344, 445 Bookin, H.B. 338
Berrian, R. 175 Boomer, D.S. 431
Bertrand, D. 246 Boone, J.R. 344
Bertelson, P. 244 Booth, A.E. 128
Bertenthal, B.I. 348 Bornkessel-Schlesewsky, I. 71
Bertera, J.H. 169 Bornstein, M.H. 96
Bertoncini, J. 120, 121, 122 Bornstein, S. 96
Bertram, R. 191 Boroditsky, L. 97, 98
Bertrand, J. 129 Borowsky, R. 206
Bosch, P. 251 Burke, C. 242
Bouckaert, R. 8 Burke, D. 414, 416
Bower, G.H. 335, 364, 377, 380, 381, 382, 383 Burke, D.M. 415
Bowerman, M. 97, 132, 133, 136, 137, 140, 148 Burton, M.W. 276
Bowey, J.A. 246 Burton-Roberts, N. 38
Bowles, N.L. 187 Bus, A.G. 244
Bown, H.E. 415 Busnel, M.C. 119
Boyes-Braem, P. 130, 334 Butcher, C. 113, 114
Boysen, S. 61, 62 Butterfield, S. 260
Bozeat, S. 355 Butters, N. 349
Bradbury, R.J. 433 Butterworth, B. 227, 252, 253, 401, 405, 411, 422,
Bradley, D.C. 183, 315, 461 425, 431, 432, 437, 438, 440, 469, 470, 472
Bradley, L. 244, 247, 253, 254 Button, S.B. 206
Braine, M.D.S. 117, 136, 137, 139, 141, 143, 149 Byrne, B. 244, 247
Brakke, K.E. 64
Bramwell, B. 281, 464
Branigan, H.P. 403, 404, 456 C
Bransford, J.D. 364, 365, 368, 382 Cable, C. 65
Braun, A.R. 68, 73 Cacciari, C. 337, 338
Braun, M. 246 Cairns, P. 121, 276
Breedin, S.D. 298, 332, 349, 470 Call, J. 57
Brennan, S.E. 290, 396, 453, 456 Calvanio, R. 445
Brennen, T. 467 Camden, C.T. 399
Breskin, S. 432 Campbell, R. 135, 252, 253, 469, 472
Bretherton, I. 104 Cancelliere, A. 220, 221
Brewer, W.F. 382, 383 Cantalupo, C. 53
Briand, K. 179 Capasso, R. 281, 466, 467
Britain, D. 35 Capitani, E. 347
Britton, B.K. 364 Caplan, D. 217, 281, 291, 314, 315, 341, 343, 345,
Brocklehurst, P.H. 462 373, 389, 435, 436, 444, 460, 471, 472
Bronowski, J. 63 Caramazza, A. 18, 157, 223, 229, 281, 313, 315, 340,
Brooks, P.J. 143 342, 344, 345, 346, 347, 373, 407, 416, 417, 421,
Broom, Y.M. 255 430, 434, 436, 442, 443, 445, 466
Brown, A.S. 414 Caravolas, M. 245
Brown, C. 292 Carbonnel, S. 238
Brown, C.M. 298, 412 Carey, P.W. 200
Brown, G.D.A. 173, 214, 215, 228, 246, 248, 253, 398 Carey, S. 74, 130, 335
Brown, H.O. 89 Carlson, G.N. 296, 304, 312
Brown, P. 196, 451 Carlson, M. 296
Brown, R. 63, 69, 95, 96, 104, 107, 108, 130, 136, Carlson, T. 375
139, 140, 141, 144, 145, 146, 414, 415 Carmichael, L. 93, 94
Brown-Schmidt, S. 373 Carney, A.E. 123
Brownell, H.H. 389 Carpenter, P.A. 219, 291, 292, 301, 304, 314, 332,
Bruce, D.J. 259 385, 386, 468, 471, 472
Bruck, M. 161 Carr, T.H. 184, 192, 228
Bruhn, P. 69 Carroll, J. 245
Bruner, J.S. 83, 84 Carroll, J.B. 92, 160, 173
Bryant, P. 244, 247, 251, 253, 254 Carroll, P. 189
Bryant, P.E. 252 Carruthers, P. 99
Brysbaert, M. 216, 217, 305 Carter, R. 176
Bryson, B. 7, 8 Carver, R.P. 219
Bub, D. 21, 220, 221, 222, 234, 341, 343, 435, 445, 466 Cary, L. 244
Buccino, G. 356 Casagrande, J.B. 92
Buckingham, H.W. 426, 437, 438 Casasanto, D. 356
Budd, B. 341 Casey, B.J. 251
Bull, D.H. 123 Cassidy, K. 130
Burger, L.K. 426 Cassidy, K.W. 121
Burgess, C. 186, 202, 320, 354, 356 Castles, A. 245, 252
Catlin, J. 333 Coltheart, M. 110, 140, 175, 214, 216, 217, 220, 222,
Cattaneo, C. 254 223, 224, 225, 226, 228, 231, 235, 238, 245, 251,
Cattell, J.M. 176 252, 341, 345, 349, 464
Cazden, C.B. 107, 146 Coltheart, V. 246
Chafe, W.L. 444 Conboy, B.T. 157
Chaffin, R. 378 Conner, L.T. 206
Chalard, M. 173, 174 Connine, C.M. 263
Chambers, C.G. 304, 458 Conrad, C. 217, 324
Chambers, S.M. 172, 175, 183, 214 Conrad, R. 87
Chambers Twentieth Century Dictionary 55 Constable, R.T. 73, 298
Chan, A.S. 349 Content, A. 175
Chan, D. 348 Conti-Ramsden, G. 87
Chang, F. 143, 373, 404, 479 Cook, A.E. 383
Chao, L.L. 348 Cook, V. 154
Chapman, R.M. 311, 312 Cook, V.J. 37, 111
Chapman, R.S. 132 Cooke, M. 315
Charney, R. 130 Cooper, F.S. 258, 268
Chater, N. 26, 54, 121, 139, 276 Cooper, J. 242
Chawluk, J.B. 352 Cooper, W.E. 281
Chen, H.-C. 155 Coppola, M. 69, 114, 409
Chertkow, H. 343 Corballis, M.C. 53
Chialant, D. 281, 442 Corbett, A.T. 369, 373
Chiappe, P. 252 Corbett, G. 96
Chiat, S. 437 Corbit, L. 261, 264
Cholin, J. 428, 429 Corina, D.P. 68, 73
Chomsky, N. 9, 10, 11, 24, 36, 37, 40, 41, 42, 43, 45, Corkin, S. 69, 409
51, 52, 66, 67, 108, 109, 111, 112, 144, 475 Corley, M. 462
Christiaansen, R.E. 369, 370 Corley, M.M.B. 305
Christiansen, J.A. 389 Corrigan, R. 81
Christiansen, M.H. 54, 118, 121, 122, 472 Cortese, C. 155, 175
Christianson, K. 307 Cortese, M.J. 177
Chumbley, J.I. 174, 180, 181, 182 Coslett, H.B. 225, 341, 349, 443
Church, R.M. 63 Costa, A. 157, 321, 404, 407, 417, 421, 429, 430
Cipolotti, L. 234, 339, 340 Costa, F. 305
Cirilo, R.K. 378 Costa, L.D. 76
Clahsen, H. 35, 111, 146, 147 Crago, M.B. 114
Clark, E.V. 91, 113, 123, 124, 125, 127, 130, 131, 132, Craik, F.I.M. 155
133, 134, 135, 136, 259, 268 Crain, L. 84
Clark, H.H. 9, 91, 113, 123, 124, 125, 127, 130, 131, Crain, S. 299, 300
132, 135, 259, 268, 337, 367, 375, 376, 396, 430, Cree, G.S. 353
433, 449, 452, 453, 455 Crocker, M.W. 302
Clark, J.M. 155 Cromer, R.F. 96
Clarke, R. 195 Croot, K. 442
Clarke-Stewart, K. 110 Cross, T.G. 109, 110
Claus, B. 383 Crosson, B. 344
Cleland, A.A. 404 Crum, W.R. 348
Clifton, C. 263, 296, 297, 302, 307, 311, 374 Crutch, S.J. 340
Coffey-Corina, S.A. 75, 76 Cruz, A. 254
Cohen, G. 387 Crystal, D. 3, 7, 124, 154
Cohen, L. 184 Cuetos, F. 304, 305, 442
Cohen, M.M. 275 Cummins, J. 76, 154
Cok, B. 156, 158 Cupples, L. 341
Colby, K.M. 14 Curtin, S. 118, 122, 123
Cole, R.A. 271 Curtis, B. 228, 231
Coleman, L. 332 Curtiss, S. 78, 115
Collins, A.M. 323, 324, 325, 335, 444, 445 Cutler, A. 122, 260, 261, 265, 276, 277, 279, 315, 338,
Colombo, L. 176 398, 410, 411, 463, 480
Cutsforth, T.D. 86 Desberg, P. 241–2, 246
Cutting, J.C. 405, 406, 407, 420 d’Esposito, M. 349
Cymerman, E. 148 Deutsch, A. 192
Czerniewska, P. 444 Devlin, J.T. 184, 352, 353
Dhooge, E. 423
di Betta, A.M. 351
D Di Nubila, J.A. 143
Dahan, D. 266 Diaz, R. 155
Dahl, H. 173 Dick, F. 314, 434, 436, 444, 472
Dale, P.S. 117, 126, 144, 148 Diesendruck, G. 83
Dale, R. 455, 458 Diesfeldt, H.F.A. 349
Dallas, M. 175 Dijkstra, A. 157
D’Andrade, R.G. 451 Dijkstra, T. 157, 191, 277
Daneman, M. 216, 385, 386, 468 Ding, B. 244
d’Anglejan, A. 155 Dionne, G. 117
Daniel, S.A. 195, 463 Dobel, C. 403
Danna, M. 356 Dockrell, J. 109, 132
Dannenbring, G.L. 179 Doctor, E.A. 255
Dannenburg, L. 156 Dodge, M. 113
Datta, H. 252 Dogil, G. 313
Davelaar, E. 175, 214 Doherty, S. 82
David, D. 467 Doi, L.M. 253
Davidoff, J. 97 Dooling, D.J. 366, 367, 369, 370
Davidson, B.J. 251 Dopkins, S. 203, 204
Davidson, J.E. 461 Dordain, M. 435, 436
Davidson, M. 216 Dörnyei, Z. 160
Davies, I. 96, 97 Dosher, B.A. 369
Davis, B.L. 124 Doughty, C.J. 160
Davis, C. 176 Downing, P. 336
Davis, C.J. 176 Doyle, M.C. 182, 183
Davis, K. 79 Doyle, W.J. 124
Davis, M.H. 192 Driver, J. 460
De Boeck, P. 335 Dronkers, N.F. 71, 115, 314, 434, 436, 444
de Boysson-Bardies, B. 123, 124 Druks, J. 222
De Deyn, P.P. 69 Drummond, A.J. 8
De Groot, A.M.B. 156, 179, 181 Druss, B. 121
de Mornay Davies, P. 347 Duchek, J.A. 389
De Renzi, E. 347 Dufau, S. 246
de Sa, V.R. 353 Duffy, S.A. 203, 205
De Villiers, J.G. 83, 107 Dumais, S.T. 104, 354, 356
De Villiers, P.A. 83, 107 Duncan, L.G. 244, 246
Deacon, T. 54, 57, 65, 80, 117 Duncan, S.E. 454
Deavers, R.P. 246 Duncker, K. 94
DeCasper, A.J. 119, 121 Dunlea, A. 86
Dehaene, S. 184 Dunn, M. 8
Dehaut, F. 314, 435, 472 Duran, N.D. 455
Delaney, S.M. 217 Durand, C. 123, 124
Dell, G.S. 143, 320, 404, 420, 421, 422, 423, 424, Durkin, K. 83
425, 426, 427, 428, 437, 441, 442, 456, 462, 463, Durso, F.T. 184
469, 479 Duskova, L. 159
DeLong, K.A. 304 Dworetzky, B. 314, 315
Demers, R.A. 55, 73 Dworzynski, K. 93
Demetras, M.J. 107
Den Heyer, K. 179
Dennis, M. 75 E
Dennis, Y. 301 Earles, J.L. 80, 118, 161
Derouesné, J. 221, 222, 235, 445, 466 Eberhard, K.M. 267, 303, 405, 406, 407, 457
Desai, R.H. 356 Eckert, M.A. 250
Eckman, F. 159 Fera, P. 232
Edwards, H.T. 109 Fernald, A. 109, 122
Eglinton, E. 250 Fernald, G.M. 255
Ehri, L.C. 157, 241, 242, 243, 245, 246, 247 Fernandes, K.J. 143
Ehrlich, K. 382 Ferraro, F.R. 206, 297
Ehrlich, S.F. 188 Ferreira, F. 12, 290, 296, 297, 302, 303, 307, 308, 374,
Eilers, R.E. 123 433, 479
Eimas, P.D. 120, 127, 187, 261, 264, 348 Ferreira, V.S. 407, 408, 420, 453, 455, 456
Eisenband, J.G. 373 Ferreiro, E. 247, 249
Eisengart, J. 143 Fey, M.E. 136
Elbers, L. 414 Fiez, J.A. 69
Elder, L. 247 Fifer, W.P. 121
Elio, R. 335 Fillmore, C.J. 377
Ellis, A.W. 171, 173, 174, 185, 210, 214, 217, 218, Finch, S. 139
224, 234, 244, 249, 250, 255, 397, 437, 438, 440, Fischer, J. 57
445, 464 Fischler, I. 186, 187
Ellis, N.C. 94, 161, 248, 252 Fisher, C. 143
Ellis, R. 19, 110 Fisher, S.E. 53, 54, 115, 250
Elman, J.L. 25, 54, 80, 106, 117, 118, 139, 273, 276, Fitch, W.T. 40, 51, 52, 67
278, 481, 484 Fitzgerald, M.D. 145
Elsness, J. 290 Flege, J.E. 76
Emmorey, K. 414 Fletcher, C.R. 386
Engelborghs, S. 69 Flora, J. 340
Entus, A.K. 75 Flora, J. 340
Eriksen, C.W. 175 Flores d’Arcais, G.B. 311
Ervin-Tripp, S. 454 Flower, L.S. 444, 445
Estes, Z. 337, 338 Fluchaire, I. 467
Evans, H.M. 246 Fodor, J.A. 24, 63, 129, 264, 267, 288, 291, 293, 326,
Evans, M.A. 248 327, 330, 331, 332, 334, 460
Evans, N. 114
Fodor, J.D. 295, 330
Evans, W.E. 55
Fogassi, L. 53
Everett, C. 94
Folk, J.R. 204, 216
Everett, D.L. 67
Foltz, G. 251
Eysenck, M.W. 320, 445
Foltz, P.W. 354
Ford, M. 431
F Forde, E.M.E. 340
Fabbro, F. 69, 158 Forster, K.I. 12, 172, 175, 176, 182, 183, 188, 191,
Facoetti, A. 254, 255
Fadiga, L. 53 Fosker, T. 254
Faigley, L. 445 Foss, D.J. 189, 200, 262, 378
Fanshel, D. 453 Fouts, D.H. 60
Farah, M.J. 24, 184, 185, 222, 223, 235, 342, 343, 345, Fouts, R.S. 60
346, 347, 350, 352 Fowler, A.E. 82
Farmer, R.M. 259 Fowler, C.A. 268
Farrar, M.J. 82, 83, 108 Fox, N.C. 348
Faust, M. 387 Fox, P.T. 463
Faust, M.E. 389 Fox Tree, J.E. 433
Fay, D. 132, 410, 411, 463 Foygel, D. 442
Fayol, M. 417 Francis, W.N. 173, 231
Fedio, P. 349, 350 Frank, S.L. 293
Fedorenko, E. 71, 471 Franklin, S. 229, 234, 281, 439, 441, 466
Feher, E. 314, 315 Franks, J.J. 367, 382
Feitelson, D. 247 Frasconi, P. 305
Feldman, L.B. 155 Fraser, C. 139
Felix, S. 111 Frauenfelder, U.H. 265, 267, 277
Fenson, L. 126, 144 Frazier, L. 199, 203, 290, 295, 296, 305, 306, 309, 311
Freberg, L.A. 249 Gelman, S.A. 128
Frederiksen, J.R. 214 Gentner, D. 135, 137, 332, 444, 445
Freedman, J.L. 324 Gerard, L. 155
Frege, G. 322 Gergely, G. 332
Fremgen, A. 132 Gerken, L. 117, 119, 122, 125
Freud, S. 9, 397, 453 Gernsbacher, M.A. 173, 262, 314, 361, 376, 384, 387,
Freudenthal, D. 139 389, 434, 436, 444
Friederici, A.D. 71, 73, 298, 313, 315, 316, 435, 436 Gerrig, R.J. 167, 455
Friedman, M.P. 242, 246 Gerstman, L.J. 432
Friedman, N.P. 383 Gertner, Y. 143
Friedman, R.B. 222, 281, 465 Geschwind, N. 69, 250, 434
Frith, C.D. 460 Gibbs, R.W. 338, 451
Frith, U. 241, 242, 251, 254 Gibbs, R.W., Jnr 451
Fromkin, V.A. 63, 78, 107, 173, 398 Gibson, E. 40, 471
Frost, R. 192, 215 Gibson, J.J. 343
Froud, K. 222 Gildea, P. 338
Fulbright, R.K. 73, 298 Gilhooly, K.J. 173
Funnell, E. 221, 222, 229, 234, 341, 344, 345, 347, Gillette, J. 135, 137
348, 464, 467 Gilliom, L.A. 376
Furth, H. 87, 88 Glaser, W.R. 184
Glass, A.L. 176, 330
G Gleason, H.A. 95
Gaffan, D. 344 Gleason, J.B. 83–4, 414
Gagne, C.L. 337 Gleitman, H. 130, 135, 137, 144, 334
Gagnon, D.A. 428, 442 Gleitman, L. 98, 130, 135, 137, 144
Gahl, S. 442 Gleitman, L.R. 82, 86, 87, 99, 127, 130, 137, 334
Gainotti, G. 345, 351 Glenberg, A.M. 354, 356, 383
Galaburda, A.M. 250, 251 Gluck, M.A. 334, 335
Galantucci, B. 268 Glucksberg, S. 94, 199, 330, 337, 338
Gallagher, A. 254 Glushko, R.J. 213, 214, 229, 238
Gallese, V. 53, 356 Glynn, S.M. 364
Galli, R. 254 Gobet, F. 139
Galton, F. 9 Goffman, E. 453
Galvin, J. 434 Gold, E.M. 115
Ganong, W.F. 263 Goldberg, A.E. 287, 405
Garcia, L.J. 389 Goldberg, E. 76
Gardner, B.T. 60 Goldfield, B.A. 109, 110
Gardner, H. 389, 453 Goldiamond, I. 181
Gardner, M. 63 Goldin-Meadow, S. 113, 114
Gardner, R.A. 60 Goldinger, S.D. 272
Garnham, A. 196, 301, 361, 379, 380, 382, 398 Goldman-Eisler, F. 430, 431, 432, 433
Garnica, O. 109 Goldrick, M. 410, 425
Garnsey, S.M. 290, 296, 298, 301, 302, 311, 312 Goldstone, R.L. 98
Garrard, P. 348, 354, 355, 445 Golinkoff, R.M. 127, 319
Garrett, M. 177 Gollan, T.H. 414, 415
Garrett, M.F. 63, 200, 291, 293, 298, 313, 315, 330, Gombert, J.E. 243
331, 332, 347, 348, 355, 399, 400, 405, 415 Gomez, R.L. 117, 119
Garrod, S. 449, 453, 455, 462, 479 Gong, Z. 248
Garrod, S.C. 370, 374 Gonnerman, L.M. 352, 353
Gaskell, G. 199, 278 Good, D.A. 432
Gaskell, M.G. 277 Goodglass, H. 314, 414, 434, 435
Gathercole, S.E. 469, 471 Goodluck, H. 137
Gathercole, V.C. 134, 138 Goodman, J.C. 144
Gayan, J. 252 Goodman, L.S. 89
Gazdar, G. 45 Gopnik, M. 114, 130
Gee, J.P. 280 Gordon, B. 315, 346, 442
Gelman, R. 82, 109 Gordon, P. 92, 94, 138
Gordon, P.C. 376, 464, 471 Halliwell, J.F. 307
Goswami, U. 244, 245, 246, 253, 254 Halsted, N. 120, 122
Gottardo, A. 248, 252 Hammond, K.M. 346
Gotts, S.J. 340 Hampson, E. 341
Gough, P.B. 174, 210 Hampson, J. 129
Goulandris, A. 250 Hampton, J.A. 325, 330, 334
Goulding, P.J. 348 Hanley, J.R. 217, 250, 415
Govindjee, A. 427 Hanlon, C. 107
Graesser, A.C. 384 Hannigan, S. 157, 158
Graf, P. 291 Hanten, G. 473
Graham, A. 219 Hantsch, A. 430
Graham, F. 114 Hare, M. 308
Graham, K.S. 235, 346 Hargreaves, D.J. 376
Graham, N. 234 Harley, B. 75, 76, 155
Grainger, J. 157, 175, 183, 215, 246, 272 Harley, T.A. xi, 22, 349, 351, 399, 402, 414, 415, 420,
Granier-Deferre, C. 119 421, 422, 437, 440
Gray, W. 130, 334 Harm, M.W. 222, 230, 232, 235, 253
Green, D.W. 158 Harris, C.L. 189
Greenberg, J.H. 113 Harris, K.S. 261, 436
Greene, J.D.W. 467 Harris, M. 110, 140, 149
Greenfield, P.M. 64, 126 Harris, R.J. 371, 372
Greenhill, S.J. 8 Harris, Z.S. 10
Greenspan, S.L. 382, 383 Harrison, M.R. 248
Gregory, R.L. 19 Harste, J. 242
Grice, H.P. 452 Hart, J. 344, 346, 442
Griffin, Z.M. 399, 402, 403, 404, 420 Hartley, T. 427
Griffith, B.C. 261 Hartsuiker, R.J. 156, 403, 404, 421, 423, 426, 462
Grimston, M. 438 Haskell, T.R. 406, 409
Grober, E.H. 373 Hastie, K. 250
Grodzinsky, Y. 313, 435, 436 Hatcher, P.J. 244, 255
Groisman, M. 453 Hatfield, F.M. 477
Groothusen, J. 298 Hauk, O. 356
Grosjean, F. 155, 266, 271, 273, 280 Haun, D.B.M. 97, 98
Grossman, M. 349 Hauser, M.D. 40, 51, 52, 66, 67, 122
Grosz, B.J. 376 Haviland, S.E. 376
Growdon, J.H. 69, 409 Havinga, J. 399, 412, 418, 420, 423, 424
Guillermin, A. 68, 73 Hawkins, J.A. 40, 113
Gulikers, L. 173 Hawkins, W.F. 181
Gupta, P. 471 Haxby, J.V. 348
Hay, D. 84
H Hayes, C. 59
Haarmann, H.J. 315 Hayes, J.R. 444, 445
Haber, L.R. 217 Hayes, K.J. 58
Haber, R.N. 217 Hayes-Roth, B. 361
Hadzibeganovic, T. 251 Haywood, S.L. 292, 456
Haggard, P.N. 182, 183 Healy, A. 287
Hagoort, P. 71, 292, 298, 412 Heath, S.B. 110
Haider, H. 313 Hebb, D.O. 25
Hakuta, K. 76, 79, 153, 155, 157 Hecht, B.F. 136
Haldane, J.B.S. 53 Hedges, L.V. 109
Hale, B.L. 216 Heider, E.R. 95, 96
Hall, D.G. 109, 130 Heilman, K.M. 443
Halle, M. 268 Heim, S. 71
Halle, P. 123 Heinze, H.J. 156
Halle, P.A. 192 Heise, G.A. 259
Haller, M. 228, 231 Heit, E. 335
Halleran, J.G. 383 Henderson, A. 432
Henderson, J.M. 297, 303 Houghton, G. 427
Henderson, L. 174, 229 Howard, D. 21–2, 229, 235, 252, 281, 349, 439, 440,
Hendrick, R. 471 441, 464, 466, 470, 472, 477
Henly, A.S. 456 Howe, C. 110
Hennelly, R.A. 94 Howell, J. 222
Heredia, R. 159, 160 Howes, D.H. 172, 194
Herman, L.M. 58 Hughes, C.H. 219
Herrmann, D.J. 378 Huijbers, P. 411
Herrnstein, R. 65 Huiskamp, P. 403
Hespos, S.J. 113 Hulme, C. 243, 244, 245, 252, 253, 255
Hess, D.J. 189 Humphreys, G.W. 19, 176, 185, 218, 340, 341, 342,
Hessels, S. 248 345, 347, 411
Heywood, C.A. 344 Hunkin, N.M. 344
Hickerson, N.P. 96 Hunt, E. 94, 461
Hickok, G. 69, 71, 72, 409 Hurford, J.R. 113
Hieke, A.E. 433 Hurst, J.A. 114, 115
High, J. 68, 73 Husman, R. 313
Hildebrandt, N. 314, 472 Huss, M. 254
Hill, N. 149, 457 Hutchins, S. 303
Hill, R.L. 290 Hutchinson, J.E. 128
Hill, S. 244, 246 Huttenlocher, J. 109, 130, 148
Hillenbrand, J. 76 Hyams, N. 63, 107, 173
Hillier, C.S. 431 Hyde, M. 414
Hillis, A.E. 223, 229, 342, 344, 345, 346, 347,
417, 442 I
Hillyard, S.A. 19, 298 Ilmoniemi, R.J. 356
Hinton, G.E. 235, 237, 303, 332, 351, 422, 425, 481 Impey, L. 251
Hirsch, A.D. 60 Indefrey, P. 412, 413
Hirsh, K.W. 173, 280 Inglis, L. 341
Hirsh-Pasek, K. 108, 121, 127, 218, 319 Inhoff, A.W. 169, 188
Hitch, G.J. 468 Irwin, D.E. 402
Hladik, E.G. 109 Ito, H. 227, 235
Hockett, C.F. 55, 56 Ito, T. 227, 235
Hodges, J.R. 234, 235, 346, 348, 349, 354, 355, 442, Ivanova, I. 404
445, 467
Hodgson, J.M. 186, 190
Hoefnagel-Hohle, M. 76 J
Hoff, E. 109, 129 Jackendoff, R. 42, 52, 67, 321, 326, 351, 355, 479
Hoff-Ginsberg, E. 82, 124, 126, 134, 149 Jacklin, C. 73
Hoffman, C. 94 Jacobs, A.M. 183, 246
Hoffman, H.S. 261 Jacobsen, E. 88
Hoffman, J.E. 298 Jacoby, L.L. 175, 176
Hogaboam, T.W. 199, 200 Jaeger, J.J. 71, 146
Hogan, H.P. 93, 94 Jaffe, J. 432
Holcomb, P.J. 298, 340 Jain, M. 155
Holender, D. 171 Jakimik, J. 271
Hollan, J.D. 328 Jakobson, R. 123, 124, 125
Hollander, M. 147 James, L.E. 415
Hollingworth, A. 307 James, S.L. 145
Holmes, V.M. 290, 431 Jared, D. 214, 215, 216, 219, 238, 248
Holowka, S. 124 Jarvella, R.J. 291, 363
Holtgraves, T. 453 Jastrzembski, J.E. 196, 199, 206
Holyoak, K.J. 330 Jefferson, G. 454
Hooglander, A. 156 Jennings, F. 461
Hoops, H.R. 314 Jernigan, T. 82
Hopkins, W.D. 53, 64 Jescheniak, J.D. 350, 417, 430
Horton, W.S. 455 Jin, Y.-S. 155
Joag-dev, C. 367 Kellas, G. 204, 205, 206

Joanette, Y. 389 Keller, T.A. 304, 314, 472
Joanisse, M.F. 115, 146, 409 Kello, C. 290, 296, 302
Job, R. 344, 345 Kellogg, L.A. 59
Johnson, D. 130, 334 Kellogg, R.T. 445
Johnson, D.R. 94 Kellogg, W.N. 59
Johnson, E.K. 121, 122 Kelly, M.H. 139, 402
Johnson, J.S. 76, 77 Kelly, M.P. 349
Johnson, K.E. 335 Kelter, S. 383, 404
Johnson, M.H. 25, 80, 106, 117, 118 Kemler-Nelson, D.G. 121
Johnson, M.K. 184, 364, 365 Kemmerer, D. 146
Johnson, M.L. 77, 83 Kempen, G. 411
Johnson, N.S. 378 Kempler, D. 352, 353, 389
Johnson, R.E. 364 Kempton, W. 96
Johnson-Laird, P.N. 167, 293, 322, 326, 378, 379, Kendon, A. 432
380, 382 Kennedy, A. 295, 461
Johnson-Morris, J.E. 109 Kennedy, L. 121
Johnsrude, I. 356 Kennison, S.M. 99, 303
Johnston, J.C. 216 Keppel, G. 185
Johnston, R.S. 248 Kerling, R. 156
Jolicoeur, P. 334 Kersten, A.W. 80, 118, 161
Jonasson, J.T. 175, 214 Kertesz, A. 220, 221, 222, 341, 445, 466
Jones, G.V. 224, 237, 415 Keysar, B. 338, 375, 451, 456
Jones, L.L. 337, 338 Khalak, H.G. 146
Jorm, A.F. 241, 252 Khan, L.M.L. 145
Jose-Robertson, L. 68, 73 Kiger, J.I. 176
Joshi, A.K. 376 Kilborn, K. 315
Juliano, C. 303, 427 Killian, G. 110
Jusczyk, P.W. 104, 120, 121, 122, 259, 261 Killion, T.H. 196
Just, M.A. 219, 291, 292, 301, 304, 314, 332, 337, Kim, H.S. 99
471, 472 Kimball, J. 293, 471
King, M.L. 155
K Kintsch, W. 332, 363, 364, 382, 384, 385, 386
Kail, R. 333 Kiparsky, P. 124
Kako, E. 56, 58, 65, 67 Kirshner, H.S. 349
Kamide, Y. 292 Kirsner, K. 155, 432
Kaminski, J. 57 Kita, S. 97, 98
Kane, M.J. 387 Klapp, S.T. 175
Kanwisher, N. 71, 162, 189 Klatt, D.H. 276, 280
Karmiloff-Smith, A. 25, 80, 106, 117, 118, 146, 148, 243 Klee, T. 145
Karns, C.M. 298 Klein, E. 45
Katz, B. 314 Klein, J.F. 292
Katz, J.J. 326, 327, 331 Kliegel, R. 251
Kaufer, D. 445 Kluender, R. 311
Kaup, B. 364, 383 Knoll, R.L. 429
Kawamoto, A. 206 Knutsen, D. 376
Kay, D.A. 134 Kohn, S.E. 281, 414, 443, 465
Kay, J. 213, 215, 229, 250, 440, 464 Kolb, B. 68, 69, 73
Kay, P. 95, 96, 332 Kolinsky, R. 263
Kean, M.-L. 436 Kolk, H.H.J. 314, 315, 403, 404, 436, 437
Keane, M.T. 320, 445 Komatsu, L.K. 334, 335
Keating, P. 253 Kooijman, V. 292
Keefe, D.E. 180, 181 Kornai, A. 42
Keele, S.W. 333, 380 Koroshetz, W.J. 69, 409
Keenan, J.M. 364, 366, 385 Kosslyn, S.M. 334
Kegl, J. 114 Kounios, J. 298, 340
Keil, F.C. 402 Kowal, S.H. 433
Krageloh-Mann, I. 74 Lee, J.J. 451

Kraljic, T. 290, 456 Lefly, D.L. 254
Krashen, S.D. 76, 78, 159, 160, 161 Leiman, J.M. 199, 202, 205
Kraus, N. 250 Leinbach, J. 147
Kremin, H. 221 Lemey, P. 8
Kreuz, R.J. 199, 455 Lenneberg, E.H. 73, 74, 75, 79, 91, 95, 268
Kroll, J.F. 156, 189, 214 Leonard, C.M. 250
Krueger, M.A. 206 Leonard, L.B. 114, 115, 136
Kruger, A. 129 Leopold, W.F. 124, 154
Kruschke, J.K. 335 Lesch, M.F. 216, 222, 469, 470, 473
Kucera, H. 173, 231 Leschziner, G. 348
Kuczaj, S.A. 108, 146 Lesser, R. 464
Kuhl, P.K. 122 Lété, B. 246
Kursaal Flyers 40 Levelt, W.J.M. 350, 395, 396, 399, 403, 404, 410,
Kutas, M. 19, 157, 189, 298, 304, 311 412, 413, 417, 418, 420, 423, 424, 425, 427, 428,
429, 430, 433
L Levinberg-Green, D. 247
La Heij, W. 156, 412 Levine, D.N. 445
Labov, W. 330, 453 Levine, S. 148
Lachman, R. 366 Levine, W.H. 471
Lackner, J.R. 200 Levinson, K.L. 222, 223, 235
LaCount, K.L. 190 Levinson, S. 97, 451, 453
Lado, R. 159 Levinson, S.C. 97, 98, 114
Laham, D. 354 Levy, B. 124
Lahiri, A. 429 Levy, B.A. 216, 248
Lai, C.S.L. 115 Levy, E. 129
Laiacona, M. 347 Levy, J. 121, 276
Laine, M. 421, 440 Levy, Y. 138
Laing, E. 243, 244 Lewin, R. 64
Lakatos, I. 24 Lewis, K. 463
Lakoff, G. 334 Lewis, S.S. 210, 212
Lamb, J. 250 Lewis, V. 86
Lambert, W.E. 155, 161 Lewis, V.J. 252
Lambertz, G. 120, 122 Lewis, W. 332, 347, 348, 355
Lambon Ralph, M.A. 174, 185, 234, 348, 355, 409, 466 Li, P. 98
Landau, B. 86, 87, 130 Liberman, A.M. 258, 261, 268
Landauer, T.K. 104, 324, 354, 356 Lichten, W. 259
Lane, H. 78 Lidz, J. 130, 144
Langdon, R. 216, 228, 238 Lidzha, K. 74
Langer, P. 180 Lieberman, P. 59, 259
Langford, S. 415 Lieven, E. 110, 142, 143, 144
Lantz, D. 95 Lightfoot, D. 115
Lapinski, R.H. 179 Lindell, A.K. 69, 453
Lapointe, S. 436 Lindem, K. 383
Lau, I. 94 Lindsay, P.H. 194
Lauro-Grotto, R. 341, 343 Linebarger, M.C. 313, 435
Laws, G. 96 Lipson, M.Y. 367
Lawson, J. 62 Liu, L.G. 93
Lawson, J.S. 350 Liversedge, S.P. 403
Laxon, V. 246 Locke, J. 106
Le Bigot, L. 376 Locke, J.L. 80, 123
Leahy, J. 246 Lockhart, R.S. 155
Leaper, C. 73 Lockwood, A.H. 146
Lecanuet, J.P. 119 Loebell, H. 156, 403
Lederer, A. 135, 137 Loftus, E.F. 325, 329, 362, 371
Lee, A.C.H. 346 Logrip, M. 149, 457
Lee, H. 172 Lombardi, L. 363, 404
Lombardino, L.J. 250 Marcel, A.J. 171, 213, 229

Lombardo, V. 305 Marchman, V. 80, 126, 147
Long, M. 76 Marchman, V.A. 76
Long, M.H. 160 Marcus, G.F. 53, 54, 108, 115, 118, 143, 147
Longtin, C.M. 192 Marentette, P.F. 124
Loosemore, R. 22 Marian, V. 157
Lorch, R.F. 180, 181 Marien, P. 69
Lorusso, M.L. 254 Marin, O.S.M. 224, 225, 234, 281, 313, 314, 435, 445
Lotocky, M.A. 290, 302 Markman, E.M. 128, 129, 130, 149, 335
Lounsbury, F.G. 432 Markow, D.B. 128
Love, B.C. 336, 337 Marlow, A.J. 250
Lovegrove, W. 250 Marr, D. 13
Loveland, D. 65 Marsh, G. 241–2, 246
Lowenfeld, B. 86 Marshall, J.C. 56, 171, 220, 222, 223, 224, 226, 349,
Lucas, M. 202, 265 437, 445
Lucchelli, F. 347 Marslen-Wilson, W.D. 186, 187, 191, 199, 205, 206,
Luce, P.A. 272 259, 262, 267, 269, 270, 271, 272, 273, 277, 278,
Lucy, J.A. 96, 99 292
Lucy, P. 337 Martin, A. 348, 349, 350
Luk, G. 155 Martin, C. 204
Lukatela, G. 217 Martin, F. 250
Lund, K. 186, 320, 354, 356 Martin, G.L. 168
Lundberg, I. 249 Martin, G.N. 157
Lupker, S.J. 176, 186, 216 Martin, M. 224
Luria, A.R. 445 Martin, N. 404, 421, 428, 437, 440, 441, 442, 443,
Lyn, H. 66 466, 469, 470, 473
Lyons, J. 55 Martin, R.C. 186, 212, 222, 314, 315, 469, 470, 473
Lyster, S.-A.H. 245 Martinez-Beck, I. 83
Masataka, N. 109
M Masling, M. 363
MacAndrew, S.B.G. 414, 440 Mason, M.K. 79
Maccoby, E. 73 Mason, R.A. 304, 337
MacDonald, J. 281 Massaro, D.W. 261, 263, 275
MacDonald, M.C. 206, 232, 296, 301, 302, 303, 309, Massey, C. 98
389, 406, 409, 472 Masson, M.E.J. 206, 351
MacKain, C. 104 Masterson, J. 226, 246
MacKay, D.G. 200, 398, 399, 414, 421, 425 Masur, E.F. 109
Maclay, H. 431 Mathis, K.E. 155
MacLeod, C.M. 461 Matthews, G.H. 411
Macnamara, J. 137 Mattys, S.L. 121
MacNeilage, P.F. 124 Maugais, R. 119
MacWhinney, B. 18, 107, 138, 147, 364, 436, 471 Mauner, G.A. 312
Madden, C.J. 383 Mayberry, E.J. 348
Madigan, S. 177 Mayer, K. 9, 397
Madora, K. 94 Mayhew, D. 364
Maestrini, E. 250 Mazzucci, A. 435
Magiste, E. 155 McBride-Chang, C. 253
Magliano, J.P. 384 McCandliss, B. 251
Magnuson, J.S. 266, 271, 458 McCann, R.S. 175, 213, 231, 232, 238
Majid, A. 97 McCarthy, J.J. 43, 148
Maloney, L.M. 445 McCarthy, R. 221, 228, 345, 347
Maloney, L.T. 349 McCarthy, R.A. 341, 342, 389, 472
Malotki, E. 91 McCauley, C. 184
Mandler, J.M. 378 McClelland, J.L. 19, 23, 146, 147, 196, 197, 202,
Manis, F.R. 253 229, 230, 231, 232, 233, 234, 235, 236, 237, 238,
Manning, C.D. 26 273, 275, 276, 296, 299, 335, 346, 347, 355, 409,
Maratsos, M. 112, 136, 138, 139 418, 422, 468, 481
McCloskey, M. 18, 329, 330, 436 Michaels, D. 96

McConkie, G.W. 168, 295 Michel, D. 389
McCormick, S.F. 187 Michle, P. 341
McCune-Nicolich, L. 82 Mickanin, J. 349
McCutchen, D. 218 Milberg, W. 315
McDavid, V. 290 Miles, T.R. 252
McDonald, J. 18, 436 Mill, A.I.D. 398
McDonald, J.E. 198 Miller, C.A. 405
McDonald, J.L. 402 Miller, D. 438, 440, 445
McDonald, K. 64 Miller, D.C. 411
McDonnell, V. 217 Miller, D.L. 326, 330, 332, 351
McElree, B. 311 Miller, G. 287
McGill, J. 421 Miller, G.A. 11, 12, 90, 259
McGuire, K.A. 349 Miller, J.L. 120, 259, 261
McGurk, H. 96, 281 Miller, K.F. 94
McKean, K.E. 11, 12 Millis, M.L. 206
McKoon, G. 190, 290, 308, 311, 320, 368, 369, 381, Mills, A.E. 87
382 Mills, D.L. 75, 76, 157
McLaughlin, B. 159, 160 Milne, R.W. 299
McLean, J.F. 403, 404 Milner, B. 68, 74
McLeod, P. 237, 463 Milroy, R. 219
McNamara, T.P. 181, 190, 326, 330, 332, 351 Minsky, M. 380
McNeill, D. 90, 414, 415 Mintun, M.E. 463
McNorgan, C. 353 Mintz, T.H. 122, 139
McQueen, J.M. 260, 265, 276, 277, 279 Miozzo, M. 157, 344, 345, 409, 415, 416, 417, 420
McRae, K. 186, 215, 232, 238, 302, 303, 306, 308, Mitchell, D.C. 289, 290, 297, 304, 305
353 Mitchum, C.C. 281
McShane, J. 83, 126, 132, 137, 138, 147 Miyake, A. 314, 383, 471
McVay, J.C. 387 Moat, H.S. 462
Mead, N. 254 Moberg, P.J. 344
Meadow, K.P. 112 Moerk, E. 108
Meara, P. 226 Mohay, H. 87
Medin, D.L. 328, 334, 335 Molfese, D.L. 75, 76
Mehler, J. 11, 120, 121, 122, 200, 260, 261, Molfese, V.J. 76
276, 277 Molis, M. 77
Mehta, Z. 346 Monaco, A.P. 115
Meier, R.P. 87, 112 Monaghan, J. 214, 218
Meinshausen, R.M. 407 Monaghan, P. 174
Melby-Lervåg, M. 245 Mondt, K. 251
Melinger, A. 403, 423 Monsell, S. 182, 183, 190, 219, 280, 411, 429, 464
Meltzoff, A.N. 130 Montague, W.E. 175, 333
Menard, M.T. 251 Morais, J. 244, 263
Mencl, W.E. 298 Morais, J. 244, 263
Menn, L. 124, 125, 414, 435 Morgan, J.L. 108
Menyuk, P. 124, 136 Morris, A.L. 189
Méot, A. 173, 174 Morris, J. 466
Meringer, R. 9, 397 Morris, R.K. 203, 204
Mervis, C.B. 127, 129, 130, 333, 334, 335 Morrison, C.M. 173
Messer, D. 88, 108, 109, 112, 144, 148 Morrow, D.G. 382, 383
Messer, D.J. 109 Morsella, E. 420
Metcalf, K. 204 Morton, J. 18, 176, 182, 184, 194, 195, 196, 215, 224,
Metsala, J.L. 253 227, 228, 229, 267, 463
Meyer, A.S. 399, 402–3, 407, 412, 415, 417, 418, 420, Moryadas, A. 189
423, 424, 427, 428, 429 Moss, A. 96
Meyer, D.E. 176, 196, 199, 201, 464 Moss, H.E. 186, 187, 205, 206, 315, 347
Meyer, M. 383 Motley, M.T. 398, 399, 421, 425
Miceli, G. 281, 435, 443, 466, 467 Mowrer, O.H. 123
Mozer, M.C. 342, 343 Nishimura, M. 155

Mulac, A. 290 Nissen, C.H. 58
Mulford, R. 86 Noel, A. 189
Mullen, K. 97 Noppeny, U. 69
Muller, D. 246 Norell, S. 114
Muneaux, M. 272 Norman, D.A. 194
Munson, B. 253 Norris, D. 122, 196, 198, 214, 228, 232, 260, 261, 265,
Munte, T. 156 276, 277, 279, 280
Murdoch, I. 445 Nosofsky, R.M. 335
Murphy, B.W. 146 Nosselt, T. 156
Murphy, G.L. 334, 335, 374 Nowak, M.A. 5
Murphy, J. 64 Nozari, N. 425
Murray, W.S. 182, 194, 290, 461 Nunes, S.R. 247
Muter, V. 244, 246
Muth, K.D. 364
Myers, E. 290, 302 O
Myers, J.L. 383 Oakhill, J. 361, 379, 386
Myerson, P. 434 Öberg, R. 69
Mylander, C. 113, 114 Obler, L.K. 157, 158, 436
O’Brien, E.J. 374, 383
Obusek, C.J. 259, 263
N Ochs, E. 108, 110
Nagy, W. 7 O’Connell, D.C. 433
Naigles, L.R. 129, 130, 142, 143 Oden, G.C. 263
Nappa, R. 130 Olbrei, I. 12
Nass, R. 76 Older, L. 191
Nation, K. 254 Oller, D.K. 123, 124
Navarette, E. 417 Olsen, T.S. 69
Nazzi, T. 121 Olson, R.K. 251, 252, 253
Neary, D. 348 O’Neil, C. 60
Neary, G. 255 Onifer, W. 199, 202
Nebes, R.D. 349 Oppenheim, G.M. 462
Neely, J.H. 176, 177, 180, 181 Oppenheimer, D.M. 403
Neisser, U. 362 Orchard-Lisle, V. 439
Nelson, K. 65, 86, 104, 125, 129, 131, 132, 133 O’Regan, K. 183
Nespoulous, J.-L. 435, 436 Orwell, G. 90
Neville, H. 298, 313 O’Seaghdha, P.G. 181, 188, 297, 420, 422
Neville, H.J. 75, 76 Osgood, C.E. 9, 431
New, B. 192 Osterhout, L. 298
Newcombe, F. 220, 222, 223, 445 Ostrin, R.K. 187, 315, 435
Newhoff, M. 136 Ostry, D.J. 124
Newmark, L. 161 O’Sullivan, C. 61
Newport, E.L. 66, 76, 77, 79, 80, 112, 118, 121, 122 Oxbury, S. 234, 348
Newsome, S. 198
Newson, M. 37, 111
Newton, P.K. 223, 224, 237 P
Ng, M.-L. 155 Paap, K.R. 198
Ni, W. 298, 300 Pachella, R.G. 170
Nickels, L. 349, 470 Pacht, J.M. 203
Nicol, J. 311, 312, 406 Paek, T.S. 375
Nicol, J.L. 298, 313 Paganelli, F. 93
Nie, H. 244 Paget, R. 53
Niederehe, G. 454 Paivio, A. 155, 177, 340
Nienhuys, T.G. 109 Palmer, J. 461
Nigam, A. 298 Palmer, J.C. 371
Ninio, A. 109, 128 Palmeri, T.J. 335
Nippold, M.A. 333 Pammer, K. 254
Nisbett, R.E. 99 Papafragou, A. 98, 99, 130
Papagno, C. 160, 442, 471 Pillard, R. 78

Paquier, P.F. 69 Pine, J. 139, 142, 143
Paradis, M. 157, 158 Pine, J.M. 110, 126, 144
Parisi, D. 25, 80, 106, 117, 118 Pinker, S. 24, 52, 56, 62, 63, 67, 91, 97, 108,
Park, T.Z. 154 112, 114, 116, 136, 139, 141, 143, 146, 147, 148,
Parkes, C.H. 331, 332 409, 451
Parkin, A.J. 171, 215, 344, 345 Pinto, J.P. 122
Parmalee, C.M. 184 Pisoni, D.B. 261, 272
Passingham, R. 114, 115 Pitchford, N. 97
Pate, D. 313, 435 Pitt, M.A. 263, 276
Pate, J.L. 62 Plaut, D.C. 19, 118, 232, 234, 237, 238, 340, 349,
Patterson, F. 61 351, 422
Patterson, K.E. 215, 218, 219, 221, 222, 224, 226, 227, Plomin, R. 117
228, 229, 232, 233, 234, 235, 236, 238, 281, 348, Plunkett, K. 25, 80, 106, 117, 118, 147
349, 354, 409, 442, 445, 464 Poeppel, D. 21, 69, 71, 72
Paul, S.T. 204 Polk, T.A. 184
Paulsen, J.S. 349 Polkey, C.E. 224
Paus, T. 268 Pollack, M.D. 175
Payne, S.J. 382 Pollatsek, A. 168, 169, 172, 192, 216, 217, 218, 219,
PDP Research Group 273, 275 228, 241, 244, 247, 295
Pearce, J.M. 54 Poncelet, M. 83
Pearl, E. 155 Poon, L.W. 187
Pearlmutter, N.J. 206, 290, 296, 301, 302, 303, 408 Popovics, A. 445
Pechmann, T. 399, 412, 418, 420, 423, 424 Posner, M.I. 177, 333, 380, 460, 463
Peereman, R. 175 Post, K.N. 107
Pellat, J. 467 Postal, P. 45
Pembrey, M.E. 115 Postma, A. 425
Penfield, W. 68 Postman, L. 185
Pennington, B.F. 216, 254 Potter, H.H. 389
Pepperberg, I.M. 57 Potter, J.M. 397
Peracchi, K.A. 383 Potter, M.C. 155, 162, 189, 363, 404
Perc, M. 251 Powelson, J.A. 389
Perchonock, E. 11 Prasada, S. 148
Pérez-Pereira, M. 87 Prat, C.S. 337
Perfect, T.J. 415 Premack, D. 60, 64, 67
Perfetti, C.A. 199, 200, 217, 218 Price, C.J. 69, 184
Perron, C. 435 Prince, A. 147, 148
Perry, C. 216, 228, 238, 246 Pring, L. 213
Peters, A.M. 224 Pring, T. 437
Peters, P.S. 45 Protopapas, A. 279
Petersen, A. 232, 238, 253 Proverbio, A.M. 156, 158
Pugh, K.R. 73, 298
Petersen, S.E. 69, 463
Pullum, G.K. 42, 45, 91, 113
Peterson, R.R. 420
Pulvermuller, F. 313, 356, 436
Pethick, S. 144
Pye, C. 110
Petitto, L.A. 61, 62, 63, 65, 66, 112, 124
Pynte, J. 305
Petrie, H. 431, 432
Pexman, P.M. 216
Piaget, J. 24, 81, 90, 106 Q
Piattelli-Palmarini, M. 51 Quillian, M.R. 323, 324
Piccini, C. 341, 343 Quine, W.V.O. 127, 334
Pichert, J.W. 366 Quinlan, P.T. 173, 176
Pickering, M.F. 374 Quinn, P.C. 127, 348
Pickering, M.J. 156, 292, 293, 297, 298, 301, 302,
303, 306, 307, 312, 403, 404, 449, 455, 456,
462, 479 R
Piepenbrock, R. 173 Rack, J.P. 243, 253
Pietroski, P. 109 Radford, A. 35, 43
Radvansky, G.A. 383, 455 Roberts, B. 432

Ragain, R.D. 184 Roberts, J.M. 91
Rahman, R.A. 423 Roberts, L. 68
Raichle, J. 463 Robertson, D.A. 354
Raichle, M.E. 69 Robertson, S.P. 368
Randall, B. 272 Robinson, H.A. 387, 388
Randall, J. 311 Robinson, K.M. 349
Ranken, H.B. 94 Robinson, P. 160
Rao, S.B. 118 Robson, J. 437
Rapp, B. 340, 344, 410, 425, 442, 466 Rochford, G. 349
Rapp, B.C. 342 Rochon, E. 314, 472
Rapp, D.N. 422 Rodd, J. 199
Rasch, B.H. 98 Rodman, R. 63, 107, 173
Rasmussen, T. 68, 74 Rodriguez-Fornells, A. 156
Rastle, K. 192, 216, 217, 220, 228, 238 Roe, K. 74
Ratcliff, G. 346 Roediger, H.L. 176
Ratcliff, J.E. 190 Roelofs, A. 417, 425, 427, 428, 429
Ratcliff, R. 290, 308, 311, 320, 368, 369, 381 Roelstraete, B. 421
Rativeau, S. 305 Roeltgen, D.P. 224, 443
Raymer, A. 344 Rogalsky, C. 71
Rayner, K. 168, 169, 172, 181, 188, 189, 199, 203, 204, Rogers, E.S. 455, 456
205, 216, 217, 218, 219, 241, 244, 247, 290, 295, Rogers, T.T. 355
296, 309 Rohde, D. 471
Read, C. 244, 249 Rohde, D.L.T. 118
Reber, A.S. 292 Rolnick, M. 314
Redfern, B.B. 71 Romani, C. 340, 470, 473
Redington, M. 139 Rosch, E. 130, 320, 325, 333, 334
Redlinger, W. 154 Rosen, G.D. 250, 251
Rees, G. 460 Rosen, T.J. 147
Reeves, L.M. 319 Rosenbaum, S. 62
Reggin, L.D. 216 Rosnow, M. 445
Reich, P.A. 421 Rosnow, R.L. 445
Reicher, G.M. 196 Ross, B.H. 382
Reichle, E.D. 169 Ross, C. 124
Reid, C. 461 Ross, K. 180, 181
Reingold, E.M. 216 Rosson, M.B. 213
Rescorla, L. 132 Rothi, L.G. 443
Rey, M. 155 Rothi, L.J.G. 344
Reznick, J.S. 126, 144 Rotte, M. 156
Rho, S.H. 199 Roy, D. 355
Richards, D.G. 58 Rubenstein, H. 210, 212
Richards, M.M. 130 Rubenstein, M.A. 210, 212
Richardson, A.J. 250 Rubin, D.C. 177
Richardson, D.C. 458 Rubin, J. 154
Richardson, U. 245 Ruddy, M.G. 196
Riddoch, M.J. 341, 342, 345 Ruffino, M. 254
Rigalleau, F. 373 Rumbaugh, D.M. 61, 62, 64
Rigler, D. 78 Rumelhart, D.E. 146, 147, 196, 197, 273, 275, 378,
Rigler, M. 78 422, 481
Rinck, M. 383 Rumiati, R.I. 185, 443, 470
Rips, L.J. 324, 325, 327, 329, 335, 336 Ruml, W. 442
Ritchie, R.W. 45 Rupert, E. 64
Rivas, E. 65 Rush, M.L. 87
Rizzella, M.L. 383 Russell, C. 460
Rizzolatti, G. 53 Ruts, W. 335
Robbins, C. 246 Ryan, E.B. 157
Roberson, D. 97 Rymer, R. 79
S Schriefers, H. 399, 407, 412, 418, 420, 423, 424, 430

Schrock, J.C. 433
Sacchett, C. 345
Sachs, J. 77, 83 Schuberth, R.E. 187
Sachs, J.S. 362 Schubotz, R.I. 71
Sacks, H. 454 Schustack, M.W. 188
Saffran, E.M. 224, 225, 234, 281, 298, 313, 314, 332, Schvaneveldt, R.W. 176, 179, 196, 198, 199, 201
341, 345, 349, 404, 421, 426, 428, 435, 437, 440, Schwanenflugel, P.J. 155, 190
441, 442, 445, 469, 470, 473 Schwartz, M.F. 224, 225, 234, 313, 314, 332, 345, 352,
Saffran, J.R. 118, 121, 122, 143 425, 426, 428, 434, 435, 437, 441, 442, 443, 445
Sag, I.A. 45 Schyns, P.G. 98
Sagart, L. 123, 124 Scifo, P. 356
Sage, K. 185, 348 Searle, J.R. 337, 450, 451
Salamoura, A. 156 Sebastian-Galles, N. 421, 429
Salmon, D.P. 349 Sebeok, T.A. 9
Samuel, A.G. 155, 264, 422 Sedivy, J.C. 267, 303, 304, 457
Sanders, M. 180 Seergobin, K. 231, 232, 238
Sanders, R.J. 61 Segui, J. 183, 192, 200, 260, 261, 276, 277
Sandra, D. 190, 191 Seidenberg, M.S. 18, 19, 62, 63, 65, 66, 115, 118, 121,
Sanford, A.J. 374 146, 173, 174, 180, 181, 199, 202, 205, 206, 214,
Santa, J.L. 94 215, 216, 218, 222, 229, 230, 231, 232, 233, 234,
Sanz, M. 307 235, 236, 238, 251, 253, 296, 301, 303, 352, 353,
Sartori, G. 344, 345 409, 468
Sasanuma, S. 226, 227, 235 Seifert, C.M. 368, 381
Saccuman, M.C. 356 Sejnowski, T.J. 481
Savage, C. 142 Sekerina, I. 149, 457
Savage, G.R. 183 Semenza, C. 344
Savage, R.S. 246 Senghas, A. 114
Savage-Rumbaugh, E.S. 61, 62, 64, 65, 66 Sergent-Marshall, S.D. 177
Savin, H.B. 11, 262 Sergio, L.E. 124
Savoy, P. 420 Sevcik, R.A. 64
Saxton, M. 108 Seymour, P.H.K. 244, 246, 247, 253
Scahill, R.I. 348 Shafto, M. 416
Scarborough, D.L. 155, 175 Shallice, T. 17, 18, 19, 220, 221, 223, 228, 235,
Scarborough, H.S. 175 237, 303, 332, 339, 341, 342, 343, 344, 345, 349,
Scarcella, R. 76 351, 421, 422, 425, 437, 439, 443, 445, 463, 469,
Scardamalia, M. 445 470, 472
Schachter, D.L. 176 Shankweiler, D.P. 258, 268, 300
Schaeffer, B. 325 Shannon, C.E. 10
Schaeffer, H.R. 85 Shapiro, G. 60
Schaffer, M.M. 335 Shapiro, K. 344
Schaner-Wolles, C. 313 Share, D.L. 241, 246, 247
Schank, R.C. 326, 378, 380, 381 Sharkey, A.J.C. 188
Schegloff, E.A. 454 Sharkey, N.E. 188
Schenkein, J. 403 Sharpe, K. 161
Schieffelin, B. 108, 110 Shattuck, R. 78
Schiller, N.O. 428, 429 Shattuck-Hufnagel, S. 426, 438
Schilling, H.E.H. 181 Shatz, M. 83, 109
Schlesewsky, M. 71 Shaw, L.K. 129
Schlesinger, H.S. 112 Shaywitz, B.A. 73
Schlesinger, I.M. 137, 138, 140 Shaywitz, S.E. 73, 298
Schmalhofer, F. 386 Sheldon, A. 372
Schneider, W. 177, 460 Shelton, J.R. 186, 346, 347, 442, 466, 473
Schneiderman, M. 108 Sheridan, J. 344, 347
Schnur, T.T. 407 Sherman, G.F. 250
Schober, M.F. 455 Shewell, C. 464
Schoknecht, C. 176 Shiffrin, R.M. 177, 460
Schreuder, R. 191 Shillcock, R. 121, 276
Shillcock, R.C. 398 Solomon, E.S. 408

Shoben, E.J. 324, 325, 327, 329, 337 Solomon, R.L. 172, 194
Sholl, A. 189 Soltano, E.G. 162
Shrier, R. 314 Spelke, E. 113
Shtyrov, Y. 356 Spelke, E.S. 128, 130
Shulman, H.G. 298 Spence, M.J. 119
Shweder, R.A. 96 Spencer, A. 35
Sidhu, R. 252 Spender, D. 99
Siegel, L.S. 248, 252, 253, 254 Sperber, D. 452
Silber, R. 124 Sperber, R.D. 184
Silverberg, S. 155 Spieler, D.H. 177
Silveri, M.C. 345, 351 Spiro, R.J. 369
Simons, J.S. 346 Spivey, M.J. 157, 306, 457
Simons, R.F. 298 Spivey-Knowlton, M.J. 267, 302, 303, 306, 457
Simpson, G.B. 199, 202, 203, 204, 206 Stackhouse, J. 253
Simpson, I.C. 476 Stager, C.L. 122
Sin, G. 438, 445 Stahl, S.A. 247
Sinclair-de-Zwart, H. 81, 83 Stamatakis, E.A. 272, 416
Singer, M. 369, 386, 387 Stamenov, M.I. 53
Singh, J.A.L. 78 Stamm, E.G. 181
Siqueland, E.R. 120 Stanners, R.F. 196
Sitton, M. 342, 343 Stanovich, K.E. 177, 187, 188, 214, 248, 252, 253
Ska, B. 435 Stanowicz, L. 107
Skarbek, A. 432 Stark, R.E. 104
Skinner, B.F. 9, 10, 107, 475 Starreveld, P.A. 412
Skottun, B.C. 250 Steedman, M.J. 299, 300, 361
Skoyles, J. 250 Steffensen, M.S. 367
Skudlarski, P. 73 Stefflre, V. 95
Slaghuis, W. 250 Stein, J. 250, 255
Sleiderink, A. 403 Stemberger, J.P. 402, 421, 422, 425, 437, 440
Slevc, L.R. 455, 456 Sterling, C.M. 190
Slobin, D.I. 12, 116, 137, 140, 148 Sternberg, S. 429
Smiley, P. 130 Stevens, K.N. 268
Smith, E.E. 324, 325, 327, 328, 329 Stevenson, S. 141, 143
Smith, J.H. 126 Stewart, A.J. 374, 403
Smith, M. 155, 407, 422 Stewart, E. 156
Smith, M.W. 96 Stewart, F. 344, 345
Smith, N. 82 Stigler, J. 94
Smith, N.J. 304 Stiles, J. 76
Smith, N.V. 125 Stone, G.O. 216
Smith, P.T. 190 Storms, G. 335
Smith, S.D. 389 Stowe, R.M. 222, 223, 235
Smith, S.M. 89 Strain, E. 218
Smith, T. 62 Straub, K. 375
Snedeker, J. 290, 457 Strawson, C. 214, 251
Snodgrass, J.G. 340, 344 Studdert-Kennedy, M. 258, 268
Snow, C.E. 76, 84, 85, 107, 109, 110, 154 Sturt, P. 305
Snowden, J.S. 348 Sudhalter, V. 149
Snowling, M.J. 243, 244, 245, 246, 250, 252, 253, Sulin, R.A. 367, 370
254, 255 Summerfield, Q. 261
Snyder, C.R.R. 177, 460 Suzuki, T. 222, 238
Snyder, L. 104 Svec, W.R. 426
So, K.F. 155 Swain, M. 154
Soares, C. 155 Swallow, J. 438
Sobel, P. 442 Swan, M. 196
Soja, N.N. 130 Swenson, M.R. 349
Sokolov, J.L. 84 Swets, B. 408
Swingley, D. 122 Trevarthen, C. 85

Swinney, D.A. 199, 201, 202, 262, 298, 311, 315, 338 Trofe, J.L. 99
Sykes, J.L. 123 Trueswell, J.C. 130, 149, 290, 296, 298, 301, 302, 312,
373, 457
Trussardi, A.N. 254
T Tsimpli, I.-M. 82
Tabor, W. 303 Tucker, G.R. 155, 161
Tabossi, P. 205 Tuffiash, E. 344
Taft, M. 190, 191, 212, 216 Tulving, E. 176, 319
Tager-Flusberg, H. 83 Turner, J. 466
Tallal, P. 115 Turner, T.J. 364, 380, 381
Tam, P. 416 Turvey, M.T. 171, 217, 268
Tanaka, J.W. 335 Tweedy, J.R. 179
Tanenhaus, M.K. 181, 199, 202, 205, 214, 265, 266, Twilley, L. 231, 232, 238
267, 271, 290, 296, 298, 301, 302, 303, 304, 306, Tyler, L.K. 187, 191, 265, 266, 267, 271, 272, 292,
308, 311, 312, 457, 458 315, 347, 389, 416
Tannenbaum, P.H. 431
Taraban, R. 296, 299
Tash, J. 261 U
Taylor, I. 116, 117, 155 Ullman, M.T. 69, 70, 146, 147, 409
Taylor, M. 128, 335 Urbach, T.P. 304, 403
Taylor, M.M. 116, 117, 155 Utman, J.A. 314, 434, 436, 444
Taylor, S. 244, 246
Teberosky, A. 249
Tees, R.C. 75, 120, 121 V
Tehori, B.Z. 247 Vaid, J. 157
Temple, C.M. 251 Valdois, S. 238
Terrace, H.S. 61 Valentine, T. 160, 471
Terras, M. 370 Valian, V. 138
Teruel, E. 407 Vallar, G. 389, 469, 472
Tettamanti, M. 356 van Berkum, J.J.A. 292
Teuber, H.-L. 75 van Cantfort, T.E. 60
Thal, D. 76, 126, 144 van den Boogaard, S. 412
Theakston, A. 142 van den Noort, M. 251
Theakston, A.L. 143 Van der Linden, M. 83
Thibaut, J.-P. 98 van der Velden, E. 156
Thiessen, E.D. 122 Van Dijk, T.A. 364, 382, 384, 385
Thomas, E.L. 387, 388 Van Gompel, R.P.G. 297, 303, 306, 307
Thomas, J. 40 Van Graan, F. 212, 216
Thomas, J.E.P. 89 van Grunsven, M. 437
Thomas, K.M. 251 van Hell, J.G. 156
Thomas, M.S.C. 74, 146, 148 van Heuven, W.J.B. 157
Thompson, C.R. 63 van Ijzendoorn, M.H. 244
Thompson, R. 414 van Kralingen, R. 251
Thompson, S. 290 van Mier, H. 69
Thomson, J. 132 Van Orden, G.C. 215, 216, 233
Thorndyke, P.W. 332, 361, 363, 364, 378, 379 Van Petten, C. 19, 188
Tincoff, R. 104 Van Turenout, M. 412
Tippett, L.J. 350, 352 van Valin, R.D. 71, 146
Tomasello, M. 82, 84, 129, 142, 143 Vanderstoep, L. 110
Tornéus, M. 249 Vanderwart, M. 184, 344
Torrey, J.W. 291 Vargha-Khadem, F. 114, 115, 224
Townsend, D.J. 307 Varma, S. 472
Townsend, J. 115 Varner, K.R. 387
Trauner, D. 76, 82 Varney, N.L. 281
Travis, L.L. 108 Vasilyeva, M. 109, 148
Traxler, M.J. 292, 302, 306, 307, 312 Veltkamp, E. 156
Treiman, R. 108, 218, 244, 249 Veres, C. 176
Vevea, J. 109 Weaver, W. 10

Vidyasagar, T.R. 254 Webb, W.G. 349
Vigliocco, G. 93, 332, 347, 348, 355, 405, 406, 415, Weekes, B.S. 175
426, 462 Weil, C.M. 184
Vigorito, J. 120 Weinrich, M. 466
Vihman, M.M. 154 Weinstein, S. 376
Vijayan, S. 118 Weisberg, R.W. 94, 421
Villa, G. 443 Weizenbaum, J. 14
Vincze, N. 479 Welch, V. 242, 246
Vinson, D.P. 93, 332, 347, 348, 355 Well, A.D. 168
Vipond, D. 385 Wells, G. 110
Vishton, P.M. 118 Welsch, D. 386
Vitevitch, M.S. 415 Welsh, A. 269, 270, 292
Vitkovitch, M. 411 Wengang, Y. 227
von Cramon, D.Y. 71 Wenger, N.R. 127
von Eckardt, B. 155 Werker, J.F. 75, 120, 121, 122, 123
Von Frisch, K. 54 Wesche, M. 154
Vonk, W. 384 Wessels, J. 266, 267, 271
Vorberg, D. 399, 412, 418, 420, 423, 424 West, R.F. 177, 187, 188, 248
Vouloumanos, A. 143 Westenberg, C. 403
Vu, H. 204, 205 Westwood, A. 196
Vygotsky, L. 89 Wetzel, W.F. 315
Wexler, K. 114
W Whalen, D.H. 268
Wachtel, G.F. 129 Whaley, C.P. 172, 174, 177
Wade, E. 414 Wheeldon, L. 407, 422, 428, 429
Waksler, R. 191 Wheeldon, L.R. 411
Wales, R.J. 135 Wheeler, D. 196
Walker, C.H. 380 Whishaw, I.Q. 68, 69, 73
Walker, E.C.T. 331, 332 Whitaker, H.A. 75
Wallace, M.A. 347 White, M.N. 173
Wallace, R. 325 Whitwell, J.L. 348
Walter, A.A. 93, 94 Whorf, B.L. 90, 91
Wang, H. 254 Wickelgren, W.A. 230
Wang, W. 75, 76, 155 Wieman, L.A. 124
Wanner, E. 127, 295 Wierzbicka, A. 326
Ward, G. 311 Wightman, J. 243
Wardlow Lane, L. 453 Wilce, L.S. 243, 245
Warren, C. 184, 195, 278 Wilding, J. 253
Warren, P. 157, 262, 270, 277 Wilensky, R. 380
Warren, R.K. 402 Wilkes-Gibbs, D. 455
Warren, R.M. 259, 263, 264 Wilkins, A.J. 255, 324, 325
Warren, R.P. 259, 263, 264 Wilkins, D.P. 71
Warrington, E.K. 220, 221, 223, 228, 234, 237, 339, Wilks, Y. 326
340, 341, 342, 343, 344, 345, 347, 348, 349, 389, Willems, R.M. 356
439, 443, 469, 472 Williams, D.F. 250
Wason, P.C. 12 Williams, F. 431
Wasow, T. 430 Williams, J.N. 156, 188
Waterfall, H. 109 Williams, P.C. 171
Waters, G.S. 180, 181, 214, 217, 314, 471, 472 Williams, S.L. 64
Watkins, K.E. 114, 115, 268 Willis, C. 469
Watson, F.L. 173, 215 Willows, D.M. 247
Watson, J.B. 88 Wilshire, C.E. 440
Watson, J.E. 248 Wilson, B. 472
Watson, P.C. 354 Wilson, D. 452
Watts, D. 479 Wilson, M. 454
Waxman, S.R. 128, 130 Wilson, T.P. 454
Wingfield, A. 292, 414 Yeager, C.P. 61

Winner, E. 453 Yekovich, F.R. 363, 380
Winnick, W.A. 195, 463 Yeni-Komshian, G.H. 281
Winograd, T.A. 14 Yeung, H.H. 122
Wish, M. 451 Yngve, V. 454
Wisniewski, E.J. 336, 337 Yopp, H.K. 243, 244
Witte, S. 445 Young, A.W. 224, 464
Wittgenstein, L. 329, 396 Yuill, N. 386
Wolz, J.P. 58 Yuille, J.C. 177
Woodruff-Pak, D.S. 387
Woods, B.T. 74, 75
Woods, W.A. 378 Z
Woodward, A.L. 129 Zadini, A. 443, 470
Woodward, V. 242 Zagar, D. 305
Woodworth, R.S. 415 Zaidel, E. 224
Worthley, J.S. 414 Zani, A. 156, 158
Wright, B. 177 Zanni, G. 371
Wright, C.E. 429 Zanuttini, R. 415
Wulfeck, B. 115, 314, 434, 436, 444, 472 Zardon, F. 205
Wydell, T.K. 218 Zettin, M. 344
Wydell, T.N. 222, 238 Zevin, J.D. 173, 174, 219, 220, 238
Zhang, S. 218
Zhang, Y. 244
X Zhuang, J. 272
Xu, F. 81, 147 Ziegler, J. 216, 228, 238
Ziegler, J.C. 245, 246, 255, 272
Zimny, S. 386
Y Zingg, R.M. 78
Yaffee, L.S. 473 Zorzi, M. 255
Yamada, J.E. 82 Zukowski, A. 244
Yamada, Y. 298 Zurif, E.B. 281, 313, 315, 434
Yankama, B. 109 Zwaan, R.A. 364, 383, 384, 455
Yap, M.J. 177 Zwitserlood, P. 271, 272, 292
SUBJECT INDEX
Italic page numbers indicate tables; bold numbers indicate figures, pictures and text boxes.
2001: A Space Odyssey 14

A
abstract knowledge 36
abstract nouns 37
abstraction: and memory 362; use of 142–3
abstraction theories 335
acceptable sentences 10
accessibility, comprehension 375–6
accommodation 81
acoustic invariance 259
acoustics 30
acquired reading disorders 220
acquisition and learning distinction hypothesis 160
ACT* 377–8
activation 15, 190, 265
active-filler strategy 311
active filter hypothesis 160
active sentences 39
Adaptive Control of Thought (ACT) 377
adjectives 37
adult reading disorders 220–7; analysis of dyslexia 226; deep dyslexia 223–5, 227–8, 235–7; dyslexia 225–6; dyslexia in languages other than English 226; phonological dyslexia 221–3, 235; surface dyslexia 220–1, 227, 233–5
adverbs 37
advertising, inferences 372
affix stripping 191
affixes 401, 408
affricatives 34
age, bilingualism 158
age-of-acquisition (AOA) 173–4, 174; reading 214, 218
agents 136
agnosia 185
agrammatic aphasia 313–16; impairment of automatic or attentional 315; processing of content and function words 315
agrammatism 434–7
agraphia 444–5
agreement errors 405–7
alignment 455
allophones 31
alphabetic principle 249
alphabetic stage 242
alveolars 34
alveopalatals 34
Alzheimer’s disease 146, 314, 348–50; brain scan 348; language loss 352; syntactic processing 472; writing 445
ambiguity 39, 319; anaphoric 372–5; in conversations 454–6, 456; lexical ambiguity 198–205; resolving 149; structural 288–91; syntactic 303
ambiguity detection task 200
American Sign Language (ASL) 60, 62–3
analogy model 229
analysis-by-synthesis 268
analytic phonics 247–8
anaphora 372, 376–7
anatomy of language 4
angular gyrus 69
animacy 149
animals: communication 5–6, 54–5; gestures 54; language 54–67; teaching language to 57–67
anomia 415, 438–40, 439, 464
anthropological evidence 91
anticipation 291
antonyms 320
Apache 91
apes: cognitive abilities 58; language teaching 58–67; sign language 61, 61–5; syntactic abilities 66; teaching offspring 60; use of symbols 60, 60
aphasia 68, 146, 433–4, 443, 466; bilingualism 157; connectionist models 440–2; evaluation of research 443–4; fluent aphasia 409
applied research 477
apraxia 434
Arabic, pro-drop parameter 111
arcuate fasciculus 17
artifacts, semantic features 353
artificial intelligence (AI) 13, 15, 26, 321, 326, 368, 377, 475, 477
artificial languages 44, 118
aspirated sounds 30
assimilation 80–1, 259
associations 320–1, 322–3
associative facilitation 267
associative semantic priming 185–7
attachment preferences 305
attentional dyslexia 220
attentional processes, visual word recognition 177–80
attentional processing 177–8, 178; agrammatic aphasia 315; evaluation of research 180; two-process priming model 179–80
attitude and emotion, second language acquisition 160
attractors 236
audience design 455–6
audiolingual teaching 159
auditory comprehension 71–2, 157
auditory short-term memory (ASTM) 473; tasks 469–70
autism, language development 83
automata theory 44
automatic associative priming 186–7
automatic inferences 369
automatic non-associative priming 186–7
automatic processing 177–8, 178; agrammatic aphasia 315
autonomous access model 203–4
autonomous-interactive distinction 266
autonomous models of parsing 288
autonomy, in syntactic processing 296
autonomy of syntax 11
autonomy theory 266–7
auxiliary hypotheses 24
auxiliary verbs 41; visual impairment 87

B
babbling 104, 123, 123–5
babytalk 109–11
back-channel communication 454
back propagation 230, 483–5
backward translation 156
backwards masking 171–2, 172
base frequency effect 191
basic-level terms 130
basic levels 334
basis of language: biological basis 67–73; cognitive basis 80–2; genetics 53; hand gestures 53–4; origins 51–4; overview 51; primate studies 53; protolanguage 52; social basis 83–8; social factors 53; summary 100
Bassa, color coding 95
Bayesian models 478
bees 54, 55, 57
behaviorism 10, 123; arguments against 108; as empirical 106; view of thought 88–99
Berinmo 97
bias: in comprehension 376; familiarity bias 421; in learning 127; lexical bias 421; in research 16, 142; response bias 182; semantic bias 310; verb bias 302–3; whole-object bias 128–9
bigram frequency 214
bilabial sounds 33
Bilingual Interactive Activation Plus (BIA+) model 157
bilingualism 94, 480; advantages 154–5; age of acquisition 158; aphasia 157; categories 153–4, 154; and cognitive processing 154–5; and color coding 96; early research 154; evaluation of research 162; interference 157; language processing 155–7; lateralization 157; lexicalization 411, 421; models 157; neuroscience 157–8; overview 153; parameter setting 112; second language acquisition 158–61; segmentation 260–1; summary 162; syntactic processing 156; tip-of-the-tongue (TOT) 415; translation 156–7
biological basis, of language 67–73
blindness see visual impairment
blindspots 169
blocking hypothesis 415
body 215
bonobos, language acquisition 64–5
book: cognitive emphasis 4; conclusions 480; themes 22–6, 23, 475–7
BootLex 121
bootstrapping 121, 122–3; semantic 136–7; syntactic 130
borrowing, of words 8
bottom-up 24
bound morphemes 401
“box-and-arrow” diagrams 13
box and candle problem 94, 94
boxology 477
brain: activity during reading 184, 224; Alzheimer’s disease 348; cross section 17; knowledge storage areas 347; and language 17–22; localization of functions 67–73, 70, 72; resolving ambiguity 304; syntactic processing 436
brain damage 476; and comprehension 389; effects on parsing 312–16; lesion studies 17–19; not localized 220; range of effects 461; recovery 74–5; selective language impairment 158; spoken word recognition 281
brain development, and language development 52
brain imaging 16, 19–22, 68, 71; ambiguous and non-ambiguous sentences 304; increasing accuracy 478; semantic and syntactic processing 298
bridging inferences 367–8, 369, 370
Broca’s aphasia 68, 433–4, 435, 444
Broca’s area 17; agrammatic aphasia 313; location 18, 68; role of 71
Brodmann’s area 53, 316

C
canonical sentence strategy 293
capacity theory 471
Caramazza’s model 416, 417 Chinese 92–3; number systems 94; reading 227; script
cascade models 23–4, 418–21, 424–5, 426 226; see also Mandarin
case grammar 377 Chinese–English bilinguals 94
CAT (computerized axial tomography) 20, 20 Chomsky’s linguistic theory 36–45; see also
categorical perception 261–2 transformational grammar
categorical phoneme perception, TRACE model 275 class-inclusion model 338
categorization 320; basic level 334; fuzziness 330, 333 classification, evaluation of research 336
category decision task 216 clauses 38
category-specific disorders: connectionist models click displacement technique 291
352–4; living–non-living dissociation 345; closed-class items 38
methodological issues 344–5; modality-specific closure 294; late 295–6
effects 346–8; sensory–functional theory 345–8; co-articulation 121–2, 259–60, 262
stimulus materials 344 co-reference, comprehension 372
causal coherence 361 coda 35
causative verbs 144, 331, 331–2 code switching 154
center-embedding 40, 45 coding, of color 95
centering theory 376 cognition: embeddedness 356; indirect effects of
central deep dyslexia 223–4 language 94
central dyslexias 220 cognition hypothesis 81, 83, 88
certainty 26 cognitive cycles 432, 432
chaffinch 74 cognitive development: hearing impairment 87–8;
changes in languages, over time 7–9 Piagetian theory 80–2, 81
characteristic features 327, 327–9 cognitive economy 320
child-directed speech 109–11, 110, 455; cultural cognitive linguistics 43
variation 110 cognitive neuropsychology 17–18
children: color 96; concept development 335; cognitive neuroscience, area of study 17
deprivation of linguistic input 78–9; early sounds cognitive processes, specificity 26
104; hearing children of hearing-impaired parents cognitive processing, and bilingualism 154–5
77; hypothesis testing 123, 129–30; language cognitive psychology 10, 13
acquisition 63; lateralization 74–6; learning cognitive science approach 13–15
difficulties 82–3; motion encoding 98; spatial coherence 361
coding 97 coherence graph 384
children, language development 478–9; acquisition of cohort model 265, 268–73; extension 278
irregular forms 108; after babbling 124; babbling collaboration, in conversations 454–6
123–4; child-directed speech 109–11, 110; Collins and Quillian semantic network model 323–5
conditioning 107–8; distributional information color coding 95–6, 268
117–18; early speech perception 120–3; color hue division 95
early words 126, 127; errors 126–7; errors in color, memory for 95–7
meanings 131–4; formal approaches 115–16; color perception 97
genetic linguistics 114–15; imitation 106; color spectrum, and visual system 96–7
individual differences and preferences 129–30; color terms, hierarchy 95, 96
language acquisition device (LAD) 111–18; commissive speech acts 450
later phonological development 124; lexical and common ground 375–6
semantic development 125–6, 126; linguistic common-store models 155–6
universals 112–14; mapping problem communication 5; steps in 3
127–30; name learning 127–9, 131; output communicative signals 54
simplification 125, 125; over- and under-extensions comparative linguistics 10
131–4; overview 104–5; parameter setting 111–12, competence 36–7, 105
114; phonological development 120–5; pidgins competition 303
and creoles 114; poverty of the stimulus 108–9; competition effects 279
process 118–20; semantics first 136; summary competition-integration model 303
150–1; syntactic categories 136–9; syntactic competitive queuing 427
comprehension 148–9; syntactic development compound nouns 336–7
136–49; use of cues 130; verb-argument structure compound words 191
141–4; in the womb 119 comprehensible input hypothesis 160
chimpanzees see apes comprehension: accessibility 375–6; agrammatism
chinchillas 122 435; anaphoric ambiguity 372–5; bias 376;
co-reference 372; common ground 375–6; context construction–integration model 378, 384–6
effect 364–7, 365; first mention 376; given–new constructivist-semantic perspective 136
contract 376; implicit causality 373–4; implicit content-word substitutions 437
focus 374–5; improving reading skills 387–8; content words 38, 315, 400
individual differences 386–8; inferences 367–72; context: lexical ambiguity 202–3; and meaning
Kintsch’s construction–integration model 378, 319; and sound identification 263–5; visual word
384–6; and memory 361, 362–72; memory, recognition 187–90
inferences and anaphora 376–7; mental models context effect: cohort model 270; comprehension
382–4; neuroscience of text processing 388–9; and memory 364–7, 365; garden path model
overview 360–2; prior knowledge 364–7, 387; and 301–2; speech recognition 277; TRACE model 276;
production 135–6; recency 376; reference 372; understanding indirect speech acts 451–2; word
referential processing 361; schema-based theories recognition 266–7
380–2; semantic processing 361; and sentence context-free grammars 40, 44, 45
structure 361; and short-term memory 389; speed context-guided single-reading lexical access model 199
reading 218–19; story grammars 378–80; summary context-sensitive grammars 40, 44, 45
390–1; text processing 377–86; visual information context-sensitive model 204–5
457, 457–8 contingent negative variation (CNV) 19
computational account, of vision 13 continuity assumption 142
computational metaphor 13 continuity hypothesis 111, 112, 123
computational models 25–6, 478–9 continuity theories 136
computer modeling 13–14; see also models contrastive hypothesis 134, 159
computer programs: ELIZA 14; experimental packages controlled processing 177
15; PARRY 14; SHRDLU 14–15 conversation analysis 453–4
concepts 320; combining 336–7; wooliness 333 conversational hypothesis 109
conceptual change 130 conversational implicature 452–3
conceptual dependency theory 378 conversations 360, 361; ambiguity 455–6; ambiguity
conceptual mediation 156 in 456; collaboration 454–6; conceptual pacts 453;
conceptual pacts 453 Grice’s maxims 452, 452–3; inferences in 449–53;
Conceptual Selection Model (CSM) 412 layering 452; privacy 453; sound and vision 456–8;
conceptualization, speech production 395–6 structure of 453; turn-taking 84–5, 454; visual cues
concrete nouns 37 454, 454
conditioning 107–8 cooperation 85
conduction aphasia 443, 466 core description 328
congruence 187–8 Cornell University conference 9
conjoint frequency 324 cotton-top tamarins 66, 66, 122
conjunctions 37, 40 counter-factual reasoning 92–3
connectionism 15, 25–6, 106, 477, 481–5 creole languages 114
connectionist modeling 80, 117–18, 138–9, 146–7 critical period hypothesis 73–80; deprivation
connectionist models 25–6, 229, 427; accessing of linguistic input 78–9; evaluation 79–80;
semantics 232–3; aphasia 440–2; of dyslexia lateralization 74–5; second language acquisition
233–7; grounding 355–6; of impairment in 76–7; syntactic development 76–7
dementia 350; latent semantic analysis (LSA) 354; cross-cultural studies 140, 476
lexicalization 423–4, 425–6; of reading 467; revised cross-language priming 155
232; semantic microfeature loss hypothesis 352; cross-linguistic differences, language development 148
semantics 351–6; of sentence production 403–4; cross-linguistic research 479–80
cross-modal lexical decision task 191
speech recognition 273–80; working memory 472
cross-modal priming technique 201–2, 271–2
connotation 321–2
cross-sectional studies 105
conservation task 80
crossed aphasia 75, 157
consolidated alphabetic phase 242
CT scan, stroke 158
consonantal languages 210, 210
cues 130, 135
consonants 30, 33–5; as combinations of
culture, transmission 5
distinguishing phonological features 34; speech
production 33
constituent analysis 38 D
constituents 38 Dani 95
constraint-based models 296, 300–3ff; compared to data 16
garden path theories 305–6 data-driven processes 23, 24
deafness see hearing impairment dual-code hypothesis of semantic representation 340–3

declarative/procedural (D/P), model 70 dual-code theory 262
declarative speech acts 451 dual-mechanism model 408–9
decompositional theories 326–30 dual-pathway hypothesis 190
deep dysgraphia 445 dual-route cascaded (DRC) model 228–9
deep dyslexia 223–5, 227, 235–7, 351, 467; dual-route model 146, 211–12, 212, 220; original and
right-hemisphere hypothesis 224 revised 228; regularity effect 214; revision 227–9
deep dysphasia 441, 466 dual-task performance 431
deep structure (d-structure) 41–2 Dutch 97
defining features 327–8 dysgraphias 220, 445
delay strategy 309, 310 dyslexias 4, 185, 217, 220; analysis 226;
Dell’s interactive model 423, 424, 441, 463 developmental dyslexia 249–55; in languages other
dementia 348–50 than English 226; and models of naming 227–9
denotation 321–2 dysprosody 434
dentals 34
derivational morphology 6 E
derivational theory of complexity (DTC) 11–12 E-Z Reader model 169–70
describing language: overview 30; summary 46 early asymmetry 76
design features, of language 55–6 early reading units 245–7
determiners 37 early speech perception 120–1
developmental data 478–9 early-syntax theory 143
developmental dysgraphia 249, 249 early words 126, 127
developmental dyslexia 249–55; biological basis ease-of-predication 237
250–1; control groups 252–3; genetics 250, 254; editor hypothesis 425
improving reading skills 254–5; subtypes 251–4 EEGs (electroencephalograms) 19, 19–20
developmental phonological dyslexia 253 egocentric speech 89
developmental reading disorders 220 egocentric thought 80
developmental surface dyslexia 253 egocentrism 81
dialects 32 Ehri’s four phase model of reading development
dialog see conversations 242, 242
diary studies 105 elaborative inferences 368, 369, 371
dichotic-listening task 200, 201 ELIZA 14
digging-in 303 embeddedness, of cognition 356
diphthongs 35 embedding 40, 45
direct-object verbs 302 embodiment 356
direct speech acts 451 emergentist hypothesis 74
directive speech acts 450 emergentist theory 144
disconnection syndromes 69 emotion and attitude, second language acquisition 160
discontinuity hypothesis 123 empiricism 106, 106
discontinuity theories 136 energy masking 171
discourse 360 English: color coding 95, 96, 97; graphemes 209;
discourse analysis 453 motion encoding 98; number systems 94; pro-drop
discrete models 23–4 parameter 111; spatial coding 97, 98; telegraphic
discrete stage models, lexicalization 421, 425–6 speech 112
disfluencies 290 entrenchment hypothesis 143
dishabituation paradigm 113 environmental cues, spatial coding 98
dissociation 24, 319, 401 environmental influence, on color coding 96
distinguishing features 353–4 epilinguistic knowledge 243
distributional analysis 143 episodic memory 319–20
distributional information 117–18, 121, 138–9 equipotentiality hypothesis 74
ditransitive verbs 38 ERPs (event-related potentials) 19, 19–20, 188; infants
dogs 57 75; semantic and syntactic processing 298
dolphins 55, 55; language teaching 57, 57 error patterns 426–7
domain-specific knowledge hypothesis (DSKH) 348 error scores 231
double dissociation 18, 18–19, 220, 444 errors in meanings: over- and under-extensions 131–4;
Down’s syndrome, language development 82 over-extensions 131
DSMSG model 442 Eskimo 91
evoked potentials 75 formal approaches to language learning 115–16

evolution 51–2; stages 52 formal learning theory 116
execution, speech production 395 formal paraphasias 441, 441
exemplar theory 335 formal power, of grammar 43–5
exercise hypothesis 76 formal universals 112–13
expectations, visual word recognition 178–9, 179 formants 30
experimental techniques 15–16, 180–3, 476 formulation, speech production 395
explanations, defining 16 forward translation 156
explicit awareness 244 four Cs 161, 161
exposure to print 248 fovea 169
expressive speech acts 450 FOXP2 gene 53, 67, 115
extended standard theory 37 Foygel and Dell model 442
extension 322 free morphemes 401
externalized language (E-language) 36–7 French: graphemes 209–10; pro-drop parameter 111
eye movement studies 168–70; phonological frequency effect 181–3, 186; homophones 417; lexical
mediation 216 ambiguity 202–3; reading 214; reversal rate 233;
eye movements: control of 169–70; speed word recognition 266
reading 219 Freudian slips 397, 397
eye, structure of 169 fricatives 34
eye-tracking 149, 149 Frith’s three stage model of reading development 242
eyewitness testimony 371, 371–2 full alphabetic phase 242
full-listing hypothesis 190
function words 38, 315, 400
F functional core hypothesis 133
facilitation 16 functional fixedness 94
familiarity bias 421 further reading: basis of language 101–3;
family resemblance models 333–6; evaluation 336; bilingualism 163; children, language development
instance theories 335; prototype theories 333–5; 151–2; comprehension 391; connectionism 485;
theory theories 335 describing language 47; introductory material
fan effect 378 28–9; language production and use 447–8;
Featural and Unitary Semantic Space hypothesis 332 language systems 474; language use 459;
feature-comparison theory 326–30 learning to read 256–7; parsing 317–18;
feature-list theories 327–8 reading 240; semantics 358–9;
feature masking 171 sentence structure 317–18; speech 283;
feedback 84; in interaction models 24; lexicalization visual word recognition 208
421–2, 425–6, 463; limited extent and influence future research 477–80
147; nature of corrections 107, 107–8; speech
recognition 277
feeding, and language acquisition 85 G
felicity conditions 450 gaps 310–12
feral children 78 garden path model 295–7ff, 305–6
fetus’ brain 119 garden path sentences 290–1, 299–300, 301–4
figurative language 337–9, 338 Garrett’s model of speech production 399–402, 400,
Filipino 91 426, 437, 438, 443
filled pauses 430, 433, 454 gating task 271
fillers 311–12 gaze 454
Fillmore’s theory of case grammar 377 gender cues, role in ambiguity resolution 373
finite state devices 44, 45, 45 gender stereotyping 99
first mention, comprehension 376 general phonological deficit 222–3
fixations 168–9, 169 generalization errors 141
fixed structure 294 generative grammar 37, 291
fluent aphasia 409 generative semantics 377
fluent restorations 270, 271 genetics 53
fMRI (functional magnetic resonance imaging) 20 gestures: in conversations 454; and development of
focal brain injury 75–6 language 53–4; and pauses 431; speech phases 432
focal colors 95–6 given–new contract 376
forced-choice procedure 236 glides 34
form-based priming 176 global aphasia 443
globality assumption 442 illocutionary force 450

glossary 486–94 imageability 218, 221, 223
glottal stops 34 imitation 51, 106
Glushko’s experiments 213, 214 immersion method 161
good enough analyses 307–8 implicature 452–3
government and binding theory 37, 41–2 implicit awareness 244
grammar 37, 38; formal power 43–5; phrase-structure implicit causality 373–4
grammar 37–43; variations between languages 92 implicit focus 374–5
grammatical development 104 implicit priming paradigm 429
grammatical gender 93 importance, and memory 364; see also salience
grammaticality judgement task 313 in-utero development 119–20
grapheme coding 230–1 incremental parsing models 361
grapheme-to-phoneme conversion (GPC) 211–12 independence (modular) theory 418
graphemes 209 independent processes 23
Greek, motion encoding 98 indirect route 211–12
Grice’s maxims 452, 452–3 indirect speech acts 451–2
grounding 355–6 individual differences: language development 148;
growth areas 477–80 second language acquisition 160
guessing 182 Indo-European languages 7–8, 8
induction, controversy 115
infants: early speech perception 120–3; language
H acquisition 104; lateralization 75–6
hand gestures, and development of language 53–4 inferences 385, 449; advertising 372; comprehension
harmonics 32–3 376–7; in conversation 449–53; implications of
hearing children of hearing-impaired parents, language research 370–2; juries 371–2; memory for 367–72
acquisition 77 inflectional morphology 6, 79
hearing impairment: babbling 123–4; cognitive
inflections 408–9
consequences 87–8; creoles 114; language
information change model 130
development 86, 87–9, 88; parameter setting 112;
information flows 422
reading 217–18; and speech 281
information processing, and psycholinguistics 13
Hebrew 138, 210
information theory 10
hemidecortication 75
informational load 375
hemisphere dominance 68
informative signals 54
hemispheric specialization 73
inhibition 16, 387
hesitation pauses 432
initial contact phase 265
heterographic homophones 198–9
innateness 25, 105–6, 111, 113, 114, 116, 476;
hierarchy, semantic networks 323–5
children’s hypotheses 129; controversy 116–17;
high-dimensional memory (HDM) approach 354
living–non-living distinction 348; perceptual
High Interactional Content (HIC) 364
holistic processing 184–5 abilities 120–1; syntactic categories 136
holophrastic speech 136 inner speech 25, 89, 99, 217–18, 218, 462
homographs 199 input deep dyslexia 223
homophones 191, 198–9, 417, 420 instance theories 335, 336
honey bees 54, 55, 57 instantiation principle 335
horizontal information flow 422 integration model 203–4
Human Associative Memory (HAM) 377 intension 322
humour 3 interaction 23–4; in language processing 461, 475–6;
hybrid models 198 lexicalization 418–22; speech perception 263;
hyperspace analog to language (HAL) 354, 356 syntactic processing 299–300
hypotheses, defining 16 interaction theory 266–7
hypothesis testing, by children 123 interactional pauses 432–3
interactive activation and competition (IAC) model
196–8
I interactive activation models 196–8, 273, 422–4, 423,
identification procedures 328 424, 481–3
identification semantics hypothesis 343 interactive activation network 197
ideographic languages 210 interactive alignment model 455
idioms 338 interactive models of parsing 288, 289, 291
interactive parallel constraint model 375 104–5; social context 83–4; see also apes; children,
intercalated dependencies 45 language acquisition
interchangeability 320–1; of pauses 433 language acquisition device (LAD) 111–18, 479
intercorrelated features 353–4 language acquisition socialization system (LASS) 84
interference 157 language bioprogram hypothesis 114
interlopers 414 language development 105; children with learning
internalized language (I-language) 36–7 difficulties 82–3; critical period hypothesis 73–80;
International Phonetic Alphabet (IPA) 31, 32 cross-linguistic differences 148; drivers 105–11;
Internet 479 evaluation of evidence of effects of sensory impairment
intersubjectivity 129 88; hearing impairment 86, 87–9, 88; individual
intonation 120 differences 148; visual impairment 85–6, 86
intra-lexical context 267 language disorders, of social use 85
intransitive verbs 38 language families 8
Inuit 91, 92 language functions, localization of 67–73
invariance 259 language learning, formal approaches 115–16
IQ (intelligence quotient) 115 language loss, Alzheimer’s disease 352
irregular forms, acquisition of 108 language, meaning and use: overview 285; see also
irregular words 228–9; reading 211 sentence structure
irreversible determinism (invariance) hypothesis 74 language of thought 288
irreversible passive sentences 12 language processes, specificity 26
isolation point 266, 271 language processing: bilingualism 155–7; improving
isomorphism 259 understanding of 479; interaction 461, 475–6;
Italian 111, 209 overlap 475; unconscious 460; visual and auditory
iteration 40, 44 461–2
language production and use: overview 393; summary
446–7; writing and agraphia 444–5; see also speech
J production
Japanese 226–7 language production, overview 395–6
jargon aphasia 437–8 language, study of: context and overview 3–4;
joint attention 84, 129 difficulty 5; reasons for 4–5
jokes 3 language systems: experimental evidence for lexicons
juncture pauses 432 463; lexicalization 462–8; and memory 473;
modeling 467–8; modularity 23–5, 460; modules
461–2; neuropsychology and lexical architecture
K 464–7; overview 460–1; rules 25–6; semantic 464;
kana 226–7
and short-term memory 468–73; structure of 465;
kanji 226–7
summary 473
Kannada 144
language teaching, to animals 57–67
kernel sentences 11, 41
language use: inferences in conversation 449–53;
kinship terms 326, 326
overview 449; speech acts 450–2; structure of
Kintsch’s construction–integration model 378, 384–6
conversations 453–4; summary 458
Kintsch’s propositional model 376
languages: number of 7; relationships 7
knowledge storage areas 347
larynx 32
Korean 113
late bilingualism 153
late closure 295–6, 305
L late-syntax theory 143
labeling 94, 129 latent semantic analysis (LSA) 354, 356
labiodentals 34 lateralization 74–5; bilingualism 157; infants 75–6
language: aspects of 6; defining 5–7, 55; design layering, in conversations 452
features 55–6; functions 3; social setting 3; utility learnability theory 116
56–7; and vision 456–8 learning bias 127
language abilities, innateness 105–6 learning difficulties, language development 82–3
language acquisition 4; apes vs. children 63, 63–4; learning theory 108; see also behaviorism
bonobos 64–5; children 63; deprivation of linguistic learning to read: age 247; cues 243; developmental
input 78–9; general principles 116; hearing children dyslexia 249–55; exposure to print 248;
of hearing-impaired parents 77; as parameter setting multi-sensory techniques 255; normal
111–12; pragmatic factors 117; research methods development 241–3; overview 241; phonological
awareness 243–5; progress between stages 251; lexicons 7, 319; access to 464; bilingualism 155–7;
size of early reading units 245–7; summary 256; number of 462–8
teaching methods 247, 247–8; see also reading lexigrams 62, 64
left-hemisphere dominance 53–4 limbus tracking 168
lemmas 410, 416–18, 427–8, 462 linear-bounded automaton 44
lesion studies 17–19, 68–9 linguistic ambiguity 455–6, 456
less-is-more theory 118, 161 linguistic determinism 90
letter-by-letter reading 220 linguistic encoding 94
letters, and sounds 31 linguistic feedback hypothesis 109
levels, of psychological processing 23 linguistic relativism 90
lexeme selection 410 linguistic rules 25–6
lexical access 167, 258, 265, 266, 280; modes 167; see linguistic universals 112–14
also visual word recognition linguistics: contribution of 12–13; overview 10;
lexical ambiguity 198–205; autonomous access model transformational grammar 10–13
203–4; context effect 202–3; context-guided single- lip-reading 458
reading lexical access model 199; early research liquids 34
199–205; evaluation of research 206; experimental listening, neuroscience 413
research 200–2; frequency effect 202–3; integration literacy 168, 244–5
model 203–4; models 199; multiple access model locality assumption 24
199; ordered-access model 199; reordered access localization, of language functions 67–73, 70
model 204, 205; selective access model 203; locational coherence 361
Swinney’s experiment 201–2 locutionary force 450
lexical and semantic development 125–36, 126; logical inferences 367
comprehension and production 135–6; early words logogen model 194–6, 195, 196, 463
126, 127; errors in meanings 131–4; individual logographic languages 210
differences and preferences 129–30; later logographic stage 242
development 134–5; mapping problem 127–30; longitudinal studies 105, 140, 245
name learning 127–9, 131; over- and look and name 127
under-extensions 131–4; summary of early look-and-say method 247
development 134 Low Interactional Content (LIC) 364
lexical bias 421 lying 453
lexical boost 403
lexical category ambiguity 308–10 M
lexical causatives 331–2 macroplanning 396
lexical decision task 170, 178, 186; and consistency of made-up words 437
results 180–1; frequency effect 182–3 magic moment 167
lexical entrainment 453 magnocellular system 255
lexical guidance 297 malapropisms 410
lexical identification shift 263 Mandarin: spatial coding 97–8; see also Chinese
lexical instance models 192 manner of articulation 33, 34
lexical neighborhoods 272 mapping hypothesis 313
lexical processing, and short-term memory 469–72 mapping problem 127–30
lexical retrieval 414 mapping, sounds onto letters 245
lexical selection 410, 411–12 masked phonological priming 217
lexical-semantic anomia 439, 440 mass nouns 130
lexicalization 396, 410–26; bilingualism 411, matching span task 469
421; cascade models 418–21, 423–6; maturation 74
connectionist models 423–6; discrete stage maturation hypothesis 111, 112
models 421, 425–6; experimental evidence maturational state hypothesis 76, 80
411–12; feedback 421–2, 425–6; horizontal mean length of utterance (MLU) 144–5, 145
information flow 422; interactive activation meaning: and context 319; role in accessing sound
models 422–4, 423; interactivity 418–22; 218; and structure 12; see also semantics
mediated priming 418–20; neuroscience 412–14; meaning-first view 136
and pauses 430–1; speech errors 410–11, 425; meaning through syntax (MTS) 308
stages 410–18; time course 418–21; tip-of-the- meanings, children’s errors 131–4
tongue (TOT) 414, 414–16; two-stage models medial geniculate nucleus 250–1
410, 410–13, 418–19, 419, 423 mediated priming 181, 418–20
MEG (magnetoencephalography) 20 morphologically complex words: speech production

memory 362; agrammatism 437; for color 95–7; 408–9; visual recognition 190–2
and comprehension 361, 362–72; comprehension morphology 6; linguistic universals 113; and syntactic
376–7; context effect 364–7, 365; episodic 319–20; category 139
eyewitness testimony 371–2; and importance 364; MOSAIC 139
see also salience; and prior knowledge 364–7, 380; mother–child dyad 84, 84
push-down stack 44; reminding 381–2; semantic motherese 109–11
319–20, 464; short-term 389, 468–73; for text and motion encoding 98
inferences 362; verbatim memory 362–4 motivation, second language acquisition 160
memory impairment 314 MRI (magnetic resonance imaging) 20, 21
memory organization packets (MOPs) 381–2 multiple-levels model 228
memory shifting 408 multiple locus hypothesis 350
memory systems, semantic memory 340–3 multiple-outlet models 277
mental dictionary 7 multiple stores models 341–3
mental encyclopedia 319 mutual exclusivity 134
mental models, comprehension 382–4 mutual gaze 84, 87
mental syllabary 428
MERGE model 279–80 N
message level of representation 396 N̄ 42
metacognitive knowledge 243 N-statistic 175
metalinguistic knowledge 243 N400 19
metaphors 97–8, 337–9 name learning 127–8, 131
methodologies, and findings 143 naming 467
methods, psycholinguistics 15–16 naming errors 399, 412
metrical segmentation strategy 260 naming latency 170
micropauses 433 naming task 170, 181, 182–3, 186
microplanning 396 natural kind terms 323
mimicry 51, 57–8 natural order in acquisition hypothesis 160
mini-theories 335 natural selection 52
minimal attachment 295–6, 299 nature–nurture debate 106, 106
minimal pairs 31 Navajo 92
minimalism 37, 42–3 Neanderthals 52, 53
minimalist hypothesis 368 need to know 368
mirror neurons 53 negative evidence 84, 116
mispronunciation 276 neglect dyslexia 220
mixed substitutions 421, 425–6 neighborhood effects 175
modality-specific anomia 341 neologisms 437
modality-specific content hypothesis 342 neuroimaging 19–22
modality-specific effects 346–8 neuropsychological dissociations 24, 464
modality-specific format hypothesis 342 neuroscience 476; adult reading disorders 220–7;
modality-specific stores 461 ambiguous and non-ambiguous sentences 304;
model, meaning of term 25–6 bilingualism 157–8; developing techniques 478;
model-theoretic semantics 322 lexicalization 412–14; of parsing 312–16; picture
modeling 25–6 naming 413; semantic and syntactic processing 298;
models 16; importance of 5; visual word recognition of semantics 339, 339–51; speaking and listening
192–8; see also computer modeling; connectionist 413; speech production 433, 433–44; spoken word
modeling; reading; individual models recognition 281; text processing 388–9; turn-taking
modifiers 42 454; writing 445
modularity 22, 23–5, 460, 475–6; representational and Newspeak 89–90
processing 298 Nicaragua 114
modules: defining 23; language systems 461–2 Nineteen Eighty-Four (Orwell) 89–90
Mohawk 45 no negative evidence problem 108
monitor hypothesis 160 nodes 39
monkeys 57 non-associative semantic priming 185–7
morpheme stranding 401 non-interactive models 24
morphemes 6, 7 non-lexical route 211–12
morphing 406 non-linguistic ambiguity 455–6, 456
non-linguistic context 267 paragrammatisms 437

non-literal language processing 338, 453–4 Paraguay 154
non-nutritive sucking 119 parallel activation 408
non-semantic reading 225 parallel autonomous model of parsing 289, 291
non-structural context 267 parallel function 372–3
non-terminal elements 37 parallel processing 24, 184–5
non-terminal nodes 39 parallel transmission 260
nonplan-internal errors 401 parameter setting 111–12, 114
nonwords 175, 211, 212–13 parameters 111
noun-noun combinations 336–7, 337 paraphasias 437, 441, 467
noun phrases 38 parapraxes 397
nouns 37 Parkinson’s disease 69, 146
novel phrases 336 parrots 57, 57–8
nucleus 35 PARRY 14
number agreement 405–7 parse trees 39, 39–40
number systems 94–5 parsing 288, 294, 299; agrammatic aphasia 313–16;
autonomy in syntactic processing 296; comparison
O of models 305–6; constraint-based models 296,
object naming 349, 467 300–3; context effect 301–4; cross-cultural studies
object permanence 80, 81–2 304–5; early accounts 293–5; early research
objects 38–9 291–5; evaluation of neuroscience 315; garden path
obligatory automatic decomposition 330–2 model 295–7; independence of 303–4; interactive
obligatory decomposition hypothesis 190, 191 processing 299–300; models 289; neuroscience
obligatory transformations 11 312–16; and phonological loop 472; principles
observational studies 105 293–5; probabilistic effect 305, 307; processing of
on-line experiments 202–3 content and function words 315; referential theories
one-stage models of parsing 288 300–1; sausage machine 295; strategies based
onomatopoeia 51 on surface-structure cues 293; summary 316–17;
onset 35 syntactic-category ambiguity 308–10; units of 291–3;
open-class words 38 unrestricted race model 306–8; verb bias 302–3;
open words 140 visual information 458; and working memory 471–2;
optic aphasia 341–3, 343 see also sentence structure
Optimality Theory 43 partial activation hypothesis 414–15
optional complementizers 456 partial alphabetic phase 242
optional transformations 11 passive sentences 39
ordered-access model 199 passivization transformation 11, 41
organized unitary content hypothesis (OUCH) model 342 past tense, acquisition 145–8
origins of language 51–4 patients 136
orthographic neighborhoods 272–3 pattern masking 171
orthographic output store 466 pauses 430–2
orthographic priming 176 perception: early speech perception 120–3; of speech
orthographic stage 242 258–63; and vocabulary differences 90–2; without
Orton–Gillingham–Stillman multisensory method 255 awareness 171–2
ostensive model 127 perceptual heuristics 293
output deep dyslexia 224 perceptual-loop hypothesis 425
output simplification 125, 125 perceptual recall 93
output stores 466 performance 36–7, 105
over- and under-extensions 131, 131–4; theories of 133 peripheral dyslexias 220
over-extensions, verb-argument structure 141 perlocutionary force 450
over-generalizations 131–4 permanent ambiguity 289
over-regularization errors 220 PET (positron emission tomography) 20, 71
overlap hypothesis 418 pheromones 54
philosophy 13
phoneme coding 230–1
P phoneme identification 262
palatals 34 phoneme monitoring task 200, 262
parafovea 169 phoneme restoration effect 263–5
phonemes 30–1, 209 pre-speech 85

phones 30–1 predicates 38
phonetics 6, 30, 31 predictions 148
phonic method 247–8, 248 preferential looking technique 104–5
phonological anomia 440 prefixes 191–2, 401
phonological awareness 243, 243–5 prelexical code 262–3
phonological buffers 465, 467, 468, 469, 470–1, 473 preliminary phrase packager (PPP) 295
phonological deficit hypothesis 235, 254 prepositional phrases 289
phonological deficits 253–4 prepositions 37
phonological development 120–5; babbling 123–4; preverbal message 396
early speech perception 120–1; later development primate studies, language teaching 58–67
124; visual impairment 87 primates, communication 54–5
phonological dyslexia 221–3, 235 priming 16, 171, 190; attentional modes 177;
phonological encoding 410, 426–30; lemma model frequency effect 186; morphological complexity
427–8; planning ahead 429–30; role of syllables 190–2; proportion effect 179–80; word fragments
428–9 271
phonological facilitation 401 PRIMIR (Processing Rich Information from
phonological form selection 410 Multidimensional Interactive Representations)
phonological impairment hypothesis 235 122–3
phonological input store 464 Principle of Economy 43
phonological loop 471–3 principles and parameters theory 41–3
phonological mediation 212, 215–17 prior knowledge 364–7, 380, 387
phonological neighborhoods 272 privacy, conversations 453
phonological output store 466 privileged information 453
phonological recoding 211, 217 pro-drop parameter 111
phonological transparency 191 probabilistic effect 305, 307
phonology 6, 30, 31 probabilistic feature model 328, 328
phrase-structure grammar 37–43, 45 probabilistic models 26
phrases 38 probability 10; and pauses 431
physical modularity 24 processing in cascade 418, 420
picture naming 183–4, 413, 418–20, 419; dementia processing modularity 24, 298
349; syllable number 175 production, and comprehension 135–6
picture–word interference studies 407, 412 pronouns 37–8; use by visually impaired children 86–7
picture–word interference task 422, 429, 430 pronunciation neighborhoods 214–15
pidgin languages 114 pronunciation switching 233
pigeons 65, 66 pronunciation, vowels 32
Pirahã 92, 94–5 property inheritance 324
pitch 35, 36 proportion effect 179–80
pivot grammar 140 propositional network models 377, 377–8
pivot words 139–40 propositions 377
place of articulation 33 prosopagnosia 185
planum temporale 250, 251 prosodic cues 290
plasticity 74, 77, 80, 174 prosody 120, 122
plausibility 302 proto-Indo-European 7–8
plurals, count and mass nouns 138 protolanguage 52–3
PMSP model 232, 234 prototype hypothesis 133–4
pointing span task 469 prototype theories 333–5
polysemous words 199 prototypes 333–4
possible-word constraint 260 prototypicality effect 325
post-access processing 24 pseudohomophones 212–13
postalveolar sounds 34 pseudowords 175, 211
postlexical code 262 psycholinguistics 4; certainty 26; history 9–10;
PQ4R method 387–8, 388 and information processing 13; methods 15–16;
pragmatic inferences 368 summary of overview 27
pragmatics 6, 449, 458; see also language use psychological processing, levels of 23
pre-alphabetic phase 242 punctuation, disambiguation 290
pre-birth language development 119–20 pure definitional negatives (PDNs) 330–1
pure word deafness 464 relative clauses 290

Purkinje system 168 relative time 384
purpose of language 9 relativism 24
push-down stack 44 relaxation 219
remembering, of sentences 12
reminding, and memory 381–2
Q reordered access model 204, 205
questioning 454 repetition blindness 162, 189–90
questions: basis of language 101; bilingualism repetition priming 155, 175–6, 279, 411
163; children, language development 151; representational modularity 298
comprehension 391; describing language 46; representative speech acts 450
general 480; introductory material 27–8; language repression 453
production and use 447; language systems 474; reproduction conduction aphasia 443
language use 459; learning to read 256; parsing 317; research: applied 477; bias 16, 142; future
reading 239–40; semantics 358; sentence structure developments 477–80; subjects 16
317; speech 283; visual word recognition 208 resolution, of ambiguity 373
response bias 182
response consistency 352
R response strengths, TRACE model 274
Race model 219, 277 restricted interaction account (RIA) model 426
rationalism 106, 106 restricted search hypothesis 375
reaction time measures 170 reversal rate 233
readability 385 reversible passive sentences 12
reading: accessing meaning 212; accessing semantics revised extended standard theory 37
232–3; adult reading disorders 220–7; age-of- rewrite rules 37
acquisition (AOA) 214; classification of word rich interpretation 140
pronunciations 215; comparison of models 237–8; right association 293–4, 295
effect of word abstractedness 237; frequency effect right-hemisphere hypothesis 224
214; Glushko’s experiments 213, 214; improving right-linear grammars 44
skill 387–8; inner speech 217–18; irregular rimes 35, 242, 244–5
words 211; models of word naming 227–33; non- Rogers et al.’s connectionist model of semantic
semantic 225; nonwords 212–13; normal reading memory 355, 355–6
212–20; overview 209; preliminary model 210–12; roles, semantic 39
pronunciation neighborhoods 214–15; Race model RSVP (rapid serial visual presentation) 189
219, 277; regularity effect 213–14; role of meaning in rules 25–6, 118, 145, 145, 147, 476
accessing sound 218; selective attention 219; semantic Russian 96
involvement 234–5; silent reading 217–18; skimming
219; speed reading 218–19; summary 239–40;
words 213–20; see also learning to read S
reading span 386–7, 389, 468 S node 39
Received Pronunciation (RP) 31, 33 saccades 168, 169, 170
recency, in comprehension 376 salience 328, 334, 362–3, 364, 375
recent-filler strategy 311 Sapir–Whorf hypothesis 9, 89–98; evaluation 98–9
recognition point 265, 266, 269 sausage machine 295
recurrent networks 279, 427; architecture 278 saving face 453
recursion 40, 44, 45, 66–7 scan-copier mechanism 426
reduced relative clauses 290 schema 333
redundancy 10 schema-based theories, of comprehension 380–2
reference, comprehension 372 schemas 380–2
referential coherence 361 science, approaches to 475
referential processing 361 scripts 380–2, 381
referential theories 300–1 search-based single lexicon model 410–11, 411
referential theory of meaning 322, 322 second language acquisition 76–7, 158–61; attitude
referential words 126–7 and emotion 160; audiolingual teaching 159;
regressions 169 evaluation of research 162; facilitating 161; five
regularity effect, reading 213–14 hypotheses 159–60; four Cs 161; immersion method
regularization 147 161; individual differences 160; summary 162;
relatedness effect 325 teaching methods 159, 159–60
segmentation 259–61; bilingualism 260–1 obligatory decomposition 330–2; overview 319–21;

segmenting 121–2 prototype theories 333–5; semantic features 325–32;
Seidenberg and McClelland model 229–32, 230 summary 357–8; theory theories 335
selection 265 semi-vowels 34
selection restrictions 327 sensory–functional theory 345–8
selective access model 203 sensory-functional theory 345–8
selective adaptation 261, 264–5 sentence-complement verbs 302–3
selective attention 219 sentence planning, and pauses 431–2
selective language impairment 158 sentence structure: autonomy in syntactic processing
self-paced reading task 205, 297 296; comparison of models 305–6; competition-
semantic analysis 39 integration model 303; and comprehension 361;
semantic and lexical development 125–36, 126; constraint-based models 296, 300–3; constraints on
comprehension and production 135–6; early words analysis 293; context effect 301–4; early accounts
126, 127; errors in meanings 131–4; individual of parsing 293–5; early research into parsing 291–5;
differences and preferences 129–30; later fillers 311–12; gaps 310–11; garden path model
development 134–5; mapping problem 127–30; 295–7; interactive processing 299–300; overview
name learning 127–9, 131; over- and 287–8; parsing strategies 293; probabilistic effect
under-extensions 131–4; summary of early 305, 307; processing structural ambiguity 295–310;
development 134 referential theories 300–1; structural ambiguity
semantic approaches, to syntactic development 140–1 288–91; summary 316–17; syntactic-category
semantic assimilation theory 136, 136 ambiguity 308–10; traces 311–12; unbounded
semantic bias 310 dependency 311–12; units of parsing 291–3;
semantic bootstrapping 136, 136–7 unrestricted race model 306–8; verb bias 302–3; see
semantic categorization task 171, 216 also parsing
semantic-conceptual system 461 sentence structure supervisor (SSS) 295
semantic decomposition 330–2 sentence verification task 324, 327–8
semantic deficits 339–40; category-specific disorders sentences: acceptability 10; forces 450, 450; as
343–8; differential impairment 341–2 performative 450; structure and meaning 12
semantic dementia 348, 355 separate-stores model 155–6
semantic feature hypothesis 133 sequential bilingualism 153
semantic features 325–32; types 353, 353–4 Serbo-Croat 209
semantic glue hypothesis 234, 235 serial autonomous model of parsing 289, 291
semantic-interference paradigm 412 serial model of lexicalization 426
semantic markers 327 serial search model 192–4, 193
semantic memory 319–20, 464; and dementia 348–50; sex differences 73
evaluation of neuroscientific research 350–1 sexist language 99
semantic microfeature loss hypothesis 352 shadowing 270–1
semantic microfeatures 351–2 Shona 95
semantic networks 322–5, 323, 325 short-term memory 389, 468–73
semantic paralexias 223, 225, 228, 236, 252 SHORTLIST model 279, 279
semantic-pragmatic disorder 85, 389 SHRDLU 14–15
semantic priming 16, 176–7, 185–7, 193, 279 side-effect theory 52
semantic primitives 326 sign language 6; acquisition 87; child-directed speech
semantic processing 298, 361 109; parameter setting 112; teaching to apes 59–60,
semantic relations 141 61, 61–5
semantic systems 464 signal detection theory 263
semantic transparency 191 signals 54
semantics 6; causative verbs 331; classic approaches silent reading 217–18
321–2; combining concepts 336–7; connectionist simplification 125, 125
models 351–6; constraints on general theory simultaneous bilingualism 153
321; decompositional theories 326–30; family single locus hypothesis 350
resemblance models 333–6; Featural and Unitary single-outlet models 277
Semantic Space hypothesis 332; feature-comparison single phonological deficit hypothesis 436
theory 326–9; feature-list theories 327–8; figurative single-route mechanism 19, 147
language 337–9; grounding 355–6; instance single-word repetition task 469
theories 335; latent semantic analysis (LSA) 354; situated cognition 356
memory systems 340–3; neuroscience 339, 339–51; situation models 382
skimming 219 speech sounds, describing 30–3

slips of the tongue 396–9, 398 speed reading 218–19
social basis of language 83–8 spelling 242–3, 248–9
social context, child-directed speech 109 spoken language processing vs. visual language
social deprivation 83 processing 167–8
social development 84 spoonerisms 397, 421
social factors, early words 129 spreading activation semantic network 325
social interaction 83–4, 85 SQUIDS 20
social networking 479 standard theory 37, 42
social use of language, disorders 85 stimulus-onset asynchrony (SOA) 171
songbirds 73–4 STM conduction aphasia 443
sound anticipation and substitution errors 398 stops 34
sound processing 115 stories 360, 378–9, 385
sounds: categorical perception 263; consonants 33–5; story grammars 378, 378–80
and letters 31; manner of articulation 33; and stress 35, 120, 122
meaning 215–17; place of articulation 33; role of stress-based segmentation 260
meaning in accessing 218 stress-timed language 35
Spanish 304–5 stroke, CT scan 158
spatial coding 97, 98 strong phonological perspective 216
spatial information 382–3 Stroop effect 177
SPCH1 gene 115 Stroop task 177, 412
speaking, neuroscience 413 structural ambiguity 288–91; autonomy in syntactic
species-specificity 67 processing 296; comparison of models 305–6;
specific language impairment (SLI) 114, 146, 148, 389 competition-integration model 303; constraint-based
specificity 26, 476 models 296, 300–3; context effect 301–4; garden
spectrograms 30, 31 path model 295–7; interactive processing 299–300;
speech: analysis-by-synthesis 268; categorical probabilistic effect 305, 307; processing 295–310;
perception 261–2; cohort model 268–73, 269, 278; referential theories 300–1; syntactic-category
comparison of models 280–1; connectionist models ambiguity 308–10; unrestricted race model 306–8;
273–80; context and sound identification 263–5; verb bias 302–3
context effects on word recognition 266, 277; structural context 267
difficulties of perception 258–63; frequency effect structural priming 142, 403
266; hearing impairment 281; MERGE model 279– structuralism 10
80; models of recognition 267–81; monitoring 453; structure, and meaning 12
neuroscience 281; overview 258; prelexical code subcategorical mismatch experiments 277
262–3; Race model 219, 277; recognition 258–67, subjects 38, 39
259; segmentation 259–61; SHORTLIST model subjunctive mood 92–3
279, 279; summary 282; template matching 267–8; sublexical route 211–12
time course of spoken word recognition 265–6; subliminal perception 170, 171–2
TRACE model 268; and vision 456–8 subordinate bias effect 203, 204–5
speech acts 450–2; categories 451 substantive universals 112–13
speech apraxia 434 subtraction method 21
speech dysfluencies 430, 430–3 successive lexical decision task 201
speech errors 396–9, 398, 401, 402; lexicalization sucking habituation paradigm 75, 104, 120
410–11, 425; monitoring 425 suffixes 191–2, 401
speech perception, location 71–2 summation hypothesis 229
speech production 32–3, 395; agrammatism super-additivity 343
434–7; anomia 438–40, 439; aphasia 433–4, 435; superordinate concepts 130
coping with dependencies 405–7; environmental suppression 387
contamination 402; Garrett’s model 399–402, suprasegmental features 35
426, 437, 438, 443; hesitations 430–3; jargon surface dysgraphia 445
aphasia 437–8; lexicalization see separate heading; surface dyslexia 220–1, 227, 233–5
morphologically complex words 408–9; naming surface structure (s-structure) 41–2
errors 399; neuroscience 433, 433–44; phonological syllabary 428
encoding 426–30; processes 395–6; slips of the syllabic scripts 210
tongue 396–9; summary 446–7; syntactic planning syllabification 427–8
402–9; syntactic priming 403–6 syllable-based segmentation 260
syllable monitoring task 260–1 neuroscience 388–9; propositional network models

syllable number, visual word recognition 175 377–8
syllable-timed language 35 thematic roles 136, 287, 287
syllables 35–6; hierarchical structure 35; in themes: of book 22–6, 475–7; semantic 39
phonological encoding 428–9; and theory of mind 83
segmentation 260 theory theories 335, 336
symbols: phrase-structure grammar 37; use by apes thought, and language 5, 88–99; anthropological
60, 60 evidence 91–2; comparison of theories 90;
synonymy 319 conclusions 99; grammatical differences 92–3;
syntactic abilities, apes 66 indirect effects on cognition 94; interdependence
syntactic ambiguity 303 89; memory for color 95–7; number systems
syntactic bootstrapping 130 94–5; Sapir–Whorf hypothesis 89–99; sexism 99;
syntactic categories 136–9 spatial coding 97–8; theories of 89–95; vocabulary
syntactic-category ambiguity 308–10 differentiation 91–2
syntactic comprehension 148–9 thought, behaviorist view 88–99
syntactic comprehension deficit 313–14 three-route model 227
syntactic development 76–7, 136–49; distributional three-stage model of sub-lexical processing 222
information 138–9; evaluation of research 139, 144; time course of spoken word recognition 265–6
later development 144–9; and morphology 139; past time, model construction 383–4
tense 145–8; problems of early grammar approaches timelines 98
140; semantic approaches 140–1; semantic relations tip-of-the-tongue (TOT) 414, 414–16, 431
141; semantics first 136; syntactic categories 136–9; Tippett and Farah’s computational model of naming
syntactic comprehension 148–9; two-word grammars 350, 350
139–40; verb-argument structure 141–4; visual TMS (transcranial magnetic stimulation) 20, 21
impairment 87 tongue-twisters 217–18
syntactic persistence 403–4 top-down 24
syntactic planning 402–9; coping with dependencies trace-deletion hypothesis 313
405–7; as incremental 407–9; syntactic priming TRACE model 265, 268, 271, 273, 273–7; categorical
403–6 phoneme perception 275; evaluation 274–7;
syntactic priming 403–5 response strengths 274
syntactic processing 436; interaction 299–300; traces 311–12
neuroscience 298 transcortical aphasia 443, 466, 470
syntactic rules, language differences 288 transformational grammar 10–11, 12–13, 41; see also
Syntactic Structures (Chomsky) 37 Chomsky’s linguistic theory
syntactic universalist theory 144 transformations 41, 45; and difficulty of processing 11;
syntax 6, 36–45, 56 obligatory 42; optional vs. obligatory 11
syntax module 435 transitive verbs 38
synthetic phonics 248 translation, bilingualism 156–7
tree diagrams 39, 39–40
T Triangle model 229–32
tachistoscopic identification 170 truth value 322
Tarahumara 97 tuning hypothesis 305
taxonomic constraint 127–9, 128 Turing machine 46
taxonomic hierarchies 130 turn-taking 84–5, 454
teaching language, to animals 57–67 two-process priming model 179–80
teaching reading 247, 247–8 two-stage mechanism, understanding indirect speech
telegraphic speech 104, 112, 136, 139 acts 451–2
template matching 267–8 two-stage model of discourse resolution 370
templates 259 two-stage model of lexical access 414, 418
temporal change, in language 7–9 two-stage models of lexicalization 410, 410–13,
temporal coherence 361 418–19, 419, 423
temporal discreteness 418 two-stage models of parsing 288
terminal elements 37, 39 two-word repetition task 469
terminal nodes 39 Type 0 grammar 45
text, memory for 362–72 Type 1 grammars 44
text processing: comprehension 377–86; Kintsch’s Type 2 grammars 44
construction–integration model 378, 384–6; Type 3 language 44
type-B spelling disorder 251 overview 167–8; reaction time measures 170;
Tzeltal 97 repetition priming 175–6; semantic priming 176–7;
serial search model 192–4; summary 207; summary
of research into meaning-based priming 190; syllable
U number 175; word frequency 172–3; word length
U-shaped development 108, 141, 146, 147 174–5; words and nonwords 175
U-shaped learning, second language acquisition 159 vocabulary development 135
ultra-cognitive neuropsychology 18 vocabulary differentiation 91–2
unaspirated sounds 30 vocabulary learning, and phonological loop 471
unbounded dependency 311–12 vocal tract: human and chimpanzee 59; structure 33
uncertainty 5, 26 voice onset time (VOT) 34, 261
under- and over-extensions 131–4; theories of 133 voiceless consonants 34
unfilled pauses 430 voiceless glottal fricative 34
unimodal store hypothesis 340 voicing 33–4
uniqueness point 265, 269 voluntary control, of language 56
Universal Grammar 111, 112 vowels 30, 35; as combinations of distinguishing
unrestricted race model 306–8 phonological features 35; speech production 32–3
unrestricted search hypothesis 375
unvoiced consonants 34
utility, of language 56–7
W
waggle dance 54, 55, 57
V
weak phonological perspective 216
V̄ 42 WEAVER++ 426, 427–8, 428
velars 34 Welsh, number systems 94
verb-argument structure 141–4 Wernicke–Geschwind model 17, 68–9
verb bias 302–3 Wernicke’s aphasia 68, 433–4, 435, 444
verb-island hypothesis 142 Wernicke’s area 17, 18, 69
Verbal Behavior (Skinner) 107 whales 55
verbatim memory 362–4 whole-object bias 128–9
verbs 37, 38 whole word method 247
vertical information 422 Wickelfeatures 230, 232, 477
vision: computational account 13; and language 456–8 Wild Boy of Aveyron 78
visual comprehension 157 Williams syndrome 82–3, 146, 148
visual context 267 word association 156
visual dyslexia 220 word, concept of 6–7
visual impairment: auxiliary verbs 87; language word exchange errors 398, 401, 402, 407
development 85–6, 86; phonological development word frequency 143; and pauses 431; visual word
87; syntactic development 87 recognition 172–3
visual information: in comprehension 457, 457–8; and word identification 265
parsing 458 word length: measuring 174; visual word recognition
visual processing, dementia 349 174–5
visual scenes 402–3 “word-like entity in the language of thought” 336
visual system, and color spectrum 96–7 word meaning see semantics
visual word form area 184 word meaning deafness 281, 464–5
visual word recognition 184; accessing selective word order 113, 402; linguistic universals 113
properties 205–6; age-of-acquisition (AOA) 173–4; word production, semantic priming 187
attentional processes 177–80; comparison of models word recognition 463; context effect 266–7;
198; consistency of results 180–3; context effect frequency effect 266; neuroscience 281; overview
187–90; dedicated system 183–5; evaluation of 165; PET (positron emission tomography) 463;
attentional process research 180; expectations 178–9, stages of 265; time course 265–6; see also visual
179; eye movement studies 168–70; facilitation and word recognition
interference 171–7; factors affecting 177; form- word repetition 465–6
based priming 176; frequency effect 181–3; hybrid word substitution errors 398, 401, 402, 407
models 198; interactive activation models 196–8; word superiority effect 196
lexical ambiguity 198–205; logogen model 194–6; words: borrowing 8; classes 37–8; classification of
meaning-based facilitation 185–90; methods and pronunciations 215; ease of learning 130; models of
findings 168–71; models 192–8; morphologically naming 227–33; multiple meanings 128; processing
complex words 190–2; neighborhood effects 175; of content and function words 315; reading 213–20
working memory 24, 468, 468–9; connectionist models 472; domain specific view 471; and parsing 471–2; second language acquisition 160
working memory span, and comprehension 386–7
writing 8–9, 467; agraphia 444–5; neuroscience 445; planning 444–5
writing systems 209–10; Chinese 226
written languages, types 210, 210

X
X-bar syntax 42
X-rays 19

Y
yellow filters, for dyslexic readers 255, 255
Yerkish 61

ISBN: 978-1-84872-088-6 (hbk)
