Week 2-1 Swaab Et Al. (2012)

Language-Related ERP Components

Article · January 2012

DOI: 10.1093/oxfordhb/9780195374148.013.0197

4 authors, including:

Tamara Y Swaab, University of California, Davis
Kerry Ledoux, Johns Hopkins University
Megan Boudewyn, University of California, Santa Cruz


The Oxford Handbook of Event-Related Potential Components
Emily S. Kappenman and Steven J. Luck

Print publication date: 2011

Print ISBN-13: 9780195374148
Published to Oxford Handbooks Online: Sep-12
DOI: 10.1093/oxfordhb/9780195374148.001.0001

Language-Related ERP Components

Tamara Y. Swaab, Kerry Ledoux, C. Christine Camblin, Megan A. Boudewyn

DOI: 10.1093/oxfordhb/9780195374148.013.0197

Abstract and Keywords

Understanding the processes that permit us to extract meaning from spoken
or written linguistic input requires elucidating how, when, and where in the
brain sentences and stories, syllables and words are analyzed. Because
human language is a cognitive function that is not readily investigated
using neuroscience approaches in animal models, this task presents special
challenges. In this chapter, we describe how event-related potentials (ERPs)
have contributed to the understanding of language processes as they unfold
in real-time. We will provide an overview of the many ERPs that have been
used in language research, and will discuss the main models of what these
ERPs reflect in terms of linguistic and neural processes. In addition, using
examples from the literature, we will illustrate how ERPs can be used to
study language comprehension, and will also outline methodological issues
that are specific to using ERPs in language research.

Keywords: lexical processing, sentence processing, discourse processing, non-literal language,
event-related potentials, N400, P600, Nref

Language is a central part of our everyday life. It enables us to accomplish
a seemingly infinite number of uniquely human tasks: We talk to many
people every day and answer hundreds of our children’s questions, we tell
each other jokes and recite poems to our loved ones, we negotiate deals
and treaties, we listen to news reporters, read books, and (slowly) write
chapters; and there are, of course, numerous other examples of our daily
language use. Language is a defining feature of who we are as human
beings. But understanding and producing language is extremely complex,
and is subserved by many processes and many areas in the brain. This
complexity arises in part as a function of the productivity and flexibility of
language. We constantly produce novel utterances and find new ways to use
existing words and phrases. In addition, words and sentences can assume
diverse meanings, depending on a multitude of factors such as the context,
the speaker, the listener, or the world knowledge relevant to the utterance.
A popular bumper sticker from the San Francisco Bay Area illustrates this
point. It reads: “It’s great to be alive in Colma.” To anyone unfamiliar with
this Californian town, the statement appears to be extolling the virtues of
Colma; however, residents will tell you that this statement actually refers
to the large number of cemeteries to be found in this otherwise small town
and quite literally means that it’s great to be alive in Colma. This anecdote
demonstrates the flexibility of meaning in language and how language
comprehension is frequently not straightforward.

In order to understand the processes by which we extract meaning from
spoken or written linguistic input, we must unravel how, when, and where
our brain interprets sentences and stories, syllables and words. This is a
difficult task, in part because human language is the only cognitive function
that cannot easily be studied with animal models or traditional neuroscience
methods. In this chapter, we aim to elucidate how language-related event-
related potentials (ERPs) have contributed to our understanding of language
processes as they unfold in real time. Specifically, we hope to accomplish
the following: (1) provide an overview of the different ERP components
that have been used in language research; (2) discuss the main theories of
what these components may reflect; and (3) illustrate with examples from
the literature how ERPs can be used to study language comprehension.1
Whenever possible, we will include information on the neuronal generators
of language-related ERPs. However, because all language ERPs occur
(relatively) late and are long-latency components, it is very likely that
multiple brain areas contribute to their generation, and (not surprisingly)
little is known about possible generators of most of the language-related
ERPs. Finally, we will discuss some methodological issues that are specific to
using ERPs in language research.

The N400

The best-studied language-related ERP component is the N400. The N400 is
a negative shift in the ERP waveform that is larger over centro-parietal than
over anterior regions of the scalp (Kutas & Hillyard, 1983). In young adults,
the N400 usually reaches its maximum amplitude between 380 and 440 ms
after stimulus onset. However, this may be delayed in elderly persons (e.g.,
Gunter et al., 1992) and aphasic patients (Swaab et al., 1997). In the domain
of language processing, the N400 is observed when words, sentences, or
discourses are presented as written text (e.g., Kutas & Hillyard, 1980), as
naturally produced connected speech (e.g., Holcomb & Neville, 1990, 1991),
and with sign language (Kutas et al., 1987). When words are presented
visually, the N400 effect typically onsets around 200 ms after the stimulus
and lasts for about 300 ms. In the auditory modality, the N400 may begin as
early as 100 ms after stimulus onset and last for 400 ms.
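
As a rough illustration of how an N400 effect of this kind is commonly quantified, the sketch below averages the voltage of an averaged ERP within a latency window and subtracts one condition from the other. The 200 to 500 ms window follows the visual-modality timing just described (onset around 200 ms, duration about 300 ms). The sampling rate, the function names, and the toy waveforms are assumptions made for illustration only and are not taken from any study cited here.

```python
def mean_amplitude(waveform, srate_hz, start_ms, end_ms):
    """Mean voltage (microvolts) of an averaged ERP between start_ms and end_ms.

    `waveform` holds one value per sample, with sample 0 at stimulus onset.
    """
    first = int(round(start_ms * srate_hz / 1000))
    last = int(round(end_ms * srate_hz / 1000))
    window = waveform[first:last]
    return sum(window) / len(window)


def n400_effect(incongruent, congruent, srate_hz, start_ms=200, end_ms=500):
    """Incongruent minus congruent mean amplitude in the window.

    A negative value indicates a larger (more negative) N400 to incongruent words.
    """
    return (mean_amplitude(incongruent, srate_hz, start_ms, end_ms)
            - mean_amplitude(congruent, srate_hz, start_ms, end_ms))


# Toy averaged ERPs sampled at 100 Hz (hypothetical: one value every 10 ms).
SRATE = 100
congruent = [0.0] * 60
incongruent = [0.0] * 20 + [-4.0] * 30 + [0.0] * 10  # -4 microvolts from 200 ms

effect = n400_effect(incongruent, congruent, SRATE)
print(effect)
```

In practice the window would be chosen a priori for the modality and population under study, since the text notes that the effect starts earlier in speech and may be delayed in elderly persons and aphasic patients.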

In a typical ERP N400 study, participants are presented with language
stimuli, and the N400 effect is measured to critical words for which semantic
information is manipulated. To illustrate this, let us consider the following
example from a study by Sitnikova and colleagues (2002). They presented
sentences in four conditions, as in the following example:
1. Diving was forbidden from the bridge because the river had rocks in it.
2. Diving was forbidden from the bridge because the river had cracks in it.
3. The guests played bridge because the river had rocks in it.
4. The guests played bridge because the river had cracks in it.

There were 40 stimuli in each condition, and the sentences of each
quadruplet were presented in different lists to avoid repetition (word
repetition leads to facilitated processing). Every content word in these
conditions will elicit an N400. However, as the language comprehender
proceeds through the sentence and contextual information accumulates,
it becomes easier to predict or integrate the next word of the sentence,
and this leads to a reduction in the amplitude of the N400. Thus, in normal
sentence contexts, content words presented later in the sentence will show
smaller amplitude N400s than words presented earlier in the sentence
(e.g., Van Petten & Kutas, 1991). In this example, however, Sitnikova and
colleagues (2002) aimed to determine whether or not schizophrenic patients
were able to select contextually appropriate meanings of ambiguous words
that have one form representation but at least two unrelated meanings
(e.g., bridge, meaning either the structure spanning the river or the card
game). The critical word (river in the example) was always related to the
most frequent meaning of the ambiguous word. The preceding context
supported either the more frequent interpretation of the ambiguous word
(Conditions 1 and 2) or the less frequent interpretation (Conditions 3 and
4). After the critical words, a control word was included that was congruent
in meaning (or not) with the preceding context but was not related to any
of the meanings of the ambiguous word (rocks vs. cracks in the example).
Normal, neurologically unimpaired control subjects showed a reduced N400
to river when it was consistent with the meaning of the ambiguous word
biased by the preceding context (1) relative to when it was inconsistent
(3). A comparison of the control words in Conditions 1 and 2 revealed a
reduced N400 when the control word was consistent with the meaning of
the preceding context (1, rocks) relative to when it was not (2, cracks). The
schizophrenic patients also showed this latter effect of semantic congruity,
but in contrast to the normal control subjects, they did not show an N400
difference between the context-consistent critical word (river in Condition 1)
and the context-inconsistent critical word (river in Condition 3). This indicates
that schizophrenic patients have some deficit in using contextual information
to resolve lexical ambiguities, even though they can still use the context to
detect semantic anomalies.
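
The list assignment described above, in which each sentence of a quadruplet goes to a different list so that no participant sees an item twice, can be sketched as a Latin-square rotation. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the figure of 160 item sets is inferred from 40 stimuli per condition across four conditions rather than reported in the text.

```python
def build_lists(n_items, n_conditions):
    """Rotate items through conditions so that each list shows every item once.

    List k presents item i in condition ((i + k) % n_conditions) + 1,
    so conditions are numbered from 1 as in the example sentences.
    """
    lists = []
    for k in range(n_conditions):
        one_list = [(item, (item + k) % n_conditions + 1)
                    for item in range(n_items)]
        lists.append(one_list)
    return lists


# Assumed design: 160 quadruplets, 4 conditions, 4 presentation lists.
lists = build_lists(n_items=160, n_conditions=4)

# Each list holds 40 items per condition and never repeats an item.
per_condition = [sum(1 for _, cond in lists[0] if cond == c) for c in (1, 2, 3, 4)]
print(per_condition)
```

Each participant would receive one list, and across lists every item appears once in each of the four conditions.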

What this study clearly demonstrates, in terms of the beauty of using ERPs
in language research, is that one can measure ERPs to any and all words in
the sentence without interrupting the language comprehender with a task.
Tasks can be (and often have been) included after the language stimulus is
presented. For example, one can ask subjects to make a “true” or “false”
decision about a statement based on the content of the experimental
sentence, or ask them to make a “good” or “bad” judgment about the
sentences they have just read (or heard). However, quite a few ERP language
researchers have argued that these types of tasks are no longer necessary,
because N400 effects can be observed without the inclusion of a potentially
interfering behavioral task (e.g., van Berkum, 2004). The ability to present
stimuli without any task other than to listen or read can be essential in
studies with patient populations, because they may not understand the
behavioral task (e.g., aphasic patients: Swaab et al., 1997, 1998).

However, there are challenges too. One of the challenges in any study that
uses ERPs is that blinking, eye movements, and other movements need to
be minimized because they induce artifacts in the electroencephalography
(EEG) signal. To accomplish this in ERP studies of reading, words are not
presented all at once (as in this text), but instead one word at a time at the
center of a computer screen, usually at a rate of one word every 500 ms with
a 200 ms blank interval between words. Further, a fixation cross typically
replaces words at the same central location on the screen so that subjects
can fixate their eyes, usually 1000 ms before the onset of the first word and
2000 ms after the presentation of the last word of the critical stimulus. To
further avoid blinks during presentation of the experimental stimuli, subjects
are typically made aware of when it would be a good time to blink. This can
be done, for example, by changing the color of the fixation cross to signal a
“rest” period, or during a task following the experimental materials, provided
that the task (e.g., involving true/false questions) does not contain a critical
manipulation. Often, subjects can control when they want to proceed to the
next experimental stimulus by pressing a button on a button box. This gives
them a chance to blink and move their eyes for a longer period of time, if
necessary. In addition, participants are usually given a break after a 5–10 min
block of an experiment. In general, experimental blocks are preceded by a
practice block, both to familiarize the subjects with the language task and
to train them to avoid moving their eyes or blinking during presentation of
the experimental stimuli. One big hurdle in ERP studies of language is that it
is usually not possible to generate more than 40 stimuli in a condition; this,
of course, makes it even more important not to lose many trials to subject-
generated artifacts (see also the methods section at the end of this chapter).
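
The word-by-word presentation timing described above can be laid out as a simple display schedule. The sketch below assumes that the 500 ms rate is the onset-to-onset interval and that it includes the 200 ms blank (so each word is visible for 300 ms), and that the closing fixation cross starts after the last word's blank interval; both are interpretations of the text, and the function name is hypothetical.

```python
def rsvp_schedule(words, soa_ms=500, blank_ms=200,
                  pre_fix_ms=1000, post_fix_ms=2000):
    """Return (label, onset_ms, duration_ms) display events for one sentence.

    "+" marks the fixation cross and "" marks a blank screen.
    """
    word_ms = soa_ms - blank_ms          # assumed visible duration of each word
    events = [("+", 0, pre_fix_ms)]      # fixation before the first word
    t = pre_fix_ms
    for word in words:
        events.append((word, t, word_ms))
        events.append(("", t + word_ms, blank_ms))
        t += soa_ms
    events.append(("+", t, post_fix_ms))  # fixation after the last word
    return events


schedule = rsvp_schedule("He spread the warm bread with socks".split())
onsets = [onset for label, onset, _ in schedule if label not in ("+", "")]
print(onsets)
```

The word onsets in such a schedule are the time points at which event codes would be sent so that ERPs can be time-locked to each word.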

In the auditory modality, language stimuli have typically been presented
as naturally produced connected speech. The onset of critical words in the
speech stream is located with a speech-editing program, and this measured
latency is used to send out a trigger or event code; this marks the onset
of the critical word to which the ERPs can be time-locked. Since spoken
words vary in duration, the onset of the critical words typically varies across
experimental stimuli in language experiments. This can be avoided by using
identical speech input up to the time of the critical manipulation, which is
accomplished by splicing the critical words in all conditions (for individual
item sets) onto the same spoken stem (e.g., “[She drank her coffee with
cream and] (stem) + sugar (Condition 1) / dog (Condition 2)”). The previous example would
involve three audio files, would create two experimental items in different
conditions, and, critically, would allow participants to hear identical spoken
input in each condition up to the presentation of the critical word. Because
the incoming speech signal is continuous, ERP components to the different
words in the sentences overlap, resulting in the absence of clear N1 and P2
components in the averaged ERP; yet, (fortunately) a robust N400 effect can
be measured in studies of comprehension of spoken language.
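
Time-locking to a critical word in continuous speech amounts to converting the onset latency measured in the speech editor into a sample index and cutting an epoch around it. The sketch below shows one way this could be done, assuming a single-channel recording stored as a list of voltages, a 250 Hz sampling rate, a 100 ms pre-onset baseline, and a 900 ms post-onset window. All of these values, the onset latency, and the function names are hypothetical choices for illustration.

```python
def onset_to_sample(onset_ms, srate_hz):
    """Convert a latency in milliseconds to the nearest sample index."""
    return int(round(onset_ms * srate_hz / 1000))


def extract_epoch(eeg, event_sample, srate_hz, pre_ms=100, post_ms=900):
    """Cut an epoch around the event and subtract the mean pre-onset voltage."""
    pre = onset_to_sample(pre_ms, srate_hz)
    post = onset_to_sample(post_ms, srate_hz)
    segment = eeg[event_sample - pre:event_sample + post]
    baseline = sum(segment[:pre]) / pre
    return [value - baseline for value in segment]


SRATE = 250                      # assumed sampling rate in Hz
critical_onset_ms = 1832         # hypothetical onset measured in a speech editor
sample = onset_to_sample(critical_onset_ms, SRATE)

# Toy recording: 2 microvolts before the critical word, -1 microvolt after it.
eeg = [2.0] * sample + [-1.0] * (1000 - sample)
epoch = extract_epoch(eeg, sample, SRATE)
print(sample, len(epoch), epoch[0], epoch[-1])
```

Averaging such epochs across the trials of a condition yields the ERP for that condition. Because the speech preceding the critical word differs across unspliced stimuli, the pre-onset baseline is only comparable across conditions when the stem is identical, which is the motivation for the splicing procedure described above.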

The Discovery of the N400

Fig. 15.1. The discovery of the N400 (Kutas and Hillyard, 1980). Subjects
were presented with sentences one word at a time in three conditions.
Arrows show when each word was presented. In this and all other figures,
the vertical axis shows the amplitude in microvolts and the horizontal axis
shows the time in milliseconds. Negative polarity is depicted upward. Event-
related potentials were compared to the final words in three conditions.
In the congruent condition (solid line), the last word of the sentence was
semantically appropriate given the context. In the anomalous condition
(dashed line), the sentence-final word was semantically inappropriate
given the context, and this elicited the N400. The dotted line shows that a
sentence-final word that was semantically congruent but physically deviant
(a different font size) elicited a large positive shift (P650) but no N400.
Redrawn with permission from Kutas and Hillyard (1980), Figure 1.

The existence of the N400 was first reported by Marta Kutas and Steven
Hillyard in a seminal study published in Science in 1980. At the time, no
language-related ERPs had been discovered and electrophysiological
methods were used almost exclusively to study other perceptual and
cognitive processes, such as attention and memory. Marta Kutas, who
had performed studies with Emanuel Donchin using the P300, wondered
whether or not this component would also be sensitive to “oddballs” of
language. Kutas and Hillyard (1980) asked subjects to read sentences
such as “He spread the warm bread with socks.” These sentences were
presented one word at a time at fixation, as is now typical for ERP studies
of reading (see the previous section). In this same study, subjects also read
normal sentences (e.g., “It was my first day at work”) and sentences that
ended with a word that was normal in meaning, given the context, but
anomalous because of a change in the physical appearance of the critical
word (e.g., “She put on her high-heeled SHOES”). The physical oddball
indeed resulted in a positive deflection in the ERP waveform. The semantic
anomaly, however, elicited a negative ERP that peaked at around 400 ms
and was maximal over centro-parietal electrode sites; this component was
labeled the N400 (see Figure 15.1).

The discovery of the N400 has led to a flurry of ERP studies of word,
sentence, and discourse comprehension (see Figures 15.2 and 15.3). Many
of the early studies were devoted to identifying the processing nature of the
N400. These studies showed that the N400 is modality independent; that
is, N400 effects are observed to words whether they are written, spoken, or
signed2 (e.g., Bentin et al., 1985; Holcomb & Neville, 1990; Kutas & Hillyard,
1980; Kutas et al., 1987; McCallum et al., 1984). In addition, many studies
have shown that the N400 is not only sensitive to semantic violations, but is
also found when words are semantically appropriate but less expected in the
context—for example wasp in “She was stung by a wasp,” where bee would
be the most expected completion (Kutas & Hillyard, 1984; Kutas et al., 1984).
Further studies have shown that the N400 is not restricted to manipulation
of meaning in sentences, but is also found for manipulations of discourse
contexts, on the one hand (e.g., van Berkum et al., 1999), and semantic or
repetition priming manipulations, on the other hand, where only one word
serves as the context (e.g., Bentin et al., 1985). Van Berkum and colleagues
(1999) manipulated whether or not the final word of the last sentence in
a short passage was consistent in meaning with the preceding discourse
context (e.g., “He ate a juicy steak” preceded by a discourse context that
had introduced a vegetarian versus a discourse context that had introduced
a person who loves to eat meat). They found a reduced N400 to critical
(final) words that matched versus those that did not match the meaning of
the global discourse contexts, even when these words were semantically
appropriate in the local sentence context. In studies of semantic priming,
the amplitude of the N400 is reduced to words that are associatively or
semantically related to the preceding context word relative to when they are
not (e.g., doctor–nurse vs. table–nurse; Bentin et al., 1985; Brown &
Hagoort, 1993; Chwilla et al., 1998, 2000; Holcomb, 1993).

Fig. 15.2. N400 findings for different experimental manipulations. Solid lines
in this figure always reflect the semantically easier processing condition.
Negative amplitude is depicted upward. Semantically congruent words
in sentences elicit a reduced N400 relative to semantically incongruent
words; this is found when the violation occurs in midsentence (A), but also
at the end of a sentence for written and spoken words and even when the
final word is replaced with a line drawing that is consistent or not with the
preceding sentence context (C), although this latter ERP effect has a more
anterior distribution than the canonical N400. The amplitude of the N400
is reduced to words that occur at a later position in semantically congruent
sentences (F) and also to words that are preceded by one context word
that is related in meaning (semantic priming, B) or by an identical word
(repetition priming, D). Words that are used frequently in our language
elicit a reduced-amplitude N400 relative to less frequently used words (E).
Reprinted with permission from Kutas and Federmeier (2000), Trends in
Cognitive Sciences (Cell Press).

Other studies have shown that the N400 is sensitive to lexical properties
of words. For example, real words (e.g., plant) elicit smaller N400s than
pseudowords (orthographically legal, pronounceable nonwords, e.g., plunt),
but random letter strings do not produce an N400 component (e.g., ntlpu).
Frequently used (high-frequency) words show smaller amplitude N400s
than do infrequently used (low-frequency) words (e.g., Barber et al., 2004),
but this effect is modulated by the context, such that words later in the
sentence no longer show lexical frequency effects (Van Petten & Kutas,
1991). Additionally, words with small orthographic neighborhoods (a word's
orthographic neighbors are the existing words that can be formed by changing
one of its letters, e.g., fun and fan) show reduced N400s relative to words with large
orthographic neighborhoods (e.g., Holcomb et al., 2002; for a review, see
Grainger & Holcomb, 2009).3

Some studies that have manipulated the concreteness or imageability
of words have observed a larger negative shift to high-imageable/
concrete words (e.g., “banana”) than to low-imageable/abstract words
(e.g., “justice”). This effect occurs in the same latency range of the
N400, but the topographic distribution of this effect of imageability or
concreteness is anterior, instead of the typical centro-parietal scalp
distribution of the canonical N400 (Holcomb et al., 1999; Kounios &
Holcomb, 1994; Swaab et al., 2002; West & Holcomb, 2000, 2002). Different
topographic distributions of scalp-recorded ERPs may indicate that (partially)
nonoverlapping neural sources have contributed to their generation and that
they are not actually the same component (see Figure 15.4).

Fig. 15.3. A comparison of semantic violations in discourse and sentence
contexts. The top part of this figure shows a larger N400 (solid line) to
critical words (CW) that violate the meaning of the discourse context, even
though these words are appropriate in the local sentence context (e.g.,
“Yesterday he ate a big juicy steak,” when the previous discourse has been
about a vegetarian). This finding is observed for both discourse-final and
discourse-medial critical words. The discourse N400 effect does not differ
from the effect of semantic anomalies in single-sentence contexts (bottom of
the figure). Redrawn with permission from Figure 5 in van Berkum, Brown &
Hagoort, Journal of Cognitive Neuroscience, 1999 (MIT Press).

The effect of imageability or concreteness is not modulated by semantic
relatedness in a semantic priming paradigm (Swaab et al., 2002), but larger
effects of concreteness are observed for words in anomalous compared to
congruent contexts (Holcomb et al., 1999). These results are consistent
with the idea that verbal- and image-based representations may be stored
separately, but that effects of context can be greater for highly imageable
or concrete words in an image-based memory system. Finally, N400-like
components have been observed in nonlinguistic meaningful contexts
such as line drawings (Ganis et al., 1996), stories that are formed by a
series of cartoon-like pictures (West & Holcomb, 2002), and short movies
(Sitnikova et al., 2003). These studies have found a negative-polarity
ERP in the same time window as the canonical N400 but with a more
anterior distribution (but see Willems et al., 2008). If the differences in
the topographic distribution of the canonical N400 and the ERP elicited in
other meaningful contexts indeed indicate that these ERPs are generated
by (partially) nonoverlapping neuronal sources in the brain, then this would
be relevant to the question of whether the semantic processing system
that enables language comprehension is the same as or different from the
system that must exist to process meaning in nonlinguistic contexts. As can
be seen in the bottom part of Figure 15.4, functional magnetic resonance
imaging (fMRI) findings from the same paradigm showed separable areas of
activation for the effects of priming and imageability, which is further indirect
evidence that the ERP effect of imageability and the canonical N400 are not
generated by completely overlapping neuronal sources (Giesbrecht et al., 2004).

In the following sections, we will discuss in more detail some of the
major findings with the N400 in different language contexts for reading
comprehension and comprehension of spoken language.

N400 and Lexical Context

As discussed above, N400 effects are found even when the preceding
context consists of a list of words that do not form sentences or discourse.
These N400 effects have been observed in studies of semantic and repetition
priming (e.g., Bentin & Peled, 1990; Bentin et al., 1985; Boddy, 1986;
Holcomb & Neville, 1990; Joyce et al., 1999; Kutas & Hillyard, 1989; Rugg
et al., 1993; Swaab et al., 2002; see Figure 15.2).

Fig. 15.4. Event-related potential and functional magnetic resonance imaging
findings for effects of imageability and priming (see text for explanation
of these terms). The top left part of the figure shows that high-imageable
words (Hi, solid line) elicit a larger negative shift than do low-imageable
words (Lo, dotted line). This effect has a more anterior distribution than
the canonical N400 effect of priming shown on the right top part of this
figure. The canonical N400 has a parietal distribution, and a reduced N400
to related words (Rel, dotted line) is found relative to unrelated words (Unr,
solid line). The distribution of the effects of imageability and priming is
evident from the topographic maps that show the distribution of voltage of
the ERP effects over the head. The pink color displays the largest effects. The
bottom part of this figure shows the fMRI effects of imageability and priming
from the same paradigm. Effects of priming were observed in the middle
temporal gyrus (MTG), inferior parietal lobe (IPL), and inferior and middle
frontal gyrus (IFG and MFG). Effects of imageability were found in the middle
temporal gyrus (MTG) and inferior frontal gyrus (IFG). Note that the effects of
imageability never overlap with the effects of priming. Redrawn from Figures
1, 2 and 3 in Swaab, Baynes and Knight, Cognitive Brain Research, 2002
(Elsevier), and Figure 4 in Giesbrecht, Camblin and Swaab, Cerebral Cortex,
2004 (Oxford University Press).

In semantic priming studies, a reduced N400 is obtained to target words
that are semantically and/or associatively related to their preceding prime
(relative to unrelated target words: e.g., doctor–nurse vs. table–nurse).
The N400 priming effect has been observed in a range of tasks, including
semantic judgment (e.g., is this word related in meaning to the preceding
word?) and lexical decision (is this stimulus a word or not?), but also in no-
task situations, where participants were asked to just listen to or read pairs
of words for comprehension.

Lexical repetition also leads to a reduced N400 (see Figure 15.2,D; Bentin &
Peled, 1990; Joyce et al., 1999; Rugg et al., 1993), and this effect is greatest
at shorter lags (e.g., zero to six intervening items; Rugg & Nagy, 1989). The
N400 repetition effect has been observed across a variety of tasks, including
lexical decision (Bentin & Peled, 1990; Karayanidis et al., 1991; Rugg et al.,
1988), semantic classification (Hamberger & Friedman, 1992; Rugg et al.,
1988), and recognition memory tasks (Bentin & Peled, 1990). The effect is
restricted to repeated items that are semantically meaningful; geometric
line drawings, for example, do not show a reduced negativity in the N400
time range as a result of repetition (Rugg et al., 1995; Van Petten & Senkfor,

Reductions in the amplitude of the N400 to semantically related and
repeated words have been obtained in the visual and auditory modalities,
as well as in studies that have used cross-modal presentation (see Figure
15.2,B; Domalski et al., 1991; Holcomb et al., 2005; Joyce et al., 1999; Rugg
et al., 1993). Sometimes, the onset of the N400 priming effect occurs earlier
in the auditory modality (Holcomb & Neville, 1990), which is consistent with
findings that the unique identification of words occurs well before the whole
speech signal is heard (Grosjean, 1980).

A large number of ERP semantic priming studies have been devoted
to the processing nature of the N400 effect (e.g., Bentin et al., 1985;
Brown & Hagoort, 1993; Chwilla et al., 2000; Friederici, 1995; Holcomb,
1993; Holcomb & Neville, 1990; Kutas & Hillyard, 1989). In order to better
understand the nature of the N400 effects that have been obtained, we
will first briefly review one dominant behavioral account of lexical semantic
priming that was proposed by Neely (1991), who postulated three different
priming mechanisms. The first of these mechanisms is automatic spread
of activation within a lexical-semantic network (Collins & Loftus, 1975;
Neely, 1977). That is, activation spreads from the semantic node associated
with the prime to the semantic node associated with the target, thereby
reducing the processing time of the target upon its presentation. This
spread of activation is assumed to be an automatic process that cannot be
influenced by any subject strategies. The other two priming mechanisms
were postulated because different patterns of results had been obtained as
a function of the task that was used in the priming paradigm (e.g., lexical
decision vs. naming). The first of these additional mechanisms is expectancy-
induced priming: subjects use the meaning of the primes to generate an
expectancy set of possible target words (Becker, 1980, 1985; Posner &
Snyder, 1975). Expectancy-induced priming reflects controlled processing,
and is capacity consuming, relatively slow, and presumably under the
subjects’ strategic control. The other additional priming mechanism has been
labeled semantic matching (Neely & Keefe, 1989) or postlexical meaning
integration (De Groot, 1985). Semantic matching is not unlike the integration
process that occurs in the more common processing of sentences or
discourse (Brown & Hagoort, 1993; Chwilla et al., 1998; De Groot, 1985), and
may be automatic in nature since it does not require conscious awareness
of the prime (e.g., Bodner & Masson, 2003). Several studies have shown
that the N400 priming effect is greater when the task requires participants
to pay attention to the meaning of the words. Additionally, no N400 priming
effects are found for words presented in an unattended visual location
(McCarthy & Nobre, 1993) or in the unattended ear in a dichotic listening
task (Bentin, Kutas & Hillyard, 1995). But expectancy-induced processing
is not required to elicit an N400. Chwilla and colleagues (1998) prevented
the contribution of expectancy-induced priming in a study that utilized a
backward priming paradigm. Backward priming refers to the paradoxical
finding of facilitation of the processing of a target word when there is only
an association from the target to the prime, in the absence of an association
from the prime to the target. Consider, for example, the prime–target pair
baby–stork. In this case, baby has a forward association to words such as
mother and infant, but no such forward association exists between baby
(prime) and stork (target). Thus, neither forward spread of activation in a
lexical network nor the generation of an expectancy set for the target word
can explain the finding of facilitated processing of the target word in this
case. Koriat (1981) was the first to observe backward priming and suggested
that these effects might be attributed to spread of activation in a lexical
network in a backward direction. At longer intervals between prime and
targets, backward priming may also result from a postlexical relatedness-
checking strategy. That is, given enough time, participants in a priming study
may realize that there is a backward meaning relation between target and
prime after the target word has been presented (Seidenberg et al., 1984).
In order to prevent the use of this strategy, Chwilla and colleagues (1998)
made the interstimulus interval between prime and target 0 ms, theoretically
too short for a postlexical relatedness-checking strategy. Therefore, the
curious finding of facilitated processing of a target word that only has a
backward meaning relation to the prime can then be explained only in terms
of a backward association. Under these circumstances, clear N400 effects
of backward priming were obtained (e.g., the N400 was reduced to stork
when preceded by baby). Findings of this study are therefore consistent with
the idea that N400 priming effects can occur in the absence of controlled
processes, such as participants reflecting on the relation of stork back to
baby. N400 effects have also been found when participants are engaged in a
shallow processing task that is orthogonal to the relationship between words
(e.g., letter search; Kutas & Hillyard, 1989). Furthermore, Luck et al. (1996)
demonstrated that awareness of the target word is not required, since they
found preserved N400 effects in the attentional blink even when the second
of two targets presented in rapid serial visual presentation (RSVP) could
not be accurately reported. Other studies have shown effects of semantic
priming when perception of the prime was masked (e.g., Deacon et al., 2000;
for a discussion of masked N400 repetition effects, see Holcomb & Grainger,
2007). It appears, then, that N400 effects can be obtained under automatic
processing conditions. However, at this point, there is no conclusive evidence
from semantic priming studies to suggest that the N400 is exclusively
modulated by automatic integration and not by spread of activation in a
semantic network.

N400 and Sentence Context 4

In the previous section, we presented an overview of some of the many
studies that have found N400 priming effects to semantically associated
and repeated words. These studies have provided us with insights about
the organization and processing of lexical representations when presented
outside of a structured linguistic context. The sentence, as the smallest
complete unit of language with a cohesive structure, provides a valuable
springboard to study language processing in a meaningful and structured
environment. This section will focus on N400 effects that have been found
in ERP studies of the processing of the meaning of words embedded in
sentence contexts.

ERP studies of sentence reading

A seminal question in studies of sentence comprehension (that have not
focused on syntactic aspects of processing) concerns the point at which
lexical processing is influenced by the meaning of the larger sentence
context. As was discussed earlier, much of the initial work on the functional
nature of the N400 was performed with sentences, and in the 1980s it
was shown that the amplitude of the N400 varied as a function of cloze
probability and sentential constraint (Kutas et al., 1984). Cloze probability
measures are obtained by asking participants to finish sentences from which
the last word is omitted, such as “I drink my coffee with sugar and _____.”
The cloze probability is measured as the percentage of people that finish
the sentence with a specific word. In this example, cream might have a
cloze probability of 60% and milk might have a cloze probability of 30%.
In this case, cream is considered the best completion of the sentence.
Sentential constraint is determined by the range of possible continuations
of a particular sentence context. In this example, the sentence can be
finished in a meaningful way only with a very limited number of words and
is therefore highly constraining. Other sentences, like “Yesterday she went
to the _______” are not very constraining and can be completed with many
possible words (e.g., store, theatre, movies, ski slopes, school, dentist,
doctor, hospital). Strongly constraining sentences will necessarily always
end with a word that also has a very high cloze probability. But both high-
and low-constraint sentences can be completed with a low-cloze word, that
is, a word that is appropriate given the context but does not provide the
best completion. In this case, one can distinguish the effects of constraint
separately from the effects of cloze probability; that is, one can study the
processing of unexpected words in sentences of varying constraint. This is
important in identifying the scope and nature of the influence of sentence
contexts on the processing of upcoming words in the sentence. In other
words, is there a difference in the processing of unexpected (low-cloze)
words in high- versus low-constraint sentences? Some models assume that
lexical processing is impervious to contextual constraint and would therefore
predict no difference in processing as a function of contextual constraint.
Models that assume the immediate influence of context on lexical processing
would predict a larger impairment in the processing of unexpected words in
highly rather than in minimally constraining sentences. Behavioral studies
have used lexical decision and naming methods to investigate the influence
of contextual constraint on lexical processing of upcoming words. These
studies have shown that the strength of the sentential constraint and the
degree of semantic relation between the best and the actual completion both
influence processing (e.g., Schwanenflugel & LaCount, 1988; Schwanenflugel
& Shoben, 1985). Schwanenflugel and colleagues found evidence of
faster processing only for words that were related in meaning to the best
completion for low-constraint sentences, such as “She cleaned the dirt from
her sandals,” where shoes is the best completion, but not for high-constraint
sentences, such as “On a hot summer’s day, many people go to the lake,”
where beach is the best completion. On the basis of these results, it had
been assumed that strongly constraining sentences result in a narrow scope
of activation of lexical candidates, possibly because these contexts activate
a larger set of featural restrictions for upcoming words in the sentence (e.g.,
Schwanenflugel & LaCount, 1988; Schwanenflugel & Shoben, 1985).
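
The two norming measures described above can be illustrated with a minimal sketch. The response list is hypothetical and simply mirrors the coffee example. The function names are illustrative. Taking the cloze probability of the best completion as an index of sentential constraint is an assumption made here for simplicity, since the text defines constraint by the range of possible continuations.

```python
from collections import Counter


def cloze_probabilities(completions):
    """Return {word: proportion of respondents who supplied that word}."""
    counts = Counter(word.strip().lower() for word in completions)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def best_completion(completions):
    """Return the most frequent completion and its cloze probability."""
    probs = cloze_probabilities(completions)
    word = max(probs, key=probs.get)
    return word, probs[word]


# Hypothetical norming data: 10 respondents completing
# "I drink my coffee with sugar and _____."
responses = ["cream"] * 6 + ["milk"] * 3 + ["honey"]

probs = cloze_probabilities(responses)
# Assumption: constraint is indexed by the cloze of the best completion.
best, constraint = best_completion(responses)
n_distinct = len(probs)  # range of continuations actually produced

print(probs["cream"], probs["milk"], best, constraint, n_distinct)
```

In this sketch a low-cloze completion such as honey can be scored in either a high- or a low-constraint frame, which is the comparison needed to separate the two factors.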

However, N400 ERP studies have produced contradictory results to the
aforementioned behavioral studies: words that are unexpected but related
in meaning to the best sentence completion are actually processed more
easily in high- than in low-constraint sentences (Federmeier & Kutas, 1999;
Kutas et al., 1984). Federmeier and Kutas (1999) have suggested that this
paradox could be resolved if one assumes that high contextual constraint
can lead to both benefits and impairments in processing of upcoming words
if context exerts its influence at different stages of processing. That is,
initially high sentential constraint can facilitate processing of words that are
unexpected but related in meaning to the best completions because the
context has activated a set of semantic features that matches the meaning
of both words. The impairments in processing that have been observed in the
behavioral literature may then indicate a later stage of contextual influence
that does not affect the amplitude of the N400.

Federmeier and colleagues (2007) have examined this idea by testing
whether the contextual effects of sentence constraint and cloze probability
jointly affect the N400 or have independent influences on processing. They
manipulated sentence constraint and cloze probability as in the following
example:

Strong Constraint:

The children went outside to play (high cloze—expected)/look (low cloze—unexpected).

Weak Constraint:

Joy was too frightened to move (medium cloze—expected)/look (low cloze—unexpected).


For the unexpected completions, the critical words were the same in
both constraint conditions (look in the example) and formed plausible
but unexpected completions with very low cloze probabilities (3.1%). For
the expected completions, contextual constraint and cloze probability
were necessarily confounded (high-constraint sentences, by definition,
will yield fewer possible responses than low-constraint sentences) such that
high-constraint sentences ended with critical words of very high cloze
probability (85.3%) and low-constraint sentences ended with critical words
of medium cloze probability (26.9%). Federmeier et al. (2007) replicated
previous findings of the effects of cloze probability on the N400 such that
congruent sentence-final words with high cloze probability resulted in
reduced amplitude N400s relative to words that were congruent but with
medium cloze probability (play and move in the example, respectively).
Interestingly, there was no modulation of the amplitude of the N400 to
the low-cloze words as a function of sentential constraint. Instead, low-
cloze words in highly constraining sentences elicited a positive shift with
a frontal distribution that occurred after the N400 (see Figure 15.5). This
result has important theoretical implications. In contrast to the behavioral
results discussed earlier, the ERP results show that both cloze probability
and sentence constraint affect processing, but that the effects of contextual
constraint are actually delayed relative to the effects of cloze probability.

Fig. 15.5 Event-related potential findings for critical words in
sentences of different constraint and cloze probability (see text for
explanation). The x marks on the schematic of the positions of the electrodes
on the head indicate the sites from which the ERP results are displayed.
The results show that contextual constraint had no immediate effect on
the processing of unexpected words in the sentence; the N400 amplitude
between the strongly constraining and weakly constraining unexpected
words was not different. The effects of constraint occur later in the ERP
waveform; a larger positive shift with a medial frontal distribution is elicited
to unexpected words in highly constraining sentences. Redrawn with
permission from Federmeier, Wlotko, Ochoa-Dewald & Kutas, Brain Research,
2007 (Elsevier).

Other studies of sentence comprehension have tested if and when effects
of sentence congruence influence lexical priming effects. These studies also
allow us to ask whether N400 congruency effects in sentence contexts can
be distinguished from N400 priming effects. Van Petten (1993) has shown
that both lexical association and sentence congruence modulated the N400
to critical words that were preceded by an associatively related or unrelated
word that was embedded in meaningful sentences or in so-called syntactic
prose (i.e., meaningless sentences with a preserved structure; e.g., Marslen-Wilson
& Tyler, 1980), as in the following example:
a Congruent/Associated

• When the moon is full, it is hard to see many stars or the Milky Way.

b Congruent/Unassociated

• When the insurance investigators found that he’d been drinking, they refused to pay the claim.

c Syntactic Prose/Associated

• When the moon is rusted, it is available to buy many stars or the Santa Ana.

d Syntactic Prose/Unassociated

• When the insurance supplies explained that he’d been complaining, they refused to speak the keys.

Fig. 15.6 In contrast to medium- and high-span subjects, low-span subjects
do not show an N400 effect of overall sentence congruence in the absence of
lexical association (see text for explanation). Redrawn with permission from
Van Petten, Weckerly, McIsaac & Kutas, Psychological Science, 1997 (APS).

Modulation of the N400 was found as a function of sentence congruency
and lexical association, with the biggest effect in the congruent/associated
condition (Condition a). However, the effects of sentence congruency were
subject to great individual variability. This finding was followed up in a
subsequent study that examined whether effects of sentential congruency
and lexical association varied as a function of individual differences in
reading span (Van Petten et al., 1997; see Daneman & Carpenter, 1980, for
a discussion of reading span). Reading span presumably measures working
memory capacity in relation to sentence processing, and many studies
have observed correlations between reading span and sentence-processing
ability (but see Caplan & Waters, 1999, for a critique on the reading span
measure). Van Petten and colleagues (1997) presented participants with
the same stimuli as in their study discussed above (Van Petten, 1993),
but this time, participants were divided into low-, medium-, and high-span
groups according to their scores on the reading span task. The sentences
were presented one word at a time at a rapid serial presentation rate of one
word every 300 ms, which is much faster than the average rate of RSVP
that has been used in ERP experiments in general (∼500 ms) but closer to
the average fixation time to words under natural reading conditions (e.g.,
Camblin et al., 2007b). As can be seen in Figure 15.6, medium- and high-
span readers showed the same N400 results as in the Van Petten (1993)
study. However, low-span readers did not show N400 effects of sentence
congruence in the absence of lexical association (Condition b). The authors
concluded that lexical association effects are not modulated as a function of
reading span,5 but that low-span readers benefit from the effects of sentence
congruence only when lexical associations can aid in building sentence-level
context representations.

It is prudent at this point to raise two methodological issues. One is not ERP-
specific but is relevant to the present study: namely, the use of a probe
recognition task. Gordon et al. (2000) have shown that the probe word
recognition task induces readers to read sentences more like incoherent
lists, which may enhance effects of lexical association and decrease
effects of sentence coherence. The other, more ERP-specific issue is the
necessity to use RSVP paradigms with ERPs to avoid eye movements that
will contaminate the ERPs. As mentioned before, Van Petten et al. (1997)
used an RSVP rate of one word every 300 ms to more closely approximate
normal reading rates. However, since words are presented one at a time,
this fast rate may impose additional demands on the reader that may
influence overall sentence integration (Camblin et al., 2007b). This issue
will be discussed in more detail later in the chapter (“Using ERPs to Study
Language: Methodological Issues”).

Fig. 15.7 The VHF method has been used to study the contribution of the LH
and RH to language processing. The VHF studies present stimuli laterally to
either the right visual field (RVF, in red) or the left visual field (LVF, in green).
The RVF input hits the left side of the retinas of both eyes and is processed
first by the LH, and the LVF input hits the right side of the retinas of both
eyes and is processed first by the RH. This is due to the organization of the
visual system, where the optic nerves from both eyes cross at the optic
chiasm and are sent to the primary visual cortices of the LH and RH of the
brain. Reprinted with permission from Bruno Dubuc, Canadian Institutes of
Health Research: Institute of Neurosciences, Mental Health and Addiction,

Interestingly, Coulson and colleagues (2005) observed hemispheric
asymmetries for the ERP effects of lexical association and sentence
congruence. In this study, critical words were presented in the left visual
field/right hemisphere (LVF/RH) or in the right visual field/left hemisphere
(RVF/LH) by making use of the visual half field (VHF) method (see Figure
15.7).

This method takes advantage of the neuroanatomical organization of the
visual system, in which the visual input from the left VHF is first projected
(via the optic chiasm) to the RH and vice versa. This has the consequence
that any stimulus presented to the RVF is initially processed by the LH
and any stimulus presented in the LVF is initially processed by the RH,
after which the information is transferred via the corpus callosum to the
other hemisphere. Briefly presenting a stimulus to a specific visual field
can allow researchers to draw inferences about the nature of processing
in the contralateral cerebral hemisphere. In VHF experiments, it is crucial
that participants keep their eyes focused centrally to make sure that the
stimulus is actually presented to one of the VHFs (and to obtain ERPs that
are not contaminated by eye movements). The following procedures can be
implemented to make sure that eye movements have not contaminated the
results: (1) monitor eye movements throughout the experimental session
in order to ascertain that all subjects maintain fixation, (2) conduct an eye-
calibration procedure for each subject, and (3) inspect VHF ERP data for a
larger N1 component on electrode sites contralateral to the visual field of
presentation; the presence of a larger N1 indicates that the visual stimulus
was indeed first processed by the contralateral hemisphere.
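
The third check above can be sketched as a simple comparison of N1 amplitude over the two hemispheres. This is a minimal illustration under stated assumptions: the 150–200 ms measurement window, the function names, and the toy waveforms are all hypothetical and are not taken from any of the studies discussed. Because the N1 is a negative-going component, "larger" is treated here as "more negative".

```python
def mean_amplitude(waveform, times_ms, start_ms, end_ms):
    """Mean voltage of the samples whose latency lies in [start_ms, end_ms]."""
    window = [v for v, t in zip(waveform, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


def n1_is_contralateral(left_site, right_site, times_ms, visual_field,
                        window=(150, 200)):
    """True if the N1 is more negative over the hemisphere opposite the
    stimulated visual field. The default window is an assumption."""
    left = mean_amplitude(left_site, times_ms, *window)
    right = mean_amplitude(right_site, times_ms, *window)
    if visual_field == "RVF":   # RVF input is processed first by the LH
        return left < right
    if visual_field == "LVF":   # LVF input is processed first by the RH
        return right < left
    raise ValueError("visual_field must be 'RVF' or 'LVF'")


# Toy averaged waveforms (microvolts), one sample every 50 ms.
times = list(range(0, 400, 50))
left_occipital = [0, 1, -1, -4, -3, 0, 1, 0]
right_occipital = [0, 1, -1, -2, -1, 0, 1, 0]

rvf_ok = n1_is_contralateral(left_occipital, right_occipital, times, "RVF")
lvf_ok = n1_is_contralateral(left_occipital, right_occipital, times, "LVF")
print(rvf_ok, lvf_ok)
```

With these toy values the left site carries the larger N1, which is the pattern expected for RVF presentation and not for LVF presentation.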

To study the contributions of the LH and RH to effects of sentence
congruence and lexical association, Coulson et al. (2005) utilized a
counterbalanced design in which sentence congruence and lexical
association were manipulated such that the last word of the sentence was
either congruent or not with the whole-sentence context and was associated
in meaning or not with an immediately preceding word, as in the following
example (critical words are underlined):
a Congruent/Associated

• The Italian cook always added too much olive oil.

b Congruent/Unassociated

• They were hard to walk in, but she loved her olive shoes.

c Incongruent/Associated

• During the test, Ellen leaned over and borrowed my spare

d Incongruent/Unassociated

• They were truly stuck, since she didn’t have a spare pencil.

Fig. 15.8 Only LVF presentations result in effects of association in both
congruent- and incongruent-context sentences. This effect was not observed
on the N400, but rather on a frontally distributed positive shift—more
positive to associated than unassociated words in both congruent and
incongruent conditions. The RVF presentation resulted in an N400 effect
of association for the incongruent condition only (bottom left quadrant).
Redrawn from Figures 4 and 5 in Coulson, Federmeier, van Petten and Kutas,
Journal of Experimental Psychology: Learning, Memory and Cognition, 2005

Coulson et al. (2005) showed N400 effects of sentence congruence for both
hemifields that were identical in size and onset. This finding suggests that
the RH is in fact sensitive to sentence-level semantic constraints, which
confirms earlier behavioral findings of Faust and colleagues (Faust, 1998;
Faust & Gernsbacher, 1996; Faust et al., 1993, 1995). Effects of lexical
priming, on the other hand, varied as a function of hemifield of presentation:
whereas in the LVF/RH, effects of lexical priming were observed for both
congruent and incongruent sentence endings, in the RVF/LH effects of lexical
priming were observed only for sentences that ended with an incongruent
word. Thus, these results suggest that the two hemispheres are differentially
sensitive to effects of lexical priming in sentence contexts. The LH “uses”
lexical associations only when no overall integration of the sentential context
is possible, whereas the RH uses lexical associations regardless of the overall
congruence of the sentence contexts. In addition, the effects of lexical
priming only showed up as a canonical N400 effect for the incongruent/
associated critical words presented to the RVF/LH. For the LVF/RH, a frontally
distributed ERP effect of lexical association was observed (see Figure 15.8).
Coulson et al. (2005) state that this effect needs further study, but it does
open the intriguing possibility that under certain circumstances, effects of
lexical priming may have a distinct electrophysiological signature from the
classical N400 effect of sentence congruence.

In sum, ERP studies that have compared effects of lexical priming and
overall congruence in sentence comprehension have found that effects
of lexical association can be modulated as a function of overall sentence
congruence. Relative to the Van Petten (1993) study, the Coulson et al.
(2005) study shows less robust effects of lexical association, which may
be explained in terms of differential cloze probability. In the Coulson et al.
(2005) study, the cloze probability of the sentence-final words was higher
than the cloze probability of the critical words in sentence-intermediate
positions in the Van Petten (1993) study, and this may indicate that effects
of lexical association contribute to processing only in sentence contexts
that are not highly constraining or incongruent. In addition, Van Petten et al.
(1997) have shown that low-span readers rely more on lexical associations
than do high-span readers, and Coulson et al. (2005) have shown that lexical
association contributes to RH sentence processing regardless of the overall
congruence of the sentence. In general, the effects of lexical association and
sentence congruence both appear to affect the canonical N400 ERP, except
when the critical associate is presented in the LVF/RH; in this case, a right
frontal effect of association is obtained.

ERP studies of spoken sentence comprehension

Event-related potential studies of spoken sentence comprehension have
been conducted much less frequently than ERP studies of reading. In part
this may be due to the fact that it is relatively time-consuming to identify
the onset of the critical stimulus in the continuous speech signal, which
is, of course, necessary in order to accurately time-lock the ERPs. Also,
because the speech signal is continuous, ERPs from previous input will
overlap with the ERP of interest, and early P1, N1, and P2 components
cannot be discerned in the ERPs to critical stimuli embedded in continuous
speech.6 Nevertheless, several studies have used the N400 to investigate
the influence of word or sentential context on the process of spoken word
recognition (e.g., Connolly & Phillips, 1994; Diaz & Swaab, 2007; Friederici
et al., 2004; Friedrich & Kotz, 2007; Friedrich et al., 2004; Hagoort & Brown,
2000; O’Rourke & Holcomb, 2002; Praamstra et al., 1994; Radeau et al.,
1998; van den Brink & Hagoort, 2004; van den Brink et al., 2001, 2006; Van
Petten et al., 1999). These studies have confirmed behavioral findings of a
very rapid influence of context on spoken word processing. But it has been
a matter of debate whether or not the electrophysiological manifestation of
these effects was found on the N400 (Van Petten et al., 1999) or instead on
a separable component that would be specifically sensitive to phonological
mismatch (Connolly & Phillips, 1994) or lexical selection (van den Brink &
Hagoort, 2004) during spoken word recognition. To address this issue, Diaz
and Swaab (2007) directly compared the processing of words in lists to the
processing of words in meaningful sentences. Phonological and semantic
congruence was manipulated in both contexts. To assess the effects of
phonological mismatch on ERPs per se, participants were asked to listen
to a series of eight words in which the final word was either phonologically
congruent or incongruent with respect to the onset of the preceding words
but none of the words were semantically or associatively related to each
other. The second list condition was included to elicit a canonical N400 effect,
and here the eighth word of the list was either semantically congruent or
incongruent with respect to the semantic category of the preceding words,
but there was no alliterative overlap in word onsets. In addition, participants
listened to four types of sentences in which the semantic and phonological
congruence of the terminal word was manipulated (as in Connolly & Phillips,
1994). These manipulations are illustrated in the following examples (critical
final words are underlined):


Alliterative condition

Congruent: chat, champ, chaff, chant, challis, chad, chap, chapter

Incongruent: chat, champ, chaff, chant, challis, chad, chap, address.

Category condition

Congruent: giraffe, sheep, bear, wolf, rabbit, lamb, elephant, dog

Incongruent: giraffe, sheep, bear, wolf, rabbit, lamb, elephant, desk.


High Congruent: He mailed the letter without a stamp.

Phonologically Congruent: He mailed the letter without a stance.

Low Congruent: He mailed the letter without a thought.

Incongruent: He mailed the letter without a roof.

In lists of words, the lexical-phonological processes could not be influenced
by a meaningful context, and separable effects of phonological and
semantic information were predicted on the processing of the critical target
words in the lists. The results indeed confirmed this (see Figure 15.9):
an early, topographically distinct electrophysiological manifestation of
phonological incongruence was observed that preceded the N400 effects
of semantic incongruence. In contrast, no such separation was obtained
in the meaningful sentence contexts: there was only evidence for an early
N400 to semantic incongruency, but no separable effect of phonological
incongruence was obtained.

This indicates that semantic information of the sentence context very
rapidly influences lexical processing in meaningful contexts. Together, these
results provide evidence that the moment in time at which context starts to
exert its influence on lexical processing depends on how constraining the
previous context is, with lists of words at one end of the spectrum and highly
constraining sentence contexts at the other end.
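
Comparisons of an early effect with a later N400 effect rest on time-locking the EEG to the onset of each critical word, averaging the resulting epochs, and measuring mean amplitude in fixed latency windows. The sketch below illustrates that logic with the 200–300 ms and 300–600 ms windows used for Figure 15.9. The sampling rate, baseline length, synthetic signal, and function names are all assumptions for illustration and do not reproduce the analysis pipeline of Diaz and Swaab (2007).

```python
def extract_epoch(eeg, onset_index, pre, post):
    """Cut one epoch around a word onset and subtract the mean of the
    prestimulus samples (baseline correction)."""
    segment = eeg[onset_index - pre: onset_index + post]
    baseline = sum(segment[:pre]) / pre
    return [v - baseline for v in segment]


def average_epochs(epochs):
    """Point-by-point average of equal-length epochs (the ERP)."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def window_mean(erp, pre, srate_hz, start_ms, end_ms):
    """Mean amplitude between start_ms and end_ms after word onset."""
    first = pre + round(start_ms * srate_hz / 1000)
    last = pre + round(end_ms * srate_hz / 1000)
    return sum(erp[first:last]) / (last - first)


# Assumed recording parameters: 100 Hz sampling, 100 ms baseline,
# 600 ms of post-onset data.
SRATE, PRE, POST = 100, 10, 60

# Synthetic single-channel EEG: a flat 1.0 microvolt signal with a
# 2 microvolt negative deflection 300-600 ms after each word onset.
eeg = [1.0] * 200
onsets = [50, 130]  # hypothetical hand-marked word onsets (sample indices)
for onset in onsets:
    for i in range(30, 60):
        eeg[onset + i] -= 2.0

erp = average_epochs([extract_epoch(eeg, o, PRE, POST) for o in onsets])
early = window_mean(erp, PRE, SRATE, 200, 300)
late = window_mean(erp, PRE, SRATE, 300, 600)
print(early, late)
```

In the synthetic data only the later window carries a deflection. With spoken materials the onset indices would have to be identified by hand in the continuous speech signal, which is the time-consuming step noted above.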

N400 and Discourse Contexts

Language comprehension critically depends on successfully integrating the
meaning of individual words into the meaning of larger discourse contexts.
Extracting meaning from the discourse requires rapid processing of several
different sources of linguistic information, including phonological, syntactic,
and semantic aspects of the language input. In addition, word and sentence
meaning critically depend on the context of the utterance. For example, even
though the word steak makes perfect sense in the isolated sentence context
“Yesterday my uncle ate a big juicy steak,” it is not sensible when the global
discourse context has just established that my uncle is a vegetarian: “A few
years ago, my uncle decided that he should become a vegetarian. He had
not found it very difficult to stay away from meat, because there are so many
delicious other foods. Yesterday he ate a big juicy steak.”

Fig. 15.9 Event-related potential effects and topographic distribution of
these effects in 200–300 ms (top) and 300–600 ms (bottom) epochs. On the
left side are displayed the effects of violation of phonological or semantic
congruence in lists of words. Effects of violation of phonological congruence
were found for occipital sites (more negative to violation of expectancy) and
frontal sites (more positive to violation of expectancy). A canonical N400
effect was obtained when a violation of congruence of a semantic category
occurred. For the sentences (right side of the figure), both semantically
anomalous words that share the speech onset with the word that is the best
completion for the sentence and those that do not result in a canonical N400
effect. Redrawn from Figures 2 and 3 in Diaz & Swaab, Brain Research, 2007

Previous research has shown that such contextual information influences
the identification of word meanings in sentences and discourse; that
is, listeners and readers employ the meaning of the preceding words
and sentences in determining the meaning of the current word. Thus,
language comprehenders do not represent the surface (literal word-by-
word) information of a context, but rely on the integrated meaning of the
overall representation (Bransford & Johnson, 1972). However, the time
course of the activation and integration of information from the discourse
context during lexical processing is a subject of much debate. Some theories
of language comprehension assume that sentences (or constituents) are
initially processed independently and that discourse information only
becomes available at later stages of processing (see Chomsky, 1975; Fodor
et al., 1974; Forster, 1989; Katz, 1972; Searle, 1979; Sperber & Wilson, 1986).
Other models predict a briefer delay of the influence of discourse information
on real-time comprehension, such that discourse can influence processing
after each word, but not until syntax and sentence-level meaning have been
integrated (Fodor et al., 1996; Frazier, 1999). In contrast, constraint-based
models argue for the simultaneous activation of phonological, syntactic,
and semantic information during lexical processing. Even though no explicit
predictions have been made with respect to discourse information, it would
be in the spirit of these latter models to assume that this simultaneously
activated information can be used immediately to comprehend the preceding
utterance (Jackendoff, 2002, 2007; MacDonald et al., 1994; Marslen-Wilson
& Tyler, 1980; Tanenhaus & Trueswell, 1995). In other words, all sources of
contextual information are integrated during lexical-semantic processing, not

Several ERP studies have used the exquisite temporal resolution of ERPs
in general and of the N400 in particular to investigate whether the wider
discourse context can immediately influence lexical semantic processing in
the local sentence context, or alternatively, whether there is a delay in the
influence of discourse context on lexical semantic processing in the local
sentence (e.g., Camblin et al., 2007a; Federmeier & Kutas, 1999; Nieuwland
& van Berkum, 2006; Nieuwland et al., 2007; St. George et al., 1994; van
Berkum, 2004; van Berkum et al., 1999, 2003, 2007, 2008; for a review, see
van Berkum, 2009).

St. George et al. (1994) were the first to show that the N400 is sensitive to
global discourse-level effects on the processing of the meaning of single
words. Subjects read ambiguous paragraphs from a behavioral study by
Bransford and Johnson (1972, p. 722), in which everyday activities were
described that did not make sense unless a disambiguating title was
provided, as in the following example:

“The procedure is actually quite simple. First you arrange things into different
groups depending on their makeup. Of course, one pile may be sufficient
depending on how much there is to do. If you have to go somewhere due to
lack of facilities that is the next step, otherwise you are pretty well set. It is
important not to overdo any particular endeavor. That is, it is better to do
too few things at once than too many. In the shorter run this may not seem
important, but complications from doing too many can easily arise. A mistake
can be expensive as well. The manipulation of the appropriate mechanism
should be self-explanatory, and we need not dwell on it here. At first the
whole procedure will seem complicated. Soon, however, it will become just
another facet of life. It is difficult to foresee any end to the necessity of this
task in the immediate future, but then one can never tell.”

Even though the individual sentences of this paragraph make sense, it is
very difficult to make sense of the whole passage. However, if we give you
the title “A Procedure for Washing Clothes,” then everything starts to fall into
place (although admittedly the passage remains awkward).

Event-related potentials were obtained to all content words in paragraphs,
as in the example above, and the N400 was significantly reduced to these
words when presented in the titled relative to the untitled condition. This
suggests that providing a title facilitated the generation of a discourse
model, which in turn facilitated the integration of the meaning of single
words into a representation of the overall context.

Federmeier and Kutas (1999) used the N400 to investigate the lexical-
semantic processing of words within and across category boundaries when
a preceding context strongly favored one specific lexical candidate from a
semantic category (“They wanted to make the hotel look more like a tropical
resort. So along the driveway they planted rows of palms/pines/tulips”). They
found that the amplitude of the N400 varied as a function of the match of
the critical word with the semantic category biased by the overall context,
with a reduction of the N400 seen to the most expected final word given the
discourse context (palms). Importantly, the amplitude of the N400 to the
discourse-unexpected words varied as a function of the semantic relationship
of these words with the most expected word, such that a smaller N400 was
found to pines (which is a close semantic associate of palms) than to tulips
(which has a more distant semantic relationship). These findings not only
illustrate discourse context effects on processing but also, according to the
authors, indicate that long-term memory organization influences language

Fig. 15.10 A reduced N400 is found to words that violate animacy (dotted
line; e.g., the peanut fell in love) relative to those that do not (solid line;
e.g., the peanut was salted). This paradoxical result is obtained when the
violation of animacy is context appropriate in the discourse (e.g., in a story
of a peanut falling in love). CW onset indicates the moment in time when the
critical word to which the ERP was measured was presented. Reprinted with
permission from Nieuwland & van Berkum, Journal of Cognitive Neuroscience,
2006 (MIT Press).

We will now turn to some N400 studies showing that a cohesive, supportive
discourse can delay or even override local effects of lexical-semantic
processing. Nieuwland and van Berkum (2006) presented participants with
short, cartoon-like passages featuring animacy violations. Animacy violations
produce an N400 effect, presumably because human-exclusive actions or
emotions, such as talking to a therapist or singing a love song, are ascribed
to inanimate objects, such as a yacht or a peanut, and this creates a kind
of semantic anomaly. The authors constructed 60 six-sentence cartoon-
like stories that featured a cartoon-style, animacy-violating character, such
as a peanut singing about his girlfriend. The experimental manipulation
involved the penultimate sentence of each story, which contained either an
animate (yet context-appropriate) or inanimate (yet context-inappropriate)
description of the main character (e.g., the peanut). In the animate, context-
appropriate condition, the peanut would be described as being in love,
for example, while the inanimate, context-inappropriate condition would
describe the peanut as salted. All inanimate, context-inappropriate words
were selected so as to be canonical, or common descriptors of that particular
object (“The peanut was salted”). It is important to note that such common
descriptors would ordinarily be expected to reduce the N400 amplitude,
particularly relative to animacy-violating descriptors (in love). In contrast,
as can be seen in Figure 15.10, Nieuwland and van Berkum (2006) found a
reduction of the N400 amplitude to the noncanonical but context-appropriate
words (in love). This suggests that a discourse context can actually override
the effects of both animacy violations and plausibility.

Camblin and colleagues (2007a) have provided other N400 evidence of
the rapid influence of discourse context on lexical processing. They varied
semantic association and discourse congruity in multisentence passages
(e.g., “The movie was applauded by adults and children/toddlers,” preceded
by a context referring either to a Disney film or to a Holocaust documentary).
Participants were asked to read these passages for comprehension.
Discourse congruity was found to have an earlier effect on lexical processing
than association, as evidenced by an earlier onset of the N400 effect for
the discourse manipulation. Recently, this time course effect was replicated
when the same passages were presented as naturally produced speech and
a very clear delay in the N400 effect to lexical associations was observed
relative to the N400 effect of discourse congruence, which had a much
earlier onset (Boudewyn et al., in press; see Figure 15.11).7
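One simple way to express a time-course difference like this is to test each effect in successive latency windows and report the first window in which it is reliable. The sketch below assumes the per-window significance decisions have already been made elsewhere. The window boundaries and the True/False flags are invented placeholders, not results from the studies cited above.

```python
# Hypothetical sketch: find the earliest latency window in which an effect is reliable.
# Window boundaries and significance flags are invented placeholders.

def first_reliable_window(windows, significant):
    """Return the (start_ms, end_ms) of the first window flagged significant, else None."""
    if len(windows) != len(significant):
        raise ValueError("one flag is required per window")
    for window, flag in zip(windows, significant):
        if flag:
            return window
    return None


windows = [(200, 300), (300, 400), (400, 500)]

# Toy pattern: the discourse effect is present in every window,
# the association effect only in the last one.
discourse_flags = [True, True, True]
association_flags = [False, False, True]

discourse_onset = first_reliable_window(windows, discourse_flags)
association_onset = first_reliable_window(windows, association_flags)

# Delay of the association effect relative to the discourse effect, in ms.
delay_ms = association_onset[0] - discourse_onset[0]
print(discourse_onset, association_onset, delay_ms)
```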

Other studies have shown that N400 effects of repetition priming are not
immune to discourse-level effects either (e.g., Camblin et al., 2007b; Gordon
et al., 2004; Johns et al., under review; Ledoux et al., 2006, 2007; Swaab
et al., 2004). For example, studies of both visual and auditory modalities
have shown that classic N400 effects of repetition priming with repeated-
name coreference (when two instances of a name refer to the same person)
were found only when the sentence structure was conducive to this type of
coreference. Compare, for example, the following sentences:
(1) At the office Daniel moved the desk because Daniel needed
room for the filing cabinet.
(2) At the office Daniel and Amanda moved the desk because
Daniel needed room for the filing cabinet.

Fig. 15.11 In this study by Boudewyn and colleagues (in press), participants
heard stories that ended with a word that was either congruent or not with
the preceding discourse and was associated or not with a word immediately
preceding the final word of the last sentence (see examples on the left side
of the figure). The ERP waveforms and topographic maps of the effects of
discourse congruence and associative priming are displayed. Red lines show
ERPs to discourse-incongruent and unrelated words, respectively, and blue
lines show ERPs to discourse-congruent and related words, respectively.
The N400 effects of lexical association are delayed relative to the effects of
discourse context in spoken language comprehension; whereas significant
effects of discourse congruence were obtained in all three epochs shown for
the topographic maps, the effects of association were not obtained until after
400 ms.

The repeated name “Daniel” (the anaphor) in Sentence (1) is awkward (the
pronoun he would be preferred in this case), whereas in Sentence (2) the use
of a repeated name is a perfectly acceptable vehicle for coreference. When
the sentence structure is not conducive to repeated name coreference (as
in “At the office Daniel moved the desk because Daniel…”), the repetition
priming benefit is eliminated. In the behavioral literature, Gordon and
colleagues have labeled this effect the repeated name penalty and have
observed that this type of penalty occurs when a repeated name refers to
an antecedent in discourse focus (e.g., as in Sentence 1 above; see, e.g.,
Gordon & Hendrick, 1997; Gordon et al., 1999). Gordon and Hendrick (1997)
proposed that the repeated name penalty results from “disjoint reference”:
when the second Daniel is encountered, it is initially processed as a new
entity in the discourse and additional processing is required to determine
that the anaphor and the antecedent Daniel refer to the same person.
Ledoux and colleagues (2007) used ERPs to examine why this penalty might
occur. In other words, why is it so awkward to repeat a name when the
antecedent is in discourse focus? If the repeated name penalty were
reflected in a modulation of the amplitude of the N400, then this would
indicate processing difficulty of a semantic nature, possibly because the
anaphor is more difficult to integrate in the context or because initially
the repeated name does not activate any retrieval cues. Ledoux et al.
presented sentences as in the example above. In addition, they added a
control condition in which the anaphors were replaced with new names to
examine the interaction of lexical repetition and discourse prominence. This
was done to test the prediction that coreference to a prominent antecedent
causes a repeated name to be processed as if it were a new name, and this
is exactly what they found (see Figure 15.12).

Discourse context can thus be seen to override the more purely lexical-level
benefits of both semantic association and repetition (for review, see Ledoux
et al., 2006).

Another essential aspect of discourse processing is to establish the referents
of the currently expressed entity. This entity will often be a noun phrase.
An essential function of noun phrases such as the morning star is that they
refer to a particular discourse entity, the referent (i.e., Venus; even though
Venus is actually a planet, as pointed out by Alex, the son of the first
author of this chapter, who was 7 at the time). Referents can be entities
in the actual world, entities in some possible world, or even entities that
don’t exist. Discourse contexts are often needed to determine the intended
referent of an expression because multiple expressions can have the same
referent (e.g., morning star and evening star both refer to the planet Venus)
and the same noun phrase can have more than one referent (my son Alex
can refer to various people). As discussed above, coreferential processing
is specifically used to establish whether or not two linguistic expressions
refer to the same semantic entity (e.g., “Tamara Swaab’s son Alex knows
that Venus is not a star because he likes to read about the universe,” where
he corefers to Tamara Swaab’s son). Van Berkum and colleagues have
performed ERP studies of referential processing in discourse contexts, and
have consistently found a negative shift over frontal electrode sites for
expressions that might be linked to more than one referent in the preceding
discourse (ambiguous referents; Nieuwland et al., 2007; for reviews, see van
Berkum, 2009; Van Berkum et al., 2007). They have labeled this ERP the Nref
(see Figure 15.13).

Fig. 15.12 Effects of lexical repetition (right side) and discourse prominence
(left side) during reading of sentences. Event-related potentials are shown
for a central site. A repeated name penalty is found for repeated names
with antecedents in discourse prominence (Daniel/Daniel, blue line, vs.
Daniel and Amanda/Daniel, red line, top left quadrant). Effects of discourse
prominence are not obtained when a new name is entered in the discourse
(Robert, bottom left quadrant). Effects of lexical repetition are only obtained
for repeated names with antecedents that are not prominent in the discourse
(Daniel and Amanda/Daniel, blue line, vs. Daniel and Amanda/Robert, red
line, bottom right quadrant) but not for repeated names with prominent
antecedents (top right quadrant). Data from Ledoux et al. (2007).

Taken together, the N400 studies discussed in this section indicate the
rapid and sometimes dominating effect of discourse representations on the
processing of incoming words. Overall, the results are more consistent with
interactive models of processing, where contextual information can have an
immediate impact on language comprehension and processing.8

N400 and Nonliteral Language

Fig. 15.13 The Nref is obtained to critical words that can refer to more
than one antecedent (blue line) relative to those with unambiguous single
antecedents (black line). Adapted with permission from van Berkum (2009).

Language input is richly ambiguous and requires processing that goes
well beyond literal and straightforward interpretations. For example, the
following sentence illustrates that many words have more than one meaning,
some of which are more literal than others: “I still miss my wife, but I have
improved my aim” (Coulson & Williams, 2005). One of the first studies
using ERPs to examine nonliteral language processing was done by Pynte
and colleagues (1996). In their design, they tried to tease apart the effect
of metaphor familiarity and the presence of supporting context. In the
absence of greater context, short, familiar metaphors (e.g., “Those fighters
are lions”) produced larger N400s to the terminal, metaphorical word than
N400s elicited by the same terminal word in a nonmetaphorical sentence
(e.g., “Those animals are lions”). There was also a trend whereby unfamiliar
metaphors (e.g., “Those apprentices are lions”) produced larger N400s than
familiar metaphors, perhaps reflecting the association between the nouns in
the familiar metaphor. However, this result was flipped when the unfamiliar
metaphor was paired with a supportive context (e.g., “They are not cowardly:
Those apprentices are lions”) and the familiar metaphor was paired with an
unsupportive context (e.g., “They are not naive: Those fighters are lions”):
In this case, the unfamiliar metaphor elicited N400s of smaller amplitudes.
This indicates that the context provided in support of a metaphor frame may
be more influential in processing than more basic properties of the metaphor
like familiarity and association.

In a later study, Coulson and Van Petten (2002) examined the processing of
metaphors by comparing them not only to straightforward literal controls
but also to literal mappings. For example, consider the word syrup in the
following sentences:
Literal control: I read that one of Canada’s major exports is
maple syrup.

Literal mapping: In the movie Psycho, the blood was really
cherry syrup.

Metaphor: He didn’t understand the words, but her voice was
sweet syrup.

While the word “syrup” is not being used metaphorically in the literal
mapping condition, it does elicit some of the same processing that is
required to understand metaphors according to the conceptual blending
hypothesis: mappings need to be produced between disparate domains,
and integration across those domain backgrounds needs to take place
(Fauconnier & Turner, 1998). In the example above, the qualities of blood
and cherry syrup both need to be activated and mappings need to be made
between the two regarding their similarities specific to that given context.
Coulson and Van Petten (2002) found a graded N400 effect, such that the
largest N400s were produced to metaphors, the smallest to literal controls,
and those of intermediate amplitude were found in response to literal
mapping. Taken together, the findings of Coulson and Van Petten (2002)
and Pynte et al. (1996) show that while processing metaphors does appear
more effortful than processing sentences that are transparent literally, this
difficulty can be reduced by providing a supporting context and may not be
entirely unique to nonliteral language comprehension.

Many studies of nonliteral language comprehension have focused on
whether or not its representation and processing are different from those
of literal language. Some studies indicate a special role for the RH in the
representation and processing of nonliteral language. A few ERP studies
have been conducted with the VHF technique to investigate whether or not
the RH is involved in the processing of words that have both a literal and
a nonliteral sense and in the processing of jokes (Coulson & Lovett, 2004;
Coulson & Severens, 2007; Coulson & Van Petten, 2007; Coulson & Williams,
2005; Coulson & Wu, 2005). Kacinik and colleagues (2008) used ERPs and
the VHF technique to study processing and representation of polysemy in
language. Polysemous words such as bright have one form representation
but multiple senses that can be related to the literal meaning (e.g., bright
light) or to the metaphoric meaning (e.g., bright student). The RH has
been proposed to be preferentially involved in comprehending subordinate
figurative meanings (Anaki et al., 1998; Beeman, 1998; Brownell, 2000;
Jung-Beeman, 2005). However, a series of behavioral VHF and central
ERP experiments (Kacinik & Chiarello, 2007; Kacinik et al., in preparation)
have repeatedly failed to show differences for the integration of literal and
figurative meanings into ambiguous contexts. Kacinik and colleagues (2008)
measured ERPs to lateralized sentence-final words related to the literal or
figurative sense of polysemous words in ambiguous contexts (e.g., “The girl
did not approach the slimy frog/clerk”). Participants were asked to read these
sentences for comprehension and to answer a true/false comprehension
question that followed the presentation of each of the stimuli. No significant
differences between the integration of literal and figurative meanings were
found in either visual field with respect to both N400 and late positive
effects. The more imageable literal endings, however, did show a bigger
anterior imageability effect in the LVF/RH than in the RVF/LH, supporting
prior indications that brain activity differences in understanding literal and
figurative meanings mainly reflect differences in imageability rather than in
literalness or figurativeness per se. Semantically incongruent endings in the
LVF/RH also resulted in a larger N400 than for the RVF/LH, providing further
evidence of RH involvement in sentence comprehension and sensitivity to
message-level meaning.

Fig. 15.14 Event-related potentials to metaphorical (dotted line), low-cloze
literal (dashed line), and high-cloze literal sentence-final words. The largest-
amplitude N400 was obtained for the metaphorical sentence-final words for
both RVF and LVF presentations. Reprinted with permission from Coulson &
Van Petten, Brain Research, 2007 (Elsevier).

Coulson and Van Petten (2007) also did not find VHF ERP evidence of an
RH advantage for figurative language comprehension. Participants read
sentences that ended with either highly predictable (high-cloze) words or
appropriate but low-cloze literal or low-cloze metaphorical words that were
presented in either the left or right VHF. A significant effect of sentence
type was found, such that both low-cloze literal and low-cloze metaphorical
sentence endings elicited relatively larger N400 amplitudes than high-cloze
endings, with the metaphorical endings eliciting the largest N400 amplitudes
of all (see Figure 15.14). This implies that metaphorical processing is more
difficult than its literal counterpart; however, the pattern of differences
in N400 amplitude among the conditions was found for both left visual
hemifield (LVHF) and right visual hemifield (RVHF) presentation, suggesting
that the RH does not have a special role in metaphorical processing.

A series of studies by Coulson and colleagues previously found, however,
that the RH may be essential for joke comprehension (Coulson & Lovett,
2004; Coulson & Severens, 2007; Coulson & Williams, 2005; Coulson & Wu,
2005). It has been shown that compared to an unfunny sentence ending,
joke endings to sentences (e.g., “Statistics indicate that Americans spend 80
million a year on games of chance, mostly weddings/dice”) evoke relatively
larger N400 waveforms (Coulson & Kutas, 2001). Coulson and Williams
(2005) went on to show that when presented using the VHF technique, “one-
liners” compared to nonjoke sentences such as “I still miss my wife, but I am
improving my aim/ego” replicate this effect when presented to the RVF/LH
but yield a null effect when presented to the LVF/RH. This suggests that the RH has
no more difficulty integrating a joke ending into the sentence than a nonjoke
ending (Coulson & Williams, 2005).

The Processing Nature of the N400

Taken together, the findings discussed in the previous section clearly
show that the N400 is modulated by semantic aspects of the input, and
specifically, that the amplitude of the N400 is reduced to words that can
be easily related to the meaning of the overall semantic context. But the
exact processing nature of the N400 is still a matter of debate. Broadly, two
accounts of the N400 have emerged. Kutas and colleagues (2006, p. 669)
have proposed that “Overall, the extant data suggest that N400 amplitude
is a general index of the ease or difficulty of retrieving stored conceptual
knowledge associated with a word (or other meaningful stimuli), which is
dependent on both the stored representation itself, and the retrieval cues
provided by the preceding context.” Thus, according to this account, the
amplitude of the N400 is modulated by the ease of semantic retrieval and by
the top-down contextual influence on this process. That is, when reading a
sentence such as “He spread the bread with butter/cream,” the word cream
elicits an N400 because the sentence context has been used to predict and
preactivate lexical-semantic features of the expected word butter, and cream
does not match all of these semantic features (and is therefore more difficult
to retrieve).

Hagoort (2005), on the other hand, has proposed that the N400 is a
reflection of a semantic integration or unification process, such that words
that can be easily integrated into the preceding conceptual context generate
a reduced N400. Hence, this latter account assumes that the process of
lexical-semantic integration or semantic unification is the only driving
force behind the N400 and that the ease of semantic retrieval per se is not
reflected by the N400 (see also Friederici, 2002, for a comparable account of
the N400). Thus, in the sentence “He spread the bread with butter/cream”
the N400 is reduced to “butter” because it is more easily integrated with the
higher-order meaning representation of the preceding sentence context than
is “cream”.

Some of the findings that we have discussed appear more difficult to
reconcile with the integration account of the N400. First, there is the finding
that the N400 is sensitive to lexical factors such as word frequency and
orthographic neighborhood, which have little to do with combinatorial
semantics. Also, findings showing the modulation of the N400 as a function
of preactivation of semantic features of words in sentence or discourse
contexts are more easily explained in terms of facilitated retrieval than in
terms of ease of integration.

Thus, these empirical findings with the N400 appear more consistent with
the retrieval view of Kutas and colleagues (2000, 2006; see also Lau et al.,
2008). Recently, van Berkum (2009) has proposed an extension of the
retrieval model of the N400, labeled the multiple-cause intensified retrieval
model (MIR), to take into account some of the more recent findings with the
N400 (see notes 7 and 8). His model assumes that “The amplitude of the
word elicited N400 reflects the computational resources used in retrieving
the relatively invariant ‘coded’ meaning(s) stored in semantic long-term
memory for, and made available by, the word at hand” (van Berkum, 2009,
p. 12). As in the Kutas et al. (2006) model, van Berkum (2009) assumes that
retrieval is facilitated (i.e., requires fewer computational resources) when
the meaning of a word is consistent with contextually preactivated semantic
features. But he also assumes that the N400 is not only dependent on the
semantic context per se, but is also modulated as a function of emotional
connotation, linguistic focus, or preword hesitation, factors that can all lead
to the retrieval of a richer set of semantic features that requires increased
computational resources. Van Berkum (2009) also broadens the array of
factors that may generate contextual expectations, including nonlinguistic
ones (e.g., a mental representation of the sensory context, a mental model
of the situation being discussed, and some metalinguistic representation of
the discourse). The amplitude of the N400 is modulated (i.e., reduced) when
these contextual factors facilitate retrieval of the currently processed word.

Hagoort and colleagues (2009) have suggested that these different accounts
of the processing nature of the N400 might be reconciled if the LH and RH
make different contributions to the creation of a meaning representation of
the overall context, as proposed by Federmeier and colleagues (Federmeier,
2007; Federmeier & Kutas, 1999b; Kutas & Federmeier, 2000). They
propose that the LH is involved in predictive and the RH in integrative
semantic processing. In other words, the language-dominant LH generates
contextually consistent semantic predictions that will facilitate retrieval and
reduce the amplitude of the N400 (i.e., if the prediction is met). The RH, on
the other hand, activates semantic information on the basis of the input
and incrementally integrates the semantic information of the current input
with that of previously activated semantic information. If this information
matches, the integration is facilitated and a reduction of the N400 ensues.

Possible Neuronal Generators of the N400

Because scalp-recorded ERPs cannot directly be related to their generating
source(s) as a result of the inverse problem (i.e., different internal source
configurations can provide identical external electromagnetic fields),
knowledge of the possible brain areas that contribute to the N400 has
been gathered from studies using methods with better spatial resolution
(magnetoencephalography [MEG] and fMRI), studies using intracranial
recordings from presurgical epileptic patients, and studies in patients with
lesions in verified locations. Many of these studies have shown evidence
that the left (and to a lesser extent possibly the right) temporal lobe is a
likely contributor to the scalp-recorded N400 (e.g., Nobre & McCarthy, 1995;
Nobre et al., 1994; for reviews, see Halgren, 2002; Lau et al., 2008; Van
Petten & Luka, 2006). These cortices have long been considered important
in the representation and retrieval of semantic information (e.g., Beeman
& Chiarello, 1998; Bright et al., 2007; Damasio et al., 1996; Hagoort et al.,
1996; Martin, 2007). But evidence from fMRI and MEG studies also suggests
a generator in the left inferior frontal cortex (Hagoort et al., 2004; Halgren
et al., 2002; see Figure 15.15 for a depiction of these possible neural sources
of the N400).
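The inverse problem mentioned above can be made concrete with a toy linear forward model, in which each scalp sensor records a weighted sum of source activities. In the sketch below, the 2 × 3 weight matrix and both source patterns are invented. The point is only that two different source configurations can yield exactly the same sensor readings, so the readings alone cannot identify the sources.

```python
# Hypothetical sketch of the inverse problem with a toy linear forward model.
# The weights and source activities are invented; real head models are far more complex.

def forward(lead_field, sources):
    """Sensor readings as weighted sums of source activities (matrix-vector product)."""
    return [sum(w * s for w, s in zip(row, sources)) for row in lead_field]


# Two sensors (rows), three candidate sources (columns).
lead_field = [
    [1, 0, 1],
    [0, 1, 1],
]

config_a = [1, 1, 0]   # sources 1 and 2 active
config_b = [0, 0, 1]   # only source 3 active

readings_a = forward(lead_field, config_a)
readings_b = forward(lead_field, config_b)

print(readings_a, readings_b)  # both are [1, 1]
```

With more candidate sources than sensors, the mapping from sources to readings cannot be uniquely inverted, which is why converging evidence from other methods is needed.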

Fig. 15.15 Plausible N400 generators may be localized in the (left) middle/
superior temporal lobes. Some findings suggest a generator in the left
inferior frontal gyrus.

Recent studies have clearly implicated a function of the left inferior frontal
cortex in semantic processing as well (e.g., Giesbrecht et al., 2004; Gold
& Buckner, 2002; Hagoort et al., 1996; Poldrack et al., 1999; Swaab et al.,
1997, 1998; Thompson-Schill et al., 1997, 1998, 1999; Wagner et al., 2000,
2001). Even though the exact nature of this contribution is still a matter
of debate, the different proposals converge on the idea that this area is
not sensitive to semantic retrieval per se, but instead may be involved in
context-sensitive response selection (Kerns et al., 2004), lexical selection
or competition among semantic features (Thompson-Schill et al., 1997),
semantic unification (Hagoort et al., 2005), or processes of response
selection for semantic information (Gabrieli et al., 1998).

The P600

In the previous section, we described the N400 ERP component associated
with semantic processing. Following its discovery, researchers sought to
identify other ERP components that might be similarly associated with other
aspects of language processing. Perhaps foremost among the candidate
processes were those associated with building or extracting a syntactic
structure, a fundamental and essential aspect of language production
and comprehension. And, before long, evidence emerged that seemed to
implicate one ERP component as a reflection of such processing: the P600.

The P600 (Osterhout & Holcomb, 1992; also sometimes called the syntactic
positive shift, or SPS; Hagoort et al., 1993) is a slow late positive shift
in the ERP waveform. It typically onsets around 500 ms after the onset
of a stimulus (although earlier positive shifts have also been observed;
Mecklinger et al., 1995) and lasts for several hundred milliseconds; its peak
amplitude is generally observed at around 600 ms (if at all; the component
often appears as more of a shift without a clear peak). The P600 is generally
maximal over posterior electrode sites (but sometimes a more anterior
distribution has been observed; Friederici et al., 2002; Hagoort et al.,
1999; Kaan & Swaab, 2003b) and is generally widespread, without distinct
laterality. The P600 has been observed in response to both written and
auditory stimuli (Hagoort & Brown, 2000).

Initially, the observation of the modulation of the P600 component in
response to syntactic manipulations seemed a perfect complement
to the N400 as a marker of semantic processing. However, the
functional interpretation of the P600 component is probably not quite as
straightforward as was once believed. Below, we briefly review some of the
situations in which modulation of the P600 is observed in an attempt to come
to some tentative conclusions about what this component might be able to
tell us about language processing in the brain.

The P600 and Syntactic Anomaly

An initial set of studies examined the ERP response to syntactic anomalies,
that is, sentences that contained some kind of violation of syntactic
principles. In an early study of this type, Osterhout and Holcomb (1992) had
participants read sentences like the following:
(a) The broker hoped to sell the stock.
(b) The broker persuaded to sell the stock.

Fig. 15.16 A P600 effect of a syntactic violation (red line) relative to a
syntactically legal continuation of a sentence (blue line). Data from Hagoort,
Brown & Groothusen, Language and Cognitive Processes (1993).

Sentence (a) conforms to the phrase-structure principles of the verb “hope”
(which, as an intransitive verb, easily accepts the clausal complement).
Sentence (b), on the other hand, violates the phrase-structure principles of the
verb “persuade”; as a transitive verb, “persuade” requires a noun phrase
that can act as a direct object, and can only accept the clausal complement
as part of a reduced relative clause (as in “The broker persuaded to sell the
stock was the one who got rich”), which Sentence (b) also does not allow.
Osterhout and Holcomb (1992; see also Osterhout & Holcomb, 1993) showed
that sentences with such violations of phrase structure elicited a larger
P600 relative to similar sentences that did not contain violations. At around
the same time, Hagoort and colleagues (1993) showed a similar effect
in response to violations of subject–verb agreement in Dutch (as in “Het
verwende kind *gooien het speelgoed op de grond”/“The naughty child
*throwing the toy on the floor”) when compared to structurally well-formed
sentences (see Figure 15.16).

Many studies have shown P600 effects to a broad range of syntactic
anomalies, including the aforementioned phrase-structure (see also Friederici
et al., 1996; Neville et al., 1991) and number agreement violations, as
well as other types of agreement violations, including gender and case
marking (Coulson et al., 1998; Friederici et al., 1993; Osterhout, 1997;
Osterhout & Mobley, 1995), verb tense violations (Osterhout & Nicol, 1999),
subcategorization violations (Ainsworth-Darnell et al., 1998; Osterhout
et al., 1994), and violations of subjacency (McKinnon & Osterhout, 1996;
Neville et al., 1991). Interestingly, some studies have shown P600 effects
in response to syntactic violations even in sentences that are otherwise
meaningless (e.g., “The boiled watering-can smokes/*smoke the telephone
in the cat”; Hagoort & Brown, 1994; but see also Münte et al., 1997),
reinforcing the conclusion that it is something about linguistic structure
(as separate from meaning) that is reflected in this component. P600-type
effects are not necessarily restricted to domains of language processing.
In fact, they have been observed in response to violations of several
different types of structure, including those seen in music (Besson & Macar,
1987; Janata, 1995; Patel et al., 1998), mathematical rules (Núñez-Peña &
Honrubia-Serrano, 2004), and abstract sequences (Lelekov et al., 2000).
These findings suggest that the P600 may index processes of structure
building quite generally, of which syntactic processing is one example.

The P600 and Syntactic Ambiguity

Additionally, P600 effects are not limited to cases of outright structural
violation. Several studies have demonstrated that the P600 component is
also sensitive to syntactic complexity within sentences that do not contain
any structural violation. That is, sentences that are grammatically well
formed, but syntactically more difficult or less preferred, can also elicit a
larger-amplitude P600 relative to sentences that are easier to parse or in
some other way more preferred. P600 effects of this type have been seen in
response to sentences that contain some temporary syntactic ambiguity, in
which at least two alternative syntactic parses can be entertained at some
point during sentence processing (Kaan & Swaab, 2003a; Mecklinger et al.,
1995; Osterhout, 1997; Osterhout & Holcomb, 1992, 1993; Osterhout et al.,
1994). Consider the following example (from Kaan & Swaab, 2003a):
a The man is painting the house but the garage is already finished.
b The man is painting the house and the garage is already finished.

Both (a) and (b) are grammatical sentences; however, (b) contains a
temporary syntactic ambiguity that is absent in (a). Until the verb in the
second clause (“is”) is encountered in (b), two plausible and formally
permissible structures can be generated from the sentence fragment (one in
which “and” combines “the house” and “the garage” into a conjoined noun
phrase, both of which are being painted, and one in which “the garage” is
the head noun of a new clause, as is ultimately forced upon encountering
“is”). Behavioral research has demonstrated that most readers prefer the
first structure, the one that must be abandoned upon encountering “is”. The
ERP results mirror the behavioral results: at the verb in the second clause,
the amplitude of the P600 is larger in (b) than in (a), suggesting that readers
have to abandon the preferred structural interpretation at this point in favor
of the less preferred one (Kaan & Swaab, 2003a).

Other research has shown that it is not necessary for a sentence to include
this kind of “garden path” (in which one is first led down a specific syntactic
path before recognizing the need to change directions toward another) to
elicit a larger P600; sentences that are unambiguous, but syntactically more
difficult or sophisticated, can lead to increases in the amplitude of the P600
as well (Kaan & Swaab, 2003b; Kaan et al., 2000; see Figure 15.17).

Fig. 15.17 P600 effects are also found to grammatical but less preferred
syntactic continuations of sentences. In the top panel, the typical P600
effect to ungrammatical continuations is shown; when a verb and a noun
do not agree in number (i.e., “. . . the hamburger that are…”), a greater
P600 is found to the verb (red line) compared to a grammatical continuation
(blue line). In the bottom panel, preferred and non-preferred grammatical
continuations are compared; i.e., it is easier to attach the verb (were) to
the most recent noun phrase (pizzas), than to a noun phrase earlier in the
sentence (cakes), even though this latter continuation is also grammatical.
A greater P600 is found to the non-preferred continuation (pink line), than
to the preferred continuation (green line). For comparison, the middle
panel shows a greater P600 when the verb cannot be attached to any of
the preceding noun phrases (UNgrammatical, purple line), relative to the
preferred grammatical condition (green line). (Data from Kaan & Swaab,

Based on results such as those presented above, several prominent
interpretations have been offered concerning the functional significance of
the P600. Some of these models describe a role for the P600 in processes
of syntactic analysis and reanalysis or repair, as needed. Osterhout and
colleagues (1994) suggested that the P600 reflects the cost of reprocessing
that is necessary when an initial parse is disconfirmed. Kaan and colleagues
(2000; see also Fiebach et al., 2002) proposed that the P600 reflects the
difficulty of syntactic integration, a process that is made easier when
the current syntactic structure is predictable (and thus becomes readily
activated). Friederici (1995, 2002; see also Friederici & Kotz, 2003; Friederici
& Weissenborn, 2007) has proposed a three-stage model of language
comprehension in which the P600 corresponds to the final stage of syntactic
reanalysis that may arise when information from the initial two stages
(early phrase-structure building and semantic/verb-argument information
activation) cannot be readily reconciled. Hagoort (2003; see also Hagoort,
2005) has proposed a unification model of syntactic processing in which the
P600 indexes the amount of time required to unify syntactic frames into one
phrasal configuration. This unification takes longer (and thus the amplitude
of the P600 is larger) when, for instance, syntactic ambiguity (temporarily)
introduces more than one possible syntactic configuration and competition
among the alternatives results.

Finally, it may well be that P600 effects are not of a piece, but instead
comprise a family of effects that may reflect separable underlying functional
processes. Hagoort and colleagues (1999) suggested that the topographic
distribution of the P600 might differ, depending on the type of demand
placed on the parser. Specifically, they suggested that P600 effects tend to
be more frontally distributed in cases in which syntactic preferences are not
met, but tend to be more posterior in cases of outright syntactic violations.

The P600 and Semantic-Thematic Integration

Until quite recently, despite the lack of agreement on the exact functional
nature of the P600 component, most researchers would have at least felt
comfortable with a characterization of this component as reflecting some
aspect of syntactic processing, as compared with the N400 and its role as an
index of semantic processing. However, even this rather general depiction
of the P600 has been called into question by a recent series of studies that
report P600-type effects to stimuli that contain seeming semantic violations
and thus might otherwise have been reasonably expected to elicit N400

One example of such a study comes from Kuperberg et al. (2003b), who
presented participants with sentences like the following:
a For breakfast the boys would only eat toast and jam.
b For breakfast the eggs would only eat toast and jam.
c For breakfast the boys would only bury toast and jam.

All three sentences are syntactically well formed. Sentence (a) is also
semantically well formed. Sentences (b) and (c) both contain semantic/
pragmatic violations. In (b), the incongruity results from an animacy violation
of thematic roles: eggs are inanimate and thus cannot fill the Agent thematic
role demanded by the verb “eat”. (An inanimate entity like “eggs” is
better suited to the Theme thematic role of “eat”.) In (c), the incongruity
arises from a pragmatic violation: boys can indeed bury things (even toast
and jam), but are not expected to do so at breakfast time. As expected,
sentences like (c) show an increased N400 at the critical verb (“eat” vs.
“bury”) relative to the semantically well-formed controls (a). A surprising
pattern of results was found, however, to the critical verb in sentences like
(b): instead of an N400 to the critical verb in these sentences, a large P600
was elicited when compared to the controls (Kuperberg et al., 2003b; see
Figure 15.18).

Fig. 15.18 Event-related potentials to verbs that form a normal (blue solid
line), an implausible but possible (red dotted line), and a thematically
violated continuation of the sentence (dashed green line). Adapted from
Kuperberg, Holcomb, Sitnikova, Greve & Caplan, 2003, Cognitive Brain
Research (Elsevier).

This effect cannot be explained just in terms of component overlap with the
N400, since it seems unlikely that “eat” in “the eggs would eat” would elicit a
reduction in the N400 relative to the control condition “the boys would eat”.

Several subsequent studies have also demonstrated P600 effects to what
would generally be characterized as semantic violations (Hoeks et al., 2004;
Kim & Osterhout, 2005; Kolk et al., 2003; Kuperberg et al., 2006, 2007;
Nakano & Swaab, 2005; van Herten et al., 2005). For example, Kim and
Osterhout (2005) found a P600 effect to sentences like “The hearty meal
was devouring the kids” when compared to semantically well-formed passive
and active control sentences. However, they demonstrated that this effect
depended, in part, upon the “semantic attraction” of a given thematic role
assignment, given a particular verb. So, for a verb like “devour”, for which
“meal” is an appropriate Theme but not an appropriate Agent, there is a
semantically attractive alternative thematic role assignment available to
readers. On the other hand, for sentences such as “The dusty tabletops
were devouring with gusto”, in which “tabletops” is not appropriate for
any thematic role associated with “devour”, semantic attraction is low.
And indeed, Kim and Osterhout found, for sentences with low semantic
attraction, an N400 effect instead of a P600 effect. However, Kuperberg
and colleagues (2006) demonstrated a similar (if not larger) P600 effect to
sentences in which reassignment of thematic roles did not lead to repair (as
in “To make good documentaries cameras must interview…”), suggesting
that semantic attraction or fit cannot be the only underlying cause of this
effect. Instead, violations of animacy in semantic–thematic relationships may
play an important role in eliciting the P600 effect.

Results such as these throw into doubt the traditional interpretation of the
P600 component as an index of syntactic analysis and repair and, more
broadly, raise serious and interesting questions about the relationship
between semantic and syntactic processes in the brain.

The P600 and Syntactic Priming

Recently, we used the responsiveness of the P600 component to aspects
of syntactic processing to examine the nature of syntactic priming (Ledoux
et al., 2007; see also Tooley et al., 2009). Syntactic priming is the facilitation
of sentence processing that occurs when a sentence has the same syntactic
form as a preceding sentence. Behavioral studies have shown syntactic
priming effects repeatedly in studies of language production. However,
such effects have been less consistently demonstrated in studies of
language comprehension. When priming effects have been demonstrated
in comprehension, the effects seemed to depend on the repetition of lexical
items (especially verbs) across sentences. In these cases, it was difficult to
disentangle the contribution of lexical repetition priming effects and true
syntactic priming effects using behavioral measures alone. On the other
hand, ERPs seemed well suited to this task, given that lexical repetition
priming has been shown to influence the N400 component and syntactic
processing to influence the P600 component.

We thus designed an ERP experiment to attempt to dissociate these two
types of effects. We presented participants with sentences such as the
following (from Ledoux et al., 2007):
Main-clause (MC) prime:
The speaker proposed the solution to the group at the space
Reduced-relative (RR) prime:
The speaker proposed by the group would work perfectly for the program.
Target (always RR):
The manager proposed by the directors was a bitter old man.

For each stimulus set, participants read one of two types of prime sentences:
MC and RR forms. The MC interpretation is preferred, and behaviorally,
readers have temporary difficulty parsing the RR version. We expected to see
similar evidence of this difficulty electrophysiologically, and that is what we
found: a larger P600 to the disambiguating region (following the verb) for RR
prime sentences relative to MC prime sentences. After each prime sentence,
readers were presented with another sentence that contained the same main
verb presented in the prime sentence. This target sentence was always of
the RR form, regardless of the type of prime sentence that had preceded
it. So, half of the participants saw an MC prime followed by an RR target,
and half of them saw an RR prime followed by an RR target. We looked at
the ERPs to see if the response to the same target sentence differed purely
as a function of the type of prime sentence that had been read before it.
We found evidence that this was the case: the P600 was reduced for the
disambiguating region following the verb in the RR target sentences that had
been preceded by a prime sentence with a similar syntactic construction (RR)
relative to when the same target sentences had been preceded by a prime
sentence with a different syntactic construction (MC). We took this reduction
in the amplitude of the P600 to be evidence of syntactic priming. We were
also able to demonstrate that this syntactic priming effect was separate from
the priming effect that arose from lexical repetition. We found a reduction in
the amplitude of the N400 component to the second presentation of the verb
preceding the disambiguating region (in the target sentences) relative to its
first presentation (in the prime sentences). It seems, then, that the context
in which a sentence is presented (being in close proximity to sentences of
similar construction) can influence the brain’s response during syntactic
parsing, and that this response can be dissociated from the processing
benefit conferred by lexical repetition (Ledoux et al., 2007).

The Processing Nature of the P600

Even though many experiments have shown that the P600 is sensitive to
syntactic aspects of the linguistic input, the finding of a P600 to violations
of thematic constraints (as in “The eggs would eat toast with jam at
breakfast”) calls into question whether or not the P600 is uniquely evoked
by syntactic processing. Several new hypotheses about the functional
interpretation of the P600 have been offered in light of these recent results
(for reviews, see Bornkessel-Schlesewsky & Schlesewsky, 2008; Kolk &
Chwilla, 2007; Kuperberg, 2007; Stroud & Phillips, 2009). One proposal
is that the P600 effect in these experiments arises as a result of strong
semantic-thematic attraction (Kim & Osterhout, 2005) or fit (Kuperberg
et al., 2006) in sentences in which a plausible meaning can be derived if
thematic roles are reassigned (see also Kemmerer et al., 2007, for another
explanation based on the temporary syntactic reanalysis of grammatical-
semantic violations). More recently, Kuperberg (2007) has proposed a model
of language comprehension in which two processing streams act in parallel.
One, the semantic memory–based stream, computes semantic features
and relationships among sentence components and is primarily reflected in
the N400 component. The other, the combinatorial stream, is sensitive to
a multitude of linguistic constraints, including constraints of morphosyntax
and of semantic–thematic relationships (including animacy). When the two
streams provide contradictory output (i.e., when the semantic interpretation
output by the first stream contradicts morphosyntactic or semantic-thematic
information in the sentence), continued analysis must be undertaken to
resolve the inconsistency, and it is this extended analysis that is reflected
in the P600 component. A rather different proposal was presented by van
Herten and colleagues (2006; see also Kolk & Chwilla, 2007), who suggested
that the P600 might instead reflect the engagement of executive or cognitive
control processes in the service of error monitoring and reprocessing in order
to resolve response uncertainty during language processing (see also Vissers
et al., 2006, 2007, 2008). This last proposal is most damaging to the idea
that the P600 is sensitive to syntax, because it suggests that the P600 is not
even language-specific (which will be discussed further in the next section).

While P600 effects of this type are still open to interpretation and further
study, they do suggest that the interaction between semantic and syntactic
processes in the brain may be more dynamic than was previously supposed.
This conclusion is further supported by other studies that have reported
an influence of lexical/semantic and discourse factors on syntactic parsing
processes (Brown et al., 2000; Gunter et al., 2000; Osterhout et al., 1994;
van Berkum et al., 1999, 2003; Weckerly & Kutas, 1999; see also Bornkessel
& Schlesewsky, 2006). What is clear at this point is the importance of further
studies of factors, such as those described above, that seem to be at the
interface of semantic and syntactic processing. This will be necessary
for a more complete understanding of the P600 component in language

Is the P600 Distinct from the P300? Task Sensitivity and Possible Neural

Very soon after the discovery of the P600 in the early 1990s (Hagoort et al.,
1993; Osterhout & Holcomb, 1992), a debate emerged in the literature on
whether or not the P600 is in fact just another manifestation of the P3b, a
member of the P300 family (see Chapter 7, this volume). This would imply
that the P600 may be related to cognitive processing that is not specific to
the building of hierarchical structure (Coulson et al., 1998; Gunter et al.,
1997; but see Osterhout & Hagoort, 1999). Coulson and colleagues (1998)
published a study that suggested that the P600 is a member of the P300
family. Specifically, they argued that manipulations known to modulate
the P3b, such as salience of the stimulus, probability of occurrence, and
task relevance, also modulate the P600. Further, they found no significant
differences in the topographic distribution of the P3b and the P600, which
also challenges the idea that these ERPs are distinct. If the P600 and the P3b
are not distinct, then this would further call into question the assumption of a
syntax-sensitive ERP, although it would not dispute the fact that this positive-
deflecting ERP is sensitive to manipulations of syntax as well. Next, we will
discuss the results of some studies that suggest that the P3b and the P600
may in fact not be the same ERP component.

In 1996, Osterhout and colleagues (Osterhout et al., 1996) conducted a study
on whether or not the P600 and the P3b could be differentiated on the basis
of their sensitivity to syntactic manipulations of subject–verb agreement and
nonsyntactic manipulations of saliency, probability of occurrence, and task
relevance. Participants read sentences such as the following:
a Nonanomalous Control: The doctors believe the patient will recover.
b Agreement Violation: The doctors *believes the patient will recover.
c Letter Case Violation: The doctors BELIEVE the patient will recover.
d Double Anomaly: The doctors *BELIEVES the patient will recover.

Event-related potentials were measured to the critical words that were
violations (the verbs in the examples) relative to the control condition (a).
It was expected that the comparison of (a) and (b) would show a P600
and the comparison of (a) and (c) would show a P3b. Importantly, only the
P600 should be sensitive to the syntactic violation and only the P3b should
be sensitive to the manipulations of probability of occurrence and task
relevance. Further, in the Double Anomaly condition (d), additive effects
of P3b and P600 would also indicate that these ERPs are separable. When
probability was manipulated (20% vs. 60% violations), a modulation of the
P3b was obtained in the Letter Case Violation condition (c). In contrast, the
amplitude of the P600 was not affected by the probability manipulation (see
Figure 15.19).
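
The probability manipulation described above can be illustrated with a minimal sketch of how a trial list containing a fixed proportion of anomalous items might be assembled. This is not the procedure reported by Osterhout et al. (1996); the function name `build_trial_list`, the list length of 100, and the fixed random seed are assumptions made purely for illustration.

```python
import random

def build_trial_list(n_trials, violation_proportion, seed=0):
    """Return a shuffled list of condition labels with a fixed violation rate.

    Hypothetical helper: only the 20% and 60% proportions come from the text.
    """
    n_violation = round(n_trials * violation_proportion)
    trials = ["violation"] * n_violation + ["control"] * (n_trials - n_violation)
    # A seeded generator keeps the order reproducible across runs.
    random.Random(seed).shuffle(trials)
    return trials

low_probability = build_trial_list(100, 0.20)   # violations are rare
high_probability = build_trial_list(100, 0.60)  # violations are frequent

print(low_probability.count("violation"), high_probability.count("violation"))
```

On a P3b account, the rarer violations in the 20% list should produce the larger positivity; the finding reported in the text is that only the letter-case effect, and not the P600, followed this pattern.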

Fig. 15.19 Event-related potentials to critical words that were syntactically
correct but physically deviant relative to the preceding context (uppercase
letters) in 20% (dotted line) and 60% (solid line) of the experimental
materials. A larger positive shift or P3b is obtained in the 20% condition. The
P600, which is obtained to the syntactic violation of subject–verb agreement
(*The doctors believes…) is not sensitive to this probability manipulation
(right side). Adapted from Osterhout, McKinnon, Bersick & Corey, Journal of
Cognitive Neuroscience, 1996 (MIT press).

Fig. 15.20 P3b effects in a standard oddball paradigm (see Chapter 7, this
volume) for visual, auditory, or somatic target stimuli. Results are compared
between normal, neurologically unimpaired subjects (control, solid line), and
patients (dotted line) with focal lesions in the left frontal cortex (top left), left
temporal-parietal junction (middle left), left parietal cortex (bottom left), and
hippocampus (top, middle, and bottom right). Only patients with temporal
parietal lesions show marked attenuations of the P3b response. Adapted with
permission from Knight and Scabini (1998).

The latency, amplitude, morphology, and topographic distribution of the
P3b and P600 effects differed as well; the P3b effect was larger in amplitude
than the P600 effect and was maximal over right posterior electrode sites
in a 400–800 ms time window; the P600 was maximal in a later 500–900
ms time window and was more evenly distributed over posterior sites
of both hemispheres. When task relevance was manipulated (sentence
acceptability task vs. reading task), P3bs were obtained to Letter Case
violations for both tasks, but the effect was greatly increased in the sentence
acceptability judgment task. P600s were found to syntactic violations in both
task conditions as well and the P600 was also larger in the task-relevant
condition, but the modulation of the P600 was much less robust than that
found for the P3b in the Letter Case violation. Finally, when participants
were presented with a Double Anomaly (condition d), a large-amplitude
broad positivity was obtained that was larger than the P3b to the Letter
Case manipulation alone and larger than the P600 to the syntactic violation,
indicating that this effect was additive. One could argue that this study in
fact showed that the P600 is sensitive to task relevance. However, in a study
that compared syntactic violations to syntactic preference, Kaan and Swaab
(2003a) found that the amplitude of the P600 does not vary as a function of
the task when syntactic preference is manipulated.
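
The logic of the additivity argument can be sketched as follows, assuming (hypothetically) that each effect is quantified as the mean amplitude of a condition-minus-control difference wave in a latency window. The waveforms below are invented toy values in microvolts, sampled every 100 ms; they are not data from Osterhout et al. (1996). Only the 400–800 ms and 500–900 ms windows are taken from the text, and the helper names are illustrative.

```python
# Toy illustration only: invented voltages (in microvolts), not real ERP data.
TIMES_MS = list(range(0, 1000, 100))  # 0, 100, ..., 900 ms after word onset

def difference_wave(condition, control):
    """Pointwise condition-minus-control difference wave."""
    return [c - k for c, k in zip(condition, control)]

def mean_amplitude(wave, start_ms, end_ms, times_ms=TIMES_MS):
    """Mean of the samples whose time t satisfies start_ms <= t < end_ms."""
    window = [v for v, t in zip(wave, times_ms) if start_ms <= t < end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)

control     = [0.0] * 10
letter_case = [0.0, 0.0, 0.0, 0.0, 4.0, 4.0, 4.0, 4.0, 0.0, 0.0]  # P3b-like
agreement   = [0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 2.0, 2.0, 2.0, 0.0]  # P600-like
double      = [0.0, 0.0, 0.0, 0.0, 4.0, 6.0, 6.0, 6.0, 2.0, 0.0]  # both anomalies

p3b_effect = difference_wave(letter_case, control)
p600_effect = difference_wave(agreement, control)
double_effect = difference_wave(double, control)

p3b_mean = mean_amplitude(p3b_effect, 400, 800)
p600_mean = mean_amplitude(p600_effect, 500, 900)

# Additivity: the double-anomaly effect should equal the sum of the single effects.
residual = [d - (a + b) for d, a, b in zip(double_effect, p3b_effect, p600_effect)]
max_abs_residual = max(abs(r) for r in residual)

print(p3b_mean, p600_mean, max_abs_residual)
```

A residual near zero is what an additive account predicts, whereas a reliably nonzero residual would point to an interaction between the two effects. With real data the comparison would be made statistically across participants rather than by inspecting a single waveform.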

Other dissociations with regard to P600 and P3b have been observed in a
study by Hagoort and colleagues (2003), who found that aphasic patients
with a syntactic deficit did not show a P600 to syntactic violations of word
order but did show a P3b to target stimuli in a standard oddball paradigm
(see Chapter 7, this volume). Further evidence that the P600 and the P3b
may be distinct ERP components comes from studies in patients with
localized brain lesions that are indicative of separable neuronal sources.
The work of Knight and colleagues has shown that the P3b is significantly
attenuated in patients with temporal-parietal lesions (Knight et al., 1988,
1989). Frisch and colleagues, on the other hand, have shown the presence of
a P3b and the absence of a P600 in patients with lesions in the basal ganglia
(Frisch et al., 2003). The results of these latter two studies are shown in
Figures 15.20 and 15.21.

Less direct but nevertheless suggestive evidence comes from fMRI studies
that show that largely nonoverlapping brain regions become activated in
oddball experiments and syntactic experiments. Oddball fMRI studies have
shown activations in brainstem, temporal lobes, and medial frontal lobes
(e.g., Calhoun et al., 2006; McCarthy et al., 1997). Functional MRI studies
of syntax have observed activations in anterior regions of the superior
temporal gyrus (STG; Friederici et al., 2003; Meyer et al., 2000) and posterior
portions of the inferior frontal gyrus (Broca’s area; e.g., Caplan et al., 2008;
Friederici et al., 2003; Kuperberg et al., 2003; but see January et al., 2009).
Furthermore, recent studies that performed repeated transcranial magnetic
stimulation of Broca’s area show performance improvements during syntactic
processing (Sakai et al., 2002) and processing of artificial grammar (Uddén
et al., 2008).

Fig. 15.21 Comparison of P600 and P3b in patients with and without
lesions that include the basal ganglia. Patients with lesions that include
the basal ganglia do not show a P600 to morphosyntactic violations (solid
line) when compared to correct continuations (dotted line) of sentences
(left panel), but they do show a P3b response (right panel), with a larger
P300 to deviant (solid line) than to standard (dotted line) target stimuli.
Reprinted with permission from Frisch, Kotz, von Cramon & Friederici, Clinical
Neurophysiology (2003).

Finally, it has been shown that the P600 and the P3b have different
oscillatory signatures; whereas an increase in P600 amplitude is correlated
with a decrease in power in the alpha and beta bands (Davidson & Indefrey, 2007), a
larger amplitude of the P3b is associated with a tighter synchronization in
the gamma band and a reduction of power in the gamma band (Ford et al.,

In sum, the extant evidence generally supports a separation between
P3b and P600 ERPs. Studies have shown that the P600 is not sensitive
to probability (Osterhout et al., 1996) and that manipulation of syntactic
preference results in a P600 that is not sensitive to task relevance (Kaan &
Swaab, 2003b). In addition, separable neural sources may be involved in the
generation of the P3b and P600.


In addition to the modulations of the P600 component described in the
previous section, syntactic anomalies have also been shown to elicit earlier
negative shifts in the ERP waveform (Friederici et al., 1993; Münte et al.,
1993; Neville et al., 1991; see Figure 15.22).

These shifts appear over anterior electrodes and in some cases (though not
all) have been shown to be lateralized to the left side of the head. For this
reason, this class of ERP component is generally referred to as a left anterior
negativity (LAN). (In most cases, when not left-lateralized, the anterior
negativity is bilaterally distributed.) Researchers have reported LAN effects
at varying latencies following a stimulus. Early LAN (ELAN) effects have been
observed as early as 100–300 ms poststimulus onset. These early effects
have been distinguished from later LAN effects, which fall in the same latency
window as the N400 (300–500 ms) but are differentiated from that component
by their anterior distribution (and by a different set of eliciting conditions).
Whether the ELAN and LAN are truly two different components indexing
functionally distinct language processes, or whether they reflect a single
process that varies in onset, is a matter of great debate.

The ELAN and LAN effects have been observed to word category violations,
that is, when the parser anticipates that an upcoming word will be of a
particular grammatical category (noun, verb, etc.) but is presented with a
word that violates that expectation, as in “The young apprentice went to
see the new *designing in the museum,” where a verb (designing) is in a
noun position (design) (Friederici et al., 1996; Hagoort et al., 2003; Hahne &
Friederici, 1999; Münte et al., 1993). In German, the latency of the anterior
negativity has been shown to depend in part on the location of the violation
in the critical word: the onset was earlier when the violation was part of
the prefix of the critical word and was later when it occurred in the suffix
(Friederici et al., 1993, 1996; Hahne & Friederici, 1999, 2002). It is under
conditions of word category violation that the ELAN has been most reliably
elicited. The later LAN effect has been elicited under a broader range of
conditions, especially those involving violations of morphosyntax (number,
case, gender, and tense violations; Deutsch & Bentin, 2001; Gunter et al.,
1997, 2000; Osterhout & Mobley, 1995; Penke et al., 1997).

Fig. 15.22 The LAN and the P600 response to syntactic violations (dotted
line). Reprinted with permission from Friederici et al. (2004).

A LAN has also been consistently observed to fully grammatical sentences
that contain long-distance dependency constructions, such as filler-gap
sentences (Felser et al., 2003; Kluender & Kutas, 1993). In such sentences,
one sentence component (the filler) has been omitted or displaced from its
original position, leaving behind a trace (the gap) that must be filled with the
missing component. An example of gap filling comes from WH-constructions,
such as “Which movie did John like?” In this case, the filler “movie” has been
moved from its position as the object of the verb “like”, and has created a
gap after the verb that must be detected and filled with the displaced filler.
A LAN has been observed at the position of the gap (Fiebach et al., 2001,
2002; Kluender & Kutas, 1993). Another example comes from verb gapping,
in which a verb is omitted in a sentence that contains two conjoined clauses,
as in “Mary ate the hamburger, and Susie the salad”. The detection of the
gap in sentences such as these has also been shown to elicit an E/LAN effect
(Kaan et al., 2004).

The Functional Nature of the E/LAN

The functional significance of the E/LAN is the subject of much debate.
Because the effect is most consistently elicited only under conditions of
structural violation, it seems to index some aspect of difficulty during
syntactic processing. (Also, because E/LAN effects are seen primarily to
violations, they are usually accompanied by a P600 effect, but the converse
is not always true: P600 effects have been observed in the absence of
E/LAN effects.) Hagoort (2003), in his unification model, suggested that
anterior negativities result when a syntactic element cannot be bound to
any other element in the current parse (either because of a violation of
word category or because of a violation of morphosyntactic principles).
Friederici (1995, 2002) and Friederici and Weissenborn (2007) propose a
functional distinction between the ELAN and the LAN. According to this
model, the ELAN indexes difficulty during an initial stage (Phase 1) of phrase-
structure building during which word category information is identified.
The LAN, on the other hand, indexes difficulty during a stage (Phase 2) in
which morphosyntactic information is processed and integrated. (Semantic
information is also integrated during Phase 2, a process reflected in this
model by the amplitude of the N400.) As mentioned previously, Friederici
proposed that the amplitude of the P600 reflects processing during a final
stage (Phase 3), in which reanalysis and repair are engaged as needed to
fully integrate the syntactic and semantic outputs of Phases 1 and 2.

An alternative proposal suggests that anterior negativities may more
broadly index the working memory operations involved in the processing
of verbal material (Fiebach et al., 2001; King & Kutas, 1995; Kluender &
Kutas, 1993). Some have suggested that the LAN may be a “time slice” of
the slow negative waves that have been observed with working memory
manipulations. Anterior negativities similar to the LAN have been observed
under other conditions that could be thought to tax verbal working memory.
Verbal working memory would certainly be recruited during the processing
of filler-gap constructions and other long-distance dependencies, since
sentence constituents must be maintained and manipulated over the
course of the sentence. It is more difficult to reconcile this view of the
LAN as an index of increased working memory load with its response to
morphosyntactic violations, where the form of a word does not obey the
structure of the sentence, such as in violations of tense, gender, number,
and case (e.g., “The young apprentice go to the museum”). Another
possibility is that there are multiple distinct processes occurring under
different conditions that produce many ERP components that are difficult to
differentiate due to the relatively limited spatial resolution of the technique.
Future research may help to answer some of these questions.

Using ERPs to Study Language: Methodological Issues

Anyone who studies language processing is aware of the creativity, time,
and effort it takes to construct the stimuli for an experiment. With pretesting
and norming, stimulus construction can sometimes take up to 1 year and
very often requires at least 3 months. In ERP research, this problem is
compounded by the need to have enough stimuli in each condition to
achieve the appropriate signal-to-noise ratio (see Luck, 2005). Fortunately,
because many of the language-related ERP effects have now been well
documented in the literature, a workable rule of thumb is to have a minimum
of 25 trials in each condition after artifact rejection. Given the rate at which
artifacts (primarily due to blinks and eye movements) occur, even when
instructions are given to minimize them, this generally means that it is
wise to start with at least 40 trials in each condition. However, this is still
a sizable number. Unlike studies of visual spatial attention, for example,
where the same stimulus is often repeated many times, repetition of the
same language stimulus is often detrimental (unless the repetition serves a
language function, as for example in coreferential processing with repeated
names, e.g., Swaab, Camblin, & Gordon, 2004, or when repetition priming
is studied). In fact, repetition can lead to changes in subject strategies and
diminishing alertness. In addition, lexical repetition per se has substantial
effects on the amplitude of the N400 (e.g., Besson & Kutas, 1993). As in
behavioral studies of language, in order to avoid repetition of stimuli, a Latin
square design is frequently applied, such that a given experimental item
appears in each condition across different lists that are counterbalanced
across participants. To the extent possible, it is advisable to keep the critical
word (to which the ERPs will be measured) the same across the different
versions of the experimental item. When critical words are different across
conditions, they should be carefully matched on relevant lexical properties
known to influence processing/ERPs (such as word length, lexical frequency,
part of speech, concreteness, and age of acquisition). Useful databases
characterizing such lexical features can be found online (e.g., at
http://www.psych.rl.ac.uk/MRC_Psych_Db.html; http://elexicon.wustl.edu/). Other
factors (such as cloze probability, plausibility, and task-induced processing
requirements) may also influence the amplitude of language-related ERP
components and should be carefully controlled unless they are the specific
object of study.
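
The Latin square assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the condition labels, and the choice of 80 items are hypothetical and are not taken from any of the studies cited here. The sketch rotates conditions across items so that each item appears in every condition across lists but only once within a list, and it confirms that each list starts with the 40 trials per condition recommended above.

```python
def latin_square_lists(n_items, conditions):
    """Build one presentation list per condition by rotating conditions over items.

    Each list is a sequence of (item_index, condition) pairs. Across lists,
    every item occurs once in every condition; within a list, every item
    occurs exactly once.
    """
    n_cond = len(conditions)
    if n_items % n_cond != 0:
        raise ValueError("n_items must be a multiple of the number of conditions")
    lists = []
    for list_index in range(n_cond):
        assignment = [
            (item, conditions[(item + list_index) % n_cond])
            for item in range(n_items)
        ]
        lists.append(assignment)
    return lists


# Hypothetical design: 80 items, 2 conditions (labels are illustrative only).
conditions = ["control", "anomalous"]
lists = latin_square_lists(n_items=80, conditions=conditions)

# Count trials per condition in the first list: 80 items / 2 conditions = 40.
per_condition = {
    c: sum(1 for _, cond in lists[0] if cond == c) for c in conditions
}
print(per_condition)
```

Each participant would then be assigned one of the lists, with lists counterbalanced across participants. With 40 trials per condition at the outset, up to 15 trials per condition can be lost to artifacts before the count falls below the 25-trial minimum.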

While the same lexical items can be used for the comparison across
conditions, other problems may arise when the critical word in the sentence
is not in the same position across conditions. As in behavioral studies,
ERP effects of the sentential position of the critical word have been
demonstrated, such that words presented at later positions in the sentence
are more easily integrated, resulting in a general reduction in the amplitude
of the N400 to words that occur later in a sentence. In addition,
to avoid sentence wrap-up effects involving the integration of the overall
meaning of the sentence, it is best to avoid presenting the critical words in
the sentence-final position.

Furthermore, ERP baseline issues may occur when the words preceding and
following the critical word are different across experimental items, as in the
following example:
Anomalous: They admired my of sketch the landscape.

Control: They admired my sketch of the landscape.

The critical word of in this example introduces a word-order violation in
the Anomalous condition and follows a normal word order in the Control
condition. However, the prestimulus baseline for the critical word of
differs across conditions because the words that precede it differ in
word class: in the Control condition the open-class word
sketch will elicit an N400 that is not found to the closed-class word my in
the Anomalous condition. If a typical presentation rate of one word every
500 ms is used, then this will affect a 100 to 200 ms prestimulus baseline of
the critical word. Ideally then, the same words should directly precede (and
follow) the critical word.
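
The timing argument above can be made concrete with a small calculation. The sketch below is illustrative only: the 300 to 500 ms N400 window is an assumption introduced here for the example (the text specifies only the 500 ms presentation rate and the 100 to 200 ms baseline), and the function name is hypothetical. It re-expresses the N400 window of the preceding word relative to the onset of the critical word and computes how much of each candidate baseline window it covers.

```python
def overlap_ms(window_a, window_b):
    """Length in ms of the overlap between two (start, end) time windows."""
    start = max(window_a[0], window_b[0])
    end = min(window_a[1], window_b[1])
    return max(0, end - start)


soa_ms = 500               # one word every 500 ms (from the text)
n400_window = (300, 500)   # ASSUMED N400 window relative to word onset

# The preceding word started soa_ms earlier, so shift its N400 window
# into the time frame of the critical word: (-200, 0).
preceding_n400 = (n400_window[0] - soa_ms, n400_window[1] - soa_ms)

for baseline in [(-100, 0), (-200, 0)]:
    print(baseline, overlap_ms(baseline, preceding_n400), "ms of overlap")
```

Under these assumptions both the 100 ms and the 200 ms baseline fall entirely within the N400 window of the preceding word, which is why a difference in the preceding word can distort the baseline.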

RSVP Requirements

Because ERPs are vulnerable to artifacts from eye movements, ERP reading
experiments typically present participants with sentence or discourse stimuli
one word at a time, usually at a rate of one word every 500 ms (300 ms per
word, with an interstimulus interval of 200 ms). The reading conditions during
an ERP experiment thus differ from natural reading conditions in at least
three ways. First, under normal reading conditions, not every word in the
sentence or discourse is read with the same speed, and some words are skipped
altogether. Second, the average reading time is about 200 ms per word,
which is much shorter than the 500 ms per word typically used with RSVP.
Third, under normal reading conditions, participants can and often will
return to words earlier in the sentence when they are confronted with
ambiguity or another difficulty later in the sentence, or when they
temporarily “zone out” during reading and find that they are not consciously
aware of the content of a passage that they have just read. This option is
not available to ERP participants when they are forced to read one word at
a time.
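
As a minimal sketch of the fixed-rate presentation described above, the following Python function lays out word onsets and offsets for a sentence shown at 300 ms per word with a 200 ms interstimulus interval. The function name and return format are hypothetical conveniences for this example and do not correspond to any particular stimulus presentation package.

```python
def rsvp_schedule(words, duration_ms=300, isi_ms=200):
    """Return (word, onset_ms, offset_ms) for fixed-rate word-by-word presentation."""
    soa_ms = duration_ms + isi_ms  # onset-to-onset interval: 500 ms by default
    return [
        (word, i * soa_ms, i * soa_ms + duration_ms)
        for i, word in enumerate(words)
    ]


sentence = "They admired my sketch of the landscape".split()
schedule = rsvp_schedule(sentence)
for word, onset, offset in schedule:
    print(f"{word:<10} on at {onset:>4} ms, off at {offset:>4} ms")
```

For this seven-word sentence the last word disappears 3300 ms after the first appears, whereas reading at roughly 200 ms per word would take about 1400 ms.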

To more closely mimic natural reading speed, reading studies with ERPs
have been performed at faster RSVP rates of 200–250 ms (e.g., Camblin
et al., 2007b; van Petten et al., 1997). Under these circumstances, short-
latency ERP components such as N1 and P2 do not clearly appear in the ERP
waveform because of the overlap of ERPs from previous words. However,
even at these fast rates, it has been possible to observe distinct N400 effects
(van Petten et al., 1997). Interestingly, in at least one study, the fast rate of
presentation led to changes in the typical pattern of results (Camblin et al.,
2007b), presumably because readers lacked control over their reading input,
which they would still have in natural reading conditions. This may place
additional demands on the reader that interfere with the normal reading
process. Recently, Van Berkum and colleagues introduced the variable
serial visual presentation (VSVP) technique in concert with ERPs, in which
the presentation duration of each word depends on its length (for details,
see Otten & Van Berkum, 2007). This procedure would seem to better
approximate more natural reading conditions when compared with a fixed
fast RSVP of 200 ms per word.
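
The idea behind length-dependent presentation can be sketched as follows. The parameter values in this example (a 190 ms base duration, 25 ms per letter, and a 100 ms interstimulus interval) are hypothetical placeholders chosen for illustration; they are not the values used by Otten and Van Berkum, whose paper should be consulted for the actual procedure.

```python
def vsvp_schedule(words, base_ms=190, per_letter_ms=25, isi_ms=100):
    """Return (word, onset_ms, duration_ms) with duration growing with word length.

    base_ms, per_letter_ms, and isi_ms are HYPOTHETICAL illustration values.
    """
    schedule = []
    onset = 0
    for word in words:
        duration = base_ms + per_letter_ms * len(word)
        schedule.append((word, onset, duration))
        onset += duration + isi_ms  # next word starts after the blank interval
    return schedule


sentence = "They admired my sketch of the landscape".split()
for word, onset, duration in vsvp_schedule(sentence):
    print(f"{word:<10} onset={onset:>5} ms  duration={duration} ms")
```

In contrast to a fixed rate, short function words such as my and of are shown more briefly than long content words such as landscape, so the onset-to-onset interval varies from word to word.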

Ditman and colleagues (2007) found that the self-paced reading paradigm
can be used while recording ERPs, allowing for comparison of ERP results
within participants. In self-paced reading paradigms, subjects are asked
to press a button each time they want to advance to the next word of a
sentence. The latency of these button presses is assumed to correlate
with processing difficulty, such that subjects will take longer to press the
button for the next word when they have more difficulty processing the
current word. Ditman et al. showed typical effects of pragmatic and syntactic
violations that had been established in previous ERP studies and self-paced
reading studies with the same paradigms. Importantly, motor artifacts from
motor preparation and the button press itself did not appear to adversely
affect the EEG recording.
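
The behavioral measure in a self-paced reading paradigm can be derived as in the sketch below. The timestamps are made-up values for illustration only and are not data from Ditman et al. or any other study; the function name is likewise hypothetical. Each word's reading time is taken as the interval between its onset and the button press that replaces it with the next word.

```python
def word_reading_times(onsets_ms, final_press_ms):
    """Reading time per word: time from its onset to the press that advanced past it."""
    boundaries = list(onsets_ms) + [final_press_ms]
    return [later - earlier for earlier, later in zip(boundaries, boundaries[1:])]


# MADE-UP timestamps (ms from trial start); each word appears at the
# moment the participant presses the button to leave the previous word.
words = ["The", "young", "apprentice", "go", "to"]
onsets = [0, 310, 650, 1120, 1890]
final_press = 2240

rts = word_reading_times(onsets, final_press)
for word, rt in zip(words, rts):
    print(f"{word:<10} {rt} ms")
```

In a combined design, the same onset timestamps would also serve as the time-locking events for the ERP averages, so that reading times and ERPs are obtained for the same words within participants.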

Other studies have combined behavioral methods with ERPs to investigate
language processing during reading. As discussed previously, Gordon
and colleagues have studied coreferential processing during reading with
identical or very similar stimuli using ERPs with both visual and auditory
presentations and eye-tracking methods (Camblin et al., 2007; Gordon
et al., 2004; Swaab et al., 2004). Two linguistic expressions are said to be
coreferential if they refer to the same semantic entity, as in “Emily asked
for a definition of coreferential processing because she wants to avoid
confusion for the readers of this chapter,” where Emily and she refer to
the same person (and the observant reader probably also realizes that this
request was made by one of the editors of this book). Interestingly, typical
effects of coreferential difficulty were observed with ERP studies that used
standard RSVP rates of 500 ms (Ledoux et al., 2007; Swaab et al., 2004),
in eye-tracking studies when participants read at their own speed (Ledoux
et al., 2007), and also in studies with naturally connected speech (Camblin
et al., 2007). However, when faster RSVP rates were used, Camblin and
colleagues observed effects of lexical repetition but not of coreferential
difficulty. This may indicate that fast RSVP rates will prompt the reader to
process words in a sentence more like words in lists, which will preserve
lexical effects of repetition but interfere with higher-order processing that
requires integration of context, as for example when readers are establishing
coreferential relations in a text.

We hope we have shown that ERPs provide a very useful tool in the study
of language comprehension. Even though much work lies ahead of us in
unraveling the mysteries of meaning in language, when all is said and done,
“Language is to the mind more than light is to the eye” (Gibson, The Miracle
Worker, p. 25).



(1) A review of the psycholinguistics models of language processing is
outside the scope of this chapter (see Traxler & Gernsbacher, 2006). Also,
we cannot review all relevant research involving language-related ERPs:
we will not separately review research with language-related ERPs in brain
damaged and/or psychiatric populations (e.g., Bonte & Blomert, 2004;
Ditman & Kuperberg, 2007; Swaab et al., 1997, 1998; Wassenaar & Hagoort,
2007; Wassenaar et al., 2004; for reviews, see Kuperberg & Caplan, 2003;
Kuperberg et al., 2010; Münte et al., 2000; Swaab, 1998). Event-related
potential studies of language production will not be reviewed here either.
These studies are rare, in large part because speaking induces large motor
artifacts in the EEG. However, clever designs have been used to identify the
relative timing in the access of different types of information during word
production (Levelt, 1999) in the incredibly fast processes that lead from
thought to speech production (Schmitt et al., 2001; van Turennout et al.,
1998). Recently, ERPs have also been used to test the theories of gesture
and language (e.g., Holle & Gunter, 2007; Özyürek et al., 2007), embodied
language (e.g., Chwilla et al., 2007), and bilingualism (e.g., Elston-Güttler
et al., 2005; Proverbio et al., 2004).

(2) The topographic distribution of the N400 varies to some extent for written
and spoken language. In the visual modality the N400 is maximal over
centro-parietal sites over the right hemisphere, whereas for spoken language
the N400 is more equally distributed over centro-parietal sites of the left and
right hemispheres.

In addition, the onset latency of the N400 in written language is around 200
ms, whereas in natural spoken language the N400 may start to diverge as
early as 50 ms after the onset of a critical word because of coarticulatory
information from the preceding speech.

(3) Recent studies have shown separable effects of early (pre)-lexical reading
processes on the N250 ERP component (e.g., Grainger and Holcomb, 2006)
and a P250 during early speech recognition (e.g., Friedrich and Kotz, 2007).

(4) Some researchers have also looked at slow wave responses elicited to
multiple words in sentences instead of to each individual word. In general,
these studies have found that more negative slow waves are observed to
sentences that are more difficult to process (relative to easier sentences). For
examples, see King and Kutas (1995) and Münte et al. (1998).

(5) Interestingly, the effects of both lexical association and sentence
congruence varied as a function of the number of intervening words between
prime and target. N400 priming effects were not obtained for lexical
associates that were separated by an average of 4.8 intervening words
but were present when 1 or no words intervened, whereas the reverse was
true for sentence congruence effects. This is consistent with findings that
effects of lexical association are short-lived (e.g., Chwilla et al., 2000) but
that sentence context effects build up over time.

(6) Studies of speech perception have used the sensitivity of the mismatch
negativity (MMN; see Chapter 6, this volume) to deviations in auditory input
to successfully investigate acoustic and phonological aspects of speech (e.g.,
Kaan, 2008; Kaan et al., 2007; Näätänen et al., 1997; Phillips, 2001).

(7) Other ERP studies have shown that the reader and listener may actually
use discourse information to anticipate and predict the upcoming words in
the sentence or story (e.g., DeLong et al., 2005; Nieuwland & Van Berkum,
2006a; Otten & Van Berkum, 2008; Otten et al., 2007; Van Berkum et al.,
2005; Wicha et al., 2004).

(8) Work of Hagoort and colleagues has also shown that world knowledge is
immediately integrated during normal language comprehension (Hagoort
et al., 2004; Hald et al., 2007). In addition, nonlinguistic (pragmatic)
information, such as whether or not the voice of the speaker matches
the message (e.g., a male talking about being pregnant; Van Berkum
et al., 2008) and even whether or not the attitude and moral values of
the comprehender clash with the message (Van Berkum et al., 2009), also
immediately influence the amplitude of the N400. This further illustrates that
real-time comprehension takes immediate advantage of many sources of
contextual information.


