Lexical Preactivation in Basic Linguistic Phrases
Joseph Fruchter1*, Tal Linzen1*, Masha Westerlund1,2, and Alec Marantz1,2
1New York University, 2New York University Abu Dhabi
*Joseph Fruchter and Tal Linzen contributed equally to this work.
© 2015 Massachusetts Institute of Technology
Abstract
Many previous studies have shown that predictable words are
read faster and lead to reduced neural activation, consistent with a
model of reading in which words are activated in advance of being
encountered. The nature of such preactivation, however, has typically been studied indirectly through its subsequent effect on
word recognition. Here, we use magnetoencephalography
to study the dynamics of prediction within serially presented
adjective–noun phrases, beginning at the point at which the predictive information is first available to the reader. Using corpus
transitional probability to estimate the predictability of a noun,
we found an increase in activity in the left middle temporal gyrus
in response to the presentation of highly predictive adjectives
(i.e., adjectives that license a strong noun prediction). Moreover,
we found that adjective predictivity and expected noun frequency
interacted, such that the response to the highly predictive adjectives (e.g., stainless) was modulated by the frequency of the expected noun (steel). These results likely reflect preactivation of
nouns in highly predictive contexts. The fact that the preactivation
process was modulated by the frequency of the predicted item is
argued to provide support for a frequency-sensitive lexicon.
INTRODUCTION
Top–down predictive processing is one of the fundamental principles of brain function (Bar, 2007). Using
prior knowledge and contextual information, higher-order areas communicate expectations to lower areas,
which then compare the received input to the predicted
input. Language processing is no exception to this rule.
For example, listeners move their eyes to items that
are predictable from context, before the item itself
has been named (Altmann & Kamide, 1999; Kamide,
Altmann, & Haywood, 2003). Likewise, predictable words
are read more quickly (Ehrlich & Rayner, 1981) and elicit
reduced neural signals, most commonly observed
as a reduction in the N400 ERP component (Kutas &
Hillyard, 1984).
Predictability effects have been taken as support for
the notion that likely upcoming words are at least partly
preactivated in advance of being encountered (Kutas &
Federmeier, 2000), although some have argued for an alternative explanation in which these effects stem from
the increased ease of integrating predictable words into
the preceding context (Brown & Hagoort, 1993; Norris,
1986; see Lau, Phillips, & Poeppel, 2008, for a review of
the different interpretations of the N400 response). Recent empirical support for the preactivation account
comes from an experiment showing a modulation in
N400 effects for the English indefinite articles a and
an, based on whether the context licensed a prediction
for a noun that agreed with the article (e.g., The
day was breezy so the boy went outside to fly a [kite]/
*an [airplane]; DeLong, Urbach, & Kutas, 2005). Because both indefinite articles should be equally easy to
integrate into the semantic context of the sentence, the
only plausible explanation for the N400 effect in this case
is that participants were preactivating the representation of the upcoming noun. Similar responses have been
reported for semantically vacuous agreement features
in other European languages (Van Berkum, Brown,
Zwitserlood, Kooijman, & Hagoort, 2005; Wicha, Moreno,
& Kutas, 2003).
A separate line of evidence for lexical preactivation
comes from predictability effects in early sensory responses (Kim & Lai, 2012; Dikker & Pylkkänen, 2011;
Dikker, Rabagliati, & Pylkkänen, 2009). For example,
Dikker and Pylkkänen (2011) presented participants with
pictures, followed by words that either did or did not
match the presented image. Some of the pictures were
predictive of a specific word (e.g., a picture of an apple),
and some were not, instead denoting a larger semantic
category from which a single predictable word could
not be isolated (e.g., a picture of a shopping bag full of
groceries). When the strong prediction for the word
apple generated by the presentation of a picture of an
apple was violated, there was an increase in the M100,
a magnetoencephalography (MEG) signal generated in
visual cortex around 100 msec after stimulus presentation.
Importantly, the contexts that did not afford a specific
lexical prediction did not elicit a similar violation response.
These findings point to a top–down modulation of visual
cortex activity by lexical expectations generated in
language regions (Dikker et al., 2009). In summary,
predictability effects for semantically vacuous words, on
the one hand, and top–down modulatory effects in
sensory cortex, on the other hand, both provide support
for the lexical preactivation account of predictability
effects.
Journal of Cognitive Neuroscience 27:10, pp. 1912–1935
doi:10.1162/jocn_a_00822
A popular model for these top–down predictability effects proposes that prediction arises directly from the organization of neurons in the cortex. Predictive coding
theories propose that cortical regions contain two types
of neuron populations: “expectation” neurons, which encode the representations, and “surprise” neurons, which
encode the mismatch between the predicted representation and the bottom–up input (Friston, 2005). The
predictive coding model predicts that
anticipatory processing of a stimulus should elicit neural
activity in some of the same regions that are active when
that same stimulus is perceived. This prediction has
been increasingly supported by recent evidence (Kok,
Failing, & de Lange, 2014; Egner, Monti, & Summerfield,
2010). For example, seeing an image of a face and anticipating seeing a face both activate the fusiform face
area (Egner et al., 2010). One might therefore expect
that anticipation of a lexical stimulus would be detectable in the same regions involved in lexical processing
more generally. Indeed, a follow-up analysis of the
picture–noun data set described above showed that MEG
activity in temporal and occipital cortex was increased
in the presence of a specific prediction (picture of an
apple) before the presentation of the word (Dikker &
Pylkkänen, 2013).
Predictive coding models make the additional prediction that lexical preactivation in predictive contexts will
reflect the identity of the individual item being predicted.
This has been demonstrated for predictive processing in
earlier sensory areas. Kok, Jehee, and de Lange (2012),
for example, showed that the representation of a predicted element was “sharpened,” in that it became easier to
decode its identity from activity in visual cortex. More recently, Kok et al. (2014) showed that patterns of activity
in early visual cortex evoked by expected (but not presented) stimuli had similar feature specificity to those
evoked by the stimulus itself.
Design
The goal of the current experiment is to find direct evidence for the preactivation of particular lexical items.
One challenge in tackling this question lies in the fact
that it is difficult to pinpoint the exact moment at which
a linguistic prediction is generated. Prediction has typically been studied by varying the predictability of the last
word of a sentence (Kutas & Hillyard, 1984, and many
others). An issue with using this paradigm for studying
preactivation is that a prediction for the last word of a
sentence likely arises gradually as more and more information about the sentence is accumulated, rather than
being generated in its entirety immediately before the
last word of the sentence. For example, in the sentence
he loosened the tie around his neck, much of the information that enables a reader to predict the word neck is
likely to already be available after the word tie. This complication makes it difficult to temporally isolate a preactivation signal.
We departed from this classic paradigm in two respects. First, following Bemis and Pylkkänen (2011), we
used simple adjective–noun phrases, such as stainless
steel, as opposed to full sentences. Participants read the
phrases while their neural activity was recorded using
MEG. To ensure that they were fully engaged with the
materials, they made a lexical decision on the second
word of the phrase (the noun). While reading isolated
phrases is undoubtedly less natural than reading full sentences, this paradigm has several advantages for our purposes. First, the sources of information used to generate
the predictions are more limited, giving us better control
over the nature of the prediction signal. For example,
syntactic structure is kept constant across all stimuli, minimizing variation in syntactic predictions, which can affect
neural responses (Linzen, Marantz, & Pylkkänen, 2013).
Second, because each of the items is much shorter, we
can include significantly more items than in a sentential
paradigm. Finally, and most crucially, this paradigm allowed us to achieve precise control over the point in
time at which a prediction can be generated: in the
phrase stainless steel, a specific lexical prediction can
be generated immediately at stainless. Using MEG, we
can then measure neural activity before the word steel,
giving us a direct measure of a prediction signal, rather
than an indirect error response.
A second way in which our paradigm differs from classic N400 experiments is in the way predictability was
operationalized. Most studies of predictability have
estimated it using the cloze procedure (Taylor, 1953), a
pretest in which native speakers read a sentence with a
missing final word and are instructed to fill in the blank.
The cloze probability of a word is defined as the proportion of participants who completed the sentence using
that word. For example, if 97% of participants complete
the sentence he loosened the tie around his… with the
word neck, the cloze probability of the word neck would
be .97. By contrast, here we operationalized predictability
using corpus transitional probability (TP), that is, the
probability of encountering a second word w2 given that
a first word w1 has been encountered (P(w2|w1)), estimated from frequencies in the Corpus of Contemporary
American English (Davies, 2009). For stainless steel, for
example, TP is calculated as the number of times stainless steel appeared in the corpus divided by the total
number of times stainless appeared in the corpus
(McDonald & Shillcock, 2003). One advantage of TP over
cloze probability is that, whereas the resolution of cloze probability is limited by the number of respondents (with 100
participants, the lowest possible nonzero cloze probability is .01),
TP does not have this limitation, which makes it possible
to study differences between items with fairly low predictability, for example, TP = .05, and very low predictability, for example, TP = .005 (Smith & Levy, 2013).
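The TP definition above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name and the toy token list are hypothetical, and a real estimate would use counts from the Corpus of Contemporary American English rather than a handful of invented tokens.

```python
from collections import Counter


def transitional_probability(tokens, w1, w2):
    """Estimate P(w2 | w1) as count(w1 w2) / count(w1) from a list of tokens."""
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    if unigram_counts[w1] == 0:
        return 0.0  # w1 never occurs, so no estimate is available
    return bigram_counts[(w1, w2)] / unigram_counts[w1]


# Toy corpus, invented for illustration only.
toy_corpus = (
    "stainless steel sink and stainless steel pan and stainless finish".split()
)

# "stainless" occurs 3 times and is followed by "steel" twice.
tp = transitional_probability(toy_corpus, "stainless", "steel")
```

Because the estimate is a ratio of corpus counts, its resolution is limited only by corpus size, which is what allows very low predictability values to be distinguished.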
To recapitulate, our paradigm allowed us to characterize the nature of the signal generated by the expectation
of a noun. We quantified the degree to which an adjective evokes an expectation for a noun using corpus TP
from the adjective to the noun.
Anatomical ROI
As mentioned above, predictive coding models suggest
that expectation of a stimulus involves neural activity in
the same region that processes that stimulus when it is
encountered. Preactivation of a specific lexical item is
therefore likely to occur in the areas that are involved
in lexical access more generally. Consequently, we focused our analysis on the left middle temporal gyrus
(MTG), a cortical region thought to play a central role
in lexical access (Friederici, 2012; Hickok & Poeppel,
2007; Rodd, Davis, & Johnsrude, 2005; Indefrey & Levelt,
2004; Binder et al., 1997). This area has also recently
been implicated in generating expectations from linguistic stimuli and matching them against perceptual stimuli
(Francken, Kok, Hagoort, & De Lange, 2015).
Word Frequency
The preactivation of a word is likely to involve some of
the same processes that are engaged when the word is
accessed in other circumstances. One of the most reliable predictors of ease of lexical access is word frequency. Frequent words are processed faster in lexical
decision experiments (Whaley, 1978; Rubenstein,
Garfield, & Millikan, 1970) and during natural reading
(Inhoff & Rayner, 1986). EEG experiments have found
that the amplitude of the N400 is reduced for frequent
words (Van Petten & Kutas, 1990; Smith & Halgren,
1987). Importantly, within the MEG literature, frequency
effects have been found in the left MTG during the time
window of the M350, the evoked response thought to be
associated with lexical access (Solomyak & Marantz, 2010;
Embick, Hackl, Schaeffer, Kelepir, & Marantz, 2001). This
body of evidence leads us to predict that the preactivation of an infrequent word should be more effortful than
the preactivation of a frequent one. Concretely, when an
adjective is predictive of a specific noun, we expect the
frequency of the expected noun to modulate MTG activity before the presentation of the noun. For example,
upon recognition of the adjective stainless, we expect
participants to preactivate the linguistic representation
of the likely continuation steel; we therefore expect to
see concomitant effects of the frequency of steel associated with this preactivation in the MTG. More generally,
we expect to see an interaction in the MTG between adjective predictivity and the frequency of the expected
noun continuation, such that as adjective predictivity in1914
Journal of Cognitive Neuroscience
creases, we are more likely to observe effects of the frequency of the most likely noun continuation. This should
occur in the time window subsequent to recognition of
the adjective, but before the presentation of the noun.
After the presentation of the noun, we expect that frequency effects will be reduced for the more predictable
items, consistent with EEG experiments that have shown
that N400 frequency effects are only significant for words
that appear earlier in a sentence (Van Petten & Kutas,
1990) or that are less predictable from context (Dambacher,
Kliegl, Hofmann, & Jacobs, 2006).
METHODS
Materials
We first define the variables that we calculated for each
phrase and then describe how the phrases were selected
(partially based on those variables).
Lexical Variables
We illustrate the calculation of the lexical variables using
the phrase economic reform. The most likely continuation of economic is not reform, but growth. In this case,
we say that the expected noun is growth, and the presented noun is reform. We define the following variables:
• Adjective frequency: freq(economic)
• Adjective predictivity: TP from the adjective to its
most likely noun continuation: P(growth|economic)
• Expected noun frequency: the frequency of the adjective’s most likely noun continuation: freq(growth)
• Presented noun frequency: the frequency of the noun
that was actually presented: freq(reform)
• Presented noun predictability: TP from the adjective
to the presented noun: P(reform|economic)
Focusing on a single expected noun is clearly a simplification; most adjectives license more than one prediction. After reading the adjective economic, for example,
participants may well predict both growth and reform.
These continuations would likely be preactivated in proportion to their conditional probability (Smith & Levy,
2013; DeLong et al., 2005): Following recognition of economic, the noun growth may be activated to a greater
extent than reform.1 To capture this intuition, we defined
a generalization of expected noun frequency that we
term weighted expected noun frequency. This variable
is a weighted average of the frequencies of the adjective’s
continuations, where the weights are given by the TPs of
the continuations. We only considered noun continuations within phrases that met our minimum frequency requirement (i.e., 50 tokens in the corpus, corresponding
to a probability of 1 in 8 million). Consequently, some of
the conditional probability mass for each adjective was
not assigned to any noun; we assigned this probability
to a generic noun that had the average frequency of
all nouns in the corpus. As an illustration, in the case
of economic (assuming that there are only two suprathreshold noun continuations), the calculation of this variable would be given by
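\[
\begin{aligned}
\mathrm{WF}(\text{economic}) ={} & P(\text{reform} \mid \text{economic}) \cdot \mathrm{freq}(\text{reform}) \\
& + P(\text{growth} \mid \text{economic}) \cdot \mathrm{freq}(\text{growth}) \\
& + \bigl(1 - P(\text{reform} \mid \text{economic}) - P(\text{growth} \mid \text{economic})\bigr) \cdot \mathrm{avgNounFreq}
\end{aligned}
\]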
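As a rough illustration of the weighted expected noun frequency defined above, the following Python sketch computes the TP-weighted average and assigns the leftover probability mass to a generic noun. The TPs for economic growth (.05) and economic reform (.01) come from the text; the frequency values and the average noun frequency are invented placeholders, and whether raw or transformed frequencies enter the average is an assumption here.

```python
def weighted_expected_noun_frequency(continuations, avg_noun_freq):
    """TP-weighted average of continuation frequencies.

    continuations maps each suprathreshold noun to a (tp, freq) pair.
    Probability mass not covered by those nouns is assigned to a generic
    noun whose frequency is avg_noun_freq.
    """
    assigned_mass = sum(tp for tp, _ in continuations.values())
    weighted_sum = sum(tp * freq for tp, freq in continuations.values())
    return weighted_sum + (1.0 - assigned_mass) * avg_noun_freq


# TPs are taken from the text; the frequencies are hypothetical.
economic = {
    "growth": (0.05, 40000.0),
    "reform": (0.01, 20000.0),
}
wf_economic = weighted_expected_noun_frequency(economic, avg_noun_freq=10000.0)
# 0.05 * 40000 + 0.01 * 20000 + 0.94 * 10000 = 11600
```

With weakly predictive adjectives most of the mass falls on the generic noun, so the variable stays close to the corpus average; only strongly predictive adjectives pull it toward the frequency of a specific noun.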
The shape of frequency effects has long been known
to be approximately logarithmic (Whaley, 1978), and
there is increasing evidence that this is the case for predictability effects as well (Smith & Levy, 2013). We therefore log-transformed all frequency and predictability
variables before entering them into our statistical models.
Selection Criteria
A set of 474 adjective–noun phrases was obtained as follows. We first selected all sequences of two words from
the Corpus of Contemporary American English (Davies,
2009) that satisfied the following conditions:
(1) The first word was tagged as an adjective at least
90% of the time, according to the automatic part-of-speech tagging included with the corpus.
(2) The second word was tagged as a noun at least 90%
of the time.
(3) The sequence occurred at least 50 times in the corpus, which contains ∼400 million tokens; this corresponds to a probability of approximately
1 in 8 million.
(4) The length of both words was between three and
nine characters.
(5) All nouns had an accuracy of at least 75% in the lexical decision data in the English Lexicon Project (this
criterion was implemented to ensure that participants were likely to be familiar with the words).
Many phrases contained the same adjectives or nouns
as other phrases in the selection (e.g., high table and
high chair, or black chair and high chair). Whenever
this was the case, we only kept the phrase in which the
noun was most predictable. Because phrases with highly
predictable nouns are relatively rare, this procedure maximized our coverage of the predictability range. More
specifically, we first grouped the phrases by noun (e.g.,
black chair and high chair) and excluded all but the
most predictable items; we then grouped the remaining
phrases by adjective (e.g., high table and high chair) and
again excluded all but the most predictable items. This
process yielded a candidate set of phrases, each composed of a unique adjective and unique noun.
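The two-pass filtering described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the helper name and the tuple layout are hypothetical, while the three phrases and their TP values are the ones quoted in the text.

```python
def keep_most_predictable(phrases, key_index):
    """Keep, for each group, only the phrase with the highest TP.

    phrases is a list of (adjective, noun, tp) tuples; key_index selects
    the grouping field (0 = adjective, 1 = noun).
    """
    best = {}
    for phrase in phrases:
        key = phrase[key_index]
        if key not in best or phrase[2] > best[key][2]:
            best[key] = phrase
    return list(best.values())


candidates = [
    ("economic", "growth", 0.05),
    ("economic", "reform", 0.01),
    ("rapid", "growth", 0.08),
]

# Pass 1: group by noun; "rapid growth" beats "economic growth".
by_noun = keep_most_predictable(candidates, key_index=1)
# Pass 2: group the survivors by adjective.
unique_phrases = keep_most_predictable(by_noun, key_index=0)
```

Running the noun pass first is what leaves economic paired with reform: economic growth has already been removed in favor of rapid growth before the adjective pass is applied.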
A side effect of this procedure was that many phrases,
particularly towards the lower end of the TP range, contained nouns that were not the most predictable ones
given the adjective. For example, the phrase economic
reform, which has TP = .01, was included even though
a phrase with the same adjective, economic growth, had
higher TP (.05). This was done because growth occurs in
rapid growth, which has even higher TP (.08). In our final set, the noun was the most expected continuation of
the adjective in 51% of the phrases (242 of 474); in the
top quartile of adjective predictivity (i.e., adjectives that
had a noun continuation with TP > .10), this proportion
was 77% (90 of 117). Because the phrase used when the
most predicted noun was not available typically had the
second highest TP among all phrases that included the adjective, the order of magnitude of the TP of the selected
phrase was usually similar to that of the phrase that was
excluded (median ratio of highest TP to selected TP: 2.63).
Given the candidate set of phrases, we excluded items
that were clearly part of a longer phrase (e.g., congestive
heart, which always appears in the context congestive
heart failure) and items that are usually capitalized,
which tend to be names of places or works of art (e.g.,
Purple Haze). Finally, we asked six undergraduate students to rate the phrases for familiarity and excluded
phrases that five of six raters rated as unfamiliar (e.g.,
logistic regression). The final list contained 474 phrases,
which are listed in the Appendix, along with their associated TP values. After sorting by TP, the noun in every
other phrase was replaced with a pronounceable nonword (e.g., academic dusporate), obtained using Wuggy
(Keuleers & Brysbaert, 2010).
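The alternating assignment of phrases to the Word and Nonword conditions can be sketched as below. The sort direction and which member of each pair keeps its real noun are not stated in the text, so both are assumptions of this sketch; the phrases are placeholders, and the nonword generation step (done with Wuggy in the study) is not reproduced.

```python
def assign_conditions(phrases):
    """Sort phrases by TP and alternate Word / Nonword labels.

    phrases is a list of (adjective, noun, tp) tuples. Descending order and
    starting with "word" are assumptions made for this illustration.
    """
    ranked = sorted(phrases, key=lambda phrase: phrase[2], reverse=True)
    labeled = []
    for rank, (adjective, noun, tp) in enumerate(ranked):
        condition = "word" if rank % 2 == 0 else "nonword"
        labeled.append((adjective, noun, tp, condition))
    return labeled


# Placeholder phrases with invented TP values.
toy_phrases = [
    ("adj_a", "noun_a", 0.10),
    ("adj_b", "noun_b", 0.40),
    ("adj_c", "noun_c", 0.01),
    ("adj_d", "noun_d", 0.05),
]
labeled_phrases = assign_conditions(toy_phrases)
```

Alternating over the TP-sorted list keeps the distribution of adjective-to-noun TP closely matched between the two conditions.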
Table 1 lists the descriptive statistics for key stimulus
variables.

Table 1. Descriptive Statistics for Stimulus Variables
                               Min     Max     Median   Mean    SD
Adjective Frequency (log)      5.05    12.88   8.59     8.69    1.43
Noun Frequency (log)           5.91    12.68   9.77     9.73    1.27
Phrase Frequency (log)         3.93    10.77   4.98     5.23    1.05
Adjective Predictivity (log)   −5.36   −0.17   −3.02    −2.91   1.00
Noun Predictability (log)      −7.9    −0.17   −3.49    −3.45   1.37

Noun frequency and noun predictability (TP
between adjective and noun) were correlated (r = .34),
as were adjective frequency and noun predictability (r =
−.71). Adjective frequency was also correlated with
adjective predictivity (r = −.59). None of the variables
were strongly correlated with adjective length or noun
length (all r < .3). The high correlation between adjective frequency and the two predictability measures is
due to the fact that these quantities are mathematically
related:
\[
\log(\mathrm{TP}) = \log\bigl(\mathrm{freq}(\text{phrase}) / \mathrm{freq}(\text{adj})\bigr) = \log(\mathrm{freq}(\text{phrase})) - \log(\mathrm{freq}(\text{adj}))
\]
As mentioned above, we only selected phrases that appeared with a frequency of at least 1 per 8 million to
eliminate implausible or ungrammatical phrases. This entails that log(TP) and log(freq(adj)) must sum to at least
3.9, and therefore, a phrase cannot simultaneously have
log(TP) = −6 and log(freq(adj)) = 7. Note that because
log(freq(phrase)) is always positive, many combinations
of values for log(TP) and log(freq(adj)) would still be
impossible even if the frequency threshold for phrases
were lifted.
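The constraint described above can be checked with a few lines of Python. The sketch assumes natural logarithms, which is consistent with the threshold of 3.9 quoted in the text (the natural log of 50 is about 3.91); the function name is hypothetical.

```python
import math

MIN_PHRASE_COUNT = 50  # minimum phrase frequency stated in the text


def is_attainable(log_tp, log_adj_freq, min_count=MIN_PHRASE_COUNT):
    """Check whether a (log TP, log adjective frequency) pair is possible.

    Since log(TP) = log(freq(phrase)) - log(freq(adj)), the two values sum
    to log(freq(phrase)), which must be at least log(min_count).
    """
    return log_tp + log_adj_freq >= math.log(min_count)


threshold = math.log(MIN_PHRASE_COUNT)    # about 3.91
example_from_text = is_attainable(-6, 7)  # sum is 1, below the threshold
plausible_pair = is_attainable(-3, 9)     # sum is 6, above the threshold
```

The excluded region is the lower-left corner of the (log adjective frequency, log TP) plane, which is why rare adjectives in the stimulus set necessarily have high TP values.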
Participants
Sixteen participants (nine women) from New York City participated in the experiment. All participants provided informed consent and were paid for their participation.
Participants ranged in age from 19 to 45 years (median =
25.5 years). All participants were right-handed (assessed
using the Edinburgh Handedness Inventory; Oldfield,
1971) and were native speakers of English with normal
or corrected-to-normal vision.
Procedure
The experiment was conducted in the KIT/NYU facility at
New York University. Before recording, the head shape
of each participant was digitized to allow source localization and coregistration with structural MRIs. We also digitized three fiducial points (the nasion and the left and
right preauricular points) and the position of five coils,
placed around the participant’s face. Once the participant
was situated in the magnetically shielded room for the
experiment, the position of these coils was localized with
respect to the MEG sensors, allowing us to assess the position of the participant’s head for source reconstruction.
Data were recorded continuously with a 157-channel
axial gradiometer (Kanazawa Institute of Technology,
Kanazawa, Japan). Structural MRIs were obtained for 15
of the 16 participants; the MEG data from one participant
were thus eliminated from analysis because of failure to
obtain a structural MRI.
Before the experiment, participants were not given any
indication of the goal of the experiment or the properties
of the materials. The exact instructions were as follows:
“You will read two letter strings on the screen, one at a
time. If the second string is a real English word, respond
with your index finger. If it is not, respond with your middle finger.” Each participant saw all 474 items. The order
of presentation was randomized for each participant individually. The assignment of items to conditions was fixed
across participants; in other words, the same nouns were
replaced with nonwords for all participants (see Materials
for details). A given adjective was always presented with
the same noun or nonword; for example, stainless was
followed by steel, and uncharted was followed by the
nonword cothenent (which replaced the predicted continuation territory) for all participants.
Stimuli were presented using the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) and projected onto a
screen approximately 50 cm away from the participant.
They were presented in white 30-point Courier font on
a gray background. The structure of each trial was as follows. First, a fixation cross was presented in the center of
the screen for 300 msec, followed by a blank screen presented for 300 msec. The adjective was then presented
for 300 msec, followed again by a blank screen presented
for 300 msec. Finally, the noun (or nonword) was presented for 300 msec, and participants responded to the
latter stimulus by pressing a button.
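The trial structure can be laid out as a simple timeline. The sketch below uses the durations given in the text; the event labels and the function name are hypothetical, and response collection is not modeled.

```python
# (event label, duration in msec), in presentation order.
TRIAL_EVENTS = [
    ("fixation", 300),
    ("blank_1", 300),
    ("adjective", 300),
    ("blank_2", 300),
    ("noun_or_nonword", 300),
]


def event_onsets(events):
    """Return a dict mapping each event label to its onset (msec from trial start)."""
    onsets = {}
    elapsed = 0
    for label, duration in events:
        onsets[label] = elapsed
        elapsed += duration
    return onsets


onsets = event_onsets(TRIAL_EVENTS)
# Interval between adjective onset and noun onset.
adjective_to_noun_soa = onsets["noun_or_nonword"] - onsets["adjective"]
```

The resulting 600-msec interval between adjective onset and noun onset is the span within which any prestimulus prediction signal has to be measured.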
Preprocessing
The preprocessing and analysis of the MEG data closely
followed the procedures of Solomyak and Marantz (2009,
2010). Environmental noise was removed from the data
by regressing signals recorded from three orthogonally
oriented magnetometers, placed approximately 20 cm
away from the recording array, against the recorded data,
using the continuously adjusted least squares method
(Adachi, Shimogawara, Higuchi, Haruta, & Ochiai,
2001). The data were then low-pass filtered at 40 Hz, resampled to 250 Hz to facilitate analysis, and high-pass filtered at 0.1 Hz. MEG channels in which there was no
signal or excessive amounts of noise were interpolated
from neighboring channels or rejected (at most three
per participant). Trials in which at least one channel
showed a peak-to-peak amplitude exceeding 4000 fT
were rejected, as these amplitude values are likely to reflect blinks and noise artifacts (the number of rejected
trials ranged from 39 to 112, mean = 77.1, median =
77; the minimum number of trials analyzed for a given
participant was 362). None of the participants were excluded because of excessive trial rejections.
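The trial rejection criterion can be expressed as a short check. This is a sketch assuming each trial is stored as a list of channels, each a list of samples in femtotesla; the data layout, the names, and the toy values are hypothetical and do not reflect the actual analysis code.

```python
PEAK_TO_PEAK_LIMIT_FT = 4000.0  # rejection threshold stated in the text


def exceeds_peak_to_peak(trial, limit=PEAK_TO_PEAK_LIMIT_FT):
    """Return True if any channel's max minus min exceeds the limit.

    trial is a list of channels; each channel is a list of samples in fT.
    """
    return any(max(channel) - min(channel) > limit for channel in trial)


# Toy trials with two channels each (invented values).
clean_trial = [[-500.0, 200.0, 900.0], [100.0, -300.0, 50.0]]
blink_trial = [[-500.0, 200.0, 900.0], [-2500.0, 1800.0, 0.0]]

trials = [clean_trial, blink_trial]
kept_trials = [trial for trial in trials if not exceeds_peak_to_peak(trial)]
```

A single offending channel is enough to discard the whole trial, which matches the "at least one channel" wording of the criterion.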
The MNE software package (Martinos Center MGH,
Boston, MA) was used to estimate neuroelectric current
strength based on the recorded magnetic field strengths
using minimum l2 norm estimation (Dale & Sereno, 1993;
Hämäläinen, Hari, Ilmoniemi, Knuutila, & Lounasmaa,
1993). Current sources were modeled as three orthogonal
dipoles spaced approximately 5 mm apart across the
cortical surface (Dale et al., 2000), yielding approximately
2500 potential electrical sources per hemisphere. The
participants’ cortical surfaces were reconstructed based on
their structural MRIs using FreeSurfer (Martinos Center).
The neuromagnetic data were coregistered with the
structural MRIs using MNE by first aligning the fiducial
points and then using an Iterative Closest Point algorithm
to minimize the difference between the scalp and the
points defining the head shape of each participant.
The forward solution was calculated for each source
using a single-layer boundary element model based on
the inner skull boundary. Noise covariance estimates
were obtained from a 200-msec baseline period before
the presentation of each adjective. Using the grand average of all trials across conditions (i.e., both Word and
Nonword trials), the inverse solution was computed to
determine the most likely distribution of neural activity.
We utilized a free orientation analysis, in which the
source orientations were unconstrained with respect to
the cortical surface. The resulting source estimates were
signed with a positive sign indicating an upward directionality and a negative sign indicating a downward directionality in the coordinate space defined by the head.
The estimated activation was normalized into a test statistic by dividing the estimates by their predicted standard
error given the noise covariance, yielding signed dynamic
statistical parametric maps (dSPMs; Dale et al., 2000).
The signal-to-noise ratio parameter, which controls the
regularization of the estimates, was set to 1.
Regions and Time Windows of Interest
Main Analysis
ROIs were defined anatomically using the cortical parcellation performed by FreeSurfer on the basis of the
Desikan–Killiany gyral atlas (Desikan et al., 2006). We
selected the left MTG anatomical ROI (Figure 3A later
in the paper) for the purposes of our main analysis. In
this and other temporal lobe labels, the Desikan–Killiany
atlas includes the gyrus along with the banks of the surrounding sulci. We defined three time windows of interest: adjective lexical access, presented noun lexical
access, and preactivation of the expected noun (Figure 3B). The time windows were defined on the basis of
the peaks of the M350 evoked response. This component has been argued to be associated with lexical access
(Pylkkänen & Marantz, 2003) and has demonstrated sensitivity to lexical variables such as frequency (Embick
et al., 2001). Lexical access of the adjective was assessed
in a time window starting 100 msec before the peak of
the M350 response evoked by the presentation of the
adjective and ending 100 msec after the peak (concretely,
242–442 msec after adjective presentation). Likewise,
lexical access of the presented noun was assessed in a
200-msec time window centered around the peak of the
M350 response to the noun, which was slightly earlier than
the response to the adjective (197–397 msec post-noun onset). Finally, we made the simplifying assumption that effects
of lexical preactivation would be most evident after
lexical access of the adjective was complete. Consequently,
the preactivation time window started at the end of the adjective lexical access time window. To avoid including activity evoked by the presented noun, this time window ended
at the presentation of the noun (concretely, the time window extended from 442 to 600 msec post-adjective onset)
and was therefore somewhat shorter than the two other
time windows. All three time windows are illustrated in
Figure 3B.
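The three time windows can be derived with a few lines of Python. The peak latencies below (342 and 297 msec) are not stated directly in the text; they are inferred here as the midpoints of the reported windows, and the names are hypothetical.

```python
HALF_WIDTH_MS = 100

# Inferred as midpoints of the reported windows, not quoted from the text.
ADJECTIVE_M350_PEAK_MS = 342  # relative to adjective onset
NOUN_M350_PEAK_MS = 297       # relative to noun onset

# Noun onset relative to adjective onset (300-msec word plus 300-msec blank).
NOUN_ONSET_MS = 600


def window_around_peak(peak_ms, half_width_ms=HALF_WIDTH_MS):
    """Return (start, end) of a window centered on an evoked-response peak."""
    return (peak_ms - half_width_ms, peak_ms + half_width_ms)


adjective_window = window_around_peak(ADJECTIVE_M350_PEAK_MS)  # post-adjective
noun_window = window_around_peak(NOUN_M350_PEAK_MS)            # post-noun
# Preactivation: from the end of adjective lexical access to noun onset.
preactivation_window = (adjective_window[1], NOUN_ONSET_MS)
```

Defined this way, the preactivation window spans 158 msec, which is why it is somewhat shorter than the two 200-msec lexical access windows.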
Figure 3B also shows the average left MTG response to
the adjective for a four-way split of adjective predictivity
as well as the response to the noun for a median split of
presented noun predictability. Time-varying correlations
(Figure 3C–E) were generated using sliding 50-msec windows centered at [25, 75, …, 575] msec post-adjective
onset or post-noun onset.
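The sliding windows used for the time-varying correlations can be enumerated as follows. The sketch assumes adjacent, non-overlapping windows covering 0 to 600 msec, which is what the listed centers imply; the function name and the tuple layout are hypothetical.

```python
def sliding_windows(width_ms, end_ms=600):
    """Return (center, start, end) for adjacent windows covering [0, end_ms)."""
    return [
        (start + width_ms // 2, start, start + width_ms)
        for start in range(0, end_ms, width_ms)
    ]


windows_50ms = sliding_windows(50)
centers_50ms = [center for center, _, _ in windows_50ms]
# centers_50ms runs 25, 75, ..., 575 (12 windows)
```

Each window would then supply one averaged activity value per trial, so a correlation with a stimulus variable can be computed separately at each center.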
Supplemental Analyses
We conducted post hoc analyses for two additional anatomical ROIs, which roughly corresponded to significant
peaks of activity in the evoked response to the adjective
at 300–400 msec (Figure 1A): the left lateral OFC (LOFC;
Figure 4A later in the paper) and left superior temporal gyrus (STG) anatomical ROIs. Finally, because previous investigations of predictability showed early effects at the M100
(Dikker & Pylkkänen, 2011; Dikker et al., 2009), we conducted an analysis of early predictability effects in this study,
in which we examined the time window 100–200 msec
post-noun onset in the MTG; we also examined 10-msec
windows centered around the two candidate M100 peaks
in the left cuneus anatomical ROI, corresponding to the
location of visual cortex within the occipital lobe.
Exploratory Analysis of Left Hemisphere
Language Regions
In addition to the main confirmatory ROI analysis, we
conducted an exploratory analysis of a broad language
network, covering most of the sources located within lateral cortical regions in the left frontal and left temporal
lobes. Specifically, we pooled the sources within the following anatomical regions in the left hemisphere, as
specified in the Desikan et al. (2006) parcellation: STG,
transverse temporal gyrus, banks of the STS, MTG, inferior temporal gyrus, temporal pole, fusiform gyrus, insula,2
inferior frontal gyrus (pars triangularis, pars opercularis,
and pars orbitalis), and LOFC. The analysis was conducted for the average activity in a given source over
sliding 100-msec windows centered at [50, 150, …, 550]
msec post-adjective onset or post-noun onset. The resulting t maps for the variables of interest are shown in
Figures 1B–E and 2B–E; see the next section for details
on how those t values were obtained. The figures for the
evoked responses to the adjective (Figure 1A) and
the noun (Figure 2A) were generated by first morphing the grand-averaged activity (in dSPM units) for each participant into the neuroanatomical space of the average brain, followed by averaging across all participants; unlike the t maps, the evoked responses were calculated for all cortical sources within the left hemisphere.
Figure 1. Neural response to the adjective in left hemisphere language areas. (A) The grand-averaged evoked response to the adjective (in dSPM units). (B) The effect of adjective frequency. (C) The effect of adjective predictivity. (D) The effect of expected noun frequency for the items in the top half of adjective predictivity (controlling for adjective frequency and adjective predictivity). (E) The interaction between adjective predictivity and expected noun frequency (controlling for adjective frequency). In B–E, the t values represent the results of a second-level t test of the within-subject β-coefficients (described more fully in the Methods section). For all images, red and yellow indicate positively signed values, and blue indicates negatively signed values.
Figure 2. Neural response to the noun in left hemisphere language areas. (A) The grand-averaged evoked response to the noun (in dSPM units). (B) The effect of presented noun frequency. (C) The effect of presented noun predictability. (D) The effect of presented noun frequency for the items in the bottom half of presented noun predictability (controlling for presented noun predictability). (E) The interaction between presented noun predictability and presented noun frequency. In B–E, the t values represent the results of a second-level t test of the within-subject β-coefficients (described more fully in the Methods section). For all images, red and yellow indicate positively signed values, and blue indicates negatively signed values.
Journal of Cognitive Neuroscience
Volume 27, Number 10
Statistical Methodology
Behavioral Analysis
After excluding the Nonword trials, we performed a logarithmic transformation of the RTs for the Word trials, following standard practice. For each participant, we
excluded trials for which the log-transformed RTs were
more than 2.5 standard deviations away from the participant’s mean and trials in which the RT was less than
100 msec or more than 5000 msec. We used the lme4
package in R (Bates, Maechler, & Bolker, 2013) to fit a
linear mixed-effects model with crossed random effects
for participants and items. Traditional repeated-measures
designs account for “random” differences across participants that are irrelevant to the experimental manipulation and therefore enable generalization of results
beyond the specific group of participants used in the experiment. Just like participants, linguistic materials may
also differ from one another in many ways that are irrelevant to the experimental manipulation. Mixed-effects
models with crossed random effects extend the logic
of repeated measures to participants and items simultaneously and enable generalization beyond both the
sample of participants and the sample of items used in
the experiment (Baayen, Davidson, & Bates, 2008). This
model was used to predict log-transformed RTs from
presented noun predictability and presented noun frequency. We used a maximal random effects structure:
for items, only a random intercept, and for participants,
random slopes for presented noun predictability and
presented noun frequency and their interaction as well
as a random intercept. Predictors were centered before
being entered into the model. The reported p values
are derived from likelihood ratio tests in stepwise
regression.
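The trial exclusion and centering steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the toy data are hypothetical, the SD criterion is assumed to be computed on all of a participant's log RTs before any trial is removed, and the sample standard deviation is assumed. The mixed-effects model itself, which was fitted with lme4 in R, is not reproduced here.

```python
import math
from statistics import mean, stdev

def trim_rts(rts_ms, sd_cutoff=2.5, min_ms=100, max_ms=5000):
    """Return the RTs kept for one participant after both exclusion rules.

    Assumptions: the SD criterion uses all of the participant's log RTs,
    and the sample (n - 1) standard deviation.
    """
    logs = [math.log(rt) for rt in rts_ms]
    m, s = mean(logs), stdev(logs)
    kept = []
    for rt, log_rt in zip(rts_ms, logs):
        if abs(log_rt - m) > sd_cutoff * s:
            continue  # more than 2.5 SD from the participant's mean log RT
        if rt < min_ms or rt > max_ms:
            continue  # faster than 100 msec or slower than 5000 msec
        kept.append(rt)
    return kept

def center(values):
    """Subtract the mean so that the predictor has mean zero."""
    m = mean(values)
    return [v - m for v in values]

# Hypothetical RTs (msec) for one participant's Word trials.
rts = [650, 700, 720, 690, 710, 680, 730, 705, 695, 715, 90, 6000]
kept = trim_rts(rts)
log_rts = [math.log(rt) for rt in kept]

# Hypothetical predictability values for the retained trials, centered.
predictability = center([0.1, 0.4, 0.2, 0.9, 0.5, 0.3, 0.8, 0.6, 0.7, 0.5])
```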
MEG ROI Analyses
Linear mixed-effects models were fitted to the average activity in an ROI over a time window of interest, following
rejection of trials with activity at least 4 standard deviations away from the mean across all trials and all participants. The linguistic variables (e.g., frequency) were
entered into the models as fixed effects. A maximal random effects structure was used, with random intercepts
for participants and items, as well as a random slope for
the particular linguistic variable being tested in the model.
Predictors were centered before being entered into the
models. To obtain the p values for the main effects as well
as the significance of the stepwise regressions and interaction effects, likelihood ratio tests were employed for the
relevant nested linear mixed-effects models. In many
cases, multiple regression was used to address correlations between stimulus variables; for example, the interaction between adjective predictivity and expected noun
frequency was tested in a regression model that included
adjective frequency as well (i.e., adjective frequency was
controlled for).
We analyzed the adjective and the noun time windows
separately. All of the trials were included in the analyses
of the adjective time window, and only Word trials were
included in the analyses of the noun time window (i.e.,
Nonword trials were excluded).
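For readers unfamiliar with likelihood ratio tests on nested models, the following Python sketch shows the computation of the χ² statistic and its p value. It assumes that the two models differ by exactly one parameter (one degree of freedom), in which case the χ² survival function has a closed form; the function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math

def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (chi2, p). With one degree of freedom, the chi-square survival
    function equals erfc(sqrt(chi2 / 2)).
    """
    chi2 = 2.0 * (loglik_full - loglik_reduced)
    if chi2 < 0:
        raise ValueError("the full model should fit at least as well as the reduced model")
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Hypothetical log-likelihoods, chosen only so that chi2 is about 12.30.
chi2, p = likelihood_ratio_test_1df(loglik_reduced=-1006.15, loglik_full=-1000.0)
```

If the compared models differ by more than one parameter, the degrees of freedom change and the closed form above no longer applies.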
Exploratory MEG Analysis
Because it was not computationally feasible to conduct
the full mixed-effects analysis for each source individually,
we employed a summary statistic approach (Holmes &
Friston, 1998), as follows. For a given participant’s data,
at each source and during each time window, we computed the β-coefficient from a linear regression model
predicting the source activity (in dSPM units) as a function
of the linguistic variable of interest (e.g., frequency). We
then morphed the resulting maps of the β-coefficients
from each participant’s neuroanatomical space to the
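space of the average brain using seven iterative smoothing steps. The t maps were then obtained from a second-level t test of the within-subject β-coefficients across participants. Because the a priori selection of the MTG anatomical ROI allowed us to establish the significance of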
the effects in the main analysis, we did not correct the
resulting t maps for multiple comparisons across sources.
Consequently, the t maps should not be used to determine the significance of the effects, but rather to verify
their general spatial distribution.
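The summary statistic approach can be illustrated with a minimal Python sketch for a single source and time window: a regression slope is computed for each participant, and the slopes are then submitted to the second-level one-sample t test mentioned in the figure captions. The function names and the toy data are hypothetical, a single predictor is assumed, and the morphing step between the two levels is omitted.

```python
import math
from statistics import mean, stdev

def ols_slope(x, y):
    """First level: slope (beta) of a simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def second_level_t(betas):
    """Second level: one-sample t statistic of the betas against zero."""
    n = len(betas)
    return mean(betas) / (stdev(betas) / math.sqrt(n))

# Hypothetical data: one linguistic variable and the activity of one source
# for three participants (in arbitrary units).
frequency = [1.0, 2.0, 3.0, 4.0]
activity_by_participant = [
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 4.0, 7.0, 10.0],
    [0.0, 4.0, 8.0, 12.0],
]

betas = [ols_slope(frequency, y) for y in activity_by_participant]
t_value = second_level_t(betas)
```

Repeating this computation at every source and time window would yield a t map of the kind shown in the figures; as noted above, such maps are uncorrected for multiple comparisons.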
RESULTS
Behavioral Results
Accuracy ranged from 95% to 99% (median = 97.9%).
Mean RTs ranged from 491 to 1057 msec (median =
735 msec).
Both presented noun predictability and presented
noun frequency were inversely correlated with RTs (predictability: β = −.012; frequency: β = −.02). Both
variables were significant in stepwise regressions (predictability: p = .007; frequency: p = .002). The interaction between the variables did not reach significance
(β = −.0004, p = .88). Adding trial number to the regression model revealed that participants became significantly slower over the course of the experiment (β =
.0002, p = .002). This effect did not interact with either
predictability or frequency (predictability: p = .6; frequency: p = .8). There is therefore no evidence that participants modified their prediction strategy over the
course of the experiment. Finally, a logistic mixed-effects
model showed that accuracy was marginally higher on
high frequency nouns (β = .3, p = .09) and did not vary
based on predictability.
Discrepancy Trials
To see whether the discrepancy between the predicted
noun and the presented noun affected RTs, we conducted a separate analysis restricted to those Word trials
in which the presented noun was not the most predictable continuation for the given adjective (“discrepancy
trials”). The number of trials included in this analysis
was roughly a quarter of the total trials in the experiment,
because we eliminated all Nonword trials as well as Word
trials in which the presented noun was the most predictable continuation for the adjective; thus, the statistical
power for this particular analysis is lower than that of the other
analyses in this study.
As in the complete set of Word trials, higher predictability and presented noun frequency were associated
with shorter RTs (frequency: β = −.019, p = .004; predictability: β = −.02, p = .01), and there was no interaction between the two ( p = .93). There was a
marginal positive effect of adjective predictivity on RTs
(β = .016, p = .05), indicating that recognition of a noun
is slowed down by the presence of a conflicting prediction for a different noun.
MEG Results: Evoked Response
The grand-averaged evoked responses to the adjective
and the noun are shown from lateral perspectives
in Figures 1A and 2A, respectively. There is widespread
negative activity (shown in blue) within the left temporal
lobe, particularly at 300–400 msec post-adjective onset,
as well as a significant patch of positive activity (shown
in red and yellow) within the left inferior frontal cortex,
mostly overlapping with the LOFC ROI, at 300–400 msec
post-adjective onset.
Figure 3B displays the time course of average activity
within the left MTG, for the adjective and noun time windows. The second negative peak in the adjective time
window (i.e., the M350 response to the adjective) occurs
at a latency of roughly 350 msec post-adjective onset,
whereas the second negative peak in the noun time window (i.e., the M350 response to the noun) occurs at a
latency of roughly 300 msec post-noun onset.
MTG ROI Analysis
Main Effects
In the adjective lexical access time window, higher adjective frequency was associated with weaker activity in the left MTG (t = 2.94, p = .004; Figure 3C).3 Higher adjective predictivity was associated with stronger activity in the same region and time window (t = −3.74, p = .0007).4 Because of the high correlation between adjective frequency and adjective predictivity (r = −.59), we assessed the significance of each variable in a stepwise regression with the other variable. Using this procedure, only adjective predictivity remained significant (χ² = 5.96, p = .01).
In the presented noun lexical access time window, after exclusion of the Nonword trials, higher presented noun frequency was associated with weaker left MTG activity (t = 2.04, p = .05). Higher presented noun predictability led to significantly weaker activity in the left MTG in the same time window (t = 4.66, p = .0002). In a stepwise regression, only presented noun predictability remained significant (χ² = 12.30, p = .0005).
In summary, adjectives that license a relatively strong prediction evoked increased left MTG activity in the adjective lexical access time window; less predictable nouns evoked increased activity in the presented noun lexical access time window.
Interaction Effects
In the preactivation time window, adjective predictivity and expected noun frequency interacted in the left MTG (χ² = 5.73, p = .02, controlling for adjective frequency; see Figure 3D). A median split on adjective predictivity (Figure 3E) indicated that this interaction was driven by the fact that higher expected noun frequency led to significantly weaker activity, but only for the items in the top half of adjective predictivity (t = 2.43, p = .02, controlling for both adjective frequency and adjective predictivity). Adding weighted expected noun frequency to the model increased the fit somewhat, although the difference did not reach significance (χ² = 2.29, p = .13). The items in the bottom half of adjective predictivity showed no effect of expected noun frequency (t = −1.08, p = .28, controlling for adjective frequency and adjective predictivity).
In the presented noun lexical access time window, there was a significant interaction between presented noun predictability and presented noun frequency in the left MTG (χ² = 6.51, p = .01; see Figure 3D). A median split on presented noun predictability (Figure 3E) indicated that this interaction was driven by significantly weaker activity in response to higher-frequency presented nouns, but only for the items in the bottom half of presented noun predictability (t = 3.05, p = .006, controlling for presented noun predictability). The items in the top half of presented noun predictability showed no effect of presented noun frequency (t = −0.94, p = .35, controlling for presented noun predictability).
Discrepancy Trials
We again conducted a separate analysis of Word trials in
which the presented noun was not the most expected
one. In the presented noun lexical access time window, there was no effect of adjective predictivity (t = −0.95, p = .35, controlling for presented noun predictability). We repeated the analysis in the later time window 300–500 msec, which we selected post hoc to more accurately capture the peak of the presented noun predictability effect. In this time window, higher adjective predictivity was associated with greater MTG activity (t = −2.00, p = .05, controlling for presented noun predictability); this effect did not reach significance, however, when presented noun frequency was included in the model as well (t = −1.60, p = .11). There is therefore some neural evidence for an opposing effect of a violated strong prediction, relative to the effect of presented noun predictability.
Figure 3. Left MTG ROI analysis. (A) ROI: The MTG ROI, displayed in green on the average brain, from lateral and ventral perspectives. (B) Average activity: On the left, the average response in the MTG to the adjective for high (top 10%: blue line), mid-high (top 10–50%: solid black line), mid-low (bottom 10–50%: dotted black line), and low (bottom 10%: red line) adjective predictivity. On the right, the average response in the MTG to the noun for the top half (solid black line) and bottom half (dotted black line) of presented noun predictability. Horizontal lines with arrows indicate the time windows of interest for the ROI analyses. (C) Main effects: On the left, the effects of adjective predictivity (blue) and adjective frequency (red) during the adjective time window. On the right, the effects of presented noun predictability (blue) and presented noun frequency (red) during the noun time window. (D) Interaction effects: On the left, the interaction between adjective predictivity and expected noun frequency (controlling for adjective frequency) during the adjective time window. On the right, the interaction between presented noun predictability and presented noun frequency during the noun time window. (E) Binned analyses: On the left, the effect of expected noun frequency (controlling for adjective frequency and adjective predictivity) during the adjective time window for the top half (solid red line) and bottom half (dotted red line) of adjective predictivity. On the right, the effect of presented noun frequency (controlling for presented noun predictability) during the noun time window for the top half (solid red line) and bottom half (dotted red line) of presented noun predictability. The dotted black lines in C–E represent the level of correlation needed to reach statistical significance at p = .05 (uncorrected).
Removal of High Valence Items
Because our phrase selection process was automatic, our
final set of materials included some phrases with high valence (e.g., rectal exam). To rule out the possibility that
some of our effects were due to the presence of these
high valence phrases, we manually eliminated 18 phrases
(denoted with asterisks in the Appendix) that we judged
to contain a high valence adjective or noun and subsequently repeated our primary MTG analyses without
these items. In the adjective lexical access time window,
adjective frequency (t = 2.71, p = .008) and adjective
predictivity (t = −3.66, p = .001) remained significant.
In the preactivation time window, there was a significant
interaction of adjective predictivity and expected noun
frequency (χ² = 5.97, p = .01, controlling for adjective
frequency); this interaction was driven by a significant effect of expected noun frequency for the items in the top
half of adjective predictivity (t = 2.78, p = .01, controlling
for adjective frequency and adjective predictivity).
In the presented noun lexical access time window, presented noun frequency no longer reached significance (t = 1.70,
p = .10), although presented noun predictability remained highly significant (t = 4.15, p = .0006). The interaction of presented noun frequency and presented noun
predictability remained significant (χ² = 4.37, p = .04),
driven by a significant effect of presented noun frequency
for the items in the bottom half of presented noun predictability (t = 2.81, p = .01, controlling for presented
noun predictability). In summary, all of the effects in
the MTG survived the removal of the high valence items,
with the exception of the presented noun frequency effect, which dipped below the significance threshold.
Post hoc ROI Analyses
STG
Given the significant patch of negative activity in the left
STG at 300–400 msec (Figures 1A and 2A) as well as the
role of this region in the language network (e.g., Friederici,
2012), we conducted a post hoc analysis of activity in this
region.
In the adjective lexical access time window, there were
significant effects of adjective frequency (t = 3.59, p =
.002) and adjective predictivity (t = −2.22, p = .04). Both
effects were in the same direction as those found in the
MTG; the frequency effect was slightly stronger, and the
predictivity effect was slightly weaker than in the MTG.
In the preactivation time window, there was no significant interaction between adjective predictivity and
expected noun frequency (χ² = 1.93, p = .16, controlling for adjective frequency). Furthermore, there was no
significant effect of expected noun frequency for
the items in the top half of adjective predictivity (t =
1.08, p = .29, controlling for adjective frequency and
adjective predictivity).
In the presented noun lexical access time window,
there was no effect of presented noun frequency (t =
0.28, p = .78), and the effect of presented noun predictability was marginally significant (t = 1.88, p = .07).
There was no interaction of presented noun frequency
and presented noun predictability (χ² = 0.19, p = .67)
and no effect of presented noun frequency for the items
in the bottom half of presented noun predictability (t =
0.80, p = .43, controlling for presented noun predictability). In summary, most of the effects in this region are
either similar to or weaker than the effects in the left
MTG, supporting the selection of the latter region as
our central ROI.
LOFC
Within the frontal lobe, the region that is traditionally
associated with language processing is the left inferior
frontal gyrus (IFG), which includes Broca’s area (e.g.,
Friederici, 2012). However, the patch of positive evoked
activity in the frontal lobe at 300–400 msec post-adjective
onset (Figure 1A) did not localize to the IFG and instead
overlaps almost entirely with the left LOFC (see Figure 4A).
We therefore report a post hoc analysis of the activity in
that anatomical region rather than the IFG.
The evoked response in the LOFC (Figure 4B) shows a
prominent positive peak at roughly 350–400 msec following word presentation; correspondingly, the direction of
the presented noun predictability effect is such that activity is more negative for the high predictability condition.
Thus, the sign of the correlation with presented noun
predictability is negative, as opposed to the effects in
the MTG and STG, which were positive; the latter point
follows from the fact that a negative correlation with a
positive peak indicates a weakening of activity, whereas
a negative correlation with a negative peak indicates a
strengthening of activity.
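The sign logic in the preceding paragraph can be made explicit with a small Python sketch. The function name and the +1/−1 encoding of polarity are hypothetical conventions adopted for illustration; the sketch simply restates the rule that a correlation with the same sign as the peak implies stronger activity, and a correlation with the opposite sign implies weaker activity.

```python
def effect_on_magnitude(peak_sign, correlation_sign):
    """Interpret a correlation relative to the polarity of the evoked peak.

    peak_sign and correlation_sign are +1 or -1. A correlation with the same
    sign as the peak moves activity further from zero (strengthening); a
    correlation with the opposite sign moves it toward zero (weakening).
    """
    if peak_sign not in (1, -1) or correlation_sign not in (1, -1):
        raise ValueError("signs must be +1 or -1")
    return "strengthening" if peak_sign * correlation_sign > 0 else "weakening"

# Positive peak (as in the LOFC) with a negative correlation.
lofc_case = effect_on_magnitude(peak_sign=1, correlation_sign=-1)

# Negative peak (as in the MTG and STG) with a positive correlation.
mtg_case = effect_on_magnitude(peak_sign=-1, correlation_sign=1)
```

Both cases evaluate to a weakening of activity, which is why effects with opposite signs in the LOFC and the MTG can reflect the same underlying reduction.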
In the adjective lexical access time window, higher adjective frequency was associated with weaker LOFC activity (t = −2.56, p = .01; Figure 4C), but there was no main effect of adjective predictivity (t = 0.72, p = .48). In the preactivation time window, there was a significant interaction of adjective predictivity and expected noun frequency (χ² = 7.19, p = .007, controlling for adjective frequency; Figure 4D); this interaction was driven by the fact that higher expected noun frequency led to weaker activity, for the items in the top half of adjective predictivity (t = −3.21, p = .003, controlling for adjective frequency and adjective predictivity; Figure 4E). Adding weighted expected noun frequency to the model marginally improved the fit (χ² = 3.37, p = .07).
Figure 4. Left LOFC post hoc ROI analysis. (A) ROI: The LOFC ROI, displayed in green on the average brain, from lateral and ventral perspectives. (B) Average activity: On the left, the average response in the LOFC to the adjective for the top half (solid black line) and bottom half (dotted black line) of adjective predictivity. On the right, the average response in the LOFC to the noun for the top half (solid black line) and bottom half (dotted black line) of presented noun predictability. (C) Main effects: On the left, the effects of adjective predictivity (blue) and adjective frequency (red) during the adjective time window. On the right, the effects of presented noun predictability (blue) and presented noun frequency (red) during the noun time window. (D) Interaction effects: On the left, the interaction between adjective predictivity and expected noun frequency (controlling for adjective frequency) during the adjective time window. On the right, the interaction between presented noun predictability and presented noun frequency during the noun time window. (E) Binned analyses: On the left, the effect of expected noun frequency (controlling for adjective frequency and adjective predictivity) during the adjective time window for the top half (solid red line) and bottom half (dotted red line) of adjective predictivity. On the right, the effect of presented noun frequency (controlling for presented noun predictability) during the noun time window for the top half (solid red line) and bottom half (dotted red line) of presented noun predictability. The dotted black lines in C–E represent the level of correlation needed to reach statistical significance at p = .05 (uncorrected).
In the presented noun lexical access time window,
higher presented noun frequency was associated with
marginally weaker LOFC activity (t = −1.96, p = .06;
Figure 4C), and higher presented noun predictability
was associated with significantly weaker activity (t =
−2.98, p = .008). There was no interaction of presented
noun frequency and presented noun predictability (χ² =
0.004, p = .95; Figure 4D), and there was no significant
effect of presented noun frequency for the items in the
bottom half of presented noun predictability (t = −1.31,
p = .19, controlling for presented noun predictability;
Figure 4E).
In summary, most of the variables had similar effects
on LOFC activity as they did on MTG activity (though
with opposite signs, as discussed earlier), with the exception of the main effect of predictivity in the adjective lexical access time window and the interaction between
presented noun frequency and predictability in the presented noun lexical access time window, which were
found in the MTG but not the LOFC.
Early Predictability Effects
To determine whether there was an early effect of presented noun predictability in the MTG, we analyzed the
time window 100–200 msec post-noun onset. This time
window indeed showed a significant effect of presented
noun predictability (t = 3.29, p = .002). Given the possibility of spillover from the earlier effect of adjective frequency in this region, we also ran a stepwise regression
with adjective frequency and presented noun frequency;
in this model, presented noun predictability was no
longer significant (χ² = 1.22, p = .27). However, it is
difficult to interpret the latter fact in light of the high
correlation of adjective frequency and presented noun
predictability (r = −.71), which would serve to reduce
the effects of each variable when present in the same
model. In summary, there is somewhat inconclusive evidence for early predictability effects in the MTG after
noun presentation.
The evoked activity in the left cuneus, roughly overlapping with the location of visual cortex in the occipital
lobe, showed a negative peak at 76 msec, followed by a
positive peak at 136 msec post-noun onset. Following the
M100 analysis in Dikker and Pylkkänen (2011), we analyzed 10-msec windows centered around both peaks.5
In addition to noun predictability, we tested for an effect
of noun length to validate our selection of visual ROI
(under the assumption that early visual processing should
be sensitive to visual form properties, such as word
length). For the time window around the earlier peak,
there was no effect of presented noun predictability
(t = 0.53, p = .60, for the time window 71–81 msec
post-noun onset), but higher presented noun length
was associated with stronger activity (t = −2.52, p =
.02). For the time window around the later peak, higher
presented noun predictability was associated with marginally weaker activity (t = −1.87, p = .06, for the time
window 131–141 msec post-noun onset), and higher presented noun length was associated with stronger activity
(t = 2.74, p = .01). In a stepwise regression with presented noun length, the effect of presented noun predictability dipped further below significance (χ² = 1.71,
p = .19). The evidence in our data for visual predictability
effects is therefore inconclusive, although suggestive.
Exploratory Analysis of Language Areas
Figures 1 and 2 display the results of the exploratory analysis of language areas for the response to the presentation of the adjective and noun, respectively. Many of
the patterns observed within the ROI results are visible
in the present analysis: (i) there are positive effects of adjective frequency (Figure 1B) and presented noun frequency (Figure 2B) within the mid-anterior temporal
lobe and corresponding negative effects in the LOFC,
both peaking at 400–500 msec after the presentation of
each word; (ii) the effects of adjective predictivity
(Figure 1C) and presented noun predictability (Figure 2C)
have opposite directionalities, and in particular, there is a
negative effect of adjective predictivity and a positive effect
of presented noun predictability, peaking at 400–500 msec
in the temporal lobe; (iii) the preactivation effect—the
effect of expected noun frequency in response to the
highly predictive adjectives at 500–600 msec post-adjective
onset (Figure 1D)—displays a strikingly similar spatial
distribution to both the earlier effect of adjective frequency
at 400–500 msec post-adjective onset as well as the later
effect of presented noun frequency at 400–500 msec
post-noun onset; and finally, (iv) the preactivation effect
peaks at 500–600 msec, although the effects of adjective
frequency and adjective predictivity are no longer visible
at that latency, suggesting that these effects are distinct
from each other.
DISCUSSION
This study set out to characterize the neural signal corresponding to lexical preactivation. MEG activity was recorded while participants performed a lexical decision
task on the second word of visually presented adjective–noun phrases (e.g., stainless steel). The behavioral results
showed that predictable and frequent nouns were recognized faster, replicating previous results (Fischler &
Bloom, 1979; Whaley, 1978; Rubenstein et al., 1970, and
many others). Neurally, lexical preactivation manifested in
increased activity: During the adjective time window, left
MTG activity was greater for predictive adjectives (e.g.,
stainless, which is predictive of steel). Later, during the
noun time window, left MTG activity was significantly
reduced for predictable nouns (e.g., steel, which is a predictable continuation of stainless).
These results are consistent with the previously observed association of the left MTG with lexical access
(Friederici, 2012; Hickok & Poeppel, 2007) as well as with
the predictive coding hypothesis, according to which
predictive processing modulates the same region implicated in bottom–up processing of a stimulus (Egner
et al., 2010; Friston, 2005). Moreover, these results indicate that the nature of this predictive processing is such
that neural activity is increased at the point at which a
specific prediction is generated, whereas activity is reduced at the point at which the prediction is verified; this
corroborates the findings of Dikker and Pylkkänen
(2013), in which activity in a left middle temporal ROI
(among other regions) was increased during the generation of a specific lexical prediction based on a presented
picture and reduced when such predictions were satisfied by the presentation of the expected word.
In this study, the left MTG displayed a significant interaction between adjective predictivity and expected noun
frequency in what we termed the preactivation time window (∼450–600 msec post-adjective onset). This interaction was driven by a significant effect of expected noun
frequency for predictive adjectives (e.g., stainless, which
is predictive of steel), but not for less predictive adjectives (e.g., important, which is not predictive of any particular noun). Later, in what we termed the presented
noun lexical access time window (∼200–400 msec post-noun onset), there was a significant interaction between
presented noun frequency and presented noun predictability, driven by a significant effect of presented noun
frequency for less predictable nouns only (e.g., clue in
important clue).
These results suggest that participants not only preactivated the likely continuations for predictive adjectives
before presentation of any noun, but that this preactivation was sensitive to the frequency of the expected noun.
By contrast, in the case of less predictive adjectives, participants waited until the presentation of the noun to access the appropriate lexical representation. This evidence
for preactivation argues against the strong form of the
integration theory of predictability effects, according to
which predictable words are easier to process not because any prediction has taken place before they are read
but solely because they are easier to integrate into an
existing semantic representation (Norris, 1986). It is still
possible that some of the effects of predictability can be
attributed to greater ease of integration, but our results,
in conjunction with form prediction effects (Dikker &
Pylkkänen, 2011; DeLong et al., 2005), suggest that ease
of integration cannot be the whole story (see also Smith
& Levy, 2013).
The fact that a word’s frequency could modulate neural activity before its presentation raises the question of
how to understand such an effect within the explanatory
frameworks used to understand word frequency effects
more generally. Rational models of reading (Smith &
Levy, 2008, 2013; Norris, 2006) emphasize the influence
of predictability on word recognition. Within such a
framework, readers optimize their behavior on the basis
of their estimates for the likelihood of upcoming words.
In the absence of context, word frequency is taken as the
baseline expectation for encountering a word. Given the
presentation of a high-frequency (or highly predictable)
word, a reader might require less perceptual evidence
to decide on its identity (Norris, 2006) or less processing
time because of prior preparation (Smith & Levy, 2008).
One important implication of this approach to word recognition is that unconditional word frequency should
be irrelevant when a word is highly predictable from
context. Consistent with this hypothesis as well as with
prior EEG findings (Dambacher et al., 2006; Van Petten
& Kutas, 1990), we found that the response in the MTG
to a presented noun was only modulated by its frequency
when the noun was a less predictable continuation of the
preceding adjective. In summary, our main effects of predictability, as well as the reduction in frequency effects for
predictable nouns, largely validate the rational models’
emphasis on predictability as the central determinant of
reading behavior.
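The logic of this argument can be written compactly. The following is only a schematic sketch in illustrative notation (w for a word, c for its context, f(w) for its corpus frequency); these symbols are assumptions introduced here rather than notation from the models cited above, and the logarithmic form follows the relationship reported by Smith and Levy (2013).

```latex
\begin{align}
\mathrm{cost}(w \mid c) &\propto -\log P(w \mid c)
  && \text{(processing cost tracks conditional probability)} \\
P(w \mid \varnothing) &\approx \frac{f(w)}{\sum_{w'} f(w')}
  && \text{(no context: relative frequency as baseline)}
\end{align}
```

Under this reading, frequency enters only through the second line, so once P(w | c) is fixed by a constraining context, f(w) should carry no additional weight.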
However, although the pervasive effects of predictability indeed suggest that the rational models are on the
right track, a strict interpretation of such models holds that frequency effects should be entirely accounted for by predictability (Smith & Levy, 2008), a
position that is inconsistent with our finding of frequency
effects for anticipated nouns before their presentation.
Instead, these results suggest that frequency effects are
associated with the very process of lexical access itself.
Although it is not immediately obvious how to account
for this phenomenon within the framework of the rational models, several prominent models within the psycholinguistic literature crucially predict this phenomenon.
For example, according to Morton’s (1969) Logogen
model, word frequency determines the resting level of activation for a lexical item; similarly, according to Forster’s
(1976) Serial Search model, the lexicon is composed of
frequency-ordered bins. Thus, our experiment can be
seen as providing some new evidence for a long-held view
within the psycholinguistic literature, in which frequency
effects arise because of the architecture of the lexicon.
The language production literature suggests another
intriguing interpretation of the effect of expected noun
frequency. A family of recent models argues that lexical
prediction employs some of the same mechanisms as
language production (Dell & Chang, 2014; Federmeier,
2007; Pickering & Garrod, 2007). For example, older
adults with high verbal fluency scores show stronger
prediction effects than those with lower fluency scores,
suggesting that predicted words may be actively generated using the language production system (Federmeier,
McLennan, Ochoa, & Kutas, 2002). Frequency effects in
production have been extensively documented (Strijkers,
Costa, & Thierry, 2010; Kittredge, Dell, Verkuilen, &
Schwartz, 2008; Jescheniak & Levelt, 1994; Oldfield &
Wingfield, 1965). These effects seem to arise both at the
form level and at the semantic level (Kittredge et al., 2008);
information at both of these levels needs to be accessed
to generate form predictions (Dikker & Pylkkänen, 2011)
and semantic feature predictions (Federmeier & Kutas,
1999). The expected noun frequency effects in our experiment may therefore reflect the retrieval of the predicted
concept, the retrieval of the orthographic form associated
with it, or both.
Our main analysis focused on the left MTG, based on
research implicating it in semantic and lexical access
(Friederici, 2012). Exploratory analysis of left hemisphere
language areas showed that effects generally localized to
an anterior section of the temporal lobe (cf. Lau, Weber,
Gramfort, Hämäläinen, & Kuperberg, 2014, who reported
a similar location for the effects of lexical–semantic prediction), as well as to a portion of the inferior frontal
lobe. Post hoc ROI analyses also confirmed effects of
the variables of interest in regions outside the MTG: a
temporal region, the STG, and a prefrontal region, the
LOFC. The STG is standardly assumed to be part of the
language network; the effects in that region were qualitatively similar to, though weaker than, the effects we found
in the MTG. The prefrontal effects are consistent with
the role of the pFC in anticipatory processing (Dikker &
Pylkkänen, 2013; Bar, 2007). Somewhat unexpectedly,
the prefrontal effects localized to LOFC rather than to
the IFG, which is the prefrontal region more commonly
associated with the language network. The spatial
distribution of the prefrontal effects may be
due to a source localization error. However, a recent
MEG study, using the same source space analysis methodology as this study, found effects of a semantic variable at
∼300–500 msec in the LOFC (Fruchter & Marantz, 2015);
moreover, a different analysis technique has localized
prefrontal lexical prediction effects to ventromedial pFC
(Dikker & Pylkkänen, 2013), which is closer to LOFC than
to IFG. Multimodal recordings might be able to shed light
on the precise localization of this effect. Finally, post hoc
analysis revealed equivocal evidence for early (∼100–
200 msec) presented noun predictability effects within
the MTG and the left cuneus, consistent with previous
findings of lexical predictability effects in early sensory
responses (Kim & Lai, 2012; Dikker & Pylkkänen, 2011),
although the early effects reported here did not reach
statistical significance when control variables were included
in the regression models.
There are further aspects of our study that remain
open as future avenues of investigation. We quantified
the predictability of the second word in a phrase via
corpus TP. It is an open question how closely this
corpus measure would relate to an empirically derived
cloze probability measure, the traditional stand-in for
predictability (see Smith & Levy, 2011, for a comparison
of sentential cloze probabilities with corpus measures of
predictability). In particular, any interpretation of a TP
effect confounds prediction based on raw co-occurrence
statistics with prediction based on semantics and world
knowledge (Frisson, Rayner, & Pickering, 2005). Although
not easily distinguishable in this study, the potentially
independent effects of these two sources of information
could be investigated in a future study.
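As a rough illustration of the corpus measure discussed above, the sketch below estimates TP as count(adjective, noun) divided by count(adjective) and then applies a logarithmic transformation. The function name, the toy bigram counts, and the use of the natural logarithm are assumptions made for this example; they are not taken from the corpus or the analysis pipeline used in the study.

```python
from collections import Counter
import math


def transitional_probabilities(bigrams):
    """Estimate TP(second | first) = count(first, second) / count(first).

    `bigrams` is a list of (first_word, second_word) tuples, one per
    occurrence of the first word in the (hypothetical) corpus.
    """
    pair_counts = Counter(bigrams)
    first_counts = Counter(first for first, _ in bigrams)
    return {
        (first, second): count / first_counts[first]
        for (first, second), count in pair_counts.items()
    }


# Toy, made-up counts for illustration only (not the study's corpus data).
toy_bigrams = (
    [("stainless", "steel")] * 8
    + [("stainless", "reputation")] * 2
    + [("important", "clue")] * 1
    + [("important", "thing")] * 9
)

tp = transitional_probabilities(toy_bigrams)

# Log-transformed TP (natural log is an assumption made here).
log_tp = {pair: math.log(p) for pair, p in tp.items()}
```

Because such an estimate relies only on co-occurrence counts, it cannot by itself separate statistical association from semantic or world-knowledge constraint, which is the confound noted above.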
In this study, participants performed a lexical decision
after each phrase. This task, although a useful tool to ensure that participants are paying attention to the materials, may have engaged conscious prediction strategies
that are not recruited during naturalistic language comprehension (Neely, 1991). A conscious prediction strategy,
developed over the course of the experiment, would likely
manifest as an increased effect of predictability in later trials
compared to earlier ones; such an effect was not observed.
Nevertheless, it is worth investigating whether the effects
reported here would generalize to a more ecologically valid
paradigm, such as a passive reading task.
Finally, the primary index of preactivation in our study
was the frequency of the most likely noun continuation.
Clearly, there is reason to suspect that readers might predict more than a single possible continuation. A preliminary step toward addressing this possibility was taken in
this study; the weighted average of the frequencies of
possible continuations was shown to slightly improve
the model fit relative to the frequency of the single most
likely continuation, although this difference did not reach
statistical significance. The latter point provides some
tentative evidence in favor of a richer conceptualization
of lexical preactivation. Future work should
further characterize the nature of such preactivation,
particularly the extent to which possible continuations
are preactivated in proportion to their conditional probability given the preceding context.
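The two candidate indices of preactivation contrasted above can be sketched as follows. This is a minimal illustration under stated assumptions: that the weights are the continuations' TPs, that frequencies are log-transformed before averaging, and that the weights are renormalized over the listed continuations. The function names and the toy values are hypothetical and do not reproduce the study's regressors.

```python
import math


def top_continuation_log_frequency(continuations):
    """Log frequency of the single most likely continuation (highest TP)."""
    best = max(continuations, key=lambda c: c["tp"])
    return math.log(best["frequency"])


def weighted_log_frequency(continuations):
    """TP-weighted average of log frequencies over the listed continuations.

    Weights are renormalized so that they sum to 1 over the continuations
    actually supplied (an assumption of this sketch).
    """
    total_tp = sum(c["tp"] for c in continuations)
    return sum(
        c["tp"] * math.log(c["frequency"]) for c in continuations
    ) / total_tp


# Made-up continuations of a hypothetical adjective (illustrative values only).
toy = [
    {"noun": "steel", "tp": 0.8, "frequency": 1000},
    {"noun": "reputation", "tp": 0.2, "frequency": 10},
]

single_index = top_continuation_log_frequency(toy)
weighted_index = weighted_log_frequency(toy)
```

The two indices coincide when one continuation dominates and diverge as probability mass spreads across several nouns of differing frequency, which is where the richer measure could in principle be distinguished from the simpler one.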
Conclusion
This study used MEG to probe the neural signals that correspond to the generation of a lexical prediction, using
minimal adjective–noun phrases such as stainless steel.
We observed an increase in activity in the left MTG in
response to the presentation of more highly predictive
adjectives (e.g., stainless). Later, though still before the
presentation of the noun, neural activity was modulated
by the frequency of the predicted noun (steel). Correspondingly, when the noun was later presented, predictable nouns elicited weaker neural activity than
unpredictable ones.
APPENDIX. List of Stimuli and Associated TPs (before Logarithmic Transformation)
Adjective Noun TP
unsalted butter .845
stainless steel .824
barbed wire .793
umbilical cord .663
iced tea .575
soapy water .539
renewable energy .523
pubic hair* .521
undivided attention .453
untimely death* .423
concerted effort .421
immune system .395
salivary gland .389
airtight container .378
soy sauce .372
rheumatic fever .355
runny nose .354
watchful eye .346
mental health .311
sour cream .307
uncharted territory .307
cervical cancer* .306
taxable income .303
cloudless sky .302
ballistic missile .294
unborn child .289
prickly pear .278
thankless job .271
soluble fiber .268
martial law .262
powdered sugar .260
eminent domain .245
high school .240
toothy grin .239
magnetic field .233
leaded glass .232
crude oil .224
jobless rate .223
residual limb .221
vast majority .217
bilingual education .214
septic tank .211
everyday life .208
rectal exam* .206
cellular phone .205
anaerobic digestion .203
hallowed ground .200
marital status .199
slippery slope .199
oncoming traffic .197
catalytic converter .193
salutary effect .184
habitable zone .184
bearded man .182
digestive tract .179
foreign policy .168
illicit drug .165
auditory canal .162
deviant behavior .161
unholy alliance .159
uncanny ability .152
allergic reaction .152
husky voice .152
wooded area .151
breakneck pace .151
pivotal role .150
angular momentum .150
abdominal pain .144
elective office .144
daunting task .143
incurable disease .141
facial nerve .140
saline solution .140
empirical evidence .138
impartial spectator .137
floppy disk .135
lethal injection* .134
negligent homicide* .131
crusty bread .131
electoral college .127
outer space .124
virtual reality .118
lifeless body* .116
unskilled labor .115
royal family .115
boneless pork .112
fictional world .110
humid air .109
custodial parent .108
stormy weather .107
evasive action .104
indecent exposure .101
sore throat .099
intrinsic value .099
unwed mother .098
tropical storm .098
bald head .098
timely manner .098
coercive power .097
artistic director .096
unmarked car .096
leaky roof .095
hind leg .095
domestic violence* .095
frontal lobe .094
populous country .093
infinite number .092
vicious cycle .091
sluggish economy .091
thorny issue .090
exclusive interview .089
nasal cavity .089
offensive line .089
bacterial infection .087
rapid growth .087
doctoral degree .086
undue burden .086
postwar period .084
unsolved murder* .083
cerebral cortex .082
nonprofit group .082
honorary doctorate .082
cubic foot .081
radial velocity .080
aerobic fitness .080
disabled list .079
schematic diagram .079
keen interest .078
imminent danger .077
glacial ice .075
sane person .075
teenage girl .074
periodic table .074
outspoken critic .073
minor league .073
lunar surface .073
naval base .072
traumatic event .072
unanimous decision .071
brisk business .071
utopian vision .071
natural gas .071
nominal fee .070
volcanic activity .070
bridal gown .069
molten lava .069
rightful owner .069
vaginal dryness* .069
violent crime* .067
literal sense .067
wide variety .066
ultimate goal .065
oily skin .065
private sector .065
perpetual motion .064
lanky frame .064
fertile soil .064
untreated sewage .063
sexual abuse* .063
immortal soul .061
regular basis .059
notable exception .057
skeletal muscle .056
sole purpose .055
gay marriage .055
prolific writer .055
pungent odor .055
rugged terrain .055
surgical procedure .054
miniature golf .054
explosive device .054
unequal treatment .053
radiant heat .052
spinal column .052
enormous amount .051
tentative agreement .051
impending doom .051
khaki shirt .051
raw material .050
popular culture .049
speedy trial .049
candid camera .049
stressful situation .049
bad news .048
temperate climate .048
adverse impact .048
funny thing .046
electric mixer .046
petite woman .046
sensory input .046
receptive audience .045
modernist art .045
primal scene .045
ethnic identity .045
stony silence .045
spectral type .045
discreet distance .044
viable option .044
indoor plumbing .043
stylistic analysis .043
irregular heartbeat .043
romantic comedy .043
canned food .042
upper lip .042
covert operation .042
dismal failure .041
rosy picture .040
unfair advantage .040
inaugural ball .040
universal coverage .040
awkward position .040
patriotic duty .039
sheer size .039
stellar evolution .038
unwanted pregnancy* .038
generic term .037
ample room .037
decisive victory .036
homeless shelter .035
earthly paradise .035
sizable chunk .035
sensual pleasure .035
deaf ear .034
muscular strength .034
planetary scientist .033
coarse meal .033
sweet potato .033
normative sample .033
factual knowledge .032
judicial activism .032
lifelong friend .031
geometric pattern .031
lively debate .031
barren landscape .031
festive mood .031
optimal level .031
orbital debris .030
liberal democracy .030
damp cloth .030
optical illusion .029
extra money .029
ancestral homeland .029
shallow dish .028
monthly payment .028
fiscal crisis .028
insane asylum .027
nervous breakdown .027
polite applause .027
biblical text .026
clinical practice .026
genetic diversity .026
serious problem .025
socialist realism .025
strict liability .025
integral component .025
senior editor .025
exact location .025
creamy texture .025
preschool teacher .024
main reason .024
glossy magazine .024
rigorous training .024
dumb luck .024
digital video .024
dietary intake .024
vivid memory .024
sleepy town .024
polar cap .024
soft tissue .024
linear model .023
drunken driver .023
fatal flaw* .023
reliable source .022
thermal expansion .022
broad daylight .022
tragic accident .022
upcoming book .022
durable peace .021
dental hygiene .021
casual observer .021
annual budget .021
dirty laundry .021
mere fact .021
static pressure .021
blind date .021
fuzzy logic .021
spiritual leader .021
vibrant color .021
nice guy .020
wild card .020
cardiac output .020
logical extension .020
identical twin .020
slim chance .020
free agent .019
incoming freshman .019
fresh lemon .019
cautious optimism .019
lucrative contract .019
potent symbol .019
weekly newspaper .019
solar radiation .018
endless stream .018
rational choice .018
portable radio .018
automatic pilot .018
ambitious project .018
eternal damnation* .018
civic pride .018
cruel joke .018
tall grass .017
academic success .017
exotic dancer .017
positive attitude .017
cultural heritage .017
genuine concern .017
ripe tomato .017
good idea .016
humble opinion .016
frequent flier .016
fair game .016
urgent message .016
full moon .016
seasonal flu .016
fierce battle .016
silent auction .016
symbolic capital .016
stable condition .016
absolute certainty .016
uneasy truce .016
informal survey .016
indirect discourse .015
jealous rage .015
racial equality .015
selective breeding .015
immediate aftermath .015
precious commodity .015
black pepper .015
joint statement .015
bare chest .014
guilty plea .014
useful tool .014
secular humanism .014
extreme poverty .014
inner self .014
physical therapy .014
spatial scale .014
nasty stuff .014
constant reminder .013
steep incline .013
financial planner .013
organic carbon .013
atomic physics .013
gentle breeze .013
valuable lesson .013
conscious awareness .013
aesthetic quality .012
distant cousin .012
explicit reference .012
dense foliage .012
suitable habitat .012
tiny fraction .012
economic reform .011
terrible tragedy .011
religious belief .011
retail industry .011
faithful servant .011
coastal region .011
nuclear arsenal .011
healthy diet .011
vague notion .011
ethical dilemma .011
urban renewal .011
neutral hydrogen .011
practical advice .011
bitter pill .011
stiff neck .010
eastern seaboard .010
dominant theme .010
original sin .010
dull knife .010
permanent residence .010
innocent victim .010
moral theology .010
risky strategy .009
personal trainer .009
anonymous donor .009
divine creation .009
blue cheese .009
pure vanilla .009
remote corner .009
low profile .009
defensive posture .009
narrow path .009
mild recession .009
abstract concept .009
empty stomach .009
temporary relief .008
northern border .008
brilliant career .008
deadly virus* .008
rough patch .008
bright sunlight .008
accurate diagnosis .008
slight movement .008
corporate ladder .008
severe drought .007
ongoing dialogue .007
honest broker .007
native tongue .007
yellow squash .007
visual imagery .007
negative publicity .007
elderly gentleman .007
regional stability .007
southern accent .007
modest proposal .007
emotional intensity .006
previous page .006
safe passage .006
medical marijuana .006
large pot .006
cheap plastic .006
heavy saucepan .006
legal pad .006
apparent suicide* .006
basic premise .005
formal complaint .005
recent poll .005
critical acclaim .005
efficient method .005
athletic shoe .005
angry mob .005
quick trip .005
dangerous precedent .005
crucial aspect .005
dramatic reduction .005
rare occasion .004
similar vein .004
common stock .004
creative genius .004
entire universe .004
sad song .004
illegal gambling .004
military campaign .004
active volcano .003
proper burial .003
strong supporter .003
musical notation .003
huge crowd .003
perfect timing .003
strange sensation .003
sick bay .003
massive influx .003
rural county .003
thin sheet .003
current crop .002
easy prey .002
powerful engine .002
famous phrase .002
major obstacle .002
expensive jewelry .002
quiet dignity .002
tough stance .002
local chapter .002
fine mist .002
beautiful scenery .002
modern reader .002
political rhetoric .002
young adulthood .001
dead giveaway* .001
small intestine .001
possible scenario .001
important clue .000
*Denotes high valence items, which were removed for the supplemental MTG analysis.
Acknowledgments
This material is based upon work supported by the National Science Foundation under grant BCS-0843969 and by the NYU Abu
Dhabi Research Council under grant G1001 from the NYUAD
Institute, New York University Abu Dhabi. We would like to
thank three anonymous reviewers for their informative feedback on this manuscript.
Reprint requests should be sent to Joseph Fruchter, Department
of Psychology, New York University, 2nd Floor, 6 Washington
Place, New York, NY 10003, or via e-mail:
[email protected].
Notes
1. Alternatively, for an individual trial, one might regard a participant as predicting only a single possible noun with a probability equal to that noun’s TP; on the aggregate, however, we
would nevertheless observe effects proportional to the relevant
conditional probabilities.
2. Because there was no definition for the insula region within
the parcellation, we obtained the region via an alternate parcellation for the average brain, which was then morphed back to
each participant’s neuroanatomical space.
3. Because the M350 response is negatively signed, a positive
correlation indicates a weakening of activity, and a negative correlation indicates a strengthening of activity.
4. Despite the highly significant adjective predictivity effect in
the main analysis (see also Figure 3C), a median split analysis
failed to show a comparably robust separation between the
items in the top half and bottom half of adjective predictivity.
This discrepancy indicated that the continuous regression using
linear mixed-effects models was a more sensitive measure of
the adjective predictivity effect. Consequently, we decided to
split the data into the top 10% (blue line), top 10–50% (solid
black line), bottom 10–50% (dotted black line), and bottom
10% (red line) of adjective predictivity (Figure 3B), which confirmed our hypothesis. In particular, the continuous regression
models are a better fit to the data than the median split, because the predictivity effect is more significant at the higher
and lower ranges of adjective predictivity values, relative to
the items in the middle of the distribution.
5. The latency of the M100 peak in Dikker and Pylkkänen’s
(2011) study was 97 msec, which is roughly midway between
the two peaks observed here; we thus decided to analyze both
peaks in the present data. It should be noted, however, that
Dikker and Pylkkänen (2011) performed a sensor space analysis, which may yield results that are not comparable to the results of the present source space analysis.
REFERENCES
Adachi, Y., Shimogawara, M., Higuchi, M., Haruta, Y., &
Ochiai, M. (2001). Reduction of non-periodic environmental
magnetic noise in MEG measurement by continuously
adjusted least squares method. IEEE Transactions on
Applied Superconductivity, 11, 669–672.
Altmann, G. T. M., & Kamide, Y. (1999). Incremental
interpretation at verbs: Restricting the domain of
subsequent reference. Cognition, 73, 247–264.
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008).
Mixed-effects modeling with crossed random effects for
subjects and items. Journal of Memory and Language,
59, 390–412.
Bar, M. (2007). The proactive brain: Using analogies and
associations to generate predictions. Trends in Cognitive
Sciences, 11, 280–289.
Bates, D., Maechler, M., & Bolker, B. (2013). lme4: Linear
mixed-effects models using S4 classes. R package version
0.999999-2. http://CRAN.R-project.org/package=lme4.
Bemis, D. K., & Pylkkänen, L. (2011). Simple composition:
A magnetoencephalography investigation into the
comprehension of minimal linguistic phrases. Journal
of Neuroscience, 31, 2801–2814.
Binder, J. R., Frost, J. A., Hammeke, T. A., Cox, R. W., Rao, S. M.,
& Prieto, T. (1997). Human brain language areas identified
by functional magnetic resonance imaging. Journal of
Neuroscience, 17, 353–362.
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial
Vision, 10, 433–436.
Brown, C., & Hagoort, P. (1993). The processing nature of the
N400: Evidence from masked priming. Journal of Cognitive
Neuroscience, 5, 34–44.
Dale, A. M., Liu, A. K., Fischl, B. R., Buckner, R. L., Belliveau,
J. W., Lewine, J. D., et al. (2000). Dynamic statistical
parametric mapping: Combining fMRI and MEG for high
resolution imaging of cortical activity. Neuron, 26, 55–67.
Dale, A. M., & Sereno, M. I. (1993). Improved localization of
cortical activity by combining EEG and MEG with MRI cortical
surface reconstruction: A linear approach. Journal of
Cognitive Neuroscience, 5, 162–176.
Dambacher, M., Kliegl, R., Hofmann, M., & Jacobs, A. M. (2006).
Frequency and predictability effects on event-related
potentials during reading. Brain Research, 1084, 89–103.
Davies, M. (2009). The 385+ million word Corpus of
Contemporary American English (1990–2008+): Design,
architecture, and linguistic insights. International Journal
of Corpus Linguistics, 14, 159–190.
Dell, G. S., & Chang, F. (2014). The P-chain: Relating sentence
production and its disorders to comprehension and acquisition.
Philosophical Transactions of the Royal Society, Series B,
Biological Sciences, 369, 20120394.
DeLong, K., Urbach, T., & Kutas, M. (2005). Probabilistic word
preactivation during language comprehension inferred from
electrical brain activity. Nature Neuroscience, 8, 1117–1121.
Desikan, R. S., Ségonne, F., Fischl, B., Quinn, B. T., Dickerson,
B. C., Blacker, D., et al. (2006). An automated labeling system for
subdividing the human cerebral cortex on MRI scans into
gyral based regions of interest. Neuroimage, 31, 968–980.
Dikker, S., & Pylkkänen, L. (2011). Before the N400: Effects of
lexical-semantic violations in visual cortex. Brain and
Language, 118, 23–28.
Dikker, S., & Pylkkänen, L. (2013). Predicting language: MEG
evidence for lexical preactivation. Brain and Language,
127, 55–64.
Dikker, S., Rabagliati, H., & Pylkkänen, L. (2009). Sensitivity
to syntax in visual cortex. Cognition, 110, 293–321.
Egner, T., Monti, J. M., & Summerfield, C. (2010). Expectation
and surprise determine neural population responses in the
ventral visual stream. Journal of Neuroscience, 30,
16601–16608.
Ehrlich, S. F., & Rayner, K. (1981). Contextual effects on word
perception and eye movements during reading. Journal of
Verbal Learning and Verbal Behavior, 20, 641–655.
Embick, D., Hackl, M., Schaeffer, J., Kelepir, M., & Marantz, A.
(2001). A magnetoencephalographic component whose
latency reflects lexical frequency. Cognitive Brain Research,
10, 345–348.
Federmeier, K. D. (2007). Thinking ahead: The role and roots
of prediction in language comprehension. Psychophysiology,
44, 491–505.
Federmeier, K. D., & Kutas, M. (1999). A rose by any other
name: Long-term memory structure and sentence processing.
Journal of Memory and Language, 41, 469–495.
Federmeier, K. D., McLennan, D. B., Ochoa, E., & Kutas, M.
(2002). The impact of semantic memory organization and
sentence context information on spoken language processing
by younger and older adults: An ERP study. Psychophysiology,
39, 133–146.
Fischler, I., & Bloom, P. A. (1979). Automatic and attentional
processes in the effects of sentence contexts on word
recognition. Journal of Verbal Learning and Verbal
Behavior, 18, 1–20.
Forster, K. I. (1976). Accessing the mental lexicon. In R. J. Wales
& E. C. T. Walker (Eds.), New approaches to language
mechanisms (pp. 257–287). Amsterdam: North-Holland.
Francken, J. C., Kok, P., Hagoort, P., & De Lange, F. P. (2015).
The behavioral and neural effects of language on motion
perception. Journal of Cognitive Neuroscience, 27,
175–184.
Friederici, A. D. (2012). The cortical language circuit: From
auditory perception to sentence comprehension. Trends in
Cognitive Sciences, 16, 262–268.
Frisson, S., Rayner, K., & Pickering, M. J. (2005). Effects of
contextual predictability and transitional probability on eye
movements during reading. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 31, 862–877.
Friston, K. (2005). A theory of cortical responses. Philosophical
Transactions of the Royal Society, Series B, 360, 815–836.
Fruchter, J., & Marantz, A. (2015). Decomposition, lookup, and
recombination: MEG evidence for the full decomposition
model of complex visual word recognition. Brain and
Language, 143, 81–96.
Hämäläinen, M., Hari, R., Ilmoniemi, R. J., Knuutila, J., &
Lounasmaa, O. V. (1993). Magnetoencephalography—
Theory, instrumentation, and applications to noninvasive
studies of the working human brain. Review of Modern
Physics, 65, 413–497.
Hickok, G., & Poeppel, D. (2007). The cortical organization
of speech processing. Nature Reviews Neuroscience, 8,
393–402.
Holmes, A. P., & Friston, K. J. (1998). Generalisability, random
effects and population inference. Neuroimage, 7, S754.
Indefrey, P., & Levelt, W. J. M. (2004). The spatial and temporal
signatures of word production components. Cognition, 92,
101–144.
Inhoff, A. W., & Rayner, K. (1986). Parafoveal word processing
during eye fixations in reading: Effects of word frequency.
Perception & Psychophysics, 40, 431–439.
Jescheniak, J. D., & Levelt, W. J. (1994). Word frequency effects
in speech production: Retrieval of syntactic information and
of phonological form. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 20, 824.
Kamide, Y., Altmann, G., & Haywood, S. L. (2003). The time-course of prediction in incremental sentence processing:
Evidence from anticipatory eye movements. Journal of
Memory and Language, 49, 133–156.
Keuleers, E., & Brysbaert, M. (2010). Wuggy: A multilingual
pseudoword generator. Behavior Research Methods, 42,
627–633.
Kim, A., & Lai, V. (2012). Rapid interactions between lexical
semantic and word form analysis during word recognition
in context: Evidence from ERPs. Journal of Cognitive
Neuroscience, 24, 1104–1112.
Kittredge, A. K., Dell, G. S., Verkuilen, J., & Schwartz, M. F.
(2008). Where is the effect of frequency in word production?
Insights from aphasic picture-naming errors. Cognitive
Neuropsychology, 25, 463–492.
Kok, P., Failing, M. F., & de Lange, F. P. (2014). Prior expectations
evoke stimulus templates in the primary visual cortex.
Journal of Cognitive Neuroscience, 26, 1546–1554.
Kok, P., Jehee, J. F., & de Lange, F. P. (2012). Less is more:
Expectation sharpens representations in the primary visual
cortex. Neuron, 75, 265–270.
Kutas, M., & Federmeier, K. D. (2000). Electrophysiology
reveals semantic memory use in language comprehension.
Trends in Cognitive Sciences, 4, 463–470.
Kutas, M., & Hillyard, S. A. (1984). Brain potentials during
reading reflect word expectancy and semantic association.
Nature, 307, 161–163.
Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network
for semantics: (De)constructing the N400. Nature Reviews
Neuroscience, 9, 920–933.
Lau, E. F., Weber, K., Gramfort, A., Hämäläinen, M. S., &
Kuperberg, G. R. (2014). Spatiotemporal signatures of
lexical-semantic prediction. Cerebral Cortex. Advance
online publication. doi:10.1093/cercor/bhu219.
Linzen, T., Marantz, A., & Pylkkänen, L. (2013). Syntactic context
effects in visual word recognition: An MEG study. The Mental
Lexicon, 8, 117–139.
McDonald, S. A., & Shillcock, R. C. (2003). Low-level
predictive inference in reading: The influence of
transitional probabilities on eye movements. Vision
Research, 43, 1735–1751.
Morton, J. (1969). Interaction of information in word
recognition. Psychological Review, 76, 165–178.
Neely, J. H. (1991). Semantic priming effects in visual word
recognition: A selective review of current findings and
theories. In D. Besner & G. W. Humphreys (Eds.),
Basic processes in reading: Visual word recognition
(pp. 264–336). Hillsdale, NJ: Erlbaum.
Norris, D. (1986). Word recognition: Context effects without
priming. Cognition, 22, 93–136.
Norris, D. (2006). The Bayesian reader: Explaining word
recognition as an optimal Bayesian decision process.
Psychological Review, 113, 327–357.
Oldfield, R. C. (1971). The assessment and analysis of
handedness: The Edinburgh inventory. Neuropsychologia,
9, 97–113.
Oldfield, R. C., & Wingfield, A. (1965). Response latencies in
naming objects. Quarterly Journal of Experimental
Psychology, 17, 273–281.
Pelli, D. G. (1997). The VideoToolbox software for visual
psychophysics: Transforming numbers into movies. Spatial
Vision, 10, 437–442.
Pickering, M. J., & Garrod, S. (2007). Do people use language
production to make predictions during comprehension?
Trends in Cognitive Sciences, 11, 105–110.
Pylkkänen, L., & Marantz, A. (2003). Tracking the time course
of word recognition with MEG. Trends in Cognitive Sciences,
7, 187–189.
Rodd, J. M., Davis, M. H., & Johnsrude, I. S. (2005). The
neural mechanisms of speech comprehension: fMRI
studies of semantic ambiguity. Cerebral Cortex, 15,
1261–1269.
Rubenstein, H., Garfield, L., & Millikan, J. A. (1970).
Homographic entries in the internal lexicon. Journal of
Verbal Learning and Verbal Behavior, 9, 487–494.
Smith, M. E., & Halgren, E. (1987). Event-related potentials
during lexical decision: Effects of repetition, word frequency,
pronounceability, and concreteness.
Electroencephalography and Clinical Neurophysiology
Supplement, 40, 417–421.
Smith, N. J., & Levy, R. (2008). Optimal processing times in
reading: A formal model and empirical investigation.
In B. C. Love, K. McRae, & V. M. Sloutsky (Eds.),
Proceedings of the Thirtieth Annual Conference of the
Cognitive Science Society (pp. 595–600). Austin, TX:
Cognitive Science Society.
Smith, N. J., & Levy, R. (2011). Cloze but no cigar: The
complex relationship between cloze, corpus, and
subjective probabilities in language processing. In
L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings
of the 33rd Annual Conference of the Cognitive Science
Society (pp. 1637–1642).
Austin, TX: Cognitive Science Society.
Smith, N. J., & Levy, R. (2013). The effect of word predictability
on reading time is logarithmic. Cognition, 128, 302–319.
Solomyak, O., & Marantz, A. (2009). Lexical access in early
stages of visual word processing: A single-trial correlational
MEG study of heteronym recognition. Brain and Language,
108, 191–196.
Solomyak, O., & Marantz, A. (2010). Evidence for early
morphological decomposition in visual word recognition.
Journal of Cognitive Neuroscience, 22, 2042–2057.
Strijkers, K., Costa, A., & Thierry, G. (2010). Tracking lexical
access in speech production: Electrophysiological correlates
of word frequency and cognate effects. Cerebral Cortex,
20, 912–928.
Taylor, W. L. (1953). “Cloze procedure”: A new tool for
measuring readability. Journalism Quarterly, 30, 415–433.
Van Berkum, J. J. A., Brown, C. M., Zwitserlood, P., Kooijman,
V., & Hagoort, P. (2005). Anticipating upcoming words in
discourse: Evidence from ERPs and reading times. Journal
of Experimental Psychology: Learning, Memory, and
Cognition, 31, 443–467.
Van Petten, C., & Kutas, M. (1990). Interactions between
sentence context and word frequency in event-related
brain potentials. Memory & Cognition, 18, 380–393.
Whaley, C. P. (1978). Word-nonword classification time.
Journal of Verbal Learning and Verbal Behavior, 17, 143–154.
Wicha, N. Y., Moreno, E. M., & Kutas, M. (2003). Expecting gender:
An event related brain potential study on the role of
grammatical gender in comprehending a line drawing with
a written sentence in Spanish. Cortex, 39, 483–508.