A Dual Role For Prediction Error in
doi:10.1093/cercor/bhn161
Advance Access publication September 26, 2008
A Dual Role for Prediction Error in Associative Learning

Hanneke E.M. den Ouden1, Karl J. Friston1, Nathaniel D. Daw2, Anthony R. McIntosh3 and Klaas E. Stephan1,4

1Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London WC1N 3BG, UK, 2Department of Psychology, New York University, New York, NY 10003, USA, 3Rotman Research Institute of Baycrest Centre, University of Toronto, Toronto, Ontario, Canada M6A 2E1 and 4Branco-Weiss-Laboratory, Institute for Empirical Research in Economics, University of Zürich, Switzerland
Confronted with a rich sensory environment, the brain must learn statistical regularities across sensory domains to construct causal models of the world. Here, we used functional magnetic resonance imaging and dynamic causal modeling (DCM) to furnish neurophysiological evidence that statistical associations are learnt, even when

In all of these previous studies, the learned associations had direct relevance for behavior, either because they were linked to rewarding or punishing outcomes (e.g., McClure et al. 2003; O’Doherty et al. 2004; Seymour et al. 2004) or because subjects received feedback on their performance (Fletcher et al. 2001;
Choice of areas and time series extraction. The goal of the present DCM analysis was to explain the (3-way) simple interaction CS presence × visual outcome × RW learning for CS+ trials in V1 (see SPM findings in the Results section) by a simple model, in which the strength of the A1 → V1 connection was modulated as a function of the RW predictions, φ_t^j (i.e., learning curves; Fig. 3). Representative A1 time series were chosen by testing for the main effect of CS presence, and V1 time series were selected by testing for the simple interaction described above. (The goal of DCM is to explain regional effects [as detected in a voxel-wise GLM analysis] in terms of interregional connectivity and its experimentally induced changes. This puts congruence constraints on the contrast used to identify a regional time series and the mechanisms in a DCM that are proposed to model this time series. Therefore, different contrasts are typically required for selecting time series representing the different areas in a model; c.f. Stephan, Harrison, et al. 2007.) We did not model the 4-way interaction with DCM because the SPM analysis showed that the learning effect was driven by the CS+ (see Results section).

As the exact locations of activation maxima varied over subjects, we ensured the comparability of our models across subjects by using combined anatomical–functional constraints in selecting the subject-specific time series (c.f. Stephan, Marshall, et al. 2007). Specifically, we thresholded the subject-specific SPMs at P < 0.05 and chose the local maximum within 8 mm of the group activation maxima in primary auditory cortex (A1) and primary visual cortex (V1) as inferred by a probabilistic cytoarchitectonic atlas in MNI space (Eickhoff et al. 2005). As a summary time series, we computed the first eigenvector across all suprathreshold voxels within a radius of 4 mm around the

Figure 4. Dynamic causal models of learning effects on audio-visual connectivity. For all 3 models, the primary auditory (A1) and visual (V1) areas are both driven by their respective sensory inputs. The first model tested had a single connection from A1 to V1 (M1). In model 2 (M2) the V1 → A1 connection was added. In both M1 and M2, the A1 → V1 connection was allowed to change during CS+ trials as a function of the visual outcome (V+ vs. V−) and the RW learning curve (φ). This modulatory effect corresponds to the interaction of the auditory CS+ prediction with the visual outcome and models a learning-dependent contribution to V1 responses from CS+ responses in A1; and this contribution depends on whether the visual stimulus was present or not (c.f., a prediction error mediated by top-down signals from A1). In the third model, suggested as a control by one of the reviewers, instead of the A1 → V1 connection, the V1 → A1 connection is modulated by the learning signal.
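The learning curve that modulates connectivity in these models can be illustrated with a minimal sketch, assuming that RW abbreviates the Rescorla-Wagner rule. The function name, the learning rate, the initial value, and the outcome coding (1 = visual stimulus presented, 0 = omitted) are illustrative assumptions and are not taken from the study.

```python
def rw_learning_curve(outcomes, learning_rate=0.1, initial=0.0):
    """Trial-by-trial predictions and prediction errors under a
    Rescorla-Wagner style update (hypothetical parameter values).

    outcomes: sequence of observed outcomes, e.g. 1.0 if the visual
              stimulus followed the cue and 0.0 if it was omitted.
    Returns (predictions, errors), where predictions[t] is the
    prediction held before outcome t is observed.
    """
    prediction = float(initial)
    predictions, errors = [], []
    for outcome in outcomes:
        predictions.append(prediction)
        error = outcome - prediction          # signed prediction error
        errors.append(error)
        prediction += learning_rate * error   # move prediction toward outcome
    return predictions, errors


if __name__ == "__main__":
    preds, errs = rw_learning_curve([1, 1, 0, 1], learning_rate=0.5)
    for trial, (p, e) in enumerate(zip(preds, errs), start=1):
        # abs(e) is an unsigned "surprise" measure for the same trial
        print(f"trial {trial}: prediction={p:.3f}, error={e:+.3f}, surprise={abs(e):.3f}")
```

In a sketch like this, the list of predictions plays the role of the learning curve, while the absolute value of each error gives a measure of surprise that ignores whether a stimulus was unexpectedly present or unexpectedly absent.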
Figure 5. fMRI results. (A) Significant activations in V1 as a function of RW learning, for both the 4-way interaction (CS type × CS presence × visual outcome × RW learning; red), and the simple (3-way) interaction (blue), which is restricted to the CS+ trials (x = 6, also showing the caudate activation) and (B) in the putamen bilaterally (y = 6), displayed on the mean structural image across all subjects. (C) z = 12. Significant 3-way interaction CS type × CS presence × RW learning in the DLPFC and left putamen (red). This interaction is driven by the CS+ trials, as shown by the simple interaction in blue.
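As a rough illustration of what an interaction such as CS presence × visual outcome × RW learning means at the level of a regressor, the sketch below multiplies two factors coded as +1/−1 with a mean-centred learning curve. This is a generic way of forming an interaction term and is only an assumption here; the function name and the toy values are hypothetical, and the study's actual SPM design and contrasts may have been specified differently.

```python
import numpy as np


def interaction_regressor(cs_present, visual_outcome, learning_curve):
    """Element-wise product of two +1/-1 coded factors and a
    mean-centred covariate (illustrative, not the study's design).

    cs_present, visual_outcome: per-trial booleans (or 0/1 values).
    learning_curve: per-trial learned prediction, e.g. from an RW model.
    """
    cs = np.where(np.asarray(cs_present, dtype=bool), 1.0, -1.0)
    vis = np.where(np.asarray(visual_outcome, dtype=bool), 1.0, -1.0)
    phi = np.asarray(learning_curve, dtype=float)
    phi = phi - phi.mean()   # centre so the term is not confounded with the mean
    return cs * vis * phi


if __name__ == "__main__":
    reg = interaction_regressor(
        cs_present=[1, 1, 0, 0],
        visual_outcome=[1, 0, 1, 0],
        learning_curve=[0.0, 0.2, 0.4, 0.6],
    )
    print(reg)
```

With this coding, congruent trials (cue and outcome both present, or both absent) and incongruent trials receive opposite signs, so the regressor captures how the difference between them scales with learning.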
design in our study allows us to circumvent this problem, as it comprises conditions that correspond to congruent and incongruent prediction/outcome combinations, respectively. Analyzing the 4-way interaction between our experimental factors, we found that responses in the primary visual cortex and the putamen were sensitive to surprising events; over time, these areas became significantly more active when presented with a surprising cue–outcome combination. Learning was stronger for the CS+ blocks than for the CS– blocks, which is in line with previous behavioral evidence (Wasserman et al. 1993; Fletcher et al. 2001). Previous fMRI studies in humans have demonstrated that BOLD activity in the striatum is correlated with (signed) prediction errors during reinforcement learning (O’Doherty et al. 2003; McClure et al. 2003; O’Doherty et al. 2004; Seymour et al. 2004; Jensen et al. 2007; Menon et al. 2007) and other associative learning tasks (Corlett et al. 2004). In these studies, the learned associations, and the sign of the resulting prediction errors, were of direct relevance for behavior. The current study shows that the putamen is sensitive to unexpected outcomes even when the cue-stimulus association is learned incidentally and has no relevance to behavior. However, in contrast to the previous studies, the pattern of putamen activity does not appear to be sensitive to the direction of the prediction error, only to its amplitude. This difference may reflect the fact that learning was perceptual as opposed to operant. In other words, the occurrence of an unpredicted or surprising event may play the role of negative reward, irrespective of whether the surprising event entailed the presence or absence of a stimulus. This issue will be discussed further in the section on predictive coding below.

Role of Prediction Errors Beyond Reinforcement Learning

Our finding that learning-induced responses in primary visual cortex and the putamen reflected prediction errors accords with a basic principle emerging from many previous studies: prediction errors, or surprise, constitute a driving force for learning because they signal the need for learning in order to update predictions (Shanks 1995; Schultz et al. 1997; Schultz and Dickinson 2000). Although the role of prediction errors has been mainly explored for reinforcement learning so far, there is growing evidence that prediction errors may be equally important for learning statistical relationships that are affectively neutral and behaviorally irrelevant. In other words, the same mechanisms that optimize the learning of stimulus–response links may operate during the perceptual learning of stimulus–stimulus associations (Rao and Ballard 1999; Friston 2005). Evidence that organisms learn predictive associations between initially neutral stimuli is seen in classical conditioning effects such as sensory preconditioning (Brogden 1939). Some forms of sensory learning also exhibit such features, for example, the mismatch negativity (MMN) paradigm, in which responses to sensory stimuli decrease with predictability (Friston 2005; Baldeweg 2006), regardless of whether stimuli are attended. A mechanism similar to predictive coding has been proposed in the motor domain for cancellation of self-generated events (Wolpert et al. 1995; Blakemore et al. 1998; Shergill et al. 2005). Moreover, the learning of predictive relationships that are affectively neutral and task-irrelevant may engage similar computational and neural mechanisms as those for predicting significant events (Zink et al. 2006; Wittmann et al. 2007).

The results of the present study support the notion that the role of prediction errors in learning transcends the simple reinforcement of stimulus–response links and plays a more pervasive and general role in various forms of learning. Indeed, a hallmark of adaptive systems is their ability to minimize surprising exchanges with their environment (Friston et al. 2006). This entails adjustments to their internal models of the environment so that potentially surprising events can be predicted. Almost universally, this adjustment involves changes