Abstract
When a monkey searches for a colour and orientation feature conjunction target, the scan path is guided to target coloured locations in
preference to locations containing the target orientation [Vision Res. 38 (1998b) 1805]. An active vision model, using biased competition, is
able to replicate this behaviour. As object-based attention develops in extrastriate cortex, featural information is passed to posterior parietal
cortex (LIP), enabling it to represent behaviourally relevant locations [J. Neurophysiol. 76 (1996) 2841] and guide the scan path. Attention
evolves from an early spatial effect to being object-based later in the response of the model neurons, as has been observed in monkey single
cell recordings. This is the first model to reproduce these effects with temporal precision; it is reported here at the systems level, allowing the replication of psychophysical scan paths.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Visual attention; Biased competition; Active visual search; Mean field population approach
doi:10.1016/j.neunet.2004.03.012
the cell's receptive field is attenuated by the addition of a second non-preferred stimulus (i.e. a stimulus that causes only a weak response when presented alone in the receptive field). Responses are eventually determined by which of the two stimuli is attended. When the preferred stimulus is attended, the response approaches that when this stimulus is presented alone. If the non-preferred stimulus is attended, responses are severely suppressed, despite the presence of the preferred stimulus in the receptive field. Neurons from modules representing areas IT and V4 from the model presented here have been able to replicate such effects at the cellular level (Lanyon & Denham, submitted). Here, we examine the systems level behaviour of the model in more detail when replicating the nature of search scan paths observed by Motter and Belky (1998b), who found most fixations landed within 1° of stimuli (only 20% fell in blank areas of the display despite the use of very sparse displays) and these stimuli tended to be target coloured (75% of fixations landed near target coloured stimuli and only 5% near non-target coloured stimuli).

1.2. Visual attention

There has been debate over the issue of whether visual attention operates purely as a spatial 'spotlight' (Crick, 1984; Helmholtz, 1867; Treisman, 1982) or is more complex, operating in an object-based manner. The evidence for object-based attention has been convincing and growing in the psychophysical literature (Blaser, Pylyshyn, & Holcombe, 2000; Duncan, 1984), from functional magnetic resonance imaging (O'Craven, Downing, & Kanwisher, 1999), event-related potential recordings (Valdes-Sosa, Bobes, Rodriguez, & Pinilla, 1998; Valdes-Sosa, Cobo, & Pinilla, 2000) and from single cell recordings (Chelazzi et al., 1993, 2001; Roelfsema, Lamme, & Spekreijse, 1998). However, there is no doubt that attention can also produce spatially specific effects (Bricolo, Gianesini, Fanini, Bundesen, & Chelazzi, 2002; Connor, Gallant, Preddie, & Van Essen, 1996). In the lateral intraparietal area (LIP), an anticipatory spatial enhancement of responses has been recorded from single cells (Colby, Duhamel, & Goldberg, 1996) and has been seen in imaging of the possible human homologue of LIP (Corbetta, Kincade, Ollinger, McAvoy, & Shulman, 2000; Hopfinger, Buonocore, & Mangun, 2000; Kastner, Pinsk, De Weerd, Desimone, & Ungerleider, 1999). Spatial effects have been recorded in single cells in area V4 in advance of the sensory response and have then modulated the earliest stimulus-invoked response at 60 ms post-stimulus (Luck, Chelazzi, Hillyard, & Desimone, 1997). However, object-based effects have not been recorded until much later in the response, from ~150 ms in IT and V4 (Chelazzi et al., 1993, 2001).

The model presented here has been used (Lanyon & Denham, submitted) to suggest that spatial attention is available earlier than object-based attention, at least in area V4, because the latter relies on the resolution of competition between objects in IT. The model was able to combine spatial and object-based attention at both the single cell and systems level in order to reproduce attentional effects seen in single cells in V4 (Chelazzi et al., 2001; Luck et al., 1997) and IT (Chelazzi et al., 1993) with temporal accuracy. Here, we use these attentional effects to produce biologically plausible active vision behaviour.

There is evidence to suggest that an eye movement may be linked with a spatial enhancement of responses in area V4, because microstimulation of the frontal eye field (FEF), which is involved in the allocation of attention and eyes to locations in the scene (Schall, 2002), results in responses in V4 that are spatially modulated (Moore & Armstrong, 2003). In addition to anticipatory spatial increases in activity being found in LIP (Colby et al., 1996), posterior parietal cortex in general is implicated in the control of both spatial and object-based attention (Corbetta et al., 2000; Corbetta, Shulman, Miezin, & Petersen, 1995; Fink, Dolan, Halligan, Marshall, & Frith, 1997; Hopfinger et al., 2000; Martinez et al., 1999; Posner, Walker, Friedrich, & Rafal, 1984; Robinson, Bowman, & Kertzman, 1995). Therefore, the model assumes that FEF provides a spatial bias to V4 via circuitry in LIP. Thus, a spatial bias is applied directly from FEF to LIP, and LIP then biases V4. The source of the bias to LIP could also be dorsolateral prefrontal cortex, which has connections with parietal cortex (Blatt, Andersen, & Stoner, 1990), or pulvinar.

1.3. Visual search behaviour

When a visual target contains a simple feature that is absent from distractors, it tends to effortlessly 'pop out' from the scene. However, when a target is defined by a conjunction of features, the search takes longer and appears to require a serial process, which has been suggested to be the serial selection of spatial locations to which attention is allocated (Treisman, 1982; Treisman & Gelade, 1980).

Active visual search involves the movement of the eyes, and, it is presumed, attention (Hoffman & Subramaniam, 1995), to locations to be inspected. The resultant series of points where the eyes fixate forms a scan path. This differs from the more commonly modelled covert search, where the attentive focus is shifted but eye position and, hence, retinal input are held constant. During active search for a feature conjunction target, it seems that colour (or luminance) is more influential on the scan path than form features, such as orientation, in monkeys (Motter & Belky, 1998b) and in humans (Scialfa & Joffe, 1998; Williams & Reingold, 2001). When the numbers of each distractor type are equal, this preference for target coloured locations over locations containing the target orientation seems robust, even when the task is biased towards orientation discrimination (Motter & Belky, 1998b). Colour appears to segment the scene and guide the scan path in a manner that resembles guided search (Wolfe, 1994; Wolfe, Cave, & Franzel, 1989). Even abrupt
Fig. 2. The inhibition and biases that influence competition in each of the dynamic modules: (a) IT; (b) LIP; (c) V4.
more complex. Anterior areas in IT, such as area TE, are not retinotopic but encode objects in an invariant manner (Wallis & Rolls, 1997), and the IT module here represents such encoding. An inhibitory interneuron pool mediates competition between objects in IT. V1 and V4 are retinotopic and process both colour and orientation. A V1 cell for every feature exists at every pixel position in the image. V4 receives convergent inputs from V1 over the area of its receptive field, and V4 receptive fields overlap by one V1 neuron. V4 is arranged as a set of feature 'layers' encoding each feature in a retinotopic manner. Each feature belongs to a feature type, i.e. colour or orientation. V4 neurons are known to be functionally segregated (Ghose & Ts'o, 1997) and the area is involved in the representation of colour as well as form (Zeki, 1993).

LIP provides a retinotopic spatio-featural map that is used to control the spatial focus of attention and fixation. Locations in LIP compete with one another, and the centre of the receptive field of the assembly with the highest activity is chosen as the next fixation point. LIP is able to integrate featural information in its spatial map due to its connection with area V4. Competition between locations in LIP is mediated by an inhibitory interneuron pool.

Stimuli consist of vertical and horizontal, red and green bars. During active search for an orientation and colour conjunction target, distractors differ from the target in one feature dimension only. Stimuli of this type were chosen to mirror those used by Motter and Belky (1998b), in order that the active vision scan paths from this experiment could be simulated. The size of the V1 orientation and colour filters, described in Section A.2, determines the number of pixels that represent 1° of visual angle, since V1 receptive fields tend to cover no more than about 1° (Wallis & Rolls, 1997). This size is then used to scale the stimuli to be 1 × 0.25°, as used by Motter and Belky (1998a,b).

The model operates in an active vision manner by moving its retina around the image, so that its view of the world is constantly changing. Most visual attention models (Deco, 2001; Deco & Lee, 2002; Niebur, Itti, & Koch, 2001) have a static retina. Here, cortical areas receive different bottom-up input from the retina and V1 at each fixation. The retinal image is the view of the scene entering the retina, and cortical areas, at any particular fixation. From the information within the retinal image, the system has to select its next fixation point in order to move its retina. The size of the retinal image is variable for any particular simulation but is normally set to 441 pixels, which equates to approximately 40° of visual angle. This is smaller than our natural vision, but stable performance across a range of retinal image sizes is possible (the only restriction being that a very small retina tends to lead to a higher proportion of fixations landing in blank areas of very sparse scenes, due to lack of stimuli within the limited retina) due to the normalisation of inputs to IT, described in Section A.3.2.
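The moving-retina arrangement and the scaling of the cortical modules can be summarised concretely. The following is a minimal sketch (Python; the function names and zero-padding behaviour are ours, not the authors' implementation), using the figures quoted in the text: a 441-pixel retinal window, and V4 receptive fields of 23 V1 pixels that overlap their neighbours by one pixel, which yields the 20 × 20 assembly grid per feature layer.

```python
import numpy as np

# Illustrative sketch (our names, not the authors' code): clip a retinal
# window from the scene and size the V4/LIP grids from it.  Numbers follow
# the text: a 441-pixel window (~40 deg), V4 receptive fields of 23 V1
# pixels overlapping their neighbours by one pixel (stride 22), giving a
# 20 x 20 assembly grid per V4 feature layer.

RETINA = 441          # retinal window width in pixels (~40 deg)
RF, STRIDE = 23, 22   # assumed V4 receptive field width and stride

def retinal_window(scene, fix_row, fix_col, size=RETINA):
    """Return the scene region visible at this fixation, zero-padded where
    the window extends beyond the scene edge (no bottom-up input there)."""
    half = size // 2
    window = np.zeros((size, size))
    r0, c0 = fix_row - half, fix_col - half
    rs, cs = max(r0, 0), max(c0, 0)
    re, ce = min(r0 + size, scene.shape[0]), min(c0 + size, scene.shape[1])
    window[rs - r0:re - r0, cs - c0:ce - c0] = scene[rs:re, cs:ce]
    return window

def v4_grid_size(retina_px=RETINA, rf=RF, stride=STRIDE):
    """Number of V4 (and LIP) assemblies along one side of the window."""
    return (retina_px - rf) // stride + 1

if __name__ == "__main__":
    scene = np.random.rand(1000, 1000)       # stand-in for a search image
    win = retinal_window(scene, 120, 950)    # fixation near an edge
    print(win.shape, v4_grid_size())         # (441, 441) 20
```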
Cortical areas V4 and LIP are scaled dependent on the size of the retinal image, so that larger retinal images result in more assemblies in V4 and LIP and, thus, longer processing times during the dynamic portion of the system. For a 40° retinal window, V4 consists of 20 × 20 assemblies in each feature layer, and processing in 5 ms steps for typical fixations lasting ~240 ms (saccade onset is determined by the system) takes approximately 12 s to run in Matlab using a Pentium 4 PC with a 3.06 GHz processor and 1 GB of RAM. For monochromatic single cell simulations, the system takes 0.5 s to process a 240 ms fixation at 1 ms steps using a small retinal image (covering 23 × 23 pixels).

2.1. Spatial attention

Following fixation, an initial spatial attention window (AW) is formed. The aperture of this window is scaled according to coarse resolution information reflecting local stimulus density (Motter & Belky, 1998a), which is assumed to be conveyed rapidly by the magnocellular pathway to parietal cortex, including LIP, and other possibly involved areas, such as FEF. Alternatively, this information may be conveyed sub-cortically, and the superior colliculus or pulvinar could be the source of this spatial effect in LIP and V4. All other information within the system is assumed to be derived from the parvocellular pathway. Thus, during a scan path, the size of the AW is dynamic, being scaled according to the stimulus density found around any particular fixation point (see Lanyon and Denham (2004a) for further details), as sketched below. Attention gradually becomes object-based over time, as competition between objects and features is resolved. Object-based attention is not constrained by the spatial AW but is facilitated within it. Thus, object-based attention responses are strongest within the AW, causing a combined attentional effect, as found by McAdams and Maunsell (2000) and Treue and Martinez Trujillo (1999).

The initial spatial AW is implemented as a spatial bias provided from FEF to LIP, which results in a spatial attention effect in LIP that is present in anticipation of stimulus information (Colby et al., 1996; Corbetta et al., 2000; Hopfinger et al., 2000; Kastner et al., 1999). A connection from LIP to V4 allows a spatial attentional effect to be present in V4, as found in many studies (Connor et al., 1996; Luck et al., 1997). Spatial attention in the model V4 assemblies appears as an increase in baseline firing in advance of the stimulus information and as a modulatory effect on the stimulus-invoked response beginning at ~60 ms, as found in single cells by Luck et al. (1997) and reported in Lanyon and Denham (submitted). Spatial attention in V4 provides a facilitatory effect to object-based attention within the AW. The excitatory connection from LIP to V4 also serves to provide feature binding at the resolution of the V4 receptive field.
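The density-dependent sizing of the AW can be illustrated as follows. The exact scaling law is given in Lanyon and Denham (2004a) and is not reproduced in this extract, so the inverse mapping and all constants in this sketch are illustrative assumptions; only the qualitative behaviour (dense surround, small AW; sparse surround, large AW) follows the text.

```python
import numpy as np

# Hedged sketch of AW sizing: the aperture shrinks as local stimulus
# density grows.  The published scaling law (Lanyon & Denham, 2004a) is
# not reproduced in this extract, so the inverse mapping and every
# constant below are illustrative assumptions, not the published values.

def aw_radius(stimulus_mask, fix_row, fix_col,
              probe_radius=110, min_r=30, max_r=220):
    """Map coarse local density (a stand-in for the rapid magnocellular
    estimate around fixation) inversely onto an AW radius in pixels."""
    r0 = max(fix_row - probe_radius, 0)
    c0 = max(fix_col - probe_radius, 0)
    patch = stimulus_mask[r0:fix_row + probe_radius,
                          c0:fix_col + probe_radius]
    density = patch.mean()                  # fraction of pixels on stimuli
    return int(np.clip(max_r / (1.0 + 500.0 * density), min_r, max_r))

if __name__ == "__main__":
    sparse = np.zeros((1000, 1000)); sparse[::97, ::97] = 1.0
    dense = np.zeros((1000, 1000)); dense[::13, ::13] = 1.0
    print(aw_radius(sparse, 500, 500), ">", aw_radius(dense, 500, 500))
```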
2.2. Object-based attention

Object-based attention operates within the ventral stream of the model (V4 and IT), in order that features belonging to the target object are enhanced and non-target features are suppressed. A connection from the retinotopic ventral stream area (V4) to the parietal stream of the model allows the ventral object-based effect to influence the representation of behaviourally relevant locations in the LIP module.

The prefrontal object-related bias to IT provides a competitive advantage to the assembly encoding the target object such that, over time, this object wins the competition in IT. Attention appears to require working memory (de Fockert, Rees, Frith, & Lavie, 2001) and, due to its sustained activity (Miller, Erickson, & Desimone, 1996), prefrontal cortex has been suggested as the source of a working memory object-related bias to IT. Other models have implemented such a bias (Deco & Lee, 2002; Renart, Moreno, de la Rocha, Parga, & Rolls, 2001; Usher & Niebur, 1996). Here, the nature of this bias resembles the responses of so-called 'late' neurons in prefrontal cortex, whose activity builds over time and tends to be highest late in a delay period (Rainer & Miller, 2002; Romo, Brody, Hernandez, & Lemus, 1999). It is modelled with a sigmoid function, as described in Section A.3.2 and sketched below. This late response also reflects the time taken for prefrontal neurons to distinguish between target and non-target objects, beginning 110–120 ms after the onset of stimuli at an attended location (Everling, Tinsley, Gaffan, & Duncan, 2002).

IT provides an ongoing object-related bias to V4 that allows target features to win local competitions in V4, such that these features become the most strongly represented across V4. Each assembly in IT provides an inhibitory bias to features in V4 that do not relate to the object encoded by it. As the target object becomes most active in IT, this results in suppression of non-target features in V4. These object-based effects appear in IT and V4 later in their response, from ~150 ms post-stimulus, as was found in single cell recordings (Chelazzi et al., 1993, 2001; Motter, 1994a,b). The result of object-based attention in V4 is that target features are effectively 'highlighted' in parallel across the visual field (McAdams & Maunsell, 2000; Motter, 1994a,b).
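The time course of this 'late' prefrontal bias can be sketched directly from the description above: a sigmoid of time since fixation onset, half-maximal at t_sig (150 ms for scan path simulations; cf. Eq. (A25) in Appendix A). Sign and gain conventions are applied where the bias enters the IT dynamics, so only the normalised build-up is shown; this is our illustration, not the authors' code.

```python
import numpy as np

# Time course of the 'late' prefrontal working-memory bias (our sketch):
# a sigmoid of time since fixation onset, half-maximal at t_sig, which is
# 150 ms in the scan path simulations (cf. Eq. (A25)).

def prefrontal_bias(t_ms, t_sig=150.0):
    """Normalised magnitude of the object-related bias at t ms post-fixation."""
    return 1.0 / (1.0 + np.exp(t_sig - t_ms))

if __name__ == "__main__":
    for t in (0, 100, 150, 200):
        print(t, round(float(prefrontal_bias(t)), 3))   # builds late in the fixation
```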
2.3. The scan path

In order to carry out effective visual search, brain areas involved in the selection of possible target locations should be aware of object-based effects that are suppressing non-target features in the visual scene (as represented in the retinotopic visual areas). This model suggests that object-based effects occurring in area V4 are able to influence the spatial competition for the next fixation location in LIP. It is this cross-stream interaction between the ventral and dorsal visual streams (Milner & Goodale, 1995; Ungerleider & Mishkin, 1982) that allows visual search to select
appropriate stimuli for examination. Thus, object-based attention, at least at the featural level, may be crucial for efficient search.

As object-based attention becomes effective within the ventral stream, the parallel enhancement of target features across V4 results in LIP representing these locations as being salient, due to its connection from V4. The weight of connection from the V4 colour assemblies to LIP is slightly stronger than that from the V4 orientation assemblies. This gives target coloured locations an advantage in the spatial competition in LIP, such that target coloured locations are represented as more behaviourally relevant and, hence, tend to be most influential in attracting the scan path, as found by Motter and Belky (1998b). The difference in strength of connection of the V4 features to LIP need only be marginal in order to achieve this effect. Fig. 3 shows the effect of adjusting the relative connection weights. The strength of these connections could be adapted subject to cognitive requirement or stimulus-related factors, such as distractor ratios (Bacon & Egeth, 1997; Shen, Reingold, & Pomplun, 2000, 2003). However, with the proportion of distractor types equal, Motter and Belky (1998b) found that orientation was unable to override colour even during an orientation discrimination task. This suggests that the bias towards a stronger colour connection could be learnt during development and be less malleable to task requirement.

Fig. 3. The effect on fixation position of increasing the relative weight of V4 colour feature input to LIP. When V4 colour features are marginally more strongly connected to LIP than V4 orientation features, the scan path is attracted to target coloured stimuli in preference to stimuli of the target orientation. Fixation positions were averaged over 10 scan paths, each consisting of 50 fixations, over the image shown in Fig. 6a.

2.4. Saccades

Single cell recordings (Chelazzi et al., 1993, 2001) provide evidence that saccade onset may be temporally linked to the development of significant object-based effects, with saccades taking place ~70–80 ms after a significant effect was observed in either IT or V4. In the model, saccades are linked to the development of a significant object-based effect in IT. The effect is deemed to be significant when the most active object assembly is twice as active as its nearest rival (such a quantitative difference is reasonable when compared to the recordings of Chelazzi et al., 1993) and a saccade is initiated 70 ms later, reflecting motor preparation latency.

It is unclear what information is available within cortex during a saccade, but evidence suggests that magnocellular inputs are suppressed during this time (Anand & Bridgeman, 2002; Thiele, Henning, Kubischik, & Hoffmann, 2002; and see Ross, Morrone, Goldberg, and Burr (2001) and Zhu and Lo (1996) for reviews). In the model, saccades are instantaneous, but the dynamic cortical areas (IT, LIP and V4) are randomised at the start of the subsequent fixation in order to reflect saccadic suppression of magnocellular, and possibly parvocellular, inputs and cortical dynamics during the saccade.
2.5. Inhibition of return

As an integrator of spatial and featural information (Colby et al., 1996; Gottlieb et al., 1998; Toth & Assad, 2002), LIP provides the inhibition of saccade return (Hooge & Frens, 2000) mechanism required here to prevent the scan path returning to previously inspected sites. Inhibitory after-effects once attention is withdrawn from an area are demonstrated in classic inhibition of return (IOR) studies (Clohessy, Posner, Rothbart, & Vecera, 1991; Posner et al., 1984) and may be due to oculomotor processes, possibly linked with the superior colliculus (Sapir, Soroker, Berger, & Henik, 1999; Trappenberg, Dorris, Munoz, & Klein, 2001), or a suppressive after-effect of high activity at a previously attended location. In a model with a static retina, suppression of the most active location over time by specific IOR input (Itti & Koch, 2000; Niebur et al., 2001) or through self-inhibition is possible. Here, such a process within LIP could lead to colour-based IOR (Law, Pratt, &
Abrams, 1995) due to the suppression of the most active locations. When the retina is moving, such inhibition is inadequate, because there is a need to remember previously visited locations across eye movements, and there may be a requirement for a head- or world-centred mnemonic representation. This is a debated issue, with some evidence to suggest that humans use very little memory during search (Horowitz & Wolfe, 1998; Woodman, Vogel, & Luck, 2001). However, even authors advocating 'amnesic search' (Horowitz & Wolfe, 1998) do not preclude the use of higher-level cognitive and mnemonic processes for efficient active search. Parietal damage is linked to the inability to retain a spatial working memory of searched locations across saccades, so that locations are repeatedly re-fixated (Husain et al., 2001), and computational modelling has suggested that units with properties similar to those found in LIP could contribute to visuospatial memory across saccades (Mitchell & Zipser, 2001). Thus, it is plausible for the model LIP to be involved in IOR.

Also, IOR seems to be influenced by recent event/reward associations linked with orbitofrontal cortex (Hodgson et al., 2002). In the model, the potential reward of a location is linked to its novelty, i.e. whether a location has previously been visited in the scan path and, if so, how recently. Competition in LIP is biased by the 'novelty' of each location, with the possible source of such a bias being frontal areas, such as orbitofrontal cortex. A mnemonic map of novelty values is constructed in a world- or head-centred co-ordinate frame of reference and converted into retinotopic co-ordinates when used in LIP. Initially, every location in the scene has a high novelty, but when fixation (and, thus, attention) is removed from an area, all locations that fall within the spatial AW have their novelty values reduced. At the fixation point the novelty value is set to the lowest value. In the immediate vicinity of the fixation point (the area of highest acuity and, therefore, discrimination ability, in a biological system) the novelty is set to low values that gradually increase, in a Gaussian fashion, with distance from the fixation point (Hooge & Frens, 2000). All locations that fall within the AW, but are not in the immediate vicinity of the fixation point, have their novelty set to a neutral value. Novelty is allowed to recover linearly with time. This allows IOR to be present at multiple locations, as has been found in complex scenes (Danziger, Kingstone, & Snyder, 1998; Snyder & Kingstone, 2001; Tipper, Weaver, & Watson, 1996), where the magnitude of the effect decreases approximately linearly from its largest value at the most recently searched location, so that at least five previous locations are affected (Irwin & Zelinsky, 2002; Snyder & Kingstone, 2000). A sketch of this scheme follows this section.

The use of such a scene-based map (in world- or head-centred coordinates) reflects a simplification of processing that may occur within parietal areas, such as the ventral intraparietal area, where receptive fields range from retinotopic to head-centred (Colby & Goldberg, 1999), or area 7a, where neurons respond to both the retinal location of the stimulus and eye position in the orbit. Such a mapping may be used to keep track of objects across saccades, and this concept has already been explored in neural network models (Andersen & Zipser, 1988; Mazzoni, Andersen, & Jordan, 1991; Quaia, Optican, & Goldberg, 1998; Zipser & Andersen, 1988). Representations in LIP are retinotopic such that, after a saccade, the representation shifts to the new co-ordinate system based on the post-saccadic centre of gaze. Just prior to a saccade, the spatial properties of receptive fields in LIP change (Ben Hamed, Duhamel, Bremmer, & Graf, 1996), and many LIP neurons respond (~80 ms) before the saccade to salient stimuli that will enter their receptive fields after the saccade (Duhamel, Colby, & Goldberg, 1992). In common with most models of this nature (Deco & Lee, 2002), such 'predictive re-mapping' of the visual scene is not modelled here. At this time, it is left to specialised models to deal with the issue of pre- and post-saccadic spatial constancy, involving changes in representation around the time of a saccade (Ross, Morrone, & Burr, 1997), memory for targets across saccades (Findlay, Brown, & Gilchrist, 2001; McPeek, Skavenski, & Nakayama, 2000) and the possible use of visual markers for co-ordinate transform (Deubel, Bridgeman, & Schneider, 1998), along with the associated issue of suppression of magnocellular (Anand & Bridgeman, 2002; Thiele, Henning, Kubischik, & Hoffmann, 2002; and see Ross, Morrone, Goldberg, and Burr (2001) and Zhu and Lo (1996) for reviews) and possibly parvocellular cortical inputs just prior to and during a saccade.
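A minimal sketch of the novelty map described above follows (our implementation; the neutral value, vicinity width and linear recovery rate are assumed, since the text gives the scheme but not these constants).

```python
import numpy as np

# Sketch of the world-centred novelty map used for inhibition of return.
# Novelty starts high everywhere; when gaze leaves a fixation, the fixated
# point drops to the minimum, its immediate neighbourhood rises with
# distance in a Gaussian fashion, the rest of the AW is set to a neutral
# value, and everything recovers linearly with time.  All constants here
# are illustrative assumptions.

HIGH, NEUTRAL, LOW = 1.0, 0.5, 0.0
RECOVERY_PER_MS = 1e-4          # assumed linear recovery rate

def init_novelty(shape):
    return np.full(shape, HIGH)

def leave_fixation(novelty, fr, fc, aw_radius, vicinity_sigma=15.0):
    rows, cols = np.indices(novelty.shape)
    d = np.hypot(rows - fr, cols - fc)
    in_aw = d <= aw_radius
    novelty[in_aw] = NEUTRAL                       # searched area: neutral
    vicinity = d <= 3 * vicinity_sigma             # area of highest acuity
    grow = LOW + (NEUTRAL - LOW) * (1 - np.exp(-d ** 2 / (2 * vicinity_sigma ** 2)))
    novelty[vicinity & in_aw] = grow[vicinity & in_aw]   # lowest at centre
    return novelty

def recover(novelty, dt_ms):
    return np.minimum(novelty + RECOVERY_PER_MS * dt_ms, HIGH)

if __name__ == "__main__":
    nov = init_novelty((400, 400))
    nov = leave_fixation(nov, 200, 200, aw_radius=80)
    print(nov[200, 200], nov[200, 260], nov.max())   # 0.0 0.5 1.0
    nov = recover(nov, 240)
    print(round(float(nov[200, 200]), 3))            # 0.024
```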
3. Results

3.1. Dynamics of object-based attention result in representation of behaviourally relevant locations in LIP

Fig. 4 shows the activity in V4, IT and LIP at different times during the first fixation on an image. The outer box plotted on the image in Fig. 4a represents the retinal image and the inner box represents the AW. Initially, spatial attention within the AW modulates representations in LIP and V4. Object assemblies in IT are all approximately equally active. Later in the response (from ~150 ms post-fixation), object-based attention develops and the target object becomes most active in IT, whereas distractor objects are suppressed. Features belonging to the red vertical target are enhanced at the expense of the non-target features across V4 (Motter, 1994a,b). Responses in V4 are modulated by both spatial attention and object/feature attention (Anllo-Vento & Hillyard, 1996; McAdams & Maunsell, 2000). This occurs because V4 is subject to both a spatial bias, from LIP, and an object-related bias, from IT. Both inputs are applied to V4 throughout the temporal processing at each fixation. However, the spatial bias results in an earlier
spatial attention effect in V4 than the object effect, because the latter is subject to the development of significant object-based attention in IT resulting from the resolution of competition therein.

Once these object-based effects are present in V4, LIP is able to represent the locations that are behaviourally relevant, i.e. the red coloured locations, as possible saccade targets. Prior to the onset of object-based attention, all
stimuli are approximately equally represented in LIP (Fig. 4c), and a saccade at this time would select target and non-target coloured stimuli with equal probability. Saccade onset is determined by the development of object-based effects in IT and, by the time this has occurred later in the response (Fig. 4d), the target coloured locations in LIP have become more active than non-target coloured locations. Therefore, the saccade tends to select a target coloured location. Increased fixation duration has been linked with more selective search (Hooge & Erkelens, 1999), and this would be explained, in this model, by the time taken to develop object-based effects in the ventral stream and convey this information to LIP.

Fig. 4. (a) A complete scene with the retinal image shown as the outer of two boxes plotted. Within the retinal image the initial spatial AW, shown as the inner box, is formed. This AW is scaled according to local stimulus density. (N.B. When figures containing red and green bars are viewed in greyscale print, the red appears as a darker grey than the green.) (b) The activity within the cortical areas 45 ms after the start of the fixation. This is prior to the sensory response in V4 (at 60 ms). However, there is an anticipatory elevation in activity level within the spatial AW in LIP and V4, as seen in single cell studies of LIP (Colby et al., 1996) and V4 (Luck et al., 1997). (c) The activity within the cortical areas 70 ms after the start of the fixation. At this time, spatial attention modulates responses in V4 and LIP. LIP represents both red and green stimulus locations. (d) The activity 180 ms after the start of the fixation. Object-based attention has significantly modulated responses in IT and V4. LIP represents the location of target coloured (red) stimuli more strongly than non-target coloured (green) stimuli. Object-based effects are still subject to a spatial enhancement in V4 within the AW. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

3.2. Inhibition in V4

The nature of inhibition in V4 affects the representation of possible target locations in LIP. If V4 is implemented with a common inhibitory pool for all features, or one per feature type (i.e. one for colours and one for orientations), this
results in a normalising effect similar to a winner-take-all process, whereby high activity in any particular V4 assembly can strongly suppress other assemblies. V4 assemblies receive excitatory input from LIP, as a result of the reciprocal connection from V4 to LIP. When LIP receives input from all features in V4 as a result of two different types of stimulus (e.g. a red vertical bar and a horizontal green bar) being within a particular V4 receptive field, these feature locations in V4 will become most highly active and may suppress other locations too strongly. Thus, the common inhibitory pool tends to favour locations that contain two different stimuli within the same receptive field. Previous models that have used a common inhibitory pool (Deco, 2001; Deco & Lee, 2002) may not have encountered this problem because only one feature type was encoded. Therefore, in this model currently, the requirement is that features within a particular feature type should compete locally so, for example, an assembly selective for red stimuli competes with an assembly at the same retinotopic location but selective for green stimuli. Thus, inhibitory interneuron assemblies in V4 exist for every retinotopic location in V4 and for each feature type. This is shown in Fig. 2c.
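The inhibition layout that this section argues for can be made explicit as follows. The sketch below (our array naming, not the authors' code) contrasts the chosen arrangement, one inhibitory pool per retinotopic location and per feature type, with the common-pool alternative discussed above.

```python
import numpy as np

# Minimal sketch of the inhibition layout argued for above: one inhibitory
# interneuron pool per retinotopic location AND per feature type (colours
# compete with colours, orientations with orientations, at the same
# location), rather than one pool shared by all features.  Shapes follow
# the 20 x 20 V4 grid; the array names are ours.

N, K, C = 20, 2, 2                        # grid side, orientations, colours

def pool_drive(v4_orient, v4_colour):
    """Drive to each local inhibitory pool.
    v4_orient: (K, N, N) activities; v4_colour: (C, N, N) activities."""
    drive_orient = v4_orient.sum(axis=0)  # (N, N): one pool per location, orientation type
    drive_colour = v4_colour.sum(axis=0)  # (N, N): one pool per location, colour type
    return drive_orient, drive_colour

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ori = rng.random((K, N, N)); col = rng.random((C, N, N))
    d_o, d_c = pool_drive(ori, col)
    # A common pool would broadcast (ori.sum() + col.sum()) to every
    # feature; here red at [6, 13] is suppressed only via d_c[6, 13].
    print(d_o.shape, d_c.shape, round(float(d_c[6, 13]), 3))
```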
Fig. 5. The dynamic effect when a V4 receptive field includes two different stimuli. This image also shows what happens when fixation is close to the edge of the original image (a computational problem not encountered in the real world) and the retinal image extends beyond the image. In this case, the cortical areas receive no bottom-up stimulus information for the area beyond the original image, but other cortical processing remains unaltered. As LIP represents the locations of behaviourally relevant stimuli, the next fixation is never chosen as a location beyond the extent of the original image. (a) The extent of the retinal image (outer box) and AW are shown for a fixation, forced to be at this location. (b) The activity within the cortical areas 70 ms after the start of the fixation. Within the receptive field of V4 assemblies at position [6,13] there is both a red horizontal and a green vertical bar. Therefore, all V4 assemblies at this position receive stimulus input. At the time of the initial sensory response, assemblies at this position were as active as other assemblies in receipt of bottom-up stimulus information. However, by 70 ms, competition between the colours at this location, and between the orientations at this location, has caused the activity here to be slightly lower than that of other assemblies that encode the features of a single stimulus within their receptive field. This is due to there being less competition in the single stimulus case. The addition of the second stimulus drives responses down, as has been found in single cell recordings such as Chelazzi et al. (2001). At this time, all V4 assemblies are approximately equally active and there is no preference for the target object's features. (c) The activity within the cortical areas 200 ms after the start of the fixation, when object-based attention has become significant. The red and vertical (target feature) assemblies at [6,13] (in matrix coordinates) have become more active than the non-target features, which are suppressed, despite each receiving 'bottom-up' information. (d) A plot of the activity of the V4 assemblies over time at position [6,13]. From ~150 ms, object-based effects suppress non-target features. The activity of the vertical orientation and red colour selective assemblies is shown as solid lines, and that of the horizontal orientation and green colour selective assemblies as dotted lines.

This type of local inhibition results in the performance shown in Fig. 5, which records the V4 colour assemblies and LIP at two different time steps during a search for a red target. Within the receptive field of the V4 assemblies at (matrix coordinate) location [6,13], both a red horizontal bar and a green vertical bar are present in the retinal image and are within the spatial AW. During the initial stimulus-related response from 60 ms post-fixation, the red and the green selective assemblies at this location are approximately equally active. These are amongst the strongest V4 assemblies at this point because they are in receipt of bottom-up stimulus information and are within the AW positively biased by LIP. LIP represents the locations of all strong featural inputs from V4 and has no preference for stimulus colour at this point. However, by 70 ms post-stimulus, strong competitive effects, due to there being two
stimuli within the receptive field, result in the overall responses in each V4 feature at this location being lower than at locations that contain only one stimulus, i.e. the addition of a second stimulus lowers the responses of the cell to the preferred stimulus, as found in single cell studies such as Chelazzi et al. (2001). By 200 ms, object-based attention has become effective in V4, and the green assembly in V4 has become significantly suppressed compared to the red assembly at the same location. The representation in LIP is now biased towards the representation of the locations of target coloured stimuli, and it is one of these locations that will be chosen as the next fixation point.

3.3. The scan path

The system produces scan paths that are qualitatively similar to those found by Motter and Belky (1998b), where fixations normally land within 1° of orientation-colour feature conjunction stimuli, rather than in blank areas, and these stimuli tend to be target coloured. Fig. 6a and b shows
Fig. 6. Scan paths that tend to select locations near a target coloured stimulus. Here, IT feedback to V4 is strong (parameter η in Eqs. (A20) and (A22) is set to 5). Fixations are shown as (i) magenta dots: within 1° of a target coloured stimulus; (ii) blue circles: within 1° of a non-target colour stimulus. (a) A scan path through a dense scene. The target is a red stimulus. 95.92% of fixations are within 1° of a target coloured stimulus. 4.08% of fixations are within 1° of a non-target colour stimulus. Average saccade amplitude = 7.43°. (b) A scan path through a sparse scene. The target is a green stimulus. 91.84% of fixations are within 1° of a target coloured stimulus. 8.16% of fixations are within 1° of a non-target colour stimulus. Average saccade amplitude = 12.13°. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 7. The effect of the weight of IT to V4 feedback on fixation position. Fixation positions were averaged over 10 scan paths, each consisting of 50 fixations, over the image shown in Fig. 6a. Fixations are considered to be near a stimulus when they are within 1° of the stimulus. As the weight of feedback is increased, there is a tendency for more target coloured stimuli and fewer non-target coloured stimuli to be fixated, because object-based effects in the ventral stream are stronger.

Thus, the model tentatively predicts that during search in crowded scenes, as well as in the smaller arrays used for the single cell simulations, fixation duration may be shorter amongst very familiar objects or in a familiar task. Therefore, search may be faster under these conditions. In experiments, search has been found to be facilitated, enabling faster manual reaction times, when distractors are familiar (Lubow & Kaplan, 1997; Reicher, Snyder, & Richards, 1976; Richards & Reicher, 1978; Wang, Cavanagh, & Green, 1994). However, Greene and Rayner (2001)
suggest that this familiarity effect may be due to the span of effective processing (which may be likened to the size of the AW here) being wider around familiar distractors, rather than fixation duration being shorter.

Fig. 8. The effect of varying the weight of the novelty bias on fixation position in a relatively sparse scene. Fixation position was averaged over 10 scan paths, each consisting of 50 fixations, over the image shown in Fig. 6b. Fixations are considered to be near a stimulus when they are within 1° of the stimulus. As the weight of the novelty bias increases, the number of fixations near target coloured stimuli decreases and fixations are more likely to occur in blank areas of the display.

3.5. Effect of scene density on saccade amplitude

As Motter and Belky (1998b) found, saccades tend to be shorter in dense scenes compared to sparse scenes. Fig. 6 shows this effect (also, large amplitude saccades are shown in a sparse scene in Fig. 11). The same sized retinal image was used for both the simulations in Fig. 6, in order that stimuli at the same distances could attract attention. However, the spatial AW is scaled based on local stimulus density such that, when fixation is placed in an area of dense stimuli, the spatial AW is smaller than when stimuli are sparse. The AW contributes a positive bias to the competition in LIP and results in locations containing target features within the AW being favoured within LIP and being most likely to attract attention. Thus, the model predicts that saccade amplitude is dependent on the stimulus density in the local area around fixation.

Fig. 11a shows a scan path that has been unable to reach target coloured stimuli in the bottom left corner of the display. This was due to the constraint imposed by the size of the retinal image in this case (441 × 441 pixels, equivalent to approximately 40° of visual angle). When the retinal image is increased to 801 × 801 pixels
Fig. 11. Scan paths on a sparse image. (a) The retina is restricted to 441 × 441 pixels (~40°). This results in target coloured stimuli in the bottom left of the image not being available to the retina from the nearby fixation point and, thus, not being reached by the scan path. Note, this problem only occurs in sparse images with the retinal image restricted to a smaller size than is normally available to humans and monkeys. (b) The retina is enlarged to be 801 × 801 pixels (~73°). All target coloured stimuli are now examined.
the working memory bias. Many object-based attention experimental findings are confounded by issues relating to whether the selection is object-based, feature-based or surface-based. In psychophysics, object-based attention typically refers to the performance advantage given to all features belonging to the same object (but not to those of a different object at the same location). However, Mounts and Melara (1999) have suggested that not all features bound to objects are the subject of object-based attention, and that a possibly earlier feature-based attention operates, because performance benefits were found only when the feature (colour or orientation) to be discriminated was the same one in which the target 'pop-out' occurred. Confusion results from the difficulty in designing an experimental paradigm that eliminates all but object-based attentional effects. For example, the rapid serial object transformation paradigm
influence the scan path most strongly despite task requirements.

The 'cross-stream' interaction between V4 and LIP is an important feature of this model, and it is predicted that loss of the connection from the ventral stream to LIP, due to lesion, would result in deficits in the ability of the scan path to locate behaviourally relevant stimuli. In order for the object-based effect in V4 to bias LIP correctly in the current architecture, it was necessary for competition in V4 to be local. Given this, object-based attention in V4 allowed LIP to map the behaviourally relevant locations in the scene (Colby et al., 1996) and accurately control the search scan path. Therefore, the model predicts that ventral stream feedforward connections to parietal cortex may determine search selectivity. The model also predicts that the latency to saccade is dependent on the development of object-based attention, which relies on strong object-related feedback within the ventral stream. Tentatively, the strength of such feedback may be related to learning and familiarity with stimuli and the task. The connection from LIP to V4 serves to bind features across feature types at the resolution of the V4 receptive field. Therefore, it is predicted that the loss of this connection would cause cross-feature type binding errors (such as binding form and colour) whilst leaving intact the binding of form into object shape within the ventral stream. Such an effect has been observed in a patient with bilateral parietal lesions (Humphreys, Cinel, Wolfe, Olson, & Klempen, 2000).

Further factors affecting the scan path included the field of view (the size of the retinal image), local stimulus density (because this determines the size of the AW, which influences saccade amplitude), and the importance of novelty (in the form of inhibition of previously inspected locations).

Future work includes extending the factors influencing competition in LIP so that the model is able to represent further bottom-up stimulus-related factors and, possibly, further top-down cognitive factors in the competition for attentional capture. As such, the model may be able to replicate psychophysical data relating to attentional capture under competing exogenous and endogenous influences,

In conclusion, this is a biologically plausible model that is able to replicate a range of experimental evidence for both spatial and object-based attentional effects. Modelling visual attention in a biologically constrained manner within an active vision paradigm has raised issues, such as memory for locations already visited in the scene, not addressed by the models with static retinas. The model is extendable to include other factors in the competition for the capture of attention and should be of interest in many fields including neurophysiology, neuropsychology, computer vision and robotics.

Appendix A

A.1. Retina

Colour processing in the model focuses on the red–green channel, with two colour arrays, G^{red} and G^{green}, a simplification of the output of the medium and long wavelength retinal cones, being input to the retinal ganglion cells. References to red and green throughout this paper refer to long and medium wavelengths. The greyscale image, G^{grey}, used for form processing, is a composite of the colour arrays and provides luminance information.

A.1.1. Form processing in the retina

At each location in the greyscale image, retinal ganglion broad-band cells perform simple centre-surround processing, according to Grossberg and Raizada (2000), as follows.

On-centre, off-surround broadband cells:

u^{+}_{ij} = G^{grey}_{ij} - \sum_{pq} G_{pq}(i, j, \sigma_1) G^{grey}_{pq}    (A1)

Off-centre, on-surround broadband cells:

u^{-}_{ij} = -G^{grey}_{ij} + \sum_{pq} G_{pq}(i, j, \sigma_1) G^{grey}_{pq}    (A2)
Red on-centre, off-surround concentric single-opponent cells:

n^{redON}_{ij} = G^{red}_{ij} - \sum_{pq} G_{pq}(i, j, \sigma_1) G^{green}_{pq}    (A4)

Red off-centre, on-surround concentric single-opponent cells:

n^{redOFF}_{ij} = -G^{red}_{ij} + \sum_{pq} G_{pq}(i, j, \sigma_1) G^{green}_{pq}    (A5)

Green on-centre, off-surround concentric single-opponent cells:

n^{greenON}_{ij} = G^{green}_{ij} - \sum_{pq} G_{pq}(i, j, \sigma_1) G^{red}_{pq}    (A6)

Green off-centre, on-surround concentric single-opponent cells:

n^{greenOFF}_{ij} = -G^{green}_{ij} + \sum_{pq} G_{pq}(i, j, \sigma_1) G^{red}_{pq}    (A7)

These concentric single-opponent cells provide colour-specific inputs to V1 double-opponent blob neurons.

A.2. V1

The V1 module consists of K + C neurons at each location in the original image, so that neurons detect K orientations and C colours. At any fixation, V1 would only process information within the current retinal image. However, in this model, the entire original image is 'pre-processed' by V1 in order to save computational time during the active vision component of the system. As V1 is not dynamically updated during active vision, this does not alter the result. Only those V1 outputs relating to the current retinal image are forwarded to V4 during the dynamic active vision processing.

A.2.1. Form processing in V1

For orientation detection, V1 simple and complex cells are modelled as described by Grossberg and Raizada (2000), with the distinction that two spatial resolutions are calculated here. Simple cells detect oriented edges using a difference-of-offset-Gaussian (DOOG) kernel.

The right- and left-hand kernels of the simple cells are given by

R^{r}_{ijk} = \sum_{pq} ([u^{+}_{pq}]^{+} - [u^{-}_{pq}]^{+}) [D^{(lk)}_{pqij}]^{+}    (A8)

L^{r}_{ijk} = \sum_{pq} ([u^{-}_{pq}]^{+} - [u^{+}_{pq}]^{+}) [-D^{(lk)}_{pqij}]^{+}    (A9)

where

u^{+} and u^{-} are the outputs of the retinal broadband cells above;
[x]^{+} signifies half-wave rectification, i.e.

[x]^{+} = \begin{cases} x, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases}

and the oriented DOOG filter D^{(lk)}_{pqij} is given by

D^{(lk)}_{pqij} = G_{pq}(i - \delta\cos\theta, j - \delta\sin\theta, \sigma_2) - G_{pq}(i + \delta\cos\theta, j + \delta\sin\theta, \sigma_2)    (A10)

where

\delta = \sigma_2/2 and \theta = \pi(k - 1)/K, where k ranges from 1 to 2K, K being the total number of orientations (2 is used here);
\sigma_2 is the width parameter for the DOOG filter, set as below;
r is the spatial frequency octave (i.e. spatial resolution), such that
r = 1 and \sigma_2 = 1 for low resolution processing, used in the magnocellular (or sub-cortical) pathway for scaling the AW;
r = 2 and \sigma_2 = 0.5 for high resolution processing, used in the parvocellular pathway, which forms the remainder of the model.

The direction-of-contrast sensitive simple cell response is given by

S^{r}_{ijk} = \gamma [R^{r}_{ijk} + L^{r}_{ijk} - |R^{r}_{ijk} - L^{r}_{ijk}|]^{+}    (A11)

\gamma is set to 10.

The complex cell response is invariant to direction of contrast and is given by

I^{r}_{ijk} = S^{r}_{ijk} + S^{r}_{ij(k+K)}    (A12)

where k ranges from 1 to K.

The value of the complex cells, I^{r}_{ijk}, over the area of the current retinal image, is input to V4.
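Eqs. (A8)-(A12) can likewise be sketched with explicit kernels. The code below is our illustration: it builds the offset-Gaussian pair of Eq. (A10), correlates the rectified broadband signal with the rectified kernels as in Eqs. (A8) and (A9), and combines the two directions of contrast as in Eqs. (A11) and (A12). The demo deliberately uses a coarser σ_2 than the paper's sub-pixel values so that the offset is visible on an integer grid.

```python
import numpy as np
from scipy.ndimage import correlate, gaussian_filter

# Sketch of the V1 form pathway, Eqs. (A8)-(A12).  Our code, not the
# authors'; the demo sigma is coarser than the paper's (0.5 or 1) so the
# offset Gaussians are distinguishable on this small integer grid.

def gaussian_kernel(size, sigma, dy=0.0, dx=0.0):
    ax = np.arange(size) - size // 2
    yy, xx = np.meshgrid(ax, ax, indexing="ij")
    g = np.exp(-((yy - dy) ** 2 + (xx - dx) ** 2) / (2 * sigma ** 2))
    return g / g.sum()

def doog_kernel(k, K=2, sigma2=0.5, size=13):
    # Eq. (A10): two Gaussians offset by +/-delta along orientation theta.
    theta = np.pi * (k - 1) / K
    d = sigma2 / 2.0
    return (gaussian_kernel(size, sigma2, -d * np.sin(theta), -d * np.cos(theta))
            - gaussian_kernel(size, sigma2, d * np.sin(theta), d * np.cos(theta)))

def relu(x):
    return np.maximum(x, 0.0)

def simple_cell(s, k, sigma2=0.5, gamma=10.0):
    # s = [u+]^+ - [u-]^+; right/left kernel matches per Eqs. (A8)-(A9),
    # combined into the contrast-sensitive response of Eq. (A11).
    D = doog_kernel(k, sigma2=sigma2)
    R = correlate(s, relu(D), mode="nearest")
    L = correlate(-s, relu(-D), mode="nearest")
    return gamma * relu(R + L - np.abs(R - L))

def complex_cell(s, k, K=2, sigma2=0.5):
    # Eq. (A12): sum over the two directions of contrast (k and k + K).
    return simple_cell(s, k, sigma2) + simple_cell(s, k + K, sigma2)

if __name__ == "__main__":
    img = np.zeros((21, 21)); img[:, 10:] = 1.0          # vertical edge
    u_on = img - gaussian_filter(img, 1.0, mode="nearest")   # cf. Eq. (A1)
    s = u_on              # equals [u+]^+ - [u-]^+ when u- = -u+
    resp = complex_cell(s, k=1, sigma2=2.0)              # coarse demo filter
    print(bool(resp.max() > 0), int(resp.max(axis=0).argmax()))  # peak near the edge
```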
A.2.2. Colour processing in V1

The outputs of LGN concentric single-opponent cells (simplified to be the retinal cells here) are combined in the cortex in the double-opponent cells concentrated in the blob zones of layers 2 and 3 of V1, which form part of the parvocellular system. The outputs of blob cells are transmitted to the thin stripes of V2 and from there to colour-specific neurons in V4. For simplicity, V2 is not included in this model.

Double-opponent cells have a centre-surround antagonism and combine inputs from different single-opponent cells as follows:
with a latency of 60 ms to reflect normal response latencies (Luck et al., 1997). In order to simulate the normalisation of inputs occurring during retinal, LGN and V1 processing, the V1 inputs to V4 are normalised by passing the convergent inputs to each V4 assembly through the response function of Eq. (A19), with its threshold set to a value equivalent to the input activity for approximately half a stimulus within its receptive field.

A.3.1.1. Form processing in V4. The output from the V1 simple cell process, I_{ijk}, for each position (i, j) at orientation k, provides the bottom-up input to orientation selective pyramidal assemblies in V4, which evolve according to the following dynamics:

\tau_1 \frac{dW_{ijk}(t)}{dt} = -W_{ijk}(t) + a F(W_{ijk}(t)) - b F(W^{IK}_{ij}(t)) + \chi \sum_{pq} I_{pqk}(t) + \gamma Y_{ij}(t) + \eta \sum_{m} B_{W_{ijk} X_m} F(X_m(t)) + I_0 + \nu    (A20)

where

\tau_1 is set to 20 ms;
a is the parameter for excitatory input from other cells in the pool, set to 0.95;
b is the parameter for inhibitory interneuron input, set to 10;
I_{pqk} is the input from the V1 simple cell edge detection process at all positions (p, q) within the V4 receptive field, and of preferred orientation k;
\chi is the parameter for V1 inputs, set to 4;
Y_{ij} is the input from the posterior parietal LIP module, reciprocally connected to V4;
\gamma is the parameter for LIP inputs, set to 3;
X_m is the feedback from IT cell populations via weight B_{W_{ijk} X_m}, described later;
\eta is the parameter representing the strength of object-related feedback from IT (Fig. 7); normally set to 5, but set to 2.5 for simulation of single cell recordings in IT (Chelazzi et al., 1993);
I_0 is a background current injected into the pool, set to 0.25;
\nu is additive noise, randomly selected from a uniform distribution on the interval (0, 0.1).

The dynamic behaviour of the associated inhibitory pool for orientation-selective cell assemblies in V4 is given by

\tau_1 \frac{dW^{IK}_{ij}(t)}{dt} = -W^{IK}_{ij}(t) + \lambda \sum_{k} F(W_{ijk}(t)) - \mu F(W^{IK}_{ij}(t))    (A21)

where

\lambda is the parameter for pyramidal cell assembly input, set to 1;
\mu is the parameter for inhibitory interneuron input, set to 1.

Over time, this results in local competition between different orientation selective cell assemblies.

A.3.1.2. Colour processing in V4. The output from the V1 blob cell process, I_{ijc}, for each position (i, j) and colour c, provides the bottom-up input to colour selective pyramidal assemblies in V4, which evolve according to the following dynamics:

\tau_1 \frac{dW_{ijc}(t)}{dt} = -W_{ijc}(t) + a F(W_{ijc}(t)) - b F(W^{IC}_{ij}(t)) + \chi \sum_{pq} I_{pqc}(t) + \gamma Y_{ij}(t) + \eta \sum_{m} B_{W_{ijc} X_m} F(X_m(t)) + I_0 + \nu    (A22)

where

I_{pqc} is the input from the V1 blob cells at all positions (p, q) within the V4 receptive field, and of preferred colour c;
X_m is the feedback from IT cell populations via weight B_{W_{ijc} X_m}, described later.

The remaining terms are the same as those in Eq. (A20).

The dynamic behaviour of the associated inhibitory pool for colour-selective cell assemblies in V4 is given by

\tau_1 \frac{dW^{IC}_{ij}(t)}{dt} = -W^{IC}_{ij}(t) + \lambda \sum_{c} F(W_{ijc}(t)) - \mu F(W^{IC}_{ij}(t))    (A23)

Parameters take the same values as those in Eq. (A21). Over time, this results in local competition between different colour selective cell assemblies.
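The V4 dynamics above integrate straightforwardly with an Euler scheme at the 5 ms step quoted for the scan path runs. In this sketch (ours, not the authors' Matlab), the response function F stands in for Eq. (A19), which is not reproduced in this extract; the parameter values follow the where-list of Eq. (A20).

```python
import numpy as np

# Euler-integration sketch of the V4 orientation dynamics, Eq. (A20).
# F below is a generic smooth threshold standing in for Eq. (A19), which
# is missing from this extract; parameters follow the where-list above.

TAU1, A, B = 20.0, 0.95, 10.0            # ms; excitatory/inhibitory gains
CHI, GAMMA_LIP, ETA, I0 = 4.0, 3.0, 5.0, 0.25

def F(x, threshold=0.1, gain=1.0):
    """Assumed stand-in for the response function of Eq. (A19)."""
    return (1.0 / (1.0 + np.exp(-(x - threshold) / gain))
            - 1.0 / (1.0 + np.exp(threshold / gain)))

def v4_step(W, W_inh, v1_drive, lip_drive, it_feedback, dt=5.0, rng=None):
    """One Euler step of Eq. (A20) for one feature layer (arrays share a
    shape).  it_feedback should already be the weighted sum over IT."""
    rng = rng or np.random.default_rng()
    noise = rng.uniform(0.0, 0.1, size=W.shape)
    dW = (-W + A * F(W) - B * F(W_inh) + CHI * v1_drive
          + GAMMA_LIP * lip_drive + ETA * it_feedback + I0 + noise)
    return W + (dt / TAU1) * dW

if __name__ == "__main__":
    shape = (20, 20)
    W = np.zeros(shape); W_inh = np.zeros(shape)
    drive = np.zeros(shape); drive[6, 13] = 0.5   # one stimulated location
    for _ in range(12):                           # 60 ms of settling
        W = v4_step(W, W_inh, drive, np.zeros(shape), np.zeros(shape))
    print(round(float(W[6, 13]), 2), ">", round(float(W[0, 0]), 2))
```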
A.3.2. IT

Neuronal assemblies in IT are assumed to represent anterior IT (for example, area TE), where receptive fields cover the entire retinal image and populations encode invariant representations of objects. The model IT encodes all possible objects, i.e. feature combinations, and receives feedforward feature inputs from V4 with a latency of 80 ms to reflect normal response latencies (Wallis & Rolls, 1997). V4 inputs to IT are normalised by dividing the total input to each IT assembly by the total number of active (i.e. non-zero) inputs. IT also feeds back an object bias to V4. The strength of these connections is given by the following weights, which are set by hand (to -1 or 0, as appropriate, for inhibitory feedback, although the model may also be implemented with excitatory feedback: 0, +1) to represent prior object learning. These simple matrices reflect the type of weights that would be achieved through Hebbian learning, without the need for a lengthy learning procedure (such as Deco, 2001), which is not the aim of this work. The result is
that the connections that are active for excitatory feedback (or inactive for inhibitory feedback) are those features relating to the object.

V4 cell assemblies to IT (feedforward): A_{X_m W_{ijz}}

IT to V4 cell assemblies (feedback): B_{W_{ijz} X_m}

where z indicates orientation, k, or colour, c.

The pyramidal cell assemblies in IT evolve according to the following dynamics:

\tau_1 \frac{dX_m(t)}{dt} = -X_m(t) + a F(X_m(t)) - b F(X^{I}(t)) + \chi \sum_{ijk} A_{X_m W_{ijk}} F(W_{ijk}(t)) + \chi \sum_{ijc} A_{X_m W_{ijc}} F(W_{ijc}(t)) + \gamma P_{vM}(t) + I_0 + \nu    (A24)

where

b is the parameter for inhibitory interneuron input, set to 0.01;
W_{ijk} is the feedforward input from V4 relating to orientation information, via weight A_{X_m W_{ijk}};
W_{ijc} is the feedforward input from V4 relating to colour information, via weight A_{X_m W_{ijc}};
\chi is the parameter for V4 inputs, set to 2.5;
P_{vM} is the object-related feedback current from ventrolateral prefrontal cortex, injected directly into this pool. This feedback is sigmoidal over time, as follows:

P_{vM} = -1/(1 + \exp(t_{sig} - t))    (A25)

where t is time (in milliseconds) and t_{sig} is the point in time where the sigmoid reaches half its peak value, normally set to 150 ms for scan path simulations;
\gamma is the parameter for the object-related bias, set to 1.2.

The remaining terms and parameters are evident from previous equations.

The dynamic behaviour of the associated inhibitory pool in IT is given by

\tau_1 \frac{dX^{I}(t)}{dt} = -X^{I}(t) + \lambda \sum_{m} F(X_m(t)) - \mu F(X^{I}(t))    (A26)

where

\lambda is the parameter for pyramidal cell assembly input, set to 3;
\mu is the parameter for inhibitory interneuron input, set to 1.

A.3.3. LIP

The pyramidal cell assemblies in LIP evolve according to the following dynamics:

\tau_1 \frac{dY_{ij}(t)}{dt} = -Y_{ij}(t) + a F(Y_{ij}(t)) - b F(Y^{I}(t)) + \chi \sum_{k} F(W_{ijk}(t)) + \varepsilon \sum_{c} F(W_{ijc}(t)) + \gamma P_{dij}(t) + \eta \sum_{pq} Z_{pq} + I_0 + \nu    (A27)

where

b is the parameter for inhibitory input, set to 1;
W_{ijk} is the orientation input from V4 for orientation k, at location (i, j);
W_{ijc} is the colour input from V4 for colour c, at location (i, j);
\varepsilon > \chi, in order to attract the scan path to target coloured locations, so that colour-related input from V4 is stronger than orientation-related input; set to \chi = 0.8, \varepsilon = 4;
P_{dij} is the top-down bias, i.e. the spatial feedback current from FEF, dorsolateral prefrontal cortex or pulvinar, injected directly into this pool when there is a requirement to attend to this spatial location. Here, when fixation is established, this spatial bias is applied and the spatial AW is formed;
\gamma is the parameter for the spatial top-down bias, set to 2.5;
Z_{pq} is the bias from the area, pq, of the novelty map (which is the size of the original image, N) that represents the receptive field. Area pq represents the size of the LIP receptive field;
\eta is the parameter for the novelty bias, normally set to 0.0009 (Fig. 8).

The remaining terms are evident from previous equations.

The dynamic behaviour of the associated inhibitory pool in LIP is given by

\tau_1 \frac{dY^{I}(t)}{dt} = -Y^{I}(t) + \lambda \sum_{ij} F(Y_{ij}(t)) - \mu F(Y^{I}(t))    (A28)

where

\lambda is the parameter for pyramidal cell assembly input, set to 1;
\mu is the parameter for inhibitory interneuron input, set to 1.
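Finally, the LIP competition of Eq. (A27) and the fixation-selection rule from Section 2 (the centre of the receptive field of the most active LIP assembly becomes the next fixation point) can be sketched together. Again F is a stand-in for the missing Eq. (A19), and the receptive field geometry follows the 20 × 20 grid; this is our illustration, not the published implementation.

```python
import numpy as np

# Sketch of the LIP dynamics, Eq. (A27), plus next-fixation selection.
# F is a generic squashing stand-in for Eq. (A19); parameter values follow
# the where-list above (colour input weighted above orientation input).

TAU1, A_EXC, B_INH = 20.0, 0.95, 1.0
CHI_OR, EPS_COL, G_SPATIAL, ETA_NOV, I0 = 0.8, 4.0, 2.5, 0.0009, 0.25

def F(x):
    return np.maximum(np.tanh(x), 0.0)     # assumed stand-in for Eq. (A19)

def lip_step(Y, Y_inh, v4_orient, v4_colour, spatial_bias, novelty,
             dt=5.0, rng=None):
    rng = rng or np.random.default_rng()
    dY = (-Y + A_EXC * F(Y) - B_INH * F(Y_inh)
          + CHI_OR * F(v4_orient).sum(axis=0)   # sum over orientation layers
          + EPS_COL * F(v4_colour).sum(axis=0)  # sum over colour layers
          + G_SPATIAL * spatial_bias + ETA_NOV * novelty
          + I0 + rng.uniform(0.0, 0.1, Y.shape))
    return Y + (dt / TAU1) * dY

def next_fixation(Y, rf_centres):
    """rf_centres: (N, N, 2) image coordinates of each LIP assembly's
    receptive field centre; the most active assembly wins."""
    idx = np.unravel_index(np.argmax(Y), Y.shape)
    return tuple(int(v) for v in rf_centres[idx])

if __name__ == "__main__":
    N = 20
    Y = np.zeros((N, N)); Y_inh = np.zeros((N, N))
    ori = np.zeros((2, N, N)); col = np.zeros((2, N, N))
    col[0, 6, 13] = 0.8                    # a strong target-colour input
    centres = np.stack(np.meshgrid(np.arange(N) * 22 + 11,
                                   np.arange(N) * 22 + 11,
                                   indexing="ij"), axis=-1)
    for _ in range(36):                    # ~180 ms of competition
        Y = lip_step(Y, Y_inh, ori, col, np.zeros((N, N)), np.ones((N, N)))
    print(next_fixation(Y, centres))       # centre of the cell at [6, 13]
```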
Hamker, F. H. (1998). The role of feedback connections in task-driven visual search. In D. Heinke, G. W. Humphreys, & A. Olson (Eds.), Connectionist models in cognitive neuroscience: Proceedings of the fifth neural computation and psychology workshop (NCPW'98) (pp. 252–261). London: Springer.
Helmholtz, H. v. (1867). Handbuch der physiologischen Optik. Leipzig: Voss.
Hodgson, T. L., Mort, D., Chamberlain, M. M., Hutton, S. B., O'Neill, K. S., & Kennard, C. (2002). Orbitofrontal cortex mediates inhibition of return. Neuropsychologia, 1431, 1–11.
Hoffman, J. E., & Subramaniam, B. (1995). The role of visual attention in saccadic eye movements. Perception and Psychophysics, 57(6), 787–795.
Hooge, I. T., & Erkelens, C. J. (1999). Peripheral vision and oculomotor control during visual search. Vision Research, 39(8), 1567–1575.
Hooge, I. T., & Frens, M. A. (2000). Inhibition of saccade return (ISR): Spatio-temporal properties of saccade programming. Vision Research, 40(24), 3415–3426.
Hopfinger, J. B., Buonocore, M. H., & Mangun, G. R. (2000). The neural mechanisms of top-down attentional control. Nature Neuroscience, 3(3), 284–291.
Horowitz, T. S., & Wolfe, J. M. (1998). Visual search has no memory. Nature, 394, 575–577.
Humphreys, G. W., Cinel, C., Wolfe, J., Olson, A., & Klempen, N. (2000). Fractionating the binding process: Neuropsychological evidence distinguishing binding of form from binding of surface features. Vision Research, 40, 1569–1596.
Husain, M., Mannan, S., Hodgson, T., Wojciulik, E., Driver, J., & Kennard, C. (2001). Impaired spatial working memory across saccades contributes to abnormal search in parietal neglect. Brain, 124, 941–952.
Irwin, D. E., & Zelinsky, G. J. (2002). Eye movements and scene perception: Memory for things observed. Perception and Psychophysics, 64(6), 882–895.
Itti, L., & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40, 1489–1506.
Kastner, S., Pinsk, M., De Weerd, P., Desimone, R., & Ungerleider, L. (1999). Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron, 22, 751–761.
Kusunoki, M., Gottlieb, J., & Goldberg, M. E. (2000). The lateral intraparietal area as a salience map: The representation of abrupt onset, stimulus motion, and task relevance. Vision Research, 40, 1459–1468.
Lanyon, L. J., & Denham, S. L. (2004a). A biased competition computational model of spatial and object-based attention mediating active visual search. Neurocomputing, in press.
Lanyon, L. J., & Denham, S. L. A biased competition model of spatial and object-based attention mediating active visual search. Submitted for publication.
Law, M. B., Pratt, J., & Abrams, R. A. (1995). Color-based inhibition of return. Perception and Psychophysics, 57(3), 402–408.
Lubow, R. E., & Kaplan, O. (1997). Visual search as a function of type of prior experience with targets and distractors. Journal of Experimental Psychology: Human Perception and Performance, 23, 14–24.
Luck, S. J., Chelazzi, L., Hillyard, S. A., & Desimone, R. (1997). Neural mechanisms of spatial attention in areas V1, V2 and V4 of macaque visual cortex. Journal of Neurophysiology, 77, 24–42.
Lynch, J. C., Graybiel, A. M., & Lobeck, L. J. (1985). The differential projection of two cytoarchitectonic subregions of the inferior parietal lobule of macaque upon the deep layers of the superior colliculus. Journal of Comparative Neurology, 235, 241–254.
Martinez, A., Anllo-Vento, L., Sereno, M. I., Frank, L. R., Buxton, R. B., Dubowitz, D. J., Wong, E. C., Hinrichs, H., Heinze, H. J., & Hillyard, S. A. (1999). Involvement of striate and extrastriate visual cortical areas in spatial attention. Nature Neuroscience, 2(4), 364–369.
Mazzoni, P., Andersen, R. A., & Jordan, M. I. (1991). A more biologically plausible learning rule than backpropagation applied to a network model of cortical area 7a. Cerebral Cortex, 1(4), 293–307.
McAdams, C. J., & Maunsell, J. H. R. (2000). Attention to both space and feature modulates neuronal responses in macaque area V4. Journal of Neurophysiology, 83(3), 1751–1755.
McPeek, R. M., Skavenski, A. A., & Nakayama, K. (2000). Concurrent processing of saccades in visual search. Vision Research, 40(18), 2499–2516.
Miller, E. K., Erickson, C. A., & Desimone, R. (1996). Neural mechanisms of visual working memory in prefrontal cortex of the macaque. The Journal of Neuroscience, 16(16), 5154–5167.
Miller, E., Gochin, P., & Gross, C. (1993). Suppression of visual responses of neurons in inferior temporal cortex of the awake macaque by addition of a second stimulus. Brain Research, 616, 25–29.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.
Mitchell, J. F., Stoner, G. R., Fallah, M., & Reynolds, J. H. (2003). Attentional selection of superimposed surfaces cannot be explained by modulation of the gain of color channels. Vision Research, 43, 1323–1325.
Mitchell, J., & Zipser, D. (2001). A model of visual-spatial memory across saccades. Vision Research, 41, 1575–1592.
Moore, T., & Armstrong, K. M. (2003). Selective gating of visual signals by microstimulation of frontal cortex. Nature, 421, 370–373.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782–784.
Motter, B. C. (1993). Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. Journal of Neurophysiology, 70(3), 909–919.
Motter, B. C. (1994a). Neural correlates of attentive selection for color or luminance in extrastriate area V4. The Journal of Neuroscience, 14(4), 2178–2189.
Motter, B. C. (1994b). Neural correlates of feature selective memory and pop-out in extrastriate area V4. The Journal of Neuroscience, 14(4), 2190–2199.
Motter, B. C., & Belky, E. J. (1998a). The zone of focal attention during active visual search. Vision Research, 38(7), 1007–1022.
Motter, B. C., & Belky, E. J. (1998b). The guidance of eye movements during active visual search. Vision Research, 38(12), 1805–1815.
Mounts, J. R. W., & Melara, R. D. (1999). Attentional selection of objects or features: Evidence from a modified search task. Perception and Psychophysics, 61(2), 322–341.
Niebur, E., Itti, L., & Koch, C. (2001). Controlling the focus of visual selective attention. In J. L. Van Hemmen, J. Cowan, & E. Domany (Eds.), Models of neural networks IV (pp. 247–276). New York: Springer.
O'Craven, K. M., Downing, P. E., & Kanwisher, N. (1999). fMRI evidence for objects as the units of attentional selection. Nature, 401, 584–587.
Olshausen, B. A., Anderson, C. H., & Van Essen, D. C. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. The Journal of Neuroscience, 13, 4700–4719.
Pinilla, T., Cobo, A., Torres, K., & Valdes-Sosa, M. (2001). Attentional shifts between surfaces: Effects on detection and early brain potentials. Vision Research, 41, 1619–1630.
Posner, M. I., Walker, J. A., Friedrich, F. J., & Rafal, R. D. (1984). Effects of parietal injury on covert orienting of attention. Journal of Neuroscience, 4, 1863–1874.
Quaia, C., Optican, L. M., & Goldberg, M. E. (1998). The maintenance of spatial accuracy by the perisaccadic remapping of visual receptive fields. Neural Networks, 11(7/8), 1229–1240.
Rainer, G., & Miller, E. K. (2002). Time course of object-related neural activity in the primate prefrontal cortex during a short-term memory task. European Journal of Neuroscience, 15(7), 1244–1254.
Reicher, G. M., Snyder, C. R. R., & Richards, J. T. (1976). Familiarity of background characters in visual scanning. Journal of Experimental Psychology: Human Perception and Performance, 2, 522–530.
Renart, A., Moreno, R., de la Rocha, J., Parga, N., & Rolls, E. T. (2001). A model of the IT–PF network in object working memory which includes balanced persistent activity and tuned inhibition. Neurocomputing, 38–40, 1525–1531.
Reynolds, J. H., Alborzian, S., & Stoner, G. R. (2003). Exogenously cued attention triggers competitive selection of surfaces. Vision Research, 43(1), 59–66.
Reynolds, J. H., Chelazzi, L., & Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. The Journal of Neuroscience, 19(5), 1736–1753.
Richards, J. T., & Reicher, G. M. (1978). The effect of background familiarity in visual search. Perception and Psychophysics, 23, 499–505.
Robinson, D. L., Bowman, E. M., & Kertzman, C. (1995). Covert orienting of attention in macaques. II. Contributions of parietal cortex. Journal of Neurophysiology, 74(2), 698–712.
Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376–381.
Rolls, E., & Deco, G. (2002). Computational neuroscience of vision. Oxford, UK: Oxford University Press.
Romo, R., Brody, C. D., Hernandez, A., & Lemus, L. (1999). Neuronal correlates of parametric working memory in the prefrontal cortex. Nature, 399(6735), 470–473.
Ross, J., Morrone, M. C., & Burr, D. C. (1997). Compression of visual space before saccades. Nature, 386, 598–601.
Ross, J., Morrone, M. C., Goldberg, M. E., & Burr, D. C. (2001). Changes in visual perception at the time of saccades. Trends in Neurosciences, 24(2), 113–121.
Saenz, M., Buracas, G. T., & Boynton, G. M. (2002). Global effects of feature-based attention in human visual cortex. Nature Neuroscience, 5(7), 631–632.
Sapir, A., Soroker, N., Berger, A., & Henik, A. (1999). Inhibition of return in spatial attention: Direct evidence for collicular generation. Nature Neuroscience, 2, 1053–1054.
Schall, J. D. (2002). The neural selection and control of saccades by the frontal eye field. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 357, 1073–1082.
Scialfa, C. T., & Joffe, K. M. (1998). Response times and eye movements in feature and conjunction search as a function of target eccentricity. Perception and Psychophysics, 60(6), 1067–1082.
Shen, J., Reingold, E. M., & Pomplun, M. (2000). Distractor ratio influences patterns of eye movements during visual search. Perception, 29(2), 241–250.
Shen, J., Reingold, E. M., & Pomplun, M. (2003). Guidance of eye movements during conjunctive visual search: The distractor ratio effect. Canadian Journal of Experimental Psychology, 57(2), 76–96.
Snyder, J. J., & Kingstone, A. (2000). Inhibition of return and visual search: How many separate loci are inhibited? Perception and Psychophysics, 62(3), 452–458.
Snyder, J. J., & Kingstone, A. (2001). Inhibition of return at multiple locations in visual search: When you see it and when you don't. Quarterly Journal of Experimental Psychology Section A: Human Experimental Psychology, 54(4), 1221–1237.
Somers, D. C., Dale, A. M., Seiffert, A. E., & Tootell, R. B. H. (1999). Functional MRI reveals spatially specific attentional modulation in human primary visual cortex. Proceedings of the National Academy of Sciences, USA, 96, 1663–1668.
Thiele, A., Henning, P., Kubischik, M., & Hoffmann, K. P. (2002). Neural mechanisms of saccadic suppression. Science, 295(5564), 2460–2462.
Thier, P., & Andersen, R. A. (1998). Electrical microstimulation distinguishes distinct saccade-related areas in the posterior parietal cortex. Journal of Neurophysiology, 80, 1713–1735.
Tipper, S. P., Weaver, B., & Watson, F. L. (1996). Inhibition of return to successively cued spatial locations: Commentary on Pratt and Abrams (1995). Journal of Experimental Psychology: Human Perception and Performance, 22(5), 1289–1293.
Tolias, A. S., Moore, T., Smirnakis, S. M., Tehovnik, E. J., Siapas, A. G., & Schiller, P. H. (2001). Eye movements modulate visual receptive fields of V4 neurons. Neuron, 29, 757–767.
Toth, L. J., & Assad, J. A. (2002). Dynamic coding of behaviourally relevant stimuli in parietal cortex. Nature, 415, 165–168.
Trappenberg, T. P., Dorris, M. C., Munoz, D. P., & Klein, R. M. (2001). A model of saccade initiation based on the competitive integration of exogenous and endogenous signals in the superior colliculus. Journal of Cognitive Neuroscience, 13, 256–271.
Treisman, A. (1982). Perceptual grouping and attention in visual search for features and for objects. Journal of Experimental Psychology: Human Perception and Performance, 8, 194–214.
Treisman, A. (1988). Features and objects: The fourteenth Bartlett memorial lecture. Quarterly Journal of Experimental Psychology, 40, 201–237.
Treisman, A., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97–136.
Treue, S., & Martinez Trujillo, J. (1999). Feature-based attention influences motion processing gain in macaque visual cortex. Nature, 399(6736), 575–579.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. W. J. Mansfield (Eds.), Analysis of visual behaviour (pp. 549–586). Cambridge, MA: MIT Press.
Usher, M., & Niebur, E. (1996). Modeling the temporal dynamics of IT neurons in visual search: A mechanism for top-down selective attention. Journal of Cognitive Neuroscience, 8(4), 311–327.
Valdes-Sosa, M., Bobes, M. A., Rodriguez, V., & Pinilla, T. (1998). Switching attention without shifting the spotlight: Object-based attentional modulation of brain potentials. Journal of Cognitive Neuroscience, 10(1), 137–151.
Valdes-Sosa, M., Cobo, A., & Pinilla, T. (2000). Attention to object files defined by transparent motion. Journal of Experimental Psychology: Human Perception and Performance, 26(2), 488–505.
Wallis, G., & Rolls, E. T. (1997). Invariant face and object recognition in the visual system. Progress in Neurobiology, 51, 167–194.
Wang, Q., Cavanagh, P., & Green, M. (1994). Familiarity and pop-out in visual search. Perception and Psychophysics, 56, 495–500.
Williams, D. E., & Reingold, E. M. (2001). Preattentive guidance of eye movements during triple conjunction search tasks: The effects of feature discriminability and saccadic amplitude. Psychonomic Bulletin and Review, 8(3), 476–488.
Wolfe, J. M. (1994). Guided Search 2.0: A revised model of visual search. Psychonomic Bulletin and Review, 1, 202–238.
Wolfe, J. M., Cave, K. R., & Franzel, S. L. (1989). Guided search: An alternative to the feature integration model for visual search. Journal of Experimental Psychology: Human Perception and Performance, 15, 419–433.
Woodman, G. F., Vogel, E. K., & Luck, S. J. (2001). Visual search remains efficient when visual working memory is full. Psychological Science, 12(3), 219–224.
Zeki, S. (1993). A vision of the brain. Oxford, UK: Blackwell.
Zhu, J. J., & Lo, F. S. (1996). Time course of inhibition induced by a putative saccadic suppression circuit in the dorsal lateral geniculate nucleus of the rabbit. Brain Research Bulletin, 41(5), 281–291.
Zipser, D., & Andersen, R. A. (1988). A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331(6158), 679–684.