On The Role of Spatial Phase and Phase Correlation in Vision, Illusion, and Cognition
Numerous findings indicate that spatial phase carries important cognitive information. Distortion of phase affects the topology of edge structures and makes images unrecognizable. In turn, appropriately phase-structured patterns give rise to various illusions of virtual image content and apparent motion. Despite a large body of phenomenological evidence, not much is known yet about the role of phase information in neural mechanisms of visual perception and cognition. Here, we are concerned with analysis of the role of spatial phase in computational and biological vision, emergence of visual illusions and pattern recognition. We hypothesize that the fundamental importance of phase information for invariant retrieval of structural image features and motion detection promoted the development of phase-based mechanisms of neural image processing in the course of evolution of biological vision. Using an extension of the Fourier phase correlation technique, we show that the core functions of the visual system such as motion detection and pattern recognition can be facilitated by the same basic mechanism. Our analysis suggests that the emergence of visual illusions can be attributed to the presence of coherently phase-shifted repetitive patterns as well as the effects of acuity compensation by saccadic eye movements. We speculate that biological vision relies on perceptual mechanisms effectively similar to phase correlation, and predict neural features of visual pattern (dis)similarity that can be used for experimental validation of our hypothesis of “cognition by phase correlation.”

Keywords: vision research, visual illusions, motion detection, pattern recognition, saccades, acuity, phase correlation, association cortex

Edited by: Judith Peters, The Netherlands Institute for Neuroscience, Netherlands
Reviewed by: Christianne Jacobs, University of Westminster, UK; Benedikt Zoefel, Centre National de la Recherche Scientifique, France
*Correspondence: Evgeny Gladilin, German Cancer Research Center, Division of Theoretical Bioinformatics, Im Neuenheimer Feld 580, 69120 Heidelberg, Germany, [email protected]
Received: 31 October 2014
Accepted: 30 March 2015
Published: 21 April 2015
Citation: Gladilin E and Eils R (2015) On the role of spatial phase and phase correlation in vision, illusion, and cognition. Front. Comput. Neurosci. 9:45. doi: 10.3389/fncom.2015.00045

1. Introduction

Continuous evolution of biological systems implies a common origin of different functions and mechanisms that emerged as a result of successive modification of one particularly advantageous basic principle. Electrophysiological findings (Hubel and Wiesel, 1968) and psychophysical experiments (Campbell and Robson, 1968) indicate that the visual system relies on the basic principle of frequency domain transformation of the retinal image in visual cortex, which was initially believed to resemble a crude Fourier transformation (Graham, 1981). Even though more recent mathematical models of sparse image coding revised the assumption of global Fourier transformation in favor of locally supported Gabor- (Marcelja, 1980), Wavelet- (Mallat, 1989), Wedge-, Ridge- or Curvelet-functions (Donoho and Flesia, 2001), the concept of neural image representation in the frequency domain by phase and amplitude remained valid.
Since the pioneering works of Hubel and Wiesel (1962, 1968), Campbell and Robson (1968), Blakemore and Campbell (1969), Blakemore et al. (1969), and Thomas et al. (1969) it is known that different groups of neurons in the visual cortex show selective response to spatial-temporal characteristics of visual stimuli and operate as spatially organized filters (receptive fields) that extract particular image features (i.e., spatial frequency, orientation) within a certain range (bandwidth) of their sensitivity. Numerous subsequent studies dealt with experimental investigation and theoretical modeling of visual receptive fields and analysis of their amplitude-transfer (ATF) and phase-transfer functions (PTF). The existing body of evidence resulting from four decades of research in this field includes

• existence of frequency-selective V1 neurons operating as bandpass filters (Graham, 1989; De Valois and De Valois, 1990),
• coding of phase information using quadrature pairs of bandpass filters (Pollen and Ronner, 1983),
• odd-/even-symmetric filters in visual cortex (Morrone and Owens, 1987),
• linear ATF and PTF of simple striate neurons (Hamilton et al., 1989),
• computation of complex-valued products in V1 neurons (Ohzawa et al., 1990),
• computation of magnitudes (energies) in complex V1 cells as a sum of squared responses of simple V1 cells (Adelson and Bergen, 1985),
• divisive normalization of neuronal filter responses (Heeger, 1992; Schwartz and Simoncelli, 2001),
• motion detection (Fleet and Jepson, 1990; Nishida, 2011),
• edge detection (Kovesi, 2000; Henriksson et al., 2009),
• stereoscopic vision (Fleet, 1994; Fleet et al., 1996; Ohzawa et al., 1997),
• 3D shape perception (Thaler et al., 2007),
• assessment of pattern similarity (Sampat et al., 2009; Zhang et al., 2014),
• triggering of diverse visual illusions (Popple and Levi, 2000; Backus and Oru, 2005).

Altogether, these findings support the concept of neural transformation of retinal images into frequency domain characteristics (i.e., phase and amplitude) that, in turn, serve as an input for subsequent higher-order mechanisms and functions of visual perception and cognition.

Despite recent advances in understanding of the overall topology and hierarchy of visual cortex (Riesenhuber, 2005; Poggio and Ullman, 2013), little is known yet about the underlying wiring schemes of phase/amplitude information processing in visual cortex. In particular, the observation that simple cells of V1 show phase-sensitivity (Pollen and Ronner, 1981) while complex cells do not (De Valois et al., 1982) led to a controversial discussion about the role of spatial phase in visual information processing (Morgan et al., 1991; Bex and Makous, 2002; Shams and Malsburg, 2002; Hietanen et al., 2013).

In what follows we aim to address the following basic questions:

• What are the driving forces behind the evolutionary development of biological vision?
• What properties of spatial phase (further in this manuscript denoted as phase) make it an important feature for visual information processing?
• What is the origin of various phase-related visual phenomena including illusions of apparent motion, stereograms and virtual image context?
• How can phase information be used for motion detection and (dis)similarity cognition, and how can theoretical models be evaluated experimentally?

Our manuscript is organized as follows. First, we recapitulate the role of environmental constraints in the development of biological vision in the course of evolution. We review theoretical properties of phase using an extension of the Fourier phase correlation technique and demonstrate how phase information can be used for edge enhancement, motion detection, and pattern recognition. We show that the saccadic strategy of image sampling naturally emerges within this concept as an algorithmic solution which improves the confidence of visual pattern discrimination and recognition. Further, we apply the concept of phase shift and correlation to the analysis of different visual illusions and hypothesize about the involvement of phase-based mechanisms in perception of motion and visual pattern (dis)similarity. In conclusion, we make suggestions for experimental evaluation of our theoretical predictions.

2. Invariants of Ecological Environment and Evolution of Vision

The evolutionary principle implies that the remarkable abilities of biological vision result from adaptation of species to the environmental constraints that ancestors had to cope with in the past. It is generally recognized that progressive sophistication of vision is driven toward more efficient representation, processing and, probably, also modeling of the physical reality which stands behind the retinal images (Walls, 1962; Marr, 1982; Hyvärinen and Hoyer, 2001; Graham and Field, 2006). In addition to the basic optosensory function, the core tasks of visual perception in macroscopic organisms include orientation in the physical environment, which presupposes the ability to detect obstacles and relative motion, as well as recognition of essential patterns related to food, threat and communication. Further, we recollect that biological organisms are composed of condensed matter and mainly have to take care of objects of the physical world that also have a rigid constitution and conservative shape. In contrast, highly deformable media such as gases and liquids are biologically neutral, which implies that perception of non-rigid transformations did not fall under early evolutionary pressure. Important is the notion that visual perception of rigid bodies with a preserved shape has to be independent of relative spatial position and orientation, which means that it has to rely on some invariants (Ito et al., 1995; Booth and Rolls, 1998; Palmeri and Gauthier, 2004; Lindeberg, 2013) that are not given per se but have to be derived by subsequent processing of the raw retinal image. As a dimensionless quantity, phase bears topological
information independently of the level of illuminance and contrast. Affine transformations in the image domain do not change the relative phase structure, but merely shift it as a whole. These properties of phase are of advantage for survival of the fittest and can be assumed to be “discovered” in the course of the evolution of biological vision. Different features of visual perception emerged at evolutionarily distant time points and, thus, rely on different intrinsic invariances. Early forms of life originated in the marine environment, where movements are slowed down by the viscosity of water, effects of gravitation are diminished and changes in the relative spatial position and orientation are more probable than is the case in the terrestrial environment with its stable gravitational axis and unresisting atmosphere. The ability to recognize abstract shapes (i.e., animal silhouettes) independently of their relative motion, orientation, and distance was essential to survival of species and probably originated already with the first marine animals. However, the translation-, rotation-, scaling-independent (i.e., TRS-invariant) perception of abstract shapes (Gladilin, 2004) does not apply to all kinds of visual stimuli. A prominent example of dependency of visual perception on changing environmental constraints is the Thatcher-Illusion, which consists in poor recognition of upside-down faces (Psalta et al., 2014). Comparative experiments with different primates demonstrate that perception of facial expression is a relatively new feature in biological vision (Weldon et al., 2013). Sensitivity of human face perception to rotations obviously has to do with the fact that the neuronal machinery of face recognition is a relatively new cognitive feature which emerged in the terrestrial environment where primates encountered each other predominantly in the upright posture. In general, visual illusions can be attributed to optical stimuli that mislead evolutionarily conserved mechanisms of visual information processing based on a built-in knowledge of properties of the physical world (Ramachandran and Anstis, 1986). The ability to confuse or escape common cognitive schemes is, in turn, of evolutionary advantage. The fact that many animals use camouflage patterning, swarm motion or body morphing as a reliable survival strategy indicates that repetitive patterns and non-TRS transformations represent a principal challenge for biological vision, which is evolutionarily predetermined to rely on TRS-invariants of the condensed matter world, see Figure 1.

3. The Role of Phase from the Viewpoint of Computer Vision

In this section, we elucidate the role of phase information for detection of image motion and pattern recognition from the viewpoint of computer vision. Readers who are not familiar with Fourier analysis may skip over math-intensive parts, whose conclusions are summarized subsequently.

3.1. Image Representation in Spatial and Frequency Domains

In the spatial domain, 2D images are represented by a matrix $A_{x,y}$ of $N \times M$ scalar intensity values on a Euclidean image raster ($x \in [0, N-1]$, $y \in [0, M-1]$). The complex Fourier transformation maps an image $A_{x,y}$ onto the complex frequency domain $\alpha_{u,v}$:

$$\alpha_{u,v} = F(A_{x,y}) = \mathrm{Re}(\alpha_{u,v}) + i\,\mathrm{Im}(\alpha_{u,v}) \tag{1}$$

or in a more explicit form for a discrete 2D case:

$$\alpha_{u,v} = \frac{1}{\sqrt{MN}} \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} A_{x,y}\, e^{-2\pi i \left(\frac{ux}{N} + \frac{vy}{M}\right)} . \tag{2}$$

The inverse Fourier transformation mapping $\alpha_{u,v}$ onto the spatial domain is given by

$$A_{x,y} = F^{-1}(\alpha_{u,v}) = \frac{1}{\sqrt{MN}} \sum_{u=0}^{N-1} \sum_{v=0}^{M-1} \alpha_{u,v}\, e^{2\pi i \left(\frac{xu}{N} + \frac{yv}{M}\right)} . \tag{3}$$

Further, we recollect that the complex conjugate of $\alpha_{u,v}$ is defined as $\alpha^{*}_{u,v} = \mathrm{Re}(\alpha_{u,v}) - i\,\mathrm{Im}(\alpha_{u,v})$.

3.2. Importance of Phase and Amplitude: Theoretical Perspective

The relative importance of Fourier phase and amplitude for retrieval of structural image features has been debated in several previous works (Oppenheim and Lim, 1981; Lohmann et al., 1997; Ni and Huo, 2007). The basic notion is that the phase bears topological information about image edges whereas amplitude encodes image intensity. To demonstrate the effect of amplitude and phase distortion, we reconstruct the original image from only the amplitude and from only the phase of its Fourier transform, see Figure 2. Here, the amplitude-only reconstruction (Figure 2 (middle)) is computed as the Fourier inverse of the following amplitude-preserving and phase-eliminating transformation:

$$\begin{aligned} \mathrm{Re}(\alpha_{u,v}) &\rightarrow \left(\mathrm{Re}(\alpha_{u,v})^2 + \mathrm{Im}(\alpha_{u,v})^2\right)^{1/2} , \\ \mathrm{Im}(\alpha_{u,v}) &\rightarrow 0 , \end{aligned} \tag{4}$$
FIGURE 1 | Repetitive patterns, swarm motion, and body morphing disrupt detection of unique invariant features (i.e., rigid animal silhouettes). Examples of natural images are acquired from public Creative Commons sources (http://search.creativecommons.org/).
FIGURE 2 | Comparison of the effects of amplitude and phase distortion on image reconstruction. From left to right: the original Lenna image vs. amplitude-only and phase-only image transforms. The phase-only transformation works as an edge-enhancing filter resembling Marr’s Primal Sketch (Marr, 1982).
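The amplitude-only and phase-only transforms compared in Figure 2 can be tried out with a few lines of NumPy. The following is only a minimal sketch under stated assumptions: the function names, the small constant `eps` that guards against division by zero, and the synthetic square image (a stand-in for the Lenna image, which is not reproduced here) are illustrative choices and are not taken from the original work.

```python
import numpy as np

def amplitude_only(image):
    """Keep the Fourier amplitude and set the phase to zero."""
    spectrum = np.fft.fft2(image)
    # Re -> (Re^2 + Im^2)^(1/2), Im -> 0
    return np.real(np.fft.ifft2(np.abs(spectrum)))

def phase_only(image, eps=1e-12):
    """Keep the Fourier phase and normalize the amplitude to one."""
    spectrum = np.fft.fft2(image)
    # eps is an assumed guard against division by (near-)zero amplitudes
    unit = spectrum / np.maximum(np.abs(spectrum), eps)
    return np.real(np.fft.ifft2(unit))

# Hypothetical test image: a bright square on a dark background
image = np.zeros((64, 64))
image[20:44, 20:44] = 1.0

amp = amplitude_only(image)
pha = phase_only(image)

# The amplitude spectrum does not change under circular shifts, so the
# amplitude-only transform of a displaced copy is the same array.
amp_shifted = amplitude_only(np.roll(image, (5, 7), axis=(0, 1)))
```

Comparing `amp` with `amp_shifted` illustrates that the amplitude-only transform carries no information about where the structure is located, whereas `pha` can be inspected (for instance with an image viewer) for the contours of the square.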
and the phase-only reconstruction (Figure 2 (right)) is calculated as the Fourier inverse of the following phase-preserving and amplitude-normalizing transformation:

$$\begin{aligned} \mathrm{Re}(\alpha_{u,v}) &\rightarrow \frac{\mathrm{Re}(\alpha_{u,v})}{\left(\mathrm{Re}(\alpha_{u,v})^2 + \mathrm{Im}(\alpha_{u,v})^2\right)^{1/2}} , \\ \mathrm{Im}(\alpha_{u,v}) &\rightarrow \frac{\mathrm{Im}(\alpha_{u,v})}{\left(\mathrm{Re}(\alpha_{u,v})^2 + \mathrm{Im}(\alpha_{u,v})^2\right)^{1/2}} . \end{aligned} \tag{5}$$

This example demonstrates that the relative phase appears to be more significant for retrieval of cognitive image features (i.e., edges) that get completely lost in the amplitude-only transformation. Remarkably, the amplitude-normalizing phase-only reconstruction seems to effectively work as an edge-enhancing filter which generates a feature-preserving image sketch resembling Marr’s concept of Primal Sketch generation in visual cortex (Marr, 1982).

3.3. Detection of Uniform Image Motion using Phase Correlation

The Fourier phase correlation (PC) is a powerful technique which was originally developed for detection of affine image transformations such as uniform translational motion, rotation and/or scaling (De Castro and Morandi, 1987; Reddy and Chatterji, 1996). Phase correlation between two images $A_{x,y}$ and $B_{x,y}$ is computed as the Fourier inverse of the normalized cross-power spectrum (CPS):

$$PC_{x,y} = F^{-1}(CPS_{u,v}) , \tag{6}$$

where

$$CPS_{u,v} = \frac{\alpha_{u,v}\,\beta^{*}_{u,v}}{|\alpha_{u,v}\,\beta^{*}_{u,v}|} \tag{7}$$

and

$$\begin{aligned} \alpha_{u,v} &= F(A_{x,y}) \\ \beta_{u,v} &= F(B_{x,y}) \end{aligned} \tag{8}$$

are the complex Fourier transforms of the images $A_{x,y}$ and $B_{x,y}$, respectively. According to the Fourier shift theorem, relative displacement $(\Delta x, \Delta y)$ between two identical images, i.e.,

$$B_{x,y} = A_{x-\Delta x,\,y-\Delta y} , \tag{9}$$

corresponds to a phase-shift in the frequency domain

$$\beta_{u,v} = e^{-2\pi i\varphi}\,\alpha_{u,v} , \tag{10}$$

where $\varphi = \left(\frac{u\Delta x}{N} + \frac{v\Delta y}{N}\right)$. Consequently, the cross power spectrum between two identical images shifted with respect to each other in the spatial domain describes the phase-shifts of the entire Fourier spectrum in the frequency domain:

$$CPS_{u,v} = \frac{\alpha_{u,v}\, e^{2\pi i\varphi}\,\alpha^{*}_{u,v}}{|\alpha_{u,v}\, e^{2\pi i\varphi}\,\alpha^{*}_{u,v}|} = e^{2\pi i\varphi} . \tag{11}$$

For two identical images with the relative spatial shift $(\Delta x, \Delta y)$, the inverse Fourier integral of Equation (11), i.e., the phase correlation Equation (6), exhibits a single singularity at the point $(x = \Delta x, y = \Delta y)$ and is given by

$$PC_{x,y} = \delta(x - \Delta x,\, y - \Delta y) . \tag{12}$$

Thus, phase correlation of two identical images has a single maximum peak whose coordinates in the spatial domain yield the relative image translation¹ $(x = \Delta x, y = \Delta y)$, see Figure 3A.

3.4. Phase Correlation in the Presence of Noise

In the presence of additive statistical or structural noise, the cross power spectrum between two non-identical images takes the form:

$$CPS_{u,v} = e^{2\pi i\varphi} + \varepsilon_{u,v} , \tag{13}$$

where $\varepsilon_{u,v}$ is a frequency-dependent perturbation term whose properties depend on the particular type of image differences. Consequently, the inverse Fourier integral of Equation (13), i.e., the phase correlation between two non-identical images, becomes different from the Dirac delta peak of the identical image shift Equation (12):

$$PC_{x,y} = F^{-1}\left(e^{2\pi i\varphi} + \varepsilon_{u,v}\right) \neq \delta(x - \Delta x,\, y - \Delta y) , \tag{14}$$

¹ Reformulation of phase correlation in polar coordinates results in detection of the image scaling and rotation (Reddy and Chatterji, 1996).
FIGURE 3 | Examples of phase correlation (right column) between the source (left column) and the target image (middle column). Target images (A2-E2) represent the following transformations of the source image: (A2) uniform displacement, (B2) uniform displacement superimposed with 70% statistical noise, (C2) uniform displacement superimposed with 70% statistical and structural noise, (D2) uniform displacement superimposed with 20-pixel Y-motion-blur, (E2) superposition of four different uniform displacements (i.e., 4× fold repetition). (F) shows phase correlation between two significantly different images. Arrows point to the location of the absolute maximum peak of the PC. Visualization of the entire PC is performed using the following grayscale mapping: $PC_{x,y} \rightarrow 255\,(PC_{x,y} - \mathrm{MIN}(PC_{x,y}))/(\mathrm{MAX}(PC_{x,y}) - \mathrm{MIN}(PC_{x,y}))$.
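The kind of experiment summarized in Figure 3 (uniform displacement with and without additive noise) can be outlined in NumPy as follows. This is a hedged sketch rather than the authors' implementation: the random test image, the noise level, the guard constant `eps` and all function names are assumptions made for illustration. With NumPy's FFT sign convention, the peak of the inverse transform of $\alpha\beta^{*}/|\alpha\beta^{*}|$ appears at the negated displacement modulo the image size, so the sketch converts the peak index back into a signed shift.

```python
import numpy as np

def phase_correlation(a, b, eps=1e-12):
    """Inverse FFT of the normalized cross-power spectrum of a and b."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    cps = cross / np.maximum(np.abs(cross), eps)  # eps: assumed zero guard
    return np.real(np.fft.ifft2(cps))

def estimate_shift(a, b):
    """Return the signed (row, col) displacement of b relative to a and the peak height."""
    pc = phase_correlation(a, b)
    peak = np.unravel_index(np.argmax(pc), pc.shape)
    shift = []
    for p, n in zip(peak, pc.shape):
        s = int(-p) % n                      # peak sits at (-shift) mod n
        shift.append(s - n if s > n // 2 else s)
    return tuple(shift), float(pc.max())

def to_grayscale(pc):
    """Grayscale mapping 255 * (PC - MIN) / (MAX - MIN) used for display."""
    lo, hi = pc.min(), pc.max()
    return 255.0 * (pc - lo) / (hi - lo)

rng = np.random.default_rng(0)
source = rng.random((64, 64))                         # hypothetical test image
target = np.roll(source, shift=(5, -3), axis=(0, 1))  # uniform displacement
noisy = target + rng.normal(0.0, 0.2, target.shape)   # additive statistical noise

shift, peak_height = estimate_shift(source, target)
noisy_shift, noisy_peak = estimate_shift(source, noisy)
gray = to_grayscale(phase_correlation(source, noisy))
```

For the noise-free pair the peak height is close to one (a Dirac-like distribution), while additive noise lowers the peak and raises the background of the PC array; the peak height can therefore be read as a simple similarity score.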
which manifests in flattening of the maximum peak and an overall more noisy PC, see Figures 3B,C. However, as long as the target pattern does not exhibit similarities with the background structures, phase correlation between two images remains a single-peak distribution. Remarkably, even a significant structural distortion does not affect the detection of the target pattern within the noisy visual scene, see Figure 3C. This example demonstrates that the height of maxima and the overall shape of the PC distribution can serve as quantitative characteristics of image (dis)similarity, i.e., the sharper (more Dirac-like) the PC distribution, the more similar are the structures in the underlying images. An increasingly dispersed PC distribution indicates lower image similarity.

In the case of non-affine image transformations, phase correlation loses its exceptional properties and becomes a multi-peak distribution. Figure 3D shows the phase correlation of the original image with its blurred and displaced copy. Uncertainty of the 20-pixel Y-motion-blur applied in this example is reflected in the horizontal line of peaks in PC that correspond to possible alignments between the original image and its transformed copy. If the target pattern is present multiple times or exhibits structural similarity with the surrounding structures, multiple peaks occur in PC. Figure 3E shows phase correlation between the target pattern and the image containing its four displaced copies. Finding the right correspondence in such a visual scene becomes difficult or impossible. Camouflage textures and behavioral strategies of swarm animals generate repetitive patterns that confuse cognitive mechanisms of predators based on detection of unique target features, see Figure 1.

With increasing structural differences between two images, PC becomes a random distribution with significantly lower maximum peaks, see Figure 3F.

3.5. Phase Correlation in the Case of Non-Uniform Image Motion

Non-uniform motion means that displacements of image pixels differ in direction and/or magnitude. Consider a time-series of images $A_{x,y}(t)$ that are composed of two non-uniformly moving regions:

$$A_{x,y}(t) = P_{x,y}(t) + B_{x,y}(t) , \tag{15}$$

where $P_{x,y}$ stands for a particular image pattern which has to be tracked in consecutive time steps, and $B_{x,y}$ is the background region. Let $P_{x,y}$ and $B_{x,y}$ in the subsequent time step $A_{x,y}(t+1)$ undergo different translations:

$$A_{x,y}(t+1) = P_{x,y}(t+1) + B_{x,y}(t+1) , \tag{16}$$
where

$$\begin{aligned} P_{x,y}(t+1) &= P_{x+\Delta x_p,\,y+\Delta y_p}(t) , \\ B_{x,y}(t+1) &= B_{x+\Delta x_b,\,y+\Delta y_b}(t) . \end{aligned} \tag{17}$$

Considering the linearity of the Fourier transformation, one obtains for $F(A_{x,y}(t))$ and $F(A_{x,y}(t+1))$

$$\begin{aligned} \alpha_{u,v}(t) &= \rho_{u,v} + \beta_{u,v} \\ \alpha_{u,v}(t+1) &= e^{-2\pi i\varphi}\rho_{u,v} + e^{-2\pi i\psi}\beta_{u,v} , \end{aligned} \tag{18}$$

where $\varphi = \left(\frac{u\Delta x_p}{N} + \frac{v\Delta y_p}{N}\right)$ and $\psi = \left(\frac{u\Delta x_b}{N} + \frac{v\Delta y_b}{N}\right)$, respectively. Consequently, the cross power spectrum between $A_{x,y}(t)$ and $A_{x,y}(t+1)$ takes the form

$$CPS_{u,v} = \frac{\alpha_{u,v}(t)\,\alpha^{*}_{u,v}(t+1)}{|\alpha_{u,v}(t)\,\alpha^{*}_{u,v}(t+1)|} = \frac{1}{|\alpha_{u,v}(t)\,\alpha^{*}_{u,v}(t+1)|}\left(\rho_{u,v}\,e^{2\pi i\varphi}\rho^{*}_{u,v} + \rho_{u,v}\,e^{2\pi i\psi}\beta^{*}_{u,v} + \beta_{u,v}\,e^{2\pi i\varphi}\rho^{*}_{u,v} + \beta_{u,v}\,e^{2\pi i\psi}\beta^{*}_{u,v}\right) \tag{19}$$

or in a more compact form

$$CPS = CPS^{p}_{p'} + CPS^{p}_{b'} + CPS^{b}_{p'} + CPS^{b}_{b'} , \tag{20}$$

where $CPS^{*}_{*}$ denote self- and cross-correlations between the Fourier transforms of the pattern and background regions in two consecutive time steps, respectively. Primed indexes are introduced to distinguish Fourier transforms of previous ($t : p, b$) and subsequent ($t+1 : p', b'$) time steps. By applying the inverse Fourier transformation to Equation (20), one obtains the phase correlation between $A(t)$ and $A(t+1)$:

$$PC = F^{-1}(CPS) = PC^{p}_{p'} + PC^{p}_{b'} + PC^{b}_{p'} + PC^{b}_{b'} . \tag{21}$$

3.6. Saccades-Enhanced Phase Correlation

Phase correlation between two non-uniformly shifted image regions Equation (21) contains four terms:

• self-correlation of the target pattern ($PC^{p}_{p'}$),
• self-correlation of the background region ($PC^{b}_{b'}$) and
• two cross-correlation terms ($PC^{p}_{b'}$, $PC^{b}_{p'}$).

In order to detect the shift of the target pattern $P$, $PC^{p}_{p'}$ has to become the most dominant term of the total PC. Obviously, this condition is not automatically fulfilled; other terms may have stronger weight in Equation (21). If the pattern and background regions do not exhibit similarities, i.e., if the pattern $P$ is uniquely present in the image, cross-correlation terms ($PC^{p}_{b'}$ and $PC^{b}_{p'}$) should be smaller in comparison to self-correlation terms ($PC^{p}_{p'}$ and $PC^{b}_{b'}$). Thus, the major difficulty for detection of the target image pattern is caused by self-correlation of the background region ($PC^{b}_{b'}$), whose properties are a priori unknown. Obviously, a single-step phase correlation between two images is not sufficient for detection of a particular image region. In order to maximize the weight of $PC^{p}_{p'}$ and, correspondingly, to minimize the weight of other terms in Equation (21), one can construct a cumulative phase correlation by iteratively composing PC between the (fixed) target pattern and differently shifted background. Due to the formal similarity of such a strategy with back-and-forth image sampling by saccadic eye movements (see Figure 4), we termed this procedure saccades-enhanced phase correlation (Gladilin and Eils, 2009). To show why this strategy appears to be promising, we write the average phase correlation of $N$ recombinations between the target pattern and non-uniformly shifted background images:

$$PC = \frac{1}{N}\sum_{i=1}^{N} PC_i = PC^{p}_{p'} + PC^{p}_{b'} + \frac{1}{N}\sum_{i=1}^{N} PC^{b_i}_{p'} + \frac{1}{N}\sum_{i=1}^{N} PC^{b_i}_{b'} . \tag{22}$$

Since the first two terms in Equation (22) are independent of background variations ($b_i$), their absolute values remain unchanged. Further, it can be shown that the last two terms decrease with increasing $N$, and, thus, their weight in the average phase correlation can be arbitrarily decreased after a sufficiently high number of saccadic iterations $N \gg 1$. Without providing a precise
FIGURE 4 | Examples of saccadic eye movements from Yarbus (1967). Left: the eyes of the observer exhibit remarkable back-and-forth movements between different regions of interest (i.e., eyes, mouth) and the image background. Right: saccadic trajectories seem to follow the shape contours and edges.
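The back-and-forth sampling illustrated in Figure 4 motivates the averaging of phase correlations over differently shifted backgrounds in Equation (22). The toy experiment below sketches one possible reading of that procedure; it is not the authors' implementation. It assumes that the pattern and the background are available as separate arrays, that the background has roughly four times the energy of the pattern, that the frames are recombined with a fixed, deterministic schedule of background shifts, and that 32 recombinations are used. All names and parameter values are hypothetical.

```python
import numpy as np

def phase_correlation(a, b, eps=1e-12):
    """Inverse FFT of the normalized cross-power spectrum of a and b."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    return np.real(np.fft.ifft2(cross / np.maximum(np.abs(cross), eps)))

def peak_to_shift(pc):
    """Convert the location of the absolute maximum into a signed shift."""
    peak = np.unravel_index(np.argmax(pc), pc.shape)
    shift = []
    for p, n in zip(peak, pc.shape):
        s = int(-p) % n                      # NumPy convention: peak at (-shift) mod n
        shift.append(s - n if s > n // 2 else s)
    return tuple(shift)

rng = np.random.default_rng(1)
size = 64
background = rng.standard_normal((size, size))          # dominant texture B
pattern = np.zeros((size, size))                        # small target pattern P
pattern[24:40, 24:40] = 2.0 * rng.standard_normal((16, 16))

dp = (4, -6)    # assumed shift of the pattern between t and t+1
db = (-9, 7)    # assumed shift of the background between t and t+1
target = np.roll(pattern, dp, axis=(0, 1)) + np.roll(background, db, axis=(0, 1))

# Single-step phase correlation between A(t) and A(t+1)
single = peak_to_shift(phase_correlation(pattern + background, target))

# Average over recombinations of the fixed pattern with shifted backgrounds
n_saccades = 32
mean_pc = np.zeros((size, size))
for i in range(n_saccades):
    s = ((3 * i + 7) % size, (5 * i + 11) % size)       # hypothetical shift schedule
    frame = pattern + np.roll(background, s, axis=(0, 1))
    mean_pc += phase_correlation(frame, target) / n_saccades
cumulative = peak_to_shift(mean_pc)
```

In this configuration the single-step estimate would be expected to follow the dominant background shift `db`, whereas the averaged estimate would be expected to recover the pattern shift `dp`, because the background peak lands at a different position in every recombination and is therefore diluted by the averaging.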
proof, we can give the following plausible comment: for different shifts of the background region, positions of maxima in cumulative phase correlation differ as well. Consequently, the sum over different $b_i$ remains bounded, and the average value of the last two terms in Equation (22) decreases as $N^{-1}$, i.e., $\lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} PC^{b_i}_{b'} \to 0$. As a result of saccadic image composition, self-correlation of the target pattern $PC^{p}_{p'}$ becomes the most dominant term and the shift of $P$ can be determined from the coordinate of the absolute maximum of Equation (22).

The less structured the target pattern is and the more similar it is to the image background, the more difficult becomes the virtual separation of target and background regions using saccades-enhanced phase correlation. Consequently, analysis of poorly structured visual scenes requires more saccadic iterations for detection and recognition of the target pattern. Remarkably, experimental findings seem to confirm this theoretical prediction: the strategy of saccades during observation of unstructured textural images exhibits increasing frequency of target-background eye movements (He and Kowler, 1992).

3.7. Consideration of Visual Acuity

The foveal and peripheral areas of the retinal image are known to exhibit significant differences in acuity that have to be considered in the construction of Fourier transforms and phase correlations of target and surrounding images. With approximately 3° of high-acuity foveal cone-projection (Osterberg, 1935), the observer’s eye can sharply resolve only an area with the cross-section dimension of $D \approx 0.1\,L$, where $L$ denotes the distance from the observer to the focus plane. For a computer screen at a distance of $L = 50$ cm, this gives a spot $D = 5$ cm wide. The remaining peripheral area is progressively blurred with the distance from the focus. Consequently, a more natural representation of the retinal and higher-level neural images is the composition of the central pattern surrounded by the low-pass smoothed periphery. For calculation of saccades-enhanced phase correlation this, in turn, means that not only the position of the focus but also spectral characteristics of the central and peripheral areas have to be appropriately filtered anew for each saccadic fixation image. Repetitive target-background sampling by saccades will, obviously, lead to enhancement of small details (i.e., high-frequency components) of more frequently focused regions and low-pass smoothing of less frequently sampled, peripheral areas. As a consequence, one can expect saccadic analysis to better discriminate images that show distinctive spectral differences between central and peripheral areas. Visual examination of images with similar spectral characteristics of pattern and background regions can be, in turn, associated with intensification of back-and-forth saccadic eye movements.

4. Psychophysical Evidence of Phase Involvement in Visual Information Processing

In this section, we review some psychophysical findings indicating [...] and analyze them from the perspective of theoretical concepts of phase-based motion and pattern detection.

4.1. Importance of Phase and Amplitude: Psychophysical Perspective

From theoretical considerations in Section 3.2, phase appears to be more essential for retrieval of structural information than amplitude. Psychophysical findings in Freeman and Simoncelli (2011) and Zhang et al. (2014) suggest, however, a combined phase-amplitude mechanism of pattern perception with higher weight of phase information near the fixation point and increasing importance of amplitude on the periphery of the visual field. On the other hand, one should consider that conscious fixations inhibit saccades, which results in progressive low-pass blurring of the peripheral image. Unconstrained image observation is always associated with saccadic eye movements that acquire high-frequency phase information from different image areas and, thus, substantially increase the real weight of phase information in image perception and (re)cognition.

4.2. On the Role of Phase and Saccades in Visual Illusions

Seemingly different visual illusions share the common feature of being triggered by coherently phase-shifted repetitive patterns. Below we briefly review three groups of visual illusions² that generate effects of (i) virtual depth (Tyler and Clarke, 1990), (ii) apparent motion (Kitaoka and Ashida, 2003), and (iii) non-local image tilt (Popple and Levi, 2000). A close resemblance in stimulus configuration of different visual illusions has been suggested in previous works (Kitaoka, 2006). However, a unified concept of underlying neural mechanisms that drive different perceptual illusions is still missing.

4.2.1. Virtual Depth Illusions

Stereogram images such as shown in Figure 5 cause perceptual illusions of virtual depth and hidden 3D content. Stereograms are composed of repetitive patterns whose retinal projections in the left and right eyes exhibit a relative spatial shift in the image domain and a corresponding phase-shift in the frequency domain. Accordingly, two basic models of binocular disparity based on position- and phase-shift receptive fields have been discussed in the literature in the last two decades (Arndt et al., 1995; Fleet et al., 1996; Ohzawa et al., 1997; Parker and Cumming, 2001; Chen and Qian, 2004; Goutcher and Hibbard, 2014). Anzai et al. (1997) conclude that “binocular disparity is mainly encoded through phase disparity.” Fleet (1994) suggests a model of binocular disparity computation using the Local Weighted Phase Correlation which combines the features of phase-shift and phase correlation approaches. If phase correlation is, in fact, involved in binocular disparity calculation, the underlying neural mechanisms of virtual depth detection can be expected to depend on a certain threshold of neuronal activity, i.e., the strength of phase correlation, which, in turn, should be dependent on structural image properties. In particular, as we have seen above, one can expect that structured (i.e., edge-rich, phase-congruent) patterns

² All examples of visual stimuli were taken from the “Illusion Pages” of A. Kitaoka
FIGURE 5 | Examples of virtual depth illusions (stereograms) based on structured (left) and diffuse textural (right) patterns (courtesy A. Kitaoka).
such as shown in Figure 5 (left) produce stronger phase correlation signals and, thus, trigger virtual depth illusions more easily and faster than diffuse textural patterns such as Figure 5 (right). Further experimental investigations are required to test this purely theoretical prediction.
4.2.2. Apparent Motion Illusions
Apparent motion illusions induce perception of dynamic image changes while observing static visual stimuli. Notably, the intensity of apparent motion illusions depends on spectral characteristics (i.e., low/high-frequency image content) and the relative phase-shift of repetitive patterns.
4.2.2.1 The Rotating Snake
patterns from Kitaoka and Ashida (2003) induce a remarkably strong illusion of apparent rotational motion, see Figures 6A,B. The low-pass smoothed Rotating Snake in Figures 6C,D exhibits a reduced intensity of apparent rotational motion. Backus and Oruç (2005) explain the emergence of illusory motion in the Rotating Snakes by the difference in the temporal response of visual neurons to low and high contrast. This difference leads to misinterpretation of the temporal phase-shift as a spatial phase-shift (“phase advance”) at high contrast. The authors attribute the effect of low-pass smoothing to a reduction of differences between high- and low-contrast regions. Recent findings indicate that signals of illusory motion in V1 and MT cortical areas can also be triggered by an update of the retinal image as a result of saccadic eye movements or blinks (Conway et al., 2005; Troncoso et al., 2008; Otero-Millan et al., 2012; Martinez-Conde et al., 2013). Consequently, conscious suppression of saccades inhibits illusions of apparent motion that are based on phase-advancing contrast patterns. To dissect the structural principle of the Rotating Snake in more detail, we performed its polar-to-rectangle transformation into the Translating Snake, see Figures 6E–H. This transformation changes the relative spatial orientation of repetitive patterns while preserving their local contrast structure. We observe that a pair of parallel Translating Snake patterns does not induce any significant perceptual effects, see Figures 6E,F. In contrast, antiparallel Translating Snake patterns generate a weak illusion of translational motion, see Figures 6G,H. From this observation, we conclude that phase advancement due to the local contrast gradient is required but not sufficient for generation of an apparent motion illusion. The sufficient condition consists in different spatial orientation of repetitive motion patterns: equally oriented motion patterns of the Translating Snake do not induce any illusory motion, while non-uniformly organized contrast gradients of the Rotating Snake do, see Figures 6I,J. Thus, we conclude that apparent motion signals are triggered not by phase advancement at high contrast alone but by the difference in phase advancement between each two image regions subsequently fixated by saccades.
4.2.2.2 The Anomalous Motion
from Kitaoka (2006) is another example of an apparent motion illusion which is induced by contrarily oriented contrast-gradient patterns, see Figure 7 (left). In Figure 7 (right), central and peripheral contrast-gradient patterns were aligned in the same direction. As a result, the illusion of apparent motion disappears. Only the combination of patterns with contrarily oriented contrast gradients (i.e., the relative phase shift) is capable of generating a stable illusion of apparent relative motion, see Figure 7 (left). Similar to the Rotating Snake, the Anomalous Motion illusion requires saccadic eye movements. Suppression of saccades by conscious point fixation stops the illusion of apparent motion.
4.2.3. Non-Local Tilt Illusion
Figure 8 shows the virtual tilt illusion from Popple and Levi (2000) and Popple and Sagi (2000) which seems to be triggered without local cues. The particularity of this stimulus consists in the way it is constructed: horizontal lines of patterns that exhibit a relative vertical phase-shift. Consequently, the horizontal lines appear to have a vertical tilt whose direction depends on the sign of the phase-shift. Based on our previous analysis of motion
FIGURE 7 | The Anomalous Motion (courtesy A. Kitaoka) induces an illusion of apparent translational motion (left). The manipulated equidirectional stimulus (right) does not trigger any significant motion illusion.
FIGURE 8 | Dependence of the non-local tilt illusion on low/high-frequency image content. From left to right: the low-pass filtered vs. unfiltered Popple illusion (courtesy A. Kitaoka).
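The low-pass filtering referred to for Figures 6C,D and Figure 8 can be illustrated with a frequency-domain sketch. The filter actually used for these stimuli is not specified in the text, so the Gaussian transfer function, the function name `gaussian_lowpass`, and the parameter `sigma_cycles` (cut-off width in cycles per pixel) are assumptions made purely for illustration.

```python
import numpy as np


def gaussian_lowpass(image, sigma_cycles):
    """Attenuate high spatial frequencies with a Gaussian transfer function.

    `sigma_cycles` is a hypothetical parameter: the standard deviation of the
    Gaussian in the frequency domain, in cycles per pixel.
    """
    # Frequency coordinates of every FFT bin (cycles per pixel).
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    transfer = np.exp(-(fx**2 + fy**2) / (2.0 * sigma_cycles**2))
    # The transfer function is real and symmetric, so only the amplitude is
    # reduced and the phase of the retained frequencies is left untouched.
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))


# Hypothetical stimuli: a uniform field and a pixel-wise checkerboard.
uniform = np.ones((8, 8))
rows, cols = np.indices((8, 8))
checkerboard = (-1.0) ** (rows + cols)
smoothed_uniform = gaussian_lowpass(uniform, 0.1)
smoothed_checkerboard = gaussian_lowpass(checkerboard, 0.1)
```

In this sketch the uniform field passes unchanged because the transfer function equals one at zero frequency, while the checkerboard, which lives entirely at the highest representable frequency, is almost completely removed.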
FIGURE 9 | Scheme of the hypothetic mechanisms of visual pattern recognition. Persistent activity of a small number of neurons in association cortex is a feature of high image similarity. In the ideal case, similarity is detected by a single neuron. In contrast, a more disperse and stochastic pattern of neural activity indicates a low degree of image similarity.
FIGURE 10 | Example of pattern recognition using phase correlation. From left to right: (i) the target smiley, (ii) multi-smiley image, phase correlation between (i) and (ii). The green frame indicates the correct location of the target pattern in the image, the red frame shows the wrong match which corresponds to the absolute maximum of the noisy phase correlation. Consideration of visual acuity improves the recognition score. Phase correlation between the target smiley and the images with three different acuity foci peaks out the right pattern location with the maximum height of PC = 7.93E+3.
Furthermore, missing similarity between images can be expected to provoke intensification of saccadic eye movements.
An example of repetitive pattern discrimination/recognition using phase correlation is shown in Figure 10. The task consists in finding a particular smiley within a group of similar patterns. Since phase correlation of noise-free images will immediately match the right location of the target smiley, the search is complicated by adding a large amount of high-frequency noise which substantially corrupts small image features (such as the smiley’s eyes). Single-step phase correlation between substantially noised images results in selection of the wrong pattern location (see yellow framed smiley in Figure 10). Due to the high level of noise, the peak of phase correlation corresponding to the correct pattern (green framed smiley) has the lower height. Remarkably, consideration of visual acuity (i.e., peripheral blurring) helps to improve the recognition score. Phase correlation between the target smiley and three images with different visual foci manages to peak out the right pattern location which corresponds to the highest peak of PC = 7.93E+3.
Another example of remarkable features of phase correlation as a pattern recognition tool is detection of the virtual image content in visual completion illusions. Figure 11 demonstrates detection of virtual geometrical patterns (i.e., triangle, circle) in the completion illusions from Idesawa (1991) and Kanizsa (1995). The correct location of the virtual figures corresponds to the absolute maximum of phase correlation. These examples
FIGURE 11 | Detection of the virtual image content using phase correlation. From left to right: (i) hidden patterns of illusion stimuli (i.e., triangle, circle), (ii) visual completion illusion from Kanizsa (1995) (top row) and Idesawa (1991) (bottom row), (iii) phase correlations between (i) and (ii) (maximum is indicated by the arrow), registration of (i) onto (ii) according to the maximum of (iii).
6. Discussion
Here, we merge existing phenomenological findings, computational analysis and theoretical hypotheses to dissect the role of image phase in diverse phenomena of visual information processing, illusion and cognition. We argue that the fundamental importance of phase for detection of structural image features and transformations is of clear evolutionary advantage for survival of species and can be assumed to promote the development of phase-based mechanisms of neural image processing. A large body of neurophysiological and psychophysical evidence seems to confirm the assumption that biological vision relies on frequency domain transformation, filtering and higher-order processing of retinal images in the visual cortex. Hence, the emergence of efficient phase-based neural mechanisms in the course of evolution appears to be plausible. We show that the concepts of phase shift, amplitude-normalizing phase-only transformation and phase correlation provide a qualitative description for a number of puzzling visual phenomena including
• preservation of cognitive features in the image sketch (in the sense of Marr’s Primal Sketch),
• robustness of pattern detection with respect to substantial level of noise and structural distortion,
• “eye exhaustion” by observation of repetitive and blurry scenes,
• advantages of saccadic strategy of iterative target-background sampling for pattern discrimination,
• dependency of saccadic eye movements on structural image properties (i.e., target-background similarity and spectral characteristics),
• advantages of differences in foveal and peripheral acuity for visual pattern recognition,
• dependency of the delay time by perception of virtual depth illusions on phase properties of stimuli,
• coherent phase shifts in contrast-gradient patterns of apparent motion illusions,
• driving role of saccades in apparent motion and tilt illusions,
• recognition of virtual patterns in completion illusions using phase correlation,
• singular pattern of neural activity in the association cortex by recognition of similar visual stimuli.
Although straightforward projections of theoretical concepts onto biological systems can, in general, lead to too far-reaching extrapolations, some of our hypothetical predictions, such as the dependency of saccade strategy on structural image properties and the singular response of association cortex to structurally similar visual stimuli, can, in principle, be tested in experiment.
There is a tight resemblance between the concepts of amplitude-normalizing phase-only transformation and phase correlation we used in our work and energy models (Morrone and Owens, 1987; Morrone and Burr, 1988; Fleet et al., 1996) and phase congruency detectors (Morrone et al., 1986; Kovesi, 2000). Both concepts take advantage of two basic principles:
(i) amplitude-normalization, which effectively performs edge enhancement (i.e., image sketchification) and makes scene analysis independent of the level of illuminance and contrast, and (ii) calculation of the cognitive checksum by building an integral over the entire frequency spectrum, which, on one hand, makes the cognition extremely robust with respect to noise and, on the other hand, allows distributed storage of information in neural networks. Otherwise, there is a basic difference between these two concepts: phase congruency can be seen as an extended amplitude-normalizing, edge-enhancing filter, while phase correlation is constructed to detect the relative transformation and/or structural (dis)similarity between each two images. Furthermore, phase congruency is presumably performed by V1 neurons, while phase correlation can be expected to take place in a higher level of visual cortex hierarchy, i.e., association cortex. Finally, taking into consideration potential redeployment of the brain areas (Anderson, 2007), one can expect that the suggested principle of pattern recognition by phase correlation is not restricted to the visual system and could also play a role in other cognitive functions.
Within the general framework of recent hierarchical bottom-up top-down models of visual cortex (Lee and Mumford, 2003; Epshtein et al., 2008; Poggio and Ullman, 2013), our findings provide a theoretical explanation for what Marr called “early non-attentive vision” (Marr, 1976, 1982). In particular, our above results suggest that phase-only transformation in V1 with subsequent phase correlation in association cortex represent bottom-up neural mechanisms of Primal Sketch generation and perception, respectively. However, differently from the canonical edge operators that are based on derivatives (i.e., edge-mask convolution) of the image intensity function, edge information in the frequency domain is given implicitly by the relative phase structure and can be assessed for the entire image in a non-iterative and non-local manner. The ability of phase correlation to capture global structural information “on-the-fly” makes it an ultimate tool for rapid bottom-up processing of the focused image content. The temporal focus of the observer is, in turn, controlled by higher-order cortical centers that integrate bottom-up streams and define conscious and unconscious strategies of visual scene sampling.
While the focus of our present work is on the role of image phase in visual information processing, it should be stated that phase is not the exclusive carrier of cognitive features of visual stimuli. Findings in Freeman and Simoncelli (2011) and Zhang et al. (2014) suggest that amplitude information is also involved in visual (re)cognition and can even predominate in peripheral vision or in the perception of textural images. It is a subject of future research to reveal how phase and amplitude are weighted and merged into an integrated whole in association cortex depending on the structural properties of visual stimuli.
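As a minimal sketch of the two operations discussed above, the following NumPy code illustrates an amplitude-normalizing phase-only transformation and a phase correlation between a target pattern and a scene. It is not the implementation used for Figures 10 and 11; the function names (`phase_only`, `phase_correlation`, `locate`), the zero-padding of the target to the scene size, the small constant `eps` and the random test patch are all assumptions made for illustration.

```python
import numpy as np


def phase_only(image, eps=1e-12):
    """Keep the Fourier phase of `image` and set every amplitude to (almost) 1."""
    spectrum = np.fft.fft2(image)
    # `eps` (an assumption) only guards against division by zero.
    return np.real(np.fft.ifft2(spectrum / (np.abs(spectrum) + eps)))


def phase_correlation(target, scene, eps=1e-12):
    """Phase correlation surface of two equally sized 2-D arrays."""
    cross = np.fft.fft2(scene) * np.conj(np.fft.fft2(target))
    cross /= np.abs(cross) + eps  # amplitude normalization
    return np.real(np.fft.ifft2(cross))


def locate(target, scene):
    """Estimate the (row, col) offset of `target` inside `scene`."""
    # Zero-pad the target so that both arrays have the same shape.
    padded = np.zeros(scene.shape, dtype=float)
    padded[: target.shape[0], : target.shape[1]] = target
    surface = phase_correlation(padded, scene)
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    return (int(peak[0]), int(peak[1])), float(surface[peak])


# Hypothetical noise-free example: a random 16x16 patch placed at
# row 20, column 37 of an otherwise empty 64x64 scene.
rng = np.random.default_rng(0)
patch = rng.random((16, 16))
scene = np.zeros((64, 64))
scene[20:36, 37:53] = patch
offset, height = locate(patch, scene)
```

In this noise-free case the scene is a pure translation of the padded target, so the normalized cross-spectrum is a phase ramp and the correlation surface has a single sharp peak at the translation. With added noise or distractor patterns the peak becomes lower and competing maxima appear, which is the situation described for Figure 10.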
References
Adelson, E., and Bergen, J. (1985). Spatiotemporal energy models for the perception of motion. J. Opt. Soc. Am. A 2, 284–299.
Anderson, M. (2007). Evolution of cognitive function via redeployment of brain areas. Neuroscientist 13, 1–9. doi: 10.1177/1073858406294706
Anzai, A., Ohzawa, I., and Freeman, R. (1997). Neural mechanisms underlying binocular fusion and stereopsis: position vs. phase. Proc. Natl. Acad. Sci. U.S.A. 94, 5438–5443.
Arndt, P., Mallot, H., and Bülthoff, H. (1995). Human stereovision without localized image features. Biol. Cybern. 72, 279–293.
Backus, B., and Oruç, I. (2005). Illusory motion from change over time in the response to contrast and luminance. J. Vis. 5, 1055–1069. doi: 10.1167/5.11.10
Barron, J., Fleet, D., and Beauchemin, S. (1994). Performance of optical flow techniques. Int. J. Comp. Vis. 12, 43–77.
Bex, P., and Makous, W. (2002). Spatial frequency, phase, and the contrast of natural images. J. Opt. Soc. Am. A 19, 1096–1106. doi: 10.1364/JOSAA.19.001096
Blakemore, C., and Campbell, F. (1969). On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. J. Physiol. 213, 237–260.
Blakemore, C., Nachmias, J., and Sutton, P. (1969). The perceived spatial frequency shift: evidence for frequency-selective neurones in the human brain. J. Physiol. 210, 727–750.
Booth, M., and Rolls, E. (1998). View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cereb. Cortex 8, 510–523.
Campbell, F., and Robson, J. (1968). Application of Fourier analysis to the visibility of gratings. J. Physiol. 197, 551–566.
Chen, Y., and Qian, N. (2004). A coarse-to-fine disparity energy model with both phase-shift and position-shift receptive field mechanisms. Neural Comput. 16, 1545–1577. doi: 10.1162/089976604774201596
Conway, B., Kitaoka, A., Yazdanbakhsh, A., Pack, C., and Livingstone, M. (2005). Neural basis for a powerful static motion illusion. J. Neurosci. 25, 5651–5656. doi: 10.1523/JNEUROSCI.1084-05.2005
De Castro, E., and Morandi, C. (1987). Registration of translated and rotated images using finite Fourier transforms. IEEE Trans. Pattern Anal. Mach. Intell. 9, 700–703.
De Valois, R., and De Valois, K. (1990). Spatial Vision. New York, NY: Oxford University Press.
De Valois, R., Albrecht, D., and Thorell, L. (1982). Spatial frequency selectivity of cells in macaque visual cortex. Vis. Res. 22, 545–559.
Donoho, D., and Flesia, A. (2001). Can recent innovations in harmonic analysis ‘explain’ key findings in natural image statistics. Network 12, 391–412. doi: 10.1080/net.12.3.371.393
Epshtein, B., Lifshitz, I., and Ullman, S. (2008). Image interpretation by a single bottom-up top-down cycle. Proc. Natl. Acad. Sci. U.S.A. 105, 14298–14303. doi: 10.1073/pnas.0800968105
Fleet, D., and Jepson, A. (1990). Computation of component image velocity from local phase information. Int. J. Comp. Vis. 5, 77–104.
Fleet, D., Wagner, H., and Heeger, D. (1996). Neural encoding of binocular disparity: energy models, position shifts and phase shifts. Vis. Res. 36, 1839–1857.
Fleet, D. (1994). “Disparity from local weighted phase correlation,” in Proceedings IEEE International Conference on Systems, Man and Cybernetics (San Antonio, TX), 48–56.
Freeman, J., and Simoncelli, E. (2011). Metamers of the ventral stream. Nat. Neurosci. 14, 1195–1201. doi: 10.1038/nn.2889
Gladilin, E., and Eils, R. (2009). “Detection of non-uniform multi-body motion in image time-series using saccades-enhanced phase correlation,” in Proceedings of SPIE Medical Imaging 2009: Image Processing, eds J. P. W. Pluim and B. M. Dawant (San Diego, CA). doi: 10.1117/12.811120
Gladilin, E. (2004). “A contour based approach for invariant shape description,” in Proceedings of SPIE, Medical Imaging 2004: Image Processing (San Diego, CA), 5370, 1282–1291.
Goutcher, R., and Hibbard, P. (2014). Mechanisms for similarity matching in disparity measurement. Front. Psych. 4:1014. doi: 10.3389/fpsyg.2013.01014
Graham, D., and Field, D. (2006). Evolution of the Nervous Systems Chapter Sparse Coding in the Neocortex. Ithaca, NY: Academic Press.
Graham, N. (1981). “The visual system does a crude Fourier analysis of patterns,” in Mathematical Psychology and Psychophysiology, SIAM-AMS Proceedings Vol. 13, ed S. Grossberg (Providence, Rhode Island: American Mathematical Society), 1–16.
Graham, N. (1989). Visual Pattern Analyzers. New York, NY: Oxford University Press.
Hamilton, D., Albrecht, D., and Geisler, W. (1989). Visual cortical receptive fields in monkey and cat: spatial and temporal phase. Vis. Res. 29, 1285–1308.
He, P., and Kowler, E. (1992). The role of saccades in the perception of texture patterns. Vis. Res. 32, 2151–2163.
Heeger, D. (1992). Normalization of cell responses in cat striate cortex. Vis. Neurosci. 9, 181–197.
Henriksson, L., Hyvärinen, A., and Vanni, S. (2009). Representation of cross-frequency spatial phase relationships in human visual cortex. J. Neurosci. 29, 14342–14351. doi: 10.1523/JNEUROSCI.3136-09.2009
Hietanen, M., Cloherty, S., van Kleef, J., Wang, C., Dreher, B., and Ibbotson, M. (2013). Phase sensitivity of complex cells in primary visual cortex. Neuroscience 237, 19–28. doi: 10.1016/j.neuroscience.2013.01.030
Hubel, D., and Wiesel, T. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 160, 106–154.
Hubel, D., and Wiesel, T. (1968). Receptive fields and functional architecture of monkey striate cortex. J. Physiol. 195, 215–243.
Hyvärinen, A., and Hoyer, P. (2001). A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vis. Res. 41, 2413–2423. doi: 10.1016/S0042-6989(01)00114-6
Idesawa, M. (1991). “Perception of illusory solid object with binocular viewing,” in Proceedings IJCNN-91 Seattle International Joint Conference of Neural Networks (Seattle, WA), Vol. II, A-943.
Ito, M., Tamura, H., Fujita, I., and Tanaka, K. (1995). Size and position invariance of neuronal responses in monkey inferotemporal cortex. J. Neurophysiol. 73, 218–226.
Kanizsa, G. (1995). Margini quasi-percettivi in campi con stimolazione omogenea. Riv. Psycol. 49, 7–30.
Kirchner, H., and Thorpe, S. (2006). Ultra-rapid object detection with saccadic eye movements: visual processing speed revisited. Vis. Res. 46, 1762–1776. doi: 10.1016/j.visres.2005.10.002
Kitaoka, A., and Ashida, H. (2003). Phenomenal characteristics of the peripheral drift illusion. VISION 15, 261–262. Available online at: http://www.psy.ritsumei.ac.jp/~akitaoka/PDrift.pdf
Kitaoka, A. (2006). “Anomalous motion illusion and stereopsis,” in Journal Three Dimensional Images (Tokyo), 9–14.
Kovesi, P. (2000). Phase congruency: a low-level image invariant. Psych. Res. 64, 136–148. doi: 10.1007/s004260000024
Kruger, N., Janssen, P., Kalkan, S., Lappe, M., Leonardis, A., Piater, J., et al. (2013). Deep hierarchies in the primate visual cortex: what can we learn for computer vision? IEEE Trans. Pattern Anal. Mach. Intell. 35, 1847–1871. doi: 10.1109/TPAMI.2012.272
Lee, T., and Mumford, D. (2003). Hierarchical Bayesian inference in the visual cortex. J. Opt. Soc. Am. A 20, 1434–1448. doi: 10.1364/JOSAA.20.001434
Lindeberg, T. (2013). Invariance of visual operations at the level of receptive fields. PLoS ONE 8:e66990. doi: 10.1371/journal.pone.0066990
Lohmann, A., Mendlovic, D., and Gal, S. (1997). Significance of phase and amplitude in the Fourier domain. J. Opt. Soc. Am. A 14, 2901–2904.
Mallat, S. (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 11, 674–693.
Marcelja, S. (1980). Mathematical description of the responses of simple cortical cells. J. Opt. Soc. Am. 70, 1297–1300.
Marr, D. (1976). Early processing of visual information. Philos. Trans. R. Soc. Lond. B Biol. Sci. 275, 483–519.
Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco, CA: W. H. Freeman and Company.
Martinez-Conde, S., Otero-Millan, J., and Macknik, S. (2013). The impact of microsaccades on vision: towards a unified theory of saccadic function. Nat. Rev. Neurosci. 14, 83–96. doi: 10.1038/nrn3405
Mesulam, M. (1998). From sensation to cognition. Brain 121, 1013–1052.
Morgan, M., Ross, J., and Hayes, A. (1991). The relative importance of local phase and local amplitude in patchwise image recognition. Biol. Cybern. 65, 113–119.
Morrone, M., and Burr, D. (1988). Feature detection in human vision: a phase-dependent energy model. Philos. Trans. R. Soc. Lond. B Biol. Sci. 235, 221–245.
Morrone, M., and Owens, R. (1987). Feature detection from local energy. Pattern Recogn. Lett. 6, 303–313.
Morrone, M., Ross, J., Burr, D., and Owens, R. (1986). Mach bands are phase dependent. Nature 324, 250–253.
Ni, X., and Huo, X. (2007). Statistical interpretation of the importance of phase information in signal and image reconstruction. Stat. Probab. Lett. 77, 447–454. doi: 10.1016/j.spl.2006.08.025
Nishida, S. (2011). Advancement of motion psychophysics: review 2001-2010. J. Vis. 11, 1–53. doi: 10.1167/11.5.11
Ohzawa, I., DeAngelis, G., and Freeman, R. (1990). Stereoscopic depth discrimination in the visual cortex: neurons ideally suited as disparity detectors. Science 249, 1037–1041.
Ohzawa, I., DeAngelis, G., and Freeman, R. (1997). Encoding of binocular disparity by complex cells in the cat’s visual cortex. J. Neurophysiol. 77, 2879–2909.
Oppenheim, A., and Lim, J. (1981). The importance of phase in signals. Proc. IEEE 69, 529–541. doi: 10.1109/PROC.1981.12022
Osterberg, G. (1935). Topography of the Layer of Rods and Cones in the Human Retina Vol. 13 of Acta Ophthalmologica. Copenhagen: A. Busck.
Otero-Millan, J., Macknik, S., and Martinez-Conde, S. (2012). Microsaccades and blinks trigger illusory rotation in the rotating snakes illusion. J. Neurosci. 32, 6043–6051. doi: 10.1523/JNEUROSCI.5823-11.2012
Palmeri, T., and Gauthier, I. (2004). Visual object understanding. Nat. Rev. Neurosci. 5, 291–304. doi: 10.1038/nrn1364
Parker, A., and Cumming, B. (2001). Cortical mechanisms of binocular stereoscopic vision. Prog. Brain Res. 134, 205–216.
Poggio, T., and Ullman, S. (2013). Vision: are models of object recognition catching up with the brain? Ann. N. Y. Acad. Sci. 1305, 72–82. doi: 10.1111/nyas.12148
Pollen, D., and Ronner, S. (1981). Phase relationship between adjacent simple cells in the visual cortex. Science 212, 1409–1411.
Pollen, D., and Ronner, S. (1983). Visual cortical neurons as localized spatial frequency filters. IEEE Trans. Sys. Man Cybern. 5, 907–916.
Popple, A., and Levi, D. (2000). A new illusion demonstrates long-range processing. Vis. Res. 40, 2545–2549. doi: 10.1016/S0042-6989(00)00127-9
Popple, A., and Sagi, D. (2000). A Fraser illusion without local cues? Vis. Res. 40, 873–878. doi: 10.1016/S0042-6989(00)00010-9
Psalta, L., Young, A., Thompson, P., and Andrews, T. (2014). The Thatcher illusion reveals orientation dependence in brain regions involved in processing facial expressions. Psychol. Sci. 25, 128–136. doi: 10.1177/0956797613501521
Quiroga, R., Reddy, L., Kreiman, G., Koch, C., and Fried, I. (2005). Invariant visual representation by single neurons in the human brain. Nature 435, 1102–1107. doi: 10.1038/nature03687
Ramachandran, V., and Anstis, S. (1986). The perception of apparent motion. Sci. Am. 254, 102–109.
Reddy, B., and Chatterji, B. (1996). An FFT-based technique for translation, rotation, and scale-invariant image registration. IEEE Trans. Image Process. 5, 1266–1271.
Riesenhuber, M. (2005). Neurobiology of Attention Chapter Object Recognition in Cortex: Neural Mechanisms, and Possible Roles for Attention. Philadelphia, PA: Elsevier.
Sampat, M., Wang, Z., Gupta, S., Bovik, A., and Markey, M. (2009). Complex wavelet structural similarity: a new image similarity index. IEEE Trans. Image Process. 18, 2385–2401. doi: 10.1109/TIP.2009.2025923
Schwartz, O., and Simoncelli, E. (2001). Natural signal statistics and sensory gain control. Nat. Neurosci. 4, 819–825. doi: 10.1038/90526
Shams, L., and Malsburg, C. (2002). The role of complex cells in object recognition. Vis. Res. 42, 2547–2554. doi: 10.1016/S0042-6989(02)00202-X
Thaler, L., Todd, J., and Dijkstra, T. (2007). The effects of phase on the perception of 3D shape from texture: psychophysics and modeling. Vis. Res. 47, 411–427. doi: 10.1016/j.visres.2006.10.007
Thomas, J., Bagrash, F., and Kerr, L. (1969). Selective stimulation of two form sensitive mechanisms. Vis. Res. 9, 625–627.
Troncoso, X., Macknik, S., Otero-Millan, J., and Martinez-Conde, S. (2008). Microsaccades drive illusory motion in the enigma illusion. Proc. Natl. Acad. Sci. U.S.A. 105, 16033–16038. doi: 10.1073/pnas.0709389105
Tyler, C., and Clarke, M. (1990). “The autostereogram,” in Proceedings of SPIE, Stereoscopic Displays and Applications (Santa Clara, CA), 182–196.
Walls, G. (1962). The evolutionary history of eye movements. Vis. Res. 2, 69–80.
Weldon, K., Taubert, J., Smith, C., and Parr, L. (2013). How the Thatcher illusion reveals evolutionary differences in the face processing of primates. Anim. Cogn. 16, 691–700. doi: 10.1007/s10071-013-0604-4
Yarbus, A. (1967). Eye Movements and Vision. New York, NY: Plenum Press.
Zhang, F., Jiang, W., Autrusseau, F., and Lin, W. (2014). Exploring V1 by modeling the perceptual quality of images. J. Vis. 14, 1–14. doi: 10.1167/14.1.26
Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Copyright © 2015 Gladilin and Eils. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.