EUROGRAPHICS ’10/ Helwig Hauser and Erik Reinhard
STAR – State of The Art Report
Image Statistics and their Applications in Computer Graphics
Tania Pouli1 , Douglas W. Cunningham2 and Erik Reinhard1
1 Department
2 Brandenburg
of Computer Science, University of Bristol, UK
Technical University Cottbus, Cottbus, Germany
Abstract
The statistics of natural images have attracted the attention of researchers in a variety of fields and have been used
as a means to better understand the human visual system and its processes. A number of algorithms in computer
graphics, vision and image processing take advantage of such statistical findings to create visually more plausible
results. With this report we aim to review the state of the art in image statistics and discuss existing and potential
applications within computer graphics and related areas.
1. Introduction
The field of natural image statistics studies images and their
statistical regularities [Gei08, HHH09]. The human visual
system has evolved in natural environments, i.e. those without man-made structures. Human vision, it is argued, has
therefore evolved to be particularly good at observing natural scenes, as opposed to random images [van98a]. By studying the input to the human visual system, it is thought that
the human visual system can be understood better. This has
led to a rich literature on various aspects of natural image
statistics [Rud94, HM99b, DF01].
The typical process by which natural image statistics are
computed, involves collecting statistics of ensembles of images after passing each image through a particular transform.
The transforms typically employed tend to correspond to a
component of human vision. It is, for instance, possible to
compute the power spectra of each image, and from these
calculate the average power spectral slope for the whole
set. This state-of-the-art report reviews this and many other
transforms and statistics.
Aside from learning about human vision and the implications for perception, many of these statistical regularities
have transferred to computer graphics and adjacent areas:
several applications now successfully employ statistics of
natural images to help solve engineering problems. This generally takes two forms. Statistical regularities of a particular
ensemble can be used to ensure that the solution peoduced
by an algorithm will approach that ensemble.
Alternatively, statistical properties of single images are
c The Eurographics Association 2010.
computed and manipulated to achieve certain effects. As
an example, deblurring algorithms exist that rely on optimisation involving priors that help push candidate images
towards visually plausible solutions [SJA08]. Of course, visually plausible solutions are those that appear to humans
in some sense natural. Image priors have therefore been designed with the aid of specific knowledge of natural image
statistics, in this case regarding the average distribution of
image gradients.
We think that the use of natural image statistics has so far
shown great promise, with interesting applications making
use of some of the available statistical regularities. We also
see excellent scope for further future developments. In this
report, we review the current state of natural image statistics, and give an overview of how statistics have been used
in computer graphics, computational photography, visualisation and image processing so far.
Section 2 presents issues regarding image capture and
calibration and discusses possible correction mechanisms.
First order statistics, as well as their relevant findings and
applications are discussed in Section 3 while second and
higher order statistics are shown in Sections 4 and 5 respectively. Colour related statistics are presented separately in
Section 6. Finally, we summarise the findings in Section 7.
2. Data Collection and Calibration
The statistical analysis of natural images typically involves
the collection of large ensembles. Various image data sets are
already available but in order to create new ensembles, sev-
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
eral points need to be considered. One might need to create
new ensembles if the intended application involves a specific
class of images, or if the study is to understand human vision
under a specific viewing conditions.
Advances in camera technology have made the capture of
higher resolution and higher dynamic range imagery possible. Each such type of image presents its own advantages in
terms of information and its own specific issues regarding
capturing and calibration. The type of image appropriate for
each study largely depends on the type of information studied. In this report we will focus on the statistical analysis of
2-dimensional images and as such exclude from our discussion 3-dimensional data (such as range images or 3D scans)
or video sequences. More specifically, we will consider LDR
and HDR imagery. Issues relating to the collection of large
image ensembles will be discussed in Section 2.1.
Before analysing the collected images, it is prudent to remove artifacts and irregularities that are due to the camera,
lens or settings used to capture the scenes. These can include
noise, geometric distortions due to imperfections of the lens
or variations due to changes in illumination or even weather
conditions. The calibration process and pre-processing steps
necessary to account for such irregularities will be covered
in Section 2.2.
2.1. Image Capture
A number of issues need to be considered before capturing
images for an ensemble. The following largely depend on
the aims of the study:
• Type of imagery: For most purposes, low dynamic range
(LDR) imagery is sufficient, although HDR images can
capture a wider range of luminance values, making them
more appropriate for scenes of more extreme lighting
(such as night scenes where artificial light sources are
much brighter than the rest of the scene). Panoramas or
wider angle images can capture a larger part of the field
of view, or even the full scene in the case of spherical
panoramas.
• Number of images: Although generally, larger data sets
afford more accurate statistical findings, in the case of image statistics, the trade-off between number of images and
data collection time often needs to be resolved. The number of images appropriate for a particular study largely depends on its aims. Some example dataset sizes are given
in Table 1.
• Camera: Most digital cameras available on the market
offer high resolutions. However, single lens reflex (SLR)
cameras offer better image quality as well as a higher level
of control.
HDR Capture High dynamic range scenes can be captured
in a number of ways. The most commonly used approach
involves capturing a series of differently exposed versions
of the scene, known as exposure bracketing, that are then
merged into the HDR image. The merging process was first
proposed by Mann and Picard [MP95]. Several modern digital SLR cameras offer automated settings for bracketing,
greatly simplifying the process. The number of brackets offered typically ranges from 2 or 3 to a maximum of 9 in
higher range models of each brand. Table 2 shows the number of brackets and the maximum exposure step for a selection of cameras offering more than 3 brackets.
Recently, cameras capable of directly capturing HDR content have become available. Specialised cameras exist that
are capable of capturing 360◦ spherical images. Some examples are the SpheroCamHDR by Spheron VR, which uses
a single-line CCD and offers up to 26 f-stops of dynamic
range, and the Civetta by Weiss AG, which uses a full frame
sensor and a fish-eye lens to capture up to 30 f-stops of
dynamic range. A more detailed account of available HDR
cameras can be found elsewhere [RWPD05].
2.2. Post-processing and Calibration
When measuring the statistical regularities of an image ensemble, some care is necessary to ensure that irregularities
and artifacts due to the camera, lens or shooting conditions
do not affect the results. To avoid such an effect, the properties of the equipment used need to be accounted for. Moreover, to ensure the images are calibrated and therefore pixel
values between them are directly comparable, properties of
the scene need to be measured.
2.2.1. Lens Aberrations
The lens is responsible for focusing the light coming from
the scene to an image plane (which may be a sensor in digital
cameras or the film). For many applications, it is sufficient
to model the camera as a simple pinhole whereby no lens is
present and the aperture through which the light enters is a
point. To model the various effects of an optical system to the
image however, more complex lenses need to be considered.
Lenses can consist of a single element, in which case they
are referred to as simple lenses or multiple element, called
compound lenses. For a detailed description and model of
lenses the reader is referred to [Wil94,SHL95,RKAJ08]. Assuming Gaussian optics (lenses are arranged along a single
reference axis, and all light rays make only small angles φ
with this axis, so that sin(φ) ≈ φ and cos(φ) ≈ 1), the relationship between the focal length of a lens f , the distance
to an object s0 and the distance behind the lens where the
image is formed si is given by:
1
1
1
= +
f
s0 si
(1)
Imperfections in the design or construction of such lenses
can cause them to violate the assumption of Gaussian optics. These deviations are known as aberrations and can
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
Study
[ZL97]
[RCC98]
[vv96]
[HLM00]
[HM99c]
[DLAW]
Image Type
8 bit CCD Video Frames
Spectral Images
8 bit CCD Video Frames
Range Panoramic Images
Calibrated Grayscale Images (from [vv98])
HDR Spherical Panoramas
Number of Images
100
12 scenes x 43 images
276
205
4000
95 (Teller:2003) + 9 (Debevec)
Resolution
256 x 256
192 x 165
768 x 512
444 x 1440
1024 x 1536
various
Table 1: Dataset sizes and resolutions from various image statistics studies.
Camera Model
Canon 1D MKII / MKII N
Canon 1D MKIII
Canon 1Ds MKII
Canon 1Ds MKIII
Brackets
3, 5 or 7
2, 3, 5 or 7
2, 3, 5 or 7
2, 3, 5 or 7
EV Step
3
3
3
3
Casio EX-F1
3 or 5
2 or 1
Fuji S5 Pro
2 to 9
1
Kodak P712
Kodak P880
3 or 5
3 or 5
1
1
Konica Minolta 7D
3 or 5
0.5
Leica M9
5 or 7
2
Nikon D200
Nikon D300
Nikon D300s
Nikon D700
Nikon D2H
Nikon D2X/D2Xs
Nikon D3
Nikon Coolpix 5400
Nikon 5700
Nikon 8080
Nikon 8400
Nikon 8700
Nikon 8800
2 to 9
2 to 9
2 to 9
2 to 9
2 to 9
2 to 9
2 to 9
3 or 5
3 or 5
3 or 5
3 or 5
3 or 5
3 or 5
1
1
1
1
1
1
1
1
1
1
1
1
1
Camera Model
Olympus C5050 Zoom
Olympus C5060 Wide Zoom
Olympus C730UZ
Olympus C765UZ
Olympus C-8080
Olympus E-1
Olympus E-3
Olympus E-30
Olympus SP-310
Olympus SP-320
Olympus SP-350
Olympus SP-510UZ
Olympus SP-550UZ/SP-560UZ
Brackets
3 or 5
3 or 5
3 or 5
5
3 or 5
3 or 5
3 or 5
3 or 5
3 or 5
3 or 5
3 or 5
3 or 5
3 or 5
EV Step
1
1
1
1
1
1
1
1
1
1
1
1
1
Panasonic DMC-L1
Panasonic Lumix DMC-G1 / GH1
Panasonic Lumix DMC-GF1
Panasonic Lumix DMC-G1K
3 or 5
7
3, 5 or 7
3, 5 or 7
1
2/3
2/3
2/3
Pentax K7
Pentax K10D
Pentax K20D
3 or 5
3 or 5
3 or 5
2
2
2
Samsung GX-10
Samsung GX-20
3 or 5
3 or 5
2
2
Sony Alpha A-700
Sony DSC-R2
3 or 5
9
0.7
1
Table 2: The number of brackets and the maximum exposure step for cameras offering more than 3 brackets in Auto Exposure
Bracketing mode (AEB). (Adapted from [AEB10].)
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
tions. These include blurring aberrations and geometric distortions. Blurring aberrations can be further divided into
spherical aberration, coma, astigmatism and Petzval field
curvature.
Figure 1: Longitudinal (left) and lateral (right) chromatic aberrations are shown (adapted with permission from
[RKAJ08]).
be broadly categorised as chromatic, where the optical
system behaves differently for different wavelengths, and
monochromatic, where the distortions in the image are apparent in all channels.
Chromatic Aberrations Lens elements can have
wavelength-dependent indices of refraction, resulting
in different colours focusing in slightly different places.
This causes longitudinal or axial chromatic aberration,
which appears as chromatic image blur [BL09] (see
Figure 1). Typically this form of aberration is managed
optically, using a pair of thin lenses with different refractive
indices, known as achromatic doublets. Such systems
can reduce longitudinal aberration effects but not always
completely remove them.
Another type of chromatic aberration is lateral chromatic
aberration, which is the result of different wavelengths hitting the image plane at slightly different positions. This is
more commonly known as colour fringing and can be reduced digitally. This can typically be achieved by realigning
the colour channels of the image [MW07] which can compensate for the effect of the colour shifting to some extent,
or for a more accurate reconstruction, a more complete characterisation of the optical system can be utilised [Kan07].
One particular case of chromatic aberration specific to
digital cameras, especially ones using a charge coupled device sensor (CCD), is known as purple fringing. This effect
is caused by a combination of factors that operate in addition to the lens aberrations described earlier. Highlights in
the scene that are overly bright can cause some of the CCD
quantum-wells to flood, creating the visual effect of blooming. Additionally, a false colour effect can be created by the
demosaicing process [KP08].
A number of digital solutions are available for correcting
the effects of chromatic aberrations in images with various
levels of accuracy. DxO Optics handles chromatic aberrations by taking into account the characteristics of the lens
where possible [DxO]. A similar approach is also found in
PTLens [ePa]. Adobe Photoshop (version CS4 tested) provides simple sliders to account for red/cyan and blue/yellow
shifting.
Monochromatic Aberrations A wider range of artifacts
in images appears as the result of monochromatic aberra-
Spherical lenses are commonly used in optical systems
as they are relatively easy to manufacture, but their shape
is not ideal for the formation of a sharp image. Spherical
aberrations are the consequence of light rays further from
the lens axis (marginal) focusing at a different position than
rays passing through the middle of the lens. When marginal
rays focus nearer the lens than the principal rays, the effect is known as positive or undercorrected spherical aberration, while a marginal focus further away from the lens than
the paraxial focus causes negative or overcorrected spherical
aberration [van].
Comatic aberrations cause objects further from the optical
axis to appear distorted. Positive coma occurs when marginal
rays focus further away from the optical axis than principal
rays, while negative coma is the oposite effect, namely the
marginal rays focusing closer to the optical axis.
A third type of blurring aberration is known as astigmatism and is the result of off-axis rays of different orientations
focusing at slightly different positions. The meridional plane
for an off-axis object point is defined by that point and the
optical axis. The plane orthogonal to the meridional plane
is known as the saggital plane. Rays along these two planes
focus at different positions on the optical axis. As such, if
the lens is focused such that the meridional image is sharp,
blur will occur in the saggital direction and vice versa.
A final aberration considered here is a consequence of the
field curvature. The image and object planes are generally
considered to be planar. However, in the case of spherical
lenses they are in fact spherical, which is known as Petzval
field curvature. Regions of the image further from the optical
axis increasingly deviate from the planar assumption. This is
more evident when a planar image plane is used (such as a
sensor) in which case the image is less sharp towards the
periphery [RKAJ08].
The effect of blurring aberrations can be reduced in postprocessing by sharpening the image or during the image capture by using smaller aperture settings and thus only allowing rays closer to the optical axis to enter. Optical solutions
do exist and generally involve the use of lens doublets or
carefully designed lenses. These solutions however, complicate the design of the optical system and are as a result found
in more expensive lenses.
A different class of monochromatic aberrations consists
of radial distortions. These cause straight lines in the scene
to appear curved in the resulting image and can be corrected digitally either manually or using information about
the characteristics of the lens that was used. Radial distortions are the result of deviations from a linear projection
model and are more pronounced for wider angle lenses.
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
a) Spherical aberration
b) Coma
c) Astigmatism
Figure 2: Three of the four blurring aberrations.
plified even further if the lens characteristics are provided by
the manufacturer.
More complex solutions have also been proposed, combining image alignment with distortion parameter estimation [SK99, Ste97] or by recovering the radial distortion characteristics together with intrinsic camera parameters [CF05, Stu05, HK07].
Figure 3: This figure demonstrates the effect of barrel (middle) and pincushion (right) distortions on an image. In this
case, the distortions are exaggerated and overlaid with a
grid for demonstration purposes. The undistorted image is
shown left.
Although interactions between the various lens elements
of a complex optical system can produce some less well defined distortions, the two main types considered here are barrel and pincushion distortions, shown in Figure 3. Barrel distortion is the result of decreasing lateral magnification and it
causes straight lines to curve away from the image center.
Pincushion distortion occurs in lenses with increasing lateral magnification and causes straight lines to curve towards
the image center [WCH92].
Radial distortions can be adequately estimated as a fourth
degree polynomial:
x′ = x(1 + κ1 r2 + κ2 r4 )
′
2
4
y = y(1 + κ1 r + κ2 r )
(2)
(3)
where r2 = x2 + y2 , and κ1,2 are the radial distortion parameters [Sze10]. Several methods have been proposed in
the literature for estimating the distortion parameters of
Equation (3). In the simplest case, by photographing a test
scene containing straight lines, the distortion parameters can
be adjusted until the lines in the resulting image are also
straight [EMF03, BL09]. This process can of course be simc The Eurographics Association 2010.
Barrel and pincushion distortions are not the only possible geometric distortions. The interactions of the various
lens elements and their resulting distortions can result in effects that require more complex solutions. In these cases,
approaches using calibrated targets can still be of use.
Various software packages offer solutions that reduce or
remove the effect of radial distortions. Options such as DxO
Optics or PTLens use lens-specific information while other
solutions exist that require more manual input, such as for instance Hugin or the Lens Correction module in Adobe Photoshop.
Vignetting Vignetting is a common artifact appearing in
photographs, which causes areas of the image further from
the center to appear darker. Although this effect often goes
unnoticed and sometimes can even be added to images for
artistic purposes, the magnitude differences it causes can
be large. It has been found that for certain optical systems,
the amount of light reaching the corners of the image can
be several times less than the light going through the center [ZLK10]. Such differences may affect the results in statistical or computer vision applications.
Several factors contribute to vignetting [BL09]. Mechanical vignetting can be caused by lens extensions or accessories that physically block parts of the lens. Optical vignetting arises when rays arrive from an oblique angle. Parts
of the lens may be obstructed by the rim of the barrel, reducing the effective aperture and thus allowing less light to
enter the lens for these angles. Finally, a third cause of vignetting is known as natural vignetting and is the effect of
natural light falloff. The severity of this effect depends on the
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
angle θ that light rays exit the rear element and hit the image sensor and for simple lenses can be modeled as cos4 (θ)
falloff [GC05].
Both optical and natural vignetting can be reduced digitally after the image is captured. The simplest approaches
for estimating the vignetting for a given optical system involve capturing a photograph of a uniformly coloured and
illuminated surface [BL09,Sze10]. Using such an image, the
vignetting function can be computed and its effect can then
be removed from subsequent images taken using the same
system.
A limitation of this approach is that it requires such a calibration image to be captured with the same camera and lens
as the image(s) to be corrected [AAB96, KW00]. This may
not always be possible. To that end, information from a single image can be used to reduce the effects of vignetting.
This can be achieved by locating slow intensity variations in
the radial direction [ZLK10]. Alternatively, in cases where
multiple overlapping images are available (for instance when
capturing a panorama), corresponding pixels that appear in
more than one images can help recover the vignetting function [LS05, GC05].
2.2.2. Noise
Noise can be a major cause of image degradation and can
be defined as any random variation in brightness or colour
in the image caused by the sensor, circuitry or other parts of
the camera. Noise can have various sources, each requiring
different treatment. Some common types of image noise will
now be discussed.
Fixed Pattern Noise Fabrication errors can cause different
pixels of a sensor array to have different responses to constant illumination. This is known as fixed pattern noise and
appears in two forms. Dark signal non-uniformity (DSNU) is
the fixed offset from the average across each pixel of the image sensor, occurring even when no illumination is present.
Photo response non-uniformity (PRNU) is the variation in
the responsivity of pixels under constant illumination.
Fixed pattern noise can depend on camera settings and
conditions such as exposure time and temperature but can
be removed, either in the camera hardware or in postprocessing. In the latter case, DSNU can be removed simply by capturing and subtracting an image with no illumination (for instance by keeping the lens cap on) in otherwise
the same conditions as the images to be calibrated. To additionally remove noise due to PRNU under various illumination levels, an image of a flat, uniformly lit surface can
be captured with the illumination set such that it nearly saturates the sensor pixels. Then using the DSNU calibration
image described previously and the new image, the response
of each pixel can be linearly interpolated for a particular set
of conditions (temperature, camera settings, etc.) [Vis07].
Dark Current Even if no light reaches the sensor, electrons
can still reach the quantum-wells, generating dark current
shot noise. The severity of this type of noise is affected by
temperature and consequently can be reduced by cooling the
sensor [RKAJ08].
Photon Shot Noise The number of photons collected by
each individual quantum-well of an image sensor is not constant for a given intensity. Due to this randomness, a given
pixel can receive more or fewer photons than the average.
The distribution of photons arriving at the sensor can be
modeled by Poisson statistics (mean µ equals variance σ2 ).
As such, if the intensity of the source increases, so will the
variance in the signal. Since√
the signal-to-noise
ratio can be
√
defined as SNR = µ/σ = N/ N = N where N is the mean
number of photons hitting the image sensor, large values of
N cause the SNR to be large too, thus reducing the effect of
this type of noise.
Most of the noise sources discussed do not depend on illumination levels in the scenes. As such, in darker scenes,
the signal-to-noise ratio becomes smaller, effectively amplifying the effect of noise in the image. Although dark current
and fixed pattern noise can be minimised though the use of
appropriate calibration, other noise sources (including photon shot noise) cause less predictable patterns. The reader is
referred to the work by Healey and Kondepudy [HK94] for a
detailed account on noise estimation and camera calibration.
2.2.3. Radiometric Calibration
In order to map pixel values to real-world radiometric quantities, the response function of the camera, which is the mapping between sensor illuminances to pixel values, needs to
be known. Although this information is not generally available, the non-linearities for a particular camera can be recovered through the use of calibrated input.
The relationship between scene radiance and pixel values can be recovered in two ways and the process is known
as camera characterisation. Specialised equipment can be
used, such as for instance a monochromator and a radiance meter, to measure the response of the camera to particular wavelengths [RKAJ08]. More commonly, a specific
colour target (such as the GretagMacbeth ColorChecker
Chart) can be used in conjunction with a spectrophotometer to match XYZ values to the RGB values captured by the
camera [Fai07].
Several approaches exist for finding the transformation
between the measured XYZ values and the captured RGB
values. Look-up tables store the data of such measured/captured pairs. If a large enough number of samples is collected so that the camera gamut is densely populated, then
XYZ values corresponding to RGB samples not in the lookup table can be estimated using three-dimensional interpolation [RKAJ08]. Alternatively, if the number of samples
available is not sufficient for interpolation, a function can
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
be fitted to the available data. This is however much more
computationally expensive.
2.2.4. HDR Merging
Greg Ward’s Photosphere [Any], which is also an excellent
viewer for HDR image collections.
3. First Order Statistics
If the accurate representation of the scene illumination is
necessary, a single exposure may not be sufficient to cover
the full dynamic range in a scene. A series of exposures can
be captured as described in Section 2.1, which can then be
merged into a single high dynamic range image. Each individual exposure follows the non-linearities of the camera.
As such, before they can be merged into an HDR image, the
different exposures need to be linearised using the response
curve of the camera. The techniques described above could
be used to recover this response curve.
However, a sequence of differently exposed images can
be used directly to recover the response curve. A commonly
used method is proposed by Debevec and Malik [DM97].
By exploiting a property of both physical film and digital
sensors known as reciprocity (meaning that halving the irradiance hitting the sensor E and simultaneously doubling
the exposure time ∆t will result in the same pixel values
p), a linear system is derived. The solution to that is the
inverse function g−1 to the camera response curve. Other
methods for recovering the response curve of a camera are
known [MP95, MN99, RBS03].
After the response curve is obtained, the image irradiance
En for a given pixel of an exposure n can be computed given
the exposure time ∆tn and the value of that pixel pn as follows:
En =
g−1 (pn )
∆tn
(4)
If a sufficient number of exposures is collected, each part of
the scene should be correctly exposed in at least one of the
images. The irradiance value for a given pixel p in the HDR
image can then be computed as a weighted average of the
corresponding pixels of each exposure.
Several weighting functions have been proposed in the literature. Mann and Picard [MP95] merge the exposures using
certainty functions computed from the derivatives of the response curves for each differently exposed image. A simpler approach is used in [DM97] where a hat function suppressing values that are likely to be under- or over-exposed
is applied. Finally, inspired by signal theory, Mitsunaga and
Nayar [MN99] propose a weighting function that takes into
account the signal-to-noise ratio. Despite their differences,
any of these approaches will yield satisfactory results.
Many software packages are capable of merging exposures into an HDR image. Photoshop offers a ’Merge to
HDR’ function since its CS2 version that allows some manual control over the image alignment. Other packages include HDRSoft’s Photomatix [HDR] and HDRShop [Deb],
both of which offer batch processing functionality. Finally,
a free option that is only available for Apple computers is
c The Eurographics Association 2010.
The simplest regularities in images that can be explored
through statistical analysis are properties of single pixels.
The distribution of pixel values in images is typically represented with histograms. For a given image I, we define its
histogram H with B bins of width V as follows:
H = {(h(1), v(1)), ..., (h(B), v(B))}
max(I) − min(I)
B =
V
(5)
(6)
N
h(i) =
∑ P(I(p), i),
p=1
i ∈ [1, B]
v(i) = min(I) + (i − 1)V
1 i = I(p) − min(I) + 1
V
P(I(p), i) =
0 otherwise
(7)
(8)
(9)
where H is the set of all pairs (h(i), v(i)) for all i ∈ [1, B]
corresponding to the number of elements and value of the
ith bin of the histogram. I(p) is the value of the pth pixel
of image I which contains a total of N pixels and P(I(p), i)
represents the probability of a pixel I(p) belonging to a bin
i.
First order statistics do not capture relationships between
pixels and as such are not suitable for studying spatial regularities within images. Figure 4 shows an example of two
very different images that would result in identical first order statistics. The right image is constructed by randomly
permuting the pixels of the left image, yet it appears unnatural. Statistical analysis of the distribution of pixel intensities
can however lead to interesting insights.
Statistical moments are commonly employed to quantitatively describe the shape of a distribution. The kth moment
of a distribution can be computed as follows:
I(p) − c)k
N
p=1
N
mk =
∑
(10)
where c can be any constant. Generally, if c = 0 then the
above equation computes the raw moments of the distribution, while setting c = µ gives us the central moments (i.e.
centered at the mean). The first raw moment is the mean µ
of the distribution and the second is the variance σ2 .
The meaning of further moments is less straight forward
but the skewness S and kurtosis κ of a distribution relate to
the third and fourth moments respectively. More specifically,
the skewness and the kurtosis are defined as:
m
(11)
S = 33
σ
m
κ = 44
(12)
σ
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
Figure 4: The right image was created by randomly permuting the pixels of the image on the left, resulting in identical first
order statistics.
respectively, where m3 and m4 are the third and fourth central moments respectively.
A less skewed distribution was found by Brady and
Field [BF00] in their analysis of 46 logarithmically transformed natural images, while the linear images resulted in
a distribution skewed towards darker values. Although no
exact values were provided for the moments of the distributions, Figure 5 shows the resulting histograms for both cases.
As can be seen from the above examples, although in both
cases natural images were used and analysed, the results
vary. Generally, the distribution of log intensities for natural images does not deviate far from symmetric. Results do
depend however on the choice of images.
One particular case of interest was explored by
Dror et al. [DLAW] where high dynamic range spherical panoramas were analysed. Two separate datasets
were used consisting of 95 images taken from the
Teller database (http://city.lcs.mit.edu/data) [TAB∗ 03]
and 9 from Debevec’s Light Probe Image Gallery
(http://www.debevec.org/Probes/) [DHT∗ 00]. Figure 6
shows the resulting log intensity histograms for the two
datasets. The values found for the standard deviation,
Probability
Huang and Mumford analysed more than 4000 greyscale
images of natural scenes (taken from the database created
by J. H. van Hateren [vv98]) by computing central moments
of the log histograms [HM99c]. The values found for the
mean, standard deviation, skewness and kurtosis were respectively µ = 0, σ = 0.79, S = 0.22 and κ = 4.56. The value
of the skewness shows that the distribution is not symmetrical, which can be attributed, at least partly, to the presence
of sky in many of the images, resulting in a bias towards
higher intensities. In addition to that, the values of both the
skewness and the kurtosis show that the distribution is nonGaussian.
Linear
Log
Pixel Intensity
Figure 5: The linear and log intensity histograms found
by Brady et al. from their analysis of 46 natural images.
(Adapted from [BF00].)
skewness and kurtosis of the distributions respectively for
the Teller dataset were σ = 1.04, S = 0.02, κ = 4.04 and for
the Debevec dataset σ = 1.32, S = 0.36, κ = 12.49.
The shape of a histogram can also be associated with
particular qualities of the depicted surface. Motoyoshi et
al. [MNSA07] studied the perceived surface qualities of various materials and analysed their relation to the associated
histograms. They found that materials that are perceived as
glossier tend to have a positively skewed distribution while
matte surfaces result in a negatively skewed distribution.
The correlation found between the surface properties and the
skewness of the histograms is shown in Figure 7.
3.1. Histogram Adjustments
Despite their simplicity, first order statistics have now
found several applications in image processing. Studies have
shown correlations between first order statistical regularities
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
Debevec dataset
Probability
Teller dataset
Log Luminance
Log Luminance
Figure 6: The log luminance histograms found by Dror et
al. for the Teller (left) and the Debevec (right) datasets.
(Adapted from [DLAW].)
1
–2
0
1
10
100
Diffuse reflectance (%)
Skewness
2
4
3
2
0
–2
0.1
1
1
10
Specular intensity
Glossiness rating
3
0
3
4
Lightness rating
Skewness
3
0
100
Figure 7: As the diffuse reflectance of the surface increases,
the lightness rating as perceived by the observers also increases while the corresponding histogram skewness decreases(left). An increase in specular intensity also results
in an increased rating of glossiness as well as higher skewness value (right). (Adapted from [MNSA07].)
Figure 8: Histogram equalisation can increase the contrast
of a low contrast image (top left) by reshaping the intensity
distribution more equally. The equalised resulting image is
shown at the bottom left.
are computed:
B
Cs (i) =
∑ hs (i)
(13)
∑ ht (i)
(14)
i=1
B
Ct (i) =
i=1
I(p) − min(I) + 1
(15)
I0 (p) = vt Ct−1 Cs
V
in images and properties of the illuminant [RKAJ08,APS98]
which has proved useful in areas such as white balancing.
Moreover, transferring statistical moments between images
in appropriate colour spaces has been demonstrated in what
is now known as colour transfer [RAGS01]. These colourrelated application are discussed in detail in Section 6.
where a cumulative histogram C is defined as a function
mapping a bin index to a cumulative count. The inverse function C−1 acts as a reverse lookup on the histogram, returning
the bin index corresponding to a given count. An example of
this technique applied on a source-target pair of colour images is shown in Figure 9.
First order statistics can also be computed on a single image basis. By manipulating the distribution of values of a
single image, a variety of effects can be achieved. For instance, the contrast of an image that only covers a small portion of the available range of intensities can be increased by
adjusting the pixel values such that the full range of intensities is more equally represented. This is known as histogram
equalisation and an example can be seen in Figure 8.
Due to their limitations to single pixel properties, first
order statistics are the simplest to compute and interpret.
To further explore the relations between pixels, their spatial
configurations and their associations with aspects of human
vision more complex transforms are necessary. Second and
higher order statistics as well as their applications in computer graphics, image processing and other relevant areas are
presented in Sections 4 and 5 respectively.
A more general version of histogram equalisation is
known as histogram matching and it allows the histogram
of a source image (Is ) to be matched to that of a given target
(It ). First, the cumulative histograms of the source and target
c The Eurographics Association 2010.
4. Second Order Statistics
While the first order statistics of an image provide considerable information, their focus on individual pixels restricts
log10 Power
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
−2
−3
Slope −1.88
−4
−5
−6
−7
L
a
b
Source
−8
−9
Target
Result
Figure 9: The source image is matched to the target using
histogram matching in the CIELab color space. The resulting image as well as the histograms for all three are shown.
them to the “unstructured” aspects of the image – equivalent to an investigation of individual rays of light. As James
J. Gibson [Gib79] famously pointed out, since the interaction of light rays with an object systematic alters light according to the surface properties of the object, the pattern
of change across a set of light rays – which he referred to
as the structured optic array – provides direct, reliable information about the object. As is demonstrated in Figure 4,
this higher-order structure is of central interest to the human
visual system. It should not be surprising, then, that considerable attention has been directed towards investigating
second and even higher order statistics. It has been shown,
for example, that people can discriminate almost without effort between images that differ in the second order statistics [JC79,Cae81]. This is the basis for the well-known “popout” phenomenon.
The first and most obvious way to look at image structure
is to examine the relationship between pairs of pixels. This
is the core of second order statistics. There are two main
second order statistics: The power spectrum (4.1) and gradients (4.2). Note that image contrast (e.g., the standard deviation of all pixel intensities divided by the mean intensity) is
also often calculated. Image contrast can be derived from the
power spectrum [van98b], and will not be discussed here.
4.1. Power Spectra
Images contain information at different spatial scales, from
the large (e.g., the presence of a tree trunk on the right in
Figure 4) to very small (e.g., the pattern of color change on
the bird’s feathers). As is well known from Fourier analysis,
sharp edges, such as at the edges of the tree trunk, can be
0
1
2
Spatial Frequency (log10 cycles/image)
Figure 10: The average power spectra for 133 natural images. Reused by permission from [RSAT04]
described by a weighted sum of sinusoids, with higher frequencies being weighted less. In other words, natural images
contains information at many spatial frequencies. An examination of the relative power of the different spatial frequencies reveals several interesting trends, which are so prominent, that most work on image statistics provides an analysis
of the power spectrum.
The power spectrum S(u, v) of an N x N image is given
by:
S(u, v) =
|F(u, v)|2
N2
(16)
where F is the Fourier transform of the image, and (u, v)
are pixel indices in Fourier space. Two-dimensional frequencies can also be represented with polar coordinates ( f , θ),
using u = f cos(θ) and v = f sin(θ). While the sprectra of individual images varies considerably (which may play a role
in the rapid detection of certain scenes or objects, [BCT08]),
when the averaging over a sufficiently large number of images (and across orientation), a clear pattern arises: The
lower frequencies contain the most power, with power decreasing as a function of frequency. In fact, on a log-log
scale, amplitude as function of frequency lies approximately
on a straight line. That is, the averaged spectrum tends to follow a power law, which can be well modeled with P = 1/ f β ,
where P is the power as function of frequency f , and β is
spectral slope (see, e.g., Figure 10). Increasingly, the spectral slope is being used as a low dimensional descriptor of
texture [BCHT01, CG87, KP88, RV90, TSW∗ 05]. Perceptually, the primary effect of increasing the spectral slope is to
increase the coarseness of the texture [BCHT01](see Figure
11). A number of studies have reported values for the spectral slope for different image ensembles. While the particulars of the image ensembles vary considerably, they tend
to focus on natural scenes (which usually means simply
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
Study
[BM87]
[DA95]
* [DWA]+
[Fie87]
[Fie93]
[FB97]
[van92]
[HM99a]
[PBT98]
[RST01]
[RB94, Rud94]
[vv96]
[TF97]
[TTC92]
[TO03]
[WM97]
Figure 11: Static, random phase patches produced by 1/ f β
spatial-frequency filtering of random white noise. From top
to bottom, the values of the spectral slope are 0.8, 1.6, 2,4,
and 3.2 for the top left, top right, bottom left, and bottom
right respectively.
eliminating carpentered environments). The average spectral slope varies from 1.8 to 2.4, with most values clustering around 2.0 (see Table 3) [BM87, DA95, DWA, Fie87,
Fie93,FB97,van92,HLM00,PBT98,RST01,RSAT04,RB94,
Rud94, vv96, TF97, TTC92, TO03, WM97] One of the most
prominent facets of such a power law is that these images
tend to be scale invariant. That is, they share features at every scale, which is also a hallmark of fractal images (note
that a simple linear transform of the spectral slope yields the
image’s fractal dimension [CG87]). In practice, this means
that we should be able to zoom in and out of an image, or
travel through a natural scene, and expect that the statistics
will remain roughly constant. It is also worth noting that,
according to the the Wiener-Kintchine theorem, the power
spectrum and the auto-correlation function form a Fourier
transform pair. This means that the spectral slope of image
ensembles can be interpreted as describing relations between
pairs of pixels. Intuitively, the means that since since the surface of a given object tends to be rather homogenous, it is
expected that neighboring pixels will be similar and that the
farther apart two pixels are, the less similar they will be.
The 1/f property of images seems to arise from several
sources. Edges, for example, show 1/f spectra [SMS78].
Likewise, foliage and landscapes tend to exhibit fractal properties [Man83]. The clustering of independent objects is also
such that the distribution of sizes in many scenes also tends
to follow a power law [FB97, Rud97].
c The Eurographics Association 2010.
# of images
19
320
95
6
85
20
117
216
29
133
45
276
82
135
12,000
48
β ±1sd
2.1±.24
2.30
2.29
2.0
2.20
2.20±0.28
2.13±0.36
1.96
2.22±0.26
1.88 ±0.42
1.81
1.88±0.42
2.38
2.4.±0.26
2.08
2.26
Table 3: Spectral slopes for natural image ensembles. *Note
that [DWA]’s work was on high dynamic range illumination
maps.
There is also some emerging evidence that the slope of
the power spectrum is distinct for different scenes or objects
[HM99a, TO03, WM97]. For example, Huang and Mumford
[HM99a] examined 216 images which had been painstakingly segmented into pixels representing 11 different categories and found that although the image ensemble had an
average slope of 1.96, there were systematic differences in
the slopes across categories. Specifically, the slopes were
2.3, 1.8, 1.4, and 1.0 for man-made, vegetation, road, and
sky pixels, respectively. Likewise, Webster and Miyahara
[WM97] analyzed the power spectra and rms-contrast of 48
natural scenes. They found significant differences in both
spectral slope and contrast across the three scene types (2.15,
2.23, and 2.4 for the forest, close-up, and distant meadow
scenes, respectively). Furthermore, when the power spectra are not averaged over all orientations, it is clear that
there is some variation as a function of angle: Natural images tend to concentrate their power in horizontal and vertical angles [BAM∗ 97, OT01, Rud97, SMS78, vv96]. For example, in examining over 6000 man-made and 6000 natural scenes, [TO03] found that the slope varied as a function of both scene type and orientation (with slopes of 1.98,
2.02, and 2.22 for horizontal, oblique, and vertical angles
in natural scenes and 1.83, 2.37, and 2.07 for man-made
scenes). Thus, the spectral slope may be be useful in object
or scene discrimination [BCT08]. It is critical to mention
that if two images have similar spectral slopes, then swapping their power spectra will not affect recognition as long
as phase information is preserved [TTC92, TT93].
It has also been shown that the pattern of change over
time also follows a power law. That is, if the contrast mod-
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
ulation over time for a given pixel is examined, the power
spectra can also be modeled with P = 1/ f α , where P is the
power as function of frequency f , and α is temporal spectral
slope [BGK01,DA95,EBW92]. Temporal spectral slopes between 1.2 and 2.0 have been reported for natural image sequences. The temporal spectral slope relates, perceptually, to
the apparent speed and jitter of the scene and to some degree
with the apparent "purposefulness" of the motion (the higher
the slope is, the more persistent the motion will be).
flects an attenuation of the higher spatial frequencies. Billock suggested that the higher slopes for rapidly flickering
images may represent motion deblurring.
4.1.1. Human perception
4.1.2. Fractal Forgeries
There is an extremely large body of work examining the relationship between natural image statistics and human perception. At the simplest level, our ability to discriminate two
random phase textures based solely on changes in the spectral slope has been examined [BCHT01, BCT08, KFK90,
TT94]. Humans are most sensitive to slopes around 2.8 to
3.2, which would represent an image with much less high
spatial frequency content than natural images. There is some
evidence (albeit controversial) for a second minima near 1.2
Rainville and Kingdome examined the ability to detect symmetry for white noise images with different spectral slopes
and found that one participant was bets for slopes near 2.8,
consistent with the image discrimination data. The other
participant was best for slopes between 1 and 2, consistent with the potential second minima. Regardless, it is clear
that humans are not maximally sensitive to changes in spectral slopes representing natural images. The reasons for this
are still unclear, although several hypotheses have been forwarded including that the shift from 2 to 2.8 reflects blur
perception [Bil00].
Thanks in large part to the seminal work of Mandelbrot
[Man83], many areas of computer graphics use fractals to
synthesize textures, surfaces, objects, or even whole scenes
(see, e.g., [DCCH05,DL05,PS88]). A subset of this work focuses specifically on fractal brownian motion in general and
and fractal brownian textures in specific which bear striking
resemblance to real surfaces and textures. Since the fractal
dimension of such a fractal texture is a linear transform of
the spectral slope, these works are essentially relying on the
regularities in power spectra. Many of these techniques either explicitly or implicitly choose parameters so that the
spectral slope will be similar to natural images. Perhaps the
most famous of these synthetic scenes are the eerily familiar landscapes produced by in Voss’s “fractal forgeries” of
[Vos85].
Instead of attempting to determine the tuning of the visual system by measuring discrimination, one can approach
the problem more indirectly: one can estimate the sensitivity of the spatial perception mechanisms from an autocorrelation analysis of contrast sensitivity function (which
has been referred to as a modulation transfer function of the
human visual system). It is generally accepted that human
spatial perception is mediated by several, partially overlapping spatial frequency channels (at least 7 at each orientation: see, e.g., [Bil00]). Since similar frequencies are likely
to be processed by the same channel, the sensitivity to similar frequencies should be similar. The less similar the frequencies are, the less correlated their sensitivity thresholds
should be [BH96,OSS83,PWK95,SWO84]. Billock [Bil00],
examined the correlation between contrast sensitivity thresholds as a function of the spatial frequency separation, and
found that (for up to 5 octaves) the correlation functions
were power laws with slopes ranging from 2.1 to 2.4. This
held not only for static stimuli, but also for slowly flickering
stimuli (up to 1 Hz). These slopes are much more in line with
the slopes found in natural images, suggesting that human
spatial frequency mechanisms may be optomized for natural images. Interestingly, more rapid flicker yielded higher
slopes (around 2.6). As mentioned above, higher slopes re-
In contrast, the discrimination of temporal spectral slopes
appears to be more straight forward. Humans are most sensitive to differences in temporal spectral slope for slopes
between 1.8 and 2.0, which is very similar to the range of
slopes in natural image sequences.
In the class of landscape synthesis, a white noise field is
filtered in Fourier space with the desired spectral slope to
produced the terrain texture of field [PS88]. In such methods,
the role of natural image spectra is obvious. In a second class
of approaches, the midpoint displacement method [FFC82],
the role of power spectra is perhaps less obvious. In this
approach, terrain is essentially generated by iteratively decreasing the spatial scale and (random) displacement. In general, the procedure is capable of producing a wide range of
results, some of which look realistic and others which do not.
A closer examination of the method shows that changing the
size of the displacement decrement effectively allows one to
control the coarseness of the texture. Since the coarseness
of a texture is directly related to its spectral slope, it should
be possible to automatically determine which displacement
decrements produce natural terrain. The results of a simple
perceptual study support this, showing that people felt that
synthetic terrains that had a spectral slope of near 1.8 appeared to be the most natural [RSAT04].
Perhaps the second most common application of power
spectra, again through the use of fractal dimensions, is the
automatic generation of plants. Considerable literature has
been written on plant synthesis (see, e.g., [DL05] for a review). Overall, it is is well known that plants tend to exhibit
a strong degree of self-similarity, and many descriptive systems have been developed to taker advantage of this characteristic. The most prominent of these are L-systems. Just
as with fractal terrains, the system allows one to generate a
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
wide variety or results, some of which are perceptually plausible and some of which are not.
4.1.3. Image Processing and Categorization
While a power law description clearly captures the regularities in large collections of image, individual images tend
not to be 1/f. It has been suggested that differences in the
spectral slope between parts of an image allow people to
rapidly make some simple discriminations (e.g., “pop-out”
effect, see [BCT08, Cae81, JC79]). Others have speculated
on the evolutionary advantage of being able to detect spectral slope [Bil00,BCT08,CHJ78,HB96,RV90]. Just as knowing about the statistics of natural images in generasl can inform us about how the human visual system works, and how
we might build more efficient computer vision and computer graphics algorithms, so too will an understanding of
the cause of variations in the statics provide insights.
A number of potential sources for 1/f patterns and
their variations have been identified [Man83, FB97, HM99a,
Rud97, SMS78, TO03, WM97]. For example, [HM99a] and
[WM97] found different average spectral slopes for different scene categories (both within as well as between images): at the very least, there seem to be differences between man-made, general vegetation, forest, meadow, road,
and sky elements. It has also been shown that underwater
scenes have different spectral slopes [BG03]. To help further
distinguish between object or scene categories, one can look
at the interaction between power spectra and other characteristics [BAM∗ 97, OT01, Rud97, SMS78, vv96]. For example,
[OT01] suggested that a “spatial envelope” of a scene can
be constructed from the interaction between power spectra
and orientation combined with some information from PCA.
This envelope yields perceptual dimensions such as naturalness, roughness, expansion, and similar categories tend to
cluster together in this scene space. In [TO03], the approach
was extended to estimate absolute depth using the relationship between power spectra and orientation and some information from wavelets.
In a related line of work, Dror and colleagues used a variety of natural image statiscs to estimate the reflectance of
an object (e.g., metal versus plastic) under conditions where
the illumination is unknown [DAW01]. They employ a wide
range of image statistics from simple intensity distributions
through oriented power spectra to wavelet coefficient distributions.
It has been suggested that the differences between the average spectral slope of 2.0 and the peak of human sensitivity
to changes in slope (at 2.8) is due to deblurring. Furthermore, it has been suggested that the higher slopes for an autocorrelation analysis of human contrast sensitivity is due to
motion deblurring. In an indirect examination of this claim,
Dror and Colleagues examined the properties of Reichardt
correlator motion detections [DOL00, DOL01]. While there
is considerable evidence that the human visual system uses
c The Eurographics Association 2010.
such correlators for the low-level detection of motion, it has
also been shown using typical moving gratings that they signal temporal frequency and not velocity. Dror demonstrated
that when stimuli that have natural image statics are used,
the response properties of Reichardt detectors is better correlated with velocity and suggest that they make a much better
basis for the synthetic processing of motion than previously
assumed.
4.2. Gradients
Perhaps the simplest way to examine information contained
in more than a single pixel, however, is to look at the relationship between pairs of pixels. This is a discrete approximation of the first derivative or gradient of the intensity function of an image. There are three different discrete appoximations: forward/backward difference, central difference, a
soebel operator. The most straightforward is the forward difference method. Here, the gradient at a pixel (i,j) is calculated from the difference between it and the next pixel forwards:
Dx(i, j) = ln(I(i + 1, j) − ln(I(i, j)))
(17)
Dy(i, j) = ln(I(i, j + 1)) − ln(I(i, j))
(18)
Dx(i, j)
Dy(i, j)
q
D(i, j) =
(Dx(i, j)2 + Dy(i, j)2 )
(19)
where I(i,j) is the luminance for pixel i, j, Dx is the horizontal gradient and Dy the vertical gradient. An obvious variant
on this, referred to as the the backward difference method,
is to use the previous pixel (e.g., I(i-1,j) rather than the next
one. Ib both cases, it is common to calculate the mean gradient magnitude at a given location from the horizontal and
vertical components:
D(i, j) =
(20)
although one may choose to keep the vertical and horizontal gradients separate (see, for example, [Lev07]). An example of the gradient distribution for an image ensemble can be
seen in Figure 12.
A slightly more robust method is to calculate the central
differences:
Dx(i, j) = 1/2(ln(I(i − 1, j)) − ln(I(i + 1, j))) (21)
Dy(i, j) = 1/2(ln(I(i, j − 1)) − ln(I(i, j + 1))) (22)
Finally, one can use a Sobel operator, which also looks at the
diagonal pixels:
1
0 −1
0 −2
Sx = 2
(23)
1
0 −1
1
2
1
0
0
(24)
Sy = 0
−1 −2 −1
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
600
Density
the sharp image’s power spectra follows a power law distribution, while [JBFZ02, NCB04] use an interaction between
power spectra and wavelets. Fergus et al. [FSH∗ 06] estimate
the blur kernel by optimizing for the gradient distributions
using priors derived from natural images.
Dx
Dy
Dxx
Dyy
400
200
0
-2
-1
0
1
Log Luminance Gradients
2
Figure 12: Gradient distributions for a collection of natural
images. Dx and Dy are the horizontal and vertical gradients
(first derivative of the intensity function), respectively. Dxx
and Dyy are the horizontal and vertical second derivatives.
Just as was the case for power spectra, image gradients
have been extensively investigated (see, e.g., [RB94, Rud94,
HM99a, HLM00, DLAW, DWA]). It has been repeatedly
found that the gradient histogram has a very sharp peak at
zero (see Figure 12) and then falls off very rapidly (with
long tails at the higher gradients). The distribution can be
α
modeled as e−x with α < 1 [LZW02, Sim05]. The reason
for the specific shape of the gradient distribution seems to
be that images contain many, large surfaces which tend to
be smooth and somewhat homogenous (so that the gradients tend to be near zero) along with a few high contrast
edges (which will yield very high gradients) [BG00]. An object often finds itself over a consistent background, which
means that the transition from the object’s surface to the
background will be similar all along the object’s silhouette.
This is reflected in the symmetry of the gradient distribution
around the central peak.
4.2.1. Deblurring
When taking a photograph of a scene, it is not uncommon
that either the camera or an object in the scene moves. The
longer the aperture is open, the more likely this is to be the
case. As a result, all or part of the image will be blurred. A
number of approaches for sharpening an image have been
proposed. One type of approach, blind motion deconvolution, essentially treats the photograph as the result of a convolution between an unknown sharp image and an equally
unknown blurring kernel. The goal is to estimate the blur
kernel so that it can be deconvolved with the blurred image
to yield the sharp image. Naturally this is an underspecified
problem, so additional constraints are needed and recently, a
number of researchers have employed natural image statistics to provide them. For example, [CNR02] assume that
All of these approaches assume camera motion. That is,
that there is a single blur kernel for the entire image. In an
alternate approach, Levin [Lev07] use the gradient structure
to find those pixels that are blurred and segment them from
the rest of the image. Additionally, rather than use statistics
of sharp images, of the blurring process. Specifically, they
model how the the gradient distribution changes as a function of blurring to discovering the blurring kernel for a given
image. One primary features of motion blurring is the attenuation of higher frequencies. This shows in the slope of
the power spectra (by increasing β) as well as in gradient
histogram (e.g., in particular by removing the longer tails at
the higher gradients). Levin attempts to recover the blur kernel by applying different blurs to the image to find the one
that produces a gradient histogram that matches the blurred
one. This requires a non-blurred image or image region. Using a completely different image tends not to produce reliable results (due to differences in the underlying gradient
histogram). Since much terrestrial motion tends to be horizontal, the majority of the effects of motion blurring are also
horizontal. Thus, the vertical gradients can under some circumstances be used to calculate the blur of the horizontal
components.
In a further refinement, [SJA08] employ an iterative alternation between optimizing the blur kernel and the sharp
image using priors from natural image statistics. super resolution
In related work, [MFTF03] use the gradient distribution
to fill in the holes produced when a high-resolution image is
produced from a low resolution image (a process called super resolution). They also apply the same procedure to yield
a full color image from a achromatic image.
Likewise, there are many approaches for removing the
noise in an image. Recently, several researchers have had
considerable success using gradient distributions (and occasionally other natural image statictics) to denoise images
[RB05, RB09, Sim05].
4.2.2. Image Inpainting
One image processing application area where image statistics have been used successfully is known as image inpainting. In order to remove an unwanted object from an image,
the texture that would otherwise be behind that object need
to be re-created, a task that can be particularly difficult if the
holes are large.
Among the various approaches, several have used image
statistics to find optimal replacements for the missing texture and structure, usually by assuming that a patch similar to the missing portion can be found somewhere in the
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
remaining image. For example, [HT96], use a combination
of spectral and spatial information to find the replacement
patch. [LZW03] select the missing section based on the gradients at the boundary region as well as to maximize the
match the global gradient histogram. [SJZW] fill in the missing region by completing the gradient maps and then reconstructing the image by solving a Poisson equation.
5. Higher Order Statistics
Whereas second order statistics take into account pairs of
pixels through various transformations (in particular gradients and the amplitude spectrum), it is possible to find statistical regularities in larger groups of pixels using suitable
image manipulations. In particular, gradient statistics can be
extended in a hierarchical fashion to consider blocks of pixels by means of wavelet transforms. Further, it can be shown
that much of the structure of images is located in the phase
spectrum, as opposed to the power spectrum.
Principal Components Analysis (PCA) can be used to capture structure in images. It effectively allows the most important sources of commonality between images or within a single image to be assessed. Each principal component represents a dimension along which variability can be measured.
They are typically ordered so that the first principal component accounts for most of the data, whereas each subsequent
component accounts for progressively less of the structure in
the data. Each of the components are orthogonal with respect
to each other.
The orthogonality of the axes can be seen as a limitation.
A better, albeit computationally much more involved technique is Independent Components Analysis (ICA). Rather
than producing decorrelated axes that are orthogonal as
achieved with PCA, Independent Components Analysis
finds axes that are more or less independent, but which are
not necessarily orthogonal.
In the following sections, each of these four approaches
are outlined in more detail.
5.1. Phase Structure
It can be argued that although statistical regularities are
present in power spectra of natural images, much of the
perceptually relevant information is encoded in phase spectra [Tho99]. As an example, we have swapped the phase
spectrum of two images, shown in Figure 13. As can be
seen, much of the image structure has swapped. Thus, the
image structure is largely encoded into the phase spectrum.
Second order statistics such as the auto-correlation function
(and therefore power spectra in Fourier space) and variance
are insensitive to signal phase. They therefore do not adequately measure image structure.
We could try to measure higher moments such as skew
and kurtosis, but these would be valid only if the image
c The Eurographics Association 2010.
Figure 13: In this demonstration, the phases of the top two
images are swapped to produce the bottom to images. The
amplitude spectra are retained. Note that much of the image
structure is located in the phase spectrum, and has therefore
been swapped.
exhibits stationary statistics, i.e. each image region has the
same statistical properties. Natural images, however, are almost never stationary. As an example, this means that if we
have an image with two regions characterised by a Gaussian
distribution with means and standard deviations of (µ1 , σ1 )
and (µ2 , σ2 ) respectively, then each region will have zero
skewness and kurtosis [NP96]. However, the image itself
may still have non-zero skewness and kurtosis. In particular, when the means µ1 and µ2 are different, the skewness of
the entire image may be non-zero. If the variances σ1 and
σ2 are different, then the kurtosis will be non-zero. Here, the
kurtosis is effectively a measure of the variance of the variances [Bad96b].
As a consequence, it is not possible to analyse the phase
structure by computing higher-order measures. However, to
gain access to phase information without polluting the results with first and second order information, we can whiten
the images first [Tho99, Tho01]. This amounts to adjusting
the spectral slope to become flat. The auto-correlation func-
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
tion will therefore be zero everywhere, except at the origin. By removing the second moment from consideration,
it is now possible to compute skewness and kurtosis on the
whitened signal. The whitened skew Sw and whitened kurtosis κw are thus a measure of variations in the phase spectra.
The results of applying this procedure to a set of natural
images, leads to the conclusion that the whitened images are
almost always positively skewed and are always positively
kurtosed. In contrast, if the phase spectrum is randomised
on the same images, the whitened skewness and kurtosis are
close to zero.
While positive skewness and kurtosis of phase spectra
points in the direction of the presence of statistical regularities in the phase spectrum, these results are relatively weak
and do not easily explain aspects of human vision. Furthermore, they do not appear to have found employ in any graphics related applications that we are aware of (although it may
lead to higher-order constraints in texture synthesis algorithms. In the following section, however, we will discuss
how extending the present analysis to be localised in space,
leads to further and stronger insights.
5.2. Wavelets
In the preceding section, we have seen that second order
statistics, and in particular the amplitude spectrum of image
ensembles, show striking statistical regularities. The amplitude spectrum can be analysed for different orientations, but
as it is computed in Fourier space, it provides information of
whole images rather than regions of images.
In this section we study transforms that allow us to analyse images in local regions of space, as well as for different orientations and spatial frequencies. Such models and
transforms started with Gabor who introduced the notion
of time-frequency representations [Gab46]. It is noted that
such systems are in some sense similar to how images are
analysed by portions of the human visual system. Many
wavelet transforms are self-similar, consisting of spatially
localised band-pass basis functions which vary only by a
dilation, translation or rotation [Fie93]. They have found
many direct applications in image analysis and image processing. In particular, they have been successfully used for
the transmission of the chromatic component of HDTV signals [Wat90, Wat91], texture seggregation [BCG90], compact coding [Wat87,Dau88,ASF91] and in identification systems [Dau91].
Wavelet transforms using a variety of basis functions have
also been used to model the spatial response properties of
cells in the mammalian visual system [Fie93]. In particular,
Gabor functions — sinusoids weighted by a Gaussians —
are one of the more popular transforms. Nonetheless, these
particular bases are non-orthogonal, making them difficult to
invert and use in modelling [Fie93]. As a result, orthogonal
bases were developed [ASH87, WA89, ASF91].
Cells in the visual system respond to certain patterns of
light and dark, with each cell responding optimally to a specific pattern. It has been demonstrated that groups of cells
in the visual cortex of several different species of mammal
respond to wavelet-like structures, in particular the Gabor
function [Mar80]:
!
−x2 −y2
f (x, y) = sin (2πkx + Θ) exp
(25)
+
2 σ2x 2 σ2y
A Gabor filter and an image processed with it are shown in
Figure 14. As can be seen, the local oriented structure in the
image is analysed at the specific frequency determined by
the sinusoidal component of the filter.
It is possible to extend this description by making the
bandwidth proportional to the frequency, so that the variances σx , σy in the above equation are related to frequency
band k:
σx,y (k) = β k−α
(26)
This allows us to build a stack of filters, each level in the
stack tuned to a different set of frequencies, in proportion
to the position in the stack. Each level in the stack is also
known as the scale.
Field has shown that both phase and amplitude can be
measured and correlated across scales and location [Fie93].
Although the amplitude spectrum has shown scale invariance, the phase structure revealed by Fourier space analysis remains somewhat of a mystery. However, by analysing
wavelets, which are simultaneously localised both in space
and frequency, it can be shown that natural images are scale
in variant in both phase and amplitude.
This means that in local image regions, the location of
edges are correlated across scales. At higher frequencies
neighbours in frequency become more highly correlated,
whereas neighbours in space become less correlated. In other
words, phases in natural images contain alignment that extend over greater ranges at higher frequencies [Fie93]. Phase
structures tend to change somewhat in orientation as well as
position when moving across scales. Such behaviour is well
modeled with wavelets. It is therefore possible that this may
help explain why mammalian visual system use wavelet-like
structures in the visual cortex. It may well be that this is why
wavelets are useful in a range of applications, as outlined
above.
It was found that the distributions of histograms of
wavelet coefficients are highly non-Gaussian [Fie87,Mal89],
showing in particular long tails, i.e. having high kurtosis. In
particular, these histogram distributions can be modeled by
a generalised Laplacian [Mal89, SA96, Sim99b]:
P(x) =
exp (−|c/s| p )
1
2s
Γ
p
p
(27)
where Γ is the gamma function and the parameters s and p
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
Image filtered with with Gabor
filters with different wavelength
parameters:
Left to right: 16, 32 and 64.
Top: Filtered results.
Bottom: Gabor filter.
Figure 14: Examples of Gabor filters and an image processed with them.
can be used to fit the model to a specific image or image
ensemble.
If the bandwidths of the wavelet filters are chosen to
be around 1-2 octaves, that the kurtosis of the histograms
of wavelet transforms is maximal. This means that under
these conditions the resulting transform is maximally sparse.
In clusters of neurons, sparisity is an important feature. It
means that much of the variability in the input signal is explained by a small number of neurons [Fie94]. Such neuronal activity is metabolically efficient [Bad96a], minimises
wiring length [Fol95], and increases capacity in associative
memory [BMW88].
Similarly, in a wavelet transform, sparsity means that most
coefficients are small, with only a few coefficients having
larger values. This is for instance the basis for compression
algorithms: by replacing small coefficients with zero coefficients a lossy image compression algorithm can be constructed. This approach is used for instance in the JPEG2000
image compression standard [TM01]. Furthermore, waveletlike receptive fields are discovered when training a neural
network for sparseness [OF96].
Of course, we are still free to choose the basis functions. In many applications, for reasons of computability a
much simpler set of basis functions are used. In particular,
the Haar bases are used. In a Haar decomposition, an image is split into averages and differences with those averages. The averages are computed by a set of scaling funcc The Eurographics Association 2010.
tions [SDS95, SDS96]:
Φki = φ 2k x − i i = 0, . . . 2k − 1
(
1 for 0 ≤ x < 1
Φ(x) =
0 otherwise
(28)
(29)
The differences are computed with the mother wavelet function:
ψki = φ 2k x − i i = 0, . . . 2k − 1
(30)
1 for 0 ≤ x < 1/2
(31)
ψ(x) = −1 for 1/2 ≤ x < 1
0 otherwise
An example of a Haar wavelet decomposition with three
scales is shown in Figure 15.
5.2.1. Image Denoising
An example application of wavelet statistics is image denoising [Sim99a, PS00], where a Bayesian approach is employed to remove noise. Assuming that the wavelet coefficients of natural images follow a generalised Laplacian distribution, as outlined above, Gaussian image noise translates
into Gaussian polution of wavelet coefficients. Each coefficient can therefore be written as y = x +n, with x drawn from
the density given by Equation (27) [Sim99a]. The noise n is
Gaussian. A standard estimator for x given the corrupted im-
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
pendent. Although wavelet coefficients are on average zero,
showing that their values are decorrelated, they are not independent. Although beyond the scope of this report, it can be
shown that correlations between x and p can be accounted
for using a joint statistical model for the purpose of denoising [Sim99a], image compression [Sim97], as well as texture
synthesis [SP98].
5.2.2. Object Detection
An application area where wavelets have found extensive use
is object detection. An important problem in that area is finding an effective and compact way of characterising objects in
images. Some object categories, such as faces, show surprisingly little variation in terms of overall structure and colour,
other types of objects however can be more complex.
To overcome this issue, wavelets, and in particular Haar
wavelets, can be used to provide a standardised template for
characterising objects in images. Using a training scheme, a
small subset of Haar wavelets can be derived that can appropriately describe the structure of a particular class of images,
such as for instance faces or pedestrians. This set of basis
functions then forms the classifier that can be used to detect
the object in question in new image [OPS∗ 97, POP98].
Figure 15: Top: Original image. Bottom: An example of a
Haar wavelet decomposition with three scales. The top left
image in the decomposition represents averages, whereas
the remaining panels indicate differences in horizontal, vertical and diagonal directions.
age y is the maximum a posteriori (MAP) estimator:
x̂(y) = arg max P(x|y)
(32)
c
= arg max P(y|x) P(x)
(33)
= arg max Pn (y − x) P(x)
(34)
c
c
Thus, by applying Bayes rule, the estimator is rewritten as
the product of the Gaussian noise density Pn (y − x) and the
prior density of the signal coefficient P(x). If we assume that
the parameter p in Equation (27) is set to 2 the estimator
becomes linear:
x̂(y) =
σ2x
2
σx + σ2n
y
(35)
For other values of p the estimator is non-linear. For instance p = 0.5 and p = 1 lead to hard thresholding and softthresholding respectively. In television and video engineering such thresholding is known as coring [BP86].
This approach assumes that wavelet coefficients are inde-
More recently, a real-time face detection algorithm was
proposed by Viola and Jones [VJ01] where features are applied on the image in a cascade fashion, ensuring that more
complex classifiers are only used on instances that have
been already accepted by simpler ones. Although the features used in this work are similar to Haar wavelets, some
are more complex, as shown in Figure 16.
Image-based retrieval using the wavelet decomposition
of each image has also been demonstrated by Jacobs et
al. [JFS95]. An image querying metric is developed using
truncated, quantised versions of the wavelet decompositions
of the images. This forms a compact signature for each image that is fast to compute and captures sufficient amount of
information to allow searching.
5.3. Principal Components Analysis (PCA)
One of the main properties of natural images is that pixel
values are correlated. Nearby pixels tend to have similar values while these correlations tend to be less strong as the distance between pixels increases [HBS92]. One way to analyse
these correlations is by decomposing the image data to a set
of components that are as decorrelated as possible. Each of
these components will then capture some particular mode of
variation within the data.
Principal component analysis (PCA – also known as the
Karhunen-Loève transform) is a solution commonly employed to that end. By computing the eigenvectors and corresponding eigenvalues of the covariance matrix of the image
collection, a set of orthogonal, decorrelated components can
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
u2
Input (not face)
Input (face)
Known faces
u1
Figure 17: A 2D representation of matching in the "face
space". The axes u1 and u2 correspond to the two eigenvectors (eigenfaces).
is only captured by the first few components,while further
ones mostly correspond to high frequency features.
Figure 16: The features used by Viola and Jones to detect
faces in images.
be derived. The covariance between two variables X and Y ,
each consisting of n samples, is defined as:
cov(X,Y ) =
∑ ni=1 (Xi − X̄)(Yi − Ȳ )
n−1
(36)
where n is the number of elements in X, Y . Note that
cov(X, X) gives the variance of X. To capture the covariance
between more than two variables, a covariance matrix C can
be computed. For m variables {I1 , I2 , . . . , Im }, each consisting of n samples, the covariance matrix C is given by:
cov(I1 , I1 ) cov(I1 , I2 ) · · · cov(I1 , Im )
cov(I2 , I1 ) cov(I2 , I2 ) · · · cov(I2 , Im )
C= .
..
.
..
..
. ..
.
cov(Im , I1 ) cov(Im , I2 ) · · · cov(Im , Im )
(37)
The matrix of eigenvectors V of C and the diagonal matrix
of corresponding eigenvalues D can then be computed such
that C = VDV−1 .
Components with a larger eigenvalue correspond to more
important features while lower eigenvalues signify less important components. This representation offers the advantage that a relatively small number of components is sufficient to encode most of the information in the images.
Nevertheless, although the computed eigenvectors are
orthogonal, statistical independence cannot be guaranteed.
Natural images are non-Gaussian and thus the decomposition offered by PCA can only decorrelate the data [HHH09].
The main consequence of this is that meaningful information
c The Eurographics Association 2010.
5.3.1. Face Recognition
As discussed, PCA can be used to reduce the dimensionality of a given set. This observation has been used in
object and in particular face recognition applications successfully. The face recognition method known as Eigenfaces [TP91b, TP91a] is one example. The database of face
images is decomposed into a set of eigenvectors and eigenvalues, each of which corresponds to some dimension along
which faces vary. Faces tend to have a lot of common characteristics in terms of structure, making them an ideal subject
for dimensionality reduction.
In order to match input faces to samples in the database,
the new face can be projected to the computed "face space".
It then becomes a simple matter of computing the distance
between this new point and the database to decide whether
the input is in fact a face. Recognition can then take place
by finding the points closest to the input. Figure 17 shows a
simplified 2-dimensional version of this process.
5.4. Independent Components Analysis (ICA)
While Principal Components Analysis measures covariances
of pixels separated by varying distances, the components
that are found constitute a linear transform. This means that
although the data may be decorrelated, there is no guarantee
that the resulting components are in any way independent.
In particular, only if the input data happens to have a Gaussian distribution, then the resulting principal components are
both decorrelated and independent.
As we have seen, many of the statistics of natural images that are currently known point in the direction of
high kurtosis, i.e. they are highly non-Gaussian. This has
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
given rise to the use of Independent Components Analysis
(ICA) [Com94, HKO01], which is a technique to separate a
multivariate signal into a set of components that will sum to
the original signal under the assumption that the input is nonGaussian. Like PCA, ICA is a special case of blind source
separation, which is a general collection of techniques that
try to separate a set of signals from a mixed signal without prior knowledge of the source signals or the mixing process [Ach08].
ICA algorithms tend to require a set of preprocessing
steps to make the problem more tractable. In particular, the
data need to be centred and whitened. Also, data reduction is
often applied. The whitening can be achieved by running the
PCA algorithm first. By keeping only the first n components,
data reduction is achieved. It ensures that only those components are used that are meaningful to the problem being
solved. Moreover, it speeds-up the computations required to
determine independent components.
Several algorithms are known, including infomax [BS95],
extended infomax [LGS99], fastICA [Hyv99], Jade [Car99],
kernel ICA [BJ02] and RADICAL [LMF03]. Although the
implementations vary, their goals are the same: represent
multivariate data as a sum of components in such a way that
the components reveal the underlying structure that gave rise
to the observed signal.
The number of ICA algorithms available, and their computational and algorithmic complexity make an in-depth discussion of this class of algorithms beyond the scope of this
report. However, we will discuss some of their implications
for natural image statistics as well as human vision.
One could ask what are the independent components in
the context of natural images. It could be argued that these
are the objects that comprise a scene [OF97]. It is exactly
these that provide the structure that gives rise to the pixel
array that is analysed by computer algorithms as well as by
the human visual system. Both PCA and ICA, however, are
generative models which are based on linear superposition.
Thus, they cannot take into account translation and rotation
of objects [OF97].
However, it has been found that small image patches of
natural image ensembles can be analysed using ICA, revealing structures that resemble those found in the human
visual system [BS97]. In particular, this technique yields
Gabor-like patches — elongated structures that are localised
in space, orientation and frequency. Their size, aspect ratio,
spatial frequency bandwidth, receptive field length and orientation tuning bandwidth are similar to those measured in
cortical cells [vv98]. These results lend credence to the argument that the human visual system appears to represent
natural images with independent variables, each having a
highly kurtotic distribution that leads to a metabolically efficient sparse coding.
6. Colour Statistics
We have so far considered statistical regulaties in images either without specifically including the notion of colour, or
deliberately ignoring colour and assuming a single luminance channel. The reason is that colour properties of images have their own peculiarities, warranting a separate treatment [BM87, PBTM98, CCO00, COVC00].
In general, images are formed as light is transduced to
electric signals. This happens both in the retina as well as
in electronic image sensors. Before that, light is emitted and
usually reflected several times off different surfaces before it
reaches a sensor. This process is modeled by the rendering
equation, given by:
Lo (λ) = Le (λ) +
Z
Ω
Li (λ) fr (λ) cos(Θ)dω
(38)
This equation models the behaviour of light at a surface:
light Li impinging on a surface point from all directions Ω is
reflected into an outgoing direction of interest, weighted by
the bi-directional reflectance distribution function (BRDF)
fr . The amount of light Lo going into that direction is then
the sum of the reflected light and the light emitted by the surface point (which is zero unless the surface is a light source).
Here, we have specifically made all relevant terms dependent
on wavelength λ to indicate that this behaviour happens at all
wavelengths, including all visible wavelengths. This means
that Lo (λ) can be seen as a spectral distribution.
Under photopic lighting conditions, the human visual system contains three cone types, each with a different peak
sensitivity. Thus, each cone type integrates the incident light
according to a different weight function:
L=
M=
S=
Z
Z
Z
λ
λ
λ
¯ dλ
Lo (λ) l(λ)
(39a)
Lo (λ) m̄(λ) dλ
(39b)
Lo (λ) s̄(λ) dλ
(39c)
¯
where l(λ),
m̄(λ) and s̄(λ) are the weight functions (responsivities) which peak at wavelengths that are perceived
roughly as red, green, and blue. The letters L, M and S stand
for “long”, “medium” and “short” wavelengths.
A spectral distribution Lo (λ) therefore gives rise to a
triple of numbers (L, M, S) which represent the signal that
is passed on from the photoreceptors to the remainder of the
human visual system. Such triples are called tristimulus values. An important implication of this behaviour is that different spectral distributions can integrate to the same tristimulus values. Such spectral distributions are then necessarily
perceived to be identical, which is known as metamerism. It
is the reason that we are able to build display devices with
three primaries that are not the same as the responsivities
of the cones: tristimulus values in one colour space can be
converted to tristimulus values in another colour space, and
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
although they do not represent the same spectral distributions, through metamerism they lead to the same percept.
Further, rather than study spectral distributions, we can limit
ourselves to the study of tristimulus values.
There are two interesting and relevant findings in natural image statistics that pertain to colour. Both of these relate to the relationships of values between the components
of tristimulus values. In other words, there exist statistical
relationships between the three axes of colour spaces. Essentially, correlations between color channels can be exploited in colour constancy / white balancing algorithms.
By decorrelating images or image ensembles, new colour
spaces can be constructed, which have applications in for
instance colour difference metrics as well as colour transfer
algorithms.
6.1. Colour Constancy and White Balancing
Humans tend to effortlessly discount the illumination of their
environments. In other words, the colour of the light sources
in a particular viewing environment are to a large extent
ignored, so that human vision is fairly good at detecting
surface colours. This phenomenon is known as colour constancy. Computational models that aim to achieve the same
are known as white balancing algorithm. They intend to remove colour casts from images due to illumination.
As light is reflected off surfaces according to the rendering
equation, reconstructing either surface reflectance and/or illumination is an underconstrained problem. This means that
white balancing can only be achieved by means of statistical
assumptions. These are discussed in the following sections.
6.1.1. The Grey World Assumption
In colour spaces such as the aforementioned LMS space,
equal values of the three components denote achromatic
colours. One way to achieve such neutral colours is to start
with an equal energy spectrum, i.e. a spectral distribution
which has the same value Lo for each wavelength λ. This
could happen if a scene is illuminated by an equal-energy
light source, a source that emits the same energy at all
wavelengths. If the BRDF then also reflects all wavelengths
equally, a neutral colour would be achieved.
In practice, a scene is illuminated by only one or at most
a few light sources, with an emission spectrum that is offwhite. The BRDFs in a scene are much more variable. However, a surprising finding is that when a collection of BRDFs
are averaged, the result ends up being a distribution function that is close to grey. This is known as the grey-world
assumption [RKAJ08].
The implication of this finding is that if we were to average the colours of all pixels in an image, and the grey-world
assumption holds, the average tristimulus value (L̄, M̄, S̄)
is a good indicator of the colour of the light source. Of
c The Eurographics Association 2010.
course, when averaged over natural image ensembles, the
grey world assumption does tend to hold. In single images
this assumption may or may not hold. In particular, if the
surface colours in the scene are biased towards some specific saturated colour, the average reflectance will not tend
toward grey.
The grey world assumption is often used to aid white balancing. After all, if we know the colour of the illuminant,
then we can correct all pixels by simply dividing all pixels
by the average image value. Moreover, if we know that the
display has a different white point, say (Ld,w , Md,w , Sd,w ),
white balancing can be implemented as follows:
Ld,w
(40a)
L̄
Md,w
(40b)
Mwb = M
M̄
Sd,w
(40c)
Swb = S
S̄
The grey-world assumption is a statistical argument that
is necessary to perform white balancing on images in the
absence of further information about the illuminant, given
that white balancing is by itself an underconstrained problem [FBM98].
Lwb = L
Note that we have glossed over the fact that this procedure
is best applied in a perceptual colour space, such as LMS,
thereby mimicking chromatic adaptation processes that occur in the human visual system. If an image is given in a
different colour space, most likely the sRGB space, the image should first be converted to LMS.
The approximation of the illuminant can be improved
by excluding the most saturated pixels from the estimation [APS98]. Alternatively, the image can be subjected to
further statistical analysis to determine if the color distribution is due coloured surfaces or due to a coloured light
source [GS04]. Here the image is first converted to the
CIELAB colour opponent space. Ignoring the lightest and
darkest pixels as they do not contribute to a reliable estimate, the remaining pixels are used to computed a twodimensional histogram F(a, b) on the two chromatic channels a and b. In each channel, the chromatic distributions are
modeled with:
µk =
σ2k =
Z
k F(a, b) dk
(41)
(µk − k) F(a, b) dk
(42)
k
Z
k
where k = a, b. These are the mean and variances of the histogram projections onto the a and b axes. In CIELAB neutral
colours lie around the (a, b) = (0, 0) point. To assess how
far the histogram lies from this point, the distance D can be
computed:
where µ =
q
D = µ−σ
q
µ2a + µ2b and σ = σ2a + σ2b .
(43)
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
-4
-6
-4
-6
-12
-8
-10 -9 -8 -7 -6 -5 -4 -3 -2
First channel
1.6
l-α
l-β
1.2
α-β
0.8
Red - Green
Red - Blue
Green - Blue
-2
-10
0
Red - Green
Red - Blue
Green - Blue
-1
-2
-3
-4
-6
Second channel
Second channel
-8
0
Second channel
Red - Green
Red - Blue
Green - Blue
-5
-4
-3
-2
1.6
l-α
l-β
α-β
1.2
-3.5 -3
-1 0
1
First channel
Second channel
-2
Second channel
Second channel
Figure 18: Examples images used to demonstrate the correlation between channels. The first two images are reasonable examples of natural images, whereas the third image is an example of an image taken in a built-up area. Built environments tend
to have somewhat different natural image statistics compared with natural scenes [ZL97, ZL98]. Figure taken from [RKAJ08],
courtesy AK Peters, Ltd.
0.8
0.8
0.2
0.0
0.0
0.0
-8
-6
-4
l-α
l-β
α-β
0.4
0.4
-10
-1.5 -1 -0.5 0
First channel
0.6
0.4
-16 -14 -12 -10 -8 -6 -4 -2 0 2
First channel
-2.5 -2
-2
0
2
First channel
-6
-5
-4
-3
-2
-1 0
1
First channel
Figure 19: Random samples plotted in RGB color space (top) and L α β color space (bottom). The top to bottom order of the
plots is the same as the order of the images in Figure 18. Figure taken from [RKAJ08], courtesy AK Peters, Ltd.
The measure Dσ = D/σ can be used to assess the strength
of the cast. If the spread of the histogram is small, and lies
far away from the origin, the image is likely to be dominated
by strong reflectances rather than illumination.
6.1.2. Generalised Grey-World and White Patch
Assumptions
It was found that while grey-world algorithms work well on
some images, alternate solutions such as the white-patch algorithm [Lan77] perform better on texture-rich images. The
white-patch algorithm assumes that the lightest pixels in an
image depict a surface with neutral reflectance, so that its
colour represents the illuminant.
Both the grey-world and white patch algorithms are spe-
cial instances of the Minkowski norm [FT04]:
Lp =
R
f p (x)dx
R
dx
1/p
= ke
(44)
where f (x) denotes the image at pixel x. The average of
the image is computed for p = 1, thereby implementing the
grey-world assumption. The maximum value of the image
is computed by substituting p = ∞, which represents the
white-patch algorithm.
A further generalised assumption can be made about images, which is that the average difference between two pixels
evaluates to grey. This is known as the grey-edge assumpc The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
tion [vG05], and can be formulated as follows [GG07]:
1/p
Z n σ
p
∂ f (x)
= ken,p,σ
(45)
dx
∂xn
Here, n is the order of the derivative, p is the Minkowskinorm and f σ (x) = f ⊗ Gσ is the Gaussian blurred image
where the size of the filter kernel is given by σ. With this
formulation several colour constancy algorithms can be constructed:
0,1,0
e
This implements the grey-world algorithm.
e0,∞,0 This is the white-patch algorithm.
en,p,σ This is known as the general grey-world algorithm.
The value of n is typically 0, 1 or 2 for zero, first and
second order statistics.
In each case, the value of n gives the order of the statistics.
6.1.3. White Balance Algorithm Selection
Having noted that many color constancy and white balancing algorithms exist, with none of them universally applicable, Gijsenij and Gevers use natural image statistics to classify an image, and then use this classification to select the
most appropriate white balancing algorithm for the image at
hand [GG07]. In particular, they use the finding that the distribution of edge responses in an image can be modeled by
means of a Weibull distribution [GS05]:
γ
γ−1
γ x
x
exp
(46)
f (x) =
β β
β
The parameters β and γ have meaning in this context. The
contrast of an image (or image ensemble) is given by β
whereas γ is an indicator of grain size (i.e. the peakedness of
the distribution). This means that higher values for β represent images with more contrast. Higher values for γ indicate
finer textures.
Fitting a Weibull distribution involves computing a Gaussian derivative filter in both x and y directions. This results in
the (βx , βy , γx , γy ) set of parameters for each colour channel.
The Gaussian derivate filter can be first, second or third order. However, it was found that the order of the chosen filter
is relatively unimportant: high correlations exist between the
fitted parameters [GS05].
It is now possible to fit Weibull parameters to the
derivaties of a large number of images, and correlate their
values with the white balancing algorithm that performs best
for each image. The parameter space tends to form clusters
where a specific algorithm tends to produce the most accurate result. This means that the Weibull distribution can be
used to select the most appropriate white balancing algorithm for each image individually [GG07].
sources tend towards white, then in the LMS colour space
and similar RGB-like spaces, values in one colour channel
tend to be good predictors of values in the other two colour
channels. In other words, if we find a high value for a red
pixel, then chances are that the green and blue values are
high as well. This means that the colour channels in such
colour spaces are highly correlated.
An interesting experiment was conducted whereby a
set of spectral images was converted to log LMS space,
before being subjected to Principal Components Analysis (PCA) [RCC98]. The logarithm was taken as a dataconditioning measure. Applied in this manner, PCA rotates
the axes to the point where they are maximally decorrelated.
The resulting colour space is termed L α β where L is a lightness channel, and α and β are colour opponent channels. By
starting in LMS cone space, the rotation yields a new colour
space which surprisingly corresponds closely to the color
opponent space found in the ganglion cells (these are the
cells that transport the visual signal from the retina through
the optic nerve to the lateral geniculate nucleus, thought to
be a relay station in the brain).
Colour opponent spaces are characterised by a single
achromatic channel, typically encoding luminance or lightness, and two chromatic axes which can be thought of as
spanning red-green and yellow-blue (although the axes do
not precisely correspond to these perceptual hues). The chromatic axes can have positive and negative values; a positive
value can for instance denote the degree of redness, whereas
a negative value in the same channel would denote the degree of greenness. A consequence of colour opponency is
that we are not able to simultaneously perceive an object to
have green and red hues. Although we may describe objects
as reddish-blue or yellowish-green, we never describe them
as reddish-green. The same holds for yellow and blue.
The conversion from XYZ to LMS, and then from LMS
to the aforementioned decorrelated opponent colour space
L α β is given by:
0.3897
L
M = −0.2298
0.0000
S
0.6890
1.1834
0.0000
−0.0787 X
0.0464 Y
1.0000 Z
(47)
and the inverse conversion from LMS to XYZ is given by:
1.9102
X
Y = 0.3710
0.0000
Z
−1.1121
0.6291
0.0000
L
0.2019
0.0000 M
S
1.0000
(48)
6.2. Statistical Decorrelation
The success of the grey-world assumption has a second implication. If the average surface reflectance is grey, and light
c The Eurographics Association 2010.
Converting from LMS to L α β involves a further matrix
multiplication:
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
For 2000 random samples drawn from each of the images
shown in Figure 18, their distribution is plotted in Figure 19.
The point clouds form more or less diagonal lines in RGB
space when pairs of channels are plotted against each other,
showing that the three color channels in RGB space are almost completely correlated for these images. This is not the
case for the same pixels plotted in L α β space.
The colour opponency of the L α β space is demonstrated
in Figure 20, where the image is decomposed into its separate channels. The image representing the α channel has the
β channel reset to 0 and vice versa. We have retained the luminance variation here for the purpose of visualization. The
image showing the luminance channel only was created by
setting both the α and β channels to zero.
The fact that natural images can be transformed to a
decorrelated colour space, which coincides with a colour decomposition occurring in the human visual system, points to
a beautiful adaptation of human vision to its natural input.
It also points to several applications that are possible due
to the fact that opponent spaces are decorrelated. In particular, we highlight colour transfer, an algorithm that attempts
to transfer the look and feel of one image to another on the
basis of its colour content. Due to the three-dimensional nature of colour spaces, this is in the general sense a complicated three-dimensional problem. However, by converting to
a colour opponent space, the data in each channel will be
decorrelated from the other channels. As a first example, the
histogram projection onto the a and b axes as outlined in
Section 6.1.1 yields meaningful results as the CIELAB space
used here is an example of a colour opponent space.
Figure 20: The top-left image is decomposed into the L
channel of the Lαβ color space, as well as L + α and L + β
channels in the bottom-left and bottom-right images. Figure
from [RKAJ08], courtesy AK Peters, Ltd.
1
√
L
3
α =
0
β
0
0
1
√
6
0
0
1
0
1
1 1
√
2
1
1
−1
1
log L
−2 log M
0
log S
while the inverse transform is given by:
√
3
0
√
log L
1
1
1 3
6
log M = 1
1 −1 0
6
log S
1 −2
0
0
0
(49)
0
L
α
0
√
β
2
2
(50)
6.2.1. Colour Transfer
Perhaps one of the most direct applications of the use of
decorrelated colour spaces is colour transfer between images. The aim is to transfer some statistical properties of one
image to another. To make this process straightforward, images can be converted to a decorrelated colour space, such
as the Lαβ space proposed by Ruderman [RCC98]. In this
space, the means and standard deviations of all the pixels can
be computed separately in each of the three axis. Doing this
for both a source and a target image yields a set of factors
and terms that can be used to shift and scale the pixel data
in one image to match the means and standard deviations of
the other image [RAGS01]. An example of this procedure
is shown in Figure 21. Applications of colour transfer, aside
from producing aesthetically pleasing images, include creating night-time images from day-time images, as well as
augmented reality applications, where colour transfer provides an inexpensive way to make rendered objects fit into a
captured environment.
Although the L α β space is on average decorrelated, individual images may deviate arbitrarily from this ideal. As
such, many images may still show significant correlations
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
Source Image:
Target Images:
Top image cropped for
formatting purposes.
Results:
coding is often associated with probability density distributions that have a high kurtosis. Indeed, we have shown
several transforms that result in highly kurtotic signals, and
which approximate human visual processes. In particular, it
can be shown that wavelets that analyse regions locally in
space, orientation as well as frequency induce high kurtosis.
Moreover, such analysers are thought to operate in the visual
cortex.
Further, the human visual system appears to reduce redundancy in its input. The photoreceptors transduce a signal
which is recombined several times in the retina, producing a
spatial center-surround mechanism as well as colour opponency. It has been shown in particular that opponent colour
spaces tend to be decorrelated. For natural images, the opponent signals tend to be close to independent. As a result, the
information carried in its three channels is stripped of much
of its redundancy.
Figure 21: Example of colour transfer.
between the axis, even in a colour opponent space. It would
be possible to individually subject images to Principal Components Analysis to find the most decorrelated axes for each
specific image. Such a technique may provide plausible results for a wider range of inputs [AK04, AK07].
7. Conclusions
Natural images are not random. They contain striking statistical regularities, which are surveyed in this report. The human visual system has evolved to observe natural images and
therefore contains neural circuitry that is specifically adapted
to analyse natural images. Thus, the field that studies natural
image statistics has emerged to understand the input to the
human visual system first, for the purpose of gaining a better understanding of the inner workings of the human visual
system itself.
The discovery of natural image statistics follows several
steps. First an ensemble of natural images is collected. We
have outlined several issues that need to be resolved in this
step. Second, the images are transformed into a specific domain. Examples include the computation of power spectra,
phase spectra or wavelets, although many more are known
and listed in this report. Finally, statistical regularities are
computed in this domain.
There are a couple of important observations to make
about human visual processing of natural images. For
metabolic reasons, the HVS appears to have evolved to use
a sparse coding. In essence, this means that most cells in the
visual cortex lie dormant most of the time, while for each
input only a few cells are active. This helps minimise energy
consumption.
Thus, the HVS also applies several transforms to its input
so as to induce a sparse coding. In the statistical sense, sparse
c The Eurographics Association 2010.
Thus, the importance of discovering natural image statistics for understanding the human visual system is undeniable. However, the use of natural image statistics is not limited to the study of human vision. Particularly, in application
areas such as computer graphics, machine vision as well as
image processing, images are synthesized or analysed. By
observing the aforementioned image statistics, we are able
to produce more plausible images, enable new applications,
or detect objects in images with higher success rates. Examples of these are given throughout this report.
Here, we highlight some of the more salient applications that employ some form of natural image statistics.
Of these, wavelets are particularly useful for many tasks.
They are used for instance in denoising algorithms as well
as for object detection. However, perhaps its greatest success story lies in its use in lossy compression algorithms.
As wavelets induce a sparese coding: histograms of wavelet
coefficients are highly kurtotic so that most coefficients are
small. Clamping these coeffients to zero means that an image can be reconstructed from a very small set of non-zero
coefficients, leading to good opportunities of encoding only
a small set of coefficients and thereby obtaining high compression factors.
A second example is found in the use of the sparseness
of the distribution of gradients. Modeling the distribution of
gradients of image ensembles has resulted in a simple functional form that can be used in optimisation problems as priors. This has been demonstrated in blind deconvolution algorithms which aim to remove blur from images due to camera
shake. As an infinite number of solutions exist, these algorithms require additional constraints. In particular, they can
drive the solution towards some measure of naturalness, such
as the aforementioned distribution of gradients.
Finally, the decorrelation afforded by opponent colour
spaces has been used directly in colour transfer algorithms.
Here, the plausible transfer of colour palettes is made simpler by treating each channel independently.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
We hope that this overview of the field of natural image
statistics, and its link with graphics-related applications may
help in the development of further applications in graphics.
References
[AAB96] A SADA N., A MANO A., BABA M.: Photometric calibration of zoom lens systems. In Proceedings of the 13th International Conference on Pattern Recognition (1996), vol. 1, pp. 186–
190. 6
[Ach08] ACHARYYA R.: A New Approach for Blind Source Separation of Convolutive Sources: Wavelet Based Separation Using
Shrinkage Function. VDM Verlag Dr. Müller Aktiengesellschaft
& Co. KG, Saarbrücken, Germany, 2008. 20
[AEB10] Auto exposure bracketing settings by camera model.
http://hdr-photography.com/aeb.html, 2010. 3
[AK04] A BADPOUR A., K ASAEI S.: A fast and efficient fuzzy
color transfer method. In Proceedings of the 4th IEEE International Symposium on Signal Processing and Information Technology (2004), pp. 491–494. 25
[AK07] A BADPOUR A., K ASAEI S.: An efficient PCA-based
color transfer method. Journal of Visual Communication and Image Representation 18, 1 (2007), 15–34. 25
[BF00] B RADY M., F IELD D. J.: Local contrast in natural images: normalisation and coding efficiency. Perception 29 (2000),
1041–1055. 8
[BG00] BALBOA R. M., G RZYWACZ N. M.: Occlusions and
their relationship with the distribution of contrasts in natural images. Vision Research (2000). 14
[BG03] BALBOA R. M., G RZYACZ N. M.: Power spectra and
distribution of contrasts of natural images from different habitats.
Vision Research 43, 24 (2003), 2527–2537. 13
[BGK01] B ILLOCK V. A., G UZMAN G. C. D., K ELSO J. A. S.:
Fractal time and 1/f spectra in dynamic images and human vision.
Physica D 148 (2001), 136–146. 12
[BH96] B ILLOCK V. A., H ARDING T. H.: Evidence of the spatial and temporal channels in the correlational structure of human
spatiotemporal contrast sensitivity. Journal of Physiology 490, 2
(1996), 509–517. 12
[Bil00] B ILLOCK V. A.: Neural acclimation to 1/f spatial frequency spectra in natural images transduced by the human visual
system. Physica D 137 (2000), 379–391. 12, 13
[BJ02] BACH F. R., J ORDAN M. I.: Kernel independent component analysis. Journal of Machine Learning Research 3 (2002),
1–48. 20
[Any] A NYHERE S OFTWARE: Photosphere. http://www.
anyhere.com/. 7
[BL09] B RADY M., L EGGE G.: Camera calibration for natural
image studies and vision research. Journal of the Optical Society
of America A 26, 1 (2009), 30–42. 4, 5, 6
[APS98] A DAMS J., PARULSKI K., S PAULDING K.: Color processing in digital cameras. IEEE Micro 18, 6 (1998), 20–30. 9,
21
[BM87] B URTON G. J., M OORHEAD I. R.: Color and spatial
structure in natural scenes. Applied Optics 26, 1 (1987), 157–
170. 11, 20
[ASF91] A DELSON E., S IMONCELLI E., F REEMAN W. T.: Pyramids and multiscale representations. In Representations and Vision, Gorea A., (Ed.). Cambridge University Press, Cambridge,
1991, pp. 3–16. 16
[BMW88] BAUM E. B., M OODY J., W ILCZEK F.: Internal representations for associative memory. Biological Cybernetics 59
(1988), 217–228. 17
[ASH87] A DELSON E., S IMONCELLI E., H INGORANI R.: Orthogonal pyramid transforms for image coding. In SPIE Visual
Communications and Image Processing (1987), vol. II, p. 845.
16
[Bad96a] BADDELEY R. J.: An efficient code in V1? Nature 381
(1996), 560–561. 17
[Bad96b] BADDELEY R. J.: Searching for filters with “interesting” output distributions: An uninteresting direction to explore?
Network: Computation in Neural Systems 7, 2 (1996), 409–421.
15
[BAM∗ 97] BADDELEY R. J., A BBOTT L. F., M ICHAEL C. A.,
B OOTH C. A., S ENGPIEL F., F REEMAN T., WAKEMAN E. A.,
ROLLS E. T.: Responses of neurons in primary and inferior temporal visual cortices to natural scenes. Proceedings of the Royal
Society B: Biological Sciences 264, 1389 (1997), 1775–1783. 11,
13
[BCG90] B OVIK A. C., C LARK M., G EISLER W.: Multichannel texture analysis using localized spatial filters. IEEE Pattern
Analysis and Machine Intelligence 12, 1 (1990), 55–73. 16
[BCHT01] B ILLOCK V. A., C UNNINGHAM D. W., H AVIG P. R.,
T SOU B. H.: Perception of spatiotemporal random fractals: an
extension of colorimetric methods to the study of dynamic texture. Journal of the Optical Society of America A 18(10) (2001),
2404–2413. 10, 12
[BCT08] B ILLOCK V. A., C UNNINGHAM D. W., T SOU B. H.:
What visual discrimination of fractal textures can tell us about
discrimination of camouflaged targets. In Proceedings of Human
Factors Issues: In Combat Identification and in Image Analysis
Using Unmanned Aerial Vehicles (2008). 10, 11, 12, 13
[BP86] BAYER B. E., P OWELL P. G.: A method for the digital
enhancement of unsharp grainy photographs. Advances in Computer Vision and Image Processing 2 (1986), 31–88. 18
[BS95] B ELL A. J., S EJNOWSKI T. J.:
An informationmaximization approach to blind separation and blind deconvolution. Neural Computation 7, 6 (1995), 217–234. 20
[BS97] B ELL A. J., S EJNOWSKI T. J.: The ‘independent components’ of natural scenes are edge filters. Vision Research 37
(1997), 3327–3338. 20
[Cae81] C AELLI T.: Visual Perception: Theory and Practice.
Pergamon Press, Oxford, 1981. 10, 13
[Car99] C ARDOSO J. F.: High-order contrasts for independent
component analysis. Neural Computation 11, 1 (1999), 157–192.
20
[CCO00] C HIAO C.-C., C RONIN T. W., O SORIO D.: Color signals in natural scenes: characteristics of reflectance spectra and
the effects of natural illuminants. Journal of the Optical Society
of America A 17, 2 (2000), 218–224. 20
[CF05] C LAUS D., F ITZGIBBON A.: A rational function lens distortion model for general cameras. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition (2005),
pp. 213–219. 5
[CG87] C UTTING J., G ARVIN J.: Fractal curves and complexity.
Perception and Psychophysics 42 (1987), 365 – 370. 10, 11
[CHJ78] C AMPBELL F. W., H OWELL E. R., J OHNSON J. R.:
A comparison of threshold and suprathreshold appearance of
gratings with components in the low and high spatial frequency
range. Journal of Physiology 284 (1978), 193 – 201. 13
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
[CNR02] C ARON J. N., NAMAZI N. M., ROLLINS C. J.: Noniterative blind data restoration by use of an extracted filter function.
Applied Optics 41, 32 (2002), 6884–6889. 14
[Com94] C OMON P.: Independent component analysis: A new
concept. Signal Processing 36, 3 (1994), 287–314. 20
[COVC00] C HIAO C.-C., O SORIO D., VOROBYEV M., C RONIN
T. W.: Characterization of natural illuminants in forests and the
use of digital video data to reconstruct illuminant spectra. Journal of the Optical Society of America A 17, 10 (2000), 1713–
1721. 20
[DA95] D ONG D., ATICK J.: Statistics of time-varying images.
Network: Computation in Neural Systems 6 (1995), 345 – 358.
11, 12
[Dau88] DAUGMAN J. G.: Complete discrete 2D gabor transforms by neural networks for image analysis and compression.
IEEE Transactions on Acoustics, Speech and Signal Processing
36, 7 (1988), 1169–1179. 16
[Dau91] DAUGMAN J. G.: Self-similar oriented wavelet pyramids: Conjectures about neural non-orthogonality. In Representations and Vision, Gorea A., (Ed.). Cambridge University Press,
Cambridge, 1991. 16
[DAW01] D ROR R. O., A DELSON E. H., W ILLSKY A. S.: Surface reflectance estimation and natural illumination statistics. In
In Proc. of IEEE Workshop on Statistical and Computational
Theories of Vision (2001). 13
[DCCH05] D EUSSEN O., C OLDITZ C., C OCONU L., H EGE H.:
Efficient Modeling and Rendering of Landscapes Visualization in
Landscape and Environmental Planning. Tylor & Francis, 2005.
12
[Deb] D EBEVEC P.:
com/. 7
Hdrshop.
http://www.hdrshop.
[DF01] D ONOHO D., F LESIA A.: Can recent innovations in harmonic analysis ’explain’ key findings in natural image statistics?
Network: Computation in Neural Systems 12 (2001), 371–393. 1
[DHT∗ 00] D EBEVEC P., H AWKINS T., T CHOU C., D UIKER H.P., S AROKIN W., S AGAR M.: Acquiring the reflectance field
of a human face. In SIGGRAPH ’00: Proceedings of the 27th
annual conference on Computer graphics and interactive techniques (2000), pp. 145–156. 8
[DL05] D EUSSEN O., L INTERMANN B.: Digital Design of
Nature: Computer Generated Plants and Organics. SpringerVerlag, New York, 2005. 12
[DLAW] D ROR R., L EUNG T., A DELSON E., W ILLSKY A.:
Statistics of real-world illumination. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition,
pp. 164–171. 3, 8, 9, 14
[DM97] D EBEVEC P., M ALIK J.: Recovering high dynamic
range radiance maps from photographs. In SIGGRAPH ’97: Proceedings of the 24th annual conference on Computer graphics
and interactive techniques (1997), pp. 369–378. 7
[DOL00] D ROR R. O., O’C ARROLL D. C., L AUGHLIN S. B.:
The role of natural image statistics in biological motion estimation. In Biologically Motivated Computer Vision (2000), pp. 492–
501. 13
[EBW92] E CKERT M. P., B UCHSBAUM G., WATSON A. B.:
Separability of spatiotemporal spectra of image sequences. IEEE
Transactions on Pattern Analysis and Machine Intelligence 14,
12 (1992), 1210–1213. 12
[EMF03] E L -M ELEGY M., FARAG A.: Nonmetric lens distortion
calibration: Closed-form solutions, robust estimation and model
selection. Ninth International Conference on Computer Vision
(ICCV 2003) (2003), 554–559. 5
[ePa] E PAPER P RESS: Ptlens. http://epaperpress.com/
ptlens/. 4
[Fai07] FAIRCHILD M.: The HDR Photographic Survey. In Proceedings of the 15 th Color Imaging Conference: Color Science and Engineering Systems, Technologies, and Applications
(2007), pp. 233–238. 6
[FB97] F IELD D. J., B RADY N.: Visual sensitivity, blur and the
sources of variability in the amplitude spectra of natural images.
Vision Research 37 (1997), 3367 – 3383. 11, 13
[FBM98] F UNT B., BARNARD K., M ARTIN L.: Is machine
colour constancy good enough? In Proceedings of the 5th European Conference on Computer Vision (1998), pp. 445–459. 21
[FFC82] F OURNIER A., F USSELL D., C ARPENTER L.: Computer rendering of stochastic models. Communications of the
ACM 25, 6 (1982), 371–384. 12
[Fie87] F IELD D. J.: Relations between the statistics of natural
images and the response properties of cortical cells. Journal of
the Optical Society of America A 4, 12 (1987), 2379–2394. 11,
16
[Fie93] F IELD D. J.: Scale-invariance and self-similar ’wavelet’
transforms: An analysis of natural scenes and mammalian visual
systems. In Wavelets, fractals and Fourier transforms, Farge M.,
Hunt J. C. R., Vassilicos J. C., (Eds.). Clarendon Press, Oxford,
1993, pp. 151–193. 11, 16
[Fie94] F IELD D. J.: What is the role of sensory coding? Neural
Computation 6 (1994), 559–601. 17
[Fol95] F OLDIAK P.: Sparse coding in the primate cortex. In The
Handbook of Brain Theory and Neural Networks, Arbib M. A.,
(Ed.). MIT Press, Cambridge, MA, 1995, pp. 895–989. 17
[FSH∗ 06] F ERGUS R., S INGH B., H ERTZMANN A., ROWEIS S.,
F REEMAN W.: Removing camera shake from a single photograph. ACM Transactions on Graphics (TOG) 25 (2006), 787–
794. 14
[FT04] F INLAYSON G., T REZZI E.: Shades of gray and colour
constancy. In Proceedings of the Twelfth Color Imaging Conference (2004), pp. 37–41. 22
[Gab46] G ABOR D.: Theory of communication. JIEE London
93, III (1946), 429–457. 16
[GC05] G OLDMAN D., C HEN J.-H.: Vignette and exposure calibration and compensation. In Tenth IEEE International Conference on Computer Vision (ICCV 2005) (2005), vol. 1, pp. 899–
906. 6
[Gei08] G EISLER W. S.: Visual perception and the statistical
properties of natural scenes. Annual Reviews in Psychology 59
(2008), 167–192. 1
[DOL01] D ROR R. O., O’C ARROLL D. C., L AUGHLIN S. B.:
Accuracy of velocity estimation by reichardt correlators. J. Opt.
Soc. Am. A 18, 2 (2001), 241–252. 13
[GG07] G IJSENIJ A., G EVERS T.: Color constancy using natural image statistics. In IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (2007), pp. 1–8. 23
[DWA]
[Gib79] G IBSON J. J.: The ecological approach to visual perception. Lawrence Erlbaum, Hillsdale, NJ, 1979. 10
D ROR R. O., W ILLSKY A. S., A DELSON E. H.:. 11, 14
[DxO] D X O I MAGE S CIENCE:
Automated optical corrections dedicated to your lenses. http://www.dxo.com/
us/photo/dxo_optics_pro/optics_geometry_
corrections. 4
c The Eurographics Association 2010.
[GS04] G ASPARINI F., S CHETTINI R.: Color balancing of digital photos using simple image statistics. Pattern Recognition 37
(2004), 1201–1217. 21
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
[GS05] G EUSEBROEK J., S MEULDERS A.: A six-stimulus theory
for stochastic texture. International Journal of Computer Vision
62, 1-2 (2005), 7–16. 23
[KFK90] K NILL D. C., F IELD D., K ERSTEN ? D.: Human discrimination of fractal images. J. Opt. Soc. Am. A 7, 6 (1990),
1113–1123. 12
[HB96] H AMMETT S. T., B EX P. J.: Motion sharpening: Evidence for the addition of high spatial frequencies to the effective
neural image. Vision Research 36, 17 (1996), 2729 – 2733. 13
[KP88] K UBE P., P ENTLAND A.: On the imaging of fractal surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence 10 (1988), 704–707. 10
[HBS92] H ANCOCK P., BADDELEY R. J., S MITH L.: The principal components of natural images. Network: Computation in
Neural Systems 3, 1 (1992), 61–70. 18
[KP08] K IM B., PARK R.: Automatic detection and correction of
purple fringing using the gradient information and desaturation.
In Proceedings of the 16th European Signal Processing Conference (2008). 4
[HDR] HDR S OFT: Photomatix. http://www.hdrsoft.
com/. 7
[HHH09] H YVÄRINEN A., H URRI J., H OYER P. O.: Natural Image Statistics: A Probabilistic Approach to Early Computational
Vision. Springer, 2009. 1, 19
[HK94] H EALEY G., KONDEPUDY R.: Radiometric ccd camera
calibration and noise estimation. IEEE Transactions on Pattern
Analysis and Machine Intelligence 16, 3 (1994), 267–276. 6
[HK07] H ARTLEY R., K ANG S. B.: Parameter-free radial distortion correction with center of distortion estimation. Pattern
Analysis and Machine Intelligence, IEEE Transactions on 29, 8
(2007), 1309–1321. 5
[HKO01] H YVÄRINEN A., K ARHUNEN J., O JA E.: Independent
Component Analysis. John Wiley & Sons, 2001. 20
[HLM00] H UANG J., L EE A., M UMFORD D.: Statistics of range
images. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (2000), vol. 1, pp. 324–331. 3,
11, 14
[HM99a] H UANG J., M UMFORD D.: Image Statistics for the
British Aerospace Segmented Database. Tech. rep., Division of
Applied Math, Brown Univeristy, 1999. 11, 13, 14
[KW00] K ANG S. B., W EISS R.: Can we calibrate a camera
using an image of a flat, textureless lambertian surface? In Sixth
European Conference on Computer Vision (ECCV 2000) (2000),
pp. 640–653. 6
[Lan77] L AND E.: The retinex theory of color vision. Scientific
American 237, 6 (1977), 108–128. 22
[Lev07] L EVIN A.: Blind motion deblurring using image statistics. Advances in Neural Information Processing Systems (2007).
13, 14
[LGS99] L EE T.-W., G IROLAMI M., S EJNOWSKI T. J.: Independent component analysis using an extended infomax algorithm
for mixed sub-gaussian and super-gaussian sources. Neural Computation 11, 2 (1999), 417–441. 20
[LMF03] L EARNED -M ILLER E. G., F ISHER III J. W.: ICA using spacings estimates of entropy. Journal of Machine Learning
Research 4 (2003), 1271–1295. 20
[LS05] L ITVINOV A., S CHECHNER Y. Y.: Radiometric framework for image mosaicking. Journal of the Optical Society of
America A 22 (2005), 839–848. 6
[HM99b] H UANG J., M UMFORD D.: Statistics of natural images
and models. In Proceedings CVPR 99 (1999), pp. 541–547. 1
[LZW02] L EVIN A., Z OMET A., W EISS Y.: Learning to perceive
transparency from the statistics of natural scenes. In In NIPS-15;
The 2002 Conference on Advances in Neural Information Processing Systems (2002), MIT Press. 14
[HM99c] H UANG J., M UMFORD D.: Statistics of natural images
and models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (1999), vol. 1, pp. 541–547.
3, 8
[LZW03] L EVIN A., Z OMET A., W EISS Y.: Learning how
to inpaint from global image statistics. In Proceedings of
the Ninth IEEE International Conference on Computer Vision
(2003), pp. 305–312. 15
[HT96] H IRANI A. N., T OTSUKA T.: Combining frequency and
spatial domain information for fast interactive image noise removal. In SIGGRAPH ’96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques (1996),
pp. 269–276. 15
[Mal89] M ALLAT S. G.: A theory for multiresolution signal decomposition: The wavelet representation. IEEE Pattern Analysis
and Machine Intelligence 11 (1989), 674–693. 16
[Man83] M ANDELBROT B. B.: The Fractal Geometry of Nature.
Freeman, New York, 1983. 11, 12, 13
[Hyv99] H YVÄRINEN A.: Fast and robust fixed-point algorithms
for independent component analysis. IEEE Transactions on Neural Networks 10, 3 (1999), 626–634. 20
[Mar80] M ARCELJA S.: Mathematical description of the responses of simple cortical cells. Journal of the Optical Society of
America 70 (1980), 1297–1300. 16
[JBFZ02] JALOBEANU A., B LANC -F ÉRAUD L., Z ERUBIA J.:
Estimation of blur and noise parameters in remote sensing. In
In Proc. of Int. Conf. on Acoustics, Speech and Signal Processing (2002), pp. 249–256. 14
[MFTF03] M. F. TAPPEN B. C. R., F REEMAN W. T.: Exploiting the sparse derivative prior for super-resolution and image demosaicing. In Third International Workshop on Statistical and
Computational Theories of Vision at ICCV (2003). 14
[JC79] J ULESZ B., C AELLI T.: Perception. On the limits of
Fourier decompositions in visual texture perception 8 (1979), 69
– 73. 10, 13
[MN99] M ITSUNAGA T., NAYAR S.: Radiometric self calibration. In Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition (1999), vol. 1, pp. 374–380. 7
[JFS95] JACOBS C. E., F INKELSTEIN A., S ALESIN D. H.: Fast
multiresolution image querying. In SIGGRAPH ’95: Proceedings
of the 22nd annual conference on Computer graphics and interactive techniques (New York, NY, USA, 1995), ACM, pp. 277–
286. 18
[MNSA07] M OTOYOSHI I., N ISHIDA S., S HARAN L., A DEL SON E. H.: Image statistics and the perception of surface qualities. Nature 447, 7141 (2007), 206–209. 8, 9
[Kan07] K ANG S.: Automatic removal of chromatic aberration
from a single image. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (2007), pp. 1–8. 4
[MP95] M ANN S., P ICARD R.: On being ’undigital’ with digital
cameras: Extending dynamic range by combining differently exposed pictures. In Proceedings of IS&T 48th annual conference
(1995), pp. 422–428. 2, 7
[MW07]
M ALLON J., W HELAN P.: Calibration and removal of
c The Eurographics Association 2010.
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
lateral chromatic aberration in images. Pattern Recognition Letters 28, 1 (2007), 125–135. 4
using multiple exposures.
(2003), 219–228. 7
Journal of Electronic Imaging 12
[NCB04] N EELAMANI R., C HOI H., BARANIUK R.: ForWaRD:
Fourier-wavelet regularized deconvolution for ill-conditioned
systems. Signal Processing, IEEE Transactions on 52, 2 (2004),
418–433. 14
[RCC98] RUDERMAN D. L., C RONIN T. W., C HIAO C.: Statistics of cone responses to natural images: Implications for visual
coding. Journal of the Optical Society of America A 15, 8 (1998),
2036–2045. 3, 23, 24
[NP96] N IKIAS C., P ETROPOLU A.: Higher-Order Spectra Analysis. Prentice-Hall, Englewood Cliffs, NJ, 1996. 15
[RKAJ08] R EINHARD E., K HAN E. A., A KYÜZ A. O., J OHN SON G. M.: Color Imaging: Fundamentals and Applications. A
K Peters, Wellesley, 2008. 2, 4, 6, 9, 21, 22, 24
[OF96] O LSHAUSEN B. A., F IELD D. J.: Emergence of simplecell receptive field properties by learning a sparse code for natural
images. Nature 381 (1996), 607–609. 17
[OF97] O LSHAUSEN B. A., F IELD D. J.: Sparse coding with
an overcomplete basis set: A strategy employed by V1? Vision
Research 37 (1997), 3311–3325. 20
[OPS∗ 97]
O REN M., PAPAGEORGIOU C., S INHA P., O SUNA E.,
P OGGIO T.: Pedestrian detection using wavelet templates. In
Computer Vision and Pattern Recognition, 1997. Proceedings.,
1997 IEEE Computer Society Conference on (1997), pp. 193–
199. 18
[OSS83] OWSLEY C., S EKULAR R., S IEMSEN D.: Contrast sensitivity throughout adulthood. Vision Research 23 (1983), 689 –
699. 12
[OT01] O LIVA A., T ORRALBA A.: Modeling the shape of the
scene: A holistic representation of the spatial envelope. International Journal of Computer Vision 42, 3 (2001), 145–175. 11,
13
[PBT98] P ÁRRAGA C. A., B RELSTAFF G., T ROSCIANKO T.:
Color and luminance information in natural scenes. Journal of
the Optical Society of America A 15, 3 (1998), 563–569. 11
[PBTM98] PÁRRAGA C. A., B RELSTAFF G., T ROSCIANKO T.,
M OOREHEAD I. R.: Color and luminance information in natural
scenes. Journal of the Optical Society of America A 15, 3 (1998),
563–569. 20
[POP98] PAPAGEORGIOU C. P., O REN M., P OGGIO T.: A general framework for object detection. In Computer Vision, 1998.
Sixth International Conference on (1998), pp. 555–562. 18
[PS88] P EITGEN H.-O., S AUPE D.: The Science of Fractal Images. Springer-Verlag, New York, 1988. 12
[PS00] P ORTILLA J., S IMONCELLI E. P.: Image denoising via
adjustment of wavelet coefficient magnitude correlation. In Proceedings of the 7th International Conference on Image Processing (2000), pp. 277–280. 17
[PWK95] P ETERZELL D. H., W ERNER J. S., K APLAN P. S.:
Individual differences in contrast sensitivity functions: Longitudinal study of 4-, 6- and 8-month-old human infants. Vision Research 35, 7 (1995), 9651–979. 12
[RAGS01] R EINHARD E., A SHIKHMIN M., G OOCH B.,
S HIRLEY P.: Color transfer between images. IEEE Computer
Graphics and Applications 21 (2001), 34–41. 9, 24
[RB94] RUDERMAN D., B IALEK W.: Statistics of natural images: Scaling in the woods. Physical Review Letters (1994). 11,
14
[RB05] ROTH S., B LACK M.: Fields of experts: A framework for
learning image priors. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition (2005), pp. 860–867.
14
[RB09] ROTH S., B LACK M.: Fields of experts. International
Journal of Computer Vision 82, 2 (2009). 14
[RBS03] ROBERTSON M., B ORMAN S., S TEVENSON R.:
Estimation-theoretic approach to dynamic range enhancement
c The Eurographics Association 2010.
[RSAT04]
R EINHARD E., S HIRLEY P., A SHIKHMIN M., T ROS T.: Second order image statistics in computer graphics.
In Proceedings of the 1st Symposium on Applied perception in
graphics and visualization (2004), pp. 99–106. 10, 11, 12
CIANKO
[RST01] R EINHARD E., S HIRLEY P., T ROSCIANKO T.: Natural
image statistics for computer graphics. Tech. Rep. UUCS-01002, 2001. 11
[Rud94] RUDERMAN D. L.: The statistics of natural images. Network: Computation in Neural Systems, 5 (1994), 517–548. 1, 11,
14
[Rud97] RUDERMAN D. L.: Origins of scaling in natural images.
Vision Research 37 (1997), 3385–3398. 11, 13
[RV90] ROGOWITZ B., VOSS R.: Shape perception and lowdimension fractal boundaries. Proceedings of the SPIE 1249
(1990), 387 – 394. 10, 13
[RWPD05] R EINHARD E., WARD G., PATTANAIK S., D EBEVEC
P.: High Dynamic Range Imaging: Acquisition, Display and
Image-Based Lighting. Morgan Kaufmann Publishers, San Francisco, 2005. 2
[SA96] S IMONCELLI E. P., A DELSON E. H.: Noise removal via
bayesian wavelet coring. In Proceedings of the Third International Conference on Image Processing (1996), vol. I, pp. 379–
392. 16
[SDS95] S TOLLNITZ E. J., D E ROSE T. D., S ALESIN D. H.:
Wavelets for computer graphics: A primer. IEEE Computer
Graphics and Applications 15, 3 (1995), 76–84. 17
[SDS96] S TOLLNITZ E. J., D E ROSE T. D., S ALESIN D. H.:
Wavelets for Computer Graphics: Theory and Applications. Morgan Kaufmann, San Francisco, 1996. 17
[SHL95] S HIH S., H UNG Y., L IN W.: When should we consider
lens distortion in camera calibration. Pattern Recognition 28, 3
(1995), 447–461. 2
[Sim97] S IMONCELLI E. P.: Statistical models for images: Compression, restoration and synthesis. In 31st Asilomar Conference
on Signals, Systems and Computers (1997), pp. 673–678. 18
[Sim99a] S IMONCELLI E. P.: Bayesian denoising of visual images in the wavelet domain. In Bayesian Inference in Wavelet
Based Models, Müller P., Vidakovic B., (Eds.). Springer-Verlag,
New York, 1999. 17, 18
[Sim99b] S IMONCELLI E. P.: Modeling the joint statistics of images in the wavelet domain. In Proceedings of the SPIE 44th
Annual Meeting (1999), vol. 3813, pp. 188–195. 16
[Sim05] S IMONCELLI E.: Statistical modeling of photographic
images. In Handbook of Image and Video Processing, Bovik A.,
(Ed.). Elsevier Academic Press, 2005, ch. 4.7, pp. 431–441. 14
[SJA08] S HAN Q., J IA J., AGARWALA A.: High-quality motion
deblurring from a single image. ACM Trans. Graph. 27, 3 (2008),
1–10. 1, 14
[SJZW] S HEN J., J IN X., Z HOU C., WANG C. C. L.: Gradient
based image completion by solving poisson equation. Computers
and Graphics, 119–126. 15
Tania Pouli, Douglas W. Cunningham and Erik Reinhard / Image Statistics and their Applications in Computer Graphics
[SK99] S AWHNEY H., K UMAR R.: True multi-image alignment
and its application to mosaicing and lens distortion correction.
IEEE Transactions on Pattern Analysis and Machine Intelligence
21, 3 (1999), 235–243. 5
[SMS78] S WITKES E., M AYER M. J., S LOAN J. A.: Spatial
frequency analysis of the visual environment: Anisotropy and
the carpentered environment hypothesis. Vision Research 18, 10
(1978), 1393–1399. 11, 13
[SP98] S IMONCELLI E. P., P ORTILLA J.: Texture characterization via joint statistics of wavelet coefficient magnitudes. In Proceedings of the 5th International Conference of Image Processing
(1998), vol. I, pp. 62–66. 18
[Ste97] S TEIN G.: Lens distortion calibration using point correspondences. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (1997), pp. 602–608. 5
[Stu05] S TURM P.: Multi-view geometry for general camera models. In Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition (June 2005), pp. 206–212. 5
[SWO84] S EKULER R., W ILSON H. R., OWSLEY C.: Structural
modeling of spatial vision. Vision Research 24, 7 (1984), 689–
700. 12
[Sze10] S ZELISKI R.: Computer Vision: Algorithms and Applications. Springer, 2010. 5, 6
[TAB∗ 03] T ELLER S., A NTONE M., B ODNAR Z., B OSSE M.,
C OORG S., J ETHWA M., M ASTER N.: Calibrated, registered
images of an extended urban area. International journal of computer vision (2003). 8
[TF97] T HOMSON M., F OSTER D.: Role of second- and thirdorder statistics in the discriminability of natural images. JOSA A
14 (1997), 2081 – 2090. 11
[van] VAN WALREE P.:
Spherical aberration.
http://
toothwalker.org/optics/spherical.html. 4
[van92] VAN H ATEREN J.: Theoretical predictions of spatiotemporal receptive fields. Journal of Comparative Physiology A 171
(1992), 151 – 170. 11
[van98a] VAN DER S CHAAF A.: Natural image statistics and visual processing. PhD thesis, Rijksuniversiteit Groningen, The
Netherlands, 1998. 1
[van98b] VAN DER S CHAAF A.: Natural image statistics and visual processing. 10
[vG05] VAN DE W EIJER J., G EVERS T.: Color constancy based
on the grey-edge hypothesis. In Proceedings of the International
Conference on Image Processing (2005). 23
[Vis07] V ISION C LUB OF F INLAND:
Electronic shuttering for high speed cmos machine vision applications.
http://www.automaatioseura.fi/jaostot/mvn/
mvn2007/parameter.html, 2007. 6
[VJ01] V IOLA P., J ONES M.: Robust real-time object detection.
International Journal of Computer Vision (2001). 18
[Vos85] VOSS R.: Random fractal forgeries. In Fundamental Algorithms for Computer Graphics, Earnshaw R. A., (Ed.).
Springer, Berlin, 1985, pp. 805–835. 12
[vv96] VAN DER S CHAAF A., VAN H ATEREN J.: Modelling the
power spectra of natural images: statistics and information. Vision Research 36, 17 (1996), 2759–2770. 3, 11, 13
[vv98] VAN H ATEREN J., VAN DER S CHAAF A.: Independent
component filters of natural images compared with simple cells
in primary visual cortex. Proceedings of the Royal Society of
London B 265 (1998), 359–366. 3, 8, 20
[Tho99] T HOMSON M. G. A.: Visual coding and the phase structure of natural scenes. Network: Computation in Neural Systems
10 (1999), 123–132. 15
[WA89] WATSON A. B., A HUMADA J R . A. J.: A hexagonal orthogonal oriented pyramid as a model of image representation in
visual cortext. IEEE Transactions on Biomedical Engineering
36, 1 (1989), 97–106. 16
[Tho01] T HOMSON M. G. A.: Beats, kurtosis and visual coding.
Network: Computation in Neural Systems 12 (2001), 271–287.
15
[Wat87] WATSON A. B.: Efficiency of an image code based on
human vision. Journal of the Optical Society of America A 4, 12
(1987), 2401–2417. 16
[TM01] TAUBMAN D., M ARCELLIN M.: JPEG2000: Image
Compression Fundamentals, Standards and Practice. Springer,
2001. 17
[Wat90] WATSON A. B.: Perceptual components architecture for
digital video. Journal of the Optical Society of America 7 (1990),
1943–1954. 16
[TO03] T ORRALBA A., O LIVA A.: Statistics of natural image
categories. Network: Computation in Neural Systems 14 (2003),
391–412. 11, 13
[Wat91] WATSON A. B.: Multidimensional pyramids in vision
and video. In Representations of Vision, Gorea A., (Ed.). Cambridge University Press, Cambridge, 1991, pp. 17–26. 16
[TP91a] T URK M., P ENTLAND A.: Eigenfaces for recognition.
Journal of cognitive neuroscience 3, 1 (1991), 71–86. 19
[WCH92] W ENG J., C OHEN P., H ERNIOU M.: Camera calibration with distortion models and accuracy evaluation. IEEE
Transactions on Pattern Analysis and Machine Intelligence 14,
10 (1992), 965–980. 5
[TP91b] T URK M., P ENTLAND A.: Face recognition using eigenfaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (1991), pp. 586–591. 19
[TSW∗ 05] TAYLOR R., S PEHAR B., W ISE J., C LIFFORD C.,
N EWELL B., H AGERHALL C., P URCELL T., M ARTIN T.: Perceptual and physiological responses to the visual complexity of
fractal patterns. Nonlinear Dynamics Psychol Life Sci 9 (2005),
89 –114. 10
[Wil94] W ILLSON R.: Modeling and calibration of automated
zoom lenses. PhD thesis, Pittsburgh, PA, USA, 1994. 2
[WM97] W EBSTER M., M IYAHARA E.: Contrast adaptation and
the spatial structure of natural images. Journal of the Optical
Society of America A 14 (1997), 2355–2366. 11, 13
[TT93] TADMOR Y., T OLHURST D. J.: Both the phase and the
amplitude spectrum may determine the appearance of natural images. Vision Research 33, 1 (1993), 141–145. 11
[ZL97] Z IEGAUS C., L ANG E. W.: Statistics of natural and urban
images. In Proceedings 7th International Conference on Artificial
Neural Networks (Berlin, 1997), vol. 1327 of Lecture Notes in
Computer Science, Springer-Verlag, pp. 219–224. 3, 22
[TT94] TADMOR Y., T OLHURST D.: Discrimination of changes
in the second order statistics of natural and synthetic images. Vision Research 34 (1994), 541 –554. 12
[ZL98] Z IEGAUS C., L ANG E. W.: Statistical invariances in artificial, natural and urban images. Z. Naturforsch 53a (1998),
1009–1021. 22
[TTC92] T OLHURST D. J., TADMOR Y., C HIAO T.: Amplitude
spectra of natural images. Ophthalmic and Physiological Optics
12 (1992), 229–232. 11
[ZLK10] Z HENG Y., L IN S., K ANG S.: Single-image vignetting
correction. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (2010), vol. 1, pp. 461–468. 5, 6
c The Eurographics Association 2010.