Statistical Regularities in Low and High Dynamic Range Images
Tania Pouli∗
University of Bristol
Douglas Cunningham†
Brandenburg Technical University Cottbus, Cottbus, Germany
Erik Reinhard‡
University of Bristol
[Figure 1 panels: Natural, Manmade Day, Manmade Indoors, Manmade Night, HDR Photographic Survey]
Figure 1: Sample images from each of the data sets used.
Abstract
Computer graphics and related disciplines often benefit from
understanding the human visual system and its input. In this paper, we study statistical regularities of both conventional and
high dynamic range images, and find that several commonly held
assumptions regarding natural image statistics do not directly apply to
high dynamic range data. These results have implications both for
the study of human vision and for the design of algorithms in
computer graphics and computer vision that rely on such statistical
regularities.
CR Categories: I.2.10 [Vision and Scene Understanding]: Modeling and Recovery of Physical Attributes— [G.3]: Probability and
Statistics—Statistical Computing
Keywords: Natural Image Statistics, High Dynamic Range
1 Introduction
A recent trend in computer graphics and related fields is to reformulate algorithms as optimization problems. To ensure plausible
solutions are found, the optimization process is guided by priors.
In a rising number of applications, images are reconstructed, augmented, or generally improved, and in these cases priors are often
based on natural image statistics. The premise is that by using statistical regularities in natural images, the solutions found to these
problems tend towards naturally occurring images. Examples of problems that can be solved in this manner are image denoising [Simoncelli 1999a; Portilla and Simoncelli 2000], blind deconvolution [Levin 2007; Shan et al. 2008], and inpainting [Levin
et al. 2003].
Of course, the quality of the results depends on how closely the
models of natural image statistics actually model the real world.
In this paper we are concerned with better understanding statistical regularities in images and in particular the differences stemming from the quantization and limited dynamic range offered by
conventional capturing techniques. We will show for the first time
that significant non-trivial differences exist between conventional
low dynamic range images (LDR) and high dynamic range images
(HDR). These differences mean that commonly held assumptions
regarding natural image statistics do not extend to HDR imaging.
This has two implications. First, for engineering-based disciplines
this means that existing priors for LDR images cannot be used directly for HDR applications. Second, for human vision research
our findings indicate that our world may have statistics that differ
significantly from the statistics captured by existing conventional
image ensembles, requiring a rethinking of what it is that human
vision perceives.
Further, we created collections of images with different general
content. In particular, we are interested in assessing whether there
exist significant differences between indoor and outdoor scenes,
manmade versus natural, and in particular, night versus day images.
Here we leverage the unique possibilities afforded by HDR imaging, namely that we are able to capture scenes that contain light
sources. Once more we find significant differences across a range
of image statistics, suggesting that situation-specific priors could be
more effective than global priors.
First, we give a brief overview of the types of statistical regularities that have been discovered so far (Section 2) and discuss our
data collection procedure (Section 3). We then explore a range of
statistics, from first order central moments (Section 4), spectral and
phase analysis (Sections 5, 6) to gradients (Section 7) and wavelet
statistics (Section 8). Finally, we discuss the main themes in our
findings and draw conclusions in Sections 9 and 10.
2 Previous Work
Previous work on the statistics of natural images is often accompanied by the explicit assumption that the human visual system is in
some way optimized for its input and that knowledge about regularities in real world images might provide insights into how the human visual system is organized. The regularities can be elucidated
in a number of ways [Pouli et al. 2010]. The simplest analyses treat
each pixel independently, such as intensity or contrast distributions.
While this class of statistics provides insights into the visual system
(such as contrast adaptation), it cannot reveal much about the structure of a scene.
Far more common are second order statistics, which measure the
relationship between pairs of pixels. Nearly every paper on natural
image statistics reports an analysis of the power spectrum, with particular emphasis on the relationship between power and frequency
(for a more detailed discussion, see Section 5). While the spectra of individual images can vary considerably, the average spectra
are generally quite smooth and follow a power law. The average
spectral slope varies from 1.8 to 2.4, with most values clustering
around 2.0 [Burton and Moorhead 1987; Field 1987; van Hateren
and van der Schaaf 1998; van Hateren 1992; Reinhard et al. 2004;
Ruderman and Bialek 1994; Tolhurst et al. 1992].
Higher-order statistics, such as Principal Component Analysis
(PCA), Independent Component Analysis (ICA), or wavelets have
been applied to ensembles of natural images [Bell and Sejnowski
1997b; Bell and Sejnowski 1997a; Field 1993; Field 1999; Hancock
et al. 1992; van Hateren and van der Schaaf 1998; Hurri et al. 1996;
Hurri et al. 1997; Olshausen and Field 1996; Simoncelli 1999b].
Van Hateren [1998], for example, ran ICA on over 4000 natural images and found that the results matched the spatial frequency and
orientation tuning bandwidths as well as the aspect ratio and length
of the receptive fields of simple cells. Likewise, Olshausen and
Field [1996] demonstrated that when specific constraints (such as
sparseness) are used, PCA produces receptive fields that resemble
simple cells. Simoncelli [1999b] showed that natural image statistics have significant dependence between wavelet coefficients, and
that this has implications for the visual system.
A comprehensive set of analyses was performed by Huang and Mumford [1999b], who computed a wide variety of statistics ranging
from simple to higher-order for the 4000 images of the van Hateren
database [van Hateren and van der Schaaf 1998]. Subsequently,
Huang and Mumford [1999a] examined the power spectra of a
smaller database (214 images) that had been painstakingly segmented into regions representing 11 different object categories.
One of their central findings is that there are variations in spectral
slopes for these within-image categories. This variation by category
has also been reported for different scene types [Torralba and Oliva
2003; Webster and Miyahara 1997]. Webster and Miyahara [1997],
for example, analyzed the power spectra and rms-contrast of 48
natural scenes. They found significant differences in both spectral
slope and contrast across three scene types (2.15, 2.23, and 2.4 for
the forest, close-up, and distant meadow scenes, respectively).
With one exception, all of the work on natural scenes has been performed on LDR images. Dror et al. [2001] examined the statistics of
HDR illumination maps. They found that many of the previously
identified regularities in LDR images can be found in HDR, with
two main differences. First, the wavelet fits were not as good. Second, and more critically, there were differences in the power spectra. When the pixel value was proportional to log luminance, the
spectral slope was roughly 2.0. When the pixel values were linear
in luminance, the presence of local illumination sources dominated
the power spectrum, altering it so that it was no longer easily described by a simple power law. Since the LDR and HDR images
were not from the same data set (and differed in scene type as well
as in the camera's angle of view), it is unclear whether the differences are
really due to HDR characteristics.
3 Image Ensembles
Our data sets capture a variety of environments and conditions as
well as differences between high dynamic range and traditional images.
More specifically, images represent either natural or manmade environments with the latter containing daylight, night and
indoors scenes. Additionally, each data set consists of both HDR
and LDR images. In addition to the above data sets, we analysed
a subset of images from the HDR Photographic Survey [Fairchild
2007] that were radiometrically calibrated using the provided measurements where available1 . Only HDR images were available for
these scenes. Sample images from all ensembles are shown in Figure 1, while Table 1 lists the number of images per data set.
Data Set                   Number of Images
Natural                    95
Manmade Day                240
Manmade Indoors            125
Manmade Night              52
HDR Photographic Survey    58
Table 1: Number of images for each data set.
Images for our four data sets were captured using a Nikon D2H
with a Nikkor 24-120mm lens and a Nikon D300 with a Nikkor
18-200mm lens and saved as sRGB JPEG files. For the D2H, the
resolution of the images was 2464 x 1632 while the resolution of
the D300 images was 4288 x 2848. In the latter case, images were
downsampled to match the resolution of the D2H along the largest
axis.
Natural scene images were collected during daytime, under different weather conditions and covering a variety of landscapes. Manmade scenes were divided into three categories: day outdoors, night
outdoors and day indoors scenes. Light sources in the night and indoors scenes were captured where present. However, in all daytime
scenes we avoided capturing the sun directly to avoid over-exposed
regions in the HDR images.
Each scene was captured using 9 exposures spaced 1EV apart. The
exposure settings were selected such that on average the middle
exposure minimized the number of over and under-exposed pixels
present. The LDR data sets were formed by linearizing the middle
exposures according to the sRGB non-linearity and saving in a lossless format. Although for some scenes, this single exposure was
sufficient for capturing all the details in the scene, in most cases,
some over- and under-exposed pixels were present in all exposures.
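The linearization of the middle exposure can be illustrated with the standard inverse sRGB transfer function. This is a minimal sketch, not the pipeline used for the data sets: the function names `srgb_to_linear` and `linearize_8bit` are hypothetical, and 8-bit input is assumed.

```python
import numpy as np

def srgb_to_linear(encoded):
    """Invert the sRGB non-linearity for encoded values in [0, 1]."""
    encoded = np.asarray(encoded, dtype=np.float64)
    low = encoded / 12.92                        # linear segment near black
    high = ((encoded + 0.055) / 1.055) ** 2.4    # power-law segment
    return np.where(encoded <= 0.04045, low, high)

def linearize_8bit(pixels):
    """Assumed 8-bit input: scale to [0, 1], then linearize."""
    return srgb_to_linear(np.asarray(pixels, dtype=np.float64) / 255.0)
```

The output is linear in scene luminance only up to the clipping and quantization of the single exposure, which is the limitation the LDR data sets are meant to exhibit.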
HDR images were created by merging the 9 calibrated exposures
for each scene using hdrgen1 with image alignment enabled. To
linearize the merged HDR images we used the camera response
curve derived using the techniques by Debevec and Malik [1997].
Imperfections in the camera or the lens give rise to artifacts and
distortions that can affect the statistics of scenes [Brady and Legge
2009]. All our data sets are corrected for geometric lens distortions,
chromatic aberration, vignetting and lens softness using DxO Optics Pro 2 with the appropriate lens and camera modules for our
specific setups. All exposures were calibrated separately before
merging them to HDR. Finally, all exposures were cropped to a
1024x1024 window that was centrally aligned to simplify the statistical analysis.
4 Histograms, Moments and Contrast
First order statistics are carried out directly on pixel values, and
include both histograms and central moments. For our datasets,
Figure 2 shows histograms computed in logarithmic space. Due to
binning of small values, the LDR sets show spikes towards the left
which should not be interpreted as artifacts of the image ensemble.
This is simply the result of taking the logarithm of discrete values.
The LDR datasets do show problems with over-exposed pixels, despite our image selection procedure for LDR exposures. This can
be seen as the spikes towards the right of the plots.

[Figure 2 panels: "Ensemble Histogram (LDR, linear)", density against log intensity; "Ensemble Histogram (HDR)", density against log luminance. Curves: Natural Day, Manmade Day, Manmade Indoors, Manmade Night, RIT.]
Figure 2: Histograms for our LDR and HDR datasets.

Data Set          LDR µ   LDR σ   HDR µ         HDR σ
Natural Day       54.5    69.7    66.4 × 10^4   15.1 × 10^5
Manmade Day       39.7    60.2    49.8 × 10^4   94.9 × 10^4
Manmade Indoors   23.1    47.6    28.7 × 10^3   11.4 × 10^4
Manmade Night     20.6    46.1    24.9 × 10^2   18.1 × 10^3
RIT               -       -       86.6 × 10^2   40.8 × 10^3
Table 2: Mean and standard deviation for our datasets.

[Figure 3 panels: Skew, LDR; Skew, HDR; Kurtosis, LDR; Kurtosis, HDR. Bars: Natural Day, Manmade Day, Manmade Indoors, Manmade Night, RIT.]
Figure 3: Skew and kurtosis for our image ensembles.

1 Courtesy Greg Ward
2 http://www.dxo.com/uk/photo/dxo optics pro/
The HDR datasets do not show these artifacts, indicating that, as
expected, over-exposure does not tend to occur in our data. We
note that the histograms for our day-lit ensembles are more or less
identical, indicating that the presence of man-made structures does
not significantly affect statistics. HDR images tend to show histograms with long tails. This is seen here as well, even though the
plots have a logarithmic horizontal axis. This is generally attributed
to the presence of bright light sources with limited spatial extent.
Our datasets can be further characterized by their central moments.
In natural image statistics, usually only the first four moments
(mean, standard deviation, skew, and kurtosis) are reported. The
means µ and standard deviations σ vary significantly between LDR
and HDR datasets, and are shown in Table 2. Both also vary significantly between datasets, and we note that the standard deviation
tends to be very large, pointing towards a wide range of image contrasts in each scene.
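The four moments named above can be sketched as follows. The text does not say whether the reported kurtosis is the raw fourth standardized moment or the excess kurtosis, so this sketch assumes the raw form (a Gaussian gives 3); the function name `first_four_moments` is hypothetical.

```python
import numpy as np

def first_four_moments(values):
    """Mean, standard deviation, skew and (non-excess) kurtosis of pixel values."""
    x = np.asarray(values, dtype=np.float64).ravel()
    mean = x.mean()
    std = x.std()                 # population standard deviation
    z = (x - mean) / std          # standardized values; undefined if std == 0
    skew = np.mean(z ** 3)        # third standardized moment
    kurtosis = np.mean(z ** 4)    # fourth standardized moment (assumption: not excess)
    return mean, std, skew, kurtosis
```

A few very bright pixels among many dark ones produce a large positive skew and kurtosis, which is the pattern discussed for the HDR ensembles.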
The skew and kurtosis results are compared in Figure 3. It is interesting to note that both skew and kurtosis are much higher for the
HDR datasets than their corresponding LDR ensembles, and indicate a decidedly non-Gaussian distribution of pixel values. While
the skew and kurtosis are highest for the Indoors and Night scenes
in the LDR ensembles, they are smallest for those same scenes in the HDR ensembles.
In particular, the calibrated RIT dataset shows a significantly higher
skew and kurtosis than any of the other sets.
The skew is positive for all ensembles, which is to be expected:
most pixels tend to be dark, with smaller clusters of pixels representing light sources. This is much more so for our HDR ensembles, which we attribute to the fact that light sources in LDR images
tend to be clipped and are therefore under-represented in conventional images.
The kurtosis measure is of considerable interest, as it relates to theories of sparse coding. If a distribution has a high positive kurtosis,
most of the values will be close to zero. Such sparseness could be
effectively employed by thresholding and removing the values near
zero. The remaining values could then be recoded more efficiently
with little loss of information [Thomson 1999]. This has computational [Barlow 1994] as well as metabolic advantages in human
vision [Field 1994]. We surmise that the significantly increased
kurtosis in HDR images means that real-world luminances are even
more amenable to sparse coding within the human visual system
than previously assumed.
Whether this translates to efficient computational compression
schemes remains to be seen, given that such schemes implicitly
rely on quantization to achieve high levels of compression, whereas
HDR images are not quantized.
Finally, we assess the magnitude of contrast in this section. The
overall contrast in an image can be captured by computing the ratio
between variance and mean, σ_i² / µ_i. If we do this for each image
individually, then we can compute ensemble statistics (mean and
standard deviation) on contrast, the results of which are given in
Table 3.
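The contrast measure and its ensemble statistics can be sketched as below. The function names are hypothetical, and the population variance is assumed since the text does not specify the estimator.

```python
import numpy as np

def image_contrast(luminance):
    """Ratio of variance to mean for one image."""
    lum = np.asarray(luminance, dtype=np.float64)
    return lum.var() / lum.mean()

def ensemble_contrast(images):
    """Mean and standard deviation of per-image contrast over an ensemble."""
    contrasts = np.array([image_contrast(img) for img in images])
    return contrasts.mean(), contrasts.std()
```

Note that this ratio is not scale-invariant: multiplying all pixel values by a constant multiplies the contrast by the same constant, so values are only comparable between ensembles expressed in the same units.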
We see that for the LDR ensembles the largest overall contrasts are
recorded for the indoors and night scenes. This can be attributed to
the fact that for such scenes it is likely that the best exposure is still
both over- and under-exposed at the same time. The HDR datasets
show different behavior, in that the night scenes have less contrast
on average than the other ensembles. This is to be expected, as with
the exception of the presence of relatively bright light sources, most
areas in a scene are only sparsely illuminated, yielding on average
low contrast.

Data Set          LDR µ   LDR σ   HDR µ         HDR σ
Natural Day       57.8    29.3    64.1 × 10^4   17.4 × 10^5
Manmade Day       65.0    42.7    68.8 × 10^4   22.5 × 10^5
Manmade Indoors   101.8   58.1    16.0 × 10^4   21.0 × 10^4
Manmade Night     95.5    44.5    28.1 × 10^3   52.9 × 10^3
RIT               -       -       51.1 × 10^3   21.5 × 10^4
Table 3: Mean and standard deviation of contrast.

To capture the relations between neighboring pixels in each data
set, we created joint log histograms over horizontal and vertical
neighbors. Each image was analyzed separately and the resulting
histograms were averaged for each data set. All histograms were
computed using the overall maximum and minimum values for the
whole set. Some representative histograms for our data sets are
shown in Figure 4. Correlations are greater along the diagonal,
showing that neighboring pixels tend to have similar values.

[Figure 4 panels: Natural Day, Manmade Day and Manmade Night, each for log HDR and log LDR data.]
Figure 4: Joint histograms for Natural, Manmade Day and Manmade Night data sets along horizontal neighbors.

5 Spectral Analysis
One of the most often reported natural image statistics is the average slope of the power spectrum. The power spectrum S(u, v) of an
N × N image is given by:
$$S(u, v) = \frac{|F(u, v)|^2}{N^2} \tag{1}$$
where F is the Fourier transform of the image, and (u, v) are pixel
indices in Fourier space. Two-dimensional frequencies can also be
represented with polar coordinates (f, θ), using u = f cos(θ) and
v = f sin(θ). When computing the power spectrum over a sufficiently large set of natural images, and averaging over all directions
θ, then on average the power S at frequency f has the following
relationship:
$$S(f) \propto A f^{-\alpha} \tag{2}$$
where α is the spectral slope and A is a constant that determines
overall image contrast. For natural images, the slope α tends to
lie around 2, which has several interesting implications. First, it
can be shown that this means that our natural environment is scale-invariant: we can move through our world and expect that the statistics of the images falling on the retina do not significantly change.
Second, the Wiener-Khintchine theorem states that the power spectrum and the autocorrelation function form a Fourier transform pair.
This means that the spectral slope of image ensembles can be interpreted as describing relations between pairs of pixels.

To compute the spectral slope for each image ensemble, we first
converted each color image to luminance using ITU-Recommendation
BT.709 (the standard used for sRGB and HDTV): L = 0.2126R +
0.7152G + 0.0722B. Then, the luminance images were weighted
with a Kaiser-Bessel window to avoid measuring boundary effects,
before removing the mean intensity. The windowed images were then
converted to the Fourier domain, where the power spectrum was computed, and the resulting values were averaged over all directions. This
results in a power value for each frequency. The lowest 256 frequencies were then selected, to which the power model was fitted,
leading to a spectral slope for each image.

The spectral slopes were averaged for each of our ensembles and
subjected to statistical analysis. The slopes and their variances are
listed in Table 4. Note that Dror et al. [2001] found that, due
to the presence of strong illumination sources, the spectral distribution of HDR images no longer follows the standard 1/f^α power
curve. Following their work, we have log-transformed our HDR
data, allowing a power curve to be fitted to the spectra.

Data Set          LDR α   LDR σ²   log HDR α   log HDR σ²
Natural Day       2.22    0.242    2.24        0.144
Manmade Day       2.29    0.153    2.34        0.099
Manmade Indoors   2.44    0.121    2.61        0.095
Manmade Night     2.47    0.249    2.68        0.152
RIT               -       -        2.36        0.126
Table 4: Spectral slopes α and their variances σ² for each of our
ensembles.
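The slope estimation procedure described in this section can be sketched as follows. Several details are assumptions because the text does not state them: the Kaiser window parameter `beta`, the use of a separable (outer-product) 2D window, integer-radius binning for the directional average, and a least-squares line fit in log-log space. All function names are hypothetical.

```python
import numpy as np

def luminance_bt709(rgb):
    """BT.709 luminance from a linear RGB image of shape (H, W, 3)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]

def radial_power_spectrum(lum, beta=2.0):
    """Direction-averaged power spectrum of a square luminance image."""
    lum = np.asarray(lum, dtype=np.float64)
    n = lum.shape[0]
    assert lum.shape == (n, n), "square image expected"
    # Assumption: separable 2D Kaiser window with an arbitrary beta.
    window = np.outer(np.kaiser(n, beta), np.kaiser(n, beta))
    img = lum * window
    img = img - img.mean()                      # remove mean after windowing
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2 / n ** 2
    ky, kx = np.indices((n, n)) - n // 2        # frequency coordinates
    radius = np.rint(np.hypot(kx, ky)).astype(int)
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    return sums / counts                        # index = cycles per image

def spectral_slope(lum, n_freqs=256, beta=2.0):
    """Fit S(f) ~ A f^(-alpha) on the lowest frequencies; return alpha."""
    spectrum = radial_power_spectrum(lum, beta)
    f = np.arange(1, min(n_freqs, lum.shape[0] // 2) + 1)
    slope, _ = np.polyfit(np.log(f), np.log(spectrum[f]), 1)
    return -slope
```

For the HDR ensembles, the logarithm of the luminance would be passed to `spectral_slope` instead of the luminance itself, following the log transform described above.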
Using two-tailed, independent measures t-tests [Hayes 1988] we
find that the difference in spectral slope between the HDR and LDR
datasets is significant (t(1022)=3.703, p<0.001, LDR slope 2.33,
HDR slope 2.42), meaning that findings obtained for LDR data do
not extend to HDR data. Further, we find a significant difference between the manmade and natural datasets (t(668)=2.644, p<0.008,
Manmade slope 2.32, Natural slope 2.23).
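The reported degrees of freedom are consistent with a pooled-variance test: the four LDR sets in Table 1 total 512 images, as do the four HDR sets, and 512 + 512 − 2 = 1022. A minimal sketch of that statistic is given below; the function name `independent_t` is hypothetical, and the p-value computation is omitted.

```python
import numpy as np

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    na, nb = a.size, b.size
    df = na + nb - 2
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / df
    t = (a.mean() - b.mean()) / np.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df
```

Here `a` and `b` would hold the per-image spectral slopes of the two groups being compared. If SciPy is available, `scipy.stats.ttest_ind` with its default equal-variance setting computes the same statistic together with a two-tailed p-value.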
Comparing LDR versus HDR in each of our datasets, we find that
the Manmade Night and Indoors datasets show a significant difference between LDR and HDR, whereas the Manmade Day set
shows only a marginal difference (Table 5). No statistical difference between LDR and HDR was found for the Natural Day ensemble. In general, we find that HDR exponents tend to be higher than
their LDR counterparts. The LDR slopes for Night and Indoors appear at the extreme upper end of the range normally reported (i.e.
over 2.4).
Data Set          t-test           Signif.   Slopes (LDR, log HDR)
Natural Day       t(188) = 0.390   > 0.69    2.22, 2.24
Manmade Day       t(478) = 1.683   > 0.09    2.29, 2.34
Manmade Indoors   t(248) = 3.977   < 0.00    2.44, 2.61
Manmade Night     t(102) = 2.362   < 0.02    2.47, 2.68
Table 5: Comparison between spectral slopes of the LDR and log
HDR ensembles
As a result, we conclude that statistics collected on LDR scenes in
previous studies remain valid. However, these results do not extend
to indoors or night scenes, which show significantly larger spectral
slopes. Moreover, those ensembles not only show significant differences between LDR and HDR equivalents, but are also significantly different from the manmade and natural day ensembles. This
means that for indoor and night scenes in particular, HDR imaging
is a necessary tool for statistical analysis. Moreover, the values reported for day scenes do not necessarily translate to other visual
conditions. This suggests that in computer graphics applications
spectral image priors should be designed with specific viewing environments in mind.
6 Phase Analysis
It can be argued that although statistical regularities are present in
power spectra of natural images, much of the perceptually relevant
information is encoded in phase spectra [Thomson 1999]. To gain
access to phase information without polluting the results with first
and second order information, we can whiten and center the images
first. This amounts to adjusting the spectral slope to become flat and
setting the DC component to zero. As a consequence the mean of
the resulting image will be zero. Whitening images and analyzing
the phase spectrum amounts to an assessment of spatial structure.
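One way to whiten and center an image is sketched below. It is an assumption about the procedure, not a description of the implementation used here: every Fourier amplitude is set to one while the phase is kept, which flattens the spectrum exactly rather than only correcting the fitted slope. The function name `whiten` and the guard `eps` are hypothetical.

```python
import numpy as np

def whiten(image, eps=1e-12):
    """Flatten the amplitude spectrum, keep the phase, and zero the DC term."""
    img = np.asarray(image, dtype=np.float64)
    spectrum = np.fft.fft2(img)
    magnitude = np.abs(spectrum)
    flat = spectrum / np.maximum(magnitude, eps)   # unit amplitude, original phase
    flat[0, 0] = 0.0                               # DC component set to zero
    return np.real(np.fft.ifft2(flat))             # real up to rounding error
```

Skew and kurtosis computed on the output then depend only on the phase spectrum, since all first and second order information has been equalized.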
The results of computing skew and kurtosis on the whitened image
ensembles are shown in Figure 5. The whitened kurtosis effectively
measures the sparseness of the phase information. As with the kurtosis computed directly on images, we see that the whitened kurtosis is much higher for high dynamic range images, and in particular
for the calibrated RIT ensemble. In this case, we conclude that spatial structure is sparse, in real-world scenes even more so than can
be captured with conventional images.

[Figure 5 panels: Whitened Skew, LDR; Whitened Skew, HDR; Whitened Kurtosis, LDR; Whitened Kurtosis, HDR. Bars: Natural Day, Manmade Day, Manmade Indoors, Manmade Night, RIT.]
Figure 5: Skew and kurtosis after whitening encode phase information.

7 Gradients
To capture statistical regularities in the relations between pairs of
pixels, the distribution of gradients in images can be analyzed. We
compute the gradient D of the log luminance values for our image
ensembles as follows:
$$D(i, j) = \ln L(i, j) - \ln L(i, j + 1) \tag{3}$$
where L(i, j) is the luminance for pixel (i, j). Images from each
ensemble are normalized according to the maximum value within
the whole ensemble. As the images are only calibrated up to a
constant, this normalization makes results from different data sets
comparable.

Both horizontal and vertical gradients are computed and the resulting distributions are shown in Figure 7. Our findings for the LDR
ensembles are qualitatively similar to previous findings in the literature and closely match the Laplacian distribution reported by
Huang and Mumford [1999b]. Images consist of mostly smooth
areas, causing the majority of gradients to fall at or very near
zero, creating a distinct peak. As structures in images tend to have
starting and ending boundaries of similar contrast, the distribution
of gradients is symmetric about the central peak.

The HDR gradient distributions, albeit similar at a high level to
their LDR equivalents, display some interesting properties. As
mentioned earlier, all images are normalized within each ensemble.
Since our gradient distributions are computed from logged luminance values, they essentially correspond to ratios in linear space
and therefore contrasts. HDR images can capture regions of widely
different luminance values, causing in turn extreme contrasts to appear.

As an example, consider the scene in Figure 6. The left image is
linearly compressed while the one on the right is tonemapped using the photographic operator [Reinhard et al. 2002]. Luminance
values corresponding to portions of the scene outside the window
are several orders of magnitude higher than pixels from the window frame. Nevertheless, apart from these extreme areas, neighbouring pixels in most parts of the image result in relatively small
contrasts, no larger than what would be encountered in an LDR
image. Consequently, the gradient distributions of the HDR ensembles acquire tails similar to those seen in their luminance histograms
(Section 4).

Figure 6: The left image is linearly compressed while the right is
tonemapped such that more details of the scene are visible. Portions
of the image within the window have much higher luminance values
than nearby pixels on the window frame.

Huang and Mumford [1999b] propose two different
functional forms to model the gradient distributions that they found;
one corresponding to the t-distribution and the second to a generalized Laplacian. These are defined as follows:
Model 1:
$$f(x) = \frac{1}{Z} \frac{1}{(1 + x^2/s^2)^t} \tag{4}$$
Model 2:
$$f(x) = \frac{1}{Z} e^{-|x/s|^t} \tag{5}$$
where f(x) is the density function for the distribution D (Equation
3) and Z is chosen such that the integral of f(x) is 1, leaving
s and t as free parameters.

[Figure 7 panels: HDR Natural Day, HDR Manmade Day, HDR Manmade Indoors, HDR Manmade Night, LDR Natural Day, LDR Manmade Day, LDR Manmade Indoors, LDR Manmade Night; probability (ln f(x)) against gradients. Curves: dx, Model 1 dx, Model 2 dx, dy, Model 1 dy, Model 2 dy.]
Figure 7: Gradient distributions for LDR and HDR ensembles for the four data sets. The best fits using the two models proposed in [Huang
and Mumford 1999b] for the horizontal and vertical gradients are also shown.
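The horizontal log-luminance gradient of Equation 3, and its vertical counterpart, can be sketched as follows. The function name `log_gradients` and the `floor` guard against zero-valued pixels are assumptions. The ensemble normalization is omitted in this sketch because a constant scale factor cancels in a difference of logarithms.

```python
import numpy as np

def log_gradients(luminance, floor=1e-6):
    """Horizontal (dx) and vertical (dy) differences of log luminance."""
    lum = np.asarray(luminance, dtype=np.float64)
    log_lum = np.log(np.maximum(lum, floor))    # floor avoids log(0); an assumption
    dx = log_lum[:, :-1] - log_lum[:, 1:]       # ln L(i, j) - ln L(i, j + 1)
    dy = log_lum[:-1, :] - log_lum[1:, :]       # ln L(i, j) - ln L(i + 1, j)
    return dx, dy
```

A density estimate for an ensemble could then be obtained by pooling `dx` or `dy` over all images and histogramming the result, for example with `np.histogram(values, bins=..., density=True)`.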
To evaluate the differences between the various image ensembles
as well as between LDR and HDR, we used least squares to fit the
two proposed models to the resulting distributions. Table 6 and Figure 7 show the results of the least squares fitting for both models
to ln f (x) for each ensemble for LDR and HDR images.
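A least squares fit of the two models to ln f(x) can be sketched as below. The optimizer is not specified in the text, so a plain grid search is assumed here. The closed-form normalization constants are standard integrals rather than values taken from the source: Z = s √π Γ(t − 1/2) / Γ(t) for Model 1 (valid for t > 1/2) and Z = 2 s Γ(1/t) / t for Model 2. All function names are hypothetical.

```python
import math
import numpy as np

def log_model1(x, s, t):
    """ln f(x) for f(x) = (1/Z) (1 + x^2/s^2)^(-t), with t > 1/2."""
    z = s * math.sqrt(math.pi) * math.gamma(t - 0.5) / math.gamma(t)
    return -t * np.log1p((np.asarray(x) / s) ** 2) - math.log(z)

def log_model2(x, s, t):
    """ln f(x) for f(x) = (1/Z) exp(-|x/s|^t)."""
    z = 2.0 * s * math.gamma(1.0 / t) / t
    return -np.abs(np.asarray(x) / s) ** t - math.log(z)

def fit_grid(x, log_f, log_model, s_grid, t_grid):
    """Return the (s, t) on the grid minimizing the squared error in ln f(x)."""
    best = None
    for s in s_grid:
        for t in t_grid:
            err = np.sum((log_model(x, s, t) - log_f) ** 2)
            if best is None or err < best[0]:
                best = (err, s, t)
    return best[1], best[2]
```

Because the error is measured on ln f(x), the sparsely populated tails carry as much weight per bin as the central peak, so a fit of this kind is sensitive to how the large gradients are binned.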
As can be seen, these models fit the steep part of the distributions
relatively well, but do not predict the very large gradients present in
the HDR ensemble distributions. If required, this difference can be
rectified by using either model in combination with a very shallow
linear part to account for the large gradients.
8 Wavelet Statistics
Wavelets are a useful tool for statistical image analysis as they provide a compact way to encode relations between image structures
at different orientations and scales. We chose Haar wavelets for our
work as they are easy to compute and previous work has shown no
significant difference between Haar wavelets and more complicated
bases [Dror et al. 2001].
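A Haar decomposition into horizontal, vertical and diagonal detail bands can be sketched with 2 × 2 block sums and differences. This is a minimal orthonormal variant under stated assumptions: image dimensions must be even at every level, the band naming follows one common convention (other libraries swap horizontal and vertical), and the function names are hypothetical.

```python
import numpy as np

def haar_level(image):
    """One level of the 2D Haar transform (orthonormal scaling)."""
    a = np.asarray(image, dtype=np.float64)
    tl, tr = a[0::2, 0::2], a[0::2, 1::2]       # top-left, top-right of each 2x2 block
    bl, br = a[1::2, 0::2], a[1::2, 1::2]       # bottom-left, bottom-right
    approx = (tl + tr + bl + br) / 2.0
    horizontal = (tl - tr + bl - br) / 2.0      # differences between columns
    vertical = (tl + tr - bl - br) / 2.0        # differences between rows
    diagonal = (tl - tr - bl + br) / 2.0
    return approx, horizontal, vertical, diagonal

def haar_pyramid(image, levels=3):
    """Detail bands for the finest `levels` scales, plus the final approximation."""
    current = np.asarray(image, dtype=np.float64)
    details = []
    for _ in range(levels):
        current, h, v, d = haar_level(current)
        details.append((h, v, d))
    return details, current
```

Coefficient distributions per scale and orientation would then be histograms of the corresponding bands, pooled over all images in an ensemble.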
This can be attributed to the small regions of extreme contrast that
are present in at least some of the HDR images.
Figure 8 shows the horizontal coefficients for the HDR datasets.
Although qualitatively similar, wavelet analysis also gives rise to
differences between the different scene types, which can be seen
especially in the coarser scales. The coefficient distributions for
the Natural Day ensemble (Figure 8) are narrower than the corresponding distributions for manmade scenes. From that we can conclude that manmade scenes show less correlation between pixel values over larger distances. Since the coefficient densities shown are
logarithmically compressed these irregularities only affect a small
number of pixels. Nevertheless, they open up interesting questions
regarding compression of different types of HDR scenes.
9
Discussion
We decompose the luminance channels of all images to their horizontal, vertical and diagonal components and compute the average
distribution of coefficients for each orientation for all our data sets.
Figure 9 shows the coefficient distributions for all three orientations
for the natural dataset for both LDR and HDR ensembles. In both
cases, the luminance values were normalized according to the maximum value within each ensemble and logarithmically compressed.
Our interest is in natural image statistics for the purpose of better
understanding the input to the human visual system as well as being able to apply the results to applications in computer graphics
and related fields. The hypothesis is that LDR images may not necessarily represent scene statistics in the same manner that retinal
images would. By employing HDR technology, we have assembled a collection of LDR and HDR image ensembles for the purpose of cataloguing similarities and differences between LDR and
HDR images. These were then used to assess a range of natural
image statistics, revealing some tell-tale differences between LDR
and HDR ensembles as well as between different scene types.
The distribution corresponding to the finest scale is comparable to
the logarithmic gradient distributions presented in Section 7. This is
not a surprising result as Haar wavelets operate in a similar way to
image gradients, albeit at different orientations and scales. Coarser
scales exhibit higher variance and indicate that wavelet coefficients
cover a progressively larger area. Note that, as was the case with
the gradient distributions, the coefficient distributions for the finest
scale of the HDR data appear narrower than the LDR equivalent.
It is well known that clipping exists in LDR images, which is reflected in histograms and their moments. Histograms of HDR sets
show a much higher skew and kurtosis as a consequence of the relatively small number of extremely bright portions of the images. An
interesting observation is that the differences between the HDR ensembles are not mirrored in the LDR ensembles. In particular, both
the Manmade and the Natural Day scenes result in more skewed
distributions than the Indoors or Night scenes in the case of HDR
LDR Data Fitting
Model 1
dx
Natural Day
Manmade Day
Manmade Indoors
Manmade Night
s
0.37
0.34
0.21
0.21
dy
t
2.70
2.61
2.17
2.13
HDR Data Fitting
Model 2
s
0.38
0.41
0.23
0.20
Model 1
dx
t
2.78
2.79
2.28
2.05
s
0.03
0.03
0.01
0.02
t
0.53
0.51
0.45
0.47
s
0.04
0.04
0.02
0.02
dy
t
0.549
0.54
0.47
0.45
dx
s
0.24
0.25
0.27
0.17
Model 2
dy
t
3.44
3.07
3.05
2.53
s
0.26
0.27
0.31
0.21
dx
t
3.47
3.16
3.22
2.71
s
0.01
0.01
0.02
0.01
dy
t
0.52
0.49
0.55
0.43
s
0.01
0.01
0.02
0.01
t
0.53
0.49
0.55
0.46
Log Density
Table 6: Data fitting parameters for the LDR and HDR ensembles using the models from Equations 4 and 5.
[Figure 8 plots. Axes: Horizontal Coefficient (x), Log Density (y). Labels: Natural Day, Manmade Day, Manmade Indoors, Manmade Night; Scale 1, Scale 2, Scale 3.]
Figure 8: The horizontal coefficients for all the HDR ensembles are shown for the three finest scales.
imagery, while the opposite effect arises in the LDR case.
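The effect of a few extremely bright pixels on histogram moments can be sketched as follows. This is an illustration on synthetic data, not a reproduction of our measurements: the pixel values, the clipping threshold and the use of the raw (non-excess) fourth standardised moment as kurtosis are all assumptions.

```python
import numpy as np

def skew_kurtosis(values):
    """Third and fourth standardised moments (kurtosis is not excess kurtosis)."""
    x = np.asarray(values, dtype=float).ravel()
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3)), float(np.mean(z ** 4))

rng = np.random.default_rng(1)

# Synthetic "HDR" luminance: mostly moderate values plus a few very bright pixels.
hdr = rng.uniform(0.0, 1.0, size=100_000)
hdr[:100] = 1000.0  # 0.1% of pixels, e.g. a visible light source (assumed value)

# Crude stand-in for an LDR capture: everything above 1.0 is clipped.
ldr = np.clip(hdr, 0.0, 1.0)

hdr_skew, hdr_kurt = skew_kurtosis(hdr)
ldr_skew, ldr_kurt = skew_kurtosis(ldr)
```

In this toy case the clipped data lose almost all of their skew and kurtosis, because the moments of the unclipped data are dominated by the small bright region.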
Spectral analysis also gives rise to differences between the LDR
and HDR sets. However, our t-tests show that these differences
are only significant for the Manmade Night and Indoors datasets as
discussed in Section 5. After whitening the images by flattening
the spectral slope, the skew and kurtosis of the ensembles were recomputed, allowing phase information to be captured. Similar to
the un-whitened distributions, HDR sets result in much higher skew
and kurtosis values compared to the equivalent LDR sets.
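Whitening by flattening the spectral slope can be sketched as below, assuming an amplitude spectrum that falls off as 1/f^alpha. The helper names, the handling of the DC term and the choice of a single known alpha are assumptions made for this illustration; in practice alpha would be estimated from the data.

```python
import numpy as np

def radial_frequency(shape):
    """Magnitude of the 2D spatial frequency for every FFT bin."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0  # placeholder at DC so that scaling by f is well defined
    return f

def whiten(img, alpha):
    """Flatten a 1/f**alpha amplitude spectrum; the phase spectrum is left untouched."""
    spectrum = np.fft.fft2(img - img.mean())
    scaled = spectrum * radial_frequency(img.shape) ** alpha
    return np.real(np.fft.ifft2(scaled))

# Build a synthetic image with a 1/f amplitude spectrum from white noise.
rng = np.random.default_rng(2)
white = rng.normal(size=(128, 128))
white -= white.mean()
pink = np.real(np.fft.ifft2(np.fft.fft2(white) / radial_frequency(white.shape)))

# Whitening with the matching slope recovers the original white-noise image.
recovered = whiten(pink, alpha=1.0)
```

Because the scaling factor is real and positive, only amplitudes change. Any skew or kurtosis remaining after whitening therefore reflects structure carried by the phase spectrum.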
To evaluate the relations between pairs of neighbouring pixels, gradient statistics were collected, showing that existing models predict the distribution of gradients in LDR ensembles well but cannot predict the tails that HDR gradient distributions show. Such tails are
the result of image portions with extreme contrasts and although
they correspond to only a small number of pixels, they are a crucial
characteristic of HDR imagery.
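A minimal sketch of how logarithmic gradient distributions can be collected is given below. The natural logarithm, the small offset `eps`, the forward-difference operator and the synthetic test image are assumptions for illustration and may differ from the procedure of Section 7.

```python
import numpy as np

def log_gradients(lum, eps=1e-6):
    """Forward differences of log luminance along x and y."""
    log_lum = np.log(lum + eps)
    dx = log_lum[:, 1:] - log_lum[:, :-1]
    dy = log_lum[1:, :] - log_lum[:-1, :]
    return dx, dy

def log_density(values, bins=101, value_range=(-15.0, 15.0)):
    """Histogram of gradient values as log probability density (empty bins give -inf)."""
    hist, edges = np.histogram(values, bins=bins, range=value_range, density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    with np.errstate(divide="ignore"):
        return centres, np.log(hist)

# Synthetic scene: low-contrast texture with one small, very bright region (assumed values).
rng = np.random.default_rng(3)
lum = rng.lognormal(mean=0.0, sigma=0.1, size=(64, 64))
lum[20:30, 20:30] *= 1e4

dx, dy = log_gradients(lum)
centres, logp = log_density(dx)

# Fraction of horizontal gradients with extreme magnitude.
tail_fraction = float(np.mean(np.abs(dx) > 5.0))
```

Here well under one percent of the gradients lie in the tail, yet they extend the distribution far beyond the range covered by the rest of the image.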
Our wavelet analysis confirms the observations arising from the
gradient statistics, also giving rise to differences between Natural and Manmade datasets. It was shown that the distribution of
wavelet coefficients for coarser scales is wider, a consequence of
smaller correlations between distant pixels in Manmade scenes.
10
Conclusion
In summary, we find that significant differences exist between LDR
and HDR image ensembles as well as between different types of
scenes, which cannot be traced back to our capture methods and
are therefore inherently present in the scenes we have collected.
The implications are twofold. First, this means that the input to the
human visual system is not quite the same as that represented by
LDR imagery. As a consequence, it may be necessary to rethink
how the human visual system is matched to its environment.
Second, engineering solutions which rely on statistical regularities
in images, for instance in the design of priors in optimization problems, may benefit from the use of high dynamic range imagery due
to its ability to control quantization and exposure artifacts. However, in that case priors ought to be designed specifically for HDR
imagery and, depending on the nature of the prior, for a specific type
of illumination environment. We have shown that it is not possible
to safely assume that existing natural image statistics extend to the
HDR case and therefore to real-world illumination.
References
BARLOW, H. B. 1994. What is the computational goal of the neocortex? In Large-Scale Neuronal Theories of the Brain, C. Koch, Ed. MIT Press.
BELL, A., AND SEJNOWSKI, T. 1997. The “independent components” of natural scenes are edge filters. Vision Research 37, 23, 3327–3338.
BELL, A., AND SEJNOWSKI, T. 1997. Edges are the ‘independent components’ of natural scenes. Advances in Neural Information Processing Systems 9, 831–837.
BRADY, M., AND LEGGE, G. 2009. Camera calibration for natural image studies and vision research. Journal of the Optical Society of America A 26, 1, 30–42.
BURTON, G., AND MOORHEAD, I. 1987. Color and spatial structure in natural scenes. Applied Optics 26, 1, 157–170.
DEBEVEC, P., AND MALIK, J. 1997. Recovering high dynamic range radiance maps from photographs. In SIGGRAPH ’97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, 369–378.
DROR, R., LEUNG, T., ADELSON, E., AND WILLSKY, A. 2001. Statistics of real-world illumination. In CVPR ’01: IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 164–171.
FAIRCHILD, M. 2007. The HDR photographic survey. In Proceedings of the Fifteenth Color Imaging Conference: Color Science and Engineering Systems, Technologies, and Applications, vol. 15, 233–238.
[Figure 9 plots. Axes: Horizontal, Vertical and Diagonal Coefficients (x), Log Density (y). Labels: LDR, HDR; Scale 1, Scale 2, Scale 3.]
Figure 9: Wavelet coefficient distributions for all three orientations. The three finest scales are shown here for both the LDR and HDR Natural datasets.
FIELD, D. 1987. Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A 4, 12, 2379–2394.
FIELD, D. J. 1993. Scale-invariance and self-similar ‘wavelet’ transforms: An analysis of natural scenes and mammalian visual systems. In Wavelets, fractals and Fourier transforms, 151–193.
FIELD, D. J. 1994. What is the goal of sensory coding? Neural Computation 6, 4, 559–601.
FIELD, D. 1999. Wavelets, vision and the statistics of natural scenes. Philosophical Transactions: Mathematical 357, 176, 2527–2542.
HANCOCK, P., BADDELEY, R., AND SMITH, L. 1992. The principal components of natural images. Network: Computation in Neural Systems 3, 1, 61–70.
VAN HATEREN, J. H., AND VAN DER SCHAAF, A. 1998. Independent component filters of natural images compared with simple cells in primary visual cortex. In Proceedings of the Royal Society B: Biological Sciences, vol. 265, 359.
VAN HATEREN, J. H. 1992. A theory of maximizing sensory information. Biological Cybernetics 68, 1, 23–29.
HAYES, W. L. 1988. Statistics. Harcourt Brace College Publishers, Orlando, FL.
HUANG, J., AND MUMFORD, D. 1999. Image statistics for the British Aerospace segmented database. MTPC preprint.
HUANG, J., AND MUMFORD, D. 1999. Statistics of natural images and models. In CVPR ’99: IEEE Conference on Computer Vision and Pattern Recognition, vol. 1.
HURRI, J., HYVÄRINEN, A., KARHUNEN, J., AND OJA, E. 1996. Image feature extraction using independent component analysis. In Proceedings of the IEEE Nordic Signal Processing Symposium (NORSIG), 475–478.
HURRI, J., HYVÄRINEN, A., AND OJA, E. 1997. Wavelets and natural image statistics. In Proceedings of the Scandinavian Conference on Image Analysis, vol. 1, 13–18.
LEVIN, A., ZOMET, A., AND WEISS, Y. 2003. Learning how to inpaint from global image statistics. In ICCV ’03: IEEE International Conference on Computer Vision, 305.
LEVIN, A. 2007. Blind motion deblurring using image statistics. Advances in Neural Information Processing Systems 19, 841–848.
OLSHAUSEN, B., AND FIELD, D. 1996. Natural image statistics and efficient coding. Network: Computation in Neural Systems 7, 333–339.
PORTILLA, J., AND SIMONCELLI, E. 2000. Image denoising via adjustment of wavelet coefficient magnitude correlation. In Proceedings of the 7th International Conference in Image Processing, 10–13.
POULI, T., CUNNINGHAM, D. W., AND REINHARD, E. 2010. Image statistics and their applications in computer graphics. In Eurographics State-of-the-Art Reports.
REINHARD, E., STARK, M., SHIRLEY, P., AND FERWERDA, J. 2002. Photographic tone reproduction for digital images. ACM Transactions on Graphics 21, 3, 267–276.
REINHARD, E., SHIRLEY, P., ASHIKHMIN, M., AND TROSCIANKO, T. 2004. Second order image statistics in computer graphics. In Proceedings of the 1st Symposium on Applied perception in graphics and visualization, 99–106.
RUDERMAN, D., AND BIALEK, W. 1994. Statistics of natural images: Scaling in the woods. Physical Review Letters 73, 6, 814–817.
SHAN, Q., JIA, J., AND AGARWALA, A. 2008. High-quality motion deblurring from a single image. ACM Transactions on Graphics 27, 3, 73.
SIMONCELLI, E. 1999. Bayesian denoising of visual images in the wavelet domain. Bayesian Inference in Wavelet Based Models 141, 291–308.
SIMONCELLI, E. 1999. Modeling the joint statistics of images in the wavelet domain. Proceedings of the SPIE 3813, 188–195.
THOMSON, M. 1999. Higher-order structure in natural scenes. Journal of the Optical Society of America A 16, 7, 1549–1553.
TOLHURST, D., TADMOR, Y., AND CHAO, T. 1992. Amplitude spectra of natural images. Ophthalmic and Physiological Optics 12, 2, 229–232.
TORRALBA, A., AND OLIVA, A. 2003. Statistics of natural image categories. Network: Computation in Neural Systems 14, 3, 391–412.
WEBSTER, M., AND MIYAHARA, E. 1997. Contrast adaptation and the spatial structure of natural images. Journal of the Optical Society of America A 14, 9, 2355–2366.