
Statistical regularities in low and high dynamic range images

2010

Computer graphics as well as related disciplines often benefit from understanding the human visual system and its input. In this paper, we study statistical regularities of both conventional as well as high dynamic range images, and find that several commonly held wisdoms regarding natural image statistics do not directly apply to high dynamic range data. These results have implications for both the study of human vision, as well as for the design of algorithms in computer graphics and computer vision that rely on such statistical regularities.

Tania Pouli, University of Bristol
Douglas Cunningham, Brandenburg Technical University Cottbus, Cottbus, Germany
Erik Reinhard, University of Bristol

[Figure 1: Sample images from each of the data sets used. Panels: Manmade Day, Manmade Indoors, Manmade Night, Natural, HDR Photographic Survey.]

CR Categories: I.2.10 [Vision and Scene Understanding]: Modeling and Recovery of Physical Attributes; G.3 [Probability and Statistics]: Statistical Computing

Keywords: Natural Image Statistics, High Dynamic Range

1 Introduction

A recent trend in computer graphics and related fields is to reformulate algorithms as optimization problems. To ensure plausible solutions are found, the optimization process is guided by priors. In a rising number of applications, images are reconstructed, augmented, or generally improved, and in these cases priors are often based on natural image statistics. The premise is that by using statistical regularities in natural images, the solutions found to these problems tend towards naturally occurring images. Examples of problems that can be solved in this manner are image denoising [Simoncelli 1999a; Portilla and Simoncelli 2000], blind deconvolution [Levin 2007; Shan et al. 2008], and inpainting [Levin et al. 2003].
Of course, the quality of the results depends on how closely the models of natural image statistics actually model the real world. In this paper we are concerned with better understanding statistical regularities in images and in particular the differences stemming from the quantization and limited dynamic range of conventional capturing techniques. We will show for the first time that significant non-trivial differences exist between conventional low dynamic range (LDR) images and high dynamic range (HDR) images. These differences mean that commonly held assumptions regarding natural image statistics do not extend to HDR imaging. This has two implications. First, for engineering-based disciplines this means that existing priors for LDR images cannot be used directly for HDR applications. Second, for human vision research our findings indicate that our world may have statistics that differ significantly from the statistics captured by existing conventional image ensembles, requiring a rethinking of what it is that human vision perceives.

Further, we created collections of images with different general content. In particular, we are interested in assessing whether there exist significant differences between indoor and outdoor scenes, manmade versus natural, and in particular, night versus day images. Here we leverage the unique possibilities afforded by HDR imaging, namely that we are able to capture scenes that contain light sources. Once more we find significant differences across a range of image statistics, suggesting that situation-specific priors could be more effective than global priors.

First, we give a brief overview of the types of statistical regularities that have been discovered so far (Section 2) and discuss our data collection procedure (Section 3).
We then explore a range of statistics, from first order central moments (Section 4), spectral and phase analysis (Sections 5, 6) to gradients (Section 7) and wavelet statistics (Section 8). Finally, we discuss the main themes in our findings and draw conclusions in Sections 9 and 10.

2 Previous Work

Previous work on the statistics of natural images is often accompanied by the explicit assumption that the human visual system is in some way optimized for its input and that knowledge about regularities in real world images might provide insights into how the human visual system is organized. The regularities can be elucidated in a number of ways [Pouli et al. 2010]. The simplest analyses treat each pixel independently, such as intensity or contrast distributions. While this class of statistics provides insights into the visual system (such as contrast adaptation), it cannot reveal much about the structure of a scene.

Far more common are second order statistics, which measure the relationship between pairs of pixels. Nearly every paper on natural image statistics reports an analysis of the power spectrum, with particular emphasis on the relationship between power and frequency (for a more detailed discussion, see Section 5). While the spectra of individual images can vary considerably, the average spectra are generally quite smooth and follow a power law. The average spectral slope varies from 1.8 to 2.4, with most values clustering around 2.0 [Burton and Moorhead 1987; Field 1987; van Hateren and van der Schaaf 1998; van Hateren 1992; Reinhard et al. 2004; Ruderman and Bialek 1994; Tolhurst et al. 1992].

Higher-order statistics, such as Principal Component Analysis (PCA), Independent Component Analysis (ICA), or wavelets have been applied to ensembles of natural images [Bell and Sejnowski 1997b; Bell and Sejnowski 1997a; Field 1993; Field 1999; Hancock et al. 1992; van Hateren and van der Schaaf 1998; Hurri et al.
1996; Hurri et al. 1997; Olshausen and Field 1996; Simoncelli 1999b]. Van Hateren [1998], for example, ran ICA on over 4000 natural images and found that the results matched the spatial frequency and orientation tuning bandwidths as well as the aspect ratio and length of the receptive fields of simple cells. Likewise, Olshausen and Field [1996] demonstrated that when specific constraints (such as sparseness) are used, PCA produces receptive fields that resemble simple cells. Simoncelli [1999b] showed that natural image statistics have significant dependence between wavelet coefficients, and that this has implications for the visual system. A comprehensive set of analyses is performed by Huang and Mumford [1999b], who computed a wide variety of statistics ranging from simple to higher-order for the 4000 images of the van Hateren database [van Hateren and van der Schaaf 1998]. Subsequently, Huang and Mumford [1999a] examined the power spectra of a smaller database (214 images) that had been painstakingly segmented into regions representing 11 different object categories. One of their central findings is that there are variations in spectral slopes for these within-image categories. This variation by category has also been reported for different scene types [Torralba and Oliva 2003; Webster and Miyahara 1997]. Webster and Miyahara [1997], for example, analyzed the power spectra and rms-contrast of 48 natural scenes. They found significant differences in both spectral slope and contrast across three scene types (2.15, 2.23, and 2.4 for the forest, close-up, and distant meadow scenes, respectively). With one exception, all of the work on natural scenes has been performed on LDR images. Dror et al [2001] examined the statistics of HDR illumination maps. They found that many of the previously identified regularities in LDR images can be found in HDR, with two main differences. First, the wavelet fits were not as good. 
Second, and more critically, there were differences in the power spectra. When the pixel value was proportional to log luminance, the spectral slope was roughly 2.0. When the pixel values were linear in luminance, the presence of local illumination sources dominated the power spectrum, altering it so that it was no longer easily described by a simple power law. Since the LDR and HDR images are not from the same data set (and differ in scene type as well as size of camera angle), it is unclear whether the differences are really due to HDR characteristics.

3 Image Ensembles

Our data sets capture a variety of environments and conditions as well as differences between high dynamic range and traditional images. More specifically, images represent either natural or manmade environments, with the latter containing daylight, night and indoors scenes. Additionally, each data set consists of both HDR and LDR images.

In addition to the above data sets, we analysed a subset of images from the HDR Photographic Survey [Fairchild 2007] that were radiometrically calibrated using the provided measurements where available.¹ Only HDR images were available for these scenes. Sample images from all ensembles are shown in Figure 1, while Table 1 lists the number of images per data set.

Data Set                   Number of Images
Natural                    95
Manmade Day                240
Manmade Indoors            125
Manmade Night              52
HDR Photographic Survey    58

Table 1: Number of images for each data set.

Images for our four data sets were captured using a Nikon D2H with a Nikkor 24-120mm lens and a Nikon D300 with a Nikkor 18-200mm lens and saved as sRGB JPEG files. For the D2H, the resolution of the images was 2464 x 1632 while the resolution of the D300 images was 4288 x 2848. In the latter case, images were downsampled to match the resolution of the D2H along the largest axis. Natural scene images were collected during daytime, under different weather conditions and covering a variety of landscapes.
Manmade scenes were divided into three categories: day outdoors, night outdoors and day indoors scenes. Light sources in the night and indoors scenes were captured where present. However, in all daytime scenes we avoided capturing the sun directly to avoid over-exposed regions in the HDR images.

Each scene was captured using 9 exposures spaced 1 EV apart. The exposure settings were selected such that on average the middle exposure minimized the number of over- and under-exposed pixels present. The LDR data sets were formed by linearizing the middle exposures according to the sRGB non-linearity and saving in a lossless format. Although for some scenes this single exposure was sufficient for capturing all the details in the scene, in most cases some over- and under-exposed pixels were present in all exposures. HDR images were created by merging the 9 calibrated exposures for each scene using hdrgen¹ with image alignment enabled. To linearize the merged HDR images we used the camera response curve derived using the techniques by Debevec and Malik [1997].

Imperfections in the camera or the lens give rise to artifacts and distortions that can affect the statistics of scenes [Brady and Legge 2009]. All our data sets are corrected for geometric lens distortions, chromatic aberration, vignetting and lens softness using DxO Optics Pro² with the appropriate lens and camera modules for our specific setups. All exposures were calibrated separately before merging them to HDR. Finally, all exposures were cropped to a centrally aligned 1024x1024 window to simplify the statistical analysis.

4 Histograms, Moments and Contrast

First order statistics are computed directly on pixel values, and include both histograms and central moments. For our datasets, Figure 2 shows histograms computed in logarithmic space.
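As a rough illustration of the sRGB linearization mentioned in Section 3 and of a histogram computed in logarithmic space, consider the sketch below. It is not the code used for our results: it assumes the standard piecewise sRGB transfer function, and the pixel codes, the bin width and the helper names `srgb_to_linear` and `log_histogram` are hypothetical choices made for this example.

```python
import math
from collections import Counter


def srgb_to_linear(c):
    """Invert the standard sRGB transfer function for a value c in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def log_histogram(values, bin_width=0.1):
    """Histogram of log10(value) using fixed-width bins; zeros are skipped.

    Returns a dict mapping each lower bin edge to the fraction of samples
    that fall into that bin.
    """
    logs = [math.log10(v) for v in values if v > 0]
    counts = Counter(math.floor(x / bin_width) for x in logs)
    total = len(logs)
    return {round(k * bin_width, 10): n / total
            for k, n in sorted(counts.items())}


# Hypothetical 8-bit pixel codes standing in for one LDR exposure.
codes = [1, 1, 2, 3, 8, 40, 128, 200, 255, 255]
linear = [srgb_to_linear(c / 255.0) for c in codes]

hist = log_histogram(linear, bin_width=0.5)
for edge, density in hist.items():
    print(f"{edge:6.2f}  {density:.2f}")
```

Because 8-bit codes are discrete, the smallest codes map to a few isolated log values, so the left end of such a histogram has occupied bins separated by empty ones.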
Due to binning of small values, the LDR sets show spikes towards the left which should not be interpreted as artifacts of the image ensemble. This is simply the result of taking the logarithm of discrete values. The LDR datasets do show problems with over-exposed pixels, despite our image selection procedure for LDR exposures. This can be seen as the spikes towards the right of the plots. The HDR datasets do not show these artifacts, indicating that, as expected, over-exposure does not tend to occur in our data. We note that the histograms for our day-lit ensembles are more or less identical, indicating that the presence of man-made structures does not significantly affect statistics.

HDR images tend to show histograms with long tails. This is seen here as well, even though the plots have a logarithmic horizontal axis. This is generally attributed to the presence of bright light sources with limited spatial extent.

¹ Courtesy Greg Ward
² http://www.dxo.com/uk/photo/dxo optics pro/

[Figure 2: Histograms for our LDR and HDR datasets. Panels: Ensemble Histogram (LDR, linear), density against log intensity; Ensemble Histogram (HDR), density against log luminance. Curves: Natural Day, Manmade Day, Manmade Indoors, Manmade Night, RIT.]

[Figure 3: Skew and kurtosis for our image ensembles. Panels: Skew, LDR; Skew, HDR; Kurtosis, LDR; Kurtosis, HDR.]

Data Set           LDR µ   LDR σ   HDR µ         HDR σ
Natural Day        54.5    69.7    66.4 × 10^4   15.1 × 10^5
Manmade Day        39.7    60.2    49.8 × 10^4   94.9 × 10^4
Manmade Indoors    23.1    47.6    28.7 × 10^3   11.4 × 10^4
Manmade Night      20.6    46.1    24.9 × 10^2   18.1 × 10^3
RIT                -       -       86.6 × 10^2   40.8 × 10^3

Table 2: Mean and standard deviation for our datasets.

Our datasets can be further characterized by their central moments.
In natural image statistics, usually only the first four moments (mean, standard deviation, skew, and kurtosis) are reported. The means µ and standard deviations σ vary significantly between LDR and HDR datasets, and are shown in Table 2. Both also vary significantly between datasets, and we note that the standard deviation tends to be very large, pointing towards a wide range of image contrasts in each scene. The skew and kurtosis results are compared in Figure 3. It is interesting to note that both skew and kurtosis are much higher for the HDR datasets than their corresponding LDR ensembles, and indicate a decidedly non-Gaussian distribution of pixel values. While the skew and kurtosis are highest for the Indoors and Night scenes in the LDR ensembles, they are smallest for the HDR ensembles. In particular, the calibrated RIT dataset shows a significantly higher skew and kurtosis than any of the other sets. The skew is positive for all ensembles, which is to be expected: most pixels tend to be dark, with smaller clusters of pixels representing light sources. This is much more so for our HDR ensembles, which we attribute to the fact that light sources in LDR images tend to be clipped and are therefore under-represented in conventional images. The kurtosis measure is of considerable interest, as it relates to theories of sparse coding. If a distribution is highly positively kurtosed, most of the values will be close to zero. Such sparseness could be effectively employed by thresholding and removing the values near zero. The remaining values could then be recoded more efficiently with little loss of information [Thomson 1999]. This has computational [Barlow 1994] as well as metabolic advantages in human vision [Field 1994]. We surmise that the significantly increased kurtosis in HDR images means that real-world luminances are even more amenable to sparse coding within the human visual system than previously assumed. 
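To make the four moments concrete, the following minimal sketch computes them for a toy set of luminances. It is an illustration under stated assumptions rather than our analysis code: it uses population moments, defines kurtosis as the non-excess fourth standardized moment (3 for a Gaussian), and the sample values and the name `central_moments` are hypothetical.

```python
import math


def central_moments(values):
    """Return mean, standard deviation, skew and kurtosis of a sequence.

    Population moments are used; kurtosis is the non-excess fourth
    standardized moment, an assumption made for this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    skew = m3 / std ** 3
    kurtosis = m4 / m2 ** 2
    return mean, std, skew, kurtosis


# Hypothetical scene: mostly dark pixels, some mid-tones, one light source.
hdr_like = [1.0] * 90 + [10.0] * 9 + [1000.0]
# The same scene with the light source clipped, as in a single LDR exposure.
ldr_like = [min(v, 10.0) for v in hdr_like]

for name, data in (("HDR-like", hdr_like), ("LDR-like", ldr_like)):
    mean, std, skew, kurt = central_moments(data)
    print(f"{name}: mean={mean:.2f} std={std:.2f} "
          f"skew={skew:.2f} kurtosis={kurt:.2f}")
```

In this toy case, clipping the single bright value lowers both skew and kurtosis, which is the direction of the LDR versus HDR difference described above.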
Whether this translates to efficient computational compression schemes remains to be seen, given that such schemes implicitly rely on quantization to achieve high levels of compression, whereas HDR images are not quantized.

Finally, we assess the magnitude of contrast in this section. The overall contrast in an image can be captured by computing the ratio between variance and mean, $\sigma_i^2 / \mu_i$. If we do this for each image individually, then we can compute ensemble statistics (mean and standard deviation) on contrast, the results of which are given in Table 3. We see that for the LDR ensembles the largest overall contrasts are recorded for the indoors and night scenes. This can be attributed to the fact that for such scenes it is likely that the best exposure is still both over- and under-exposed at the same time. The HDR datasets show different behavior, in that the night scenes have less contrast on average than the other ensembles. This is to be expected, as with the exception of the presence of relatively bright light sources, most areas in a scene are only sparsely illuminated, yielding on average low contrast.

Data Set           LDR µ   LDR σ   HDR µ         HDR σ
Natural Day        57.8    29.3    64.1 × 10^4   17.4 × 10^5
Manmade Day        65.0    42.7    68.8 × 10^4   22.5 × 10^5
Manmade Indoors    101.8   58.1    16.0 × 10^4   21.0 × 10^4
Manmade Night      95.5    44.5    28.1 × 10^3   52.9 × 10^3
RIT                -       -       51.1 × 10^3   21.5 × 10^4

Table 3: Mean and standard deviation of contrast.

To capture the relations between neighboring pixels in each data set, we created joint log histograms over horizontal and vertical neighbors. Each image was analyzed separately and the resulting histograms were averaged for each data set. All histograms were computed using the overall maximum and minimum values for the whole set. Some representative histograms for our data sets are shown in Figure 4. Correlations are greater along the diagonal, showing that neighboring pixels tend to have similar values.

[Figure 4: Joint histograms for Natural, Manmade Day and Manmade Night data sets along horizontal neighbors. Rows: log HDR, log LDR. Columns: Natural Day, Manmade Day, Manmade Night.]

5 Spectral Analysis

One of the most often reported natural image statistics is the average slope of the power spectrum. The power spectrum S(u, v) of an N × N image is given by:

$$S(u, v) = \frac{|F(u, v)|^2}{N^2} \qquad (1)$$

where F is the Fourier transform of the image, and (u, v) are pixel indices in Fourier space. Two-dimensional frequencies can also be represented with polar coordinates (f, θ), using u = f cos(θ) and v = f sin(θ). When computing the power spectrum over a sufficiently large set of natural images, and averaging over all directions θ, then on average the power S at frequency f has the following relationship:

$$S(f) \propto A f^{-\alpha} \qquad (2)$$

where α is the spectral slope and A is a constant that determines overall image contrast. For natural images, the slope α tends to lie around 2, which has several interesting implications. First, it can be shown that this means that our natural environment is scale-invariant: we can move through our world and expect that the statistics of the images falling on the retina do not significantly change. Second, the Wiener-Khinchin theorem states that the power spectrum and the autocorrelation function form a Fourier transform pair. This means that the spectral slope of image ensembles can be interpreted as describing relations between pairs of pixels.

To compute the spectral slope for each image ensemble, we first convert each color image to luminance using ITU-Recommendation BT.709 (the standard used for sRGB and HDTV): L = 0.2126R + 0.7152G + 0.0722B. Then, the luminance images were weighted with a Kaiser-Bessel window to avoid measuring boundary effects, before removing the mean intensity. The windowed images are then converted to the Fourier domain, where the power spectrum is computed, and the resulting values are averaged over all directions. This results in a power value for each frequency. The lowest 256 frequencies were then selected, to which the power model was fitted, leading to a spectral slope for each image.

The spectral slopes were averaged for each of our ensembles and subjected to statistical analysis. The slopes and their variances are listed in Table 4. Note that Dror et al. [2001] have found that due to the presence of strong illumination sources, the spectral distribution of HDR images no longer follows the standard $1/f^{\alpha}$ power curve. Following their work, we have log-transformed our HDR data, allowing a power curve to be fitted to the spectra.

Data Set           LDR α   LDR σ²   log HDR α   log HDR σ²
Natural Day        2.22    0.242    2.24        0.144
Manmade Day        2.29    0.153    2.34        0.099
Manmade Indoors    2.44    0.121    2.61        0.095
Manmade Night      2.47    0.249    2.68        0.152
RIT                -       -        2.36        0.126

Table 4: Spectral slopes α and their variances σ² for each of our ensembles.

Using two-tailed, independent measures t-tests [Hayes 1988] we find that the difference in spectral slope between the HDR and LDR datasets is significant (t(1022)=3.703, p<0.001, LDR slope 2.33, HDR slope 2.42), meaning that findings obtained for LDR data do not extend to HDR data. Further, we find a significant difference between the manmade and natural datasets (t(668)=2.644, p<0.008, Manmade slope 2.32, Natural slope 2.23).

Comparing LDR versus HDR in each of our datasets, we find that the Manmade Night and Indoors datasets show a significant difference between LDR and HDR, whereas the Manmade Day set shows only a marginal difference (Table 5). No statistical difference between LDR and HDR was found for the Natural Day ensemble. In general, we find that HDR exponents tend to be higher than their LDR counterparts. The LDR slopes for Night and Indoors appear at the extreme upper end of the range normally reported (i.e. over 2.4).
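The last step of the slope computation in this section, fitting the power model of Equation 2 to radially averaged power values, can be sketched as follows. This is a minimal sketch under assumptions: it uses an ordinary least-squares line fit in log-log space (the text does not state which fitting procedure was used), the spectrum is synthetic, and the function names are hypothetical. The BT.709 weights are those given in the text.

```python
import math


def luminance_bt709(r, g, b):
    """Luminance from linear RGB using the BT.709 weights quoted in the text."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def fit_spectral_slope(freqs, power):
    """Least-squares fit of log S = log A - alpha * log f.

    Returns (alpha, A). Fitting in log-log space is an assumption
    made for this sketch.
    """
    xs = [math.log(f) for f in freqs]
    ys = [math.log(s) for s in power]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return -slope, math.exp(intercept)


# Synthetic radially averaged spectrum over the lowest 256 frequencies,
# generated from S(f) = A * f**(-alpha) with hypothetical A and alpha.
freqs = list(range(1, 257))
power = [3.0 * f ** -2.2 for f in freqs]

alpha, amplitude = fit_spectral_slope(freqs, power)
print(f"alpha = {alpha:.3f}, A = {amplitude:.3f}")
```

On real images the input to `fit_spectral_slope` would be the direction-averaged power spectrum of the windowed, mean-removed luminance image; the fit recovers one α per image, which can then be averaged per ensemble.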
Data Set           t-test            Signif.   Slopes (LDR, log HDR)
Natural Day        t(188) = 0.390    > 0.69    2.22, 2.24
Manmade Day        t(478) = 1.683    > 0.09    2.29, 2.34
Manmade Indoors    t(248) = 3.977    < 0.00    2.44, 2.61
Manmade Night      t(102) = 2.362    < 0.02    2.47, 2.68

Table 5: Comparison between spectral slopes of the LDR and log HDR ensembles.

As a result, we conclude that statistics collected on LDR scenes in previous studies remain valid. However, these results do not extend to indoors or night scenes, which show significantly larger spectral slopes. Moreover, those ensembles not only show significant differences between LDR and HDR equivalents, but are also significantly different from the manmade and natural day ensembles. This means that for indoor and night scenes in particular, HDR imaging is a necessary tool for statistical analysis. Moreover, the values reported for day scenes do not necessarily translate to other visual conditions. This suggests that in computer graphics applications spectral image priors should be designed with specific viewing environments in mind.

6 Phase Analysis

It can be argued that although statistical regularities are present in power spectra of natural images, much of the perceptually relevant information is encoded in phase spectra [Thomson 1999]. To gain access to phase information without polluting the results with first and second order information, we can whiten and center the images first. This amounts to adjusting the spectral slope to become flat and setting the DC component to zero. As a consequence the mean of the resulting image will be zero. Whitening images and analyzing the phase spectrum amounts to an assessment of spatial structure.

[Figure 5: Skew and kurtosis after whitening encode phase information. Panels: Whitened Skew, LDR; Whitened Skew, HDR; Whitened Kurtosis, LDR; Whitened Kurtosis, HDR. Ensembles: Natural Day, Manmade Day, Manmade Indoors, Manmade Night, RIT.]

The results of computing skew and kurtosis on the whitened image ensembles are shown in Figure 5. The whitened kurtosis effectively measures the sparseness of the phase information. As with the kurtosis computed directly on images, we see that the whitened kurtosis is much higher for high dynamic range images, and in particular for the calibrated RIT ensemble. In this case, we conclude that spatial structure is sparse, in real-world scenes even more so than can be captured with conventional images.

7 Gradients

To capture statistical regularities in the relations between pairs of pixels, the distribution of gradients in images can be analyzed. We compute the gradient D of the log luminance values for our image ensembles as follows:

$$D(i, j) = \ln L(i, j) - \ln L(i, j + 1) \qquad (3)$$

where L(i, j) is the luminance for pixel (i, j). Images from each ensemble are normalized according to the maximum value within the whole ensemble. As the images are only calibrated up to a constant, this normalization makes results from different data sets comparable. Both horizontal and vertical gradients are computed and the resulting distributions are shown in Figure 7.

Our findings for the LDR ensembles are qualitatively similar to previous findings in the literature and closely match the Laplacian distribution reported by Huang and Mumford [1999b]. Images consist of mostly smooth areas, causing the majority of gradients to fall at or very near zero, creating a distinct peak. As structures in images tend to have starting and ending boundaries of similar contrast, the distribution of gradients is symmetric about the central peak.

The HDR gradient distributions, albeit similar at a high level to their LDR equivalents, display some interesting properties. As mentioned earlier, all images are normalized within each ensemble. Since our gradient distributions are computed from logged luminance values, they essentially correspond to ratios in linear space and therefore contrasts. HDR images can capture regions of widely different luminance values, causing in turn extreme contrasts to appear. As an example, consider the scene in Figure 6. The left image is linearly compressed while the one on the right is tonemapped using the photographic operator [Reinhard et al. 2002]. Luminance values corresponding to portions of the scene outside the window are several orders of magnitude higher than those of pixels from the window frame. Nevertheless, apart from these extreme areas, neighbouring pixels in most parts of the image result in relatively small contrasts, no larger than what would be encountered in an LDR image. Consequently, the gradient distributions of the HDR ensembles acquire tails similar to those seen in their luminance histograms (Section 4).

[Figure 6: The left image is linearly compressed while the right is tonemapped such that more details of the scene are visible. Portions of the image within the window have much higher luminance values than nearby pixels on the window frame.]

Huang and Mumford [1999b] propose two different functional forms to model the gradient distributions that they found; one corresponding to the t-distribution and the second to a generalized Laplacian. These are defined as follows:

$$\text{Model 1:} \quad f(x) = \frac{1}{Z} \, \frac{1}{(1 + x^2/s^2)^t} \qquad (4)$$

$$\text{Model 2:} \quad f(x) = \frac{1}{Z} \, e^{-|x/s|^t} \qquad (5)$$

where f(x) is the density function for the distribution D (Equation 3) and Z is chosen such that the integral of f(x) is 1, leaving s and t as free parameters.
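Equations 3 to 5 can be illustrated with the short sketch below. It is a minimal example, not our fitting code. The closed-form normalization constants are standard integrals supplied for this sketch (Z = s√π Γ(t − 1/2)/Γ(t) for Model 1, which requires t > 1/2, and Z = 2sΓ(1/t)/t for Model 2); they are not taken from the text. The luminance patch, the parameter values and the function names are hypothetical.

```python
import math


def log_gradients(luminance):
    """Horizontal gradients D(i, j) = ln L(i, j) - ln L(i, j + 1) (Equation 3)."""
    return [
        [math.log(row[j]) - math.log(row[j + 1]) for j in range(len(row) - 1)]
        for row in luminance
    ]


def model1_density(x, s, t):
    """Model 1 (Equation 4): f(x) = (1/Z) / (1 + x^2/s^2)^t, for t > 1/2."""
    z = s * math.sqrt(math.pi) * math.gamma(t - 0.5) / math.gamma(t)
    return 1.0 / (z * (1.0 + (x / s) ** 2) ** t)


def model2_density(x, s, t):
    """Model 2 (Equation 5): f(x) = (1/Z) exp(-|x/s|^t)."""
    z = 2.0 * s * math.gamma(1.0 / t) / t
    return math.exp(-(abs(x / s) ** t)) / z


# Hypothetical luminance patch: a dark window frame next to a bright exterior.
patch = [
    [1.0, 1.0, 100.0],
    [2.0, 2.0, 2.0],
]
grads = log_gradients(patch)
print(grads)

# Illustrative parameters only; s and t would normally come from a fit.
for x in (0.0, 0.5, 2.0, 4.0):
    print(x,
          round(math.log(model1_density(x, 0.37, 2.70)), 3),
          round(math.log(model2_density(x, 0.03, 0.53)), 3))
```

The single large entry in `grads` comes from the bright neighbour; a handful of such values is what produces the long tails of the HDR gradient distributions, while the fitted models describe the peak around zero.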
[Figure 7: Gradient distributions for LDR and HDR ensembles for the four data sets. The best fits using the two models proposed in [Huang and Mumford 1999b] for the horizontal and vertical gradients are also shown. Panels: HDR and LDR versions of Natural Day, Manmade Day, Manmade Indoors and Manmade Night. Axes: gradients against probability (ln f(x)). Curves: dx, Model 1 dx, Model 2 dx, dy, Model 1 dy, Model 2 dy.]

To evaluate the differences between the various image ensembles as well as between LDR and HDR, we used least squares to fit the two proposed models to the resulting distributions. Table 6 and Figure 7 show the results of the least squares fitting for both models to ln f(x) for each ensemble for LDR and HDR images. As can be seen, these models fit the steep part of the distributions relatively well, but do not predict the very large gradients present in the HDR ensemble distributions. If required, this difference can be rectified by using either model in combination with a very shallow linear part to account for the large gradients.

LDR data fitting
                   Model 1 dx    Model 1 dy    Model 2 dx    Model 2 dy
Data Set           s     t       s     t       s     t       s     t
Natural Day        0.37  2.70    0.38  2.78    0.03  0.53    0.04  0.549
Manmade Day        0.34  2.61    0.41  2.79    0.03  0.51    0.04  0.54
Manmade Indoors    0.21  2.17    0.23  2.28    0.01  0.45    0.02  0.47
Manmade Night      0.21  2.13    0.20  2.05    0.02  0.47    0.02  0.45

HDR data fitting
                   Model 1 dx    Model 1 dy    Model 2 dx    Model 2 dy
Data Set           s     t       s     t       s     t       s     t
Natural Day        0.24  3.44    0.26  3.47    0.01  0.52    0.01  0.53
Manmade Day        0.25  3.07    0.27  3.16    0.01  0.49    0.01  0.49
Manmade Indoors    0.27  3.05    0.31  3.22    0.02  0.55    0.02  0.55
Manmade Night      0.17  2.53    0.21  2.71    0.01  0.43    0.01  0.46

Table 6: Data fitting parameters for the LDR and HDR ensembles using the models from Equations 4 and 5.

8 Wavelet Statistics

Wavelets are a useful tool for statistical image analysis as they provide a compact way to encode relations between image structures at different orientations and scales. We chose Haar wavelets for our work as they are easy to compute and previous work has shown no significant difference between Haar wavelets and more complicated bases [Dror et al. 2001].

We decompose the luminance channels of all images to their horizontal, vertical and diagonal components and compute the average distribution of coefficients for each orientation for all our data sets. Figure 9 shows the coefficient distributions for all three orientations for the natural dataset for both LDR and HDR ensembles. In both cases, the luminance values were normalized according to the maximum value within each ensemble and logarithmically compressed.

The distribution corresponding to the finest scale is comparable to the logarithmic gradient distributions presented in Section 7. This is not a surprising result as Haar wavelets operate in a similar way to image gradients, albeit at different orientations and scales. Coarser scales exhibit higher variance and indicate that wavelet coefficients cover a progressively larger area. Note that, as was the case with the gradient distributions, the coefficient distributions for the finest scale of the HDR data appear narrower than the LDR equivalent. This can be attributed to the small regions of extreme contrast that are present in at least some of the HDR images. Figure 8 shows the horizontal coefficients for the HDR datasets.

[Figure 8: The horizontal coefficients for all the HDR ensembles are shown for the three finest scales. Panels: Natural Day, Manmade Day, Manmade Indoors, Manmade Night. Axes: horizontal coefficient against log density. Curves: Scale 1, Scale 2, Scale 3.]

Although qualitatively similar, wavelet analysis also gives rise to differences between the different scene types, which can be seen especially in the coarser scales. The coefficient distributions for the Natural Day ensemble (Figure 8) are narrower than the corresponding distributions for manmade scenes. From that we can conclude that manmade scenes show less correlation between pixel values over larger distances. Since the coefficient densities shown are logarithmically compressed, these irregularities only affect a small number of pixels. Nevertheless, they open up interesting questions regarding compression of different types of HDR scenes.

9 Discussion

Our interest is in natural image statistics for the purpose of better understanding the input to the human visual system as well as being able to apply the results to applications in computer graphics and related fields. The hypothesis is that LDR images may not necessarily represent scene statistics in the same manner that retinal images would. By employing HDR technology, we have assembled a collection of LDR and HDR image ensembles for the purpose of cataloguing similarities and differences between LDR and HDR images. These were then used to assess a range of natural image statistics, revealing some tell-tale differences between LDR and HDR ensembles as well as between different scene types.

It is well known that clipping exists in LDR images, which is reflected in histograms and their moments. Histograms of HDR sets show a much higher skew and kurtosis as a consequence of the relatively small number of extremely bright portions of the images. An interesting observation is that the differences between the HDR ensembles are not mirrored in the LDR ensembles. In particular, both the Manmade and the Natural Day scenes result in more skewed distributions than the Indoors or Night scenes in the case of HDR
imagery, while the oposite effect arises in the LDR case. Spectral analysis also gives rise to differences between the LDR and HDR sets. However, our t-tests show that these differences are only significant for the Manmade Night and Indoors datasets as discussed in Section 5. After whitening the images by flattening the spectral slope, the skew and kurtosis of the ensembles were recomputed, allowing phase information to be captured. Similar to the un-whitened distributions, HDR sets result in much higher skew and kurtosis values compared to the equivalent LDR sets. To evaluate the relations between pairs of neighbouring pixels, gradient statistics were collected, showing that existing models predict well the distribution of gradients in LDR ensembles but cannot predict the tail that HDR gradient distributions show. Such tails are the result of image portions with extreme contrasts and although they correspond to only a small number of pixels, they are a crucial characteristic of HDR imagery. Our wavelet analysis confirms the observations arising from the gradient statistics, also giving rise to differences between Natural and Manmade datasets. It was shown that the distribution of wavelet coefficients for coarser scales is wider, a consequence of smaller correlations between distant pixels in Manmade scenes. 10 Conclusion In summary, we find that significant differences exist between LDR and HDR image ensembles as well as between different types of scenes which cannot be traced back to our capture methods, and are therefore inherently present in the scenes we have collected. The implications are twofold. First, this means that the input to the human visual system is not quite the same as that represented by LDR imagery. As a consequence, it may be necessary to rethink how the human visual system is matched to its environment. 
Second, engineering solutions which rely on statistical regularities in images, for instance in the design of priors in optimization problems, may benefit from the use of high dynamic range imagery due to its ability to control quantization and exposure artifacts. However, in that case priors ought to be designed specifically for HDR imagery and, depending on the nature of the prior, for a specific type of illumination environment. We have shown that it is not possible to safely assume that existing natural image statistics extend to the HDR case and therefore to real-world illumination.

References

BARLOW, H. B. 1994. What is the computational goal of the neocortex? In Large-Scale Neuronal Theories of the Brain, C. Koch, Ed. MIT Press.

BELL, A., AND SEJNOWSKI, T. 1997. The “independent components” of natural scenes are edge filters. Vision Research 37, 23, 3327–3338.

BELL, A., AND SEJNOWSKI, T. 1997. Edges are the ’independent components’ of natural scenes. Advances in Neural Information Processing Systems 9, 831–837.

BRADY, M., AND LEGGE, G. 2009. Camera calibration for natural image studies and vision research. Journal of the Optical Society of America A 26, 1, 30–42.

BURTON, G., AND MOORHEAD, I. 1987. Color and spatial structure in natural scenes. Applied Optics 26, 1, 157–170.

DEBEVEC, P., AND MALIK, J. 1997. Recovering high dynamic range radiance maps from photographs. In SIGGRAPH ’97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, 369–378.

DROR, R., LEUNG, T., ADELSON, E., AND WILLSKY, A. 2001. Statistics of real-world illumination. In CVPR ’01: IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 164–171.

FAIRCHILD, M. 2007. The HDR photographic survey. In Proceedings of the Fifteenth Color Imaging Conference: Color Science and Engineering Systems, Technologies, and Applications, vol. 15, 233–238.
FIELD, D. 1987. Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A 4, 12, 2379–2394.

FIELD, D. J. 1993. Scale-invariance and self-similar ’wavelet’ transforms: An analysis of natural scenes and mammalian visual systems. In Wavelets, fractals and Fourier transforms. 151–193.

FIELD, D. J. 1994. What is the goal of sensory coding? Neural Computation 6, 4, 559–601.

FIELD, D. 1999. Wavelets, vision and the statistics of natural scenes. Philosophical Transactions: Mathematical 357, 176, 2527–2542.

HANCOCK, P., BADDELEY, R., AND SMITH, L. 1992. The principal components of natural images. Network: Computation in Neural Systems 3, 1, 61–70.

VAN HATEREN, J. H., AND VAN DER SCHAAF, A. 1998. Independent component filters of natural images compared with simple cells in primary visual cortex. In Proceedings of the Royal Society B: Biological Sciences, vol. 265, 359.

VAN HATEREN, J. H. 1992. A theory of maximizing sensory information. Biological Cybernetics 68, 1, 23–29.

HAYES, W. L. 1988. Statistics. Harcourt Brace College Publishers, Orlando, FL.

HUANG, J., AND MUMFORD, D. 1999. Image statistics for the British Aerospace segmented database. MTPC preprint.

HUANG, J., AND MUMFORD, D. 1999. Statistics of natural images and models. In CVPR ’99: IEEE Conference on Computer Vision and Pattern Recognition, vol. 1.

HURRI, J., HYVÄRINEN, A., KARHUNEN, J., AND OJA, E. 1996. Image feature extraction using independent component analysis. In Proceedings of the IEEE Nordic Signal Processing Symposium (NORSIG), 475–478.

HURRI, J., HYVÄRINEN, A., AND OJA, E. 1997. Wavelets and natural image statistics. In Proceedings of the Scandinavian Conference on Image Analysis, vol. 1, 13–18.

LEVIN, A., ZOMET, A., AND WEISS, Y. 2003. Learning how to inpaint from global image statistics. In ICCV ’03: IEEE International Conference on Computer Vision, 305.

LEVIN, A. 2007. Blind motion deblurring using image statistics. Advances in Neural Information Processing Systems 19, 841–848.

OLSHAUSEN, B., AND FIELD, D. 1996. Natural image statistics and efficient coding. Network: Computation in Neural Systems 7, 333–339.

PORTILLA, J., AND SIMONCELLI, E. 2000. Image denoising via adjustment of wavelet coefficient magnitude correlation. In Proceedings of the 7th International Conference in Image Processing, 10–13.

POULI, T., CUNNINGHAM, D. W., AND REINHARD, E. 2010. Image statistics and their applications in computer graphics. In Eurographics State-of-the-Art Reports.

REINHARD, E., STARK, M., SHIRLEY, P., AND FERWERDA, J. 2002. Photographic tone reproduction for digital images. ACM Transactions on Graphics 21, 3, 267–276.

REINHARD, E., SHIRLEY, P., ASHIKHMIN, M., AND TROSCIANKO, T. 2004. Second order image statistics in computer graphics. In Proceedings of the 1st Symposium on Applied perception in graphics and visualization, 99–106.

RUDERMAN, D., AND BIALEK, W. 1994. Statistics of natural images: Scaling in the woods. Physical Review Letters 73, 6, 814–817.

SHAN, Q., JIA, J., AND AGARWALA, A. 2008. High-quality motion deblurring from a single image. ACM Transactions on Graphics 27, 3, 73.

SIMONCELLI, E. 1999. Bayesian denoising of visual images in the wavelet domain. Bayesian Inference in Wavelet Based Models 141, 291–308.

SIMONCELLI, E. 1999. Modeling the joint statistics of images in the wavelet domain. Proceedings of the SPIE 3813, 188–195.

THOMSON, M. 1999. Higher-order structure in natural scenes. Journal of the Optical Society of America A 16, 7, 1549–1553.

TOLHURST, D., TADMOR, Y., AND CHAO, T. 1992. Amplitude spectra of natural images. Ophthalmic and Physiological Optics 12, 2, 229–232.

TORRALBA, A., AND OLIVA, A. 2003. Statistics of natural image categories. Network: Computation in Neural Systems 14, 3, 391–412.

WEBSTER, M., AND MIYAHARA, E. 1997. Contrast adaptation and the spatial structure of natural images. Journal of the Optical Society of America A 14, 9, 2355–2366.

Figure 9: Wavelet coefficient distributions for all three orientations. The three finest scales are shown here for both the LDR and HDR Natural datasets. (Panels: LDR and HDR, each for Horizontal, Vertical and Diagonal Coefficients; curves: Scale 1, Scale 2, Scale 3; y-axis: Log Density.)