
Statistics of natural and urban images

1997, Lecture Notes in Computer Science


Christian Ziegaus and Elmar W. Lang
Institute of Biophysics, University of Regensburg, D-93040 Regensburg, Germany
[email protected]

Abstract. We investigated ensembles of artificial and real-world grey-scale images to find different invariance properties: translation invariance, scale invariance and a new hierarchical invariance recently proposed by Ruderman [1]. We found that the assumption of translational invariance can be taken for granted. Our results concerning the scale invariance are qualitatively the same as those found by Ruderman [1] and others. The deviations of the distributions of the logarithmically transformed images from a Gaussian distribution cannot be seen as clearly as stated by Ruderman [1]. Depending on the preprocessing of the images, the results concerning the hierarchical invariance differed widely. It seems that this new invariance can be confirmed only for logarithmically transformed images.

1 Introduction

The visual system is expected to be optimally adapted to the statistics of the natural images it deals with. Hence the statistical properties of visual input patterns are of primary interest. Since there is no way to collect enough data to fully characterize an image environment, recent investigations [1-4] seek to identify a simple underlying structure or invariance property in the image probability distribution. One such symmetry frequently assumed is translational invariance. Undoubtedly the most robust statistical property of natural images is an invariance to scale [1]. Recently, evidence has been presented supporting the notion of a hierarchical invariance in natural scenes. It relates to the conversion of exponential histograms to Gaussian distributions via local non-linear transformations.

2 Ensembles

2.1 The natural and the urban ensemble

Various scenes for the natural ensemble NAT were recorded in a forest with a CCD video camera.
Scenes for the urban ensemble ZIV were filmed around the campus of the University of Regensburg. From individual frames an average image was constructed, and about 100 smaller images (256 × 256 pixels each) were then extracted at random. To evaluate the image statistics, the raw pixel intensities $\Phi(x)$ were transformed either linearly or non-linearly into zero-mean distributions: difference intensities $\Phi(x) \equiv D(x)$ or log-contrast intensities $\Phi(x) \equiv L(x)$.

2.2 The artificial ensemble K2

Fourier images with $1/|k|$ amplitude spectra and random phases were generated on a computer and inverse Fourier-transformed into the spatial domain. Each resulting image is perfectly scale invariant and hence characterized by a power spectrum $S(|k|) \propto 1/|k|^2$. After determining for each Fourier-transformed image the corresponding maximal and minimal pixel intensities $I_{\max}$ and $I_{\min}$, the pixel intensities were rescaled linearly to span the interval [0, ..., 255]. The actual spread of the rescaled pixel intensities can be characterized by a relation for $\langle I_{\max} \rangle$ [relation illegible in the source]. The distribution of pixel intensities $I(r)$ after rescaling with $I_{\max}$ is thus given by

[Eq. (1) illegible in the source] (1)

with $M$ the number of pixels in each spatial direction and $\sigma_I$ the standard deviation of the distribution.

3 Translational Invariance

If images are indeed translationally invariant, then any two-point correlation function $A_\Phi(y) = \langle \Phi(x)\,\Phi(x+y) \rangle$ of pixel intensities at points $x$ and $x+y$ depends only on their relative distance $y$ and not on $x$ itself. Besides calculating the second-order correlation function $A_\Phi(y)$ of pixel intensities for different starting points $x$, an average correlation function has also been determined according to the relation

$$\bar{A}_\Phi(y) = \frac{1}{M^2} \sum_{x} \Phi(x)\,\Phi(x+y), \qquad (2)$$

which is the mean over the number of pixels contained in any given image. The average correlation functions of the image ensembles are compared in Fig. 1.
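As a minimal sketch of the averaging described in this section, the following NumPy code estimates the average two-point correlation function of a single zero-mean image on a grid of displacements y. The function name `average_correlation`, the `max_shift` parameter and the periodic (wrap-around) boundary handling are assumptions made for illustration; the paper does not state how image borders were treated.

```python
import numpy as np

def average_correlation(phi, max_shift):
    """Average two-point correlation A(y) = mean over x of phi(x) * phi(x + y).

    Assumption: periodic boundaries (np.roll wraps around), so every
    displacement y is averaged over the full number of pixels.
    Returns an array indexed by (dy + max_shift, dx + max_shift).
    """
    phi = phi - phi.mean()                      # zero-mean intensities
    size = 2 * max_shift + 1
    corr = np.empty((size, size))
    for i, dy in enumerate(range(-max_shift, max_shift + 1)):
        for j, dx in enumerate(range(-max_shift, max_shift + 1)):
            # shifted[r, c] == phi[r + dy, c + dx] (indices taken modulo size)
            shifted = np.roll(phi, shift=(-dy, -dx), axis=(0, 1))
            corr[i, j] = np.mean(phi * shifted)
    return corr
```

For an image made of vertical stripes, the values along the vertical displacement axis stay equal to the variance, which is the kind of anisotropy a contour plot of this function makes visible.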
Though obviously translationally invariant, the ensembles are not rotationally isotropic: vertical structures (stemming largely from trees) dominate the natural images, whereas the average correlation function of the images of the urban environment reflects the predominance of horizontal contours.

Fig. 1. Average correlation function A(y) for the ensembles NAT (a) and ZIV (b) for the difference intensities. Contours are shown at equal intervals of correlation. Both axes are given in degrees.

4 Scale Invariance

If scale invariance prevails, the power spectrum $S_\Phi(k)$ of the intensity distribution should, after averaging over all orientations, scale like

$$S_\Phi(|k|) \propto \frac{1}{|k|^{2-\eta}}, \qquad (3)$$

with $|k|$ the modulus of the spatial frequency and $\eta$ an anomalous exponent. Scale invariance does not tell the form of the stationary distribution from which the $\Phi(x)$ are drawn. Both aspects may be further investigated through the process of coarse graining [5], which replaces an N × N block of pixel intensities by their average. If the probability distribution $P_N(\Phi)$ is scale invariant (scaling variable N), normally distributed difference intensities $\Phi(x) \equiv D(x)$ would result in a Gaussian distribution. But in the case of non-linearly transformed log-contrast intensities $\Phi(x) \equiv L(x)$, the correspondingly transformed and normalized distribution with zero average becomes

[Eq. (4): the explicit expression for $P(L)$, together with its variance $\sigma_L^2 = \int L^2 P(L)\,dL$ and a mean defined through $\int \ln(I)\,P(I)\,dI$, is illegible in the source] (4)

Besides investigating the distribution of difference intensities or of the logarithmic contrast values, it is also advantageous to consider the distribution of local gradients $G \equiv |\nabla\Phi|$ in the images. If the $\Phi(x) \equiv D(x)$ are normally distributed and scale invariant, a Rayleigh distribution of local gradients would result [6].
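The coarse-graining step and the Rayleigh expectation for gradient magnitudes can be illustrated with a short NumPy sketch. The function names, the cropping of images whose size is not a multiple of N, and the use of central differences for the gradient are assumptions made here for illustration, not details taken from the paper.

```python
import numpy as np

def coarse_grain(img, n):
    """Replace every n x n block of pixel intensities by its average.

    Assumption: rows and columns that do not fill a complete block are cropped.
    """
    h, w = img.shape
    h, w = h - h % n, w - w % n
    return img[:h, :w].reshape(h // n, n, w // n, n).mean(axis=(1, 3))

def gradient_magnitude(img):
    """Local gradient G = sqrt(Gx^2 + Gy^2) from central differences.

    The outermost pixels are dropped because np.gradient uses one-sided
    differences there.
    """
    gy, gx = np.gradient(img)
    return np.hypot(gx, gy)[1:-1, 1:-1]

def rayleigh_mean(component_std):
    """Mean of a Rayleigh distribution whose two Gaussian components
    each have standard deviation component_std."""
    return component_std * np.sqrt(np.pi / 2.0)
```

For unit-variance Gaussian white noise, each central-difference component has standard deviation 1/sqrt(2), so the mean of `gradient_magnitude` should lie close to `rayleigh_mean(1 / np.sqrt(2))`. Dividing a gradient histogram by its mean gives the gradient/mean axis used for such comparisons.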
The corresponding distribution of local gradients $G = \sqrt{G_x^2 + G_y^2}$ in the case of logarithmically transformed pixel intensities $\Phi(x) \equiv L(x)$, as well as the average local gradient $\bar{G} = \int dG_x \int dG_y \, G \, P(G_x, G_y)$, must be obtained numerically, assuming factorization of the joint probability density, $P(G_x, G_y) = P(G_x)\,P(G_y)$, with $P(G_{x,y})$ chosen according to (4).

4.1 Statistics of the artificial ensemble K2

For the ensemble K2, $S(|k|) \propto |k|^{\alpha}$ holds, with $\alpha = -1.969 \pm 0.002$ the best estimate in the least-squares sense. A coarse graining of the pixel intensities has been performed for the scales N = 1, 2, 4, 8, 16, and 32. It is to be noted that the non-linear transformation of the probability density function leads to characteristic deviations from Gaussian and Rayleigh distributions. The measured histograms closely resemble the theoretical distributions.

4.2 Statistics of the natural ensemble NAT

The spectra show an approximate scaling over more than five orders of magnitude, with small anomalous exponents collected in Table 1. We find that any kind of smoothing improves the results of the coarse-graining procedure. The histograms show interesting deviations from the related distributions of the K2 ensemble, which are most pronounced for the histogram of local gradients.

Table 1. Summary of the anomalous exponent η according to (3).

ensemble   transformation   slope −2 + η       η
K2         linear           −1.969 ± 0.002     0.031
K2         non-linear       −1.933 ± 0.002     0.067
NAT        linear           −2.319 ± 0.054    −0.319
NAT        non-linear       −2.369 ± 0.062    −0.369
ZIV        linear           −2.634 ± 0.052    −0.634
ZIV        non-linear       −2.665 ± 0.054    −0.665

4.3 Statistics of the urban ensemble ZIV

The images of this ensemble contain man-made structures only, with prominent horizontal and vertical contours. This is reflected in the power spectrum, which is very asymmetric.
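The slopes in Table 1 are least-squares fits to orientation-averaged power spectra. The sketch below shows one way such a fit could be set up for a synthetic image with a 1/|k| amplitude spectrum and random phases. All function names, the ring binning by rounded |k|, and the trick of borrowing the phases from the FFT of white noise are assumptions made for illustration; the paper does not describe its implementation.

```python
import numpy as np

def frequency_modulus(m):
    """|k| on the m x m FFT grid, in cycles per image."""
    f = np.fft.fftfreq(m) * m
    return np.hypot(f[:, None], f[None, :])

def make_scale_invariant_image(m, rng):
    """Image with a 1/|k| amplitude spectrum and random phases."""
    k = frequency_modulus(m)
    k[0, 0] = np.inf                      # suppress the DC component
    # Phases of the FFT of real white noise are random and satisfy
    # phase(-k) = -phase(k), so the inverse transform is real-valued.
    phases = np.angle(np.fft.fft2(rng.normal(size=(m, m))))
    return np.fft.ifft2(np.exp(1j * phases) / k).real

def radial_power_spectrum(img):
    """Power spectrum averaged over all orientations (square images only)."""
    m = img.shape[0]
    power = np.abs(np.fft.fft2(img - img.mean())) ** 2
    ring = np.rint(frequency_modulus(m)).astype(int).ravel()
    sums = np.bincount(ring, weights=power.ravel())
    counts = np.bincount(ring)
    radii = np.arange(1, m // 2)          # skip DC, stay below Nyquist
    return radii, sums[radii] / counts[radii]

def fit_slope(radii, spectrum):
    """Least-squares slope of log S against log |k|."""
    slope, _ = np.polyfit(np.log(radii), np.log(spectrum), 1)
    return slope
```

Applied to `make_scale_invariant_image(256, np.random.default_rng(0))`, the fitted slope should come out close to −2. The coarse ring binning at the lowest frequencies biases the estimate slightly, so an exact −2 is not expected.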
It is also of interest that the anomalous exponent η of the orientationally averaged power spectrum is the largest in magnitude of all image ensembles considered in this study (cf. Table 1).

4.4 Comparison of logarithmic contrast distributions

Figure 2 shows the distributions of the logarithmically transformed pixel intensities for all ensembles, together with the theoretically derived distribution and the histograms of local gradients. For all ensembles, the distributions of the logarithmically transformed pixel intensities show only small differences at high log-contrast values when compared to the theoretical distribution. The comparison of the experimentally obtained distributions of local gradients of the ensembles NAT and ZIV with the theoretical histogram shows an increasing probability of very small and very large gradients in going from the theoretical distribution to NAT and ZIV. This is due to the increasingly pronounced edges and large unstructured areas in the latter images. Related observations have been made with histograms of difference intensities and the corresponding local gradients.

Fig. 2. Scaling of the distributions of logarithmic contrast values (a) and gradients (b) for the real-world ensembles NAT and ZIV and the theoretically derived histograms. Horizontal axes: (a) logarithmic contrast/standard deviation, (b) gradient/mean.

5 Hierarchical Invariance

Recently the possibility of a hierarchical invariance of natural images has been discussed [1]. It is related to the observation that simple linear filtering of logarithmic contrasts produces exponential histograms much like those observed experimentally.
If these exponential tails are due to a superposition of many distributions with largely varying variances, one may try to find a local non-linear transformation that turns the distributions into Gaussians. The transformation proposed by Ruderman [1] amounts to calculating

$$\psi(x) = \frac{\Phi(x) - \bar{\Phi}(x)}{\sigma(x)}, \qquad (5)$$

with $\bar{\Phi}(x)$ the average pixel intensity (whether linearly or non-linearly transformed) within a block of size N × N pixels and $\sigma(x)$ the standard deviation of pixel intensity fluctuations within the block. Besides these variance-modified images one may also consider so-called variance images. These may be constructed either by substituting each pixel intensity by the variance of pixel intensity fluctuations within the surrounding N × N block, or by substituting the whole block of pixel intensities by the variance of their intensity fluctuations, thereby changing the image size. We will refer to these two procedures as variance images without and with block substitution, respectively. The pixel intensities of these variance images may be transformed in the same way as the original images. The interesting thing is that these variance images seem to exhibit histograms similar to those of the original images. This observation raises the question of whether the procedure can be iterated and what the outcome would be. In doing so one may select the block size that yields the smallest possible kurtosis of the distribution of pixel intensities, as Gaussian distributions are characterized by vanishing kurtosis.

5.1 Results for the ensembles K2 and NAT

Ten iterations with 100 images have been performed to obtain variance-modified images with block sizes from 3 × 3 to 19 × 19 pixels. For every iteration the block size resulting in the lowest kurtosis has been chosen. The resulting histograms without block substitution are shown in Fig. 3 for the ensemble NAT. The corresponding histograms of ensemble K2 yield qualitatively the same results.
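A single step of the variance modification and the kurtosis-based choice of block size can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the block is taken to be centred on each pixel, block sizes are odd, border pixels without a full block are dropped, and "lowest kurtosis" is read as the excess kurtosis closest to zero. The function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def variance_normalize(img, n):
    """Subtract the local mean and divide by the local standard deviation,
    both taken over the n x n block centred on each pixel (n odd).

    Assumption: pixels without a complete block around them are dropped,
    so the output has shape (H - n + 1, W - n + 1).
    """
    mean = sliding_window_view(img, (n, n)).mean(axis=(-2, -1))
    mean_sq = sliding_window_view(img ** 2, (n, n)).mean(axis=(-2, -1))
    var = np.maximum(mean_sq - mean ** 2, 1e-12)   # guard against flat blocks
    h = n // 2
    centre = img[h:img.shape[0] - h, h:img.shape[1] - h]
    return (centre - mean) / np.sqrt(var)

def excess_kurtosis(values):
    """Fourth standardized moment minus 3; zero for a Gaussian."""
    z = values - values.mean()
    return np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0

def best_block_size(img, sizes=range(3, 21, 2)):
    """Block size whose normalized image has excess kurtosis closest to zero
    (an assumed reading of the selection rule)."""
    return min(sizes, key=lambda n: abs(excess_kurtosis(variance_normalize(img, n))))
```

Iterating the procedure would mean building a variance image from the local variances and applying the same two functions to it again; that loop is omitted here because the text does not specify its details.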
If a hierarchical invariance prevails, all histograms should exhibit a similar shape. This is clearly not observed with linearly transformed pixel intensities but seems to hold for non-linearly transformed pixel intensities (log contrasts). In contrast to scale invariance, the result depends strongly on the way the raw pixel intensities are transformed.

Fig. 3. Distributions of the linearly (a, b) and non-linearly (c, d) transformed pixel intensities (a, c) and the corresponding local gradients (b, d) for ensemble NAT and the iteration procedure without block substitution (10 steps with 100 pictures). Horizontal axes: (a) difference intensity/standard deviation, (c) logarithmic contrast/standard deviation, (b, d) gradient/mean.

References

1. Ruderman, D. L.: The statistics of natural images. Network 5 (1994) 517-548
2. Ruderman, D. L.: Designing receptive fields for highest fidelity. Network 5 (1994) 147-155
3. Field, D. J.: Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. 4 (1987) 2379-2394
4. Field, D. J.: What is the goal of sensory coding? Neural Comput. 6 (1994) 559-601
5. Kadanoff, L. P.: Scaling laws for Ising models near Tc. Physics 2 (1966) 263-273
6. Gardiner, C. W.: Handbook of Stochastic Methods (Berlin, Heidelberg: Springer) 1983