J. R. G. R. 0. Wells, M. R.: Tian, Baruniuk, JR., Tun and H. Wu

WAVELET FOLDING AND DECORRELATION ACROSS THE SCALE
J . Tian, R. G. Baruniuk, R. 0. Wells, Jr., D. M. Tun and H. R. Wu

Computational Mathematics Laboratory School of Computer Science and
Rice University Software Engineering
Houston, T X 77005, USA Monash University, Australia
j untian ,richb ,wellserice. edu dmt ,[email protected] .edu. au
ABSTRACT decomposed into its low and high frequency parts, and
The discrete wavelet transform (DWT) gives a compact so on, t o produce a multiscale representation of the
multiscale representation of signals and provides a hi- original signal 2. From L and H , the inverse discrete
erarchical structure for signal processing. It has been wavelet transform (IDWT) will reconstruct x as
assumed the DWT can fairly well decorrelate real-world
signals. However a residual dependency structure still xn = + ,
( L - ~ ~ L E~ - 2 k ~ k ) (3)
remains between wavelet coefficients. It has been ob- kEZ
served magnitudes of wavelet coefficients are highly cor-
related, both across the scale and at neighboring spatial where and i jare the synthesis scaling filter and wavelet
locations. In this paper we present a wavelet folding filter, respectively. In an orthogonal case, h = h, g = 9.
technique, which folds wavelet coefficients across the The success of the DWT can be attributed t o its
scale and removes the across-the-scale dependence to a three primary properties [3]:
larger extent. It produces an even more compact sig- Locality: Each wavelet coefficient is localized simul-
nal representation and the energy is more concentrated taneously in time and frequency.
in a few large coefficients. It has a great potential in
applications such as image compression. Mult iresolution: Wavelet coefficients are compressed
and dilated t o analyze a t a nested set of scales.
1. I N T R O D U C T I O N Compression: The wavelet transforms of real-world
signals tend to be sparse.
Ever since the terminology “wa~elet”was first intro-
duced, in the context of a mathematical transform, by Based on the above three properties, it has been as-
A. Grossmann and J. Morlet [I]in 1984, wavelet theory sumed that the DWT can decorrelate real-world sig-
and its applications have grown tremendously. The dis- nals fairly well. However a residual dependency struc-
crete wavelet transform (DWT) and its fast implemen- ture still remains between wavelet coefficients. It has
tation was introduced by S. G. Mallat [2] in 1989. For been observed [3, 4, 5, 6 , 7, 8, 91 that magnitudes of
a discrete signal 2 = {xn}, the DWT of 2 consists of wavelet coefficients are actually highly correlated, i.e.,
two parts, the low frequency part (scaling coefficients) large values of wavelet coefficients tend t o propagate
L = { L I , }and the high frequency (wavelet coefficients) at the same relative spatial locations across the scale,
part H = { H k } , and a t neighboring spatial locations a t the same scale.
In this paper we present a wavelet folding technique,
LI, = xhn--2kxn, (1) which folds wavelet coefficients across the scale and re-
nEZ
moves the across-the-scale dependence to a larger ex-
HI, gn--2kxn, (2) tent.
nEZ This paper is organized as follows. In Section 2
where h and g are the (analysis) scaling filter and wavelet we give a brief overview of a data folding method for
filter, respectively. The L component can be further the discrete cosine transform (DCT). We introduce the
wavelet folding technique in Section 3. Beginning with
J. Tian, R. G. Baraniuk, and R. 0. Wells, Jr. were sup-
ported in part by AFOSR/DARPA, Exxon Production Re-
a simple example, we compute the wavelet packet of
search, NSF, and Northrop Grumman Corporation. Web sites: the data and construct a wavelet folding matrix. Then
www.cml.rice.edu, www.csse.monash.edu.au we apply the DWT on the scale direction of this matrix
0-7803-6293-4/001$10.0002000IEEE 544
to obtain a more compact representation. We conclude
the paper in Section 4. 8x8 pixel sub-blocks
2. DATA FOLDING FOR DCT

+
A data folding method for the DCT was presented in
[lo]. It rearranges the data t o allow the practical appli-
cation of the DCT to expand into higher dimensions.
Let us explain the data folding method with an exam-
ple.
Assume we have a 128 x 128 image with pixel values
a i j , 0 5 i , j 5 127. The standard two-dimensional (2-
D) DCT is applied to blocks of the image with block
size 8 x 8, i.e., for each pair ( k , l ) , O 5 k , l 5 15, we
apply DCT on the 8 x 8 matrix A k , l = ( Q k + i , s l + j ) , 0 5
. 16 sub-block wide
c
i , j 5 7. The DCT coefficients of A k , l is still an 8 x 8

matrix, which is denoted as D k , l = ( d 8 k + i , 8 1 + j ) , O 5
i , j 5 7. The d a t a folding method rearranges the 2-D Figure 1: A 2-D Data Folding for 4-D DCT
DCT coefficients D = ( d S k + i , S l + j ) , O 5 k , l 5 15,O 5
i , j F 7 , t o a new four-dimensional (4-D) matrix
13 := ( d k , i , i , j ) , 0 5 k,l 5 15, 0 5 i , j 5 7, (4) --t

and take another 2-D DCT of this new matrix 2)along
the k and 1 directions. Figure 1 illustrates the folding of
a 2-D 128 x 128 image into a 4-D 16 x 16 x 8 x 8 d a t a ar-
ray for a 4-D DCT. The statistical assumption behind
the data folding method is that it is very likely that
high correlation exists between the post DCT blocks,
that is, DCT coefficients a t the same location (the i , j
coordinates for 0 5 i , j 5 7) between different blocks
(the k,Z coordinates). Therefore, it may be desirable
to fold the DCT coefficients from different blocks and
i’
apply another 2-D DCT along the k and 1 directions to
remove further spatial redundancies. The data folding
method has been applied to image coding arid video Figure 2: Persistence Across Scale
coding. Experimental results show that a hybrid cod-
ing algorithm using the data folding method outper- scale” property. There is also strong evidence of clus-
forms the standard 2-D block DCT method for most of tering (dependence at neighboring spatial locations at
the test images. For more details of the data folding the same scale), but we will not address the clustering
method, we refer to [lo]. problem in this paper.
Coming back t o the “persistence across scale” prop-
3. WAVELET FOLDING erty of wavelet coefficients, let us denote by L j = { L ; }
and H j = { H i } as the scaling coefficients and the
Similar to the DCT, wavelet coefficients are not statis- wavelet coefficients, at j - t h scale, of a discrete signal
tically independent. Figure 2 is a seven-scale wavelet 2 = (xn}, respectively,
transform of a bump signal in [ll]. The bump sig-
nal lies atop the time-frequency tiling provided by the Li = hn-2kL;-’, (5)
DWT. Each tile is colored as a monotonic function of nEZ
the wavelet coefficient magnitude, with darker tiles in-
dicating larger magnitude. As we can see, large mag-
Hi = gn-2kLi-l, (6)
nEZ
nitude coefficients (dark colored) tend t o occur a t the
same relative spatial locations across the scale at dif- with Lo = x. We are interested in exploring the depen-
ferent subbands, which we call the “persistence across dence among H1,H 2 ,H 3 , . . ..
545
Let us assume for now that h and g are the Haar
scaling filter and wavelet filter,
and x is the square sequence {0,1,4,9,16,25,36,. . .}.

Thus
H 1 = ( 1fh, 5 fJz, 9 f J z , 13f J z , U'/&,.. .},

L1 = { l / J z ,13/J;i, 41/& 8 5 / J z , 1 4 5 / J z , . . .},
H 2 = { 6,22,38,54,70,86,102, . . .}.
Figure 3: Three Scale Wavelet Folding
Comparing H 1 and H 2 , they are both arithmetic se-
ries. What is striking behind it is that if we take an-
other DWT on H 1 (which is the wavelet packet [12] with m = 2, and L ( L ( H 1 ) )L, ( H 2 ) ,H 3 have the same
formulated by R. R. Coifman, Y. kleyer, and M. V.
Wickerhauser), the scaling coefficients L of H 1 is given
by
L(H1) = {3,11,19,27,35,43,51,.. .},
which is exactly H 2 divided by 2!
-
size for the folding processing. For convenience, denote
L ( " ) ( H j )=: L L . . . L ( H j ) .
n
Our goal is to further reduce the dependence among

The DWT of Haar scaling filter and wavelet filter H3, H j + l , . . . Hj+m. Now L ( " ) ( H j ) , L ( " - l ) ( H j + ' ) ,
could only remove the constant (which is equivalent to L(')(Hj+'-'), Hj+" all have the same size as
the zeroth vanishing moment) in the data. However IHj+"l. Construct a new wavelet folding matrix with
by combining the wavelet coefficients at two adjacent L ( m ) ( H 3 )L("-l)(Hj+'),
, . . ., L(l)(Hj+"-l), Hj+m as
scales, one can remove the quadratic relation (which its ( m + l ) rows. Along each column of the wavelet fold-
is the second vanishing moment) in the data, as illus- ing matrix, compute a wavelet coefficient ' w k by filter-
trated above. ing with some wavelet filter g', where k is the column
The above simple example leads t o the introduc- index. Note that we do not compute the scaling coef-
tion of wavelet folding. Let us begin with the standard ficients, since we are trying to remove the across-the-
DWT. Because of the downsampling procedure in the scale dependence by taking the difference (the wavelet
filtering by h and g , the sizes of wavelet coefficients coefficient). Also note we compute only one wavelet
at different scales H' ,H 2 , H 3 , . . . , H j , . . . , are differ- coefficient for each column. With this waselet coeffi-
ent by a factor of 2, IH'J = 2)H21 = 41H31 = ... = cient Wk,the coefficient of Hj+" at the k-th column in
2j-I lHjl = . . .. The wavelet packet comes naturally the wavelet folding matrix can be retrieved from other
to reduce the sizes of H s at finer scales (the H j with m rows ~ ( " l ( ~ j )L(?n-l)(Hj+1),
, . . ., L(')(Hj+""-I),
small j ) so that they will have the same size for the since the filtering is a linear transform. Thus the wavelet
folding processing. Unlike the DWT which only de- coefficients Hj+" can be fully retrieved from the se-
composes the scaling coefficients L, the wavelet packet quence w = {wk}and L ( " ) ( H j ) , L("-')(Hj+l), . . .,
decomposes the wavelet coefficients H as well. So for L(')(Hj+m-l), So given H j , Hjfl,.. . ,fJj+"-1, the
the wavelet coefficients H j a t the j - t h scale, one can +
wavelet coefficients a t ( j m)-th scale, Hj+", can be
have retrieved from the new sequence W. It is also clear now
that the value of m should be equal to the filter length
Lb(Hj) = 1h,n-2kHi,
nEZ
(7) 19'1. We claim that the sequence w is a more compact
representation than the wavelet coefficients be-
Hk(Hj) = gn-akHi. (8) cause of the following.
nEZ
Theorem 1 For a sequence x = {znIz, = an2 + bn +
One could further decompose the L ( H j ) and H ( H j ) e } , where a, b, c are some constants independent of n,
by filtering with h and g , and so on, until some desired assume h and g are the Haar scaling filter and wavelet
decomposition structure is reached. For our wavelet filter, then
folding purpose, we will only further decompose L ( H 3 ) H 2 ( x )= 2 L ( H 1 ( 2 ) ) .
until L ( L . . . L ( H j ) ) having the same size as Hj+",
the wavelet coefficients at some scale j + m. Figure 3 Therefore, by taking another DWT along the col-
illustrate a three scale wavelet folding decomposition, umn on the wavelet folding matrix consisting of H 2
546
and L ( H 1 ) ,the new sequence w will be a zero sequence [3] M.s. Crouse, R. D. Nowak, and R. G. Baraniuk.
(the factor 2 comes from the normalization factor f i Wavelet-based statistical signal processing using
of the scaling filter and wavelet filter). Thus using the hidden Markov models. IEEE Trans. Signal Pro-
wavelet folding, one can remove the quadratic relation cessing, 461886402, 1998.
in the data by the Haar scaling filter and wavelet filter,
which usually requires longer filters (for example, the [4] R. W. Buccigrossi and E. P. Simoncelli. Image
D 6 filter [13] with length 6) in a standard DWT. compression via joint statistical characterization
Moving beyond the Haar, we have in the wavelet domain. IEEE Trans. Image Pro-
cessing, 1999. To appear.
Theorem 2 Assume the wavelet filters g and g' have
[5] S. G. Mallat and W. Hwang. Singularity detec-
vanishing moments up to order N and NI, respectively,
tion and processing with wavelets. IEEE Trans.
then the wavelet folding technique can remove the poly-
nomial relation in the data up to order N N' 2. + + Inform. Theory, 38(2):617-643,1992.
[6] P. Muller and B. Vidakovic, editors. Bayesian In-
ference in Wavelet Based Models. Springer-Verlag,
Due t o the page limit, we have omitted the proofs
1999.
of two theorems presented in this paper. The proofs
can be found in [14]. Again because of the normaliza- [7] M. T. Orchard and K. Ramchandran. An inves-
tion factor fi,one needs t o enlarge the magnitudes of tigation of wavelet-based image coding using an
L ( " ) ( H j ) before the DWT across the scale. entropy-constrained quantization framework. In
Proc. Data Compression Conference, pages 341-
4. CONCLUSIONS 350, 1994.
In this paper, we introduce a new wavelet folding tech- [SI E. Ordentlich, M. Weinberger, and G. Seroussi. A
nique which reduces the across-the-scale dependence t o low-complexity modeling approach for embedded
a larger extent. Mathematically, we are looking a t the coding of wavelet coefficients. In Proc. Data Com-
dependence among the operators pression Conference, pages 408-417, 1998.
[9] J. M. Shapiro. Embedded image coding using ze-
( D o g ) o ( D o h) o ( D 0 h ) . . . (D0 h )
rotrees of wavelet coefficients. IEEE Trans. Signal
( D 0 h) 0 ( D 0 9 ) 0 ( D 0 h) . . . (D0 h) Processing, 41:3445-3462, December 1993.
[lo] D. M. Tan and H. R. Wu. hlulti-dimensional
( D 0 h ) 0 ( D 0 h ) 0 ( D 0 h ) ... ( D 0 g ) discrete cosine transform for image compression.
In Proc. 6th IEEE International Conference on
where each operator has exactly ( n - 1) h operations, Electronics, Circuits and Systems, pages TlE2.1-
one g operation, and n D operations. Here D repre- TlE2.5, Cyprus, September 1999.
sents the downsampling, h and g the convolutions with
the reversed filters h and g (i.e., Ln, g-n), respec- [Ill D. Donoho and I. Johnstone. Adapting to un-
tively. Since h, g , D do not commute, these operators known smoothness via wavelet shrinkage. J. Amer.
are all different. The wavelet folding technique gives Stat. Assoc., 90:1200-1224, December 1995.
a linear approximation t o reduce the dependence. In
practice, one may choose a smaller m for considera-
[la] R. R. Coifman, Y. Meyer, and M. V. Wicker-
hauser. Size properties of wavelet packets. In
tion of execution time and memory overhead, instead
Ruskai et al., editor, Wavelets and their Appli-
of setting m to be the full tree depth.
cations, pages 453-470. Jones and Bartlett, 1992.
5. REFERENCES [13] I. Daubechies. Orthonormal bases of compactly
supported wavelets. Comm. Pure Appl. Math.,
[l] A. Grossmann and J. Morlet. Decomposition of XLI:906-966, 1988.
Hardy functions into square integrable wavelets of
constant shape. SIAM J. Math. Anal., 15(4):723- [14] J . Tian, R. G. Baraniuk, and R. 0. Wells, Jr.
736, July 1984. Wavelet folding, vanishing moments, and poly-
nomial representation. Technical Report CML
[2] S. G. Mallat. Multiresolution approximation and TR-9910, Computational Mathematics Labora-
wavelet orthonormal bases of L2(R). Trans. A M s , tory, Rice University, October 1999.
315(1):69-87, September 1989.
547

J. R. G. R. 0. Wells, M. R.: Tian, Baruniuk, JR., Tun and H. Wu

Uploaded by

Copyright:

Available Formats

J. R. G. R. 0. Wells, M. R.: Tian, Baruniuk, JR., Tun and H. Wu

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

J. R. G. R. 0. Wells, M. R.: Tian, Baruniuk, JR., Tun and H. Wu

Uploaded by

Copyright:

Available Formats

WAVELET FOLDING AND DECORRELATION ACROSS THE SCALE

J . Tian, R. G. Baruniuk, R. 0. Wells, Jr., D. M. Tun and H. R. Wu

2. DATA FOLDING FOR DCT

i , j 5 7. The DCT coefficients of A k , l is still an 8 x 8

13 := ( d k , i , i , j ) , 0 5 k,l 5 15, 0 5 i , j 5 7, (4) --t

and x is the square sequence {0,1,4,9,16,25,36,. . .}.

H 1 = ( 1fh, 5 fJz, 9 f J z , 13f J z , U'/&,.. .},

Our goal is to further reduce the dependence among

You might also like