Draft: submitted to 20th Iranian Conference on Electrical Engineering, 2012.
Robust Audio Watermarking Based on HWD and SVD
Saeed Karimimehr*, Shadrokh Samavi**, Hoda Rezaee Kaviani***, Mojtaba Mahdavi****
*Isfahan University Of Technology, Isfahan, Iran,
[email protected]
**Isfahan University Of Technology, Isfahan, Iran,
[email protected]
***Isfahan University Of Technology, Isfahan, Iran,
[email protected]
****Isfahan University Of Technology, Isfahan, Iran,
[email protected]
Abstract: To protect the copyright of audio signals, several
watermarking algorithms have been proposed in recent years.
Many of them are based on the wavelet transform, but these
methods are not robust enough against signal processing
attacks. This paper presents a new audio watermarking
algorithm based on Hybrid Wavelets and Directional Filter
banks (HWD) and Singular Value Decomposition (SVD). The
proposed method embeds the watermark in the directional
subbands of the audio matrix. To embed multiple copies, the
signal is divided into frames and each frame is split into two
parts. The first part is used for the synchronization code and
the second for watermark embedding. The synchronization code
is embedded in the time domain for efficiency, and the
watermark is embedded in SVD-blocks of different directions
using HWD. Experimental results show that the proposed method
has increased robustness and imperceptibility. It also has an
acceptable data payload.
Keywords: Audio Watermarking, Hybrid Wavelets and
Directional Filter banks, SVD.
1. Introduction
Due to the growth of the Internet and computer networks,
digital multimedia content can easily be exchanged
among people. This ease of distribution, together with the
ability to duplicate multimedia accurately and
inexpensively, poses tough challenges to copyright
protection and intellectual property. A response to this
problem is digital watermarking [1], which includes three
main categories: image, audio and video watermarking.
Although audio and image watermarking are similar in
some aspects, audio watermarking involves more
complications. For example, the Human Auditory System
(HAS) is much more sensitive to changes than the
Human Visual System (HVS). Another challenge is
that the ratio of the highest to the lowest audible
frequency is approximately 1,000 (range of 20 Hz-20
kHz), whereas this ratio for visible light is about
2 [2]. In contrast to its large dynamic range, the HAS
has a fairly small differential range, i.e. loud sounds
generally tend to mask out weaker sounds [3].
According to IFPI (International Federation of the
Phonographic Industry), an audio watermarking scheme must
have some minimum properties: (1) A watermarked audio
signal should maintain more than 20 dB SNR. (2)
Watermarked signals should not reveal any clues about
the watermarks in them. Also, the security of the
watermarking procedure must depend on secret keys, but
not on the secrecy of the watermarking algorithm. (3) A
watermarking scheme must have the ability to extract the
watermark from a watermarked audio signal after
applying various signal processing attacks. (4) The
amount of data that can be embedded into the host audio
signal without losing imperceptibility (Payload) should
be more than 20 bps.
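The two quantitative requirements, (1) and (4), can be checked numerically. The sketch below is only illustrative: it uses the usual energy-ratio definition of SNR (the exact IFPI measurement procedure is not specified here), and the helper names `snr_db` and `payload_bps` are hypothetical. The payload figures are taken from the experimental settings reported later in this paper (a 32 × 24 logo, 262,320 samples per frame, 44.1 kHz), under the assumption that each frame carries one complete copy of the logo.

```python
import math


def snr_db(original, watermarked):
    """Signal-to-noise ratio in dB, treating the embedding distortion as noise."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, watermarked))
    if noise_power == 0:
        return math.inf  # identical signals: no distortion at all
    return 10.0 * math.log10(signal_power / noise_power)


def payload_bps(bits_per_frame, frame_len_samples, sample_rate_hz):
    """Embedded bits per second of audio (assumes one watermark copy per frame)."""
    return bits_per_frame * sample_rate_hz / frame_len_samples


# Settings reported in the experimental section of this paper.
rate = payload_bps(32 * 24, 262_320, 44_100)
print(f"payload = {rate:.1f} bps, requirement met: {rate > 20}")
```

A scheme would pass requirement (1) when `snr_db(original, watermarked)` exceeds 20 and requirement (4) when `payload_bps(...)` exceeds 20.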
There are mainly two types of attacks which may
distort a watermarked audio signal: (1) Modifying the
amplitude of the audio signal, which results in the loss of
part of the hidden information. These attacks include noise
corruption, amplitude scaling, re-sampling, and MP3
compression. (2) Destroying the synchronization of the
watermark in the time domain, which is more effective than
corrupting the watermarked audio amplitude directly and
includes attacks such as time scaling, shifting and
cropping. Few methods withstand synchronization
attacks.
Many audio watermarking schemes based on the WT
(Wavelet Transform) have been introduced [4] in recent
years. To overcome some of the restrictions of the WT, the
lifting scheme of wavelets was introduced in [5, 6].
An algorithm based on the LWT (Lifting
Wavelet Transform) is proposed in [7], which
showed that watermark detection can be implemented
quickly without the original signal, but the
method is not very robust. An improved algorithm
presented in [8] proposes a quantization method that is
more robust. One of the main problems of most of these
methods is their weakness against synchronization
attacks: once the position is lost, for example by random
cropping, the watermark cannot be detected easily.
The method in [9] uses a synchronization
signal in the embedding procedure, and in the extraction
phase detection begins after the synchronization signal is
located. Although this method can resist some random
cropping attacks, the watermark still cannot be extracted
properly if the watermarked signal is cut. In [10] a lifting
wavelet domain audio watermarking algorithm based on
the statistical characteristics of sub-band coefficients is
proposed. This method uses self-synchronization, but
since it embeds the synchronization code in the wavelet
domain, the detection procedure is very time consuming,
and its overall robustness is limited. After directional
transforms were introduced, it was shown
[11, 12] that they are more efficient than wavelets in
representing two-dimensional signals; that is, they
capture significant information about an object of interest
using a compact description.
In this paper a self-synchronization method is
presented which embeds a synchronization code in each
frame of the original audio signal. The watermark is a
binary logo which is embedded in SVD-blocks of
different directional subbands of the HWD transform. The HWD
transform is a new family of non-redundant
multiresolution directional transforms presented by
Eslami et al. [12]. In the proposed method the watermark is
encrypted using the Arnold map before embedding to prevent
unauthorized detection. We call our method the Robust
HWD based Algorithm (RoHA). Experimental results
are compared with [10] and show the high performance of
the proposed algorithm.
The rest of the paper is organized as follows: Section 2
explains the proposed audio watermarking approach.
Section 3 gives the experimental results and evaluations
and Section 4 gives some concluding remarks.
2. Proposed Method
This section presents the details of the proposed
algorithm. We first explain the chaotic encryption
algorithm. Then Hybrid Wavelets and Directional Filter
banks are briefly introduced, followed by a short
discussion of singular value decomposition. Finally, the
embedding and extraction procedures are explained.
2.1 Chaotic Encryption Algorithm
Chaotic maps are used to encrypt watermarks. In the
watermarking literature, chaotic maps are used to prevent
unauthorized detection.
Here we use the Arnold transform. Since the Arnold
transform is periodic, the number of scrambling iterations
can be used as a key to enhance security. The Arnold
transform for an N by N matrix is shown below.
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \pmod{N} \qquad (1)$$
where (x, y) is the position of a pixel of the watermark
image and (x′, y′) is the position of that pixel after
scrambling.
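The scrambling and its inversion can be sketched as follows. This is a minimal illustration assuming a square N × N NumPy array and the map of equation (1); the function names `arnold_scramble`, `arnold_unscramble` and `arnold_period` are hypothetical and not part of the proposed method's notation.

```python
import numpy as np


def _grid(n):
    # Coordinate grids x[i, j] = i and y[i, j] = j for an n-by-n image.
    return np.meshgrid(np.arange(n), np.arange(n), indexing="ij")


def arnold_scramble(img, iterations):
    """Apply (x, y) -> (x + y, x + 2y) mod N to every pixel, `iterations` times."""
    n = img.shape[0]
    assert img.shape == (n, n), "the Arnold map needs a square image"
    x, y = _grid(n)
    out = img
    for _ in range(iterations):
        nxt = np.empty_like(out)
        nxt[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = nxt
    return out


def arnold_unscramble(img, iterations):
    """Undo `iterations` rounds of arnold_scramble."""
    n = img.shape[0]
    x, y = _grid(n)
    out = img
    for _ in range(iterations):
        prev = np.empty_like(out)
        prev[x, y] = out[(x + y) % n, (x + 2 * y) % n]
        out = prev
    return out


def arnold_period(n):
    """Smallest number of iterations after which every pixel is back in place."""
    x, y = _grid(n)
    cx, cy, k = x, y, 0
    while True:
        cx, cy = (cx + cy) % n, (cx + 2 * cy) % n
        k += 1
        if np.array_equal(cx, x) and np.array_equal(cy, y):
            return k
```

Because the map is periodic, the iteration count used as the key is only meaningful modulo `arnold_period(N)`; a receiver who knows the key can either call `arnold_unscramble` or scramble the remaining number of iterations up to the period.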
2.2 Hybrid Wavelets and Directional Filter banks
In [12] Eslami and Radha proposed a family of non-redundant
transforms using Hybrid Wavelets and Directional Filter
banks. They extended the directionality of the wavelet
transform by applying DFBs to the highpass
channels of the wavelet transform, hence
the name hybrid wavelets and directional filter banks
(HWD) transform family. Since the WT already provides
horizontal and vertical subbands, different
paradigms can be considered for applying DFBs to the
finest subbands of wavelets. They proposed two types of
HWD transforms. Here we use HWD-F.
Fig. 1: Schematic plot of the HWD-F transform for 3 directional levels.
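To achieve HWD-F we apply full-tree DFBs with $l_j$
levels to all three highpass subbands of wavelets at
levels $j \ge 1$. We denote the subbands by
$$H_j^{(d)},\ V_j^{(d)} \ \text{and}\ D_j^{(d)} \qquad (2)$$
$$\left(d \in K^{(j)} = \{\, d \mid 1 \le d \le 2^{l_j} \,\}\right) \qquad (3)$$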
A schematic diagram of the HWD-F transform is
illustrated in Fig. 1. Using the noble identities [13], we
can move the DFB filters before the downsampling stage of
the WT.
Note that any number of directions can be used
in the DFB stage and that both stages, WT and DFB, are
non-redundant. Consequently, the HWD transforms provide a
family of non-redundant and flexible basis elements.
2.3 Singular Value Decomposition
The singular value decomposition of a matrix is a
factorization of the matrix into a product of three
matrices. Given an $m \times n$ matrix $A$, where $m \ge n$, the
SVD of $A$ is defined as $A = U S V^{T}$, where $U$ is an
$m \times n$ column-orthogonal matrix whose columns are
referred to as left singular vectors,
$S = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n)$ is an
$n \times n$ matrix whose diagonal elements are nonnegative
singular values arranged in descending order, and $V$ is an
$n \times n$ orthogonal matrix whose columns are referred to as
right singular vectors. If $\mathrm{rank}(A) = r$, then $S$ satisfies:
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots = \sigma_n = 0 \qquad (4)$$
According to [14] the singular values (SVs) of a matrix
have very good stability, that is, when a small
perturbation is added to a matrix, its SVs do not change
significantly, and SVs represent intrinsic algebraic
properties of a matrix.
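The stability property can be illustrated numerically. The sketch below is not taken from [14]; it assumes NumPy, a random 8 × 8 test matrix and a random perturbation chosen only for demonstration, and the helper name `sv_shift` is hypothetical. It relies on the standard bound (Weyl's inequality) that no singular value moves by more than the spectral norm of the perturbation.

```python
import numpy as np


def sv_shift(a, e):
    """Return (largest change of any singular value, spectral norm of e)."""
    s = np.linalg.svd(a, compute_uv=False)          # SVs of the original matrix
    s_pert = np.linalg.svd(a + e, compute_uv=False)  # SVs after perturbation
    return float(np.max(np.abs(s - s_pert))), float(np.linalg.norm(e, 2))


rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))           # stand-in for one coefficient block
E = 1e-3 * rng.standard_normal((8, 8))    # small perturbation ("attack")

shift, bound = sv_shift(A, E)
print(f"max SV change = {shift:.2e}, bound ||E||_2 = {bound:.2e}")

# The factorization itself: A = U S V^T with SVs in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
reconstruction_error = float(np.max(np.abs(U @ np.diag(s) @ Vt - A)))
```

This is why quantizing a quantity derived from the SVs, rather than individual coefficients, is a reasonable basis for a robust watermark.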
2.4 Embedding Process of RoHA
In our method first we use framing in order to increase
robustness against synchronization attacks. Each frame
consists of two parts as depicted in Fig. 2.
Fig. 2: Construction of each frame. Segment i consists of a
synchronization code part followed by the watermark part W(i).
Fig. 3. Watermark embedding flowchart for each frame
After framing, every frame is processed in the same way,
as shown in the block diagram of Fig. 3. Note that
the length of each segment depends on the data payload
that we would like to embed. The larger the frame is, the
more data can be embedded.
For synchronization code embedding we used the
method of [15], which operates in the time domain. Hence,
searching for the synchronization code is
faster than in transform domain methods.
The description of the blocks in Fig. 3 for embedding the
watermark is as follows:
(1) Watermark logo: a $M_1 \times M_2$ binary image.
(2) Watermark encryption: In order to prevent
unauthorized detection we used encryption as
explained in section 2.1. The outcome of the
encryption block ($W_e$) is broken into $n = 3 \times 2^k$
row vectors $w_i$. In this equation, 3 stands for the number
of wavelet subbands (vertical, horizontal
and diagonal) in HWD and $2^k$ is the number of
directions used in the DFB stage of HWD.
(3) Wavelet filter: To best merge the watermark and
the signal it is better to embed the watermark in
the most significant part of the signal. Here we
used two levels of DWT to reach the lowest-frequency
subband, where most of the energy is
concentrated.
(4) Matrix formation: After the lowest-frequency
subband is obtained, the resulting one-dimensional
vector is reshaped into a two-dimensional matrix
(matrix formation).
(5) Hybrid Wavelets and Directional Filter bank:
Here the output of the wavelet stage is passed through the
HWD block, according to section 2.2. Then for
each subband (H2, V2, D2), $2^k$ submatrices are
derived, each representing one direction in the
corresponding subband. This means that there are
$n = 3 \times 2^k$ matrices ($C_i$) at the output of block #5.
(6) Embedding: In the embedding process each
vector $w_i$ is embedded in its corresponding matrix $C_i$.
To do so, each bit of the vector $w_i$ is embedded in
one block of the corresponding matrix $C_i$ by the
following steps:
Step1: Partition the matrix $C_i$ into 2D matrix blocks
$B_j$, $j = 1, 2, \ldots, a \times b$, each of size $p \times q$, where
$a \times b$ is the number of bits in the corresponding vector
$w_i$. The row and column numbers of the 2D blocks are
selected by the user to achieve proper imperceptibility as
well as robustness, and the SVD of each block is computed.
Step2: Let $v_j = (\sigma_1, \sigma_2, \ldots, \sigma_r)$ be the vector of SVs of
block $B_j$. The norm of this vector is computed as follows:
$$\|v_j\| = \sqrt{\sum_{t=1}^{r} (\sigma_t)^2} \qquad (5)$$
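Step3: Compute the mean ($\mu_j$) and standard deviation
($s_j$) of each block $B_j$.
Step4: The weight of each block is given by:
$$\omega_j = \alpha + \frac{s_j}{\beta \times \mu_j} \qquad (6)$$
In equation (6), $\alpha$ and $\beta$ are user-defined parameters.
Step5: Amongst all $\omega_j$ values, the maximum and the
minimum are chosen and called $\omega_{\max}$ and $\omega_{\min}$,
respectively.
Step6: To increase robustness and decrease distortion, we
propose an adaptive decision method for the quantization
steps, which is better than using constant steps. The
quantization step $\Delta_j$ for block $B_j$ is calculated adaptively
using equation (7):
$$\Delta_j = \Delta_{\min} + (\Delta_{\max} - \Delta_{\min}) \frac{\omega_j - \omega_{\min}}{\omega_{\max} - \omega_{\min}}, \quad j = 1, 2, \ldots, a \times b \qquad (7)$$
In this equation $\Delta_{\min}$ and $\Delta_{\max}$ are user-defined minimum
and maximum quantization step values, respectively.
Step7: Then the integer $N_j = \lfloor \|v_j\| / \Delta_j \rfloor$ is computed,
where $\Delta_j$ is the quantization step for block $B_j$ and $\|v_j\|$
is the norm of the vector of SVs of $B_j$.
Step8: Each bit $w_i(j)$ of the watermark sub-segment $w_i$,
corresponding to the block $B_j$, is embedded as follows:
If ($w_i(j) = 1$ and $N_j \bmod 2 = 1$), then $N_j = N_j + 1$
If ($w_i(j) = 0$ and $N_j \bmod 2 = 0$), then $N_j = N_j + 1$
so that an even $N_j$ carries bit 1 and an odd $N_j$ carries
bit 0, in agreement with the extraction rule of section 2.5.
Step9: Calculate the value $\|v_j'\| = \Delta_j \times N_j + \Delta_j / 2$ and
the modified SVs as follows:
$$(\sigma_1', \sigma_2', \ldots, \sigma_r') = (\sigma_1, \sigma_2, \ldots, \sigma_r) \times \frac{\|v_j'\|}{\|v_j\|} \qquad (8)$$
Step10: The watermarked blocks are obtained by
applying the inverse SVD using the modified SVs. Then
the matrices $C_i'$ are reconstructed from all the modified
blocks.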
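Steps 2 to 10 can be sketched for a list of blocks as follows. This is an illustration under stated assumptions rather than the authors' implementation: it assumes NumPy, blocks with a nonzero mean and a nonzero SV norm, a floor operation in Step 7, and the parity convention that an even quantization index carries bit 1. The function names `adaptive_steps` and `embed_bit` are hypothetical.

```python
import numpy as np


def adaptive_steps(blocks, alpha, beta, d_min, d_max):
    """Steps 3-6: per-block weights (eq. 6) mapped to quantization steps (eq. 7)."""
    w = np.array([alpha + b.std() / (beta * b.mean()) for b in blocks])
    span = w.max() - w.min()
    if span == 0:  # all weights equal: fall back to the minimum step
        return np.full(len(blocks), d_min)
    return d_min + (d_max - d_min) * (w - w.min()) / span


def embed_bit(block, bit, delta):
    """Steps 2 and 7-10 for one block: quantize the norm of the SV vector."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    norm = np.linalg.norm(s)                 # eq. (5)
    n = int(np.floor(norm / delta))          # Step 7
    if bit == 1 and n % 2 == 1:              # Step 8: bit 1 -> even index
        n += 1
    if bit == 0 and n % 2 == 0:              # Step 8: bit 0 -> odd index
        n += 1
    new_norm = delta * n + delta / 2         # Step 9: centre of the chosen bin
    s_new = s * (new_norm / norm)            # eq. (8)
    return u @ np.diag(s_new) @ vt           # Step 10: inverse SVD


# Example with the parameter values reported in the experiments.
rng = np.random.default_rng(1)
blocks = [rng.uniform(1.0, 2.0, (8, 8)) for _ in range(4)]
steps = adaptive_steps(blocks, alpha=0.05, beta=0.80, d_min=0.48, d_max=0.50)
marked = [embed_bit(b, bit, d) for b, bit, d in zip(blocks, [1, 0, 1, 0], steps)]
```

Placing the modified norm at the centre of its quantization bin leaves a margin of half a step on either side, so perturbations that move the norm by less than $\Delta_j / 2$ do not change the recovered index.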
2.5 RoHA Extraction Process
Generally, false synchronization should be avoided
when selecting the synchronization code. Several factors
contribute to false synchronization: 1) the style of the
synchronization code, 2) the length of the synchronization
code, 3) the probability of "0" and "1" in the synchronization
code. Amongst all of them, the length of the
synchronization code is especially important. The longer
it is, the more robust it is. In most cases the
synchronization code is embedded in the time domain in
order to have fast access, but embedding in the time
domain often causes considerable distortion to the cover
signal. As in the embedding process, synchronization code
extraction is taken from [15], which is a simple and direct
method. A correlation measure is used to compare the two
bit sequences.
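The correlation-based search can be sketched as follows. This is only an illustration: the time-domain bit embedding and demodulation of [15] are not reproduced, so the sketch assumes that a stream of candidate bits has already been read from the signal. The function name `find_sync` and the threshold value are hypothetical; the code word is the one reported in the experiments.

```python
import numpy as np

# 16-bit synchronization code used in the experiments of this paper.
SYNC_CODE = np.array([int(c) for c in "1111100110101110"], dtype=np.int64)


def find_sync(bits, code=SYNC_CODE, threshold=1.0):
    """Return (offsets whose normalized correlation >= threshold, all scores)."""
    b = 2 * np.asarray(bits, dtype=np.int64) - 1   # map {0, 1} -> {-1, +1}
    c = 2 * np.asarray(code, dtype=np.int64) - 1
    # scores[k] = (matches - mismatches) / len(code) for the window starting at k
    scores = np.correlate(b, c, mode="valid") / len(code)
    return np.flatnonzero(scores >= threshold), scores


# A toy bit stream with the code placed at offset 20.
stream = np.concatenate([np.zeros(20, dtype=np.int64), SYNC_CODE,
                         np.zeros(10, dtype=np.int64)])
hits, scores = find_sync(stream)
```

A threshold below 1.0 tolerates a few bit errors in the code at the cost of a higher chance of false synchronization, which is the trade-off discussed above.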
The watermark can be extracted without using the
original audio signal as follows:
Step1: Locate the beginning position of the watermark in
the audio frame using the synchronization code searching
technique explained in [15]. The output is a vector
with a pre-specified length given as a secret key.
Step2: Apply two levels of DWT to reach the LL
subband. Then reshape the LL subband into a matrix.
Step3: Apply HWD with $2^k$ directions ($k$ is the number of
directional levels used in the embedding phase). Now find
the watermark bits from the $3 \times 2^k$ matrices as follows:
Partition each matrix into 2D matrix blocks of size
$p \times q$. For each block compute the value
$\|v\| = \sqrt{\sum_{t=1}^{r} (\sigma_t)^2}$,
where the vector $v = (\sigma_1, \sigma_2, \ldots, \sigma_r)$ is formed by the
SVs of the block. Find the integer $N = \lfloor \|v\| / \Delta \rfloor$. If
$N \bmod 2 = 0$, then the embedded bit is 1, otherwise it is
0 (the value of $\Delta$ is provided by the embedder as a secret
key). Then put the bits in sequence to obtain the
watermark vector and reshape it into the watermark image.
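The block-wise bit recovery of Step3 can be sketched as below. The sketch assumes NumPy, that the directional matrix is already available (the DWT and HWD stages are not reproduced), that blocks are read in row-major order, and that one quantization step per block is supplied as the key. The names `split_blocks` and `extract_bits` are hypothetical.

```python
import numpy as np


def split_blocks(matrix, p, q):
    """Partition a matrix into non-overlapping p-by-q blocks, row-major order."""
    rows, cols = matrix.shape
    return [matrix[r:r + p, c:c + q]
            for r in range(0, rows - p + 1, p)
            for c in range(0, cols - q + 1, q)]


def extract_bits(matrix, p, q, deltas):
    """Recover one bit per block from the parity of floor(||v|| / delta)."""
    bits = []
    for block, delta in zip(split_blocks(matrix, p, q), deltas):
        s = np.linalg.svd(block, compute_uv=False)   # SVs of the block
        n = int(np.floor(np.linalg.norm(s) / delta))
        bits.append(1 if n % 2 == 0 else 0)          # even index -> bit 1
    return np.array(bits, dtype=np.uint8)


# Two 2-by-2 blocks whose SV vector has norm 5 (SVs 4 and 3).
blk = np.diag([3.0, 4.0])
demo = np.hstack([blk, blk])
recovered = extract_bits(demo, 2, 2, deltas=[0.4, 2.0 / 3.0])
```

In the demo the first block gives floor(5 / 0.4) = 12 (even, bit 1) and the second gives floor(7.5) = 7 (odd, bit 0). Since the norm of the SV vector equals the Frobenius norm of the block, the same index could also be computed without an explicit SVD.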
3. Experimental Results
To evaluate our scheme, we carried out performance
and robustness tests and compared the watermark
detection results with those of reference [10]. All of the
audio signals in the test were music clips recorded at 16
bits per sample and 44.1 kHz. We used a 32 × 24 binary
image as our watermark logo. For all audio signals a
16-bit Barker code of 1111100110101110 is used for
synchronization. We fixed the length of each
embedded watermark segment at 262,320 samples. In the
HWD stage we used 8 directions to embed. Our block
sizes for SVD were 8 by 8. The other parameters were
a synchronization parameter of 11 (this parameter is used
in the synchronization step [16]), the weight parameters
$\alpha = 0.05$ and $\beta = 0.80$ of equation (6), and the
quantization step limits $\Delta_{\min} = 0.48$ and
$\Delta_{\max} = 0.50$. All of the parameters were chosen to
achieve the best trade-off between robustness and
imperceptibility.
For the robustness test we applied two groups of
attacks: attacks that modify the amplitude of the signal
and attacks that de-synchronize the watermarked signal.
1) First group:
(1) Lowpass filtering 1 and 2: application of a 9th-order
Chebyshev filter with a cut-off frequency of 11.025 kHz
and 8 kHz, respectively;
(2) Noise adding: addition of zero-mean white noise
whose variance is 0.01;
(3) Re-quantization: re-quantization from 16-bit to
8-bit and then back to 16-bit;
(4) Re-sampling: down-sampling to 22.05 kHz
followed by up-sampling back to 44.1 kHz;
(5) MP3 Compression: compression of the audio
signal with a compression rate of 22:1, then
decompression of the signal.
2) Second group:
(1) Cropping: ten percent of the audio signal is cropped
at one of three positions (front, middle and back)
selected randomly;
(2) Jittering 1 to 4: cropping of one sample out of every
100, 500, 1000 and 2000 samples, respectively;
(3) Random cropping 1: selection of 5 positions
randomly and removal of 100 samples at each
position;
(4) Random cropping 2: selection of 10 positions
randomly and removal of 100 samples at each
position;
(5) Random cropping 3: selection of 10 positions
randomly and removal of 500 samples at each
position;
(6) Random cropping 4: selection of 10 positions
randomly and removal of 1000 samples at each
position.
Table I is a reference that shows the relation between
the Bit Error Rate (BER) and the visibility of the binary
watermark logo. The BER is used to evaluate
the watermark detection accuracy after signal processing
operations. The BER of the retrieved watermark
is defined as follows:
$$BER = \frac{\sum_{i=1}^{M_1} \sum_{j=1}^{M_2} W(i,j) \oplus W'(i,j)}{M_1 \times M_2} \qquad (9)$$
In Equation (9), $W$ and $W'$ are the original and the
extracted watermarks, respectively. Also, $M_1$ and $M_2$ are
the watermark’s width and length and ⨁ represents the
exclusive OR (XOR) operator.
To evaluate the similarity between the original and the
extracted watermarks, the Normalized cross-Correlation (NC)
is computed using Equation (10):
$$NC(W, W') = \frac{\sum_{i=1}^{M_1} \sum_{j=1}^{M_2} W(i,j)\, W'(i,j)}{\sqrt{\sum_{i=1}^{M_1} \sum_{j=1}^{M_2} W^2(i,j)}\ \sqrt{\sum_{i=1}^{M_1} \sum_{j=1}^{M_2} W'^2(i,j)}} \qquad (10)$$
All of the parameters used in Equation (10) are
previously defined.
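Both measures are straightforward to compute for binary logos. The following sketch assumes NumPy arrays of 0/1 values with at least one nonzero pixel in each image (otherwise the NC denominator vanishes); the function names `ber` and `nc` are hypothetical.

```python
import numpy as np


def ber(w, w_ext):
    """Bit error rate of eq. (9), as a fraction (multiply by 100 for percent)."""
    w = np.asarray(w, dtype=np.uint8)
    w_ext = np.asarray(w_ext, dtype=np.uint8)
    return float(np.mean(np.bitwise_xor(w, w_ext)))


def nc(w, w_ext):
    """Normalized cross-correlation of eq. (10)."""
    w = np.asarray(w, dtype=np.float64)
    w_ext = np.asarray(w_ext, dtype=np.float64)
    denom = np.sqrt(np.sum(w ** 2)) * np.sqrt(np.sum(w_ext ** 2))
    return float(np.sum(w * w_ext) / denom)


# A random 32-by-24 logo with 6 bits flipped in the "extracted" copy.
rng = np.random.default_rng(0)
logo = rng.integers(0, 2, size=(32, 24), dtype=np.uint8)
extracted = logo.copy()
flat = extracted.reshape(-1)        # view: edits propagate to `extracted`
flat[:6] ^= 1
print(f"BER = {100 * ber(logo, extracted):.2f} %, NC = {nc(logo, extracted):.3f}")
```

For a 32 × 24 logo each wrong bit contributes 1/768 of the total, so six wrong bits correspond to a BER of about 0.78 %.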
Table II shows the results after applying several
attacks to the watermarked signals, in comparison with [10].
As shown, the proposed RoHA method is more robust than the
scheme in [10] for most of the attacks, in particular the
de-synchronization attacks.
The quantization step for embedding, the Barker code
for synchronization and the encryption of the watermark
before embedding are the main security parameters in our
method. Since we embed the synchronization code in the
time domain, the detection process in our method is faster
than in the scheme of [10], which embeds the
synchronization code in the wavelet domain and is
therefore time consuming to extract.
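Several of the attacks used in the robustness test are simple enough to simulate directly. The sketch below covers noise addition, re-quantization, jittering and random cropping; lowpass filtering, re-sampling and MP3 compression need external tools and are omitted. It assumes NumPy, floating-point samples in [-1, 1] for the noise attack and 16-bit integer samples for re-quantization. The function names are hypothetical, and drawing the cropping positions from a grid of non-overlapping slots is an assumption made so that the removed segments never overlap.

```python
import numpy as np


def add_white_noise(x, variance, rng):
    """Add zero-mean white Gaussian noise with the given variance."""
    return x + rng.normal(0.0, np.sqrt(variance), size=x.shape)


def requantize_16_8_16(x):
    """Drop the 8 least significant bits of int16 samples and scale back."""
    x = np.asarray(x, dtype=np.int16)
    return np.left_shift(np.right_shift(x, 8), 8)


def jitter(x, period):
    """Remove one sample out of every `period` samples."""
    return np.delete(x, np.arange(period - 1, len(x), period))


def random_crop(x, n_positions, n_samples, rng):
    """Remove `n_samples` consecutive samples at `n_positions` random places."""
    slots = len(x) // n_samples                      # non-overlapping candidates
    starts = rng.choice(slots, size=n_positions, replace=False) * n_samples
    keep = np.ones(len(x), dtype=bool)
    for s in starts:
        keep[s:s + n_samples] = False
    return x[keep]


rng = np.random.default_rng(0)
pcm = rng.integers(-32000, 32000, size=44_100).astype(np.int16)  # 1 s of toy audio
jittered = jitter(pcm, 100)                 # "Jittering 1"
cropped = random_crop(pcm, 10, 100, rng)    # "Random cropping 2"
```

The jittering and cropping attacks shorten the signal, which is exactly what breaks a detector that relies on fixed sample positions and what the per-frame synchronization code is meant to repair.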
TABLE I: The relation between the BER and visibility of binary
watermark image
[Extracted binary watermark images at BER = 0, 5, 10, 15, 20 and 25 %; images not reproduced]
TABLE II: Watermark extraction results (NC and BER) of the proposed scheme and of Ref. [10]
Attack type             Ref. [10] NC   Proposed NC   Ref. [10] BER (%)   Proposed BER (%)
No attack               1              1             0                   0
Lowpass filter:1        1              1             0                   0
Lowpass filter:2        0.986          1             1.43                0
Noise addition          1              1             0                   0
Re-quantization         1              1             0                   0
Re-sampling             1              1             0                   0
MP3 compression         0.981          0.922         1.95                7.8
Add 10% (front)         1              1             0                   0
Add 10% (middle)        1              1             0                   0
Cropping 10% (front)    1              1             0                   0
Cropping 10% (middle)   1              1             0                   0
Jittering:1             0.802          1             19.79               0
Jittering:2             0.923          1             7.81                0
Jittering:3             0.965          1             3.51                0
Jittering:4             0.975          1             2.60                0
Random cropping:1       0.997          1             0.26                0
Random cropping:2       0.957          1             4.29                0
Random cropping:3       0.905          1             9.50                0
Random cropping:4       0.901          0.992         10.28               0.78
4. Conclusion
In this paper we proposed a new robust digital audio
watermarking algorithm which uses framing to
embed multiple copies of a watermark and thereby achieve
higher robustness. For embedding, each frame is split
into a synchronization code part and a watermark
part. Each bit of the synchronization code is embedded in a
sequence of samples in the time domain. The watermark is
embedded by quantizing the SVs of blocks in the directional
matrices produced by the HWD transform. To increase
robustness, before applying HWD we pass the second
part of each frame through a two-level DWT to
obtain the LL subband. The proposed method is blind:
it does not need the original audio in the extraction
phase, and the watermark can be extracted using some
security codes. Experimental results show the efficiency
of the proposed method in comparison to a lifting wavelet
based method.
References
[1] A. Gurijala, J. R. Deller Jr, “Advances in Audio and Speech Signal
Processing: Technologies and Applications,” edited by H. Perez-Meana, IGI Global, 2007.
[2] S. Esmaili, “Content Based Audio Watermarking and Retrieval
using Time-Frequency analysis,” M.Eng thesis, Ryerson
University, Toronto, 2002.
[3] N. Cvejic, “Algorithms for audio watermarking and
steganography,” Ph.D. dissertation, University of Oulu, Finland,
2004.
[4] Y. Qiang, Y. Wang, “A Survey of Wavelet-domain Based Digital
Image Watermarking Algorithm,” Computer Engineering and
Applications, 40, 11, 46–50, 2004.
[5] W. Sweldens, “The lifting scheme: a construction of second
generation wavelets,” SIAM Journal Mathematical Analysis, 29, 2,
511–546, 1997.
[6] I. Daubechies, W. Sweldens, “Factoring wavelet transforms into
lifting steps,” Journal of Fourier Analysis and Applications, 4, 3,
245–267, 1998.
[7] X. Y. Wang, H. Y. Yang, Y. R. Cui, H. Hong, “Content-based
adaptive digital audio watermarking algorithm in wavelet
domain,” Journal of Mini-Micro Systems, vol. 26, no. 8, pp. 1354-1357, 2005.
[8] X. Y. Wang, H. Y. Yang, H. Hong, “A new adaptive digital audio
watermarking algorithm,” Mini- Micro Systems, vol. 27, no. 7, pp.
1353-1357, 2006.
[9] J. Y. Qu, “Audio digital watermarking based on the lifting scheme
wavelet transform,” Computer & Digital Engineering, vol. 34, no.
4, pp. 91-94, 2006.
[10] Z. Tao, H. M. Zhao, J. Wu, J. H. Gu, Y. S. Xu, D. Wu, “A Lifting
Wavelet Domain Audio Watermarking Algorithm Based on the
Statistical Characteristics of Sub-Band Coefficients,” Journal of
Archives of Acoustics, vol. 35, no. 4, pp. 481-491, 2010.
[11] M. N. Do, M. Vetterli, “The Contourlet transform: An efficient
directional multiresolution image representation,” IEEE Trans. on
Image Processing, vol. 14, no.12, pp. 2091-2106, Dec. 2005.
[12] R. Eslami, H. Radha, “A New Family of Nonredundant
Transforms Using Hybrid Wavelets and Directional Filter Banks,”
IEEE Trans. on Image Processing, vol. 16, no. 4, Apr. 2007.
[13] P. P. Vaidyanathan, “Multirate Systems and Filter Banks,”
Englewood Cliffs, NJ: Prentice-Hall, 1993.
[14] R. Liu and T. Tan, “An SVD-based watermarking scheme for
protecting rightful ownership,” IEEE Trans. Multimedia, vol. 4,
pp. 121-128, Aug. 2002.
[15] D. Megias, J. Serra-Ruiz, M. Fallahpour, “Efficient self-synchronized
blind audio watermarking system based on time
domain and FFT amplitude modification,” International Journal of
Signal Processing, pp. 3078–3092, May 2010.