Super-Exponential Methods for Blind Deconvolution — Ofir Shalvi and Ehud Weinstein, IEEE Transactions on Information Theory, Vol. 39, No. 2, March 1993
(1)

Let s_n denote the unit sample response of the unknown system combined (convolved) with the equalizer; then we want to set c_n so that

s_n = e^{jφ} δ(n − k),   (2)

or, equivalently, that the frequency response of the combined system be

S(ω) = e^{j(φ − ωk)},   (3)

where k stands for the delay, and φ stands for the phase shift. We note that a constant delay is inherent because of the stationarity of the input process, and the constant phase
0018-9448/93$03.00 © 1993 IEEE
Fig. 1. Blind deconvolution model.
shift is also unavoidable if the input distribution is invariant
under rotation.
Two common measures of equalization performance (e.g., see [18]) are the maximum distortion (MD), defined by

MD(s) = \sum_{n ≠ n_0} |s_n| / |s_{n_0}|,

and the intersymbol interference (ISI), defined by

ISI(s) = \sum_{n ≠ n_0} |s_n|^2 / |s_{n_0}|^2,

where s_{n_0} is the component of s_n having the maximal absolute value (the leading tap, or cursor). Clearly, MD and ISI are zero if s_n is of the form of (2), and a small value of MD/ISI indicates the proximity to the desired solution.
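In code, the two measures follow directly from the definitions above; the sketch below computes them for an illustrative combined response (the tap values are not from the paper's experiments).

```python
# Maximum distortion (MD) and intersymbol interference (ISI) of a
# combined response s, computed from the definitions above.
def md(s):
    peak = max(abs(x) for x in s)
    return (sum(abs(x) for x in s) - peak) / peak

def isi(s):
    peak2 = max(abs(x) ** 2 for x in s)
    return (sum(abs(x) ** 2 for x in s) - peak2) / peak2

s = [0.1, 1.0, 0.2]             # leading tap s_{n0} = 1.0
print(round(md(s), 2))          # (0.1 + 0.2) / 1.0 = 0.3
print(round(isi(s), 2))         # (0.01 + 0.04) / 1.0 = 0.05
```

Both measures vanish exactly when s is a pure (possibly delayed) unit impulse.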
In the next section, we present a class of iterative algorithms for adjusting s_n that converge monotonically, at a very fast rate, to the desired response regardless of initialization. We then show how to convert these virtual s-domain algorithms into realizable c-domain algorithms, in which we only need to adjust the unit sample response (taps) c_n of the equalizer.
III. THE ALGORITHM IN THE s-DOMAIN
Consider the following iterative procedure for adjusting the unit sample response coefficients (taps) of the combined system:

s'_n = s_n^p (s_n^*)^q,   (4a)

s''_n = s'_n / \|s'\|,   (4b)

where s_n are the tap values before the iteration, s'_n are intermediate values, and s''_n are the tap values after the iteration. We denote by \|s\| = ( \sum_n |s_n|^2 )^{1/2} the norm of the (infinite-dimensional) vector s, and by (·)^* and (·)^† the conjugate and conjugate-transpose operations, respectively. If we choose p and q to be nonnegative integers such that p + q ≥ 2, then the transformation in (4a), followed by the normalization operation in (4b), causes the taps with smaller magnitudes to decrease more rapidly, forcing s_n to converge very quickly to the desired response in which one tap (the leading tap) approaches one in magnitude, while all other taps approach zero. The effect of the algorithm is illustrated in Fig. 2 for the case p = 2 and q = 1. Suppose we start with s_n = (0.8)^{|n|}, which corresponds to an initial ISI of 3.56. After one iteration we obtain s''_n = 0.76 · (0.512)^{|n|}, which corresponds to ISI = 0.71. On the next iteration cycle we obtain s_n = 0.98 · (0.134)^{|n|}, corresponding to ISI ≈ 0.037, which is already quite close to the desired response.
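The worked example above can be reproduced directly; the sketch below runs the iteration (4a), (4b) on a truncation of the doubly infinite response (the truncation length is an implementation assumption).

```python
# Sketch of the s-domain iteration (4a)-(4b) for p = 2, q = 1,
# reproducing the worked example: s_n = (0.8)**|n| has ISI = 3.56.
def isi(s):
    peak2 = max(abs(x) ** 2 for x in s)
    return (sum(abs(x) ** 2 for x in s) - peak2) / peak2

def iterate(s, p=2, q=1):
    # (4a): s'_n = s_n^p (s_n^*)^q  (real taps here, so conjugation is trivial)
    sp = [x ** p * x ** q for x in s]
    # (4b): normalize to unit norm
    norm = sum(x ** 2 for x in sp) ** 0.5
    return [x / norm for x in sp]

N = 60  # truncation of the doubly infinite response
s = [0.8 ** abs(n) for n in range(-N, N + 1)]
print(round(isi(s), 2))        # 3.56
s = iterate(s)
print(round(isi(s), 2))        # 0.71
s = iterate(s)
print(round(isi(s), 3))        # 0.037
```

Each pass cubes every tap (p + q = 3), so the gap between the leading tap and the rest widens super-exponentially.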
Fig. 2. Effect of the algorithm in the time domain. (a) Before first iteration. (b) After first iteration. (c) After second iteration.
Transforming (4a) and (4b) to the frequency domain, we obtain

S'(ω) = \left( \frac{1}{2π} \right)^{p+q-1} \int_{-π}^{π} ⋯ \int_{-π}^{π} S(ω_1) ⋯ S(ω_{p-1}) S^*(-ω_p) ⋯ S^*(-ω_{p+q-1}) S\left( ω - \sum_{i=1}^{p+q-1} ω_i \right) dω_1 ⋯ dω_{p+q-1}   (5a)

and

S''(ω) = S'(ω) \bigg/ \left[ \frac{1}{2π} \int_{-π}^{π} |S'(ν)|^2 dν \right]^{1/2},   (5b)

where S(ω), S'(ω), and S''(ω) are the Fourier transforms of s_n, s'_n, and s''_n, respectively. To demonstrate the effect of the algorithm in the frequency domain, consider the case p = 2 and q = 0. Suppose the initial S(ω) has a narrow spectral null of width Δω at frequency ω_0:

S(ω) = ε,  |ω - ω_0| ≤ Δω/2,
S(ω) = 1,  |ω - ω_0| > Δω/2.
After performing one iteration cycle we obtain (see Fig. 3), to first order in Δω,

S''(ω) ≈ 1 + (1-ε)^2 \frac{Δω}{2π} \left( 1 - \frac{|ω - 2ω_0|}{Δω} \right),  |ω - 2ω_0| ≤ Δω,
S''(ω) ≈ 1,  |ω - 2ω_0| > Δω.

If we regard the difference between S''(ω) and the desired response in (3) as a "disturbance," then a single iteration of the algorithm causes the width of the "disturbance" to be multiplied by a factor of 2, the location of the peak of the "disturbance" to be multiplied by a factor of 2, and the peak value to be multiplied by a factor of Δω/2π, which essentially eliminates the
Fig. 3. Effect of the algorithm in the frequency domain. (a) Before the iteration. (b) After the iteration.
spectral null. With arbitrary p and q, we obtain similar effects: the width of the "disturbance" is multiplied by a factor of (p + q), and the peak value is multiplied by a factor smaller than (Δω/2π)^{p+q-1}.
To analyze the convergence behavior of the algorithm, we note that by (4a) and (4b),

|s''_n| = |s_n|^{p+q} / \|s'\|.   (6)

Since p + q ≥ 2, then

|s''_{n_1}| / |s''_{n_2}| = ( |s_{n_1}| / |s_{n_2}| )^{p+q}.   (7)

That is, the ordering of the indexes n_1, n_2, ⋯ of the components of s by magnitude is preserved throughout the iterations. In particular, the index of the leading tap is preserved along the iterations. Thus,
MD(s'') = \sum_{n ≠ n_0} \frac{|s''_n|}{|s''_{n_0}|} = \sum_{n ≠ n_0} \left( \frac{|s_n|}{|s_{n_0}|} \right)^{p+q} ≤ \left[ \sum_{n ≠ n_0} \frac{|s_n|}{|s_{n_0}|} \right]^{p+q} = [MD(s)]^{p+q}   (8)

and

ISI(s'') = \sum_{n ≠ n_0} \frac{|s''_n|^2}{|s''_{n_0}|^2} = \sum_{n ≠ n_0} \left( \frac{|s_n|}{|s_{n_0}|} \right)^{2(p+q)} ≤ \left[ \sum_{n ≠ n_0} \frac{|s_n|^2}{|s_{n_0}|^2} \right]^{p+q} = [ISI(s)]^{p+q}.   (9)
Performing (8) and (9) iteratively we obtain, respectively,

MD(s^{(l)}) ≤ [MD(s^{(0)})]^{(p+q)^l},   (10)

ISI(s^{(l)}) ≤ [ISI(s^{(0)})]^{(p+q)^l},   (11)

where s^{(0)} = (⋯ s^{(0)}_{-1}, s^{(0)}_0, s^{(0)}_1 ⋯)^T denotes the initial value of s, and s^{(l)} denotes the value of s after l iterations of the algorithm. These inequalities indicate that MD(s^{(l)}) and ISI(s^{(l)}) converge to zero at a very fast, at least super-exponential (that is, exponential to the power l), rate, provided that MD(s^{(0)}) and ISI(s^{(0)}) are smaller than 1. Since by (4b) \|s^{(l)}\| = 1, s^{(l)} must converge to a vector having only one nonzero component whose magnitude equals one, i.e., to the desired response.
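The contraction in (8) is easy to verify numerically; the sketch below (illustrative real taps, p = 2, q = 1) checks that one pass of (4a), (4b) obeys MD(s'') ≤ [MD(s)]^{p+q}.

```python
# Numerical check of the bound (8): MD(s'') <= [MD(s)]^(p+q),
# for one iteration of (4a)-(4b) with p = 2, q = 1 and real taps.
def md(s):
    peak = max(abs(x) for x in s)
    return (sum(abs(x) for x in s) - peak) / peak

p, q = 2, 1
s = [0.3, -0.1, 1.0, 0.2]               # illustrative combined response
sp = [x ** (p + q) for x in s]           # (4a) for real taps
norm = sum(x ** 2 for x in sp) ** 0.5
s2 = [x / norm for x in sp]              # (4b)
print(md(s), md(s2), md(s) ** (p + q))   # 0.6, 0.036, 0.216
```

Here MD drops from 0.6 to 0.036 in one pass, well inside the bound 0.6³ = 0.216.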
To prove convergence to the desired response under weaker conditions, we invoke (6) and (7) once again:

MD(s'') = \sum_{n ≠ n_0} \left( \frac{|s_n|}{|s_{n_0}|} \right)^{p+q} ≤ \left( \max_{n ≠ n_0} \frac{|s_n|}{|s_{n_0}|} \right)^{p+q-1} MD(s).   (12)

Performing (12) iteratively,

MD(s^{(l)}) ≤ \prod_{i=0}^{l-1} \left( \max_{n ≠ n_0} \frac{|s_n^{(i)}|}{|s_{n_0}^{(i)}|} \right)^{p+q-1} MD(s^{(0)}).   (13)

By (6),

\max_{n ≠ n_0} \frac{|s_n^{(i)}|}{|s_{n_0}^{(i)}|} = r^{(p+q)^i},  where  r = \max_{n ≠ n_0} \frac{|s_n^{(0)}|}{|s_{n_0}^{(0)}|}.   (14)

Thus, substituting (14) into (13) and following straightforward algebraic manipulations,

MD(s^{(l)}) ≤ r^{(p+q)^l - 1} MD(s^{(0)}).   (15)
In a similar way, we obtain

ISI(s^{(l)}) ≤ r^{2[(p+q)^l - 1]} ISI(s^{(0)}).   (16)
Since r < 1 whenever the initial response possesses a unique dominant tap, (15) and (16) imply that MD(s^{(l)}) and ISI(s^{(l)}) converge to zero even when MD(s^{(0)}) and ISI(s^{(0)}) are not smaller than 1.

IV. THE ALGORITHM IN THE c-DOMAIN

A. Cumulants

The joint cumulant of the random variables x_1, x_2, ⋯, x_k is defined by

cum(x_1; x_2; ⋯; x_k) ≜ (-j)^k \frac{\partial^k}{\partial ω_1 \partial ω_2 ⋯ \partial ω_k} \ln E\{ e^{j(ω_1 x_1 + ω_2 x_2 + ⋯ + ω_k x_k)} \} \bigg|_{ω_1 = ⋯ = ω_k = 0},
where ln(·) stands for the natural logarithm. An alternative definition of cumulants in terms of moments is given in [3, ch. 2, definition 2.3.1]. Thus, if x_1, x_2, ⋯ are zero-mean random variables, then

cum(x_1; x_2) = E\{x_1 x_2\},
cum(x_1; x_2; x_3) = E\{x_1 x_2 x_3\},
cum(x_1; x_2; x_3; x_4) = E\{x_1 x_2 x_3 x_4\} - E\{x_1 x_2\} E\{x_3 x_4\} - E\{x_1 x_3\} E\{x_2 x_4\} - E\{x_1 x_4\} E\{x_2 x_3\},

and so on.
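The fourth-order formula above can be checked on identical arguments, where the three pairing terms coincide; the moment values below (Gaussian and uniform) are standard, not taken from the paper.

```python
# Fourth-order auto-cumulant cum(x : 4) of a zero-mean random variable,
# from the moment formula above: E{x^4} - 3 E^2{x^2}.
def cum4(m2, m4):
    # the three pairing terms E{x1 x2}E{x3 x4} coincide for identical arguments
    return m4 - 3 * m2 ** 2

sigma2 = 2.5
print(cum4(sigma2, 3 * sigma2 ** 2))   # zero-mean Gaussian (E{x^4} = 3 sigma^4): 0.0
print(cum4(1 / 3, 1 / 5))              # uniform on (-1, 1): -2/15 ~= -0.133
```

The Gaussian case vanishing is exactly the property that forces the input to be non-Gaussian in what follows.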
For notational convenience, let

cum(x : p; y : q) ≜ cum(x; ⋯; x; y; ⋯; y),

where x appears p times and y appears q times. The following properties can easily be verified (see [3]).

P1) Linearity: if λ_1, λ_2, ⋯ are constants, then cum(\sum_i λ_i x_i; y_1; ⋯; y_k) = \sum_i λ_i cum(x_i; y_1; ⋯; y_k).

P2) If the random variables x_1, x_2, ⋯, x_k can be divided into two or more statistically independent subsets, then their joint cumulant is zero.

P3) If x_1, x_2, ⋯, x_k are jointly Gaussian, then their joint cumulant is zero whenever k > 2.
B. Algorithm Implementation Using Finite-Length Equalizers

In this subsection, we show how to express the s-domain algorithm (4a), (4b) in terms of the equalizer taps and some data cumulants. We shall make the following assumptions.

1) The input a_t, t = 0, ±1, ±2, ⋯ is a sequence of real/complex independent identically distributed (i.i.d.) continuous/discrete type random variables. We assume the existence of the following nonzero cumulants of a_t:

σ_a^2 ≜ E\{|a_t|^2\} ≠ 0,   (17)

cum(a_t : p; a_t^* : q + 1) ≠ 0,   (18)

where p and q are the parameters appearing in (4a). Note that since p + q ≥ 2, then by P3) a_t must be non-Gaussian.

Comment: It is sufficient to assume that a_t, t = 0, ±1, ±2, ⋯ are statistically independent random variables. However, for the sake of simplicity, we further assume that they are also identically distributed.
2) The unknown system H = \{h_n\} is a stable, possibly nonminimum phase, linear time-invariant filter whose inverse H^{-1} (which may be noncausal) exists.

3) The equalizer C = \{c_n\} is a tap-delay line of finite length L = L_2 - L_1 + 1.

The combined system response subject to the finite-length restriction is

s_n = \sum_{k=L_1}^{L_2} h_{n-k} c_k,

or, in vector form,

s = Hc,   (19)

where s is the possibly infinite vector of taps of the combined system

s = (⋯ s_{-1}  s_0  s_1 ⋯)^T,

c is the (L × 1) vector of equalizer taps

c = (c_{L_1} ⋯ c_{L_2})^T,

and H is the matrix of L columns and a possibly infinite number of rows, whose elements are

H_{nk} = h_{n-k},  k = L_1, ⋯, L_2.   (20)
In the deconvolution problem, we want to adjust c so that s = Hc is equal to the vector δ^{(k)} whose nth element is δ_{n-k}, for some fixed k (for simplicity, we shall ignore the constant phase shift). However, since c is of finite length, we may only require that c be chosen to minimize the distance (norm) between Hc and δ^{(k)}. Assuming that H is known, this is a standard linear least squares problem whose solution is

c = \arg\min_c \| Hc - δ^{(k)} \|^2 = (H^†H)^{-1} H^† δ^{(k)},
s = Hc = H (H^†H)^{-1} H^† δ^{(k)},   (21)

where we note that by assumption 2) H is full rank, so that H^†H is invertible for any L. This is the optimal solution in the least squares sense, and also in the sense of minimizing the mean-square restoration error, since

E\{|z_t - a_{t-k}|^2\} = σ_a^2 \| Hc - δ^{(k)} \|^2.

No equalizer, with or without a training sequence, can do better than the solution in (21), even if H is known a priori. As L increases, the matrix product H(H^†H)^{-1}H^† approaches the identity matrix (more precisely, the kth column of H(H^†H)^{-1}H^† approaches δ^{(k)}), in which case the solution in (21) closely approaches the desired solution.
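As an illustration of (21), the numpy sketch below builds H for an assumed two-tap channel, solves the normal equations for several equalizer lengths L, and evaluates the residual ISI of s = Hc; the residual shrinks as L grows, as the discussion above predicts.

```python
import numpy as np

# Optimal finite-length equalizer (21) for an assumed two-tap channel,
# and the residual ISI of the combined response s = Hc.
h = np.array([1.0, 0.7])              # illustrative channel taps

def residual_isi(L, k=0):
    rows = L + len(h) - 1             # support of the combined response
    H = np.zeros((rows, L))
    for col in range(L):              # H[n, col] = h[n - col], cf. (20)
        H[col:col + len(h), col] = h
    delta = np.zeros(rows); delta[k] = 1.0
    c = np.linalg.solve(H.T @ H, H.T @ delta)   # eq. (21)
    s = H @ c
    peak2 = np.max(s ** 2)
    return (np.sum(s ** 2) - peak2) / peak2

for L in (2, 4, 8):
    print(L, residual_isi(L))         # residual ISI shrinks as L grows
```

The delay k = 0 is a convenient choice here because the assumed channel is minimum phase; for other channels a different k may be preferable.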
Having this result in mind, we now turn to the algorithm specified by (4a), (4b). According to (4a), we want to set c' so that s' = Hc' is equal to the vector g whose components are

g_n = s_n^p (s_n^*)^q.   (22)

However, since the number of equalizer taps available for adjustment is finite, we shall only require that c' be chosen to minimize the distance (norm) between Hc' and g:

c' = (H^†H)^{-1} H^† g.   (23a)

The normalization operation in (4b) is equivalent to

c'' = c' / ( c'^† H^†H c' )^{1/2}.   (23b)

Projecting the c-domain algorithm in (23a), (23b) back into the s-domain, we obtain

s' = Hc' = H (H^†H)^{-1} H^† g,   (24a)

s'' = Hc'' = s' / \|s'\|.   (24b)

Once again, if H(H^†H)^{-1}H^† were equal to the identity matrix, then (24a), (24b) would coincide with the ideal s-domain iteration (4a), (4b). It is not difficult to prove that the point of convergence of the algorithm subject to the finite-length restriction is given approximately by

s^{(∞)} ≈ H (H^†H)^{-1} H^† δ^{(k)} / [ (δ^{(k)})^† H (H^†H)^{-1} H^† δ^{(k)} ]^{1/2}   (25)

(just substitute s = s^{(∞)} into (24a), (24b) and show that, to a first-order approximation, s'' = s^{(∞)} up to a constant phase shift), and the ISI at that point is

ISI(s^{(∞)}) = \frac{1 - (δ^{(k)})^† H (H^†H)^{-1} H^† δ^{(k)}}{(δ^{(k)})^† H (H^†H)^{-1} H^† δ^{(k)}}.   (26)

We see that s^{(∞)} coincides with (21) up to a gain factor which is very close to unity. Thus, by iteratively performing the algorithm in (23a), (23b), we converge to the optimal attainable solution subject to the finite-length restriction. The residual ISI given by (26) is close to zero for sufficiently large L. There is a weak dependence on the index k of the leading tap, which can be exploited through initialization, since the index of the leading tap is preserved throughout the iterations.

The algorithm in (23a), (23b) is still implicit, since H and g are unknown. Our next step is to convert this algorithm into a realizable algorithm expressed in terms of joint cumulants of the input and the output of the equalizer. We start from

z_t = \sum_n c_n y_{t-n} = \sum_n s_n a_{t-n}.

Then,
invoking properties P1) and P2),

cum(z_t : p; z_t^* : q; y_{t-n}^*)
= \sum_{k_1} ⋯ \sum_{k_{p+q+1}} s_{k_1} ⋯ s_{k_p} s_{k_{p+1}}^* ⋯ s_{k_{p+q}}^* h_{k_{p+q+1}-n}^* cum(a_{t-k_1}; ⋯; a_{t-k_p}; a_{t-k_{p+1}}^*; ⋯; a_{t-k_{p+q+1}}^*)
= cum(a_t : p; a_t^* : q+1) \sum_k s_k^p (s_k^*)^q h_{k-n}^*
= cum(a_t : p; a_t^* : q+1) [H^† g]_n,   (27)

where we have used the fact that a_t is i.i.d., so that only the terms with k_1 = ⋯ = k_{p+q+1} survive. Similarly,

cum(y_{t-m}; y_{t-n}^*) = σ_a^2 \sum_k h_{k-m} h_{k-n}^* = σ_a^2 [H^†H]_{nm}.   (28)
SHALVI AND WEINSTEIN: SUPER-EXPONENTIAL METHODS FOR BLIND DECONVOLUTION 3V
Hence, by (27) and (28), the algorithm in (23a), (23b) can be expressed as

c' = R^{-1} d,   (29a)

c'' = c' / ( c'^† R c' )^{1/2},   (29b)

where R is the L × L matrix whose elements are

R_{nm} = \frac{1}{σ_a^2} cum(y_{t-m}; y_{t-n}^*),  L_1 ≤ n, m ≤ L_2,   (30)

and d is the L × 1 vector whose elements are

d_n = \frac{1}{γ_{p,q}} cum(z_t : p; z_t^* : q; y_{t-n}^*),  L_1 ≤ n ≤ L_2,   (31)

where

γ_{p,q} = cum(a_t : p; a_t^* : q+1).   (32)

For zero-mean variables, the cumulants above can be expressed in terms of moments. In the case p = 1, q = 1 (third-order cumulants),

cum(z_t; z_t^*; y_{t-n}^*) = E\{|z_t|^2 y_{t-n}^*\},
cum(a_t; a_t^* : 2) = E\{|a_t|^2 a_t^*\},

and in the case p = 2, q = 1 (fourth-order cumulants),

cum(z_t : 2; z_t^*; y_{t-n}^*) = E\{z_t^2 z_t^* y_{t-n}^*\} - 2 E\{|z_t|^2\} E\{z_t y_{t-n}^*\} - E\{z_t^2\} E\{z_t^* y_{t-n}^*\},
cum(a_t : 2; a_t^* : 2) = E\{|a_t|^4\} - 2 E^2\{|a_t|^2\} - |E\{a_t^2\}|^2.

We have thus expressed the iteration entirely in terms of the second-order and higher order joint cumulants of the equalizer input y_t and output z_t, together with the input cumulants σ_a^2 and γ_{p,q} in (17) and (18).
C. The Algorithm with Empirical Cumulants

In practice, the exact cumulants are unknown and must be estimated from the data at the input and the output of the equalizer. Therefore, instead of (29a), (29b), we use

\hat{c}' = \hat{R}^{-1} \hat{d},   (37a)

\hat{c}'' = \hat{c}' / ( \hat{c}'^† \hat{R} \hat{c}' )^{1/2},   (37b)

where \hat{R} is the L × L matrix whose elements are

\hat{R}_{nm} = \frac{1}{σ_a^2} \widehat{cum}(y_{t-m}; y_{t-n}^*),   (38)

and \hat{d} is the L × 1 vector whose elements are

\hat{d}_n = \frac{1}{γ_{p,q}} \widehat{cum}(z_t : p; z_t^* : q; y_{t-n}^*),   (39)

where \widehat{cum}(·) denotes the estimate of cum(·), obtained by replacing ensemble averages with sample averages. For a data block of N samples,

\widehat{cum}(y_{t-m}; y_{t-n}^*) = \frac{1}{N} \sum_{t=1}^{N} y_{t-m} y_{t-n}^*,   (40)

and, in the case p = 1, q = 1,

\widehat{cum}(z_t; z_t^*; y_{t-n}^*) = \frac{1}{N} \sum_{t=1}^{N} |z_t|^2 y_{t-n}^*,   (41)

while in the case p = 2, q = 1,

\widehat{cum}(z_t : 2; z_t^*; y_{t-n}^*) = \frac{1}{N} \sum_t z_t^2 z_t^* y_{t-n}^* - 2 \left[ \frac{1}{N} \sum_t |z_t|^2 \right] \left[ \frac{1}{N} \sum_t z_t y_{t-n}^* \right] - \left[ \frac{1}{N} \sum_t z_t^2 \right] \left[ \frac{1}{N} \sum_t z_t^* y_{t-n}^* \right].   (42)

The term cum(a_t : p; a_t^* : q+1) appearing in the denominator of (39) is a complex constant that can be incorporated into the normalization operation in (37b). Thus, to implement the algorithm, we only need to know the average power σ_a^2 of a_t. The advantage of the algorithm lies in the fact that it does not impose any restrictions on the probability distribution of a_t, other than the non-Gaussianity conditions implied by (17) and (18). We also note that the sample averages in (40)-(42) are attractive to update recursively.
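As a sanity check of the empirical-cumulant iteration (37a), (37b), the numpy sketch below runs the p = 2, q = 1 version on a simulated real BPSK input through an assumed two-tap channel; the channel, lengths, and iteration count are illustrative, not the paper's experimental setup.

```python
import numpy as np

# Empirical-cumulant iteration (37a)-(37b) for p = 2, q = 1 and a real
# BPSK input: sigma_a^2 = 1 and gamma = cum(a:2; a:2) = E{a^4} - 3E^2{a^2} = -2.
rng = np.random.default_rng(0)
h = np.array([1.0, 0.7])               # illustrative channel
N, L = 5000, 11
a = rng.choice([-1.0, 1.0], size=N)    # BPSK input
y = np.convolve(a, h)[:N]              # channel output

# delay-line data matrix: Y[t, n] = y_{t-n}
Y = np.column_stack([np.r_[np.zeros(n), y[:N - n]] for n in range(L)])
R = Y.T @ Y / N                        # (38), (40)

def isi(c):
    s = np.convolve(h, c)
    return (np.sum(s ** 2) - np.max(s ** 2)) / np.max(s ** 2)

c = np.zeros(L); c[L // 2] = 1.0       # single-spike initialization
for _ in range(4):
    z = Y @ c
    # (39), (42) for real data: cum(z:3; y_{t-n}) = E{z^3 y} - 3 E{z^2} E{z y}
    d = (Y.T @ z ** 3) / N - 3 * np.mean(z ** 2) * (Y.T @ z) / N
    c = np.linalg.solve(R, d / -2.0)   # (37a)
    c /= np.sqrt(np.mean((Y @ c) ** 2))  # (37b): unit output power
print(round(isi(c), 3))                # residual ISI, close to zero
```

Since BPSK has constant modulus, the data-length fluctuations largely cancel, and the residual ISI is dominated by the finite equalizer length.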
where N is the number of data samples; by [3], these cumulant estimates are consistent with probability one (w.p.1). Expanding ISI(\hat{s}) at the point of convergence to second order in the estimation errors and taking expectations yields an expression of the form

E\{ISI(\hat{s})\} = ISI(s) + (data-length terms involving the variances σ^2(\hat{s}_n) and the covariances cov(\hat{s}_n, \hat{s}_{n_0}) of the estimated combined-response taps),

where σ^2(·) and cov(·) stand for the second-order moments of the estimation errors. The term ISI(s), given by (26), is the residual ISI due to the finite length of the equalizer, whereas the remaining terms represent the contribution of the finite length of the data.
We can now evaluate these expressions for specific cases of interest by carrying out the indicated expectation operations.

The evaluation for the case p = 1, q = 1 (third-order cumulants) is carried out in Appendix B. For simplicity, we assume that a_t is a real-valued zero-mean random variable. The result is

E\{ISI(\hat{s})\} ≈ ISI(s) + \frac{L-1}{N} ρ,   (49)

where

ρ = \frac{E\{a_t^2\} E\{a_t^4\} - E^2\{a_t^3\}}{E^2\{a_t^3\}}.   (50)

If L is comparable to the effective length of the inverse H^{-1} of H, then ISI(s) in (49) is very small, and the second (data-length) term dominates; the quantity ρ may then be interpreted as the statistical efficiency of the algorithm relative to deconvolution based on linear prediction. For example, if a_t is a binary random variable taking the values α and -γα with probabilities γ/(1+γ) and 1/(1+γ), respectively, then ρ = γ/(1-γ)^2. For γ < 0.38 or γ > 2.62, ρ < 1, indicating that in this range the statistical performance of the proposed
algorithm is better than that of linear prediction based on the minimum phase assumption. The reason is that third-order cumulants carry useful information that can be used to improve performance. However, in the range 0.38 < γ < 2.62, the relative efficiency is greater than one, and as γ approaches 1, ρ approaches infinity. This is since in this limit a_t becomes a binary symmetric random variable, for which the third-order cumulant is identically zero and, therefore, it does not carry any statistical information. In this case we must use the information contained in the higher order cumulants.
The calculation for the case p = 2 and q = 1 (fourth-order cumulants) is carried out in Appendix B. In this case, we let a_t be either a real- or complex-valued random variable. For simplicity, we have assumed that E\{a_t\} and E\{a_t^3\} are zero. In the complex case, we have further assumed that E\{a_t^2\} and E\{|a_t|^2 a_t\} are zero, in which case E\{z_t^2\} ≈ 0, so that the third term on the right-hand side of (42) need not be estimated. These assumptions are satisfied for most signal constellations used in data communication. By (18), a_t is restricted to have a nonzero fourth-order cumulant:

E\{|a_t|^4\} - A E^2\{|a_t|^2\} ≠ 0,

where A = 3 in the real case and A = 2 in the complex case. The result is

E\{ISI(\hat{s})\} ≈ ISI(s) + \frac{L-1}{N} \cdot \frac{E\{|a_t|^2\} E\{|a_t|^6\} - E^2\{|a_t|^4\}}{[ E\{|a_t|^4\} - A E^2\{|a_t|^2\} ]^2}.   (51)
We note that if E\{|a_t|^2 a_t\} ≠ 0, then (51) contains an additional term, as indicated in Appendix B. Ignoring the contribution of the bias terms in (51) and (49), the relative efficiency is

ρ = \frac{E\{|a_t|^2\} E\{|a_t|^6\} - E^2\{|a_t|^4\}}{[ E\{|a_t|^4\} - A E^2\{|a_t|^2\} ]^2}.   (52)
For example, if a_t is a uniformly distributed random variable, then ρ = 3/7, indicating that fourth-order cumulants provide useful information that can be used to improve performance (reduce the variance). For the V.22 and V.29 signal constellations that are commonly used in data communication (see Fig. 4), ρ is equal to 0.471 and 0.719, respectively, indicating that in both cases the proposed algorithm is superior to predictive deconvolution even for minimum phase systems.
Invoking the Cauchy–Schwarz inequality, it can be shown that

ρ ≥ 0,   (53)

where equality holds if and only if the probability weight of a_t is zero unless a_t = 0 or |a_t| = r for some positive real r. Some of the most popular signals used in communications (see [24]), and in particular in digital communications, e.g., BPSK, V.27, 4-QAM (see Fig. 4), satisfy this condition. For this class of signal constellations, ρ = 0, implying that the statistical variance is zero even for finite data, in which case the expected ISI given by (51) is not zero only because of the bias effect due to the finite length of the equalizer.
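Both numerical claims can be verified from the moments. The sketch below evaluates (52) for a real input uniform on (-1, 1) (A = 3) and for the standard 4-QAM constellation (A = 2); the constellation points are assumed, standard ones.

```python
# Relative efficiency (52):
#   rho = (E|a|^2 E|a|^6 - E^2|a|^4) / (E|a|^4 - A E^2|a|^2)^2
def rho(points, A):
    mags2 = [a.real ** 2 + a.imag ** 2 for a in points]  # |a|^2 exactly
    m2 = sum(mags2) / len(mags2)
    m4 = sum(m ** 2 for m in mags2) / len(mags2)
    m6 = sum(m ** 3 for m in mags2) / len(mags2)
    return (m2 * m6 - m4 ** 2) / (m4 - A * m2 ** 2) ** 2

# real input uniform on (-1, 1): moments E{a^2k} = 1/(2k+1), A = 3
m2, m4, m6 = 1 / 3, 1 / 5, 1 / 7
print((m2 * m6 - m4 ** 2) / (m4 - 3 * m2 ** 2) ** 2)   # 3/7 ~= 0.4286

# 4-QAM has constant modulus sqrt(2), so rho vanishes (A = 2)
print(rho([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], 2))      # 0.0
```

The constant-modulus case makes the numerator E\{|a|^2\}E\{|a|^6\} - E^2\{|a|^4\} vanish identically, which is the Cauchy–Schwarz equality condition above.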
Fig. 4. Typical data communications signal constellations: V.22, V.29, 4-QAM, and V.27.
To explain this striking result, suppose that |a_t| = r with probability one (w.p.1). Then |z_t|^2 z_t = r^2 z_t w.p.1 near the point of convergence, so the fourth-order sample cumulant in (42) collapses, to first order, to

\widehat{cum}(z_t : 2; z_t^*; y_{t-n}^*) ≈ (1 - A) r^2 \cdot \frac{1}{N} \sum_t z_t y_{t-n}^*,

i.e., to a scaled second-order sample statistic, so that the statistical fluctuations of \hat{d} and \hat{R} cancel in (37a). This rather interesting result is confirmed by the observation of Godard [10] that blind deconvolution based on his criterion might be an easier task for signals having constant amplitude.
Comment: We may consider combining the algorithms based on third-order and fourth-order cumulants, and perhaps other algorithms in the class, in order to improve statistical stability. We may also consider incorporating linear prediction methods, particularly when the input distribution is close to Gaussian. By appropriately weighting and linearly combining the two methods, we obtain a method that is capable of identifying the phase ambiguities associated with predictive deconvolution, and whose expected ISI is lower than that of either one of the methods separately (and, in particular, lower than that of predictive deconvolution). Detailed considerations are beyond the scope of this paper.
D. Effect of Additive Noise

Consider the system illustrated in Fig. 5, in which

y_t = \sum_n h_n a_{t-n} + w_t,

where w_t represents the additive noise at the input to the equalizer. Then,

z_t = \sum_n c_n y_{t-n} = \sum_n s_n a_{t-n} + \sum_n c_n w_{t-n}.

Fig. 5. Blind deconvolution in the presence of additive noise.

If a_t and w_t are statistically independent (w_t need not be an i.i.d. process), then by P1) and P2) the cumulant cum(z_t : p; z_t^* : q; y_{t-n}^*) splits into a term that depends only on a_t and a term that depends only on w_t. If, in addition, w_t is Gaussian, the noise term vanishes by P3), since p + q + 1 > 2, so the vector d is unaffected by the noise; the second-order statistics in R, however, are biased by the noise covariance, which can be removed by cumulant subtraction when the noise covariance is known.
Fig. 6. Performance of the algorithm for V.29 source.

We have used an equalizer of length L = 16, which was initialized to c^{(0)} = (0 0 0 1 0 ⋯ 0)^T.
In the first set of experiments, the input symbols were independent realizations from a V.29 source (see Fig. 4). The algorithm was tested in 50 Monte Carlo trials, and the average ISI, denoted by ⟨ISI⟩, is tabulated and plotted in Fig. 6. We have used a logarithmic (dB) scale because of the very fast convergence rate. When using data blocks of N = 2400 samples, the algorithm converged in all trials within 2-4 iterations to an average ISI which is very close to the asymptotic performance predicted by (51). For data blocks of N = 1600 samples, the algorithm converged within 2-5 iterations, but there was a deviation of approximately 25% (1 dB) from the asymptotic result, which may be because of second-order stochastic effects due to the finite length of the data.

In Figs. 7 and 8, the input symbols were independent realizations from a V.27 source (see Fig. 4) and from a binary source, respectively. In both cases, ρ = 0, so that according to our analysis the algorithm is relatively insensitive to stochastic effects due to the finite length of the data, and the residual ISI at the point of convergence is essentially due to the finite length of the equalizer. Thus, in the case of the binary source with N = 800 and N = 400, which are considered very short data blocks, the algorithm converged in all trials within 2-4 iterations to the predicted performance level. To push the algorithm to its limit, it was tested for data blocks of N = 200 points, in which case it still converged in 88% of the trials to a sufficiently small ISI level. These results suggest that the algorithm may be potentially useful for communication over fast fading channels.
Fig. 7. Performance of the algorithm for V.27 source.

Fig. 8. Performance of the algorithm for binary source.
In Fig. 9, the experiment was repeated for a binary source with N = 400, where the equalizer input is contaminated by
Fig. 9. Performance of the algorithm for binary source in the presence of noise.
additive white Gaussian noise with SNR = 20 dB and SNR = 10 dB (where SNR is the ratio of the average power of the signal component of y_t to the average power of w_t). In both cases the algorithm converged in all trials, and the effect of additive noise on the convergence rate was small.
In the last set of experiments, the input symbols were independent realizations from a uniformly distributed random variable. The results are shown in Fig. 10. For data blocks of N = 2400 points, the algorithm converged within 2-4 iterations very close to the expected ISI predicted by (51). For data blocks of N = 1600 points, the algorithm converged within 2-5 iterations, and the presence of additive white Gaussian noise with SNR = 20 dB did not have any significant effect.
E. Recursive and Sequential Algorithms

In a variety of applications, it is desirable to process the data recursively/sequentially. The advantage of a recursive/sequential algorithm over a batch algorithm is not necessarily in the final result, but in the computational efficiency, reduced storage requirements, and in the fact that an outcome may be provided without having to wait for all the data to be processed. Moreover, if the underlying system (channel) exhibits time changes, processing all the available data jointly is not desirable, even if we can accommodate the computational and storage load of the batch algorithm, since different data segments correspond to different channel responses. In that case, we want to have an adaptive algorithm that is capable of tracking the varying characteristics of the channel.
Fig. 10. Performance of the algorithm for uniformly distributed source.
Consider the algorithm in (37a), (37b) for the case p = 1, q = 1 (third-order cumulants). Following the development in Appendix C, we obtain the following recursion for computing \hat{c}' at each iteration cycle:

\hat{c}_t = \hat{c}_{t-1} + λ_t Q_t y_t^* ( ψ_t - y_t^T \hat{c}_{t-1} ),   (56)

Q_t = \frac{1}{1-λ_t} \left[ Q_{t-1} - \frac{λ_t Q_{t-1} y_t^* y_t^T Q_{t-1}}{1 - λ_t + λ_t y_t^T Q_{t-1} y_t^*} \right],   (57)

where \hat{c}_t is the value of \hat{c}' based on sample averages up to time t, y_t = (y_{t-L_1} ⋯ y_{t-L_2})^T is the vector of equalizer input samples stored in its delay line at time t, and

ψ_t = \frac{σ_a^2}{cum(a_t; a_t^* : 2)} |z_t|^2,  z_t = y_t^T c,

where c are the values of the equalizer taps before entering the iteration, and

cum(a_t; a_t^* : 2) = E\{|a_t|^2 a_t^*\}.
Similarly, in the case p = 2, q = 1 (fourth-order cumulants), we obtain

\hat{c}_t = \hat{c}_{t-1} + λ_t Q_t y_t^* ( ψ_t - y_t^T \hat{c}_{t-1} ),   (58)

where now

ψ_t = \frac{σ_a^2}{cum(a_t : 2; a_t^* : 2)} ( z_t^2 z_t^* - 2 \hat{P}_t z_t ),

\hat{P}_t being a running estimate of E\{|z_t|^2\}, and

cum(a_t : 2; a_t^* : 2) = E\{|a_t|^4\} - 2 E^2\{|a_t|^2\}

(recall that E\{a_t^2\} is assumed zero).
If λ_t = 1/t and Q_t is appropriately initialized, then the outcome \hat{c}_N at the end of the recursion coincides with \hat{c}' obtained by calculating (37a) in one step using the sample averages in (40) and (41) in the case p = 1, q = 1, or the sample averages in (40) and (42) in the case p = 2, q = 1. If λ_t = λ (a constant), it corresponds to exponential weighting that gives more weight to current data in the sample averaging operations. The normalization operation required by (37b) can be performed at the end of the recursion by normalizing \hat{c}_N, or by scaling the output of the equalizer to have the same average power as a_t.
To convert these iterative-recursive algorithms, in which each iteration is performed recursively, into sequential algorithms, we suggest replacing the iteration index by the time index, which is a common procedure in stochastic approximation, and using the most current state of the equalizer to generate z_t. Under these approximations, (56) becomes

c_t = c_{t-1} + λ_t Q_t y_t^* \left( \frac{σ_a^2}{cum(a_t; a_t^* : 2)} |z_t|^2 - z_t \right),   (59)
and (58) becomes

c_t = c_{t-1} + λ_t Q_t y_t^* \left( \frac{σ_a^2}{cum(a_t : 2; a_t^* : 2)} ( z_t^2 z_t^* - 2 \hat{P}_t z_t ) - z_t \right),   (60)

where

z_t = y_t^T c_{t-1},   (61)

and where an additional normalization operation (scaling z_t to have the same average power as a_t) may be required. Both
(59) and (60) are fully sequential algorithms in which we perform one iteration per symbol. Using well-known results from stochastic approximation methods [6], [12], it can be shown that if

λ_t → 0,  \sum_t λ_t = ∞,

e.g., λ_t = 1/t, then under certain stationary-ergodic conditions these algorithms converge almost surely (a.s.) and in the mean square (m.s.) to the desired solution, and the limit distribution at the point of convergence can also be evaluated. However, such an analysis is beyond the scope of this paper. If we choose λ_t = λ (a constant), it corresponds to exponential weighting that reduces the effect of past data relative to new data, and we effectively obtain adaptive algorithms.
An alternative approach for generating sequential algorithms is based on the Robbins–Monro first-order stochastic approximation method [19], [12], in which expectations are replaced by current realizations, and the iteration index is substituted by the time index. The algorithm development is given in Appendix D. In the case p = 1, q = 1, we obtain
c_t = c_{t-1} + μ_t y_t^* \left( \frac{σ_a^2}{cum(a_t; a_t^* : 2)} |z_t|^2 - z_t \right),   (62)

Fig. 11. Performance of the recursive algorithms.
and in the case p = 2, q = 1, we obtain

c_t = c_{t-1} + μ_t \frac{σ_a^2}{cum(a_t : 2; a_t^* : 2)} y_t^* z_t \left( \frac{E\{|a_t|^4\}}{E\{|a_t|^2\}} - |z_t|^2 \right).   (63)

This algorithm is recognized as Godard's algorithm proposed in [10] and [24]. The return to Godard's algorithm should not be surprising, since the super-exponential methods can be viewed as iterative algorithms for optimizing a family of deconvolution criteria (see [21]), a special case of which is Godard's criterion. We can now go back and analyze the convergence behavior of Godard's algorithm based on the observation that it is a first-order stochastic approximation of the super-exponential method, whose net action in the s-domain is well understood. In the case where cum(a_t : 2; a_t^* : 2) < 0, the algorithm in (63) converges to the desired solution. However, in the general case, the algorithms in (62), (63) should be modified to maintain the normalization operation, as explained in [21].
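For concreteness, here is a minimal sequential sketch in the spirit of the Godard-type update above, for a real binary (constant-modulus) input, with the cumulant scale factor absorbed into the step size μ; the channel, lengths, and constants are illustrative assumptions.

```python
import random

# Sequential Godard-style update for real BPSK:
#   c <- c + mu * z_t * (R2 - z_t^2) * y_t,  with R2 = E{a^4}/E{a^2} = 1.
random.seed(0)
h = [1.0, 0.4]                     # illustrative channel
L, mu, N = 7, 0.005, 30000
c = [0.0] * L; c[L // 2] = 1.0     # center-spike initialization
buf = [0.0] * L                    # equalizer delay line: buf[k] = y_{t-k}

def isi(taps):
    s = [sum(h[j] * taps[n - j] for j in range(len(h)) if 0 <= n - j < len(taps))
         for n in range(len(h) + len(taps) - 1)]
    peak2 = max(x * x for x in s)
    return (sum(x * x for x in s) - peak2) / peak2

isi0 = isi(c)
prev_a = 0.0
for _ in range(N):
    a = random.choice((-1.0, 1.0))
    y = a + 0.4 * prev_a           # channel output
    prev_a = a
    buf = [y] + buf[:-1]
    z = sum(ck * yk for ck, yk in zip(c, buf))
    err = z * (1.0 - z * z)        # descent direction of Godard's criterion
    c = [ck + mu * err * yk for ck, yk in zip(c, buf)]
print(isi0, isi(c))                # ISI of combined response, before and after
```

One update per received symbol; with a constant step size the scheme behaves as the adaptive (exponentially weighted) variant discussed above.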
The difference between (59) and (62), and between (60) and (63), appears only in the multiplying matrix Q_t. Since the algorithms in (59) and (60) attempt to perform cumulative averaging, they are expected to be statistically more stable than the algorithms in (62) and (63). Since Q_t is proportional to the inverse of the estimated covariance of y_t, the multiplication by Q_t corresponds to a spectral whitening operation and, as pointed out in [2], [21], this may significantly improve the convergence rate. To illustrate that, we have considered the previous scenario, in which the unknown system is an FIR filter of length 7 whose tap values are (0.4, 1, 0.7, 0.6, 0.3, -0.4, 0.1), corresponding to an initial ISI of 1.27. The input symbols
were independent random variables uniformly distributed in (-0.5, 0.5). We have used an equalizer of length L = 16, initialized to the same c^{(0)} as before.
V. FREQUENCY DOMAIN IMPLEMENTATION

The relation between the (p, q)-order spectra of the input a_t and the output y_t of the unknown system is given by [3]

S_{y:p; y^*:q+1}(ω_1, ⋯, ω_{p+q}) = cum(a_t : p; a_t^* : q+1) H(ω_1) ⋯ H(ω_p) H^*(ω_{p+1}) ⋯ H^*(ω_{p+q}) H^*(ω_1 + ⋯ + ω_p - ω_{p+1} - ⋯ - ω_{p+q}),   (66)

where H(ω) = \sum_n h_n e^{-jωn} is the frequency response of the unknown system. In the special case where (p, q) = (1, 0), the relation between the power spectra of a_t and y_t is given by

S_{y y^*}(ω) = σ_a^2 |H(ω)|^2.   (67)

Substituting S(ω) = H(ω)C(ω), S'(ω) = H(ω)C'(ω), and S''(ω) = H(ω)C''(ω) into (5a), (5b) and using the relations
in (66) and (67), we obtain

C'(ω) = \frac{σ_a^2}{γ_{p,q} S_{y y^*}(ω)} \left( \frac{1}{2π} \right)^{p+q-1} \int_{-π}^{π} ⋯ \int_{-π}^{π} C(ω_1) ⋯ C(ω_{p-1}) C^*(-ω_p) ⋯ C^*(-ω_{p+q-1}) C\left( ω - \sum_{i=1}^{p+q-1} ω_i \right) \times S_{y:p; y^*:q+1}\left( ω_1, ⋯, ω_{p-1}, ω - \sum_{i=1}^{p+q-1} ω_i, -ω_p, ⋯, -ω_{p+q-1} \right) dω_1 ⋯ dω_{p+q-1},   (68)

where γ_{p,q} = cum(a_t : p; a_t^* : q+1), and

C''(ω) = C'(ω) \bigg/ \left[ \frac{1}{2π σ_a^2} \int_{-π}^{π} |C'(ν)|^2 S_{y y^*}(ν) dν \right]^{1/2},   (69)
where C(ω) = \sum_n c_n e^{-jωn} is the frequency response of the equalizer. This algorithm does not require that a_t be an i.i.d. process. Instead, it requires prior knowledge of its power spectrum and some other nonzero higher order spectrum. In the actual implementation of the algorithm, the power spectrum and the (p, q)-order spectrum of y_t are replaced by their sample estimates. Since the equalizer being used is of finite length L, its frequency response can be adjusted only at L independent frequencies, say the discrete Fourier transform (DFT) frequencies ω_k = (2π/L)k, k = 0, 1, ⋯, (L - 1), and therefore it has a finite resolution. The resulting algorithm is given by
\hat{C}'(ω_k) = \frac{σ_a^2}{γ_{p,q} \hat{S}_{y y^*}(ω_k)} \left( \frac{1}{L} \right)^{p+q-1} \sum_{k_1=0}^{L-1} ⋯ \sum_{k_{p+q-1}=0}^{L-1} C(ω_{k_1}) ⋯ C(ω_{k_{p-1}}) C^*(-ω_{k_p}) ⋯ C^*(-ω_{k_{p+q-1}}) C\left( ω_k - \sum_{i=1}^{p+q-1} ω_{k_i} \right) \times \hat{S}_{y:p; y^*:q+1}\left( ω_{k_1}, ⋯, ω_{k_{p-1}}, ω_k - \sum_{i=1}^{p+q-1} ω_{k_i}, -ω_{k_p}, ⋯, -ω_{k_{p+q-1}} \right),   (70)

\hat{C}''(ω_k) = \hat{C}'(ω_k) \bigg/ \left[ \frac{1}{L σ_a^2} \sum_{l=0}^{L-1} |\hat{C}'(ω_l)|^2 \hat{S}_{y y^*}(ω_l) \right]^{1/2},   (71)

where \hat{S}_{y y^*}(ω) and \hat{S}_{y:p; y^*:q+1}(ω_1, ⋯, ω_{p+q}) denote the estimates of the power spectrum and the (p, q)-order spectrum of y_t at the indicated set of frequencies. Statistical analysis of the algorithm, similar to that performed in the time domain, can also be carried out here.
VI. THE SYSTEM IDENTIFICATION ASPECT

The approach presented in the paper can also be used for the purpose of system identification. Since the equalizer C is directed at finding the inverse of the unknown system H, it can therefore be viewed as modeling H^{-1}. Since C is an all-zero (FIR) filter, it corresponds to modeling the unknown system H by an all-pole (AR) filter, where the taps c_n of the equalizer can be regarded as direct estimates of the AR parameters of that model. Closed-form expressions for the resulting bias and error variance can easily be extracted from (45) and (46), where we note that the bias term reflects the possible mismatch between the actual H and its assumed AR model. The main feature of this system identification approach is that it uses a non-Gaussian input distribution and, therefore, it is capable of identifying nonminimum phase systems, unlike the classical approaches to system identification that are based on the Gaussian assumption and are therefore incapable of identifying nonminimum phase characteristics. The method is universal in the sense that it does not impose any restrictions on the probability distribution of the input process (driving noise), and it only requires prior knowledge of its average power. Since the proposed method uses information contained in the higher order statistics of the observed data, it often exhibits better performance compared with second-order methods, even for minimum phase systems. Finally, we note that the proposed technique for reducing the effect of additive noise by cumulant subtraction can also be applied here.
VII. CONCLUSION
We have presented a class of computationally efficient
iterative algorithms for blind deconvolution that converge
monotonically, regardless of initialization (no spurious local
stationary points), at a very fast, nearly super-exponential
rate to the desired solution, in which the inverse of the
unknown system is identified and its input is recovered up to a
constant delay and possibly a constant phase shift. We assume
that the input process is a sequence of i.i.d. real/complex
random variables with a non-Gaussian but otherwise completely
arbitrary continuous/discrete probability distribution. When
implementing the algorithms in the frequency domain, the
i.i.d. assumption can be relaxed; instead, we require prior
knowledge of the power spectrum and some higher order
spectrum of the input process. The effects of the finite length
of the equalizer and the finite length of the data are analyzed,
and the results are confirmed via Monte-Carlo simulations.
It is shown that in many cases of interest the performance
(intersymbol interference) of the proposed methods is far
superior to linear prediction methods, even for minimum phase
systems. Recursive and sequential forms of the algorithms are
also presented.
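The rate claim can be seen in miniature. The sketch below is our own illustration; the tap-wise power step, s_n replaced by s_n |s_n|^2 followed by renormalization, is our reading of the s-domain iteration for the case p = 2, q = 1. Raising every tap to a higher power while preserving its phase makes the dominant tap's share of the norm grow at a doubly exponential rate, so the ISI collapses within a few iterations.

```python
import numpy as np

# Our own toy instance of the tap-wise power-and-normalize step
# (assumed form s_n <- s_n |s_n|^2, then division by the norm).

def isi(s):
    p = np.abs(s) ** 2
    return (p.sum() - p.max()) / p.max()

# arbitrary fixed starting response (one complex tap to show phase handling)
s = np.array([0.9, -0.4 + 0.3j, 1.0, 0.7, 0.2, -0.8, 0.5, 0.3], dtype=complex)
for it in range(6):
    s = s * np.abs(s) ** 2        # tap-wise power step
    s = s / np.linalg.norm(s)     # norm (gain) constraint
    print(it, isi(s))
```

After each pass the magnitude ratio of every tap to the dominant tap is cubed, so the printed ISI values fall off super-exponentially.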
APPENDIX A
DERIVATION OF (43)
Under the consistency requirement, the cumulant estimates
asymptotically preserve the linearity property of cumulants. Thus,
by (27)-(29),
(A.1)
(A.2)
(A.3)
Note that the cumulant estimates in, e.g., (40)-(42) satisfy
(A.1)-(A.3) for any data length, without any approximation.
Near the point of convergence, s_n ≈ 0 for n ≠ 0, in which
case (A.3) is closely approximated by
(A.4)
Hence, recalling (38) and (39),
(A.5)
where A is the matrix whose elements are
(A.6)
and g is the vector whose elements are
(A.7)
Neglecting small-order terms,
(A.8)
= (H†H)^{-1} - (H†H)^{-1} H†(A - I) H (H†H)^{-1}.   (A.9)
Thus,
(A.10)
where
v = g - (A - I) H (H†H)^{-1} H† g.   (A.11)
Since A ≈ I and g ≈ …, and since for large L the
product H (H†H)^{-1} H† approaches the identity matrix, the
components of v are given to a very good approximation by
(A.12)
Since s_n ≈ 0 for n ≠ 0, then
(A.13)
Hence, up to a constant phase,
(A.14)
where w is a vector whose elements are
(A.15)
Thus,
(A.16)
where s̄ is given by (43). Since s̄ is independent of the iteration
index, it is approximately the stationary point of the
algorithm.
APPENDIX B
DERIVATION OF (48) AND (51)
In analogy with (40)-(42),
(B.1)
(B.2)
(B.3)
where the indicated factor equals 1 in the real case and 2 in the complex case,
and where in the transition to the second line of (B.3) we have
neglected zero-mean terms whose variance is of the order of
(1/N)^2.
Substituting (B.1)-(B.3) into (44),
(B.4)
where
(B.5)
Invoking the assumption that a_t, t = 1, 2, …, is a sequence
of zero-mean i.i.d. random variables, and noting that
E{a_t (…)} = 0, where
(B.6)
(B.7)
(B.8)
(B.9)
(B.10)
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 39, NO. 2, MARCH 1993
where in (B.10) we have assumed that if a_t is complex valued,
then E{a_t^2} = E{|a_t|^2 a_t} = 0.
Arranging the components of (B.6) in a matrix, the covariance
matrix W of the vector … can be written in the form
(B.11)
where
(B.12)
(B.13)
(B.14)
(B.15)
Substituting (B.11) into (47) and carrying out the indicated
matrix manipulations,

E{ISI(s)} = ISI(s̄) + γ1[1^T W 1 - |1^T W δ|^2] + γ2(L - δ^T W δ)
+ 2 Real[γ3 1^T W δ (1 - δ^T W δ)] + γ4 δ^T W δ (1 - δ^T W δ),
(B.16)
For sufficiently large L, the first column of W approaches the
vector δ, in which case δ^T W δ ≈ 1, 1^T W δ ≈ 1, and (B.16) is
closely approximated by

E{ISI(s)} ≈ ISI(s̄) + γ1(1^T W 1 - 1) + γ2(L - 1).   (B.17)
Substituting (B.12) and (B.13) into (B.17) and using
(B.7)-(B.10), we obtain (48) and (51).
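The finite-data behavior derived above can be illustrated with a small Monte-Carlo experiment. The sketch below is our own sanity check, not the paper's simulation: it only exercises the (L - 1)-scaling visible in the γ2(L - 1) term of (B.17), by perturbing a unit-pulse combined response with zero-mean noise of variance σ² on the remaining taps; the paper's constants γ1, γ2 are not reproduced.

```python
import numpy as np

# Assumed toy model: s = unit pulse plus zero-mean Gaussian perturbations
# of variance sigma2 on the other L-1 taps; then E{ISI} ~ (L-1)*sigma2.

rng = np.random.default_rng(1)
L, sigma2, trials = 64, 1e-4, 2000

def isi(s):
    p = np.abs(s) ** 2
    return (p.sum() - p.max()) / p.max()

vals = []
for _ in range(trials):
    s = np.zeros(L)
    s[0] = 1.0                                            # leading tap
    s[1:] = rng.normal(scale=np.sqrt(sigma2), size=L - 1) # estimation noise
    vals.append(isi(s))

mc = np.mean(vals)        # Monte-Carlo mean ISI
pred = (L - 1) * sigma2   # predicted (L-1)-scaling
print(mc, pred)
```

The two printed numbers should agree to within a few percent, showing the mean residual ISI growing linearly with the number of noisy taps.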
APPENDIX C
DERIVATION OF THE ITERATIVE-RECURSIVE ALGORITHMS
By (37a), the value of c based on sample averages up to time
t is given by
(C.1)
where
(C.2)
where in the case p = 1, q = 1,
(C.3)
and in the case p = 2, q = 1,

q_t = (1 - η_t) q_{t-1} + η_t y_t^* z_t (|z_t|^2 - …),   (C.4)

where η_t = 1/t in the case of the sample averages as in
(40)-(42), where y_t, z_t, and q_t are
defined after (56)-(57), and where β_t is defined after (58).
Inverting (C.2), we obtain (57). Substituting (C.2) and (C.3)
into (C.1),
(C.5)
In the case p = 1, q = 1, (C.5) becomes
(C.6)
and in the case p = 2, q = 1, (C.5) becomes

c = (1 / cum(a_t:2; a_t^*:2)) E{ y_t^* z_t (|z_t|^2 - E{|a_t|^4} / E{|a_t|^2}) },
(C.7)
where we used the fact that at the stationary points
E{|z_t|^2} = E{|a_t|^2}.
Gradient-search algorithms for solving (C.6) and (C.7) are
given, respectively, by
(C.8)
(C.9)
¹In the case p = 2, q = 1, d̂ is obtained by recursively performing (C.4)
with η_t = β_t.
Replacing the iteration index l by the time index t, and
substituting expectations with current realizations, we obtain
(62) and (63), respectively.
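The passage from the gradient search to the sequential forms (62)-(63), replacing the iteration index with the time index and expectations with their current realizations, yields a stochastic-approximation update. The sketch below is our own toy instance for the case p = 2, q = 1 with real binary input; the channel f, equalizer length L, step size beta, and signaling are all assumptions, not the paper's example.

```python
import numpy as np

# Our own toy stochastic-approximation equalizer update.
# For i.i.d. a_t in {-1, +1}: E{a^2} = E{a^4} = 1 and the input
# kurtosis is cum(a:2; a:2) = E{a^4} - 3 E{a^2}^2 = -2.

rng = np.random.default_rng(2)
f = np.array([1.0, 0.4, -0.2])            # mild minimum-phase channel
a = rng.choice([-1.0, 1.0], size=60000)   # i.i.d. binary input
y = np.convolve(a, f)[: len(a)]           # observed signal

L, beta = 11, 2e-3
c = np.zeros(L)
c[0] = 1.0                                 # spike initialization
k4 = -2.0                                  # input kurtosis for +/-1 symbols
for t in range(L, len(y)):
    yv = y[t - L + 1 : t + 1][::-1]        # regressor [y_t, ..., y_{t-L+1}]
    z = c @ yv                             # equalizer output
    # expectation replaced by its current realization (time index = iteration)
    c = c + (beta / k4) * yv * z * (z * z - 1.0)

s = np.convolve(f, c)                      # combined channel-equalizer response
isi = (np.sum(s**2) - np.max(s**2)) / np.max(s**2)
print("residual ISI:", isi)
```

Starting from the spike initialization (combined ISI of 0.2 for this channel), the data-driven updates drive the residual ISI well below its initial value.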
ACKNOWLEDGMENT
We would like to thank the reviewers for their useful and
insightful comments.
REFERENCES
[1] S. Bellini and F. Rocca, "Asymptotically efficient blind deconvolution," Signal Processing, vol. 20, pp. 193-209, 1990.
[2] A. Benveniste, M. Goursat, and G. Ruget, "Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communication," IEEE Trans. Automat. Contr., vol. AC-25, pp. 385-399, June 1980.
[3] D. R. Brillinger, Time Series: Data Analysis and Theory. San Francisco, CA: Holden-Day, 1981.
[4] P. Comon, "Independent components analysis," in Int. Signal Processing Workshop on Higher Order Statistics, Chamrousse, France, 1991, pp. 111-120.
[5] D. L. Donoho, "On minimum entropy deconvolution," in Applied Time Series Analysis, II, D. F. Findley, Ed. New York: Academic Press, 1981.
[6] A. Dvoretzky, "On stochastic approximation," Proc. 3rd Berkeley Symp. on Math. Statist. Probab., 1956, pp. 35-56.
[7] B. Friedlander and B. Porat, "Asymptotically optimal estimation of MA and ARMA parameters of non-Gaussian processes from high-order moments," IEEE Trans. Automat. Contr., vol. 35, pp. 27-35, Jan. 1990.
[8] G. B. Giannakis and A. Swami, "On estimating noncausal nonminimum phase ARMA models of non-Gaussian processes," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 478-495, Mar. 1990.
[9] G. B. Giannakis and M. K. Tsatsanis, "A unifying maximum-likelihood view of cumulants and polyspectral measures for non-Gaussian signal classification and estimation," IEEE Trans. Inform. Theory, vol. 38, pp. 386-406, Mar. 1992.
[10] D. Godard, "Self-recovering equalization and carrier tracking in two-dimensional data communication systems," IEEE Trans. Commun., vol. COM-28, pp. 1867-1875, Nov. 1980.
[11] D. Hatzinakos and C. L. Nikias, "Blind equalization using a tricepstrum based algorithm," IEEE Trans. Commun., vol. 39, pp. 669-682, May 1991.
[12] H. J. Kushner, Stochastic Approximation Methods for Constrained and Unconstrained Systems. New York: Springer-Verlag, 1978.
[13] K. S. Lii and M. Rosenblatt, "Non-Gaussian linear processes, phase, and deconvolution," in Statistical Signal Processing, E. J. Wegman and J. G. Smith, Eds., 1984, pp. 51-58.
[14] L. Ljung, System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1987.
[15] T. Matsuoka and T. J. Ulrych, "Phase estimation using the bispectrum," Proc. IEEE, vol. 72, pp. 1403-1411, 1984.
[16] J. E. Mazo, "Analysis of decision-directed equalizer convergence," Bell Syst. Tech. J., vol. 59, pp. 1857-1876, Dec. 1980.
[17] C. L. Nikias and M. R. Raghuveer, "Bispectrum estimation: A digital signal processing framework," Proc. IEEE, vol. 75, pp. 869-891, July 1987.
[18] J. G. Proakis, Digital Communications, 2nd ed. New York: McGraw-Hill, 1989.
[19] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Statist., vol. 22, pp. 400-407, 1951.
[20] O. Shalvi and E. Weinstein, "New criteria for blind deconvolution of nonminimum phase systems (channels)," IEEE Trans. Inform. Theory, vol. 36, pp. 312-321, Mar. 1990.
[21] ——, "Universal methods for blind deconvolution," in Blind Deconvolution, S. Haykin, Ed. Englewood Cliffs, NJ: Prentice-Hall, to be published.
[22] ——, "Methods and apparatus for blind deconvolution," Israeli patent application 97,345, Feb. 1991, and corresponding foreign applications.
[23] M. I. Sezan and A. M. Tekalp, "Survey of recent developments in digital image restoration," Opt. Eng., vol. 29, pp. 393-404, May 1990.
[24] J. R. Treichler and B. G. Agee, "A new approach to multipath correction of constant modulus signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 349-372, Apr. 1983.
[25] R. A. Wiggins, "Minimum entropy deconvolution," Geoexploration, vol. 16, pp. 21-35, 1978.