Positive-Definiteness, Integral Equations and Fourier Transforms

Download as pdf or txt
Download as pdf or txt
You are on page 1of 16

Positive-definiteness, integral equations and Fourier

transforms
Jorge Buescu∗ Francisco Garcia† Isabel Lourtie‡
CAMGSD and Inst. Sist. Robótica Inst. Sist. Robótica
Dep. Matemática, Inst. Sup. Técnico Inst. Sup. Técnico
Inst. Sup. Técnico
Last revised August 24, 2000

Abstract
In this paper we define classes of functions which we call positive definite ker-
nel functions and positive definite kernels. The first class may be thought of as
a generalization to two dimensions of the classical positive definite functions of
Bochner-Khinchin type. We study their properties in depth and show how the sec-
ond class arises by considering the associated integral operators. We give necessary
and sufficient conditions for the existence of a bilinear expansion of Mercer type and
show the analog of Bochner’s theorem in the L2 setting, namely that a function is
a positive definite kernel if and only if its Fourier transform is a positive definite
kernel. A simple and elegant sufficient condition for compactness of support of pos-
itive definite kernels is given, namely that they are compactly supported along the
main diagonal. Several corollaries relating compactness of support of the Fourier
transform and analyticity are derived.

1 Introduction
An important class of results in the theory of Fourier transforms, either by itself or in
view of applications, relates to the consequences of a function or its Fourier transform
having compact support. In signal processing functions which are compactly supported
are known as time-limited and functions whose Fourier transforms are compactly sup-
ported are known as band-limited. We will be concerned in this paper with the analog of
band-limited functions in two dimensions.
Several classical results may be proved in a more or less straightforward way about
functions of a real variable whose Fourier transform is compactly supported. We list some
of these below; references are e.g. Dym and McKean [4] or Krantz [10].

Address for contact: Dep. Matemática, Instituto Superior Técnico, Av. Rovisco Pais, 1096 Lisboa,
Portugal. Email: [email protected]

Address for contact: ISR - Instituto de Sistemas e Robótica, IST - Instituto Superior Técnico, Torre
Norte, Av. Rovisco Pais, 1049-001 Lisboa, Portugal. E-mail: [email protected]

Address for contact: ISR - Instituto de Sistemas e Robótica, IST - Instituto Superior Técnico, Torre
Norte, Av. Rovisco Pais, 1049-001 Lisboa, Portugal. E-mail: [email protected]

1
Fact 1 If the Fourier transform fˆ of f has compact support, then f is a real-analytic
function.

Here by real-analytic we mean functions having a non-trivially convergent power series


expansion at every point. In particular real-analytic functions may be extended to entire
functions of a complex variable.
This result has interesting consequences. If f is analytic and not identically zero, its
zeros must be isolated; in particular, f cannot vanish identically on any interval. It follows
that, if (f, fˆ) are a Fourier pair, they cannot both be compactly supported (which is a
qualitative statement of the Uncertainty Principle). This property is referred to in signal
processing by saying that a signal cannot be simultaneously time- and band-limited.
Fact 1 may be interpreted as a smoothness result: compactness of fˆ ensures analyt-
icity of f . In an intuitive picture, the absence of “high-frequency” components ensures
smoothness of f . In fact, f will be so smooth that it can be reconstructed by sampling its
values at a set of regularly spaced points, provided the spacing between them is sufficiently
small. This is known as the Sampling Theorem, see Champeney [3].
It is possible to give a complete characterization of real-variable functions f whose
Fourier transform fˆ has compact support. There are several versions of this result; we
restrict ourselves to the L2 version, which is the most relevant for our purposes. We
denote by F (z) the analytic extension of f (x) to the complex plane.

Fact 2 A necessary and sufficient condition for fˆ ∈ L2 (R) to satisfy fˆ(ν) = 0 if |ν| > ν0
is that F (z) is an entire function and satisfies
Z +∞
|F (x + iy)|2 dx < C exp(4πν0 |y|)
−∞

for some C > 0.

More general statements and proofs of Facts 1–2 may be found in [3], [4] and references
therein.
These results are satisfying since they provide a complete characterization of band-
limited functions. However, they are specific to one-variable functions, and a generaliza-
tion to the higher-dimensional case is not known — even though such a characterization
may be important in applications, as the example below suggests.
In signal processing many physical phenomena are modeled by random processes.
Recent advances in digital computers suggest the use of discrete-time models obtained by
sampling techniques, where the signal samples correspond to the coefficients of a linear
decomposition of the original signal under a basis of delayed sinc functions. When the
signal is deterministic, the Sampling Theorem states that a band-limited function can
be exactly reconstructed from its samples if the sampling frequency is at least twice the
signal’s largest frequency (see e.g. Oppenheim et al. [12], Champeney [3]).
For second-order random processes complete reconstruction of a continuous signal
from its time-samples can only be guaranteed in the mean-square sense. If the process
is stationary (that is, its first and second order statistics do not depend on time), exact
reconstruction in the mean square sense is achieved when the Fourier transform of the
one-dimensional autocorrelation function has compact support; see Papoulis [13]. For
nonstationary processes, the corresponding sampling theorem described in Gardner [6]

2
requires that the 2-dimensional Fourier transform of the (2D) autocorrelation function
have compact support. In the frequency domain, the 2D compact support condition may
be relaxed using intrinsic properties of the autocorrelation function, such as its positive
definiteness, which we define in § 2.
Thus the version of the Sampling Theorem for this kind of processes leads in a natural
way to a positivity condition. Moreover, it suggests a definite link between positive
definiteness of a function of two variables and compactness of the support of its Fourier
transform.
In itself, positive definiteness is a condition which has been, in the one-dimensional
context, widely studied for almost a century. Among the extensive literature, the most
well-known result is probably Bochner’s theorem, which characterizes positive definite
functions as the Fourier-Stieltjes transforms of positive measures.
This result reinforces the suggestion of a deeper relationship between positive definite-
ness in the two-dimensional setting and Fourier transforms.
This paper aims to clarify this link. More specifically, we show that positive definite-
ness, unlike the 1-dimensional case, carries through Fourier transforms, that it provides a
simple and elegant sufficient condition for compactness of 2D Fourier transforms (namely
that the Fourier transform be compactly supported along a diagonal), and that L 2 positive
definite kernels admit a Mercer-like bilinear series eigenfunction expansion.
The structure of this paper is as follows. In § 2 we set up the functional analytic
background for Fourier transforms and introduce the definitions of positive definite kernel
functions and kernels, proving their most important properties. In § 3 we extend the
analiticity results analogous to Fact 1 to Rn . In § 4 we integrate these partial results to
establish the main facts referred to above, as well as derive some corollaries and make
some final remarks. Applications to signal processing are discussed elsewhere [5].

2 Fourier Transforms and positive-definite kernel


functions and kernels
2.1 Fourier Transforms
We will use throughout the L2 Fourier transform theory, sometimes called Fourier-Plan-
cherel transform. Let us briefly recall some essential facts. Basic references are e.g.
Rudin [14], Krantz [10], Goldberg [8] and Dym and McKean [4].
For f ∈ L1 (R), one may define the Fourier transform fˆ of f by
Z +∞
ˆ
f (ν) = f (t)e−2πiνt dt (1)
−∞

(several variations exist, depending on conventions regarding the sign and the factor of
2π in the exponent). Basic properties of this L1 Fourier transform are that it induces a
linear operator F(f ) : L1 (R) → L∞ (R) of operator norm 1 and that as a point function
fˆ is a continuous function vanishing at infinity (hence unifomly continuous in R).
However, the Fourier Transform in the L1 setting has some drawbacks. First of all, it
may be shown that although F is 1-1, it is not onto. In fact, its image is quite elaborate
and not completely known. This raises some problems regarding inversion, aggravated by
the following fact: since for arbitrary f ∈ L1 (R) it is not in general true that fˆ ∈ L1 (R), a

3
Fourier inversion formula must impose this extra assumption on fˆ by hand. In particular,
the Fourier inversion formula will not be defined in the whole of R(F). Finally, the
Plancherel Theorem or the Parseval identity do not make sense in the L1 setting.
One may construct a much more pleasing theory in the L2 setting. However, this
time difficulties arise with the definition of the Fourier Transform. In fact, one may not
proceed naı̈vely defining the Fourier Transform of f ∈ L2 (R) by (1), because L2 (R) is not
a subset of L1 (R) since R has infinite Lebesgue measure.
Instead, we may start out by defining the Fourier Transform by (1) in L1 (R) ∩ L2 (R)
[14, 8], or indeed Cc∞ (R) [10], and then use density of these subspaces in the L2 norm
to extend this linear mapping uniquely to the whole of L2 (R). The linear mapping F :
L2 (R) → L2 (R) thus constructed is sometimes called the Fourier-Plancherel Transform,
and may be shown to satisfy the following.

Proposition 2.1 The Fourier-Plancherel Transform F : L2 (R) → L2 (R) has the follow-
ing properties:

1. F is a linear operator and F(f ) = fˆ is defined by (1) if f ∈ L1 (R) ∩ L2 (R).

2. For all f ∈ L2 (R) we have k f k2 =k fˆ k2 , that is, F is an isometry.

3. F : L2 (R) → L2 (R) is a Hilbert space isomorphism.

4. The inverse Fourier-Plancherel Transform is defined in L2 (R); if f ∈ L2 (R) and


fˆ ∈ L1 (R) it is given by the inversion formula
Z +∞
f (t) = fˆ(ν)e2πiνt dν a.e., (2)
−∞

with equality holding everywhere if f is continuous.

5. For all f, g ∈ L2 (R) the following Parseval identity holds:


Z +∞ Z +∞
hf, gi = f (x) g(x) dx = fˆ(ν) ĝ(ν) dν. (3)
−∞ −∞

Remark that, as opposed to the L1 theory, the Fourier-Plancherel Theorem defines fˆ


uniquely as an element of the Hilbert space L2 (R); however, as a point function it is only
defined a.e..
The definition and properties of the L2 Fourier-Plancherel transform extend almost
verbatim to the n−dimensional analog. More precisely, defining the Fourier transform of
f ∈ L1 (Rn ) ∩ L2 (Rn ) by
Z
ˆ
f (ν) = f (t)e−2πiν.t dt, (4)
Rn

where ν.t denotes the usual inner product in Rn , statements (1)-(5) of Proposition 2.1
generalize with little extra effort to L2 (Rn ); see e.g. Krantz [10] for details. Namely, if
f ∈ L1 (Rn ) ∩ L2 (Rn ), then fˆ given by (4) is continuous; the Fourier transform extends

4
from (4) by density to the whole of L2 (Rn ); this transform is an L2 (Rn ) and a Hilbert
space isomorphism; the corresponding inversion formula,
Z
f (t) = fˆ(ν)e2πiν.t dν, (5)
Rn

valid for fˆ ∈ L1 (Rn ) ∩ L2 (Rn ), holds a.e. (everywhere if f is continuous) and may be
extended by density to L2 (Rn ); the Parseval identity holds for all f, g ∈ L2 (Rn ).
In this paper we shall be concerned only with the case n = 2.
We will occasionally refer to (f, fˆ) as a Fourier pair , since their relationship in the
L2 context is completely symmetric. In what follows we will work exclusively in the L 2
setting, to which we will from now on refer simply as Fourier Transform.

2.2 Positive-definite kernel functions and kernels


We now introduce the classes of functions with which we will work throughout this paper.

Definition 2.2 Let k : R2 → C. We say that k is a positive definite kernel function


(PDKF) if
Xn
k(ti , tj ) zi zj ≥ 0 (6)
i,j=1

for all n ∈ N, (t1 , · · · , tn ) ∈ R and (z1 , · · · , zn ) ∈ Cn .


n

Remark 2.3 The term positive definite stems obviously from the analogy with the finite
(matrix) case. The reason why we call these functions positive definite kernel functions
will be made clear below; see remark 2.12. For the moment let us note that this is more
or less standard terminology (see e.g. Stewart [15]).

Remark 2.4 Positive definite kernel functions are obviously related to the widely studied
class of positive definite functions introduced independently by Mathias, Caratheodory
and Bochner. These are functions f : R → C which satisfy (6) with k(ti , tj ) replaced with
f (ti −tj ). The most important result regarding positive definite functions is the celebrated
Bochner Theorem, characterizes positive definite functions as the Fourier-Stieltjes trans-
form of positive measures [8, 15, 1]. In the sequel, when some danger of confusion may
arise, we refer to these as classical positive definite functions, retaining for the purposes
of Definition2.2 always the term positive definite kernel functions .
Positive definite kernel functions can thus be viewed as a generalization to R2 of
classical positive definite functions. One of the motivations to introduce them comes from
signal processing: is s(t) is a non-stationary stochastic process, then the autocorrelation
function Ks (t, t) is automatically a PDKF by definition; see Wong [16].

Example 2.5 A first example of a positive definite kernel function is k(x, y) = φ(x)φ(y),
2
where φ : R → C. A slightly less obvious example P∞is the following: given an L (R)-
orthonormal set {φn (x)}n≥0 and a convergent series n=0 µn of positive terms, i.e. µn ≥ 0,
then X
k(x, y) = µn φn (x)φn (y) (7)
n≥0

5
is a PDKF.
In fact, we will show in § 4 the stronger result that k is a continuous L2 (R2 ) positive
definite kernel if and only if it is of the form (7).

Remark 2.6 The following properties of PDKFs are easy to verify. First of all, if k is a
PDKF,
Pn then so is k. Secondly, if k1 , k2 , ...kn are PDKF and ci ≥ 0 for 1 ≤ i ≤ n, then
i=1 i i (x, y) is a PDKF. This implies that the set of PDKF forms a cone in the space of
c k
all functions from R2 to C; it is easy to see that the relation ≤ induces a partial ordering
on this set. Finally, if kn is a sequence of PDKF converging pointwise to k, then k is
PDKF.

In the following proposition we state and prove the most important properties of
PDKFs for our purposes.

Proposition 2.7 Suppose k : R2 → C is a PDKF. Then:


(a) k is positive on the diagonal t1 = t2 , that is, k(t, t) is real and ≥ 0 for all t ∈ R;

(b) for all t1 , t2 ∈ R k(t1 , t2 ) = k(t2 , t1 );

(c) for all t1 , t2 ∈ R, |k(t1 , t2 )|2 ≤ k(t1 , t1 ) k(t2 , t2 ).

(d) If k(t1 , t2 ) is continuous in one of the variables for all the values of the other, then
k is continuous in R2 .

(e) If k(t1 , t2 ) is continuous in one of the variables for all the values of the other and
k(t, t) → 0 as |t| → ∞, then k(t, t) is uniformly continuous in R.

(f ) If k(t, t) ∈ L1 (R), then k(t1 , t2 ) ∈ L2 (R2 ).

Proof. (a) For any t ∈ R, inequality (6) relative to the single point (t, t) becomes
k(t, t)|z|2 ≥ 0 for all z ∈ C, implying k(t, t) ≥ 0.
(b) Consider the inequality (6) with n = 2, that is
2
X
k(ti , tj ) zi zj ≥ 0.
i,j=1

Since k(t1 , t1 ) and k(t2 , t2 ) are real by (a), this implies that k(t1 , t2 ) z1 z2 + k(t2 , t1 ) z2 z1
is real. Taking the complex conjugate of this expression and subtracting, we find that
   
k(t1 , t2 ) − k(t2 , t1 ) z1 z2 + k(t2 , t1 ) − k(t1 , t2 ) z2 z1 = 0,

or    
k(t1 , t2 ) − k(t2 , t1 ) z1 z2 = k(t1 , t2 ) − k(t2 , t1 ) z2 z1 .
 
For clarity, let w = k(t1 , t2 ) − k(t2 , t1 ) and ξ = z1 z2 . The above condition may be
written more concisely as w ξ = w ξ for all ξ ∈ C.
It is easy to see (e.g. in polar representation) that, if |w| 6= 0, the above equation has
no solution valid for all ξ ∈ C. Therefore |w| = 0 or, equivalently, k(t 1 , t2 ) = k(t2 , t1 ).
Since t1 , t2 are arbitrary, the result follows.

6
2
X
(c) For all t1 , t2 ∈ R, the quadratic form zi k(ti , tj ) zj is positive semidefinite; since
i,j=1
the corresponding matrix is hermitian by (b) and the elements on the main diagonal are
positive by (a), positivity of its determinant implies |k(t1 , t2 )|2 ≤ k(t1 , t1 ) k(t2 , t2 ).
(d) In this proof we denote for clarity k(ti , tj ) by kij and k(x, y) by kxy .
Suppose without loss of generality that k is continuous in the second variable. Consider
the 3 × 3 matrix  
k11 k12 k13
k21 k22 k23  .
k31 k32 k33
Since k is positive definite its determinant is ≥ 0. By (b) this may be written
 
k11 k12 k13
0 ≤ det k12 k22 k23  .
k13 k32 k33

Let (t2 , t3 ) → (x, y), recalling that k is continuous in the second variable and t1 is kept
fixed:
 
k11 k1x k1y
0 ≤ det k1x kxx a  ,
k1y a kyy
where a = lim(t2 ,t3 )→(x,y) k(t2 , t3 ) if it exists.
Developing the determinant,

0 ≤ k11 kxx kyy + ak1x k1y + ak1x k1y − kxx |k1y |2 − |a|2 k11 − kyy |k1x |2 ,

valid for all t1 , x, y ∈ R. Taking in particular t1 = y, we find that


2 2
0 ≤ kyy kxx + akyx kyy + akyx kyy − kxx kyy − |a|2 kyy − kyy |kyx |2
= kyy (akyx + akyx − |a|2 − |kyx |2 ).

By (a) we know that kyy ≥ 0. Suppose for the moment kyy > 0. Then

akxy + akxy − |a|2 − |kxy |2 ≥ 0

or
−|a − kxy |2 ≥ 0.
Therefore a = kxy . This shows that lim(t2 ,t3 )→(x,y) k(t2 , t3 ) exists and equals k(x, y) for all
t2 , t3 , x, y ∈ R. So k is continuous in R2 .
If, on the other hand, kyy = 0, then we conclude using (b) that kxy = 0; positivity of
the 2 × 2 minor  
kxx a
a kyy
implies that a = 0 = kxy . In other words, lim(t2 ,t3 )→(x,y) k(t2 , t3 ) = k(x, y), which shows
that also in this case k is continuous.

7
(e) By (d), continuity in one of the variables implies that k is continuous in R 2 . In
particular, k(t, t) is continuous in R; since by hypothesis k(t, t) → 0 as |t| → ∞, by a
standard compactness argument it follows that k(t, t) is uniformly continuous in R.
(f) By (c) we have that for all t1 , t2 ∈ R, |k(t1 , t2 )|2 ≤ k(t1 , t1 ) k(t2 , t2 ). Thus
Z +∞ Z +∞ Z +∞ Z +∞ Z +∞ 2
2
|k(t1 , t2 )| dt1 dt2 ≤ k(t1 , t1 ) k(t2 , t2 ) dt1 dt2 = k(t, t) dt ,
−∞ −∞ −∞ −∞ −∞

showing, since k is positive on the diagonal t1 = t2 by (a), that k ∈ L2 (R2 ) if k(t, t) ∈


L1 (R).


Remark 2.8 It is interesting to note that our definition of positive definite kernel func-
tions includes as a special case the classical positive definite functions by setting k(t i , tj ) =
f (ti − tj ) in equation (6). Thus our conditions may be seen thus far as a generalization
of the concept of positive definite functions.
Likewise, the results we stated and proved above apply as a special case to the classical
positive definite functions. In this way, we recover from Proposition 2.7 the most impor-
tant properties of positive definite functions simply by considering the relevant statement
applied to the particular case k(ti , tj ) = f (ti − tj ). Thus, is f : R → C is a classical
positive definite function we conclude immediately that:

(a) f (0) is real and ≥ 0;

(b) f (−x) = f (x) for all x ∈ R;

(c) |f (x)| ≤ f (0) for all x ∈ R, implying in particular that f is bounded.

For several purposes it will be useful to consider the integral analog of (6).

Definition 2.9 Let k ∈ L2 (R2 ). We say that k is an L2 -positive definite kernel (L2 -
PDK) if Z +∞ Z +∞
y(t1 ) k(t1 , t2 ) y(t2 ) dt1 dt2 ≥ 0 (8)
−∞ −∞

for all y ∈ L2 (R).

Remark 2.10 By choosing an appropriate sequence of test functions it is clear that if an


L2 -PDK k is continuous then it is a PDKF. Conversely, using density of simple functions
in L2 , a continuous L2 PDKF is a PDK. Thus for continuous L2 kernels both conditions
are equivalent.
However, we should stress that dropping the L2 requirement changes the picture dras-
tically. For continuous functions (8) is much stronger than (6): as our first example in 2.5
shows, a PDKF need not be bounded. In particular, it need not lie in L2 (R2 ). In fact,
the same example shows that general PDKFs need not even be measurable.

8
Remark 2.11 It is useful to look at Definition 2.9 from the point of view of integral
equations. Given k ∈ L2 (R2 ) we define a linear operator K : L2 (R) → L2 (R) by setting
Z +∞
y 7−→ K(y) = k(t1 , t2 ) y(t2 ) dt2 ,
−∞

that is, as an integral operator in R with kernel k.


It follows from standard operator theory arguments (see e.g. Hochstadt [9], Gohberg
and Goldberg
R +∞ [7]) that K is a compact operator. Equation (8) states, by Fubini’s Theorem,
that −∞ y(t1 ) K(y) (t1 ) dt1 ≥ 0. With the usual inner product in L2 (R), this is equivalent
to hK(y), yi ≥ 0, or in other words to the statement that K : L2 (R) → L2 (R) is a positive
operator. Similarly, the fact that k(t1 , t2 ) = k(t2 , t1 ) in Proposition 2.7 implies that K is
a hermitian operator.

Remark 2.12 This is the reason why we refer to functions satisfying (8) or (6) as positive
definite kernels or positive definite kernel functions, respectively. The name positive
definite functions is in an extensive literature to refer to the classical positive definite
functions of Bochner type; see Remark 2.4. We follow Stewart’s [15] convention and refer
to these classes of functions respectively as kernels or kernel functions — even when, in
the latter case, they are obviously not kernels of integral operators.

From Remark 2.10 we derive the following

Corollary 2.13 Suppose k is an L2 -positive definite kernel. Then:

(a) k is positive on the diagonal t1 = t2 , that is, k(t, t) is real and ≥ 0 for all t ∈ R;

(b) for all t1 , t2 ∈ R k(t1 , t2 ) = k(t2 , t1 );

(c) for all t1 , t2 ∈ R, |k(t1 , t2 )|2 ≤ k(t1 , t1 ) k(t2 , t2 ).

(d) If k(x, y) is continuous in one of the variables for all the values of the other, then k
is continuous in R2 .

(e) If k(x, y) is continuous in one of the variables for all the values of the other and
k(t, t) ∈ L1 (R), then k(t, t) is uniformly continuous in R.

Proof. Immediate from Proposition 2.7 and Remark 2.10. 

3 Analyticity results
Definition 3.1 We say f : Rn → R, n ≥ 1, is a real-analytic function in Rn if at each
t0 ∈ Rn it has a power series expansion convergent to f in some open disk around t 0 .

Theorem 3.2 Suppose that k ∈ L2 (R2 ) and k̂ has compact support. Then k(t1 , t2 ) is a
real-analytic function in R2 .

9
Proof. Choose R > 0 such that supp(k̂) ⊂ SR , where SR is the square [−R, R ]2 in the
(ν1 , ν2 )-plane. From the Fourier inversion formula, eq.( 5),
Z Z
k(t1 , t2 ) = k̂(ν1 , ν2 )e2πi(ν1 t1 +ν2 t2 ) dν1 dν2 .
SR

By Fubini’s Theorem, this may be written


Z R Z R 
2πiν2 t2
k(t1 , t2 ) = k̂(ν1 , ν2 )e dν2 e2πiν1 t1 dν1 .
−R −R
RR
Fixing t2 , this obviously means that the function gt2 (ν1 ) = −R k̂(ν1 , ν2 )e2πiν2 t2 dν2 is the
one-dimensional Fourier transform of k in the variable t1 . Because of the choice of R,
gt2 (ν1 ) ≡ 0 for |ν1 | > R and all t2 ∈ R. In view of Fact 1 this implies that k(t1 , t2 ) is real-
analytic as a function of t1 for each t2 , extending to the complex plane z1 = t1 + is1 as an
entire function. Interchanging the roles of t1 , t2 , we conclude that k(t1 , t2 ) is real-analytic
as a function of t2 for each t1 , extending to the complex plane z2 = t2 + is2 as an entire
function.
Denote by K(z1 , z2 ) : C2 → C the function thus constructed. This function is sepa-
rately holomorphic with respect to z1 and with respect to z2 in C2 . By Hartogs’ Theorem,
see Chabat [2], it follows that K is jointly holomorphic in C2 with respect to the variable
z = (z1 , z2 ). Therefore, K has at each point a power series representation convergent in
C2 . Thus k, which is the restriction of K to the real plane {z ∈ C2 : I(z1 ) = I(z2 ) = 0},
has at each point a power series representation convergent in R2 , and is therefore real-
analytic. 

Corollary 3.3 If k ∈ L2 (R2 ) and k̂ has compact support, then the support of k is R2 .

Proof. Denoting as above by K(z1 , z2 ) the complex two-dimensional inverse Fourier


transform of k̂, it follows that K is holomorphic in C2 . This implies that K cannot vanish
identically on any open set in C2 ; by a standard result of analysis in Cn (see Chabat [2],
pg. 31) this in turn implies that K cannot vanish identically on any open set on a real
hyperplane. In particular k(t1 , t2 ), being the restriction of K to the real plane, cannot
vanish identically on any open subset of R2 . Thus the zero set of k has empty interior in
R2 ; equivalently, the support of k is R2 . 

Remark 3.4 Notice that, opposite to what happens in the one variable case, zeros of
analytic functions in several variables can obviously have accumulation points; indeed
analytic functions in Cn can vanish on manifolds of (real) dimension 2n−2. The statement
that an analytic function cannot vanish in a real neighbourhood of a point, while trivial
for one variable, is non-trivial in this case.
In particular, the function k(t1 , t2 ) may vanish on nowhere dense sets in R2 . In fact,
since the derivative of k cannot vanish at any point, 0 is a regular value of k; the zero set
of k is therefore either empty or a submanifold of dimension 1 in R2 .

Remark 3.5 As is clear from the proofs, statements corresponding both to Theorem 3.2
and to Corollary 3.3 are valid in Rn with the obvious adaptations.

10
4 Characterization of L2-PDKs and their Fourier
transforms
In this section we provide a complete spectral characterization of the important class
of L2 -positive definite kernels. We also prove a 2-dimensional analog of the classical 1-
dimensional Bochner Theorem. Unlike the 1-dimensional case, in the 2-dimensional case
the positive-definiteness condition is completely symmetric within a Fourier pair: An L 2
kernel is positive definite if and only if its Fourier transform is, in an appropriate sense,
an L2 positive definite kernel.

Theorem 4.1 (Mercer) Let k : R2 → C be a PDKF. Suppose that k is continuous


in one of the variables for all values of the other and that k(x, x) ∈ L 1 (R). Then k is
continuous in R2 and is the kernel of an L2 positive integral operator K in R2 . Moreover,
k admits the bilinear series expansion
X
k(x, y) = µi φi (x) φi (y), (9)
i≥0

where the {φi }i≥0 are the eigenfunctions of K, which are continuous and L2 -orthonormal,
µi are the eigenvalues of K and the bilinear series (9) is absolutely and uniformly conver-
gent.

Proof. From Corollary 2.13 (e) the condition of continuity in one of the variables for
all values of the other implies that k(x, y) is continuous in R2 . On the other hand, from
Proposition 2.7 the condition k(x, x) ∈ L1 (R) implies that k(x, y) ∈ L2 (R2 ). Thus k is a
continuous L2 positive definite kernel; that is, the associated L2 integral operator
Z +∞
K(φ) = k(x, y)φ(y) dy
−∞

is positive.
Since k(x, x) ∈ L1 (R), it follows that k(x, x) → 0 as |x| → ∞ and thus k(x, x) is
uniformly continuous in R. On the other hand, since k(x, y) is continuous and is in
L2 (R2 ), it follows that
Z +∞
lim |k(x, y) − k(x0 , y)|2 dx = 0.
x→x0 −∞

These two properties together with summability of k(x, x) allow us to apply Novitskii’s
generalization of Mercer’s Theorem to unbounded domains [11]. We thus conclude that
X
k(x, y) = µi φi (x) φi (y),
i≥0

where the φi (x) are the eigenfunctions of the associated integral operator K, which are
continuous and L2 -orthonormal; the µi are the corresponding eigenvalues which by com-
pactness and positivity of K are real, positive and have 0 as its only possible limit point;
and the bilinear series above converges absolutely and uniformly in R2 . 

11
Remark 4.2 The converse of Theorem 4.1 is also true: if k admits a bilinear eigenfunc-
tion expansion (9) with the stated properties, then k is continuous (by uniform conver-
gence) and k(t, t) ∈ L1 (R) (trivial calculation, since the {φi } ∈ L2 (R)).
Theorem 4.1 may thus be regarded as a characterization result.

Corollary 4.3 (Trace formula) Suppose k is a PDKF, continuous in one of the vari-
ables for all the values of the other and with k(x, x) ∈ L1 (R). Then k is a continuous
L2 -PDK and the following trace formula holds:
Z +∞ X
k(x, x) dx = µi , (10)
−∞ i

where the notation is the one used in Theorem 4.1.

Proof. As above, from Corollary 2.13 (e) it follows that k(x, y) is continuous in R 2 . On
the other hand, from Proposition 2.7 the condition k(x, x) ∈ L1 (R) implies that k(x, y) ∈
L2 (R2 ). From Theorem 4.1 it follows that k(x, y) admits a bilinear eigenfunction series
expansion (9). Since the convergence of this series is uniform, term by term integration
is permissible, yielding the trace formula. 

Corollary 4.4 (Iterated kernels) Suppose k is a PDKF, continuous in one of the vari-
ables for all the values of the other and that k(x, x) ∈ L1 (R). With the notation of
Theorem 4.1 and setting
Z +∞ Z +∞ Z +∞
kn (x, y) = ··· k(x, t1 )k(t1 , t2 ) . . . k(tn , y) dt1 dt2 . . . dtn , (11)
−∞ −∞ −∞
| {z }
n

the following expansion for the iterated integrals holds:


Z +∞ X
kn (x, x) dx = µni . (12)
−∞ i

Proof. As above, the hypotheses on k ensure that Theorem 4.1 applies. Replacing each
k in the definition of kn by its bilinear series expansion (9), using uniform convergence
of these series to perform term by term integration and taking advantage of the L 2 -
orthonormality of the {φi }, we conclude that
X
kn (x, y) = µni φi (x) φi (y).
i≥0

It is easy to check that the conditions of the trace formula apply to kn ; using it establishes
the result. 
We now turn to the implications of positive definiteness of k ∈ L2 on its Fourier
transform. The existence of such a relationship in the one-dimensional case has been
known for almost a century, embodied in the Bochner theorem [8, 15] which may be
summarized by stating that classical positive definite functions are exactly the Fourier(-
Stieltjes) transforms of positive measures.

12
In the L2 (R2 ) setting we prove that, somewhat surprisingly, positive definiteness is a
condition that carries through Fourier transforms in a totally symmetric way: namely,
a function is an L2 -positive definite kernel if and only if its double Fourier transform is
itself (in the appropriate sense, an L2 -positive definite kernel. This is the content of the
next result.

Theorem 4.5 Let k ∈ L2 (R2 ) and k̂ be its Fourier transform. Then:


(a) k(t1 , t2 ) is a positive definite kernel if and only if k̂(ν2 , −ν1 ) is a positive definite
kernel.
(b) k(t1 , t2 ) is a continuous positive definite kernel admitting the bilinear eigenfunction
expansion X
k(t1 , t2 ) = µi φi (t1 ) φi (t2 )
i≥0

if and only if k̂(ν1 , −ν2 ) is a continuous positive definite kernel admitting the bilinear
eigenfunction expansion
X
k̂(ν1 , −ν2 ) = µi φ̂i (ν1 ) φ̂i (−ν2 ), (13)
i≥0

where {µi }i≥0 are the eigenvalues of the associated integral operators, {φi }i≥0 are the
(continuous, L2 -orthogonal) eigenfunctions of K, {φ̂i }i≥0 are their Fourier trans-
forms (and consequently the continuous, L2 -orthogonal eigenfunctions of K̂) and
both series are absolutely and uniformly convergent.

Proof. (a) Suppose k is a positive definite L2 kernel. We have, for arbitrary y ∈ L2 (R),
Z +∞ Z +∞
0 ≤ y(t1 ) k(t1 , t2 ) y(t2 ) dt1 dt2
−∞ −∞
Z +∞ Z +∞ 
= y(t1 ) k(t1 , t2 ) y(t2 ) dt2 dt1 (by Fubini)
−∞ −∞
Z +∞ Z +∞ Z +∞  
2πiν2 t2
= y(t1 ) k(t1 , t2 )e dt2 ŷ(ν2 ) dν2 dt1 (by Parseval)
−∞ −∞ −∞
Z +∞ Z +∞ Z +∞ Z +∞   
2πiν2 t2 −2πiν1 t1
= ŷ(ν2 ) e ŷ(ν1 ) k(t1 , t2 ) e dt1 dν1 dt2 dν2
−∞ −∞ −∞ −∞
Z Z Z +∞ Z +∞ 
−2πiν1 t1 2πiν2 t2
= ŷ(ν1 ) ŷ(ν2 ) k(t1 , t2 ) e e dt1 dt2 dν2 dν1 (Parseval)
R2 −∞ −∞
Z Z Z +∞ Z +∞ 
−2πiν1 t1 2πiν2 t2
= ŷ(ν1 ) ŷ(ν2 ) k(t1 , t2 )e e dt1 dt2 dν2 dν1 (by 2.13)
R2 −∞ −∞
Z Z
= ŷ(ν1 ) ŷ(ν2 )k̂(ν1 , −ν2 )dν2 dν1 by (4).
R2

Because the Fourier transform is an L2 isomorphism, ŷ ranges over L2 (R) when y


ranges over L2 (R). Consequently, we conclude that
Z Z
ŷ(ν1 )ŷ(ν2 )k̂(ν1 , −ν2 ) dν2 dν1 ≥ 0
R2

13
for arbitrary ŷ ∈ L2 (R), or in other words that k̂(ν1 , −ν2 ) is an L2 -PDK.
The converse implication is proved in exactly the same way, interchanging the roles of
k and k̂ and replacing the Fourier transform by the inverse Fourier transform.
(b) Suppose k ∈ L2 (R2 ) is continuous. From Theorem 4.1 and Remark 4.2 we know
that a continuous PDK admits the expansion (9) with the stated properties if and only
if k(t, t) ∈ L1 (R). Performing the double Fourier transform as in (a) we have, integrating
term by term, X
k̂(ν1 , −ν2 ) = µi φ̂i (ν1 ) φ̂i (−ν2 ),
i≥0

where the φ̂i are the Fourier transforms of the φi . By the Fourier-Plancherel Theorem, it
is readily seen that this series has all the convergence properties listed in Theorem 4.1,
that the φ̂i are the eigenfunctions of the associated integral operator K with eigenvalues
µi , and that they are continuous and L2 -orthogonal. The fact that k̂(ν, −ν) ∈ L1 (R) may
be either checked by direct calculation or considered as a consequence of Remark 4.2.
The converse implication is proved in exactly the same way, interchanging the roles of
k and k̂ and replacing the Fourier transform by the inverse Fourier transform. 

Corollary 4.6 Suppose k is an L2 -PDK. If k̂(ν, −ν) is compactly supported and ν0 is


such that k̂(ν, −ν) = 0 for |ν| > ν0 , then:
(a) k̂ is compactly supported and supp(k̂) ⊂ [−ν0 , ν0 ] × [−ν0 , ν0 ];
(b) k is real-analytic and supp (k) = R2 .
If in addition k(t, t) ∈ L1 (R), then:
(c) The eigenfunctions φi (x) of the bilinear expansion (9) are real-analytic;
(d) the trace formula Z +∞ X
k̂(ν, −ν) dν = µi (14)
−∞ i
holds.
Proof. (a) By hypothesis k̂(ν, −ν) = 0 for |ν| > ν0 . However, by Corollary 2.13 we
have that, for all ν1 , ν2 ∈ R, |k̂(ν1 , −ν2 )|2 ≤ k̂(ν1 , −ν1 ) k̂(ν2 , −ν2 ); in particular, if either
|ν1 | > ν0 or |ν2 | > ν0 then |k̂(ν1 , ν2 )| = 0. Thus supp(k̂) ⊂ [−ν0 , ν0 ] × [−ν0 , ν0 ].
(b) By Theorem 3.2 k is real-analytic; by Corollary 3.3 its support is R 2 .
(c) By Theorem 4.5 k̂(ν1 , −ν2 ) has a bilinear series expansion (13) and the φ̂i are the
eigenfunctions of K̂, that is,
Z +∞
φ̂i (ν1 ) = µi k̂(ν1 , −ν2 )φ̂i (−ν2 ) dν2 ,
−∞

where without loss of generality we may suppose µi 6= 0 since eigenfunctions correponding


to the eigenvalue 0 may be discarded in the bilinear series.
Since by (a) supp(k̂) ⊂ [−ν0 , ν0 ] × [−ν0 , ν0 ], it follows that supp(φ̂i ) ⊂ [−ν0 , ν0 ]. In
other words, the Fourier transform of φi has compact support; hence by Fact 1 φi is
real-analytic.
(d) By Theorem 4.5 (b) k̂ admits the bilinear eigenfunction series expansion (13)
relative to which the trace formula is valid. 

14
Remark 4.7 Obviously an iterated kernel result similar to (4.4) is valid for k̂. However,
since it says nothing fundamentally new we refrain from stating it explicitly.

Corollary 4.8 Suppose k is a continuous L2 -PDK. If k is compactly supported along


the main diagonal, and t0 is such that k(t, t) = 0 for t > t0 , then k̂ is analytic and
supp(k̂) = R2 . If, moreover, k(t, t) ∈ L1 (R), then k̂ has a bilinear series expansion (13),
its eigenfunctions φ̂i are analytic, and the trace formula (14) is valid.

Proof. Immediate from Corollary 4.6 by inverse Fourier transforms. 

5 Acknowledgments
We are deeply indebted to Profs. A. Ferreira dos Santos, F. Teixeira and F.-O. Speck of
Instituto Superior Técnico, whose insightful discussions are gratefully acknowledged.

References
[1] N. Akhiezer and I. Glazman, Theory of Linear Operators in Hilbert space, vol I,
Pitman Publishing, London, 1981.

[2] B. Chabat, Introduction à l’Analyse Complexe, t. 2: Fonctions de plusieurs variables,


Mir, Moscow, 1990.

[3] D. C. Champeney, A Handbook of Fourier theorems, Cambridge University Press,


Cambridge, 1987.

[4] H. Dym and H. P. McKean, Fourier series and integrals, Academic Press, New York,
1972.

[5] F. Garcia, I. Lourtie and J. Buescu, L2 (R) nonstationary processes and the Sampling
Theorem, submitted to IEEE Signal Processing Letters, August 2000.

[6] W. Gardner, A Sampling Theorem for nonstationary random processes, IEEE Trans-
actions on Information Theory IT-18, 6 (1972), 808–809.

[7] I. Gohberg and S. Goldberg, Basic operator theory, Birkhäuser, Basel, 1981.

[8] R. Goldberg, Fourier transforms, Cambridge Tracts in Mathematics and Mathemat-


ical Physics 52, CUP, Cambridge, 1962.

[9] H. Hochstadt, Integral equations, Pure and applied Mathematics Wiley-Interscience


series, J. Wiley and Sons, New York, 1973.

[10] S. Krantz, A panorama of harmonic analysis, Carus Mathematical Monographs bf


27, MAA, 1999.

[11] I. M. Novitskii, Representation of kernels of integral operators by bilinear series.


Siberian Math. J. 25, 3, 774–778 (1984). Translated form the Russian: Sibirsk. Mat.
Zh. 25, 5, 114-118 (1984).

15
[12] A. V. Oppenheim and A. S. Willsky, Signals and Systems, Prentice-Hall, 1983.

[13] A. Papoulis, Probability, random variables and stochastic processes, 3rd ed., McGraw-
Hill, New York.

[14] W. Rudin, Real and Complex Analysis (3rd ed.), McGraw-Hill, New York, 1986.

[15] J. Stewart, Positive definite functions and generalizations, an historical survey. Rocky
Mountain Jour. Math. 6, 3, 309-434 (1976).

[16] E. Wong, Stochastic Processes in Information and Dynamical Systems, McGraw-Hill,


1971.

16

You might also like