Lecture Notes on Functional Analysis (MIT OCW)
Richard Melrose
Version 0.9A; Revised: 29-9-2018; Run: May 16, 2020.
Contents
Preface 5
Introduction 6
6. Isomorphism to l2 69
7. Parallelogram law 70
8. Convex sets and length minimizer 71
9. Orthocomplements and projections 71
10. Riesz’ theorem 73
11. Adjoints of bounded operators 74
12. Compactness and equi-small tails 75
13. Finite rank operators 78
14. Compact operators 80
15. Weak convergence 82
16. The algebra B(H) 85
17. Spectrum of an operator 86
18. Spectral theorem for compact self-adjoint operators 89
19. Functional Calculus 92
20. Spectral projection 94
21. Polar Decomposition 96
22. Compact perturbations of the identity 98
23. Hilbert-Schmidt, Trace and Schatten ideals 100
24. Fredholm operators 106
25. Kuiper’s theorem 109
Chapter 4. Differential and Integral operators 115
1. Fourier series 115
2. Toeplitz operators 119
3. Cauchy problem 122
4. Dirichlet problem on an interval 126
5. Harmonic oscillator 133
6. Fourier transform 135
7. Fourier inversion 136
8. Convolution 140
9. Plancherel and Parseval 143
10. Weak and strong derivatives 144
11. Sobolev spaces 150
12. Schwartz distributions 153
13. Poisson summation formula 154
Appendix A. Problems for Chapter 1 157
1. For §1 157
Appendix B. Problems for Chapter 4 161
1. Hill’s equation 161
2. Mehler’s formula and completeness 162
3. Friedrichs’ extension 166
4. Dirichlet problem revisited 170
5. Isotropic space 171
Appendix. Bibliography 175
Preface
These are notes for the course ‘Introduction to Functional Analysis’ – or in the
MIT style, 18.102, from various years culminating in Spring 2020. There are many
people whom I should like to thank for comments on and corrections to the notes
over the years, but for the moment I would simply like to thank, as a collective,
the MIT undergraduates who have made this course a joy to teach, as a result of
their interest and enthusiasm.
Introduction
This course is intended for ‘well-prepared undergraduates’ meaning specifically
that they have a rigorous background in analysis at roughly the level of the first
half of Rudin’s book [4] – at MIT this is 18.100B. In particular the basic theory of
metric spaces is used freely. Some familiarity with linear algebra is also assumed,
but not at a very sophisticated level.
The main aim of the course in a mathematical sense is the presentation of the
standard constructions of linear functional analysis, centred on Hilbert space and
its most significant analytic realization as the Lebesgue space L2 (R) and leading up
to the spectral theory of ordinary differential operators. In a one-semester course
at MIT it is only just possible to get this far. Beyond the core material I have
included other topics that I believe may prove useful in showing how to use the
‘elementary’ results in various directions.
It is the importance of such integrals which brings in the Lebesgue integral and
leads to a Hilbert space structure.
In any case one of the significant properties of the equation (1) is that it is
‘linear’. So we start with a brief discussion of linear (I usually say vector) spaces.
What we are dealing with here can be thought of as the eigenvalue problem for an
‘infinite matrix’. This in fact is not a very good way of thinking about operators
on infinite-dimensional spaces; they are not really like infinite matrices, but in this
case it is justified by the appearance of compact operators, which are rather more
like infinite matrices. There was a matrix approach to quantum mechanics in the
early days but it was replaced by the sort of ‘operator’ theory on Hilbert space
that we will discuss below. One of the crucial distinctions between the treatment of
finite dimensional matrices and an infinite dimensional setting is that in the latter
topology is encountered. This is enshrined in the notion of a normed linear space
which is the first important topic we shall meet.
After a brief treatment of normed and Banach spaces, the course proceeds to
the construction of the Lebesgue integral and the associated spaces of ‘Lebesgue in-
tegrable functions’ (as you will see this is by way of a universally accepted falsehood,
but a useful one). To some extent I follow here the idea of Jan Mikusiński that one
can simply define integrable functions as the almost everywhere limits of absolutely
summable series of step functions and more significantly the basic properties can
be deduced this way. While still using this basic approach I have dropped the step
functions almost completely and instead emphasize the completion of the space of
continuous functions to get the Lebesgue space. Even so, Mikusiński’s approach
still underlies the explicit identification of elements of the completion with Lebesgue
‘functions’. This approach is followed in the book of Debnath and Mikusiński [1].
After about a two-week stint of integration and then a little measure theory
the course proceeds to the more gentle ground of Hilbert spaces. Here I have been
most guided by the (old now) book of Simmons [5] which is still very much worth
reading. We proceed to a short discussion of operators and the spectral theorem for
compact self-adjoint operators. I have also included in the notes (but generally not
in the lectures) various things that a young mathematician should know(!) such
as Kuiper’s Theorem. Then in the last third or so of the semester this theory is
applied to the treatment of the Dirichlet eigenvalue problem, followed by a short
discussion of the Fourier transform and the harmonic oscillator. Finally various
loose ends are brought together, or at least that is my hope.
CHAPTER 1

Normed and Banach Spaces
In this chapter we introduce the basic setting of functional analysis, in the form
of normed spaces and bounded linear operators. We are particularly interested in
complete, i.e. Banach, spaces and the process of completion of a normed space to
a Banach space. In lectures I proceed to the next chapter, on Lebesgue integration
after Section 7 and then return to the later sections of this chapter at appropriate
points in the course.
There are many good references for this material and it is always a good idea
to get at least a couple of different views. The treatment here, whilst quite brief,
does cover what is needed later.
1. Vector spaces
You should have some familiarity with linear, or I will usually say ‘vector’,
spaces. Should I break out the axioms? Not here I think, but they are included
in Section 14 at the end of the chapter. In short it is a space V in which we can
add elements and multiply by scalars with rules quite familiar to you from the
basic examples of Rn or Cn . Whilst these special cases are (very) important below,
this is not what we are interested in studying here. What we want to come to grips
with are spaces of functions, hence the name of the course.
Note that for us the ‘scalars’ are either the real numbers or the complex numbers
– usually the latter. To be neutral we denote by K either R or C, but of course
consistently. Then our set V – the set of vectors with which we will deal – comes
with two ‘laws’. These are maps
(1.1) + : V × V −→ V, · : K × V −→ V,
which we denote not by +(v, w) and ·(s, v) but by v + w and sv. Then we impose
the axioms of a vector space – see Section 14 below! These are commutative group
axioms for +, axioms for the action of K and the distributive law linking the two.
The basic examples:
• The field K which is either R or C is a vector space over itself.
• The vector spaces Kn consisting of ordered n-tuples of elements of K.
Addition is by components and the action of K is by multiplication on
all components. You should be reasonably familiar with these spaces and
other finite dimensional vector spaces.
• Seriously non-trivial examples such as C([0, 1]) the space of continuous
functions on [0, 1] (say with complex values).
In these and many other examples we will encounter below, the ‘component
addition’ corresponds to the addition of functions.
You should also be familiar with the notions of linear subspace and quotient
space. These are discussed a little below and most of the linear spaces we will meet
are either subspaces of these function-type spaces, or quotients of such subspaces –
see Problems 1.3 and 1.5.
Although you are probably most comfortable with finite-dimensional vector
spaces it is the infinite-dimensional case that is most important here. The notion
of dimension is based on the concept of the linear independence of a subset of a
vector space. Thus a subset E ⊂ V is said to be linearly independent if for any
finite collection of distinct elements vi ∈ E, i = 1, . . . , N, and any collection of
‘constants’ ai ∈ K, i = 1, . . . , N we have the following implication
(1.4) Σ_{i=1}^{N} a_i v_i = 0 =⇒ a_i = 0 ∀ i.
That is, it is a set in which there are ‘no non-trivial finite linear dependence rela-
tions between the elements’. A vector space is finite-dimensional if every linearly
independent subset is finite. It follows in this case that there is a finite and maxi-
mal linearly independent subset – a basis – where maximal means that if any new
element is added to the set E then it is no longer linearly independent. A basic
result is that any two such ‘bases’ in a finite dimensional vector space have the
same number of elements – an outline of the finite-dimensional theory can be found
in Problem 1.1.
Still it is time to leave this secure domain behind, since we are most interested
in the other case, namely infinite-dimensional vector spaces. As usual with such
mysterious-sounding terms as ‘infinite-dimensional’ it is defined by negation.
Definition 1.1. A vector space is infinite-dimensional if it is not finite-
dimensional, i.e. for any N ∈ N there exist N elements with no non-trivial linear
dependence relation between them.
Thus the infinite-dimensional vector spaces, which you may be quite keen to under-
stand, appear just as the non-existence of something. That is, it is the ‘residual’
case, where there is no finite basis. This means that it is ‘big’.
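The infinite-dimensionality of function spaces can be made concrete. As a numerical sketch (my own illustration, not from the notes): for each N the monomials 1, x, …, x^{N−1} are linearly independent in C([0, 1]), because evaluating a vanishing combination at N distinct points gives a Vandermonde system with non-zero determinant.

```python
import itertools

# Sketch (my own illustration): the monomials 1, x, ..., x^{N-1} are linearly
# independent in C([0, 1]) for every N, so C([0, 1]) cannot be
# finite-dimensional.  If a_0 + a_1 x + ... + a_{N-1} x^{N-1} vanishes
# identically, evaluating at N distinct points gives a linear system whose
# matrix is a Vandermonde matrix; its determinant is non-zero, so all a_i = 0.

def vandermonde_det(points):
    """Determinant of the Vandermonde matrix, via the product formula
    det = prod_{i<j} (x_j - x_i)."""
    det = 1.0
    for i, j in itertools.combinations(range(len(points)), 2):
        det *= points[j] - points[i]
    return det

for N in (2, 5, 10):
    pts = [k / (N - 1) for k in range(N)]      # N distinct points in [0, 1]
    assert vandermonde_det(pts) != 0           # only the trivial relation
```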
2. Normed spaces
We need to deal effectively with infinite-dimensional vector spaces. To do so we
need the control given by a metric (or even more generally a non-metric topology,
but we will only get to that much later in this course; first things first). A norm
on a vector space leads to a metric which is ‘compatible’ with the linear structure.
Definition 1.2. A norm on a vector space is a function, traditionally denoted
(1.5) k · k : V −→ [0, ∞),
with the following properties
(Definiteness)
(1.6) v ∈ V, kvk = 0 =⇒ v = 0.
(Absolute homogeneity) For any λ ∈ K and v ∈ V,
(1.7) kλvk = |λ|kvk.
(Triangle Inequality) For any two elements v, w ∈ V
(1.8) kv + wk ≤ kvk + kwk.
Note that (1.7) implies that k0k = 0. Thus (1.6) means that kvk = 0 is equiv-
alent to v = 0. This definition is based on the same properties holding for the
standard norm(s), |z|, on R and C. You should make sure you understand that
(1.9) R ∋ x ↦ |x| = x for x ≥ 0, −x for x ≤ 0, defines a norm on R, as does
C ∋ z = x + iy ↦ |z| = (x² + y²)^{1/2} on C.
Situations do arise in which we do not have (1.6):-
Definition 1.3. A function (1.5) which satisfies (1.7) and (1.8) but possibly
not (1.6) is called a seminorm.
Footnote 1. Hint: For each point y ∈ X consider the function f : X −→ C which takes the value 1 at y
and 0 at every other point. Show that if X is finite then any function X −→ C is a finite linear
combination of these, and if X is infinite then this is an infinite set with no finite linear relations
between the elements.
but as you will see (if you do the problems) there are also the norms
(1.17) |x|_p = ( Σ_i |x_i|^p )^{1/p} , 1 ≤ p < ∞.
In fact, for p = 1, (1.17) reduces to the second norm in (1.16) and in a certain sense
the case p = ∞ is consistent with the first norm there.
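A numerical sanity check of the norm axioms for (1.17) (a sketch of my own; random sampling supports, but of course does not prove, the inequalities):

```python
import random

# My own check, not from the notes: sample random vectors in R^n and confirm
# the triangle inequality (1.8) and absolute homogeneity (1.7) for the p-norms
# of (1.17).

def p_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

random.seed(0)
for p in (1, 1.5, 2, 3):
    for _ in range(500):
        n = random.randint(1, 6)
        x = [random.uniform(-5, 5) for _ in range(n)]
        y = [random.uniform(-5, 5) for _ in range(n)]
        lam = random.uniform(-3, 3)
        s = [a + b for a, b in zip(x, y)]
        # triangle inequality (1.8), with floating-point slack
        assert p_norm(s, p) <= p_norm(x, p) + p_norm(y, p) + 1e-9
        # absolute homogeneity (1.7)
        scaled = p_norm([lam * t for t in x], p)
        assert abs(scaled - abs(lam) * p_norm(x, p)) < 1e-9
```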
In lectures I usually do not discuss the notion of equivalence of norms straight
away. However, two norms on the one vector space – which we can denote k · k(1)
and k · k(2) are equivalent if there exist constants C1 and C2 such that
(1.18) kvk(1) ≤ C1 kvk(2) , kvk(2) ≤ C2 kvk(1) ∀ v ∈ V.
The equivalence of the norms implies that the metrics define the same open sets –
the topologies induced are the same. You might like to check that the converse is also
true: if two norms induce the same topology (just meaning the same collection
of open sets) through their associated metrics, then they are equivalent in the sense
of (1.18) (there are more efficient ways of doing this if you wait a little).
Look at Problem 1.6 to see why we are not so interested in norms in the finite-
dimensional case – namely any two norms on a finite-dimensional vector space are
equivalent and so in that case a choice of norm does not tell us much, although it
certainly has its uses.
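For instance, on R^n the norms ‖x‖₁ and ‖x‖_∞ are equivalent with the explicit constants C₁ = 1 and C₂ = n in the sense of (1.18). A small check of these constants (my own illustration):

```python
import random

# My own illustration of (1.18) in the finite-dimensional case:
#   norm_inf(x) <= norm_1(x) <= n * norm_inf(x)   on R^n,
# i.e. C_1 = 1 and C_2 = n work, and both inequalities are sharp (take a
# standard basis vector, respectively the all-ones vector).

def norm_1(x):
    return sum(abs(t) for t in x)

def norm_inf(x):
    return max(abs(t) for t in x)

random.seed(1)
n = 7
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    assert norm_inf(x) <= norm_1(x) <= n * norm_inf(x)

# sharpness at the extreme vectors
e1 = [1.0] + [0.0] * (n - 1)
ones = [1.0] * n
assert norm_1(e1) == norm_inf(e1)
assert norm_1(ones) == n * norm_inf(ones)
```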
One important class of normed spaces consists of the spaces of bounded con-
tinuous functions on a metric space X :
(1.19) C∞ (X) = C∞ (X; C) = {u : X −→ C, continuous and bounded} .
That this is a linear space follows from the (pretty obvious) result that a linear
combination of bounded functions is bounded and the (less obvious) result that a
linear combination of continuous functions is continuous; this we are supposed to
know. The norm is the best bound:
(1.20) ‖u‖_∞ = sup_{x∈X} |u(x)|.
You should check that this is a norm; it is.
From now on we will generally use sequential notation and think of a map from
N to C as a sequence, so setting a(j) = a_j . Thus the ‘Hilbert space’ l2 consists of
the square summable sequences.
3. Banach spaces
You are supposed to remember from metric space theory that there are three
crucial properties, completeness, compactness and connectedness. It turns out that
normed spaces are always connected, so that is not very interesting, and they are
never compact (unless you consider the trivial case V = {0}) so that is not very
interesting either – in fact we will ultimately be very interested in compact subsets.
So that leaves completeness. This is so important that we give it a special name in
honour of Stefan Banach who first emphasized this property.
Definition 1.4. A normed space which is complete with respect to the induced
metric is a Banach space.
Lemma 1.2. The space C∞ (X), defined in (1.19) for any metric space X, is a
Banach space.
Proof. This is a standard result from metric space theory – basically that the
uniform limit of a sequence of (bounded) continuous functions on a metric space is
continuous. However, it is worth recalling how one proves completeness at least in
outline. Suppose un is a Cauchy sequence in C∞ (X). This means that given δ > 0
there exists N such that
(1.25) n, m > N =⇒ ‖u_n − u_m‖_∞ = sup_{x∈X} |u_n(x) − u_m(x)| < δ.
Fixing x ∈ X this implies that the sequence un (x) is Cauchy in C. We know that
this space is complete, so each sequence un (x) must converge (we say the sequence
of functions converges pointwise). Since the limit of un (x) can only depend on x,
we may define u(x) = limn un (x) in C for each x ∈ X and so define a function
u : X −→ C. Now, we need to show that this is bounded and continuous and is the
limit of un with respect to the norm. Any Cauchy sequence is bounded in norm –
take δ = 1 in (1.25) and it follows from the triangle inequality that
(1.26) ‖u_m‖_∞ ≤ ‖u_{N+1}‖_∞ + 1, m > N
and the finite set kun k∞ for n ≤ N is certainly bounded. Thus kun k∞ ≤ C, but this
means |un (x)| ≤ C for all x ∈ X and hence |u(x)| ≤ C by properties of convergence
in C and thus kuk∞ ≤ C, so the limit is bounded.
The uniform convergence of un to u now follows from (1.25) since we may pass
to the limit in the inequality to find
(1.27) n > N =⇒ |u_n(x) − u(x)| = lim_{m→∞} |u_n(x) − u_m(x)| ≤ δ
=⇒ ‖u_n − u‖_∞ ≤ δ.
The continuity of u at x ∈ X follows from the triangle inequality in the form
|u(y) − u(x)| ≤ |u(y) − u_n(y)| + |u_n(y) − u_n(x)| + |u_n(x) − u(x)|
≤ 2‖u − u_n‖_∞ + |u_n(x) − u_n(y)|.
Given δ > 0 the first term on the far right can be made less than δ/2 by choosing
n large using (1.27) and then, having chosen n, the second term can be made less
than δ/2 by choosing d(x, y) small enough, using the continuity of u_n.
I have written out this proof (succinctly) because this general structure arises
often below – first find a candidate for the limit and then show it has the properties
that are required.
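This pattern – a uniform Cauchy estimate, a pointwise candidate, then passage to the limit in the estimate – can be watched numerically. In this sketch (my own toy example) the partial sums f_n(x) = Σ_{k≤n} x^k/k² are uniformly Cauchy on [0, 1] because the sup-distance between partial sums is bounded by the tail Σ_{n<k≤m} 1/k², independently of x:

```python
# My own toy example of the proof pattern in Lemma 1.2: partial sums of a
# series dominated by sum 1/k^2 form a uniform Cauchy sequence in the
# supremum norm on [0, 1].

def f(n, x):
    return sum(x ** k / k ** 2 for k in range(1, n + 1))

grid = [i / 200 for i in range(201)]           # sample points in [0, 1]

def sup_dist(n, m):
    return max(abs(f(n, x) - f(m, x)) for x in grid)

for n, m in [(10, 20), (50, 100)]:
    tail_bound = sum(1.0 / k ** 2 for k in range(n + 1, m + 1))
    assert sup_dist(n, m) <= tail_bound + 1e-12
```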
A map between two vector spaces (over the same field, for us either R or C) is
linear if it takes linear combinations to linear combinations:-
(1.37) T : V −→ W, T (a1 v1 +a2 v2 ) = a1 T (v1 )+a2 T (v2 ), ∀ v1 , v2 ∈ V, a1 , a2 ∈ K.
In the finite-dimensional case linearity is enough to allow maps to be studied.
However in the case of infinite-dimensional normed spaces we will require conti-
nuity, which is automatic in finite dimensions. It makes perfectly good sense to
say, demand or conclude, that a map as in (1.37) is continuous if V and W are
normed spaces since they are then metric spaces. Recall that for metric spaces
there are several different equivalent conditions that ensure a map, T : V −→ W,
is continuous:
(1.38) vn → v in V =⇒ T vn → T v in W
(1.39) O ⊂ W open =⇒ T −1 (O) ⊂ V open
(1.40) C ⊂ W closed =⇒ T −1 (C) ⊂ V closed.
For a linear map between normed spaces there is a direct characterization of
continuity in terms of the norm.
Proposition 1.3. A linear map (1.37) between normed spaces is continuous if
and only if it is bounded in the sense that there exists a constant C such that
(1.41) kT vkW ≤ CkvkV ∀ v ∈ V.
Of course bounded for a function on a metric space already has a meaning and this
is not it! The usual sense would be kT vk ≤ C but this would imply kT (av)k =
|a|kT vk ≤ C so T v = 0. Hence it is not so dangerous to use the term ‘bounded’ for
(1.41) – it is really ‘relatively bounded’, i.e. takes bounded sets into bounded sets.
From now on, bounded for a linear map means (1.41).
Proof. If (1.41) holds then if vn → v in V it follows that kT v − T vn k =
kT (v − vn )k ≤ Ckv − vn k → 0 as n → ∞ so T vn → T v and continuity follows.
For the reverse implication we use the second characterization of continuity
above. Denote the ball around v ∈ V of radius ε > 0 by
B_V (v, ε) = {w ∈ V ; ‖v − w‖ < ε}.
Thus if T is continuous then the inverse image of the unit ball around the
origin, T^{−1}(B_W (0, 1)) = {v ∈ V ; ‖T v‖_W < 1}, contains the origin in V and so,
being open, must contain some B_V (0, ε). This means that
(1.42) T (B_V (0, ε)) ⊂ B_W (0, 1), so ‖v‖_V < ε =⇒ ‖T v‖_W ≤ 1.
Now proceed by scaling. If 0 ≠ v ∈ V then ‖v′‖ < ε where v′ = εv/(2‖v‖). So (1.42)
shows that ‖T v′‖ ≤ 1, but this implies (1.41) with C = 2/ε – it is trivially true if
v = 0.
Note that a bounded linear map is in fact uniformly continuous – given δ > 0
there exists ε > 0 such that
(1.43) ‖v − w‖_V = d_V (v, w) < ε =⇒ ‖T v − T w‖_W = d_W (T v, T w) < δ,
namely ε = δ/C. One consequence of this is that a linear map T : U −→ W into a
Banach space, defined and continuous on a linear subspace U ⊂ V (with respect
to the restriction of the norm from V to U ), extends uniquely to a continuous map
on the closure of U.
As a general rule we drop the distinguishing subscript for norms, since which
norm we are using can be determined by what it is being applied to.
So, if T : V −→ W is continous and linear between normed spaces, or from
now on ‘bounded’, then
(1.44) ‖T‖ = sup_{‖v‖=1} ‖T v‖ < ∞.
Lemma 1.3. The bounded linear maps between normed spaces V and W form
a linear space B(V, W ) on which kT k defined by (1.44) or equivalently
(1.45) kT k = inf{C; (1.41) holds}
is a norm.
Proof. First check that (1.44) is equivalent to (1.45). Define kT k by (1.44).
Then for any v ∈ V, v ≠ 0,
(1.46) ‖T‖ ≥ ‖T (v/‖v‖)‖ = ‖T v‖/‖v‖ =⇒ ‖T v‖ ≤ ‖T‖ ‖v‖
since as always this is trivially true for v = 0. Thus C = ‖T‖ is a constant for which
(1.41) holds.
Conversely, from the definition of ‖T‖, if ε > 0 then there exists v ∈ V with
‖v‖ = 1 such that ‖T‖ − ε < ‖T v‖ ≤ C for any C for which (1.41) holds. Since
ε > 0 is arbitrary, ‖T‖ ≤ C and hence ‖T‖ is given by (1.45).
From the definition of kT k, kT k = 0 implies T v = 0 for all v ∈ V and for λ 6= 0,
(1.47) ‖λT‖ = sup_{‖v‖=1} ‖λT v‖ = |λ| ‖T‖
and this is also obvious for λ = 0. This only leaves the triangle inequality to check
and for any T, S ∈ B(V, W ), and v ∈ V with kvk = 1
(1.48) k(T + S)vkW = kT v + SvkW ≤ kT vkW + kSvkW ≤ kT k + kSk
so taking the supremum, kT + Sk ≤ kT k + kSk.
Thus we see the very satisfying fact that the space of bounded linear maps
between two normed spaces is itself a normed space, with the norm being the best
constant in the estimate (1.41). Make sure you absorb this! Such bounded linear
maps between normed spaces are often called ‘operators’ because we are thinking
of the normed spaces as being like function spaces.
You might like to check boundedness for the example, I, of a linear operator
in (1.36), namely that in terms of the supremum norm on C([0, 1]), kT k ≤ 1.
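Equation (1.36) is not reproduced in this excerpt; a standard operator consistent with the claim ‖T‖ ≤ 1 (an assumption on my part) is indefinite integration on C([0, 1]) with the supremum norm, (I u)(x) = ∫₀^x u(t) dt, for which |(I u)(x)| ≤ x · ‖u‖_∞ ≤ ‖u‖_∞. A midpoint-rule sketch with sample functions of my choosing:

```python
import math

# Numerical sketch (my own): the indefinite integration operator satisfies
# sup_norm(I u) <= sup_norm(u), consistent with an operator norm at most 1.

def I(u, x, steps=1000):
    """Midpoint Riemann sum for the integral of u over [0, x]."""
    h = x / steps
    return sum(u((j + 0.5) * h) for j in range(steps)) * h

grid = [i / 100 for i in range(101)]

def sup_norm(g):
    return max(abs(g(x)) for x in grid)

for u in (math.sin, math.cos, lambda t: t ** 2 - 0.5):
    ratio = sup_norm(lambda x: I(u, x)) / sup_norm(u)
    assert ratio <= 1.0 + 1e-6                 # consistent with norm <= 1
```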
One particularly important case is when W = K is the field, for us usually C.
Then a simpler notation is handy and one sets V 0 = B(V, C) – this is called the
dual space of V (also sometimes denoted V ∗ ).
Proposition 1.4. If W is a Banach space then B(V, W ), with the norm (1.44),
is a Banach space.
Proof. We simply need to show that if W is a Banach space then every Cauchy
sequence in B(V, W ) is convergent. The first thing to do is to find the limit. To
say that T_n ∈ B(V, W ) is Cauchy, is just to say that given ε > 0 there exists N
such that n, m > N implies ‖T_n − T_m‖ < ε. By the definition of the norm, if v ∈ V
6. Completion
A normed space not being complete, not being a Banach space, is considered
to be a defect which we might, indeed will, wish to rectify.
Let V be a normed space with norm k · kV . A completion of V is a Banach
space B with the following properties:-
(1) There is an injective (i.e. 1-1) linear map I : V −→ B
(2) The norms satisfy
(1.58) kI(v)kB = kvkV ∀ v ∈ V.
(3) The range I(V ) ⊂ B is dense in B.
Notice that if V is itself a Banach space then we can take B = V with I the
identity map.
So, the main result is:
Theorem 1.1. Each normed space has a completion.
There are several ways to prove this, we will come across a more sophisticated
one (using the Hahn-Banach Theorem) later. In the meantime I will describe two
proofs. In the first the fact that any metric space has a completion in a similar
sense is recalled and then it is shown that the linear structure extends to the
completion. A second, ‘hands-on’, proof is also outlined with the idea of motivating
the construction of the Lebesgue integral – which is in our near future.
Proof 1. One of the neater proofs that any metric space has a completion is
to use Lemma 1.2. Pick a point in the metric space of interest, p ∈ M, and then
define a map
(1.59) M 3 q 7−→ fq ∈ C∞ (M ), fq (x) = d(x, q) − d(x, p) ∀ x ∈ M.
That fq ∈ C∞ (M ) is straightforward to check. It is bounded (because of the second
term) by the reverse triangle inequality
|fq (x)| = |d(x, q) − d(x, p)| ≤ d(p, q)
and is continuous, as the difference of two continuous functions. Moreover the
distance between two functions in the image is
(1.60) sup_{x∈M} |f_q(x) − f_{q′}(x)| = sup_{x∈M} |d(x, q) − d(x, q′)| = d(q, q′)
using the reverse triangle inequality (and evaluating at x = q). Thus the map (1.59)
is well-defined, injective and even distance-preserving. Since C∞ (M ) is complete,
the closure of the image of (1.59) is a complete metric space, X, in which M can
be identified as a dense subset.
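The isometry (1.59)–(1.60) can be checked numerically on a finite metric space (data of my own choosing): here M is a random finite subset of the plane with the Euclidean metric, and the embedding q ↦ f_q is distance-preserving in the supremum norm.

```python
import itertools
import random

# My own check of the embedding: f_q(x) = d(x, q) - d(x, p), and
# sup_x |f_q(x) - f_{q'}(x)| = d(q, q') for every pair of points.

def d(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

random.seed(2)
M = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(12)]
p = M[0]                                        # base point

def f(q, x):
    return d(x, q) - d(x, p)

for q, q2 in itertools.combinations(M, 2):
    sup = max(abs(f(q, x) - f(q2, x)) for x in M)
    assert abs(sup - d(q, q2)) < 1e-12          # supremum attained at x = q
```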
Now, in case that M = V is a normed space this all goes through. The
disconcerting thing is that the map q −→ fq is not linear. Nevertheless, we can
give X a linear structure so that it becomes a Banach space in which V is a dense
linear subspace. Namely for any two elements fi ∈ X, i = 1, 2, define
(1.61) λ₁f₁ + λ₂f₂ = lim_{n→∞} f_{λ₁p_n + λ₂q_n}
where pn and qn are sequences in V such that fpn → f1 and fqn → f2 . Such
sequences exist by the construction of X and the result does not depend on the
choice of sequence – since if p0n is another choice in place of pn then fp0n − fpn → 0
in X (and similarly for q_n). So the element on the left in (1.61) is well-defined. All
of the properties of a linear space and normed space now follow by continuity from
V ⊂ X and it also follows that X is a Banach space (since a closed subset of a
complete space is complete). Unfortunately there are quite a few annoying details
to check!
‘Proof 2’ (the last bit is left to you). Let V be a normed space. First
we introduce the rather large space
(1.62) Ṽ = { {u_k}_{k=1}^∞ ; u_k ∈ V and Σ_{k=1}^∞ ‖u_k‖ < ∞ },
the elements of which, if you recall, are said to be absolutely summable. Notice that
the elements of Ve are sequences, valued in V so two sequences are equal, are the
same, only when each entry in one is equal to the corresponding entry in the other
– no shifting around or anything is permitted as far as equality is concerned. We
think of these as series (remember this means nothing except changing the name, a
series is a sequence and a sequence is a series), the only difference is that we ‘think’
of taking the limit of a sequence but we ‘think’ of summing the elements of a series,
whether we can do so or not being a different matter.
Now, each element of Ṽ is a Cauchy series – meaning the corresponding se-
quence of partial sums v_N = Σ_{k=1}^N u_k is Cauchy if {u_k} is absolutely summable. As
noted earlier, this is simply because if M ≥ N then
(1.63) ‖v_M − v_N‖ = ‖Σ_{j=N+1}^M u_j‖ ≤ Σ_{j=N+1}^M ‖u_j‖ ≤ Σ_{j≥N+1} ‖u_j‖
gets small with N by the assumption that Σ_j ‖u_j‖ < ∞.
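A toy numerical instance of (1.63) (data of my own choosing): an absolutely summable series in V = R² with the Euclidean norm has Cauchy partial sums, the distance between partial sums being controlled by the tail of the sum of norms.

```python
# My own toy data: partial sums of an absolutely summable series in R^2,
# with the distance between partial sums bounded by the tail sum of norms.

def norm(v):
    return (v[0] ** 2 + v[1] ** 2) ** 0.5

u = [((-1) ** k / (k * k), 1.0 / k ** 3) for k in range(1, 2001)]

def partial(N):
    return (sum(t[0] for t in u[:N]), sum(t[1] for t in u[:N]))

for N, M in [(100, 200), (500, 2000)]:
    vN, vM = partial(N), partial(M)
    diff = norm((vM[0] - vN[0], vM[1] - vN[1]))
    tail = sum(norm(t) for t in u[N:M])         # sum of norms, N < j <= M
    assert diff <= tail + 1e-12
```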
Moreover, Ve is a linear space, where we add sequences, and multiply by con-
stants, by doing the operations on each component:-
(1.64) t₁{u_k} + t₂{u′_k} = {t₁u_k + t₂u′_k}.
This always gives an absolutely summable series by the triangle inequality:
(1.65) Σ_k ‖t₁u_k + t₂u′_k‖ ≤ |t₁| Σ_k ‖u_k‖ + |t₂| Σ_k ‖u′_k‖.
Consider the subspace S ⊂ Ṽ of those series which sum to 0. As discussed in Section 5 above, we can form the quotient
(1.67) B = Ṽ/S,
the elements of which are the ‘cosets’ of the form {u_k} + S ⊂ Ṽ where {u_k} ∈ Ṽ.
This is our completion, we proceed to check the following properties of this B.
(1) A norm on B (via a seminorm on Ṽ ) is defined by
(1.68) ‖b‖_B = lim_{n→∞} ‖Σ_{k=1}^n u_k‖, b = {u_k} + S ∈ B.
Now the norm of the element I(v) = (v, 0, 0, · · · ) is the limit of the norms of the
sequence of partial sums and hence is ‖v‖_V , so ‖I(v)‖_B = ‖v‖_V ; I(v) = 0
therefore implies v = 0 and hence I is also injective.
We need to check that B is complete, and also that I(V ) is dense. Here is
an extended discussion of the difficulty – of course maybe you can see it directly
yourself (or have a better scheme). Note that I suggest that you write out your
own version of it carefully in Problem ??.
Okay, what does it mean for B to be a Banach space? As discussed above, it
means that every absolutely summable series in B is convergent. Such a series {b_n}
is given by b_n = {u_k^{(n)}} + S where {u_k^{(n)}} ∈ Ṽ and the summability condition is
that
(1.71) ∞ > Σ_n ‖b_n‖_B = Σ_n lim_{N→∞} ‖Σ_{k=1}^N u_k^{(n)}‖_V .
So, we want to show that Σ_n b_n = b converges, and to do so we need to find the
limit b. It is supposed to be given by an absolutely summable series. The ‘problem’
is that this series should look like Σ_n Σ_k u_k^{(n)} in some sense – because it is supposed
to represent the sum of the bn ’s. Now, it would be very nice if we had the estimate
(1.72) Σ_n Σ_k ‖u_k^{(n)}‖_V < ∞
since this should allow us to break up the double sum in some nice way so as to get
an absolutely summable series out of the whole thing. The trouble is that (1.72)
need not hold. We know that each of the sums over k – for given n – converges,
but not the sum of the sums. All we know here is that the sum of the ‘limits of the
norms’ in (1.71) converges.
So, that is the problem! One way to see the solution is to note that we do not
have to choose the original {u_k^{(n)}} to ‘represent’ b_n – we can add to it any element
of S. One idea is to rearrange the u_k^{(n)} – I am thinking here of fixed n – so that
it ‘converges even faster.’ I will not go through this in full detail but rather do it
later when we need the argument for the completeness of the space of Lebesgue
integrable functions. Given ε > 0 we can choose p₁ so that for all p ≥ p₁,
(1.73) | ‖Σ_{k≤p} u_k^{(n)}‖_V − ‖b_n‖_B | ≤ ε, Σ_{k≥p} ‖u_k^{(n)}‖_V ≤ ε.
Then in fact we can choose successive pj > pj−1 (remember that little n is fixed
here) so that
(1.74) | ‖Σ_{k≤p_j} u_k^{(n)}‖_V − ‖b_n‖_B | ≤ 2^{−j}, Σ_{k≥p_j} ‖u_k^{(n)}‖_V ≤ 2^{−j} ∀ j.
Now, ‘resum the series’, defining instead v₁^{(n)} = Σ_{k=1}^{p₁} u_k^{(n)}, v_j^{(n)} = Σ_{k=p_{j−1}+1}^{p_j} u_k^{(n)}, and
do this setting ε = 2^{−n} for the nth series. Check that now
(1.75) Σ_n Σ_k ‖v_k^{(n)}‖_V < ∞.
Of course, you should also check that b_n = {v_k^{(n)}} + S, so that these new summable
series work just as well as the old ones.
After this fiddling you can now try to find a limit for the sequence as
(1.76) b = {w_k} + S, w_k = Σ_{l+p=k+1} v_l^{(p)} ∈ V.
So, you need to check that this {wk } is absolutely summable in V and that bn → b
as n → ∞.
Finally then there is the question of showing that I(V ) is dense in B. You can
do this using the same idea as above – in fact it might be better to do it first. Given
an element b ∈ B we need to find elements v_j ∈ V such that ‖I(v_j) − b‖_B → 0 as
j → ∞. Take an absolutely summable series u_k representing b and take v_j = Σ_{k=1}^{p_j} u_k ,
where the p_j’s are constructed as above, and check that I(v_j) → b by computing
(1.77) ‖I(v_j) − b‖_B = lim_{n→∞} ‖Σ_{p_j<k≤n} u_k‖_V ≤ Σ_{k>p_j} ‖u_k‖_V .
7. More examples
Let me collect some examples of normed and Banach spaces. Those mentioned
above and in the problems include:
• c0 the space of convergent sequences in C with supremum norm, a Banach
space.
• lp one space for each real number 1 ≤ p < ∞; the space of p-summable
series with corresponding norm; all Banach spaces. The most important
of these for us is the case p = 2, which is (a) Hilbert space.
• l∞ the space of bounded sequences with supremum norm, a Banach space
with c0 ⊂ l∞ as a closed subspace with the same norm.
• C([a, b]) or more generally C(M ) for any compact metric space M – the
Banach space of continuous functions with supremum norm.
• C∞ (R), or more generally C∞ (M ) for any metric space M – the Banach
space of bounded continuous functions with supremum norm.
• C0 (R), or more generally C0 (M ) for any metric space M – the Banach
space of continuous functions which ‘vanish at infinity’ (see Problem ??)
with supremum norm. A closed subspace, with the same norm, in C∞ (M ).
• C k ([a, b]) the space of k times continuously differentiable (so k ∈ N) func-
tions on [a, b] with norm the sum of the supremum norms on the function
and its derivatives. Each is a Banach space – see Problem ??.
• The space C([0, 1]) with norm
(1.78) ‖u‖_{L¹} = ∫_0^1 |u| dx
given by the Riemann integral of the absolute value. A normed space, but
not a Banach space. We will construct the concrete completion, L1 ([0, 1])
of Lebesgue integrable ‘functions’.
• The space R([a, b]) of Riemann integrable functions on [a, b] with kuk
defined by (1.78). This is only a seminorm, since there are Riemann
integrable functions (note that u Riemann integrable does imply that |u| is
Riemann integrable) with |u| having vanishing Riemann integral but which
are not identically zero. This cannot happen for continuous functions. So
the quotient is a normed space, but it is not complete.
• The same spaces – either of continuous or of Riemann integrable functions
but with the (semi- in the second case) norm
(1.79) ‖u‖_{Lp} = ( ∫_a^b |u|^p )^{1/p} .
Not complete in either case even after passing to the quotient to get a norm
for Riemann integrable functions. We can, and indeed will, define Lp (a, b)
as the completion of C([a, b]) with respect to the Lp norm. However we
will get a concrete realization of it soon.
• Suppose 0 < α < 1 and consider the subspace of C([a, b]) consisting of the
‘Hölder continuous functions’ with exponent α, that is those u : [a, b] −→
C which satisfy
(1.80) |u(x) − u(y)| ≤ C|x − y|α for some C ≥ 0.
Note that this already implies the continuity of u. As norm one can take the sum of the supremum norm and the ‘best constant’, which is the same as
(1.81) ‖u‖_{C^α} = sup_{x∈[a,b]} |u(x)| + sup_{x≠y∈[a,b]} |u(x) − u(y)|/|x − y|^α;
it is a Banach space, usually denoted C^α([a, b]).
• Note that the previous example works for α = 1 as well, but then the space is not denoted C^1([a, b]), since that is the space of once continuously differentiable functions; this is the space of Lipschitz functions Λ([a, b]) – again it is a Banach space.
• We will also talk about Sobolev spaces later. These are functions with ‘Lebesgue integrable derivatives’. It is perhaps not easy to see how to define these, but if one takes the norm on C^1([a, b])
(1.82) ‖u‖_{H1} = (‖u‖_{L2}^2 + ‖du/dx‖_{L2}^2)^{1/2}
and completes it, one gets the Sobolev space H^1([a, b]) – it is a Banach space (and a Hilbert space). In fact it is a subspace of C([a, b]).
Here is an example to see that the space of continuous functions on [0, 1] with
norm (1.78) is not complete; things are even worse than this example indicates! It
is a bit harder to show that the quotient of the Riemann integrable functions is not
complete, feel free to give it a try.
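To see concretely what goes wrong, here is a numerical sketch (my own example, not the one the text constructs below): the continuous functions f_n(x) = min(1, nx) on [0, 1] are Cauchy in the norm (1.78), since ‖f_n − f_m‖_{L1} = |1/(2n) − 1/(2m)|, but their pointwise limit is 0 at 0 and 1 on (0, 1], so no continuous function can serve as the L1 limit.

```python
# Cauchy in L^1 but with no continuous limit: f_n(x) = min(1, n*x) on [0, 1].

def l1_norm(f, a=0.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of |f| over [a, b]."""
    h = (b - a) / steps
    return sum(abs(f(a + (i + 0.5) * h)) for i in range(steps)) * h

def f(n):
    return lambda x: min(1.0, n * x)

# the exact distance is ||f_10 - f_100||_{L^1} = 1/20 - 1/200 = 0.045,
# so the sequence is Cauchy even though the pointwise limit is discontinuous
d = l1_norm(lambda x: f(10)(x) - f(100)(x))
print(d)   # close to 0.045
```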
Take a simple non-negative continuous function on R, for instance
(1.83) f(x) = 1 − |x| if |x| ≤ 1, f(x) = 0 if |x| > 1.
Then ∫_{−1}^{1} f(x) dx = 1. Now scale it up and in by setting
8. Baire’s theorem
At least once I wrote a version of the following material on the blackboard during the first mid-term test, in an attempt to distract people. It did not work very well – it seems that MIT students have already been toughened up by this stage. Baire’s theorem will be used later (it is also known as the ‘Baire category theorem’, although it has nothing to do with categories in the modern sense).
Theorem (Baire). If M is a non-empty complete metric space and C_n ⊂ M, n ∈ N, are closed subsets such that
(1.85) M = ⋃_n C_n
then at least one of the C_n's has an interior point, i.e. contains a non-empty ball in M.
Proof. We will assume that each of the C_n's has empty interior, hoping to arrive at a contradiction to (1.85) using the other properties. Thus if p ∈ M and ε > 0 the open ball B(p, ε) is not contained in any one of the C_n.
We start by choosing p_1 ∈ M \ C_1, which must exist since M is not empty and otherwise C_1 = M. Now, there must exist ε_1 > 0 such that B(p_1, ε_1) ∩ C_1 = ∅, since C_1 is closed. No open ball around p_1 can be contained in C_2, so there exists p_2 ∈ B(p_1, ε_1/3) which is not in C_2. Again since C_2 is closed there exists ε_2 > 0, ε_2 < ε_1/3, such that B(p_2, ε_2) ∩ C_2 = ∅.
Proceeding inductively, we suppose there are k points p_i, i = 1, . . . , k, and positive numbers
(1.86) 0 < ε_k < ε_{k−1}/3 < ε_{k−2}/3^2 < · · · < ε_1/3^{k−1}
such that
(1.87) p_{i+1} ∈ B(p_i, ε_i/3), B(p_i, ε_i) ∩ C_i = ∅.
Then we can add another p_{k+1} by using the properties of C_{k+1} – it has empty interior so there is some point in B(p_k, ε_k/3) which is not in C_{k+1}, and then B(p_{k+1}, ε_{k+1}) ∩ C_{k+1} = ∅ where ε_{k+1} > 0 but ε_{k+1} < ε_k/3. Thus, we have a sequence {p_k} in M satisfying (1.86) and (1.87) for all k.
Since d(p_{k+1}, p_k) < ε_k/3 this is a Cauchy sequence, in fact
where the E_n are not necessarily closed. We can still apply Baire's theorem however; just take C_n = Ē_n to be the closures – then of course (1.85) holds since E_n ⊂ C_n. The conclusion from (1.89) for a complete M is
9. Uniform boundedness
One application of Baire’s theorem is often called the uniform boundedness
principle or Banach-Steinhaus Theorem.
Theorem 1.3 (Uniform boundedness). Let B be a Banach space and suppose that T_n is a sequence of bounded (i.e. continuous) linear operators T_n : B −→ V where V is a normed space. Suppose that for each b ∈ B the set {T_n(b)} ⊂ V is bounded (in norm of course); then sup_n ‖T_n‖ < ∞.
Proof. This follows from a pretty direct application of Baire’s theorem to B.
Consider the sets
(1.91) S_p = {b ∈ B; ‖b‖ ≤ 1, ‖T_n b‖_V ≤ p ∀ n}, p ∈ N.
Each S_p is closed because each T_n is continuous: if b_k → b is a convergent sequence in S_p then ‖b‖ ≤ 1 and ‖T_n(b)‖ ≤ p. The union of the S_p is the whole of the closed ball of radius one around the origin in B:
(1.92) {b ∈ B; d(b, 0) ≤ 1} = ⋃_p S_p
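To see why completeness of B matters in Theorem 1.3, here is a sketch (the example is mine, not from the text) on the incomplete normed space of finitely supported sequences with the supremum norm: the functionals T_n(x) = n x_n are bounded at each fixed point, yet ‖T_n‖ = n is unbounded.

```python
# Pointwise bounded but not uniformly bounded: only possible because the
# space of finitely supported sequences (sup norm) is NOT complete.

def T(n, x):
    """T_n applied to a finitely supported sequence x (a list padded by zeros)."""
    return n * x[n] if n < len(x) else 0.0

x = [0.0, 2.0, -1.0, 5.0]                     # one fixed finitely supported point
pointwise = [abs(T(n, x)) for n in range(100)]
print(max(pointwise))                          # finite: the orbit {T_n x} is bounded

# but ||T_n|| = n, attained at the unit basis vector e_n:
norms = list(range(100))
print(max(norms))                              # grows without bound
```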
Proof. What we will try to show is that the image under T of the open unit ball around the origin, B(0, 1) ⊂ B_1, contains an open ball around the origin in B_2. The first part of the proof, using Baire's theorem, shows that the closure of the
10. OPEN MAPPING THEOREM 29
since that is what surjectivity means – every point is the image of something. Thus
one of the closed sets Cp has an interior point, v. Since T is surjective, v = T u for
some u ∈ B1 . The sets Cp increase with p so we can take a larger p and v is still
an interior point, from which it follows that 0 = v − T u is an interior point as well.
Thus indeed
(1.99) Cp ⊃ B(0, δ)
for some δ > 0. Rescaling by p, using the linearity of T, it follows that with δ
replaced by δ/p, we get (1.96).
If we assume that B_1 is a Hilbert space (and you are reading this after we have studied Hilbert spaces) then (1.96) shows that if v ∈ B_2, ‖v‖ < δ, there is a sequence u_n with ‖u_n‖ ≤ 1 and T u_n → v. As a bounded sequence, u_n has a weakly convergent subsequence, u_{n_j} ⇀ u, where we know this implies ‖u‖ ≤ 1 and T u_{n_j} ⇀ T u = v since T u_n → v. This strengthens (1.96) to
T(B(0, 1)) ⊃ B(0, δ/2)
and proves that T is an open map.
If B_1 is a Banach space but not a Hilbert space (or you don't yet know about Hilbert spaces) we need to work a little harder. Having applied Baire's theorem, consider now what (1.96) means. It follows that each v ∈ B_2 with ‖v‖ = δ is the limit of a sequence T u_n where ‖u_n‖ ≤ 1. What we want to find is such a sequence, u_n, which converges. To do so we need to choose the sequence more carefully. Certainly we can stop somewhere along the way and see that
(1.100) v ∈ B_2, ‖v‖ = δ =⇒ ∃ u ∈ B_1, ‖u‖ ≤ 1, ‖v − T u‖ ≤ δ/2 = ½‖v‖
where of course we could replace δ/2 by any positive constant, but the point is that the last inequality is now relative to the norm of v. Scaling again, if we take any v ≠ 0 in B_2 and apply (1.100) to v/‖v‖ we conclude that (for C = p/δ a fixed constant)
(1.101) v ∈ B_2 =⇒ ∃ u ∈ B_1, ‖u‖ ≤ C‖v‖, ‖v − T u‖ ≤ ½‖v‖
where the size of u only depends on the size of v; of course this is also true for v = 0 by taking u = 0.
Using this we construct the desired better approximating sequence. Given w ∈ B_2, choose u_1 = u according to (1.101) for v = w = w_1. Thus ‖u_1‖ ≤ C‖w‖, and w_2 = w_1 − T u_1 satisfies ‖w_2‖ ≤ ½‖w‖. Now proceed by induction, supposing that we have constructed a sequence u_j, j < n, in B_1 with ‖u_j‖ ≤ C 2^{−j+1}‖w‖ and ‖w_j‖ ≤ 2^{−j+1}‖w‖ for j ≤ n, where w_j = w_{j−1} − T u_{j−1} – which we have for n = 1. Then we can choose u_n, using (1.101), so ‖u_n‖ ≤ C‖w_n‖ ≤ C 2^{−n+1}‖w‖ and such that w_{n+1} = w_n − T u_n has ‖w_{n+1}‖ ≤ ½‖w_n‖ ≤ 2^{−n}‖w‖, to extend the induction. Thus we get a sequence u_n which is absolutely summable in B_1, since ∑_n ‖u_n‖ ≤ 2C‖w‖, and hence converges by the assumed completeness of B_1 this time. Moreover
(1.102) w − T(∑_{j=1}^n u_j) = w_1 − ∑_{j=1}^n (w_j − w_{j+1}) = w_{n+1}
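The induction above is effectively an algorithm: any ‘solver’ that reduces the residual norm by a fixed factor, with controlled size of u, can be iterated to solve T u = w exactly in the limit. A scalar sketch (T and the crude solver are my own stand-ins, not from the text):

```python
# Iterating a half-accurate solver, as in the proof: here T(u) = 2u and the
# crude solver returns u = 0.3*v, so ||v - T u|| = 0.4*|v| <= |v|/2, which
# is the situation of (1.101) with C = 0.3.

def T(u):
    return 2.0 * u

def approx_solve(v):
    return 0.3 * v          # |v - T(0.3 v)| = 0.4 |v| <= |v| / 2

def solve(w, steps=50):
    total, residual = 0.0, w
    for _ in range(steps):
        u = approx_solve(residual)
        total += u                   # partial sum of the u_j
        residual = residual - T(u)   # w_{n+1} = w_n - T u_n
    return total

u = solve(8.0)
print(u, T(u))   # u near 4.0, so T(u) near 8.0: the limit solves T u = w
```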
One important corollary of this is something that seems like it should be obvi-
ous, but definitely needs completeness to be true.
Corollary 1.2. If T : B1 −→ B2 is a bounded linear map between Banach
spaces which is 1-1 and onto, i.e. is a bijection, then it is a homeomorphism –
meaning its inverse, which is necessarily linear, is also bounded.
Proof. The only confusing thing is the notation. Note that T^{−1} is generally used both for the inverse, when it exists, and also to denote the preimage map on sets even when there is no true inverse. The inverse of T, let's call it S : B_2 −→ B_1, is certainly linear. If O ⊂ B_1 is open then S^{−1}(O) = T(O) – since v ∈ S^{−1}(O) means S(v) ∈ O, which is just v ∈ T(O) – and this is open by the Open Mapping theorem, so S is continuous.
since the norm of either u or v is less than the norm in (1.103). Restricting them to Gr(T) ⊂ B_1 × B_2 gives the commutative diagram
(1.105)
          Gr(T)
      π_1 ↙    ↘ π_2
        B_1 −−T−→ B_2 .
This little diagram commutes. Indeed there are two ways to map a point (u, v) ∈ Gr(T) to B_2: either directly, sending it to v, or first sending it to u ∈ B_1 and then to T u. Since v = T u these are the same.
Now, as already noted, Gr(T ) ⊂ B1 × B2 is a closed subspace, so it too is a
Banach space and π1 and π2 remain continuous when restricted to it. The map π1
is 1-1 and onto, because each u occurs as the first element of precisely one pair,
namely (u, T u) ∈ Gr(T). Thus the Corollary above applies to π_1 to show that its inverse, S, is continuous. But then T = π_2 ◦ S, from the commutativity, is also continuous, proving the theorem.
You might wish to entertain yourself by showing that conversely the Open
Mapping Theorem is a consequence of the Closed Graph Theorem.
The characterization of continuous linear maps through the fact that their
graphs are closed has led to significant extensions. For instance consider a linear
map but only defined on a subspace (often required to be dense in which case it is
said to be ‘densely defined’) D ⊂ B, where B is a Banach space,
(1.106) A : D −→ B linear.
Such a map is said to be closed if its graph
Gr(A) = {(u, Au); u ∈ D} ⊂ B × B is closed.
Check for example that if H 1 (R) ⊂ L2 (R) (I’m assuming that you are reading
this near the end of the course . . . ) is the space defined in Chapter 4, as consisting
of the elements with a strong derivative in L2 (R) then
(1.107) d/dx : D = H^1(R) −→ L^2(R) is closed.
This follows for instance from the ‘weak implies strong’ result for differentiation.
If un ∈ H 1 (R) is a sequence such that un → u in L2 (R) and dun /dx −→ v in L2
(which is convergence in L2 (R) × L2 (R)) then u ∈ H 1 (R) and v = du/dx in the
same strong sense.
Such a closed operator, A, can be turned into a bounded operator by changing the norm on the domain D to the ‘graph norm’
(1.108) ‖u‖_{Gr} = ‖u‖ + ‖Au‖.
giving continuous linear functionals through the pairing – Riesz’ Theorem says that
in the case of a Hilbert space that is all there is. If you are following the course
then at this point you should also see that the only continuous linear functionals
on a pre-Hilbert space correspond to points in the completion. I could have used
the Hahn-Banach Theorem to show that any normed space has a completion, but
I gave a more direct argument for this, which was in any case much more relevant
for the cases of L1 (R) and L2 (R) for which we wanted concrete completions.
Theorem 1.6 (Hahn-Banach). If M ⊂ V is a linear subspace of a normed space and u : M −→ C is a linear map such that
(1.109) |u(t)| ≤ C‖t‖_V ∀ t ∈ M
We proceed to show the real version of the Lemma, that w can be extended to a linear functional w′ : M + Rx −→ R if x ∉ M without increasing the norm. The same argument as above shows that the only freedom is the choice of λ = w′(x) and we need to choose λ ∈ R so that
(1.117) |w(t) − λ| ≤ ‖t − x‖_V ∀ t ∈ M.
The norm estimate on w shows that
(1.118) |w(t_1) − w(t_2)| ≤ |u(t_1) − u(t_2)| ≤ ‖t_1 − t_2‖ ≤ ‖t_1 − x‖_V + ‖t_2 − x‖_V.
Writing this out using the reality we find
(1.119) w(t_1) − w(t_2) ≤ ‖t_1 − x‖_V + ‖t_2 − x‖_V =⇒ w(t_1) − ‖t_1 − x‖_V ≤ w(t_2) + ‖t_2 − x‖_V ∀ t_1, t_2 ∈ M.
We can then take the supremum on the left and the infimum on the right and choose λ in between – namely we have shown that there exists λ ∈ R with
In this second part of the course the basic theory of the Lebesgue integral is
presented. Here I follow an idea of Jan Mikusiński, of completing the space of
step functions on the line under the L1 norm but in such a way that the limiting
objects are seen directly as functions (defined almost everywhere). There are other places you can find this, for instance the book of Debnath and Mikusiński [1]. Here
I start from the Riemann integral, since this is a prerequisite of the course; this
streamlines things a little. The objective is to arrive at a working knowledge of
Lebesgue integration as quickly as seems acceptable, to pass on to the discussion
of Hilbert space and then to more analytic questions.
So, the treatment of the Lebesgue integral here is intentionally compressed,
while emphasizing the completeness of the spaces L1 and L2 . In lectures everything
is done for the real line but in such a way that the extension to higher dimensions
– carried out partly in the text but mostly in the problems – is not much harder.
1. Integrable functions
Recall that the Riemann integral is defined for a certain class of bounded func-
tions u : [a, b] −→ C (namely the Riemann integrable functions) which includes all
continuous functions. It depends on the compactness of the interval and the bound-
edness of the function, but can be extended to an ‘improper integral’ on the whole
real line for which however some of the good properties fail. This is NOT what
we will do. Rather we consider the space of continuous functions ‘with compact
support’:
(2.1)
Cc (R) = {u : R −→ C; u is continuous and ∃ R such that u(x) = 0 if |x| > R}.
Thus each element u ∈ Cc (R) vanishes outside an interval [−R, R] where the R
depends on the u. Note that the support of a continuous function is defined to be
the complement of the largest open set on which it vanishes (or as the closure of the
set of points at which it is non-zero – make sure you see why these are the same).
Thus (2.1) says that the support, which is necessarily closed, is contained in some
interval [−R, R], which is equivalent to saying it is compact.
38 2. THE LEBESGUE INTEGRAL
The limits here are trivial in the sense that the functions involved are constant for
large R.
Proof. These are basic properties of the Riemann integral; see Rudin [4].
Note that Cc (R) is a normed space with respect to kukL1 as defined above; that
it is not complete is one of the main reasons for passing to the Lebesgue integral.
With this small preamble we can directly define the ‘space’ of Lebesgue integrable
functions on R.
Definition 2.1. A function f : R −→ C is Lebesgue integrable, written f ∈ L^1(R), if there exists a series with partial sums f_n = ∑_{j=1}^n w_j, w_j ∈ C_c(R), which is absolutely summable,
(2.3) ∑_j ∫ |w_j| < ∞,
and such that
(2.4) ∑_j |w_j(x)| < ∞ =⇒ f(x) = ∑_j w_j(x).
This is a somewhat convoluted definition which you should think about a bit. Its virtue is that it is all there. The problem is that it takes a bit of unravelling. Before we go any further note that the sequence w_j obviously determines the sequence of partial sums f_n, both in C_c(R), but the converse is also true since
w_1 = f_1, w_k = f_k − f_{k−1}, k > 1,
(2.5) ∑_j ∫ |w_j| < ∞ ⇐⇒ ∑_{k>1} ∫ |f_k − f_{k−1}| < ∞.
You might also notice that we can do some finite manipulation, for instance replace the sequence w_j by
(2.6) W_1 = ∑_{j≤N} w_j, W_k = w_{N+k−1}, k > 1,
and nothing much changes, since the convergence conditions in (2.3) and (2.4) are properties only of the tails of the sequences, and the sum in (2.4) for w_j(x) converges if and only if the corresponding sum for W_k(x) converges, and then converges to the same limit.
Before massaging the definition a little, let me give a simple example and check
that this definition does include continuous functions defined on an interval and
extended to be zero outside – the theory we develop will include the usual Riemann
integral although I will not quite prove this in full, but only because it is not
particularly interesting.
Lemma 2.2. If f ∈ C([a, b]) then
(2.7) f̃(x) = f(x) if x ∈ [a, b], f̃(x) = 0 otherwise,
is an integrable function.
Proof. Just ‘add legs’ to f̃ by considering the sequence
(2.8) f_n(x) = 0 if x < a − 1/n or x > b + 1/n; (1 + n(x − a))f(a) if a − 1/n ≤ x < a; (1 − n(x − b))f(b) if b < x ≤ b + 1/n; f(x) if x ∈ [a, b].
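The telescoped series built from (2.8) really is absolutely summable; here is a small numerical check (the choice f ≡ 1 on [0, 1] and the crude midpoint quadrature are mine, not from the text). Each leg of f_n has area 1/(2n), so ∫ |f_n − f_{n−1}| = 1/(n−1) − 1/n, a telescoping and hence summable series.

```python
# 'Add legs' to f = 1 on [0, 1] as in (2.8), and check numerically that the
# L^1 norms of the telescoped differences f_n - f_{n-1} are summable.

def f_n(n, x, a=0.0, b=1.0):
    """The function (2.8) for f = 1 on [a, b]."""
    if x < a - 1.0 / n or x > b + 1.0 / n:
        return 0.0
    if x < a:
        return 1.0 + n * (x - a)      # left leg, using f(a) = 1
    if x > b:
        return 1.0 - n * (x - b)      # right leg, using f(b) = 1
    return 1.0

def l1(g, lo=-1.0, hi=2.0, steps=30000):
    """Midpoint-rule approximation of the integral of |g|."""
    h = (hi - lo) / steps
    return sum(abs(g(lo + (i + 0.5) * h)) for i in range(steps)) * h

# each term is exactly 1/(n-1) - 1/n (both legs together), so the partial
# sums telescope: the sum over n = 2..11 is 1 - 1/11
diffs = [l1(lambda x, n=n: f_n(n, x) - f_n(n - 1, x)) for n in range(2, 12)]
print(sum(diffs))   # close to 1 - 1/11, about 0.909
```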
Notice that we do not require E to be precisely the set of points at which the
series in (2.10) diverges, only that it does so at all points of E, so E is just a subset
of the set on which some absolutely summable series of functions in Cc (R) does
not converge absolutely. So any subset of a set of measure zero is automatically
of measure zero. To introduce the little trickery we use to unwind the definition
above, consider first the following (important) result.
Lemma 2.3. Any finite union of sets of measure zero is a set of measure zero.
Proof. Since we can proceed in steps, it suffices to show that the union of
two sets of measure zero has measure zero. So, let the two sets be E and F and
two corresponding absolutely summable sequences, as in Definition 2.2, be vj and
w_j. Consider the alternating sequence
(2.11) u_k = v_j if k = 2j − 1 is odd, u_k = w_j if k = 2j is even.
Thus {u_k} simply interlaces the two sequences. It follows that u_k is absolutely summable, since
(2.12) ∑_k ‖u_k‖_{L1} = ∑_j ‖v_j‖_{L1} + ∑_j ‖w_j‖_{L1}.
Moreover, the pointwise series ∑_k |u_k(x)| diverges precisely where one or other of the two series ∑_j |v_j(x)| or ∑_j |w_j(x)| diverges. In particular it must diverge on E ∪ F, which is therefore, from the definition, a set of measure zero.
The definition of f ∈ L1 (R) above certainly requires that the equality on the
right in (2.4) should hold outside a set of measure zero, but in fact a specific one,
the one on which the series on the left diverges. Using the same idea as in the
lemma above we can get rid of this restriction.
Proposition 2.1. If f : R −→ C and there exists a series f_n = ∑_{j=1}^n w_j, with w_j ∈ C_c(R), which is absolutely summable, so ∑_j ‖w_j‖_{L1} < ∞, and a set E ⊂ R of measure zero such that
(2.13) x ∈ R \ E =⇒ f(x) = lim_{n→∞} f_n(x) = ∑_{j=1}^∞ w_j(x),
then f ∈ L^1(R).
Recall that when one writes down an equality such as on the right in (2.13) one is implicitly saying that ∑_{j=1}^∞ w_j(x) converges and the equality holds for the limit. We will call a sequence w_j as above an ‘approximating series’ for f ∈ L^1(R). This is indeed a refinement of the definition, since all f ∈ L^1(R) arise this way, taking E to be the set where ∑_j |w_j(x)| = ∞ for a series as in the definition.
The same sort of identity is true for the pointwise series, which shows that
(2.16) ∑_j |u_j(x)| < ∞ iff ∑_k |w_k(x)| < ∞ and ∑_k |v_k(x)| < ∞,
since the sequence of partial sums of the u_j cycles through f_n(x), f_n(x) + v_n(x), then f_n(x) and then to f_{n+1}(x). Since ∑_k |v_k(x)| < ∞ the sequence |v_n(x)| → 0, so (2.17) indeed follows from (2.13).
This is the trick at the heart of the definition of integrability above. Namely
we can manipulate the series involved in this sort of way to prove things about the
elements of L1 (R). One point to note is that if wj is an absolutely summable series
in C_c(R) then
(2.18) F(x) = { ∑_j |w_j(x)| when this is finite; 0 otherwise } =⇒ F ∈ L^1(R).
The sort of property (2.13), where some condition holds on the complement
of a set of measure zero is so commonly encountered in integration theory that we
give it a simpler name.
Definition 2.3. A condition that holds on R \ E for some set of measure zero,
E, is said to hold almost everywhere. In particular we write
2. Linearity of L1
The word ‘space’ is quoted in the definition of L1 (R) above, because it is not
immediately obvious that L1 (R) is a linear space, even more importantly it is far
from obvious that the integral of a function in L1 (R) is well defined (which is the
point of the exercise after all). In fact we wish to define the integral to be
(2.20) ∫_R f = ∑_n ∫ w_n
Since the left identity holds a.e., so does the right and hence Re f ∈ L1 (R) by
Proposition 2.1. The same argument with the imaginary parts shows that Im f ∈
L1 (R). This also shows that a real element has a real approximating sequence.
The fact that the sum of two integrable functions is integrable really is a simple
consequence of Proposition 2.1 and Lemma 2.3. Indeed, if f, g ∈ L1 (R) have
approximating series wn and vn as in Proposition 2.1 then un = wn +vn is absolutely
summable,
(2.22) ∑_n ∫ |u_n| ≤ ∑_n ∫ |w_n| + ∑_n ∫ |v_n|
and
∑_n w_n(x) = f(x), ∑_n v_n(x) = g(x) =⇒ ∑_n u_n(x) = f(x) + g(x).
The first two conditions hold outside (probably different) sets of measure zero, E
and F, so the conclusion holds outside E ∪ F which is of measure zero. Thus
f + g ∈ L1 (R). The case of cf for c ∈ C is more obvious.
The proof that |f| ∈ L^1(R) if f ∈ L^1(R) is similar, but perhaps a little trickier. Again, let {w_n} be an approximating series as in the definition showing that f ∈ L^1(R). To make a series for |f| we can try the ‘obvious’ thing. Namely we know that
(2.23) ∑_{j=1}^n w_j(x) → f(x) if ∑_j |w_j(x)| < ∞.
So, set
(2.24) v_1(x) = |w_1(x)|, v_k(x) = |∑_{j=1}^k w_j(x)| − |∑_{j=1}^{k−1} w_j(x)| ∀ x ∈ R.
So equality holds off a set of measure zero and we only need to check that {vj } is
an absolutely summable series.
The triangle inequality in the ‘reverse’ form ||v| − |w|| ≤ |v − w| shows that, for k > 1,
(2.26) |v_k(x)| = ||∑_{j=1}^k w_j(x)| − |∑_{j=1}^{k−1} w_j(x)|| ≤ |w_k(x)|.
Thus
(2.27) ∑_k ∫ |v_k| ≤ ∑_k ∫ |w_k| < ∞,
so the vk ’s do indeed form an absolutely summable series and (2.25) holds almost
everywhere, so |f | ∈ L1 (R).
For a positive function this last argument yields a real approximating sequence
with positive partial sums.
By combining these results we can see again that if f, g ∈ L^1(R) are both real-valued then
(2.28) f_+ = max(f, 0), max(f, g), min(f, g) ∈ L^1(R).
Indeed, the positive part is f_+ = ½(|f| + f), max(f, g) = g + (f − g)_+, and min(f, g) = − max(−f, −g).
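The telescoping trick above can be checked mechanically. In this sketch (the toy numbers are mine) the partial sums of the v_k are exactly the absolute values of the partial sums of the w_k, and the reverse triangle inequality |v_k| ≤ |w_k| holds term by term, which is the content of (2.24)-(2.27).

```python
# Verify numerically that v_k = |S_k| - |S_{k-1}| telescopes to |S_n| and
# satisfies |v_k| <= |w_k|, where S_k are the partial sums of w.

w = [3.0, -2.5, 1.0, -0.25, 0.125, -0.0625]   # any absolutely summable start

partial, prev_abs, v = 0.0, 0.0, []
for wk in w:
    partial += wk
    v.append(abs(partial) - prev_abs)   # v_k = |S_k| - |S_{k-1}|
    prev_abs = abs(partial)

print(sum(v), abs(sum(w)))              # partial sums of v telescope to |S_n|
print(all(abs(vk) <= abs(wk) + 1e-12 for vk, wk in zip(v, w)))  # True
```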
3. The integral on L1
Next we want to show that the integral is well defined via (2.20) for any approximating series. From Proposition 2.2 it is enough to consider only real functions.
For this, recall a result concerning a case where uniform convergence of continu-
ous functions follows from pointwise convergence, namely when the convergence is
monotone, the limit is continuous, and the space is compact. It works on a general
compact metric space but we can concentrate on the case at hand.
Proof. Since all the u_n(x) ≥ 0 and they are decreasing (which really means not increasing of course), if u_1(x) vanishes at x then all the other u_n(x) vanish there too. Thus there is one R > 0 such that u_n(x) = 0 if |x| > R for all n, namely one that works for u_1. So we only need consider what happens on [−R, R], which is compact. For any ε > 0 look at the sets
S_n = {x ∈ [−R, R]; u_n(x) ≥ ε}.
This can also be written S_n = u_n^{−1}([ε, ∞)) ∩ [−R, R], and since u_n is continuous it follows that S_n is closed and hence compact. Moreover the fact that the u_n(x) are decreasing means that S_{n+1} ⊂ S_n for all n. Finally,
⋂_n S_n = ∅
since, by assumption, u_n(x) → 0 for each x. Now the property of compact sets in a metric space that we use is that if such a sequence of decreasing compact sets has empty intersection then the sets themselves are empty from some n onwards. This means that there exists N such that sup_x u_n(x) < ε for all n > N. Since ε > 0 was arbitrary, u_n → 0 uniformly.
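A quick numerical companion to this lemma (the sequence is my own example): u_n(x) = x^n (1 − x) on the compact set [0, 1] is continuous, non-negative, decreasing in n, and tends to 0 at every point, so the suprema must tend to 0 as well, i.e. the convergence is uniform.

```python
# Decreasing continuous functions on a compact set, tending to 0 pointwise:
# the sup norms also tend to 0, as the lemma asserts.

def sup_norm(n, steps=10000):
    """Grid approximation of sup over [0, 1] of x**n * (1 - x)."""
    return max((i / steps) ** n * (1 - i / steps) for i in range(steps + 1))

sups = [sup_norm(n) for n in range(1, 60, 10)]   # n = 1, 11, 21, 31, 41, 51
print(sups)                                      # decreasing towards 0
print(all(a >= b for a, b in zip(sups, sups[1:])))  # True
```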
One of the basic properties of the Riemann integral is that the integral of the
limit of a uniformly convergent sequence (even of Riemann integrable functions but
here continuous) is the limit of the sequence of integrals, which is (2.29) in this
case.
We can easily extend this in a useful way – the direction of monotonicity is reversed really just to mentally distinguish this from the preceding lemma.
Lemma 2.5. If v_n ∈ C_c(R) is any increasing sequence such that lim_{n→∞} v_n(x) ≥ 0 for each x ∈ R (where the possibility v_n(x) → ∞ is included) then
(2.30) lim_{n→∞} ∫ v_n dx ≥ 0, including possibly +∞.
Proof. This is really a corollary of the preceding lemma. Consider the se-
quence of functions
(2.31) w_n(x) = 0 if v_n(x) ≥ 0, w_n(x) = −v_n(x) if v_n(x) < 0.
Since this is the maximum of two continuous functions, namely −vn and 0, it is
continuous and it vanishes for large x, so wn ∈ Cc (R). Since vn (x) is increasing,
wn is decreasing and it follows that lim wn (x) = 0 for all x – either it gets there
for some finite n and then stays 0 or the limit of vn (x) is zero. Thus Lemma 2.4
applies to w_n, so
lim_{n→∞} ∫ w_n(x) dx = 0.
Now, v_n(x) ≥ −w_n(x) for all x, so for each n, ∫ v_n ≥ − ∫ w_n. From properties of the Riemann integral, v_{n+1} ≥ v_n implies that ∫ v_n dx is an increasing sequence, and it is bounded below by one that converges to 0, so (2.30) is the only possibility.
From this result applied carefully we see that the integral behaves sensibly for
absolutely summable series.
Lemma 2.6. Suppose u_n ∈ C_c(R) is an absolutely summable series of real-valued functions, so ∑_n ∫ |u_n| dx < ∞, and also suppose that
(2.32) ∑_n u_n(x) = 0 a.e.;
then
(2.33) ∑_n ∫ u_n dx = 0.
Proof. As already noted, the series in (2.33) does converge, since the inequality |∫ u_n dx| ≤ ∫ |u_n| dx shows that it is absolutely convergent (hence Cauchy, hence convergent).
If E is a set of measure zero such that (2.32) holds on the complement then
we can modify un as in (2.14) by adding and subtracting a non-negative absolutely
summable sequence vk which diverges absolutely on E. For the new sequence un
(2.32) is strengthened to
(2.34) ∑_n |u_n(x)| < ∞ =⇒ ∑_n u_n(x) = 0
and the conclusion (2.33) holds for the new sequence if and only if it holds for the
old one.
Now, we need to get ourselves into a position to apply Lemma 2.5. To do
this, just choose some integer N (large but it doesn’t matter yet) and consider the
sequence of functions – it depends on N but I will suppress this dependence –
(2.35) U_1(x) = ∑_{n=1}^{N+1} u_n(x), U_j(x) = |u_{N+j}(x)|, j ≥ 2.
is increasing with p – since we are adding non-negative functions. If the two equivalent conditions in (2.36) hold then
(2.38) ∑_n u_n(x) = 0 =⇒ ∑_{n=1}^{N+1} u_n(x) + ∑_{j=2}^{∞} |u_{N+j}(x)| ≥ 0 =⇒ lim_{p→∞} g_p(x) ≥ 0,
since we are only increasing each term. On the other hand, if these conditions do not hold then the tail, any tail, sums to infinity, so
(2.39) lim_{p→∞} g_p(x) = ∞.
and the final conclusion is the opposite inequality in (2.41). That is, we conclude what we wanted to show, that
(2.43) ∑_{n=1}^∞ ∫ u_n = 0.
Finally then we are in a position to show that the integral of an element of
L1 (R) is well-defined.
Proposition 2.3. If f ∈ L^1(R) then
(2.44) ∫ f = lim_{n→∞} ∑_{k≤n} ∫ u_k = ∑_n ∫ u_n
is independent of the approximating sequence, u_n, used to define it. Moreover,
(2.45) ∫ |f| = lim_{N→∞} ∫ |∑_{k=1}^N u_k|, |∫ f| ≤ ∫ |f|, and lim_{n→∞} ∫ |f − ∑_{j=1}^n u_j| = 0.
Proof. If u_n and u′_n are two series approximating f as in Proposition 2.1, then the real and imaginary parts of the difference u′_n − u_n satisfy the hypothesis of Lemma 2.6, so it follows that
∑_n ∫ u_n = ∑_n ∫ u′_n.
4. SUMMABLE SERIES IN L1 (R) 47
Then the first part of (2.45) follows from this definition of the integral applied
to |f | and the approximating series for |f | devised in the proof of Proposition 2.2.
The inequality
(2.46) |∑_n ∫ u_n| ≤ ∑_n ∫ |u_n|,
which follows from the finite inequalities for the Riemann integrals
|∑_{n≤N} ∫ u_n| ≤ ∑_{n≤N} ∫ |u_n| ≤ ∑_n ∫ |u_n|,
then f ∈ L^1(R),
∫ f = ∑_n ∫ f_n,
(2.51) |∫ f| ≤ ∫ |f| = lim_{n→∞} ∫ |∑_{j=1}^n f_j| ≤ ∑_j ∫ |f_j|, and
lim_{n→∞} ∫ |f − ∑_{j=1}^n f_j| = 0.
Proof. The proof is very like the proof of completeness via absolutely sum-
mable series for a normed space outlined in the preceding chapter.
By assumption each f_n ∈ L^1(R), so there exists a sequence u_{n,j} ∈ C_c(R) with ∑_j ∫ |u_{n,j}| < ∞ and
(2.52) ∑_j |u_{n,j}(x)| < ∞ =⇒ f_n(x) = ∑_j u_{n,j}(x).
We might hope that f (x) is given by the sum of the un,j (x) over both n and j, but
in general, this double series is not absolutely summable. However we can replace
it by one that is. For each n choose Nn so that
(2.53) ∑_{j>N_n} ∫ |u_{n,j}| < 2^{−n}.
This is possible by the assumed absolute summability – the tail of the series therefore being small. Having done this, we replace the series u_{n,j} by
(2.54) u′_{n,1} = ∑_{j≤N_n} u_{n,j}(x), u′_{n,j}(x) = u_{n,N_n+j−1}(x) ∀ j ≥ 2,
summing the first Nn terms. This still sums to fn on the same set as in (2.52). So
in fact we can simply replace u_{n,j} by u′_{n,j} and we have in addition the estimate
(2.55) ∑_j ∫ |u′_{n,j}| ≤ ∫ |f_n| + 2^{−n+1} ∀ n.
procedure for instance. This gives a new series of continuous functions of compact support which is absolutely summable, since
(2.57) ∑_{k=1}^N ∫ |v_k| ≤ ∑_{n,j} ∫ |u′_{n,j}| ≤ ∑_n (∫ |f_n| + 2^{−n+1}) < ∞.
The set where (2.58) fails is a set of measure zero, by definition. Thus f ∈ L1 (R)
and (2.49) also follows. To get the final result (2.51), rearrange the double series
for the integral (which is also absolutely convergent).
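The double-series bookkeeping used here, rearranging an absolutely summable doubly indexed series into a single series without changing the sum, can be sketched numerically (the toy array is mine, not from the text):

```python
# An absolutely summable doubly indexed array can be reordered into a single
# series, e.g. by diagonals, without changing the sum: the step used to pass
# from the doubly indexed terms to a single series v_k.

a = [[2.0 ** (-n) * 3.0 ** (-j) for j in range(1, 8)] for n in range(1, 8)]

row_by_row = sum(sum(row) for row in a)

# diagonal (Cantor) ordering: list entries with n + j = s for s = 2, 3, ...
diag = []
for s in range(2, 15):
    for n in range(1, 8):
        j = s - n
        if 1 <= j <= 7:
            diag.append(a[n - 1][j - 1])

print(row_by_row, sum(diag))   # the two orderings give the same sum
```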
For the moment we only need the weakest part, (2.49), of this. To paraphrase
this, for any absolutely summable series of integrable functions the absolute point-
wise series converges off a set of measure zero – it can only diverge on a set of
measure zero. It is rather shocking, but this allows us to prove the rest of Proposition 2.4! Namely, suppose f ∈ L^1(R) and ∫ |f| = 0. Then Proposition 2.5 applies to the series with each term being |f|. This is absolutely summable since all the integrals are zero. So it must converge pointwise except on a set of measure zero. Clearly it diverges whenever f(x) ≠ 0,
(2.59) ∫ |f| = 0 =⇒ {x; f(x) ≠ 0} has measure zero
which is what we wanted to show to finally complete the proof of Proposition 2.4.
and gives a semi-norm on L^1(R). It follows from Proposition 1.5 that on the quotient, ‖[f]‖ is indeed a norm.
The completeness of L^1(R) is a direct consequence of Proposition 2.5. Namely, to show a normed space is complete it is enough to check that any absolutely summable series converges.
Note that despite the fact that it is technically incorrect, everyone says ‘L^1(R) is the space of Lebesgue integrable functions’ even though it is really the space of equivalence classes of these functions modulo equality almost everywhere. Not much harm can come from this mild abuse of language.
Another consequence of Proposition 2.5 and the proof above is an extension of
Lemma 2.3.
Proposition 2.6. Any countable union of sets of measure zero is a set of
measure zero.
Proof. If E is a set of measure zero then any function f which is defined on R and vanishes outside E is a null function – it is in L^1(R) and has ∫ |f| = 0. Conversely, if the characteristic function of E, the function equal to 1 on E and zero on R \ E, is integrable and has integral zero, then E has measure zero. This is the characterization of null functions above. Now, if E_j is a sequence of sets of measure zero and χ_k is the characteristic function of
(2.64) ⋃_{j≤k} E_j
then ∫ |χ_k| = 0, so this is an absolutely summable series with sum, the characteristic function of the union, integrable and of integral zero.
In the usual approach through measure one has the concept of a measurable, non-negative function for which the integral ‘exists but is infinite’ – we do not have this (but we could easily do it, or rather you could). Using this one can drop the assumption about the finiteness of the integral, but the result is not significantly stronger.
Proof. Since we can change the sign of the f_i it suffices to assume that the f_i are monotonically increasing. The sequence of integrals is therefore also monotonically increasing and, being bounded, converges. Turning the sequence into a series, by setting g_1 = f_1 and g_j = f_j − f_{j−1} for j ≥ 2, the g_j are non-negative for j ≥ 2 and
(2.67) ∑_{j≥2} ∫ |g_j| = ∑_{j≥2} ∫ g_j = lim_{n→∞} ∫ f_n − ∫ f_1
The second part, corresponding to convergence for the equivalence classes in L^1(R), follows from the fact established earlier about |f|, but here it also follows from the monotonicity, since f(x) ≥ f_j(x) a.e., so
(2.69) ∫ |f − f_j| = ∫ f − ∫ f_j → 0 as j → ∞.
Now, to Fatou’s Lemma. This really just takes the monotonicity result and
applies it to a sequence of integrable functions with bounded integral. You should
recall that the max and min of two real-valued integrable functions is integrable
and that
Z Z Z
(2.70) min(f, g) ≤ min( f, g).
Proof. You should remind yourself of the properties of lim inf as necessary!
Fix k and consider

(2.73) Fk,n (x) = min_{k≤p≤k+n} fp (x) ∈ L1 (R).
Note that for a decreasing sequence of non-negative numbers the limit exists and
is indeed the infimum. Thus in fact,
(2.75) ∫ gk ≤ lim inf ∫ fn ∀ k.
Now, let k vary. Then, the infimum in (2.74) is over a set which decreases as k
increases. Thus the gk (x) are increasing. The integrals of this sequence are bounded
above in view of (2.75) since we assumed a bound on the ∫ fn ’s. So, we can apply
the monotonicity result again to see that
f (x) = lim_{k→∞} gk (x) exists a.e. and f ∈ L1 (R) has

(2.76) ∫ f ≤ lim inf ∫ fn .
Since f (x) = lim inf fn (x), by definition of the latter, we have proved the Lemma.
Notice the change on the right from liminf to limsup because of the sign.
Now we can apply the same argument to gj0 (x) = h(x) + fj (x) since this is also
non-negative and has integrals bounded above. This converges a.e. to h(x) + f (x)
so this time we conclude that
(2.80) ∫ h + ∫ f ≤ lim inf ∫ (h + fj ) = ∫ h + lim inf ∫ fj .
In both inequalities (2.79) and (2.80) we can cancel an ∫ h and combining them we
find
(2.81) lim sup ∫ fj ≤ ∫ f ≤ lim inf ∫ fj .
In particular the limsup on the left is smaller than, or equal to, the liminf on the
right, for the same real sequence. This however implies that they are equal and
that the sequence ∫ fj converges. Thus indeed
(2.82) ∫ f = lim_{n→∞} ∫ fn . □
Note that the ‘improper integral’ without the absolute value can converge without
u being Lebesgue integrable.
Proof. If (2.83) holds then consider the sequence of functions vN = χ[−N,N ] |u|,
which we know to be in L1 (R) by Lemma 2.2. This is monotonic increasing with
limit |u|, so the Monotonicity Lemma shows that |u| ∈ L1 (R). Then consider
wN = χ[−N,N ] u which we also know to be in L1 (R). Since it is bounded by |u| and
converges pointwise to u, it follows from Dominated Convergence that u ∈ L1 (R).
Conversely, if u ∈ L1 (R) then |u| ∈ L1 (R) and χ[−N,N ] |u| ∈ L1 (R) converges to |u|
so by Dominated Convergence (2.83) must hold.
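The standard example for the remark above is u(x) = sin(x)/x: the improper integral converges, but ∫ |u| over [−N, N] grows without bound (roughly like (2/π) log N), so u is not Lebesgue integrable. A crude numerical sketch (Python; the step size and names are my choices, not part of the notes):

```python
import math

# Midpoint Riemann sum of |sin(x)/x| over [0, N]; this grows roughly
# like (2/pi) log N, so sin(x)/x is not Lebesgue integrable on R even
# though its improper integral converges.
def integral_abs_sinc(N, steps_per_unit=200):
    h = 1.0 / steps_per_unit
    total, x = 0.0, h / 2
    while x < N:
        total += abs(math.sin(x) / x) * h
        x += h
    return total

assert integral_abs_sinc(100) > integral_abs_sinc(10) + 1.0
```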
7. Notions of convergence
We have been dealing with two basic notions of convergence, but really there
are more. Let us pause to clarify the relationships between these different concepts.
(1) Convergence of a sequence in L1 (R) (or by slight abuse of language in
L1 (R)) – f and fn ∈ L1 (R) and
(2.84) kf − fn kL1 → 0 as n → ∞.
(2) Convergence almost everywhere:- For some sequence of functions fn and
function f,
(2.85) fn (x) → f (x) as n → ∞ for x ∈ R \ E
where E ⊂ R is of measure zero.
(3) Dominated convergence:- For fj ∈ L1 (R) (or representatives in L1 (R))
such that |fj | ≤ F (a.e.) for some F ∈ L1 (R) and (2.85) holds.
(4) What we might call ‘absolutely summable convergence’. Thus fn ∈ L1 (R)
are such that fn = Σ_{j=1}^{n} gj where gj ∈ L1 (R) and Σ_j ∫ |gj | < ∞. Then (2.85)
holds for some f.
(5) Monotone convergence. For fj ∈ L1 (R), real valued and monotonic, we
require that ∫ fj is bounded and it then follows that fj → f almost
everywhere, with f ∈ L1 (R), that the convergence is in L1 and also that
∫ f = limj ∫ fj .
So, one important point to know is that (1) does not imply (2). Nor conversely
does (2) imply (1) even if we assume that all the fj and f are in L1 (R).
However, monotone convergence implies dominated convergence. Namely if f is
the limit then |fj | ≤ |f | + 2|f1 | and fj → f almost everywhere. Also, monotone
convergence implies convergence with absolute summability: simply by taking the
sequence to have first term f1 and subsequent terms fj − fj−1 (assuming that fj
is monotonically increasing) one gets an absolutely summable series with sequence
of finite sums converging to f. Similarly absolutely summable convergence implies
dominated convergence for the sequence of partial sums; by monotone convergence
the series Σ_n |fn (x)| converges a.e. and in L1 to some function F which dominates
the partial sums which in turn converge pointwise. I suggest that you make a
diagram with these implications in it so that you are clear about the relationships
between them.
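That (1) does not imply (2) is shown by the ‘typewriter’ sequence of indicator functions: blocks of shrinking width sweeping repeatedly across [0, 1]. A small Python sketch of it (the indexing scheme and names are mine):

```python
# The "typewriter" sequence: for n = 2**k + j (0 <= j < 2**k) let f_n be
# the indicator of [j/2**k, (j+1)/2**k).  The L^1 norms 2**-k tend to 0,
# but at every x in [0, 1) the values f_n(x) = 1 recur for infinitely
# many n, so there is no pointwise limit.
def block(n):
    k = n.bit_length() - 1
    j = n - 2 ** k
    return (j / 2 ** k, (j + 1) / 2 ** k)

def f(n, x):
    a, b = block(n)
    return 1.0 if a <= x < b else 0.0

lengths = [b - a for a, b in (block(n) for n in range(1, 200))]
assert lengths[150] < 0.01                     # L^1 norms -> 0
hits = [n for n in range(1, 200) if f(n, 0.3) == 1.0]
assert len(hits) >= 5                          # f_n(0.3) = 1 again and again
```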
Nor would this approach work for L1 (R) since |f | ∈ L1 (R) does not imply that
f ∈ L1 (R).
Definition 2.4. A function f : R −→ C is said to be ‘Lebesgue square inte-
grable’, written f ∈ L2 (R), if there exists a sequence un ∈ Cc (R) such that
(2.87) un (x) → f (x) a.e. and |un (x)|2 ≤ F (x) a.e. for some F ∈ L1 (R).
Proposition 2.7. The space L2 (R) is linear, f ∈ L2 (R) implies |f |2 ∈ L1 (R)
and (2.86) defines a seminorm on L2 (R) which vanishes precisely on the null func-
tions N ⊂ L2 (R).
Definition 2.5. We define L2 (R) = L2 (R)/N .
So we know that L2 (R) is a normed space. It is in fact complete and much more!
Proof. First to see the linearity of L2 (R) note that if f ∈ L2 (R) and c ∈ C
then cf ∈ L2 (R) since if un is a sequence as in the definition for f then cun is such
a sequence for cf.
Similarly if f, g ∈ L2 (R) with sequences un and vn then wn = un + vn has the
first property – since we know that the union of two sets of measure zero is a set
of measure zero and the second follows from the estimate
(2.88) |wn (x)|2 = |un (x) + vn (x)|2 ≤ 2|un (x)|2 + 2|vn (x)|2 ≤ 2(F + G)(x)
where |un (x)|2 ≤ F (x) and |vn (x)|2 ≤ G(x) with F, G ∈ L1 (R).
Moreover, if f ∈ L2 (R) then the sequence |un (x)|2 converges pointwise almost
everywhere to |f (x)|2 so by Lebesgue’s Dominated Convergence, |f |2 ∈ L1 (R). Thus
kf kL2 is well-defined. It vanishes if and only if |f |2 ∈ N but this is equivalent to
f ∈ N – conversely N ⊂ L2 (R) since the zero sequence works in the definition
above.
So we only need to check the triangle inequality, absolute homogeneity being
clear, to deduce that L2 = L2 /N is at least a normed space. In fact we checked
this earlier on Cc (R) and the general case follows by continuity:-
We will get a direct proof of the triangle inequality as soon as we start talking
about (pre-Hilbert) spaces.
So it only remains to check the completeness of L2 (R), which is really the whole
point of the discussion of Lebesgue integration.
Theorem 2.3. The space L2 (R) is complete with respect to k · kL2 and is a
completion of Cc (R) with respect to this norm.
Proof. That Cc (R) ⊂ L2 (R) follows directly from the definition and the fact
that a continuous null function must vanish. This is a dense subset since, if f ∈
L2 (R) a sequence un ∈ Cc (R) as in Definition 2.4 satisfies
(2.90) |un (x) − um (x)|2 ≤ 4F (x) ∀ n, m,
At this point I normally move on to the next chapter on Hilbert spaces with
L2 (R) as one motivating example.
Two of these sets intersect if and only if they have elements differing by a rational,
and then they are the same.
Now, each of these sets q −1 (τ ) intersects [0, 1]. This follows from the density of
the rationals in the reals, since if x ∈ q −1 (τ ) there exists r ∈ Q such that |x − r| < 1/2
and then x′ = x + (−r + 1/2) ∈ q −1 (τ ) ∩ [0, 1]. So we can ‘localize’ (2.100) to
(2.101) [0, 1] = ⨆_{τ ∈ R/Q} L(τ ), L(τ ) = q −1 (τ ) ∩ [0, 1]
Now, we can simply order the sets Vr into a sequence Ai by ordering the rationals
in [−1, 1].
Suppose V is of finite Lebesgue measure. Then we know that all the Vr are
of finite measure and µ(Vr ) = µ(V ) = µ(Ai ) for all i, from the properties of the
Lebesgue integral. This means that (2.98) applies, so we have the inequalities
(2.104) µ([0, 1]) = 1 ≤ Σ_{i=1}^{∞} µ(V ) ≤ 3 = µ([−1, 2]).
Clearly we have a problem! The only way the right-hand inequality can hold is if
µ(V ) = 0, but then the left-hand inequality fails.
Our conclusion then is that V cannot be Lebesgue measurable! Or is it? Since
we are careful people we trace back through the discussion and see (it took people
a long, long, time to recognize this) more precisely:-
Proposition 2.8. If a Vitali set, V ⊂ [0, 1] exists, containing precisely one
element of each of the sets L(τ ), then it is bounded and not of finite Lebesgue
measure; its characteristic function is a non-negative function of bounded support
which is not Lebesgue integrable.
Okay, so what is the ‘issue’ here. It is that the existence of such a Vitali set
requires the Axiom of Choice. There are lots of sets L(τ ) so from the standard
(Zermelo-Fraenkel) axioms of set theory it does not follow that you can ‘choose an
element from each’ to form a new set. That is a (slightly informal) version of the
additional axiom. Now, it has been shown (namely by Gödel and Cohen) that the
Axiom of Choice is independent of the Zermelo-Fraenkel Axioms. This does not
mean consistency, it means conditional consistency. The Zermelo-Fraenkel axioms
together with the Axiom of Choice are inconsistent if and only if the Zermelo-
Fraenkel axioms on their own are inconsistent.
Conclusion: As a working Mathematician you are free to choose to believe in
the Axiom of Choice or not. It will make your life easier if you do, but it is up to
you. Note that if you do not admit the Axiom of Choice, it does not mean that
all bounded real sets are measurable, in the sense that you can prove it. Rather it
means that it is consistent to believe this (as shown by Solovay).
See also the discussion of the Hahn-Banach Theorem in Section 1.12.
The real and imaginary parts of a simple function are simple and the positive
and negative parts of a real simple function are simple. Since step functions are
simple, we know that simple functions are dense in L1 (R) and that if 0 ≤ F ∈ L1 (R)
then there exists a sequence of simple functions (take them to be a summable
sequence of step functions) fn ≥ 0 such that fn → F almost everywhere and
fn ≤ G for some other G ∈ L1 (R).
We elevate a special case of the second notion of convergence above to a defi-
nition.
Definition 2.8. A function f : R −→ C is (Lebesgue) measurable if it is the
pointwise limit almost everywhere of a sequence of simple functions.
Lemma 2.10. A function is Lebesgue measurable if and only if it is the pointwise
limit, almost everywhere, of a sequence of continuous functions of compact support.
Proof. Continuous functions of compact support are the uniform limits of
step functions, so this condition certainly implies measurability in the sense of
Definition 2.8. Conversely, suppose a function f is the limit almost everywhere
of a sequence un of simple functions. Each of these functions is integrable, so we
can find φn ∈ Cc (R) such that kun − φn kL1 < 2−n . Then the telescoped sequence
v1 = u1 − φ1 , vk = (uk − φk ) − (uk−1 − φk−1 ), k > 1, is absolutely summable so
un − φn → 0 almost everywhere, and hence φn → f off a set of measure zero.
is a seminorm on the linear space Lp (R) vanishing only on the null functions and
making the quotient Lp (R) = Lp (R)/N into a Banach space.
Proof. The real part of an element of Lp (R) is in Lp (R) since it is measurable
and | Re f |p ≤ |f |p so | Re f |p ∈ L1 (R). Similarly, Lp (R) is linear; it is clear that
cf ∈ Lp (R) if f ∈ Lp (R) and c ∈ C and the sum of two elements, f, g, is measurable
and satisfies |f + g|p ≤ 2p (|f |p + |g|p ) so |f + g|p ∈ L1 (R).
We next strengthen (2.107) to the approximation condition that there exists a
sequence of simple functions vn such that
(2.110) vn → f a.e. and |vn |p ≤ F ∈ L1 (R) a.e.
which certainly implies (2.107). As in the proof of Lemma 2.11, suppose f ∈
Lp (R) is real and choose real-valued simple functions fn converging to f almost
everywhere. Since |f |p ∈ L1 (R) there is a sequence of simple functions 0 ≤ hn such
that |hn | ≤ F for some F ∈ L1 (R) and hn → |f |p almost everywhere. Then set
gn = hn^{1/p} , which is also a sequence of simple functions, and define vn by (2.106). It
follows that (2.110) holds for the real part of f but combining sequences for real
and imaginary parts such a sequence exists in general.
The advantage of the approximation condition (2.110) is that it allows us to
conclude that the triangle inequality holds for kukLp defined by (2.109) since we
know it for simple functions and from (2.110) it follows that |vn |p → |f |p in L1 (R)
so kvn kLp → kf kLp . Then if wn is a similar sequence for g ∈ Lp (R)
(2.111)
kf +gkLp ≤ lim sup kvn +wn kLp ≤ lim sup kvn kLp +lim sup kwn kLp = kf kLp +kgkLp .
n n n
The other two conditions being clear it follows that kukLp is a seminorm on Lp (R).
The vanishing of kukLp implies that |u|p ∈ N and hence u ∈ N and the converse
follows immediately. Thus Lp (R) = Lp (R)/N is a normed space and it only remains
to check completeness.
We know that completeness is equivalent to the convergence of any absolutely
summable series. So, we can suppose fn ∈ Lp (R) have
(2.112) Σ_n ( ∫ |fn |p )^{1/p} < ∞.
Consider the sequence gn = fn χ[−R,R] for some fixed R > 0. This is in L1 (R) and
(2.113) ‖gn ‖L1 ≤ (2R)^{1/q} ‖fn ‖Lp
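Estimate (2.113) is just Hölder’s inequality on the interval [−R, R]. A crude numerical check of that inequality (Python; the Riemann-sum helper is my own device, not part of the development):

```python
# Hoelder on [-R, R]: int |f| <= (2R)**(1/q) * (int |f|**p)**(1/p),
# with 1/p + 1/q = 1, checked with a midpoint Riemann sum for
# f(x) = x**2 and p = 2.
def riemann(g, a, b, n=20000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

R, p = 1.0, 2.0
q = p / (p - 1.0)
f = lambda x: x * x
lhs = riemann(lambda x: abs(f(x)), -R, R)
rhs = (2 * R) ** (1 / q) * riemann(lambda x: abs(f(x)) ** p, -R, R) ** (1 / p)
assert lhs <= rhs   # roughly 0.667 <= 0.894
```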
Proof. The first part follows from the fact that the constant function 1 is
locally integrable and hence χR\A = 1 − χA is locally integrable if and only if χA is
locally integrable.
Notice the relationship between a characteristic function and the set it defines:-
for any sequence {Em } of sets in Σ which are disjoint (in pairs).
As for Lebesgue measure a set A ∈ Σ is ‘measurable’ and if µ(A) is not of finite
measure it is said to have infinite measure – for instance R is of infinite measure
in this sense. Since the measure of a set is always non-negative (or undefined if it
isn’t measurable) this does not cause any problems and in fact Lebesgue measure
is countably additive as in (2.124) provided we allow ∞ as a value of the measure.
It is a good exercise to prove this!
The upshot of this lemma is that we can integrate again, and hence a total of
n times and so define the (iterated) Riemann integral as
(2.130) ∫_{Rn} u(z) dz = ∫_{−R}^{R} ∫_{−R}^{R} · · · ∫_{−R}^{R} u(x1 , x2 , x3 , . . . , xn ) dx1 dx2 . . . dxn ∈ C.
Now, one slightly annoying thing is that we would really want to know that
the integral is independent of the order of integration. In fact it is not hard – see
Problem XX. Again using properties of the one-dimensional Riemann integral we
find:-
Lemma 2.15. The iterated integral
(2.133) ‖u‖L1 = ∫_{Rn} |u|
is a norm on Cc (Rn ).
Definition 2.11. The space L1 (Rn ) is defined to consist of those functions
f : Rn −→ C such that there exists a sequence {fn } which is absolutely summable
with respect to the L1 norm and such that
(2.134) Σ_n |fn (x)| < ∞ =⇒ Σ_n fn (x) = f (x).
Now you can go through the whole discussion above in this higher dimensional
case, and the only changes are really notational!
Things get a little more complicated in the discussion of change of variable.
This is covered in the problems. There are also a few other theorems it is good to
know!
CHAPTER 3
Hilbert spaces
There are really three ‘types’ of Hilbert spaces (over C). The finite dimen-
sional ones, essentially just Cn , for different integer values of n, with which you are
pretty familiar, and two infinite dimensional types corresponding to being separa-
ble (having a countable dense subset) or not. As we shall see, there is really only
one separable infinite-dimensional Hilbert space (no doubt you realize that the Cn
are separable) and that is what we are mostly interested in. Nevertheless we try
to state results in general and then give proofs (usually they are the nicest ones)
which work in the non-separable cases too.
I will first discuss the definition of pre-Hilbert and Hilbert spaces and prove
Cauchy’s inequality and the parallelogram law. This material can be found in many
other places, so the discussion here will be kept succinct. One nice source is the
book of G.F. Simmons, “Introduction to topology and modern analysis” [5]. I like
it – but I think it is long out of print.

1. pre-Hilbert spaces
Proof. The first condition on a norm follows from (3.2). Absolute homogene-
ity follows from (3.1) since
(3.6) kλuk2 = hλu, λui = |λ|2 kuk2 .
So, it is only the triangle inequality we need. This follows from the next lemma,
which is the Cauchy-Schwarz inequality in this setting – (3.8). Indeed, using the
‘sesqui-linearity’ to expand out the norm
(3.7) ku + vk2 = hu + v, u + vi
= kuk2 + hu, vi + hv, ui + kvk2 ≤ kuk2 + 2kukkvk + kvk2
= (kuk + kvk)2 .
Corollary 3.1. The inner product is continuous on the metric space (i.e. with
respect to the norm) H × H.
Proof. Corollaries really aren’t supposed to require proof! If (uj , vj ) → (u, v)
then, by definition ku − uj k → 0 and kv − vj k → 0 so from
2. Hilbert spaces
Definition 3.1. A Hilbert space H is a pre-Hilbert space which is complete
with respect to the norm induced by the inner product.
As examples we know that Cn with the usual inner product
(3.14) ⟨z, z′ ⟩ = Σ_{j=1}^{n} zj \overline{z′j}
is a Hilbert space – since any finite dimensional normed space is complete. The
example we had from the beginning of the course is l2 with the extension of (3.14)
(3.15) ⟨a, b⟩ = Σ_{j=1}^{∞} aj \overline{bj} , a, b ∈ l2 .
3. Orthonormal sequences
Two elements of a pre-Hilbert space H are said to be orthogonal if
(3.18) hu, vi = 0 which can be written u ⊥ v.
A sequence of elements ei ∈ H, (finite or infinite) is said to be orthonormal if
kei k = 1 for all i and hei , ej i = 0 for all i 6= j.
Proposition 3.1 (Bessel’s inequality). If ei , i ∈ N, is an orthonormal sequence
in a pre-Hilbert space H, then
(3.19) Σ_i |⟨u, ei ⟩|² ≤ ‖u‖² ∀ u ∈ H.
Proof. Start with the finite case, i = 1, . . . , N. Then, for any u ∈ H set

(3.20) v = Σ_{i=1}^{N} ⟨u, ei ⟩ ei

and expand, using orthonormality, 0 ≤ ‖u − v‖² = ‖u‖² − Σ_{i=1}^{N} |⟨u, ei ⟩|² ,
which is (3.19).
In case the sequence is infinite this argument applies to any finite subsequence,
ei , i = 1, . . . , N since it just uses orthonormality, so (3.19) follows by taking the
supremum over N.
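Bessel’s inequality is easy to see in a finite-dimensional model, where an orthonormal set that is not maximal gives strict inequality. A Python sketch (the vectors and names are mine):

```python
# Bessel's inequality (3.19) in R^3 for the orthonormal (but not maximal)
# set {e_1, e_2}: the sum of squared coefficients is at most ||u||^2,
# strictly smaller here because u sticks out of the span.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

es = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
u = [1.0, 2.0, 2.0]
bessel_sum = sum(dot(u, e) ** 2 for e in es)
assert bessel_sum == 5.0
assert bessel_sum < dot(u, u)   # 5 < 9 = ||u||^2
```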
4. Gram-Schmidt procedure
Definition 3.2. An orthonormal sequence, {ei }, (finite or infinite) in a pre-
Hilbert space is said to be maximal if
(3.23) u ∈ H, hu, ei i = 0 ∀ i =⇒ u = 0.
Theorem 3.1. Every separable pre-Hilbert space contains a maximal orthonor-
mal sequence.
Proof. Take a countable dense subset – which can be arranged as a sequence
{vj } and the existence of which is the definition of separability – and orthonormalize
it. First if v1 ≠ 0 set e1 = v1 /‖v1 ‖. Proceeding by induction we can suppose we
have found, for a given integer n, elements ei , i = 1, . . . , m, where m ≤ n, which
are orthonormal and such that the linear span
(3.24) sp(e1 , . . . , em ) = sp(v1 , . . . , vn ).
We certainly have this for n = 1. To show the inductive step observe that if vn+1
is in the span(s) in (3.24) then the same ei ’s work for n + 1. So we may as well
assume that the next element, vn+1 is not in the span in (3.24). It follows that
(3.25) w = vn+1 − Σ_{j=1}^{m} ⟨vn+1 , ej ⟩ ej ≠ 0 so em+1 = w/‖w‖
makes sense. By construction it is orthogonal to all the earlier ei ’s so adding em+1
gives the equality of the spans for n + 1.
Thus we may continue indefinitely, since in fact the only way the dense set
could be finite is if we were dealing with the space with one element, 0, in the first
place. There are only two possibilities, either we get a finite set of ei ’s or an infinite
sequence. In either case this must be a maximal orthonormal sequence. That is,
we claim
(3.26) H ∋ u, u ⊥ ej ∀ j =⇒ u = 0.
This uses the density of the vj ’s. There must exist a sequence wk , where each wk is
one of the vj , such that wk → u in H, where u is assumed to satisfy the condition in
(3.26). Now, each vj , and hence each wk , is a finite linear combination of the el ’s so, by Bessel’s inequality
(3.27) ‖wk ‖² = Σ_l |⟨wk , el ⟩|² = Σ_l |⟨u − wk , el ⟩|² ≤ ‖u − wk ‖²
where hu, el i = 0 for all l has been used. Thus kwk k → 0 and hence u = 0.
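The inductive step (3.25) is exactly the classical Gram-Schmidt algorithm. A sketch of it in Python for Euclidean Rⁿ (the tolerance for deciding w = 0 is a floating-point artifact, and the names are mine):

```python
# Gram-Schmidt: subtract from each new vector its components along the
# e_j already found, discard it if the remainder (numerically) vanishes,
# otherwise normalize the remainder and append it.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vs, tol=1e-12):
    es = []
    for v in vs:
        w = list(v)
        for e in es:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        n = dot(w, w) ** 0.5
        if n > tol:          # v was not in the span of the earlier vectors
            es.append([wi / n for wi in w])
    return es

es = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 0.0], [2.0, 1.0, 0.0]])
assert len(es) == 2                         # third vector lies in the span
assert abs(dot(es[0], es[1])) < 1e-9        # orthogonality
assert abs(dot(es[1], es[1]) - 1.0) < 1e-9  # normalization
```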
Although a non-complete but separable pre-Hilbert space has maximal or-
thonormal sets, these are not much use without completeness.
5. Orthonormal bases
Definition 3.3. In view of the following result, a maximal orthonormal se-
quence in a separable Hilbert space will be called an orthonormal basis; it is often
called a ‘complete orthonormal basis’ but the ‘complete’ is really redundant.
This notion of basis is not quite the same as in the finite dimensional case (although
it is a legitimate extension of it). There are other, quite different, notions of a basis
in infinite dimensions. See for instance ‘Hamel basis’ which arises in some settings
– it is discussed briefly in §1.12 and can be used to show the existence of a non-
continuous functional on a Banach space.
Theorem 3.2. If {ei } is an orthonormal basis (a maximal orthonormal se-
quence) in a Hilbert space then for any element u ∈ H the ‘Fourier-Bessel series’
converges to u :
(3.28) u = Σ_{i=1}^{∞} ⟨u, ei ⟩ ei .
which is small for large m by Bessel’s inequality. Since we are now assuming
completeness, um → w in H. However, ⟨um , ei ⟩ = ⟨u, ei ⟩ as soon as m > i and
|⟨w − um , ei ⟩| ≤ ‖w − um ‖ so in fact

(3.31) ⟨w, ei ⟩ = lim_{m→∞} ⟨um , ei ⟩ = ⟨u, ei ⟩
for each i. Thus u − w is orthogonal to all the ei so by the assumed completeness
of the orthonormal basis must vanish. Thus indeed (3.28) holds.
6. Isomorphism to l2
A finite dimensional Hilbert space is isomorphic to Cn with its standard inner
product. Similarly from the result above
Proposition 3.2. Any infinite-dimensional separable Hilbert space (over the
complex numbers) is isomorphic to l2 , that is there exists a linear map
(3.32) T : H −→ l2
which is 1-1, onto and satisfies hT u, T vil2 = hu, viH and kT ukl2 = kukH for all u,
v ∈ H.
This maps H into l2 by Bessel’s inequality. Moreover, it is linear since the entries
in the sequence are linear in u. It is 1-1 since T u = 0 implies hu, ej i = 0 for all j
implies u = 0 by the assumed completeness of the orthonormal basis. It is surjective
since if {cj }_{j=1}^{∞} ∈ l2 then

(3.34) u = Σ_{j=1}^{∞} cj ej
converges in H. This is the same argument as above – the sequence of partial sums
is Cauchy since if n > m,
(3.35) ‖ Σ_{j=m+1}^{n} cj ej ‖²_H = Σ_{j=m+1}^{n} |cj |² .
7. Parallelogram law
What exactly is the difference between a general Banach space and a Hilbert
space? It is of course the existence of the inner product defining the norm. In fact
it is possible to formulate this condition intrinsically in terms of the norm itself.
Proposition 3.4. Any normed space where the norm satisfies the parallelogram
law, (3.36), is a pre-Hilbert space in the sense that
(3.38) ⟨v, w⟩ = (1/4) ( ‖v + w‖² − ‖v − w‖² + i‖v + iw‖² − i‖v − iw‖² )
is a positive-definite Hermitian inner product which reproduces the norm.
So, when we use the parallelogram law and completeness we are using the
essence of the Hilbert space.
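The polarization identity (3.38) can be checked numerically: the sketch below (Python; the vectors and names are mine) recovers the standard inner product on C² from the norm alone.

```python
# Polarization identity: recover <v, w> on C^n from the norm.
# Vectors are lists of Python complex numbers; norm2 is the squared norm.
def norm2(u):
    return sum(abs(z) ** 2 for z in u)

def polarize(v, w):
    def comb(s):   # ||v + s*w||^2
        return norm2([x + s * y for x, y in zip(v, w)])
    return (comb(1) - comb(-1) + 1j * comb(1j) - 1j * comb(-1j)) / 4

v = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]
direct = sum(a * b.conjugate() for a, b in zip(v, w))
assert abs(polarize(v, w) - direct) < 1e-9
```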
is closed.
Now, suppose W is closed. If W = H then W ⊥ = {0} and there is nothing to
show. So consider u ∈ H, u ∈
/ W and set
(3.46) C = u + W = {u′ ∈ H; u′ = u + w, w ∈ W }.
Then C is closed, since a sequence in it is of the form u′n = u + wn where wn is a
sequence in W and u′n converges if and only if wn converges. Also, C is non-empty,
since u ∈ C and it is convex since u′ = u + w′ and u′′ = u + w′′ in C implies
(u′ + u′′ )/2 = u + (w′ + w′′ )/2 ∈ C.
Thus the length minimization result above applies and there exists a unique
v ∈ C such that ‖v‖ = inf_{u′ ∈C} ‖u′ ‖. The claim is that this v is orthogonal to W –
draw a picture in two real dimensions! To see this consider an arbitrary point w ∈ W
and λ ∈ C then v + λw ∈ C and
(3.47) kv + λwk2 = kvk2 + 2 Re(λhv, wi) + |λ|2 kwk2 .
Choose λ = teiθ where t is real and the phase is chosen so that eiθ hv, wi = |hv, wi| ≥
0. Then the fact that kvk is minimal means that
(3.48) ‖v‖² + 2t|⟨v, w⟩| + t²‖w‖² ≥ ‖v‖² =⇒
t(2|⟨v, w⟩| + t‖w‖²) ≥ 0 ∀ t ∈ R =⇒ |⟨v, w⟩| = 0
which is what we wanted to show.
Thus indeed, given u ∈ H \ W we have constructed v ∈ W ⊥ such that u =
v + w, w ∈ W. This is (3.44) with the uniqueness of the decomposition already
shown since it reduces to 0 having only the decomposition 0 + 0 and this in turn is
W ∩ W ⊥ = {0}.
Since the construction in the preceding proof associates a unique element in W,
a closed linear subspace, to each u ∈ H, it defines a map
(3.49) ΠW : H −→ W.
This map is linear, by the uniqueness since if ui = vi + wi , wi ∈ W, hvi , wi i = 0 are
the decompositions of two elements then
(3.50) λ1 u1 + λ2 u2 = (λ1 v1 + λ2 v2 ) + (λ1 w1 + λ2 w2 )
must be the corresponding decomposition. Moreover ΠW w = w for any w ∈ W
and kuk2 = kvk2 + kwk2 , Pythagoras’ Theorem, shows that
(3.51) Π2W = ΠW , kΠW uk ≤ kuk =⇒ kΠW k ≤ 1.
Thus, projection onto W is an operator of norm 1 (unless W = {0}) equal to its
own square. Such an operator is called a projection or sometimes an idempotent
(which sounds fancier).
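The projection ΠW just constructed is easy to test in a finite-dimensional model (a Python sketch; the vectors and names are mine):

```python
# Orthogonal projection onto the span of an orthonormal set:
# P u = sum_k <u, e_k> e_k.  Real vectors as plain lists; the example
# set {e_1, e_2} is orthonormal in R^3.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(u, es):
    p = [0.0] * len(u)
    for e in es:
        c = dot(u, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

es = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
u = [3.0, 4.0, 5.0]
p = project(u, es)
assert p == [3.0, 4.0, 0.0]
assert project(p, es) == p          # P^2 = P
assert dot(p, p) <= dot(u, u)       # ||P u|| <= ||u||
```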
Finite-dimensional subspaces are always closed by the Heine-Borel theorem.
Lemma 3.3. If {ej } is any finite or countable orthonormal set in a Hilbert space
then the orthogonal projection onto the closure of the span of these elements is
(3.52) P u = Σ_k ⟨u, ek ⟩ ek .
Proof. We know that the series in (3.52) converges and defines a bounded
linear operator of norm at most one by Bessel’s inequality. Clearly P 2 = P by the
same argument. If W is the closure of the span then (u−P u) ⊥ W since (u−P u) ⊥
Lemma 3.6. The image of a convergent sequence in a Hilbert space is a set with
equi-small tails with respect to any orthonormal sequence, i.e. if ek is an orthonormal
sequence and un → u is a convergent sequence then given ε > 0 there exists N such
that

(3.72) Σ_{k>N} |⟨un , ek ⟩|² < ε² ∀ n.
The convergence of this series means that (3.72) can be arranged for any single
element un or the limit u by choosing N large enough, thus given ε > 0 we can
choose N ′ so that

(3.74) Σ_{k>N ′} |⟨u, ek ⟩|² < ε²/2.
Consider the closure of the subspace spanned by the ek with k > N. The
orthogonal projection onto this space (see Lemma 3.3) is
(3.75) PN u = Σ_{k>N} ⟨u, ek ⟩ ek .
Then the convergence un → u implies the convergence in norm kPN un k → kPN uk,
so
(3.76) ‖PN un ‖² = Σ_{k>N} |⟨un , ek ⟩|² < ε² , n > n0 .
So, we have arranged (3.72) for n > n0 for some N. This estimate remains valid if
N is increased – since the tails get smaller – and we may arrange it for n ≤ n0 by
choosing N large enough. Thus indeed (3.72) holds for all n if N is chosen large
enough.
This suggests one useful characterization of compact sets in a separable Hilbert
space since the equi-smallness of the tails, as in (3.72), for all u ∈ K just means
that the Fourier-Bessel series converges uniformly.
Proposition 3.8. A set K ⊂ H in a separable Hilbert space is compact if and
only if it is bounded, closed and the Fourier-Bessel sequence with respect to any
(one) complete orthonormal basis converges uniformly on it.
Proof. We already know that a compact set in a metric space is closed and
bounded. Suppose the equi-smallness of tails condition fails with respect to some
orthonormal basis ek . This means that for some ε > 0 and all p there is an element
up ∈ K, such that
(3.77) Σ_{k>p} |⟨up , ek ⟩|² ≥ ε² .
Consider the subsequence {up } generated this way. No subsequence of it can have
equi-small tails (recalling that the tail decreases with p). Thus, by Lemma 3.6,
it cannot have a convergent subsequence, so K cannot be compact if the equi-
smallness condition fails.
where the parallelogram law on C has been used. To make this sum less than ε²
we may choose N so large that the last two terms are less than ε²/2 and this may
be done for all n and l by the equi-smallness of the tails. Now, choose n so large
that each of the terms in the first sum is less than ε²/2N, for all l > 0 using the
Cauchy condition on each of the finite number of sequences ⟨vn , ek ⟩. Thus, {vn } is
a Cauchy subsequence of {un } and hence as already noted convergent in K. Thus
K is indeed compact.
This criterion for compactness is useful but is too closely tied to the existence
of an orthonormal basis to be easily applicable. However the condition can be
restated in a way that holds even in the non-separable case (and of course in the
finite-dimensional case, where it is trivial).
Proposition 3.9. A subset K ⊂ H of a Hilbert space is compact if and only if
it is closed and bounded and for every ε > 0 there is a finite-dimensional subspace
W ⊂ H such that

(3.80) sup_{u∈K} inf_{w∈W} ‖u − w‖ < ε.
where ΠW is the orthogonal projection onto W (so Id −ΠW is the orthogonal pro-
jection onto W ⊥ ).
Now, let us first assume that H is separable, so we already have a condition
for compactness in Proposition 3.8. Then if K is compact we can consider an
orthonormal basis of H and the finite-dimensional spaces WN spanned by the first N
elements in the basis with ΠN the orthogonal projection onto it. Then k(Id −ΠN )uk
is precisely the length of the ‘tail’ of u with respect to the basis. So indeed, by
Proposition 3.8, given ε > 0 there is an N such that ‖(Id −ΠN )u‖ < ε/2 for all
u ∈ K and hence (3.82) holds for W = WN .
Now suppose that K ⊂ H and for each ε > 0 we can find a finite dimensional
subspace W such that (3.82) holds. Take a sequence {un } in K. The sequence
ΠW un ∈ W is bounded in a finite-dimensional space so has a convergent sub-
sequence. Now, for each j ∈ N there is a finite-dimensional subspace Wj (not
necessarily corresponding to an orthonormal basis) so that (3.82) holds for ε = 1/j.
Proceeding as above, we can find successive subsequences of un such that the image
under Πj in Wj converges for each j. Passing to the diagonal subsequence u_{n_l} it
follows that Πj u_{n_l} converges for each j since it is eventually a subsequence of the
jth choice of subsequence above. Now, the triangle inequality shows that

(3.83) ‖u_{n_i} − u_{n_k}‖ ≤ ‖Πj (u_{n_i} − u_{n_k})‖ + ‖(Id −Πj )u_{n_i}‖ + ‖(Id −Πj )u_{n_k}‖.

Given ε > 0 first choose j so large that the last two terms are each less than
1/j < ε/3 using the choice of Wj . Then if i, k > N is large enough the first term
on the right in (3.83) is also less than ε/3 by the convergence of Πj u_{n_i}. Thus u_{n_i}
is Cauchy in H and hence converges and it follows that K is compact.
This converse argument does not require the separability of H so to complete
the proof we only need to show the necessity of (3.81) in the non-separable case.
Thus suppose K is compact. Then K itself is separable – has a countable dense
subset – using the finite covering property (for each p > 0 there are finitely many
balls of radius 1/p which cover K so take the set consisting of all the centers for
all p). It follows that the closure of the span of K, the finite linear combinations of
elements of K, is a separable Hilbert subspace of H which contains K. Thus any
compact subset of a non-separable Hilbert space is contained in a separable Hilbert
subspace and hence (3.80) holds.
Definition 3.4. An operator T ∈ B(H) is of finite rank if its range has fi-
nite dimension (and that dimension is called the rank of T ); the set of finite rank
operators will be denoted R(H).
Why not F(H)? Because we want to use this for the Fredholm operators.
Clearly the sum of two operators of finite rank has finite rank, since the range
is contained in the sum of the ranges (but is often smaller):
(3.84) (T1 + T2 )u ∈ Ran(T1 ) + Ran(T2 ) ∀ u ∈ H.
Since the range of a constant multiple of T is contained in the range of T it follows
that the finite rank operators form a linear subspace of B(H).
What does a finite rank operator look like? It really looks like a matrix.
Lemma 3.7. If T : H −→ H has finite rank then there is a finite orthonormal
set {ek }_{k=1}^{L} in H and constants cij ∈ C such that

(3.85) T u = Σ_{i,j=1}^{L} cij ⟨u, ej ⟩ ei .
Inserting this into (3.87) gives (3.85) (where the constants for i > p are zero).
It is clear that
(3.89) B ∈ B(H) and T ∈ R(H) =⇒ BT ∈ R(H).
Indeed, the range of BT is the range of B restricted to the range of T and this is
certainly finite dimensional since it is spanned by the image of a basis of Ran(T ).
Similarly T B ∈ R(H) since the range of T B is contained in the range of T. Thus
we have in fact proved most of
Proposition 3.10. The finite rank operators form a ∗-closed two-sided ideal
in B(H), which is to say a linear subspace such that
(3.90) B1 , B2 ∈ B(H), T ∈ R(H) =⇒ B1 T B2 , T ∗ ∈ R(H).
Proof. It is only left to show that T ∗ is of finite rank if T is, but this is an
immediate consequence of Lemma 3.7 since if T is given by (3.85) then
(3.91)   T^*u = \sum_{i,j=1}^{N} \overline{c_{ij}} \langle u, e_i\rangle e_j.
Lemma 3.8 (Row rank = column rank). For any finite rank operator T on a Hilbert
space, the dimension of the range of T is equal to the dimension of the range of T ∗ .
Proof. From the formula (3.87) for a finite rank operator, it follows that the
vi , i = 1, . . . , p must be linearly independent – since the ei form a basis for the
range and a linear relation between the vi would show the range had dimension less
than p. Thus in fact the null space of T is precisely the orthocomplement of the
span of the vi – the space of vectors orthogonal to each vi . Since
\langle Tu, w\rangle = \sum_{i=1}^{p} \langle u, v_i\rangle \langle e_i, w\rangle \Longrightarrow

(3.92)   \langle w, Tu\rangle = \sum_{i=1}^{p} \langle v_i, u\rangle \langle w, e_i\rangle \Longrightarrow

T^*w = \sum_{i=1}^{p} \langle w, e_i\rangle v_i,

the range of T^* is spanned by the v_i and so also has dimension p.
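In C^n the lemma is the familiar statement that a matrix and its conjugate transpose have the same rank; a quick numerical check (a sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(1)

# Lemma 3.8 in C^n: T and T^* have ranges of the same dimension.
# Build a rank-3 operator T = sum_i e_i v_i^* as in (3.87).
n = 10
T = sum(np.outer(rng.standard_normal(n) + 1j * rng.standard_normal(n),
                 rng.standard_normal(n) + 1j * rng.standard_normal(n))
        for _ in range(3))

rank_T = np.linalg.matrix_rank(T)
rank_Tstar = np.linalg.matrix_rank(T.conj().T)
assert rank_T == rank_Tstar
```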
Now we shall apply this to the set K(B(0, 1)) where we assume that K is
compact (as an operator, don’t be confused by the double usage, in the end it turns
For each n consider the first part of these sequences and define

(3.95)   K_n u = \sum_{k\le n} \langle Ku, e_k\rangle e_k.
This is clearly a linear operator and has finite rank – since its range is contained in
the span of the first n elements of {ei }. Since this is an orthonormal basis,
(3.96)   \|Ku - K_n u\|_H^2 = \sum_{i>n} |\langle Ku, e_i\rangle|^2.
Thus (3.94) shows that \|Ku - K_n u\|_H \le \epsilon. Now, increasing n makes \|Ku - K_n u\|
smaller, so given \epsilon > 0 there exists n such that for all N \ge n,

(3.97)   \|K - K_N\|_B = \sup_{\|u\|\le 1} \|Ku - K_N u\|_H \le \epsilon.
Thus indeed, Kn → K in norm and we have shown that the compact operators are
contained in the norm closure of the finite rank operators.
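The truncation argument can be watched numerically. As a stand-in for a compact operator take a diagonal operator with entries 1/k on C^N; the norm of the error after cutting to the first n basis vectors is the largest neglected entry, which tends to 0 (a finite-dimensional sketch, assuming numpy):

```python
import numpy as np

# Stand-in for a compact K: diagonal entries 1/k on C^N.  K_n keeps only the
# components along the first n basis vectors, as in (3.95).
N = 200
K = np.diag(1.0 / np.arange(1, N + 1))

errors = []
for n in (10, 50, 100):
    Kn = K.copy()
    Kn[n:, :] = 0.0            # K_n u = sum_{k <= n} <K u, e_k> e_k
    errors.append(np.linalg.norm(K - Kn, ord=2))

# The operator norm of the error is the largest dropped entry, 1/(n+1).
assert all(abs(e - t) < 1e-12 for e, t in zip(errors, [1/11, 1/51, 1/101]))
```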
For the converse we assume that Tn → K is a norm convergent sequence in
B(H) where each of the Tn is of finite rank – of course we know nothing about the
rank except that it is finite. We want to conclude that K is compact, so we need to
show that K(B(0, 1)) is precompact. It is certainly bounded, by the norm of K. Let
Wn = Tn H be the range of Tn . By definition it is a finite dimensional subspace and
hence closed. Let Πn be the orthogonal projection onto Wn , so Id −Πn is projection
onto Wn⊥ . Thus the composite (Id −Πn )Tn = 0 and hence
(3.98) (Id −Πn )K = (Id −Πn )(K − Tn ) =⇒ k(Id −Πn )Kk → 0 as n → ∞.
So, for any \epsilon > 0 there exists n such that

(3.99)   \sup_{u\in B(0,1)} \inf_{w\in W_n} \|Ku - w\| \le \sup_{\|u\|\le 1} \|(\mathrm{Id} - \Pi_n)Ku\| < \epsilon
and it follows from Proposition 3.9 that K(B(0, 1)) is precompact and hence K is
compact.
Proposition 3.12. For any separable Hilbert space, the compact operators form
a closed and ∗-closed two-sided ideal in B(H).
Proof. In any metric space (applied to B(H)) the closure of a set is closed,
so the compact operators are closed being the closure of the finite rank operators.
Similarly the fact that it is closed under passage to adjoints follows from the same
fact for finite rank operators. The ideal properties also follow from the correspond-
ing properties for the finite rank operators, or we can prove them directly anyway.
Namely if B is bounded and T is compact then for some c > 0 (namely 1/kBk
unless it is zero) cB maps B(0, 1) into itself. Thus cT B = T cB is compact since
the image of the unit ball under it is contained in the image of the unit ball under
T ; hence T B is also compact. Similarly BT is compact since B is continuous and
then
(3.100) BT (B(0, 1)) ⊂ B(T (B(0, 1))) is compact
can be made small by choosing n large since vp is a finite linear combination of the
ek . Thus indeed, hun , vi → hu, vi for all v ∈ H and it follows that un converges
weakly to u.
Proposition 3.13. Any bounded sequence {un } in a separable Hilbert space
has a weakly convergent subsequence.
This can be thought of as a different extension to infinite dimensions of the Heine-
Borel theorem. As opposed to the characterization of compact sets above, which
involves adding the extra condition of finite-dimensional approximability, here we
weaken the notion of convergence.
Proof. Choose an orthonormal basis {e_k} and apply the procedure in the
proof of Proposition 3.8 to it. Thus, we may extract successive subsequences, along
the kth of which \langle u_{n_p}, e_k\rangle \to c_k \in C. Passing to the diagonal subsequence, v_n, which
is eventually a subsequence of each of these, ensures that \langle v_n, e_k\rangle \to c_k for each k.
Now apply the preceding Lemma to conclude that this subsequence converges
weakly.
Lemma 3.11. For a weakly convergent sequence un * u
(3.106) kuk ≤ lim inf kun k
and a weakly convergent sequence converges strongly if and only if the weak limit
satisfies kuk = limn→∞ kun k.
Proof. Choose an orthonormal basis e_k and observe that

(3.107)   \sum_{k\le p} |\langle u, e_k\rangle|^2 = \lim_{n\to\infty} \sum_{k\le p} |\langle u_n, e_k\rangle|^2.

The sum on the right is bounded by \|u_n\|^2 independently of p, so

(3.108)   \sum_{k\le p} |\langle u, e_k\rangle|^2 \le \liminf_n \|u_n\|^2.
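A concrete instance of Lemma 3.11, sketched in a large but finite truncation of l^2 (assuming numpy): the orthonormal basis sequence e_n converges weakly to 0, since each pairing with a fixed vector decays, while every ‖e_n‖ = 1, so the inequality (3.106) can be strict.

```python
import numpy as np

# e_n -> 0 weakly: <e_n, v> = v_n -> 0 for each fixed square-summable v,
# while ||e_n|| = 1 for all n, so the weak limit u = 0 has
# ||u|| = 0 < 1 = liminf ||e_n||.
N = 2000
v = 1.0 / np.arange(1, N + 1)      # a fixed square-summable vector
e = np.eye(N)                      # rows are the basis vectors e_n

pairings = [abs(e[n] @ v) for n in (10, 100, 1000)]  # decaying to 0
norms = [np.linalg.norm(e[n]) for n in (10, 100, 1000)]

assert pairings[0] > pairings[1] > pairings[2]
assert np.allclose(norms, 1.0)
```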
This exists for each u by hypothesis. It is a linear map and from (3.114) it is
bounded, \|T\| \le C. Thus by the Riesz Representation theorem there exists w ∈ H
such that

(3.116)   T(u) = \langle u, w\rangle \quad \forall u \in H.

Thus \langle u_n, u\rangle \to \langle w, u\rangle for all u ∈ H, so u_n \rightharpoonup w as claimed.
Thus GL(H) ⊂ B(H), the set of invertible elements, is open. It is also a group,
since the inverse of G_1 G_2, for G_1, G_2 ∈ GL(H), is G_2^{-1} G_1^{-1}.
This group of invertible elements has a smaller subgroup, U(H), the unitary
group, defined by
(3.129) U(H) = {U ∈ GL(H); U −1 = U ∗ }.
The unitary group consists of the linear isometric isomorphisms of H onto itself –
thus
(3.130) hU u, U vi = hu, vi, kU uk = kuk ∀ u, v ∈ H, U ∈ U(H).
This is an important object and we will use it a little bit later on.
The groups GL(H) and U(H) for a separable Hilbert space may seem very
similar to the familiar groups of invertible and unitary n × n matrices, GL(n) and
U(n), but this is somewhat deceptive. For one thing they are much bigger. In
fact there are other important qualitative differences. One important fact that
you should know, and there is a proof towards the end of this chapter, is that
both GL(H) and U(H) are contractible as metric spaces – they have no significant
topology. This is to be contrasted with GL(n) and U(n) which have a lot of
topology, and are not at all simple spaces – especially for large n. One upshot of
this is that U(H) does not look much like the limit of the U(n) as n → ∞. In fact
there is another group which is essentially the large n limit of the U(n), namely
(3.131) U−∞ (H) = {Id +K ∈ U(H); K ∈ K(H)}.
It does have lots of interesting (and useful) topology.
Another important fact that we will discuss below is that GL(H) is not dense in
B(H), in contrast to the finite dimensional case. In other words there are operators
which are not invertible and cannot be made invertible by small perturbations.
Proof. Certainly |\langle Au, u\rangle| \le \|A\|\|u\|^2, so the right side can only be smaller
than or equal to the left. Set

a = \sup_{\|u\|=1} |\langle Au, u\rangle| \le \|A\|.

Then for any u, v ∈ H, |\langle Au, v\rangle| = \langle A e^{i\theta}u, v\rangle for some θ ∈ [0, 2π), so replacing u
by u' = e^{i\theta}u we can arrange that \langle Au', v\rangle = |\langle Au, v\rangle| is non-negative and
\|u'\| = 1 = \|u\| = \|v\|. Dropping the primes and computing using the polarization identity,

(3.137)   4\langle Au, v\rangle = \langle A(u+v), u+v\rangle - \langle A(u-v), u-v\rangle + i\langle A(u+iv), u+iv\rangle - i\langle A(u-iv), u-iv\rangle.

By the reality of the left side we can drop the last two terms and use the bound
|\langle Aw, w\rangle| \le a\|w\|^2 on the first two to see that

(3.138)   4\langle Au, v\rangle \le a(\|u+v\|^2 + \|u-v\|^2) = 2a(\|u\|^2 + \|v\|^2) = 4a.

Thus \|A\| = \sup_{\|u\|=\|v\|=1} |\langle Au, v\rangle| \le a and hence \|A\| = a.
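For a Hermitian matrix both sides of this identity equal the largest absolute eigenvalue, which gives a quick numerical check (a sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(2)

# ||A|| = sup_{||u||=1} |<Au, u>| for self-adjoint A: both equal max |lambda|.
X = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
A = (X + X.conj().T) / 2                      # Hermitian
lam, V = np.linalg.eigh(A)

a = np.max(np.abs(lam))                       # sup of the numerical range
u = V[:, np.argmax(np.abs(lam))]              # attained at an eigenvector

assert np.isclose(np.linalg.norm(A, ord=2), a)
assert np.isclose(abs(u.conj() @ A @ u), a)
```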
Thus kBk = b and Spec(B) ⊂ [−b, b] and the argument in the proof above shows
that both end-points are in the spectrum. It follows that
(3.145) {a− } ∪ {a+ } ⊂ Spec(A) ⊂ [a− , a+ ]
from which the statement follows.
In particular if A = A∗ then
(3.146) Spec(A) ⊂ [0, ∞) ⇐⇒ hAu, ui ≥ 0.
is a continuous function on the unit sphere which attains its supremum and infimum,
where

(3.151)   \sup_{\|u\|=1} |F(u)| = \|A\|.
(3.154)   |F(u_n^\pm) - F(u^\pm)| \le |\langle A(u_n^\pm - u^\pm), u_n^\pm\rangle| + |\langle Au^\pm, u_n^\pm - u^\pm\rangle|
          = |\langle A(u_n^\pm - u^\pm), u_n^\pm\rangle| + |\langle u^\pm, A(u_n^\pm - u^\pm)\rangle| \le 2\|Au_n^\pm - Au^\pm\|
eigenvectors form an orthonormal basis of Nul(A)^⊥. This completes the proof of
the theorem.
being the projection onto the span \mathbb{C}e_i. Since P_i P_j = 0 if i \ne j and P_i^2 = P_i, it fol-
lows inductively that the positive powers of A are given by similar sums, converging
in B(H):
(3.162)   A^k = \sum_i \lambda_i^k P_i, \qquad P_i u = \langle u, e_i\rangle e_i, \quad k \in \mathbb{N}.
There is a similar formula for the identity of course, except we need to remember
that the null space of A then appears (and the series does not usually converge in
the norm topology on B(H)) :
(3.163)   \mathrm{Id} = \sum_i P_i + P_N, \qquad N = \mathrm{Nul}(A).
The sum (3.163) can be interpreted in terms of a strong limit of operators, meaning
that the result converges when applied term by term to an element of H, so
(3.164)   u = \sum_i P_i u + P_N u \quad \forall u \in H,
which is a form of the Fourier-Bessel series. Combining these formulæ we see that
for any polynomial p(z)
(3.165)   p(A) = \sum_i p(\lambda_i) P_i + p(0) P_N
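In the matrix case (3.165) says that a polynomial in a symmetric matrix can be evaluated through its eigenvalues; a sketch assuming numpy (here A is generically invertible, so the P_N term is absent):

```python
import numpy as np

rng = np.random.default_rng(3)

# p(A) = sum_i p(lambda_i) P_i for self-adjoint A with eigenvectors V.
X = rng.standard_normal((5, 5))
A = (X + X.T) / 2
lam, V = np.linalg.eigh(A)

def p(z):                                      # an arbitrary real polynomial
    return 2 * z**3 - z + 4

pA_direct = 2 * A @ A @ A - A + 4 * np.eye(5)  # substitute A for z
pA_spectral = V @ np.diag(p(lam)) @ V.T        # sum_i p(lambda_i) P_i

assert np.allclose(pA_direct, pA_spectral)
```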
define f(A) for a continuous function f defined on [a_-, a_+] if Spec(A) ⊂ [a_-, a_+]. (In
fact it only needs to be defined on the compact set Spec(A), which might be quite a
lot smaller.) This is an effective extension of the spectral theorem to the case of
non-compact self-adjoint operators.
How does one define f (A)? Well, it is easy enough in case f is a polynomial,
since then we can simply substitute An in place of z n . If we factorize the polynomial
this is the same as setting
(3.169) f (z) = c(z−z1 )(z−z2 ) . . . (z−zN ) =⇒ f (A) = c(A−z1 )(A−z2 ) . . . (A−zN )
and this is equivalent to (3.166) in case A is also compact.
Notice that the result does not depend on the order of the factors or anything
like that. To pass to the case of a general continuous function we need to estimate
the norm in the polynomial case.
Proposition 3.17. If A = A∗ ∈ B(H) is a bounded self-adjoint operator on a
Hilbert space then for any polynomial with real coefficients
(3.170)   \|f(A)\| \le \sup_{z\in[a_-,a_+]} |f(z)|, \qquad \mathrm{Spec}(A) \subset [a_-, a_+].
Proof. For a polynomial we have defined f (A) by (3.169). We can drop the
constant c since it will just contribute a factor of |c| to both sides of (3.170). Now,
recall from Lemma 3.14 that for a self-adjoint operator the norm can be realized as

(3.171)   \|f(A)\| = \sup\{|t|;\ t \in \mathrm{Spec}(f(A))\}.
That is, we need to think about when f (A) − t is invertible. However, f (z) − t
is another polynomial (with leading term z N because we normalized the leading
coefficient to be 1). Thus it can also be factorized:
(3.172)   f(z) - t = \prod_{j=1}^{N} (z - \zeta_j(t)), \qquad f(A) - t = \prod_{j=1}^{N} (A - \zeta_j(t)),
where the ζj ∈ C are the roots (which might be complex even though the polynomial
is real). Written in this way we can see that
(3.173)   (f(A) - t)^{-1} = \prod_{j=1}^{N} (A - \zeta_j(t))^{-1} \quad \text{if } \zeta_j(t) \notin \mathrm{Spec}(A)\ \forall j.
Indeed the converse is also true, i.e. the inverse exists if and only if all the A − ζj (t)
are invertible, but in any case we see that
(3.174) Spec(f (A)) ⊂ {t ∈ C; ζj (t) ∈ Spec(A), for some j = 1, . . . , N }
since if t is not in the right side then f (A) − t is invertible.
Now this can be restated as
(3.175) Spec(f (A)) ⊂ f (Spec(A))
since t \notin f(\mathrm{Spec}(A)) means f(z) \ne t for z \in \mathrm{Spec}(A), which means that there is no
root of f(z) = t in Spec(A), and hence (3.174) shows that t \notin \mathrm{Spec}(f(A)). In fact
it is easy to see that there is equality in (3.175).
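The equality Spec(f(A)) = f(Spec(A)) can be observed directly for matrices (a numerical sketch, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(4)

# Spectral mapping for a symmetric matrix and f(z) = z^2 - 3z + 1.
X = rng.standard_normal((6, 6))
A = (X + X.T) / 2
lam = np.linalg.eigvalsh(A)

fA = A @ A - 3 * A + np.eye(6)
spec_fA = np.sort(np.linalg.eigvalsh(fA))
f_spec = np.sort(lam**2 - 3 * lam + 1)

assert np.allclose(spec_fA, f_spec)
```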
Then (3.170) follows from (3.171): the norm is the sup of |z| for z ∈ Spec(f(A)), so

\|f(A)\| \le \sup_{t\in \mathrm{Spec}(A)} |f(t)|.
This allows one to pass by continuity to f in the uniform closure of the poly-
nomials, which by the Stone-Weierstrass theorem is the whole of C([a− , a+ ]).
Theorem 3.5. If A = A∗ ∈ B(H) for a Hilbert space H then the map defined
on polynomials, through (3.169) extends by continuity to a bounded linear map
(3.176) C([a− , a+ ]) −→ B(H) if Spec(A) ⊂ [a− , a+ ], Spec(f (A)) ⊂ f ([a− , a+ ]).
Proof. By the Stone-Weierstrass theorem polynomials are dense in continuous
functions on any compact interval, in the supremum norm.
Remark 3.1. You should check the properties of this map, which also follow by
continuity, especially that (3.168) holds in this more general context. In particular,
f (A) is self-adjoint if f ∈ C([a− , a+ ]) is real-valued and is non-negative if f ≥ 0 on
Spec(A).
Certainly

(3.181)   Q_a(u) \le \lim \langle f_n(A)u, u\rangle
It follows that Qa (u, v) is a sesquilinear form, linear in the first variable and an-
tilinear in the second. Moreover the fn (A) are uniformly bounded in B(H) (with
norm 1 in fact) so
(3.185)   |Q_a(u, v)| \le C\|u\|\|v\|.
Now, using the linearity in v of Qa (u, v) and the Riesz Representation theorem it
follows that for each u ∈ H there exists a unique Qa u ∈ H such that
(3.186)   Q_a(u, v) = \langle Q_a u, v\rangle \quad \forall v \in H, \qquad \|Q_a u\| \le \|u\|.
From the uniqueness, H 3 u 7−→ Qa u is linear so (3.186) shows that it is a bounded
linear operator. Thus we have proved most of
Proposition 3.18. For each a ∈ [a− , a+ ] ⊃ Spec(A) there is a uniquely defined
operator Qa ∈ B(H) such that
(3.187)   Q_a(u) = \langle Q_a u, u\rangle

recovers (3.182), and Q_a^* = Q_a = Q_a^2 is a projection satisfying

(3.188)   Q_a Q_b = Q_b Q_a = Q_b \text{ if } b \le a, \qquad [Q_a, f(A)] = 0 \quad \forall f \in C([a_-, a_+]).
This operator, or really the whole family Qa , is called the spectral projection of A.
Proof. We have already shown the existence of Qa ∈ B(H) with the property
(3.187) and since we defined it directly from Qa (u) it is unique. Self-adjointness
follows from the reality of Qa (u) ≥ 0 since hQa u, vi = hu, Qa vi then follows from
(3.186).
From (3.184) it follows that

(3.189)   \langle Q_a u, v\rangle = \lim_{n\to\infty} \langle f_n(A)u, v\rangle \Longrightarrow
          \langle Q_a u, f(A)v\rangle = \lim_{n\to\infty} \langle f_n(A)u, f(A)v\rangle = \langle Q_a f(A)u, v\rangle
since f (A) commutes with fn (A) for any continuous f. This proves the statement
in (3.188). Since f_n f_m \le f_n is admissible in the definition of Q_a in (3.178),

(3.190)   \langle Q_a u, v\rangle = \lim_{n\to\infty} \langle (f_n f_m)(A)u, v\rangle = \lim_{n\to\infty} \langle f_n(A)u, f_m(A)v\rangle = \langle Q_a u, f_m(A)v\rangle
and now letting m → ∞ shows that Q_a^2 = Q_a. A similar argument shows the first
identity in (3.188).
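For a Hermitian matrix the spectral projection Q_a is just the orthogonal projection onto the span of eigenvectors with eigenvalue at most a, and the properties in (3.188) can be verified directly (a finite-dimensional sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(5)

X = rng.standard_normal((6, 6))
A = (X + X.T) / 2
lam, V = np.linalg.eigh(A)                     # lam sorted increasingly

def Q(a):
    # Orthogonal projection onto the span of eigenvectors with eigenvalue <= a.
    Va = V[:, lam <= a]
    return Va @ Va.T

a, b = lam[3] + 0.01, lam[1] + 0.01            # thresholds with b <= a
Qa, Qb = Q(a), Q(b)

assert np.allclose(Qa, Qa.T) and np.allclose(Qa @ Qa, Qa)    # Q_a* = Q_a = Q_a^2
assert np.allclose(Qa @ Qb, Qb) and np.allclose(Qb @ Qa, Qb) # monotone family
assert np.allclose(Qa @ A, A @ Qa)                           # [Q_a, A] = 0
```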
taking values in the self-adjoint projections and increasing in the sense of (3.188).
A little more application allows one to recover the functional calculus as an integral
which can be written
(3.192)   f(A) = \int_{[a_-, a_+]} f(t)\, dQ_t,
where the Pj are the orthogonal projections onto the eigenspaces for λj .
This makes sense since Ev = 0 implies hEv, Evi = 0 and hence hA∗ Av, vi = 0 so
kAvk = 0 and Av = 0. So let us define
(3.197)   U(w) = \begin{cases} Av & \text{if } w \in \mathrm{Ran}(E),\ w = Ev \\ 0 & \text{if } w \in (\mathrm{Ran}(E))^\perp. \end{cases}
(3.201)   4\langle Uw, Uw'\rangle = \|U(w+w')\|^2 - \|U(w-w')\|^2 + i\|U(w+iw')\|^2 - i\|U(w-iw')\|^2
          = \|w+w'\|^2 - \|w-w'\|^2 + i\|w+iw'\|^2 - i\|w-iw'\|^2 = 4\langle w, w'\rangle.
The adjoint U ∗ of U has range contained in the orthocomplement of the null space
of U, so in Ran(E), and null space precisely Ran(A)⊥ so defines a linear map from
Ran(A) to Ran(E). As such it follows from (3.201) that
A bounded linear operator with the properties of U above, that there are two
decompositions of H = H1 ⊕ H2 = H3 ⊕ H4 into orthogonal closed subspaces, such
that U = 0 on H2 and U : H1 −→ H3 is a bijection with kU wk = kwk for all
w ∈ H_1, is called a partial isometry. So the polar decomposition writes a general
bounded operator as a product A = UE, where U is a partial isometry from Ran(E)
onto Ran(A) and E = (A^*A)^{1/2}. If A is injective then U is actually unitary.

Exercise 1. Show that in the same sense A = FV, where F = (AA^*)^{1/2} and
V is a partial isometry from Ran(A^*) to Ran(F).
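The polar decomposition is easy to compute for matrices; here is a sketch (assuming numpy) for a random complex matrix, which is generically invertible, so U comes out unitary:

```python
import numpy as np

rng = np.random.default_rng(6)

A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# E = (A^*A)^{1/2} through the eigendecomposition of the positive matrix A^*A.
lam, V = np.linalg.eigh(A.conj().T @ A)
E = V @ np.diag(np.sqrt(np.clip(lam, 0, None))) @ V.conj().T
U = A @ np.linalg.inv(E)                      # the (here unitary) partial isometry

assert np.allclose(U @ E, A)                  # A = U E
assert np.allclose(U.conj().T @ U, np.eye(5)) # U unitary since A is injective
```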
Proof of Proposition 3.20. First let’s check this in the case of a finite rank
operator K = T. Then
(3.209) Nul(Id −T ) = {u ∈ H; u = T u} ⊂ Ran(T ).
A subspace of a finite dimensional space is certainly finite dimensional, so this
proves the first condition in the finite rank case.
Similarly, still assuming that T is finite rank consider the range
(3.210) Ran(Id −T ) = {v ∈ H; v = (Id −T )u for some u ∈ H}.
Consider the subspace {u ∈ H; Tu = 0}. We know that this is closed, since T
is certainly continuous. On the other hand from (3.210),
(3.211) Ran(Id −T ) ⊃ Nul(T ).
Now, Nul(T) is closed and has finite codimension – its orthocomplement is spanned
by a finite set which maps to span the image. As shown in Lemma 3.4 it follows
from this that Ran(Id −T ) itself is closed with finite dimensional complement.
This takes care of the case that K = T has finite rank! What about the general
case where K is compact? If K is compact then there exists B ∈ B(H) and T of
finite rank such that
(3.212)   K = B + T, \qquad \|B\| < \tfrac12.
Now, consider the null space of Id − K and use (3.212) to write

(3.213)   \mathrm{Id} - K = (\mathrm{Id} - B) - T = (\mathrm{Id} - B)(\mathrm{Id} - T'), \qquad T' = (\mathrm{Id} - B)^{-1}T.

Here we have used the convergence of the Neumann series, so (\mathrm{Id} - B)^{-1} does exist.
Now, T' is of finite rank, by the ideal property, so

(3.214)   \mathrm{Nul}(\mathrm{Id} - K) = \mathrm{Nul}(\mathrm{Id} - T') \text{ is finite dimensional.}

Here of course we use the fact that (Id − K)u = 0 is equivalent to (Id − T')u = 0
since Id − B is invertible. So, this is the first condition in (3.207).
Similarly, to examine the second we do the same thing but the other way around
and write

(3.215)   \mathrm{Id} - K = (\mathrm{Id} - B) - T = (\mathrm{Id} - T'')(\mathrm{Id} - B), \qquad T'' = T(\mathrm{Id} - B)^{-1}.

Now, T'' is again of finite rank and

(3.216)   \mathrm{Ran}(\mathrm{Id} - K) = \mathrm{Ran}(\mathrm{Id} - T'') \text{ is closed and of finite codimension.}
What about (3.208)? This time let's first check that it is enough to consider
the finite rank case. For a compact operator we have written
(3.217) (Id −K) = G(Id −T )
where G = \mathrm{Id} - B with \|B\| < \tfrac12 is invertible and T is of finite rank. So what we
want to see is that
want to see is that
(3.218) dim Nul(Id −K) = dim Nul(Id −T ) = dim Nul(Id −K ∗ ).
However, Id −K ∗ = (Id −T ∗ )G∗ and G∗ is also invertible, so
(3.219) dim Nul(Id −K ∗ ) = dim Nul(Id −T ∗ )
and hence it is enough to check that dim Nul(Id −T ) = dim Nul(Id −T ∗ ) – which is
to say the same thing for finite rank operators.
Now, for a finite rank operator, written out as (3.85), we can look at the vector
space W spanned by all the f_i's and all the e_i's together – note that there is
nothing to stop there being dependence relations among the combination, although
separately they are independent. Now, T : W −→ W as is immediately clear, and

(3.220)   T^*v = \sum_{i=1}^{N} \langle v, f_i\rangle e_i,

where Bessel's identity is used again. Thus the sum for T^* with respect to the new
basis is finite. Applying this argument again shows that the sum is independent of
the basis, and the same for the adjoint.
Proposition 3.21. The operators for which (3.224) is finite form a two-sided
∗-ideal HS(H) ⊂ B(H), contained in the ideal of compact operators; it is a Hilbert
space and the norm satisfies

(3.227)   \|A\|_B \le \|A\|_{HS} = \Big(\sum_i \|Ae_i\|^2\Big)^{1/2}, \qquad \|AD\|_{HS} \le \|A\|_{HS}\|D\|_B, \quad A \in HS(H),\ D \in B(H).

The inner product is

\langle A, B\rangle_{HS} = \sum_i \langle Ae_i, Be_i\rangle_H, \qquad A, B \in HS(H).
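For matrices the Hilbert-Schmidt norm is the Frobenius norm, and the basis independence of the sum \sum_i \|Ae_i\|^2 can be checked against a random orthonormal basis (a sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(7)

A = rng.standard_normal((6, 6))

# sum_i ||A e_i||^2 over the standard basis: A e_i is the i-th column of A.
hs = np.sqrt(sum(np.linalg.norm(A[:, i]) ** 2 for i in range(6)))

# The same sum over another orthonormal basis (columns of a random orthogonal W).
W, _ = np.linalg.qr(rng.standard_normal((6, 6)))
hs_W = np.sqrt(sum(np.linalg.norm(A @ W[:, i]) ** 2 for i in range(6)))

assert np.isclose(hs, np.linalg.norm(A, ord='fro'))   # = Frobenius norm
assert np.isclose(hs, hs_W)                           # basis independence
assert np.linalg.norm(A, ord=2) <= hs + 1e-12         # ||A||_B <= ||A||_HS
```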
For a compact operator the polar decomposition can be given a more explicit
form and we can use this to give another characterization of the Hilbert-Schmidt
operators.
Proposition 3.22. If A ∈ K(H) then there exist orthonormal bases e_i of
Nul(A)^⊥ and f_j of Nul(A^*)^⊥ such that

Au = \sum_i s_i \langle u, e_i\rangle f_i,

where the s_i are the non-zero eigenvalues of (A^*A)^{1/2} repeated with multiplicity.
The s_i are called the characteristic values of A.
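In matrix terms this is the singular value decomposition: the characteristic values are the singular values of A, i.e. the eigenvalues of (A^*A)^{1/2} (a numerical sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(8)

A = rng.standard_normal((5, 5))

# SVD: the columns of Vt.T play the role of the e_i, the columns of U the f_i.
U, s, Vt = np.linalg.svd(A)
# Eigenvalues of (A^*A)^{1/2}, reordered decreasingly to match the svd output.
eig_sqrt = np.sqrt(np.linalg.eigvalsh(A.T @ A))[::-1]

assert np.allclose(s, eig_sqrt)               # characteristic values
assert np.allclose(A, U @ np.diag(s) @ Vt)    # A u = sum_i s_i <u, e_i> f_i
```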
where the supremum is over pairs of orthonormal sequences {ei } and {fi }.
Proposition 3.23. The trace class operators form an ideal, T(H) ⊂ HS(H),
which is a Banach space with respect to the norm (3.228), which satisfies

(3.229)   \|A\|_B \le \|A\|_{Tr}, \qquad \|A\|_{HS} \le \|A\|_B^{1/2}\|A\|_{Tr}^{1/2};

the following two conditions are each equivalent to A ∈ T(H):

(1) The operator defined by the functional calculus,

(3.230)   |A|^{1/2} = (A^*A)^{1/4} \in HS(H).

(2) There are operators B_i, B_i' ∈ HS(H) such that

(3.231)   A = \sum_{i=1}^{N} B_i' B_i.
Proof. Note first that T(H) is a linear space and that \|\cdot\|_{Tr} is a norm on
it. Now suppose A ∈ T(H) and consider its polar decomposition A = U(A^*A)^{1/2}.
Here U is a partial isometry mapping \mathrm{Ran}((A^*A)^{1/2}) to \mathrm{Ran}(A) and vanishing on
\mathrm{Ran}((A^*A)^{1/2})^\perp. Consider an orthonormal basis {e_i} of \mathrm{Ran}((A^*A)^{1/2}). This is an or-
thonormal sequence in H, as is f_i = Ue_i. Inserting these into (3.228) shows that

(3.232)   \sum_i |\langle U(A^*A)^{1/2} e_i, f_i\rangle| = \sum_i |\langle (A^*A)^{1/4} e_i, (A^*A)^{1/4} e_i\rangle| < \infty,

where we use the fact that U^*f_i = U^*Ue_i = e_i. Since the closure of the range of
(A^*A)^{1/4} is the same as the closure of the range of (A^*A)^{1/2}, it follows from (3.232)
that (3.230) holds (since adding an orthonormal basis of \mathrm{Ran}((A^*A)^{1/4})^\perp does not
increase the sum).
Next assume that (3.230) holds for A ∈ B(H). Then the polar decomposition
can be written A = (U(A^*A)^{1/4})(A^*A)^{1/4}, showing that A is the product of two
Hilbert-Schmidt operators, so in particular of the form (3.231).
Now assume that A is of the form (3.231), so is a sum of products of Hilbert-
Schmidt operators. The linearity of T(H) means it suffices to assume that A = BB'
where B, B' ∈ HS(H). Then,

(3.233)   |\langle Ae_i, f_i\rangle| = |\langle B'e_i, B^*f_i\rangle| \le \|B'e_i\|_H \|B^*f_i\|_H.

Taking a finite sum and applying the Cauchy-Schwarz inequality,

(3.234)   \sum_{i=1}^{N} |\langle Ae_i, f_i\rangle| \le \Big(\sum_{i=1}^{N} \|B'e_i\|^2\Big)^{1/2} \Big(\sum_{i=1}^{N} \|B^*f_i\|^2\Big)^{1/2}.
If the sequences are orthonormal the right side is bounded by the product of the
Hilbert-Schmidt norms so
(3.235) kBB 0 kTr ≤ kBkHS kB 0 kHS
and A = BB 0 ∈ T (H).
The first inequality in (3.229) follows from the choice of single unit vectors u and v
as orthonormal sequences, so

(3.236)   |\langle Au, v\rangle| \le \|A\|_{Tr} \Longrightarrow \|A\| \le \|A\|_{Tr}.

The completeness of T(H) with respect to the trace norm follows by standard
arguments, which can be summarized as follows.
Passing to the limit An → A in the finite sum gives the same bound with
An replaced by A and then allowing N → ∞ shows that A ∈ T (H).
Similarly the Cauchy condition means that for \epsilon > 0 there exists M such
that for all N, and any orthonormal sequences e_i, f_i,

(3.238)   m, n > M \Longrightarrow \sum_{i=1}^{N} |\langle (A_n - A_m)e_i, f_i\rangle| \le \epsilon.
is a continuous linear functional (with respect to the trace norm) which is indepen-
dent of the choice of orthonormal basis {e_i} and which satisfies

(3.240)   \mathrm{Tr}(AB - BA) = 0 \text{ if } A \in T(H),\ B \in B(H), \text{ or } A, B \in HS(H).

Proof. The complex number Tr(AB − BA) depends linearly on A and, sepa-
rately, on B. The ideals are ∗-closed, so decomposing A = (A + A^*)/2 + i(A − A^*)/(2i)
and similarly for B shows that it suffices to assume that A and B are self-adjoint. If
A ∈ T(H) we can use an orthonormal basis of eigenvectors for it to evaluate
the trace. Then if Aei = λi ei
(3.241)   \mathrm{Tr}(AB - BA) = \sum_i \big(\langle Be_i, Ae_i\rangle - \langle Ae_i, Be_i\rangle\big) = \sum_i \big(\lambda_i\langle Be_i, e_i\rangle - \lambda_i\langle e_i, Be_i\rangle\big) = 0.
In fact the second result extends to Lidskii's theorem: if T ∈ T(H) then the
spectrum outside 0 is discrete, so countable, and each point is an eigenvalue λ_i
of finite algebraic multiplicity k_i, and then \mathrm{Tr}(T) = \sum_i k_i\lambda_i with the sum converging
absolutely. The algebraic multiplicity is the limit as k → ∞ of the dimension of
the null space of (T − λ_i)^k. The standard proof of this is not elementary.
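In finite dimensions Lidskii's theorem reduces to the familiar fact that the trace equals the sum of the eigenvalues with algebraic multiplicity, even for a non-normal matrix (a sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(9)

T = rng.standard_normal((6, 6))               # generically not normal

# Diagonal sum in the standard basis vs. sum of the (complex) eigenvalues.
assert not np.allclose(T @ T.T, T.T @ T)      # T is indeed non-normal here
assert np.isclose(np.trace(T), np.sum(np.linalg.eigvals(T)).real)
```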
Next we turn to the more general Schatten classes.
with the supremum over orthonormal sequences, with finiteness implying that T ∈
Sc_p(H). If q is the conjugate index to p ∈ (1, ∞) then
and conversely, if A ∈ B(H) then A ∈ Sc_p(H) if and only if AB ∈ T(H) for all
B ∈ Sc_q(H), with

\|A\|_{Sc_p} = \sup_{\|B\|_{Sc_q}=1} \|AB\|_{Tr}.
with the supremum over orthonormal sequences. To see this let e_j be an orthonormal
basis of eigenvectors for T. Then, expanding in the Fourier-Bessel series,

(3.248)   |\langle Tf_i, f_i\rangle| = \Big|\sum_j \lambda_j |\langle f_i, e_j\rangle|^2\Big| \le \sum_j |\lambda_j|\,|\langle f_i, e_j\rangle|^{2/p}|\langle f_i, e_j\rangle|^{2/q}
          \le \Big(\sum_j |\lambda_j|^p |\langle f_i, e_j\rangle|^2\Big)^{1/p}\Big(\sum_j |\langle f_i, e_j\rangle|^2\Big)^{1/q} \le \Big(\sum_j |\lambda_j|^p |\langle f_i, e_j\rangle|^2\Big)^{1/p}

by Hölder's inequality and Bessel's inequality (the f_i being orthonormal), so

(3.249)   \sum_i |\langle Tf_i, f_i\rangle|^p \le \sum_j |\lambda_j|^p \sum_i |\langle f_i, e_j\rangle|^2 \le \sum_j |\lambda_j|^p = \|T\|_{Sc_p}^p,

using Bessel's inequality \sum_i |\langle f_i, e_j\rangle|^2 \le 1 again in the last step.
As usual dropping to a finite sum on the left we can pass to the limit as N → ∞
and obtain a uniform bound on any finite sum for T from which (3.245) follows.
At this point we know that if A ∈ Scp (H) and U1 , U2 are unitary then
From (3.245) it follows directly that Scp (H) is linear, that the triangle inequal-
ity holds, so that k · kScp is a norm, and Scp (H) is complete and that it is ∗-closed.
Now, if A ∈ Scq (H) and B ∈ Scp (H) for conjugate indices p, q ∈ (1, ∞) choose
a finite rank orthogonal projection P and consider ABP which is of finite rank, and
hence of trace class. We can compute its trace with respect to any orthonormal
basis. Choose an orthonormal basis e_i of the range of PAP, and f_i so that the polar
decomposition of PAP becomes

PAP f_i = s_i e_i \Longrightarrow PA^*P e_i = s_i f_i.
Before looking at general Fredholm operators let’s check that, in the case of
operators of the form Id −K, with K compact the third conclusion in (3.207) really
follows from the first. This is a general fact which I mentioned, at least, earlier but
let me pause to prove it.
Proposition 3.26. If B ∈ B(H) is a bounded operator on a Hilbert space and
B ∗ is its adjoint then
(3.255) Ran(B)⊥ = (Ran(B))⊥ = {v ∈ H; (v, w) = 0 ∀ w ∈ Ran(B)} = Nul(B ∗ ).
Proof. The definition of the orthocomplement of Ran(B) shows immediately
that
(3.256) v ∈ (Ran(B))⊥ ⇐⇒ (v, w) = 0 ∀ w ∈ Ran(B) ⇐⇒ (v, Bu) = 0 ∀ u ∈ H
⇐⇒ (B ∗ v, u) = 0 ∀ u ∈ H ⇐⇒ B ∗ v = 0 ⇐⇒ v ∈ Nul(B ∗ ).
On the other hand we have already observed that V ⊥ = (V )⊥ for any subspace –
since the right side is certainly contained in the left and (u, v) = 0 for all v ∈ V
implies that (u, w) = 0 for all w ∈ V by using the continuity of the inner product
to pass to the limit of a sequence vn → w.
There is a more ‘analytic’ way of characterizing Fredholm operators, rather
than Definition 3.9.
Lemma 3.22. An operator F ∈ B(H) is Fredholm, F ∈ F(H), if and only if it
has a generalized inverse P satisfying

(3.257)   PF = \mathrm{Id} - \Pi_{\mathrm{Nul}(F)}, \qquad FP = \mathrm{Id} - \Pi_{\mathrm{Ran}(F)^\perp},

with the two projections of finite rank.
Proof. If (3.257) holds then F must be Fredholm: its null space is finite
dimensional and, from the second identity, the range of F must contain the range of
\mathrm{Id} - \Pi_{\mathrm{Ran}(F)^\perp}, and hence it must be closed and of finite codimension.
Conversely, suppose that F ∈ F(H). We can divide H into two pieces in two
ways as H = Nul(F ) ⊕ Nul(F )⊥ and H = Ran(F )⊥ ⊕ Ran(F ) where in each case
the first summand is finite-dimensional. Then F defines four maps, from each of
the two first summands to each of the two second ones but only one of these is
non-zero and so F corresponds to a bounded linear map F̃ : Nul(F )⊥ −→ Ran(F ).
These are two Hilbert spaces with a bounded linear bijection between them, so the
inverse map, P̃ : Ran(F ) −→ Nul(F )⊥ is bounded by the Open Mapping Theorem
and we can define
(3.258) P = P̃ ◦ ΠNul(F )⊥ .
Then (3.257) follows directly.
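In finite dimensions every operator is Fredholm, and the generalized inverse of (3.257) is the Moore-Penrose pseudoinverse; a sketch assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(10)

# A rank-deficient 6x6 matrix F (rank at most 4) and its pseudoinverse P.
F = rng.standard_normal((6, 4)) @ rng.standard_normal((4, 6))
P = np.linalg.pinv(F)

PF = P @ F      # orthogonal projection onto Nul(F)^perp, i.e. Id - Pi_{Nul(F)}
FP = F @ P      # orthogonal projection onto Ran(F), i.e. Id - Pi_{Ran(F)^perp}

assert np.allclose(PF @ PF, PF) and np.allclose(PF, PF.T)
assert np.allclose(FP @ FP, FP) and np.allclose(FP, FP.T)
assert np.allclose(F @ PF, F)   # PF acts as the identity on Nul(F)^perp
```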
What we want to show is that the Fredholm operators form an open set in
B(H) and that the index is locally constant. To do this we show that a weaker
version of (3.257) also implies that F is Fredholm.
Lemma 3.23. An operator F ∈ B(H) is Fredholm if and only if it has a para-
metrix Q ∈ B(H) in the sense that

(3.259)   QF = \mathrm{Id} - E_L, \qquad FQ = \mathrm{Id} - E_R,
with EL and ER of finite rank. Moreover any two such parametrices differ by a
finite rank operator.
The term ‘parametrix’ refers to an inverse modulo an ideal. Here we are looking
at the ideal of finite rank operators. In fact this is equivalent to the existence of
an inverse modulo compact operators. One direction is obvious – since finite rank
operators are compact – the other is covered by one of the problems. Notice that
the parametrix Q is itself Fredholm, since reversing the two equations shows that
F is a parametrix for Q. Similarly it follows that if F is Fredholm then so is F^*
and that the product of two Fredholm operators is Fredholm.
goes around the origin in C. There is a lot more topology than this and it is actually
quite complicated.
Perhaps surprisingly, the corresponding group of the invertible bounded oper-
ators on a separable (complex) infinite-dimensional Hilbert space is contractible.
This is Kuiper’s theorem, and means that this group, GL(H), has no ‘topology’ at
all, no holes in any dimension and for topological purposes it is like a big open ball.
The proof is not really hard, but it is not exactly obvious either. It depends on
an earlier idea, 'Eilenberg's swindle' (an unusual name for a technique), which
shows how the infinite-dimensionality is exploited. As you can guess, this is sort
of amusing (if you have the right attitude . . . ). The proof I give here is due to B.
Mityagin, [3].
Let’s denote by GL(H) this group. In view of the open mapping theorem we
know that
(3.267) GL(H) = {A ∈ B(H); A is injective and surjective}.
Contractibility means precisely that there is a continuous map

(3.268)   \gamma : [0,1] \times GL(H) \longrightarrow GL(H) \ \text{s.t.}\ \gamma(0, A) = A,\ \gamma(1, A) = \mathrm{Id} \quad \forall A \in GL(H).
Continuity here means for the metric space [0, 1] × GL(H) where the metric comes
from the norms on R and B(H).
I will only show 'weak contractibility' of GL(H). This has nothing to do with
weak convergence; it just means that we only look for a homotopy over compact
sets.
As a warm-up exercise, let us show that the group GL(H) is contractible to
the unitary subgroup
(3.269) U(H) = {U ∈ GL(H); U −1 = U ∗ }.
These are the isometric isomorphisms.
Proposition 3.28. There is a continuous map

(3.270)   \Gamma : [0,1] \times GL(H) \longrightarrow GL(H) \ \text{s.t.}\ \Gamma(0, A) = A,\ \Gamma(1, A) \in U(H) \quad \forall A \in GL(H).
Proof. This is a consequence of the functional calculus, giving the 'polar
decomposition' of invertible (and more generally bounded) operators. Namely, if
A ∈ GL(H) then AA^* ∈ GL(H) is self-adjoint. Its spectrum is then contained in
an interval [a, b], where 0 < a \le b = \|A\|^2. It follows from what we showed earlier
that R = (AA^*)^{1/2} is a well-defined bounded self-adjoint operator and R^2 = AA^*.
Moreover, R is invertible and the operator U_A = R^{-1}A ∈ U(H). Certainly it is
bounded and U_A^* = A^*R^{-1}, so U_A^* U_A = A^*R^{-2}A = \mathrm{Id} since R^{-2} = (AA^*)^{-1} =
(A^*)^{-1}A^{-1}. Thus U_A^* is a right inverse of U_A, and (since U_A is a bijection) is the
unique inverse, so U_A ∈ U(H). Having written A = RU_A, we see that

(3.271)   \Gamma(s, A) = (s\,\mathrm{Id} + (1-s)R)\,U_A, \quad s \in [0,1],

satisfies (3.270).
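The construction in the proof can be traced numerically (a real-matrix sketch assuming numpy, where U_A comes out orthogonal): the factor s Id + (1 − s)R stays positive definite, so Γ(s, A) stays invertible along the whole segment.

```python
import numpy as np

rng = np.random.default_rng(11)

A = rng.standard_normal((4, 4)) + 3 * np.eye(4)   # comfortably invertible

# R = (AA^*)^{1/2} via the eigendecomposition of the positive matrix AA^*.
lam, V = np.linalg.eigh(A @ A.T)
R = V @ np.diag(np.sqrt(lam)) @ V.T
UA = np.linalg.inv(R) @ A                          # the unitary factor

assert np.allclose(UA @ UA.T, np.eye(4))           # U_A orthogonal
assert np.allclose(R @ UA, A)                      # A = R U_A

# Gamma(s, A) = (s Id + (1-s) R) U_A is invertible for every s in [0, 1].
for s in np.linspace(0.0, 1.0, 11):
    G = (s * np.eye(4) + (1 - s) * R) @ UA
    assert abs(np.linalg.det(G)) > 1e-8
```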
There is however the issue of continuity of this map. Continuity in s is clear
enough, but we also need to show that the map

(3.272)   GL(H) \ni A \longmapsto (AA^*)^{1/2} \in GL(H),

defining R, is continuous.
giving (3.276). The absolute convergence of the series follows from (3.278).
Thus, it remains to find a decomposition (3.277) for which (3.278) holds. This
follows from Bessel's inequality. First choose k_1 = 1 ∈ K; then (Be_1, e_l) → 0 as l → ∞,
so |(Be_1, e_{l_1})| < \epsilon/4 for l_1 large enough, and we will take l_1 > 2k_1. Then we use
induction on N, choosing K(N), L(N) and O(N) with

K(N) = \{k_1 = 1 < k_2 < \dots < k_N\},
L(N) = \{l_1 < l_2 < \dots < l_N\}, \quad l_r > 2k_r,\ k_r > l_{r-1} \text{ for } 1 < r \le N, \text{ and}
O(N) = \{1, \dots, l_N\} \setminus (K(N) \cup L(N)).

Now, choose k_{N+1} > l_N such that |(e_l, Be_{k_{N+1}})| < 2^{-l-N} for all l ∈ L(N), and
then l_{N+1} > 2k_{N+1} such that |(e_{l_{N+1}}, Be_k)| < 2^{-N-1-k} for k ∈ K(N+1) = K(N) ∪
\{k_{N+1}\}, and the inductive hypothesis follows with L(N+1) = L(N) ∪ \{l_{N+1}\}.
We have shown above that the projection P has range equal to the range of
BQ_K; apply Lemma 3.26 with M = S(BQ_K)^{-1}P, where S is a fixed isomorphism
of the range of Q_K to the range of Q_L. Then

(3.286)   L_1(\theta) = R(\theta)B \text{ has } L_1(0) = B,\ L_1(\pi/2) = B' \text{ with } B'Q_K = Q_L S Q_K

an isomorphism onto the range of Q_L.
Next apply Lemma 3.26 again, but for the projections Q_K and Q_L with the
isomorphism S, giving

(3.287)   R'(\theta) = \cos\theta\, Q_K + \sin\theta\, S - \sin\theta\, S' + \cos\theta\, Q_L + Q_O.

Then the curve of invertibles

L_2(\theta) = R'(\theta - \theta_0)B' \text{ has } L_2(0) = B',\ L_2(\pi/2) = B'',\ B''Q_K = Q_K.
So, we have succeeded by successive homotopies through invertible elements in
arriving at an operator

(3.288)   B'' = \begin{pmatrix} \mathrm{Id} & E \\ 0 & F \end{pmatrix},
where we are looking at the decomposition of H = H ⊕ H according to the projec-
tions QK and Id −QK . The invertibility of this is equivalent to the invertibility of
F and the homotopy
(3.289) B″(s) = ( Id , (1 − s)E ; 0 , F )
connects it to
(3.290) L = ( Id , 0 ; 0 , F ) ,   (B″(s))^{−1} = ( Id , −(1 − s)EF^{−1} ; 0 , F^{−1} )
through invertibles.
The final step is ‘Eilenberg’s swindle’. In (3.290) we arrived at a family of
operators on H ⊕ H. Reversing the factors we can consider
(3.291) ( F , 0 ; 0 , Id ) .
Eilenberg’s idea is to connect this to the identity by an explicit curve which ‘only
uses F ’ and so is uniform in parameters. So for the moment just take F to be a
fixed unitary operator on H.
We use several isomorphisms involving H and l2 (H), which are isomorphic of
course, as separable Hilbert spaces. First consider the two simple ‘rotations’ on
H ⊕H
(3.292) ( Id cos t , F sin t ; −F^{−1} sin t , Id cos t ) ,   ( F cos t , F sin t ; −F^{−1} sin t , F^{−1} cos t ) .
These are both unitary norm-continuous curves, the first starts at the identity and
is off-diagonal at t = π/2 and equal to the second at that point. So by reversing
the second and going back we connect
(3.293) Id = ( Id , 0 ; 0 , Id )   to   ( F , 0 ; 0 , F^{−1} ) .
Now, we can also identify H ⊕ H with l2 (H ⊕ H). So an element of this space
is an l2 sequence with values in H ⊕ H and the identity just acts as the identity on
each 2 × 2 block. We can perform the to-and-fro rotation in (3.292) in each block.
That this is actually a norm-continuous curve acting on l2 (H ⊕ H) is a consequence
of the fact that it is ‘the same’ in each block and so it is actually a sequence of
operators, each on H ⊕ H, which are continuous, and uniformly so with respect to
the index i corresponding to a sequence in l2 ; so the whole operator is continuous.
This connects Id to the second matrix (3.293) acting in each block of l2 (H ⊕H).
So, here is one part of the swindle, we can reorder the space so it becomes l2 (H)
where now the operator is diagonal but with alternating entries
(3.294) Diag(F^{−1} , F, F^{−1} , F, . . . ) on l2 (H).
This requires just a unitary isomorphism corresponding to relabelling the basis
elements.
Now, go back to the operator (3.291) and look at the lower left identity element
acting on H. We can identify the H in this spot with l2 (H) and then we have a curve
linking this entry to (3.294). For the whole operator this gives a norm-continuous
curve connecting
(3.295) ( F , 0 ; 0 , Id )   to   Diag(F, F^{−1} , F, F^{−1} , F, . . . ) on H ⊕ l2 (H)
just adding the first entry. But now we reverse the procedure using F −1 in place
of F so the end-point in (3.295) is connected to the identity on l2 (H) = H!
The fact that this construction only uses F itself and 2 × 2 matrices means that
it works uniformly when F depends continuously on parameters in a compact set.
So we have constructed a curve as desired in (3.275) and hence we have proved:-
Theorem 3.6 (Kuiper). For any compact subset X ⊂ GL(H) there is a retrac-
tion γ as in (3.275).
Note that it follows from a result of Milnor (on CW complexes) that in this case
contractibility follows from weak contractibility. If you are topologically inclined
you might like to look up some applications of Kuiper’s Theorem - for instance
that the projective unitary group is a classifying space for two dimensional integral
cohomology, an Eilenberg-MacLane space.
CHAPTER 4

Differential and Integral operators
In the last part of the course some more concrete analytic questions are con-
sidered. First the completeness of the Fourier basis is shown; this is one of the
settings from which the notion of a Hilbert space originates. The index formula
for Toeplitz operators on Hardy space is then derived. Next operator methods are
used to demonstrate the uniqueness of the solutions to the Cauchy problem. The
completeness of the eigenbasis for ‘Sturm-Liouville’ theory is then deduced from
the spectral theorem. The Fourier transform is examined and used to prove the
completeness of the Hermite basis for L2 (R). Once one has all this, one can do a
lot more, but there is no time left. Such is life.
1. Fourier series
Let us now try applying our knowledge of Hilbert space to a concrete Hilbert
space such as L2 (a, b) for a finite interval (a, b) ⊂ R. Any such interval with b > a
can be mapped by a linear transformation onto (0, 2π) and so we work with this
special interval. You showed that L2 (a, b) is indeed a Hilbert space. One of the
reasons for developing Hilbert space techniques originally was precisely the following
result.
Theorem 4.1. If u ∈ L2 (0, 2π) then the Fourier series of u,
(4.1) Σ_{k∈Z} c_k e^{ikx} ,   c_k = (1/2π) ∫_{(0,2π)} u(x)e^{−ikx} dx,
converges in L2 (0, 2π) to u.
Notice that this does not say the series converges pointwise, or pointwise almost
everywhere. In fact it is true that the Fourier series of a function in L2 (0, 2π)
converges almost everywhere to u, but this is hard to prove! It is an important
result of L. Carleson. Here we are just claiming that
(4.2) lim_{n→∞} (1/2π) ∫_{(0,2π)} |u(x) − Σ_{|k|≤n} c_k e^{ikx} |^2 dx = 0
for any u ∈ L2 (0, 2π).
Our abstract Hilbert space theory has put us quite close to proving this. First
observe that if e′_k (x) = exp(ikx) then these elements of L2 (0, 2π) satisfy
(4.3) ∫_0^{2π} e′_k \overline{e′_j} = ∫_0^{2π} exp(i(k − j)x) dx = 0 if k ≠ j,   = 2π if k = j.
Thus the functions
(4.4) e_k = e′_k /‖e′_k ‖ = (1/√(2π)) e^{ikx}
form an orthonormal set in L2 (0, 2π). It follows that (4.1) is just the Fourier-Bessel
series for u with respect to this orthonormal set:-
(4.5) c_k = (1/√(2π)) (u, e_k ) =⇒ c_k e^{ikx} = (u, e_k )e_k .
So, we already know that this series converges in L2 (0, 2π) thanks to Bessel’s in-
equality. So ‘all’ we need to show is
Proposition 4.1. The ek , k ∈ Z, form an orthonormal basis of L2 (0, 2π), i.e.
are complete:
(4.6) ∫ u e^{ikx} = 0 ∀ k =⇒ u = 0 in L2 (0, 2π).
This however, is not so trivial to prove. An equivalent statement is that the fi-
nite linear span of the ek is dense in L2 (0, 2π). I will prove this using Fejér’s method.
In this approach, we check that any continuous function on [0, 2π] satisfying the
additional condition that u(0) = u(2π) is the uniform limit on [0, 2π] of a sequence
in the finite span of the ek . Since uniform convergence of continuous functions cer-
tainly implies convergence in L2 (0, 2π) and we already know that the continuous
functions which vanish near 0 and 2π are dense in L2 (0, 2π) this is enough to prove
Proposition 4.1. However the proof is a serious piece of analysis, at least it seems so
to me! There are other approaches, for instance we could use the Stone-Weierstrass
Theorem; rather than do this we will deduce the Stone-Weierstrass Theorem from
Proposition 4.1. Another good reason to proceed directly is that Fejér’s approach
is clever and generalizes in various ways as we will see.
So, the problem is to find the sequence in the span of the ek which converges
to a given continuous function and the trick is to use the Fourier expansion that
we want to check! The idea of Cesàro is close to one we have seen before, namely
to make this Fourier expansion ‘converge faster’, or maybe better. For the moment
we can work with a general function u ∈ L2 (0, 2π) – or think of it as continuous if
you prefer. The truncated Fourier series of u is a finite linear combination of the
ek :
(4.7) U_n (x) = Σ_{|k|≤n} ( (1/2π) ∫_{(0,2π)} u(t)e^{−ikt} dt ) e^{ikx}
where I have just inserted the definition of the ck ’s into the sum. Since this is a
finite sum we can treat x as a parameter and use the linearity of the integral to
write it as
(4.8) U_n (x) = ∫_{(0,2π)} D_n (x − t)u(t),   D_n (s) = (1/2π) Σ_{|k|≤n} e^{iks} .
The Cesàro means here are V_n (x) = (n + 1)^{−1} Σ_{l=0}^{n} U_l (x). Again plugging in
the definitions of the U_l ’s and using the linearity of the integral we see that
(4.12) V_n (x) = ∫_{(0,2π)} S_n (x − t)u(t),   S_n (s) = (n + 1)^{−1} Σ_{l=0}^{n} D_l (s).
So again we want to compute a more useful form for Sn (s) – which is the Fejér
kernel. Since the denominators in (4.10) are all the same,
(4.13) 2π(n + 1)(e^{is/2} − e^{−is/2} )S_n (s) = Σ_{l=0}^{n} e^{i(l+1/2)s} − Σ_{l=0}^{n} e^{−i(l+1/2)s} .
Summing these geometric series gives the closed form of the Fejér kernel,
(4.15) S_n (s) = (1/(2π(n + 1))) sin^2 ((n + 1)s/2)/ sin^2 (s/2),
and integrating (4.8) term by term shows that each D_l , and hence S_n , has integral one:
(4.16) ∫_{(0,2π)} S_n (s) ds = 1.
Looking directly at (4.15) the first thing to notice is that S_n (s) ≥ 0. Also, we
can see that the denominator only vanishes when s = 0 or s = 2π in [0, 2π]. Thus
if we stay away from there, say s ∈ (δ, 2π − δ) for some δ > 0, then, sin(s/2) being
bounded away from zero there,
(4.17) |S_n (s)| ≤ (n + 1)^{−1} C_δ on (δ, 2π − δ).
We are interested in how close Vn (x) is to the given u(x) in supremum norm,
where now we will take u to be continuous. Because of (4.16) we can write
(4.18) u(x) = ∫_{(0,2π)} S_n (x − t)u(x)
where t denotes the variable of integration (and x is fixed in [0, 2π]). This ‘trick’
means that the difference is
(4.19) V_n (x) − u(x) = ∫_{(0,2π)} S_n (x − t)(u(t) − u(x)).
For each x we split this integral into two parts, the set Γ(x) where x − t ∈ [0, δ] or
x − t ∈ [2π − δ, 2π] and the remainder. So
(4.20) |V_n (x) − u(x)| ≤ ∫_{Γ(x)} S_n (x − t)|u(t) − u(x)| + ∫_{(0,2π)\Γ(x)} S_n (x − t)|u(t) − u(x)|.
Now on Γ(x) either |t − x| ≤ δ – the points are close together – or t is close to 0 and
x to 2π, so 2π − x + t ≤ δ, or conversely x is close to 0 and t to 2π, so 2π − t + x ≤ δ.
In any case, by assuming that u(0) = u(2π) and using the uniform continuity of a
continuous function on [0, 2π], given ε > 0 we can choose δ so small that
(4.21) |u(x) − u(t)| ≤ ε/2 on Γ(x).
On the complement of Γ(x) we have (4.17) and since u is bounded we get the
estimate
(4.22) |V_n (x) − u(x)| ≤ (ε/2) ∫_{Γ(x)} S_n (x − t) + (n + 1)^{−1} q(δ) ≤ ε/2 + (n + 1)^{−1} q(δ)
where q(δ) = 2 (sin(δ/2))^{−2} sup |u| is a positive constant depending on δ (and u).
Here the fact that S_n is non-negative and has integral one has been used again to
bound the integral of S_n (x − t) over Γ(x) by 1. Having chosen δ to make the first
term small, we can choose n large to make the second term small and it follows
that
(4.23) Vn (x) → u(x) uniformly on [0, 2π] as n → ∞
under the assumption that u ∈ C([0, 2π]) satisfies u(0) = u(2π).
So this proves Proposition 4.1 subject to the density in L2 (0, 2π) of the contin-
uous functions which vanish near (but not of course in a fixed neighbourhood of)
the ends. In fact we know that the L2 functions which vanish near the ends are
dense since we can chop off and use the fact that
(4.24) lim_{δ→0} ( ∫_{(0,δ)} |f |^2 + ∫_{(2π−δ,2π)} |f |^2 ) = 0.
Make sure you understand the change of variable argument to get to a general
(finite) interval.
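As a quick sanity check, here is a small numerical sketch (not from the notes; the test function u(x) = |sin x|, the grid size and the values of n are my own arbitrary choices) of the uniform convergence of the Fejér means just proved:

```python
import numpy as np

N = 4096
x = np.arange(N) * 2 * np.pi / N
u = np.abs(np.sin(x))                 # continuous, with u(0) = u(2*pi)

def c(k):
    # c_k = (1/2pi) int_0^{2pi} u(x) e^{-ikx} dx, as a Riemann sum
    return np.mean(u * np.exp(-1j * k * x))

def fejer_mean(n):
    # V_n = (n+1)^{-1}(U_0 + ... + U_n): the partial sum reweighted by
    # the triangular factors (1 - |k|/(n+1))
    return sum((1 - abs(k) / (n + 1)) * c(k) * np.exp(1j * k * x)
               for k in range(-n, n + 1)).real

errs = [float(np.max(np.abs(fejer_mean(n) - u))) for n in (4, 64)]
print(errs)   # sup-norm errors; the second should be much smaller
```

The triangular weights are exactly what averaging the partial sums U_0, …, U_n produces, so this computes V_n without forming each U_l separately.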
2. Toeplitz operators
Although the convergence of Fourier series was stated above for functions on an
interval (0, 2π) it can be immediately reinterpreted in terms of periodic functions
on the line, or equivalently functions on the circle S. Namely a 2π-periodic function
(4.27) u : R −→ C, u(x + 2π) = u(x) ∀ x ∈ R
is uniquely determined by its restriction to [0, 2π) by just iterating to see that
(4.28) u(x + 2πk) = u(x), x ∈ [0, 2π), k ∈ Z.
Conversely a function on [0, 2π) determines a 2π-periodic function this way. Thus
a function on the circle
(4.29) S = {z ∈ C : |z| = 1}
is the same as a periodic function on the line in terms of the standard angular
variable
(4.30) S ∋ z = e^{iθ} , θ ∈ [0, 2π).
In particular we can identify L2 (S) with L2 (0, 2π) in this way – since the missing
end-point corresponds to a set of measure zero. Equivalently this identifies L2 (S)
as the locally square integrable functions on R which are 2π-periodic.
Since S is a compact Lie group (what is that you say? Look it up!) this brings
us into the realm of harmonic analysis. Just restating the results above for any
u ∈ L2 (S) the Fourier series (thinking of each exp(ikθ) as a 2π-periodic function
on the line) converges in L2 (I) for any bounded interval
(4.31) u(x) = Σ_{k∈Z} a_k e^{ikx} ,   a_k = (1/2π) ∫_{(0,2π)} u(x)e^{−ikx} dx.
After this adjustment of attitude, we follow G.H. Hardy (you might enjoy “A
Mathematician’s Apology”) in thinking about:
Definition 4.1. Hardy space is
(4.32) H = {u ∈ L2 (S); ak = 0 ∀ k < 0}.
There are lots of reasons to be interested in H ⊂ L2 (S) but for the moment
note that it is a closed subspace – since it is the intersection of the null spaces of the
continuous linear functionals L2 (S) ∋ u ⟼ a_k , k < 0. Thus there is a unique orthogonal
projection
(4.33) πH : L2 (S) −→ H
with range H.
If we go back to the definition of L2 (S) we can see that a continuous function α ∈
C(S) defines a bounded linear operator on L2 (S) by multiplication. It is invertible
if and only if α(θ) ≠ 0 for all θ ∈ [0, 2π), which is the same as saying that α is a
continuous map
(4.34) α : S −→ C∗ = C \ {0}.
For such a map there is a well-defined ‘winding number’ giving the number of
times that the curve in the plane defined by α goes around the origin. This is easy
to define using the properties of the logarithm. Suppose that α is once continuously
differentiable and consider
(4.35) (1/(2πi)) ∫_{[0,2π]} α^{−1} (dα/dθ) dθ = wn(α).
If we can write
(4.36) α = exp(2πif (θ))
with f : [0, 2π] −→ C continuous then necessarily f is differentiable and
(4.37) wn(α) = ∫_0^{2π} (df /dθ) dθ = f (2π) − f (0) ∈ Z
since exp(2πi(f (0) − f (2π))) = 1. In fact, even for a general α ∈ C(S; C∗ ), it is
always possible to find a continuous f satisfying (4.36), using the standard proper-
ties of the logarithm as a local inverse to exp, but ill-determined up to addition of
integral multiples of 2πi. Then the winding number is given by the last expression
in (4.37) and is independent of the choice of f.
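The winding number integral (4.35) can be approximated numerically. The sketch below (a hypothetical helper, not part of the notes; the sample size n is arbitrary) accumulates the change of argument of α around the circle:

```python
import numpy as np

def winding_number(alpha, n=2048):
    # discrete analogue of (1/2 pi i) * int alpha^{-1} d(alpha):
    # accumulate the change of argument along the sampled curve
    theta = np.arange(n + 1) * 2 * np.pi / n
    z = alpha(theta)
    assert np.all(np.abs(z) > 0), "alpha must take values in C*"
    dphi = np.angle(z[1:] / z[:-1])   # each increment lies in (-pi, pi]
    return int(round(float(dphi.sum() / (2 * np.pi))))

print(winding_number(lambda t: np.exp(3j * t)))      # a loop of winding 3
print(winding_number(lambda t: 2 + np.exp(1j * t)))  # does not enclose 0
```

Because each angle increment is taken in (−π, π], the sum is the total change of a continuous argument, matching the description via a continuous f in (4.36).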
Definition 4.2. A Toeplitz operator on H is an operator of the form
(4.38) Tα = πH απH : H −→ H, α ∈ C(S).
The result I want is one of the first ‘geometric index theorems’ – it is a very
simple case of the celebrated Atiyah-Singer index theorem (which it much predates).
Theorem 4.3 (Toeplitz). If α ∈ C(S; C∗ ) then the Toeplitz operator (4.38) is
Fredholm (on the Hardy space H) with index
(4.39) ind(Tα ) = − wn(α)
given in terms of the winding number of α.
Proof. First we need to show that Tα is indeed a Fredholm operator. To do
this we decompose the original, multiplication, operator into four pieces
(4.40) α = Tα + πH α(Id −πH ) + (Id −πH )απH + (Id −πH )α(Id −πH )
which you can think of as a 2 × 2 matrix corresponding to writing
L2 (S) = H ⊕ H− ,   H− = (Id −πH )L2 (S) = H^⊥ ,
(4.41) α = ( Tα , πH α(Id −πH ) ; (Id −πH )απH , (Id −πH )α(Id −πH ) ) .
Now, we will show that the two ‘off-diagonal’ terms are compact operators
(on L2 (S)). Consider first (Id −πH )απH . It was shown above, as a form of the
Stone-Weierstrass Theorem, that the finite Fourier sums are dense in C(S) in the
supremum norm. This is not the convergence of the Fourier series but there is a
sequence αk → α in supremum norm, where each
(4.42) α_k = Σ_{j=−N_k}^{N_k} a_{kj} e^{ijθ} .
It follows that
(4.43) k(Id −πH )αk πH − (Id −πH )απH kB(L2 (S)) → 0.
Now by (4.42) each (Id −πH )αk πH is a finite linear combination of terms
(4.44) (Id −πH )eijθ πH , |j| ≤ Nk .
However, each of these operators is of finite rank; they actually vanish if j ≥ 0 and
for j < 0 the rank is exactly −j. So each (Id −πH )αk πH is of finite rank and hence
(Id −πH )απH is compact. A very similar argument works for πH α(Id −πH ) (or you
can use adjoints).
Now, again assume that α does not vanish anywhere. Then the whole
multiplication operator in (4.40) is invertible. If we remove the two compact terms
we see that
(4.45) Tα + (Id −πH )α(Id −πH ) is Fredholm
since the Fredholm operators are stable under addition of compact operators. Here
the first part maps H to H and the second maps H− to H− . It follows that the
null space and range of Tα are the projections of the null space and range of the
sum (4.45) – so it must have finite dimensional null space and closed range with a
finite-dimensional complement as a map from H to itself:-
(4.46) α ∈ C(S; C∗ ) =⇒ Tα is Fredholm in B(H).
So it remains to compute its index. Note that the index of the sum (4.45)
acting on L2 (S) vanishes, so that does not really help! The key here is the stability
of both the index and the winding number.
Lemma 4.1. If α ∈ C(S; C∗ ) has winding number p ∈ Z then there is a curve
(4.47) αt : [0, 1] −→ C(S; C∗ ), α1 = α, α0 = eipθ .
Proof. If you take a continuous function f : [0, 2π] −→ C then
(4.48) α = exp(2πif ) ∈ C(S; C∗ ) iff f (2π) = f (0) + p, p ∈ Z
(so that α(2π) = α(0)) where p is precisely the winding number of α. So to construct
a continuous family as in (4.47) we can deform f instead provided we keep the
difference between the end values constant. Clearly
(4.49) α_t = exp(2πif_t ),   f_t (θ) = p (θ/2π)(1 − t) + f (θ)t,   t ∈ [0, 1]
does this since f_t (0) = f (0)t, f_t (2π) = p(1 − t) + f (2π)t = f (0)t + p, f_0 = pθ/2π,
f_1 (θ) = f (θ).
It was shown above that the index of a Fredholm operator is constant on the
components – so along any norm continuous curve such as Tαt where αt is as in
(4.47). Thus the index of Tα , where α has winding number p is the same as the
index of the Toeplitz operator defined by exp(ipθ), which has the same winding
number (note that the winding number is also constant under deformations of α).
So we are left to compute the index of the operator πH eipθ πH acting on H. This is
just a p-fold ‘shift up’. If p ≤ 0 it is actually surjective and has null space spanned
by the exp(ijθ) with 0 ≤ j < −p – since these are mapped to exp(i(j + p)θ) and
hence killed by πH . Thus indeed the index of Tα for α = exp(ipθ) is −p in this case.
For p > 0 we can take the adjoint so we have proved Theorem 4.3.
Why is this important? Suppose you have a function α ∈ C(S; C∗ ) and you
know it has winding number −k for k ∈ N. Then you know that the operator Tα
must have null space at least of dimension k. It could be bigger but this is an
existence theorem hence useful. The index is generally relatively easy to compute
and from that one can tell quite a lot about a Fredholm operator.
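The ‘shift’ description of T_α for α = e^{ipθ} can be checked in finite truncation. This sketch is my own illustration (the truncation size N = 12 is arbitrary); note that in a finite square truncation the index itself is always zero, so only the kernel count is meaningful here:

```python
import numpy as np

def toeplitz_symbol_power(p, N=12):
    # matrix of pi_H e^{ip theta} pi_H on the truncated Hardy basis
    # e_j ~ e^{ij theta}, 0 <= j < N:  e_j -> e_{j+p} when j + p >= 0,
    # and e_j -> 0 otherwise.
    T = np.zeros((N, N))
    for j in range(N):
        if 0 <= j + p < N:
            T[j + p, j] = 1.0
    return T

p = -3
T = toeplitz_symbol_power(p)
kernel_dim = T.shape[1] - np.linalg.matrix_rank(T)
print(kernel_dim)   # = -p = 3: the basis elements e_0, e_1, e_2 are killed
```

For p > 0 the transposed matrix exhibits the cokernel in the same way, consistent with taking adjoints in the proof.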
3. Cauchy problem
Most, if not all, of you will have had a course on ordinary differential equations
so the results here are probably familiar to you at least in outline. I am not going
to try to push things very far but I will use the Cauchy problem to introduce ‘weak
solutions’ of differential equations.
So, here is a form of the Cauchy problem. Let me stick to the standard interval
we have been using but as usual it does not matter. So we are interested in solutions
u of the equation, for some positive integer k,
(4.50) P u(x) = d^k u/dx^k (x) + Σ_{j=0}^{k−1} a_j (x) d^j u/dx^j (x) = f (x) on [0, 2π],
       d^j u/dx^j (0) = 0,  j = 0, . . . , k − 1,
       a_j ∈ C^j ([0, 2π]),  j = 0, . . . , k − 1.
So, the aj are fixed (corresponding if you like to some physical system), u is the
‘unknown’ function and f is also given. Recall that C j ([0, 2π]) is the space (complex
valued here) of functions on [0, 2π] which have j continuous derivatives. The middle
line consists of the ‘homogeneous’ Cauchy conditions – also called initial conditions –
where homogeneous just means zero. The general case of non-zero initial conditions
follows from this one.
If we want the equation to make ‘classical sense’ we need to assume for instance
that u has continuous derivatives up to order k and f is continuous. I have written
out the first term, involving the highest order of differentiation, in (4.50) separately
to suggest the following observation. Suppose u is just k times differentiable, but
without assuming the kth derivative is continuous. The equation still makes sense,
and if we assume that f is continuous then it actually follows that u is k times
continuously differentiable. In fact each of the terms in the sum is continuous, since
this only involves derivatives up to order k − 1 multiplied by continuous functions.
We can (mentally if you like) move these to the right side of the equation, so together
with f this becomes a continuous function. But then the equation itself implies
that d^k u/dx^k is continuous and so u is actually k times continuously differentiable. This
is a rather trivial example of ‘elliptic’ regularity which we will push much further.
So, the problem is to prove
Theorem 4.4. For each f ∈ C([0, 2π]) there is a unique k times continuously
differentiable solution, u, to (4.50).
Note that in general there is no way of ‘writing the solution down’. We can
show it exists, and is unique, and we can say a lot about it but there is no formula
– although we will see that it is the sum of a reasonable series.
How to proceed? There are many ways but to adopt the one I want to use
I need to manipulate the equation in (4.50). There is a certain discriminatory
property of the way I have written the equation. Although it seems rather natural,
writing the ‘coefficients’ a_j on the left involves an element of ‘handism’, if that is a
legitimate concept. Instead we could try for the ‘rightist’ approach and look at the
similar equation
(4.51) d^k u/dx^k (x) + Σ_{j=0}^{k−1} d^j (b_j (x)u)/dx^j (x) = f (x) on [0, 2π],
       d^j u/dx^j (0) = 0,  j = 0, . . . , k − 1,
       b_j ∈ C^j ([0, 2π]),  j = 0, . . . , k − 1.
As already written in (4.50) we think of P as an operator, sending u to this sum.
Lemma 4.2. For any functions a_j ∈ C^j ([0, 2π]) there are unique functions b_j ∈
C^j ([0, 2π]) so that (4.51) gives the same operator as (4.50).
Proof. Here we can simply write down a formula for the bj in terms of the
aj . Namely the product rule for derivatives means that
(4.52) d^j (b_j (x)u)/dx^j = Σ_{p=0}^{j} \binom{j}{p} (d^{j−p} b_j /dx^{j−p} ) · (d^p u/dx^p ).
If you are not quite confident that you know this, you do know it for j = 1 which
is just the usual product rule. So proceed by induction over j and observe that the
formula for j + 1 follows from the formula for j using the properties of the binomial
coefficients.
Pulling out the coefficients of a fixed derivative of u shows that we need the b_j to
satisfy
(4.53) a_p = b_p + Σ_{j=p+1}^{k−1} \binom{j}{p} d^{j−p} b_j /dx^{j−p} .
This shows the uniqueness since we can recover the a_j from the b_j . On the other
hand we can solve (4.53) for the b_j too. The ‘top’ equation says a_{k−1} = b_{k−1} and
then successive equations determine b_p in terms of a_p and the b_j with j > p, which
we already know iteratively.
Note that the b_j ∈ C^j ([0, 2π]).
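For k = 2 the recursion (4.53) reads b_1 = a_1 and b_0 = a_0 − a_1′. The following sketch (my own check, with arbitrarily chosen coefficients a_j and test function u) verifies pointwise that the two forms of the operator agree:

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 101)

# arbitrary coefficients with hand-coded derivative: a_1 = sin, a_1' = cos
a1, da1 = np.sin(x), np.cos(x)
a0 = x**2
# an arbitrary test function u with its first two derivatives
u, du, d2u = np.exp(0.3 * x), 0.3 * np.exp(0.3 * x), 0.09 * np.exp(0.3 * x)

# the recursion (4.53) specialized to k = 2
b1 = a1
b0 = a0 - da1

lhs = d2u + a1 * du + a0 * u                 # coefficients on the left, (4.50)
rhs = d2u + (da1 * u + b1 * du) + b0 * u     # u'' + (b_1 u)' + b_0 u, (4.51)
print(np.max(np.abs(lhs - rhs)))             # agreement up to rounding
```

The agreement is an algebraic identity – (b_1 u)′ = a_1′u + a_1 u′ and the a_1′u terms cancel – so the difference is pure floating-point rounding.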
So, what has been achieved by ‘writing the coefficients on the right’? The
important idea is that we can solve (4.50) in one particular case, namely when all
the aj (or equivalently bj ) vanish. Then we would just integrate k times. Let us
denote Riemann integration by
(4.54) I : C([0, 2π]) −→ C([0, 2π]),   If (x) = ∫_0^x f (s)ds.
Of course we can also think of this as Lebesgue integration and then we know for
instance that
(4.55) I : L2 (0, 2π) −→ C([0, 2π])
is a bounded linear operator. Note also that
(4.56) (If )(0) = 0
Notice that this argument is reversible. Namely if u ∈ C^k ([0, 2π]) satisfies (4.58)
for f ∈ C([0, 2π]) then u does indeed satisfy (4.51). In fact even more
is true
Proposition 4.2. The operator Id +B is invertible on L2 (0, 2π) and if f ∈
C([0, 2π]) then u = (Id +B)−1 I k f ∈ C k ([0, 2π]) is the unique solution of (4.51).
Proof. From (4.58) we see that B is given as a sum of operators of the form
I^p ◦ b where b is multiplication by a continuous function, also denoted b ∈ C([0, 2π]),
and p ≥ 1. Writing out I^p as an iterated (Riemann) integral
(4.59) I^p v(x) = ∫_0^x ∫_0^{y_1} · · · ∫_0^{y_{p−1}} v(y_p )dy_p · · · dy_1 .
where the Heaviside function restricts the integrand to t ≤ x. Similarly in the next
case by reversing the order of integration
(4.61) (I^2 ◦ b_{k−2} )v(x) = ∫_0^x ∫_0^s b_{k−2} (t)v(t)dt ds
       = ∫_0^x ∫_t^x b_{k−2} (t)v(t)ds dt = ∫_0^{2π} β_{k−2} (x, t)v(t)dt,
       β_{k−2} (x, t) = (x − t)_+ b_{k−2} (t).
In general
(4.62) (I^p ◦ b_{k−p} )v(x) = ∫_0^{2π} β_{k−p} (x, t)v(t)dt,   β_{k−p} (x, t) = ((x − t)_+^{p−1} /(p − 1)!) b_{k−p} (t).
The explicit formula here is not that important, but (throwing away a lot of
information) all the β_∗ (x, t) have the property that they are of the form
(4.63) β(x, t) = H(x − t)e(x, t),   e ∈ C([0, 2π]^2 ).
This is a Volterra operator
(4.64) Bv(x) = ∫_0^{2π} β(x, t)v(t)
with β as in (4.63).
This is just the Neumann series, but notice we are not claiming that ‖B‖ < 1, which
would give the convergence as we know from earlier. Rather the key is that the
powers of B behave very much like the operators I k computed above.
Lemma 4.3. For a Volterra operator in the sense of (4.63) and (4.64)
(4.66) B^j v(x) = ∫_0^{2π} H(x − t)e_j (x, t)v(t),   e_j ∈ C([0, 2π]^2 ),   |e_j | ≤ (C^j /(j − 1)!)(x − t)_+^{j−1} ,  j ≥ 1.
Proof. Proceeding inductively we can assume (4.66) holds for a given j. Then
B^{j+1} = B ◦ B^j is of the form in (4.66) with
(4.67) e_{j+1} (x, t) = ∫_0^{2π} H(x − s)e(x, s)H(s − t)e_j (s, t)ds = ∫_t^x e(x, s)e_j (s, t)ds,
so |e_{j+1} (x, t)| ≤ sup |e| (C^j /(j − 1)!) ∫_t^x (s − t)^{j−1} ds ≤ (C^{j+1} /j!)(x − t)_+^{j}
provided C ≥ sup |e|.
The estimate (4.67) means that, for a different constant C,
(4.68) ‖B^j ‖_{L2} ≤ C^j /(j − 1)! ,  j ≥ 1,
which is summable, so the Neumann series for (Id +B)^{−1} does converge.
To see the regularity of u = (Id +B)^{−1} I^k f when f ∈ C([0, 2π]) consider (4.58).
Each of the terms in the sum maps L2 (0, 2π) into C([0, 2π]) so u ∈ C([0, 2π]).
Proceeding iteratively, for each p = 0, . . . , k − 1, each of these terms, I^{k−j} (b_j u),
maps C^p ([0, 2π]) into C^{p+1} ([0, 2π]) so u ∈ C^k ([0, 2π]). Similarly for the Cauchy
conditions. Differentiating (4.58) recovers (4.51).
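The factorial decay in (4.68) is easy to observe numerically. The sketch below (my own discretization; the kernel e(x, t) = 0.6 + 0.2 cos(x − t) is an arbitrary smooth choice) shows ‖B‖ > 1 while ‖B^j‖ still tends to zero, which is exactly why the Neumann series converges:

```python
import numpy as np

n = 400
h = 2 * np.pi / n
x = (np.arange(n) + 0.5) * h
# an arbitrary smooth choice of e(x, t) in (4.63)
e = 0.6 + 0.2 * np.cos(np.subtract.outer(x, x))
B = np.tril(e) * h                    # lower-triangular: the factor H(x - t)

norms, P = [], np.eye(n)
for j in range(1, 15):
    P = P @ B
    norms.append(float(np.linalg.norm(P, 2)))
print(norms[0], norms[-1])   # ||B|| exceeds 1, yet ||B^14|| is tiny
```

Geometric-series reasoning fails here since ‖B‖ > 1; the convergence comes entirely from the (j − 1)! in the denominator of (4.68).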
As indicated above, the case of non-vanishing Cauchy data follows from Theo-
rem 4.4. Let
(4.69) Σ : C k ([0, 2π]) −→ C k
denote the Cauchy data map – evaluating the function and its first k − 1 derivatives
at 0.
Proposition 4.3. The combined map
(4.70) (Σ, P ) : C k ([0, 2π]) −→ Ck ⊕ C([0, 2π])
is an isomorphism.
Proof. The map Σ in (4.69) is certainly surjective, since it is surjective even
on polynomials of degree k − 1. Thus given z ∈ Ck there exists v ∈ C k ([0, 2π]) with
Σv = z. Now, given f ∈ C([0, 2π]) Theorem 4.4 allows us to find w ∈ C k ([0, 2π])
with P w = f − P v and Σw = 0. So u = v + w satisfies (Σ, P )u = (z, f ) and we have
shown the surjectivity of (4.70). The injectivity again follows from Theorem 4.4 so
(Σ, P ) is a bijection and hence an isomorphism using the Open Mapping Theorem
(or directly).
Let me finish this discussion of the Cauchy problem by introducing the notion
of a weak solution. Let Σ_{2π} : C^k ([0, 2π]) −→ C^k be the evaluation of the Cauchy
data at the top end of the interval. Then if u ∈ C^k ([0, 2π]) satisfies Σu = 0 and
v ∈ C^k ([0, 2π]) satisfies Σ_{2π} v = 0 there are no boundary terms in integration by
parts for derivatives up to order k and it follows that
(4.71) ∫_{(0,2π)} (P u) v = ∫_{(0,2π)} u Qv,   Qv = (−1)^k d^k v/dx^k + Σ_{j=0}^{k−1} (−1)^j d^j (a_j v)/dx^j .
4. Dirichlet problem on an interval
Proof. If f ∈ L2 ((0, 2π)) and v ∈ C([0, 2π]) then the product vf ∈ L2 ((0, 2π))
and kvf kL2 ≤ kvk∞ kf kL2 . This can be seen for instance by taking an absolutely
summable approximation to f, which gives a sequence of continuous functions con-
verging a.e. to f and bounded by a fixed L2 function and observing that vfn → vf
a.e. with bound a constant multiple, sup |v|, of that function. It follows that for
b ∈ C([0, 2π]2 ) the product
(4.84) b(x, y)f (y) ∈ L2 (0, 2π)
for each x ∈ [0, 2π]. Thus Bf (x) is well-defined by (4.83) since L2 ((0, 2π)) ⊂
L1 ((0, 2π)).
Not only that, but Bf ∈ C([0, 2π]) as can be seen from the Cauchy-Schwarz
inequality,
(4.85) |Bf (x′ ) − Bf (x)| = | ∫ (b(x′ , y) − b(x, y))f (y)| ≤ sup_y |b(x′ , y) − b(x, y)| (2π)^{1/2} ‖f ‖_{L2} .
Since the inclusion map C([0, 2π]) −→ L2 ((0, 2π)) is bounded, i.e. continuous, it
follows that the map (I have reversed the variables)
(4.90) [0, 2π] ∋ y ⟼ b(·, y) ∈ L2 ((0, 2π))
is continuous and so has a compact range.
Take the Fourier basis e_k for [0, 2π] and expand b in the first variable. Given
ε > 0 the compactness of the image of (4.90) implies that the Fourier-Bessel series
converges uniformly (has uniformly small tails), so for some N
(4.91) Σ_{|k|>N} |(b(x, y), e_k (x))|^2 < ε   ∀ y ∈ [0, 2π].
The finite part of the Fourier series is continuous as a function of both arguments,
(4.92) b_N (x, y) = Σ_{|k|≤N} e_k (x)c_k (y),   c_k (y) = (b(x, y), e_k (x))
and so defines another bounded linear operator BN as before. This operator can
be written out as
(4.93) B_N f (x) = Σ_{|k|≤N} e_k (x) ∫ c_k (y)f (y)dy
and so is of finite rank – it always takes values in the span of the first 2N + 1
trigonometric functions. On the other hand the remainder is given by a similar
operator with kernel q_N = b − b_N , and this satisfies
(4.94) sup_y ‖q_N (·, y)‖_{L2 ((0,2π))} → 0 as N → ∞.
This is a simple example of a ‘trace formula’; you might like to look up some others!
So, this happenstance allows us to decompose B as the square of another op-
erator defined directly on the orthonormal basis. Namely
(4.98) Au_k = (2/k) u_k =⇒ B = A^2 .
Here again it is immediate that A is a compact self-adjoint operator on L2 (0, 2π)
since its eigenvalues tend to 0. In fact we can see quite a lot more than this.
Lemma 4.5. The operator A maps L2 (0, 2π) into C([0, 2π]) and Af (0) = Af (2π) =
0 for all f ∈ L2 (0, 2π).
Proof. If f ∈ L2 (0, 2π) we may expand it in Fourier-Bessel series in terms of
the uk and find
(4.99) f = Σ_k c_k u_k ,   {c_k } ∈ l^2 .
Here each u_k is a bounded continuous function, with the bound |u_k | ≤ C being
independent of k, and (4.100) is the expansion Af = Σ_k (2c_k /k) u_k . So in fact
(4.100) converges uniformly and absolutely since it is uniformly Cauchy: for any q > p,
(4.101) | Σ_{k=p}^{q} (2c_k /k) u_k | ≤ 2C Σ_{k=p}^{q} |c_k | k^{−1} ≤ 2C ( Σ_{k=p}^{q} k^{−2} )^{1/2} ‖f ‖_{L2}
using the assumed non-negativity of V. So there can be no null space – all the
eigenvalues of AV A are non-negative – and the inverse is the bounded operator
given by its action on the basis
(4.105) (Id +AV A)^{−1} e_i = (1 + τ_i )^{−1} e_i ,   AV Ae_i = τ_i e_i .
Thus Id +AV A is invertible on L2 (0, 2π) with inverse of the form Id +Q, Q
again compact and self-adjoint since (1 + τi )−1 − 1 → 0. Now, to solve (4.103) we
just need to take
(4.106) v = (Id +Q)Af ⇐⇒ v + AV Av = Af in L2 (0, 2π).
Then indeed
(4.107) u = Av satisfies u + A2 V u = A2 f.
In fact since v ∈ L2 (0, 2π) from (4.106) we already know that u ∈ C([0, 2π]) vanishes
at the end points.
Moreover if f ∈ C([0, 2π]) we know that Bf = A2 f is twice continuously
differentiable, since it is given by two integrations – that is where B came from.
Now, we know that u in L2 satisfies u = −A^2 (V u) + A^2 f. Since V u ∈ L2 ((0, 2π)) =⇒
A(V u) ∈ L2 (0, 2π) and then, as seen above, A(A(V u)) is continuous. So combining
this with the result about A2 f we see that u itself is continuous and hence so is
V u. But then, going through the routine again
(4.108) u = −A2 (V u) + A2 f
is the sum of two twice continuously differentiable functions, and thus is so itself. In
fact from the properties of B = A^2 it satisfies
(4.109) −d^2 u/dx^2 = −V u + f
which is what the result claims. So, we have proved the existence part of Proposi-
tion 4.4.
The uniqueness follows pretty much the same way. If there were two twice
continuously differentiable solutions then the difference w would satisfy
(4.110) −d^2 w/dx^2 + V w = 0, w(0) = w(2π) = 0 =⇒ w = −B(V w) = −A^2 (V w).
Thus w = Aφ, φ = −AV w ∈ L2 (0, 2π). Thus φ in turn satisfies φ = AV Aφ and
hence is a solution of (Id +AV A)φ = 0 which we know has none (assuming V ≥ 0).
Since φ = 0, w = 0.
This completes the proof of Proposition 4.4. To summarize, what we have
shown is that Id +AV A is an invertible bounded operator on L2 (0, 2π) (if V ≥ 0)
and then the solution to (4.73) is precisely
(4.111) u = A(Id +AV A)−1 Af
which is twice continuously differentiable and satisfies the Dirichlet conditions for
each f ∈ C([0, 2π]).
This may seem a ‘round-about’ approach but it is rather typical of methods
from Functional Analysis. What we have done is to separate the two problems of
‘existence’ and ‘regularity’. We first get existence of what is often called a ‘weak
solution’ of the problem, in this case given by (4.111), which is in L2 (0, 2π) for
f ∈ L2 (0, 2π) and then show, given regularity of the right hand side f, that this is
actually a ‘classical solution’.
Even if we do not assume that V ≥ 0 we can see fairly directly what is hap-
pening.
Theorem 4.5. For any real-valued V ∈ C([0, 2π]) there is an orthonormal basis w_k of L²(0, 2π) consisting of twice continuously differentiable functions on [0, 2π], vanishing at the end-points and satisfying −d²w_k/dx² + V w_k = T_k w_k where T_k → ∞ as k → ∞. The equation (4.73) has a (twice continuously differentiable) solution for given f ∈ C([0, 2π]) if and only if
(4.112) T_k = 0 =⇒ ∫_(0,2π) f w_k = 0.
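Theorem 4.5 can be illustrated numerically. The following sketch (Python with NumPy; the discretization is an illustration only, not part of the proof) computes approximate Dirichlet eigenvalues of −d²/dx² + V on (0, 2π); with V = 0 the exact values are T_k = (k/2)², k = 1, 2, ...:

```python
import numpy as np

# Finite-difference eigenvalues of -d^2/dx^2 + V with Dirichlet conditions
# on (0, 2*pi); any continuous real V can be inserted on the diagonal.
n = 800
h = 2 * np.pi / (n + 1)
x = np.linspace(h, 2 * np.pi - h, n)
V = np.zeros(n)                                   # here V = 0 for the check
M = (np.diag(2.0 / h**2 + V)
     - np.diag(np.ones(n - 1) / h**2, 1)
     - np.diag(np.ones(n - 1) / h**2, -1))
T = np.linalg.eigvalsh(M)                         # increasing sequence T_k
print(T[:4])                                      # near 0.25, 1.0, 2.25, 4.0
```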
(4.116) −d²w/dx² + V w = 0, w(0) = w(2π) = 0 ⇐⇒ w ∈ A{v ∈ L²(0, 2π); (Id + AV A)v = 0}.
Since AV A is compact and self-adjoint we see that this null space, and hence the space of such solutions w, is finite-dimensional.
So, ultimately the solution of the differential equation (4.73) is just like the
solution of a finite dimensional problem for a self-adjoint matrix. There is a solution
if and only if the right side is orthogonal to the null space; it just requires a bit
more work. Lots of ‘elliptic’ problems turn out to be like this.
We can also say (a great deal) more about the eigenvalues T_k and eigenfunctions w_k. For instance, the derivative at the left end-point does not vanish:
(4.118) w_k′(0) ≠ 0.
Indeed, were this to vanish, w_k would be a solution of the Cauchy problem (4.50) for the second-order operator P = d²/dx² − q + T_k with ‘forcing term’ f = 0 and hence, by uniqueness for the Cauchy problem, would vanish identically, contradicting ‖w_k‖ = 1.
5. Harmonic oscillator
As a second ‘serious’ application of our Hilbert space theory I want to discuss
the harmonic oscillator and the corresponding Hermite basis for L²(R). Note that so
far we have not found an explicit orthonormal basis on the whole real line, even
though we know L2 (R) to be separable, so we certainly know that such a basis
exists. How to construct one explicitly and with some handy properties? One way
is to simply orthonormalize – using Gram-Schmidt – some countable set with dense
span. For instance consider the basic Gaussian function
(4.120) exp(−x²/2) ∈ L²(R).
This is so rapidly decreasing at infinity that the product with any polynomial is
also square integrable:
(4.121) x^k exp(−x²/2) ∈ L²(R) ∀ k ∈ N₀ = {0, 1, 2, . . . }.
Orthonormalizing this sequence gives an orthonormal basis, where completeness
can be shown by an appropriate approximation technique but as usual is not so
simple. This is in fact the Hermite basis as we will eventually show.
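The orthonormalization just described can be carried out approximately on a computer. In this sketch (Python with NumPy) the L²(R) inner product is replaced by a Riemann sum on [−10, 10], which is harmless because of the rapid decay of the Gaussian factors:

```python
import numpy as np

# Gram-Schmidt applied to x^k exp(-x^2/2), k = 0, 1, ..., with the L^2(R)
# inner product replaced by a Riemann sum (the tails beyond [-10, 10] are
# negligible).  Up to normalization the output is the Hermite basis.
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
ip = lambda f, g: np.sum(f * g) * dx

basis = []
for k in range(5):
    f = x**k * np.exp(-x**2 / 2)
    for e in basis:
        f = f - ip(f, e) * e                   # subtract the projections
    basis.append(f / np.sqrt(ip(f, f)))        # normalize

G = np.array([[ip(e, f) for f in basis] for e in basis])
err = np.max(np.abs(G - np.eye(5)))
print(err)                                      # orthonormal to quadrature error
```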
Rather than proceed directly we will work up to this by discussing the eigen-
functions of the harmonic oscillator
(4.122) P = −d²/dx² + x²
which we want to think of as an operator – although for the moment I will leave
vague the question of what it operates on.
As you probably already know, and we will show later, it is straightforward
to show that P has a lot of eigenvectors using the ‘creation’ and ‘annihilation’
operators. We want to know a bit more than this and in particular I want to
apply the abstract discussion above to this case but first let me go through the
‘formal’ theory. There is nothing wrong here, just that we cannot easily conclude
the completeness of the eigenfunctions.
The first thing to observe is that the Gaussian is an eigenfunction of P:
(4.123) P e^{−x²/2} = −(d/dx)(−x e^{−x²/2}) + x² e^{−x²/2} = −(x² − 1)e^{−x²/2} + x² e^{−x²/2} = e^{−x²/2}
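A quick numerical check of (4.123), with the second derivative replaced by a centered difference quotient (Python; the step size h is an arbitrary choice):

```python
import math

# Check of P u0 = u0 for u0(x) = exp(-x^2/2), with u0'' replaced by a
# centered difference quotient of step h.
def u0(x):
    return math.exp(-x**2 / 2)

def P_u0(x, h=1e-3):
    d2 = (u0(x + h) - 2 * u0(x) + u0(x - h)) / h**2   # approximates u0''
    return -d2 + x**2 * u0(x)

for x in (-1.3, 0.0, 0.7, 2.1):
    print(x, P_u0(x), u0(x))     # the two columns agree to about h^2
```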
¹To compute the Gaussian integral, square it and write the result as a double integral, then introduce polar coordinates:
(∫_R e^{−x²} dx)² = ∫_{R²} e^{−x²−y²} dx dy = ∫₀^{2π} ∫₀^∞ e^{−r²} r dr dθ = π[−e^{−r²}]₀^∞ = π.
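The value of the Gaussian integral in the footnote is easy to confirm numerically; here is a midpoint Riemann sum over [−8, 8], outside of which the integrand is negligible (Python):

```python
import math

# Midpoint Riemann sum for the Gaussian integral; the tails beyond [-8, 8]
# contribute less than exp(-64) and are ignored.
n, L = 20000, 8.0
dx = 2 * L / n
total = sum(math.exp(-(-L + (i + 0.5) * dx) ** 2) for i in range(n)) * dx
print(total, math.sqrt(math.pi))    # the two values agree closely
```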
For j > 0, integration by parts (easily justified by taking the integral over [−R, R]
and then letting R → ∞) gives
(4.131) ∫_R (Cr^j u_0)² = ∫_R Cr^j u_0(x) Cr^j u_0(x) dx = ∫_R u_0 An^j Cr^j u_0.
Now, from (4.126), we can move one factor of An through the j factors of Cr until
it emerges and ‘kills’ u0
6. Fourier transform
The Fourier transform for functions on R is in a certain sense the limit of the
definition of the coefficients of the Fourier series on an expanding interval, although
that is not generally a good way to approach it. We know that if u ∈ L1 (R) and
v ∈ C∞ (R) is a bounded continuous function then vu ∈ L1 (R) – this follows from
our original definition by approximation. So if u ∈ L1 (R) the integral
(4.137) û(ξ) = ∫ e^{−ixξ} u(x) dx, ξ ∈ R
exists for each ξ ∈ R as a Lebesgue integral. Note that there are many different
normalizations of the Fourier transform in use. This is the standard ‘analyst’s’
normalization.
Proposition 4.6. The Fourier transform, (4.137), defines a bounded linear map
(4.138) F : L¹(R) ∋ u ↦ û ∈ C₀(R)
into the closed subspace C0 (R) ⊂ C∞ (R) of continuous functions which vanish at
infinity (with respect to the supremum norm).
Proof. We know that the integral exists for each ξ and from the basic properties of the Lebesgue integral
(4.139) |û(ξ)| ≤ ‖u‖_{L¹}, since |e^{−ixξ} u(x)| = |u(x)|.
To investigate its properties we restrict to u ∈ Cc(R), a compactly-supported continuous function, say with support in [−R, R]. Then the integral becomes a
Riemann integral and the integrand is a continuous function of both variables. It
follows that the Fourier transform is uniformly continuous since
(4.140) |û(ξ) − û(ξ′)| ≤ ∫ |e^{−ixξ} − e^{−ixξ′}| |u(x)| dx ≤ 2R sup_{|x|≤R} |u| sup_{|x|≤R} |e^{−ixξ} − e^{−ixξ′}|
with the right side small by the uniform continuity of continuous functions on com-
pact sets. From (4.139), if un → u in L1 (R) with un ∈ Cc (R) it follows that ûn → û
uniformly on R. Thus, as the uniform limit of uniformly continuous functions, the
Fourier transform is uniformly continuous on R for any u ∈ L1 (R) (you can also
see this from the continuity-in-the-mean of L1 functions).
Now, we know that even the compactly-supported once continuously differen-
tiable functions, forming Cc1 (R) are dense in L1 (R) so we can also consider (4.137)
where u ∈ Cc1 (R). Then the integration by parts as follows is justified
(4.141) ξû(ξ) = i ∫ (de^{−ixξ}/dx) u(x) dx = −i ∫ e^{−ixξ} (du(x)/dx) dx.
Since du/dx ∈ Cc (R) (by assumption) the estimate (4.139) now gives
(4.142) sup_{ξ∈R} |ξû(ξ)| ≤ 2R sup_{x∈R} |du/dx|.
This certainly implies the weaker statement that
(4.143) lim_{|ξ|→∞} |û(ξ)| = 0
which is ‘vanishing at infinity’. Now we again use the density, this time of Cc1 (R),
in L1 (R) and the uniform estimate (4.139), plus the fact that if a sequence of
continuous functions on R converges uniformly on R and each element vanishes at
infinity then the limit vanishes at infinity to complete the proof of the Proposition.
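The conclusion of Proposition 4.6 can be seen concretely for the characteristic function of [0, 1], an L¹ function whose transform has a closed form; the sketch below (Python) also checks that closed form against a Riemann sum:

```python
import cmath

# Riemann-Lebesgue illustration: for u the characteristic function of [0, 1],
# u^(xi) = (1 - exp(-i*xi)) / (i*xi), so |u^(xi)| <= 2/|xi| -> 0 at infinity.
def uhat(xi):
    return (1 - cmath.exp(-1j * xi)) / (1j * xi)

# sanity check of the closed form against a midpoint Riemann sum at xi = 3
n = 1000
direct = sum(cmath.exp(-1j * 3.0 * (i + 0.5) / n) for i in range(n)) / n
print(abs(direct - uhat(3.0)))      # quadrature error only

for xi in (10.0, 100.0, 1000.0):
    print(xi, abs(uhat(xi)))        # decays, though only like 1/|xi|
```

The slow 1/|ξ| decay reflects the jump discontinuities of u; the smoother u is, the faster û decays, as (4.142) already indicates.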
7. Fourier inversion
We could use the completeness of the orthonormal sequence of eigenfunctions for the harmonic oscillator discussed above to show that the Fourier transform extends by continuity from Cc(R) to define an isomorphism
(4.144) F : L2 (R) −→ L2 (R)
with inverse given by the corresponding continuous extension of
(4.145) Gv(x) = (2π)⁻¹ ∫ e^{ixξ} v(ξ) dξ.
Instead, we will give a direct proof of the Fourier inversion formula, via Schwartz
space and an elegant argument due to Hörmander. Then we will use this to prove
the completeness of the eigenfunctions we have found.
It follows that as s → 0 (along a sequence if you prefer) D(x, s)e−ixξ u(x) is bounded
by the L1 (R) function |x||u(x)| and converges pointwise to −ie−ixξ xu(x). Domi-
nated convergence therefore shows that the integral converges showing that the
derivative exists and that
(4.149) dû(ξ)/dξ = F(−ixu).
From the earlier results it follows that the derivative is continuous and bounded,
proving the lemma.
Now, we can iterate this result and so conclude:
(4.150) (1 + |x|)^k u ∈ L¹(R) ∀ k =⇒ û is infinitely differentiable with bounded derivatives and d^k û/dξ^k = F((−ix)^k u).
This result shows that from ‘decay’ of u we deduce smoothness of û. We can
go the other way too. One way to ensure the assumption in (4.150) is to make the
stronger assumption that
(4.151) x^k u is bounded and continuous ∀ k.
Indeed, Dominated Convergence shows that if u is continuous and satisfies the
bound
|u(x)| ≤ (1 + |x|)^{−r}, r > 1,
then u ∈ L¹(R). So the integrability of x^j u follows from the bounds in (4.151) for
is a complete metric. We will not use this here but it is the right way to understand
what is going on.
Notice that there is some prejudice on the order of multiplication by x and dif-
ferentiation in (4.156). This is only apparent, since these estimates (taken together)
are equivalent to
(4.158) sup |d^k(x^j u)/dx^k| < ∞ ∀ j, k ≥ 0.
To see the equivalence we can use induction over N where the inductive statement
is the equivalence of (4.156) and (4.158) for j + k ≤ N. Certainly this is true for
N = 0 and to carry out the inductive step just differentiate out the product to see
that
d^k(x^j u)/dx^k = x^j d^k u/dx^k + Σ_{l+m<k+j} c_{l,m,k,j} x^m d^l u/dx^l
where one can be much more precise about the extra terms, but the important
thing is that they all are lower order (in fact both degrees go down). If you want to
be careful, you can of course prove this identity by induction too! The equivalence
of (4.156) and (4.158) for N + 1 now follows from that for N.
Theorem 4.6. The Fourier transform restricts to a bijection on S(R) with
inverse
(4.159) G(v)(x) = (1/2π) ∫ e^{ixξ} v(ξ) dξ.
Proof. The proof (due to Hörmander as I said above) will take a little while
because we need to do some computation, but I hope you will see that it is quite
clear and elementary.
First we need to check that F : S(R) −→ S(R), but this is what I just did the
preparation for. Namely the estimates (4.156) imply that (4.155) applies to all the
d^k(x^j u)/dx^k
and so
(4.160) ξ^k d^j û/dξ^j is continuous and bounded ∀ k, j =⇒ û ∈ S(R).
This indeed is why Schwartz introduced this space.
So, what we want to show is that with G defined by (4.159), u = G(û) for all
u ∈ S(R). Notice that there is only a sign change and a constant factor to get from
F to G so certainly G : S(R) −→ S(R). We start off with what looks like a small
part of this. Namely we want to show that
(4.161) I(û) = ∫ û = 2πu(0).
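The identity (4.161) can be tested numerically for the Gaussian u(x) = exp(−x²/2), computing û by a Riemann sum and then integrating it; the result should be 2πu(0) = 2π (Python with NumPy; the grid parameters are arbitrary choices):

```python
import numpy as np

# Check of I(u^) = 2*pi*u(0) for u(x) = exp(-x^2/2): compute u^ on a grid by
# a Riemann sum, then integrate it over the same range of frequencies.
x = np.linspace(-10, 10, 2001)
xi = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]
u = np.exp(-x**2 / 2)
uhat = np.array([np.sum(np.exp(-1j * x * s) * u) * dx for s in xi])
I = np.sum(uhat) * (xi[1] - xi[0])
print(I.real, 2 * np.pi)    # agree; the imaginary part is roundoff
```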
which gives (4.167). The computation of the integral in (4.169) is a standard clever
argument which you probably know. Namely take the square and work in polar
coordinates in two variables:
(4.170) (∫ γ)² = ∫_{R²} e^{−(x²+y²)/2} dx dy = ∫₀^{2π} ∫₀^∞ e^{−r²/2} r dr dθ = 2π[−e^{−r²/2}]₀^∞ = 2π.
So, finally we need to get from (4.161) to the inversion formula. Changing
variable in the Fourier transform we can see that for any y ∈ R, setting uy (x) =
u(x + y), which is in S(R) if u ∈ S(R),
(4.171) F(u_y)(ξ) = ∫ e^{−ixξ} u_y(x) dx = ∫ e^{−i(s−y)ξ} u(s) ds = e^{iyξ} û(ξ).
8. Convolution
There is a discussion of convolution later in the notes; I have inserted a new (but not very different) treatment here to cover the density of S(R) in L²(R) needed in the next section.
Consider two continuous functions of compact support u, v ∈ Cc (R). Their
convolution is
(4.173) u ∗ v(x) = ∫ u(x − y)v(y) dy = ∫ u(y)v(x − y) dy.
The first integral is the definition; it is clearly a well-defined Riemann integral since the integrand is continuous as a function of y and vanishes wherever v(y) vanishes, so has compact support. In fact if both u and v vanish outside [−R, R] then
u ∗ v = 0 outside [−2R, 2R].
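The support statement can be illustrated with a discrete (Riemann-sum) convolution (Python with NumPy; the sampled functions and grid are arbitrary choices):

```python
import numpy as np

# Sampled functions vanishing outside [-1, 1] and [-1/2, 1/2]; their
# Riemann-sum convolution vanishes outside the sum of the supports,
# here [-3/2, 3/2], well inside [-2, 2].
dx = 0.01
x = np.arange(-100, 101) * dx                 # grid for [-1, 1]
u = np.maximum(0.0, 1 - np.abs(x))            # hat function, support [-1, 1]
v = np.where(np.abs(x) <= 0.5, 1.0, 0.0)      # support [-1/2, 1/2]
w = np.convolve(u, v) * dx                    # samples of u*v on [-2, 2]
xw = np.arange(-200, 201) * dx
outside = np.max(np.abs(w[np.abs(xw) > 1.6]))
print(len(w) == len(xw), outside, np.max(w))  # aligned grids; zero tails
```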
From standard properties of the Riemann integral (or Dominated convergence
if you prefer!) it follows easily that u∗v is continuous. What we need to understand
is what happens if (at least) one of u or v is smoother. In fact we will want to take
a very smooth function, so I pause here to point out
Lemma 4.10. There exists a (‘bump’) function ψ : R −→ R which is infinitely
differentiable, i.e. has continuous derivatives of all orders, vanishes outside [−1, 1],
is strictly positive on (−1, 1) and has integral 1.
Proof. We start with an explicit function,
(4.174) φ(x) = e^{−1/x} for x > 0, and φ(x) = 0 for x ≤ 0.
The exponential function grows faster than any polynomial at +∞, since
(4.175) exp(x) > x^k/k! in x > 0 ∀ k.
This can be seen directly from the Taylor series which converges on the whole line
(indeed in the whole complex plane)
exp(x) = Σ_{k≥0} x^k/k!.
Here U is the derivative in x > 0. Taking the limit as ε ↓ 0 both sides converge, and then we see that
φ(x) = ∫₀^x U(t) dt.
From this it follows that φ is continuously differentiable across 0 and its derivative is U, the continuous extension of the derivative from x > 0. The same argument applies to successive derivatives, so indeed φ is infinitely differentiable.
From φ we can construct a function closer to the desired bump function.
Namely
Φ(x) = φ(x + 1)φ(1 − x).
The first factor vanishes when x ≤ −1 and is otherwise positive while the second
vanishes when x ≥ 1 but is otherwise positive, so the product is infinitely differ-
entiable on R and positive on (−1, 1) but otherwise 0. Then we can normalize the
integral to 1 by taking
(4.180) ψ(x) = Φ(x)/∫ Φ.
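The construction in the proof of Lemma 4.10 translates directly into code; the sketch below (Python) builds φ, Φ and ψ exactly as above, normalizing by a Riemann sum:

```python
import math

# The bump function of Lemma 4.10, built exactly as in the proof.
def phi(x):
    return math.exp(-1.0 / x) if x > 0 else 0.0

def Phi(x):
    return phi(x + 1) * phi(1 - x)        # positive on (-1, 1), zero outside

# normalize by a Riemann sum for the integral of Phi
n = 20000
dx = 2.0 / n
Z = sum(Phi(-1 + (i + 0.5) * dx) for i in range(n)) * dx

def psi(x):
    return Phi(x) / Z

integral = sum(psi(-1 + (i + 0.5) * dx) for i in range(n)) * dx
print(psi(0.0) > 0, psi(1.0), psi(-1.0), integral)   # positive inside, 0 at ends, integral 1
```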
In particular from Lemma 4.10 we conclude that the space Cc∞ (R),
of infinitely
differentiable functions of compact support, is not empty. Going back to convolution
in (4.173) suppose now that v is smooth. Then
(4.181) u ∈ Cc (R), v ∈ Cc∞ (R) =⇒ u ∗ v ∈ Cc∞ (R).
As usual this follows from properties of the Riemann integral or by looking directly
at the difference quotient
(u ∗ v(x + t) − u ∗ v(x))/t = ∫ u(y) (v(x + t − y) − v(x − y))/t dy.
As t → 0, the difference quotient for v converges uniformly (in y) to the derivative
and hence the integral converges and the derivative of the convolution exists,
(4.182) (d/dx)(u ∗ v)(x) = u ∗ (dv/dx).
This result allows immediate iteration, showing that the convolution is smooth, and we know that it has compact support.
Proposition 4.7. For any u ∈ Cc (R) there exists un → u uniformly on R
where un ∈ Cc∞ (R) with supports in a fixed compact set.
Proof. For each ε > 0 consider the rescaled bump function
(4.183) ψ_ε(x) = ε⁻¹ ψ(x/ε) ∈ C_c^∞(R).
In fact, ψ_ε vanishes outside the interval (−ε, ε), is positive within this interval and has integral 1 – which is what the factor of ε⁻¹ does. Now set
(4.184) u_ε = u ∗ ψ_ε ∈ C_c^∞(R), ε > 0,
from what we have just seen. From the supports of these functions, u_ε vanishes outside [−R−ε, R+ε] if u vanishes outside [−R, R]. So only the convergence remains.
To get this we use the fact that the integral of ψ_ε is equal to 1 to write
(4.185) u_ε(x) − u(x) = ∫ (u(x − y)ψ_ε(y) − u(x)ψ_ε(y)) dy.
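The convergence asserted in Proposition 4.7 can be observed numerically; the sketch below (Python with NumPy) mollifies the hat function with a rescaled bump, using the standard exp(1/(z² − 1)) profile rather than the particular ψ of Lemma 4.10, and prints the sup error for two values of ε:

```python
import numpy as np

# Mollify the (continuous, compactly supported) hat function with psi_eps and
# watch the sup error shrink with eps.  The bump here is exp(1/((x/eps)^2 - 1)),
# normalized on the grid so that its Riemann sum is 1.
dx = 0.005
x = np.arange(-600, 601) * dx                     # grid for [-3, 3]
u = np.maximum(0.0, 1 - np.abs(x))                # hat function

def sup_error(eps):
    m = int(round(eps / dx))
    s = np.arange(-m + 1, m) * dx                 # interior of (-eps, eps)
    psi = np.exp(1.0 / ((s / eps)**2 - 1))        # bump, zero outside
    psi = psi / (np.sum(psi) * dx)                # integral 1 on the grid
    return np.max(np.abs(np.convolve(u, psi, mode='same') * dx - u))

e_big, e_small = sup_error(0.4), sup_error(0.05)
print(e_big, e_small)                             # the error decreases with eps
```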
Corollary 4.1. The spaces Cc∞ (R) and S(R) are dense in L2 (R).
Uniform convergence of continuous functions with support in a fixed subset is stronger than L² convergence, so the result follows from the Proposition above for C_c^∞(R) ⊂ S(R).
Since the integrals are rapidly convergent at infinity we may substitute the definition of the Fourier transform into (4.188), write the result out as a double integral and change the order of integration
(4.189) ∫ u(x)v̂(x) dx = ∫ u(x) ∫ e^{−ixξ} v(ξ) dξ dx = ∫ v(ξ) ∫ e^{−ixξ} u(x) dx dξ = ∫ û(ξ)v(ξ) dξ.
(2) We say that u ∈ L²(R) has a strong derivative (in the L² sense) if the limit
(4.195) lim_{0≠s→0} (u(x + s) − u(x))/s = ṽ exists in L²(R).
(3) Thirdly, we say that u ∈ L²(R) has a weak derivative in L² if there exists w ∈ L²(R) such that
(4.196) (u, −df/dx)_{L²} = (w, f)_{L²} ∀ f ∈ C_c¹(R).
In all cases, we will see that it is justified to write v = ṽ = w = du/dx because these definitions turn out to be equivalent. Of course if u ∈ C_c¹(R) then u is differentiable in each sense and the derivative is always du/dx – note that the integration by parts used to prove (4.196) is justified in that case. In fact we are most interested in the first and third of these definitions; the first two are both called ‘strong derivatives.’
It is easy to see that the existence of a Sobolev derivative implies that this
is also a weak derivative. Indeed, since φn , the approximating sequence whose
existence is the definition of the Sobolev derivative, is in Cc1 (R) the integration by
parts implicit in (4.196) is valid and so for all f ∈ Cc1 (R),
(4.197) (φ_n, −df/dx)_{L²} = (φ_n′, f)_{L²}.
Since φ_n → u in L² and φ_n′ → v in L² both sides of (4.197) converge to give the identity (4.196).
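The integration by parts underlying (4.196) and (4.197) is easy to check by quadrature for explicit smooth, rapidly decreasing functions (Python with NumPy; the particular u and f are arbitrary choices):

```python
import numpy as np

# Quadrature check of the pairing (u, -f') = (u', f) when the boundary
# terms vanish; here u and f decay rapidly so [-10, 10] suffices.
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
u = np.exp(-x**2)                            # plays the role of phi_n
up = -2 * x * np.exp(-x**2)                  # its exact derivative
f = np.sin(x) * np.exp(-x**2 / 4)            # a test function
fp = (np.cos(x) - x * np.sin(x) / 2) * np.exp(-x**2 / 4)
lhs = np.sum(u * (-fp)) * dx
rhs = np.sum(up * f) * dx
print(lhs, rhs)                               # equal up to quadrature error
```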
Before proceeding to the rest of the equivalence of these definitions we need
to do some preparation. First let us investigate a little the consequence of the
existence of a Sobolev derivative.
Lemma 4.11. If u ∈ L2 (R) has a Sobolev derivative then u ∈ C(R) and there
exists a uniquely defined element w ∈ L2 (R) such that
(4.198) u(x) − u(y) = ∫_y^x w(s) ds ∀ y ≤ x ∈ R.
interval. This actually shows that the limit φn (x̄) must exist for each fixed x̄. In
fact we can always choose ψ to be constant near a particular point and apply this
argument to see that
That is, the limit exists locally uniformly, hence represents a continuous function
but that continuous function must be equal to the original u almost everywhere
(since ψφn → ψu in L2 ).
Thus in fact we conclude that ‘u ∈ C(R)’ (which really means that u has a
representative which is continuous). Not only that but we get (4.198) from passing
to the limit on both sides of
(4.203) u(x) − u(y) = lim_{n→∞} (φ_n(x) − φ_n(y)) = lim_{n→∞} ∫_y^x φ_n′(s) ds = ∫_y^x w(s) ds.
Indeed, if w1 and w2 are both Sobolev derivatives then (4.198) holds for both of
them, which means that w2 − w1 has vanishing integral on any finite interval and
we know that this implies that w2 = w1 a.e.
So at least for Sobolev derivatives we are now justified in writing
(4.205) w = du/dx
since w is unique and behaves like a derivative in the integral sense that (4.198)
holds.
Lemma 4.12. If u has a Sobolev derivative then u has a strong derivative and
if u has a strong derivative then this is also a weak derivative.
Proof. If u has a Sobolev derivative then (3.17) holds. We can use this to
write the difference quotient as
(4.206) (u(x + s) − u(x))/s − w(x) = (1/s) ∫₀^s (w(x + t) − w(x)) dt
since the integral in the second term can be carried out. Using this formula twice
the square of the L² norm, which is finite, is
(4.207) ‖(u(x + s) − u(x))/s − w(x)‖²_{L²} = (1/s²) ∫ ∫₀^s ∫₀^s (w(x + t) − w(x))(w(x + t′) − w(x)) dt dt′ dx.
There is a small issue of manipulating the integrals, but we can always ‘back off a little’ and replace u by the approximating sequence φ_n and then everything is fine – and we only have to check what happens at the end. Now, we can apply the Cauchy-Schwarz inequality as a triple integral. The two factors turn out to be the
Applying this to (4.208) and then estimating the t integral shows that
(4.210) (u(x + s) − u(x))/s − w(x) → 0 in L²(R) as s → 0.
By definition this means that u has w as a strong derivative. I leave it up to you
to make sure that the manipulation of integrals is okay.
So, now suppose that u has a strong derivative, ṽ. Observe that if f ∈ Cc1 (R)
then the limit defining the derivative
(4.211) lim_{0≠s→0} (f(x + s) − f(x))/s = f′(x)
is uniform. In fact this follows by writing down the Fundamental Theorem of
Calculus, as in (4.198), again using the properties of Riemann integrals. Now,
consider
(4.212) (u(x), (f(x + s) − f(x))/s)_{L²} = (1/s) ∫ u(x)f(x + s) dx − (1/s) ∫ u(x)f(x) dx = ((u(x − s) − u(x))/s, f(x))_{L²}
where we just need to change the variable of integration in the first integral from
x to x + s. However, letting s → 0 the left side converges because of the uniform
convergence of the difference quotient and the right side converges because of the
assumed strong differentiability and as a result (noting that the parameter on the
right is really −s)
(4.213) (u, df/dx)_{L²} = −(ṽ, f)_{L²} ∀ f ∈ C_c¹(R)
which is weak differentiability with derivative ṽ.
So, at this point we know that Sobolev differentiability implies strong differentiability and either of the strong ones implies the weak. So it remains only to show
that weak differentiability implies Sobolev differentiability and we can forget about
the difference!
Before doing that, note again that a weak derivative, if it exists, is unique –
since the difference of two would have to pair to zero in L2 with all of Cc1 (R) which
is dense. Similarly, if u has a weak derivative then so does ψu for any ψ ∈ Cc1 (R)
since we can just move ψ around in the integrals and see that
(4.214) (ψu, −df/dx) = (u, −ψ df/dx) = (u, −d(ψf)/dx) + (u, ψ′f) = (w, ψf) + (ψ′u, f) = (ψw + ψ′u, f)
which also proves that the product formula holds for weak derivatives.
So, let us consider u ∈ L2c (R) which does have a weak derivative. To show that
it has a Sobolev derivative we need to construct a sequence φn . We will do this by
convolution.
Lemma 4.13. If µ ∈ C_c(R) then for any u ∈ L²_c(R),
(4.215) µ ∗ u(x) = ∫ µ(x − s)u(s) ds ∈ C_c(R).
One of the key properties of these convolution integrals is that we can examine what happens when we ‘concentrate’ µ. Replace the one µ by the family
(4.220) µ_ε(x) = ε⁻¹ µ(x/ε), ε > 0.
The singular factor here is introduced so that ∫ µ_ε is independent of ε > 0,
(4.221) ∫ µ_ε = ∫ µ ∀ ε > 0.
Note that since µ has compact support, the support of µ_ε is concentrated in |x| ≤ εR for some fixed R.
In fact there is no need to assume that u has compact support for this to work.
Proof. First we can change the variable of integration in the definition of the convolution and write it instead as
(4.223) µ ∗ u(x) = ∫ µ(s)u(x − s) ds.
Now, the rest is similar to one of the arguments above. First write out the difference we want to examine as
(4.224) µ_ε ∗ u(x) − (∫ µ) u(x) = ∫_{|s|≤εR} µ_ε(s)(u(x − s) − u(x)) ds.
Write out the square of the absolute value using the formula twice and we find that
(4.225) ∫ |µ_ε ∗ u(x) − (∫ µ) u(x)|² dx = ∫ ∫_{|s|≤εR} ∫_{|t|≤εR} µ_ε(s)µ_ε(t)(u(x − s) − u(x))(u(x − t) − u(x)) ds dt dx.
Now we can write the integrand as the product of two similar factors, one being
(4.226) µ_ε(s)^{1/2} µ_ε(t)^{1/2} (u(x − s) − u(x))
using the non-negativity of µ. Applying the Cauchy-Schwarz inequality to this we
get two factors, which are again the same after relabelling variables, so
(4.227) ∫ |µ_ε ∗ u(x) − (∫ µ) u(x)|² dx ≤ ∫ ∫_{|s|≤εR} ∫_{|t|≤εR} µ_ε(s)µ_ε(t)|u(x − s) − u(x)|² ds dt dx.
The integral in x can be carried out first, and then using continuity-in-the-mean it is bounded by J(s) → 0 as ε → 0 since |s| ≤ εR. This leaves
(4.228) ∫ |µ_ε ∗ u(x) − (∫ µ) u(x)|² dx ≤ sup_{|s|≤εR} J(s) ∫_{|s|≤εR} ∫_{|t|≤εR} µ_ε(s)µ_ε(t) = (∫ µ)² sup_{|s|≤εR} J(s) → 0.
After all this preliminary work we are in a position to prove the remaining part of ‘weak=strong’.
Lemma 4.15. If u ∈ L2 (R) has w as a weak L2 -derivative then w is also the
Sobolev derivative of u.
Proof. Let’s assume first that u has compact support, so we can use the discussion above. Then set φ_n = µ_{1/n} ∗ u where µ ∈ C_c¹(R) is chosen to be non-negative and have integral ∫ µ = 1; µ_ε is defined in (4.220). Now from Lemma 4.14 it follows that φ_n → u in L²(R). Also, from Lemma 4.13, φ_n ∈ C_c¹(R) has derivative given by (4.216). This formula can be written as a pairing in L²:
(4.229) (µ_{1/n})′ ∗ u(x) = (u(s), −dµ_{1/n}(x − s)/ds)_{L²} = (w(s), µ_{1/n}(x − s))_{L²}
using the definition of the weak derivative of u. It therefore follows from Lemma 4.14 applied again that
(4.230) φ_n′ = µ_{1/n} ∗ w → w in L²(R).
Thus indeed, φn is an approximating sequence showing that w is the Sobolev de-
rivative of u.
In the general case that u ∈ L²(R) has a weak derivative but is not necessarily compactly supported, consider a function γ ∈ C_c¹(R) with γ(0) = 1 and consider the sequence v_m = γ(x/m)u(x) in L²(R), each element of which has compact support. Moreover, γ(x/m) → 1 for each x so by Lebesgue dominated convergence, v_m → u in L²(R) as m → ∞. As shown above, v_m has as weak derivative
(d/dx)(γ(x/m)) u + γ(x/m)w = (1/m) γ′(x/m)u + γ(x/m)w → w
as m → ∞, by the same argument applied to the second term and the fact that the first converges to 0 in L²(R). Now, the approximating sequence µ_{1/n} ∗ v_m discussed above converges to v_m, with its derivative converging to the weak derivative of v_m. Taking n = N(m) sufficiently large for each m ensures that φ_m = µ_{1/N(m)} ∗ v_m converges to u and its sequence of derivatives converges to w in L². Thus the weak derivative is again a Sobolev derivative.
Finally then we see that the three definitions are equivalent and we will freely
denote the Sobolev/strong/weak derivative as du/dx or u0 .
That they are pre-Hilbert spaces is clear enough. Completeness is also easy, given that we know the completeness of L²(R). Namely, if u_n is Cauchy in H^s(R) then it follows from the fact that
(4.233) ‖v‖_{L²} ≤ C‖v‖_s ∀ v ∈ H^s(R)
that u_n is Cauchy in L² and also that (1 + |ξ|²)^{s/2} û_n(ξ) is Cauchy in L². Both therefore converge to a limit u in L² and the continuity of the Fourier transform shows that u ∈ H^s(R) and that u_n → u in H^s.
These spaces are examples of what is discussed above where we have a dense inclusion of one Hilbert space in another, H^s(R) −→ L²(R). In this case the inclusion is not compact but it does give rise to a bounded self-adjoint operator on L²(R), E_s : L²(R) −→ H^s(R) ⊂ L²(R), such that
(4.234) (u, v)_{L²} = (E_s u, E_s v)_{H^s}.
It is reasonable to denote this as E_s = (1 + |D_x|²)^{−s/2} since
(4.235) u ∈ L²(R) =⇒ (E_s u)ˆ(ξ) = (1 + |ξ|²)^{−s/2} û(ξ).
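As an illustration of the Fourier multiplier description (4.235), here is a discrete, periodized stand-in for E_s built with the FFT on a finite grid (Python with NumPy; the grid and the test function are arbitrary, and this is of course not the operator on L²(R) itself):

```python
import numpy as np

# Discrete analogue of E_s = (1 + |D_x|^2)^{-s/2}: multiply the FFT of u by
# (1 + xi^2)^{-s/2} on the corresponding frequency grid and invert.
def Es(u, s, L=20.0):
    n = len(u)
    xi = 2 * np.pi * np.fft.fftfreq(n, d=L / n)       # frequency grid
    return np.fft.ifft((1 + xi**2) ** (-s / 2) * np.fft.fft(u)).real

x = np.linspace(-10, 10, 1024, endpoint=False)
u = np.exp(-x**2)
id_err = np.max(np.abs(Es(u, 0.0) - u))               # s = 0 is the identity
norm_ratio = np.linalg.norm(Es(u, 2.0)) / np.linalg.norm(u)
print(id_err, norm_ratio)     # ~0, and < 1 since the multiplier has modulus <= 1
```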
If such a v ∈ L²(R) exists then it is unique – since the difference of two such functions would have to have integral zero over any finite interval and we know (from one of the exercises) that this implies that the function vanishes a.e.
One of the more important results about Sobolev spaces – of which there are
many – is the relationship between these ‘L2 derivatives’ and ‘true derivatives’.
Theorem 4.7 (Sobolev embedding). If n is an integer and s > n + 1/2 then
(4.239) H^s(R) ⊂ C^n_∞(R)
consists of n times continuously differentiable functions with bounded derivatives to order n (which also vanish at infinity).
This is actually not so hard to prove, there are some hints in the exercises below.
These are not the only sort of spaces with ‘more regularity’ one can define
and use. For instance one can try to treat x and ξ more symmetrically and define
smaller spaces than the H s above by setting
(4.240) H^s_iso(R) = {u ∈ L²(R); (1 + |ξ|²)^{s/2} û ∈ L²(R), (1 + |x|²)^{s/2} u ∈ L²(R)}.
The ‘obvious’ inner product with respect to which these ‘isotropic’ Sobolev spaces H^s_iso(R) are indeed Hilbert spaces is
(4.241) (u, v)_{s,iso} = ∫_R uv + ∫_R |x|^{2s} uv + ∫_R |ξ|^{2s} ûv̂
which makes them look rather symmetric between u and û and indeed
(4.242) F : H^s_iso(R) −→ H^s_iso(R) is an isomorphism ∀ s ≥ 0.
At this point, by dint of a little, only moderately hard, work, it is possible to
show that the harmonic oscillator extends by continuity to an isomorphism
(4.243) H : H^{s+2}_iso(R) −→ H^s_iso(R) ∀ s ≥ 2.
Finally in this general vein, I wanted to point out that Hilbert, and even Ba-
nach, spaces are not the end of the road! One very important space in relation to
a direct treatment of the Fourier transform, is the Schwartz space. The definition
is reasonably simple. Namely we denote Schwartz space by S(R) and say
u ∈ S(R) ⇐⇒ u : R −→ C is continuously differentiable of all orders and for every n,
(4.244) ‖u‖_n = Σ_{k+p≤n} sup_{x∈R} (1 + |x|)^k |d^p u/dx^p| < ∞.
All these inequalities just mean that all the derivatives of u are ‘rapidly decreasing
at ∞’ in the sense that they stay bounded when multiplied by any polynomial.
So in fact we know already that S(R) is not empty since the elements of the
Hermite basis, ej ∈ S(R) for all j. In fact it follows immediately from this that
(4.245) S(R) −→ L2 (R) is dense.
If you want to try your hand at something a little challenging, see if you can check
that
(4.246) S(R) = ∩_{s>0} H^s_iso(R)
as you can check. So the claim is that S(R) is complete as a metric space – such a
thing is called a Fréchet space.
What has this got to do with the Fourier transform? The point is that
(4.248) F : S(R) −→ S(R) is an isomorphism and F(du/dx) = iξ F(u), F(xu) = −i dF(u)/dξ
where this now makes sense. The dual space of S(R) – the space of continuous
linear functionals on it, is the space, denoted S 0 (R), of tempered distributions on
R.
Indeed, this amounts to showing that kφkL2 is a continuous norm on S(R) (so it
must be bounded by a multiple of one of the kφkN , which one?)
It is relatively straightforward to show that L²(R) ∋ f ↦ T_f ∈ S′(R) is injective – nothing is ‘lost’. So after a little more experience with distributions one comes to identify f and T_f. Notice that this is just an extension of the behaviour of L²(R) where (because we can drop the complex conjugate in the inner product) by Riesz’ Theorem we can identify (linearly) L²(R) with its dual, exactly by the map f ↦ T_f.
Other elements of S 0 (R) include the delta ‘function’ at the origin and even its
‘derivatives’ for each j
(4.252) δ_j : S(R) ∋ φ ↦ (−1)^j (d^j φ/dx^j)(0) ∈ C.
In fact one of the main points about the space S 0 (R) is that differentiation and
multiplication by polynomials is well defined
(4.253) d/dx : S′(R) −→ S′(R), ×x : S′(R) −→ S′(R)
in a way that is consistent with their actions under the identification S(R) : φ 7−→
Tφ ∈ S 0 (R). This property is enjoyed by other spaces of distributions but the
This shows that the map A, clearly linear, is well-defined. Now, how to see
that it is surjective? Let’s first prove a special case. Indeed, look for a function
ψ ∈ C_c^∞(R) ⊂ S(R) which is non-negative and such that Aψ = 1. We know that we can find φ ∈ C_c^∞(R), φ ≥ 0, with φ > 0 on [0, 2π]. Then consider Aφ ∈ C^∞(T). It must be strictly positive, Aφ ≥ ε > 0, since it is larger than φ. So consider instead the function
(4.261) ψ = φ/Aφ ∈ C_c^∞(R)
where we think of Aφ as 2π-periodic on R. In fact using this periodicity we see that
(4.262) Aψ ≡ 1.
So this shows that the constant function 1 is in the range of A. In general, just
take g ∈ C^∞(T), thought of as 2π-periodic on the line, and it follows that
(4.263) f = Bg = ψg ∈ C_c^∞(R) ⊂ S(R) satisfies Af = g.
Indeed,
(4.264) A(ψg) = Σ_k ψ(x − 2πk)g(x − 2πk) = g(x) Σ_k ψ(x − 2πk) = g
using the periodicity of g. In fact B is a right inverse for A,
(4.265) AB = Id on C ∞ (T).
Question 2. What is the null space of A?
Since f ∈ S(R) and Af ∈ C ∞ (T) ⊂ L2 (0, 2π) with our identifications above,
the question arises as to the relationship between the Fourier transform of f and
the Fourier series of Af.
Proposition 4.10 (Poisson summation formula). If g = Af, g ∈ C ∞ (T) and
f ∈ S(R) then the Fourier coefficients of g are
(4.266) c_k = ∫_{[0,2π]} g e^{−ikx} = f̂(k).
Proof. Just substitute in the formula for g and, using uniform convergence, check that the sum of the integrals gives, after translation, the Fourier transform of f.
If we think of recovering g from its Fourier series,
(4.267) g(x) = (1/2π) Σ_{k∈Z} c_k e^{ikx} = (1/2π) Σ_{k∈Z} f̂(k) e^{ikx}
then in terms of the Fourier transform on S′(R) alluded to above, this takes the rather elegant form
(4.268) F((1/2π) Σ_{k∈Z} δ(· − k))(x) = (1/2π) Σ_{k∈Z} e^{ikx} = Σ_{k∈Z} δ(x − 2πk).
The sums of translated Dirac deltas and oscillating exponentials all make sense in
S 0 (R).
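The formula can be checked numerically for the Gaussian f(x) = exp(−x²/2), for which f̂(ξ) = √(2π) exp(−ξ²/2) in the normalization (4.137); evaluating (4.267) at x = 0 gives Σ_k f(2πk) = (1/2π) Σ_k f̂(k) (Python):

```python
import math

# Poisson summation check at x = 0 for f(x) = exp(-x^2/2):
#   sum_k f(2*pi*k)  =  (1/(2*pi)) * sum_k f^(k),
# using the closed-form transform f^(xi) = sqrt(2*pi)*exp(-xi^2/2).
K = 50
lhs = sum(math.exp(-(2 * math.pi * k) ** 2 / 2) for k in range(-K, K + 1))
rhs = sum(math.sqrt(2 * math.pi) * math.exp(-k**2 / 2)
          for k in range(-K, K + 1)) / (2 * math.pi)
print(lhs, rhs)    # both sides are extremely close to 1
```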
APPENDIX A
Problems for Chapter 1
1. For §1
Problem 1.1. In case you are a bit shaky on it, go through the basic theory of
finite-dimensional vector spaces. Define a vector space V to be finite-dimensional
if there is an integer N such that any N elements of V are linearly dependent – if
v_i ∈ V for i = 1, . . . , N, then there exist a_i ∈ K, not all zero, such that
(A.1) Σ_{i=1}^N a_i v_i = 0 in V.
If N is the smallest such integer define the dimension of V to be dim V = N − 1 and show that a finite dimensional vector space always has a basis, e_i ∈ V, i = 1, . . . , dim V, such that any element of V can be written uniquely as a linear combination
(A.2) v = Σ_{i=1}^{dim V} b_i e_i, b_i ∈ K.
Problem 1.2. Show from first principles that if V is a vector space (over R or
C) then for any set X the space of all maps
(A.3) F(X; V ) = {u : X −→ V }
is a vector space over the same field, with ‘pointwise operations’ (which you should
write down carefully).
Problem 1.3. Show that if V is a vector space and S ⊂ V is a subset which
is closed under addition and scalar multiplication:
(A.4) v1 , v2 ∈ S, λ ∈ K =⇒ v1 + v2 ∈ S and λv1 ∈ S
then S is a vector space as well with operations ‘inherited from V ’ (and called, of
course, a subspace of V ).
Problem 1.4. Recall that a map between vector spaces L : V −→ W is linear
if L(v1 + v2) = Lv1 + Lv2 and L(λv) = λLv for all elements v1, v2, v ∈ V and all
scalars λ. Show that given two finite dimensional vector spaces V and W over the
same field
(1) If dim V ≤ dim W then there is an injective linear map L : V −→ W.
(2) If dim V ≥ dim W then there is a surjective linear map L : V −→ W.
(3) If dim V = dim W then there is a linear isomorphism L : V −→ W, i.e. an
injective and surjective linear map.
Problem 1.5. If S ⊂ V is a linear subspace of a vector space show that the relation on V
\[
v_1 \sim v_2 \Longleftrightarrow v_1 - v_2 \in S \tag{A.5}
\]
This means writing out the proof that this is a linear space and that the three
conditions required of a norm hold.
Problem 1.9. Prove directly that each lp as defined in Problem 1.8 is complete,
i.e. it is a Banach space.
Problem 1.10. The space l^∞ consists of the bounded sequences
\[
l^\infty = \{a : \mathbb N \longrightarrow \mathbb C;\ \sup_n |a_n| < \infty\}, \qquad \|a\|_\infty = \sup_n |a_n|. \tag{A.6}
\]
Show that this is a non-separable Banach space.
Problem 1.11. Another closely related space consists of the sequences converging to 0:
\[
c_0 = \{a : \mathbb N \longrightarrow \mathbb C;\ \lim_{n\to\infty} a_n = 0\}, \qquad \|a\|_\infty = \sup_n |a_n|. \tag{A.7}
\]
Check that this is a separable Banach space and that it is a closed subspace of l∞
(perhaps do it in the opposite order).
Problem 1.12. Consider the 'unit sphere' in l^p. This is the set of vectors of length 1:
\[
S = \{a \in l^p;\ \|a\|_p = 1\}.
\]
(1) Show that S is closed.
(2) Recall the sequential (so not the open covering definition) characterization
of compactness of a set in a metric space (e.g. by checking in Rudin’s
book).
(3) Show that S is not compact by considering the sequence in lp with kth
element the sequence which is all zeros except for a 1 in the kth slot. Note
that the main problem is not to get yourself confused about sequences of
sequences!
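The computation behind part (3) is tiny and can be made concrete: any two distinct terms of the suggested sequence are at l^p distance 2^{1/p}, so no subsequence is Cauchy. A sketch (the choices p = 3 and the truncation n = 10 are arbitrary; each e_k has only one nonzero entry anyway):

```python
import numpy as np

# The sequence from Problem 1.12(3): e_k in l^p is zero except a 1 in slot k.
# Any two distinct terms are at l^p distance 2^(1/p) > 0, so no subsequence
# can be Cauchy and the unit sphere S is not compact.
p = 3.0
n = 10
E = np.eye(n)                             # rows are e_1, ..., e_n

norms = (np.abs(E)**p).sum(axis=1)**(1/p)
assert np.allclose(norms, 1.0)            # every e_k lies on S

d = (np.abs(E[0] - E[1])**p).sum()**(1/p)
assert np.isclose(d, 2**(1/p))            # pairwise distance 2^(1/p)
```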
Problem 1.13. Show that the norm on any normed space is continuous.
Problem 1.14. Finish the proof of the completeness of the space B constructed
in the second proof of Theorem 1.1.
APPENDIX B

Problems for Chapter 4
1. Hill’s equation
As an extended exercise I suggest you follow the ideas of §4.4 but now for
‘Hill’s equation’ which is the same problem as (4.73) but with periodic boundary
conditions:-
\[
-\frac{d^2u}{dx^2} + Vu = f \ \text{ on } (0,2\pi), \qquad u(2\pi) = u(0),\ \ \frac{du}{dx}(2\pi) = \frac{du}{dx}(0). \tag{B.1}
\]
There are several ways to do this, but you cannot proceed in precisely the same
way since for V = 0 the constants are solutions of (B.1) – so even if the system has
a solution (which for some f it does not) this solution is not unique.
One way to proceed is to start from V = 1 say and solve the problem explicitly.
However the formulæ are not as simple as for the Dirichlet case.
So instead I will outline an approach starting from the solution of the Dirichlet problem. This allows you to see some important concepts – for instance the
Maximum Principle. You should proceed to prove this sequence of claims!
(1) If V ≥ 0 (always real-valued in C([0, 2π])) then we know that the Dirichlet
problem, (4.73), has a unique solution given in (4.111):
\[
u = S_V f, \qquad S_V = A(\mathrm{Id} + AVA)^{-1}A. \tag{B.2}
\]
Recall that the eigenfunctions of this operator are twice continuously differentiable eigenfunctions for the Dirichlet problem with eigenvalues T_k = λ_k^{−1} where the λ_k are the eigenvalues of S_V.
(2) Prove the Maximum Principle in this case: that if V > 0, f ≥ 0 then u = S_V f ≥ 0. Hint:- If this were not true then there would be an interior minimum at a point p with u(p) < 0; but at such a point −\frac{d^2u}{dx^2}(p) ≤ 0 and V(p)u(p) < 0, which contradicts (4.73) since f(p) ≥ 0.
(3) Now, suppose u is a 'classical' (twice continuously differentiable) solution to (B.1) (with V > 0). Then set u_0 = u(0) = u(2π) and u′ = u − u_0 and observe that
\[
-\frac{d^2u'}{dx^2} + Vu' = f - u_0 V \Longrightarrow u' = S_V f - u_0 S_V V. \tag{B.3}
\]
(4) Using the assumption that V > 0 show that
\[
\frac{d}{dx}S_V V(0) > 0, \qquad \frac{d}{dx}S_V V(2\pi) < 0. \tag{B.4}
\]
Hint:- From the equation for S_V V observe that \frac{d^2}{dx^2}S_V V(0) < 0, so if \frac{d}{dx}S_V V(0) ≤ 0 then S_V V(x) < 0 for small x > 0, violating the Maximum Principle, and similarly at 2π.
(5) Conclude from (B.3) that for V > 0 there is a unique solution to (B.1), which is of the form
\[
u = T_V f = S_V f + u_0 - u_0 S_V V, \tag{B.5}
\]
\[
a u_0 = \frac{d}{dx}S_V f(0) - \frac{d}{dx}S_V f(2\pi), \qquad a = \frac{d}{dx}S_V V(0) - \frac{d}{dx}S_V V(2\pi) > 0.
\]
(6) Show that TV is an injective compact self-adjoint operator and that its
eigenfunctions are twice continuously differentiable eigenfunctions for the
periodic boundary problem. Hint:- Boundedness follows from the proper-
ties of SV , as does compactness with a bit more effort. For self-adjointness
integrate the equation by parts.
(7) Conclude the analogue of Theorem 4.5 for periodic boundary conditions,
i.e. Hill’s equation.
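Although the claims above are to be proved analytically, a finite-difference experiment illustrates both the Maximum Principle of step (2) and the solvability in step (5); everything below (grid size, the particular V > 0 and f ≥ 0) is an arbitrary illustrative choice, not part of the notes:

```python
import numpy as np

# Finite-difference sketch of Hill's equation -u'' + V u = f with periodic
# boundary conditions: for V > 0 the discrete problem is uniquely solvable,
# and f >= 0 forces u >= 0, as the Maximum Principle predicts.
n = 400
h = 2 * np.pi / n
x = h * np.arange(n)

# periodic (circulant) second-difference approximation of -d^2/dx^2
L = (2 * np.eye(n) - np.roll(np.eye(n), 1, axis=0)
                   - np.roll(np.eye(n), -1, axis=0)) / h**2

V = 1.0 + 0.5 * np.cos(x)      # V > 0 everywhere
f = 1.0 + np.sin(x)**2         # f >= 0
u = np.linalg.solve(L + np.diag(V), f)

assert np.all(u > -1e-10)      # positivity of the solution
res = (L + np.diag(V)) @ u - f
assert np.max(np.abs(res)) < 1e-8
```

The matrix L + diag(V) is diagonally dominant with nonpositive off-diagonal entries (an M-matrix), which is the discrete counterpart of the Maximum Principle argument.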
that A has no null space – which of course is just the completeness of the e_j since (assuming all the λ_j are positive)
\[
Au = 0 \Longleftrightarrow u \perp e_j \ \forall\, j. \tag{B.11}
\]
Nevertheless, this is essentially what we will do. The idea is to write A as an
integral operator and then work with that. I will take λ_j = w^j where w ∈ (0, 1). The point is that we can find an explicit formula for
\[
A_w(x,y) = \sum_{j=0}^{\infty} w^j e_j(x)e_j(y) = A(w,x,y). \tag{B.12}
\]
Now, for the Riemann integral we can differentiate under the integral sign with
respect to the parameter ξ – since the integrand is continuously differentiable – and
see that
\[
\begin{aligned}
\frac{d}{d\xi}\hat u_0(\xi) &= \lim_{R\to\infty}\int_{-R}^{R} ix\,e^{i\xi x}u_0(x)\\
&= \lim_{R\to\infty} i\int_{-R}^{R} e^{i\xi x}\Big(-\frac{d}{dx}u_0(x)\Big)\\
&= \lim_{R\to\infty} -i\int_{-R}^{R} \frac{d}{dx}\big(e^{i\xi x}u_0(x)\big) - \xi\lim_{R\to\infty}\int_{-R}^{R} e^{i\xi x}u_0(x)\\
&= -\xi\,\hat u_0(\xi).
\end{aligned} \tag{B.16}
\]
Here I have used the fact that An u0 = 0 and the fact that the boundary terms
in the integration by parts tend to zero rapidly with R. So this means that û0 is
annihilated by An :
d
(B.17) ( + ξ)û0 (ξ) = 0.
dξ
Thus, it follows that û_0(ξ) = c exp(−ξ²/2) since these are the only functions annihilated by An. The constant is easy to compute, since
\[
\hat u_0(0) = \int e^{-x^2/2}\,dx = \sqrt{2\pi}, \tag{B.18}
\]
proving (B.14).
We can use this formula, or, if you prefer, the argument used to prove it, to show that
\[
v = e^{-x^2/4} \Longrightarrow \hat v = 2\sqrt{\pi}\,e^{-\xi^2}. \tag{B.19}
\]
Changing the names of the variables this just says
\[
e^{-x^2} = \frac{1}{2\sqrt{\pi}}\int_{\mathbb R} e^{ixs - s^2/4}\,ds. \tag{B.20}
\]
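The normalization in (B.19) is easy to get wrong, so a direct quadrature check is reassuring; a plain Riemann sum on a large interval suffices for a Gaussian (the grid is an arbitrary choice):

```python
import numpy as np

# Quadrature check of (B.19): with the convention
#   vhat(xi) = integral e^{i xi x} v(x) dx  (as in (B.16)),
# the Gaussian v(x) = exp(-x^2/4) has vhat(xi) = 2*sqrt(pi)*exp(-xi^2).
x = np.linspace(-30, 30, 20001)
dx = x[1] - x[0]
v = np.exp(-x**2 / 4)

for xi in (0.0, 0.5, 1.3):
    vhat = (np.exp(1j * xi * x) * v).sum() * dx   # Riemann sum
    exact = 2 * np.sqrt(np.pi) * np.exp(-xi**2)
    assert abs(vhat - exact) < 1e-8
```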
The definition of the u_j's can be rewritten
\[
u_j(x) = \Big(-\frac{d}{dx}+x\Big)^j e^{-x^2/2} = e^{x^2/2}\Big(-\frac{d}{dx}\Big)^j e^{-x^2} \tag{B.21}
\]
as is easy to see inductively – the point being that e^{x²/2} is an integrating factor for the creation operator. Plugging this into (B.20) and carrying out the derivatives – which is legitimate since the integral is so strongly convergent – gives
\[
u_j(x) = \frac{e^{x^2/2}}{2\sqrt{\pi}}\int_{\mathbb R} (-is)^j e^{ixs - s^2/4}\,ds. \tag{B.22}
\]
Now we can use this formula twice on the sum on the left in (B.12) and insert the normalizations in (B.9) to find that
\[
\sum_{j=0}^{\infty} w^j e_j(x)e_j(y) = \sum_{j=0}^{\infty} \frac{e^{x^2/2+y^2/2}}{4\pi^{3/2}} \int_{\mathbb R^2} \frac{(-1)^j w^j s^j t^j}{2^j j!}\, e^{isx+ity-s^2/4-t^2/4}\,ds\,dt. \tag{B.23}
\]
The crucial thing here is that we can sum the series to get an exponential, this
allows us to finally conclude:
Lemma B.2. The identity (B.12) holds with
\[
A(w,x,y) = \frac{1}{\sqrt{\pi}\sqrt{1-w^2}}\exp\Big(-\frac{1-w}{4(1+w)}(x+y)^2 - \frac{1+w}{4(1-w)}(x-y)^2\Big). \tag{B.24}
\]
Proof. Summing the series in (B.23) we find that
\[
A(w,x,y) = \frac{e^{x^2/2+y^2/2}}{4\pi^{3/2}}\int_{\mathbb R^2} \exp\Big(-\frac{1}{2}wst + isx + ity - \frac{1}{4}s^2 - \frac{1}{4}t^2\Big)\,ds\,dt. \tag{B.25}
\]
Now, we can use the same formula as before for the Fourier transform of u_0 to evaluate these integrals explicitly. One way to do this is to make a change of variables by setting
Now, we can use the same formula as before for the Fourier transform of u0 to
evaluate these integrals explicitly. One way to do this is to make a change of
variables by setting
\[
s = (S+T)/\sqrt 2,\quad t = (S-T)/\sqrt 2 \Longrightarrow ds\,dt = dS\,dT, \tag{B.26}
\]
\[
-\frac{1}{2}wst + isx + ity - \frac{1}{4}s^2 - \frac{1}{4}t^2 = iS\frac{x+y}{\sqrt 2} - \frac{1}{4}(1+w)S^2 + iT\frac{x-y}{\sqrt 2} - \frac{1}{4}(1-w)T^2.
\]
Note that the integrals in (B.25) are ‘improper’ (but rapidly convergent) Riemann
integrals, so there is no problem with the change of variable formula. The formula
for the Fourier transform of exp(−x²) can be used to conclude that
\[
\int_{\mathbb R} \exp\Big(iS\frac{x+y}{\sqrt 2} - \frac{1}{4}(1+w)S^2\Big)\,dS = \frac{2\sqrt{\pi}}{\sqrt{1+w}}\exp\Big(-\frac{(x+y)^2}{2(1+w)}\Big), \tag{B.27}
\]
\[
\int_{\mathbb R} \exp\Big(iT\frac{x-y}{\sqrt 2} - \frac{1}{4}(1-w)T^2\Big)\,dT = \frac{2\sqrt{\pi}}{\sqrt{1-w}}\exp\Big(-\frac{(x-y)^2}{2(1-w)}\Big).
\]
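The identity of Lemma B.2 can be checked numerically by comparing the truncated sum in (B.12) with the closed form (B.24); the orthonormal Hermite functions e_j are generated by the standard normalized three-term recurrence (the helper name `hermite_functions` and all parameter values are illustrative, not from the notes):

```python
import numpy as np

# Numerical check of Mehler's formula (B.24): the sum (B.12) of
# w^j e_j(x) e_j(y) over the orthonormal Hermite functions should
# reproduce the closed-form kernel A(w, x, y).
def hermite_functions(x, N):
    """Orthonormal Hermite functions e_0, ..., e_{N-1} at the point x,
    via the stable normalized three-term recurrence."""
    e = np.zeros(N)
    e[0] = np.pi**-0.25 * np.exp(-x**2 / 2)
    if N > 1:
        e[1] = np.sqrt(2.0) * x * e[0]
    for j in range(1, N - 1):
        e[j + 1] = (np.sqrt(2.0 / (j + 1)) * x * e[j]
                    - np.sqrt(j / (j + 1.0)) * e[j - 1])
    return e

w, x, y, N = 0.5, 0.3, -0.7, 80   # w < 1, so 80 terms are ample
series = np.sum(w**np.arange(N)
                * hermite_functions(x, N) * hermite_functions(y, N))
closed = (1 / np.sqrt(np.pi * (1 - w**2))
          * np.exp(-(1 - w) / (4 * (1 + w)) * (x + y)**2
                   - (1 + w) / (4 * (1 - w)) * (x - y)**2))
assert abs(series - closed) < 1e-8
```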
Proof. By definition of A_w
\[
\sum_{j=0}^{\infty} |(f, e_j)|^2 = \lim_{w\uparrow 1}(f, A_w f) \tag{B.30}
\]
so (B.29) reduces to
\[
\lim_{w\uparrow 1}(f, A_w f) = \|f\|_{L^2}^2. \tag{B.31}
\]
To prove (B.31) we will make our work on the integral operators rather simpler by assuming first that f ∈ C(R) is continuous and vanishes outside some bounded interval, f(x) = 0 in |x| > R. Then we can write out the L² inner product as a double integral, which is a genuine (iterated) Riemann integral:
\[
(f, A_w f) = \int\!\!\int A(w,x,y)\,f(x)\overline{f(y)}\,dy\,dx. \tag{B.32}
\]
Noting that A ≥ 0, the same argument shows that the second term is bounded by a constant multiple of δ. Now, we have already shown that the first term in (B.35) tends to zero as ε → 0, so this proves (B.31) – given some γ > 0 first choose ε > 0 so small that the first two terms are each less than γ/2 and then let w ↑ 1 to see that the lim sup and lim inf as w ↑ 1 must lie in the range [‖f‖² − γ, ‖f‖² + γ]. Since this is true for all γ > 0 the limit exists and (B.29) follows under the assumption that f is continuous and vanishes outside some interval [−R, R].
This actually suffices to prove the completeness of the Hermite basis. In any
case, the general case follows by continuity since such continuous functions vanishing
outside compact sets are dense in L2 (R) and both sides of (B.29) are continuous in
f ∈ L2 (R).
Now, (B.31) certainly implies that the ej form an orthonormal basis, which is
what we wanted to show – but hard work! It is done here in part to remind you
of how we did the Fourier series computation of the same sort and to suggest that
you might like to compare the two arguments.
3. Friedrichs’ extension
Next I will discuss an abstract Hilbert space set-up which covers the treatment
of the Dirichlet problem above and several other applications to differential equa-
tions and indeed to other problems. I am attributing this method to Friedrichs and
he certainly had a hand in it.
Instead of just one Hilbert space we will consider two at the same time. First is
a ‘background’ space, H, a separable infinite-dimensional Hilbert space which you
can think of as being something like L2 (I) for some interval I. The inner product
on this I will denote (·, ·)H or maybe sometimes leave off the ‘H’ since this is the
basic space. Let me denote a second, separable infinite-dimensional, Hilbert space
as D, which maybe stands for ‘domain’ of some operator. So D comes with its own
inner product (·, ·)D where I will try to remember not to leave off the subscript.
The relationship between these two Hilbert spaces is given by a linear map
(B.38) i : D −→ H.
This is denoted ‘i’ because it is supposed to be an ‘inclusion’. In particular I will
always require that
(B.39) i is injective.
Since we will not want to have parts of H which are inaccessible, I will also assume
that
(B.40) i has dense range i(D) ⊂ H.
In fact because of these two conditions it is quite safe to identify D with i(D)
and think of each element of D as really being an element of H. The subspace 'i(D) = D' will not be closed – unlike the subspaces we are used to thinking about – since it is dense; rather it carries its own inner product (·, ·)_D. Naturally we will also suppose
that i is continuous and to avoid too many constants showing up I will suppose that
i has norm at most 1 so that
\[
\|i(u)\|_H \le \|u\|_D. \tag{B.41}
\]
If you are comfortable identifying i(D) with D this just means that the ‘D-norm’
on D is bigger than the H norm restricted to D. A bit later I will assume one more
thing about i.
What can we do with this setup? Well, consider an arbitrary element f ∈ H.
Then consider the linear map
\[
T_f : D \ni u \longmapsto (i(u), f)_H \in \mathbb C \tag{B.42}
\]
where I have put in the identification i but will leave it out from now on, so just
write Tf (u) = (u, f )H . This is in fact a continuous linear functional on D since by
Cauchy-Schwarz and then (B.41),
\[
|T_f(u)| = |(u, f)_H| \le \|u\|_H\|f\|_H \le \|f\|_H\|u\|_D. \tag{B.43}
\]
So, by Riesz' representation theorem – using the assumed completeness of D (with respect to the D-norm of course) – there exists a unique element v ∈ D such that
(B.44) (u, f )H = (u, v)D ∀ u ∈ D.
Thus, v only depends on f and always exists, so this defines a map
(B.45) B : H −→ D, Bf = v iff (f, u)H = (v, u)D ∀ u ∈ D
where I have taken complex conjugates of both sides of (B.44).
Lemma B.3. The map B is a continuous linear map H −→ D and restricted
to D is self-adjoint:
(B.46) (Bw, u)D = (w, Bu)D ∀ u, w ∈ D.
The assumption that D ⊂ H is dense implies that B : H −→ D is injective.
Proof. The linearity follows from the uniqueness and the definition. Thus if
fi ∈ H and ci ∈ C for i = 1, 2 then
\[
(c_1f_1 + c_2f_2, u)_H = c_1(f_1,u)_H + c_2(f_2,u)_H = c_1(Bf_1,u)_D + c_2(Bf_2,u)_D = (c_1Bf_1 + c_2Bf_2, u)_D \quad \forall\, u \in D \tag{B.47}
\]
shows that B(c_1f_1 + c_2f_2) = c_1Bf_1 + c_2Bf_2. Moreover from the estimate (B.43),
\[
|(Bf, u)_D| \le \|f\|_H\|u\|_D \tag{B.48}
\]
and setting u = Bf it follows that ‖Bf‖_D ≤ ‖f‖_H, which is the desired continuity.
To see the self-adjointness suppose that u, w ∈ D, and hence of course since
we are erasing i, u, w ∈ H. Then, from the definitions
\[
(Bu, w)_D = (u, w)_H = \overline{(w, u)_H} = \overline{(Bw, u)_D} = (u, Bw)_D \tag{B.49}
\]
so B is self-adjoint.
Finally observe that Bf = 0 implies that (Bf, u)D = 0 for all u ∈ D and hence
that (f, u)H = 0, but since D is dense, this implies f = 0 so B is injective.
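A finite-dimensional model of Lemma B.3 may help fix ideas (all names and values below are illustrative, not from the notes): take H = R^n with the Euclidean inner product and D = R^n with (u, v)_D = vᵀGu for a symmetric positive definite G, with i the identity. Then the defining property (B.45) forces B = G^{−1}, and (B.46) is just the symmetry of G^{−1}:

```python
import numpy as np

# Finite-dimensional sketch of the map B in Lemma B.3:
# H = R^n standard, D = R^n with (u, v)_D = v^T G u, i = identity.
rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
G = M @ M.T + n * np.eye(n)         # SPD, so (.,.)_D is an inner product
B = np.linalg.inv(G)                # the Riesz map of (B.45) in this model

f, u, w = rng.standard_normal((3, n))

D = lambda a, b: b @ G @ a          # (a, b)_D  (real case, no conjugates)
H = lambda a, b: b @ a              # (a, b)_H

assert np.isclose(D(B @ f, u), H(f, u))        # defining property (B.45)
assert np.isclose(D(B @ w, u), D(w, B @ u))    # self-adjointness (B.46)
```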
If you think about this a bit you will see that this is an abstract version of the
treatment of the ‘trivial’ Dirichlet problem above, except that I did not describe
the Hilbert space D concretely in that case.
There are various ways this can be extended. One thing to note is that the
failure of injectivity, i.e. the loss of (B.39) is not so crucial. If i is not injective,
then its null space is a closed subspace and we can take its orthocomplement in
place of D. The result is the same except that the operator B is only defined on this orthocomplement.
An additional thing to observe is that the completeness of D, although used crucially above in the application of Riesz' Representation theorem, is not really such a big issue either.
Proposition B.2. Suppose that D̃ is a pre-Hilbert space with inner product (·, ·)_D and i : D̃ −→ H is a linear map into a Hilbert space. If this map is injective, has dense range and satisfies (B.41) in the sense that
\[
\|i(u)\|_H \le \|u\|_D \quad \forall\, u \in \tilde D \tag{B.54}
\]
then it extends by continuity to a map on the completion, D, of D̃, satisfying (B.39), (B.40) and (B.41); and if bounded sets in D̃ are mapped by i into precompact sets in H then (B.50) also holds.
The extended map may not be injective, i.e. it might happen that i(u_n) → 0 even though u_n → u ≠ 0.
The general discussion of the set up of Lemmas B.4 and B.5 can be continued
further. Namely, having defined the operators B and A we can define a new positive-
definite Hermitian form on H by
(B.55) (u, v)E = (Au, Av)H , u, v ∈ H
with the same relationship as between (·, ·)_H and (·, ·)_D. Now, it follows directly that
\[
\|u\|_H \le \|u\|_E \tag{B.56}
\]
so if we let E be the completion of H with respect to this new norm, then i : H −→ E is an injection with dense range and A extends to an isometric isomorphism A : E −→ H. Then if u_j is an orthonormal basis of H of eigenfunctions of A with eigenvalues τ_j > 0, it follows that u_j ∈ D and that the τ_j^{−1}u_j form an orthonormal basis for D while the τ_j u_j form an orthonormal basis for E.
The typical way that Friedrichs’ extension arises is that we are actually given
an explicit ‘operator’, a linear map P : D̃ −→ H such that (u, v)D = (u, P v)H
satisfies the conditions of Proposition B.2. Then P extends by continuity to an
isomorphism P : D −→ E which is precisely the inverse of B as in Lemma B.6. We
shall see examples of this below.
5. Isotropic space
There are some functions which should be in the domain of P, namely the twice
continuously differentiable functions on R with compact support, those which vanish
outside a finite interval. Recall that there are actually a lot of these, they are dense
in L²(R). Following what we did above for the Dirichlet problem set
\[
H^1_{\mathrm{iso}} \xrightarrow{\ \text{injection}\ } L^2(\mathbb R) \times L^2(\mathbb R). \tag{B.67}
\]
Proof. Let us start with the last part, (B.67). The map here is supposed to
be the continuous extension of the map
\[
\tilde D \ni u \longmapsto \Big(\frac{du}{dx},\, xu\Big) \in L^2(\mathbb R) \times L^2(\mathbb R) \tag{B.68}
\]
where du/dx and xu are both compactly supported continuous functions in this
case. By definition of the inner product (·, ·)iso the norm is precisely
\[
\|u\|_{\mathrm{iso}}^2 = \Big\|\frac{du}{dx}\Big\|_{L^2}^2 + \|xu\|_{L^2}^2 \tag{B.69}
\]
so if u_n is Cauchy in D̃ with respect to ‖·‖_iso then the sequences du_n/dx and xu_n
are Cauchy in L2 (R). By the completeness of L2 they converge defining an element
in L2 (R) × L2 (R) as in (B.67). Moreover the elements so defined only depend on
the element of the completion that the Cauchy sequence defines. The resulting map
(B.67) is clearly continuous.
Now, we need to show that the inclusion i extends to H¹_iso from D̃. This follows from another integration identity. Namely, for u ∈ D̃ the Fundamental Theorem of Calculus applied to
\[
\frac{d}{dx}\big(x|u|^2\big) = |u|^2 + x\bar u\frac{du}{dx} + xu\frac{d\bar u}{dx}
\]
gives
\[
\|u\|_{L^2}^2 \le \int_{\mathbb R}\Big(\Big|\frac{du}{dx}\,x\bar u\Big| + \Big|xu\,\frac{d\bar u}{dx}\Big|\Big) \le \|u\|_{\mathrm{iso}}^2. \tag{B.70}
\]
Thus the inequality (B.41) holds for u ∈ D̃.
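The inequality (B.70) can be tested on a concrete element of the dense subspace; here u = (1 − x²)³ on [−1, 1] (a simple compactly supported choice, with its derivative entered exactly):

```python
import numpy as np

# Numerical illustration of (B.70): the L^2 norm of a compactly supported u
# is controlled by the isotropic norm ||u||_iso^2 = ||du/dx||^2 + ||xu||^2.
x = np.linspace(-1, 1, 200001)
dx = x[1] - x[0]
u = (1 - x**2)**3
du = -6 * x * (1 - x**2)**2          # exact derivative of u on [-1, 1]

l2 = np.sum(u**2) * dx                               # ||u||_{L^2}^2
iso = np.sum(du**2) * dx + np.sum((x * u)**2) * dx   # ||u||_iso^2
assert l2 <= iso
```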
It follows that the inclusion map i : D̃ −→ L²(R) extends by continuity to H¹_iso, since if u_n ∈ D̃ is Cauchy in H¹_iso it is Cauchy in L²(R). It remains to check that i is injective and compact, since the range is already dense (it contains D̃).
If u ∈ H¹_iso then to say i(u) = 0 (in L²(R)) is to say that for any u_n → u in H¹_iso, with u_n ∈ D̃, u_n → 0 in L²(R), and we need to show that this means u_n → 0 in H¹_iso to conclude that u = 0. To do so we use the map (B.67). If u_n ∈ D̃ converges in H¹_iso then it follows that the sequence (du_n/dx, xu_n) converges in L²(R) × L²(R). If v is a continuous function of compact support then (xu_n, v)_{L²} = (u_n, xv) → (u, xv)_{L²}, so if u = 0 it follows that xu_n → 0 as well. Similarly, using integration by parts, the limit U of du_n/dx in L²(R) satisfies
\[
(U, v)_{L^2} = \lim_{n}\int \frac{du_n}{dx}\bar v = -\lim_{n}\int u_n\frac{d\bar v}{dx} = -\Big(u, \frac{dv}{dx}\Big)_{L^2} = 0 \tag{B.71}
\]
if u = 0. It therefore follows that U = 0, so in fact u_n → 0 in H¹_iso and the injectivity of i follows.
We can see a little more about the metric on H¹_iso.

Lemma B.7. Elements of H¹_iso are continuous functions and convergence with respect to ‖·‖_iso implies uniform convergence on bounded intervals.
Proof. For elements of the dense subspace D̃, (twice) continuously differentiable and vanishing outside a bounded interval, the Fundamental Theorem of Calculus shows that
\[
u(x) = e^{x^2/2}\int_{-\infty}^{x} \frac{d}{dt}\big(e^{-t^2/2}u\big) = e^{x^2/2}\int_{-\infty}^{x} e^{-t^2/2}\Big(-tu + \frac{du}{dt}\Big) \Longrightarrow
\]
\[
|u(x)| \le e^{x^2/2}\Big(\int_{-\infty}^{x} e^{-t^2}\Big)^{\frac 12}\|u\|_{\mathrm{iso}} \tag{B.72}
\]
where the estimate comes from Cauchy-Schwarz applied to the integral. It follows that if u_n → u with respect to the isotropic norm then the sequence converges uniformly on bounded intervals, with
\[
\sup_{[-R,R]} |u(x)| \le C(R)\|u\|_{\mathrm{iso}}. \tag{B.73}
\]
Now, to proceed further we either need to apply some ‘regularity theory’ or do a
computation. I choose to do the latter here, although the former method (outlined
below) is much more general. The idea is to show that
Lemma B.8. The linear map (P + 1) : C²_c(R) −→ C_c(R) is injective with range dense in L²(R) and if f ∈ L²(R) ∩ C(R) there is a sequence u_n ∈ C²_c(R) such that u_n → u in H¹_iso, u_n → u locally uniformly with its first two derivatives, and (P + 1)u_n → f in L²(R) and locally uniformly.
Proof. Why P + 1 and not P? The result is actually true for P but not so easy to show directly. The advantage of P + 1 is that it factorizes as
\[
(P + 1) = \mathrm{An}\,\mathrm{Cr} \ \text{ on } C_c^2(\mathbb R),
\]
so we proceed to solve the equation (P + 1)u = f in two steps.
First, if f ∈ C_c(R) then using the natural integrating factor
\[
v(x) = e^{-x^2/2}\int_{-\infty}^{x} e^{t^2/2}f(t)\,dt + ae^{-x^2/2} \quad\text{satisfies}\quad \mathrm{An}\,v = f. \tag{B.74}
\]
The integral here would not in general be finite if f did not vanish in x < −R, which by assumption it does. Note that An e^{−x²/2} = 0. This solution is of the form
\[
v \in C^1(\mathbb R), \qquad v(x) = a_\pm e^{-x^2/2} \ \text{ in } \pm x > R \tag{B.75}
\]
where R depends on f and the constants a_± can be different.
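The formula (B.74) can itself be verified numerically: for a bump function f, the quadrature-defined v should satisfy An v = dv/dx + xv = f up to discretization error (the grid, the particular bump and the choice a = 0 are all arbitrary):

```python
import numpy as np

# Check of (B.74): with the integrating factor exp(x^2/2),
#   v(x) = exp(-x^2/2) * integral_{-inf}^{x} exp(t^2/2) f(t) dt   (a = 0)
# solves the first-order equation An v = dv/dx + x v = f.
x = np.linspace(-5, 5, 100001)
dx = x[1] - x[0]
f = np.where(np.abs(x) < 1, (1 - x**2)**2, 0.0)   # bump supported in [-1, 1]

integral = np.cumsum(np.exp(x**2 / 2) * f) * dx   # Riemann-sum antiderivative
v = np.exp(-x**2 / 2) * integral

Anv = np.gradient(v, dx) + x * v                  # dv/dx + x v
err = np.max(np.abs(Anv - f)[1:-1])               # drop one-sided endpoints
assert err < 1e-3
```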
In the second step we need to solve away such terms – in general one cannot.
However, we can always choose a in (B.74) so that
\[
\int_{\mathbb R} e^{-x^2/2}v(x) = 0. \tag{B.76}
\]
Now consider
\[
u(x) = e^{x^2/2}\int_{-\infty}^{x} e^{-t^2/2}v(t)\,dt. \tag{B.77}
\]
Here the integral does make sense because of the decay in v from (B.75), and u ∈ C²(R). We need to understand how it behaves as x → ±∞. From the second part of (B.75),
\[
u(x) = a_-\,\mathrm{erf}_-(x), \quad x < -R, \qquad \mathrm{erf}_-(x) = \int_{(-\infty,x]} e^{x^2/2 - t^2}\,dt \tag{B.78}
\]
is an incomplete error function. Its derivative is x erf_−(x) + e^{−x²/2} and it satisfies
\[
|x\,\mathrm{erf}_-(x)| \le Ce^{-x^2/2}, \quad x < -R. \tag{B.79}
\]
In any case it is easy to get an estimate such as Ce^{−bx²} as x → −∞, for any 0 < b < 1/2, by Cauchy-Schwarz.
[3] B. S. Mitjagin, The homotopy structure of a linear group of a Banach space, Uspehi Mat.
Nauk 25 (1970), no. 5(155), 63–106. MR 0341523 (49 #6274a)
[4] W. Rudin, Principles of mathematical analysis, 3rd ed., McGraw Hill, 1976.
[5] George F. Simmons, Introduction to topology and modern analysis, Robert E. Krieger Pub-
lishing Co. Inc., Melbourne, Fla., 1983, Reprint of the 1963 original. MR 84b:54002
MIT OpenCourseWare
https://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.