Marsden, J.E. Ratiu, T.S. - Supplement

Internet Supplement for

Introduction to
Mechanics and Symmetry
A Basic Exposition of Classical Mechanical Systems
Second Edition
Jerrold E. Marsden
CDS 107-81
Caltech
Pasadena, CA 91125
USA
[email protected]
Tudor S. Ratiu
Departement de Mathematiques
Ecole Polytechnique Federale de Lausanne
CH - 1015 Lausanne
Switzerland
[email protected]fl.ch
Last modified on 19 December 1998

Contents
Preface
N6 Cotangent Bundles
N6.A Linearization of Hamiltonian Systems
N7 Lagrangian Mechanics
N7.A The Classical Limit and the Maslov Index
N9 Lie Groups
N9.A Automatic Smoothness
N9.B Abelian Lie Groups
N9.C Lie Subgroups
N9.D Lie’s Third Fundamental Theorem
N9.E The Symplectic, Orthogonal, and Unitary Groups
N9.F Generic Coadjoint Isotropy Subalgebras are Abelian
N9.G Some Infinite Dimensional Lie Groups
N9.G.1 Basic Facts about Sobolev Spaces and Manifolds
N10 Poisson Manifolds
N10.A Proof of the Symplectic Stratification Theorem
N11 Momentum Maps
N11.A Another Example of a Momentum Map
N13 Lie–Poisson Reduction
N13.A Proof of the Lie–Poisson Reduction Theorem for $\mathrm{Diff}_{\mathrm{vol}}(M)$
N13.B Proof of the Lie–Poisson Reduction Theorem for $\mathrm{Diff}_{\mathrm{can}}(P)$
N13.C The Linearized Lie–Poisson Equations
N14 Coadjoint Orbits
N14.A Casimir Functions do not Determine Orbits
References
Preface
This supplement contains a number of topics that are somewhat peripheral
to the main flow of the text itself, so that to keep the book within a
reasonable size, we have placed them here. This does not mean that they
are any less important, but as usual, one has to make choices, sometimes
difficult ones. We have organized the material by chapter to match that of
the text as far as possible.
This supplement is being continually updated and we appreciate comments
and suggestions from readers. Please also note that you can get the
current errata for the main text from the site
http://www.cds.caltech.edu/~marsden
Jerry Marsden
Pasadena, California
Tudor Ratiu
Lausanne, Switzerland
December, 1998
N6
Cotangent Bundles
N6.A Linearization of Hamiltonian Systems

One process of linearizing a system is by doubling its dimension using the
tangent operation. In fact, if P is a symplectic (or even Poisson) manifold,
then so is T P in a natural way. We will show how this is established below.
A second method is that of linearizing along a given solution. For
example, to linearize a Hamiltonian system on a symplectic manifold at a
fixed point, one usually wants the linearized Hamiltonian to be the second
variation of the original Hamiltonian at the fixed point. The tangent
linearization does not give this; in canonical coordinates $q^i, p_i$, the
tangent linearized symplectic structure is
\[ dq^i \wedge d(\delta p_i) + d(\delta q^i) \wedge dp_i \tag{N6.A.1} \]
in the variables $(q^i, p_i, \delta q^i, \delta p_i)$. However, at a fixed
point, it is often desirable to use the given symplectic form simply
evaluated at the fixed point, which has the expression
\[ d(\delta q^i) \wedge d(\delta p_i), \tag{N6.A.2} \]
while (N6.A.1) restricts to zero.

One can use “symplectic connections” to compare tangent spaces at
different points along the unperturbed curve and thus make the linearization
process meaningful. A useful class of intrinsic symplectic connections on
cotangent bundles of Lie groups is constructed in Marsden, Ratiu, and
Raugel [1991]. For systems with a symmetry group $G$, they use a
$G$-invariant connection and this gives, via reduction, a linearization
theory for Lie–Poisson systems. For instance, the rigid body and ideal
fluid flow are linearized in this fashion. One also gets a generalization
of the linearization procedure at a fixed point noted in Holm, Marsden,
Ratiu, and Weinstein [1985] and Abarbanel, Holm, Marsden, and Ratiu [1986].
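Hamiltonian Systems in $\mathbb{R}^{2n}$. Let $H : \mathbb{R}^{2n} \to \mathbb{R}$
be a Hamiltonian function, which in canonical coordinates $(q^i, p_j)$ gives
rise to Hamilton’s equations
\[ \dot{q}^i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q^i}. \tag{N6.A.3} \]
Linearizing along a solution curve $(q^i(t), p_i(t))$ and calling the new
variables $(\delta q^i, \delta p_i)$ we get the equations
\[
\begin{aligned}
(\delta q^i)^{\cdot} &= \frac{\partial^2 H}{\partial q^j \partial p_i}\,\delta q^j + \frac{\partial^2 H}{\partial p_j \partial p_i}\,\delta p_j, \\
(\delta p_i)^{\cdot} &= -\frac{\partial^2 H}{\partial q^j \partial q^i}\,\delta q^j - \frac{\partial^2 H}{\partial q^i \partial p_j}\,\delta p_j.
\end{aligned}
\tag{N6.A.4}
\]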
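As a rough numerical illustration of (N6.A.4), the following sketch integrates the linearized equations along a solution and compares the result with a finite difference of two nearby solutions. The pendulum Hamiltonian $H = p^2/2 - \cos q$, the initial data, the step sizes, and all function names are assumptions chosen for illustration; they do not come from the text.

```python
import math

def full_rhs(state):
    """Pendulum H = p**2/2 - cos(q) together with its linearization (N6.A.4).

    For this (assumed) H: H_qp = 0, H_pp = 1, H_qq = cos(q), so
    (dq)' = dp and (dp)' = -cos(q) * dq along the solution (q(t), p(t)).
    """
    q, p, dq, dp = state
    return (p, -math.sin(q), dp, -math.cos(q) * dq)

def rk4(f, state, h, steps):
    """Classical fourth-order Runge-Kutta integrator for tuples of floats."""
    for _ in range(steps):
        k1 = f(state)
        k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

def linearized_vs_difference(eps=1e-6, t_end=2.0, steps=2000):
    """Return (linearized solution, finite-difference estimate) at t_end."""
    q0, p0 = 1.0, 0.3          # illustrative initial condition
    dq0, dp0 = 0.7, -0.2       # illustrative initial variation
    h = t_end / steps
    base = rk4(full_rhs, (q0, p0, dq0, dp0), h, steps)
    # Nearby solution; its own variation slots are unused, so start them at zero.
    pert = rk4(full_rhs, (q0 + eps * dq0, p0 + eps * dp0, 0.0, 0.0), h, steps)
    fd = ((pert[0] - base[0]) / eps, (pert[1] - base[1]) / eps)
    return (base[2], base[3]), fd
```

The two returned pairs should agree up to an error of order `eps`, which is the sense in which $(\delta q^i, \delta p_i)$ describes infinitesimally nearby solutions.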
The matrix of the canonical symplectic form $d(\delta q^i) \wedge d(\delta p_i)$ is
\[ J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \]
Recall (see §2.7) that a linear operator with matrix
\[ T = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \]
is infinitesimally symplectic, that is, $T^{\mathsf{T}} J + J T = 0$, or
equivalently, $T$ is $\omega$-skew, if and only if $B$ and $C$ are symmetric
matrices and $D = -A^{\mathsf{T}}$. The linear system (N6.A.4) has a matrix
clearly satisfying these conditions and, therefore, it defines a Hamiltonian
system in the $(\delta q^i, \delta p_i)$-variables, whose Hamiltonian function
is verified to be the second variation:
\[ \tfrac{1}{2}\,\omega\bigl(T(\delta q^i, \delta p_i), (\delta q^i, \delta p_i)\bigr) = \tfrac{1}{2}\,\delta^2 H(q^i(t), p_i(t))\,(\delta q^i, \delta p_i)^2. \tag{N6.A.5} \]
The same argument and formulas hold for infinite-dimensional weak
symplectic vector spaces $E \times E'$, where $E$ and $E'$ are (weakly)
paired. One of the goals of Marsden, Ratiu, and Raugel [1991] is to
generalize this simple procedure to arbitrary symplectic manifolds. On a
general manifold, formula (N6.A.5) cannot be correct as it stands, since the
second variation of a function does not make intrinsic sense, except at
critical points. Additional structure is needed to correct the second
variation by the addition of terms making the resulting formula invariant.¹
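The algebra behind (N6.A.4) and (N6.A.5) can be checked numerically. The sketch below assumes the linearized system has the matrix $T = J S$, where $S$ stands in for the Hessian of $H$ at a point of the solution curve; a random symmetric matrix is used in place of an actual Hessian, and the helper names are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def canonical_j(n):
    """Matrix J = [[0, I], [-I, 0]] of the canonical symplectic form."""
    j = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        j[i][n + i] = 1.0
        j[n + i][i] = -1.0
    return j

def check_linearization(n=2, seed=0):
    """Return (defect of T^T J + J T, (1/2) omega(Tu, u), (1/2) u^T S u)."""
    rng = random.Random(seed)
    m = 2 * n
    # Random symmetric matrix S: an assumed stand-in for the Hessian of H.
    s = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for k in range(i, m):
            s[i][k] = s[k][i] = rng.uniform(-1.0, 1.0)
    j = canonical_j(n)
    t = matmul(j, s)                      # matrix of the linearized system
    tj, jt = matmul(transpose(t), j), matmul(j, t)
    defect = max(abs(tj[i][k] + jt[i][k]) for i in range(m) for k in range(m))
    u = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    tu = [sum(t[i][k] * u[k] for k in range(m)) for i in range(m)]
    ju = [sum(j[i][k] * u[k] for k in range(m)) for i in range(m)]
    omega = sum(tu[i] * ju[i] for i in range(m))      # omega(Tu, u) = (Tu)^T J u
    quad = sum(u[i] * s[i][k] * u[k] for i in range(m) for k in range(m))
    return defect, 0.5 * omega, 0.5 * quad
```

Up to rounding, the defect vanishes and the last two numbers coincide, matching the statement that the Hamiltonian of the linearized system is one half of the second variation.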
Infinite Dimensional Systems. There are several interesting
infinite-dimensional systems whose phase spaces are of the form
$U \times E'$, where $U$ is open in a Banach space $E$ weakly paired with
$E'$. In all of these cases the linearized equations are
infinite-dimensional versions of (N6.A.4) and the Hamiltonian function is
given by the second variation of the original Hamiltonian along a given
integral curve. As we have mentioned, one of the purposes of Marsden, Ratiu,
and Raugel [1991] is to generalize this to the nontrivial case. The latter
includes systems like the rigid body and fluids, charged fluids, the
Maxwell–Vlasov equations, etc. However, the case with a trivial connection
still includes a surprisingly large number of interesting systems. Here are
some examples:
Examples
1. The Sine-Gordon equation $u_{tt} - u_{xx} = \sin u$ has phase space
$E \times E'$, where $E$ consists of maps $u : \mathbb{R} \to \mathbb{R}$
(one can also use maps $u : \mathbb{R} \to S^1$, but use of the universal
covering space $\mathbb{R}$ of $S^1$ gives a linear space) and $E'$ consists
of maps $\dot{u} : \mathbb{R} \to \mathbb{R}$; $E \times E'$ has the
canonical symplectic structure. The Hamiltonian has the form kinetic plus
potential energy (see Chernoff and Marsden [1974] for details).
2. The Yang-Mills equations have phase space $T^*\mathcal{A}$, where
$\mathcal{A}$ is the space of connections on a given principal bundle, which
is an affine space, so again we can put the trivial symplectic connection on
$T^*\mathcal{A}$. The Yang-Mills equations are Hamiltonian on
$T^*\mathcal{A}$ relative to the canonical symplectic structure, so again
(N6.A.4) is applicable and the Hamiltonian is the second variation of $H$.
See, for example, Arms, Marsden, and Moncrief [1982] for the explicit
formula. One of the interesting complications in this example is the
presence of a gauge symmetry; the statements above are valid in any gauge.
Interestingly, the symplectic form is always canonical, but the Hamiltonian
is linear in the so-called atlas fields, representing the gauge freedom (the
coefficients of the atlas fields are the momentum map for the gauge group).
¹ Another motivation for working in this general context is to deal with
Hamiltonian systems in Lie–Poisson spaces, which, as we explore in detail in
Chapters 13 and 14, are equivalent to $G$-invariant Hamiltonian systems on
$T^*G$, where $G$ is a Lie group. At critical points of $H + C$, where $C$ is
a Casimir on $\mathfrak{g}^*$ ($\mathfrak{g}^*$ is the dual of the Lie
algebra $\mathfrak{g}$ of $G$), such a linearization has been carried out in
Holm, Marsden, Ratiu, and Weinstein [1985] and Abarbanel, Holm, Marsden, and
Ratiu [1986]; as expected, the Hamiltonian function of the linearized
equations is the second variation of $H + C$, but the Poisson structure,
instead of being Lie–Poisson, is a “frozen coefficient” Poisson bracket.
3. General relativity (in dynamical form) has phase space
$T^*\mathrm{Riem}(M)$, where $\mathrm{Riem}(M)$ is the space of Riemannian
metrics on a fixed hypersurface $M$. Again the dynamical equations are
Hamiltonian on $T^*\mathrm{Riem}(M)$ relative to the canonical symplectic
structure (for any choice of gauge). Thus, again we can put the trivial
symplectic connection on $T^*\mathrm{Riem}(M)$ and formulas (N6.A.4) and
(N6.A.5) (in their obvious infinite-dimensional generalization) apply. These
linearized equations are studied in some detail, for the purpose of getting
results on the space of nonlinear solutions, in Fischer, Marsden, and
Moncrief [1980] and Arms, Marsden, and Moncrief [1982].
An interesting question here is to couple these systems to ones with
nontrivial phase space. For instance, charged fluids, general relativistic
fluids or elasticity, the Maxwell–Vlasov equations, etc., are such systems.
All of these will produce nontrivial linearizations by these methods.
The Tangent Symplectic Structure. If $(P, \Omega)$ is a symplectic
manifold, the “flat map” $\Omega^\flat : TP \to T^*P$ is a diffeomorphism.
Then $TP$ becomes an exact symplectic manifold if the map
$\Omega^\flat : TP \to T^*P$ is used to pull back the canonical one-form on
$T^*P$. This one-form on $TP$, denoted $\Theta_T$, has the expression
\[ \langle (\Theta_T)_v, w \rangle = \Omega_z(v, T\tau_P(w)), \tag{N6.A.6} \]
where $v \in T_zP$, $w \in T_v(TP)$, $\tau_P : TP \to P$ is the projection,
and $\langle\,,\rangle$ denotes the pairing between $T^*(TP)$ and $T(TP)$. In
this way, $TP$ becomes a symplectic manifold with symplectic form
$\Omega_T = -d\Theta_T$.
If f : P → P is a diffeomorphism one verifies that T f : T P → T P is
symplectic iff f is symplectic.
We remark in passing that a vector field X is locally Hamiltonian if and
only if X(P ) is a Lagrangian submanifold of (T P, ΩT ) (see Abraham and
Marsden [1978], §5.3, and Sánchez de Alvarez [1986, 1989]).
The First Variation Equation. Let $\varphi_t$ be the flow of a Hamiltonian
vector field $X_H$ on a symplectic manifold $P$ and let $\psi_t = T\varphi_t$
be the tangent flow and $Y$ be its generating vector field. Let
$s_P : T(TP) \to T(TP)$ be the canonical involution given locally by
$s_P(u, v, \dot{u}, \dot{v}) = (u, \dot{u}, v, \dot{v})$. One verifies that
$Y = s_P \circ TX_H$ is Hamiltonian with respect to the symplectic form
$\Omega_T$ on $TP$ with the Hamiltonian function
$\mathcal{H}(v) = \Omega(X_H(p), v)$, $v \in T_pP$, which is given in
coordinates by the formula
\[ \mathcal{H}(q^i, p_i, v^i, w_i) = \frac{\partial H}{\partial q^i}\,v^i + \frac{\partial H}{\partial p_j}\,w_j. \tag{N6.A.7} \]
The Hamiltonian system $Y = X_{\mathcal{H}}$ on $TP$ is called the
linearized Hamiltonian system or first variation equation of $X_H$.
If $Q$ is a pseudo-Riemannian manifold and $P = TQ$ with the symplectic
form induced by the metric, the linearized Hamiltonian $\mathcal{H}$ of the
Hamiltonian given by the kinetic energy of the metric on $Q$ gives rise to
the Hamiltonian vector field $X_{\mathcal{H}}$, which coincides with the
first variation equation for geodesics, an important construction in
geometry (see, for instance, Milnor [1965]).
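Linearization with Respect to a Parameter. Let $H^\epsilon$ be a family of
Hamiltonian functions on $P$ depending smoothly on a parameter
$\epsilon \in \mathbb{R}$. Let $H_0$ denote the value of $H^\epsilon$ at
$\epsilon = 0$ and
\[ H^1 = \left.\frac{dH^\epsilon}{d\epsilon}\right|_{\epsilon=0}. \]
Let $\varphi^\epsilon_t$ be the flow of the Hamiltonian vector field with
Hamiltonian $H^\epsilon$ and let
\[ \varphi'_t(p) = \left.\frac{d}{d\epsilon}\,\varphi^\epsilon_t(p)\right|_{\epsilon=0} \in T_{\varphi^0_t(p)}P. \]
Since $\varphi^\epsilon_0(p) = p$, we have $\varphi'_0(p) = 0$. Thus
$\varphi'_t(p)$ is an integral curve of the Hamiltonian vector field
$X_{\mathcal{H}_1}$ on $(TP, d\Theta_T)$, where
\[ \mathcal{H}_1 = \langle dH_0, \cdot\rangle + \tau_P^* H^1, \tag{N6.A.8} \]
with $\tau_P : TP \to P$ the canonical tangent bundle projection,
$\langle\,,\rangle$ the pairing between $T^*P$ and $TP$, and
$\langle dH_0, \cdot\rangle : TP \to \mathbb{R}$ given by
\[ \langle dH_0, \cdot\rangle(v_p) := \langle dH_0(p), v_p\rangle \tag{N6.A.9} \]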
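for $v_p \in T_pP$. In local coordinates $(q^i, p_i, v^i, w_i)$ on $TP$,
\[
\begin{aligned}
\mathcal{H}_1(q^i, p_i, v^i, w_i) &= \frac{\partial H_0}{\partial q^i}\,v^i + \frac{\partial H_0}{\partial p_i}\,w_i + H^1(q^i, p_i) \\
&= \mathcal{H}_0(q^i, p_i, v^i, w_i) + H^1(q^i, p_i),
\end{aligned}
\tag{N6.A.10}
\]
where $\mathcal{H}_0$ is given in terms of $H_0$ by (N6.A.7). Hamilton’s
equations for $\mathcal{H}_1$ on $TP$ relative to the symplectic form
$\Omega_T$ are
\[
\left.
\begin{aligned}
\frac{dq^i}{dt} &= \frac{\partial H_0}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H_0}{\partial q^i}, \\
\frac{dv^i}{dt} &= \left(v^j \frac{\partial}{\partial q^j} + w_j \frac{\partial}{\partial p_j}\right)\frac{\partial H_0}{\partial p_i} + \frac{\partial H^1}{\partial p_i}, \\
\frac{dw_i}{dt} &= -\left(v^j \frac{\partial}{\partial q^j} + w_j \frac{\partial}{\partial p_j}\right)\frac{\partial H_0}{\partial q^i} - \frac{\partial H^1}{\partial q^i}.
\end{aligned}
\right\}
\tag{N6.A.11}
\]
One calls this the first variation equation relative to a parameter
$\epsilon$. If we set $H^1 = 0$ we recover the first variation equation
(N6.A.4) for $X_{H_0}$ discussed earlier, with $H_0 = H$,
$v^i = \delta q^i$, and $w_i = \delta p_i$.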
Further details on the linearization of Hamiltonian systems and the use
of symplectic connections to accomplish this may be found in Marsden,
Ratiu, and Raugel [1991].
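The first variation equation relative to a parameter, (N6.A.11), can be tested numerically in the same spirit. The sketch below assumes the family $H^\epsilon = p^2/2 - \cos q + \epsilon q$ (so $H_0$ is the pendulum and $H^1 = q$), for which (N6.A.11) reads $\dot{v} = w$, $\dot{w} = -\cos(q)\,v - 1$; the family, the initial data, and the function names are illustrative assumptions only.

```python
import math

def rk4(f, state, h, steps):
    """Classical fourth-order Runge-Kutta integrator for tuples of floats."""
    for _ in range(steps):
        k1 = f(state)
        k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (a + 2 * b + 2 * c + d) / 6
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state

def perturbed_rhs(eps):
    """Hamilton's equations for the assumed family H = p**2/2 - cos(q) + eps*q."""
    def f(state):
        q, p = state
        return (p, -math.sin(q) - eps)
    return f

def first_variation_rhs(state):
    """Equations (N6.A.11) for H_0 = p**2/2 - cos(q) and H^1 = q."""
    q, p, v, w = state
    return (p, -math.sin(q), w, -math.cos(q) * v - 1.0)

def parameter_derivative(eps=1e-6, t_end=2.0, steps=2000):
    """Return ((v, w) from (N6.A.11), finite-difference derivative in eps)."""
    q0, p0 = 1.0, 0.3                      # illustrative initial condition
    h = t_end / steps
    # The variation starts at zero because the flow at t = 0 is the identity.
    base = rk4(first_variation_rhs, (q0, p0, 0.0, 0.0), h, steps)
    pert = rk4(perturbed_rhs(eps), (q0, p0), h, steps)
    fd = ((pert[0] - base[0]) / eps, (pert[1] - base[1]) / eps)
    return (base[2], base[3]), fd
```

The pair $(v, w)$ obtained from (N6.A.11) should match the difference quotient in $\epsilon$ up to an error of order `eps`.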
N7
Lagrangian Mechanics
N7.A The Classical Limit and the Maslov Index
The purpose of this section is to give a brief introduction, through the
simplest examples, to the quantum–classical relationship and the Maslov
index, following the exposition in Marsden and Weinstein [1979]. For
further information and generalizations, the reader may consult Guillemin
and Sternberg [1977, 1984]; Woodhouse [1992] and Bates and Weinstein [1997].
We also refer to Littlejohn [1988] for an interpretation of the Maslov index
in terms of Berry’s phase. We shall not attempt to make every step
absolutely rigorous. See Eckmann and Seneor [1976] for details.
We begin with the one-dimensional Schrödinger equation. Let
$V : \mathbb{R} \to \mathbb{R}$ be a given potential, let
$\psi : \mathbb{R} \to \mathbb{C}$ be a wave function, and let
$E, \hbar, m$ be constants (energy, Planck’s constant, and mass,
respectively). Consider the stationary Schrödinger equation:
\[ L\psi = E\psi, \tag{N7.A.1} \]
where
\[ L\psi = -\frac{\hbar^2}{2m}\,\psi'' + V\psi \tag{N7.A.2} \]
and the time-independent Hamilton–Jacobi equation for the function
$S : \mathbb{R} \to \mathbb{R}$:
\[ \frac{1}{2m}\,(S')^2 + V = E. \tag{N7.A.3} \]
In this one-dimensional case, the Hamilton–Jacobi equation is related to
Hamilton’s equations
\[ \dot{q} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial q}, \tag{N7.A.4} \]
where $H(q, p) = p^2/2m + V(q)$, in a very simple way: if $S(q)$ satisfies
the Hamilton–Jacobi equation, and if $\dot{q}(t) = p(t)/m$ and if
$p = S'(q) \neq 0$, then $(q(t), p(t))$ satisfies Hamilton’s equations and
has energy $E$.
Two related central questions are:
1. How does one pass from classical objects to quantum objects? Here,
“objects” can refer to the equations themselves, to solutions, or to
properties of the equations or solutions.
2. In what sense are solutions of the Hamilton–Jacobi equation a limit
of solutions of the Schrödinger equation as $\hbar \to 0$?¹
Progress with these questions was made with the basic work of Weyl,
Birkhoff, van Hove, and, among many others, Keller, Maslov, Souriau,
and Kostant (see the preceding references for the literature citations). van
Hove showed that there is no general quantization having all the properties
one would want.² In studying question 2 using the WKB method, Keller
and Maslov discovered the topological meaning of the corrected
Bohr–Sommerfeld quantization rules. The invariant they discovered is
commonly called the Maslov index (see Arnold [1967]). Our one-dimensional
example will contain many of the features of the general case.
If $S$ is a solution of (N7.A.3), we try to solve (N7.A.1) with
\[ \psi = \exp(iS/\hbar). \tag{N7.A.5} \]
Substitution of (N7.A.5) in (N7.A.2) gives
\[ E\psi = L\psi + \frac{i\hbar}{2m}\,\psi S'' \tag{N7.A.6} \]
¹ Of course Planck’s constant is a constant and cannot literally tend to
zero, any more than the velocity of light can tend to infinity. However,
when $\hbar$ is small compared to quantities of interest in classical
mechanics, this is expressed mathematically by taking the limit
$\hbar \to 0$ or by letting related parameters tend to zero (see Littlejohn
[1988] and de Gosson [1997]).
² van Hove’s theorem in $\mathbb{R}^n$ is proved in Abraham and Marsden
[1978], §5.4. van Hove also found some positive results that were extended
by Segal, Souriau and Kostant in a procedure now called prequantization.
Recent references in this direction may be found in Gotay, Grundling, and
Tuynman [1996].
by using (N7.A.3). Equation (N7.A.6) differs from (N7.A.1) by a term of
order $\hbar$. Next, try
\[ \psi = a \exp(iS/\hbar) \tag{N7.A.7} \]
for $a : \mathbb{R} \to \mathbb{R}$. Substituting this into (N7.A.2) and
using the Hamilton–Jacobi equation, we get
\[ L\psi = E\psi - \frac{i\hbar}{2m}\,\frac{\psi}{a}\,(S''a + 2S'a') - \frac{\hbar^2}{2m}\,\frac{a''}{a}\,\psi \]
or
\[ E\psi = L\psi + \frac{i\hbar}{2m}\,\frac{\psi}{a}\,(S''a + 2S'a') + \frac{\hbar^2}{2m}\,\frac{a''}{a}\,\psi. \tag{N7.A.8} \]
This equation differs from (N7.A.1) by a term of order $\hbar^2$ if $a$
satisfies the transport equation
\[ 2a'S' + aS'' = 0, \tag{N7.A.9} \]
whose solution is $a = (\text{constant})/|S'|^{1/2}$. Thus, (N7.A.8) becomes
\[ E\psi = L\psi + \frac{\hbar^2}{2m}\,\frac{a''}{a}\,\psi \tag{N7.A.10} \]
which differs from (N7.A.1) by a term of order $\hbar^2$. The idea is now to
continue this process by writing
\[ \psi = \sum_{k=0}^{N} a_k (i\hbar)^k \exp(iS/\hbar) \tag{N7.A.11} \]
for some functions $a_k : \mathbb{R} \to \mathbb{R}$ and requiring $\psi$ to
satisfy (N7.A.1) up to an error term of order $\hbar^{N+2}$. This procedure
is usually called the WKB method (after G. Wentzel, H. A. Kramers, and
L. Brillouin, although it goes back to Liouville, Green, and Lord Rayleigh).
Substituting (N7.A.11) into (N7.A.2) and using, as before, the
Hamilton–Jacobi equation (N7.A.3) yields
\[
L\psi = E\psi - \frac{\exp(iS/\hbar)}{2m}\Bigl[(S''a_0 + 2S'a_0')\,i\hbar
+ i\hbar \sum_{k=1}^{N} (S''a_k + 2S'a_k' - a_{k-1}'')(i\hbar)^k
+ i^N a_N''\,\hbar^{N+2}\Bigr].
\tag{N7.A.12}
\]
Imposing the transport equations
\[ S''a_k + 2S'a_k' - a_{k-1}'' = 0, \quad k = 0, 1, \ldots, N, \quad a_{-1} \equiv 0, \tag{N7.A.13} \]
which can be solved recursively, we see that (N7.A.12) reduces to
\[ E\psi = L\psi + \frac{i^N \exp(iS/\hbar)}{2m}\,a_N''\,\hbar^{N+2}. \tag{N7.A.14} \]
Thus, we have “solved” (N7.A.1) up to an error of order $\hbar^{N+2}$.
Therefore, if we let $N \to \infty$ we have found an asymptotic solution
\[ \psi \sim \sum_{k=0}^{\infty} a_k (i\hbar)^k \exp(iS/\hbar) \tag{N7.A.15} \]
of (N7.A.1). The key observation in this procedure is that once $S$ is
determined, the coefficients $a_k$ are obtained recursively as solutions of
linear ordinary differential equations. The solutions are a fortiori only
local since $S$ given by (N7.A.3) is only local, as we shall see below.
Suppose the energy surface for the classical system has the form shown
in Figure N7.A.1.
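[Figure N7.A.1. A sample classical energy surface: the closed curve
$p^2/2m + V = E$ in the $(q, p)$-plane, with upper branch $+$, lower branch
$-$, and turning points $q_1$ and $q_2$ on the $q$-axis.]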
There correspond two solutions of (N7.A.3):
\[ S = \pm \int p(q)\,dq + C_\pm, \tag{N7.A.16} \]
where $p(q) = \sqrt{2m(E - V(q))}$, and $C_\pm$ are constants. Thus if
$\psi$ is given by (N7.A.11), or asymptotically by (N7.A.15), then the first
transport equation (N7.A.9) for $k = 0$ yields
\[ a_{0\pm} = \frac{d_\pm}{[2m(E - V(q))]^{1/4}} \tag{N7.A.17} \]
for some constants $d_\pm$. This expression diverges at $q_1$ and $q_2$ and
becomes imaginary outside the interval $[q_1, q_2]$.
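A quick finite-difference check can make (N7.A.9) and (N7.A.17) concrete. The sketch below assumes the harmonic potential $V(q) = q^2/2$ with $m = 1$ and $E = 1$ (so the turning points are $\pm\sqrt{2}$); these choices, the step size, and the function names are illustrative assumptions, not taken from the text.

```python
import math

MASS, ENERGY = 1.0, 1.0          # illustrative constants

def potential(q):
    """Assumed sample potential V(q) = q**2 / 2."""
    return 0.5 * q * q

def s_prime(q):
    """S'(q) = p(q) = sqrt(2m(E - V(q))) on the + branch of (N7.A.16)."""
    return math.sqrt(2.0 * MASS * (ENERGY - potential(q)))

def amplitude(q):
    """a_0(q) = [2m(E - V(q))]**(-1/4), i.e. (N7.A.17) with d = 1."""
    return (2.0 * MASS * (ENERGY - potential(q))) ** -0.25

def transport_residual(q, h=1e-5):
    """Central-difference value of 2 a' S' + a S'' from (N7.A.9)."""
    a_prime = (amplitude(q + h) - amplitude(q - h)) / (2.0 * h)
    s_second = (s_prime(q + h) - s_prime(q - h)) / (2.0 * h)
    return 2.0 * a_prime * s_prime(q) + amplitude(q) * s_second

def amplitude_near_turning_point(gap=1e-8):
    """Amplitude evaluated just inside the right turning point q_2 = sqrt(2)."""
    return amplitude(math.sqrt(2.0 * ENERGY) - gap)
```

Away from the turning points the residual is zero up to discretization error, while the amplitude grows without bound as $q$ approaches $q_2$, which is the divergence noted above.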
The subtlety of questions 1 and 2 centers on the multiple valuedness of
$S$ and the presence of the turning points at $q_1$ and $q_2$. To get around
these difficulties there have been several approaches.
1. Use analytic continuation methods to avoid the turning points. This
approach was developed by Zwaan.
2. Approximate the potential by a linear one near each turning point.
Schrödinger’s equation then yields an Airy function which is
asymptotically matched by Bessel functions (Langer and Jeffreys).
3. Use a modified WKB method near the turning point and an asymptotic
expansion (Maslov). We shall describe this method shortly.
There are other approaches too. For instance, Miller and Good [1953]
effectively used area–preserving maps to deform Figure N7.A.1 into that
for a harmonic oscillator. The same idea was used by Maslov [1965] for
higher superpositions of such expressions.
To study the behavior near $q_1$ and $q_2$, we replace
$\psi = a \exp(iS/\hbar)$ by a superposition of such expressions, that is, by
\[ \psi(q) = \int_{-\infty}^{\infty} a(q, p) \exp(i\varphi(q, p)/\hbar)\,dp \tag{N7.A.18} \]
where $\varphi : \mathbb{R}^2 \to \mathbb{R}$ is positive. This integral is
called an oscillatory function; the theory of such integrals parallels that
of Fourier integrals. Let us take $\varphi(q, p) = qp - T(p)$ for some
real-valued function $T$ defined in a neighborhood of the origin whose
second derivative never vanishes, that is,
\[ \psi(q) = \int_{-\infty}^{\infty} a(q, p) \exp\left(i\,\frac{pq - T(p)}{\hbar}\right) dp, \tag{N7.A.19} \]
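and try to solve (N7.A.1). A direct computation shows that
\[
L\psi - E\psi = \int_{-\infty}^{\infty} \left[ a\left(\frac{p^2}{2m} + V(q) - E\right)
- \left(\frac{i\hbar p}{m}\,\frac{\partial a}{\partial q} + \frac{\hbar^2}{2m}\,\frac{\partial^2 a}{\partial q^2}\right) \right]
\exp\left(i\,\frac{pq - T(p)}{\hbar}\right) dp.
\tag{N7.A.20}
\]
To evaluate the right-hand side of (N7.A.20) asymptotically in $\hbar$ we
need the following: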
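Theorem N7.A.1 (Stationary Phase Formula). Let
$a, \varphi : \mathbb{R} \to \mathbb{R}$ be $C^\infty$ functions, $\varphi$
having finitely many nondegenerate critical points. Then
\[
\int_{-\infty}^{\infty} a(x) \exp(i\varphi(x)/\hbar)\,dx
= \sum_{\varphi'(y) = 0} \sqrt{2\pi\hbar}\,\exp\left(\frac{i\pi}{4}\,\mathrm{sgn}\,\varphi''(y)\right)
\frac{a(y) \exp(i\varphi(y)/\hbar)}{|\varphi''(y)|^{1/2}} + O(\hbar^{3/2})
\tag{N7.A.21}
\]
where the sum is taken over all critical points $y$ of $\varphi$. (Recall
that a critical point $y$ of $\varphi$ is nondegenerate iff
$\varphi''(y) \neq 0$.)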
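Before the proof, a numerical sanity check of (N7.A.21) may be helpful. The sketch below assumes the amplitude $a(x) = e^{-x^2}$ (rapidly decaying rather than compactly supported) and the phase $\varphi(x) = x^2/2$, which has the single nondegenerate critical point $y = 0$ with $\varphi''(0) = 1$; the quadrature parameters and function names are likewise illustrative assumptions.

```python
import cmath
import math

def oscillatory_integral(a, phi, hbar, lo=-8.0, hi=8.0, n=40000):
    """Composite Simpson approximation of the integral of a(x) exp(i phi(x)/hbar).

    n must be even; the interval [lo, hi] is assumed to contain essentially
    all of the mass of the amplitude a.
    """
    h = (hi - lo) / n
    total = 0j
    for k in range(n + 1):
        x = lo + k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * a(x) * cmath.exp(1j * phi(x) / hbar)
    return total * h / 3.0

def leading_term(a_y, phi_y, phi2_y, hbar):
    """Contribution of one critical point y to the sum in (N7.A.21)."""
    sign = 1.0 if phi2_y > 0 else -1.0
    return (math.sqrt(2.0 * math.pi * hbar)
            * cmath.exp(1j * math.pi * sign / 4.0)
            * a_y * cmath.exp(1j * phi_y / hbar)
            / math.sqrt(abs(phi2_y)))

def remainder(hbar):
    """|numerical integral - leading term| for the assumed a and phi."""
    a = lambda x: math.exp(-x * x)
    phi = lambda x: 0.5 * x * x
    numeric = oscillatory_integral(a, phi, hbar)
    return abs(numeric - leading_term(a(0.0), phi(0.0), 1.0, hbar))
```

Dividing $\hbar$ by 4 should shrink the remainder by a factor close to $4^{3/2} = 8$, consistent with the $O(\hbar^{3/2})$ error term.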
Proof (After Guillemin and Sternberg [1977]). Let $\{\chi_n\}$ be a
$C^\infty$ partition of unity on the real line, that is, each $\chi_n$ is
$C^\infty$, $0 \le \chi_n \le 1$, $\mathrm{supp}\,\chi_n$ = closure of
$\{x \in \mathbb{R} \mid \chi_n(x) \neq 0\}$ is compact, each
$x \in \mathbb{R}$ has a neighborhood intersecting only finitely many of the
$\mathrm{supp}\,\chi_n$, and $\sum_n \chi_n(x) = 1$ for each
$x \in \mathbb{R}$. Since there are only finitely many critical points of
$\varphi$, we can arrange the supports of $\chi_n$ such that each
$\mathrm{supp}\,\chi_n$ contains at most one critical point of $\varphi$.
Writing
\[
\int_{-\infty}^{\infty} a(x) \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx
= \sum_n \int_{-\infty}^{\infty} \chi_n(x)\,a(x) \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx,
\]
we see that each integral on the right-hand side is a definite integral on
$\mathrm{supp}\,\chi_n$ and that there are only a finite number of integrals
that have overlapping domains of integration. Some of these integrals have
domains which contain critical points of $\varphi$, others do not.
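We begin by studying those integrals that do not have a critical point of
$\varphi$ in their domain. Thus, we can assume that $\mathrm{supp}\,a$ is
compact and that $\varphi' \neq 0$ on $\mathrm{supp}\,a$. Integrating by
parts,
\[
\begin{aligned}
\int_{-\infty}^{\infty} a(x) \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx
&= \int_{-\infty}^{\infty} a(x)\,\frac{\hbar}{i\varphi'(x)}\,\frac{d}{dx} \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx \\
&= i\hbar \int_{-\infty}^{\infty} \frac{d}{dx}\left[\frac{a(x)}{\varphi'(x)}\right] \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx,
\end{aligned}
\]
which is an integral of the same type since
$\frac{d}{dx}[a(x)/\varphi'(x)]$ is again $C^\infty$ with compact support
inside $\mathrm{supp}\,a$. Thus the procedure can be repeated any number of
times yielding
\[ \int_{-\infty}^{\infty} a(x) \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx = O(\hbar^N) \]
for any $N \in \mathbb{N}$. Thus, to prove (N7.A.21), it suffices to
establish it if $\mathrm{supp}\,a$ is compact and contains exactly one
critical point $x_0$ of $\varphi$. This will be carried out in several steps.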
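Step 1 (Morse Lemma). There is a change of variables $x \mapsto z$ such that
\[ \varphi(x(z)) = \varphi(x_0) + \tfrac{1}{2}\,(\mathrm{sgn}\,\varphi''(x_0))\,(z - z_0)^2, \]
where $x(z_0) = x_0$.
To show this, we can clearly assume that
\[ x_0 = 0, \quad \varphi(x_0) = 0, \quad \varphi'(x_0) = 0, \quad \varphi''(x_0) \neq 0. \]
Write first
\[ \varphi(x) = \int_0^1 \frac{d}{dt}\,\varphi(tx)\,dt = x \int_0^1 \varphi'(tx)\,dt = x\,\alpha(x), \]
where
\[ \alpha(x) = \int_0^1 \varphi'(tx)\,dt \]
is again a $C^\infty$ function. Since
\[ \varphi'(x) = \alpha(x) + x\,\alpha'(x), \]
and $\varphi'(0) = \alpha(0) = 0$, the same argument shows that
$\alpha(x) = x\,\beta(x)$ for some $C^\infty$ function $\beta(x)$. Therefore,
\[ \varphi(x) = x^2 \beta(x) \quad\text{and}\quad \beta(x) = \int_0^1 \alpha'(tx)\,dt, \]
whence
\[ \beta(0) = \alpha'(0) = \tfrac{1}{2}\,\varphi''(0). \]
Define $z(x) = \sqrt{2}\,|\beta(x)|^{1/2}\,x$, which is $C^\infty$ in a
neighborhood of 0, since $\beta(0) \neq 0$, and satisfies
\[ z'(0) = \sqrt{2}\,|\beta(0)|^{1/2} \neq 0. \]
Therefore $x \mapsto z$ is a diffeomorphism in a neighborhood of 0 and in
this neighborhood, suitably shrunk if necessary, $\beta(x)$ does not change
sign. Thus
\[ \varphi(x) = x^2 \beta(x) = (\mathrm{sgn}\,\beta(x))\,x^2 |\beta(x)| = \tfrac{1}{2}\,(\mathrm{sgn}\,\beta(0))\,z^2 = \tfrac{1}{2}\,(\mathrm{sgn}\,\varphi''(0))\,z^2. \]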
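Step 2. Performing the change of variables $x \mapsto z$ given in Step 1 we
get
\[
\begin{aligned}
\int_{-\infty}^{\infty} a(x) \exp(i\varphi(x)/\hbar)\,dx
&= \frac{a(x_0) \exp(i\varphi(x_0)/\hbar)}{|\varphi''(x_0)|^{1/2}} \int_{-\infty}^{\infty} \exp\left(\frac{\pm i(z - z_0)^2}{2\hbar}\right) dz \\
&\quad + \exp\left(\frac{i\varphi(x_0)}{\hbar}\right) \int_{-\infty}^{\infty} (z - z_0)\,\gamma(z) \exp\left(\frac{\pm i(z - z_0)^2}{2\hbar}\right) dz,
\end{aligned}
\]
where $+$ or $-$ is taken in accordance with $\mathrm{sgn}\,\varphi''(x_0)$
and $\gamma(z)$ is $C^\infty$ with $(z - z_0)\gamma(z)$ bounded together with
all its derivatives. (The bound for each derivative may be different.)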
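Indeed,
\[
\int_{-\infty}^{\infty} a(x) \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx
= \int_{-\infty}^{\infty} a(x(z)) \exp\left(\frac{i\varphi(x_0)}{\hbar} \pm \frac{i(z - z_0)^2}{2\hbar}\right) \frac{dx}{dz}\,dz
\]
and note that
\[ \frac{dx}{dz}(z_0) = \frac{1}{\sqrt{2}}\,|\beta(x_0)|^{-1/2} = |\varphi''(x_0)|^{-1/2} \]
so that proceeding as in Step 1 we can write
\[ a(x(z))\,\frac{dx}{dz} - a(x_0)\,\frac{dx}{dz}(z_0) = (z - z_0)\,\gamma(z) \]
for some $C^\infty$ function $\gamma(z)$ ($z_0$ denotes the point given by
$x(z_0) = x_0$), that is,
\[ a(x(z))\,\frac{dx}{dz} = a(x_0)\,\frac{1}{|\varphi''(x_0)|^{1/2}} + (z - z_0)\,\gamma(z). \]
Since
\[ (z - z_0)\,\gamma(z) = (z - z_0) \int_0^1 \frac{d}{dz}\left[a(x(z))\,\frac{dx}{dz}\right]_{z = z_t} dt, \]
where $z_t = tz + (1 - t)z_0$, we see that on its domain of definition
$\gamma(z)$ is smooth and has itself and all its derivatives bounded because
$a(x)$ has compact support.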
To show that each integral in Step 2 is well defined, we prove:
Step 3. Let $h(z)$ be a $C^2$ function of a real variable such that the
three functions $|h(z)|$, $|h'(z)|$, $|h''(z)|$ are all bounded by $M > 0$.
If $\lambda \in \mathbb{C}$, the integral
\[ \int_{-\infty}^{\infty} e^{-\lambda z^2/2}\,h(z)\,dz \]
is uniformly convergent for $\mathrm{Re}\,\lambda \ge 0$, $|\lambda| \ge 1$,
bounded by a constant depending on $M$ only, holomorphic for
$\mathrm{Re}\,\lambda > 0$, and continuous for $\mathrm{Re}\,\lambda \ge 0$.
It suffices to prove this for $\int_0^\infty e^{-\lambda z^2/2}\,h(z)\,dz$
for then, by changing variables $z \mapsto -z$, the same result holds for
the integral from $-\infty$ to 0 and hence for the sum. Let $0 < A < B$. Then

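\[
\begin{aligned}
\int_A^B e^{-\lambda z^2/2}\,h(z)\,dz
&= -\lambda^{-1} \int_A^B \frac{1}{z}\,\bigl(e^{-\lambda z^2/2}\bigr)'\,h(z)\,dz \\
&= -(\lambda z)^{-1} e^{-\lambda z^2/2}\,h(z)\Big|_A^B + \lambda^{-1} \int_A^B e^{-\lambda z^2/2} \left(\frac{h(z)}{z}\right)' dz \\
&= -(\lambda z)^{-1} e^{-\lambda z^2/2}\,h(z)\Big|_A^B - \lambda^{-2} \int_A^B \bigl(e^{-\lambda z^2/2}\bigr)'\,\frac{1}{z} \left(\frac{h(z)}{z}\right)' dz \\
&= -(\lambda z)^{-1} e^{-\lambda z^2/2} \left[h(z) + \lambda^{-1} \left(\frac{h(z)}{z}\right)'\right]\Big|_A^B
+ \lambda^{-2} \int_A^B e^{-\lambda z^2/2} \left(\frac{1}{z} \left(\frac{h(z)}{z}\right)'\right)' dz.
\end{aligned}
\]
The first term tends to zero as $A \to \infty$ by boundedness of $|h|$,
$|h'|$ if $\mathrm{Re}\,\lambda \ge 0$, $|\lambda| \ge 1$. The integral in
the second term can also be bounded in absolute value for the same range of
$\lambda$ by a constant depending only on $M$ since $|h''|$ is bounded. In
particular, the integral
\[ \int_0^\infty e^{-\lambda z^2/2}\,h(z)\,dz \]
is uniformly convergent.
Arguing in the same manner for the $\lambda$-derivative, we conclude that
the integral is holomorphic for $\mathrm{Re}\,\lambda > 0$. Similarly one
shows continuity for $\mathrm{Re}\,\lambda \ge 0$.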
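Step 4.
\[ \int_{-\infty}^{\infty} \exp\left(\frac{\pm i(z - z_0)^2}{2\hbar}\right) dz = \sqrt{2\pi\hbar}\,e^{\pm\pi i/4}. \]
From the previous step it follows that this integral exists, by taking
$\lambda = \mp i/\hbar$ and $h(z) = 1$. Moreover, the classical formula
\[ \int_{-\infty}^{\infty} e^{-u^2/2}\,du = \sqrt{2\pi} \]
implies that for real positive $\lambda$ we have
\[ \int_{-\infty}^{\infty} e^{-\lambda u^2/2}\,du = \sqrt{2\pi/\lambda}. \]
By analytically continuing both sides for $\mathrm{Re}\,\lambda > 0$, the
same formula holds for complex $\lambda$ in the right half-plane. Now let
$\lambda \to \mp i/\hbar$ to obtain
\[
\int_{-\infty}^{\infty} \exp\left(\frac{\pm i(z - z_0)^2}{2\hbar}\right) dz
= \sqrt{2\pi\hbar}\,\sqrt{\exp\left(\pm\frac{\pi}{2}\,i\right)}
= \sqrt{2\pi\hbar}\,\exp\left(\pm\frac{\pi i}{4}\right).
\]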
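Step 5. The second integral in Step 2 is $O(\hbar^{3/2})$.
Indeed, the integral exists by Step 3 and
\[
\begin{aligned}
\int_{-\infty}^{\infty} (z - z_0)\,\gamma(z) \exp\left(\frac{\pm i(z - z_0)^2}{2\hbar}\right) dz
&= \int_{-\infty}^{\infty} z\,\gamma(z + z_0) \exp\left(\frac{\pm i z^2}{2\hbar}\right) dz \\
&= \pm\frac{\hbar}{i} \int_{-\infty}^{\infty} \left[\exp\left(\frac{\pm i z^2}{2\hbar}\right)\right]' \gamma(z + z_0)\,dz \\
&= \pm i\hbar \int_{-\infty}^{\infty} \gamma'(z + z_0) \exp\left(\frac{\pm i z^2}{2\hbar}\right) dz.
\end{aligned}
\]
The boundary terms vanish if $\gamma$ vanishes sufficiently fast at
$\infty$. This integral has exactly the form of the original integral and
therefore can be written as a sum of two integrals, the first of order
$O(\hbar^{1/2})$ by Step 4 and the second $\hbar$ times again an integral of
the same type. Thus this integral is of order
$\hbar \cdot \hbar^{1/2} = \hbar^{3/2}$.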
From Steps 2, 4, and 5, we conclude that if there is a single critical
point $x_0$ of $\varphi$ in $\mathrm{supp}\,a$ we get
\[
\int_{-\infty}^{\infty} a(x) \exp\left(\frac{i\varphi(x)}{\hbar}\right) dx
= \sqrt{2\pi\hbar}\,\exp\left(\pm i\,\frac{\pi}{4}\right) \frac{a(x_0) \exp(i\varphi(x_0)/\hbar)}{|\varphi''(x_0)|^{1/2}} + O(\hbar^{3/2}).
\]
The previous proof shows that the same formula holds if all functions
depend smoothly on additional parameters. In particular, we shall use the
following expression in analyzing the right-hand side of (N7.A.19):
\[
\int_{-\infty}^{\infty} c(q, p) \exp\left(i\,\frac{f(q, p)}{\hbar}\right) dp
= \sum_{f_p = 0} \sqrt{2\pi\hbar}\,\exp\left(\frac{i\pi}{4}\,\mathrm{sgn}\,f_{pp}\right)
\frac{c(q, p) \exp(if/\hbar)}{|f_{pp}|^{1/2}} + O(\hbar^{3/2}),
\tag{N7.A.22}
\]
where the sum is over all $p$ such that $f_p = \partial f/\partial p$
vanishes; these critical points are assumed to be finite in number and
nondegenerate, that is, $f_{pp} = \partial^2 f/\partial p^2 \neq 0$.
Applying (N7.A.22) to (N7.A.20) gives
\[
L\psi - E\psi = \sum_{q = T'(p)} \sqrt{2\pi\hbar}\,\frac{\exp[-i\pi\,\mathrm{sgn}\,T''(p)/4]}{|T''(p)|^{1/2}}\,
a(q, p) \left(\frac{p^2}{2m} + V(q) - E\right)
\exp\left(i\,\frac{pq - T(p)}{\hbar}\right) + O(\hbar^{3/2}),
\tag{N7.A.23}
\]
provided the number of critical points in $p$ of the $q$-dependent function
$f(q, p) = qp - T(p)$ is finite and all these $p$-critical points are
nondegenerate. By assumption, $T''$ never vanishes for $p$ near zero and thus
$T'(p)$ is either strictly increasing or strictly decreasing. Thus, for a
fixed $q$, there is exactly one $p$ such that $q = T'(p)$, that is,
(N7.A.23) reads
\[
L\psi - E\psi = \sqrt{2\pi\hbar}\,\frac{\exp[-i\pi\,\mathrm{sgn}\,T''(p)/4]}{|T''(p)|^{1/2}}\,
a(T'(p), p) \left(\frac{p^2}{2m} + V(T'(p)) - E\right)
\exp\left(i\,\frac{pT'(p) - T(p)}{\hbar}\right) + O(\hbar^{3/2}).
\]
Now we require that $L\psi - E\psi = O(\hbar^{3/2})$, which forces the first
term to vanish (since $a(q, p)$ is not the zero function), that is,
\[ \frac{p^2}{2m} + V(T'(p)) = E. \tag{N7.A.24} \]
Thus the graph of $q = T'(p)$ (as a function of $p$) is contained in the
energy surface. Equation (N7.A.24) is the Hamilton–Jacobi equation in the
variable $p$, which is approximated near the turning points $q_1$ and $q_2$.
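Applying (N7.A.22) to formula (N7.A.19) gives
\[
\begin{aligned}
\psi(q) &= \sum_{q = T'(p)} \sqrt{2\pi\hbar}\,\exp[-i\pi\,\mathrm{sgn}\,T''(p)/4]\,\frac{1}{|T''(p)|^{1/2}}
\exp\left(i\,\frac{pq - T(p)}{\hbar}\right) a(q, p) + O(\hbar^{3/2}) \\
&= \sqrt{2\pi\hbar}\,\exp[-i\pi\,\mathrm{sgn}\,T''(p)/4]\,\frac{1}{|T''(p)|^{1/2}}
\exp\left(i\,\frac{pT'(p) - T(p)}{\hbar}\right) a(T'(p), p) + O(\hbar^{3/2}) \\
&= O(\hbar^{1/2}).
\end{aligned}
\tag{N7.A.25}
\]
We now seek to represent $\psi$ near $q_1$ and $q_2$ using functions $T_1$
and $T_2$ given by (N7.A.24) and seek to represent $\psi$ on the $\pm$
portions in the form (N7.A.19). We are, in effect, using a superposition of
two WKB approximations.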
Notice that if $q - T'(p) = 0$, as above, then
\[ \frac{d}{dq}\,(pq - T(p)) = p + q\,\frac{dp}{dq} - T'(p)\,\frac{dp}{dq} = p, \tag{N7.A.26} \]
so both $S$ and $pq - T(p)$ are given by integrating $p$ with respect to
$q$; that is, they are both actions.
Since $q = T'(p)$ along the energy curve in Figure N7.A.1, we see that
$T''(p) > 0$ on the $+$ side and $T''(p) < 0$ on the $-$ side of the
$p$-axis. Thus the term
\[ e^{-i\pi\,\mathrm{sgn}\,T''(p)/4} \tag{N7.A.27} \]
in (N7.A.25) jumps, or suffers a phase shift, as $p$ crosses the $q$-axis.
In Figure N7.A.2 we show the different regions and functions being
considered.
So now we have obtained ψ on four different regions: The upper and
lower part of the energy surface and the parts around the two turning
points (q1 , 0), (q2 , 0); see Figure N7.A.2. The structure of this function is
that of a product of an amplitude times an exponential plus higher-order
terms. We shall require that they all match on the overlaps at first order.
Since there are constants of integration in these formulae (as in (N7.A.17),
for example), matching at points A, B, and C determines all the constants.
Thus, the consistency condition is the match of these solutions at the point
D. This will happen only if the phases in (N7.A.25) match.
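[Figure N7.A.2. Matching phases. The energy curve $p^2/2m + V = E$ in the
$(q, p)$-plane, with upper branch $S_+$, lower branch $S_-$, the functions
$T_1$ and $T_2$ near the turning points $q_1$ and $q_2$, the matching points
A, B, C, D, and the regions labeled 1 to 4.]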

The phase changes in the exponentials $\exp(iS/\hbar)$ and
$\exp[i(pq - T(p))/\hbar]$ are given by
\[ \frac{1}{\hbar} \oint p\,dq, \tag{N7.A.28} \]
since both $S$ and $pq - T(p)$ are given by integrating $p$ and the line
integral is over the energy curve. On the other hand, the phase change due
to the term (N7.A.27) is
\[ -2 \times \left(\frac{\pi}{4} - \left(-\frac{\pi}{4}\right)\right) = -\pi, \tag{N7.A.29} \]
so the consistency condition is
\[ \frac{1}{\hbar} \oint p\,dq - \pi = 2\pi n, \quad\text{i.e.,}\quad \frac{\oint p\,dq}{2\pi\hbar} = n + \tfrac{1}{2}. \tag{N7.A.30} \]
The $\tfrac{1}{2}$ is the correction to the Bohr–Sommerfeld rules which one
sees, for example, in the harmonic oscillator solution. Equation (N7.A.30)
is the quantization condition. Its generalization to arbitrary manifolds
reads
\[ \frac{1}{2\pi\hbar} \oint_\gamma p_i\,dq^i - \tfrac{1}{4}\,I_\gamma = \text{integer}, \tag{N7.A.31} \]
where $I_\gamma$ is the Maslov index of a closed curve $\gamma$. This
topological invariant is thus arrived at via the WKB method. To understand
it in higher dimensions requires a lengthy excursion into the theory of
Lagrangian submanifolds. However, our simplified example shows that starting
with a study of the asymptotic limit $\hbar \to 0$, one is led to
quantization conditions; that is, questions 1 and 2, formulated at the
beginning of this section, are intimately related.
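The harmonic oscillator remark can be made concrete with a short computation. The sketch below assumes $V(q) = \tfrac{1}{2} m \omega^2 q^2$, evaluates $\oint p\,dq$ by quadrature at the energies $E_n = \hbar\omega(n + \tfrac{1}{2})$, and reports $\oint p\,dq / (2\pi\hbar)$ as in (N7.A.30). The substitution $q = \text{mid} + \text{half}\cdot\sin\theta$, the parameter values, and the function names are illustrative choices, not part of the text.

```python
import math

def action_integral(potential, energy, q1, q2, mass=1.0, n=2000):
    """Closed-orbit action: twice the integral of sqrt(2m(E - V)) from q1 to q2.

    The substitution q = mid + half*sin(theta) removes the square-root
    singularity at the turning points; a midpoint rule in theta is then used.
    """
    mid, half = 0.5 * (q1 + q2), 0.5 * (q2 - q1)
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * math.pi / n
        q = mid + half * math.sin(theta)
        p_squared = max(0.0, 2.0 * mass * (energy - potential(q)))
        total += math.sqrt(p_squared) * half * math.cos(theta)
    # Factor 2 accounts for the upper (+p) and lower (-p) branches of the curve.
    return 2.0 * total * math.pi / n

def oscillator_quantum_numbers(hbar=1.0, omega=1.0, mass=1.0, count=4):
    """Value of (closed-orbit action)/(2*pi*hbar) at E_n = hbar*omega*(n + 1/2)."""
    values = []
    for level in range(count):
        energy = hbar * omega * (level + 0.5)
        amp = math.sqrt(2.0 * energy / (mass * omega ** 2))   # turning points +-amp
        v = lambda q: 0.5 * mass * omega ** 2 * q * q
        values.append(action_integral(v, energy, -amp, amp, mass) / (2.0 * math.pi * hbar))
    return values
```

For this potential $\oint p\,dq = 2\pi E/\omega$, so the returned values are $n + \tfrac{1}{2}$ up to rounding; that is, the exact oscillator levels satisfy the corrected rule (N7.A.30).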
The overall aims of quantization and geometric asymptotics become
clearer if one has in mind some of the classical–quantum correspondences.
To this end, we present the table below (see Slawianowski [1971]). The
basic classical object is a symplectic manifold $(T^*Q, \Omega)$ and the
quantum object is the intrinsic Hilbert space $\mathcal{H} = L^2(Q)$ of half
densities on $Q$. The dictionary sets up a correspondence between operations
on each.
Classical Mechanics | Quantum Mechanics
immersed Lagrangian manifold $\Lambda \to (T^*Q, \Omega)$ | element of $L^2(Q)$ or $\mathcal{D}'(Q)$
$\Lambda$ = graph of $dS$ | $\psi = \exp(iS/\hbar)$
multiplication by $(-1)$ on fibers | complex conjugation
$T^*Q$ | Hilbert space
$(T^*Q, -\Omega)$ | dual space
Cartesian product | tensor product
disjoint union | direct product
Lagrangian manifold $\Omega \subset (T^*Q, \Omega_Q) \times (T^*R, -\Omega_R)$ | (unbounded) operator from $L^2(R)$ to $L^2(Q)$
composition of canonical relations | composition of operators
graphs of canonical relations | unitary operators
Hamilton–Jacobi equation | Schrödinger equation
coisotropic submanifold $C \subset T^*Q$ | involutive system of linear differential equations
reduced space $C/C^{\Omega}$ | solution space
reduction of Lagrangian submanifolds | projection onto solution space
symplectic action (Hamiltonian $G$-space) | unitary representation
coadjoint orbits (homogeneous Hamiltonian $G$-spaces) | irreducible representations
reduction of phase space by a symmetry group | multiplicities of irreducibles
momentum mapping | associated representation of the group algebra
polarization | complete set of observables
special symplectic structure | representation of a complete set of observables
change of special symplectic structure (Tulczyjew [1977]) | Fourier integral operator
N9
Lie Groups
The theory of Lie groups is a large subject; Chapter 9 of the text and this
supplement cover only a part of it.
N9.A Automatic Smoothness

We begin with a proof of Proposition 9.1.4 in the text. We recall the state-
ment.
Proposition 9.1.4. Let γ : R → G be a continuous one–parameter subgroup
of a Lie group G. Then γ is smooth and hence γ(t) = exp(ξt) for
some ξ ∈ g.
Proof. It suffices to prove smoothness of γ for |t| < ε for some small
ε > 0. Indeed, γ(t + s) = γ(t)γ(s) shows that if γ is smooth for |s| < ε,
then γ is smooth in an ε–neighborhood of each t with |t| < ε; thus γ is
smooth in a 2ε–neighborhood of zero. Repeating, we see γ is smooth everywhere.
To show that γ is smooth for |t| small, the strategy is to show that
it coincides with exp(tζ) for some ζ ∈ g and for small t. We first prove
this equality for small rational multiples of a fixed parameter value, using
algebraic properties of γ and exp, and then invoke continuity in a limiting
argument.
To carry this strategy out, fix some n ∈ N and let $B_R$ be the open ball
of radius R about the origin in g on which exp is a diffeomorphism. By
continuity of γ, there is some ε > 0 such that $\gamma(t) \in \exp B_{R/2}$ for all
|t| < ε. Fix s > 0, s < ε and define $\eta \in B_{R/2}$ by exp η = γ(s). Similarly,
since s/n < ε, define $\xi_n \in B_{R/2}$ by $\gamma(s/n) = \exp \xi_n$ and note that
$$\exp(n\xi_n) = (\exp \xi_n)^n = \gamma(s/n)^n = \gamma(s) = \exp \eta,$$
which would imply, by bijectivity of exp on $B_R$, that $n\xi_n = \eta$, if we knew in
advance that $n\xi_n \in B_R$. To see this, we begin by observing that $2\xi_n \in B_R$
and that
$$\exp(2\xi_n) = (\exp \xi_n)^2 = \gamma(s/n)^2 = \gamma(2s/n) \in \exp(B_{R/2})$$
since 2s/n < ε if 2 ≤ n. Thus, $2\xi_n \in B_{R/2}$. Repeating this argument for
$3\xi_n, 4\xi_n, \ldots$, we conclude that $n\xi_n \in B_{R/2}$ and so $n\xi_n = \eta$.
Let now k ∈ N, 1 ≤ k ≤ n. Then
$$\gamma(ks/n) = \gamma(s/n)^k = (\exp \xi_n)^k = \exp(k\xi_n) = \exp(k\eta/n)$$
since $\xi_n = \eta/n$. Also,
$$\gamma(-ks/n) = \gamma(ks/n)^{-1} = \exp(k\eta/n)^{-1} = \exp(-k\eta/n),$$
which shows that for any rational number q, |q| ≤ 1, we have
γ(qs) = exp(qη).
Now let $q_n$ be a sequence of rational numbers convergent to t/s for
|t| ≤ s < ε. Continuity of γ and exp implies then that $\gamma(q_n s) \to \gamma(t)$
and $\exp(q_n \eta) \to \exp(t\eta/s)$ as n → ∞. We conclude that γ(t) = exp(tζ)
where ζ = η/s, for all |t| ≤ s.
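The role of the condition nξ_n ∈ B_R in the proof can be seen in a small numerical sketch for G = U(1), where the Lie algebra is iR and the principal logarithm inverts exp on the ball of radius π. The subgroup γ(t) = exp(iωt) with ω = 3 and the function names are assumptions chosen for illustration.

```python
import cmath

def gamma(t, omega=3.0):
    """A continuous one-parameter subgroup of U(1); omega is an assumed rate."""
    return cmath.exp(1j * omega * t)

def compare_generators(s, n, omega=3.0):
    """Return (eta, n*xi_n), where exp(eta) = gamma(s) and exp(xi_n) = gamma(s/n).

    cmath.log is the principal logarithm, that is, the inverse of exp on the
    ball of radius pi in the Lie algebra i*R.
    """
    eta = cmath.log(gamma(s, omega))
    xi_n = cmath.log(gamma(s / n, omega))
    return eta, n * xi_n
```

For small s (here ωs < π) the two values agree, as in the proof. For larger s the product nξ_n leaves the ball on which exp is injective, and the identity nξ_n = η fails even though exp(nξ_n) = exp η still holds.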
Next we generalize this result to Theorem 9.1.9 of the text. Again, we
recall the statement.
Theorem 9.1.9. Let f : G → H be a continuous homomorphism of finite
dimensional Lie groups. Then f is smooth.
Notice that if G = R, this statement gives the preceding proposition. In
fact the strategy is to use the special case 9.1.4 to prove the more general
one 9.1.9.
Proof. Note that if $\xi_1, \ldots, \xi_n \in$ g is a basis, then $\psi : \mathbb{R}^n \to G$, n =
dim G, given by
$$\psi(x_1, \ldots, x_n) = \exp(x_1\xi_1) \cdots \exp(x_n\xi_n)$$
has derivative at the origin equal to the identity map (if we identify g
with $\mathbb{R}^n$ via the chosen basis). Therefore, one can find open neighborhoods
V of e in G and U of the origin in $\mathbb{R}^n$ such that ψ|U : U → V is a
diffeomorphism. Let ϕ : V → U be given by $\varphi = (\psi|U)^{-1}$. Then (V, ϕ)
is a chart at the identity, called the exponential chart of the second kind
(as opposed to the exponential chart of the first kind given by the
inverse of the exponential map on a neighborhood of the identity). Now, for
each i, $t \mapsto f(\exp t\xi_i)$ is a continuous one–parameter subgroup of H and is
hence smooth by Proposition 9.1.4. Since f is a homomorphism,
$$(f \circ \psi)(x_1, \ldots, x_n) = f(\exp x_1\xi_1) \cdots f(\exp x_n\xi_n),$$
so f ◦ ψ is smooth, which implies that
f|V = (f ◦ ψ) ◦ ϕ : V → H is smooth. Thus f is smooth in a chart around
e ∈ G.
Since an atlas of G can be obtained by left translating this chart and
since
$$f|L_g(V) = L_{f(g)} \circ f \circ L_{g^{-1}}|L_g(V),$$
because f is a homomorphism, we see that f is smooth on $L_g(V)$ and hence
on G. Here, $L_{f(g)}$, as usual, denotes left translation by f(g) in H and $L_g$
denotes left translation by g in G.
N9.B Abelian Lie Groups

In this section we prove Theorem 9.1.11, the main structure theorem for
Abelian Lie groups.
Theorem 9.1.11. Every connected Abelian n–dimensional Lie group G
is isomorphic to a cylinder, that is, to $\mathbb{T}^k \times \mathbb{R}^{n-k}$ for some k = 0, 1, ..., n.
Proof. Since G is Abelian, the map $t \mapsto (\exp t\xi)(\exp t\eta)$ is a one–parameter
subgroup of G for any ξ, η ∈ g. The derivative at t = 0 of this
one–parameter subgroup is ξ + η and so by uniqueness, we conclude that
(exp tξ)(exp tη) = exp t(ξ + η).
In particular, setting t = 1, we see that exp : g → G is a Lie group

homomorphism. In addition, since G is connected, it is generated by an
open neighborhood of the identity. Since exp is a local diffeomorphism
around the origin, G is generated by exp(g) and hence exp(g) = G because
exp is a homomorphism. Therefore, exp : g → G is a surjective Lie group
homomorphism that is also a local diffeomorphism. Consequently, its kernel
is a zero dimensional submanifold of g and thus is a discrete subgroup of g.
Consequently, g/ker exp is isomorphic to G as groups and diffeomorphic to
G as manifolds, by working in a small neighborhood of the origin in g where
exp is a diffeomorphism with an open neighborhood of the identity element
in G. Thus g/ker exp and G are isomorphic Lie groups. The theorem is
then a consequence of the following lemma.
Lemma N9.B.1. Any closed discrete subgroup C of $\mathbb{R}^n$ is of the form
$$C = \left\{ \sum_{i=1}^{k} k_i e_i \;\middle|\; k_i \in \mathbb{Z} \right\},$$
where $\{e_1, \ldots, e_k\}$ is a set of linearly independent vectors of $\mathbb{R}^n$.

Proof. If C = {0}, there is nothing to prove. If not, there is some $e_1 \neq 0$,
$e_1 \in C$. Since C is discrete in $\mathbb{R}^n$, there is an open ball B centered at
the origin in $\mathbb{R}^n$ such that C ∩ B = {0}. Thus $e_1$ can be chosen such that
$\|e_1\| \leq \|c\|$ for all c ∈ C, c ≠ 0. Moreover, $\operatorname{span}\{e_1\} \cap C = \mathbb{Z}e_1$. Indeed, if $te_1 \in C$
and [t] denotes the integer part of t, that is, [t] ≤ t < [t] + 1, [t] ∈ Z, then
$$te_1 - [t]e_1 \in C \quad\text{and}\quad \|(t - [t])e_1\| < \|e_1\|,$$
which, by minimality of $\|e_1\|$, implies that t = [t] ∈ Z.
Now consider the projection $\rho : \mathbb{R}^n \to \mathbb{R}^n/\operatorname{span}\{e_1\} \cong \mathbb{R}^{n-1}$. Since ρ is
an open map, it follows that ρ(C) is a discrete subgroup of $\mathbb{R}^n/\operatorname{span}\{e_1\}$.
Inductively, we can find linearly independent vectors $\rho(e_2), \ldots, \rho(e_k)$ in
$\mathbb{R}^n/\operatorname{span}\{e_1\}$, with $e_2, \ldots, e_k \in C$, such that every element of ρ(C) is a
linear combination of $\rho(e_2), \ldots, \rho(e_k)$ with coefficients in Z.
It follows that $e_1, \ldots, e_k$ satisfy the conditions of the lemma. Indeed,
since $\rho(e_2), \ldots, \rho(e_k)$ are linearly independent in $\mathbb{R}^n/\operatorname{span}\{e_1\}$, it follows
that $e_1, \ldots, e_k$ are linearly independent in $\mathbb{R}^n$. Moreover, if
$$c = t_1e_1 + t_2e_2 + \cdots + t_ke_k \in C,$$
then
$$\rho(c) = t_2\rho(e_2) + \cdots + t_k\rho(e_k) \in \rho(C)$$
so by the inductive hypothesis, $t_2, \ldots, t_k \in \mathbb{Z}$. But then
$$t_1e_1 = c - t_2e_2 - \cdots - t_ke_k \in C$$
and hence $t_1 \in \mathbb{Z}$.
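The first step of the proof, choosing a shortest nonzero vector e1 and checking that span{e1} ∩ C = Ze1, can be tried out on a finite window of a planar lattice. This is only a sketch: the generating vectors (3, 1) and (1, 2), the window size, and the function names are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

def lattice_points(basis, bound):
    """Integer combinations i*a + j*b of two planar vectors, |i|, |j| <= bound."""
    (a1, a2), (b1, b2) = basis
    return {(i * a1 + j * b1, i * a2 + j * b2)
            for i, j in product(range(-bound, bound + 1), repeat=2)}

def shortest_nonzero(points):
    """A nonzero point of minimal Euclidean norm (ties broken lexicographically)."""
    return min((p for p in points if p != (0, 0)),
               key=lambda p: (p[0] ** 2 + p[1] ** 2, p))

def coefficients_on_line(points, e1):
    """For every point c = t*e1 on the line through e1, return t as a Fraction."""
    norm_sq = e1[0] ** 2 + e1[1] ** 2
    return sorted(Fraction(c[0] * e1[0] + c[1] * e1[1], norm_sq)
                  for c in points
                  if c[0] * e1[1] - c[1] * e1[0] == 0)
```

Every coefficient returned by `coefficients_on_line` for the shortest vector is an integer, which is the statement span{e1} ∩ C = Ze1 restricted to the sampled window.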
N9.C Lie Subgroups

This section is devoted to the proof of the following theorem stated in the
text.
Theorem 9.1.14. If H is a closed subgroup of a finite dimensional Lie
group G, then H is a regular Lie subgroup. Conversely, if H is a regular
Lie subgroup, then H is closed.
Proof. Assume that H is a closed subgroup of G. The proof given below
that H is a regular Lie subgroup of G is due to Adams [1969] and consists
of four steps. We shall fix once and for all an inner product on g and denote
the associated norm by ‖·‖.
Step 1. Assume that $\{\zeta_n\}$ is a sequence in g such that $\zeta_n \neq 0$ for all
n ∈ N, $\zeta_n \to 0$, and $\zeta_n/\|\zeta_n\| \to \zeta \in$ g as n → ∞. If $\exp \zeta_n \in H$ for all n,
then we will show that exp tζ ∈ H for all t ∈ R.
To see this assume first that t > 0 and let $m_n = [t/\|\zeta_n\|] \in \mathbb{N}$ be the
integer part of $t/\|\zeta_n\|$, that is,
$$0 \leq m_n \leq \frac{t}{\|\zeta_n\|} < m_n + 1.$$
Then we have
$$m_n\|\zeta_n\| - t \leq 0 < m_n\|\zeta_n\| - t + \|\zeta_n\|,$$
which implies that
$$0 \leq t - m_n\|\zeta_n\| < \|\zeta_n\|,$$
whence $m_n\|\zeta_n\| \to t$ as n → ∞. Therefore,
$m_n\zeta_n = m_n\|\zeta_n\| \, (\zeta_n/\|\zeta_n\|) \to t\zeta$ as n → ∞, and hence
$$(\exp \zeta_n)^{m_n} = \exp(m_n\zeta_n) \to \exp t\zeta, \quad\text{as } n \to \infty.$$
Since $\exp \zeta_n \in H$ by hypothesis and H is closed, this implies that exp tζ ∈
H for all t > 0. If t < 0, we have $\exp t\zeta = [\exp(-t\zeta)]^{-1} \in H$, since
exp(−tζ) ∈ H by what we just proved. The case t = 0 is trivial.
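The convergence m_n ζ_n → tζ used in Step 1 takes place entirely in the vector space g, so it can be checked numerically. The sketch below uses g = R², the sample sequence ζ_n = (cos(1/n), sin(1/n))/n with limit direction ζ = (1, 0), and t = 2.5; all of these are assumptions chosen for illustration.

```python
import math

def step1_error(n, t=2.5):
    """Distance between m_n*zeta_n and t*zeta for an assumed sample sequence.

    Here zeta_n = (cos(1/n), sin(1/n))/n in R^2, so zeta_n -> 0 and
    zeta_n/|zeta_n| -> zeta = (1, 0).
    """
    zeta_n = (math.cos(1.0 / n) / n, math.sin(1.0 / n) / n)
    norm = math.hypot(*zeta_n)
    m_n = math.floor(t / norm)  # integer part of t/|zeta_n|
    approx = (m_n * zeta_n[0], m_n * zeta_n[1])
    return math.hypot(approx[0] - t, approx[1])
```

The returned distance shrinks as n grows, in line with the two estimates 0 ≤ t − m_n‖ζ_n‖ < ‖ζ_n‖ and ζ_n/‖ζ_n‖ → ζ.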
Step 2. Define h = {ξ ∈ g | exp tξ ∈ H for all t ∈ R}. We will show that
h is a linear subspace of g.
It is clear that if λ ∈ R and ξ ∈ h then λξ ∈ h. Next, let ξ, η ∈ h
and assume that ξ + η ≠ 0 (if ξ + η = 0, then trivially ξ + η ∈ h). If t ∈ R
is sufficiently small, since exp is a
diffeomorphism of a neighborhood of zero in g with a neighborhood of e in
G, it follows that
(exp tξ)(exp tη) = exp(f(t))
for some f(t) ∈ g satisfying f(0) = 0 and f is smooth around 0. Since
$$\xi + \eta = \left.\frac{d}{dt}\right|_{t=0} (\exp t\xi)(\exp t\eta) = \left.\frac{d}{dt}\right|_{t=0} \exp(f(t)) = f'(0),$$
it follows that f(t)/t → ξ + η as t → 0. Since f(t) → 0 as t → 0, letting
$\zeta_n = f(1/n)$ and $\zeta = (\xi + \eta)/\|\xi + \eta\|$ we see that $\zeta_n \neq 0$ for n large,
$\zeta_n/\|\zeta_n\| \to \zeta$, and $\exp \zeta_n = (\exp(\xi/n))(\exp(\eta/n)) \in H$. Thus the hypotheses
of Step 1 hold and hence we conclude that exp tζ ∈ H for all t ∈ R.
Therefore ζ ∈ h, which implies that ξ + η ∈ h.
Step 3. Let h⊥ be the orthogonal complement to h in g and define the
map
ϕ : h⊥ ⊕ h → G by ϕ(ξ, η) = (exp ξ)(exp η),
for ξ ∈ h⊥, η ∈ h. Then we will show that there are neighborhoods of the
origin U′ ⊂ h⊥, U ⊂ h, and V of e in G, such that
(i) ϕ : U′ × U → V is a diffeomorphism,
(ii) V ∩ H = exp(U).
It follows from this that exp : {0} × U → exp(U) ⊂ H is bijective.
To see (i) and (ii), note that $T_{(0,0)}\varphi$ equals the identity map of g and
hence ϕ is a local diffeomorphism around the origin in g. In particular,
there are balls B′ ⊂ h⊥ and B ⊂ h, both of radius r, centered at the origin
such that both the map
ϕ : B′ × B → ϕ(B′ × B) = exp(B′) exp(B)
and the map
exp : B′ × B → exp(B′ × B)
are diffeomorphisms. Let $B'_n$, $B_n$ denote the balls of radius r/n centered at
the origin in h⊥ and h respectively. We claim that for some n large enough,
$$\exp(B_n) = \varphi(B'_n \times B_n) \cap H = [\exp(B'_n)\exp(B_n)] \cap H.$$
The definition of h immediately implies that $\exp(B_n) \subset H$ and hence that
$$\exp(B_n) \subset [\exp(B'_n)\exp(B_n)] \cap H.$$
To show the converse, assume the contrary, namely that for any n ∈ N
there exist $\xi_n \in B'_n$, $\xi_n \neq 0$, and $\eta_n \in B_n$ such that $(\exp \xi_n)(\exp \eta_n) \in H$.
Since $\exp \eta_n \in H$, it follows that $\exp \xi_n \in H$. Clearly, $\xi_n \to 0$
as n → ∞ and by compactness of the unit sphere, $\xi_n/\|\xi_n\|$ has a convergent
subsequence $\xi_{n_k}/\|\xi_{n_k}\| \to \xi \in$ h⊥, ‖ξ‖ = 1. Step 1 then implies
that exp tξ ∈ H for all t ∈ R, that is, ξ ∈ h, by definition of h. Thus
ξ ∈ h⊥ ∩ h = {0}, which contradicts ‖ξ‖ = 1.
Therefore, if n is large enough,
$$\exp(B_n) = [\exp(B'_n)\exp(B_n)] \cap H$$
and so (i) and (ii) are proved, by taking $U' = B'_n$, $U = B_n$, and
V = ϕ(U′ × U) = exp(U′) exp(U).
Step 4. Define ψ : exp(U) = V ∩ H → {0} × U to be the inverse of
the bijective map in Step 3. Take as a chart around e in G the inverse
of ϕ on V = ϕ(U′ × U), that is, consider the chart $(V, \varphi^{-1})$ at e in G.
Step 3 guarantees that $\varphi^{-1}(V \cap H) = \{0\} \times U$, that is, $(V, \varphi^{-1})$ has the
submanifold property relative to H. Moreover, the induced chart at e on H
is (W, ψ), where W := V ∩ H = exp(U). Now we left translate $(V, \varphi^{-1})$ to
any point in G. In particular, for h ∈ H, the left translated chart
$$V_h := L_h(V), \quad \varphi_h^{-1} : V_h \to U' \times U, \quad \varphi_h^{-1}(k) = \varphi^{-1}(h^{-1}k),$$
has the submanifold property relative to H, inducing the chart
$$(W_h, \psi_h), \quad W_h := L_h(W) = h\exp(U), \quad \psi_h(k) = \psi(h^{-1}k),$$
on H. Thus, H is a smooth submanifold of G.
Finally, since the group operations in H are the restrictions of those in
G which are smooth in the manifold structure of G, it follows that they are
smooth in the manifold structure of H, since H is a smooth submanifold
of G.
Conversely, assume that H is a regular Lie subgroup of G. We shall prove
that H is closed. Let $\{h_n\}$ be a sequence in H convergent in G to some
element h ∈ G. Since H is a submanifold of G, there is a chart (V, χ) at e
in G with the properties
χ : V → U′ × U, U′ ⊂ h⊥, U ⊂ h
open balls at the origin and χ(V ∩ H) = {0} × U. For all n ≥ N, $h^{-1}h_n \in V$.
On the other hand $h_N^{-1}h_n \in H$, so
$$\chi(h_N^{-1}h_n) \to \chi(h_N^{-1}h) \in \{0\} \times U,$$
since we can always choose V such that $V^{-1} = V$ and V V ⊂ V. Therefore
$h_N^{-1}h \in H$, since χ : V ∩ H → {0} × U is a diffeomorphism. Since $h_N \in H$,
this implies h ∈ H and so H is closed in G.
N9.D Lie’s Third Fundamental Theorem

Recall the statement of this result.
Theorem 9.1.15. Let G be a Lie group with Lie algebra g and let h be a
Lie subalgebra of g. Then there exists a unique connected (immersed) Lie
subgroup H of G whose Lie algebra is h.
Proof. Define the smooth vector subbundle h̃ ⊂ T G by left translating h
to any point of G, that is, the fiber h̃g at g equals Te Lg (h). We prove now
that h̃ is an involutive subbundle.
Let X, Y be vector fields on G with values in h̃, that is, they are sections
of h̃. We will show that [X, Y ] is also a section of h̃. Fix g ∈ G and let
ξ = Tg Lg−1 (X(g)) and η = Tg Lg−1 (Y (g)) .
Let ξL and ηL denote the left invariant vector fields on G generated by
ξ and η respectively. The definition of the Lie bracket on g implies that
[ξ, η]L = [ξL , ηL ]. We have
$$[X, Y](g) = [X - \xi_L, Y - \eta_L](g) + [\xi_L, Y - \eta_L](g) + [X - \xi_L, \eta_L](g) + [\xi_L, \eta_L](g).$$
The last term equals $[\xi, \eta]_L(g) \in \tilde{h}_g$. The first three terms all have the
following structure: U and V are sections of h̃ and V(g) = 0 (after
interchanging U and V if necessary, which only changes the sign of the
bracket). If we can
prove that $[U, V](g) \in \tilde{h}_g$, this will show that each of the first three terms
lies in $\tilde{h}_g$ and we can then conclude that h̃ is an involutive distribution.
The following lemma solves this problem.
Lemma N9.D.1. Let M be a manifold and let E be a subbundle of T M .
If Y is a section of E such that Y (m0 ) = 0 for a given point m0 ∈ M ,
then [X, Y ] (m0 ) ∈ Em0 for any X ∈ X(M ).
Proof. Let F be the Banach space modeling M. Since the problem is
local, we can replace M by an open neighborhood U of 0 ∈ F, TM by
U × F, and $m_0$ by 0. Because E is a subbundle of TM, there is a splitting
$F = F_1 \times F_2$ such that, locally, E can be replaced by $U \times F_1$. A section Y
of E is of the form $x \in U \mapsto (x, (f(x), 0))$, where $f : U \to F_1$ is a smooth
function. The condition $Y(m_0) = 0$ is equivalent to f(0) = 0.
Let X ∈ X(M) be arbitrary. Represent it locally in the chart with domain
U by X(x) = (x, g(x)), where g : U → F is a smooth function. Then,
locally, in U,
$$[X, Y](0) = D(f, 0)(0) \cdot g(0) - Dg(0) \cdot (f(0), 0) = (Df(0) \cdot g(0), 0) \in F_1 \times \{0\}$$
since f(0) = 0 and since $Df(0) \in L(F, F_1)$. Therefore $[X, Y](m_0) \in E_{m_0}$.

Returning to the proof of the theorem and applying the theorem of Frobenius
to the involutive subbundle h̃ ⊂ TG, it follows that h̃ is integrable. Let
H be the maximal integral submanifold of h̃ through the identity, that is,
e ∈ H, $T_hH = \tilde{h}_h$ for any h ∈ H, and H is the maximal (relative to
inclusion) immersed submanifold of G having these properties.
We shall prove that H is a subgroup of G. If g ∈ G, then gH is the
maximal integral manifold containing g. Indeed, g ∈ gH since e ∈ H and
if h ∈ H, then
Tgh (gH) = Tgh (Lg H) = Th Lg (Th H)

= Th Lg (h̃h ) = Th Lg (Te Lh h)
= Te Lgh (h) = h̃gh
since Lg : G → G is a diffeomorphism. Therefore, by uniqueness of the

maximal integral manifolds, if h ∈ H, it follows that hH = H. Thus,
if k ∈ H, then hk ∈ hH = H. Moreover, if h ∈ H, then h−1 H is the
maximal integral manifold through h−1 and this integral manifold contains
h−1 h = e since h ∈ H. Thus h−1 H contains e and, again by uniqueness
of the maximal integral manifolds, it follows that h−1 H = H, that is,
h−1 ∈ H.
Next, we show that H is a Lie subgroup of G. Indeed,
$$(h, k) \in H \times H \hookrightarrow G \times G \longmapsto hk \in H \hookrightarrow G$$
is a smooth map from H × H to G. However, since hk ∈ H the map is
smooth from H × H to H.
By construction, $T_eH$ = h, so h is the Lie algebra of H.
Finally we prove that H is the unique connected Lie subgroup of G with
Lie algebra h. Suppose that $H_1$ were another such Lie subgroup. Then, if
$h \in H_1$,
$$T_hH_1 = T_eL_h(T_eH_1) = T_eL_h(\mathfrak{h}) = \tilde{h}_h,$$
that is, $H_1$ is an integral submanifold of h̃. Therefore $H_1 \subset H$ is an open
subgroup, hence it is also closed, and by connectedness of H it follows that
$H_1 = H$.
N9.E Relations between the Symplectic,

Orthogonal, and Unitary Groups
We now want to relate Sp(2n, R), O(2n), and U(n). Following this, we shall
discuss their quaternionic counterparts.
Our first goal is to show that
Sp(2n, R) ∩ O(2n, R) = U(n).
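To make this meaningful, we identify $\mathbb{C}^n = \mathbb{R}^n \oplus i\mathbb{R}^n$ and we express the
Hermitian inner product on $\mathbb{C}^n$ as a pair of real bilinear forms, namely, if
we use the notation
$x_1 + ix_2, \; y_1 + iy_2 \in \mathbb{C}^n$ for $x_1, x_2, y_1, y_2 \in \mathbb{R}^n$,
then
$$\langle x_1 + ix_2, y_1 + iy_2 \rangle = \langle x_1, y_1 \rangle + \langle x_2, y_2 \rangle + i\left(\langle x_2, y_1 \rangle - \langle x_1, y_2 \rangle\right).$$
Thus, identifying $\mathbb{C}^n$ with $\mathbb{R}^n \times \mathbb{R}^n$ and $\mathbb{C}$ with $\mathbb{R} \times \mathbb{R}$, we can write
$$\langle (x_1, x_2), (y_1, y_2) \rangle = \left( (x_1, x_2) \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \; -(x_1, x_2) \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \right). \tag{N9.E.1}$$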
The next task is to represent elements of U(n) as 2n × 2n matrices

with real entries. Since U(n) is a closed subgroup of GL(n, C) we begin
by representing the elements of gl(n, C) in this way. Let A + iB ∈ gl(n, C)
with A, B ∈ gl(n, R) and let x + iy ∈ Cn . Then
(A + iB)(x + iy) = (Ax − By) + i(Ay + Bx)
suggests that the map
$$C : A + iB \in GL(n, \mathbb{C}) \mapsto \begin{bmatrix} A & -B \\ B & A \end{bmatrix} \in GL(2n, \mathbb{R}) \tag{N9.E.2}$$
is the desired embedding of GL(n, C) into GL(2n, R). It is straightforward
to verify that the map C is an injective Lie group homomorphism, so we
can identify GL(n, C) with all invertible 2n × 2n matrices of the form
$$\begin{bmatrix} A & -B \\ B & A \end{bmatrix} \tag{N9.E.3}$$
with A, B ∈ gl(n, R). It is obvious that
$$C\left((A + iB)^\dagger\right) = [C(A + iB)]^T$$
and
trace C(A + iB) = 2 Re trace (A + iB).
The relation
$$\det C(A + iB) = |\det(A + iB)|^2$$
shows that A + iB ∈ GL(n, C) if and only if C(A + iB) ∈ GL(2n, R). To

prove this identity, bring A + iB into Jordan canonical form, so that its
determinant equals the product of the diagonal entries: λ1 , . . . , λn ∈ C.
Since C is a group homomorphism, the identity holds if we can prove it for
complex matrices in Jordan canonical form, that is,
A = diag(Reλ1 , . . . , Reλn ), and B = diag(Imλ1 , . . . , Imλn ) + N,
where N is the nilpotent matrix with 1’s occupying some places on the first
upper diagonal given by the complex Jordan canonical form
diag (λ1 , . . . , λn ) + N.
Interchanging columns and rows (for each column interchange do the same
for the rows) one can transform this matrix to a block upper triangular
matrix, each block on the diagonal being of the form
$$\begin{bmatrix} \operatorname{Re}\lambda_k & -\operatorname{Im}\lambda_k \\ \operatorname{Im}\lambda_k & \operatorname{Re}\lambda_k \end{bmatrix},$$
the upper 2 × 2 blocks being either the zero matrix or the matrix
$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},$$
and all other entries being zero. This matrix has the same determinant as
the original one (because an even number of column and row changes have
been performed) and, since it is block upper triangular, this determinant
equals the product of the determinants of the diagonal blocks, that is, the
product over k of $(\operatorname{Re}\lambda_k)^2 + (\operatorname{Im}\lambda_k)^2 = |\lambda_k|^2$, which proves the statement.
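The determinant identity and the homomorphism property of the map C can be spot-checked on small matrices. The sketch below is an illustration only: the helper names `det`, `realify`, and `matmul` and the sample matrices are assumptions, and the determinant uses the Leibniz formula, which is practical only for very small sizes.

```python
from itertools import permutations

def det(M):
    """Determinant by the Leibniz formula; adequate for small matrices."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for row in range(n):
            term *= M[row][perm[row]]
        total += term
    return total

def realify(Z):
    """Map A + iB (a complex n x n matrix) to the real block matrix [[A, -B], [B, A]]."""
    top = [[z.real for z in row] + [-z.imag for z in row] for row in Z]
    bottom = [[z.imag for z in row] + [z.real for z in row] for row in Z]
    return top + bottom

def matmul(X, Y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]
```

For any complex matrix Z given as nested lists of Python complex numbers, `det(realify(Z))` should agree with `abs(det(Z)) ** 2` up to rounding, and `realify(matmul(Z, W))` with `matmul(realify(Z), realify(W))`.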
Using the embedding C defined above, it follows that, U(n) is embedded
in GL(2n, R) as the set of matrices of the form (N9.E.3) with a certain
additional property to be determined below. If A + iB ∈ U(n) then
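$$(A + iB)^\dagger (A + iB) = I.$$
However, under the homomorphism (N9.E.2)
$$(A + iB)^\dagger = A^T - iB^T$$
is sent to the matrix
$$\begin{bmatrix} A^T & B^T \\ -B^T & A^T \end{bmatrix}.$$
Therefore,
$$(A + iB)^\dagger (A + iB) = I$$
becomes
$$\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} = \begin{bmatrix} A^T & B^T \\ -B^T & A^T \end{bmatrix} \begin{bmatrix} A & -B \\ B & A \end{bmatrix} = \begin{bmatrix} A^TA + B^TB & -A^TB + B^TA \\ -B^TA + A^TB & B^TB + A^TA \end{bmatrix},$$
which is equivalent to
$$A^TA + B^TB = I \quad\text{and}\quad A^TB \text{ is symmetric.} \tag{N9.E.4}$$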
Proposition N9.E.1. The following holds:
Sp(2n, R) ∩ O(2n, R) = U(n).
As we shall see, this is the first in a series of three parallel results of this
sort.
Proof. We have already seen that A + iB ∈ U(n) iff (N9.E.4) holds.
Now let us characterize all matrices of the form
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \in Sp(2n, \mathbb{R}) \cap O(2n, \mathbb{R}).$$
Recall from the main text that a block matrix like this is symplectic iff
$$A^TD - C^TB = I \quad\text{and}\quad A^TC, \; B^TD \text{ are symmetric.} \tag{N9.E.5}$$
Since this matrix is also in O(2n), we have
$$\begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} A^T & C^T \\ B^T & D^T \end{bmatrix} = \begin{bmatrix} AA^T + BB^T & AC^T + BD^T \\ CA^T + DB^T & CC^T + DD^T \end{bmatrix},$$
which is equivalent to
$$AA^T + BB^T = I, \quad AC^T + BD^T = 0, \quad CC^T + DD^T = I. \tag{N9.E.6}$$
Now, multiply on the right by D the first identity in (N9.E.6), to get from
(N9.E.5)
$$\begin{aligned} D &= AA^TD + BB^TD \\ &= A(I + C^TB) + BB^TD \\ &= A + AC^TB + BD^TB \\ &= A + (AC^T + BD^T)B = A \end{aligned}$$
by the second identity in (N9.E.6). Next, multiply on the right by B the
last identity in (N9.E.6) and use, as before, (N9.E.5) to get
$$\begin{aligned} B &= CC^TB + DD^TB \\ &= C(A^TD - I) + DD^TB \\ &= CA^TD - C + DB^TD \\ &= -C + (CA^T + DB^T)D = -C \end{aligned}$$
by the second identity in (N9.E.6). We have thus shown that
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \in Sp(2n, \mathbb{R}) \cap O(2n)$$
iff A = D, B = −C, $A^TA + C^TC = I$, and $A^TC$ is symmetric, which
coincide with the conditions (N9.E.4) characterizing U(n).
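Notice that, since elements of Sp(2n, R) have determinant 1, this
intersection already lies in SO(2n, R), that is,
Sp(2n, R) ∩ SO(2n, R) = U(n).
This is consistent with the relation det C(A + iB) = |det(A + iB)|², which
equals 1 for every A + iB ∈ U(n).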
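The proposition can be spot-checked by embedding a sample unitary matrix via A + iB ↦ [[A, −B], [B, A]] and testing the orthogonal and symplectic conditions directly. The following is a sketch with assumed helper names and an assumed test matrix; J is taken as [[0, I], [−I, 0]].

```python
def realify(Z):
    """Map A + iB to the real block matrix [[A, -B], [B, A]]."""
    top = [[z.real for z in row] + [-z.imag for z in row] for row in Z]
    bottom = [[z.imag for z in row] + [z.real for z in row] for row in Z]
    return top + bottom

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def close(X, Y, tol=1e-12):
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def is_orthogonal_and_symplectic(Z):
    """Return (M^T M == I, M^T J M == J) for M = realify(Z)."""
    n = len(Z)
    M = realify(Z)
    identity = [[1.0 if i == j else 0.0 for j in range(2 * n)] for i in range(2 * n)]
    J = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        J[i][n + i] = 1.0   # upper right block  I
        J[n + i][i] = -1.0  # lower left block  -I
    Mt = transpose(M)
    return close(matmul(Mt, M), identity), close(matmul(matmul(Mt, J), M), J)
```

A unitary input gives `(True, True)`, while an invertible but non-unitary input such as diag(2, 1) fails the orthogonality test.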
The Group GL(n, H). By analogy to Rn and Cn we define quaternionic

n space by
Hn = {a = (a1 , . . . , an ) | ai ∈ H}.
This satisfies all axioms of an n-dimensional vector space over H with the
sole exception that H is not a field, being non-commutative. The group
GL(n, H) is defined to be the set of all invertible H–linear maps $T : \mathbb{H}^n \to \mathbb{H}^n$
defined by left multiplication by an n × n matrix $[t_{pr}]$, with $t_{pr} \in \mathbb{H}$, that
is,
$$(Ta)_r = \sum_{p=1}^{n} t_{pr} a_p,$$
for $a \in \mathbb{H}^n$. Because of non–commutativity of H, care has to be taken with
the concept of H–linearity. It is straightforward to note that
T(aα) = (Ta)α,
for any α ∈ H, but that T(αa) ≠ α(Ta), in general. Therefore, usual matrix
multiplication is a right–linear map and, in general, it is not left–linear over
H. In complete analogy with the real case, $\mathbb{C}^{2n}$ and $\mathbb{H}^n$ are isomorphic.
However, there is a lot of structure that we shall exploit below by realizing
left quaternionic matrix multiplication as a complex linear map. To achieve
this, we shall identify, as before, i ∈ C with the quaternion i ∈ H and will
define the fundamental right complex isomorphism
χ : C2n → Hn
by
χ(u, v) = u + jv,
where $u, v \in \mathbb{C}^n$, and we regard C embedded in H by x + iy ↦ x + iy, for
x, y ∈ R. We have
χ((u, v)α) = χ(u, v)α
for all α ∈ C. So, again, we get only right linearity. The key property of
χ is that it turns a left quaternionic matrix multiplication operator into
a usual complex linear operator on $\mathbb{C}^{2n}$. Indeed, if $[t_{pr}]$ is a quaternionic
n × n matrix, then $\chi^{-1}T\chi : \mathbb{C}^{2n} \to \mathbb{C}^{2n}$ is complex linear. To verify this,
let α ∈ C, $u, v \in \mathbb{C}^n$ and note that
$$\begin{aligned} (\chi^{-1}T\chi)(\alpha(u, v)) &= (\chi^{-1}T\chi)((u, v)\alpha) = (\chi^{-1}T)((\chi(u, v))\alpha) \\ &= \chi^{-1}((T\chi(u, v))\alpha) = \left((\chi^{-1}T\chi)(u, v)\right)\alpha \\ &= \alpha\,(\chi^{-1}T\chi)(u, v). \end{aligned}$$
Let us determine, for example, the 2n × 2n complex matrix J that cor-
responds to the right linear quaternionic map given by left multiplication
with the diagonal map jI. We have
J(u, v) = (χ−1 jIχ)(u, v)
= (χ−1 jI)(u + jv) = χ−1 (ju − v)
= (−v, u),
that is,
$$J = \begin{bmatrix} 0 & I \\ -I & 0 \end{bmatrix}$$
is the canonical symplectic structure on $\mathbb{C}^n \times \mathbb{C}^n = \mathbb{C}^{2n}$. Define the injective

map between the space of right linear quaternionic maps on Hn defined by
left multiplication by a matrix to the space of complex linear maps on C2n
by T → Tχ := χ−1 T χ. Among all the complex linear maps C2n → C2n we
want to characterize those that correspond to left matrix multiplication on
Hn . To achieve this, write
T = A + jB
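where A and B are complex n × n matrices. The relation
$$(A + jB)(u + jv) = Au - \bar{B}v + j(Bu + \bar{A}v),$$
obtained by using the identity $j\alpha = \bar{\alpha}j$ for α ∈ C, shows that
$$T_\chi(u, v) = \chi^{-1}T(u + jv) = (Au - \bar{B}v, \; Bu + \bar{A}v) = \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}.$$
Thus
$$H : A + jB \in gl(n, \mathbb{H}) \mapsto \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \in gl(2n, \mathbb{C})$$
satisfies
$$H\left((A + jB)^\dagger\right) = [H(A + jB)]^\dagger,$$
where $(A + jB)^\dagger = \bar{A}^T - \bar{B}^T j$, and
trace H(A + jB) = 2 Re trace (A + jB).
Also, H is a homomorphism relative to multiplication. Therefore, the Lie
algebra gl(n, H) is isomorphic over C to the complex Lie algebra
$$u^*(2n) := \left\{ \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \;\middle|\; A, B \in gl(n, \mathbb{C}) \right\}.$$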
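For n = 1 the multiplicativity of the map H can be checked directly. In the sketch below a quaternion a + jb is stored as a pair (a, b) of complex numbers, and the product rule is derived from jα = ᾱj and j² = −1; the storage convention and function names are assumptions made for this example.

```python
def qmul(x, y):
    """Product of quaternions x = a + j b and y = c + j d, stored as pairs of
    complex numbers, using j*alpha = conj(alpha)*j and j*j = -1."""
    a, b = x
    c, d = y
    return (a * c - b.conjugate() * d, a.conjugate() * d + b * c)

def to_complex_matrix(x):
    """The 2 x 2 complex matrix [[a, -conj(b)], [b, conj(a)]] of a + j b."""
    a, b = x
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def matmul(X, Y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Y)] for row in X]
```

The matrix of a product equals the product of the matrices, while `qmul` itself is not commutative, which is why the order of factors matters throughout this subsection.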
Define
sl(n, H) = {T ∈ gl(n, H) | Re trace T = 0} .
From the above considerations, it follows that sl(n, H) is isomorphic over

C to the complex Lie algebra
su∗ (2n) = {M ∈ u∗ (2n) | trace M = 0} .
Since H is injective and preserves multiplication, it follows that
$$U^*(2n) := H(GL(n, \mathbb{H})) = \left\{ \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \;\middle|\; A, B \in gl(n, \mathbb{C}), \; \det \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \neq 0 \right\}$$
is a closed Lie subgroup of GL(2n, C). Realizing GL(n, H) as U*(2n) avoids
the introduction of the concept of determinant of a square matrix with
entries in H, which is possible (this determinant is called the Dieudonné
determinant), but would take us too far afield. Thus one defines
$$SU^*(2n) = \left\{ \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \;\middle|\; A, B \in gl(n, \mathbb{C}), \; \det \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} = 1 \right\}$$
and
$$SL(n, \mathbb{H}) = H^{-1}(SU^*(2n)).$$
The subgroups SU*(2n) and SL(n, H) are closed in U*(2n) and GL(n, H)
respectively.
Proceeding as in the real and complex cases (and using the quaternionic
inner product
$$\langle M, N \rangle = \operatorname{trace}(MN^\dagger)$$
for M, N ∈ gl(n, H)) one gets
Proposition N9.E.2. The quaternionic general linear group GL(n, H)
is isomorphic over C to U*(2n) and has complex dimension 2n². It is a
noncompact connected Lie group. Its Lie algebra gl(n, H) consisting of n × n
quaternionic matrices is isomorphic over C to u*(2n). The quaternionic
special linear group SL(n, H) is isomorphic over C to SU*(2n). Its complex
dimension is 2n² − 1 and it is a noncompact connected closed Lie subgroup of
GL(n, H). Its Lie algebra is sl(n, H) which is isomorphic over C to su*(2n).
As usual, the connectedness statements need some comments. We shall
see in §9.3 that SL(n, H) is connected because Sp(2n), to be introduced
below, is connected. The connectedness of GL(n, H) follows from the exact
sequence
{1} → H \ {0} → GL(n, H) → SL(n, H) → {I}
and the theorem stating that if H is closed subgroup of G such that both
H and G/H are connected, then G is connected (see Varadarajan [1974]
or Abraham, Marsden, and Ratiu [1988] for the general case of bundles).
The Unitary Symplectic Group Sp(2n). We want to construct a

group analogous to O(n) when we worked with Rn , or to U(n) when we
worked with Cn .
For this, we introduce the quaternionic inner product
$$\langle a, b \rangle_{\mathbb{H}} = \sum_{p=1}^{n} a_p \bar{b}_p,$$
where $a, b \in \mathbb{H}^n$ and $\bar{b}_p$ is the quaternion conjugate to $b_p$, for p = 1, ..., n.

Again, the usual axioms for the inner product are satisfied, by being careful
in the scalar multiplication by quaternions, that is,
(i) a1 + a2 , b = a1 , b + a2 , b,
(ii) αa, b = αa, b and a, bα = a, bα, for all α ∈ H,
(iii) a, b = b, a,
(iv) a, a ≥ 0 and a, a = 0 iff a = 0.
Any quaternionic vector can be written as u + jv ∈ Hn , where u, v ∈ Cn .
A straightforward computation shows that
u1 + jv1 , u2 + jv2 = u1 , u2 + v1 , v2

+ j v1 , u2 − u1 , v2 .
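If T ∈ GL(n, H), express it as T = A + jB, with A, B ∈ gl(n, C), use the
homomorphism H and, defining
$$Sp(2n) = \left\{ T \in GL(n, \mathbb{H}) \mid T^\dagger T = TT^\dagger = I \right\},$$
express the defining condition in terms of 2n × 2n complex matrices, that
is, in U*(2n). We get
$$\begin{aligned} Sp(2n) &= \left\{ \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \;\middle|\; AA^\dagger + \bar{B}B^T = I, \; AB^\dagger \text{ symmetric} \right\} \\ &= \left\{ \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \;\middle|\; A^\dagger A + B^\dagger B = I, \; A^TB \text{ symmetric} \right\} \end{aligned} \tag{N9.E.7}$$
whose Lie algebra is clearly
$$\begin{aligned} sp(2n) &= \left\{ \begin{bmatrix} A & -\bar{B} \\ B & \bar{A} \end{bmatrix} \in gl(2n, \mathbb{C}) \;\middle|\; A + A^\dagger = 0, \; B^T = B \right\} \\ &= \left\{ T \in gl(n, \mathbb{H}) \mid T^\dagger + T = 0 \right\}. \end{aligned}$$
Note that the trace of any element in sp(2n) is necessarily zero and hence
that any element in Sp(2n) has determinant equal to 1. Thus, unlike the
case of real or complex matrices, where the isometry condition did not
imply that the determinant is one (and hence we distinguished between
O(n) and SO(n), U(n) and SU(n)), in the case of quaternionic matrices
there is only one group of isometries, namely Sp(2n), and the determinant
equal to 1 condition is automatically satisfied.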
Proposition N9.E.3. The unitary symplectic group Sp(2n) is the group
of isometries of $\mathbb{H}^n$. It is a compact connected subgroup of SL(n, H) ≅ SU*(2n)
of real dimension 2n² + n whose Lie algebra is sp(2n).
Compactness is proved exactly as in the real or complex case by showing
that the norm of an element in Sp(2n) is equal to $\sqrt{2n}$, and the proof of
connectedness is, as usual, deferred to §9.3. From our previous consideration
it immediately follows that:
Proposition N9.E.4.
Sp(2n) = SU∗ (2n) ∩ U(2n).
The Complex Symplectic Group Sp(2n, C) is defined exactly as in
the real case by the condition
$$Sp(2n, \mathbb{C}) = \left\{ T \in GL(2n, \mathbb{C}) \mid T^TJT = J \right\}.$$
It is a noncompact connected closed Lie subgroup of GL(2n, C) of complex
dimension 2n² + n and whose Lie algebra is
$$sp(2n, \mathbb{C}) = \left\{ T \in gl(2n, \mathbb{C}) \mid T^TJ + JT = 0 \right\}.$$
Proposition N9.E.5.
Sp(2n) = Sp(2n, C) ∩ U(2n).
Proof. Recall that
$$T = \begin{bmatrix} A & C \\ B & D \end{bmatrix} \in Sp(2n, \mathbb{C})$$
if and only if $A^TB$ and $C^TD$ are symmetric and $A^TD - B^TC = I$ (see
9.2.12). Also, T ∈ U(2n), if and only if
$$A^\dagger A + B^\dagger B = I, \quad C^\dagger C + D^\dagger D = I \quad\text{and}\quad A^\dagger C + B^\dagger D = 0.$$
From the characterization (N9.E.7) it follows that all these conditions hold.
Conversely, if these conditions hold, then
$$J = T^TJT = (T^T\bar{T})\,\bar{T}^{-1}JT = \overline{(T^\dagger T)}\,\bar{T}^{-1}JT = \bar{T}^{-1}JT$$
since $T^\dagger T = I$ because T ∈ U(2n). Therefore $\bar{T}J = JT$, which forces
$C = -\bar{B}$, $D = \bar{A}$. But then, these conditions imply those in (N9.E.7).
N9.F Generic Coadjoint Isotropy

Subalgebras are Abelian
The aim of this section is to prove a theorem of Duflo and Vergne [1969]
showing that, generically, the isotropy algebras for the coadjoint action are
Abelian. A very simple example is G = SO(3). Here $g^* \cong \mathbb{R}^3$ and $G_\mu = S^1$
for $\mu \in g^*$ and µ ≠ 0, and $G_0 = SO(3)$. Thus, $G_\mu$ is abelian on the open
dense set $g^* \setminus \{0\}$.
To prepare for the proof, we shall develop some tools.
If V is a finite-dimensional vector space, a subset A ⊂ V is called algebraic
if it is the common zero set of a finite number of polynomial functions
on V. It is easy to see that if $A_i$ is the zero set of a finite collection of
polynomials $C_i$, for i = 1, 2, then $A_1 \cup A_2$ is the zero set of the collection $C_1C_2$
formed by all products of an element in $C_1$ with an element in $C_2$. The
whole space V is the zero set of the zero polynomial, and the empty set is
the zero set of the constant polynomial equal to 1. Finally, if $A_\alpha$ is the
algebraic set given as the common zeros of some finite
collection of polynomials $C_\alpha$, where α ranges over some index set, then
$\bigcap_\alpha A_\alpha$ is the zero set of the collection $\bigcup_\alpha C_\alpha$. This zero set can also be
given as the common zeros of a finite collection of polynomials since the
zero set of any collection of polynomials coincides with the zero set of the
ideal in the polynomial ring generated by this collection and any ideal in
the polynomial ring over R is finitely generated (we accept this from algebra).
Thus, the collection of algebraic sets in V satisfies the axioms of the
collection of closed sets of a topology which is called the Zariski topology
of V.
Thus, the open sets of this topology are the complements of the algebraic
sets. For example, the algebraic sets of R are just R itself and the finite
sets, since every nonzero polynomial in R[X] has finitely many real roots (or
none at all). Granting
that we have a topology (the hard part), let us show that any nonempty Zariski
open set in V is open and dense in the usual topology. Openness is clear,
since algebraic sets are necessarily closed in the usual topology as inverse
images of 0 by a continuous map. To show that a nonempty Zariski open set U is
also dense, suppose the contrary, namely, that there is some x ∈ V\U and a
neighborhood $U_1 \times U_2$ of x in the usual topology such that
$(U_1 \times U_2) \cap U = \emptyset$ and $U_1 \subset \mathbb{R}$, $U_2 \subset V_2$
are open, where $V = \mathbb{R} \times V_2$, the splitting being achieved by the choice of
a basis. Since V\U is algebraic, there is a finite collection of polynomials
$p_1, \ldots, p_N \in \mathbb{R}[X_1, \ldots, X_n]$, n = dim V,
whose common zero set is V\U; in particular, each $p_i$ vanishes identically
on $U_1 \times U_2$. If $x = (x_1, \ldots, x_n) \in V$, then the
polynomials
$q_i(X_1) = p_i(X_1, x_2, \ldots, x_n) \in \mathbb{R}[X_1]$
all vanish identically on the open set $U_1 \subset \mathbb{R}$, which is impossible since
each $q_i$ has at most a finite number of roots. Therefore, $(U_1 \times U_2) \cap U = \emptyset$
is absurd and hence U must be dense in V.
Theorem N9.F.1 (Duflo and Vergne). Let g be a finite-dimensional Lie
algebra with dual g∗ and let r = min{dim gµ | µ ∈ g∗ }. The set
g∗reg := {µ ∈ g∗ | dim gµ = r}
is Zariski open and thus open and dense in the usual topology of g∗ . If
dim gµ = r, then gµ is abelian.
Proof (Due to J. Carmona, as presented in Rais [1972]). Define the map
ϕµ : G → g∗ by g → Ad∗g−1 µ. This is a smooth map whose range is the
coadjoint orbit Oµ through µ and whose tangent map at the identity is
Te ϕµ (ξ) = − ad∗ξ µ. Note that ker Te ϕµ = gµ and
range Te ϕµ = Tµ Oµ .
Thus, if n = dim g, we have
rank Te ϕµ = n − dim gµ ≤ n − r
since dim gµ ≥ r, for all µ ∈ g∗ . Therefore,
U = {µ ∈ g∗ | dim gµ = r} = {µ ∈ g∗ | rank(Te ϕµ ) = n − r}
and n − r is the maximal possible rank of all the linear maps
Te ϕµ : g → g∗ , µ ∈ g∗ .
Now choose a basis in g and induce the natural bases on g∗ and
L(g, g∗ ).
Let
Si = {µ ∈ g∗ | rank Te ϕµ = n − r − i}, 1 ≤ i ≤ n − r.
Then Si is the zero set of the polynomials in µ obtained by taking all
determinants of the (n − r − i + 1)-minors of the matrix representation
of Te ϕµ in these bases. Thus, Si is an algebraic set. Since $\bigcup_{i=1}^{n-r} S_i$ is the
complement of U , it follows that U is a Zariski open set in g∗ , and hence
open and dense in the usual topology of g∗ .
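The rank condition defining the regular set can be checked by hand in small examples. The following Python sketch is an illustration only and is not part of the original argument: the function names `rank`, `isotropy_dim`, and `so3_constants` are ours, and we assume the convention ⟨ad∗ξ µ, η⟩ = ⟨µ, [ξ, η]⟩ (the rank does not depend on the sign convention). It computes dim gµ for g = so(3) from the structure constants.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def isotropy_dim(c, mu):
    """dim g_mu = n - rank of the matrix of the linear map xi -> ad*_xi mu.

    c[i][j][k] are structure constants: [e_i, e_j] = sum_k c[i][j][k] e_k.
    The (j, i) matrix entry is <mu, [e_i, e_j]>.
    """
    n = len(mu)
    matrix = [[sum(mu[k] * c[i][j][k] for k in range(n)) for i in range(n)]
              for j in range(n)]
    return n - rank(matrix)

def so3_constants():
    """Structure constants of so(3): [e_i, e_j] = eps_ijk e_k."""
    c = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        c[i][j][k] = 1
        c[j][i][k] = -1
    return c

c = so3_constants()
print(isotropy_dim(c, [0, 0, 0]))  # 3: every element fixes mu = 0
print(isotropy_dim(c, [1, 2, 3]))  # 1: the minimal value r for so(3)
```

In this example r = 1, the regular set is the complement of the origin (the zero set of the 2 × 2 minors), and each regular isotropy algebra is one-dimensional, hence abelian, in agreement with the theorem.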
Now let µ ∈ g∗ be such that dim gµ = r and let V be a complement to
gµ in g, that is,
g = V ⊕ gµ .
Then Te ϕµ |V is injective. Fix ν ∈ g∗ and define
S = {t ∈ R | Te ϕµ+tν |V is injective.}
Note that 0 ∈ S and that S is open in R because the set of injective linear
maps is open in L(g, g∗ ) and µ → Te ϕµ is continuous. Thus, S contains
an open neighborhood of 0 in R. Since the rank of a linear map can only
increase by slight perturbations, we have rank
Te ϕµ+tν |V ≥ rank Te ϕµ |V = n − r,
for |t| small, and by maximality of n − r, this forces
rank Te ϕµ+tν = n − r
for t in a neighborhood of 0 contained in S. Thus, for |t| small,
Te ϕµ+tν |V : V → Tµ+tν Oµ+tν
is an isomorphism. Hence, if ξ ∈ gµ , ad∗ξ (µ + tν) ∈ Tµ+tν Oµ+tν is the image

of a unique ξ(t) ∈ V under Te ϕµ+tν |V, that is,
ξ(t) = (Te ϕµ+tν |V )−1 (ad∗ξ (µ + tν)).
This formula shows that for |t| small, t → ξ(t) is a smooth curve in V and
ξ(0) = 0. However, since
ad∗ξ (µ + tν) = −Te ϕµ+tν (ξ),
the definition of ξ(t) is equivalent to Te ϕµ+tν (ξ(t) + ξ) = 0, that is,
ξ(t) + ξ ∈ gµ+tν .
Similarly, given η ∈ gµ , there exists a unique η(t) ∈ V such that
η(t) + η ∈ gµ+tν , η(0) = 0,
and t → η(t) is smooth for small |t|. Therefore, the map
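\[ t \mapsto \langle \mu + t\nu, [\xi(t) + \xi, \eta(t) + \eta] \rangle \]
is identically zero for small |t|. In particular, its derivative at t = 0 is also
zero. But this derivative equals
\[
\begin{aligned}
&\langle \nu, [\xi, \eta] \rangle + \langle \mu, [\xi'(0), \eta] \rangle + \langle \mu, [\xi, \eta'(0)] \rangle \\
&\qquad = \langle \nu, [\xi, \eta] \rangle - \langle \operatorname{ad}^*_\eta \mu, \xi'(0) \rangle + \langle \operatorname{ad}^*_\xi \mu, \eta'(0) \rangle = \langle \nu, [\xi, \eta] \rangle ,
\end{aligned}
\]
since ξ, η ∈ gµ . Thus, $\langle \nu, [\xi, \eta] \rangle = 0$ for any ν ∈ g∗ , that is,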
[ξ, η] = 0.
Since ξ, η ∈ gµ are arbitrary, it follows that gµ is Abelian.

We close this section with a different proof (due to R. Fillippini) that

for µ ∈ g∗reg , gµ is Abelian. It uses concepts from later in the book on
momentum maps and collective Hamiltonians, so its proof can be deferred.
Another proof due to Weinstein [1983], page 535 is also instructive.
Proof that gµ is Abelian for µ ∈ g∗reg . The momentum map for the lifted
left action of G on T ∗ G is given by right translation to the identity, Jλ = ρ.
The momentum map for the lifted right action of G on T ∗ G is given by
left translation to the identity, Jρ = λ. Thus,
XJˆλ (ξ) (p) = ξ · p and XJˆρ (ξ) (p) = p · ξ
for all p in T ∗ G and ξ in g.

Given F : g∗ → R, a straightforward calculation shows that

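\[ X_{F \circ J_\lambda}(p) = \frac{\delta F}{\delta J_\lambda(p)} \cdot p \quad\text{and}\quad X_{F \circ J_\rho}(p) = p \cdot \frac{\delta F}{\delta J_\rho(p)} \]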
for all p ∈ T ∗ G and ξ ∈ g.
If F is constant on the coadjoint orbits, then F ◦ Jλ = F ◦ Jρ , hence

\[ \frac{\delta F}{\delta J_\lambda(p)} \cdot p = p \cdot \frac{\delta F}{\delta J_\rho(p)} \]
for all p ∈ T ∗ G. In particular, if µ ∈ g∗ and g ∈ Gµ, so that g · µ = µ · g,
we deduce that
\[ \frac{\delta F}{\delta \mu} \cdot (g \cdot \mu) = (g \cdot \mu) \cdot \frac{\delta F}{\delta \mu} . \]
We know that g∗reg is an open subset of g∗ . Fix µ ∈ g∗reg . There is then a
neighborhood U of µ and a surjective submersion π : U → U/G. If F : U →
R factors through π : U → U/G, then a straightforward calculation shows
that the preceding equation remains valid, and a similar calculation
shows that δF/δµ ∈ gµ for all µ ∈ U . We now show conversely that given
any ξ ∈ gµ , there exists a smooth function F : U → R that factors through
π : U → U/G such that δF/δµ = ξ.
Let
[µ, g] = {ad∗η µ | η ∈ g}.
Note that [µ, g] can be identified with Tµ O, O being the coadjoint orbit
through µ. It follows that we may identify
\[ T_{\pi(\mu)}(U/G) \cong \mathfrak{g}^*/[\mu, \mathfrak{g}]. \]
Since the linear map ξˆ : g∗ → R factors through g∗ /[µ, g], it follows that
there exists a smooth map ϕ : U/G → R for which
(dϕ)π(µ) · (Tµ π · ν) = ν (ξ)
for all ν ∈ g∗ . Let F = ϕ ◦ π. Then δF/δµ = ξ.

It now follows that, for µ regular, ξ · (g · µ) = (g · µ) · ξ for all g ∈ Gµ
and ξ ∈ gµ . Taking g = e, we see that ξ · µ = µ · ξ; this is of course already
clear, since ξ ∈ gµ . It follows that
(ξ · g) · µ = ξ · (g · µ) = (g · µ) · ξ = g · (µ · ξ) = g · (ξ · µ) = (g · ξ) · µ.
Since G acts freely on T ∗ G, it follows that
ξ·g =g·ξ
for all g ∈ Gµ and ξ ∈ gµ . By differentiating this relation in g at g = e, we

see that gµ is Abelian.
Exercise
⋄ N9.F-1. Prove the following generalization of the Duflo–Vergne Theorem
due to Guillemin and Sternberg [1984]. Let S be an infinitesimally invariant
submanifold of g∗ , that is, ad∗ξ µ ∈ Tµ S, whenever µ ∈ S and ξ ∈ g. Let
r = min{dim gµ | µ ∈ S}. Then dim gµ = r implies
\[ [\mathfrak{g}_\mu, \mathfrak{g}_\mu] \subset (T_\mu S)^0 = \{ \xi \in \mathfrak{g} \mid \langle u, \xi \rangle = 0 \text{ for all } u \in T_\mu S \}. \]
In particular gµ /(Tµ S)0 is abelian. Note that the Duflo–Vergne Theorem

is the case for which S = g∗ . [The solution to this exercise is given at the
end of this internet supplement].
N9.G Some Infinite Dimensional Lie Groups

Infinite dimensional groups often arise as configuration spaces, symmetry
groups, or gauge groups of physical systems with an infinite number of
degrees of freedom. These groups often consist of functions, operators, or
diffeomorphisms. Here we present without proof a number of facts about
these infinite dimensional groups. To make the details of this infinite di-
mensional geometry precise, lengthy proofs involving delicate functional
analytic and topological issues would need to be addressed. We shall dis-
cuss some of these issues, but shall not present all the detailed proofs.
(These may be found in Palais [1968], Ebin and Marsden [1970], Ratiu and
Schmid [1981], and Adams, Ratiu, and Schmid [1986].) In particular, we
will find that some of the examples we discuss in this section are not Lie
groups in the strict sense of our previous definitions, and caution will be
required when applying the general theory to them. Fortunately, one can
understand many of the main ideas and techniques without needing all the
technicalities.
Examples
A. Consider a compact manifold M (possibly with boundary) and the
infinite dimensional vector space C ∞ (M ) of all smooth real-valued functions
on M . Evidently, C ∞ (M ) is an abelian group with pointwise defined group
operations; that is,
µ : C ∞ (M ) × C ∞ (M ) → C ∞ (M ), (f, g) → f + g
is defined by (f + g)(x) = f (x) + g(x), x ∈ M , and
I : C ∞ (M ) → C ∞ (M ), f → −f
is defined by (−f )(x) = −(f (x)). The unit element e = 0 is given by

e(x) = 0 ∈ R for all x ∈ M . As for any vector space, we formally have
Te C ∞ (M ) = C ∞ (M ), so the Lie algebra of C ∞ (M ) coincides with C ∞ (M ),
and the bracket is trivial: [f, g] = 0.
The space C ∞ (M ) is not a Banach Lie group since spaces of C ∞ func-
tions do not form a Banach space. To get a Banach Lie group we can
complete C ∞ (M ) to C k (M ), 0 ≤ k < ∞ or to H s (M ), s ≥ 0.
Here H s denotes a Sobolev space, whose definition and prop-

erties are summarized in a separate section below.
Thus C k (M ) and H s (M ) are Banach Lie groups. For noncompact M , it is
sometimes useful to consider weighted Sobolev spaces for technical reasons
involving elliptic equations. In fact it is usually necessary to use Sobolev
spaces to prove facts about elliptic and hyperbolic equations. This is why
C ∞ , while formally nicer, is not suitable for many facts about partial dif-
ferential equations.
The vector group C ∞ (R3 ) and the Banach Lie groups C k (R3 ) and H s (R3 )
are closely related to the gauge group for electromagnetism: Maxwell’s
equations are invariant under the gauge transformation of the vector po-
tential A → A + ∇ϕ, for all ϕ ∈ C ∞ (R3 ). The gauge group for a general
Yang-Mills field also forms a Banach Lie group.
B. Next consider a manifold M and the space
C ∞ (M, R\{0}) = {f : M → R\{0} | f is smooth }
of real valued, nowhere vanishing functions on M . Thus, C ∞ (M, R\{0}) is

a group with pointwise defined group multiplication
µ : C ∞ (M, R\{0}) × C ∞ (M, R\{0}) → C ∞ (M, R\{0})
defined by (f, h) → f h where (f h)(x) = f (x)h(x), x ∈ M , and inversion
I : C ∞ (M, R\{0}) → C ∞ (M, R\{0})

defined by f → f −1 where f −1 (x) = 1/f (x), and x ∈ M . The unit element

e = 1 satisfies e(x) = 1 ∈ R, for all x ∈ M . This group is abelian since
R\{0} is abelian, and so its Lie algebra is C ∞ (M ) = C ∞ (M, R), with the
trivial bracket [f, g] = 0.
The spaces C k (M, R\{0}) and H s (M, R\{0}) are Banach Lie groups
for compact M and certain values of k and s given shortly. Note that
C k (M, R\{0}) is a subset of the Banach space C k (M, R), but in general it is
not an open subset nor a submanifold. However, it is open if M is compact.
Thus, for compact M , C k (M, R\{0}), k ≥ 0, is a Banach Lie group. For
H s (M, R\{0}) we need M compact as well, but even then, H s need not be
closed under pointwise multiplication. This requires, in addition, s > n/2,
n = dim M , as is discussed in the supplement below. For non-compact M
one can use different topologies or replace R\{0} by a compact Lie group
G. In the latter case, C k (M, G) and H s (M, G), for s > (1/2) dim(M ), are
Banach Lie groups under pointwise multiplication
µ(f, g)(x) = f (x)g(x), x ∈ M,
the latter product taken in G and inversion
I(f )(x) = (f (x))−1 .
The Lie algebra is C k (M, g) and H s (M, g) respectively with bracket
[ξ, η](x) = [ξ(x), η(x)], x ∈ M,
the bracket on the right-hand side being taken in g. For example, if s > 3/2,
then H s (R3 , S 1 ) is an abelian Banach Lie group under pointwise multipli-
cation, with Lie algebra H s (R3 ), [· , ·] = 0. If G is a compact Lie group with
Lie algebra g and s > n/2, then H s (Rn , G) has Lie algebra H s (Rn , g) and
bracket [f, g](x) = [f (x), g(x)], the latter bracket being in g.
Diffeomorphism Groups Among the most important “classical” ex-
amples of infinite dimensional groups are the diffeomorphism groups of
manifolds. Let M be a compact boundaryless manifold and denote by
H s Diff (M ) the set of all H s diffeomorphisms of M to M , s > n/2 + 1.
We will now outline the sense in which H s Diff (M ) is a Lie group with
Lie algebra H s -X(M ), the H s vector fields on M . Similar results will be
valid for C k Diff (M ) [resp. W s,p Diff (M )], the group of C k -diffeomorphisms
of M [resp. W s,p -diffeomorphisms, of M ].
The set H s Diff (M ) is a smooth Banach manifold and is a group under
composition; explicitly, the group operations are
µ : H s - Diff (M ) × H s -Diff (M ) → H s - Diff (M ); µ(f, g) = f ◦ g
and
I : H s - Diff (M ) → H s - Diff (M ); I(f ) = f −1 .
The unit element e is the identity map. For s > (n/2) + 1, H s Diff(M )
is a Banach manifold and in fact is an open submanifold of the Banach-
manifold H s (M, M ). The condition s > (n/2) + 1 guarantees that elements
of H s Diff(M ) are C 1 and a map C 1 close to a diffeomorphism is a diffeo-
morphism by the inverse function theorem (plus an additional argument to
guarantee it is globally one to one and onto; see Marsden, Ebin, and Fis-
cher [1972]). The manifold H s Diff(M ) is not, however, a Banach Lie group,
since group multiplication is differentiable only in the following restricted
sense. Right multiplication
Rg : H s Diff(M ) → H s Diff(M ); Rg (f ) = f ◦ g
is smooth (C ∞ ) for each g ∈ H s Diff(M ), and if g ∈ H s+k Diff(M ), left

multiplication
Lg : H s Diff(M ) → H s Diff(M ), Lg (f ) = g ◦ f
is differentiable of class C k . Hence if g ∈ H s Diff(M ), Lg is only C 0 , that

is, continuous. More generally, composition
(f, g) ∈ H s+k Diff(M ) × H s Diff(M ) → f ◦ g ∈ H s Diff(M )
is a C k -map. Therefore,
group multiplication is continuous, but is not smooth.
The inversion map I : f → f −1 is continuous when regarded as a map of

H s to H s , but is C k if regarded as a map of H s+k Diff(M ) to H s Diff(M ).
The tangent space Tf (H s Diff(M )) at f ∈ H s Diff(M ) is the space of H s
vector fields along f ; explicitly, elements of Tf (H s Diff(M )) are given by
Tf (H s - Diff(M )) = {Xf : M → T M | Xf is H s and τM ◦ Xf = f } ,
where τM : T M → M denotes the tangent bundle projection. In particular,

for f = e = idM ,
Te (H s -Diff(M )) = {X : M → T M | X is H s and τM ◦ X = idM }

= H s (T M ).
The idea behind the above assertions is as follows. An element of the tan-
gent space at the point f , namely, Tf (H s Diff(M )) is the tangent vector to a
curve f (t) ∈ H s Diff(M ) at t = 0 where f (0) = f . But each f (t) maps M to
M , so for x ∈ M , f (t)(x) is a curve in M . Thus (d/dt)f (t)(x)|t=0 ∈ Tf (x) M
and so we get a map Xf of M to T M taking x to an element of Tf (x) M . In
particular, the tangent space at the identity e is the space of H s vector
fields on M . The tangent manifold T (H s Diff(M )) can be identified with
the set of all mappings from M to T M that cover diffeomorphisms and it
is again an infinite dimensional Banach manifold. It can be shown that the

map
T R : H s+k (T M ) × H s Diff (M ) → T (H s Diff (M )),
defined by T R(X, f ) = Te Rf (X) is a C k map.

A vector field X on H s Diff(M ) is a map
X : H s Diff(M ) → T (H s Diff(M ))
such that
X(f ) ∈ Tf (H s Diff(M )) for all f ∈ H s Diff(M ).
It is right invariant if (Rg )∗ X = X , for any g ∈ H s Diff(M ); similarly

it is left invariant if (Lg )∗ X = X. It is not hard to see that T Rg = Rg
and that T Lg = LT g ; explicitly, T Rg and
T Lg : T (H s Diff(M )) → T (H s Diff(M ))
at Xf ∈ Tf (H s Diff(M )) are given by
T Rg (Xf ) = Xf ◦ g ∈ Tf ◦g (H s Diff(M ))
and
T Lg (Xf ) = T g ◦ Xf ∈ Tg◦f (H s Diff(M )).
The diagrams in Figure N9.G.1 may help to clarify the situation: in them,
T Rg (Xf ) = Xf ◦ g is a vector field along f ◦ g and T Lg (Xf ) = T g ◦ Xf is
a vector field along g ◦ f .
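The two formulas for T Rg and T Lg can be checked numerically in a toy setting. The sketch below is an illustration under simplifying assumptions that are ours and not the text's: M is replaced by the real line, the maps `f` and `g` and the tangent vector `X` are arbitrary sample choices, and the curve through f is taken to be f + tX. It compares a finite-difference velocity of t ↦ ft ◦ g and t ↦ g ◦ ft with Xf ◦ g and T g ◦ Xf.

```python
import math

def f(x):
    return x + 0.3 * math.sin(x)      # a diffeomorphism of R (f' > 0)

def g(x):
    return x + 0.2 * math.cos(x)      # another diffeomorphism of R

def dg(x):
    return 1.0 - 0.2 * math.sin(x)    # derivative of g, i.e. Tg in one dimension

def X(x):
    return math.cos(2.0 * x)          # tangent vector at f: X(x) lies in T_{f(x)} R

def curve(t):
    """A curve f_t through f whose velocity at t = 0 is X."""
    return lambda x: f(x) + t * X(x)

def velocity(path, x, h=1e-6):
    """Central difference approximation of d/dt path(t)(x) at t = 0."""
    return (path(h)(x) - path(-h)(x)) / (2.0 * h)

def right_translate(t):
    ft = curve(t)
    return lambda x: ft(g(x))         # R_g(f_t) = f_t o g

def left_translate(t):
    ft = curve(t)
    return lambda x: g(ft(x))         # L_g(f_t) = g o f_t

x0 = 0.7
print(velocity(right_translate, x0), X(g(x0)))            # TR_g(X_f) = X_f o g
print(velocity(left_translate, x0), dg(f(x0)) * X(x0))    # TL_g(X_f) = Tg o X_f
```

Note that the right translate only composes X with g, whereas the left translate involves the derivative of g; this is the elementary reason why right multiplication is smooth while left multiplication by g costs one derivative of g.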
Figure N9.G.1. Mapping vector fields over maps.

As in finite dimensions, the spaces XR (H s Diff(M )) and XL (H s Diff(M ))

of right and left invariant vector fields on H s Diff(M ) are isomorphic as
vector spaces to Te (H s Diff(M )) = H s (T M ). The isomorphisms are given
by ξ ∈ H s (T M ) → Yξ ∈ XR (H s Diff(M )), where
Yξ (f ) := ξ ◦ f ∈ Tf (H s -Diff(M )),
and ξ ∈ H s (T M ) → Xξ ∈ XL (H s Diff(M )), where
Xξ (f ) := T f ◦ ξ ∈ Tf (H s -Diff(M )).
Note that Xξ (e) = ξ and Yξ (e) = ξ. See Figure N9.G.2.
Figure N9.G.2. Left and right invariant vector fields on Diff.
Proposition N9.G.1. Let ξ1 , ξ2 ∈ Te (H s Diff(M )) = H s (T M ). Then

for the corresponding right invariant vector fields Yξ1 , Yξ2 on H s Diff(M )
we have
[Yξ1 , Yξ2 ] (e) = [ξ1 , ξ2 ] , i.e., [Yξ1 , Yξ2 ] = Y[ξ1 ,ξ2 ] .
Proof. Recall that the Lie bracket [X, Y ] of two vector fields X and Y
on H s Diff(M ) is given by

\[ [X, Y] = \left. \frac{d}{dt} \right|_{t=0} F_t^* Y \]
where Ft is the flow of X. One checks that the flow of Yξ1 is given by
Ft (η) = ϕt ◦ η = Lϕt (η) where ϕt is the flow of ξ1 on M . Then

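\[
\begin{aligned}
\left. \frac{d}{dt} \right|_{t=0} (F_t^* Y_{\xi_2})(\eta)
&= \left. \frac{d}{dt} \right|_{t=0} (T F_{-t} \circ Y_{\xi_2} \circ F_t)(\eta) \\
&= \left. \frac{d}{dt} \right|_{t=0} (T\varphi_{-t} \circ \xi_2 \circ \varphi_t \circ \eta) \\
&= \left. \frac{d}{dt} \right|_{t=0} (\varphi_t^* \xi_2) \circ \eta = [\xi_1, \xi_2] \circ \eta = Y_{[\xi_1, \xi_2]}(\eta).
\end{aligned}
\]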
For the corresponding left invariant vector fields Xξ1 , Xξ2 on H s Diff(M ),
there is a sign change:
[Xξ1 , Xξ2 ] (e) = − [ξ1 , ξ2 ] , i.e., [Xξ1 , Xξ2 ] = −X[ξ1 ,ξ2 ] .
Note that the bracket [ξ1 , ξ2 ] is the ordinary Lie bracket of the vector fields
ξ1 , ξ2 on the manifold M . Due to this fact, we define the right “Lie” algebra
of the “Lie” group H s Diff(M ) to be the space of right invariant vector fields
on H s Diff(M ).
Thus the usual bracket of vector fields on M is the right

“Lie” algebra bracket. The Lie algebra bracket associated to the
conventional left invariant definition is the negative of the usual
Jacobi-Lie bracket of vector fields.
Note that for ξ1 , ξ2 ∈ H s (T M ), [ξ1 , ξ2 ] ∈ H s−1 (T M ) so one derivative is

lost. Hence [Yξ1 , Yξ2 ] is an H s−1 -vector field, and the “Lie” algebra is not
closed under the bracket. This corresponds to the fact that H s Diff(M ),
and likewise C k Diff(M ) are not Banach Lie groups.
For ξ ∈ H s (T M ) let ϕt ∈ (H s Diff(M )) be its flow. (That ϕt is H s if ξ
is H s is proved in Ebin and Marsden [1970]; cf. Abraham, Marsden, and
Ratiu [1988], Supplement 4.1C.) The curve c : R → H s Diff(M ), c(t) = ϕt ,
is an integral curve of the right invariant vector field Yξ on H s Diff(M ).
Indeed, c(0) = ϕ0 = e , and for x ∈ M ,
\[ \frac{d}{dt} c(t)(x) = \frac{d}{dt} \varphi_t(x) = \xi(\varphi_t(x)) = (\xi \circ c(t))(x) = Y_\xi(c(t))(x). \]
In particular, note that c′(0) = ξ. Thus the exponential map
exp : Te (H s - Diff(M )) = H s (T M ) → H s Diff(M )
is given by exp(ξ) = ϕ1 . The map exp is continuous, but unlike the case
for Banach Lie groups, it is not C 1 ; in fact, there is no neighborhood of the
identity onto which it maps surjectively (Kopell [1970]). As a result in this
direction, let us prove that if a diffeomorphism of S 1 has no fixed points
and is in the image of the exponential map, then it must be conjugate to a
rotation (Hamilton [1982]). Let η ∈ Diff(S 1 ) have no fixed points. If there
is ξ ∈ X(S 1 ) such that exp(ξ) = η, then ξ is nowhere vanishing. Write
ξ = f (t)(d/dt), where t ∈ R is a parameterization of S 1 modulo 2π. Now
reparametrize the circle by

\[ \theta = c \int \frac{dt}{f(t)} \quad\text{where}\quad c = \frac{1}{\displaystyle\int_0^{2\pi} \frac{dt}{f(t)}} \]
and note that ξ = c(d/dθ), i.e., ξ is constant in the parameterization θ and

therefore its exponential is given by θ → θ+c, i.e., it is a rotation. Since the
parameter change t → θ is given by a diffeomorphism of S 1 , it follows that

exp(ξ) is conjugate to the rotation θ → θ + c, which is what we claimed.
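The reparametrization argument can be tested numerically. The following Python sketch is an illustration only: the sample coefficient f(t) = 2 + sin t, the quadrature and Runge-Kutta helpers, and the normalization c = 1/∫ dt/f over one period (so that θ has period 1) are assumptions made for this example; any other positive normalization of c would serve equally well. It integrates the flow of ξ = f (t)(d/dt) for unit time and checks that the coordinate θ advances by the same constant c from every starting point.

```python
import math

def f(t):
    """Coefficient of the vector field xi = f(t) d/dt; it never vanishes."""
    return 2.0 + math.sin(t)

def integral_of_inverse(a, b, n=4000):
    """Composite Simpson rule for the integral of 1/f over [a, b] (n even)."""
    h = (b - a) / n
    total = 1.0 / f(a) + 1.0 / f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) / f(a + k * h)
    return total * h / 3.0

def flow(t0, time=1.0, steps=2000):
    """Classical RK4 for dt/dtau = f(t), integrated on the universal cover R."""
    t, h = t0, time / steps
    for _ in range(steps):
        k1 = f(t)
        k2 = f(t + 0.5 * h * k1)
        k3 = f(t + 0.5 * h * k2)
        k4 = f(t + h * k3)
        t += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return t

# Normalizing constant: theta = c * integral of dt/f has period 1.
c = 1.0 / integral_of_inverse(0.0, 2.0 * math.pi)

def theta_increment(t0):
    """theta(exp(xi)(t0)) - theta(t0), computed on the universal cover."""
    return c * integral_of_inverse(t0, flow(t0))

for t0 in (0.0, 1.0, 2.5):
    print(theta_increment(t0), c)   # the two numbers agree for every t0
```

The increment is independent of the starting point because dθ/dτ = (c/f )(dt/dτ ) = c along the flow, which is exactly the statement that ξ = c(d/dθ).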
These facts are important pathologies to keep in mind, but fortunately
they will not impair our main development. In our applications to fluid
dynamics and plasma physics, various subgroups of Diff(M ) will appear,
which we will consider later. In view of the pathologies mentioned above,
we cannot invoke Proposition 9.1.14 to prove that they are Lie subgroups
— other special arguments are needed.
Subgroups. In the same sense as H s Diff(M ), there are other diffeomor-
phism groups that are Lie groups. We shall review a few of them here,
following Ebin and Marsden [1970], §6 (see Marsden, Ratiu, and Shkoller
[1999] for additional examples). Let s > dim(M )/2 + 1. If M is a compact
boundaryless manifold and N ⊂ M is a closed submanifold (possibly zero
dimensional) without boundary, let
H s Diff N (M ) = {f ∈ H s Diff (M ) | f (N ) ⊂ N } ,
i.e., the diffeomorphisms keeping N setwise fixed. Arguing as in the case
of H s Diff(M ), one sees that the Lie algebra of H s Diff N (M ) is
\[ H^s_N(TM) = \{ X \in H^s(TM) \mid X(n) \in T_n N \text{ for all } n \in N \}, \]
the H s vector fields on M tangent to N . Indeed, if f (t) ∈ H s Diff N (M ) is
a curve with f (0) = e, f (t)(n) ∈ N and so (d/dt)|t=0 f (t)(n) ∈ Tn N .
Similarly, one can consider the group of H s -diffeomorphisms keeping the
submanifold N pointwise fixed, i.e.,
H s -Diff N,p (M ) = {f ∈ H s -Diff (M ) | f (n) = n for all n ∈ N } .
As before, it is easy to see that the Lie algebra of this group is
\[ H^s_{N,p}(TM) = \{ X \in H^s(TM) \mid X(n) = 0 \text{ for all } n \in N \}. \]
Now let M be a compact manifold with boundary and consider the group
H s Diff(M ). Since f (∂M ) = ∂M for every diffeomorphism f , the previous
argument used for the Lie algebra $H^s_N(TM)$ shows that the Lie algebra
here is
\[ H^s_\partial(TM) = \{ X \in H^s(TM) \mid X(x) \in T_x(\partial M) \text{ for all } x \in \partial M \}, \]
the H s vector fields tangent to the boundary. This group and its Lie algebra
are useful in the continuum mechanics of compressible fluids. If N ∩ ∂M =
∅, the groups H s Diff N (M ) and H s Diff N,p (M ) have Lie algebras equal to
\[ H^s_N(TM) \cap H^s_\partial(TM) \quad\text{and}\quad H^s_{N,p}(TM) \cap H^s_\partial(TM), \]
respectively. Similarly
H s Diff p (M ) = {f ∈ H s Diff (M ) | f (x) = x for all x ∈ ∂M }
has Lie algebra
Hps (T M ) = {X ∈ H s (T M ) | X(x) = 0 for all x ∈ ∂M } .
For a manifold with boundary ∂M , the boundaryless double M̃ is

obtained by gluing together two copies of M along the boundary. Then M̃
is a boundaryless smooth manifold, dim(M̃ ) = dim(M ), and M̃ is compact
if M is. One checks that H s Diff(M ) is a submanifold of H s (M, M̃ ) and
H s Diff p (M ) is a submanifold of both H s Diff(M ) and H s (M, M̃ ). Using
M̃ there is yet another group that often shows up in the literature. A
diffeomorphism f ∈ H s Diff(M ) is said to have support in M if and only
if f can be extended to f˜ ∈ H s Diff(M̃ ) with f˜|(M̃ \M ) = identity. Let
H s Diff 0 (M ) denote the H s -diffeomorphisms with support in M . Then for
s > (dim M )/2 + 1, the embedding
f ∈ H s Diff 0 (M ) → f˜ ∈ H s Diff(M̃ )
makes H s Diff 0 (M ) a closed submanifold of H s Diff(M̃ ). The Lie algebra

of H s Diff 0 (M ) is

\[ H^s_0(TM) = \{ X \in H^s(TM) \mid \text{there exists an } H^s\text{-extension } \tilde{X} \in H^s(T\tilde{M}) \text{ with } \tilde{X} \text{ zero on } \tilde{M} \setminus M \}. \]
N9.G.1 Basic Facts about Sobolev Spaces and Manifolds.
In our discussions of H s (M ) and H s Diff(M ), we implicitly used some basic
properties of Sobolev spaces. We summarize these here. More details may
be found in Adams [1975], Palais [1965, 1968], Ebin and Marsden [1970]
and in Marsden and Hughes [1983] (Ch. 6).
A fundamental point is that functional analysis requirements necessitate
the use of Sobolev spaces where C k spaces will not do. For example if
the Laplacian of f , namely, ∆f is C k , it does not follow that f is C k+2 .
However, its Sobolev analogue is true.
These spaces will first be defined over a domain Ω ⊂ Rn and are vector
subspaces of various Lp spaces. For any integer k ≥ 0 and 1 ≤ p < ∞ we
define a norm $\|\cdot\|_{k,p}$ on real or complex valued functions on Ω as follows:
\[ \|f\|_{k,p} = \Big( \sum_{|r|=0}^{k} \|D^r f\|_p^p \Big)^{1/p} \]
for any function f for which the right side makes sense, $\|\cdot\|_p$ being the
Lp (Ω)-norm. It is clear that $\|\cdot\|_{k,p}$ defines a norm on any vector space
of functions on which the right side takes finite values, provided that two
functions are identified in the space if they are equal almost everywhere in
Ω. Let H k,p (Ω) be the completion of
\[ \{ f \in C^k(\Omega) \mid \|f\|_{k,p} < \infty \} \]
with respect to the norm $\|\cdot\|_{k,p}$ . Then H k,p (Ω) is a Banach space called a
Sobolev space. For p = 2 we denote H k,2 (Ω) by H k (Ω).
For the reader familiar with distributional derivatives, another definition
may be helpful. Let
W k,p (Ω) = {f ∈ Lp (Ω) | Dr f ∈ Lp (Ω) for 0 ≤ |r| ≤ k}
where Dr f is the distributional rth derivative. Equipped with the norm

above, these are Banach spaces. Clearly W 0,p (Ω) = Lp (Ω). The Meyers-
Serrin theorem (Meyers and Serrin [1960]) states that H k,p (Ω) = W k,p (Ω).
The definition of W s,p (Ω) for an arbitrary s ≥ 0, i.e., s not an integer
is more complicated. Let s = k + σ where k is an integer and 0 < σ < 1.
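Then the norm $\|\cdot\|_{s,p}$ is defined by
\[ \|f\|_{s,p} = \Big( \|f\|_{k,p}^p + \sum_{|r|=k} \int_\Omega \int_\Omega \frac{|D^r f(x) - D^r f(y)|^p}{|x - y|^{n + \sigma p}} \, dx \, dy \Big)^{1/p} . \]
Let W s,p (Ω) denote the completion of $\{ f \in C^\infty(\Omega) \mid \|f\|_{s,p} < \infty \}$ and set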
H s (Ω) ≡ W s,2 (Ω). One treats W s,p (Ω, Rm ) in a similar way. Also, by
completing the space of functions of Ω that extend to C ∞ functions in
an open neighborhood of the closure Ω̄ one similarly defines W s,p (Ω̄) and
W s,p (Ω̄, Rm ).
Now we turn to Sobolev spaces on manifolds. Let M be a compact C ∞

manifold, possibly with boundary, and let C ∞ (M ) be the set of real-valued
C ∞ functions on M . For s and p a pair of positive integers, a Sobolev norm
on f ∈ C ∞ (M ) is defined as follows: let {(Ui , ϕi )}0≤i≤N be a finite atlas
of M , let $f_i = f \circ \varphi_i^{-1} : \varphi_i(U_i) \subset \mathbb{R}^n \to \mathbb{R}$ be the local representatives of f ,
and set
\[ \|f\|_{s,p} := \max_{0 \le i \le N} \|f_i\|_{s,p} , \]
where $\|\cdot\|_{s,p}$ are the Sobolev norms on ϕi (Ui ) ⊂ Rn defined as above
(respectively on the closure $\overline{\varphi_i(U_i)}$ in Rn+ at the boundary). The Sobolev
space W s,p (M ) is then defined to be the (Cauchy) completion of C ∞ (M )
with respect to $\|\cdot\|_{s,p}$ . One shows that the resulting space is independent
of the choice of atlas on M , but, of course, its norm is not. To get an
intrinsically defined norm one requires additional structure on M , such as
a Riemannian metric. The generalization of this definition to Rn -valued
functions on M , and further to sections of a vector bundle E → M should

be clear (presuming the existence of a vector bundle norm on E).
To obtain useful information concerning the Sobolev spaces W k,p , we
need to establish certain fundamental relationships between these spaces.
To do this, one uses the following fundamental inequality of Sobolev, as
generalized by Nirenberg and Gagliardo.
Theorem N9.G.2 (SNG Inequality). Let 1 ≤ q ≤ ∞, 0 ≤ r ≤ ∞, 0 ≤
j < m, j/m ≤ a ≤ 1, 0 < p < ∞, with j, m integers ≥ 0; assume that

\[ \frac{1}{p} = \frac{j}{n} + a \left( \frac{1}{r} - \frac{m}{n} \right) + (1 - a) \frac{1}{q} \tag{N9.G.1} \]
(if 1 < r < ∞ and m − j − (n/r) is an integer ≥ 0, assume (j/m) ≤ a < 1).
Then there is a constant C such that for any smooth u : Rn → R1 , we have
\[ \|D^j u\|_{L^p} \le C \, \|D^m u\|_{L^r}^{a} \, \|u\|_{L^q}^{1-a} . \tag{N9.G.2} \]
(If j = 0, rm < n, and q = ∞, assume u → 0 at ∞ or u lies in Lσ for
some finite σ > 0.)
Below we shall prove some special cases of this result. (The arguments
given by Nirenberg [1959] are geometric in flavor in contrast to the usual
Fourier transform proofs and therefore are more suitable for generalization
to manifolds; cf. Cantor [1975] and Aubin [1976].)
The above theorem remains valid for u defined on a region with piecewise
smooth boundary, or more generally if the boundary satisfies a certain
“cone condition.”
If one knows an inequality of the form (N9.G.2) exists, one can infer
that (N9.G.1) must hold by the following scaling argument: Replace u(x)
by u(tx) for a real t > 0. Then writing ut (x) = u(tx), one has
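\[
\begin{aligned}
\|D^j u_t\|_{L^p} &= t^{\,j - n/p} \, \|D^j u\|_{L^p} , \\
\|D^m u_t\|_{L^r}^{a} &= t^{\,a(m - n/r)} \, \|D^m u\|_{L^r}^{a} , \\
\|u_t\|_{L^q}^{1-a} &= t^{-n(1-a)/q} \, \|u\|_{L^q}^{1-a} .
\end{aligned}
\]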
Thus if (N9.G.2) is to hold for ut (with the constant independent of t), we

must have
\[ j - \frac{n}{p} = a \left( m - \frac{n}{r} \right) - \frac{n(1 - a)}{q} , \]
which is the relation (N9.G.1).
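The exponent bookkeeping in this scaling argument is easy to mechanize. The sketch below is an illustration only; the function names `sng_a` and `scaling_defect` are ours, and reciprocals 1/p, 1/r, 1/q are passed as arguments so that an infinite exponent is represented by 0. It solves (N9.G.1) for a in exact arithmetic and verifies that the powers of t on the two sides of (N9.G.2) then balance.

```python
from fractions import Fraction

def sng_a(n, j, m, inv_p, inv_r, inv_q):
    """Solve 1/p = j/n + a(1/r - m/n) + (1 - a)/q for the exponent a."""
    inv_p, inv_r, inv_q = Fraction(inv_p), Fraction(inv_r), Fraction(inv_q)
    coeff = inv_r - Fraction(m, n) - inv_q
    if coeff == 0:
        raise ValueError("the relation does not determine a")
    return (inv_p - Fraction(j, n) - inv_q) / coeff

def scaling_defect(n, j, m, inv_p, inv_r, inv_q, a):
    """Power of t on the left of (N9.G.2) minus the power on the right
    after replacing u(x) by u(tx); zero exactly when (N9.G.1) holds."""
    left = j - n * Fraction(inv_p)
    right = a * (m - n * Fraction(inv_r)) - n * (1 - a) * Fraction(inv_q)
    return left - right

half = Fraction(1, 2)

# n = 3, j = 0, p = infinity, m = 2, r = 2, q = 2.
a = sng_a(3, 0, 2, 0, half, half)
print(a, scaling_defect(3, 0, 2, 0, half, half, a))   # 3/4 0

# n = 3, j = 0, p = 6, m = 1, r = 2: the inequality ||u||_6 <= C ||grad u||_2.
a = sng_a(3, 0, 1, Fraction(1, 6), half, half)
print(a, scaling_defect(3, 0, 1, Fraction(1, 6), half, half, a))   # 1 0
```

In the second case a = 1, so the L^q norm drops out of (N9.G.2) and the choice of q is immaterial.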
The following corollary is useful in a number of applications:
Corollary N9.G.3. With the same relations as in the SNG Inequality,
for any ε > 0 there is a constant Kε such that
\[ \|D^j u\|_{L^p} \le \varepsilon \, \|D^m u\|_{L^r} + K_\varepsilon \, \|u\|_{L^q} \]
for all (smooth) functions u.
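Proof. This follows from the SNG inequality (N9.G.2) and Young’s inequality
\[ x^a y^{1-a} \le a x + (1 - a) y , \]
which implies that
\[ x^a y^{1-a} = (\varepsilon x)^a (K_\varepsilon y)^{1-a} \le a \varepsilon x + (1 - a) K_\varepsilon y , \]
where $K_\varepsilon = 1/\varepsilon^{a/(1-a)}$ .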
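A quick numerical sanity check of this ε-weighted form of Young's inequality may be reassuring. The sketch below is an illustration only; the names `k_eps` and `young_gap` and the sample grid are our choices, and we assume x, y ≥ 0 and 0 < a < 1.

```python
def k_eps(eps, a):
    """The constant K_eps = eps^(-a/(1-a)), chosen so that eps^a * K_eps^(1-a) = 1."""
    return eps ** (-a / (1.0 - a))

def young_gap(x, y, a, eps):
    """Right side minus left side of x^a y^(1-a) <= a*eps*x + (1-a)*K_eps*y."""
    return a * eps * x + (1.0 - a) * k_eps(eps, a) * y - x ** a * y ** (1.0 - a)

# The gap is nonnegative on a sample grid (up to rounding) and vanishes
# when eps * x = K_eps * y, for instance x = y with eps = 1.
smallest = min(young_gap(x, y, a, eps)
               for x in (0.1, 1.0, 10.0)
               for y in (0.1, 1.0, 10.0)
               for a in (0.25, 0.5, 0.75)
               for eps in (0.01, 0.1, 1.0))
print(smallest >= -1e-12)
```

Making ε small forces Kε to be large, which is the trade-off expressed by the corollary: the top-order term can be given an arbitrarily small coefficient at the price of a large multiple of the lowest-order term.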
Let us illustrate how Fourier transform techniques can be used to directly
prove the special case of the preceding Corollary in which n = 3, j = 0,
p = ∞, m = 2, r = 2, and q = 2.
Proposition N9.G.4. There is a constant c > 0 such that for any ε > 0
and function f : R3 → R smooth with compact support, we have

\[ \|f\|_\infty \le c \left( \varepsilon^{3/2} \|f\|_{L^2} + \varepsilon^{-1/2} \|\Delta f\|_{L^2} \right) . \]
(It follows that if f ∈ H 2 (R3 ), then f is uniformly continuous and the

above inequality holds.)
Proof. Let
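\[ \hat{f}(k) = \frac{1}{(2\pi)^{3/2}} \int_{\mathbb{R}^3} e^{-ik \cdot x} f(x) \, dx \]
denote the Fourier transform. Recall that $\widehat{\Delta f}(k) = -\|k\|^2 \hat{f}(k)$. From
Schwarz’ inequality, we have
\[
\begin{aligned}
\left( \int |\hat{f}(k)| \, dk \right)^2
&\le \left( \int \frac{dk}{(\varepsilon^2 + \|k\|^2)^2} \right) \left( \int (\varepsilon^2 + \|k\|^2)^2 \, |\hat{f}(k)|^2 \, dk \right) \\
&= \frac{c_1}{\varepsilon} \, \| (\varepsilon^2 - \Delta) f \|_{L^2}^2 ,
\end{aligned}
\]
where
\[ c_1 = \int_{\mathbb{R}^3} \frac{d\xi}{(1 + \|\xi\|^2)^2} < \infty . \]
Here we have used the fact that $h \mapsto \hat{h}$ is an isometry in the L2 -norm
(Plancherel’s theorem). Thus, from $f(x) = (2\pi)^{-3/2} \int_{\mathbb{R}^3} e^{ik \cdot x} \hat{f}(k) \, dk$, we
get
\[
(2\pi)^{3/2} \|f\|_\infty \le \|\hat{f}\|_{L^1} \le \frac{c_2}{\sqrt{\varepsilon}} \, \| (\varepsilon^2 - \Delta) f \|_{L^2}
\le c_2 \left( \varepsilon^{3/2} \|f\|_{L^2} + \varepsilon^{-1/2} \|\Delta f\|_{L^2} \right) .
\]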

Thus we have shown that H 2 (R3 ) ⊂ C 0 (R3 ) and that the inclusion
is continuous. More generally, one can show by similar arguments that
H s (Ω) ⊂ C k (Ω) provided s > (n/2) + k and
W s,p (Ω) ⊂ C k (Ω) if s > (n/p) + k.
This is one of the Sobolev embedding theorems.
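The constants in the proof of Proposition N9.G.4 can be made explicit and tested on a sample function. The sketch below is an illustration only: the Gaussian test function, the Simpson helper, and the truncation radius are our choices, and the constant c = c1^(1/2) /(2π)^(3/2) is the one obtained by following the proof with c2 = c1^(1/2). It evaluates c1 numerically (the exact value is π^2) and compares both sides of the inequality for f (x) = exp(−|x|^2 /2), for which ∆f = (|x|^2 − 3)f .

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0

# c1 = integral over R^3 of (1 + |xi|^2)^(-2). In polar coordinates with
# r = tan(phi), the radial integrand 4*pi*r^2/(1 + r^2)^2 dr becomes
# 4*pi*sin(phi)^2 dphi on [0, pi/2].
c1 = simpson(lambda phi: 4.0 * math.pi * math.sin(phi) ** 2, 0.0, math.pi / 2.0)

def radial_l2_norm(profile, rmax=12.0):
    """L^2(R^3) norm of a radial function, truncated at |x| = rmax."""
    return math.sqrt(simpson(lambda r: 4.0 * math.pi * r * r * profile(r) ** 2,
                             0.0, rmax))

norm_f = radial_l2_norm(lambda r: math.exp(-r * r / 2.0))
norm_lap_f = radial_l2_norm(lambda r: (r * r - 3.0) * math.exp(-r * r / 2.0))
sup_f = 1.0                                   # the Gaussian peaks at the origin

c = math.sqrt(c1) / (2.0 * math.pi) ** 1.5    # constant produced by the proof

def rhs(eps):
    """Right side of the inequality of Proposition N9.G.4."""
    return c * (eps ** 1.5 * norm_f + eps ** -0.5 * norm_lap_f)

print(c1, math.pi ** 2)
print(all(sup_f <= rhs(eps) for eps in (0.1, 0.5, 0.8, 1.0, 2.0, 10.0)))
```

For this test function the right side stays above the sup norm for every ε tested, with the smallest margin near ε ≈ 0.8.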

For Ω bounded, the inclusion W s,p (Ω) → C k (Ω), s > (n/p) + k is com-
pact; that is, the ball in W s,p (Ω) is compact in C k (Ω) (Rellich’s the-
orem). This is proved in a manner similar to the classical Arzela-Ascoli
theorem, one version of which states that the inclusion C 1 (Ω) ⊂ C 0 (Ω) is
compact (see Marsden and Hoffman [1993], for instance). Also, the inclusion
W s,p (Ω) ⊂ W s′,p′ (Ω) is compact if s > s′ and p = p′ or if s = s′ and p > p′ . (See
Friedman [1969] for the proofs.)
One application of Rellich’s theorem is to the proof of the Fredholm
alternative. (See, for example, Marsden and Hughes [1983] (Chapter 6).)
It is often used in this way in existence theorems, using compactness to
extract convergent sequences. Compactness is also used in existence theory
in another crucial way when one seeks weak solutions. This is through
the fact that the closed unit ball in a reflexive Banach space is weakly compact, that
is, compact in the weak topology. See, for example, Yosida [1980] for the
proof (and for refinements, involving weak sequential compactness).
We shall give another illustration of the SNG Inequality through a special
case that is useful in the study of, amongst other things, nonlinear wave
equations. This is the following inequality in R3 :
\[ \|u\|_{L^6} \le C \, \|\operatorname{grad} u\|_{L^2} . \]
Proposition N9.G.5. Let u : R3 → R be smooth and have compact

support. Then
\[ \int_{\mathbb{R}^3} u^6 \, dx \le 48 \left( \int_{\mathbb{R}^3} \|\operatorname{grad} u\|^2 \, dx \right)^3 , \]
so $C = \sqrt[6]{48}$.
Proof. (Following Ladyzhenskaya [1969].) From

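\[ u^3(x, y, z) = 3 \int_{-\infty}^{x} u^2 \frac{\partial u}{\partial x} \, dx \]
one gets
\[ \sup_x |u^3(x, y, z)| \le 3 \int_{-\infty}^{\infty} \left| u^2 \frac{\partial u}{\partial x} \right| dx . \]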

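Set $I = \int_{\mathbb{R}^3} u^6 \, dx$ and write
\[
\begin{aligned}
I &= \int_{-\infty}^{\infty} \left( \iint |u^3| \, |u^3| \, dy \, dz \right) dx \\
&\le \int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} \sup_y |u^3| \, dz \right) \left( \int_{-\infty}^{\infty} \sup_z |u^3| \, dy \right) dx \\
&\le 9 \int_{-\infty}^{\infty} \left( \iint \left| u^2 \frac{\partial u}{\partial y} \right| dy \, dz \right) \left( \iint \left| u^2 \frac{\partial u}{\partial z} \right| dy \, dz \right) dx .
\end{aligned}
\]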
Using Schwarz’s inequality gives


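\[
\begin{aligned}
I &\le 9 \int_{-\infty}^{\infty} \left[ \left( \iint u^4 \, dy \, dz \right) \left( \iint \left( \frac{\partial u}{\partial y} \right)^2 dy \, dz \right)^{1/2} \left( \iint \left( \frac{\partial u}{\partial z} \right)^2 dy \, dz \right)^{1/2} \right] dx \\
&\le 9 \max_x \left( \iint u^4 \, dy \, dz \right) \left( \int_{\mathbb{R}^3} \left( \frac{\partial u}{\partial y} \right)^2 dx \, dy \, dz \right)^{1/2} \left( \int_{\mathbb{R}^3} \left( \frac{\partial u}{\partial z} \right)^2 dx \, dy \, dz \right)^{1/2} \\
&\le 36 \left( \int_{\mathbb{R}^3} \left| u^3 \frac{\partial u}{\partial x} \right| dx \, dy \, dz \right) \left( \int_{\mathbb{R}^3} \left( \frac{\partial u}{\partial y} \right)^2 dx \, dy \, dz \right)^{1/2} \left( \int_{\mathbb{R}^3} \left( \frac{\partial u}{\partial z} \right)^2 dx \, dy \, dz \right)^{1/2} \\
&\le 36 \sqrt{I} \left( \int_{\mathbb{R}^3} \left( \frac{\partial u}{\partial x} \right)^2 dx \, dy \, dz \right)^{1/2} \left( \int_{\mathbb{R}^3} \left( \frac{\partial u}{\partial y} \right)^2 dx \, dy \, dz \right)^{1/2} \left( \int_{\mathbb{R}^3} \left( \frac{\partial u}{\partial z} \right)^2 dx \, dy \, dz \right)^{1/2} .
\end{aligned}
\]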
Using the arithmetic–geometric mean inequality

\[ \sqrt[3]{a} \, \sqrt[3]{b} \, \sqrt[3]{c} \le (a + b + c)/3 \]
gives
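\[ I \le 36 \sqrt{I} \, \frac{1}{3^{3/2}} \left( \int_{\mathbb{R}^3} \|\operatorname{grad} u\|^2 \right)^{3/2} ; \]
i.e.,
\[ I \le \frac{(36)^2}{3^3} \left( \int_{\mathbb{R}^3} \|\operatorname{grad} u\|^2 \right)^3 = 48 \left( \int_{\mathbb{R}^3} \|\operatorname{grad} u\|^2 \right)^3 . \]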
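The inequality of Proposition N9.G.5 can be tested numerically on a family of Gaussians. The sketch below is an illustration only: the test functions u(x) = exp(−|x|^2 /scale^2 ), the Simpson helper, and the truncation radius are our choices, and a Gaussian is used in place of a compactly supported function, which is harmless at this accuracy because of the rapid decay.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0

def radial_integral(profile, rmax):
    """Integral over R^3 of a radial function, truncated at |x| = rmax."""
    return simpson(lambda r: 4.0 * math.pi * r * r * profile(r), 0.0, rmax)

def check(scale):
    """Return (integral of u^6, 48 * (integral of |grad u|^2)^3)."""
    u = lambda r: math.exp(-(r / scale) ** 2)
    du = lambda r: -2.0 * r / scale ** 2 * u(r)   # radial derivative; |grad u| = |du|
    lhs = radial_integral(lambda r: u(r) ** 6, 10.0 * scale)
    rhs = 48.0 * radial_integral(lambda r: du(r) ** 2, 10.0 * scale) ** 3
    return lhs, rhs

for scale in (0.5, 1.0, 2.5):
    lhs, rhs = check(scale)
    print(scale, lhs <= rhs, lhs / rhs)
```

The ratio of the two sides is the same for every value of scale, as the scaling argument given after Theorem N9.G.2 predicts: both sides pick up the same power of t under u(x) → u(tx).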
Another important corollary of the SNG Inequality can be used to deter-

mine to which W s,p space a product belongs.
Corollary N9.G.6. For s > n/2, H s (Rn ) is a Banach algebra (under
pointwise multiplication). That is, there is a constant K > 0 such that for
u, v ∈ H s (Rn ),
\[ \|u \cdot v\|_{H^s} \le K \, \|u\|_{H^s} \|v\|_{H^s} . \]
This is an important property of H s not satisfied for low s; it is not true
that L2 forms an algebra under multiplication.
Proof. Choose in the SNG inequality, a = j/s, r = 2, q = ∞, p = 2s/j,
m = s (0 ≤ j ≤ s) to obtain
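\[ \|D^j u\|_{L^{2s/j}} \le \text{const.} \, \|D^s u\|_{L^2}^{j/s} \, \|u\|_{\infty}^{1 - j/s} \le \text{const.} \, \|u\|_{H^s} . \]
Let j + k = s. From Hölder’s inequality we have
\[ \|D^j u \cdot D^k v\|_{L^2}^2 \le \text{const.} \, \|D^j u\|_{L^{2s/j}}^2 \, \|D^k v\|_{L^{2s/k}}^2 \le \text{const.} \, \|u\|_{H^s}^2 \, \|v\|_{H^s}^2 . \]
Now $D^s(uv)$ consists of terms like $D^j u \cdot D^k v$, so we obtain
\[ \|D^s(uv)\|_{L^2} \le \text{const.} \, \|u\|_{H^s} \|v\|_{H^s} . \]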
Similarly for the lower-order terms. Summing gives the result.

The trace theorems state that the restriction map from Ω to a submani-
fold M ⊂ Ω of codimension m induces a bounded operator from W s,p (Ω)
to W s−(1/mp),p (M ). Adams [1975] and Morrey [1966] are good references;
the latter contains some useful refinements.
There are also basic extension theorems that are right inverses of restric-
tion maps. For example, the Calderon extension theorem asserts that
there is an extension map T : W s,p (Ω) → W s,p (Rn ) that is a bounded op-
erator and “restriction to Ω”◦T = Identity. This is related to a classical C k
theorem due to Whitney. See, for example, Abraham and Robbin [1967],
Stein [1970], and Marsden [1973a]. Finally, we mention that these Rn re-
sults carry over to manifolds in a straightforward way using local charts,
as in for example, Palais [1965].
Consider the set of all strictly positive functions in $W^{s,p}(M)$. Clearly this set (which we shall call $W^{s,p}_+(M)$) is not a Sobolev space since it fails to be closed under multiplication by scalars. Elementary properties of continuous functions guarantee that $C^k_+(M)$ is open in $C^k(M)$. This is not generally true for $W^{s,p}_+(M)$ in $W^{s,p}(M)$. However, if $s > n/p$ then $W^{s,p}(M)$ is continuously embedded in $C^0(M)$. Then
\[
W^{s,p}_+(M) = W^{s,p}(M) \cap C^0_+(M),
\]
from which it follows that $W^{s,p}_+(M)$ is an open subset of $W^{s,p}(M)$. Note that the differentiability, the embedding, and the multiplication properties discussed above all hold (in appropriate form) for open subsets of Sobolev spaces.
We make some additional comments about the group of diffeomorphisms
of M . We define the collection of diffeomorphisms of M as open subsets
of Sobolev spaces, we discuss (in terms of Sobolev spaces) the composi-
tion of diffeomorphisms with scalar functions, and finally we consider the
diffeomorphisms as a topological group.
The space of diffeomorphisms of M . A map η : M → M is lo-
cally (in a coordinate chart of M ) Rn -valued and hence we can define a
Sobolev space W s,p (M, M ) of such maps. One can check that this space
is chart-independent if s > n/p and hence is well-defined. The diffeomor-
phisms are the invertible elements of W s,p (M, M ). Now, the continuous
diffeomorphisms, i.e., the homeomorphisms, aren’t open in the set of con-
tinuous maps C 0 (M, M ). However, in C 1 (M, M ), the inverse function the-
orem is available and one can use it to verify openness. So one finds that
if s > n/p + 1, the set of diffeomorphisms W s,p Diff(M ) is open in the
Sobolev space W s,p (M, M ).
Composing scalar functions with diffeomorphisms. For $s > (n/p) + 1$ let
\[
F : (W^{s',p}\text{-Diff}(M)) \times W^{s'',p}(M) \to W^{s,p}(M),
\]
for $s', s'' \ge s$, be defined by
\[
F(\eta, f) = f \circ \eta .
\]
Then $F$ is $C^\infty$.
Sobolev inverse functions. If $\eta \in W^{s,p}(M, M)$, $s > (n/p) + 1$, and $\eta$ has a $C^1$ inverse, then the inverse is $W^{s,p}$, so $\eta \in W^{s,p}\text{-Diff}(M)$.
Now consider two maps related to the composition map $F$: let
\[
F_\eta : W^{s,p}(M) \to W^{s,p}(M)
\]
be defined by
\[
F_\eta(f) := f \circ \eta
\]
for fixed $\eta \in W^{s',p}\text{-Diff}(M)$ (with $s' \ge s + 1$), and let
\[
O_f : W^{s',p}\text{-Diff}(M) \to W^{s,p}(M)
\]
be defined by
\[
O_f(\eta) := f \circ \eta
\]
for fixed $f \in W^{s'',p}(M)$ (with $s' \ge s$ and $s'' \ge s$).
The first of these, $F_\eta$, is the pullback map. It is linear since
\[
F_\eta(f + g) = (f + g) \circ \eta = f \circ \eta + g \circ \eta = F_\eta(f) + F_\eta(g).
\]
Since one verifies that it is also continuous ($C^0$), and a continuous linear map is smooth, $F_\eta$ is a smooth map.
The second map, $O_f$, is the orbit map; its image in $W^{s,p}(M)$ is called the orbit of $W^{s,p}\text{-Diff}(M)$ through $f$. Since $O_f$ is not linear, smoothness is not so elementary. One does, however, find that if $f \in W^{s+k,p}(M)$, then
\[
O_f : W^{s',p}\text{-Diff}(M) \to W^{s,p}(M)
\]
is a $C^k$ map as long as $s' \ge s$. This result is proved using the “ω-lemma” (Abraham, Marsden, and Ratiu [1988], Supplement 2.4B).
W s,p -Diff(M ) as a topological group. W s,p -Diff (M ) is a group using
composition for the group multiplication. For η ∈ W s,p Diff(M ), we set
Rη : W s,p -Diff (M ) → W s,p -Diff (M ); Rη (µ) = µ ◦ η
and
Lη : W s,p -Diff (M ) → W s,p -Diff (M ); Lη (µ) = η ◦ µ.
To ensure that W s,p Diff(M ) is a group, one requires that s > (n/p) + 1
so that an inverse exists. One can use the results on composition above
to show that while right multiplication is smooth, left multiplication and
inversion are only C 0 . Hence W s,p Diff(M ) is a topological group and not
a Banach Lie group.
Nested Groups. In view of the important technical properties of group
multiplication in H s Diff(M ), we introduce an abstract context for this
phenomenon developed by Omori [1974] and Adams, Ratiu, and Schmid
[1986].
Definition N9.G.7. A collection of groups {G∞ , Gs | s ≥ s0 } is called
a nest if
(i) each $G^s$ is a Hilbert manifold of class $C^{k(s)}$, modeled on the Hilbert space $E^s$, where the order of differentiability $k(s)$ tends to $\infty$ as $s \to \infty$;
(ii) for each $s \ge s_0$, there are linear continuous, dense inclusions $E^{s+1} \hookrightarrow E^s$ and dense inclusions of class $C^{k(s)}$, $G^{s+1} \hookrightarrow G^s$;
(iii) each $G^s$ is a topological group and $G^\infty = \varprojlim G^s$ is a topological group with the inverse limit topology;
(iv) if $(U^s, \varphi^s)$ is a chart on $G^s$, then $(U^s \cap G^t, \varphi^s | U^s \cap G^t)$ is a chart on $G^t$ for every $t \ge s$;
(v) group multiplication $\mu : G^\infty \times G^\infty \to G^\infty$ can be extended to a $C^k$-map $\mu : G^{s+k} \times G^s \to G^s$ for any $s$ such that $k \le k(s)$;
(vi) inversion $I : G^\infty \to G^\infty$ can be extended to a $C^k$-map $I : G^{s+k} \to G^s$, for any $s$ satisfying $k \le k(s)$;
(vii) right multiplication $R_g$ by $g \in G^s$ is a $C^{k(s)}$-map $R_g : G^s \to G^s$.
If the manifolds are Banach manifolds rather than Hilbert manifolds then
{G∞ , Gs | s ≥ s0 } is a Banach nest.
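A collection of vector spaces $\{\mathfrak{g}^\infty, \mathfrak{g}^s \mid s \ge s_0\}$ is called a Lie algebra nest if
(i) each $\mathfrak{g}^s$ is a Hilbert (Banach) space and for each $s \ge s_0$ there are linear, continuous, dense inclusions $\mathfrak{g}^{s+1} \hookrightarrow \mathfrak{g}^s$, and $\mathfrak{g}^\infty = \varprojlim \mathfrak{g}^s$ is a Fréchet space with the inverse limit topology;
(ii) there are bilinear, continuous, antisymmetric maps $[\,,] : \mathfrak{g}^{s+2} \times \mathfrak{g}^{t+2} \to \mathfrak{g}^{\min(s,t)}$, for all $s, t \ge s_0$, which satisfy the Jacobi identity on $\mathfrak{g}^{\min(s,t,r)}$ for elements in $\mathfrak{g}^{s+4} \times \mathfrak{g}^{t+4} \times \mathfrak{g}^{r+4}$.
If $\{G^\infty, G^s \mid s \ge s_0\}$ is a Lie group nest, put $\mathfrak{g}^s \equiv T_e G^s$ and $\mathfrak{g}^\infty = \varprojlim \mathfrak{g}^s$. Then it is easy to see that $\{\mathfrak{g}^\infty, \mathfrak{g}^s \mid s \ge s_0\}$ is the Lie algebra nest of the Lie group nest $\{G^\infty, G^s \mid s \ge s_0\}$.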
Examples
A. The classical examples of Lie group nests are the diffeomorphism
groups
{Diff(M ), H s -Diff (M ) | s > (dim M )/2}
with Lie algebra nests
{X(M ), H s - X(M ) | s > (dim M )/2}
for M a compact manifold. Previously, we stated the facts, proved in Ebin

[1970], Ebin and Marsden [1970], Omori [1974] and Marsden, Ebin, and
Fischer [1972], which verify the preceding definition.
B. The group of homogeneous symplectomorphisms of T ∗ M \{0}.

The symplectic manifold most widely used is the cotangent bundle T ∗ M
of a manifold M . The canonical symplectic form on T ∗ M is exact, i.e.,
it is the differential of a canonical one-form θ on T ∗ M . Thus one can ask
about the structure of diffeomorphisms of T ∗ M that preserve θ. A diffeo-
morphism ϕ : T ∗ M → T ∗ M satisfying ϕ∗ θ = θ is necessarily a lift, i.e.,
ϕ = T ∗ η, for η ∈ Diff ∞ (M ) (this is proved in the main text). However,
in some cases one must consider $T^*M \setminus \{0\}$, and its diffeomorphisms preserving $\theta$. Then $\varphi^*\theta = \theta$ if and only if $\varphi$ is symplectic and homogeneous of degree one, i.e., $\varphi(\tau \alpha_m) = \tau \varphi(\alpha_m)$ for all $\tau > 0$ and $\alpha_m \in T^*_m M$. Now we can consider the group $H^s\text{-Diff}_\theta(T^*M \setminus \{0\})$ of homogeneous symplectomorphisms of $T^*M \setminus \{0\}$. But right away we are faced with the problem of non-compactness of $T^*M \setminus \{0\}$. We sketch below, following Ratiu and Schmid [1981], how
\[
\{ \mathrm{Diff}_\theta(T^*M \setminus \{0\}),\; H^s\text{-Diff}_\theta(T^*M \setminus \{0\}) \mid s \ge \dim M + 1/2 \}
\]
is a Lie group nest with Lie algebra nest
\[
\{ S^\infty(T^*M \setminus \{0\}),\; S^{s+1}(T^*M \setminus \{0\}) \mid s > \dim M + 1/2 \},
\]
where
S s (T ∗ M \{0}) = {H : T ∗ M \{0} → R | H is of
class H s and homogeneous of degree one}
with the Poisson bracket as Lie algebra bracket. Note the gain in one deriva-
tive at the Lie algebra level. The basic idea of the ensuing discussion is
that H s Diff θ (T ∗ M \{0}) is algebraically isomorphic to the group of all H s
contact transformations of the cosphere bundle of M , which is a compact
manifold if M is. We start by recalling the relevant facts.
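The multiplicative group of strictly positive reals $\mathbb{R}_+$ acts smoothly on $T^*M \setminus \{0\}$ by $\alpha_x \mapsto \tau \alpha_x$, $\tau > 0$, $\alpha_x \in T^*_x M$, $\alpha_x \ne 0$. This action is free and proper and therefore $\pi : T^*M \setminus \{0\} \to Q \equiv (T^*M \setminus \{0\})/\mathbb{R}_+$ is a smooth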
principal fiber bundle over Q, the cosphere bundle of M . Note that Q
is compact (supposing M is) and odd-dimensional. Q carries no canonical
contact one-form but for each global section σ : Q → T ∗ M \{0} we can de-
fine an exact contact one-form θσ on Q by θσ = σ ∗ θ. Such global sections
exist in abundance; for example, any Riemannian metric on M identifies
T ∗ M with T M and Q with the unit sphere bundle. Then the usual in-
clusion of the sphere bundle into T M gives a section σ. The section σ is
uniquely determined by a smooth function fσ : T ∗ M \{0} → R+ defined
by σ(π(αx )) = fσ (αx )αx . In other words, fσ measures how far from the
section σ an element αx ∈ T ∗ M \{0} lies. The function fσ is homogeneous
of degree −1 and π ∗ θσ = fσ θ.
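An $H^{s+1}$ contact transformation on $Q$ is a diffeomorphism $\varphi \in H^{s+1}\text{-Diff}(Q)$ such that for any two sections $\sigma, \zeta : Q \to T^*M \setminus \{0\}$, there exists an $H^{s+1}$ function $h_{\sigma\zeta} : Q \to \mathbb{R}_+$ satisfying $\varphi^*\theta_\sigma = h_{\sigma\zeta}\theta_\zeta$. Equivalently, $\varphi \in H^{s+1}\text{-Diff}(Q)$ is an $H^{s+1}$ contact transformation if and only if for each global section $\sigma$ there exists an $H^{s+1}$ function $h_\sigma : Q \to \mathbb{R}_+$ such that $\varphi^*\theta_\sigma = h_\sigma\theta_\sigma$. The function $h_\sigma$ is uniquely determined by $\sigma$, namely $h_\sigma = \langle \varphi^*\theta_\sigma, E_\sigma \rangle$, where $E_\sigma$ is the Reeb vector field on $Q$ determined by the contact structure $\theta_\sigma$ and $\langle\,,\rangle$ denotes the pairing between vector fields and one-forms. ($E_\sigma$ is the unique vector field satisfying $\langle \theta_\sigma, E_\sigma \rangle = 1$ and $i_{E_\sigma}(d\theta_\sigma) = 0$, where $i_{E_\sigma}(d\theta_\sigma)$ denotes the interior product of $E_\sigma$ with $d\theta_\sigma$; in local coordinates $x^1, \ldots, x^{n-1}, y^1, \ldots, y^{n-1}, t$ on $Q$, where
\[
\theta_\sigma = \sum_{i=1}^{n-1} y^i \, dx^i + dt,
\]
we have $E_\sigma = \partial/\partial t$.) Therefore the group of $H^{s+1}$-contact transformations on $Q$ is isomorphic to the group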
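\[
\mathrm{Con}^{s+1}_\sigma(Q) = \{ (\varphi, h) \in H^{s+1}\text{-Diff}(Q) \ltimes H^{s+1}(Q, \mathbb{R} \setminus \{0\}) \mid \varphi^*\theta_\sigma = h\theta_\sigma \}
\]
for any fixed but arbitrary global section $\sigma$, where
\[
H^{s+1}\text{-Diff}(Q) \ltimes H^{s+1}(Q, \mathbb{R} \setminus \{0\})
\]
is the semidirect product of the two Lie groups $H^{s+1}\text{-Diff}(Q)$ and $H^{s+1}(Q, \mathbb{R} \setminus \{0\})$ (the latter regarded as a multiplicative group) with composition law
\[
(\varphi_1, h_1)(\varphi_2, h_2) = (\varphi_1 \circ \varphi_2,\; h_2 \, (h_1 \circ \varphi_2)).
\]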
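The composition law can be sanity-checked on a finite toy model. The sketch below is only an illustration under stated assumptions: permutations of a finite set stand in for diffeomorphisms of $Q$, tuples of nonzero rationals stand in for functions $Q \to \mathbb{R} \setminus \{0\}$, and the helper names (`compose`, `multiply`, `identity`, `is_associative`) are hypothetical and not from the text.

```python
from fractions import Fraction
from itertools import permutations, product


def compose(p, q):
    """Return p ∘ q for maps on {0, ..., n-1} stored as tuples: (p ∘ q)[i] = p[q[i]]."""
    return tuple(p[i] for i in q)


def multiply(a, b):
    """Composition law (phi1, h1)(phi2, h2) = (phi1 ∘ phi2, h2 * (h1 ∘ phi2))."""
    (phi1, h1), (phi2, h2) = a, b
    h1_pulled_back = compose(h1, phi2)  # h1 ∘ phi2, evaluated pointwise
    return compose(phi1, phi2), tuple(x * y for x, y in zip(h2, h1_pulled_back))


def identity(n):
    """Identity element: identity permutation paired with the constant function 1."""
    return tuple(range(n)), (Fraction(1),) * n


def is_associative(n=3):
    """Check associativity on a sample of the toy group (exact rational arithmetic)."""
    perms = list(permutations(range(n)))
    funcs = list(product([Fraction(1, 2), Fraction(-3)], repeat=n))
    sample = [(p, h) for p in perms for h in funcs][::7]
    return all(
        multiply(multiply(a, b), c) == multiply(a, multiply(b, c))
        for a in sample for b in sample for c in sample
    )
```

The order of the factors mirrors the contact condition: if $\varphi_1^*\theta_\sigma = h_1\theta_\sigma$ and $\varphi_2^*\theta_\sigma = h_2\theta_\sigma$, then $(\varphi_1 \circ \varphi_2)^*\theta_\sigma = \varphi_2^*(h_1\theta_\sigma) = (h_1 \circ \varphi_2)\, h_2\, \theta_\sigma$.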
Omori [1974] has shown that $\mathrm{Con}^{s+1}_\sigma(Q)$ is a closed Lie subgroup of the semidirect product Lie group
\[
H^{s+1}\text{-Diff}(Q) \ltimes H^{s+1}(Q, \mathbb{R} \setminus \{0\}).
\]
The Lie algebra of this semidirect product group is the semidirect product Lie algebra
\[
H^{s+1}\text{-}\mathfrak{X}(Q) \ltimes H^{s+1}(Q, \mathbb{R})
\]
of $H^{s+1}$ vector fields and $H^{s+1}$ functions, with bracket
\[
[(X, f), (Y, g)] = ([X, Y],\; X(g) - Y(f)).
\]
The Lie algebra of $\mathrm{Con}^{s+1}_\sigma(Q)$ is
\[
\mathrm{con}^{s+1}_\sigma(Q) = \{ (Y, g) \in H^{s+1}\text{-}\mathfrak{X}(Q) \ltimes H^{s+1}(Q, \mathbb{R}) \mid \pounds_Y \theta_\sigma = g\theta_\sigma \}.
\]
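In Ratiu and Schmid [1981] (Theorem 4.1) it is shown that the group $H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$ is isomorphic (as a group) to the Lie group $\mathrm{Con}^{s+1}_\sigma(Q)$. The isomorphism is given by
\[
\Phi : H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\}) \to \mathrm{Con}^{s+1}_\sigma(Q), \qquad \Phi(\eta) = (\varphi, h),
\]
where $\varphi$ is defined by $\varphi \circ \pi = \pi \circ \eta$ and $h$ by
\[
h \circ \pi = (f_\sigma \circ \eta)/f_\sigma, \qquad \sigma(\pi(\alpha_x)) = f_\sigma(\alpha_x)\alpha_x .
\]
The inverse of $\Phi$ is given by
\[
\Phi^{-1}(\varphi, h) = \frac{\sigma \circ \varphi \circ \pi}{(h \circ \pi) \cdot f_\sigma} .
\]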
Since $\mathrm{Con}^{s+1}_\sigma(Q)$ and $\mathrm{Con}^{s+1}_\zeta(Q)$ are isomorphic as nested Lie groups, for any two global sections $\sigma$ and $\zeta$, the isomorphism $\Phi$ determines a nested Lie group structure on $H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$ which is independent of $\sigma$ (or independent of the Riemannian metric if $\sigma$ is induced from such). Furthermore, the Lie algebra
\[
H^{s+1}\text{-}\mathfrak{X}_\theta(T^*M \setminus \{0\}) = \{ Y \in H^{s+1}\text{-}\mathfrak{X}(T^*M \setminus \{0\}) \mid \pounds_Y \theta = 0 \}
\]
of $H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$ is isomorphic to
\[
S^{s+2}(T^*M \setminus \{0\}) = \{ H \in H^{s+2}(T^*M \setminus \{0\}, \mathbb{R}) \mid H \text{ homogeneous of degree one} \}.
\]
This is because $\pounds_Y \theta = 0$ if and only if $Y$ is globally Hamiltonian, homogeneous of degree zero, with Hamiltonian function $H = \theta(Y)$, homogeneous of degree one. The Lie algebras $H^{s+1}\text{-}\mathfrak{X}_\theta(T^*M \setminus \{0\})$ and $\mathrm{con}^{s+1}_\sigma(Q)$ are isomorphic via
\[
T_e\Phi : H^{s+1}\text{-}\mathfrak{X}_\theta(T^*M \setminus \{0\}) \to \mathrm{con}^{s+1}_\sigma(Q).
\]
Explicitly, $T_e\Phi(X_H) = (X, k)$, where $X$ is uniquely defined by $T\pi \circ X_H = X \circ \pi$ and $k$ by $k \circ \pi = \{f_\sigma, H\}/f_\sigma$ ($\{\,,\}$ is the canonical Poisson bracket on $T^*M$). The map $H \mapsto H \circ \sigma$ is an isomorphism from $S^{s+2}(T^*M \setminus \{0\})$ onto $H^{s+2}(Q, \mathbb{R})$, with inverse
\[
j : H^{s+2}(Q, \mathbb{R}) \to S^{s+2}(T^*M \setminus \{0\}),
\]
where $j(f)$ is the extension to $T^*M \setminus \{0\}$ by homogeneity of degree one of
\[
f \circ \sigma^{-1} : \sigma(Q) \subset (T^*M \setminus \{0\}) \to \mathbb{R},
\]
for $f \in H^{s+2}(Q, \mathbb{R})$. The composition of these two isomorphisms with $T_e\Phi^{-1}$ gives an isomorphism
\[
F : \mathrm{con}^{s+1}_\sigma(Q) \xrightarrow{\;T_e\Phi^{-1}\;} H^{s+1}\text{-}\mathfrak{X}_\theta(T^*M \setminus \{0\}) \to S^{s+2}(T^*M \setminus \{0\}) \xrightarrow{\;j^{-1}\;} H^{s+2}(Q, \mathbb{R}),
\]
defined by $F(X, k) = \theta_\sigma(X)$. In the condition $\pounds_X \theta_\sigma = k\theta_\sigma$, the function $k$ is uniquely determined by $X$, namely $k = E_\sigma(\theta_\sigma(X))$. From this it follows that $F$ is continuous and hence an isomorphism between $\mathrm{con}^{s+1}_\sigma(Q)$ and $H^{s+2}(Q, \mathbb{R})$ (note the gain of one derivative). We see thus once again that $\mathrm{Con}^{s+1}_\sigma(Q)$ and $\mathrm{Con}^{s+1}_\zeta(Q)$ are isomorphic as nested Lie groups for any two global sections $\sigma, \zeta : Q \to T^*M \setminus \{0\}$, since both are modeled on $H^{s+2}(Q, \mathbb{R})$.
Defining the Hilbert space structure of $S^{s+2}(T^*M \setminus \{0\})$ as the one induced by the isomorphism
\[
j : H^{s+2}(Q, \mathbb{R}) \to S^{s+2}(T^*M \setminus \{0\}),
\]
it follows that
\[
H^{s+2}(Q, \mathbb{R}), \quad S^{s+2}(T^*M \setminus \{0\}), \quad H^{s+1}\text{-}\mathfrak{X}_\theta(T^*M \setminus \{0\}), \quad \text{and} \quad \mathrm{con}^{s+1}_\sigma(Q)
\]
are all isomorphic as Hilbert spaces. It is desirable to compare the topology of $S^{s+2}(T^*M \setminus \{0\})$ with the strong $C^1$-Whitney topology. Since all elements of $S^{s+2}(T^*M \setminus \{0\})$ are $C^2$ by the Sobolev embedding theorem, we can define a new topology on $S^{s+2}(T^*M \setminus \{0\})$ in the following way: a neighborhood of zero consists of all those functions $H \in S^{s+2}(T^*M \setminus \{0\})$ for which
\[
dH : (T^*M \setminus \{0\}) \to T^*(T^*M \setminus \{0\})
\]
is $C^1$-close to zero in the strong $C^1$-Whitney topology. Let $S^{s+2}_W(T^*M \setminus \{0\})$ be the set $S^{s+2}(T^*M \setminus \{0\})$ equipped with this new topology. One checks that
\[
j : H^{s+2}(Q, \mathbb{R}) \to S^{s+2}_W(T^*M \setminus \{0\}),
\]
or equivalently the identity $S^{s+2}(T^*M \setminus \{0\}) \to S^{s+2}_W(T^*M \setminus \{0\})$, is continuous with discontinuous inverse, i.e., the new topology is strictly coarser than the original one on $S^{s+2}(T^*M \setminus \{0\})$. This remark is useful in the construction of an explicit chart at $e$ in $H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$.
Remarks.
1. The gain of one derivative at the Lie algebra level has a corresponding statement in $H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$: for $\eta \in H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$,
\[
\tau^* \circ \eta : (T^*M \setminus \{0\}) \to M
\]
is of class $H^{s+2}$, where $\tau^* : T^*M \to M$ is the cotangent bundle projection. Locally, this means that if $\eta(x, \alpha) = (y(x, \alpha), \beta(x, \alpha))$, then $y$ is $H^{s+2}$ jointly in $x$ and $\alpha$. To prove this, note that $\eta^*\theta = \theta$ is equivalent locally to
\[
\sum_{i=1}^{n} \beta_i \frac{\partial y^i}{\partial x^k} = \alpha_k, \qquad \sum_{i=1}^{n} \beta_i \frac{\partial y^i}{\partial \alpha_k} = 0, \qquad k = 1, \ldots, n.
\]
Since $\eta$ is a diffeomorphism of class $H^{s+1}$, for fixed $x$, there exists a unique $\alpha$ such that $\beta = (0, \ldots, 1, \ldots, 0)$, the $i$-th basis vector. For this choice of $\alpha$, the first relation shows that the $i$-th column of the matrix $[\partial y^i/\partial x^k]$ is $H^{s+1}$. This says that $y(x, \alpha)$ has all derivatives of order at most $s + 2$ square integrable except the derivatives involving only $\alpha_k$'s.
The second relation is an elliptic equation with H s+1 coefficients of first
order
in y i regarded as a function
of α only (its symbol maps (ξ i ) ∈ Rn to
β1 + . . . + ξ i βi + . . . + βn ∈ Rn ) and thus its solution is of class H s+2 ,
i.e., the (s + 2)-nd derivative of y with respect to α is square integrable
and thus y is of class H s+2 .
2. Let $\eta \in H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$ be fiber preserving, i.e., $\pi(\eta(\alpha_x)) = \pi(\eta(\alpha'_x))$ for all $\alpha_x, \alpha'_x \in T^*_x M \setminus \{0\}$. Then $\eta$ can be extended $H^{s+1}$-smoothly to the zero section by $\eta(0_x) = 0_y$ for $y = \pi(\eta(\alpha_x))$, $\alpha_x \in T^*M \setminus \{0\}$. Thus, $\eta : T^*M \to T^*M$, $\eta^*\theta = \theta$ and hence $\eta = T^*g$ for an $H^{s+2}$ diffeomorphism $g : M \to M$. In particular, if $\pi(\eta(\alpha_x)) = x$ for all $\alpha_x \in T^*M \setminus \{0\}$, then $\eta = e$. From this it follows that the effect of $\eta$ on base points uniquely determines $\eta$, i.e., if $\eta, \eta' \in H^{s+1}\text{-Diff}_\theta(T^*M \setminus \{0\})$ satisfy $\pi \circ \eta = \pi \circ \eta'$, then $\eta = \eta'$ (since $\pi((\eta \circ \eta'^{-1})(\alpha_x)) = x$).
Another interesting group considered by Adams, Ratiu, and Schmid

[1986] is the group of invertible Fourier integral operators, of interest
in the KdV equation. We refer to these papers for details.
N10
Poisson Manifolds
N10.A Proof of the Symplectic Stratification Theorem
We proceed in a series of technical propositions.1
Proposition N10.A.1. Let $P$ be a finite dimensional Poisson manifold with $B^\sharp_z : T^*_z P \to T_z P$ the Poisson tensor. Take $z \in P$ and functions $f_1, \ldots, f_k$ defined on $P$ such that $\{B^\sharp_z df_j\}_{1 \le j \le k}$ is a basis of the range of $B^\sharp_z$. Let $\Phi_{j,t}$ be the local flow defined in a neighborhood of $z$ generated by the Hamiltonian vector field $X_{f_j} = B^\sharp df_j$. Let
\[
\Psi^z_{f_1, \ldots, f_k}(t_1, \ldots, t_k) = (\Phi_{1,t_1} \circ \cdots \circ \Phi_{k,t_k})(z)
\]
for small enough $t_1, \ldots, t_k$. Then:
(i) There is an open neighborhood $U_\delta$ of $0 \in \mathbb{R}^k$ such that $\Psi^z_{f_1, \ldots, f_k} : U_\delta \to P$ is an embedding.
(ii) The ranges of $(T\Psi^z_{f_1, \ldots, f_k})(t)$ and $B^\sharp_{\Psi^z_{f_1, \ldots, f_k}(t)}$ are equal for $t \in U_\delta$.
(iii) $\Psi^z_{f_1, \ldots, f_k}(U_\delta) \subset \Sigma_z$.
1 This proof was kindly supplied by O. Popp

(iv) If $\Psi^y_{g_1, \ldots, g_k} : U_\eta \to P$ is another map constructed as above and $y \in \Psi^z_{f_1, \ldots, f_k}(U_\delta)$, then there is an open subset $U \subset U_\eta$ such that $\Psi^y_{g_1, \ldots, g_k}$ is a diffeomorphism from $U$ to an open subset in $\Psi^z_{f_1, \ldots, f_k}(U_\delta)$.
Proof. (i) The smoothness of $\Psi^z_{f_1, \ldots, f_k}$ follows from the smoothness of $\Phi_{j,t}$ in both the flow parameter and manifold variables. Then
\[
T_0\Psi^z_{f_1, \ldots, f_k}(\partial/\partial t_j) = X_{f_j}(z) = B^\sharp_z df_j,
\]
which shows that $T_0\Psi^z_{f_1, \ldots, f_k}$ is injective. It follows that $\Psi^z_{f_1, \ldots, f_k}$ is an embedding on a sufficiently small neighborhood of $0$, say $U_\delta$. Notice also that the ranges of $T_0\Psi^z_{f_1, \ldots, f_k}$ and of $B^\sharp_z$ coincide.
(ii) Recall from the main text that for any invertible Poisson map $\Phi$ on $P$, we have
\[
T\Phi \cdot X_f = X_{f \circ \Phi^{-1}} \circ \Phi,
\]
and also recall that Hamiltonian flows are Poisson maps. Therefore, if $t = (t_1, \ldots, t_k)$,
\[
\begin{aligned}
T_t\Psi^z_{f_1, \ldots, f_k}(\partial/\partial t_j)
  &= (T\Phi_{1,t_1} \circ \cdots \circ T\Phi_{j-1,t_{j-1}} \circ X_{f_j} \circ \Phi_{j,t_j} \circ \Phi_{j+1,t_{j+1}} \circ \cdots \circ \Phi_{k,t_k})(z) \\
  &= (X_{h_j} \circ \Psi^z_{f_1, \ldots, f_k})(t),
\end{aligned}
\]
where
\[
h_j = f_j \circ (\Phi_{1,t_1} \circ \cdots \circ \Phi_{j-1,t_{j-1}})^{-1}.
\]
This shows that
\[
\operatorname{range} T_t\Psi^z_{f_1, \ldots, f_k} \subset \operatorname{range} B^\sharp_{\Psi^z_{f_1, \ldots, f_k}(t)}
\]
if $t \in U_\delta$. Since $B^\sharp$ is invariant under Hamiltonian flows, it follows that
\[
\dim \operatorname{range} B^\sharp_{\Psi^z_{f_1, \ldots, f_k}(t)} = \dim \operatorname{range} B^\sharp_z .
\]
This last equality, the previous inclusion, and the last remark in the proof of (i) above conclude (ii).
(iii) This is obvious since $\Psi^z_{f_1, \ldots, f_k}$ is built from piecewise Hamiltonian curves starting from $z$.
(iv) Note that $X_g(z) \in \operatorname{range} B^\sharp_z$ for any $z \in P$ and any smooth function $g$. Using (ii), we see that $X_g$ is tangent to the image of $\Psi^z_{f_1, \ldots, f_k}$. Therefore, the integral curves of $X_g$ remain tangent to $\Psi^z_{f_1, \ldots, f_k}(U_\delta)$ if they start from that set. To get $\Psi^y_{g_1, \ldots, g_k}$ we just have to find Hamiltonian curves which start from $y$. Therefore, we can restrict ourselves to the submanifold $\Psi^z_{f_1, \ldots, f_k}(U_\delta)$ when computing the flows along the Hamiltonian vector fields $X_{g_j}$; therefore we can consider that the image of $\Psi^y_{g_1, \ldots, g_k}$ is in $\Psi^z_{f_1, \ldots, f_k}(U_\delta)$. The derivative at $0 \in \mathbb{R}^k$ of $\Psi^y_{g_1, \ldots, g_k}$ is an isomorphism to the tangent space of $\Psi^z_{f_1, \ldots, f_k}(U_\delta)$ at $y$ (that is, $\operatorname{range} B^\sharp_y$), using (ii) above. Thus, the existence of the neighborhood $U$ follows from the inverse function theorem.
Proposition N10.A.2. Let $P$ be a Poisson manifold and $B$ its Poisson tensor. Then for each symplectic leaf $\Sigma \subset P$, the family of charts satisfying (i) in the previous proposition, namely,
\[
\{ \Psi^z_{f_1, \ldots, f_k} \mid z \in \Sigma,\ \{B^\sharp_z df_j\}_{1 \le j \le k} \text{ a basis for } \operatorname{range} B^\sharp_z \},
\]
gives $\Sigma$ the structure of a differentiable manifold such that the inclusion is an immersion. Then $T_z\Sigma = \operatorname{range} B^\sharp_z$ (so $\dim \Sigma = \operatorname{rank} B^\sharp_z$), for all $z \in \Sigma$. Moreover, $\Sigma$ has a unique symplectic structure such that the inclusion is a Poisson map.
Proof. Let
\[
w \in \Psi^z_{f_1, \ldots, f_k}(U_\delta) \cap \Psi^y_{g_1, \ldots, g_k}(U)
\]
and consider $\Psi^w_{h_1, \ldots, h_k} : U_\gamma \to P$. Using (iv) in the proposition above, we can choose $U_\gamma$ small enough so that
\[
\Psi^w_{h_1, \ldots, h_k}(U_\gamma) \subset \Psi^z_{f_1, \ldots, f_k}(U_\delta) \cap \Psi^y_{g_1, \ldots, g_k}(U)
\]
is a diffeomorphic embedding in both $\Psi^z_{f_1, \ldots, f_k}(U_\delta)$ and $\Psi^y_{g_1, \ldots, g_k}(U)$. This shows that the transition maps for the given charts are diffeomorphisms and so define the structure of a differentiable manifold on $\Sigma$. The fact that the inclusion is an immersion follows from (i) of the above proposition. We get the tangent space of $\Sigma$ using (i), (ii) of the previous proposition; then the equality of dimensions follows.
It follows from the definition of an immersed Poisson submanifold that
Σ is such a submanifold of P . Thus, if i : Σ → P is the inclusion,
{f ◦ i, g ◦ i}Σ = {f, g} ◦ i.
Hence if {f ◦ i, g ◦ i}Σ (z) = 0 for all functions g then {f, g}(z) = 0 for
all g, that is, Xg [f ](z) = 0 for all g. This implies that df |Tz Σ = 0 since
the vectors Xg (z) span Tz Σ. Therefore, i∗ df = d(f ◦ i) = 0, which shows
that the Poisson tensor on Σ is nondegenerate and thus Σ is a symplectic
manifold. This proves the proposition and also completes the proof of the
symplectic stratification theorem.
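A familiar instance where the rank of $B^\sharp_z$ jumps is the rigid-body bracket on $\mathbb{R}^3 \cong \mathfrak{so}(3)^*$. The sketch below is an illustration only, not part of the proof: it assumes the Poisson tensor $B^\sharp_z(v) = z \times v$ from the main text's rigid-body example, and the helper names (`cross`, `dot`, `sharp_images`, `rank`, `leaf_dimension`) are hypothetical. It checks that the range of $B^\sharp_z$ is two-dimensional and tangent to the sphere through $z$ when $z \ne 0$, and zero-dimensional at the origin.

```python
def cross(a, b):
    """Cross product in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sharp_images(z):
    """Images B_z^sharp(e_i) = z x e_i of the standard basis covectors (assumed rigid-body tensor)."""
    basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    return [cross(z, e) for e in basis]


def rank(rows, tol=1e-12):
    """Rank of a small matrix by Gaussian elimination with partial pivoting."""
    rows = [list(map(float, r)) for r in rows]
    r = 0
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[pivot][c]) <= tol:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def leaf_dimension(z):
    """dim range B_z^sharp, which equals the dimension of the leaf through z."""
    return rank(sharp_images(z))
```

In this model the leaves are the spheres centered at the origin together with the origin itself, so the dimension of $\operatorname{range} B^\sharp_z$ is not constant on $P$.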
There is another proof of the symplectic stratification theorem (using the same idea as for the Darboux coordinates) in Weinstein [1983] (see Libermann and Marle [1987] as well). The proof given above is along the
Frobenius integrability idea. Actually it can be used to produce a proof of
the generalized Frobenius theorem.
Theorem N10.A.3 (Singular Frobenius Theorem). Let D be a distribu-
tion of subspaces of the tangent bundle of a finite dimensional manifold M ,
that is, Dx ⊂ Tx M as x varies in M . Suppose it is smooth in the sense that
for each x there are smooth vector fields Xi defined on some open neigh-
borhood of x and with values in D such that Xi (x) give a basis of Dx . Then
D is integrable, that is, for each x ∈ M there is an immersed submanifold
Σx ⊂ M with Tx Σx = Dx , if and only if the distribution D is invariant
under the (local) flows along vector fields with values in D.
Proof. The “only if” part follows easily. For the “if” part we remark that the proof of the theorem above can be reproduced here replacing the range of $B^\sharp_z$ by $D_x$ and the Hamiltonian vector fields with vector fields in $D$. The crucial property needed to prove (ii) in the above proposition (i.e., Hamiltonian fields remain Hamiltonian under Hamiltonian flows) is replaced by the invariance of $D$ given in the hypothesis.
Remarks.
1. The conclusion of the above theorem is the same as the Frobenius

integrability theorem but it is not assumed that the dimension of Dx is
constant.
2. Analogous to the symplectic leaves of a Poisson manifold, we can define
the maximal integral manifolds of the integrable distribution D using curves
along vector fields in D instead of Hamiltonian vector fields. They are also
injectively immersed submanifolds in M .
3. The condition that (local) flows of the vector fields with values in D
leave D invariant implies the involution property of D, that is, [X, Y ] is a
vector field with values in D if both X and Y are vector fields with values
in D (see Chapter 4 of the text). But the involution property alone is not
enough to guarantee that D is integrable (if the dimension of D is not
constant).
4. This generalization of the Frobenius integrability theorem is due to
Hermann [1962], Stefan [1974], Sussman [1973], and it has proved quite
useful in control theory; see also Libermann and Marle [1987].
N11
Momentum Maps
N11.A Another Example of a Momentum Map
Here is an interesting example of a momentum map that illustrates some
of the convexity properties of momentum maps of torus actions.
As in Example (a) of §3.3 and Exercise 5.5-4, one checks that the momentum map of the standard $\mathbb{T}^{n+1}$ action
\[
(\theta_0, \ldots, \theta_n) \cdot (z_0, z_1, \ldots, z_n) = (e^{i\theta_0} z_0, \ldots, e^{i\theta_n} z_n)
\]
on $\mathbb{C}^{n+1}$ has the expression
\[
J_{\mathbb{C}^{n+1}}(z_0, \ldots, z_n) = \frac{1}{2} \left( |z_0|^2, \ldots, |z_n|^2 \right).
\]
Since $J_{\mathbb{C}^{n+1}}$ is invariant under the circle action
\[
\theta \cdot (z_0, \ldots, z_n) = (e^{i\theta} z_0, \ldots, e^{i\theta} z_n)
\]
on the unit sphere $S^{2n+1} = J^{-1}_{\mathbb{C}^{n+1}}(1/2)$, it follows that $J_{\mathbb{C}^{n+1}}$ induces a map $J_{\mathbb{CP}^n} : \mathbb{CP}^n \to \mathbb{R}^{n+1}$ given by
\[
J_{\mathbb{CP}^n}([z_0 : \cdots : z_n]) = \frac{1}{2} \left( |z_0|^2, \ldots, |z_n|^2 \right),
\]
where
\[
[z_0 : z_1 : \cdots : z_n] = [(z_0, \ldots, z_n)]
\]
denotes the equivalence class of $(z_0, \ldots, z_n)$ in $\mathbb{CP}^n$. It is easily verified that $J_{\mathbb{CP}^n}$ is a momentum map of the $\mathbb{T}^{n+1}$ action
\[
(\theta_0, \ldots, \theta_n) \cdot [z_0 : \cdots : z_n] = \left[ e^{i\theta_0} z_0 : \cdots : e^{i\theta_n} z_n \right]
\]
on $\mathbb{CP}^n$. The image of $\mathbb{CP}^n$ under $J_{\mathbb{CP}^n}$ clearly coincides with
\[
J_{\mathbb{C}^{n+1}}(S^{2n+1}) = \left\{ (t_0, \ldots, t_n) \in \mathbb{R}^{n+1} \mid t_0 + \cdots + t_n = 1/2,\ t_i \ge 0 \text{ for all } i = 0, \ldots, n \right\},
\]
which is the $n$-simplex in $\mathbb{R}^{n+1}$ spanned by the vertices
\[
(1/2, 0, \ldots, 0),\ (0, 1/2, \ldots, 0),\ \ldots,\ (0, \ldots, 0, 1/2).
\]
This is an instance of the Atiyah–Guillemin–Sternberg convexity theorem (Atiyah [1982], Guillemin and Sternberg [1982]), which states that if a torus $T$ acts on a compact connected symplectic manifold $P$ in a Hamiltonian fashion with invariant momentum map $J : P \to \mathfrak{t}^*$, then $J(P)$ is a convex compact polytope whose vertices are given by $J(P^T)$, where $P^T$ is the fixed point set of the $T$-action on $P$. These fixed point sets are, interestingly, the bifurcation points of the momentum mapping, according to Arms, Marsden, and Moncrief [1981].
In our example,
\[
(\mathbb{CP}^n)^{\mathbb{T}^{n+1}} = \{ [1 : 0 : \cdots : 0], \ldots, [0 : \cdots : 1] \},
\]
whose image clearly consists of the vertices
\[
(1/2, 0, \ldots, 0),\ \ldots,\ (0, \ldots, 0, 1/2).
\]
If $n = 1$, the Hopf fibration
\[
(z_0, z_1) \in S^3 \subset \mathbb{C}^2 \mapsto \left( 2 z_0 \bar{z}_1,\ |z_1|^2 - |z_0|^2 \right) \in S^2
\]
identifies $\mathbb{CP}^1$ with $S^2$ and the $\mathbb{T}^2$ action on $S^2$ is given by rotations about the vertical axis:
\[
(\theta_0, \theta_1) \cdot (x^1, x^2, x^3) = \left( x^1 \cos(\theta_0 - \theta_1) - x^2 \sin(\theta_0 - \theta_1),\ x^1 \sin(\theta_0 - \theta_1) + x^2 \cos(\theta_0 - \theta_1),\ x^3 \right).
\]
In terms of $(x^1, x^2, x^3) \in S^2$ the momentum map $J_{\mathbb{CP}^1}$ becomes
\[
J_{S^2}(x^1, x^2, x^3) = \frac{1}{4} (1 + x^3,\ 1 - x^3),
\]
whose image is the line segment joining $(0, 1/2)$ and $(1/2, 0)$ in the plane.
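The claims of this example are easy to probe numerically. The following is a minimal sketch, assuming only the formulas for the torus action, the momentum map on $\mathbb{C}^{n+1}$, and the Hopf map quoted above; the function names (`momentum_map`, `torus_action`, `random_unit_vector`, `hopf`) are hypothetical helpers introduced for illustration.

```python
import cmath
import math
import random


def momentum_map(z):
    """J(z_0, ..., z_n) = (1/2)(|z_0|^2, ..., |z_n|^2)."""
    return tuple(0.5 * abs(w) ** 2 for w in z)


def torus_action(thetas, z):
    """(theta_0, ..., theta_n) . (z_0, ..., z_n) = (e^{i theta_0} z_0, ..., e^{i theta_n} z_n)."""
    return tuple(cmath.exp(1j * t) * w for t, w in zip(thetas, z))


def random_unit_vector(dim, rng):
    """Random point on the unit sphere of C^dim (dim = n + 1)."""
    raw = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(w) ** 2 for w in raw))
    return tuple(w / norm for w in raw)


def hopf(z0, z1):
    """Hopf map (z_0, z_1) -> (2 z_0 conj(z_1), |z_1|^2 - |z_0|^2) as a point of R^3."""
    w = 2 * z0 * z1.conjugate()
    return (w.real, w.imag, abs(z1) ** 2 - abs(z0) ** 2)
```

On the unit sphere the components of `momentum_map` are nonnegative and sum to $1/2$, so every sampled value lies in the simplex described above, and the torus action leaves the value unchanged because it only rotates the phase of each coordinate.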
N13
Lie-Poisson Reduction
N13.A Proof of the Lie–Poisson Reduction Theorem for $\mathrm{Diff}_{\mathrm{vol}}(M)$
An interesting special case of the Lie–Poisson reduction theorem is G =
Diff vol (Ω), the subgroup of the group of diffeomorphisms Diff(Ω) of a region
Ω ⊂ R3 , consisting of the volume-preserving diffeomorphisms. We shall
treat Diff(Ω) and Diff vol (Ω) formally, although it is known how to handle
the functional analysis issues involved (see Ebin and Marsden [1970] and
Adams, Ratiu, and Schmid [1986] and references therein). We shall prove
the Lie–Poisson reduction theorem for this special case.
The Lie Algebra of Diff. For η ∈ Diff(Ω), the tangent space at η is
given by the set of maps V : Ω → T Ω satisfying V (X) ∈ Tη(X) Ω, that is,
vector fields over η. We think of V as a material velocity field. Thus, the
tangent space at the identity is the space of vector fields on Ω (tangent to
∂Ω). Given two such vector fields, their left Lie algebra bracket is related
to the Jacobi–Lie bracket by (see Chapter 9):
[V, W ]LA = − [V, W ]JL ,
that is,
[V, W ]LA = (W · ∇)V − (V · ∇)W, (N13.A.1)
as one finds using the definitions.

Right Translation. We will be computing the right Lie–Poisson bracket

on g∗ . Right translation by ϕ on G is given by
Rϕ η = η ◦ ϕ. (N13.A.2)
Differentiating (N13.A.2) with respect to η gives
T Rϕ · V = V ◦ ϕ. (N13.A.3)
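Identify $T_\eta G$ with those $V$'s such that the vector field on $\mathbb{R}^3$ given by $v = V \circ \eta^{-1}$ is divergence-free, and identify $T^*_\eta G$ with $T_\eta G$ via the pairing
\[
\langle \pi, V \rangle = \int_\Omega \pi \cdot V \, dx \, dy \, dz, \tag{N13.A.4}
\]
where $\pi \cdot V$ is the dot product on $\mathbb{R}^3$. By the change of variables formula, and the fact that $\varphi \in G$ has unit Jacobian,
\[
\langle T^*R_\varphi \cdot \pi, V \rangle = \langle \pi, TR_\varphi \cdot V \rangle
 = \int_\Omega \pi \cdot (V \circ \varphi) \, dx \, dy \, dz
 = \int_\Omega (\pi \circ \varphi^{-1}) \cdot V \, dx \, dy \, dz,
\]
so
\[
T^*R_\varphi \cdot \pi = \pi \circ \varphi^{-1}. \tag{N13.A.5}
\]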
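The change of variables behind (N13.A.5) has an exact discrete analogue, sketched below under loud assumptions: a permutation of a finite set plays the role of a volume-preserving diffeomorphism (unit Jacobian becomes "each point counted once"), scalar-valued lists replace vector fields, and the sum replaces the integral. All names (`pairing`, `compose`, `inverse`, `check_dual_of_right_translation`) are hypothetical.

```python
import random


def pairing(pi, V):
    """Discrete stand-in for <pi, V> = integral of pi . V."""
    return sum(p * v for p, v in zip(pi, V))


def compose(f, phi):
    """Return f ∘ phi for a function f stored as a list and a permutation phi."""
    return [f[j] for j in phi]


def inverse(phi):
    """Inverse permutation."""
    inv = [0] * len(phi)
    for i, j in enumerate(phi):
        inv[j] = i
    return inv


def check_dual_of_right_translation(n=8, seed=0):
    """Verify <pi, V ∘ phi> == <pi ∘ phi^{-1}, V> with exact integer arithmetic."""
    rng = random.Random(seed)
    phi = list(range(n))
    rng.shuffle(phi)
    pi = [rng.randint(-5, 5) for _ in range(n)]
    V = [rng.randint(-5, 5) for _ in range(n)]
    return pairing(pi, compose(V, phi)) == pairing(compose(pi, inverse(phi)), V)
```

The identity holds because reindexing the sum by $j = \varphi(i)$ is a bijection, which is the finite counterpart of the unit-Jacobian change of variables used in the text.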
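Derivatives of Right Invariant Extensions. If $F : \mathfrak{g}^* \to \mathbb{R}$ is given, its right invariant extension is
\[
F_R(\eta, \pi) = F(\pi \circ \eta^{-1}). \tag{N13.A.6}
\]
Let us denote elements of $\mathfrak{g}^*$ by $\mathbf{M}$, so we are investigating the relation between the canonical bracket of $F_R$ and $H_R$ and the Lie–Poisson bracket of $F$ and $H$ via the relation
\[
\mathbf{M} \circ \eta = \pi .
\]
From (N13.A.6) and the chain rule, we get
\[
D_\eta F_R(\mathrm{Id}, \pi) \cdot v = -D_{\mathbf{M}} F(\mathbf{M}) \cdot D_\eta \pi(\mathrm{Id}) \cdot v
 = -\int_\Omega \left( (v \cdot \nabla)\mathbf{M} \right) \cdot \frac{\delta F}{\delta \mathbf{M}} \, dx \, dy \, dz, \tag{N13.A.7}
\]
where $\delta F/\delta \mathbf{M}$ is a divergence-free vector field parallel to the boundary.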

Since T ∗ G is not given as a product space, one has to worry about what it
means to hold π constant in (N13.A.7). We leave it to the ambitious reader
to justify this formal calculation.
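Computation of Brackets. Thus, the canonical bracket at the identity becomes
\[
\begin{aligned}
\{F_R, H_R\}(\mathrm{Id}, \pi)
 &= \int_\Omega \left( \frac{\delta F_R}{\delta \eta} \cdot \frac{\delta H_R}{\delta \pi} - \frac{\delta H_R}{\delta \eta} \cdot \frac{\delta F_R}{\delta \pi} \right) dx \, dy \, dz \\
 &= D_\eta F_R(\mathrm{Id}, \pi) \cdot \frac{\delta H_R}{\delta \pi} - D_\eta H_R(\mathrm{Id}, \pi) \cdot \frac{\delta F_R}{\delta \pi}.
\end{aligned} \tag{N13.A.8}
\]
At the identity, $\pi = \mathbf{M}$ and $\delta F_R/\delta \pi = \delta F/\delta \mathbf{M}$, so substituting this and (N13.A.7) into (N13.A.8), we get
\[
\{F_R, H_R\}(\mathrm{Id}, \mathbf{M})
 = -\int_\Omega \left[ \left( \left( \frac{\delta H}{\delta \mathbf{M}} \cdot \nabla \right) \mathbf{M} \right) \cdot \frac{\delta F}{\delta \mathbf{M}}
 - \left( \left( \frac{\delta F}{\delta \mathbf{M}} \cdot \nabla \right) \mathbf{M} \right) \cdot \frac{\delta H}{\delta \mathbf{M}} \right] dx \, dy \, dz. \tag{N13.A.9}
\]
Equation (N13.A.9) may be integrated by parts to give
\[
\begin{aligned}
\{F_R, H_R\}(\mathrm{Id}, \mathbf{M})
 &= \int_\Omega \mathbf{M} \cdot \left[ \left( \frac{\delta H}{\delta \mathbf{M}} \cdot \nabla \right) \frac{\delta F}{\delta \mathbf{M}}
 - \left( \frac{\delta F}{\delta \mathbf{M}} \cdot \nabla \right) \frac{\delta H}{\delta \mathbf{M}} \right] dx \, dy \, dz \\
 &= \int_\Omega \mathbf{M} \cdot \left[ \frac{\delta F}{\delta \mathbf{M}}, \frac{\delta H}{\delta \mathbf{M}} \right]_{LA} dx \, dy \, dz,
\end{aligned} \tag{N13.A.10}
\]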
which is the “+” Lie–Poisson bracket. In doing this step note div(δH/δM) =
0 and since δH/δM and δF/δM are parallel to the boundary, no boundary
term appears. When doing free boundary problems, these boundary terms
are essential to retain (see Lewis, Marsden, Montgomery, and Ratiu [1986]).
For other diffeomorphism groups, it may be convenient to treat M as a
one-form density rather than a vector field.
N13.B Proof of the Lie–Poisson Reduction Theorem for $\mathrm{Diff}_{\mathrm{can}}(P)$
This section discusses the Lie–Poisson reduction theorem for the special
case G = Diff can (P ), the group of canonical transformations of a bound-
aryless symplectic (or Poisson) manifold P . The Lie algebra of Diff can (P ) is
the algebra of infinitesimal Poisson automorphisms, or Poisson derivations,
that is, vector fields X on P for which
X[{f, h}] = {X[f ], h} + {f, X[h]}
for any f, h ∈ F(P ). To avoid complications, we work with the globally

Hamiltonian vector fields by suitably restricting P or Diff can (P ). Each
Hamiltonian vector field can be identified with its generator (modulo additive constants). From the formula
\[
[X_k, X_h]_{LA} = -[X_k, X_h]_{JL} = X_{\{k,h\}} \tag{N13.B.1}
\]
we see that $\mathfrak{g}$ may be identified with $\mathcal{F}(P)$ with the Lie bracket given by the Poisson bracket. One could then identify $\mathfrak{g}^*$ with functions $f$ on $P$ via the pairing
\[
\langle f, h \rangle = \int_P f h \, d\mu, \tag{N13.B.2}
\]
where $d\mu$ is the Liouville measure. If $P$ is only a Poisson manifold, identify $\mathfrak{g}^*$ with the densities on $P$. As in the last section, $T_\eta G$ consists of vector fields of the form $X_k \circ \eta$.
To identify the dual space of $T_\eta G$, we need objects to pair with $T_\eta G$ in a nondegenerate way. Since $X_k \circ \eta = T\eta \circ X_{k \circ \eta}$, we cannot simply use the pairing (N13.B.2) to identify $T^*_\eta G$ with $\mathcal{F}(P)$; such a procedure would not account for the extra factor $T\eta$. Instead, regard $\pi \in \mathfrak{g}^*$ as a one-form on $P$ and pair it with $X_k \in \mathfrak{g}$ by
\[
\langle \pi, X_k \rangle = \int_P \pi \cdot X_k \, d\mu. \tag{N13.B.3}
\]
This pairing is degenerate; for example, if $\pi = df$, then $\langle \pi, X_k \rangle = 0$ by Stokes' theorem. To simplify matters, let us work in coordinates, and write
\[
\pi = \pi_i \, dq^i + \pi^i \, dp_i
\]
so, integrating by parts,
\[
\int_P \pi \cdot X_k \, d\mu
 = \int_P \left( \pi_i \frac{\partial k}{\partial p_i} - \pi^i \frac{\partial k}{\partial q^i} \right) dq \, dp
 = \int_P \left( -\frac{\partial \pi_i}{\partial p_i} + \frac{\partial \pi^i}{\partial q^i} \right) k \, dq \, dp. \tag{N13.B.4}
\]
Thus if we work modulo $\pi$'s satisfying the divergence-like condition
\[
\frac{\partial \pi^i}{\partial q^i} - \frac{\partial \pi_i}{\partial p_i} = 0, \tag{N13.B.5}
\]
then the pairing (N13.B.3) is nondegenerate. Now let $f \in \mathcal{F}(P)$ be given and define $\pi$ by requiring
\[
\int_P \pi \cdot X_k \, d\mu = \int_P f k \, d\mu \tag{N13.B.6}
\]
for all $k \in \mathcal{F}(P)$. Thus, from (N13.B.4), we need
\[
\frac{\partial \pi^i}{\partial q^i} - \frac{\partial \pi_i}{\partial p_i} = f. \tag{N13.B.7}
\]
Note that if π = (∂h/∂q i ) dq i + (∂h/∂pi ) dpi , the left side of (N13.B.7) is
identically zero since ∂ 2 h/∂q i ∂pi = ∂ 2 h/∂pi ∂q i . If we take π = (∂ψ/∂pi ) dq i −
(∂ψ/∂q i ) dpi , then ψ is determined by −∆ψ = f , so ψ is now uniquely de-
termined modulo π’s satisfying (N13.B.5). In two-dimensional incompress-
ible flow, which corresponds to the special case dim P = 2, ψ is the stream
function and f the vorticity.
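The two substitutions into (N13.B.7) described above can be written out term by term. This is only a worked check of the stated claims, using the coordinate conventions of (N13.B.4); here $\Delta$ is taken to be the Laplacian in the canonical coordinates $(q^i, p_i)$, which is an assumption about notation.

```latex
% Exact one-form: pi_i = dh/dq^i, pi^i = dh/dp_i.
\[
\frac{\partial \pi^i}{\partial q^i} - \frac{\partial \pi_i}{\partial p_i}
 = \frac{\partial^2 h}{\partial q^i \, \partial p_i} - \frac{\partial^2 h}{\partial p_i \, \partial q^i} = 0 .
\]
% Stream-function form: pi_i = d psi/dp_i, pi^i = -d psi/dq^i.
\[
\frac{\partial \pi^i}{\partial q^i} - \frac{\partial \pi_i}{\partial p_i}
 = -\sum_i \left( \frac{\partial^2 \psi}{\partial (q^i)^2} + \frac{\partial^2 \psi}{\partial p_i^2} \right)
 = -\Delta \psi ,
\]
% so (N13.B.7) becomes -\Delta\psi = f.
```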
Identify $T^*_\eta G$ with one-forms modulo exact one-forms over η; that is,
objects of the form $\pi_\eta = \pi \circ \eta$. Given F ∈ F(P), define F on g∗ by
F(π) = F(f), where π and f are related by (N13.B.6), and extend it to be
right invariant by
\[
F_R(\eta, \pi_\eta) = F(\pi_\eta \circ \eta^{-1}).
\]
As in the preceding section and using vector analysis notation,
\begin{align*}
D_\eta F_R(\mathrm{Id}, \pi) \cdot X_k
  &= -DF(\pi) \cdot D_\eta \pi_\eta(\mathrm{Id}) \cdot X_k \\
  &= -\int_P \frac{\delta F}{\delta \pi} \cdot (X_k \cdot \nabla \pi) \, d\mu. \tag{N13.B.8}
\end{align*}
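Also, $\delta F_R/\delta \pi = \delta F/\delta \pi$ at η = Id, as before. Thus the canonical bracket at
η = Id is
\begin{align*}
\{F_R, H_R\}(\mathrm{Id}, \pi)
  &= \int_P \left( \frac{\delta F_R}{\delta \eta} \frac{\delta H_R}{\delta \pi}
     - \frac{\delta H_R}{\delta \eta} \frac{\delta F_R}{\delta \pi} \right) d\mu \\
  &= -\int_P \left[ \frac{\delta F}{\delta \pi} \cdot \left( \frac{\delta H}{\delta \pi} \cdot \nabla \pi \right)
     - \frac{\delta H}{\delta \pi} \cdot \left( \frac{\delta F}{\delta \pi} \cdot \nabla \pi \right) \right] d\mu,
\end{align*}
which may be integrated by parts to give
\begin{align*}
\{F_R, H_R\}(\mathrm{Id}, \pi)
  &= \int_P \pi \cdot \left[ \left( \frac{\delta H}{\delta \pi} \cdot \nabla \right) \frac{\delta F}{\delta \pi}
     - \left( \frac{\delta F}{\delta \pi} \cdot \nabla \right) \frac{\delta H}{\delta \pi} \right] d\mu \\
  &= \int_P \pi \cdot \left[ \frac{\delta F}{\delta \pi}, \frac{\delta H}{\delta \pi} \right]_{LA} d\mu. \tag{N13.B.9}
\end{align*}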
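To write this in terms of F(P) we use (N13.B.6) to write
\[
\left\langle \delta\pi, \frac{\delta F}{\delta \pi} \right\rangle = \int_P \delta\pi \cdot X_k \, d\mu
\]
for some k to be determined. By the chain rule,
\begin{align*}
\left\langle \delta\pi, \frac{\delta F}{\delta \pi} \right\rangle
  &= D_\pi F \cdot \delta\pi = D_f F \cdot (D_\pi f \cdot \delta\pi) \\
  &= \int_P \frac{\delta F}{\delta f} (D_\pi f \cdot \delta\pi) \, d\mu. \tag{N13.B.10}
\end{align*}
Differentiating (N13.B.6) implicitly relative to π, we get
\[
\int_P \delta\pi \cdot X_k \, d\mu = \int_P (D_\pi f \cdot \delta\pi) \, k \, d\mu,
\]
so by (N13.B.10)
\[
\left\langle \delta\pi, \frac{\delta F}{\delta \pi} \right\rangle = \int_P \delta\pi \cdot X_{\delta F/\delta f} \, d\mu, \tag{N13.B.11}
\]
i.e.,
\[
\frac{\delta F}{\delta \pi} = X_{\delta F/\delta f}. \tag{N13.B.12}
\]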
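Thus (N13.B.9) becomes, with the aid of (N13.B.1) and (N13.B.6),
\begin{align*}
\{F_R, H_R\}(\mathrm{Id}, \pi)
  &= \int_P \pi \cdot \left[ X_{\delta F/\delta f}, X_{\delta H/\delta f} \right]_{LA} d\mu \\
  &= \int_P \pi \cdot X_{\{\delta F/\delta f, \, \delta H/\delta f\}} \, d\mu \\
  &= \int_P f \left\{ \frac{\delta F}{\delta f}, \frac{\delta H}{\delta f} \right\} d\mu \tag{N13.B.13}
\end{align*}
which is the “+” Lie–Poisson bracket on g∗ identified with F(P).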
Remarks. 1. This derivation is related to one given by Kaufman and
Dewar [1984].
2. The bracket (N13.B.13) can be understood as a limit of the canonical
bracket for a larger number of particles moving in P by taking f to be
a sum of delta functions at the particle positions. This derivation is due
to Bialynicki-Birula, Hubbard, and Turski [1984]; see also Kaufman [1982]
and Marsden, Morrison, and Weinstein [1984].
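The particle limit in Remark 2 can be illustrated for linear functionals. This is a sketch under stated assumptions, not the derivation in the cited papers: take F(f) = ∫ f φ dµ and H(f) = ∫ f ψ dµ, so that δF/δf = φ and δH/δf = ψ, and take f to be a sum of delta functions at N particle positions in P = R². Then (N13.B.13) reduces to the sum of {φ, ψ} over the particles, which should agree with the canonical bracket on P^N of the N-particle functions Σ φ(z_j) and Σ ψ(z_j). The functions `phi`, `psi` and the particle positions are hypothetical choices, and the convention {φ, ψ} = φ_q ψ_p − φ_p ψ_q is assumed.

```python
import math

# Illustrative one-particle observables on P = R^2 (hypothetical choices).
def phi(q, p): return q * q * p
def psi(q, p): return math.sin(q) + p * p

def bracket_phi_psi(q, p):
    # {phi, psi} = phi_q psi_p - phi_p psi_q, worked out by hand.
    return 4.0 * q * p * p - q * q * math.cos(q)

particles = [(0.3, -1.2), (1.1, 0.4), (-0.7, 2.0)]   # (q_j, p_j), assumed data

# N-particle functions on P^N, with z = (q_1, p_1, ..., q_N, p_N).
def F_N(z): return sum(phi(z[2 * j], z[2 * j + 1]) for j in range(len(z) // 2))
def H_N(z): return sum(psi(z[2 * j], z[2 * j + 1]) for j in range(len(z) // 2))

def grad(fun, z, h=1e-5):
    """Central finite-difference gradient."""
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        g.append((fun(zp) - fun(zm)) / (2.0 * h))
    return g

def canonical_bracket(F, H, z):
    """Canonical bracket on P^N: sum over particles of F_q H_p - F_p H_q."""
    dF, dH = grad(F, z), grad(H, z)
    return sum(dF[2 * j] * dH[2 * j + 1] - dF[2 * j + 1] * dH[2 * j]
               for j in range(len(z) // 2))

z = [c for qp in particles for c in qp]
n_particle = canonical_bracket(F_N, H_N, z)
# (N13.B.13) with f a sum of delta functions at the particle positions:
lie_poisson = sum(bracket_phi_psi(q, p) for q, p in particles)
print(n_particle, lie_poisson)
```

The two printed numbers agree up to finite-difference error.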
N13.C The Linearized Lie–Poisson Equations
Here we show that the Lie–Poisson equations (such as the rigid body or the
ideal fluid equations), linearized about an equilibrium solution, are
Hamiltonian with respect to a “constant coefficient” Lie–Poisson bracket. The
Hamiltonian for these linearized equations is $\frac{1}{2}\delta^2(H + C)|_e$, the quadratic
functional obtained by taking one-half of the second variation of the
Hamiltonian plus conserved quantities and evaluating it at the equilibrium
solution, where the conserved quantity C (often a Casimir) is chosen so that
the first variation δ(H + C) vanishes at the equilibrium. A consequence
is that the linearized dynamics preserves $\frac{1}{2}\delta^2(H + C)|_e$. This is useful for
studying stability of the linearized equations.
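For a Lie algebra g, recall that the Lie–Poisson bracket is defined on g∗,
the dual of g with respect to (a weakly nondegenerate) pairing ⟨ , ⟩ between
g∗ and g by the usual formula
\[
\{F, G\}(\mu) = \left\langle \mu, \left[ \frac{\delta F}{\delta \mu}, \frac{\delta G}{\delta \mu} \right] \right\rangle, \tag{N13.C.1}
\]
where δF/δµ ∈ g is determined by
\[
DF(\mu) \cdot \delta\mu = \left\langle \delta\mu, \frac{\delta F}{\delta \mu} \right\rangle \tag{N13.C.2}
\]
when such an element δF/δµ exists, for any µ, δµ ∈ g∗. The equations of
motion are
\[
\frac{d\mu}{dt} = -\operatorname{ad}^*_{\delta H/\delta \mu} \mu, \tag{N13.C.3}
\]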
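where H : g∗ → R is the Hamiltonian, $\operatorname{ad}_\xi$ : g → g is the adjoint action,
$\operatorname{ad}_\xi \cdot \eta = [\xi, \eta]$ for ξ, η ∈ g, and $\operatorname{ad}^*_\xi$ : g∗ → g∗ is its dual. Let $\mu_e$ ∈ g∗ be an
equilibrium solution of (N13.C.3). The linearized equations of (N13.C.3) at
$\mu_e$ are obtained by expanding in a Taylor expansion with small parameter
ε using $\mu = \mu_e + \varepsilon \, \delta\mu$, and taking $(d/d\varepsilon)|_{\varepsilon = 0}$ of the resulting equations.
This gives
\[
\frac{\delta H}{\delta \mu} = \frac{\delta H}{\delta \mu_e}
  + \varepsilon D\!\left( \frac{\delta H}{\delta \mu} \right)(\mu_e) \cdot \delta\mu + O(\varepsilon^2), \tag{N13.C.4}
\]
where $\langle \delta H/\delta \mu_e, \delta\mu \rangle := DH(\mu_e) \cdot \delta\mu$, and the derivative $D(\delta H/\delta \mu)(\mu_e) \cdot \delta\mu$
is the linear functional
\[
\nu \in \mathfrak{g}^* \mapsto D^2 H(\mu_e) \cdot (\delta\mu, \nu) \in \mathbb{R} \tag{N13.C.5}
\]
by using the definition (N13.C.2). Since
\[
\delta^2 H(\delta\mu) := D^2 H(\mu_e) \cdot (\delta\mu, \delta\mu),
\]
it follows that the functional (N13.C.5) equals
\[
\frac{1}{2} \frac{\delta(\delta^2 H)}{\delta(\delta\mu)}.
\]
Consequently, (N13.C.4) becomes
\[
\frac{\delta H}{\delta \mu} = \frac{\delta H}{\delta \mu_e}
  + \frac{1}{2} \varepsilon \frac{\delta(\delta^2 H)}{\delta(\delta\mu)} + O(\varepsilon^2) \tag{N13.C.6}
\]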
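and the Lie–Poisson equations (N13.C.3) yield
\[
\frac{d\mu_e}{dt} + \varepsilon \frac{d(\delta\mu)}{dt}
  = -\operatorname{ad}^*_{\delta H/\delta \mu_e} \mu_e
    - \frac{1}{2} \varepsilon \operatorname{ad}^*_{\delta(\delta^2 H)/\delta(\delta\mu)} \mu_e
    - \varepsilon \operatorname{ad}^*_{\delta H/\delta \mu_e} \delta\mu + O(\varepsilon^2).
\]
Thus, the linearized equations are
\[
\frac{d(\delta\mu)}{dt}
  = -\frac{1}{2} \operatorname{ad}^*_{\delta(\delta^2 H)/\delta(\delta\mu)} \mu_e
    - \operatorname{ad}^*_{\delta H/\delta \mu_e} \delta\mu. \tag{N13.C.7}
\]
If H is replaced by $H_C := H + C$, with the Casimir function C chosen to
satisfy $\delta H_C/\delta \mu_e = 0$, we get $\operatorname{ad}^*_{\delta H_C/\delta \mu_e} \delta\mu = 0$, and so
\[
\frac{d(\delta\mu)}{dt} = -\frac{1}{2} \operatorname{ad}^*_{\delta(\delta^2 H_C)/\delta(\delta\mu)} \mu_e. \tag{N13.C.8}
\]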
Equation (N13.C.8) is Hamiltonian with respect to the linearized Poisson
bracket (see Example (f) of §10.1):
\[
\{F, G\}(\mu) = \left\langle \mu_e, \left[ \frac{\delta F}{\delta \mu}, \frac{\delta G}{\delta \mu} \right] \right\rangle. \tag{N13.C.9}
\]
Ratiu [1982] interprets this bracket in terms of a Lie–Poisson structure
of a loop extension of g. The Poisson bracket (N13.C.9) differs from the
Lie–Poisson bracket (N13.C.1) in that it is constant in µ. With respect to
the Poisson bracket (N13.C.9), Hamilton's equations given by $\delta^2 H_C$ are
(N13.C.8), as an easy verification shows. Note that the critical points of
$\delta^2 H_C$ are stationary solutions of the linearized equation (N13.C.8), that is,
they are neutral modes for (N13.C.8).
If $\delta^2 H_C$ is definite, then either $\delta^2 H_C$ or $-\delta^2 H_C$ is positive-definite and
hence defines a norm on the space of perturbations δµ (which is g∗). Being
twice the Hamiltonian function for (N13.C.8), $\delta^2 H_C$ is conserved. So, any
solution of (N13.C.8) starting on an energy surface of $\delta^2 H_C$ (i.e., on a
sphere in this norm) stays on it and hence the zero solution of (N13.C.8) is
(Liapunov) stable. Thus, formal stability, i.e., definiteness of $\delta^2 H_C$, implies
linearized stability. It should be noted, however, that the conditions for
definiteness of $\delta^2 H_C$ are entirely different from the conditions for “normal
mode stability,” that is, that the operator acting on δµ given by (N13.C.8)
have a purely imaginary spectrum. In particular, having a purely imaginary
spectrum for the linearized equation does not produce Liapunov stability
of the linearized equations.
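The conservation of $\delta^2 H_C$ along (N13.C.8) can be observed numerically in a small case. The following is a minimal sketch under assumptions not taken from the text: g = so(3) is identified with R³ so that ad∗_ξ µ = µ × ξ, the Hamiltonian is the rigid-body kinetic energy H = Σ µ_i²/(2 I_i) with illustrative moments of inertia, the equilibrium is µ_e = (1, 0, 0), and the Casimir is C = −|µ|²/(2 I_1), which makes the first variation of H + C vanish at µ_e. The sign convention is that of (N13.C.3); the usual rigid body equations correspond to the opposite sign, which does not affect the conservation statement. A hand-written Runge–Kutta step is used as the integrator.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

I = (1.0, 2.0, 3.0)        # principal moments of inertia (illustrative values)
mu_e = (1.0, 0.0, 0.0)     # equilibrium: rotation about the first axis

def L(d):
    # Hessian of H_C = H + C at mu_e, with C = -|mu|^2 / (2 I_1),
    # so that delta^2 H_C(d) = <d, L d> and (1/2) delta(delta^2 H_C)/delta(d) = L d.
    return tuple((1.0 / I[i] - 1.0 / I[0]) * d[i] for i in range(3))

def rhs(d):
    # (N13.C.8) with ad*_xi mu = mu x xi:  d' = -(mu_e x L d) = (L d) x mu_e
    return cross(L(d), mu_e)

def second_variation(d):
    Ld = L(d)
    return sum(d[i] * Ld[i] for i in range(3))

def rk4_step(d, dt):
    def shifted(a, b, s):
        return tuple(a[i] + s * b[i] for i in range(3))
    k1 = rhs(d)
    k2 = rhs(shifted(d, k1, dt / 2))
    k3 = rhs(shifted(d, k2, dt / 2))
    k4 = rhs(shifted(d, k3, dt))
    return tuple(d[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                 for i in range(3))

d = (0.1, 0.2, -0.3)       # initial perturbation (assumed)
q0 = second_variation(d)
for _ in range(1000):
    d = rk4_step(d, 0.01)
q1 = second_variation(d)
print(q0, q1)
```

With this particular C the quadratic form is definite only on perturbations transverse to µ_e; the component along µ_e lies in the kernel of L and stays fixed, which is an instance of a neutral mode.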
The difference between $\delta^2 H_C$ and the operator in (N13.C.8) can be made
explicit, as follows. Assume that there is a weak Ad-invariant metric ⟨ , ⟩
on g and a linear operator L : g → g such that
\[
\delta^2 H_C = \langle \delta\mu, L \, \delta\mu \rangle; \tag{N13.C.10}
\]
L is symmetric with respect to the metric ⟨ , ⟩, that is, ⟨ξ, Lη⟩ = ⟨Lξ, η⟩
for all ξ, η ∈ g. Then the linear operator in (N13.C.8) becomes
\[
\delta\mu \mapsto [L \, \delta\mu, \mu_e] \tag{N13.C.11}
\]
which, of course, differs from L, in general. However, note that the kernel
of L is included in the kernel of the linear operator (N13.C.11), that is, the
zero eigenvalues of L give rise to “neutral modes” in the spectral analysis
of (N13.C.11). There is a remarkable coincidence of the zero-eigenvalue
equations for these operators in fluid mechanics: for the Rayleigh equation
describing plane-parallel shear flow in an inviscid homogeneous fluid, taking
normal modes makes the zero-eigenvalue equations corresponding to L and
to (N13.C.11) coincide (see Abarbanel, Holm, Marsden, and Ratiu [1986]).
For additional applications of the stability method, see Holm, Marsden,
Ratiu, and Weinstein [1985], Abarbanel and Holm [1987], Simo, Posbergh,
and Marsden [1990, 1991], and Simo, Lewis, and Marsden [1991]. For a
more general treatment of the linearization process, see Marsden, Ratiu,
and Raugel [1991].
Exercises
N13.C-1. Write out the linearized rigid body equations about an
equilibrium explicitly.
N13.C-2. Let g be finite dimensional. Let $e_1, \dots, e_n$ be a basis for g
and $e^1, \dots, e^n$ a dual basis for g∗. Let $\mu = \mu_a e^a$ ∈ g∗ and
$H(\mu) = H(\mu_1, \dots, \mu_n)$ : g∗ → R. Let $[\mu_a, \mu_b] = C^d_{ab} \mu_d$. Derive a coordinate
expression for the linearized equations (N13.C.7):
\[
\frac{d(\delta\mu)}{dt}
  = -\frac{1}{2} \operatorname{ad}^*_{\delta(\delta^2 H)/\delta(\delta\mu)} \mu_e
    - \operatorname{ad}^*_{\delta H/\delta \mu_e} \delta\mu.
\]
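As a numerical aid for checking a coordinate expression of this kind, the coadjoint operator can be written with structure constants. The sketch below rests on assumptions beyond the text: the basis brackets are taken as [e_a, e_b] = C^d_{ab} e_d, stored as `C[a][b][d]`, the pairing is the Euclidean one in these coordinates, so that (ad∗_ξ µ)_b = C^d_{ab} ξ^a µ_d, and the function names are hypothetical. The example algebra is so(3) with C^d_{ab} = ε_{abd}, for which ad∗_ξ µ = µ × ξ.

```python
def ad_star(C, xi, mu):
    """(ad*_xi mu)_b = C[a][b][d] * xi[a] * mu[d], from <ad*_xi mu, eta> = <mu, [xi, eta]>."""
    n = len(mu)
    return [sum(C[a][b][d] * xi[a] * mu[d] for a in range(n) for d in range(n))
            for b in range(n)]

def linearized_rhs(C, grad_H, hess_H, mu_e, dmu):
    """Right side of (N13.C.7) in coordinates.

    grad_H and hess_H are the gradient and Hessian of H evaluated at mu_e;
    (1/2) delta(delta^2 H)/delta(dmu) has components hess_H[a][c] * dmu[c].
    """
    n = len(mu_e)
    xi1 = [sum(hess_H[a][c] * dmu[c] for c in range(n)) for a in range(n)]
    t1 = ad_star(C, xi1, mu_e)
    t2 = ad_star(C, grad_H, dmu)
    return [-(t1[b] + t2[b]) for b in range(n)]

def levi_civita(a, b, d):
    # Valid for indices in {0, 1, 2}.
    return ((a - b) * (b - d) * (d - a)) // 2

# Structure constants of so(3) (example algebra, assumed).
C_so3 = [[[levi_civita(a, b, d) for d in range(3)] for b in range(3)]
         for a in range(3)]

# Rigid-body data at mu_e = (1, 0, 0) with I = (1, 2, 3) (illustrative values).
mu_e = [1.0, 0.0, 0.0]
grad_H = [1.0, 0.0, 0.0]
hess_H = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0 / 3.0]]
dmu = [0.1, 0.2, -0.3]
print(linearized_rhs(C_so3, grad_H, hess_H, mu_e, dmu))
```

Other finite-dimensional algebras can be tried by supplying a different table of structure constants.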
N14 Coadjoint Orbits
N14.A Casimir Functions do not Determine Orbits
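The purpose of this section is to use Corollary 14.4.3 to determine all
Casimir functions for the Lie algebra in Example (f) of §14.1. If
\[
\mu = \begin{bmatrix} iu & 0 & 0 \\ 0 & i\alpha u & 0 \\ a & b & 0 \end{bmatrix} \in \mathfrak{g}^*, \qquad
\xi = \begin{bmatrix} is & 0 & x \\ 0 & i\alpha s & y \\ 0 & 0 & 0 \end{bmatrix} \in \mathfrak{g},
\]
for a, b, x, y ∈ C, u, s ∈ R, then it is straightforward to check that
\[
\operatorname{ad}^*_\xi \mu = \begin{bmatrix} iu' & 0 & 0 \\ 0 & i\alpha u' & 0 \\ -isa & -i\alpha s b & 0 \end{bmatrix},
\]
where
\[
u' = -\frac{1}{1 + \alpha^2} \operatorname{Im}(ax + \alpha b y).
\]
Thus, if at least one of a, b is not zero, then
\[
\mathfrak{g}_\mu = \left\{ \begin{bmatrix} 0 & 0 & x \\ 0 & 0 & y \\ 0 & 0 & 0 \end{bmatrix} \;\middle|\; \operatorname{Im}(ax + \alpha b y) = 0 \right\},
\]
whereas if a = b = 0, then $\mathfrak{g}_\mu = \mathfrak{g}$. For C : g∗ → R, denote by
\[
\frac{\delta C}{\delta \mu} = \begin{bmatrix} iC_u & 0 & C_a \\ 0 & i\alpha C_u & C_b \\ 0 & 0 & 0 \end{bmatrix},
\]
where $C_u$ ∈ R, $C_a$, $C_b$ ∈ C are the partial derivatives of C relative to the
variables u, a, b. Thus, the condition $\delta C/\delta \mu \in \mathfrak{g}_\mu$ for all µ implies that
$C_u = 0$, that is, C is independent of u and
\[
\operatorname{Im}(a C_a + \alpha b C_b) = 0. \tag{N14.A.1}
\]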
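The same condition could have been obtained by lengthier direct
calculations involving the Lie–Poisson bracket. Here are the highlights. The
commutator bracket on g is given by
\[
\left[
\begin{bmatrix} is & 0 & x \\ 0 & i\alpha s & y \\ 0 & 0 & 0 \end{bmatrix},
\begin{bmatrix} iu & 0 & z \\ 0 & i\alpha u & w \\ 0 & 0 & 0 \end{bmatrix}
\right]
= \begin{bmatrix} 0 & 0 & i(sz - ux) \\ 0 & 0 & i\alpha(sw - uy) \\ 0 & 0 & 0 \end{bmatrix},
\]
so that for µ ∈ g∗ parameterized by u ∈ R, a, b ∈ C, we have
\begin{align*}
\{F, H\}(\mu) &= -\operatorname{Re} \operatorname{Trace}\left( \mu \left[ \frac{\delta F}{\delta \mu}, \frac{\delta H}{\delta \mu} \right] \right) \\
  &= \operatorname{Im}\left[ a (F_u H_a - H_u F_a) + \alpha b (F_u H_b - H_u F_b) \right]. \tag{N14.A.2}
\end{align*}
Taking $F_u = F_b = 0$ in {F, C} = 0 forces $C_u = 0$. Then the remaining
condition reduces to (N14.A.1).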
To solve (N14.A.1) we need first to convert it into a real equation. Regard
C as being defined on C² with values in C and write C = A + iB, with A
and B real-valued functions. We start by searching for holomorphic Casimir
functions.
Write a = p + iq, b = v + iw so that by the Cauchy–Riemann equations
we have
\[
A_p = B_q, \quad A_q = -B_p, \quad A_v = B_w, \quad A_w = -B_v
\]
and also, since C is holomorphic,
\begin{align*}
C_a &= A_p + iB_p = B_q - iA_q = C_p = -iC_q, \\
C_b &= A_v + iB_v = B_w - iA_w = C_v = -iC_w.
\end{align*}
Therefore,
\begin{align*}
0 &= \operatorname{Im}\left( (p + iq)(A_p + iB_p) + \alpha (v + iw)(A_v + iB_v) \right) \\
  &= qA_p + pB_p + \alpha (wA_v + vB_v) \\
  &= qA_p - pA_q + \alpha w A_v - \alpha v A_w
\end{align*}
by the Cauchy–Riemann equations. We solve this partial differential
equation by the method of characteristics. The flow of the vector field with
components (q, −p, αw, −αv) is given by
\begin{align*}
F_t(p, q, v, w) = (& p \cos t + q \sin t, \; -p \sin t + q \cos t, \\
  & v \cos \alpha t + w \sin \alpha t, \; -v \sin \alpha t + w \cos \alpha t)
\end{align*}
and thus any solution is a rotationally invariant function. An argument
(using a theorem of Whitney [1943]) shows that the general solution of this
equation has the form
\[
A = f(p^2 + q^2, \, v^2 + w^2)
\]
for a real-valued function f : R² → R.
Thus, any Casimir function is a function of p² + q² and v² + w². Note
that
\[
C_a = A_p + iB_p = A_p - iA_q, \qquad C_b = A_v + iB_v = A_v - iA_w.
\]
In particular, if f(x, y) = x, that is, A = p² + q², we have $C_a = 2(p - iq)$ and
$C_b = 0$. One can then verify directly that p² + q² is a Casimir function using
formula (N14.A.2). Similarly, one sees directly that v² + w² is a Casimir
function.
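The direct verification mentioned above can be sketched numerically. The code evaluates the right side of (N14.A.2) from the partial derivatives (F_u, F_a, F_b), using C_a = A_p − iA_q, and also checks that the characteristic flow F_t preserves p² + q² and v² + w². The value of α, the sample point (a, b), and the derivative data `dF` are assumptions made only for illustration.

```python
import math

ALPHA = math.sqrt(2.0)    # the parameter alpha (an irrational value, assumed)

def bracket(a, b, dF, dH):
    """Right side of (N14.A.2); dF = (F_u, F_a, F_b) with F_u real, F_a, F_b complex."""
    Fu, Fa, Fb = dF
    Hu, Ha, Hb = dH
    return (a * (Fu * Ha - Hu * Fa) + ALPHA * b * (Fu * Hb - Hu * Fb)).imag

def dC_abs_a_sq(a, b):
    # C = |a|^2 = p^2 + q^2:  C_u = 0, C_a = 2(p - iq), C_b = 0
    return (0.0, 2 * a.conjugate(), 0j)

def dC_abs_b_sq(a, b):
    # C = |b|^2 = v^2 + w^2:  C_u = 0, C_a = 0, C_b = 2(v - iw)
    return (0.0, 0j, 2 * b.conjugate())

def flow(t, p, q, v, w):
    """Flow F_t of the vector field (q, -p, alpha w, -alpha v)."""
    return (p * math.cos(t) + q * math.sin(t),
            -p * math.sin(t) + q * math.cos(t),
            v * math.cos(ALPHA * t) + w * math.sin(ALPHA * t),
            -v * math.sin(ALPHA * t) + w * math.cos(ALPHA * t))

a, b = 0.8 - 0.3j, -0.5 + 1.1j               # sample point (assumed)
dF = (0.7, 0.2 + 0.4j, -1.3 + 0.1j)          # arbitrary derivative data for F
print(bracket(a, b, dF, dC_abs_a_sq(a, b)))  # vanishes up to rounding
print(bracket(a, b, dF, dC_abs_b_sq(a, b)))  # vanishes up to rounding
print(flow(1.7, a.real, a.imag, b.real, b.imag))
```

For comparison, the function C = Re a = p has C_a = 1, and the same formula gives {F, C} = F_u q, which is not identically zero, so p alone is not a Casimir.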
Since the generic leaf of g∗ is two-dimensional (see Example 14.1(f))
and the dimension of g is five, it follows that the Casimir functions do
not characterize the generic coadjoint orbits. This is in agreement with the
observation made in Example 14.1(f) that the generic coadjoint orbits have
as closure the three-dimensional submanifolds of g∗, which are the product
of the torus of radii |a| and |b| and the u′-line, if one expresses the orbit
through
\[
\mu = \begin{bmatrix} iu & 0 & 0 \\ 0 & i\alpha u & 0 \\ a & b & 0 \end{bmatrix}
\]
as
\[
\left\{ \begin{bmatrix} iu' & 0 & 0 \\ 0 & i\alpha u' & 0 \\ a e^{-it} & b e^{-i\alpha t} & 0 \end{bmatrix}
\;\middle|\; u' = u + \operatorname{Im}\left( a e^{-it} z + b e^{-i\alpha t} \alpha w \right), \; t \in \mathbb{R}, \; z, w \in \mathbb{C} \right\}.
\]
Note that this is consistent with these two Casimir functions preserving
$|a e^{-it}| = |a|$ and $|b e^{-i\alpha t}| = |b|$.
Another illuminating example of a similar phenomenon (due to Juan
Simo) is the semidirect product SL(3) ⋉ SM(3), where SM(3) is the space
of symmetric 3 × 3 matrices and the action of SL(3) on it is by similarity,
A × AT . The Lie algebra and its dual are 14 dimensional and the generic
coadjoint orbit is 12 dimensional. But there are no nontrivial Casimir func-
tions because the closure of any orbit contains the origin, so one cannot
separate two orbits by continuous functions.
Solution to Exercise N9F-1. Proceed as in the proof of the Duflo–Vergne
Theorem, replacing the curve µ + tν by a curve µ(t) ∈ S, with
µ(0) = µ and µ(1) = ν ∈ S. We assume that S is connected; if not, work
on connected components. The proof remains unchanged till the end when
the conclusion is that ⟨µ′(0), [ξ, η]⟩ = 0 for all ξ, η ∈ $\mathfrak{g}_\mu$. Since µ′(0) is an
arbitrary vector in $T_\mu S$, this implies that $[\mathfrak{g}_\mu, \mathfrak{g}_\mu] \subset (T_\mu S)^0$.
References
Abarbanel, H. D. I. and D. D. Holm [1987] Nonlinear stability analysis of inviscid
flows in three dimensions: incompressible fluids and barotropic fluids, Phys.
Fluids 30, 3369–3382.
Abarbanel, H. D. I., D. D. Holm, J. E. Marsden, and T. S. Ratiu [1986] Nonlinear
stability analysis of stratified fluid equilibria. Phil. Trans. Roy. Soc. London
A 318, 349–409; also Richardson number criterion for the nonlinear stability
of three-dimensional stratified flow. Phys. Rev. Lett. 52 [1984], 2552–2555.
Abraham, R. and J. E. Marsden [1978] Foundations of Mechanics. Second Edi-
tion, Addison-Wesley.
Abraham, R., J. E. Marsden, and T. S. Ratiu [1988] Manifolds, Tensor Analy-
sis, and Applications. Second Edition, Applied Mathematical Sciences 75,
Springer-Verlag.
Abraham, R. and Robbin, J. [1967] Transversal mappings and flows. Benjamin-
Cummings, Reading, Mass.
Adams, J. F. [1969] Lectures on Lie groups. Benjamin-Cummings, Reading,
Mass.
Adams, R.A. [1975] Sobolev Spaces. Academic Press.
Adams, M. R., T. S. Ratiu, and R. Schmid [1986] A Lie group structure for pseu-
dodifferential operators. Math. Ann. 273, 529–551 and A Lie group structure
for Fourier integral operators. Math. Ann. 276, 19–41.
Arms, J.M., J.E. Marsden, and V. Moncrief [1981] Symmetry and bifurcations of
momentum mappings, Comm. Math. Phys. 78, 455–478.
Arms, J. M., J. E. Marsden, and V. Moncrief [1982] The structure of the space
of solutions of Einstein's equations: II Several Killing fields and the Einstein-
Yang-Mills equations. Ann. of Phys. 144, 81–106.
Arnold, V. I. [1967] Characteristic class entering in conditions of quantization.
Funct. Anal. Appl. 1, 1–13.
Atiyah, M. [1982] Convexity and commuting Hamiltonians. Bull. London Math.
Soc. 14, 1–15.
Aubin, T. [1976] Espaces de Sobolev sur les variétés riemanniennes. Bull. Sci.
Math. 100, 149–173.
Bates, S. and A. Weinstein [1997] Lectures on the Geometry of Quantization,
CPAM/UCB, Am. Math. Soc.
Bialynicki-Birula, I., J. C. Hubbard, and L. A. Turski [1984] Gauge-independent
canonical formulation of relativistic plasma theory. Physica A 128, 509–519.
Cantor, M. [1975] Perfect fluid flows over Rn with asymptotic conditions. J. Func.
Anal. 18, 73–84.
Chernoff, P. R. and J. E. Marsden [1974] Properties of Infinite Dimensional
Hamiltonian systems. Springer Lect. Notes in Math. 425.
de Gosson, M. [1997] Maslov classes, metaplectic representation and Lagrangian
quantization. Mathematical Research, 95. Akademie-Verlag, Berlin.
Duflo, M. and M. Vergne [1969] Une propriété de la représentation coadjointe
d’une algèbre de Lie. C.R. Acad. Sci. Paris 268, 583–585.
Ebin, D. G. [1970] On the space of Riemannian metrics. Symp. Pure Math., Am.
Math. Soc. 15, 11–40.
Ebin, D. G. and J. E. Marsden [1970] Groups of diffeomorphisms and the motion
of an incompressible fluid. Ann. Math. 92, 102–163.
Eckmann J.-P. and R. Seneor [1976] The Maslov-WKB method for the anhar-
monic oscillator, Arch. Rat. Mech. Anal. 61 153–173.
Fischer, A. E., J. E. Marsden, and V. Moncrief [1980] The structure of the space of
solutions of Einstein’s equations, I: One Killing field. Ann. Inst. H. Poincaré
33, 147–194.
Friedman, A. [1969] Partial differential equations. Holt, Rinehart and Winston.
Gotay, M. J., H. B. Grundling, G. M. Tuynman [1996] Obstruction results in
quantization theory. J. Nonlinear Sci. 6, 469–498.
Guillemin, V. and S. Sternberg [1977] Geometric Asymptotics. Amer. Math. Soc.
Surveys, 14. (Revised edition, 1990.)
Guillemin, V. and S. Sternberg [1982] Convexity properties of the moment map.
Inv. Math. 67, 491-513, 77 (1984) 533–546.
Guillemin, V. and S. Sternberg [1984] Symplectic Techniques in Physics. Cam-
bridge University Press.
Hamilton, R. [1982] The Inverse function theorem of Nash and Moser. Bull. Am.
Math. Soc. 7, 65–222.
Hermann, R. [1962] The differential geometry of foliations, J. Math. Mech. 11,
303–315.
Holm, D. D., J. E. Marsden, T. S. Ratiu, and A. Weinstein [1985] Nonlinear
stability of fluid and plasma equilibria. Phys. Rep. 123, 1–116.
Kaufman, A. [1982] Elementary derivation of Poisson structures for fluid dynam-
ics and electrodynamics. Phys. Fluids 25, 1993–1994.
Kaufman, A. and R. L. Dewar [1984] Canonical derivation of the Vlasov-Coulomb
noncanonical Poisson structure. Contemp. Math. 28, 51–54.
Kopell, N. [1970] Commuting diffeomorphisms. Proc. Sympos. Pure Math., AMS,
14, 165–184.
Ladyzhenskaya, O. A. [1969] The mathematical theory of viscous incompressible
flow. Second English edition, revised and enlarged. Math. and its Appl. 2
Gordon and Breach.
Lewis, D., J. E. Marsden, R. Montgomery, and T. S. Ratiu [1986] The Hamilto-
nian structure for dynamic free boundary problems, Physica D 18, 391–404.
Libermann, P. and C. M. Marle [1987] Symplectic Geometry and Analytical Me-
chanics. Kluwer Academic Publishers.
Littlejohn, R. G. [1988] Cyclic evolution in quantum mechanics and the phases
of Bohr-Sommerfeld and Maslov. Phys. Rev. Lett. 61, 2159–2162.
Marsden, J. E. [1973a] A proof of the Calderon extension theorem. Can. Math.
Bull. 16, 133-136.
Marsden, J. E., D. G. Ebin, and A. Fischer [1972] Diffeomorphism groups, hy-
drodynamics and relativity. Proceedings of the 13th Biennial Seminar on
Canadian Mathematics Congress, pp. 135–279.
Marsden, J.E. and M.J. Hoffman [1993] Elementary Classical Analysis. Second
Edition, W.H. Freeman and Co, NY.
Marsden, J. E. and T. J. R. Hughes [1983] Mathematical Foundations of Elastic-
ity. Prentice-Hall. Dover edition [1994].
Marsden, J. E., P. J. Morrison, and A. Weinstein [1984] The Hamiltonian struc-
ture of the BBGKY hierarchy equations. Cont. Math. AMS 28, 115–124.
Marsden, J. E., T. S. Ratiu, and G. Raugel [1991] Symplectic connections and
the linearization of Hamiltonian systems. Proc. Roy. Soc. Edinburgh A 117,
329-380
Marsden, J. E., T. S. Ratiu and S. Shkoller [1999] The geometry and analysis of
the averaged Euler equations with normal boundary conditions, in prepara-
tion.
Marsden, J.E. and A. Weinstein [1979], Review of Geometric Asymptotics and
Symplectic Geometry and Fourier Analysis, Bull. Amer. Math. Soc. 1, 545-
553.
Maslov, V. P. [1965] Theory of Perturbations and Asymptotic Methods. Moscow
State University.
Meyers, N. and J. Serrin [1960] The exterior Dirichlet problem for second order
elliptic partial differential equations. J. Math. Mech. 9, 513–538.
Miller, S. C. and R. H. Good [1953] A WKB-type approximation to the
Schrödinger equation. Phys. Rev. 91, 174–179.
Milnor, J. [1965] Topology from the Differential Viewpoint. University of Virginia
Press.
Morrey, Jr. C.B. [1966] Multiple Integrals in the Calculus of Variations. Springer–
Verlag.
Nirenberg, L. [1959] On elliptic partial differential equations. Ann. Scuola. Norm.
Sup. Pisa 13(3), 115–162.
Omori, H. [1974] Infinite dimensional Lie transformation groups. Lecture Notes
in Mathematics, 427. Springer-Verlag.
Palais, R. S. [1965] Seminar on the Atiyah-Singer Index Theorem. Princeton
University Press, Princeton, N.J.
Palais, R. S. [1968] Foundations of Global Non-Linear Analysis. Benjamin.
Rais, M. [1972] Orbites de la représentation coadjointe d’un groupe de Lie,
Représentations des Groupes de Lie Résolubles, P. Bernat, N. Conze, M.
Duflo, M. Lévy-Nahas, M. Raïs, P. Renouard, M. Vergne, eds. Monographies
de la Société Mathématique de France, Dunod, Paris 4, 15–27.
Ratiu, T. S. [1982] Euler-Poisson equations on Lie algebras and the N -
dimensional heavy rigid body, Am. J. Math. 104, 409–448, 1337.
Ratiu, T. S. and R. Schmid [1981] The differentiable structure of three remarkable
diffeomorphism groups, Math. Zeitschrift 177, 81–100.
Sánchez de Alvarez, G. [1986] Thesis. University of California at Berkeley.
Sánchez de Alvarez, G. [1989] Controllability of Poisson control systems with
symmetry. Cont. Math. AMS 97, 399–412.
Simo, J. C., D. R. Lewis, and J. E. Marsden [1991] Stability of relative equilibria
I: The reduced energy momentum method, Arch. Rat. Mech. Anal. 115,
15-59.
Simo, J. C., T. A. Posbergh, and J. E. Marsden [1990] Stability of coupled rigid
body and geometrically exact rods: block diagonalization and the energy-
momentum method, Physics Reports 193, 280–360.
Simo, J. C., T. A. Posbergh, and J. E. Marsden [1991] Stability of relative equi-
libria II: Three dimensional elasticity, Arch. Rat. Mech. Anal. 115, 61–100.
Slawianowski, J. J. [1971] Quantum relations remaining valid on the classical
level. Rep. Math. Phys. 2, 11–34.
Stefan, P. [1974] Accessible sets, orbits and foliations with singularities, Proc.
Lond. Math. Soc. 29, 699–713.
Stein, E. M. [1970] Singular integrals and differentiability properties of functions.
Princeton Mathematical Series,30 Princeton University Press.
Sussman, H. [1973] Orbits of families of vector fields and integrability of distri-
butions. Trans. Am. Math. Soc. 180, 171–188.
Tulczyjew, W. M. [1977] The Legendre transformation. Ann. Inst. Poincaré 27,
101–114.
Varadarajan, V. S. [1974] Lie Groups, Lie Algebras and Their Representations.
Prentice Hall. (Reprinted in Graduate Texts in Mathematics, Springer-
Verlag.)
Weinstein, A. [1983] The local structure of Poisson manifolds. J. Diff. Geom. 18,
523–557
H. Whitney [1943] Differentiable even functions. Duke Math. J. 10, 159–160.
Woodhouse, N. M. J. [1992] Geometric Quantization. Second Edition, Clarendon
Press, Oxford University Press.
Yosida, K. [1980] Functional analysis. Sixth edition. Grundlehren 123. Springer-
Verlag.
