An Introduction To KAM Theory: C. Eugene Wayne January 22, 2008
Introduction
Over the past thirty years, the Kolmogorov-Arnold-Moser (KAM) theory has
played an important role in increasing our understanding of the behavior of
non-integrable Hamiltonian systems. I hope to illustrate in these lectures that
the central ideas of the theory are, in fact, quite simple. With this in mind, I
will concentrate on two examples and will forego generality for concreteness and
(I hope) clarity. The results and methods which I will present are well-known
to experts in the field but I hope that by collecting and presenting them in as
simple a context as possible I can make them somewhat more approachable to
newcomers than they are often considered to be.
The outline of the lectures is as follows. After a short historical introduction,
I will explain in detail one of the simplest situations where the KAM techniques
are used: the case of diffeomorphisms of a circle. I will then go on to discuss the
theory in its original context, that of nearly-integrable Hamiltonian systems.
The problem which the KAM theory was developed to solve first arose in
celestial mechanics. More than 300 years ago, Newton wrote down the differential equations satisfied by a system of massive bodies interacting through
gravitational forces. If there are only two bodies, these equations can be explicitly solved and one finds that the bodies revolve on Keplerian ellipses about
their center of mass. If one considers a third body (the three-body-problem),
no exact solution exists even if, as in the solar system, two of the bodies are
much lighter than the third. In this case, however, one observes that the mutual
gravitational force between these two planets is much weaker than that between either planet and the sun. Under these circumstances one can try to solve
the problem perturbatively, first ignoring the interactions between the planets.
This gives an integrable system, or one which can be solved explicitly, with
each planet revolving around the sun oblivious of the other's existence. One
can then try to systematically include the interaction between the planets in
a perturbative fashion. Physicists and astronomers used this method extensively throughout the nineteenth century, developing series expansions for the
solutions of these equations in the small parameter represented by the ratio of
the mass of the planet to the mass of the sun. However, the convergence of
these series was never established, not even when the King of Sweden offered
a very substantial prize to anyone who succeeded in doing so. The difficulty in
establishing the convergence of these series comes from the fact that the terms
in the series have small denominators which we shall consider in some detail
later in these lectures. One can obtain some physical insight into the origin of
these convergence problems in the following way. As one learns in an elementary course in differential equations, a harmonic oscillator has a certain natural
frequency at which it oscillates. If one subjects such an oscillator to an external
force of the same frequency as the natural frequency of the oscillator, one has
resonance effects and the motion of the oscillator becomes unbounded. Indeed,
if one has a typical nonlinear oscillator, then whenever the perturbing force has
a frequency that is a rational multiple of the natural frequency of the oscillator,
one will have resonances, because the nonlinearity will generate oscillations of
all multiples of the basic driving frequency.
In a similar way, one planet exerts a periodic force on the motion of a second,
and if the orbital periods of the two are commensurate, this can lead to resonance
and instability. Even if the two periods are not exactly commensurate, but only
approximately so, the effects lead to convergence problems in the perturbation
theory.
It was not until 1954 that A. N. Kolmogorov [8] in an address to the ICM
in Amsterdam suggested a way in which these problems could be overcome.
His suggestions contained two ideas which are central to all applications of the
KAM techniques. These two basic ideas are:
1. Linearize the problem about an approximate solution and solve the linearized problem. It is at this point that one must deal with the small denominators.
2. Inductively improve the approximate solution by using the solution of the linearized problem as the basis of a Newton's method argument.
These ideas were then fleshed out, extended, and applied in numerous other
contexts by V. Arnold and J. Moser, ([1], [9]) over the next ten years or so,
leading to what we now know as the KAM theory.
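The second of these ideas rests on the quadratic convergence of Newton's method. The following is a minimal sketch on a scalar toy problem, not on the KAM equations themselves; the function `newton_sqrt2` and the choice of f(x) = x^2 - 2 are illustrative assumptions.

```python
import math

def newton_sqrt2(x0, steps):
    """Newton's method for f(x) = x**2 - 2; returns the list of iterates."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        # Linearize f about the current approximation and solve the
        # linear problem for the correction.
        xs.append(x - (x * x - 2.0) / (2.0 * x))
    return xs

iterates = newton_sqrt2(1.0, 5)
errors = [abs(x - math.sqrt(2.0)) for x in iterates]
for n, err in enumerate(errors):
    print(n, err)
```

Each error is bounded by the square of the previous one until rounding error takes over, which is the behavior the inductive step of the KAM argument exploits.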
As I said above, we will consider the details of this procedure in two cases.
The first, the problem of showing that diffeomorphisms of a circle are conjugate
to rotations, was chosen for its simplicity: the main ideas are visible with
fewer technical difficulties than appear in other applications. We will then look
at the KAM theory in its original setting of small perturbations of integrable
Hamiltonian systems. I'll attempt to parallel the discussion of the case of circle
diffeomorphisms as closely as possible in order to keep our focus on the main
ideas of the theory and ignore as much as possible the additional technical
complications which arise in this context.
Circle Diffeomorphisms
Let us begin by discussing one of the simplest examples in which one encounters
small denominators, and for which the KAM theory provides a solution. It
may not be apparent for the moment what this problem has to do with the
problems of celestial mechanics discussed in the introduction, but almost all of
the difficulties encountered in that problem also appear in this context but in
ways which are less obscured by technical difficulties; this is, if you like, our
warm-up exercise.
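We will consider orientation preserving diffeomorphisms of the circle, or
equivalently, their lifts to the real line:
$$ \phi : \mathbb{R}^1 \to \mathbb{R}^1 , \qquad \phi(x) = x + \tilde\eta(x) \quad \text{with} \quad \tilde\eta(x+1) = \tilde\eta(x) \ \text{ and } \ \tilde\eta'(x) > -1 . $$
We consider only analytic diffeomorphisms. For $\sigma > 0$ let
$S_\sigma = \{ z \in \mathbb{C} : |\mathrm{Im}\, z| < \sigma \}$ and
$$ B_\sigma = \{ \eta : \eta \text{ is analytic on } S_\sigma , \ \|\eta\|_\sigma \equiv \sup_{|\mathrm{Im}\, z| < \sigma} |\eta(z)| < \infty \} . $$
Note that one can assume that $\sigma < 1$, without loss of generality.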
Our goal in this section will be to understand the dynamics of $\phi(x) = x + \alpha + \eta(x)$
when $\eta$ has small norm. One way to do this is to show that the
dynamics of $\phi$ are like the dynamics of a system we understand. For instance,
suppose that we could find a change of variables which transformed $\phi$ into a pure
rotation $R_\alpha(\xi) = \xi + \alpha$. Then since we understand the dynamics of the rotation,
we would also understand those of $\phi$. If we express this change of variables as
$x = H(\xi)$, where $H(\xi + 1) = 1 + H(\xi)$ preserves the periodicity of $\phi$, then we want
to find $H$ such that
$$ H^{-1} \circ \phi \circ H(\xi) = R_\alpha(\xi) , $$
or equivalently
$$ \phi \circ H(\xi) = H \circ R_\alpha(\xi) . \tag{2} $$
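The rotation number of $\phi$ is defined by
$$ \rho(\phi) = \lim_{n \to \infty} \frac{\phi^{(n)}(x) - x}{n} . $$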
Remark 2.2 It is a standard result of dynamical systems theory that for any
homeomorphism of the circle the limit on the right hand side of this equation
exists and is independent of $x$. (See [6], p. 296.)
Remark 2.3 Note that from the definition of the rotation number, it follows
immediately that for any homeomorphism $H$, the map $\tilde\phi = H^{-1} \circ \phi \circ H$ has the
same rotation number as $\phi$. (Since $\tilde\phi^{(n)} = H^{-1} \circ \phi^{(n)} \circ H$, and the initial and
final factors of $H$ and $H^{-1}$ have no effect on the limit.)
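The definition of the rotation number suggests a direct numerical estimate: iterate a lift and divide the total displacement by the number of iterates. The sketch below assumes a hypothetical perturbation 0.05 sin(2 pi x) and the golden mean as the angle; the function names are illustrative only.

```python
import math

def rotation_number(lift, x0=0.0, n=20000):
    """Estimate (lift^n(x0) - x0) / n for a lift of a circle map."""
    x = x0
    for _ in range(n):
        x = lift(x)
    return (x - x0) / n

alpha = (math.sqrt(5.0) - 1.0) / 2.0

def rigid_rotation(x):
    return x + alpha

def perturbed(x):
    # The derivative of 0.05*sin(2*pi*x) stays above -1, so this map is
    # the lift of an orientation preserving diffeomorphism.
    return x + alpha + 0.05 * math.sin(2.0 * math.pi * x)

rho_rigid = rotation_number(rigid_rotation)
rho_a = rotation_number(perturbed, 0.0)
rho_b = rotation_number(perturbed, 0.3)
print(rho_rigid, rho_a, rho_b)
```

The two estimates for the perturbed map agree to within about 1/n, illustrating the independence of the limit from the starting point. A finite iteration only approximates the limit.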
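As a final remark about the rotation number we note that if $\phi(x) = x + \alpha + \eta(x)$,
then an easy induction argument shows that
$\rho(\phi) = \alpha + \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} \eta \circ \phi^{(j)}(x)$.
In particular, if $\rho(\phi) = \alpha$, we have
$\lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n-1} \eta \circ \phi^{(j)}(x) = 0$, so we
have proved:
Lemma 2.1 If $\phi(x) = x + \alpha + \eta(x)$ has rotation number $\alpha$, then there exists
some $x_0$ such that $\eta(x_0) = 0$.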
We must next ask about the properties we wish the change of variables
$H$ to have. If we only demand that $H$ be a homeomorphism, then Denjoy's
Theorem ([6] p. 301) says that if the rotation number of $\phi$ is irrational, we
can always find an $H$ which conjugates $\phi$ to a rotation. However, if we want
more detailed information about the dynamics it makes sense to ask that $H$
have additional smoothness. In fact, it is natural to ask that $H$ be as smooth
as the diffeomorphism itself (in this case, analytic). (There will, in general,
be some loss of smoothness even in this case. We will find, for example, that
while there exists an analytic conjugacy function, $H$, its domain of analyticity
will be somewhat smaller than that of $\phi$.) Surprisingly, the techniques which
Denjoy used fail completely in this case, and the answer was not known until
the late fifties when Arnold applied KAM techniques to answer the question in
the case when $\eta$ is small. Even more surprisingly, in order to even state Arnold's
theorem, we have to discuss a little number theory.
Any irrational number can be approximated arbitrarily well by rational numbers, and in fact, Dirichlet's Theorem even gives us an estimate of how good
this approximation is. More precisely, it says that given any irrational number $\alpha$,
there exist infinitely many pairs of integers $(m, n)$ such that $|\alpha - (m/n)| < 1/n^2$.
On the other hand, most irrational numbers can't be approximated much better
than this.
Definition 2.2 The real number $\alpha$ is of type $(K, \nu)$ if there exist positive numbers $K$ and $\nu$ such that $|\alpha - (m/n)| > K|n|^{-\nu}$, for all pairs of integers $(m, n)$.
Proposition 2.1 For every $\nu > 2$, almost every irrational number $\alpha$ is of type
$(K, \nu)$ for some $K > 0$.
Proof: The proof is not difficult, but would take us a bit out of our way. The
details can be found in [3], page 116, for example. Note also that we can assume
without loss of generality that $K \le 1$, since if $\alpha$ is of type $(\tilde K, \nu)$ for some
$\tilde K > 1$, it is also of type $(1, \nu)$.
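Definition 2.2 can be explored numerically. The sketch below scans denominators up to a cutoff for the golden mean, a number known to be badly approximable by rationals, and reports the smallest value of n^nu |alpha - m/n| found. The function name and the cutoff are illustrative assumptions, and a finite scan can only suggest, not prove, that a number is of type (K, nu).

```python
import math

def diophantine_constant(alpha, nu, n_max):
    """Smallest n**nu * |alpha - m/n| over 1 <= n <= n_max, best m for each n."""
    best = float("inf")
    for n in range(1, n_max + 1):
        m = round(n * alpha)  # the nearest integer gives the best m for this n
        best = min(best, n ** nu * abs(alpha - m / n))
    return best

golden = (math.sqrt(5.0) - 1.0) / 2.0
K_estimate = diophantine_constant(golden, 2, 2000)
print(K_estimate)
```

For the golden mean the minimum over this range is attained at n = 1, so any K slightly below that value is consistent with the scan for nu = 2.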
Theorem 2.1 (Arnold's Theorem [1]) Suppose that $\alpha$ is of type $(K, \nu)$. There
exists $\epsilon(K, \nu, \sigma) > 0$ such that if $\phi(x) = x + \alpha + \eta(x)$ has rotation number $\alpha$,
and $\|\eta\|_\sigma < \epsilon(K, \nu, \sigma)$, then there exists an analytic and invertible change of
variables $H(x)$ which conjugates $\phi$ to $R_\alpha$.
As mentioned above, Arnold's proof of this theorem used the KAM theory.
The proof can be broken into two main parts: an analysis of a linearized
equation, and a Newton's method iteration step. These same two steps will
reappear in the next section when we discuss nearly integrable Hamiltonian
systems, and they are characteristic of almost all applications of the KAM
theory.
Remark 2.4 It may seem that by assuming that the diffeomorphism is of the
form $\phi(x) = x + \alpha + \eta(x)$, where $\alpha$ is the rotation number of $\phi$, we are considering
a less general situation than that described above in which we allowed $\phi$ to have
the form $x + \tilde\alpha + \eta(x)$, with $\tilde\alpha$ not necessarily the rotation number. As we shall
see below, there is no real loss of generality in this restriction.
Step 1: Analysis of the Linearized equation
Note that since $\|\eta\|_\sigma$ is small, the diffeomorphism $\phi$ is close to the pure
rotation $R_\alpha$. Thus, we might hope that if a change of variables $H$ which satisfies
(2) exists, it would be close to the identity, i.e. $H(x) = x + h(x)$, where $h$ is
small. Substituting this form of $H$ into (2), we find that $h$ must satisfy
$$ h(x + \alpha) - h(x) = \eta(x + h(x)) . \tag{3} $$
If we now expand both sides of this equation, retaining only terms of first order
in the (presumably) small quantities $h$ and $\eta$, we find:
$$ h(x + \alpha) - h(x) = \eta(x) . \tag{4} $$
Since all the functions in this equation are periodic, and the equation is linear in
the unknown function $h$, we can immediately write down a (formal) solution for
the coefficients in the Fourier series of $h$. If $\hat\eta(n)$ is the $n$th Fourier coefficient
of $\eta$, then the $n$th Fourier coefficient of $h$ is
$$ \hat h(n) = \frac{\hat\eta(n)}{e^{2\pi i n \alpha} - 1} , \qquad n \neq 0 . \tag{5} $$
Before asking whether the formal series
$$ \sum_{n \neq 0} \hat h(n) e^{2\pi i n x} = \sum_{n \neq 0} \frac{\hat\eta(n)}{e^{2\pi i n \alpha} - 1} e^{2\pi i n x} $$
makes any sense, however, we first note that even if (5) defines a well-behaved
function, it will not solve (4) but rather:
$$ h(x + \alpha) - h(x) = \eta(x) - \int_0^1 \eta(x)\,dx = \eta(x) - \hat\eta(0) . \tag{6} $$
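The coefficient formula (5) and the corrected equation (6) can both be read off by substituting the Fourier series of $h$ into the left side of (4). The following worked computation is a sketch, writing $\alpha$ for the rotation angle, $\eta$ for the perturbation, and $\hat h(n)$, $\hat\eta(n)$ for the Fourier coefficients (notation assumed from the surrounding text):

```latex
h(x+\alpha) - h(x)
  = \sum_{n \in \mathbb{Z}} \hat h(n) \left( e^{2\pi i n \alpha} - 1 \right) e^{2\pi i n x}
  = \sum_{n \neq 0} \hat h(n) \left( e^{2\pi i n \alpha} - 1 \right) e^{2\pi i n x} .
% Matching the coefficient of e^{2\pi i n x} with that of \eta for n \neq 0:
\hat h(n) \left( e^{2\pi i n \alpha} - 1 \right) = \hat\eta(n)
\quad\Longrightarrow\quad
\hat h(n) = \frac{\hat\eta(n)}{e^{2\pi i n \alpha} - 1} , \qquad n \neq 0 .
```

Since the $n = 0$ term on the left vanishes identically, the mean value $\hat\eta(0)$ of the right side cannot be matched, which is why the series solves the equation with $\eta(x) - \hat\eta(0)$ on the right.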
This is because the zeroth Fourier coefficient of $h$ drops out of (4). The fact
that $h$ does not solve (4) will complicate the estimates below. The problems
with showing that (5) converges arise due to the presence of the factors of
$e^{2\pi i n \alpha} - 1$ in the denominator of the summands, and these are the (in)famous
"small denominators" which plagued celestial mechanics in the last century
and which the KAM theory finally overcame. We first note that if $\alpha$ is rational,
there is little hope that the sum defining $h$ will converge since the denominators
in this sum will vanish for the infinitely many $n$ for which $n\alpha = m$ for some
$m \in \mathbb{Z}$. Thus, we can only hope for success if $\alpha \notin \mathbb{Q}$. If $\alpha$ is irrational, the
denominator will still be small whenever $n\alpha \approx m$. However, by assuming that $\alpha$
is of type $(K, \nu)$, we have some control over how close to zero the denominator
can become. In fact, the following lemma immediately allows us to estimate
$h(x)$.
Lemma 2.2 If $\alpha$ is of type $(K, \nu)$, then
$$ |e^{2\pi i n \alpha} - 1| = |e^{2\pi i m} (e^{2\pi i (n\alpha - m)} - 1)| \ge 4K|n|^{-(\nu - 1)} \quad \text{if } n \neq 0 . $$
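The lower bound on the small denominators can be checked numerically for a specific angle. The sketch below uses the golden mean with the illustrative constants K = 0.38 and nu = 2 (assumptions chosen for this example, not values from the text) and compares each denominator with the bound 4K|n|^{-(nu-1)}.

```python
import cmath
import math

def small_denominator(alpha, n):
    """Modulus of exp(2*pi*i*n*alpha) - 1."""
    return abs(cmath.exp(2j * math.pi * n * alpha) - 1.0)

golden = (math.sqrt(5.0) - 1.0) / 2.0
K, nu = 0.38, 2.0  # illustrative constants for the golden mean (assumption)

# Ratio of the actual denominator to the claimed lower bound; the bound
# holds on the scanned range when every ratio is at least 1.
worst_ratio = min(
    small_denominator(golden, n) / (4.0 * K * n ** (-(nu - 1.0)))
    for n in range(1, 2001)
)
print(worst_ratio)
```

The denominators do become small along the Fibonacci numbers, but never faster than the power law allows.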
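The other fact which we must use to estimate $h$ is the fact that since $\eta \in B_\sigma$,
Cauchy's theorem immediately gives an estimate on its Fourier coefficients of
the form $|\hat\eta(n)| \le \|\eta\|_\sigma e^{-2\pi\sigma|n|}$. Combining this remark with Lemma 2.2, we
see that if $|\mathrm{Im}\, z| \le \sigma - \delta$, one has
$$ |h(z)| = \left| \sum_{n \neq 0} \frac{\hat\eta(n) e^{2\pi i n z}}{e^{2\pi i n \alpha} - 1} \right| \le \sum_{n \neq 0} \frac{|n|^{(\nu - 1)}}{4K} \|\eta\|_\sigma e^{-2\pi\sigma|n|} e^{2\pi|n|(\sigma - \delta)} \le \frac{\Gamma(\nu)}{K(2\pi\delta)^\nu} \|\eta\|_\sigma , $$
where $\Gamma(\nu) = \int_0^\infty x^{\nu - 1} e^{-x}\,dx$, and we have assumed that $2\pi(K + 1)\delta \le 4\pi\delta < 1$.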
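Thus we have proven,
Proposition 2.2 If $\alpha$ is of type $(K, \nu)$, and $\eta \in B_\sigma$, then $h(x)$, defined by (5),
is an element of $B_{\sigma - \delta}$ for any $\delta > 0$, and if $4\pi\delta < 1$, we have the estimate:
$$ \|h\|_{\sigma - \delta} \le \frac{\Gamma(\nu)}{K(2\pi\delta)^\nu} \|\eta\|_\sigma . $$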
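For a trigonometric polynomial the linearized equation can be solved exactly, coefficient by coefficient. The sketch below assumes a hypothetical perturbation eta(x) = 0.01 + 0.02 cos(2 pi x) + 0.01 sin(4 pi x), stored as a dictionary of Fourier coefficients, and checks that the resulting h satisfies h(x + alpha) - h(x) = eta(x) - eta_hat(0). The helper names are illustrative.

```python
import cmath
import math

def solve_linearized(eta_hat, alpha):
    """h_hat(n) = eta_hat(n) / (exp(2*pi*i*n*alpha) - 1) for every n != 0."""
    return {
        n: c / (cmath.exp(2j * math.pi * n * alpha) - 1.0)
        for n, c in eta_hat.items()
        if n != 0
    }

def evaluate(coeffs, x):
    """Sum of coeffs[n] * exp(2*pi*i*n*x) over the stored modes."""
    return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in coeffs.items())

alpha = (math.sqrt(5.0) - 1.0) / 2.0
# Fourier coefficients of 0.01 + 0.02*cos(2*pi*x) + 0.01*sin(4*pi*x)
eta_hat = {0: 0.01, 1: 0.01, -1: 0.01, 2: -0.005j, -2: 0.005j}
h_hat = solve_linearized(eta_hat, alpha)

def residual(x):
    """Defect in h(x + alpha) - h(x) = eta(x) - eta_hat(0)."""
    lhs = evaluate(h_hat, x + alpha) - evaluate(h_hat, x)
    rhs = evaluate(eta_hat, x) - eta_hat[0]
    return abs(lhs - rhs)

max_residual = max(residual(k / 10.0) for k in range(10))
print(max_residual)
```

The mean value of eta is deliberately nonzero here, to show that the solution matches only the mean-free part of the right side.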
The hope is that if we use the $H(x)$ that comes from solving (6), $H^{-1} \circ \phi \circ H$ will be closer
to $R_\alpha$ than $\phi$ was, and then we can iterate this process.
The first thing we must do is check that $H$ is invertible. Since $H(z) = z + h(z)$,
$H$ will be invertible with analytic inverse on any domain on which $\|h'\| < 1$.
Cauchy's theorem and Proposition 2.2 imply that
$\|h'\|_{\sigma - 2\delta} \le \frac{2\pi\Gamma(\nu)}{K(2\pi\delta)^{\nu + 1}} \|\eta\|_\sigma$,
so we conclude
Lemma 2.3 If $2\pi\Gamma(\nu)\|\eta\|_\sigma < K(2\pi\delta)^{(\nu + 1)}$ and $4\pi\delta < 1$, then $H(z)$ has an
analytic inverse on the image of $S_{\sigma - 2\delta}$.
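Remark 2.6 Note that if we combine the inequalities in Lemma 2.3 and Proposition 2.2, we find $\|h\|_{\sigma - \delta} < \delta$. Thus, if $z \in S_{\sigma - 2\delta}$, $H(z) \in S_{\sigma - \delta}$. Furthermore,
$H$ maps the real axis into itself, and the images of the lines $\mathrm{Im}\, z = \pm(\sigma - 2\delta)$
lie outside the strip $S_{\sigma - 3\delta}$. With this information it is easy to show that
$\mathrm{Range}(H|_{S_{\sigma - 2\delta}}) \supset S_{\sigma - 3\delta}$, so that $H^{-1}(z)$ is defined for all $z \in S_{\sigma - 3\delta}$.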
In addition to knowing that the inverse exists, we need an estimate on its
properties, which the following proposition provides.
Proposition 2.3 If
$2\pi\Gamma(\nu)\|\eta\|_\sigma < K(2\pi\delta)^{(\nu + 1)}$ and $4\pi\delta < 1$,
then $H^{-1}(z) = z - h(z) + g(z)$, where
$$ \|g\|_{\sigma - 4\delta} \le \frac{2\pi\Gamma(\nu)^2}{K^2 (2\pi\delta)^{(2\nu + 1)}} \|\eta\|_\sigma^2 . $$
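Let us now examine the transformed diffeomorphism $\tilde\phi(x) = H^{-1} \circ \phi \circ H(x)$.
$$ \begin{aligned}
\tilde\phi(x) &= H^{-1} \circ \phi \circ H(x) = H^{-1}(x + h(x) + \alpha + \eta(x + h(x))) \\
&= x + h(x) + \alpha + \eta(x + h(x)) - h(x + \alpha + h(x) + \eta(x + h(x))) \\
&\quad + g(x + h(x) + \alpha + \eta(x + h(x))) \\
&= x + \alpha + \{h(x) - h(x + \alpha) + \eta(x)\} + \{\eta(x + h(x)) - \eta(x)\} \\
&\quad + \{h(x + \alpha) - h(x + \alpha + h(x) + \eta(x + h(x)))\} \\
&\quad + g(x + h(x) + \alpha + \eta(x + h(x))) .
\end{aligned} $$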
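We first note that because $h$ solves (6), the first expression in braces in
the final expression for $\tilde\phi$ is equal to $\hat\eta(0)$. The next
expression in braces equals $\int_0^1 \eta'(x + s h(x)) h(x)\,ds$. If $2\pi\Gamma(\nu)\|\eta\|_\sigma < K(2\pi\delta)^{\nu + 1}$,
$4\pi\delta < 1$, and $x \in S_{\sigma - 4\delta}$, we can bound the norm of this expression on $B_{\sigma - 4\delta}$ by
$2\pi\|\eta\|_\sigma \frac{\Gamma(\nu)}{K(2\pi\delta)^{\nu + 1}} \|\eta\|_\sigma$. Similarly, the third expression in braces
may be rewritten as $-\int_0^1 h'(x + \alpha + s(h(x) + \eta(x + h(x))))\,(h(x) + \eta(x + h(x)))\,ds$.
Once again, assuming that the conditions on $\|\eta\|_\sigma$ and $\delta$ described above hold,
and that $x \in S_{\sigma - 4\delta}$, then we can bound the norm of this expression on $B_{\sigma - 4\delta}$
by
$$ \frac{2\pi\Gamma(\nu)}{K(2\pi\delta)^{\nu + 1}} \|\eta\|_\sigma \left( \frac{\Gamma(\nu)}{K(2\pi\delta)^\nu} \|\eta\|_\sigma + \|\eta\|_\sigma \right) < \frac{4\pi(\Gamma(\nu))^2}{K^2 (2\pi\delta)^{2\nu + 1}} \|\eta\|_\sigma^2 , $$
where the last inequality used the fact that $2\pi\delta K < 1$. Finally, if $|\mathrm{Im}\, x| < \sigma - 6\delta$,
then $|\mathrm{Im}(x + h(x) + \alpha + \eta(x + h(x)))| < \sigma - 4\delta$, (since $\|\eta\|_\sigma < \delta$), so that the
last term in this expression is bounded by Proposition 2.3. Adding the three
bounds, the last three terms are together bounded on $S_{\sigma - 6\delta}$ by
$$ \left( \frac{2\pi\Gamma(\nu)}{K(2\pi\delta)^{\nu + 1}} + \frac{4\pi(\Gamma(\nu))^2}{K^2 (2\pi\delta)^{2\nu + 1}} + \frac{2\pi\Gamma(\nu)^2}{K^2 (2\pi\delta)^{2\nu + 1}} \right) \|\eta\|_\sigma^2 . $$
The constant $\hat\eta(0)$ obeys the same bound: by Remark 2.3, $\tilde\phi$ has rotation
number $\alpha$, so by Lemma 2.1 the function $\tilde\phi(x) - x - \alpha$ vanishes at some real $x_0$,
and at that point $\hat\eta(0)$ equals minus the sum of the last three terms.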
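Proposition 2.4 If $2\pi\Gamma(\nu)\|\eta\|_\sigma < K(2\pi\delta)^{(\nu + 1)}$, and $4\pi\delta < 1$, then
$\tilde\phi(x) = H^{-1} \circ \phi \circ H(x) = x + \alpha + \tilde\eta(x)$, where
$$ \|\tilde\eta\|_{\sigma - 6\delta} \le \frac{16\pi(\Gamma(\nu))^2}{K^2 (2\pi\delta)^{2\nu + 1}} \|\eta\|_\sigma^2 . $$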
Remark 2.7 The important thing to note about the estimate of $\tilde\eta$ is that in
spite of the mess, it is second order in the small quantity $\|\eta\|_\sigma$, as we had hoped.
That is, there exists a constant $C(K, \nu, \delta)$ such that $\|\tilde\eta\|_{\sigma - 6\delta} \le C(K, \nu, \delta)\|\eta\|_\sigma^2$.
This is what makes the Newton's method argument work.
The proof of Arnold's Theorem is completed by inductively repeating the
above procedure. The principal point which we must check is that we don't
lose all of our domain of analyticity as we go through the argument. Note
that $\tilde\phi$ is analytic on a narrower strip than was our original diffeomorphism $\phi$.
The essential reason that there is a nonvanishing domain of analyticity at the
completion of the argument is that the amount by which the analyticity strip
shrinks at the $n$th step in the induction will be proportional to the amount by
which our diffeomorphism differs from a rotation at the $n$th iterative step, and
thanks to the extremely fast convergence of Newton's method, this is very small.
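The Inductive Argument:
Let $\phi_0(x) \equiv \phi(x)$ be our original diffeomorphism, and set $\eta_0(x) = \eta(x)$.
Define $\phi_1(x) = H_0^{-1} \circ \phi_0 \circ H_0(x)$, and by induction,
$\phi_{n+1}(x) = H_n^{-1} \circ \phi_n \circ H_n(x) = x + \alpha + \eta_{n+1}(x)$,
where $H_n(x) = x + h_n(x)$, and $h_n$ solves
$$ h_n(x + \alpha) - h_n(x) = \eta_n(x) - \hat\eta_n(0) . $$
Also define the sequence of inductive constants:
$$ \delta_n = \frac{\sigma}{36(1 + n^2)} , \quad n \ge 0 , $$
$$ \sigma_0 = \sigma , \ \text{ and } \ \sigma_{n+1} = \sigma_n - 6\delta_n , \ \text{ if } n \ge 0 , $$
$$ \epsilon_0 = \|\eta\|_\sigma , \ \text{ and } \ \epsilon_n = \epsilon_0^{(3/2)^n} , \ \text{ if } n \ge 0 . $$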
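The role of the inductive constants can be seen by tabulating them. The sketch below assumes the definitions delta_n = sigma / (36 (1 + n^2)), sigma_{n+1} = sigma_n - 6 delta_n and epsilon_n = epsilon_0^{(3/2)^n}, as read from the text above; the numerical values of sigma and epsilon_0 are arbitrary illustrative choices.

```python
def inductive_constants(sigma, eps0, n_steps):
    """Tabulate delta_n, sigma_n and epsilon_n for n = 0, ..., n_steps - 1."""
    deltas, sigmas, epsilons = [], [sigma], []
    for n in range(n_steps):
        deltas.append(sigma / (36.0 * (1.0 + n * n)))
        epsilons.append(eps0 ** (1.5 ** n))
        # Each step costs a strip of width 6 * delta_n.
        sigmas.append(sigmas[-1] - 6.0 * deltas[-1])
    return deltas, sigmas, epsilons

deltas, sigmas, epsilons = inductive_constants(sigma=0.5, eps0=1e-3, n_steps=12)
for n in range(12):
    print(n, sigmas[n], epsilons[n])
```

Under these assumptions the total loss of strip width is (sigma / 6) times the sum of 1 / (1 + n^2), which converges, so the strips shrink to a limiting width of about 0.65 sigma while epsilon_n tends to zero faster than exponentially.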
K
( )(+1)
16() 36
*8
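then $\phi_{n+1}(x) = x + \alpha + \eta_{n+1}(x)$, with $\eta_{n+1} \in B_{\sigma_{n+1}}$, and $\|\eta_{n+1}\|_{\sigma_{n+1}} \le \epsilon_{n+1}$.
Moreover, $H_n(x) = x + h_n(x)$ and $H_n^{-1}(x) = x - h_n(x) + g_n(x)$ satisfy
$$ \|h_n\|_{\sigma_n - \delta_n} \le \frac{\Gamma(\nu)\epsilon_n}{K(2\pi\delta_n)^\nu} , \qquad \|g_n\|_{\sigma_n - 4\delta_n} \le \frac{2\pi\Gamma(\nu)^2 \epsilon_n^2}{K^2 (2\pi\delta_n)^{2\nu + 1}} . $$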
Proof: Note that Proposition 2.2 and Proposition 2.3 imply that the estimates
on $h_n$ and $g_n$ hold for $n = 0$. The estimate on $\|\eta_1\|_{\sigma_1}$ follows by noting that
from Proposition 2.4,
$$ \|\eta_1\|_{\sigma_1} \le \frac{16\pi(\Gamma(\nu))^2}{K^2 (2\pi\delta_0)^{2\nu + 1}} \|\eta\|_\sigma^2 , $$
and the hypothesis on the inductive constants in the Inductive Lemma was
chosen so that this last expression is less than $\epsilon_0^{(3/2)} = \epsilon_1$. This completes the
first induction step.
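Now suppose that the induction holds for $n = 0, 1, \dots, N - 1$, so that we know
that $\|\eta_N\|_{\sigma_N} \le \epsilon_N$. To prove that it holds for $n = N$, first choose $h_N$ to solve
$h_N(x + \alpha) - h_N(x) = \eta_N(x) - \hat\eta_N(0)$. By Proposition 2.2, and the inductive
estimate on $\eta_N$, we will have
$\|h_N\|_{\sigma_N - \delta_N} \le \frac{\Gamma(\nu)\epsilon_N}{K(2\pi\delta_N)^\nu}$, while Proposition 2.3
implies that $H_N^{-1}(x) = x - h_N(x) + g_N(x)$, with
$\|g_N\|_{\sigma_N - 4\delta_N} \le \frac{2\pi\Gamma(\nu)^2 \epsilon_N^2}{K^2 (2\pi\delta_N)^{2\nu + 1}}$.
Finally, if we define $\phi_{N+1} = H_N^{-1} \circ \phi_N \circ H_N = x + \alpha + \eta_{N+1}$, with $\eta_{N+1}$ defined
in analogy with $\tilde\eta$ in Proposition 2.4, then we see that
$$ \|\eta_{N+1}\|_{\sigma_{N+1}} \le \frac{16\pi\Gamma(\nu)^2}{K^2 (2\pi\delta_N)^{2\nu + 1}} \epsilon_N^2 . $$
Once again, if we use the hypotheses on the inductive constants we see that this
expression is bounded by $\epsilon_N^{(3/2)} = \epsilon_{N+1}$, which completes the proof of the lemma.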
With the aid of the Inductive Lemma it is easy to complete the proof of
Arnold's Theorem. Define
$$ \begin{aligned}
\mathcal{H}_N(x) &= H_0 \circ H_1 \circ \dots \circ H_N(x) \\
&= x + h_N(x) + h_{N-1}(x + h_N(x)) + h_{N-2}(x + h_N(x) + h_{N-1}(x + h_N(x))) \\
&\quad + \dots + h_0(x + h_1(x + \dots) \dots)
\end{aligned} $$
()'N
1) K(2
.
N)
so %HN +1 HN %N +1 ( 4
Note that by the definition of the
N +
inductive constants, the right hand side of this inequality converges if summed
over N . Thus, HN converges uniformly to some limit H on S , and H is
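Remark 2.8 Suppose that in Arnold's Theorem we were given a diffeomorphism of the (apparently) more general form
$$ \phi(x) = x + \tilde\alpha + \eta(x) , $$
but still with rotation number $\alpha$ of type $(K, \nu)$ (where $\alpha \neq \tilde\alpha$). We can rewrite
$\phi(x) = x + \alpha + ((\tilde\alpha - \alpha) + \eta(x)) \equiv x + \alpha + \tilde\eta(x)$. If $\|\tilde\eta\|_\sigma = \|(\tilde\alpha - \alpha) + \eta\|_\sigma < \epsilon(K, \nu, \sigma)$,
then Theorem 2.1 implies that $\phi$ is analytically conjugate to $R_\alpha$.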
Remark 2.9 In examples it may be difficult to determine from inspection of
the initial diffeomorphism what the rotation number is. In such cases there is
often a parameter in the problem which can be varied and which allows one to
show that the conjugacy in Arnold's Theorem exists at least for "most" parameter
values. For instance, the following result can be proven by easy modifications of
the previous methods: (See [1], page 271.)
Theorem 2.2 Consider the family of diffeomorphisms:
$$ \phi_{\alpha,\epsilon}(x) = x + \alpha + \epsilon\eta(x) , \tag{7} $$
for $\alpha \in [0, 1]$. For every $\delta > 0$, there exists $\epsilon_0 > 0$ such that if $|\epsilon| < \epsilon_0$, there
exists a set $A(\epsilon) \subset [0, 1]$ such that for $\alpha \in A(\epsilon)$, $\phi_{\alpha,\epsilon}$ is analytically conjugate
to a rotation of the circle, and $|\text{Lebesgue measure}(A(\epsilon)) - 1| < \delta$.
Remark 2.10 It is not necessary to work with analytic functions. For instance,
Moser [10] showed that if the original diffeomorphism is $C^k$, and if the rotation
number is of type $(K, \nu)$, then if $k$ is sufficiently large (depending on $\nu$), and
the diffeomorphism is a sufficiently small perturbation of a rotation, the
diffeomorphism is conjugate to a rotation, with a $C^{k'}$ conjugacy function, for some
$1 \le k' < k$. Note that this theorem is still local in that it demands that the
diffeomorphism which we start with be close to a pure rotation. More recent
work of Herman [7] and Yoccoz [12] has led to a remarkably complete understanding of the global picture of the dynamics of circle diffeomorphisms. For
instance (see [12]), one can write the real numbers as a union of two disjoint
sets $A$ and $B$, and prove that any analytic circle diffeomorphism, $\phi$, with rotation number $\rho(\phi) \in B$ is analytically conjugate to the rotation $R_{\rho(\phi)}$, while for
any $\alpha \in A$, there exists an analytic circle diffeomorphism with rotation number
$\alpha$ which is not analytically conjugate to $R_\alpha$.
Nearly Integrable Hamiltonian Systems
In this section we address the KAM theory in its original setting, namely nearly
integrable Hamiltonian systems. Recall that a Hamiltonian system (in Euclidean
space) is a system of $2N$ differential equations whose form is given by
$$ \dot q_j = \frac{\partial H}{\partial p_j} , \qquad \dot p_j = -\frac{\partial H}{\partial q_j} , \qquad j = 1, \dots, N , $$
for some function $H(p, q)$. (Here $p = (p_1, \dots, p_N)$ and $q = (q_1, \dots, q_N)$.)
In general these equations are just as hard to solve as any other system of $2N$
coupled, nonlinear, ordinary differential equations, but in special circumstances
(the integrable Hamiltonian systems) there exists a special set of variables
known as the action-angle variables, $(I, \phi)$, $I \in \mathbb{R}^N$ and $\phi \in \mathbb{T}^N$, such that
in terms of these variables, $H(I, \phi) = h(I)$. Since the Hamiltonian does not
depend on the angle variables $\phi$, the equations of motion are very simple. They
become
$$ \dot I_j = -\frac{\partial H}{\partial \phi_j} = 0 ; \quad j = 1, \dots, N , $$
$$ \dot\phi_j = \frac{\partial H}{\partial I_j} \equiv \omega_j(I) ; \quad j = 1, \dots, N . $$
We can solve these equations immediately, and we find that $I(t) = I(0)$, and
$\phi(t) = \omega(I)t + \phi(0)$. Thus, for an integrable system with bounded trajectories,
the action variables $I$ are constants of the motion, while the angle variables $\phi$
just precess around an $N$-dimensional torus with angular frequencies $\omega$ given
by the gradient of the Hamiltonian with respect to the actions. (In particular,
if the components of $\omega(I)$ are irrationally related to one another, $\phi(t)$ is a quasi-periodic function.)
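The explicit solution of an integrable system is simple enough to code directly. The sketch below assumes a hypothetical Hamiltonian h(I) = (I_1^2 + I_2^2) / 2, so that the frequency vector equals I, and takes angles modulo 1; the function names are illustrative.

```python
import math

def integrable_flow(I0, phi0, t, omega):
    """Exact flow: actions stay fixed, angles advance linearly (taken mod 1)."""
    w = omega(I0)
    phi_t = [(p + wj * t) % 1.0 for p, wj in zip(phi0, w)]
    return list(I0), phi_t

def omega(I):
    # Gradient of the hypothetical Hamiltonian h(I) = (I_1**2 + I_2**2) / 2.
    return list(I)

I0 = [1.0, math.sqrt(2.0)]
I_t, phi_t = integrable_flow(I0, [0.0, 0.25], 10.0, omega)
print(I_t, phi_t)
```

With these actions the two frequencies are 1 and the square root of 2, which are irrationally related, so the orbit never closes and fills the torus densely.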
Remark 3.1 The three-body (or $N$-body) problem, in which we ignore the mutual interaction between the planets, is an integrable system.
Now suppose that we start with an integrable Hamiltonian $h(I)$ and make a
small perturbation which depends on both the action and the angle variables,
as in the case of the solar system when we consider the gravitational interactions
between the planets. Then the Hamiltonian takes the form:
$$ H(I, \phi) = h(I) + f(I, \phi) . \tag{8} $$
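As in the previous section, we need a number-theoretic condition on the
frequencies: a vector $\omega \in \mathbb{R}^N$ is of type $(L, \gamma)$ if
$$ |\langle \omega, n \rangle| = \Big| \sum_{j=1}^{N} \omega_j n_j \Big| \ge L |n|^{-\gamma} \quad \text{for all } n \in \mathbb{Z}^N \setminus 0 . $$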
Remark 3.2 Note that if $\alpha$ is of type $(K, \nu)$, then the vector $(\alpha, 1)$ is of type
$(L, \gamma)$ with $K = L$ and $\gamma = \nu - 1$. Also, we again assume without loss of
generality that $L \le 1$.
Given this remark, and the fact that we know that the numbers of type
$(K, \nu)$ are a subset of the real line of full Lebesgue measure, the following result
(whose proof we omit) is not surprising.
Proposition 3.1 If $\gamma > N$, almost every $\omega \in \mathbb{R}^N$ is of type $(L, \gamma)$ for some
$L < 1$.
We are now in a position to state the KAM theorem.
Theorem 3.1 (KAM) Suppose that $\omega^* \equiv \omega(I^*)$ is of type $(L, \gamma)$, and that
the Hessian matrix $\frac{\partial^2 h}{\partial I^2}$ is invertible at $I^*$. (And hence on some neighborhood of
$I^*$.) Then there exists $\epsilon_0 > 0$ such that if $\|f\|_{\sigma,\rho} < \epsilon_0$, the Hamiltonian system
(8) has a quasi-periodic solution with frequencies $\omega(I^*)$.
Remark 3.3 Although we have claimed in the theorem only that at least one
quasi-periodic solution exists in the perturbed Hamiltonian system, we will see
in the course of the proof that the whole torus, $I = I^*$, survives.
Remark 3.4 One might wonder why we study quasi-periodic orbits rather than
the apparently simpler periodic orbits. If one considers values of the action variables for which the frequencies $\omega_j(I)$ are all rationally related, then the integrable
Hamiltonian will have an invariant torus, filled with periodic orbits. However,
under a typical perturbation, all but finitely many of these periodic orbits will
disappear. Hence, the quasi-periodic orbits are, in this sense, more stable than
the periodic ones.
As we will see, the proof follows very closely the outline of the previous
section. In particular, we begin with:
Step 1: Analysis of the Linearized equation
The basic idea is to find new variables $(\tilde I, \tilde\phi)$ such that in terms of these new
variables (8) will be integrable. However, not just any change of variables is
allowed, because most changes of variables will not preserve the Hamiltonian
form of the equations of motion. We will admit only those changes of variables
which do preserve the Hamiltonian form of the equations. Such transformations
are known as canonical changes of variables. There is a large literature on
canonical transformations, (for a nice introduction see [2]), but pursuing it would
take us too far afield. In order to come to the point in as expeditious a fashion
as possible, let us just note the following:
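Proposition 3.2 Suppose that there exists a smooth function $\Sigma(\tilde I, \phi)$ such that
the equations:
$$ I = \frac{\partial \Sigma}{\partial \phi} , \qquad \tilde\phi = \frac{\partial \Sigma}{\partial \tilde I} , $$
can be inverted to find $(I, \phi) = \Phi(\tilde I, \tilde\phi)$. Then $\Phi$ is a canonical transformation,
and $\Sigma$ is called its generating function.
Proof: See [2], section 48.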
Remark 3.5 Note that $\Sigma(\tilde I, \phi) = \langle \tilde I, \phi \rangle$ is the generating function for the identity transformation. (Here, $\langle \cdot, \cdot \rangle$ is the inner product in $\mathbb{R}^N$.)
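The claim in Remark 3.5 can be verified in one line. This is a sketch assuming the generating-function relations of Proposition 3.2 read $I = \partial\Sigma/\partial\phi$ and $\tilde\phi = \partial\Sigma/\partial\tilde I$; the symbols $\Sigma$, $\tilde I$ and $\tilde\phi$ are notation assumed here for the generating function and the new variables.

```latex
\Sigma(\tilde I, \phi) = \langle \tilde I, \phi \rangle = \sum_{j=1}^{N} \tilde I_j \phi_j
\quad\Longrightarrow\quad
I_j = \frac{\partial \Sigma}{\partial \phi_j} = \tilde I_j ,
\qquad
\tilde\phi_j = \frac{\partial \Sigma}{\partial \tilde I_j} = \phi_j .
```

Both relations invert trivially, so the old and new variables coincide. Transformations close to the identity are then obtained by adding a small function to this generating function.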
Remark 3.6 There are other ways of generating canonical transformations.
In particular, the Lie transform method has proven to be very convenient for
computational purposes [5]. However, the generating function method offers a
simple and direct way to prove the KAM theorem and for that reason I have
chosen it here.
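We would like to find a canonical transformation $(I, \phi) = \Phi(\tilde I, \tilde\phi)$ such that
$$ H\Big( \frac{\partial \Sigma}{\partial \phi}(\tilde I, \phi), \phi \Big) = \tilde h(\tilde I) . \tag{9} $$
(This, by the way, is the Hamilton-Jacobi equation. In the last century, Jacobi
proved the integrability of a number of physical systems by finding solutions of
this equation.) In our example, (9) can be written as:
$$ h\Big( \frac{\partial \Sigma}{\partial \phi}(\tilde I, \phi) \Big) + f\Big( \frac{\partial \Sigma}{\partial \phi}(\tilde I, \phi), \phi \Big) = \tilde h(\tilde I) . \tag{10} $$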
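$$ S(I, \phi) = \frac{i}{2\pi} \sum_{n \in \mathbb{Z}^N \setminus 0} \frac{\hat f(I, n) e^{i 2\pi \langle n, \phi \rangle}}{\langle \omega(I), n \rangle} \tag{12} $$
Remark 3.7 Once again, as in (6), the function $S$ defined (formally) by (12)
does not satisfy (11), but rather
$$ \Big\langle \omega(I), \frac{\partial S}{\partial \phi}(I, \phi) \Big\rangle + f(I, \phi) = 0 , \tag{13} $$
and we will be forced to estimate the difference between these two equations
below.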
Note that once again, we will face small denominators. Indeed, for a dense
set of points $I$, the denominators in (12) will vanish for infinitely many choices
of $n$. This is the reason that many people (including Poincaré) at the end of
the last century believed that these series diverged. Nonetheless, the results
of Kolmogorov, Arnold and Moser show that "most" (in the sense of Lebesgue
measure) points $I$ give rise to a convergent series. Having $S$ be defined only on
the complement of a dense set of points $I$ would be a problem, since we would be
hard pressed to take the derivatives we need in order to compute the canonical
transformation in Proposition 3.2. To proceed, we take advantage of the fact
that because of the analyticity of $f$, the Fourier coefficients $\hat f(I, n)$ are decaying
to zero exponentially fast as $|n|$ becomes large. Thus, if we truncate the sum
defining $S$ to consider only $|n| < M$, for some large $M$, we will make only a
relatively small error in the solution of (11). On the other hand, since there are
now only finitely many terms in the sum defining $S$, we can find open sets of
action-variables on which the generating function is defined. Before stating the
precise estimate on $S$, we introduce a few preliminaries.
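First, define $\Omega \ge 1$, such that
$$ \max\Big( \sup_{|I - I^*| \le \rho} \Big\| \frac{\partial^2 h}{\partial I^2} \Big\| , \ \sup_{|I - I^*| \le \rho} \Big\| \Big( \frac{\partial^2 h}{\partial I^2} \Big)^{-1} \Big\| \Big) < \Omega . $$
(Here, $\| \cdot \|$ is the norm of the matrix considered as an operator from $\mathbb{C}^N \to \mathbb{C}^N$.)
Similarly, define $\tilde\Omega$ such that $\sup_{|I - I^*| \le \rho} \| \frac{\partial^3 h}{\partial I^3} \| < \tilde\Omega$.
(In this case, $\| \cdot \|$ is the norm of $\frac{\partial^3 h}{\partial I^3}$ considered as a bilinear operator from
$\mathbb{C}^N \times \mathbb{C}^N \to \mathbb{C}^N$.) Next note that if we define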
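$$ S^<(I, \phi) = \frac{i}{2\pi} \sum_{n \in \mathbb{Z}^N \setminus 0 , \ |n| \le M} \frac{\hat f(I, n) e^{i 2\pi \langle n, \phi \rangle}}{\langle \omega(I), n \rangle} , $$
then $S^<$ satisfies $\langle \omega(I), \frac{\partial S^<}{\partial \phi}(I, \phi) \rangle + f^<(I, \phi) = 0$,
where $f^<(I, \phi) \equiv \sum_{|n| \le M} \hat f(I, n) e^{i 2\pi \langle n, \phi \rangle}$. Note that we have already discarded
all terms that were formally of more than first order in $\|f\|_{\sigma,\rho}$ in order to derive
(11). Thus, if in deriving this equation for $S^<$, we change (11) only by amounts
of this order, we won't have qualitatively worsened our approximation. We will
choose $M$ in order to ensure that this is the case.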
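Proposition 3.3 Choose $0 < \delta < \sigma$, and set $M = |\log(\|f\|_{\sigma,\rho})| / (\pi\delta)$. If
$\rho < L / (2\Omega M^{\gamma + 1})$ and $4\pi\delta < 1$, then $S^<$ is analytic on $A_{\sigma - \delta, \rho}(I^*)$, and
$$ \|S^<\|_{\sigma - \delta, \rho} \le \Big( \frac{8\Gamma(\gamma + 1)}{(2\pi\delta)^{\gamma + 1}} \Big)^N \frac{2N^\gamma \|f\|_{\sigma,\rho}}{2\pi L} . $$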
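The choice of the truncation order M can be illustrated numerically. The sketch below assumes that M is given by |log ||f||| / (pi delta), as read from Proposition 3.3, and that the Fourier modes of f decay like exp(-2 pi delta |n|) on a strip narrowed by delta; the function name and the sample values are illustrative.

```python
import math

def truncation_order(f_norm, delta):
    """M = |log(f_norm)| / (pi * delta)."""
    return abs(math.log(f_norm)) / (math.pi * delta)

f_norm, delta = 1e-4, 0.05
M = truncation_order(f_norm, delta)

# Size of the decay factor exp(-2*pi*delta*|n|) at the cutoff |n| = M.
tail_factor = math.exp(-2.0 * math.pi * delta * M)
print(M, tail_factor)
```

Under these assumptions the discarded modes carry a factor of order ||f||^2, so truncating changes the linearized equation only at second order in the small quantity, the same order as the terms already discarded.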
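Proof: Recall that we chose our domain $A_{\sigma,\rho}(I^*)$ so that it was centered (in
the $I$ variables) at a point with $\omega(I^*) = \omega^*$. Now suppose that we choose
$|n| < M$, and consider $\langle \omega(I), n \rangle$ for some other point $I$ in our domain. Writing
$I = I^* + (I - I^*)$, we see that $|\langle \omega(I), n \rangle - \langle \omega(I^*), n \rangle| \le \Omega\rho|n|$. If we then use the
fact that $\omega^*$ is of type $(L, \gamma)$, we find that for $|n| \le M$ and all $|I - I^*| < \rho$,
$$ |\langle \omega(I), n \rangle| = |\langle \omega^*, n \rangle + (\langle \omega(I), n \rangle - \langle \omega^*, n \rangle)| \ge \frac{L}{|n|^\gamma} - \Omega\rho|n| \ge \frac{L}{2|n|^\gamma} , $$
where the last inequality used the hypothesis on $\rho$ and the fact that $|n| \le M$.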
If we combine this observation with the fact that |f(I, n)| %f %, e2|n| , by
Cauchys theorem, we find
%S < %,
" 2|n|
%f %, e2|n|
2L
|n|M
M
"
2%f %,
N (1 + 2
m e2|m| )N
2L
m=0
)
*N
8( + 1)
2N %f %,
.
(2)+1
2L
In going from the first to second line of this inequality, we used the fact that
|n| e2|n| N (max |nj |) e2|n| N
j
so that
|n|M
|n| e2|n| N (1 + 2
!M
m=1
N
-
j=1
me2m )N .
Now that we know that the generating function is well-defined, we can proceed to check that the canonical transformation is defined and analytic, just as
we did in Proposition 2.3 in the previous section.
Proposition 3.4 If
)
8( + 1)
(2)+1
*N
16N +1 %f %,
<1 ,
2L
(14)
on
define an analytic and invertible canonical transformation (I, ) = (I, )
the set A3,/4 .
Proof: Just as in the proof of Lemma 2.3 we begin by using the analytic inverse function theorem to check that (14) can be inverted. In both of the
expressions in this equation, the inverse function theorem can be applied pro2 <
S
vided % I
%2,/2 < 1. This in turn, follows immediately from the estimate
in Proposition 3.3 and Cauchys Theorem.
8( + 1)
(2)+1
*N
2N +1 %f %,
< /8 ,
2L
S <
on a
<
that | S
| is
<
| S
|,
domain A, ,
supA,
where we recall
the .1 norm of
This is the origin of the extra factor of N in these estimates.
S <
.
and
%f%3,/4 2( + 2)
+)
8( + 1)
(2)+1
*N
2N +1 %f %,
2L
,2
Remark 3.9 The important thing to note is that $\tilde f$, the amount by which our
transformed Hamiltonian fails to be integrable, is quadratic in the small quantity
$\|f\|_{\sigma,\rho}$. Just as in Proposition 2.4 in the previous section, this will form the
basis of a Newton's method argument, which will allow us to prove the existence
of a quasi-periodic solution with frequencies $\omega^*$.
Proof: Using Taylors Theorem, we can rewrite
<
= H(I + S , (I, ))
I, )
H(
S <
S <
= h(I +
) + f (I +
, (I, ))
# 1# t
S <
0
0
# 1
S <
S <
f
+f (I, ) +
/ (I + t
, ),
0dt .
I
<
#
I) = h(I) + average{
h(
+average{
where average{g(I, )}
/
$
/(
S S S
(I + v
) ),
0dvdt}
I
f
S
S
,
(I + t , ),
0dt} + averagef (I, (I, ))
I
TN
,
and
g(I, )d
= f (I, (I, ))
f(I, )
# 1# t
0
0
# 1
f
S <
S <
+
/ (I + t
, (I, ),
0dt
I
0
# 1# t
0
0
# 1
f
S <
S <
average{ / (I + t
, ),
)0dt}
I
Remark 3.10 Subtracting the average of the three quantities in $\tilde f$ ensures that
when we expand $\tilde f$ in a Fourier series, there will be no $n = 0$ coefficient. This
was used in solving (11).
and f are easy to estimate using the estimates of Proposition 3.3
Both h
and Cauchys Theorem. For instance,
# 1
S <
S <
f
, ),
0dt%3,/4
%
/ (I + t
I
0
)
*N
2%f %, 8( + 1)
2N +2 %f %,
(2)+1
2L
while,
%
/(
S < S < S <
(I + v
)
),
0dvdt%3,/4
I
+)
,2
*N
8( + 1)
2N +1 %f %,
.
(2)+1
2L
4
( )N %f %2, ,
|n|M
where the last of these inequalities came from using the definition of M in
Proposition 3.3.
If we combine these remarks, we immediately obtain the estimates stated in
the Proposition.
36(1+n2 ) ,
n 0.
0 = , and n+1 = n 4n , if n 0.
0 , and n+1 = n /8, with 0 chosen to satisfy the hypothesis of the
following Lemma.
(3/2)(n/)
(0 = %f %, , and (n = (0
, if n 0.
Mn = | log (n |/(n ).
S
/n (I), n (I, )0 + fn< (I, ) = 0 ,
!
i2(n,)
th
n
with fn< (I, )
, and n (I) = h
|n|Mn fn (I, n)e
I (I). At the n
stage of the iteration we will work on the domain An ,n (In ) = {(I, ) CN
CN | |I In | < n , |Im(j )| < n , j = 1, . . . , N }, where In is chosen so
2
2
that n (In ) = , and we define n = max(1, sup % Ih2n %, %( Ih2n )1 %), with
the supremum in these expressions running over all I with |I In | < n .
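The following Python sketch tabulates the induction constants, to make their behavior concrete. It assumes the definitions read $\delta_n=\sigma/(36(1+n^2))$, $\sigma_{n+1}=\sigma_n-4\delta_n$, $\rho_{n+1}=\rho_n/8$, $\epsilon_n=\epsilon_0^{(3/2)^{(n/\tau)}}$ and $M_n=|\log\epsilon_n|/(\pi\delta_n)$; the function name and the sample parameter values are hypothetical. Logarithms of $\epsilon_n$ are stored because the values themselves underflow quickly.

```python
import math

def induction_constants(sigma, rho0, eps0, tau, steps):
    """Tabulate delta_n, sigma_n, rho_n, log(eps_n) and M_n for n < steps."""
    rows = []
    sigma_n, rho_n = sigma, rho0
    log_eps0 = math.log(eps0)
    for n in range(steps):
        delta_n = sigma / (36.0 * (1 + n * n))        # loss of analyticity strip at step n
        log_eps_n = (1.5 ** (n / tau)) * log_eps0     # log of eps0 ** ((3/2) ** (n / tau))
        m_n = abs(log_eps_n) / (math.pi * delta_n)    # Fourier cutoff at step n
        rows.append({
            "n": n, "delta": delta_n, "sigma": sigma_n,
            "rho": rho_n, "log_eps": log_eps_n, "M": m_n,
        })
        sigma_n -= 4.0 * delta_n                      # strip width for the next step
        rho_n /= 8.0                                  # action radius for the next step
    return rows
```

For instance, `induction_constants(1.0, 0.1, 1e-6, 2.0, 40)` shows that the strip width $\sigma_n$ stays bounded below (the sum of the $4\delta_n$ is finite), while the action radius $\rho_n$ shrinks geometrically and the cutoff $M_n$ grows.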
We then have
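Lemma 3.1 (KAM Induction Lemma) There exists a positive constant $c_1$ such that if
$$\epsilon_0<2^{-c_1N(\tau+1)}\frac{\sigma^{8N(4\tau+1)}\rho_0^{8}L^{16}}{\Omega^{8}\,\Gamma(\tau+1)^{16N}},\quad\text{and}\quad\rho_0<\frac{2^{-c_1}L}{\Omega M_0^{\tau+1}},$$
then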
The generating function $S_n^<$ satisfies
$$\|S_n^<\|_{\sigma_n-\delta_n,\rho_n}\le\left(\frac{8\Gamma(\tau+1)}{(2\pi\delta_n)^{\tau+1}}\right)^{N}\frac{2^{N}\epsilon_n}{2\pi L}.$$
Before proving this lemma, we show how the KAM theorem follows from it. If the perturbation $f$ in our Hamiltonian is sufficiently small, the hypotheses of the Induction Lemma will be satisfied and, roughly speaking, the idea is that as $n\to\infty$, $H_n(I,\phi)\to h^{\infty}(I)$, an integrable system, since $f_n\to0$. Since all of the orbits of an integrable system are quasi-periodic, this would complete the proof. However, as $n$ becomes larger and larger, the size of the domain in the action variables on which $H_n$ is defined goes to zero. Thus, we must be a little careful with this limit.
Begin by defining $\Psi_n=\Phi_0\circ\Phi_1\circ\dots\circ\Phi_n$. By the induction lemma, $\Psi_n:A_{\sigma_n-3\delta_n,\rho_n/4}(I_n)\to A_{\sigma_0,\rho_0}(I_0)$, and $H_n=H_0\circ\Psi_{n-1}$. In particular, if $(I_n(t),\phi_n(t))$ is a solution of Hamilton's equations with Hamiltonian $H_n$, then $\Psi_{n-1}(I_n(t),\phi_n(t))$ is a solution of Hamilton's equations with Hamiltonian $H_0$.
Consider the equations of motion of $H_n$:
$$\dot I=-\frac{\partial f_n}{\partial\phi},\qquad\dot\phi=\omega_n(I)+\frac{\partial f_n}{\partial I}.$$
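Since $\|\frac{\partial f_n}{\partial I}\|_{\sigma_n,\rho_n/2}\le2\epsilon_nN/\rho_n$, and $\|\frac{\partial f_n}{\partial\phi}\|_{\sigma_n-\delta_n,\rho_n}\le\epsilon_nN/\delta_n$, the trajectory with initial conditions $(I_n,\phi_0)$ (for any $\phi_0\in T^N$) will remain in $A_{\sigma_n-3\delta_n,\rho_n/4}(I_n)$ for all times $|t|\le T_n=2^{n}$, by our hypothesis on $\epsilon_0$ and the definition of the induction constants. Furthermore, if $(I_n(t),\phi_n(t))$ is the solution with these initial conditions, we have
$$\max\Big(\sup_{|t|\le T_n}|I_n(t)-I_n|,\ \sup_{|t|\le T_n}|\phi_n(t)-(\omega^*t+\phi_0)|\Big)\le\frac{2^{2n+2}\epsilon_nN}{\rho_n\delta_n}.$$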
Noting that the inductive estimates on $I_n$ imply that there exists $I^{\infty}$ with $\lim_{n\to\infty}I_n=I^{\infty}$, we see that for $t$ in any compact subset of the real line, $(I_n(t),\phi_n(t))\to(I^{\infty},\omega^*t+\phi_0)$ (again using the definition of the inductive constants). Using the inductive bounds on the canonical transformation one can readily establish that
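$$\|\Psi_n(I,\phi)-(I,\phi)\|_{\sigma_{n+1},\rho_{n+1}}\le\sum_{j=0}^{n}2^{N}\left(\frac{8\Gamma(\tau+1)}{(2\pi\delta_j)^{\tau+1}}\right)^{N}\left(\frac{8N\epsilon_j}{2\pi\rho_j\delta_jL}\right),$$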
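while
$$\begin{aligned}
\|\Psi_n(I,\phi)-\Psi_{n-1}(I,\phi)\|_{\sigma_{n+1},\rho_{n+1}}&=\|\Psi_{n-1}\circ\Phi_n(I,\phi)-\Psi_{n-1}(I,\phi)\|_{\sigma_{n+1},\rho_{n+1}}\\
&\le\Big(2N+\frac{16}{\rho_n\delta_n}\Big)\left(\frac{8\Gamma(\tau+1)}{(2\pi\delta_n)^{\tau+1}}\right)^{N}\left(\frac{8N\epsilon_n}{2\pi\rho_n\delta_nL}\right).
\end{aligned}$$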
Using the definition of the inductive constants, we see that the sum over $n$ of this last expression converges, and hence the limit $\lim_{n\to\infty}\Psi_n(I^{\infty},\omega^*t+\phi_0)=(I^*(t),\phi^*(t))$ exists.
Remark 3.11 Note that this argument is independent of the point $\phi_0$ that we take on the original torus. Thus it shows that every trajectory on the unperturbed torus is preserved.
Proof: (of Lemma 3.1.) Note that Propositions 3.3, 3.4, and 3.5, plus the assumption on the induction constants, imply that we can start the induction, provided $A_{\sigma_0-3\delta_0,\rho_0/4}(I_0)\supset A_{\sigma_1,\rho_1}(I_1)$. From the definitions of the domains and the inductive constants, we see that this will follow provided $|I_0-I_1|<\rho_0/8$.
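To see that this is so we note that $\omega_0(I_0)=\omega_1(I_1)$. Thus, $\omega_0(I_0)-\omega_0(I_1)=\frac{\partial(h_1-h_0)}{\partial I}(I_1)$. But $\|\frac{\partial(h_1-h_0)}{\partial I}\|_{\sigma_0-3\delta_0,\rho_0/6}\le12\epsilon_1/\rho_0$, while
$$\omega_0(I_0)-\omega_0(I_1)=\frac{\partial\omega_0}{\partial I}(I_0)(I_0-I_1)-\int_0^1\!\!\int_0^t\frac{\partial^2\omega_0}{\partial I^2}\big(I_0+s(I_1-I_0)\big)(I_1-I_0)^{2}\,ds\,dt .$$
Since $\|(\frac{\partial\omega_0}{\partial I})^{-1}\|$ and $\|\frac{\partial^2\omega_0}{\partial I^2}\|$ are bounded, this implies that $|I_0-I_1|<\rho_0/8$ by the definition of the induction constants, provided the constant $c_1$ in the Lemma is sufficiently large. This completes the first induction step.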
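Suppose that the induction argument holds for $n=0,1,\dots,K-1$. To prove it for $n=K$ we first note that if $S_K^<$ is defined by
$$S_K^<(I,\phi)=\frac{i}{2\pi}\sum_{\substack{n\in\mathbb Z^N\setminus0\\|n|\le M_K}}\frac{\hat f_K(I,n)e^{i2\pi\langle n,\phi\rangle}}{\langle\omega_K(I),n\rangle},$$
then, by Proposition 3.3,
$$\|S_K^<\|_{\sigma_K-\delta_K,\rho_K}\le\left(\frac{8\Gamma(\tau+1)}{(2\pi\delta_K)^{\tau+1}}\right)^{N}\frac{2^{N}\epsilon_K}{2\pi L}.$$
Note that the hypothesis in Proposition 3.3 becomes $\rho_K<L/(2\Omega_KM_K^{\tau+1})$, where
$$\begin{aligned}
\Omega_K&=\max\Big(1,\sup_{|I-I_K|<\rho_K}\Big\|\frac{\partial^2h_K}{\partial I^2}\Big\|,\sup_{|I-I_K|<\rho_K}\Big\|\Big(\frac{\partial^2h_K}{\partial I^2}\Big)^{-1}\Big\|\Big)\\
&\le\Omega\max\Big(1+\sum_{j=1}^{K}\frac{64N\epsilon_j}{\rho_j^{2}},\Big(1-\sum_{j=1}^{K}\frac{64N\epsilon_j}{\rho_j^{2}}\Big)^{-1}\Big)\le2\Omega,
\end{aligned}$$
using the definition of the inductive constants. This observation, plus the hypothesis on $\rho_0$ in the inductive lemma, guarantees that the hypothesis of Proposition 3.3 is satisfied. Thus, by Proposition 3.4, the canonical transformation $\Phi_K$ defined by
$$I=\tilde I+\frac{\partial S_K^<}{\partial\phi},\quad\text{and}\quad\tilde\phi=\phi+\frac{\partial S_K^<}{\partial\tilde I},\qquad(15)$$
is analytic and invertible on the set $A_{\sigma_K-3\delta_K,\rho_K/4}(I_K)$, and maps this set into $A_{\sigma_K,\rho_K}(I_K)$.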
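The formula for $S_K^<$ amounts to dividing each Fourier coefficient of the truncated perturbation by the small divisor $\langle\omega,n\rangle$. The Python sketch below illustrates this at a fixed value of the action, so that $\omega$ is a constant vector. It is a minimal example under stated assumptions: the helper names, the sample frequency vector, and the convention $|n|=|n_1|+\dots+|n_N|$ are all hypothetical choices, and the frequency vector is assumed non-resonant so that no divisor vanishes.

```python
import cmath
import math

def pairing(a, b):
    """Euclidean pairing <a, b> of two tuples."""
    return sum(x * y for x, y in zip(a, b))

def truncate(f_hat, cutoff):
    """Keep the modes with 0 < |n| <= cutoff, where |n| = |n_1| + ... + |n_N|."""
    return {n: c for n, c in f_hat.items() if 0 < sum(abs(k) for k in n) <= cutoff}

def solve_homological(f_hat, omega, cutoff):
    """Coefficients of S: s_n = i f_n / (2 pi <omega, n>) for the truncated modes."""
    return {
        n: 1j * c / (2 * math.pi * pairing(omega, n))
        for n, c in truncate(f_hat, cutoff).items()
    }

def evaluate(coeffs, phi):
    """Evaluate sum_n c_n exp(2 pi i <n, phi>) at the angle phi."""
    return sum(c * cmath.exp(2j * math.pi * pairing(n, phi)) for n, c in coeffs.items())

def omega_dot_grad(s_hat, omega, phi):
    """Evaluate <omega, dS/dphi>; each mode is multiplied by 2 pi i <omega, n>."""
    derivative = {n: 2j * math.pi * pairing(omega, n) * c for n, c in s_hat.items()}
    return evaluate(derivative, phi)
```

With these helpers, `omega_dot_grad(s_hat, omega, phi) + evaluate(truncate(f_hat, cutoff), phi)` vanishes up to rounding at any angle, which is the linearized equation that the generating function is designed to solve. The size of the coefficients `s_hat` is controlled by how small $\langle\omega,n\rangle$ can be for $|n|\le M_K$, which is where the Diophantine constant $L$ enters the estimate.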
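If we then define $f_{K+1}$ and $h_{K+1}$ as we defined $\tilde f$ and $\tilde h$ in Proposition 3.5, we see that
$$\|f_{K+1}\|_{\sigma_K-3\delta_K,\rho_K/4}\le2(\Omega_K+2)\left[\left(\frac{8\Gamma(\tau+1)}{(2\pi\delta_K)^{\tau+1}}\right)^{N}\frac{2^{N+1}\epsilon_K}{2\pi\rho_K\delta_KL}\right]^{2},$$
while
$$\|h_K-h_{K+1}\|_{\sigma_K-3\delta_K,\rho_K/4}\le(\Omega_K+2)\left[\left(\frac{8\Gamma(\tau+1)}{(2\pi\delta_K)^{\tau+1}}\right)^{N}\frac{2^{N+1}\epsilon_K}{2\pi\rho_K\delta_KL}\right]^{2}.$$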
Remark 3.12 From the point of view of applications of this theory it is often
convenient to know not just what happens to a single trajectory, but rather the
behavior of whole sets of trajectories. Simple modifications of the preceding
argument allow one to demonstrate the following variant of the KAM theorem.
(See [4].) Consider the family of Hamiltonian systems
$$H_\epsilon=h(I)+\epsilon f(I,\phi).\qquad(16)$$
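Suppose that there exists a bounded set $V\subset\mathbb R^N$ such that $\frac{\partial^2h}{\partial I^2}(I)$ is invertible for all $I\in V$, and that for every $\epsilon$ in some neighborhood of zero $H_\epsilon$ is analytic on a set of the form $A_{\sigma,\rho}(V)=\{(I,\phi)\in\mathbb C^N\times\mathbb C^N \mid |I-\tilde I|<\rho \text{ for some } \tilde I\in V, \text{ and } |\mathrm{Im}(\phi_j)|<\sigma,\ j=1,\dots,N\}$.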
Theorem 3.2 For every $\delta>0$, there exists $\epsilon_0>0$ such that if $|\epsilon|<\epsilon_0$, there exists a set $P_\epsilon\subset V\times T^N$ such that the Lebesgue measure of $(V\times T^N)\setminus P_\epsilon$ is less than $\delta$, and for any point $(I_0,\phi_0)\in P_\epsilon$, the trajectory of (16) with initial conditions $(I_0,\phi_0)$ is quasi-periodic.
Thus an informal way of stating the KAM theorem is to say that "most" trajectories of a nearly integrable Hamiltonian system remain quasi-periodic.
Remark 3.13 Just as in the case of Arnold's theorem about circle diffeomorphisms, the KAM theorem also remains true when the Hamiltonian is only finitely differentiable, rather than analytic. For a nice exposition of this theory, see [11].
References
[1] V. Arnold. Small denominators, 1: Mappings of the circumference onto itself. AMS Translations, 46:213-288, 1965 (Russian original published in 1961).
[2] V. Arnold. Mathematical Methods of Classical Mechanics. Springer-Verlag, New York, 1978.
[3] V. Arnold. Geometrical Methods in the Theory of Ordinary Differential Equations. Springer-Verlag, New York, 1982.
[4] G. Gallavotti. Perturbation theory for classical Hamiltonian systems. In J. Fröhlich, editor, Scaling and Self-Similarity in Physics, pages 359-426. Birkhäuser, Boston, 1983.
[5] W. Gröbner. Die Lie-Reihen und ihre Anwendungen. Deutscher Verlag der Wissenschaften, Berlin, 1960.
[6] J. Guckenheimer and P. Holmes. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. Springer-Verlag, New York, 1983.
[7] M. R. Herman. Sur la conjugaison différentiable des difféomorphismes du cercle à des rotations. Publ. Math. I.H.E.S., 49:5-234, 1979.
[8] A. N. Kolmogorov. On conservation of conditionally periodic motions under small perturbations of the Hamiltonian. Dokl. Akad. Nauk SSSR, 98:527-530, 1954.