Lectures On Classical Mechanics-John Baez PDF
Classical Mechanics
by
John C. Baez
Department of Mathematics
University of California, Riverside
2005
© 2005 John C. Baez & Derek K. Wise
Preface
These are notes for a mathematics graduate course on classical mechanics at U.C.
Riverside. I’ve taught this course three times recently. Twice I focused on the Hamiltonian
approach. In 2005 I started with the Lagrangian approach, with a heavy emphasis on
action principles, and derived the Hamiltonian approach from that. This approach seems
more coherent.
Derek Wise took beautiful handwritten notes on the 2005 course, which can be found
on my website:
http://math.ucr.edu/home/baez/classical/
Later, Blair Smith from Louisiana State University miraculously appeared and volunteered
to turn the notes into LaTeX. While not yet the book I’d eventually like to write,
the result may already be helpful for people interested in the mathematics of classical
mechanics.
The chapters in this LaTeX version are in the same order as the weekly lectures, but
I’ve merged weeks together, and sometimes split them across chapters, to give these
notes more of a textbook feel. For reference, the weekly lectures are outlined here.
Week 1: (Mar. 28, 30, Apr. 1)—The Lagrangian approach to classical mechanics:
deriving F = ma from the requirement that the particle’s path be a critical point of
the action. The prehistory of the Lagrangian approach: D’Alembert’s “principle of least
energy” in statics, Fermat’s “principle of least time” in optics, and how D’Alembert
generalized his principle from statics to dynamics using the concept of “inertia force”.
Week 2: (Apr. 4, 6, 8)—Deriving the Euler-Lagrange equations for a particle on an
arbitrary manifold. Generalized momentum and force. Noether’s theorem on conserved
quantities coming from symmetries. Examples of conserved quantities: energy, momen-
tum and angular momentum.
Week 3 (Apr. 11, 13, 15)—Example problems: (1) The Atwood machine. (2) A
frictionless mass on a table attached to a string threaded through a hole in the table, with
a mass hanging on the string. (3) A special-relativistic free particle: two Lagrangians, one
with reparametrization invariance as a gauge symmetry. (4) A special-relativistic charged
particle in an electromagnetic field.
Week 4 (Apr. 18, 20, 22)—More example problems: (4) A special-relativistic charged
particle in an electromagnetic field in special relativity, continued. (5) A general-relativistic
free particle.
Week 5 (Apr. 25, 27, 29)—How Jacobi unified Fermat’s principle of least time and
Lagrange’s principle of least action by seeing the classical mechanics of a particle in a
potential as a special case of optics with a position-dependent index of refraction. The
ubiquity of geodesic motion. Kaluza-Klein theory. From Lagrangians to Hamiltonians.
2 Equations of Motion
2.1 The Euler-Lagrange Equations
2.1.1 Comments
2.1.2 Lagrangian Dynamics
2.2 Interpretation of Terms
1.1 Lagrangian and Newtonian Approaches
where $\dot q = \frac{dq}{dt}$, and also acceleration,
\[
  a = \ddot q : \mathbb{R} \longrightarrow \mathbb{R}^n .
\]
Now let $m > 0$ be the mass of the particle, and let $F$ be a vector field on $\mathbb{R}^n$ called the
force. Newton claimed that the particle satisfies $F = ma$. That is:
\[
  m\, a(t) = F(q(t)).
\]
This is a 2nd-order differential equation for $q : \mathbb{R} \to \mathbb{R}^n$ which will have a unique solution
given some $q(t_0)$ and $\dot q(t_0)$, provided the vector field $F$ is ‘nice’, by which we technically
mean smooth and bounded (i.e., $|F(x)| < B$ for some $B > 0$, for all $x \in \mathbb{R}^n$).
We can then define a quantity called kinetic energy:
\[
  K(t) := \tfrac{1}{2}\, m\, v(t) \cdot v(t) \tag{1.3}
\]
This quantity is interesting because
\[
  \frac{d}{dt} K(t) = m\, v(t) \cdot a(t) = F(q(t)) \cdot v(t)
\]
So, kinetic energy goes up when you push an object in the direction of its velocity, and
goes down when you push it in the opposite direction. Moreover,
\[
  K(t_1) - K(t_0) = \int_{t_0}^{t_1} F(q(t)) \cdot v(t)\, dt
                  = \int_{t_0}^{t_1} F(q(t)) \cdot \dot q(t)\, dt
\]
So, the change of kinetic energy is equal to the work done by the force, that is, the integral
of $F$ along the curve $q : [t_0, t_1] \to \mathbb{R}^n$. This implies (by Stokes’s theorem relating line
integrals to surface integrals of the curl) that the change in kinetic energy $K(t_1) - K(t_0)$
is independent of the curve going from $q(t_0) = a$ to $q(t_1) = b$ iff
\[
  \nabla \times F = 0.
\]
where $V(t) := V(q(t))$ is called the potential energy of the particle, and then we can
show that $E$ is conserved: that is, constant as a function of time. To see this, note that
$F = ma$ implies
\[
  \frac{d}{dt}\bigl[K(t) + V(q(t))\bigr] = F(q(t)) \cdot v(t) + \nabla V(q(t)) \cdot v(t) = 0
  \quad (\text{because } F = -\nabla V).
\]
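The conservation of $E = K + V$ can be checked numerically. The following is a minimal Python sketch, not part of the original notes: it assumes a one-dimensional particle in the illustrative potential $V(q) = \tfrac12 k q^2$, integrates $F = ma$ with the velocity Verlet scheme, and records the energy at every step. The function name `simulate` and all parameter values are assumptions made for the example.

```python
def simulate(q0, v0, m=1.0, k=4.0, dt=1e-4, steps=20000):
    """Integrate m*a = F(q) = -dV/dq for the assumed potential V(q) = k*q^2/2.

    Uses velocity Verlet and returns the list of energies K + V, one per step.
    """
    def force(q):
        return -k * q

    q, v = q0, v0
    energies = []
    for _ in range(steps):
        energies.append(0.5 * m * v * v + 0.5 * k * q * q)
        a = force(q) / m
        q_new = q + v * dt + 0.5 * a * dt * dt
        a_new = force(q_new) / m
        v = v + 0.5 * (a + a_new) * dt
        q = q_new
    return energies


energies = simulate(q0=1.0, v0=0.0)
# Spread of K + V over the run; it stays tiny compared with the energy itself.
drift = max(energies) - min(energies)
```

The small residual drift comes from the time discretization, not from the physics: shrinking `dt` shrinks it further.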
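Conservative forces let us apply a whole bunch of cool techniques. In the Lagrangian
approach we define a quantity $L := K(t) - V(q(t))$, called the Lagrangian, and for any
curve $q : [t_0, t_1] \to \mathbb{R}^n$ with $q(t_0) = a$ and $q(t_1) = b$ we define the action
\[
  S(q) := \int_{t_0}^{t_1} L(t)\, dt .
\]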
From here one can go in two directions. One is to claim that nature causes particles
to follow paths of least action, and derive Newton’s equations from that principle. The
other is to start with Newton’s principles and find out what conditions, if any, on S(q)
follow from this. We will use the shortcut of hindsight, bypass the philosophy, and simply
use the mathematics of variational calculus to show that particles follow paths that are
‘critical points’ of the action $S(q)$ if and only if Newton’s law $F = ma$ holds. To do this,
let us look for curves (like the solid line in Fig. 1.1) that are critical points of $S$, namely:
\[
  \frac{d}{ds} S(q_s)\Big|_{s=0} = 0 \tag{1.8}
\]
where
\[
  q_s = q + s\,\delta q
\]
for all $\delta q : [t_0, t_1] \to \mathbb{R}^n$ with
\[
  \delta q(t_0) = \delta q(t_1) = 0.
\]
[Figure 1.1: A particle can sniff out the path of least action. Labels: $q + s\,\delta q$, $\mathbb{R}^n$, $t_0$, $t_1$.]
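To show that
\[
  F = ma \iff \frac{d}{ds} S(q_s)\Big|_{s=0} = 0
  \ \text{ for all } \delta q : [t_0, t_1] \to \mathbb{R}^n \text{ with } \delta q(t_0) = \delta q(t_1) = 0 \tag{1.9}
\]
we start by differentiating the definition of the action under the integral sign, first noting that
$dq_s/ds = \delta q(t)$ is the variation in the path:
\[
\begin{aligned}
  \frac{d}{ds} S(q_s)\Big|_{s=0}
    &= \frac{d}{ds} \int_{t_0}^{t_1} \left[ \tfrac12 m\, \dot q_s(t) \cdot \dot q_s(t) - V(q_s(t)) \right] dt \,\Big|_{s=0} \\
    &= \int_{t_0}^{t_1} \frac{d}{ds} \left[ \tfrac12 m\, \dot q_s(t) \cdot \dot q_s(t) - V(q_s(t)) \right] dt \,\Big|_{s=0} \\
    &= \int_{t_0}^{t_1} \left[ m\, \dot q_s \cdot \frac{d}{ds} \dot q_s(t) - \nabla V(q_s(t)) \cdot \frac{d}{ds} q_s(t) \right] dt \,\Big|_{s=0}
\end{aligned}
\]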
Note that
\[
  \frac{d}{ds} \dot q_s(t) = \frac{d}{ds} \frac{d}{dt} q_s(t) = \frac{d}{dt} \frac{d}{ds} q_s(t)
\]
and when we set $s = 0$ this quantity becomes just:
\[
  \frac{d}{dt}\, \delta q(t)
\]
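So,
\[
  \frac{d}{ds} S(q_s)\Big|_{s=0}
    = \int_{t_0}^{t_1} \left[ m\, \dot q \cdot \frac{d}{dt} \delta q(t) - \nabla V(q(t)) \cdot \delta q(t) \right] dt
\]
Then we can integrate by parts, noting the boundary terms vanish because $\delta q = 0$ at $t_1$
and $t_0$:
\[
  \frac{d}{ds} S(q_s)\Big|_{s=0}
    = \int_{t_0}^{t_1} \bigl[ -m\, \ddot q(t) - \nabla V(q(t)) \bigr] \cdot \delta q(t)\, dt
\]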
It follows that the variation in the action is zero for all variations $\delta q$ iff the term in brackets
is identically zero, that is,
\[
  -m\, \ddot q(t) - \nabla V(q(t)) = 0
\]
So, the path $q$ is a critical point of the action $S$ iff
\[
  F = ma.
\]
The above result applies only for conservative forces, i.e., forces that can be written
as minus the gradient of some potential. However, this seems to be true of the most
fundamental forces that we know of in our universe. It is a simplifying assumption that
has withstood the test of time and experiment.
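To see the result above at work, here is a minimal Python sketch (an illustration, not part of the original notes). It assumes a particle in one dimension with the potential $V(q) = mgq$, discretizes the action on $[0, 1]$, and estimates $\frac{d}{ds} S(q + s\,\delta q)$ at $s = 0$ by a central difference. The names `action` and `dS_ds`, the step count and the choice $\delta q(t) = t(1 - t)$ are all assumptions. Along the Newtonian parabola the derivative vanishes; along a straight line with the same endpoints it does not.

```python
M, G = 1.0, 9.81      # mass and gravitational acceleration (illustrative values)
N = 1000              # number of time steps on [0, 1]
H = 1.0 / N
TIMES = [k / N for k in range(N + 1)]


def action(path):
    """Discretized S = sum of (K - V) * H, with the assumed V(q) = M*G*q."""
    total = 0.0
    for k in range(N):
        velocity = (path[k + 1] - path[k]) / H
        midpoint = 0.5 * (path[k] + path[k + 1])
        total += (0.5 * M * velocity ** 2 - M * G * midpoint) * H
    return total


def dS_ds(path, delta, eps=1e-3):
    """Central-difference estimate of d/ds S(path + s*delta) at s = 0."""
    plus = [q + eps * d for q, d in zip(path, delta)]
    minus = [q - eps * d for q, d in zip(path, delta)]
    return (action(plus) - action(minus)) / (2 * eps)


# A variation that vanishes at both endpoints.
delta_q = [t * (1.0 - t) for t in TIMES]

# Solution of m*q'' = -m*g with q(0) = q(1) = 0, and a non-solution with the same endpoints.
newton_path = [0.5 * G * t * (1.0 - t) for t in TIMES]
straight_path = [0.0 for _ in TIMES]

dS_newton = dS_ds(newton_path, delta_q)      # essentially zero
dS_straight = dS_ds(straight_path, delta_q)  # close to -M*G/6, clearly nonzero
```

For the straight path the continuum value is $-mg \int_0^1 t(1-t)\,dt = -mg/6$, which is what the estimate approaches.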
From Newton’s Laws to Lagrange’s Equations
so that slight changes in its path do not change the action (to first order). Often, though
not always, the action is minimized, so this is called the Principle of Least Action.
Suppose we did not have the hindsight afforded by the Newtonian picture. Then we
might ask, “Why does nature like to minimize the action? And why this action $\int (K - V)\, dt$?
Why not some other action?”
‘Why’ questions are always tough. Indeed, some people say that scientists should
never ask ‘why’. This seems too extreme: a more reasonable attitude is that we should
only ask a ‘why’ question if we expect to learn something scientifically interesting in our
attempt to answer it.
There are certainly some interesting things to learn from the question “why is action
minimized?” First, note that total energy is conserved, so energy can slosh back and
forth between kinetic and potential forms. The Lagrangian L = K − V is big when most
of the energy is in kinetic form, and small when most of the energy is in potential form.
1.2 Prehistory of the Lagrangian Approach
Kinetic energy measures how much is ‘happening’ — how much our system is moving
around. Potential energy measures how much could happen, but isn’t yet — that’s what
the word ‘potential’ means. (Imagine a big rock sitting on top of a cliff, with the potential
to fall down.) So, the Lagrangian measures something we could vaguely refer to as the
‘activity’ or ‘liveliness’ of a system: the higher the kinetic energy the more lively the
system, the higher the potential energy the less lively. So, we’re being told that nature
likes to minimize the total of ‘liveliness’ over time: that is, the total action.
In other words, nature is as lazy as possible!
For example, consider the path of a thrown rock in the Earth’s gravitational field, as
in Fig. 1.2. The rock traces out a parabola, and we can think of it as doing this in order
to minimize its action. On the one hand, it wants to spend a lot of time near the top
of its trajectory, since this is where the kinetic energy is least and the potential energy
is greatest. On the other hand, if it spends too much time near the top of its trajectory,
it will need to really rush to get up there and get back down, and this will take a lot of
action. The perfect compromise is a parabolic path!
[Figure 1.2: path of the thrown rock. Surviving annotation: “K − V big... bad! Get this over with quick!”]
Here we are anthropomorphizing the rock by saying that it ‘wants’ to minimize its
action. This is okay if we don’t take it too seriously. Indeed, one of the virtues of the
Principle of Least Action is that it lets us put ourselves in the position of some physical
system and imagine what we would do to minimize the action.
There is another way to make progress on understanding ‘why’ action is minimized:
history. Historically there were two principles that were fairly easy to deduce from ob-
servations of nature: (i) the principle of minimum energy used in statics, and (ii) the
principle of least time, used in optics. By putting these together, we can guess the prin-
ciple of least action. So, let us recall these earlier minimum principles.
[Figure 1.3: a see-saw or lever. Labels: $L_1$, $L_2$, $m_1$, $m_2$.]
a see-saw or lever (Fig. 1.3), and he found that this would be in equilibrium if
\[
  m_1 L_1 = m_2 L_2 .
\]
Later D’Alembert understood this using his “principle of virtual work”. He considered
moving the lever slightly, i.e., infinitesimally. He claimed that in equilibrium the infinitesimal
work done by this motion is zero! He also claimed that the work done on the $i$th
body is
\[
  dW_i = F_i\, dq_i
\]
and gravity pulls down with a force $m_i g$, so
\[
  dW_1 = (0, 0, -m_1 g) \cdot (0, 0, -L_1\, d\theta) = m_1 g L_1\, d\theta
\]
and similarly,
\[
  dW_2 = -m_2 g L_2\, d\theta
\]
[Figure: the lever displaced by an angle $d\theta$. Labels: $d\theta$, $dq_1$, $dq_2$.]
Now D’Alembert’s principle says that equilibrium occurs when the “virtual work” $dW =
dW_1 + dW_2$ vanishes for all $d\theta$ (that is, for all possible infinitesimal motions). This happens
when
\[
  m_1 L_1 - m_2 L_2 = 0
\]
which is just as Archimedes wrote.
The forces and constraints on a system may be time dependent. So equal
infinitesimal displacements of the system might result in the forces $F_i$ acting on the
system doing different amounts of work at different times. To displace a system by $\delta r_i$ for
each position coordinate, and yet remain consistent with all the constraints and forces at a
given instant of time $t$, without any time interval passing, is called a virtual displacement.
It’s called ‘virtual’ because it cannot be realized: any actual displacement would occur
over a finite time interval, during which the forces and constraints might
change. Now call the work done by this hypothetical virtual displacement, $F_i \cdot \delta r_i$, the
virtual work. Consider a system in the special state of being in equilibrium, i.e., when
$\sum_i F_i = 0$. Then because by definition the virtual displacements do not change the forces,
we must deduce that the virtual work vanishes for a system in equilibrium,
\[
  \sum_i F_i \cdot \delta r_i = 0 \quad (\text{when in equilibrium}) \tag{1.10}
\]
Note that in the above example we have two particles in $\mathbb{R}^3$ subject to a constraint
(they are pinned to the lever arm). However, a number $n$ of particles in $\mathbb{R}^3$ can be
treated as a single quasi-particle in $\mathbb{R}^{3n}$, and if there are constraints it can move in some
submanifold of $\mathbb{R}^{3n}$. So ultimately we need to study a particle on an arbitrary manifold.
But we’ll postpone such sophistication for a while.
For a particle in $\mathbb{R}^n$, D’Alembert’s principle simply says,
[Figure: graph of a potential $V$. Surviving labels: “unstable equilibrium” (twice), $V$.]
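\[
  \frac{d}{dt} \frac{\partial T}{\partial \dot q_i} - \frac{\partial T}{\partial q_i} = Q_i . \tag{1.11}
\]
\[
  \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 . \tag{1.12}
\]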
[Figure: a light ray travelling from A to C. Labels: A, B, C, C′, $\theta_1$, $\theta_2$, “medium 1”.]
(having normalized n so that for a vacuum n = 1). Someone guessed the explanation,
realizing that if the speed of light in a medium is proportional to 1/n, then light will
satisfy Snell’s law if the light minimizes the time it takes to get from A to C. In the case
of refraction it is the time that is important, not just the path distance. But the same
is true for the law of reflection, since in that case the path of minimum length gives the
same results as the path of minimum time.
So, not only is light the fastest thing around, it’s also always taking the quickest path
from here to there!
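The link between least time and Snell’s law is easy to test numerically. The sketch below is an illustration only: the geometry (A above a flat interface at $y = 0$, C below it), the refractive indices and the helper names `travel_time` and `minimize` are all assumptions. It minimizes the travel time over the crossing point on the interface and then compares $n_1 \sin\theta_1$ with $n_2 \sin\theta_2$.

```python
import math

N1, N2 = 1.0, 1.5   # refractive indices of the two media (illustrative values)
A = (0.0, 1.0)      # start point, in medium 1 (y > 0)
C = (1.0, -1.0)     # end point, in medium 2 (y < 0)


def travel_time(x):
    """Travel time (up to a factor 1/c) from A to C via the interface point (x, 0).

    The speed of light in each medium is taken proportional to 1/n.
    """
    return N1 * math.hypot(x - A[0], A[1]) + N2 * math.hypot(C[0] - x, C[1])


def minimize(f, lo, hi, iterations=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


x_best = minimize(travel_time, 0.0, 1.0)

# Sines of the angles between each ray segment and the normal to the interface.
sin_theta1 = (x_best - A[0]) / math.hypot(x_best - A[0], A[1])
sin_theta2 = (C[0] - x_best) / math.hypot(C[0] - x_best, C[1])
snell_mismatch = N1 * sin_theta1 - N2 * sin_theta2
```

At the time-minimizing crossing point `snell_mismatch` is zero up to numerical error, which is Snell’s law.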
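using
\[
  \frac{d}{ds} V(q_s(t))\Big|_{s=0} = \frac{dV}{dq}\, \frac{dq_s(t)}{ds}\Big|_{s=0}
\]
and
\[
  \delta(\dot q^2) = \frac{d \dot q_s^2(t)}{ds}\Big|_{s=0}
\]
then we have
\[
\begin{aligned}
  0 &= \int_{t_0}^{t_1} \left[ -\frac{dV}{dq} \frac{dq_s}{ds} + \frac{m}{2} \frac{d \dot q_s^2(t)}{ds} \right]_{s=0} dt \\
    &= \frac{d}{ds} \int_{t_0}^{t_1} \left[ -V(q_s(t)) + \frac{m}{2}\, \dot q_s^2(t) \right] dt \,\Big|_{s=0} \\
    &= \delta \int_{t_0}^{t_1} \left[ -V(q_s(t)) + \frac{m}{2}\, \dot q_s^2(t) \right] dt
\end{aligned}
\]
therefore
\[
  \delta \int_{t_0}^{t_1} \bigl( -V(q) + K \bigr)\, dt = 0
\]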
We’ve described how D’Alembert might have arrived at the principle of least action
by generalizing previously known energy minimization and least time principles. Still,
there’s something unsatisfying about the treatment so far. We do not really understand
why one must introduce the ‘inertia force’. We only see that it’s necessary to obtain
agreement with Newtonian mechanics (which is manifest in Eq.(1.13)).
We conclude with a few more words about this mystery. Recall from undergraduate
physics that in an accelerating coordinate system there is a fictitious force of magnitude $ma$; in a
rotating frame the familiar example is the centrifugal force. We use it, for example, to analyze simple physics in a rotating
reference frame. If you are inside the rotating system and you throw a ball straight ahead
it will appear to curve away from your target, and if you did not know that you were
rotating relative to the rest of the universe then you’d think there was a force on the ball
equal to the centrifugal force. If you are inside a big rapidly rotating drum then you’ll
also feel pinned to the walls. This is an example of an inertia force which comes from
using a funny coordinate system. In general relativity, one sees that — in a certain sense
— gravity is an inertia force!
Chapter 2
Equations of Motion
In this chapter we’ll start to look at the Lagrangian equations of motion in more
depth. We’ll look at some specific examples of problem solving using the Euler-Lagrange
equations. First we’ll show how the equations are derived.
[Figure 2.1: a classical system of two particles, with configuration space $Q = S^2 \times S^2$.]
Q”, which means you have to disabuse yourself of the notion that we’re dealing with real
particles. We are not: we are dealing with a single quasi-particle in an abstract higher
dimensional space. The single quasi-particle represents two real particles if we are talking
about the classical system in Fig. 2.1. Sometimes to make this clear we’ll talk about “the
system taking a path”, instead of “the particle taking a path”. It is then clear that when
we say “the system follows a path $q(t)$”, we’re referring to the point $q$ in configuration
space $Q$ that represents all of the particles in the real system.
Footnote 1: The tangent bundle $TQ$ will be referred to as configuration space. Later on, when we get to the chapter
on Hamiltonian mechanics, we’ll find a use for the cotangent bundle $T^*Q$, and normally we call this the
phase space.
So as time passes “the system” traces out a path
\[
  q : [t_0, t_1] \longrightarrow Q
\]
($\Gamma$ is an infinite dimensional manifold, but we won’t go into that for now.) Let the Lagrangian
$L$ for the system be any smooth function of position and velocity (not explicitly
of time, for simplicity),
\[
  L : TQ \longrightarrow \mathbb{R}
\]
and define the action, $S$:
\[
  S : \Gamma \longrightarrow \mathbb{R}
\]
by
\[
  S(q) := \int_{t_0}^{t_1} L(q, \dot q)\, dt \tag{2.1}
\]
The path that the quasi-particle will actually take is a critical point of $S$, in accord with
D’Alembert’s principle of least action. In other words, it is a path $q \in \Gamma$ such that for any
smooth 1-parameter family of paths $q_s \in \Gamma$ with $q_0 = q$, we have
\[
  \frac{d}{ds} S(q_s)\Big|_{s=0} = 0 \tag{2.2}
\]
We write
\[
  \frac{d}{ds}\Big|_{s=0} \quad \text{as} \quad \delta
\]
so Eq. (2.2) can be rewritten
\[
  \delta S = 0 \tag{2.3}
\]
2.1.1 Comments
What is a “1-parameter family of paths”? Well, a path is a curve, or a 1D manifold. So
the 1-parameter family is nothing more nor less than a set of well-defined paths $\{q_s\}$, each
one labeled by a parameter $s$. So a smooth 1-parameter family of paths will have $q(s)$
everywhere infinitesimally close to $q(s + \epsilon)$ for an infinitesimal hyperreal $\epsilon$. So in Fig. 2.2
[Figure residue from two figures. Labels: $q_s$, $q_0$; and $a$, $b$, $U$.]
we restrict attention to a subinterval $[t_0', t_1'] \subseteq [t_0, t_1]$ such that $q_s(t) \in U$ for $t_0' \le t \le t_1'$.
Let’s just go ahead and rename $t_0'$ and $t_1'$ as “$t_0$ and $t_1$” to drop the primes. We can
use the coordinate charts on $U$,
\[
\begin{aligned}
  \varphi : U &\longrightarrow \mathbb{R}^n \\
  x &\longmapsto \varphi(x) = (x^1, x^2, \dots, x^n)
\end{aligned}
\]
\[
\begin{aligned}
  d\varphi : TU &\longrightarrow T\mathbb{R}^n \cong \mathbb{R}^n \times \mathbb{R}^n \\
  (x, y) &\longmapsto d\varphi(x, y) = (x^1, \dots, x^n, y^1, \dots, y^n)
\end{aligned}
\]
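where we’ve used the given smoothness of $L$ and the Einstein summation convention for
repeated indices $i$. Note that we can write $\delta L$ as above using a local coordinate patch
because the path variations $\delta q$ are entirely trivial outside the patch for $U$. Continuing,
using the Leibniz rule
\[
  \frac{d}{dt} \left( \frac{\partial L}{\partial y}\, \delta q \right)
    = \left( \frac{d}{dt} \frac{\partial L}{\partial y} \right) \delta q + \frac{\partial L}{\partial y}\, \delta \dot q
\]
we have,
\[
  \delta S = \int_{t_0}^{t_1} \left[ \frac{\partial L}{\partial x^i} - \frac{d}{dt} \frac{\partial L}{\partial y^i} \right] \delta q^i(t)\, dt = 0 .
\]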
If this integral is to vanish as demanded by $\delta S = 0$, then it must vanish for all path
variations $\delta q$. Further, the boundary terms vanish because we deliberately chose $\delta q$ that
vanish at the endpoints $t_0$ and $t_1$ inside $U$. That means the term in brackets must be
identically zero, or
\[
  \frac{d}{dt} \frac{\partial L}{\partial y^i} - \frac{\partial L}{\partial x^i} = 0 \tag{2.4}
\]
This is necessary to get $\delta S = 0$ for all $\delta q$, but in fact it’s also sufficient. Physicists always
give the coordinates $x^i, y^i$ on $TU$ the symbols “$q^i$” and “$\dot q^i$”, despite the fact that these
also have another meaning, namely the $x^i$ and $y^i$ coordinates of the quantity
\[
  \bigl( q(t), \dot q(t) \bigr) \in TU .
\]
\[
  \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} = \frac{\partial L}{\partial q^i}
\]
\[
  Q = \mathbb{R}^n
\]
\[
  L(q, \dot q) = \tfrac12 m\, \dot q \cdot \dot q - V(q) = \tfrac12 m\, \dot q^i \dot q_i - V(q)
\]
So translating our example into general terms, if we conjure up some abstract La-
grangian then we can think of the independent variables as generalized positions and
velocities, and then the Euler-Lagrange equations can be interpreted as equations relat-
ing generalized concepts of momentum and force, and they say that
ṗ = F (2.5)
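The Euler-Lagrange equations can be tested on a concrete path without any symbolic algebra. The following Python sketch is illustrative only: it assumes the one-dimensional Lagrangian $L(x, y) = \tfrac12 y^2 - \tfrac12 x^2$ (unit mass on a unit spring) and approximates every derivative by a central difference. The names `partial_x`, `partial_y` and `el_residual` are hypothetical helpers introduced for this example. The residual $\frac{d}{dt}\frac{\partial L}{\partial y} - \frac{\partial L}{\partial x}$, that is $\dot p - F$, vanishes along a true solution and not along an arbitrary path.

```python
import math


def lagrangian(x, y):
    """Assumed example: L(x, y) = y^2/2 - x^2/2, with x the position and y the velocity."""
    return 0.5 * y * y - 0.5 * x * x


def partial_x(x, y, h=1e-5):
    """Central-difference estimate of dL/dx (the generalized force)."""
    return (lagrangian(x + h, y) - lagrangian(x - h, y)) / (2 * h)


def partial_y(x, y, h=1e-5):
    """Central-difference estimate of dL/dy (the generalized momentum)."""
    return (lagrangian(x, y + h) - lagrangian(x, y - h)) / (2 * h)


def el_residual(path, t, h=1e-4):
    """Estimate d/dt (dL/dy) - dL/dx along `path` at time t."""
    def velocity(s):
        return (path(s + h) - path(s - h)) / (2 * h)

    def momentum(s):
        return partial_y(path(s), velocity(s))

    p_dot = (momentum(t + h) - momentum(t - h)) / (2 * h)
    return p_dot - partial_x(path(t), velocity(t))


# q(t) = cos(t) solves q'' = -q, so its residual is numerically zero.
residual_solution = el_residual(math.cos, 1.0)
# q(t) = t^2 does not: here p_dot = 2 and dL/dx = -t^2, so the residual is 2 + t^2.
residual_other = el_residual(lambda t: t * t, 1.0)
```

The same `el_residual` works for any other one-dimensional `lagrangian` you care to substitute, within the accuracy of the finite differences.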
So there’s no surprise that in the mundane case of a single particle moving in $\mathbb{R}^3$,
parametrized by time $t$, this just recovers Newton II. Of course we can do all of our classical mechanics
with Newton’s laws; it’s just a pain in the neck to deal with the redundancies in $F = ma$
when we could use symmetry principles to vastly simplify many examples. It turns out
that the Euler-Lagrange equations are one of the reformulations of Newtonian physics
that make it highly convenient to introduce symmetries and the consequent simplifications.
Simplifications generally mean quicker, shorter solutions and more transparent analysis,
or at least more chance of insight into the characteristics of the system. The main thing
is that when we use symmetry to simplify the equations we are reducing the number
of independent variables, so we get closer to the fundamental degrees of freedom of the
system and cut out a lot of the chaff (so to speak) that comes with the full redundant
Newton equations.
One can of course introduce simplifications when solving Newton’s equations, it’s just
that it’s easier to do this when working with the Euler-Lagrange equations. Another good
reason to learn Lagrangian (or Hamiltonian) mechanics is that it translates better into
quantum mechanics.
Chapter 3
Lagrangians and Noether’s Theorem
If the form of a system of dynamical equations does not change under spatial translations
then the momentum is a conserved quantity. When the form of the equations is similarly
invariant under time translations then the total energy is a conserved quantity (a constant
of the equations of motion). Time and space translations are examples of 1-parameter
groups of transformations. Invariance under a group of transformations is precisely what
we mean by a symmetry in group theory. So symmetries of a dynamical system give
conserved quantities or conservation laws. The rigorous statement of all this is the content
of Noether’s theorem.
3.1 Time Translation
\[
  \Gamma = \{ q : \mathbb{R} \to Q \}.
\]
typically will not converge, so $S$ is then no longer a function on the space of paths.
Nevertheless, if $\delta q = 0$ outside of some finite interval, then the functional variation,
\[
  \delta S := \int_{-\infty}^{\infty} \frac{d}{ds} L\bigl( q_s(t), \dot q_s(t) \bigr)\Big|_{s=0}\, dt
\]
will converge, since the integrand is smooth and vanishes outside this interval. Moreover,
demanding that this $\delta S$ vanishes for all such variations $\delta q$ is enough to imply the
Euler-Lagrange equations:
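\[
\begin{aligned}
  \delta S &= \int_{-\infty}^{\infty} \frac{d}{ds} L\bigl( q_s(t), \dot q_s(t) \bigr)\Big|_{s=0}\, dt \\
           &= \int_{-\infty}^{\infty} \left[ \frac{\partial L}{\partial q_i}\, \delta q_i + \frac{\partial L}{\partial \dot q_i}\, \delta \dot q_i \right] dt \\
           &= \int_{-\infty}^{\infty} \left[ \frac{\partial L}{\partial q_i} - \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} \right] \delta q_i\, dt
\end{aligned}
\]
where again the boundary terms have vanished since $\delta q = 0$ near $t = \pm\infty$. To be explicit,
the first term in
\[
  \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i
    = \frac{d}{dt} \left( \frac{\partial L}{\partial \dot q^i}\, \delta q^i \right)
    - \left( \frac{d}{dt} \frac{\partial L}{\partial \dot q^i} \right) \delta q^i
\]
vanishes when we integrate. Then the whole thing vanishes for all compactly supported
smooth $\delta q$ iff
\[
  \frac{d}{dt} \frac{\partial L}{\partial \dot q_i} = \frac{\partial L}{\partial q_i} .
\]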
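Recall that
\[
  \frac{\partial L}{\partial \dot q_i} = p_i \quad \text{is the generalized momentum, by definition,}
\]
\[
  \frac{\partial L}{\partial q_i} = \dot p_i \quad \text{is the force, by the Euler-Lagrange equations.}
\]
Note the similarity to Hamilton’s equations: if you change $L$ to $H$ you need to stick
in a minus sign, and change variables from $\dot q$ to $p_i$ and eliminate $\dot p_i$.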
Generalized Coordinates
For Lagrangian mechanics we have been using generalized coordinates; these are the
$\{q_i, \dot q_i\}$. The $q_i$ are generalized positions, and the $\dot q_i$ are generalized velocities. The full
set of independent generalized coordinates represents the degrees of freedom of a particle,
or system of particles. So if we have $N$ particles then we’d typically have $6N$ generalized
coordinates (the “6” is for 3 space dimensions, and at each point a position and a
velocity). These can be in any reference frame or system of axes, so for example,
in a Cartesian frame, with two particles, in 3D space we’d have the $2 \times 3 = 6$ position
coordinates, and $2 \times 3 = 6$ velocities,
\[
  \{x_1, y_1, z_1, x_2, y_2, z_2\}, \quad \{u_1, v_1, w_1, u_2, v_2, w_2\}
\]
Canonical Coordinates
In Hamiltonian mechanics (which we have not yet fully introduced) we will find it more
useful to transform from generalized coordinates to canonical coordinates. The canonical
coordinates are a special set of coordinates on the cotangent bundle of the configuration
space manifold $Q$. They are usually written as a set of $(q^i, p_j)$ or $(x^i, p_j)$ with the $x$’s or $q$’s
denoting the coordinates on the underlying manifold and the $p$’s denoting the conjugate
momenta, which are 1-forms in the cotangent bundle at the point $q$ in the manifold.
It turns out that the $q^i$ together with the $p_j$ form a coordinate system on the cotangent
bundle $T^*Q$ of the configuration space $Q$, hence these coordinates are called the canonical
coordinates.
We will not discuss this here, but if you care to know, later on we’ll see that the
relation between the generalized coordinates and the canonical coordinates is given by
the Hamilton-Jacobi equations for a system.
3.2 Symmetry and Noether’s Theorem
\[
\begin{aligned}
  F : \mathbb{R} \times \Gamma &\longrightarrow \Gamma \\
  (s, q) &\longmapsto q_s, \quad \text{with } q_0 = q
\end{aligned}
\]
Remark: The simplest case is $\delta L = 0$, in which case we really have a way of moving
paths around ($q \mapsto q_s$) that doesn’t change the Lagrangian, i.e., a symmetry of $L$ in the
most obvious way. But $\delta L = \frac{d}{dt} \ell$ is a sneaky generalization whose usefulness will become
clear.
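\[
  p_i\, \delta q^i - \ell
\]
is conserved, that is, its time derivative is zero for any path $q \in \Gamma$ satisfying the
Euler-Lagrange equations. In other words, in boring detail,
\[
  \frac{d}{dt} \left[ \frac{\partial L}{\partial y^i}\bigl( q(t), \dot q(t) \bigr)\, \frac{d}{ds} q_s^i(t)\Big|_{s=0}
    - \ell\bigl( q(t), \dot q(t) \bigr) \right] = 0
\]
Proof.
\[
\begin{aligned}
  \frac{d}{dt} \bigl( p_i\, \delta q^i - \ell \bigr)
    &= \dot p_i\, \delta q^i + p_i\, \delta \dot q^i - \frac{d}{dt} \ell \\
    &= \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i - \delta L \\
    &= \delta L - \delta L = 0 .
\end{aligned}
\]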
“Okay, big deal” you might say. Before this can be of any use we’d need to find a
symmetry $F$. Then we’d need to find out what this $p_i\, \delta q^i - \ell$ business is that is conserved.
So let’s look at some examples.
Example
1. Conservation of Energy. (The most important example!)
All of our Lagrangian systems will have time translation invariance (because the
laws of physics do not change with time, at least not to any extent that we can tell).
So we have a one-parameter family of symmetries
\[
  q_s(t) = q(t + s)
\]
with
\[
  \delta L = \dot L
\]
for
\[
  \frac{d}{ds} L(q_s)\Big|_{s=0} = \frac{d}{dt} L = \dot L
\]
so here we take $\ell = L$ simply! We then get the conserved quantity
\[
  p_i\, \delta q^i - \ell = p_i\, \dot q^i - L
\]
which we normally call the energy. For example, if $Q = \mathbb{R}^n$, and if
\[
  L = \tfrac12 m\, \dot q^2 - V(q)
\]
then this quantity is
\[
  m\, \dot q \cdot \dot q - \left( \tfrac12 m\, \dot q \cdot \dot q - V \right) = \tfrac12 m\, \dot q^2 + V(q)
\]
The term in parentheses is $K - V$, and the right-hand side is $K + V$.
Let’s repeat this example, this time with a specific Lagrangian. It doesn’t matter what
the Lagrangian is, if it has 1-parameter families of symmetries then it’ll have conserved
quantities, guaranteed. The trick in physics is to write down a correct Lagrangian in the
first place! (Something that will accurately describe the system of interest.)
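3.3 Conserved Quantities from Symmetries
\[
\begin{aligned}
  F_s : \Gamma &\longrightarrow \Gamma \\
  q &\longmapsto q_s
\end{aligned}
\]
which satisfies
\[
  \delta L = \dot \ell
\]
for some function $\ell = \ell(q, \dot q)$ gives a conserved quantity
\[
  p_i\, \delta q^i - \ell
\]
\[
  q_s(t) = q(t + s)
\]
because
\[
  \delta L = \dot L
\]
so we get a conserved quantity called the total energy or Hamiltonian,
\[
  H = p_i\, \dot q^i - L \tag{3.1}
\]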
(You might prefer “Hamiltonian” to “total energy” because in general we are not in the
same configuration space as Newtonian mechanics; if you are doing Newtonian mechanics
then “total energy” is appropriate.)
For example: a particle on $\mathbb{R}^n$ in a potential $V$ has $Q = \mathbb{R}^n$, $L(q, \dot q) = \tfrac12 m\, \dot q^2 - V(q)$.
This system has
\[
  p_i\, \dot q^i = \frac{\partial L}{\partial \dot q^i}\, \dot q^i = m\, \dot q^2 = 2K
\]
so
\[
  H = p_i\, \dot q^i - L = 2K - (K - V) = K + V
\]
as you’d have hoped.
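The formula $H = p_i \dot q^i - L$ can be evaluated for any Lagrangian given as a black box. Here is a small Python sketch, offered as an illustration rather than as part of the notes: it assumes a one-dimensional system with the potential $V(q) = \cos q$ and mass 2, obtains $p = \partial L / \partial \dot q$ by a central difference, and compares $p \dot q - L$ with $K + V$. The names `lagrangian` and `hamiltonian` and the sample point are assumptions.

```python
import math

MASS = 2.0  # illustrative value


def potential(q):
    """Assumed example potential V(q) = cos(q)."""
    return math.cos(q)


def lagrangian(q, qdot):
    """L = K - V for a single particle in one dimension."""
    return 0.5 * MASS * qdot ** 2 - potential(q)


def hamiltonian(q, qdot, h=1e-6):
    """H = p*qdot - L, with p = dL/dqdot estimated by a central difference."""
    p = (lagrangian(q, qdot + h) - lagrangian(q, qdot - h)) / (2 * h)
    return p * qdot - lagrangian(q, qdot)


q, qdot = 0.7, -1.3
total_energy = 0.5 * MASS * qdot ** 2 + potential(q)  # K + V computed directly
noether_energy = hamiltonian(q, qdot)                 # p*qdot - L
```

The two numbers agree up to rounding error, as the computation $2K - (K - V) = K + V$ predicts.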
\[
  q_s(t) = q(t) + s\, v
\]
with
\[
  \delta L = 0
\]
because $\delta \dot q = 0$ and $L$ depends only on $\dot q$, not on $q$, in this particular case. (Since $L$ does not
depend upon $q^i$ we’ll call $q^i$ an ignorable coordinate; as above, these ignorables always give
symmetries, hence conserved quantities. It is often useful, therefore, to change coordinates
so as to make some of them ignorable if possible!)
In this example we get a conserved quantity called momentum in the $v$ direction:
\[
  p_i\, \delta q^i = m\, \dot q_i v^i = m\, \dot q \cdot v
\]
Aside: Note the subtle difference between two uses of the term “momentum”; here it
is a conserved quantity derived from space translation invariance, but earlier it was a
different thing, namely the momentum $\partial L / \partial \dot q^i = p_i$ conjugate to $q^i$. These two different
“momenta” happen to be the same in this example!
Since this is conserved for all $v$ we say that $m \dot q \in \mathbb{R}^n$ is conserved. (In fact the
whole Lie group $G = \mathbb{R}^n$ is acting as a translation symmetry group, and we’re getting a
$\mathfrak{g}$ $(= \mathbb{R}^n)$-valued conserved quantity!)
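which has
\[
  \delta L = \frac{\partial L}{\partial q^i}\, \delta q^i + \frac{\partial L}{\partial \dot q^i}\, \delta \dot q^i = m\, \dot q_i\, \delta \dot q^i
\]
now $q_i$ is ignorable and so $\partial L / \partial q^i = 0$, and $\partial L / \partial \dot q^i = p_i$, and
\[
\begin{aligned}
  \delta \dot q^i &= \frac{d}{ds}\, \dot q_s^i \Big|_{s=0} \\
                  &= \frac{d}{ds} \frac{d}{dt} \bigl( e^{sX} q \bigr) \Big|_{s=0} \\
                  &= \frac{d}{dt}\, X q \\
                  &= X \dot q
\end{aligned}
\]
So,
\[
  \delta L = m\, \dot q_i X^i_j\, \dot q^j = m\, \dot q \cdot (X \dot q) = 0
\]
\[
  m ( \dot q_i q^j - \dot q_j q^i )
\]
\[
  m\, \dot q \times q
\]
Note that above we have assumed one can construct a basis for $\mathfrak{so}(n)$ using matrices
of the form assumed for $X$, i.e., skew symmetric with $\pm 1$ in the $ij$ and $ji$
elements respectively, otherwise zero.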
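A quick numerical check of the two facts used above, written as a hedged Python sketch: for a skew-symmetric $X$ we have $\dot q \cdot (X \dot q) = 0$, and for a free particle the Noether quantity $p_i\, \delta q^i = m\, \dot q \cdot (X q)$ does not change in time. The particular matrix, mass, initial data and helper names (`matvec`, `dot`, `conserved`) are assumptions chosen for illustration.

```python
MASS = 2.0

# A basis element of so(3): skew symmetric, generating rotations in the 1-2 plane.
X = [
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

Q0 = [1.0, 2.0, 3.0]     # initial position (illustrative)
V0 = [0.5, -1.0, 2.0]    # constant velocity of the free particle (illustrative)


def matvec(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def dot(a, b):
    """Euclidean dot product."""
    return sum(x * y for x, y in zip(a, b))


def position(t):
    """Free-particle path q(t) = Q0 + t * V0."""
    return [q + t * v for q, v in zip(Q0, V0)]


def conserved(t):
    """Noether quantity p_i * delta q^i = m * qdot . (X q) at time t."""
    return MASS * dot(V0, matvec(X, position(t)))


delta_L = MASS * dot(V0, matvec(X, V0))   # m * qdot . (X qdot), vanishes by skew symmetry
change = conserved(3.7) - conserved(0.0)  # vanishes because the quantity is conserved
```

With this choice of $X$ the conserved quantity is, up to sign, the component of angular momentum about the third axis.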
I mentioned earlier that we can do mechanics with any Lagrangian, but if we want to
be useful we’d better pick a Lagrangian that actually describes a real system. But how
do we do that? All this theory is fine but is useless unless you know how to apply it.
The above examples were for a particularly simple system, a free particle, for which the
Lagrangian is just the kinetic energy, since there is no potential energy variation for a
free particle. We’d like to know how to solve more complicated dynamics.
The general idea is to guess the kinetic energy and potential energy of the particle (as
functions of your generalized positions and velocities) and then let,
L=K−V
So we are not using Lagrangians directly to tell us what the fundamental physical laws
should be; instead we plug in some assumed physics and use the Lagrangian approach to
solve the system of equations. If we like, we can then compare our answers with experiments,
which indirectly tells us something about the physical laws, but only provided
the Lagrangian formulation of mechanics is itself a valid procedure in the first place.
To see how the formalisms in this chapter function in practice, let’s do some problems.
The Lagrangian approach is vastly superior to the simplistic $F = ma$ formulation of mechanics:
it allows the configuration space to be any manifold, and allows us to easily use
any coordinates we wish.
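[Figure: the Atwood machine. Labels: $x$, $\ell - x$, $m_1$, $m_2$.]
\[
  K = \tfrac12 (m_1 + m_2) \left( \frac{d}{dt} (\ell - x) \right)^2 = \tfrac12 (m_1 + m_2)\, \dot x^2
\]
\[
  V = -m_1 g x - m_2 g (\ell - x)
\]
so
\[
  L = K - V = \tfrac12 (m_1 + m_2)\, \dot x^2 + m_1 g x + m_2 g (\ell - x)
\]
The configuration space is $Q = (0, \ell)$, and $x \in (0, \ell)$ (we could use the “owns” symbol $\ni$
here and write $Q = (0, \ell) \ni x$). Moreover $TQ = (0, \ell) \times \mathbb{R} \ni (x, \dot x)$. As usual $L : TQ \to \mathbb{R}$.
Note that solutions of the Euler-Lagrange equations will only be defined for some time
$t \in \mathbb{R}$, as eventually the solution reaches the “edge” of $Q$.
The momentum is:
\[
  p = \frac{\partial L}{\partial \dot x} = (m_1 + m_2)\, \dot x
\]
and the force is:
\[
  F = \frac{\partial L}{\partial x} = (m_1 - m_2)\, g
\]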
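Since $\dot p = F$ here reads $(m_1 + m_2)\ddot x = (m_1 - m_2) g$, the acceleration is constant and the remark about reaching the “edge” of $Q$ can be made quantitative. The Python sketch below is an illustration under stated assumptions: the system starts at rest at $x_0$, and the function names `acceleration` and `exit_time` and the numerical values are made up for the example.

```python
import math

G = 9.81  # gravitational acceleration (illustrative value)


def acceleration(m1, m2):
    """Constant acceleration of x, from (m1 + m2) * x'' = (m1 - m2) * g."""
    return (m1 - m2) * G / (m1 + m2)


def exit_time(x0, length, m1, m2):
    """Time at which x(t), released from rest at x0, reaches the edge of Q = (0, length).

    Returns math.inf when the masses balance and the system never moves.
    """
    a = acceleration(m1, m2)
    if a == 0:
        return math.inf
    # x increases towards `length` if m1 is heavier, decreases towards 0 otherwise.
    distance = (length - x0) if a > 0 else x0
    return math.sqrt(2 * distance / abs(a))


a_example = acceleration(3.0, 1.0)
t_exit = exit_time(0.5, 2.0, 3.0, 1.0)
x_at_exit = 0.5 + 0.5 * a_example * t_exit ** 2  # position when the solution stops being defined
```

After `t_exit` the solution of the Euler-Lagrange equations has left $Q$, which is the sense in which it is only defined for some time.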
3.4 Example Problems
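[Figure: the mass on a table attached to a hanging mass by a string through a hole. Labels: $r$, $m_1$, $m_2$.]
\[
  F_r = \frac{\partial L}{\partial r} = m_1 r\, \dot\theta^2 - g m_2
\]
\[
  F_\theta = \frac{\partial L}{\partial \theta} = 0, \quad (\theta \text{ is ignorable})
\]
Let’s use our conservation law here to eliminate $\dot\theta$ from the first equation:
\[
  \dot\theta = \frac{J}{m_1 r^2}
\]
so
\[
  (m_1 + m_2)\, \ddot r = \frac{J^2}{m_1 r^3} - m_2 g
\]
\[
  F_r = \frac{J^2}{m_1 r^3} - m_2 g
\]
which could come from an “effective potential” $V(r)$ such that $dV/dr = -F_r$. So integrate
$-F_r$ to find $V(r)$:
\[
  V(r) = \frac{J^2}{2 m_1 r^2} + m_2 g r
\]
This is a sum of two terms that looks like Fig. 3.1.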
If $\dot\theta(t = 0) = 0$ then there is no centrifugal force and the disk will be pulled into the
hole until it gets stuck. At that time the disk reaches the hole, which is topologically the
center of the disk that has been removed from $Q$, so then we’ve hit the boundary of $Q$
and our solution is broken.
At $r = r_0$, the minimum of $V(r)$, our disk of mass $m_1$ will be in a stable circular orbit of
radius $r_0$ (which depends upon $J$). Otherwise we get orbits like Fig. 3.2.
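The radius of the circular orbit follows from setting $F_r = 0$ in the effective-potential picture, which gives $r_0 = \bigl( J^2 / (m_1 m_2 g) \bigr)^{1/3}$ (a one-line consequence of the formula for $F_r$, not a formula quoted from the notes). The Python sketch below, with made-up parameter values and helper names `v_eff` and `f_r`, checks that the force vanishes there and that $V$ is larger on either side.

```python
M1, M2 = 1.0, 2.0   # masses of the disk and of the hanging weight (illustrative)
G = 9.81            # gravitational acceleration
J = 3.0             # conserved angular momentum (illustrative)


def v_eff(r):
    """Effective potential V(r) = J^2 / (2 m1 r^2) + m2 g r."""
    return J * J / (2 * M1 * r * r) + M2 * G * r


def f_r(r):
    """Radial force F_r = -dV/dr = J^2 / (m1 r^3) - m2 g."""
    return J * J / (M1 * r ** 3) - M2 * G


# Solve F_r(r0) = 0 for r0.
r0 = (J * J / (M1 * M2 * G)) ** (1.0 / 3.0)

force_at_r0 = f_r(r0)
is_minimum = v_eff(r0) < v_eff(0.9 * r0) and v_eff(r0) < v_eff(1.1 * r0)
```

Because the first term of $V$ blows up as $r \to 0$ whenever $J \neq 0$, a disk with any angular momentum at all never reaches the hole.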
[Figure residue: plot of $V(r)$ against $r$ with $r_0$ marked. Labels: “attractive gravitational potential”, “no swinging allowed!”.]
Figure 3.2: Orbits for the disc and gravitating mass system.
\[
  g(v, w) = v^0 w^0 - v^1 w^1 - \dots - v^n w^n = \eta_{\mu\nu}\, v^\mu w^\nu
\]
where
\[
  \eta_{\mu\nu} =
  \begin{pmatrix}
    1 & 0 & 0 & \dots & 0 \\
    0 & -1 & 0 & \dots & 0 \\
    0 & 0 & -1 & & 0 \\
    \vdots & \vdots & & \ddots & \vdots \\
    0 & 0 & 0 & \dots & -1
  \end{pmatrix}
\]
In special relativity we take spacetime to be the configuration space of a single point
particle, so we let $Q$ be Minkowski spacetime, i.e., $\mathbb{R}^{n+1} \ni (x^0, \dots, x^n)$ with the metric
$\eta_{\mu\nu}$ defined above. Then the path of the particle is,
\[
  q : \mathbb{R}\ (\ni t) \longrightarrow Q
\]
where $t$ is a completely arbitrary parameter for the path, not necessarily $x^0$, and not
necessarily proper time either. We want some Lagrangian $L : TQ \to \mathbb{R}$, i.e., $L(q^i, \dot q^i)$, such
that the Euler-Lagrange equations will dictate how our free particle moves at a constant
velocity. Many Lagrangians do this, but the “best” one(s) give an action that is independent
of the parameterization of the path, since the parameterization is “unphysical”
(it can’t be measured). So the action
\[
  S(q) = \int_{t_0}^{t_1} L\bigl( q^i(t), \dot q^i(t) \bigr)\, dt
\]
for $q : [t_0, t_1] \to Q$, should be independent of $t$. The obvious candidate for $S$ is mass times
arclength,
\[
  S = m \int_{t_0}^{t_1} \sqrt{ \eta_{ij}\, \dot q^i(t)\, \dot q^j(t) }\; dt
\]
or rather the Minkowski analogue of arclength, called proper time, at least when q̇ is a
timelike vector, i.e., ηij q̇ i q̇ j > 0, which says q̇ points into the future (or past) lightcone
and makes S real, in fact it’s then the time ticked off by a clock moving along the path q :
[t0 , t1 ] → Q. By “obvious candidate” we are appealing somewhat to physical intuition and
Timelike
Lightlike
Spacelike
generalization. In Euclidean space, free particles follow straight paths, so the arclength
or pathlength variation is an extremum, and we expect the same behavior in Minkowski
3.4 Example Problems 33
spacetime. Also, the arclength does not depend upon the parameterization, and lastly,
the mass m merely provides the correct units for ‘action’.
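So let’s take
\[ L = m\sqrt{\eta_{ij}\,\dot q^i \dot q^j} \tag{3.2} \]
and work out the Euler-Lagrange equations. We have
\[ \begin{aligned}
p_i = \frac{\partial L}{\partial \dot q^i} &= m\,\frac{\partial}{\partial \dot q^i}\sqrt{\eta_{jk}\,\dot q^j \dot q^k} \\
&= m\,\frac{2\eta_{ij}\dot q^j}{2\sqrt{\eta_{jk}\,\dot q^j \dot q^k}} \\
&= m\,\frac{\eta_{ij}\dot q^j}{\sqrt{\eta_{jk}\,\dot q^j \dot q^k}} = \frac{m\dot q_i}{\|\dot q\|}
\end{aligned} \]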
(Note the numerator is “mass times 4-velocity”, at least when $n = 3$ for a real single particle system, but we’re actually in a more general $(n+1)$-dimensional spacetime, so it’s more like “mass times $(n+1)$-velocity”.) Now note that this $p_i$ doesn’t change when we change the parameter to accomplish $\dot q \mapsto \alpha \dot q$. The Euler-Lagrange equations say
\[ \dot p_i = F_i = \frac{\partial L}{\partial q^i} = 0. \]
The meaning of this becomes clearer if we use “proper time” as our parameter (like parameterizing a curve by its arclength) so that
\[ \int_{t_0}^{t_1} \|\dot q\|\,dt = t_1 - t_0, \qquad \forall\, t_0, t_1 \]
which fixes the parametrization up to an additive constant. This implies $\|\dot q\| = 1$, so that
\[ p_i = m\,\frac{\dot q_i}{\|\dot q\|} = m\dot q_i \]
and the Euler-Lagrange equations say
\[ \dot p_i = 0 \;\Rightarrow\; m\ddot q_i = 0 \]
so our (free) particle moves unaccelerated along a straight line, which is as we desired (expected).
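The claim that $p_i = m\dot q_i/\|\dot q\|$ is unchanged under $\dot q \mapsto \alpha\dot q$ can be spot-checked numerically. The sketch below assumes $\alpha > 0$ and a timelike $\dot q$; the helper names and sample numbers are illustrative assumptions, not part of the notes.

```python
import math

def minkowski_norm(v):
    """||v|| = sqrt(v0^2 - v1^2 - ... - vn^2) for a timelike vector v."""
    return math.sqrt(v[0]**2 - sum(x**2 for x in v[1:]))

def momentum(qdot, m):
    """p_i = m * eta_ij qdot^j / ||qdot|| (index lowered with eta)."""
    norm = minkowski_norm(qdot)
    lowered = [qdot[0]] + [-x for x in qdot[1:]]
    return [m * x / norm for x in lowered]

# Hypothetical timelike velocity in 3+1 dimensions and a positive rescaling.
qdot = [2.0, 0.5, -0.3, 1.0]
alpha = 3.0
p = momentum(qdot, 1.5)
p_scaled = momentum([alpha * x for x in qdot], 1.5)
print(p)
print(p_scaled)
```

Both printed lists agree up to rounding, which is the reparameterization invariance of the momentum.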
Comments
This Lagrangian from Eq. (3.2) has lots of symmetries coming from reparameterizing the path, so Noether’s theorem yields lots of conserved quantities for the relativistic free particle. This is in fact called “the problem of time” in general relativity; here we see it starting to show up in special relativity.
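These reparameterization symmetries work as follows. Consider any (smooth) 1-parameter family of reparameterizations, i.e., diffeomorphisms
\[ f_s : \mathbb{R} \longrightarrow \mathbb{R} \]
with $f_0 = 1_{\mathbb{R}}$, the identity. These act on the space of paths $\Gamma = \{q : \mathbb{R} \to Q\}$ as follows: given any $q \in \Gamma$ we get
\[ q_s(t) = q\big(f_s(t)\big) \]
where we should note that $q_s$ is physically indistinguishable from $q$. Let’s show that
\[ \delta L = \dot\ell \qquad \text{(when E-L eqns. hold)} \]
and work out the corresponding Noether quantity
\[ p_i\,\delta q^i - \ell. \]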
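Here we go then.
\[ \begin{aligned}
\delta L &= \frac{\partial L}{\partial q^i}\,\delta q^i + \frac{\partial L}{\partial \dot q^i}\,\delta\dot q^i \\
&= p_i\,\delta\dot q^i \\
&= \frac{m\dot q_i}{\|\dot q\|}\,\frac{d}{ds}\,\dot q_s^i(t)\Big|_{s=0} \\
&= \frac{m\dot q_i}{\|\dot q\|}\,\frac{d}{dt}\frac{d}{ds}\,q^i\big(f_s(t)\big)\Big|_{s=0} \\
&= \frac{m\dot q_i}{\|\dot q\|}\,\frac{d}{dt}\Big[\dot q^i\big(f_s(t)\big)\,\frac{d f_s(t)}{ds}\Big]_{s=0} \\
&= \frac{m\dot q_i}{\|\dot q\|}\,\frac{d}{dt}\big(\dot q^i\,\delta f\big) \\
&= \frac{d}{dt}\big(p_i\,\dot q^i\,\delta f\big)
\end{aligned} \]
where $\delta f = \frac{d}{ds} f_s(t)\big|_{s=0}$, and in the last step we used the E-L eqns., i.e. $\frac{d}{dt}p_i = 0$, so $\delta L = \dot\ell$ with $\ell = p_i\,\dot q^i\,\delta f$.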
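So to recap a little: we saw the free relativistic particle has
\[ L = m\|\dot q\| = m\sqrt{\eta_{ij}\,\dot q^i \dot q^j} \]
\[ \begin{aligned}
\delta L &= \frac{\partial L}{\partial q^i}\,\delta q^i + \frac{\partial L}{\partial \dot q^i}\,\delta\dot q^i \\
&= p_i\,\delta\dot q^i, \qquad (\text{since } \partial L/\partial q^i = 0, \text{ and } \partial L/\partial\dot q^i = p_i) \\
&= p_i\,\frac{d}{dt}\,\delta q^i \\
&= p_i\,\frac{d}{dt}\big(\dot q^i\,\delta f\big) \\
&= \frac{d}{dt}\big(p_i\,\dot q^i\,\delta f\big), \qquad \text{and set } p_i\,\dot q^i\,\delta f = \ell
\end{aligned} \]
so Noether’s theorem gives a conserved quantity
\[ p_i\,\delta q^i - \ell = p_i\,\dot q^i\,\delta f - p_i\,\dot q^i\,\delta f = 0. \]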
1. These are symmetries that permute different mathematical descriptions of the same
physical situation—in this case reparameterizations of a path.
2. These symmetries make it impossible to compute $q(t)$ given $q(0)$ and $\dot q(0)$: if $q(t)$ is a solution then so is $q(f(t))$ for any reparameterization $f : \mathbb{R} \to \mathbb{R}$. We have a high degree of non-uniqueness of solutions to the Euler-Lagrange equations.
3. These symmetries give conserved quantities that work out to equal zero!
Note that (1) is a subjective criterion, (2) and (3) are objective, and (3) is easy to
test, so we often use (3) to distinguish gauge symmetries from physical symmetries.
qs (t) = q(t + s)
and this is an example of a reparametrization (with δf = 1), so we see from the previous
results that the Hamiltonian is zero!
H = 0.
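The vanishing Hamiltonian can be checked directly from $H = p_i\dot q^i - L$ with $L = m\|\dot q\|$ and $p_i = m\dot q_i/\|\dot q\|$, since $p_i\dot q^i = m\|\dot q\|$. A minimal sketch, with made-up sample values and helper names that are assumptions for illustration:

```python
import math

def minkowski_norm(v):
    """||v|| = sqrt(v0^2 - v1^2 - ... - vn^2), assuming v is timelike."""
    return math.sqrt(v[0]**2 - sum(x**2 for x in v[1:]))

def hamiltonian(qdot, m):
    """H = p_i qdot^i - L for the free relativistic particle L = m ||qdot||."""
    norm = minkowski_norm(qdot)
    lowered = [qdot[0]] + [-x for x in qdot[1:]]      # qdot_i = eta_ij qdot^j
    p = [m * x / norm for x in lowered]               # p_i = m qdot_i / ||qdot||
    pairing = sum(pi * vi for pi, vi in zip(p, qdot)) # p_i qdot^i
    return pairing - m * norm

# Hypothetical timelike velocities; H should vanish for each of them.
samples = [[1.0, 0.0, 0.0, 0.0], [2.0, 0.5, -0.3, 1.0], [5.0, 3.0, 1.0, 2.0]]
values = [hamiltonian(v, 1.5) for v in samples]
print(values)
```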
qs (t) = q(t) + s w
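\[ \begin{aligned}
\delta L &= \frac{\partial L}{\partial q^i}\,\delta q^i + \frac{\partial L}{\partial \dot q^i}\,\delta\dot q^i \\
&= p_i\,\delta\dot q^i, \qquad (\text{since } \partial L/\partial q^i = 0 \text{ and } \partial L/\partial\dot q^i = p_i) \\
&= p_i \cdot 0 = 0
\end{aligned} \]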
3.6 Relativistic Particle in an Electromagnetic Field 37
[Figure: the path $q$ and its translate $q_s$, displaced by the vector $w$.]
\[ p_i\,\delta q^i - \ell = p_i w^i \]
\[ p = (p_0, p_1, \ldots, p_n) \]
$p_0$ is energy, $(p_1, \ldots, p_n)$ is spatial momentum.
We’ve just about exhausted all the basic stuff that we can learn from the free particle.
So next we’ll add some external force via an electromagnetic field.
Note that since A is a 1-form it can be integrated (it is a linear combination of some basis
1-forms like the {dxi }).
Note that since A is a 1-form we can integrate it over an oriented manifold, but one
can also write the path integral using time t as a parameter, with Ai q̇ i dt the differential,
after dq i = q̇ i dt.
The Lagrangian in the above action, for a charge e with mass m in an electromagnetic
potential A is
\[ L(q, \dot q) = m\|\dot q\| + e A_i \dot q^i \tag{3.5} \]
so we can work out the Euler-Lagrange equations:
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m\,\frac{\dot q_i}{\|\dot q\|} + e A_i = m v_i + e A_i \]
3.7 Alternative Lagrangians 39
where $v \in \mathbb{R}^{n+1}$ is the velocity, normalized so that $\|v\| = 1$. Note that now momentum is no longer mass times velocity! That’s because we’re in $(n+1)$-dimensional spacetime, where the momentum is an $(n+1)$-vector. Continuing the analysis, we find the force
\[ F_i = \frac{\partial L}{\partial q^i} = \frac{\partial}{\partial q^i}\big(e A_j \dot q^j\big) = e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j \]
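So the Euler-Lagrange equations say (noting that $A_i = A_i\big(q(t)\big)$):
\[ \begin{aligned}
\dot p &= F \\
\frac{d}{dt}\big(m v_i + e A_i\big) &= e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j \\
m\,\frac{dv_i}{dt} &= e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j - e\,\frac{dA_i}{dt} \\
m\,\frac{dv_i}{dt} &= e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j - e\,\frac{\partial A_i}{\partial q^j}\,\dot q^j \\
&= e\left(\frac{\partial A_j}{\partial q^i} - \frac{\partial A_i}{\partial q^j}\right)\dot q^j
\end{aligned} \]
The term in parentheses is $\mathcal{F}_{ij}$, the electromagnetic field, $\mathcal{F} = dA$. So we get the following equations of motion
\[ m\,\frac{dv_i}{dt} = e\,\mathcal{F}_{ij}\,\dot q^j \qquad \text{(Lorentz force law)} \tag{3.6} \]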
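To make the passage from the potential $A$ to the field $\partial_i A_j - \partial_j A_i$ concrete, here is a small finite-difference sketch in $1+1$ dimensions. The potential `A` below is an arbitrary made-up example, and the helper names are assumptions; for this choice the exact answer is $\mathcal{F}_{01} = 2q^0 - q^0 = q^0$.

```python
def A(q):
    """Hypothetical potential: A_0 = q0*q1, A_1 = q0^2 (illustration only)."""
    q0, q1 = q
    return [q0 * q1, q0**2]

def partial_derivatives(f, q, i, h=1e-6):
    """Central-difference estimate of d f_j / d q^i for every component j."""
    q_plus, q_minus = list(q), list(q)
    q_plus[i] += h
    q_minus[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(q_plus), f(q_minus))]

def field_strength(q):
    """F_ij = d_i A_j - d_j A_i, antisymmetric by construction."""
    n = len(q)
    dA = [partial_derivatives(A, q, i) for i in range(n)]  # dA[i][j] = d_i A_j
    return [[dA[i][j] - dA[j][i] for j in range(n)] for i in range(n)]

def lorentz_force(q, qdot, e):
    """Right-hand side e * F_ij * qdot^j of the equation of motion."""
    F = field_strength(q)
    return [e * sum(F[i][j] * qdot[j] for j in range(len(q))) for i in range(len(q))]

F = field_strength([2.0, 1.0])
force = lorentz_force([2.0, 1.0], [1.0, 0.5], e=1.0)
print(F)
print(force)
```

Because the field is antisymmetric, the force is always orthogonal to $\dot q$ in the sense that $\mathcal{F}_{ij}\dot q^i\dot q^j = 0$.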
or with ‘proper time’ instead of ‘arclength’, where the 1-form A can be integrated over
a 1-dimensional path. A generalization (or specialization, depending on how you look at
it) would be to consider a Lagrangian for an extended object.
In string theory we boost the dimension by +1 and consider a string tracing out a 2D
surface as time passes (Fig. 3.3).
Can you infer an appropriate action for this system? Remember, the physical or
physico-philosophical principle we’ve been using is that the path followed by physical
objects minimizes the “activity” or “aliveness” of the system. Given that we presumably
cannot tamper with the length of the closed string, then the worldtube quantity analogous
to arclength or proper time would be the area of the worldtube (or worldsheet for an open
string). If the string is also assumed to be a source of electromagnetic field then we need
a 2-form to integrate over the 2D worldtube analogous to the 1-form integrated over the
pathline of the point particle. In string theory this is usually the “Kalb-Ramond field”,
call it $B$. To recover electrodynamic interactions it should be antisymmetric like $A$, but its tensor components will have two indices since it’s a 2-form. The string action can then be written
\[ S = \alpha \cdot (\text{area}) + e\int B \tag{3.7} \]
We’ve also replaced the point particle mass by the string tension $\alpha$ [mass·length$^{-1}$] to obtain the correct units for the action (since replacing arclength by area meant we had to compensate for the extra length dimension in the first term of the above string action).
This may still seem like we’ve pulled a rabbit out of a hat, and we haven’t yet checked that this action yields sensible dynamics! But supposing it does, would that justify our guesswork and intuition in arriving at Eq. (3.7)? Well, by now you’ve probably realized that one can have more than one form of action or Lagrangian that yields the same dynamics. So provided we supply reasonable, physically realistic heuristics, whatever Lagrangian or action we come up with will stand a good chance of describing some system with a healthy measure of physical verisimilitude.
That’s enough about string for now. The point was to illustrate the type of reasoning
that one can use in conjuring up a Lagrangian. It’s particularly useful when Newtonian
theory cannot give us a head start, i.e., in relativistic dynamics and in the physics of
extended particles.
• There’s no ugly square root, so it’s everywhere differentiable, and there’s no trouble
with paths being timelike or spacelike in direction, they are handled the same.
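\[ p_i = \frac{\partial L}{\partial \dot q^i} = m\dot q_i + e A_i \]
\[ F_i = \frac{\partial L}{\partial q^i} = e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j \]
\[ \frac{d}{dt}\big(m\dot q_i + e A_i\big) = e\,\frac{\partial A_j}{\partial q^i}\,\dot q^j \]
\[ m\ddot q_i = e\,\mathcal{F}_{ij}\,\dot q^j \]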
almost as before. (I’ve taken to using $\mathcal{F}$ here for the electromagnetic field tensor to avoid clashing with $F$ for the generalized force.) The only difference is that we have $m\ddot q_i$ instead of $m\dot v_i$ where $v_i = \dot q_i/\|\dot q\|$. So the old Euler-Lagrange equations of motion reduce to the
Of course we can write g(v, w) = gij v i w j in any basis, but for different bases gij will
have a different form.
3. g(x) varies smoothly with x.
[Figure: light compared with a particle; labels “n high”, “V high”, “faster”, “slower”.]
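\[ g_{ij} = n^2\,\delta_{ij} \]
that is, the square of the index of refraction $n : \mathbb{R}^n \to (0, \infty)$ times the usual Euclidean metric
\[ \delta_{ij} = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix} \]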
3.9 The Principle of Least Action and Geodesics 47
This is just like the free particle in general relativity (minimizing its proper time) except that now $g_{ij}$ is a Riemannian metric
\[ g(v, w) = g_{ij}\,v^i w^j \]
where $g(v, v) \ge 0$.
As before the Christoffel symbols $\Gamma$ are built from the derivatives of the metric $g$.
Now, what Jacobi did is show how the motion of a particle in a potential could be
viewed as a special case of this. Consider a particle of mass $m$ in Euclidean $\mathbb{R}^n$ with potential $V : \mathbb{R}^n \to \mathbb{R}$. It satisfies $F = ma$, i.e.,
\[ m\,\frac{d^2 q^i}{dt^2} = -\partial_i V \tag{3.10} \]
How did Jacobi see (3.10) as a special case of (3.9)? He considered a particle of energy
E and he chose the index of refraction to be
\[ n(q) = \sqrt{\frac{2}{m}\big(E - V(q)\big)} \]
which is just the speed of a particle of energy $E$ when the potential energy is $V(q)$, since
\[ \sqrt{\frac{2}{m}(E - V)} = \sqrt{\frac{2}{m}\cdot\frac{1}{2}\,m\|\dot q\|^2} = \|\dot q\|. \]
Note: this is precisely backwards compared to optics, where n(q) is proportional to the
reciprocal of the speed of light!! But let’s see that it works.
\[ \begin{aligned}
L &= \sqrt{g_{ij}(q)\,\dot q^i \dot q^j} \\
&= \sqrt{n^2(q)\,\delta_{ij}\,\dot q^i \dot q^j} \\
&= \sqrt{\tfrac{2}{m}\big(E - V(q)\big)\,\dot q^2}
\end{aligned} \]
where q̇ 2 = q̇ · q̇ is just the usual Euclidean dot product, v · w = δij v i w j . We get the
Euler-Lagrange equations,
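\[ p_i = \frac{\partial L}{\partial \dot q^i} = \sqrt{\frac{2}{m}(E - V)}\cdot\frac{\dot q_i}{\|\dot q\|} \]
\[ \begin{aligned}
F_i = \frac{\partial L}{\partial q^i} &= \partial_i\sqrt{\frac{2}{m}\big(E - V(q)\big)}\cdot\|\dot q\| \\
&= \frac{1}{2}\,\frac{-\frac{2}{m}\,\partial_i V}{\sqrt{\frac{2}{m}\big(E - V(q)\big)}}\cdot\|\dot q\|
\end{aligned} \]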
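Then $\dot p = F$ says,
\[ \frac{d}{dt}\left[\sqrt{\frac{2}{m}\big(E - V(q)\big)}\cdot\frac{\dot q_i}{\|\dot q\|}\right] = -\frac{1}{m}\,\partial_i V\,\frac{\|\dot q\|}{\sqrt{\frac{2}{m}(E - V)}} \]
Jacobi noticed that this is just $F = ma$, or $m\ddot q_i = -\partial_i V$, that is, provided we reparameterize $q$ so that
\[ \|\dot q\| = \sqrt{\frac{2}{m}\big(E - V(q)\big)}. \]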
Recall that our Lagrangian gives reparameterization invariant Euler-Lagrange equations!
This is the unification between least time (from optics) and least action (from mechanics)
that we sought.
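As a sanity check on Jacobi’s choice of parameterization, one can take a known Newtonian trajectory and confirm that its speed equals $n(q) = \sqrt{\tfrac{2}{m}(E - V(q))}$ along the way. The sketch below assumes a projectile in a uniform field, $V(q) = m g y$, launched from the origin; all numbers and function names are assumptions for illustration.

```python
import math

# Hypothetical projectile data: mass, gravity, initial velocity components.
m, g = 2.0, 9.81
vx, vy = 3.0, 4.0
E = 0.5 * m * (vx**2 + vy**2)  # total energy, since V = 0 at the launch point

def position(t):
    """Newtonian solution of m q'' = -grad V for V = m g y."""
    return (vx * t, vy * t - 0.5 * g * t**2)

def velocity(t):
    return (vx, vy - g * t)

def potential(q):
    return m * g * q[1]

def index_of_refraction(q):
    """Jacobi's n(q) = sqrt((2/m)(E - V(q)))."""
    return math.sqrt(2.0 / m * (E - potential(q)))

def speed(t):
    return math.hypot(*velocity(t))

times = [0.0, 0.1, 0.3, 0.6]
gaps = [abs(speed(t) - index_of_refraction(position(t))) for t in times]
print(gaps)
```

With $\|\dot q\| = n(q)$ the factor $\|\dot q\|/\sqrt{\tfrac{2}{m}(E-V)}$ in $\dot p = F$ is 1 and $p_i = \dot q_i$, so the equation reads $\ddot q_i = -\tfrac{1}{m}\partial_i V$.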
which is proper time when (Q, g) is a Lorentzian manifold, or arclength when (Q, g) is a
Riemannian manifold. We have
\[ g(q) : T_qQ \times T_qQ \to \mathbb{R}, \qquad (v, w) \mapsto g(v, w) \]
and it is bilinear.
2. w.r.t. a basis of $T_qQ$
\[ g(v, w) = \delta_{ij}\,v^i w^j \]
Let’s examine this last result a bit further. To get the desired equations for motion on $Q \times U(1)$ we need to give $Q \times U(1)$ a cleverly designed metric built from $g$ and $A$ where the amount of “spiralling” (the velocity in the $U(1)$ direction) is $e/m$.
[Figure: a geodesic spiralling in $Q \times U(1)$ above the apparent path in $Q$.]
The metric $h$ on $Q \times U(1)$ is built from $g$ and $A$ in a very simple way. Let’s pick coordinates $x^i$ on $Q$ where $i \in \{0, \ldots, n\}$ since we’re in $(n+1)$-dimensional spacetime, and $\theta$ is our local coordinate on $S^1$. The components of $h$ are
\[ \begin{aligned}
h_{ij} &= g_{ij} + A_i A_j \\
h_{\theta i} = h_{i\theta} &= -A_i \\
h_{\theta\theta} &= 1
\end{aligned} \]
In the Lagrangian approach we focus on the position and velocity of a particle, and
compute what the particle does starting from the Lagrangian L(q, q̇), which is a function
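\[ L : TQ \longrightarrow \mathbb{R} \]
where the tangent bundle is the space of position-velocity pairs. But we’re led to consider momentum
\[ p_i = \frac{\partial L}{\partial \dot q^i} \]
since the equations of motion tell us how it changes
\[ \frac{dp_i}{dt} = \frac{\partial L}{\partial q^i}. \]
\[ H = p_i\,\dot q^i - L(q, \dot q) \]
\[ H : T^*Q \longrightarrow \mathbb{R} \]
where the cotangent bundle is the space of position-momentum pairs. In this approach, position and momentum will satisfy Hamilton’s equations:
\[ \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i} \]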
52 From Lagrangians to Hamiltonians
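\[ \dot q \;\overset{\lambda}{\longmapsto}\; p_i = \frac{\partial L}{\partial \dot q^i} \]
So $\lambda$ is defined using $L : TQ \to \mathbb{R}$. Despite appearances, $\lambda$ can be defined in a coordinate-free way, as follows (referring to Fig. 4.1). We want to define “$\partial L/\partial \dot q^i$” in a coordinate-free way; it’s the “differential of $L$ in the vertical direction”, i.e., the $\dot q^i$ directions.
[Figure 4.1: the tangent bundle $TQ$ over $Q$, showing the fiber $T_qQ$ over the point $q$ and the tangent space $T_{(q,\dot q)}T_qQ$.]
We have
\[ \pi : TQ \longrightarrow Q, \qquad (q, \dot q) \longmapsto q \]
and
\[ d\pi : T(TQ) \longrightarrow TQ \]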
4.1 The Hamiltonian Approach 53
\[ VTQ = \ker d\pi \subseteq TTQ \]
that is,
\[ dL_{(q,\dot q)} : T_{(q,\dot q)}TQ \longrightarrow \mathbb{R}. \]
\[ f : V_{(q,\dot q)}TQ \longrightarrow \mathbb{R}. \]
But note
\[ V_{(q,\dot q)}TQ = T_{(q,\dot q)}(T_qQ) \]
and since $T_qQ$ is a vector space,
\[ T_{(q,\dot q)}T_qQ \cong T_qQ \]
in a canonical way$^2$. So $f$ gives a linear map
\[ p : T_qQ \longrightarrow \mathbb{R} \]
that is,
\[ p \in T_q^*Q \]
\[ \lambda : TQ \longrightarrow T^*Q, \qquad (q, \dot q) \longmapsto (q, p) \]
We call X the phase space of the system. In practice often X = T ∗ Q, then L is said to
be strongly regular.
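\[ L(q, \dot q) = \frac{1}{2}\,m\,g_{ij}\,\dot q^i \dot q^j - V(q) \]
Here
\[ p_i = \frac{\partial L}{\partial \dot q^i} = m\,g_{ij}\,\dot q^j \]
so
\[ \lambda(q, \dot q) = \big(q,\; m\,g(\dot q, -)\big) \]
\[ T_qQ \longrightarrow T_q^*Q, \qquad v \longmapsto g(v, -) \]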
is 1-1 and onto, i.e., the metric is nondegenerate. Thus λ is a diffeomorphism, which in
this case extends to all of T ∗ Q.
3
The missing object there “−” is of course any tangent vector, not inserted since λ itself is an operator
on tangent vectors, not the result of the operation.
4.2 Regular and Strongly Regular Lagrangians 55
\[ \lambda|_{T_qQ} : T_qQ \longrightarrow T_q^*Q, \qquad \dot q \longmapsto m\,g(\dot q, -) + eA(q) \]
This is terrible from the perspective of regularity properties: it’s not differentiable when $g_{ij}\dot q^i \dot q^j$ vanishes, and undefined when the same is negative. Where it is defined
\[ p_i = \frac{\partial L}{\partial \dot q^i} = \frac{m\,g_{ij}\,\dot q^j}{\|\dot q\|} \]
(where $\dot q$ is timelike), we can ask about regularity. Alas, the map $\lambda$ is not 1-1 where defined since multiplying $\dot q$ by some number has no effect on $p$! (This is related to the reparameterization invariance; this always happens with reparameterization-invariant Lagrangians.)
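so that
\[ p = \frac{\partial L}{\partial \dot q} = f'(\dot q) \]
This will be regular but not strongly so if $f' : \mathbb{R} \to \mathbb{R}$ is a diffeomorphism from $\mathbb{R}$ to some proper subset $U \subset \mathbb{R}$. For example, take $f(\dot q) = e^{\dot q}$ so $f' : \mathbb{R} \xrightarrow{\;\sim\;} (0, \infty) \subset \mathbb{R}$. So
\[ L(q, \dot q) = e^{\dot q} \qquad \text{(positive slope)} \]
or
\[ L(q, \dot q) = \sqrt{1 + \dot q^2} \qquad \text{(slope between $-1$ and $1$)} \]
and so forth.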
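The second example can be explored numerically. For $f(\dot q) = \sqrt{1 + \dot q^2}$ the momentum is $p = f'(\dot q) = \dot q/\sqrt{1 + \dot q^2}$, which is strictly increasing and stays inside $(-1, 1)$, so it can be inverted on that interval but never reaches all of $\mathbb{R}$. A minimal sketch, with function names that are assumptions for illustration:

```python
import math

def legendre_map(qdot):
    """p = f'(qdot) for f(qdot) = sqrt(1 + qdot^2)."""
    return qdot / math.sqrt(1.0 + qdot**2)

def inverse_legendre_map(p):
    """qdot as a function of p, defined only on the image (-1, 1)."""
    if not -1.0 < p < 1.0:
        raise ValueError("p lies outside the image of the Legendre map")
    return p / math.sqrt(1.0 - p**2)

velocities = [-100.0, -2.0, 0.0, 0.5, 3.0, 100.0]
momenta = [legendre_map(v) for v in velocities]
recovered = [inverse_legendre_map(p) for p in momenta]
print(momenta)
print(recovered)
```

Asking for the velocity at, say, $p = 2$ raises an error, which is the failure of strong regularity.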
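This lets us have the best of both worlds: we can identify $TQ$ with $X$ using $\lambda$. This lets us treat $q^i$, $p_i$, $L$, $H$, etc., all as functions on $X$ (or $TQ$), thus writing
\[ \dot q^i \qquad \text{(function on $TQ$)} \]
\[ \dot q^i \circ \lambda^{-1} \qquad \text{(function on $X$)} \]
In particular
\[ \dot p_i := \frac{\partial L}{\partial q^i} \qquad \text{(Euler-Lagrange eqn.)} \]
which is really a function on $TQ$, will be treated as a function on $X$. Now let’s calculate:
\[ \begin{aligned}
dL &= \frac{\partial L}{\partial q^i}\,dq^i + \frac{\partial L}{\partial \dot q^i}\,d\dot q^i \\
&= \dot p_i\,dq^i + p_i\,d\dot q^i
\end{aligned} \]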
4.3 Hamilton’s Equations 57
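while
\[ \begin{aligned}
dH &= d(p_i\,\dot q^i - L) \\
&= \dot q^i\,dp_i + p_i\,d\dot q^i - dL \\
&= \dot q^i\,dp_i + p_i\,d\dot q^i - (\dot p_i\,dq^i + p_i\,d\dot q^i) \\
&= \dot q^i\,dp_i - \dot p_i\,dq^i
\end{aligned} \]
so
\[ dH = \dot q^i\,dp_i - \dot p_i\,dq^i. \]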
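Assume the Lagrangian $L : TQ \to \mathbb{R}$ is regular, so
\[ \lambda : TQ \xrightarrow{\;\sim\;} X \subseteq T^*Q, \qquad (q, \dot q) \longmapsto (q, p) \]
\[ dL = \dot p_i\,dq^i + p_i\,d\dot q^i \]
\[ dH = \dot q^i\,dp_i - \dot p_i\,dq^i. \]
But we can also work out $dH$ directly, this time using local coordinates $(q^i, p_i)$, to get
\[ dH = \frac{\partial H}{\partial p_i}\,dp_i + \frac{\partial H}{\partial q^i}\,dq^i. \]
Since $dp_i$, $dq^i$ form a basis of 1-forms, we conclude:
\[ \dot q^i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q^i} \]
These are Hamilton’s Equations.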
really just lets us recover the velocity q̇ as a function of q and p, inverting the formula
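\[ p_i = \frac{\partial L}{\partial \dot q^i} \]
which gave $p$ as a function of $q$ and $\dot q$. So we get a formula for the map
\[ \lambda^{-1} : X \longrightarrow TQ, \qquad (q, p) \longmapsto (q, \dot q). \]
Given this, the other Hamilton equation
\[ \dot p_i = -\frac{\partial H}{\partial q^i} \]
is secretly the Euler-Lagrange equation
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} = \frac{\partial L}{\partial q^i}, \qquad \text{or} \qquad \dot p_i = \frac{\partial L}{\partial q^i}. \]
These are the same because
\[ \frac{\partial H}{\partial q^i} = \frac{\partial}{\partial q^i}\big(p_j\,\dot q^j - L\big) = -\frac{\partial L}{\partial q^i}. \]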
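\[ S : P \longrightarrow \mathbb{R} \]
by
\[ S(x) = \int_{t_0}^{t_1} \big(p_i\,\dot q^i - H\big)\,dt \]
where $p_i\,\dot q^i - H = L$. More precisely, write our path $x$ as $x(t) = \big(q(t), p(t)\big)$ and let
\[ S(x) = \int_{t_0}^{t_1} \left[p_i(t)\,\frac{d}{dt}q^i(t) - H\big(q(t), p(t)\big)\right] dt \]
Here we write $\frac{d}{dt}q^i$ instead of $\dot q^i$ to emphasize that we mean the time derivative rather than a coordinate in phase space.
Let’s show $\delta S = 0 \Leftrightarrow$ Hamilton’s equations.
\[ \begin{aligned}
\delta S &= \delta\int \big(p_i\,\dot q^i - H\big)\,dt \\
&= \int \big(\delta p_i\,\dot q^i + p_i\,\delta\dot q^i - \delta H\big)\,dt
\end{aligned} \]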
\[ h_{ij} = n^2\,g_{ij} \]
where $n : Q \to (0, \infty)$ is the index of refraction throughout space (generally not a constant).
and saw that the wavefront is the envelope of a bunch of little wavelets centered at points
along the big wavefront:
[Figure: the new wavefront as the envelope of small balls centered at points of the old wavefront.]
In short, the wavefront moves at unit speed in the normal direction with respect to the
“optical metric” h. We can think about the distance function
d : Q × Q −→ [0, ∞)
where Υ = {paths from q0 to q1 }. (Secretly this d(q0 , q1 ) is the least action—the infimum
of action over all paths from q0 to q1 .) Using this we get the wavefronts centered at q0 ∈ Q
as the level sets
{q : d(q0 , q) = c}
or at least for small $c > 0$, as depicted in Fig. 4.2.
[Figure 4.2: level sets of the distance from $q_0$.]
For larger $c$ the level sets can cease to be smooth (we say a catastrophe occurs) and then the wavefronts are no longer the level sets. This sort of situation can happen for topological reasons (as when the waves smash into each other in the back of Fig. 4.2) and it can also happen for geometrical reasons
4.4 Waves versus Particles—The Hamilton-Jacobi Equations 63
Figure 4.3:
(Fig. 4.3). Assuming no such catastrophes occur, we can approximate the waves of light
by a wavefunction:
\[ \psi(q) = A(q)\,e^{ik\,d(q, q_0)} \]
where $k$ is the wavenumber of the light (i.e., its color) and $A : Q \to \mathbb{R}$ describes the amplitude of the wave, which drops off far from $q_0$. This becomes the eikonal approximation in optics$^5$ once we figure out what $A$ should be.
Hamilton and Jacobi focused on distance d : Q × Q → [0, ∞) as a function of two
variables and called it $W$ = Hamilton’s principal function. They noticed,
\[ \frac{\partial}{\partial q_1^i}\,W(q_0, q_1) = (p_1)_i, \]
[Figure: a path from $q_0$ to $q_1$, with the covector $p_1$ drawn at $q_1$.]
where $p_1$ is a cotangent vector pointing normal to the wavefronts.
where (Q, h) is a Riemannian manifold, h is the optical metric, q0 ∈ Q is the light source,
k is the frequency and
W : Q × Q −→ [0, ∞)
is the distance function on Q, or Hamilton’s principal function:
where Υ is the space of paths from q0 to q and S(q) is the action of the path q, i.e.,
its arclength. This is begging to be generalized to other Lagrangian systems! (At least
retrospectively with the advantage of our historical perspective.) We also saw that
\[ \frac{\partial}{\partial q_1^i}\,W(q_0, q_1) = (p_1)_i \]
[Figure: a path from $q_0$ to $q_1$, with the covector $p_1$ drawn at $q_1$.]
points in this direction. In fact kp1i is the momentum of the light passing through q1 .
This foreshadows quantum mechanics! (Note: in QM, the momentum is a derivative
operator—we get p by differentiating the wavefunction!)
Jacobi generalized this to the motion of point particles in a potential $V : Q \to \mathbb{R}$, using the fact that a particle of energy $E$ traces out geodesics in the metric
\[ h_{ij} = \frac{2(E - V)}{m}\,g_{ij}. \]
We’ve seen this reduces point particle mechanics to optics, but only for particles of fixed energy $E$. Hamilton went further, and we now can go further still.
Suppose $Q$ is any manifold and $L : TQ \to \mathbb{R}$ is any function (Lagrangian). Define Hamilton’s principal function
\[ W : Q \times \mathbb{R} \times Q \times \mathbb{R} \longrightarrow \mathbb{R} \]
by
\[ W(q_0, t_0; q_1, t_1) = \inf_{q \in \Upsilon} S(q) \]
where
\[ \Upsilon = \big\{ q : [t_0, t_1] \to Q,\; q(t_0) = q_0,\; q(t_1) = q_1 \big\} \]
and
\[ S(q) = \int_{t_0}^{t_1} L\big(q(t), \dot q(t)\big)\,dt \]
Now W is just the least action for a path from (q0 , t0 ) to (q1 , t1 ); it’ll be smooth if (q0 , t0 )
and (q1 , t1 ) are close enough—so let’s assume that is true. In fact, we have
[Figure: a path from $(q_0, t_0)$ to $(q_1, t_1)$, with the covector $p_1$ drawn at the endpoint.]
\[ \begin{aligned}
\frac{\partial W}{\partial q_1^i} &= (p_1)_i, && \text{(momentum at time $t_1$)} \\
\frac{\partial W}{\partial q_0^i} &= -(p_0)_i, && \text{($-$momentum at time $t_0$)} \\
\frac{\partial W}{\partial t_1} &= -H_1, && \text{($-$energy at time $t_1$)} \\
\frac{\partial W}{\partial t_0} &= H_0, && \text{($+$energy at time $t_0$)}
\end{aligned} \]
($H_1 = H_0$ as energy is conserved). These last four equations are the Hamilton-Jacobi equations. The mysterious minus sign in front of energy was seen before in the 1-form,
\[ \beta = p_i\,dq^i - H\,dt \]
on the extended phase space $X \times \mathbb{R}$. Maybe the best way to get the Hamilton-Jacobi equations is from this extended phase space formulation. But for now let’s see how Hamilton’s principal function $W$ and variational principles involving least action also yield the Hamilton-Jacobi equations.
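These four relations can be tested on the simplest case. For a free particle on a line with $L = \tfrac12 m\dot q^2$, the action-minimizing path is the straight line, so $W = \tfrac12 m (q_1 - q_0)^2/(t_1 - t_0)$, with $p = m(q_1 - q_0)/(t_1 - t_0)$ and $H = p^2/2m$. The sketch below differentiates $W$ by central differences; the free-particle choice, the sample endpoints and the helper names are assumptions for illustration.

```python
# Hypothetical mass and endpoints.
m = 2.0
q0, t0, q1, t1 = 0.0, 0.0, 3.0, 2.0

def W(q0, t0, q1, t1):
    """Least action of a free particle: straight-line path between endpoints."""
    return 0.5 * m * (q1 - q0)**2 / (t1 - t0)

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

p = m * (q1 - q0) / (t1 - t0)   # constant momentum along the path
H = p**2 / (2.0 * m)            # conserved energy

dW_dq1 = derivative(lambda x: W(q0, t0, x, t1), q1)
dW_dq0 = derivative(lambda x: W(x, t0, q1, t1), q0)
dW_dt1 = derivative(lambda x: W(q0, t0, q1, x), t1)
dW_dt0 = derivative(lambda x: W(q0, x, q1, t1), t0)
print(dW_dq1, dW_dq0, dW_dt1, dW_dt0)
```

The four printed numbers should come out close to $p$, $-p$, $-H$ and $+H$ respectively.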
Given (q0 , t0 ), (q1 , t1 ), let
q : [t0 , t1 ] −→ Q
W (q0 , t0 ; q1 , t1 ) = S(q)
and thus vary the action-minimizing path, getting a variation δq which does not vanish
at t0 and t1 . We get
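\[ \begin{aligned}
\delta W &= \delta S \\
&= \delta\int_{t_0}^{t_1} L(q, \dot q)\,dt \\
&= \int_{t_0}^{t_1} \left(\frac{\partial L}{\partial q^i}\,\delta q^i + \frac{\partial L}{\partial \dot q^i}\,\delta\dot q^i\right) dt \\
&= \int_{t_0}^{t_1} \left(\frac{\partial L}{\partial q^i}\,\delta q^i - \dot p_i\,\delta q^i\right) dt + p_i\,\delta q^i\,\Big|_{t_0}^{t_1} \\
&= \int_{t_0}^{t_1} \left(\frac{\partial L}{\partial q^i} - \dot p_i\right)\delta q^i\,dt + p_i\,\delta q^i\,\Big|_{t_0}^{t_1}
\end{aligned} \]
The term in parentheses is zero because $q$ minimizes the action and the Euler-Lagrange equations hold. So we have
\[ \delta W = p_{1i}\,\delta q_1^i - p_{0i}\,\delta q_0^i \]
and so
\[ \frac{\partial W}{\partial q_1^i} = p_{1i}, \qquad \text{and} \qquad \frac{\partial W}{\partial q_0^i} = -p_{0i}. \]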
These are two of the four Hamilton-Jacobi equations! To get the other two, we need to
vary t0 and t1 :
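[Figure: the path from $(t_0, q_0)$ to $(t_1, q_1)$, with the endpoint times moved to $t_0 + \Delta t_0$ and $t_1 + \Delta t_1$.]
Now the change in $W$ will involve $\Delta t_0$ and $\Delta t_1$.
\[ (Q \times \mathbb{R})^2 \longrightarrow \Upsilon, \qquad u \longmapsto q \]
defined only when $(q_0, t_0)$ and $(q_1, t_1)$ are sufficiently close.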
where $\beta = p\,dq - H(q, p)\,dt$ is a 1-form on the extended phase space $X \times \mathbb{R}$, and $C$ is a curve in the extended phase space:
\[ C(t) = \big(q(t), p(t), t\big) \in X \times \mathbb{R}. \]
Note that $C$ depends on the curve $q \in \Upsilon$, which in turn depends upon $u = (q_0, t_0; q_1, t_1) \in (Q \times \mathbb{R})^2$. We are after the derivatives of $W$ that appear in the Hamilton-Jacobi relations, so let’s differentiate
\[ W(u) = \int_C \beta \]
with respect to $u$ and get the Hamilton-Jacobi equations from $\beta$. Let $u_s$ be a 1-parameter family of points in $(Q \times \mathbb{R})^2$ and work out
\[ \frac{d}{ds}W(u_s) = \frac{d}{ds}\int_{C_s} \beta \]
[Figure: the curves $C_0$ and $C_s$ in $X \times \mathbb{R}$, joined at their ends by the curves $A_s$ and $B_s$, with tangent vectors $v$ and $w$.]
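Let’s compare
\[ \int_{C_s} \beta \qquad \text{and} \qquad \int_{A_s + C_s + B_s} \beta = \int_{A_s} \beta + \int_{C_s} \beta + \int_{B_s} \beta \]
Since $C_0$ minimizes the action among paths with the given end-points, and the curve $A_s + C_s + B_s$ has the same end-points, we get
\[ \frac{d}{ds}\int_{A_s + C_s + B_s} \beta = 0 \]
(although $A_s + C_s + B_s$ is not smooth, we can approximate it by a path that is smooth). So
\[ \frac{d}{ds}\int_{C_s} \beta = \frac{d}{ds}\int_{B_s} \beta - \frac{d}{ds}\int_{A_s} \beta \qquad \text{at } s = 0. \]
Note
\[ \begin{aligned}
\frac{d}{ds}\int_{A_s} \beta &= \frac{d}{ds}\int_0^s \beta(A'_r)\,dr \\
&= \beta(A'_0)
\end{aligned} \]
where $A'_0 = v$ is the tangent vector of $A_s$ at $s = 0$. Similarly,
\[ \frac{d}{ds}\int_{B_s} \beta = \beta(w) \]
where $w = B'_0$. So,
\[ \frac{d}{ds}W(u_s) = \beta(w) - \beta(v) \]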
where $w$ keeps track of the change of $(q_1, p_1, t_1)$ as we move $C_s$ and $v$ keeps track of $(q_0, p_0, t_0)$. Now since $\beta = p_i\,dq^i - H\,dt$, we get
\[ \frac{\partial W}{\partial q_1^i} = (p_1)_i, \qquad \frac{\partial W}{\partial t_1} = -H \]
and similarly
\[ \frac{\partial W}{\partial q_0^i} = -(p_0)_i, \qquad \frac{\partial W}{\partial t_0} = H \]
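So, if we define a wavefunction
\[ \psi(q_0, t_0; q_1, t_1) = e^{iW(q_0, t_0; q_1, t_1)/\hbar} \]
then we get
\[ \frac{\partial \psi}{\partial t_1} = -\frac{i}{\hbar}\,H_1\,\psi, \qquad \frac{\partial \psi}{\partial q_1^i} = \frac{i}{\hbar}\,(p_1)_i\,\psi \]
which at the time of Hamilton and Jacobi’s research was interesting enough, but nowadays is thoroughly familiar from quantum mechanics!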