Lagrange's Equations: I Background
Arthur Jaffe
Physics 151 Handout
October 28, 2012
Background
A very special aspect of physics concerns a change of variables and its relation to how one formulates
the laws of physics. In particular, one often looks for covariant laws that have the same form in
different coordinate systems. One might mention that Einstein liked this notion a lot, and his
famous work on relativity (both special and general) revolves around these concepts. Furthermore
the importance of symmetry in physics is also closely tied to covariance.
However, the notion of covariance emerged much earlier in classical mechanics. The form of
Lagrange's equations remains the same in a wide variety of coordinate systems. We will see in these
notes that the equations have a certain covariance under change from one coordinate system to
another. The Lagrange equations of motion can often be written as
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 , \quad \text{for } i = 1, \dots, N . $$
Here the Lagrangian $L = L(q, \dot q, t)$ is a function¹ of the coordinate variables $q_i$, for $i = 1, \dots, N$,
that we denote collectively by $q$. It also depends on the velocity variables that we denote $\dot q$.
It may also have an explicit dependence on the time $t$.
In these notes we establish covariance of the Lagrange equations, namely how the equations
transform when one changes from one coordinate system to another. We start by using Cartesian
coordinates (the standard coordinates for Euclidean space, which we denote by
$x$) to describe the non-relativistic motion of a system of particles moving under forces given by a
conservative potential. We make a very simple choice for the Lagrangian function, namely $L = T - V$,
where $T$ is the kinetic energy and $V$ is the potential energy. Then we see that Newton's equations of motion,
$F = ma$, are equivalent to Lagrange's equations of motion.
We then make an invertible coordinate transformation to a new (generalized) set of coordinates.
We use covariance to verify that Lagrange's equations retain the same form under such a change to
a wide variety of generalized coordinate systems. These coordinates $q$ may be orthogonal or not.
They may have the dimension of length or may be a mixture (as in polar coordinates) of components with
dimensions of length, angle, etc.
¹ When one differentiates $L$ with respect to the components of $q$ and of $\dot q$, one considers the instantaneous position
and velocity as being independent variables. One also uses their values as independent initial data for the solutions to
the equations of motion. Of course when one follows a particle along its trajectory over time, the position determines
the velocity. But the Lagrange equations concern the equations for the trajectory at a particular time.
In §VI we return to the same question when the coordinate change between $q$ and $x$ is not
invertible. We consider what happens to the Lagrange equations when we use a system of generalized
coordinates $q$ with redundant components. Thus one needs to impose relations between these
coordinates (constraints) in order to describe a physical motion. We then see how the idea of
covariance also applies in this situation. We use covariance to derive a modified form of Lagrange's
equations
$$ \frac{d}{dt}\frac{\partial L(q,\dot q)}{\partial \dot q_i} = \frac{\partial L(q,\dot q)}{\partial q_i} + \sum_j \lambda_j \frac{\partial f^{(j)}(q)}{\partial q_i} , \quad \text{for } i = 1, \dots, N . $$
One can interpret the additional terms on the right side of the equations as forces of constraint.
These forces arise from given functions $f^{(j)}$ that determine the constraints. The constants $\lambda_j$ are
called Lagrange multipliers. They are unknowns that need to be found, along with the particle
trajectories $q(t)$. We give complete details in §VI.
II
Once we know about Lagrange's equations for the motion of a set of non-relativistic particles, we
are prepared to understand the major paradigm shift that took place in thinking about physics.
This is the main point.
The old paradigm is to verify Lagrange's equations in a particular situation. One can check
that Lagrange's equations apply as a consequence of some other law of physics. For example,
in these notes we show how to derive Lagrange's equations from Newton's law. We derive the
new equations by implementing the covariance of Lagrange's equations. Then when you have the
equations, you can solve them. Or if you cannot solve them exactly, you can explore approximate
solutions to describe particular problems. The remainder of these notes explores the first aspect of
the old paradigm: deriving Lagrange's equations from Newton's laws.
But let us mention how theoretical physics shifted with the discovery of Lagrange's equations
to a new paradigm. One came to regard the Lagrangian as a fundamental starting point, and the
equations of motion as the consequence of knowing the Lagrangian. We can conceptualize finding
a law of physics as the art of finding an appropriate Lagrangian.
Finding consequences of some law of physics became exploring properties of some Lagrangian
function $L$. Studying symmetries of the equations of motion is replaced by studying symmetries
of the Lagrangian. A general result relates symmetries of the Lagrangian with conservation laws in
the physics, and so on. After we derive Lagrange's equations in these notes, most of the rest of
the course will be devoted to investigating this new paradigm.
In fact the new paradigm gives us freedom. Lagrange's formulation has much wider applicability
than just particle motion in non-relativistic mechanics. There is a Lagrangian formulation for the
motion of particles moving relativistically. Lagrangian mechanics applies to optics as well as to
particles. There is a Lagrangian formulation for the evolution of fields, such as the electromagnetic
field (Maxwell fields), matter fields, etc. And the relation between symmetries and conservation
laws carries over to other areas of physics.
The study of the Lagrangian is closely tied to another important approach, Hamiltonian mechanics. One can argue that the Hamiltonian is just as important as the Lagrangian. In fact
both of them are central to understanding classical physics. But the Lagrangian and Hamiltonian
approaches also lie at the basis of quantum theory. These ideas arise both at the level of the formulation of quantum theory, as well as in the detailed understanding of the consequences of quantum
theory. They are central both to abstract ideas and also to solving particular problems. For example, the relations between symmetry and conservation laws that appear in classical physics appear
in a similar fashion in quantum theory. They appear in Lagrangian theory and also, somewhat
differently, in Hamiltonian theory.
Lagrange's equations also fit naturally into the whole area of variational principles in physics,
which we will explore at length later in the term. These variational principles not only lead to the
derivation of equations of motion, but also to the formulation of other ideas, such as the principle of
least action or the principle of least time. Variational principles also relate the Lagrangian and the
Hamiltonian approaches.
We will not cover all these topics in this course. But you will be happy to know that you will
encounter concepts learned and emphasized in this course in the future, in fact in almost every
other discussion of physics that you will have. While Lagrangian and Hamiltonian ideas originally
arose in classical mechanics, they have evolved to permeate all of modern physics! So we will have
a great deal of fun.
III
We consider here the simplest case of Lagrange's equations. We consider a set of non-relativistic
particles described by Cartesian coordinates $\vec x_1, \vec x_2, \dots$. In order to simplify our notation, let us
not label both particles and the components of their position, but just pick a set of coordinates
$x_1, x_2, \dots, x_N$, so if there are $n$ particles moving in 3-space, then $N = 3n$. The first three $x$'s are the
$x, y, z$ (Cartesian) coordinates of particle 1 in three-space, etc. We let $x$ stand for the collection of all
the particle coordinates $x_1, \dots, x_N$. We assign $m_i$ to be a mass associated with the coordinate $x_i$, so
for our example $m_1 = m_2 = m_3$ denotes the mass of the first particle. We denote the instantaneous
velocities of the coordinates by $\dot x$, which stands for the collection of velocity coordinates $\dot x_1, \dots, \dot x_N$.
A trajectory will be a curve $x(t)$ parameterized by time $t$. The velocity along the trajectory is
$\dot x(t) = dx(t)/dt$.
We assume that the motion along a trajectory evolves according to Newton's law $F = ma$.
Explicitly, we assume that
$$ m_i \ddot x_i = F_i , $$
where the forces $F$ with components $F_1, \dots, F_N$ are conservative. This means that the forces are the
negative gradient of a single, given potential function $V(x)$ that does not depend on the velocities,
$$ F_i(x) = -\frac{\partial V(x)}{\partial x_i} . $$
Thus Newton's equations are
$$ m_i \ddot x_i + \frac{\partial V}{\partial x_i} = 0 , \quad \text{for } i = 1, \dots, N . \tag{1} $$
Given V for a particular problem, we can solve these equations, either exactly, approximately, or
computationally, to obtain information about the resulting motion.
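In this simple situation, the kinetic energy $T$ of the system of particles is
$$ T = T(\dot x) = \sum_{j=1}^{N} \frac{1}{2} m_j \dot x_j^2 , \tag{2} $$
and with the Lagrangian $L = T - V$, Lagrange's equations are
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot x_i} - \frac{\partial L}{\partial x_i} = 0 . \tag{3} $$
In fact
$$ \frac{\partial L}{\partial \dot x_i} = \frac{\partial T}{\partial \dot x_i} = m_i \dot x_i , \quad \text{and} \quad \frac{d}{dt}\frac{\partial L}{\partial \dot x_i} = m_i \ddot x_i . $$
Also
$$ \frac{\partial L}{\partial x_i} = -\frac{\partial V}{\partial x_i} . $$
Thus
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot x_i} - \frac{\partial L}{\partial x_i} = m_i \ddot x_i + \frac{\partial V}{\partial x_i} , $$
showing that the equations (1) and (3) are the same equations.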
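As a numerical illustration of the equivalence of (1) and (3), the following sketch evaluates both left-hand sides along a path by finite differences. It assumes a single coordinate and a harmonic potential $V(x) = K x^2/2$; the values of `M` and `K`, the two sample paths, and all function names are hypothetical choices made for this example only.

```python
import math

M, K = 2.0, 8.0                 # hypothetical mass and spring constant
OMEGA = math.sqrt(K / M)        # angular frequency of the exact solution


def lagrangian(x, v):
    """L = T - V for one coordinate, with the assumed V(x) = K x^2 / 2."""
    return 0.5 * M * v * v - 0.5 * K * x * x


def lagrange_residual(pos, vel, t, h=1e-4):
    """Finite-difference estimate of d/dt(dL/dv) - dL/dx along a path.

    Position x and velocity v are varied independently when taking
    the partial derivatives of L, as in footnote 1 of the notes.
    """
    def momentum(s):
        x, v = pos(s), vel(s)
        return (lagrangian(x, v + h) - lagrangian(x, v - h)) / (2 * h)

    x, v = pos(t), vel(t)
    dL_dx = (lagrangian(x + h, v) - lagrangian(x - h, v)) / (2 * h)
    dp_dt = (momentum(t + h) - momentum(t - h)) / (2 * h)
    return dp_dt - dL_dx


def newton_residual(pos, acc, t):
    """m * a + dV/dx, the left side of equation (1)."""
    return M * acc(t) + K * pos(t)


# A path that solves Newton's equation for this potential.
def sol_pos(t): return math.cos(OMEGA * t)
def sol_vel(t): return -OMEGA * math.sin(OMEGA * t)
def sol_acc(t): return -OMEGA ** 2 * math.cos(OMEGA * t)


# An arbitrary path that does not solve it.
def cubic_pos(t): return t ** 3
def cubic_vel(t): return 3 * t ** 2
def cubic_acc(t): return 6 * t
```

On the solving path both residuals vanish up to discretization error, and on the arbitrary path the two residuals agree with each other, which is the content of the identity displayed above.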
IV
In this section we begin the discussion of transformations from a coordinate system $x$ (which here
we take to be Cartesian) to a new (or generalized) coordinate system $q$. We suppose that we can
write $q$ as a function of $x$, in the form $q(x)$.³ A given function $V(q)$ can be expressed as a function
of $x$ by $\tilde V(x) = V(q(x))$. Let us ask how the components $\partial \tilde V(x)/\partial x_i$ of the gradient in $x$-space are
related to the components $\partial V(q)/\partial q_j$ of the gradient of $V$ in $q$-space.
We find that there is a simple relation; the vector of derivatives $\nabla_x$ transforms as a covariant
vector. This is a consequence of the chain rule for differentiation. The important point is not the
calculation, but rather its interpretation. In any case, let us suppose that one can go back and forth
between the two coordinate systems. Namely for each point $q$ there is a corresponding $x = x(q)$, as
well as the inverse, so that for each point $x_0$, one has $x(q(x_0)) = x_0$. Then the chain rule gives
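$$ \frac{\partial V(q(x))}{\partial x_i} = \sum_{j=1}^{N} \frac{\partial V(q)}{\partial q_j} \frac{\partial q_j}{\partial x_i} . $$
It is usual to write this in shorthand as
$$ \frac{\partial V}{\partial x_i} = \sum_{j=1}^{N} \frac{\partial q_j}{\partial x_i} \frac{\partial V}{\partial q_j} . $$
As $V$ occurs on both sides of the equation, and this identity holds for all $V$, one can write this as
a relation between derivatives in the two coordinate systems. Explicitly
$$ \frac{\partial}{\partial x_i} = \sum_{j=1}^{N} \frac{\partial q_j}{\partial x_i} \frac{\partial}{\partial q_j} . \tag{4} $$
IV.1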
The coefficients of the transformation law (4) play such an important role that one defines the
Jacobian matrix $J$ as
$$ J = \begin{pmatrix} J_{11} & J_{12} & \cdots & J_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ J_{N1} & J_{N2} & \cdots & J_{NN} \end{pmatrix} , \quad \text{with } J_{ij} = \frac{\partial q_j}{\partial x_i} . \tag{5} $$
One can write (4) in matrix form as
$$ \nabla_x = J \, \nabla_q . \tag{6} $$
One also denotes the Jacobian matrix by
$$ J = \frac{\partial(q_1, \dots, q_N)}{\partial(x_1, \dots, x_N)} . \tag{7} $$
The relations (4)-(6) are precisely the transformation law for a covariant vector under the change
of coordinates. A contravariant vector transforms by multiplication with the inverse matrix $J^{-1}$.
³ It might be better to write the transformation $x \mapsto q$ as arising from another function, for instance one could
write $q = G(x)$. In that case, when we follow $x(t)$ along a trajectory parameterized by the time $t$, the corresponding
point is $q(t) = G(x(t))$. The meaning of this identity is clear and unambiguous.
On the other hand, here we denote the transformation $x \mapsto q$ as $q(x)$. And we let the point on a trajectory of $q$ at
time $t$ be $q(t)$. Thus we use the function $q(\,\cdot\,)$ with two distinct meanings; and the identity $q(t) = q(x(t))$ illustrates
the ambiguity. But in an attempt to keep our notation in accordance with convention, we leave it to the clever reader
to interpret any ambiguous function correctly, and as intended by the writer.
IV.2
One Coordinate. The simplest case to visualize is the case of a single real coordinate. The coordinate change $q(x)$ is invertible if and only if $q(x)$ is strictly monotonic, namely strictly increasing
or decreasing. For then only one value of $x$ gives a particular value of $q(x)$. We can specify that
$x(q)$ is precisely that $x$.
Think of this in terms of the graph of a function in the plane with coordinates $(x, q(x))$. The
inverse function would be the graph that you obtain by rotating the plane by 180 degrees about the
line in the plane at 45° to the axes. This rotation interchanges the two axes. However this inverse
function is single-valued only if the original function is strictly monotonic. A sufficient condition⁴
yielding strict monotonicity of $q(x)$ is the requirement that
$$ \frac{dq}{dx} \neq 0 . \tag{8} $$
Many Coordinates. At first glance, the situation seems much more complicated with many
coordinates. However, there is an elementary condition, which reduces to (8) in the case of one
variable, that is sufficient to ensure that $q(x)$ has an inverse function $x(q)$. This inverse function
satisfies $x(q(x_0)) = x_0$ for all $x_0$. This condition is the invertibility of the Jacobian matrix.
Any square matrix, like $J$, has an inverse matrix if and only if it has a non-vanishing determinant. Thus a natural assumption is
that
$$ \det J = \det \frac{\partial(q_1, \dots, q_N)}{\partial(x_1, \dots, x_N)} \neq 0 . \tag{9} $$
The inverse function theorem shows that $x(q)$ exists. It is exactly what we need:
Theorem.⁵ If $q(x)$ is differentiable and $\det J \neq 0$ in the neighborhood of a particular point $x = a$,
then the inverse coordinate transformation $x(q)$ exists in a neighborhood of $q(a)$ and satisfies
$x(q(x_0)) = x_0$ in this neighborhood.
⁴ This condition is sufficient, but not necessary. Actually the first derivative might vanish at some point (say an
inflection point of $q(x)$), but $q(x)$ could still be strictly monotonic. A function with this property is $q = x^3$. In this
case the inverse function is $x = q^{1/3}$; but we also need to specify that we choose the real branch of the cube root.
Choose the branch for which $q^{1/3}$ is positive for $q > 0$ and negative for $q < 0$.
⁵ See for example, Michael Spivak, Calculus on Manifolds, page 35, Benjamin Cummings, 1965, or H. K. Nickerson,
D. C. Spencer and N. E. Steenrod, Advanced Calculus, Chapters IX and X, Dover, 2011.
IV.3
Suppose that $q(x)$ is differentiable and that its Jacobian matrix $J$ satisfies $\det J \neq 0$. Thus the
inverse function $x(q)$ exists and the matrix $J$ has an inverse matrix $J^{-1}$. On the other hand, the
inverse transformation has its own Jacobian matrix $\tilde J$. We claim these two are the same, namely
$J^{-1} = \tilde J$. In other words: the Jacobian of the inverse is the inverse of the Jacobian. As matrices,
$$ J^{-1} = \tilde J = \frac{\partial(x_1, \dots, x_N)}{\partial(q_1, \dots, q_N)} , \tag{10} $$
and
$$ \left(J^{-1}\right)_{ij} = \tilde J_{ij} = \frac{\partial x_j}{\partial q_i} . \tag{11} $$
Using the notation (7) for the Jacobian, one can also write this identity as
$$ \frac{\partial(x_1, \dots, x_N)}{\partial(q_1, \dots, q_N)} \, \frac{\partial(q_1, \dots, q_N)}{\partial(x_1, \dots, x_N)} = \frac{\partial(q_1, \dots, q_N)}{\partial(x_1, \dots, x_N)} \, \frac{\partial(x_1, \dots, x_N)}{\partial(q_1, \dots, q_N)} = I . \tag{12} $$
The proof of the relations (10)-(12) is a consequence of the chain rule for differentiation. Here
we must be very careful about which variables are independent and which are dependent. We start
from the elementary relation
$$ \frac{\partial q_j}{\partial q_i} = \delta_{ji} . \tag{13} $$
Here we take the coordinates $q_i$ as independent. But we can also express $q = q(x)$ as a function of
independent variables $x$. Then the chain rule gives a new expression for (13), namely
$$ \frac{\partial q_j}{\partial q_i} = \sum_{k=1}^{N} \frac{\partial q_j}{\partial x_k} \frac{\partial x_k}{\partial q_i} = \sum_{k=1}^{N} J_{kj} \tilde J_{ik} = \left(\tilde J J\right)_{ij} = \delta_{ji} . \tag{14} $$
Here the product $\tilde J J$ denotes matrix multiplication. And this is just the relation (10) for the
Jacobian $J$ and the Jacobian $\tilde J$ of the inverse transformation. In matrix form, $\tilde J J = J \tilde J = I$.
IV.4
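In order to see a simple example, consider polar coordinates $q = (r, \theta)$ in the plane. As long as
$0 \le r < \infty$ and $0 \le \theta < 2\pi$, there is a single-valued transformation $q(x)$ from Cartesian coordinates
to polar coordinates. It is defined for $x \neq 0$ by
$$ r = \sqrt{x_1^2 + x_2^2} , \qquad \theta = \arctan\frac{x_2}{x_1} . $$
There one can invert the transformation as
$$ x_1 = r\cos\theta , \qquad x_2 = r\sin\theta , $$
so
$$ J = \frac{\partial(r, \theta)}{\partial(x_1, x_2)} = \begin{pmatrix} \dfrac{\partial r}{\partial x_1} & \dfrac{\partial \theta}{\partial x_1} \\ \dfrac{\partial r}{\partial x_2} & \dfrac{\partial \theta}{\partial x_2} \end{pmatrix} = \begin{pmatrix} \dfrac{x_1}{r} & -\dfrac{x_2}{r^2} \\ \dfrac{x_2}{r} & \dfrac{x_1}{r^2} \end{pmatrix} = \begin{pmatrix} \cos\theta & -\dfrac{1}{r}\sin\theta \\ \sin\theta & \dfrac{1}{r}\cos\theta \end{pmatrix} , \quad \text{and} \quad \det J = \frac{1}{r} . $$
Here we have used⁶ $\frac{d}{du}\arctan u = \frac{1}{1+u^2}$. Thus $\det J$ is well-defined and $\det J \neq 0$ for $r \neq 0$. This
agrees with the direct analysis above.
On the other hand, the inverse transformation has the Jacobian $\tilde J$ that one can calculate as
$$ \tilde J = \frac{\partial(x_1, x_2)}{\partial(r, \theta)} = \begin{pmatrix} \dfrac{\partial x_1}{\partial r} & \dfrac{\partial x_2}{\partial r} \\ \dfrac{\partial x_1}{\partial \theta} & \dfrac{\partial x_2}{\partial \theta} \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -r\sin\theta & r\cos\theta \end{pmatrix} , \quad \text{and} \quad \det \tilde J = r . $$
Of course it is the case that $\tilde J J = I$.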
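The polar-coordinate Jacobians can also be checked numerically. The sketch below codes the closed-form entries of $J$ and $\tilde J$, using the index convention $J_{ij} = \partial q_j/\partial x_i$ of (5), and multiplies them. The function names and the sample point are hypothetical. We use `math.atan2` rather than a bare arctangent, which is an assumption of this sketch: it returns $\theta$ in $(-\pi, \pi]$ instead of $[0, 2\pi)$, but the Jacobians are unaffected by that choice of branch.

```python
import math


def to_polar(x1, x2):
    """Map Cartesian (x1, x2) to (r, theta); theta lies in (-pi, pi]."""
    return math.hypot(x1, x2), math.atan2(x2, x1)


def jacobian_q_of_x(x1, x2):
    """J[i][j] = d q_j / d x_i for q = (r, theta), in closed form."""
    r2 = x1 * x1 + x2 * x2
    r = math.sqrt(r2)
    return [[x1 / r, -x2 / r2],
            [x2 / r, x1 / r2]]


def jacobian_x_of_q(r, theta):
    """Jt[i][j] = d x_j / d q_i for x = (r cos(theta), r sin(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s],
            [-r * s, r * c]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


# Sample point away from the origin (hypothetical values).
x1, x2 = 1.2, -0.7
r, theta = to_polar(x1, x2)
J = jacobian_q_of_x(x1, x2)
Jt = jacobian_x_of_q(r, theta)
product = matmul2(Jt, J)        # should be close to the identity matrix
```

At any point with $r \neq 0$ one should find `product` equal to the identity up to rounding, with `det2(J)` close to $1/r$ and `det2(Jt)` close to $r$.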
V
In this section we derive Lagrange's equations in generalized coordinates $q$. As with polar coordinates, such coordinates may not have the dimension of length; for example, one generalized
coordinate might be a length while a second one might be an angle. In this section we assume
that there is an invertible coordinate transformation $q(x)$ between the generalized coordinates $q$
and Cartesian coordinates $x$ for generic values of the coordinates. We also assume that one can
differentiate this relation.
The Lagrangian $L$ for a set of particles with coordinates $q = (q_1, q_2, \dots, q_N)$ is a function of the
coordinates and the corresponding velocities $\dot q = (\dot q_1, \dot q_2, \dots, \dot q_N)$ at a particular instant of time. As
time varies, the argument of the Lagrangian function follows a possible trajectory $q(t)$. Along a trajectory we have
$L = L(q(t), \dot q(t))$.
We claim that physical trajectories satisfy the Lagrange equations of motion. These equations are
$$ \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} = \frac{\partial L}{\partial q_i} , \quad \text{for } i = 1, \dots, N . \tag{15} $$
V.1
Following a Trajectory
Following a Trajectory
Here we reduce the derivation of Lagrange's equations to the derivation of the covariance of a
certain vector under invertible changes from Cartesian coordinates $x$ to generalized coordinates $q$.
At a given instant of time $t$, we consider the coordinates $q$, $\dot q$ to be independent variables.
⁶ Given a function $u(v)$, the inverse function $v(u)$ satisfies $u(v(u)) = u$. Thus differentiating with respect to $u$ shows
that $\frac{du}{dv}\frac{dv}{du} = 1$, or $\frac{dv}{du} = 1 \big/ \frac{du}{dv}$. For $u = \tan v$ and $v = \arctan u$ one has $\frac{du}{dv} = \frac{1}{\cos^2 v}$, so $\frac{dv}{du} = \cos^2 v = \frac{1}{1+\tan^2 v} = \frac{1}{1+u^2}$.
Let us follow a point $q$ on a physical trajectory $q(t)$. The time dependence $q(t)$ of the point
$q$ is determined by the time dependence $x(t)$ of the same point in Cartesian coordinates, namely
$q(t) = q(x(t))$. Accordingly
$$ \dot q_j = \frac{dq_j(t)}{dt} = \sum_{i=1}^{N} \frac{\partial q_j}{\partial x_i} \dot x_i . $$
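Thus under the change of coordinates $x \mapsto q$, the positions $x_i$ and velocities $\dot x_i$ map as
$$ q_i = q_i(x) , \quad \text{and} \quad \dot q_j = \dot q_j(x, \dot x) = \sum_{k=1}^{N} \frac{\partial q_j}{\partial x_k} \dot x_k . \tag{16} $$
Differentiate (16) with respect to $\dot x_i$, remembering that both the components of $x$ and of $\dot x$ are
independent variables. From this we infer the rule of cancellation of dots,
$$ \frac{\partial \dot q_j}{\partial \dot x_i} = \frac{\partial q_j}{\partial x_i} . \tag{17} $$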
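Similarly, differentiating (16) with respect to $x_i$ gives
$$ \frac{\partial \dot q_j}{\partial x_i} = \sum_{k=1}^{N} \frac{\partial^2 q_j}{\partial x_k \, \partial x_i} \dot x_k = \frac{d}{dt}\frac{\partial q_j}{\partial x_i} . \tag{18} $$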
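The identities (17) and (18) can be spot-checked by finite differences. The sketch below does this for the polar coordinates of §IV.4, treating $x$ and $\dot x$ as independent arguments of $\dot q(x, \dot x)$. The helper names, the step size, and the sample point are hypothetical. For (18), the time derivative of $\partial q_j/\partial x_i$ is taken along the straight path $x + t\,\dot x$, which is enough because only the first derivative of the path enters.

```python
import math


def dq_dx(x):
    """Closed-form partials for q = (r, theta): result[j][i] = d q_j / d x_i."""
    r2 = x[0] ** 2 + x[1] ** 2
    r = math.sqrt(r2)
    return [[x[0] / r, x[1] / r],
            [-x[1] / r2, x[0] / r2]]


def qdot(x, v):
    """Generalized velocities from (16): qdot_j = sum_k (d q_j / d x_k) v_k."""
    g = dq_dx(x)
    return [g[j][0] * v[0] + g[j][1] * v[1] for j in range(2)]


def shifted(p, i, h):
    """Copy of the point p with component i displaced by h."""
    out = list(p)
    out[i] += h
    return out


def max_error_17(x, v, h=1e-6):
    """Largest deviation from d qdot_j / d v_i = d q_j / d x_i."""
    g = dq_dx(x)
    err = 0.0
    for i in range(2):
        up, dn = qdot(x, shifted(v, i, h)), qdot(x, shifted(v, i, -h))
        for j in range(2):
            err = max(err, abs((up[j] - dn[j]) / (2 * h) - g[j][i]))
    return err


def max_error_18(x, v, h=1e-6):
    """Largest deviation from d qdot_j / d x_i = d/dt (d q_j / d x_i)."""
    ahead = dq_dx([x[0] + h * v[0], x[1] + h * v[1]])
    behind = dq_dx([x[0] - h * v[0], x[1] - h * v[1]])
    err = 0.0
    for i in range(2):
        up, dn = qdot(shifted(x, i, h), v), qdot(shifted(x, i, -h), v)
        for j in range(2):
            lhs = (up[j] - dn[j]) / (2 * h)
            rhs = (ahead[j][i] - behind[j][i]) / (2 * h)
            err = max(err, abs(lhs - rhs))
    return err
```

For instance, `max_error_17([1.2, -0.7], [0.3, 0.9])` and `max_error_18([1.2, -0.7], [0.3, 0.9])` should both be small compared with the size of the derivatives themselves, limited only by finite-difference and rounding error.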
V.2
(19)
We assume that the function $\tilde L$ is the Lagrangian in Cartesian coordinates $x$ for a system of particles
moving in a potential $V(x)$ that can be expressed in terms of the instantaneous position $x$. The
potential does not depend on the instantaneous velocity $\dot x$.
Then Newton's equations hold in
Cartesian coordinates, and Lagrange's equations also hold as they agree with Newton's; see §III.
Let us define two sets of $N$ functions, which we put together as vectors. The first set is the
Newton vector $\mathcal{N}$ with components
$$ \mathcal{N}_i = \frac{d}{dt}\frac{\partial \tilde L(x, \dot x)}{\partial \dot x_i} - \frac{\partial \tilde L(x, \dot x)}{\partial x_i} , \quad \text{where } i = 1, \dots, N . $$
The statement that Newton's equations hold for trajectories $x(t)$ moving under forces determined by the potential
$V(x)$ is equivalent to the statement that $\mathcal{N} = 0$.
The second set is the Lagrange vector $\mathcal{L}$ with components
$$ \mathcal{L}_i = \frac{d}{dt}\frac{\partial L(q, \dot q)}{\partial \dot q_i} - \frac{\partial L(q, \dot q)}{\partial q_i} , \quad \text{where } i = 1, \dots, N . $$
The Lagrange equations of motion along trajectories in the generalized coordinates $q(t)$ described
by a Lagrangian $L(q, \dot q)$ are the equations $\mathcal{L} = 0$.
We claim that these two vectors obey the matrix equation,
$$ \mathcal{N} = J \mathcal{L} . \tag{20} $$
In other words, the vector $\mathcal{N}$ that describes the Lagrange equations in Cartesian coordinates $x$
transforms as a covariant vector when one changes to generalized coordinates $q$. Hence if $\mathcal{N} = 0$
along a Newtonian trajectory in Cartesian coordinates, then $J \mathcal{L} = 0$ along a trajectory $q(t)$ arising
from the Lagrangian $L$. As we are assuming the coordinate transformation $x \mapsto q$ is invertible, so
is the Jacobian matrix $J$. Therefore multiplying by $J^{-1}$ shows that $\mathcal{L} = 0$ on a physical trajectory.
This shows that the Lagrange equations hold in the generalized coordinate system $q$.
V.3
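In order to establish (20), we calculate the terms in $\mathcal{N}_i$. But we express $\mathcal{N}$ in terms of $L$ by using
the transformation $q(x)$ and the relation (19). Then using the chain rule,
$$ \frac{\partial \tilde L(x, \dot x)}{\partial \dot x_i} = \frac{\partial L(q(x), \dot q(x, \dot x))}{\partial \dot x_i} = \sum_{j=1}^{N} \frac{\partial L(q(x), \dot q(x, \dot x))}{\partial \dot q_j} \frac{\partial \dot q_j}{\partial \dot x_i} = \sum_{j=1}^{N} \frac{\partial L(q, \dot q)}{\partial \dot q_j} \frac{\partial q_j}{\partial x_i} . $$
In the last equality we use identity (17) to cancel the dots. Hence differentiating this expression
gives
$$ \frac{d}{dt}\frac{\partial \tilde L(x, \dot x)}{\partial \dot x_i} = \sum_{j=1}^{N} \left[ \left( \frac{d}{dt}\frac{\partial L(q, \dot q)}{\partial \dot q_j} \right) \frac{\partial q_j}{\partial x_i} + \frac{\partial L(q, \dot q)}{\partial \dot q_j} \, \frac{d}{dt}\frac{\partial q_j}{\partial x_i} \right] . \tag{21} $$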
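Also
$$ \frac{\partial \tilde L(x, \dot x)}{\partial x_i} = \frac{\partial L(q(x), \dot q(x, \dot x))}{\partial x_i} = \sum_{j=1}^{N} \left[ \frac{\partial L(q, \dot q)}{\partial q_j} \frac{\partial q_j}{\partial x_i} + \frac{\partial L(q, \dot q)}{\partial \dot q_j} \frac{\partial \dot q_j}{\partial x_i} \right] . \tag{22} $$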
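The identity (18) shows that the final sums in (21) are identical to the last terms in (22). Therefore
subtracting (22) from (21) shows that
$$ \mathcal{N}_i = \frac{d}{dt}\frac{\partial \tilde L(x, \dot x)}{\partial \dot x_i} - \frac{\partial \tilde L(x, \dot x)}{\partial x_i} = \sum_{j=1}^{N} \left[ \frac{d}{dt}\frac{\partial L(q, \dot q)}{\partial \dot q_j} - \frac{\partial L(q, \dot q)}{\partial q_j} \right] \frac{\partial q_j}{\partial x_i} = \sum_{j=1}^{N} \mathcal{L}_j \frac{\partial q_j}{\partial x_i} = \sum_{j=1}^{N} J_{ij} \mathcal{L}_j = (J \mathcal{L})_i . $$
This is just the claimed relation (20), so it completes the proof of Lagrange's equations.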
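The covariance relation (20) holds along any path, not only along solutions of the equations of motion. The sketch below illustrates this for a particle in the plane with polar coordinates $q = (r, \theta)$. It assumes a hypothetical central potential $V = K r^2/2$ and an arbitrarily chosen path $r(t) = 1 + 0.3t$, $\theta(t) = t^2/2$; all names and constants are illustrative. The Newton vector is evaluated in Cartesian coordinates, the Lagrange vector from $L = \frac{M}{2}(\dot r^2 + r^2\dot\theta^2) - \frac{K}{2} r^2$, and $J$ is the polar Jacobian of §IV.4.

```python
import math

M, K = 1.5, 4.0     # hypothetical mass and force constant, V = K r^2 / 2


def polar_path(t):
    """An arbitrary path: returns r, r', r'', theta, theta', theta''."""
    return 1.0 + 0.3 * t, 0.3, 0.0, 0.5 * t * t, t, 1.0


def newton_vector(t):
    """N_i = M x_i'' + dV/dx_i in Cartesian coordinates (dV/dx_i = K x_i)."""
    r, rd, rdd, th, thd, thdd = polar_path(t)
    c, s = math.cos(th), math.sin(th)
    x = (r * c, r * s)
    # Second time derivatives of x1 = r cos(theta), x2 = r sin(theta).
    acc = (rdd * c - 2 * rd * thd * s - r * thdd * s - r * thd * thd * c,
           rdd * s + 2 * rd * thd * c + r * thdd * c - r * thd * thd * s)
    return [M * acc[i] + K * x[i] for i in range(2)]


def lagrange_vector(t):
    """L_j = d/dt(dL/dqdot_j) - dL/dq_j for q = (r, theta)."""
    r, rd, rdd, th, thd, thdd = polar_path(t)
    radial = M * rdd - M * r * thd * thd + K * r
    angular = M * (r * r * thdd + 2 * r * rd * thd)
    return [radial, angular]


def jacobian(t):
    """J[i][j] = d q_j / d x_i evaluated on the path."""
    r, _, _, th, _, _ = polar_path(t)
    c, s = math.cos(th), math.sin(th)
    return [[c, -s / r],
            [s, c / r]]


def j_times_l(t):
    """The right side of (20), (J L)_i = sum_j J[i][j] L_j."""
    jac, lag = jacobian(t), lagrange_vector(t)
    return [sum(jac[i][j] * lag[j] for j in range(2)) for i in range(2)]
```

Comparing `newton_vector(t)` with `j_times_l(t)` at several times gives agreement to rounding error, even though this path is not a physical trajectory and neither vector vanishes.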
VI
VI.1
Sometimes it is useful to describe a mechanics problem using some redundant coordinates. This is
also called a system of coordinates with constraints. In this case we take a set of generalized
coordinates $q = (q_1, \dots, q_N)$ whose number $N$ is strictly greater than the number $n$ of independent coordinates
$x = (x_1, \dots, x_n)$. In this section of the notes, we do not assume that $x$ denotes a set of Cartesian
coordinates. Rather we only assume that $x$ denotes a set of independent, generalized coordinates
for which the Lagrange function is $\tilde L(x, \dot x)$ and the Lagrange equations hold in the form
$$ \frac{d}{dt}\frac{\partial \tilde L}{\partial \dot x_i} = \frac{\partial \tilde L}{\partial x_i} , \quad \text{for } i = 1, \dots, n . \tag{23} $$
In such a situation, we cannot conclude that Lagrange's equations (15) hold, at least not without
modification. The reason is that our derivation of the equations used the fact that one can go
back and forth between the coordinates $q$ and the coordinates $x$. More specifically, we used the
invertibility of the $N \times N$ Jacobian matrix $J$. In the present situation, as noted in the introduction,
extra terms arise in the equations. One can interpret these terms as giving forces of constraint;
they ensure that the motion described by the solution to the equations lies in the subspace that is
allowed by the physics.
One cannot write the coordinate $x$ as a function of the redundant coordinates $q$, because there
are values of $q$ that do not correspond to any $x$. For example, if $x$-space is a line in a plane with
coordinates $q$, then one cannot map a point $q$ not lying on the line into a point $x$. Since the
difference in the dimensions of $q$-space and $x$-space is $k = N - n$, one needs to give $k = N - n$
conditions on the coordinates $q$. One generally gives $k$ functions, called constraint functions, such
as $f^{(1)}(q), f^{(2)}(q), \dots, f^{(k)}(q)$, that vanish on $x$-space. Thus the equations
$$ f^{(1)}(q) = 0 , \quad f^{(2)}(q) = 0 , \quad \dots , \quad f^{(k)}(q) = 0 \tag{24} $$
hold on $x$-space.
VI.2
Elementary Example
In this example, we consider a point particle moving under the influence of gravity in one dimension. One can take the single generalized coordinate $x$ to be the distance that the particle has
moved. This is a Cartesian coordinate system and corresponds to $n = 1$. Suppose that the
particle is actually moving in a frictionless manner along a line in the $(q_1, q_2)$ plane. Here $q_1, q_2$
denotes a second Cartesian coordinate system in which we embed the inclined plane. Assume that
$q_2$ denotes the vertical direction, and the plane is inclined at an angle $\theta$ to the horizontal direction
$q_1$, and in a direction so $q_1$ increases with time. Let the $N = 2$ coordinate system $q$ and the $n = 1$
coordinate system $x$ have coinciding origins.
We can easily express a Lagrangian for the particle in terms of the $x$ coordinate as
$$ \tilde L(x, \dot x) = T - V = \frac{1}{2} m \dot x^2 + m g x \sin\theta . \tag{25} $$
Now consider the Lagrangian
$$ L(q, \dot q) = \frac{1}{2} m \left( \dot q_1^2 + \dot q_2^2 \right) - m g \, q_2 , \tag{26} $$
which describes the vertical and horizontal coordinates of a point particle under the influence of
gravity. Furthermore, suppose that one can express $q$ as a function of a parameter $x$ lying on a line
such that
$$ q_1 = x\cos\theta , \quad \text{and} \quad q_2 = -x\sin\theta . \tag{27} $$
Here we fix the angle $\theta$, so $x$ parameterizes a point on a plane inclined by angle $\theta$ to the horizontal.
Then the corresponding velocities in $q$ and $x$ space satisfy the relations,
$$ \dot q_1 = \dot x\cos\theta , \quad \text{and} \quad \dot q_2 = -\dot x\sin\theta . \tag{28} $$
Using (27)-(28), we express $L(q, \dot q)$ as a function $\tilde L(x, \dot x)$.
One finds that for $x$ on the allowed line in $q$-space,
$$ L(q, \dot q) = L(q(x), \dot q(x, \dot x)) = \frac{1}{2} m \dot x^2 + m g x \sin\theta = \tilde L(x, \dot x) . \tag{29} $$
This agrees with $\tilde L(x, \dot x)$ in (25), for which we know that Lagrange's equations hold. So our choice
of $L$ is the natural one.
Now we come to the tricky point: While the relation (29) holds for all $x$ on the allowed line, it
does not hold in general. We cannot write $L(q, \dot q) = \tilde L(x(q), \dot x(q, \dot q))$ for all $q$ in the present case,
although this was possible in the context of §V where there are no constraints.
VI.3
Let us generalize from this example to study the general case where there are more $q$ coordinates than
$x$ coordinates. We gain insight by looking at the Jacobian. Assume that we have $N$ generalized
coordinates $q$ and $n < N$ generalized coordinates $x$. In this case we do not assume that the
coordinates $x$ are Cartesian coordinates. Rather we do assume that the coordinates $x$ are generalized
coordinates for which we have given a Lagrangian $\tilde L(x, \dot x)$, and that the Lagrange equations (15)
hold in these coordinates. We assume that we have a well-defined, differentiable transformation
$q = q(x)$. Define the Jacobian matrix
$$ J = \frac{\partial(q_1, q_2, \dots, q_N)}{\partial(x_1, x_2, \dots, x_n)} , \quad \text{with entries } J_{ij} = \frac{\partial q_j}{\partial x_i} . $$
Explicitly,
$$ J = \begin{pmatrix} \dfrac{\partial q_1}{\partial x_1} & \dfrac{\partial q_2}{\partial x_1} & \cdots & \dfrac{\partial q_N}{\partial x_1} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial q_1}{\partial x_n} & \dfrac{\partial q_2}{\partial x_n} & \cdots & \dfrac{\partial q_N}{\partial x_n} \end{pmatrix} . \tag{30} $$
As $n < N$, the maximal rank of $J$ is $n$.⁷ Therefore there are at least $k = N - n \ge 1$ linearly independent null vectors $v^{(1)}, v^{(2)}, \dots, v^{(k)}$ of the Jacobian matrix $J$. These null vectors $v^{(j)}$ satisfy
$$ J v^{(j)} = 0 , \quad \text{for all } q , \text{ and for } j = 1, \dots, k . \tag{31} $$
In this case, the null space of $J$ is spanned by the $k$ linearly independent null vectors $v^{(j)}$.
In fact the gradient $\nabla_q f^{(j)}$ of each constraint function $f^{(j)}$ is such a null vector, namely
$$ v^{(j)} = \nabla_q f^{(j)} = \begin{pmatrix} \dfrac{\partial f^{(j)}(q)}{\partial q_1} \\ \dfrac{\partial f^{(j)}(q)}{\partial q_2} \\ \vdots \\ \dfrac{\partial f^{(j)}(q)}{\partial q_N} \end{pmatrix} . \tag{32} $$
To check this, note that each constraint function satisfies the constraint condition (36). So as an
identity, it is the case that $\partial f^{(j)}(q(x))/\partial x_\ell = 0$ for each $\ell = 1, \dots, n$ and for all $x$. Using the chain
rule, we claim that in fact
$$ \frac{\partial f^{(j)}(q(x))}{\partial x_\ell} = \sum_{i=1}^{N} \frac{\partial f^{(j)}(q(x))}{\partial q_i} \frac{\partial q_i}{\partial x_\ell} = \left( J v^{(j)} \right)_\ell = 0 , \quad \text{for } \ell = 1, \dots, n . \tag{33} $$
⁷ Let us briefly digress on the rank of the matrix $J$. The domain of $J$ is $\mathbb{R}^N$ while its range $J\mathbb{R}^N$ is a subset of
$\mathbb{R}^n$. The integer $\mathrm{Rank}(J)$ is the dimension of the range of $J$. Namely $\mathrm{Rank}(J) = \dim(J\mathbb{R}^N)$. As $n \le N$, clearly
$\mathrm{Rank}(J) \le n$. We say that $J$ has maximal rank if $\mathrm{Rank}(J) = n$ for all $x$. This replaces the condition that $J$ be
invertible, for if $n = N$, the conditions coincide. If an $n \times N$ matrix $J$ has maximal rank $n$, and $n \le N$, then the
null space of $J$ has dimension $k = N - n$.
Note that an $\ell \times \ell$ matrix $H$ with entries $H_{ij}$ is hermitian if $H_{ij} = \overline{H_{ji}}$. Any $\ell \times \ell$ hermitian matrix $H$ has an
orthonormal basis of eigenvectors $e_j$ with real eigenvalues $\lambda_j$. They satisfy $H e_j = \lambda_j e_j$ for $j = 1, \dots, \ell$.
Let us now return to our assumption (35), that we called the independence of the constraints.
In fact this assumption means that the different null-vector solutions $v^{(j)} = \nabla_q f^{(j)}$ to the equation
$J v^{(j)} = 0$ provide $k$ linearly independent solutions to this equation. Assuming further that the
dimension of the vector space of null vectors of $J$ is $k$, then every null vector $v$ satisfying $Jv = 0$
can be expanded as a linear combination of the solutions $v^{(j)} = \nabla_q f^{(j)}$ given by the constraints. In
other words, there is a set of constants $\lambda_j$ such that an arbitrary null vector $v$ for $J$ can be written
as
$$ v = \sum_{j=1}^{k} \lambda_j v^{(j)} . \tag{34} $$
In this context, let us call the coefficients $\lambda_j$ Lagrange multipliers. The reason is that this is how they
will arise in the context of Lagrange's equations.
We want to have the relation (34), so now we add an additional assumption to our analysis so
that this is the case. This assumption replaces the assumption in §V that the $N \times N$ Jacobian
matrix $J$ is invertible. The assumption involves two parts, which taken together ensure that the $k$
constraint functions give a basis of null vectors for $J$.
We define independence of the constraint functions $f^{(j)}(q)$ at $q$ to mean that the $k$ different
gradient vectors $\nabla_q f^{(j)}$ are linearly independent. This means that for any constants $c_j$,
$$ \sum_{j=1}^{k} c_j \nabla_q f^{(j)}(q) = 0 \quad \text{only if} \quad c_1 = \dots = c_k = 0 . \tag{35} $$
VI.4
Assumptions
$\dfrac{\partial(q_1, \dots, q_N)}{\partial(x_1, \dots, x_n)}$
3. We are given $k = N - n$ constraint functions $f^{(j)}(q)$ which vanish on the range of $x$, or
$$ f^{(j)}(q(x)) = 0 , \quad \text{for all } x , \text{ and for all } j = 1, \dots, k . \tag{36} $$
4. The constraint functions $f^{(j)}$ are independent for generic points $q$.
VI.5
In the above situation, there are $N + k$ unknowns: the $N$ coordinates $q_1(t), q_2(t), \dots, q_N(t)$, along with $k$ constants $\lambda_1, \dots, \lambda_k$, called Lagrange
multipliers. There are also $N + k$ equations which determine the $N + k$ unknowns. These equations
comprise $N$ Lagrange equations of motion which have the form,
$$ \frac{d}{dt}\frac{\partial L(q, \dot q)}{\partial \dot q_i} = \frac{\partial L(q, \dot q)}{\partial q_i} + \sum_{j=1}^{k} \lambda_j \frac{\partial f^{(j)}(q)}{\partial q_i} , \quad \text{where } i = 1, \dots, N , \tag{37} $$
together with the $k$ constraint equations
$$ f^{(j)}(q) = 0 , \quad \text{where } j = 1, \dots, k . \tag{38} $$
VI.6
In order to establish the validity of the equations (37) as the Lagrange equations with constraints,
we begin by defining $\mathcal{N}$ and $\mathcal{L}$ as in §V.2 and following the argument in §V.3. We wish to show that
the transformation from $\mathcal{N}$ to $\mathcal{L}$ is covariant. In fact
$$ \mathcal{N} = J \mathcal{L} . $$
The details of deriving this relation are identical to the proof of the same relation (20) in §V. The
only difference is that one needs to keep track that the number $N$ of components of $q$ is different
from the number $n$ of components of $x$.
Furthermore $\mathcal{N} = 0$ on a physical trajectory, as the Lagrange equations for $\tilde L(x, \dot x)$ hold. Hence $J \mathcal{L} = 0$, so $\mathcal{L}$ is a null vector of $J$. The
difference between the previous case without constraints and the present case is that the Jacobian $J$
is not invertible, so we cannot conclude that $\mathcal{L} = 0$. However the assumption that $J$ has maximal rank means that every null vector
of $J$ is a linear combination of the null vectors arising from the independent constraints. Thus $\mathcal{L} = \sum_j \lambda_j \nabla_q f^{(j)}$, which says that
the Lagrange equations hold in the form (37) in the $q$ coordinates. The Lagrange multipliers are
unknowns, and we solve the $N + k$ equations for the $\lambda_j$'s, as well as for the trajectories $q(t)$.
VI.7
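$$ m \ddot q_1 = \lambda \tan\theta , \quad m \ddot q_2 = -mg + \lambda , \quad \text{and} \quad q_2 = -q_1 \tan\theta . \tag{39} $$
The solution is
$$ \ddot q(t) = g \sin\theta \begin{pmatrix} \cos\theta \\ -\sin\theta \end{pmatrix} , \quad \text{and} \quad \lambda = mg\cos^2\theta . \tag{40} $$
One can interpret the terms $\lambda\tan\theta$ and $\lambda$ in (39) as the $q_1$ and $q_2$ components of a force of
constraint, keeping the particle on the surface of the inclined plane.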
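As a check on the inclined-plane example, the sketch below solves the constrained equations for the multiplier and the accelerations. It rests on assumptions that should be read as such: the constraint function is taken to be $f(q) = q_2 + q_1\tan\theta$, with $q_2$ pointing upward and the particle sliding toward increasing $q_1$, so that the equations of motion are $m\ddot q_1 = \lambda\tan\theta$ and $m\ddot q_2 = -mg + \lambda$. The function names and numerical values are hypothetical.

```python
import math


def constrained_solution(m, g, theta):
    """Solve m*a1 = lam*tan(theta), m*a2 = -m*g + lam, a2 = -a1*tan(theta).

    The last relation is the constraint q2 = -q1 tan(theta) differentiated
    twice in time. Returns (a1, a2, lam).
    """
    t = math.tan(theta)
    # Substituting the first two equations into the third gives
    # -g + lam/m = -(lam/m) * t^2, hence lam = m g / (1 + t^2).
    lam = m * g / (1.0 + t * t)
    a1 = lam * t / m
    a2 = -g + lam / m
    return a1, a2, lam


def acceleration_along_incline(m, g, theta):
    """Project (a1, a2) onto the unit vector (cos(theta), -sin(theta))."""
    a1, a2, _ = constrained_solution(m, g, theta)
    return a1 * math.cos(theta) - a2 * math.sin(theta)


def jacobian_dot_gradient(theta):
    """J . grad f for J = (dq1/dx, dq2/dx) and f = q2 + q1 tan(theta)."""
    jac = (math.cos(theta), -math.sin(theta))
    grad_f = (math.tan(theta), 1.0)
    return jac[0] * grad_f[0] + jac[1] * grad_f[1]


# Hypothetical sample values.
m, g, theta = 2.0, 9.8, 0.6
a1, a2, lam = constrained_solution(m, g, theta)
```

Under these assumptions the multiplier comes out as $mg\cos^2\theta$, the acceleration along the incline is $g\sin\theta$ (the value that follows from the one-coordinate Lagrangian $\tilde L$), and the gradient of the constraint is a null vector of the $1 \times 2$ Jacobian.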