Lagrangian Mechanics 4 Sense
Sean He
Acknowledgements
Special thanks to Andrew Wang for proofreading and providing feedback.
Dedicated to my favorite physics clubber William "Willy" Wang.
Preface
Lagrangian Mechanics... a sort of holy grail in undergraduate physics, or so I’ve heard. Lagrange’s
equation itself is not hard to learn to use, as long as you understand partial derivatives. However, the
origin of the topic is usually left somewhat ambiguous. Most sources will derive the Euler-Lagrange
equation, with a lot of arbitrary definitions and confusing notation. Then there is the choice of the
Lagrangian: why is it L = T − U? In short, without a good teacher, it is extremely hard to learn the
theory behind Lagrangian Mechanics.
Well, I obviously somehow managed, since I'm here writing this handout. After reading too many
articles and Classical Mechanics textbook chapters, I think I have a decent grasp on Lagrangian
Mechanics. (Hopefully saying that none of this handout is rigorous will cover for any mistakes I
make.) And with a good grasp on calculus, some free time to learn a bit of Multivariable Differential
Calculus (try MIT OCW 18.02 Lectures, or read the Appendix), and some patience, any high school
senior should also be able to understand Lagrangian Mechanics.
Good luck.
1 Motivations for the Calculus of Variations
The calculus of variations is the mathematical foundation that Lagrangian mechanics is built on, so
I want to motivate both the need to develop Lagrangian mechanics and the types of problems the
calculus of variations can solve.
Newton's 2nd Law is the foundation of Newtonian mechanics, and it works great in Cartesian coor-
dinates. However, it tends to become messy and unwieldy in other coordinate systems.
Lagrangian mechanics is the alternative which will work with what we call generalized coordinates.
To see how calculus of variations will help us with this goal, let us first see what types of problems it
can solve.
The principle behind the catenary is that the hanging chain must minimize its total potential energy,
that is, we want to minimize:
\[
\int_C g y \, dm = \int_C \mu g y \, ds = \int_{x_1}^{x_2} \mu g y(x) \sqrt{1 + y'(x)^2} \, dx,
\]
where $\mu$ is the chain's linear mass density.
Note that y is a function of x: we are finding a function such that the value of this integral is
minimized, rather than the typical calculus problem of finding a value that minimizes an expression.
Similarly, by Fermat's principle, light travels along the path that minimizes its travel time. With
speed $v = c/n$ in a medium of refractive index $n$, the travel time is:
\[
\int_{t_1}^{t_2} dt = \int_C \frac{ds}{v} = \frac{1}{c} \int_C n \, ds = \frac{1}{c} \int_{x_1}^{x_2} n \sqrt{1 + y'(x)^2} \, dx.
\]
Again we are finding a function that minimizes the integral.
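As a quick numerical illustration of minimizing travel time, here is a sketch that recovers Snell's law for a ray crossing between two media (the endpoints, interface, and indices n1 = 1, n2 = 1.5 are made-up values, not from the text):

```python
import math

# Fermat's principle: light from (0, 1) in medium 1 to (1, -1) in medium 2,
# crossing the interface y = 0 at (x, 0). Minimizing travel time should
# reproduce Snell's law n1*sin(theta1) = n2*sin(theta2).
n1, n2, c = 1.0, 1.5, 1.0   # illustrative values

def travel_time(x):
    # time = (n/c) * path length in each medium
    return (n1 / c) * math.hypot(x, 1.0) + (n2 / c) * math.hypot(1.0 - x, 1.0)

# crude but sufficient: grid-search the crossing point
x_best = min((i / 100000 for i in range(100001)), key=travel_time)

sin1 = x_best / math.hypot(x_best, 1.0)                 # sin of incidence angle
sin2 = (1.0 - x_best) / math.hypot(1.0 - x_best, 1.0)   # sin of refraction angle
print(n1 * sin1, n2 * sin2)  # approximately equal
```

The time-minimizing crossing point makes the two sides of Snell's law agree, which is exactly the kind of "find the function (here, the path) that minimizes an integral" problem described above.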
1.3 Lagrangian
As we have seen in the previous sections, the calculus of variations is designed to tackle problems of
finding the stationary points of an integral with respect to a function (which happened to be minimization
in our examples). It turns out that finding the path a system will take in mechanics is also a problem of
finding stationary points. We will first use the calculus of variations to see how to go about this process
of finding stationary points, and then derive Hamilton's principle to find the expression that, when
stationary, gives us the equations of motion.
As an example, take $f(x) = x^2$:
\begin{align*}
f + df &= f(x + dx) \\
df &= f(x + dx) - f(x) \\
df &= (x + dx)^2 - x^2 \\
df &= 2x \, dx + (dx)^2 \\
df &\approx 2x \, dx.
\end{align*}
Note that we take a first-order approximation and drop the $(dx)^2$ term, as it is negligible compared
to $dx$. In general, the differential can be seen as an infinitesimal nudge to the input. Notice how the
derivative appears in the equation. While we were able to solve directly for the differential in this case
since we had an explicit function, that is not always possible, and more generally we have:
\[
df = \frac{df}{dx} \, dx.
\]
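The first-order approximation is easy to check numerically (the values of x and dx below are arbitrary choices for illustration):

```python
# The differential of f(x) = x^2: the exact change f(x+dx) - f(x)
# differs from the linear approximation 2x*dx only by (dx)^2.
def f(x):
    return x**2

x, dx = 3.0, 1e-5
df_exact = f(x + dx) - f(x)   # exact change in f
df_linear = 2 * x * dx        # first-order differential df = f'(x) dx
print(df_exact - df_linear)   # ~ (dx)^2 = 1e-10
```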
We are now ready to construct a similar definition for the functional. To do this we will introduce the
variation, represented by δ. Suppose that for a functional I[f], we vary the input function by f → f + δf.
Then naturally the output should also vary by I → I + δI. In other words, we can define the variation
of a functional by:
\[
\delta I = I[f + \delta f] - I[f].
\]
Let us first think about what it means to vary the input function of a functional. With a normal
function, varying the input means either increasing or decreasing the value of the input, but with a
function, there are infinitely many ways to vary it. Below are a few examples of what f + δf could
look like relative to f :
We can stretch our function in infinitely many different ways, but we still have to be able to control
how much we stretch it. This motivates the following definition for the variation of a function:
\[
\delta f(x) = \epsilon \eta(x),
\]
where η(x) is an arbitrary function that determines how it will change the shape of the original
function, and ϵ is a parameter that determines how much it will change.
Varying ϵ is much easier than directly varying f because of two main reasons. First, regardless of
the η we choose, ϵ tells us how close we are to the original function. The smaller the magnitude of ϵ,
the closer to the original function. At ϵ = 0, we get the original function. Second, we have already
developed the tools for varying a function with respect to a variable: namely, derivatives.
Now, let us return to finding the variation of a functional using ϵ. If we change ϵ by a little bit, that
changes f, which then changes I[f ]. Since our goal is to find how much I[f ] changed, we can write
this in terms of rates of change, that is:
\[
\delta I = \left. \frac{dI[f + \epsilon \eta]}{d\epsilon} \right|_{\epsilon = 0} \epsilon.
\]
Note that we evaluate the derivative at ϵ = 0 since we’re looking for the variation of I when we vary
the original function. Another way to arrive at this expression is a first-order Taylor expansion of
I[f + ϵη] − I[f ].
Additional note: It may be tempting to think of the derivative in the equation above as the functional
derivative, and although it is closely related, it is not quite the same.
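As a concrete sanity check, the direct variation δI can be compared against the first-order formula ε·dI/dε|ε=0 for a toy functional (the functional I[f] = ∫f² dx and the choices of f and η below are arbitrary illustrations, not from the text):

```python
import numpy as np

# Toy functional I[f] = integral of f(x)^2 over [0, 1], used to check that
# deltaI = I[f + eps*eta] - I[f] matches the first-order formula
# eps * dI/d(eps) evaluated at eps = 0.
x = np.linspace(0.0, 1.0, 10001)

def integrate(vals):
    # trapezoidal rule, written out to avoid version-specific numpy names
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(x)) / 2.0)

def I(f):
    return integrate(f**2)

f = np.sin(x)        # original function
eta = x * (1.0 - x)  # an arbitrary variation shape, zero at the endpoints
eps = 1e-4

delta_I = I(f + eps * eta) - I(f)            # direct variation
first_order = eps * integrate(2 * f * eta)   # eps * dI/d(eps) at eps = 0 for this I
print(delta_I, first_order)  # agree to first order in eps
```

The two values differ only at order ε², which is exactly what the first-order definition of the variation predicts.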
2.3 Properties of the Variation
In some ways the variation behaves similarly to the differential, and in other ways it does not. Here we
will list the two main properties we will use for calculations. No attempt at a proof will be made.
1. $f \, \delta x = \delta F$, where $F$ is the antiderivative of $f$. This property is a case where the variation and
differential work similarly. For example, $x \, \delta x = \delta\left(\tfrac{1}{2} x^2\right)$.
The choice of parameters for the functional seems obvious in this example, as it depends only on these
three, but it is entirely possible to derive a different Euler-Lagrange equation by taking higher
derivatives as parameters. As we will see later with Hamilton's Principle, this is the most useful form.
Thus, we have to minimize the integral:
\[
S = \int_{x_1}^{x_2} f[x, y(x), y'(x)] \, dx,
\]
We look at the variation of S by looking at what happens when we vary y(x), which also affects y′(x).
For convenience, we will define $Y(x) = y(x) + \epsilon \eta(x)$.
We also know that the endpoints of our function should remain the same when we vary it. For example,
with the catenary, it makes sense that the endpoints of the chain are fixed and only the shape is unknown.
Thus, we have
\[
y(x_1) = y_1, \quad y(x_2) = y_2,
\]
\[
\eta(x_1) = \eta(x_2) = 0.
\]
Now we can return to evaluating the derivative. We write S as an integral of f and exchange the order
of the derivative and integral, giving:
\[
\left. \frac{dS[y + \epsilon \eta]}{d\epsilon} \right|_{\epsilon = 0} = \left. \frac{d}{d\epsilon} \int_{x_1}^{x_2} f[x, Y(x), Y'(x)] \, dx \right|_{\epsilon = 0} = \int_{x_1}^{x_2} \left. \frac{\partial f}{\partial \epsilon} \right|_{\epsilon = 0} dx.
\]
By the multivariable chain rule,
\[
\frac{\partial f}{\partial \epsilon} = \frac{\partial f}{\partial Y} \frac{\partial Y}{\partial \epsilon} + \frac{\partial f}{\partial Y'} \frac{\partial Y'}{\partial \epsilon} = \eta \frac{\partial f}{\partial Y} + \eta' \frac{\partial f}{\partial Y'}.
\]
We now plug this expression back into the integral for further manipulation. We will integrate the
second term first using integration by parts, in a form that is not quite as commonly seen:
\[
\int u' v \, dx = uv - \int u v' \, dx.
\]
Applying this, we have:
\[
\int_{x_1}^{x_2} \eta' \frac{\partial f}{\partial y'} \, dx = \left[ \eta \frac{\partial f}{\partial y'} \right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \eta \frac{d}{dx} \frac{\partial f}{\partial y'} \, dx = - \int_{x_1}^{x_2} \eta \frac{d}{dx} \frac{\partial f}{\partial y'} \, dx.
\]
The first term on the RHS vanishes because, as we mentioned earlier, the endpoints of the curve must
remain fixed, so η evaluated there must be 0.
Substituting back into the integral, we get the equation:
\[
\left. \frac{dS}{d\epsilon} \right|_{\epsilon = 0} = \int_{x_1}^{x_2} \eta \left( \frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} \right) dx = 0.
\]
Since η(x) is an arbitrary function, this equation must be satisfied for all η, and by the fundamental
lemma of the calculus of variations it follows that
\[
\frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} = 0.
\]
Earlier we wrote y as a function of x, but here we will actually write x as a function of y to simplify
calculations. Thus, the integral is:
\[
\int_{y_1}^{y_2} \mu g y \sqrt{1 + x'(y)^2} \, dy = \int_{y_1}^{y_2} f[y, x(y), x'(y)] \, dy.
\]
Since $f$ does not depend on $x$ explicitly, $\partial f / \partial x = 0$, and the Euler-Lagrange equation reduces to:
\[
\frac{d}{dy} \frac{\partial f}{\partial x'} = 0
\]
\[
\frac{\partial f}{\partial x'} = \text{const.}
\]
\[
\frac{\mu g y x'}{\sqrt{1 + (x')^2}} = \text{const.}
\]
\[
\frac{y x'}{\sqrt{1 + (x')^2}} = C
\]
\[
x' = \sqrt{\frac{C^2}{y^2 - C^2}}
\]
\[
x = \int \frac{dy}{\sqrt{(y/C)^2 - 1}} = C \cosh^{-1} \frac{y}{C}
\]
\[
y = C \cosh \frac{x}{C}
\]
Of course, the particular solution depends on the boundary conditions, but this tells us that a chain will
always hang in the shape of a cosh curve.
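As a minimal numerical check, the conserved quantity yx′/√(1 + (x′)²) from the derivation should be constant and equal to C along a cosh curve (the value C = 2 and the sampled x-range are arbitrary illustrative choices):

```python
import numpy as np

# Along the catenary y = C*cosh(x/C), the quantity y*x'/sqrt(1 + x'^2),
# with x' = dx/dy, should be constant and equal to C.
C = 2.0
x = np.linspace(0.5, 3.0, 400)
y = C * np.cosh(x / C)

dxdy = np.gradient(x, y)  # x'(y) by finite differences on a nonuniform grid
Q = y * dxdy / np.sqrt(1.0 + dxdy**2)
print(Q.min(), Q.max())  # both close to C = 2.0
```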
5 Hamilton’s Principle
Thus far, we have been developing the mathematical foundations we need for Lagrangian Mechanics,
but now we can do the physics. We will relate the Euler-Lagrange equation we already derived to
mechanics by showing that a physical law is equivalent to a problem of finding stationary points.
We start by considering a system that moves along a single coordinate as a function of time, say
x(t). If we assume our system is conservative, we have:
\[
F = m\ddot{x} = -\frac{dU(x)}{dx}.
\]
Multiplying both sides by a variation $\delta x$ gives:
\[
m\ddot{x} \, \delta x = -\frac{dU}{dx} \delta x.
\]
Then, moving everything to one side, and integrating with respect to time, we have:
\[
\int_{t_1}^{t_2} \left( -m\ddot{x} \, \delta x - \frac{dU}{dx} \delta x \right) dt = 0.
\]
Integrating the first term by parts, we get:
\[
\int_{t_1}^{t_2} -m\ddot{x} \, \delta x \, dt = \left[ -m\dot{x} \, \delta x \right]_{t_1}^{t_2} + \int_{t_1}^{t_2} m\dot{x} \, \delta \dot{x} \, dt = \int_{t_1}^{t_2} m\dot{x} \, \delta \dot{x} \, dt = \int_{t_1}^{t_2} \delta \left( \frac{1}{2} m\dot{x}^2 \right) dt = \int_{t_1}^{t_2} \delta T \, dt.
\]
We use $T = \frac{1}{2} m\dot{x}^2$ to denote the kinetic energy. Once again, the first term on the RHS vanishes
because $\delta x = 0$ at the endpoints. Integrating the second term, we get:
\[
\int_{t_1}^{t_2} -\frac{dU}{dx} \delta x \, dt = \int_{t_1}^{t_2} -\delta U \, dt.
\]
Refer back to the properties of the variation if you are confused about these calculations. Plugging
everything back into the original equation, we get:
\[
\int_{t_1}^{t_2} (\delta T - \delta U) \, dt = \delta \int_{t_1}^{t_2} (T - U) \, dt = 0.
\]
Defining the Lagrangian as $L = T - U$, this becomes Hamilton's Principle:
\[
\delta \int_{t_1}^{t_2} L \, dt = 0.
\]
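The stationarity of the action can be verified numerically: discretize the action for a specific system and check that perturbing the true path only raises it (here the stationary point is a minimum). This sketch uses a particle falling under gravity with U = mgx; the system and all numbers are illustrative choices, not from the text:

```python
import numpy as np

# Hamilton's principle, checked numerically: discretize the action
# S = integral of (T - U) dt for a particle falling under gravity.
m, g = 1.0, 9.81
t = np.linspace(0.0, 1.0, 20001)

def action(x):
    v = np.gradient(x, t)  # velocity by finite differences
    lagrangian = 0.5 * m * v**2 - m * g * x
    # trapezoidal rule for the time integral
    return float(np.sum((lagrangian[1:] + lagrangian[:-1]) * np.diff(t)) / 2.0)

x_true = -0.5 * g * t**2     # true trajectory, released from rest
eta = np.sin(np.pi * t)      # perturbation vanishing at the endpoints
for eps in (0.0, 0.1, -0.1):
    print(eps, action(x_true + eps * eta))  # eps = 0 gives the smallest action
```

Perturbing in either direction increases the action, consistent with δ∫L dt = 0 along the actual path.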
6 Lagrange’s Equations
Hamilton’s Principle tells us that for the path a system takes, x(t), the integral of the Lagrangian
must be stationary. This problem is exactly what the Euler-Lagrange equation solves. Combining the
two, we get Lagrange’s Equations. Note that L = L[t, x, ẋ].
Thus, the Euler-Lagrange equation tells us:
\[
\frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} = 0.
\]
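As a quick check, Lagrange's equation can be evaluated numerically along a known solution. This sketch uses a harmonic oscillator with L = ½mẋ² − ½kx² (an illustrative system not discussed in the text); along x(t) = cos(ωt), the left-hand side should vanish:

```python
import numpy as np

# Lagrange's equation for L = (1/2) m xdot^2 - (1/2) k x^2:
# dL/dx - d/dt(dL/dxdot) = -k x - d/dt(m xdot), which should be ~0
# along the known solution x(t) = cos(w t).
m, k = 1.0, 4.0
w = np.sqrt(k / m)
t = np.linspace(0.0, 2.0, 20001)
x = np.cos(w * t)

xdot = np.gradient(x, t)             # velocity
p = m * xdot                         # dL/dxdot, the momentum
residual = -k * x - np.gradient(p, t)
print(np.abs(residual[100:-100]).max())  # ~0 away from the endpoints
```

The residual is zero up to finite-difference error, confirming that the known trajectory satisfies Lagrange's equation.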
7 Generalized Coordinates
Although we derived Lagrange's equation with only one coordinate, it can be applied to each of the
coordinates x, y, z. In fact, we can use any coordinate system. In general,
\[
\frac{\partial L}{\partial q_i} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} = 0,
\]
where the $q_i$ are the generalized coordinates of the system.
For the single Cartesian coordinate $x$, notice that:
\[
\frac{\partial L}{\partial x} = -\frac{dU}{dx} = F_x
\]
\[
\frac{\partial L}{\partial \dot{x}} = \frac{dT}{d\dot{x}} = m\dot{x} = p_x.
\]
This suggests that for any generalized coordinate, we can define the following:
\[
i\text{th component of generalized force} = \frac{\partial L}{\partial q_i}
\]
\[
i\text{th component of generalized momentum} = \frac{\partial L}{\partial \dot{q}_i}.
\]
As an example, consider a simple pendulum of mass $m$ and length $l$, with the angle $\theta$ from the vertical as the generalized coordinate. Then:
\[
T = \frac{1}{2} m l^2 \dot{\theta}^2, \quad U = -mgl \cos\theta.
\]
\[
L = T - U = \frac{1}{2} m l^2 \dot{\theta}^2 + mgl \cos\theta.
\]
\[
\frac{\partial L}{\partial \theta} - \frac{d}{dt} \frac{\partial L}{\partial \dot{\theta}} = 0
\]
\[
-mgl \sin\theta - \frac{d}{dt}\left( m l^2 \dot{\theta} \right) = 0
\]
\[
-mgl \sin\theta - m l^2 \ddot{\theta} = 0
\]
\[
g \sin\theta + l\ddot{\theta} = 0.
\]
This is the same differential equation you would get by applying Newton’s 2nd Law. So why use
Lagrange’s equation over Newton’s 2nd Law? Well, it would take quite a bit of space to solve a
non-trivial example, so instead I will link a handout on applying Lagrangian mechanics to a double
pendulum, where the differences between Lagrangian mechanics and Newton's 2nd Law become much
clearer.
https://www.phys.lsu.edu/faculty/gonzalez/Teaching/Phys7221/DoublePendulum.pdf
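To connect back to the pendulum equation of motion, here is a minimal sketch that integrates g sin θ + lθ̈ = 0 with a hand-rolled 4th-order Runge-Kutta step and compares the result against the small-angle solution (the values of g, l, the step size, and the initial angle are arbitrary illustrative choices):

```python
import math

# Integrate the pendulum equation g*sin(theta) + l*theta'' = 0, i.e.
# theta'' = -(g/l) sin(theta), as a first-order system (theta, omega).
g, l = 9.81, 1.0

def rk4_step(theta, omega, dt):
    """One classical 4th-order Runge-Kutta step for (theta, omega)."""
    def f(th, om):
        return om, -(g / l) * math.sin(th)   # theta' = omega, omega' = -(g/l) sin(theta)
    k1 = f(theta, omega)
    k2 = f(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
    k3 = f(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
    k4 = f(theta + dt * k3[0], omega + dt * k3[1])
    theta += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    omega += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return theta, omega

theta0 = 0.1                      # small release angle, starting from rest
theta, omega, dt = theta0, 0.0, 1e-3
for step in range(2000):          # integrate to t = 2 s
    theta, omega = rk4_step(theta, omega, dt)

# small-angle solution: theta(t) = theta0 * cos(sqrt(g/l) * t)
approx = theta0 * math.cos(math.sqrt(g / l) * 2.0)
print(theta, approx)  # nearly equal for this small amplitude
```

For larger release angles the two curves drift apart, since the small-angle approximation sin θ ≈ θ no longer holds.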
9 Closing Remarks
This is really only an introduction to the subject. What happens when you use coordinates that are
time-dependent? What happens if the system is not conservative? These are problems that I still
haven’t learned about. The goal of this handout is to give some insight as to why finding equations
of motion can be thought of as a minimization problem or finding stationary points in general.
10 Appendix
10.1 Partial Derivatives
It turns out that most things in the real world are described by more than one variable. For example,
take a metal plate with different temperatures at different points. We have to use two coordinates
to describe where a point is, whether it’s rectangular coordinates or polar coordinates. We can then
write a function f (x, y) that gives us the temperature at the corresponding point.
One useful question to ask is about the rate of change of temperature as we walk along this metal
plate. The problem is that there are infinitely many directions to walk in. Oftentimes, it's useful to
find the rates of change with respect to each of the coordinates in the coordinate system we are using,
which is where partial derivatives come in.
Partial derivatives give the rates of change with respect to a single variable, and when doing calcu-
lations, we take the derivative with respect to that variable while treating all the other variables like
constants.
$\dfrac{\partial f}{\partial x}$ is read as the partial derivative of $f$ with respect to $x$.
For example, if $f(x, y) = x^2 + y$, then $\partial f / \partial x = 2x$.
The reason it turns into a partial derivative is that, before integrating, the function still depends
on $x$, but it also depends on $y$, since that's what we're taking the derivative with respect to. Since we
only want the rate of change with respect to $y$, we take a partial derivative.
Recall the single-variable chain rule:
\[
\frac{df}{dt} = \frac{df}{dx} \frac{dx}{dt}.
\]
When a function depends on multiple variables, each of those variables will have its own rate of change
and we sum up the effects of the chain rule on each variable.
\[
\frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}.
\]
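The multivariable chain rule can be checked numerically; this sketch uses f(x, y) = x² + y along an arbitrary illustrative path x = cos(t), y = t²:

```python
import math

# Multivariable chain rule check for f(x, y) = x^2 + y along x = cos(t), y = t^2.
def f(x, y):
    return x**2 + y

def x_of(t):
    return math.cos(t)

def y_of(t):
    return t**2

t = 0.7
# chain rule: df/dt = (df/dx) dx/dt + (df/dy) dy/dt = 2x*(-sin t) + 1*(2t)
chain = 2.0 * x_of(t) * (-math.sin(t)) + 1.0 * (2.0 * t)

# direct derivative of f(x(t), y(t)) by a central finite difference
h = 1e-6
direct = (f(x_of(t + h), y_of(t + h)) - f(x_of(t - h), y_of(t - h))) / (2 * h)
print(chain, direct)  # agree to high precision
```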
References
[1] Taylor, J. R. (2005). Classical Mechanics. University Science Books.
[2] Fowler, M. (2015). Graduate Classical Mechanics. University of Virginia.
[3] Engel, E. & Dreizler, R. M. (2011). Density Functional Theory: An Advanced Course. Springer.
[4] Thompson, S. (1914). Calculus Made Easy (2nd edition). Macmillan and Co. Limited.
[5] Gonzalez, G. (2006). Double Pendulum. Louisiana State University.
https://www.phys.lsu.edu/faculty/gonzalez/Teaching/Phys7221/DoublePendulum.pdf