
Lagrangian Mechanics But It Makes Sense

Sean He
Acknowledgements
Special thanks to Andrew Wang for proofreading and providing feedback.
Dedicated to my favorite physics clubber William “Willy” Wang.

Preface
Lagrangian Mechanics... a sort of holy grail in undergraduate physics, or so I’ve heard. Lagrange’s
equation itself is not hard to learn to use, as long as you understand partial derivatives. However, the
origin of the topic is usually left somewhat ambiguous. Most sources derive the Euler-Lagrange
equation with a lot of arbitrary definitions and confusing notation. Then there is the choice of the
Lagrangian: why is it L = T − U? In short, without a good teacher, it is extremely hard to learn the
theory behind Lagrangian Mechanics.
Well, I obviously somehow managed, since I’m here writing this handout. After reading too many
articles and Classical Mechanics textbook chapters, I think I have a decent grasp on Lagrangian
Mechanics. (Hopefully saying that none of this handout is rigorous will cover for any mistakes I
make.) With a good grasp of calculus, some free time to learn a bit of Multivariable Differential
Calculus (try the MIT OCW 18.02 lectures, or read the Appendix), and some patience, any high school
senior should be able to understand Lagrangian Mechanics.
Good luck.

1 Motivations for the Calculus of Variations
The calculus of variations is the mathematical foundation that Lagrangian mechanics is built on, so
I want to motivate both the need to develop Lagrangian mechanics and the types of problems the
calculus of variations can solve.
Newton’s 2nd Law is the foundation of Newtonian mechanics, and it works great in Cartesian coordinates.
However, it tends to become unwieldy in other coordinate systems. Lagrangian mechanics is the
alternative, which works with what we call generalized coordinates.
To see how the calculus of variations will help us with this goal, let us first see what types of problems
it can solve.

1.1 The Catenary


A catenary is a curve that describes how a flexible chain would hang between two poles.

The principle behind the catenary is that it must minimize the total potential energy of the chain,
that is, we want to minimize:
∫_C gy dm = ∫_C μgy ds = ∫_{x1}^{x2} μg y(x) √(1 + y′(x)²) dx.

Note that y is a function of x: we are finding a function such that the value of this integral is
minimized, rather than the typical calculus problem of finding a value that minimizes an expression.
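
To make the idea of assigning a number to a whole curve concrete, here is a small numerical sketch (my own illustration, not part of the original problem): it evaluates the potential-energy integral above for two sample curves sharing the same endpoints, with placeholder values for μ and g.

import numpy as np

# Evaluate U[y] = ∫ μ g y(x) √(1 + y'(x)²) dx for two curves with the same endpoints.
# μ and g are placeholder values chosen only for illustration.
mu, g = 1.0, 9.8
x = np.linspace(-1.0, 1.0, 2001)

def potential_energy(y):
    dydx = np.gradient(y, x)                       # numerical y'(x)
    integrand = mu * g * y * np.sqrt(1.0 + dydx**2)
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))  # trapezoid rule

catenary = np.cosh(x)                              # the shape derived in Section 4
chord = np.full_like(x, np.cosh(1.0))              # a straight line through the same endpoints
print(potential_energy(catenary), potential_energy(chord))

Different input curves give different numbers; the question the calculus of variations answers is which curve makes this number stationary.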

1.2 Fermat’s Principle


Another important example is the path that light takes between two points. In a single uniform medium,
light obviously travels in a straight line. However, in media of varying refractive index, Fermat
found that light takes the path that requires the shortest time of travel. Given that the speed of
light in a medium with refractive index n is v = c/n, the quantity we seek to minimize is:

∫_{t1}^{t2} dt = ∫_C ds/v = (1/c) ∫_C n ds = (1/c) ∫_{x1}^{x2} n √(1 + y′(x)²) dx.

Again we are finding a function that minimizes the integral.

1.3 Lagrangian
As we have seen in the previous sections, the calculus of variations is designed to tackle problems where
we find the stationary points of an integral with respect to a function (which happened to be minimization
in our examples). It turns out that finding the path a system takes in mechanics is also a problem of
finding stationary points. We will first use the calculus of variations to see how to go about finding
stationary points, and then derive Hamilton’s principle to find the expression that, when stationary,
gives us the equations of motion.

2 The Functional and the Variation


2.1 The Functional
Since we are minimizing expressions with respect to a function rather than a variable, we need to
introduce the functional. The functional I[f ] is a type of function that maps a function to a real
number. For example, with Fermat’s principle, the functional would map a function representing
the path light takes to the amount of time it takes for light to travel that path. Then, making the
functional stationary would give us the path that minimizes the time of travel.

2.2 The Variation


Thus, we need to develop calculus tools for the functional. Similar to the derivative of a function,
there is the derivative of a functional. However, it is a bit complicated to understand, and not strictly
necessary for our purposes, so we will instead turn to the differential.
The rigorous definition of the derivative involves a limit on the local slope of a function, but there is
another nonrigorous method involving differentials that gives more intuition on infinitesimal change.
We will explore just enough to give insight into how we should define a “differential” for a functional,
but I highly recommend Silvanus Thompson’s Calculus Made Easy for further reading.
Suppose that we have a function f(x) = x². If we change the x value by a bit, let’s say x → x + dx,
then the value of the function also changes, from f → f + df. We can use this to find an expression
for df:

f + df = f(x + dx)
df = f(x + dx) − f(x)
df = (x + dx)² − x²
df = 2x dx + (dx)²
df ≈ 2x dx.

Note that we take a first-order approximation and drop the (dx)² term, as it is negligible compared
to dx. In general, the differential can be seen as an infinitesimal nudge to the input. Notice how the
derivative appears in the equation. While we were able to solve directly for the differential in this case
since we had an explicit function, that is not always possible, and more generally we have

df = (df/dx) dx.
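
As a quick sanity check (my own numerical example), the first-order relation above can be verified directly for f(x) = x²:

# Numerical check that f(x + dx) − f(x) ≈ f'(x)·dx for small dx, with f(x) = x².
f = lambda x: x**2
x, dx = 3.0, 1e-6
df_exact = f(x + dx) - f(x)   # actual change in f
df_linear = 2 * x * dx        # first-order estimate df = f'(x) dx
print(df_exact, df_linear)    # agree up to the (dx)² term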

We are now ready to construct a similar definition for the functional. To do this we will introduce the
variation, represented by δ. Suppose that for a functional I[f], we vary the input function by f → f + δf.
Then naturally the output should also vary, by I → I + δI. In other words, we can define the variation
of a functional by:

δI = I[f + δf] − I[f].

Let us first think about what it means to vary the input function of a functional. With a normal
function, varying the input means either increasing or decreasing the value of the input, but with a
function, there are infinitely many ways to vary it. Below are a few examples of what f + δf could
look like relative to f :

We can stretch our function in infinitely many different ways, but we still have to be able to control
how much we stretch it. This motivates the following definition for the variation of a function:

δf (x) = ϵη(x),

where η(x) is an arbitrary function that determines how it will change the shape of the original
function, and ϵ is a parameter that determines how much it will change.
Varying ϵ is much easier than directly varying f for two main reasons. First, regardless of
the η we choose, ϵ tells us how close we are to the original function: the smaller the magnitude of ϵ,
the closer we are to the original function, and at ϵ = 0 we recover it exactly. Second, we have already
developed the tools for varying a function with respect to a variable: namely, derivatives.
Now, let us return to finding the variation of a functional using ϵ. If we change ϵ by a little bit, that
changes f, which then changes I[f ]. Since our goal is to find how much I[f ] changed, we can write
this in terms of rates of change, that is:

δI = (dI[f + ϵη]/dϵ)|_{ϵ=0} ϵ.

Note that we evaluate the derivative at ϵ = 0 since we’re looking for the variation of I when we vary
the original function. Another way to arrive at this expression is a first-order Taylor expansion of
I[f + ϵη] − I[f ].
Additional note: It may be tempting to think of the derivative in the equation above as the functional
derivative, and although it is closely related, it is not quite the same.
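
Here is a numerical sketch of this definition (my own example, with an assumed functional I[f] = ∫₀¹ f(x)² dx): varying the input by ϵη and differentiating with respect to ϵ at ϵ = 0 reproduces the analytic first-order change, which for this functional is 2∫ f η dx.

import numpy as np

# I[f] = ∫₀¹ f(x)² dx, perturbed by ε·η(x). Estimate dI/dε at ε = 0 by a central
# difference and compare with the analytic first variation 2∫ f η dx.
x = np.linspace(0.0, 1.0, 2001)
f = np.sin(np.pi * x)               # the "original" function
eta = x * (1.0 - x)                 # an arbitrary shape, zero at the endpoints

def I(g):                           # the functional: a whole curve in, one number out
    vals = g**2
    return np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(x))

eps = 1e-6
numeric = (I(f + eps * eta) - I(f - eps * eta)) / (2 * eps)
fe = 2 * f * eta
analytic = np.sum(0.5 * (fe[1:] + fe[:-1]) * np.diff(x))
print(numeric, analytic)            # the two should agree closely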

2.3 Properties of the Variation
In some ways the variation behaves similarly to the differential, but in other ways it does not. Here we
will list the two main properties we will use for calculations. No attempt at a proof will be made.
1. f δx = δF, where F is an antiderivative of f. This property is a case where the variation and the
differential work similarly. For example, x δx = δ(½x²).


2. (δf)′ = δ(f′). Self-explanatory.

2.4 Stationary Points


With derivatives, stationary points are quite easy to understand. If you think of derivatives as rates
of change, stationary points must occur at points where the instantaneous rate of change is 0. If you
think of derivatives as the slope of a tangent line, it must be where the tangent line is flat. However,
what does this mean in terms of differentials?
At a stationary point, if you change the value of the input variable by a finite amount, the value of the
function will likely change, but if you change the input by an infinitesimal amount, then (to first order)
the value of the function stays the same. This is what is meant when the instantaneous rate of change is 0,
or the tangent line is flat: locally, at the stationary point, the value of the function is constant.
The same analogy can be applied to functionals. Locally at a stationary point of a functional, if
you change the input function infinitesimally, the value of the functional does not change, or in other
words, the variation of the functional is 0 at any stationary point. Thus, to find the stationary points
of any functional, the following condition has to be satisfied:

δI = 0 =⇒ (dI[f + ϵη]/dϵ)|_{ϵ=0} = 0.

3 The Euler-Lagrange Equation


Suppose we take the catenary example from before and decide to minimize the integral. Let us denote
the integrand by

f[x, y(x), y′(x)] = μg y(x) √(1 + y′(x)²).

The choice of parameters for the functional seems obvious for this example, as the integrand only depends
on these three, but it is entirely possible to derive a different Euler-Lagrange equation by taking higher
derivatives as parameters. As we will see later with Hamilton’s Principle, this is the most useful form.
Thus, we have to minimize the integral:
S = ∫_{x1}^{x2} f[x, y(x), y′(x)] dx,

which means we must have δS = 0, or



(dS[y + ϵη]/dϵ)|_{ϵ=0} = 0.

We look at the variation of S by looking at what happens when we vary y(x), which also affects y′(x).
For convenience, we will define the following:

Y(x) = y(x) + ϵη(x)
Y′(x) = y′(x) + ϵη′(x).

We also know that the endpoints of our function should remain the same when we vary it. For example,
with the catenary, it makes sense that the endpoints of the chain are fixed and only the shape is unknown.
Thus, we have

y(x1) = y1, y(x2) = y2
η(x1) = η(x2) = 0.

Now we can return to evaluating the derivative. We write S as an integral of f and exchange the order
of the derivative and integral, giving:
(dS[y + ϵη]/dϵ)|_{ϵ=0} = (d/dϵ ∫_{x1}^{x2} f[x, Y(x), Y′(x)] dx)|_{ϵ=0} = ∫_{x1}^{x2} (∂f/∂ϵ)|_{ϵ=0} dx.

Applying the multivariable chain rule, we get

∂f/∂ϵ = (∂f/∂Y)(∂Y/∂ϵ) + (∂f/∂Y′)(∂Y′/∂ϵ) = η ∂f/∂Y + η′ ∂f/∂Y′.

Evaluating the derivative at ϵ = 0, Y and Y ′ become y and y ′ , respectively, and we get:



(∂f/∂ϵ)|_{ϵ=0} = η ∂f/∂y + η′ ∂f/∂y′.

We now plug this expression back into the integral for further manipulation. We will deal with the
second term first, using integration by parts. We will use a variant of it that is not quite as commonly
seen, that is:

∫ u′v dx = uv − ∫ uv′ dx.

Applying this, we have:

∫_{x1}^{x2} η′ (∂f/∂y′) dx = [η ∂f/∂y′]_{x1}^{x2} − ∫_{x1}^{x2} η d/dx(∂f/∂y′) dx = −∫_{x1}^{x2} η d/dx(∂f/∂y′) dx.

The first term on the RHS vanishes because, as we mentioned earlier, the endpoints of the curve must
remain fixed, so η evaluated there is 0.
Substituting back into the integral, we get the equation:
(dS/dϵ)|_{ϵ=0} = ∫_{x1}^{x2} η (∂f/∂y − d/dx(∂f/∂y′)) dx = 0.

Since η(x) is an arbitrary function, this equation must be satisfied for all η, and it follows that
 
∂f/∂y − d/dx(∂f/∂y′) = 0.

This is the Euler-Lagrange equation.
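
If you want to see the equation applied mechanically, SymPy provides a helper for exactly this; the sketch below assumes a reasonably recent SymPy and sets the constants μ and g to 1 for brevity.

from sympy import Function, sqrt, symbols
from sympy.calculus.euler import euler_equations

# Apply the Euler-Lagrange equation to the catenary integrand y·√(1 + y'²)
# (with μg = 1). The result is the ODE a minimizing y(x) must satisfy.
x = symbols('x')
y = Function('y')

f = y(x) * sqrt(1 + y(x).diff(x)**2)
print(euler_equations(f, [y(x)], x))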

4 Applications of the Euler-Lagrange Equation


As an example, let’s solve the catenary problem and find the equation of a hanging chain. Recall from
earlier that we are trying to minimize
∫_C gy dm = ∫_C μgy ds.

Earlier we wrote y as a function of x, but here we will actually write x as a function of y to simplify
calculations. Thus, the integral is,
∫_{y1}^{y2} μg y √(1 + x′(y)²) dy = ∫_{y1}^{y2} f[y, x(y), x′(y)] dy.

The Euler-Lagrange equation tells us


 
∂f/∂x − d/dy(∂f/∂x′) = 0.

Since f does not depend on x, we have ∂f /∂x = 0. Thus, we have:

 
d/dy(∂f/∂x′) = 0
∂f/∂x′ = const.
μg y x′/√(1 + (x′)²) = const.
y x′/√(1 + (x′)²) = C.

Here we can solve for x′ and then integrate to find x(y).

x′ = √(C²/(y² − C²)) = 1/√((y/C)² − 1)
x = ∫ dy/√((y/C)² − 1) = C cosh⁻¹(y/C)
y = C cosh(x/C).

Of course, the particular solution depends on the boundary conditions, but this tells us that a chain always
hangs in the shape of a cosh function.
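
As a numerical sanity check (my own addition), we can plug y = C cosh(x/C) back into the Euler-Lagrange expression ∂f/∂y − d/dx(∂f/∂y′) for f = μg y √(1 + y′²) and confirm that the residual is essentially zero; μ, g, and C below are arbitrary positive values.

import numpy as np

# Plug y = C·cosh(x/C) into ∂f/∂y − d/dx(∂f/∂y') for f = μ g y √(1 + y'²)
# and check that the residual is ~0 (up to finite-difference error at the edges).
mu, g, C = 1.3, 9.8, 0.7
x = np.linspace(-1.0, 1.0, 20001)
y = C * np.cosh(x / C)
yp = np.sinh(x / C)                                # y'(x)

df_dy = mu * g * np.sqrt(1 + yp**2)                # ∂f/∂y
df_dyp = mu * g * y * yp / np.sqrt(1 + yp**2)      # ∂f/∂y'
residual = df_dy - np.gradient(df_dyp, x)          # ∂f/∂y − d/dx(∂f/∂y')
print(np.max(np.abs(residual)))                    # small compared to df_dy itself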

5 Hamilton’s Principle
Thus far, we have been developing the mathematical foundations we need for Lagrangian Mechanics,
but now we can do the physics. We will relate the Euler-Lagrange equation we already derived to
mechanics by showing that a physical law is equivalent to a problem of finding stationary points.
We start by considering a system that moves along a single coordinate with respect to time, let’s say
x(t). If we assume our system is conservative, we have:

F = mẍ = −dU(x)/dx.

We can then multiply both sides of the equation by a small variation δx in x:

mẍ δx = −(dU/dx) δx.

Then, moving everything to one side and integrating with respect to time, we have:

∫_{t1}^{t2} (−mẍ δx − (dU/dx) δx) dt = 0.

Integrating the first term by parts, we get:

∫_{t1}^{t2} −mẍ δx dt = [−mẋ δx]_{t1}^{t2} + ∫_{t1}^{t2} mẋ δẋ dt = ∫_{t1}^{t2} mẋ δẋ dt = ∫_{t1}^{t2} δ(½mẋ²) dt = ∫_{t1}^{t2} δT dt.

We use T = ½mẋ² to denote the kinetic energy. Once again, the first term on the RHS vanishes
because δx = 0 at the endpoints. Integrating the second term, we get:

∫_{t1}^{t2} −(dU/dx) δx dt = ∫_{t1}^{t2} −δU dt.

Refer back to the properties of the variation if you are confused by these calculations. Plugging
everything back into the original equation, we get:

∫_{t1}^{t2} (δT − δU) dt = δ ∫_{t1}^{t2} (T − U) dt = 0.

Thus we define the Lagrangian L = T − U, and

δ ∫_{t1}^{t2} L dt = 0.

This is Hamilton’s Principle.
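
A small numerical illustration of this principle (my own example): for free fall, the action of the true path barely changes under small endpoint-preserving perturbations, while a larger perturbation changes it noticeably.

import numpy as np

# Free fall x(t) = -½ g t². Compare the action S = ∫ (T − U) dt of the true path
# with perturbed paths x + ε·η that keep the same endpoints (η vanishes there).
m, g = 1.0, 9.8
t = np.linspace(0.0, 1.0, 5001)
eta = np.sin(np.pi * t)                 # perturbation shape, zero at t = 0 and t = 1

def action(x):
    v = np.gradient(x, t)               # velocity by finite differences
    L = 0.5 * m * v**2 - m * g * x      # Lagrangian T − U, with U = m g x
    return np.sum(0.5 * (L[1:] + L[:-1]) * np.diff(t))

x_true = -0.5 * g * t**2
for eps in (0.0, 0.01, -0.01, 0.1):
    print(eps, action(x_true + eps * eta))   # the change is second order in ε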

6 Lagrange’s Equations
Hamilton’s Principle tells us that for the path a system takes, x(t), the integral of the Lagrangian
must be stationary. This problem is exactly what the Euler-Lagrange equation solves. Combining the
two, we get Lagrange’s Equations. Note that L = L[t, x, ẋ].
Thus, the Euler-Lagrange equation tells us:
 
∂L/∂x − d/dt(∂L/∂ẋ) = 0.

7 Generalized Coordinates
Although we derived Lagrange’s equation with only one coordinate, it can be applied to any coordinate
x, y, z. In fact, we can use any coordinate system. In general,
 
∂L/∂qi − d/dt(∂L/∂q̇i) = 0,

where qi is a generalized coordinate in a system r = (q1 , q2 , . . . , qn ).


This also allows us to define a few more useful generalized quantities. Returning to the x coordinate,
notice how

∂L/∂x = −dU/dx = Fx

∂L/∂ẋ = dT/dẋ = mẋ = px.

This suggests that for any generalized coordinate, we can define the following:

ith component of the generalized force = ∂L/∂qi
ith component of the generalized momentum = ∂L/∂q̇i.
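
A quick symbolic check of the identities above (a sketch; x and ẋ are treated as independent symbols, which is exactly how the partial derivatives of L are meant, and U is a generic potential):

from sympy import Function, Rational, diff, symbols

# For L = ½ m ẋ² − U(x), ∂L/∂x recovers the force and ∂L/∂ẋ the momentum.
m, x, xdot = symbols('m x xdot')
U = Function('U')

L = Rational(1, 2) * m * xdot**2 - U(x)
print(diff(L, x))      # -Derivative(U(x), x), i.e. the force F_x
print(diff(L, xdot))   # m*xdot, i.e. the momentum p_x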

8 Applications of Lagrange’s Equations


Finally, time to solve some physics problems! (or not) Let’s first look at a simple pendulum: a mass m
on a rod of length l. We can describe the energies in terms of the angle θ:

T = ½ml²θ̇², and U = −mgl cos θ.

Then, the Lagrangian is

L = T − U = ½ml²θ̇² + mgl cos θ.

Applying Lagrange’s equation, we get

 
∂L/∂θ − d/dt(∂L/∂θ̇) = 0
−mgl sin θ − d/dt(ml²θ̇) = 0
−mgl sin θ − ml²θ̈ = 0
g sin θ + lθ̈ = 0.

This is the same differential equation you would get by applying Newton’s 2nd Law. So why use
Lagrange’s equation over Newton’s 2nd Law? Well, it would take quite a bit of space to solve a
non-trivial example, so instead I will link a handout on applying Lagrangian mechanics to a double
pendulum, where the differences between Lagrangian mechanics and Newton’s 2nd Law become much
clearer.
https://www.phys.lsu.edu/faculty/gonzalez/Teaching/Phys7221/DoublePendulum.pdf
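
For reference, here is a short SymPy sketch (assuming a recent SymPy) that re-derives the pendulum equation of motion above; it is the same mechanical procedure, just done by the computer.

from sympy import Function, Rational, cos, symbols
from sympy.calculus.euler import euler_equations

# Pendulum Lagrangian L = ½ m l² θ̇² + m g l cos θ, with coordinate θ(t).
t, m, g, l = symbols('t m g l', positive=True)
theta = Function('theta')

L = Rational(1, 2) * m * l**2 * theta(t).diff(t)**2 + m * g * l * cos(theta(t))
print(euler_equations(L, [theta(t)], t))
# expect an equation equivalent to −m g l sin θ − m l² θ̈ = 0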

9 Closing Remarks
This is really only an introduction to the subject. What happens when you use coordinates that are
time-dependent? What happens if the system is not conservative? These are problems that I still
haven’t learned about. The goal of this handout is to give some insight into why finding equations
of motion can be thought of as a problem of finding stationary points (which is often a minimization).

10 Appendix
10.1 Partial Derivatives
It turns out that most things in the real world are described by more than one variable. For example,
take a metal plate with different temperatures at different points. We have to use two coordinates
to describe where a point is, whether it’s rectangular coordinates or polar coordinates. We can then
write a function f (x, y) that gives us the temperature at the corresponding point.
One useful question to ask is about the rate of change of temperature as we walk along this metal
plate. The problem is that there are infinitely many directions to walk in. Oftentimes, it is useful to
find the rates of change with respect to each of the coordinates in the coordinate system we are using,
which is where partial derivatives come in.
Partial derivatives give the rates of change with respect to a single variable, and when doing calcu-
lations, we take the derivative with respect to that variable while treating all the other variables like
constants.
∂f/∂x is read as the partial derivative of f with respect to x.
For example, if f(x, y) = x² + y, then ∂f/∂x = 2x.
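
You can check examples like this with SymPy (a quick sketch):

from sympy import symbols, diff

# Partial derivatives of f(x, y) = x² + y.
x, y = symbols('x y')
f = x**2 + y
print(diff(f, x), diff(f, y))   # 2*x and 1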

10.2 And in relation to integrals


We will also use the following trick:
d/dy ∫_{x1}^{x2} f dx = ∫_{x1}^{x2} (∂f/∂y) dx.

The reason it turns into a partial derivative is that, before integrating, the function still depends
on x as well as on y (the variable we are differentiating with respect to). Since we only want the rate
of change with respect to y, we take a partial derivative.
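
A concrete check of this identity (my example, with f = x²y on [0, 1]):

from sympy import symbols, integrate, diff

# d/dy ∫₀¹ x²·y dx  versus  ∫₀¹ ∂/∂y (x²·y) dx — both give 1/3.
x, y = symbols('x y')
f = x**2 * y
lhs = diff(integrate(f, (x, 0, 1)), y)    # differentiate after integrating
rhs = integrate(diff(f, y), (x, 0, 1))    # integrate the partial derivative
print(lhs, rhs)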

10.3 Multivariable Chain Rule


Recall the chain rule from single variable calculus:

df/dt = (df/dx)(dx/dt).
When a function depends on multiple variables, each of those variables will have its own rate of change
and we sum up the effects of the chain rule on each variable.

df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt).
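
A quick check of the multivariable chain rule with a concrete example (my choice: f = x² + y² with x = cos t, y = sin t, so f is constant along the path and df/dt = 0):

from sympy import symbols, sin, cos, diff, simplify

t, X, Y = symbols('t X Y')
x, y = cos(t), sin(t)
F = X**2 + Y**2                                   # f written in terms of x and y

direct = diff(F.subs({X: x, Y: y}), t)            # differentiate f(x(t), y(t)) directly
chain = (diff(F, X) * diff(x, t) + diff(F, Y) * diff(y, t)).subs({X: x, Y: y})
print(simplify(direct), simplify(chain))          # both simplify to 0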

References
[1] Taylor, J. R. (2005). Classical Mechanics. University Science Books.
[2] Fowler, M. (2015). Graduate Classical Mechanics. University of Virginia.
[3] Engel, E., & Dreizler, R. M. (2011). Density Functional Theory: An Advanced Course. Springer.
[4] Thompson, S. (1914). Calculus Made Easy (2nd edition). Macmillan and Co. Limited.
[5] Gonzalez, G. (2006). Double Pendulum. Louisiana State University.
https://www.phys.lsu.edu/faculty/gonzalez/Teaching/Phys7221/DoublePendulum.pdf

