Astrophysical Flows
LITERATURE
The material covered and presented in these lecture notes has relied heavily on a
number of excellent textbooks listed below.
• Theoretical Astrophysics
by M. Bartelmann (ISBN-978-3-527-41004-0)
• Galactic Dynamics
by J. Binney & S. Tremaine (ISBN-978-0-691-13027-9)
CHAPTER 1
What is a fluid?
A fluid is a substance that can flow, has no fixed shape, and offers little resistance to an external stress.
• In a fluid the constituent particles (atoms, ions, molecules, stars) can 'freely' move past one another.
• A fluid changes its shape at a steady rate when acted upon by a stress force.
What is a plasma?
A plasma is a fluid in which (some of) the constituent particles are electrically
charged, such that the interparticle force (Coulomb force) is long-range in nature.
Fluid Demographics:
All fluids are made up of large numbers of constituent particles, which can be
molecules, atoms, ions, dark matter particles or even stars. Different types of fluids mainly differ in the nature of their inter-particle forces. Examples of inter-particle forces are the Coulomb force (among charged particles in a plasma), van der Waals forces (among molecules in a neutral fluid) and gravity (among the stars in a galaxy).
Fluids can be either collisional or collisionless, where we define a collision as an
interaction between constituent particles that causes the trajectory of at least one
of these particles to be deflected ‘noticeably’. Collisions among particles drive the
system towards thermodynamic equilibrium (at least locally) and the velocity
distribution towards a Maxwell-Boltzmann distribution.
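For reference, the Maxwell-Boltzmann velocity distribution (the standard equilibrium form, quoted here without derivation) for particles of mass m at temperature T and mean streaming velocity $\vec{u}$ is
$$f(\vec{v}) = \left(\frac{m}{2\pi k_B T}\right)^{3/2}\exp\left(-\frac{m\,|\vec{v}-\vec{u}|^2}{2 k_B T}\right)\,,$$
normalized such that $\int f(\vec{v})\,d^3\vec{v} = 1$.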
In neutral fluids the particles only interact with each other on very small scales. Typically the inter-particle force is a van der Waals force, which drops off very rapidly. Put differently, the typical cross section for interactions is the size of the particles (i.e., the Bohr radius for atoms), which is very small. Hence, to good approximation, the particles in a neutral fluid move ballistically in between highly localized collisions (see Fig. 1a).
Figure 1: Examples of particle trajectories in (a) a collisional, neutral fluid, (b) a plasma, and (c) a self-gravitating collisionless, neutral fluid. Note how different the dynamics are.
In a fully ionized plasma the particles exert Coulomb forces ($\vec{F} \propto r^{-2}$) on each other. Because these are long-range forces, the velocity of a charged particle is more likely to change due to a succession of many small deflections than due to one large one. As a consequence, particle trajectories in a highly ionized plasma (see Fig. 1b) are very different from those in a neutral fluid.
Here $\langle\vec{F}_i\rangle$ is the time (or ensemble) averaged force at $i$ and $\delta\vec{F}_i(t)$ is the instantaneous deviation due to the discrete nature of the particles that make up the system. As $N \to \infty$, $\delta\vec{F}_i \to 0$ and the system is said to be collisionless; its dynamics are governed by the collective force from all particles rather than by collisions with individual particles.
Here $n$ is the number density in ${\rm cm}^{-3}$, $T$ is the temperature in degrees Kelvin, and $e$ is the electrical charge of an electron in e.s.u. Related to the Debye length is the Plasma parameter
$$g \equiv \frac{1}{n\,\lambda_{\rm D}^3} \simeq 8.6\times 10^{-3}\, n^{1/2}\, T^{-3/2}$$
As an example, let's consider three different astrophysical plasmas: the ISM (interstellar medium), the ICM (intra-cluster medium), and the interior of the Sun. The warm phase of the ISM has a temperature of $T \sim 10^4\,{\rm K}$ and a number density of $n \sim 1\,{\rm cm}^{-3}$. This implies $N_{\rm D} \sim 1.2\times 10^8$. Hence, the warm phase of the ISM can be treated as a collisionless plasma on sufficiently small time-scales (for example when treating high-frequency plasma waves). The ICM has a much lower average density of $\sim 10^{-4}\,{\rm cm}^{-3}$ and a much higher temperature ($\sim 10^7\,{\rm K}$). This implies a much larger number of particles per Debye volume of $N_{\rm D} \sim 4\times 10^{14}$. Hence, the ICM can typically be approximated as a collisionless plasma. The interior of stars, though, has a similar temperature of $\sim 10^7\,{\rm K}$ but at much higher density ($n \sim 10^{23}\,{\rm cm}^{-3}$), implying $N_{\rm D} \sim 10$. Hence, stellar interiors are highly collisional plasmas!
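As a quick numerical check of these estimates, the following minimal sketch evaluates $N_{\rm D} = n\,\lambda_{\rm D}^3$ for the three plasmas. It assumes $\lambda_{\rm D} \simeq 4.9\,(T/n)^{1/2}\,{\rm cm}$ (T in K, n in cm$^{-3}$), a coefficient chosen because it reproduces the $N_{\rm D}$ values quoted above; conventions differing by factors of order $\sqrt{2}$ exist depending on whether ions contribute to the screening.

import numpy as np

# Debye length in cm, assuming lambda_D ~ 4.9 (T/n)^{1/2} cm (T in K, n in cm^-3).
# The coefficient is an assumption that matches the N_D values quoted in the text.
def debye_length(n, T):
    return 4.9 * np.sqrt(T / n)

def plasma_parameters(n, T):
    lam_D = debye_length(n, T)
    N_D = n * lam_D**3          # number of particles per Debye volume
    g = 1.0 / N_D               # plasma parameter
    return lam_D, N_D, g

# (n [cm^-3], T [K]) for the warm ISM, the ICM, and the solar interior
for name, n, T in [("warm ISM", 1.0, 1e4),
                   ("ICM", 1e-4, 1e7),
                   ("solar interior", 1e23, 1e7)]:
    lam_D, N_D, g = plasma_parameters(n, T)
    print(f"{name:15s} lambda_D = {lam_D:.3g} cm, N_D = {N_D:.3g}, g = {g:.3g}")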
NOTE: Although a gas is said to be compressible, many gaseous flows (including many astrophysical flows) are effectively incompressible. When the gas is in a container, you can easily compress it with a piston, but if you move your hand (sub-sonically) through the air, the gas adjusts itself to the perturbation in an incompressible fashion (it moves out of the way at the speed of sound). The small compression in front of your hand propagates away at the speed of sound (a sound wave) and disperses the gas particles out of the way. In astrophysics we rarely encounter containers, and subsonic gas flow is often treated (to good approximation) as being incompressible.
Throughout what follows, we use ‘fluid’ to mean a neutral fluid, and ‘plasma’ to
refer to a fluid in which the particles are electrically charged.
NOTE: An ideal (or perfect) fluid should NOT be confused with an ideal or perfect
gas, which is defined as a gas in which the pressure is solely due to the kinetic motions
of the constituent particles. As we show in Chapter 11, and as you have probably
seen before, this implies that the pressure can be written as P = n kB T , with n the
particle number density, kB the Boltzmann constant, and T the temperature.
• Giant (gaseous) planets: Similar to stars, gaseous planets are large spheres
of gas, albeit with a rocky core. Contrary to stars, though, the gas is typically
so dense and cold that it can no longer be described with the equation of
state of an ideal gas.
• White Dwarfs & Neutron stars: These objects (stellar remnants) can be
described as fluids with a degenerate equation of state.
• Proto-planetary disks: the dense disks of gas and dust surrounding newly
formed stars out of which planetary systems form.
• Accretion disks: Accretion disks are gaseous, viscous disks in which the viscosity (enhanced due to turbulence) causes a net radial transport of matter towards the center of the disk, while angular momentum is transported outwards (accretion).
• Galaxies (stellar component): as already mentioned above, the stellar component of galaxies is a collisionless fluid; to very, very good approximation, two stars in a galaxy will never collide with each other.
Course Outline:
In this course, we will start with standard hydrodynamics, which applies mainly to neutral, collisional fluids. We derive the fluid equations from kinetic theory: starting with the Liouville theorem we derive the Boltzmann equation, from which in turn we derive the continuity, momentum and energy equations. Next we discuss a variety of different flows: vorticity, incompressible barotropic flow, viscous flow, and turbulent flow, before addressing fluid instabilities and shocks. Next we study numerical fluid dynamics. We discuss methods used to numerically solve the partial differential equations that describe fluids, and we construct a simple 1D hydro-code that we test on an analytical test case (the Sod shock tube problem). After a brief discussion of collisionless fluid dynamics, highlighting the subtle differences between the Jeans equations and the Navier-Stokes equations, we turn our attention to plasmas. We again derive the relevant equations (Vlasov and Lenard-Balescu) from kinetic theory, and then discuss MHD and several applications. The goal is to end with a brief discussion of the Fokker-Planck equation, which describes the impact of long-range collisions in a gravitational N-body system or a plasma.
It is assumed that the reader is familiar with vector calculus, with curvi-linear coordinate systems, and with differential equations. A brief overview of these topics, highlighting the most important essentials, is provided in Appendices A-E.
CHAPTER 2
2. a (set of) equation(s) to describe how the state variables change with time
This system is described by the position vectors ($\vec{x}$) and momentum vectors ($\vec{p}$) of all the N particles, i.e., by $(\vec{x}_1, \vec{x}_2, ..., \vec{x}_N, \vec{p}_1, \vec{p}_2, ..., \vec{p}_N)$.
If the particles are truly classical, in that they can't emit or absorb radiation, then one can define a Hamiltonian
$$\mathcal{H}(\vec{x}_i, \vec{p}_i, t) \equiv \mathcal{H}(\vec{x}_1, \vec{x}_2, ..., \vec{x}_N, \vec{p}_1, \vec{p}_2, ..., \vec{p}_N, t) = \sum_{i=1}^{N} \vec{p}_i\cdot\dot{\vec{x}}_i - \mathcal{L}(\vec{x}_i, \dot{\vec{x}}_i, t)$$
where $\mathcal{L}(\vec{x}_i, \dot{\vec{x}}_i, t)$ is the system's Lagrangian, and $\dot{\vec{x}}_i = d\vec{x}_i/dt$.
The equations that describe the time-evolution of these state-variables are the Hamil-
tonian equations of motion:
$$\dot{\vec{x}}_i = \frac{\partial\mathcal{H}}{\partial\vec{p}_i}\,; \qquad \dot{\vec{p}}_i = -\frac{\partial\mathcal{H}}{\partial\vec{x}_i}$$
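To make this level-1 description concrete, here is a minimal sketch (not part of the original notes) that integrates the Hamiltonian equations of motion for a single particle in an illustrative external potential using a leapfrog scheme; for $\mathcal{H} = p^2/2m + V(\vec{x})$ the equations reduce to $\dot{\vec{x}} = \vec{p}/m$ and $\dot{\vec{p}} = -\nabla V$.

import numpy as np

def grad_V(x):
    """Gradient of an illustrative external potential V(x) = 0.5*|x|^2 (an assumption)."""
    return x

def leapfrog(x, p, m=1.0, dt=0.01, nsteps=1000):
    """Integrate dx/dt = dH/dp = p/m and dp/dt = -dH/dx = -grad_V(x)."""
    traj = []
    for _ in range(nsteps):
        p = p - 0.5 * dt * grad_V(x)   # half kick
        x = x + dt * p / m             # drift
        p = p - 0.5 * dt * grad_V(x)   # half kick
        traj.append((x.copy(), p.copy()))
    return traj

x0 = np.array([1.0, 0.0, 0.0])
p0 = np.array([0.0, 1.0, 0.0])
traj = leapfrog(x0, p0)
xf, pf = traj[-1]
# The energy H = p^2/2 + |x|^2/2 should be (nearly) conserved:
print(0.5 * pf @ pf + 0.5 * xf @ xf)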
Example 2: an electromagnetic field
The state of this system is described by the electric and magnetic fields, $\vec{E}(\vec{x})$ and $\vec{B}(\vec{x})$, respectively, and the equations that describe their evolution with time are the Maxwell equations, which contain the terms $\partial\vec{E}/\partial t$ and $\partial\vec{B}/\partial t$.
$$\lambda = \frac{h}{p} \simeq \frac{h}{\sqrt{m\, k_B T}}$$
Here $h$ is the Planck constant, $p$ is the particle's momentum, $m$ is the particle mass, $k_B$ is the Boltzmann constant, and $T$ is the temperature of the fluid. This de Broglie
wavelength indicates the ‘characteristic’ size of the wave-packet that according to
quantum mechanics describes the particle, and is typically very small. Except for
extremely dense fluids such as white dwarfs and neutron stars, or ‘exotic’ types of
dark matter (i.e., ‘fuzzy dark matter’), the de Broglie wavelength is always much
smaller than the mean particle separation, and classical, Newtonian mechanics suf-
fices. As we have seen above, a classical, Newtonian system of N particles can be
described by a Hamiltonian, and the corresponding equations of motions. We refer
to this as the level-1 description of fluid dynamics (see under ‘example 1’ above).
Clearly, when N is very large it is unfeasible to solve the 2N equations of motion for all the positions and momenta of all particles. We need another approach.
In the level-2 approach, one introduces the distribution function $f(\vec{x}, \vec{p}, t)$, which describes the number density of particles in 6-dimensional 'phase-space' $(\vec{x}, \vec{p})$ (i.e., how many particles there are with positions in the 3D volume element $d^3\vec{x}$ around $\vec{x}$ and momenta in the 3D volume element $d^3\vec{p}$ around $\vec{p}$). The equation that describes how $f(\vec{x}, \vec{p}, t)$ evolves with time
is called the Boltzmann equation for a neutral fluid. If the fluid is collisionless
this reduces to the Collisionless Boltzmann equation (CBE). If the collisionless
fluid is a plasma, the same equation is called the Vlasov equation. Often CBE and
Vlasov are used without distinction.
At the final level-3, the fluid is modelled as a continuum. This means we ignore
that fluids are made up of constituent particles, and rather describe the fluid with
continuous fields, such as the density and velocity fields ⇢(~x) and ~u(~x) which assign
to each point in space a scalar quantity ⇢ and a vector quantity ~u, respectively. For
an ideal neutral fluid, the state in this level-3 approach is fully described by four
fields: the density ⇢(~x), the velocity field ~u(~x), the pressure P (~x), and the internal,
specific energy $\varepsilon(\vec{x})$ (or, equivalently, the temperature $T(\vec{x})$). In the MHD treatment of plasmas one also needs to specify the magnetic field $\vec{B}(\vec{x})$. The equations that describe the time-evolution of $\rho(\vec{x})$, $\vec{u}(\vec{x})$, and $\varepsilon(\vec{x})$ are called the continuity
equation, the Navier-Stokes equations, and the energy equation, respectively.
Collectively, we shall refer to these as the hydrodynamic equations or fluid equa-
tions. In MHD you have to slightly modify the Navier-Stokes equations, and add
an additional induction equation describing the time-evolution of the magnetic
field. For an ideal (or perfect) fluid (i.e., no viscosity and/or conductivity), the
Navier-Stokes equations reduce to what are known as the Euler equations. For a
collisionless gravitational system, the equivalent of the Euler equations are called the
Jeans equations.
Throughout this course, we mainly focus on the level-3 treatment, to which we refer
hereafter as the macroscopic approach. However, for completeness we will derive
these continuum equations starting from a completely general, microscopic level-1
treatment. Along the way we will see how subtle differences in the inter-particle forces give rise to a rich variety in dynamics (fluid vs. plasma, collisional vs. collisionless).
1. the FE needs to be much smaller than the characteristic scale in the problem, which is the scale over which the hydrodynamical quantities Q change by an order of magnitude, i.e.
$$l_{\rm FE} \ll l_{\rm scale} \sim \frac{Q}{\nabla Q}$$
2. the FE needs to be sufficiently large that fluctuations due to the finite number of particles ('discreteness noise') can be neglected, i.e.,
$$n\, l_{\rm FE}^3 \gg 1$$
3. the FE needs to be sufficiently large that it 'knows' about the local conditions through collisions among the constituent particles, i.e.,
$$l_{\rm FE} \gg \lambda_{\rm mfp}$$
The ratio of the mean-free path, $\lambda_{\rm mfp}$, to the characteristic scale, $l_{\rm scale}$, is known as the Knudsen number: ${\rm Kn} = \lambda_{\rm mfp}/l_{\rm scale}$. Fluids typically have ${\rm Kn} \ll 1$; if not, then one is not justified in using the continuum approach (level-3) to fluid dynamics, and one is forced to resort to a more statistical approach (level-2).
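As an order-of-magnitude illustration, the sketch below estimates the mean free path $\lambda_{\rm mfp} = 1/(n\sigma)$ and the Knudsen number; all input values are illustrative placeholders (roughly warm-ISM-like), not numbers taken from the notes.

# Order-of-magnitude Knudsen number estimate; inputs are illustrative placeholders.
n = 1.0            # number density [cm^-3], e.g. warm ISM
sigma = 1e-15      # collisional cross section [cm^2], ~ atomic size squared
l_scale = 3e18     # characteristic scale of the flow [cm], ~ 1 pc

lambda_mfp = 1.0 / (n * sigma)      # mean free path [cm]
Kn = lambda_mfp / l_scale           # Knudsen number

print(f"lambda_mfp = {lambda_mfp:.3g} cm, Kn = {Kn:.3g}")
# Kn << 1  ->  the continuum (level-3) treatment is justified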
Note that fluid elements can NOT be defined for a collisionless fluid (which has
an infinite mean-free path). This is one of the reasons why one cannot use the
macroscopic approach to derive the equations that govern a collisionless fluid.
In the case of an ideal (or perfect) fluid (i.e., with zero viscosity and conductivity), the Navier-Stokes equations (which are the hydrodynamical momentum equations) reduce to what are called the Euler equations. In that case, the evolution of fluid elements is described by the following set of hydrodynamical equations:
• If the EoS is barotropic, i.e., if P = P (⇢), then the energy equation is not needed
to close the set of equations. There are two barotropic EoS that are encountered
frequently in astrophysics: the isothermal EoS, which describes a fluid for which
cooling and heating always balance each other to maintain a constant temperature,
and the adiabatic EoS, in which there is no net heating or cooling (other than
adiabatic heating or cooling due to the compression or expansion of volume, i.e., the
P dV work). We will discuss these cases in more detail later in the course.
• No EoS exists for a collisionless fluid. Consequently, for a collisionless fluid one
can never close the set of fluid equations, unless one makes a number of simplifying
assumptions (i.e., one postulates various symmetries).
• If the fluid is not ideal, then the momentum equations include terms that contain the kinematic viscosity, $\nu$, and the energy equation includes a term that contains the conductivity, K. Both $\nu$ and K depend on the mean-free path of the constituent
particles and therefore depend on the temperature and collisional cross-section of
the particles. Closure of the set of hydrodynamic equations then demands additional
constitutive equations ⌫(T ) and K(T ). Often, though, ⌫ and K are simply assumed
to be constant (the T -dependence is ignored).
• In the case the fluid is self-gravitating (i.e., in the case of stars or galaxies) there is an additional unknown, the gravitational potential $\Phi$. However, there is also an additional equation, the Poisson equation relating $\Phi$ to $\rho$, so that the set of equations remains closed.
• In the case of a plasma, the charged particles give rise to electric and magnetic fields. Each fluid element now carries 6 additional scalars ($E_x, E_y, E_z, B_x, B_y, B_z$), and the set of equations has to be complemented with the Maxwell equations that describe the time evolution of $\vec{E}$ and $\vec{B}$.
Fluid Dynamics: Eulerian vs. Lagrangian Formalism:
One distinguishes two different formalisms for treating fluid dynamics:
• Eulerian Formalism: in this formalism one solves the fluid equations ‘at
fixed positions’: the evolution of a quantity Q is described by the local (or
partial, or Eulerian) derivative @Q/@t. An Eulerian hydrodynamics code is a
‘grid-based code’, which solves the hydro equations on a fixed grid, or using
an adaptive grid, which refines resolution where needed. The latter is called
Adaptive Mesh Refinement (AMR).
$$\frac{dQ}{dt} = \frac{\partial Q}{\partial t} + \vec{u}\cdot\nabla Q$$
Using a similar derivation, but now for a vector quantity $\vec{A}(\vec{x}, t)$, it is straightforward to show that
$$\frac{d\vec{A}}{dt} = \frac{\partial\vec{A}}{\partial t} + (\vec{u}\cdot\nabla)\,\vec{A}$$
which, in index-notation, is written as
$$\frac{dA_i}{dt} = \frac{\partial A_i}{\partial t} + u_j\,\frac{\partial A_i}{\partial x_j}$$
Another way to derive the above relation between the Eulerian and Lagrangian derivatives is to think of dQ/dt as
$$\frac{dQ}{dt} = \lim_{\delta t\to 0}\frac{Q(\vec{x}+\delta\vec{x},\, t+\delta t) - Q(\vec{x}, t)}{\delta t}$$
Using that
$$\vec{u} = \lim_{\delta t\to 0}\frac{\vec{x}(t+\delta t) - \vec{x}(t)}{\delta t} = \frac{\delta\vec{x}}{\delta t}$$
and
$$\nabla Q = \lim_{\delta\vec{x}\to 0}\frac{Q(\vec{x}+\delta\vec{x}, t) - Q(\vec{x}, t)}{\delta\vec{x}}$$
it is straightforward to show that this results in the same expression for the substan-
tial derivative as above.
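The following sketch (using sympy; my own illustration, not from the notes) verifies this operator identity for a scalar field that is passively advected with a constant velocity, $Q(\vec{x}, t) = F(\vec{x} - \vec{u}t)$: its substantial derivative $dQ/dt = \partial Q/\partial t + \vec{u}\cdot\nabla Q$ vanishes identically.

import sympy as sp

x, y, z, t = sp.symbols('x y z t')
ux, uy, uz = sp.symbols('u_x u_y u_z')   # constant velocity components
F = sp.Function('F')                      # arbitrary advected profile

# A scalar field that is simply carried along with the flow:
Q = F(x - ux*t, y - uy*t, z - uz*t)

# Substantial derivative: dQ/dt = partial_t Q + u . grad Q
dQdt = sp.diff(Q, t) + ux*sp.diff(Q, x) + uy*sp.diff(Q, y) + uz*sp.diff(Q, z)

print(sp.simplify(dQdt))   # -> 0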
• Streaklines: the locus of points of all the fluid particles that have passed con-
tinuously through a particular spatial point in the past. Dye steadily injected
into the fluid at a fixed point extends along a streakline.
Figure 2: Streaklines showing laminar flow across an airfoil; made by injecting dye
at regular intervals in the flow
• Particle paths: (aka pathlines) are the trajectories that individual fluid ele-
ments follow. The direction the path takes is determined by the streamlines of
the fluid at each moment in time.
Only if the flow is steady, which means that all partial time derivatives vanish (i.e., $\partial\vec{u}/\partial t = \partial\rho/\partial t = \partial P/\partial t = 0$), will streamlines be identical to streaklines and to particle paths. For a non-steady flow, they will differ from each other.
CHAPTER 3
Without any formal derivation (this comes later) we now present the hydrodynamic
equations for an ideal, neutral fluid. Note that these equations adopt the level-3
continuum approach discussed in the previous chapter.
                    Lagrangian                                                Eulerian

Continuity Eq:   $\frac{d\rho}{dt} = -\rho\,\nabla\cdot\vec{u}$               $\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\,\vec{u}) = 0$

Momentum Eqs:    $\frac{d\vec{u}}{dt} = -\frac{\nabla P}{\rho} - \nabla\Phi$    $\frac{\partial\vec{u}}{\partial t} + (\vec{u}\cdot\nabla)\,\vec{u} = -\frac{\nabla P}{\rho} - \nabla\Phi$

Energy Eq:       $\frac{d\varepsilon}{dt} = -\frac{P}{\rho}\,\nabla\cdot\vec{u} - \frac{\mathcal{L}}{\rho}$    $\frac{\partial\varepsilon}{\partial t} + \vec{u}\cdot\nabla\varepsilon = -\frac{P}{\rho}\,\nabla\cdot\vec{u} - \frac{\mathcal{L}}{\rho}$
NOTE: students should become familiar with switching between the Eulerian and
Lagrangian equations, and between the vector notation shown above and the
index notation. The latter is often easier to work with. When writing down the
index versions, make sure that each term carries the same index, and make use of the
Einstein summation convention. The only somewhat tricky term is the $(\vec{u}\cdot\nabla)\,\vec{u}$ term in the Eulerian momentum equations, which in index form is given by $u_j\,(\partial u_i/\partial x_j)$, where $i$ is the index carried by each term of the equation.
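Written out in full, the Eulerian momentum equations in index notation (summation over the repeated index j implied) thus read
$$\frac{\partial u_i}{\partial t} + u_j\,\frac{\partial u_i}{\partial x_j} = -\frac{1}{\rho}\frac{\partial P}{\partial x_i} - \frac{\partial\Phi}{\partial x_i}\,.$$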
($\nabla\cdot\vec{u} = 0$), which is also called solenoidal.
Momentum Equations: these equations simply state that one can accelerate a fluid element with either a gradient in the pressure, P, or a gradient in the gravitational potential, $\Phi$. Basically these momentum equations are nothing but Newton's $\vec{F} = m\vec{a}$ applied to a fluid element. In the above form, valid for an inviscid, ideal fluid, the momentum equations are called the Euler equations.
Energy Equation: the energy equation states that the only way that the specific, internal energy, $\varepsilon$, of a fluid element can change, in the absence of conduction, is by adiabatic compression or expansion, which requires a non-zero divergence of the velocity field (i.e., $\nabla\cdot\vec{u}\neq 0$), or by radiation (emission or absorption of photons). The latter is expressed via the net volumetric cooling rate,
$$\mathcal{L} = -\rho\,\frac{dQ}{dt} = \mathcal{C} - \mathcal{H}$$
Here Q is the thermodynamic heat, and $\mathcal{C}$ and $\mathcal{H}$ are the net volumetric cooling and heating rates, respectively.
If the ideal fluid is self-gravitating (as opposed to being placed in an external gravitational field), then one needs to complement the hydrodynamical equations with the Poisson equation: $\nabla^2\Phi = 4\pi G\rho$. In addition, closure requires an additional constitutive relation in the form of an equation-of-state $P = P(\rho, \varepsilon)$. If the ideal fluid obeys the ideal gas law, then we have the following two constitutive relations:
$$P = \frac{k_B T}{\mu\, m_p}\,\rho\,, \qquad \varepsilon = \frac{1}{\gamma - 1}\,\frac{k_B T}{\mu\, m_p}$$
(see Appendix H for details). Here $\mu$ is the mean molecular weight of the fluid in units of the proton mass, $m_p$, and $\gamma$ is the adiabatic index, which is often taken to be 5/3 as appropriate for a mono-atomic gas.
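A minimal sketch evaluating these two constitutive relations for assumed, illustrative input values (roughly warm-ISM-like, not numbers from the notes); the adiabatic sound speed $c_s = (\gamma P/\rho)^{1/2}$, which is not defined in this chapter but is standard, is included for convenience.

# Ideal-gas constitutive relations P(rho, T) and eps(T); inputs are illustrative.
k_B = 1.380649e-16    # Boltzmann constant [erg/K]
m_p = 1.67262192e-24  # proton mass [g]

def ideal_gas(rho, T, mu=0.6, gamma=5.0/3.0):
    """Return pressure [erg/cm^3], specific internal energy [erg/g],
    and adiabatic sound speed [cm/s] for an ideal gas."""
    P = rho * k_B * T / (mu * m_p)
    eps = k_B * T / ((gamma - 1.0) * mu * m_p)
    c_s = (gamma * P / rho) ** 0.5
    return P, eps, c_s

rho = 1.0 * m_p        # ~ 1 hydrogen mass per cm^3
print(ideal_gas(rho, 1e4))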
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\,\vec{u}) = 0$$
$$\frac{\partial(\rho\,\vec{u})}{\partial t} + \nabla\cdot\Pi = -\rho\,\nabla\Phi$$
$$\frac{\partial E}{\partial t} + \nabla\cdot\left[(E + P)\,\vec{u}\right] = \rho\,\frac{\partial\Phi}{\partial t} - \mathcal{L}$$
Here
$$\Pi = \rho\,\vec{u}\otimes\vec{u} + P\,\mathbb{I}$$
is the momentum flux density tensor (of rank 2), with $\mathbb{I}$ the unit tensor, and
$$E = \rho\left(\frac{1}{2}u^2 + \Phi + \varepsilon\right)$$
is the energy density.
NOTE: In the expression for the momentum flux density tensor, $\vec{A}\otimes\vec{B}$ is the tensor product of $\vec{A}$ and $\vec{B}$, defined such that $(\vec{A}\otimes\vec{B})_{ij} = a_i b_j$ (see Appendix A). Hence, the index-form of the momentum flux density tensor is simply $\Pi_{ij} = \rho\, u_i u_j + P\,\delta_{ij}$, with $\delta_{ij}$ the Kronecker delta function. Note that this expression is ONLY valid for an ideal fluid; in the next chapter we shall derive a more general expression for the momentum flux density tensor.
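As a concrete illustration of the tensor product, the sketch below builds $\Pi_{ij} = \rho\,u_i u_j + P\,\delta_{ij}$ with numpy; the numbers are arbitrary and purely illustrative.

import numpy as np

rho, P = 2.0, 5.0                 # arbitrary density and pressure
u = np.array([1.0, -0.5, 0.25])   # arbitrary local velocity

# Momentum flux density tensor of an ideal fluid: Pi_ij = rho u_i u_j + P delta_ij
Pi = rho * np.outer(u, u) + P * np.eye(3)

print(Pi)
print(np.allclose(Pi, Pi.T))      # symmetric, as expected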
Note also that whereas there is no source or sink term for the density, gradients in the
gravitational field act as a source of momentum, while its time-variability can cause
an increase or decrease in the energy density of the fluid (if the fluid is collisionless,
we call this violent relaxation). Another source/sink term for the energy density
is radiation (emission or absorption of photons).
CHAPTER 4
The hydrodynamic equations presented in the previous chapter are only valid for
an ideal fluid, i.e., a fluid without viscosity and conduction. We now examine the
origin of conduction and viscosity, and link the latter to the stress tensor, which
is an important quantity in all of fluid dynamics.
In an ideal fluid, the particles effectively have a mean-free path of zero, such that they cannot communicate with their neighboring particles. In reality, though, the mean-free path, $\lambda_{\rm mfp} = (n\sigma)^{-1}$, is finite, and particles "communicate" with each other through collisions. These collisions cause an exchange of momentum and energy among the particles involved, acting as a relaxation mechanism. Note that in a collisionless system the mean-free path is effectively infinite, and there is no two-body relaxation, only collective relaxation mechanisms (i.e., violent relaxation or wave-particle interactions).
• When there are gradients in velocity (”shear”) then the collisions among neigh-
boring fluid elements give rise to a net transport of momentum. The collisions
drive the system towards equilibrium, i.e., towards no shear. Hence, the collisions
act as a resistance to shear, which is called viscosity. See Fig. 3 for an illustration.
• When there are gradients in temperature (or, in other words, in specific internal energy), then the collisions give rise to a net transport of energy. Again, the collisions drive the system towards equilibrium, in which the gradients vanish, and the rate at which the fluid can erase a non-zero $\nabla T$ is called the (thermal) conductivity.
Figure 3: Illustration of the origin of viscosity and shear stress. Three neighboring fluid elements (1, 2 and 3) have different streaming velocities, $\vec{u}$. Due to the microscopic motions and collisions (characterized by a non-zero mean free path), there is a net transfer of momentum from the faster moving fluid elements to the slower moving fluid elements. This net transfer of momentum will tend to erase the shear in $\vec{u}(\vec{x})$, and therefore manifests itself as a shear-resistance, known as viscosity. Due to the transfer of momentum, the fluid elements deform; in our figure, 1 transfers linear momentum to the top of 2, while 3 extracts linear momentum from the bottom of 2. Consequently, fluid element 2 is sheared as depicted in the figure at time $t + \Delta t$. From the perspective of fluid element 2, some internal force (from within its boundaries) has exerted a shear-stress on its bounding surface.
A rigorous derivation is given in the classic monograph "The Mathematical Theory of Non-uniform Gases" by S. Chapman and T. Cowling. Briefly, in the Chapman-Enskog expansion one expands the distribution function (DF) as $f = f^{(0)} + \alpha f^{(1)} + \alpha^2 f^{(2)} + ...$. Here $f^{(0)}$ is the equilibrium DF of an ideal fluid, which is the Maxwell-Boltzmann distribution, while $\alpha$ is the Knudsen number, which is assumed to be small. Substituting this expansion in the Boltzmann equation yields, after some tedious algebra, the following expressions for $\mu$ and K:
$$\mu = \frac{a}{\sigma}\left(\frac{m\, k_B T}{\pi}\right)^{1/2}\,, \qquad K = \frac{5}{2}\, c_V\, \mu$$
Here $a$ is a numerical factor that depends on the details of the interparticle forces, $\sigma$ is the collisional cross section, and $c_V$ is the specific heat (i.e., per unit mass). Thus, for a given fluid (given $\sigma$ and m) we basically have that $\mu = \mu(T)$ and $K = K(T)$. Note that $\mu \propto T^{1/2}$; viscosity increases with temperature. This only holds for gases! For liquids we know from experience that viscosity decreases with increasing temperature (think of honey). Since in astrophysics we are mainly concerned with gas, $\mu \propto T^{1/2}$ will be a good approximation for most of what follows.
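To get a feel for the $\mu \propto T^{1/2}$ scaling, here is a small sketch evaluating the Chapman-Enskog-style expression; the numerical factor a and the cross section sigma below are purely illustrative placeholders (they are not values given in the notes), so only the scaling with T should be taken seriously.

import numpy as np

k_B = 1.380649e-16    # Boltzmann constant [erg/K]
m_H = 1.6735575e-24   # hydrogen atom mass [g]

def shear_viscosity(T, a=0.5, sigma=1e-15, m=m_H):
    """Estimate mu = (a/sigma) * sqrt(m k_B T / pi).
    'a' and 'sigma' are illustrative placeholder values."""
    return (a / sigma) * np.sqrt(m * k_B * T / np.pi)

for T in [1e2, 1e4, 1e6]:
    print(T, shear_viscosity(T))   # mu grows as T**0.5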
Now that we have a rough idea of what viscosity (resistance to shear) and conduc-
tivity (resistance to temperature gradients) are, we have to ask how to incorporate
them into our hydrodynamic equations.
$$\vec{v} = \vec{u} + \vec{w}$$
where $\langle\vec{v}\rangle = \vec{u}$, $\langle\vec{w}\rangle = 0$ and $\langle\cdot\rangle$ indicates the average over a fluid element. If we define $v_i$ as the velocity in the i-direction, we have that
$$\langle v_i\, v_j\rangle = u_i\, u_j + \langle w_i\, w_j\rangle$$
These di↵erent velocities allow us to define a number of di↵erent velocity tensors:
Stress Tensor:                         $\sigma_{ij} \equiv -\rho\,\langle w_i w_j\rangle$,   or   $\sigma = -\rho\,\langle\vec{w}\otimes\vec{w}\rangle$
Momentum Flux Density Tensor:          $\Pi_{ij} \equiv +\rho\,\langle v_i v_j\rangle$,   or   $\Pi = +\rho\,\langle\vec{v}\otimes\vec{v}\rangle$
Ram Pressure Tensor:                   $\Sigma_{ij} \equiv +\rho\, u_i u_j$,   or   $\Sigma = +\rho\,\vec{u}\otimes\vec{u}$
which are related according to $\sigma = \Sigma - \Pi$. Note that each of these tensors is manifestly symmetric (i.e., $\sigma_{ij} = \sigma_{ji}$, etc.), which implies that each of them has 6 independent components.
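The relation $\sigma = \Sigma - \Pi$ can be checked directly with a small Monte-Carlo sketch (my own illustration, with arbitrary numbers): draw random 'thermal' velocities inside a fluid element, form the three tensors from the ensemble averages, and verify the identity.

import numpy as np

rng = np.random.default_rng(42)
rho = 1.0
u = np.array([1.0, 2.0, 0.5])                  # streaming velocity
w = rng.normal(scale=0.3, size=(100000, 3))    # random (thermal) motions
w -= w.mean(axis=0)                            # enforce <w> = 0 exactly
v = u + w                                      # total particle velocities

Sigma = rho * np.outer(u, u)                               # +rho u_i u_j
Pi = rho * np.einsum('ni,nj->ij', v, v) / len(v)           # +rho <v_i v_j>
sigma = -rho * np.einsum('ni,nj->ij', w, w) / len(w)       # -rho <w_i w_j>

print(np.allclose(sigma, Sigma - Pi))          # True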
Note that the stress tensor is related to the microscopic random motions. These are the ones that give rise to pressure, viscosity and conductivity! The reason that $\sigma_{ij}$ is called the stress tensor is that it is related to the stress $\vec{\Sigma}(\vec{x}, \hat{n})$ acting on a surface with normal vector $\hat{n}$ located at $\vec{x}$ according to
$$\Sigma_i(\hat{n}) = \sigma_{ij}\, n_j$$
Here $\Sigma_i(\hat{n})$ is the i-component of the stress acting on a surface with normal $\hat{n}$, whose j-component is given by $n_j$. Hence, in general the stress will not necessarily be along the normal to the surface, and it is useful to decompose the stress into a normal stress, which is the component of the stress along the normal to the surface, and a shear stress, which is the component along the tangent to the surface.
To see that fluid elements in general are subjected to shear stress, consider the following: Consider a flow (i.e., a river) in which we inject a small, spherical blob (a fluid element) of dye. If the only stress to which the blob is subject is normal stress, the only thing that can happen to the blob is an overall compression or expansion. However, from experience we know that the blob of dye will shear into an extended, 'spaghetti'-like feature; hence, the blob is clearly subjected to shear stress, and this shear stress is obviously related to another tensor called the deformation tensor
$$T_{ij} = \frac{\partial u_i}{\partial x_j}$$
Since $\partial u_i/\partial x_j = 0$ in a static fluid ($\vec{u}(\vec{x}) = 0$), we see that in a static fluid the stress tensor can only depend on the normal stress, which we call the pressure.
Pascal's law for hydrostatics: In a static fluid, there is no preferred direction, and hence the (normal) stress has to be isotropic:
$$\mbox{static fluid} \iff \sigma_{ij} = -P\,\delta_{ij}$$
Sign Convention: The stress $\vec{\Sigma}(\vec{x}, \hat{n})$ acting at location $\vec{x}$ on a surface with normal $\hat{n}$ is exerted by the fluid on the side of the surface to which the normal points, on the fluid from which the normal points. In other words, a positive stress results in compression. Hence, in the case of pure, normal pressure, we have that $\vec{\Sigma} = -P\,\hat{n}$.
Viscous Stress Tensor: The expression for the stress tensor in the case of a static fluid motivates us to write in general
$$\sigma_{ij} = -P\,\delta_{ij} + \tau_{ij}$$
where we have introduced a new tensor, $\tau_{ij}$, which is known as the viscous stress tensor, or the deviatoric stress tensor.
Since the deviatoric stress tensor, $\tau_{ij}$, is only non-zero in the presence of shear in the fluid flow, this suggests that
$$\tau_{ij} = T_{ijkl}\,\frac{\partial u_k}{\partial x_l}$$
where Tijkl is a proportionality tensor of rank four. As described in Appendix G
(which is NOT part of the curriculum for this course), most (astrophysical) fluids are
Newtonian, in that they obey a number of conditions. As detailed in that appendix,
for a Newtonian fluid, the relation between the stress tensor and the deformation
tensor is given by
$$\sigma_{ij} = -P\,\delta_{ij} + \mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\frac{\partial u_k}{\partial x_k}\,\delta_{ij}\right) + \eta\,\frac{\partial u_k}{\partial x_k}\,\delta_{ij}$$
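A sketch that constructs this Newtonian stress tensor from a given velocity-gradient tensor $T_{ij} = \partial u_i/\partial x_j$; the input numbers (a pure shear flow) are arbitrary and only meant as an illustration.

import numpy as np

def newtonian_stress(grad_u, P, mu, eta=0.0):
    """sigma_ij = -P delta_ij
                 + mu (du_i/dx_j + du_j/dx_i - 2/3 div(u) delta_ij)
                 + eta div(u) delta_ij,
    with grad_u[i, j] = du_i/dx_j."""
    div_u = np.trace(grad_u)
    I = np.eye(3)
    tau = mu * (grad_u + grad_u.T - (2.0/3.0) * div_u * I) + eta * div_u * I
    return -P * I + tau

grad_u = np.array([[0.0, 1.0, 0.0],    # pure shear: du_x/dy = 1
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
print(newtonian_stress(grad_u, P=1.0, mu=0.1))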
Let’s take a closer look at these three quantities, starting with the pressure P . To be
exact, P is the thermodynamic equilibrium pressure, and is normally computed
thermodynamically from some equation of state, P = P (⇢, T ). It is related to the
translational kinetic energy of the particles when the fluid, in equilibrium, has reached equipartition of energy among all its degrees of freedom, including (in the case of molecules) rotational and vibrational degrees of freedom.
Using the above expression for $\sigma_{ij}$, and using that $\partial u_k/\partial x_k = \nabla\cdot\vec{u}$ (Einstein summation convention), it is easy to see that the mechanical pressure, $P_m \equiv -\frac{1}{3}\sigma_{kk}$ (i.e., minus the average normal stress), obeys
$$P_m = P - \eta\,\nabla\cdot\vec{u}$$
From this expression it is clear that the bulk viscosity, $\eta$, is only non-zero if $P \neq P_m$. This, in turn, can only happen if the constituent particles of the fluid have degrees of freedom beyond position and momentum (i.e., when they are molecules with rotational or vibrational degrees of freedom). Hence, for a fluid of monoatoms (ideal gas), $\eta = 0$. From the fact that $P = P_m + \eta\,\nabla\cdot\vec{u}$ it is clear that for an incompressible flow $P = P_m$ and the value of $\eta$ is irrelevant; bulk viscosity plays no role in incompressible fluids or flows. The only time when $P_m \neq P$ is when a
fluid consisting of particles with internal degrees of freedom (e.g., molecules) has just
undergone a large volumetric change (i.e., during a shock). In that case there may
be a lag between the time the translational motions reach equilibrium and the time
when the system reaches full equipartition in energy among all degrees of freedom.
In astrophysics, bulk viscosity can generally be ignored, but be aware that it may
be important in shocks. This only leaves the shear viscosity µ, which describes the
ability of the fluid to resist shear stress via momentum transport resulting from
collisions and the non-zero mean free path of the particles.
CHAPTER 5
As we have seen in the previous chapter, the effect of viscosity is captured by the stress tensor, which is given by
$$\sigma_{ij} = -P\,\delta_{ij} + \tau_{ij} = -P\,\delta_{ij} + \mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\frac{\partial u_k}{\partial x_k}\,\delta_{ij}\right) + \eta\,\frac{\partial u_k}{\partial x_k}\,\delta_{ij}$$
Note that in the limit $\mu \to 0$ and $\eta \to 0$, valid for an ideal fluid, $\sigma_{ij} = -P\,\delta_{ij}$. This suggests that we can incorporate viscosity in the hydrodynamic equations by simply replacing the pressure P with the stress tensor, i.e., $-P\,\delta_{ij} \to \sigma_{ij} = -P\,\delta_{ij} + \tau_{ij}$.
$$\rho\,\frac{du_i}{dt} = \frac{\partial(-P\,\delta_{ij})}{\partial x_j} + \frac{\partial\tau_{ij}}{\partial x_j} - \rho\,\frac{\partial\Phi}{\partial x_i}$$
It is more common, and more useful, to write out the viscous stress tensor, yielding
$$\rho\,\frac{du_i}{dt} = -\frac{\partial P}{\partial x_i} + \frac{\partial}{\partial x_j}\left[\mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\frac{\partial u_k}{\partial x_k}\,\delta_{ij}\right)\right] + \frac{\partial}{\partial x_i}\left(\eta\,\frac{\partial u_k}{\partial x_k}\right) - \rho\,\frac{\partial\Phi}{\partial x_i}$$
These are the Navier-Stokes equations (in Lagrangian index form) in all their glory, containing both the shear viscosity term and the bulk viscosity term (the latter is often ignored).
Note that $\mu$ and $\eta$ are usually functions of density and temperature, so that they have spatial variations. However, it is common to assume that these variations are sufficiently small that $\mu$ and $\eta$ can be treated as constants, in which case they can be taken outside the differentials. In what follows we will make this assumption as well.
Under that assumption, and ignoring the bulk viscosity term, the Lagrangian momentum equations become
$$\frac{d\vec{u}}{dt} = -\frac{\nabla P}{\rho} - \nabla\Phi + \nu\left[\nabla^2\vec{u} + \frac{1}{3}\nabla(\nabla\cdot\vec{u})\right]$$
where we have introduced the kinematic viscosity $\nu \equiv \mu/\rho$. Note that these equations reduce to the Euler equations in the limit $\nu \to 0$. Also, note that the $\nabla(\nabla\cdot\vec{u})$ term is only significant in the case of flows with variable compression (i.e., viscous dissipation of acoustic waves or shocks), and can often be ignored. This leaves the $\nu\nabla^2\vec{u}$ term as the main addition to the Euler equations. Yet, this simple 'diffusion' term (describing viscous momentum diffusion) dramatically changes the character of the equation, as it introduces a higher spatial derivative. Hence, additional boundary conditions are required to solve the equations. When solving problems with solid boundaries (not common in astrophysics), this condition is typically that the tangential (or shear) velocity at the boundary vanishes. Although this may sound ad hoc, it is supported by observation; for example, the blades of a fan collect dust.
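To illustrate the diffusive character of the $\nu\nabla^2\vec{u}$ term, here is a sketch that evolves a 1D shear profile $u_y(x)$ under $\partial u_y/\partial t = \nu\,\partial^2 u_y/\partial x^2$ with a simple explicit scheme. This toy problem (incompressible shear, no pressure or gravity, periodic box, arbitrary units) is an assumption of the sketch, not an example taken from the notes.

import numpy as np

# Explicit diffusion of a 1D shear profile u_y(x): du_y/dt = nu d^2u_y/dx^2
nx, L, nu = 200, 1.0, 1e-3
dx = L / nx
dt = 0.4 * dx**2 / nu                  # respect the explicit stability limit
x = np.linspace(0.0, L, nx, endpoint=False)
u = np.where(np.abs(x - 0.5*L) < 0.1*L, 1.0, 0.0)   # top-hat shear layer

for _ in range(500):
    lap = (np.roll(u, -1) - 2.0*u + np.roll(u, 1)) / dx**2   # periodic boundaries
    u = u + dt * nu * lap

print(u.min(), u.max())   # the initially sharp profile has diffused out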
Recall that when writing the Navier-Stokes equation in Eulerian form, we have that $d\vec{u}/dt \to \partial\vec{u}/\partial t + \vec{u}\cdot\nabla\vec{u}$. It is often useful to rewrite this extra term using the vector calculus identity
$$\vec{u}\cdot\nabla\vec{u} = \nabla\left(\frac{\vec{u}\cdot\vec{u}}{2}\right) + (\nabla\times\vec{u})\times\vec{u}$$
Hence, for an irrotational flow (i.e., a flow for which $\nabla\times\vec{u} = 0$), we have that $\vec{u}\cdot\nabla\vec{u} = \frac{1}{2}\nabla u^2$, where $u \equiv |\vec{u}|$.
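This vector identity can be verified symbolically; the sketch below (my own check, using sympy) confirms component by component that $(\vec{u}\cdot\nabla)\vec{u} = \nabla(\vec{u}\cdot\vec{u}/2) + (\nabla\times\vec{u})\times\vec{u}$ for a general velocity field.

import sympy as sp

x, y, z = sp.symbols('x y z')
u1, u2, u3 = [sp.Function(f'u{i}')(x, y, z) for i in (1, 2, 3)]
u = [u1, u2, u3]
X = [x, y, z]

# (u . grad) u
lhs = [sum(u[j] * sp.diff(u[i], X[j]) for j in range(3)) for i in range(3)]

# grad(|u|^2 / 2)
half_u2 = sum(ui**2 for ui in u) / 2
grad_term = [sp.diff(half_u2, Xi) for Xi in X]

# curl u
w = [sp.diff(u3, y) - sp.diff(u2, z),
     sp.diff(u1, z) - sp.diff(u3, x),
     sp.diff(u2, x) - sp.diff(u1, y)]

# (curl u) x u
cross = [w[1]*u[2] - w[2]*u[1],
         w[2]*u[0] - w[0]*u[2],
         w[0]*u[1] - w[1]*u[0]]

rhs = [grad_term[i] + cross[i] for i in range(3)]
print([sp.simplify(lhs[i] - rhs[i]) for i in range(3)])   # [0, 0, 0]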
Now that we have added the effect of viscosity, what remains is to add conduction. We can make progress by realizing that, on the microscopic level, conduction arises from collisions among the constituent particles, causing a flux in internal energy. The internal energy density of a fluid element is $\langle\frac{1}{2}\rho w^2\rangle$, where $\vec{w} = \vec{v} - \vec{u}$ is the random motion of a particle with respect to the fluid element (see Chapter 4), and the angle brackets indicate an ensemble average over the particles that make up the fluid element. Based on this we see that the conductive flux in the i-direction can be written as
$$F_{{\rm cond},i} = \langle\tfrac{1}{2}\rho\, w^2\, w_i\rangle = \langle\rho\,\varepsilon\, w_i\rangle$$
From experience we also know that we can write the conductive flux as
$$\vec{F}_{\rm cond} = -K\,\nabla T$$
with K the thermal conductivity.
Next we realize that conduction only causes a net change in the internal energy at some fixed position if the divergence of the conductive flux ($\nabla\cdot\vec{F}_{\rm cond}$) at that position is non-zero. This suggests that the final form of the energy equation, for a non-ideal fluid, and in Lagrangian vector form, has to be
$$\rho\,\frac{d\varepsilon}{dt} = -P\,\nabla\cdot\vec{u} - \nabla\cdot\vec{F}_{\rm cond} + \mathcal{V} - \mathcal{L}$$
To summarize, below we list the full set of equations of gravitational, radiative hydrodynamics (ignoring bulk viscosity)¹.
Continuity Eq.:    $\frac{d\rho}{dt} = -\rho\,\nabla\cdot\vec{u}$

Momentum Eqs.:     $\rho\,\frac{d\vec{u}}{dt} = -\nabla P + \mu\left[\nabla^2\vec{u} + \frac{1}{3}\nabla(\nabla\cdot\vec{u})\right] - \rho\,\nabla\Phi$

Energy Eq.:        $\rho\,\frac{d\varepsilon}{dt} = -P\,\nabla\cdot\vec{u} - \nabla\cdot\vec{F}_{\rm cond} - \mathcal{L} + \mathcal{V}$

Diss/Cond/Rad:     $\mathcal{V} \equiv \tau_{ik}\,\frac{\partial u_i}{\partial x_k}\,, \qquad F_{{\rm cond},k} = \langle\rho\,\varepsilon\, w_k\rangle\,, \qquad \mathcal{L} \equiv \mathcal{C} - \mathcal{H}$
¹ Diss/Cond/Rad stands for Dissipation, Conduction, Radiation
CHAPTER 6
In the previous chapter, we presented the hydrodynamic equations that are valid in
the macroscopic, continuum approach to fluid dynamics. We now derive this set of
equations rigorously, starting from the microscopic, particle-based view of fluids. We
start by reminding ourselves of a few fundamental concepts in dynamics.
Caution: I will use 'phase-space' to refer to both this $n_{\rm dof}$-dimensional space, in which each state is associated with a point in that space, as well as to the 6-dimensional space $(\vec{x}, \vec{v})$ in which each individual particle is associated with a point in that space. In order to avoid confusion, in this chapter I will refer to the former as $\Gamma$-space, and the latter as µ-space.
Here the curly brackets correspond to Poisson brackets (see Appendix I). When $q_i$ is a Cartesian coordinate in configuration space, $p_i$ is the corresponding linear momentum. However, when using curvi-linear coordinates and $q_i$ is an angle, then the corresponding $p_i$ is an angular momentum. Hence, $p_i$ is not always equal to $m\dot{q}_i$! Note that $p_i$ is called the conjugate momentum, to indicate that it belongs to $q_i$ in a canonical sense (meaning that it obeys the canonical commutation relations).
Let N be the number of constituent particles in our fluid. In all cases of interest, N will be a huge number; $N \gg 10^{20}$. How do you (classically) describe such a system?
To completely describe a fluid of N particles, you need to specify for each particle the following quantities:
   position:                      $\vec{q} = (q_1, q_2, q_3)$
   momentum:                      $\vec{p} = (p_1, p_2, p_3)$
   internal degrees of freedom:   $\vec{s} = (s_1, s_2, ..., s_K)$
Examples of internal degrees of freedom are electrical charge (in the case of a plasma), or the rotational or vibrational modes for molecules, etc. The number of degrees of freedom in the above example is $n_{\rm dof} = N(6 + K)$. In what follows we will only consider particles with zero internal dof (i.e., K = 0, so that $n_{\rm dof} = 6N$). Such particles are sometimes called monoatoms, and can be treated as point particles.
The microstate of a system composed of N monoatoms is completely described by
$$\mathcal{H}(\vec{q}_i, \vec{p}_i, t) \equiv \mathcal{H}(\vec{q}_1, \vec{q}_2, ..., \vec{q}_N, \vec{p}_1, \vec{p}_2, ..., \vec{p}_N, t)$$
and the corresponding equations of motion are:
$$\dot{\vec{q}}_i = \frac{\partial\mathcal{H}}{\partial\vec{p}_i}\,; \qquad \dot{\vec{p}}_i = -\frac{\partial\mathcal{H}}{\partial\vec{q}_i}$$
In what follows we will often adopt a shorthand notation, which also is more 'symmetric'. We introduce the 6D vector $\vec{w} \equiv (\vec{q}, \vec{p})$, i.e., the 6D array one obtains when combining the 3 components of $\vec{q}$ with the 3 components of $\vec{p}$. Using Poisson brackets, we can then write the Hamiltonian equations of motion as
$$\dot{\vec{w}}_i = \{\vec{w}_i, \mathcal{H}\}$$
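For reference, for any two functions $A(\vec{q}_i, \vec{p}_i)$ and $B(\vec{q}_i, \vec{p}_i)$ the Poisson bracket (defined in Appendix I) is
$$\{A, B\} \equiv \sum_{i=1}^{N}\left(\frac{\partial A}{\partial\vec{q}_i}\cdot\frac{\partial B}{\partial\vec{p}_i} - \frac{\partial A}{\partial\vec{p}_i}\cdot\frac{\partial B}{\partial\vec{q}_i}\right)\,,$$
so that $\{\vec{q}_i, \mathcal{H}\} = \partial\mathcal{H}/\partial\vec{p}_i$ and $\{\vec{p}_i, \mathcal{H}\} = -\partial\mathcal{H}/\partial\vec{q}_i$, consistent with the equations of motion above.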
Figure 4: Illustration of evolution in $\Gamma$-space. The x- and y-axes represent the 3N-dimensional position-vector and momentum-vector, respectively. Panel (a) shows the evolution of a state (indicated by the red dot). As time goes on, the positions and momenta of all the particles change (according to the Hamiltonian equations of motion), and the state moves around in $\Gamma$-space. Panel (b) shows the evolution of an ensemble of microstates (called a macrostate). As neighboring states evolve slightly differently, the volume in $\Gamma$-space occupied by the original microstates (the red, oval region) is stretched and sheared into a 'spaghetti-like' feature. According to Liouville's theorem, the volume of this spaghetti-like feature is identical to that of the original macrostate (i.e., the flow in $\Gamma$-space is incompressible). Note also that two trajectories in $\Gamma$-space can NEVER cross each other.
Thus, given $\vec{w}_i$ for all i = 1, 2, ..., N at any given time $t_0$, one can compute the Hamiltonian and solve the equations of motion to obtain $\vec{w}_i(t)$. This specifies a unique trajectory $\vec{\Gamma}(t)$ in this phase-space (see panel [a] of Fig. 4). Note that no two trajectories $\vec{\Gamma}_1(t)$ and $\vec{\Gamma}_2(t)$ are allowed to cross each other. If that were the case, it would mean that the same initial state can evolve differently, which would be a violation of the deterministic character of classical physics. The Hamiltonian formalism described above basically is a complete treatment of fluid dynamics. In practice, though, it is utterly useless, simply because N is HUGE, making it impossible to specify the complete set of initial conditions. We neither have (nor want) the detailed information that is required to specify a microstate. We are only interested in the average behavior of the macroscopic properties of the system, such as density, temperature, pressure, etc. To each such macrostate corresponds a huge number of microstates, called a statistical ensemble.
$$f^{(N)}(\vec{w}_i) \equiv f^{(N)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_N) = f^{(N)}(\vec{q}_1, \vec{q}_2, ..., \vec{q}_N, \vec{p}_1, \vec{p}_2, ..., \vec{p}_N)$$
which expresses the ensemble's probability distribution, i.e., $f^{(N)}(\vec{w}_i)\,dV$ is the probability that the actual microstate is given by $\vec{\Gamma}(\vec{q}_i, \vec{p}_i)$, where $dV = \prod_{i=1}^{N} d^6\vec{w}_i = \prod_{i=1}^{N} d^3\vec{q}_i\, d^3\vec{p}_i$. This implies the following normalization condition
$$\int dV\; f^{(N)}(\vec{w}_i) = 1$$
In our statistical approach, we seek to describe the evolution of the N-body distribution function, $f^{(N)}(\vec{w}_i, t)$, rather than that of a particular microstate, which instead is given by $\vec{\Gamma}(\vec{w}_i, t)$. Since probability is locally conserved, it must obey a continuity equation; any change of probability in one part of phase-space must be compensated by a flow of probability into or out of neighboring regions. As we have seen in Chapter 2, the continuity equation of a (continuum) density field, $\rho(\vec{x})$, is given by
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\,\vec{v}) = 0$$
which expresses that the local change in the mass enclosed in some volume is balanced
by the divergence of the flow out of that volume. In the case of our probability
distribution $f^{(N)}$ we have that $\nabla$ operates in 6N-dimensional phase-space, and includes $\partial/\partial\vec{q}_i$ and $\partial/\partial\vec{p}_i$, i.e.,
$$\nabla = \frac{\partial}{\partial\vec{w}_i} = \left(\frac{\partial}{\partial\vec{q}_i}, \frac{\partial}{\partial\vec{p}_i}\right) = \left(\frac{\partial}{\partial\vec{q}_1}, \frac{\partial}{\partial\vec{q}_2}, ..., \frac{\partial}{\partial\vec{q}_N}, \frac{\partial}{\partial\vec{p}_1}, \frac{\partial}{\partial\vec{p}_2}, ..., \frac{\partial}{\partial\vec{p}_N}\right)$$
Similarly, the 'velocity vector' in our 6N-dimensional $\Gamma$-space is given by
$$\dot{\vec{w}} \equiv (\dot{\vec{q}}_i, \dot{\vec{p}}_i) = (\dot{\vec{q}}_1, \dot{\vec{q}}_2, ..., \dot{\vec{q}}_N, \dot{\vec{p}}_1, \dot{\vec{p}}_2, ..., \dot{\vec{p}}_N)$$
Hence, the continuity equation for $f^{(N)}$, which is known as the Liouville equation, can be written as
$$\frac{\partial f^{(N)}}{\partial t} + \nabla\cdot(f^{(N)}\,\dot{\vec{w}}) = 0$$
Using the fact that the divergence of the product of a scalar and a vector can be written as the scalar times the divergence of the vector, plus the dot-product of the vector and the gradient of the scalar (see Appendix A), we have that
$$\nabla\cdot(f^{(N)}\,\dot{\vec{w}}) = f^{(N)}\,\nabla\cdot\dot{\vec{w}} + \dot{\vec{w}}\cdot\nabla f^{(N)}$$
If we write out the divergence of $\dot{\vec{w}}$ as
$$\nabla\cdot\dot{\vec{w}} = \sum_{i=1}^{N}\left[\frac{\partial\dot{\vec{q}}_i}{\partial\vec{q}_i} + \frac{\partial\dot{\vec{p}}_i}{\partial\vec{p}_i}\right]$$
and use the Hamiltonian equations of motion to write $\dot{\vec{q}}_i$ and $\dot{\vec{p}}_i$ as gradients of the Hamiltonian, we find that
$$\nabla\cdot\dot{\vec{w}} = \sum_{i=1}^{N}\left[\frac{\partial}{\partial\vec{q}_i}\left(\frac{\partial\mathcal{H}}{\partial\vec{p}_i}\right) - \frac{\partial}{\partial\vec{p}_i}\left(\frac{\partial\mathcal{H}}{\partial\vec{q}_i}\right)\right] = \sum_{i=1}^{N}\left[\frac{\partial^2\mathcal{H}}{\partial\vec{q}_i\,\partial\vec{p}_i} - \frac{\partial^2\mathcal{H}}{\partial\vec{p}_i\,\partial\vec{q}_i}\right] = 0$$
This is generally known as the Liouville Theorem. It implies that the volume in
-space occupied by a macrostate does NOT change under Hamiltonian evolution.
Although the microstates that make up the macrostate can disperse, the volume
they occupy stays connected and constant; it typically will change shape, but its
total volume remains fixed (see panel [b] of Fig. 4).
Using this result, we can write the Liouville equation in any of the following forms:
$$\frac{\partial f^{(N)}}{\partial t} + \dot{\vec{w}}\cdot\nabla f^{(N)} = 0$$
$$\frac{\partial f^{(N)}}{\partial t} + \sum_{i=1}^{N}\left(\dot{\vec{q}}_i\cdot\frac{\partial f^{(N)}}{\partial\vec{q}_i} + \dot{\vec{p}}_i\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i}\right) = 0$$
$$\frac{df^{(N)}}{dt} = 0$$
$$\frac{\partial f^{(N)}}{\partial t} + \{f^{(N)}, \mathcal{H}\} = 0$$
The second expression follows from the first by simply writing out the terms of the divergence. The third expression follows from the second one upon realizing that $f^{(N)} = f^{(N)}(t, \vec{q}_1, \vec{q}_2, ..., \vec{q}_N, \vec{p}_1, \vec{p}_2, ..., \vec{p}_N)$ and using the fact that for a function f(x, y) the infinitesimal $df = (\partial f/\partial x)\,dx + (\partial f/\partial y)\,dy$. Finally, the fourth expression follows from the second upon using the Hamiltonian equations of motion and the expression for the Poisson brackets, and will be used abundantly below.
Recall from Chapter 2 that a level-2 theory seeks to describe the evolution of the phase-space distribution function (DF)
$$f(\vec{q}, \vec{p}) = \frac{d^6 N}{d^3\vec{q}\, d^3\vec{p}}$$
which describes the density of particles in 6D phase-space (~q, p~). In what follows,
we shall refer to this 6-dimensional phase-space as µ-space, to distinguish it from the 6N-dimensional $\Gamma$-space. And we shall refer to the above DF as the 1-point DF, $f^{(1)}$, in order to distinguish it from the N-point DF, $f^{(N)}$, which appears in the Liouville equation. Whereas the latter describes the ensemble density of micro-states in $\Gamma$-space, the former describes the density of particles in µ-space.
Kinetic Theory: One can derive an equation for the time-evolution of the 1-point DF, starting from the Liouville equation. First we make the assumption that all particles are (statistically) identical (which is basically always the case). This implies that $f^{(N)}$ is a symmetric function of the $\vec{w}_i$, such that
$$f^{(N)}(..., \vec{w}_i, ..., \vec{w}_j, ..., t) = f^{(N)}(..., \vec{w}_j, ..., \vec{w}_i, ..., t)$$
We first define the reduced or k-particle DF, which is obtained by integrating the N-body DF, $f^{(N)}$, over $N - k$ six-vectors $\vec{w}_i$. Since $f^{(N)}$ is symmetric in $\vec{w}_i$, without loss of generality we may choose the integration variables to be $\vec{w}_{k+1}, \vec{w}_{k+2}, ..., \vec{w}_N$:
$$f^{(k)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_k, t) \equiv \frac{N!}{(N-k)!}\int\prod_{i=k+1}^{N} d^6\vec{w}_i\; f^{(N)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_N, t)$$
where the choice of the prefactor will become clear in what follows.
In particular, for k = 1, integrating over all of phase-space gives
$$\int d^3\vec{q}\; d^3\vec{p}\; f^{(1)}(\vec{q}, \vec{p}, t) = N$$
where we have used the normalization condition of $f^{(N)}$. Hence, $f^{(1)}(\vec{q}, \vec{p}, t) = dN/d^3\vec{q}\, d^3\vec{p}$ is the number of particles in the phase-space volume $d^3\vec{q}\, d^3\vec{p}$ centered on $(\vec{q}, \vec{p})$.
Hence, computing the expectation value of any observable $Q(\vec{w})$ only requires knowledge of the 1-particle DF. And since all our macroscopic continuum properties of the fluid (i.e., $\rho$, $\vec{u}$, $\varepsilon$) depend additively on the phase-space coordinates, the 1-particle DF suffices for a macroscopic description of the fluid. Hence, our goal is to derive an evolution equation for $f^{(1)}(\vec{q}, \vec{p}, t)$. We do so as follows.
$$\frac{\partial f^{(k)}}{\partial t} = \frac{N!}{(N-k)!}\int\prod_{i=k+1}^{N} d^6\vec{w}_i\; \frac{\partial f^{(N)}}{\partial t}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_N) = \frac{N!}{(N-k)!}\int\prod_{i=k+1}^{N} d^6\vec{w}_i\; \{\mathcal{H}, f^{(N)}\}$$
where the first step simply follows from applying the time derivative to the definition of the reduced k-particle DF, and the second step follows from the Liouville equation.
Next we substitute the Hamiltonian, which in general can be written as
$$\mathcal{H}(\vec{q}_i, \vec{p}_i) = \sum_{i=1}^{N}\frac{\vec{p}_i^{\,2}}{2m} + \sum_{i=1}^{N} V(\vec{q}_i) + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1,\, j\neq i}^{N} U(|\vec{q}_i - \vec{q}_j|)$$
Note that the Hamiltonian contains three terms: a kinetic energy term, a term describing the potential energy due to an external force $\vec{F}_i = -\nabla V(\vec{q}_i)$ that only depends on the position of particle i (an example would be the gravitational field of the Earth when describing its atmosphere), and the potential energy $U(|\vec{q}_i - \vec{q}_j|)$ associated with two-body interactions between particles i and j. The force on particle i due to the latter depends on the positions of all the other $N-1$ particles. Note that the factor of 1/2 is to avoid double-counting of the particle pairs. Examples of two-body interactions are the van der Waals force in the case of a liquid, the Coulomb force in the case of a plasma, or the gravitational force in the case of a dark matter halo.
Substituting this expression for the Hamiltonian in the equation for the time-evolution of the reduced DF yields, after some tedious algebra (see Appendix J), an expression for the evolution of the k-particle DF:
$$\frac{\partial f^{(k)}}{\partial t} = \{\mathcal{H}^{(k)}, f^{(k)}\} + \sum_{i=1}^{k}\int d^3\vec{q}_{k+1}\, d^3\vec{p}_{k+1}\; \frac{\partial U(|\vec{q}_i - \vec{q}_{k+1}|)}{\partial\vec{q}_i}\cdot\frac{\partial f^{(k+1)}}{\partial\vec{p}_i}$$
Here $\mathcal{H}^{(k)}$ is the Hamiltonian for the k particles, which is simply given by
$$\mathcal{H}^{(k)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_k) = \sum_{i=1}^{k}\frac{\vec{p}_i^{\,2}}{2m} + \sum_{i=1}^{k} V(\vec{q}_i) + \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1,\, j\neq i}^{k} U(|\vec{q}_i - \vec{q}_j|)$$
Note that the above expression for the evolution of the k-particle DF is not a closed equation; it depends on $f^{(k+1)}$. Hence, if you want to solve for $f^{(k)}$ you first need to solve for $f^{(k+1)}$, which requires that you solve for $f^{(k+2)}$, etc. Thus, we have a hierarchical set of N coupled differential equations, which is called the BBGKY hierarchy (after Bogoliubov, Born, Green, Kirkwood and Yvon, who independently developed this approach between 1935 and 1946).
Of particular interest to us is the expression for the 1-particle DF:
$$\frac{\partial f^{(1)}}{\partial t} = \{\mathcal{H}^{(1)}, f^{(1)}\} + \int d^3\vec{q}_2\, d^3\vec{p}_2\; \frac{\partial U(|\vec{q}_1 - \vec{q}_2|)}{\partial\vec{q}_1}\cdot\frac{\partial f^{(2)}}{\partial\vec{p}_1}$$
$$\mathcal{H}^{(1)} = \mathcal{H}^{(1)}(\vec{q}, \vec{p}) = \frac{p^2}{2m} + V(\vec{q})$$
where we emphasize once more that $V(\vec{q})$ is the external potential. The first term in the evolution equation for the 1-particle DF (the Poisson brackets) is called the streaming term; it describes how particles move in the absence of collisions. The second term is called the collision integral, and describes how the distribution of particles in phase-space is impacted by two-body collisions. Note that it depends on the 2-particle DF $f^{(2)}(\vec{q}_1, \vec{q}_2, \vec{p}_1, \vec{p}_2)$, which shouldn't come as a surprise given that accounting for two-body collisions requires knowledge of the phase-space coordinates of the two particles in question.
From here on out, though, we can target specific fluids (i.e., collisionless fluids, neutral fluids, plasmas) by specifying details of the two-body interaction potential $U(|\vec{q}_i - \vec{q}_j|)$ and/or the external potential $V(\vec{q})$. Let us start by considering the easiest example, namely the collisionless fluid. Here we have two methods of proceeding. First of all, we can simply set $U(|\vec{q}_i - \vec{q}_j|) = 0$ (i.e., ignore two-body interactions) and realize that we can compute $V(\vec{q})$ from the density distribution
$$\rho(\vec{q}, t) = m\int d^3\vec{p}\; f^{(1)}(\vec{q}, \vec{p}, t)$$
and the Poisson equation
$$\nabla^2\Phi = 4\pi G\rho$$
where we have used $\Phi(\vec{q}, t) = V(\vec{q}, t)/m$ to coincide with the standard notation for the gravitational potential used throughout these lecture notes. This implies that the collision integral vanishes, and we are left with a closed equation for the 1-particle DF, given by
$$\frac{\partial f^{(1)}}{\partial t} + \vec{v}\cdot\frac{\partial f^{(1)}}{\partial\vec{x}} - \nabla\Phi\cdot\frac{\partial f^{(1)}}{\partial\vec{v}} = 0$$
Here we have used the more common $(\vec{x}, \vec{v})$ coordinates in place of the canonical $(\vec{q}, \vec{p})$, and the fact that $\dot{\vec{p}} = -m\nabla\Phi$. This equation is the Collisionless Boltzmann
Equation (CBE) which is the fundamental equation describing a collisionless system
(i.e., a galaxy or dark matter halo). It expresses that the flow of particles in µ-space
is incompressible, and that the local phase-space density around any particle is fixed.
The evolution of a collisionless system of particles under this CBE is depicted in the
left-hand panel of Fig. 5. Although the CBE is a simple looking equation, recall that $f^{(1)}$ is still a function of six phase-space coordinates (plus time). Solving the CBE is tedious and not something that is typically done. As we will see in the next chapter, what one does instead is (try to) solve moment equations of the CBE.
For completeness, let us now derive the CBE using a somewhat different approach. This time we treat the gravity among the individual particles, and we do NOT assume upfront that we can account for gravity in terms of a 'smooth' potential $V(\vec{q})$. Hence, we set V = 0, and
$$U(|\vec{q}_1 - \vec{q}_2|) = -\frac{G m^2}{|\vec{q}_1 - \vec{q}_2|}$$
is now the gravitational potential energy due to particles 1 and 2. Starting from our BBGKY expression for the 1-particle DF, we need to come up with a description for the 2-particle DF. In general, we can always write
$$f^{(2)}(\vec{q}_1, \vec{q}_2, \vec{p}_1, \vec{p}_2) = f^{(1)}(\vec{q}_1, \vec{p}_1)\, f^{(1)}(\vec{q}_2, \vec{p}_2) + g(\vec{q}_1, \vec{q}_2, \vec{p}_1, \vec{p}_2)$$
This is the first step in what is called the Mayer cluster expansion. We can write this in (a self-explanatory) shorthand notation as
$$f^{(2)}(1, 2) = f(1)\, f(2) + g(1, 2)$$
The next step in the expansion involves the 3-particle DF:
$$f^{(3)}(1, 2, 3) = f(1)\, f(2)\, f(3) + f(1)\, g(2, 3) + f(2)\, g(1, 3) + f(3)\, g(1, 2) + h(1, 2, 3)$$
and so onward for k-particle DFs with k > 3. The function g(1, 2) is called the two-point correlation function. It describes how the phase-space coordinates of two particles are correlated. Note that if they are NOT correlated then g(1, 2) = 0. This is reminiscent of probability statistics: if x and y are two independent random variables then P(x, y) = P(x) P(y). Similarly, h(1, 2, 3) describes the three-point correlation function: the correlation among particles 1, 2 and 3 that is not already captured by their mutual two-point correlations described by g(1, 2), g(2, 3) and g(1, 3).
Now, let's assume that the phase-space coordinates of any two particles are uncorrelated, i.e., we set g(1, 2) = 0. This implies that the 2-particle DF is simply the product of two 1-particle DFs, and thus that the evolution equation for $f^{(1)}$ is closed! In fact, using that $\partial\mathcal{H}^{(1)}/\partial\vec{q} = 0$ and $\partial\mathcal{H}^{(1)}/\partial\vec{p} = \vec{p}/m = \vec{v}$ we obtain that
$$\frac{\partial f^{(1)}}{\partial t} + \vec{v}\cdot\frac{\partial f^{(1)}}{\partial\vec{x}} = \int d^3\vec{q}_2\, d^3\vec{p}_2\; f^{(1)}(\vec{q}_2, \vec{p}_2)\; \frac{\partial U(|\vec{q}_1 - \vec{q}_2|)}{\partial\vec{q}_1}\cdot\frac{\partial f^{(1)}}{\partial\vec{p}_1}$$
Taking the $\partial f^{(1)}/\partial\vec{p}_1$ factor outside of the collision integral (note that this $f^{(1)}$ has $\vec{q}_1$ and $\vec{p}_1$ as arguments), and performing the integral over $\vec{p}_2$, yields
$$\frac{\partial f^{(1)}}{\partial t} + \vec{v}\cdot\frac{\partial f^{(1)}}{\partial\vec{x}} - \frac{\partial f^{(1)}}{\partial\vec{p}_1}\cdot\frac{\partial}{\partial\vec{q}_1}\left[\frac{1}{m}\int d^3\vec{q}_2\; \rho(\vec{q}_2)\, U(|\vec{q}_1 - \vec{q}_2|)\right] = 0$$
Using that
$$\Phi(\vec{x}) = -G\int d^3\vec{x}'\; \frac{\rho(\vec{x}')}{|\vec{x} - \vec{x}'|}$$
this finally can be written as
$$\frac{\partial f^{(1)}}{\partial t} + \vec{v}\cdot\frac{\partial f^{(1)}}{\partial\vec{x}} - m\,\nabla\Phi\cdot\frac{\partial f^{(1)}}{\partial\vec{p}_1} = 0$$
which is once again the CBE.
Figure 5: Illustration of evolution in µ-space. The x- and y-axes represent the 3-dimensional position-vector and momentum-vector, respectively. Panel (a) shows the evolution of a collection of particles (indicated by the red dots) in a collisionless system governed by the CBE. As time goes on, the positions and momenta of all the particles change (according to the Hamiltonian equations of motion), and the particles move around in µ-space smoothly (no abrupt changes). If the combined potential of all the particles (or the external potential) is time-variable, trajectories of individual particles are allowed to cross each other, unlike trajectories in $\Gamma$-space, which can never cross. Panel (b) shows the evolution of a collection of particles in a collisional system (where collisions are highly localized). Collisions cause abrupt changes in momentum. The dynamics of this system is described by the Boltzmann equation.
If we want to describe, say, a collisional, neutral fluid, we need to decide how to treat these correlation functions. If we do this for a neutral fluid, which means a fluid in which the interaction potentials are only effective over very small distances (i.e., U(r) = 0 for r larger than some small, characteristic collision scale, $r_{\rm coll}$), then one derives what is called the Boltzmann equation, which is given by
$$\frac{\partial f^{(1)}}{\partial t} + \dot{\vec{x}}\cdot\frac{\partial f^{(1)}}{\partial\vec{x}} + \dot{\vec{v}}\cdot\frac{\partial f^{(1)}}{\partial\vec{v}} = I[f^{(1)}]$$
Here $I[f^{(1)}]$ is the collision integral, which now is only a function of the 1-particle DF, making the Boltzmann equation a closed equation. It describes how, due to collisions, particles are 'kicked' in and out of certain parts of phase-space. The right-hand panel of Fig. 5 shows an illustration of evolution under the Boltzmann equation.
Rigorously deriving an expression for I[f (1) ] from the BBGKY hierarchy is fiddly and
outside the scope of this course (see for example the textbook ”Statistical Mechan-
ics” by Kerson Huang). Instead, in the next Chapter we will use a more heuristic
approach, which relies on making the assumptions
• dilute gas; density is sufficiently low so that only binary collisions need to be
considered
The first two assumptions are reasonable, but the molecular chaos assumption (introduced by Boltzmann, who referred to it as the Stosszahlansatz, which translates to 'assumption regarding the number of collisions') has a long and interesting history. Mathematically, the assumption implies that
$$f^{(2)}(\vec{q}, \vec{q}, \vec{p}_1, \vec{p}_2) = f^{(1)}(\vec{q}, \vec{p}_1)\, f^{(1)}(\vec{q}, \vec{p}_2)$$
which thus assumes that $g(\vec{q}, \vec{q}, \vec{p}_1, \vec{p}_2) = 0$. Note that this is different from the assumption we made above when describing collisionless fluids: here it is only assumed that the momenta of particles at the same location are uncorrelated, which is a weaker assumption than setting $g(\vec{q}_1, \vec{q}_2, \vec{p}_1, \vec{p}_2) = 0$ everywhere. At first sight this seems a reasonable assumption; after all, in a dilute gas particles move (relatively) long distances between collisions (i.e., $\lambda_{\rm mfp} \gg r_{\rm coll}$). Although collisions introduce correlations among
the particles, each particle is expected to have many collisions with other particles before colliding with a particular particle again. It seems reasonable to postulate that these intermittent collisions erase the correlations again. However, this apparently unremarkable assumption effectively introduces an arrow of time, as briefly discussed in the colored text-box at the end of this chapter.
Finally, we point out that when treating a collisional plasma, in which the interactions are due to long-range Coulomb forces, the standard approach is to assume that h(1, 2, 3) = 0 (i.e., to assume that the three-point correlation function is zero). Making several other assumptions (i.e., that the plasma is spatially homogeneous and that the 2-particle correlation function g(1, 2) relaxes much faster than the 1-particle DF $f^{(1)}$), this allows one to derive an expression for g(1, 2) from the evolution equation of the 2-particle DF, which can then be substituted in the evolution equation of the 1-particle DF. The result is called the Lenard-Balescu equation. It is an example of a Fokker-Planck equation, which is a generic equation used to describe the time evolution of the probability density function of the velocity of a particle under the influence of stochastic forces (here the Coulomb collisions) that mainly cause small deflections. The Fokker-Planck equation is also used to describe gravitational N-body systems in which the impact of collisions is not negligible (i.e., describing two-body relaxation in a globular cluster).
As a final remark for this chapter, we have thus far only considered the case of a single species of mono-atoms. If we consider different types of particles, then we have to introduce a separate distribution function for each type. If the different types of particles can interact with each other, this has to be accounted for in the collision terms.
Molecular Chaos and the Arrow of Time
CHAPTER 7
$$\frac{\partial f^{(1)}}{\partial t} = \{\mathcal{H}^{(1)}, f^{(1)}\} + \int d^3\vec{q}_2\, d^3\vec{p}_2\; \frac{\partial U(|\vec{q}_1 - \vec{q}_2|)}{\partial\vec{q}_1}\cdot\frac{\partial f^{(2)}}{\partial\vec{p}_1}$$
$$\vdots$$
$$\frac{\partial f^{(k)}}{\partial t} = \{\mathcal{H}^{(k)}, f^{(k)}\} + \sum_{i=1}^{k}\int d^3\vec{q}_{k+1}\, d^3\vec{p}_{k+1}\; \frac{\partial U(|\vec{q}_i - \vec{q}_{k+1}|)}{\partial\vec{q}_i}\cdot\frac{\partial f^{(k+1)}}{\partial\vec{p}_i}$$
Here k = 1, 2, ..., N, and $f^{(k)}$ is the k-particle DF, which relates to the N-particle DF (N > k) according to
$$f^{(k)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_k, t) \equiv \frac{N!}{(N-k)!}\int\prod_{i=k+1}^{N} d^6\vec{w}_i\; f^{(N)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_N, t)\,,$$
while
$$\mathcal{H}^{(k)} = \sum_{i=1}^{k}\frac{\vec{p}_i^{\,2}}{2m} + \sum_{i=1}^{k} V(\vec{q}_i) + \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1,\, j\neq i}^{k} U(|\vec{q}_i - \vec{q}_j|)$$
with $V(\vec{q})$ the potential associated with an external force, and U(r) the two-body interaction potential between two (assumed equal) particles separated by a distance $r = |\vec{q}_i - \vec{q}_j|$.
In order to close this set of N equations, one needs to make certain assumptions that truncate the series. One such assumption is that all particles are uncorrelated (both spatially and in terms of their momenta), such that
$$f^{(2)}(\vec{q}_1, \vec{q}_2, \vec{p}_1, \vec{p}_2) = f^{(1)}(\vec{q}_1, \vec{p}_1)\, f^{(1)}(\vec{q}_2, \vec{p}_2)$$
which is equivalent to setting the correlation function g(1, 2) = 0. As we have shown in the previous chapter, the first equation in the BBGKY hierarchy is now closed, and yields the Collisionless Boltzmann Equation (CBE), which can be written as
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \dot{\vec{x}}\cdot\frac{\partial f}{\partial\vec{x}} + \dot{\vec{v}}\cdot\frac{\partial f}{\partial\vec{v}} = 0$$
which is the fundamental evolution equation for collisionless systems. If the forces between particles are gravitational in nature, then $\dot{\vec{v}} = -\nabla\Phi$, with $\Phi(\vec{x})$ the gravitational potential, which is related to the density distribution via the Poisson equation. NOTE: we have used the shorthand notation f for the 1-particle DF $f^{(1)}$. In what follows we will adopt that notation throughout, and only use the superscript notation whenever confusion might arise.
If, on the other hand, we want to describe a dilute, neutral fluid in which the particles
only have short-range interactions (such that U(r) ' 0 outside of some small distance
rcoll ), then we can make the assumption of molecular chaos which also allows us
to close the BBGKY hierarchy, yielding the Boltzmann Equation:
df @f @f @f
= + ~x˙ · + ~v˙ · = I[f ]
dt @t @~x @~v
where I[f ] is the collision integral, which describes how the phase-space density
around a particle (or fluid element) changes with time due to collisions.
Let us now take a closer look at this collision integral I[f ]. It basically expresses the
Eulerian time-derivative of the DF due to collisions, i.e.,I[f ] = (@f /@t)coll . Recall
that we have made the assumption of a dilute gas, so that we only need to consider
two-body interactions. In what follows, we make the additional assumption that all
collisions are elastic [actually, this is sort of implied by the fact that we assume that
the dynamics are Hamiltonian]. An example is shown in Figure 1, where ~p1 + p~2 !
p~1 0 + ~p2 0 . Since we assumed a short-range, instantaneous and localized interaction,
so that the external potential doesn’t significantly vary over the interaction volume
(the dashed circle in Fig. 1), we have
53
Figure 6: Illustration of ‘collision’ between two particles with momenta p1 and p2 due
to interaction potential U(r). The impact parameter of the collision is b.
We can write the rate at which particles of momentum p~1 at location ~x experience
collisions p~1 + ~p2 ! ~p1 0 + p~2 0 as
R = !(~p1, p~2 |~p1 0 , p~2 0 ) f (2) (~x, ~x, p~1 , p~2 ) d3 ~p2 d3 p~1 0 d3 ~p2 0
Here f (2) (~x, ~x, ~p1 , p~2 ) is the 2-particle DF, expressing the probability that at location
~x, you encounter two particles with momenta p~1 and p~2 , respectively. The function
!(~p1 , p~2 |~p1 0 , p~2 0 ) depends on the interaction potential U(~r ) and can be calculated
(using kinetic theory) via di↵erential cross sections. Note that momentum and energy
conservation is encoded in the fact that !(~p1 , p~2 |~p1 0 , p~2 0 ) / 3 (P~ P~ 0 ) (E E 0 ) with
(x) the Dirac delta function, P~ = ~p1 + p~2 and P~ 0 = ~p1 0 + p~2 0 .
54
Using our assumption of molecular chaos, which states that the momenta of the
interacting particles are independent, we have that
f (2) (~x, ~x, p~1 , p~2 ) = f (1) (~x, ~p1 ) f (1) (~x, ~p2 )
so that the collision integral can be written as
Z
I[f ] = d3 p~2 d3 ~p1 0 d3 p~2 0 !(~p1 0 , p~2 0 |~p1 , p~2 ) [f (~p1 0 ) f (~p2 0 ) f (~p1 ) f (~p2 )]
We can use the above expression to derive that the equilibrium solution for the
velocity distribution in a homogeneous fluid is given by the Maxwell-Boltzmann
distribution. The expression for an equilibrium distribution function, feq is that
@feq /@t = 0 (i.e., the DF at any given location doesn’t evolve with time). If we ignore
a potential external potential (i.e., V = 0), and we take into consideration that an
equilibrium solution must indeed be spatially homogeneous, such that @feq /@~q = 0,
then we have that the streaming term {H, feq } = 0. Hence, having an equilibrium
requires that the collision integral vanishes as well. As is apparent from the above
expression, this will be the case if
55
A=1 particle number conservations
A = ~p momentum conservation
A = ~p 2 /(2m) energy conservation
and we thus expect that
We have seen that if the logarithm of the DF is a sum of collisional invariants (which
it is if the system is in equilibrium), then the collision integral vanishes. In addition,
as we will now demonstrate, for a collisional invariant A(~p) we also have that
Z ✓ ◆
3 @f
d p~ A(~p) =0
@t coll
which will be useful for what follows. To see that this equality holds, we first intro-
duce
Z
I1 = d3 ~p1 d3 p~2 d3 ~p1 0 d3 p~2 0 !(~p1 0 , p~2 0 |~p1 , p~2 ) A(~p1 ) [f (~p1 0 ) f (~p2 0 ) f (~p1 ) f (~p2 )]
which is the collision integral multiplied by A(~p1 ) and integrated over p~1 . Note
that now all momenta are integrated over, such that they are basically nothing but
dummy variables. Re-labelling 1 $ 2, and reordering yields
Z
I2 = d3 ~p1 d3 p~2 d3 ~p1 0 d3 p~2 0 !(~p1 0 , p~2 0 |~p1 , p~2 ) A(~p2 ) [f (~p1 0 ) f (~p2 0 ) f (~p1 ) f (~p2 )]
i.e., everything is unchanged except for the argument of our collisional invariant.
And since the momenta are dummy variables, we have that I2 = I1 . Rather than
56
swapping indices 1 and 2, we can also swap ~p $ p~ 0 . This gives us two additional
integrals:
Z
I3 = d3 ~p1 d3 p~2 d3 p~1 0 d3 ~p2 0 !(~p1 , p~2 |~p1 0 , p~2 0 ) A(~p1 0 ) [f (~p1 0 ) f (~p2 0 ) f (~p1 ) f (~p2 )]
and
Z
I4 = d3 ~p1 d3 p~2 d3 p~1 0 d3 ~p2 0 !(~p1 , p~2 |~p1 0 , p~2 0 ) A(~p2 0 ) [f (~p1 0 ) f (~p2 0 ) f (~p1 ) f (~p2 )]
where the minus sign comes from the fact that we have reversed f (~p1 ) f (~p2 )
f (~p1 0 ) f (~p2 0 ). Because of time-reversibility !(~p1 0 , p~2 0 |~p1 , p~2 ) = !(~p1, p~2 |~p1 0 , p~2 0 ), and
we thus have that I4 = I3 = I2 = I1 . Hence I1 = [I1 + I2 + I3 + I4 ]/4, which can
be written as
Z
1
I1 = d3 p~1 d3 ~p2 d3 p~1 0 d3 ~p2 0 !(~p1 0 , p~2 0 |~p1 , p~2 )⇥
4
{A(~p1 ) + A(~p2 ) A(~p1 0 ) A(~p2 0 )} [f (~p1 0 ) f (~p2 0 ) f (~p1 ) f (~p2)]
Since A(~p) is a collisional invariant, the factor in curly brackets vanishes, which in
turn assures that I1 = 0, which completes our proof.
Thus far, we have derived the Boltzmann equation, and we have been able to write
down an expression for the collision integral under the assumptions of (i) short-
range, elastic collisions and (ii) molecular chaos. How do we proceed from here?
The Boltzmann distribution with the above expression for the collision integral is
a non-linear integro-di↵erential equation, and solving such an equation is extremely
difficult. Fortunately, in the fluid limit we don’t really need to. Rather, we are
interested what happens to our macroscopic quantities that describe the fluid (⇢, ~u,
P , ", etc). We can use the Boltzmann equation to describe the time-evolution of
these macroscopic quantities by considering moment equations of the Boltzmann
equation.
57
If f (x) is normalized, so that it can be interpreted as a probability function, then
µn = hxn i.
In our case, consider the scalar function Q(~v ). The expectation value for Q at location
~x at time t is given by
R
Q(~v ) f (~x, ~v, t) d3~v
hQi = hQi(~x, t) = R
f (~x, ~v, t) d3~v
Using that
Z
n = n(~x, t) = f (~x, ~v, t) d3~v
Then, in relation to fluid dynamics, there are a few functions Q(~v ) that are of par-
ticular interest:
This indicates that we can obtain dynamical equations for the macroscopic fluid
quantities by multiplying the Boltzmann equation with appropriate functions, Q(~v ),
and integrating over all of velocity space.
58
Hence, we seek to solve equations of the form
Z Z ✓ ◆
@f @f 3 @f
Q(~v ) + ~v · rf r · d ~v = Q(~v ) d3~v
@t @~v @t coll
Z Z Z
@f 3 3 @f 3
Q(~v ) d ~v + Q(~v ) ~v · rf d ~v Q(~v) r · d ~v = 0
@t @~v
Since mass, momentum and energy are all conserved in elastic, short-range collisions
we have that the momentum integral over the collision integral will be zero for the
zeroth, first and second order moment equations! In other words, although collisional
and collisionless systems solve di↵erent Boltzmann equations, their zeroth, first and
second moment equations are identical!
Integral I
The first integral can be written as
Z Z Z
@f 3 @Qf 3 @ @
Q(~v ) d ~v = d ~v = Qf d3~v = nhQi
@t @t @t @t
where we have used that both Q(~v ) and the integration volume are independent of
time.
59
Integral II
Using similar logic, the second integral can be written as
Z Z Z
@f 3 @Q vi f 3 @ @ h i
Q(~v ) vi d ~v = d ~v = Q vi f d3~v = n hQ vi i
@xi @xi @xi @xi
Here we have used that
@f @(Q vi f ) @Q vi @(Q vi f )
Q vi = f =
@xi @xi @xi @xi
where the last step follows from the fact that neither vi nor Q depend on xi .
Integral III
For the third, and last integral, we are going to define F~ = r and rv ⌘ (@/@vx , @/@vy , @/@vz ),
i.e., rv is the equivalent of r but in velocity space. This allows us to write
Z Z Z
Q F~ · rv f d ~v =
3
rv · (Qf F~ )d ~v
3
f rv · (QF~ ) d3~v
Z Z
~ 2 @QFi 3
= Qf F d Sv f d ~v
@vi
Z Z
@Fi 3 @Q 3
= fQ d ~v f Fi d ~v
@vi @vi
Z ⌧
@ @Q 3 @ @Q
= f d ~v = n
@xi @vi @xi @vi
Here we have used Gauss’ divergence theorem, and the fact that the integral of Qf F~
over the surface Sv (which is a sphere with radius R|~v| = 1) is equal to zero. This
follows from the ‘normalization’ requirement that f d3~v = n. We have also used
that Fi = @ /@xi is independent of vi .
Combining the above expressions for I, II, and III, we obtain that
⌧
@ @ h i @ @Q
nhQi + nhQvi i + n =0
@t @xi @xi @vi
60
Now let us consider Q = m, which is indeed a collisional invariant, as required.
Substitution in the master-moment equation, and using that hmi = m, that mn = ⇢
and that hmvi i = mhvi i = mui , we obtain
@⇢ @⇢ui
+ =0
@t @xi
which we recognize as the continuity equation in Eulerian index form.
Next we consider Q = mvj , which is also a collisional invariant. Using that nhmvj vi i =
⇢hvi vj i and that
⌧ ⌧
@ @mvj @ @vj @ @
n = ⇢ = ⇢ ij = ⇢
@xi @vi @xi @vi @xi @xj
substitution of Q = mvj in the master-moment equation yields
@⇢uj @⇢hvi vj i @
+ +⇢ =0
@t @xi @xj
Next we use that
@⇢uj @uj @⇢ @uj @⇢uk
=⇢ + uj =⇢ uj
@t @t @t @t @xk
where, in the last step, we have used the continuity equation. Substitution in the
above equation, and using that k is a mere dummy variable (which can therefore be
replaced by i), we obtain that
If we now restrict ourselves to collisional fluids, and use that the stress tensor
can be written as
61
ij = ⇢hwi wj i = ⇢hvi vj i + ⇢ui uj = P ij + ⌧ij
then the equation above can be rewritten as
@uj @uj 1 @ ij @
+ ui =
@t @xi ⇢ @xi @xj
Finally, it is left as an exersize for the reader (or look at Appendix K) to show
that substitution of Q = mv 2 /2 in the master moment equation yields the energy
equation (in Lagrangian index form):
62
d" @uk @Fcond,k
⇢ = P +V
dt @xk @xk
which is exactly the same as what we heuristically derived in Chapter 5, except for
the L term, which is absent from this derivation based on the Boltzmann equation,
since the later does not include the e↵ects of radiation.
Finally, Fig. 7 below summarizes what we have discussed in the last two chapters
63
CHAPTER 8
Vorticity: The vorticity of a flow is defined as the curl of the velocity field:
vorticity : ~ = r ⇥ ~u
w
It is a microscopic measure of rotation (vector) at a given point in the fluid, which
can be envisioned by placing a paddle wheel into the flow. If it spins about its axis
at a rate ⌦, then w = |w|
~ = 2⌦.
Circulation: The circulation around a closed contour C is defined as the line integral
of the velocity along that contour:
I Z
circulation : C = ~u · d~l = w ~
~ · dS
C S
Vortex line: a line that points in the direction of the vorticity vector. Hence, a
vortex line relates to w,
~ as a streamline relates to ~u (cf. Chapter 2).
In an inviscid fluid the vortex lines/tubes move with the fluid: a vortex line an-
chored to some fluid element remains anchored to that fluid element.
64
Figure 8: Evolution of a vortex tube. Solid dots correspond to fluid elements. Due
to the shear in the velocity field, the vortex tube is stretched and tilted. However, as
long as the fluid is inviscid and barotropic Kelvin’s circularity theorem assures that
the circularity is conserved with time. In addition, since vorticity is divergence-free
(‘solenoidal’), the circularity along di↵erent cross sections of the same vortex-tube is
the same.
65
✓ ◆
@w
~ rP
= r ⇥ (~u ⇥ w)
~ r⇥ + ⌫r2 w
~
@t ⇢
~ = rS ⇥ A
To write this in Lagrangian form, we first use that r ⇥ (S A) ~ + S (r ⇥ A)
~
[see Appendix A] to write
1 1 1 ⇢r(1) 1r⇢ rP ⇥ r⇢
r ⇥ ( rP ) = r( ) ⇥ rP + (r ⇥ rP ) = ⇥ rP =
⇢ ⇢ ⇢ ⇢2 ⇢2
where we have used, once more, that curl(grad S) = 0. Next, using the vector
identities from Appendix A, we write
r ⇥ (w
~ ⇥ ~u) = w(r
~ · ~u) ~ · r)~u
(w ~u(r · w)
~ + (~u · r)w
~
The third term vanishes because r · w
~ = r · (r ⇥ ~u) = 0. Hence, using that @ w/@t
~ +
(~u · r)w
~ = dw/dt
~ we finally can write the vorticity equation in Lagrangian
form:
dw
~ r⇢ ⇥ rP
~ · r)~u
= (w w(r
~ · ~u) + 2
+ ⌫r2 w
~
dt ⇢
This equation describes how the vorticity of a fluid element evolves with time. We
now describe the various terms of the rhs of this equation in turn:
• (w
~ · r)~u: This term represents the stretching and tilting of vortex tubes due
to velocity gradients. To see this, we pick w
~ to be pointing in the z-direction.
Then
• w(r
~ · ~u): This term describes stretching of vortex tubes due to flow com-
pressibility. This term is zero for an incompressible fluid or flow (r · ~u = 0).
Note that, again under the assumption that the vorticity is pointing in the
z-direction,
66
@ux @uy @uz
w(r
~ · ~u) = wz + + ~ez
@x @y @z
• ⌫r2 w:
~ This term describes the di↵usion of vorticity due to viscosity, and
is obviously zero for an inviscid fluid (⌫ = 0). Typically, viscosity gener-
ates/creates vorticity at a bounding surface: due to the no-slip boundary con-
dition shear arises giving rise to vorticity, which is subsequently di↵used into
the fluid by the viscosity. In the interior of a fluid, no new vorticity is generated;
rather, viscosity di↵uses and dissipates vorticity.
• r ⇥ F~ : There is a fifth term that can create vorticity, which however does not
appear in the vorticity equation above. The reason is that we assumed that the
only external force is gravity, which is a conservative force and can therefore be
written as the gradient of a (gravitational) potential. More generally, though,
there may be non-conservative, external body forces present, which would give
rise to a r ⇥ F~ term in the rhs of the vorticity equation. An example of a non-
conservative force creating vorticity is the Coriolis force, which is responsible
for creating hurricanes.
67
Figure 9: The baroclinic creation of vorticity in a pyroclastic flow. High density fluid
flows down a mountain and shoves itself under lower-density material, thus creating
non-zero baroclinicity.
Using the definition of circulation, it can be shown (here without proof) that
Z
d @w~ ~
= + r ⇥ (w~ ⇥ ~u) · dS
dt S @t
Using the vorticity equation, this can be rewritten as
Z
d r⇢ ⇥ rP
= ~ + r ⇥ F~ · dS
+ ⌫r2 w ~
dt S ⇢2
68
NOTE: By comparing the equations expressing dw/dt ~ and d /dt it is clear that
the stretching a tilting terms present in the equation describing dw/dt,
~ are absent
in the equation describing d /dt. This implies that stretching and tilting changes
the vorticity, but keeps the circularity invariant. This is basically the first theorem
of Helmholtz described below.
Kelvin’s Circulation Theorem: The number of vortex lines that thread any
element of area that moves with the fluid (i.e., the circulation) remains unchanged
in time for an inviscid, barotropic fluid, in the absence of non-conservative forces.
We end this chapter on vorticity and circulation with the three theorems of Helmholtz,
which hold in the absence of non-conservative forces (i.e., F~ = 0).
where A1 and A2 are the areas of the cross sections that bound the volume V of the
vortex tube. Using Stokes’ curl theorem, we have that
69
Z I
~ · n̂ dA =
w ~u · d~l
A C
Hence we have that C1 = C2 where C1 and C2 are the curves bounding A1 and A2 ,
respectively.
Helmholtz Theorem 2: A vortex line cannot end in a fluid. Vortex lines and tubes
must appear as closed loops, extend to infinity, or start/end at solid boundaries.
70
Figure 10: A beluga whale demonstrating Kelvin’s circulation theorem and Helmholtz’
second theorem by producing a closed vortex tube under water, made out of air.
71
CHAPTER 9
Having derived all the relevant equations for hydrodynamics, we now start examining
several specific flows. Since a fully general solution of the Navier-Stokes equation is
(still) lacking (this is one of the seven Millenium Prize Problems, a solution of which
will earn you $1,000,000), we can only make progress if we make several assumptions.
We start with arguably the simplest possible flow, namely ‘no flow’. This is the area
of hydrostatics in which ~u(~x, t) = 0. And since we seek a static solution, we also
must have that all @/@t-terms vanish. Finally, in what follows we shall also ignore
radiative processes (i.e., we set L = 0).
Applying these restrictions to the continuity, momentum and energy equations (see
box at the end of Chapter 5) yields the following two non-trivial equations:
rP = ⇢r
r · F~cond = 0
The first equation is the well known equation of hydrostatic equilibrium, stating
that the gravitational force is balanced by pressure gradients, while the second equa-
tion states that in a static fluid the conductive flux needs to be divergence-free.
To further simplify matters, let’s assume (i) spherical symmetry, and (ii) a barotropic
equation of state, i.e., P = P (⇢).
dP G M(r) ⇢(r)
=
dr r2
72
In addition, if the gas is self-gravitating (such as in a star) then we also have that
dM
= 4⇡⇢(r) r 2
dr
For a barotropic EoS this is a closed set of equations, and the density profile can be
solved for (given proper boundary conditions). Of particular interest in astrophysics,
is the case of a polytropic EoS: P / ⇢ , where is the polytropic index. Note
that = 1 and = for isothermal and adiabatic equations of state, respectively.
A spherically symmetric, polytropic fluid in HE is called a polytropic sphere.
Here n = 1/( 1) is related to the polytropic index (in fact, confusingly, some texts
refer to n as the polytropic index),
✓ ◆1/2
4⇡G⇢c
⇠= r
0 c
is a dimensionless radius,
✓ ◆
0 (r)
✓=
0 c
with c and 0 the values of the gravitational potential at the center (r = 0) and
at the surface of the star (where ⇢ = 0), respectively. The density is related to ✓
according to ⇢ = ⇢c ✓n with ⇢c the central density.
73
(see Appendix H) and is therefore describes by a polytrope of index n = 3/2. In the
relativistic case P / ⇢4/3 which results in a polytrope of index n = 3.
Heat transport in stars: Typically, ignoring abundance gradients, stars have the
equation of state of an ideal gas, P = P (⇢, T ). This implies that the equations of
stellar structure need to be complemented by an equation of the form
dT
= F (r)
dr
Since T is a measure of the internal energy, the rhs of this equation describes the
heat flux, F (r).
74
Recall from Chapter 4 that the thermal conductivity K / (kB T )1/2 / where
is the collisional cross section. Using that kB T / v 2 and that the mean-free path of
the particles is mfp = 1/(n ), we have that
K/n mfp v
with v the thermal, microscopic velocity of the particles (recall that ~u = 0). Since
radiative heat transport in a star is basically the conduction of photons, and since
c ve and the mean-free part of photons is much larger than that of electrons (after
all, the cross section for Thomson scattering, T , is much smaller than the typical
cross section for Coulomb interactions), we have that in stars radiation is a far more
efficient heat transport mechanism than conduction. An exception are relativistic,
degenerate cores, for which ve ⇠ c and photons and electrons have comparable mean-
free paths.
Trivia: On average it takes ⇠ 200.000 years for a photon created at the core of the
Sun in nuclear burning to make its way to the Sun’s photosphere; from there it only
takes ⇠ 8 minutes to travel to the Earth.
Hydrostatic Mass Estimates: Now let us consider the case of an ideal gas, for
which
kB T
P = ⇢,
µmp
but this time the gas is not self-gravitating; rather, the gravitational potential may
be considered ‘external’. A good example is the ICM; the hot gas that permeates
clusters. From the EoS we have that
75
dP @P d⇢ @P dT P d⇢ P dT
= + = +
dr @⇢ dr @T dr ⇢ dr T dr
P r d⇢ r dT P d ln ⇢ d ln T
= + = +
r ⇢ dr T dr r d ln r d ln r
Substitution of this equation in the equation for Hydrostatic equilibrium (HE) yields
kB T (r) r d ln ⇢ d ln T
M(r) = +
µmp G d ln r d ln r
This equation is often used to measure the ‘hydrostatic’ mass of a galaxy cluster;
X-ray measurements can be used to infer ⇢(r) and T (r) (after deprojection, which is
analytical in the case of spherical symmetry). Substitution of these two radial depen-
dencies in the above equation then yields an estimate for the cluster’s mass profile,
M(r). Note, though, that this mass estimate is based on three crucial assump-
tions: (i) sphericity, (ii) hydrostatic equilibrium, and (iii) an ideal-gas EoS. Clusters
typically are not spherical, often are turbulent (such that ~u 6= 0, violating the as-
sumption of HE), and can have significant contributions from non-thermal pressure
due to magnetic fields, cosmic rays and/or turbulence. Including these non-thermal
pressure sources the above equation becomes
kB T (r) r d ln ⇢ d ln T Pnt d ln Pnt
M(r) = + +
µmp G d ln r d ln r Pth d ln r
were Pnt and Pth are the non-thermal and thermal contributions to the total gas
pressure. Unfortunately, it is extremely difficult to measure Pnt reliably, which is
therefore often ignored. This may result in systematic biases of the inferred cluster
mass (typically called the ‘hydrostatic mass’).
The Solar corona is a large, spherical region of hot (T ⇠ 106 K) plasma extending
well beyond its photosphere. Let’s assume that the heat is somehow (magnetic
reconnection?) produced in the lower layers of the corona, and try to infer the density,
temperature and pressure profiles under the assumption of hydrostatic equilibrium.
76
We have the boundary condition of the temperature at the base, which we assume
to be T0 = 3 ⇥ 106 K, at a radius of r = r0 ⇠ R ' 6.96 ⇥ 1010 cm. The mass of the
corona is negligble, and we therefore have that
dP G M µmp P
=
dr r2 kB T
✓ ◆
d dT
K r2 = 0
dr dr
where we have used the ideal gas EoS to substitute for ⇢. As we have seen above
K / n mfpT 1/2 . In a plasma one furthermore has that mfp / n 1 T 2 , which implies
that K / T 5/2 . Hence, the second equation can be written as
dT
r 2 T 5/2 = constant
dr
which implies
✓ ◆ 2/7
r
T = T0
r0
Note that this equation satisfies our boundary condition, and that T1 = limr!1 T (r) =
0. Substituting this expression for T in the HE equation yields
dP G M µmp dr
= 2/7 r 12/7
P kB T0 r0
Note that
7 G M µmp
lim P = P0 exp 6= 0
r!1 5 kB T0 r0
Hence, you need an external pressure to confine the corona. Well, that seems OK,
given that the Sun is embedded in an ISM, whose pressure we can compute taking
77
characteristic values for the warm phase (T ⇠ 104 K and n ⇠ 1 cm 3 ). Note that the
other phases (cold and hot) have the same pressure. Plugging in the numbers, we
find that
P1 ⇢0
⇠ 10
PISM ⇢ISM
Since ⇢0 ⇢ISM we thus infer that the ISM pressure falls short, by orders of magni-
tude, to be able to confine the corona....
As first inferred by Parker in 1958, the correct implicication of this puzzling result is
that a hydrostatic corona is impossible; instead, Parker made the daring suggestion
that there should be a solar wind, which was observationally confirmed a few years
later.
————————————————-
Having addressed hydrostatics (‘no flow’), we now consider the next simplest flow;
steady flow, which is characterised by ~u(~x, t) = ~u(~x). For steady flow @~u/@t = 0,
and fluid elements move along the streamlines (see Chapter 2).
The enthalpy, H, is a measure for the total energy of a thermodynamic system that
includes the internal energy, U, and the amount of energy required to make room
for it by displacing its environment and establishing its volume and pressure:
H = U + PV
78
The di↵erential of the enthalpy can be written as
dH = dU + P dV + V dP
Using the first law of thermodynamics, according to which dU = dQ P dV , and
the second law of thermodynamics, according to which dQ = T dS, we can rewrite
this as
dH = T dS + V dP
which, in specific form, becomes
dP
dh = T ds +
⇢
(i.e., we have s = S/m). This relation is one of the Gibbs relations frequently
encountered in thermodynamics. NOTE: for completeness, we point out that this
expression ignores changes in the chemical potential (see Appendix L).
rP
= rh T rs
⇢
(for a formal proof, see at the end of this chapter). Now recall from the previous
chapter on vorticity that the baroclinic term is given by
✓ ◆
rP r⇢ ⇥ rP
r⇥ =
⇢ ⇢2
Using the above relation, and using that the curl of the gradient of a scalar vanishes,
we can rewrite this baroclinic term as r ⇥ (T rs). This implies that one can create
vorticity by creating a gradient in (specific) entropy! One way to do this, which is
one of the most important mechanisms of creating vorticity in astrophysics, is via
curved shocks; when an irrotational, isentropic fluid comes across a curved shock,
di↵erent streamlines will experience a di↵erent jump in entropy ( s will depend on
the angle under which you cross the shock). Hence, in the post-shocked gas there
will be a gradient in entropy, and thus vorticity.
79
Intermezzo: isentropic vs. adiabatic
Using the momentum equation for a steady, ideal fluid, and substituting rP/⇢ !
rh T rs, we obtain
rB = T rs + ~u ⇥ w
~
u2 u2
B⌘ + +h= + + " + P/⇢
2 2
Let’s investigate what happens to the Bernoulli function for an ideal fluid in a
steady flow. Since we are in a steady state we have that
dB @B
= + (~u · r)B = (~u · r)B
dt @t
Next we use that
80
u2
(~u · r)B = (~u · r) + +h
2
2
u rP
= (~u · r) + + ~u · + T (~u · rs)
2 ⇢
✓ 2 ◆
u rP
= (~u · r + + + T (~u · rs)
2 ⇢
= ~u · (~u ⇥ w)
~ + T (~u · rs)
= 0
Here we have used that the cross-product of ~u and w~ is perpendicular to ~u, and that
in an ideal fluid ~u · rs = 0. The latter follow from the fact that in an ideal fluid
ds/dt = 0, and the fact that ds/dt = @s/@t + ~u · rs. Since all @/@t terms vanish
for a steady flow, we see that ~u · rs = 0 for a steady flow of ideal fluid. And as a
consequence, we thus also have that
dB
=0
dt
Hence, in a steady flow of ideal fluid, the Bernoulli function is conserved. Using the
definition of the Bernoulli function we can write this as
dB d~u d ds 1 dP
= ~u · + +T + =0
dt dt dt dt ⇢ dt
Since ds/dt = 0 for an ideal fluid, we have that if the flow is such that the gravita-
tional potential along the flow doesn’t change significantly (such that d /dt ' 0),
we find that
d~u 1 dP
~u · =
dt ⇢ dt
This is known as Bernoulli’s theorem, and states that as the speed of a steady
flow increases, the internal pressure of the ideal fluid must decrease. Applications of
Bernoulli’s theorem discussed in class include the shower curtain and the pitot tube
(a flow measurement device used to measure fluid flow velocity).
————————————————-
81
Potential flow: The final flow to consider in this chapter is potential flow. Consider
~ ⌘ r ⇥ ~u = 0 everywhere. This implies that
an irrotational flow, which satisfies w
there is a scalar function, u (x), such that ~u = r u , which is why u (x) is called
the velocity potential. The corresponding flow ~u(~x) is called potential flow.
If the fluid is ideal (i.e., ⌫ = K = 0), and barotropic or isentropic, such that the flow
fluid has vanishing baroclinicity, then Kelvin’s circulation theorem assures that
the flow will remain irrotational throughout (no vorticity can be created), provided
that all forces acting on the fluid are conservative.
r · ~u = r2 u =0
This is the well known Laplace equation, familiar from electrostatics. Mathemat-
ically, this equation is of the elliptic PDE type which requires well defined boundary
conditions in order for a solution to both exist and be unique. A classical case of
potential flow is the flow around a solid body placed in a large fluid volume. In this
case, an obvious boundary condition is the one stating that the velocity component
perpendicular to the surface of the body at the body (assumed at rest) is zero. This
is called a Neumann boundary condition and is given by
@ u
= ~n · r u = 0
@n
with ~n the normal vector. The Laplace equation with this type of boundary condition
constitutes a well-posed problem with a unique solution. An example of potential
flow around a solid body is shown in Fig. 2 in Chapter 2. We will not examine any
specific examples of potential flow, as this means having to solve a Laplace equation,
which is purely a mathematical exersize. We end, though, by pointing out that real
fluids are never perfectly inviscid (ideal fluids don’t exist). And any flow past a
surface involves a boundary layer inside of which viscosity creates vorticity (due to
no-slip boundary condition, which states that the tangential velocity at the surface
of the body must vanish). Hence, potential flow can never fully describe the flow
around a solid body; otherwise one would run into d’Alembert’s paradox which
is that steady potential flow around a body exerts zero force on the body; in other
words, it costs no energy to move a body through the fluid at constant speed. We
know from everyday experience that this is indeed not true. The solution to the
82
paradox is that viscosity created in the boundary layer, and subsequently dissipated,
results in friction.
Although potential flow around an object can thus never be a full description of the
flow, in many cases, the boundary layer is very thin, and away from the boundary
layer the solutions of potential flow still provide an accurate description of the flow.
————————————————-
To see this, use that the natural variables of h are the specific entropy, s, and the
pressure P . Hence, h = h(s, P ), and we thus have that
@h @h
dh = ds + dP
@s @P
From a comparison with the previous expression for dh, we see that
@h @h 1
=T, =
@s @P ⇢
which allows us to derive
@h @h @h
rh = ~ex + ~ey + ~ez
@x @y @z
✓ ◆ ✓ ◆ ✓ ◆
@h @s @h @P @h @s @h @P @h @s @h @P
= + ~ex + + ~ey + + ~ez
@s @x @P @x @s @y @P @y @s @z @P @z
✓ ◆ ✓ ◆
@h @s @s @s @h @P @P @P
= ~ex + ~ey + ~ez + ~ex + ~ey + ~ez
@s @x @y @z @P @x @y @z
1
= T rs + rP
⇢
————————————————-
83
CHAPTER 10
As we have seen in our discussion on potential flow in the previous chapter, realistic
flow past an object always involves a boundary layer in which viscosity results in
vorticity. Even if the viscosity of the fluid is small, the no-slip boundary condition
typically implies a region where the shear is substantial, and viscocity thus manifests
itself.
In this chapter we examine two examples of viscous flow. We start with a well-
known example from engineering, known as Poiseuille-Hagen flow through a pipe.
Although not really an example of astrophysical flow, it is a good illustration of how
viscosity manifests itself as a consequence of the no-slip boundary condition. The
second example that we consider is viscous flow in a thin accretion disk. This flow,
which was first worked out in detail in a famous paper by Shakura & Sunyaev in
1973, is still used today to describe accretion disks in AGN and around stars.
————————————————-
Pipe Flow: Consider the steady flow of an incompressible viscous fluid through
a circular pipe of radius Rpipe and lenght L. Let ⇢ be the density of the fluid as
it flows through the pipe, and let ⌫ = µ/⇢ be its kinetic viscosity. Since the
flow is incompressible, we have that fluid density will be ⇢ throughout. If we pick a
Cartesian coordinate system with the z-axis along the symmetry axis of the cylinder,
then the velocity field of our flow is given by
~u = uz (x, y, z) ~ez
In other words, ux = uy = 0.
84
Figure 11: Poiseuille-Hagen flow of a viscous fluid through a pipe of radius Rpipe and
lenght L.
and using that all partial time-derivatives of a steady flow vanish, we obtain that
@⇢ux @⇢uy @⇢uz @uz
+ + =0 ) =0
@x @y @z @z
where we have used that @⇢/@z = 0 because of the incompressibility of the flow.
Hence, we can update our velocity field to be ~u = uz (x, y) ~ez .
Next we write down the momentum equations for a steady, incompressible flow,
which are given by
rP
(~u · r)~u = + ⌫r2 ~u r
⇢
In what follows we assume the pipe to be perpendicular to r , so that we may
ignore the last term in the above expression. For the x- and y- components of the
momentum equation, one obtains that @P/@x = @P/@y = 0. For the z-component,
we instead have
@uz 1 @P
uz = + ⌫r2 uz
@z ⇢ @z
Combining this with our result from the continuity equation, we obtain that
1 @P
= ⌫r2 uz
⇢ @z
Next we use that @P/@z cannot depend on z; otherwise uz would depend on z, but
according to the continuity equation @uz /@z = 0. This means that the pressure
85
gradient in the z-direction must be constant, which we write as P/L, where P
is the pressure di↵erent between the beginning and end of the pipe, and the minus
sign us used to indicate the the fluid pressure declines as it flows throught the pipe.
P ⇥ 2 ⇤
uz (R) = Rpipe R2
4⇢ ⌫ L
As is evident from the above expression, for a given pressure di↵erence P , the flow
speed u / ⌫ 1 (i.e., a more viscous fluid will flow slower). In addition, for a given
fluid viscosity, applying a larger pressure di↵erence P results in a larger flow speed
(u / P ).
Now let us compute the amount of fluid that flows through the pipe per unit time:
R
Zpipe
⇡ P 4
Ṁ = 2⇡ ⇢ uz (R) R dR = R
8 ⌫ L pipe
0
Note the strong dependence on the pipe radius; this makes it clear that a clogging of
the pipe has a drastic impact on the mass flow rate (relevant for both arteries and oil-
pipelines). The above expression also gives one a relatively easy method to measure
86
the viscosity of a fluid: take a pipe of known Rpipe and L, apply a pressure di↵erence
P across the pipe, and measure the mass flow rate, Ṁ; the above expression allows
one to then compute ⌫.
The Poiseuille velocity flow field has been experimentally confirmed, but only for
slow flow! When |~u| gets too large (i.e., P is too large), then the flows becomes
irregular in time and space; turbulence develops and |~u| drops due to the enhanced
drag from the turbulence. This will be discussed in more detail in Chapter 11.
————————————————-
Accretion Disks: We now move to a viscous flow that is more relevant for as-
trophysics; accretion flow. Consider a thin accretion disk surrounding an accreting
object of mass M• Mdisk (such that we may ignore the disk’s self-gravity). Because
of the symmetries involved, we adopt cylindrical coordinates, (R, ✓, z), with the
z-axis perpendicular to the disk. We also have that @/@✓ is zero, and we set uz = 0
throuhout.
Let’s start with the continuity equation, which in our case reads
@⇢ 1 @
+ (R ⇢ uR ) = 0
@t R @R
(see Appendix D for how to express the divergence in cylindrical coordinates).
NOTE: There are several terms in the above expression that may seem ‘surprising’.
The important thing to remember in writing down the equations in curvi-linear
87
coordinates is that operators can also act on unit-direction vectors. For example,
the ✓-component of r2~u is NOT r2 u✓ . That is because the operator r2 acts on
uR~eR + u✓~e✓ + uz~ez , and the directions of ~eR and ~e✓ depend on position! The same
holds for the convective operator (~u · r) ~u. The full expressions for both cylindrical
and spherical coordinates are written out in Appendix D.
Setting all the terms containing @/@✓ and/or uz to zero, the Navier-Stokes equation
simplifies considerably to
2
@u✓ @u✓ uR u✓ @ u✓ @ 2 u✓ 1 @u✓ u✓
⇢ + uR + =µ 2
+ 2
+
@t @R R @R @z R @R R2
where we have replaced the kinetic viscosity, ⌫, with µ = ⌫⇢.
Next we multiply the continuity equation by Ru✓ which we can then write as
@(⌃ R u✓ ) @(Ru✓ ) @(⌃ R uR u✓ ) @u✓
⌃ + R ⌃ uR =0
@t @t @R @R
Adding this to R times the Navier-Stokes equation, and rearranging terms, yields
@(⌃ R u✓ ) @(⌃ R uR u✓ )
+ + ⌃ uR u✓ = G(µ, R)
@t @R
where G(µ, R) = RF (µ). Next we introduce the angular frequency ⌦ ⌘ u✓ /R
which allows us to rewrite the above expression as
@(⌃ R2 ⌦) 1 @
+ ⌃ R3 ⌦ uR = G(µ, R)
@t R @R
88
Note that ⌃ R2 ⌦ = ⌃ R u✓ is the angular momentum per unit surface density. Hence
the above equation describes the evolution of angular momentum in the accretion
disk. It is also clear, therefore, that G(µ, R) must describe the viscous torque on
the disk material, per unit surface area. To derive an expression for it, recall that
Z 2
@ u✓ 1 @u✓ u✓
G(µ, R) = R dz µ 2
+
@R R @R R2
where we have ignored the @ 2 u✓ /@z 2 term which is assumed to be small. Using that
µ = ⌫⇢ and that µ is independent of R and z (this is an assumption that underlies
the Navier-Stokes equation from which we started) we have that
2
@ u✓ 1 @u✓ u✓
G(µ, R) = ⌫ R ⌃ +
@R2 R @R R2
Next we use that u✓ = ⌦ R to write
@u✓ d⌦
= ⌦+R
@R dR
Substituting this in the above expression for G(µ, R) yield
2
✓ ◆
2d ⌦ d⌦ 1 @ 3 d⌦
G(µ, R) = ⌫ ⌃ R + 3R = ⌫ ⌃R
dR2 dR R @R dR
Substituting this expression for the viscous torque in the evolution equation for the
angular momentum per unit surface density, we finally obtain the full set of equations
that govern our thin accretion disk:
✓ ◆
@ 1 @ 1 @ 3 d⌦
⌃ R2 ⌦ + ⌃ R 3 ⌦ uR = ⌫ ⌃R
@t R @R R @R dR
@⌃ 1 @
+ (R ⌃ uR ) = 0
@t R @R
✓ ◆1/2
G M•
⌦=
R3
89
These three equations describe the dynamics of a thin, viscous accretion disk. The
third equation indicates that we assume that the fluid is in Keplerian motion around
the accreting object of mass M• . As discussed further below, this is a reasonable
assumption as long as the accretion disk is thin.
Ṁ (R) = 2⇡⌃ R uR
Now let us consider a steady accretion disk. This implies that @/@t = 0 and
that Ṁ (R) = Ṁ ⌘ Ṁ• (the mass flux is constant throughout the disk, otherwise
@⌃/@t 6= 0). In particular, the continuity equation implies that
R ⌃ u R = C1
Using the above expression for the mass inflow rate, we see that
Ṁ•
C1 =
2⇡
Ṁ•
C2 = R•2 ⌦• C1 = (G M• R• )1/2
2⇡
90
we have that
✓ ◆ 1
Ṁ• ⇥ 2 ⇤ 3 d⌦
⌫⌃ = R ⌦ + (G M• R• )1/2 R
2⇡ dR
" ✓ ◆1/2 #
Ṁ• R•
= + 1
3⇡ R
This shows that the mass inflow rate and kinetic viscosity depend linearly on each
other.
The gravitational energy lost by the inspiraling material is converted into heat. This
is done through viscous dissipation: viscosity robs the disk material of angular
momentum which in turn causes it to spiral in.
(see Chapter 4). Note that the last term in the above expression vanishes because
the fluid is incompressible, such that
"✓ ◆2 #
@ui @uj @ui
V=µ +
@xj @xi @xj
In our case, using that @/@✓ = @/@z = 0 and that uz = 0, the only surviving terms
are
"✓ ◆2 ✓ ◆2 # " ✓ ◆2 ✓ ◆2 #
@uR @u✓ @uR @uR @uR @u✓
V =µ + + =µ 2 +
@R @R @R @R @R @R
91
If we make the reasonable assumption that uR ⌧ u✓ , we can ignore the first term,
such that we finally obtain
✓ ◆2 ✓ ◆2
@u✓ 2 d⌦
V=µ = µR
@R dR
which expresses the viscous dissipation per unit volume. Note that @u✓ /@R = ⌦ +
d⌦/dR. Hence, even in a solid body rotation (d⌦/dR = 0) there is a radial derivative
of u✓ . However, when d⌦/dR = 0 there is no velocity shear in the disk, which shows
that the ⌦ term cannot contribute to the viscous dissipation rate.
Using once more that d⌦/dR = (3/2)⌦/R, and integrating over the entire disk
yields the accretion luminosity of a thin accretion disk:
Z1
dE G M• Ṁ•
Lacc ⌘ 2⇡ R dR =
dt 2 R•
R•
To put this in perspective, realize that the gravitation energy of mass m at radius
R• is G M• m/ R• . Thus, Lacc is exactly half of the gravitational energy lost due to
the inflow. This obviously begs the question where the other half went...The answer
is simple; it is stored in kinetic energy at the ‘boundary’ radius R• of the accreting
flow.
We end our discussion on accretion disks with a few words of caution. First of
all, our entire derivation is only valid for a thin accretion disk. In a thin disk, the
92
pressure in the disk must be small (otherwise it would pu↵ up). This means that the
@P/@R term in the R-component of the Navier-Stokes equation is small compared
to @ /@R = GM/R2 . This in turn implies that the gas will indeed be moving on
Keplerian orbits, as we have assumed. If the accretion disk is thick, the situation is
much more complicated, something that will not be covered in this course.
Finally, let us consider the time scale for accretion. As we have seen above, the
energy loss rate per unit surface area is
✓ ◆2
2 d⌦ 9 G M•
⌫ ⌃R = ⌫
dR 4 R3
We can compare this with the gravitation potential energy of disk material per unit
surface area, which is
G M• ⌃
E=
R
E 4 R2 R2
tacc ⌘ = ⇠
dE/dt 9 ⌫ ⌫
Too estimate this time-scale, we first estimate the molecular viscosity. Recall that
⌫ / mfpv with v a typical velocity of the fluid particles. In virtually all cases
encountered in astrophysics, we have that the size of the accretion disk, R, is many,
many orders of magnitude larger than mfp . As a consequence, the corresponding
tacc easily exceeds the Hubble time!
The conclusion is that molecular viscosity is way too small to result in any signif-
icant accretion in objects of astrophysical size. Hence, other source of viscosity are
required, which is a topic of ongoing discussion in the literature. Probably the most
promising candidates are turbulence (in di↵erent forms), and the magneto-rotational
instability (MRI). Given the uncertainties involved, it is common practive to simply
write ✓ ◆ 1
P 1 d⌦
⌫=↵
⇢ R dR
where ↵ is a ‘free parameter’. A thin accretion disk modelled this way is often called
an alpha-accretion disk. If you wonder what the origin is of the above expression;
93
Figure 12: Image of the central region of NGC 4261 taken with the Hubble Space
Telescope. It reveals a ⇠ 100pc scale disk of dust and gas, which happens to be per-
pendicular to a radio jet that emerges from this galaxy. This is an alledged ‘accretion
disk’ supplying fuel to the central black hole in this galaxy. This image was actually
analyzed by the author as part of his thesis.
it simply comes from assuming that the only non-vanishing o↵-diagonal term of the
stress tensor is taken to be ↵P (where P is the value along the diagonal of the stress
tensor).
94
CHAPTER 11
Turbulence
1
~u · r~u = ru2 ~u ⇥ w
~
2
which describes the ”inertial acceleration” and is ultimately responsible for the origin
of the chaotic character of many flows and of turbulence. Because of this non-
linearity, we cannot say whether a solution to the Navier-Stokes equation with nice
and smooth initial conditions will remain nice and smooth for all time (at least not
in 3D).
Laminar flow: occurs when a fluid flows in parallel layers, without lateral mixing
(no cross currents perpendicular to the direction of flow). It is characterized by high
momentum di↵usion and low momentum convection.
The Reynold’s number: In order to gauge the importance of viscosity for a fluid,
it is useful to compare
⇥ 2 the ratio ⇤of the inertial acceleration (~u · r~u) to the viscous
1
acceleration (⌫ r ~u + 3 r(r · ~u) ). This ratio is called the Reynold’s number, R,
and can be expressed in terms of the typical velocity scale U ⇠ |~u| and length scale
L ⇠ 1/r of the flow, as
~u · r~u U 2 /L UL
R= ⇥ ⇤ ⇠ =
⌫ r2~u + 13 r(r · ~u) ⌫U/L2 ⌫
If R 1 then viscosity can be ignored (and one can use the Euler equations to
describe the flow). However, if R ⌧ 1 then viscosity is important.
95
Figure 13: Illustration of laminar vs. turbulent flow.
Similarity: Flows with the same Reynold’s number are similar. This is evident
from rewriting the Navier-Stokes equation in terms of the following dimensionless
variables
~u ~x U P ˜=
ũ = x̃ = t̃ = t p̃ = r̃ = L r
U L L ⇢ U2 U2
This yields (after multiplying the Navier-Stokes equation with L/U 2 ):
@ ũ 1 1
+ ũ · r̃ũ + r̃p̃ + r̃ ˜ = r̃2 ũ + r̃(r̃ · ũ)
@ t̃ R 3
which shows that the form of the solution depends only on R. This principle is
extremely powerful as it allows one to making scale models (i.e., when developing
airplanes, cars etc). NOTE: the above equation is only correct for an incompressible
fluid, i.e., a fluid that obeys r⇢ = 0. If this is not the case the term P̃ (r⇢/⇢) needs
to be added at the rhs of the equation, braking its scale-free nature.
96
Figure 14: Illustration of flows at di↵erent Reynolds number.
• R > 103 : vortices are unstable, resulting in a turbulent wake behind the
cylinder that is ‘unpredictable’.
97
Figure 15: The image shows the von Kármán Vortex street behind a 6.35 mm di-
ameter circular cylinder in water at Reynolds number of 168. The visualization was
done using hydrogen bubble technique. Credit: Sanjay Kumar & George Laughlin,
Department of Engineering, The University of Texas at Brownsville
The following movie shows a R = 250 flow past a cylinder. Initially one can witness
separation, and the creation of two counter-rotating vortices, which then suddenly
become ‘unstable’, resulting in the von Kármán vortex street:
http://www.youtube.com/watch?v=IDeGDFZSYo8
98
Figure 16: Typical Reynolds numbers for various biological organisms. Reynolds
numbers are estimated using the length scales indicated, the rule-of-thumb in the
text, and material properties of water.
99
Boundary Layers: Even when R 1, viscosity always remains important in thin
boundary layers adjacent to any solid surface. This boundary layer must exist in
order to satisfy the no-slip boundary condition. If the Reynolds number exceeds
a critical value, the boundary layer becomes turbulent. Turbulent layes and their
associated turbulent wakes exert a much bigger drag on moving bodies than their
laminar counterparts.
Momentum Di↵usion & Reynolds stress: This gives rise to an interesting phe-
nomenon. Consider flow through a pipe. If you increase the viscosity (i.e., decrease
R), then it requires a larger force to achieve a certain flow rate (think of how much
harder it is to push honey through a pipe compared to water). However, this trend
is not monotonic. For sufficiently low viscosity (large R), one finds that the trend
reverses, and that is becomes harder again to push the fluid through the pipe. This
is a consequence of turbulence, which causes momentum di↵usion within the flow,
which acts very much like viscosity. However, this momentum di↵usion is not due
to the viscous stress tensor, ⌧ij , but rather to the Reynolds stress tensor Rij .
To understand the ‘origin’ of the Reynolds stress tensor,consider the following:
ui = ūi + u0i
This is knowns as the Reynolds decomposition. The ‘mean’ component can be a
time-average, a spatial average, or an ensemble average, depending on the detailed
characteristics of the flow. Note that this is reminiscent of how we decomposed the
microscopic velocities of the fluid particles in a ‘mean’ velocity (describing the fluid
elements) and a ‘random, microscopic’ velocity (~v = ~u + w).
~
Substituting this into the Navier-Stokes equation, and taking the average of that, we
obtain
@ ūi @ ūi 1 @ ⇥ ⇤
+ ūj = ⌧ ij ⇢u0i u0j
@t @xj ⇢ @xj
where, for simplicity, we have ignored gravity (the r -term). This equation looks
identical to the Navier-Stokes equation (in absence of gravity), except for the ⇢u0i u0j
term, which is what we call the Reynolds stress tensor:
100
Rij = ⇢u0i u0j
Note that u0i u0j means the same averaging (time, space or ensemble) as above, but
now for the product of u0i and u0j . Note that ū0i = 0, by construction. However,
the expectation value for the product of u0i and u0j is generally not. As is evident
from the equation, the Reynolds stresses (which reflect momentum di↵usion due
to turbulence) act in exactly the same way as the viscous stresses. However, they
are only present when the flow is turbulent.
Note also that the Reynolds stress tensor is related to the two-point correlation
tensor
101
• Turbulent flows have a high rate of viscous energy dissipation.
• Advected tracers are rapidly mixed by turbulent flows.
However, one further property of turbulence seems to be more fun-
damental than all of these because it largely explains why turbulence
demands a statistical treatment...turbulence is chaotic.
Turbulence kicks in at sufficiently high Reynolds number (typically R > 103 104 ).
Turbulent flow is characterized by irregular and seemingly random motion. Large
vortices (called eddies) are created. These contain a large amount of kinetic energy.
Due to vortex stretching these eddies are stretched thin until they ‘break up’ in
smaller eddies. This results in a cascade in which the turbulent energy is transported
from large scales to small scales. This cascade is largely inviscid, conserving the total
turbulent energy. However, once the length scale of the eddies becomes comparable
to the mean free path of the particles, the energy is dissipated; the kinetic energy
associated with the eddies is transformed into internal energy. The scale at which
this happens is called the Kolmogorov length scale. The length scales between
the scale of turbulence ‘injection’ and the Kolomogorov length scale at which it
is dissipated is called the inertial range. Over this inertial range turbulence is
believed/observed to be scale invariant. The ratio between the injection scale, L,
and the dissipation scale, l, is proportional to the Reynolds number according to
L/l / R3/4 . Hence, two turbulent flows that look similar on large scales (comparable
L), will dissipate their energies on di↵erent scales, l, if their Reynolds numbers are
di↵erent.
102
CHAPTER 12
Sound Waves
If the perturbation is small, we may assume that the velocity gradients are so small
that viscous e↵ects are negligble (i.e., we can set ⌫ = 0). In addition, we assume that
the time scale for conductive heat transport is large, so that energy exchange due to
conduction can also safely be ignored. In the absence of these dissipative processes,
the wave-induced changes in gas properties are adiabatic.
Thus, as long as the wave-length of the acoustic wave is much larger than the mean-
free path of the fluid particles, we have that the Reynolds number is large, and thus
that viscosity and conduction can be ignored.
Let (⇢0 , P0 , ~u0) be a uniform, equilibrium solution of the Euler fluid equations
(i.e., ignore viscosity). Also, in what follows we will ignore gravity (i.e., r = 0).
Uniformity implies that r⇢0 = rP0 = r~u0 = 0. In addition, since the only al-
lowed motion is uniform motion of the entire system, we can always use a Galilean
coordinate transformation so that ~u0 = 0, which is what we adopt in what follows.
103
Substitution into the continuity and momentum equations, one obtains that @⇢0 /@t =
@~u0 /@t = 0, indicative of an equilibrium solution as claimed.
Perturbation Analysis: Consider a small perturbation away from the above equi-
librium solution:
⇢0 ! ⇢0 + ⇢1
P0 ! P0 + P1
~u0 ! ~u0 + ~u1 = ~u1
where |⇢1 /⇢0 | ⌧ 1, |P1 /P0 | ⌧ 1 and ~u1 is small (compared to the sound speed, to
be derived below).
Next we linearize these equations, which means we use that the perturbed values
are all small such that terms that contain products of two or more of these quantities
are always negligible compared to those that contain only one such quantity. Hence,
the above equations reduce to
@⇢1
+ ⇢0 r~u1 = 0
@t
@~u1 rP1
+ = 0
@t ⇢0
104
These equations describe the evolution of perturbations in an inviscid and uniform
fluid. As always, these equations need an additional equation for closure. As men-
tioned above, we don’t need the energy equation: instead, we can use that the
flow is adiabatic, which implies that P / ⇢ .
where we have used (@P/@⇢)0 as shorthand for the partial derivative of P (⇢) at
⇢ = ⇢0 . And since the flow is isentropic, we have that the partial derivative is for
constant entropy. Using that P (⇢0 ) = P0 and P (⇢0 + ⇢1 ) = P0 + P1 , we find that,
when linearized, ✓ ◆
@P
P1 = ⇢1
@⇢ 0
Note that P1 6= P (⇢1 ); rather P1 is the perturbation in pressure associated with the
perturbation ⇢1 in the density.
Taking the partial time derivative of the above continuity equation, and using that
@⇢0 /@t = 0, gives
@ 2 ⇢1 @~u1
2
+ ⇢0 r · =0
@t @t
Substituting the above momentum equation, and realizing that (@P/@⇢)0 is a
constant, then yields
✓ ◆
@ 2 ⇢1 @P
r2 ⇢1 = 0
@t2 @⇢ 0
105
with ~k the wavevector, k = |~k| = 2⇡/ the wavenumber, the wavelength,
! = 2⇡⌫ the angular frequency, and ⌫ the frequency.
To gain some insight, consider the 1D case: ⇢1 / ei(kx !t) / eik(x vp t) , where we have
defined the phase velocity vp ⌘ !/k. This is the velocity with which the wave
pattern propagates through space. For our perturbation of a compressible fluid, this
phase velocity is called the sound speed, cs . Substituting the solution ⇢1 / ei(kx !t)
into the wave equation, we see that
s✓ ◆
! @P
cs = =
k @⇢ s
where we have made it explicit that the flow is assumed to be isentropic. Note that
the partial derivative is for the unperturbed medium. This sound speed is sometimes
called the adiabatic speed of sound, to emphasize that it relies on the assumption
of an adiabatic perturbation. If the fluid is an ideal gas, then
s
kB T
cs =
µ mp
which shows that the adiabatic sound speed of an ideal fluid increases with temper-
ature.
We can repeat the above derivation by relaxing the assumption of isentropic flow,
and assuming instead that (more generally) the flow is polytropic. In that case,
P / ⇢ , with the polytropic index (Note: a polytropic EoS is an example of a
barotropic EoS). The only thing that changes is that now the sound speed becomes
s s
@P P
cs = =
@⇢ ⇢
which shows that the sound speed is larger for a sti↵er EoS (i.e., a larger value of ).
Note also that, for our barotropic fluid, the sound speed is independent of !. This
implies that all waves move equally fast; the shape of a wave packet is preserved
106
as it moves. We say that an ideal (inviscid) fluid with a barotropic EoS is a non-
dispersive medium.
To gain further insight, let us look once more at the (1D) solution for our perturba-
tion:
⇢1 / ei(kx !t)
/ eikx e i!t
• The eikx part describes a periodic, spatial oscillation with wavelength = 2⇡/k.
i!t
• The e part describes the time evolution:
We will return to this in Chapter 14, when we discuss the Jeans stability criterion.
In the linear perturbation theory used earlier in this chapter, we neglected the inertial
acceleration term ~u · r~u since it is quadratic in the (assumed small) velocity. When
developing a theory in which the sound waves are sourced by fluid, the velocities
are not necessarily small, and we cannot neglect the inertial acceleration term. To
107
proceed, it is advantageous to start from the Euler equation in flux-conservative form
(see Chapter 3)
@⇢ui @⇧ij
+ =0
@t @xj
Here ⇧ij is the momentum flux density tensor which, for an inviscid fluid, is
given by
⇧ij = P ij + ⇢ui uj
which describes the departure of ⇧ij from linear theory. To see this, recall that
in linear theory any term that is quadratic in velocity is ignored, and that in linear
theory the perturbed pressure and density are related according to P1 = c2s ⇢1 . Hence,
in linear theory Qij = 0.
Substituting the above expression for ⇧ij in the Euler equation in flux-conservative
form yields
@⇢ui @⇢1 @Qij
+ c2s =
@t @xi @xj
108
Substituting the Euler equation finally yields the inhomogeneous wave equation
@ 2 ⇢1 @ 2 ⇢1 @ 2 Qij
c2s =
@t2 @x2i @xi @xj
which is known as the Lighthill equation, after M.J. Lighthill who first derived it
in 1952. It is an example of an inhomogeneous wave equation; the term on the
rhs is a source term, and its presence makes the PDE inhomogeneous.
~
@A
~ =
E r ~ =r⇥A
B ~
@t
and adopting the Lorenz gauge condition
1@ ~=0
+r·A
c2 @t
the four Maxwell equations in a vacuum with charge ⇢ and current J~ reduce to
two uncoupled, inhomogeneous wave equations that are symmetric in the potentials:
1 @2 ⇢
r2 =
c2 @t2 "0
1 @2A ~
~
r2 A = µ0 J~ .
c2 @t2
@ 2 ⇢1 @ 2 ⇢1
c2s = G(~x, t)
@t2 @x2i
is given by Z
1 G(~x0 , t |~x ~x0 |/cs )
⇢1 (~x, t) = dV 0
4⇡c2s |~x ~x0 |
109
This represent a superposition of spherical acoustic waves traveling outward from
their sources located at ~x0 .
Thus, we have seen that keeping the higher-order velocity terms yields a source term
of acoustic waves, given by
@2 ⇥ ⇤
G(~x, t) = ⇢ui uj + (P1 c2s ⇢1 ) ij
@xi @xj
Note that this is a scalar quantity (Einstein summation). Although this equation
gives some insight as to how fluid motion can spawn sound waves, actually solving
the Lighthill equation for a turbulent velocity field ~u(~x, t) is obviously horrendously
difficult.
110
CHAPTER 13
Shocks
When discussing sound waves in the previous chapter, we considered small (linear)
perturbations. In this Chapter we consider the case in which the perturbations are
large (non-linear). Typically, a large disturbance results in an abrupt discontinuity
in the fluid, called a shock. Note: not all discontinuities are shocks, but all shocks
are discontinuities.
If $\Gamma = 1$, i.e., the EoS is isothermal, then the sound speed is a constant, independent of density or pressure. However, if $\Gamma \neq 1$, then the sound speed varies with the local density. An important example, often encountered in (astro)physics, is the adiabatic EoS, for which $\Gamma = \gamma$ ($\gamma = 5/3$ for a mono-atomic gas). In that case $c_s$ increases with density (and pressure, and temperature).
Mach Number: if $v$ is the flow speed of the fluid, and $c_s$ is the sound speed, then the Mach number of the flow is defined as
\[
\mathcal{M} = \frac{v}{c_s}
\]
Note: simply accelerating a flow to supersonic speeds does not necessarily generate
a shock. Shocks only arise when an obstruction in the flow causes a deceleration of
fluid moving at supersonic speeds. The reason is that disturbances cannot propagate
upstream, so that the flow cannot ‘adjust itself’ to the obstacle because there is no
way of propagating a signal (which always goes at the sound speed) in the upstream
direction. Consequently, the flow remains undisturbed until it hits the obstacle,
resulting in a discontinuous change in flow properties; a shock.
Structure of a Shock: Fig. 17 shows the structure of a planar shock. The shock
has a finite, non-zero width (typically a few mean-free paths of the fluid particles),
and separates the ‘up-stream’, pre-shocked gas, from the ‘down-stream’, shocked gas.
For reasons that will become clear in what follows, it is useful to split the downstream region into two sub-regions: one in which the fluid is out of thermal equilibrium, with net cooling $\mathcal{L} > 0$, and, further away from the shock, a region where the downstream gas is (once again) in thermal equilibrium (i.e., $\mathcal{L} = 0$). If the transition between these two sub-regions falls well outside the shock (i.e., if $x_3 \gg x_2$) the shock is said to be adiabatic. In that case, we can derive a relation between the upstream (pre-shocked) properties ($\rho_1, P_1, T_1, u_1$) and the downstream (post-shocked) properties ($\rho_2, P_2, T_2, u_2$); these relations are called the Rankine-Hugoniot jump conditions. Linking the properties in region three ($\rho_3, P_3, T_3, u_3$) to those in the pre-shocked gas is in general not possible, except in the case where $T_3 = T_1$. In this case one may consider the shock to be isothermal.
Figure 17: Structure of a planar shock.
\[
\rho_1 u_1 = \rho_2 u_2
\]
This equation describes mass conservation across a shock.
\[
\frac{\partial}{\partial t}(\rho\, u_x) = -\frac{\partial}{\partial x}(\rho\, u_x u_x + P) - \rho\,\frac{\partial \Phi}{\partial x}
\]
Integrating this equation over $V$ and ignoring any gradient in $\Phi$ across the shock, we obtain
\[
\rho_1 u_1^2 + P_1 = \rho_2 u_2^2 + P_2
\]
This equation describes how the shock converts ram pressure into thermal
pressure.
Finally, applying the same to the energy equation under the assumption that the shock is adiabatic (i.e., $\mathrm{d}Q/\mathrm{d}t = 0$), one finds that $(E + P)\,u$ has to be the same on both sides of the shock, i.e.,
\[
\left[\frac{1}{2}u^2 + \Phi + \varepsilon + \frac{P}{\rho}\right] \rho\, u = \text{constant}
\]
We have already seen that $\rho\, u$ is constant. Hence, if we once more ignore gradients in $\Phi$ across the shock, we obtain that
\[
\frac{1}{2}u_1^2 + \varepsilon_1 + P_1/\rho_1 = \frac{1}{2}u_2^2 + \varepsilon_2 + P_2/\rho_2
\]
This equation describes how the shock converts kinetic energy into enthalpy.
Qualitatively, a shock converts an ordered flow upstream into a disordered (hot) flow
downstream.
The three equations in the rectangular boxes are known as the Rankine-Hugoniot (RH) jump conditions for an adiabatic shock. Using straightforward but tedious algebra, these RH jump conditions can be written in a more useful form using the Mach number $\mathcal{M}_1$ of the upstream gas:
\[
\frac{\rho_2}{\rho_1} = \frac{u_1}{u_2} = \left[\frac{2}{(\gamma+1)\,\mathcal{M}_1^2} + \frac{\gamma-1}{\gamma+1}\right]^{-1}
\]
\[
\frac{P_2}{P_1} = \frac{2\gamma}{\gamma+1}\,\mathcal{M}_1^2 - \frac{\gamma-1}{\gamma+1}
\]
\[
\frac{T_2}{T_1} = \frac{P_2}{P_1}\,\frac{\rho_1}{\rho_2} = \frac{\left[2\gamma\,\mathcal{M}_1^2 - (\gamma-1)\right]\left[(\gamma-1)\,\mathcal{M}_1^2 + 2\right]}{(\gamma+1)^2\,\mathcal{M}_1^2}
\]
Here we have used that for an ideal gas
\[
P = (\gamma - 1)\,\rho\,\varepsilon = \frac{\rho\, k_{\rm B} T}{\mu\, m_{\rm p}}
\]
Given that $\mathcal{M}_1 > 1$, we see that $\rho_2 > \rho_1$ (shocks compress), $u_2 < u_1$ (shocks decelerate), $P_2 > P_1$ (shocks increase pressure), and $T_2 > T_1$ (shocks heat). The latter may seem surprising, given that the shock is considered to be adiabatic: although the process has been adiabatic, in that $\mathrm{d}Q/\mathrm{d}t = 0$, the gas has changed its adiabat; its entropy has increased as a consequence of the shock converting kinetic energy into thermal, internal energy. In general, in the presence of viscosity, a change that is adiabatic does not imply that the states before and after are simply linked by the relation $P = K\rho^{\gamma}$, with $K$ some constant. Shocks are always viscous, which causes $K$ to change across the shock, such that the entropy increases; it is this aspect of the shock that causes irreversibility, thus defining an 'arrow of time'.
In the limit of a strong shock ($\mathcal{M}_1 \rightarrow \infty$) the first of these jump conditions yields $\rho_2/\rho_1 \rightarrow (\gamma+1)/(\gamma-1) = 4$, where we have used that $\gamma = 5/3$ for a monoatomic gas. Thus, with an adiabatic shock you can achieve a maximum compression in density of a factor four! Physically, the reason why there is a maximal compression is that the pressure and temperature of the downstream fluid diverge as $\mathcal{M}_1^2$. This huge increase in downstream pressure inhibits the amount of compression of the downstream gas. However, this is only true under the assumption that the shock is adiabatic. The downstream, post-shocked gas is out of thermal equilibrium, and in general will be cooling (i.e., $\mathcal{L} > 0$). At a certain distance past the shock (i.e., when $x = x_3$ in Fig. 17), the fluid will re-establish thermal equilibrium (i.e., $\mathcal{L} = 0$). In some special cases, one can obtain the properties of the fluid in the new equilibrium state; one such case is the example of an isothermal shock, for which the downstream gas has the same temperature as the upstream gas (i.e., $T_3 = T_1$).
In the case of an isothermal shock, the first two Rankine-Hugoniot jump conditions are still valid, i.e.,
\[
\rho_1 u_1 = \rho_3 u_3
\]
\[
\rho_1 u_1^2 + P_1 = \rho_3 u_3^2 + P_3
\]
However, the third condition, which derives from the energy equation, is no longer
valid. After all, in deriving that one we had assumed that the shock was adiabatic.
In the case of an isothermal shock we have to replace the third RH jump condition with $T_1 = T_3$. The latter implies that $c_s^2 = P_3/\rho_3 = P_1/\rho_1$, and allows us to rewrite the second RH condition as
\[
u_1^2 + c_s^2 = \frac{\rho_3}{\rho_1}\left(u_3^2 + c_s^2\right) = \frac{u_1}{u_3}\left(u_3^2 + c_s^2\right)
\quad\Longrightarrow\quad u_1\, u_3 = c_s^2
\]
Here the second step follows from using the first RH jump condition. If we now
substitute this result back into the first RH jump condition we obtain that
\[
\frac{\rho_3}{\rho_1} = \frac{u_1}{u_3} = \left(\frac{u_1}{c_s}\right)^2 = \mathcal{M}_1^2
\]
Hence, in the case of an isothermal shock (or an adiabatic shock, but sufficiently far behind the shock in the downstream fluid), there is no restriction on how much compression the shock can achieve; depending on the Mach number of the shock, the compression can be huge.
Figure 18: An actual example of a supernova blastwave. The red colors show the
optical light emitted by the supernova ejecta, while the green colors indicate X-ray
emission coming from the hot bubble of gas that has been shock-heated when the
blast-wave ran over it.
gas, which ‘pushes’ the shock outwards. As more and more material is swept-up, and
accelerated outwards, the mass of the shell increases, which causes the velocity of the
shell to decelerate. At the early stages, the cooling of the hot bubble is negligible, and the blastwave is said to be in the adiabatic phase, also known as the Sedov-Taylor phase. At some point, though, the hot bubble starts to cool, radiating away the kinetic energy of the supernova, and lowering the interior pressure up to the point that it no longer pushes the shell outwards. This is called the radiative phase.
From this point on, the shell expands purely by its inertia, being slowed down by the
work it does against the surrounding material. This phase is called the snow-plow
phase. Ultimately, the velocity of the shell becomes comparable to the sound speed
of the surrounding material, after which it continues to move outward as a sound
wave, slowly dissipating into the surroundings.
During the adiabatic phase, we can use a simple dimensional analysis to solve for the evolution of the shock radius, $r_{\rm sh}$, with time. Since the only physical parameters that can determine $r_{\rm sh}$ in this adiabatic phase are time, $t$, the initial energy of the SN explosion, $\varepsilon_0$, and the density of the surrounding medium, $\rho_0$, we have that
\[
r_{\rm sh} = A\, t^{\eta}\, \varepsilon_0^{\alpha}\, \rho_0^{\beta}
\]
with $A$ a dimensionless constant. It is easy to check that there is only one set of values for $\eta$, $\alpha$ and $\beta$ for which the product on the right has the dimensions of length (which is the dimension of $r_{\rm sh}$). This solution has $\eta = 2/5$, $\alpha = 1/5$ and $\beta = -1/5$, such that
\[
r_{\rm sh} = A \left(\frac{\varepsilon_0}{\rho_0}\right)^{1/5} t^{2/5}
\]
CHAPTER 14
Fluid Instabilities
Consider a blob that is displaced vertically by a small amount $\delta z$ in a stratified medium. The displaced blob re-establishes pressure equilibrium with its new surroundings on the sound crossing time, $\tau_s$, while re-establishing thermal equilibrium proceeds much slower, on the conduction time, $\tau_c$. Given that $\tau_s \ll \tau_c$ we can assume that $P_b^* = P'$, and treat the displacement as adiabatic. The latter implies that the process can be described by an adiabatic EoS: $P \propto \rho^{\gamma}$. Hence, we have that
\[
\rho_b^* = \rho_b \left(\frac{P_b^*}{P_b}\right)^{1/\gamma} = \rho_b \left(\frac{P'}{P}\right)^{1/\gamma} = \rho_b \left[1 + \frac{1}{P}\frac{\mathrm{d}P}{\mathrm{d}z}\,\delta z\right]^{1/\gamma}
\]
In the limit of small displacements $\delta z$, we can use Taylor series expansion to show that, to first order,
\[
\rho_b^* = \rho + \frac{\rho}{\gamma P}\frac{\mathrm{d}P}{\mathrm{d}z}\,\delta z
\]
where we have used that initially $\rho_b = \rho$, and that the Taylor series expansion, $f(x) \simeq f(0) + f'(0)\,x + \frac{1}{2}f''(0)\,x^2 + ...$, of $f(x) = [1+x]^{1/\gamma}$ is given by $f(x) \simeq 1 + \frac{1}{\gamma}x + ...$
Suppose we have a stratified medium in which $\mathrm{d}\rho/\mathrm{d}z < 0$ and $\mathrm{d}P/\mathrm{d}z < 0$. In that case, if $\rho_b^* > \rho'$ the blob will be heavier than its surroundings and it will sink back to its original position; the system is stable to convection. If, on the other hand, $\rho_b^* < \rho'$ then the displacement has made the blob more buoyant, resulting in instability. Hence, using that $\rho' = \rho + (\mathrm{d}\rho/\mathrm{d}z)\,\delta z$ we see that stability requires that
\[
\frac{\mathrm{d}\rho}{\mathrm{d}z} < \frac{\rho}{\gamma P}\frac{\mathrm{d}P}{\mathrm{d}z}
\]
This is called the Schwarzschild criterion for convective stability.
It is often convenient to rewrite this criterion in a form that contains the temperature.
Using that
\[
\rho = \rho(P, T) = \frac{\mu\, m_{\rm p}}{k_{\rm B} T}\, P
\]
it is straightforward to show that
\[
\frac{\mathrm{d}\rho}{\mathrm{d}z} = \frac{\rho}{P}\frac{\mathrm{d}P}{\mathrm{d}z} - \frac{\rho}{T}\frac{\mathrm{d}T}{\mathrm{d}z}
\]
Substitution in $\rho' = \rho + (\mathrm{d}\rho/\mathrm{d}z)\,\delta z$ then yields that
\[
\rho_b^* - \rho' = \left[\left(\frac{1}{\gamma} - 1\right)\frac{\rho}{P}\frac{\mathrm{d}P}{\mathrm{d}z} + \frac{\rho}{T}\frac{\mathrm{d}T}{\mathrm{d}z}\right]\delta z
\]
Since stability requires that $\rho_b^* - \rho' > 0$, and using that $\delta z > 0$, $\mathrm{d}P/\mathrm{d}z < 0$ and $\mathrm{d}T/\mathrm{d}z < 0$, we can rewrite the above Schwarzschild criterion for stability as
\[
\left|\frac{\mathrm{d}T}{\mathrm{d}z}\right| < \left(1 - \frac{1}{\gamma}\right)\frac{T}{P}\left|\frac{\mathrm{d}P}{\mathrm{d}z}\right|
\]
This shows that if the temperature gradient becomes too large the system becomes convectively unstable: blobs will rise up until they start to lose their thermal energy to the ambient medium, resulting in convective energy transport that tries to "overturn" the hot (high entropy) and cold (low entropy) material. In fact, without any proof we mention that in terms of the specific entropy, $s$, one can also write the Schwarzschild criterion for convective stability as $\mathrm{d}s/\mathrm{d}z > 0$.
In summary, the Schwarzschild criterion for convective stability can be written in the following equivalent forms:
\[
\left|\frac{\mathrm{d}T}{\mathrm{d}z}\right| < \left(1 - \frac{1}{\gamma}\right)\frac{T}{P}\left|\frac{\mathrm{d}P}{\mathrm{d}z}\right|\,, \qquad
\frac{\mathrm{d}\rho}{\mathrm{d}z} < \frac{\rho}{\gamma P}\frac{\mathrm{d}P}{\mathrm{d}z}\,, \qquad
\frac{\mathrm{d}s}{\mathrm{d}z} > 0
\]
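A minimal sketch (not from the notes; the numerical values are arbitrary placeholders) of how one might test the temperature form of this criterion for given local gradients:

```python
def convectively_stable(dT_dz, dP_dz, T, P, gamma=5.0/3.0):
    """Schwarzschild criterion: stable if |dT/dz| < (1 - 1/gamma) (T/P) |dP/dz|."""
    return abs(dT_dz) < (1.0 - 1.0/gamma) * (T / P) * abs(dP_dz)

# Example: an isothermal layer (dT/dz = 0) is always convectively stable
print(convectively_stable(0.0, -1.0e-12, 8000.0, 1.0e-13))   # -> True
```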
It is easy to see where the RT instability comes from. Consider a fluid of density $\rho_2$ sitting on top of a fluid of density $\rho_1 < \rho_2$ in a gravitational field that is pointing in the downward direction. Consider a small perturbation in which the initially horizontal interface takes on a small-amplitude, sinusoidal deformation. Since this implies moving a certain volume of denser material down, and an equally large volume of the lighter material up, it is immediately clear that the potential energy of this 'perturbed' configuration is lower than that of the initial state, and therefore energetically favorable. Simply put, the initial configuration is unstable to small deformations of the interface.

Figure 19: Example of Rayleigh-Taylor instability in a hydrodynamical simulation.
Stability analysis (i.e., perturbation analysis of the fluid equations) shows that the dispersion relation corresponding to the RT instability is given by
\[
\omega = \pm i\, k \sqrt{\frac{g}{k}\,\frac{\rho_2 - \rho_1}{\rho_2 + \rho_1}}
\]
where $g$ is the gravitational acceleration, and the factor $(\rho_2 - \rho_1)/(\rho_2 + \rho_1)$ is called the Atwood number. Since the wavenumber of the perturbation $k > 0$, we see that $\omega$ is imaginary, which implies that the perturbations will grow exponentially (i.e., the system is unstable). If $\rho_1 > \rho_2$ though, $\omega$ is real, and the system is stable (perturbations to the interface propagate as waves).
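As a numerical aside (not in the original notes), the e-folding rate implied by this dispersion relation, ${\rm Im}(\omega) = \sqrt{g\,k\,A}$ with $A$ the Atwood number, is easy to evaluate; the water-on-air example below is purely illustrative.

```python
import numpy as np

def rt_growth_rate(k, g, rho_heavy, rho_light):
    """e-folding rate Im(omega) = sqrt(g k (rho2-rho1)/(rho2+rho1)) of the RT instability."""
    atwood = (rho_heavy - rho_light) / (rho_heavy + rho_light)
    return np.sqrt(g * k * atwood)

# Water (1.0 g/cm^3) on top of air (1.2e-3 g/cm^3) in Earth gravity (981 cm/s^2),
# perturbation wavelength 1 cm  ->  growth time of order 10 ms
k = 2.0 * np.pi / 1.0   # cm^-1
print(1.0 / rt_growth_rate(k, 981.0, 1.0, 1.2e-3))
```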
Figure 20: Illustration of onset of Kelvin-Helmholtz instability
Stability analysis (i.e., perturbation analysis of the fluid equations) shows that the dispersion relation corresponding to the KH instability is given by
\[
\frac{\omega_R}{k} = \frac{\rho_1 u_1 + \rho_2 u_2}{\rho_1 + \rho_2}
\]
and
\[
\frac{\omega_I}{k} = \pm\frac{(\rho_1\,\rho_2)^{1/2}}{\rho_1 + \rho_2}\,(u_1 - u_2)
\]
Since the imaginary part is non-zero, except for $u_1 = u_2$, we have that, in principle, any velocity difference across an interface is KH unstable. In practice, surface tension can stabilize the short-wavelength modes, so that typically the KH instability kicks in above some velocity threshold.
The mode that will destroy the cloud has $k \sim 1/R_c$, so that the time-scale for cloud destruction is
\[
\tau_{\rm KH} \simeq \frac{1}{\omega} \simeq \frac{R_c}{c_{s,h}}\,\frac{\chi+2}{(\chi+1)^{1/2}}
\]
Assuming pressure equilibrium between cloud and ICM, and adopting the EoS of an ideal gas, implies that $\rho_h T_h = \rho_c T_c$, so that
\[
\frac{c_{s,h}}{c_{s,c}} = \frac{T_h^{1/2}}{T_c^{1/2}} = \frac{\rho_c^{1/2}}{\rho_h^{1/2}} = (\chi+1)^{1/2}
\]
Hence, one finds that the Kelvin-Helmholtz time for cloud destruction is
\[
\tau_{\rm KH} \simeq \frac{1}{\omega} \simeq \frac{R_c}{c_{s,c}}\,\frac{\chi+2}{\chi+1}
\]
Note that $\tau_{\rm KH} \sim \zeta\,(R_c/c_{s,c}) = \zeta\,\tau_s$, with $\zeta = 1\,(2)$ for $\chi \gg 1$ ($\chi \ll 1$). Hence, the Kelvin-Helmholtz instability will typically destroy clouds falling into a hot "atmosphere" on a time scale between one and two sound crossing times, $\tau_s$, of the cloud. Note, though, that magnetic fields and/or radiative cooling at the interface may stabilize the clouds.
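A short sketch (not from the notes; symbols as above, numbers purely illustrative) of this cloud-destruction estimate:

```python
def tau_kh(R_c, c_sc, chi):
    """KH cloud-destruction time, tau_KH ~ (R_c/c_sc) * (chi+2)/(chi+1),
    with chi the overdensity of the cloud relative to the hot medium."""
    return (R_c / c_sc) * (chi + 2.0) / (chi + 1.0)

# A 1 kpc cold cloud (c_s ~ 10 km/s) falling through a hot ICM, with chi ~ 100
kpc, kms, Myr = 3.086e21, 1.0e5, 3.156e13   # cgs conversions
print(tau_kh(1.0 * kpc, 10.0 * kms, 100.0) / Myr, "Myr")   # ~100 Myr, i.e. ~1 sound crossing time
```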
and a Jeans mass
\[
M_{\rm J} = \frac{4}{3}\pi\rho_0\left(\frac{\lambda_{\rm J}}{2}\right)^3 = \frac{\pi}{6}\,\rho_0\,\lambda_{\rm J}^3
\]
From the dispersion relation one immediately sees that the system is unstable (i.e., $\omega$ is imaginary) if $k < k_{\rm J}$ (or, equivalently, $\lambda > \lambda_{\rm J}$ or $M > M_{\rm J}$). This is called the Jeans criterion for gravitational instability. It expresses when pressure forces (which try to disperse matter) are no longer able to overcome gravity (which tries to make matter collapse), resulting in exponential gravitational collapse on a time scale
\[
\tau_{\rm ff} = \sqrt{\frac{3\pi}{32\,G\,\rho}}
\]
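For orientation, a small sketch (mine, not from the notes) that evaluates these quantities; it assumes the standard definition $\lambda_{\rm J} = c_s\sqrt{\pi/(G\rho_0)}$, which is not shown in this excerpt, and uses illustrative molecular-cloud numbers.

```python
import numpy as np

G = 6.674e-8   # cm^3 g^-1 s^-2

def jeans_quantities(c_s, rho0):
    """Jeans length [cm], Jeans mass [g] and free-fall time [s] for sound speed c_s
    [cm/s] and density rho0 [g/cm^3], assuming lambda_J = c_s sqrt(pi/(G rho0))."""
    lam_J = c_s * np.sqrt(np.pi / (G * rho0))
    M_J   = np.pi / 6.0 * rho0 * lam_J**3
    t_ff  = np.sqrt(3.0 * np.pi / (32.0 * G * rho0))
    return lam_J, M_J, t_ff

# A cold cloud core: T ~ 10 K, n ~ 1e4 cm^-3, pure hydrogen
c_s  = np.sqrt(1.38e-16 * 10.0 / 1.67e-24)   # isothermal sound speed [cm/s]
rho0 = 1.0e4 * 1.67e-24                      # mass density [g/cm^3]
print(jeans_quantities(c_s, rho0))
```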
In deriving the Jeans stability criterion you will encounter a somewhat puzzling issue. Consider the Poisson equation for the unperturbed medium (which has density $\rho_0$ and gravitational potential $\Phi_0$):
\[
\nabla^2 \Phi_0 = 4\pi G \rho_0
\]
Figure 21: The locus of thermal equilibrium ($\mathcal{L} = 0$) in the $(\rho, T)$ plane, illustrating the principle of thermal instability. The dashed line indicates a line of constant pressure.
The condition $\mathcal{L}(\rho, T) = 0$ corresponds to a curve in the $(\rho, T)$-plane with a shape similar to that shown in Fig. 21. It has flat parts at $T \sim 10^6\,$K, at $T \sim 10^4\,$K, and at $T \sim 10$-$100\,$K. This can be understood from simple atomic physics (see for example § 8.5.1 of Mo, van den Bosch & White, 2010). Above the TE curve we have that $\mathcal{L} > 0$ (net cooling), while below it $\mathcal{L} < 0$ (net heating). The dashed curve indicates a line of constant pressure ($T \propto \rho^{-1}$). Consider a blob in thermal and mechanical (pressure) equilibrium with its ambient medium, and with a pressure indicated by the dashed line. There are five possible solutions for the density and temperature of the blob, two of which are indicated by P1 and P2; here, confusingly, the P refers to 'point' rather than 'pressure'. Suppose I have a blob located at point P2, and I heat it, displacing it from TE along the constant-pressure curve (the blob is assumed small enough that the sound crossing time, on which the blob re-establishes mechanical equilibrium, is short). The blob now finds itself in the region where $\mathcal{L} > 0$ (i.e., net cooling), so that it will cool back to its original location on the TE curve; the blob is stable. For similar reasons, it is easy to see that a blob located at point P1 is unstable. This instability is called thermal instability, and it explains why the ISM is a three-phase medium, with gas of three different temperatures ($T \sim 10^6\,$K, $\sim 10^4\,$K, and $\sim 10$-$100\,$K) coexisting in pressure equilibrium. Gas at any other temperature but in pressure equilibrium is thermally unstable.
It is easy to see that the requirement for thermal instability translates into
\[
\left(\frac{\partial \mathcal{L}}{\partial T}\right)_{P} < 0
\]
which is known as the Field criterion for thermal instability (after astrophysicist George B. Field).
Thus, we see that for $\Gamma > 4/3$ the Jeans mass will increase with increasing density, while the opposite is true for $\Gamma < 4/3$. Now consider a system that is (initially) larger than the Jeans mass. Since pressure can no longer support it against its own gravity, the system will start to collapse, which increases the density. If $\Gamma < 4/3$, the Jeans mass will become smaller as a consequence of the collapse, and small subregions of the system will now find themselves having a mass larger than the Jeans mass $\Rightarrow$ the system will start to fragment.
If the collapse is adiabatic (i.e., we can ignore cooling), then $\Gamma = \gamma = 5/3 > 4/3$ and there will be no fragmentation. However, if cooling is very efficient, such that the cloud maintains the same temperature while it collapses, the EoS is now isothermal, which implies that $\Gamma = 1 < 4/3$: the cloud will fragment into smaller collapsing clouds. Fragmentation is believed to underlie the formation of star clusters.
A very similar process operates related to the thermal instability. In the discussion of the Field criterion we made the assumption that "the blob is assumed small enough that the sound crossing time, on which the blob re-establishes mechanical equilibrium, is short". Here 'short' means compared to the cooling time of the cloud. Let's define the cooling length $l_{\rm cool} \equiv c_s\,\tau_{\rm cool}$, where $c_s$ is the cloud's sound speed and $\tau_{\rm cool}$ is the cooling time (the time scale on which it radiates away most of its internal energy). The above assumption thus implies that the size of the cloud $l_{\rm cloud} \ll l_{\rm cool}$. As a consequence, whenever the cloud cools somewhat, it can immediately re-establish pressure equilibrium with its surroundings (i.e., the sound crossing time, $\tau_s = l_{\rm cloud}/c_s$, is much smaller than the cooling time $\tau_{\rm cool} = l_{\rm cool}/c_s$).
Now consider a case in which $l_{\rm cloud} \gg l_{\rm cool}$ (i.e., $\tau_{\rm cool} \ll \tau_s$). As the cloud cools, it cannot maintain pressure equilibrium with its surroundings; it takes too long for mechanical equilibrium to be established over the entire cloud. What happens is that smaller subregions, of order the size $l_{\rm cool}$, will fragment. The smaller fragments will be able to maintain pressure equilibrium with their surroundings. But as the small cloudlets cool further, the cooling length $l_{\rm cool}$ shrinks. To see this, realize that when $T$ drops this lowers the sound speed and decreases the cooling time; after all, we are in the regime of thermal instability, so $(\partial \mathcal{L}/\partial T)_P < 0$. As a consequence, $l_{\rm cool} = c_s\,\tau_{\rm cool}$ drops as well. So the small cloudlet soon finds itself larger than the cooling length, and it in turn will fragment. This process of shattering continues until the cooling time becomes sufficiently long and the cloudlets are no longer thermally unstable (see McCourt et al., 2018, MNRAS, 473, 5407 for details).
CHAPTER 15
In this chapter we consider collisionless fluids, such as galaxies and dark matter halos.
As discussed in previous chapters, their dynamics is governed by the Collisionless Boltzmann equation (CBE)
\[
\frac{\mathrm{d}f}{\mathrm{d}t} = \frac{\partial f}{\partial t} + v_i\frac{\partial f}{\partial x_i} - \frac{\partial \Phi}{\partial x_i}\frac{\partial f}{\partial v_i} = 0
\]
By taking the velocity moment of the CBE (see Chapter 7), we obtain the Jeans equations
\[
\frac{\partial u_j}{\partial t} + u_i\frac{\partial u_j}{\partial x_i} = -\frac{1}{\rho}\frac{\partial \hat{\sigma}_{ij}}{\partial x_i} - \frac{\partial \Phi}{\partial x_j}
\]
which are the equivalent of the Navier-Stokes equations (or Euler equations), but for a collisionless fluid. The quantity $\hat{\sigma}_{ij}$ in the above expression is the stress tensor, defined as
\[
\hat{\sigma}_{ij} = \rho\,\langle w_i w_j\rangle = \rho\langle v_i v_j\rangle - \rho\langle v_i\rangle\langle v_j\rangle
\]
In this chapter, we write a hat on top of the stress tensor, in order to distinguish it from the velocity dispersion tensor given by
\[
\sigma^2_{ij} = \langle v_i v_j\rangle - \langle v_i\rangle\langle v_j\rangle = \frac{\hat{\sigma}_{ij}}{\rho}
\]
This notation may cause some confusion, but it is adopted here in order to be consistent with the notation in standard textbooks on galactic dynamics. For the same reason, in what follows we will write $\langle v_i\rangle$ instead of $u_i$.
As we have discussed in detail in chapters 4 and 5, for a collisional fluid the stress tensor is given by
\[
\hat{\sigma}_{ij} = \rho\,\sigma^2_{ij} = P\,\delta_{ij} + \tau_{ij}
\]
and therefore completely specified by two scalar quantities: the pressure $P$ and the shear viscosity $\mu$ (as always, we ignore bulk viscosity). Both $P$ and $\mu$ are related to $\rho$ and $T$ via constitutive equations, which allow for closure in the equations.
In the case of a collisionless fluid, though, no constitutive relations exist, and the (symmetric) velocity dispersion tensor has 6 unknowns. As a consequence, the Jeans equations do not form a closed set. Adding higher-order moment equations of the CBE will yield more equations, but this also adds new, higher-order unknowns such as $\langle v_i v_j v_k\rangle$, etc. As a consequence, the set of CBE moment equations never closes!
Note that $\sigma^2_{ij}$ is a local quantity; $\sigma^2_{ij} = \sigma^2_{ij}(\vec{x})$. At each point $\vec{x}$ it defines the velocity ellipsoid: an ellipsoid whose principal axes are defined by the orthogonal eigenvectors of $\sigma^2_{ij}$, with lengths that are proportional to the square roots of the respective eigenvalues.
Since these eigenvalues are typically not the same, a collisionless fluid experiences anisotropic pressure-like forces. In order to be able to close the set of Jeans equations, it is common to make certain assumptions about the symmetry of the fluid. For example, a common assumption is that the fluid is isotropic, such that the (local) velocity dispersion tensor is specified by a single quantity: the local velocity dispersion $\sigma^2$. Note, though, that if a solution is found with this approach, the solution may not correspond to a physical distribution function (DF) (i.e., in order to be physical, $f \geq 0$ everywhere). Thus, although any real DF obeys the Jeans equations, not every solution to the Jeans equations corresponds to a physical DF!
As a worked-out example, we now derive the Jeans equations under cylindrical symmetry. We therefore write the Jeans equations in the cylindrical coordinate system $(R, \phi, z)$. The first step is to write the CBE in cylindrical coordinates:
\[
\frac{\mathrm{d}f}{\mathrm{d}t} = \frac{\partial f}{\partial t} + \dot{R}\frac{\partial f}{\partial R} + \dot{\phi}\frac{\partial f}{\partial \phi} + \dot{z}\frac{\partial f}{\partial z} + \dot{v}_R\frac{\partial f}{\partial v_R} + \dot{v}_{\phi}\frac{\partial f}{\partial v_{\phi}} + \dot{v}_z\frac{\partial f}{\partial v_z}
\]
Recall from vector calculus (see Appendices A and D) that
\[
\vec{v} = \dot{R}\,\vec{e}_R + R\dot{\phi}\,\vec{e}_{\phi} + \dot{z}\,\vec{e}_z = v_R\,\vec{e}_R + v_{\phi}\,\vec{e}_{\phi} + v_z\,\vec{e}_z
\]
from which we obtain the acceleration vector
\[
\vec{a} = \frac{\mathrm{d}\vec{v}}{\mathrm{d}t} = \ddot{R}\,\vec{e}_R + \dot{R}\,\dot{\vec{e}}_R + \dot{R}\dot{\phi}\,\vec{e}_{\phi} + R\ddot{\phi}\,\vec{e}_{\phi} + R\dot{\phi}\,\dot{\vec{e}}_{\phi} + \ddot{z}\,\vec{e}_z + \dot{z}\,\dot{\vec{e}}_z
\]
Using that $\dot{\vec{e}}_R = \dot{\phi}\,\vec{e}_{\phi}$, $\dot{\vec{e}}_{\phi} = -\dot{\phi}\,\vec{e}_R$, and $\dot{\vec{e}}_z = 0$ we have that
\[
\vec{a} = \left[\ddot{R} - R\dot{\phi}^2\right]\vec{e}_R + \left[2\dot{R}\dot{\phi} + R\ddot{\phi}\right]\vec{e}_{\phi} + \ddot{z}\,\vec{e}_z
\]
Next we use that
\[
v_R = \dot{R} \;\Rightarrow\; \dot{v}_R = \ddot{R}\,,\qquad
v_{\phi} = R\dot{\phi} \;\Rightarrow\; \dot{v}_{\phi} = \dot{R}\dot{\phi} + R\ddot{\phi}\,,\qquad
v_z = \dot{z} \;\Rightarrow\; \dot{v}_z = \ddot{z}
\]
so that, using $\vec{a} = -\nabla\Phi$,
\[
\dot{v}_R = -\frac{\partial \Phi}{\partial R} + \frac{v_{\phi}^2}{R}\,,\qquad
\dot{v}_{\phi} = -\frac{1}{R}\frac{\partial \Phi}{\partial \phi} - \frac{v_R v_{\phi}}{R}\,,\qquad
\dot{v}_z = -\frac{\partial \Phi}{\partial z}
\]
The Jeans equations follow from multiplying the CBE with $v_R$, $v_{\phi}$, and $v_z$, respectively, and integrating over velocity space. Note that the cylindrical symmetry requires that all derivatives with respect to $\phi$ vanish. The remaining terms are:
\[
\int v_R \frac{\partial f}{\partial t}\,\mathrm{d}^3\vec{v} = \frac{\partial}{\partial t}\int v_R\, f\,\mathrm{d}^3\vec{v} = \frac{\partial(\rho\langle v_R\rangle)}{\partial t}
\]
\[
\int v_R^2 \frac{\partial f}{\partial R}\,\mathrm{d}^3\vec{v} = \frac{\partial}{\partial R}\int v_R^2\, f\,\mathrm{d}^3\vec{v} = \frac{\partial(\rho\langle v_R^2\rangle)}{\partial R}
\]
\[
\int v_R v_z \frac{\partial f}{\partial z}\,\mathrm{d}^3\vec{v} = \frac{\partial}{\partial z}\int v_R v_z\, f\,\mathrm{d}^3\vec{v} = \frac{\partial(\rho\langle v_R v_z\rangle)}{\partial z}
\]
\[
\int \frac{v_R v_{\phi}^2}{R} \frac{\partial f}{\partial v_R}\,\mathrm{d}^3\vec{v} = \frac{1}{R}\int \frac{\partial(v_R v_{\phi}^2 f)}{\partial v_R}\,\mathrm{d}^3\vec{v} - \frac{1}{R}\int f\,\frac{\partial(v_R v_{\phi}^2)}{\partial v_R}\,\mathrm{d}^3\vec{v} = -\rho\,\frac{\langle v_{\phi}^2\rangle}{R}
\]
\[
\int v_R \frac{\partial \Phi}{\partial R}\frac{\partial f}{\partial v_R}\,\mathrm{d}^3\vec{v} = \frac{\partial \Phi}{\partial R}\int \frac{\partial(v_R f)}{\partial v_R}\,\mathrm{d}^3\vec{v} - \frac{\partial \Phi}{\partial R}\int f\,\frac{\partial v_R}{\partial v_R}\,\mathrm{d}^3\vec{v} = -\rho\,\frac{\partial \Phi}{\partial R}
\]
\[
\int \frac{v_R^2 v_{\phi}}{R}\frac{\partial f}{\partial v_{\phi}}\,\mathrm{d}^3\vec{v} = \frac{1}{R}\int \frac{\partial(v_R^2 v_{\phi} f)}{\partial v_{\phi}}\,\mathrm{d}^3\vec{v} - \frac{1}{R}\int f\,\frac{\partial(v_R^2 v_{\phi})}{\partial v_{\phi}}\,\mathrm{d}^3\vec{v} = -\rho\,\frac{\langle v_R^2\rangle}{R}
\]
\[
\int v_R \frac{\partial \Phi}{\partial z}\frac{\partial f}{\partial v_z}\,\mathrm{d}^3\vec{v} = \frac{\partial \Phi}{\partial z}\int \frac{\partial(v_R f)}{\partial v_z}\,\mathrm{d}^3\vec{v} - \frac{\partial \Phi}{\partial z}\int f\,\frac{\partial v_R}{\partial v_z}\,\mathrm{d}^3\vec{v} = 0
\]
Working out the similar terms for the other Jeans equations we finally obtain the Jeans Equations in Cylindrical Coordinates:
\[
\frac{\partial(\rho\langle v_R\rangle)}{\partial t} + \frac{\partial(\rho\langle v_R^2\rangle)}{\partial R} + \frac{\partial(\rho\langle v_R v_z\rangle)}{\partial z} + \rho\left[\frac{\langle v_R^2\rangle - \langle v_{\phi}^2\rangle}{R} + \frac{\partial \Phi}{\partial R}\right] = 0
\]
\[
\frac{\partial(\rho\langle v_{\phi}\rangle)}{\partial t} + \frac{\partial(\rho\langle v_R v_{\phi}\rangle)}{\partial R} + \frac{\partial(\rho\langle v_{\phi} v_z\rangle)}{\partial z} + \frac{2\rho\,\langle v_R v_{\phi}\rangle}{R} = 0
\]
\[
\frac{\partial(\rho\langle v_z\rangle)}{\partial t} + \frac{\partial(\rho\langle v_R v_z\rangle)}{\partial R} + \frac{\partial(\rho\langle v_z^2\rangle)}{\partial z} + \rho\left[\frac{\langle v_R v_z\rangle}{R} + \frac{\partial \Phi}{\partial z}\right] = 0
\]
These are 3 equations with 9 unknowns, which can only be solved if we make additional assumptions. In particular, one often makes the following assumptions:
1. System is static $\Rightarrow$ the $\partial/\partial t$-terms are zero and $\langle v_R\rangle = \langle v_z\rangle = 0$.
2. Velocity dispersion tensor is diagonal $\Rightarrow$ $\langle v_i v_j\rangle = 0$ (if $i \neq j$).
3. Meridional isotropy $\Rightarrow$ $\langle v_R^2\rangle = \langle v_z^2\rangle = \sigma_R^2 = \sigma_z^2 \equiv \sigma^2$.
Under these assumptions we have 3 unknowns left: $\langle v_{\phi}\rangle$, $\langle v_{\phi}^2\rangle$, and $\sigma^2$, and the Jeans equations reduce to
\[
\frac{\partial(\rho\sigma^2)}{\partial R} + \rho\left[\frac{\sigma^2 - \langle v_{\phi}^2\rangle}{R} + \frac{\partial \Phi}{\partial R}\right] = 0
\]
\[
\frac{\partial(\rho\sigma^2)}{\partial z} + \rho\,\frac{\partial \Phi}{\partial z} = 0
\]
Since we now only have two equations left, the system is still not closed. If from the surface brightness we can estimate the mass density, $\rho(R, z)$, and hence (using the Poisson equation) the potential $\Phi(R, z)$, we can solve the second of these Jeans equations for the meridional velocity dispersion:
\[
\sigma^2(R, z) = \frac{1}{\rho}\int\limits_z^{\infty} \rho\,\frac{\partial \Phi}{\partial z}\,\mathrm{d}z
\]
and the first Jeans equation then gives the mean square azimuthal velocity $\langle v_{\phi}^2\rangle = \langle v_{\phi}\rangle^2 + \sigma_{\phi}^2$:
\[
\langle v_{\phi}^2\rangle(R, z) = \sigma^2(R, z) + R\,\frac{\partial \Phi}{\partial R} + \frac{R}{\rho}\,\frac{\partial(\rho\sigma^2)}{\partial R}
\]
Thus, although $\langle v_{\phi}^2\rangle$ is uniquely specified by the Jeans equations, we don't know how it splits into the actual azimuthal streaming, $\langle v_{\phi}\rangle$, and the azimuthal dispersion, $\sigma_{\phi}^2$. Additional assumptions are needed for this.
————————————————-
A similar analysis, but for a spherically symmetric system, using the spherical coordinate system $(r, \theta, \phi)$, gives the following Jeans equations in Spherical Symmetry:
\[
\frac{\partial(\rho\langle v_r\rangle)}{\partial t} + \frac{\partial(\rho\langle v_r^2\rangle)}{\partial r} + \frac{\rho}{r}\left[2\langle v_r^2\rangle - \langle v_{\theta}^2\rangle - \langle v_{\phi}^2\rangle\right] + \rho\,\frac{\partial \Phi}{\partial r} = 0
\]
\[
\frac{\partial(\rho\langle v_{\theta}\rangle)}{\partial t} + \frac{\partial(\rho\langle v_r v_{\theta}\rangle)}{\partial r} + \frac{\rho}{r}\left[3\langle v_r v_{\theta}\rangle + \left(\langle v_{\theta}^2\rangle - \langle v_{\phi}^2\rangle\right)\cot\theta\right] = 0
\]
\[
\frac{\partial(\rho\langle v_{\phi}\rangle)}{\partial t} + \frac{\partial(\rho\langle v_r v_{\phi}\rangle)}{\partial r} + \frac{\rho}{r}\left[3\langle v_r v_{\phi}\rangle + 2\langle v_{\theta} v_{\phi}\rangle\cot\theta\right] = 0
\]
If we now make the additional assumptions that the system is static and that also the kinematic properties of the system are spherically symmetric, then there can be no streaming motions and all mixed second-order moments vanish. Consequently, the velocity dispersion tensor is diagonal with $\sigma_{\theta}^2 = \sigma_{\phi}^2$. Under these assumptions only one of the three Jeans equations remains:
\[
\frac{\partial(\rho\sigma_r^2)}{\partial r} + \frac{2\rho}{r}\left[\sigma_r^2 - \sigma_{\theta}^2\right] + \rho\,\frac{\partial \Phi}{\partial r} = 0
\]
Notice that this single equation still contains two unknowns, $\sigma_r^2(r)$ and $\sigma_{\theta}^2(r)$ (if we assume that the density and potential are known), and can thus not be solved.
It is useful to define the anisotropy parameter
\[
\beta(r) \equiv 1 - \frac{\sigma_{\theta}^2(r) + \sigma_{\phi}^2(r)}{2\,\sigma_r^2(r)} = 1 - \frac{\sigma_{\theta}^2(r)}{\sigma_r^2(r)}
\]
where the second equality only holds under the assumption that the kinematics are spherically symmetric.
In terms of $\beta$, the above Jeans equation can be written as
\[
\frac{1}{\rho}\frac{\partial(\rho\langle v_r^2\rangle)}{\partial r} + 2\,\beta\,\frac{\langle v_r^2\rangle}{r} = -\frac{\mathrm{d}\Phi}{\mathrm{d}r}
\]
If we now use that $\mathrm{d}\Phi/\mathrm{d}r = G M(r)/r^2$, we can write the following expression for the enclosed (dynamical) mass:
\[
M(r) = -\frac{r\,\langle v_r^2\rangle}{G}\left[\frac{\mathrm{d}\ln\rho}{\mathrm{d}\ln r} + \frac{\mathrm{d}\ln\langle v_r^2\rangle}{\mathrm{d}\ln r} + 2\beta\right]
\]
Hence, if we know $\rho(r)$, $\langle v_r^2\rangle(r)$, and $\beta(r)$, we can use the spherical Jeans equation to infer the mass profile $M(r)$.
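A minimal sketch (not part of the notes) of this mass estimator, using simple finite differences for the logarithmic gradients; the function name, units and the isothermal toy profile are my own choices.

```python
import numpy as np

G = 4.301e-6   # kpc (km/s)^2 / Msun

def jeans_mass(r, rho, vr2, beta):
    """Enclosed mass M(r) from the spherical Jeans equation
    (r in kpc, vr2 in km^2/s^2; logarithmic gradients via np.gradient)."""
    dlnrho = np.gradient(np.log(rho), np.log(r))
    dlnvr2 = np.gradient(np.log(vr2), np.log(r))
    return -r * vr2 / G * (dlnrho + dlnvr2 + 2.0 * beta)

# Example: an isotropic, isothermal tracer (vr2 = const, beta = 0) with rho ~ r^-2
r   = np.logspace(-1, 2, 50)
rho = r**-2.0
vr2 = np.full_like(r, 200.0**2)
print(jeans_mass(r, rho, vr2, 0.0)[:3])   # M(r) = 2 vr2 r / G for this toy model
```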
In practice, the mass density is written as $\rho(r) = \Upsilon(r)\,\nu(r)$, with $\nu(r)$ the luminosity density and $\Upsilon(r)$ the mass-to-light ratio. Similarly, the line-of-sight velocity dispersion, $\sigma_p^2(R)$, which can be inferred from spectroscopy, is related to both $\langle v_r^2\rangle(r)$ and $\beta(r)$ according to (see Figure 22)
\[
\Sigma(R)\,\sigma_p^2(R) = 2\int\limits_R^{\infty} \left\langle(v_r\cos\alpha - v_{\theta}\sin\alpha)^2\right\rangle \frac{\nu\, r\,\mathrm{d}r}{\sqrt{r^2 - R^2}}
\]
\[
= 2\int\limits_R^{\infty} \left[\langle v_r^2\rangle\cos^2\alpha + \langle v_{\theta}^2\rangle\sin^2\alpha\right] \frac{\nu\, r\,\mathrm{d}r}{\sqrt{r^2 - R^2}}
\]
\[
= 2\int\limits_R^{\infty} \left(1 - \beta\,\frac{R^2}{r^2}\right) \frac{\nu\,\langle v_r^2\rangle\, r\,\mathrm{d}r}{\sqrt{r^2 - R^2}}
\]
The 3D luminosity density is trivially obtained from the observed $\Sigma(R)$ using the Abel transform
\[
\nu(r) = -\frac{1}{\pi}\int\limits_r^{\infty} \frac{\mathrm{d}\Sigma}{\mathrm{d}R}\,\frac{\mathrm{d}R}{\sqrt{R^2 - r^2}}
\]
In general, we have three unknowns: $M(r)$ [or, equivalently, $\rho(r)$ or $\Upsilon(r)$], $\langle v_r^2\rangle(r)$ and $\beta(r)$. With our two observables $\Sigma(R)$ and $\sigma_p^2(R)$, these can only be determined if we make additional assumptions.
Figure 22: Geometry related to projection
EXAMPLE 1: Assume isotropy: $\beta(r) = 0$. In this case we can use the Abel transform to obtain
\[
\nu(r)\,\langle v_r^2\rangle(r) = -\frac{1}{\pi}\int\limits_r^{\infty} \frac{\mathrm{d}(\Sigma\,\sigma_p^2)}{\mathrm{d}R}\,\frac{\mathrm{d}R}{\sqrt{R^2 - r^2}}
\]
We can now use the spherical Jeans equation to write $\beta(r)$ in terms of $M(r)$, $\nu(r)$ and $\langle v_r^2\rangle(r)$. Substituting this in the equation for $\Sigma(R)\,\sigma_p^2(R)$ yields a solution for $\langle v_r^2\rangle(r)$, and thus for $\beta(r)$. As long as $\beta(r) \leq 1$ the model is said to be self-consistent within the context of the Jeans equations.
Almost always, radically different models (based on radically different assumptions) can be constructed that are all consistent with the data and the Jeans equations. This is often referred to as the mass-anisotropy degeneracy. Note, however, that none of these models need be physical: they can still have $f < 0$.
————————————————-
with $l$ and $m$ integers), then the orbit is a resonant orbit, and has a dimensionality that is one lower than that of the non-resonant regular orbits (i.e., $l\omega_i + m\omega_j$ is an extra isolating integral of motion). Orbits with fewer than $n$ isolating integrals of motion are called irregular or stochastic.
Every spherical potential admits at least four isolating integrals of motion, namely energy, $E$, and the three components of the angular momentum vector $\vec{L}$. Orbits in a flattened, axisymmetric potential frequently (but not always) admit three isolating integrals of motion: $E$, $L_z$ (where the $z$-axis is the system's symmetry axis), and a non-classical third integral $I_3$ (the integral is called non-classical since there is no analytical expression of $I_3$ as function of the phase-space variables).
\[
\frac{\mathrm{d}I}{\mathrm{d}t} = \frac{\partial I}{\partial x_i}\frac{\mathrm{d}x_i}{\mathrm{d}t} + \frac{\partial I}{\partial v_i}\frac{\mathrm{d}v_i}{\mathrm{d}t} = \vec{v}\cdot\nabla I - \nabla\Phi\cdot\frac{\partial I}{\partial \vec{v}} = 0
\]
Compare this to the CBE for a steady-state (static) system:
\[
\vec{v}\cdot\nabla f - \nabla\Phi\cdot\frac{\partial f}{\partial \vec{v}} = 0
\]
Thus the condition for I to be an integral of motion is identical with the condition
for I to be a steady-state solution of the CBE. This implies the following:
Jeans Theorem: Any steady-state solution of the CBE depends on the phase-
space coordinates only through integrals of motion. Any function of these integrals is
a steady-state solution of the CBE.
Hence, the DF of any steady-state spherical system can be expressed as $f = f(E, \vec{L})$. If the system is spherically symmetric in all its properties, then $f = f(E, L^2)$, i.e., the DF can only depend on the magnitude of the angular momentum vector, not on its direction.
An even simpler case to consider is the one in which $f = f(E)$. Since $E = \Phi(\vec{r}) + \frac{1}{2}\left[v_r^2 + v_{\theta}^2 + v_{\phi}^2\right]$ we have that
\[
\langle v_r^2\rangle = \frac{1}{\rho}\int \mathrm{d}v_r\,\mathrm{d}v_{\theta}\,\mathrm{d}v_{\phi}\; v_r^2\, f\!\left(\Phi + \tfrac{1}{2}\left[v_r^2 + v_{\theta}^2 + v_{\phi}^2\right]\right)
\]
\[
\langle v_{\theta}^2\rangle = \frac{1}{\rho}\int \mathrm{d}v_r\,\mathrm{d}v_{\theta}\,\mathrm{d}v_{\phi}\; v_{\theta}^2\, f\!\left(\Phi + \tfrac{1}{2}\left[v_r^2 + v_{\theta}^2 + v_{\phi}^2\right]\right)
\]
\[
\langle v_{\phi}^2\rangle = \frac{1}{\rho}\int \mathrm{d}v_r\,\mathrm{d}v_{\theta}\,\mathrm{d}v_{\phi}\; v_{\phi}^2\, f\!\left(\Phi + \tfrac{1}{2}\left[v_r^2 + v_{\theta}^2 + v_{\phi}^2\right]\right)
\]
Since these equations differ only in the labelling of one of the variables of integration, it is immediately evident that $\langle v_r^2\rangle = \langle v_{\theta}^2\rangle = \langle v_{\phi}^2\rangle$. Hence, assuming that $f = f(E)$ is identical to assuming that the system is isotropic (and thus $\beta(r) = 0$). And since
\[
\langle v_i\rangle = \frac{1}{\rho}\int \mathrm{d}v_r\,\mathrm{d}v_{\theta}\,\mathrm{d}v_{\phi}\; v_i\, f\!\left(\Phi + \tfrac{1}{2}\left[v_r^2 + v_{\theta}^2 + v_{\phi}^2\right]\right)
\]
it is also immediately evident that $\langle v_r\rangle = \langle v_{\theta}\rangle = \langle v_{\phi}\rangle = 0$. Thus, a system with $f = f(E)$ has no net sense of rotation.
The more general $f(E, L^2)$ models typically are anisotropic. Models with $0 < \beta \leq 1$ are radially anisotropic. In the extreme case of $\beta = 1$ all orbits are purely radial and $f = g(E)\,\delta(L)$, with $g(E)$ some function of energy. Tangentially anisotropic models have $\beta < 0$, with $\beta = -\infty$ corresponding to a model in which all orbits are circular. In that case $f = g(E)\,\delta[L - L_{\rm max}(E)]$, where $L_{\rm max}(E)$ is the maximum angular momentum for energy $E$. Another special case is the one in which $\beta(r) = \beta$ is constant; such models have $f = g(E)\,L^{-2\beta}$.
For a static, axisymmetric system the Jeans equations reduce to
\[
\frac{\partial(\rho\langle v_R^2\rangle)}{\partial R} + \frac{\partial(\rho\langle v_R v_z\rangle)}{\partial z} + \rho\left[\frac{\langle v_R^2\rangle - \langle v_{\phi}^2\rangle}{R} + \frac{\partial \Phi}{\partial R}\right] = 0
\]
\[
\frac{\partial(\rho\langle v_R v_z\rangle)}{\partial R} + \frac{\partial(\rho\langle v_z^2\rangle)}{\partial z} + \rho\left[\frac{\langle v_R v_z\rangle}{R} + \frac{\partial \Phi}{\partial z}\right] = 0
\]
which clearly doesn't suffice to solve for the four unknowns (modelling three-integral axisymmetric systems is best done using the Schwarzschild orbit superposition technique). To make progress with Jeans modeling, one has to make additional assumptions. A typical assumption is that the DF has the two-integral form $f = f(E, L_z)$. In that case, $\langle v_R v_z\rangle = 0$ [the velocity ellipsoid is now aligned with $(R, \phi, z)$] and $\langle v_R^2\rangle = \langle v_z^2\rangle$ (see Binney & Tremaine 2008), so that the Jeans equations reduce to
\[
\frac{\partial(\rho\langle v_R^2\rangle)}{\partial R} + \rho\left[\frac{\langle v_R^2\rangle - \langle v_{\phi}^2\rangle}{R} + \frac{\partial \Phi}{\partial R}\right] = 0
\]
\[
\frac{\partial(\rho\langle v_z^2\rangle)}{\partial z} + \rho\,\frac{\partial \Phi}{\partial z} = 0
\]
which is a closed set for the two unknowns $\langle v_R^2\rangle$ ($= \langle v_z^2\rangle$) and $\langle v_{\phi}^2\rangle$. Note, however, that the Jeans equations provide no information about how $\langle v_{\phi}^2\rangle$ splits into streaming and random motions. In practice one often writes that
\[
\langle v_{\phi}\rangle = k \left[\langle v_{\phi}^2\rangle - \langle v_R^2\rangle\right]^{1/2}
\]
————————————————-
The final topics to be briefly discussed in this chapter are the Virial Theorem and
the negative heat capacity of gravitational systems.
Total Kinetic Energy:
\[
K = \sum_{i=1}^{N} \frac{1}{2} m_i\, v_i^2
\]
Total Potential Energy:
\[
W = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j\neq i} \frac{G\, m_i\, m_j}{|\vec{r}_i - \vec{r}_j|}
\]
The latter follows from the fact that gravitational binding energy between a pair
of masses is proportional to the product of their masses, and inversely proportional
to their separation. The factor 1/2 corrects for double counting the number of pairs.
where $r_{ij} = |\vec{r}_i - \vec{r}_j|$. In the continuum limit this simply becomes
\[
W = -\frac{1}{2}\int \rho(\vec{x})\,\Phi(\vec{x})\,\mathrm{d}^3\vec{x}
\]
One can show (see e.g., Binney & Tremaine 2008) that this is equal to the trace of the Chandrasekhar Potential Energy Tensor
\[
W_{ij} \equiv -\int \rho(\vec{x})\, x_i\, \frac{\partial \Phi}{\partial x_j}\,\mathrm{d}^3\vec{x}
\]
In particular,
\[
W = {\rm Tr}(W_{ij}) = \sum_{i=1}^{3} W_{ii} = -\int \rho(\vec{x})\, \vec{x}\cdot\nabla\Phi\,\mathrm{d}^3\vec{x}
\]
which is another, equally valid, expression for the gravitational potential energy in
the continuum limit.
For a system in virial equilibrium we have that
\[
2K + W = 0
\]
Combining the virial equation with the expression for the total energy, $E = K + W$, we see that for a system that obeys the virial theorem
\[
E = -K = W/2
\]
If we assume, for simplicity, that all galaxies have equal mass then we can rewrite this as
\[
N m\,\frac{1}{N}\sum_{i=1}^{N} v_i^2 - \frac{G\,(Nm)^2}{2}\,\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j\neq i}\frac{1}{r_{ij}} = 0
\]
Solving for the total mass $M = N m$ yields
\[
M = \frac{2\,\langle v^2\rangle}{G\,\langle 1/r\rangle}
\]
with
\[
\langle 1/r\rangle = \frac{1}{N(N-1)}\sum_{i=1}^{N}\sum_{j\neq i}\frac{1}{r_{ij}}
\]
Next, define the gravitational radius $r_g$ via
\[
W = -\frac{G\,M^2}{r_g}
\]
Using the relations above, it is clear that $r_g = 2/\langle 1/r\rangle$. We can now rewrite the above equation for $M$ in the form
\[
M = \frac{r_g\,\langle v^2\rangle}{G}
\]
Hence, one can infer the mass of our cluster of galaxies from its velocity dispersion and its gravitational radius. In general, though, neither of these is observable, and one uses instead
\[
M = \alpha\,\frac{R_{\rm eff}\,\langle v_{\rm los}^2\rangle}{G}
\]
where $v_{\rm los}$ is the line-of-sight velocity, $R_{\rm eff}$ is some measure of the 'effective' radius of the system in question, and $\alpha$ is a parameter of order unity that depends on the radial distribution of the galaxies. Note that, under the assumption of isotropy, $\langle v_{\rm los}^2\rangle = \langle v^2\rangle/3$, and one can also infer the mean reciprocal pair separation from the projected pair separations; in other words, under the assumption of isotropy one can infer $\alpha$, and thus use the above equation to compute the total, gravitational mass of the cluster. This method was applied by Fritz Zwicky in 1933, who inferred that the total dynamical mass in the Coma cluster is much larger than the sum of the masses of its galaxies. This was the first observational evidence for dark matter, although it took the astronomical community until the late 1970s to generally accept this notion.
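A back-of-the-envelope version of this estimator (mine, not from the notes; the value $\alpha = 5$ and the Coma-like numbers are purely illustrative):

```python
def virial_mass(R_eff_kpc, sigma_los_kms, alpha=5.0):
    """Virial mass estimate M = alpha * R_eff * <v_los^2> / G in solar masses."""
    G = 4.301e-6   # kpc (km/s)^2 / Msun
    return alpha * R_eff_kpc * sigma_los_kms**2 / G

# A Coma-like cluster: R_eff ~ 1 Mpc, sigma_los ~ 1000 km/s  ->  ~10^15 Msun
print(f"{virial_mass(1000.0, 1000.0):.2e}")
```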
For a self-gravitating fluid
\[
K = \sum_{i=1}^{N}\frac{1}{2}m_i v_i^2 = \frac{1}{2}N\,m\,\langle v^2\rangle = \frac{3}{2}N\,k_{\rm B}T
\]
where the last step follows from the kinetic theory of ideal gases of monoatomic particles. In fact, we can use the above equation for any fluid (including a collisionless one), if we interpret $T$ as an effective temperature that measures the rms velocity of the constituent particles. If the system is in virial equilibrium, then
\[
E = -K = -\frac{3}{2}N\,k_{\rm B}T
\]
which, as we show next, has some important implications...
Heat Capacity: the amount of heat required to increase the temperature by one degree Kelvin (or Celsius). For a self-gravitating fluid this is
\[
C \equiv \frac{\mathrm{d}E}{\mathrm{d}T} = -\frac{3}{2}N\,k_{\rm B}
\]
which is negative! This implies that by losing energy, a gravitational system gets hotter!! This is a very counter-intuitive result that often leads to confusion and wrong expectations. Below we give three examples of implications of the negative heat capacity of gravitating systems.
potential energy that becomes more negative). In order for the star to remain in
virial equilibrium its kinetic energy, which is proportional to temperature, has to
increase; the star’s energy loss results in an increase of its temperature.
In the Sun, hydrogen burning produces energy that replenishes the energy loss from
the surface. As a consequence, the system is in equilibrium, and will not contract.
However, once the Sun has used up all its hydrogen, it will start to contract and heat
up, because of the negative heat capacity. This continues until the temperature in
the core becomes sufficiently high that helium can start to fuse into heavier elements,
and the Sun settles in a new equilibrium.
Example 3: Core Collapse. A system with negative heat capacity in contact with a heat bath is thermodynamically unstable. Consider a self-gravitating fluid of 'temperature' $T_1$, which is in contact with a heat bath of temperature $T_2$. Suppose the system is in thermal equilibrium, so that $T_1 = T_2$. If, due to some small disturbance, a small amount of heat is transferred from the system to the heat bath, the negative heat capacity implies that this results in $T_1 > T_2$. Since heat always flows from hot to cold, more heat will now flow from the system to the heat bath, further increasing the temperature difference, and $T_1$ will continue to rise without limit. This run-away instability is called the gravothermal catastrophe. An example of this instability is the core collapse of globular clusters. Suppose the formation of a gravitational system results in the system having a declining velocity dispersion profile, $\sigma^2(r)$ (i.e., $\sigma$ decreases with increasing radius). This implies that the central region is (dynamically) hotter than the outskirts. If heat can flow from the center to those outskirts, the gravothermal catastrophe kicks in, and $\sigma$ in the central regions will grow without limit. Since $\sigma^2 = G M(r)/r$, the central mass therefore gets compressed into a smaller and smaller region, while the outer regions expand. This is called core collapse. Note that this does NOT lead to the formation of a supermassive black hole, because regions at smaller $r$ always shrink faster than regions at somewhat larger $r$. In dark matter halos, and elliptical galaxies, the velocity dispersion profile is often declining with radius. However, in those systems the two-body relaxation time is so long that there is basically no heat flow (which requires two-body interactions). However, globular clusters, which consist of $N \sim 10^4$ stars, and have a crossing time of only $t_{\rm cross} \sim 5\times10^6\,$yr, have a two-body relaxation time of only $\sim 5\times10^8\,$yr. Hence, heat flow in globular clusters is not negligible, and they can (and do) undergo core collapse. The collapse does not proceed indefinitely, because of binaries (see Galactic Dynamics by Binney & Tremaine for more details).
CHAPTER 16
Consider an encounter between two collisionless N-body systems (i.e., dark matter halos or galaxies): a perturber P and a system S. Let q denote a particle of S, and let $b$ be the impact parameter, $v_{\infty}$ the initial speed of the encounter, and $R_0$ the distance of closest approach (see Fig. 23).
There are only two cases in which we can calculate the outcome of the encounter
analytically:
• high speed encounter ($v_{\infty} \gg v_{\rm crit}$). In this case the encounter is said to be impulsive and one can use the impulsive approximation to compute its outcome.
• large mass ratio ($M_P \ll M_S$). In this case one can use the treatment of dynamical friction to describe how P loses orbital energy and angular momentum to S.
In all other cases, one basically has to resort to numerical simulations to study the
outcome of the encounter. In what follows we present treatments of first the impulse
approximation and then dynamical friction.
Figure 23: Schematic illustration of an encounter with impact parameter b between
a perturber P and a subject S.
In the large $v_{\infty}$ limit, we have that the distance of closest approach $R_0 \rightarrow b$, and the velocity of P wrt S is $\vec{v}_P(t) \simeq v_{\infty}\,\vec{e}_y \equiv v_P\,\vec{e}_y$. Consequently, we have that
\[
\vec{R}(t) = (b,\, v_P\, t,\, 0)
\]
Let $\vec{r}$ be the position vector of q wrt S and adopt the distant encounter approximation, which means that $b \gg \max[R_S, R_P]$, where $R_S$ and $R_P$ are the sizes of S and P, respectively. This means that we may treat P as a point mass $M_P$, so that
\[
\Phi_P(\vec{r}) = -\frac{G\,M_P}{|\vec{r} - \vec{R}|}
\]
Using geometry, and defining $\gamma$ as the angle between $\vec{r}$ and $\vec{R}$, we have that
\[
|\vec{r} - \vec{R}|^2 = (R - r\cos\gamma)^2 + (r\sin\gamma)^2
\]
so that
\[
|\vec{r} - \vec{R}| = \sqrt{R^2 - 2\,r R\cos\gamma + r^2}
\]
Next we use the series expansion
\[
\frac{1}{\sqrt{1+x}} = 1 - \frac{1}{2}x + \frac{1\cdot 3}{2\cdot 4}x^2 - \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}x^3 + ...
\]
to write
\[
\frac{1}{|\vec{r} - \vec{R}|} = \frac{1}{R}\left[1 - \frac{1}{2}\left(-2\frac{r}{R}\cos\gamma + \frac{r^2}{R^2}\right) + \frac{3}{8}\left(-2\frac{r}{R}\cos\gamma + \frac{r^2}{R^2}\right)^2 + ...\right]
\]
Substitution in the expression for the potential of P yields
\[
\Phi_P(\vec{r}) = -\frac{G M_P}{R} - \frac{G M_P}{R^2}\,r\cos\gamma - \frac{G M_P}{R^3}\,r^2\left(\frac{3}{2}\cos^2\gamma - \frac{1}{2}\right) + \mathcal{O}[(r/R)^3]
\]
• The first term on the rhs is a constant, not yielding any force (i.e., $\nabla_r \Phi_P = 0$).
• The second term on the rhs describes how the center of mass of S changes its
velocity due to the encounter with P .
• The third term on the rhs corresponds to the tidal force per unit mass and is
the term of interest to us.
Written in (primed) coordinates whose $x'$-axis is aligned with $\vec{R}$, this tidal term reads
\[
\Phi_3(\vec{r}) = -\frac{G M_P}{R^3}\left(\frac{3}{2}r'^2\cos^2\gamma - \frac{1}{2}r'^2\right) = -\frac{G M_P}{R^3}\left(x'^2 - \frac{1}{2}y'^2 - \frac{1}{2}z'^2\right)
\]
The corresponding tidal force per unit mass, expressed in the fixed $(x, y, z)$ frame of the encounter, has components
\[
F_x = \frac{G M_P}{R^3}\left[x\,(2 - 3\sin^2\theta) + 3y\,\sin\theta\cos\theta\right]
\]
\[
F_y = \frac{G M_P}{R^3}\left[y\,(2 - 3\cos^2\theta) + 3x\,\sin\theta\cos\theta\right]
\]
\[
F_z = -\frac{G M_P}{R^3}\,z
\]
Using these, we have that
\[
\Delta v_x = \int \frac{\mathrm{d}v_x}{\mathrm{d}t}\,\mathrm{d}t = \int F_x\,\mathrm{d}t = \int\limits_{-\pi/2}^{\pi/2} F_x\,\frac{\mathrm{d}t}{\mathrm{d}\theta}\,\mathrm{d}\theta
\]
with similar expressions for $\Delta v_y$ and $\Delta v_z$. Using that $\theta = \tan^{-1}(v_P t/b)$ one has that $\mathrm{d}t/\mathrm{d}\theta = b/(v_P\cos^2\theta)$. Substituting the above expressions for the tidal force, and using that $R = b/\cos\theta$, one finds, after some algebra, that
\[
\Delta\vec{v} = (\Delta v_x, \Delta v_y, \Delta v_z) = \frac{2\,G M_P}{v_P\, b^2}\,(x,\, 0,\, -z)
\]
Substitution in the expression for $\Delta E_S$ yields
\[
\Delta E_S = \frac{1}{2}\int |\Delta\vec{v}|^2\,\rho(r)\,\mathrm{d}^3\vec{r} = 2\,\frac{G^2 M_P^2}{v_P^2\, b^4}\,M_S\,\langle x^2 + z^2\rangle
\]
Under the assumption that S is spherically symmetric we have that $\langle x^2 + z^2\rangle = \frac{2}{3}\langle x^2 + y^2 + z^2\rangle = \frac{2}{3}\langle r^2\rangle$, and we obtain the final expression for the energy increase of S as a consequence of the impulsive encounter with P:
\[
\Delta E_S = \frac{4}{3}\,G^2\,M_S\left(\frac{M_P}{v_P}\right)^2 \frac{\langle r^2\rangle}{b^4}
\]
This derivation, which is originally due to Spitzer (1958), is surprisingly accurate for encounters with $b > 5\max[R_P, R_S]$, even for relatively slow encounters with $v_{\infty} \sim \sigma_S$. For smaller impact parameters one has to make a correction (see Galaxy Formation and Evolution by Mo, van den Bosch & White 2010 for details).
The impulse approximation shows that high-speed encounters can pump energy into the systems involved. This energy is tapped from the orbital energy of the two systems wrt each other. Note that $\Delta E_S \propto b^{-4}$, so that close encounters are far more important than distant encounters.
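A minimal sketch (not from the notes) of this energy injection; the unit system (kpc, km/s, solar masses) and the input numbers are assumptions of mine.

```python
G = 4.301e-6   # kpc (km/s)^2 / Msun

def delta_E_impulsive(M_P, M_S, v_P, b, r2_mean):
    """Energy injected into S by a distant impulsive encounter:
    dE = (4/3) G^2 M_S (M_P/v_P)^2 <r^2> / b^4   [Msun (km/s)^2]."""
    return 4.0/3.0 * G**2 * M_S * (M_P / v_P)**2 * r2_mean / b**4

# Illustrative halo-like numbers
print(delta_E_impulsive(M_P=1e12, M_S=1e10, v_P=500.0, b=100.0, r2_mean=10.0**2))
```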
After the encounter, S has gained kinetic energy (in the amount of ES ), but
its potential energy has remained unchanged (recall, this is the assumption that
underlies the impulse approximation). As a consequence, after the encounter S will
no longer be in virial equilibrium; S will have to readjust itself to re-establish
virial equilibrium.
Let $K_0$ and $E_0$ be the initial (pre-encounter) kinetic and total energy of S. The virial theorem ensures that $E_0 = -K_0$. The encounter causes an increase of (kinetic) energy, so that $K_0 \rightarrow K_0 + \Delta E_S$ and $E_0 \rightarrow E_0 + \Delta E_S$. After S has re-established virial equilibrium, we have that $K_1 = -E_1 = -(E_0 + \Delta E_S) = K_0 - \Delta E_S$. Thus, we see that virialization after the encounter changes the kinetic energy of S from $K_0 + \Delta E_S$ to $K_0 - \Delta E_S$! The gravitational energy after the encounter is $W_1 = 2E_1 = 2E_0 + 2\Delta E_S = W_0 + 2\Delta E_S$, which is less negative than before the encounter. Using the definition of the gravitational radius (see Chapter 12), $r_g = G M_S^2/|W|$, it is clear that the (gravitational) radius of S increases due to the impulsive encounter. Note that here we have ignored the complication coming from the fact that the injection of energy $\Delta E_S$ may result in unbinding some of the mass of S.
Although these views are similar, there are some subtle differences. For example, according to the first two descriptions dynamical friction is a local effect. The third description, on the other hand, treats dynamical friction more as a global effect. As we will see, there are circumstances under which these views make different predictions, and if that is the case, the third and last view presents itself as the better one.
Chandrasekhar derived an expression for the dynamical friction force which, although it is based on a number of questionable assumptions, yields results in reasonable agreement with simulations. This so-called Chandrasekhar dynamical friction force is given by
\[
\vec{F}_{\rm df} = M_S\,\frac{\mathrm{d}\vec{v}_S}{\mathrm{d}t} = -4\pi\,\ln\Lambda\;\frac{G^2\,M_S^2\,\rho(<v_S)}{v_S^3}\,\vec{v}_S
\]
Here $\rho(<v_S)$ is the density of particles of mass $m$ that have a speed $v_m < v_S$, and $\ln\Lambda$ is called the Coulomb logarithm. Its value is uncertain (typically $3 \lesssim \ln\Lambda \lesssim 30$). One often approximates it as $\ln\Lambda \sim \ln(M_h/M_S)$, where $M_h$ is the total mass of the system of particles of mass $m$, but this should only be considered a very rough estimate at best. The uncertainties in the Coulomb logarithm derive from the oversimplified assumptions made by Chandrasekhar, which include that the medium through which the subject mass is moving is infinite, uniform and has an isotropic velocity distribution $f(v_m)$ for the sea of particles.
Similar to frictional drag in fluid mechanics, $\vec{F}_{\rm df}$ always points in the direction opposite to $\vec{v}_S$.
Note that $\vec{F}_{\rm df}$ is independent of the mass $m$ of the constituent particles, and proportional to $M_S^2$. The latter arises, within the second or third view depicted above, from the fact that the wake or response density has a mass that is proportional to $M_S$, and the gravitational force between the subject mass and the wake/response density therefore scales as $M_S^2$.
Figure 24: Examples of the response density in a host system due to a perturber
orbiting inside it. The back-reaction of this response density on the perturber causes
the latter to experience dynamical friction. The position of the perturber is indicated
by an asterisk. [Source: Weinberg, 1989, MNRAS, 239, 549]
Let us assume that the host mass is a singular isothermal sphere with density and potential given by
\[
\rho(r) = \frac{V_c^2}{4\pi G r^2}\,, \qquad \Phi(r) = V_c^2\,\ln r
\]
where $V_c^2 = G M_h/r_h$, with $r_h$ the radius of the host mass. If we further assume that this host mass has, at each point, an isotropic and Maxwellian velocity distribution, then
\[
f(v_m) = \frac{\rho(r)}{(2\pi\sigma^2)^{3/2}}\,\exp\left(-\frac{v_m^2}{2\sigma^2}\right)
\]
with $\sigma = V_c/\sqrt{2}$.
that virialized dark matter halos all have the same average density). Using that $\ln\Lambda \sim \ln(M_h/M_S)$ and assuming that the subject mass starts out from an initial radius $r_i = r_h$, we obtain a dynamical friction time
\[
t_{\rm df} = 0.12\,\frac{M_h/M_S}{\ln(M_h/M_S)}\,t_{\rm H}
\]
Hence, the time $t_{\rm df}$ on which dynamical friction brings an object of mass $M_S$ moving in a host of mass $M_h$ from an initial radius of $r_i = r_h$ to $r = 0$ is shorter than the Hubble time as long as $M_S \gtrsim M_h/30$. Hence, dynamical friction is only effective for fairly massive objects, relative to the mass of the host. In fact, if you take into account that the subject mass experiences mass stripping as well (due to the tidal interactions with the host), the dynamical friction time increases by a factor 2 to 3, and $t_{\rm df} < t_{\rm H}$ actually requires that $M_S \gtrsim M_h/10$.
For a more detailed treatment of collisions and encounters of collisionless systems, see
Chapter 12 of Galaxy Formation and Evolution by Mo, van den Bosch & White.
Computational Hydrodynamics
Numerical, or computational, hydrodynamics is a rich topic, and one could easily devote an entire course to it. The following chapters therefore only scratch the surface of this rich topic. Readers who want more in-depth information are referred to the following excellent textbooks:
- Numerical Solution of Partial Differential Equations by K. Morton & D. Mayers
- Numerical Methods for Partial Differential Equations by W. Ames
- Difference Methods for Initial Value Problems by R. Richtmyer
- Numerical Methods for Engineers and Scientists by J. Hoffman
- Riemann Solvers and Numerical Methods for Fluid Dynamics by E. Toro
CHAPTER 17
Having discussed the theory of fluid dynamics, we now focus on how to actually solve the partial differential equations (PDEs) that describe the time-evolution of the various hydrodynamical quantities. Solving PDEs analytically is complicated (see Appendix E for some background) and typically only possible for highly idealized flows. In general, we are forced to solve the PDEs numerically. How this is done is the topic of this and subsequent chapters.
The first thing to realize is that a computer will only solve a discrete representation of the actual PDE. We call this approximation a finite difference approximation (FDA). In addition, we need to represent the physical data (i.e., our hydrodynamical quantities of interest) in a certain finite physical region of interest, called the computational domain. Typically this is done by subdividing the computational domain using a computational mesh; the data is specified at a finite number of grid points or cells. Throughout these chapters on numerical hydrodynamics we will consider regular, cartesian meshes, but more complicated, even time-dependent, meshes can be adopted as well. There are two different 'methods' (or 'philosophies') for how to formulate the problem. On the one hand, one can think of the data as literally being placed at the grid points. This yields a finite difference formulation. Alternatively, one can envision the data being spread out over the mesh cell (or 'zone'), yielding a finite volume formulation. In practice, finite difference formulations are a bit faster, and easier to comprehend, which is why much of what follows adopts the finite difference formulation. In Chapter 19, though, we will transition to the finite volume formulation, which is the formulation most often adopted in modern computational fluid dynamics.
equations being modelled. In what follows we therefore adhere to Eulerian methods
based on computational domains that are discretized on a mesh.
In vector notation, the Euler equations can be written in the flux-conservative form $\partial\vec{q}/\partial t + \nabla\cdot\vec{f}(\vec{q}) = 0$, where
\[
\vec{q} = \begin{pmatrix} \rho \\ \rho\vec{u} \\ \frac{1}{2}\rho u^2 + \rho\varepsilon \end{pmatrix}
\qquad {\rm and} \qquad
\vec{f}(\vec{q}) = \begin{pmatrix} \rho\vec{u} \\ \rho\vec{u}\otimes\vec{u} + P \\ \left(\frac{1}{2}u^2 + \varepsilon + P/\rho\right)\rho\vec{u} \end{pmatrix}
\]
Depending on the number of spatial dimensions, this is a set of 3-5 coupled PDEs, and our goal is to come up with a scheme for how to numerically solve these. Note that
we are considering a highly oversimplified case, ignoring gravity, radiation, viscosity
and conduction. As discussed later, adding viscosity and conduction changes the
character of the PDE, while gravity and radiation can be added as source and/or
sink terms, something we will not cover in these lecture notes.
Typically, one can numerically solve this set of PDEs using the following procedure:
1. Define a spatial grid, ~xi , where here the index refers to the grid point.
2. Specify initial conditions for ~q (~x, t) at t = 0 at all ~xi , and specify suitable
boundary conditions on the finite computational domain used.
5. Take a small step in time, t, and compute the new ~q (~xi , t + t).
6. Go back to [3]
Although this sounds simple, there are many things that can go wrong. By far the
most tricky part is step [5], as we will illustrate in what follows.
Let us start with some basics. The above set of PDEs is hyperbolic. Mathematically this means the following. Consider the Jacobian matrix of the flux function
\[
F_{ij}(\vec{q}) \equiv \frac{\partial f_i}{\partial q_j}
\]
The set of PDEs is hyperbolic if, for each value of $\vec{q}$, all the eigenvalues of $F$ are real, and $F$ is diagonalizable.
In a system described by a hyperbolic PDE, information travels at a finite speed (cs
in the example here). Information is not transmitted until the wave arrives. The
smoothness of the solution to a hyperbolic PDE depends on the smoothness of the
initial and boundary conditions. For instance, if there is a jump in the data at the
start or at the boundaries, then the jump will propagate as a shock in the solution.
If, in addition, the PDE is nonlinear (which is the case for the Euler equations), then
shocks may develop even though the initial conditions and the boundary conditions
are smooth.
If we were to consider the Navier-Stokes equations, rather than the Euler equations, then we add 'non-ideal' terms (referring to the fact that these terms are absent for an ideal fluid) related to viscosity and conduction. These terms come with a second-order spatial derivative, describing diffusive processes. Such terms are called parabolic, and they have a different impact on the PDE. See the box below.
Hence, we are faced with the problem of simultaneously solving a hyperbolic set of 3-5 PDEs, some of which are non-linear, and some of which may contain parabolic terms. This is a formidable problem, and one could easily devote an entire course to it. Readers who want more in-depth information are referred to the textbooks listed at the start of this section on numerical hydrodynamics.
The Nature of Hyperbolic and Parabolic PDEs
Consider the 1D linear advection equation, $\partial\rho/\partial t + v\,\partial\rho/\partial x = 0$, and substitute a formal solution of the form $\rho(x,t) = \rho_0 + \rho_1\,e^{i(kx - \omega t)}$; this yields $\omega = v\,k$. This is a travelling wave, and $\omega$ is always real. We also see that the group velocity $\partial\omega/\partial k = v$ is constant, and independent of $k$; all modes propagate at the same speed, and the wave-solution is thus non-dispersive. This is characteristic of a hyperbolic PDE. Next consider the heat equation
\[
\frac{\partial T}{\partial t} = \kappa\,\frac{\partial^2 T}{\partial x^2}
\]
This equation describes how heat diffuses in a 1D system with diffusion coefficient $\kappa$. Substituting the same formal solution as above, we obtain that
\[
\omega = -i\,\kappa\,k^2 \quad\Longrightarrow\quad \rho(x, t) = \rho_0 + \rho_1\,e^{ikx}\,e^{-\kappa k^2 t}
\]
Note that $\omega$ is now imaginary, and that the solution has an exponentially decaying term. This describes how the perturbation will 'die out' over time due to dissipation. This is characteristic of a parabolic PDE.
In what follows, rather than tackling the problem of numerically solving the Euler
or Navier-Stokes equations, we focus on a few simple limiting cases that result in
fewer equations, which will give valuable insight to aspects of the various numerical
schemes. In particular, we restrict ourselves to the 1D case, and continue to assume
an ideal fluid for which we can ignore radiation and gravity. The hydrodynamic
equations now reduce to the following set of 3 PDEs:
\[
\frac{\partial\rho}{\partial t} + \frac{\partial}{\partial x}(\rho u) = 0
\]
\[
\frac{\partial}{\partial t}(\rho u) + \frac{\partial}{\partial x}(\rho u u + P) = 0
\]
\[
\frac{\partial}{\partial t}\left(\frac{1}{2}\rho u^2 + \rho\varepsilon\right) + \frac{\partial}{\partial x}\left[\left(\frac{1}{2}\rho u^2 + \rho\varepsilon + P\right)u\right] = 0
\]
If we now, in addition, assume constant pressure, $P$, and a constant flow velocity, $u$, then this system reduces to 2 separate, linear PDEs given by
\[
\frac{\partial\rho}{\partial t} + u\,\frac{\partial\rho}{\partial x} = 0 \qquad {\rm and} \qquad \frac{\partial\varepsilon}{\partial t} + u\,\frac{\partial\varepsilon}{\partial x} = 0
\]
These are identical equations, known as the linear advection equation. They describe the passive transport of the quantities $\rho$ and $\varepsilon$ in a flow with constant velocity $u$. This equation has a well-known, rather trivial solution: if the initial ($t = 0$) conditions are given by $\rho_0(x)$ and $\varepsilon_0(x)$, then the general solutions are
\[
\rho(x, t) = \rho_0(x - ut) \qquad {\rm and} \qquad \varepsilon(x, t) = \varepsilon_0(x - ut)
\]
Since the analytical solution to this linear advection equation is (trivially) known, it is an ideal equation on which to test our numerical scheme(s). As we will see, even this extremely simple example will prove to be surprisingly difficult to solve numerically.
Figure 25: Filled and open dots indicate the grid points (horizontal) and time steps
(vertical). The grey region is the physical domain of dependence for the (x, t)-grid
point, and the CFL-condition states that the numerical domain of dependence must
contain this physical domain. The triangle formed by the two dashed lines and solid
black line indicates this numerical domain in the case of the explicit Euler scheme
of integration discussed below. It relies on the properties of the neighboring grid points
at the previous time-step. The time-step in panel [1] DOES meet the CFL-criterion,
while that in panel [2] does NOT.
Here the parameter $\alpha_c$ is often called the CFL, or Courant, parameter. Note that this CFL condition is necessary for stability, but not sufficient. In other words, obeying the CFL condition does not guarantee stability, as we will see shortly.
The principle behind the CFL condition is simple: if a wave is moving across a discrete spatial grid and we want to compute its amplitude at discrete time steps of equal duration, $\Delta t$, then this duration must be less than the time for the wave to travel to adjacent grid points. As a corollary, when the grid point separation is reduced, the upper limit for the time step also decreases. In essence, the numerical domain of dependence of any point in space and time must include the analytical domain of dependence (wherein the initial conditions have an effect on the exact value of the solution at that point) to ensure that the scheme can access the information required to form the solution. This is illustrated in Fig. 25.
in computational hydrodynamics and much of the literature on solving differential equations. In this new notation our linear advection equation is given by
\[
\frac{\partial u}{\partial t} + v\,\frac{\partial u}{\partial x} = 0
\]
where $v$ is now the constant advection speed, and $u$ is the property being advected. Throughout we adopt a discretization in time and space given by
\[
t^n = t^0 + n\,\Delta t\,, \qquad x_i = x_0 + i\,\Delta x
\]
Note that subscripts indicate the spatial index, while superscripts are used to refer to the temporal index. Hence, $u_i^{n+1}$ refers to the value of $u$ at grid location $i$ at time-step $n+1$, etc. The key to numerically solving differential equations is finding how to express derivatives in terms of the discretized quantities. This requires a finite difference scheme. Using Taylor series expansion, we have that
\[
u_{i+1} \equiv u(x_i + \Delta x) = u(x_i) + \Delta x\,\frac{\partial u}{\partial x}(x_i) + \frac{(\Delta x)^2}{2}\frac{\partial^2 u}{\partial x^2}(x_i) + \frac{(\Delta x)^3}{6}\frac{\partial^3 u}{\partial x^3}(x_i) + \mathcal{O}(\Delta x^4)
\]
\[
u_i \equiv u(x_i)
\]
\[
u_{i-1} \equiv u(x_i - \Delta x) = u(x_i) - \Delta x\,\frac{\partial u}{\partial x}(x_i) + \frac{(\Delta x)^2}{2}\frac{\partial^2 u}{\partial x^2}(x_i) - \frac{(\Delta x)^3}{6}\frac{\partial^3 u}{\partial x^3}(x_i) + \mathcal{O}(\Delta x^4)
\]
By subtracting the first two expressions, and dividing by $\Delta x$, we obtain the following finite difference approximation for the first derivative
\[
u_i' \approx \frac{u_{i+1} - u_i}{\Delta x} - \frac{1}{2}\frac{\partial^2 u}{\partial x^2}(x_i)\,\Delta x
\]
The first term is the finite difference approximation (FDA) for the first derivative, and is known as the forward difference. The second term gives the truncation error, which shows that this FDA is first-order accurate (in $\Delta x$).
We can obtain an alternative FDA for the first derivative by subtracting the latter two expressions, and again dividing by $\Delta x$:
\[
u_i' \approx \frac{u_i - u_{i-1}}{\Delta x} + \frac{1}{2}\frac{\partial^2 u}{\partial x^2}(x_i)\,\Delta x
\]
Figure 26: Stencil diagrams for the Backward-Space FTBS scheme (left-hand panel), the Central-Space FTCS scheme (middle panel), and the Forward-Space FTFS scheme (right-hand panel). All of these are examples of explicit Euler integration schemes.
Combining the two Taylor series approximations, and subtracting one from the other, yields yet another FDA for the first derivative, given by
u_i' \approx \frac{u_{i+1} - u_{i-1}}{2\,\Delta x} - \frac{1}{6}\,\frac{\partial^3 u}{\partial x^3}(x_i)\,(\Delta x)^2
which is known as the centred difference scheme. Note that this FDA is second-order accurate in Δx.
Using the same approach, one can also obtain FDAs for higher-order derivatives. For example, by adding the two Taylor series expressions above, we find that
u_i'' \approx \frac{u_{i+1} - 2u_i + u_{i-1}}{(\Delta x)^2}
which is second-order accurate in Δx. Similarly, by folding in Taylor series expressions for u_{i+2} and u_{i-2}, one can obtain higher-order FDAs. For example, the first derivative can then be written as
u_i' \approx \frac{-u_{i+2} + 8u_{i+1} - 8u_{i-1} + u_{i-2}}{12\,\Delta x}
which is fourth-order accurate in Δx.
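As a concrete illustration of these truncation errors, the following short Python sketch (my own illustration, not part of the notes; the test function sin(x) and the step sizes are arbitrary choices) compares the forward, backward and centred FDAs of the first derivative against the exact result. Halving Δx roughly halves the error of the one-sided differences, but reduces the error of the centred difference by a factor ~4, as expected for first- and second-order accurate FDAs.

import numpy as np

def fda_errors(dx, x0=1.0):
    """Absolute errors of the forward, backward and centred FDAs of
    d/dx sin(x) at x = x0, for step size dx."""
    f, exact = np.sin, np.cos(x0)
    forward  = (f(x0 + dx) - f(x0)) / dx
    backward = (f(x0) - f(x0 - dx)) / dx
    centred  = (f(x0 + dx) - f(x0 - dx)) / (2.0 * dx)
    return abs(forward - exact), abs(backward - exact), abs(centred - exact)

for dx in [0.1, 0.05, 0.025]:
    ef, eb, ec = fda_errors(dx)
    print(f"dx={dx:5.3f}  forward={ef:.2e}  backward={eb:.2e}  centred={ec:.2e}")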
Figure 27: The results of using the FTCS scheme to propagate the initial conditions, indicated by the red dotted lines, using the 1D linear advection equation. Despite the fact that α_c = vΔt/Δx = 0.1, thus obeying the CFL-condition, the numerical solution (in blue) develops growing oscillations, a manifestation of its inherently unstable nature. The red solid lines in each panel show the corresponding analytical solutions. These results are based on a linear spatial grid using 100 spatial cells over the domain [0, 1], with a time step Δt = 0.001. The advection velocity is v = 1.0.
Now let us return to our linear advection equation. Since we only know u(x, t) in the past, but not in the future, we approximate ∂u/∂t with the one-sided difference (u_i^{n+1} − u_i^n)/Δt, i.e., a forward difference with respect to time-step n (hence 'Forward-Time'). For the spatial derivative it seems natural to pick the centred difference scheme, which is higher order than the forward or backward difference schemes. Hence, we have that
u_i^{n+1} = u_i^n - v\,\frac{\Delta t}{2\,\Delta x}\left(u_{i+1}^n - u_{i-1}^n\right)
This scheme for solving the linear advection equation numerically is called the ex-
plicit Euler scheme or FTCS-scheme for Forward-Time-Central-Space.
We can define similar explicit schemes based on the forward and backward difference
schemes. In particular, we have the FTBS (Forward-Time-Backward-Space) scheme
u_i^{n+1} = u_i^n - v\,\frac{\Delta t}{\Delta x}\left(u_i^n - u_{i-1}^n\right)
and the FTFS (Forward-Time-Forward-Space) scheme
u_i^{n+1} = u_i^n - v\,\frac{\Delta t}{\Delta x}\left(u_{i+1}^n - u_i^n\right)
Figure 27 shows the results of the FTCS-scheme applied to a simple initial condition in which u = 1 for 0.4 ≤ x ≤ 0.6 and zero otherwise (red, dotted lines). These conditions are advected with a constant, uniform velocity v = 1.0. The blue curves show the results obtained using the FTCS scheme with Δx = 0.01 (i.e., the domain [0, 1] is discretized using 100 spatial cells) and Δt = 0.001. Results are shown after 50, 100, 150 and 200 time-steps, as indicated. The solid, red curves in the four panels show the corresponding analytical solution, which simply corresponds to a horizontal displacement of the initial conditions.
Despite the fact that the CFL-condition is easily met (|v|Δt/Δx = 0.1), the solution develops large oscillations that grow with time, rendering this scheme useless.
Let's now try another scheme. Let's pick the FTBS scheme, whose stencil is given by the left-hand panel of Fig. 26. The results are shown in Fig. 28. Surprisingly, this scheme, which is also known as the upwind or donor-cell scheme, yields very different solutions. The solutions are smooth (no growing oscillations), but they are substantially 'smeared out', as if diffusion is present.
For completeness, Fig. 29 shows the same results but for the FTFS scheme (stencil shown in the right-hand panel of Fig. 26). This scheme is even more unstable than the FTCS scheme, with huge oscillations developing rapidly.
Figure 28: Same as Figure 27, but for the FTBS scheme (stencil in left-hand panel of Fig. 26). This scheme is stable, yielding smooth solutions, but it suffers from significant numerical diffusion.
Before we delve into other integration schemes, and an in-depth analysis of why subtly different schemes perform so dramatically differently, we first take a closer look at the 1D conservation equation
\frac{\partial u}{\partial t} + \frac{\partial f}{\partial x} = 0
where f = f(u) is the flux associated with u (which, in the case of linear advection, is given by f = vu). We can write this equation in integral form as
\int_{x_{i-1/2}}^{x_{i+1/2}} {\rm d}x \int_{t^n}^{t^{n+1}} {\rm d}t \left[\frac{\partial u}{\partial t} + \frac{\partial f}{\partial x}\right] = 0
where the integration limits are the boundaries of cell i, which we denote by x_{i-1/2} and x_{i+1/2}, and the boundaries of the time step t^n → t^{n+1}. If we now consider u as being constant over a cell, and the flux as constant during a time step, we can write this as
\int_{x_{i-1/2}}^{x_{i+1/2}} {\rm d}x \left[u(x, t^{n+1}) - u(x, t^n)\right] + \int_{t^n}^{t^{n+1}} {\rm d}t \left[f(x_{i+1/2}, t) - f(x_{i-1/2}, t)\right] =
u(x_i, t^{n+1})\,\Delta x - u(x_i, t^n)\,\Delta x + f_{i+1/2}^{n+1/2}\,\Delta t - f_{i-1/2}^{n+1/2}\,\Delta t = 0
Figure 29: Same as Figure 27, but for the FTFS scheme (stencil in right-hand panel of Fig. 26). Clearly, this scheme is utterly unstable and completely useless.
This is called 'conservation form' because it expresses that property u in cell i only changes due to a flux of u through its boundaries. With this formulation, we can describe any integration scheme by simply specifying the flux f_{i+1/2}^n. For example, the three integration schemes discussed thus far are specified by f_{i+1/2}^n = f(u_i^n) (FTBS), f_{i+1/2}^n = \frac{1}{2}[f(u_{i+1}^n) + f(u_i^n)] (FTCS), and f_{i+1/2}^n = f(u_{i+1}^n) (FTFS). See also the Table below.
So far we have considered three integration schemes, which are all examples of explicit
Euler schemes. When a direct computation of the dependent variables can be made
in terms of known quantities, the computation is said to be explicit. In contrast,
when the dependent variables are defined by coupled sets of equations, and either a
matrix or iterative technique is needed to obtain the solution, the numerical method
is said to be implicit. An example of an implicit scheme is the following FDA of
the heat equation (see Problem Set 3)
\frac{u_i^{n+1} - u_i^n}{\Delta t} = \frac{u_{i+1}^{n+1} - 2u_i^{n+1} + u_{i-1}^{n+1}}{(\Delta x)^2}
Note that the second-order spatial derivative on the rhs is evaluated at time t^{n+1}, rather than t^n, which is what makes this scheme implicit. Since explicit schemes are much easier to code up, we will not consider any implicit schemes in these lecture notes. We emphasize, though, that implicit schemes are sometimes powerful alternatives.
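To make the distinction concrete, here is a minimal Python sketch (my own, not part of the notes; grid size, diffusion coefficient and time step are arbitrary choices) of a single implicit (backward-time, centred-space) update of the heat equation. Because u^{n+1} appears on both sides of the FDA, all cells are coupled and each time step requires solving a (tridiagonal) linear system, in contrast to the explicit schemes used in the rest of these notes.

import numpy as np

def implicit_heat_step(u, D, dt, dx):
    """One implicit (BTCS) step of du/dt = D d2u/dx2.
    Solves (1 + 2r) u_i^{n+1} - r (u_{i+1}^{n+1} + u_{i-1}^{n+1}) = u_i^n,
    with r = D dt/dx^2, keeping the two boundary cells fixed."""
    N, r = u.size, D * dt / dx**2
    A = np.zeros((N, N))
    np.fill_diagonal(A, 1.0 + 2.0 * r)
    for i in range(N - 1):
        A[i, i + 1] = A[i + 1, i] = -r
    A[0, :] = 0.0;  A[0, 0] = 1.0      # simple Dirichlet boundaries
    A[-1, :] = 0.0; A[-1, -1] = 1.0
    return np.linalg.solve(A, u)

u = np.zeros(11); u[5] = 1.0           # a spike of heat
print(implicit_heat_step(u, D=1.0, dt=0.1, dx=0.1).round(3))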
None of the three (explicit) schemes considered thus far is satisfactory; the FTCS and FTFS schemes are unstable, yielding oscillations that grow rapidly, while the FTBS scheme suffers from a large amount of numerical dissipation. Fortunately, there are many alternative schemes, both explicit and implicit. The Table below lists 7 explicit finite difference methods that have been suggested in the literature. In addition to the three Euler methods discussed above, this includes the Lax-Friedrichs method, which is basically a FTCS-method with an added artificial viscosity term. Similar to the FTCS method, it is second-order accurate in space, and first-order accurate in time. The performance of this method is evident from the upper-right panel in Fig. 30. Clearly, the artificial viscosity suppresses the onset of growing oscillations, which is good, but the numerical diffusion is much worse than in the FTBS upwind (or donor-cell) scheme, rendering this method not very useful, except when the initial conditions are very smooth.
All the finite-difference methods encountered thus far are first-order accurate in time. The Table also lists three schemes that are second-order accurate in both space and time, i.e., with an error that is O(Δx², Δt²). The first of these is the Lax-Wendroff method. As is evident from the lower-left panel of Fig. 30 it results in some oscillations, but these don't grow much beyond a certain point (unlike, for example, the first-order Euler-FTCS scheme shown in the upper-left panel). Another, similar scheme is the Beam-Warming method, whose performance, shown in the lower-middle panel of Fig. 30, is only marginally better. Finally, the lower-right panel shows the performance of the Fromm method, which is basically the average of the Lax-Wendroff and Beam-Warming schemes, i.e., f_{i+1/2,{\rm Fromm}}^n = \frac{1}{2}[f_{i+1/2,{\rm LW}}^n + f_{i+1/2,{\rm BW}}^n]. As is evident, this is clearly the most successful method encountered thus far.
Explicit Finite Difference Methods for the 1D Linear Advection Problem

Euler FTBS:
  f_{i+1/2}^n = f(u_i^n)
  u_i^{n+1} = u_i^n - \alpha_c\left[u_i^n - u_{i-1}^n\right]

Euler FTCS:
  f_{i+1/2}^n = \frac{1}{2}\left[f(u_{i+1}^n) + f(u_i^n)\right]
  u_i^{n+1} = u_i^n - \frac{\alpha_c}{2}\left[u_{i+1}^n - u_{i-1}^n\right]

Euler FTFS:
  f_{i+1/2}^n = f(u_{i+1}^n)
  u_i^{n+1} = u_i^n - \alpha_c\left[u_{i+1}^n - u_i^n\right]

Lax-Friedrichs:
  f_{i+1/2}^n = \frac{1}{2}\left[f(u_{i+1}^n) + f(u_i^n)\right] - \frac{1}{2}\frac{\Delta x}{\Delta t}\left[u_{i+1}^n - u_i^n\right]
  u_i^{n+1} = u_i^n - \frac{\alpha_c}{2}\left[u_{i+1}^n - u_{i-1}^n\right] + \frac{1}{2}\left[u_{i+1}^n - 2u_i^n + u_{i-1}^n\right]

Lax-Wendroff:
  f_{i+1/2}^n = \frac{1}{2}\left[f(u_{i+1}^n) + f(u_i^n)\right] - \frac{v^2 \Delta t}{2\,\Delta x}\left[u_{i+1}^n - u_i^n\right]
  u_i^{n+1} = u_i^n - \frac{\alpha_c}{2}\left[u_{i+1}^n - u_{i-1}^n\right] + \frac{\alpha_c^2}{2}\left[u_{i+1}^n - 2u_i^n + u_{i-1}^n\right]

Beam-Warming:
  f_{i+1/2}^n = \frac{1}{2}\left[3f(u_i^n) - f(u_{i-1}^n)\right] - \frac{v^2 \Delta t}{2\,\Delta x}\left[u_i^n - u_{i-1}^n\right]
  u_i^{n+1} = u_i^n - \frac{\alpha_c}{2}\left[3u_i^n - 4u_{i-1}^n + u_{i-2}^n\right] + \frac{\alpha_c^2}{2}\left[u_i^n - 2u_{i-1}^n + u_{i-2}^n\right]

Fromm:
  f_{i+1/2}^n = \frac{1}{4}\left[f(u_{i+1}^n) + 4f(u_i^n) - f(u_{i-1}^n)\right] - \frac{v^2 \Delta t}{4\,\Delta x}\left[u_{i+1}^n - u_i^n - u_{i-1}^n + u_{i-2}^n\right]
  u_i^{n+1} = u_i^n - \frac{\alpha_c}{4}\left[u_{i+1}^n + 3u_i^n - 5u_{i-1}^n + u_{i-2}^n\right] + \frac{\alpha_c^2}{4}\left[u_{i+1}^n - u_i^n - u_{i-1}^n + u_{i-2}^n\right]

Table listing all the explicit integration schemes discussed in the text. For each entry the first line gives the flux, while the second line gives the conservative update formula for the 1D linear advection equation, for which f(u) = vu, with v the constant advection speed. The parameter α_c is the Courant (or CFL) parameter given by α_c = vΔt/Δx.
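For reference, a direct Python transcription of the update formulas in the Table might look as follows (this is my own sketch; the grid, time step and top-hat initial condition mimic the numerical experiments described in the text, and the function name is arbitrary).

import numpy as np

def advect(scheme, n_steps=1000, N=100, v=1.0, dt=0.001):
    """Advect a top-hat with one of the explicit schemes from the Table,
    using periodic boundaries; alpha_c = v*dt/dx is the Courant parameter."""
    dx = 1.0 / N
    a = v * dt / dx
    x = (np.arange(N) + 0.5) * dx
    u = np.where((x >= 0.4) & (x <= 0.6), 1.0, 0.0)
    for _ in range(n_steps):
        up1, um1, um2 = np.roll(u, -1), np.roll(u, 1), np.roll(u, 2)
        if scheme == "FTBS":
            u = u - a * (u - um1)
        elif scheme == "FTCS":
            u = u - 0.5 * a * (up1 - um1)
        elif scheme == "FTFS":
            u = u - a * (up1 - u)
        elif scheme == "LaxFriedrichs":
            u = u - 0.5 * a * (up1 - um1) + 0.5 * (up1 - 2 * u + um1)
        elif scheme == "LaxWendroff":
            u = u - 0.5 * a * (up1 - um1) + 0.5 * a**2 * (up1 - 2 * u + um1)
        elif scheme == "BeamWarming":
            u = u - 0.5 * a * (3 * u - 4 * um1 + um2) + 0.5 * a**2 * (u - 2 * um1 + um2)
        elif scheme == "Fromm":
            u = u - 0.25 * a * (up1 + 3 * u - 5 * um1 + um2) \
                  + 0.25 * a**2 * (up1 - u - um1 + um2)
        else:
            raise ValueError(scheme)
    return x, u

x, u = advect("Fromm")
print(u.min(), u.max())   # values outside [0, 1] signal over/undershoots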
Figure 30: The result of using 6 different explicit integration schemes, as indicated, to propagate the initial conditions indicated in red using the 1D linear advection equation with a constant velocity v = 1.0. All schemes use a Courant parameter α_c = 0.1, and 100 grid points to sample u(x) over the x-interval [0, 1], assuming periodic boundary conditions. The blue curves show the results after 1000 time steps of Δt = 0.001, which covers exactly one full period.
CHAPTER 18
In this chapter we address the question of how one can test/assess the performance of finite difference schemes. We start by introducing some relevant terms:
• Consistency: a finite difference scheme is consistent with the PDE if its truncation error vanishes as Δx → 0 and Δt → 0.
• Stability: a scheme is stable if numerical errors (round-off and truncation errors) do not grow without bound as the integration proceeds.
• Convergence: a scheme is convergent if its numerical solution approaches the exact solution of the PDE as Δx → 0 and Δt → 0.
These three 'aspects' of a numerical scheme are related through what is known as Lax's equivalence theorem: it states that for a consistent finite difference method, applied to a well-posed linear initial value problem, the method is convergent if and only if it is stable.
A further useful concept is that of the modified equation, which is useful to develop a feeling for the behavior of a finite difference method.
Truncation error: As we have seen in the previous chapter, the finite differences are typically obtained using Taylor series expansion up to some order in Δx and/or Δt. This introduces truncation errors, errors that derive from the fact that the series is truncated at some finite order. The forward and backward Euler schemes are first order in both space and time, which we write as O(Δt, Δx); the FTCS and Lax-Friedrichs schemes are first order in time, but second order in space, i.e., O(Δt, Δx²); and the Lax-Wendroff, Beam-Warming and Fromm methods are all second order in both space and time, i.e., O(Δt², Δx²). Typically higher order yields better accuracy, if stable. Or, put differently, one can achieve the same accuracy using a coarser grid/mesh.
As we have seen in the previous chapter, the first-order method that appears stable (the upwind/donor-cell method) yields smeared solutions, while the second-order methods (Lax-Wendroff, Beam-Warming and Fromm) give rise to oscillations. This qualitatively different behavior of first and second order methods is typical and can be understood using an analysis of what is called the modified equation. Recall that the discrete equation used (i.e., the finite difference scheme adopted) is intended to approximate the original PDE (in the cases discussed thus far, the 1D linear advection equation). However, the discrete equation may be an even better approximation of a modified version of the original PDE (one that corresponds to a higher order of the truncation error). Analyzing this modified equation gives valuable insight into the qualitative behavior of the numerical scheme in question.
As an example, consider the Euler FTCS method, which replaces the actual PDE
\frac{\partial u}{\partial t} + v\,\frac{\partial u}{\partial x} = 0
with the following discrete equation
\frac{u_i^{n+1} - u_i^n}{\Delta t} + v\,\frac{u_{i+1}^n - u_{i-1}^n}{2\,\Delta x} = 0
Using Taylor series expansion in time up to second order, we have that
u_i^{n+1} = u_i^n + \Delta t\left(\frac{\partial u}{\partial t}\right) + \frac{(\Delta t)^2}{2}\left(\frac{\partial^2 u}{\partial t^2}\right) + \mathcal{O}(\Delta t^3)
which implies that
\frac{u_i^{n+1} - u_i^n}{\Delta t} = \frac{\partial u}{\partial t} + \frac{\Delta t}{2}\,\frac{\partial^2 u}{\partial t^2} + \mathcal{O}(\Delta t^2)
Similarly, using Taylor series expansion in space up to second order, we have that
u_{i+1}^n = u_i^n + \Delta x\left(\frac{\partial u}{\partial x}\right) + \frac{(\Delta x)^2}{2}\left(\frac{\partial^2 u}{\partial x^2}\right) + \mathcal{O}(\Delta x^3)
u_{i-1}^n = u_i^n - \Delta x\left(\frac{\partial u}{\partial x}\right) + \frac{(\Delta x)^2}{2}\left(\frac{\partial^2 u}{\partial x^2}\right) + \mathcal{O}(\Delta x^3)
which implies that
\frac{u_{i+1}^n - u_{i-1}^n}{2\,\Delta x} = \frac{\partial u}{\partial x} + \mathcal{O}(\Delta x^2)
Hence, our modified equation is
\frac{\partial u}{\partial t} + v\,\frac{\partial u}{\partial x} = -\frac{\Delta t}{2}\,\frac{\partial^2 u}{\partial t^2} + \mathcal{O}(\Delta t^2, \Delta x^2)
Using that
\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial t}\left(\frac{\partial u}{\partial t}\right) = \frac{\partial}{\partial t}\left(-v\,\frac{\partial u}{\partial x}\right) = -v\,\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial t}\right) = v^2\,\frac{\partial^2 u}{\partial x^2}
where in the second and final steps we have used the original PDE to relate the temporal derivative to the spatial derivative, we can write our modified equation as
\frac{\partial u}{\partial t} + v\,\frac{\partial u}{\partial x} = -\frac{v^2\,\Delta t}{2}\,\frac{\partial^2 u}{\partial x^2} + \mathcal{O}(\Delta t^2, \Delta x^2)
Note that the first term on the right-hand side is a diffusion term, with a diffusion coefficient
D = -\frac{v^2\,\Delta t}{2}
Hence, to second order, the discrete equation of the 1D linear advection equation based on the FTCS method actually solves what is known as an advection-diffusion equation. But, most importantly, the corresponding diffusion coefficient is negative. This implies that the FTCS scheme is unconditionally unstable; i.e., there are no Δx and/or Δt for which the FTCS method will yield a stable solution of the 1D linear advection equation.
Let us now apply the same method to the FTBS scheme, whose discrete equation for the 1D linear advection equation is given by
\frac{u_i^{n+1} - u_i^n}{\Delta t} + v\,\frac{u_i^n - u_{i-1}^n}{\Delta x} = 0
Using Taylor series expansions as above, one finds that
\frac{u_i^n - u_{i-1}^n}{\Delta x} = \frac{\partial u}{\partial x} - \frac{\Delta x}{2}\,\frac{\partial^2 u}{\partial x^2} + \mathcal{O}(\Delta x^2)
Hence, our modified equation is
\frac{\partial u}{\partial t} + v\,\frac{\partial u}{\partial x} = -\frac{\Delta t}{2}\,\frac{\partial^2 u}{\partial t^2} + v\,\frac{\Delta x}{2}\,\frac{\partial^2 u}{\partial x^2} + \mathcal{O}(\Delta t^2, \Delta x^2)
which can be recast in the advection-diffusion equation form with
D = v\,\frac{\Delta x}{2} - v^2\,\frac{\Delta t}{2} = v\,\frac{\Delta x}{2}\left(1 - v\,\frac{\Delta t}{\Delta x}\right) = v\,\frac{\Delta x}{2}\left(1 - \alpha_c\right)
Thus we see that we can achieve stable diffusion (meaning D > 0) if v > 0 and α_c < 1 (the latter is the CFL-condition, which has to be satisfied anyway). This explains the diffusive nature of the FTBS scheme (see Fig. 28). It also shows that if v < 0 one needs to use the FTFS scheme to achieve similar stability. The upwind or donor-cell scheme is the generic term used to refer to the FTBS (FTFS) scheme if v > 0 (v < 0). Or, put differently, in the upwind method the spatial differencing is performed using grid points on the side from which information flows.
The student is encouraged to apply this method to other finite difference schemes. For example, applying it to the Lax-Friedrichs method yields once again a modified equation of the advection-diffusion form, but this time with a diffusion coefficient
D = \frac{(\Delta x)^2}{2\,\Delta t}\left(1 - \alpha_c^2\right)
which results in stable diffusion (D > 0) as long as the CFL-criterion is satisfied. For Lax-Wendroff and Beam-Warming one obtains modified equations of the form
\frac{\partial u}{\partial t} + v\,\frac{\partial u}{\partial x} = \eta\,\frac{\partial^3 u}{\partial x^3} + \mathcal{O}(\Delta t^3, \Delta x^3)
with
\eta = \frac{v\,(\Delta x)^2}{6}\left(\alpha_c^2 - 1\right) \qquad {\rm (Lax\mbox{-}Wendroff)}
\eta = \frac{v\,(\Delta x)^2}{6}\left(2 - 3\alpha_c + \alpha_c^2\right) \qquad {\rm (Beam\mbox{-}Warming)}
In order to understand the behavior of the explicit, second-order schemes, consider the modified equation
\frac{\partial u}{\partial t} + v\,\frac{\partial u}{\partial x} = \eta\,\frac{\partial^3 u}{\partial x^3}
Applying this to a linear wave with frequency ω and wave number k, i.e., u ∝ exp[±i(kx − ωt)], yields the dispersion relation
-i\,\omega + i\,v\,k = -i\,\eta\,k^3 \quad\Rightarrow\quad \omega = v\,k + \eta\,k^3
Hence the phase velocity, ω/k = v + ηk², depends on the wavenumber: the modified equation is dispersive, so that short-wavelength modes propagate at the wrong speed. This is what gives rise to the oscillatory wave trains near sharp features that are characteristic of these second-order schemes.
We now turn our attention to the stability of finite difference schemes. In the case where the original PDE is linear, one can assess the stability of the finite difference method using a von Neumann stability analysis. This analysis models the numerical noise as a Fourier series, and investigates whether the amplitude of the Fourier modes will grow or not. To see how this works, consider once again the 1D linear advection equation, as discretized by the FTCS method:
u_i^{n+1} = u_i^n - \frac{\alpha_c}{2}\left[u_{i+1}^n - u_{i-1}^n\right]
Since the underlying PDE is linear, the numerical noise, which is what is added to the
actual solution, also obeys the above equation. The von Neumann stability analysis
therefore starts by writing the present solution as a Fourier series (representing the numerical noise), i.e.,
u_i^n = \sum_k A_k^n \exp(-i\,k\,x_i)
where we have assumed periodic boundary conditions, such that we have a discrete sum of modes. Substitution in the above equation yields
A_k^{n+1} = A_k^n\left[1 - \frac{\alpha_c}{2}\exp(-i\,k\,\Delta x) + \frac{\alpha_c}{2}\exp(+i\,k\,\Delta x)\right] = A_k^n\left[1 + i\,\alpha_c \sin(k\,\Delta x)\right]
where we have used that sin x = (e^{ix} − e^{−ix})/2i. The evolution of the mode amplitudes is thus given by
\zeta^2 \equiv \frac{|A_k^{n+1}|^2}{|A_k^n|^2} = 1 + \alpha_c^2 \sin^2(k\,\Delta x)
As is evident, we have that ζ > 1 for all k. Hence, for any k the mode amplitude will grow, indicating that the FTCS method is inherently, unconditionally unstable.
Now let's apply the same analysis to the upwind scheme (FTBS), for which the discrete equation is given by
u_i^{n+1} = u_i^n - \alpha_c\left[u_i^n - u_{i-1}^n\right]
Substitution of the Fourier series yields
A_k^{n+1} = A_k^n\left[1 - \alpha_c + \alpha_c \exp(+i\,k\,\Delta x)\right] = A_k^n\left[1 - \alpha_c + \alpha_c\cos(k\,\Delta x) + i\,\alpha_c\sin(k\,\Delta x)\right]
After a bit of algebra, one finds that the evolution of the mode amplitudes is given by
\zeta^2 \equiv \frac{|A_k^{n+1}|^2}{|A_k^n|^2} = 1 - 2\alpha_c\left(1 - \alpha_c\right)\left[1 - \cos(k\,\Delta x)\right]
Upon inspection, this has ζ ≤ 1 if α_c ≤ 1; hence, the upwind scheme is stable as long as the CFL condition is satisfied. Note, though, that the fact that ζ < 1 implies not only that the numerical noise will not grow, but also that the actual solution will decline with time. Pure advection, which is what the actual PDE describes, should have ζ = 1; i.e., solutions only move, they don't grow or decay with time. Hence, the fact that our FDA has ζ < 1 is not physical; rather, this represents the numerical diffusion that is present in the upwind scheme.
In Problem Set 3, the students will perform a similar von Neumann stability analysis
for an explicit FDA of the heat equation.
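The amplification factors derived above are also easy to evaluate numerically. The short sketch below (my own illustration, not part of the notes) computes |ζ|² for the FTCS and FTBS (upwind) schemes over all resolvable wavenumbers, confirming that FTCS amplifies every Fourier mode while the upwind scheme damps them for α_c < 1.

import numpy as np

def amplification(alpha_c, n_k=256):
    """Return k*dx and |zeta|^2 for the FTCS and FTBS schemes."""
    kdx = np.linspace(0.0, np.pi, n_k)
    zeta2_ftcs = 1.0 + alpha_c**2 * np.sin(kdx)**2
    zeta2_ftbs = 1.0 - 2.0 * alpha_c * (1.0 - alpha_c) * (1.0 - np.cos(kdx))
    return kdx, zeta2_ftcs, zeta2_ftbs

kdx, z_ftcs, z_ftbs = amplification(alpha_c=0.5)
print("max |zeta|^2 FTCS :", z_ftcs.max())   # > 1: unconditionally unstable
print("max |zeta|^2 FTBS :", z_ftbs.max())   # <= 1 for alpha_c < 1: stable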
CHAPTER 19
In the previous two chapters we discussed how numerically solving the equations of
hydrodynamics means that we have to develop a FDA of the PDE (typically hyper-
bolic, potentially with non-ideal parabolic terms). We discussed how we can obtain
insight as to the behavior of the FDA by examining the corresponding modified
equation, and by performing a von Neumann stability analysis.
We have compared various FDA schemes to solve the 1D linear advection equation,
but found all of them to have serious shortcomings. These became especially apparent
when we examined the advection of ICs that contained discontinuities. The first-
order FDA schemes were too diffusive and dissipative, while the second order schemes gave rise to spurious over- and undershoots. The latter can be fatal whenever the property to be advected is inherently positive (e.g., mass density). In fact, this relates
to an important theorem due to Godunov,
Godunov Theorem: there are no linear higher-order schemes for treating linear
advection that retain positivity of the solution.
Consider once again the 1D conservation equation, ∂u/∂t + ∂f/∂x = 0, of Chapter 17. In integral form it is simply
\int_{x_{i-1/2}}^{x_{i+1/2}} {\rm d}x \int_{t^n}^{t^{n+1}} {\rm d}t \left[\frac{\partial u}{\partial t} + \frac{\partial f}{\partial x}\right] = 0
which reduces to
\int_{x_{i-1/2}}^{x_{i+1/2}} {\rm d}x \left[u(x, t^{n+1}) - u(x, t^n)\right] + \int_{t^n}^{t^{n+1}} {\rm d}t \left[f(x_{i+1/2}, t) - f(x_{i-1/2}, t)\right] = 0
If we now define the volume-averaged quantity
U_i^n \equiv \frac{1}{\Delta x}\int_{x_{i-1/2}}^{x_{i+1/2}} u(x, t^n)\,{\rm d}x
and the time-averaged fluxes
F_{i-1/2}^{n+1/2} \equiv \frac{1}{\Delta t}\int_{t^n}^{t^{n+1}} f(x_{i-1/2}, t)\,{\rm d}t \,, \qquad F_{i+1/2}^{n+1/2} \equiv \frac{1}{\Delta t}\int_{t^n}^{t^{n+1}} f(x_{i+1/2}, t)\,{\rm d}t
then this can be written as the update formula
U_i^{n+1} = U_i^n - \frac{\Delta t}{\Delta x}\left(F_{i+1/2}^{n+1/2} - F_{i-1/2}^{n+1/2}\right)
This is similar to the update formula for the conservation equation that we derived in Chapter 17, except that here the quantities are volume averaged. Note that this equation is exact (it is not a numerical scheme), as long as the U and F involved are computed using the above integrations.
Computing the precise fluxes, though, requires knowledge of u(x, t) over each cell, and at each time. This is easy to see within the context of the linear advection equation: Let u(x, t^n) be the continuous description of u at time t^n. Then, the amount of u advected to the neighboring downwind cell in a timestep Δt is simply given by
\Delta u = \int_{x_{i+1/2} - v\Delta t}^{x_{i+1/2}} u(x, t^n)\,{\rm d}x
(assuming that vΔt < Δx), and the time-averaged flux through the corresponding cell face is F_{i+1/2}^{n+1/2} = \Delta u/\Delta t. If the continuous u(x, t^n) is
known, this flux can be computed, and the advection equation (in integral form) can
be solved exactly. However, because of the discrete nature of sampling, we only know
u at finite positions xi , and the best we can hope to do is to approximate u, and
thus the corresponding flux. Once such approximations are introduced, the update
formula becomes a numerical scheme, called a Godunov scheme.
The left-hand panel of Fig. 31 shows the conditions for some particular u(x_i) at time t^n. Only 5 cells are shown, for the sake of clarity. The cells are assumed to have a constant distribution of u (i.e., we have made the piecewise constant assumption). Let us now focus on cell i, which straddles a discontinuity in u. Advection with a constant v > 0 simply implies that in a time step Δt the piecewise constant profile of u(x) shifts right-ward by an amount vΔt. This right-ward shift is indicated in the right-hand panel of Fig. 31 by the dashed lines. At the end of the time-step, i.e., at time t^{n+1}, we once again want the fluid to be represented in a piecewise constant fashion over the cells. This is accomplished, for cell i, by integrating u(x) under the dashed lines from x_{i-1/2} to x_{i+1/2} and dividing by Δx to obtain the new cell-averaged value U_i^{n+1}. The new U_i thus obtained are indicated by the solid lines in the right-hand panel. The U_i^{n+1} differs from U_i^n because some amount of u has flowed into cell i from cell i−1 (indicated by the light-gray shading), and some amount of u has flowed from i into cell i+1 (indicated by the dark-gray shading). The corresponding time-averaged fluxes obey
\Delta t\, F_{i-1/2}^{n+1/2} = (v\,\Delta t)\,U_{i-1}^n \,, \qquad \Delta t\, F_{i+1/2}^{n+1/2} = (v\,\Delta t)\,U_i^n
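In code, this piecewise constant Godunov update for linear advection with v > 0 is simply the donor-cell (upwind) scheme acting on the cell averages; a minimal sketch (assuming periodic boundaries; names are my own) is:

import numpy as np

def godunov_pcm_step(U, v, dt, dx):
    """One piecewise-constant Godunov step of du/dt + v du/dx = 0 for v > 0.
    The time-averaged interface flux is F_{i+1/2} = v * U_i."""
    F_right = v * U                  # flux through the right face of cell i
    F_left = np.roll(F_right, 1)     # flux through the left face (= v * U_{i-1})
    return U - (dt / dx) * (F_right - F_left)

U = np.where(np.abs(np.linspace(0, 1, 100) - 0.5) < 0.1, 1.0, 0.0)
U = godunov_pcm_step(U, v=1.0, dt=0.001, dx=0.01)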
Figure 31: A single time step in the linear advection of a fluid modelled using piece-
wise constant reconstruction. The left-hand panel (a) shows the conditions at time
tn . The right-hand panel (b) shows the slabs of fluid after they have been advected for
a time t (dashed lines) as well as the final profile of u(x) at the end of the time step
(solid lines). The total amount of fluid entering (leaving) cell i is shaded light-gray
(dark-gray). [Figure adapted from Prof. D. Balsara’s lecture notes on ”Numerical
PDE Techniques for Scientists and Engineers”].
So, one might wonder, what is so 'special' about this Godunov scheme? Well, the ingenious aspect of Godunov's method is that it yields an upwind scheme for a general, non-linear system of hyperbolic PDEs. For a linear system of equations, upwind schemes can only be used if all velocities of all waves in the problem (recall
Figure 32: Same as Fig. 31, but this time piecewise linear reconstruction is used,
based on right-sided slopes (thus giving rise to the Lax-Wendro↵ scheme). Note how
the linear reconstruction has introduced a new, higher-than-before, extremum in cell
i + 1, which is ultimately responsible for the spurious oscillations characteristic of
second-order schemes. As discussed in the text, the solution is to develop a Total
Variation Diminishing (TVD) scheme with the use of slope-limiters. [Figure adapted
from Prof. D. Balsara’s lecture notes on ”Numerical PDE Techniques for Scientists
and Engineers”].
that hyperbolic PDEs describe travelling waves) have the same sign. If mixed signs are present, one can typically split the flux F(u) into two components, F⁺ and F⁻, which correspond to the fluxes in opposite directions. This is called Flux Vector Splitting. The linearity of the PDE(s) then assures that the solution is simply given by the sum of the solutions obtained for F⁺ and F⁻ separately. However, for a non-linear system (we will encounter such systems in the next chapter) this approach will not work. This is where Godunov's method really brings its value to bear.
For now, though, we apply it to the 1D linear advection equation, in which case it simply becomes identical to the first-order accurate FTBS scheme. And as we have already seen, this scheme suffers from a large amount of numerical diffusion. But, within the Godunov scheme, we can now try to overcome this by going to higher order. In terms of reconstruction, this implies going beyond piecewise constant reconstruction.
The logical next-order step in reconstruction is to assume that within each cell u(x) follows a linear profile, with a slope that is determined by the values of U at its neighboring cells. This is called piecewise linear reconstruction. As always, we have three choices for the slope: a right-sided finite difference ΔU_i^n = U_{i+1}^n − U_i^n, a left-sided difference ΔU_i^n = U_i^n − U_{i-1}^n, and a central difference ΔU_i^n = (U_{i+1}^n − U_{i-1}^n)/2. In what follows we shall refer to ΔU_i^n as the slope, even though it really is only an undivided difference.
The left-hand panel of Fig. 32 shows the same mesh function as in Fig. 31, but this time the dashed lines indicate the reconstructed profile based on the right-sided slopes. For cells i−2, i+1 and i+2, this right-sided slope is zero, and the reconstruction is thus identical to that for the piecewise constant case. However, for cells i−1 and i the reconstruction has endowed the cells with a non-zero slope. For example, the profile of u(x) in cell i is given by
u_i^n(x) = U_i^n + \frac{\Delta U_i^n}{\Delta x}\left(x - x_i\right)
where x_i is the central position of cell i. It is easy to see (the student should do this) that, upon substitution of this profile in the integral expression for U_i^n, one recovers the cell average U_i^n, as required.
Advecting the fluid with second-order accuracy is equivalent to shifting the piecewise linear profile rightwards by a distance vΔt. The resulting, shifted profile is shown in the right-hand panel of Fig. 32. As in Fig. 31, the light-gray and dark-gray shaded regions indicate the amount of u that is entering cell i from cell i−1, and leaving cell i towards i+1, respectively. With a little algebra, one finds that the associated time-averaged fluxes obey
\Delta t\, F_{i-1/2}^{n+1/2} = (v\,\Delta t)\left[U_{i-1}^n + \frac{1}{2}\left(1 - \alpha_c\right)\Delta U_{i-1}^n\right]
and
\Delta t\, F_{i+1/2}^{n+1/2} = (v\,\Delta t)\left[U_i^n + \frac{1}{2}\left(1 - \alpha_c\right)\Delta U_i^n\right]
As before, invoking conservation of u then implies that
U_i^{n+1} = U_i^n - \alpha_c\left(U_i^n - U_{i-1}^n\right) - \frac{\alpha_c}{2}\left(1 - \alpha_c\right)\left(\Delta U_i^n - \Delta U_{i-1}^n\right)
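A Python sketch of this piecewise linear update (again assuming periodic boundaries; the function and variable names are my own) makes the structure explicit; choosing right-sided, left-sided or centred slopes reproduces the Lax-Wendroff, Beam-Warming and Fromm schemes discussed next.

import numpy as np

def plm_step(U, alpha_c, slope="right"):
    """Second-order (piecewise linear) advection update for v > 0:
    U_i^{n+1} = U_i - alpha_c (U_i - U_{i-1})
                    - alpha_c/2 (1 - alpha_c) (dU_i - dU_{i-1})."""
    Up1, Um1 = np.roll(U, -1), np.roll(U, 1)
    if slope == "right":
        dU = Up1 - U                 # -> Lax-Wendroff
    elif slope == "left":
        dU = U - Um1                 # -> Beam-Warming
    else:
        dU = 0.5 * (Up1 - Um1)       # centred -> Fromm
    dUm1 = np.roll(dU, 1)
    return U - alpha_c * (U - Um1) - 0.5 * alpha_c * (1.0 - alpha_c) * (dU - dUm1)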
Comparing with the update formula in the piecewise constant case, we see that we have added an extra term that depends on the slopes. Hence, this is indeed a second-order scheme. By substituting the right-sided slopes adopted here, the update formula becomes identical to that of the Lax-Wendroff scheme that we encountered in Chapter 17, but with u_i replaced by U_i. Similarly, it is easy to show that using the left-sided slopes yields an update formula equal to that for the Beam-Warming scheme, while the central slopes yield an update formula identical to Fromm's scheme.
Finite volume reconstruction methods and their link to finite difference schemes.
As we have seen in Chapter 17, these second-order accurate schemes all give rise to
large oscillations; large over- and undershoots. And as we know from Godunov’s
theorem, these schemes are not positivity-conserving. Fig. 32 makes it clear where
these problems come from. Advection of the linearly reconstructed u(x) has caused a spurious overshoot in cell i−1 at time t^{n+1}. Upon inspection, it is clear that this overshoot arises because our reconstruction has introduced values for u(x) that are higher than any u_i present at t^n. Once such an unphysical extremum has been introduced, it has a tendency to grow in subsequent time-steps. Using the centered slopes would cause a similar overshoot (albeit somewhat smaller), while the left-sided slopes would result in an undershoot in cell i+1.
This insight shows us that the over- and under-shoots have their origin in the fact that the linear reconstruction introduces new extrema that were not present initially. The solution, which was originally suggested by Bram van Leer, is to limit the piecewise linear profile within each cell such that no new extrema are introduced. This is accomplished by introducing slope-limiters (or, very similarly, flux-limiters). The idea is simple: limit the slopes ΔU_i^n such that no new extrema are introduced. Over the years, many different slope-limiters have been introduced by the computational fluid dynamics community. All of these use some combination of the left- and right-sided slopes defined above. An incomplete list of slope-limiters is presented in the Table below.
Here Δ_L and Δ_R are the left- and right-sided slopes, and Q(Δ_L, Δ_R) = \frac{1}{2}\left[{\rm sgn}(\Delta_L) + {\rm sgn}(\Delta_R)\right], with sgn(x) the sign-function, defined as +1 for x ≥ 0 and −1 for x < 0.
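As an illustration, here are Python versions of a few limiters that are in common use (minmod, monotonized central, and superbee), given as representative examples rather than as a transcription of the Table. Each returns a limited slope given the left- and right-sided slopes Δ_L and Δ_R, and each returns zero whenever the two differ in sign, i.e., at a local extremum, so that no new extrema are created.

import numpy as np

def minmod(dL, dR):
    """Smallest-magnitude slope; zero when dL and dR differ in sign."""
    s = 0.5 * (np.sign(dL) + np.sign(dR))       # the Q(Delta_L, Delta_R) of the text
    return s * np.minimum(np.abs(dL), np.abs(dR))

def mc(dL, dR):
    """Monotonized central (MC) limiter."""
    s = 0.5 * (np.sign(dL) + np.sign(dR))
    return s * np.minimum.reduce([0.5 * np.abs(dL + dR),
                                  2.0 * np.abs(dL), 2.0 * np.abs(dR)])

def superbee(dL, dR):
    """Superbee limiter; the least diffusive of the three."""
    s = 0.5 * (np.sign(dL) + np.sign(dR))
    return s * np.maximum(np.minimum(2.0 * np.abs(dL), np.abs(dR)),
                          np.minimum(np.abs(dL), 2.0 * np.abs(dR)))

The limited slope is then used in place of ΔU_i^n in the piecewise linear update given earlier.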
Figure 33: The result of using Piecewise Linear Reconstruction combined with 4 different slope limiters (as indicated at the top of each panel) to linearly advect the initial conditions shown in red. As in Fig. 30, the advection speed is v = 1.0, and 100 grid points are used to sample u(x) over the x-interval [0, 1], assuming periodic boundary conditions. The Courant parameter is α_c = 0.5. The blue curves show the results after 1000 time steps of Δt = 0.001, which covers exactly one full period. Note the drastic improvement compared to the finite difference schemes used in Fig. 30!!
Fig. 33 shows the results of applying our Piecewise Linear Reconstruction with four different slope-limiters (as indicated) to the 1D linear advection of the initial conditions indicated by the red top-hat. The blue curves show the results obtained after one period (using periodic boundary conditions) using an advection speed v = 1.0, a mesh with 100 grid points on the domain x ∈ [0, 1], and a Courant parameter α_c = 0.5. As can be seen, all limiters produce oscillation-free propagation of the top-hat profile, and with a numerical diffusion that is much smaller than in the case of the upwind finite difference scheme used in Chapter 17 (i.e., compare Fig. 33 to the results in Fig. 30). Clearly, using a finite volume formulation with non-linear hybridization, in the form of piecewise linear reconstruction with slope limiters, has drastically improved our ability to advect discontinuous features in the fluid.
What is it that makes these slope-limiters so successful? In short, the reason is that they are total variation diminishing, or TVD for short. The total variation, TV, of a discrete set U = {U_1, U_2, ..., U_N} is defined as
{\rm TV}(U^n) \equiv \sum_{i=1}^{N} \left|U_{i+1}^n - U_i^n\right|
and an integration scheme is said to be TVD iff the total variation does NOT increase with time, i.e.,
{\rm TVD} \;\Leftrightarrow\; {\rm TV}(U^{n+1}) \leq {\rm TV}(U^n)
Clearly, whenever a scheme introduces spurious oscillations, the TV will go up, violating the TVD-condition. Or, put differently, if a scheme is TVD, then it will not allow for the formation of spurious over- and/or under-shoots. Readers interested in finding a quick method to test whether a scheme is TVD are referred to the paper "High Resolution Schemes for Conservation Laws" by Harten (1983) in the Journal of Computational Physics. Here we merely point out that the schemes used in this chapter are all TVD.
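A direct numerical check of the TVD property is straightforward: compute TV(U) before and after each update and verify that it does not increase. A short sketch (periodic boundaries assumed; my own naming):

import numpy as np

def total_variation(U):
    """TV(U) = sum_i |U_{i+1} - U_i| (periodic boundaries)."""
    return np.sum(np.abs(np.roll(U, -1) - U))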
This begs the question: why can't we use a continuous reconstruction, i.e., connect all the u_i (i = 1, 2, ..., N) using, say, an N-th order polynomial? That would assure smoothness and differentiability across the entire computational domain. However, this is not an option, for the simple reason that, as we will see in the next Chapter, discontinuities can be real. An obvious example is a shock, which is a natural outcome of the Euler equations due to their non-linear character. Using continuous reconstruction would fail to capture such discontinuities.
As we will see, the solution is to use Godunov schemes that rely on piecewise reconstruction (be it constant, linear or parabolic) and Riemann solvers to compute the fluxes across the resulting discontinuities between adjacent cells. Before we examine this approach in detail, though, we first take a closer look at non-linearity.
CHAPTER 20
The linear advection equation is an ideal test case because its solution is trivially
known (or can be derived using the method of characteristics discussed below).
In this chapter we are going to consider another equation, which appears very similar
to the linear advection equation, except that it is non-linear. As before, we consider
the 1D case, and we ignore radiation and gravity. But rather than assuming both
P and ~u to be constant, we only assume a constant pressure. This implies that the
continuity equation is given by
\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} = \frac{\partial \rho}{\partial t} + \rho\,\frac{\partial u}{\partial x} + u\,\frac{\partial \rho}{\partial x} = 0
while the momentum equation reduces to
\frac{\partial (\rho u)}{\partial t} + \frac{\partial}{\partial x}\left[\rho u u + P\right] = \rho\,\frac{\partial u}{\partial t} + u\,\frac{\partial \rho}{\partial t} + \rho u\,\frac{\partial u}{\partial x} + u\,\frac{\partial (\rho u)}{\partial x} = 0
where we have used that @P/@x = 0. Multiplying the continuity equation with u
and subtracting this from the momentum equation yields
\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} = 0
This equation is known as Burgers’ equation. Unlike the similar looking advection
equation, this is a non-linear equation. In fact, it is one of the few non-linear PDEs
for which an analytical solution can be found for a select few ICs (see below). The
importance of Burgers’ equation is that it highlights the quintessential non-linearity
of the Euler equations.
Note that the above form of Burgers' equation is not in conservative form. Rather, this form is called quasi-linear. However, it is trivial to recast Burgers' equation in conservative form:
\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left(\frac{1}{2}u^2\right) = 0
Let's devise finite difference upwind schemes for both (assuming u > 0). The results are shown in the table below.

quasi-linear:  \frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} = 0 \qquad\Rightarrow\qquad u_i^{n+1} = u_i^n - \frac{\Delta t}{\Delta x}\,u_i^n\left[u_i^n - u_{i-1}^n\right]

conservative:  \frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left(\frac{1}{2}u^2\right) = 0 \qquad\Rightarrow\qquad u_i^{n+1} = u_i^n - \frac{\Delta t}{2\,\Delta x}\left[(u_i^n)^2 - (u_{i-1}^n)^2\right]
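A minimal Python sketch of the two upwind schemes in the table (valid only for u > 0 everywhere, with periodic boundaries; names are my own):

import numpy as np

def burgers_step(u, dt, dx, conservative=True):
    """One upwind step of Burgers' equation (assumes u > 0)."""
    um1 = np.roll(u, 1)                               # u_{i-1}
    if conservative:
        return u - dt / (2.0 * dx) * (u**2 - um1**2)  # flux form, f(u) = u^2/2
    return u - dt / dx * u * (u - um1)                # quasi-linear form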
Let's use these two schemes to numerically solve Burgers' equation on the domain x ∈ [0, 1] (using periodic boundary conditions) for an initial velocity field u(x, 0) given by a Gaussian centered at x = 0.5, and with a dispersion equal to σ = 0.1. The initial density is assumed to be uniform. The results for a Courant parameter α_c = 0.5 are shown in Fig. 34, where the initial conditions are shown in red, the results from the quasi-linear scheme in magenta (dashed) and the results from the conservative scheme in blue (solid). Note how, in the region where ∂u/∂x > 0, a rarefaction wave develops, causing a reduction in the density. Over time, this rarefied region grows larger and larger. In the region where ∂u/∂x < 0 a compression wave forms, which steepens over time. As discussed in Chapter 13, because of the non-linear nature of the Euler equations such waves steepen to give rise to shocks, representing discontinuities in flow speed.
Note, though, that at late times the numerical schemes based on the conservative and quasi-linear forms of Burgers' equation yield different predictions for the location of this shock. As it turns out, and as we demonstrate explicitly below, the correct prediction is that coming from the conservative form. This highlights the importance of using a conservative scheme, which is expressed by the following theorem:
Figure 34: Evolution as governed by Burgers' equation for an initial, uniform density with the 1D velocity field given by the red Gaussian. Left and right-hand panels show the evolution in density and velocity, respectively. Red lines indicate the initial conditions, while blue (solid) and magenta (dashed) lines indicate the numerical results obtained using the conservative and quasi-linear equations, respectively. Both are solved using the upwind scheme with a Courant parameter α_c = 0.5, and sampling the x = [0, 1] domain using 100 grid points. Note how a shock develops due to the non-linear nature of Burgers' equation, but that the location of the shock differs in the two schemes. Only the conservative scheme yields the correct answer.
Lax-Wendro↵ theorem: If the numerical solution of a conservative scheme con-
verges, it converges towards a weak solution.
In order to develop some understanding of the shock and rarefaction, we are going
to solve Burgers’ equation analytically using the ‘method of characteristics’, which
is a powerful method to solve hyperbolic PDEs.
Let the ICs of Burgers’ equation be given by the initial velocity field u(x, 0) = f (x).
Now consider an ‘observer’ moving with the flow (i.e., an observer ‘riding’ a fluid
element). Let x(t) be the trajectory of this observer. At t = 0 the observer is located
at x0 and has a velocity u0 = f (x0 ). We want to know how the velocity of the
observer changes as function of time, i.e., along this trajectory. Hence, we want to
know
\frac{{\rm d}u}{{\rm d}t} = \frac{\rm d}{{\rm d}t}\,u(x(t), t) = \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x}\,\frac{{\rm d}x}{{\rm d}t}
(For an elementary introduction to the method of characteristics, see https://www.youtube.com/watch?v=tNP286WZw3o)
Figure 35: Solving the Burgers equation for the initial conditions indicated by the blue curve at t = 0 using the method of characteristics. The red lines are characteristics; lines along which the velocity remains fixed to the initial value. Note the formation of a rarefaction fan, where the method of characteristics fails to provide a solution, and the formation of a shock wherever characteristics collide.
We see that this equation is equal to the Burgers' equation that we seek to solve if dx/dt = u(x, t), and in that case we thus have that du/dt = 0. Hence, we see that solving the Burgers equation (a quasi-linear, first-order PDE) is equivalent to solving the ODE du/dt = 0 along characteristic curves (characteristics) given by dx/dt = u(x, t). The solution is simple: u(x, t) = u_0(x_0, 0) = f(x_0), where x_0 = x − u_0 t. Hence, this can be solved implicitly: for given x and t, find the x_0 that solves x_0 = x − f(x_0)\,t. Then, the instantaneous velocity at (x, t) is given by f(x_0).
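Before a shock forms, this implicit relation can be solved numerically for each (x, t) with a standard root finder. A brief sketch (my own, using scipy's brentq and assuming the Gaussian initial condition of Fig. 34):

import numpy as np
from scipy.optimize import brentq

def f0(x0):
    """Initial velocity field u(x, 0): a Gaussian with sigma = 0.1 at x = 0.5."""
    return np.exp(-0.5 * ((x0 - 0.5) / 0.1)**2)

def u_characteristics(x, t):
    """Solve x0 = x - f0(x0)*t for x0 and return u(x, t) = f0(x0).
    Only valid before characteristics cross (i.e., before the shock forms)."""
    g = lambda x0: x0 + f0(x0) * t - x
    x0 = brentq(g, x - t - 1.0, x + 1.0)    # bracket works since 0 <= f0 <= 1
    return f0(x0)

print(u_characteristics(0.6, 0.05))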
Fig. 35 illustrates an example. The blue curve indicates the initial conditions; i.e., the initial velocity as function of position x at t = 0. It shows a sudden jump (increase) in velocity at x = x_1 and a sudden decrease at x = x_2. The red lines are characteristics, i.e., lines along which the velocity remains constant; their slope is the inverse of the initial velocity at the location x_0 where they cross the t = 0 axis. From point x_1, a rarefaction fan emanates, which corresponds to a region where the density will decline since neighboring elements spread apart. The method of characteristics does not give a solution in this regime, simply because no characteristics enter here; the solution in this regime turns out to be a linear interpolation between the beginning and end-point of the fan at a given t. From point x_2 a shock emanates. Here characteristics from x_0 < x_2 'merge' with characteristics emerging from x_0 > x_2. When two characteristics meet, they stop and a discontinuity in the solution emerges (which manifests as a shock).
Figure 36: Initial conditions of u(x) (top panels), and the corresponding characteris-
tics (bottom panels). Note the formation of shocks, and, in the right-hand panel, of
a rarefaction wave. Clearly, characteristics give valuable insight into the solution of
a hyperbolic PDE.
Figure 37: Evolution of a shock wave in velocity. The initial discontinuity in u(x) at x = 0.5 introduces a shock wave which propagates to the right. The solid red curves show the analytical solution (a shock propagating at u_shock = [u(x < 0.5) + u(x > 0.5)]/2), while the red dotted curve shows the ICs. As in Fig. 34, the blue and magenta curves indicate the numerical solutions obtained using the conservative and quasi-linear schemes, respectively. Note how the latter fails to reproduce the correct shock speed.
This is further illustrated in Fig. 36. The upper panels show the initial conditions, with the little bar under each panel indicating, with small line-segments, the velocity (as reflected by the slope of the line-segment) as a function of position. The lower panels plot the characteristics (in a t vs. x plot). Where characteristics merge, a shock forms. It is apparent from the left-hand panels that the shock in this case will propagate with a speed that is simply the average of the upwind and downwind velocities, i.e., u_shock = [u(x_2^+) + u(x_2^-)]/2. The right-hand panels show the characteristics in the case of the Gaussian ICs also considered in Fig. 34; note how one can see the formation of both a rarefaction fan as well as a shock.
The example shown in the left-hand panels of Fig. 36 presents us with a situation
in which the shock speed is known analytically. We can use this as a test-case
to determine which of our numerical schemes (quasi-linear vs. conservative) best
reproduces this. We set up ICs in which u(x) = 1.0 for x < 0.5 and u(x) = 0.2
for x > 0.5. We solve this numerically using both schemes (with α_c = 0.5), and compare the outcome to our analytical solution (the shock is moving right-ward with a speed v_shock = (1.0 + 0.2)/2 = 0.6). The results are shown in Fig. 37. Note how the solutions from the conservative scheme (in blue) nicely overlap with the analytical solution (in red), while that from the quasi-linear scheme (in magenta) trails behind. This demonstrates the Lax-Wendroff theorem, and makes it clear that conservative schemes are required to correctly model the propagation of shocks.
CHAPTER 21
Thus far, rather than trying to numerically solve the full set of hydrodynamics equations, we instead considered two very special, much simpler cases, namely the linear advection equation and the non-linear Burgers' equation, both in 1D. We derived these equations from the set of hydro-equations by assuming an ideal fluid, ignoring gravity and radiation, by assuming constant pressure, and, in the case of the advection equation, also constant velocity. Clearly these are highly simplified cases, but they have the advantage that analytical solutions exist, thus allowing us to test our numerical schemes.
We have seen, though, that numerically solving even these super-simple PDEs using finite difference schemes is far from trivial. First-order schemes, if stable, suffer from significant numerical diffusion, while second-order schemes have a tendency to develop oscillations. As we will discuss in this chapter, and briefly touched upon in Chapter 19, the way forward is to use Godunov schemes with Riemann solvers.
In this Chapter we are going to see how to apply Godunov schemes to the (1D) Euler equations. This will involve Riemann solvers, which are numerical schemes for solving Riemann problems, which describe the evolution of a discontinuity in fluid properties. We will discuss how to solve a Riemann problem, apply it to the Sod shock tube, and then end by briefly discussing approximate Riemann solvers.
Let us first give a brief review of the basics behind the Godunov scheme. In the absence of source/sink terms (i.e., gravity and radiation), and ignoring viscosity and conduction (which add parabolic terms), the hydrodynamic equations reduce to a set of hyperbolic PDEs that can be written in conservation form as
\frac{\partial \vec{u}}{\partial t} + \nabla \cdot \vec{f}(\vec{u}) = 0
(see Chapter 17). The update formula for this equation, in the Finite Volume formulation, is given by
U_i^{n+1} = U_i^n - \frac{\Delta t}{\Delta x}\left(F_{i+1/2}^{n+1/2} - F_{i-1/2}^{n+1/2}\right)
with
F_{i-1/2}^{n+1/2} \equiv \frac{1}{\Delta t}\int_{t^n}^{t^{n+1}} f(x_{i-1/2}, t)\,{\rm d}t \,, \qquad F_{i+1/2}^{n+1/2} \equiv \frac{1}{\Delta t}\int_{t^n}^{t^{n+1}} f(x_{i+1/2}, t)\,{\rm d}t
Reconstruction basically means that one models the continuous u(x, t) from the discrete u_i^n on the mesh. This means that, for x_{i-1/2} ≤ x ≤ x_{i+1/2}, one assumes that
u(x, t^n) = u_i^n \qquad {\rm (piecewise\ constant)}
u(x, t^n) = u_i^n + \frac{\Delta U_i^n}{\Delta x}\left(x - x_i\right) \qquad {\rm (piecewise\ linear)}
In the latter, ΔU_i^n is a slope, which can be computed centred, left-sided or right-sided. Once such a reconstruction scheme is adopted, one can compute the U_i^n using
the integral expression given above. Next, the Godunov schemes use (approximate) Riemann solvers to infer the (time-averaged) fluxes F_{i-1/2}^{n+1/2} and F_{i+1/2}^{n+1/2}. The idea is that reconstruction (be it piecewise constant, piecewise linear or piecewise parabolic) leaves discontinuities between adjacent cells. Godunov's insight was to treat these as 'real' and to solve them analytically as Riemann problems. That implies that one now has, at each cell-interface, a solution for u(x, t), which can be integrated over time to infer F_{i-1/2}^{n+1/2} and F_{i+1/2}^{n+1/2}. Next, one uses the update formula to compute U_i^{n+1}, and one proceeds cell-by-cell, and time-step by time-step. In what follows we take a closer look at this Riemann problem and how it may be solved.
The solution of the Riemann problem, i.e., the time-evolution of this discontinuous initial state, can comprise
• 0, 1 or 2 shocks
• 0, 1 or 2 rarefaction waves
• 0 or 1 contact discontinuity (entropy jump)
but the total number of shocks plus rarefaction fans cannot exceed two. All these shocks, entropy jumps and rarefaction waves appear as characteristics in the solution. In particular, the velocities of the features are given by the eigenvalues of the Jacobian matrix of the flux function (called the characteristic matrix), which is given by
A_{ij}(\vec{q}\,) \equiv \frac{\partial f_i}{\partial q_j}
where \vec{f}(\vec{q}\,) is the flux in the Euler equations in conservative form.
Figure 38: Initial conditions for the Sod shock tube. The left region has the higher
pressure (i.e., PL > PR ) and is therefore called the driven section, while the region
on the right is called the working section. The two regions are initially separated by
a diaphragm (in blue), which is instantaneously removed at t = 0. Both the fluid on
the left and right are assumed to be ideal fluids with an ideal equation of state.
The solution for a completely general Riemann problem can be tedious, and will not be discussed here. Rather, we will look at a famous special case, the Sod shock tube problem, named after Gary Sod who discussed this case in 1978. It is a famous example of a 1D Riemann problem for which the solution is analytical, and which is often used as a typical test-case for numerical hydro-codes.
The shock tube is a long one-dimensional tube, closed at its ends and initially divided into two equal size regions by a thin diaphragm (see Fig. 38). Each region is filled with the same gas (assumed to have an ideal equation of state), but with different thermodynamic parameters (pressure, density and temperature). The gas to the left, called the driven section, has a higher pressure than that to the right, called the working section (i.e., P_L > P_R), and both gases are initially at rest (i.e., u_L = u_R = 0). At t = 0, the diaphragm, which we consider located at x = x_0, is instantaneously removed, resulting in a high speed flow which propagates into the working section. The high-pressure gas originally in the driven section expands, creating an expansion or rarefaction wave, and flows into the working section, pushing the gas ahead of it. The rarefaction is a continuous process and takes place inside a well-defined region, called the rarefaction fan, which grows in width with time (see also Chapter 20). The compression of the low-pressure gas results in a shock wave propagating into the working section. The expanded fluid (originally part of the driven section) is separated from the compressed gas (originally part of the working section) by a contact discontinuity, across which there is a jump in entropy. The velocities and pressures on both sides of the contact discontinuity, though, are identical (otherwise it would be a shock).
201
(L) (E) (2) (1) (R)
undisturbed
compressed,
undisturbed
left-going
post-rarefaction
high density,
post-shocked
low density,
rarefaction
gas
high pressure
gas low pressure
fan
gas gas
x1 x2 x0 x3 x4
Figure 39: Illustration of the different zones present in the Sod shock tube. The original diaphragm, which was removed at t = 0, was located at x_0, indicated by the dotted line. The solid line at x_4 marks the location of the right-going shock, while the dashed line at x_3 corresponds to a contact discontinuity. The region marked (E), between x_1 and x_2, indicates the left-going rarefaction fan. Regions (L) and (R) are not yet affected by the removal of the diaphragm and thus reflect the initial conditions Left and Right of x_0.
Fig. 39 illustrates the different zones at a time before either the shock wave or the rarefaction fan has been able to reach the end of the tube. Hence, the regions to the far left and far right are still in their original, undisturbed states, to which we refer as the 'L' and 'R' states, respectively. In between, we distinguish three different zones: a rarefaction fan 'E', a region of gas (region '2') that originally came from the driven section but has been rarefied due to expansion, and a region with gas (region '1') that originally belonged to the working section but that has been compressed (it has been overrun by the shock wave). Note that regions 1 and 2 are separated by a contact discontinuity (aka entropy jump).
Our goal is to compute ⇢(x, t), u(x, t), and P (x, t) in each zone, as well as the
locations x1 , x2 , x3 and x4 of the boundaries between each of these zones. This is a
typical Riemann problem. It can be solved using the method of characteristics,
but since we are focussing on numerical hydrodynamics here, we are not going to
give the detailed derivation; interested readers are referred to textbooks on this
topic. Another useful resource is the paper by Lora-Clavijo et al. 2013, Rev. Mex.
de Fisica, 29-50, which gives a detailed description of exact solutions to 1D Riemann
problems.
However, even without the method of characteristics, we can use the physical insight developed in these lecture notes to obtain most of the solution. This involves the following steps:
[1] First we realize that we can infer the conditions in region '1' from the known conditions in region 'R' using the Rankine-Hugoniot jump conditions (see Chapter 13) for a non-radiative shock. If we refer to the Mach number of the shock (to be derived below) as M_s ≡ u_s/c_{s,R}, with c_{s,R} = \sqrt{\gamma P_R/\rho_R} the sound speed in region 'R', then we have that
P_1 = P_R\left[\frac{2\gamma}{\gamma+1}\,M_s^2 - \frac{\gamma-1}{\gamma+1}\right]
\rho_1 = \rho_R\left[\frac{2}{\gamma+1}\,\frac{1}{M_s^2} + \frac{\gamma-1}{\gamma+1}\right]^{-1}
u_1 = \frac{2}{\gamma+1}\left(M_s - \frac{1}{M_s}\right) c_{s,R}
Note that for the latter, one first needs to convert to the rest-frame of the shock, in which the velocities in regions 'R' and '1' are given by u'_R = u_R − u_s = −u_s and u'_1 = u_1 − u_s. One then solves for u'_1 and converts to u_1. Finally, if needed, one can infer the temperature T_1 from T_R using the corresponding RH jump condition, according to which T_1/T_R = (P_1 \rho_R)/(P_R \rho_1).
[2] Having established the properties in zone '1', the next step is to infer the properties in zone '2'. Here we use that the velocity and pressure are constant across a contact discontinuity to infer that P_2 = P_1 and u_2 = u_1. For the density, we need to link it to ρ_L, which we can do using the fact that rarefaction is an adiabatic process, for which P ∝ ρ^γ. Hence, we have that ρ_2 = ρ_L (P_2/P_L)^{1/γ}.
[3] What remains is to compute the shock speed, u_s, or its related Mach number, M_s. This step is not analytical, though. Using insight that can be gained from the method of characteristics, not discussed here, one can infer that the Mach number is a solution to the following implicit, non-linear equation, which needs to be solved numerically using a root finder:
M_s - \frac{1}{M_s} = \frac{c_{s,L}}{c_{s,R}}\,\frac{\gamma+1}{\gamma-1}\left\{1 - \left[\frac{P_R}{P_L}\left(\frac{2\gamma}{\gamma+1}\,M_s^2 - \frac{\gamma-1}{\gamma+1}\right)\right]^{(\gamma-1)/(2\gamma)}\right\}
with c_{s,L} = \sqrt{\gamma P_L/\rho_L} the sound speed in region 'L'. Once the value of M_s has been determined, it can be used in steps [1] and [2] to infer all the parameters of (uniform) zones 1 and 2.
[4] To determine the internal structure of the rarefaction fan, one once again has to rely on the method of characteristics. Without any derivation, we simply give the solution:
u(x) = \frac{2}{\gamma+1}\left[c_{s,L} + \frac{x - x_0}{t}\right]
c_s(x) = c_{s,L} - \frac{1}{2}\left(\gamma - 1\right)u(x)
P(x) = P_L\left[\frac{c_s(x)}{c_{s,L}}\right]^{2\gamma/(\gamma-1)}
\rho(x) = \gamma\,\frac{P(x)}{c_s^2(x)}
[5] Finally we need to determine the locations of the zone boundaries, indicated by x_1, x_2, x_3 and x_4 (see Fig. 39). The shock wave is propagating with speed u_s = M_s c_{s,R}. The contact discontinuity is propagating with a speed u_2 = u_1. The far left-edge of the rarefaction wave is propagating (leftward) with the sound speed in zone L. And finally, from the method of characteristics, one infers that the right-edge of the rarefaction zone is propagating with speed u_2 − c_{s,2}. Hence, we have that
x_1 = x_0 - c_{s,L}\,t
x_2 = x_0 + (u_2 - c_{s,2})\,t
x_3 = x_0 + u_2\,t
x_4 = x_0 + u_s\,t
which completes the 'analytical' solution to the Sod shock tube problem. Note that the word analytical is in single quotation marks. This is to highlight that the solution is not truly analytical, in that it involves a numerical root-finding step!
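The root-finding step in [3] is easily carried out with a standard routine. A sketch using scipy's brentq, for the initial conditions adopted below (variable names are my own):

import numpy as np
from scipy.optimize import brentq

gamma = 1.4
rhoL, PL = 8.0, 10.0 / gamma
rhoR, PR = 1.0, 1.0 / gamma
csL, csR = np.sqrt(gamma * PL / rhoL), np.sqrt(gamma * PR / rhoR)

def shock_mach_residual(Ms):
    """Implicit relation of step [3]; its root is the shock Mach number."""
    P1_over_PL = (PR / PL) * (2.0 * gamma / (gamma + 1.0) * Ms**2
                              - (gamma - 1.0) / (gamma + 1.0))
    rhs = (csL / csR) * (gamma + 1.0) / (gamma - 1.0) \
          * (1.0 - P1_over_PL**((gamma - 1.0) / (2.0 * gamma)))
    return Ms - 1.0 / Ms - rhs

Ms = brentq(shock_mach_residual, 1.0001, 20.0)
print(f"shock Mach number Ms = {Ms:.4f}, shock speed us = {Ms * csR:.4f}")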
Fig. 40 shows this analytical solution at t = 0.2 for a Sod shock tube problem with γ = 1.4 and the following (unitless) initial conditions:
ρ_L = 8.0 \qquad ρ_R = 1.0
P_L = 10/γ \qquad P_R = 1/γ
u_L = 0.0 \qquad u_R = 0.0
Figure 40: Analytical solution to the SOD shock tube problem at t = 0.2. Note that
the pressure and velocity are unchanged across the discontinuity (entropy jump) at
x3 , while the density is clearly discontinuous.
In what follows, we develop a simple 1D numerical hydro code to integrate this same Sod shock tube problem, which we can then compare with our 'analytical' solution. The code will use different Godunov schemes to do so.
As discussed above, Godunov’s method, and its higher order modifications, require
solving the Riemann problem at every cell boundary and for every time step. This
amounts to calculating the solution in the regions between the left- and right-moving
waves (i.e., zones ‘E’, ‘1’, and ‘2’ in the case of the Sod shock tube), as well as the
speeds of the various waves (shock wave(s), rarefaction wave(s), and entropy jumps) involved. The solution of the general Riemann problem cannot be given in a closed analytic form, even for 1D Newtonian flows (recall that even for the Sod shock tube a numerical root-finding step is required). What can be done is to find the answer
numerically, to any required accuracy, and in this sense the Riemann problem is said
to have been solved exactly, even though the actual solution is not analytical.
However, mainly because of the iterations needed to solve the Riemann problem, the
Godunov scheme as originally envisioned by Godunov, which involves using an exact
Riemann solver at every cell-interface, is typically far too slow to be practical. For
that reason, several approximate Riemann solvers have been developed. These
can be divided into approximate-state Riemann solvers, which use an approximation
for the Riemann states and compute the corresponding flux, and approximate-flux
Riemann solvers, which approximate the numerical flux directly.
Here we highlight one of these approximate Riemann solvers: the HLL(E) method, after Harten, Lax & van Leer, who proposed this method in 1983, and which was later improved by Einfeldt (1988). The HLL(E) method is an approximate-flux Riemann solver, which assumes that the Riemann solution consists of just two waves separating three constant states: the original L and R states which border an intermediate 'HLL' state. It is assumed that, after the decay of the initial discontinuity of the local Riemann problem, the waves propagate in opposite directions with velocities S_L and S_R, generating a single state (assumed constant) between them. S_L and S_R are the smallest and the largest of the signal speeds arising from the solution of the Riemann problem. The simplest choice is to take the smallest and the largest among the eigenvalues of the Jacobian matrix ∂f_i/∂q_j evaluated at some intermediate (between L and R) state. For the 1D Euler equations that we consider here, one obtains reasonable results if one simply approximates these as S_L = u_L − c_{s,L} and S_R = u_R + c_{s,R}, where u_L and u_R are the initial fluid velocities in the L and R states, and c_{s,L} and c_{s,R} are the corresponding sound speeds. Without going into any detail, the HLL(E) flux to be used in the Godunov scheme is given by
\vec{F}_{i-1/2}^{\rm HLL} = \vec{F}_L \quad {\rm if}\ S_L \geq 0 \,, \qquad \vec{F}_{i-1/2}^{\rm HLL} = \frac{S_R\,\vec{F}_L - S_L\,\vec{F}_R + S_L S_R\,(\vec{q}_R - \vec{q}_L)}{S_R - S_L} \quad {\rm if}\ S_L < 0 < S_R \,, \qquad \vec{F}_{i-1/2}^{\rm HLL} = \vec{F}_R \quad {\rm if}\ S_R \leq 0
where \vec{F}_L = \vec{f}(\vec{q}_L) and \vec{F}_R = \vec{f}(\vec{q}_R). Here we have made it explicit that the flux is a vector, where each element refers to the corresponding elements of \vec{q} and \vec{f}(\vec{q}\,) of the Euler equations in conservation form. Note that the L and R states here refer to mesh cells i−1 and i, respectively. In the case of the F_{i+1/2}^{n+1/2} flux, which is needed in the Godunov scheme together with F_{i-1/2}^{n+1/2}, the L and R states refer to mesh cells i and i+1.
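In code, the HLL(E) flux is a direct transcription of the expression above. A sketch for the 1D Euler equations (with my own helper for the physical flux; q = (rho, rho*u, E)):

import numpy as np

def euler_flux(q, gamma=1.4):
    """Physical flux f(q) for the 1D Euler equations, q = (rho, rho*u, E)."""
    rho, mom, E = q
    u = mom / rho
    P = (gamma - 1.0) * (E - 0.5 * rho * u**2)
    return np.array([mom, mom * u + P, (E + P) * u])

def hll_flux(qL, qR, gamma=1.4):
    """HLL(E) approximate Riemann flux between left state qL and right state qR."""
    FL, FR = euler_flux(qL, gamma), euler_flux(qR, gamma)
    uL, uR = qL[1] / qL[0], qR[1] / qR[0]
    csL = np.sqrt(gamma * (gamma - 1.0) * (qL[2] / qL[0] - 0.5 * uL**2))
    csR = np.sqrt(gamma * (gamma - 1.0) * (qR[2] / qR[0] - 0.5 * uR**2))
    SL, SR = uL - csL, uR + csR            # simple wave-speed estimates
    if SL >= 0.0:
        return FL
    if SR <= 0.0:
        return FR
    return (SR * FL - SL * FR + SL * SR * (qR - qL)) / (SR - SL)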
A simple 1D hydro-code: We are now ready to write our own simple 1D numerical
hydro-code (adopting an adiabatic EoS), which we can test against the (analytical)
solution of the Sod shock tube problem examined above. What follows are some of
the steps that you may want to follow in writing your own code:
• Define an array q(1 : Nx, 0 : Nt, 1 : 3) to store the discrete values of the vector
~q = (⇢, ⇢u, E)t of conserved quantities on the spatial mesh xi with i = 1, .., Nx
and at discrete time tn with n = 0, 1, ..., Nt.
• Write a subroutine that computes the primary variables, ⇢(1 : Nx), u(1 : Nx)
and P (1 : Nx), given the conserved variables ~q . This requires computing the
pressure, which follows from E = P/( 1) 12 ⇢u2 . Also compute the local
p
sound speed cs (1 : Nx) = P/⇢ (which is needed in the HLL(E) scheme).
• Write a subroutine that, given the primary variables, computes the time step
n
t = ↵c ( x/|vmax |). Here ↵c < 1 is the user-supplied value for the Courant
parameter, and |vmax | = MAXi [|uni | + cs (xi )] denotes the maximum velocity
n
• Write a subroutine that, given the array q(1 : Nx, n, 1 : 3), computes the corresponding
fluxes \vec{f}(\vec{q}) at time t^n, and stores these in f(1 : Nx, 1 : 3).
• Each time step, (i) compute the primary variables, (ii) compute the time step
\Delta t, (iii) compute the fluxes f(1 : Nx, 1 : 3), (iv) compute the Godunov fluxes
F^{n+1/2}_{i-1/2} and F^{n+1/2}_{i+1/2} (this depends on the scheme used), and (v) update q using
the Godunov update scheme:

q(i, n+1, 1:3) = q(i, n, 1:3) - \frac{\Delta t}{\Delta x}\left[ F^{n+1/2}_{i+1/2}(1:3) - F^{n+1/2}_{i-1/2}(1:3) \right]
• Loop over time steps until the total integration time exceeds the user-defined
time, and output the mesh of primary variables at the required times.
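A minimal sketch of such a driver, written in Python rather than with the Fortran-style arrays used above, might look as follows. It reuses primitives, euler_flux and hll_flux from the earlier sketch, and assumes the standard Sod initial conditions (\rho, u, P) = (1, 0, 1) for x < 0.5 and (0.125, 0, 0.1) otherwise; whether these are exactly the ICs used for Fig. 40 is an assumption of this sketch.

```python
def run_sod(Nx=65, t_end=0.2, alpha_c=0.8):
    """First-order Godunov scheme with HLL(E) fluxes for the Sod shock tube."""
    x = np.linspace(0.0, 1.0, Nx)
    dx = x[1] - x[0]
    rho = np.where(x < 0.5, 1.0, 0.125)          # assumed Sod initial conditions
    P = np.where(x < 0.5, 1.0, 0.1)
    u = np.zeros(Nx)
    q = np.stack([rho, rho * u, P / (GAMMA - 1.0) + 0.5 * rho * u**2], axis=-1)

    t = 0.0
    while t < t_end:
        _, vel, _, cs = primitives(q)                        # step (i)
        dt = min(alpha_c * dx / np.max(np.abs(vel) + cs),    # step (ii): Courant condition
                 t_end - t)
        F = hll_flux(q[:-1], q[1:])                          # steps (iii)+(iv): interface fluxes
        q[1:-1] -= (dt / dx) * (F[1:] - F[:-1])              # step (v): Godunov update
        t += dt                                              # boundary cells are held fixed
    return x, primitives(q)
```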
Fig. 41 shows the outcome of such a program for three different numerical schemes
applied to the Sod shock tube problem. All methods start from the same ICs as
discussed above (i.e., those used to make Fig. 40), and are propagated forward using
time-steps that are computed using a Courant parameter \alpha_c = 0.8 until t = 0.2. The
Figure 41: Numerical integration of Sod's shock tube problem. From top to bottom
the panels show the density, velocity and pressure as function of position. The 'analytical'
results at t = 0.2 are indicated in red (these are identical to those shown
in Fig. 40). In blue are the results from three different numerical schemes; from
left to right, these are the first-order FTBS scheme, the first-order Godunov scheme
with the approximate Riemann solver of HLL(E), and the second-order predictor-corrector
scheme of Lax-Wendroff. All schemes adopted a spatial grid of 65 points
on the domain x \in [0, 1], and a Courant parameter \alpha_c = 0.8.
results (in blue, open circles) are compared to the analytical solution (in red). In
each column, panels from top to bottom show the density, velocity, and pressure.
The first scheme, shown in the left-hand panels, is the standard Euler FTBS scheme,
which simply sets F^{n+1/2}_{i-1/2} = f(q^n_{i-1}) (cf. the table at the end of Chapter 17). Although this
scheme reduces to the stable upwind scheme in the case of the 1D linear advection
equation, in this more complicated case it clearly fails miserably. The reason is easy
to understand. In the Sod shock tube problem, there are multiple waves moving in
different directions (a forward moving shock and entropy jump, and a backward moving
rarefaction wave). Hence, there is no single upwind direction in the flow, and the FTBS
scheme cannot be an upwind scheme for all these waves. For some it is a downwind
scheme (equivalent to FTFS), and such a scheme is unconditionally unstable. This
explains the drastic failure of this scheme. In the case of the advection equation
there is only a single wave, and the FTBS (FTFS) scheme acts as an upwind scheme
if u > 0 (u < 0).
Finally, the right-hand panels show the results for a second-order Godunov scheme,
based on the Lax-Wendroff fluxes. This scheme uses piecewise linear reconstruction
(see Chapter 19), and is second-order in both space and time. The latter arises
because this scheme uses a predictor and a corrector step, according to:

\vec{q}_{i+1/2} = \frac{\vec{q}^{\,n}_i + \vec{q}^{\,n}_{i+1}}{2} - \frac{\Delta t}{2\Delta x}\left[ \vec{f}(\vec{q}^{\,n}_{i+1}) - \vec{f}(\vec{q}^{\,n}_i) \right]

\vec{q}^{\,n+1}_i = \vec{q}^{\,n}_i - \frac{\Delta t}{\Delta x}\left[ \vec{f}(\vec{q}_{i+1/2}) - \vec{f}(\vec{q}_{i-1/2}) \right]

It uses an intermediate (half-step) state, and combining the two steps shows that
the final update formula is accurate to O(\Delta t^2). It is left as an exercise for the student to show
that this scheme reduces to the LW-scheme highlighted in Chapter 17 for the linear
advection equation (i.e., when f = vu with v the constant advection speed and u the
quantity that is advected). Thanks to its higher-order accuracy, this scheme is better able
to capture the discontinuities in the analytical solution of the Sod shock tube but,
as expected from Godunov's theorem, it is not monotonicity preserving and
introduces artificial over- and undershoots.
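For comparison, here is a minimal Python sketch of a single predictor-corrector (two-step Lax-Wendroff) update corresponding to the equations above; it reuses euler_flux from the earlier sketch and applies no slope limiter, so the over- and undershoots just discussed will appear near discontinuities.

```python
def lax_wendroff_step(q, dt, dx):
    """One two-step Lax-Wendroff update of the conserved variables q (shape (Nx, 3))."""
    f = euler_flux(q)
    # predictor: half-step states at the interfaces i+1/2
    q_half = 0.5 * (q[:-1] + q[1:]) - 0.5 * (dt / dx) * (f[1:] - f[:-1])
    # corrector: update the interior cells with the half-step interface fluxes
    f_half = euler_flux(q_half)
    q_new = q.copy()
    q_new[1:-1] -= (dt / dx) * (f_half[1:] - f_half[:-1])
    return q_new
```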
As we have seen in Chapter 19, such over- and undershoots can be prevented by
using slope (or flux) limiters. An example of such a scheme is the MUSCL scheme
developed by van Leer, which is based on piecewise linear reconstruction combined
with slope limiters. This higher-order reconstruction scheme implies that at the
cell interfaces, one now has to solve so-called generalized Riemann problems; i.e.,
a discontinuity separating two linear (rather than constant) states. These are not
as easy to solve as the standard Riemann problem. Hence, one typically resorts to
approximate Riemann solvers.
Plasma Physics
The following chapters give an elementary introduction to the rich topic of plasma
physics. The main goal is to highlight how plasma physics differs from the physics
of neutral fluids. After introducing some characteristic time and length
scales, we discuss plasma orbit theory and plasma kinetic theory, before considering
the dynamics of collisionless plasmas, described by the Vlasov equation, and that
of collisional plasmas, described (under certain circumstances) by the equations of
magnetohydrodynamics.
Plasma physics is a rich topic, and one could easily devote an entire course to it. The
following chapters therefore only scratch its surface. Readers who want more
in-depth information are referred to the following excellent textbooks:
- Introduction to Plasma Theory by D.R. Nicholson
- The Physics of Plasmas by T.J.M. Boyd & J.J. Sanderson
- Plasma Physics for Astrophysics by R.M. Kulsrud
- The Physics of Fluids and Plasmas by A.R. Choudhuri
- Introduction to Modern Magnetohydrodynamics by S. Galtier
CHAPTER 22
Plasma Characteristics
Roughly speaking, a plasma is a fluid in which the constituent particles are charged.
More specifically, a plasma is a fluid for which the plasma parameter (defined
below) satisfies g ≪ 1.
Plasma dynamics is governed by the interaction of the charged particles with the
self-generated (through their charges and current densities) electromagnetic fields.
This feedback loop (motion of particles generates fields, and the dynamics of the
particles depends on the fields) is what makes plasma physics such a difficult topic.
NOTE: As is well known, accelerated charges emit photons. Hence, the charged
particles in a plasma will lose energy. As long as the timescale of this energy loss is
long compared to the processes of interest, we are justified in ignoring these radiative
losses. Throughout we assume this to be the case.
NOTE ABOUT UNITS: in this and the following chapters on plasma physics we
adopt the Gaussian system of units. This implies that the Coulomb force between
two charges q_1 and q_2 is given by

F = \frac{q_1\, q_2}{r^2}

By contrast, the same Coulomb law in the alternative SI unit system is given by

F = \frac{1}{4\pi\varepsilon_0}\, \frac{q_1\, q_2}{r^2}

with \varepsilon_0 the vacuum permittivity. Using Gaussian units also implies that the electric
and magnetic fields have the same dimensions, and that the Lorentz force on a
particle of charge q is given by

\vec{F} = q\left( \vec{E} + \frac{\vec{v}}{c} \times \vec{B} \right)
In a neutral fluid, the interactions (also called collisions) among the particles are
well-separated, short-range, and cause large deflections. In between the
collisions, the particles travel in straight lines.
In a plasma, the interactions are long-range, not well-separated (i.e., each particle
undergoes many interactions simultaneously), and each individual collision typically
causes only a small deflection. Consequently, the trajectories of particles in
a plasma are very different from those in a neutral fluid (see Fig. 1 in Chapter 1).
If the plasma is weakly ionized, then a charged particle is more likely to have a
collision with a neutral particle. Such collisions take place when the particles are
very close to each other and usually produce large deflections, similar to collisions
between two neutral particles. Hence, a weakly ionized plasma can be described
using the Boltzmann equation.
If the plasma is highly ionized, Coulomb interactions among the charged particles
dominate. These are long-range interactions, and typically result in small deflections
(see below). In addition, a particle typically has interactions with multiple other
particles simultaneously. Hence, the collisions are not instantaneous, well-separated,
and localized (i.e., short-range). Consequently, the Boltzmann equation does not
apply, and we need to derive an alternative dynamical model. Unfortunately, this is
a formidable problem that is not completely solved for an arbitrary, inhomogeneous
magnetized plasma.
In our discussion of neutral fluids we have seen that a system of particles can be
treated like a continuum fluid iff frequent collisions keep the distribution function
in local regions close to a Maxwellian. Although this is not easy to prove, there is ample
experimental evidence that collisions in a plasma also drive it towards a
Maxwellian. We will therefore seek to develop a continuum fluid model to
describe our plasma.
make in astrophysics (i.e., ni is then basically the number density of free protons). In
astrophysics these number densities can span many orders of magnitude. For example,
the ISM has n_e \sim 1\,{\rm cm}^{-3}, while stellar interiors have densities n_e \sim 10^{25}\,{\rm cm}^{-3}.

\rho = e\,(n_p - n_e)

(for a derivation, see any good textbook on Plasma Physics), while the total Debye
length, \lambda_D, is defined by

\lambda_D^{-2} = \lambda_e^{-2} + \lambda_i^{-2}

Because of Debye shielding, the net electrical potential around a charge q is given by

\phi(r) = \frac{q}{r}\, \exp(-r/\lambda_D)
Debye shielding is a prime example of collective behavior in a plasma; it indicates
that each charged particle in a plasma, as it moves, basically carries, or better, tries
to carry, a cloud of shielding electrons and ions with it.
The average number of particles on which a charged particle exerts an influence is
roughly n\,\lambda_D^3, with n the average number density of particles. Associated with this
is the Plasma Parameter

g \equiv \frac{1}{n\, \lambda_D^3} = \frac{(8\pi)^{3/2}\, e^3\, n^{1/2}}{(k_B T)^{3/2}}

NOTE: The plasma parameter g \propto n^{1/2}. Hence, low-density plasmas are more
'plasma-like' (display more collective phenomenology). Even though the number of
particles per volume is smaller, the total number of particles within a Debye volume,
\lambda_D^3, is larger.
The average distance between particles is of the order n^{-1/3}. Hence, the average
potential energy of electrostatic interactions is of the order e^2\, n^{1/3}. We thus see that

\frac{\langle {\rm P.E.} \rangle}{\langle {\rm K.E.} \rangle} \propto \frac{e^2\, n^{1/3}}{k_B T} \propto g^{2/3}

In other words, the plasma parameter is a measure of the ratio between the average
potential energy associated with collisions and the average kinetic energy of the
particles.
• When g ≪ 1, interactions among the particles are weak, but a large number of
particles interact simultaneously, giving rise to plasma behavior.
• When g ≳ 1, interactions among the particles are strong, but few particles interact
collectively, and the fluid behaves like a neutral fluid. In fact, if g > 1 then the
typical kinetic energy of the electrons is smaller than the potential energy due to
their nearest neighbor, and there would be a strong tendency for the electrons and ions
to recombine, thus destroying the plasma. The need to keep the fluid ionized means
that most plasmas have temperatures in excess of \sim 1 eV (or they are exposed to
strong ionizing radiation).
The corresponding frequency for ions, the ion plasma frequency, is defined by

\omega_{pi} = \left( \frac{4\pi\, n_i\, (Ze)^2}{m_i} \right)^{1/2}

with Ze the charge of the ion. The total plasma frequency for a two-component
plasma is defined as

\omega_p^2 \equiv \omega_{pe}^2 + \omega_{pi}^2

Since for most plasmas in nature \omega_{pe} \gg \omega_{pi}, we have that typically \omega_p \approx \omega_{pe}.
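As a concrete illustration, the short Python sketch below evaluates these characteristic plasma scales in Gaussian (cgs) units for a hydrogen plasma with T_e = T_i = T; the function name and the rounded constants are choices of this sketch.

```python
import numpy as np

# Gaussian (cgs) constants, rounded
e_esu = 4.803e-10   # electron charge [esu]
m_e = 9.109e-28     # electron mass [g]
m_p = 1.673e-24     # proton mass [g]
k_B = 1.381e-16     # Boltzmann constant [erg/K]

def plasma_scales(n_e, T):
    """Debye lengths, plasma parameter and plasma frequencies for a hydrogen plasma.

    n_e in cm^-3, T in K (with T_e = T_i = T and n_i = n_e assumed).
    """
    lam_e = np.sqrt(k_B * T / (4.0 * np.pi * n_e * e_esu**2))  # electron Debye length [cm]
    lam_D = lam_e / np.sqrt(2.0)      # total Debye length: lam_D^-2 = lam_e^-2 + lam_i^-2
    g = 1.0 / (n_e * lam_D**3)        # plasma parameter
    om_pe = np.sqrt(4.0 * np.pi * n_e * e_esu**2 / m_e)        # electron plasma frequency
    om_pi = np.sqrt(4.0 * np.pi * n_e * e_esu**2 / m_p)        # ion (proton) plasma frequency
    return lam_e, lam_D, g, om_pe, om_pi

# warm-ISM-like values, n_e ~ 1 cm^-3 and T ~ 1e4 K
print(plasma_scales(1.0, 1.0e4))
```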
Any applied field with a frequency less than the electron plasma frequency is pre-
vented from penetrating the plasma by the more rapid response of the electrons,
which neutralizes the applied field. Hence, a plasma is not transparent to elec-
tromagnetic radiation of frequency ! < !pe . An example of such long-wavelength
radiation that is typically inhibited from traversing a plasma is cyclotron radiation,
which is the non-relativistic version of synchrotron radiation.
Collisions: We now turn our attention to the collisions in a plasma. Our goal is
twofold: (i) to derive under what conditions collisions are important, and (ii) to demonstrate
that in a plasma weak collisions (causing small deflections) are more important than
strong collisions (causing large-angle deflections).
As we have seen above, a particle in a plasma feels the Coulomb force from
all \sim g^{-1} particles inside its Debye volume. Hence, unlike in a neutral gas, where
particles have individual short-range interactions and move freely in between, in a
plasma each particle has many simultaneous, long-range (i.e., of order the Debye
length) interactions.
From our definition of a plasma (i.e., g ≪ 1) we know that the potential energy of
the 'average' interaction of particles with their nearest neighbor is small. This means
that the strongest of all its \sim g^{-1} simultaneous interactions (i.e., that with its nearest
neighbor) is, on average, weak. Thus, it is safe to conclude that a charged particle in
a plasma simultaneously undergoes of order g^{-1} weak interactions (aka 'collisions').
In fact, as we shall see shortly, even the combined effect of all these g^{-1} simultaneous
collisions is still relatively weak.
Consider a charged particle of charge q and mass m having an encounter with impact
parameter b with another particle of charge q' and mass m'. Let v_0 be the speed of
the encounter when the charges are still widely separated. In what follows we assume
that m' = \infty, and we treat the encounter from the rest-frame of m', which then coincides
with the center-of-mass frame. This is a reasonable approximation for an encounter between
an electron and a much heavier ion. It makes the calculation somewhat easier, and
is anyway accurate to a factor of two or so. Let x = v_0 t describe the trajectory of
m in the case where it would not be deflected by m'. If the scattering angle is small,
then the final speed in the x-direction (i.e., the original direction of m) will be close
to v_0 again. However, the particle will have gained a perpendicular momentum

m\, v_\perp = \int_{-\infty}^{\infty} F_\perp(t)\, {\rm d}t

where F_\perp(t) is the perpendicular force experienced by the particle along its trajectory.
As long as the deflection angle is small, we may approximate that trajectory by
the unperturbed one (i.e., x = v_0 t). Note that this is exactly the same approximation
as we make in our treatment of the impulse approximation in Chapter 16.
Next we use that

F_\perp = \frac{q\, q'}{r^2}\, \sin\theta

where \sin\theta = b/r. Using this to substitute for r in the above expression, we obtain
that

v_\perp = \frac{q\, q'}{m\, b^2} \int_{-\infty}^{\infty} \sin^3\theta(t)\, {\rm d}t

Using that

x = r\cos\theta = \frac{b\cos\theta}{\sin\theta} = v_0\, t

we see that

{\rm d}t = \frac{b\, {\rm d}\theta}{v_0\, \sin^2\theta}

Substituting this in the integral expression we obtain

v_\perp = \frac{q\, q'}{m\, v_0\, b} \int_0^{\pi} \sin\theta\, {\rm d}\theta = \frac{2\, q\, q'}{m\, v_0\, b}
Our approximation that the collision must be weak (small deflection angle) breaks
down when v_\perp \simeq v_0. In that case all the forward momentum is transformed into
perpendicular momentum, and the deflection angle is 90°. This happens for an
impact parameter

b_{90} = \frac{2\, q\, q'}{m\, v_0^2}

In some textbooks on plasma physics, this length scale is called the Landau length.
Using this expression we have that

\frac{v_\perp}{v_0} = \frac{b_{90}}{b}
NOTE: although the above derivation is strictly only valid when b \gg b_{90} (i.e., v_\perp \ll
v_0), we shall consider b_{90} the border between weak and strong collisions, and compute
the combined impact of all 'weak' collisions (i.e., those with b > b_{90}).
But first, let us compute the collision frequency for strong collisions. Let n be the
number density of targets with which our particle of interest can have a collision.
The cross section of each target for a strong interaction is \pi b_{90}^2. In a time t the
particle of interest traverses a distance v_0 t, which implies that the expectation value
for the number of strong interactions during that time is given by

\langle N_L \rangle = n\, \pi b_{90}^2\, v_0\, t

where the subscript 'L' refers to Large (deflection angle). Using that the corresponding
collision frequency is the inverse of the time it takes for \langle N_L \rangle = 1, we obtain
that

\nu_L = n\, \pi b_{90}^2\, v_0 = \frac{4\pi\, n\, q^2\, q'^2}{m^2\, v_0^3} = \frac{4\pi\, n\, e^4}{m^2\, v_0^3}
where the last step only holds if, as assumed here, all ions have Z = 1.
We now proceed to compute the combined effect of all these many small-angle collisions.
Since the perpendicular direction in which the particle is deflected by each
individual collision is random, we have that \langle v_\perp \rangle = 0. However, the second moment,
\langle v_\perp^2 \rangle, will not be zero. As before, in a time t the number of collisions that
our subject particle experiences with impact parameters in the range b to b + {\rm d}b is
given by

\langle N_{\rm coll} \rangle = n\, 2\pi b\, {\rm d}b\, v_0\, t

Hence, using that each collision causes a v_\perp = v_0\, (b_{90}/b), we can calculate the total
change in v_\perp^2 by integrating over all impact parameters:

\langle v_\perp^2 \rangle = \int_{b_{\rm min}}^{b_{\rm max}} {\rm d}b\; n\, 2\pi b\, v_0\, t\; \frac{v_0^2\, b_{90}^2}{b^2} = 2\pi\, n\, v_0^3\, t\, b_{90}^2\, \ln\!\left(\frac{b_{\rm max}}{b_{\rm min}}\right)
Substituting the expression for b_{90} and using that b_{\rm max} \simeq \lambda_D (i.e., a charged particle
only experiences collisions with particles inside the Debye length) and b_{\rm min} = b_{90}
(i.e., collisions with b < b_{90} are strong collisions), we find that

\langle v_\perp^2 \rangle = \frac{8\pi\, n\, e^4\, t}{m^2\, v_0}\, \ln\!\left(\frac{\lambda_D}{b_{90}}\right)
Next we use that the typical velocity of the charges is the thermal speed, so that
v_0^2 \simeq k_B T/m, to write

\frac{\lambda_D}{b_{90}} = \frac{\lambda_D\, m\, v_0^2}{2 e^2} = 2\pi\, n\, \lambda_D^3 = 2\pi\, g^{-1}

where in the third step we have used the definition of the Debye length. Since for
a plasma g^{-1} \gg 2\pi, we have that \ln(\lambda_D/b_{90}) \simeq \ln\Lambda, where we have introduced
the inverse of the plasma parameter \Lambda \equiv g^{-1}. The quantity \ln\Lambda is known as the
Coulomb logarithm. Defining the collision frequency of the weak collisions, \nu_c, as the
inverse of the time needed for \langle v_\perp^2 \rangle to grow to v_0^2, we obtain

\nu_c = \frac{8\pi\, n\, e^4}{m^2\, v_0^3}\, \ln\Lambda
Upon comparing this with the collision frequency of large-angle collisions, we see
that

\frac{\nu_c}{\nu_L} = 2 \ln\Lambda

This is a substantial factor, given that \Lambda = g^{-1} is typically very large: for
\Lambda = 10^6 we have that \nu_c \sim 28\, \nu_L, indicating that small-angle collisions are indeed
much more important than large-angle collisions.
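The following Python sketch, reusing the Gaussian-units constants from the earlier sketch, evaluates b_90 and the two collision frequencies for a thermal plasma; it simply transcribes the expressions derived above, with v_0 set to sqrt(k_B T/m_e).

```python
def collision_frequencies(n_e, T):
    """Strong (nu_L) and effective weak (nu_c) electron collision frequencies, Gaussian units."""
    v0 = np.sqrt(k_B * T / m_e)                    # thermal speed [cm/s]
    b90 = 2.0 * e_esu**2 / (m_e * v0**2)           # impact parameter for a 90-degree deflection
    nu_L = n_e * np.pi * b90**2 * v0               # large-angle collision frequency
    lam_D = np.sqrt(k_B * T / (8.0 * np.pi * n_e * e_esu**2))  # total Debye length
    lnLambda = np.log(n_e * lam_D**3)              # Coulomb logarithm, Lambda = 1/g
    nu_c = 2.0 * lnLambda * nu_L                   # combined effect of all weak collisions
    return b90, nu_L, nu_c, lnLambda

print(collision_frequencies(1.0, 1.0e4))   # warm-ISM-like values
```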
Let us now compare this collision frequency to the plasma frequency. By once again
using that v_0 is of order the thermal velocity, and ignoring the small difference
between \lambda_D and \lambda_e, we find that

\frac{\omega_c}{\omega_{p,e}} = \frac{2\pi\, \nu_c}{\omega_{p,e}} \simeq \frac{2\pi\, \ln\Lambda}{2\pi\, n\, \lambda_D^3} = \frac{\ln\Lambda}{\Lambda} \sim \Lambda^{-1}
Hence, we see that the collision frequency is much, much smaller than the plasma
frequency, which basically indicates that, in a plasma, particle collisions are far
less important than collective effects: a plasma wave with frequency near \omega_{p,e} will
oscillate many times before experiencing significant damping due to collisions. Put
differently, collisional relaxation mechanisms in a plasma are far less important
than collective relaxation mechanisms, such as, for example, Landau damping
(to be discussed in a later chapter).
Finally, we point out that, since both the Coulomb force and the gravitational force
scale as r^{-2}, the above derivation also applies to gravitational systems. All that is
required is to replace q\, q' = e^2 \rightarrow G m^2, and the derivation of the collision frequencies
then applies to gravitational N-body systems. This is the calculation that is famously
used to derive the relaxation time of a gravitational system. The only non-trivial
part in that case is what to use for b_{\rm max}; since there is no equivalent to Debye shielding
for gravity, there is no Debye length. It is common practice (though contentious)
to adopt b_{\rm max} \simeq R, with R a characteristic length or maximum extent of
the gravitational system under consideration. In a gravitational system we also
find that the two-body, collisional relaxation time is very long, which is why we
approximate such systems as 'collisionless'. Similar to a plasma, in a gravitational
system relaxation is not due to two-particle collisions, but due to collective effects
(i.e., violent relaxation) and to more subtle relaxation mechanisms such as phase
mixing.
Let's take a closer look at this comparison. If we define the two-body relaxation
time as the inverse of the collision frequency, we see that for a plasma

t^{\rm plasma}_{\rm relax} = \frac{\Lambda}{\ln\Lambda}\, t_p = \frac{\Lambda}{\ln\Lambda}\, \frac{2\pi}{\omega_p} \simeq 10^{-4}\,{\rm s}\; \frac{\Lambda}{\ln\Lambda}\, \left( \frac{n_e}{{\rm cm}^{-3}} \right)^{-1/2}

Here we have used the ratio between the collision frequency and the plasma frequency
derived above, and we have used the expression for \omega_p \equiv 2\pi/t_p in terms of the electron
density. We thus see that the two-body relaxation time for a plasma is very
short. Even for a plasma with \Lambda = n\,\lambda_D^3 = 10^{10}, the relaxation time for a plasma
at ISM densities (\sim 1\,{\rm cm}^{-3}) is only about 12 hours, which is much shorter than
any hydrodynamical time scale in the plasma (but much longer than the characteristic
time scale on which the plasma responds to a charge imbalance, which is
t_p = 2\pi/\omega_p \simeq 0.1\,{\rm ms}). Hence, although a plasma can often be considered collisionless
(in which case its dynamics are described by the Vlasov equation, see
Chapter 24), on astrophysical time scales plasmas are collisionally relaxed, and thus
well described by a Maxwell-Boltzmann distribution.
In the case of a gravitational N-body system, the two-body relaxation time is given
by

t^{\rm Nbody}_{\rm relax} \simeq \frac{\Lambda}{\ln\Lambda}\, \frac{t_{\rm cross}}{10}
with t_{\rm cross} \simeq R/V \simeq (2/\pi)\, t_{\rm dyn} (for a detailed derivation, see Binney & Tremaine
2008). If, as discussed above, we set \Lambda = b_{\rm max}/b_{90} with b_{\rm max} \simeq R the size of the
gravitational system and b_{90} = 2Gm/\sigma^2, with \sigma the characteristic velocity dispersion,
then it is easy to see that \Lambda = [N R \sigma^2]/[2G(Nm)] \sim N, where we have used the
virial relation \sigma^2 = GM/R with M = Nm (see Chapter 15). Hence, we obtain the
well-known expression for the two-body relaxation time of a gravitational N-body
system

t^{\rm Nbody}_{\rm relax} \simeq \frac{N}{10 \ln N}\, t_{\rm cross}

And since t_{\rm cross} can easily be of order a Gyr in astrophysical systems like dark matter
halos or galaxies, while N \gg 1, we see that t^{\rm Nbody}_{\rm relax} is typically much larger than the
Hubble time. Hence, gravitational N-body systems are much better approximations
of truly collisionless systems than plasmas, and their velocity distributions can thus
not be assumed to be Maxwellian.
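These two relaxation-time estimates are easy to evaluate numerically; the Python sketch below (again reusing the constants defined in the earlier sketch) implements them, with the example input values being purely illustrative.

```python
def t_relax_plasma(n_e, T):
    """Two-body relaxation time of a plasma, (Lambda/ln Lambda) * t_p, in seconds."""
    lam_D = np.sqrt(k_B * T / (8.0 * np.pi * n_e * e_esu**2))
    Lam = n_e * lam_D**3                                              # Lambda = 1/g
    t_p = 2.0 * np.pi / np.sqrt(4.0 * np.pi * n_e * e_esu**2 / m_e)   # plasma period
    return (Lam / np.log(Lam)) * t_p

def t_relax_nbody(N, t_cross):
    """Two-body relaxation time of a gravitational N-body system, N/(10 ln N) * t_cross."""
    return N / (10.0 * np.log(N)) * t_cross

print(t_relax_plasma(1.0, 1.0e4) / 3600.0, "hours")   # warm-ISM-like plasma
print(t_relax_nbody(1.0e11, 1.0e8), "yr")             # galaxy-like system (illustrative N, t_cross)
```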
CHAPTER 23
In describing a fluid, and plasmas are no exception, we are typically not interested in
the trajectories of individual particles, but rather in the behaviour of the statistical
ensemble. Nevertheless, one can obtain some valuable insight as to the behavior of
a plasma if one has some understanding of the typical orbits that charged particles
take. In this chapter we therefore focus on plasma orbit theory, which is the study
of the motion of individual particles in a plasma.
Each particle is subjected to the EM field produced by the other particles. In addi-
tion, there may be an external magnetic field imposed on the plasma. The interior
of a plasma is usually shielded from external electrical fields.
NOTE: The gyroradius is also known as the Larmor radius or the cyclotron
radius.
What about the motion in a non-uniform magnetic field, \vec{B}(\vec{x})? As long
as the non-uniformities in \vec{B}(\vec{x}) are small over the scale of the gyroradius, i.e., as long
as the gradient scale length |B/({\rm d}B/{\rm d}r)| is large compared to r_0, one can still meaningfully decompose the motion into a circular
motion around the guiding center and the motion of the guiding center itself.
The latter can be quite complicated, though. The aim of plasma orbit theory is
to find equations describing the motion of the guiding center. Unfortunately, there
is no general equation of motion for the guiding center in an arbitrary EM field.
Rather, plasma orbit theory provides a 'bag of tricks' to roughly describe what happens
under certain circumstances. In what follows we discuss four examples: three
circumstances under which the guiding center experiences a drift, and one in which
the guiding center is reflected.
There are three cases in which the guiding center experiences a ‘drift’:
Figure 42: Drift of a gyrating particle in crossed gravitational and magnetic fields.
The magnetic field is pointing out of the page, while the gravitational force is pointing
upward. Note that the positively and negatively charged particles drift in opposite
directions, giving rise to a non-zero current in the plasma.
\vec{v}_{\rm GC} = \frac{c}{q}\, \frac{\vec{F} \times \vec{B}}{B^2}

Note that positively and negatively charged particles will drift in opposite directions,
thereby giving rise to a non-zero current in the plasma. This will not be the case if
the external force is the electrical force, i.e., if \vec{F} = q\vec{E}. In that case

\vec{v}_{\rm GC} = c\, \frac{\vec{E} \times \vec{B}}{B^2}

which does not depend on the charge; hence, all particles drift in the same direction
and no current arises.
It can be shown that the resulting drift of the guiding center is given by

\vec{v}_{\rm GC} = \pm \frac{1}{2}\, v_\perp\, r_0\, \frac{\vec{B} \times \nabla B}{B^2}

where, as throughout this chapter, v_\perp is the component of the velocity perpendicular to
the magnetic field lines, r_0 is the Larmor radius, and the + and − signs correspond
to positive and negative charges. Hence, particles of opposite charge drift in opposite
directions, once again giving rise to a non-zero current in the plasma.
where v_\parallel is the velocity component parallel to \vec{B}, and \vec{R}_c is the curvature vector,
directed towards the curvature center. As a consequence of this centrifugal force, the
guiding center drifts with a velocity

\vec{v}_{\rm GC} = \frac{c\, m\, v_\parallel^2}{q\, R_c^2}\, \frac{\vec{R}_c \times \vec{B}}{B^2}

Like the gradient drift, the curvature drift is also in opposite directions for positively
and negatively charged particles, and can thus give rise to a non-zero current.
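The drift formulae above are straightforward to evaluate; the Python sketch below transcribes them in Gaussian units (the function names and the plain 3-vector representation are choices of this sketch).

```python
import numpy as np

c_light = 2.998e10  # speed of light [cm/s]

def force_drift(F, B, q):
    """General drift v_GC = c (F x B) / (q B^2) for an external force F."""
    return c_light * np.cross(F, B) / (q * np.dot(B, B))

def ExB_drift(E, B):
    """E x B drift, v_GC = c (E x B) / B^2; independent of the particle's charge."""
    return c_light * np.cross(E, B) / np.dot(B, B)

def gradB_drift(B, gradB, v_perp, r0, sign=+1):
    """Gradient drift, +/- (1/2) v_perp r0 (B x grad B)/B^2; sign = +1 (-1) for positive (negative) charges."""
    return sign * 0.5 * v_perp * r0 * np.cross(B, gradB) / np.dot(B, B)

def curvature_drift(B, Rc, v_par, q, m):
    """Curvature drift, (c m v_par^2 / (q Rc^2)) (Rc x B)/B^2, with Rc the curvature vector."""
    return (c_light * m * v_par**2 / (q * np.dot(Rc, Rc))) * np.cross(Rc, B) / np.dot(B, B)
```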
Magnetic Mirrors:
In all three examples above the drift arises from a ‘force’ perpendicular to the mag-
netic field. There are also forces that are parallel to the magnetic field, and these
can give rise to the concept of magnetic mirrors.
Let us first introduce the concept of the magnetic moment. As you may recall from
a course on electromagnetism, a current loop with area A and current I (i.e., the
current flows along a closed loop that encloses an area A) has an associated magnetic
moment given by

\mu = \frac{I\, A}{c}
Figure 43: Illustration of Lorentz force (red arrows) acting on a gyrating particle
(indicated by ellipse) in two magnetic field configurations. In panel (a) the field
lines are parallel, and the Lorentz force has no component along the z-direction.
In the configuration shown in panel (b), though, the Lorentz force, which is always
perpendicular to the local magnetic field, now has a component in the z-direction,
pointing away from where the magnetic field is stronger.
A charged particle moving along its Larmor radius is such a loop, with A = \pi r_0^2 and
I = q\,(\Omega/2\pi). Using that \Omega = v_\perp/r_0 and substituting the definition of the Larmor
radius, we find that the magnetic moment of a charge gyrating in a magnetic field
B is given by

\mu = \frac{\pi r_0^2\, q\, v_\perp}{2\pi\, c\, r_0} = \frac{\frac{1}{2} m v_\perp^2}{B}
The magnetic moment is an adiabatic invariant, which means that it is conserved
under slow changes in an external variable. In other words, if B only changes slowly
with position and time, then the magnetic moment of a gyrating charged particle
is conserved!
Now consider the magnetic field topologies shown in Fig. 43. In panel (a), on the left,
the field is uniform and all field lines run parallel. The ellipse represents the gyration of
a particle whose guiding center moves along the central field line. The Lorentz force
is perpendicular to both \vec{B} and \vec{v}_\perp and points towards the guiding center. Now
consider the topology in panel (b). The field lines converge towards the right. At the
top of its gyro-orbit, the local magnetic field now makes an angle with respect to the field
line corresponding to the guiding center, and as a result the Lorentz force (indicated
in red) has a non-zero component in the z-direction. Hence, the particle will
be accelerated away from the direction in which the field strength increases!
To make this quantitative, let us compute the Lorentz force in the z-direction:

F_z = \frac{q}{c}\left( \vec{v} \times \vec{B} \right)_z = \frac{q}{c}\, v_\perp\, B_R

where B_R is the magnetic field component in the cylindrical R-direction, and the
z-direction is as indicated in Fig. 43. Using the Maxwell equation \nabla \cdot \vec{B} = 0, we
have that

\frac{1}{R}\, \frac{\partial}{\partial R}(R\, B_R) + \frac{\partial B_z}{\partial z} = 0

which implies

R\, B_R = -\int_0^R R'\, \frac{\partial B_z}{\partial z}(R')\, {\rm d}R'

If we take into account that \partial B_z/\partial z does not vary significantly over one Larmor radius,
we thus find that

B_R = -\frac{1}{2}\, R\, \frac{\partial B_z}{\partial z}

Substituting this in our expression for the Lorentz force in the z-direction, with R
equal to the Larmor radius, we find that

F_z = -\frac{1}{2}\, m v_\perp^2\, \frac{1}{B}\, \frac{\partial B_z}{\partial z} = -\mu\, \frac{\partial B_z}{\partial z}
This makes it clear that the Lorentz force has a non-zero component, proportional
to the magnetic moment of the charged particle, in the direction opposite to that in
which the magnetic field strength increases.
Now we are ready to address the concept of magnetic mirror confinement. Consider
a magnetic field as depicted in Fig. 44. Close to the coils, the magnetic field is
stronger than in between. Now consider a particle gyrating along one of these field
lines, as shown. Suppose the particle starts out with kinetic energy K_0 = \frac{1}{2} m (v_\perp^2 + v_\parallel^2)
and magnetic moment \mu. Both of these quantities are conserved as the charged
particle moves. As the particle moves in the direction along which the strength of
\vec{B} increases (i.e., towards one of the coils), v_\perp must increase in order to guarantee
conservation of the magnetic moment. However, the transverse kinetic energy
can never exceed the total kinetic energy. Therefore, when the particle reaches a
region of sufficiently strong B, where the transverse kinetic energy equals the total
kinetic energy, it is not possible for the particle to penetrate further into regions of
Figure 44: Illustration of a magnetic bottle. The black horizontal curved lines depict
magnetic field lines, which are 'squeezed' together at the ends by two electrical coils.
As a result the magnetic field strength is larger at the ends than in the middle,
creating magnetic mirrors in between which charged particles can be trapped. An example
of a particle trajectory is shown.
even stronger magnetic field: the particle will be reflected back, and the region of
increasing magnetic field thus acts as a reflector, known as a magnetic mirror.
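A direct consequence of the argument above (conservation of μ and of the kinetic energy) is that a particle with pitch angle θ_0 at field strength B_0 is reflected where B = B_0/sin²θ_0, so it stays trapped only if that reflection point lies inside the bottle. The small Python sketch below expresses this condition; the function name and the example mirror ratio are assumptions of the sketch.

```python
import numpy as np

def is_trapped(theta0, B0, Bmax):
    """True if a particle with pitch angle theta0 (radians) at field B0 is confined.

    Conservation of mu and of the kinetic energy implies reflection where
    B = B0 / sin^2(theta0); the particle escapes if that exceeds Bmax.
    """
    return np.sin(theta0)**2 >= B0 / Bmax

# example: for a mirror ratio Bmax/B0 = 5, particles with pitch angles smaller than
# arcsin(1/sqrt(5)) ~ 26.6 degrees (the 'loss cone') escape the bottle
print(np.degrees(np.arcsin(np.sqrt(1.0 / 5.0))))
```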
Magnetic bottles are not only found in laboratories; the Earth's magnetic field creates
its own magnetic bottles due to its dipole-like topology, with the field strength increasing
towards the magnetic poles. The charged particles that are
trapped give rise to what are called the Van Allen belts (electrons and protons have
their own belts, as depicted in Fig. 45). As the trapped particles move back and forth
between the North and South poles of the Earth's magnetic field, they experience
curvature drift (in opposite directions for the electrons and protons). The resulting
current is known as the ring current.
Figure 45: Illustration of the Van Allen belts of trapped, charged particles in the
dipole-like magnetic field of the Earth.
CHAPTER 24
In Chapter 6 we discussed the kinetic theory of fluids. Starting from the Liouville
theorem we derived the BBGKY hierarchy of equations, which we repeat here for
convenience:

\frac{\partial f^{(1)}}{\partial t} = \{ \mathcal{H}^{(1)}, f^{(1)} \} + \int {\rm d}^3\vec{q}_2\, {\rm d}^3\vec{p}_2\; \frac{\partial U(|\vec{q}_1 - \vec{q}_2|)}{\partial \vec{q}_1} \cdot \frac{\partial f^{(2)}}{\partial \vec{p}_1}

\vdots

\frac{\partial f^{(k)}}{\partial t} = \{ \mathcal{H}^{(k)}, f^{(k)} \} + \sum_{i=1}^{k} \int {\rm d}^3\vec{q}_{k+1}\, {\rm d}^3\vec{p}_{k+1}\; \frac{\partial U(|\vec{q}_i - \vec{q}_{k+1}|)}{\partial \vec{q}_i} \cdot \frac{\partial f^{(k+1)}}{\partial \vec{p}_i}
Here k = 1, 2, ..., N, and f^{(k)} is the k-particle DF, which relates to the N-particle DF
(N > k) according to

f^{(k)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_k, t) \equiv \frac{N!}{(N-k)!} \int \prod_{i=k+1}^{N} {\rm d}^6\vec{w}_i\; f^{(N)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_N, t)\,,

with V(\vec{q}) the potential associated with an external force, and U(r) the two-body
interaction potential between two (assumed equal) particles separated by a distance
r = |\vec{q}_i - \vec{q}_j|.
All of this is completely general: it holds for any Hamiltonian system consisting of N
particles, and therefore also applies to plasmas. However, we also saw that in order
to make progress, one needs to make certain assumptions that allow one to truncate
the BBGKY hierarchy at some point (in order to achieve closure).
If one can ignore two-body collisions, then the phase-space coordinates of the particles
will be uncorrelated, such that

f^{(2)}(\vec{q}_1, \vec{q}_2, \vec{p}_1, \vec{p}_2) = f^{(1)}(\vec{q}_1, \vec{p}_1)\, f^{(1)}(\vec{q}_2, \vec{p}_2)

This closes the hierarchy and yields the collisionless Boltzmann equation (CBE):

\frac{{\rm d}f}{{\rm d}t} = \frac{\partial f}{\partial t} + \dot{\vec{x}} \cdot \frac{\partial f}{\partial \vec{x}} + \dot{\vec{v}} \cdot \frac{\partial f}{\partial \vec{v}} = 0
In a gravitational N-body system the acceleration in the third term of the CBE is
\dot{\vec{v}} = -\nabla\Phi, where \Phi follows from the Poisson equation

\nabla^2 \Phi = 4\pi G \rho

with

\rho(\vec{x}, t) = m \int f(\vec{x}, \vec{v}, t)\, {\rm d}^3\vec{v}
And since the effects of collisions are ignored here, the fields \vec{E} and \vec{B} are the smooth,
ensemble-averaged fields that satisfy the Maxwell equations

\nabla \cdot \vec{E} = 4\pi \rho

\nabla \cdot \vec{B} = 0

\nabla \times \vec{E} = -\frac{1}{c}\, \frac{\partial \vec{B}}{\partial t}

\nabla \times \vec{B} = \frac{4\pi}{c}\, \vec{J} + \frac{1}{c}\, \frac{\partial \vec{E}}{\partial t}
Here \rho = \rho_i + \rho_e is the total charge density and \vec{J} = \vec{J}_e + \vec{J}_i the total current density,
which are related to the distribution functions according to

\rho_s(\vec{x}, t) = q_s \int {\rm d}^3\vec{v}\; f_s(\vec{x}, \vec{v}, t)

\vec{J}_s(\vec{x}, t) = q_s \int {\rm d}^3\vec{v}\; \vec{v}\, f_s(\vec{x}, \vec{v}, t)

for species 's'. Thus we see that the Maxwell equations are for a plasma what the
Poisson equation is for a gravitational system. Note that the distribution function
in the Vlasov equation is the sum of f_i(\vec{x}, \vec{v}, t) and f_e(\vec{x}, \vec{v}, t).
For a collisional, neutral fluid, on the other hand, one makes the assumption of molecular chaos,

f^{(2)}(\vec{q}, \vec{q}, \vec{p}_1, \vec{p}_2) = f^{(1)}(\vec{q}, \vec{p}_1)\, f^{(1)}(\vec{q}, \vec{p}_2)

(note that the collisions are assumed to be perfectly localized, such that we only
need to know the 2-particle DF for \vec{q}_1 = \vec{q}_2 = \vec{q}). This assumption allows us to close
the BBGKY hierarchy, yielding the Boltzmann equation:

\frac{{\rm d}f}{{\rm d}t} = \frac{\partial f}{\partial t} + \dot{\vec{x}} \cdot \frac{\partial f}{\partial \vec{x}} + \dot{\vec{v}} \cdot \frac{\partial f}{\partial \vec{v}} = I[f]
Here I[f ] is the collision integral, which describes how the phase-space density
around a particle (or fluid element) changes with time due to short-range collisions.
Upon taking the moment equations of this Boltzmann equation we obtain a hierarchy
of 'fluid equations', which we can close by supplementing them with constitutive
equations for various transport coefficients (i.e., viscosity and conductivity) that
can be computed using the Chapman-Enskog expansion (something we did not
cover in these lecture notes).
In the case of a plasma, though, we cannot use the assumption of molecular chaos,
as individual particles have many (of order g^{-1}) simultaneous, long-range Coulomb
interactions. This situation is very different from that of a neutral gas, and the
Boltzmann equation can therefore NOT be used to describe a plasma.
So what assumption can we make for a plasma that allows us to truncate the BBGKY
hierarchy? The standard approach is to assume that h(1, 2, 3) = 0 (i.e., to assume that
the three-body correlation function is zero). This is a very reasonable assumption to
make, as it basically asserts that two-body interactions are more important than
three-body interactions. However, even with h(1, 2, 3) = 0 the BBGKY hierarchy
yields a set of two equations (for \partial f^{(1)}/\partial t and \partial f^{(2)}/\partial t) that is utterly unsolvable.
Hence, additional assumptions are necessary. The two assumptions that are typically
made to arrive at a manageable equation are
1. the plasma is spatially homogeneous;
2. the two-point correlation function g(1, 2) relaxes much faster than the one-point
distribution function f(1).
The latter of these is known as Bogoliubov's hypothesis, and is a reasonable
assumption under certain conditions. Consider for example injecting an electron
into a plasma. The other electrons will adjust to the presence of this new electron in
roughly the time it takes for them to have a collision with the new electron. Using
that the typical speed of the electrons is v_e \sim (k_B T/m_e)^{1/2}, and using the Debye length as the
typical length scale, the time scale for the injected electron to relax is \lambda_e/v_e \sim \omega_{p,e}^{-1}.
In contrast, the time for f(1) to relax to the newly injected electron is \sim g^{-1}\, \omega_{p,e}^{-1}, as
all the g^{-1} particles within the Debye volume need to undergo mutual collisions.
Using the BBGKY hierarchy with h(1, 2, 3) = 0, assuming the plasma to be spatially
homogeneous, and adopting Bogoliubov's hypothesis yields, after some tedious algebra,
the Lenard-Balescu equation. Although the student is not required to know
or comprehend this equation, it is given here for the sake of completeness:

\frac{\partial f(\vec{v}, t)}{\partial t} = \frac{8\pi^4 n_e}{m_e^2}\, \frac{\partial}{\partial \vec{v}} \cdot \int {\rm d}\vec{k}\, {\rm d}\vec{v}'\; \vec{k}\,\vec{k}\; \frac{\phi^2(k)}{|\varepsilon(\vec{k}, \vec{k}\cdot\vec{v})|^2}\; \delta\!\left[\vec{k}\cdot(\vec{v} - \vec{v}')\right] \cdot \left[ f(\vec{v})\, \frac{\partial f}{\partial \vec{v}'} - f(\vec{v}')\, \frac{\partial f}{\partial \vec{v}} \right]
Here

\phi(k) = \frac{e^2}{2\pi^2 k^2}

is the Fourier transform of the Coulomb potential \phi(x) = e^2/|x|, and

\varepsilon(\vec{k}, \omega) = 1 + \frac{\omega_{p,e}^2}{k^2} \int {\rm d}\vec{v}\; \frac{\vec{k} \cdot (\partial f/\partial \vec{v})}{\omega - \vec{k} \cdot \vec{v}}
is called the dielectric function, which basically represents the plasma shielding of a
test particle. Note that \vec{x} does not appear as an argument of the distribution function,
which reflects the assumption of a homogeneous plasma. And the term in square
brackets has no explicit time-dependence, which reflects Bogoliubov's hypothesis.
We emphasize that because of the assumptions that underlie the Lenard-Balescu
equation, it is NOT applicable to all plasma processes. Although it can be used to
describe, say, the collisional relaxation of an electron beam in a plasma, it cannot
be used to describe, for example, the collisional damping of spatially inhomogeneous
wave motion.
The rhs of the Lenard-Balescu equation represents the physics of two-particle collisions.
This is evident from the fact that the term \phi(k)/\varepsilon(\vec{k}, \vec{k}\cdot\vec{v}) appears squared.
This term represents the Coulomb potential of a charged particle (the \phi(k) part)
together with its shielding cloud (represented by the dielectric function). Hence, the
fact that this term appears squared represents the collision of two shielded particles.
It may be clear that this is not an easy equation to deal with. However, one can
obtain a simplified but fairly accurate form of the Lenard-Balescu equation that can
be recast in the form of a Fokker-Planck equation:

\frac{\partial f(\vec{v}, t)}{\partial t} = -\frac{\partial}{\partial v_i}\left[ A_i\, f(\vec{v}) \right] + \frac{1}{2}\, \frac{\partial^2}{\partial v_i\, \partial v_j}\left[ B_{ij}\, f(\vec{v}) \right]
Here

A_i = \frac{8\pi\, n_e\, e^4\, \ln\Lambda}{m_e^2}\, \frac{\partial}{\partial v_i} \int {\rm d}\vec{v}'\; \frac{f(\vec{v}', t)}{|\vec{v} - \vec{v}'|}

is called the coefficient of dynamical friction, which represents the slowing down
of a typical particle because of many small-angle collisions, and

B_{ij} = \frac{4\pi\, n_e\, e^4\, \ln\Lambda}{m_e^2}\, \frac{\partial^2}{\partial v_i\, \partial v_j} \int {\rm d}\vec{v}'\; |\vec{v} - \vec{v}'|\, f(\vec{v}', t)
is the diffusion coefficient, which represents the increase of a typical particle's
velocity (in the direction perpendicular to its instantaneous velocity) due to the
many small-angle collisions.
If the two terms on the rhs of the Fokker-Planck equation balance each other, such
that \partial f(\vec{v}, t)/\partial t = 0, then the plasma has reached an equilibrium. It can be shown
that this is only the case if f(\vec{v}) follows a Maxwell-Boltzmann distribution.
This is another way of stating that two-body collisions drive the system towards a
Maxwellian.
CHAPTER 25
In Chapter 22 we have seen that the two-body collision frequency of a plasma is much
smaller than the plasma frequency (by roughly a factor \Lambda = g^{-1}). Hence, there are
plasma phenomena that have a characteristic time scale that is much shorter than
the two-body relaxation time. For such phenomena, collisions can be ignored, and
we may consider the plasma as being collisionless.
And as we have seen in the previous chapter, the equation that governs the dynamics
of a collisionless plasma is the Vlasov equation.
\frac{\partial f}{\partial t} + \vec{v} \cdot \frac{\partial f}{\partial \vec{x}} + \frac{\vec{F}}{m} \cdot \frac{\partial f}{\partial \vec{v}} = 0

with

\vec{F} = n \int {\rm d}\vec{x}_2\, {\rm d}\vec{v}_2\; \vec{F}_{12}\, f^{(1)}(\vec{x}_2, \vec{v}_2, t)

the smooth force acting on particle 1 due to the long-range interactions of all other
particles (within the Debye length). This equation derives from the BBGKY hierarchy
upon neglecting the two-particle correlation function, g(1, 2), which arises due
to Coulomb interactions among particles within each other's Debye volume. Hence,
\vec{F} assumes that the force field (which for a plasma entails \vec{E}(\vec{x}, t) and \vec{B}(\vec{x}, t)) is
perfectly smooth.
Writing this smooth force in terms of the ensemble-averaged electromagnetic fields, we obtain for each species that

\frac{\partial f_a}{\partial t} + \vec{v} \cdot \frac{\partial f_a}{\partial \vec{x}} + \frac{q_a}{m_a}\left( \vec{E} + \frac{\vec{v} \times \vec{B}}{c} \right) \cdot \frac{\partial f_a}{\partial \vec{v}} = 0
where ‘a’ is either ‘e’ or ‘i’.
Rather than solving the Vlasov equation, we follow the same approach as for our
neutral and collisionless fluids, and instead solve the moment equations,
obtained by multiplying the Vlasov equation with a function of \vec{v} and integrating over all of velocity
(momentum) space (cf. Chapter 7).
As for neutral fluids, we need to complement these moment equations with an equation
of state (EoS) in order to close the set. Without going into detail, in most
cases the EoS of a plasma can be taken to have one of the following three forms:

P_a = 0                    ('cold plasma')
P_a = n_a\, k_B\, T_a      ('ideal plasma')
P_a = C\, n_a^{\gamma}     ('adiabatic processes')
A ‘cold plasma’ is a plasma in which the random motions of the particles are not
important.
Since the momentum equations for our plasma contain the electric and magnetic
fields, we need to complement the moment equations and EoS with the Maxwell
equations

\nabla \cdot \vec{E} = 4\pi\, (n_i - n_e)\, e

\nabla \cdot \vec{B} = 0

\nabla \times \vec{E} = -\frac{1}{c}\, \frac{\partial \vec{B}}{\partial t}

\nabla \times \vec{B} = \frac{4\pi}{c}\, (n_i \vec{u}_i - n_e \vec{u}_e)\, e + \frac{1}{c}\, \frac{\partial \vec{E}}{\partial t}
Let's assume the plasma to be 'cold' (i.e., P_e = P_i = 0), and consider perturbations
in a uniform, homogeneous plasma. The perturbation analysis is exactly
analogous to that of acoustic waves in Chapter 12: first, apply small perturbations
to the dynamical quantities (i.e., n_0 \rightarrow n_0 + n_1, \vec{E}_0 \rightarrow \vec{E}_0 + \vec{E}_1, etc., where subscripts
'0' refer to the unperturbed equilibrium solution). Next, linearize the equations,
which implies that we ignore all higher-order terms. For the momentum equation
this yields

m_e\, n_0\, \frac{\partial \vec{v}_1}{\partial t} = -e\, n_0\, \vec{E}_1

Note that the magnetic force \vec{v}_1 \times \vec{B}_1 is second-order in the perturbed quantities and
is therefore neglected. For the Maxwell equations we obtain that

\nabla \times \vec{B}_1 = -\frac{4\pi}{c}\, n_0\, e\, \vec{v}_1 + \frac{1}{c}\, \frac{\partial \vec{E}_1}{\partial t}

and

\nabla \times \vec{E}_1 = -\frac{1}{c}\, \frac{\partial \vec{B}_1}{\partial t}\,.
Combining these equations, and assuming all perturbations to be of the form \exp[i(\vec{k}\cdot
\vec{x} - \omega t)], which implies that \partial/\partial t \rightarrow -i\omega and \nabla \rightarrow i\vec{k}, one obtains the dispersion
relation \omega(\vec{k}). In the case of our two-fluid model, this dispersion relation has the
form

\vec{k} \times (\vec{k} \times \vec{E}_1) = -\frac{\omega^2}{c^2}\left( 1 - \frac{\omega_p^2}{\omega^2} \right) \vec{E}_1

Here

\omega_p = \left( \frac{4\pi\, n_0\, e^2}{m_e} \right)^{1/2}

is the plasma frequency of the undisturbed plasma. This dispersion relation corresponds
to two physically distinct types of wave modes:
Plasma Oscillations:
These are oscillation modes for which

E_{1x} = E_{1y} = 0\,, \qquad \omega^2 = \omega_p^2

where the z-direction is taken to be along \vec{k}. Hence, since the group velocity
v_g = \partial\omega/\partial k = 0, we see that these correspond to non-propagating, longitudinal
oscillations with a frequency equal to the plasma frequency. These are called
plasma waves, or Langmuir waves. Physically, they are waves in which perturbations
in \vec{E} cause a separation between electrons and ions, which results in an
electrostatic restoring force. Note that the plasma frequency depends only on the
density of the plasma. If the plasma is not cold (i.e., P_a \neq 0), then
these Langmuir oscillations become travelling waves.
Electromagnetic waves:
These are oscillation modes for which

E_{1z} = 0\,, \qquad \omega^2 = \omega_p^2 + k^2 c^2

where, as before, the z-direction is taken to be along \vec{k}. Hence, these are transverse
waves. In fact, these are simply electromagnetic waves, but modified by the plasma.
The group velocity (i.e., the velocity with which information propagates) is given by

v_g \equiv \frac{\partial \omega}{\partial k} = c\, \sqrt{1 - \frac{\omega_p^2}{\omega^2}}

which is less than the speed of light, c. For comparison, the phase velocity is given
by

v_{\rm ph} \equiv \frac{\omega}{k} = \frac{c}{\sqrt{1 - \frac{\omega_p^2}{\omega^2}}}

which is larger than c. Note, though, that this does not violate special relativity, as
no physical signal travels at this speed (i.e., it does not carry any information).
• If \omega < \omega_p, then k and v_g become imaginary. This indicates that the EM waves
simply cannot penetrate the plasma; they are reflected back. The reason is that the
plasma can counteract the oscillations in the EM field at a rate that is faster, thereby
shorting out the fluctuations, and thus the EM wave. This explains why low-frequency
radio signals can be reflected off the ionospheric plasma, and why cyclotron radiation
cannot travel through the plasma that permeates the Universe (unless it derives
from very strong magnetic fields, in which case the frequency can be larger than the
plasma frequency). Note that in this regime, strictly speaking, the Vlasov equation
is no longer applicable, as the time scale for collisions becomes comparable to, or
shorter than, that of the perturbation.
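The two branches derived above are easy to explore numerically. The Python sketch below evaluates the group and phase velocities of an electromagnetic wave in the plasma (reusing the Gaussian-units constants from the earlier sketches) and returns NaN for ω < ω_p, where the wave is evanescent.

```python
def em_wave_speeds(omega, omega_p):
    """Group and phase velocity for omega^2 = omega_p^2 + k^2 c^2 (NaN if evanescent)."""
    if omega <= omega_p:
        return np.nan, np.nan                   # wave cannot penetrate the plasma
    factor = np.sqrt(1.0 - (omega_p / omega)**2)
    return c_light * factor, c_light / factor   # (v_g < c, v_ph > c)

# example: a 10 MHz radio wave in a warm-ISM-like plasma (n_e = 1 cm^-3)
om_p = np.sqrt(4.0 * np.pi * 1.0 * e_esu**2 / m_e)
print(em_wave_speeds(2.0 * np.pi * 1.0e7, om_p))
```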
The above analysis is based on a perturbation analysis of the two-fluid model, which
is based on moment equations of the Vlasov equation. Landau performed a more
thorough analysis, by actually perturbing the Vlasov equation itself. He found
that the Langmuir waves will damp, a process known as Landau damping.
This damping may come as a surprise (as it did to Landau, when he first derived
this result). After all, damping is typically associated with dissipation, and hence
requires either radiation, or collisions that convert wave energy into random, thermal
energy. But the Vlasov equation includes neither radiation nor collisions. So where
does this damping come from? Without going through a rigorous derivation, which is
somewhat tedious and involves nasty complex contour integrals, we merely sketch how
Landau damping arises from the energy exchange between a wave
with phase velocity v_{\rm ph} \equiv \omega/k and particles in the plasma with velocities approximately
equal to v_{\rm ph}; these particles can interact strongly with the wave (similar to
how particles that are in near-resonance with a perturber can exchange energy with
it). Particles with v \lesssim v_{\rm ph} will be accelerated (i.e., gain energy)
by the electric field of the Langmuir wave so as to move with the phase velocity of the
wave. Particles with v \gtrsim v_{\rm ph}, on the other hand, will be decelerated (lose energy).
All in all, the particles have a tendency to synchronize with the wave. An imbalance
between energy gainers and energy losers arises from the fact that the velocity distribution
of a plasma is typically a Maxwell-Boltzmann distribution; hence, there will
be slightly more particles with v < v_{\rm ph} (energy gainers) than particles with v > v_{\rm ph}
(energy losers). Hence, there is a net transfer of energy from the wave to the particles,
causing the former to damp.
A famous metaphor for Landau damping involves surfing. One can view the Langmuir
waves as waves in the ocean, and the particles as surfers trying to catch the wave,
all moving in the same direction. If a surfer moves on the water surface at a
velocity slightly less than that of the wave, he will eventually be caught and pushed along
by the wave (gaining energy). On the other hand, a surfer moving slightly faster
than the wave will be pushing on the wave as he moves uphill (losing energy to the
wave). Within this metaphor it is also clear that if the surfer is not moving at all,
no exchange of energy happens, as the wave simply moves the surfer up and down
as it goes by. Similarly, a wind-surfer who is moving much faster than the wave won't
interact much with the wave either.
Hence, Landau damping arises from gradients in the distribution function at the
phase velocity of the wave, which can cause a transfer of energy from the wave to
the particles; Landau damping is a prime example of a wave-particle interaction.
As first pointed out by Lynden-Bell, it is similar to violent relaxation for a purely
collisionless, gravitational system, in which the energy in potential fluctuations (i.e.,
oscillations in the gravitational system, for example due to gravitational collapse)
is transferred into random motions, ultimately leading to virialization (relaxation).
CHAPTER 26
Magnetohydrodynamics
When we consider phenomena with length scales much larger than the Debye
length, and time scales much longer than the inverse of the plasma frequency,
charge separation is small, and can typically be neglected. In that case we don’t
need to treat electrons and ions separately. Rather, we treat the plasma as a single
fluid. Note, though, that as we are considering phenomena with longer time scales,
our one-fluid model of plasma physics will have to account for collisions (i.e., we
won’t be able to use the Vlasov equation as our starting point). As we will see, the
main effect of these collisions is to transfer momentum between electrons and ions,
which in turn manifests as an electrical current.
The starting point is the 'Boltzmann-like' equation for each species,

\frac{\partial f_a}{\partial t} + \vec{v} \cdot \frac{\partial f_a}{\partial \vec{x}} + \frac{q_a}{m_a}\left( \vec{E} + \frac{\vec{v} \times \vec{B}}{c} \right) \cdot \frac{\partial f_a}{\partial \vec{v}} = \left( \frac{\partial f_a}{\partial t} \right)_{\rm coll}

where as before 'a' refers to a species, either 'e' or 'i'. As we will see, we can obtain
the necessary insight to develop our one-fluid model without regard to what this
collision term looks like in detail.
By integrating the above 'Boltzmann-like' equation over velocity space, we obtain
the continuity equation

\frac{\partial n_a}{\partial t} + \nabla \cdot (n_a\, \vec{u}_a) = 0

Here we have dropped the collision term \int {\rm d}^3\vec{v}\, (\partial f_a/\partial t)_{\rm coll}, which represents the
change in the number of particles of species 'a' in a small volume of configuration
space due to collisions. To good approximation this term is zero, which follows from
the fact that, while Coulomb interactions can cause large changes in momentum
(when b < b_{90}), they do not cause much change in the positions of the particles.
Hence, the collision term leaves the continuity equation unaltered.
For the momentum equation, we multiply the above 'Boltzmann-like' equation with
velocity and again integrate over velocity space. If we once again ignore viscosity,
this yields exactly the same equation as for the two-fluid model discussed in the
previous chapter, but with one additional, collisional term:

m_a\, n_a \left[ \frac{\partial \vec{u}_a}{\partial t} + (\vec{u}_a \cdot \nabla)\, \vec{u}_a \right] = -\nabla P_a + q_a\, n_a \left( \vec{E} + \frac{\vec{u}_a}{c} \times \vec{B} \right) + \vec{C}_a

where

\vec{C}_a = m_a \int {\rm d}\vec{v}\; \vec{v} \left( \frac{\partial f_a}{\partial t} \right)_{\rm coll}
This term represents the change in the momentum of species 'a' at position \vec{x} due
to Coulomb interactions. Note that a given species cannot change its momentum
through collisions with members of its own species (i.e., the center of mass of two
electrons is not changed after they have collided with each other). Hence, \vec{C}_e represents
the change in the momentum of the electrons due to collisions with the ions,
and \vec{C}_i represents the change in the momentum of the ions due to collisions with
the electrons. And, since the total momentum is a conserved quantity, we have that
\vec{C}_e = -\vec{C}_i.
Since in MHD we treat the plasma as a single fluid, we now define the relevant
quantities:

total mass density      \rho \equiv m_i n_i + m_e n_e
com fluid velocity      \vec{u} \equiv \frac{1}{\rho}\, (m_i n_i \vec{u}_i + m_e n_e \vec{u}_e)
total pressure          P = P_e + P_i
charge density          \rho_c \equiv e\,(n_i - n_e)
current density         \vec{J} \equiv e\,(n_i \vec{u}_i - n_e \vec{u}_e)

By multiplying the continuity equation for the electrons with m_e, and adding it
to the continuity equation for the ions multiplied by m_i, one obtains the MHD
continuity equation,

\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\, \vec{u}) = 0

Similarly, multiplying the electron and ion continuity equations by their respective
charges and adding them yields the charge continuity equation,

\frac{\partial \rho_c}{\partial t} + \nabla \cdot \vec{J} = 0
For the momentum equation, it is common to assume that \partial n_a/\partial t and \vec{u}_a are small
compared to other terms. This allows one to neglect terms that contain products of
these small quantities, which in turn allows one to add the momentum equations for
electrons and ions, yielding

\rho\, \frac{\partial \vec{u}}{\partial t} = -\nabla P + \rho_c\, \vec{E} + \frac{1}{c}\, \vec{J} \times \vec{B}
In general, in MHD one assumes that n_e \simeq n_i, which implies that the charge density,
\rho_c, is typically (very) small. We adopt that assumption here as well, which implies
that the \rho_c\vec{E} term in the momentum equation vanishes and that we no longer need
to consider the charge conservation equation.
In MHD, the energy equation is the same as for a neutral fluid, except that there is
an additional term to describe Ohmic dissipation (aka Ohmic loss). In the absence
of radiation, viscosity and conduction, we therefore have

\rho\, \frac{{\rm d}\varepsilon}{{\rm d}t} = -P\, \nabla \cdot \vec{u} + \frac{J^2}{\sigma}
Since both the momentum and energy equations contain the current density, we
need to complement them with an equation for the time-evolution of \vec{J}. This relation,
called the generalized Ohm's law, derives from multiplying the momentum
equations for the individual species by q_a/m_a, adding the versions for the electrons
and ions, while once again ignoring terms that contain products of small quantities
(i.e., \partial n_a/\partial t and \vec{u}_a). Using that \vec{C}_e = -\vec{C}_i, that n_i \approx n_e, that P_e \approx P_i \approx P/2, and
that m_i^{-1} \ll m_e^{-1}, one can show that

\frac{m_e\, m_i}{\rho\, e^2}\, \frac{\partial \vec{J}}{\partial t} = \frac{m_i}{2\rho e}\, \nabla P + \vec{E} + \frac{1}{c}\, \vec{u} \times \vec{B} - \frac{m_i}{\rho e c}\, \vec{J} \times \vec{B} + \frac{m_i}{\rho e}\, \vec{C}_i

(for a derivation see the excellent textbook Introduction to Plasma Theory by D.R.
Nicholson).
The above generalized Ohm’s law is rather complicated. But fortunately, in most
circumstances certain terms are significantly smaller than others and can thus be
ignored. Before discussing which terms can be discarded, though, we first give a
heuristic derivation of the collision term \vec{C}_i.
As already mentioned above, \vec{C}_e = -\vec{C}_i describes the transfer of momentum from
the electrons to the ions (and vice versa). Let's consider the strong interactions,
i.e., those with an impact parameter b \lesssim b_{90}. Since the electron basically loses all
its forward momentum in such a collision, the electron fluid loses an
average momentum m_e(\vec{u}_e - \vec{u}_i) to the ion fluid per strong electron-ion encounter.
Hence, the rate of momentum transfer is approximately given by

\vec{C}_e = -m_e\, n_e\, \nu_L\, (\vec{u}_e - \vec{u}_i)

where \nu_L is the collision frequency for strong collisions. Since the current density is
\vec{J} = q_e n_e \vec{u}_e + q_i n_i \vec{u}_i \simeq n_e e\, (\vec{u}_i - \vec{u}_e), where we have used that n_i \simeq n_e, we can write
this as

\vec{C}_e = +\, n_e\, e\, \eta\, \vec{J}

where

\eta = \frac{m_e\, \nu_L}{n_e\, e^2}

This parameter is called the electric resistivity, and is the inverse of the electric
conductivity, \sigma. Substituting the expression for \nu_L derived in Chapter 22 (and
using v_0 \sim v_e \sim (3 k_B T/m_e)^{1/2}), we find that

\eta = \frac{4\pi}{3\sqrt{3}}\, \frac{m_e^{1/2}\, e^2}{(k_B T)^{3/2}} \approx 2.4\, \frac{m_e^{1/2}\, e^2}{(k_B T)^{3/2}}
Using a much more rigorous derivation of the electrical resistivity, accounting for the
whole range of impact parameters, Spitzer & Härm (1953) obtain (assuming that all
ions have Z = 1)

\eta = 1.69\, \frac{m_e^{1/2}\, e^2\, \ln\Lambda}{(k_B T)^{3/2}}

in reasonable agreement with our crude estimate.
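For reference, the Spitzer expression quoted above is easily evaluated; the Python sketch below does so in Gaussian units (constants as defined in the earlier sketches), returning η in seconds, which is its dimension in this unit system.

```python
def spitzer_resistivity(T, lnLambda=10.0):
    """Spitzer resistivity eta = 1.69 ln(Lambda) m_e^(1/2) e^2 / (k_B T)^(3/2), Gaussian units [s]."""
    return 1.69 * lnLambda * np.sqrt(m_e) * e_esu**2 / (k_B * T)**1.5

# e.g. a T ~ 1e4 K plasma with an assumed ln(Lambda) ~ 20
print(spitzer_resistivity(1.0e4, lnLambda=20.0))
```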
term, which describes the Hall effect, is typically small compared to (\vec{u}/c) \times \vec{B}. Hence,
we can simplify the generalized Ohm's law to read

\vec{J} = \sigma \left( \vec{E} + \frac{\vec{u}}{c} \times \vec{B} \right)
The MHD equations derived thus far (mass continuity, charge continuity, momentum
conservation and Ohm's law) need to be complemented with the Maxwell equations.
Fortunately, these can also be simplified. Let's start with Ampère's circuital
law

\nabla \times \vec{B} = \frac{4\pi}{c}\, \vec{J} + \frac{1}{c}\, \frac{\partial \vec{E}}{\partial t}

It can be shown (see §3.6 of The Physics of Fluids and Plasmas by A.R. Choudhuri)
that

\left| \frac{1}{c}\, \frac{\partial \vec{E}}{\partial t} \right| \Big/ \left| \nabla \times \vec{B} \right| \sim \frac{v^2}{c^2}

Hence, in the non-relativistic regime considered here, the displacement current is
negligible, which implies that \vec{J} = \frac{c}{4\pi} \nabla \times \vec{B}. Combined with Ohm's law, we therefore
have that

\vec{E} = \frac{c}{4\pi\sigma}\, (\nabla \times \vec{B}) - \frac{\vec{u}}{c} \times \vec{B}

Hence, we see that, in MHD, the electric field does not have to be considered an
independent variable; instead, it can be obtained from \vec{u} and \vec{B}.
Using the vector identity \nabla \times (\nabla \times \vec{B}) = \nabla(\nabla \cdot \vec{B}) - \nabla^2\vec{B} (see Appendix A), and the
fact that \nabla \cdot \vec{B} = 0 (yet another Maxwell equation), we finally obtain the induction
equation

\frac{\partial \vec{B}}{\partial t} = \nabla \times (\vec{u} \times \vec{B}) + \lambda\, \nabla^2 \vec{B}

where

\lambda \equiv \frac{c^2}{4\pi\sigma}

is called the magnetic diffusivity. As is evident from the induction equation, it
describes the rate at which the magnetic field diffuses due to collisions in the plasma.
Before we finally summarize our set of MHD equations, we apply one final modification
by expanding the Lorentz force term, \vec{J} \times \vec{B}, in the momentum equation. Using
Ampère's circuital law without the displacement current, we have that

\frac{1}{c}\, (\vec{J} \times \vec{B}) = \frac{1}{4\pi}\, (\nabla \times \vec{B}) \times \vec{B} = \frac{1}{4\pi}\, (\vec{B} \cdot \nabla)\vec{B} - \nabla\!\left( \frac{B^2}{8\pi} \right)

where the last step is based on a standard vector identity (see Appendix A).
Next, using that

\left[ (\vec{B} \cdot \nabla)\vec{B} \right]_i = B_j\, \frac{\partial B_i}{\partial x_j} = \frac{\partial (B_i B_j)}{\partial x_j} - B_i\, \frac{\partial B_j}{\partial x_j} = \frac{\partial (B_i B_j)}{\partial x_j}

where the last step follows from the fact that \nabla \cdot \vec{B} = 0, we can now write the
momentum equation in index form as

\rho\, \frac{\partial u_i}{\partial t} = -\frac{\partial P}{\partial x_i} - \frac{\partial}{\partial x_i}\!\left( \frac{B^2}{8\pi} \right) + \frac{\partial}{\partial x_j}\!\left( \frac{B_i B_j}{4\pi} \right) = -\frac{\partial}{\partial x_j}\left[ \delta_{ij}\, P + M_{ij} \right]

where

M_{ij} \equiv \frac{B^2}{8\pi}\, \delta_{ij} - \frac{B_i B_j}{4\pi}

is the magnetic stress tensor. Its diagonal elements represent the magnetic
pressure, while its off-diagonal terms arise from magnetic tension.
The following table summarizes the full set of resistive MHD equations. These
are valid to describe low-frequency plasma phenomena for a relatively cold plasma in
which n_e \simeq n_i, such that the charge density can be neglected. Note also that conduction,
viscosity, radiation and gravity are all neglected (the corresponding terms are
trivially added). A fluid that obeys these MHD equations is called a magnetofluid.

Continuity Eq.       \frac{{\rm d}\rho}{{\rm d}t} = -\rho\, \nabla \cdot \vec{u}

Momentum Eqs.        \rho\, \frac{{\rm d}\vec{u}}{{\rm d}t} = -\nabla P + \frac{1}{c}\, \vec{J} \times \vec{B}

Energy Eq.           \rho\, \frac{{\rm d}\varepsilon}{{\rm d}t} = -P\, \nabla \cdot \vec{u} + \frac{J^2}{\sigma}

Ohm's Law            \vec{J} = \sigma \left( \vec{E} + \frac{\vec{u}}{c} \times \vec{B} \right)

Induction Eq.        \frac{\partial \vec{B}}{\partial t} = \nabla \times (\vec{u} \times \vec{B}) + \lambda\, \nabla^2 \vec{B}

Constitutive Eqs.    \lambda = \frac{c^2}{4\pi\sigma}\,, \qquad \sigma = \eta^{-1}\,, \qquad \eta \propto \frac{m_e^{1/2}\, e^2}{(k_B T)^{3/2}}
Note that in the momentum equation we have written the Lagrangian derivative
{\rm d}\vec{u}/{\rm d}t, rather than the Eulerian \partial\vec{u}/\partial t that we obtained earlier in our derivation.
This is allowed, since we had assumed that both (\vec{u}_e \cdot \nabla)\vec{u}_e and (\vec{u}_i \cdot \nabla)\vec{u}_i are small
compared to other terms, which therefore also applies to (\vec{u} \cdot \nabla)\vec{u}.
Note also that, although we have written the above set of MHD equations including
the electric field \vec{E}, this is not an independent dynamical quantity. After all, as
already mentioned above, it follows from \vec{B} and \vec{u} according to

\vec{E} = \frac{c}{4\pi\sigma}\, \nabla \times \vec{B} - \frac{\vec{u}}{c} \times \vec{B}
4⇡ c
In fact, Ohm’s law is not required to close this set of equations as the current density
J~ can be computed directly from the magnetic field B ~ using Ampère’s circuital law
without displacement current; r ⇥ B ~ = (4⇡/c)J. ~
Hence, we see that in the end MHD is actually remarkably similar to the hydrodynamics
of neutral fluids. The 'only' additions are the magnetic field, which adds an
additional pressure and an (anisotropic) tension, and the Coulomb collisions, which
cause Ohmic dissipation and a diffusion of the magnetic field. To further strengthen
the similarities with regular fluid hydrodynamics, note that the induction equation
is very similar to the vorticity equation

\frac{\partial \vec{w}}{\partial t} = \nabla \times (\vec{u} \times \vec{w}) - \nabla \times \left( \frac{\nabla P}{\rho} \right) + \nu\, \nabla^2 \vec{w}

(see Chapter 8). Here \vec{w} is the vorticity and \nu the kinematic viscosity. Except
for the baroclinic term, which is absent in the induction equation, the vorticity and the
magnetic field (in the MHD approximation) behave very similarly (cf. magnetic field
lines and vortex lines).
In analogy with the Reynolds number of hydrodynamics, we define the magnetic Reynolds number
\[ R_m \equiv \frac{U L}{\eta} \]
with $U$ and $L$ the characteristic velocity and length scales of the plasma flow. Recall
(from Chapter 11) the definition of the Reynolds number $\mathcal{R} = U L/\nu$. We have thus
merely replaced the kinematic viscosity with the magnetic diffusivity, which is
proportional to the electric resistivity (and thus inversely proportional to the
conductivity).
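As a rough illustration of these definitions (this example is not from the notes; the diffusivity, velocity, and length scale below are assumed, order-of-magnitude values for a warm-ISM-like plasma), one can estimate $R_m$ and the corresponding diffusion time in Python:

```python
# Order-of-magnitude estimate of the magnetic Reynolds number R_m = U L / eta
# and of the diffusion time t_diff ~ L^2 / eta.  All numbers are assumed values.
eta = 1e7        # magnetic diffusivity [cm^2/s]
U   = 1e6        # characteristic velocity [cm/s]  (~10 km/s)
L   = 3.1e18     # characteristic length  [cm]     (~1 pc)

R_m    = U * L / eta
t_diff = L**2 / eta / 3.15e7          # diffusion time in years

print(f"R_m    ~ {R_m:.1e}")
print(f"t_diff ~ {t_diff:.1e} yr")
# R_m >> 1: the flux-freezing (ideal MHD) limit discussed below applies.
```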
• When $R_m \ll 1$, the second (diffusion) term in the induction equation dominates, and the magnetic field,
left to itself, decays away due to magnetic diffusion. This can be understood from
the fact that magnetic fields are directly related to currents, which die away due to
Ohmic dissipation unless one applies a source of voltage.
• When $R_m \gg 1$, the first term in the induction equation dominates, which therefore
becomes
\[ \frac{\partial\vec{B}}{\partial t} \simeq \nabla\times(\vec{u}\times\vec{B}) \]
This is the situation we typically encounter in astrophysics, where $U$ and $L$ are
large. In the limit of infinite conductivity (i.e., zero electrical resistivity, and thus
zero magnetic diffusivity), the above equation is exact, and we talk of ideal MHD.
Obviously, with infinite conductivity there is also no Ohmic dissipation, and the
energy equation in ideal MHD is therefore identical to that for a neutral fluid. Hence,
for ideal MHD the full, closed set of equations reduces to
Continuity Eq. \qquad $\dfrac{d\rho}{dt} = -\rho\,\nabla\cdot\vec{u}$

Momentum Eqs. \qquad $\rho\,\dfrac{d\vec{u}}{dt} = -\nabla P + \dfrac{1}{c}\,\vec{J}\times\vec{B}$

Energy Eq. \qquad $\rho\,\dfrac{d\varepsilon}{dt} = -P\,\nabla\cdot\vec{u}$

Induction Eq. \qquad $\dfrac{\partial\vec{B}}{\partial t} = \nabla\times(\vec{u}\times\vec{B})$

Ampère's law \qquad $\nabla\times\vec{B} = \dfrac{4\pi}{c}\,\vec{J}$
The reader may wonder what happens to Ohm's law in the limit where $\sigma \rightarrow \infty$. In
order to assure only finite currents, we need to have that $\vec{E} + \frac{\vec{u}}{c}\times\vec{B} = 0$, and thus
$\vec{E} = -\frac{1}{c}\,\vec{u}\times\vec{B}$. However, since $\vec{E}$ is not required (it is not an independent dynamical
quantity), this is of little relevance.
An important implication of ideal MHD is that
\[ \frac{d}{dt}\int_S \vec{B}\cdot d^2\vec{S} = 0 \]
This expresses that the magnetic flux is conserved as it moves with the fluid. This is
known as Alfvén's theorem of flux freezing. It is the equivalent of Helmholtz'
theorem that $d\Gamma/dt = 0$ for an inviscid fluid (with $\Gamma$ the circulation). An implication
is that, in the case of ideal MHD, two fluid elements that are connected by a magnetic
flux line, will remain connected by that same magnetic flux line.
(Ideal) MHD is used to describe many astrophysical processes, from the magnetic
field topology of the Sun, to angular momentum transfer in accretion disks, and from
the formation of jets in accretion disks, to magnetic braking during star formation.
One can also apply linear perturbation theory to the ideal MHD equations, to
examine what happens to a magnetofluid if it is perturbed. If one ignores viscosity,
heat conduction, and electric resistivity (i.e., we are in the ideal MHD regime), then
the resulting dispersion relation is given by
\[ \omega^2\,\vec{u}_1 = (c_s^2 + u_A^2)(\vec{k}\cdot\vec{u}_1)\,\vec{k} + (\vec{u}_A\cdot\vec{k})\left[(\vec{u}_A\cdot\vec{k})\,\vec{u}_1 - (\vec{u}_A\cdot\vec{u}_1)\,\vec{k} - (\vec{k}\cdot\vec{u}_1)\,\vec{u}_A\right] \]
where
\[ \vec{u}_A = \frac{\vec{B}_0}{\sqrt{4\pi\rho_0}} \]
The above dispersion relation, $\omega(\vec{k})$, for given sound speed, $c_s$, and Alfvén velocity,
$\vec{u}_A$, of the magnetofluid, is the basic dispersion relation for hydromagnetic waves.
Although it has a complicated looking form, there is one solution that is rather
simple. It corresponds to a purely transverse wave in which the displacement, and
therefore the velocity perturbation $\vec{u}_1(\vec{x}, t)$, is perpendicular to both the wave vector
$\vec{k}$ and the magnetic field (which is in the direction of $\vec{u}_A$). Under those conditions
the dispersion relation reduces to
\[ \omega^2 = (\vec{u}_A\cdot\vec{k})^2 \]
These waves, called Alfvén waves, have a group velocity $\vec{v}_g = \partial\omega/\partial\vec{k} = \vec{u}_A$ and are
moving, with that velocity, along the magnetic field lines.
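For a feeling for the numbers, the following minimal Python sketch (not part of the notes; the field strength and density are assumed, ISM-like values) evaluates the Alfvén speed $u_A = B/\sqrt{4\pi\rho}$ in cgs units:

```python
import numpy as np

def alfven_speed(B_gauss, n_cm3, mu=1.0):
    """Alfven speed v_A = B / sqrt(4 pi rho) in cm/s (Gaussian units),
    with rho = mu * m_p * n."""
    m_p = 1.6726e-24                      # proton mass [g]
    rho = mu * m_p * n_cm3                # mass density [g/cm^3]
    return B_gauss / np.sqrt(4.0 * np.pi * rho)

# Illustrative (assumed) warm-ISM values: B ~ 5 microGauss, n ~ 1 cm^-3
v_A = alfven_speed(5e-6, 1.0)
print(f"v_A ~ {v_A/1e5:.1f} km/s")        # ~ 11 km/s
```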
Any wave is driven by some restoring force. In the case of acoustic waves these are
pressure gradients, while the restoring force in the case of the plasma oscillations
(Langmuir waves) discussed in the previous chapter arises from the electric field
that results from a separation of electrons and ions. In the case of perturbations to a
magnetofluid, there are two restoring forces that play a role: pressure gradients and
magnetic tension. In the case of Alfvén waves the restoring force is purely the tension
in the magnetic field lines (pressure plays no role). Hence, Alfvén waves are similar
to the waves in a rope or string, which are also transverse waves. The group velocity
of these waves is proportional to $\sqrt{\mathrm{tension}/\mathrm{density}}$. Since the magnetic tension is
given by $B^2/4\pi$, we see that the Alfvén velocity has exactly the same form.
Note that in the case of ideal MHD the resistivity is zero, and there is thus no
diffusion or dissipation of the magnetic fields, which instead are 'frozen' into the
fluid. In the case of resistive MHD (i.e., if the magnetic resistivity is non-zero) the
Alfvén waves will experience damping, thereby transferring the energy stored in the
magnetic wave to random, thermal energy.
Alfvén waves, though, are not the only solution to the dispersion relation given
above. There are two additional solutions, corresponding to fast mode and slow
mode waves. Contrary to the Alfvén waves, the restoring force for these modes is
a combination of magnetic tension and pressure (i.e., they are mixtures of acoustic
and magnetic waves). Without going into any detail, we only mention in closing
that any disturbance of a magnetofluid can be represented as a superposition of the
Alfvén, fast and slow modes.
Supplemental Material
Appendices
Appendix A
Vector Calculus
Vector: $\vec{A} = (a_1, a_2, a_3) = a_1\,\hat{i} + a_2\,\hat{j} + a_3\,\hat{k}$

Amplitude of vector: $|\vec{A}| = \sqrt{a_1^2 + a_2^2 + a_3^2}$

Unit vector: $|\vec{A}| = 1$

Basis: In the above example, the unit vectors $\hat{i}$, $\hat{j}$ and $\hat{k}$ form a vector basis.
Any 3 vectors $\vec{A}$, $\vec{B}$ and $\vec{C}$ can form a vector basis
as long as $\det(\vec{A}, \vec{B}, \vec{C}) \neq 0$.

Determinant:
\[ \det(\vec{A},\vec{B}) = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} = a_1 b_2 - a_2 b_1 \]
\[ \det(\vec{A},\vec{B},\vec{C}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1\begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} + a_2\begin{vmatrix} b_3 & b_1 \\ c_3 & c_1 \end{vmatrix} + a_3\begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \]

Geometrically: $\det(\vec{A},\vec{B}) = \pm$ area of parallelogram, $\det(\vec{A},\vec{B},\vec{C}) = \pm$ volume of parallelepiped

Summation of vectors: $\vec{A} + \vec{B} = \vec{B} + \vec{A} = (a_1 + b_1, a_2 + b_2, a_3 + b_3)$
Einstein Summation Convention:
\[ a_i b_i = \sum_i a_i b_i = a_1 b_1 + a_2 b_2 + a_3 b_3 = \vec{a}\cdot\vec{b} \]
\[ \partial A_i/\partial x_i = \partial A_1/\partial x_1 + \partial A_2/\partial x_2 + \partial A_3/\partial x_3 = \nabla\cdot\vec{A} \]
\[ A_{ii} = A_{11} + A_{22} + A_{33} = \mathrm{Tr}\,A \quad (\mbox{trace of } A) \]
The dot product can be used to
• check orthogonality: two vectors are orthogonal if $\vec{A}\cdot\vec{B} = 0$
• compute the projection of $\vec{B}$ in the direction of $\vec{A}$, which is given by $\vec{A}\cdot\vec{B}/|\vec{A}|$

Cross Product (aka vector product):
\[ \vec{A}\times\vec{B} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \varepsilon_{ijk}\,a_i b_j\,\hat{e}_k \]
\[ |\vec{A}\times\vec{B}| = |\vec{A}|\,|\vec{B}|\sin\theta = |\det(\vec{A},\vec{B})| \]
In addition to the dot product and cross product, there is a third vector product
that one occasionally encounters in dynamics: the tensor product. The tensor product $\vec{A}\vec{B}$ is a tensor of rank two and is called a dyad. It is best to
define a dyad by what it does: it transforms a vector $\vec{C}$ into another vector with the
direction of $\vec{A}$ according to the rule
\[ (\vec{A}\otimes\vec{B})\,\vec{C} = \vec{A}\,(\vec{B}\cdot\vec{C}) \]
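The defining rule of the dyad is easy to verify numerically. The following minimal Python/NumPy sketch (an illustration added here, with arbitrary example vectors) checks that $(\vec{A}\otimes\vec{B})\vec{C} = \vec{A}(\vec{B}\cdot\vec{C})$:

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])
B = np.array([0.0, 1.0, -1.0])
C = np.array([2.0, 0.0, 5.0])

dyad = np.outer(A, B)             # rank-two tensor with components A_i B_j
lhs = dyad @ C                    # (A (x) B) C
rhs = A * np.dot(B, C)            # A (B . C)

print(np.allclose(lhs, rhs))      # True: the dyad maps C onto the direction of A
```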
\[ \vec{A}\cdot\vec{B} = \vec{B}\cdot\vec{A} \qquad\qquad \vec{A}\times\vec{B} = -\vec{B}\times\vec{A} \]
\[ (\alpha\vec{A})\cdot\vec{B} = \alpha(\vec{A}\cdot\vec{B}) = \vec{A}\cdot(\alpha\vec{B}) \qquad\qquad (\alpha\vec{A})\times\vec{B} = \alpha(\vec{A}\times\vec{B}) = \vec{A}\times(\alpha\vec{B}) \]
\[ \vec{A}\cdot(\vec{B}+\vec{C}) = \vec{A}\cdot\vec{B} + \vec{A}\cdot\vec{C} \qquad\qquad \vec{A}\times(\vec{B}+\vec{C}) = \vec{A}\times\vec{B} + \vec{A}\times\vec{C} \]
\[ \vec{A}\cdot\vec{B} = 0 \;\rightarrow\; \vec{A}\perp\vec{B} \qquad\qquad \vec{A}\times\vec{B} = 0 \;\rightarrow\; \vec{A}\parallel\vec{B} \]
\[ \vec{A}\cdot\vec{A} = |\vec{A}|^2 \qquad\qquad \vec{A}\times\vec{A} = 0 \]

Triple Scalar Product:
\[ \vec{A}\cdot(\vec{B}\times\vec{C}) = \det(\vec{A},\vec{B},\vec{C}) = \varepsilon_{ijk}\,a_i b_j c_k \]
\[ \vec{A}\cdot(\vec{B}\times\vec{C}) = 0 \;\rightarrow\; \vec{A}, \vec{B}, \vec{C} \mbox{ are coplanar} \]
\[ \vec{A}\cdot(\vec{B}\times\vec{C}) = \vec{B}\cdot(\vec{C}\times\vec{A}) = \vec{C}\cdot(\vec{A}\times\vec{B}) \]

Triple Vector Product:
\[ \vec{A}\times(\vec{B}\times\vec{C}) = (\vec{A}\cdot\vec{C})\,\vec{B} - (\vec{A}\cdot\vec{B})\,\vec{C} \]
As is clear from the above, $\vec{A}\times(\vec{B}\times\vec{C})$ lies in the plane of $\vec{B}$ and $\vec{C}$.

Useful to remember:
\[ (\vec{A}\times\vec{B})\cdot(\vec{C}\times\vec{D}) = (\vec{A}\cdot\vec{C})(\vec{B}\cdot\vec{D}) - (\vec{A}\cdot\vec{D})(\vec{B}\cdot\vec{C}) \]
\[ (\vec{A}\times\vec{B})\times(\vec{C}\times\vec{D}) = \left[\vec{A}\cdot(\vec{B}\times\vec{D})\right]\vec{C} - \left[\vec{A}\cdot(\vec{B}\times\vec{C})\right]\vec{D} \]
Gradient Operator: $\nabla = \vec{\nabla} = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$
This vector operator is sometimes called the nabla or del operator.

Laplacian operator: $\nabla^2 = \nabla\cdot\nabla = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$

Differential: $f = f(x,y,z) \;\rightarrow\; df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz$
Gradient Vector: $\nabla f = \mathrm{grad}\,f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)$
The gradient vector at $(x,y,z)$ is normal to the level surface
through the point $(x,y,z)$.

Curl of Vector Field:
\[ \mathrm{curl}\,\vec{F} = \nabla\times\vec{F} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ F_x & F_y & F_z \end{vmatrix} \]
A vector field for which $\nabla\times\vec{F} = 0$ is called irrotational or curl-free.
Let $S(\vec{x})$ and $T(\vec{x})$ be scalar fields, and let $\vec{A}(\vec{x})$ and $\vec{B}(\vec{x})$ be vector fields:
\[ \nabla\cdot\vec{A} = \mathrm{div}\,\vec{A} = \mbox{scalar} \qquad \nabla^2\vec{A} = (\nabla\cdot\nabla)\,\vec{A} = \mbox{vector} \qquad \nabla\times\vec{A} = \mathrm{curl}\,\vec{A} = \mbox{vector} \]
\[ \nabla\times(\nabla S) = 0 \qquad (\mbox{curl grad } S = 0) \]
\[ \nabla\cdot(\nabla\times\vec{A}) = 0 \qquad (\mbox{div curl } \vec{A} = 0) \]
\[ \nabla(ST) = S\,\nabla T + T\,\nabla S \]
\[ \nabla\cdot(S\vec{A}) = S\,(\nabla\cdot\vec{A}) + \vec{A}\cdot\nabla S \]
\[ \nabla\times(S\vec{A}) = (\nabla S)\times\vec{A} + S\,(\nabla\times\vec{A}) \]
\[ \nabla\cdot(\vec{A}\times\vec{B}) = \vec{B}\cdot(\nabla\times\vec{A}) - \vec{A}\cdot(\nabla\times\vec{B}) \]
\[ \nabla\times(\vec{A}\times\vec{B}) = \vec{A}\,(\nabla\cdot\vec{B}) - \vec{B}\,(\nabla\cdot\vec{A}) + (\vec{B}\cdot\nabla)\vec{A} - (\vec{A}\cdot\nabla)\vec{B} \]
\[ \nabla(\vec{A}\cdot\vec{B}) = (\vec{A}\cdot\nabla)\vec{B} + (\vec{B}\cdot\nabla)\vec{A} + \vec{A}\times(\nabla\times\vec{B}) + \vec{B}\times(\nabla\times\vec{A}) \]
\[ \vec{A}\times(\nabla\times\vec{A}) = \tfrac{1}{2}\nabla(\vec{A}\cdot\vec{A}) - (\vec{A}\cdot\nabla)\vec{A} \]
\[ \nabla\times(\nabla^2\vec{A}) = \nabla^2(\nabla\times\vec{A}) \]
Appendix B
• $\vec{F}(\vec{x})$ is a gradient field, which means that there is a scalar field $\phi(\vec{x})$ so that $\vec{F} = \nabla\phi$
• Path independence: $\oint_C \vec{F}\cdot d\vec{l} = 0$
• Irrotational = curl-free: $\nabla\times\vec{F} = 0$
Appendix C
Integral Theorems
NOTE: in the first equation we have used that $\nabla\times\vec{F}$ is always pointing in the
direction of the normal $\hat{n}$.
NOTE: the curve of the line integral must have positive orientation, meaning that
$d\vec{l}$ points counterclockwise when the normal of the surface points towards the viewer.
Appendix D
where ~e1 , ~e2 and ~e3 are the unit directional vectors in the new (q1 , q2 , q3 )-coordinate
system. In what follows we show how to properly treat such generalized coordinate
systems.
In general, one expresses the distance between $(q_1, q_2, q_3)$ and $(q_1 + dq_1, q_2 + dq_2, q_3 + dq_3)$ in an arbitrary coordinate system as
\[ ds = \sqrt{h_{ij}\,dq_i\,dq_j} \]
Here $h_{ij}$ is called the metric tensor. In what follows, we will only consider orthogonal
coordinate systems for which $h_{ij} = 0$ if $i \neq j$, so that $ds^2 = h_i^2\,dq_i^2$ (Einstein
summation convention) with $h_i = \sqrt{h_{ii}}$.
An example of an orthogonal coordinate system is the Cartesian coordinate system, for
which $h_{ij} = \delta_{ij}$. After all, the distance between two points separated by the infinitesimal
displacement vector $d\vec{x} = (dx, dy, dz)$ is $ds^2 = |d\vec{x}|^2 = dx^2 + dy^2 + dz^2$.
The coordinates $(x, y, z)$ and $(q_1, q_2, q_3)$ are related to each other via the transformation relations
\[ x = x(q_1, q_2, q_3)\,, \qquad y = y(q_1, q_2, q_3)\,, \qquad z = z(q_1, q_2, q_3) \]
from which the scale factors follow as
\[ h_i = \left|\frac{\partial\vec{x}}{\partial q_i}\right| \]
Using this expression for the metric allows us to write the unit directional vectors
as
\[ \vec{e}_i = \frac{1}{h_i}\frac{\partial\vec{x}}{\partial q_i} \]
and the differential vector in the compact form
\[ d\vec{x} = h_i\,dq_i\,\vec{e}_i \]
From the latter we also have that the infinitesimal volume element for a general
coordinate system is given by
\[ d^3\vec{x} = |h_1 h_2 h_3|\,dq_1\,dq_2\,dq_3 \]
Note that the absolute values are needed to assure that $d^3\vec{x}$ is positive.
Now consider a vector $\vec{A}$. In the Cartesian basis $\mathcal{C} = \{\vec{e}_x, \vec{e}_y, \vec{e}_z\}$ we have that
\[ [\vec{A}]_{\mathcal{C}} = A_x\,\vec{e}_x + A_y\,\vec{e}_y + A_z\,\vec{e}_z \]
In the basis $\mathcal{B} = \{\vec{e}_1, \vec{e}_2, \vec{e}_3\}$, corresponding to our generalized coordinate system,
we instead have that
\[ [\vec{A}]_{\mathcal{B}} = A_1\,\vec{e}_1 + A_2\,\vec{e}_2 + A_3\,\vec{e}_3 \]
We can rewrite the above as
\[ [\vec{A}]_{\mathcal{B}} = A_1\begin{pmatrix} e_{11} \\ e_{12} \\ e_{13}\end{pmatrix} + A_2\begin{pmatrix} e_{21} \\ e_{22} \\ e_{23}\end{pmatrix} + A_3\begin{pmatrix} e_{31} \\ e_{32} \\ e_{33}\end{pmatrix} = \begin{pmatrix} A_1 e_{11} + A_2 e_{21} + A_3 e_{31} \\ A_1 e_{12} + A_2 e_{22} + A_3 e_{32} \\ A_1 e_{13} + A_2 e_{23} + A_3 e_{33}\end{pmatrix} \]
and thus
\[ [\vec{A}]_{\mathcal{B}} = \begin{pmatrix} e_{11} & e_{21} & e_{31} \\ e_{12} & e_{22} & e_{32} \\ e_{13} & e_{23} & e_{33}\end{pmatrix}\begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} \equiv \mathbf{T}\begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} \]
Using similar logic, one can write
\[ [\vec{A}]_{\mathcal{C}} = \begin{pmatrix} e_{x1} & e_{y1} & e_{z1} \\ e_{x2} & e_{y2} & e_{z2} \\ e_{x3} & e_{y3} & e_{z3}\end{pmatrix}\begin{pmatrix} A_x \\ A_y \\ A_z\end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{pmatrix}\begin{pmatrix} A_x \\ A_y \\ A_z\end{pmatrix} = \mathbf{I}\begin{pmatrix} A_x \\ A_y \\ A_z\end{pmatrix} \]
and since $\vec{A}$ is the same object independent of its basis we have that
\[ \mathbf{I}\begin{pmatrix} A_x \\ A_y \\ A_z\end{pmatrix} = \mathbf{T}\begin{pmatrix} A_1 \\ A_2 \\ A_3\end{pmatrix} \]
and thus, we see that the relation between $[\vec{A}]_{\mathcal{B}}$ and $[\vec{A}]_{\mathcal{C}}$ is given by
\[ [\vec{A}]_{\mathcal{C}} = \mathbf{T}\,[\vec{A}]_{\mathcal{B}}\,, \qquad [\vec{A}]_{\mathcal{B}} = \mathbf{T}^{-1}\,[\vec{A}]_{\mathcal{C}} \]
For this reason, $\mathbf{T}$ is called the transformation of basis matrix. Note that the
columns of $\mathbf{T}$ are the unit direction vectors $\vec{e}_i$, i.e., $T_{ij} = e_{ji}$. Since these are
orthogonal to each other, the matrix $\mathbf{T}$ is said to be orthogonal, which implies that
$\mathbf{T}^{-1} = \mathbf{T}^T$ (the inverse is equal to the transpose), and $\det(\mathbf{T}) = \pm 1$.
Now we are finally ready to determine how to write our position vector $\vec{x}$ in the new
basis $\mathcal{B}$ of our generalized coordinate system. Let's write $\vec{x} = a_i\,\vec{e}_i$, i.e.,
\[ [\vec{x}]_{\mathcal{B}} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \]
We started this chapter by pointing out that it is tempting, but wrong, to set $a_i = q_i$
(as for the Cartesian basis). To see this, recall that $|\vec{x}| = \sqrt{(a_1)^2 + (a_2)^2 + (a_3)^2}$,
from which it is immediately clear that each $a_i$ needs to have the dimension of length.
Hence, when $q_i$ is an angle, clearly $a_i \neq q_i$. To compute the actual $a_i$ you need to
use the transformation of basis matrix as follows:
\[ [\vec{x}]_{\mathcal{B}} = \mathbf{T}^{-1}\,[\vec{x}]_{\mathcal{C}} = \begin{pmatrix} e_{11} & e_{12} & e_{13} \\ e_{21} & e_{22} & e_{23} \\ e_{31} & e_{32} & e_{33}\end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} e_{11}x + e_{12}y + e_{13}z \\ e_{21}x + e_{22}y + e_{23}z \\ e_{31}x + e_{32}y + e_{33}z \end{pmatrix} \]
Hence, using our expression for the unit direction vectors, we see that
\[ a_i = \frac{1}{h_i}\frac{\partial x_j}{\partial q_i}\,x_j = \frac{1}{h_i}\left(\frac{\partial\vec{x}}{\partial q_i}\cdot\vec{x}\right) \]
and by operating $d/dt$ on $[\vec{x}]_{\mathcal{B}}$ we find that the corresponding velocity vector in the
$\mathcal{B}$ basis is given by
\[ [\vec{v}]_{\mathcal{B}} = \sum_i h_i\,\dot{q}_i\,\vec{e}_i \]
with $\dot{q}_i = dq_i/dt$. Note that the latter can also be inferred more directly by simply
dividing the expression for the differential vector ($d\vec{x} = h_i\,dq_i\,\vec{e}_i$) by $dt$.
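The scale factors can also be obtained by brute force from the transformation relations. The short SymPy sketch below (an added illustration, using spherical coordinates as the assumed example) computes $h_i = |\partial\vec{x}/\partial q_i|$ symbolically:

```python
import sympy as sp

# Compute the scale factors h_i = |d x_vec / d q_i| for spherical coordinates.
r, th, ph = sp.symbols('r theta phi', positive=True)

x = r * sp.sin(th) * sp.cos(ph)
y = r * sp.sin(th) * sp.sin(ph)
z = r * sp.cos(th)
X = sp.Matrix([x, y, z])

for q in (r, th, ph):
    h = sp.sqrt(sum(sp.diff(c, q)**2 for c in X))
    print(q, sp.simplify(h))
# -> 1, r, r*sin(theta), i.e. h_r = 1, h_theta = r, h_phi = r sin(theta)
```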
Next we write out the gradient, the divergence, the curl and the Laplacian for our
generalized coordinate system:

The gradient:
\[ \nabla = \vec{e}_i\,\frac{1}{h_i}\frac{\partial}{\partial q_i} \]
The divergence:
\[ \nabla\cdot\vec{A} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(h_2 h_3 A_1) + \frac{\partial}{\partial q_2}(h_3 h_1 A_2) + \frac{\partial}{\partial q_3}(h_1 h_2 A_3)\right] \]
The curl:
\[ \nabla\times\vec{A} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\vec{e}_1 & h_2\vec{e}_2 & h_3\vec{e}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ h_1 A_1 & h_2 A_2 & h_3 A_3 \end{vmatrix} \]
The Laplacian:
\[ \nabla^2 = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\!\left(\frac{h_2 h_3}{h_1}\frac{\partial}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{h_3 h_1}{h_2}\frac{\partial}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{h_1 h_2}{h_3}\frac{\partial}{\partial q_3}\right)\right] \]
Vector Calculus in Cylindrical Coordinates:
\[ h_R = 1\,, \qquad h_\phi = R\,, \qquad h_z = 1 \]
\[ A_R = A_x\cos\phi + A_y\sin\phi\,, \qquad A_\phi = -A_x\sin\phi + A_y\cos\phi\,, \qquad A_z = A_z \]
The Gradient:
\[ \nabla = \vec{e}_R\,\frac{\partial}{\partial R} + \vec{e}_\phi\,\frac{1}{R}\frac{\partial}{\partial\phi} + \vec{e}_z\,\frac{\partial}{\partial z} \]
The Divergence:
\[ \nabla\cdot\vec{A} = \frac{1}{R}\frac{\partial(R A_R)}{\partial R} + \frac{1}{R}\frac{\partial A_\phi}{\partial\phi} + \frac{\partial A_z}{\partial z} \]
The Laplacian:
\[ \mbox{scalar}: \quad \nabla^2 = \frac{1}{R}\frac{\partial}{\partial R}\!\left(R\frac{\partial}{\partial R}\right) + \frac{1}{R^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2} \]
\[ \mbox{vector}: \quad \nabla^2\vec{F} = \left(\nabla^2 F_R - \frac{F_R}{R^2} - \frac{2}{R^2}\frac{\partial F_\phi}{\partial\phi}\right)\vec{e}_R + \left(\nabla^2 F_\phi + \frac{2}{R^2}\frac{\partial F_R}{\partial\phi} - \frac{F_\phi}{R^2}\right)\vec{e}_\phi + \nabla^2 F_z\,\vec{e}_z \]
Vector Calculus in Spherical Coordinates:
\[ h_r = 1\,, \qquad h_\theta = r\,, \qquad h_\phi = r\sin\theta \]
\[ \vec{v} = \dot{r}\,\vec{e}_r + r\,\dot{\vec{e}}_r = \dot{r}\,\vec{e}_r + r\,\dot{\theta}\,\vec{e}_\theta + r\sin\theta\,\dot{\phi}\,\vec{e}_\phi \]
The Gradient:
\[ \nabla = \vec{e}_r\,\frac{\partial}{\partial r} + \vec{e}_\theta\,\frac{1}{r}\frac{\partial}{\partial\theta} + \vec{e}_\phi\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi} \]
The Divergence:
\[ \nabla\cdot\vec{A} = \frac{1}{r^2}\frac{\partial(r^2 A_r)}{\partial r} + \frac{1}{r\sin\theta}\frac{\partial(\sin\theta\,A_\theta)}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial A_\phi}{\partial\phi} \]
The Convective Operator:
\[ (\vec{A}\cdot\nabla)\vec{B} = \left(A_r\frac{\partial B_r}{\partial r} + \frac{A_\theta}{r}\frac{\partial B_r}{\partial\theta} + \frac{A_\phi}{r\sin\theta}\frac{\partial B_r}{\partial\phi} - \frac{A_\theta B_\theta + A_\phi B_\phi}{r}\right)\vec{e}_r \]
\[ \qquad + \left(A_r\frac{\partial B_\theta}{\partial r} + \frac{A_\theta}{r}\frac{\partial B_\theta}{\partial\theta} + \frac{A_\phi}{r\sin\theta}\frac{\partial B_\theta}{\partial\phi} + \frac{A_\theta B_r}{r} - \frac{A_\phi B_\phi\cot\theta}{r}\right)\vec{e}_\theta \]
\[ \qquad + \left(A_r\frac{\partial B_\phi}{\partial r} + \frac{A_\theta}{r}\frac{\partial B_\phi}{\partial\theta} + \frac{A_\phi}{r\sin\theta}\frac{\partial B_\phi}{\partial\phi} + \frac{A_\phi B_r}{r} + \frac{A_\phi B_\theta\cot\theta}{r}\right)\vec{e}_\phi \]
The Laplacian:
\[ \mbox{scalar}: \quad \nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2} \]
\[ \mbox{vector}: \quad \nabla^2\vec{F} = \left(\nabla^2 F_r - \frac{2F_r}{r^2} - \frac{2}{r^2\sin\theta}\frac{\partial(F_\theta\sin\theta)}{\partial\theta} - \frac{2}{r^2\sin\theta}\frac{\partial F_\phi}{\partial\phi}\right)\vec{e}_r \]
\[ \qquad + \left(\nabla^2 F_\theta + \frac{2}{r^2}\frac{\partial F_r}{\partial\theta} - \frac{F_\theta}{r^2\sin^2\theta} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial F_\phi}{\partial\phi}\right)\vec{e}_\theta \]
\[ \qquad + \left(\nabla^2 F_\phi + \frac{2}{r^2\sin\theta}\frac{\partial F_r}{\partial\phi} + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial F_\theta}{\partial\phi} - \frac{F_\phi}{r^2\sin^2\theta}\right)\vec{e}_\phi \]
Appendix E
Differential Equations
The equations of fluid dynamics are all differential equations. In order to provide
the necessary background, this appendix gives a very brief overview of the basics.
If the unknown function depends on two or more independent variables, then the
differential equation is a partial differential equation (PDE).
Equation [a] is an ODE of order 1, equation [b] is an ODE of order 2, and equation
[c] is a PDE of order 2.
\[ u' \equiv \frac{du}{dx}\,, \qquad u'' \equiv \frac{d^2u}{dx^2}\,, \qquad u^{(n)} \equiv \frac{d^n u}{dx^n}\,. \]
When the independent variable is time, we often use a dot rather than a prime,
i.e., $\dot{u} = du/dt$, $\ddot{u} = d^2u/dt^2$, etc.
When dealing with PDEs, we use the following shorthand:
\[ u_{,x} \equiv \frac{\partial u}{\partial x}\,, \qquad u_{,xy} \equiv \frac{\partial^2 u}{\partial x\,\partial y}\,, \qquad u_{,tt} \equiv \frac{\partial^2 u}{\partial t^2}\,, \]
etc. Consider the following examples.
Note that in the latter we have adopted the Einstein summation convention.
A differential equation along with subsidiary conditions on the unknown function and
its derivatives, all given at the same value of the independent variable, constitutes
an initial value problem.
If the subsidiary conditions are given at more than one value of the independent
variable, the problem is a boundary value problem and the conditions are boundary
conditions.
• Cauchy boundary conditions: Both the value and the normal derivative of
the dependent variable are specified on the boundary.
Cauchy boundary conditions are analogous to the initial conditions for a second-order
ODE. These are given at one end of the interval only.
Linear and non-linear PDEs: A linear PDE is one that is of first degree in all of
its field variables and partial derivatives.
Equations [a] and [c] are linear, while [b], [d] and [e] are all non-linear.
\[ [a] \quad L(u) = 0 \quad\mbox{with}\quad L := \frac{\partial}{\partial x} + \frac{\partial}{\partial y} \]
\[ [b] \quad L(u) = 0 \quad\mbox{with}\quad L := \left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right)^2 \]
\[ [c] \quad L(u) = x + y \quad\mbox{with}\quad L := \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \]
\[ [d] \quad L(u) = 0 \quad\mbox{with}\quad L := \frac{\partial}{\partial x} + u^2\,\frac{\partial}{\partial y} \]
\[ [e] \quad L(u) = 0 \quad\mbox{with}\quad L := \frac{\partial^2}{\partial x^2} + u\,\frac{\partial^2}{\partial y^2} \]
In (hydro-)dynamics, we typically encounter three types of second-order PDEs, classified
as elliptic, hyperbolic, and parabolic. Each type has certain characteristics
that help determine if a particular finite element approach is appropriate to the problem
being described by the PDE. Interestingly, just knowing the type of PDE can
give us insight into how smooth the solution is, how fast information propagates,
and the effect of initial and boundary conditions.
Consider a second-order PDE for the unknown function $u(x,y)$ of the form
\[ a\,\frac{\partial^2 u}{\partial x^2} + b\,\frac{\partial^2 u}{\partial x\,\partial y} + c\,\frac{\partial^2 u}{\partial y^2} + d\,\frac{\partial u}{\partial x} + e\,\frac{\partial u}{\partial y} + f\,u + g = 0 \]
The PDE is called elliptic if $b^2 - 4ac < 0$, parabolic if $b^2 - 4ac = 0$, and hyperbolic if $b^2 - 4ac > 0$.
Finally, since solving PDEs can often be reduced to solving (sets of) ODEs, a few
words about solving the latter. Problems involving ODEs can always be reduced to
a set of first-order ODEs. For example, the 2nd-order ODE
\[ \frac{d^2u}{dx^2} + s(x)\,\frac{du}{dx} = t(x) \]
can be rewritten as two first-order ODEs
\[ \frac{du}{dx} = v(x)\,, \qquad \frac{dv}{dx} = t(x) - s(x)\,v(x) \]
More generally, consider the linear ODE of order $n$,
\[ \frac{d^n u}{dx^n} = a_{n-1}(x)\,\frac{d^{n-1}u}{dx^{n-1}} + \ldots + a_1(x)\,\frac{du}{dx} + a_0(x)\,u(x) + f(x) \]
with $u(0) = c_0$, $u'(0) = c_1$, $u''(0) = c_2$, ..., $u^{(n-1)}(0) = c_{n-1}$ as the initial values.
In general, this can be written in matrix form as
\[ \frac{d\mathbf{u}}{dx} = \mathbf{A}(x)\,\mathbf{u} + \mathbf{f}(x) \]
with the initial values given by $\mathbf{u}(0) = \mathbf{c}$. Here the elements of $\mathbf{u}$ are given by
$u_1 = u(x)$, $u_2 = u'(x)$, ..., $u_n = u^{(n-1)}(x)$. These are interrelated with the elements
of $\mathbf{u}'$ by the equations $u_1' = u_2$, $u_2' = u_3$, ..., $u_{n-1}' = u_n$, $u_n' = u^{(n)}(x)$. The matrices
$\mathbf{A}(x)$ and $\mathbf{f}(x)$ are related to the $a_i(x)$ and $f(x)$ according to
\[ \mathbf{A}(x) = \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \\ a_0(x) & a_1(x) & a_2(x) & a_3(x) & \cdots & a_{n-1}(x) \end{pmatrix} \]
and
\[ \mathbf{f}(x) = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ f(x) \end{pmatrix} \]
Hence, solving an ODE of order $n$ reduces to solving a set of $n$ coupled first-order
differential equations for the functions $u_i$ ($i = 1, 2, ..., n$) having the general form
\[ \frac{du_i}{dx} = f_i(x, u_1, u_2, ..., u_n) \]
where the functions $f_i$ on the rhs are known.
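As an illustration of this reduction (not part of the notes), the sketch below integrates the 2nd-order example ODE $u'' + s(x)\,u' = t(x)$ with SciPy by treating it as the first-order system derived above; the choices $s(x) = 1$, $t(x) = \sin x$ and the initial values are arbitrary assumptions:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Rewrite u'' + s(x) u' = t(x) as the first-order system u' = v, v' = t(x) - s(x) v.
s = lambda x: 1.0
t = lambda x: np.sin(x)

def rhs(x, y):
    u, v = y
    return [v, t(x) - s(x) * v]

# Integrate from x = 0 to 10 with (assumed) initial values u(0) = 0, u'(0) = 1.
sol = solve_ivp(rhs, (0.0, 10.0), [0.0, 1.0], dense_output=True)
print(sol.y[0, -1])      # u(10)
```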
Appendix F
The Levi-Civita symbol, also known as the permutation symbol or the anti-symmetric
symbol, is a collection of numbers, defined from the sign of a permutation
of the natural numbers $1, 2, 3, ..., n$. It is often encountered in linear algebra,
vector and tensor calculus, and differential geometry.
The $n$-dimensional Levi-Civita symbol is indicated by $\varepsilon_{i_1 i_2 ... i_n}$, where each index
$i_1, i_2, ..., i_n$ takes values $1, 2, ..., n$, and has the defining property that the symbol is
totally antisymmetric in all its indices: when any two indices are interchanged, the
symbol is negated:
\[ \varepsilon_{...i_p...i_q...} = -\varepsilon_{...i_q...i_p...} \]
If any two indices are equal, the symbol is zero, and when all indices are unequal,
we have that
\[ \varepsilon_{i_1 i_2...i_n} = (-1)^p\,\varepsilon_{1,2,...,n} \]
where $p$ is called the parity of the permutation. It is the number of pairwise interchanges
necessary to unscramble $i_1, i_2, ..., i_n$ into the order $1, 2, ..., n$. A permutation
is said to be even (odd) if its parity is an even (odd) number.
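A minimal Python sketch (added here as an illustration) builds the 3D Levi-Civita symbol from the parity rule above and uses it to evaluate a cross product via $(\vec{a}\times\vec{b})_k = \varepsilon_{ijk}\,a_i b_j$:

```python
import numpy as np

# Build eps_ijk for n = 3: +1, -1 or 0 depending on the parity of (i, j, k).
eps = np.zeros((3, 3, 3))
for i in range(3):
    for j in range(3):
        for k in range(3):
            eps[i, j, k] = (i - j) * (j - k) * (k - i) / 2.0

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 4.0])

# (a x b)_k = eps_ijk a_i b_j, via Einstein summation
cross_eps = np.einsum('ijk,i,j->k', eps, a, b)
print(np.allclose(cross_eps, np.cross(a, b)))   # True
```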
Appendix G
\[ e_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right) \]
\[ \xi_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} - \frac{\partial u_j}{\partial x_i}\right) \]
The symmetric part of the deformation tensor, $e_{ij}$, is called the rate of strain
tensor, while the anti-symmetric part, $\xi_{ij}$, expresses the vorticity $\vec{w} \equiv \nabla\times\vec{u}$ in
the velocity field, i.e., $\xi_{ij} = -\frac{1}{2}\varepsilon_{ijk}\,w_k$. Note that one can always find a coordinate
system for which $e_{ij}$ is diagonal. The axes of that coordinate frame indicate the
eigendirections of the strain (compression or stretching) on the fluid element.
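The decomposition into $e_{ij}$ and $\xi_{ij}$ is easily checked numerically. The following minimal NumPy sketch (an added illustration; the velocity-gradient matrix is an arbitrary assumed example) splits $T_{ij} = \partial u_i/\partial x_j$ into its symmetric and antisymmetric parts and compares $\xi_{ij}$ with the vorticity components:

```python
import numpy as np

T = np.array([[0.1, 0.4, 0.0],
              [0.2, 0.0, -0.3],
              [0.0, 0.1, 0.2]])      # T[i, j] = du_i/dx_j  (assumed values)

e  = 0.5 * (T + T.T)                 # rate-of-strain tensor e_ij
xi = 0.5 * (T - T.T)                 # antisymmetric part xi_ij

# vorticity components w_k = eps_kij du_j/dx_i
w = np.array([T[2, 1] - T[1, 2],
              T[0, 2] - T[2, 0],
              T[1, 0] - T[0, 1]])

print(np.allclose(T, e + xi))        # True: T splits into symmetric + antisymmetric
print(xi[0, 1], -0.5 * w[2])         # xi_12 = -(1/2) w_3
```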
In terms of the relation between the viscous stress tensor, ⌧ij , and the deformation
tensor, Tkl , there are a number of properties that are important.
• Locality: the $\tau_{ij}$–$T_{kl}$ relation is said to be local if the stress tensor is only
a function of the deformation tensor and thermodynamic state functions like
temperature.
• Linearity: the $\tau_{ij}$–$T_{kl}$ relation is said to be linear if the relation between
the stress and rate-of-strain is linear. This is equivalent to saying that $\tau_{ij}$ does
not depend on $\nabla^2\vec{u}$ or higher-order derivatives.
\[ T_{ijkl} = \lambda\,\delta_{ij}\,\delta_{kl} + \mu\,(\delta_{ik}\,\delta_{jl} + \delta_{il}\,\delta_{jk}) \]
Note that (in a Newtonian fluid) the viscous stress tensor depends only on the symmetric
component of the deformation tensor (the rate-of-strain tensor $e_{ij}$), but not
on the antisymmetric component which describes vorticity. You can understand
the fact that viscosity and vorticity are unrelated by considering a fluid disk in solid
body rotation (i.e., $\nabla\cdot\vec{u} = 0$ and $\nabla\times\vec{u} = \vec{w} \neq 0$). In such a fluid there is no
"slippage", hence no shear, and therefore no manifestation of viscosity.
Thus far we have derived that the stress tensor, $\sigma_{ij}$, which in principle has 6 unknowns,
can be reduced to a function of three unknowns only ($P$, $\mu$, $\lambda$) as long as
the fluid is Newtonian. Note that these three scalars, in general, are functions of
temperature and density. We now focus on these three scalars in more detail, starting
with the pressure $P$. To be exact, $P$ is the thermodynamic equilibrium pressure,
and is normally computed thermodynamically from some equation of state,
$P = P(\rho, T)$. It is related to the translational kinetic energy of the particles when
the fluid, in equilibrium, has reached equipartition of energy among all its degrees
of freedom, including (in the case of molecules) rotational and vibrational degrees of
freedom.
\[ \sigma_{ij} = -P\,\delta_{ij} + 2\mu\,e_{ij} + \lambda\,e_{kk}\,\delta_{ij} \]
The mean pressure then follows as $P_m = P - \eta\,\nabla\cdot\vec{u}$, where
\[ \eta \equiv \lambda + \frac{2}{3}\mu = \frac{P - P_m}{\nabla\cdot\vec{u}} \]
is the coefficient of bulk viscosity. We can now write the stress tensor as
\[ \sigma_{ij} = -P\,\delta_{ij} + \mu\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\,\delta_{ij}\,\frac{\partial u_k}{\partial x_k}\right) + \eta\,\delta_{ij}\,\frac{\partial u_k}{\partial x_k} \]
This is the full expression for the stress tensor in terms of the coefficients of shear
viscosity, $\mu$, and bulk viscosity, $\eta$.
Appendix H
Equations of State
Ideal Gas: a hypothetical gas that consists of identical point particles (i.e. of zero
volume) that undergo perfectly elastic collisions and for which interparticle forces
can be neglected.
An ideal gas obeys the ideal gas law: P V = N kB T .
Expressed in terms of the gas density $\rho = \mu\,m_p\,N/V$, with $\mu$ the mean weight per particle in units of the proton mass $m_p$, this becomes
\[ P = \frac{k_B T}{\mu\,m_p}\,\rho \]
NOTE: astrophysical gases are often well described by the ideal gas law. Even for a
fully ionized gas, the interparticle forces (Coulomb force) can typically be neglected
(i.e., the potential energies involved are typically $< 10\%$ of the kinetic energies).
The ideal gas law breaks down for dense, cool gases, such as those present in gaseous
planets.
Maxwell-Boltzmann Distribution: the distribution of particle momenta, $\vec{p} = m\vec{v}$,
of an ideal gas follows the Maxwell-Boltzmann distribution
\[ \mathcal{P}(\vec{p})\,d^3\vec{p} = \left(\frac{1}{2\pi m k_B T}\right)^{3/2}\exp\left(-\frac{p^2}{2 m k_B T}\right)d^3\vec{p} \]
where $p^2 = \vec{p}\cdot\vec{p}$. This distribution follows from maximizing entropy under the
following assumptions:
The pressure of the gas is related to the mean particle energy according to
\[ P = \zeta\,n\,\langle E\rangle \]
where $\zeta = 2/3$ for a non-relativistic fluid, $\zeta = 1/3$ for a relativistic fluid, and $\langle E\rangle$
is the average, translational energy of the particles. In the case of our ideal (non-relativistic)
fluid,
\[ \langle E\rangle = \left\langle\frac{p^2}{2m}\right\rangle = \int_0^\infty\frac{p^2}{2m}\,\mathcal{P}(p)\,dp = \frac{3}{2}\,k_B T \]
Hence, we find that the EoS for an ideal gas is indeed given by
\[ P = \frac{2}{3}\,n\,\langle E\rangle = n\,k_B T = \frac{k_B T}{\mu m_p}\,\rho \]
Specific Internal Energy: the internal energy per unit mass for an ideal gas is
\[ \varepsilon = \frac{\langle E\rangle}{\mu m_p} = \frac{3}{2}\frac{k_B T}{\mu m_p} \]
Actually, the above derivation is only valid for a true 'ideal gas', in which the particles
are point particles. More generally,
\[ \varepsilon = \frac{1}{\gamma - 1}\frac{k_B T}{\mu m_p} \]
where $\gamma$ is the adiabatic index, which for an ideal gas is equal to $\gamma = (q+5)/(q+3)$,
with $q$ the internal degrees of freedom of the fluid particles: $q = 0$ for point particles
(resulting in $\gamma = 5/3$), while diatomic particles have $q = 2$ (at sufficiently low
temperatures, such that they only have rotational, and no vibrational, degrees of
freedom). The fact that $q = 2$ in that case arises from the fact that a diatomic
molecule only has two relevant rotation axes; the third axis is the symmetry axis of
the molecule, along which the molecule has negligible (zero in the case of point particles)
moment of inertia. Consequently, rotation around this symmetry axis carries no
energy.
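As a quick numerical illustration (not from the notes; the density, temperature, mean molecular weight and $\gamma$ are assumed, roughly warm-ionized-ISM-like values), the ideal-gas pressure and specific internal energy can be evaluated as follows:

```python
# Ideal-gas pressure and specific internal energy in cgs units.
k_B = 1.3807e-16      # erg/K
m_p = 1.6726e-24      # g

def ideal_gas(rho, T, mu=0.6, gamma=5.0/3.0):
    P   = rho * k_B * T / (mu * m_p)                    # erg/cm^3
    eps = k_B * T / ((gamma - 1.0) * mu * m_p)          # erg/g
    return P, eps

P, eps = ideal_gas(rho=1.67e-24, T=1e4)                 # assumed values
print(f"P   ~ {P:.2e} erg/cm^3")
print(f"eps ~ {eps:.2e} erg/g")
```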
Photon gas: Having discussed the EoS of an ideal gas, we now focus on a gas of
photons. Photons have energy E = h⌫ and momentum p = E/c = h⌫/c, with h the
Planck constant.
Black Body: an idealized physical body that absorbs all incident radiation. A black
body (BB) in thermal equilibrium emits electro-magnetic radiation called black
body radiation.
The spectral number density distribution of BB photons is given by
\[ n_\gamma(\nu, T) = \frac{8\pi\nu^2}{c^3}\,\frac{1}{e^{h\nu/k_B T} - 1} \]
which implies a spectral energy distribution
\[ u(\nu, T) = n_\gamma(\nu, T)\,h\nu = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/k_B T} - 1} \]
and thus an energy density of
\[ u(T) = \int_0^\infty u(\nu, T)\,d\nu = \frac{4\sigma_{\rm SB}}{c}\,T^4 \equiv a_r\,T^4 \]
where
\[ \sigma_{\rm SB} = \frac{2\pi^5 k_B^4}{15 h^3 c^2} \]
is the Stefan-Boltzmann constant and $a_r \simeq 7.6\times 10^{-15}\,{\rm erg}\,{\rm cm}^{-3}\,{\rm K}^{-4}$ is called the
radiation constant.
Radiation Pressure: when the photons are reflected off a wall, or when they
are absorbed and subsequently re-emitted by that wall, they transfer twice their
momentum in the normal direction to that wall. Since photons are relativistic, we
have that the EoS for a photon gas is given by
\[ P = \frac{1}{3}\,n\,\langle E\rangle = \frac{1}{3}\,n\,\langle h\nu\rangle = \frac{1}{3}\,u(T) = \frac{a_r T^4}{3} \]
where we have used that $u(T) = n\,\langle E\rangle$.
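To get a feeling for when radiation pressure matters, the following minimal sketch (an added illustration with assumed, roughly solar-core-like numbers) compares $P = a_r T^4/3$ with the ideal-gas pressure at the same temperature:

```python
# Compare radiation pressure with ideal-gas pressure (cgs units).
a_r = 7.566e-15       # radiation constant [erg cm^-3 K^-4]
k_B = 1.3807e-16      # erg/K
m_p = 1.6726e-24      # g

T, rho, mu = 1.5e7, 150.0, 0.6        # assumed, roughly solar-core-like values

P_rad = a_r * T**4 / 3.0
P_gas = rho * k_B * T / (mu * m_p)

print(f"P_rad / P_gas ~ {P_rad / P_gas:.2e}")   # radiation pressure is sub-dominant here
```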
for quarks). Finally, $\mu$ is called the chemical potential, and is a form of potential
energy that is related (in a complicated way) to the number density and temperature
of the particles (see Appendix L).
Classical limit: In the limit where the mean interparticle separation is much larger
than the de Broglie wavelength of the particles, so that quantum effects (e.g., Heisenberg's
uncertainty principle) can be ignored, the above distribution function of momenta
can be accurately approximated by the Maxwell-Boltzmann distribution.
The smallest phase-space element allowed by quantum mechanics is
\[ \Delta x\,\Delta y\,\Delta z\,\Delta p_x\,\Delta p_y\,\Delta p_z = h^3 \]
Pauli Exclusion Principle: no more than one fermion of a given spin state can
occupy a given phase-space element $h^3$. Hence, for electrons, which have $g = 2$, the
maximum phase-space density is $2/h^3$.
If the gas is fully degenerate, then
\[ V_x\,V_p = \frac{N}{2}\,h^3 \]
Using that $n_e = N/V_x$, we find that
\[ p_F = \left(\frac{3\,n_e}{8\pi}\right)^{1/3} h \]
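As a quick illustration (added here; the electron density is an assumed, white-dwarf-like value), one can evaluate $p_F$ and compare it to $m_e c$ to judge whether the degenerate electrons are relativistic:

```python
import numpy as np

# Fermi momentum p_F = (3 n_e / 8 pi)^(1/3) h, compared to m_e c (cgs units).
h   = 6.626e-27      # erg s
m_e = 9.109e-28      # g
c   = 2.998e10       # cm/s

n_e = 1e30           # electrons per cm^3 (assumed, white-dwarf-like value)

p_F = (3.0 * n_e / (8.0 * np.pi))**(1.0 / 3.0) * h
print(f"p_F / (m_e c) ~ {p_F / (m_e * c):.2f}")
# Values of order unity mark the transition from P ~ rho^(5/3) to the softer P ~ rho^(4/3).
```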
White Dwarfs and the Chandrasekhar limit: White dwarfs are the end-states
of stars with mass low enough that they don’t form a neutron star. When the
pressure support from nuclear fusion in a star comes to a halt, the core will start
to contract until degeneracy pressure kicks in. The star consists of a fully ionized
plasma. Assume for simplicity that the plasma consists purely of hydrogen, so that
the number density of protons is equal to that of electrons: np = ne . Because of
equipartition
\[ \frac{p_p^2}{2 m_p} = \frac{p_e^2}{2 m_e} \]
Since $m_p \gg m_e$ we also have that $p_p \gg p_e$ (in fact $p_p/p_e = \sqrt{m_p/m_e} \simeq 43$).
Consequently, when cooling or compressing the core of a star, the electrons will
become degenerate well before the protons do. Hence, white dwarfs are held up
against collapse by the degeneracy pressure from electrons. Since the electrons
are typically non-relativistic, the EoS of the white dwarf is $P \propto \rho^{5/3}$. If the white
dwarf becomes more and more massive (i.e., because it is accreting mass from a
companion star), the Pauli exclusion principle causes the Fermi momentum, $p_F$, to
increase to relativistic values. This softens the EoS towards $P \propto \rho^{4/3}$. Such an
equation of state is too soft to stabilize the white dwarf against gravitational collapse;
the white dwarf collapses until it becomes a neutron star, at which stage it is
supported against further collapse by the degeneracy pressure from neutrons. This
happens when the mass of the white dwarf reaches $M_{\rm lim} \simeq 1.44\,M_\odot$, the so-called
Chandrasekhar limit.
                   Non-Relativistic            Relativistic
non-degenerate     $P \propto \rho\,T$         $P \propto T^4$
degenerate         $P \propto \rho^{5/3}$      $P \propto \rho^{4/3}$

Summary of equations of state for different kinds of fluids.
Appendix I
Poisson Brackets
Given two functions $A(q_i, p_i)$ and $B(q_i, p_i)$ of the canonical phase-space coordinates
$q_i$ and $p_i$, the Poisson bracket of $A$ and $B$ is defined as
\[ \{A, B\} = \sum_{i=1}^{3N}\left(\frac{\partial A}{\partial q_i}\frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i}\frac{\partial B}{\partial q_i}\right) \]
In vector notation,
\[ \{A, B\} = \sum_{i=1}^{N}\left(\frac{\partial A}{\partial\vec{q}_i}\cdot\frac{\partial B}{\partial\vec{p}_i} - \frac{\partial A}{\partial\vec{p}_i}\cdot\frac{\partial B}{\partial\vec{q}_i}\right) \]
where $\vec{q}_i = (q_{i1}, q_{i2}, q_{i3})$ and $\vec{p}_i = (p_{i1}, p_{i2}, p_{i3})$ and $i$ now indicates a particular particle
($i = 1, 2, ..., N$).
Note that Poisson brackets are anti-symmetric, i.e., $\{A, B\} = -\{B, A\}$, and that
$\{A, A\} = 0$.
For a function $A(q_i, p_i)$ of the phase-space coordinates that does not depend explicitly on time, $dA/dt = \{A, H\}$, with $H$ the Hamiltonian.
To see this, use multi-variable calculus to write out the differential $dA$, write out the
Poisson brackets, and substitute the Hamiltonian equations of motion:
\[ \{A, H\} = \sum_{i=1}^{N}\left(\frac{\partial A}{\partial\vec{q}_i}\cdot\frac{\partial H}{\partial\vec{p}_i} - \frac{\partial A}{\partial\vec{p}_i}\cdot\frac{\partial H}{\partial\vec{q}_i}\right) = \sum_{i=1}^{N}\left(\frac{\partial A}{\partial\vec{q}_i}\cdot\dot{\vec{q}}_i + \frac{\partial A}{\partial\vec{p}_i}\cdot\dot{\vec{p}}_i\right) = \frac{dA}{dt} \]
Hence, a function of the phase-space coordinates whose Poisson bracket with the
Hamiltonian vanishes is conserved along its orbit/path in a static system (static
means that all partial time derivatives are equal to zero). Such a quantity is called an
integral of motion.
As we have seen in Chapter 6, the Liouville equation states that the Lagrangian
time-derivative of the N-point DF vanishes, i.e., $df^{(N)}/dt = 0$. Using the above, we
can write this as
\[ \frac{\partial f^{(N)}}{\partial t} + \{f^{(N)}, H\} = 0 \]
which is yet another way of writing the Liouville equation. We can also use the
Poisson brackets to rewrite the Collisionless Boltzmann equation as
\[ \frac{\partial f}{\partial t} + \{f, H\} = 0 \]
The Poisson brackets are useful for writing equations in more compact form. For
example, using the 6D vector $\vec{w} = (\vec{q}, \vec{p}\,)$, the Hamiltonian equations of motion
are simply given by
\[ \dot{\vec{w}} = \{\vec{w}, H\} \]
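These definitions are easy to check symbolically. The minimal SymPy sketch below (an added illustration for a single degree of freedom, using an assumed 1D harmonic-oscillator Hamiltonian) verifies that $\{H, H\} = 0$ and that $\{q, H\}$ and $\{p, H\}$ reproduce the Hamiltonian equations of motion:

```python
import sympy as sp

q, p, m, w = sp.symbols('q p m omega', positive=True)

def poisson(A, B):
    """Poisson bracket for a single (q, p) pair."""
    return sp.diff(A, q) * sp.diff(B, p) - sp.diff(A, p) * sp.diff(B, q)

H = p**2 / (2 * m) + m * w**2 * q**2 / 2     # assumed 1D harmonic oscillator

print(sp.simplify(poisson(H, H)))            # 0        : {A, A} = 0
print(sp.simplify(poisson(q, H)))            # p/m      : dq/dt
print(sp.simplify(poisson(p, H)))            # -m w^2 q : dp/dt
```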
Appendix J
In this Appendix we derive the BBGKY hierarchy of evolution equations for the
$k$-particle distribution function $f^{(k)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_k)$ starting from the Liouville equation
for the $N$-particle distribution function $f^{(N)}(\vec{w}_1, \vec{w}_2, ..., \vec{w}_N)$, where $N > k$. Here
$\vec{w}_i \equiv (\vec{q}_i, \vec{p}_i)$ is the 6D phase-space vector of particle $i$, and the Liouville equation
\[ \frac{df^{(N)}}{dt} = \frac{\partial f^{(N)}}{\partial t} + \{f^{(N)}, H^{(N)}\} = 0 \]
expresses the incompressibility of Hamiltonian flow in $\Gamma$-space. Here we have adopted
the notation based on Poisson brackets (see Appendix I), and we have used the
index '(N)' on the Hamiltonian to emphasize that this is the $N$-particle Hamiltonian
\[ H^{(N)}(\vec{q}_i, \vec{p}_i) = \sum_{i=1}^{N}\frac{\vec{p}_i^{\,2}}{2m} + \sum_{i=1}^{N} V(\vec{q}_i) + \frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} U_{ij} \]
Here V (~q ) is the potential corresponding to an external force, and we have used
Uij as shorthand notation for
the potential energy associated with the two-body interaction between particles i and
j. Note that Uij = Uji . The factor 1/2 in the above expression for the Hamiltonian
is to correct for double-counting of the particle pairs.
We can relate the $N$-particle Hamiltonian, $H^{(N)}$, to the $k$-particle Hamiltonian, $H^{(k)}$,
which is defined in the same way as $H^{(N)}$ but with $N$ replaced by $k < N$, according
to
\[ H^{(N)} = H^{(k)} + H^{(k,N)} + \sum_{i=1}^{k}\sum_{j=k+1}^{N} U_{ij} \]
Here
\[ H^{(k)} = \sum_{i=1}^{k}\frac{\vec{p}_i^{\,2}}{2m} + \sum_{i=1}^{k} V(\vec{q}_i) + \frac{1}{2}\sum_{i=1}^{k}\sum_{\substack{j=1\\ j\neq i}}^{k} U_{ij} \]
and
\[ H^{(k,N)} = \sum_{i=k+1}^{N}\frac{\vec{p}_i^{\,2}}{2m} + \sum_{i=k+1}^{N} V(\vec{q}_i) + \frac{1}{2}\sum_{i=k+1}^{N}\sum_{\substack{j=k+1\\ j\neq i}}^{N} U_{ij} \]
To see this, consider the $U_{ij}$ term, for which we can write
\[ \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} U_{ij} = \sum_{i=1}^{k}\sum_{\substack{j=1\\ j\neq i}}^{N} U_{ij} + \sum_{i=k+1}^{N}\sum_{\substack{j=1\\ j\neq i}}^{N} U_{ij} = \sum_{i=1}^{k}\sum_{\substack{j=1\\ j\neq i}}^{k} U_{ij} + \sum_{i=1}^{k}\sum_{j=k+1}^{N} U_{ij} + \sum_{i=k+1}^{N}\sum_{j=1}^{k} U_{ij} + \sum_{i=k+1}^{N}\sum_{\substack{j=k+1\\ j\neq i}}^{N} U_{ij} \]
The second and third terms are identical (since $U_{ij} = U_{ji}$) so that upon substitution
we obtain the above relation between $H^{(N)}$ and $H^{(k)}$.
Now let's take the Liouville equation and integrate it over the entire phase-space
of particles $k+1$ to $N$:
\[ \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\frac{\partial f^{(N)}}{\partial t} = \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\{H^{(N)}, f^{(N)}\} \]
First the LHS: using that the integration is independent of time, we take the time
derivative outside of the integral, yielding
\[ \frac{\partial}{\partial t}\int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,f^{(N)} = \frac{(N-k)!}{N!}\,\frac{\partial f^{(k)}}{\partial t} \]
where we have made use of the definition of the reduced $k$-particle distribution
function (see Chapter 6). Writing out the Poisson brackets in the RHS, and splitting
the summation over $i$ in two parts, we can write the RHS as the sum of two integrals,
$I_1$ plus $I_2$, where
\[ I_1 = \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\sum_{i=1}^{k}\left(\frac{\partial H^{(N)}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i} - \frac{\partial H^{(N)}}{\partial\vec{p}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{q}_i}\right) \]
and
\[ I_2 = \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\sum_{i=k+1}^{N}\left(\frac{\partial H^{(N)}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i} - \frac{\partial H^{(N)}}{\partial\vec{p}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{q}_i}\right) \]
Integral $I_2$ vanishes. To see this, realize that $\partial H^{(N)}/\partial\vec{q}_i$ is independent of $\vec{p}_i$ and
$\partial H^{(N)}/\partial\vec{p}_i$ is independent of $\vec{q}_i$ (this follows from the definition of the Hamiltonian).
Because of this, each term in $I_2$ can be cast in the form
\[ \int_{-\infty}^{+\infty}dx\int_{-\infty}^{+\infty}dy\;g(x)\,\frac{\partial f(x,y)}{\partial y} = \int_{-\infty}^{+\infty}dx\;g(x)\,\Big[f(x,y)\Big]_{y=-\infty}^{y=+\infty} \]
i.e., these turn into surface integrals, and since $f^{(N)}(\vec{q}_i, \vec{p}_i) = 0$ in the limits $|\vec{q}_i|\rightarrow\infty$
(systems are of finite extent) and $|\vec{p}_i|\rightarrow\infty$ (particles have finite speed), we see that
$I_2$ must vanish.
In order to compute $I_1$, we first write $H^{(N)}$ in terms of $H^{(k)}$ and $H^{(k,N)}$ as indicated
above. This allows us to split the result in three terms, $I_{1a}$, $I_{1b}$ and $I_{1c}$, given by
\[ I_{1a} = \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\sum_{i=1}^{k}\left(\frac{\partial H^{(k)}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i} - \frac{\partial H^{(k)}}{\partial\vec{p}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{q}_i}\right), \]
\[ I_{1b} = \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\sum_{i=1}^{k}\left(\frac{\partial H^{(k,N)}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i} - \frac{\partial H^{(k,N)}}{\partial\vec{p}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{q}_i}\right), \]
and
\[ I_{1c} = \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\sum_{i=1}^{k}\left(\frac{\partial}{\partial\vec{q}_i}\Big[\sum_{l=1}^{k}\sum_{j=k+1}^{N} U_{lj}\Big]\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i} - \frac{\partial}{\partial\vec{p}_i}\Big[\sum_{l=1}^{k}\sum_{j=k+1}^{N} U_{lj}\Big]\cdot\frac{\partial f^{(N)}}{\partial\vec{q}_i}\right) \]
We now examine each of these in turn. Starting with $I_{1a}$: the operator acting on
$f^{(N)}$ is independent of the integration variables, such that we can take it outside of
the integral. Hence, we have that
\[ I_{1a} = \sum_{i=1}^{k}\left(\frac{\partial H^{(k)}}{\partial\vec{q}_i}\cdot\frac{\partial}{\partial\vec{p}_i} - \frac{\partial H^{(k)}}{\partial\vec{p}_i}\cdot\frac{\partial}{\partial\vec{q}_i}\right)\int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,f^{(N)} \]
Using the definition of the reduced $k$-particle distribution function, this can be written as
\[ I_{1a} = \frac{(N-k)!}{N!}\sum_{i=1}^{k}\left(\frac{\partial H^{(k)}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(k)}}{\partial\vec{p}_i} - \frac{\partial H^{(k)}}{\partial\vec{p}_i}\cdot\frac{\partial f^{(k)}}{\partial\vec{q}_i}\right) = \frac{(N-k)!}{N!}\,\{H^{(k)}, f^{(k)}\} \]
where we have made use of the definition of the Poisson brackets (see Appendix I).
Next up is $I_{1b}$. It is clear that this integral must vanish, since both $\partial H^{(k,N)}/\partial\vec{q}_i$
and $\partial H^{(k,N)}/\partial\vec{p}_i$ are equal to zero. After all, the index $i$ runs from 1 to $k$, and the
phase-space coordinates of those particles do not appear in $H^{(k,N)}$. This leaves $I_{1c}$;
since $U_{lj}$ is independent of momentum, the second term within the brackets vanishes,
leaving only
\[ I_{1c} = \int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,\sum_{i=1}^{k}\sum_{j=k+1}^{N}\left(\frac{\partial U_{ij}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i}\right) \]
Upon inspection, you can see that each term of the $j$-summation is equal (this follows
from the fact that we integrate over all of $\vec{q}_j$ for each $j = k+1, ..., N$). Hence, since
there are $N-k$ such terms we have that
\[ I_{1c} = (N-k)\sum_{i=1}^{k}\int\prod_{n=k+1}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\left(\frac{\partial U_{i,k+1}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(N)}}{\partial\vec{p}_i}\right) \]
\[ \phantom{I_{1c}} = (N-k)\sum_{i=1}^{k}\int d^3\vec{q}_{k+1}\,d^3\vec{p}_{k+1}\left(\frac{\partial U_{i,k+1}}{\partial\vec{q}_i}\cdot\frac{\partial}{\partial\vec{p}_i}\right)\int\prod_{n=k+2}^{N} d^3\vec{q}_n\,d^3\vec{p}_n\,f^{(N)} \]
\[ \phantom{I_{1c}} = (N-k)\,\frac{(N-k-1)!}{N!}\sum_{i=1}^{k}\int d^3\vec{q}_{k+1}\,d^3\vec{p}_{k+1}\left(\frac{\partial U_{i,k+1}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(k+1)}}{\partial\vec{p}_i}\right) \]
\[ \phantom{I_{1c}} = \frac{(N-k)!}{N!}\sum_{i=1}^{k}\int d^3\vec{q}_{k+1}\,d^3\vec{p}_{k+1}\left(\frac{\partial U_{i,k+1}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(k+1)}}{\partial\vec{p}_i}\right) \]
where as before we have taken the operator outside of the integral, and we have used
the definition of the reduced distribution functions.
Combining everything, we obtain our final expression for the evolution of the reduced
$k$-particle distribution function
\[ \frac{\partial f^{(k)}}{\partial t} = \{H^{(k)}, f^{(k)}\} + \sum_{i=1}^{k}\int d^3\vec{q}_{k+1}\,d^3\vec{p}_{k+1}\left(\frac{\partial U_{i,k+1}}{\partial\vec{q}_i}\cdot\frac{\partial f^{(k+1)}}{\partial\vec{p}_i}\right) \]
Note that the evolution of $f^{(k)}$ thus depends on $f^{(k+1)}$, such that the above expression
represents a set of $N$ coupled equations, known as the BBGKY hierarchy. Note
also that this derivation is completely general; the ONLY assumption we have made
along the way is that the dynamics are Hamiltonian!
Appendix K
The energy equation can be obtained from the master moment equation
\[ \frac{\partial}{\partial t}\left[n\,\langle Q\rangle\right] + \frac{\partial}{\partial x_i}\left[n\,\langle Q\,v_i\rangle\right] + n\left\langle\frac{\partial\Phi}{\partial x_i}\frac{\partial Q}{\partial v_i}\right\rangle = 0 \]
by substituting
\[ Q = \frac{1}{2}m v^2 = \frac{m}{2}\,v_i v_i = \frac{m}{2}\,(u_i + w_i)(u_i + w_i) = \frac{m}{2}\,(u^2 + 2 u_i w_i + w^2) \]
Hence, we have that $\langle Q\rangle = \frac{1}{2}m u^2 + \frac{1}{2}m\langle w^2\rangle$, where we have used that $\langle u_i\rangle = u_i$ and
$\langle w_i\rangle = 0$. Using that $\rho = m\,n$, the first term in the master moment equation thus
becomes
\[ \frac{\partial}{\partial t}\left[n\,\langle Q\rangle\right] = \frac{\partial}{\partial t}\left[\rho\,\frac{u^2}{2} + \rho\,\varepsilon\right] \]
where we have used that the specific internal energy $\varepsilon = \frac{1}{2}\langle w^2\rangle$. For the second term,
we use that
\[ n\,\langle v_k\,Q\rangle = \frac{\rho}{2}\,\langle(u_k + w_k)(u^2 + 2 u_i w_i + w^2)\rangle = \frac{\rho}{2}\,\langle u^2 u_k + 2 u_i u_k w_i + w^2 u_k + u^2 w_k + 2 u_i w_i w_k + w^2 w_k\rangle \]
\[ \phantom{n\,\langle v_k\,Q\rangle} = \frac{\rho}{2}\left[u^2 u_k + u_k\langle w^2\rangle + 2 u_i\langle w_i w_k\rangle + \langle w^2 w_k\rangle\right] = \rho\,\frac{u^2}{2}\,u_k + \rho\,\varepsilon\,u_k + \rho\,u_i\langle w_i w_k\rangle + F_{{\rm cond},k} \]
Here we use that the conductivity can be written as
\[ F_{{\rm cond},k} \equiv \frac{1}{2}\,\rho\,\langle w_k\,w^2\rangle = \langle\rho\,\varepsilon\,w_k\rangle \]
(see Chapter 5). Recall that conduction describes how internal energy is dispersed
due to the random motion of the fluid particles. Using that $\rho\,\langle w_i w_k\rangle = -\sigma_{ik} =
P\,\delta_{ik} - \tau_{ik}$, the second term of the master moment equation becomes
\[ \frac{\partial}{\partial x_k}\left[n\,\langle v_k\,Q\rangle\right] = \frac{\partial}{\partial x_k}\left[\rho\,\frac{u^2}{2}\,u_k + \rho\,\varepsilon\,u_k + (P\,\delta_{ik} - \tau_{ik})\,u_i + F_{{\rm cond},k}\right] \]
Finally, for the third term we use that
\[ \frac{\partial Q}{\partial v_k} = \frac{m}{2}\frac{\partial v^2}{\partial v_k} = m\,v_k \]
To understand the last step, note that in Cartesian coordinates $v^2 = v_x^2 + v_y^2 + v_z^2$.
Hence, we have that
\[ n\left\langle\frac{\partial\Phi}{\partial x_k}\frac{\partial Q}{\partial v_k}\right\rangle = \rho\,\frac{\partial\Phi}{\partial x_k}\,\langle v_k\rangle = \rho\,\frac{\partial\Phi}{\partial x_k}\,u_k \]
Combining the three terms in the master moment equation, we finally obtain the
following form of the energy equation:
\[ \frac{\partial}{\partial t}\left[\rho\left(\frac{u^2}{2} + \varepsilon\right)\right] = -\frac{\partial}{\partial x_k}\left[\rho\left(\frac{u^2}{2} + \varepsilon\right)u_k + (P\,\delta_{jk} - \tau_{jk})\,u_j + F_{{\rm cond},k}\right] - \rho\,u_k\,\frac{\partial\Phi}{\partial x_k} \]
Note that there is no $\mathcal{L}$ term, which is absent from the derivation based on the
Boltzmann equation, since the latter does not include the effects of radiation.
Finally, we want to recast the above energy equation in a form that describes the
evolution of the internal energy, $\varepsilon$. This is obtained by subtracting $u_i$ times the
Navier-Stokes equation in conservative, Eulerian form from the energy equation
derived above.
The Navier-Stokes equation in conservative, Eulerian form reads
\[ \frac{\partial(\rho u_i)}{\partial t} + \frac{\partial}{\partial x_k}(\rho\,u_i u_k) = \frac{\partial\sigma_{ik}}{\partial x_k} - \rho\,\frac{\partial\Phi}{\partial x_i} \]
Next we multiply this equation with $u_i$. Using that
\[ u_i\,\frac{\partial(\rho u_i)}{\partial t} = \rho\,u_i\frac{\partial u_i}{\partial t} + u^2\,\frac{\partial\rho}{\partial t} = \frac{\partial}{\partial t}\left(\rho\,\frac{u^2}{2}\right) + \frac{u^2}{2}\frac{\partial\rho}{\partial t} \]
where we have used that $\partial u^2/\partial t = 2 u_i\,\partial u_i/\partial t$. Similarly, we have that
\[ u_i\,\frac{\partial}{\partial x_k}(\rho\,u_i u_k) = \rho\,u_k\,u_i\frac{\partial u_i}{\partial x_k} + u^2\,\frac{\partial(\rho u_k)}{\partial x_k} = \frac{\partial}{\partial x_k}\left(\rho\,\frac{u^2}{2}\,u_k\right) + \frac{u^2}{2}\frac{\partial(\rho u_k)}{\partial x_k} \]
Combining the above two terms, and using the continuity equation to dispose of
the two terms containing the factor $u^2/2$, the Navier-Stokes equation in conservative
form multiplied by $u_i$ becomes
\[ \frac{\partial}{\partial t}\left(\rho\,\frac{u^2}{2}\right) + \frac{\partial}{\partial x_k}\left(\rho\,\frac{u^2}{2}\,u_k\right) = u_i\,\frac{\partial\sigma_{ik}}{\partial x_k} - \rho\,u_i\,\frac{\partial\Phi}{\partial x_i} \]
Subtracting this from the energy equation derived above ultimately yields
\[ \frac{\partial}{\partial t}(\rho\,\varepsilon) + \frac{\partial}{\partial x_k}(\rho\,\varepsilon\,u_k) = -P\,\frac{\partial u_k}{\partial x_k} + V - \frac{\partial F_{{\rm cond},k}}{\partial x_k} \]
where
\[ V \equiv \tau_{ik}\,\frac{\partial u_i}{\partial x_k} \]
is the rate of viscous dissipation, describing the rate at which heat is added to
the internal energy budget via the viscous conversion of ordered motion ($\vec{u}$) to disordered
energy in random particle motions ($\vec{w}$).
Next we split the first term in $\rho\,(\partial\varepsilon/\partial t) + \varepsilon\,(\partial\rho/\partial t)$ and the second term in $\rho\,u_k\,(\partial\varepsilon/\partial x_k) +
\varepsilon\,(\partial\rho u_k/\partial x_k)$, and use the continuity equation to obtain the (internal) energy equation
in Lagrangian index form:
\[ \rho\,\frac{d\varepsilon}{dt} = -P\,\frac{\partial u_k}{\partial x_k} + V - \frac{\partial F_{{\rm cond},k}}{\partial x_k} \]
Appendix L
Consider a system which can exchange energy and particles with a reservoir, and
the volume of which can change. There are three ways for this system to increase
its internal energy: heating, changing the system's volume (i.e., doing work on the
system), or adding particles. Hence,
\[ dU = T\,dS - P\,dV + \mu\,dN \]
Note that this is the first law of thermodynamics, but now with the added possibility
of changing the number of particles of the system. The scalar quantity $\mu$ is called
the chemical potential, and is defined by
\[ \mu = \left(\frac{\partial U}{\partial N}\right)_{S,V} \]
This is not to be confused with the µ used to denote the mean weight per particle,
which ALWAYS appears in combination with the proton mass, mp . As is evident
from the above expression, the chemical potential quantifies how the internal energy
of the system changes if particles are added or removed, while keeping the entropy
and volume of the system fixed. The chemical potential appears in the Fermi-Dirac
(Bose-Einstein) distribution describing the momentum distribution of a gas of fermions (bosons).
Consider an ideal gas, of volume $V$, entropy $S$ and with internal energy $U$. Now
imagine adding a particle of zero energy ($\epsilon = 0$), while keeping the volume fixed.
Since $\epsilon = 0$, we also have that $dU = 0$. But what about the entropy? Well, we
have increased the number of ways in which we can redistribute the energy $U$ (a
macrostate quantity) over the different particles (different microstates). Hence, by
adding this particle we have increased the system's entropy. If we want to add a
particle while keeping $S$ fixed, we need to decrease $U$ to offset the increase in the
number of 'degrees of freedom' over which to distribute this energy. Hence, keeping
$S$ (and $V$) fixed requires that the particle has negative energy, and we thus see that
$\mu < 0$.
For a fully degenerate Fermi gas, we have that $T = 0$, and thus $S = 0$ (i.e., there is
only one microstate associated with this macrostate, and that is the fully degenerate
one). If we now add a particle, and demand that we keep $S = 0$, then that particle
must have the Fermi energy (see Chapter 13); $\epsilon = E_F$. Hence, for a fully degenerate
gas, $\mu = E_F$.
To end this discussion of the chemical potential, we address the origin of its name,
which may, at first, seem weird. Let’s start with the ‘potential’ part. The origin of
this name is clear from the following. According to its definition (see above), the
chemical potential is the ‘internal energy’ per unit amount (moles). Now consider
the following correspondences:
These examples make it clear why µ is considered a ‘potential’. Finally, the word
chemical arises from the fact that the µ plays an important role in chemistry (i.e.,
when considering systems in which chemical reactions take place, which change the
particles). In this respect, it is important to be aware of the fact that µ is an
additive quantity that is conserved in a chemical reaction. Hence, for a chemical
reaction $i + j \rightarrow k + l$ one has that $\mu_i + \mu_j = \mu_k + \mu_l$. As an example, consider the
annihilation of an electron and a positron into two photons. Using that µ = 0 for
photons, we see that the chemical potential of elementary particles (i.e., electrons)
must be opposite to that of their anti-particles (i.e., positrons).
Because of the additive nature of the chemical potential, we also have that the
above equation for $dU$ changes slightly whenever the gas consists of different particle
species; it becomes
\[ dU = T\,dS - P\,dV + \sum_i\mu_i\,dN_i \]
where the summation is over all species $i$. If the gas consists of equal numbers
of elementary particles and anti-particles, then the total chemical potential of the
system will be equal to zero. In fact, in many treatments of fluid dynamics it may be
assumed that $\sum_i\mu_i\,dN_i = 0$; in particular when the relevant reactions are 'frozen'
(i.e., occur on timescales $\tau_{\rm react}$ that are much longer than the dynamical timescales
$\tau_{\rm dyn}$ of interest), so that $dN_i = 0$, or if the reactions go so fast ($\tau_{\rm react} \ll \tau_{\rm dyn}$)
that each reaction and its inverse are in local thermodynamic equilibrium, in which
case $\sum_i\mu_i\,dN_i = 0$ for those species involved in the reaction. Only in the rare,
intermediate case when $\tau_{\rm react} \sim \tau_{\rm dyn}$ is it important to keep track of the relative
abundances of the various chemical and/or nuclear species.