Phy332 Notes
$$ h\nu = E_2 - E_1 , \tag{1.1} $$
where $E_1$ and $E_2$ are the energies of the lower and upper levels of the atom respectively.
Spectroscopists measure the wavelength of the photon, and hence deduce energy differences. The
absolute energies are determined by fixing one of the levels (normally the ground state) by other methods,
e.g. by measurement of the ionization potential.
[Figure omitted: two energy-level diagrams with levels $E_1$ and $E_2$, showing absorption (left) and emission (right) of a photon of energy $h\nu$.]
Figure 1.1: Absorption and emission transitions.
2 CHAPTER 1. INTRODUCTION AND REVISION OF HYDROGEN
Energy scale        | Energy (eV)           | Energy (cm$^{-1}$) | Contributing effects
--------------------|-----------------------|--------------------|------------------------------
Gross structure     | 1 – 10                | $10^4$ – $10^5$    | electron–nuclear attraction,
                    |                       |                    | electron-electron repulsion,
                    |                       |                    | electron kinetic energy
Fine structure      | 0.001 – 0.01          | 10 – 100           | spin-orbit interaction,
                    |                       |                    | relativistic corrections
Hyperfine structure | $10^{-6}$ – $10^{-5}$ | 0.01 – 0.1         | nuclear interactions
Table 1.1: Rough energy scales for the different interactions that occur within atoms.
[Figure omitted: schematic spectra plotted against wavelength $\lambda$ across the ultraviolet, visible and infrared regions. With increasing spectral resolution the lines reveal, in turn, gross structure, fine structure and hyperfine structure.]
Figure 1.2: Hierarchy of spectral lines observed with increasing spectral resolution.
1.2 Energy units
Atomic energies are frequently quoted in electron volts (eV). 1 eV is the energy acquired by an electron
when it is accelerated by a voltage of 1 Volt. Thus 1 eV = $1.6 \times 10^{-19}$ J. This is a convenient unit,
because the energies of the electrons in atoms are typically a few eV.
Atomic energies are also often expressed in wave number units (cm$^{-1}$). The wave number $\bar{\nu}$ is the
reciprocal of the wavelength of the photon with energy $E$. It is defined as follows:
$$ \bar{\nu} = \frac{1}{\lambda \ \text{(in cm)}} = \frac{\nu}{c} = \frac{E}{hc} . \tag{1.2} $$
Note that the wavelength should be worked out in cm. Thus 1 eV = $(e/hc)$ cm$^{-1}$ = 8066 cm$^{-1}$.
Wave number units are particularly convenient for atomic spectroscopy. This is because they dispense
with the need to introduce fundamental constants in our calculation of the wavelength. Thus the
wavelength of the radiation emitted in a transition between two levels is simply given by:
$$ \frac{1}{\lambda} = \bar{\nu}_2 - \bar{\nu}_1 , \tag{1.3} $$
where $\bar{\nu}_1$ and $\bar{\nu}_2$ are the energies of levels 1 and 2 in cm$^{-1}$ units, and $\lambda$ is measured in cm.
1.3 Energy scales in atoms
In atomic physics it is traditional to order the interactions that occur inside the atom into a three-level
hierarchy according to the scheme summarized in Table 1.1. The effect of this hierarchy on the observed
atomic spectra is illustrated schematically in Fig. 1.2.
1.4. THE BOHR MODEL OF HYDROGEN 3
1.3.1 Gross structure
The first level of the hierarchy is called the gross structure, and covers the largest interactions within
the atom, namely:
• the kinetic energy of the electrons in their orbits around the nucleus;
• the attractive electrostatic potential between the positive nucleus and the negative electrons;
• the repulsive electrostatic interaction between the different electrons in a multi-electron atom.
The size of these interactions gives rise to energies in the 1–10 eV range and upwards. They thus determine
whether the photon that is emitted is in the infrared, visible, ultraviolet or X-ray spectral regions, and
more specifically, whether it is violet, blue, green, yellow, orange or red for the case of a visible transition.
1.3.2 Fine structure
Close inspection of the spectral lines of atoms reveals that they often come as multiplets. For example,
the strong yellow line of sodium that is used in street lamps is actually a doublet: there are two lines with
wavelengths of 589.0 nm and 589.6 nm. This tells us that there are smaller interactions going on inside
the atom in addition to the gross structure effects. The gross structure interactions determine that the
emission line is yellow, but fine structure effects cause the splitting into the doublet. In the case of the
sodium yellow line, the fine structure energy splitting is $2.1 \times 10^{-3}$ eV or 17 cm$^{-1}$.
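The quoted splitting can be recovered directly from the two wavelengths. The snippet below is a rough check rather than a precise calculation: it uses the rounded wavelengths given in the text, hand-entered constants, and a helper name chosen for this example.

```python
H = 6.62607015e-34          # Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
E_CHARGE = 1.602176634e-19  # joules per electron volt

def wavenumber_from_nm(wavelength_nm):
    """Wave number in cm^-1 of a photon whose wavelength is given in nm."""
    wavelength_cm = wavelength_nm * 1e-7
    return 1.0 / wavelength_cm

# Sodium doublet wavelengths quoted in the text
split_cm = wavenumber_from_nm(589.0) - wavenumber_from_nm(589.6)
split_ev = split_cm * H * C * 100.0 / E_CHARGE  # E = hc * nu_bar, with hc in J cm

print(f"splitting = {split_cm:.1f} cm^-1 = {split_ev:.2e} eV")
```

Working in wave numbers makes the subtraction trivial, which is the convenience claimed for these units in Section 1.2.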
Fine structure arises from the spin-orbit interaction. Electrons in orbit around the nucleus are
equivalent to current loops, which give rise to atomic magnetism. The magnitude of the magnetic dipole
moment of the electron is typically of the order of the Bohr magneton $\mu_B$:
$$ \mu_B = \frac{e\hbar}{2m_e} = 9.27 \times 10^{-24} \ \text{J T}^{-1} . \tag{1.4} $$
The atomic dipoles generate strong magnetic fields within the atom, and the spin of the electron can
then interact with the internal field. This produces small shifts in the energies, which can be worked out
by measuring the fine structure in the spectra. In this way we can learn about the way the spin and
the orbital motion of the atom couple together. In more advanced theories of the atom (e.g. the Dirac
theory), it becomes apparent that the spin-orbit interaction is actually a relativistic effect.
1.3.3 Hyperfine structure
Even closer inspection of the spectral lines with a very high resolution spectrometer reveals that the
fine-structure lines are themselves split into more multiplets. The interactions that cause these splittings
are called hyperfine interactions.
The hyperfine interactions are caused by the interactions between the electrons and the nucleus. The
nucleus has a small magnetic moment of magnitude $\sim \mu_B/2000$ due to the nuclear spin. This can interact
with the magnetic field due to the orbital motion of the electron just as in spin-orbit coupling. This gives
rise to shifts in the atomic energies that are about 2000 times smaller than the fine structure shifts. The
well-known 21 cm line of radio astronomy is caused by transitions between the hyperfine levels of atomic
hydrogen. The photon energy in this case is $6 \times 10^{-6}$ eV, or 0.05 cm$^{-1}$.
1.4 The Bohr model of hydrogen
The Bohr model of hydrogen is part of the old (i.e. pre-quantum mechanics) quantum theory of the
atom. It includes the quantization of energy and angular momentum, but uses classical mechanics to
describe the motion of the electron. With the advent of quantum mechanics, we realize that this is an
inconsistent approach, and therefore should not be pushed too far. Nevertheless, the Bohr model does
give the correct quantised energy levels of hydrogen, and also gives a useful parameter (the Bohr radius)
for quantifying the size of atoms. Hence it remains a useful starting point for understanding the basic
structure of atoms.
In 1911 Rutherford discovered the nucleus, which led to the idea of atoms consisting of electrons in
classical orbits in which the central forces are provided by the Coulomb attraction to the positive nucleus,
as shown in Fig. 1.3. The problem with this idea is that the electron in the orbit is constantly accelerating.
Accelerating charges emit radiation called bremsstrahlung, and so the electrons should be radiating all
the time. This would reduce the energy of the electron, and so it would gradually spiral into the nucleus,
like an old satellite crashing to the earth. In 1913 Bohr resolved this issue by postulating that:
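[Figure omitted: an electron of charge $-e$ moving with velocity $v$ in a circular orbit of radius $r$ around a nucleus of charge $+Ze$, with the force $F$ directed towards the nucleus.]
Figure 1.3: The Bohr model of the atom considers the electrons to be in orbit around the nucleus. The
central force is provided by the Coulomb attraction. The angular momentum of the electron is quantized
in integer units of $\hbar$.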
• The angular momentum $L$ of the electron is quantized in units of $\hbar$ ($\hbar = h/2\pi$):
$$ L = n\hbar , \tag{1.5} $$
where $n$ is an integer.
• The atomic orbits are stable, and light is only emitted or absorbed when the electron jumps from
one orbit to another.
When Bohr made these hypotheses in 1913, they had no justification other than their success in predicting
the energy spectrum of hydrogen. With hindsight, we realize that the first assumption is equivalent to
stating that the circumference of the orbit must correspond to a fixed number of de Broglie wavelengths:
$$ 2\pi r = \text{integer} \times \lambda_{\text{deB}} = n \times \frac{h}{p} = n \times \frac{h}{mv} , \tag{1.6} $$
which can be rearranged to give
$$ L \equiv mvr = n \times \frac{h}{2\pi} . \tag{1.7} $$
The second assumption is a consequence of the fact that the Schrödinger equation leads to time-independent
solutions (eigenstates).
The derivation of the quantized energy levels proceeds as follows. Consider an electron orbiting a
nucleus of mass $m_N$ and charge $+Ze$. The central force is provided by the Coulomb force:
$$ F = \frac{mv^2}{r} = \frac{Ze^2}{4\pi\epsilon_0 r^2} . \tag{1.8} $$
As with all two-body orbit systems, the mass $m$ that enters here is the reduced mass:
$$ \frac{1}{m} = \frac{1}{m_e} + \frac{1}{m_N} , \tag{1.9} $$
where $m_e$ and $m_N$ are the masses of the electron and the nucleus, respectively. The energy is given by (see footnote 1):
$$ E_n = \text{kinetic energy} + \text{potential energy}
       = \frac{1}{2}mv^2 - \frac{Ze^2}{4\pi\epsilon_0 r}
       = -\frac{mZ^2e^4}{8\epsilon_0^2 h^2 n^2} , \tag{1.10} $$
where we made use of eqns 1.7 and 1.8 to solve for $v$ and $r$. This can be written in the form:
$$ E_n = -\frac{R'}{n^2} \tag{1.11} $$
Footnote 1: In atoms the electron moves in free space, where the relative dielectric constant $\epsilon_r$ is equal to unity. However, in
solid-state physics we frequently encounter hydrogenic systems inside crystals where $\epsilon_r$ is not equal to 1. In this case, we
must replace $\epsilon_0$ by $\epsilon_r \epsilon_0$ throughout.
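where $R'$ is given by:
$$ R' = \left( \frac{m}{m_e} Z^2 \right) R_\infty hc , \tag{1.12} $$
and $R_\infty hc$ is the Rydberg energy:
$$ R_\infty hc = \frac{m_e e^4}{8\epsilon_0^2 h^2} . \tag{1.13} $$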
The Rydberg energy is a fundamental constant and has a value of $2.17987 \times 10^{-18}$ J, which is equivalent
to 13.606 eV. This tells us that the gross energy of the atomic states in hydrogen is of order 1–10 eV,
or $10^4$–$10^5$ cm$^{-1}$ in wave number units.
$R'$ is the effective Rydberg energy for the system in question. In the hydrogen atom we have an
electron orbiting around a proton of mass $m_p$. The reduced mass is therefore given by
$$ m = m_e \times \frac{m_p}{m_e + m_p} = 0.9995 \, m_e $$
and the effective Rydberg energy for hydrogen is:
$$ R_H = 0.9995 \, R_\infty hc . \tag{1.14} $$
Atomic spectroscopy is very precise, and 0.05% factors such as this are easily measurable. Furthermore,
in other systems such as positronium (an electron orbiting around a positron), the reduced mass effect is
much larger, because $m = m_e/2$.
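The size of the reduced mass correction is easy to evaluate numerically. The sketch below applies eqns 1.9 and 1.12; the particle masses are rounded CODATA values entered by hand, the Rydberg energy is the 13.6057 eV figure from Table 1.2, and the function names are illustrative only.

```python
M_E = 9.1093837e-31    # electron mass, kg (rounded CODATA value)
M_P = 1.67262192e-27   # proton mass, kg (rounded CODATA value)
RYDBERG_EV = 13.6057   # Rydberg energy R_inf*h*c in eV, as quoted in Table 1.2

def reduced_mass(m1, m2):
    """Reduced mass of a two-body system: 1/m = 1/m1 + 1/m2 (eqn 1.9)."""
    return m1 * m2 / (m1 + m2)

def effective_rydberg_ev(m, Z=1):
    """Effective Rydberg energy R' = (m/m_e) Z^2 R_inf h c (eqn 1.12), in eV."""
    return (m / M_E) * Z**2 * RYDBERG_EV

ratio_hydrogen = reduced_mass(M_E, M_P) / M_E
ratio_positronium = reduced_mass(M_E, M_E) / M_E  # the positron has the electron mass

print(f"hydrogen:    m/m_e = {ratio_hydrogen:.6f}, "
      f"R' = {effective_rydberg_ev(reduced_mass(M_E, M_P)):.4f} eV")
print(f"positronium: m/m_e = {ratio_positronium:.6f}, "
      f"R' = {effective_rydberg_ev(reduced_mass(M_E, M_E)):.4f} eV")
```

For positronium the effective Rydberg energy is half that of an infinitely heavy nucleus, so every gross-structure energy is halved.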
By following through the mathematics, we also find that the orbital radius and velocity are quantized.
The relevant results are:
$$ r_n = \frac{n^2}{Z} \frac{m_e}{m} a_0 , \tag{1.15} $$
and
$$ v_n = \alpha \frac{Z}{n} c . \tag{1.16} $$
The two fundamental constants that appear here are the Bohr radius $a_0$:
$$ a_0 = \frac{h^2 \epsilon_0}{\pi m_e e^2} , \tag{1.17} $$
and the fine structure constant $\alpha$:
$$ \alpha = \frac{e^2}{2\epsilon_0 hc} . \tag{1.18} $$
The fundamental constants arising from the Bohr model are related to each other according to:
$$ a_0 = \frac{\hbar}{m_e c} \frac{1}{\alpha} , \tag{1.19} $$
and
$$ R_\infty hc = \frac{\hbar^2}{2m_e} \frac{1}{a_0^2} . \tag{1.20} $$
The definitions and values of these quantities are given in Table 1.2.
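As a consistency check on eqns 1.13 and 1.17 to 1.20, the constants can be evaluated from their definitions and compared with each other. This is a sketch using hand-entered, rounded CODATA values for the SI constants; the variable names are chosen for this example.

```python
import math

# SI constants (rounded CODATA values, entered by hand for this sketch)
H = 6.62607015e-34          # Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
E = 1.602176634e-19         # elementary charge, C
M_E = 9.1093837015e-31      # electron mass, kg
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
HBAR = H / (2.0 * math.pi)

a0 = H**2 * EPS0 / (math.pi * M_E * E**2)         # Bohr radius, eqn 1.17
alpha = E**2 / (2.0 * EPS0 * H * C)               # fine structure constant, eqn 1.18
rydberg_joule = M_E * E**4 / (8.0 * EPS0**2 * H**2)  # Rydberg energy, eqn 1.13

# Right-hand sides of the relations in eqns 1.19 and 1.20
a0_from_alpha = HBAR / (M_E * C) / alpha
rydberg_from_a0 = HBAR**2 / (2.0 * M_E) / a0**2

print(f"a0        = {a0:.5e} m")
print(f"1/alpha   = {1.0 / alpha:.2f}")
print(f"R_inf h c = {rydberg_joule:.5e} J = {rydberg_joule / E:.4f} eV")
```

The two derived quantities `a0_from_alpha` and `rydberg_from_a0` agree with the direct definitions to floating-point precision, since eqns 1.19 and 1.20 are algebraic identities.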
The energies of the photons emitted in transitions between the quantized levels of hydrogen can be
deduced from eqn 1.11:
$$ h\nu = R_H \left( \frac{1}{n_1^2} - \frac{1}{n_2^2} \right) , \tag{1.21} $$
where $n_1$ and $n_2$ are the quantum numbers of the two states involved. Since $\nu = c/\lambda$, this can also be
written in the form:
$$ \frac{1}{\lambda} = \frac{m}{m_e} R_\infty \left( \frac{1}{n_1^2} - \frac{1}{n_2^2} \right) . \tag{1.22} $$
In absorption we start from the ground state, so we put $n_1 = 1$. In emission, we can have any combination
where $n_1 < n_2$. Some of the series of spectral lines have been given special names. The emission lines
with $n_1 = 1$ are called the Lyman series, those with $n_1 = 2$ are called the Balmer series, etc. The
Lyman and Balmer lines occur in the ultraviolet and visible spectral regions respectively.
Footnote 2: Note the difference between the Rydberg energy $R_\infty hc$ (13.606 eV) and the Rydberg constant $R_\infty$ (109,737 cm$^{-1}$). The
former has the dimensions of energy, while the latter has the dimensions of inverse length. They differ by a factor of $hc$.
(See Table 1.2.) When high precision is not required, it is convenient just to use the symbol $R_H$ for the Rydberg energy,
although, strictly speaking, $R_H$ differs from the true Rydberg energy by 0.05%. (See eqn 1.14.)
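Equation 1.22 can be used to estimate the wavelengths of the first Lyman and Balmer lines. The sketch below uses the Rydberg constant of 109,737 cm$^{-1}$ and the reduced mass factor 0.9995 quoted in the text; the function name is hypothetical and the results are vacuum wavelengths at this level of rounding.

```python
R_INF_CM = 109737.0    # Rydberg constant in cm^-1, as quoted in the text
MASS_RATIO_H = 0.9995  # m/m_e for hydrogen, as quoted in the text

def wavelength_nm(n1, n2, mass_ratio=MASS_RATIO_H):
    """Wavelength in nm of the n2 -> n1 transition from eqn 1.22 (requires n1 < n2)."""
    inv_lambda_cm = mass_ratio * R_INF_CM * (1.0 / n1**2 - 1.0 / n2**2)
    return 1e7 / inv_lambda_cm  # 1 cm = 1e7 nm

lyman_alpha = wavelength_nm(1, 2)
balmer_alpha = wavelength_nm(2, 3)

print(f"Lyman  n=2 -> 1: {lyman_alpha:.1f} nm (ultraviolet)")
print(f"Balmer n=3 -> 2: {balmer_alpha:.1f} nm (visible, red)")
```

The first Lyman line falls in the ultraviolet and the first Balmer line in the red part of the visible spectrum, in line with the statement about the two series.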
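Quantity                | Symbol        | Formula                        | Numerical value
------------------------|---------------|--------------------------------|-------------------------------
Rydberg energy          | $R_\infty hc$ | $m_e e^4 / 8\epsilon_0^2 h^2$   | $2.17987 \times 10^{-18}$ J
                        |               |                                | 13.6057 eV
Rydberg constant        | $R_\infty$    | $m_e e^4 / 8\epsilon_0^2 h^3 c$ | 109,737 cm$^{-1}$
Bohr radius             | $a_0$         | $\epsilon_0 h^2 / \pi e^2 m_e$  | $5.29177 \times 10^{-11}$ m
Fine structure constant | $\alpha$      | $e^2 / 2\epsilon_0 hc$          | 1/137.04
Table 1.2: Fundamental constants that arise from the Bohr model of the atom.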
A simple back-of-the-envelope calculation can easily show us that the Bohr model is not fully consistent
with quantum mechanics. In the Bohr model, the linear momentum of the electron is given by:
$$ p = mv = \left( \frac{\alpha Z}{n} \right) mc = \frac{n\hbar}{r_n} . \tag{1.23} $$
However, we know from the Heisenberg uncertainty principle that the precise value of the momentum
must be uncertain. If we say that the uncertainty in the position of the electron is about equal to the
radius of the orbit $r_n$, we find:
$$ \Delta p \sim \frac{\hbar}{\Delta x} \approx \frac{\hbar}{r_n} . \tag{1.24} $$
On comparing Eqs. 1.23 and 1.24 we see that
$$ |p| \approx n \Delta p . \tag{1.25} $$
This shows us that the magnitude of $p$ is undefined except when $n$ is large. This is hardly surprising,
because the Bohr model is a mixture of classical and quantum models, and we can only expect the
arguments to be fully self-consistent when we approach the classical limit at large $n$. For small values of
$n$, the Bohr model fails when we take the full quantum nature of the electron into account.
1.5 The quantum mechanics of the hydrogen atom
The full solution of the Schrödinger equation for hydrogen has been considered in course PHY251, and
so is only given in summary form here.
1.5.1 Angular momentum
It is well known from classical physics that planetary orbits are characterized by their energy and angular
momentum. This is also true for quantum systems, and emerges from the solutions of the Schrödinger
equation for the allowed states of the nucleus-electron system. It is helpful to start by considering the
angular momentum states.
In classical mechanics, the rate of change of the angular momentum is given by:
$$ \frac{d\mathbf{L}}{dt} = \mathbf{\Gamma} , \tag{1.26} $$
where $\mathbf{\Gamma}$ is the torque defined by:
$$ \mathbf{\Gamma} = \mathbf{r} \times \mathbf{F} . \tag{1.27} $$
In the hydrogen atom, the electron is bound to the nucleus by the Coulomb force, which is parallel to
$\mathbf{r}$. The torque is therefore zero, and so the angular momentum of the electron does not change. This
means that the angular momentum is a constant of the motion, and is therefore very important in
the specification of the quantum states of hydrogen (see footnote 3).
Footnote 3: The starting approximation for the treatment of multi-electron atoms is the central field approximation in which
we assume that the dominant force is radial (i.e. pointing centrally towards the nucleus), so that the potential only depends
on $r$. (See Section 3.1.) In this case the torque is also zero, so that again the angular momentum is constant.
[Figure omitted: polar plots about the z axis of the spherical harmonics with $l = 0$ ($m = 0$), $l = 1$ ($m = 0$, $m = \pm 1$) and $l = 2$ ($m = 0$, $m = \pm 1$, $m = \pm 2$).]
Figure 1.4: Polar plots of the spherical harmonics with $l \le 2$. The plots are to be imagined with
spherical symmetry about the z axis. In these polar plots, the value of the function for a given angle
is plotted as the distance from the origin. Prettier pictures may be found, for example, at:
http://mathworld.wolfram.com/SphericalHarmonic.html.
In classical mechanics we can know all three components of the angular momentum vector $\mathbf{L}$, namely
$L_x$, $L_y$, and $L_z$, from which we can work out the magnitude of the angular momentum according to
$L^2 = L_x^2 + L_y^2 + L_z^2$. This is not the case in quantum mechanics, where measurable quantities such as
the angular momentum are represented by operators. It is shown in Section 4.2.1 that the components
of the angular momentum operator, namely $\hat{L}_x$, $\hat{L}_y$, and $\hat{L}_z$, do not commute with each other (see footnote 4):
$$ [\hat{L}_x , \hat{L}_y] = i\hbar \hat{L}_z . \tag{1.28} $$
This means that we can only know one component of the angular momentum at a time, which, for
convenience, is usually taken to be the z component. On the other hand, the $\hat{L}_z$ operator does commute
with the $\hat{L}^2$ operator, which means that we can know $L^2$ and $L_z$ simultaneously.
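Atoms have spherical symmetry, and so it is convenient to use spherical polar co-ordinates $(r, \theta, \phi)$ to
describe them. In terms of these spherical polar co-ordinates, the two key angular momentum operators
are given by:
$$ \hat{L}^2 = -\hbar^2 \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2} \right] , \tag{1.29} $$
and
$$ \hat{L}_z = -i\hbar \frac{\partial}{\partial\phi} . \tag{1.30} $$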
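As with all measurable quantities in quantum mechanics, the possible values of $L^2$ and $L_z$ are found by
solving eigenvalue equations. Since $\hat{L}^2$ and $\hat{L}_z$ commute, it is possible to find wave functions that
are simultaneously eigenfunctions of both $\hat{L}^2$ and $\hat{L}_z$. On writing these eigenfunctions as $Y(\theta, \phi)$, the
eigen-equations become (see footnote 5):
$$ \hat{L}^2 Y(\theta, \phi) \equiv -\hbar^2 \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\phi^2} \right] Y(\theta, \phi) = L^2 \, Y(\theta, \phi) , \tag{1.31} $$
and
$$ \hat{L}_z Y(\theta, \phi) \equiv -i\hbar \frac{\partial Y}{\partial\phi} = L_z \, Y(\theta, \phi) . \tag{1.32} $$
Equation 1.32 implies that $Y(\theta, \phi) \propto \exp(i L_z \phi / \hbar)$. The additional requirement that $Y(\theta, \phi)$ should
be single-valued, i.e. $Y(\theta, \phi + 2\pi) = Y(\theta, \phi)$, implies that $L_z = m\hbar$, where $m$ is an integer. The
derivation of the $\theta$ dependence of $Y(\theta, \phi)$ is best left to mathematicians! The final result is that the
functions that satisfy eqns 1.31 and 1.32 are of the form:
$$ Y_{lm}(\theta, \phi) = \text{normalization constant} \times P_l^m(\cos\theta) \, e^{im\phi} , \tag{1.33} $$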
Footnote 4: Note that the "hat" symbol indicates that we are representing an operator and not just a number.
Footnote 5: The wave functions of an atom will, in general, be a function of $r$ as well as $\theta$ and $\phi$. However, since the angular momentum
operators only act on the $\theta$ and $\phi$ variables, we can ignore the r-dependence of the wave functions when considering the
eigenfunctions of $\hat{L}^2$ and $\hat{L}_z$.
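l | m       | $Y_{lm}(\theta, \phi)$
--|---------|-----------------------------------------------------------
0 | 0       | $\sqrt{\frac{1}{4\pi}}$
1 | 0       | $\sqrt{\frac{3}{4\pi}} \cos\theta$
1 | $\pm 1$ | $\mp \sqrt{\frac{3}{8\pi}} \sin\theta \, e^{\pm i\phi}$
2 | 0       | $\sqrt{\frac{5}{16\pi}} (3\cos^2\theta - 1)$
2 | $\pm 1$ | $\mp \sqrt{\frac{15}{8\pi}} \sin\theta \cos\theta \, e^{\pm i\phi}$
2 | $\pm 2$ | $\sqrt{\frac{15}{32\pi}} \sin^2\theta \, e^{\pm 2i\phi}$
Table 1.3: Spherical harmonic functions.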
where $P_l^m(\cos\theta)$ is a polynomial function in $\cos\theta$ called the associated Legendre polynomial, e.g.
$P_0^0(\cos\theta)$ = constant, $P_1^0(\cos\theta) = \cos\theta$, $P_1^1(\cos\theta) = \sin\theta$, etc. The indices $l$ and $m$ must be integers,
with $l \ge 0$ and $-l \le m \le +l$. In spectroscopic notation, states with $l = 0, 1, 2, 3, \ldots$ are called
s, p, d, f, . . . states, respectively.
Functions of the type given in eqn 1.33 are called spherical harmonic functions. The first few
spherical harmonic functions are listed in Table 1.3. Representative polar plots of the wave functions are
shown in figure 1.4. The spherical harmonics have the general property that:
$$ \int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi} Y_{lm}^*(\theta, \phi) \, Y_{l'm'}(\theta, \phi) \sin\theta \, d\theta \, d\phi = \delta_{l,l'} \, \delta_{m,m'} . \tag{1.34} $$
The symbol $\delta_{k,k'}$ is called the Kronecker delta function. It has the value of 1 if $k = k'$ and 0 if $k \ne k'$.
The $\sin\theta$ factor in Eq. 1.34 comes from the volume increment in spherical polar co-ordinates: see Eq. 1.54
below.
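The orthonormality property of eqn 1.34 can be verified numerically for the functions of Table 1.3. The following is a minimal sketch assuming NumPy is available: it hard-codes the $l \le 2$ spherical harmonics with the usual sign convention (a minus sign for odd positive $m$), and uses a simple midpoint rule on a grid whose size is an arbitrary choice.

```python
import numpy as np

def sph_harm_table(l, m, theta, phi):
    """Spherical harmonics for l <= 2, written out explicitly as in Table 1.3."""
    s, c = np.sin(theta), np.cos(theta)
    key = (l, abs(m))
    if key == (0, 0):
        f = np.sqrt(1 / (4 * np.pi)) * np.ones_like(theta)
    elif key == (1, 0):
        f = np.sqrt(3 / (4 * np.pi)) * c
    elif key == (1, 1):
        f = np.sqrt(3 / (8 * np.pi)) * s
    elif key == (2, 0):
        f = np.sqrt(5 / (16 * np.pi)) * (3 * c**2 - 1)
    elif key == (2, 1):
        f = np.sqrt(15 / (8 * np.pi)) * s * c
    elif key == (2, 2):
        f = np.sqrt(15 / (32 * np.pi)) * s**2
    else:
        raise ValueError("only l <= 2 is tabulated here")
    sign = -1.0 if (m > 0 and m % 2 == 1) else 1.0  # sign convention for odd positive m
    return sign * f * np.exp(1j * m * phi)

# Midpoint grid in theta, uniform periodic grid in phi
N_THETA, N_PHI = 400, 64
dtheta, dphi = np.pi / N_THETA, 2 * np.pi / N_PHI
theta = (np.arange(N_THETA) + 0.5) * dtheta
phi = np.arange(N_PHI) * dphi
TH, PH = np.meshgrid(theta, phi, indexing="ij")

def overlap(l1, m1, l2, m2):
    """Numerical value of the integral on the left-hand side of eqn 1.34."""
    integrand = np.conj(sph_harm_table(l1, m1, TH, PH)) * sph_harm_table(l2, m2, TH, PH) * np.sin(TH)
    return np.sum(integrand) * dtheta * dphi

states = [(l, m) for l in range(3) for m in range(-l, l + 1)]
max_error = max(
    abs(overlap(l1, m1, l2, m2) - (1.0 if (l1, m1) == (l2, m2) else 0.0))
    for (l1, m1) in states for (l2, m2) in states
)
print(f"largest deviation from the Kronecker deltas: {max_error:.1e}")
```

The deviation is limited only by the midpoint rule in $\theta$; the $\phi$ integral is essentially exact because the integrand is periodic.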
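The eigenvalues of $\hat{L}^2$ and $\hat{L}_z$ are found by substituting the spherical harmonics into eqns 1.31 and
1.32 to obtain:
$$ \hat{L}^2 Y_{lm}(\theta, \phi) = l(l+1)\hbar^2 \, Y_{lm}(\theta, \phi) , \tag{1.35} $$
and
$$ \hat{L}_z Y_{lm}(\theta, \phi) = m\hbar \, Y_{lm}(\theta, \phi) . \tag{1.36} $$
The integers $l$ and $m$ that appear here are called the orbital and magnetic quantum numbers respectively.
The proof of eqn 1.35 is time-consuming, but eqn 1.36 can be demonstrated quite easily by using
eqns 1.30 and 1.33, with $C$ standing for the normalization constant:
$$ \begin{aligned}
\hat{L}_z Y_{lm}(\theta, \phi) &= -i\hbar \frac{\partial}{\partial\phi} Y_{lm}(\theta, \phi) , \\
&= -i\hbar \frac{\partial}{\partial\phi} \left[ C P_l^m(\cos\theta) \, e^{im\phi} \right] , \\
&= -i\hbar \, C P_l^m(\cos\theta) \frac{d}{d\phi} e^{im\phi} , \\
&= -i\hbar \, C P_l^m(\cos\theta) \cdot im \, e^{im\phi} , \\
&= m\hbar \, Y_{lm}(\theta, \phi) .
\end{aligned} $$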
Equations 1.35–1.36 show that the magnitude of the angular momentum and its z-component are equal
to $\sqrt{l(l+1)}\,\hbar$ and $m\hbar$ respectively (see footnote 6). This is represented pictorially in the vector model of the atom
shown in figure 1.5. In this model the angular momentum is represented as a vector of length $\sqrt{l(l+1)}\,\hbar$
angled in such a way that its component along the z axis is equal to $m\hbar$. The x and y components of the
angular momentum are not known.
Footnote 6: In Bohr's model, $L$ was quantized in integer units of $\hbar$. (See eqn 1.7.) The full quantum treatment shows that this is only
true in the classical limit where $n$ is large and $l$ approaches its maximum value, so that $L = \sqrt{l(l+1)}\,\hbar = \sqrt{(n-1)n}\,\hbar \sim n\hbar$.
[Figure omitted: a vector of length $|L| = \sqrt{l(l+1)}\,\hbar$ inclined to the z axis, with z-component $L_z = m_l\hbar$ and undetermined components in the x,y plane.]
Figure 1.5: Vector model of the angular momentum in an atom. The angular momentum is represented
by a vector of length $\sqrt{l(l+1)}\,\hbar$ precessing around the z-axis so that the z-component is equal to $m_l\hbar$.
1.5.2 The Schrödinger Equation
The time-independent Schrödinger equation for hydrogen is given by:
$$ \left( -\frac{\hbar^2}{2m} \nabla^2 - \frac{Ze^2}{4\pi\epsilon_0 r} \right) \psi(r, \theta, \phi) = E \, \psi(r, \theta, \phi) , \tag{1.37} $$
where the spherical polar co-ordinates $(r, \theta, \phi)$ refer to the position of the electron relative to the nucleus.
Since we are considering the motion of the electron relative to a stationary nucleus, the mass that appears
here is the reduced mass defined previously in eqn 1.9:
$$ \frac{1}{m} = \frac{1}{m_e} + \frac{1}{m_N} . \tag{1.38} $$
For hydrogen, the nuclear mass $m_N$ is equal to the proton mass $m_p$, and so the reduced mass has a value
of $0.9995 \, m_e$, which is very close to $m_e$. Note that we are using the same symbol $m$ to represent both the
mass and the magnetic quantum number. Its meaning should be clear from the context, and, if necessary,
we can add a subscript to the quantum number to distinguish it: $m_l$.
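Written out explicitly in spherical polar co-ordinates, the Schrödinger equation becomes:
$$ -\frac{\hbar^2}{2m} \left[ \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial\psi}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial\psi}{\partial\theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2\psi}{\partial\phi^2} \right] - \frac{Ze^2}{4\pi\epsilon_0 r} \psi = E \, \psi . \tag{1.39} $$
Our task is to find the wave functions $\psi(r, \theta, \phi)$ that satisfy this equation, and hence to find the allowed
quantized energies $E$. The solution proceeds by the method of separation of variables. This works
because the Coulomb potential is an example of a central field in which the force only lies along the
radial direction. This allows us to separate the motion into the radial and angular parts:
$$ \psi(r, \theta, \phi) = R(r) \, F(\theta, \phi) . \tag{1.40} $$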
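We start by noting that we can use eqn 1.29 to re-write the Schrödinger equation as follows:
$$ -\frac{\hbar^2}{2m} \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial\psi}{\partial r} \right) + \frac{\hat{L}^2}{2mr^2} \psi - \frac{Ze^2}{4\pi\epsilon_0 r} \psi = E \, \psi . \tag{1.41} $$
On substituting eqn 1.40 into eqn 1.41, and recalling that $\hat{L}^2$ only acts on $\theta$ and $\phi$, we find:
$$ -\frac{\hbar^2}{2m} \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{dR}{dr} \right) F + R \frac{\hat{L}^2 F}{2mr^2} - \frac{Ze^2}{4\pi\epsilon_0 r} RF = E \, RF . \tag{1.42} $$
Multiply by $r^2/RF$ and re-arrange to obtain:
$$ \frac{\hbar^2}{2m} \frac{1}{R} \frac{d}{dr} \left( r^2 \frac{dR}{dr} \right) + \frac{Ze^2 r}{4\pi\epsilon_0} + E r^2 = \frac{1}{F} \frac{\hat{L}^2 F}{2m} . \tag{1.43} $$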
The left hand side is a function of $r$ only, while the right hand side is only a function of the angular
co-ordinates $\theta$ and $\phi$. The only way this can be true is if both sides are equal to a constant. Let's call
this constant $\hbar^2 \tilde{l}(\tilde{l} + 1)/2m$, where $\tilde{l}$ is an arbitrary number at this stage. This gives us, after a bit of
re-arrangement:
$$ -\frac{\hbar^2}{2m} \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{dR(r)}{dr} \right) + \frac{\hbar^2 \tilde{l}(\tilde{l} + 1)}{2mr^2} R(r) - \frac{Ze^2}{4\pi\epsilon_0 r} R(r) = E R(r) , \tag{1.44} $$
and
$$ \hat{L}^2 F(\theta, \phi) = \hbar^2 \tilde{l}(\tilde{l} + 1) F(\theta, \phi) . \tag{1.45} $$
On comparing Eqs. 1.35 and 1.45 we can now identify the arbitrary separation constant $\tilde{l}$ with the angular
momentum quantum number $l$, and we can see that the function $F(\theta, \phi)$ that enters Eq. 1.45 must be
one of the spherical harmonics, $Y_{lm}(\theta, \phi)$.
We can tidy up the radial equation Eq. 1.44 by writing:
$$ R(r) = \frac{P(r)}{r} . $$
This gives:
$$ \left[ -\frac{\hbar^2}{2m} \frac{d^2}{dr^2} + \frac{\hbar^2 l(l+1)}{2mr^2} - \frac{Ze^2}{4\pi\epsilon_0 r} \right] P(r) = E P(r) . \tag{1.46} $$
This now makes physical sense. It is a Schrödinger equation of the form:
$$ \hat{H} = -\frac{\hbar^2}{2m} \frac{d^2}{dr^2} + V_{\text{effective}}(r) . \tag{1.48} $$
The first term in eqn 1.48 is the radial kinetic energy given by
$$ \text{K.E.}_{\text{radial}} = \frac{p_r^2}{2m} = -\frac{\hbar^2}{2m} \frac{d^2}{dr^2} . $$
The second term is the effective potential energy:
$$ V_{\text{effective}}(r) = \frac{\hbar^2 l(l+1)}{2mr^2} - \frac{Ze^2}{4\pi\epsilon_0 r} , \tag{1.49} $$
which has two components. The first of these is the orbital kinetic energy given by:
$$ \text{K.E.}_{\text{orbital}} = \frac{L^2}{2I} = \frac{\hbar^2 l(l+1)}{2mr^2} , $$
where $I \equiv mr^2$ is the moment of inertia. The second is the usual potential energy due to the Coulomb
energy.
This analysis shows that the quantized orbital motion adds quantized kinetic energy to the radial
motion. For $l > 0$ the orbital kinetic energy will always be larger than the Coulomb energy at small $r$,
and so the effective potential energy will be positive. This has the effect of keeping the electron away
from the nucleus, and explains why states with $l > 0$ have nodes at the origin (see below).
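The behaviour of the effective potential in eqn 1.49 can be explored numerically. The sketch below assumes NumPy and, purely for convenience, works in atomic units (lengths in units of $a_0$, energies in units of $2R_\infty hc$, and $m = m_e$), in which eqn 1.49 reads $V = l(l+1)/2r^2 - Z/r$. The function name and grid are choices made for this example.

```python
import numpy as np

def v_effective(r, l, Z=1):
    """Effective potential of eqn 1.49 in atomic units (an assumption of this sketch)."""
    centrifugal = l * (l + 1) / (2.0 * r**2)  # orbital kinetic energy term
    coulomb = -Z / r                          # Coulomb potential energy term
    return centrifugal + coulomb

r = np.linspace(0.05, 20.0, 40000)  # radius grid in units of a0

for l in (1, 2):
    v = v_effective(r, l)
    i_min = np.argmin(v)
    print(f"l = {l}: minimum V = {v[i_min]:.4f} at r = {r[i_min]:.3f} a0, "
          f"V(0.1 a0) = {v_effective(0.1, l):.1f}")

r_min_l1 = r[np.argmin(v_effective(r, 1))]
```

Setting $dV/dr = 0$ gives a minimum at $r = l(l+1)\,a_0/Z$, which the grid search reproduces. The large positive values at small $r$ are the centrifugal barrier that keeps electrons with $l > 0$ away from the nucleus.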
1.5.3 The radial wave functions and energies
The wave function we require is given by Eq. 1.40. We have seen above that the $F(\theta, \phi)$ function that
appears in Eq. 1.40 must be one of the spherical harmonics, some of which are listed in Table 1.3. The
radial wave function $R(r)$ can be found by solving the radial differential equation given in Eq. 1.44. The
mathematics is somewhat complicated and here we just quote the main results.
Solutions are only found if we introduce an integer quantum number $n$. The energy depends only
on $n$, but the functional form of $R(r)$ depends on both $n$ and $l$, and so we must write the radial wave
function as $R_{nl}(r)$. A list of some of the radial functions is given in Table 1.4, and representative wave
functions are plotted in Fig. 1.6. The radial wave functions listed in Table 1.4 are of the form:
$$ R_{nl}(r) = C_{nl} \cdot (\text{polynomial in } r) \cdot e^{-r/a} , \tag{1.50} $$
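Spectroscopic name | n | l | $R_{nl}(r)$
-------------------|---|---|------------------------------------------------------------
1s | 1 | 0 | $(Z/a_0)^{3/2} \; 2 \exp(-Zr/a_0)$
2s | 2 | 0 | $(Z/2a_0)^{3/2} \; 2 \left( 1 - \frac{Zr}{2a_0} \right) \exp(-Zr/2a_0)$
2p | 2 | 1 | $(Z/2a_0)^{3/2} \; \frac{2}{\sqrt{3}} \left( \frac{Zr}{2a_0} \right) \exp(-Zr/2a_0)$
3s | 3 | 0 | $(Z/3a_0)^{3/2} \; 2 \left[ 1 - (2Zr/3a_0) + \frac{2}{3} \left( \frac{Zr}{3a_0} \right)^2 \right] \exp(-Zr/3a_0)$
3p | 3 | 1 | $(Z/3a_0)^{3/2} \; (4\sqrt{2}/3) \left( \frac{Zr}{3a_0} \right) \left( 1 - \frac{1}{2} \frac{Zr}{3a_0} \right) \exp(-Zr/3a_0)$
3d | 3 | 2 | $(Z/3a_0)^{3/2} \; (2\sqrt{2}/3\sqrt{5}) \left( \frac{Zr}{3a_0} \right)^2 \exp(-Zr/3a_0)$
Table 1.4: Radial wave functions of the hydrogen atom. $a_0$ is the Bohr radius ($5.29 \times 10^{-11}$ m). The
wave functions are normalized so that $\int_{r=0}^{\infty} R_{nl}^* R_{nl} \, r^2 \, dr = 1$.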
where $a = na_H/Z$, with $a_H$ being the Bohr radius of hydrogen given in eqn 1.17, namely $5.29 \times 10^{-11}$ m.
$C_{nl}$ is a normalization constant. The polynomial functions that drop out of the equations are polynomials
of order $n - 1$, and have $n - 1$ nodes. If $l = 0$, all the nodes occur at finite $r$, but if $l > 0$, one of the
nodes is at $r = 0$.
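The normalization stated in the caption of Table 1.4 can be checked by direct numerical integration. The following sketch assumes NumPy, takes $Z = 1$ by default, measures $r$ in units of the Bohr radius, and uses a hand-written trapezoidal rule; the function names and the integration range are choices made for this example.

```python
import numpy as np

def radial(n, l, r, Z=1.0):
    """Radial functions R_nl(r) of Table 1.4, with r in units of the Bohr radius."""
    rho = Z * r / n
    prefactor = (Z / n) ** 1.5
    if (n, l) == (1, 0):
        poly = 2.0 * np.ones_like(rho)
    elif (n, l) == (2, 0):
        poly = 2.0 * (1.0 - rho)
    elif (n, l) == (2, 1):
        poly = (2.0 / np.sqrt(3.0)) * rho
    elif (n, l) == (3, 0):
        poly = 2.0 * (1.0 - 2.0 * rho + (2.0 / 3.0) * rho**2)
    elif (n, l) == (3, 1):
        poly = (4.0 * np.sqrt(2.0) / 3.0) * rho * (1.0 - 0.5 * rho)
    elif (n, l) == (3, 2):
        poly = (2.0 * np.sqrt(2.0) / (3.0 * np.sqrt(5.0))) * rho**2
    else:
        raise ValueError("state not listed in Table 1.4")
    return prefactor * poly * np.exp(-rho)

def trapezoid(y, x):
    """Simple trapezoidal rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

r = np.linspace(0.0, 60.0, 60001)  # the wave functions are negligible beyond this range

norms = {}
for (n, l) in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]:
    R = radial(n, l, r)
    norms[(n, l)] = trapezoid(R * R * r**2, r)
    print(f"n = {n}, l = {l}: integral of R^2 r^2 dr = {norms[(n, l)]:.6f}")

# Radial functions with the same l but different n are also orthogonal
overlap_1s_2s = trapezoid(radial(1, 0, r) * radial(2, 0, r) * r**2, r)
print(f"overlap of 1s and 2s radial functions = {overlap_1s_2s:.2e}")
```

Each integral comes out equal to one to within the accuracy of the trapezoidal rule, and the 1s and 2s radial functions are orthogonal.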
The full wave function for hydrogen is therefore of the form:
$$ \psi_{nlm}(r, \theta, \phi) = R_{nl}(r) \, Y_{lm}(\theta, \phi) , \tag{1.51} $$
where $R_{nl}(r)$ is one of the radial functions given in eqn 1.50, and $Y_{lm}(\theta, \phi)$ is a spherical harmonic function
as discussed in Section 1.5.1. The quantum numbers obey the following rules:
• $n$ can have any integer value $\ge 1$.
• $l$ can have positive integer values from zero up to $(n - 1)$.
• $m$ can have integer values from $-l$ to $+l$.
These rules drop out of the mathematical solutions. Functions that do not obey these rules will not
satisfy the Schrödinger equation for the hydrogen atom.
The energy of the system is found to be:
$$ E_n = -\frac{mZ^2e^4}{8\epsilon_0^2 h^2} \frac{1}{n^2} , \tag{1.52} $$
which is the same as the Bohr formula given in Eq. 1.10. The energy depends only on the principal
quantum number $n$, which means that all the $l$ states for a given value of $n$ are degenerate (i.e. have
the same energy), even though the radial wave functions depend on both $n$ and $l$. This degeneracy with
respect to $l$ is called "accidental", and is a consequence of the fact that the electrostatic energy has a
precise $1/r$ dependence in hydrogen. In more complex atoms, the electrostatic energy will depart from
a pure $1/r$ dependence due to the shielding effect of inner electrons, and the gross energy will depend
on $l$ as well as $n$, even before we start thinking of higher-order fine-structure effects. Note also that the
energy does not depend on the magnetic quantum number $m_l$ at all. Hence, the $m_l$ states for each value of
$l$ are degenerate in the gross structure of all atoms.
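The wave functions are normalized so that
$$ \int_{r=0}^{\infty} \int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi} \psi_{n,l,m}^* \, \psi_{n',l',m'} \, dV = \delta_{n,n'} \, \delta_{l,l'} \, \delta_{m,m'} \tag{1.53} $$
where $dV$ is the incremental volume element in spherical polar co-ordinates:
$$ dV = r^2 \sin\theta \, dr \, d\theta \, d\phi . \tag{1.54} $$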
[Figure omitted: three panels plotting the radial wave function against radius. Panel 1: $R_{10}(r)$ for $n = 1$, $l = 0$. Panel 2: $R_{2l}(r)$ for $n = 2$, $l = 0$ and $l = 1$. Panel 3: $R_{3l}(r)$ for $n = 3$, $l = 0$, $l = 1$ and $l = 2$.]
Figure 1.6: The radial wave functions $R_{nl}(r)$ for the hydrogen atom with $Z = 1$. Note that the axes for
the three graphs are not the same.
The radial probability function $P_{nl}(r)$ is the probability that the electron is found between $r$ and $r + dr$:
$$ P_{nl}(r) \, dr = \int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi} \psi^* \psi \, r^2 \sin\theta \, dr \, d\theta \, d\phi = |R_{nl}(r)|^2 \, r^2 \, dr . \tag{1.55} $$
The factor of $r^2$ that appears here is just related to the surface area of the radial shell of radius $r$ (i.e.
$4\pi r^2$). Some representative radial probability functions are sketched in Fig. 1.7. 3-D plots of the shapes
of the atomic orbitals are available at: http://www.shef.ac.uk/chemistry/orbitron/.
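Expectation values of measurable quantities are calculated as follows:
$$ \langle A \rangle = \iiint \psi^* \hat{A} \psi \, dV . \tag{1.56} $$
Thus, for example, the expectation value of the radius is given by:
$$ \begin{aligned}
\langle r \rangle &= \iiint \psi^* r \psi \, dV \\
&= \int_{r=0}^{\infty} R_{nl}^* \, r \, R_{nl} \, r^2 \, dr \cdot \int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi} Y_{lm}^*(\theta, \phi) \, Y_{lm}(\theta, \phi) \sin\theta \, d\theta \, d\phi \\
&= \int_{r=0}^{\infty} R_{nl}^* \, r \, R_{nl} \, r^2 \, dr .
\end{aligned} \tag{1.57} $$
This gives:
$$ \langle r \rangle = \frac{n^2 a_H}{Z} \left( \frac{3}{2} - \frac{l(l+1)}{2n^2} \right) . \tag{1.58} $$
Note that this only approaches the Bohr value, namely $n^2 a_H/Z$ (see eqn 1.15), for the states with $l = n - 1$
at large $n$.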
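Equation 1.58 can be compared with a direct numerical evaluation of the integral in eqn 1.57. The sketch below assumes NumPy, sets $Z = 1$, measures $r$ in units of $a_H$, and uses four of the radial functions from Table 1.4; the dictionary layout and function names are illustrative choices.

```python
import numpy as np

# Polynomial factors of the Table 1.4 radial functions for Z = 1,
# as functions of rho = r/n with r in units of a_H.
POLY = {
    (1, 0): lambda rho: 2.0 * np.ones_like(rho),
    (2, 0): lambda rho: 2.0 * (1.0 - rho),
    (2, 1): lambda rho: (2.0 / np.sqrt(3.0)) * rho,
    (3, 2): lambda rho: (2.0 * np.sqrt(2.0) / (3.0 * np.sqrt(5.0))) * rho**2,
}

def radial(n, l, r):
    """R_nl(r) for Z = 1, with r in units of a_H."""
    rho = r / n
    return n ** -1.5 * POLY[(n, l)](rho) * np.exp(-rho)

def trapezoid(y, x):
    """Simple trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def mean_radius_numeric(n, l):
    """Numerical evaluation of the radial integral in eqn 1.57, in units of a_H."""
    r = np.linspace(0.0, 100.0, 200001)
    R = radial(n, l, r)
    return trapezoid(R * r * R * r**2, r)

def mean_radius_formula(n, l):
    """Closed form of eqn 1.58 for Z = 1, in units of a_H."""
    return n**2 * (1.5 - l * (l + 1) / (2.0 * n**2))

for (n, l) in POLY:
    print(f"n = {n}, l = {l}: numeric <r> = {mean_radius_numeric(n, l):.4f}, "
          f"formula = {mean_radius_formula(n, l):.4f}, Bohr value = {n**2}")
```

The states with $l = n - 1$ (1s, 2p, 3d) lie closest to the Bohr value $n^2 a_H$, and the relative difference shrinks as $n$ increases.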
1.6 Spin
The spin of the electron does not appear in the basic Schrödinger equation for hydrogen given in eqn 1.39,
which means that the energy of the quantized states of hydrogen does not depend on the spin (see footnote 7). At this
stage, we just note that electrons are spin 1/2 particles, with two spin states for every quantized level.
This means that each quantum state defined by the quantum numbers $(n, l, m_l)$ has a degeneracy of two
due to the two allowed spin states. Given that the $m_l$ states are degenerate in the gross structure of all
atoms, the full degeneracy of each $l$ state is therefore $2 \times (2l + 1) = 2(2l + 1)$.
Footnote 7: The spin will eventually turn up in the Hamiltonian of hydrogen when we consider fine-structure effects.
Reading
Bransden and Joachain, Atoms, Molecules and Photons, § 1.7, 2.5, 2.6, chapter 3.
Demtröder, Atoms, Molecules and Photons, § 3.4, 4.3–5.1.
Haken and Wolf, The Physics of Atoms and Quanta, chapters 8–10.
Phillips, A.C., Introduction to Quantum Mechanics, chapters 8 & 9.
Beiser, A., Concepts of Modern Physics, chapters 4–6.
Eisberg, R. and Resnick, R., Quantum Physics, chapter 7.
[Figure omitted: three panels plotting the radial probability $[r R_{nl}(r)]^2$ against radius. Panel 1: $n = 1$, $l = 0$. Panel 2: $n = 2$, $l = 0$ and $l = 1$. Panel 3: $n = 3$, $l = 0$, $l = 1$ and $l = 2$.]
Figure 1.7: Radial probability functions for the first three $n$ states of the hydrogen atom with $Z = 1$.
Note that the radial probability is equal to $r^2 |R_{nl}(r)|^2$, not just to $|R_{nl}(r)|^2$. Note also that the horizontal
axes are the same for all three graphs, but not the vertical axes.
Chapter 2
Radiative transitions
In this chapter we shall look at the classical and quantum theories of radiative emission and absorption.
This will enable us to derive certain selection rules which determine whether a particular transition is
allowed or not. We shall also investigate the physical mechanisms that affect the shape of the spectral
lines that are observed in atomic spectra.
2.1 Classical theories of radiating dipoles
The classical theories of radiation by atoms were developed at the end of the 19th century before the
discoveries of the electron and the nucleus. With the benefit of hindsight, we can understand more clearly
how the classical theory works. We model the atom as a heavy nucleus with electrons attached to it by
springs with different spring constants, as shown in Fig. 2.1(a). The spring represents the binding force
between the nucleus and the electrons, and the values of the spring constants determine the resonant
frequencies of each of the electrons in the atom. Every atom therefore has several different natural
frequencies.
The nucleus is heavy, and so it does not move very easily. However, the electrons can readily vibrate
about their mean position, as illustrated in Fig. 2.1(b). The vibrations of the electron create a fluctuating
electric dipole. In general, electric dipoles consist of two opposite charges of $\pm q$ separated by a distance
$d$. The dipole moment $\mathbf{p}$ is defined by:
$$ \mathbf{p} = q\mathbf{d} , \tag{2.1} $$
where $\mathbf{d}$ is a vector of length $d$ pointing from $-q$ to $+q$. In the case of atomic dipoles, the positive charge
is fixed, and so the time dependence of $\mathbf{p}$ is just determined by the movement of the electron:
$$ \mathbf{p}(t) = -e\mathbf{x}(t) , \tag{2.2} $$
where $\mathbf{x}(t)$ is the time dependence of the electron displacement.
It is well known that oscillating electric dipoles emit electromagnetic radiation at the oscillation
frequency. This is how aerials work. Thus we expect an atom that has been excited into vibration to
emit light waves at one of its natural resonant frequencies. This is the classical explanation of why atoms
emit characteristic colours when excited electrically in a discharge tube. Furthermore, it is easy to see
that an incoming light wave of frequency $\omega_0$ can drive the natural vibrations of the atom through the
oscillating force exerted on the electron by the electric field of the wave. This transfers energy from
the light wave to the atom, which causes absorption at the resonant frequency. Hence the atom is also
expected to absorb strongly at its natural frequency.
The classical theories actually have to assume that each electron has several natural frequencies of
varying strengths in order to explain the observed spectra. If you do not do this, you end up predicting,
for example, that hydrogen only has one emission frequency. There was no classical explanation of the
origin of the atomic dipoles. It is therefore not surprising that we run into contradictions such as this
when we try to patch up the model by applying our knowledge of electrons and nuclei gained by hindsight.
Figure 2.1: (a) Classical atoms consist of electrons bound to a heavy nucleus by springs with characteristic
force constants. (b) The vibrations of an electron in an atom at its natural resonant frequency $\omega_0$ create
an oscillating electric dipole. This acts like an aerial and emits electromagnetic waves at frequency
$\omega_0$. Alternatively, an incoming electromagnetic wave at frequency $\omega_0$ can drive the oscillations at their
resonant frequency. This transfers energy from the wave to the atom, which is equivalent to absorption.
2.2 Quantum theory of radiative transitions
We have just seen that the classical model can explain why atoms emit and absorb light, but it does not
offer any explanation for the frequency or the strength of the radiation. These can only be calculated
by using quantum theory. Quantum theory tells us that atoms absorb or emit photons when they jump
between quantized states, as shown in figure 2.2(a). The absorption or emission processes are called
radiative transitions. The energy of the photon is equal to the difference in energy of the two levels:
$$ h\nu = E_2 - E_1 . \tag{2.3} $$
Our task here is to calculate the rate at which these transitions occur.
The transition rate $W_{12}$ can be calculated from the initial and final wave functions of the states
involved by using Fermi's golden rule:
$$ W_{12} = \frac{2\pi}{\hbar} |M_{12}|^2 \, g(h\nu) , \tag{2.4} $$
where $M_{12}$ is the matrix element for the transition and $g(h\nu)$ is the density of states. The matrix
element is equal to the overlap integral$^1$:
$$ M_{12} = \int \psi_2^*(\mathbf{r}) \, H'(\mathbf{r}) \, \psi_1(\mathbf{r}) \, d^3r , \tag{2.5} $$
where $H'$ is the perturbation that causes the transition. This represents the interaction between the
atom and the light wave. There are a number of physical mechanisms that cause atoms to absorb or emit
light. The strongest process is the electric dipole (E1) interaction. We therefore discuss E1 transitions
first, leaving the discussion of higher order effects to Section 2.5.
The density of states factor is defined so that $g(h\nu)\,dE$ is the number of final states per unit volume
that fall within the energy range $E$ to $E + dE$, where $E = h\nu$. In the standard case of transitions between
quantized levels in an atom, the initial and final electron states are discrete. In this case, the density of
states factor that enters the golden rule is the density of photon states.$^2$ In free space, the photons can
have any frequency and there is a continuum of states available, as illustrated in Fig. 2.2(b). The atom
can therefore always emit a photon and it is the matrix element that determines the probability for this
to occur. Hence we concentrate on the matrix element from now on.
$^1$ This is sometimes written in the shorthand Dirac notation as $M_{12} \equiv \langle 2|H'|1\rangle$.
$^2$ In solid-state physics, we consider transitions between electron bands rather than between discrete states. We then have
to consider the density of electron states as well as the density of photon states when we calculate the transition rate. This
point is covered in other courses, e.g. PHY475: Optical properties of solids.
Figure 2.2: (a) Absorption and emission transitions in an atom. (b) Emission into a continuum of photon
modes during a radiative transition between discrete atomic states.
2.3 Electric dipole (E1) transitions
Electric dipole transitions are the quantum mechanical equivalent of the classical dipole oscillator
discussed in Section 2.1. We assume that the atom is irradiated with light, and makes a jump from level 1
to 2 by absorbing a photon. The interaction energy between an electric dipole $\mathbf{p}$ and an external electric
field $\boldsymbol{\mathcal{E}}$ is given by
$$ E = -\mathbf{p} \cdot \boldsymbol{\mathcal{E}} . \tag{2.6} $$
We presume that the nucleus is heavy, and so we only need to consider the effect on the electron. Hence
the electric dipole perturbation is given by:
$$ H' = +e\,\mathbf{r} \cdot \boldsymbol{\mathcal{E}} , \tag{2.7} $$
where $\mathbf{r}$ is the position vector of the electron and $\boldsymbol{\mathcal{E}}$ is the electric field of the light wave. This can be
simplified to:
$$ H' = e\,(x\mathcal{E}_x + y\mathcal{E}_y + z\mathcal{E}_z) , \tag{2.8} $$
where $\mathcal{E}_x$ is the component of the field amplitude along the x axis, etc. Now atoms are small compared
to the wavelength of light, and so the amplitude of the electric field will not vary significantly over the
dimensions of an atom. We can therefore take $\mathcal{E}_x$, $\mathcal{E}_y$, and $\mathcal{E}_z$ in Eq. 2.8 to be constants in the calculation,
and just evaluate the following integrals:
$$ \begin{aligned}
M_{12} &\propto \int \psi_1^* \, x \, \psi_2 \, d^3r && \text{x-polarized light} , \\
M_{12} &\propto \int \psi_1^* \, y \, \psi_2 \, d^3r && \text{y-polarized light} , \\
M_{12} &\propto \int \psi_1^* \, z \, \psi_2 \, d^3r && \text{z-polarized light} .
\end{aligned} \tag{2.9} $$
Integrals of this type are called dipole moments. The dipole moment is thus the key parameter that
determines the transition rate for the electric dipole process.
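At this stage it is helpful to give a hand-waving explanation for why electric dipole transitions lead to
the emission of light. To do this we need to consider the time-dependence of the quantum mechanical
wave functions. This naturally drops out of the time-dependent Schrödinger equation:
$$ \hat{H}(\mathbf{r}) \, \Psi(\mathbf{r}, t) = i\hbar \frac{\partial}{\partial t} \Psi(\mathbf{r}, t) , \tag{2.10} $$
where $\hat{H}(\mathbf{r})$ is the Hamiltonian of the system. The solutions of Eq. 2.10 are of the form:
$$ \Psi(\mathbf{r}, t) = \psi(\mathbf{r}) \, e^{-iEt/\hbar} , \tag{2.11} $$
where $\psi(\mathbf{r})$ satisfies the time-independent Schrödinger equation:
$$ \hat{H}(\mathbf{r}) \, \psi(\mathbf{r}) = E \, \psi(\mathbf{r}) . \tag{2.12} $$
During a transition between levels 1 and 2, the electron is in a superposition of the two states, with
wave function:
$$ \Psi(\mathbf{r}, t) = c_1 \Psi_1(\mathbf{r}, t) + c_2 \Psi_2(\mathbf{r}, t)
 = c_1 \psi_1(\mathbf{r}) \, e^{-iE_1 t/\hbar} + c_2 \psi_2(\mathbf{r}) \, e^{-iE_2 t/\hbar} , \tag{2.13} $$
where $c_1$ and $c_2$ are the mixing coefficients. The expectation value $\langle x \rangle$ of the position of the electron is
given by:
$$ \langle x \rangle = \int \Psi^* \, x \, \Psi \, d^3r . \tag{2.14} $$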
Quantum number    Selection rule
parity            changes
l                 $\Delta l = \pm 1$
m                 $\Delta m = 0, \pm 1$    unpolarized light
                  $\Delta m = 0$           linear polarization $\parallel z$
                  $\Delta m = \pm 1$       linear polarization in (x, y) plane
                  $\Delta m = +1$          $\sigma^+$ circular polarization
                  $\Delta m = -1$          $\sigma^-$ circular polarization
s                 $\Delta s = 0$
$m_s$             $\Delta m_s = 0$
Table 2.1: Electric dipole selection rules for the quantum numbers of the states involved in the
transition.
With $\Psi$ given by Eq. 2.13 we obtain:
$$ \begin{aligned}
\langle x \rangle = {} & c_1^* c_1 \int \psi_1^* \, x \, \psi_1 \, d^3r + c_2^* c_2 \int \psi_2^* \, x \, \psi_2 \, d^3r \\
 & + c_1^* c_2 \, e^{-i(E_2 - E_1)t/\hbar} \int \psi_1^* \, x \, \psi_2 \, d^3r
   + c_2^* c_1 \, e^{-i(E_1 - E_2)t/\hbar} \int \psi_2^* \, x \, \psi_1 \, d^3r .
\end{aligned} \tag{2.15} $$
This shows that if the dipole moment defined in Eq. 2.9 is non-zero, then the electron wave-packet
oscillates in space at angular frequency $(E_2 - E_1)/\hbar$. The oscillation of the electron wave packet creates
an oscillating electric dipole, which then radiates light at angular frequency $(E_2 - E_1)/\hbar$. Hey presto!
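The oscillation in Eq. 2.15 can be made concrete with a short numerical sketch. The example below is an illustration only: it assumes the standard hydrogen 1s and 2p (m = 0) wave functions in atomic units (lengths in Bohr radii, Z = 1), takes the light to be z-polarized, and uses real mixing coefficients. The helper names (`midpoint_integral`, `z_expectation`) are invented for this sketch. The two diagonal terms of Eq. 2.15 vanish because each state has definite parity, so only the cross terms survive.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Hydrogen radial wave functions in atomic units (assumed standard forms).
def R_10(r):
    return 2.0 * math.exp(-r)

def R_21(r):
    return r * math.exp(-r / 2.0) / (2.0 * math.sqrt(6.0))

# Radial part of the z dipole moment: integral of R_10 * r * R_21 * r^2 dr.
radial = midpoint_integral(lambda r: R_10(r) * r * R_21(r) * r ** 2, 0.0, 60.0)

# Angular part: integral of Y_00 * cos(theta) * Y_10 over the sphere.
Y00 = 1.0 / math.sqrt(4.0 * math.pi)

def angular_integrand(theta):
    Y10 = math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta)
    return Y00 * math.cos(theta) * Y10 * math.sin(theta)

angular = 2.0 * math.pi * midpoint_integral(angular_integrand, 0.0, math.pi, n=2000)

# Dipole moment of the 1s-2p transition, in Bohr radii.
z_12 = radial * angular

def z_expectation(t, c1, c2, omega):
    """<z>(t) for real c1 and c2: the two cross terms of Eq. 2.15 sum to a cosine."""
    return 2.0 * c1 * c2 * z_12 * math.cos(omega * t)

# Transition angular frequency (E_2 - E_1)/hbar in atomic units: 1/2 - 1/8.
omega_21 = 0.375
```

With equal mixing coefficients $c_1 = c_2 = 1/\sqrt{2}$, `z_expectation` swings between `+z_12` and `-z_12` at angular frequency `omega_21`, which is the oscillating dipole described above.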
2.4 Selection rules for E1 transitions
Electric dipole transitions can only occur if the selection rules summarized in Table 2.1 are satisfied.
Transitions that obey these E1 selection rules are called allowed transitions. If the selection rules are
not satisfied, the matrix element (i.e. the dipole moment) is zero, and we then see from Eq. 2.4 that the
transition rate is zero. The origins of these rules are discussed below.
Parity
The parity of a function refers to the sign change under inversion about the origin. Thus if $f(-x) = f(x)$
we have even parity, whereas if $f(-x) = -f(x)$ we have odd parity. Now atoms are spherically symmetric,
which implies that
$$ |\psi(-\mathbf{r})|^2 = |\psi(+\mathbf{r})|^2 . \tag{2.16} $$
Hence we must have that
$$ \psi(-\mathbf{r}) = \pm \psi(+\mathbf{r}) . \tag{2.17} $$
In other words, the wave functions have either even or odd parity. The dipole moment of the transition
is given by Eq. 2.9. x, y and z are odd functions, and so the product $\psi_1^* \psi_2$ must be an odd function if
$M_{12}$ is to be non-zero. Hence $\psi_1$ and $\psi_2$ must have different parities.
The orbital quantum number l
The parity of the spherical harmonic functions is equal to $(-1)^l$. Hence the parity selection rule implies
that $\Delta l$ must be an odd number. Detailed evaluation of the overlap integrals tightens this rule to $\Delta l = \pm 1$.
This can be seen as a consequence of the fact that the angular momentum of a photon is $\pm\hbar$, with the
sign depending on whether we have a left or right circularly polarized photon. Conservation of angular
momentum therefore requires that the angular momentum of the atom must change by one unit.
The magnetic quantum number m
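The dipole moment for the transition can be written out explicitly:
$$ M_{12} \propto \int_{r=0}^{\infty} \int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi}
 \psi^*_{n',l',m'} \, \mathbf{r} \, \psi_{n,l,m} \, r^2 \sin\theta \, dr \, d\theta \, d\phi . \tag{2.18} $$
We consider here just the $\phi$ part of this integral:
$$ M_{12} \propto \int_0^{2\pi} e^{-im'\phi} \, \mathbf{r} \, e^{im\phi} \, d\phi , \tag{2.19} $$
where we have made use of the fact that (see eqns 1.51 and 1.33):
$$ \psi_{n,l,m}(r, \theta, \phi) \propto e^{im\phi} . \tag{2.20} $$
Now for z-polarized light we have from Eq. 2.9:
$$ M_{12} \propto \int_0^{2\pi} e^{-im'\phi} \, z \, e^{im\phi} \, d\phi
 \propto \int_0^{2\pi} e^{-im'\phi} \cdot 1 \cdot e^{im\phi} \, d\phi , \tag{2.21} $$
because $z = r\cos\theta$ has no $\phi$ dependence. Hence we must have that $m' = m$ if $M_{12}$ is to be non-zero. If the light is polarized
in the (x, y) plane, we have integrals like
$$ M_{12} \propto \int_0^{2\pi} e^{-im'\phi} \, x \, e^{im\phi} \, d\phi
 \propto \int_0^{2\pi} e^{-im'\phi} \, e^{\pm i\phi} \, e^{im\phi} \, d\phi . \tag{2.22} $$
This is because $x = r\sin\theta\cos\phi = r\sin\theta \, \tfrac{1}{2}(e^{+i\phi} + e^{-i\phi})$, and similarly for y. This gives $m' - m = \pm 1$. This
rule can be tightened up a bit by saying that $\Delta m = +1$ for $\sigma^+$ circularly polarized light and $\Delta m = -1$ for
$\sigma^-$ circularly polarized light. If the light is unpolarized, then all three linear polarizations are possible,
and we can have $\Delta m = 0, \pm 1$.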
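The $\phi$ integrals in Eqs 2.21 and 2.22 are easy to check numerically. The sketch below is an illustration, not part of the derivation: the function names and the label `q` (0 for z-polarized light, +1 or -1 for the two $e^{\pm i\phi}$ components of light polarized in the (x, y) plane) are our own choices, and the integral is approximated by a simple rectangle sum over one full period.

```python
import cmath
import math

def phi_integral(m_final, m_initial, q, n=720):
    """Approximate the integral from 0 to 2*pi of
    exp(-i m' phi) * exp(i q phi) * exp(i m phi) d(phi),
    with m' = m_final and m = m_initial."""
    h = 2.0 * math.pi / n
    k = m_initial + q - m_final  # net winding number of the integrand
    return h * sum(cmath.exp(1j * k * j * h) for j in range(n))

def allowed_delta_m(q, m_initial=0, tol=1e-9):
    """Values of m' - m in the range -2..+2 for which the phi integral is non-zero."""
    return [m_final - m_initial
            for m_final in range(-2, 3)
            if abs(phi_integral(m_final, m_initial, q)) > tol]

print(allowed_delta_m(0))    # z-polarized light
print(allowed_delta_m(+1))   # e^{+i phi} component
print(allowed_delta_m(-1))   # e^{-i phi} component
```

The integral is $2\pi$ when the exponents cancel and zero otherwise, so the only surviving values of $m' - m$ are 0 for q = 0 and $\pm 1$ for q = $\pm 1$, as stated in Table 2.1.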
Spin
The photon does not interact with the electron spin. Therefore, the spin state of the atom does not
change during the transition. This implies that the spin quantum numbers $s$ and $m_s$ are unchanged.
2.5 Higher order transitions
How does an atom de-excite if E1 transitions are forbidden by the selection rules? In some cases it
may be possible for the atom to de-excite by alternative methods. For example, the 3s → 1s transition
is forbidden, but the atom can easily de-excite by two allowed E1 transitions, namely 3s → 2p, then
2p → 1s. However, this may not always be possible, and in these cases the atom must de-excite by
making a forbidden transition. The use of the word "forbidden" is somewhat misleading here. It really
means "electric-dipole forbidden". The transitions are perfectly possible, but they just occur at a slower
rate.
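The hydrogen example can be checked against the $\Delta l = \pm 1$ rule of Table 2.1 with a few lines of code. This is a minimal sketch: the helper names are hypothetical, and only the rule on $l$ is tested (the rules on $m$ and spin are ignored).

```python
L_LETTERS = "spdfgh"  # spectroscopic letters for l = 0, 1, 2, ...

def parse_orbital(label):
    """Split a label such as '3s' into the pair (n, l)."""
    return int(label[:-1]), L_LETTERS.index(label[-1])

def e1_allowed(upper, lower):
    """True if the transition satisfies the E1 rule delta l = +1 or -1.
    A change of l by one unit automatically changes the parity (-1)^l."""
    _, l_upper = parse_orbital(upper)
    _, l_lower = parse_orbital(lower)
    return abs(l_upper - l_lower) == 1

for upper, lower in [("3s", "1s"), ("3s", "2p"), ("2p", "1s")]:
    verdict = "allowed" if e1_allowed(upper, lower) else "forbidden"
    print(f"{upper} -> {lower}: E1 {verdict}")
```

The direct 3s → 1s step fails the test because $\Delta l = 0$, while both steps of the cascade through 2p pass it.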
After the electric-dipole interaction, the next two strongest interactions between the photon and
the atom give rise to magnetic dipole (M1) and electric quadrupole (E2) transitions. These have
different selection rules to E1 transitions (e.g. parity is conserved), and may therefore be allowed
when E1 transitions are forbidden. M1 and E2 transitions are second-order processes and have much
smaller probabilities than E1 transitions.
In extreme cases it may happen that all types of radiative transitions are forbidden. In this case, the
excited state is said to be metastable, and must de-excite by transferring its energy to other atoms in
collisional processes or by multi-photon emission.
2.6 Radiative lifetimes
An atom in an excited state has a spontaneous tendency to de-excite by a radiative transition involving
the emission of a photon. This follows from statistical physics: atoms with excess energy tend to want
to get rid of it. This process is called spontaneous emission. Let us suppose that there are $N_2$ atoms
in level 2 at time t. We use quantum mechanics to calculate the transition rate from level 2 to level 1,
and then write down a rate equation for $N_2$ as follows:
$$ \frac{dN_2}{dt} = -A N_2 . \tag{2.23} $$
This merely says that the total number of atoms making transitions is proportional to the number of
atoms in the excited state and to the quantum mechanical probability. The parameter A that appears in
eqn 2.23 is called the Einstein A coefficient of the transition. The Einstein B coefficients that describe
the processes of stimulated emission and absorption are considered in Section 8.3 in the context of laser
physics.
Equation 2.23 has the following solution:
$$ N_2(t) = N_2(0) \exp(-At) = N_2(0) \exp(-t/\tau) , \tag{2.24} $$
where
$$ \tau = \frac{1}{A} . \tag{2.25} $$
Equation 2.24 shows that if the atoms are excited into the upper level, the population will decay due to
spontaneous emission with a time constant $\tau$. $\tau$ is thus called the natural radiative lifetime of the
excited state.
The values of the Einstein A coefficient and hence the radiative lifetime vary considerably from
transition to transition. Allowed E1 transitions have A coefficients in the range $10^8$ to $10^9$ s$^{-1}$ at optical
frequencies, giving radiative lifetimes of 1 to 10 ns. Forbidden transitions, on the other hand, are much
slower because they are higher order processes. The radiative lifetimes for M1 and E2 transitions are
typically in the millisecond or microsecond range. This point is summarized in Table 2.2.

Transition                 Einstein A coefficient           Radiative lifetime
E1 allowed                 $10^8$ to $10^9$ s$^{-1}$        1 ns to 10 ns
E1 forbidden (M1 or E2)    $10^3$ to $10^6$ s$^{-1}$        1 µs to 1 ms
Table 2.2: Typical transition rates and radiative lifetimes for allowed and forbidden transitions at optical
frequencies.
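The relation between the A coefficient, the lifetime and the decaying population can be sketched in a few lines. The function names and the initial population of 1000 atoms are illustrative assumptions; the A values are the ends of the ranges quoted in Table 2.2.

```python
import math

def radiative_lifetime(A):
    """Natural radiative lifetime tau = 1/A (eqn 2.25), with A in s^-1."""
    return 1.0 / A

def upper_level_population(n0, A, t):
    """N_2(t) = N_2(0) exp(-A t), the solution of the rate equation (eqn 2.24)."""
    return n0 * math.exp(-A * t)

# Ends of the ranges quoted for allowed and forbidden transitions.
for label, A in [("E1 allowed", 1e8), ("E1 allowed", 1e9),
                 ("E1 forbidden", 1e3), ("E1 forbidden", 1e6)]:
    print(f"{label}: A = {A:.0e} s^-1, tau = {radiative_lifetime(A):.0e} s")

# After one lifetime the population has dropped to 1/e of its initial value.
A = 1e8
remaining = upper_level_population(1000.0, A, radiative_lifetime(A))
print(f"Atoms left after one lifetime: {remaining:.1f}")
```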
2.7 The width and shape of spectral lines
The radiation emitted in atomic transitions is not perfectly monochromatic. The shape of the emission
line is described by the spectral line shape function $g(\nu)$. This is a function that peaks at the line
centre defined by
$$ h\nu_0 = (E_2 - E_1) , \tag{2.26} $$
and is normalized so that:
$$ \int_0^{\infty} g(\nu) \, d\nu = 1 . \tag{2.27} $$
The most important parameter of the line shape function is the full width at half maximum (FWHM)
$\Delta\nu$, which quantifies the width of the spectral line. We shall see below how the different types of line
broadening mechanisms give rise to two common line shape functions, namely the Lorentzian and
Gaussian functions.
In a gas of atoms, spectral lines are broadened by three main processes:
• natural broadening,
• collision broadening,
• Doppler broadening.
We shall look at each of these processes separately below. A useful general division can be made at this
stage by classifying the broadening as either homogeneous or inhomogeneous.
• Homogeneous broadening affects all the individual atoms in the same way. Natural lifetime and
  collision broadening are examples of homogeneous processes. All the atoms are behaving in the
  same way, and each atom produces the same emission spectrum.
• Inhomogeneous broadening affects different individual atoms in different ways. Doppler
  broadening is the standard example of an inhomogeneous process. The individual atoms are presumed to
  behave identically, but they are moving at different velocities, and one can associate different parts
  of the spectrum with the subset of atoms with the appropriate velocity. Inhomogeneous broadening
  is also found in solids, where different atoms may experience different local environments due to
  the inhomogeneity of the medium.
2.8 Natural broadening
We have seen in Section 2.6 that the process of spontaneous emission causes the excited states of an
atom to have a finite lifetime. Let us suppose that we somehow excite a number of atoms into level 2
at time t = 0. Equation 2.23 shows us that the rate of transitions is proportional to the instantaneous
population of the upper level, and eqn 2.24 shows that this population decays exponentially. Thus the
rate of atomic transitions decays exponentially with time constant $\tau$. For every transition from level 2 to
level 1, a photon of angular frequency $\omega_0 = (E_2 - E_1)/\hbar$ is emitted. Therefore a burst of light with an
exponentially-decaying intensity will be emitted for t > 0:
$$ I(t) = I(0) \exp(-t/\tau) . \tag{2.28} $$
This corresponds to a time dependent electric field of the form:
$$ \begin{aligned}
t < 0 : \quad & \mathcal{E}(t) = 0 , \\
t \ge 0 : \quad & \mathcal{E}(t) = \mathcal{E}_0 \, e^{-i\omega_0 t} \, e^{-t/2\tau} .
\end{aligned} \tag{2.29} $$
The extra factor of 2 in the exponential in eqn 2.29 compared to eqn 2.28 arises because $I(t) \propto |\mathcal{E}(t)|^2$.
We now take the Fourier transform of the electric field to derive the frequency spectrum of the burst:
$$ \mathcal{E}(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \mathcal{E}(t) \, e^{i\omega t} \, dt . \tag{2.30} $$
The emission spectrum is then given by:
$$ I(\omega) \propto |\mathcal{E}(\omega)|^2 \propto \frac{1}{(\omega - \omega_0)^2 + (1/2\tau)^2} . \tag{2.31} $$
Remembering that $\omega = 2\pi\nu$, we find the final result for the spectral line shape function:
$$ g(\nu) = \frac{\Delta\nu}{2\pi} \, \frac{1}{(\nu - \nu_0)^2 + (\Delta\nu/2)^2} , \tag{2.32} $$
where the full width at half maximum is given by
$$ \Delta\nu = \frac{1}{2\pi\tau} . \tag{2.33} $$
The spectrum described by eqn 2.32 is called a Lorentzian line shape. This function is plotted in
Fig. 2.3. Note that we can re-write eqn 2.33 in the following form:
$$ \Delta\nu \, \tau = \frac{1}{2\pi} . \tag{2.34} $$
By multiplying both sides by h, we can recast this as:
$$ \Delta E \, \tau = h/2\pi . \tag{2.35} $$
If we realize that $\tau$ represents the average time the atom stays in the excited state (i.e. the uncertainty
in the time), we can interpret this as the energy-time uncertainty principle.
[Figure 2.3: plot of $g(\nu)$, in units of $2/(\pi\Delta\nu)$, against $(\nu - \nu_0)$ in units of $1/2\pi\tau$ from -3 to 3; the curve is marked with FWHM = $1/2\pi\tau$ and area = 1.]
Figure 2.3: The Lorentzian line shape. The functional form is given in eqn 2.32. The function peaks
at the line centre $\nu_0$ and has an FWHM of $1/2\pi\tau$. The function is normalized so that the total area is
unity.
2.9 Collision (Pressure) broadening
The atoms in a gas jostle around randomly and frequently collide with each other and the walls of the
containing vessel. This interrupts the process of light emission and effectively shortens the lifetime of the
excited state. This gives additional line broadening through the uncertainty principle, as determined by
eqn 2.33 with $\tau$ replaced by $\tau_c$, where $\tau_c$ is the mean time between collisions.
It can be shown from the kinetic theory of gases that the time between collisions in an ideal gas is
given by:
$$ \tau_c \sim \frac{1}{\sigma_s P} \left( \frac{\pi m k_B T}{8} \right)^{1/2} , \tag{2.36} $$
where $\sigma_s$ is the collision cross-section, and P is the pressure. The collision cross-section is an effective
area which determines whether two atoms will collide or not. It will be approximately equal to the
cross-sectional area of the atom. For example, for sodium atoms we have:
$$ \sigma_s \sim \pi r_{\rm atom}^2 \sim \pi \times (0.2\ {\rm nm})^2 \approx 1.3 \times 10^{-19}\ {\rm m}^2 . $$
Thus at S.T.P. we find $\tau_c \sim 6 \times 10^{-10}$ s, which gives a line width of $\sim 1$ GHz. Note that $\tau_c$ is much shorter
than typical radiative lifetimes. For example, the strong yellow D-lines in sodium have a radiative lifetime
of 16 ns, which is more than an order of magnitude longer.
In conventional atomic discharge tubes, we reduce the effects of pressure broadening by working at
low pressures. We see from eqn 2.36 that this increases $\tau_c$, and hence reduces the linewidth. This is why
we tend to use low pressure discharge lamps for spectroscopy.
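The sodium estimate can be reproduced directly from eqn 2.36. The sketch below is illustrative: the function name is our own, S.T.P. is taken to mean 273.15 K and 101 325 Pa, and the atomic mass is approximated as 23 atomic mass units; these values are assumptions rather than figures given in the text.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant (J/K)
AMU = 1.66053907e-27   # atomic mass unit (kg)

def collision_time(sigma, pressure, mass, temperature):
    """Mean time between collisions in an ideal gas, following eqn 2.36."""
    return math.sqrt(math.pi * mass * K_B * temperature / 8.0) / (sigma * pressure)

# Sodium: cross-section estimated as pi * r^2 with r = 0.2 nm.
sigma_na = math.pi * (0.2e-9) ** 2
mass_na = 23 * AMU

# Assumed S.T.P. conditions: 273.15 K and 101325 Pa.
tau_c = collision_time(sigma_na, 101325.0, mass_na, 273.15)
print(f"tau_c at S.T.P. = {tau_c:.1e} s")

# Lowering the pressure lengthens tau_c in inverse proportion.
tau_c_low = collision_time(sigma_na, 101.325, mass_na, 273.15)
print(f"tau_c at 1/1000 of atmospheric pressure = {tau_c_low:.1e} s")

# Comparison with the 16 ns radiative lifetime of the sodium D-lines.
print(f"radiative lifetime / tau_c = {16e-9 / tau_c:.0f}")
```

At one thousandth of atmospheric pressure the collision time becomes much longer than the 16 ns radiative lifetime, so collisions no longer dominate the linewidth.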
2.10 Doppler broadening
The spectrum emitted by a typical gas of atoms in a low pressure discharge lamp is usually found to be
much broader than the radiative lifetime would suggest, even when everything is done to avoid collisions.
For example, the radiative lifetime for the 632.8 nm line in neon is $2.7 \times 10^{-7}$ s. Equation 2.33 tells us
that we should have a spectral width of 0.59 MHz. In fact, the line is about three orders of magnitude
broader, and moreover, does not have the Lorentzian lineshape given by eqn 2.32.
The reason for this discrepancy is the thermal motion of the atoms. The atoms in a gas move about
randomly with a root-mean-square thermal velocity given by:
$$ \tfrac{1}{2} m v_x^2 = \tfrac{1}{2} k_B T , \tag{2.37} $$
where $k_B$ is Boltzmann's constant. At room temperature the thermal velocities are quite large. For
example, for sodium with a mass number of 23 we find $v_x \sim 330$ m s$^{-1}$ at 300 K. This random thermal
motion of the atoms gives rise to Doppler shifts in the observed frequencies, which then cause line
broadening, as illustrated in Fig. 2.4. This is the Doppler line broadening mechanism.
[Figure 2.4: emission lines from an atom moving at right angles to the observer, an atom moving towards the observer and an atom moving away from the observer, together with the emission spectrum of all the atoms combined, centred on $\nu_0$.]
Figure 2.4: The Doppler broadening mechanism. The thermal motion of the atoms causes their frequency
to be shifted by the Doppler effect.
Let us suppose that the atom is emitting light from a transition with centre frequency $\nu_0$. An atom
moving with velocity $v_x$ will have its observed frequency shifted by the Doppler effect according to:
$$ \nu = \nu_0 \left( 1 \pm \frac{v_x}{c} \right) , \tag{2.38} $$
where the + and - signs apply to motion towards or away from the observer respectively. The probability
that an atom has velocity $v_x$ is governed by the Boltzmann formula:
$$ p(E) \propto e^{-E/k_B T} . \tag{2.39} $$
On setting E equal to the kinetic energy, we find that the number of atoms with velocity $v_x$ is given by
the Maxwell-Boltzmann distribution:
$$ N(v_x) \propto \exp\left( -\frac{\tfrac{1}{2} m v_x^2}{k_B T} \right) . \tag{2.40} $$
We can combine eqns 2.38 and 2.40, to find the number of atoms emitting at frequency $\nu$:
$$ N(\nu) \propto \exp\left( -\frac{m c^2 (\nu - \nu_0)^2}{2 k_B T \nu_0^2} \right) . \tag{2.41} $$
The frequency dependence of the light emitted is therefore given by:
$$ I(\nu) \propto \exp\left( -\frac{m c^2 (\nu - \nu_0)^2}{2 k_B T \nu_0^2} \right) . \tag{2.42} $$
This gives rise to a Gaussian line shape with $g(\nu)$ given by:
$$ g(\nu) \propto \exp\left( -\frac{m c^2 (\nu - \nu_0)^2}{2 k_B T \nu_0^2} \right) , \tag{2.43} $$
with a full width at half maximum equal to:
$$ \Delta\nu_D = 2\nu_0 \left( \frac{(2\ln 2) k_B T}{m c^2} \right)^{1/2}
 = \frac{2}{\lambda} \left( \frac{(2\ln 2) k_B T}{m} \right)^{1/2} . \tag{2.44} $$
The Doppler linewidth in a gas at S.T.P. is usually several orders of magnitude larger than the natural
linewidth. For example, the Doppler line width of the 632.8 nm line of neon at 300 K works out to
be 1.3 GHz, i.e. three orders of magnitude larger than the broadening due to spontaneous emission.
The dominant broadening mechanism in the emission spectrum of gases at room temperature is usually
Doppler broadening, and the line shape is closer to Gaussian than Lorentzian.$^3$
$^3$ Since $\Delta\nu_D$ is proportional to $\sqrt{T}$, we can reduce its value by cooling the gas. Cooling also reduces the collision
broadening because $P \propto T$, and therefore $\tau_c \propto T^{-1/2}$. (See eqn 2.36.) Laser cooling techniques can produce temperatures
in the micro-Kelvin range, where we finally observe the natural line shape of the emission line.
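The numbers quoted in this section follow directly from eqns 2.37, 2.44 and 2.33, as the sketch below illustrates. The function names are our own, and the atomic masses (23 u for sodium, 20.18 u for neon) are assumed values that are not given in the text.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant (J/K)
AMU = 1.66053907e-27   # atomic mass unit (kg)

def thermal_velocity(mass, temperature):
    """Root-mean-square velocity component v_x from eqn 2.37."""
    return math.sqrt(K_B * temperature / mass)

def doppler_fwhm(wavelength, mass, temperature):
    """Doppler FWHM in Hz from the second form of eqn 2.44."""
    return (2.0 / wavelength) * math.sqrt(2.0 * math.log(2.0) * K_B * temperature / mass)

def natural_linewidth(tau):
    """Lifetime-limited FWHM in Hz from eqn 2.33."""
    return 1.0 / (2.0 * math.pi * tau)

# Sodium at 300 K (assumed mass: 23 u).
v_na = thermal_velocity(23 * AMU, 300.0)
print(f"sodium v_x at 300 K = {v_na:.0f} m/s")

# Neon 632.8 nm line at 300 K (assumed mass: 20.18 u).
dnu_doppler = doppler_fwhm(632.8e-9, 20.18 * AMU, 300.0)
dnu_natural = natural_linewidth(2.7e-7)
print(f"Doppler width = {dnu_doppler / 1e9:.2f} GHz")
print(f"natural width = {dnu_natural / 1e6:.2f} MHz")
print(f"ratio         = {dnu_doppler / dnu_natural:.0f}")
```

Because the Doppler width scales as $\sqrt{T}$, calling `doppler_fwhm` with a temperature 100 times smaller reduces the width by a factor of 10.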
2.11 Atoms in solids
In laser physics we shall frequently be interested in the emission spectra of atoms in crystals. The spectra
will be subject to lifetime broadening as in gases, since this is a fundamental property of radiative
emission. However, the atoms are locked in a lattice, and so collisional broadening is not relevant.
Doppler broadening does not occur either, for the same reason. On the other hand, the emission lines
can be broadened by other mechanisms.
In some cases it may be possible for the atoms to de-excite from the upper level to the lower level
by making a non-radiative transition. One way this could happen is to drop to the lower level by
emitting phonons (i.e. heat) instead of photons. To allow for this possibility, we must re-write eqn 2.23 in
the following form:
$$ \frac{dN_2}{dt} = -A N_2 - \frac{N_2}{\tau_{\rm NR}}
 = -\left( A + \frac{1}{\tau_{\rm NR}} \right) N_2 = -\frac{N_2}{\tau} , \tag{2.45} $$
where $\tau_{\rm NR}$ is the non-radiative transition time. This shows that non-radiative transitions shorten the
lifetime of the excited state according to:
$$ \frac{1}{\tau} = A + \frac{1}{\tau_{\rm NR}} . \tag{2.46} $$
We thus expect additional lifetime broadening according to eqn 2.33. The phonon emission times in solids
are often very fast, and can cause substantial broadening of the emission lines. This is the solid-state
equivalent of collisional broadening.
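The effect of a fast non-radiative channel on the lifetime and linewidth can be illustrated with eqns 2.46 and 2.33. In the sketch below the function names are our own, and both the A coefficient ($10^8$ s$^{-1}$) and the non-radiative time (1 ps) are hypothetical values chosen only to show the trend.

```python
import math

def total_lifetime(A, tau_nr):
    """Excited-state lifetime including non-radiative decay (eqn 2.46)."""
    return 1.0 / (A + 1.0 / tau_nr)

def lifetime_linewidth(tau):
    """Lifetime-limited FWHM in Hz (eqn 2.33)."""
    return 1.0 / (2.0 * math.pi * tau)

A = 1e8          # hypothetical Einstein A coefficient (s^-1)
tau_nr = 1e-12   # hypothetical non-radiative transition time (s)

tau_radiative = 1.0 / A
tau_total = total_lifetime(A, tau_nr)

print(f"radiative only: tau = {tau_radiative:.2e} s, "
      f"width = {lifetime_linewidth(tau_radiative):.2e} Hz")
print(f"with phonons:   tau = {tau_total:.2e} s, "
      f"width = {lifetime_linewidth(tau_total):.2e} Hz")
```

When $\tau_{\rm NR}$ is much shorter than 1/A, the total lifetime is set almost entirely by the non-radiative process, and the linewidth grows accordingly.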
Another factor that may cause line broadening is the inhomogeneity of the host medium, for example
when the atoms are doped into a glass. If the environment in which the atoms find themselves is not
entirely uniform, the emission spectrum will be affected through the interaction between the atom and
the local environment. This is an example of an inhomogeneous broadening mechanism.
Reading
Bransden and Joachain, Atoms, Molecules and Photons, chapter 4.
Demtröder, W., Atoms, Molecules and Photons, sections 7.1–7.4.
Haken, H. and Wolf, H.C., The Physics of Atoms and Quanta, chapter 16.
Hooker, S. and Webb, C., Laser Physics, chapter 3.
Smith, F.G. and King, T.A., Optics and Photonics, sections 13.1–4, 20.1–2.
Beiser, A., Concepts of Modern Physics, sections 6.8–9.
Eisberg, R. and Resnick, R., Quantum Physics, section 8.7.
Chapter 3
The shell model and alkali spectra
Everything we have been doing so far in this course applies to hydrogenic atoms. We have taken this
approach because the hydrogen atom only contains two particles: the nucleus and the electron. This is
a two-body system and can be solved exactly by separating the motion into the centre of mass and
relative co-ordinates. This has allowed us to find the wave functions and understand the meaning of the
quantum numbers $n$, $l$, $m_l$ and $m_s$.
We are well aware, however, that hydrogen is only the first of about 100 elements. These are not
two-body problems: we have one nucleus and many electrons, which is a many-body problem, with no
exact solution. This chapter begins our consideration of the approximation techniques that are used to
understand the behaviour of many-electron atoms.
3.1 The central eld approximation
The Hamiltonian for an N-electron atom with nuclear charge +Ze can be written in the form:
$$ H = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m} \nabla_i^2 - \frac{Ze^2}{4\pi\epsilon_0 r_i} \right)
 + \sum_{i>j}^{N} \frac{e^2}{4\pi\epsilon_0 r_{ij}} , \tag{3.1} $$
where N = Z for a neutral atom. The subscripts i and j refer to individual electrons and $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$.
The first summation accounts for the kinetic energy of the electrons and their Coulomb interaction with
the nucleus, while the second accounts for the electron-electron repulsion.
It is not possible to find an exact solution to the Schrödinger equation with a Hamiltonian of the
form given by eqn 3.1, because the electron-electron repulsion term depends on the co-ordinates of two
of the electrons, and so we cannot separate the wave function into a product of single-particle states.
Furthermore, the electron-electron repulsion term is comparable in magnitude to the first summation,
making it impossible to use perturbation theory either. The description of multi-electron atoms therefore
usually starts with the central field approximation in which we re-write the Hamiltonian of eqn 3.1
in the form:$^1$
$$ H = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m} \nabla_i^2 + V_{\rm central}(r_i) \right) + V_{\rm residual} , \tag{3.2} $$
where $V_{\rm central}$ is the central field and $V_{\rm residual}$ is the residual electrostatic interaction.
The central field approximation works in the limit where
$$ \left| \sum_{i=1}^{N} V_{\rm central}(r_i) \right| \gg \left| V_{\rm residual} \right| . \tag{3.3} $$
In this case, we can treat $V_{\rm residual}$ as a perturbation, and worry about it later. We then have to solve a
Schrödinger equation in the form:
$$ \left[ \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m} \nabla_i^2 + V_{\rm central}(r_i) \right) \right] \Psi = E \, \Psi . \tag{3.4} $$
$^1$ A field is described as central if the potential energy has spherical symmetry about the origin, so that V(r) only
depends on r. The fact that V does not depend on $\theta$ or $\phi$ means that the force is parallel to $\mathbf{r}$, i.e. it points centrally
towards or away from the origin.
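This is not as bad as it looks. By writing$^2$
$$ \Psi = \psi_1(\mathbf{r}_1) \, \psi_2(\mathbf{r}_2) \cdots \psi_N(\mathbf{r}_N) , \tag{3.5} $$
we end up with N separate Schrödinger equations of the form:
$$ \left( -\frac{\hbar^2}{2m} \nabla_i^2 + V_{\rm central}(r_i) \right) \psi_i(\mathbf{r}_i) = E_i \, \psi_i(\mathbf{r}_i) , \tag{3.6} $$
with
$$ E = E_1 + E_2 + \cdots + E_N . \tag{3.7} $$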
This is much more tractable. We might need a computer to solve any one of the single-particle
Schrödinger equations of the type given in eqn 3.6, but at least it is possible in principle. Furthermore, the
fact that the potentials that appear in eqn 3.6 only depend on the radial co-ordinate $r_i$ (i.e. no dependence
on the angles $\theta_i$ and $\phi_i$) means that every electron is in a well-defined orbital angular momentum state,$^3$
and that the separation of variables discussed in Section 1.5.2 is valid. In analogy with eqn 1.40, we can
then write:
$$ \psi_i(\mathbf{r}_i) \equiv \psi(r_i, \theta_i, \phi_i) = R_i(r_i) \, Y_i(\theta_i, \phi_i) . \tag{3.8} $$
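By proceeding exactly as in Section 1.5.2, we end up with two equations, namely:
$$ \hat{L}_i^2 \, Y_{l_i m_i}(\theta_i, \phi_i) = \hbar^2 l_i (l_i + 1) \, Y_{l_i m_i}(\theta_i, \phi_i) , \tag{3.9} $$
and
$$ \left( -\frac{\hbar^2}{2m} \frac{1}{r_i^2} \frac{d}{dr_i} \left( r_i^2 \frac{d}{dr_i} \right)
 + \frac{\hbar^2 l_i (l_i + 1)}{2m r_i^2} + V_{\rm central}(r_i) \right) R_i(r_i) = E_i \, R_i(r_i) . \tag{3.10} $$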
The first tells us that the angular part of the wave functions will be given by the spherical harmonic
functions described in Section 1.5.1, while the second one allows us to work out the energy and radial
wave function for a given form of $V_{\rm central}(r_i)$ and value of $l_i$. Each electron will therefore have four
quantum numbers:
• $l$ and $m_l$: these drop out of the angular equation for each electron, namely eqn 3.9.
• $n$: this arises from solving eqn 3.10 with the appropriate form of $V_{\rm central}(r)$ for a given value of $l$.
  $n$ and $l$ together determine the radial wave function $R_{nl}(r)$ (which cannot be expected to be the
  same as the hydrogenic ones given in Table 1.4) and the energy of the electron.
• $m_s$: spin has not entered the argument. Each electron can therefore either have spin up ($m_s = +1/2$)
  or down ($m_s = -1/2$), as usual. We do not need to specify the spin quantum number $s$ because it
  is always equal to 1/2.
The state of the many-electron atom is then found by working out the wave functions of the individual
electrons and finding the total energy of the atom according to eqn 3.7, subject to the constraints imposed
by the Pauli exclusion principle. This provides a useful working model that will be explored in detail
below.
In the following sections we shall consider the experimental evidence for the shell model which proves
that the central field approximation is a good one. The reason it works is based on the nature of the shells.
An individual electron experiences an electrostatic potential due to the Coulomb repulsion from all the
other electrons in the atom. Nearly all of the electrons in a many-electron atom are in closed sub-shells,
which have spherically-symmetric charge clouds. The off-radial forces from electrons in these closed shells
cancel because of the spherical symmetry. Furthermore, the off-radial forces from electrons in unfilled
shells are usually relatively small compared to the radial ones. We therefore expect the approximation
given in eqn 3.3 to be valid for most atoms.
3.2 The shell model and the periodic table
We summarize here what we know so far about atomic states.
2
The fact that electrons are indistinguishable particles means that we cannot distinguish physically between the case
with electron 1 in state 1, electron 2 in state 2, . . . , and the case with electron 2 in state 1, electron 1 in state 2, . . . , etc.
We should therefore really write down a linear combination of all such possibilities. We shall reconsider this point when
considering the helium atom in Chapter 5.
3
As noted in Section 1.5.1, the torque on the electron is zero if the force points centrally towards the nucleus. This
means that the orbital angular momentum is constant.
3.2. THE SHELL MODEL AND THE PERIODIC TABLE 27
Quantum number symbol Value
principal n any integer > 0
orbital l integer up to (n 1)
magnetic m
l
integer from l to +l
spin m
s
1/2
Table 3.1: Quantum numbers for electrons in atoms.
Shell  n  l  m_l                         m_s    N_shell  N_accum
1s     1  0  0                           ±1/2   2        2
2s     2  0  0                           ±1/2   2        4
2p     2  1  −1, 0, +1                   ±1/2   6        10
3s     3  0  0                           ±1/2   2        12
3p     3  1  −1, 0, +1                   ±1/2   6        18
4s     4  0  0                           ±1/2   2        20
3d     3  2  −2, −1, 0, +1, +2           ±1/2   10       30
4p     4  1  −1, 0, +1                   ±1/2   6        36
5s     5  0  0                           ±1/2   2        38
4d     4  2  −2, −1, 0, +1, +2           ±1/2   10       48
5p     5  1  −1, 0, +1                   ±1/2   6        54
6s     6  0  0                           ±1/2   2        56
4f     4  3  −3, −2, −1, 0, +1, +2, +3   ±1/2   14       70
5d     5  2  −2, −1, 0, +1, +2           ±1/2   10       80
6p     6  1  −1, 0, +1                   ±1/2   6        86
7s     7  0  0                           ±1/2   2        88
Table 3.2: Atomic shells, listed in order of increasing energy. N_shell is equal to 2(2l + 1) and is the number of electrons that can fit into the shell due to the degeneracy of the m_l and m_s levels. The last column, N_accum, gives the accumulated number of electrons that can be held by the atom once the particular shell and all the lower ones have been filled.
1. The electronic states are specified by four quantum numbers: n, l, $m_l$ and $m_s$. The values that these quantum numbers can take are summarized in Table 3.1. In spectroscopic notation, electrons with l = 0, 1, 2, 3, . . . are called s, p, d, f, . . . electrons.
2. The gross energy of the electron is determined by n and l, except in hydrogenic atoms, where the gross structure depends only on n.
3. In the absence of fine structure and external magnetic fields, all the states with the same values of n and l are degenerate. Each (n, l) term of the gross structure therefore contains 2(2l + 1) degenerate levels.
4. Electrons are indistinguishable, spin 1/2 particles and are therefore fermions. This means that they obey the Pauli exclusion principle, so that only one electron can occupy a particular quantum state.⁴
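The degeneracy quoted in point 3 can be checked by listing the states explicitly. The following is a minimal sketch; the function name `shell_states` is an illustrative choice and not part of the notes.

```python
from fractions import Fraction

def shell_states(n, l):
    """List every (n, l, m_l, m_s) state in the shell labelled by n and l."""
    if not (n >= 1 and 0 <= l <= n - 1):
        raise ValueError("need n >= 1 and 0 <= l <= n - 1")
    half = Fraction(1, 2)
    return [(n, l, m_l, m_s)
            for m_l in range(-l, l + 1)      # m_l runs from -l to +l
            for m_s in (-half, +half)]       # two spin states per m_l

for label, n, l in [("1s", 1, 0), ("2p", 2, 1), ("3d", 3, 2), ("4f", 4, 3)]:
    states = shell_states(n, l)
    # The count should agree with the N_shell = 2(2l + 1) column of Table 3.2.
    assert len(states) == 2 * (2 * l + 1)
    print(label, len(states))
```

Each tuple is one quantum state in the sense of the Pauli principle, so the length of the list is the maximum occupancy of the shell.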
In the shell model of multi-electron atoms, we forget about fine structure and external magnetic fields, and just concentrate on the gross structure.⁵ The energy levels are ordered according to the quantum numbers n and l, with big jumps in energy each time we move to the next set of quantum numbers. The degenerate states with the same values of n and l are called shells. As we add electrons to the atom, the Pauli exclusion principle dictates that the electrons fill up the lowest available shell until it is full, and then go on to the next one. The filling up of the shells in order of increasing energy in multi-electron atoms is sometimes called the Aufbau principle,⁶ and is the basis of the periodic table of elements. The shells are listed in order of increasing energy in Table 3.2.
4
We shall discuss how the Pauli exclusion principle gives rise to exchange energy shifts in Chapter 5.
5
This approximation is justified by the fact that the fine structure and magnetic field splittings are smaller than the gross structure energies by a factor of about $Z^2\alpha^2 = (Z/137)^2 \sim 10^{-4} Z^2$. Note, however, that the fine structure energy can get to be quite significant for large Z.
6
The German word Aufbau means building up.
Element   Atomic number   Electronic configuration
H         1               $1s^1$
He        2               $1s^2$
Li        3               $1s^2\,2s^1$
Be        4               $1s^2\,2s^2$
B         5               $1s^2\,2s^2\,2p^1$
C         6               $1s^2\,2s^2\,2p^2$
N         7               $1s^2\,2s^2\,2p^3$
O         8               $1s^2\,2s^2\,2p^4$
F         9               $1s^2\,2s^2\,2p^5$
Ne        10              $1s^2\,2s^2\,2p^6$
Na        11              $1s^2\,2s^2\,2p^6\,3s^1$
Table 3.3: The electronic configuration of the first 11 elements of the periodic table.
7s
6h 6g 6f 6d 6p 6s
5g 5f 5d 5p 5s
4f 4d 4p 4s
3d 3p 3s
2p 2s
1s
Figure 3.1: Atomic shells are filled in diagonal order when listed in rows according to the principal quantum number n.
Inspection of Table 3.2 shows us that the energy of the shells increases with n for a given l, and with l for a given n. We build up multi-electron atoms by adding electrons one by one, putting each electron into the lowest energy shell that has unfilled states. In general, this will be the one with the lowest n, but there are exceptions to this rule. For example, the 19th electron goes into the 4s shell rather than the 3d shell. Similarly, the 37th electron goes into the 5s shell rather than the 4d shell. This happens because the energy of a shell with a large l value may be higher than that of another shell with a larger value of n but a smaller value of l.
The periodic table of elements is built up by adding electrons into the shells as the atomic number increases. This allows us to determine the electronic configuration of the elements, that is, the quantum numbers of the electrons in the atom. The configurations of the first 11 elements are listed in Table 3.3. The superscript attached to the shell tells us how many electrons are in that shell. The process of filling the shells follows the pattern indicated in Table 3.2. The nl sub-shells are filled diagonally when laid out in rows determined by the principal quantum number n, as shown in Fig. 3.1.⁷
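The filling procedure described above can be written as a short routine. This is a sketch under the assumption of strict Aufbau filling in the order of Table 3.2; the names `FILLING_ORDER`, `capacity` and `configuration` are illustrative and not part of the notes.

```python
# Shells in order of increasing energy, copied from Table 3.2.
FILLING_ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p", "5s",
                 "4d", "5p", "6s", "4f", "5d", "6p", "7s"]
L_VALUES = {"s": 0, "p": 1, "d": 2, "f": 3}

def capacity(shell):
    """Maximum occupancy 2(2l + 1) of a shell such as '3d'."""
    return 2 * (2 * L_VALUES[shell[-1]] + 1)

def configuration(z):
    """Configuration of an atom with z electrons under strict Aufbau filling."""
    if not 1 <= z <= sum(capacity(s) for s in FILLING_ORDER):
        raise ValueError("Z is outside the range covered by Table 3.2")
    parts, remaining = [], z
    for shell in FILLING_ORDER:
        if remaining == 0:
            break
        occupied = min(remaining, capacity(shell))   # fill, or use what is left
        parts.append(f"{shell}^{occupied}")
        remaining -= occupied
    return " ".join(parts)

for z in (1, 3, 11, 19):
    print(z, configuration(z))
```

For Z = 11 this reproduces the sodium entry of Table 3.3, and for Z = 19 the last electron lands in 4s rather than 3d. Strict filling does not capture the exceptions to the general rules: for copper (Z = 29) it returns a configuration ending in 4s^2 3d^9, whereas the observed ground state is 4s^1 3d^10.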
3.3 Justification of the shell model
The theoretical justification for the shell model relies on the concept of screening. The idea is that the electrons in the inner shells screen the outer electrons from the potential of the nucleus. To see how this works we take sodium as an example.
Sodium has an atomic number of 11, and therefore has a nucleus with a charge of +11e with 11
electrons orbiting around it. The picture of the atom based on the shell model is shown in Fig. 3.2. The
7
There are some exceptions to the general rules. For example, copper (Cu) with Z = 29 has a configuration of $4s^1\,3d^{10}$ instead of $4s^2\,3d^9$. This happens because filled shells are particularly stable. It is therefore energetically advantageous to promote the second 4s electron into the 3d shell to give the very stable $3d^{10}$ configuration. The energy difference between the two configurations is not particularly large, which explains why copper sometimes behaves as though it is monovalent, and sometimes divalent.
[Figure: sodium atom, Z = 11. A nucleus of charge Q = +11e is surrounded by the 1s, 2s and 2p shells, with the valence electron in the 3s shell.]
Figure 3.2: The electronic configuration of the sodium atom according to the shell model.
Shell    n   Z_eff   Radius (Å)   Energy (eV)
1s       1   11      0.05         −1650
2s, 2p   2   9       0.24         −275
3s       3   1       4.8          −1.5
Table 3.4: Radii and energies of the principal atomic shells of sodium according to the Bohr model. The unit of 1 Ångström (Å) = $10^{-10}$ m.
radii and energies of the electrons in their shells are estimated using the Bohr formulae:
$$ r_n = \frac{n^2}{Z}\, a_H , \tag{3.11} $$
$$ E_n = -\left( \frac{Z}{n} \right)^2 R_H , \tag{3.12} $$
where $a_H = 5.29 \times 10^{-11}$ m is the Bohr radius of hydrogen, $R_H = 13.6$ eV is the Rydberg constant and Z is the atomic number.
The first two electrons go into the n = 1 shell. These electrons see the full nuclear charge of +11e. With n = 1 and Z = 11, we find $r_1 = (1^2/11)\, a_H = 0.05$ Å and $E_1 = -11^2 R_H = -1650$ eV. The next eight electrons go into the n = 2 shell. These are presumed to orbit outside the n = 1 shell. The two inner electrons partly screen the nuclear charge, and the n = 2 electrons see an effective charge of +9e, so that $Z_{\rm eff} = 9$. The radius is therefore $r_2 = (2^2/9)\, a_H = 0.24$ Å and the energy is $E_2 = -(9/2)^2 R_H = -275$ eV. Finally, the single n = 3 electron is screened by all ten inner electrons and sees $Z_{\rm eff} = 1$, which gives $r_3 = 3^2 a_H = 4.8$ Å and $E_3 = -(1/3)^2 R_H = -1.5$ eV. These values are summarized in Table 3.4. Note the large jump in energy and radius in moving from one shell to the next.
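The entries of Table 3.4 follow directly from eqns 3.11 and 3.12 once the effective charge of each shell has been chosen. The sketch below assumes, as in the text, that every electron in a lower shell screens exactly one unit of nuclear charge; the function names are illustrative.

```python
A_H_ANGSTROM = 0.529   # Bohr radius of hydrogen, in angstroms
R_H_EV = 13.6          # Rydberg energy, in eV

def bohr_shell(n, z_eff):
    """Return (radius in angstroms, energy in eV) from eqns 3.11 and 3.12."""
    radius = n**2 / z_eff * A_H_ANGSTROM
    energy = -(z_eff / n) ** 2 * R_H_EV
    return radius, energy

def screened_charges(z, occupancies):
    """Effective charge seen by each shell, assuming complete screening by
    all electrons in the shells below it (a crude model, as noted in the text)."""
    charges, inner = [], 0
    for count in occupancies:
        charges.append(z - inner)
        inner += count
    return charges

SODIUM_OCCUPANCIES = [2, 8, 1]   # electrons in the n = 1, 2, 3 shells
for n, z_eff in enumerate(screened_charges(11, SODIUM_OCCUPANCIES), start=1):
    r, e = bohr_shell(n, z_eff)
    print(f"n = {n}: Z_eff = {z_eff:2d}, r = {r:.2f} A, E = {e:.1f} eV")
```

The printed radii and energies agree with Table 3.4 to the precision quoted there.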
The treatment of the screening discussed in the previous paragraph is clearly over-simplified because it is based on Bohr-type orbits and does not treat the electron-electron repulsion properly. In Section 3.5 we shall see how we might improve on it. One point to realize, however, is that the model is reasonably self-consistent: by assuming that the inner shells screen the outer ones, we find that the orbital radius increases in each subsequent shell, which corroborates our original assumption. This is why the model works so well.
3.4 Experimental evidence for the shell model
There is a wealth of experimental evidence to confirm that the shell model is a good one. The main points are discussed briefly here.
Ionization potentials and atomic radii
The ionization potentials of the noble gas elements are the highest within a particular period of the periodic table, while those of the alkali metals are the lowest. This can be seen by looking at the data in Fig. 3.3. The ionization potential gradually increases as the atomic number increases until the shell is filled, and then it drops abruptly. This shows that the filled shells are very stable, and that the valence electrons go in larger, less tightly-bound orbits. The results correlate with the chemical activity of the
[Figure: first ionization potential (eV) plotted against atomic number Z, with He, Ne, Ar and Li, Na, K labelled.]
Figure 3.3: First ionization potentials of the elements up to calcium. The noble gas elements (He, Ne, Ar) have highly stable fully filled shells with large ionization potentials. The alkali metals (Li, Na, K) have one weakly-bound valence electron outside fully-filled shells.
[Figure: (a) X-ray tube with heater, cathode and anode connected to a kV supply, showing the electron beam and the emitted X-rays; (b) shell diagram with n = 1 (K), 2 (L), 3 (M), 4 (N), 5 (O), 6 (P), showing the transitions of the K series.]
Figure 3.4: (a) A typical X-ray tube. Electrons are accelerated with a voltage of several kV and impact on a target, causing it to emit X-rays. (b) Transitions occurring in the K-series emission lines. An electron from the discharge tube ejects one of the K-shell electrons of the target, leaving an empty level in the K shell. X-ray photons are emitted as electrons from the higher shells drop down to fill the hole in the K-shell.
elements. The noble gases require large amounts of energy to liberate their outermost electrons, and
they are therefore chemically inert. The alkali metals, on the other hand, need much less energy, and are
therefore highly reactive.
It is also found that the average atomic radius determined by X-ray crystallography on closely packed
crystals is largest for the alkali metals. This is further evidence that we have weakly-bound valence
electrons outside strongly-bound, small-radius, inner shells.
X-ray line spectra
Measurements of X-ray line spectra allow the energies of the inner shells to be determined directly. The experimental arrangement for observing an X-ray emission spectrum is shown in Fig. 3.4(a). Electrons are accelerated across a potential drop of several kV and then impact on a target. This ejects core electrons from the inner shells of the target, as shown in Fig. 3.4(b). X-ray photons are emitted as the higher energy electrons drop down to fill the empty level (or hole) in the lower shell.
Each target emits a series of characteristic lines. The series generated when a K-shell (n = 1) electron
has been ejected is called the K-series. Similarly, the L- and M-series correspond to ejection of L-shell
(n = 2) or M-shell (n = 3) core electrons respectively. This old spectroscopic notation dates back to the
early work on X-ray spectra.
[Figure: (a) emission intensity (arb. units) against wavelength (Å) for tube voltages of 40 kV and 80 kV; (b) absorption cross section (cm²) against photon energy (keV), with the features labelled L and K.]
Figure 3.5: (a) X-ray emission spectra for tungsten at two different electron voltages. The sharp lines are caused by radiative transitions after the electron beam ejects an inner shell electron, as indicated in Fig. 3.4(b). The continuum is caused by bremsstrahlung, which has a short wavelength limit equal to hc/eV at voltage V. (b) X-ray absorption cross-section spectrum for lead.
Figure 3.5(a) shows a typical X-ray emission spectrum. The spectrum consists of a series of sharp lines on top of a continuous spectrum. The groups of sharp lines are generated by radiative transitions following the ejection of an inner shell electron as indicated in Fig. 3.4(b). The group of lines around 0.2 Å originates from K-shell transitions, while the three groups of lines between 1.0 Å and 1.6 Å arise from L-shell transitions. A particular set of lines is only observed if the tube voltage is high enough to eject the relevant electron. Hence new groups of lines appear as the voltage is increased, as the higher energy electron beam ejects ever deeper inner shell electrons. At a given voltage, several groups of lines are observed as the hole in the initial shell moves up through the higher shells. For example, L-shell lines are observed at the same time as K-shell lines after the electron in the L-shell drops to the hole in the K-shell, thus leaving a hole in the L-shell, and so on.
The continuous spectrum is caused by bremsstrahlung.⁸ Bremsstrahlung occurs when the electron is scattered by the atoms without ejecting a core electron from the target. The acceleration of the electron associated with its change of direction causes it to radiate. Conservation of energy demands that the frequency of the radiation must cut off when $h\nu = eV$, V being the voltage across the tube. This means that the minimum wavelength is equal to $hc/eV$. The reduction of the short wavelength limit of the bremsstrahlung with increasing voltage is apparent in the data shown in Fig. 3.5(a).
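The short wavelength limit hc/eV is easily evaluated for the two tube voltages of Fig. 3.5(a). This is a minimal sketch; the rounded values of the constants and the function name are choices made here, not taken from the notes.

```python
H = 6.626e-34          # Planck constant, J s
C = 2.998e8            # speed of light, m/s
E_CHARGE = 1.602e-19   # elementary charge, C

def cutoff_wavelength_angstrom(voltage):
    """Short-wavelength bremsstrahlung limit hc/eV, in angstroms."""
    if voltage <= 0:
        raise ValueError("tube voltage must be positive")
    return H * C / (E_CHARGE * voltage) * 1e10   # metres -> angstroms

for kilovolts in (40, 80):
    limit = cutoff_wavelength_angstrom(kilovolts * 1e3)
    print(f"{kilovolts} kV: {limit:.3f} A")
```

Doubling the voltage halves the cut-off wavelength, from roughly 0.31 Å at 40 kV to roughly 0.155 Å at 80 kV.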
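The energy of an electron in an inner shell with principal quantum number n is given by:
$$ E_n = -\frac{\left( Z^{n}_{\rm eff} \right)^2}{n^2}\, R_H , \tag{3.13} $$
where $Z^{n}_{\rm eff}$ is the effective nuclear charge, and $R_H = 13.6$ eV. The difference between Z (the atomic number of the target) and $Z^{n}_{\rm eff}$ is caused by the screening effect of the other electrons. The energy of the optical transition from $n \to n'$ is thus given by:
$$ h\nu = |E_{n'} - E_n| = \left| \frac{\left( Z^{n}_{\rm eff} \right)^2}{n^2} - \frac{\left( Z^{n'}_{\rm eff} \right)^2}{n'^2} \right| R_H . \tag{3.14} $$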
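In practice, the wavelengths of the various series of emission lines are found to obey Moseley's law, where we make the approximation $Z^{n}_{\rm eff} = Z^{n'}_{\rm eff}$ and write both as $(Z - \sigma_n)$. For example, the K-shell lines are given by:⁹
$$ \frac{hc}{\lambda} \approx (Z - \sigma_K)^2 R_H \left( \frac{1}{1^2} - \frac{1}{n^2} \right) , \tag{3.15} $$
where n > 1 and $\sigma_K \approx 3$. Similarly, the L-shell spectra obey:
$$ \frac{hc}{\lambda} \approx (Z - \sigma_L)^2 R_H \left( \frac{1}{2^2} - \frac{1}{n^2} \right) , \tag{3.16} $$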
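As a rough illustration of eqn 3.15, the sketch below estimates the n = 2 → 1 line of the K series for tungsten (Z = 74), the target used in Fig. 3.5(a). The screening value of 3 is the approximate figure quoted above, and the function name and the rounded value of hc are assumptions made for this example.

```python
R_H_EV = 13.6             # Rydberg energy, eV
HC_EV_ANGSTROM = 12398.4  # hc expressed in eV angstrom

def moseley_k_line(z, n, sigma_k=3.0):
    """Photon energy (eV) and wavelength (angstrom) of the n -> 1 K-series
    line according to eqn 3.15."""
    if n <= 1:
        raise ValueError("K-series lines need n > 1")
    energy = (z - sigma_k) ** 2 * R_H_EV * (1.0 - 1.0 / n**2)
    return energy, HC_EV_ANGSTROM / energy

energy, wavelength = moseley_k_line(z=74, n=2)   # tungsten, L shell -> K shell
print(f"E = {energy / 1e3:.1f} keV, wavelength = {wavelength:.2f} A")
```

The estimate comes out at about 51 keV, or 0.24 Å, which is in the same region as the group of K-shell lines near 0.2 Å in Fig. 3.5(a). Closer agreement should not be expected from such a simple screening model.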
8
German: brems = braking (i.e. deceleration) + strahlung = radiation.
9
There is no real scientific justification for the approximation $Z^{n}_{\rm eff} = Z^{n'}_{\rm eff}$.
A.
3.5 Effective potentials, screening, and alkali metals
The electrons in a multi-electron atom arrange themselves with the smallest number of electrons in unfilled shells outside inner filled shells. These outermost electrons are called the valence electrons of the atom. They are responsible for the chemical activity of the particular elements.
10
Absorption coefficients are often expressed as cross sections. The cross section is equal to the effective area of the beam that is blocked out by the absorption of an individual atom. If there are N atoms per unit volume, and the cross section is equal to $\sigma_{\rm abs}$, the absorption coefficient in m$^{-1}$ is equal to $N\sigma_{\rm abs}$.
11
The spin-orbit effect is zero for s-states such as the 1s and 2s sub-shells, because they have l = 0. See eqn 6.32.
Element     Z    Electronic configuration
Lithium     3    $1s^2\,2s^1$
Sodium      11   [Ne] $3s^1$
Potassium   19   [Ar] $4s^1$
Rubidium    37   [Kr] $5s^1$
Cesium      55   [Xe] $6s^1$
Table 3.5: Alkali metals. The symbol [. . . ] indicates that the inner shells are filled according to the electronic configuration of the noble gas element identified in the bracket.
In order to work out the energy levels of the valence electrons, we need to solve the N-electron Schrödinger equation given in eqn 3.1. Within the central-field approximation, each valence electron satisfies a Schrödinger equation of the type given in eqn 3.6, which can be written in the form:
$$ \left( -\frac{\hbar^2}{2m} \nabla^2 + V^{l}_{\rm eff}(r) \right) \psi = E\, \psi . \tag{3.18} $$
The Coulomb repulsion from the core electrons is lumped into the effective potential $V^{l}_{\rm eff}(r)$. This is only an approximation to the real behaviour, but it can be reasonably good, depending on how well we work out $V^{l}_{\rm eff}(r)$. Note that the effective potential depends on l. This arises from the term in l that appears in eqn 3.10 and has important consequences, as we shall see below.
The overall dependence of $V_{\rm eff}(r)$ on r must look something like Fig. 3.6. At very large values of r, the outermost valence electron will be well outside any filled shells, and will thus only see an attractive potential equivalent to a charge of +e. On the other hand, if r is very small, the electron will see the full nuclear charge of +Ze. The potential at intermediate values of r must lie somewhere between these two limits: hence the generic form of $V_{\rm eff}(r)$ shown in Fig. 3.6. The task of calculating $V^{l}_{\rm eff}(r)$ keeps theoretical atomic physicists busy. Two common approximation techniques used to perform the calculations are called the Hartree and Thomas-Fermi methods.
As a specific example, we consider the alkali metals such as lithium, sodium and potassium, which come from group I of the periodic table. They have one valence electron outside filled inner shells, as indicated in Table 3.5. They are therefore approximately one-electron systems, and can be understood by introducing a phenomenological number called the quantum defect to describe the energies. To see how this works, we consider the sodium atom.
The shell model picture of sodium is shown in Fig. 3.2. The optical spectra are determined by excitations of the outermost 3s electron. The energy of each (n, l) term of the valence electron is given by:
$$ E_{nl} = -\frac{R_H}{[n - \delta(l)]^2} , \tag{3.19} $$
where $n \ge 3$ and $\delta(l)$ is the quantum defect. The quantum defect allows for the penetration of the inner shells by the valence electron.
The dependence of the quantum defect on l can be understood with reference to Fig. 3.6(b). This shows the radial probability densities $P_{nl}(r) = r^2 |R(r)|^2$ for the 3s and 3p orbitals of a hydrogenic atom with Z = 1, which might be expected to be a reasonable approximation for the single valence electron of sodium. The shaded region near r = 0 represents the inner n = 1 and n = 2 shells with radii of $0.09\,a_0$ and $0.44\,a_0$ respectively. (See Section 3.3.) We see that both the 3s and 3p orbitals penetrate the inner shells, and that this penetration is much greater for the 3s electron. The electron will therefore see a larger effective nuclear charge for part of its orbit, and this will have the effect of reducing the energies. The energy reduction is largest for the 3s electron due to its larger core penetration.
The quantum defect $\delta(l)$ was introduced empirically to account for the optical spectra. In principle it should depend on both n and l, but it was found experimentally to depend mainly on l. This can be seen from the values of the quantum defect for sodium tabulated in Table 3.6. The corresponding energy spectrum is shown schematically in Fig. 3.7. Note that $\delta(l)$ is very small for $l \ge 2$.
We can use the quantum defect to calculate the wavelengths of the emission lines. The D lines correspond to the 3p → 3s transition. By using the values of $\delta$ given in Table 3.6, we find that the
l    n = 3    n = 4    n = 5    n = 6
0    1.373    1.357    1.352    1.349
1    0.883    0.867    0.862    0.859
2    0.010    0.011    0.013    0.011
3             0.000    −0.001   −0.008
Table 3.6: Values of the quantum defect $\delta(l)$ for sodium against n and l.
[Figure: energy levels of sodium arranged in columns for l = 0 (n = 3 to 6), l = 1 (n = 3 to 5), l = 2 (n = 3, 4) and l ≥ 3 (n = 4), compared with the hydrogen levels for n = 3, 4 and 5.]
Figure 3.7: Schematic energy level diagram for sodium, showing the ordering of the energy levels.
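wavelength is given by:
$$ \frac{hc}{\lambda} = R_H \left( \frac{1}{[3 - \delta(3s)]^2} - \frac{1}{[3 - \delta(3p)]^2} \right) = (1.10 \times 10^5\ {\rm cm}^{-1}) \left( \frac{1}{1.627^2} - \frac{1}{2.117^2} \right) , $$
with $R_H$ expressed in wave number units. This gives the wave number $\bar{\nu} = 1.70 \times 10^4$ cm$^{-1}$, and so $\lambda$ is equal to 590 nm, as we would expect for the yellow D-lines of sodium.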
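The same calculation can be repeated for any pair of terms in Table 3.6. The sketch below uses eqn 3.19 in wave number units; the slightly more precise Rydberg value of 1.097 × 10⁵ cm⁻¹ and the function names are assumptions made for this example.

```python
R_H_CM = 1.097e5   # Rydberg constant in wave number units, cm^-1

# Quantum defects for sodium copied from Table 3.6: (n, l) -> delta
QUANTUM_DEFECT = {(3, 0): 1.373, (3, 1): 0.883, (3, 2): 0.010,
                  (4, 0): 1.357, (4, 1): 0.867, (4, 2): 0.011}

def term_energy(n, l):
    """Energy of the (n, l) term in cm^-1 from eqn 3.19 (negative = bound)."""
    return -R_H_CM / (n - QUANTUM_DEFECT[(n, l)]) ** 2

def transition_wavelength_nm(upper, lower):
    """Wavelength in nm of the transition between two (n, l) terms."""
    wavenumber = term_energy(*upper) - term_energy(*lower)   # cm^-1
    return 1e7 / wavenumber                                  # 1 cm = 1e7 nm

print(f"3p -> 3s: {transition_wavelength_nm((3, 1), (3, 0)):.1f} nm")
```

The 3p → 3s result lies within about a nanometre of the 590 nm quoted above. The fine structure splitting of the D lines is not included, because the quantum defect model describes only the gross structure.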
Reading
Bransden and Joachain, Physics of Atoms and Molecules, §§ 8.1, 8.2, 9.4, 9.7.
Demtröder, Atoms, Molecules and Photons, §§ 6.2–6.4, 7.5.
Haken and Wolf, The Physics of Atoms and Quanta, chapters 11, 18 & 19.
Phillips, A.C., Introduction to Quantum Mechanics, chapter 11.
Eisberg and Resnick, Quantum Physics, chapters 9 & 10.
Beiser, Concepts of Modern Physics, chapter 7.
Chapter 4
Angular momentum
We noted in Section 1.5.1 that the treatment of angular momentum is very important for understanding the properties of atoms. It is now time to explore these effects in detail, and to see how this leads to the classification of the quantized states of atoms by their angular momentum.
4.1 Conservation of angular momentum
In the sections that follow, we are going to consider several different types of angular momentum, and the ways in which they are coupled together. Before going into the details, it is useful to stress one very important point related to conservation of angular momentum. In an isolated atom, there are many forces (and hence torques) acting inside the atom. These internal forces cannot change the total angular momentum of the atom, since conservation of angular momentum demands that the angular momentum of the atom as a whole must be conserved in the absence of any external torques. The total angular momentum of the atom is normally determined by its electrons. The total electronic angular momentum is written $\mathbf{J}$, and is specified by the quantum number J. The principle of conservation of angular momentum therefore requires that isolated atoms always have well-defined J states.¹ It is this J value that determines, for example, the magnetic dipole moment of the atom.
The principle of conservation of angular momentum does not apply, of course, when external perturbations are applied. The most obvious example is the perturbation caused by the emission or absorption of a photon. In this case the angular momentum of the atom must change because the photon itself carries angular momentum, and the angular momentum of the whole system (atom + photon) has to be conserved. The change in J is then governed by selection rules, as discussed, for example, in Section 4.8. Another obvious example is the effect of a strong external DC magnetic field. In this case it is possible for the magnetic field to produce states where the component of angular momentum along the direction of the field is well-defined, but not the total angular momentum. (See the discussion of the Paschen-Back effect in Section 7.1.3.)
4.2 Types of angular momentum
The electrons in atoms possess two different types of angular momentum, namely orbital and spin angular momentum. These are discussed separately below.
4.2.1 Orbital angular momentum
The electrons in atoms orbit around the nucleus, and therefore possess orbital angular momentum. In classical mechanics, we define the orbital angular momentum of a particle by:
$$ \mathbf{L} = \mathbf{r} \times \mathbf{p} , \tag{4.1} $$
1
This statement about J has to be qualified somewhat when we add in the effects of the nucleus. The angular momentum of an atom is the resultant of the electronic angular momentum and the nuclear spin. The total angular momentum of an isolated atom has to be conserved, but the electrons can exchange angular momentum with the nucleus through hyperfine interactions. (See Section 6.7.2.) These interactions are very weak, and can usually be neglected except when explicitly considering nuclear effects.
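where $\mathbf{r}$ is the radial position, and $\mathbf{p}$ is the linear momentum. The components of $\mathbf{L}$ are given by
$$ \begin{pmatrix} L_x \\ L_y \\ L_z \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \times \begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix} = \begin{pmatrix} y p_z - z p_y \\ z p_x - x p_z \\ x p_y - y p_x \end{pmatrix} . \tag{4.2} $$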
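In quantum mechanics we represent the linear momentum by differential operators of the type
$$ \hat{p}_x = -i\hbar \frac{\partial}{\partial x} . \tag{4.3} $$
Therefore, the quantum mechanical operators for the orbital angular momentum are given by:
$$ \hat{L}_x = -i\hbar \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) \tag{4.4} $$
$$ \hat{L}_y = -i\hbar \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right) \tag{4.5} $$
$$ \hat{L}_z = -i\hbar \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right) . \tag{4.6} $$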
Note that the "hat" symbol indicates that we are representing an operator and not just a number.
In classical mechanics, the magnitude of the angular momentum is given by:
$$ L^2 = L_x^2 + L_y^2 + L_z^2 . $$
We therefore define the quantum mechanical operator for the magnitude of the angular momentum by:
$$ \hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 . \tag{4.7} $$
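Note that operators like $\hat{L}_x^2$ should be understood in terms of repeated operations:
$$ \hat{L}_x^2 \psi = -\hbar^2 \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) \psi $$
$$ \phantom{\hat{L}_x^2 \psi} = -\hbar^2 \left( y^2 \frac{\partial^2 \psi}{\partial z^2} - y \frac{\partial \psi}{\partial y} - z \frac{\partial \psi}{\partial z} - 2yz \frac{\partial^2 \psi}{\partial y \partial z} + z^2 \frac{\partial^2 \psi}{\partial y^2} \right) . $$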
A key property of the orbital angular momentum operator is that its components do not commute with each other, but they do commute with $\hat{L}^2$. We can summarise this by writing the commutators:²
$$ [\hat{L}_x, \hat{L}_y] \neq 0 , \qquad [\hat{L}^2, \hat{L}_z] = 0 . \tag{4.8} $$
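The non-commutation of the components can be proved as follows:
$$ \hat{L}_x \hat{L}_y \psi = (-i\hbar)^2 \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right) \psi , $$
$$ \phantom{\hat{L}_x \hat{L}_y \psi} = -\hbar^2 \left( yz \frac{\partial^2 \psi}{\partial z \partial x} + y \frac{\partial \psi}{\partial x} - yx \frac{\partial^2 \psi}{\partial z^2} - z^2 \frac{\partial^2 \psi}{\partial y \partial x} + zx \frac{\partial^2 \psi}{\partial y \partial z} \right) . $$
On the other hand, we have:
$$ \hat{L}_y \hat{L}_x \psi = (-i\hbar)^2 \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right) \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) \psi , $$
$$ \phantom{\hat{L}_y \hat{L}_x \psi} = -\hbar^2 \left( zy \frac{\partial^2 \psi}{\partial x \partial z} - z^2 \frac{\partial^2 \psi}{\partial x \partial y} - xy \frac{\partial^2 \psi}{\partial z^2} + xz \frac{\partial^2 \psi}{\partial z \partial y} + x \frac{\partial \psi}{\partial y} \right) . $$
On recalling that $\partial^2 \psi / \partial x \partial y = \partial^2 \psi / \partial y \partial x$, we find:
$$ \hat{L}_x \hat{L}_y \psi - \hat{L}_y \hat{L}_x \psi \equiv [\hat{L}_x, \hat{L}_y] \psi = -\hbar^2 \left( y \frac{\partial \psi}{\partial x} - x \frac{\partial \psi}{\partial y} \right) , $$
$$ \phantom{[\hat{L}_x, \hat{L}_y] \psi} = i\hbar \cdot (-i\hbar) \left( x \frac{\partial \psi}{\partial y} - y \frac{\partial \psi}{\partial x} \right) , $$
$$ \phantom{[\hat{L}_x, \hat{L}_y] \psi} = i\hbar\, \hat{L}_z \psi . $$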
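2
The commutator of two quantum mechanical operators $\hat{A}$ and $\hat{B}$ is defined by: $[\hat{A}, \hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}$. Hence $[\hat{L}_x, \hat{L}_y] = \hat{L}_x \hat{L}_y - \hat{L}_y \hat{L}_x$.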
We therefore conclude that:
$$ [\hat{L}_x, \hat{L}_y] = i\hbar\, \hat{L}_z . \tag{4.9} $$
The other commutators of the angular momentum operators, namely $[\hat{L}_y, \hat{L}_z]$ and $[\hat{L}_z, \hat{L}_x]$, are obtained by cyclic permutation of the indices in Eq. 4.9: x → y, y → z, z → x.
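The commutation relations can also be checked numerically. The sketch below does not use the differential operators of eqns 4.4 to 4.6; instead it uses the standard 3 × 3 matrix representation of the l = 1 operators in the basis m = +1, 0, −1, in units where ħ = 1. The helper names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]

def scaled(factor, a):
    return [[factor * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

# l = 1 matrices in the basis m = +1, 0, -1, with hbar = 1.
r = 1 / math.sqrt(2)
LX = [[0, r, 0], [r, 0, r], [0, r, 0]]
LY = [[0, -1j * r, 0], [1j * r, 0, -1j * r], [0, 1j * r, 0]]
LZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

# Eq. 4.9 and its two cyclic permutations.
assert close(commutator(LX, LY), scaled(1j, LZ))
assert close(commutator(LY, LZ), scaled(1j, LX))
assert close(commutator(LZ, LX), scaled(1j, LY))
print("all three commutation relations hold for l = 1")
```

This is a consistency check for one value of l rather than a proof. The derivation with differential operators given above is the general argument.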
It can be shown that the measurable quantities corresponding to two quantum mechanical operators that do not commute must obey an uncertainty principle. The general result for operators $\hat{A}$ and $\hat{B}$ is:
$$ (\Delta A)^2 (\Delta B)^2 \ge \frac{1}{4} \left| \langle [\hat{A}, \hat{B}] \rangle \right|^2 . \tag{4.10} $$
The Heisenberg uncertainty principle $\Delta x\, \Delta p \ge \hbar/2$ is a well known example of this.³ The non-commutation of the components of $\mathbf{L}$ thus implies that it is not possible to know the values of $L_x$, $L_y$, $L_z$ simultaneously: we can only know one of them (usually $L_z$) at any time. Once $L_z$ is known, we cannot know $L_x$ and $L_y$ as well. On the other hand, the fact that $\hat{L}_z$ commutes with $\hat{L}^2$ (cf. eqn 4.8) means that we can know the length of the angular momentum vector and its z component simultaneously. In summary:
• We can know the length of the angular momentum vector $\mathbf{L}$ and one of its components. For mathematical convenience, we usually take the component we know to be $L_z$.
• We cannot know the values of all three components of the angular momentum simultaneously.
The eigenvalues of the angular momentum operators were discussed in Section 1.5.1. The orbital angular momentum is specified by two quantum numbers: l and m. The latter is sometimes given an extra subscript (i.e. $m_l$) to distinguish it from the spin quantum number $m_s$ considered below. The magnitude of $\mathbf{l}$ is given by
$$ |\mathbf{l}| = \sqrt{l(l+1)}\, \hbar , \tag{4.11} $$
and the component along the z axis by
$$ l_z = m\hbar . \tag{4.12} $$
Note that we have switched to a lower case notation here because we are referring to a single electron. (See Section 4.7.) l can take non-negative integer values (0, 1, 2, . . . ) and m can take values in integer steps from −l to +l. The number of m states for each l state is therefore equal to (2l + 1). These m states are degenerate in isolated atoms, but can be split by external perturbations (e.g. magnetic or electric fields).
The quantisation of the angular momentum can be represented pictorially in the vector model shown in figure 1.5. In this model the angular momentum is represented as a vector of length $\sqrt{l(l+1)}\,\hbar$ angled so that its component along the z axis is equal to $m\hbar$. The x and y components of the angular momentum are not known.
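The geometry of the vector model follows from eqns 4.11 and 4.12: the angle θ between the vector and the z axis satisfies cos θ = m/√(l(l+1)). The following sketch tabulates the allowed angles; the function name is an illustrative choice.

```python
import math

def vector_model(l):
    """For each allowed m, return (m, angle in degrees between the angular
    momentum vector and the z axis). Lengths are in units of hbar."""
    if l < 0:
        raise ValueError("l must be a non-negative integer")
    if l == 0:
        return [(0, None)]   # zero-length vector: the angle is undefined
    length = math.sqrt(l * (l + 1))
    return [(m, math.degrees(math.acos(m / length))) for m in range(-l, l + 1)]

for m, angle in vector_model(2):
    print(f"m = {m:+d}: theta = {angle:.1f} deg")
```

Since m never exceeds l while the length is √(l(l+1)), the vector can never lie exactly along the z axis. This is consistent with the statement that the x and y components are not known.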
In classical mechanics, the orbital angular momentum is conserved when the force $\mathbf{F}$ is radial: i.e. $\mathbf{F} \equiv F\, \hat{\mathbf{r}}$, where $\hat{\mathbf{r}}$ is a unit vector parallel to $\mathbf{r}$. This follows from the equation of motion:
$$ \frac{d\mathbf{l}}{dt} = \boldsymbol{\tau} = \mathbf{r} \times \mathbf{F} = \mathbf{r} \times F\, \hat{\mathbf{r}} = 0 , \tag{4.13} $$
where $\boldsymbol{\tau}$ is the torque. In the hydrogen atom, the Coulomb force on the electron acts towards the nucleus, and hence $\mathbf{l}$ is conserved. This is why the angular momentum ends up being quantized with well-defined constant values when we consider the quantum mechanics of the hydrogen atom. It is also the case that the individual electrons of many-electron atoms have well-defined l states. This follows because the central field approximation gives a very good description of the behaviour of many electron atoms (see Section 3.1), and the dominant resultant force on the electron is radial (i.e. central) in this limit.⁴
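3
The commutator of $\hat{x}$ and $\hat{p}$ is given by:
$$ [\hat{x}, \hat{p}]\, \psi = (\hat{x}\hat{p} - \hat{p}\hat{x})\, \psi = -i\hbar\, x \frac{d\psi}{dx} + i\hbar \frac{d(x\psi)}{dx} = i\hbar\, \psi . $$
Hence $[\hat{x}, \hat{p}] = i\hbar$.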
4
The inclusion of non-central forces via the residual electrostatic interaction leads to some mixing of the orbital angular
momentum states. This can explain why transitions that are apparently forbidden by selection rules can sometimes be
observed, albeit with low transition probabilities.
[Figure: a beam of atoms with L = 0 passes through a non-uniform magnetic field and is split into two beams, labelled + and − according to $m_s$.]
Figure 4.1: The Stern-Gerlach experiment. A beam of monovalent atoms with L = 0 (i.e. zero orbital angular momentum and hence zero orbital magnetic dipole moment) is deflected in two discrete ways by a non-uniform magnetic field. The force on the atoms arises from the interaction between the field and the magnetic moment due to the electron spin.
4.2.2 Spin angular momentum
A wealth of data derived from the optical, magnetic and chemical properties of atoms points to the fact that electrons possess an additional type of angular momentum called spin. The electron behaves as if it spins around its own internal axis, but this analogy should not be taken literally: the electron is, as far as we know, a point particle, and so cannot be spinning in any classical way. In fact, spin is a purely quantum effect with no classical explanation. Paul Dirac at Cambridge successfully accounted for electron spin when he produced the relativistic wave equation that bears his name in 1928.
The discovery of spin goes back to the Stern-Gerlach experiment, in which a beam of atoms is deflected by a non-uniform magnetic field. (See Fig. 4.1.) The force on a magnetic dipole in a non-uniform magnetic field is given by:⁵
$$ F_z = \mu_z \frac{dB}{dz} , \tag{4.14} $$
where dB/dz is the field gradient, which is assumed to point along the z direction, and $\mu_z$ is the z-component of the magnetic dipole of the atom. In Chapter 6 we shall explore the origin of magnetic dipoles in detail. At this stage, all we need to know is that the magnetic dipole is directly proportional to the angular momentum of the atom. (See Section 6.1.)
The original Stern-Gerlach experiment was performed on silver atoms, which have a ground-state electronic configuration of [Kr] $4d^{10}\,5s^1$. Filled shells have no net orbital angular momentum, because there are as many positive $m_l$ states occupied as negative ones. Furthermore, electrons in s-shells have l = 0 and therefore the orbital angular momentum of the atom is zero. This implies that the orbital magnetic dipole of the atom is also zero, and hence we expect no deflection. However, the experiment showed that the atoms were deflected either up or down, as indicated in Fig. 4.1.
In order to explain the up/down deflection of the atoms with no orbital angular momentum, we have to assume that each electron possesses an additional type of magnetic dipole moment. This magnetic dipole is attributed to the spin angular momentum. In analogy with orbital angular momentum, spin angular momentum is described by two quantum numbers s and $m_s$, where $m_s$ can take the (2s + 1) values in integer steps from −s to +s. The magnitude of the spin angular momentum is given by
$$ |\mathbf{s}| = \sqrt{s(s+1)}\, \hbar , \tag{4.15} $$
and the component along the z axis is given by
$$ s_z = m_s \hbar . \tag{4.16} $$
The fact that atoms with a single s-shell valence electron (e.g. silver) are only deflected in two directions (i.e. up or down) implies that (2s + 1) = 2 and hence that s = 1/2. Hence the spin quantum numbers of
5
Note that we need a non-uniform magnetic field to deflect a magnetic dipole. A uniform magnetic field merely exerts a torque, not a force. We can understand this by analogy with electrostatics. Electric monopoles (i.e. free charges) can be moved by applying electric fields, but an electric dipole experiences no net force in a uniform electric field because the forces on the positive and negative charges cancel. If we wish to apply a force to an electric dipole, we therefore need to apply a non-uniform electric field, so that the forces on the two charges are different. Magnetic monopoles do not exist (as far as we know), and so all atomic magnets are dipoles. Hence we must apply a non-uniform magnetic field to exert a magnetic force on an atom. The magnitude of the force in the non-uniform field can be worked out from the energy: $U = -\boldsymbol{\mu} \cdot \mathbf{B} = -(\mu_x B_x + \mu_y B_y + \mu_z B_z)$. With $B_x = B_y = 0$ and $F_z = -\partial U / \partial z$, eqn 4.14 follows directly.
Figure 4.2: (a) Vector addition of two angular momentum vectors A and B, separated by the angle $\theta$, to form the resultant C. (b) Vector model of the atom. The spin-orbit interaction couples l and s together to form the resultant j. The magnitudes of the vectors are given by: $|\mathbf{j}| = \sqrt{j(j+1)}\,\hbar$, $|\mathbf{l}| = \sqrt{l(l+1)}\,\hbar$, and $|\mathbf{s}| = \sqrt{s(s+1)}\,\hbar$.
the electron can have the following values:
$$ s = 1/2 , \qquad m_s = \pm 1/2 . $$
The Stern-Gerlach experiment is just one of many pieces of evidence that support the hypothesis of electron spin. Here is an incomplete list of other evidence for spin based on atomic physics:
• The periodic table of elements, which is the foundation of the whole subject of chemistry, cannot be explained unless we assume that the electrons possess spin.
• High resolution spectroscopy of atomic spectral lines shows that they frequently consist of closely-spaced multiplets. This fine structure is caused by spin-orbit coupling, which can only be explained by postulating that electrons possess spin. See Chapter 6.
• If we ignore spin, we expect to observe the normal Zeeman effect when an atom is placed in an external magnetic field. However, most atoms display the anomalous Zeeman effect, which is a consequence of spin. See Chapter 7.
• The ratio of the magnetic dipole moment to the angular momentum is called the gyromagnetic ratio. (See Section 6.1.) The gyromagnetic ratio can be measured directly by a number of methods. In 1915, Einstein and de Haas measured the gyromagnetic ratio of iron and came up with a value twice as large as expected. They rejected this result, assigning it to experimental errors. However, we now know that the magnetism in iron is caused by the spin rather than the orbital angular momentum, and so the experimental value was correct. (The electron spin g-factor is 2: see Section 6.2.) This is a salutary lesson from history that even great physicists like Einstein and de Haas can get their error analysis wrong!
4.3 Addition of angular momentum
Having discovered that electrons have different types of angular momentum, the question now arises as to how we add them together. Let us suppose that C is the resultant of two angular momentum vectors A and B as shown in Fig. 4.2(a), so that:
$$ \mathbf{C} = \mathbf{A} + \mathbf{B} . \tag{4.17} $$
We assume for the sake of simplicity that $|\mathbf{A}| > |\mathbf{B}|$. (The argument is unaffected if $|\mathbf{A}| < |\mathbf{B}|$.) We define $\theta$ as the angle between the two vectors, as shown in Fig. 4.2(a).
In classical mechanics the angle $\theta$ can take any value from 0° to 180°.
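$$ \hat{H} = -\boldsymbol{\mu}_{\rm spin} \cdot \mathbf{B}_{\rm orbital} \propto \mathbf{l} \cdot \mathbf{s} , \tag{4.20} $$
since $\boldsymbol{\mu}_{\rm spin} \propto \mathbf{s}$ and $\mathbf{B}_{\rm orbital} \propto \mathbf{l}$.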
2. The spin-orbit interaction scales roughly as $Z^2$. (See eqn 6.42.) It is therefore weak in light atoms, and stronger in heavy atoms.
We introduce the spin-orbit interaction here because it is one of the mechanisms that is important in determining the angular momentum coupling schemes that apply in different atoms.
4.5 Angular momentum coupling in single electron atoms
If an atom has just a single electron, the addition of the orbital and spin angular momenta is relatively straightforward. The physical mechanism that couples the orbital and spin angular momenta together is the spin-orbit interaction, and the resultant total angular momentum vector j is defined by:
$$ \mathbf{j} = \mathbf{l} + \mathbf{s} . \tag{4.21} $$
j is described by the quantum numbers $j$ and $m_j$ according to the usual rules for quantum mechanical angular momenta, namely:
$$ |\mathbf{j}| = \sqrt{j(j+1)}\,\hbar , \tag{4.22} $$
and
$$ j_z = m_j \hbar , \tag{4.23} $$
where $m_j$ takes values of $j, (j-1), \ldots, -j$. The addition of l and s to form the resultant j is illustrated by Fig. 4.2(b).
The allowed values of $j$ are worked out by applying eqn 4.19, with the knowledge that the spin quantum number $s$ is always equal to 1/2. If the electron is in a state with orbital quantum number $l$, we then find $j = l \pm s = (l \pm 1/2)$, except when $l = 0$, in which case we just have $j = 1/2$. In the second case, the angular momentum of the atom arises purely from the electron spin.
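The counting of the allowed $j$ and $m_j$ values can be checked with a few lines of Python. This is only an illustrative sketch: the function names `allowed_j` and `m_values` are made up for this example, and the code assumes the standard addition rule that the resultant quantum number runs from $|l - s|$ to $l + s$ in integer steps.

```python
from fractions import Fraction


def allowed_j(l, s=Fraction(1, 2)):
    """Allowed j for one electron: |l - s|, ..., l + s in integer steps (assumed rule)."""
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values


def m_values(j):
    """The (2j + 1) projections m_j = -j, -j + 1, ..., +j."""
    return [-j + k for k in range(int(2 * j) + 1)]


# List the j states and their m_j sub-levels for s, p and d electrons.
for l in (0, 1, 2):
    for j in allowed_j(l):
        projections = ", ".join(str(m) for m in m_values(j))
        print(f"l = {l}: j = {j}, m_j = {projections}")
```

For $l = 0$ the loop yields the single value $j = 1/2$, while for $l \geq 1$ it yields the pair $l \pm 1/2$, in line with the statement above.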
4.6 Angular momentum coupling in multi-electron atoms
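The Hamiltonian for an N-electron atom can be written in the form:
$$ \hat{H} = \hat{H}_0 + \hat{H}_1 + \hat{H}_2 , \tag{4.24} $$
where:
$$ \hat{H}_0 = \sum_{i=1}^{N} \left[ -\frac{\hbar^2}{2m} \nabla_i^2 + V_{\rm central}(r_i) \right] , \tag{4.25} $$
$$ \hat{H}_1 = -\sum_{i=1}^{N} \frac{Ze^2}{4\pi\epsilon_0 r_i} + \sum_{i>j}^{N} \frac{e^2}{4\pi\epsilon_0 |\mathbf{r}_i - \mathbf{r}_j|} - \sum_{i=1}^{N} V_{\rm central}(r_i) , \tag{4.26} $$
$$ \hat{H}_2 = \sum_{i=1}^{N} \xi(r_i)\, \mathbf{l}_i \cdot \mathbf{s}_i . \tag{4.27} $$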
As discussed in Section 3.1, $\hat{H}_0$ is the central-field Hamiltonian and $\hat{H}_1$ is the residual electrostatic potential. $\hat{H}_2$ is the spin-orbit interaction summed over the electrons of the atom.
In Chapter 3 we neglected both $\hat{H}_1$ and $\hat{H}_2$, and just concentrated on $\hat{H}_0$. This led to the conclusion that each electron occupies a state in a shell defined by the quantum numbers $n$ and $l$. The reason why we neglected $\hat{H}_1$ is that the off-radial forces due to the electron-electron repulsion are smaller than the radial ones, while $\hat{H}_2$ was neglected because the spin-orbit effects are much smaller than the main terms in the Hamiltonian. It is now time to study what happens when these two terms are included. In doing so, there are two obvious limits to consider:⁶
• LS coupling: $\hat{H}_1 \gg \hat{H}_2$.
• jj coupling: $\hat{H}_2 \gg \hat{H}_1$.
Since the spin-orbit interaction scales approximately as $Z^2$, LS-coupling mainly occurs in atoms with small to medium Z, while jj-coupling occurs in some atoms with large Z. In the sections below, we focus on the LS-coupling limit. The less common case of jj-coupling is considered briefly in Section 4.10.
4.7 LS coupling
In the LS-coupling limit (alternatively called Russell-Saunders coupling), the residual electrostatic interaction is much stronger than the spin-orbit interaction. We therefore deal with the residual electrostatic interaction first and then apply the spin-orbit interaction as a perturbation. The LS coupling regime applies to most atoms of small and medium atomic number.
Let us first discuss some issues of notation. We shall need to distinguish between the quantum numbers that refer to the individual electrons within an atom and the state of the atom as a whole. The convention is:
• Lower case quantum numbers (j, l, s) refer to individual electrons within atoms.
• Upper case quantum numbers (J, L and S) refer to the angular momentum states of the whole atom.
For single electron atoms like hydrogen, there is no difference. However, in multi-electron atoms there is a real difference because we must distinguish between the angular momentum states of the individual electrons and the resultants which give the angular momentum states of the whole atom.
We can use this notation to determine the angular momentum states that the LS-coupling scheme produces. The residual electrostatic interaction has the effect of coupling the orbital and spin angular momenta of the individual electrons together, so that we find their resultants according to:
$$ \mathbf{L} = \sum_i \mathbf{l}_i , \tag{4.28} $$
$$ \mathbf{S} = \sum_i \mathbf{s}_i . \tag{4.29} $$
⁶ In some atoms with medium-large Z (e.g. germanium, Z = 32) we are in the awkward situation where neither limit applies. We then have intermediate coupling, and the behaviour is quite complicated to describe.
Filled shells of electrons have no net angular momentum, and so the summation only needs to be carried out over the valence electrons. In a many-electron atom, the rule given in eqn 4.19 usually allows several possible values of the quantum numbers L and S for a particular electronic configuration. Their energies will differ due to the residual electrostatic interaction. The atomic states defined by the values of L and S are called terms.
For each atomic term, we can find the total angular momentum of the whole atom from:
$$ \mathbf{J} = \mathbf{L} + \mathbf{S} . \tag{4.30} $$
The values of J, the quantum number corresponding to $\mathbf{J}$, are found from L and S according to eqn 4.19. The states of different J for each LS-term have different energies due to the spin-orbit interaction. In analogy with eqn 4.20, the spin-orbit interaction of the whole atom is written:
$$ \Delta E_{\rm so} \propto -\boldsymbol{\mu}^{\rm atom}_{\rm spin} \cdot \mathbf{B}^{\rm atom}_{\rm orbital} \propto \mathbf{L} \cdot \mathbf{S} , \tag{4.31} $$
where the "atom" superscript indicates that we take the resultant values for the whole atom. The details of the spin-orbit interaction in the LS coupling limit are considered in Section 6.6. At this stage, all we need to know is that the spin-orbit interaction splits the LS terms into levels labelled by J.
It is convenient to introduce a shorthand notation to label the energy levels that occur in the LS coupling regime. Each level is labelled by the quantum numbers J, L and S and is represented in the form:
$$ ^{2S+1}L_J . $$
The factors (2S + 1) and J appear as numbers, whereas L is a letter that follows the rule:
• S implies L = 0,
• P implies L = 1,
• D implies L = 2,
• F implies L = 3, etc.
Thus, for example, the label $^2P_{1/2}$ denotes the energy level with quantum numbers S = 1/2, L = 1, and J = 1/2, while $^3D_3$ has S = 1, L = 2 and J = 3. The factor of (2S + 1) in the top left is called the multiplicity. It indicates the degeneracy of the level due to the spin: i.e. the number of $M_S$ states available. If S = 0, the multiplicity is 1, and the terms are called singlets. If S = 1/2, the multiplicity is 2 and we have doublet terms. If S = 1 we have triplet terms, etc.
As an example, consider the (3s,3p) electronic configuration of magnesium, where one of the valence electrons is in an s-shell with l = 0 and the other is in a p-shell with l = 1. We first work out the LS terms, writing $\oplus$ for the addition of angular momentum quantum numbers according to eqn 4.19:
• $L = l_1 \oplus l_2 = 0 \oplus 1 = 1$.
• $S = s_1 \oplus s_2 = 1/2 \oplus 1/2 = 1$ or $0$.
We thus have two terms: a $^3P$ triplet and a $^1P$ singlet. The allowed levels are then worked out as follows:
• For the $^3P$ triplet, we have $J = L \oplus S = 1 \oplus 1 = 2$, $1$, or $0$. We thus have three levels: $^3P_2$, $^3P_1$, and $^3P_0$.
• For the $^1P$ singlet, we have $J = L \oplus S = 1 \oplus 0 = 1$. We thus have a single $^1P_1$ level.
These levels are illustrated in Fig. 4.3. The ordering of the energy states should not concern us at this stage. The main point to realize is the general way the states split as the new interactions are turned on, and the terminology used to designate the states.
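The bookkeeping in the magnesium example can be automated. The sketch below enumerates the LS terms and levels for two non-equivalent valence electrons. It is an illustration only: the helper names `couple` and `ls_levels` are invented here, the addition rule is assumed to be the standard one (resultant from |a − b| to a + b in integer steps), and the Pauli restrictions that apply to equivalent electrons are deliberately not included.

```python
from fractions import Fraction

L_LETTERS = "SPDFGH"  # spectroscopic letters for L = 0, 1, 2, ...


def couple(a, b):
    """Resultant quantum numbers |a - b|, ..., a + b in integer steps (assumed rule)."""
    out = []
    q = abs(a - b)
    while q <= a + b:
        out.append(q)
        q += 1
    return out


def ls_levels(l1, l2):
    """Level labels (2S+1)(L letter)(J) for two NON-equivalent electrons."""
    half = Fraction(1, 2)
    labels = []
    for S in couple(half, half):       # S = 0 or 1
        for L in couple(l1, l2):       # integer values of L
            for J in couple(L, S):     # J for this LS term
                labels.append(f"{2 * S + 1}{L_LETTERS[L]}{J}")
    return labels


print(ls_levels(0, 1))  # (3s,3p) configuration: one singlet level, three triplet levels
print(ls_levels(1, 1))  # (np,n'p) configuration with the electrons in different shells
```

Calling `ls_levels(0, 1)` reproduces the four levels found above, written as `1P1`, `3P0`, `3P1` and `3P2`.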
4.8 Electric dipole selection rules in the LS coupling limit
In an electric-dipole transition between the states of a many-electron atom with LS-coupling, a single electron makes a jump from one atomic shell to a new one. The rules that apply to this electron are the same as the ones discussed in Section 2.4. However, we also have to think about the angular momentum state of the whole atom as specified by the quantum numbers (L, S, J). The rules that emerge are as follows:
Figure 4.3: Splitting of the energy levels for the (3s,3p) configuration of magnesium in the LS coupling regime. The residual electrostatic interaction splits the 3s3p configuration into the $^3P$ and $^1P$ terms. The spin-orbit coupling then splits the $^3P$ term into levels with J = 2, 1 and 0, while the $^1P$ term gives a single level with J = 1.
1. The parity of the wave function must change.
2. $\Delta l = \pm 1$ for the electron that jumps between shells.
3. $\Delta L = 0, \pm 1$, but $L = 0 \to 0$ is forbidden.⁷
4. $\Delta J = 0, \pm 1$, but $J = 0 \to 0$ is forbidden.
5. $\Delta S = 0$.
Rule 1 follows from the odd parity of the dipole operator. Rule 2 applies the $\Delta l = \pm 1$ single-electron rule to the individual electron that makes the jump in the transition, while Rule 3 applies Rule 2 to the resultant orbital angular momentum of the whole atom according to the rules for addition of angular momenta. Rule 4 follows from the fact that the total angular momentum must be conserved in the transition, allowing us to write:
$$ \mathbf{J}_{\rm initial} = \mathbf{J}_{\rm final} + \mathbf{J}_{\rm photon} . \tag{4.32} $$
The photon carries one unit of angular momentum, and so we conclude from eqn 4.19 that $\Delta J = -1$, $0$, or $+1$. However, the $\Delta J = 0$ rule cannot be applied to $J = 0 \to 0$ transitions because it is not possible to satisfy eqn 4.32 in these circumstances. Finally, rule 5 is a consequence of the fact that the photon does not interact with the spin.⁸
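As a rough illustration of how rules 3 to 5 are applied, the following sketch tests a transition between two LS-coupled levels specified by their (S, L, J) quantum numbers. The function name `e1_allowed` is hypothetical. Rules 1 and 2 depend on the electronic configurations rather than on the level labels, so they are not checked here and must be verified separately.

```python
def e1_allowed(S_i, L_i, J_i, S_f, L_f, J_f):
    """Apply rules 3-5 (Delta L, Delta J, Delta S) to a transition between LS levels.

    Rules 1 and 2 (parity change, Delta l = +/-1 for the jumping electron)
    are NOT checked, because they depend on the configurations.
    """
    if S_f != S_i:                                      # rule 5: Delta S = 0
        return False
    if abs(L_f - L_i) > 1 or (L_i == 0 and L_f == 0):   # rule 3
        return False
    if abs(J_f - J_i) > 1 or (J_i == 0 and J_f == 0):   # rule 4
        return False
    return True


# 3P1 -> 3P2: Delta L = 0 and Delta J = +1 with no change of spin.
print(e1_allowed(1, 1, 1, 1, 1, 2))
# 1S0 -> 3P1: the spin changes, so rule 5 is violated.
print(e1_allowed(0, 0, 0, 1, 1, 1))
# 1S0 -> 1S0: both L = 0 -> 0 and J = 0 -> 0 are forbidden.
print(e1_allowed(0, 0, 0, 0, 0, 0))
```

A transition that passes this check is allowed only if the parity and $\Delta l$ rules are satisfied as well.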
4.9 Hund's rules
We have seen above that there are many terms in the energy spectrum of a multi-electron atom. Of these, one will have the lowest energy, and will form the ground state. All the others are excited states. Each atom has a unique ground state, which is determined by minimizing the energy of its valence electrons with the residual electrostatic and spin-orbit interactions included. In principle, this is a very complicated calculation. Fortunately, however, Hund's rules allow us to determine which level is the ground state for atoms that have LS-coupling without lengthy calculation. The rules are:
1. The term with the largest multiplicity (i.e. largest S) has the lowest energy.
2. For a given multiplicity, the term with the largest L has the lowest energy.
3. The level with $J = |L - S|$ has the lowest energy if the shell is less than half full. If the shell is more than half full, the level with $J = L + S$ has the lowest energy.
⁷ $\Delta L = 0$ transitions are obviously forbidden in one-electron atoms, because L = l and l must change. However, in atoms with more than one valence electron, it is possible to get transitions between different configurations that satisfy rule 2, but have the same value of L. An example is the allowed 3p3p $^3P_1 \to$ 3p4s $^3P_2$ transition in silicon at 250.6 nm.
⁸ $\Delta S \neq 0$ transitions can be weakly allowed when the spin-orbit coupling is strong, because the spin is then mixed with the orbital motion.
The first of these rules basically tells us that the electrons try to align themselves with their spins parallel in order to minimize the exchange interaction. (See Chapter 5.) The other two follow from minimizing the spin-orbit interaction.
Let us have a look at carbon as an example. Carbon has an atomic number Z = 6 with two valence electrons in the outermost 2p shell. Each valence electron therefore has l = 1 and s = 1/2. Consider first the (2p,np) excited state configuration with one electron in the 2p shell and the other in the np shell, where $n \geq 3$. We have from eqn 4.19 that $L = 1 \oplus 1 = 0$, $1$ or $2$, and $S = 1/2 \oplus 1/2 = 0$ or $1$. We thus have three singlet terms ($^1S$, $^1P$, $^1D$), and three triplet terms ($^3S$, $^3P$, $^3D$). This gives rise to three singlet levels:
$$ ^1S_0 ,\; ^1P_1 ,\; ^1D_2 , $$
and seven triplet levels:
$$ ^3S_1 ,\; ^3P_0 ,\; ^3P_1 ,\; ^3P_2 ,\; ^3D_1 ,\; ^3D_2 ,\; ^3D_3 . $$
We thus have a confusing array of ten levels in the energy spectrum for the (2p,np) configuration.
The situation in the ground state configuration (2p,2p) is simplified by the fact that the electrons are equivalent, i.e. in the same shell. The Pauli exclusion principle forbids the possibility that two or more electrons should have the same set of quantum numbers, and in the case of an atom with two valence electrons, it can be shown that this implies that L + S must be equal to an even number. There is no easy explanation for this rule, but the simplest example of its application, namely to two electrons in the same s-shell, is considered in Section 5.3. For these two s-electrons, we have $L = 0 \oplus 0 = 0$ and $S = 1/2 \oplus 1/2 = 0$ or $1$, giving rise to two terms: $^1S$ and $^3S$. Both terms are allowed when the electrons are in different s-shells, but the L + S = even rule tells us that only the singlet $^1S$ term is allowed if the electrons are in the same s-shell. The proof that the triplet term does not exist for the (1s,1s) ground-state configuration of helium is given in Section 5.3.
On applying the rule that L + S must be even to the equivalent 2p electrons in the carbon ground state, we find that only the $^1S$, $^1D$, and $^3P$ terms are allowed, which means that only five of the ten levels listed above are possible:⁹
$$ ^1S_0 ,\; ^1D_2 ,\; ^3P_0 ,\; ^3P_1 ,\; ^3P_2 . $$
We can now apply Hund's rules to find out which of these is the ground state. The first rule states that the triplet levels have the lower energy. Since these all have L = 1 we do not need to consider the second rule. The shell is less than half full, and so we have $J = |L - S| = 0$. The ground state is thus the $^3P_0$ level. All the other levels are excited states.
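The argument for two equivalent electrons can be traced step by step in code. The sketch below simply applies the "L + S even" rule quoted above and then the first version of Hund's rules. The function names are invented for illustration, and the last step assumes a shell that is less than half full, which holds for two electrons in any shell with $l \geq 1$.

```python
L_LETTERS = "SPDFGH"  # spectroscopic letters for L = 0, 1, 2, ...


def equivalent_pair_terms(l):
    """(S, L) terms allowed for two equivalent electrons by the L + S = even rule."""
    return [(S, L) for S in (0, 1) for L in range(2 * l + 1) if (L + S) % 2 == 0]


def levels_of_term(S, L):
    """Level labels for one term, with J = |L - S|, ..., L + S."""
    return [f"{2 * S + 1}{L_LETTERS[L]}{J}" for J in range(abs(L - S), L + S + 1)]


terms = equivalent_pair_terms(1)  # two equivalent p electrons
allowed = [label for S, L in terms for label in levels_of_term(S, L)]

# Hund's rules 1 and 2: largest S first, then largest L. Tuples compare in
# exactly that order, so max() picks the ground term.
S0, L0 = max(terms)
# Hund's rule 3, assuming the shell is less than half full: J = |L - S|.
ground = f"{2 * S0 + 1}{L_LETTERS[L0]}{abs(L0 - S0)}"

print(allowed)
print(ground)
```

For two p electrons this leaves the five levels listed above and selects `3P0` as the ground state.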
It is important to notice that, if we had forgotten the rule that L + S must be even, we would have incorrectly concluded from Hund's rules that the ground state is a $^3D_1$ level, which does not exist for the (2p,2p) configuration. It is therefore safer to use a different version of Hund's rules, based on the allowed combinations of ($m_s$, $m_l$) sub-levels:
1. Maximize the spin and set $S = \sum m_s$.
2. Maximize the orbital angular momentum, subject to rule 1, and set $L = \sum m_l$.
3. $J = |L - S|$ if the shell is less than half full, otherwise $J = |L + S|$.
These rules should work in all cases, since they incorporate the Pauli exclusion principle properly.
As an example of how to use the second version of Hund's rules, we apply them again to the two 2p electrons of carbon. The two electrons can go into the six possible ($m_s$, $m_l$) sub-levels of the 2p shell.
1. To get the largest value of the spin, we must have both electron spins aligned with $m_s = +1/2$. This gives S = 1/2 + 1/2 = 1.
2. Having put both electrons into spin-up states, we cannot now put both electrons into $m_l = +1$ states because of Pauli's exclusion principle. The best we can do is to put one into an $m_l = +1$ state and the other into an $m_l = 0$ state, as illustrated in Table 4.1. This gives L = 1 + 0 = 1.
3. The shell is less than half full, and so we have $J = |L - S| = 0$.
We thus deduce that the ground state is the $^3P_0$ level, as before.
The ground state levels for the first 11 elements, as worked out from Hund's rules, are listed in Table 4.2. Experimental results confirm these predictions. Note that full shells always give a $^1S_0$ level with no net angular momentum: S = L = J = 0.
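The second version of Hund's rules is essentially a filling algorithm, so it can be sketched in a few lines of Python. The function name `hund_ground_level` and the `(l, n)` description of the outermost shell are assumptions made for this illustration. The code considers a single partly filled shell and fills the ($m_s$, $m_l$) sub-levels spin-up first, starting from the largest $m_l$.

```python
from fractions import Fraction

L_LETTERS = "SPDFGHI"  # spectroscopic letters for L = 0, 1, 2, ...


def hund_ground_level(l, n):
    """Ground level for n equivalent electrons in a shell with orbital quantum number l."""
    capacity = 2 * (2 * l + 1)
    if not 0 <= n <= capacity:
        raise ValueError("the shell cannot hold that many electrons")
    # Rules 1 and 2: fill spin-up sub-levels from m_l = +l downwards,
    # then the spin-down sub-levels in the same order.
    m_l_order = list(range(l, -l - 1, -1))
    slots = [(Fraction(1, 2), m) for m in m_l_order]
    slots += [(Fraction(-1, 2), m) for m in m_l_order]
    occupied = slots[:n]
    S = abs(sum(m_s for m_s, _ in occupied))
    L = abs(sum(m_l for _, m_l in occupied))
    # Rule 3: J = |L - S| below half filling, otherwise J = L + S.
    J = abs(L - S) if 2 * n < capacity else L + S
    return f"{2 * S + 1}{L_LETTERS[L]}{J}"


# Outermost shell (l, n) for the first 11 elements, following Table 4.2.
outer_shell = {"H": (0, 1), "He": (0, 2), "Li": (0, 1), "Be": (0, 2),
               "B": (1, 1), "C": (1, 2), "N": (1, 3), "O": (1, 4),
               "F": (1, 5), "Ne": (1, 6), "Na": (0, 1)}
for element, (l, n) in outer_shell.items():
    print(element, hund_ground_level(l, n))
```

For a half-full shell the filling gives L = 0, so both branches of rule 3 give J = S and the choice of branch does not matter there.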
⁹ The full derivation of the allowed states for the (np,np) configuration of a group IV atom is considered, for example, in Woodgate, Elementary Atomic Structure, 2nd Edition, Oxford University Press, 1980, Section 7.2.
            $m_l = -1$    $m_l = 0$    $m_l = +1$
$m_s = +1/2$                  ↑            ↑
$m_s = -1/2$
Table 4.1: Distribution of the two valence electrons of the carbon ground state within the $m_s$ and $m_l$ states of the 2p shell.
Z     Element    Configuration                  Ground state
1     H          $1s^1$                         $^2S_{1/2}$
2     He         $1s^2$                         $^1S_0$
3     Li         $1s^2\,2s^1$                   $^2S_{1/2}$
4     Be         $1s^2\,2s^2$                   $^1S_0$
5     B          $1s^2\,2s^2\,2p^1$             $^2P_{1/2}$
6     C          $1s^2\,2s^2\,2p^2$             $^3P_0$
7     N          $1s^2\,2s^2\,2p^3$             $^4S_{3/2}$
8     O          $1s^2\,2s^2\,2p^4$             $^3P_2$
9     F          $1s^2\,2s^2\,2p^5$             $^2P_{3/2}$
10    Ne         $1s^2\,2s^2\,2p^6$             $^1S_0$
11    Na         $1s^2\,2s^2\,2p^6\,3s^1$       $^2S_{1/2}$
Table 4.2: Electronic configurations and ground-state levels of the first 11 elements in the periodic table.
It is important to be aware that Hund's rules cannot be used to find the energy ordering of excited states with reliability. For example, consider the (2p,3p) excited state configuration of carbon, which has the ten possible levels listed previously. Hund's rules predict that the $^3D_1$ level has the lowest energy, but the lowest state is actually the $^1P_1$ level.
4.10 jj coupling
The spin-orbit interaction gets larger as Z increases. (See, for example, eqn 6.42.) This means that in some atoms with large Z (e.g. tin with Z = 50) we can have a situation in which the spin-orbit interaction is much stronger than the residual electrostatic interaction. In this regime, jj coupling occurs. The spin-orbit interaction couples the orbital and spin angular momenta of the individual electrons together first, and we then find the resultant J for the whole atom by adding together the individual j's:
$$ \mathbf{j}_i = \mathbf{l}_i + \mathbf{s}_i , \qquad \mathbf{J} = \sum_{i=1}^{N} \mathbf{j}_i . \tag{4.33} $$
These J states are then split by the weaker residual electrostatic potential, which acts as a perturbation.
Reading
Bransden and Joachain, Atoms, Molecules and Photons, 1.8, 2.5, 8.5, 9.2
Demtröder, W., Atoms, Molecules and Photons, 5.56, 6.25.
Haken and Wolf, The physics of atoms and quanta, chapters 12, 17, 19.
Eisberg and Resnick, Quantum Physics, chapters 8, 10.
Foot, Atomic physics, 2.3.1, chapter 5.
Beiser, Concepts of Modern Physics, 7.7 8.
Chapter 5
Helium and exchange symmetry
In this chapter we will look at atoms with two valence electrons. This includes helium, and the group
II elements: beryllium, magnesium, calcium, etc. As we will see, this leads to the idea of the exchange
energy. We shall use helium as the main example, as it is a true two electron system and illustrates the
physical points most clearly.
5.1 Exchange symmetry
Consider a multi-electron atom with N electrons, as illustrated in figure 5.1(a). The wave function of the atom will be a function of the co-ordinates of the individual electrons:
$$ \Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_K, \mathbf{r}_L, \ldots, \mathbf{r}_N) . $$
However, the electrons are indistinguishable particles. It is not physically possible to stick labels on the individual electrons and then keep tabs on them as they move around their orbits. This means that the many-electron wave function must have exchange symmetry:
$$ |\Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_K, \mathbf{r}_L, \ldots, \mathbf{r}_N)|^2 = |\Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_L, \mathbf{r}_K, \ldots, \mathbf{r}_N)|^2 . \tag{5.1} $$
This says that nothing happens if we switch the labels of any pair of electrons. Equation 5.1 will be satisfied if
$$ \Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_K, \mathbf{r}_L, \ldots, \mathbf{r}_N) = \pm \Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_L, \mathbf{r}_K, \ldots, \mathbf{r}_N) . \tag{5.2} $$
The + sign in equation 5.2 applies if the particles are bosons. These are said to be symmetric with respect to particle exchange. The − sign applies to fermions, which are anti-symmetric with respect to particle exchange.
Electrons have spin 1/2 and are therefore fermions. Hence the wave function of a multi-electron atom must be anti-symmetric with respect to particle exchange. This is a very fundamental property, and is the physical basis of the Pauli exclusion principle, as we shall see below.
The discussion of exchange symmetry gets quite complicated when there are lots of electrons, and so
we shall just concentrate on helium here.
5.2 Helium wave functions
Figure 5.1(b) shows a schematic diagram of a helium atom. It consists of one nucleus with Z = 2 and two electrons. The position co-ordinates of the electrons are written $\mathbf{r}_1$ and $\mathbf{r}_2$ respectively.
Figure 5.1: (a) A multi-electron atom with N electrons at positions $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_K, \mathbf{r}_L, \ldots, \mathbf{r}_N$. (b) The helium atom, with a nucleus of charge Z = 2 and two electrons at positions $\mathbf{r}_1$ and $\mathbf{r}_2$.
spatial wave function    spin wave function
symmetric                anti-symmetric (S = 0)
anti-symmetric           symmetric (S = 1)
Table 5.1: Allowed combinations of the exchange symmetries of the spatial and spin wave functions of fermionic particles.
The quantum state in the helium atom will be specified both by the spatial co-ordinates and by the spin of the two electrons. The two-electron wave function is therefore written as a product of a spatial wave function and a spin wave function:
$$ \Psi = \psi_{\rm spatial}(\mathbf{r}_1, \mathbf{r}_2)\, \psi_{\rm spin} . \tag{5.3} $$
As we have seen above, the fact that electrons are indistinguishable fermions requires that the two-electron wave function must be anti-symmetric with respect to exchange of electrons 1 and 2. Table 5.1 lists the two possible combinations of wave function symmetries that can produce an antisymmetric total wave function.
Let us first consider the spatial wave function. The state of the atom will be specified by the configuration of the two electrons. In the ground state both electrons are in the 1s shell, and so we have a configuration of $1s^2$. In the excited states, one or both of the electrons will be in a higher shell. The configuration is thus given by the n, l values of the two electrons, and we write the configuration as ($n_1 l_1$, $n_2 l_2$). This means that the spatial part of the helium wave function must contain terms of the type $u_A(\mathbf{r}_1)\, u_B(\mathbf{r}_2)$, where $u_{nl}(\mathbf{r})$ is the wave function for an electron with quantum numbers n and l, and the subscripts A and B stand for the quantum numbers n, l of the two electrons.
The discussion above does not take account of the fact that the electrons are indistinguishable: we cannot distinguish between the state with electron 1 in state A and electron 2 in state B, and vice versa. $u_B(\mathbf{r}_1)\, u_A(\mathbf{r}_2)$ is therefore an equally valid wave function for the particular electronic configuration. The wave function for the configuration A, B must therefore take the form:
$$ \psi_{AB}(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2}} \Big( u_A(\mathbf{r}_1)\, u_B(\mathbf{r}_2) \pm u_B(\mathbf{r}_1)\, u_A(\mathbf{r}_2) \Big) . \tag{5.4} $$
The 1/
Spin wave function                                                    symmetry    $M_S$
$\uparrow_1 \uparrow_2$                                                  +         +1
$\frac{1}{\sqrt{2}}(\uparrow_1 \downarrow_2 + \downarrow_1 \uparrow_2)$    +          0
$\frac{1}{\sqrt{2}}(\uparrow_1 \downarrow_2 - \downarrow_1 \uparrow_2)$    −          0
$\downarrow_1 \downarrow_2$                                              +         −1
Table 5.2: Spin wave functions for a two-electron system. The arrows indicate whether the spin of the individual electrons is up or down (i.e. $+\frac{1}{2}$ or $-\frac{1}{2}$). The + sign in the symmetry column applies if the wave function is symmetric with respect to particle exchange, while the − sign indicates that the wave function is anti-symmetric. The $S_z$ value is indicated by the quantum number $M_S$, which is obtained by adding the $m_s$ values of the two electrons together.
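S    $M_S$    $\psi_{\rm spin}$                                                        $\psi_{\rm spatial}$
0     0      $\frac{1}{\sqrt{2}}(\uparrow_1 \downarrow_2 - \downarrow_1 \uparrow_2)$    $\frac{1}{\sqrt{2}}\big( u_A(\mathbf{r}_1) u_B(\mathbf{r}_2) + u_B(\mathbf{r}_1) u_A(\mathbf{r}_2) \big)$
1    +1      $\uparrow_1 \uparrow_2$                                                  $\frac{1}{\sqrt{2}}\big( u_A(\mathbf{r}_1) u_B(\mathbf{r}_2) - u_B(\mathbf{r}_1) u_A(\mathbf{r}_2) \big)$
1     0      $\frac{1}{\sqrt{2}}(\uparrow_1 \downarrow_2 + \downarrow_1 \uparrow_2)$    $\frac{1}{\sqrt{2}}\big( u_A(\mathbf{r}_1) u_B(\mathbf{r}_2) - u_B(\mathbf{r}_1) u_A(\mathbf{r}_2) \big)$
1    −1      $\downarrow_1 \downarrow_2$                                              $\frac{1}{\sqrt{2}}\big( u_A(\mathbf{r}_1) u_B(\mathbf{r}_2) - u_B(\mathbf{r}_1) u_A(\mathbf{r}_2) \big)$
Table 5.3: Spin and spatial wave functions for a two-electron atom with electronic configuration designated by the labels A and B.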
of the atom as we shall see below. This is a surprising result when you consider that the spin and spatial
co-ordinates are basically independent of each other.
5.3 The Pauli exclusion principle
Let us suppose that we try to put the two electrons in the same atomic shell. The ground state of helium is an example of such a configuration, with both electrons in the 1s shell. The spatial wave functions will be given by eqn 5.4 with A = B. The antisymmetric combination with the − sign in the middle is zero in this case. From Table 5.3 we see that this implies that there are no triplet S = 1 states if both electrons are in the same shell.
The fact that the triplet state does not exist for the helium ground state is a demonstration of the rule that L + S must be even for a two-electron atom with both electrons in the same shell. In the case of the $1s^2$ configuration, we have L = 0, and therefore S = 1 is not allowed. This rule was introduced without any justification in Section 4.9. The general justification of the rule is beyond the scope of this course, but the example of the helium ground state at least demonstrates that the rule is true for the simplest case.
The absence of the triplet state for the $1s^2$ configuration is equivalent to the Pauli exclusion principle. We are trying to put two electrons in the same state as defined by the n, l, $m_l$ quantum numbers. This is only possible if the two electrons have different $m_s$ values. In other words, their spins must be aligned anti-parallel. The S = 1 state contains terms with both spins pointing in the same direction, and is therefore not allowed. The analysis of the symmetry of the wave function discussed here thus shows us that the Pauli exclusion principle is a consequence of the fact that electrons are indistinguishable fermions.
5.3.1 Slater determinants
We note in passing that the anti-symmetric wave function given in eqn. 5.4 can be written as a determinant:
$$ \psi_{\rm spatial} = \frac{1}{\sqrt{2}} \begin{vmatrix} u_A(\mathbf{r}_1) & u_A(\mathbf{r}_2) \\ u_B(\mathbf{r}_1) & u_B(\mathbf{r}_2) \end{vmatrix} . \tag{5.5} $$
This can be generalized to give the correct anti-symmetric wave function when we have more than two electrons:
$$ \Psi = \frac{1}{\sqrt{N!}} \begin{vmatrix} u_\alpha(1) & u_\alpha(2) & \cdots & u_\alpha(N) \\ u_\beta(1) & u_\beta(2) & \cdots & u_\beta(N) \\ \vdots & \vdots & \ddots & \vdots \\ u_\nu(1) & u_\nu(2) & \cdots & u_\nu(N) \end{vmatrix} , \tag{5.6} $$
where $\alpha, \beta, \ldots, \nu$ each represent a set of quantum numbers {n, l, $m_l$, $m_s$} for the individual electrons, and 1, 2, ..., N are the electron labels. Determinants of this type are called Slater determinants. Note that the determinant is zero if any two rows are equal, which tells us that each electron in the atom must have a unique set of quantum numbers, as required by the Pauli exclusion principle.
We shall not make further use of Slater determinants in this course. They are mentioned here for completeness.
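The two key properties of a Slater determinant (a sign change when two electrons are exchanged, and a vanishing result when two rows coincide) can be verified numerically. The sketch below evaluates the determinant from its definition as a signed sum over permutations. The three one-dimensional functions `u_a`, `u_b` and `u_c` are hypothetical stand-ins chosen only for illustration. They are not atomic orbitals, and they carry no spin label.

```python
from itertools import permutations
from math import exp, factorial, sqrt


def permutation_sign(perm):
    """Sign of a permutation: +1 for an even number of inversions, -1 for odd."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


def slater(orbitals, coords):
    """Normalized determinant with rows = orbitals and columns = electron coordinates."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= orbitals[row](coords[col])
        total += term
    return total / sqrt(factorial(n))


# Hypothetical one-dimensional "orbitals" (illustration only).
def u_a(x):
    return exp(-x * x)


def u_b(x):
    return x * exp(-x * x)


def u_c(x):
    return (2 * x * x - 1) * exp(-x * x)


psi = slater([u_a, u_b, u_c], [0.1, 0.5, -0.3])
psi_swapped = slater([u_a, u_b, u_c], [0.5, 0.1, -0.3])     # electrons 1 and 2 exchanged
psi_same_state = slater([u_a, u_a, u_c], [0.1, 0.5, -0.3])  # two identical rows

print(psi, psi_swapped)  # equal magnitude, opposite sign
print(psi_same_state)    # zero to within rounding error
```

The permutation sum is used here instead of a library determinant routine so that the link to the anti-symmetrized product of single-electron wave functions stays visible.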
5.4 The exchange energy
The Hamiltonian for the helium atom before we consider fine-structure effects is given by:
$$ \hat{H} = \left( -\frac{\hbar^2}{2m} \nabla_1^2 - \frac{2e^2}{4\pi\epsilon_0 r_1} \right) + \left( -\frac{\hbar^2}{2m} \nabla_2^2 - \frac{2e^2}{4\pi\epsilon_0 r_2} \right) + \frac{e^2}{4\pi\epsilon_0 r_{12}} , \tag{5.7} $$
where $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$. The first two terms enclosed in brackets account for the kinetic energy of the two electrons and their attraction towards the nucleus, which has a charge of +2e. The final term is the Coulomb repulsion between the two electrons. It is this Coulomb repulsion which makes the equations difficult to deal with.
In §3.1 and following we described how to deal with a many-electron Hamiltonian by splitting it into a central field and a residual electrostatic interaction. In the case of helium, we just have one Coulomb repulsion term and it is easier to go back to first principles. We can then use the correctly symmetrized wave functions to calculate the energies for specific electronic configurations of the helium atom.
The energy of the electronic configuration ($n_1 l_1$, $n_2 l_2$) is found by computing the expectation value of the Hamiltonian:
$$ E = \iint \psi^*_{\rm spatial}\, \hat{H}\, \psi_{\rm spatial}\, d^3\mathbf{r}_1\, d^3\mathbf{r}_2 . \tag{5.8} $$
The spin wave functions do not appear here because the Hamiltonian does not affect the spin directly, and so the spin wave functions just integrate out to unity.
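We start by re-writing the Hamiltonian given in eqn 5.7 in the following form:
$$ \hat{H} = \hat{H}_1 + \hat{H}_2 + \hat{H}_{12} , \tag{5.9} $$
where
$$ \hat{H}_i = -\frac{\hbar^2}{2m} \nabla_i^2 - \frac{2e^2}{4\pi\epsilon_0 r_i} , \tag{5.10} $$
$$ \hat{H}_{12} = \frac{e^2}{4\pi\epsilon_0 |\mathbf{r}_1 - \mathbf{r}_2|} . \tag{5.11} $$
The energy can be split into three parts:
$$ E = E_1 + E_2 + E_{12} , \tag{5.12} $$
where:
$$ E_i = \iint \psi^*_{\rm spatial}\, \hat{H}_i\, \psi_{\rm spatial}\, d^3\mathbf{r}_1\, d^3\mathbf{r}_2 , \tag{5.13} $$
and
$$ E_{12} = \iint \psi^*_{\rm spatial}\, \hat{H}_{12}\, \psi_{\rm spatial}\, d^3\mathbf{r}_1\, d^3\mathbf{r}_2 . \tag{5.14} $$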
The first two terms in eqn 5.12 represent the energies of the two electrons in the absence of the electron-electron repulsion. These are just equal to the hydrogenic energies of each electron:
$$ E_1 + E_2 = -\frac{4R_H}{n_1^2} - \frac{4R_H}{n_2^2} , \tag{5.15} $$
where the factor of $4 \equiv Z^2$ accounts for the nuclear charge. The third term is the electron-electron Coulomb repulsion energy:
$$ E_{12} = \iint \psi^*_{\rm spatial}\, \frac{e^2}{4\pi\epsilon_0 r_{12}}\, \psi_{\rm spatial}\, d^3\mathbf{r}_1\, d^3\mathbf{r}_2 . \tag{5.16} $$
The detailed evaluation of this integral for the correctly symmetrized wave functions given in eqn 5.4 is discussed in §5.7. The end result is:
$$ E_{12} = D_{AB} \pm J_{AB} , \tag{5.17} $$
where the + sign is for singlets and the − sign is for triplets. $D_{AB}$ is the direct Coulomb energy given by:
$$ D_{AB} = \frac{e^2}{4\pi\epsilon_0} \iint u^*_A(\mathbf{r}_1)\, u^*_B(\mathbf{r}_2)\, \frac{1}{r_{12}}\, u_A(\mathbf{r}_1)\, u_B(\mathbf{r}_2)\, d^3\mathbf{r}_1\, d^3\mathbf{r}_2 , \tag{5.18} $$
and $J_{AB}$ is the exchange Coulomb energy given by
$$ J_{AB} = \frac{e^2}{4\pi\epsilon_0} \iint u^*_A(\mathbf{r}_1)\, u^*_B(\mathbf{r}_2)\, \frac{1}{r_{12}}\, u_B(\mathbf{r}_1)\, u_A(\mathbf{r}_2)\, d^3\mathbf{r}_1\, d^3\mathbf{r}_2 . \tag{5.19} $$
Note that in the exchange integral, we are integrating the expectation value of $1/r_{12}$ with each electron in a different shell. This is why it is called the "exchange" energy. The total energy of the configuration ($n_1 l_1$, $n_2 l_2$) is thus given by:
$$ E(n_1 l_1, n_2 l_2) = -\frac{4R_H}{n_1^2} - \frac{4R_H}{n_2^2} + D_{AB} \pm J_{AB} , \tag{5.20} $$
where the + sign applies to singlet (S = 0) states and the − sign to triplets (S = 1). We thus see that the energies of the singlet and triplet states differ by $2J_{AB}$. This splitting of the spin states is a direct consequence of the exchange symmetry.
Note that:
• The exchange splitting is not a small energy. It is part of the gross structure of the atom. This contrasts with the other spin-dependent effect that we have considered, namely the spin-orbit interaction, which is a small relativistic correction and only contributes to the fine structure of the atom. The value of $2J_{AB}$ for the first excited state of helium, namely the 1s2s configuration, is 0.80 eV.
• We can give a simple physical reason why the symmetry of the spatial wave function (and hence the spin) affects the energy so much. If we put $\mathbf{r}_1 = \mathbf{r}_2$ into eqn 5.4, we see that we get $\psi_{\rm spatial} = 0$ for the anti-symmetric state. This means that the two electrons have a low probability of coming close together in the triplet state, which reduces their Coulomb repulsion energy. On the other hand, $\psi_{\rm spatial}(\mathbf{r}_1 = \mathbf{r}_2) \neq 0$ for singlet states with symmetric spatial wave functions. They therefore have a larger Coulomb repulsion energy.
• The exchange energy is sometimes written in the form
$$ \Delta E_{\rm exchange} \propto -J\, \mathbf{s}_1 \cdot \mathbf{s}_2 . \tag{5.21} $$
This emphasizes the point that the change of energy is related to the relative alignment of the electron spins. If both spins are aligned, as they are in the triplet states, the energy goes down. If the spins are anti-parallel, the energy goes up.
• The notation given in eqn 5.21 is extensively used when explaining the phenomenon of ferromagnetism in the subject of magnetism. The energy that induces the spins to align parallel to each other is caused by the spin-dependent change of the Coulomb repulsion energy of the electrons. The magnetic energy of the electrons due to the dipole-dipole interaction is completely negligible on this scale.
Figure 5.2: The ionization energies of the helium atom.
[Energy level diagram: He++ at 0 eV, He+ (1s) at −54.4 eV and He (1s^2) at −79.0 eV, with intervals of 54.4 eV and 24.6 eV respectively.]
5.5 The helium term diagram
The term diagram for helium can be worked out if we can evaluate the direct and exchange Coulomb energies. The total energy for each configuration is given by eqn 5.20.
The ground state
In the ground state both electrons are in the 1s shell, and so we have a configuration of $1s^2$. We have seen above that we can only have S = 0 for this configuration. The energy is thus given by:
$$E(1s^2) = -\frac{4R_H}{1^2} - \frac{4R_H}{1^2} + \left( D_{1s^2} + J_{1s^2} \right) = -54.4\ \mathrm{eV} - 54.4\ \mathrm{eV} + 29.8\ \mathrm{eV} = -79.0\ \mathrm{eV} . \qquad (5.22)$$
The computation of the direct and exchange energies is non-trivial (to say the least) and keeps theoretical atomic physicists busy. The value of 29.8 eV given here can be deduced experimentally from the first ionization potential (see below).
Ionization potentials
The excited states are made by promoting one of the electrons to higher shells. When the second electron has been promoted into the energy continuum at $n_2 = \infty$, we are left with a singly ionized helium atom: He+. This is now a hydrogenic system. We have one electron in the 1s shell orbiting around a nucleus with charge +2e, and the energy is just $-Z^2 R_H = -54.4$ eV. We thus deduce that the first ionization potential of helium is $-54.4 - (-79.0) = 24.6$ eV. The second ionization potential (i.e. the energy required to liberate the second electron) is then equal to 54.4 eV. This point is illustrated in Fig. 5.2.
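The arithmetic behind eqn 5.22 and the two ionization potentials can be checked with a short script. This is only a bookkeeping sketch: it uses the rounded value R_H = 13.6 eV and the 29.8 eV repulsion energy quoted above, and the variable and function names are illustrative choices rather than notation from these notes.

```python
# Energy bookkeeping for helium, using only numbers quoted in the text.
R_H = 13.6   # Rydberg energy in eV, rounded as in the notes
Z = 2        # nuclear charge of helium


def hydrogenic_energy(n, Z=Z):
    """Energy (eV) of a single electron in shell n of a hydrogenic ion."""
    return -Z**2 * R_H / n**2


# D + J for the 1s^2 configuration, in eV (the value quoted in eqn 5.22)
coulomb_repulsion = 29.8

E_ground = 2 * hydrogenic_energy(1) + coulomb_repulsion  # He (1s^2)
E_ion = hydrogenic_energy(1)                              # He+ (1s)

first_ip = E_ion - E_ground   # energy needed to reach He+ from He
second_ip = 0.0 - E_ion       # energy needed to reach He++ from He+

print(f"E(1s^2)   = {E_ground:.1f} eV")
print(f"first IP  = {first_ip:.1f} eV")
print(f"second IP = {second_ip:.1f} eV")
```

In practice the logic runs the other way round: the measured first ionization potential fixes the ground-state energy, and the 29.8 eV repulsion energy is inferred from it.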
Optical spectra
The first few excited states of helium are listed in Table 5.4. We do not need to consider two-electron jump excited states such as the 2s2s configuration here. This is because the Bohr model tells us that we need an energy of about $2 \times \frac{3}{4} \times 4R_H \approx 82$ eV (taking Z = 2 and neglecting the electron-electron repulsion) to promote two electrons to the n = 2 shell. This is larger than the first ionization energy.
For each excited state we have two spin states corresponding to S equal to 0 or 1. The triplet S = 1 terms are at lower energy than the singlets due to the exchange energy. (See eqn 5.17.) The $\Delta S = 0$ selection rule tells us that we cannot get optical transitions between the singlet and triplet terms. The transitions involving singlet states have a normal Zeeman effect since S = 0, but the triplet transitions have an anomalous Zeeman effect since $S \neq 0$.
The energy term diagram for the first few excited states is shown in Fig. 5.3. The energy of the (1s, nl) state approaches the hydrogenic energy $-R_H/n^2$ when n is large. This is because the excited electron is well outside the 1s shell, which just partly screens the nuclear potential. The outer electron just sees $Z_{\mathrm{eff}} = 1$, and we have a hydrogenic potential.
State                   Configuration
Ground state            1s 1s (≡ 1s^2)
First excited state     1s 2s
Second excited state    1s 2p
Third excited state     1s 3s
Fourth excited state    1s 3p
...                     ...
Ionization limit        1s ∞l
Table 5.4: Electron configurations for the states of the helium atom.
Figure 5.3: Approximate energy term diagram for helium. The diagram is split into singlet and triplet states because only $\Delta S = 0$ transitions are allowed by the selection rules. The energy difference between the singlet and triplet terms for the same configuration is caused by the exchange energy, as identified for the 1s2s configuration.
[Term diagram: energy axis in eV running from 0 to −5, with the (1s)^2 ground state (n = 1) at −24.5 eV. Singlet states (S = 0) are shown in columns 1S0, 1P1 and 1D2, and triplet states (S = 1) in columns 3S1, 3P0,1,2 and 3D1,2,3. Levels shown: (1s,2s) and (1s,2p) for n = 2; (1s,3s), (1s,3p) and (1s,3d) for n = 3. The exchange splitting is marked for the (1s,2s) configuration.]
Excited states such as the 1s 2s configuration are said to be metastable. They cannot relax easily to the ground state. The relaxation would involve a 2s → 1s transition, which is forbidden by the $\Delta l = \pm 1$ selection rule. Furthermore, the relaxation of the triplet 1s 2s configuration is further forbidden by the $\Delta S = 0$ selection rule. These states therefore have very long lifetimes.
5.6 Optical spectra of group II elements
The principles that we have been discussing here with respect to helium apply equally well to other two-electron atoms. In particular, they apply to the elements in group IIA of the periodic table (e.g. Be, Mg, Ca). These atoms have two valence electrons in an s-shell outside a filled shell. The term diagram for group IIA elements would appear generically similar to Fig. 5.3, and the optical spectra would follow similar rules, with singlet and triplet transitions split by the exchange energy. The singlet and triplet transitions have normal and anomalous Zeeman effects, respectively.
5.7 Appendix: Detailed evaluation of the exchange integrals
Our task is to evaluate the gross energy for a specific electronic configuration of helium. We restrict ourselves to configurations of the type (1s,nl), since these are the ones that give rise to the excited states that are observed in the optical spectra. From eqn 5.4 we see that the spatial part of the wave function is given by:
$$\Psi(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2}} \Big[ u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2) \pm u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2) \Big] ,$$
where we take the + sign for singlets with S = 0 and the − sign for triplets with S = 1.
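Our task is to evaluate the three terms in eqn 5.12. We first tackle $E_1$:
$$\begin{aligned}
E_1 &= \iint \Psi^* \hat{H}_1 \Psi \, d^3r_1\, d^3r_2 \\
&= \frac{1}{2} \iint \Big[ u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2) \pm u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2) \Big] \hat{H}_1 \Big[ u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2) \pm u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2) \Big] d^3r_1\, d^3r_2 ,
\end{aligned}$$
where the + sign applies for singlet states and the − sign for triplets. This splits into four integrals: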
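$$\begin{aligned}
E_1 = {} & \frac{1}{2} \iint u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2)\, \hat{H}_1\, u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& + \frac{1}{2} \iint u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2)\, \hat{H}_1\, u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& \pm \frac{1}{2} \iint u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2)\, \hat{H}_1\, u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& \pm \frac{1}{2} \iint u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2)\, \hat{H}_1\, u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 .
\end{aligned}$$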
We now use the fact that $u_{nl}(\mathbf{r}_1)$ is an eigenstate of $\hat{H}_1$:
$$\hat{H}_1 u_{nl}(\mathbf{r}_1) = E_{nl}\, u_{nl}(\mathbf{r}_1) ,$$
and that $\hat{H}_1$ has no effect on $\mathbf{r}_2$, to obtain:
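$$\begin{aligned}
E_1 = {} & \frac{1}{2} E_{1s} \int u_{1s}^*(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_1)\, d^3r_1 \int u_{nl}^*(\mathbf{r}_2)\, u_{nl}(\mathbf{r}_2)\, d^3r_2 \\
& + \frac{1}{2} E_{nl} \int u_{nl}^*(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_1)\, d^3r_1 \int u_{1s}^*(\mathbf{r}_2)\, u_{1s}(\mathbf{r}_2)\, d^3r_2 \\
& \pm \frac{1}{2} E_{nl} \int u_{1s}^*(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_1)\, d^3r_1 \int u_{nl}^*(\mathbf{r}_2)\, u_{1s}(\mathbf{r}_2)\, d^3r_2 \\
& \pm \frac{1}{2} E_{1s} \int u_{nl}^*(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_1)\, d^3r_1 \int u_{1s}^*(\mathbf{r}_2)\, u_{nl}(\mathbf{r}_2)\, d^3r_2 \\
= {} & \frac{1}{2} E_{1s} + \frac{1}{2} E_{nl} + 0 + 0 .
\end{aligned}$$
The integrals in the first two terms are unity because the $u_{nl}$ wave functions are normalized, while the last two terms are zero by orthogonality.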
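The evaluation of $E_2$ follows a similar procedure:
$$\begin{aligned}
E_2 = {} & \iint \Psi^* \hat{H}_2 \Psi \, d^3r_1\, d^3r_2 \\
= {} & + \frac{1}{2} \iint u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2)\, \hat{H}_2\, u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& + \frac{1}{2} \iint u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2)\, \hat{H}_2\, u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& \pm \frac{1}{2} \iint u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2)\, \hat{H}_2\, u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& \pm \frac{1}{2} \iint u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2)\, \hat{H}_2\, u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
= {} & + \frac{1}{2} E_{nl} + \frac{1}{2} E_{1s} + 0 + 0 .
\end{aligned}$$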
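Finally, we have to evaluate the Coulomb repulsion term:
$$\begin{aligned}
E_{12} &= \iint \Psi^* \frac{e^2}{4\pi\epsilon_0 r_{12}} \Psi \, d^3r_1\, d^3r_2 \\
&= \frac{1}{2} \iint \Big[ u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2) \pm u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2) \Big] \frac{e^2}{4\pi\epsilon_0 r_{12}} \Big[ u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2) \pm u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2) \Big] d^3r_1\, d^3r_2 ,
\end{aligned}$$
where again the + sign applies for singlet states and the − sign for triplets. The four terms are: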
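$$\begin{aligned}
E_{12} = {} & + \frac{1}{2} \frac{e^2}{4\pi\epsilon_0} \iint u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2)\, \frac{1}{r_{12}}\, u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& + \frac{1}{2} \frac{e^2}{4\pi\epsilon_0} \iint u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2)\, \frac{1}{r_{12}}\, u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& \pm \frac{1}{2} \frac{e^2}{4\pi\epsilon_0} \iint u_{1s}^*(\mathbf{r}_1)\, u_{nl}^*(\mathbf{r}_2)\, \frac{1}{r_{12}}\, u_{nl}(\mathbf{r}_1)\, u_{1s}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
& \pm \frac{1}{2} \frac{e^2}{4\pi\epsilon_0} \iint u_{nl}^*(\mathbf{r}_1)\, u_{1s}^*(\mathbf{r}_2)\, \frac{1}{r_{12}}\, u_{1s}(\mathbf{r}_1)\, u_{nl}(\mathbf{r}_2)\, d^3r_1\, d^3r_2 \\
= {} & + \frac{D}{2} + \frac{D}{2} \pm \frac{J}{2} \pm \frac{J}{2} ,
\end{aligned}$$
where D and J are given by eqns 5.18 and 5.19 respectively.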
The total energy is thus given by
$$\begin{aligned}
E &= E_{1s} + E_{nl} + D \pm J \\
&= -4R_H - 4R_H/n^2 + D \pm J ,
\end{aligned}$$
where the + sign applies to singlets and the − sign to triplets. (cf. eqn 5.20 with $n_1 = 1$ and $n_2 = n$.)
Reading
Bransden and Joachain, Atoms, Molecules and Photons, chapter 7, §9.5.
Demtröder, W., Atoms, Molecules and Photons, section 6.1.
Haken and Wolf, The physics of atoms and quanta, chapters 17 and 19.
Foot, Atomic physics, chapter 3.
Eisberg and Resnick, Quantum Physics, chapter 9.
Beiser, Concepts of Modern Physics, chapter 7.
Chapter 6
Fine structure
Up to this point, we have been mainly studying the gross structure of atoms. When we consider the gross structure, we include only the largest interaction terms in the Hamiltonian, namely, the electron kinetic energy, the electron-nuclear attraction, and the electron-electron repulsion.
It is now time to start considering the smaller interactions in the atom that arise from magnetic effects. In this chapter we shall consider only those effects caused by internal magnetic fields, leaving the discussion of the effects produced by external fields to the next set of notes. The internal fields within atoms cause fine structure in atomic spectra. We shall start by considering the fine structure of hydrogen and then move on to many-electron atoms. At the end of these notes we shall also look briefly at hyperfine structure, which is a similar, but smaller, effect due to the magnetic interactions between the electrons and the nucleus.
6.1 Orbital magnetic dipoles
The quantum numbers n and l were first introduced in the old quantum theory of Bohr and Sommerfeld. The principal quantum number n was introduced in the Bohr model as a fundamental postulate concerning the quantization of the angular momentum (see eqn 1.5), while the orbital quantum number l was introduced a few years later by Sommerfeld as a patch-up to account for the possibility that the atomic orbits might be elliptical rather than circular. In Section 1.5.1 we saw how these two quantum numbers naturally re-appear in the full quantum mechanical treatment of the hydrogen atom. Then, in Section 3.1, we saw how they carry across to many-electron atoms.
Two key results that drop out of the quantum mechanical treatment of atoms are:
• The magnitude L of the orbital angular momentum of an electron is given by (see eqn 1.35):
$$L = \sqrt{l(l+1)}\, \hbar , \qquad (6.1)$$
where l can take integer values up to (n − 1).
• The component of the angular momentum along a particular axis (usually taken as the z axis) is quantized in units of $\hbar$ and its value is given by (see eqn 1.36):
$$L_z = m_l \hbar , \qquad (6.2)$$
where the magnetic quantum number $m_l$ can take integer values from −l to +l.
These two relationships give rise to the vector model of angular momentum illustrated in Fig. 1.5.
The orbital motion of the electron causes it to have a magnetic moment. Let us first consider an electron in a circular Bohr orbit, as illustrated in Fig. 6.1(a). The electron orbit is equivalent to a current loop, and we know from electromagnetism that current loops behave like magnets. The electron in the Bohr orbit is equivalent to a little magnet with a magnetic dipole moment $\mu$ given by:
$$\mu = i \times \mathrm{Area} = -(e/T) \times (\pi r^2) , \qquad (6.3)$$
where T is the period of the orbit. Now $T = 2\pi r/v$, and so we obtain
$$\mu = -\frac{ev}{2\pi r}\, \pi r^2 = -\frac{e}{2m_e}\, m_e v r = -\frac{e}{2m_e} L , \qquad (6.4)$$
where we have substituted L for the orbital angular momentum $m_e v r$.

Figure 6.1: (a) The orbital motion of the electron around the nucleus in a circular Bohr orbit is equivalent to a current loop, which generates a magnetic dipole moment. (b) Magnetic dipole moment of an electron in a non-circular orbit.
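This relationship can easily be generalized to the case of electrons in non-circular orbits. Consider an electron at position vector $\mathbf{r}$ in a non-circular orbit with an origin O. The magnetic dipole moment is given by:
$$\boldsymbol{\mu} = \oint i \, d\mathbf{A} , \qquad (6.5)$$
where i is the current in the loop and $d\mathbf{A}$ is the incremental area swept out by the electron as it performs its orbit. The incremental area $d\mathbf{A}$ is related to the path element $d\mathbf{u}$ by:
$$d\mathbf{A} = \frac{1}{2}\, \mathbf{r} \times d\mathbf{u} , \qquad (6.6)$$
and so eqn 6.5 becomes:
$$\boldsymbol{\mu} = \frac{1}{2} \oint i \, \mathbf{r} \times d\mathbf{u} . \qquad (6.7)$$
We can write the current as i = dq/dt, where q is the charge, which implies:
$$\begin{aligned}
\boldsymbol{\mu} &= \frac{1}{2} \oint \frac{dq}{dt}\, \mathbf{r} \times d\mathbf{u} \\
&= \frac{1}{2} \oint dq \, \mathbf{r} \times \frac{d\mathbf{u}}{dt} \\
&= \frac{1}{2} \oint dq \, \mathbf{r} \times \mathbf{v} \\
&= \frac{1}{2m_e} \oint dq \, \mathbf{r} \times \mathbf{p} , \qquad (6.8)
\end{aligned}$$
where $\mathbf{v}$ is the velocity, and $\mathbf{p}$ is the momentum. The angular momentum is defined as usual by
$$\mathbf{L} = \mathbf{r} \times \mathbf{p} \qquad (6.9)$$
and so we finally obtain:
$$\boldsymbol{\mu} = \frac{1}{2m_e} \oint \mathbf{L} \, dq = \frac{1}{2m_e}\, \mathbf{L} \oint dq = \frac{1}{2m_e}\, \mathbf{L}\, (-e) , \qquad (6.10)$$
as in eqn 6.4. Note that the result works because the angular momentum $\mathbf{L}$ is a constant of the motion in the central field approximation (see Section 4.2.1), and so it can be taken out of the integral.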
Equation 6.4 shows us that the orbital angular momentum is directly related to the magnetic dipole moment. The quantity $e/2m_e$ that appears is called the gyromagnetic ratio. It specifies the proportionality constant between the angular momentum of an electron and its magnetic moment. It is apparent from eqns 6.1 and 6.4 that the magnitude of atomic magnetic dipoles is given by:
$$|\boldsymbol{\mu}| = \frac{e\hbar}{2m_e} \sqrt{l(l+1)} = \mu_B \sqrt{l(l+1)} , \qquad (6.11)$$
where $\mu_B$ is the Bohr magneton defined by:
$$\mu_B = \frac{e\hbar}{2m_e} = 9.27 \times 10^{-24}\ \mathrm{J\,T^{-1}} . \qquad (6.12)$$
This shows that the size of atomic dipoles is of order $\mu_B$. In many cases we are interested in the z component of the magnetic dipole, which is given from eqns 6.2 and 6.4 as:
$$\mu_z = -\frac{e}{2m_e} L_z = -\mu_B m_l , \qquad (6.13)$$
where $m_l$ is the orbital magnetic quantum number.
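As a numerical check on eqns 6.11 to 6.13, the following sketch evaluates the Bohr magneton and the orbital moments for a given l. The values of the constants are rounded standard values supplied here as assumptions (they are not quoted in these notes), and the function names are purely illustrative.

```python
import math

# Rounded standard values of the constants (assumed, not taken from the notes)
e = 1.602176634e-19      # elementary charge, C
hbar = 1.054571817e-34   # reduced Planck constant, J s
m_e = 9.1093837015e-31   # electron mass, kg

mu_B = e * hbar / (2 * m_e)   # Bohr magneton, eqn 6.12


def orbital_moment_magnitude(l):
    """|mu| = mu_B * sqrt(l(l+1)), eqn 6.11, in J/T."""
    return mu_B * math.sqrt(l * (l + 1))


def orbital_moment_z(l):
    """Allowed z components mu_z = -mu_B * m_l for m_l = -l..+l, eqn 6.13."""
    return [-mu_B * m_l for m_l in range(-l, l + 1)]


print(f"mu_B = {mu_B:.3e} J/T")
print(f"|mu| for l = 1: {orbital_moment_magnitude(1) / mu_B:.3f} mu_B")
print("mu_z / mu_B for l = 1:", [round(m / mu_B) for m in orbital_moment_z(1)])
```

For l = 1 the magnitude is $\sqrt{2}\,\mu_B$, whereas the largest z component is only $\mu_B$, which is the familiar vector-model picture.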
6.2 Spin magnetism
We have seen in Section 4.2.2 that electrons also have spin angular momentum. The deflections measured in the Stern-Gerlach experiment (see Fig. 4.1) enabled the magnitude of the magnetic moment due to the spin angular momentum to be determined. The component along the z axis was found to obey:
$$\mu_z = -g_s \mu_B m_s , \qquad (6.14)$$
where $g_s$ is the g-value of the electron, and $m_s = \pm 1/2$ is the magnetic quantum number due to spin. This is identical in form to eqn 6.13 apart from the factor of $g_s$. The experimental value of $g_s$ was found to be close to 2. The Dirac equation predicts that $g_s$ should be exactly equal to 2, and more recent calculations based on quantum electrodynamics (QED) give a value of 2.0023192, which agrees very accurately with the most precise experimental data.
6.3 Spin-orbit coupling
The fact that electrons in atoms have both orbital and spin angular momentum leads to a new interaction term in the Hamiltonian called spin-orbit coupling. Sophisticated theories of spin-orbit coupling (e.g. those based on the Dirac equation) indicate that it is actually a relativistic effect. At this stage it is more useful to consider spin-orbit coupling as the interaction between the magnetic field due to the orbital motion of the electron and the magnetic moment due to its spin. This more intuitive approach is the one we adopt here. We start by giving a simple order of magnitude estimate based on the semi-classical Bohr model, and then take a more general approach that works for the fully quantum mechanical picture.
6.3.1 Spin-orbit coupling in the Bohr model
The easiest way to understand the spin-orbit coupling is to consider the single electron of a hydrogen atom in a Bohr-like circular orbit around the nucleus, and then shift the origin to the electron, as indicated in Fig. 6.2. In this frame, the electron is stationary and the nucleus is moving in a circular orbit of radius $r_n$. The orbit of the nucleus is equivalent to a current loop, which produces a magnetic field at the origin. Now the magnetic field produced by a circular loop of radius r carrying a current i is given by:
$$B_z = \frac{\mu_0 i}{2r} , \qquad (6.15)$$
where z is taken to be the direction perpendicular to the loop. As in Section 6.1, the current i is given by the charge Ze divided by the orbital period $T = 2\pi r/v$. On substituting for the velocity and radius in the Bohr model from eqns 1.15 and 1.16, we find:
$$B_z = \frac{\mu_0 Z e v_n}{4\pi r_n^2} = \left( \frac{Z^4}{n^5} \right) \frac{\mu_0 \alpha c e}{4\pi a_0^2} , \qquad (6.16)$$
where $\alpha = e^2/2\epsilon_0 h c \approx 1/137$ is the fine structure constant defined in eqn 1.18. For hydrogen with Z = n = 1, this gives $B_z \approx 12$ Tesla, which is a large field.
The electron at the origin experiences this orbital field and we thus have a magnetic interaction energy of the form:
$$\Delta E_{so} = -\boldsymbol{\mu}_{\mathrm{spin}} \cdot \mathbf{B}_{\mathrm{orbital}} , \qquad (6.17)$$
which, from eqn 6.14, becomes:
$$\Delta E_{so} = g_s \mu_B m_s B_z = \pm \mu_B B_z , \qquad (6.18)$$
where we have used $g_s = 2$ and $m_s = \pm 1/2$ in the last equality. By substituting from eqn 6.16 and making use of eqn 6.12, we find:
$$|\Delta E_{so}| = \left( \frac{Z^4}{n^5} \right) \frac{\mu_0 \alpha c e^2 \hbar}{8\pi m_e a_0^2} \equiv \alpha^2 \frac{Z^2}{n^3} |E_n| , \qquad (6.19)$$
where $E_n$ is the quantized energy given by eqn 1.10. For the n = 1 orbit of hydrogen, this gives:
$$|\Delta E_{so}| = \alpha^2 R_H = 13.6\ \mathrm{eV}/137^2 = 0.7\ \mathrm{meV} \equiv 6\ \mathrm{cm^{-1}} .$$

Figure 6.2: An electron moving with velocity $\mathbf{v}$ through the electric field $\mathbf{E}$ of the nucleus experiences a magnetic field equal to $(\mathbf{E} \times \mathbf{v})/c^2$. The magnetic field can be understood by shifting the origin to the electron and calculating the magnetic field due to the orbital motion of the nucleus around the electron. The velocity of the nucleus in this frame is equal to $-\mathbf{v}$.
This shows that the spin-orbit interaction is about $10^4$ times smaller than the gross structure energy in hydrogen. Note that the relative size of the spin-orbit interaction grows as $Z^2$, so that spin-orbit effects are expected to become more important in heavier atoms, which is indeed the case. Note also that eqn 6.19 can be re-written using eqn 1.16 as
$$|\Delta E_{so}| = \left( \frac{v_n}{c} \right)^2 \frac{|E_n|}{n} , \qquad (6.20)$$
which shows that the spin-orbit interaction energy is of the same magnitude as the relativistic corrections that would be expected for the Bohr model. This is hardly surprising, given that Dirac tells us that we should really think of spin-orbit coupling as a relativistic effect.
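The order-of-magnitude estimates of eqns 6.16 and 6.19 are easy to reproduce numerically. The sketch below is a rough check, not part of the original derivation: the constants are rounded standard values supplied as assumptions, and the function names are hypothetical.

```python
import math

# Rounded standard constants (assumed values, not quoted in the notes)
mu_0 = 4e-7 * math.pi     # vacuum permeability, T m / A (approximate)
c = 2.99792458e8          # speed of light, m/s
e = 1.602176634e-19       # elementary charge, C
hbar = 1.054571817e-34    # reduced Planck constant, J s
m_e = 9.1093837015e-31    # electron mass, kg
a_0 = 5.29177210903e-11   # Bohr radius, m
alpha = 7.2973525693e-3   # fine structure constant


def bohr_orbital_field(Z=1, n=1):
    """Magnetic field (T) seen by the electron in a Bohr orbit, eqn 6.16."""
    return (Z**4 / n**5) * mu_0 * alpha * c * e / (4 * math.pi * a_0**2)


def spin_orbit_estimate_eV(Z=1, n=1):
    """|Delta E_so| = mu_B * B_z (eqn 6.18 with g_s = 2, |m_s| = 1/2), in eV."""
    mu_B = e * hbar / (2 * m_e)
    return mu_B * bohr_orbital_field(Z, n) / e


B = bohr_orbital_field()
dE = spin_orbit_estimate_eV()
print(f"B_z for hydrogen n = 1: {B:.1f} T")
print(f"|Delta E_so|: {dE * 1e3:.2f} meV")
print(f"alpha^2 * 13.6 eV: {alpha**2 * 13.6 * 1e3:.2f} meV")
```

The last two printed numbers should agree closely, which is just the equivalence stated in eqn 6.19.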
6.3.2 Spin-orbit coupling beyond the Bohr model
In this sub-section we repeat the calculation above but without making use of the semi-classical results from the Bohr model. The electrons experience a magnetic field as they move through the electric field of the nucleus. If the electron velocity is $\mathbf{v}$, it will see the nucleus orbiting around it with a velocity of $-\mathbf{v}$, as shown in Fig. 6.2. The magnetic field generated at the electron can be calculated by the Biot-Savart law as shown by Fig. 6.3. This gives the magnetic field at the origin of a loop carrying a current i as:
$$\mathbf{B} = \frac{\mu_0}{4\pi} \oint_{\mathrm{loop}} i \, \frac{d\mathbf{u} \times \mathbf{r}}{r^3} , \qquad (6.21)$$
where $d\mathbf{u}$ is an orbital path element. For simplicity we consider the case of a circular orbit with constant r. In this case we have:
$$\oint i \, d\mathbf{u} = \oint \frac{dq}{dt}\, d\mathbf{u} = Ze\, \frac{d\mathbf{u}}{dt} = Ze\,(-\mathbf{v}) .$$
We thus obtain:
$$\mathbf{B} = -\frac{\mu_0}{4\pi} \frac{Ze}{r^3}\, \mathbf{v} \times \mathbf{r} = \frac{\mu_0}{4\pi} \frac{Ze}{r^3}\, \mathbf{r} \times \mathbf{v} . \qquad (6.22)$$
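For a Coulomb field the electric field $\mathbf{E}$ is given by:
$$\mathbf{E} = \frac{Ze}{4\pi\epsilon_0 r^2}\, \hat{\mathbf{r}} = \frac{Ze}{4\pi\epsilon_0 r^3}\, \mathbf{r} , \qquad (6.23)$$
where the hat symbol on $\mathbf{r}$ in the first equality indicates that it is a unit vector. On combining equations 6.22 and 6.23 we obtain:
$$\mathbf{B} = \mu_0 \epsilon_0 \, \mathbf{E} \times \mathbf{v} . \qquad (6.24)$$
We know from Maxwell's equations that $\mu_0 \epsilon_0 = 1/c^2$, and so we can re-write this as:
$$\mathbf{B} = \frac{1}{c^2}\, \mathbf{E} \times \mathbf{v} . \qquad (6.25)$$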
Figure 6.3: The magnetic field at the origin O due to a loop carrying a current i is calculated by the Biot-Savart law given in eqn 6.21. The field points out of the paper.
The same formula can also be derived for the more general case of non-circular orbits and non-Coulombic electric fields such as those found in multi-electron atoms.
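The spin-orbit interaction energy is given by:
$$\Delta E_{so} = -\boldsymbol{\mu}_{\mathrm{spin}} \cdot \mathbf{B}_{\mathrm{orbital}} , \qquad (6.26)$$
where $\boldsymbol{\mu}_{\mathrm{spin}}$ is the magnetic moment due to spin, which is given by:
$$\boldsymbol{\mu}_{\mathrm{spin}} = -g_s \frac{|e|}{2m_e}\, \mathbf{s} = -g_s \frac{\mu_B}{\hbar}\, \mathbf{s} . \qquad (6.27)$$
On substituting eqns 6.25 and 6.27 into eqn 6.26, we obtain:
$$\Delta E_{so} = \frac{g_s \mu_B}{\hbar c^2}\, \mathbf{s} \cdot (\mathbf{E} \times \mathbf{v}) . \qquad (6.28)$$
If we have a central field (i.e. the potential V is a function of r only), we can write (see footnote 1):
$$\mathbf{E} = \frac{1}{e} \frac{\mathbf{r}}{r} \frac{dV}{dr} . \qquad (6.29)$$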
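On making use of this, the spin-orbit energy becomes:
$$\Delta E_{so} = \frac{g_s \mu_B}{\hbar c^2 e m_e} \left( \frac{1}{r} \frac{dV}{dr} \right) \mathbf{s} \cdot (\mathbf{r} \times \mathbf{p}) , \qquad (6.30)$$
where we have substituted $\mathbf{v} = \mathbf{p}/m_e$. On recalling that the angular momentum $\mathbf{l}$ is defined as $\mathbf{r} \times \mathbf{p}$, we find:
$$\Delta E_{so} = \frac{g_s \mu_B}{\hbar c^2 e m_e} \left( \frac{1}{r} \frac{dV}{dr} \right) \mathbf{s} \cdot \mathbf{l} . \qquad (6.31)$$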
This calculation of $\Delta E_{so}$ does not take proper account of relativistic effects. In particular, we moved the origin from the nucleus to the electron, which is not really allowed because the electron is accelerating all the time and is therefore not an inertial frame. The translation to a rotating frame gives rise to an extra effect called the Thomas precession which reduces the energy by a factor of 2. (See Eisberg and Resnick, Appendix O.) On taking the Thomas precession into account, and recalling that $\mu_B = e\hbar/2m_e$, we obtain the final result:
$$\Delta E_{so} = \frac{g_s}{2} \frac{1}{2c^2 m_e^2} \left( \frac{1}{r} \frac{dV}{dr} \right) \mathbf{l} \cdot \mathbf{s} . \qquad (6.32)$$
This is the same as the result derived from the Dirac equation, except that $g_s$ is exactly equal to 2 in Dirac's theory. Equation 6.32 shows that the spin and orbital angular momenta are coupled together. If we have a simple Coulomb field and take $g_s = 2$, we find
$$\Delta E_{so} = \frac{Ze^2}{8\pi\epsilon_0 c^2 m_e^2} \left( \frac{1}{r^3} \right) \mathbf{l} \cdot \mathbf{s} . \qquad (6.33)$$
We can use this formula for hydrogenic atoms, while we can use the more general form given in eqn 6.32 for more complicated multi-electron atoms where the potential will differ from the Coulombic 1/r dependence due to the repulsion between the electrons.
Footnote 1: It is easy to verify that this works for a Coulomb field where $V = -Ze^2/4\pi\epsilon_0 r$ and $\mathbf{E}$ is given by eqn 6.23.
Figure 6.4: Fine structure in the n = 2 level of hydrogen.
[Level diagram: the n = 2 gross structure level (l = 0, 1) is shown with successive corrections from the spin-orbit interaction, Dirac theory and the Lamb shift. The final levels are l = 1, j = 3/2; l = 0, j = 1/2; and l = 1, j = 1/2, with splittings of 0.365 cm^-1 and 0.035 cm^-1 marked.]
6.4 Evaluation of the spin-orbit energy for hydrogen
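The magnitude of the spin-orbit energy can be calculated from eqn 6.32 as:
$$\Delta E_{so} = \frac{1}{2c^2 m_e^2} \left\langle \frac{1}{r} \frac{dV}{dr} \right\rangle \langle \mathbf{l} \cdot \mathbf{s} \rangle , \qquad (6.34)$$
where we have taken $g_s = 2$, and the $\langle \cdots \rangle$ notation indicates that we take expectation values:
$$\left\langle \frac{1}{r} \frac{dV}{dr} \right\rangle = \iiint \psi_{nlm}^* \left( \frac{1}{r} \frac{dV}{dr} \right) \psi_{nlm}\, r^2 \sin\theta \, dr \, d\theta \, d\phi . \qquad (6.35)$$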
The function (dV/dr)/r depends only on r, and so we are left to calculate an integral over r only:
$$\left\langle \frac{1}{r} \frac{dV}{dr} \right\rangle = \int_0^\infty |R_{nl}(r)|^2 \left( \frac{1}{r} \frac{dV}{dr} \right) r^2 \, dr , \qquad (6.36)$$
where $R_{nl}(r)$ is the radial wave function. This integral can be evaluated exactly for the case of the Coulomb field in hydrogen where $(dV/dr)/r \propto 1/r^3$, and the radial wave functions are known exactly. (See Table 1.4.) We then have, for $l \geq 1$:
$$\left\langle \frac{1}{r} \frac{dV}{dr} \right\rangle \propto \left\langle \frac{1}{r^3} \right\rangle = \frac{Z^3}{a_0^3 n^3\, l(l+\frac{1}{2})(l+1)} . \qquad (6.37)$$
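This shows that we can re-write eqn 6.34 in the form:
$$\Delta E_{so} = C_{nl} \langle \mathbf{l} \cdot \mathbf{s} \rangle , \qquad (6.38)$$
where $C_{nl}$ is a constant that depends only on n and l.
We can evaluate $\langle \mathbf{l} \cdot \mathbf{s} \rangle$ by realizing from eqn 4.21 that we must have:
$$\mathbf{j}^2 = (\mathbf{l} + \mathbf{s})^2 = \mathbf{l}^2 + \mathbf{s}^2 + 2\, \mathbf{l} \cdot \mathbf{s} . \qquad (6.39)$$
This implies that:
$$\langle \mathbf{l} \cdot \mathbf{s} \rangle = \left\langle \frac{1}{2} (\mathbf{j}^2 - \mathbf{l}^2 - \mathbf{s}^2) \right\rangle = \frac{\hbar^2}{2} \left[ j(j+1) - l(l+1) - s(s+1) \right] . \qquad (6.40)$$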
We therefore find:
$$\Delta E_{so} = C'_{nl} \left[ j(j+1) - l(l+1) - s(s+1) \right] , \qquad (6.41)$$
where $C'_{nl} = C_{nl} \hbar^2/2$. On using eqn 6.37 we obtain the final result for states with $l \geq 1$:
$$\Delta E_{so} = -\frac{\alpha^2 Z^2}{2n^2}\, E_n\, \frac{n}{l(l+\frac{1}{2})(l+1)} \left[ j(j+1) - l(l+1) - s(s+1) \right] , \qquad (6.42)$$
where $\alpha \approx 1/137$ is the fine structure constant, and $E_n = -R_H Z^2/n^2$ is equal to the gross energy. For states with l = 0 it is apparent from eqn 6.34 that $\Delta E_{so} = 0$.
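To see what eqn 6.42 predicts in practice, the sketch below evaluates it for the n = 2, l = 1 level of hydrogen. It is an illustrative check only: the numerical values of α and R_H are rounded inputs supplied as assumptions, the conversion 1 eV = 8066 cm⁻¹ is the one quoted in Section 1.2, and the function name is hypothetical.

```python
from fractions import Fraction

ALPHA = 1 / 137.036   # fine structure constant (rounded, assumed)
R_H = 13.606          # Rydberg energy in eV (rounded, assumed)
EV_TO_CM = 8066       # 1 eV in cm^-1, as quoted in Section 1.2


def spin_orbit_shift_eV(n, l, j, Z=1):
    """Spin-orbit shift of eqn 6.42 in eV; zero for l = 0 states."""
    if l == 0:
        return 0.0
    s = Fraction(1, 2)
    j = Fraction(j)
    E_n = -R_H * Z**2 / n**2                                   # gross energy
    bracket = j * (j + 1) - l * (l + 1) - s * (s + 1)
    radial = n / (l * (l + Fraction(1, 2)) * (l + 1))
    return float(-(ALPHA**2 * Z**2 / (2 * n**2)) * E_n * radial * bracket)


upper = spin_orbit_shift_eV(2, 1, Fraction(3, 2))   # j = 3/2
lower = spin_orbit_shift_eV(2, 1, Fraction(1, 2))   # j = 1/2
splitting_cm = (upper - lower) * EV_TO_CM

print(f"j = 3/2 shift: {upper * 1e6:+.1f} micro-eV")
print(f"j = 1/2 shift: {lower * 1e6:+.1f} micro-eV")
print(f"doublet splitting: {splitting_cm:.3f} cm^-1")
```

The splitting that comes out is close to the 0.365 cm⁻¹ interval marked in Fig. 6.4, even though the individual shifts are modified by the other relativistic corrections discussed below.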
The fact that j takes values of l + 1/2 and l − 1/2 for $l \geq 1$ means that the spin-orbit interaction splits the two j states with the same value of l. We thus expect the electronic states of hydrogen with $l \geq 1$ to split into doublets. However, the actual fine structure of hydrogen is more complicated for two reasons:
1. States with the same n but different l are degenerate.
2. The spin-orbit interaction is small.

Figure 6.5: Spin-orbit interactions in alkali atoms. (a) The spin-orbit interaction splits the nl states into a doublet if $l \neq 0$. (b) Fine structure in the yellow sodium D lines.
[Panel (a): the n, l level splits into J = (L + 1/2), shifted by +C′L, and J = (L − 1/2), shifted by −C′(L+1). Panel (b): the 3p 2P_J levels (J = 3/2 and J = 1/2) and the 3s 2S_1/2 level (J = 1/2), with transitions at 589.0 nm and 589.6 nm.]
The first point is a general property of pure one-electron systems, and the second follows from the scaling of $\Delta E_{so}/E_n$ with $Z^2$. A consequence of point 2 is that other relativistic effects that have been neglected up until now are of a similar magnitude to the spin-orbit coupling. In atoms with higher values of Z, the spin-orbit coupling is the dominant relativistic correction, and we can neglect the other effects.
The fine structure of the n = 2 level in hydrogen is illustrated in Fig. 6.4. The fully relativistic Dirac theory predicts that states with the same j are degenerate. The degeneracy of the two j = 1/2 states is ultimately lifted by a quantum electrodynamic (QED) effect called the Lamb shift. The complications of the fine structure of hydrogen due to other relativistic and QED effects mean that hydrogen is not the paradigm for understanding spin-orbit effects. The alkali metals considered below are in fact simpler to understand.
6.5 Spin-orbit coupling in alkali atoms
Alkali atoms have a single valence electron outside closed shells. Closed shells have no angular momentum, and so the angular momentum state $|L, S, J\rangle$ of the atom is determined entirely by the valence electron. By analogy with the results for hydrogen given in eqns 6.38 and 6.41, we can write the spin-orbit interaction term as:
$$\Delta E_{SO} \propto \langle \mathbf{L} \cdot \mathbf{S} \rangle \propto \left[ J(J+1) - L(L+1) - S(S+1) \right] . \qquad (6.43)$$
It follows immediately that the spin-orbit energy is zero when the valence electron is in an s-shell, since $\langle \mathbf{L} \cdot \mathbf{S} \rangle = 0$ when L = 0. (Alternatively: J = S if L = 0, so J(J+1) − L(L+1) − S(S+1) = 0.)
Now consider the case when the valence electron is in a shell with $l \neq 0$. We now have L = l and S = 1/2, so that $\langle \mathbf{L} \cdot \mathbf{S} \rangle \neq 0$. J has two possible values, namely $J = L \oplus S = L \oplus 1/2 = L \pm 1/2$. On writing eqn 6.43 in the form:
$$\Delta E_{SO} = C' \left[ J(J+1) - L(L+1) - S(S+1) \right] , \qquad (6.44)$$
the spin-orbit energy of the J = (L + 1/2) state is given by:
$$\Delta E_{so} = C' \left[ (L + \tfrac{1}{2})(L + \tfrac{3}{2}) - L(L+1) - \tfrac{1}{2} \cdot \tfrac{3}{2} \right] = +C'L ,$$
while for the J = (L − 1/2) level we have:
$$\Delta E_{so} = C' \left[ (L - \tfrac{1}{2})(L + \tfrac{1}{2}) - L(L+1) - \tfrac{1}{2} \cdot \tfrac{3}{2} \right] = -C'(L+1) .$$
Hence the term defined by the quantum numbers n and l is split by the spin-orbit coupling into two new states, as illustrated in Fig. 6.5(a). This gives rise to the appearance of doublets in the atomic spectra. The magnitude of the splitting is smaller than the gross energy by a factor $\sim \alpha^2 = 1/137^2$. (See eqn 6.42.) This is why these effects are called fine structure, and $\alpha$ is called the fine structure constant.
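The algebra leading to the shifts +C′L and −C′(L+1) can be verified with exact rational arithmetic. The following is a small sketch with illustrative function names; energies are expressed in units of the constant C′ of eqn 6.44.

```python
from fractions import Fraction


def so_bracket(J, L, S=Fraction(1, 2)):
    """J(J+1) - L(L+1) - S(S+1): the shift of eqn 6.44 in units of C'."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    return J * (J + 1) - L * (L + 1) - S * (S + 1)


def doublet_shifts(L):
    """Shifts (in units of C') of the J = L + 1/2 and J = L - 1/2 levels."""
    half = Fraction(1, 2)
    return so_bracket(L + half, L), so_bracket(L - half, L)


for L in (1, 2, 3):
    upper, lower = doublet_shifts(L)
    # Weight each level by its degeneracy (2J + 1) to find the mean shift.
    mean = (2 * L + 2) * upper + (2 * L) * lower
    print(f"L = {L}: shifts {upper} and {lower}, weighted sum {mean}")
```

The degeneracy-weighted sum of the two shifts vanishes for every L, so the spin-orbit interaction splits the term without moving its centre of gravity.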
As an example, let us consider sodium, which has 11 electrons, with one valence electron outside filled 1s, 2s and 2p shells. It can therefore be treated as a one electron system, provided we remember that this is only an approximation. One immediate consequence is that the differing l states arising from the same value of n are not degenerate as they are in hydrogen. (See section 3.5.) The bright yellow D lines of sodium correspond to the 3p → 3s transition.

Figure 6.6: Spin-orbit splitting of the first excited state of the alkali atoms versus $Z^2$, as determined by the fine structure splitting of the D-lines. (See Table 6.1.)
[Plot: fine structure splitting (cm^-1) against (atomic number Z)^2 for the alkali D-lines of Li, Na, K, Rb and Cs.]
It is well known that the D-lines actually consist of a doublet, as shown in Fig. 6.5(b). The doublet arises from the spin-orbit coupling. The ground state is a $^2S_{1/2}$ level with zero spin-orbit splitting. The excited state is split into the two levels derived from the different J values for L = 1 and S = 1/2, namely the $^2P_{3/2}$ and $^2P_{1/2}$ levels. The two transitions in the doublet are therefore:
$$^2P_{3/2} \rightarrow {}^2S_{1/2}$$
and
$$^2P_{1/2} \rightarrow {}^2S_{1/2} .$$
The energy difference of 17 cm$^{-1}$ between them arises from the spin-orbit splitting of the two J states of the $^2P$ term.
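The 17 cm⁻¹ splitting can be recovered directly from the two D-line wavelengths shown in Fig. 6.5(b), using the wave number definition of eqn 1.2. The snippet below is a minimal sketch with an illustrative helper name; because the wavelengths are quoted to only four significant figures, the result is only good to about 1 cm⁻¹.

```python
def wavenumber_cm(wavelength_nm):
    """Wave number in cm^-1 for a wavelength given in nm (1 nm = 1e-7 cm)."""
    return 1.0 / (wavelength_nm * 1e-7)


nu_D2 = wavenumber_cm(589.0)   # 2P3/2 -> 2S1/2 (shorter wavelength)
nu_D1 = wavenumber_cm(589.6)   # 2P1/2 -> 2S1/2 (longer wavelength)
splitting = nu_D2 - nu_D1      # fine structure splitting of the 3p level

print(f"D2: {nu_D2:.0f} cm^-1, D1: {nu_D1:.0f} cm^-1")
print(f"splitting: {splitting:.1f} cm^-1")
```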
Similar arguments can be applied to the other alkali elements. The spin-orbit energy splittings of their first excited states are tabulated in Table 6.1. Note that the splitting increases with Z, and that the splitting energy is roughly proportional to $Z^2$, as shown in Fig. 6.6. This is an example of the fact that spin-orbit interactions generally increase with the atomic number, so that the spin-orbit coupling is stronger in heavier elements.
Element     Z    Ground state   1st excited state   Transition   ΔE (cm^-1)
Lithium     3    [He] 2s        2p                  2p → 2s      0.33
Sodium      11   [Ne] 3s        3p                  3p → 3s      17
Potassium   19   [Ar] 4s        4p                  4p → 4s      58
Rubidium    37   [Kr] 5s        5p                  5p → 5s      238
Cesium      55   [Xe] 6s        6p                  6p → 6s      554
Table 6.1: Spin-orbit splitting ΔE of the D lines of the alkali elements. The energy splitting is equal to the difference of the energies of the J = 3/2 and J = 1/2 levels of the first excited state.
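One quick way to judge how well the $Z^2$ scaling holds is to divide each splitting in Table 6.1 by $Z^2$. The sketch below does this with the tabulated values; the dictionary and variable names are illustrative.

```python
# (Z, splitting in cm^-1) for each alkali element, copied from Table 6.1
table_6_1 = {
    "Li": (3, 0.33),
    "Na": (11, 17.0),
    "K": (19, 58.0),
    "Rb": (37, 238.0),
    "Cs": (55, 554.0),
}

# If the splitting were exactly proportional to Z^2 this ratio would be constant.
ratios = {element: dE / Z**2 for element, (Z, dE) in table_6_1.items()}

for element, ratio in ratios.items():
    print(f"{element:2s}: splitting / Z^2 = {ratio:.3f} cm^-1")
```

The ratio is similar for Na, K, Rb and Cs but noticeably smaller for Li, which is why the proportionality to $Z^2$ is described as rough.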
6.6 Spin-orbit coupling in many-electron atoms
We have seen in Chapter 4 that atoms with more than one valence electron can have different types of angular momentum coupling. We restrict our attention here to atoms with LS-coupling, which is the most common type, as explained in Section 4.7. In LS-coupling, the residual electrostatic interaction couples the orbital and spin angular momenta together according to eqns 4.28 and 4.29. The resultants are then coupled together to give the total angular momentum J according to
$$\mathbf{J} = \mathbf{L} + \mathbf{S} . \qquad (6.45)$$
The rules for coupling of angular momenta produce several J states for each LS-term, with J running from L + S down to |L − S| in integer steps (see footnote 2). These J states experience different spin-orbit interactions, and so are shifted in energy from each other. Hence the spin-orbit coupling splits the J states of a particular LS-term into fine structure multiplets.
The splitting of the J states can be evaluated as follows. The spin-orbit interaction takes the form:
$$\Delta E_{so} = -\boldsymbol{\mu}_{\mathrm{spin}} \cdot \mathbf{B}_{\mathrm{orbital}} \propto \langle \mathbf{L} \cdot \mathbf{S} \rangle , \qquad (6.46)$$
which implies (cf. eqns 6.38 to 6.41):
$$\Delta E_{SO} = C_{LS} \left[ J(J+1) - L(L+1) - S(S+1) \right] . \qquad (6.47)$$
It follows from eqn 6.47 that adjacent levels with the same L and S but different J are separated by an energy which is proportional to J: the interval between the levels J and J − 1 is $2C_{LS}J$. This is called the interval rule. Figure 4.3 shows an example of the interval rule for the $^3P$ term of the (3s,3p) configuration of magnesium.
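The interval rule follows from eqn 6.47 by simple subtraction, and can be checked for any LS-term with a few lines of exact arithmetic. The sketch below uses illustrative function names and works in units of the constant $C_{LS}$.

```python
from fractions import Fraction


def allowed_J(L, S):
    """J values from |L - S| up to L + S in integer steps."""
    L, S = Fraction(L), Fraction(S)
    J = abs(L - S)
    values = []
    while J <= L + S:
        values.append(J)
        J += 1
    return values


def so_factor(J, L, S):
    """Bracket of eqn 6.47: J(J+1) - L(L+1) - S(S+1), in units of C_LS."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    return J * (J + 1) - L * (L + 1) - S * (S + 1)


def intervals(L, S):
    """Pairs (J, E(J) - E(J-1)) in units of C_LS for adjacent levels."""
    Js = allowed_J(L, S)
    return [(J, so_factor(J, L, S) - so_factor(J - 1, L, S)) for J in Js[1:]]


for J, gap in intervals(1, 1):   # the 3P term: L = 1, S = 1
    print(f"J = {J}: gap to J - 1 is {gap} C_LS")
```

For the $^3P$ term the two intervals come out in the ratio 1:2, as expected for gaps of $2C_{LS}J$ with J = 1 and J = 2.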
6.7 Nuclear eects in atoms
For most of the time in atomic physics we just take the nucleus to be a heavy charged particle sitting at
the centre of the atom. However, careful analysis of the spectral lines can reveal small eects that give
us direct information about the nucleus. The main eects that can be observed generally fall into two
categories, namely isotope shifts and hyperne structure.
6.7.1 Isotope shifts
There are two main processes that give rise to isotope shifts in atoms, namely mass effects and field
effects.
Mass effects: The mass m that enters the Schrödinger equation is the reduced mass, not the bare electron
mass $m_e$ (cf. eqn 1.9). Changes in the nuclear mass therefore make small changes to m and hence
to the atomic energies.
Field effects: Electrons in s shells have a finite probability of penetrating the nucleus, and are therefore
sensitive to its charge distribution.
Both effects cause small shifts in the wavelengths of the spectral lines from different isotopes of the same
element. The heavy isotope of hydrogen, namely deuterium, was discovered in this way through its mass
effect.
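To give a feel for the size of the mass effect, the sketch below estimates the shift of the hydrogen Balmer-alpha line on going from hydrogen to deuterium. It assumes that the gross-structure energies scale linearly with the reduced mass, and it uses approximate mass ratios and a Balmer-alpha wavelength of 656.28 nm; these numbers and the function names are assumptions introduced for this illustration.

```python
# Electron-to-nucleus mass ratios (approximate values, assumed for illustration).
ME_OVER_MP = 1 / 1836.153   # electron mass / proton mass
ME_OVER_MD = 1 / 3670.483   # electron mass / deuteron mass

def reduced_mass_factor(me_over_M):
    """Reduced mass m in units of m_e: m/m_e = 1 / (1 + m_e/M)."""
    return 1.0 / (1.0 + me_over_M)

def isotope_wavelength_shift(wavelength_H_nm):
    """Wavelength change (nm) of a hydrogen line when H is replaced by D.

    Energies scale with the reduced mass, so wavelengths scale with its
    inverse: lambda_D / lambda_H = m_H / m_D.
    """
    ratio = reduced_mass_factor(ME_OVER_MP) / reduced_mass_factor(ME_OVER_MD)
    return wavelength_H_nm * (ratio - 1.0)

shift = isotope_wavelength_shift(656.28)  # Balmer-alpha line of hydrogen
print(f"H-alpha isotope shift: {shift:.3f} nm")
```

The shift is negative (the deuterium line lies at slightly shorter wavelength) and is a few parts in $10^4$ of the wavelength, which is small but well within reach of a good spectrograph.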
6.7.2 Hyperfine structure
In high-resolution spectroscopy, it is necessary to consider effects relating to the magnetic interaction
between the electron angular momentum (J) and the nuclear spin (I). The angular momentum of the
electrons creates a magnetic field at the nucleus which is proportional to J. The spin of the nucleus gives
it a magnetic dipole moment which is proportional to I, and we therefore have an interaction energy
term of the form:
$$ \Delta E_{hyperfine} = -\boldsymbol{\mu}_{nucleus} \cdot \mathbf{B}_{electron} \propto \langle \mathbf{I} \cdot \mathbf{J} \rangle . \tag{6.48} $$
This gives rise to hyperfine splittings in the atomic terms. The magnitude of the splittings is very small
because the nuclear dipole is about 2000 times smaller than that of the electron. This follows from the
small gyromagnetic ratio of the nucleus, which is inversely proportional to its mass. (See eqn 6.4.) The
splittings are therefore about three orders of magnitude smaller than the fine structure splittings: hence
the name "hyperfine".
Hyperfine states are labelled by the total angular momentum F of the whole atom (i.e. nucleus plus
electrons), where
$$ \mathbf{F} = \mathbf{I} + \mathbf{J} . \tag{6.49} $$
[2] There is only one J state, and hence no fine structure splitting, when one or both of L or S are zero.
Figure 6.7: (a) Hyperfine structure of the 1s ground state of hydrogen (1s $^2$S$_{1/2}$, with F = 1 and F = 0
states split by 1420 MHz). The arrows indicate the relative directions of the electron and nuclear spin.
(b) Hyperfine transitions for the sodium D$_1$ line (3p $^2$P$_{1/2}$ to 3s $^2$S$_{1/2}$). (c) Hyperfine
transitions for the sodium D$_2$ line (3p $^2$P$_{3/2}$ to 3s $^2$S$_{1/2}$). Note that the hyperfine splittings
are not drawn to scale. The splittings of the sodium levels are as follows: $^2$S$_{1/2}$, 1772 MHz;
$^2$P$_{1/2}$, 190 MHz; $^2$P$_{3/2}$ ($3 \to 2$), 59 MHz; $^2$P$_{3/2}$ ($2 \to 1$), 34 MHz;
$^2$P$_{3/2}$ ($1 \to 0$), 16 MHz.
In analogy with the $|LSJ\rangle$ states of fine structure, the electric dipole selection rule for transitions
between hyperfine states is:
$$ \Delta F = 0, \pm 1 , \tag{6.50} $$
with the exception that $F = 0 \to 0$ transitions are forbidden. Let us consider two examples to see how
this works.
The hydrogen 21 cm line
Consider the ground state of hydrogen. The nucleus consists of just a single proton, and we therefore have
I = 1/2. The hydrogen ground state is the 1s $^2$S$_{1/2}$ term, which has J = 1/2. The hyperfine quantum
number F is then found from $F = I \oplus J = 1/2 \oplus 1/2 = 1$ or 0. These two hyperfine states correspond to
the cases in which the spins of the electron and the nucleus are aligned parallel (F = 1) or antiparallel
(F = 0). The two F states are split by the hyperfine interaction by 0.0475 cm$^{-1}$ ($5.9 \times 10^{-6}$ eV). (See
Fig. 6.7(a).) Transitions between these levels occur at 1420 MHz ($\lambda$ = 21 cm), and are very important
in radio astronomy. Radio frequency transitions such as these are also routinely exploited in nuclear
magnetic resonance (NMR) spectroscopy.
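The three ways of quoting the hydrogen hyperfine splitting (frequency, wave number and energy) are related by $E = h\nu = hc\bar{\nu}$. The short sketch below performs the conversion for a frequency of 1420 MHz; the function name `describe_line` is our own, and the constants are standard SI values.

```python
H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
EV = 1.602176634e-19    # 1 eV in joules

def describe_line(frequency_hz):
    """Return (wavelength in cm, wave number in cm^-1, photon energy in eV)."""
    wavelength_cm = 100.0 * C / frequency_hz
    wavenumber = 1.0 / wavelength_cm
    energy_ev = H * frequency_hz / EV
    return wavelength_cm, wavenumber, energy_ev

wavelength_cm, wavenumber, energy_ev = describe_line(1420e6)
print(f"lambda = {wavelength_cm:.1f} cm")
print(f"wave number = {wavenumber:.4f} cm^-1")
print(f"energy = {energy_ev:.2e} eV")
```

The results agree with the values quoted above to within the rounding of the input frequency.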
Hyperfine structure of the sodium D lines
The sodium D lines originate from $3p \to 3s$ transitions. As discussed in Section 6.5, there are two lines
with energies split by the spin-orbit coupling, as indicated in Fig. 6.5(b).
Consider first the lower energy D$_1$ line, which is the $^2$P$_{1/2} \to {}^2$S$_{1/2}$ transition. The nucleus of sodium
has I = 3/2, and so we have $F = 3/2 \oplus 1/2 = 2$ or 1 for both the upper and lower levels of the transition,
as shown in Fig. 6.7(b). Note that the hyperfine splittings are not drawn to scale in Fig. 6.7(b): the
splitting of the $^2$S$_{1/2}$ level is 1772 MHz, which is much larger than that of the $^2$P$_{1/2}$, namely 190 MHz.
This is a consequence of the fact that s-electrons have higher probability densities at the nucleus, and
hence experience stronger hyperfine interactions. All four transitions are allowed by the selection rules,
and so we observe four lines. Since the splittings of the upper and lower levels are so different, we obtain
two doublets with relative frequencies of (0, 190) MHz and (1772, 1962) MHz. These splittings should be
compared to the much larger ($\sim 5 \times 10^{11}$ Hz) splitting between the two J states caused by the spin-orbit
interaction. Since the hyperfine splittings are much smaller, they are not routinely observed in optical
spectroscopy, and specialized techniques using narrow band lasers are typically employed nowadays.
Now consider the higher energy D$_2$ line, which is the $^2$P$_{3/2} \to {}^2$S$_{1/2}$ transition. In the upper level
we have J = 3/2, and hence $F = I \oplus J = 3/2 \oplus 3/2 = 3$, 2, 1, or 0. There are therefore four hyperfine
states for the $^2$P$_{3/2}$ level, as shown in Fig. 6.7(c). The hyperfine splittings of the $^2$P$_{3/2}$ level are again
much smaller than that of the $^2$S$_{1/2}$ level, on account of the low probability density of p-electrons near
the nucleus. Six transitions are allowed by the selection rules, with the $F = 3 \to 1$ and $F = 0 \to 2$
transitions being forbidden by the $|\Delta F| \le 1$ selection rule. We thus have six hyperfine lines, which split
into two triplets. With the level splittings listed in the caption of Fig. 6.7, the lines within the first
triplet are separated by 34 MHz and 59 MHz, giving relative frequencies of (0, 34, 93) MHz, and the
second triplet lies at (1756, 1772, 1806) MHz.
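The line positions for both D lines can be reproduced by enumerating the allowed hyperfine transitions. The sketch below is only an illustration: it takes the level splittings from the caption of Fig. 6.7 and assumes that the energy increases with F within each level. The function name `hyperfine_lines` and the dictionaries are our own constructions.

```python
def hyperfine_lines(upper, lower):
    """List the allowed hyperfine lines between two levels.

    upper, lower: dicts mapping F to the energy offset (MHz) of that
    hyperfine state above the lowest F state of its own level.
    Returns (F_upper, F_lower, frequency) tuples, with frequencies in MHz
    measured from the lowest-frequency line.
    """
    lines = []
    for F_up, E_up in upper.items():
        for F_low, E_low in lower.items():
            # Selection rule of eqn 6.50: Delta F = 0, +-1, but not F = 0 -> 0.
            allowed = abs(F_up - F_low) <= 1 and not (F_up == 0 and F_low == 0)
            if allowed:
                lines.append((F_up, F_low, E_up - E_low))
    lowest = min(freq for _, _, freq in lines)
    return sorted(((Fu, Fl, f - lowest) for Fu, Fl, f in lines), key=lambda t: t[2])

# Offsets in MHz, assuming energy increases with F (assumption for this sketch).
S_HALF = {1: 0, 2: 1772}                         # 3s 2S1/2
P_HALF = {1: 0, 2: 190}                          # 3p 2P1/2
P_THREE_HALVES = {0: 0, 1: 16, 2: 50, 3: 109}    # 3p 2P3/2: intervals 16, 34, 59 MHz

d1 = hyperfine_lines(P_HALF, S_HALF)
d2 = hyperfine_lines(P_THREE_HALVES, S_HALF)
print("D1:", [f for _, _, f in d1])
print("D2:", [f for _, _, f in d2])
```

The D$_1$ list contains four lines arranged as two doublets, and the D$_2$ list contains six lines arranged as two triplets, with the two groups in each case separated by roughly the ground-state splitting.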
Reading
Bransden and Joachain, Atoms, Molecules and Photons, chapter 5, 9.46
Demtröder, Atoms, Molecules and Photons, 5.48.
Haken and Wolf, The physics of atoms and quanta, chapters 12, 20.
Eisberg and Resnick, Quantum Physics, chapters 8, 10.
Foot, Atomic physics, 2.3, 4.56, chapter 6.
Beiser, Concepts of Modern Physics, 7.8.
Chapter 7
External fields: the Zeeman and Stark effects
In the previous chapter, we considered the effects of the internal magnetic fields within atoms. We now
wish to consider the effects of external fields. Table 7.1 defines the nomenclature of the effects that we
shall be considering. We shall start by looking at magnetic fields and then move on to consider electric
fields.

Applied field    Field strength    Effect
Magnetic         weak              Zeeman
Magnetic         strong            Paschen-Back
Electric         all               Stark

Table 7.1: Names of the effects of external fields in atomic physics.
7.1 Magnetic fields
The first person to study the effects of magnetic fields on the optical spectra of atoms was Zeeman in
1896. He observed that the transition lines split when the field is applied. Further work showed that the
interaction between the atoms and the field can be classified into two regimes:
• Weak fields: the Zeeman effect, either normal or anomalous;
• Strong fields: the Paschen-Back effect.
The "normal" Zeeman effect is so-called because it agrees with the classical theory developed by Lorentz.
The "anomalous" Zeeman effect is caused by electron spin, and is therefore a completely quantum result.
The criterion for deciding whether a particular field is "weak" or "strong" will be discussed in Section 7.1.3.
In practice, we usually work in the weak-field (i.e. Zeeman) limit.
7.1.1 The normal Zeeman effect
The normal Zeeman effect is observed in atoms with no spin. The total spin of an N-electron atom is
given by:
$$ \mathbf{S} = \sum_{i=1}^{N} \mathbf{s}_i . \tag{7.1} $$
Filled shells have no net spin, and so we only need to consider the valence electrons here. Since all the
individual electrons have spin 1/2, it will not be possible to obtain S = 0 from atoms with an odd number
of valence electrons. However, if there is an even number of valence electrons, we can obtain S = 0 states.
For example, if we have two valence electrons, then the total spin quantum number $S = 1/2 \oplus 1/2$ can be
either 0 or 1. In fact, the ground states of divalent atoms from group II of the periodic table (electronic
configuration $ns^2$) always have S = 0 because the two electrons align with their spins antiparallel.

Figure 7.1: The normal Zeeman effect. (a) Splitting of the degenerate $m_l$ states of an atomic level with
l = 2 by a magnetic field; adjacent states are separated by $\mu_B B_z$. (b) Definition of longitudinal (Faraday)
and transverse (Voigt) observations. The direction of the field defines the z axis.

The magnetic moment of an atom with no spin will originate entirely from its orbital motion:
$$ \boldsymbol{\mu} = -\frac{\mu_B}{\hbar} \mathbf{L} , \tag{7.2} $$
where $\mu_B/\hbar = e/2m_e$ is the gyromagnetic ratio. (See eqn 6.4.) The interaction energy between a
magnetic dipole $\boldsymbol{\mu}$ and a uniform magnetic field $\mathbf{B}$ is given by:
$$ \Delta E = -\boldsymbol{\mu} \cdot \mathbf{B} . \tag{7.3} $$
We set up the axes of our spherically-symmetric atom so that the z axis coincides with the direction of
the field. In this case we have:
$$ \mathbf{B} = \begin{pmatrix} 0 \\ 0 \\ B_z \end{pmatrix} , $$
and the interaction energy of the atom is therefore:
$$ \Delta E = -\mu_z B_z = \mu_B B_z m_l , \tag{7.4} $$
where $m_l$ is the orbital magnetic quantum number. Equation 7.4 shows us that the application of an
external B-field splits the degenerate $m_l$ states evenly. This is why $m_l$ is called the magnetic quantum
number. The splitting of the $m_l$ states of an l = 2 electron is illustrated in Fig. 7.1(a).
The effect of the magnetic field on the spectral lines can be worked out from the splitting of the levels.
Consider the transitions between two Zeeman-split atomic levels as shown in Fig. 7.2. The selection rules
listed in Table 2.1 of Chapter 2 indicate that we can have transitions with $\Delta m_l = 0$ or $\pm 1$. This gives
rise to three transitions whose frequencies are given by:
$$ h\nu = h\nu_0 + \mu_B B_z \qquad \Delta m_l = -1 , $$
$$ h\nu = h\nu_0 \qquad \Delta m_l = 0 , \tag{7.5} $$
$$ h\nu = h\nu_0 - \mu_B B_z \qquad \Delta m_l = +1 . $$
This is the same result as that derived by classical theory.
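To get a feel for the size of the splitting in eqn 7.5, the following sketch evaluates the shift of each component of the triplet for a given field. The function name `normal_zeeman_triplet` and the choice of a 1 T field are assumptions made for illustration; the constants are standard SI values.

```python
MU_B = 9.2740100783e-24   # Bohr magneton, J/T
H = 6.62607015e-34        # Planck constant, J s
EV = 1.602176634e-19      # 1 eV in joules

def normal_zeeman_triplet(B_tesla):
    """Shift of each component of the normal Zeeman triplet (eqn 7.5).

    Returns a dict mapping Delta m_l to (energy shift in eV, frequency shift in GHz).
    """
    unit = MU_B * B_tesla              # mu_B * B_z in joules
    shifts = {}
    for delta_ml in (-1, 0, +1):
        energy = -delta_ml * unit      # eqn 7.5: h nu = h nu_0 - Delta m_l * mu_B * B_z
        shifts[delta_ml] = (energy / EV, energy / H / 1e9)
    return shifts

for delta_ml, (e_ev, f_ghz) in normal_zeeman_triplet(1.0).items():
    print(f"Delta m_l = {delta_ml:+d}: {e_ev:+.3e} eV, {f_ghz:+.2f} GHz")
```

At 1 T the outer lines are displaced by about 14 GHz, which is tiny compared to an optical frequency of order $5 \times 10^{14}$ Hz. This is why high spectral resolution is needed to observe the effect.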
The polarization of the Zeeman lines is determined by the selection rules, and the conditions of
observation. If we are looking along the field (longitudinal observation), the photons must be propagating in
the z direction. (See Fig. 7.1(b).) Light waves are transverse, and so only the x and y polarizations are
possible. The z-polarized $\Delta m_l = 0$ line is therefore absent, and we just observe the $\sigma^+$ and $\sigma^-$ circularly
polarized $\Delta m_l = \pm 1$ transitions. When observing at right angles to the field (transverse observation),
all three lines are present. The $\Delta m_l = 0$ transition is linearly polarized parallel to the field, while the
$\Delta m_l = \pm 1$ transitions are linearly polarized at right angles to the field. These results are summarized in
Table 7.2.[1]

[1] In solid-state physics, the longitudinal and transverse observation conditions are frequently called the Faraday and
Voigt geometries, respectively.

Figure 7.2: The normal Zeeman effect for a $p \to d$ transition. (a) The field splits the degenerate $m_l$
levels equally. Optical transitions can occur if $\Delta m_l = 0, \pm 1$. (Only the transitions originating from the
$m_l = 0$ level of the l = 1 state are identified here for the sake of clarity.) (b) The spectral line splits into a
triplet when observed transversely to the field. The $\Delta m_l = 0$ transition is unshifted, but the $\Delta m_l = \pm 1$
transitions occur at $(h\nu_0 \mp \mu_B B_z)$.

Delta m_l    Energy                Polarization
                                   Longitudinal observation    Transverse observation
+1           h nu_0 - mu_B B       sigma+                      E perpendicular to B
 0           h nu_0                not observed                E parallel to B
-1           h nu_0 + mu_B B       sigma-                      E perpendicular to B

Table 7.2: The normal Zeeman effect. The last two columns refer to the polarizations observed in
longitudinal (Faraday) and transverse (Voigt) observation conditions. The direction of the circular ($\sigma^\pm$)
polarization in longitudinal observation is defined relative to B. In transverse observation, all lines are
linearly polarized.

Figure 7.3: (a) Slow precession of J around B in the anomalous Zeeman effect. The spin-orbit interaction
causes L and S to precess much more rapidly around J. (b) Definition of the projection angles $\theta_1$ and
$\theta_2$ used in the calculation of the Landé g factor.

7.1.2 The anomalous Zeeman effect
The anomalous Zeeman effect is observed in atoms with non-zero spin. This will include all atoms with
an odd number of electrons. In the LS-coupling regime, the spin-orbit interaction couples the spin and
orbital angular momenta together to give the resultant total angular momentum J according to:
$$ \mathbf{J} = \mathbf{L} + \mathbf{S} . \tag{7.6} $$
The orbiting electrons in the atom are equivalent to a classical magnetic gyroscope. The torque applied
by the field causes the atomic magnetic dipole to precess around B, an effect called Larmor precession.
The external magnetic field therefore causes J to precess slowly about B. Meanwhile, L and S precess
more rapidly about J due to the spin-orbit interaction. This situation is illustrated in Fig. 7.3(a). The
speed of the precession about B is proportional to the field strength. If we turn up the field, the Larmor
precession frequency will eventually be faster than the spin-orbit precession of L and S around J. This
is the point where the behaviour ceases to be Zeeman-like, and we are in the strong field regime of the
Paschen-Back effect.
The interaction energy of the atom is equal to the sum of the interactions of the spin and orbital
magnetic moments with the field:
$$ \Delta E = -\mu_z B_z = -(\mu_z^{spin} + \mu_z^{orbital}) B_z = \langle g_s S_z + L_z \rangle \frac{\mu_B}{\hbar} B_z , \tag{7.7} $$
where $g_s = 2$, and the symbol $\langle \cdots \rangle$ implies, as usual, that we take expectation values. The normal
Zeeman effect is obtained by setting $S_z = 0$ and $L_z = m_l \hbar$ in this formula. In the case of the precessing
atomic magnet shown in Fig. 7.3(a), neither $S_z$ nor $L_z$ is constant. Only $J_z = M_J \hbar$ is well-defined.
We must therefore first project L and S onto J, and then re-project this component onto the z axis.
The effective dipole moment of the atom is therefore given by:
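$$ \boldsymbol{\mu} = -\left\langle |\mathbf{L}| \cos\theta_1 \frac{\mathbf{J}}{|\mathbf{J}|} + 2 |\mathbf{S}| \cos\theta_2 \frac{\mathbf{J}}{|\mathbf{J}|} \right\rangle \frac{\mu_B}{\hbar} , \tag{7.8} $$
where the factor of 2 in the second term comes from the fact that $g_s = 2$. The angles $\theta_1$ and $\theta_2$ that
appear here are defined in Fig. 7.3(b), and can be calculated from the scalar products of the respective
vectors:
$$ \mathbf{L} \cdot \mathbf{J} = |\mathbf{L}| \, |\mathbf{J}| \cos\theta_1 , \qquad \mathbf{S} \cdot \mathbf{J} = |\mathbf{S}| \, |\mathbf{J}| \cos\theta_2 , \tag{7.9} $$
which implies that:
$$ \boldsymbol{\mu} = -\left\langle \frac{\mathbf{L} \cdot \mathbf{J}}{|\mathbf{J}|^2} + 2 \frac{\mathbf{S} \cdot \mathbf{J}}{|\mathbf{J}|^2} \right\rangle \frac{\mu_B}{\hbar} \mathbf{J} . \tag{7.10} $$
Now equation 7.6 implies that $\mathbf{S} = \mathbf{J} - \mathbf{L}$, and hence that:
$$ \mathbf{S} \cdot \mathbf{S} = (\mathbf{J} - \mathbf{L}) \cdot (\mathbf{J} - \mathbf{L}) = \mathbf{J} \cdot \mathbf{J} + \mathbf{L} \cdot \mathbf{L} - 2 \mathbf{L} \cdot \mathbf{J} . $$
We therefore find that:
$$ \mathbf{L} \cdot \mathbf{J} = (\mathbf{J} \cdot \mathbf{J} + \mathbf{L} \cdot \mathbf{L} - \mathbf{S} \cdot \mathbf{S})/2 , $$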
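so that:
$$ \left\langle \frac{\mathbf{L} \cdot \mathbf{J}}{|\mathbf{J}|^2} \right\rangle = \frac{[J(J+1) + L(L+1) - S(S+1)] \hbar^2 / 2}{J(J+1) \hbar^2} = \frac{J(J+1) + L(L+1) - S(S+1)}{2J(J+1)} . \tag{7.11} $$
Similarly:
$$ \mathbf{S} \cdot \mathbf{J} = (\mathbf{J} \cdot \mathbf{J} + \mathbf{S} \cdot \mathbf{S} - \mathbf{L} \cdot \mathbf{L})/2 , $$
and so:
$$ \left\langle \frac{\mathbf{S} \cdot \mathbf{J}}{|\mathbf{J}|^2} \right\rangle = \frac{[J(J+1) + S(S+1) - L(L+1)] \hbar^2 / 2}{J(J+1) \hbar^2} = \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} . \tag{7.12} $$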
We therefore conclude that:
$$ \boldsymbol{\mu} = -\left[ \frac{J(J+1) + L(L+1) - S(S+1)}{2J(J+1)} + 2 \, \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} \right] \frac{\mu_B}{\hbar} \mathbf{J} . \tag{7.13} $$
This can be written in the form:
$$ \boldsymbol{\mu} = -g_J \frac{\mu_B}{\hbar} \mathbf{J} , \tag{7.14} $$
where $g_J$ is the Landé g-factor given by:
$$ g_J = 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} . \tag{7.15} $$
This implies that
$$ \mu_z = -g_J \mu_B M_J , \tag{7.16} $$
and hence that the interaction energy with the field is:
$$ \Delta E = -\mu_z B_z = g_J \mu_B B_z M_J . \tag{7.17} $$
This is the final result for the energy shift of an atomic state in the anomalous Zeeman effect. Note that
we just obtain $g_J = 1$ if S = 0, as we would expect for an atom with only orbital angular momentum.
Similarly, if L = 0 so that the atom only has spin angular momentum, we find $g_J = 2$. Classical theories
always predict $g_J = 1$. The departure of $g_J$ from unity is caused by the spin part of the magnetic moment,
and is a purely quantum effect.
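Equation 7.15 is easy to evaluate with exact fractions. The sketch below is a minimal illustration (the function name `lande_g` and the choice of example levels are our own); it reproduces the limiting cases just mentioned.

```python
from fractions import Fraction

def lande_g(L, S, J):
    """Landé g-factor of eqn 7.15 (LS-coupling, with g_s taken as exactly 2)."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    if J == 0:
        raise ValueError("g_J is undefined for J = 0 (there are no sublevels to split)")
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

half = Fraction(1, 2)
levels = {
    "2P3/2": (1, half, 3 * half),
    "2P1/2": (1, half, half),
    "2S1/2": (0, half, half),   # pure spin: expect g_J = 2
    "1P1":   (1, 0, 1),         # no spin: expect g_J = 1
}
for name, (L, S, J) in levels.items():
    print(f"{name}: g_J = {lande_g(L, S, J)}")
```

Working with `Fraction` rather than floating-point numbers keeps the half-integer quantum numbers exact, so the g-factors come out as simple ratios.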
The spectra can be understood by applying the following selection rules on J and $M_J$:
$$ \Delta J = 0, \pm 1 ; \qquad \Delta M_J = 0, \pm 1 . $$
These rules have to be applied in addition to the $\Delta l = \pm 1$ and $\Delta S = 0$ rules. (See discussion in
Section 4.8.)[2] $\Delta J = 0$ transitions are forbidden when J = 0 for both states, and $M_J = 0 \to 0$ transitions
are forbidden in a $\Delta J = 0$ transition. The transition energy shift is then given by:
$$ \Delta(h\nu) = (h\nu - h\nu_0) = \left( g_J^{upper} M_J^{upper} - g_J^{lower} M_J^{lower} \right) \mu_B B_z , \tag{7.18} $$
where $h\nu_0$ is the transition energy at $B_z = 0$ and the superscripts refer to the upper and lower states
respectively.
The polarizations of the transitions follow the same patterns as for the normal Zeeman effect:
• With longitudinal observation the $\Delta M_J = 0$ transitions are absent and the $\Delta M_J = \pm 1$ transitions
are $\sigma^\pm$ circularly polarized.
• With transverse observation the $\Delta M_J = 0$ transitions are linearly polarized along the z axis (i.e.
parallel to B) and the $\Delta M_J = \pm 1$ transitions are linearly polarized in the x-y plane (i.e.
perpendicular to B).

Level      J      L    S      g_J
2P3/2      3/2    1    1/2    4/3
2P1/2      1/2    1    1/2    2/3
2S1/2      1/2    0    1/2    2

Table 7.3: Landé g-factors evaluated from eqn 7.15 for the levels involved in the sodium D lines.

Figure 7.4: Splitting of the sodium D-lines by a weak magnetic field. Note that the Zeeman splittings
are smaller than the spin-orbit splitting, as must be the case in the weak field limit. (Labels in the
figure: D$_1$ line at 589.6 nm, D$_2$ line at 589.0 nm, $\Delta E_{so}$ = 17 cm$^{-1}$, $\Delta M_J = -1, 0, +1$.)

Example: The sodium D lines
The sodium D lines correspond to the $3p \to 3s$ transition. At B = 0, the spin-orbit interaction splits the
upper 3p $^2$P term into the $^2$P$_{3/2}$ and $^2$P$_{1/2}$ levels separated by 17 cm$^{-1}$. The lower $^2$S$_{1/2}$ level has no
spin-orbit interaction. The Landé g-factors of the levels worked out from eqn 7.15 are given in Table 7.3.
The splitting of the lines in the field is shown schematically in Fig. 7.4. The $^2$P$_{3/2}$ level splits into
four $M_J$ states, while the two J = 1/2 levels each split into two states. The splittings are different for
each level because of the different Landé factors. On applying the $\Delta M_J = 0, \pm 1$ selection rule, we find
four allowed transitions for the D$_1$ line and six for the D$_2$. These transitions are listed in Table 7.4.
The results tabulated in Table 7.4 can be compared to those predicted by the normal Zeeman effect. In
the normal Zeeman effect we observe three lines with an energy spacing equal to $\mu_B B$. In the anomalous
effect, there are more than three lines, and the spacing is different to the classical value: in fact, the lines
are not evenly spaced. Furthermore, none of the lines occur at the same frequency as the unperturbed
line at B = 0.
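The entries of Table 7.4 follow mechanically from eqn 7.18 and the $\Delta M_J$ selection rule, as the following sketch illustrates. The helper names `m_values` and `zeeman_lines` are our own, the g-factors are those of Table 7.3, and $\Delta M_J$ is taken here as $M_J^{lower} - M_J^{upper}$, which is the sign convention that reproduces the table.

```python
from fractions import Fraction

half = Fraction(1, 2)

def m_values(J):
    """Magnetic sublevels M_J = +J, +J-1, ..., -J."""
    return [J - k for k in range(int(2 * J) + 1)]

def zeeman_lines(g_upper, J_upper, g_lower, J_lower):
    """Allowed transitions and their energy shifts in units of mu_B * B_z.

    Returns tuples (M_upper, M_lower, Delta M_J, shift), with the shift from eqn 7.18.
    """
    rows = []
    for M_up in m_values(J_upper):
        for M_low in m_values(J_lower):
            if abs(M_low - M_up) <= 1:          # Delta M_J = 0, +-1
                shift = g_upper * M_up - g_lower * M_low
                rows.append((M_up, M_low, M_low - M_up, shift))
    return rows

G_P32, G_P12, G_S12 = Fraction(4, 3), Fraction(2, 3), Fraction(2)
d1 = zeeman_lines(G_P12, half, G_S12, half)
d2 = zeeman_lines(G_P32, 3 * half, G_S12, half)

for name, rows in (("D1", d1), ("D2", d2)):
    for M_up, M_low, dM, shift in rows:
        print(f"{name}: {M_up} -> {M_low}, Delta M_J = {dM}, shift = {shift}")
```

The output contains four D$_1$ lines and six D$_2$ lines, none of which has zero shift, in line with the remarks above.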

M_J (upper)    M_J (lower)    Delta M_J    Transition energy shift
                                           D1 line      D2 line
+3/2           +1/2           -1           none         +1
+1/2           +1/2            0           -2/3         -1/3
+1/2           -1/2           -1           +4/3         +5/3
-1/2           +1/2           +1           -4/3         -5/3
-1/2           -1/2            0           +2/3         +1/3
-3/2           -1/2           +1           none         -1

Table 7.4: Anomalous Zeeman effect for the sodium D lines. The transition energy shifts are worked out
from eqn 7.18 and are quoted in units of $\mu_B B_z$.

[2] There are no selection rules on $M_L$ and $M_S$ here because $L_z$ and $S_z$ are not constants of the motion when L and S are
coupled by the spin-orbit interaction.

7.1.3 The Paschen-Back effect
The Paschen-Back effect is observed at very strong magnetic fields. The criterion for observing the
Paschen-Back effect is that the interaction with the external magnetic field should be much stronger than
the spin-orbit interaction:
$$ \mu_B B_z \gg \Delta E_{so} . \tag{7.19} $$
If we satisfy this criterion, then the precession speed around the external field will be much faster than
the spin-orbit precession. This means that the interaction with the external field is now the largest
perturbation, and so it should be treated first, before the perturbation of the spin-orbit interaction.
Another way to think of the strong-field limit is that it occurs when the external field is much stronger
than the internal field of the atom arising from the orbital motion. We saw in Section 6.3 that the internal
fields in most atoms are large. For example, the Bohr model predicts an internal field of 12 T for the
n = 1 shell of hydrogen. (See eqn 6.16.) This is a very strong field, which can only be obtained in the
laboratory by using powerful superconducting magnets. This internal field strength is typical of many
atoms, and so it will frequently be the case that the field required to observe the Paschen-Back effect is so
large that we never go beyond the Zeeman regime in the laboratory.[3] For example, in sodium, the field
strength equivalent to the spin-orbit interaction for the D-lines is given by:
$$ B_z = \frac{\Delta E_{so}}{\mu_B} = \frac{17 \ \mathrm{cm}^{-1}}{9.27 \times 10^{-24} \ \mathrm{J\,T^{-1}}} = 36 \ \mathrm{T} , $$
where the splitting has to be converted from cm$^{-1}$ to joules by multiplying by hc. This field is not
achievable in normal laboratory conditions. On the other hand, since the spin-orbit interaction
decreases with decreasing atomic number Z, the splitting for the equivalent transition in lithium
with Z = 3 (i.e. the $2p \to 2s$ transition) is only 0.3 cm$^{-1}$. This means that we can reach the strong field
regime for fields $\gg$ 0.6 T. This is readily achievable, and allows the Paschen-Back effect to be observed.
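The two field values quoted above can be reproduced with a few lines of code. This is only a sketch of the arithmetic: the function name `crossover_field` is our own, and the spin-orbit splittings are the ones given in the text.

```python
H = 6.62607015e-34        # Planck constant, J s
C_CM = 2.99792458e10      # speed of light, cm/s
MU_B = 9.2740100783e-24   # Bohr magneton, J/T

def crossover_field(delta_E_so_wavenumber):
    """Field (T) at which mu_B * B_z equals a spin-orbit splitting given in cm^-1."""
    energy_joules = H * C_CM * delta_E_so_wavenumber   # E = h c nu_bar
    return energy_joules / MU_B

print(f"sodium  (17 cm^-1):  {crossover_field(17.0):.1f} T")
print(f"lithium (0.3 cm^-1): {crossover_field(0.3):.2f} T")
```

The strong-field regime requires fields well above these crossover values, since eqn 7.19 is a "much greater than" condition.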
In the Paschen-Back effect, the spin-orbit interaction is assumed to be negligibly small, and L and S
are therefore no longer coupled together. Each precesses separately around B, as sketched in Fig. 7.5.
The precession rates for L and S are different because of the different g-values. Hence the magnitude of
the resultant J varies with time: the quantum number J is no longer a constant of the motion.
The interaction energy is now calculated by adding the separate contributions of the spin and orbital
energies:
$$ \Delta E = -\mu_z B_z = -(\mu_z^{orbital} + \mu_z^{spin}) B_z = (M_L + g_s M_S) \, \mu_B B_z . \tag{7.20} $$
The shift of the spectral lines is given by:
$$ \Delta(h\nu) = (\Delta M_L + g_s \Delta M_S) \, \mu_B B_z . \tag{7.21} $$
We have noted before that optical transitions do not affect the spin, and so we must have $\Delta M_S = 0$. The
frequency shift is thus given by:
$$ \Delta(h\nu) = \mu_B B_z \, \Delta M_L , \tag{7.22} $$
where $\Delta M_L = 0$ or $\pm 1$. In other words, we revert to the normal Zeeman effect.
[3] There are extremely large magnetic fields present in the Sun due to the circulating plasma currents. This means that
the Paschen-Back effect can be observed for elements like sodium in solar spectra.

Figure 7.5: Precession of L and S around B in the Paschen-Back effect.

Putting it all together
The change of the spectra as we increase B from zero is illustrated for the $p \to s$ transitions of an alkali
atom in Fig. 7.6. At B = 0 the lines are split by the spin-orbit interaction. At weak fields we observe
the anomalous Zeeman effect, while at strong fields we change to the Paschen-Back effect.

Figure 7.6: Schematic progression of the optical spectra for the $p \to s$ transitions of an alkali atom with
increasing field (panels for B = 0, weak B and strong B, plotted against photon energy; the B = 0 lines are
separated by $\Delta E_{so}$ and the strong-field lines by $\mu_B B$).

7.1.4 Magnetic field effects for hyperfine levels
Everything we have said so far has ignored the hyperfine structure of the atom. The whole process can
be repeated to calculate the Zeeman and Paschen-Back energy shifts for the hyperfine levels. In this
case, the energy splittings at B = 0 are much smaller, due to the much smaller gyromagnetic ratio of the
nucleus compared to the electron. (See Section 6.7.2.) This implies that the change from the weak-field
to the strong-field limit occurs at much smaller field strengths than for the states split by fine-structure
interactions. We shall not consider the hyperfine states further in this course.
7.2 The concept of "good" quantum numbers
It is customary to refer to quantum numbers that relate to constants of the motion as "good" quantum
numbers. In this discussion of the effects of magnetic fields, we have used six different quantum numbers
to describe the angular momentum state of the atom: $J$, $M_J$, $L$, $M_L$, $S$, $M_S$. However, we cannot know
all of these at the same time. In fact, we can only know four: $(L, S, J, M_J)$ in the weak-field limit, or
$(L, S, M_L, M_S)$ in the strong-field limit. In the weak-field limit, $L_z$ and $S_z$ are not constant, which implies
that $J$ and $M_J$ are good quantum numbers but $M_L$ and $M_S$ are not. Similarly, in the strong-field limit,
the coupling between L and S is broken and so $J$ and $J_z$ are not constants of the motion: $M_L$ and $M_S$
are good quantum numbers, but $J$ and $M_J$ are not.
A similar type of argument applies to the two angular momentum coupling schemes discussed in
Section 4.6, namely LS-coupling and jj-coupling. As an example, consider the total angular momentum
state of a two electron atom. In the LS-coupling scheme, we specify $(L, S, J, M_J)$, whereas in the
jj-coupling scheme we have $(j_1, j_2, J, M_J)$. In both cases, we have four good quantum numbers, which tell
us the precisely measurable quantities. The other quantum numbers are unknown because the physical
quantities they represent are not constant. In LS-coupling we cannot know the j values of the individual
electrons because the residual electrostatic potential overpowers the spin-orbit effect, whereas in the
jj-coupling scheme we cannot know L and S. Note, however, that $J$ and $M_J$ are good quantum numbers
in both coupling limits. This means that we can always describe the Zeeman energy of the atom by
eqn 7.17, although in the case of jj-coupling, the formula for the $g_J$ factor given in eqn 7.15 will not be
valid because L and S are not good quantum numbers.
7.3 Nuclear magnetic resonance
Everything that has been covered so far in this chapter applies to the electrons in the atom. However,
a discussion of the Zeeman effect would not be complete if we did not at least mention the interaction
of an external magnetic field with the nucleus. As noted in Section 6.7.2, the nucleus has spin, and this
gives it a magnetic dipole moment. In analogy with eqn 7.14, the nuclear dipole moment is written:
$$ \boldsymbol{\mu}_{nucleus} = g_I \frac{\mu_N}{\hbar} \mathbf{I} , \tag{7.23} $$
where $g_I$ is the nuclear g-factor, and $\mu_N = e\hbar/2m_p$ is the nuclear magneton. I is the nuclear spin angular
momentum, which is assumed to be quantised in the usual way, so that $|\mathbf{I}| = \sqrt{I(I+1)} \, \hbar$ and $I_z = M_I \hbar$,
with $M_I$ running in integer steps from $-I$ to $+I$. Note that the omission of the minus sign in eqn 7.23
is deliberate, as nuclei are positively charged.
If an external magnetic field is applied along the z direction, the energy of the nucleus will shift by:
$$ \Delta E = -\boldsymbol{\mu}_{nucleus} \cdot \mathbf{B} = -\mu_z^{nucleus} B_z . \tag{7.24} $$
On substituting from eqn 7.23, the Zeeman energy becomes:
$$ \Delta E = -g_I \frac{\mu_N}{\hbar} I_z B_z = -g_I \mu_N B_z M_I . \tag{7.25} $$
In magnetic resonance experiments, a radio-frequency (RF) electromagnetic field is applied to induce
magnetic-dipole transitions between the Zeeman-split levels. The angular momentum of the nucleus
changes by one unit when the photon is absorbed, so that the selection rule is $\Delta M_I = \pm 1$.[4] The energy
of the photon required to induce this transition is thus given by:
$$ h\nu = g_I \mu_N B_z . \tag{7.26} $$
The resonance is detected either by scanning $\nu$ at fixed $B_z$, or by scanning $B_z$ at fixed $\nu$.
In the magnetic resonance systems used in medical imaging, the RF photons are brought to resonance
with the hydrogen atoms or ions in the body. The g factor of the proton is 5.586, which implies that
$\nu$ = 42.6 MHz at a field of 1 T. The non-obvious value of the g value is a consequence of the internal
structure of the proton. Magnetic resonance can also be observed from other nuclei in a variety of liquid
and solid-state environments, and this gives rise to a host of techniques used especially in chemistry and
biology to obtain information about the structure and bonding of molecules.
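The proton resonance frequency quoted above follows directly from eqn 7.26. The sketch below evaluates it for a few field strengths; the function name `nmr_frequency_mhz` and the choice of fields are our own, and the proton g-factor is the value given in the text.

```python
MU_N = 5.0507837461e-27   # nuclear magneton, J/T
H = 6.62607015e-34        # Planck constant, J s
G_PROTON = 5.586          # proton g-factor quoted in the text

def nmr_frequency_mhz(g_I, B_tesla):
    """Resonance frequency from eqn 7.26, h nu = g_I mu_N B_z, in MHz."""
    return g_I * MU_N * B_tesla / H / 1e6

for B in (1.0, 1.5, 3.0):   # example field strengths in tesla (illustrative choices)
    print(f"B = {B} T: nu = {nmr_frequency_mhz(G_PROTON, B):.1f} MHz")
```

Since the frequency is linear in the field, the resonance always lies in the radio-frequency range for fields of order a tesla.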
7.4 Electric fields
In the case of electric fields, the weak and strong field limits are not normally distinguished, and all the
phenomena are collectively called the Stark effect. These effects are named after J. Stark, who was the
first person to study the effect of electric fields on atomic spectral lines, when he measured the splitting
of the hydrogen Balmer lines in an electric field in 1913. In most atoms we observe the quadratic Stark
effect and we therefore consider this effect first. We then move on to consider the linear Stark effect,
which is observed for the excited states of hydrogen, and in other atoms at very strong fields. The Stark
shift of an atom is harder to observe than the Zeeman shift, which explains why magnetic effects are
more widely studied in atomic physics. However, large Stark effects are readily observable in solid state
physics, and we therefore conclude by briefly considering the quantum-confined Stark effect.
[4] In magnetic-dipole transitions, the parity of the initial and final states does not change. (See section 2.5.) The photon
interacts with the magnetic dipole of the nucleus, since its electric dipole is zero.

Figure 7.7: Effect of an electric field $\boldsymbol{\mathcal{E}}$ on the electron cloud of an atom. (a) When $\mathcal{E} = 0$, the
negatively-charged electron cloud is arranged symmetrically about the positive nucleus, and there is no electric dipole.
(b) When the electric field is applied, the electron cloud is displaced, and a net dipole p parallel to the field is
induced.

7.4.1 The quadratic Stark effect
Most atoms show a small red shift (i.e. a shift to lower energy) which is proportional to the square of the
electric field. This phenomenon is therefore called the quadratic Stark effect. The energy of an atom in
an electric field $\boldsymbol{\mathcal{E}}$ is given by
$$ E = -\mathbf{p} \cdot \boldsymbol{\mathcal{E}} , \tag{7.27} $$
where p is the electric dipole of the atom. We can understand the quadratic Stark effect intuitively with
reference to Fig. 7.7. The negatively-charged electron clouds of an atom are spherically symmetric about
the positively-charged nucleus in the absence of applied fields. A charged sphere acts like a point charge
at its centre, and it is thus apparent that atoms do not normally possess a dipole moment, as shown in
Fig. 7.7(a). When a field is applied, the electron cloud and the nucleus experience opposite forces, which
results in a net displacement of the electron cloud with respect to the nucleus, as shown in Fig. 7.7(b).
This creates a dipole p which is parallel to $\boldsymbol{\mathcal{E}}$ and whose magnitude is proportional to $|\boldsymbol{\mathcal{E}}|$. This can be
expressed mathematically by writing:
$$ \mathbf{p} = \alpha \boldsymbol{\mathcal{E}} , \tag{7.28} $$
where $\alpha$ is the polarizability of the atom. The energy shift of the atom is found by calculating the
energy change on increasing the field strength from zero:
$$ \Delta E = -\int_0^{\mathcal{E}} \mathbf{p} \cdot d\boldsymbol{\mathcal{E}}' = -\int_0^{\mathcal{E}} \alpha \mathcal{E}' \, d\mathcal{E}' = -\frac{1}{2} \alpha \mathcal{E}^2 , \tag{7.29} $$
which predicts a quadratic red shift, as required. The magnitude of the red shift is generally rather small.
This is because the electron clouds are tightly bound to the nucleus, and it therefore requires very strong
electric fields to induce a significant dipole.
We can understand the quadratic Stark effect in more detail by applying perturbation theory.[5] The
perturbation caused by the field is of the form:
$$ H' = -\sum_i (-e \mathbf{r}_i) \cdot \boldsymbol{\mathcal{E}} = e \mathcal{E} \sum_i z_i , \tag{7.30} $$
where the field is assumed to point in the +z direction. In principle, the sum is over all the electrons,
but in practice, we need only consider the valence electrons, because the electrons in closed shells are
very strongly bound to the nucleus and are therefore very hard to perturb. In writing eqn 7.30, we take,
as always, $\mathbf{r}_i$ to be the relative displacement of the electron with respect to the nucleus.
For simplicity, we shall just consider the case of alkali atoms which possess only one valence electron.
In this case, the perturbation to the valence electron caused by the field reduces to:
$$ H' = e \mathcal{E} z . \tag{7.31} $$

[5] Many of you will not have done perturbation theory yet, as it is normally first encountered in detail in course PHY309,
which is taken in the second semester. You will therefore have to take the results presented here on trust.
The first-order energy shift is given by:
$$ \Delta E = \langle \psi | H' | \psi \rangle = e \mathcal{E} \langle \psi | z | \psi \rangle , \tag{7.32} $$
where
$$ \langle \psi | z | \psi \rangle = \iiint_{\text{all space}} \psi^* z \, \psi \, d^3\mathbf{r} . \tag{7.33} $$
Now unperturbed atomic states have definite parities. (See discussion in Section 2.4.) The product
$\psi^* \psi = |\psi|^2$ is therefore an even function, while z is an odd function. It is therefore apparent that
$$ \langle \psi | z | \psi \rangle = \iiint_{\text{all space}} (\text{even function}) \times (\text{odd function}) \, d^3\mathbf{r} = 0 . $$
The first-order energy shift is therefore zero, which explains why the energy shift is quadratic in the field,
rather than linear.
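The parity argument can be illustrated numerically. The sketch below uses hydrogen 1s and 2p ($m_l = 0$) wave functions as a stand-in (an assumption made purely for illustration, since the argument depends only on parity and not on the details of the atom) and evaluates matrix elements of z in units of the Bohr radius. All function names are our own.

```python
import math

def integrate(f, a, b, n=4000):
    """Composite midpoint rule on [a, b] with n intervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Hydrogen wave functions in units of the Bohr radius (real, m_l = 0 states).
def R_1s(r): return 2.0 * math.exp(-r)
def R_2p(r): return r * math.exp(-r / 2.0) / (2.0 * math.sqrt(6.0))
def Y_s(theta): return math.sqrt(1.0 / (4.0 * math.pi))
def Y_p(theta): return math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta)

def z_matrix_element(R_a, Y_a, R_b, Y_b):
    """<a| z |b> with z = r cos(theta); the phi integral contributes a factor 2 pi."""
    radial = integrate(lambda r: R_a(r) * r * R_b(r) * r**2, 0.0, 40.0)
    angular = integrate(
        lambda t: Y_a(t) * math.cos(t) * Y_b(t) * math.sin(t), 0.0, math.pi)
    return 2.0 * math.pi * radial * angular

same_parity = z_matrix_element(R_1s, Y_s, R_1s, Y_s)      # <1s| z |1s>
opposite_parity = z_matrix_element(R_1s, Y_s, R_2p, Y_p)  # <1s| z |2p>
print(f"<1s|z|1s> = {same_parity:.2e} a0")
print(f"<1s|z|2p> = {opposite_parity:.4f} a0")
```

The diagonal element vanishes to within rounding error, as required for a first-order shift of zero, while the element between states of opposite parity is finite and agrees with the analytic value $128\sqrt{2}/243 \approx 0.745$ Bohr radii.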
The quadratic energy shift can be calculated by second-order perturbation theory. In general, the
energy shift of the ith state predicted by second-order perturbation theory is given by:
$$ \Delta E_i = \sum_{j \neq i} \frac{|\langle \psi_i | H' | \psi_j \rangle|^2}{E_i - E_j} , \tag{7.34} $$
where the summation runs over all the other states of the system, and $E_i$ and $E_j$ are the unperturbed
energies of the states. The condition of validity is that the magnitude of the perturbation, namely
$|\langle \psi_i | H' | \psi_j \rangle|$, should be small compared to the unperturbed energy splittings. For the Stark shift of the
valence electron of an alkali atom, this becomes:
$$ \Delta E_i = e^2 \mathcal{E}^2 \sum_{j \neq i} \frac{|\langle \psi_i | z | \psi_j \rangle|^2}{E_i - E_j} . \tag{7.35} $$
We see immediately that the shift is expected to be quadratic in the field, which is indeed the case for most
atoms.
As a specific example, we consider sodium, which has a single valence electron in the 3s shell. We first consider the ground state 3s $^2S_{1/2}$ term. The summation in eqn 7.35 runs over all the excited states of sodium, namely the 3p, 3d, 4s, 4p, ... states. Now in order that the matrix element $\langle \psi_i | z | \psi_j \rangle$ should be non-zero, it is apparent that the states $i$ and $j$ must have opposite parities. In this case, we would have:
\[
\langle \psi_i | z | \psi_j \rangle = \iiint_{\text{all space}} (\text{even/odd parity}) \times (\text{odd parity}) \times (\text{odd/even parity}) \, d^3\mathbf{r} \neq 0 ,
\]
since the integrand is an even function. On the other hand, if the states have the same parities, we have:
\[
\langle \psi_i | z | \psi_j \rangle = \iiint_{\text{all space}} (\text{even/odd parity}) \times (\text{odd parity}) \times (\text{even/odd parity}) \, d^3\mathbf{r} = 0 ,
\]
since the integrand is an odd function. Since the parity varies as $(-1)^l$, the s and d states do not contribute to the Stark shift of the 3s state, and the summation in eqn 7.35 is only over the p and f excited states. Owing to the energy difference factor in the denominator, the largest perturbation to the 3s state will arise from the first excited state, namely the 3p state. Since this lies above the 3s state, the energy difference in the denominator is negative, and the energy shift is therefore negative. Indeed, it is apparent that the quadratic Stark shift of the ground state of an atom will always be negative, since the denominator will be negative for all the available states of the system. This implies that the Stark effect will always correspond to a red shift for the ground state level.
There is no easy way to calculate the size of the energy shift, but we can give a rough order of magnitude estimate. If we neglect the contributions of the odd parity excited states above the 3p state, the energy shift will be given by:
\[
\Delta E_{3s} \approx - e^2 \mathcal{E}^2 \, \frac{|\langle \psi_{3s} | z | \psi_{3p} \rangle|^2}{E_{3p} - E_{3s}} .
\]
The magnitude of the matrix element of $z$ must be smaller than $a$, where $a$ is the atomic radius of sodium, namely 0.18 nm. Hence with $E_{3p} - E_{3s} = 2.1$ eV, we then have:
\[
\Delta E_{3s} \sim - \frac{e^2 a^2}{E_{3p} - E_{3s}} \, \mathcal{E}^2 ,
\]
which implies from eqn 7.29 that $\alpha_{3s} \sim 3.2 \times 10^{-20}$ eV m$^2$ V$^{-2}$. This predicts a shift of $\sim -1 \times 10^{-5}$ eV ($-0.08$ cm$^{-1}$) in a field of $2.5 \times 10^{7}$ V/m, which compares reasonably well with the experimental value of $-0.6 \times 10^{-5}$ eV ($-0.05$ cm$^{-1}$).

Figure 7.8: (a) Shift of the $^2S_{1/2}$, $^2P_{1/2}$, and $^2P_{3/2}$ terms of an alkali atom in an electric field. Note that the red shifts of the upper levels are larger than that of the lower level. (b) Red shift of the D$_1$ ($^2P_{1/2} \to {}^2S_{1/2}$) and the D$_2$ ($^2P_{3/2} \to {}^2S_{1/2}$) lines in the field.
The order of magnitude calculation given above can also provide a useful estimation of the field strength at which the second-order perturbation approximation breaks down. As mentioned above, this will occur when the magnitude of the perturbation becomes comparable to the unperturbed energy splitting, that is when:
\[
e \mathcal{E} \, |\langle \psi_{3s} | z | \psi_{3p} \rangle| \sim (E_{3p} - E_{3s}) .
\]
On setting $|\langle \psi_{3s} | z | \psi_{3p} \rangle| = a$ as before, we find $\mathcal{E} \sim 10^{10}$ V/m, which is an extremely large field. The second-order perturbation approach will therefore be a good approximation in most practical situations.
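The sodium estimates above are simple enough to check numerically. The following is a minimal sketch, not part of the original notes: it takes the quoted values $a = 0.18$ nm and $E_{3p} - E_{3s} = 2.1$ eV, sets the matrix element equal to $a$, and works in eV so that the factor of $e$ drops out. The function names are illustrative. With $a$ taken as exactly 0.18 nm the polarizability comes out slightly below the rounded value quoted in the text.

```python
# Order-of-magnitude estimate of the quadratic Stark shift of the Na 3s level.
# Energies are in eV and lengths in metres, so e * field * length is simply
# (field in V/m) * (length in m) when expressed in eV.

A_NA = 0.18e-9             # atomic radius of sodium (m), value quoted in the text
GAP_3S_3P = 2.1            # E_3p - E_3s (eV), value quoted in the text
EV_TO_WAVENUMBER = 8066.0  # cm^-1 per eV


def polarizability_estimate(a=A_NA, gap=GAP_3S_3P):
    """alpha ~ 2 e^2 a^2 / (E_3p - E_3s), in eV m^2 V^-2 (compare eqn 7.29)."""
    return 2.0 * a**2 / gap


def stark_shift(field, alpha):
    """Quadratic shift -alpha * field^2 / 2 in eV, for a field in V/m."""
    return -0.5 * alpha * field**2


def breakdown_field(a=A_NA, gap=GAP_3S_3P):
    """Field (V/m) at which e * field * a equals the 3s-3p splitting."""
    return gap / a


alpha = polarizability_estimate()
shift = stark_shift(2.5e7, alpha)
print(f"alpha ~ {alpha:.2e} eV m^2 V^-2")
print(f"shift ~ {shift:.1e} eV = {shift * EV_TO_WAVENUMBER:.2f} cm^-1")
print(f"breakdown field ~ {breakdown_field():.1e} V/m")
```

The three printed numbers should be of the same order as the values in the text: a polarizability of about $3 \times 10^{-20}$ eV m$^2$ V$^{-2}$, a shift of about $-1 \times 10^{-5}$ eV, and a breakdown field of about $10^{10}$ V/m.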
Now consider the Stark shift of the 3p state. The 3p state has odd parity, and so the non-zero contributions in eqn 7.35 will now arise from the even parity $n$s and $n$d states:
\[
\Delta E_{3p} = e^2 \mathcal{E}^2 \left( \frac{|\langle \psi_{3p} | z | \psi_{3s} \rangle|^2}{E_{3p} - E_{3s}} + \frac{|\langle \psi_{3p} | z | \psi_{3d} \rangle|^2}{E_{3p} - E_{3d}} + \frac{|\langle \psi_{3p} | z | \psi_{4s} \rangle|^2}{E_{3p} - E_{4s}} + \cdots \right) .
\]
The first term gives a positive shift, while all subsequent terms are negative. Therefore, it is not immediately obvious that the Stark shift of excited states like the 3p state will be negative. However, since the energy difference of the excited states tends to get smaller as we go up the ladder of levels, it will generally be the case that the negative terms dominate, and we have a red shift as for the ground state. Moreover, the red shift is generally expected to be larger than that of the ground state for the same reason (i.e. the smaller denominator). In the case of the 3p state of sodium, the largest contribution comes from the 3d state which lies 1.51 eV above the 3p state, even though the 4s state is closer (relative energy +1.09 eV). This is because of the smaller value of the matrix element for the s states.
Explicit evaluation of the matrix elements indicates that the Stark shift at a given field strength depends on $M_J^2$. This means that electric fields do not completely break the degeneracy of the $M_J$ sub-levels of a particular $|J\rangle$ term. This contrasts with the Zeeman effect, where the energy shift is proportional to $M_J$, and the degeneracy is fully lifted. The Stark shift of the sodium D lines is shown schematically in Fig. 7.8. All states are shifted to lower energy, with those of the same $|M_J|$ values being shifted equally for a given level, as indicated in Fig. 7.8(a). The shifts of the upper 3p levels are larger than that of the lower 3s $^2S_{1/2}$ term, and both spectral lines therefore show a net shift to lower energy, as indicated in Fig. 7.8(b). Owing to the degeneracy of the sub-levels with the same $|M_J|$, the D$_1$ ($^2P_{1/2} \to {}^2S_{1/2}$) line does not split, while the D$_2$ ($^2P_{3/2} \to {}^2S_{1/2}$) line splits into a doublet.
An interesting consequence of the perturbation caused by the electric field is that the unperturbed atomic states get mixed with other states of the opposite parity. For example, the 3s state has even parity at $\mathcal{E} = 0$, but acquires a small admixture of the odd parity 3p state as the field is increased. This means that parity forbidden transitions (e.g. s $\to$ s, p $\to$ p, d $\to$ s, etc.) become weakly allowed as the field is increased. Since we are dealing with a second-order perturbation, the intensity of these forbidden transitions increases in proportion to $\mathcal{E}^2$.
7.4.2 The linear Stark effect
Stark's original experiment of 1913 was performed on the Balmer lines of hydrogen.$^6$ In contrast to what has been discussed in the previous subsection, the shift was quite large, and varied linearly with the field. The reason for this is that the $l$ states of hydrogen are degenerate. This means that we have states of opposite parities with the same energy, so that the second-order energy shift given by eqn 7.35 diverges. We therefore have to take a new approach to calculate the Stark shift by employing degenerate perturbation theory.
Consider first the 1s ground state of hydrogen. This level is unique, and hence the second-order perturbation approach is valid. A small quadratic red shift therefore occurs, as discussed in the previous subsection.
Now consider the $n = 2$ shell, which has four levels, namely the $m = 0$ level from the 2s term, and the $m = -1$, 0, and $+1$ levels of the 2p term. In the absence of an applied field, these four levels are degenerate. If the atom is in the $n = 2$ shell, it is equally likely to be in any of the four degenerate levels. We must therefore write its wave function as:
\[
\psi_{n=2} = \sum_{i=1}^{4} c_i \psi_i , \tag{7.36}
\]
where the subscript $i$ identifies the quantum numbers $\{n, l, m\}$, that is:
\[
\psi_1 \equiv \psi_{2,0,0} ; \quad \psi_2 \equiv \psi_{2,1,-1} ; \quad \psi_3 \equiv \psi_{2,1,0} ; \quad \psi_4 \equiv \psi_{2,1,+1} .
\]
The first-order energy shift from eqn 7.32 becomes:
\[
\Delta E = e \mathcal{E} \sum_{i,j} c_i^* c_j \langle \psi_i | z | \psi_j \rangle . \tag{7.37}
\]
Unlike the case of the ground state, we can see from parity arguments that some of the matrix elements are non-zero. For example, $\psi_1$ has even parity, but $\psi_3$ has odd parity. We therefore have:
\[
\langle \psi_1 | z | \psi_3 \rangle = \iiint_{\text{all space}} \psi_1^* z \, \psi_3 \, d^3\mathbf{r} = \iiint_{\text{all space}} (\text{even parity}) \times (\text{odd parity}) \times (\text{odd parity}) \, d^3\mathbf{r} \neq 0 .
\]
This implies that we can observe a linear shift of the levels with the field. It turns out that $\langle \psi_1 | z | \psi_3 \rangle$ is the only non-zero matrix element. This is because the perturbation $H' = e \mathcal{E} z$ commutes with $\hat{L}_z$, so that the only non-zero matrix elements are those between states of opposite parity with the same value of $m$, namely the $m = 0$ levels of the 2s and 2p terms. Explicit evaluation gives:
\[
\langle \psi_1 | z | \psi_3 \rangle = -3 a_0 ,
\]
where $a_0$ is the Bohr radius of hydrogen. We then find by degenerate perturbation theory that the field splits the $n = 2$ shell into a triplet, with energies of $-3 e a_0 \mathcal{E}$, 0, and $+3 e a_0 \mathcal{E}$ with respect to the unperturbed level. Note that this shift is linear in the field and has a much larger magnitude than that calculated for the quadratic Stark effect. For example, at $\mathcal{E} = 2.5 \times 10^{7}$ V/m, we find shifts of $\pm 4 \times 10^{-3}$ eV ($\pm 32$ cm$^{-1}$), which are more than two orders of magnitude larger than the shifts of the levels in sodium at the same field strength. This, of course, explains why the linear Stark effect in hydrogen was the first electric-field induced phenomenon to be discovered.
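The triplet splitting of the $n = 2$ shell can be made concrete with a small numerical sketch. The code below is an illustration rather than the notes' own calculation: it builds the $4 \times 4$ perturbation matrix in the basis $(\psi_1, \psi_2, \psi_3, \psi_4)$, assuming that $\langle \psi_1 | z | \psi_3 \rangle = -3a_0$ is the only non-zero element, and checks that the combinations $(\psi_1 \mp \psi_3)/\sqrt{2}$ are eigenstates with shifts $\pm 3 e a_0 \mathcal{E}$. The Bohr radius value and all names are assumptions of the sketch.

```python
import math

A0 = 5.29e-11               # Bohr radius (m)
EV_TO_WAVENUMBER = 8066.0   # cm^-1 per eV


def perturbation_matrix(field):
    """H' = e*field*z in the basis (2s; 2p m=-1; 2p m=0; 2p m=+1), in eV.

    Only the 2s <-> 2p(m=0) element is taken to be non-zero.
    """
    w = -3.0 * A0 * field                 # e * field * <psi_1|z|psi_3> in eV
    h = [[0.0] * 4 for _ in range(4)]
    h[0][2] = h[2][0] = w
    return h


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def expectation(matrix, vector):
    """<v|H'|v> for a real, normalized vector v."""
    return sum(v * hv for v, hv in zip(vector, mat_vec(matrix, vector)))


field = 2.5e7                             # V/m, the field used in the text
h_prime = perturbation_matrix(field)
s = 1.0 / math.sqrt(2.0)
eigenstates = {
    "(psi_1 - psi_3)/sqrt(2)": [s, 0.0, -s, 0.0],
    "(psi_1 + psi_3)/sqrt(2)": [s, 0.0, s, 0.0],
    "psi_2 (2p, m=-1)": [0.0, 1.0, 0.0, 0.0],
    "psi_4 (2p, m=+1)": [0.0, 0.0, 0.0, 1.0],
}

shifts = {}
for label, vec in eigenstates.items():
    shift = expectation(h_prime, vec)
    # Confirm vec is an eigenvector: H'v - shift*v should vanish.
    residual = max(abs(hv - shift * v)
                   for hv, v in zip(mat_vec(h_prime, vec), vec))
    assert residual < 1e-12
    shifts[label] = shift
    print(f"{label}: {shift:+.2e} eV = {shift * EV_TO_WAVENUMBER:+.1f} cm^-1")
```

The two mixed 2s/2p($m = 0$) states move by equal and opposite amounts, while the $m = \pm 1$ states are unshifted to first order, which is the triplet described above.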
It was mentioned in Section 7.4.1 that the second-order perturbation analysis is expected to break down at large field strengths when the field-induced perturbation becomes comparable to the splittings of the unperturbed levels. We made an estimate of this for the 3s level of sodium and concluded that extremely large fields were required for the strong-field limit to be reached. However, the fields required for the breakdown of the second-order approach for the excited states can be significantly smaller, because some atoms can have different parity excited states which are relatively close to each other. We would then expect the behaviour to change as the field is increased. At low fields we would observe the quadratic Stark effect, but when the field is sufficiently large that the perturbation is comparable to the energy splitting, we would effectively have degenerate levels with different parities. This would then result in a linear shift determined by degenerate perturbation theory. This change from the quadratic Stark effect at low fields to the linear Stark effect at high fields was first studied for the (1s, 4$l$) excited state configuration of helium by Foster in 1927.

$^6$ The Balmer series of hydrogen corresponds to those lines that terminate on the $n = 2$ level. These lines occur in the visible spectral region.

Figure 7.9: The quantum confined Stark effect. (a) Excitons are created by optical transitions from the valence to the conduction band of a semiconductor. (b) A quantum well is formed when a thin layer of a semiconductor with a band gap $E_g$ is sandwiched between layers of another semiconductor with a larger band gap $E_g'$. (c) Electric fields are applied to an exciton in a quantum well by embedding the quantum well within a P-N junction and applying reverse bias.
7.4.3 The quantum-confined Stark effect
An optical transition between the valence and conduction bands of a semiconductor leaves a positively-charged hole in the valence band, and a negatively-charged electron in the conduction band, as shown in Fig. 7.9(a). The electron and hole can bind together to form a hydrogen-like atom called an exciton. The binding energy of the exciton is rather small, due to the high relative dielectric constant $\epsilon_r$ of the semiconductor, and also because of the low reduced effective mass of the exciton. Typical values might be $\epsilon_r \sim 10$ and $m \sim 0.1 m_e$, which implies from eqn 1.10 that the 1s binding energy would be $\sim 0.01$ eV.$^7$
From the discussion given in Section 7.4.1, we would expect the 1s exciton state to show a quadratic Stark shift as an electric field is applied. However, in bulk semiconductors the excitons are very unstable to applied electric fields due to their low binding energy, which implies that the electrostatic force between the electron and hole is relatively small. The electrons and holes are pushed in opposite directions, and the exciton then easily gets ripped apart by the field. This effect is called field ionization. It can also be observed in atomic physics, but only at extremely high field strengths.
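The exciton numbers quoted above follow from hydrogenic scaling, which can be sketched as follows. This is an illustration under stated assumptions: it takes eqn 1.10 to be the Bohr-model energy formula, so that the binding energy scales as $(m/m_e)/\epsilon_r^2$ times the hydrogen value of 13.6 eV and the radius scales as $\epsilon_r (m_e/m)$ times the Bohr radius. The field scale returned by `ionization_field_scale` (binding energy divided by $e$ times the radius) is only a crude indicator of where field ionization sets in, and is not a result from the notes.

```python
RYDBERG_EV = 13.6          # 1s binding energy of hydrogen (eV)
BOHR_RADIUS_NM = 0.0529    # Bohr radius of hydrogen (nm)


def exciton_binding_energy(eps_r, mass_ratio):
    """1s binding energy (eV) of a hydrogen-like exciton.

    mass_ratio is the reduced effective mass divided by the free electron mass.
    """
    return RYDBERG_EV * mass_ratio / eps_r**2


def exciton_radius_nm(eps_r, mass_ratio):
    """Bohr-like radius (nm) of the 1s exciton."""
    return BOHR_RADIUS_NM * eps_r / mass_ratio


def ionization_field_scale(eps_r, mass_ratio):
    """Rough field scale (V/m): binding energy / (e * exciton radius)."""
    radius_m = exciton_radius_nm(eps_r, mass_ratio) * 1e-9
    return exciton_binding_energy(eps_r, mass_ratio) / radius_m


# Typical values quoted in the text.
binding = exciton_binding_energy(10.0, 0.1)
radius = exciton_radius_nm(10.0, 0.1)
field_scale = ionization_field_scale(10.0, 0.1)
print(f"binding energy ~ {binding:.4f} eV")
print(f"exciton radius ~ {radius:.2f} nm")
print(f"field scale    ~ {field_scale:.1e} V/m")
```

On this estimate the field scale for a bulk exciton is several thousand times smaller than the $\sim 10^{10}$ V/m found for the sodium ground state, which is consistent with the statement that bulk excitons are easily field ionized.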
The situation in a quantum-confined structure such as a semiconductor quantum well or quantum dot is rather different. Consider the case of the quantum well shown in Fig. 7.9(b). The quantum well is formed by sandwiching a thin semiconductor with a band gap of $E_g$ between layers of another semiconductor with a larger band gap $E_g'$. This then gives rise to spatial discontinuities in the conduction and valence band energies as shown in the figure. The excitons that are formed by optical transitions across the smaller band gap are then trapped in the finite potential well created by the band discontinuities.
A strong electric field can be applied to the quantum well by embedding it within a P-N junction, and then applying reverse bias, as shown in Fig. 7.9(c). P-N junctions conduct when forward bias is applied, but not under reverse bias. In the latter case, the applied voltage is dropped across the narrow junction region, thereby generating an electric field that is controlled by the reverse bias. The excitons that are created by optical transitions are now stable to the field, because the barriers of the quantum well prevent them from being ripped apart. The electrons are pushed to one side of the quantum well, and the holes to the other, which creates a dipole of magnitude $\sim ed$, where $d$ is the width of the quantum well. With $d \sim 10$ nm, much larger dipoles can be created than in atomic physics, resulting in correspondingly larger Stark shifts. This effect is called the quantum-confined Stark effect, and is widely used for making electro-optical modulators. The quantum-confined Stark effect will be studied in more detail in course PHY475.

$^7$ Note that the factor of $\epsilon_0^2$ in the denominator of eqn 1.10 has to be replaced by $(\epsilon_r \epsilon_0)^2$ in a dielectric medium.
Reading
Bransden and Joachain, Physics of Atoms and Molecules, chapter 6, 9.89, 16.1.
Demtröder, Atoms, Molecules and Photons, sections 5.2, 5.6, 7.2 and 11.9.
Haken and Wolf, The Physics of Atoms and Quanta, chapters 13 and 15, 20.67.
Eisberg and Resnick, Quantum Physics, section 10.6.
Beiser, Concepts of Modern Physics, section 6.10.
Foot, Atomic Physics, sections 1.8 and 5.5.
Chapter 8
Lasers I: Stimulated emission
8.1 Introduction
The word "LASER" is an acronym standing for "Light Amplification by Stimulated Emission of Radiation". The origins of the laser may be traced back to Einstein's seminal paper on stimulated emission published in 1917, but it took until 1960 for the first laser to be invented. It is difficult to identify all of the key milestones in the history of laser physics, but here are a few of the more important ones:
1917 Einstein's treatment of stimulated emission.
1951 Development of the maser by C.H. Townes.$^1$
1958 Proposal by C.H. Townes and A.L. Schawlow that the maser concept could be extended to optical frequencies.
1960 T.H. Maiman at Hughes Laboratories reports the first laser: the pulsed ruby laser.
1961 The first continuous wave laser is reported: the helium neon laser.
1962 Invention of the semiconductor laser.
1964 Nicolay Basov, Charlie Townes and Aleksandr Prokhorov are awarded the Nobel Prize for "fundamental work in the field of quantum electronics, which has led to the construction of oscillators and amplifiers based on the maser-laser principle".
1981 Art Schawlow and Nicolaas Bloembergen are awarded the Nobel Prize for "their contribution to the development of laser spectroscopy".
1997 Steven Chu, Claude Cohen-Tannoudji and William D. Phillips are awarded the Nobel Prize for "the development of methods to cool and trap atoms with laser light".
2005 John Hall and Theodor Hänsch receive the Nobel Prize for "their contributions to the development of laser-based precision spectroscopy, including the optical frequency comb technique".
2010 50th anniversary of the laser.
Many different types of laser have been developed over the years. The "L" in laser stands for "Light", but "light" is understood here in a general sense to mean electromagnetic radiation with a frequency of $\sim 10^{14}$–$10^{15}$ Hz, not specifically visible radiation. This provides the first general classification of laser types:
• infrared, visible or ultraviolet wavelength.
Other general classifications include:
• solid, liquid or gas gain medium;
• continuous wave (CW) or pulsed operation;
• fixed wavelength or tuneable wavelength.

$^1$ A maser is basically the same as a laser, except that it works at microwave rather than optical frequencies. It took some years to move on from masers to lasers because microwave cavities are designed on the assumption that the cavity dimensions are comparable to the wavelength of the radiation within the cavity, which is typically around 10 cm at microwave frequencies. Such designs cannot be scaled easily to optical wavelengths, where $\lambda \sim 1$ µm, and it required some lateral thinking to design a cavity that would work in the regime where the cavity dimensions are much larger than the wavelength. It is only relatively recently that it has been possible to make "microcavity" lasers and "nanolasers" that have physical dimensions that are comparable to the wavelength of light.
The gain medium (i.e. amplifying medium) of the laser determines the possible wavelengths that the laser can emit, but the characteristic properties of laser light are also strongly affected by the design of the cavity, which is the other essential part of a laser. Such properties include:
Monochromaticity: Discharge lamps emit light of many different colours simultaneously, according to the emission probabilities of the transitions in the atoms. Lasers, by contrast, emit light from just a single atomic transition, and are therefore highly monochromatic.$^2$ The transition that is selected by the laser is determined by the amount of amplification that is available at that wavelength and the reflectivity of the mirrors that comprise the cavity.
Directionality: Discharge lamps emit in all directions, but lasers emit a well-defined beam in a specific direction. The direction of the beam is governed by the orientation of the mirrors in the cavity.
Brightness: The brightness of lasers arises from two factors. First, the radiation is emitted in a beam, which means that the intensity (i.e. power per unit area) can be very high, even though the total amount of power is relatively low. Second, all of the energy is concentrated within the narrow spectrum of a single atomic transition. This means that the spectral brightness (i.e. the intensity in the beam divided by the spectral width of the emission line) is even higher in comparison with a white-light source such as an incandescent light bulb. For example, the spectral brightness of a 1 mW laser beam could easily be millions of times greater than that of a 100 W light bulb.
Coherence: Lasers have a high degree of both spatial and temporal coherence. The coherence of laser light will be considered in more detail in Section 9.3.
These four properties are common to all lasers. In addition, some lasers emit radiation in very short pulses, which can be used for studying fast processes in physics, chemistry, and biology, or for transmitting optical data at a very high rate down optical fibres. The principles that govern whether a laser can produce very short pulses are considered in Section 9.2.2.
8.2 Principles of laser oscillation
As mentioned above, the word "LASER" is an acronym that stands for "light amplification by stimulated emission of radiation". However, there is more to a laser than just light amplification. A laser is actually an oscillator rather than just an amplifier.$^3$ The difference is that an oscillator has positive feedback in addition to amplification. The key ingredients of a laser may thus be summarized as:
LASER = light amplification + positive optical feedback .
Light amplification is achieved by stimulated emission. Ordinary optical materials do not amplify light. Instead, they tend to absorb or scatter the light, so that the light intensity out of the medium is less than the intensity that went in. To get amplification you have to drive the material into a non-equilibrium state by pumping energy into it. The amplification of the medium is determined by the gain coefficient $\gamma$, which is defined by the following equation:
\[
I(x + dx) = I(x) + \gamma I(x) \, dx \equiv I(x) + dI , \tag{8.1}
\]
where $I(x)$ represents the intensity at a point $x$ within the gain medium. The differential equation can be solved as follows:
\[
dI = \gamma I \, dx \quad \therefore \quad \frac{dI}{dx} = \gamma I \quad \therefore \quad I(x) = I(0) \, e^{\gamma x} . \tag{8.2}
\]
$^2$ "Monochromatic" means "single coloured".
$^3$ A more accurate acronym for a laser might therefore be "LOSER", but it is easy to understand why this one never caught on.

Figure 8.1: Schematic diagram of a laser, showing the gain medium and its power supply, the high reflector, the output coupler, and the light output.
Thus the intensity grows exponentially within the gain medium.
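As a quick check of the exponential growth, the defining relation of eqn 8.1 can be stepped forward numerically and compared with the closed form of eqn 8.2. The sketch below is illustrative only: the gain coefficient of 5 m$^{-1}$ and the length of 0.2 m are hypothetical values chosen so that $\gamma x = 1$.

```python
import math


def amplified_intensity(i0, gain, x):
    """Closed-form solution of dI/dx = gain * I: I(x) = I(0) exp(gain * x)."""
    return i0 * math.exp(gain * x)


def integrate_gain(i0, gain, length, steps=100_000):
    """Step eqn 8.1 forward: I(x + dx) = I(x) + gain * I(x) * dx."""
    dx = length / steps
    intensity = i0
    for _ in range(steps):
        intensity += gain * intensity * dx
    return intensity


# Hypothetical example values (not from the notes).
gain = 5.0      # gain coefficient (m^-1)
length = 0.2    # length of gain medium (m)

exact = amplified_intensity(1.0, gain, length)
stepped = integrate_gain(1.0, gain, length)
print(f"closed form: {exact:.5f}, stepped: {stepped:.5f}")
```

The stepped result approaches the closed form as the number of steps increases, as expected for a first-order integration of eqn 8.1.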
Positive optical feedback is achieved by inserting the amplifying medium inside a resonant cavity. The effects of the cavity on the properties of laser light will be considered in detail in Sections 9.1–9.3. At this stage, we confine ourselves to considering the parameters of the cavity that affect the condition for laser oscillation.
Figure 8.1 shows a schematic diagram of a laser. Light in the cavity passes through the gain medium and is amplified. It then bounces off the end mirrors and passes through the gain medium again, getting amplified further. This process repeats itself until a stable equilibrium condition is achieved when the total round trip gain balances all the losses in the cavity. Under these conditions the laser will oscillate. The condition for oscillation is thus:
round-trip gain = round-trip loss .
The losses in the cavity fall into two categories: useful, and useless. The useful loss comes from the output coupling. One of the mirrors (called the output coupler) has reflectivity less than unity, and allows some of the light oscillating around the cavity to be transmitted as the output of the laser. The value of the transmission is chosen to maximize the output power. If the transmission is too low, very little of the light inside the cavity can escape, and thus we get very little output power. On the other hand, if the transmission is too high, there may not be enough gain to sustain oscillation, and there would be no output power. The optimum value is somewhere between these two extremes. Useless losses arise from absorption in the optical components (including the laser medium), scattering, and the imperfect reflectivity of the other mirror (the high reflector). By taking into account the fact that the light passes twice through the gain medium during a round trip, the condition for oscillation in a laser can be written:
\[
e^{2 \gamma l} \, R_{\mathrm{OC}} \, R_{\mathrm{HR}} \, L = 1 , \tag{8.3}
\]
where $l$ is the length of the gain medium, $R_{\mathrm{OC}}$ is the reflectivity of the output coupler, $R_{\mathrm{HR}}$ is the reflectivity of the high reflector, and $L$ is the round-trip loss factor due to absorption and scattering, such that $L = 1$ corresponds to the situation with no losses. If the total round-trip losses are small ($\lesssim 10\%$), then the gain required to sustain lasing will also be small, and eqn 8.3 simplifies to:
\[
2 \gamma l = (1 - R_{\mathrm{OC}}) + (1 - R_{\mathrm{HR}}) + \text{scattering losses} + \text{absorption losses} . \tag{8.4}
\]
This shows more clearly how the gain in the laser medium must exactly balance the losses in the cavity.
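The quality of the low-loss approximation in eqn 8.4 can be judged by solving eqn 8.3 exactly for the threshold gain and comparing the two. The sketch below does this for a hypothetical cavity; the mirror reflectivities, loss factor and medium length are illustrative assumptions, not values from the notes.

```python
import math


def threshold_gain(length, r_oc, r_hr, loss_factor=1.0):
    """Solve exp(2*gamma*l) * R_OC * R_HR * L = 1 (eqn 8.3) for gamma."""
    return -math.log(r_oc * r_hr * loss_factor) / (2.0 * length)


def threshold_gain_small_loss(length, r_oc, r_hr, other_losses=0.0):
    """Low-loss form (eqn 8.4): 2*gamma*l = (1 - R_OC) + (1 - R_HR) + other losses."""
    return ((1.0 - r_oc) + (1.0 - r_hr) + other_losses) / (2.0 * length)


# Hypothetical cavity parameters (not from the notes).
length = 0.3          # length of gain medium (m)
r_oc = 0.98           # output coupler reflectivity
r_hr = 0.998          # high reflector reflectivity
loss_factor = 0.995   # L: 0.5% round-trip scattering + absorption loss

gamma_exact = threshold_gain(length, r_oc, r_hr, loss_factor)
gamma_approx = threshold_gain_small_loss(length, r_oc, r_hr, 1.0 - loss_factor)
print(f"exact threshold gain:  {gamma_exact:.5f} m^-1")
print(f"approx threshold gain: {gamma_approx:.5f} m^-1")
```

For total round-trip losses of a few per cent the two expressions agree to better than about 1%, and the agreement degrades as the losses grow, which is why eqn 8.4 is restricted to small losses.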
In general we expect the gain to increase as we pump more energy into the laser medium. At low pump powers, the gain will be small, and there will be insufficient gain to reach the oscillation condition. The laser will not start to oscillate until there is enough gain to overcome all of the losses. This implies that the laser will have a threshold in terms of the pump power. (See Section 8.6.)
8.3 Stimulated emission
In Chapter 2, we considered the spontaneous tendency for atoms in excited states to emit radiation. We now consider the optical transitions that occur when the atom is subjected to electromagnetic radiation with its frequency resonant with the energy difference of the two levels. We follow the treatment of Einstein (1917).
In addition to transitions from the upper to the lower level due to spontaneous emission, there will also be:
• absorption of photons causing transitions from level 1 up to level 2;
• stimulated emission in which atoms in level 2 drop to level 1 induced by the incident radiation.

Figure 8.2: Absorption, spontaneous emission, and stimulated emission transitions between level 1 (energy $E_1$, population $N_1$) and level 2 (energy $E_2$, population $N_2$) in the presence of radiation with spectral energy density $u(\nu)$.
The process of stimulated emission is a coherent quantum-mechanical effect. The photons emitted by stimulated emission are in phase with the photons that induce the transition. This is the fundamental basis of laser operation, as the name suggests: Light Amplification by Stimulated Emission of Radiation.
Consider a collection of atoms irradiated by white light, with $N_2$ atoms in level 2 and $N_1$ atoms in level 1. The part of the spectrum at frequency $\nu$, where $h\nu = (E_2 - E_1)$, can induce absorption and stimulated emission transitions. We write the spectral energy density of the light at frequency $\nu$ as $u(\nu)$. The transitions that can occur are shown in Fig. 8.2. In order to treat this situation, Einstein introduced his A and B coefficients. We have already seen in Section 2.6 that the A coefficient determines the rates of spontaneous transitions. The introduction of the B coefficient extends the treatment to include absorption and stimulated emission. The transition rates for the three processes are:
• Spontaneous emission (2 $\to$ 1):
\[
\frac{dN_2}{dt} = -\frac{dN_1}{dt} = -A_{21} N_2 . \tag{8.5}
\]
• Stimulated emission (2 $\to$ 1):
\[
\frac{dN_2}{dt} = -\frac{dN_1}{dt} = -B_{21} N_2 \, u(\nu) . \tag{8.6}
\]
• Absorption (1 $\to$ 2):
\[
\frac{dN_1}{dt} = -\frac{dN_2}{dt} = -B_{12} N_1 \, u(\nu) . \tag{8.7}
\]
These are effectively the definitions of the Einstein A and B coefficients. At this stage we might be inclined to think that the three coefficients are independent parameters for a particular transition. This is not in fact the case. If you know one, you can work out the other two. To see this, we follow Einstein's analysis.
We imagine a gas of atoms inside a box at temperature $T$ with black walls. If we leave the atoms for long enough, they will come to equilibrium with the black-body radiation that fills the cavity. In these steady-state conditions, the rate of upward transitions must exactly balance the rate of downward transitions:
\[
B_{12} N_1 \, u(\nu) = A_{21} N_2 + B_{21} N_2 \, u(\nu) , \tag{8.8}
\]
which implies that:
\[
\frac{N_2}{N_1} = \frac{B_{12} \, u(\nu)}{A_{21} + B_{21} \, u(\nu)} . \tag{8.9}
\]
Furthermore, since the gas is in thermal equilibrium at temperature $T$, the ratio of $N_2$ to $N_1$ must satisfy Boltzmann's law:
\[
\frac{N_2}{N_1} = \frac{g_2}{g_1} \exp\left( -\frac{h\nu}{k_B T} \right) , \tag{8.10}
\]
where $g_2$ and $g_1$ are the degeneracies of levels 2 and 1 respectively, and $h\nu = (E_2 - E_1)$. Equations 8.9 and 8.10 together imply that:
\[
\frac{B_{12} \, u(\nu)}{A_{21} + B_{21} \, u(\nu)} = \frac{g_2}{g_1} \exp\left( -\frac{h\nu}{k_B T} \right) . \tag{8.11}
\]
On solving this for $u(\nu)$, we find:
\[
u(\nu) = \frac{g_2 A_{21}}{g_1 B_{12} \exp(h\nu / k_B T) - g_2 B_{21}} . \tag{8.12}
\]
However, we know that the cavity is filled with black-body radiation, which has a spectral energy density given by the Planck formula:
\[
u(\nu) = \frac{8 \pi h \nu^3}{(c/n)^3} \, \frac{1}{\exp(h\nu / k_B T) - 1} , \tag{8.13}
\]
where $c/n$ is the speed of light in a medium with refractive index $n$. The only way to make eqns 8.12 and 8.13 consistent with each other at all temperatures and frequencies is if:
\[
g_1 B_{12} = g_2 B_{21} , \qquad A_{21} = \frac{8 \pi n^3 h \nu^3}{c^3} \, B_{21} . \tag{8.14}
\]
A moment's thought will convince us that it is not possible to get consistency between the equations without the stimulated emission term. This is what led Einstein to introduce the concept.
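The consistency argument can be checked numerically. The sketch below is an illustration rather than part of Einstein's derivation: it inserts the relations of eqn 8.14 into eqn 8.12 and compares the result with the Planck formula (eqn 8.13) at several frequencies and temperatures, and then repeats the comparison with the stimulated emission term removed ($B_{21} = 0$ in eqn 8.12). The value of $B_{21}$ and the degeneracies are hypothetical, and the physical constants are rounded.

```python
import math

H = 6.626e-34    # Planck constant (J s)
KB = 1.381e-23   # Boltzmann constant (J/K)
C = 2.998e8      # speed of light in vacuum (m/s)


def planck_energy_density(nu, temperature, n=1.0):
    """Eqn 8.13: black-body spectral energy density in a medium of index n."""
    x = H * nu / (KB * temperature)
    return 8.0 * math.pi * H * nu**3 / (C / n) ** 3 / math.expm1(x)


def equilibrium_energy_density(nu, temperature, a21, b12, b21, g1, g2):
    """Eqn 8.12: energy density required to balance up and down rates."""
    x = H * nu / (KB * temperature)
    return g2 * a21 / (g1 * b12 * math.exp(x) - g2 * b21)


def einstein_coefficients(b21, nu, g1, g2, n=1.0):
    """Eqn 8.14: return (A21, B12) for a given B21."""
    a21 = 8.0 * math.pi * n**3 * H * nu**3 / C**3 * b21
    b12 = g2 / g1 * b21
    return a21, b12


b21 = 1.0e20     # hypothetical B coefficient (m^3 J^-1 s^-2)
g1, g2 = 2, 4    # hypothetical degeneracies

ratios = []
for nu in (1.0e13, 5.0e14):
    for temperature in (300.0, 6000.0):
        a21, b12 = einstein_coefficients(b21, nu, g1, g2)
        u_balance = equilibrium_energy_density(nu, temperature, a21, b12, b21, g1, g2)
        ratios.append(u_balance / planck_energy_density(nu, temperature))
print("eqn 8.12 / Planck:", [f"{r:.6f}" for r in ratios])

# Drop the stimulated emission term and compare again at low frequency.
nu, temperature = 1.0e13, 6000.0
a21, b12 = einstein_coefficients(b21, nu, g1, g2)
u_no_stim = equilibrium_energy_density(nu, temperature, a21, b12, 0.0, g1, g2)
ratio_no_stim = u_no_stim / planck_energy_density(nu, temperature)
print(f"without stimulated emission: ratio = {ratio_no_stim:.3f}")
```

With eqn 8.14 imposed, the ratio is unity in every case. Without the stimulated emission term the balance condition gives an energy density that falls well below the Planck value whenever $h\nu \lesssim k_B T$, so the two formulae cannot agree at all temperatures and frequencies.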
The relationships between the Einstein coefficients given in eqn 8.14 have been derived for the case of an atom in thermal equilibrium with black-body radiation at temperature $T$. However, once we have derived the inter-relationships, they will apply in all other cases as well. This is very useful, since we only need to know one of the coefficients to work out the other two. For example, we can measure the radiative lifetime $\tau$ to determine $A_{21}$ from (cf. eqn 2.25),
\[
A_{21} = \frac{1}{\tau} , \tag{8.15}
\]
and then work out the B coefficients from eqn 8.14.
Equation 8.14 shows that the probabilities for absorption and emission are the same apart from the degeneracy factors, and that the ratio of the probability for spontaneous emission to stimulated emission increases in proportion to $\nu^3$. In a laser we want to encourage stimulated emission and suppress spontaneous emission. Hence it gets progressively more difficult to make lasers work as the frequency increases, all other things being equal.
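The route from a measured lifetime to the B coefficient, and the $\nu^3$ scaling, can be illustrated with a short sketch. The lifetime of 16 ns and wavelength of 589 nm used below are illustrative inputs chosen to resemble a strong visible atomic line, and the function names are not from the notes. The last function uses the fact that, for black-body radiation, eqns 8.13 and 8.14 give $A_{21} / (B_{21} u(\nu)) = \exp(h\nu / k_B T) - 1$.

```python
import math

H = 6.626e-34    # Planck constant (J s)
KB = 1.381e-23   # Boltzmann constant (J/K)
C = 2.998e8      # speed of light in vacuum (m/s)


def a21_from_lifetime(tau):
    """Eqn 8.15: A21 = 1 / tau."""
    return 1.0 / tau


def b21_from_a21(a21, nu, n=1.0):
    """Invert eqn 8.14: B21 = A21 c^3 / (8 pi n^3 h nu^3)."""
    return a21 * C**3 / (8.0 * math.pi * n**3 * H * nu**3)


def spontaneous_to_stimulated_ratio(nu, temperature):
    """A21 / (B21 u(nu)) for black-body radiation: exp(h nu / kB T) - 1."""
    return math.expm1(H * nu / (KB * temperature))


# Illustrative inputs (assumed for this sketch).
tau = 16e-9              # radiative lifetime (s)
nu = C / 589e-9          # transition frequency (Hz)

a21 = a21_from_lifetime(tau)
b21 = b21_from_a21(a21, nu)
print(f"A21 = {a21:.2e} s^-1, B21 = {b21:.2e} m^3 J^-1 s^-2")

# Doubling the frequency raises A21/B21 by a factor of 2^3 = 8.
scaling = b21_from_a21(1.0, nu) / b21_from_a21(1.0, 2.0 * nu)
print(f"(A21/B21 at 2 nu) / (A21/B21 at nu) = {scaling:.1f}")
print(f"A21/(B21 u) at 300 K: {spontaneous_to_stimulated_ratio(nu, 300.0):.1e}")
```

The final number shows that in room-temperature black-body radiation, spontaneous emission overwhelmingly dominates stimulated emission for a visible transition.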
8.4 Population inversion
We have seen above that stimulated emission is the basis of laser operation. We now wish to study how we can use stimulated emission to make a light amplifier. In a gas of atoms in thermal equilibrium, the population of the lower level will always be greater than the population of the upper level. (See eqn 8.10.) Therefore, if a light beam is incident on the medium, there will always be more upward transitions due to absorption than downward transitions due to stimulated emission. Hence there will be net absorption, and the intensity of the beam will diminish on progressing through the medium.
To amplify the beam, we require that the rate of stimulated emission transitions exceeds the rate of absorption.$^4$ We see from eqns 8.6 and 8.7 that this condition is satisfied when:
\[
B_{21} N_2 \, u(\nu) > B_{12} N_1 \, u(\nu) . \tag{8.16}
\]
On substituting from eqn 8.14, this leads to the conclusion:
\[
N_2 > \frac{g_2}{g_1} N_1 , \tag{8.17}
\]
which simplifies to:
\[
N_2 > N_1 , \tag{8.18}
\]
for non-degenerate levels. This is a highly non-equilibrium situation, and is called population inversion. On comparing eqn 8.17 to eqn 8.10, we see that population inversion corresponds to negative temperatures. This is not as ridiculous as it sounds, because the atoms are not in thermal equilibrium.
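Both statements, that thermal populations always favour the lower level and that inversion formally corresponds to a negative temperature, can be seen by evaluating eqn 8.10 and its inverse. The sketch below is illustrative: the transition frequency of $5 \times 10^{14}$ Hz and the inverted population ratio of 2 are assumed values.

```python
import math

H = 6.626e-34    # Planck constant (J s)
KB = 1.381e-23   # Boltzmann constant (J/K)


def boltzmann_ratio(nu, temperature, g1=1, g2=1):
    """Eqn 8.10: N2/N1 in thermal equilibrium at the given temperature."""
    return g2 / g1 * math.exp(-H * nu / (KB * temperature))


def effective_temperature(n2, n1, nu, g1=1, g2=1):
    """Invert eqn 8.10 to find the temperature that would give N2/N1."""
    return -H * nu / (KB * math.log(n2 * g1 / (n1 * g2)))


nu = 5.0e14   # assumed visible transition frequency (Hz)

thermal = boltzmann_ratio(nu, 300.0)
print(f"N2/N1 at 300 K: {thermal:.1e}")

# An inverted medium with twice as many atoms in the upper level.
t_inverted = effective_temperature(2.0, 1.0, nu)
print(f"effective temperature for N2 = 2 N1: {t_inverted:.0f} K")
```

The thermal ratio at room temperature is vanishingly small, while any ratio satisfying eqn 8.17 makes the logarithm positive and hence the formal temperature negative.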
Once we have population inversion, we have a mechanism for generating gain in the laser medium. The art of making a laser is to work out how to get population inversion for the relevant transition. We will discuss how this is done for specific types of laser in Sections 9.5–9.6.
$^4$ We can usually ignore spontaneous emission in an operating laser because we are considering the case in which the light intensity is very high, so that the stimulated emission rate far exceeds the spontaneous emission rate.
Figure 8.3: (a) Relationship between the intensity $I$ and energy density $u_\nu$ of a light beam travelling at speed $c/n$: all the energy contained in a length $c/n$ of a beam of unit area hits the screen in unit time. (b) Increase of the intensity from $I$ to $I + dI$ on passing through a length $dx$ of the gain medium, for a beam of unit area.

The absorption and stimulated emission rates per unit volume are written in terms of the spectral line shape function $g(\nu)$ as:
\[
W_{12} = B_{12} N_1 u_\nu \, g(\nu) , \qquad W_{21} = B_{21} N_2 u_\nu \, g(\nu) . \tag{8.19}
\]
The light source is considered to have a delta function spectrum at frequency $\nu$ with total energy density $u_\nu$, which is related to the intensity $I$ of the beam by (see Fig. 8.3(a)):
\[
I = u_\nu \frac{c}{n} , \tag{8.20}
\]
where n is the refractive index of the medium. This means that the net stimulated rate downwards from
level 2 to level 1 is given by:
W
net
21
W
21
W
12
= (N
2
N
1
)B
21
g()
n
c
I , (8.21)
where we have assumed that the levels are non-degenerate so that B
12
= B
21
.
5
(See eqn 8.14.)
For each net transition a photon of energy $h\nu$ is added to the beam. The energy added to a unit
volume of beam per unit time is thus $W^{\rm net}_{21} h\nu$. Consider a small increment of the light beam
inside the gain medium with length $dx$, as shown in Fig. 8.3(b). The energy added to this increment of
beam per unit time is $W^{\rm net}_{21} h\nu\, A\, dx$, where $A$ is the beam area. On remembering that the
intensity equals the energy per unit time per unit area, we can write:
$$ dI = W^{\rm net}_{21} h\nu\, dx = (N_2 - N_1)\, B_{21}\, g(\nu)\, \frac{n}{c}\, I\, h\nu\, dx. \tag{8.22} $$
On comparing this to eqn 8.2, we see that the gain coefficient $\gamma$ is given by:
$$ \gamma(\nu) = (N_2 - N_1)\, B_{21}\, g(\nu)\, \frac{n}{c}\, h\nu. \tag{8.23} $$
This result shows that the gain is directly proportional to the population inversion density, and also
follows the spectrum of the emission line. By using eqn 8.14 to express $B_{21}$ in terms of $A_{21}$, we can
re-write the gain coefficient in terms of the natural radiative lifetime $\tau$ using eqn 8.15 to obtain:
$$ \gamma(\nu) = (N_2 - N_1)\, \frac{\lambda^2}{8\pi n^2 \tau}\, g(\nu), \tag{8.24} $$
where $\lambda$ is the vacuum wavelength of the emission line. This is the required result. Equation 8.24
tells us how to relate the gain in the medium to the population inversion density using experimentally
measurable parameters: $\lambda$, $\tau$, $n$ and $g(\nu)$.
$^5$ For degenerate levels, the population inversion density $\Delta N$ is equal to $(N_2 - (g_2/g_1)N_1)$
rather than just $(N_2 - N_1)$.
Figure 8.4: Level scheme for a four-level laser. The pump excites atoms from the ground state (0) to
level 3, which decays rapidly to the upper laser level (2). Laser emission takes place from level 2 to
level 1, which decays rapidly to the ground state. The energy difference between level 1 and the ground
state is $\gg k_B T$.
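As a rough numerical illustration of eqn 8.24, the Python sketch below evaluates the gain coefficient at
the centre of a Lorentzian line, for which $g(\nu_0) = 2/(\pi\,\Delta\nu)$ when $\Delta\nu$ is the full width
at half maximum. The Lorentzian shape and all of the numerical values (inversion density, lifetime, line
width) are assumptions chosen only for illustration; they are not taken from these notes.

```python
import math

def lorentzian_peak(fwhm_hz):
    """Peak value g(nu_0) of a normalised Lorentzian line of given FWHM (Hz)."""
    return 2.0 / (math.pi * fwhm_hz)

def gain_coefficient(delta_n, wavelength, n, tau, g_nu):
    """Eqn 8.24: gamma = (N2 - N1) * lambda^2 / (8 pi n^2 tau) * g(nu).

    delta_n    : population inversion density N2 - N1 (m^-3)
    wavelength : vacuum wavelength of the emission line (m)
    n          : refractive index of the medium
    tau        : natural radiative lifetime (s)
    g_nu       : line-shape function at the laser frequency (Hz^-1)
    Returns the gain coefficient in m^-1.
    """
    return delta_n * wavelength**2 / (8.0 * math.pi * n**2 * tau) * g_nu

# Hypothetical example values (assumptions, for illustration only).
delta_n = 1.0e15       # m^-3
wavelength = 632.8e-9  # m
n = 1.0
tau = 100e-9           # s
fwhm = 1.5e9           # Hz

gamma = gain_coefficient(delta_n, wavelength, n, tau, lorentzian_peak(fwhm))
print(f"gamma = {gamma:.4f} m^-1")
```

Doubling `delta_n` doubles the result, which is the proportionality between gain and population inversion
that the threshold argument relies on.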
8.6 Laser threshold
Equation 8.24 shows us that the gain in a laser medium is directly proportional to the population inversion
density. Laser operation will occur when there is enough gain to overcome the losses in the cavity. This
implies that a minimum amount of population inversion must be obtained before the laser will oscillate.
Population inversion is achieved by pumping atoms into the upper laser level. This pumping can be
done by a variety of techniques, which will be described in more detail when we consider individual laser
systems in Chapter 9. At this stage we just consider the general principles.
Lasers are classified as being either three-level or four-level systems. We will consider four-level
lasers first, as these are the most common. Examples are helium-neon or Nd:YAG. The four levels are:
the ground state (0), the two lasing levels (1 and 2), and a fourth level (3) which is used as part of the
pumping mechanism. The level scheme for an ideal four-level laser is shown in Fig. 8.4. The feature that
makes it a four-level as opposed to a three-level laser is that the lower laser level is at an energy more
than $k_B T$ above the ground state. This means that the thermal population of level 1 is negligible, and
so level 1 is empty before we turn on the pumping mechanism.
We assume that the atoms are inside a cavity and are being pumped into the upper laser level (level
2) at a constant rate of $R_2$. This is usually done by exciting atoms to level 3 with a bright flash lamp
or by an electrical discharge, followed by a rapid decay to level 2. We can write down the following rate
equations for the populations of levels 1 and 2:
$$ \begin{aligned}
\frac{dN_2}{dt} &= -\frac{N_2}{\tau_2} - W^{\rm net}_{21} + R_2, \\
\frac{dN_1}{dt} &= +\frac{N_2}{\tau_2} + W^{\rm net}_{21} - \frac{N_1}{\tau_1}.
\end{aligned} \tag{8.25} $$
The various terms allow for:
• spontaneous emission from level 2 to level 1 ($N_2/\tau_2$),
• stimulated transitions from level 2 to level 1 ($W^{\rm net}_{21}$),
• pumping into level 2 ($R_2$),
• decay from level 1 to the ground state by radiative transitions and/or collisions ($N_1/\tau_1$).
Note that $W^{\rm net}_{21}$ is the net stimulated transition rate from level 2 to level 1, as given in
eqn 8.21. This is equal to the rate of stimulated emission transitions downwards minus the rate of
stimulated absorption transitions upwards.
There are two important assumptions implicitly contained in eqn 8.25:
1. There is no pumping into level 1.
2. The only decay route from level 2 is by radiative transitions to level 1 (i.e. there are no non-radiative
transitions between level 2 and level 1, and transitions to other levels are not possible).
It may not always be possible to satisfy these assumptions, but it helps if we can. That is why we
described the above scenario as an ideal four-level laser.
Figure 8.5: (a) Variation of the gain and population inversion in a laser with the pumping rate. (b)
Comparison of the threshold and light outputs for two different values of the transmission of the output
coupler.
We can re-write eqn 8.21 in the following form:
$$ W^{\rm net}_{21} = B_{21}\, g(\nu)\, \frac{n}{c}\, I\, (N_2 - N_1) \equiv W (N_2 - N_1), \tag{8.26} $$
where $W = B_{21}\, g(\nu)\, n I / c$, and $I$ is the intensity inside the laser cavity. In steady-state
conditions, the time derivatives in eqn 8.25 must be zero. We can thus solve eqn 8.25 for $N_1$ and $N_2$
using eqn 8.26 to obtain:
$$ N_1 = R_2 \tau_1, \qquad N_2 = \frac{W N_1 + R_2}{W + 1/\tau_2}. \tag{8.27} $$
Therefore the population inversion is given by
$$ \Delta N \equiv N_2 - N_1 = \frac{R_2}{W + 1/\tau_2} \left( 1 - \frac{\tau_1}{\tau_2} \right). \tag{8.28} $$
This shows that the population inversion is directly proportional to the pumping rate into the upper
level. Note, however, that it is not possible to achieve population inversion (i.e. $\Delta N > 0$) unless
$\tau_2 > \tau_1$. This makes sense if you think about it. Unless the lower laser level empties quickly,
atoms will pile up in the lower laser level and this will destroy the population inversion.
Equation 8.28 can be re-written as:
$$ \Delta N = \frac{R}{W + 1/\tau_2}, \tag{8.29} $$
where $R = R_2 (1 - \tau_1/\tau_2)$. This is the net pumping rate after allowing for the unavoidable
accumulation of atoms in the lower level because $\tau_1$ is non-zero. If the laser is below the threshold
for lasing, there will be very few photons in the cavity. Therefore, $W$ will be very small because $I$ is
very small: see eqn 8.26 above. The population inversion is then simply $R\tau_2$, and thus increases
linearly with the pumping rate. Equation 8.24 implies that the gain coefficient similarly increases
linearly with the pumping rate below threshold.
Eventually we will have enough gain to balance the round trip losses. This determines the threshold
gain coefficient $\gamma_{\rm th}$ for laser oscillation, as set out in eqn 8.3 or 8.4. From eqn 8.24 we have:
$$ \Delta N_{\rm th} = \frac{8\pi n^2 \tau}{\lambda^2 g(\nu)}\, \gamma_{\rm th}. \tag{8.30} $$
By combining eqns 8.28 and 8.29 with $W = 0$ we can work out the pumping rate required to instigate
lasing. This is the threshold pumping rate. It is given by $R_{\rm th} = \Delta N_{\rm th}/\tau_2$. All lasers
have a threshold. Unless you pump them hard enough, they will not work.
Figure 8.6: Level scheme for a three-level laser, for example: ruby. The pump excites atoms from the
ground state (level 1) to level 3, which decays rapidly to level 2. Laser emission takes place from level 2
to the ground state.
What happens if we increase the pumping rate beyond the threshold value? In steady-state conditions,
the gain cannot increase any more, which implies that the population inversion is clamped at the value
given by eqn 8.30 even when $R$ exceeds $R_{\rm th}$. This is shown in Fig. 8.5(a). What about the power
output? We set $W$ to zero in eqn 8.29 because there was very little light in the cavity below threshold.
This is no longer true once the laser starts oscillating. If we are above threshold, $\Delta N$ is clamped
at the value set by eqn 8.30, and so eqn 8.29 tells us that:
$$ W = \frac{R}{\Delta N_{\rm th}} - \frac{1}{\tau_2} = \frac{1}{\tau_2} \left( \frac{R}{R_{\rm th}} - 1 \right) \quad \text{for } R > R_{\rm th}. \tag{8.31} $$
Now $W$ is proportional to the intensity $I$ inside the cavity (see eqn 8.26), which in turn is
proportional to the output power $P_{\rm out}$ emitted by the laser. Thus $P_{\rm out}$ is proportional to
$W$, and we may write:
$$ P_{\rm out} \propto \left( \frac{R}{R_{\rm th}} - 1 \right) \quad \text{for } R > R_{\rm th}. \tag{8.32} $$
This shows that the output power increases linearly with the pumping rate once the threshold has been
achieved, as shown in Fig. 8.5(b).
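The behaviour on either side of threshold can be summarised in a few lines of Python. The sketch below
is a minimal illustration of eqns 8.29 to 8.31 for the ideal four-level laser; the function name
`steady_state` and the numerical values of $\tau_2$ and $R_{\rm th}$ are hypothetical and serve only to
show the clamping of $\Delta N$ and the linear rise of $W$.

```python
def steady_state(R, R_th, tau2):
    """Steady-state inversion and stimulated rate W for a net pumping rate R.

    Below threshold W is taken as zero and eqn 8.29 gives Delta N = R * tau2.
    Above threshold Delta N is clamped at Delta N_th = R_th * tau2 and
    eqn 8.31 gives W = (R / R_th - 1) / tau2.
    Returns the pair (Delta N, W).
    """
    delta_n_th = R_th * tau2
    if R <= R_th:
        return R * tau2, 0.0
    return delta_n_th, (R / R_th - 1.0) / tau2

# Hypothetical numbers, for illustration only.
tau2 = 1.0e-6   # upper-level lifetime (s)
R_th = 1.0e24   # threshold pumping rate (m^-3 s^-1)

for ratio in (0.5, 1.0, 2.0, 3.0):
    delta_n, W = steady_state(ratio * R_th, R_th, tau2)
    print(f"R/R_th = {ratio:3.1f}   Delta N = {delta_n:.2e} m^-3   W = {W:.2e} s^-1")
```

Since $P_{\rm out}$ is proportional to $W$ (eqn 8.32), the second returned value traces out the straight
line above threshold sketched in Fig. 8.5(b), up to an overall constant.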
The choice of the reflectivity of the output coupler affects the threshold because it determines the
oscillation conditions: see eqn 8.3 or 8.4. If the output coupler transmission $(1 - R_{\rm OC})$ is small,
the laser will have a low threshold, but the output coupling efficiency will be low. By increasing the
transmission, the threshold increases, but the power is coupled out more efficiently. This point is
illustrated in Fig. 8.5(b). The final choice for $R_{\rm OC}$ depends on how much pump energy is available,
which will govern the optimal choice to get the maximum output power.
8.7 Pulsed Lasers
So far we have only considered continuous wave (CW) lasers, but many lasers in fact operate in a pulsed
mode. Powerful pulsed flash lamps can give rise to very large pumping rates, with correspondingly large
output pulse energies, especially when using a trick called Q-switching. In this technique, the losses in
the cavity are kept artificially high by some external method.$^6$ This prevents lasing and allows the
build up of very large population inversion densities, with correspondingly large gain coefficients. If
the losses are suddenly reduced, a very powerful pulse will build up because of the very high gain in the
cavity. Q-switching is widely used in solid-state lasers because they tend to have long upper state
lifetimes, which allows the storage of a large amount of energy in the crystal, but it is seldom used in
gas lasers because the lifetimes are shorter, which makes it difficult to store much energy in the gain
medium.
$^6$ One way to switch the cavity losses from high to low on fast timescales is to use an electro-optical
modulator. This effectively behaves like a fast intra-cavity shutter.
8.8 Three-level lasers
Some lasers are classified as being three-level systems. The standard example is ruby, which was the
first laser ever produced. The key difference between a three-level laser and a four-level laser is that
the lower laser level is the ground state, as shown in Fig. 8.6. On comparing Figs. 8.4 and 8.6, it is
apparent that the lower laser level of the four-level system has merged with the ground state in the
three-level system. This makes it much more difficult to obtain population inversion in three-level lasers
because the lower laser level initially has a very large population. Let this population be $N_0$. By
turning on the pump, we excite $dN$ atoms to level 3, which then decay to level 2. Thus the population of
level 2 will be $dN$, while the population of the lower laser level will be $(N_0 - dN)$. Hence for
population inversion we require $dN > (N_0 - dN)$, that is $dN > N_0/2$. Therefore, in order to obtain
population inversion we have to pump more than half of the atoms out of the ground state into the upper
laser level. This obviously requires a very large amount of energy, which contrasts with four-level
lasers, where the lower laser level is empty before the pumping process starts, and much less energy is
required to reach threshold.
Despite the fact that the threshold for population inversion is very high in a three-level system,
three-level lasers can be quite efficient once this threshold is overcome. Ruby lasers pumped by bright
flash lamps actually give very high output pulse energies. However, they only work in pulsed mode.
Continuous lasers tend to be made by using four-level systems.
Figure 8.7: Interaction of an atomic transition with: (a) broad-band radiation, and (b) narrow-band
radiation. Note that the spectral energy densities and the atomic line-shape functions are not drawn on
the same vertical scales.
8.9 Appendix: Interaction with narrow-band radiation
The Einstein B coefficients were introduced to consider the interaction of atoms with broad-band
radiation, such as black-body radiation, as illustrated in Fig. 8.7(a). In this situation, the spectral
energy density $u(\nu)$ varies much more slowly with frequency than the atomic lineshape function
$g(\nu)$, and may effectively be taken as constant over the line width of the transition. In a laser, by
contrast, the spectral width of the radiation inside the cavity is frequently much narrower than the width
of the atomic transition, as illustrated in Fig. 8.7(b).
The absorption and stimulated emission transition rates for the case of narrow-band radiation, as shown
in Fig. 8.7(b), can be calculated as follows. The spectral line-shape function $g(\nu)\,d\nu$ gives the
probability that a particular atom will absorb or emit in the spectral range $\nu \to \nu + d\nu$. Hence
the number of atoms in the lower level per unit volume that can absorb radiation in this frequency range
is $N_1 g(\nu)\,d\nu$. From the definition of the Einstein $B_{12}$ coefficient given in eqn 8.7, the
absorption rate in this frequency range is therefore:
$$ dW_{12} = B_{12} N_1\, g(\nu)\,d\nu\; u(\nu). \tag{8.33} $$
The total absorption rate is thus:
$$ W_{12} = \int_0^\infty B_{12} N_1\, g(\nu)\, u(\nu)\, d\nu. \tag{8.34} $$
Since the spectral energy density of the radiation inside the laser cavity is much narrower than the width
of the atomic transition, we can write it as:
$$ u(\nu) = u_\nu\, \delta(\nu - \nu_{\rm laser}), \tag{8.35} $$
where $u_\nu$ is the total energy density of the beam (cf eqn 8.20) and $\delta(x)$ is the Dirac delta
function.$^7$ On inserting this into eqn 8.34, we obtain:
$$ W_{12} = \int_0^\infty B_{12} N_1\, g(\nu)\, u_\nu\, \delta(\nu - \nu_{\rm laser})\, d\nu
         = B_{12} N_1\, g(\nu_{\rm laser})\, u_\nu. \tag{8.36} $$
The argument for the stimulated emission rate follows similarly, and leads to:
$$ W_{21} = B_{21} N_2\, g(\nu_{\rm laser})\, u_\nu. \tag{8.37} $$
$^7$ The Dirac delta function $\delta(x - x_0)$ takes the value of 0 at all values of $x$ apart from $x_0$,
and is normalized such that $\int_0^\infty \delta(x - x_0)\,dx = 1$. It can be thought of as the limit of a
top-hat function of width $\Delta$ and height $1/\Delta$ centred at $x_0$ in the limit where
$\Delta \to 0$. It is easy to show that $\int_0^\infty f(x)\,\delta(x - x_0)\,dx = f(x_0)$.
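The step from eqn 8.34 to eqn 8.36 can be checked numerically by replacing the delta function with a
narrow top-hat of width $\Delta$ and height $1/\Delta$ and letting $\Delta$ shrink. The Python sketch below
does this for a Lorentzian line-shape function; the Lorentzian form, the frequency units and the detuning
of the laser from line centre are assumptions made purely for illustration.

```python
import math

def lorentzian(nu, nu0, fwhm):
    """Normalised Lorentzian line-shape function g(nu)."""
    half = fwhm / 2.0
    return (half / math.pi) / ((nu - nu0) ** 2 + half ** 2)

def top_hat_integral(f, x0, width, steps=1000):
    """Integral of f(x) times a top-hat of height 1/width centred at x0 (midpoint rule)."""
    dx = width / steps
    start = x0 - width / 2.0
    area = sum(f(start + (i + 0.5) * dx) for i in range(steps)) * dx
    return area / width

# Arbitrary frequency units; all values are illustrative assumptions.
nu0, fwhm = 10.0, 1.0   # centre and width of the atomic line
nu_laser = 10.2         # laser frequency, slightly detuned from line centre

def g(nu):
    return lorentzian(nu, nu0, fwhm)

exact = g(nu_laser)
for width in (1.0, 0.1, 0.01):
    approx = top_hat_integral(g, nu_laser, width)
    print(f"width = {width:5.2f}   integral = {approx:.5f}   g(nu_laser) = {exact:.5f}")
```

As the top-hat narrows, the integral approaches $g(\nu_{\rm laser})$, which is why only the value of the
line-shape function at the laser frequency appears in eqns 8.36 and 8.37.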
Further Reading
Hooker and Webb, Laser Physics: chapters 1, 2, 4, 6.57
Wilson and Hawkes, Optoelectronics: 5.18 , 6.5, and appendix 4
Bransden and Joachain, Atoms, Molecules and Photons, 4.4, 15.1
Demtroder, Atoms, Molecules and Photons: 7.1, 8.1
Smith and King, Optics and Photonics: chapters 15, 17
Hecht, Optics: 7.4.3, 12.1, 13.1
Silfvast, Laser Fundamentals: Chapter 1, chapter 69
Svelto, Principles of Lasers: Chapter 1, 2.14, 7.13
Yariv, Optical Electronics in Modern Communications: 5.13, 6.35
Chapter 9
Lasers II: Cavities and examples
In Chapter 8 we pointed out that a laser works by combining an amplifying medium with a resonant
cavity. In this chapter we study how the cavity affects the properties of the light emitted by the laser,
and then look at a few examples of important lasers, paying particular attention to the mechanism that
produces population inversion.
9.1 Laser cavities
The cavity is an essential part of a laser. It provides the positive feedback that turns an amplifier into
an oscillator, and determines the properties of the beam of light that is emitted, as shown schematically
in Fig. 9.1. This beam is characterized by its transverse and longitudinal mode structure, which are
considered separately below.
9.1.1 Transverse modes
The transverse modes of a laser beam describe the variation of the electrical field across a
cross-sectional slice of the beam. The modes are labelled TEM$_{mn}$ where $m$ and $n$ are integers. TEM
stands for "transverse electro-magnetic". If the field is propagating in the $z$ direction, the $(x, y)$
dependence of the field is given by:
$$ \mathcal{E}_{mn}(x, y) = \mathcal{E}_0\, H_m\!\left( \frac{\sqrt{2}\,x}{w} \right)
   H_n\!\left( \frac{\sqrt{2}\,y}{w} \right) \exp\!\left( -\frac{x^2 + y^2}{w^2} \right), \tag{9.1} $$
where $w$ is the beam waist parameter that determines the size of the beam, and $H_m$ and $H_n$ are
mathematical functions called the Hermite polynomials.$^1$ The first few polynomials are:
$$ H_0(u) = 1, \qquad H_1(u) = 2u, \qquad H_2(u) = 2(2u^2 - 1). \tag{9.2} $$
The most important mode is the TEM$_{00}$ mode. This has a Gaussian radial distribution:
$$ \mathcal{E}_{00}(x, y) = \mathcal{E}_0 \exp\!\left( -\frac{x^2 + y^2}{w^2} \right)
   = \mathcal{E}_0 \exp\!\left( -\frac{r^2}{w^2} \right), \tag{9.3} $$
where $r$ is the distance from the centre of the beam, as shown in Fig. 9.2(a). The TEM$_{00}$ mode is the
closest thing to a ray of light found in nature. It has the smallest divergence of all the modes and can be
focussed to the smallest size. We therefore usually try hard to prevent the other modes from oscillating.
This is achieved by inserting apertures in the cavity which are lossy for the higher-order modes but not
for the smaller TEM$_{00}$ mode.
$^1$ Hermite polynomials also appear in the solution of the Schrödinger equation for a simple harmonic
oscillator.
Figure 9.1: Laser cavity and output beam. The gain medium sits between a high reflector and an output
coupler separated by a distance $L$ along the $z$ axis.
Figure 9.2: (a) Intensity distribution of a TEM$_{00}$ mode, which has a Gaussian profile. (b) Beam
profiles produced by various higher-order laser modes. Note that the side lobes in the $x$ direction for
the TEM$_{21}$ mode are too faint to be seen on this grey scale.
                Diode laser    HeNe laser    Argon ion laser
L               1 mm           30 cm         2 m
n               3.5            1             1
Mode spacing    43 GHz         500 MHz       75 MHz
Table 9.1: Mode spacing for several common lasers.
Figure 9.2(b) compares the beam cross-section for a number of higher-order laser modes with that of
the TEM$_{00}$ mode. Note that the TEM$_{mn}$ mode has $m$ nodes (zeros) in the $x$ direction and $n$
nodes in the $y$ direction. These higher-order modes make pretty pictures, but are not useful for very
much. A well-designed laser will contain apertures that allow only the TEM$_{00}$ mode to oscillate.
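The mode profiles of eqn 9.1 are easy to evaluate numerically. The Python sketch below is a minimal
illustration: it generates the Hermite polynomials from the standard recurrence
$H_{k+1}(u) = 2u H_k(u) - 2k H_{k-1}(u)$ and takes the argument of the polynomials to be $\sqrt{2}x/w$,
which is the usual convention and is assumed here. The function names are hypothetical.

```python
import math

def hermite(n, u):
    """Physicists' Hermite polynomial H_n(u) from H_{k+1} = 2u H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * u
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * u * h - 2.0 * k * h_prev
    return h

def tem_field(m, n, x, y, w, e0=1.0):
    """Field amplitude of the TEM_mn mode at (x, y), following eqn 9.1."""
    s = math.sqrt(2.0) / w
    return e0 * hermite(m, s * x) * hermite(n, s * y) * math.exp(-(x * x + y * y) / w**2)

def tem_intensity(m, n, x, y, w):
    """Relative intensity, taken as the square of the field amplitude."""
    return tem_field(m, n, x, y, w) ** 2

w = 1.0  # beam waist parameter in arbitrary length units
for x in (0.0, 0.5, 1.0):
    print(f"x/w = {x:3.1f}   TEM00: {tem_intensity(0, 0, x, 0.0, w):.4f}"
          f"   TEM10: {tem_intensity(1, 0, x, 0.0, w):.4f}"
          f"   TEM20: {tem_intensity(2, 0, x, 0.0, w):.4f}")
```

The TEM$_{10}$ intensity vanishes at $x = 0$ and the TEM$_{20}$ intensity vanishes at $x = \pm w/2$,
in line with the statement that the TEM$_{mn}$ mode has $m$ nodes in the $x$ direction.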
9.1.2 Longitudinal modes
The longitudinal modes determine the emission spectrum of the laser. The light bouncing repeatedly off
the end mirrors sets up standing waves inside the cavity, as shown in Fig. 9.1. There are nodes (field
zeros) at the mirrors because they have high reflectivities. Thus there must be an integer number of half
wavelengths inside the cavity. If the length of the cavity is $L$, this condition can be written:
$$ L = \text{integer} \times \frac{\lambda}{2} = \text{integer} \times \frac{c}{2n\nu}, \tag{9.4} $$
where $n$ is the average refractive index of the cavity. In gas lasers, $n$ will be very close to unity. It
will also be the case that $n \approx 1$ in a solid-state laser with a short laser rod inside a long
air-filled cavity. Equation 9.4 implies that only frequencies that satisfy:
$$ \nu = \text{integer} \times \frac{c}{2nL} \tag{9.5} $$
will oscillate. Most cavities are much larger than the optical wavelength and thus the value of the integer
in equations 9.4 and 9.5 is very large.$^2$ The most important parameter is the longitudinal mode spacing:
$$ \Delta\nu_{\rm mode} = \frac{c}{2nL}. \tag{9.6} $$
Table 9.1 lists the longitudinal mode spacing for several lasers.
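The entries of Table 9.1 follow directly from eqn 9.6. The short Python sketch below evaluates the mode
spacing for the cavity lengths and refractive indices listed in the table, and also estimates the size of
the integer in eqn 9.4 for a 30 cm HeNe cavity at 632.8 nm. The function names are hypothetical and the
speed of light is rounded to four figures.

```python
C = 2.998e8  # speed of light in vacuum (m/s)

def mode_spacing(length, n=1.0):
    """Longitudinal mode spacing c / (2 n L) in Hz (eqn 9.6)."""
    return C / (2.0 * n * length)

def mode_number(length, wavelength, n=1.0):
    """Approximate integer in eqn 9.4: number of half wavelengths in the cavity.

    wavelength is the vacuum wavelength, so the wavelength in the medium is wavelength / n.
    """
    return round(2.0 * n * length / wavelength)

# Cavity length (m) and refractive index for the lasers in Table 9.1.
cavities = {
    "Diode laser": (1e-3, 3.5),
    "HeNe laser": (0.30, 1.0),
    "Argon ion laser": (2.0, 1.0),
}

for name, (length, n) in cavities.items():
    print(f"{name:16s} {mode_spacing(length, n) / 1e6:10.1f} MHz")

print("HeNe mode integer:", mode_number(0.30, 632.8e-9))
```

For the diode laser, eqn 9.6 with $n = 3.5$ gives roughly 43 GHz; leaving out the refractive index (i.e.
using $c/2L$) would give 150 GHz instead. The HeNe mode integer comes out close to a million, which
illustrates the remark that the integer in eqns 9.4 and 9.5 is very large.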
$^2$ This is not true for microcavity lasers, where we deliberately make the cavity of similar dimensions
to the optical wavelength. In this case the integer would have a value of unity. The use of microcavity
semiconductor lasers is now widespread in optical fibre systems.
Figure 9.3: Multi-mode, single-mode and mode-locked operation. The cavity modes, spaced by $c/2L$ in
frequency, lie under the atomic emission spectrum centred at $\nu_0$. In multi-mode operation the mode
phases are random. Single-mode operation is obtained either by inserting a sharp filter at $\nu_0$ or by
shortening the cavity length so that the spacing becomes $c/2L'$. Mode-locked operation is obtained by
locking the phases together.
9.2 Single-mode, multi-mode, and mode-locked lasers
The gain bandwidth of a laser medium will usually be much wider than the spacing of the longitudinal
modes of the cavity. This leads to a number of ways of operating the laser.
9.2.1 Multi-mode and single-mode lasers
For a given longitudinal mode to oscillate, its frequency must lie within the emission spectrum of the
laser transition. Unless we do something about it, there will be a tendency for all the longitudinal modes
that experience gain to oscillate. Therefore the laser will have multi-mode operation, as illustrated in
Fig. 9.3. As a rough guide, the number of modes that will be oscillating is equal to the gain bandwidth
divided by the mode spacing. Thus for a 30 cm HeNe laser with a gain bandwidth of 1.5 GHz, there will
be three modes oscillating. In a Doppler-broadened emission line such as that from the Neon atoms in a
HeNe laser, the phases of these modes will be random relative to each other because they are emitted by
different atoms.
When a laser runs in multi-mode operation, its spectral bandwidth is not significantly smaller than
that of the light emitted from the same transition in a discharge lamp. For many applications (e.g.
supermarket bar-code readers), this is not an issue. However, for some others, it is. An obvious example is
high-resolution spectroscopy. Other examples include those that rely on having high temporal coherence,
for example: holography and interferometry. This follows from the fact that the temporal coherence is
inversely proportional to the spectral bandwidth. (See Section 9.3 below.) It is therefore interesting to
see if we can make the laser run on just a single longitudinal mode. The spectral linewidth would then
be determined by the quality factor (Q) of the cavity rather than the gain bandwidth. This is called
single-mode operation.
One way to achieve single-mode operation is to shorten the cavity so that the mode spacing exceeds
the gain bandwidth. See Fig. 9.3. In this case only one mode will fall within the emission line of the
transition and the laser will automatically oscillate on only one mode. However, this may not be practical.
For example, in the case of the HeNe laser considered above, we would need to make the cavity shorter
than 10 cm. Such a laser would have very small round-trip gain, and we would probably not be able to
make it oscillate. A better way to obtain single-mode operation is to insert a narrow frequency filter in
the cavity such as a Fabry–Perot etalon. By tuning the spacing of the etalon, the frequency of the single
mode can be changed continuously, which is very useful for high-resolution spectroscopy. The spectral
line width of a single-mode laser is typically a few MHz. This is about a thousand times narrower than
the atomic emission line that produces the light.
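The two HeNe estimates quoted in this section (three oscillating modes for a 30 cm cavity, and a cavity
shorter than 10 cm for automatic single-mode operation) can be reproduced with a few lines of Python. The
sketch below simply applies the rough rule "number of modes = gain bandwidth / mode spacing"; the function
names are hypothetical.

```python
C = 2.998e8  # speed of light in vacuum (m/s)

def number_of_modes(gain_bandwidth, length, n=1.0):
    """Rough number of oscillating modes: gain bandwidth divided by mode spacing."""
    spacing = C / (2.0 * n * length)
    return gain_bandwidth / spacing

def max_single_mode_length(gain_bandwidth, n=1.0):
    """Cavity length at which the mode spacing c / (2 n L) equals the gain bandwidth."""
    return C / (2.0 * n * gain_bandwidth)

gain_bandwidth = 1.5e9  # HeNe gain bandwidth quoted in the text (Hz)

print(f"modes in a 30 cm cavity: {number_of_modes(gain_bandwidth, 0.30):.1f}")
print(f"single-mode cavity length: {max_single_mode_length(gain_bandwidth) * 100:.1f} cm")
```

A cavity shorter than the second result has a mode spacing larger than the gain bandwidth, so that at
most one mode can fall within the emission line.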
Figure 9.4: Mode-locked laser pulses from a cavity with $n = 1$. The pulses emerge from the output
coupler with time separation $2L/c$ and pulse duration $\sim 1/\Delta\nu$.
9.2.2 Mode locking
Lasers can be made to operate continuously or in pulses. The length of the pulse might be determined,
for example, by the duration of the flash-lamp pulse that produces the population inversion, or by the
properties of a Q-switch. (See Section 8.7.) The pulses produced in this way are relatively long, e.g.
tens of nanoseconds at best. However, there is another technique called mode locking that leads to the
emission of a continuous train of very short (ultrashort) pulses. This is the technique we consider here.
Mode locking is the opposite extreme to single-mode operation. In a mode-locked laser we try to
get as many longitudinal modes oscillating as possible, but with all their phases locked together. (See
Fig. 9.3). This contrasts with a multi-mode laser in which many modes are oscillating but with random
phases with respect to each other.
In Appendix 9.7 we prove that the mode-locked operation of a laser corresponds to a single pulse
oscillating around the cavity and getting emitted every time it hits the output coupler, as shown in
Fig. 9.4. The time taken for a pulse to circulate around a cavity of length $L$ with $n = 1$ is $2L/c$.
Therefore we get pulses out of the laser at a repetition rate of $(2L/c)^{-1}$.
The minimum pulse duration is set by the Fourier transform of the gain spectrum:
$$ \Delta t_{\min} \approx \frac{1}{2\pi\,\Delta\nu}, \tag{9.7} $$
where $\Delta\nu$ is the gain bandwidth. This "uncertainty principle" means that to get very short pulses
we need a wide gain bandwidth. Gas lasers are not very good in this context because they are based on
fairly narrow atomic transitions. For example, the bandwidth of the 632.8 nm line in the HeNe laser is
1.5 GHz (see Table 9.2), and so the pulses that can be produced must be at least 0.11 ns long.
The best results have been achieved in tuneable lasers such as dye lasers or titanium-doped sapphire
lasers. The gain bandwidth of the Ti:sapphire laser is nearly $10^{14}$ Hz, and mode-locked Ti:sapphire
lasers routinely produce pulses shorter than 100 fs (1 fs = $10^{-15}$ s), which corresponds to millions of
longitudinal modes oscillating. When the full gain bandwidth of the crystal is used, pulses only a few
femtoseconds long have been produced from this laser.
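The numbers in this section can be checked with the Python sketch below, which reads eqn 9.7 as
$\Delta t_{\min} \approx 1/(2\pi\Delta\nu)$ (this reading reproduces the 0.11 ns quoted for the HeNe laser)
and uses the round-trip time $2L/c$ for the repetition rate. The 1.5 m cavity length is a hypothetical
value chosen only to illustrate the repetition rate.

```python
import math

C = 2.998e8  # speed of light in vacuum (m/s)

def min_pulse_duration(gain_bandwidth):
    """Minimum pulse duration 1 / (2 pi * gain bandwidth) in seconds (eqn 9.7)."""
    return 1.0 / (2.0 * math.pi * gain_bandwidth)

def repetition_rate(length, n=1.0):
    """Pulse repetition rate, the inverse of the round-trip time 2 n L / c (Hz)."""
    return C / (2.0 * n * length)

print(f"HeNe (1.5 GHz):          {min_pulse_duration(1.5e9) * 1e9:.2f} ns")
print(f"Ti:sapphire (1e14 Hz):   {min_pulse_duration(1e14) * 1e15:.1f} fs")
print(f"1.5 m cavity (assumed):  {repetition_rate(1.5) / 1e6:.0f} MHz")
```

The five orders of magnitude between the two gain bandwidths translate directly into five orders of
magnitude in the shortest achievable pulse.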
Mode locking is achieved by two main techniques. With active mode locking, a time-dependent shutter
is inserted in the cavity.$^3$ The shutter is opened briefly every $2L/c$ seconds. Continuous operation of
the laser is impossible, but the mode-locked pulses will be unaffected by the shutter. In passive mode
locking, a saturable absorber is inserted in the cavity. Such absorbers have strong absorption at low
powers and small absorption at high powers. The peak power in the pulsed mode is much higher than in
continuous operation, and thus the cavity naturally selects the pulsed mode.
Mode-locked lasers are widely used in scientific research to study fast processes in physics, chemistry,
and biology. For example, the typical time for a current-carrying electron in a copper wire to interact with
a phonon at room temperature is about 100 fs. Similarly, the early stages of many chemical reactions or
biological processes such as photosynthesis take place in less than $10^{-12}$ s. Another widespread
application of short pulse lasers in biology is in microscopy. It is common practice to obtain images of
biological molecules by tagging them with fluorescent chromophores (e.g. dyes, quantum dots) and then
exciting the sample with a laser in a confocal microscope. The use of mode-locked lasers gives far superior
depth resolution compared to continuous wave (CW) lasers.$^4$
Mode-locked lasers are also useful to telecommunication companies, who are interested in packing
as many bits of information (represented by pulses of light) as possible down their optical fibres. The
shorter the pulses, the higher the data rate. There are also medical applications: it is much cleaner to
use a very short, low-energy, high-peak-power pulse for laser surgery, than a longer pulse with the same
peak power but much higher energy.
$^3$ The time-dependent shutter is typically made by using a high speed acousto-optic modulator.
$^4$ When a mode-locked laser is used, the higher peak power allows the fluorescent chromophore to be
excited by two-photon absorption. The power is only high enough for this to occur at the focus of the
laser, and so only the part of the sample at the focus produces light. With a CW laser, by contrast, the
chromophores are excited by standard one-photon absorption, and the whole depth of the sample emits light.
Source                                       Spectral line width    Coherence time    Coherence length
                                             Δν (Hz)                t_c (s)           l_c
Sodium discharge lamp (D-lines at 589 nm)    5 × 10^11              2 × 10^-12        0.6 mm
Multi-mode HeNe laser (632.8 nm line)        1.5 × 10^9             6 × 10^-10        20 cm
Single-mode HeNe laser (632.8 nm line)       1 × 10^6               10^-6             300 m
Table 9.2: Coherence length of several light sources.
9.3 Coherence of laser light
As mentioned in Section 8.1, laser light has a high degree of both spatial and temporal coherence. The
spatial coherence is related to the phase uniformity across a cross-sectional slice of the beam. When the
laser is running in a well-defined transverse mode, the optical phase across such a slice will be constant.
Hence the spatial coherence follows from the transverse modes, and will be very high when the laser is
running on a single transverse mode.
The temporal coherence of light refers to the time duration over which the phase is constant. In
general, the temporal coherence time $t_c$ is determined by the spectral line width $\Delta\nu$ according to:
$$ t_c \approx \frac{1}{\Delta\nu}. \tag{9.8} $$
Hence the coherence length $l_c$ is given by:
$$ l_c = c\, t_c \approx \frac{c}{\Delta\nu}. \tag{9.9} $$
Typical values of the coherence length for a number of light sources are given in Table 9.2. The figures
explain why it is much easier to do interference experiments with a laser than with a discharge lamp. If
the path difference exceeds $l_c$ you will not get interference fringes, because the light is incoherent.
In the case of the single-mode HeNe laser, you can set up an interferometer in which the path lengths
differ by 300 m, and you will still observe fringes. The long coherence length of laser light is useful in
holography and interferometry.
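The entries of Table 9.2 can be reproduced from eqns 9.8 and 9.9 with the Python sketch below, which takes
the three spectral line widths from the table as input. The function names are hypothetical and the speed
of light is rounded to four figures.

```python
C = 2.998e8  # speed of light in vacuum (m/s)

def coherence_time(linewidth):
    """Coherence time t_c = 1 / (spectral line width), in seconds (eqn 9.8)."""
    return 1.0 / linewidth

def coherence_length(linewidth):
    """Coherence length l_c = c * t_c, in metres (eqn 9.9)."""
    return C * coherence_time(linewidth)

# Spectral line widths (Hz) taken from Table 9.2.
sources = {
    "Sodium discharge lamp": 5e11,
    "Multi-mode HeNe laser": 1.5e9,
    "Single-mode HeNe laser": 1e6,
}

for name, linewidth in sources.items():
    print(f"{name:24s} t_c = {coherence_time(linewidth):.1e} s"
          f"   l_c = {coherence_length(linewidth):.3g} m")
```

Narrowing the line width from that of a discharge lamp to that of a single-mode laser lengthens the
coherence length by the same factor, which is the reason for the range from 0.6 mm to 300 m in the table.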
9.4 Examples of lasers
There are many different types of lasers in common use, and it is not possible to describe all of them
here. Most lasers operate at fixed wavelengths:
• Infrared lasers: CO$_2$ (10.6 µm), erbium (1.55 µm), Nd:YAG (1.064 µm), Nd:glass (1.054 µm);
• Visible lasers: ruby (693 nm), krypton ion (676, 647 nm), HeNe (633 nm), copper vapour (578 nm),
doubled Nd:YAG (532 nm), argon ion (514, 488 nm), HeCd (442 nm);
• Ultraviolet lasers: argon ion (364, 351 nm), tripled Nd:YAG (355 nm), nitrogen (337 nm), HeCd (325 nm),
quadrupled Nd:YAG (266 nm), excimer (308, 248, 193, 150 nm).
Other lasers are tuneable, for example: dye lasers (typical tuning range 100 nm, dyes available from UV
to near infrared); Ti:sapphire lasers (700-1000 nm, doubled: 350-500 nm); free electron lasers (far
infrared to ultraviolet). The most common lasers in widespread use are semiconductor diode lasers. Cheap
and efficient diode lasers are available at blue (400 nm), red (620-670 nm), and near-infrared wavelengths
(700-1600 nm).
In the sections below we consider a few of the more important lasers that are available, following the
general classification according to whether the gain medium is a gas or a solid.$^5$
$^5$ There are relatively few liquid-phase lasers. The most important examples are dye lasers. However,
with the advent of broadly-tuneable high power solid-state lasers such as Ti:sapphire lasers, and the
development of techniques of nonlinear optics to extend their frequency range (see Appendix 9.8), dye
lasers are gradually becoming obsolete.
Figure 9.5: (a) Schematic diagram of a HeNe laser: a discharge tube containing the He + Ne mixture,
driven by a power supply of about 1 kV through a load resistor, sits between a high reflector and an output
coupler. (b) Level scheme for the HeNe laser, showing the 1s$^2$ ground state and the 1s2s ($S = 0$ and
$S = 1$) levels of helium alongside the 1s$^2$2s$^2$2p$^6$ ground state and the 3s, 4s, 5s, 3p and 4p excited
levels of neon, with the 632.8 nm laser transition.
9.5 Gas lasers
9.5.1 The helium-neon (HeNe) laser
Helium-neon lasers consist of a discharge tube inserted between highly reflecting mirrors, as shown
schematically in Fig. 9.5(a). The tube contains a mixture of helium and neon atoms in the approximate
ratio of He:Ne = 5:1. By applying a high voltage across the tube, an electrical discharge can be
induced. The electrons collide with the atoms and put them into excited states. The light is emitted by
the neon atoms, and the purpose of the helium is to assist the population inversion process. To see how
this works we need to refer to the level diagram in Fig. 9.5(b).
Helium has two electrons. In the ground state both electrons are in the 1s level. The first excited state
is the 1s2s configuration. There are two possible energies for this state because there are two possible
configurations of the electron spin: the singlet $S = 0$ and the triplet $S = 1$ terms. The helium atoms are
excited by collisions with the electrons in the discharge tube and cascade down the levels. When they get
to the 1s2s configuration, however, the cascade process slows right down. In the 1s2s → 1s$^2$ transition
one of the electrons jumps from the 2s level to the 1s level. This is forbidden by the $\Delta l = \pm 1$
selection rule. Furthermore, transitions from the 1s2s $S = 1$ level to the 1s$^2$ $S = 0$ ground state are
also forbidden by the $\Delta S = 0$ selection rule. The net result is that all transitions from the 1s2s
levels are strongly forbidden. The 1s2s level therefore has a very long lifetime, and is called
"metastable". See Section 5.5 in Chapter 5 for more details.
Neon has ten electrons in the configuration 1s$^2$2s$^2$2p$^6$. The excited states correspond to the
promotion of one of the 2p electrons to higher levels. This gives the level scheme shown in the diagram.
The symbols of the excited states refer to the level of this single excited electron. By good luck, the 5s
and 4s levels of the neon atoms are almost degenerate with the $S = 0$ and $S = 1$ terms of the 1s2s
configuration of helium. Thus the helium atoms can easily de-excite by collisions with neon atoms in the
ground state according to the following scheme:
$$ {\rm He}^* + {\rm Ne} \to {\rm He} + {\rm Ne}^*. \tag{9.10} $$
The star indicates that the atom is in an excited state. Any small differences in the energy between the
excited states of the two atoms are taken up as kinetic energy.
This scheme leads to a large population of neon atoms in the 5s and 4s excited states. This gives
population inversion with respect to the 3p and 4p levels. It would not be easy to get this population
inversion without the helium because collisions between the neon atoms and the electrons in the tube
would tend to excite all the levels of the neon atoms equally. This is why there is more helium than neon
in the tube.
The main laser transition at 632.8 nm occurs between the 5s level and the 3p level. The lifetime of
the 5s level is 170 ns, while that of the 3p level is 10 ns. This transition therefore easily satisfies the
criterion $\tau_{\mathrm{upper}} > \tau_{\mathrm{lower}}$. (See the discussion of eqn 8.28.) This ensures that atoms do not pile up in the
lower level once they have emitted the laser photons, as this would destroy the population inversion. The
atoms in the 3p level rapidly relax to the ground state by radiative transitions to the 3s level and then
by collisional de-excitation to the original 2p level. Lasing can also be obtained on other transitions: for
example, $5s \to 4p$ at 3391 nm and $4s \to 3p$ at 1152 nm. These are not as strong as the main 632.8 nm
line.
The gain in a HeNe tube tends to be rather low because of the relatively low density of atoms in
the gas (compared to a solid). This is partly compensated by the fairly short lifetime of 170 ns. (See
eqn 8.24.) The round-trip gain may only be a few percent, and so very highly reflecting mirrors are
needed. With relatively small gain, the output powers are not very high: only a few mW. However, the
ease of manufacture makes these lasers extremely common for low-power applications: bar-code
readers, laser alignment tools (theodolites, rifle sights), classroom demonstrations etc. They are gradually
being replaced nowadays by visible semiconductor laser diodes, which are commonly used in laser pointers.
9.5.2 Helium-cadmium lasers
The HeCd laser is another gas laser system based on helium. The population inversion scheme in HeCd
is similar to that in HeNe except that the active medium is $\mathrm{Cd}^+$ ions. The laser transitions occur in the
blue and the ultraviolet at 442 nm, 354 nm and 325 nm. The UV lines are useful for applications that
require short wavelength lasers, such as high-precision printing on photosensitive materials. Examples
include lithography of electronic circuitry and making master copies of compact disks.
9.5.3 Ion lasers
There are several important types of gas lasers that use ions rather than neutral atoms as the gain
medium, for example, the argon-ion laser. The argon ions are produced by collisions with electrons
in a discharge tube. The atomic number of argon is 18, and so the $\mathrm{Ar}^+$ ion has 17 electrons in the
configuration $1s^2\,2s^2\,2p^6\,3s^2\,3p^5$. The excited states of the $\mathrm{Ar}^+$ ion are generated by exciting one of the
five 3p electrons to higher levels, and the most important laser transitions occur between the 4p and 4s
levels. Spin-orbit coupling splits this into a doublet, with emission lines at 488 nm (blue) and 514.5 nm
(green). The krypton-ion laser works by similar principles, and has a strong laser emission line in the
red at 676.4 nm. This red line can be combined with the green and blue lines of the argon-ion laser to
make very colourful laser light shows.
In addition to laser light shows, argon-ion lasers are frequently used for pumping tuneable lasers such
as dye lasers and Ti:sapphire lasers. There are also some medical applications such as laser surgery, and
scientific applications include fluorescence excitation and Raman spectroscopy.
9.5.4 Carbon dioxide lasers
The $\mathrm{CO_2}$ laser is one of the best examples of a molecular laser. The transitions take place between the
vibrational levels of the molecule. The strongest emission lines are in the infrared around 10.6 µm. The
lasers are very powerful, with powers up to several kilowatts possible. Hence they are used in cutting
applications in industry (including the military industry!) and also for medical surgery. The high power
output is a consequence of the fact that stimulated emission becomes more favourable compared to
spontaneous emission at lower frequencies: see eqn 8.14.
A mixture of nitrogen and $\mathrm{CO_2}$ in a ratio of about 4:1 is used in the laser tube. The $\mathrm{N_2}$ molecules
are excited by collisions with electrons, and then transfer their energy to the upper level of the $\mathrm{CO_2}$
molecules. This gives population inversion in much the same sort of way as for the HeNe mixture.
9.6 Solid-state lasers
9.6.1 Ruby lasers
Ruby lasers have historical importance because they were the first successful lasers to operate. Ruby
consists of $\mathrm{Cr^{3+}}$ ions doped into crystalline $\mathrm{Al_2O_3}$ (sapphire) at a typical concentration of around 0.05%
by weight. The $\mathrm{Al_2O_3}$ host crystal is colourless. The light is emitted by transitions of the $\mathrm{Cr^{3+}}$
impurities.
The level scheme for ruby is shown in Fig. 9.6(a). Ruby is a three-level system (see Section 8.8), with
strong absorption bands in the blue and green spectral regions. (Hence the red colour: ruber means red
in Latin.) Electrons are excited to these bands by a powerful flashlamp. These electrons relax rapidly
to the upper laser level by non-radiative transitions in which phonons are emitted. This leads to a large
population in the upper laser level. If the flashlamp is powerful enough, it will be possible to pump more
than half of the atoms from the ground state (level 1) to the upper laser level (level 2). In this case, there
will be population inversion between level 2 and level 1, and lasing can occur if a suitable cavity is
provided. The laser emission is in the red at 694.3 nm.
Figure 9.6: (a) Level diagram for ruby ($\mathrm{Cr^{3+}{:}Al_2O_3}$). (b) Schematic diagram of a ruby laser.
Figure 9.7: (a) Level diagram for the $\mathrm{Nd^{3+}}$ lasers. (b) Schematic diagram of an Nd:YAG laser.
Figure 9.6(b) shows a typical arrangement for a ruby laser. The crystal is inserted inside a powerful
helical flashlamp. Water-cooling prevents damage to the crystal by the intense heat generated by the
lamp. Mirrors at either end of the crystal define the cavity. Reflective coatings can be applied directly to
the end of the rod as shown, or external mirrors can be used (not shown). The lamps are usually driven
in pulsed mode by discharge from a capacitor bank. The pulse energy can be as high as 100 J per pulse.
This is because the upper laser level has a very long lifetime (3 ms) and can store a lot of energy.
9.6.2 Neodymium lasers (Nd:YAG and Nd:glass)
Neodymium ions form the basis for a series of high power solid-state lasers. In the two most common
variants, the $\mathrm{Nd^{3+}}$ ions are doped into either Yttrium Aluminium Garnet (YAG) crystals or into a
phosphate glass host. These two lasers are known as Nd:YAG and Nd:glass respectively. The main laser
transition is in the near-infrared at about 1.06 µm. The wavelength does not change much on varying
the host.
Figure 9.7(a) shows the level scheme for the $\mathrm{Nd^{3+}}$ lasers, which are four-level lasers. Electrons are
excited to the pump bands by absorption of photons from a powerful flashlamp or from a diode laser
operating around 800 nm. The electrons rapidly relax to the upper laser level by phonon emission.
Lasing then occurs on the $^4F_{3/2} \to {}^4I_{11/2}$ transition (see footnote 6). The electrons return to the
ground state by rapid non-radiative decay by phonon emission.
Footnote 6: This transition is strongly forbidden for free atoms. However, the wave functions of the $\mathrm{Nd^{3+}}$ ion get
distorted in the crystal by the electric fields from the neighbouring host atoms, and this relaxes the selection rules.
The Einstein A coefficient of the 1064 nm transition in Nd:YAG is $4.3 \times 10^{3}\ \mathrm{s^{-1}}$.
Figure 9.8: Level diagram for the Ti:sapphire laser.
Figure 9.7(b) shows the cavity arrangement in a flashlamp-pumped system. The rod and lamp are
positioned at the foci of an elliptical reflector. This ensures that most of the photons emitted by the lamp
are incident on the rod to maximize the pumping efficiency. Mirrors at either end of the rod provide the
optical cavity. The laser can be operated in either pulsed or continuous wave mode.
As with the ruby laser, the lifetime of the upper laser level is long: 0.2-0.3 ms, depending on the host.
This long lifetime, which is a consequence of the fact that the laser transition is E1-forbidden, allows
the storage of large amounts of energy. Continuous wave Nd:YAG lasers can easily give 20-30 W, while
pulsed versions can give energies up to 1 J in 10 ns. The pulse energies possible from Nd:glass lasers are
even higher, although they can only operate at lower repetition rates. The Lawrence Livermore Lab in
California uses Nd:glass lasers for fusion research. The pulse energy in these systems is about 10 kJ. With
pulse durations in the 10 ns range, this gives peak powers of about $10^{12}$ W.
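The peak powers quoted above follow from dividing the pulse energy by the pulse duration. The short sketch below repeats that arithmetic; it assumes a rectangular pulse shape, which is a simplification, and the function name `peak_power` is purely illustrative.

```python
def peak_power(pulse_energy_joules, pulse_duration_seconds):
    """Peak power in watts, assuming (as a simplification) a rectangular pulse."""
    return pulse_energy_joules / pulse_duration_seconds


# Values quoted in the text
nd_yag_peak = peak_power(1.0, 10e-9)      # 1 J in 10 ns
nd_glass_peak = peak_power(10e3, 10e-9)   # 10 kJ in 10 ns

print(f"Nd:YAG   peak power ~ {nd_yag_peak:.1e} W")
print(f"Nd:glass peak power ~ {nd_glass_peak:.1e} W")
```

A real pulse has a smooth envelope, so the true peak power differs from this estimate by a shape-dependent factor of order one.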
Nd lasers are extensively used in industry for cutting applications, and in medicine for laser surgery.
They are very rugged and can be used in extreme conditions (e.g. onboard military aircraft).
Frequency-doubled Nd:YAG lasers (see Appendix 9.8) are now gradually replacing argon-ion lasers for pumping
tuneable lasers such as Ti:sapphire (see below).
9.6.3 Ti:sapphire
Titanium-doped sapphire lasers represent the current state of the art in tuneable lasers. The level scheme
is shown in Fig. 9.8. The active transitions occur in the $\mathrm{Ti^{3+}}$ ion. This has one electron in the 3d shell.
In the octahedral environment of the sapphire ($\mathrm{Al_2O_3}$) host, the crystal field splits the five $m_l$ levels of
the 3d shell into a doublet and a triplet. These are labelled as the $^2E$ and $^2T_2$ states in Fig. 9.8 (see
footnote 7). The electron-phonon coupling in Ti:sapphire is very strong, and the $^2E$ and $^2T_2$ states are
broadened into vibronic bands. The absorption of the $^2E$ band peaks in the green-blue spectral region, so
the laser can be pumped by the 488 nm and 514 nm lines of an argon-ion laser. Alternatively, a
frequency-doubled Nd:YAG laser operating at 532 nm (= 1064 nm / 2) can be used.
Electrons excited into the middle of the $^2E$ band rapidly relax by phonon emission to the bottom of
the band. Laser emission can then take place to anywhere in the $^2T_2$ band. The electrons finally relax
to the bottom of the $^2T_2$ band by rapid phonon emission.
The fact that tuning can be obtained over the entire $^2T_2$ band is a very useful feature because it
means that the laser wavelength can be chosen at will. Lasing has in fact been demonstrated all the way
from 690 nm to 1080 nm, i.e. over nearly 400 nm. This is why it makes sense to use one laser to pump
another: we convert a fixed-frequency laser such as the argon-ion or frequency-doubled Nd:YAG into a
tuneable source. Energy conversion efficiencies of up to 25% are possible.
The broad emission bandwidth is also ideal for making short-pulse mode-locked lasers. (See
Section 9.2.2.) The shortest pulses that can be produced are given by $\Delta t \sim 1/(2\pi\Delta\nu)$. With such a broad
emission band, it has been possible to generate pulses shorter than 10 fs.
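To get a feel for the numbers, the sketch below converts the quoted 690 nm to 1080 nm tuning range into a frequency bandwidth and then into a bandwidth-limited pulse duration. The prefactor in the time-bandwidth relation depends on the pulse shape and on how the widths are defined, so both $1/\Delta\nu$ and $1/(2\pi\Delta\nu)$ are printed as order-of-magnitude estimates; treating the whole tuning range as usable bandwidth is an assumption made here for illustration.

```python
import math

C = 2.99792458e8  # speed of light in m/s


def bandwidth_hz(lambda_short_m, lambda_long_m):
    """Frequency bandwidth spanned by a wavelength range."""
    return C * (1.0 / lambda_short_m - 1.0 / lambda_long_m)


# Tuning range quoted in the text for Ti:sapphire
delta_nu = bandwidth_hz(690e-9, 1080e-9)

# Two common order-of-magnitude estimates of the shortest pulse
dt_simple = 1.0 / delta_nu
dt_two_pi = 1.0 / (2.0 * math.pi * delta_nu)

print(f"bandwidth          ~ {delta_nu:.2e} Hz")
print(f"1/(delta nu)       ~ {dt_simple * 1e15:.1f} fs")
print(f"1/(2 pi delta nu)  ~ {dt_two_pi * 1e15:.1f} fs")
```

Both estimates come out at a few femtoseconds or less, which is consistent with the statement that pulses shorter than 10 fs have been generated.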
9.6.4 Semiconductor diode lasers
Semiconductor diode lasers are by far the most common types of lasers. They are used in laser printers,
DVD players, laser pointers, and optical fibre communication systems. The laser consists of a
semiconductor p-n diode cleaved into a small chip, as shown in Fig. 9.9(a). Electrons are injected into the
n-region, and holes into the p-region. At the junction between the n- and p-regions we have both
electrons in the conduction band and holes (i.e. empty states) in the valence band. This creates population
inversion between the conduction and valence bands, and gain is produced at the band gap energy $E_g$ of
the semiconductor. The electrons in the conduction band drop to the empty states in the valence band,
and laser photons with energy $h\nu = E_g$ are emitted. The drive voltage must be at least equal to $E_g/e$,
where $e$ is the electron charge.
Footnote 7 (to Section 9.6.3): This notation might be familiar to the Chemical Physicists. The letters are abbreviations
for German words. E and T label doublet and triplet states. The superscript of 2 refers to the spin degeneracy. Thus
these two states contain ($2 \times 2 + 2 \times 3 = 10$) levels, as we would expect for the 3d states. The subscript of 2 on the
triplet state indicates that it has a particular symmetry.
Figure 9.9: (a) Schematic diagram of the operation of a semiconductor diode laser. (b) Detailed sketch
of a typical GaAs diode laser chip.
The laser cavity is formed by using the cleaved facets of the chips. The refractive index of a typical
semiconductor is in the range 3-4, which gives about 30% reflectivity at each facet. This is enough to
support lasing, even in crystals as short as 1 mm, because the gain in the semiconductor crystal is very
high. A highly reflective coating is often placed on the rear facet to prevent unwanted losses through this
facet and hence reduce the threshold.
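The figure of about 30% can be checked with the normal-incidence Fresnel formula $R = [(n-1)/(n+1)]^2$ for a semiconductor-air interface. The sketch below also estimates the gain needed to reach threshold from the round-trip condition $R_1 R_2 \exp(2gL) = 1$. Both formulas are standard results brought in here as assumptions rather than taken from this section, and internal losses are ignored.

```python
import math


def facet_reflectivity(refractive_index):
    """Normal-incidence Fresnel reflectivity of a semiconductor-air facet."""
    return ((refractive_index - 1.0) / (refractive_index + 1.0)) ** 2


def threshold_gain(r_front, r_rear, cavity_length_m):
    """Gain coefficient (per metre) satisfying R1 * R2 * exp(2 g L) = 1.

    Internal losses are neglected, so this is a lower bound on the real threshold.
    """
    return math.log(1.0 / (r_front * r_rear)) / (2.0 * cavity_length_m)


for n in (3.0, 3.5, 4.0):
    print(f"n = {n}: R = {facet_reflectivity(n):.3f}")

r = facet_reflectivity(3.5)
g_uncoated = threshold_gain(r, r, 1e-3)     # two bare facets, 1 mm chip
g_coated = threshold_gain(r, 0.99, 1e-3)    # illustrative 99% coating on the rear facet
print(f"threshold gain, bare facets : {g_uncoated:.0f} per metre")
print(f"threshold gain, coated rear : {g_coated:.0f} per metre")
```

The second number is lower than the first, which illustrates why coating the rear facet reduces the threshold. The 99% coating value is a hypothetical choice for the example.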
The semiconductor must have a direct band gap to be an efficient light emitter. Silicon has an indirect
band gap, and is therefore not used for laser diode applications. The laser diode industry is based mainly
on the compound semiconductor GaAs, which has a direct band gap at 1.4 eV (890 nm). A typical design
of a GaAs diode laser is shown in Fig. 9.9(b). By using alloys of GaAs, the band gap can be shifted into
the red spectral region for making laser pointers, or further into the infrared to match the wavelength
for lowest losses in optical fibres (1500 nm). Blue laser diodes are made from the wide band gap III-V
semiconductor GaN and its alloys. These lasers are used in Blu-ray systems.
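The correspondence between band gap and emission wavelength follows from $h\nu = E_g$, i.e. $\lambda = hc/E_g$. A minimal sketch of the conversion, using CODATA constant values (the helper names are illustrative):

```python
H = 6.62607015e-34      # Planck constant in J s
C = 2.99792458e8        # speed of light in m/s
E_CHARGE = 1.602176634e-19  # elementary charge in C


def gap_to_wavelength_nm(band_gap_ev):
    """Emission wavelength in nm for photons with energy equal to the band gap."""
    return H * C / (band_gap_ev * E_CHARGE) * 1e9


def wavelength_to_gap_ev(wavelength_nm):
    """Band gap in eV needed to emit at the given wavelength."""
    return H * C / (wavelength_nm * 1e-9 * E_CHARGE)


print(f"GaAs, 1.4 eV   -> {gap_to_wavelength_nm(1.4):.0f} nm")
print(f"fibre, 1500 nm -> {wavelength_to_gap_ev(1500):.2f} eV")
```

The first line reproduces the near-infrared wavelength quoted for GaAs to within rounding, and the second shows that a telecom emitter needs a band gap a little above 0.8 eV.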
The power conversion efficiency of electricity into light in a diode laser is very high, with figures of
25% typically achieved. This compares with typical efficiencies of < 0.1% in gas lasers. Since the laser
chips are so small, it is possible to make high power diode lasers by running many GaAs chips in parallel.
Laser power outputs over 20 W can easily be achieved in this way. These high power laser diodes are very
useful for pumping Nd:YAG lasers.
9.7 Appendix: mathematics of mode-locking
The electric field of the light emitted by a multi-mode laser is given by:
$$ \mathcal{E}(t) = \sum_m \mathcal{E}_m \exp\left[ i(\omega_m t + \phi_m) \right] , \tag{9.11} $$
where the sum is over all the longitudinal modes that are oscillating. $\omega_m$ is the angular frequency of the
$m$th mode ($= m\pi c/L$, for $n = 1$), and $\phi_m$ is its optical phase. In multi-mode operation all the phases
of the modes are random, and not much can be done with the summation. However, in a mode-locked
laser, all the phases are the same (call it $\phi_0$) because they have been locked together. This allows us to
evaluate the summation.
We assume that all the modes have approximately equal amplitudes $\mathcal{E}_0$. The output field of the
mode-locked laser is then given by:
$$ \mathcal{E}(t) = \mathcal{E}_0\, e^{i\phi_0} \sum_m e^{i\omega_m t} . \tag{9.12} $$
Figure 9.10: (a) and (b): frequency doubling. (c) Sum frequency mixing. (d) Phase matching. (e)
Parametric down conversion. The subscripts p, s, and i stand, respectively, for pump, signal and
idler.
Let us suppose that there are $N$ modes oscillating, and the frequency of the middle mode is $\omega_0$. This
gives the field as:
$$ \mathcal{E}(t) = \mathcal{E}_0\, e^{i\phi_0} \sum_{m=-(N-1)/2}^{+(N-1)/2} \exp\left( i(\omega_0 + m\pi c/L)t \right) = \mathcal{E}_0\, e^{i\phi_0}\, e^{i\omega_0 t} \sum_{m=-(N-1)/2}^{+(N-1)/2} \exp\left( i m \frac{\pi c}{L} t \right) . \tag{9.13} $$
This type of summation is frequently found in the theory of diffraction gratings. It can be evaluated by
standard techniques (see footnote 8). We are actually interested in the time dependence of the output
power, which is given by:
$$ P(t) \propto \mathcal{E}(t)\,\mathcal{E}^*(t) . \tag{9.14} $$
The final answer is:
$$ P(t) \propto \frac{\sin^2(N\pi c t/2L)}{\sin^2(\pi c t/2L)} . \tag{9.15} $$
This function has big peaks whenever $t = \text{integer} \times 2L/c$ and is small at all other times. Thus the output
consists of pulses separated in time by $2L/c$. The duration of the pulse is approximately given by the time
for the numerator to go to zero after one of the major peaks. This time is $\Delta t = 2L/Nc$. The frequency
bandwidth is equal to (number of modes oscillating) $\times$ (spacing between modes), i.e. $\Delta\nu = N \times c/2L$.
Thus $\Delta t\,\Delta\nu \sim 1$, as expected from the uncertainty principle given in eqn 9.7.
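The closed form in eqn 9.15 can be checked numerically by adding up the phase-locked modes directly. The sketch below does this for an odd number of equal-amplitude modes; the cavity length of 1.5 m and the choice of 11 modes are arbitrary illustrative values, not parameters taken from the text.

```python
import cmath
import math

C = 2.99792458e8  # speed of light in m/s


def power_direct(t, n_modes, cavity_length):
    """|sum over modes|^2 for n_modes (odd) equal-amplitude, phase-locked modes."""
    half = (n_modes - 1) // 2
    total = sum(
        cmath.exp(1j * m * math.pi * C * t / cavity_length)
        for m in range(-half, half + 1)
    )
    return abs(total) ** 2


def power_closed(t, n_modes, cavity_length):
    """sin^2(N pi c t / 2L) / sin^2(pi c t / 2L), with the limit N^2 at the peaks."""
    x = math.pi * C * t / (2.0 * cavity_length)
    denominator = math.sin(x)
    if abs(denominator) < 1e-12:
        return float(n_modes**2)
    return (math.sin(n_modes * x) / denominator) ** 2


N_MODES = 11          # illustrative
LENGTH = 1.5          # cavity length in metres, illustrative
round_trip = 2.0 * LENGTH / C

for fraction in (0.0, 1.0 / N_MODES, 0.37, 1.0):
    t = fraction * round_trip
    print(
        f"t = {fraction:.3f} x 2L/c: direct = {power_direct(t, N_MODES, LENGTH):9.4f}, "
        f"closed form = {power_closed(t, N_MODES, LENGTH):9.4f}"
    )
```

The peaks at $t = 0$ and $t = 2L/c$ have height $N^2$, and the first zero falls at $t = 2L/Nc$, in line with the pulse spacing and duration derived above.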
9.8 Appendix: frequency conversion by nonlinear optics
It was discovered very early on in the history of lasers that certain crystals could double the frequency
of laser light, as shown in Fig. 9.10(a). This effect, which is known as frequency doubling, works
by combining two photons at frequency $\omega$ to produce a single photon at frequency $2\omega$, as shown in
Fig. 9.10(b). It occurs when a nonlinear crystal is driven by the intense light field produced by a
powerful laser. The crystal must be non-centro-symmetric: i.e., belong to a crystal class that does
not have inversion symmetry. Beta-barium borate (BBO), potassium dihydrogen phosphate (KDP) and
lithium niobate are well-known examples of such crystals.
Frequency doubling is a specific example of a more general effect called nonlinear frequency mixing.
In nonlinear frequency mixing, the nonlinear crystal is driven by two intense waves at angular frequencies
$\omega_1$ and $\omega_2$, and a third wave is generated at the sum of their frequencies, as shown in Fig. 9.10(c):
$$ \omega_3 = \omega_1 + \omega_2 . \tag{9.16} $$
Frequency doubling corresponds to the case where $\omega_1 = \omega_2$. The condition in eqn 9.16 is equivalent to
energy conservation in the photon conversion process. The photon momentum must also be conserved,
as shown in Fig. 9.10(d), which requires that:
$$ \mathbf{k}_3 = \mathbf{k}_1 + \mathbf{k}_2 , \tag{9.17} $$
where $\mathbf{k}_3$, $\mathbf{k}_1$ and $\mathbf{k}_2$ are the respective wave vectors inside the crystal. This condition is called phase
matching, and is, in general, very hard to satisfy. Nonlinear frequency mixing therefore only works
efficiently for the very specific wavelengths that satisfy the phase-matching condition. These wavelengths
are selected by the orientation of the crystal.
Footnote 8 (to Section 9.7): Remember that $e^{ab} = (e^{a})^{b}$, and that $\sum_{j=0}^{n-1} r^{j} = (1 - r^{n})/(1 - r)$.
The nonlinear process works equally well in reverse, and a single pump photon can be split into
two photons of lower frequency, called the signal and idler photons, subject to energy conservation:
$$ \omega_p = \omega_s + \omega_i . \tag{9.18} $$
This process is called parametric down conversion and is illustrated in Fig. 9.10(e). Note that there
is an infinite number of combinations of signal and idler frequencies that can satisfy eqn 9.18, and the
actual frequencies that are produced are determined by the phase-matching condition. Down conversion
is a convenient way to generate tuneable radiation from a fixed-frequency laser, and is now widely used
to extend the range of frequencies available from lasers.
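Because frequencies add in these processes, the corresponding vacuum wavelengths combine as reciprocals: $1/\lambda_3 = 1/\lambda_1 + 1/\lambda_2$ for sum-frequency mixing and $1/\lambda_i = 1/\lambda_p - 1/\lambda_s$ for down conversion. The sketch below applies only these energy-conservation conditions; it says nothing about phase matching, and the 800 nm signal wavelength is an arbitrary illustrative choice.

```python
def sum_frequency_wavelength(lambda_1_nm, lambda_2_nm):
    """Wavelength of the wave generated at omega_3 = omega_1 + omega_2."""
    return 1.0 / (1.0 / lambda_1_nm + 1.0 / lambda_2_nm)


def idler_wavelength(lambda_pump_nm, lambda_signal_nm):
    """Idler wavelength from omega_p = omega_s + omega_i."""
    if lambda_signal_nm <= lambda_pump_nm:
        raise ValueError("the signal photon must have lower energy than the pump photon")
    return 1.0 / (1.0 / lambda_pump_nm - 1.0 / lambda_signal_nm)


# Frequency doubling of Nd:YAG: omega_1 = omega_2
doubled = sum_frequency_wavelength(1064.0, 1064.0)
print(f"doubled 1064 nm -> {doubled:.0f} nm")

# Down conversion of a 532 nm pump with a signal chosen at 800 nm (illustrative)
idler = idler_wavelength(532.0, 800.0)
print(f"532 nm pump, 800 nm signal -> idler at {idler:.0f} nm")
```

Which signal and idler pair is actually produced depends on the phase-matching condition, which is not modelled here.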
It should be pointed out that the frequency conversion processes that are considered in this appendix
are examples of phenomena that are well-known in classical nonlinear optics. The description in terms of
photons is helpful, but not necessary: all of the effects can have classical explanations. Quantum effects
do show up when these nonlinear processes are considered at the single-photon level, but this is not the
regime that is being considered when the driving field is an intense laser beam.
Reading
Hooker and Webb, Laser Physics: 6.3, chapter 7, 8.3, chapters 9, 11
Wilson and Hawkes, Optoelectronics: 5.9, 5.10.13, 6.13
Bransden and Joachain, Physics of Atoms and Molecules: 15.1
Demtröder, Atoms, Molecules and Photons: 8.24, 8.6
Smith and King, Optics and Photonics: 15.2, 15.7, 15.9, chapter 16, 17.4
Hecht, Optics: 13.1
Yariv, Optical Electronics in Modern Communications: 6.67, chapter 7
Silfvast, Laser Fundamentals: 2.4, chapters 10-15
Svelto, Principles of Lasers: 7.7-8, 8.6, chapters 9-10
Chapter 10
Laser cooling of atoms
10.1 Introduction
The resonant force between an atom and a light field was first observed in 1933, when Frisch measured
the deflection of a sodium beam caused by a sodium lamp shining on the side of the beam. The invention
of lasers opened up new possibilities, and the first laser cooling experiments were carried out in the
1980s. The importance of this work was recognized by the award of the Nobel Prize for Physics in 1997
to three of the pioneers of the field: Steven Chu, Claude Cohen-Tannoudji, and William D. Phillips.
There are two aspects of laser cooling that make it particularly remarkable.
1. It is highly surprising that the technique works at all. We would normally expect a powerful laser
to cause heating rather than cooling. This makes us realize that the technique will only work when
special conditions are satisfied. These will be discussed in the rest of this chapter.
2. The very low temperatures achieved by laser cooling are extremely impressive, but this in itself is
not the main point. Techniques for achieving very low temperatures have been used for decades by
condensed matter physicists. For example, commercial dilution refrigerators routinely achieve
temperatures in the milli-Kelvin range, and as early as the 1950s, Nicholas Kurti and co-workers at
Oxford University used adiabatic demagnetisation to achieve nuclear spin temperatures in the
micro-Kelvin range. The novelty of laser cooling is that it produces an ultracold gas of atoms,
in contrast to the condensed matter techniques which all work on liquids or solids. These ultracold
atoms only interact weakly with each other, which makes it possible to study the light-matter
interaction with unsurpassed precision.
These aspects of laser cooling have given rise to a whole host of related benefits. Atomic clocks have been
made with ever greater accuracy, and a whole range of new quantum phenomena have been discovered.
The most spectacular of these is Bose-Einstein condensation, which was observed for the first time
in 1995.
10.2 Gas temperatures
In order to understand how laser cooling works, we first need to clarify how the temperature of a gas
of atoms is measured. The key point is the link between the thermal motion of the atoms and the
temperature. Starting from the Maxwell-Boltzmann distribution (cf. eqn 2.40), it is possible to define
a number of different characteristic velocities for the gas. The simplest of these is the root-mean-square
(rms) velocity, which can be evaluated by remembering the principle of equipartition of energy. This
states that the average thermal energy per degree of freedom is equal to $\frac{1}{2} k_B T$. For an atom of mass $m$,
each component of the velocity must therefore satisfy:
$$ \tfrac{1}{2} m \overline{v_i^2} = \tfrac{1}{2} k_B T , \tag{10.1} $$
which implies that the rms velocity is given by:
$$ \tfrac{1}{2} m (v^{\mathrm{rms}})^2 = \tfrac{3}{2} k_B T . \tag{10.2} $$
Figure 10.1: In Doppler cooling, the laser frequency is tuned below the atomic resonance by $\delta$, so that
$\nu_L = \nu_0 - \delta$. The frequency seen by an atom moving towards the laser is Doppler shifted up by $\nu_0 (v_x/c)$.
We therefore conclude that (see footnote 1):
$$ v_x^{\mathrm{rms}} = \sqrt{\frac{k_B T}{m}} , \tag{10.3} $$
$$ v^{\mathrm{rms}} = \sqrt{\frac{3 k_B T}{m}} . \tag{10.4} $$
These simple relationships allow us to work out, for example, that the atoms in a typical gas at room
temperature jostle about in a random way with thermal velocities of around 1000 km/h. The random
thermal motion is the cause of the Doppler broadening of spectral lines considered in Section 2.10.
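As a concrete illustration of eqns 10.3 and 10.4, the sketch below evaluates both velocities for sodium at 300 K. The choice of sodium and the atomic mass of 22.99 u are assumptions made for the example; the constants are CODATA values.

```python
import math

K_B = 1.380649e-23               # Boltzmann constant in J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def rms_velocity_component(mass_kg, temperature_k):
    """Eqn 10.3: rms value of one velocity component."""
    return math.sqrt(K_B * temperature_k / mass_kg)


def rms_velocity(mass_kg, temperature_k):
    """Eqn 10.4: rms speed including all three components."""
    return math.sqrt(3.0 * K_B * temperature_k / mass_kg)


sodium_mass = 22.99 * ATOMIC_MASS_UNIT   # assumed atomic mass of sodium
vx = rms_velocity_component(sodium_mass, 300.0)
v = rms_velocity(sodium_mass, 300.0)

print(f"v_x rms = {vx:.0f} m/s  ({vx * 3.6:.0f} km/h)")
print(f"v rms   = {v:.0f} m/s  ({v * 3.6:.0f} km/h)")
```

Both values are of order 1000 km/h, which is the scale quoted in the text for a gas at room temperature.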
The link between temperature and the velocity distribution tells us that we can cool the gas if we
can slow the atoms down, which is the strategy adopted in laser cooling experiments. Furthermore, the
temperature of the gas can be inferred from a measurement of the velocity distribution of the atoms.
This is the method that is used to determine the temperature of an ultra cold gas cooled by a laser.
10.3 Doppler Cooling
10.3.1 The laser cooling process
Consider an atom emitting at $\nu_0$ moving in the $+x$ direction towards a laser of frequency $\nu_L$ with velocity
$v_x$, as shown in Fig. 10.1. The laser is tuned so that its frequency is below the resonance line by an
amount $\delta$:
$$ \nu_L = \nu_0 - \delta . \tag{10.5} $$
The Doppler-shifted frequency $\nu_L^{\mathrm{observed}}$ of the laser in the atom's frame of reference is given by:
$$ \nu_L^{\mathrm{observed}} = \nu_L \left( 1 + \frac{v_x}{c} \right) = (\nu_0 - \delta) \left( 1 + \frac{v_x}{c} \right) = \nu_0 - \delta + \frac{v_x}{c}\,\nu_0 - \delta\,\frac{v_x}{c} . \tag{10.6} $$
The last term is small because $\delta \ll \nu_0$ and $v_x \ll c$. Hence if we choose
$$ \delta = \nu_0\, \frac{v_x}{c} , \tag{10.7} $$
we find $\nu_L^{\mathrm{observed}} = \nu_0$. This situation is depicted in Fig. 10.2(a). The laser is in resonance with atoms
moving in the $+x$ direction, but not with those moving away or obliquely. For sodium at 300 K with
$v_x \approx 330\ \mathrm{m\,s^{-1}}$, we need to choose $\delta = 560$ MHz for the D-lines at 589 nm. This means that only those
atoms moving towards the laser absorb photons from the laser beam.
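The detuning quoted for sodium can be reproduced from eqn 10.7. Since $\nu_0 = c/\lambda$, the condition reduces to $\delta = v_x/\lambda$. A minimal check using the values given in the text (the function name is illustrative):

```python
C = 2.99792458e8  # speed of light in m/s


def doppler_detuning_hz(velocity_m_s, wavelength_m):
    """Eqn 10.7: delta = nu_0 * v_x / c, with nu_0 = c / wavelength."""
    resonance_frequency = C / wavelength_m
    return resonance_frequency * velocity_m_s / C


# Sodium D-lines at 589 nm, atoms moving at 330 m/s towards the laser
delta = doppler_detuning_hz(330.0, 589e-9)
print(f"required detuning = {delta / 1e6:.0f} MHz")
```

The detuning is tiny compared with the optical frequency of about $5 \times 10^{14}$ Hz, which justifies dropping the last term of eqn 10.6.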
Now consider what happens after the atom has absorbed a photon from the laser beam. The atom
goes into an excited state and then re-emits another photon by spontaneous emission. This occurs on
average after a time $\tau$ (the radiative lifetime), and the direction of the emitted photon is random. The
absorption-emission cycle is illustrated schematically in Fig. 10.2(b).
Repeated absorption-emission cycles generate a net force in the same direction as the laser beam,
that is, the $-x$ direction. This happens because each photon of wavelength $\lambda$ has a momentum of $h/\lambda$.
Conservation of momentum demands that every time a photon is absorbed from the laser beam the
momentum of the atom changes by $(-h/\lambda)$. On the other hand, the change of momentum due to the
recoil of the atom after spontaneous emission averages to zero, because the photons are emitted in random
directions. Hence the net change of momentum per absorption-emission cycle is given by:
$$ \Delta p_x = -\frac{h}{\lambda} . \tag{10.8} $$
Footnote 1 (to Section 10.2): The rms velocity of the atoms in a beam effusing from a hot oven differs from eqn 10.3 by
a factor of 2, and the most probable velocity by a factor of $\sqrt{3}$. These numerical factors arise from the fact that the
probability of escaping from the oven is related to the velocity, which modifies the probability distribution of the atoms
in the beam. These finer details need not concern us here.
Figure 10.2: Doppler cooling. (a) Doppler-shifted laser frequency in the rest frame of the atom. A laser
with frequency $\nu_0 - \delta$ is in resonance with the atoms when they are moving towards the laser, but not if
they are moving sideways or away, if $\delta = \nu_0 (v_x/c)$. (b) An absorption-emission cycle. (1) A laser photon
impinges on the atom. (2) The atom absorbs the photon and goes into an excited state. (3) The atom
re-emits a photon in a random direction by spontaneous emission after a time $\tau$.
If the laser intensity is large, then the probability for absorption will be large, and the absorption process
will be fast. Hence the time to complete the absorption-emission cycle is determined by the radiative
lifetime $\tau$. The maximum force exerted on the atom is thus given by:
$$ F_x = \frac{dp_x}{dt} = \frac{\Delta p_x}{\tau} = -\frac{h}{\lambda\tau} , \tag{10.9} $$
and the deceleration is given by
$$ \dot{v}_x = \frac{F_x}{m} = -\frac{h}{m\lambda\tau} . \tag{10.10} $$
For the sodium D-lines with $\lambda = 589$ nm and $\tau = 16$ ns, we find $F_x = -7.0 \times 10^{-20}$ N and
$\dot{v}_x = -1.8 \times 10^{6}\ \mathrm{m\,s^{-2}}$, which has a magnitude of about $10^{5}\,g$.
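The magnitudes of the force and deceleration in eqns 10.9 and 10.10 can be reproduced as follows. The sodium mass of 22.99 u and the standard value of $g$ are assumptions introduced for the example; the wavelength and lifetime are the values quoted in the text.

```python
H = 6.62607015e-34                    # Planck constant in J s
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg
G_STANDARD = 9.80665                  # standard gravity in m/s^2


def max_scattering_force(wavelength_m, lifetime_s):
    """Magnitude of eqn 10.9: one photon momentum h/lambda per lifetime."""
    return H / (wavelength_m * lifetime_s)


def max_deceleration(mass_kg, wavelength_m, lifetime_s):
    """Magnitude of eqn 10.10."""
    return max_scattering_force(wavelength_m, lifetime_s) / mass_kg


sodium_mass = 22.99 * ATOMIC_MASS_UNIT  # assumed atomic mass of sodium
force = max_scattering_force(589e-9, 16e-9)
decel = max_deceleration(sodium_mass, 589e-9, 16e-9)

print(f"|F_x|      = {force:.2e} N")
print(f"|dv_x/dt|  = {decel:.2e} m/s^2  (~{decel / G_STANDARD:.1e} g)")
```

The force itself is minute, but because the atomic mass is so small the resulting deceleration is about five orders of magnitude larger than gravity.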
The number of absorption-emission cycles required to stop the atom is given by:
$$ N_{\mathrm{stop}} = \frac{m u_x}{|\Delta p_x|} = \frac{m u_x \lambda}{h} , \tag{10.11} $$
where $u_x$ is the initial velocity of the atom. This sets a minimum time for the laser beam to slow the
atoms to a halt:
$$ t_{\min} = N_{\mathrm{stop}} \times \tau = \frac{m u_x \lambda \tau}{h} . \tag{10.12} $$
In this time, the atoms move a minimum distance $d_{\min}$ given by:
$$ 0 - u_x^2 = 2\, \dot{v}_x\, d_{\min} , \tag{10.13} $$
where $\dot{v}_x$ is the deceleration given by eqn 10.10, and we have assumed that the final velocity of the atom
is very small. This gives:
$$ d_{\min} = -\frac{u_x^2}{2 \dot{v}_x} = \frac{m \lambda \tau u_x^2}{2h} . \tag{10.14} $$
For our standard sodium example with $u_x = 330\ \mathrm{m\,s^{-1}}$, we find $N_{\mathrm{stop}} = 1.1 \times 10^{4}$, $t_{\min} = 0.18$ ms and
$d_{\min} = 0.03$ m.
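The three stopping figures can be checked with a few lines of arithmetic based on eqns 10.11, 10.12 and 10.14. As before, the sodium mass of 22.99 u is an assumed input, and stimulated emission is ignored.

```python
H = 6.62607015e-34                    # Planck constant in J s
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def stopping_parameters(mass_kg, initial_velocity, wavelength_m, lifetime_s):
    """Return (N_stop, t_min, d_min) from eqns 10.11, 10.12 and 10.14."""
    n_stop = mass_kg * initial_velocity * wavelength_m / H
    t_min = n_stop * lifetime_s
    d_min = mass_kg * wavelength_m * lifetime_s * initial_velocity**2 / (2.0 * H)
    return n_stop, t_min, d_min


sodium_mass = 22.99 * ATOMIC_MASS_UNIT  # assumed atomic mass of sodium
n_stop, t_min, d_min = stopping_parameters(sodium_mass, 330.0, 589e-9, 16e-9)

print(f"N_stop = {n_stop:.2e} cycles")
print(f"t_min  = {t_min * 1e3:.2f} ms")
print(f"d_min  = {d_min * 100:.1f} cm")
```

Note that $d_{\min} = u_x t_{\min}/2$, as expected for uniform deceleration from $u_x$ to rest.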
The analysis above ignores stimulated emission. The atom in the excited state (step 2 in Fig. 10.2(b))
can be triggered to emit a photon by stimulated emission from other impinging laser photons. The
stimulated photon will be emitted in the same direction as the incident photon, and the photon recoil
exactly cancels the momentum kick given by the absorption process. When stimulated emission is included
in the analysis, the force is reduced by a factor of two at high laser powers. This happens because the
populations of levels 1 and 2 equalize at a value of $N_0/2$, where $N_0$ is the total number of atoms. The
final result is that the time to stop the atoms and the distance travelled in that time are both doubled.
10.3.2 The Doppler limit temperature
At first sight, we might think that we would be able to completely stop the atoms by the Doppler
cooling technique. However, the minimum temperature that can be achieved is set by the uncertainty
principle. The cooling effect only works if we have the right detuning frequency for the particular
velocity. However, from eqn 2.33 we see that the radiative lifetime of the transition causes broadening.
This gives rise to an intrinsic uncertainty in the energy of the atom, and we will therefore never be able
to cancel the velocity to an accuracy better than that set by $\Delta\nu_{\mathrm{lifetime}}$:
$$ \tfrac{1}{2} m v_x^2 \sim \tfrac{1}{2} h \Delta\nu_{\mathrm{lifetime}} = \tfrac{1}{2} h\, \frac{1}{2\pi\tau} = \frac{\hbar}{2\tau} . \tag{10.15} $$
For sodium with $\tau = 16$ ns, the minimum thermal velocity $v_{\min}$ is thus around $0.4\ \mathrm{m\,s^{-1}}$. The minimum
temperature is then given by:
$$ \tfrac{1}{2} k_B T_{\min} = \tfrac{1}{2} m v_{\min}^2 . \tag{10.16} $$
On substituting from eqn 10.15 we therefore find:
$$ T_{\min} \sim \frac{\hbar}{k_B \tau} . \tag{10.17} $$
A more rigorous analysis gives an extra factor of two:
$$ T_{\min} = \frac{\hbar}{2 k_B \tau} . \tag{10.18} $$
This minimum temperature is called the Doppler limit. The Doppler limit of the sodium D-lines is
$2.4 \times 10^{-4}$ K, i.e. 240 µK.
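The sodium figures can be checked directly from eqns 10.15 and 10.18. The sketch below uses the 16 ns lifetime quoted in the text and an assumed sodium mass of 22.99 u.

```python
import math

HBAR = 1.054571817e-34                # reduced Planck constant in J s
K_B = 1.380649e-23                    # Boltzmann constant in J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def minimum_velocity(mass_kg, lifetime_s):
    """Velocity from (1/2) m v^2 ~ hbar / (2 tau), i.e. eqn 10.15."""
    return math.sqrt(HBAR / (mass_kg * lifetime_s))


def doppler_limit_temperature(lifetime_s):
    """Eqn 10.18: T_min = hbar / (2 k_B tau)."""
    return HBAR / (2.0 * K_B * lifetime_s)


sodium_mass = 22.99 * ATOMIC_MASS_UNIT  # assumed atomic mass of sodium
v_min = minimum_velocity(sodium_mass, 16e-9)
t_doppler = doppler_limit_temperature(16e-9)

print(f"v_min = {v_min:.2f} m/s")
print(f"T_min = {t_doppler * 1e6:.0f} microkelvin")
```

The Doppler limit depends only on the radiative lifetime, not on the atomic mass, whereas the corresponding velocity does depend on the mass.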
10.4 Experimental considerations
Efficient cooling of the atoms requires that the laser should exert the optimal force on the atoms, which
occurs when the laser is detuned by the amount set out in eqn 10.7. However, the velocity of the atoms
decreases as the atoms cool, which suggests two possible strategies to achieve low temperatures:
1. Tune the laser frequency in a programmed way as the atoms slow down.
2. Keep the laser frequency fixed and tune the transition frequency.
The first method is called chirp cooling, in analogy to the chirping sound made by birds, in which the
frequency of the sound changes during the birdsong. Early experiments used tunable dye lasers, but more
modern experiments on cesium use tunable semiconductor diode lasers.
An ingenious approach that follows the second method is shown in Fig. 10.3. The sodium atoms are
produced by heating sodium metal in an oven to 450
B
B(x)M
J
, (10.19)
where $B(x)$ is the spatially varying magnetic field. For the $^2S_{1/2}$ ground state of the 3s electron in
sodium, we have pure spin angular momentum, and hence $g_J = 2$ and $M_J = \pm 1/2$. If the laser is
Figure 10.3: William D. Phillips' apparatus at the NIST laboratory, USA, to stop a beam of sodium
atoms. The frequency of the cooling laser is fixed, and the transition energy of the atoms is shifted in
a controlled way by the Zeeman effect using a tapered solenoid. The trapping coils prevent the atoms
falling under gravity. A probe laser is used to measure the velocity distribution of the cooled atoms.
Labelled components: sodium oven, tapered solenoid, cooling laser, trapping coils, observation region,
probe laser, collection optics, detector.
tuned to $\nu_0$, then the laser cooling condition given in eqn 10.7 is satisfied for the $M_J = +1/2$ state when
$\mu_B B(x) = h \nu_0 (v_x/c)$. The solenoid was therefore designed so that the reduction of the field strength
compensates for the reduction of the velocity as the atoms slow down due to the laser cooling process.
Coils were added at the end to prevent the ultra-cold atoms falling out of the apparatus under gravity.
The atoms with $M_J = -1/2$ are not cooled and escape from the apparatus.
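The condition $\mu_B B(x) = h\nu_0 (v_x/c)$ fixes the field needed at each point of the solenoid once the velocity there is known. The sketch below evaluates it using $\nu_0/c = 1/\lambda$. The initial velocity of 1000 m/s, the solenoid length of 1 m, and the constant-deceleration profile $v(x) = v_0\sqrt{1 - x/L}$ are assumptions made for illustration and are not taken from the notes; the function names are likewise hypothetical.

```python
H = 6.62607015e-34        # Planck constant, J s
MU_B = 9.2740100783e-24   # Bohr magneton, J/T


def zeeman_field(velocity, wavelength):
    """Field satisfying mu_B B = h nu_0 (v/c) = h v / lambda."""
    return H * velocity / (MU_B * wavelength)


def field_profile(x, length, v0, wavelength):
    """B(x) for an ASSUMED constant deceleration, v(x) = v0 sqrt(1 - x/L)."""
    v = v0 * (1.0 - x / length) ** 0.5
    return zeeman_field(v, wavelength)


wavelength_na = 589e-9   # sodium D-line wavelength, m
v0 = 1000.0              # hypothetical initial atom velocity, m/s
length = 1.0             # hypothetical solenoid length, m

for x in (0.0, 0.5, 0.75, 1.0):
    b = field_profile(x, length, v0, wavelength_na)
    print(f"x = {x:.2f} m  ->  B = {b * 1e3:.1f} mT")
```

With these assumed numbers the field starts at roughly a tenth of a tesla and tapers to zero, which is the qualitative behaviour of the tapered solenoid in Fig. 10.3.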
The properties of the cooled atoms can be measured by a second laser. This excites fluorescence which
is collected and imaged onto a detector. The velocity distribution can be measured by the time-of-flight
technique in which we turn the cooling laser off and then watch the gas of trapped atoms expand as a
function of time.
10.5 Optical molasses and magneto-optical traps
The arrangement with a single laser beam shown in Fig. 10.1 is able to stop the atoms moving in the
positive direction for one of the components of the velocity (i.e. the $+x$ direction). To stop the atoms
in both directions for all three velocity components (i.e. the $\pm x$, $\pm y$ and $\pm z$ directions), we need a six-beam
arrangement as shown in Fig. 10.4(a). This counter-propagating six-beam technique was pioneered
by Steven Chu and co-workers at Bell Laboratories in 1985, and given the name "optical molasses".
Molasses is the American word for treacle, and it gives a good description of how the Doppler cooling
force acts like a viscous medium for the trapped atoms.
The optical molasses experiment becomes a magneto-optical trap when magnetic coils are added
above and below the intersection point, as shown in Fig. 10.4(b). The current flows in opposite directions
through the coils, which produces a quadrupole field, where the field at the centre of the apparatus
cancels. However, on moving a small distance from the centre, the field increases according to:
$$B(x, y, z) = B_0 \left( x^2 + y^2 + 4z^2 \right)^{1/2} , \tag{10.20}$$
where $(x, y, z)$ is the position relative to the centre. The atoms with $M_J = +1/2$ have a Zeeman energy
of $+\mu_B B$, and so experience an increase in energy as they move away from the intersection point of the
lasers. In other words, they sit in a potential well, with the minimum at the origin. This has the effect
of trapping the atoms close to the origin if their thermal energy is less than the depth of the potential
well.² The combination of optical molasses and the quadrupole field thus provides a method to cool and
trap a gas of atoms at very low temperatures.
² Atoms with $M_J = -1/2$ are not trapped, since their energy decreases as they move away from the centre: it is as if
they are on the top of a hill.
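To get a feel for the depth of the potential well, the Zeeman energy $+\mu_B B$ from eqn 10.20 can be expressed as an equivalent temperature. The following sketch does this; the value of the coefficient $B_0$ (0.1 T/m) and the 1 cm displacement are hypothetical values chosen only for illustration, and the function names are not from the notes.

```python
import math

MU_B = 9.2740100783e-24   # Bohr magneton, J/T
K_B = 1.380649e-23        # Boltzmann constant, J/K


def quadrupole_field(x, y, z, b0):
    """|B| from eqn 10.20; b0 is the coefficient B_0 (here in T/m)."""
    return b0 * math.sqrt(x**2 + y**2 + 4 * z**2)


def zeeman_energy_kelvin(b):
    """Zeeman energy +mu_B B of the M_J = +1/2 state, as a temperature."""
    return MU_B * b / K_B


b0 = 0.1    # hypothetical field coefficient, T/m
r = 0.01    # hypothetical displacement from the centre, m

b_x = quadrupole_field(r, 0.0, 0.0, b0)   # displaced along x
b_z = quadrupole_field(0.0, 0.0, r, b0)   # displaced along z (twice as steep)
print(f"B on x axis = {b_x * 1e3:.2f} mT, depth = {zeeman_energy_kelvin(b_x) * 1e3:.2f} mK")
print(f"B on z axis = {b_z * 1e3:.2f} mT, depth = {zeeman_energy_kelvin(b_z) * 1e3:.2f} mK")
```

For these assumed values the well depth is of order a millikelvin, comfortably above the Doppler-limit temperatures of Section 10.3.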
Figure 10.4: (a) Optical molasses. Six laser beams are used to annul the three velocity components of
the atom's velocity in both directions. (b) Magneto-optical trap, comprising the optical molasses lasers
and a quadrupole magnetic field. The figure labels the x, y and z axes and the coil current I.
10.6 Cooling below the Doppler limit
Careful measurements by Phillips at NIST in 1988 led to the rather startling result that the temperature
of the laser-cooled atoms in an optical molasses experiment was substantially less than the Doppler limit
given in eqn 10.18. The temperature of the trapped sodium atoms was measured to be 40 μK, that
is, six times smaller than the Doppler limit. Chu and Cohen-Tannoudji soon confirmed this result in
independent experiments.
The explanation of the discrepancy comes from realizing that the single-beam mechanism described
in Section 10.3 is too simplistic. The counter-propagating laser beams in an optical molasses experiment
form an interference pattern, and this leads to a new type of cooling mechanism called Sisyphus cooling.
The mechanism is named after the character in Greek mythology who was condemned to roll a stone up
a hill forever, only for it to roll down again every time he got near the top. This is an analogy for the
way Sisyphus cooling works: the atoms repeatedly climb to the top of a potential barrier created by the
Stark effect of the interfering laser beams, and then drop to the bottom of the potential barrier after
absorption and emission of a photon. The energy lost in the process is taken from the atom's thermal
energy.
The detailed mechanism for Sisyphus cooling is too complicated to cover at this level of treatment. The key
point is that the minimum temperature that can be achieved is set by the recoil limit, rather than
the Doppler limit. The atoms are constantly emitting spontaneous photons of wavelength $\lambda$ in random
directions. The atom recoils each time with momentum $h/\lambda$, so it ends up with a random thermal energy
given by:
$$\frac{1}{2} k_B T_{\rm recoil} = \frac{(h/\lambda)^2}{2m} = \frac{h^2}{2 m \lambda^2} . \tag{10.21}$$
This gives a minimum temperature of:
$$T_{\rm recoil} = \frac{h^2}{m k_B \lambda^2} . \tag{10.22}$$
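Equation 10.22 is easy to evaluate numerically. The sketch below does so for the sodium and cesium wavelengths listed in Table 10.1; the masses are taken in atomic mass units (an assumption, since the table quotes them in units of the hydrogen mass), and the function name is introduced here for illustration.

```python
H = 6.62607015e-34     # Planck constant, J s
K_B = 1.380649e-23     # Boltzmann constant, J/K
U = 1.66053907e-27     # atomic mass unit, kg (assumed mass unit)


def recoil_limit(mass, wavelength):
    """Recoil-limit temperature T_recoil = h^2 / (m k_B lambda^2), eqn 10.22."""
    return H**2 / (mass * K_B * wavelength**2)


t_na = recoil_limit(23.0 * U, 589e-9)    # sodium, 589 nm
t_cs = recoil_limit(132.9 * U, 852e-9)   # cesium, 852 nm
print(f"sodium: {t_na * 1e6:.2f} microkelvin")
print(f"cesium: {t_cs * 1e6:.2f} microkelvin")
```

Both results agree, to the precision quoted, with the recoil limits listed in Table 10.1.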
Table 10.1 compares the key parameters of the sodium and cesium atoms that are frequently used in
laser cooling experiments.
In the years since Chu, Cohen-Tannoudji and Phillips' pioneering experiments, laser cooling
techniques have allowed the study of atom-photon interactions with unprecedented precision, and have
paved the way for the discovery of Bose-Einstein condensation, as described in the next sections.
10.7 Bose-Einstein condensation
We have seen above how laser cooling techniques can produce a very cold gas of atoms. Despite the
extremely low temperatures that are achieved, the motion of the atoms at the focus of the laser beams
is still classical in terms of statistical mechanics. We now wish to explore what happens when the gas
is cooled even further. It turns out that some atoms can undergo a phase transition to a quantum
state proposed by Bose and Einstein in 1924–5. In the sections that follow, we first consider the general
                          Sodium           Cesium
Laser                     rhodamine dye    semiconductor diode
Atomic transition         3p → 3s          6p → 6s
Wavelength                589 nm           852 nm
Atomic mass m             23.0 m_H         132.9 m_H
Radiative lifetime        16 ns            32 ns
Doppler limit T_min       240 μK           120 μK
Recoil limit T_recoil     2.4 μK           0.2 μK
Table 10.1: Parameters for laser cooling of sodium and cesium atoms. $T_{\min}$ and $T_{\rm recoil}$ are the minimum
temperatures set by the Doppler and photon recoil limits given in eqns 10.18 and 10.22 respectively.
Figure 10.5: Schematic variation of the specific heat capacity of a gas of diatomic molecules with
temperature. The rotational and vibrational contributions freeze out at characteristic temperatures, but the
freezing out of the translational motion is not normally observed. Axes: temperature (K) on a logarithmic
scale from 1 to 10000; heat capacity per molecule in units of $k_B$. The levels $3k_B/2$, $5k_B/2$ and $7k_B/2$ are
marked, the regions are labelled translational, rotational and vibrational motion, and a question mark
indicates the low-temperature region.
principles of Bose-Einstein condensation (BEC), and then describe how the experiments to observe BEC
in a gas of atoms are carried out.
10.7.1 The concept of Bose-Einstein condensation
The behaviour of a gas of atoms is said to be classical if the distribution of energies obeys Boltzmann
statistics:
$$p(E_i) \propto \exp\left( -\frac{E_i}{k_B T} \right) , \tag{10.23}$$
where $p(E_i)$ is the probability that the atom is in the quantum state with energy $E_i$ at temperature $T$.
Boltzmann statistics apply at high temperatures when the probability for the occupation of any individual
quantum level is small. If we reduce the temperature, the atoms tend to occupy the lowest energy levels
of the system. It will therefore eventually be the case that the assumption that the occupancy factor is
small no longer applies. In this case, we will have quantum statistics rather than classical statistics.
It is this regime that we shall be exploring here.
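The way the occupation piles up in the lowest level as the temperature falls can be illustrated with a toy calculation based on eqn 10.23. The five equally spaced levels used below are a hypothetical ladder chosen only for illustration, with energies expressed as $E_i/k_B$ in kelvin; the function name is not from the notes.

```python
import math


def boltzmann_probabilities(levels_kelvin, temperature):
    """Normalised p(E_i) proportional to exp(-E_i / k_B T), eqn 10.23.

    levels_kelvin holds the energies divided by k_B, in kelvin.
    """
    weights = [math.exp(-e / temperature) for e in levels_kelvin]
    total = sum(weights)
    return [w / total for w in weights]


levels = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical ladder, E_i / k_B in K

for t in (100.0, 1.0, 0.1):
    p = boltzmann_probabilities(levels, t)
    print(f"T = {t:6.1f} K  ->  p(lowest level) = {p[0]:.4f}")
```

At high temperature every level has a small, nearly equal probability, whereas at low temperature the lowest level dominates, which is exactly where the small-occupancy assumption behind classical statistics breaks down.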
The transition from classical to quantum behaviour occurs at a temperature that is determined by
the energy scales of the system. Consider, for example, the specific heat capacity of a gas of diatomic
molecules. The variation of the specific heat with temperature is shown schematically in Fig. 10.5.
A diatomic molecule possesses seven degrees of freedom: three translational, two rotational, and two
vibrational. As noted in Section 10.2, the classical principle of equipartition of energy states that the
thermal energy per molecule per degree of freedom is equal to $\frac{1}{2} k_B T$. Since the heat capacity is equal to
$dE/dT$, we therefore expect a contribution of $3k_B/2$ for the translational motion, $2k_B/2$ for the rotational
motion, and a further $2k_B/2$ for the vibrations, giving $7k_B/2$ in total. This is in fact observed, but only
at very high temperatures, as shown in Fig. 10.5.
The reason for the departure of the heat capacity from the classical result is the quantization of the
thermal motion. The vibrations of a molecule can be approximated to a simple harmonic oscillator, with
quantized energy levels given by:
$$E = (n + 1/2) h \nu_{\rm vib} , \tag{10.24}$$
where $\nu_{\rm vib}$ is the vibrational frequency. The classical result will only be obtained if the thermal energy
is much greater than the vibrational quanta, that is when
$$k_B T \gg h \nu_{\rm vib} . \tag{10.25}$$
With typical values for $\nu_{\rm vib}$ around $10^{13}$ Hz, the classical behaviour is only observed at temperatures
above about 1000 K. At room temperature the vibrational motion is usually frozen out, as shown in
Fig. 10.5. In the same way we expect the rotational motion to freeze out when the thermal energy is
comparable to the quantized rotational energy, that is when
$$k_B T \sim \frac{\hbar^2}{I_{\rm rot}} , \tag{10.26}$$
where $I_{\rm rot}$ is the moment of inertia about the rotation axis. This typically occurs for $T \sim 50$ K. Thus
the rotational motion is usually classical at room temperature, but freezes out at lower temperatures, as
indicated in Fig. 10.5.
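The characteristic temperatures implied by eqns 10.25 and 10.26 can be estimated as $h\nu_{\rm vib}/k_B$ and $\hbar^2/(I_{\rm rot} k_B)$. The sketch below evaluates both. The vibrational frequency is the typical value quoted above; the moment of inertia is a purely illustrative value chosen to reproduce the order of magnitude $T \sim 50$ K and does not refer to any particular molecule.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
HBAR = H / (2 * math.pi)    # reduced Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K


def vibrational_temperature(nu_vib):
    """Temperature at which k_B T equals the vibrational quantum h nu_vib."""
    return H * nu_vib / K_B


def rotational_temperature(i_rot):
    """Temperature at which k_B T equals hbar^2 / I_rot, eqn 10.26."""
    return HBAR**2 / (i_rot * K_B)


nu_vib = 1e13      # typical vibrational frequency from the text, Hz
i_rot = 1.6e-47    # hypothetical moment of inertia, kg m^2

t_vib = vibrational_temperature(nu_vib)
t_rot = rotational_temperature(i_rot)
print(f"h nu_vib / k_B       = {t_vib:.0f} K")
print(f"hbar^2 / (I_rot k_B) = {t_rot:.0f} K")
```

The vibrational scale comes out at a few hundred kelvin, so the condition $k_B T \gg h\nu_{\rm vib}$ is only met well above it, consistent with classical behaviour appearing only above about 1000 K.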
We are finally left with the translational motion. The third law of thermodynamics tells us
that the heat capacity must eventually go to zero as we approach absolute zero. However, this is never
observed in practice. In any normal gas the attractive forces between the molecules cause liquefaction
and solidification long before the quantum effects for the translational motion become important. If,
however, we could somehow prevent the gas from condensing, we would eventually expect to observe
quantum effects related to the translational motion. This effect was first considered by Einstein in
1924–5, following Bose's work on the statistical mechanics of photons.
A key point in understanding the concept of Bose-Einstein condensation is that we are considering
the quantized motion of non-interacting particles. The molecules in a gas do not normally behave as
non-interacting particles: there are attractive forces between them that cause condensation to the liquid
or solid phase at low temperatures. These forces can never be turned off, and the only way to make their
effect minimal is to keep the molecules far away from each other. This means that the gas density must
be very low, which, as we shall see below, makes the temperature required to observe the quantum effect
extremely low. This is why it took 70 years to observe Bose-Einstein condensation in a gas.
The phenomenon of Bose-Einstein condensation was described by Einstein in a letter to Paul Ehrenfest
in late 1924 as follows:
    "From a certain temperature on, the molecules condense without attractive forces, that is,
    they accumulate at zero velocity. The theory is pretty, but is there some truth to it?"³
Einstein had to wait 14 years for the beginnings of an answer to his question. The superfluid transition in
liquid helium was discovered in 1928 by W.H. Keesom, and in 1938 Fritz London successfully interpreted
Keesom's discovery as a Bose-Einstein condensation phenomenon. In the years following London's work,
the theory of Bose-Einstein condensation was applied to other condensed matter systems, e.g.
superconductors. However, the problem with all of these condensed matter systems is that the particles are
not non-interacting. The mere fact that helium is a liquid at the superfluid temperature tells us that
there are strong interactions between the atoms over and above any effects due to the quantization of the
kinetic energy. For this reason, in 1946 Schrödinger described the modifications to the gas laws caused
by the quantum statistics as:⁴
• satisfactory, because they are negligible at high temperatures and low densities;
• disappointing, because they occur at such low temperatures and high densities that they are
  hard to distinguish from other effects;
• astounding, because the behaviour is completely different to that of a classical system.
In an ideal world we would therefore like to observe Bose-Einstein condensation in a weakly
interacting system (i.e. a gas) so that we can study it in isolation. This was not possible until the new
techniques of laser cooling described in the previous sections were developed.
³ Letter to P. Ehrenfest, 29 November, 1924. An historical discussion of Einstein's work may be found in Pais, A. (1982).
Subtle is the Lord, Oxford University Press.
⁴ See E. Schrödinger, Statistical Thermodynamics, Cambridge University Press, 1946.
10.7.2 Atomic bosons
Before going into the details of Bose-Einstein condensation, we need to clarify one important point. The
quantized behaviour of a gas of identical particles at low temperatures depends on the spin of the particle.
Particles with integer spins are called bosons, while those with half-integer spins are called fermions.
Fermions obey the Pauli exclusion principle, which says that it is not possible to put more than one
particle into a particular quantum state. Bosons, by contrast, do not obey the Pauli principle. There is
no limit to the number of particles that can be put into a particular level, which allows the observation
of new quantum effects such as BEC.
Atoms are composite particles, made up of protons, neutrons, and electrons. These are all spin-1/2
particles, but the composite atom can be either a fermion or a boson depending on its total spin, which
can be worked out from:
$$S_{\rm atom} = S_{\rm electrons} \oplus I , \tag{10.27}$$
where I is the nuclear spin. Since the number of electrons and protons in a neutral atom is equal, it is
easy to see that the atom will be a boson if the number of neutrons is an even number, and a fermion if
it is odd.
The simplest example to consider is hydrogen. $^1$H has one proton and one electron, and so we find
$S_{\rm atom} = 0$ or 1. $^1$H atoms are therefore bosons. Deuterium atoms ($^2$H), by contrast, are fermions,
because the extra neutron makes the total spin half-integer. Now
consider helium. Helium has two common isotopes: $^4$He and $^3$He. The ground state of the $^4$He nucleus
is the α-particle with $I = 0$, and the electron ground state also has $S = 0$ (see Chapter 5). Thus the
spin of the $^4$He atom in its ground state is zero, which makes it a boson. In $^3$He atoms, by contrast, the
nucleus has two protons and one neutron, with $I = 1/2$ in its ground state. The electrons have spin 0 or
1, and so we find $S_{\rm atom} = 1/2$ or 3/2, making it a fermion.⁵ Note that the number of neutrons is two
for $^4$He and one for $^3$He, so that our general rule for deciding whether an atom is a boson or a fermion
applies.
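The neutron-counting rule can be written as a one-line test. The sketch below applies it to the isotopes discussed in this section and to the alkali atoms used in the cooling experiments of this chapter; the function name and the dictionary of isotopes are introduced here for illustration only.

```python
def atom_statistics(mass_number, atomic_number):
    """Classify a neutral atom by the parity of its neutron number.

    A neutral atom has equal numbers of protons and electrons, so the
    total number of spin-1/2 constituents is even (boson) exactly when
    the neutron number N = A - Z is even.
    """
    neutrons = mass_number - atomic_number
    return "boson" if neutrons % 2 == 0 else "fermion"


isotopes = {
    "1H": (1, 1),
    "2H": (2, 1),
    "3He": (3, 2),
    "4He": (4, 2),
    "23Na": (23, 11),
    "87Rb": (87, 37),
    "133Cs": (133, 55),
}

for name, (a, z) in isotopes.items():
    print(f"{name:>5}: {atom_statistics(a, z)}")
```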
10.7.3 The condensation temperature
Consider a gas of identical non-interacting bosons of mass $m$ at temperature $T$. As noted above, the
word "non-interacting" is very important here. It implies that the particles are completely free, with only
kinetic energy, and no forces between the atoms. In these circumstances the de Broglie wavelength
$\lambda_{\rm deB}$ is determined by the free thermal motion:
$$\frac{p^2}{2m} = \frac{1}{2m} \left( \frac{h}{\lambda_{\rm deB}} \right)^2 = \frac{3}{2} k_B T . \tag{10.28}$$
This implies that
$$\lambda_{\rm deB} = \frac{h}{\sqrt{3 m k_B T}} . \tag{10.29}$$
The thermal de Broglie wavelength thus increases as $T$ decreases.
The quantum mechanical wave function of a free atom extends over a distance of $\sim\lambda_{\rm deB}$. As $\lambda_{\rm deB}$
increases with decreasing $T$, a temperature will eventually be reached when the wave functions of
neighbouring atoms begin to overlap. This situation is depicted in Fig. 10.6(a). The atoms will interact with
each other and coalesce to form a "super atom" with a common wave function. This is the Bose-Einstein
condensed state.
The condition for wave function overlap is that the reciprocal of the effective particle volume
determined by the de Broglie wavelength should be equal to the particle density. If we have $N$ particles in
volume $V$, this condition can be written:
$$\frac{N}{V} \sim \frac{1}{\lambda_{\rm deB}^3} . \tag{10.30}$$
By inserting from eqn 10.29 and solving for $T$, we find:
$$T_c \sim \frac{1}{3} \frac{h^2}{m k_B} \left( \frac{N}{V} \right)^{2/3} . \tag{10.31}$$
⁵ It is interesting to note that a superfluid phase transition can also be observed for liquid $^3$He at 2.5 mK, even though the
individual atoms are fermions. The $^3$He atoms pair up to form a bosonic system analogous to the Cooper pairs developed
in the BCS (Bardeen-Cooper-Schrieffer) theory of superconductivity. This theory explains how electrons can undergo a
superconducting phase transition even though they are fermions.
Figure 10.6: (a) Overlapping wave functions of two atoms separated by $\lambda_{\rm deB}$. (b) Number of particles $N_0$ in
the Bose-condensed state versus temperature, plotted from $T = 0$ to $T_c$ with the vertical axis running from 0 to $N$.
$T_c$ is the condensation temperature given by eqn 10.32.
We thus see that the condensation temperature is proportional to $(N/V)^{2/3}$. This shows that low density
systems such as gases are expected to have very low transition temperatures, which explains why it has
been so difficult to observe BEC in gases until recently.
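The intuitive argument of eqns 10.29 to 10.31 can be checked numerically. The sketch below computes the thermal de Broglie wavelength of sodium at its Doppler limit, and the overlap temperature for a density of $10^{17}$ m$^{-3}$, which is the typical optical molasses density quoted in Section 10.8. The sodium mass in atomic mass units and the function names are assumptions made for illustration.

```python
H = 6.62607015e-34     # Planck constant, J s
K_B = 1.380649e-23     # Boltzmann constant, J/K
U = 1.66053907e-27     # atomic mass unit, kg (assumed mass unit)


def de_broglie_wavelength(mass, temperature):
    """Thermal de Broglie wavelength h / sqrt(3 m k_B T), eqn 10.29."""
    return H / (3 * mass * K_B * temperature) ** 0.5


def overlap_temperature(mass, density):
    """Intuitive estimate T_c ~ (1/3) h^2 / (m k_B) (N/V)^(2/3), eqn 10.31."""
    return H**2 * density ** (2 / 3) / (3 * mass * K_B)


m_na = 23.0 * U      # sodium mass, kg
density = 1e17       # particles per cubic metre (value quoted in Section 10.8)

lam_doppler = de_broglie_wavelength(m_na, 240e-6)
t_est = overlap_temperature(m_na, density)
lam_at_t_est = de_broglie_wavelength(m_na, t_est)

print(f"lambda_deB at 240 microkelvin = {lam_doppler * 1e9:.0f} nm")
print(f"overlap temperature           = {t_est * 1e9:.0f} nK")
print(f"n * lambda^3 at that T        = {density * lam_at_t_est**3:.3f}")
```

The last line confirms that eqn 10.31 is simply the temperature at which $(N/V)\lambda_{\rm deB}^3 = 1$.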
A rigorous formula for the Bose-condensation temperature $T_c$ can be derived by applying the laws of
statistical mechanics to the non-interacting boson gas. For a gas of spin-0 bosons, the critical temperature
$T_c$ is given by:⁶
$$T_c = 0.0839 \, \frac{h^2}{m k_B} \left( \frac{N}{V} \right)^{2/3} . \tag{10.32}$$
Note that this is the same as the intuitive result in eqn 10.31 apart from the numerical factor. As noted
previously, the theory of Bose-Einstein condensation was first applied to liquid helium-4. Below $T_c$ some
of the liquid shows superfluid behaviour, while the remainder remains normal. On inserting the atom
density of $^4$He into eqn 10.32, we find $T_c = 2.7$ K, which is close to the actual superfluid transition
temperature of 2.17 K. The discrepancy is a consequence of the fact that the $^4$He atoms in the liquid phase
are not truly non-interacting, and is an example of why Schrödinger described the properties of the
quantum gas as "disappointing": the most spectacular effects usually occur in conditions where many
other interactions are important.
The picture which emerges from the statistical mechanics of Bose-Einstein condensation is as follows.
Above the critical temperature the particles are distributed among the energy states of the system
according to the Bose-Einstein distribution:
$$n_{\rm BE}(E) = \frac{1}{\exp[(E - \mu)/k_B T] - 1} , \tag{10.33}$$
where $\mu$ is the chemical potential. In the case that we are considering here, the particles only have kinetic
energy with $E = \frac{1}{2} m v^2$, so that the minimum value of $E$ is zero. The chemical potential must therefore
be negative to keep $n_{\rm BE}$ well-behaved for all possible values of $E$. The chemical potential increases with
decreasing temperature, and at $T_c$ it reaches its maximum value of zero. In these conditions, there is a
singularity in eqn 10.33 for the zero-velocity state with $E = 0$, and a phase transition occurs in which a
macroscopic fraction of the total number of particles condenses into the ground state. The remainder of
the particles continue to be distributed thermally between the finite-velocity states. The number of
particles in the zero-velocity state is given by:
$$N_0(T) = N \left[ 1 - \left( \frac{T}{T_c} \right)^{3/2} \right] , \tag{10.34}$$
where $N$ is the total number of particles. This dependence is plotted in Fig. 10.6(b). We see that $N_0$ is
zero at $T = T_c$ and increases to the maximum value of $N$ at $T = 0$.
The description of the system with a macroscopic fraction of the particles in the zero-velocity state
and the rest distributed thermally among the finite-velocity states gives rise to the two-fluid model.
The two fluids correspond to the Bose-Einstein condensed state with $E = 0$, and the normal particles
with $E > 0$. The total number of particles is written:
$$N = N_{\rm normal} + N_{\rm condensed} , \tag{10.35}$$
where $N_{\rm condensed}$ obeys eqn 10.34. This model gives a fairly good description of the behaviour of superfluid
liquid $^4$He and superconductors.
⁶ See, for example, Mandl, Statistical Physics, Section 11.6.
Figure 10.7: Evaporative cooling. (a) The laser-cooled atoms are first compressed in a magnetic trap. (b)
The trap potential is then reduced by decreasing the magnetic field strength, so that the hottest atoms
can escape. This reduces the temperature, in the same way that evaporation cools a liquid. Panel (a)
shows the magnetic trap potential of the initial trap; panel (b) shows the trap after evaporative cooling,
with the hottest atoms escaping.
We can relate this behaviour to the discussion of the diatomic gas in Fig. 10.5 in the temperature
region indicated by the question mark. Since the number of particles in the zero-velocity state gradually
approaches 100% as $T$ goes to zero, the thermal energy of the system goes to zero as $T \to 0$. The heat
capacity therefore also goes to zero, and we finally reach consistency with the third law of thermodynamics.
10.8 Experimental techniques for atomic BEC
The conditions required to achieve Bose-Einstein condensation (BEC) in a gas impose severe technical
challenges. If we want to observe pure BEC without the complication of other effects such as liquefaction,
we have to keep the atoms well apart from each other. This means that the particle density must be
small, which in turn implies that the transition temperature is very low.
We have seen in section 10.6 that laser cooling can typically produce temperatures in the range
1–10 μK. This is not quite cold enough. The typical particle density achieved in an optical molasses
experiment is around $10^{17}$ m$^{-3}$, which implies condensation temperatures below 100 nK. We therefore
have to invent new techniques to observe condensation. The general procedure usually follows three
steps:
1. Trap a gas of atoms and cool them towards the recoil-limit temperature using laser-cooling
techniques. Compress the gas by increasing the magnetic field.
2. Turn the cooling laser off to permit cooling below the recoil limit.
3. Cool the gas again by evaporative cooling until condensation occurs.
The first step has been discussed previously in section 10.6. The magnetic field has to be ramped
up carefully so as not to heat the gas while compressing it. Once the gas has been compressed, the
cooling lasers then have to be turned off, since the temperature will not fall below the recoil limit given
in eqn 10.22 while the lasers are on.
The final step is called evaporative cooling, in analogy to the cooling of a liquid by evaporation. In
this technique, the magnetic field strength is gradually turned down in order to reduce the depth of the
magnetic potential as shown in Fig. 10.7(b). The fastest-moving atoms now have enough kinetic energy
to escape from the trap, leaving the slower ones behind. This causes an overall reduction in the average
kinetic energy, which is equivalent to a reduction in the temperature.
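The principle of evaporative cooling can be illustrated with a toy Monte Carlo calculation. This is a deliberately crude sketch, not a model of a real trap: it assumes that the kinetic energies follow the Maxwell-Boltzmann distribution of a free three-dimensional gas (a gamma distribution with shape 3/2), ignores the potential energy of the trap, removes every atom above a hypothetical cut-off of $3 k_B T$ in a single step, and assumes that the remaining atoms rethermalise so that their mean energy is again $\frac{3}{2} k_B T$. All names and parameter values are illustrative.

```python
import random


def evaporate(energies, cutoff):
    """Keep only the atoms whose kinetic energy is below the cut-off."""
    return [e for e in energies if e < cutoff]


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)   # fixed seed so the run is reproducible
k_t = 1.0                # initial k_B T, arbitrary energy units
n_atoms = 50_000

# Kinetic energies of a free 3D classical gas: gamma distribution with
# shape 3/2 and scale k_B T, so that the mean energy is (3/2) k_B T.
energies = [rng.gammavariate(1.5, k_t) for _ in range(n_atoms)]

cutoff = 3.0 * k_t       # hypothetical trap depth after it is lowered
kept = evaporate(energies, cutoff)

t_before = mean(energies) / 1.5   # temperature from <E> = (3/2) k_B T
t_after = mean(kept) / 1.5        # assumes the remaining gas rethermalises
fraction_kept = len(kept) / n_atoms

print(f"fraction of atoms kept : {fraction_kept:.2f}")
print(f"temperature ratio      : {t_after / t_before:.2f}")
```

Losing a modest fraction of the hottest atoms lowers the temperature by a larger fraction, which is why repeating the step many times can cool the gas far below the laser-cooling limits at the cost of atom number.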
The first successful observation of Bose-Einstein condensation in an atomic gas was reported by the
group of Eric Cornell and Carl Wieman at the JILA Laboratory⁷ in the United States in 1995. In
their experiments they used $^{87}$Rb atoms with a density of about $10^{20}$ m$^{-3}$. This density is eight orders
of magnitude smaller than that of liquid helium, and so the condensation temperature calculated from
eqn 10.32 is very low: $3.9 \times 10^{-7}$ K.⁸ The inter-particle distance in the gas is equivalent to about 100
atomic radii. This means that the forces between the atoms are very small, and the BEC effects can
be observed in their own right. Similar results were reported by Wolfgang Ketterle and his team at
Massachusetts Institute of Technology for a gas of sodium atoms soon afterwards. The groundbreaking
nature of these discoveries was recognized by the joint award of the Nobel Prize for Physics in 2001 to
Cornell, Ketterle and Wieman.
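The condensation temperature quoted for the rubidium experiment follows directly from eqn 10.32. The sketch below evaluates it; the rubidium mass is taken as 87 atomic mass units, which is an assumption, so the result differs slightly in the second significant figure from the value quoted above. The function name is introduced here for illustration.

```python
H = 6.62607015e-34     # Planck constant, J s
K_B = 1.380649e-23     # Boltzmann constant, J/K
U = 1.66053907e-27     # atomic mass unit, kg (assumed mass unit)


def bec_temperature(mass, density):
    """T_c = 0.0839 h^2 / (m k_B) (N/V)^(2/3) for spin-0 bosons, eqn 10.32."""
    return 0.0839 * H**2 / (mass * K_B) * density ** (2 / 3)


m_rb = 87 * U        # mass of a 87Rb atom, kg (approximate)
density_rb = 1e20    # atoms per cubic metre, as quoted in the text

t_c_rb = bec_temperature(m_rb, density_rb)
print(f"T_c for 87Rb at 1e20 m^-3 = {t_c_rb * 1e9:.0f} nK")
```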
Bose-Einstein condensation is observed by measuring the velocity distribution of the atoms at the
end of the experiment. Figure 10.8 shows some typical data. These pictures are obtained by turning the
⁷ Joint Institute for Laboratory Astrophysics, run jointly by the University of Colorado and the National Institute of
Standards and Technology (NIST).
⁸ The condensation temperature in a magnetic trap differs slightly from the one given in eqn 10.32, because the atoms
are subject to the trapping potential. This level of detail need not concern us here.
Figure 10.8: Bose-Einstein condensation in rubidium atoms. The three figures show the measured velocity
distribution, plotted against $v_x$ and $v_y$, as the gas is cooled through $T_c$ on going from left to right. Above $T_c$, we
have a broad Maxwell-Boltzmann distribution, but as the gas condenses, the fraction of atoms in the zero-velocity
state at the origin increases dramatically. Source: http://jila.colorado.edu/bec/CornellGroup/index.html.
trapping field off completely and allowing the gas to expand. An image of the gas is taken at a later
time, and the velocity distribution can be inferred from the amount of expansion that has occurred. The
key point in Fig. 10.8 is that a peak can be seen to appear at the centre as the temperature is lowered.
This corresponds to the zero-velocity state, and shows that a macroscopic fraction of the atoms have
condensed to the ground state.
In the years since the original observation, BEC has been observed in many other gaseous atomic
systems, and this has led to the observation of many other spectacular quantum effects, for example
atom lasers. The use of the word "laser" is slightly confusing here, because there is no amplification. It
is used to emphasize the difference between the coherence of the atomic beam from the condensate and
that from a thermal source, in analogy to the difference between the coherence of the light from a laser
beam and that from a hot filament. The beam of atoms generated by hot ovens such as the one shown in
Fig. 10.3 has a Maxwell-Boltzmann velocity distribution, with random phases between different atoms.
The atoms in a beam emanating from a Bose-Einstein condensate, by contrast, are all in phase, because
they have a common wave function. This point has been proven by demonstrating that the atomic beams
from a condensate can form interference patterns when they overlap.
Further reading
Bransden and Joachain, Atoms, Molecules and Photons, sections 15.4–6.
Foot, Atomic Physics, chapters 9 and 10.
Fox, Quantum Optics, chapter 11.
Haken and Wolf, The Physics of Atoms and Quanta, sections 22.6, 23.11–12.
Mandl, Statistical Physics, Section 11.6.
The BEC homepage at the University of Colorado gives interactive tutorial articles on laser cooling and
Bose-Einstein condensation. See: http://www.colorado.edu/physics/2000/bec/index.html