Part QM - Quantum Mechanics

Stony Brook University
Academic Commons
Essential Graduate Physics Department of Physics and Astronomy
2013
Part QM: Quantum Mechanics

Konstantin Likharev
SUNY Stony Brook
Recommended Citation
Likharev, Konstantin, "Part QM: Quantum Mechanics" (2013). Essential Graduate Physics. 4.
https://commons.library.stonybrook.edu/egp/4
This Book is brought to you for free and open access by the Department of Physics and Astronomy at Academic
Commons. It has been accepted for inclusion in Essential Graduate Physics by an authorized administrator of
Academic Commons.
Konstantin K. Likharev
Essential Graduate Physics
Lecture Notes and Problems
Beta version
Open online access at

http://commons.library.stonybrook.edu/egp/
and
https://sites.google.com/site/likharevegp/
Part QM:
Quantum Mechanics
Last corrections: March 31, 2022
A version of this material was published in 2019 under the title
Quantum Mechanics: Lecture notes

IOPP, Essential Advanced Physics – Volume 5, ISBN 978-0-7503-1411-4,
with the model solutions of the exercise problems published under the title
Quantum Mechanics: Problems with solutions

IOPP, Essential Advanced Physics – Volume 6, ISBN 978-0-7503-1414-5
However, by now this online version of the lecture notes, as well as the problem solutions
available from the author, have been corrected much more thoroughly.
See also the author’s list

https://you.stonybrook.edu/likharev/essential-books-for-young-physicist/
of other essential reading recommended to young physicists
© K. Likharev
Essential Graduate Physics QM: Quantum Mechanics
Table of Contents
Chapter 1. Introduction (28 pp.)

1.1. Experimental motivations
1.2. Wave mechanics postulates
1.3. Postulates’ discussion
1.4. Continuity equation
1.5. Eigenstates and eigenvalues
1.6. Time evolution
1.7. Space dependence
1.8. Dimensionality reduction
1.9. Exercise problems (15)
Chapter 2. 1D wave mechanics (76 pp.)
2.1. Basic relations
2.2. Free particle: Wave packets
2.3. Particle reflection and tunneling
2.4. Motion in soft potentials
2.5. Resonant tunneling, and metastable states
2.6. Localized state coupling, and quantum oscillations
2.7. Periodic systems: Energy bands and gaps
2.8. Periodic systems: Particle dynamics
2.9. Harmonic oscillator: Brute force approach
Chapter 3. Higher Dimensionality Effects (64 pp.)
3.1. Quantum interference and the AB effect
3.2. Landau levels and the quantum Hall effect
3.3. Scattering and diffraction
3.4. Energy bands in higher dimensions
3.5. Axially-symmetric systems
3.6. Spherically-symmetric systems: Brute force approach
3.7. Atoms
3.8. Spherically-symmetric scatterers
Chapter 4. Bra-ket Formalism (52 pp.)
4.1. Motivation
4.2. States, state vectors, and linear operators
4.3. State basis and matrix representation
4.4. Change of basis, and matrix diagonalization
4.5. Observables: Expectation values and uncertainties
4.6. Quantum dynamics: Three pictures
4.7. Coordinate and momentum representations
Chapter 5. Some Exactly Solvable Problems (48 pp.)

5.1. Two-level systems
5.2. The Ehrenfest theorem
5.3. The Feynman path integral
5.4. Revisiting harmonic oscillator
5.5. Glauber states and squeezed states
5.6. Revisiting spherically-symmetric problems
5.7. Spin and its addition to orbital angular momentum
Chapter 6. Perturbative Approaches (36 pp.)
6.1. Time-independent perturbations
6.2. The linear Stark effect
6.3. Fine structure of atomic levels
6.4. The Zeeman effect
6.5. Time-dependent perturbations
6.6. Quantum-mechanical Golden Rule
6.7. Golden Rule for step-like perturbations
Chapter 7. Open Quantum Systems (50 pp.)
7.1. Open systems, and the density matrix
7.2. Coordinate representation, and the Wigner function
7.3. Open system dynamics: Dephasing
7.4. Fluctuation-dissipation theorem
7.5. The Heisenberg-Langevin approach
7.6. Density matrix approach
Chapter 8. Multiparticle Systems (52 pp.)
8.1. Distinguishable and indistinguishable particles
8.2. Singlets, triplets, and the exchange interaction
8.3. Multiparticle systems
8.4. Perturbative approaches
8.5. Quantum computation and cryptography
Chapter 9. Elements of Relativistic Quantum Mechanics (36 pp.)
9.1. Electromagnetic field quantization
9.2. Photon absorption and counting
9.3. Photon emission: spontaneous and stimulated
9.4. Cavity QED
9.5. The Klein-Gordon and relativistic Schrödinger equations
9.6. Dirac’s theory
9.7. Low energy limit

Chapter 10. Making Sense of Quantum Mechanics (16 pp.)

10.1. Quantum measurements
10.2. QND measurements
10.3. Hidden variables, the Bell theorem, and local reality
10.4. Interpretations of quantum mechanics
***
Additional file (available from the author upon request):

Exercise and Test Problems with Model Solutions (277 + 70 = 347 problems; 518 pp.)

Chapter 1. Introduction
This introductory chapter briefly reviews the major experimental motivations for quantum mechanics,
and then discusses its simplest formalism – Schrödinger’s wave mechanics. Much of this material
(besides the last section) may be found in undergraduate textbooks,1 so that the discussion is rather
brief, and focused on the most important conceptual issues.
1.1. Experimental motivations

By the beginning of the 1900s, physics (which by that time included what we now call non-
relativistic classical mechanics, classical thermodynamics and statistics, and classical electrodynamics
including the geometric and wave optics) looked like an almost completed discipline, with most human-
scale phenomena reasonably explained, and just a couple of mysterious “dark clouds”2 on the horizon.
However, rapid technological progress and the resulting development of more refined scientific
instruments have led to a fast multiplication of observed phenomena that could not be explained on a
classical basis. Let me list the most consequential of those experimental findings.
(i) The blackbody radiation measurements, pioneered by G. Kirchhoff in 1859, have shown that
in the thermal equilibrium, the power of electromagnetic radiation by a fully absorbing (“black”)
surface, per unit frequency interval, drops exponentially at high frequencies. This is not what could be
expected from the combination of classical electrodynamics and statistics, which predicted an infinite
growth of the radiation density with frequency. Indeed, classical electrodynamics shows3 that
electromagnetic field modes evolve in time just as harmonic oscillators, and that the number dN of these
modes in a large free-space volume V >> λ³, within a small frequency interval dω << ω near some
frequency ω, is
$$dN = 2V\frac{d^3k}{(2\pi)^3} = 2V\frac{4\pi k^2\,dk}{(2\pi)^3} = V\frac{\omega^2}{\pi^2 c^3}\,d\omega, \tag{1.1}$$
where c ≈ 3×10⁸ m/s is the free-space speed of light, k = ω/c the free-space wave number, and λ = 2π/k
is the radiation wavelength. On the other hand, classical statistics4 predicts that in the thermal
equilibrium at temperature T, the average energy E of each 1D harmonic oscillator should be equal to
kBT, where kB is the Boltzmann constant.5
Combining these two results, we readily get the so-called Rayleigh-Jeans formula for the
average electromagnetic wave energy per unit volume:
1 See, for example, D. Griffiths, Quantum Mechanics, 2nd ed., Cambridge U. Press, 2016.
2 This famous expression was used in a 1900 talk by Lord Kelvin (born William Thomson), in reference to the
results of blackbody radiation measurements and the Michelson-Morley experiments, i.e. the precursors of
quantum mechanics and relativity theory.
3 See, e.g., EM Sec. 7.8, in particular Eq. (7.211).
4 See, e.g., SM Sec. 2.2.
5 In the SI units, used throughout this series, kB ≈ 1.38×10⁻²³ J/K – see Appendix CA: Selected Physical Constants
for the exact value.
$$u \equiv \frac{1}{V}\frac{dE}{d\omega} = \frac{k_\mathrm{B}T}{V}\frac{dN}{d\omega} = \frac{\omega^2}{\pi^2 c^3}\,k_\mathrm{B}T, \tag{1.2}$$
that diverges at ω → ∞ (Fig. 1) – the so-called ultraviolet catastrophe. On the other hand, the blackbody
radiation measurements, improved by O. Lummer and E. Pringsheim, and also by H. Rubens and F.
Kurlbaum to reach a 1%-scale accuracy, were compatible with the phenomenological law suggested in
1900 by Max Planck:
$$u = \frac{\omega^2}{\pi^2 c^3}\,\frac{\hbar\omega}{\exp\{\hbar\omega/k_\mathrm{B}T\} - 1}. \qquad \text{(Planck's radiation law)} \tag{1.3a}$$
This law may be reconciled with the fundamental Eq. (1) if the following replacement is made for the
average energy of each field oscillator:
$$k_\mathrm{B}T \to \frac{\hbar\omega}{\exp\{\hbar\omega/k_\mathrm{B}T\} - 1}, \tag{1.3b}$$
with a factor
$$\hbar \approx 1.055\times 10^{-34}\ \mathrm{J\cdot s}, \qquad \text{(Planck's constant)} \tag{1.4}$$
now called the Planck’s constant.6 At low frequencies (ħω << kBT), the denominator in Eq. (3) may be
approximated as ħω/kBT, so that the average energy (3b) tends to its classical value kBT, and the
Planck’s law (3a) reduces to the Rayleigh-Jeans formula (2). However, at higher frequencies (ħω >>
kBT), Eq. (3) describes the experimentally observed rapid decrease of the radiation density – see Fig. 1.
Fig. 1.1. The blackbody radiation density u, in units of u0 ≡ (kBT)³/ħ²π²c³, as a function of frequency
(log-log plot of u/u0 versus ħω/kBT), according to the Rayleigh-Jeans formula (blue line)
and the Planck’s law (red line).
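As a quick numerical illustration of Eqs. (2) and (3a), the following minimal Python sketch compares the two formulas in the dimensionless form used in Fig. 1. It assumes the variable x ≡ ħω/kBT and the unit u0 ≡ (kBT)³/ħ²π²c³, in which the Rayleigh-Jeans formula reads u/u0 = x² and the Planck's law reads u/u0 = x³/(eˣ – 1); the function names are hypothetical, introduced only for this example.

```python
import math

def rayleigh_jeans(x):
    """Eq. (1.2) in units of u0: u/u0 = x**2, with x = hbar*omega/(k_B*T)."""
    return x ** 2

def planck(x):
    """Eq. (1.3a) in units of u0: u/u0 = x**3 / (exp(x) - 1)."""
    # math.expm1(x) evaluates exp(x) - 1 without loss of accuracy at small x
    return x ** 3 / math.expm1(x)

if __name__ == "__main__":
    for x in (0.01, 0.1, 1.0, 10.0):
        print(f"x = {x:5.2f}   Rayleigh-Jeans: {rayleigh_jeans(x):.3e}   "
              f"Planck: {planck(x):.3e}")
```

At x << 1 the two results nearly coincide, while at x >> 1 the Planck value is exponentially suppressed, in line with the low- and high-frequency limits discussed above.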
(ii) The photoelectric effect, discovered in 1887 by H. Hertz, shows a sharp lower boundary for
the frequency of the incident light that may kick electrons out from metallic surfaces, independent of the
light’s intensity. Albert Einstein, in one of his three famous 1905 papers, noticed that this threshold ωmin
could be explained assuming that light consisted of certain particles (now called photons) with energy
6 Max Planck himself wrote ħω as hν, where ν = ω/2π is the “cyclic” frequency (the number of periods per
second) so that in early texts on quantum mechanics the term “Planck’s constant” referred to h ≡ 2πħ, while ħ was
called “the Dirac constant” for a while. I will use the contemporary terminology, and abstain from using the “old
Planck’s constant” h at all, to avoid confusion.
$$E = \hbar\omega, \qquad \text{(energy vs frequency)} \tag{1.5}$$
with the same Planck’s constant that participates in Eq. (3).7 Indeed, with this assumption, at the photon
absorption by an electron, its energy E = ħω is divided between a fixed energy U0 (nowadays called the
workfunction) of electron’s binding inside the metal, and the excess kinetic energy mev²/2 > 0 of the
freed electron – see Fig. 2. In this picture, the frequency threshold finds a natural explanation as ωmin =
U0/ħ.8 Moreover, as was shown by Satyendra Nath Bose in 1924,9 Eq. (5) explains Planck’s law (3).
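Fig. 1.2. Einstein’s explanation of the photoelectric effect’s frequency threshold: the photon energy E = ħω is
divided between the workfunction U0 and the kinetic energy mev²/2 = E – U0 of the freed electron (q = –e, m = me).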
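The threshold relation ωmin = U0/ħ is easy to evaluate numerically. The sketch below is only an illustration: it assumes rounded SI values of the constants, takes the workfunction range of 4 to 5 eV quoted in footnote 8, and uses hypothetical function names.

```python
import math

HBAR = 1.054571817e-34  # J*s, reduced Planck constant (rounded)
C = 2.99792458e8        # m/s, free-space speed of light
EV = 1.602176634e-19    # joules per electron-volt

def threshold_wavelength(workfunction_ev):
    """Longest wavelength (m) that can still free an electron: omega_min = U0/hbar."""
    omega_min = workfunction_ev * EV / HBAR
    return 2 * math.pi * C / omega_min

def excess_kinetic_energy_ev(wavelength_m, workfunction_ev):
    """Kinetic energy hbar*omega - U0 in eV; a negative value means no emission."""
    photon_energy_ev = HBAR * (2 * math.pi * C / wavelength_m) / EV
    return photon_energy_ev - workfunction_ev

if __name__ == "__main__":
    for u0 in (4.0, 5.0):
        print(f"U0 = {u0} eV -> lambda_max = {threshold_wavelength(u0) * 1e9:.0f} nm")
    print(f"200 nm light, U0 = 4.5 eV: {excess_kinetic_energy_ev(200e-9, 4.5):.2f} eV")
```

Note that the light intensity does not enter either function: in this picture it affects only the number of freed electrons, not the threshold.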
(iii) The discrete frequency spectra of the electromagnetic radiation by excited atomic gases
could not be explained by classical physics. (Applied to the planetary model of atoms, proposed by
Ernest Rutherford, classical electrodynamics predicts the collapse of electrons on nuclei in ~10⁻¹⁰ s, due to
the electric-dipole radiation of electromagnetic waves.10) Especially challenging was the observation by
Johann Jacob Balmer (in 1885) that the radiation frequencies of simple atoms may be well described by
simple formulas. For example, for the lightest, hydrogen atom, all radiation frequencies may be
numbered with just two positive integers n and n’:
$$\omega_{n,n'} = \omega_0\left(\frac{1}{n^2} - \frac{1}{n'^2}\right), \tag{1.6}$$
with ω0 ≡ ω1,∞ ≈ 2.07×10¹⁶ s⁻¹. This observation, and the experimental value of ω0, found their first
explanation in the famous 1913 theory by Niels Henrik David Bohr, which was a phenomenological
precursor of the present-day quantum mechanics. In this theory, ωn,n’ was interpreted as the frequency of
a photon that obeys Einstein’s formula (5), with its energy En,n’ = ħωn,n’ being the difference between
two quantized (discrete) energy levels of the atom (Fig. 3):
$$E_{n,n'} = E_{n'} - E_n > 0. \tag{1.7}$$
Bohr showed that Eq. (6) may be obtained from Eqs. (5) and (7), and non-relativistic11 classical
mechanics, augmented with just one additional postulate, equivalent to the assumption that the angular
7 As a reminder, A. Einstein received his only Nobel Prize (in 1921) for exactly this work, rather than for his
relativity theory, i.e. essentially for jumpstarting the same quantum theory which he later questioned.
8 For most metals, U0 is between 4 and 5 electron-volts (eV), so that the threshold corresponds to λmax = 2πc/ωmin
= 2πc/(U0/ħ) ≈ 300 nm – approximately at the border between the visible light and the ultraviolet radiation.
9 See, e.g., SM Sec. 2.5.
10 See, e.g., EM Sec. 8.2.
11 The non-relativistic approach to the problem may be justified a posteriori by the fact that the resulting energy scale
EH, given by Eq. (13), is much smaller than the electron’s rest energy, mec² ≈ 0.5 MeV.
momentum L = mevr of an electron moving with velocity v on a circular orbit of radius r about the
hydrogen’s nucleus (the proton, assumed to be at rest because of its much higher mass), is quantized as
$$L = n\hbar, \qquad \text{(angular momentum quantization)} \tag{1.8}$$
where ħ is again the same Planck’s constant (4), and n is an integer. (In Bohr’s theory, n could not be
equal to zero, though in the genuine quantum mechanics, it can.)
Fig. 1.3. The electromagnetic radiation of a system as a result of the transition between its quantized
energy levels En’ and En, with ħωn,n’ = En’ – En.
Indeed, it is sufficient to solve Eq. (8), mevr = nħ, together with the equation
$$m_e\frac{v^2}{r} = \frac{e^2}{4\pi\varepsilon_0 r^2}, \tag{1.9}$$
which expresses the 2nd Newton’s law for an electron rotating in the Coulomb field of the nucleus. (Here
e ≈ 1.6×10⁻¹⁹ C is the fundamental electric charge, and me ≈ 0.91×10⁻³⁰ kg is the electron’s rest mass.)
The result for r is
$$r = n^2 r_\mathrm{B}, \quad \text{where } r_\mathrm{B} \equiv \frac{\hbar^2/m_e}{e^2/4\pi\varepsilon_0} \approx 0.0529\ \mathrm{nm}. \qquad \text{(Bohr radius)} \tag{1.10}$$
The constant rB, called the Bohr radius, is the most important spatial scale of phenomena in atomic,
molecular, and condensed-matter physics – and hence in all chemistry and biochemistry.
Now plugging these results into the non-relativistic expression for the full electron energy (with
its rest energy taken for reference),
$$E = \frac{m_e v^2}{2} - \frac{e^2}{4\pi\varepsilon_0 r}, \tag{1.11}$$
we get the following simple expression for the electron’s energy levels:
$$E_n = -\frac{E_\mathrm{H}}{2n^2} < 0, \qquad \text{(hydrogen atom's energy levels)} \tag{1.12}$$
which, together with Eqs. (5) and (7), immediately gives Eq. (6) for the radiation frequencies. Here EH is
the so-called Hartree energy constant (or just the “Hartree energy”)12
$$E_\mathrm{H} \equiv \frac{(e^2/4\pi\varepsilon_0)^2}{\hbar^2/m_e} \approx 4.360\times 10^{-18}\ \mathrm{J} \approx 27.21\ \mathrm{eV}. \qquad \text{(Hartree energy constant)} \tag{1.13a}$$
(Note the useful relations, which follow from Eqs. (10) and (13a):
12 Unfortunately, another name, the “Rydberg constant”, is sometimes used for either this energy unit or its half,
EH/2 ≈ 13.6 eV. To add to the confusion, the same term “Rydberg constant” is used in some sub-fields of physics
for the reciprocal free-space wavelength (1/λ0 = ω0/2πc) corresponding to the frequency ω0 = EH/2ħ.
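$$E_\mathrm{H} = \frac{e^2}{4\pi\varepsilon_0 r_\mathrm{B}} = \frac{\hbar^2}{m_e r_\mathrm{B}^2}, \quad \text{i.e. } r_\mathrm{B} = \frac{e^2/4\pi\varepsilon_0}{E_\mathrm{H}} = \left(\frac{\hbar^2/m_e}{E_\mathrm{H}}\right)^{1/2}; \tag{1.13b}$$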
the first of them shows, in particular, that rB is the distance at which the natural scales of the electron’s
potential and kinetic energies are equal.)
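The numbers quoted in Eqs. (6), (10), and (13a) may be reproduced with a few lines of Python. This is a minimal sketch, assuming rounded SI values of the fundamental constants and the infinitely heavy nucleus of Bohr's model; all names are hypothetical and serve only this illustration.

```python
import math

HBAR = 1.054571817e-34      # J*s
M_E = 9.1093837015e-31      # kg, electron rest mass
E_CHARGE = 1.602176634e-19  # C, fundamental electric charge
EPS0 = 8.8541878128e-12     # F/m, electric constant
C = 2.99792458e8            # m/s

coulomb = E_CHARGE ** 2 / (4 * math.pi * EPS0)  # e^2/(4 pi eps0), in J*m
r_B = (HBAR ** 2 / M_E) / coulomb               # Bohr radius, Eq. (1.10)
E_H = coulomb ** 2 / (HBAR ** 2 / M_E)          # Hartree energy, Eq. (1.13a)
omega_0 = E_H / (2 * HBAR)                      # frequency scale of Eq. (1.6)

def energy_level(n):
    """Eq. (1.12): E_n = -E_H / (2 n^2), in joules."""
    return -E_H / (2 * n ** 2)

def transition_frequency(n, n_prime):
    """Eq. (1.6), obtained from Eqs. (1.5), (1.7), and (1.12)."""
    return (energy_level(n_prime) - energy_level(n)) / HBAR

if __name__ == "__main__":
    print(f"r_B = {r_B * 1e9:.4f} nm, E_H = {E_H / E_CHARGE:.2f} eV, "
          f"omega_0 = {omega_0:.3e} 1/s")
    wavelength = 2 * math.pi * C / transition_frequency(2, 3)
    print(f"n = 2, n' = 3 line: {wavelength * 1e9:.1f} nm")
```

The last line evaluates the n = 2, n’ = 3 transition, whose wavelength falls in the red part of the visible spectrum.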
Note also that Eq. (8), in the form pr = nħ, where p = mev is the electron momentum’s
magnitude, may be rewritten as the condition that an integer number (n) of wavelengths λ of certain
(before the late 1920s, hypothetical) waves13 fits the circular orbit’s perimeter: 2πr ≡ 2πnħ/p = nλ.
Dividing both parts of the last equality by n, we see that for this statement to be true, the wave number k
≡ 2π/λ of the de Broglie waves should be proportional to the electron’s momentum p = mev:
$$p = \hbar k, \qquad \text{(momentum vs wave number)} \tag{1.14}$$
again with the same Planck’s constant as in Eq. (5).
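(iv) The Compton effect14 is the reduction of frequency of X-rays at their scattering on free (or
nearly-free) electrons – see Fig. 4.
Fig. 1.4. The Compton effect: a photon with momentum ħω/c is scattered by angle θ on an electron (of mass me),
leaving with momentum ħω’/c, while the electron recoils with momentum p at angle φ.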
The effect may be explained assuming that the X-ray photon also has a momentum that obeys the
vector-generalized version of Eq. (14):
$$\mathbf{p}_\mathrm{photon} = \hbar\mathbf{k} = \frac{\hbar\omega}{c}\,\mathbf{n}, \tag{1.15}$$
where k is the wavevector (whose magnitude is equal to the wave number k, and whose direction
coincides with the unit vector n directed along the wave propagation15), and that the momenta of both
the photon and the electron are related to their energies E by the classical relativistic formula16
$$E^2 = (cp)^2 + (mc^2)^2. \tag{1.16}$$
(For a photon, the rest energy mc² is zero, and this relation is reduced to Eq. (5): E = cp = cħk = ħω.)
Indeed, a straightforward solution of the following system of three equations,
13 This fact was first noticed and discussed in 1924 by Louis Victor Pierre Raymond de Broglie (in his PhD
thesis!), so that instead of speaking of wavefunctions, we are still frequently speaking of the de Broglie waves,
especially when free particles are discussed.
14 This effect was observed in 1922, and explained a year later by Arthur Holly Compton, using Eqs. (5) and (15).
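15 See, e.g., EM Sec. 7.1.

$$\hbar\omega + m_e c^2 = \hbar\omega' + \left[(cp)^2 + (m_e c^2)^2\right]^{1/2}, \tag{1.17}$$
$$\frac{\hbar\omega}{c} = \frac{\hbar\omega'}{c}\cos\theta + p\cos\varphi, \tag{1.18}$$
$$0 = \frac{\hbar\omega'}{c}\sin\theta - p\sin\varphi, \tag{1.19}$$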
(which express the conservation of, respectively, the full energy of the system and the two relevant
Cartesian components of its full momentum, at the scattering event – see Fig. 4), yields the result:
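$$\frac{1}{\hbar\omega'} = \frac{1}{\hbar\omega} + \frac{1}{m_e c^2}(1 - \cos\theta), \tag{1.20a}$$
which is traditionally represented as the relation between the initial and final values of the photon’s
wavelength λ = 2π/k = 2π/(ω/c):17
$$\lambda' = \lambda + \frac{2\pi\hbar}{m_e c}(1 - \cos\theta) \equiv \lambda + \lambda_\mathrm{C}(1 - \cos\theta), \quad \text{with } \lambda_\mathrm{C} \equiv \frac{2\pi\hbar}{m_e c}, \qquad \text{(Compton effect)} \tag{1.20b}$$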
and is in agreement with experiment.
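The step from the conservation laws (17)-(19) to the result (20a) may be checked numerically. The following minimal sketch assumes units in which mec² = 1 and c = 1 (so that photon energies are measured in units of the electron's rest energy); it takes the scattered photon's energy from Eq. (20a), finds the electron's momentum components from Eqs. (18) and (19), and then evaluates the mismatch of the energy balance (17). The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # J*s
M_E = 9.1093837015e-31  # kg
C = 2.99792458e8        # m/s

def scattered_photon_energy(energy, theta):
    """Eq. (1.20a) in units of m_e c^2: 1/E' = 1/E + (1 - cos(theta))."""
    return 1.0 / (1.0 / energy + (1.0 - math.cos(theta)))

def energy_balance_residual(energy, theta):
    """Left-hand minus right-hand side of Eq. (1.17), in units of m_e c^2."""
    energy_out = scattered_photon_energy(energy, theta)
    cp_x = energy - energy_out * math.cos(theta)  # c*p*cos(phi), from Eq. (1.18)
    cp_y = energy_out * math.sin(theta)           # c*p*sin(phi), from Eq. (1.19)
    electron_energy = math.sqrt(cp_x ** 2 + cp_y ** 2 + 1.0)
    return (energy + 1.0) - (energy_out + electron_energy)

# The constant of Eq. (1.20b) in SI units
lambda_C = 2 * math.pi * HBAR / (M_E * C)

if __name__ == "__main__":
    for energy, theta in ((0.1, 1.0), (2.0, 2.5)):
        print(f"E = {energy}, theta = {theta}: "
              f"residual = {energy_balance_residual(energy, theta):.1e}")
    print(f"lambda_C = {lambda_C:.4e} m")
```

The residual vanishes (to the rounding accuracy) for any initial photon energy and scattering angle, as it should if Eq. (20a) indeed solves the system.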
(v) De Broglie wave diffraction. In 1927, Clinton Joseph Davisson and Lester Germer, and
independently George Paget Thomson, succeeded in observing the diffraction of electrons on solid crystals
(Fig. 5). Specifically, they found that the intensity of the elastic reflection of electrons from a
crystal increases sharply when the angle θ between the incident beam of electrons and the crystal’s
atomic planes, separated by distance d, satisfies the following relation:
$$2d\sin\theta = n\lambda, \qquad \text{(Bragg condition)} \tag{1.21}$$
where λ = 2π/k = 2πħ/p is the de Broglie wavelength of the electrons, and n is an integer. As Fig. 5
shows, this is just the well-known condition18 that the path difference Δl = 2d sin θ between the de
Broglie waves reflected from two adjacent crystal planes coincides with an integer number of λ, i.e. the
condition of the constructive interference of the waves.19
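To get a feeling of the scales involved in Eq. (21), here is a minimal Python sketch. The electron energy (100 eV) and the interplane distance (0.2 nm) are illustrative assumptions rather than parameters of the actual experiments, the electron is treated non-relativistically, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # J*s
M_E = 9.1093837015e-31  # kg
EV = 1.602176634e-19    # joules per electron-volt

def de_broglie_wavelength(kinetic_energy_ev, mass=M_E):
    """lambda = 2*pi*hbar/p, with the non-relativistic p = sqrt(2 m E)."""
    p = math.sqrt(2 * mass * kinetic_energy_ev * EV)
    return 2 * math.pi * HBAR / p

def bragg_angles(wavelength, d):
    """All angles theta (in radians) that satisfy 2 d sin(theta) = n lambda."""
    angles = []
    n = 1
    while n * wavelength <= 2 * d:
        angles.append(math.asin(n * wavelength / (2 * d)))
        n += 1
    return angles

if __name__ == "__main__":
    lam = de_broglie_wavelength(100.0)  # assumed 100 eV electrons
    print(f"lambda = {lam * 1e9:.4f} nm")
    for n, theta in enumerate(bragg_angles(lam, 0.2e-9), start=1):
        print(f"n = {n}: theta = {math.degrees(theta):.1f} deg")
```

For such energies, the wavelength is comparable with typical interatomic distances in crystals, which is why the diffraction maxima are well separated in angle.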
To summarize, all the listed experimental observations could be explained starting from two very
simple (and similarly looking) formulas: Eq. (5) (at that stage, for photons only), and Eq. (15) for both
photons and electrons – both relations involving the same Planck’s constant. This fact might give an
impression of experimental evidence sufficient to declare the light consisting of discrete particles
17 The constant λC that participates in this relation is close to 2.43×10⁻¹² m, and is called the electron’s Compton
wavelength. This term is somewhat misleading: as the reader can see from Eqs. (17)-(19), no wave in the
Compton problem has such a wavelength – either before or after the scattering.
18 See, e.g., EM Sec. 8.4, in particular Fig. 8.9 and Eq. (8.82). Frequently, Eq. (21) is called the Bragg condition,
due to the pioneering experiments by W. Bragg on X-ray scattering from crystals, which were started in 1912.
19 Later, spectacular experiments on diffraction and interference of heavier particles (with the correspondingly
smaller de Broglie wavelength), e.g., neutrons and even C60 molecules, have also been performed – see, e.g., the
review by A. Zeilinger et al., Rev. Mod. Phys. 60, 1067 (1988) and a later publication by O. Nairz et al., Am. J.
Phys. 71, 319 (2003). Nowadays, such interference of heavy particles is used, for example, for ultrasensitive
measurements of gravity – see, e.g., a popular review by M. Arndt, Phys. Today 67, 30 (May 2014), and more
recent experiments by S. Abend et al., Phys. Rev. Lett. 117, 203003 (2016).
(photons), and, on the contrary, electrons being some “matter waves” rather than particles. However, by
that time (the mid-1920s), physics had accumulated overwhelming evidence of wave properties of light,
such as interference and diffraction.20 In addition, there was also strong evidence for the lumped-particle
(“corpuscular”) behavior of electrons. It is sufficient to mention the famous oil-drop experiments by
Robert Andrew Millikan and Harvey Fletcher (1909-1913), in which only single (and whole!) electrons
could be added to an oil drop, changing its total electric charge by multiples of the electron’s charge (-e) –
and never its fraction. It was apparently impossible to reconcile these observations with a purely wave
picture, in which an electron and hence its charge need to be spread over the wave’s extension, so that
an arbitrary part of it could be cut off using an appropriate experimental setup.
Fig. 1.5. The De Broglie wave interference at electron scattering from a crystal lattice: the waves reflected
at angle θ from two adjacent atomic planes, separated by distance d, acquire the path difference 2d sin θ.
Thus the founding fathers of quantum mechanics faced a formidable task of reconciling the wave
and corpuscular properties of electrons and photons – and other particles. The decisive breakthrough in
that task was achieved in 1926 by Erwin Schrödinger and Max Born, who formulated what is now
known either formally as the Schrödinger picture of non-relativistic quantum mechanics of the orbital
motion21 in the coordinate representation (this term will be explained later in the course), or informally
just as the wave mechanics. I will now formulate the main postulates of this theory.
1.2. Wave mechanics postulates

Let us consider a spinless,22 non-relativistic point-like particle, whose classical dynamics may be
described by a certain Hamiltonian function H(r, p, t),23 where r is the particle’s radius-vector and p is
its momentum. (This condition is important because it excludes from our current discussion the systems
whose interaction with their environment results in irreversible effects, in particular the friction leading
to particle energy’s decay. Such “open” systems need a more general description, which will be
discussed in Chapter 7.) Wave mechanics of such Hamiltonian particles may be based on the following
set of postulates that are comfortingly elegant – though their final justification is given only by the
agreement of all their corollaries with experiment.24
20 See, e.g., EM Sec. 8.4.

21 The orbital motion is the historic (and rather misleading) term used for any motion of the particle as a whole.
22 Actually, in wave mechanics, the spin of the described particle does not have to be equal to zero. Rather, it is assumed
that the particle spin’s effects on its orbital motion are negligible.
23 As a reminder, for many systems (including those whose kinetic energy is a quadratic-homogeneous function of
generalized velocities, like mv2/2), H coincides with the total energy E – see, e.g., CM Sec. 2.3. In what follows, I
will assume that H = E.
24 Quantum mechanics, like any theory, may be built on different sets of postulates/axioms leading to the same
final conclusions. In this text, I will not try to beat down the number of postulates to the absolute possible
(i) Wavefunction and probability. Such variables as r or p cannot always be measured exactly,
even at “perfect conditions” when all external uncertainties, including measurement instrument
imperfection, varieties of the initial state preparation, and unintended particle interactions with its
environment, have been removed.25 Moreover, r and p of the same particle can never be measured
exactly simultaneously. Instead, the most detailed description of the particle’s state allowed by Nature,
is given by a certain complex function Ψ(r, t), called the wavefunction (or “wave function”), which
generally enables only probabilistic predictions of the measured values of r, p, and other directly
measurable variables – in quantum mechanics, usually called observables.
Specifically, the probability dW of finding a particle inside an elementary volume dV ≡ d³r is
proportional to this volume, and hence may be characterized by a volume-independent probability
density w ≡ dW/d³r, which in turn is related to the wavefunction as
$$w = |\Psi(\mathbf{r}, t)|^2 \equiv \Psi^*(\mathbf{r}, t)\,\Psi(\mathbf{r}, t), \tag{1.22a}$$
where the sign * denotes the usual complex conjugation. As a result, the total probability of finding the
particle somewhere inside a volume V may be calculated as
$$W = \int_V w\,d^3r = \int_V \Psi^*\Psi\,d^3r. \qquad \text{(probability via wavefunction)} \tag{1.22b}$$
In particular, if volume V contains the particle definitely (i.e. with the 100% probability, W = 1), Eq.
(22b) is reduced to the so-called wavefunction normalization condition
$$\int_V \Psi^*\Psi\,d^3r = 1. \qquad \text{(normalization condition)} \tag{1.22c}$$
(ii) Observables and operators. With each observable A, quantum mechanics associates a certain
linear operator Â, such that (in the perfect conditions mentioned above) the average measured value of
A (usually called the expectation value) is expressed as26
$$\langle A \rangle = \int_V \Psi^*\hat{A}\Psi\,d^3r, \qquad \text{(observable's expectation value)} \tag{1.23}$$
where ⟨…⟩ means the statistical average, i.e. the result of averaging the measurement results over a large
ensemble (set) of macroscopically similar experiments, and Ψ is the normalized wavefunction that obeys
Eq. (22c). Note immediately that for Eqs. (22) and (23) to be compatible, the identity (or “unit”)
operator defined by the relation
$$\hat{I}\Psi \equiv \Psi, \qquad \text{(identity operator)} \tag{1.24}$$
has to be associated with a particular type of measurement, namely with the particle’s detection.
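Postulates (i) and (ii) may be illustrated by a simple numerical exercise. The sketch below is a 1D toy version of Eqs. (22c), (23), and (24): it assumes a Gaussian wavefunction with hypothetical parameters x0, sigma, and k0, sampled on a uniform grid, and replaces the integrals with plain sums. None of these choices comes from the text; they are made only for illustration.

```python
import cmath
import math

def gaussian_on_grid(x0, sigma, k0, x_min=-10.0, x_max=10.0, n=2001):
    """Sample Psi(x) ~ exp(-(x - x0)^2 / (4 sigma^2)) * exp(i k0 x) on a grid."""
    dx = (x_max - x_min) / (n - 1)
    xs = [x_min + i * dx for i in range(n)]
    psi = [cmath.exp(-(x - x0) ** 2 / (4 * sigma ** 2) + 1j * k0 * x) for x in xs]
    return xs, psi, dx

def normalize(psi, dx):
    """Rescale Psi so that the 1D analog of Eq. (1.22c) holds."""
    norm = sum(abs(p) ** 2 for p in psi) * dx
    return [p / math.sqrt(norm) for p in psi]

def expectation(psi, op_psi, dx):
    """1D analog of Eq. (1.23): the integral of Psi* (A Psi) over x."""
    return sum(p.conjugate() * ap for p, ap in zip(psi, op_psi)) * dx

xs, psi, dx = gaussian_on_grid(x0=1.5, sigma=0.7, k0=2.0)
psi = normalize(psi, dx)
# The identity operator, Eq. (1.24), returns the total probability W = 1
total_probability = expectation(psi, psi, dx).real
# The coordinate operator just multiplies the wavefunction by x
mean_x = expectation(psi, [x * p for x, p in zip(xs, psi)], dx).real

if __name__ == "__main__":
    print(f"W = {total_probability:.6f}, <x> = {mean_x:.6f}")
```

With these parameters, the expectation value of the coordinate coincides with the packet's center x0, while the phase factor exp(i k0 x) does not affect the probability density at all.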
minimum, not only because that would require longer argumentation, but chiefly because such attempts typically
result in making certain implicit assumptions hidden from the reader – the practice as common as regrettable.
25 I will imply such perfect conditions in the further narrative, until the discussion of the system’s interaction with
its environment in Chapter 7.
26 This key measurement postulate is sometimes called the Born rule, though sometimes this term is used for the
(less general) Eqs. (22).
(iii) The Hamiltonian operator and the Schrödinger equation. Another particular operator, the
Hamiltonian Ĥ, whose observable is the particle’s energy E, also plays in wave mechanics a very
special role, because it participates in the Schrödinger equation,
$$i\hbar\frac{\partial\Psi}{\partial t} = \hat{H}\Psi, \qquad \text{(Schrödinger equation)} \tag{1.25}$$
that determines the wavefunction’s dynamics, i.e. its time evolution.
(iv) The radius-vector and momentum operators. In wave mechanics (in the “coordinate
representation”), the vector operator of particle’s radius-vector r just multiplies the wavefunction by this
vector, while the operator of the particle’s momentum is proportional to the spatial derivative:
$$\hat{\mathbf{r}} = \mathbf{r}, \qquad \hat{\mathbf{p}} = -i\hbar\nabla, \qquad \text{(operators of coordinate and momentum)} \tag{1.26a}$$
where ∇ is the del (or “nabla”) vector operator.27 Thus in the Cartesian coordinates,
$$\hat{\mathbf{r}} = \mathbf{r} = \{x, y, z\}, \qquad \hat{\mathbf{p}} = -i\hbar\left\{\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right\}. \tag{1.26b}$$
(v) The correspondence principle. In the limit when quantum effects are insignificant, e.g., when
the characteristic scale of action28 (i.e. the product of the relevant energy and time scales of the problem)
is much larger than Planck’s constant ħ, all wave mechanics results have to tend to those given by
classical mechanics. Mathematically, this correspondence is achieved by duplicating the classical
relations between various observables by similar relations between the corresponding operators. For
example, for a free particle, the Hamiltonian (which in this particular case corresponds to the kinetic
energy T = p²/2m alone) has the form
$$\hat{H} = \hat{T} = \frac{\hat{p}^2}{2m} = -\frac{\hbar^2}{2m}\nabla^2. \qquad \text{(free particle's Hamiltonian)} \tag{1.27}$$
Now, even before a deeper discussion of the postulates’ physics (offered in the next section), we
may immediately see that they indeed provide a formal way toward resolution of the apparent
contradiction between the wave and corpuscular properties of particles. Indeed, for a free particle, the
Schrödinger equation (25), with the substitution of Eq. (27), takes the form
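$$i\hbar\frac{\partial\Psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\Psi, \qquad \text{(free particle's Schrödinger equation)} \tag{1.28}$$
whose particular, but the most important solution is a plane, single-frequency (“monochromatic”)
traveling wave,29
$$\Psi(\mathbf{r}, t) = a\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \qquad \text{(plane wave)} \tag{1.29}$$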
27 If you need, see, e.g., Secs. 8-10 of the Selected Mathematical Formulas appendix – below, referred to as MA.
Note that according to those formulas, the del operator follows all the rules of the usual (geometric) vectors. This
is, by definition, true for other quantum-mechanical vector operators to be discussed below.
28 See, e.g., CM Sec. 10.3.
29 See, e.g., CM Sec. 6.4 and/or EM Sec. 7.1.
where a, k and ω are constants. Indeed, plugging Eq. (29) into Eq. (28), we immediately see that such a
plane wave, with an arbitrary complex amplitude a, is indeed a solution of this Schrödinger equation,
provided a specific dispersion relation between the wave number k ≡ |k| and the frequency ω:
$$\hbar\omega = \frac{(\hbar k)^2}{2m}. \qquad \text{(free particle's dispersion relation)} \tag{1.30}$$
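The role of the dispersion relation (30) may also be seen numerically. The following sketch is a 1D check under stated assumptions: units with ħ = m = 1, derivatives approximated by central finite differences with a hypothetical step h, and an arbitrarily chosen point (x, t). It evaluates the mismatch between the two sides of Eq. (28) for the plane wave (29).

```python
import cmath

HBAR = 1.0  # assumed units with hbar = m = 1
MASS = 1.0

def plane_wave(x, t, k, omega, a=1.0):
    """1D version of Eq. (1.29)."""
    return a * cmath.exp(1j * (k * x - omega * t))

def schrodinger_residual(k, omega, x=0.3, t=0.2, h=1e-3):
    """|i hbar dPsi/dt + (hbar^2 / 2m) d2Psi/dx2| from central differences."""
    def psi(xx, tt):
        return plane_wave(xx, tt, k, omega)
    dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h ** 2
    lhs = 1j * HBAR * dpsi_dt
    rhs = -(HBAR ** 2 / (2 * MASS)) * d2psi_dx2
    return abs(lhs - rhs)

if __name__ == "__main__":
    k = 2.0
    omega_free = HBAR * k ** 2 / (2 * MASS)  # Eq. (1.30)
    print("omega from Eq. (1.30):", schrodinger_residual(k, omega_free))
    print("a different omega:    ", schrodinger_residual(k, omega_free + 1.0))
```

The residual is small (limited only by the finite-difference error) when ω obeys Eq. (30), and of the order of unity otherwise.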
The constant a may be calculated, for example, assuming that the wave (29) is extended over a certain
volume V, while beyond it, Ψ = 0. Then from the normalization condition (22c) and Eq. (29), we get30
$$|a|^2 V = 1. \tag{1.31}$$
Let us use Eqs. (23), (26), and (27) to calculate the expectation values of the particle’s
momentum p and energy E = H in the state (29). The result is
$$\langle\mathbf{p}\rangle = \hbar\mathbf{k}, \qquad \langle E\rangle = \langle H\rangle = \frac{(\hbar k)^2}{2m}; \tag{1.32}$$
according to Eq. (30), the last equality may be rewritten as ⟨E⟩ = ħω.
Next, Eq. (23) enables calculation of not only the average (in the math speak, the first moment)
of an observable but also its higher moments, notably the second moment – in physics, usually called
variance:
$$\langle\tilde{A}^2\rangle \equiv \langle(A - \langle A\rangle)^2\rangle = \langle A^2\rangle - \langle A\rangle^2, \qquad \text{(observable's variance)} \tag{1.33}$$
and hence its uncertainty, alternatively called the “root-mean-square (r.m.s.) fluctuation”,
$$\delta A \equiv \langle\tilde{A}^2\rangle^{1/2}. \qquad \text{(observable's uncertainty)} \tag{1.34}$$
The uncertainty is a scale of deviations Ã ≡ A – ⟨A⟩ of measurement results from their average. In the
particular case when the uncertainty δA equals zero, every measurement of the observable A will give
the same value ⟨A⟩; such a state is said to have a definite value of the variable. For example, in
application to the state with wavefunction (29), these relations yield δE = 0, δp = 0. This means that in
this plane-wave, monochromatic state, the energy and momentum of the particle have definite values, so
that the statistical average signs in Eqs. (32) might be removed. Thus, these relations are reduced to the
experimentally-inferred Eqs. (5) and (15).
Hence the wave mechanics postulates indeed may describe the observed wave properties of non-
relativistic particles. (For photons, we would need its relativistic generalization – see Chapter 9 below.)
On the other hand, due to the linearity of the Schrödinger equation (25), any sum of its solutions is also
a solution – the so-called linear superposition principle. For a free particle, this means that any set of
plane waves (29) is also a solution to this equation. Such sets, with close values of k and hence p = ħk
(and, according to Eq. (30), of ω as well), may be used to describe spatially localized “pulses”, called
wave packets – see Fig. 6. In Sec. 2.1, I will prove (or rather reproduce H. Weyl’s proof :-) that the
wave packet’s extension δx in any direction (say, x) is related to the width δkx of the distribution of the
30 For infinite space (V → ∞), Eq. (31) yields a → 0, i.e. wavefunction (29) vanishes. This formal problem may be
readily resolved considering sufficiently long wave packets – see Sec. 2.2 below.
corresponding component of its wave vector as δxδkx ≥ ½, and hence, according to Eq. (15), to the width
δpx of the momentum component distribution as
$$\delta x\,\delta p_x \ge \frac{\hbar}{2}. \qquad \text{(Heisenberg's uncertainty relation)} \tag{1.35}$$
Fig. 1.6. (a) A snapshot of a typical wave packet (Re Ψ and Im Ψ versus x, with extension δx: “the particle is
(somewhere :-) here!”) propagating along axis x, and (b) the corresponding distribution of the wave numbers kx,
i.e. the momenta px (the amplitudes ak versus kx = px/ħ, centered at k0, with width δkx).
This is the famous Heisenberg’s uncertainty principle, which quantifies the first postulate’s
point that the coordinate and the momentum cannot be defined exactly simultaneously. However, since
Planck’s constant, $\hbar \sim 10^{-34}$ J·s, is extremely small on the human scale of things, it still allows for
particle localization in a very small volume even if the momentum spread in a wave packet is also small
on that scale. For example, according to Eq. (35), a 0.1% spread of momentum of a 1 keV electron
($p \approx 1.7\times10^{-23}$ kg·m/s) allows its wave packet to be as small as ~$3\times10^{-9}$ m. (For a heavier particle such as a
proton, the packet would be even tighter.) As a result, wave packets may be used to describe the
particles that are quite point-like from the macroscopic point of view.
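The numbers of such an estimate are easy to check. The following minimal sketch assumes the non-relativistic relation $p = (2mT)^{1/2}$ for the kinetic energy T (a good approximation for a 1 keV electron) and the CODATA values of the constants; the function names are illustrative only.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # one electronvolt, J

def momentum(kinetic_energy_ev, mass):
    """Non-relativistic momentum p = sqrt(2 m T), in kg*m/s."""
    return math.sqrt(2.0 * mass * kinetic_energy_ev * EV)

def min_packet_size(p, relative_spread):
    """Smallest delta-x allowed by Eq. (1.35) for delta-p = relative_spread * p."""
    return HBAR / (2.0 * relative_spread * p)

p = momentum(1e3, M_E)           # 1 keV electron
dx = min_packet_size(p, 1e-3)    # 0.1% momentum spread
print(f"p  = {p:.2e} kg*m/s")
print(f"dx = {dx:.2e} m")
```

Replacing M_E with the proton mass in the same call shows how the minimum packet size shrinks for a heavier particle of the same energy.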
In a nutshell, this is the main idea of wave mechanics, and the first part of this course (Chapters
1-3) will be essentially a discussion of various effects described by this approach. During this
discussion, however, we will not only witness wave mechanics’ many triumphs within its applicability
domain but also gradually accumulate evidence for its handicaps, which will force an eventual transfer
to a more general formalism – to be discussed in Chapter 4 and beyond.
1.3. Postulates’ discussion

The wave mechanics’ postulates listed in the previous section (hopefully, familiar to the reader
from their undergraduate studies) may look very simple. However, the physics of these axioms is very
deep, leading to some counter-intuitive conclusions, and their in-depth discussion requires solutions of
several key problems of wave mechanics. This is why in this section I will give only an initial,
admittedly superficial discussion of the postulates, and will be repeatedly returning to the conceptual
foundations of quantum mechanics throughout the course, especially in Chapter 10.
First of all, the fundamental uncertainty of observables, which is at the core of the first postulate,
is very foreign to the basic ideas of classical mechanics, and historically has made quantum mechanics
so hard to swallow for many star physicists, notably including Albert Einstein – despite his 1905 work,
which essentially launched the whole field! However, this fact has been confirmed by numerous
experiments, and (more importantly) there has not been a single confirmed experiment that would
contradict this postulate, so that quantum mechanics was long ago promoted from a theoretical
hypothesis to the rank of a reliable scientific theory.
One more remark in this context is that Eq. (25) itself is deterministic, i.e. conceptually enables
an exact calculation of the wavefunction’s distribution in space at any instant t, provided that its initial
distribution, and the particle’s Hamiltonian, are known exactly. Note that in classical statistical
mechanics, the probability density distribution w(r, t) may be also calculated from deterministic
differential equations, for example, the Liouville equation.31 The quantum-mechanical description
differs from that situation in two important aspects. First, in the perfect conditions outlined above (the
best possible initial state preparation and measurements), the Liouville equation is reduced to Newton’s
2nd law of classical mechanics, i.e. the statistical uncertainty of its results disappears. In quantum
mechanics this is not true: the quantum uncertainty, such as described by Eq. (35), persists even in this
limit. Second, the wavefunction $\Psi(\mathbf{r}, t)$ gives more information than just w(r, t), because besides the
modulus of $\Psi$, involved in Eq. (22), this complex function also has the phase $\varphi \equiv \arg\Psi$, which may
affect some observables, describing, in particular, interference of the de Broglie waves.
Next, it is very important to understand that the relation between the quantum mechanics and
experiment, given by the second postulate, necessarily involves another key notion: that of the
corresponding statistical ensemble, in this case, a set of many experiments carried out at apparently
(macroscopically) similar conditions, including the initial conditions – which nevertheless may lead to
different measurement results (outcomes). Indeed, the probability of a certain (nth) outcome of an
experiment may be only defined for a certain statistical ensemble, as the limit
$$W_n \equiv \lim_{M \to \infty} \frac{M_n}{M}, \quad \text{with } M \equiv \sum_{n=1}^{N} M_n, \tag{1.36}$$ [Probability: definition]
where M is the total number of experiments, Mn is the number of outcomes of the nth type, and N is the
number of different outcomes.
Note that a particular choice of statistical ensemble may affect probabilities Wn very
significantly. For example, if we pull out playing cards at random from a standard pack of 52 different
cards of 4 suits, the probability Wn of getting a certain card (e.g., the queen of spades) is 1/52. However,
if the cards of a certain suit (say, hearts) had been taken out from the pack in advance, the probability of
getting the queen of spades is higher, 1/39. It is important that we would also get the latter number for the
probability even if we had used the full 52-card pack, but for some reason discarded the results of all
experiments giving us a card of hearts of any rank. Hence, the ensemble definition (or its redefinition in the
middle of the game) may change outcome probabilities.
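The card example may be reproduced by a simple Monte Carlo experiment. The sketch below is only an illustration: the function name, the integer coding of ranks (12 for the queen), and the number of trials are my assumptions. It draws from the full 52-card pack and, optionally, discards all results showing a given suit, i.e. redefines the ensemble after the fact.

```python
import random

def queen_of_spades_probability(num_trials, discard_suit=None, seed=0):
    """Estimate W_n as M_n / M, per Eq. (1.36), with optional post-selection."""
    rng = random.Random(seed)
    suits = ["spades", "hearts", "diamonds", "clubs"]
    deck = [(rank, suit) for suit in suits for rank in range(1, 14)]
    target = (12, "spades")          # the queen of spades
    kept = hits = 0
    for _ in range(num_trials):
        card = rng.choice(deck)      # always drawn from the full pack
        if card[1] == discard_suit:
            continue                 # this experiment is dropped from the ensemble
        kept += 1
        hits += card == target
    return hits / kept

w_full = queen_of_spades_probability(200_000)
w_selected = queen_of_spades_probability(200_000, discard_suit="hearts")
print(w_full, 1 / 52)
print(w_selected, 1 / 39)
```

The pack itself is the same in both runs; only the subset of outcomes taken into account differs.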
In wave mechanics, with its fundamental relation (22) between w and $\Psi$, this means that not only
the outcome probabilities, but the wavefunction itself also may depend on the statistical ensemble we
are using, i.e. not only on the preparation of the system and the experimental setup, but also on the
subset of outcomes taken into account. The sometimes encountered attribution of the wavefunction to a
single experiment, both before and after the measurement, may lead to very unphysical interpretations of
the results, including some wavefunction’s evolution which is not described by the Schrödinger equation
(the so-called wave packet reduction), superluminal action at a distance, etc. Later in the course, we will
see that minding the fundamentally statistical nature of quantum mechanics, and in particular the
dependence of wavefunctions on the statistical ensembles’ definition (or redefinition), readily resolves
some, though not all, paradoxes of quantum measurements.
31 See, e.g., SM Sec. 6.1.
Note, however, again that the standard quantum mechanics, as discussed in Chapters 1-6 of this
course, is limited to statistical ensembles with the least possible uncertainty of the considered systems,
i.e. with the best possible knowledge of their state.32 This condition requires, first, the least uncertain
initial preparation of the system, and second, its total isolation from the rest of the world, or at least from
its disordered part (the “environment”), in the course of its evolution in time. Only such ensembles may
be described by certain wavefunctions. A detailed discussion of more general ensembles, which are
necessary if these conditions are not satisfied, will be given in Chapters 7, 8, and 10.
Finally, regarding Eq. (23): a better feeling of this definition may be obtained by its comparison
with the general definition of the expectation value (i.e. the statistical average) in the probability theory.
Namely, let each of N possible outcomes in a set of M experiments give a certain value An of observable
A; then
$$\langle A \rangle = \lim_{M\to\infty}\frac{1}{M}\sum_{n=1}^{N} A_n M_n = \sum_{n=1}^{N} A_n W_n. \tag{1.37}$$ [Definition of statistical average]
Taking into account Eq. (22), which relates W and $\Psi$, the structures of Eq. (23) and the final form of Eq.
(37) are similar. Their exact relation will be further discussed in Sec. 4.1.
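As a simple numerical illustration of Eqs. (36) and (37), the sketch below takes a hypothetical “observable” with six equally probable outcomes $A_n$ = 1, ..., 6 (a fair die, my choice for the example), counts the outcomes $M_n$ in M simulated experiments, and forms the average from the estimated probabilities.

```python
import random
from collections import Counter

def ensemble_statistics(outcome_values, num_experiments, seed=1):
    """Return the estimated probabilities W_n and the statistical average <A>."""
    rng = random.Random(seed)
    counts = Counter(rng.choice(outcome_values) for _ in range(num_experiments))  # M_n
    probabilities = {a: m_n / num_experiments for a, m_n in counts.items()}       # W_n ~ M_n / M
    average = sum(a * w_n for a, w_n in probabilities.items())                    # sum of A_n W_n
    return probabilities, average

probabilities, average = ensemble_statistics([1, 2, 3, 4, 5, 6], 100_000)
print(probabilities)
print(average)   # approaches 3.5 as the number of experiments grows
```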
1.4. Continuity equation

The wave mechanics postulates survive one more sanity check: they satisfy the natural
requirement that the particle does not appear or vanish in the course of the quantum evolution.33 Indeed,
let us use Eq. (22b) to calculate the rate of change of the probability W to find a particle within a certain
volume V:
$$\frac{dW}{dt} = \frac{d}{dt}\int_V \Psi\Psi^* \, d^3r. \tag{1.38}$$
Assuming for simplicity that the boundaries of the volume V do not move, it is sufficient to carry out the
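partial differentiation of the product $\Psi\Psi^*$ inside the integral. Using the Schrödinger equation (25),
together with its complex conjugate,
$$-i\hbar\frac{\partial \Psi^*}{\partial t} = (\hat{H}\Psi)^*, \tag{1.39}$$
we readily get
$$\frac{dW}{dt} = \int_V \frac{\partial}{\partial t}\left(\Psi\Psi^*\right)d^3r \equiv \int_V \left(\Psi^*\frac{\partial\Psi}{\partial t} + \Psi\frac{\partial\Psi^*}{\partial t}\right)d^3r = \frac{1}{i\hbar}\int_V\left[\Psi^*\left(\hat{H}\Psi\right) - \Psi\left(\hat{H}\Psi\right)^*\right]d^3r. \tag{1.40}$$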
32 The reader should not be surprised by the use of the notion of “knowledge” (or “information”) in this context.
Indeed, due to the statistical character of experiment outcomes, quantum mechanics (or at least its relation to
experiment) is intimately related to information theory. In contrast to much of classical physics, which may be
discussed without any reference to information, in quantum mechanics, as in classical statistical physics, such
abstraction is possible only in some very special (and not the most interesting) cases.
33 Note that this requirement may be violated in the relativistic quantum theory – see Chapter 9.
Let the particle move in a field of external forces (not necessarily constant in time), so that its
classical Hamiltonian function H is the sum of the particle’s kinetic energy $T = p^2/2m$ and its potential
energy U(r, t).34 According to the correspondence principle, and Eq. (27), the Hamiltonian operator may
be represented as the sum35,
$$\hat{H} = \hat{T} + \hat{U} = \frac{\hat{p}^2}{2m} + U(\mathbf{r},t) = -\frac{\hbar^2}{2m}\nabla^2 + U(\mathbf{r},t). \tag{1.41}$$ [Potential field: Hamiltonian]
At this stage, we should notice that this operator, when acting on a real function, returns a real
function.36 Hence, the result of its action on an arbitrary complex function $\Psi = a + ib$ (where a and b are
real) is
$$\hat{H}\Psi = \hat{H}(a + ib) = \hat{H}a + i\hat{H}b, \tag{1.42}$$
where $\hat{H}a$ and $\hat{H}b$ are also real, while
$$(\hat{H}\Psi)^* = (\hat{H}a + i\hat{H}b)^* = \hat{H}a - i\hat{H}b = \hat{H}(a - ib) = \hat{H}\Psi^*. \tag{1.43}$$
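This means that Eq. (40) may be rewritten as
$$\frac{dW}{dt} = \frac{1}{i\hbar}\int_V\left(\Psi^*\hat{H}\Psi - \Psi\hat{H}\Psi^*\right)d^3r \equiv -\frac{\hbar^2}{2m}\frac{1}{i\hbar}\int_V\left(\Psi^*\nabla^2\Psi - \Psi\nabla^2\Psi^*\right)d^3r. \tag{1.44}$$
Now let us use general rules of vector calculus37 to write the following identity:
$$\nabla\cdot\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right) = \Psi^*\nabla^2\Psi - \Psi\nabla^2\Psi^*. \tag{1.45}$$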
A comparison of Eqs. (44) and (45) shows that we may write
$$\frac{dW}{dt} = -\int_V (\nabla\cdot\mathbf{j})\,d^3r, \tag{1.46}$$
where the vector j is defined as
$$\mathbf{j} \equiv \frac{i\hbar}{2m}\left(\Psi\nabla\Psi^* - \text{c.c.}\right) \equiv \frac{\hbar}{m}\,\text{Im}\left(\Psi^*\nabla\Psi\right), \tag{1.47}$$ [Probability current density]
where c.c. means the complex conjugate of the previous expression – in this case, $(\Psi\nabla\Psi^*)^*$, i.e. $\Psi^*\nabla\Psi$.
Now using the well-known divergence theorem,38 Eq. (46) may be rewritten as the continuity equation
$$\frac{dW}{dt} + I = 0, \quad \text{with } I \equiv \oint_S j_n\,d^2r, \tag{1.48}$$ [Continuity equation: integral form]
34 As a reminder, such description is valid not only for conservative forces (in that case U has to be time-
independent), but also for any force F(r, t) that may be expressed via the gradient of U(r, t) – see, e.g., CM
Chapters 2 and 10. (A good example when such a description is impossible is given by the magnetic component
of the Lorentz force – see, e.g., EM Sec. 9.7, and also Sec. 3.1 below.)
35 Historically, this was the main step made (in 1926) by E. Schrödinger on the background of L. de Broglie’s
idea. The probabilistic interpretation of the wavefunction was put forward, almost simultaneously, by M. Born.
36 In Chapter 4, we will discuss a more general family of Hermitian operators, which have this property.
37 See, e.g., MA Eq. (11.4a), combined with the del operator’s definition $\nabla^2 \equiv \nabla\cdot\nabla$.
38 See, e.g., MA Eq. (12.2).
where $j_n$ is the component of the vector j along the outwardly directed normal to the closed surface S
that limits the volume V, i.e. the scalar product $\mathbf{j}\cdot\mathbf{n}$, where n is the unit vector along this normal.
Equalities (47) and (48) show that if the wavefunction on the surface vanishes, the total
probability W of finding the particle within the volume does not change, providing the intended sanity
check. In the general case, Eq. (48) says that dW/dt equals the flux I of the vector j through the surface,
with the minus sign. It is clear that this vector may be interpreted as the probability current density –
and I, as the total probability current through the surface S. This interpretation may be further supported
by rewriting Eq. (47) for the wavefunction represented in the polar form $\Psi = ae^{i\varphi}$, with real a and $\varphi$:
$$\mathbf{j} = a^2\frac{\hbar}{m}\nabla\varphi. \tag{1.49}$$
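The step from Eq. (47) to Eq. (49) may be spelled out as follows. This is just a routine expansion, assuming that a and $\varphi$ are differentiable real functions of the coordinates:

```latex
\begin{aligned}
\nabla\Psi &= \nabla\left(a e^{i\varphi}\right)
            = \left(\nabla a + i a \nabla\varphi\right) e^{i\varphi}, \\
\Psi^*\nabla\Psi &= a e^{-i\varphi}\left(\nabla a + i a \nabla\varphi\right) e^{i\varphi}
            = a\nabla a + i a^2 \nabla\varphi, \\
\mathbf{j} &= \frac{\hbar}{m}\,\mathrm{Im}\left(\Psi^*\nabla\Psi\right)
            = \frac{\hbar}{m}\, a^2 \nabla\varphi .
\end{aligned}
```

The purely real term $a\nabla a$ does not contribute to the imaginary part, so that only the phase gradient survives.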
Note that for a real wavefunction, or even for a wavefunction with an arbitrary but space-constant phase
$\varphi$, the probability current density vanishes. On the contrary, for the traveling wave (29), with a constant
probability density $w = a^2$, Eq. (49) yields a non-zero (and physically very transparent) result:
$$\mathbf{j} = w\frac{\hbar}{m}\mathbf{k} = w\frac{\mathbf{p}}{m} = w\mathbf{v}, \tag{1.50}$$
where v = p/m is the particle’s velocity. If multiplied by the particle’s mass m, the probability density w
turns into the (average) mass density $\rho$, and the probability current density, into the mass flux density $\rho\mathbf{v}$.
Similarly, if multiplied by the total electric charge q of the particle, with w turning into the charge
density, j becomes the electric current density. As the reader (hopefully :-) knows, both these currents
satisfy classical continuity equations similar to Eq. (48).39
Finally, let us recast the continuity equation, rewriting Eq. (46) as
$$\int_V\left(\frac{\partial w}{\partial t} + \nabla\cdot\mathbf{j}\right)d^3r = 0. \tag{1.51}$$
Now we may argue that this equality may be true for any choice of the volume V only if the expression
under the integral vanishes everywhere, i.e. if
$$\frac{\partial w}{\partial t} + \nabla\cdot\mathbf{j} = 0. \tag{1.52}$$ [Continuity equation: differential form]
This differential form of the continuity equation may be more convenient than its integral form (48).
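The differential form lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: a 1D free particle, units with $\hbar$ = m = 1, a wavefunction taken as the sum of two plane waves with the free-particle dispersion $\omega = \hbar k^2/2m$, and derivatives approximated by central finite differences with a hand-picked step h. All function names are mine.

```python
import numpy as np

HBAR = M = 1.0   # illustrative units
H_STEP = 1e-4    # finite-difference step (an assumption of this sketch)

def psi(x, t, ks=(2.0, 3.0)):
    """Sum of free-particle plane waves exp{i(kx - wt)}, with w = hbar k^2 / 2m."""
    return sum(np.exp(1j * (k * x - HBAR * k**2 / (2 * M) * t)) for k in ks)

def density(x, t):
    """Probability density w = |Psi|^2."""
    return np.abs(psi(x, t)) ** 2

def current(x, t, h=H_STEP):
    """1D version of Eq. (1.47): j = (hbar/m) Im(Psi* dPsi/dx)."""
    dpsi_dx = (psi(x + h, t) - psi(x - h, t)) / (2 * h)
    return (HBAR / M) * np.imag(np.conj(psi(x, t)) * dpsi_dx)

def continuity_residual(x, t, h=H_STEP):
    """Left-hand side of Eq. (1.52); should vanish up to discretization error."""
    dw_dt = (density(x, t + h) - density(x, t - h)) / (2 * h)
    dj_dx = (current(x + h, t) - current(x - h, t)) / (2 * h)
    return dw_dt + dj_dx

x = np.linspace(-5.0, 5.0, 101)
residual = continuity_residual(x, t=0.7)
print(np.max(np.abs(residual)))
```

For this wavefunction, both terms of Eq. (52) are of the order of unity separately, while their sum stays at the level of the finite-difference error.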
1.5. Eigenstates and eigenvalues

Now let us discuss the most important corollaries of wave mechanics’ linearity. First of all, it
uses only linear operators. This term means that the operators must obey the following two rules:40
$$(\hat{A}_1 + \hat{A}_2)\Psi = \hat{A}_1\Psi + \hat{A}_2\Psi, \tag{1.53}$$
$$\hat{A}(c_1\Psi_1 + c_2\Psi_2) = \hat{A}(c_1\Psi_1) + \hat{A}(c_2\Psi_2) = c_1\hat{A}\Psi_1 + c_2\hat{A}\Psi_2, \tag{1.54}$$
where $\Psi_n$ are arbitrary wavefunctions, while $c_n$ are arbitrary constants (in quantum mechanics,
frequently called c-numbers, to distinguish them from operators and wavefunctions). The most
important examples of linear operators are given by:
(i) the multiplication by a function, such as for the operator r̂ given by Eq. (26), and
(ii) the spatial or temporal differentiation, such as in Eqs. (25)-(27).
39 See, e.g., respectively, CM Sec. 8.3 and EM Sec. 4.1.
40 By the way, if any equality involving operators is valid for an arbitrary wavefunction, the latter is frequently
dropped from notation, resulting in an operator equality. In particular, Eq. (53) may be readily used to prove that
the addition of linear operators is commutative: $\hat{A}_2 + \hat{A}_1 = \hat{A}_1 + \hat{A}_2$, and associative: $(\hat{A}_1 + \hat{A}_2) + \hat{A}_3 = \hat{A}_1 + (\hat{A}_2 + \hat{A}_3)$.
Next, it is of key importance that the Schrödinger equation (25) is also linear. (This fact was
already used in the discussion of wave packets in the last section.) This means that if each of several
functions $\Psi_n$ is a (particular) solution of Eq. (25) with a certain Hamiltonian, then their arbitrary linear
combination,
$$\Psi = \sum_n c_n\Psi_n, \tag{1.55}$$
is also a solution of the same equation.41
Let us use the linearity to accomplish an apparently impossible feat: immediately find the
general solution of the Schrödinger equation for the most important case when the system’s
Hamiltonian does not depend on time explicitly – for example, like in Eq. (41) with time-independent
potential energy U = U(r), when the Schrödinger equation has the form
$$i\hbar\frac{\partial\Psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\Psi + U(\mathbf{r})\Psi. \tag{1.56}$$ [Potential field: Schrödinger equation]
First of all, let us prove that the following product,
$$\Psi_n = a_n(t)\,\psi_n(\mathbf{r}), \tag{1.57}$$ [Variable separation]
qualifies as a (particular) solution of such an equation. Indeed, plugging Eq. (57) into Eq. (25) with any
time-independent Hamiltonian, using the fact that in this case
$$\hat{H}a_n(t)\psi_n(\mathbf{r}) = a_n(t)\hat{H}\psi_n(\mathbf{r}), \tag{1.58}$$
and dividing both parts of the equation by $a_n\psi_n$, we get
$$\frac{i\hbar}{a_n}\frac{da_n}{dt} = \frac{\hat{H}\psi_n}{\psi_n}. \tag{1.59}$$
The left-hand side of this equation may depend only on time, while the right-hand side, only on
coordinates. These facts may be only reconciled if we assume that each of these parts is equal to (the
same) constant of the dimension of energy, which I will denote as En.42 As a result, we are getting two
separate equations for the temporal and spatial parts of the wavefunction:
41 At first glance, it may seem strange that the linear Schrödinger equation correctly describes quantum
properties of systems whose classical dynamics is described by nonlinear equations of motion (e.g., an
anharmonic oscillator – see, e.g., CM Sec. 5.2). Note, however, that statistical equations of classical dynamics
(see, e.g., SM Chapters 5 and 6) also have this property, so it is not specific to quantum mechanics.
42 This argumentation, leading to variable separation, is very common in mathematical physics – see, e.g., its
discussion in CM Sec. 6.5, and in EM Sec. 2.5 and beyond.
$$\hat{H}\psi_n = E_n\psi_n, \tag{1.60}$$ [Stationary Schrödinger equation]
$$i\hbar\frac{da_n}{dt} = E_n a_n. \tag{1.61a}$$
The latter of these equations, rewritten in the form
$$\frac{da_n}{a_n} = -i\frac{E_n}{\hbar}dt, \tag{1.61b}$$
is readily integrable, giving
$$\ln a_n = -i\omega_n t + \text{const}, \quad \text{so that } a_n = \text{const}\times\exp\{-i\omega_n t\}, \quad \text{with } \omega_n \equiv \frac{E_n}{\hbar}. \tag{1.62}$$ [Stationary state: time evolution]
Now plugging Eqs. (57) and (62) into Eq. (22), we see that in the quantum state described by Eqs. (57)-
(62), the probability w of finding the particle at a certain location does not depend on time:
$$w = \psi_n^*(\mathbf{r})\psi_n(\mathbf{r}) \equiv w(\mathbf{r}). \tag{1.63}$$
With the same substitution, Eq. (23) shows that the expectation value of any operator that does not
depend on time explicitly is also time-independent:
$$\langle A \rangle = \int\psi_n^*(\mathbf{r})\hat{A}\psi_n(\mathbf{r})\,d^3r = \text{const}. \tag{1.64}$$
Due to this property, the states described by Eqs. (57)-(62) are called stationary; they are fully
defined by the possible solutions of the stationary (or “time-independent”) Schrödinger equation (60).43
Note that for the time-independent Hamiltonian (41), the stationary Schrödinger equation (60),
$$-\frac{\hbar^2}{2m}\nabla^2\psi_n + U(\mathbf{r})\psi_n = E_n\psi_n, \tag{1.65}$$ [Potential field: stationary Schrödinger equation]
is a linear, homogeneous differential equation for the function $\psi_n$, with an a priori unknown parameter $E_n$.
Such equations fall into the mathematical category of eigenproblems,44 whose eigenfunctions $\psi_n$ and
eigenvalues $E_n$ should be found simultaneously, i.e. self-consistently.45
Mathematics46 tells us that for such equations with space-confined eigenfunctions $\psi_n$, tending to
zero at $r \to \infty$, the spectrum of eigenvalues is discrete. It also proves that the eigenfunctions
corresponding to different eigenvalues are orthogonal, i.e. that space integrals of the products $\psi_n\psi_{n'}^*$
vanish for all pairs with $n \neq n'$. Due to the Schrödinger equation’s linearity, each of these functions may
be multiplied by a proper constant coefficient to make their set orthonormal:
$$\int\psi_n^*\psi_{n'}\,d^3r = \delta_{n,n'} \equiv \begin{cases} 1, & \text{for } n = n', \\ 0, & \text{for } n \neq n'. \end{cases} \tag{1.66}$$
43 In contrast, the full Schrödinger equation (25) is frequently called time-dependent or non-stationary.
44 From the German root eigen, meaning “particular” or “characteristic”.
45 Eigenvalues of energy are frequently called eigenenergies, and it is often said that the eigenfunction $\psi_n$ and the
corresponding eigenenergy $E_n$ together determine the nth stationary eigenstate of the system.
46 See, e.g., Sec. 9.3 of the wonderful handbook by G. Korn and T. Korn, listed in MA Sec. 16(ii).
Moreover, the eigenfunctions $\psi_n(\mathbf{r})$ form a full set, meaning that an arbitrary function $\psi(\mathbf{r})$, in particular
the actual wavefunction $\Psi$ of the system in the initial moment of its evolution (which I will always, with
a few clearly marked exceptions, take for t = 0) may be represented as a unique expansion over the
eigenfunction set:
$$\Psi(\mathbf{r},0) = \sum_n c_n\psi_n(\mathbf{r}). \tag{1.67}$$
The expansion coefficients $c_n$ may be readily found by multiplying both sides of Eq. (67) by $\psi_{n'}^*$,
integrating the results over the space, and using Eq. (66). The result is
$$c_n = \int\psi_n^*(\mathbf{r})\,\Psi(\mathbf{r},0)\,d^3r. \tag{1.68}$$
Now let us consider the following wavefunction
$$\Psi(\mathbf{r},t) = \sum_n c_n a_n(t)\psi_n(\mathbf{r}) = \sum_n c_n\exp\left\{-i\frac{E_n}{\hbar}t\right\}\psi_n(\mathbf{r}). \tag{1.69}$$ [General solution for U = U(r)]
Since each term of the sum has the form (57) and satisfies the Schrödinger equation, so does the sum as
a whole. Moreover, if the coefficients $c_n$ are derived in accordance with Eq. (68), then the solution
(69) satisfies the initial conditions as well. At this moment we can use one more bit of help from
mathematicians, who tell us that the linear, partial differential equation of type (56), with fixed initial
conditions, may have only one (unique) solution. This means that in our case of time-independent
potential Hamiltonian, Eq. (69) gives the general solution of the Schrödinger equation (25).
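The recipe of Eqs. (66)-(69) is readily turned into a numerical scheme. The following is only a minimal 1D sketch under several assumptions of mine: units with $\hbar$ = m = 1, the Laplace operator replaced by its standard three-point finite-difference approximation on a uniform grid (with the wavefunction vanishing just outside the grid), an arbitrarily chosen parabolic potential, and a Gaussian initial state. The eigenproblem (65) then becomes a matrix one, solved here with numpy.linalg.eigh.

```python
import numpy as np

def stationary_states(U, dx, hbar=1.0, m=1.0):
    """Diagonalize a finite-difference Hamiltonian; psi = 0 just outside the grid."""
    n = U.size
    kin = hbar**2 / (2.0 * m * dx**2)
    H = np.diag(2.0 * kin + U) - kin * (np.eye(n, k=1) + np.eye(n, k=-1))
    E, vecs = np.linalg.eigh(H)      # real symmetric H: E ascending, columns orthonormal
    return E, vecs / np.sqrt(dx)     # rescale so that sum(|psi_n|^2) * dx = 1, cf. Eq. (1.66)

def evolve(psi0, t, E, psi, dx, hbar=1.0):
    """Expand the initial state over the eigenfunctions and attach the phase factors."""
    c = (psi.conj().T @ psi0) * dx                  # discrete analog of Eq. (1.68)
    return psi @ (c * np.exp(-1j * E * t / hbar))   # discrete analog of Eq. (1.69)

x = np.linspace(0.0, 1.0, 202)[1:-1]   # interior grid points
dx = x[1] - x[0]
U = 200.0 * (x - 0.5) ** 2             # illustrative time-independent potential
E, psi = stationary_states(U, dx)

psi0 = np.exp(-((x - 0.4) ** 2) / (2 * 0.05**2)).astype(complex)
psi0 /= np.sqrt(np.sum(np.abs(psi0) ** 2) * dx)    # normalized initial wavefunction

psi_t = evolve(psi0, 0.3, E, psi, dx)
norm_t = np.sum(np.abs(psi_t) ** 2) * dx
print(E[:3], norm_t)
```

In this discrete model, the expansion at t = 0 returns the initial wavefunction, and the total probability stays equal to 1 at any t, because each term of the sum only acquires a phase factor.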
So, we have succeeded in our apparently over-ambitious goal. Now let us pause this mad
mathematical dash for a minute, and discuss this key result.
1.6. Time evolution

For the time-dependent factor, $a_n(t)$, of each component (57) of the general solution (69), our
procedure gave a very simple and universal result (62), describing a linear change of the phase
$\varphi_n \equiv \arg(a_n)$ of this complex function in time, with the constant rate
$$\frac{d\varphi_n}{dt} = -\omega_n = -\frac{E_n}{\hbar}, \tag{1.70}$$
so that the real and imaginary parts of $a_n$ oscillate sinusoidally with this frequency. The relation (70)
coincides with Einstein’s conjecture (5), but could these oscillations of the wavefunctions represent a
physical reality? Indeed, for photons, described by Eq. (5), E may be (and as we will see in Chapter 9,
is) the actual, well-defined energy of one photon, and $\omega$ is the frequency of the radiation so quantized.
However, for non-relativistic particles, described by wave mechanics, the potential energy U, and hence
the full energy E, are defined up to an arbitrary constant, because we may measure them from an arbitrary
reference level. How can such a change of the energy reference level (which may be made just in our
mind) alter the frequency of oscillations of a variable?
According to Eqs. (22)-(23), this time evolution of a wavefunction does not affect the particle’s
probability distribution, or even any observable (including the energy E, provided that it is always
referred to the same origin as U), in any stationary state. However, let us combine Eq. (5) with Bohr’s
assumption (7):
$$\hbar\omega_{nn'} = E_{n'} - E_n. \tag{1.71}$$
The difference $\omega_{nn'}$ of the eigenfrequencies $\omega_n$ and $\omega_{n'}$, participating in this formula, is evidently
independent of the energy reference, and as will be proved later in the course, determines the
measurable frequency of the electromagnetic radiation (or possibly of a wave of a different physical
nature) emitted or absorbed at the quantum transition between the states.
As another but related example, consider two similar particles 1 and 2, each in the same (say, the
lowest-energy) eigenstate, but with their potential energies (and hence the ground state energies $E_{1,2}$)
different by a constant $\Delta U \equiv U_1 - U_2$. Then, according to Eq. (70), the difference $\varphi \equiv \varphi_1 - \varphi_2$ of their
wavefunction phases evolves in time with the reference-independent rate
$$\frac{d\varphi}{dt} = -\frac{\Delta U}{\hbar}. \tag{1.72}$$
Certain measurement instruments, weakly coupled to the particles, may allow observation of this
evolution, while keeping the particle’s quantum dynamics virtually unperturbed, i.e. Eq. (70) intact.
Perhaps the most dramatic measurement of this type is possible using the Josephson effect in weak links
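between two superconductors – see Fig. 7.47
Fig. 1.7. The Josephson effect in a weak link between two bulk superconductor electrodes, with the
wavefunctions $\Psi \propto \exp\{i\varphi_1\}$ and $\Psi \propto \exp\{i\varphi_2\}$, biased by voltage V and carrying the current $I \propto \sin(\varphi_1 - \varphi_2)$.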
As a brief reminder,48 superconductivity may be explained by a specific coupling between
conduction electrons in solids, that leads, at low temperatures, to the formation of the so-called Cooper
pairs. Such pairs, each consisting of two electrons with opposite spins and momenta, behave as Bose
particles and form a coherent Bose-Einstein condensate.49 Most properties of such a condensate may be
described by a single, common wavefunction $\Psi$, evolving in time just as that of a free particle, with the
effective potential energy $U = q\phi = -2e\phi$, where $\phi$ is the electrochemical potential,50 and q = –2e is the
electric charge of a Cooper pair. As a result, for the system shown in Fig. 7, in which externally applied
voltage V fixes the difference $\phi_1 - \phi_2$ between the electrochemical potentials of two superconductors,
Eq. (72) takes the form
$$\frac{d\varphi}{dt} = \frac{2e}{\hbar}V. \tag{1.73}$$ [Eq. (72) for Josephson effect]
47 The effect was predicted in 1962 by Brian Josephson (then a graduate student!) and observed soon after that.
48 For a more detailed discussion, including the derivation of Eq. (75), see e.g. EM Chapter 6.
49 A detailed discussion of the Bose-Einstein condensation may be found, e.g., in SM Sec. 3.4.
50 For more on this notion see, e.g. SM Sec. 6.3.
If the link between the superconductors is weak enough, the electric current I of the Cooper pairs (called
the supercurrent) through the link may be approximately described by the following simple relation,
$$I = I_c\sin\varphi, \tag{1.74}$$ [Josephson supercurrent]
where Ic is some constant, dependent on the weak link’s strength.51 Now combining Eqs. (73) and (74),
we see that if the applied voltage V is constant in time, the current oscillates sinusoidally, with the so-
called Josephson frequency
$$\omega_J \equiv \frac{2e}{\hbar}V, \tag{1.75}$$
corresponding to the cyclic frequency $\omega_J/2\pi$ as high as ~484 MHz per microvolt of applied dc voltage. This
effect may be readily observed experimentally: though its direct detection is a bit tricky, it is easy to
observe the phase locking (synchronization)52 of the Josephson oscillations by an external microwave signal
of frequency $\omega$. Such phase locking results in the relation $\omega_J = n\omega$ fulfilled within certain dc current
intervals, and hence in the formation, on the weak link’s dc I-V curve, of virtually vertical current steps at
dc voltages
$$V_n = n\frac{\hbar\omega}{2e}, \tag{1.76}$$
where n is an integer.53 Since frequencies may be stabilized and measured with very high precision, this
effect is being used in highly accurate standards of dc voltage.
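The numbers following from Eqs. (75) and (76) may be checked with a few lines of code. The sketch below uses the CODATA values of e and $\hbar$; the function names and the 10 GHz microwave frequency in the usage lines are merely illustrative assumptions.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
HBAR = 1.054571817e-34       # reduced Planck constant, J*s

def josephson_frequency_hz(voltage):
    """Cyclic Josephson frequency omega_J / 2 pi for a dc voltage (in volts), per Eq. (1.75)."""
    omega_j = 2.0 * E_CHARGE * voltage / HBAR
    return omega_j / (2.0 * math.pi)

def step_voltage(n, microwave_frequency_hz):
    """dc voltage of the n-th current step, per Eq. (1.76)."""
    omega = 2.0 * math.pi * microwave_frequency_hz
    return n * HBAR * omega / (2.0 * E_CHARGE)

f_per_microvolt = josephson_frequency_hz(1e-6)
v_first_step = step_voltage(1, 10e9)
print(f"{f_per_microvolt / 1e6:.1f} MHz per microvolt")
print(f"{v_first_step * 1e6:.2f} microvolts for the first step at 10 GHz")
```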
1.7. Spatial dependence

In contrast to the simple and universal time dependence (62) of the stationary states, the spatial
distributions of their wavefunctions $\psi_n(\mathbf{r})$ need to be calculated from the problem-specific stationary
Schrödinger equation (65). The solution of this equation for various particular cases is a major focus of
the next two chapters. For now, let us consider just the simplest example, which nevertheless will be the
basis for our discussion of more complex problems: let a particle be confined inside a rectangular hard-wall
box. Such confinement may be described by the following potential energy profile:54
$$U(\mathbf{r}) = \begin{cases} 0, & \text{for } 0 < x < a_x,\ 0 < y < a_y,\ \text{and } 0 < z < a_z, \\ +\infty, & \text{otherwise}. \end{cases} \tag{1.77}$$ [Hard-wall box: potential]
The only way to keep the product $U(\mathbf{r})\psi_n$ in Eq. (65) finite outside the box is to have $\psi_n = 0$ in
these regions. Also, the function has to be continuous everywhere, to avoid the divergence of the
kinetic-energy term $(-\hbar^2/2m)\nabla^2\psi_n$. Hence, in this case we may solve the stationary Schrödinger equation
(60) just inside the box, i.e. with U = 0, so that it takes a simple form
$$-\frac{\hbar^2}{2m}\nabla^2\psi_n = E_n\psi_n, \tag{1.78a}$$
with zero boundary conditions on all the walls.55 For our particular geometry, it is natural to express the
Laplace operator in the Cartesian coordinates {x, y, z} aligned with the box sides, with the origin at one
of the corners of its rectangular $a_x \times a_y \times a_z$ volume, so that our boundary problem becomes:
$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\psi_n = E_n\psi_n, \quad \text{for } 0 < x < a_x,\ 0 < y < a_y,\ \text{and } 0 < z < a_z,$$
$$\text{with } \psi_n = 0 \text{ for: } x = 0 \text{ and } a_x;\ y = 0 \text{ and } a_y;\ z = 0 \text{ and } a_z. \tag{1.78b}$$
51 In some cases, the function $I(\varphi)$ may somewhat deviate from Eq. (74), but these deviations do not affect its
fundamental $2\pi$-periodicity, and hence the fundamental relations (75)-(76). (No corrections to them have been
found yet.)
52 For the discussion of this very general effect, see, e.g., CM Sec. 5.4.
53 The size of these dc current steps may be readily calculated from Eqs. (73) and (74). Let me leave this task for
the reader’s exercise.
54 Another common name for such potentials, especially of lower dimensionality, is the potential well, in our
current case “rectangular” one: with a flat “bottom” and vertical, infinitely high “walls”. Note that sometimes,
very unfortunately, such potential profiles are called “quantum wells”. (This term seems to imply that the
particle’s confinement in such a well is a phenomenon specific for quantum mechanics. However, as we will
repeatedly see in this course, the opposite is true: quantum effects do as much as they only can to overcome the
particle’s confinement in a potential well, letting it partly penetrate in the “classically forbidden” regions beyond
the well’s walls.)
This problem may be readily solved using the same variable separation method as was used in
Sec. 5 – now to separate the Cartesian spatial variables from each other, by looking for a particular
solution of Eq. (78) in the form
$$\psi(\mathbf{r}) = X(x)\,Y(y)\,Z(z). \tag{1.79}$$
(Let us postpone assigning function indices for a minute.) Plugging this expression into Eq. (78b) and
dividing all terms by the product XYZ, we get
$$-\frac{\hbar^2}{2m}\frac{1}{X}\frac{d^2X}{dx^2} - \frac{\hbar^2}{2m}\frac{1}{Y}\frac{d^2Y}{dy^2} - \frac{\hbar^2}{2m}\frac{1}{Z}\frac{d^2Z}{dz^2} = E. \tag{1.80}$$
Now let us repeat the standard argumentation of the variable separation method: since each term on the
left-hand side of this equation may be only a function of the corresponding argument, the equality is
possible only if each of them is a constant – in our case, with the dimensionality of energy. Calling these
constants $E_x$, etc., we get three similar 1D equations
$$-\frac{\hbar^2}{2m}\frac{1}{X}\frac{d^2X}{dx^2} = E_x, \quad -\frac{\hbar^2}{2m}\frac{1}{Y}\frac{d^2Y}{dy^2} = E_y, \quad -\frac{\hbar^2}{2m}\frac{1}{Z}\frac{d^2Z}{dz^2} = E_z, \tag{1.81}$$
with Eq. (80) turning into the following energy-matching condition:
$$E_x + E_y + E_z = E. \tag{1.82}$$
All three ordinary differential equations (81), and their solutions, are similar. For example, for
X(x), we have the following 1D Helmholtz equation
$$\frac{d^2X}{dx^2} + k_x^2X = 0, \quad \text{with } k_x^2 \equiv \frac{2mE_x}{\hbar^2}, \tag{1.83}$$
and simple boundary conditions: $X(0) = X(a_x) = 0$. Let me hope that the reader knows how to solve this
well-known 1D boundary problem – describing, for example, the usual mechanical waves on a guitar
string. The problem allows an infinite number of sinusoidal standing-wave eigenfunctions,56
$$X \propto \sin k_x x, \quad \text{with } k_x = \frac{\pi n_x}{a_x}, \quad \text{so that } X = \left(\frac{2}{a_x}\right)^{1/2}\sin\frac{\pi n_x x}{a_x}, \quad \text{with } n_x = 1, 2, \ldots, \tag{1.84}$$ [Rectangular potential well: 1D eigenfunctions]
corresponding to the following eigenenergies:
$$E_x = \frac{\hbar^2}{2m}k_x^2 = \frac{\pi^2\hbar^2}{2ma_x^2}n_x^2 \equiv E_{x1}n_x^2. \tag{1.85}$$
55 Rewritten as $\nabla^2 f + k^2 f = 0$, Eq. (78a) is just the Helmholtz equation, which describes waves of any nature (with
the wave vector k) in a uniform, isotropic, linear medium – see, e.g., EM Secs. 7.5-7.9 and 8.5.
56 The front coefficient in the last expression for X ensures the (ortho)normality condition (66).
Figure 8 shows these simple results, using a somewhat odd but very graphic and common
representation, in that the eigenenergy values (frequently called the energy levels) are used as horizontal
axes for plotting the eigenfunctions – despite their completely different dimensionality.
Fig. 1.8. The lowest eigenfunctions X(x) (solid lines) and eigenvalues (dashed lines, at $E_x/E_{x1}$ = 1, 4, and 9 for
$n_x$ = 1, 2, and 3) of Eq. (83) for a potential well of length $a_x$. Solid black lines show the effective potential
energy profile for this 1D eigenproblem.
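The properties of the eigenfunctions (84) and eigenenergies (85) are easy to verify numerically. In the sketch below (with the illustrative choices $\hbar$ = m = $a_x$ = 1 and a uniform grid, both my assumptions), the overlap integrals of Eq. (66) are approximated by the trapezoidal rule, which reduces to a plain sum here because all eigenfunctions vanish at the walls.

```python
import numpy as np

def box_eigenfunction(n, x, a):
    """Normalized 1D eigenfunction of Eq. (1.84)."""
    return np.sqrt(2.0 / a) * np.sin(np.pi * n * x / a)

def box_energy(n, a, hbar=1.0, m=1.0):
    """1D eigenenergy of Eq. (1.85)."""
    return (np.pi * hbar * n) ** 2 / (2.0 * m * a**2)

a = 1.0
x = np.linspace(0.0, a, 2001)
dx = x[1] - x[0]

X = np.array([box_eigenfunction(n, x, a) for n in (1, 2, 3)])
overlaps = (X @ X.T) * dx    # matrix of integrals of X_n X_n' over the well
ratios = [box_energy(n, a) / box_energy(1, a) for n in (1, 2, 3)]

print(np.round(overlaps, 6))   # close to the 3x3 identity matrix
print(ratios)                  # the sequence n_x^2
```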
Due to the similarity of all Eqs. (81), Y(y) and Z(z) are absolutely similar functions of their
arguments, and may also be numbered by integers (say, ny and nz) independent of nx, so that the
spectrum of values of the total energy (82) is
[Rectangular potential well: energy levels]
$$E_{n_x,n_y,n_z} = \frac{\pi^2\hbar^2}{2m}\left(\frac{n_x^2}{a_x^2} + \frac{n_y^2}{a_y^2} + \frac{n_z^2}{a_z^2}\right). \tag{1.86}$$
Thus, in this 3D problem, the role of the index n in the general Eq. (69) is played by a set of three
independent integers {nx, ny, nz}. In quantum mechanics, such integers play a key role and thus have a
special name, the quantum numbers. Using them, that general solution, for our current simple problem
may be represented as the sum
[Rectangular potential well: general solution]
$$\Psi(\mathbf{r},t) = \sum_{n_x,n_y,n_z=1}^{\infty} c_{n_x,n_y,n_z}\sin\frac{\pi n_x x}{a_x}\sin\frac{\pi n_y y}{a_y}\sin\frac{\pi n_z z}{a_z}\exp\left\{-i\frac{E_{n_x,n_y,n_z}}{\hbar}t\right\}, \tag{1.87}$$
with the front coefficients that may be readily calculated from the initial wavefunction Ψ(r, 0), using
Eq. (68) – again with the replacement n → {nx, ny, nz}.
This simplest problem is a good illustration of typical results the wave mechanics gives for
spatially-confined motion, including the discrete energy spectrum, and (in this case, evidently)
orthogonal eigenfunctions. Perhaps most importantly, its solution shows that the lowest value of the
particle’s kinetic energy (86), reached in the so-called ground state (in our case, the state with nx = ny =
nz = 1) is above zero for any finite size of the confining box.
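As an illustration of the spectrum (86), the sketch below lists the lowest levels and their quantum-number sets. It assumes a cubic box (ax = ay = az = 1) and measures energies in units of π²ħ²/2ma²; both choices, as well as the function name, are made here for illustration only.

```python
def box_levels(ax=1.0, ay=1.0, az=1.0, n_max=4):
    """Energies (1.86) in units of pi^2 hbar^2 / 2m, grouped by value.

    Returns a sorted list of (energy, [(nx, ny, nz), ...]) pairs.
    Only quantum numbers up to n_max are scanned, so for the unit cube
    the list is complete only for energies below (n_max + 1)^2 + 2.
    """
    levels = {}
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            for nz in range(1, n_max + 1):
                energy = round(nx**2 / ax**2 + ny**2 / ay**2 + nz**2 / az**2, 9)
                levels.setdefault(energy, []).append((nx, ny, nz))
    return sorted(levels.items())

# The five lowest levels of the cubic box and the states sharing each energy.
for energy, states in box_levels()[:5]:
    print(energy, states)
```

In these units, the ground state {1, 1, 1} has the energy 3, i.e. is indeed above zero, while the next level is shared by three different sets of quantum numbers.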
An example of the opposite case of a continuous spectrum for the unconfined motion of a free
particle is given by the plane waves (29). With the account of relations E = ħω and p = ħk, such a
wavefunction may be viewed as the product of the time-dependent factor (62) by the eigenfunction,
[Free particle: eigenfunctions]
$$\psi_{\mathbf{k}} = a_{\mathbf{k}}\exp\{i\mathbf{k}\cdot\mathbf{r}\}, \tag{1.88}$$
which is the solution of the stationary Schrödinger equation (78a) if it is valid in the whole space.57 The
reader should not be worried too much by the fact that the fundamental solution (88) in free space is a
traveling wave (having, in particular, a non-zero value of the probability current j), while those inside a
quantum box are standing waves, with j = 0, even though the free space may be legitimately considered
as the ultimate limit of a quantum box with volume V = axayaz → ∞. Indeed, due to the linearity of
wave mechanics, two traveling-wave solutions (88) with equal and opposite values of the momentum
(and hence with the same energy) may be readily combined to give a standing-wave solution,58 for
example, exp{ik·r} + exp{-ik·r} = 2cos(k·r), with the net current j = 0. Thus, depending on the
convenience for a particular problem, we may represent its general solution as a sum of either traveling-wave
or standing-wave eigenfunctions. Since in the unlimited free space, there are no boundary
conditions to satisfy, the Cartesian components of the wave vector k in Eq. (88) can take any real
values. (This is why it is more convenient to label these wavefunctions, and the corresponding
eigenenergies,
[Free particle: eigenenergies]
$$E_k = \frac{\hbar^2k^2}{2m} \ge 0, \tag{1.89}$$
with their wave vector k rather than an integer index.)
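The statement about the probability current of traveling and standing waves may be illustrated by a minimal 1D sketch. It assumes the units ħ = m = 1 and uses the standard expression j = (ħ/m) Im(ψ* ∂ψ/∂x), with the derivative estimated by a central difference; the wave number and all names below are illustrative choices.

```python
import cmath

def current(psi, x, h=1e-5, hbar=1.0, m=1.0):
    """Probability current j = (hbar/m) Im(psi* dpsi/dx), central-difference derivative."""
    dpsi = (psi(x + h) - psi(x - h)) / (2.0 * h)
    return (hbar / m) * (psi(x).conjugate() * dpsi).imag

K = 2.0  # wave number, an arbitrary illustrative value

def traveling(x):
    """Traveling wave exp{ikx}."""
    return cmath.exp(1j * K * x)

def standing(x):
    """Standing wave exp{ikx} + exp{-ikx} = 2cos(kx)."""
    return cmath.exp(1j * K * x) + cmath.exp(-1j * K * x)

print("traveling wave: j =", current(traveling, 0.3))
print("standing wave:  j =", current(standing, 0.3))
```

In these units, the traveling wave of unit amplitude should give j close to K, and the standing wave should give j close to zero at any point x.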
However, one aspect of continuous-spectrum systems requires a bit more caution with
mathematics: the summation (69) should be replaced by the integration over a continuous index or
indices – in our current case, the three Cartesian components of the vector k. The main rule of such
replacement may be readily extracted from Eq. (84): according to this relation, for standing-wave
solutions, the eigenvalues of kx are equidistant, i.e. separated by equal intervals Δkx = π/ax, with similar
relations for the other two Cartesian components of the vector k. Hence the number of different eigenvalues of
the standing-wave vector k (with kx, ky, kz ≥ 0), within a volume d³k >> 1/V of the k space is dN =
d³k/(ΔkxΔkyΔkz) = (V/π³)d³k. Frequently, it is more convenient to work with traveling waves (88); in this
case we should take into account that, as was just discussed, there are two different traveling wave
numbers (say, +kx and –kx) corresponding to each standing wave vector’s kx > 0. Hence the same number
of physically different states corresponds to a 2³ = 8-fold larger k space or, equivalently, to an 8-fold
smaller number of states per unit volume d³k:
[Number of 3D states]
$$dN = \frac{V}{(2\pi)^3}\,d^3k. \tag{1.90}$$
57 In some systems (e.g., a particle interacting with a potential well of a finite depth), a discrete energy spectrum
within a certain energy interval may coexist with a continuous spectrum in a complementary interval. However,
the conceptual philosophy of eigenfunctions and eigenvalues remains the same even in this case.
58 This is, of course, the general property of waves of any physical nature, propagating in a linear medium – see,
e.g., CM Sec. 6.5 and/or EM Sec. 7.3.
For dN >> 1, this expression is independent of the boundary conditions, and is frequently
represented as the following summation rule
[Summation over 3D states]
$$\lim_{k^3V\to\infty}\sum_{\mathbf{k}} f(\mathbf{k}) \to \int f(\mathbf{k})\,dN = \frac{V}{(2\pi)^3}\int f(\mathbf{k})\,d^3k, \tag{1.91}$$
where f(k) is an arbitrary function of k. Note that if the same wave vector k corresponds to several
internal quantum states (such as spin – see Chapter 4), the right-hand side of Eq. (91) requires its
multiplication by the corresponding degeneracy factor of orbital states.59
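The counting rule (90) may be checked by brute force. The sketch below assumes a cubic hard-wall box of side a, counts the standing-wave states (nx, ny, nz ≥ 1) with |k| ≤ πN/a, and compares the count with the integral of Eq. (90) over the full sphere of that radius, which equals πN³/6. The function names and the values of N are illustrative.

```python
import math

def count_standing_wave_states(n_radius):
    """Number of sets (nx, ny, nz >= 1) with nx^2 + ny^2 + nz^2 <= n_radius^2."""
    count = 0
    r2 = n_radius * n_radius
    for nx in range(1, n_radius + 1):
        for ny in range(1, n_radius + 1):
            rest = r2 - nx * nx - ny * ny
            if rest < 1:
                break  # no allowed nz for this or any larger ny
            count += math.isqrt(rest)  # allowed nz values are 1 .. floor(sqrt(rest))
    return count

def continuum_estimate(n_radius):
    """Integral of Eq. (1.90) over the sphere |k| <= pi * n_radius / a: pi * N^3 / 6."""
    return math.pi * n_radius**3 / 6.0

for n in (10, 50, 200):
    exact = count_standing_wave_states(n)
    print(n, exact, round(exact / continuum_estimate(n), 4))
```

The ratio of the exact count to the continuum estimate should approach 1 from below as N grows; the remaining difference is a surface correction that becomes negligible in the limit implied by Eq. (91).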
1.8. Dimensionality reduction

To conclude this introductory chapter, let me discuss the conditions when the spatial
dimensionality of a wave-mechanical problem may be reduced.60 Naively, one may think that if the
particle’s potential energy depends on just one spatial coordinate, say U = U(x, t), then its wavefunction
has to be one-dimensional as well: Ψ = Ψ(x, t). Our discussion of the particular case U = const in the
previous section shows that this assumption is wrong. Indeed, though this potential61 is just a special
case of the potential U(x, t), most of its eigenfunctions, given by Eqs. (87) or (88), do depend on the
other two coordinates. This is why the solutions Ψ(x, t) of the 1D Schrödinger equation
[1D time-dependent Schrödinger equation]
$$i\hbar\frac{\partial\Psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\Psi}{\partial x^2} + U(x,t)\Psi, \tag{1.92}$$
which follows from Eq. (65) by assuming ∂Ψ/∂y = ∂Ψ/∂z = 0, are insufficient to form the general
solution of Eq. (65) for this case.
This fact is easy to understand physically for the simplest case of a stationary 1D potential: U =
U(x). The absence of the y- and z-dependence of the potential energy U may be interpreted as a potential
well that is flat in two directions, y and z. Repeating the arguments of the previous section for this case,
we see that the eigenfunctions of a particle in such a well have the form
$$\psi(\mathbf{r}) = X(x)\exp\{i(k_y y + k_z z)\}, \tag{1.93}$$
where X(x) is an eigenfunction of the following stationary 1D Schrödinger equation:
$$-\frac{\hbar^2}{2m}\frac{d^2X}{dx^2} + U_{ef}(x)X = EX, \tag{1.94}$$
where Uef(x) is not the full potential energy of the particle, as it would follow from Eq. (92), but rather
its effective value including the kinetic energy of the lateral motion:
$$U_{ef} \equiv U + (E_y + E_z) = U + \frac{\hbar^2}{2m}\left(k_y^2 + k_z^2\right). \tag{1.95}$$
59 Such factor is similar to the front factor 2 in Eq. (1) for the number of electromagnetic wave modes, in that case
describing two different polarizations of the waves with the same wave vector.
60 Most textbooks on quantum mechanics jump to the formal solution of 1D problems without such discussion,
ignoring the fact that such dimensionality restriction is adequate only under very specific conditions.
61 Following tradition, I will frequently use this shorthand for “potential energy”, returning to the full term in
cases where there is any chance of confusion of this notion with another (say, electrostatic) potential.
In plain English, the particle’s partial wavefunction X(x) and its full energy depend on its transverse
momenta, which have a continuous spectrum – see the discussion of Eq. (89). This means that Eq. (92) is
adequate only if the condition ky = kz = 0 is somehow enforced, and in most physical problems, it is not.
For example, if a de Broglie (or any other) plane wave Ψ(x, t) is incident on a potential step, it would be
reflected exactly back, i.e. with ky = kz = 0, only if the wall’s surface is perfectly plane and exactly
normal to the axis x. Any imperfection (and there are so many of them in real physical systems :-) may
cause excitation of waves with non-zero values of ky and kz, due to the continuous character of the
functions Ey(ky) and Ez(kz).62
There is essentially one, perhaps counter-intuitive way to make the 1D solutions “robust” to
small perturbations: it is to provide a rigid lateral confinement63 in two other directions. As the simplest
example, consider a narrow quantum wire (Fig. 9a), described by the following potential:
$$U(\mathbf{r}) = \begin{cases} U(x), & \text{for } 0 < y < a_y \text{ and } 0 < z < a_z, \\ +\infty, & \text{otherwise}. \end{cases} \tag{1.96}$$
[Figure: two panels, (a) and (b), each showing a confined region with the coordinate axes x, y, z.]
Fig. 1.9. Partial confinement in: (a) two dimensions, and (b) one dimension.
Performing the standard variable separation (79), we see that the corresponding stationary
Schrödinger equation is satisfied if the partial wavefunction X(x) obeys Eqs. (94)-(95), but now with a
discrete energy spectrum in the transverse directions:
$$U_{ef} = U + \frac{\pi^2\hbar^2}{2m}\left(\frac{n_y^2}{a_y^2} + \frac{n_z^2}{a_z^2}\right). \tag{1.97}$$
If the lateral confinement is tight, ay, az → 0, then there is a large energy gap,
$$\Delta U \sim \frac{\pi^2\hbar^2}{2ma_{y,z}^2}, \tag{1.98}$$
between the ground-state energy of the lateral motion (with ny = nz = 1) and that for all its excited states.
As a result, if the particle is initially placed into the lateral ground state, and its energy E is much
smaller than ΔU, it would stay in such state, i.e. may be described by a 1D Schrödinger equation similar
to Eq. (92) – even in the time-dependent case, if the characteristic frequency of energy variations is
much smaller than ΔU/ħ. Absolutely similarly, the strong lateral confinement in just one dimension (say,
z, see Fig. 9b) enables systems with a robust 2D evolution of the particle’s wavefunction.
62 This problem is not specific to quantum mechanics. The classical motion of a particle in a 1D potential may be
also unstable with respect to lateral perturbations, especially if the potential is time-dependent, i.e. capable of
exciting low-energy lateral modes.
63 The term “quantum confinement”, sometimes used to describe this phenomenon, is as unfortunate as the
“quantum well” term discussed above, because of the same reason: the confinement is a purely classical effect,
and as we will repeatedly see in this course, the quantum-mechanical effects reduce, rather than enable it.
The tight lateral confinement may ensure the dimensionality reduction even if the potential well
is not exactly rectangular in the lateral direction(s), as described by Eq. (96), but is described by some x-
and t-independent profile, if it still provides a sufficiently large energy gap ΔU. For example, many 2D
quantum phenomena, such as the quantum Hall effect,64 have been studied experimentally using
electrons confined at semiconductor heterojunctions (e.g., epitaxial interfaces GaAs/Al(x)Ga(1-x)As), where
the potential well in the direction perpendicular to the interface has a nearly triangular shape, and
provides an energy gap ΔU of the order of 10^(-2) eV.65 This gap corresponds to kBT with T ~ 100 K, so that
careful experimentation at liquid helium temperatures (4 K and below) may keep the electrons
performing purely 2D motion in the “lowest subband” (nz = 1).
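The order-of-magnitude correspondence between the energy gap and the temperature may be checked with a few lines of arithmetic. The sketch below converts a gap to the equivalent temperature ΔU/kB and, as a purely illustrative assumption, also evaluates the gap between the two lowest lateral levels of Eq. (97) for a hard-wall well of width 10 nm and the free-electron mass. This is not the triangular well or the effective mass of the real heterojunction, so only the order of magnitude is meaningful.

```python
import math

# Physical constants in SI units (CODATA values).
HBAR = 1.054571817e-34      # J*s
M_E = 9.1093837015e-31      # kg, free-electron mass (an assumption for this sketch)
EV = 1.602176634e-19        # J per eV
K_B_EV = 8.617333262e-5     # Boltzmann constant, eV/K

def gap_temperature(delta_u_ev):
    """Temperature T at which k_B * T equals the gap (given in eV)."""
    return delta_u_ev / K_B_EV

def hard_wall_gap_ev(width, mass=M_E):
    """Gap between the n = 1 and n = 2 levels of Eq. (1.97) in one lateral direction:
    (2^2 - 1^2) * pi^2 hbar^2 / (2 m a^2), returned in eV."""
    e1 = math.pi**2 * HBAR**2 / (2.0 * mass * width**2)
    return 3.0 * e1 / EV

print("T for a 0.01 eV gap:", round(gap_temperature(1e-2), 1), "K")
gap = hard_wall_gap_ev(10e-9)  # hypothetical 10-nm-wide hard-wall well
print("hard-wall gap:", round(gap, 4), "eV, i.e. T =", round(gap_temperature(gap), 1), "K")
```

Both numbers land close to 100 K, far above the liquid-helium temperature range quoted above.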
Finally, note that in systems with reduced dimensionality, Eq. (90) for the number of states at
large k (i.e., for an essentially free particle motion) should be replaced accordingly: in a 2D system of
area A >> 1/k²,
[Number of 2D states]
$$dN = \frac{A}{(2\pi)^2}\,d^2k, \tag{1.99}$$
while in a 1D system of length l >> 1/k,
[Number of 1D states]
$$dN = \frac{l}{2\pi}\,dk, \tag{1.100}$$
with the corresponding changes of the summation rule (91). This change has important implications for
the density of states on the energy scale, dN/dE: it is straightforward (and hence left for the reader :-) to
use Eqs. (90), (99), and (100) to show that for free 3D particles the density increases with E
(proportionally to E^(1/2)), for free 2D particles it does not depend on energy at all, while for free 1D
particles it scales as E^(-1/2), i.e. decreases with energy.
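The quoted energy scaling may be verified numerically without doing the algebra. The sketch below assumes the units ħ = m = 1 and a system of unit size, integrates Eqs. (90), (99), and (100) over all wave vectors with |k| ≤ k(E) to get the total number N(E) of traveling-wave states, and then estimates the exponent of dN/dE from a log-log slope. All names and the two sample energies are illustrative.

```python
import math

def states_below(energy, d, size=1.0, hbar=1.0, m=1.0):
    """Number of free-particle traveling-wave states with energy below E in d dimensions."""
    k = math.sqrt(2.0 * m * energy) / hbar
    k_space_volume = {1: 2.0 * k, 2: math.pi * k**2, 3: 4.0 * math.pi * k**3 / 3.0}[d]
    return size**d * k_space_volume / (2.0 * math.pi) ** d

def density_of_states(energy, d, rel_step=1e-4):
    """Central-difference estimate of dN/dE."""
    de = rel_step * energy
    return (states_below(energy + de, d) - states_below(energy - de, d)) / (2.0 * de)

def dos_exponent(d, e1=1.0, e2=4.0):
    """Slope of log(dN/dE) versus log(E) between two sample energies."""
    ratio = density_of_states(e2, d) / density_of_states(e1, d)
    return math.log(ratio) / math.log(e2 / e1)

for d in (1, 2, 3):
    print("d =", d, " exponent of dN/dE:", round(dos_exponent(d), 4))
```

The three exponents should come out close to -1/2, 0, and +1/2 for d = 1, 2, and 3, respectively.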
1.9. Exercise problems
1.1. The actual postulate made by N. Bohr in his original 1913 paper was not directly Eq. (8), but
the assumption that at quantum leaps between adjacent large (quasiclassical) orbits with n >> 1, the
hydrogen atom either emits or absorbs energy ΔE = ħω, where ω is its classical radiation frequency –
according to classical electrodynamics, equal to the angular velocity of electron’s rotation.66 Prove that
this postulate is indeed compatible with Eqs. (7)-(8).
1.2. Use Eq. (53) to prove that the linear operators of quantum mechanics are commutative:
$\hat{A}_2 + \hat{A}_1 = \hat{A}_1 + \hat{A}_2$, and associative: $(\hat{A}_1 + \hat{A}_2) + \hat{A}_3 = \hat{A}_1 + (\hat{A}_2 + \hat{A}_3)$.
1.3. Prove that for any time-independent Hamiltonian operator Ĥ and two arbitrary complex
functions f(r) and g(r),
$$\int f(\mathbf{r})\,\hat{H}g(\mathbf{r})\,d^3r = \int \hat{H}f(\mathbf{r})\,g(\mathbf{r})\,d^3r.$$
64 To be discussed in Sec. 3.2.

65 See, e.g., P. Harrison, Quantum Wells, Wires, and Dots, 3rd ed., Wiley, 2010.
66 See, e.g., EM Sec. 8.2.
1.4. Prove that the Schrödinger equation (25) with the Hamiltonian operator given by Eq. (41), is
Galilean form-invariant, provided that the wavefunction is transformed as
$$\Psi'(\mathbf{r}',t') = \Psi(\mathbf{r},t)\exp\left\{-i\frac{m\mathbf{v}\cdot\mathbf{r}}{\hbar} + i\frac{mv^2t}{2\hbar}\right\},$$
where the prime sign marks the variables measured in the reference frame 0’ that moves, without
rotation, with a constant velocity v relative to the “lab” frame 0. Give a physical interpretation of this
transformation.
1.5.* Prove the so-called Hellmann-Feynman theorem:67
$$\frac{\partial E_n}{\partial\lambda} = \left\langle\frac{\partial\hat{H}}{\partial\lambda}\right\rangle_n,$$
where λ is some c-number parameter, on which the time-independent Hamiltonian Ĥ, and hence its
eigenenergies En, depend.
1.6.* Use Eqs. (73) and (74) to analyze the effect of phase locking of Josephson oscillations on
the dc current flowing through a weak link between two superconductors (frequently called the
Josephson junction), assuming that an external source applies to the junction a sinusoidal ac voltage
with frequency ω and amplitude A.
1.7. Calculate ⟨x⟩, ⟨px⟩, δx, and δpx for the eigenstate {nx, ny, nz} of a particle in a rectangular
hard-wall box described by Eq. (77), and compare the product δxδpx with Heisenberg’s uncertainty
relation.
1.8. Looking at the lower (red) line in Fig. 8, it seems plausible that the 1D ground-state function
(84) of the simple potential well (77) may be well approximated with an inverted quadratic parabola:
$$X_{trial}(x) = Cx(a_x - x),$$
where C is a normalization constant. Explore how good this approximation is.
1.9. A particle placed in a hard-wall rectangular box with sides ax, ay, and az, is in its ground
state. Calculate the average force acting on each face of the box. Can the forces be characterized by a
certain pressure?
1.10. A 1D quantum particle was initially in the ground state of a very deep, rectangular
potential well of width a:
$$U(x) = \begin{cases} 0, & \text{for } -a/2 < x < +a/2, \\ +\infty, & \text{otherwise}. \end{cases}$$
67 Despite this common name, H. Hellmann (in 1937) and R. Feynman (in 1939) were not the first ones in the
long list of physicists who had (apparently, independently) discovered this equality. Indeed, it has been traced
back to a 1922 paper by W. Pauli, and was carefully proved by P. Güttinger in 1931.
At some instant, the well’s width is abruptly increased to a new value a’ > a, leaving the potential
symmetric with respect to the point x = 0, and then left constant. Calculate the probability that after the
change, the particle is still in the ground state of the system.
1.11. At t = 0, a 1D particle of mass m is placed into a hard-wall, flat-bottom potential well
$$U(x) = \begin{cases} 0, & \text{for } 0 < x < a, \\ +\infty, & \text{otherwise}, \end{cases}$$
in a 50/50 linear superposition of the lowest (ground) state and the first excited state. Calculate:
(i) the normalized wavefunction Ψ(x, t) for arbitrary time t ≥ 0, and
(ii) the time evolution of the expectation value ⟨x⟩ of the particle’s coordinate.
1.12. Calculate the potential profiles U(x) for which the following wavefunctions,
(i) Ψ = c exp{–ax² – ibt}, and
(ii) Ψ = c exp{–a|x| – ibt}
(with real coefficients a > 0 and b), satisfy the 1D Schrödinger equation for a particle with mass m. For
each case, calculate ⟨x⟩, ⟨px⟩, δx, and δpx, and compare the product δxδpx with Heisenberg’s
uncertainty relation.
1.13. A 1D particle of mass m, moving in the field of a stationary potential U(x), has the
following eigenfunction
$$\psi(x) = \frac{C}{\cosh\kappa x},$$
where C is the normalization constant, and κ is a real constant. Calculate the function U(x) and the
state’s eigenenergy E.
1.14. Calculate the density dN/dE of traveling-wave quantum states in large rectangular potential
wells of various dimensions: d = 1, 2, and 3.
1.15.* Use the finite-difference method with steps a/2 and a/3 to find as many eigenenergies as
possible for a 1D particle in the infinitely deep, hard-wall 1D potential well of width a. Compare the
results with each other, and with the exact formula.68
68 You may like to start by reading about the finite-difference method – see, e.g., CM Sec. 8.5 or EM Sec. 2.11.
Chapter 2. 1D Wave Mechanics

Even the simplest, 1D version of wave mechanics enables quantitative analysis of many important
quantum-mechanical effects. The order of their discussion in this chapter is dictated mostly by
mathematical convenience – going from the simplest potential profiles to more complex ones, so that we
may build upon the previous results. However, I would advise the reader to focus more not on the math,
but rather on the physics of the non-classical phenomena it describes, ranging from particle penetration
into classically-forbidden regions, to quantum-mechanical tunneling, to the metastable state decay, to
covalent bonding and quantum oscillations, to energy bands and gaps.
2.1. Basic relations

As was discussed at the end of Chapter 1, in several cases (in particular, at strong confinement
within the [y, z] plane), the general (3D) Schrödinger equation may be reduced to its 1D version, similar
to Eq. (1.92):
[Schrödinger equation]
$$i\hbar\frac{\partial\Psi(x,t)}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\Psi(x,t)}{\partial x^2} + U(x,t)\Psi(x,t). \tag{2.1}$$
It is important, however, to remember that according to the discussion in Sec. 1.8, U(x, t) in this
equation is generally the effective potential energy, which may include the energy of the lateral motion,
while Ψ(x, t) may be just one factor in the complete wavefunction Ψ(x, t)χ(y, z). If the transverse factor
χ(y, z) is normalized to 1, then the integration of Eq. (1.22a) over the 3D space within a segment [x1, x2]
gives the following probability to find the particle on this segment:
[Probability]
$$W(t) \equiv \int_{x_1}^{x_2}\Psi(x,t)\Psi^*(x,t)\,dx. \tag{2.2}$$
If the particle under analysis is definitely somewhere inside the system, the normalization of its 1D
wavefunction Ψ(x, t) is provided by extending integral (2) to the whole axis x:
[Normalization]
$$\int_{-\infty}^{+\infty}w(x,t)\,dx = 1, \quad \text{where } w(x,t) \equiv \Psi(x,t)\Psi^*(x,t). \tag{2.3}$$
A similar integration of Eq. (1.23) shows that the expectation value of any observable depending only
on the coordinate x (and possibly time), may be expressed as
[Expectation value]
$$\langle A\rangle(t) = \int_{-\infty}^{+\infty}\Psi^*(x,t)\,\hat{A}\,\Psi(x,t)\,dx. \tag{2.4}$$
It is also useful to introduce the notion of the probability current along the x-axis (a scalar):
[Probability current]
$$I(x,t) \equiv \int j_x\,dy\,dz = \frac{\hbar}{m}\,\mathrm{Im}\left(\Psi^*\frac{\partial\Psi}{\partial x}\right) = \frac{\hbar}{m}\left|\Psi(x,t)\right|^2\frac{\partial\varphi}{\partial x}, \tag{2.5}$$
where jx is the x-component of the current density vector j(r,t). Then the continuity equation (1.48) for
any segment [x1, x2] takes the form
[Continuity equation]
$$\frac{dW}{dt} + I(x_2) - I(x_1) = 0. \tag{2.6}$$
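The above formulas are sufficient for analysis of 1D problems of wave mechanics, but before
proceeding to particular cases, let me deliver on my earlier promise to prove that Heisenberg’s
uncertainty relation (1.35) is indeed valid for any wavefunction Ψ(x, t). For that, let us consider the
following positive (or at least non-negative) integral
$$J(\lambda) \equiv \int_{-\infty}^{+\infty}\left|x\Psi + \lambda\frac{\partial\Psi}{\partial x}\right|^2dx \ge 0, \tag{2.7}$$
where λ is an arbitrary real constant, and assume that at x → ±∞ the wavefunction vanishes, together
with its first derivative – as we will see below, a very common case. Then the left-hand side of Eq. (7)
may be recast as
$$\begin{aligned} J(\lambda) &\equiv \int_{-\infty}^{+\infty}\left|x\Psi + \lambda\frac{\partial\Psi}{\partial x}\right|^2dx = \int_{-\infty}^{+\infty}\left(x\Psi + \lambda\frac{\partial\Psi}{\partial x}\right)\left(x\Psi^* + \lambda\frac{\partial\Psi^*}{\partial x}\right)dx \\ &= \int_{-\infty}^{+\infty}x^2\Psi\Psi^*dx + \lambda\int_{-\infty}^{+\infty}\left(x\Psi\frac{\partial\Psi^*}{\partial x} + x\Psi^*\frac{\partial\Psi}{\partial x}\right)dx + \lambda^2\int_{-\infty}^{+\infty}\frac{\partial\Psi}{\partial x}\frac{\partial\Psi^*}{\partial x}dx. \end{aligned} \tag{2.8}$$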
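According to Eq. (4), the first term in the last form of Eq. (8) is just ⟨x²⟩, while the second and the third
integrals may be worked out by parts:
$$\int_{-\infty}^{+\infty}\left(x\Psi\frac{\partial\Psi^*}{\partial x} + x\Psi^*\frac{\partial\Psi}{\partial x}\right)dx = \int_{-\infty}^{+\infty}x\frac{\partial(\Psi\Psi^*)}{\partial x}dx = \int_{x=-\infty}^{x=+\infty}x\,d(\Psi\Psi^*) = \Psi\Psi^*x\Big|_{x=-\infty}^{x=+\infty} - \int_{-\infty}^{+\infty}\Psi\Psi^*dx = -1, \tag{2.9}$$
$$\int_{-\infty}^{+\infty}\frac{\partial\Psi}{\partial x}\frac{\partial\Psi^*}{\partial x}dx = \int_{x=-\infty}^{x=+\infty}\frac{\partial\Psi}{\partial x}\,d\Psi^* = \frac{\partial\Psi}{\partial x}\Psi^*\Big|_{x=-\infty}^{x=+\infty} - \int_{-\infty}^{+\infty}\Psi^*\frac{\partial^2\Psi}{\partial x^2}dx = \frac{1}{\hbar^2}\int_{-\infty}^{+\infty}\Psi^*\hat{p}_x^2\Psi\,dx = \frac{\langle p_x^2\rangle}{\hbar^2}. \tag{2.10}$$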
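As a result, Eq. (7) takes the following form:
$$J(\lambda) = \langle x^2\rangle - \lambda + \lambda^2\frac{\langle p_x^2\rangle}{\hbar^2} \ge 0, \quad \text{i.e. } \lambda^2 + a\lambda + b \ge 0, \quad \text{with } a \equiv -\frac{\hbar^2}{\langle p_x^2\rangle}, \quad b \equiv \frac{\hbar^2\langle x^2\rangle}{\langle p_x^2\rangle}. \tag{2.11}$$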
This inequality should be valid for any real λ, so that the corresponding quadratic equation, λ² + aλ + b
= 0, can have either one (degenerate) real root – or no real roots at all. This is only possible if its
discriminant, a² – 4b, is non-positive, leading to the following requirement:
$$\langle x^2\rangle\langle p_x^2\rangle \ge \frac{\hbar^2}{4}. \tag{2.12}$$
In particular, if ⟨x⟩ = 0 and ⟨px⟩ = 0,1 then according to Eq. (1.33), Eq. (12) takes the form
1 Eq. (13) may be proved even if ⟨x⟩ and ⟨px⟩ are not equal to zero, by making the following replacements: x → x –
⟨x⟩ and ∂/∂x → ∂/∂x – i⟨p⟩/ħ in Eq. (7), and then repeating all the calculations – which in this case become
somewhat bulky. In Chapter 4, equipped with the bra-ket formalism, we will derive a more general uncertainty
relation, which includes Heisenberg’s relation (13) as a particular case, in a more efficient way.
[Heisenberg’s uncertainty relation]
$$\langle\tilde{x}^2\rangle\langle\tilde{p}_x^2\rangle \ge \frac{\hbar^2}{4}, \tag{2.13}$$
which, according to the definition (1.34) of the r.m.s. uncertainties, is equivalent to Eq. (1.35).
Now let us notice that Heisenberg’s uncertainty relation looks very similar to the
commutation relation between the corresponding operators:
$$[\hat{x},\hat{p}_x]\Psi \equiv (\hat{x}\hat{p}_x - \hat{p}_x\hat{x})\Psi = x\left(-i\hbar\frac{\partial}{\partial x}\right)\Psi - \left(-i\hbar\frac{\partial}{\partial x}\right)(x\Psi) = i\hbar\Psi. \tag{2.14a}$$
Since this relation is valid for any wavefunction Ψ(x, t), it may be represented as an operator equality:
[Coordinate/momentum operators’ commutator]
$$[\hat{x},\hat{p}_x] = i\hbar \ne 0. \tag{2.14b}$$
In Sec. 4.5 we will see that the relation between Eqs. (13) and (14) is just a particular case of a general
relation between the expectation values of non-commuting operators, and their commutators.
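The inequality (12) may be probed numerically for sample wavefunctions. The sketch below assumes ħ = 1 and real trial functions that vanish at the ends of the grid, so that ⟨x⟩ = ⟨px⟩ = 0 for the even probability densities used here, and evaluates ⟨p²⟩ as ħ² times the integral of |∂Ψ/∂x|², as in Eq. (10). The two trial functions and all names are illustrative choices, not taken from the notes.

```python
import math

def second_moments(psi, x_min=-12.0, x_max=12.0, steps=4800, hbar=1.0):
    """Return (<x^2>, <p^2>) for a real function psi(x) that vanishes at the grid ends."""
    h = (x_max - x_min) / steps
    xs = [x_min + i * h for i in range(steps + 1)]
    f = [psi(x) for x in xs]
    norm = h * sum(v * v for v in f)
    x2 = h * sum(x * x * v * v for x, v in zip(xs, f)) / norm
    # <p^2> = hbar^2 * integral of (dpsi/dx)^2, cf. Eq. (2.10), with a finite-difference derivative
    p2 = hbar**2 * h * sum(((f[i + 1] - f[i]) / h) ** 2 for i in range(steps)) / norm
    return x2, p2

def gaussian(x):
    """Gaussian envelope with unit r.m.s. width of the probability density."""
    return math.exp(-x * x / 4.0)

def odd_trial(x):
    """A non-Gaussian trial function with an even probability density."""
    return x * math.exp(-x * x / 2.0)

for name, psi in (("Gaussian", gaussian), ("odd trial function", odd_trial)):
    x2, p2 = second_moments(psi)
    print(name, ": <x^2><p^2> =", round(x2 * p2, 4), " (bound: 0.25)")
```

The Gaussian should reach the bound ħ²/4 within the numerical accuracy, while the second function should give a noticeably larger product.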
2.2. Free particle: Wave packets

Let us start our discussion of particular problems with the free 1D motion, i.e. with U(x, t) = 0.
From Eq. (1.29), it is evident that in the 1D case, a similar “fundamental” (i.e. a particular but the most
important) solution of the Schrödinger equation (1) is a sinusoidal (“monochromatic”) wave
$$\Psi_0(x,t) = \mathrm{const}\times\exp\{i(k_0x - \omega_0t)\}. \tag{2.15}$$
According to Eqs. (1.32), it describes a particle with definite momentum2 p0 = ħk0 and energy E0 = ħω0
= ħ²k0²/2m. However, for this wavefunction, the product Ψ*Ψ does not depend on either x or t, so that
the particle is completely delocalized, i.e. the probability to find it is the same along the whole axis x, at all times.
In order to describe a space-localized state, let us form, at the initial moment of time (t = 0), a
wave packet of the type shown in Fig. 1.6, by multiplying the sinusoidal waveform (15) by some smooth
envelope function A(x). As the most important particular example, consider the Gaussian wave packet
[Gaussian wave packet: t = 0]
$$\Psi(x,0) = A(x)e^{ik_0x}, \quad \text{with } A(x) = \frac{1}{(2\pi)^{1/4}(\delta x)^{1/2}}\exp\left\{-\frac{x^2}{(2\delta x)^2}\right\}. \tag{2.16}$$
(By the way, Fig. 1.6a shows exactly such a packet.) The pre-exponential factor in this envelope
function has been selected in the way to have the initial probability density,
$$w(x,0) \equiv \Psi^*(x,0)\Psi(x,0) = A^*(x)A(x) = \frac{1}{(2\pi)^{1/2}\delta x}\exp\left\{-\frac{x^2}{2(\delta x)^2}\right\}, \tag{2.17}$$
normalized as in Eq. (3), for any parameters δx and k0.3
2 From this point on to the end of this chapter, I will drop the index x in the x-components of the vectors k and p.
3 This fact may be readily proven using the well-known integral of the Gaussian function (17), in infinite limits –
see, e.g., MA Eq. (6.9b). It is also straightforward to use MA Eq. (6.9c) to prove that for the wave packet (16), the
parameter δx is indeed the r.m.s. uncertainty (1.34) of the coordinate x, thus justifying its notation.
To explore the evolution of this wave packet in time, we could try to solve Eq. (1) with the initial
condition (16) directly, but in the spirit of the discussion in Sec. 1.5, it is easier to proceed differently.
Let us first represent the initial wavefunction (16) as a sum (1.67) of the eigenfunctions ψk(x) of the
corresponding stationary 1D Schrödinger equation (1.60), in our current case
$$-\frac{\hbar^2}{2m}\frac{d^2\psi_k}{dx^2} = E_k\psi_k, \quad \text{with } E_k = \frac{\hbar^2k^2}{2m}, \tag{2.18}$$
which are simply monochromatic waves,
$$\psi_k = a_ke^{ikx}. \tag{2.19}$$
Since (as was discussed in Sec. 1.7) at the unconstrained motion the spectrum of possible wave numbers
k is continuous, the sum (1.67) should be replaced with an integral:4
$$\Psi(x,0) = \int a_ke^{ikx}\,dk. \tag{2.20}$$
Now let us notice that from the point of view of mathematics, Eq. (20) is just the usual Fourier
transform from the variable k to the “conjugate” variable x, and we can use the well-known formula of
the reciprocal Fourier transform to write
$$a_k = \frac{1}{2\pi}\int\Psi(x,0)e^{-ikx}\,dx = \frac{1}{2\pi}\frac{1}{(2\pi)^{1/4}(\delta x)^{1/2}}\int\exp\left\{-\frac{x^2}{(2\delta x)^2} - i\tilde{k}x\right\}dx, \quad \text{where } \tilde{k} \equiv k - k_0. \tag{2.21}$$
This Gaussian integral may be worked out by the following standard method, which will be used many
times in this course. Let us complement the exponent to the full square of a linear combination of x and
k, adding a compensating term independent of x:
$$-\frac{x^2}{(2\delta x)^2} - i\tilde{k}x \equiv -\frac{1}{(2\delta x)^2}\left[x + 2i(\delta x)^2\tilde{k}\right]^2 - \tilde{k}^2(\delta x)^2. \tag{2.22}$$
Since the integration in the right-hand side of Eq. (21) should be performed at constant k̃, in the infinite
limits of x, its result would not change if we replace dx by dx’ ≡ d[x + 2i(δx)²k̃]. As a result, we get:5
$$\begin{aligned} a_k &= \frac{1}{2\pi}\frac{1}{(2\pi)^{1/4}(\delta x)^{1/2}}\exp\left\{-\tilde{k}^2(\delta x)^2\right\}\int\exp\left\{-\frac{x'^2}{(2\delta x)^2}\right\}dx' \\ &= \left(\frac{1}{2\pi}\right)^{1/2}\frac{1}{(2\pi)^{1/4}(\delta k)^{1/2}}\exp\left\{-\frac{\tilde{k}^2}{(2\delta k)^2}\right\}, \end{aligned} \tag{2.23}$$
so that ak also has a Gaussian distribution, now along the k-axis, centered to the value k0 (Fig. 1.6b),
with the constant δk defined as
4 For the notation brevity, from this point on the infinite limit signs will be dropped in all 1D integrals.
5 The fact that the argument’s shift is imaginary is not important. Indeed, since the function under the integral
tends to zero at Re x’ = Re x → ±∞, the difference between infinite integrals of this function along axes of x and
x’ is equal to its contour integral around the rectangular area Im x ≤ Im z ≤ Im x’. Since the function is also analytical, it
obeys the Cauchy theorem MA Eq. (15.1), which says that this contour integral equals zero.
$$\delta k \equiv \frac{1}{2\delta x}. \tag{2.24}$$
Thus we may represent the initial wave packet (16) as
$$\Psi(x,0) = \left(\frac{1}{2\pi}\right)^{1/2}\frac{1}{(2\pi)^{1/4}(\delta k)^{1/2}}\int\exp\left\{-\frac{(k - k_0)^2}{(2\delta k)^2}\right\}e^{ikx}\,dk. \tag{2.25}$$
From the comparison of this formula with Eq. (16), it is evident that the r.m.s. uncertainty of the wave
number k in this packet is indeed equal to δk defined by Eq. (24), thus justifying the notation. The
comparison of the last relation with Eq. (1.35) shows that the Gaussian packet represents the ultimate
case in which the product δxδp = δx(ħδk) has the lowest possible value (ħ/2); for any other envelope’s
shape, the uncertainty product may only be larger. We could of course get the same result for δk from
Eq. (16) using the definitions (1.23), (1.33), and (1.34); the real advantage of Eq. (25) is that it can be
readily generalized to t > 0. Indeed, we already know that the time evolution of the wavefunction is
always given by Eq. (1.69), for our current case6
[Gaussian wave packet: arbitrary time]
$$\Psi(x,t) = \left(\frac{1}{2\pi}\right)^{1/2}\frac{1}{(2\pi)^{1/4}(\delta k)^{1/2}}\int\exp\left\{-\frac{(k - k_0)^2}{(2\delta k)^2}\right\}e^{ikx}\exp\left\{-i\frac{\hbar k^2}{2m}t\right\}dk. \tag{2.26}$$
Fig. 1 shows several snapshots of the real part of the wavefunction (26), for a particular case δk = 0.1 k0.
1 v ph x
) t0 t  2 .2
v0
Re 
0
Fig. 2.1. Typical time evolution

v gr of a 1D wave packet on (a)
1 smaller and (b) larger time scales.
The dashed lines show the packet
5 0 5 10 15 20 envelopes, i.e.   .
x / x
1
x x
t0 t 3 t  20
Re 0 v0 v0
 10 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140
x / x
6 Note that Eq. (26) differs from Eq. (16) only by an exponent of a purely imaginary number, and hence this
wavefunction is also properly normalized to 1 – see Eq. (3). Hence the wave packet introduction offers a natural
solution to the problem of traveling de Broglie wave’s normalization, which was mentioned in Sec. 1.2.
The plots clearly show the following effects:

(i) the wave packet as a whole (as characterized by its envelope) moves along the x-axis with a
certain group velocity vgr,
(ii) the “carrier” quasi-sinusoidal wave inside the packet moves with a different, phase velocity
vph, which may be defined as the velocity of the spatial points where the wave’s phase φ(x, t) ≡ arg Ψ
takes a certain fixed value (say, φ = π/2, where Re Ψ vanishes), and
(iii) the wave packet’s spatial width gradually increases with time – the packet spreads.
All these effects are common for waves of any physical nature.7 Indeed, let us consider a 1D
wave packet of the type (26), but more general:
[Arbitrary 1D wave packet]
$$\Psi(x,t) = \int a_ke^{i(kx - \omega t)}\,dk, \tag{2.27}$$
propagating in a medium with an arbitrary (but smooth!) dispersion relation ω(k), and assume that the
wave number distribution ak is narrow: δk << ⟨k⟩ ≡ k0 – see Fig. 1.6b. Then we may expand the function
ω(k) into the Taylor series near the central wave number k0, and keep only three of its leading terms:
$$\omega(k) \approx \omega_0 + \frac{d\omega}{dk}\tilde{k} + \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2, \quad \text{where } \tilde{k} \equiv k - k_0, \quad \omega_0 \equiv \omega(k_0), \tag{2.28}$$
where both derivatives have to be evaluated at the point k = k0. In this approximation,8 the expression in
the parentheses on the right-hand side of Eq. (27) may be rewritten as
$$\begin{aligned} kx - \omega(k)t &\approx k_0x + \tilde{k}x - \left(\omega_0 + \frac{d\omega}{dk}\tilde{k} + \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2\right)t \\ &\equiv (k_0x - \omega_0t) + \tilde{k}\left(x - \frac{d\omega}{dk}t\right) - \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2t, \end{aligned} \tag{2.29}$$
so that Eq. (27) becomes
$$\Psi(x,t) \approx e^{i(k_0x - \omega_0t)}\int a_k\exp\left\{i\left[\tilde{k}\left(x - \frac{d\omega}{dk}t\right) - \frac{1}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2t\right]\right\}d\tilde{k}. \tag{2.30}$$
First, let us neglect the last term in the square brackets (which is much smaller than the first term if the
dispersion relation is smooth enough and/or the time interval t is sufficiently small), and compare the
result with the initial form of the wave packet (27):
$$\Psi(x,0) = \int a_ke^{ikx}\,dk = A(x)e^{ik_0x}, \quad \text{with } A(x) \equiv \int a_ke^{i\tilde{k}x}\,d\tilde{k}. \tag{2.31}$$
The comparison shows that in this approximation, Eq. (30) is reduced to
$$\Psi(x,t) = A(x - v_{gr}t)\,e^{ik_0(x - v_{ph}t)}, \tag{2.32}$$
where vgr and vph are two constants with the dimension of velocity:
7 See, e.g., brief discussions in CM Sec. 6.3 and EM Sec. 7.2.
8 By the way, in the particular case of de Broglie waves described by the dispersion relation (1.30), Eq. (28) is
exact, because ω = E/ħ is a quadratic function of k = p/ħ, and all higher derivatives of ω over k vanish for any k0.
[Group and phase velocities]
$$v_{gr} \equiv \frac{d\omega}{dk}\bigg|_{k=k_0}, \quad v_{ph} \equiv \frac{\omega}{k}\bigg|_{k=k_0}. \tag{2.33a}$$
Clearly, Eq. (32) describes the effects (i) and (ii) listed above. For the particular case of the de Broglie
waves, whose dispersion law is given by Eq. (1.30),
$$v_{gr} \equiv \frac{d\omega}{dk}\bigg|_{k=k_0} = \frac{\hbar k_0}{m} \equiv v_0, \quad v_{ph} \equiv \frac{\omega}{k}\bigg|_{k=k_0} = \frac{\hbar k_0}{2m} = \frac{v_{gr}}{2}. \tag{2.33b}$$
We see that (very fortunately :-) the velocity of the wave packet’s envelope is equal to v0 – the classical
velocity of the same particle.
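The definitions (33a) are straightforward to evaluate for any dispersion relation. The sketch below assumes ħ = m = 1 and the illustrative central wave number k0 = 2, applies a central-difference derivative to the de Broglie dispersion law, and, for contrast, to a hypothetical dispersion-free law ω = ck that is not discussed in the text.

```python
def group_velocity(omega, k0, h=1e-4):
    """Eq. (2.33a): v_gr = d(omega)/dk at k = k0, by a central difference."""
    return (omega(k0 + h) - omega(k0 - h)) / (2.0 * h)

def phase_velocity(omega, k0):
    """Eq. (2.33a): v_ph = omega/k at k = k0."""
    return omega(k0) / k0

def de_broglie(k, hbar=1.0, m=1.0):
    """Dispersion law of a free non-relativistic particle: omega = hbar k^2 / 2m."""
    return hbar * k * k / (2.0 * m)

def dispersion_free(k, c=1.0):
    """Hypothetical linear law omega = c k, included only for comparison."""
    return c * k

K0 = 2.0
print("de Broglie:      v_gr =", group_velocity(de_broglie, K0), " v_ph =", phase_velocity(de_broglie, K0))
print("dispersion-free: v_gr =", group_velocity(dispersion_free, K0), " v_ph =", phase_velocity(dispersion_free, K0))
```

For the de Broglie waves, the phase velocity should come out as one half of the group velocity, in agreement with Eq. (33b), while for the linear law the two velocities coincide.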
Next, the last term in the square brackets of Eq. (30) describes the effect (iii), the wave packet’s
spread. It may be readily evaluated if the packet (27) is initially Gaussian, as in our example (25):
$$a_k = \text{const} \times \exp\left\{-\frac{\tilde{k}^2}{(2\delta k)^2}\right\}. \tag{2.34}$$
In this case the integral (30) is Gaussian, and may be worked out exactly as the integral (21), i.e. by representing the merged exponents under the integral as a full square of a linear combination of x and k̃:
$$-\frac{\tilde{k}^2}{(2\delta k)^2} + i\tilde{k}(x - v_{gr}t) - \frac{i}{2}\frac{d^2\omega}{dk^2}\tilde{k}^2 t \equiv -\Delta(t)\left[\tilde{k} - i\frac{x - v_{gr}t}{2\Delta(t)}\right]^2 - \frac{(x - v_{gr}t)^2}{4\Delta(t)}, \tag{2.35}$$
where I have introduced the following complex function of time:
$$\Delta(t) \equiv \frac{1}{4(\delta k)^2} + \frac{i}{2}\frac{d^2\omega}{dk^2}t = (\delta x)^2 + \frac{i}{2}\frac{d^2\omega}{dk^2}t, \tag{2.36}$$
and used Eq. (24). Now integrating over k̃, and restoring the phase factor that stands before the integral in Eq. (30), we get
$$\Psi(x,t) \propto \exp\left\{-\frac{(x - v_{gr}t)^2}{4\Delta(t)} + i(k_0 x - \omega_0 t)\right\}, \tag{2.37}$$
where, for de Broglie waves, ω0 = (1/2)(d²ω/dk²)k0².
The imaginary part of the ratio 1/Δ(t) in this exponent gives just an additional contribution to the wave's phase and does not affect the resulting probability distribution
$$w(x,t) = \Psi^*\Psi \propto \exp\left\{-\frac{(x - v_{gr}t)^2}{2}\,\mathrm{Re}\frac{1}{\Delta(t)}\right\}. \tag{2.38}$$
This is again a Gaussian distribution over axis x, centered at the point x = vgr t, with the r.m.s. width δx' given by
$$(\delta x')^2 \equiv \left\{\mathrm{Re}\left[\frac{1}{\Delta(t)}\right]\right\}^{-1} = (\delta x)^2 + \left(\frac{1}{2}\frac{d^2\omega}{dk^2}t\right)^2\frac{1}{(\delta x)^2}. \tag{2.39a}$$
In the particular case of de Broglie waves, d²ω/dk² = ħ/m, so that the wave packet's spread is described by
$$(\delta x')^2 = (\delta x)^2 + \left(\frac{\hbar t}{2m}\right)^2\frac{1}{(\delta x)^2}. \tag{2.39b}$$
The physics of the packet spreading is very simple: if d²ω/dk² ≠ 0, the group velocity dω/dk of each small group dk of the monochromatic components of the wave is different, resulting in the gradual (eventually, linear) accumulation of the differences of the distances traveled by the groups. The most curious feature of Eq. (39) is that the packet width at t > 0 depends on its initial width δx'(0) = δx in a non-monotonic way, tending to infinity at both δx → 0 and δx → ∞. Because of that, for a given time interval t, there is an optimal value of δx that minimizes δx':
$$(\delta x')_{min} = \sqrt{2}\,(\delta x)_{opt} = \left(\frac{\hbar t}{m}\right)^{1/2}. \tag{2.40}$$
This expression may be used for estimates of the spreading effect. Due to the smallness of the Planck constant ħ on the human scale of things, for macroscopic bodies this effect is extremely small even for very long time intervals; however, for light particles it may be very noticeable: for an electron (m = me ≈ 10^-30 kg), and t = 1 s, Eq. (40) yields (δx')min ~ 1 cm.
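The estimate above is easy to reproduce. The following sketch evaluates Eqs. (39b) and (40) in SI units; the helper names and the 1-gram comparison body are illustrative assumptions, not taken from the notes.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837e-31      # electron mass, kg

def spread_width(dx0, t, m):
    """r.m.s. width at time t of a free Gaussian packet, Eq. (2.39b)."""
    return math.sqrt(dx0**2 + (HBAR * t / (2.0 * m))**2 / dx0**2)

def optimal_initial_width(t, m):
    """Initial width that minimizes the width at time t."""
    return math.sqrt(HBAR * t / (2.0 * m))

def minimal_width(t, m):
    """Smallest achievable width at time t, Eq. (2.40)."""
    return math.sqrt(HBAR * t / m)

t = 1.0                                   # time interval, s
dx_opt = optimal_initial_width(t, M_E)
dx_min = minimal_width(t, M_E)
print(f"electron: optimal initial width {dx_opt:.2e} m, minimal width {dx_min:.2e} m")
print(f"1-gram body (hypothetical): minimal width {minimal_width(t, 1e-3):.2e} m")
```

For the electron the minimal width is about 1 cm, while for the 1-gram body it is many orders of magnitude below the size of an atomic nucleus.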
Note also that for any t ≠ 0, the wave packet retains its Gaussian envelope, but the ultimate relation (24) is not satisfied, δx'δp > ħ/2 – due to a gradually accumulated phase shift between the component monochromatic waves. The last remark on this topic: in quantum mechanics, the wave
packet spreading is not a ubiquitous effect! For example, in Chapter 5 we will see that in a quantum
oscillator, the spatial width of a Gaussian packet (for that system, called the Glauber state of the
oscillator) does not grow monotonically but rather either stays constant or oscillates in time.
Now let us briefly discuss the case when the initial wave packet is not Gaussian but is described
by an arbitrary initial wavefunction. To make the forthcoming result more aesthetically pleasing, it is
beneficial to generalize our calculations to an arbitrary initial time t0; it is evident that if U does not
depend on time explicitly, it is sufficient to replace t with (t – t0) in all above formulas. With this
replacement, Eq. (27) becomes
$$\Psi(x,t) = \int a_k e^{i[kx - \omega(t - t_0)]}dk, \tag{2.41}$$
and the reciprocal transform (21) reads
$$a_k = \frac{1}{2\pi}\int \Psi(x,t_0)e^{-ikx}dx. \tag{2.42}$$
If we want to express these two formulas with one relation, i.e. plug Eq. (42) into Eq. (41), we should
give the integration variable x some other name, e.g., x0. (Such notation is appropriate because this
variable describes the coordinate argument in the initial wave packet.) The result is
$$\Psi(x,t) = \frac{1}{2\pi}\int dk \int dx_0\, \Psi(x_0,t_0)\, e^{i[k(x - x_0) - \omega(t - t_0)]}. \tag{2.43}$$
Changing the order of integration, this expression may be rewritten in the following general form, which serves as the definition of the 1D propagator:
$$\Psi(x,t) = \int G(x,t;x_0,t_0)\,\Psi(x_0,t_0)\,dx_0, \tag{2.44}$$
where the function G, usually called kernel in mathematics, in quantum mechanics is called the
propagator.9 Its physical sense may be understood by considering the following special initial condition:10
$$\Psi(x_0,t_0) = \delta(x_0 - x'), \tag{2.45}$$
where x' is a certain point within the domain of the particle's motion. In this particular case, Eq. (44) gives
$$\Psi(x,t) = G(x,t;x',t_0). \tag{2.46}$$
Hence, the propagator, considered as a function of its arguments x and t only, is just the wavefunction of the particle, at the δ-functional initial conditions (45). Thus just as Eq. (41) may be understood as a mathematical expression of the linear superposition principle in the momentum (i.e., reciprocal) space domain, Eq. (44) is an expression of this principle in the direct space domain: the system's “response” Ψ(x,t) to an arbitrary initial condition Ψ(x0,t0) is just a sum of its responses to its elementary spatial “slices” of this initial function, with the propagator G(x,t; x0,t0) representing the weight of each slice in the final sum.
According to Eqs. (43) and (44), in the particular case of a free particle the propagator is equal to
$$G(x,t;x_0,t_0) = \frac{1}{2\pi}\int e^{i[k(x - x_0) - \omega(t - t_0)]}dk. \tag{2.47}$$
Calculating this integral, one should remember that here ω is not a constant but a function of k, given by the dispersion relation for the partial waves. In particular, for the de Broglie waves, with ω = ħk²/2m,
$$G(x,t;x_0,t_0) = \frac{1}{2\pi}\int \exp\left\{i\left[k(x - x_0) - \frac{\hbar k^2}{2m}(t - t_0)\right]\right\}dk. \tag{2.48}$$
This is a Gaussian integral again, and may be readily calculated just as it was done (twice) above, by completing the exponent to the full square. The result is the free particle's propagator
$$G(x,t;x_0,t_0) = \left(\frac{m}{2\pi i\hbar(t - t_0)}\right)^{1/2}\exp\left\{-\frac{m(x - x_0)^2}{2i\hbar(t - t_0)}\right\}. \tag{2.49}$$
Please note the following features of this complex function (plotted in Fig. 2):
(i) It depends only on the differences (x – x0) and (t – t0). This is natural because the free-particle
propagation problem is translation-invariant both in space and time.
(ii) The function’s shape does not depend on its arguments – they just rescale the same function:
its snapshot (Fig. 2), if plotted as a function of un-normalized x, just becomes broader and lower with
time. It is curious that the spatial broadening scales as (t – t0)^{1/2} – just as in classical diffusion, as a result of a deep mathematical analogy between quantum mechanics and classical statistics – to be discussed further in Chapter 7.
(iii) In accordance with the uncertainty relation, the ultimately compressed wave packet (45) has an infinite width of momentum distribution, and the quasi-sinusoidal tails of the free-particle propagator, clearly visible in Fig. 2, are the results of the free propagation of the fastest (highest-momentum) components of that distribution, in both directions from the packet center.
9 Its standard notation by letter G stems from the fact that the propagator is essentially the spatial-temporal Green's function, defined very similarly to Green's functions of other ordinary and partial differential equations describing various physics systems – see, e.g., CM Sec. 5.1 and/or EM Sec. 2.7 and 7.3.
10 Note that such an initial condition is mathematically not equivalent to a δ-functional initial probability density (3).
Fig. 2.2. The real (solid line) and imaginary (dotted line) parts of the 1D free particle's propagator (49). [Plot: Re and Im of G(x,t;x0,t0)/[m/ħ(t – t0)]^{1/2} versus (x – x0)/[ħ(t – t0)/m]^{1/2}.]
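The use of Eqs. (44) and (49) may be illustrated numerically. The sketch below, in units with ħ = m = 1, integrates the propagator against an initial Gaussian packet with k0 = 0 by the trapezoidal rule, and compares the result with the spread law (39b). The grid parameters and function names are assumptions made for this illustration.

```python
import cmath
import math

HBAR = 1.0   # units with hbar = m = 1 (an assumption of this sketch)
M = 1.0

def free_propagator(x, t, x0=0.0, t0=0.0):
    """Free-particle propagator, Eq. (2.49)."""
    tau = t - t0
    prefactor = cmath.sqrt(M / (2j * math.pi * HBAR * tau))
    return prefactor * cmath.exp(-M * (x - x0) ** 2 / (2j * HBAR * tau))

def propagate_gaussian(x, t, dx=1.0, half_range=12.0, n=4800):
    """Psi(x, t) from Eq. (2.44) for Psi(x0, 0) = exp(-x0^2 / (4 dx^2)),
    using the trapezoidal rule on [-half_range, +half_range]."""
    h = 2.0 * half_range / n
    total = 0j
    for j in range(n + 1):
        x0 = -half_range + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * free_propagator(x, t, x0) * math.exp(-x0**2 / (4.0 * dx**2))
    return total * h

dx, t = 1.0, 1.0
width = math.sqrt(dx**2 + (HBAR * t / (2.0 * M)) ** 2 / dx**2)   # Eq. (2.39b)
w_center = abs(propagate_gaussian(0.0, t, dx)) ** 2
w_at_width = abs(propagate_gaussian(width, t, dx)) ** 2
print(f"w(width)/w(0) = {w_at_width / w_center:.4f} (Gaussian value: {math.exp(-0.5):.4f})")
print(f"w(0, t)/w(0, 0) = {w_center:.4f} (expected dx/width = {dx / width:.4f})")
```

The first ratio checks that the probability density at t > 0 is still a Gaussian of the predicted width; the second one checks that its peak drops in proportion to 1/δx', as required by the conservation of the total probability.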
In the following sections, I will mostly focus on monochromatic wavefunctions (that, for
unconfined motion, may be interpreted as wave packets of a very large spatial width δx), and only rarely
discuss wave packets. My best excuse is the linear superposition principle, i.e. our conceptual ability to
restore the general solution from that of monochromatic waves of all possible energies. However, the
reader should not forget that, as the above discussion has illustrated, mathematically such restoration is
not always trivial.
2.3. Particle reflection and tunneling

Now, let us proceed to the cases when a 1D particle moves in various potential profiles U(x) that
are constant in time. Conceptually, the simplest of such profiles is a potential step – see Fig. 3.
Fig. 2.3. Classical 1D motion in a potential profile U(x). [Sketch: the classically accessible region (x < xc) and the classically forbidden region (x > xc), separated by the classical turning point xc.]
As I am sure the reader knows, in classical mechanics the particle’s kinetic energy p2/2m cannot
be negative, so if the particle is incident on such a step (in Fig. 3, from the left), it can only travel
through the classically accessible region, where its (conserved) full energy,
$$E = \frac{p^2}{2m} + U(x), \tag{2.50}$$
is larger than the local value U(x). Let the initial velocity v = p/m be positive, i.e. directed toward the step. Before it has reached the classical turning point xc, defined by the equality
$$U(x_c) = E, \tag{2.51}$$
the particle's kinetic energy p²/2m is positive, so that it continues to move in the initial direction. On the other hand, a classical particle cannot penetrate the classically forbidden region x > xc, because there
its kinetic energy would be negative. Hence when the particle reaches the point x = xc, its velocity has to
change its sign, i.e. the particle is reflected back from the classical turning point.
In order to see what wave mechanics says about this situation, let us start from the simplest, sharp potential step shown with the bold black line in Fig. 4:
$$U(x) = U_0\theta(x) \equiv \begin{cases} 0, & \text{at } x < 0, \\ U_0, & \text{at } 0 < x. \end{cases} \tag{2.52}$$
For this choice, and any energy within the interval 0 < E < U0, the classical turning point is xc = 0.
Fig. 2.4. The reflection of a monochromatic wave from a potential step U0 > E. (This particular wavefunction's shape is for U0 = 5E.) The wavefunction is plotted with the same schematic vertical offset by E as those in Fig. 1.8. [Sketch: the incident (A) and reflected (B) waves at x < 0, and the decaying wavefunction (C) at x > 0.]
Let us represent an incident particle with a wave packet so long that the spread δk ~ 1/δx of its wave-number spectrum is sufficiently small to make the energy uncertainty δE = ħδω = ħ(dω/dk)δk negligible in comparison with its average value E < U0, as well as with (U0 – E). In this case, E may be considered as a given constant, the time dependence of the wavefunction is given by Eq. (1.62), and we can calculate its spatial factor ψ(x) from the 1D version of the stationary Schrödinger equation (1.65):11
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U(x)\psi = E\psi. \tag{2.53}$$
At x < 0, i.e. at U = 0, the equation is reduced to the Helmholtz equation (1.78), and may be
satisfied with either of two traveling waves, proportional to exp{+ikx} and exp{-ikx} correspondingly,
with k satisfying the dispersion equation (1.30):
$$k^2 \equiv \frac{2mE}{\hbar^2}. \tag{2.54}$$
Thus the general solution of Eq. (53) in this region may be represented as a sum of the incident and reflected waves:
$$\psi_-(x) = Ae^{+ikx} + Be^{-ikx}. \tag{2.55}$$
11 Note that this is not the eigenproblem like the one we have solved in Sec. 1.4 for a potential well. Indeed, now
the energy E is considered given – e.g., by the initial conditions that launch a long wave packet upon the potential
step – in Fig. 4, from the left.
The second term on the right-hand side of Eq. (55) evidently describes a (formally, infinitely long) wave
packet traveling to the left, arising because of the particle’s reflection from the potential step. If B = –A,
this solution is reduced to Eq. (1.84) for the potential well with infinitely high walls, but for our current
case of a finite step height U0, the relation between the coefficients B and A may be different.
To show this, let us solve Eq. (53) for x > 0, where U = U0 > E. In this region the equation may be rewritten as
$$\frac{d^2\psi_+}{dx^2} = \kappa^2\psi_+, \tag{2.56}$$
where κ is a real and positive constant defined by a formula similar in structure to Eq. (54):
$$\kappa^2 \equiv \frac{2m(U_0 - E)}{\hbar^2} > 0. \tag{2.57}$$
The general solution of Eq. (56) is the sum of exp{+κx} and exp{–κx}, with arbitrary coefficients. However, in our particular case the wavefunction should be finite at x → +∞, so only the latter exponent, describing the decay in the classically forbidden region, is acceptable:
$$\psi_+(x) = Ce^{-\kappa x}. \tag{2.58}$$
Such penetration of the wavefunction to the classically forbidden region, and hence a non-zero
probability to find the particle there, is one of the most fascinating predictions of quantum mechanics,
and has been repeatedly observed in experiment – e.g., via tunneling experiments – see the next
section.12 From Eq. (58), it is evident that the constant κ, defined by Eq. (57), may be interpreted as the reciprocal penetration depth δ ≡ 1/κ. Even for the lightest particles, this depth is usually very small. Indeed, for E << U0 that relation yields
$$\delta\big|_{E \to 0} \equiv \frac{1}{\kappa}\bigg|_{E \to 0} = \frac{\hbar}{(2mU_0)^{1/2}}. \tag{2.59}$$
For example, let us consider a conduction electron in a typical metal, which runs, at the metal's surface, into a sharp potential step whose height is equal to the metal's workfunction U0 ≈ 5 eV – see the discussion of the photoelectric effect in Sec. 1.1. In this case, according to Eq. (59), δ is close to 0.1 nm, i.e. is close to a typical size of an atom. For heavier elementary particles (e.g., protons) the penetration depth is correspondingly smaller, and for macroscopic bodies, it is hardly measurable.
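The numbers quoted above follow directly from Eq. (59). A minimal sketch in SI units is given below; the function name and the choice of a proton as the comparison particle are made for illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
EV = 1.602176634e-19     # electron-volt, J
M_E = 9.1093837e-31      # electron mass, kg
M_P = 1.67262192e-27     # proton mass, kg

def penetration_depth(u0_ev, m, e_ev=0.0):
    """delta = 1/kappa = hbar / sqrt(2 m (U0 - E)), from Eq. (2.57);
    for E = 0 this is Eq. (2.59). Energies are given in eV."""
    return HBAR / math.sqrt(2.0 * m * (u0_ev - e_ev) * EV)

delta_electron = penetration_depth(5.0, M_E)
delta_proton = penetration_depth(5.0, M_P)
print(f"electron: {delta_electron * 1e9:.3f} nm, proton: {delta_proton * 1e9:.4f} nm")
```

For the same step height, the proton's penetration depth is smaller than the electron's by the factor (mp/me)^{1/2}, i.e. by roughly 43.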
Returning to Eqs. (55) and (58), we still should relate the coefficients B and C to the amplitude A
of the incident wave, using the boundary conditions at x = 0. Since E is a finite constant, and U(x) is a
finite function, Eq. (53) says that d²ψ/dx² should be finite as well. This means that the first derivative should be continuous:
$$\lim_{\varepsilon\to 0}\left(\frac{d\psi}{dx}\bigg|_{x=+\varepsilon} - \frac{d\psi}{dx}\bigg|_{x=-\varepsilon}\right) = \lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}\frac{d^2\psi}{dx^2}dx = \frac{2m}{\hbar^2}\lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}[U(x) - E]\,\psi\,dx = 0. \tag{2.60}$$
Repeating such a calculation for the wavefunction ψ(x) itself, we see that it also should be continuous at all points, including the border point x = 0, so that the boundary conditions in our problem are
$$\psi_-(0) = \psi_+(0), \quad \frac{d\psi_-}{dx}(0) = \frac{d\psi_+}{dx}(0). \tag{2.61}$$
12 Note that this effect is pertinent to waves of any type, including mechanical waves (see, e.g., CM Secs. 6.4 and 7.7) and electromagnetic waves (see, e.g., EM Secs. 7.3-7.7).
Plugging Eqs. (55) and (58) into Eqs. (61), we get a system of two linear equations
$$A + B = C, \quad ikA - ikB = -\kappa C, \tag{2.62}$$
whose (easy :-) solution allows us to express B and C via A:
$$B = A\frac{k - i\kappa}{k + i\kappa}, \quad C = A\frac{2k}{k + i\kappa}. \tag{2.63}$$
We immediately see that the numerator and denominator in the first of these fractions have equal moduli, so that |B| = |A|. This means that, as we could expect, a particle with energy E < U0 is totally reflected from the step – just as in classical mechanics. As a result, at x < 0 our solution (55) may be represented as a standing wave
$$\psi_- = 2iAe^{i\theta}\sin(kx - \theta), \quad \text{with } \theta \equiv \tan^{-1}\frac{k}{\kappa}. \tag{2.64}$$
Note that the shift Δx ≡ θ/k = (tan^{-1} k/κ)/k of the standing wave to the right, due to the partial penetration of the wavefunction under the potential step, is commensurate with, but generally not equal to the penetration depth δ ≡ 1/κ. The red line in Fig. 4 shows the exact behavior of the wavefunction, for a particular case E = U0/5, at which k/κ ≡ [E/(U0 – E)]^{1/2} = 1/2.
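Equations (62)-(64) may be checked numerically for the case E = U0/5 shown in Fig. 4. The sketch below uses units with ħ = m = 1 and sets A = 1; both choices, and the function name, are assumptions of this example.

```python
import cmath
import math

def step_amplitudes(energy, u0, hbar=1.0, m=1.0):
    """k, kappa, and the ratios B/A, C/A of Eq. (2.63), for 0 < E < U0."""
    k = math.sqrt(2.0 * m * energy) / hbar
    kappa = math.sqrt(2.0 * m * (u0 - energy)) / hbar
    b = (k - 1j * kappa) / (k + 1j * kappa)
    c = 2.0 * k / (k + 1j * kappa)
    return k, kappa, b, c

k, kappa, b, c = step_amplitudes(energy=1.0, u0=5.0)   # E = U0/5
theta = math.atan(k / kappa)

def psi_minus(x):
    """Eq. (2.55) with A = 1."""
    return cmath.exp(1j * k * x) + b * cmath.exp(-1j * k * x)

def standing_wave(x):
    """Eq. (2.64) with A = 1."""
    return 2j * cmath.exp(1j * theta) * math.sin(k * x - theta)

print(f"k/kappa = {k / kappa:.3f}, |B/A| = {abs(b):.6f}")
print(f"shift theta/k = {theta / k:.4f}, penetration depth 1/kappa = {1.0 / kappa:.4f}")
```

The printed shift θ/k and depth 1/κ are of the same order but not equal, as stated above.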
According to Eq. (57), as the particle's energy E is increased to approach U0, the penetration depth 1/κ diverges. This raises an important issue: what happens at E > U0, i.e. if there is no classically forbidden region in the problem? In classical mechanics, the incident particle would continue to move to the right, though with a reduced velocity, corresponding to the new kinetic energy E – U0, so there would be no reflection. In quantum mechanics, however, the situation is different. To analyze it, it is not necessary to re-solve the whole problem; it is sufficient to note that all our calculations, and hence Eqs. (63) are still valid if we take13
$$\kappa = -ik', \quad \text{with } k'^2 \equiv \frac{2m(E - U_0)}{\hbar^2} > 0. \tag{2.65}$$
With this replacement, Eq. (63) becomes14
$$B = A\frac{k - k'}{k + k'}, \quad C = A\frac{2k}{k + k'}. \tag{2.66}$$
The most important result of this change is that now the particle's reflection is not total: |B| < |A|. To evaluate this effect quantitatively, it is fairer to use not the B/A or C/A ratios, but rather that of the probability currents (5) carried by the de Broglie waves traveling to the right, with amplitudes C and A, in the corresponding regions (respectively, for x > 0 and x < 0). This ratio is the potential step's transparency:
$$T \equiv \frac{I_C}{I_A} = \frac{k'|C|^2}{k|A|^2} = \frac{4kk'}{(k + k')^2} = \frac{4[E(E - U_0)]^{1/2}}{[E^{1/2} + (E - U_0)^{1/2}]^2}. \tag{2.67}$$
13 Our earlier discarding of the particular solution exp{κx}, now becoming exp{-ik'x}, is still valid, but now on different grounds: this term would describe a wave packet incident on the potential step from the right, and this is not the problem under our current consideration.
14 These formulas are completely similar to those describing the partial reflection of classical waves from a sharp interface between two uniform media, at normal incidence (see, e.g., CM Sec. 6.4 and EM Sec. 7.4), with the effective impedance Z of de Broglie waves being proportional to their wave number k.
(The parameter T so defined is called the transparency of the system, in our current case of the potential
step of height U0, at particle’s energy E.) The result given by Eq. (67) is plotted in Fig. 5a as a function
of the U0/E ratio. Note its most important features:
(i) At U0 = 0, the transparency is full, T = 1 – naturally, because there is no step at all.
(ii) At U0 → E, the transparency drops to zero, giving a proper connection to the case E < U0.
(iii) Nothing in our solution's procedure prevents us from using Eq. (67) even for U0 < 0, i.e. for the step-down (or “cliff”) potential profile – see Fig. 5b. Very counter-intuitively, the particle is (partly) reflected even from such a cliff, and the transmission diminishes (though rather slowly) at U0 → –∞.
Fig. 2.5. (a) The transparency of a potential step with U0 < E as a function of its height, according to Eq. (67), and (b) the “cliff” potential profile, with U0 < 0. [Panel (a): T versus U0/E; panel (b): sketch of the incident (A), reflected (B), and transmitted (C) waves at the cliff.]
The most important conceptual conclusion of this analysis is that the quantum particle is partly
reflected from a potential step with U0 < E, in the sense that there is a non-zero probability T < 1 to find
it passed over the step, while there is also some probability, (1 – T) > 0, to have it reflected.
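The three features listed above may be reproduced with a few lines of code. The sketch below evaluates the last form of Eq. (67), in which ħ and m cancel; the function name and the sample values of U0/E are illustrative assumptions.

```python
import math

def step_transparency(energy, u0):
    """Transparency of a sharp potential step, Eq. (2.67).
    For u0 >= energy the reflection is total, so 0.0 is returned."""
    if u0 >= energy:
        return 0.0
    root_e = math.sqrt(energy)          # proportional to k
    root_de = math.sqrt(energy - u0)    # proportional to k'
    return 4.0 * root_e * root_de / (root_e + root_de) ** 2

def step_reflectivity(energy, u0):
    """|B/A|^2 from Eq. (2.66), for u0 < energy."""
    root_e = math.sqrt(energy)
    root_de = math.sqrt(energy - u0)
    return ((root_e - root_de) / (root_e + root_de)) ** 2

for ratio in (0.0, 0.5, 0.99, -8.0):    # sample values of U0/E
    print(f"U0/E = {ratio:5.2f}:  T = {step_transparency(1.0, ratio):.4f}")
```

For the cliff with U0 = –8E the result is T = 0.75, i.e. a quarter of the incident probability current is reflected even though the step goes down.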
The last property is exhibited, but for any relation between E and U0, by another simple potential
profile U(x), the famous potential (or “tunnel”) barrier. Fig. 6 shows its simple, “rectangular” version:
$$U(x) = \begin{cases} 0, & \text{for } x < -d/2, \\ U_0, & \text{for } -d/2 < x < +d/2, \\ 0, & \text{for } +d/2 < x. \end{cases} \tag{2.68}$$
To analyze this problem, it is sufficient to look for the solution to the Schrödinger equation in the form (55) at x ≤ –d/2. At x > +d/2, i.e., behind the barrier, we may use the arguments presented above (no wave source on the right!) to keep just one traveling wave, now with the same wave number:
$$\psi_+(x) = Fe^{ikx}. \tag{2.69}$$
However, under the barrier, i.e. at –d/2 ≤ x ≤ +d/2, we should generally keep both exponential terms,
$$\psi_b(x) = Ce^{-\kappa x} + De^{+\kappa x}, \tag{2.70}$$
because our previous argument, used in the potential step problem's solution, is no longer valid. (Here k and κ are still defined, respectively, by Eqs. (54) and (57).) In order to express the coefficients B, C, D, and F via the amplitude A of the incident wave, we need to plug these solutions into the boundary conditions similar to Eqs. (61), but now at two boundary points, x = ±d/2.
Fig. 2.6. A rectangular potential barrier, and the de Broglie waves taken into account in its analysis. [Sketch: waves A and B at x < –d/2, C and D under the barrier of height U0, and F at x > +d/2; the energy level E lies below U0.]
Solving the resulting system of 4 linear equations, we get four ratios B/A, C/A, etc.; in particular,
$$\frac{F}{A} = \left[\cosh\kappa d + \frac{i}{2}\left(\frac{\kappa}{k} - \frac{k}{\kappa}\right)\sinh\kappa d\right]^{-1}e^{-ikd}, \tag{2.71a}$$
and hence the rectangular tunnel barrier's transparency
$$T \equiv \left|\frac{F}{A}\right|^2 = \left[\cosh^2\kappa d + \left(\frac{\kappa^2 - k^2}{2\kappa k}\right)^2\sinh^2\kappa d\right]^{-1}. \tag{2.71b}$$
So, quantum mechanics indeed allows particles with energies E < U0 to pass “through” the
potential barrier – see Fig. 6 again. This is the famous effect of quantum-mechanical tunneling. Fig. 7a
shows the barrier transparency as a function of the particle energy E, for several characteristic values of
its thickness d, or rather of the ratio d/δ, with δ defined by Eq. (59).
Fig. 2.7. The transparency of a rectangular potential barrier as a function of the particle's energy E. [Panel (a): T versus E/U0, on a linear scale; panel (b): T, on a logarithmic scale, versus (1 – E/U0)^{1/2}; the curves are labeled with values of d/δ between 0.3 and 30.]
The plots show that generally, the transparency grows gradually with the particle’s energy. This
growth is natural because the penetration constant κ decreases with the growth of E, i.e., the
wavefunction penetrates more and more into the barrier, so that more and more of it is “picked up” at
the second interface (x = +d/2) and transferred into the wave Fexp{ikx} propagating behind the barrier.
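The dependence described above may be explored with the following sketch of Eq. (71), in units with ħ = m = 1 and U0 = 1 (assumptions of this example). Here κ is computed as a complex number, so that the same expression also covers E > U0, where κ becomes imaginary in the spirit of Eq. (65); the point E = U0 itself is excluded because it would lead to a division by zero.

```python
import cmath
import math

def barrier_transparency(energy, u0, d, hbar=1.0, m=1.0):
    """T = |F/A|^2 from Eq. (2.71a), for energy > 0 and energy != u0."""
    k = math.sqrt(2.0 * m * energy) / hbar
    kappa = cmath.sqrt(2.0 * m * (u0 - energy)) / hbar   # imaginary if E > U0
    denominator = (cmath.cosh(kappa * d)
                   + 0.5j * (kappa / k - k / kappa) * cmath.sinh(kappa * d))
    return 1.0 / abs(denominator) ** 2

u0 = 1.0
delta = 1.0 / math.sqrt(2.0 * u0)   # Eq. (2.59) with hbar = m = 1
d = 3.0 * delta                     # barrier thickness d = 3*delta

for energy in (0.2, 0.5, 0.8, 1.5):
    print(f"E/U0 = {energy:.1f}:  T = {barrier_transparency(energy, u0, d):.4f}")

# Energy above U0 at which exactly one half-wave fits under the barrier
e_resonance = u0 + (math.pi / d) ** 2 / 2.0
print(f"E/U0 = {e_resonance:.3f}:  T = {barrier_transparency(e_resonance, u0, d):.6f}")
```

The last line corresponds to the first of the energies E > U0 at which the reflection vanishes completely.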
Now let us consider the important limit of a very thin and high rectangular barrier, d << δ, E << U0, giving k << κ << 1/d. In this limit, Eq. (71) yields
$$T \equiv \left|\frac{F}{A}\right|^2 \to \frac{1}{|1 + i\alpha|^2} = \frac{1}{1 + \alpha^2}, \quad \text{where } \alpha \equiv \frac{1}{2}\left(\frac{\kappa^2 - k^2}{k}\right)d \approx \frac{1}{2}\frac{\kappa^2 d}{k} \approx \frac{m}{\hbar^2 k}U_0 d. \tag{2.72}$$
The last product, U0d, is just the “energy area” (or the “weight”)
$$W \equiv \int_{U(x) > E} U(x)\,dx \tag{2.73}$$
of the barrier. This fact implies that the very simple result (72) may be correct for a barrier of any shape, provided that it is sufficiently thin and high.
To confirm this guess, let us consider the tunneling problem for a very thin barrier with κd, kd << 1, approximating it with Dirac's δ-function (Fig. 8):
$$U(x) = W\delta(x), \tag{2.74}$$
so that the parameter W satisfies Eq. (73).
Fig. 2.8. A delta-functional potential barrier. [Sketch: waves A and B to the left, and F to the right, of the barrier U(x) = Wδ(x) at x = 0.]
The solutions of the tunneling problem at all points but x = 0 still may be taken in the form of
Eqs. (55) and (69), so we only need to analyze the boundary conditions at that point. However, due to
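the special character of the δ-function, we should be careful here. Indeed, instead of Eq. (60) we now get
$$\lim_{\varepsilon\to 0}\left(\frac{d\psi}{dx}\bigg|_{x=+\varepsilon} - \frac{d\psi}{dx}\bigg|_{x=-\varepsilon}\right) = \lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}\frac{d^2\psi}{dx^2}dx = \lim_{\varepsilon\to 0}\frac{2m}{\hbar^2}\int_{-\varepsilon}^{+\varepsilon}[U(x) - E]\,\psi\,dx = \frac{2m}{\hbar^2}W\psi(0). \tag{2.75}$$
According to this relation, at a finite W, the derivatives dψ/dx are also finite, so that the wavefunction itself is still continuous:
$$\lim_{\varepsilon\to 0}\left(\psi\big|_{x=+\varepsilon} - \psi\big|_{x=-\varepsilon}\right) = \lim_{\varepsilon\to 0}\int_{-\varepsilon}^{+\varepsilon}\frac{d\psi}{dx}dx = 0. \tag{2.76}$$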
Using these two boundary conditions, we readily get the following system of two linear equations,
$$A + B = F, \quad ikF - (ikA - ikB) = \frac{2mW}{\hbar^2}F, \tag{2.77}$$
whose solution yields
$$\frac{B}{A} = \frac{-i\alpha}{1 + i\alpha}, \quad \frac{F}{A} = \frac{1}{1 + i\alpha}, \quad \text{where } \alpha \equiv \frac{mW}{\hbar^2 k}. \tag{2.78}$$
(Taking Eq. (73) into account, this definition of α coincides with that in Eq. (72).) For the barrier transparency T ≡ |F/A|², this result again gives the first of Eqs. (72), which is therefore general for such thin barriers. That formula may be recast to give the following simple expression for the thin barrier's transparency (valid only for E << Umax):
$$T = \frac{1}{1 + \alpha^2} \equiv \frac{E}{E + E_0}, \quad \text{where } E_0 \equiv \frac{mW^2}{2\hbar^2}, \tag{2.79}$$
which shows that as energy becomes larger than the constant E0, the transparency approaches 1.
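The claim that Eq. (79) is the thin-barrier limit of Eq. (71b) may be verified numerically. In the sketch below (units with ħ = m = 1; the barrier parameters are hypothetical), a rectangular barrier of height 10^4 and thickness 10^-4 has the same weight W = 1 as the δ-functional one.

```python
import math

def delta_barrier_transparency(energy, weight, hbar=1.0, m=1.0):
    """Thin-barrier transparency, Eq. (2.79)."""
    e0 = m * weight**2 / (2.0 * hbar**2)
    return energy / (energy + e0)

def rectangular_transparency(energy, u0, d, hbar=1.0, m=1.0):
    """Rectangular-barrier transparency, Eq. (2.71b), for 0 < E < U0."""
    k = math.sqrt(2.0 * m * energy) / hbar
    kappa = math.sqrt(2.0 * m * (u0 - energy)) / hbar
    asymmetry = (kappa**2 - k**2) / (2.0 * kappa * k)
    return 1.0 / (math.cosh(kappa * d) ** 2
                  + asymmetry**2 * math.sinh(kappa * d) ** 2)

u0, d = 1.0e4, 1.0e-4    # hypothetical thin and high barrier
weight = u0 * d          # its "energy area" W, Eq. (2.73)

for energy in (0.1, 0.5, 2.0):
    t_thin = delta_barrier_transparency(energy, weight)
    t_exact = rectangular_transparency(energy, u0, d)
    print(f"E = {energy:.1f}:  Eq. (79) gives {t_thin:.5f}, Eq. (71b) gives {t_exact:.5f}")
```

With these parameters E0 = 0.5, so that at E = 0.5 both expressions should give T close to 1/2.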
Now proceeding to another important limit of thick barriers (d >> δ), Eq. (71) shows that in this case, the transparency is dominated by what is called the tunnel exponent:
$$T \approx \left(\frac{4k\kappa}{k^2 + \kappa^2}\right)^2 e^{-2\kappa d} \tag{2.80}$$
– the behavior which may be clearly seen as the straight-line segments in semi-log plots (Fig. 7b) of T as a function of the combination (1 – E/U0)^{1/2}, which is proportional to κ – see Eq. (57). This
exponential dependence on the barrier thickness is the most important factor for various applications of
quantum-mechanical tunneling – from the field emission of electrons to vacuum15 to the scanning
tunneling microscopy.16 Note also very substantial negative implications of the effect for the electronic
technology progress, most importantly imposing limits on the so-called Dennard scaling of field-effect
transistors in semiconductor integrated circuits (which is the technological basis of the well-known
Moore’s law), due to the increase of tunneling both through the gate oxide and along the channel of the
transistors, from source to drain.17
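The accuracy of the thick-barrier asymptote (80) may be judged from the following sketch, which compares it with the exact Eq. (71b) in units with ħ = m = 1 and U0 = 1. The chosen energy and thicknesses are assumptions of this example.

```python
import math

def exact_transparency(energy, u0, d, hbar=1.0, m=1.0):
    """Rectangular-barrier transparency, Eq. (2.71b), for 0 < E < U0."""
    k = math.sqrt(2.0 * m * energy) / hbar
    kappa = math.sqrt(2.0 * m * (u0 - energy)) / hbar
    asymmetry = (kappa**2 - k**2) / (2.0 * kappa * k)
    return 1.0 / (math.cosh(kappa * d) ** 2
                  + asymmetry**2 * math.sinh(kappa * d) ** 2)

def thick_barrier_transparency(energy, u0, d, hbar=1.0, m=1.0):
    """Tunnel-exponent approximation, Eq. (2.80)."""
    k = math.sqrt(2.0 * m * energy) / hbar
    kappa = math.sqrt(2.0 * m * (u0 - energy)) / hbar
    return (4.0 * k * kappa / (k**2 + kappa**2)) ** 2 * math.exp(-2.0 * kappa * d)

u0, energy = 1.0, 0.5
delta = 1.0 / math.sqrt(2.0 * u0)    # Eq. (2.59) with hbar = m = 1

for ratio in (0.3, 3.0, 10.0):       # values of d/delta
    d = ratio * delta
    print(f"d/delta = {ratio:4.1f}:  exact T = {exact_transparency(energy, u0, d):.3e}, "
          f"Eq. (80) gives {thick_barrier_transparency(energy, u0, d):.3e}")
```

At d = 10δ the two results agree to a relative error of about 10^-6, while at d = 0.3δ the asymptote exceeds 1 and is clearly inapplicable.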
Finally, one more feature visible in Fig. 7a (for the case d = 3δ) is the oscillations of the transparency as a function of energy, at E > U0, with T = 1, i.e. the reflection completely vanishing, at
some points.18 This is our first glimpse at one more interesting quantum effect: resonant tunneling. This
effect will be discussed in more detail in Sec. 5 below, using another potential profile where it is more
clearly pronounced.
15 See, e.g., G. Fursey, Field Emission in Vacuum Microelectronics, Kluwer, New York, 2005.
16 See, e.g., G. Binnig and H. Rohrer, Helv. Phys. Acta 55, 726 (1982).
17 See, e.g., V. Sverdlov et al., IEEE Trans. on Electron Devices 50, 1926 (2003), and references therein. (A brief
discussion of the field-effect transistors, and literature for further reading, may be found in SM Sec. 6.4.)
18 Let me mention in passing the curious case of the potential well U(x) = –(ħ²/2ma²)ν(ν + 1)/cosh²(x/a), with any positive integer ν and any real a, which is reflection-free (T = 1) for the incident de Broglie wave of any energy E, and hence for any incident wave packet. Unfortunately, a proof of this fact would require more time/space than I can afford. (Note that it was first described in a 1930 paper by Paul Sophus Epstein, before the 1933 publication by G. Pöschl and E. Teller, which is responsible for the common name of this Pöschl-Teller potential.)
2.4. Motion in soft potentials

Before moving on to exploring other quantum-mechanical effects, let us see how the results
discussed in the previous section are modified in the opposite limit of the so-called soft (also called
“smooth”) potential profiles, like the one sketched in Fig. 3.19 The most efficient analytical tool to study
this limit is the so-called WKB (or “JWKB”, or “quasiclassical”) approximation developed by H. Jeffreys, G. Wentzel, A. Kramers, and L. Brillouin in 1925-27. In order to derive its 1D version, let us rewrite the Schrödinger equation (53) in a simpler form
$$\frac{d^2\psi}{dx^2} + k^2(x)\psi = 0, \tag{2.81}$$
where the local wave number k(x) is defined similarly to Eq. (65),
$$k^2(x) \equiv \frac{2m[E - U(x)]}{\hbar^2}; \tag{2.82}$$
except that now it may be a function of x. We already know that for k(x) = const, the fundamental
solutions of this equation are Aexp{+ikx} and Bexp{-ikx}, which may be represented in a single form
$$\psi(x) = e^{i\Phi(x)}, \tag{2.83}$$
where Φ(x) is a complex function, in these two simplest cases being equal, respectively, to (kx – i ln A) and (–kx – i ln B). This is why we may try to use Eq. (83) to look for a solution of Eq. (81) even in the general case, k(x) ≠ const. Differentiating Eq. (83) twice, we get
$$\frac{d\psi}{dx} = i\frac{d\Phi}{dx}e^{i\Phi}, \quad \frac{d^2\psi}{dx^2} = \left[i\frac{d^2\Phi}{dx^2} - \left(\frac{d\Phi}{dx}\right)^2\right]e^{i\Phi}. \tag{2.84}$$
Plugging the last expression into Eq. (81) and requiring the factor before exp{iΦ(x)} to vanish, we get
$$i\frac{d^2\Phi}{dx^2} - \left(\frac{d\Phi}{dx}\right)^2 + k^2(x) = 0. \tag{2.85}$$
This is still an exact, general equation. At first sight, it looks harder to solve than the initial equation (81), because Eq. (85) is nonlinear. However, it is ready for simplification in the limit when the potential profile is very soft, dU/dx → 0. Indeed, for a uniform potential, d²Φ/dx² = 0. Hence, in the so-called 0th approximation, Φ(x) → Φ0(x), we may try to keep that result, so that Eq. (85) is reduced to
$$\left(\frac{d\Phi_0}{dx}\right)^2 = k^2(x), \quad \text{i.e. } \frac{d\Phi_0}{dx} = \pm k(x), \quad i\Phi_0(x) = \pm i\int^x k(x')dx', \tag{2.86}$$
so that its general solution is a linear superposition of two functions (83), with Φ replaced with Φ0:
$$\psi_0(x) = A\exp\left\{+i\int^x k(x')dx'\right\} + B\exp\left\{-i\int^x k(x')dx'\right\}, \tag{2.87}$$
where the choice of the lower limits of integration affects only the constants A and B. The physical sense of this result is simple: it is a sum of the forward- and back-propagating de Broglie waves, with the coordinate-dependent local wave number k(x) that self-adjusts to the potential profile.
19 Quantitative conditions of the “softness” will be formulated later in this section.
Let me emphasize the non-trivial nature of this approximation.20 First, any attempt to address the
problem with the standard perturbation approach (say, ψ = ψ0 + ψ1 +…, with ψn proportional to the nth power of some small parameter) would fail for most potentials, because as Eq. (86) shows, even a slight but persisting deviation of U(x) from a constant leads to a gradual accumulation of the phase Φ0, impossible to describe by any small perturbation of ψ. Second, the dropping of the term d²Φ/dx² in Eq. (85) is not too easy to justify. Indeed, since we are committed to the “soft potential limit” dU/dx → 0, we should be ready to assume the characteristic length a of the spatial variation of Φ to be large, and neglect the terms that are the smallest ones in the limit a → ∞. However, both first terms in Eq. (85) are apparently of the same order in a, namely O(a^-2); why have we neglected just one of them?
The price we have paid for such a “sloppy” treatment is substantial: Eq. (87) does not satisfy the fundamental property of the Schrödinger equation solutions, the probability current's conservation. Indeed, since Eq. (81) describes a fixed-energy (stationary) spatial part of the general Schrödinger equation, its probability density w ≡ ΨΨ* = ψψ* should not depend on time. Hence, according to Eq. (6), we should have I(x) = const. However, this is not true for any component of Eq. (87); for example for the first, forward-propagating component on its right-hand side, Eq. (5) yields
$$I_0(x) = \frac{\hbar}{m}|A|^2 k(x), \tag{2.88}$$
evidently not a constant if k(x) ≠ const. The brilliance of the WKB theory is that the problem may be fixed without a full revision of the 0th approximation, just by amending it. Indeed, let us explore the next, 1st approximation:
$$\Phi(x) \to \Phi_{WKB}(x) \equiv \Phi_0(x) + \Phi_1(x), \tag{2.89}$$
where Φ0 still obeys Eq. (86), while Φ1 describes a 0th approximation's correction that is small in the following sense:21
$$\left|\frac{d\Phi_1}{dx}\right| \ll \left|\frac{d\Phi_0}{dx}\right| = k(x). \tag{2.90}$$
Plugging Eq. (89) into Eq. (85), with the account of the definition (86), we get
$$i\left(\frac{d^2\Phi_0}{dx^2} + \frac{d^2\Phi_1}{dx^2}\right) - \frac{d\Phi_1}{dx}\left(2\frac{d\Phi_0}{dx} + \frac{d\Phi_1}{dx}\right) = 0. \tag{2.91}$$
Using the condition (90), we may neglect d21/dx2 in comparison with d20/dx2 inside the first
parentheses, and d1/dx in comparison with 2d0/dx inside the second parentheses. As a result, we get
the following (still approximate!) result:
20 Philosophically, this space-domain method is very close to the time-domain van der Pol method in classical
mechanics, and the very similar rotating wave approximation (RWA) in quantum mechanics – see, e.g., CM Secs.
5.2-5.5, and also Secs. 6.5, 7.6, 9.2, and 9.4 of this course.
21 For certainty, I will use the discretion given by Eq. (82) to define k(x) as the positive root of its right-hand side.
dΦ1 i d 2  0
dx

2 dx 2
d 0 i d  d 0  i d
dx
  ln
2 dx 

dx  2 dx dx

ln k ( x)  i d ln k 1 / 2 ( x) ,  (2.92)
x
1
iΦ WKB  iΦ 0  iΦ1  i  k ( x' )dx'  ln 1/ 2
, (2.93)
k ( x)
a  x  b  x  WKB
 WKB ( x)  1/ 2
exp i k ( x' ) dx'   1/ 2
exp  i  k ( x' )dx' . for k 2 x   0. (2.94) wave-
k ( x)   k ( x)   function
(Again, the lower integration limit is arbitrary, because its choice may be incorporated into the complex
constants a and b.) This modified approximation overcomes the problem of current continuity; for
example, for the forward-propagating wave, Eq. (5) gives
$$
% margin label: WKB probability current
I_{WKB}(x) = \frac{\hbar}{m} |a|^2 = \text{const}. \tag{2.95}
$$
Physically, the factor k^{1/2} in the denominator of the WKB wavefunction’s pre-exponent is easy to
understand. The smaller the local group velocity (32) of the wave packet, vgr(x) = ħk(x)/m, the “easier”
(more probable) it should be to find the particle within a certain interval dx. This is exactly the result
that the WKB approximation gives: w(x) = ψψ* ∝ 1/k(x) ∝ 1/vgr. Another value of the 1st approximation
is a clarification of the WKB theory’s validity condition: it is given by Eq. (90). Plugging into this
relation the first form of Eq. (92), and estimating |d²Φ0/dx²| as |dΦ0/dx|/a, where a is the spatial scale of
a substantial change of |dΦ0/dx| = k(x), we may write the condition as
$$
% margin label: WKB, first condition of validity
ka \gg 1. \tag{2.96}
$$
In plain English, this means that the region where U(x), and hence k(x), change substantially should
contain many de Broglie wavelengths λ = 2π/k.
So far I have implied that k²(x) ∝ E – U(x) is positive, i.e. that the particle moves in the classically
accessible region. Now let us extend the WKB approximation to situations where the difference E –
U(x) may change sign, for example to the reflection problem sketched in Fig. 3. Just as we did for the
sharp potential step, we first need to find the appropriate solution in the classically forbidden region, in
this case for x > xc. For that, there is again no need to redo our calculations, because they are still valid if
we, just as in the sharp-step problem, take k(x) = iκ(x), where
$$ \kappa^2(x) \equiv \frac{2m[U(x) - E]}{\hbar^2} > 0, \quad \text{for } x > x_c, \tag{2.97} $$
and keep just one of two possible solutions (with κ > 0), in analogy with Eq. (58). The result is
$$ \psi_{WKB}(x) = \frac{c}{\kappa^{1/2}(x)} \exp\left\{ -\int^x \kappa(x')\,dx' \right\}, \quad \text{for } k^2 < 0, \text{ i.e. } \kappa^2 > 0, \tag{2.98} $$
with the lower limit at some point with κ² > 0 as well. This is a really wonderful formula! It describes
the quantum-mechanical penetration of the particle into the classically forbidden region and provides a
natural generalization of Eq. (58) – leaving intact our estimates of the depth δ ~ 1/κ of such penetration.
Now we have to do what we have done for the sharp-step problem in Sec. 2: use the boundary
conditions at the classical turning point x = xc to relate the constants a, b, and c. However, now this
operation is a tad more complex, because both WKB functions (94) and (98) diverge, albeit weakly, at
the point, because here both k(x) and κ(x) tend to zero. This connection problem may be solved in the
following way.22
Let us use our commitment to the potential’s “softness”, assuming that it allows us to keep just
two leading terms in the Taylor expansion of the function U(x) at the point xc:
$$ U(x) \approx U(x_c) + \left. \frac{dU}{dx} \right|_{x = x_c} (x - x_c) \equiv E + \left. \frac{dU}{dx} \right|_{x = x_c} (x - x_c). \tag{2.99} $$
Using this truncated expansion, and introducing the following dimensionless variable for the
coordinate’s deviation from the classical turning point,
$$ \zeta \equiv \frac{x - x_c}{x_0}, \quad \text{with } x_0 \equiv \left[ \frac{\hbar^2}{2m \, (dU/dx)_{x = x_c}} \right]^{1/3}, \tag{2.100} $$
we reduce the Schrödinger equation (81) to the so-called Airy equation
$$
% margin label: Airy equation
\frac{d^2\psi}{d\zeta^2} - \zeta\psi = 0. \tag{2.101}
$$
This simple linear, ordinary, homogeneous differential equation of the second order has been very well
studied. Its general solution may be represented as a linear combination of two fundamental solutions,
the Airy functions Ai(ζ) and Bi(ζ), shown in Fig. 9a.23
Fig. 2.9. (a) The Airy functions Ai and Bi, and (b) the WKB approximation for the function Ai(ζ).
22 An alternative way to solve the connection problem, without involving the Airy functions but using an
analytical extension of WKB formulas to the complex-argument plane, may be found, e.g., in Sec. 47 of the
textbook by L. Landau and E. Lifshitz, Quantum Mechanics, Non-Relativistic Theory, 3rd ed. Pergamon, 1977.
23 Note the following (exact) integral formulas,
$$ \mathrm{Ai}(\zeta) = \frac{1}{\pi} \int_0^\infty \cos\left( \frac{\xi^3}{3} + \zeta\xi \right) d\xi, \qquad \mathrm{Bi}(\zeta) = \frac{1}{\pi} \int_0^\infty \left[ \exp\left( -\frac{\xi^3}{3} + \zeta\xi \right) + \sin\left( \frac{\xi^3}{3} + \zeta\xi \right) \right] d\xi, $$
frequently more convenient for practical calculations of the Airy functions than the differential equation (101).
The latter function diverges at ζ → +∞, and thus is not suitable for our current problem (Fig. 3),
while the former function has the following asymptotic behaviors at |ζ| >> 1:
$$
\mathrm{Ai}(\zeta) \to \frac{1}{\pi^{1/2} |\zeta|^{1/4}} \times
\begin{cases}
\dfrac{1}{2} \exp\left\{ -\dfrac{2}{3} \zeta^{3/2} \right\}, & \text{for } \zeta \to +\infty, \\
\sin\left\{ \dfrac{2}{3} (-\zeta)^{3/2} + \dfrac{\pi}{4} \right\}, & \text{for } \zeta \to -\infty.
\end{cases}
\tag{2.102}
$$
Now let us apply the WKB approximation to the Airy equation (101). Taking the classical
turning point (ζ = 0) for the lower limit, for ζ > 0 we get
$$ \kappa^2(\zeta) = \zeta, \quad \kappa(\zeta) = \zeta^{1/2}, \quad \int_0^\zeta \kappa(\zeta')\,d\zeta' = \frac{2}{3} \zeta^{3/2}, \tag{2.103} $$
i.e. exactly the exponent in the top line of Eq. (102). Making a similar calculation for ζ < 0, with the
natural assumption |b| = |a| (full reflection from the potential step), we arrive at the following result:
$$
\mathrm{Ai}_{WKB}(\zeta) = \frac{1}{|\zeta|^{1/4}} \times
\begin{cases}
c' \exp\left\{ -\dfrac{2}{3} \zeta^{3/2} \right\}, & \text{for } \zeta > 0, \\
a' \sin\left\{ \dfrac{2}{3} (-\zeta)^{3/2} + \varphi \right\}, & \text{for } \zeta < 0.
\end{cases}
\tag{2.104}
$$
This approximation differs from the exact solution at small values of |ζ|, i.e. close to the classical turning
point – see Fig. 9b. However, at |ζ| >> 1, Eqs. (104) reproduce the asymptotic behavior (102) of the Airy
function exactly (up to a common constant factor), provided that
$$
% margin label: WKB connection formulas
\varphi = \frac{\pi}{4}, \quad c' = \frac{a'}{2}. \tag{2.105}
$$
These connection formulas may be used to rewrite Eq. (104) as
$$
\mathrm{Ai}_{WKB}(\zeta) = \frac{a'}{|\zeta|^{1/4}} \times
\begin{cases}
\dfrac{1}{2} \exp\left\{ -\dfrac{2}{3} \zeta^{3/2} \right\}, & \text{for } \zeta > 0, \\
\dfrac{1}{2i} \left[ \exp\left\{ +i\dfrac{2}{3} (-\zeta)^{3/2} + i\dfrac{\pi}{4} \right\} - \exp\left\{ -i\dfrac{2}{3} (-\zeta)^{3/2} - i\dfrac{\pi}{4} \right\} \right], & \text{for } \zeta < 0,
\end{cases}
\tag{2.106}
$$
and hence may be described by the following two simple mnemonic rules:
(i) If the classical turning point is taken for the lower limit in the WKB integrals in the
classically allowed and the classically forbidden regions, then the moduli of the quasi-amplitudes of the
exponents are equal.
(ii) Reflecting from a “soft” potential step, the wavefunction acquires an additional phase shift
Δφ = π/2, if compared with its reflection from a “hard”, infinitely high potential wall located at point xc
(for which, according to Eq. (63) with δ = 0, we have B = –A).
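The quality of the asymptotic forms behind these rules is easy to probe numerically. The following minimal sketch (not part of the original notes) integrates the Airy equation (101) by the standard 4th-order Runge-Kutta method, starting from the tabulated values Ai(0) and Ai'(0), and compares the result with Eqs. (104)-(105) for a' = π^{-1/2}. The function names and the step size are illustrative assumptions.

```python
import math

AI_0 = 0.355028053887817         # Ai(0), standard tabulated value
AI_PRIME_0 = -0.258819403792807  # Ai'(0), standard tabulated value


def airy_ai(zeta, steps_per_unit=2000):
    """Ai(zeta) from an RK4 integration of psi'' = zeta * psi, Eq. (2.101)."""
    n_steps = max(1, int(abs(zeta) * steps_per_unit))
    h = zeta / n_steps  # the step is negative for zeta < 0
    z, psi, dpsi = 0.0, AI_0, AI_PRIME_0
    for _ in range(n_steps):
        k1p, k1d = dpsi, z * psi
        k2p, k2d = dpsi + 0.5 * h * k1d, (z + 0.5 * h) * (psi + 0.5 * h * k1p)
        k3p, k3d = dpsi + 0.5 * h * k2d, (z + 0.5 * h) * (psi + 0.5 * h * k2p)
        k4p, k4d = dpsi + h * k3d, (z + h) * (psi + h * k3p)
        psi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        dpsi += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        z += h
    return psi


def airy_wkb(zeta):
    """WKB form (2.104) with phi = pi/4, c' = a'/2 and a' = pi**-0.5 (zeta != 0)."""
    prefactor = 1.0 / (math.sqrt(math.pi) * abs(zeta) ** 0.25)
    phase = (2.0 / 3.0) * abs(zeta) ** 1.5
    if zeta > 0:
        return 0.5 * prefactor * math.exp(-phase)
    return prefactor * math.sin(phase + math.pi / 4)


if __name__ == "__main__":
    for zeta in (-10.0, -3.0, -1.0, 1.0, 3.0):
        print(zeta, airy_ai(zeta), airy_wkb(zeta))
```

The RK4 integration toward large positive ζ is numerically unstable (any admixture of the growing function Bi is amplified), so this sketch should be trusted only for moderate positive arguments.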
In order for the connection formulas (105)-(106) to be valid, deviations from the linear
approximation (99) of the potential profile should be relatively small within the region where the WKB
approximation differs from the exact Airy function: |ζ| ~ 1, i.e. |x – xc| ~ x0. These deviations may be
estimated using the next term of the Taylor expansion, dropped in Eq. (99): (d²U/dx²)(x – xc)²/2. As a
result, the condition of validity of the connection formulas (i.e. of the “softness” of the reflecting
potential profile) may be expressed as |d²U/dx²|x0 << |dU/dx| at x ≈ xc (meaning the ~x0-wide vicinity
of the point xc). With the account of Eq. (100) for x0, this condition becomes
$$
% margin label: connection formulas' validity
\left| \frac{d^2U}{dx^2} \right|^3_{x \approx x_c} \ll \frac{2m}{\hbar^2} \left( \frac{dU}{dx} \right)^4_{x = x_c}. \tag{2.107}
$$
As an example of a very useful application of the WKB approximation, let us use the connection
formulas to calculate the energy spectrum of a 1D particle in a soft 1D potential well (Fig. 10).
Fig. 2.10. The WKB treatment of an eigenstate of a particle in a soft 1D potential well. (The sketch shows U(x), the energy level En, and the classical turning points xL and xR.)
As was discussed in Sec. 1.7, we may consider the standing wave describing an eigenfunction ψn
(corresponding to an eigenenergy En) as a sum of two traveling de Broglie waves going back and forth
between the walls, being sequentially reflected from each of them. Let us apply the WKB approximation
to such traveling waves. First, according to Eq. (94), propagating from the left classical turning point xL
to the right such point xR, the wave acquires the phase change
$$ \Delta\varphi_{\to} = \int_{x_L}^{x_R} k(x)\,dx. \tag{2.108} $$
At the reflection from the soft wall at xR, according to the mnemonic rule (ii), the wave acquires an
additional shift π/2. Now, traveling back from xR to xL, the wave gets a shift similar to one given by Eq.
(108): Δφ← = Δφ→. Finally, at the reflection from xL it gets one more π/2-shift. Summing up all these
contributions at the wave’s roundtrip, we may write the self-consistency condition (that the
wavefunction “catches its own tail with its teeth”) in the form
$$ \Delta\varphi_{total} \equiv \Delta\varphi_{\to} + \frac{\pi}{2} + \Delta\varphi_{\leftarrow} + \frac{\pi}{2} = 2 \int_{x_L}^{x_R} k(x)\,dx + \pi = 2\pi n, \quad \text{with } n = 1, 2, \ldots \tag{2.109} $$
Rewriting this result in terms of the particle’s momentum p(x) = ħk(x), we arrive at the so-called
Wilson-Sommerfeld (or “Bohr-Sommerfeld”) quantization rule
$$
% margin label: Wilson-Sommerfeld quantization rule
\oint_C p(x)\,dx = 2\pi\hbar \left( n - \frac{1}{2} \right), \tag{2.110}
$$
where the closed path C means the full period of classical motion.24
24 Note that at the motion in more than one dimension, a closed classical trajectory may have no classical turning
points. In this case, the constant ½, arising from the turns, should be dropped from Eq. (110) written for the
scalar product p(r)·dr – the so-called Bohr quantization rule. It was suggested by N. Bohr as early as 1913 as an
interpretation of Eq. (1.8) for the circular motion of the electron around the proton, while its 1D modification
(110) is due to W. Wilson (1915) and A. Sommerfeld (1916).
Let us see what this quantization rule gives for the very important particular case of a
quadratic potential profile of a harmonic oscillator of frequency ω0. In this case,
$$ U(x) = \frac{m\omega_0^2}{2} x^2, \tag{2.111} $$
and the classical turning points (where U(x) = E) are the roots of a simple equation
$$ \frac{m\omega_0^2}{2} x_c^2 = E_n, \quad \text{so that } x_R = \frac{1}{\omega_0} \left( \frac{2E_n}{m} \right)^{1/2} > 0, \quad x_L = -x_R < 0. \tag{2.112} $$
Due to the potential’s symmetry, the integration required by Eq. (110) is also simple:
$$
\int_{x_L}^{x_R} p(x)\,dx = \int_{x_L}^{x_R} \left\{ 2m[E_n - U(x)] \right\}^{1/2} dx = (2mE_n)^{1/2} \, 2 \int_0^{x_R} \left( 1 - \frac{x^2}{x_R^2} \right)^{1/2} dx
$$
$$
= (2mE_n)^{1/2} \, 2x_R \int_0^1 (1 - \xi^2)^{1/2} d\xi = (2mE_n)^{1/2} \, 2x_R \frac{\pi}{4} \equiv \frac{\pi E_n}{\omega_0}, \tag{2.113}
$$
so that Eq. (110) yields
$$ E_n = \hbar\omega_0 \left( n' + \frac{1}{2} \right), \quad \text{with } n' \equiv n - 1 = 0, 1, 2, \ldots \tag{2.114} $$
To estimate the validity of this result, we have to check the condition (96) at all points of the
classically allowed region, and Eq. (107) at the turning points. The checkup shows that both conditions
are valid only for n >> 1. However, we will see in Sec. 9 below that Eq. (114) is actually exactly correct
for all energy levels – thanks to special properties of the potential profile (111).
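The calculation (110)-(114) may be repeated numerically as a check. The sketch below (not part of the original notes) evaluates the integral of p(x) between the turning points (112) by the midpoint rule, using the substitution x = xR sin θ to remove the square-root behavior of the integrand at the endpoints, and then solves Eq. (110) for En by bisection. The dimensionless units ħ = m = ω0 = 1 and all function names are assumptions made for illustration.

```python
import math

HBAR = M = OMEGA0 = 1.0  # dimensionless units (an assumption of this sketch)


def potential(x):
    """Quadratic profile, Eq. (2.111)."""
    return 0.5 * M * OMEGA0 ** 2 * x ** 2


def momentum_integral(energy, n_pts=400):
    """Integral of p(x) dx from x_L to x_R, i.e. the left-hand side of Eq. (2.113)."""
    x_r = math.sqrt(2.0 * energy / M) / OMEGA0  # turning point, Eq. (2.112)
    d_theta = math.pi / n_pts
    total = 0.0
    for i in range(n_pts):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta
        x = x_r * math.sin(theta)
        p = math.sqrt(max(0.0, 2.0 * M * (energy - potential(x))))
        total += p * x_r * math.cos(theta) * d_theta  # dx = x_r cos(theta) d(theta)
    return total


def wkb_level(n, e_lo=1e-6, e_hi=100.0):
    """Solve the quantization rule (2.110) for E_n by bisection (n = 1, 2, ...)."""
    target = 2.0 * math.pi * HBAR * (n - 0.5)
    for _ in range(80):
        e_mid = 0.5 * (e_lo + e_hi)
        # The closed path C covers the interval [x_L, x_R] twice.
        if 2.0 * momentum_integral(e_mid) < target:
            e_lo = e_mid
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, wkb_level(n))  # to be compared with hbar*omega0*(n - 1/2)
```

For another soft potential well, the function potential() and the turning-point expression would have to be replaced accordingly; the bisection relies only on the integral growing monotonically with energy.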
Now let us use the mnemonic rule (i) to examine the particle’s penetration into the classically
forbidden region of an abrupt potential step of a height U0 > E. For this case, the rule, i.e. the second of
Eqs. (105), yields the following relation of the quasi-amplitudes in Eqs. (94) and (98): c = a/2. If we
now naively applied this relation to the sharp step sketched in Fig. 4, forgetting that it does not satisfy
Eq. (107), we would get the following relation of the full amplitudes, defined by Eqs. (55) and (58):
$$ \frac{C}{\kappa^{1/2}} = \frac{1}{2} \frac{A}{k^{1/2}}. \quad \text{(WRONG!)} \tag{2.115} $$
This result differs from the correct Eq. (63), and hence we may expect that the WKB approximation’s
prediction for more complex potentials, most importantly for tunneling through a soft potential barrier
(Fig. 11) should be also different from the exact result (71) for the rectangular barrier shown in Fig. 6.
Fig. 2.11. Tunneling through a soft 1D potential barrier. (The sketch shows the partial-wave amplitudes a, b, c, d, and f, the energy E, the barrier maximum Umax at x = xm, the classical turning points xc and xc', and the WKB width d_WKB ≡ xc' – xc.)
In order to analyze tunneling through such a soft barrier, we need (just as in the case of a
rectangular barrier) to take into consideration five partial waves, but now they should be taken in the
WKB form:
$$
\psi_{WKB} =
\begin{cases}
\dfrac{a}{k^{1/2}(x)} \exp\left\{ i \int^x k(x')\,dx' \right\} + \dfrac{b}{k^{1/2}(x)} \exp\left\{ -i \int^x k(x')\,dx' \right\}, & \text{for } x < x_c, \\
\dfrac{c}{\kappa^{1/2}(x)} \exp\left\{ -\int^x \kappa(x')\,dx' \right\} + \dfrac{d}{\kappa^{1/2}(x)} \exp\left\{ \int^x \kappa(x')\,dx' \right\}, & \text{for } x_c < x < x_c', \\
\dfrac{f}{k^{1/2}(x)} \exp\left\{ i \int^x k(x')\,dx' \right\}, & \text{for } x_c' < x,
\end{cases}
\tag{2.116}
$$
where the lower limits of integrals are arbitrary (each within the corresponding range of x). Since on the
right of the left classical turning point, we have two exponents rather than one, and on the right of the
second point, one traveling wave rather than two, the connection formulas (105) have to be generalized, using
asymptotic formulas not only for Ai(ζ), but also for the second Airy function, Bi(ζ). The analysis,
absolutely similar to that carried out above (though naturally a bit bulkier),25 gives a remarkably simple
result:
$$
% margin label: soft potential barrier, transparency
T_{WKB} \equiv \left| \frac{f}{a} \right|^2 = \exp\left\{ -2 \int_{x_c}^{x_c'} \kappa(x)\,dx \right\} \equiv \exp\left\{ -\frac{2}{\hbar} \int_{x_c}^{x_c'} \left\{ 2m[U(x) - E] \right\}^{1/2} dx \right\}, \tag{2.117}
$$
with the pre-exponential factor equal to 1 – the fact which might be readily expected from the mnemonic
rule (i) of the connection formulas.
This formula is broadly used in applied quantum mechanics, despite the approximate character
of its pre-exponential coefficient for insufficiently soft barriers that do not satisfy Eq. (107). For
example, Eq. (80) shows that for a rectangular barrier with thickness d >> δ, the WKB approximation
(117) with d_WKB = d underestimates T by a factor of [4kκ/(k² + κ²)]² – equal, for example, to 4, if k = κ, i.e.
if U0 = 2E. However, on the appropriate logarithmic scale (see Fig. 7b), such a factor, smaller than an
order of magnitude, is just a small correction.
Note also that when E approaches the barrier’s top Umax (Fig. 11), the points xc and xc’ merge, so
that according to Eq. (117), T_WKB → 1, i.e. the particle reflection vanishes at E = Umax. So, the WKB
approximation does not describe the effect of the over-barrier reflection at E > Umax. (This fact could be
noticed already from Eq. (95): in the absence of the classical turning points, the WKB probability
current is constant for any barrier profile.) This conclusion is incorrect even for apparently smooth
barriers where one could naively expect the WKB approximation to work perfectly. Indeed, near the
point x = xm where the potential reaches maximum (i.e. U(xm) = Umax), we may always approximate any
smooth function U(x) with the quadratic term of the Taylor expansion, i.e. with an inverted parabola:
$$ U(x) \approx U_{max} - \frac{m\omega_0^2 (x - x_m)^2}{2}. \tag{2.118} $$
25 For the most important case TWKB << 1, Eq. (117) may be simply derived from Eqs. (105)-(106) – the exercise
left for the reader.
Calculating derivatives dU/dx and d²U/dx² of this function and plugging them into the condition
(107), we may see that the WKB approximation is only valid if Umax – E >> ħω0. Just for the reader’s
reference, an exact analysis of tunneling through the barrier (118) gives the following Kemble formula:26
$$
% margin label: Kemble formula
T = \frac{1}{1 + \exp\left\{ -2\pi (E - U_{max}) / \hbar\omega_0 \right\}}, \tag{2.119}
$$
valid for any sign of the difference (E – Umax). This formula describes a gradual approach of T to 1, i.e.
a gradual reduction of reflection, at the particle energy’s increase, with T = ½ at E = Umax.
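The relation between the WKB result (117) and the Kemble formula (119) may be illustrated numerically. For the inverted parabola (118), the integral in Eq. (117) is of the same type as in Eq. (113) and gives T_WKB = exp{–2π(Umax – E)/ħω0}; the sketch below (not part of the original notes) evaluates this integral by the midpoint rule and compares the result with Eq. (119). The dimensionless units ħ = m = ω0 = 1 and the function names are assumptions made for illustration.

```python
import math

HBAR = M = OMEGA0 = 1.0  # dimensionless units (an assumption of this sketch)


def t_wkb(delta_u, n_pts=400):
    """Eq. (2.117) for the barrier (2.118); delta_u = Umax - E >= 0."""
    half_width = math.sqrt(2.0 * delta_u / M) / OMEGA0  # (x_c' - x_c) / 2
    d_theta = math.pi / n_pts
    integral = 0.0
    for i in range(n_pts):
        theta = -0.5 * math.pi + (i + 0.5) * d_theta
        x = half_width * math.sin(theta)  # coordinate measured from x_m
        u_minus_e = delta_u - 0.5 * M * OMEGA0 ** 2 * x ** 2
        kappa = math.sqrt(max(0.0, 2.0 * M * u_minus_e)) / HBAR
        integral += kappa * half_width * math.cos(theta) * d_theta
    return math.exp(-2.0 * integral)


def t_kemble(delta_u):
    """Eq. (2.119), written via delta_u = Umax - E (of any sign)."""
    return 1.0 / (1.0 + math.exp(2.0 * math.pi * delta_u / (HBAR * OMEGA0)))


if __name__ == "__main__":
    for delta_u in (0.0, 0.1, 0.5, 1.0, 2.0):
        print(delta_u, t_wkb(delta_u), t_kemble(delta_u))
```

In these units, the two results virtually coincide at Umax – E >> ħω0, while at E → Umax the WKB formula tends to 1 instead of the exact value ½.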
The last remark of this section: the WKB approximation opens a straight way toward an
alternative formulation of quantum mechanics, based on the Feynman path integral. However, I will
postpone its discussion until a more compact notation has been introduced in Chapter 4.
2.5. Resonant tunneling, and metastable states

Now let us move to other, conceptually different quantum effects, taking place in more elaborate
potential profiles. Neither piecewise-constant nor smooth-potential models of U(x) are convenient for
their quantitative description because they both require “stitching” partial de Broglie waves at each
classical turning point, which may lead to cumbersome calculations. However, we may get a very good
insight of the physics of quantum effects that may take place in such profiles, using their approximation
by sets of Dirac’s delta functions.
Additional help in studying such effects is provided by the notions of the scattering and transfer
matrices, very useful for other cases as well. Consider an arbitrary but finite-length potential “bump”
(formally called a scatterer), localized somewhere between points x1 and x2, on the flat potential
background, say U = 0 (Fig. 12).
Fig. 2.12. De Broglie wave amplitudes near a single 1D scatterer. (The sketch shows the amplitudes A1, B1 on the left of the scatterer and A2, B2 on its right, at the energy E.)
From Sec. 2, we know that the general solutions of the stationary Schrödinger equation, with a
certain energy E, outside the interval [x1, x2] are sets of two sinusoidal waves, traveling in the opposite
directions. Let us represent them in the form
$$ \psi_j = A_j e^{ik(x - x_j)} + B_j e^{-ik(x - x_j)}, \tag{2.120} $$
26 This formula was derived (in a more general form, valid for an arbitrary soft potential barrier) by E. Kemble in
1935. In some communities, it is known as the “Hill-Wheeler formula”, after D. Hill and J. Wheeler’s 1953 paper
in which the Kemble formula was spelled out for the quadratic profile (118). Note that mathematically Eq. (119) is
similar to the Fermi distribution in statistical physics, with an effective temperature Tef = ħω0/2πkB. This
coincidence has some curious implications for the Fermi particle tunneling statistics.
where the index j (for now) is equal to either 1 or 2, and (ħk)²/2m = E. Note that each of the two wave
pairs (120) has, in this notation, its own reference point xj, because this is very convenient for what
follows. As we have already discussed, if the de Broglie wave/particle is incident from the left (i.e. B2 =
0), the solution of the linear Schrödinger equation within the scatterer range (x1 < x < x2) can provide
only linear expressions for the transmitted (A2) and reflected (B1) wave amplitudes via the incident wave
amplitude A1:
$$ A_2 = S_{21} A_1, \quad B_1 = S_{11} A_1, \tag{2.121} $$
where S11 and S21 are certain (generally, complex) coefficients. Alternatively, if a wave, with amplitude
B2, is incident on the scatterer from the right (i.e. if A1 = 0), it can induce a transmitted wave (B1) and a
reflected wave (A2), with amplitudes
$$ B_1 = S_{12} B_2, \quad A_2 = S_{22} B_2, \tag{2.122} $$
where the coefficients S22 and S12 are generally different from S11 and S21. Now we can use the linear
superposition principle to argue that if the waves A1 and B2 are simultaneously incident on the scatterer
(say, because the wave B2 has been partly reflected back by some other scatterer located at x > x2), the
resulting scattered wave amplitudes A2 and B1 are just the sums of their values for separate incident
waves:
$$
\begin{aligned}
B_1 &= S_{11} A_1 + S_{12} B_2, \\
A_2 &= S_{21} A_1 + S_{22} B_2.
\end{aligned}
\tag{2.123}
$$
These linear relations may be conveniently represented using the so-called scattering matrix S:
$$
% margin label: scattering matrix, definition
\begin{pmatrix} B_1 \\ A_2 \end{pmatrix} = \mathrm{S} \begin{pmatrix} A_1 \\ B_2 \end{pmatrix}, \quad \text{with } \mathrm{S} \equiv \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}. \tag{2.124}
$$
Scattering matrices, duly generalized, are an important tool for the analysis of wave scattering in more
dimensions than one; for 1D problems, however, another matrix is often more convenient to represent
the same linear relations (123). Indeed, let us solve this system for A2 and B2. The result is
$$
% margin label: transfer matrix, definition
\begin{aligned}
A_2 &= T_{11} A_1 + T_{12} B_1, \\
B_2 &= T_{21} A_1 + T_{22} B_1,
\end{aligned}
\quad \text{i.e. } \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} = \mathrm{T} \begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \tag{2.125}
$$
where T is the transfer matrix, with the following elements:
$$ T_{11} = S_{21} - \frac{S_{11} S_{22}}{S_{12}}, \quad T_{12} = \frac{S_{22}}{S_{12}}, \quad T_{21} = -\frac{S_{11}}{S_{12}}, \quad T_{22} = \frac{1}{S_{12}}. \tag{2.126} $$
The matrices S and T have some universal properties, valid for an arbitrary (but time-
independent) scatterer; they may be readily found from the probability current conservation and the
time-reversal symmetry of the Schrödinger equation. Let me leave finding these relations for the
reader’s exercise. The results show, in particular, that the scattering matrix may be rewritten in the
following form:
$$ \mathrm{S} = e^{i\theta} \begin{pmatrix} r e^{i\varphi} & t \\ t & -r e^{-i\varphi} \end{pmatrix}, \tag{2.127a} $$
where four real parameters r, t, θ, and φ satisfy the following universal relation:
$$ r^2 + t^2 = 1, \tag{2.127b} $$
so that only 3 of these parameters are independent. As a result of this symmetry, T11 may be also
represented in a simpler form, similar to T22: T11 = exp{iθ}/t ≡ 1/S12* ≡ 1/S21*. The last form allows a
ready expression of the scatterer’s transparency via just one coefficient of the transfer matrix:
$$ T \equiv \left| \frac{A_2}{A_1} \right|^2_{B_2 = 0} = |S_{21}|^2 = |T_{11}|^{-2}. \tag{2.128} $$
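These general properties are easy to verify numerically for any particular set of parameters. The sketch below (not part of the original notes) builds the scattering matrix in the form (127a)-(127b), checks its unitarity (which expresses the probability current conservation), and confirms the relations T11 = exp{iθ}/t and (128), using the element T11 from Eq. (126). The function names and the parameter values are illustrative assumptions; the sketch requires r < 1, so that t ≠ 0.

```python
import cmath


def s_matrix(r, theta, phi):
    """Scattering matrix in the form (2.127a), with t fixed by Eq. (2.127b)."""
    t = (1.0 - r * r) ** 0.5
    pref = cmath.exp(1j * theta)
    return ((pref * r * cmath.exp(1j * phi), pref * t),
            (pref * t, -pref * r * cmath.exp(-1j * phi)))


def transfer_t11(s):
    """Element T11 of the transfer matrix, from the first of Eqs. (2.126)."""
    (s11, s12), (s21, s22) = s
    return s21 - s11 * s22 / s12


def unitarity_defects(s):
    """Deviations of the first column's norm from 1 and of the column overlap from 0."""
    (s11, s12), (s21, s22) = s
    norm_defect = abs(s11) ** 2 + abs(s21) ** 2 - 1.0
    overlap = s11 * s12.conjugate() + s21 * s22.conjugate()
    return norm_defect, abs(overlap)


if __name__ == "__main__":
    s = s_matrix(0.6, 0.3, 1.1)
    t11 = transfer_t11(s)
    print(unitarity_defects(s))
    print(abs(s[1][0]) ** 2, abs(t11) ** -2)  # the two forms of T in Eq. (2.128)
```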
In our current context, the most important property of the 1D transfer matrices is that to find the
total transfer matrix T of a system consisting of several (say, N) sequential arbitrary scatterers (Fig. 13),
it is sufficient to multiply their matrices.
Fig. 2.13. A sequence of several 1D scatterers. (The sketch shows the wave amplitudes Aj, Bj at the points x1, x2, x3, …, xN+1.)
Indeed, extending the definition (125) to other points xj (j = 1, 2, …, N + 1), we can write
$$ \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} = \mathrm{T}_1 \begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \quad \begin{pmatrix} A_3 \\ B_3 \end{pmatrix} = \mathrm{T}_2 \begin{pmatrix} A_2 \\ B_2 \end{pmatrix} = \mathrm{T}_2 \mathrm{T}_1 \begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \quad \text{etc.} \tag{2.129} $$
(where the matrix indices correspond to the scatterers’ order on the x-axis), so that
$$ \begin{pmatrix} A_{N+1} \\ B_{N+1} \end{pmatrix} = \mathrm{T}_N \mathrm{T}_{N-1} \ldots \mathrm{T}_1 \begin{pmatrix} A_1 \\ B_1 \end{pmatrix}. \tag{2.130} $$
But we can also define the total transfer matrix similarly to Eq. (125), i.e. as
$$ \begin{pmatrix} A_{N+1} \\ B_{N+1} \end{pmatrix} \equiv \mathrm{T} \begin{pmatrix} A_1 \\ B_1 \end{pmatrix}, \tag{2.131} $$
so that comparing Eqs. (130) and (131) we get
$$
% margin label: transfer matrix, composite scatterer
\mathrm{T} = \mathrm{T}_N \mathrm{T}_{N-1} \ldots \mathrm{T}_1. \tag{2.132}
$$
This formula is valid even if the flat-potential gaps between component scatterers are shrunk to
zero, so that it may be applied to a scatterer with an arbitrary profile U(x), by fragmenting its length into
many small segments Δx = xj+1 – xj, and treating each fragment as a rectangular barrier of the average
height (Uj)ef = [U(xj+1) + U(xj)]/2 – see Fig. 14. Since very efficient numerical algorithms are readily
available for fast multiplication of matrices (especially as small as 2×2 in our case), this approach is
broadly used in practice for the computation of transparency of potential barriers with complicated
profiles U(x). (Computationally, this procedure is much more efficient than the direct numerical solution
of the stationary Schrödinger equation.)
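The fragmentation procedure may be sketched in a few lines of code. Note that the sketch below (not part of the original notes) uses a slightly different bookkeeping than Eq. (125): instead of the wave amplitudes (A, B), it propagates the pair (ψ, dψ/dx), which is continuous at the borders between segments with different (Uj)ef, so that the matrices of adjacent segments may be multiplied directly, in the order of Eq. (132). The transparency is then found by matching the result to the waves (120) outside the barrier, where U = 0 is assumed. The units ħ = m = 1 and all function names are assumptions made for illustration.

```python
import cmath
import math

HBAR = M = 1.0  # dimensionless units (an assumption of this sketch)


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return ((a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
            (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]))


def segment_matrix(k, d):
    """Propagate (psi, dpsi/dx) across a width-d segment with a constant wave number k.

    An imaginary k = i*kappa automatically describes a classically forbidden segment.
    """
    if abs(k) < 1e-12:
        return ((1.0, d), (0.0, 1.0))
    c, s = cmath.cos(k * d), cmath.sin(k * d)
    return ((c, s / k), (-k * s, c))


def transparency(u, x1, x2, energy, n_seg=2000):
    """Transparency of the profile u(x) on [x1, x2], with U = 0 outside this segment."""
    dx = (x2 - x1) / n_seg
    total = ((1.0, 0.0), (0.0, 1.0))
    for j in range(n_seg):
        xa = x1 + j * dx
        u_ef = 0.5 * (u(xa) + u(xa + dx))  # average height of the fragment
        k = cmath.sqrt(2.0 * M * (energy - u_ef)) / HBAR
        total = matmul(segment_matrix(k, dx), total)  # later segments multiply from the left
    k0 = math.sqrt(2.0 * M * energy) / HBAR
    (m11, m12), (m21, m22) = total
    # Matching to A1, B1 at x1, and to A2 (with B2 = 0) at x2, gives A2/A1:
    t = 2j * k0 / (1j * k0 * (m11 + m22) + k0 ** 2 * m12 - m21)
    return abs(t) ** 2


if __name__ == "__main__":
    barrier = lambda x: 2.0 * math.exp(-x * x)  # an arbitrary smooth example profile
    for energy in (0.5, 1.0, 1.5, 2.5):
        print(energy, transparency(barrier, -5.0, 5.0, energy))
```

For a uniform profile, the product of the segment matrices reduces to the matrix of a single rectangular barrier, which gives a simple way to test such a routine against the well-known exact transparency of that barrier.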
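Fig. 2.14. The transfer matrix approach to a potential barrier with an arbitrary profile. (The sketch shows the profile U(x) fragmented into rectangular segments of height (Uj)ef between the points xj and xj+1, with the amplitudes A1, B1 and AN+1, BN+1 at the ends.)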
In order to apply this approach to several particular, conceptually important systems, let us
calculate the transfer matrices for a few elementary scatterers, starting from the delta-functional barrier
located at x = 0 – see Fig. 8. Taking x1, x2 → 0, we can merely change the notation of the wave
amplitudes in Eq. (78) to get
$$ S_{11} = \frac{-i\alpha}{1 + i\alpha}, \quad S_{21} = \frac{1}{1 + i\alpha}. \tag{2.133} $$
An absolutely similar analysis of the wave incidence from the right yields
$$ S_{22} = \frac{-i\alpha}{1 + i\alpha}, \quad S_{12} = \frac{1}{1 + i\alpha}, \tag{2.134} $$
and using Eqs. (126), we get
$$
% margin label: transfer matrix, short scatterer
\mathrm{T}_\alpha = \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix}. \tag{2.135}
$$
As a sanity check, Eq. (128) applied to this result, immediately brings us back to Eq. (79).
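The same sanity check may be done numerically. The sketch below (not part of the original notes) converts the scattering matrix elements (133)-(134) into the transfer matrix by solving the system (123) for A2 and B2, and compares the result with Eq. (135) and with the single-barrier transparency 1/(1 + α²). The function names and the value of α are illustrative assumptions.

```python
def s_to_t(s11, s12, s21, s22):
    """Transfer matrix elements obtained by solving Eqs. (2.123) for A2 and B2."""
    t11 = s21 - s11 * s22 / s12
    t12 = s22 / s12
    t21 = -s11 / s12
    t22 = 1.0 / s12
    return ((t11, t12), (t21, t22))


def delta_barrier_s(alpha):
    """Scattering matrix elements (S11, S12, S21, S22) of Eqs. (2.133)-(2.134)."""
    denom = 1.0 + 1j * alpha
    return (-1j * alpha / denom, 1.0 / denom, 1.0 / denom, -1j * alpha / denom)


def delta_barrier_t(alpha):
    """Transfer matrix of a delta-functional barrier, Eq. (2.135)."""
    return ((1.0 - 1j * alpha, -1j * alpha), (1j * alpha, 1.0 + 1j * alpha))


if __name__ == "__main__":
    alpha = 1.7
    print(s_to_t(*delta_barrier_s(alpha)))
    print(delta_barrier_t(alpha))
    print(abs(delta_barrier_t(alpha)[0][0]) ** -2, 1.0 / (1.0 + alpha ** 2))
```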
The next example may seem strange at first glance: what if there is no scatterer at all between
the points x1 and x2? If the points coincide, the answer is indeed trivial and can be obtained, e.g., from
Eq. (135) by taking W = 0, i.e. α = 0:
$$
% margin label: identity matrix
\mathrm{T}_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \equiv \mathrm{I} \tag{2.136}
$$
- the so-called identity matrix. However, we are free to choose the reference points x1,2 participating in
Eq. (120) as we wish. For example, what if x2 – x1 = a? Let us first take the forward-propagating wave
alone: B2 = 0 (and hence B1 = 0); then
$$ \psi_2 = \psi_1 = A_1 e^{ik(x - x_1)} \equiv A_1 e^{ik(x_2 - x_1)} e^{ik(x - x_2)}. \tag{2.137} $$
The comparison of this expression with the definition (120) for j = 2 shows that A2 = A1 exp{ik(x2 – x1)}
= A1 exp{ika}, i.e. T11 = exp{ika}. Repeating the calculation for the back-propagating wave, we see that
T22 = exp{-ika}, and since the space interval provides no particle reflection, we finally get
$$
% margin label: transfer matrix, spatial interval
\mathrm{T}_a = \begin{pmatrix} e^{ika} & 0 \\ 0 & e^{-ika} \end{pmatrix}, \tag{2.138}
$$
independently of a common shift of points x1 and x2. At a = 0, we naturally recover the special case
(136).
Now let us use these simple results to analyze the double-barrier system shown in Fig. 15. We
could of course calculate its properties as before, writing down explicit expressions for all five traveling
waves shown by arrows in Fig. 15, then using the boundary conditions (124) and (125) at each of points
x1,2 to get a system of four linear equations, and finally, solving it for four amplitude ratios.
Fig. 2.15. The double-barrier system: two delta-functional barriers, Wδ(x – x1) and Wδ(x – x2), separated by distance a. The dashed lines show (schematically) the quasi-levels of the metastable-state energies.
However, the transfer matrix approach simplifies the calculations, because we may immediately
use Eqs. (132), (135), and (138) to write
$$ \mathrm{T} = \mathrm{T}_\alpha \mathrm{T}_a \mathrm{T}_\alpha = \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix} \begin{pmatrix} e^{ika} & 0 \\ 0 & e^{-ika} \end{pmatrix} \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix}. \tag{2.139} $$
Let me hope that the reader remembers the “row by column” rule of the multiplication of square
matrices;27 using it for the last two matrices, we may reduce Eq. (139) to
$$ \mathrm{T} = \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix} \begin{pmatrix} (1 - i\alpha) e^{ika} & -i\alpha e^{ika} \\ i\alpha e^{-ika} & (1 + i\alpha) e^{-ika} \end{pmatrix}. \tag{2.140} $$
Now there is no need to calculate all elements of the full product T, because, according to Eq. (128), for
the calculation of the barrier’s transparency T we need only one of its elements, T11:
$$
% margin label: double barrier, transparency
T = \frac{1}{|T_{11}|^2} = \frac{1}{\left| \alpha^2 e^{-ika} + (1 - i\alpha)^2 e^{ika} \right|^2}. \tag{2.141}
$$
This result is somewhat similar to that following from Eq. (71) for E > U0: the transparency is a
π-periodic function of the product ka, reaching its maximum (T = 1) at some point of each period – see
Fig. 16a. However, Eq. (141) is different in that for α >> 1, the resonance peaks of the transparency are
very narrow, reaching their maxima at ka ≈ k_n a ≡ πn, with n = 1, 2, …
The physics of this resonant tunneling effect28 is the so-called constructive interference,
absolutely similar to that of electromagnetic waves (for example, light) in a Fabry-Perot resonator
formed by two parallel semi-transparent mirrors.29 Namely, the incident de Broglie wave may either
27 In an analytical form: $(AB)_{jj'} = \sum_{j''=1}^{N} A_{jj''} B_{j''j'}$, where N is the matrix rank (in our current case, N = 2).
28 In older literature, it is sometimes called the Townsend (or “Ramsauer-Townsend”) effect. However, it is more
common to use that term only for a similar effect at 3D scattering – to be discussed in Chapter 3.
29 See, e.g., EM Sec. 7.9.
tunnel through the two barriers or undertake, on its way, several sequential reflections from these semi-
transparent walls. At k = kn, i.e. at 2ka = 2kna = 2πn, the phase differences between all these partial
waves are multiples of 2π, so that they add up in phase – “constructively”. Note that the same
constructive interference of numerous reflections from the walls may be used to interpret the standing-
wave eigenfunctions (1.84), so that the resonant tunneling at α >> 1 may be also considered as a result
of the incident wave’s resonance induction of such a standing wave, with a very large amplitude, in the
space between the barriers, with the transmitted wave’s amplitude proportionately increased.
Fig. 2.16. Resonant tunneling through a potential well with delta-functional walls: (a) the system’s transparency T as a function of ka/π (for several values of α), and (b) calculating the resonance’s FWHM at α >> 1 (a vector diagram on the complex plane).
As a result of this resonance, the maximum transparency of the system is perfect (Tmax = 1) even
at α → ∞, i.e. in the case of very low transparency of each of the two component barriers. Indeed, the
denominator in Eq. (141) may be interpreted as the squared length of the difference between two 2D
vectors, one of length α², and another of length |(1 – iα)²| = 1 + α², with the angle θ = 2ka + const
between them – see Fig. 16b. At the resonance, the vectors are aligned, and their difference is smallest
(equal to 1) so that Tmax = 1. (This result is exact only if the two barriers are exactly equal.)
The same vector diagram may be used to calculate the so-called FWHM, a common acronym for
the Full Width [of the resonance curve at its] Half-Maximum. By definition, this is the difference Δk = k+
– k- between such two values of k, on the opposite slopes of the same resonance, at which T = Tmax/2 – see
the arrows in Fig. 16a. Let the vectors in Fig. 16b, drawn for α >> 1, be misaligned by a small angle θ
~ 1/α² << 1, so that the length of the difference vector is much smaller than the length of each vector. To
double its length squared, and hence to reduce T by a factor of two in comparison with its maximum
value 1, the arc between the vectors, equal to α²θ, should also become equal to 1, i.e. α²(2k±a + const)
= ±1. Subtracting these two equalities from each other, we get
$$ \Delta k \equiv k_+ - k_- = \frac{1}{a\alpha^2}. \tag{2.142} $$
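Both the perfect resonant transparency and the estimate (142) of the resonance width may be checked by a direct multiplication of the matrices in Eq. (139). The sketch below (not part of the original notes) does that for α = 10. The exact position of the resonance peak used in it, ka = arctan α + π/2 (modulo π), follows from requiring the two vectors in Eq. (141) to be aligned; it approaches πn only in the limit α → ∞. The function names and parameter values are illustrative assumptions.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return ((a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
            (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]))


def t_alpha(alpha):
    """Transfer matrix of a delta-functional barrier, Eq. (2.135)."""
    return ((1.0 - 1j * alpha, -1j * alpha), (1j * alpha, 1.0 + 1j * alpha))


def t_gap(ka):
    """Transfer matrix of a free-space interval, Eq. (2.138)."""
    return ((cmath.exp(1j * ka), 0.0), (0.0, cmath.exp(-1j * ka)))


def double_barrier_transparency(alpha, ka):
    """Transparency from the matrix product (2.139) and Eq. (2.128)."""
    total = matmul(t_alpha(alpha), matmul(t_gap(ka), t_alpha(alpha)))
    return 1.0 / abs(total[0][0]) ** 2


def transparency_closed_form(alpha, ka):
    """The same transparency from the closed-form expression (2.141)."""
    t11 = alpha ** 2 * cmath.exp(-1j * ka) + (1.0 - 1j * alpha) ** 2 * cmath.exp(1j * ka)
    return 1.0 / abs(t11) ** 2


if __name__ == "__main__":
    alpha = 10.0
    ka_res = math.atan(alpha) + 0.5 * math.pi  # exact peak position, close to pi
    half_width = 0.5 / alpha ** 2              # one half of a*Delta_k from Eq. (2.142)
    for ka in (ka_res - half_width, ka_res, ka_res + half_width):
        print(ka, double_barrier_transparency(alpha, ka))
```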
Now let us use the simple system shown in Fig. 15 to discuss an issue of large conceptual
significance. For that, consider what would happen if at some initial moment (say, t = 0) we have placed
a 1D quantum particle inside the double-barrier well with α >> 1, and left it there alone, without any
incident wave. To simplify the analysis, let us assume that the initial state of the particle coincides with
one of the stationary states of the infinite-wall well of the same size – see Eq. (1.84):
$$ \Psi(x, 0) = \psi_n(x) \equiv \left( \frac{2}{a} \right)^{1/2} \sin[k_n (x - x_1)], \quad \text{where } k_n = \frac{\pi n}{a}, \quad n = 1, 2, \ldots \tag{2.143} $$
At α → ∞, this is just an eigenstate of the system, and from our analysis in Sec. 1.5 we know the time
evolution of its wavefunction:
$$ \Psi(x, t) = \psi_n(x) \exp\{-i\omega_n t\} = \left( \frac{2}{a} \right)^{1/2} \sin[k_n (x - x_1)] \exp\{-i\omega_n t\}, \quad \text{with } \omega_n = \frac{E_n}{\hbar} = \frac{\hbar k_n^2}{2m}, \tag{2.144} $$
telling us that the particle remains in the well at all times with constant probability W(t) = W(0) = 1.
However, if the parameter α is large but finite, the de Broglie wave would slowly “leak out”
from the well, so that W(t) would slowly decrease. Such a state is called metastable. Let us derive the
law of its time evolution, assuming that the slow leakage, with a characteristic time τ >> 1/ωn, does
not affect the instant wave distribution inside the well, besides the gradual, slow reduction of W.30 Then
we can generalize Eq. (144) as
$$ \Psi(x, t) \approx \left( \frac{2W}{a} \right)^{1/2} \sin[k_n (x - x_1)] \exp\{-i\omega_n t\} \equiv A \exp\{i(k_n x - \omega_n t)\} + B \exp\{-i(k_n x + \omega_n t)\}, \tag{2.145} $$
making the probability of finding the particle in the well equal to W ≤ 1. As the last form of Eq. (145)
shows, this function is the sum of two traveling waves, with equal magnitudes of their amplitudes and
equal but opposite probability currents (5):
$$ |A| = |B| = \left( \frac{W}{2a} \right)^{1/2}, \quad I_A = \frac{\hbar}{m} |A|^2 k_n = \frac{\hbar}{m} \frac{W}{2a} \frac{\pi n}{a}, \quad I_B = -I_A. \tag{2.146} $$
But we already know from Eq. (79) that at α >> 1, the delta-functional wall’s transparency T equals
1/α², so that the wave carrying current IA, incident on the right wall from the inside, induces an
outcoming wave outside of the well (Fig. 17) with the following probability current:
$$ I_R = T I_A = \frac{1}{\alpha^2} I_A = \frac{1}{\alpha^2} \frac{\pi n \hbar W}{2ma^2}. \tag{2.147} $$
Fig. 2.17. Metastable state’s decay in the simple model of a 1D potential well formed by two low-transparent walls – schematically. (The sketch shows the outgoing probability currents IL and IR, spreading from the well with a velocity ~vgr.)
Absolutely similarly,
$$ I_L = \frac{1}{\alpha^2} I_B = -I_R. \tag{2.148} $$
30 This virtually evident assumption finds its formal justification in the perturbation theory to be discussed in
Chapter 6.
Now we may combine the 1D version (6) of the probability conservation law for the well’s interior:
dW
 IR  IL  0 , (2.149)
dt
with Eqs. (147)-(148) to write
dW 1  n
 2 W. (2.150)
dt  ma 2
This is just the standard differential equation,
Metastable dW 1
state:  W , (2.151)
decay law dt 
of the exponential decay, W(t) = W(0)exp{-t/}, where the constant , in our case equal to
ma 2 2
  , (2.152)
n
is called the metastable state’s lifetime. Using Eq. (2.33b) for the de Broglie waves’ group velocity, for
our particular wave vector giving vgr = kn/m = n/ma, Eq. (152) may be rewritten in a more general
form,
Metastable t
state:  a, (2.153)
lifetime T
where the attempt time $t_a$ is equal to $a/v_{gr}$, and (in our particular case) $T = 1/\alpha^2$; in this form, the result is valid for a broad class of similar metastable systems.31 Equation (153) may be interpreted in the following semi-classical way. The particle travels back and forth between the confining potential barriers, with the time interval $t_a$ between the sequential moments of incidence, each time attempting to leak through the wall, with the success probability equal to $T$, so the reduction of $W$ per each incidence is $\Delta W = -WT$, in the limit $\alpha \gg 1$ (i.e. $T \ll 1$) immediately leading to the decay equation (151) with the lifetime (153).
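The semi-classical picture of repeated leakage attempts is easy to try out numerically. The following minimal sketch (the function names and the transparency value are illustrative assumptions, not part of the original notes) compares the probability left after $n$ discrete attempts, each removing the fraction $T$ of $W$, with the exponential law $W = \exp\{-t/\tau\}$ taken at $t = n t_a$, with $\tau = t_a/T$:

```python
import math

def survival_after_attempts(T, n_attempts):
    """Probability W left in the well after n wall incidences,
    each of them removing the fraction T of the current W."""
    return (1.0 - T) ** n_attempts

def survival_exponential(T, n_attempts):
    """Decay law W = exp(-t/tau) with tau = t_a/T (Eq. 2.153),
    evaluated at the time t = n_attempts * t_a."""
    return math.exp(-n_attempts * T)

if __name__ == "__main__":
    T = 1e-3  # illustrative transparency, T << 1
    for n in (100, 1000, 5000):
        print(n, survival_after_attempts(T, n), survival_exponential(T, n))
```

The two results converge at $T \to 0$, as they should, and differ noticeably if $T$ is not small, reflecting the condition $\alpha \gg 1$ of the validity of Eq. (151).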
Another useful look at Eq. (152) may be taken by returning to the resonant tunneling problem in the same system, and expressing the resonance width (142) in terms of the incident particle's energy:
$$\Delta E = \Delta\left(\frac{\hbar^2k^2}{2m}\right) \approx \frac{\hbar^2k_n}{m}\Delta k = \frac{\hbar^2k_n}{m}\frac{1}{a\alpha^2} = \frac{\pi n\hbar^2}{ma^2\alpha^2}. \tag{2.154}$$
Comparing Eqs. (152) and (154), we get a remarkably simple, parameter-independent formula32
Energy-time uncertainty relation
$$\Delta E \cdot \tau = \hbar. \tag{2.155}$$
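To get a feeling for the scales involved, here is a small numerical sketch of Eqs. (152)-(155). The particular parameters (an electron, a well of width 1 nm, and $\alpha = 10$) are assumptions chosen only for illustration, and the function names are hypothetical:

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg

def attempt_time(m, a, n=1):
    """t_a = a / v_gr, with v_gr = pi*n*hbar/(m*a)."""
    return m * a**2 / (math.pi * n * HBAR)

def lifetime(m, a, alpha, n=1):
    """tau = m a^2 alpha^2 / (pi n hbar), Eq. (2.152)."""
    return m * a**2 * alpha**2 / (math.pi * n * HBAR)

def resonance_width(m, a, alpha, n=1):
    """Delta E = pi n hbar^2 / (m a^2 alpha^2), Eq. (2.154)."""
    return math.pi * n * HBAR**2 / (m * a**2 * alpha**2)

if __name__ == "__main__":
    a, alpha = 1e-9, 10.0          # illustrative values only
    tau = lifetime(M_E, a, alpha)
    dE = resonance_width(M_E, a, alpha)
    print("t_a [s]:", attempt_time(M_E, a))
    print("tau [s]:", tau)
    print("Delta E * tau / hbar:", dE * tau / HBAR)
```

The last printed ratio equals 1 for any choice of the parameters, which is just Eq. (155) again.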
31 Essentially the only requirement is to have the attempt time $t_a$ to be much longer than the effective time (the instanton time, see Sec. 5.3 below) of tunneling through the barrier. In the delta-functional approximation for the barrier, the latter time is equal to zero, so that this requirement is always fulfilled.
32 Note that the metastable state's decay (2.151) may be formally obtained from the basic Schrödinger equation (1.61) by adding an imaginary part, equal to $(-\Delta E/2)$, to its eigenenergy $E_n$. Indeed, in this case Eq. (1.62) becomes $a_n(t) = \text{const}\times\exp\{-i(E_n - i\Delta E/2)t/\hbar\} \equiv \text{const}\times\exp\{-iE_nt/\hbar\}\times\exp\{-\Delta E\,t/2\hbar\} = \text{const}\times\exp\{-iE_nt/\hbar\}\times\exp\{-t/2\tau\}$, so that $W(t) \propto |a_n(t)|^2 \propto \exp\{-t/\tau\}$. Such formalism, which hides the physical origin of the state's decay, may be convenient for some calculations, but misleading in other cases, and I will not use it in this course.
This energy-time uncertainty relation is certainly more general than our simple model; for example, it is valid for the lifetime and resonance tunneling width of any metastable state in the potential profile of any shape. This seems very natural, since because of the energy identification with frequency, $E = \hbar\omega$, typical for quantum mechanics, Eq. (155) may be rewritten as $\Delta\omega\cdot\tau = 1$ and seems to follow directly from the Fourier transform in time, just as Heisenberg's uncertainty relation (1.35) follows from the Fourier transform in space. In some cases, these two relations are indeed interchangeable; for example, Eq. (24) for the Gaussian wave packet width may be rewritten as $\delta E\,\Delta t = \hbar$, where $\delta E = \hbar(d\omega/dk)\delta k = \hbar v_{gr}\delta k$ is the r.m.s. spread of energies of monochromatic components of the packet, while $\Delta t \equiv \delta x/v_{gr}$ is the time scale of the packet's passage through a fixed observation point $x$.
However, Eq. (155) is much less general than Heisenberg's uncertainty relation (1.35). Indeed, in the non-relativistic quantum mechanics we are studying now, the Cartesian coordinates of a particle, the Cartesian components of its momentum, and the energy $E$ are regular observables, represented by operators. In contrast, time is treated as a c-number argument, and is not represented by an operator, so that Eq. (155) cannot be derived under assumptions as general as those behind Eq. (1.35). Thus the time-energy uncertainty relation should be used with caution. Unfortunately, not everybody is so careful. One can find, for example, wrong claims that due to this relation, the energy dissipated by any system performing an elementary (single-bit) calculation during a time interval $\Delta t$ has to be larger than $\hbar/\Delta t$.33 Another incorrect statement is that the energy of a system cannot be measured, during a time interval $\Delta t$, with an accuracy better than $\hbar/\Delta t$.34
Now that we have a quantitative mathematical description of the metastable state's decay (valid, again, only if $\alpha \gg 1$, i.e. if $\tau \gg t_a$), we may use it for discussion of two important conceptual issues of quantum mechanics. First, this is one of the simplest examples of systems that may be considered, from two different points of view, as either Hamiltonian (and hence time-reversible), or open (and hence irreversible). Indeed, from the former point of view, our particular system is certainly described by a time-independent Hamiltonian of the type (1.41), with the potential energy
$$U(x) = \mathcal{W}\left[\delta(x - x_1) + \delta(x - x_2)\right] \tag{2.156}$$
- see Fig. 15 again. (Here $\mathcal{W}$ is the "weight" of each delta-functional wall, not to be confused with the probability $W$.) From this point of view, the total probability of finding the particle somewhere on the axis $x$ remains equal to 1, and the full system's energy, calculated from Eq. (1.23),
$$\langle E \rangle = \int_{-\infty}^{+\infty}\Psi^*(x,t)\,\hat{H}\,\Psi(x,t)\,d^3x, \tag{2.157}$$
remains constant and completely definite ($\delta E = 0$). On the other hand, since the “emitted” wave packets would never return to the potential well,35 it makes sense to look at the well's region alone. For such a truncated, open system (for which the space beyond the interval $[x_1, x_2]$ serves as its environment), the probability $W$ of finding the particle inside this interval, and hence its energy $\langle E \rangle = WE_n$, decay exponentially per Eq. (151) – the decay equation typical for irreversible systems. We will return to the discussion of the dynamics of such open quantum systems in Chapter 7.
33 On this issue, I dare to refer the reader to my own old work K. Likharev, Int. J. Theor. Phys. 21, 311 (1982), which provided a constructive proof (for a particular system) that at reversible computation, whose idea had been put forward in 1973 by C. Bennett (see, e.g., SM Sec. 2.3), energy dissipation may be lower than this apparent “quantum limit”.
34 See, e.g., a discussion of this issue in the monograph by V. Braginsky and F. Khalili, Quantum Measurement, Cambridge U. Press, 1992.
35 For more realistic 2D and 3D systems, this statement is true even if the system as a whole is confined inside some closed volume, much larger than the potential well housing the metastable states. Indeed, if the walls
Second, the same model enables a preliminary discussion of one important aspect of quantum measurements. As Eq. (151) and Fig. 17 show, at $t \gg \tau$, the well becomes virtually empty ($W \approx 0$), and the whole probability is localized in two clearly separated wave packets with equal amplitudes, moving away from each other, each with the speed $v_{gr}$, and each “carrying the particle away” with a probability of 50%. Now assume that an experiment has detected the particle on the left side of the well. Though the formalisms suitable for quantitative analysis of the detection process will not be discussed until Chapter 7, due to the wide separation $\Delta x = 2v_{gr}t \gg 2v_{gr}\tau$ of the packets, we may safely assume that such detection may be done without any actual physical effect on the counterpart wave packet.36 But if we know that the particle has been found on the left side, there is no chance to find it on the right side. If we attributed the full wavefunction to all stages of this particular experiment, this situation might be rather confusing. Indeed, that would mean that the wavefunction at the right packet's location should instantly turn into zero – the so-called wave packet reduction (or “collapse”) – a hypothetical, irreversible process that cannot be described by the Schrödinger equation for this system, even including the particle detectors.
However, if (as was already discussed in Sec. 1.3) we attribute the wavefunction to a certain
statistical ensemble of similar experiments, there is no need to involve such an artificial notion. The
two-packet picture we have calculated (Fig. 17) describes the full ensemble of experiments with all
systems prepared in the initial state (143), i.e. does not depend on the particle detection results. On the
other hand, the “reduced packet” picture (with no wave packet on the right of the well) describes only a
sub-ensemble of such experiments, in which the particles have been detected on the left side. As was
discussed on classical examples in Sec. 1.3, for such redefined ensemble the probability distribution is
rather different. So, the “wave packet reduction” is just a result of a purely accounting decision of the
observer.37 I will return to this important discussion in Sec. 10.1 – on the basis of the forthcoming
discussion of open systems in Chapters 7 and 8.
2.6. Localized state coupling, and quantum oscillations

Now let us discuss one more effect specific to quantum mechanics. Its mathematical description
may be simplified using a model potential consisting of two very short and deep potential wells. For
that, let us first analyze the properties of a single well of this type (Fig. 18), which may be modeled
similarly to the short and high potential barrier – see Eq. (74), but with a negative “weight”:
$$U(x) = -\mathcal{W}\delta(x), \quad \text{with } \mathcal{W} > 0. \tag{2.158}$$
providing such confinement are even slightly uneven, the emitted plane-wave packets will be reflected from them,
but would never return to the well intact. (See SM Sec. 2.1 for a more detailed discussion of this issue.)
36 This argument is especially convincing if the particle's detection time is much shorter than the time $t_c = 2v_{gr}t/c$, where $c$ is the speed of light in vacuum, i.e. the maximum velocity of any information transfer.
37 “The collapse of the wavefunction after measurement represents nothing more than the updating of that
scientist’s expectations.” N. D. Mermin, Phys. Today, 72, 53 (Jan. 2013).
In contrast to its tunnel-barrier counterpart (74), such potential sustains a stationary state with a negative eigenenergy $E < 0$, and a localized eigenfunction $\psi$, with $|\psi| \to 0$ at $x \to \pm\infty$.
Fig. 2.18. Delta-functional potential well and its localized eigenstate (schematically). [Sketch labels: the potential $U(x) = -\mathcal{W}\delta(x)$, the eigenfunction $\psi_0$ with the spatial scale $1/\kappa$ on each side of the well, and the level $E_0$.]
Indeed, at $x \neq 0$, $U(x) = 0$, so the 1D Schrödinger equation is reduced to the Helmholtz equation (1.83), whose localized solutions with $E < 0$ are single exponents, vanishing at large distances:38
$$\psi(x) = \psi_0(x) \equiv \begin{cases} Ae^{-\kappa x}, & \text{for } x > 0, \\ Ae^{+\kappa x}, & \text{for } x < 0, \end{cases} \quad \text{with } \frac{\hbar^2\kappa^2}{2m} = -E, \quad \kappa > 0. \tag{2.159}$$
(The coefficients before the exponents have been selected equal to satisfy the boundary condition (76) of the wavefunction's continuity at $x = 0$.) Plugging Eq. (159) into the second boundary condition, given by Eq. (75), but now with the negative sign before $\mathcal{W}$, we get
$$(-\kappa A) - (\kappa A) = -\frac{2m\mathcal{W}}{\hbar^2}A, \tag{2.160}$$
in which the common factor $A \neq 0$ may be canceled. This equation39 has one solution for any $\mathcal{W} > 0$:
$$\kappa = \kappa_0 \equiv \frac{m\mathcal{W}}{\hbar^2}, \tag{2.161}$$
and hence the system has only one (ground) localized state, with the following eigenenergy:40
$$E = E_0 \equiv -\frac{\hbar^2\kappa_0^2}{2m} = -\frac{m\mathcal{W}^2}{2\hbar^2}. \tag{2.162}$$
Now we are ready to analyze localized states of the two-well potential shown in Fig. 19:
$$U(x) = -\mathcal{W}\left[\delta\left(x - \frac{a}{2}\right) + \delta\left(x + \frac{a}{2}\right)\right], \quad \text{with } \mathcal{W} > 0. \tag{2.163}$$
Here we may still use the single-exponent solutions, similar to Eq. (159), for the wavefunction outside the interval $[-a/2, +a/2]$, but inside the interval, we need to take into account both possible exponents:
$$\psi = C_+e^{\kappa x} + C_-e^{-\kappa x} \equiv C_A\sinh\kappa x + C_S\cosh\kappa x, \quad \text{for } -\frac{a}{2} \le x \le +\frac{a}{2}, \tag{2.164}$$
with the parameter $\kappa$ defined as in Eq. (159). The last of these equivalent expressions is more convenient because due to the symmetry of the potential (163) with respect to the central point $x = 0$, the system's eigenfunctions should be either symmetric (even) or antisymmetric (odd) functions of $x$ (see Fig. 19), so that they may be analyzed separately, only for one half of the system, say $x \ge 0$, and using just one of the hyperbolic functions (164) in each case.
Fig. 2.19. A system of two coupled potential wells, and its localized eigenstates (schematically). [Sketch labels: the potential $U(x)$ with the wells at $x = \pm a/2$, the eigenfunctions $\psi_S$ and $\psi_A$, and the levels $E_S$ and $E_A$.]
38 See Eqs. (56)-(58), with $U_0 = 0$.
39 Such algebraic equations for linear differential equations are frequently called characteristic.
40 Note that this $E_0$ is equal, by magnitude, to the constant $E_0$ that participates in Eq. (79). Note also that this result was actually already obtained, “backward”, in the solution of Problem 1.12(ii), but that solution did not address the issue of whether the calculated potential (158) could sustain any other localized eigenstates.
For the antisymmetric eigenfunction, Eqs. (159) and (164) yield
$$\psi \equiv \psi_A = C_A \times \begin{cases} \sinh\kappa x, & \text{for } 0 \le x \le \dfrac{a}{2}, \\ \sinh\dfrac{\kappa a}{2}\exp\left\{-\kappa\left(x - \dfrac{a}{2}\right)\right\}, & \text{for } \dfrac{a}{2} \le x, \end{cases} \tag{2.165}$$
where the front coefficient in the lower line has been selected to satisfy the condition (76) of the wavefunction's continuity at $x = +a/2$ – and hence at $x = -a/2$. What remains is to satisfy the condition (75), with a negative sign before $\mathcal{W}$, for the derivative's jump at that point. This condition yields the following characteristic equation:
$$\kappa\sinh\frac{\kappa a}{2} + \kappa\cosh\frac{\kappa a}{2} = \frac{2m\mathcal{W}}{\hbar^2}\sinh\frac{\kappa a}{2}, \quad \text{i.e. } 1 + \coth\frac{\kappa a}{2} = 2\frac{(\kappa_0a)}{(\kappa a)}, \tag{2.166}$$
where $\kappa_0$, given by Eq. (161), is the value of $\kappa$ for a single well, i.e. the reciprocal spatial width of its localized eigenfunction – see Fig. 18.
Figure 20a shows both sides of Eq. (166) as functions of the dimensionless product $\kappa a$, for several values of the parameter $\kappa_0a$, i.e. of the normalized distance between the two wells. The plots show, first of all, that as the parameter $\kappa_0a$ is decreased, the LHS and RHS plots cross (i.e. Eq. (166) has a solution) at lower and lower values of $\kappa a$. At $\kappa a \ll 1$, the left-hand side of the last form of this equation may be approximated as $2/\kappa a$. Comparing this expression with the right-hand side of the characteristic equation, we see that this transcendental equation has a solution (i.e. the system has an antisymmetric localized state) only if $\kappa_0a > 1$, i.e. if the distance $a$ between the two narrow potential wells is larger than the following value,
$$a_{min} = \frac{1}{\kappa_0} \equiv \frac{\hbar^2}{m\mathcal{W}}, \tag{2.167}$$
which is equal to the characteristic spread of the wavefunction in a single well – see Fig. 18. (At $a \to a_{min}$, $\kappa a \to 0$, meaning that the state's localization becomes weaker and weaker.)
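The threshold (167) may be double-checked with a short side calculation. It is only a sketch, relying on the elementary identity $1 + \coth(y/2) = 2/(1 - e^{-y})$, which is not used explicitly in the text above:

```latex
\begin{align*}
1 + \coth\frac{\kappa a}{2} \equiv \frac{2}{1 - e^{-\kappa a}}
\quad &\Longrightarrow \quad
\kappa a = \kappa_0 a \left(1 - e^{-\kappa a}\right)
\quad \text{[an equivalent form of Eq. (2.166)]}, \\
\kappa a \ll 1: \quad
1 - e^{-\kappa a} \approx \kappa a \left(1 - \frac{\kappa a}{2}\right)
\quad &\Longrightarrow \quad
\kappa a \approx \frac{2\,(\kappa_0 a - 1)}{\kappa_0 a}.
\end{align*}
```

In the leading order, the root is positive only if $\kappa_0 a > 1$, in agreement with Eq. (167). In the opposite limit $\kappa a \gg 1$, the same compact form gives $\kappa \approx \kappa_0(1 - \exp\{-\kappa_0 a\})$.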
Fig. 2.20. Graphical solutions of the characteristic equations of the two-well system, for: (a) the antisymmetric eigenstate (165), and (b) the symmetric eigenstate (171). [Each panel plots, as functions of $\kappa a$, the LHS of the corresponding equation, (166) or (172), and its RHS for $\kappa_0a = 0.5$, 1.0, and 1.5.]
In the opposite limit of large distances between the potential wells, i.e. $\kappa_0a \gg 1$, Eq. (166) shows that $\kappa a \gg 1$ as well, so that its left-hand side may be approximated as $2(1 + \exp\{-\kappa a\})$, and the equation yields
$$\kappa \approx \kappa_0\left(1 - \exp\{-\kappa_0a\}\right) \approx \kappa_0. \tag{2.168}$$
This result means that the eigenfunction is an antisymmetric superposition of two virtually unperturbed wavefunctions (159) of each partial potential well:
$$\psi_A(x) \approx \frac{1}{\sqrt{2}}\left[\psi_R(x) - \psi_L(x)\right], \quad \text{where } \psi_R(x) \equiv \psi_0\left(x - \frac{a}{2}\right), \quad \psi_L(x) \equiv \psi_0\left(x + \frac{a}{2}\right), \tag{2.169}$$
and the front coefficient is selected in such a way that if the eigenfunction $\psi_0$ of each well is normalized, so is $\psi_A$. Plugging the middle (more exact) form of Eq. (168) into the last of Eqs. (159), we can see that in this limit the antisymmetric state's energy is only slightly higher than the eigenenergy $E_0$ of a single well, given by Eq. (162):
$$E_A \approx E_0\left(1 - 2\exp\{-\kappa_0a\}\right) \equiv E_0 + \delta, \quad \text{where } \delta \equiv 2|E_0|\exp\{-\kappa_0a\} = \frac{m\mathcal{W}^2}{\hbar^2}\exp\{-\kappa_0a\} > 0. \tag{2.170}$$
The symmetric eigenfunction has a form resembling Eq. (165), but still different from it:
$$\psi \equiv \psi_S = C_S \times \begin{cases} \cosh\kappa x, & \text{for } 0 \le x \le \dfrac{a}{2}, \\ \cosh\dfrac{\kappa a}{2}\exp\left\{-\kappa\left(x - \dfrac{a}{2}\right)\right\}, & \text{for } \dfrac{a}{2} \le x, \end{cases} \tag{2.171}$$
giving a characteristic equation similar in structure to Eq. (166), but with a different left-hand side:
$$1 + \tanh\frac{\kappa a}{2} = 2\frac{(\kappa_0a)}{(\kappa a)}. \tag{2.172}$$
Figure 20b shows both sides of this equation for several values of the parameter $\kappa_0a$. It is evident that in contrast to Eq. (166), Eq. (172) has a unique solution (and hence the system has a localized symmetric eigenstate) for any value of the parameter $\kappa_0a$, i.e. for any distance between the partial wells. In the limit of very close wells (i.e. their strong coupling), $\kappa_0a \ll 1$, we get $\kappa a \ll 1$, $\tanh(\kappa a/2) \to 0$, and Eq. (172) yields $\kappa \to 2\kappa_0$, leading to a four-fold increase of the eigenenergy's magnitude in comparison with that of the single well:
$$E_S \approx 4E_0 \equiv -\frac{m(2\mathcal{W})^2}{2\hbar^2}, \quad \text{for } \kappa_0a \ll 1. \tag{2.173}$$
The physical meaning of this result is very simple: two very close potential wells act (on the symmetric eigenfunction only!) together, so that their “weights” $\mathcal{W} \equiv \left|\int U(x)dx\right|$ just add up.
In the opposite, weak coupling limit, i.e. $\kappa_0a \gg 1$, Eq. (172) shows that $\kappa a \gg 1$ as well, so that its left-hand side may be approximated as $2(1 - \exp\{-\kappa a\})$, and the equation yields
$$\kappa \approx \kappa_0\left(1 + \exp\{-\kappa_0a\}\right) \approx \kappa_0. \tag{2.174}$$
In this limit, the eigenfunction is a symmetric superposition of two virtually unperturbed wavefunctions (159) of each partial potential well:
$$\psi_S(x) \approx \frac{1}{\sqrt{2}}\left[\psi_R(x) + \psi_L(x)\right], \tag{2.175}$$
and the eigenenergy is also close to the energy $E_0$ of a partial well, but is slightly lower:
$$E_S \approx E_0\left(1 + 2\exp\{-\kappa_0a\}\right) \equiv E_0 - \delta, \quad \text{so that } E_A - E_S = 2\delta, \tag{2.176}$$
where $\delta$ is again given by the last of Eqs. (170).
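The characteristic equations (166) and (172) are easy to solve numerically, which allows a check of both limits discussed above. The following sketch is just an illustration, not a part of the original derivation: it uses plain bisection, hypothetical function names, and measures energies in the (assumed) units of $\hbar^2\kappa_0^2/m$, so that $E_0 = -1/2$ and the weak-coupling half-splitting is $2|E_0|\exp\{-\kappa_0 a\} = \exp\{-\kappa_0a\}$:

```python
import math

def _bisect(f, lo, hi, tol=1e-13):
    """Plain bisection for a root of f on [lo, hi], assuming f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def kappa_a_symmetric(k0a):
    """Solve Eq. (2.172), 1 + tanh(y/2) = 2*k0a/y, for y = kappa*a (any k0a > 0)."""
    f = lambda y: y * (1.0 + math.tanh(0.5 * y)) - 2.0 * k0a
    return _bisect(f, 0.0, 2.0 * k0a + 1.0)

def kappa_a_antisymmetric(k0a):
    """Solve Eq. (2.166), 1 + coth(y/2) = 2*k0a/y, for y = kappa*a.
    Returns None if k0a <= 1, when there is no localized antisymmetric state."""
    if k0a <= 1.0:
        return None
    f = lambda y: y * (1.0 + 1.0 / math.tanh(0.5 * y)) - 2.0 * k0a
    return _bisect(f, 1e-9, 2.0 * k0a)

def levels(k0a):
    """Return (E_S, E_A) in units of hbar^2*kappa_0^2/m (so that E_0 = -0.5);
    E_A is None below the threshold k0a = 1."""
    y_s = kappa_a_symmetric(k0a)
    y_a = kappa_a_antisymmetric(k0a)
    e_s = -0.5 * (y_s / k0a) ** 2
    e_a = None if y_a is None else -0.5 * (y_a / k0a) ** 2
    return e_s, e_a

if __name__ == "__main__":
    for k0a in (0.01, 0.5, 1.5, 3.0, 6.0):
        print(k0a, levels(k0a))
```

At $\kappa_0a \ll 1$ the computed $E_S$ approaches $4E_0$, while at $\kappa_0a \gg 1$ the difference $E_A - E_S$ approaches $2\delta$.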

So, the eigenenergy of the symmetric state is always lower than that of the antisymmetric state. The physics of this effect (which remains qualitatively the same in more complex two-component systems, most importantly in diatomic molecules such as H2) is evident from the sketch of the wavefunctions $\psi_A$ and $\psi_S$, given by Eqs. (165) and (171), in Fig. 19. In the antisymmetric mode, the wavefunction has to vanish at the center of the system, so that each of its halves is squeezed to one half of the system's spatial extension. Such a squeeze increases the function's gradient, and hence its kinetic energy (1.27), and hence its total energy. On the contrary, in the symmetric mode, the wavefunction effectively spreads into the counterpart well. As a result, it changes in space slower, and hence its kinetic energy is also lower.
Even more importantly, the symmetric state's energy decreases as the distance $a$ is decreased, corresponding to the effective attraction of the partial wells. This is a good toy model of the strongest (and most important) type of atomic cohesion – the covalent (or “chemical”) bonding.41 In the simplest case of the H2 molecule, each of two electrons of the system, in its ground state,42 reduces its kinetic energy by spreading its wavefunction around both hydrogen nuclei (protons), rather than being confined near one of them – as it had to be in a single atom. The resulting bonding is very strong: in chemical units, 429 kJ/mol, i.e. about 4.45 eV per molecule. Perhaps counter-intuitively, this quantum-mechanical covalent bonding is even stronger than the strongest classical (ionic) bonding due to electron transfer between atoms, leading to the Coulomb attraction of the resulting ions. (For example, the atomic cohesion in the NaCl molecule is just 3.28 eV.)
41 Historically, the development of the quantum theory of such bonding in the H2 molecule (by Walter Heinrich Heitler and Fritz Wolfgang London in 1927) was the breakthrough decisive for the acceptance of the then-emerging quantum mechanics by the community of chemists.
42 Due to the opposite spins of these electrons, the Pauli principle allows them to be in the same orbital ground state – see Chapter 8.
Now let us analyze the dynamic properties of our model system (Fig. 19) because such a pair of weakly coupled potential wells is our first example of the very important class of two-level systems.43 It is easiest to do in the weak-coupling limit $\kappa_0a \gg 1$, when the simple results (168)-(170) and (174)-(176) are quantitatively valid. In particular, Eqs. (169) and (175) enable us to represent the quasi-localized states of the particle in each partial well as linear combinations of its two eigenstates:
$$\psi_R(x) = \frac{1}{\sqrt{2}}\left[\psi_S(x) + \psi_A(x)\right], \quad \psi_L(x) = \frac{1}{\sqrt{2}}\left[\psi_S(x) - \psi_A(x)\right]. \tag{2.177}$$
Let us perform the following thought (“gedanken”) experiment: place a particle, at $t = 0$, into one of these quasi-localized states, say $\psi_R(x)$, and leave the system alone to evolve, so that
$$\Psi(x,0) = \psi_R(x) = \frac{1}{\sqrt{2}}\left[\psi_S(x) + \psi_A(x)\right]. \tag{2.178}$$
According to the general solution (1.69) of the Schrödinger equation for a system with a time-independent Hamiltonian, the time dynamics of this wavefunction may be obtained simply by multiplying each eigenfunction by the corresponding complex-exponential time factor:
$$\Psi(x,t) = \frac{1}{\sqrt{2}}\left[\psi_S(x)\exp\left\{-i\frac{E_S}{\hbar}t\right\} + \psi_A(x)\exp\left\{-i\frac{E_A}{\hbar}t\right\}\right]. \tag{2.179}$$
From here, using Eqs. (170) and (176), and then Eqs. (169) and (175) again, we get
$$\begin{aligned} \Psi(x,t) &= \frac{1}{\sqrt{2}}\left[\psi_S(x)\exp\left\{\frac{i\delta t}{\hbar}\right\} + \psi_A(x)\exp\left\{-\frac{i\delta t}{\hbar}\right\}\right]\exp\left\{-\frac{iE_0t}{\hbar}\right\} \\ &\equiv \left[\psi_R(x)\cos\frac{\delta t}{\hbar} + i\psi_L(x)\sin\frac{\delta t}{\hbar}\right]\exp\left\{-i\frac{E_0t}{\hbar}\right\}. \end{aligned} \tag{2.180}$$
This result implies, in particular, that the probabilities $W_R$ and $W_L$ to find the particle, respectively, in the right and left wells change with time as
Quantum oscillations
$$W_R = \cos^2\frac{\delta t}{\hbar}, \quad W_L = \sin^2\frac{\delta t}{\hbar}, \tag{2.181}$$
mercifully leaving the total probability constant: $W_R + W_L = 1$. (If our calculation had not passed this sanity check, we would be in big trouble.)
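The same sanity check may be delegated to a computer. The sketch below (with a hypothetical function name, and the arbitrary illustrative values $E_0 = -1$ and $\hbar = 1$) evolves the amplitudes of the eigenstates $\psi_S$ and $\psi_A$ per Eq. (179), and then projects the result back onto $\psi_R$ and $\psi_L$ using Eqs. (169) and (175):

```python
import cmath
import math

def well_probabilities(t, delta, e0=-1.0, hbar=1.0):
    """Return (W_R, W_L) at time t for a particle placed into psi_R at t = 0.

    The eigenenergies are E_S = e0 - delta and E_A = e0 + delta; the
    default values of e0 and hbar are arbitrary illustrative choices."""
    e_s, e_a = e0 - delta, e0 + delta
    # Amplitudes of psi_S and psi_A, each with its own time factor, Eq. (2.179)
    c_s = cmath.exp(-1j * e_s * t / hbar) / math.sqrt(2.0)
    c_a = cmath.exp(-1j * e_a * t / hbar) / math.sqrt(2.0)
    # Back to the quasi-localized states: psi_S = (R + L)/sqrt2, psi_A = (R - L)/sqrt2
    a_r = (c_s + c_a) / math.sqrt(2.0)
    a_l = (c_s - c_a) / math.sqrt(2.0)
    return abs(a_r) ** 2, abs(a_l) ** 2

if __name__ == "__main__":
    delta = 0.05
    for t in (0.0, 10.0, math.pi / (2.0 * delta), math.pi / delta):
        print(t, well_probabilities(t, delta))
```

The printed values follow Eq. (181): the particle is fully transferred to the left well at $t = \pi\hbar/2\delta$, and returns to the right well at $t = \pi\hbar/\delta$.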
This is the famous effect of quantum oscillations44 of the particle's wavefunction between two similar, coupled subsystems, with the frequency
$$\omega = \frac{2\delta}{\hbar} \equiv \frac{E_A - E_S}{\hbar}. \tag{2.182}$$
In its last form, this result does not depend on the assumption of weak coupling, though the simple form (181) of the oscillations, with its 100% probability variations, does. (Indeed, at a strong coupling of two subsystems, the very notion of the quasi-localized states $\psi_R$ and $\psi_L$ is ambiguous.) Qualitatively, this effect may be interpreted as follows: the particle, placed into one of the potential wells, tries to escape from it via tunneling through the potential barrier separating the wells. (In our particular system, shown in Fig. 19, the barrier is formed by the spatial segment of length $a$, which has the potential energy, $U = 0$, higher than the eigenstate energy $\approx -|E_0|$.) However, in the two-well system, the particle can only escape into the adjacent well. After the tunneling into that counterpart well, the particle tries to escape from it, and hence comes back, etc. – very much as a classical 1D oscillator, initially deflected from its equilibrium position, at negligible damping.
43 As we will see later in Chapter 4, these properties are similar to those of spin-½ particles; hence two-level systems are frequently called the spin-½-like systems.
44 Sometimes they are called the Bloch oscillations, but more commonly the last term is reserved for a related but different effect in spatially-periodic systems – to be discussed in Sec. 8 below.
Some care is required in using this interpretation for quantitative conclusions. In particular, let us compare the period $\mathcal{T} \equiv 2\pi/\omega$ of the oscillations (181) with the metastable state's lifetime discussed in the previous section. For our particular model, we may use the second of Eqs. (170) to write
$$\omega = \frac{4|E_0|}{\hbar}\exp\{-\kappa_0a\}, \quad \mathcal{T} = \frac{\pi\hbar}{2|E_0|}\exp\{\kappa_0a\} = \frac{t_a}{4}\exp\{\kappa_0a\}, \quad \text{for } \kappa_0a \gg 1, \tag{2.183}$$
where $t_a \equiv 2\pi/\omega_0 \equiv 2\pi\hbar/|E_0|$ is the effective attempt time. On the other hand, according to Eq. (80), the transparency $T$ of our potential barrier, in this limit, scales as $\exp\{-2\kappa_0a\}$,45 so that according to the general relation (153), the lifetime $\tau$ is of the order of $t_a\exp\{2\kappa_0a\} \gg \mathcal{T}$. This is a rather counter-intuitive result: the speed of particle tunneling into a similar adjacent well is much higher than that, through a similar barrier, to the free space!
In order to show that this important result is not an artifact of our delta-functional model of the potential barrier, and also compare the oscillation period $\mathcal{T}$ and the lifetime $\tau$ more directly, let us analyze the quantum oscillations between two weakly coupled wells, now assuming that the (symmetric) potential profile $U(x)$ is sufficiently soft (Fig. 21), so that all its eigenfunctions $\psi_S$ and $\psi_A$ are at least differentiable at all points.46 If the barrier's transparency is low, the quasi-localized wavefunctions $\psi_R(x)$ and $\psi_L(x) = \psi_R(-x)$ and their eigenenergies may be found approximately by solving the Schrödinger equations in one of the wells, neglecting the tunneling through the barrier, but the calculation of $\delta$ requires a little bit more care. Let us write the stationary Schrödinger equations for the symmetric and antisymmetric solutions in the form
$$\left[E_A - U(x)\right]\psi_A = -\frac{\hbar^2}{2m}\frac{d^2\psi_A}{dx^2}, \quad \left[E_S - U(x)\right]\psi_S = -\frac{\hbar^2}{2m}\frac{d^2\psi_S}{dx^2}, \tag{2.184}$$
multiply the former equation by $\psi_S$ and the latter one by $\psi_A$, subtract them from each other, and then integrate the result from 0 to $\infty$. The result is
$$(E_A - E_S)\int_0^\infty\psi_S\psi_A\,dx = \frac{\hbar^2}{2m}\int_0^\infty\left[\frac{d^2\psi_S}{dx^2}\psi_A - \frac{d^2\psi_A}{dx^2}\psi_S\right]dx. \tag{2.185}$$
If $U(x)$, and hence $d^2\psi_{A,S}/dx^2$, are finite for all $x$, we may integrate the right-hand side by parts to get
$$(E_A - E_S)\int_0^\infty\psi_S\psi_A\,dx = \frac{\hbar^2}{2m}\left[\frac{d\psi_S}{dx}\psi_A - \frac{d\psi_A}{dx}\psi_S\right]_0^\infty. \tag{2.186}$$
Fig. 2.21. Weak coupling between two similar, soft potential wells. [Sketch labels: the potential $U(x)$ with the wells centered at $x = \pm a/2$, the quasi-localized functions $\psi_L(x)$ and $\psi_R(x)$, the level $E$, and the classical turning points $x_c$ and $x_c'$ limiting the barrier.]
45 It is hard to use Eq. (80) for a more exact evaluation of $T$ in our current system, with its infinitely deep potential wells, because the meaning of the wave number $k$ is not quite clear. However, this is not too important, because in the limit $\kappa_0a \gg 1$, the tunneling exponent makes the dominant contribution into the transparency – see, again, Fig. 2.7b.
46 Such a smooth well may have more than one quasi-localized eigenstate, so that the proper state (and energy) index $n$ is implied in all remaining formulas of this section.
So far, this result is exact (provided that the derivatives participating in it are finite at each point); for weakly coupled wells, it may be further simplified. Indeed, in this case, the left-hand side of Eq. (186) may be approximated as
$$(E_A - E_S)\int_0^\infty\psi_S\psi_A\,dx \approx \frac{E_A - E_S}{2} \equiv \delta, \tag{2.187}$$
because this integral is dominated by the vicinity of point $x = a/2$, where the second terms in each of Eqs. (169) and (175) are negligible, and the integral is equal to ½, assuming the proper normalization of the function $\psi_R(x)$. On the right-hand side of Eq. (186), the substitution at $x = \infty$ vanishes (due to the wavefunction's decay in the classically forbidden region), and so does the first term at $x = 0$, because for the antisymmetric solution, $\psi_A(0) = 0$. As a result, the energy half-split $\delta$ may be expressed in any of the following (equivalent) forms:
$$\delta = \frac{\hbar^2}{2m}\psi_S(0)\frac{d\psi_A}{dx}(0) = \frac{\hbar^2}{m}\psi_R(0)\frac{d\psi_R}{dx}(0) = -\frac{\hbar^2}{m}\psi_L(0)\frac{d\psi_L}{dx}(0). \tag{2.188}$$
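As a sanity check, Eq. (188) may be applied to the delta-functional two-well model analyzed earlier in this section. Strictly speaking, that potential is not soft, so the following lines are only a sketch; they also assume the normalized single-well eigenfunction $\psi_0(x) = \kappa_0^{1/2}\exp\{-\kappa_0|x|\}$, whose normalization constant is not spelled out in the text above:

```latex
\begin{align*}
\psi_R(x) = \psi_0\!\left(x - \frac{a}{2}\right) = \kappa_0^{1/2} e^{-\kappa_0 |x - a/2|}
\quad &\Longrightarrow \quad
\psi_R(0) = \kappa_0^{1/2} e^{-\kappa_0 a/2}, \qquad
\frac{d\psi_R}{dx}(0) = \kappa_0^{3/2} e^{-\kappa_0 a/2}, \\
\delta = \frac{\hbar^2}{m}\,\psi_R(0)\,\frac{d\psi_R}{dx}(0)
&= \frac{\hbar^2 \kappa_0^2}{m}\, e^{-\kappa_0 a}
= 2|E_0|\, e^{-\kappa_0 a}.
\end{align*}
```

This is exactly the half-splitting that follows from the weak-coupling results $E_A \approx E_0(1 - 2\exp\{-\kappa_0a\})$ and $E_S \approx E_0(1 + 2\exp\{-\kappa_0a\})$.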
It is straightforward (and hence left for the reader's exercise) to show that within the limits of the WKB approximation's validity, Eq. (188) may be reduced to
$$\delta = \frac{\hbar}{t_a}\exp\left\{-\int_{x_c}^{x_c'}\kappa(x')dx'\right\}, \quad \text{so that } \mathcal{T} \equiv \frac{2\pi}{\omega} = \frac{\pi\hbar}{\delta} = \pi t_a\exp\left\{\int_{x_c}^{x_c'}\kappa(x')dx'\right\}, \tag{2.189}$$
where $t_a$ is the time period of the classical motion of the particle, with the energy $E \approx E_A \approx E_S$, inside each well, the function $\kappa(x)$ is defined by Eq. (82), and $x_c$ and $x_c'$ are the classical turning points limiting the potential barrier at the level $E$ of the particle's eigenenergy – see Fig. 21. The result (189) is evidently a natural generalization of Eq. (183), so that the strong relationship between the times of particle tunneling into the continuum of states and into a discrete eigenstate, is indeed not specific for the delta-functional model. We will return to this fact, in its more general form, at the end of Chapter 6.
2.7. Periodic systems: Energy bands and gaps

Let us now proceed to the discussion of one of the most important issues of wave mechanics:
particle motion through a periodic system. As a precursor to this discussion, let us calculate the
transparency of the potential profile shown in Fig. 22 (frequently called the Dirac comb): a sequence of
N similar, equidistant delta-functional potential barriers, separated by (N – 1) potential-free intervals a.
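Fig. 2.22. Tunneling through a Dirac comb: a system of $N$ similar, equidistant barriers, i.e. $(N - 1)$ similar coupled potential wells. [Sketch labels: the barriers at $x_1, x_2, \ldots, x_N$, separated by the intervals $a$; the incident probability current $I_A$ and the transmitted current $TI_A$ at the energy $E$.]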
According to Eq. (132), its transfer matrix is the following product
$$\mathbf{T} = \underbrace{\mathbf{T}_\alpha\mathbf{T}_a\mathbf{T}_\alpha \ldots \mathbf{T}_a\mathbf{T}_\alpha}_{(N-1)+N \text{ terms}}, \tag{2.190}$$
with the component matrices given by Eqs. (135) and (138), and the barrier height parameter $\alpha$ defined by the last of Eqs. (78). Remarkably, this multiplication may be carried out analytically,47 giving
$$T \equiv |T_{11}|^{-2} = \left[(\cos Nqa)^2 + \left(\frac{\sin ka - \alpha\cos ka}{\sin qa}\sin Nqa\right)^2\right]^{-1}, \tag{2.191a}$$
where $q$ is a new parameter, with the wave number dimensionality, defined by the following relation:
$$\cos qa \equiv \cos ka + \alpha\sin ka. \tag{2.191b}$$
For $N = 1$, Eqs. (191) immediately yield our old result (79), while for $N = 2$ they may be readily reduced to Eq. (141) – see Fig. 16a. Fig. 23 shows their predictions for two larger numbers $N$, and several values of the dimensionless parameter $\alpha$.
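Equations (191) may be cross-checked by a brute-force multiplication of the matrices in Eq. (190). Since Eqs. (135) and (138) are not reproduced in this section, the matrix forms used in the sketch below are assumptions: $\mathbf{T}_\alpha$ with the rows $(1 - i\alpha, -i\alpha)$ and $(i\alpha, 1 + i\alpha)$, and the diagonal $\mathbf{T}_a$ with the elements $\exp\{\pm ika\}$. The parameter $\alpha$ is treated as a fixed number, and all function names are hypothetical:

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 complex matrices stored as nested tuples."""
    return (
        (A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]),
        (A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]),
    )

def comb_transparency_numeric(N, ka, alpha):
    """T = |T_11|^(-2) from the explicit product (2.190) of 2N - 1 matrices."""
    # ASSUMED forms of the single-barrier and free-interval transfer matrices
    t_alpha = ((1 - 1j * alpha, -1j * alpha), (1j * alpha, 1 + 1j * alpha))
    t_a = ((cmath.exp(1j * ka), 0j), (0j, cmath.exp(-1j * ka)))
    total = t_alpha
    for _ in range(N - 1):
        total = matmul(matmul(total, t_a), t_alpha)
    return 1.0 / abs(total[0][0]) ** 2

def comb_transparency_analytic(N, ka, alpha):
    """T from Eqs. (2.191); qa becomes complex where |cos qa| > 1."""
    c = math.cos(ka) + alpha * math.sin(ka)      # cos(qa), Eq. (2.191b)
    qa = cmath.acos(c)
    s = (math.sin(ka) - alpha * math.cos(ka)) * cmath.sin(N * qa) / cmath.sin(qa)
    value = cmath.cos(N * qa) ** 2 + s ** 2      # real up to rounding errors
    return 1.0 / value.real

if __name__ == "__main__":
    for N in (1, 2, 3, 10):
        print(N, comb_transparency_numeric(N, 2.0, 1.0),
              comb_transparency_analytic(N, 2.0, 1.0))
```

The analytic function should not be called exactly at the points where $\cos qa = \pm 1$ (i.e. where $\sin qa = 0$), because this simple sketch does not take the corresponding limit.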
Let us start the discussion of the plots from the case $N = 3$, when three barriers limit two coupled potential wells between them. Comparison of Fig. 23a and Fig. 16a shows that the transmission patterns, and their dependence on the parameter $\alpha$, are very similar, besides that in the coupled-well system, each resonant tunneling peak splits into two, with the $ka$-difference between them scaling as $1/\alpha$. From the discussion in the last section, we may now readily interpret this result: each pair of resonance peaks of transparency corresponds to the alignment of the incident particle's energy $E$ with the pair of energy levels $E_A$, $E_S$ of the antisymmetric and symmetric states of the system. However, in contrast to the system shown in Fig. 19, these states are metastable, because the particle may leak out from these states just as it could in the system studied in Sec. 5 – see Fig. 15 and its discussion. As a result, each of the resonant peaks has a non-zero energy width $\Delta E$, obeying Eq. (155).
47 This formula will be easier to prove after we have discussed the properties of Pauli matrices in Chapter 4.
Fig. 2.23. The Dirac comb's transparency as a function of the product $ka$ for three values of $\alpha$. Since the function $T(ka)$ is $\pi$-periodic (just like it is for $N = 2$, see Fig. 16a), only one period is shown. [Panels: (a) $N = 3$, (b) $N = 10$; each plots $T$ versus $ka/\pi$ for $\alpha = 0.3$, 1.0, and 3.0.]
A further increase of $N$ (see Fig. 23b) results in the increase of the number of resonant peaks per period to $(N - 1)$, and at $N \to \infty$ the peaks merge into the so-called allowed energy bands (frequently called just the “energy bands”) with average transparency $T \sim 1$, separated from similar bands in the adjacent periods of function $T(ka)$ by energy gaps48 where $T \to 0$. Notice the following important features of the pattern:
(i) at $N \to \infty$, the band/gap edges become sharp for any $\alpha$, and tend to fixed positions (determined by $\alpha$ but independent of $N$);
(ii) the stronger the well coupling (i.e. the smaller $\alpha$), the broader the allowed energy bands and the narrower the gaps between them.
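Both features may be illustrated with Eq. (191b) alone, under the assumption (natural in the light of Eq. (191a), but not yet proved at this point) that the allowed bands correspond to real $q$, i.e. to $|\cos ka + \alpha\sin ka| \le 1$. With $\alpha$ treated as a fixed parameter, as in Fig. 23, the right-hand side of Eq. (191b) equals $(1 + \alpha^2)^{1/2}\cos(ka - \tan^{-1}\alpha)$, so that inside the first period the gap occupies the interval $0 < ka < 2\tan^{-1}\alpha$. A minimal sketch, with hypothetical function names:

```python
import math

def cos_qa(ka, alpha):
    """Right-hand side of Eq. (2.191b)."""
    return math.cos(ka) + alpha * math.sin(ka)

def first_band(alpha):
    """Edges (in units of ka) of the allowed band inside the first period
    0 <= ka <= pi, assuming that the band is where |cos(qa)| <= 1."""
    return 2.0 * math.atan(alpha), math.pi

if __name__ == "__main__":
    for alpha in (0.3, 1.0, 3.0):
        lo, hi = first_band(alpha)
        print(alpha, "gap width / pi:", lo / math.pi,
              "band width / pi:", (hi - lo) / math.pi)
```

The edges do not depend on $N$, and the band narrows as $\alpha$ grows, in accordance with features (i) and (ii).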
Our previous discussion of the resonant tunneling gives us a clue for a semi-quantitative interpretation of this pattern: if $(N - 1)$ potential wells are weakly coupled by tunneling through the potential barriers separating them, the system's energy spectrum consists of groups of $(N - 1)$ metastable energy levels, each group being close to one of the unperturbed eigenenergies of the well. (According to Eq. (1.84), for our current example shown in Fig. 22, with its rectangular potential wells, these eigenenergies correspond to $k_na = \pi n$.)
Now let us recall that in the case $N = 2$, analyzed in the previous section, the eigenfunctions (169) and (175) differed only by the phase shift $\Delta\varphi$ between their localized components $\psi_R(x)$ and $\psi_L(x)$, with $\Delta\varphi = 0$ for one of them ($\psi_S$) and $\Delta\varphi = \pi$ for its counterpart. Hence it is natural to expect that for other $N$ as well, each metastable energy level corresponds to an eigenfunction that is a superposition of similar localized functions in each potential well, but with certain phase shifts $\Delta\varphi$ between them.
48 In solid-state (especially semiconductor) physics and electronics, the term bandgaps is more common.
49 This is a reasonable 1D model, for example, for solid-state crystals, whose samples may feature up to $\sim 10^9$ similar atoms or molecules in each direction of the crystal lattice.
Moreover, we may expect that at $N \to \infty$, i.e. for periodic structures,49 with
$$U(x + a) = U(x), \tag{2.192}$$
when the system does not have the ends that could affect its properties, the phase shifts $\Delta\varphi$ between the localized wavefunctions in all couples of adjacent potential wells should be equal, i.e.
$$\psi(x + a) = \psi(x)e^{i\Delta\varphi} \tag{2.193a}$$
for all $x$.50 This equality is the (1D version of the) much-celebrated Bloch theorem.51 Mathematical rigor aside,52 it is a virtually evident fact because the particle's density $w(x) = \psi^*(x)\psi(x)$, which has to be periodic in this $a$-periodic system, may be so only if $\Delta\varphi$ is constant. For what follows, it is more convenient to represent the real constant $\Delta\varphi$ in the form $qa$, so that the Bloch theorem takes the form
Bloch theorem: 1D version
$$\psi(x + a) = \psi(x)e^{iqa}. \tag{2.193b}$$
The physical sense of the parameter $q$ will be discussed in detail below, but we may immediately notice that according to Eq. (193b), an addition of $(2\pi/a)$ to this parameter yields the same wavefunction; hence all observables have to be $(2\pi/a)$-periodic functions of $q$.53
Now let us use the Bloch theorem to calculate the eigenfunctions and eigenenergies for the
infinite version of the system shown in Fig. 22, i.e. for an infinite set of delta-functional potential
barriers – see Fig. 24.
Fig. 2.24. The simplest periodic potential: an infinite Dirac comb (period a), with the points $x_j$ and $x_{j+1}$
marked just left of two adjacent barriers.
To start, let us consider two points separated by one period a: one of them, xj, just left of the
position of one of the barriers, and another one, xj+1, just left of the following barrier – see Fig. 24 again.
50 A reasonably fair classical image of $\Delta\varphi$ is the geometric angle between similar objects – e.g., similar paper
clips – attached at equal distances to a long, uniform rubber band. If the band’s ends are twisted, the twist is
equally distributed between the structure’s periods, representing the constancy of $\Delta\varphi$. (I have to confess that, due
to the lack of time, this was the only “lecture demonstration” in my Stony Brook QM courses.)
51 Named after F. Bloch who applied this concept to the wave mechanics in 1929, i.e. very soon after its
formulation. Note, however, that an equivalent statement in mathematics, called the Floquet theorem, has been
known since at least 1883.
52 I will recover this rigor in two steps. Later in this section, we will see that the function obeying Eq. (193) is
indeed a solution to the Schrödinger equation. However, to save time/space, it will be better for us to postpone
until Chapter 4 the proof that any eigenfunction of the equation, with periodic boundary conditions, obeys the
Bloch theorem. As a partial reward for this delay, that proof will be valid for an arbitrary spatial dimensionality.
53 The product $\hbar q$, which has the dimensionality of linear momentum, is called either the quasimomentum or
(especially in solid-state physics) the “crystal momentum” of the particle. Informally, it is very convenient (and
common) to use the name “quasimomentum” for the bare q as well, despite its evidently different dimensionality.
The eigenfunctions at each of the points may be represented as linear superpositions of two simple
waves $\exp\{\pm ikx\}$, and the amplitudes of their components should be related by a 2×2 transfer matrix T
of the potential fragment separating them. According to Eq. (132), this matrix may be found as the
product of the matrix (135) of one delta-functional barrier by the matrix (138) of one zero-potential
interval a:
$$\begin{pmatrix} A_{j+1} \\ B_{j+1} \end{pmatrix} = \mathrm{T}_a \mathrm{T}_\alpha \begin{pmatrix} A_j \\ B_j \end{pmatrix} = \begin{pmatrix} e^{ika} & 0 \\ 0 & e^{-ika} \end{pmatrix} \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix} \begin{pmatrix} A_j \\ B_j \end{pmatrix}. \tag{2.194}$$
However, according to the Bloch theorem (193b), the component amplitudes should be also related as
$$\begin{pmatrix} A_{j+1} \\ B_{j+1} \end{pmatrix} = e^{iqa} \begin{pmatrix} A_j \\ B_j \end{pmatrix} = \begin{pmatrix} e^{iqa} & 0 \\ 0 & e^{iqa} \end{pmatrix} \begin{pmatrix} A_j \\ B_j \end{pmatrix}. \tag{2.195}$$
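The condition of self-consistency of these two equations gives the following characteristic equation:
$$\left| \begin{pmatrix} e^{ika} & 0 \\ 0 & e^{-ika} \end{pmatrix} \begin{pmatrix} 1 - i\alpha & -i\alpha \\ i\alpha & 1 + i\alpha \end{pmatrix} - \begin{pmatrix} e^{iqa} & 0 \\ 0 & e^{iqa} \end{pmatrix} \right| = 0. \tag{2.196}$$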
In Sec. 5, we have already calculated the matrix product participating in this equation – see the
second operand in Eq. (140). Using it, we see that Eq. (196) is reduced to the same simple Eq. (191b)
that has jumped at us from the solution of the somewhat different (resonant tunneling) problem. Let us
explore that simple result in detail. First of all, the left-hand side of Eq. (191b) is a sinusoidal function of
the product qa with unit amplitude, while its right-hand side is a sinusoidal function of the product ka,
with amplitude $(1 + \alpha^2)^{1/2} > 1$ (see Fig. 25).
Fig. 2.25. The graphical representation of the characteristic equation (191b) for a fixed value of the
parameter $\alpha$ (the right-hand side plotted as $\cos qa$ versus $ka/\pi$). The ranges of ka that yield $|\cos qa| < 1$
correspond to allowed energy bands, while those with $|\cos qa| > 1$ correspond to energy gaps between them.
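The reduction of Eq. (196) to Eq. (191b) can be checked numerically. The following is only a minimal sketch, not part of the original derivation: the function names are hypothetical, and it relies on the fact that the product $\mathrm{T}_a\mathrm{T}_\alpha$ of Eq. (194) has a unit determinant, so that its two eigenvalues are $e^{\pm iqa}$ and its trace equals $2\cos qa$.

```python
import cmath
import math


def comb_period_matrix(ka, alpha):
    """Product T_a T_alpha of Eq. (2.194) as a nested 2x2 list (hypothetical helper)."""
    t_alpha = [[1 - 1j * alpha, -1j * alpha],
               [1j * alpha, 1 + 1j * alpha]]
    phase = cmath.exp(1j * ka)
    # T_a is diagonal: it multiplies the top row by exp(+ika)
    # and the bottom row by exp(-ika).
    return [[phase * t_alpha[0][0], phase * t_alpha[0][1]],
            [t_alpha[1][0] / phase, t_alpha[1][1] / phase]]


def cos_qa_from_matrix(ka, alpha):
    """cos(qa) implied by det(T_a T_alpha - exp(iqa) I) = 0.

    Since det(T_a T_alpha) = 1, the eigenvalues are exp(+iqa) and
    exp(-iqa), so their sum (the trace) equals 2 cos(qa).
    """
    t = comb_period_matrix(ka, alpha)
    return ((t[0][0] + t[1][1]) / 2).real


def cos_qa_closed_form(ka, alpha):
    """cos(ka) + alpha*sin(ka), i.e. Eq. (2.198) with alpha = beta/ka."""
    return math.cos(ka) + alpha * math.sin(ka)


if __name__ == "__main__":
    for ka, alpha in [(0.3, 0.5), (2.0, 1.5), (4.4, 0.1)]:
        print(ka, alpha,
              cos_qa_from_matrix(ka, alpha),
              cos_qa_closed_form(ka, alpha))
```

The two printed columns should agree to machine precision; values with magnitude above 1 signal that the chosen ka falls into an energy gap.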
As a result, within each half-period $\Delta(ka) = \pi$ of the right-hand side, there is an interval where
the magnitude of the right-hand side is larger than 1, so that the characteristic equation does not have a
real solution for q. These intervals correspond to the energy gaps (see Fig. 23 again), while the
complementary intervals of ka, where a real solution for q exists, correspond to the allowed energy
bands. In contrast, the parameter q can take any real values, so it is more convenient to plot the
eigenenergy $E = \hbar^2k^2/2m$ as the function of the quasimomentum $\hbar q$ (or, even more conveniently, of the
dimensionless parameter qa) rather than ka.54 Before doing that, we need to recall that the parameter $\alpha$,
defined by the last of Eqs. (78), depends on the wave vector k as well, so that if we vary q (and hence k),
it is better to characterize the structure by another, k-independent dimensionless parameter, for example
$$\beta \equiv \alpha\,(ka) = \frac{W}{\hbar^2/ma}, \tag{2.197}$$
so that our characteristic equation (191b) becomes
Dirac comb: q vs k
$$\cos qa = \cos ka + \beta\,\frac{\sin ka}{ka}. \tag{2.198}$$
Fig. 26 shows the plots of k and E, following from Eq. (198), as functions of qa, for a particular,
moderate value of the parameter $\beta$. The first evident feature of the pattern is its $2\pi$-periodicity
in the argument qa, which we have already predicted from the general Bloch theorem arguments.
(Due to this periodicity, the complete band/gap pattern may be studied, for example, on just one interval
$-\pi \le qa \le +\pi$, called the 1st Brillouin zone – the so-called reduced zone picture. For some applications,
however, it is more convenient to use the extended zone picture with $-\infty < qa < +\infty$ – see, e.g., the next
section.)
Fig. 2.26. (a) The “genuine” momentum $\hbar k$ of a particle in an infinite Dirac comb (Fig. 24), and (b) its
energy $E = \hbar^2k^2/2m$ (in the units of $E_0 \equiv \hbar^2/2ma^2$), as functions of the normalized quasimomentum $qa/\pi$, for a
particular value ($\beta = 3$) of the dimensionless parameter defined by Eq. (197). The 1st Brillouin zone is marked on
both panels. Arrows in the lower right corner of panel (b) illustrate the definition of energy band ($\Delta E_n$) and
energy gap ($\Delta_n$) widths.
54 A more important reason for taking q as the argument is that for a general periodic potential U(x), the particle’s
momentum $\hbar k$ is not uniquely related to E, while (according to the Bloch theorem) the quasimomentum $\hbar q$ is.
However, maybe the most important fact, clearly visible in Fig. 26, is that there is an infinite
number of energy bands, with different energies $E_n(q)$ for the same value of q. Mathematically, it is
evident from Eq. (198) – or alternatively from Fig. 25. Indeed, for each value of qa, there is a solution
ka to this equation on each half-period $\Delta(ka) = \pi$. Each of such solutions (see Fig. 26a) gives a specific
value of particle’s energy $E = \hbar^2k^2/2m$. A continuous set of similar solutions for various qa forms a
particular energy band.
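The statement that Eq. (198) has one solution ka on each half-period may be illustrated by a simple root search. This is just a sketch under stated assumptions: $\beta > 0$, plain bisection as the (swapped-in) numerical method, and hypothetical function names; energies are expressed in the units of $E_0 = \hbar^2/2ma^2$, so that $E/E_0 = (ka)^2$.

```python
import math


def rhs(ka, beta):
    """Right-hand side of Eq. (2.198)."""
    return math.cos(ka) + beta * math.sin(ka) / ka


def band_ka(n, qa, beta, iterations=200):
    """Root ka of Eq. (2.198) on the n-th half-period (n-1)*pi < ka <= n*pi.

    Assumes beta > 0, so that the gap sits at the lower end of the interval.
    """
    target = math.cos(qa)
    lo = (n - 1) * math.pi + 1e-9   # small offset avoids ka = 0 for n = 1
    hi = n * math.pi
    sign_lo = math.copysign(1.0, rhs(lo, beta) - target)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # Keep the half-interval on whose ends the mismatch changes sign.
        if math.copysign(1.0, rhs(mid, beta) - target) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def band_energy(n, qa, beta):
    """E_n(q) in units of E0 = hbar^2 / (2 m a^2)."""
    return band_ka(n, qa, beta) ** 2


if __name__ == "__main__":
    beta = 3.0   # the value used in Fig. 26
    for n in (1, 2, 3):
        print(n, band_energy(n, 0.0, beta), band_energy(n, math.pi, beta))
```

For each band number n, the two printed values are the energies at qa = 0 and qa = π, i.e. the two edges of the nth band.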
Since the energy band picture is one of the most practically important results of quantum
mechanics, it is imperative to understand its physics. It is natural to describe this physics in two different
ways in two opposite potential strength limits. In parallel, we will use this discussion to obtain simpler
expressions for the energy band/gap structure in each limit. An important advantage of this approach is
that both analyses may be carried out for an arbitrary periodic potential U(x) rather than for the
particular model shown in Fig. 24.
(i) Tight-binding approximation. This approximation works well when the eigenenergy $E_n$ of the
states quasi-localized at the energy profile minima is much lower than the height of the potential barriers
separating them – see Fig. 27. As should be clear from our discussion in Sec. 6, essentially the only role
of coupling between these states (via tunneling through the potential barriers separating the minima) is
to establish a certain phase shift $\Delta\varphi \equiv qa$ between the adjacent quasi-localized wavefunctions $u_n(x - x_j)$
and $u_n(x - x_{j+1})$.
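Fig. 2.27. The tight-binding approximation (schematically): a periodic potential U(x) with period a, the
quasi-localized wavefunctions $u_n(x - x_{j-1})$, $u_n(x - x_j)$, $u_n(x - x_{j+1})$ at the level $E_n$, the couplings $\delta_n$
between adjacent wells, and the points $x_0$ and $a - x_0$ used in Eq. (204).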
To describe this effect quantitatively, let us first return to the problem of two coupled wells
considered in Sec. 6, and recast the result (180), with restored eigenstate index n, as
$$\Psi_n(x, t) = \left[ a_R(t)\,\psi_R(x) + a_L(t)\,\psi_L(x) \right] \exp\left\{ -i\frac{E_n}{\hbar}t \right\}, \tag{2.199}$$
where the probability amplitudes $a_R$ and $a_L$ oscillate sinusoidally in time:
$$a_R(t) = \cos\frac{\delta_n}{\hbar}t, \qquad a_L(t) = i\sin\frac{\delta_n}{\hbar}t. \tag{2.200}$$
This evolution satisfies the following system of two equations whose structure is similar to Eq. (1.61a):
$$i\hbar\dot{a}_R = -\delta_n a_L, \qquad i\hbar\dot{a}_L = -\delta_n a_R. \tag{2.201}$$
Eq. (199) may be readily generalized to the case of many similar coupled wells:
$$\Psi_n(x, t) = \left[ \sum_j a_j(t)\,u_n(x - x_j) \right] \exp\left\{ -i\frac{E_n}{\hbar}t \right\}, \tag{2.202}$$
where $E_n$ are the eigenenergies and $u_n$ the eigenfunctions of each well. In the tight-binding limit, only
the adjacent wells are coupled, so that instead of Eq. (201) we should write an infinite system of similar
equations
$$i\hbar\dot{a}_j = -\delta_n a_{j-1} - \delta_n a_{j+1}, \tag{2.203}$$
for each well number j, where parameters $\delta_n$ describe the coupling between two adjacent potential wells.
Repeating the calculation outlined at the end of the last section for our new situation, for a smooth
potential we may get an expression essentially similar to the last form of Eq. (188):
Tight-binding limit: coupling energy
$$\delta_n = \frac{\hbar^2}{m}\, u_n(x_0)\, \frac{du_n}{dx}(a - x_0), \tag{2.204}$$
where $x_0$ is the distance between the well bottom and the middle of the potential barrier on the right of it
– see Fig. 27. The only substantial new feature of this expression in comparison with Eq. (188) is that
the sign of $\delta_n$ alternates with the level number n: $\delta_1 > 0$, $\delta_2 < 0$, $\delta_3 > 0$, etc. Indeed, the number of zeros
(and hence, “wiggles”) of the eigenfunctions $u_n(x)$ of any potential well increases as n – see, e.g., Fig.
1.8,55 so that the difference of the exponential tails of the functions, sneaking under the left and right
barriers limiting the well also alternates with n.
The infinite system of ordinary differential equations (203) enables solutions of many important
problems (such as the spread of the wavefunction that was initially localized in one well, etc.), but our
task right now is just to find its stationary states, i.e. the solutions proportional to $\exp\{-i(\varepsilon_n/\hbar)t\}$, where
$\varepsilon_n$ is a still unknown, q-dependent addition to the background energy $E_n$ of the nth energy level. To
satisfy the Bloch theorem (193) as well, such a solution should have the following form:
$$a_j(t) = a \exp\left\{ iqx_j - i\frac{\varepsilon_n}{\hbar}t + \text{const} \right\}. \tag{2.205}$$
Plugging this solution into Eq. (203) and canceling the common exponent, we get
Tight-binding limit: energy bands
$$E = E_n + \varepsilon_n = E_n - \delta_n\left( e^{iqa} + e^{-iqa} \right) = E_n - 2\delta_n \cos qa, \tag{2.206}$$
so that in this approximation, the energy band width $\Delta E_n$ (see Fig. 26b) equals $4|\delta_n|$.
The relation (206), whose validity is restricted to $|\delta_n| \ll E_n$, describes the lowest energy bands
plotted in Fig. 26b reasonably well. (For larger $\beta$, the agreement would be even better.) So, this
calculation explains what the energy bands really are: in the tight-binding limit they are best interpreted
as isolated well’s energy levels $E_n$, broadened into bands by the interwell interaction. Also, this result
gives clear proof that the energy band extremes correspond to $qa = 2\pi l$ and $qa = 2\pi(l + ½)$, with integer
l. Finally, the sign alternation of the coupling coefficient $\delta_n$ (204) explains why the energy maxima of one
band are aligned, on the qa axis, with energy minima of the adjacent bands – see Fig. 26.
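The step from Eq. (203) to Eq. (206) may be verified directly. The sketch below is only an illustration: it assumes a finite ring of wells with periodic boundary conditions (my assumption, standing in for the infinite chain), units with $\hbar$ dropped out by looking at stationary states, and a hypothetical function name.

```python
import cmath
import math


def bloch_check(num_wells, k_index, delta, a=1.0):
    """Test the Bloch ansatz a_j ~ exp(i q x_j) on a ring of coupled wells.

    For a stationary state, the left-hand side of Eq. (2.203) equals
    eps * a_j, so the ansatz works if -delta*(a_{j-1} + a_{j+1}) = eps * a_j
    with eps = -2*delta*cos(qa), as in Eq. (2.206).
    Returns (eps, largest mismatch over all wells).
    """
    q = 2 * math.pi * k_index / (num_wells * a)   # q allowed on the ring
    amps = [cmath.exp(1j * q * j * a) for j in range(num_wells)]
    eps = -2 * delta * math.cos(q * a)
    worst = 0.0
    for j in range(num_wells):
        left = amps[(j - 1) % num_wells]
        right = amps[(j + 1) % num_wells]
        coupled = -delta * (left + right)
        worst = max(worst, abs(coupled - eps * amps[j]))
    return eps, worst


if __name__ == "__main__":
    for k in range(0, 7):
        print(k, bloch_check(12, k, delta=0.1))
```

In this example the values of eps run from $-2\delta$ at qa = 0 to $+2\delta$ at qa = π, reproducing the band width $4|\delta_n|$.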
(ii) Weak-potential limit. Amazingly, the energy-band structure is also compatible with a
completely different physical picture that may be developed in the opposite limit. Let the particle’s
energy E be so high that the periodic potential U(x) may be treated as a small perturbation. Naively, in
55
Below, we will see several other examples of this behavior. This alternation rule is also described by the
Wilson-Sommerfeld quantization condition (110).
this limit we could expect a slightly and smoothly deformed parabolic dispersion relation $E = \hbar^2k^2/2m$.
However, if we are plotting the stationary-state energy as a function of q rather than k, we need to add
$2\pi l/a$, with an arbitrary integer l, to the argument. Let us show this by expanding all variables into the
1D-spatial Fourier series. For the potential energy U(x) that obeys Eq. (192), such an expansion is
straightforward:56
$$U(x) = \sum_{l''} U_{l''} \exp\left\{ -i\frac{2\pi x}{a}l'' \right\}, \tag{2.207}$$
where the summation is over all integers l”, from $-\infty$ to $+\infty$. However, for the wavefunction we should
show due respect to the Bloch theorem (193), which shows that strictly speaking, $\psi(x)$ is not periodic.
To overcome this difficulty, let us define another function:
$$u(x) \equiv \psi(x)\, e^{-iqx}, \tag{2.208}$$
and study its periodicity:
$$u(x + a) = \psi(x + a)\, e^{-iq(x + a)} = \psi(x)\, e^{-iqx} = u(x). \tag{2.209}$$
We see that the new function is a-periodic, and hence we can use Eqs. (208)-(209) to rewrite the Bloch
theorem in a different form:
1D Bloch theorem: alternative form
$$\psi(x) = u(x)\, e^{iqx}, \quad \text{with } u(x + a) = u(x). \tag{2.210}$$
Now it is safe to expand the periodic function u(x) exactly as U(x):
$$u(x) = \sum_{l'} u_{l'} \exp\left\{ -i\frac{2\pi x}{a}l' \right\}, \tag{2.211}$$
so that, according to Eq. (210),
$$\psi(x) = e^{iqx} \sum_{l'} u_{l'} \exp\left\{ -i\frac{2\pi x}{a}l' \right\} = \sum_{l'} u_{l'} \exp\left\{ i\left( q - \frac{2\pi}{a}l' \right)x \right\}. \tag{2.212}$$
The only nontrivial part of plugging Eqs. (207) and (212) into the stationary Schrödinger
equation (53) is how to handle the product term,
$$U(x)\psi = \sum_{l', l''} U_{l''} u_{l'} \exp\left\{ i\left[ q - \frac{2\pi}{a}(l' + l'') \right]x \right\}. \tag{2.213}$$
At fixed l’, we may change the summation over l” to that over $l \equiv l' + l''$ (so that $l'' \equiv l - l'$), and write:
$$U(x)\psi = \sum_l \exp\left\{ i\left( q - \frac{2\pi}{a}l \right)x \right\} \sum_{l'} u_{l'} U_{l - l'}. \tag{2.214}$$
Now plugging Eq. (212) (with the summation index l’ replaced with l) and Eq. (214) into the stationary
Schrödinger equation (53), and requiring the coefficients of each spatial exponent to match, we get an
infinite system of linear equations for $u_l$:
$$\sum_{l'} U_{l - l'} u_{l'} = \left[ E - \frac{\hbar^2}{2m}\left( q - \frac{2\pi}{a}l \right)^2 \right] u_l. \tag{2.215}$$
56 The benefits of such an unusual notation of the summation index (l” instead of, say, l) will be clear in a few
lines.
(Note that by this calculation we have essentially proved that the Bloch wavefunction (210) is indeed a
solution of the Schrödinger equation, provided that the quasimomentum q is selected in a way to make
the system of linear equation (215) compatible, i.e. is a solution of its characteristic equation.)
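The system (215) is an eigenvalue problem for a matrix with the elements $H_{ll'} = E_l\delta_{ll'} + U_{l-l'}$, and may be solved numerically after truncation. The code below is a hedged sketch rather than a reference implementation: the truncation limit `l_max`, the function names, the use of NumPy's `eigvalsh`, and the choice of the cosine potential as the example are all my assumptions. Energies are in the units of $E_0 = \hbar^2/2ma^2$.

```python
import numpy as np


def bloch_energies(qa, fourier_u, l_max=10):
    """Eigenenergies (in units of E0) of the truncated system (2.215).

    fourier_u maps the harmonic number n to U_n / E0. U_0 is taken as zero,
    and U_{-n} must equal conj(U_n), as required for a real potential.
    """
    ls = np.arange(-l_max, l_max + 1)
    size = ls.size
    h = np.zeros((size, size), dtype=complex)
    for i, l in enumerate(ls):
        # Diagonal: free-particle branch (hbar^2/2m)(q - 2*pi*l/a)^2 in E0 units.
        h[i, i] = (qa - 2 * np.pi * l) ** 2
        for j, lp in enumerate(ls):
            if i != j:
                h[i, j] = fourier_u.get(int(l - lp), 0.0)   # U_{l-l'}
    return np.linalg.eigvalsh(h)   # real eigenvalues, ascending order


def cosine_potential(amplitude):
    """Fourier coefficients of U(x) = A cos(2*pi*x/a): U_{+1} = U_{-1} = A/2."""
    return {1: amplitude / 2, -1: amplitude / 2}


if __name__ == "__main__":
    energies = bloch_energies(np.pi, cosine_potential(0.2))
    print("two lowest levels at qa = pi:", energies[:2])
    print("their splitting:", energies[1] - energies[0])
```

For a weak cosine potential, the splitting of the two lowest levels at qa = π comes out close to the amplitude A, i.e. to twice the magnitude of the first Fourier harmonic.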
So far, the system of equations (215) is an equivalent alternative to the initial Schrödinger
equation, for any potential’s strength.57 In the weak-potential limit, i.e. if all Fourier coefficients $U_n$ are
small,58 we can complete all the calculations analytically.59 Indeed, in the so-called 0th approximation
we can ignore all $U_n$, so that in order to have at least one $u_l$ different from 0, Eq. (215) requires that
$$E = E_l \equiv \frac{\hbar^2}{2m}\left( q - \frac{2\pi l}{a} \right)^2. \tag{2.216}$$
($u_l$ itself should be obtained from the normalization condition.) This result means that in this
approximation, the dispersion relation E(q) has an infinite number of similar quadratic branches
numbered by integer l – see Fig. 28.
Fig. 2.28. The energy band/gap picture in the weak potential limit ($\Delta_n \ll E^{(n)}$), with the shading showing the
1st Brillouin zone. (The plot shows E versus $qa/2\pi$, with the branches labeled by l, the gaps $\Delta_1$, $\Delta_2$ and
the energies $E^{(1)}$, $E^{(2)}$.)
On every branch, such eigenfunction has just one Fourier coefficient, i.e. is a monochromatic traveling
wave
$$\psi_l = u_l\, e^{ikx} = u_l \exp\left\{ i\left( q - \frac{2\pi l}{a} \right)x \right\}. \tag{2.217}$$
Next, the above definition of $E_l$ allows us to rewrite Eq. (215) in a more transparent form
$$\sum_{l' \ne l} U_{l - l'} u_{l'} = (E - E_l)\, u_l, \tag{2.218}$$
which may be formally solved for $u_l$:
$$u_l = \frac{1}{E - E_l} \sum_{l' \ne l} U_{l - l'} u_{l'}. \tag{2.219}$$
57 By the way, the system is very efficient for fast numerical solution of the stationary Schrödinger equation for
any periodic profile U(x), even though to describe potentials with large $U_n$, this approach may require taking into
account a correspondingly large number of Fourier amplitudes $u_l$.
58 Besides, possibly, a constant potential $U_0$, which, as was discussed in Chapter 1, may be always taken for the
energy reference. As a result, in the following calculations, I will take $U_0 = 0$ to simplify the formulas.
59 This method is so powerful that its multi-dimensional version is not much more complex than the 1D version
described here – see, e.g., Sec. 3.2 in the classical textbook by J. Ziman, Principles of the Theory of Solids, 2nd
ed., Cambridge U. Press, 1979.
This formula shows that if the Fourier coefficients $U_n$ are non-zero but small, the wavefunctions do
acquire other Fourier components (besides the main one, with the index corresponding to the branch
number), but these additions are all small, except in narrow regions near the points $E_l = E_{l'}$ where two
branches (216) of the dispersion relation E(q), with some specific numbers l and l’, cross. According to
Eq. (216), this happens when
$$\left( q - \frac{2\pi}{a}l \right) = -\left( q - \frac{2\pi}{a}l' \right), \tag{2.220}$$
i.e. at $q = q_m \equiv \pi m/a$ (with the integer $m \equiv l + l'$) 60 corresponding to
Weak-potential limit: energy gap positions
$$E_l = E_{l'} = \frac{\hbar^2}{2ma^2}\left[ \pi(l + l') - 2\pi l \right]^2 = \frac{\pi^2\hbar^2}{2ma^2}\, n^2 \equiv E^{(n)}, \tag{2.221}$$
with integer $n \equiv l - l'$. (According to their definitions, the index n is just the number of the branch
crossing on the energy scale, while the index m numbers the position of the crossing points on the q-axis
– see Fig. 28.) In such a region, E has to be close to both $E_l$ and $E_{l'}$, so that the denominator in just one
of the infinite number of terms in Eq. (219) is very small, making the term substantial despite the
smallness of $U_n$. Hence we can take into account only one term in each of the sums (written for l and l’):
$$U_n u_{l'} = (E - E_l)\, u_l, \qquad U_{-n} u_l = (E - E_{l'})\, u_{l'}. \tag{2.222}$$
Taking into account that for any real function U(x), the Fourier coefficients in its Fourier expansion
(207) have to be related as $U_{-n} = U_n^*$, Eq. (222) yields the following simple characteristic equation
$$\begin{vmatrix} E - E_l & -U_n \\ -U_n^* & E - E_{l'} \end{vmatrix} = 0, \tag{2.223}$$
with the following solution:
Weak-potential limit: level anticrossing
$$E_\pm = E_{ave} \pm \left[ \left( \frac{E_l - E_{l'}}{2} \right)^2 + U_n U_n^* \right]^{1/2}, \quad \text{with } E_{ave} \equiv \frac{E_l + E_{l'}}{2} = E^{(n)}. \tag{2.224}$$
According to Eq. (216), close to the branch crossing point $q_m = \pi(l + l')/a$, the fraction
participating in this result may be approximated as61
$$\frac{E_l - E_{l'}}{2} \approx \gamma\tilde{q}, \quad \text{with } \gamma \equiv \frac{dE_l}{dq}\bigg|_{q = q_m}, \quad |\gamma| = \frac{\pi\hbar^2 |n|}{ma} = \frac{2aE^{(n)}}{\pi |n|}, \quad \text{and } \tilde{q} \equiv q - q_m, \tag{2.225}$$
60 Let me hope that the difference between this new integer and the particle’s mass, both called m, is absolutely
clear from the context.
61 Physically, $|\gamma|/\hbar = (\pi\hbar |n|/a)/m = \hbar k^{(n)}/m$ is just the speed of a free classical particle with energy $E^{(n)}$.
while the parameters $E_{ave} = E^{(n)}$ and $U_nU_n^* = |U_n|^2$ do not depend on $\tilde{q}$ (in the same linear
approximation in $\tilde{q}$), i.e. on the distance from the central point $q_m$. This is why Eq. (224) may be plotted
as the famous level anticrossing (also called “avoided crossing”, or “intended crossing”, or “non-crossing”)
diagram (Fig. 29), with the energy gap width $\Delta_n$ equal to $2|U_n|$, i.e. just twice the magnitude of the n-th
Fourier harmonic of the periodic potential U(x). Such anticrossings are also clearly visible in Fig. 28, which
shows the result of the exact solution of Eq. (198) for the particular case $\beta = 0.5$.62
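(The plot shows $E_\pm - E^{(n)}$ versus $(E_l - E_{l'})/2 = \gamma(q - q_m)$; the branches $E_+$ and $E_-$ are separated by
$2|U_n|$ at the central point.)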
Fig. 2.29. The level anticrossing diagram.
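As a quick numerical illustration of Eq. (224), the following sketch evaluates the two anticrossing branches for given unperturbed energies. The function name and the sample numbers are hypothetical, chosen only for this illustration.

```python
import math


def anticrossing(e_l, e_lp, u_n):
    """Return (E_minus, E_plus) of Eq. (2.224).

    e_l, e_lp : unperturbed branch energies E_l and E_l'
    u_n       : Fourier harmonic U_n coupling them (may be complex)
    """
    e_ave = 0.5 * (e_l + e_lp)
    half_split = math.sqrt((0.5 * (e_l - e_lp)) ** 2 + abs(u_n) ** 2)
    return e_ave - half_split, e_ave + half_split


if __name__ == "__main__":
    u = 0.05
    for detuning in (-1.0, -0.1, 0.0, 0.1, 1.0):
        # Branches E_l and E_l' move in opposite directions near the crossing.
        print(detuning, anticrossing(1.0 + detuning, 1.0 - detuning, u))
```

At zero detuning the two values differ by exactly $2|U_n|$, while far from the crossing they approach the unperturbed energies.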
We will run into the anticrossing diagram again and again in the course, notably at the discussion
of spin-½ and other two-level systems. It is also repeatedly met in classical mechanics, for example at
the calculation of frequencies of coupled oscillators.63,64 In our current case of the weak potential limit
of the band theory, the diagram describes the interaction of two traveling de Broglie waves (217), with
oppositely directed wave vectors $k = \mp\pi n/a$ (as Eq. (217) gives for the branches l and l’ at $q = q_m$),
via the (l – l’)th (i.e. the nth) Fourier harmonic of the potential profile U(x).65 This effect exists also in
the classical wave theory and is known as the Bragg reflection, describing, for example, the 1D model of the
X-ray reflection by a crystal lattice (see, e.g. Fig. 1.5) in the limit of weak interaction between the
incident wave and each atom.
The anticrossing diagram shows that rather counter-intuitively, even a weak periodic potential
changes the topology of the initially parabolic dispersion relation radically, connecting its different
branches, and thus creating the energy gaps. Let me hope that the reader has enjoyed the elegant
description of this effect, discussed above, as well as one more illustration of the wonderful ability of
physics to give completely different interpretations (and different approximate approaches) to the same
effect in opposite limits.
So, we have explained analytically (though only in two limits) the particular band structure
shown in Fig. 26. Now we may wonder how general this structure is, i.e. how much of it is independent
of the Dirac comb model (Fig. 24). For that, let us represent the band pattern, such as that shown in Fig.
62 From that figure, it is also clear that in the weak potential limit, the width $\Delta E_n$ of the nth energy band is just
$E^{(n)} - E^{(n-1)}$ – see Eq. (221). Note that this is exactly the distance between the adjacent energy levels of the simplest
1D potential well of infinite depth – cf. Eq. (1.85).
63 See, e.g., CM Sec. 6.1 and in particular Fig. 6.2.
64 Actually, we could readily obtain this diagram in the previous section, for the system of two weakly coupled
potential wells (Fig. 21), if we assumed the wells to be slightly dissimilar.
65 In the language of the de Broglie wave scattering, to be discussed in Sec. 3.3, Eq. (220) may be interpreted as
the condition that each of these waves, scattered on the nth Fourier harmonic of the potential profile,
constructively interferes with its counterpart, leading to a strong enhancement of their interaction.
26b (plotted for a particular value of the parameter $\beta$, characterizing the potential barrier strength) in a
more condensed form, which would allow us to place the results for a range of $\beta$ values on a single
comprehensible plot. The way to do this should be clear from Fig. 26b: since the dependence of energy
on the quasimomentum in each energy band is not too eventful, we may plot just the highest and the
smallest values of the particle’s energy $E = \hbar^2k^2/2m$ as functions of $\beta \equiv maW/\hbar^2$ – see Fig. 30, which
may be obtained from Eq. (198) with qa = 0 and qa = $\pi$.
Fig. 2.30. Characteristic curves of the Schrödinger equation for the infinite Dirac comb (Fig. 24): the
band-edge energies $E/E_0$ as functions of $\beta$, with the allowed bands and the gaps between them labeled.
These plots (in mathematics, commonly called characteristic curves, while in applied physics,
band-edge diagrams) show, first of all, that at small $\beta$, all energy gap widths are equal and proportional
to this parameter, and hence to W. This feature is in a full agreement with the main conclusion (224) of
our general analysis of the weak-potential limit, because for the Dirac comb potential (Fig. 24),
$$U(x) = W \sum_{j = -\infty}^{+\infty} \delta(x - ja) + \text{const}, \tag{2.226}$$
all Fourier harmonic amplitudes, defined by Eq. (207), are equal by magnitude: $|U_l| = W/a$. As $\beta$ is
further increased, the gaps grow and the allowed energy bands shrink, but rather slowly. This is also
natural, because, as Eq. (79) shows, transparency T of the delta-functional barriers separating the
quasi-localized states (and hence the coupling parameters $\delta_n \propto T^{1/2}$ participating in the general
tight-binding limit’s theory) decrease with $W \propto \beta$ very gradually.
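The small-$\beta$ behavior of the gaps may be checked against Eq. (198) directly. Since $|U_l| = W/a = 2\beta E_0$ for the Dirac comb, the weak-potential result $\Delta_n = 2|U_n|$ predicts $\Delta_n \approx 4\beta E_0$ for all n. The sketch below (hypothetical function names, plain bisection, $\beta > 0$ assumed) computes the gap widths from the band-edge conditions qa = 0 and qa = π.

```python
import math


def rhs(ka, beta):
    """Right-hand side of Eq. (2.198)."""
    return math.cos(ka) + beta * math.sin(ka) / ka


def gap_width(n, beta, iterations=200):
    """Width of the n-th gap of the Dirac comb, in units of E0 = hbar^2/(2 m a^2).

    The gap starts at ka = n*pi (top of band n) and ends at the root of
    rhs(ka) = (-1)^n inside (n*pi, (n+1)*pi), the bottom of band n+1.
    """
    sign = (-1) ** n
    lo = n * math.pi + 1e-6      # assumed to lie inside the gap (beta not too small)
    hi = (n + 1) * math.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sign * rhs(mid, beta) > 1.0:   # |cos qa| > 1: still inside the gap
            lo = mid
        else:
            hi = mid
    ka_edge = 0.5 * (lo + hi)
    return ka_edge ** 2 - (n * math.pi) ** 2


if __name__ == "__main__":
    for beta in (0.05, 0.5, 3.0):
        print(beta, [gap_width(n, beta) for n in (1, 2, 3)], "4*beta =", 4 * beta)
```

At small $\beta$ the three printed gap widths are nearly equal and close to $4\beta$; at larger $\beta$ they grow more slowly than this linear estimate.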
These features may be compared with those for more realistic and relatively simple periodic
functions U(x), for example the sinusoidal potential $U(x) = A\cos(2\pi x/a)$ – see Fig. 31a. For this
potential, the stationary Schrödinger equation (53) takes the following form:
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + A\cos\frac{2\pi x}{a}\,\psi = E\psi. \tag{2.227}$$
By introduction of dimensionless variables66
$$\xi \equiv \frac{\pi x}{a}, \qquad \alpha \equiv \frac{E}{E^{(1)}}, \qquad 2\beta \equiv \frac{A}{E^{(1)}}, \tag{2.228}$$
where $E^{(1)}$ is defined by Eq. (221) with n = 1, Eq. (227) is reduced to the canonical form of the well-
studied Mathieu equation67
Mathieu equation
$$\frac{d^2\psi}{d\xi^2} + (\alpha - 2\beta\cos 2\xi)\,\psi = 0. \tag{2.229}$$
66 Note that this definition of $\beta$ is quantitatively different from that for the Dirac comb (226), but in both cases,
this parameter is proportional to the amplitude of the potential modulation.
Fig. 2.31. Two other simple periodic potential profiles: (a) the sinusoidal (“Mathieu”) potential, with
amplitude A and period a, and (b) the Kronig-Penney potential, with rectangular barriers of height $U_0$ and
width d, repeated with period a.
Figure 32 shows the characteristic curves of this equation. We see that now at small $\beta$ the first
energy gap grows much faster than the higher ones: $\Delta_n \propto \beta^n$. This feature is in accord with the weak-
coupling result $\Delta_1 = 2|U_1|$, which is valid only in the linear approximation in $U_n$, because for the Mathieu
potential, $U_l = A(\delta_{l,+1} + \delta_{l,-1})/2$. Another clearly visible feature is the exponentially fast shrinkage of the
allowed energy bands at $2\beta > \alpha$ (in Fig. 32, on the right from the dashed line), i.e. at E < A. It may be
readily explained by our tight-binding approximation result (206): as soon as the eigenenergy drops
significantly below the potential maximum $U_{max} = A$ (see Fig. 31a), the quantum states in the adjacent
potential wells are connected only by tunneling through relatively high potential barriers separating
these wells, so that the coupling amplitudes $\delta_n$ become exponentially small – see, e.g., Eq. (189).
Fig. 2.32. Characteristic curves of the Mathieu equation ($\alpha$ versus $\beta$, with the allowed bands and gaps
labeled). The dashed line corresponds to the equality $\alpha = 2\beta$, i.e. $E = A \equiv U_{max}$, separating the regions of
under-barrier tunneling and over-barrier motion. Adapted from Fig. 28.2.1 at
http://dlmf.nist.gov. (Contribution by US Government, not subject to copyright).
Another simple periodic profile is the Kronig-Penney potential, shown in Fig. 31b, which gives
relatively simple analytical expressions for the characteristic curves. Its advantage is a more realistic law
of the decrease of the Fourier harmonics $U_l$ at l >> 1, and hence of the energy gaps in the weak-potential
limit:
$$\Delta_n \approx 2|U_n| \propto \frac{U_0}{n}, \quad \text{at } E \sim E^{(n)} \gg U_0. \tag{2.230}$$
67 This equation, first studied in the 1860s by É. Mathieu in the context of a rather practical problem of vibrating
elliptical drumheads (!), has many other important applications in physics and engineering, notably including the
parametric excitation of oscillations – see, e.g., CM Sec. 5.5.
Leaving a detailed analysis of the Kronig-Penney potential for the reader’s exercise, let me
conclude this section by addressing the effect of potential modulation on the number of eigenstates in
1D systems of a large but finite length $l \gg a, k^{-1}$. Surprisingly, the Bloch theorem makes the analysis of
this problem elementary, for arbitrary U(x). Indeed, let us assume that l is comprised of an integer
number of periods a, and its ends are described by similar boundary conditions – both assumptions
evidently inconsequential for l >> a. Then, according to Eq. (210), the boundary conditions impose, on
the quasimomentum q, exactly the same quantization condition as we had for k for a free 1D motion.
Hence, instead of Eq. (1.100), we can write
1D number of states
$$dN = \frac{l}{2\pi}\,dq, \tag{2.231}$$
with the corresponding change of the summation rule:
$$\sum_q f(q) \to \frac{l}{2\pi}\int f(q)\,dq. \tag{2.232}$$
As a result, the density of states in the 1D q-space, $dN/dq = l/2\pi$, does not depend on the
potential profile at all! Note, however, that the profile does affect the density of states on the energy
scale, dN/dE. As an extreme example, on the bottom and at the top of each energy band we have
$dE/dq \to 0$, and hence
$$\frac{dN}{dE} = \frac{dN}{dq}\bigg/\left|\frac{dE}{dq}\right| = \frac{l}{2\pi}\bigg/\left|\frac{dE}{dq}\right| \to \infty. \tag{2.233}$$
This effect of state concentration at the band/gap edges (which survives in higher spatial
dimensionalities as well) has important implications for the operation of several important electronic
and optical devices, in particular semiconductor lasers and light-emitting diodes.
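Both statements, the profile-independent number of states and their concentration near the band edges, may be illustrated by a direct count. The sketch below makes several assumptions of my own: periodic boundary conditions on a sample of `num_periods` periods, the tight-binding dispersion (206) as the example band, and hypothetical function names.

```python
import math


def band_levels(num_periods, e_n=0.0, delta=1.0):
    """Energies of one band for a sample of length l = num_periods * a.

    Periodic boundary conditions quantize qa in steps of 2*pi/num_periods,
    so the 1st Brillouin zone holds exactly num_periods = l/a states,
    in agreement with dN = (l / 2 pi) dq. The band shape is taken from
    Eq. (2.206): E = E_n - 2*delta*cos(qa).
    """
    levels = []
    for k in range(num_periods):
        qa = 2 * math.pi * k / num_periods - math.pi
        levels.append(e_n - 2 * delta * math.cos(qa))
    return levels


def count_in_window(levels, lo, hi):
    """Number of levels with lo <= E <= hi."""
    return sum(1 for e in levels if lo <= e <= hi)


if __name__ == "__main__":
    levels = band_levels(1000)
    print("states in the band:", len(levels))
    print("near the band bottom:", count_in_window(levels, -2.0, -1.6))
    print("near the band center:", count_in_window(levels, -0.2, 0.2))
```

The two energy windows have the same width, but the one adjacent to the band edge holds several times more states, as Eq. (233) suggests.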
2.8. Periodic systems: Particle dynamics

The band structure of the energy spectrum of a particle moving in a periodic potential has
profound implications not only for its density of states but also for its dynamics. Indeed, let us consider
the simplest case of a wave packet composed of the Bloch functions (210), all belonging to the same
(say, nth) energy band. Similarly to Eq. (27) for a free particle, we can describe such a packet as
$$\Psi(x, t) = \int a_q\, u_q(x)\, e^{i[qx - \omega(q)t]}\, dq, \tag{2.234}$$
where the a-periodic functions u(x), defined by Eq. (208), are now indexed to emphasize their
dependence on the quasimomentum, and $\omega(q) \equiv E_n(q)/\hbar$ is the function of q describing the shape of the
corresponding energy band – see, e.g., Fig. 26b or Fig. 28. If the packet is narrow in the q-space, i.e. if
the width $\delta q$ of the distribution $a_q$ is much smaller than all the characteristic q-scales of the dispersion
relation $\omega(q)$, in particular of $\pi/a$, we may simplify Eq. (234) exactly as it was done in Sec. 2 for a free
particle, despite the presence of the periodic factors $u_q(x)$ under the integral. In the linear approximation
of the Taylor expansion, we get a full analog of Eq. (32), but now with q rather than k, and
$$v_{gr} = \frac{d\omega}{dq}\bigg|_{q = q_0}, \quad \text{and} \quad v_{ph} = \frac{\omega}{q}\bigg|_{q = q_0}, \tag{2.235}$$
where $q_0$ is the central point of the quasimomentum distribution. Despite the formal similarity with Eqs.
(33) for the free particle, this result is much more eventful. For example, as evident from the dispersion
relation’s topology (see Figs. 26b, 28), the group velocity vanishes not only at q = 0, but at all values of
q that are multiples of $(\pi/a)$, at the bottom and on the top of each energy band. Even more intriguing is
that the group velocity’s sign changes periodically with q.
This group velocity alternation leads to fascinating, counter-intuitive phenomena if a particle
placed in a periodic potential is subjected to an additional external force F(t). (For electrons in a
crystal, this may be, for example, the force of the applied electric field.) Let the force be relatively weak,
so that the product Fa (i.e. the scale of the energy increment from the additional force per one lattice
period) is much smaller than both relevant energy scales of the dispersion relation E(q) – see Fig. 26b:
$$Fa \ll \Delta E_n, \Delta_n. \tag{2.236}$$
This strong inequality allows one to neglect the force-induced interband transitions, so that the wave
packet (234) includes the Bloch eigenfunctions belonging to only one (initial) energy band at all times.
For the time evolution of its center $q_0$, theory yields68 an extremely simple equation of motion
Time evolution of quasimomentum
$$\dot{q}_0 = \frac{1}{\hbar}F(t). \tag{2.237}$$
This equation is physically very transparent: it is essentially the 2nd Newton law for the time evolution
of the quasimomentum $\hbar q$ under the effect of the additional force F(t) only, excluding the periodic force
$-\partial U(x)/\partial x$ of the background potential U(x). This is very natural, because as Eq. (210) implies, $\hbar q$ is
essentially the particle’s momentum averaged over the potential’s period, and the periodic force effect
drops out at such an averaging.
Despite the simplicity of Eq. (237), the results of its solution may be highly nontrivial. First, let
us use Eqs. (235) and (237) to find the instant group acceleration of the particle (i.e. the acceleration of
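its wave packet’s envelope):
$$a_{\rm gr} \equiv \frac{dv_{\rm gr}}{dt} = \frac{d}{dt}\frac{d\omega(q_0)}{dq_0} = \frac{d}{dq_0}\frac{d\omega(q_0)}{dq_0}\,\frac{dq_0}{dt} = \frac{d^2\omega(q_0)}{dq_0^2}\,\frac{dq_0}{dt} = \frac{1}{\hbar}\left.\frac{d^2\omega}{dq^2}\right|_{q=q_0}F(t). \tag{2.238}$$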
This means that the second derivative of the dispersion relation ω(q) (specific for each energy band)
plays the role of the effective reciprocal mass of the particle at this particular value of q0:
$$m_{\rm ef} = \frac{\hbar}{d^2\omega/dq^2} = \frac{\hbar^2}{d^2E_n/dq^2}. \tag{2.239}$$
[Effective mass]
For the particular case of a free particle, for which Eq. (216) is exact, this expression is reduced to the
original (and constant) mass m, but generally, the effective mass depends on the wave packet’s
68 The proof of Eq. (237) is not difficult, but becomes more compact in the bra-ket formalism, to be discussed in
Chapter 4. This is why I recommend to the reader its proof as an exercise after reading that chapter. For a
generalization of this theory to the case of essential interband transitions see, e.g., Sec. 55 in E. Lifshitz and L.
Pitaevskii, Statistical Physics, Part 2, Pergamon, 1980.
momentum. According to Eq. (239), at the bottom of any energy band, mef is always positive but
depends on the strength of the particle’s interaction with the periodic potential. In particular, according
to Eq. (206), in the tight-binding limit, the effective mass is very large:
$$\left|m_{\rm ef}\right|_{q=(\pi/a)n} = \frac{\hbar^2}{2\delta_n a^2} = m\,\frac{E^{(1)}}{\pi^2\delta_n} \gg m. \tag{2.240}$$
On the contrary, in the weak-potential limit, the effective mass is close to m at most points of each
energy band, but at the edges of the (narrow) bandgaps, it is much smaller. Indeed, expanding Eq. (224)
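in the Taylor series near point q = qm, we get
$$E_\pm\big|_{E\approx E^{(n)}} - E_{\rm ave} \approx \pm\left[|U_n| + \frac{1}{2|U_n|}\left(\frac{dE_l}{dq}\right)^2_{q=q_m}\tilde{q}^2\right] = \pm\left[|U_n| + \frac{\gamma^2}{2|U_n|}\tilde{q}^2\right], \tag{2.241}$$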
where γ and q̃ are defined by Eq. (225), so that
$$\left|m_{\rm ef}\right|_{q=q_m} = |U_n|\frac{\hbar^2}{\gamma^2} = m\,\frac{|U_n|}{2E^{(n)}} \ll m. \tag{2.242}$$
The effective mass effects in real atomic crystals may be very significant. For example, the
charge carriers in silicon have mef ≈ 0.19 me in the lowest, normally-empty energy band (traditionally
called the conduction band), and mef ≈ 0.98 me in the adjacent lower, normally-filled valence band. In
some semiconducting compounds, the conduction-band mass may be even smaller – down to 0.0145 me
in InSb!
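As a hedged numerical illustration of Eq. (239), the sketch below evaluates the effective mass for a model tight-binding band E(q) = Ec – 2δ cos(qa) using a central finite difference. The band shape, the units (ħ = a = 1), the value of δ, and the names `band_energy` and `effective_mass` are assumptions made for this example only; they are not taken from the text.

```python
import math

HBAR = 1.0   # assumed units: hbar = 1
A = 1.0      # assumed lattice period a = 1
DELTA = 0.1  # assumed tight-binding parameter delta (bandwidth is 4*DELTA)

def band_energy(q, e_center=0.0):
    """Model tight-binding dispersion E(q) = E_c - 2*delta*cos(q*a)."""
    return e_center - 2.0 * DELTA * math.cos(q * A)

def effective_mass(q, dq=1e-4):
    """m_ef = hbar^2 / (d^2E/dq^2), Eq. (2.239), via a central difference."""
    d2e = (band_energy(q + dq) - 2.0 * band_energy(q) + band_energy(q - dq)) / dq**2
    return HBAR**2 / d2e

m_bottom = effective_mass(0.0)        # bottom of the model band
m_top = effective_mass(math.pi / A)   # top of the model band
print(m_bottom, m_top)
```

For this model band, the two values have equal magnitudes ħ²/2δa² and opposite signs, in agreement with the first form of Eq. (240) and with the negative effective mass at the band's top.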
However, the effective mass’ magnitude is not the most surprising effect. A more fascinating
corollary of Eq. (239) is that on the top of each energy band the effective mass is negative – please
revisit Figs. 26b, 28, and 29. This means that the particle (or more strictly, its wave packet’s
envelope) is accelerated in the direction opposite to the applied force. This is exactly what electronic
engineers, working with electrons in semiconductors, call holes, characterizing them by a positive mass
|mef|, but compensating this sign change by taking their charge e positive. If the particle stays in close
vicinity of the energy band’s top (say, due to frequent scattering effects, typical for the semiconductors
used in engineering practice), such a double sign flip does not lead to an error in calculations of the hole’s
dynamics, because the electric field’s force is proportional to the particle’s charge, so that the particle’s
acceleration agr is proportional to the charge-to-mass ratio.69
However, in some phenomena such a simple representation is unacceptable.70 For example, let us
form a narrow wave packet at the bottom of the lowest energy band,71 and then exert on it a constant
force F > 0 – say, due to a constant external electric field directed along the x-axis. According to Eq.
(237), this force would lead to linear growth of q0 in time, so that in the quasimomentum space, the
69 More discussion of this issue may be found in SM Sec. 6.4.

70 The balance of this section describes effects that are not discussed in most quantum mechanics textbooks.
Though, in my opinion, every educated physicist should be aware of them, some readers may skip them at the
first reading, jumping directly to the next Sec. 9.
71 Physical intuition tells us (and the theory of open systems, to be discussed in Chapter 7, confirms) that this may
be readily done, for example, by weakly coupling the system to a relatively low-temperature environment, and
letting it relax to the lowest possible energy.
packet’s center would slide, with a constant speed, along the q axis – see Fig. 33a. Close to the energy
band’s bottom, this motion would correspond to a positive effective mass (possibly, somewhat different
from the genuine particle’s mass m), and hence be close to the free particle’s acceleration. However, as
soon as q0 has reached the inflection point where d²E1/dq² = 0, the effective mass, and hence its
acceleration (238) change signs to negative, i.e. the packet starts to slow down (in the direct space),
while still moving ahead with the same velocity in the quasimomentum space. Finally, at the energy
band’s top, the particle stops at a certain xmax, while continuing to move forward in the q-space.
(a) E (b)
E (q )
E 2 (q) 1 x 0   1 / F
1 E1
E1 E1 ( q )
x0
 0  a 0
qa
x max  E1 / F
Fig. 2.33. The Bloch oscillations (red lines) and the Landau-Zener tunneling (blue arrows)
represented in: (a) the reciprocal space of q, and (b) the direct space. On panel (b), the tilted gray
strips show the allowed energy bands, while the bold red lines, the Wannier-Stark ladder’s steps.
Now we have two alternative ways to look at the further time evolution of the wave packet along
the quasimomentum’s axis. From the extended zone picture (which is the simplest for this analysis, see
Fig. 33a),72 we may say that the particle crosses the 1st Brillouin zone boundary and continues to go
forward in q-space, i.e. down the lowest energy band. According to Eq. (235), this region (up to the next
energy minimum at qa = 2π) corresponds to a negative group velocity. After q0 has reached that
minimum, the whole process repeats again – and again, and again.
These are the famous Bloch oscillations – the effect which had been predicted, by the same F.
Bloch, as early as 1929 but evaded experimental observation until the 1980s (see below) due to the
strong scattering effects in real solid-state crystals. The time period of the oscillations may be readily
found from Eq. (237):
$$\Delta t_B = \frac{\Delta q}{dq/dt} = \frac{2\pi/a}{F/\hbar} = \frac{2\pi\hbar}{Fa}, \tag{2.243}$$
so that their frequency may be expressed by a very simple formula
72 This phenomenon may be also discussed from the point of view of the reduced zone picture, but then it requires
the introduction of instant jumps between the Brillouin zone boundary points (see the dashed red line in Fig. 33)
that correspond to physically equivalent states of the particle. Evidently, for the description of this particular
phenomenon, this language is more artificial.
$$\omega_B \equiv \frac{2\pi}{\Delta t_B} = \frac{Fa}{\hbar}, \tag{2.244}$$
[Bloch oscillations: frequency]
and hence is independent of any peculiarities of the energy band/gap structure.

The direct-space motion of the wave packet’s center x0(t) during the Bloch oscillation process
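may be analyzed by integrating the first of Eqs. (235) over some time interval Δt, and using Eq. (237):
$$\Delta x_0(\Delta t) \equiv \int_0^{\Delta t}v_{\rm gr}\,dt = \int_0^{\Delta t}\frac{d\omega(q_0)}{dq_0}\,dt = \int_{t=0}^{t=\Delta t}\frac{d\omega(q_0)}{dq_0/dt} = \frac{\hbar}{F}\int_{t=0}^{t=\Delta t}d\omega = \frac{\hbar}{F}\,\Delta\omega(q_0). \tag{2.245}$$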
If the interval Δt is equal to the Bloch oscillation period ΔtB (243), the initial and final values of E(q0) =
ħω(q0) are equal, giving Δx0 = 0: at the end of the period, the wave packet returns to its initial position in
space. However, if we carry out this integration only from the smallest to the largest values of ω(q0), i.e.
the adjacent points where the group velocity vanishes, we get the following Bloch oscillation swing:
$$\Delta x_{\max} = \frac{\hbar}{F}\left(\omega_{\max} - \omega_{\min}\right) \equiv \frac{\Delta E_1}{F}. \tag{2.246}$$
[Bloch oscillations: spatial swing]
This simple result may be interpreted using an alternative energy diagram (Fig. 33b), which
results from the following arguments. The additional force F may be described not only via the 2nd
Newton law’s version (237), but, alternatively, by its contribution –Fx to the Gibbs potential energy73
$$U_\Sigma(x) = U(x) - Fx. \tag{2.247}$$
The exact solution of the Schrödinger equation (61) with such a potential may be hard to find directly,
but if the force F is sufficiently weak, as we are assuming throughout this discussion, the second term in
Eq. (247) may be considered as a constant on the scale of a << Δxmax. In this case, our quantum-mechanical
treatment of the periodic potential U(x) is still virtually correct, but with an energy shift
depending on the “global” position x0 of the packet’s center. In this approximation, the total energy of
the wave packet is
$$E_\Sigma = E(q_0) - Fx_0. \tag{2.248}$$
In a plot of such energy as a function of x0 (Fig. 33b), the energy dependence on q0 is hidden, but
as was discussed above, it is rather uneventful and may be well characterized by the position of band-gap
edges on the energy axis.74 In this representation, the Bloch oscillations keep the full energy EΣ of
the particle constant, i.e. follow a horizontal line in Fig. 33b, limited by the classical turning points
corresponding to the bottom and the top of the allowed energy band. The distance Δxmax between these
points is evidently given by Eq. (246).
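The period (243) and the swing (246) may be checked with a short numerical sketch that integrates Eqs. (235) and (237) for a constant force. The model band E(q) = –2δ cos(qa) (for which ΔE1 = 4δ), the parameter values, the units ħ = a = 1, and the function names are assumptions of this example, not part of the text.

```python
import math

HBAR, A = 1.0, 1.0   # assumed units: hbar = a = 1
DELTA = 0.1          # assumed tight-binding parameter; bandwidth = 4*DELTA
F = 0.01             # assumed weak constant force, F*A << bandwidth

def group_velocity(q):
    """v_gr = (1/hbar) dE/dq for the model band E(q) = -2*delta*cos(q*a)."""
    return 2.0 * DELTA * A * math.sin(q * A) / HBAR

def packet_trajectory(t_end, steps=20000):
    """Integrate dq0/dt = F/hbar (Eq. 2.237) and dx0/dt = v_gr (Eq. 2.235)."""
    dt = t_end / steps
    x, xs = 0.0, [0.0]
    for i in range(steps):
        q_mid = F * (i + 0.5) * dt / HBAR   # midpoint rule, with q0(0) = 0
        x += group_velocity(q_mid) * dt
        xs.append(x)
    return xs

t_bloch = 2.0 * math.pi * HBAR / (F * A)   # Bloch period, Eq. (2.243)
xs = packet_trajectory(t_bloch)
swing = max(xs) - min(xs)
print(swing, 4.0 * DELTA / F, xs[-1])
```

In this model, the computed swing should be close to ΔE1/F = 4δ/F, and the packet's center should return to its initial position after one period ΔtB, whatever the band's shape.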
Besides this alternative look at the Bloch oscillation swing, the total energy diagram shown in
Fig. 33b enables one more remarkable result. Let a wave packet be so narrow in the momentum space
73 Physically, this is just the relevant part of the potential energy of the total system comprised of our particle (in
the periodic potential) and the source of the force F – see, e.g., CM Sec. 1.4.
74 In semiconductor physics and engineering, such spatial band-edge diagrams are virtually unavoidable
components of almost every discussion/publication. In this series, a few more examples of such diagrams may be
found in SM Sec. 6.4.
that δx ~ 1/δq >> Δxmax; then it may be well represented by a definite energy, i.e. by a horizontal line in
Fig. 33b. But Eq. (247) is exactly invariant with respect to the following simultaneous translation of the
coordinate and the energy:
$$x \to x + a, \qquad E \to E - Fa. \tag{2.249}$$
This means that it is satisfied by an infinite set of similar solutions, each corresponding to one of the
horizontal red lines shown in Fig. 33b. This is the famous Wannier-Stark ladder,75 with the step height
$$\Delta E_{\rm WS} = Fa. \tag{2.250}$$
[Wannier-Stark ladder]
The importance of this alternative representation of the Bloch oscillations is due to the following
fact. In most experimental realizations, the power of the electromagnetic radiation with frequency (244),
that may be extracted from the oscillations of a charged particle, is very low, so that their direct
detection represents a hard problem.76 However, let us apply to a Bloch oscillator an additional ac field
at frequency ω ≈ ωB. As these frequencies are brought close together, the external signal should
synchronize (“phase-lock”) the Bloch oscillations,77 resulting in certain changes of time-independent
observables – for example, a resonant change of absorption of the external radiation. Now let us notice
that the combination of Eqs. (244) and (250) yields the following simple relation:
$$\Delta E_{\rm WS} = \hbar\omega_B. \tag{2.251}$$
This means that the phase-locking at ω ≈ ωB allows for an alternative (but equivalent) interpretation – as
the result of ac-field-induced quantum transitions78 between the steps of the Wannier-Stark ladder.
(Again, such occasions, when two very different languages may be used for alternative interpretations of
the same effect, are among the most beautiful features of physics.)
This phase-locking effect has been used for the first experimental confirmations of the Bloch
oscillation theory.79 For this purpose, the natural periodic structures, solid-state crystals, are
inconvenient due to their very small period a ~ 10⁻¹⁰ m. Indeed, according to Eq. (244), such structures
require very high forces F (and hence very high electric fields ℰ = F/e) to bring ωB to an experimentally
convenient range. This problem has been overcome using artificial periodic structures (superlattices) of
certain semiconductor compounds, such as Ga₁₋ₓAlₓAs with various degrees x of the gallium-to-aluminum
atom replacement, whose layers may be grown over each other epitaxially, i.e., with very few
crystal structure violations. Such superlattices, with periods a ~ 10 nm, have enabled a clear observation
of the resonance at ω ≈ ωB, and hence a measurement of the Bloch oscillation frequency, in particular its
proportionality to the applied dc electric field, predicted by Eq. (244).
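To make the argument about the lattice period more tangible, here is a rough estimate, based on Eq. (244), of the dc electric field needed to reach a given Bloch frequency. The target frequency of 1 THz is an assumption chosen only for illustration, as is the function name; the two periods are the orders of magnitude quoted above.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C

def field_for_bloch_frequency(f_bloch_hz, period_m):
    """Electric field F/e, with F = hbar*omega_B/a from Eq. (2.244)."""
    omega_b = 2.0 * math.pi * f_bloch_hz
    return HBAR * omega_b / (E_CHARGE * period_m)

f_target = 1.0e12           # assumed target Bloch frequency: 1 THz
for a in (1e-10, 1e-8):     # natural crystal vs. superlattice period, in meters
    print(f"a = {a:.0e} m: field ~ {field_for_bloch_frequency(f_target, a):.1e} V/m")
```

Since the required field scales as 1/a, the hundredfold larger period of a superlattice reduces it by two orders of magnitude.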
75 This effect was first discussed in detail by Gregory Hugh Wannier in his 1959 monograph on solid-state
physics, while the name of Johannes Stark is traditionally associated with virtually any electric field effect on
atomic systems, after he had discovered the first of such effects in 1913.
76 In systems with many independent particles (such as electrons in semiconductors), the detection problem is
exacerbated by the phase incoherence of the Bloch oscillations performed by each particle. This drawback is
absent in atomic Bose-Einstein condensates whose Bloch oscillations (in a periodic potential created by standing
optical waves) were eventually observed by M. Ben Dahan et al., Phys. Rev. Lett. 76, 4508 (1996).
77 A simple analysis of phase locking of a classical oscillator may be found, e.g., in CM Sec. 5.4. (See also the
brief discussion of the phase locking of the Josephson oscillations at the end of Sec. 1.6 of this course.)
78 A quantitative theory of such transitions will be discussed in Sec. 6.6 and then in Chapter 7.
79 E. Mendez et al., Phys. Rev. Lett. 60, 2426 (1988).
Very soon after this discovery, the Bloch oscillations were observed80 in small Josephson
junctions, where they result from the quantum dynamics of the Josephson phase difference φ in a 2π-periodic
potential profile, created by the junction. A straightforward translation of Eq. (244) to this case
(left for the reader’s exercise) shows that the frequency of such Bloch oscillations is
$$\omega_B = \frac{\pi I}{e}, \quad \text{i.e.} \quad f_B \equiv \frac{\omega_B}{2\pi} = \frac{I}{2e}, \tag{2.252}$$
where I is the dc current passed through the junction – the effect not to be confused with the “classical”
Josephson oscillations with frequency (1.75). It is curious that Eq. (252) may be legitimately interpreted
as a result of a periodic transfer, through the Josephson junction, of discrete Cooper pairs (of charge –
2e), between two coherent Bose-Einstein condensates in the superconducting electrodes of the
junction.81
Next, our discussion of the Bloch oscillations was based on the premise that the wave packet of
the particle stays within one (say, the lowest) energy band. However, just one look at Fig. 28 shows that
this assumption becomes unrealistic if the energy gap separating this band from the next one becomes
very small, Δ1 → 0. Indeed, in the weak-potential approximation, which is adequate in this limit, at |U1|
→ 0 the two dispersion curve branches (216) cross without any interaction, so that if our particle
(meaning its wave packet) is driven to approach that point, it should continue to move up in energy –
see the dashed blue arrow in Fig. 33a. Similarly, in the real-space representation shown in Fig. 33b, it is
intuitively clear that at Δ1 → 0, the particle residing at one of the steps of the Wannier-Stark ladder
should be able to somehow overcome the vanishing spatial gap Δx0 = Δ1/F and to “leak” into the next
band – see the horizontal dashed blue arrow on that panel.
This process, called the Landau-Zener (or “interband”, or “band-to-band”) tunneling,82 is indeed
possible. To analyze it, let us first take F = 0, and consider what happens if a quantum particle,
described by an x-long (i.e. E-narrow) wave packet, is incident from free space upon a periodic structure
of a large but finite length l = Na >> a – see, e.g., Fig. 22. If the packet’s energy E is within one of the
energy bands, it may evidently propagate through the structure (though it may be partly reflected from its
ends). The corresponding quasimomentum may be found by solving the dispersion relation for q; for
example, in the weak-potential limit, Eq. (224) (which is valid near the gap) yields
$$q = q_m + \tilde{q}, \quad \text{with} \quad \tilde{q} = \pm\frac{1}{\gamma}\left[\tilde{E}^2 - |U_n|^2\right]^{1/2}, \quad \text{for } |U_n|^2 \le \tilde{E}^2, \tag{2.253}$$
where Ẽ ≡ E – E⁽ⁿ⁾ and γ = 2aE⁽ⁿ⁾/πn – see the second of Eqs. (225).
Now, if the energy E is inside one of the energy gaps Δn, the wave packet’s propagation in an
infinite periodic lattice is impossible, so that it is completely reflected from it. However, our analysis of
the potential step problem in Sec. 3 implies that the packet’s wavefunction should still have an
exponential tail protruding into the structure and decaying on some length δ – see Eq. (58) and Fig. 2.4.
80 D. Haviland et al., Z. Phys. B 85, 339 (1991).

81 See, e.g., D. Averin et al., Sov. Phys. – JETP 61, 407 (1985). This effect is qualitatively similar to the transfer
of single electrons, with a similar frequency f = I/e, in tunnel junctions between normal (non-superconducting)
metals – see, e.g., EM Sec. 2.9 and references therein.
82 It was predicted independently by L. Landau, C. Zener, E. Stueckelberg, and E. Majorana in 1932.
Indeed, a straightforward review of the calculations leading to Eq. (253) shows that it remains valid for
energies within the gap as well, if the quasimomentum is understood as a purely imaginary number:
$$\tilde{q} \to \pm i\kappa, \quad \text{where} \quad \kappa \equiv \frac{1}{\gamma}\left[|U_n|^2 - \tilde{E}^2\right]^{1/2}, \quad \text{for } \tilde{E}^2 \le |U_n|^2. \tag{2.254}$$
With this replacement, the Bloch solution (193b) indeed describes an exponential decay of the
wavefunction at length δ ~ 1/κ.
Returning to the effects of weak force F, in the real-space approach described by Eq. (248) and
illustrated in Fig. 33b, we may recast Eq. (254) as
$$\kappa \to \kappa(x) = \frac{1}{\gamma}\left[|U_n|^2 - (F\tilde{x})^2\right]^{1/2}, \tag{2.255}$$
where x̃ is the particle’s (i.e. its wave packet center’s) deviation from the mid-gap point. Thus the gap
creates a potential barrier of a finite width Δx0 = 2|Un|/F, through which the wave packet may tunnel
with a non-zero probability. As we already know, in the WKB approximation (in our case requiring
κΔx0 >> 1) this probability is just the potential barrier’s transparency T, which may be calculated from
Eq. (117):
$$-\ln T = 2\int_{\kappa(x)^2>0}\kappa(x)\,dx = \frac{2}{\gamma}\int_{-x_c}^{x_c}\left[|U_n|^2 - (F\tilde{x})^2\right]^{1/2}d\tilde{x} = \frac{2|U_n|}{\gamma}\,2x_c\int_0^1\left(1 - \xi^2\right)^{1/2}d\xi, \tag{2.256}$$
where ±xc ≡ ±Δx0/2 = ±|Un|/F are the classical turning points. Working out this simple integral (or just
noticing that it is a quarter of the unit circle’s area, and hence is equal to π/4), we get
$$T = \exp\left\{-\frac{\pi|U_n|^2}{\gamma F}\right\}. \tag{2.257}$$
[Landau-Zener tunneling probability]
This famous result may be also obtained in a more complex way, whose advantage is a
constructive proof that Eq. (257) is valid for an arbitrary relation between γF and |Un|², i.e. arbitrary T,
while our simple derivation was limited to the WKB approximation, valid only at T << 1.83 Using Eq.
(225), we may rewrite the product γF participating in Eq. (257), as
$$\gamma F = \frac{1}{2}\left.\frac{d(E_l - E_{l'})}{dq_0}\right|_{E_l=E_{l'}=E^{(n)}}\hbar\frac{dq_0}{dt} = \frac{\hbar}{2}\left.\frac{d(E_l - E_{l'})}{dt}\right|_{E_l=E_{l'}=E^{(n)}} \equiv \frac{\hbar u}{2}, \tag{2.258}$$
where u has the meaning of the “speed” of the energy level crossing in the absence of the gap. Hence,
Eq. (257) may be rewritten in the form
$$T = \exp\left\{-\frac{2\pi|U_n|^2}{\hbar u}\right\}, \tag{2.259}$$
which is more transparent physically. Indeed, the fraction 2|Un|/u = Δn/u gives the time scale Δt of the
energy’s crossing the gap region, and according to the Fourier transform, its reciprocal, ωmax ~ 1/Δt
83 In Chapter 6 below, Eq. (257) will be derived using a different method, based on the so-called Golden Rule of
quantum mechanics, but also in the weak-potential limit, i.e. for hyperbolic dispersion law (253).
gives the upper cutoff of the frequencies essentially involved in the Bloch oscillation process. Hence Eq.
(259) means that
$$-\ln T \approx \frac{\Delta_n}{\hbar\omega_{\max}}. \tag{2.260}$$
This formula allows us to interpret the Landau-Zener tunneling as the system’s excitation across the
energy gap Δn by the highest-energy quantum ħωmax available from the Bloch oscillation process. This
interpretation remains valid even in the opposite, tight-binding limit, in which, according to Eqs. (206)
and (237), the Bloch oscillations are purely sinusoidal, so that the Landau-Zener tunneling is completely
suppressed at ħωB < Δ1.
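The WKB integral in Eq. (256) may also be evaluated numerically and compared with the closed form in Eq. (257). The sketch below does this with a simple midpoint rule; the parameter values (in arbitrary units) and the function names are assumptions of this example, chosen so that T << 1, as the WKB approximation requires.

```python
import math

def wkb_exponent(u_n, gamma, force, steps=200000):
    """-ln T = (2/gamma) * integral of [|U_n|^2 - (F x)^2]^(1/2) over |x| <= x_c.

    The integral of Eq. (2.256) is evaluated by the midpoint rule.
    """
    x_c = u_n / force                  # classical turning point |U_n|/F
    dx = 2.0 * x_c / steps
    total = 0.0
    for i in range(steps):
        x = -x_c + (i + 0.5) * dx
        total += math.sqrt(max(u_n**2 - (force * x)**2, 0.0)) * dx
    return 2.0 * total / gamma

def landau_zener_exponent(u_n, gamma, force):
    """Closed form pi*|U_n|^2/(gamma*F), the exponent of Eq. (2.257)."""
    return math.pi * u_n**2 / (gamma * force)

U_N, GAMMA, FORCE = 0.2, 1.0, 0.01    # assumed illustrative values
numeric = wkb_exponent(U_N, GAMMA, FORCE)
exact = landau_zener_exponent(U_N, GAMMA, FORCE)
print(numeric, exact, math.exp(-exact))
```

The two exponents should agree to within the quadrature error, confirming the "quarter of the unit circle" value π/4 of the dimensionless integral.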
Interband tunneling is an important ingredient of several physical phenomena and even some
practical electron devices, for example, the tunneling (or “Esaki”) diodes. This simple device is just a
junction of two semiconductor electrodes, one of them so strongly n-doped by electron donors that some
electrons form a degenerate Fermi gas at the bottom of the conduction band.84 Similarly, the counterpart
semiconductor electrode is p-doped so strongly that the Fermi level in the valence band is shifted below
the band edge – see Fig. 34.
[Figure 2.34: (a), (b) band-edge diagrams of the n-doped and p-doped electrodes; (c) the current I versus the voltage V.]
Fig. 2.34. The tunneling (“Esaki”) diode: (a) the band-edge diagram of the device at zero bias;
(b) the same diagram at a modest positive bias eV ~ Δ/2, and (c) the I-V curve of the device
(schematically). Dashed lines on panels (a) and (b) show the Fermi level positions.
In thermal equilibrium, and in the absence of external voltage bias, the Fermi levels of the two
electrodes self-align, leading to the build-up of the contact potential difference Δφ/e, with Δφ a bit larger
than the energy bandgap Δ – see Fig. 34a. This potential difference creates an internal electric field that
tilts the energy bands (just as the external field did in Fig. 33b), and leads to the formation of the so-called
depletion layer, in which the Fermi level is located within the energy gap and hence there are no
charge carriers ready to move. In the usual p-n junctions, this layer is broad and prevents any current at
applied voltages V lower than ~Δ/e. In contrast, in a tunneling diode the depletion layer is so thin (below
~10 nm) that the interband tunneling is possible and provides a substantial Ohmic current at small
applied voltages – see Fig. 34c. However, at larger positive biases, with eV ~ Δ/2, the conduction band
is aligned with the middle of the energy gap in the p-doped electrode, and electrons cannot tunnel there.
Similarly, there are no electrons in the n-doped semiconductor to tunnel into the available states just
above the Fermi level in the p-doped electrode – see Fig. 34b. As a result, at such voltages the current
84 Here I have to rely on the reader’s background knowledge of basic semiconductor physics; it will be discussed
in more detail in SM Sec. 6.4.
drops significantly, to grow again only when eV exceeds ~Δ, enabling electron motion within each
energy band. Thus the tunnel junction’s I-V curve has a part with a negative differential resistance
(dV/dI < 0) – see Fig. 34c. This phenomenon, equivalent in its effect to negative kinematic friction in
mechanics, may be used for amplification of weak analog signals, for self-excitation of electronic
oscillators85 (i.e. an ac signal generation), and for signal swing restoration in digital electronics.
2.9. Harmonic oscillator: Brute force approach

To complete our review of the basic 1D wave mechanics, we have to consider the famous
harmonic oscillator, i.e. a 1D particle moving in the quadratic-parabolic potential (111), so that the
stationary Schrödinger equation (53) is
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{m\omega_0^2x^2}{2}\psi = E\psi. \tag{2.261}$$
Conceptually, on the background of the fascinating quantum effects discussed in the previous sections,
this is not a very interesting system: Eq. (261) is a standard 1D eigenproblem, resulting in a discrete
energy spectrum En, with smooth eigenfunctions ψn(x) vanishing at x → ±∞ (because the potential
energy tends to infinity there).86 However, as we will repeatedly see later in the course, this problem’s
solutions have an enormous range of applications, so we have to know their basic properties.
The direct analytical solution of the problem is not very simple (see below), so let us start by
trying some indirect approaches to it. First, as was discussed in Sec. 4, the WKB-approximation-based
Wilson-Sommerfeld quantization rule (110), applied to this potential, yields the eigenenergy spectrum
(114). With the common quantum number convention, this result is
$$E_n = \hbar\omega_0\left(n + \frac{1}{2}\right), \quad \text{with } n = 0, 1, 2, \ldots, \tag{2.262}$$
[Harmonic oscillator: energy levels]
so that (in contrast to the 1D rectangular potential well) the ground-state energy corresponds to n = 0.
However, as was discussed in the end of Sec. 4, for the quadratic potential (111) the WKB
approximation’s conditions are strictly satisfied only at En >> ħω0, so that so far we can only trust Eq.
(262) for high levels, with n >> 1, rather than for the (most important) ground state.
This is why I will use Eq. (261) to demonstrate another approximate approach, called the
variational method, whose simplest form is aimed at finding ground states. The method is based on the
following observation. (Here I am presenting its 1D wave mechanics form, though the method is much
more general.) Let ψn be the exact, full, and orthonormal set of stationary wavefunctions of the system
under study, and En the set of the corresponding energy levels, satisfying Eq. (1.60):
$$\hat{H}\psi_n = E_n\psi_n. \tag{2.263}$$
Then we may use this set for the unique expansion of an arbitrary trial wavefunction ψtrial:
85 See, e.g., CM Sec. 5.4.

86 The stationary states of the harmonic oscillator (which, as will be discussed in Secs. 5.4 and 7.1, may be
considered as the states with a definite number of identical bosonic excitations) are sometimes called Fock states –
after Vladimir Aleksandrovich Fock. (This term is also used in a more general sense, for definite-particle-number
states of systems with indistinguishable bosons of any kind – see Sec. 8.3.)
$$\psi_{\rm trial} = \sum_n\alpha_n\psi_n, \quad \text{so that} \quad \psi^*_{\rm trial} = \sum_n\alpha_n^*\psi_n^*, \tag{2.264}$$
where αn are some (generally, complex) coefficients. Let us require the trial function to be normalized,
using the condition (1.66) of orthonormality of the eigenfunctions ψn:
$$\int\psi^*_{\rm trial}\psi_{\rm trial}\,d^3x \equiv \int\sum_{n,n'}\alpha_n^*\psi_n^*\,\alpha_{n'}\psi_{n'}\,d^3x \equiv \sum_{n,n'}\alpha_{n'}\alpha_n^*\int\psi_n^*\psi_{n'}\,d^3x = \sum_{n,n'}\alpha_{n'}\alpha_n^*\delta_{n,n'} \equiv \sum_n W_n = 1, \tag{2.265}$$
where each of the coefficients Wn, defined as
$$W_n \equiv \alpha_n^*\alpha_n \equiv |\alpha_n|^2 \ge 0, \tag{2.266}$$
may be interpreted as the probability for the particle, in the trial state, to be found in the nth genuine
stationary state. Now let us use Eq. (1.23) for a similar calculation of the expectation value of the
system’s Hamiltonian in the trial state:87
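$$\langle H\rangle_{\rm trial} = \int\psi^*_{\rm trial}\hat{H}\psi_{\rm trial}\,d^3x \equiv \int\sum_{n,n'}\alpha_n^*\psi_n^*\hat{H}\alpha_{n'}\psi_{n'}\,d^3x \equiv \sum_{n,n'}\alpha_{n'}\alpha_n^*E_{n'}\int\psi_n^*\psi_{n'}\,d^3x = \sum_{n,n'}\alpha_{n'}\alpha_n^*E_{n'}\delta_{n,n'} \equiv \sum_n W_nE_n. \tag{2.267}$$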
Since the exact ground state energy Eg is, by definition, the lowest one of the set En, i.e. En ≥ Eg, Eqs.
(265) and (267) yield the following inequality:
$$\langle H\rangle_{\rm trial} \ge \sum_n W_nE_g \equiv E_g\sum_n W_n = E_g. \tag{2.268}$$
[Variational method’s justification]
Thus, the genuine ground state energy of the system is always lower than (or equal to) its energy
in any trial state. Hence, if we make several attempts with reasonably selected trial wavefunctions, we
may expect the lowest of the results to approximate the genuine ground state energy reasonably well.
Even more conveniently, if we select some reasonable class of trial wavefunctions dependent on a free
parameter λ, then we may use the necessary condition of the minimum of ⟨H⟩trial,
$$\frac{\partial\langle H\rangle_{\rm trial}}{\partial\lambda} = 0, \tag{2.269}$$
to find the closest of them to the genuine ground state. Even better results may be obtained using trial
wavefunctions dependent on several parameters. Note, however, that the variational method does not tell
us how exactly the trial function should be selected, or how close its final result is to the genuine
ground-state function. In this sense, this method has “uncontrollable accuracy”, and differs from both
the WKB approximation and the perturbation methods (to be discussed in Chapter 6), for which we have
certain accuracy criteria. Because of this drawback, the variational method is typically used as the last
resort – though sometimes (as in the example that follows) it works remarkably well.88
87 It is easy (and hence left for the reader) to show that the uncertainty δH in any state of a Hamiltonian system,
including the trial state (264), vanishes, so that ⟨H⟩trial may be interpreted as the definite energy of the state.
For our current goals, however, this fact is not important.
88 The variational method may be used also to estimate the first excited state (or even a few lowest excited states)
of the system, by requiring the new trial function to be orthogonal to the previously calculated eigenfunctions of
the lower-energy states. However, the method’s error typically grows with the state number.
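The inequality (268) is easy to check numerically: for any set of expansion coefficients αn, the weighted average of the energy levels cannot fall below the lowest of them. In the sketch below, the ten-level spectrum (taken in the form of Eq. (262), in units of ħω0), the random choice of the coefficients, and the function name are assumptions made for illustration only.

```python
import random

def trial_energy(alphas, energies):
    """<H>_trial = sum_n W_n E_n, with W_n = |alpha_n|^2 normalized to unit sum.

    This follows Eqs. (2.265)-(2.267); the normalization is enforced explicitly.
    """
    norm = sum(abs(a)**2 for a in alphas)
    return sum(abs(a)**2 / norm * e for a, e in zip(alphas, energies))

random.seed(1)                            # fixed seed, for repeatability
levels = [n + 0.5 for n in range(10)]     # assumed spectrum, units of hbar*omega_0
ground = min(levels)
samples = []
for _ in range(1000):
    # random complex coefficients alpha_n of a hypothetical trial state
    alphas = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in levels]
    samples.append(trial_energy(alphas, levels))
print(min(samples) >= ground, min(samples))
```

The equality in (268) is reached only if the whole weight is concentrated in the ground state, i.e. if the trial function coincides with the genuine ground-state wavefunction.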
Let us apply this method to the harmonic oscillator. Since the potential (111) is symmetric with
respect to point x = 0, and continuous at all points (so that, according to Eq. (261), d²ψ/dx² has to be
continuous as well), the most natural selection of the ground-state trial function is the Gaussian function
$$\psi_{\rm trial}(x) = C\exp\left\{-\lambda x^2\right\}, \tag{2.270}$$
with some real λ > 0. The normalization coefficient C may be immediately found either from the
standard Gaussian integration of |ψtrial|², or just from the comparison of this expression with Eq. (16), in
which $\lambda = 1/(2\delta x)^2$, i.e. $\delta x = 1/2\lambda^{1/2}$, giving $|C|^2 = (2\lambda/\pi)^{1/2}$. Now the expectation value of the particle’s
Hamiltonian,
$$\hat{H} = \frac{\hat{p}^2}{2m} + U(x) = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{m\omega_0^2x^2}{2}, \tag{2.271}$$
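in the trial state, may be calculated as
$$\langle H\rangle_{\rm trial} \equiv \int_{-\infty}^{+\infty}\psi^*_{\rm trial}\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{m\omega_0^2x^2}{2}\right)\psi_{\rm trial}\,dx = 2\left(\frac{2\lambda}{\pi}\right)^{1/2}\left[\frac{\hbar^2\lambda}{m}\int_0^\infty\exp\left\{-2\lambda x^2\right\}dx + \left(\frac{m\omega_0^2}{2} - \frac{2\hbar^2\lambda^2}{m}\right)\int_0^\infty x^2\exp\left\{-2\lambda x^2\right\}dx\right]. \tag{2.272}$$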
Both involved integrals are of the same well-known Gaussian type,89 giving
$$\langle H\rangle_{\rm trial} = \frac{\hbar^2\lambda}{2m} + \frac{m\omega_0^2}{8\lambda}. \tag{2.273}$$
As a function of λ, this expression has a single minimum at the value λopt that may be found from the
requirement (269), giving λopt = mω0/2ħ. The resulting minimum of ⟨H⟩trial is exactly equal to the
ground-state energy following from Eq. (262),
$$E_0 = \frac{\hbar\omega_0}{2}. \tag{2.274}$$
[Harmonic oscillator: ground state energy]
Such a coincidence of results of the WKB approximation and the variational method is rather
unusual, and implies (though does not strictly prove) that Eq. (274) is exact. At the very least, this
coincidence gives a strong motivation to plug the trial wavefunction (270), with λ = λopt, i.e.
$$\psi_0 = \left(\frac{m\omega_0}{\pi\hbar}\right)^{1/4}\exp\left\{-\frac{m\omega_0x^2}{2\hbar}\right\}, \tag{2.275}$$
[Harmonic oscillator: ground state wavefunction]
and the energy (274), into the Schrödinger equation (261). Such substitution90 shows that the equation is
indeed exactly satisfied.
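The minimization leading from Eq. (273) to Eq. (274) may be repeated numerically. The following sketch minimizes the trial energy over λ with a golden-section search; the units ħ = m = ω0 = 1, the search interval, and the function names are assumptions of this example, and the search routine is just one convenient choice for a function with a single minimum.

```python
import math

HBAR = M = OMEGA0 = 1.0   # assumed units: hbar = m = omega_0 = 1

def trial_energy(lam):
    """<H>_trial(lambda) of Eq. (2.273) for the Gaussian trial function (2.270)."""
    return HBAR**2 * lam / (2.0 * M) + M * OMEGA0**2 / (8.0 * lam)

def golden_section_min(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):               # minimum lies in [lo, d]
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:                         # minimum lies in [c, hi]
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

lam_opt = golden_section_min(trial_energy, 0.01, 10.0)
print(lam_opt, trial_energy(lam_opt))
```

In these units, the search should converge to λopt = mω0/2ħ = 0.5 and to the energy ħω0/2 = 0.5, in agreement with Eq. (274).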
According to Eq. (275), the characteristic scale of the wavefunction’s spatial spread91 is
89 See, e.g., MA Eqs. (6.9b) and (6.9c).

90 Actually, this is a twist of one of the tasks of Problem 1.12.
91 Quantitatively, as was already mentioned in Sec. 2.1, $x_0 = \sqrt{2}\,\delta x = \sqrt{2}\,\langle x^2\rangle^{1/2}$.

$$x_0 \equiv \left(\frac{\hbar}{m\omega_0}\right)^{1/2}. \tag{2.276}$$
[Harmonic oscillator: spatial scale]
Due to the importance of this scale, let us give its crude estimates for several representative systems:92
(i) For atom-bound electrons in solids and fluids, m ~ 10⁻³⁰ kg, and ω0 ~ 10¹⁵ s⁻¹, giving x0 ~ 0.3
nm, of the order of the typical inter-atomic distances in condensed matter. As a result, classical
mechanics is not valid at all for the analysis of their motion.
(ii) For atoms in solids, m ≈ 10⁻²⁴–10⁻²⁶ kg, and ω0 ~ 10¹³ s⁻¹, giving x0 ~ 0.01 – 0.1 nm, i.e.
somewhat smaller than inter-atomic distances. Because of that, the methods based on classical
mechanics (e.g., molecular dynamics) are approximately valid for the analysis of atomic motion, though
they may miss some fine effects exhibited by lighter atoms – e.g., the so-called quantum diffusion of
hydrogen atoms, due to their tunneling through the energy barriers of the potential profile created by
other atoms.
(iii) Recently, the progress of patterning technologies has enabled the fabrication of high-quality
micromechanical oscillators consisting of zillions of atoms. For example, the oscillator used in one of
the pioneering experiments in this field93 was a ~1-μm thick membrane with a 60-μm diameter, and had
m ~ 2×10⁻¹⁴ kg and ω0 ~ 3×10¹⁰ s⁻¹, so that x0 ~ 4×10⁻¹⁶ m. It is remarkable that despite such extreme
smallness of x0 (much smaller than not only any atom but even any atomic nucleus!), quantum states of
such oscillators may be manipulated and measured, using their coupling to electromagnetic (in
particular, optical) resonant cavities.94
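Such order-of-magnitude estimates are easy to reproduce from Eq. (276). The sketch below uses the CODATA value of ħ and the rounded parameters quoted in items (i) and (iii); the helper name `x0` is illustrative:

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s

def x0(m, omega0):
    """Eq. (2.276): spatial scale of the oscillator's ground state, in meters."""
    return math.sqrt(HBAR / (m * omega0))

print(x0(1e-30, 1e15))   # item (i): atom-bound electron, about 3e-10 m
print(x0(2e-14, 3e10))   # item (iii): micromechanical membrane, about 4e-16 m
```

Since x0 scales only as the inverse square root of the product mω0, even the 16 orders of magnitude separating the masses in these two examples change x0 by less than 6 orders of magnitude.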
Returning to the Schrödinger equation (261), in order to analyze its higher eigenstates, we will
need more help from mathematics. Let us recast this equation into a dimensionless form by introducing
the dimensionless variable ξ ≡ x/x0. This gives
$$-\frac{d^2\psi}{d\xi^2} + \xi^2\psi = \varepsilon\psi, \qquad (2.277)$$
where ε ≡ 2E/ħω0 = E/E0. In this notation, the ground-state wavefunction (275) is proportional to exp{-ξ²/2}. Using this clue, let us look for solutions to Eq. (277) in the form
$$\psi = C \exp\left\{-\frac{\xi^2}{2}\right\} H(\xi), \qquad (2.278)$$
where H(ξ) is a new function, and C is the normalization constant. With this substitution, Eq. (277) yields
$$\frac{d^2H}{d\xi^2} - 2\xi\frac{dH}{d\xi} + (\varepsilon - 1)H = 0. \qquad (2.279)$$
92 By order of magnitude, such estimates are valid even for the systems whose dynamics is substantially different
from that of harmonic oscillators, if a typical frequency of quantum transitions is taken for ω0.
93 A. O’Connell et al., Nature 464, 697 (2010).
94 See a review of such experiments by M. Aspelmeyer et al., Rev. Mod. Phys. 86, 1391 (2014), and also recent
experiments with nanoparticles placed in much “softer” potential wells – e.g., by U. Delić et al., Science 367, 892
(2020).
It is evident that H = const and ε = 1 is one of its solutions, describing the ground-state eigenfunction (275) and energy (274), but what are the other eigenstates and eigenvalues? Fortunately, the linear differential equation (279) was studied in detail in the mid-1800s by C. Hermite, who showed that all its eigenvalues are given by the set
$$\varepsilon_n - 1 = 2n, \quad \text{with } n = 0, 1, 2, \ldots, \qquad (2.280)$$
so that Eq. (262) is indeed exact for any n.95 The eigenfunction of Eq. (279), corresponding to the eigenvalue ε_n, is a polynomial (called the Hermite polynomial) of degree n, which may be most conveniently calculated using the following explicit formula:
$$H_n = (-1)^n \exp\{\xi^2\} \frac{d^n}{d\xi^n} \exp\{-\xi^2\}. \qquad (2.281) \quad \text{[Hermite polynomials]}$$
It is easy to use this formula to spell out several lowest-degree polynomials – see Fig. 35a:
$$H_0 = 1, \quad H_1 = 2\xi, \quad H_2 = 4\xi^2 - 2, \quad H_3 = 8\xi^3 - 12\xi, \quad H_4 = 16\xi^4 - 48\xi^2 + 12, \ \ldots \qquad (2.282)$$
Fig. 2.35. (a) A few lowest Hermite polynomials and (b) the corresponding eigenenergies (horizontal dashed lines) and eigenfunctions (solid lines) of the harmonic oscillator. The black dashed curve shows the potential profile U(x) drawn on the same scale as the energies En, so that its crossings with the energy levels correspond to classical turning points.
95 Perhaps the most important property of this energy spectrum is that it is equidistant: En+1 – En = ħω0 = const.
The properties of these polynomials that are most important for applications are as follows:
(i) the function Hn(ξ) has exactly n zeros (its plot crosses the ξ-axis exactly n times); as a result, the “parity” (symmetry-antisymmetry) of these functions alternates with n, and
(ii) the polynomials are mutually orthonormal in the following sense:
$$\int_{-\infty}^{+\infty} H_n(\xi) H_{n'}(\xi) \exp\{-\xi^2\}\, d\xi = \pi^{1/2} 2^n n!\, \delta_{n,n'}. \qquad (2.283)$$
Using the last property, we may readily calculate, from Eq. (278), the normalized eigenfunctions ψn(x) of the harmonic oscillator – see Fig. 35b:
$$\psi_n(x) = \frac{1}{(2^n n!)^{1/2} \pi^{1/4} x_0^{1/2}} \exp\left\{-\frac{x^2}{2x_0^2}\right\} H_n\left(\frac{x}{x_0}\right). \qquad (2.284) \quad \text{[Harmonic oscillator: eigenfunctions]}$$
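The normalization in Eq. (284) may be verified by direct numerical integration. The sketch below is only an illustration: it evaluates Hn with the standard three-term recurrence H(k+1) = 2ξH(k) - 2kH(k-1), which is not derived in the text, sets x0 = 1 by default, and uses a plain trapezoidal rule on a finite interval; the names `hermite_value`, `psi`, and `overlap` are illustrative:

```python
import math

def hermite_value(n, xi):
    """H_n(xi) from the standard recurrence H_{k+1} = 2 xi H_k - 2 k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * xi
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h

def psi(n, x, x0=1.0):
    """Eq. (2.284): normalized eigenfunction number n."""
    norm = 1.0 / math.sqrt(2.0**n * math.factorial(n) * math.sqrt(math.pi) * x0)
    return norm * math.exp(-x * x / (2 * x0 * x0)) * hermite_value(n, x / x0)

def overlap(n1, n2, x0=1.0, half_width=12.0, steps=4000):
    """Trapezoidal estimate of the integral of psi_n1 * psi_n2 over [-half_width*x0, +half_width*x0]."""
    h = 2 * half_width * x0 / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width * x0 + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * psi(n1, x, x0) * psi(n2, x, x0)
    return total * h

for n1 in range(4):
    print([round(overlap(n1, n2), 9) for n2 in range(4)])
```

The printed matrix should be very close to the identity matrix, as required by Eq. (283).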
At this point, it is instructive to compare these eigenfunctions with those of a 1D rectangular

potential well, with its ultimately hard walls – see Fig. 1.8. Let us list their common features:
(i) The wavefunctions oscillate in the classically allowed regions with En > U(x), while
dropping exponentially beyond the boundaries of that region. (For the rectangular well with infinite
walls, the latter regions are infinitesimally narrow.)
(ii) Each step up the energy level ladder increases the number of the oscillation half-
waves (and hence the number of its zeros), by one.96
And here are the major features specific for soft (e.g., the quadratic-parabolic) confinement:
(i) The spatial spread of the wavefunction grows with n, following the gradual widening
of the classically allowed region.
(ii) Correspondingly, En exhibits a slower growth than the En ∝ n² law given by Eq.
(1.85), because the gradual reduction of spatial confinement moderates the kinetic energy’s growth.
Unfortunately, the “brute-force” approach to the harmonic oscillator problem, discussed above,
is not too appealing. First, the proof of Eq. (281) is rather longish – so I do not have time/space for it.
More importantly, it is hard to use Eq. (284) for the calculation of the expectation values of observables
including the so-called matrix elements of the system – as we will see in Chapter 4, virtually the only
numbers important for most applications. Finally, it is also almost evident that there has to be some
straightforward math leading to any formula as simple as Eq. (262) for En. Indeed, there is a much more
efficient, operator-based approach to this problem; it will be described in Sec. 5.4.
2.1. The initial wave packet of a free 1D particle is described by Eq. (20):
$$\Psi(x, 0) = \int a_k e^{ikx}\, dk.$$
96 In mathematics, a slightly more general statement, valid for a broader class of ordinary linear differential
equations, is frequently called the Sturm oscillation theorem, and is a part of the Sturm-Liouville theory of such
equations – see, e.g., Chapter 10 in the handbook by G. Arfken et al., cited in MA Sec. 16.
(i) Obtain a compact expression for the expectation value ⟨p⟩ of the particle's momentum at an arbitrary moment t > 0.
(ii) Calculate ⟨p⟩ for the case when the function |ak|² is symmetric with respect to some value k0.
2.2. Calculate the function ak defined by Eq. (20), for the wave packet with a rectangular spatial envelope:
$$\Psi(x, 0) = \begin{cases} C \exp\{ik_0 x\}, & \text{for } -a/2 \le x \le +a/2, \\ 0, & \text{otherwise.} \end{cases}$$
Analyze the result in the limit k0a → ∞.
2.3. Prove Eq. (49) for the 1D propagator of a free quantum particle, starting from Eq. (48).
2.4. Express the 1D propagator defined by Eq. (44), via the eigenfunctions and eigenenergies of
a particle moving in an arbitrary stationary potential U(x).
2.5. Calculate the change of the wavefunction of a 1D particle, resulting from a short pulse of an
external classical force that may be approximated by the delta function:97
$$F(t) = P\delta(t).$$
2.6. Calculate the transparency T of the rectangular potential barrier (68),
$$U(x) = \begin{cases} 0, & \text{for } x < -d/2, \\ U_0, & \text{for } -d/2 < x < +d/2, \\ 0, & \text{for } d/2 < x, \end{cases}$$
for a particle of energy E > U0. Analyze and interpret the result, taking into account that U0 may be either positive or negative. (In the latter case, we are speaking about the particle’s passage over a rectangular potential well of a finite depth |U0|.)
2.7. Prove Eq. (117) for the case TWKB << 1, using the connection formulas (105).
2.8. Spell out the stationary wavefunctions of a harmonic oscillator in the WKB approximation,
and use them to calculate the expectation values ⟨x²⟩ and ⟨x⁴⟩ for the eigenstate number n >> 1.
2.9. Use the WKB approximation to express the expectation value of the kinetic energy of a 1D
particle confined in a soft potential well, in its nth stationary state, via the derivative dEn/dn, for n >> 1.
2.10. Use the WKB approximation to calculate the transparency T of the following triangular potential barrier:
$$U(x) = \begin{cases} 0, & \text{for } x < 0, \\ U_0 - Fx, & \text{for } x > 0, \end{cases}$$
97 The constant P is called the force’s impulse. (In higher dimensionalities, it is a vector – just as the force is.)
with F, U0 > 0, as a function of the incident particle’s energy E.

Hint: Be careful with the sharp potential step at x = 0.
2.11.* Prove that the symmetry of the 1D scattering matrix S describing an arbitrary time-
independent scatterer, allows its representation in the form (127).
2.12. Prove the universal relations between elements of the 1D transfer matrix T of a stationary
(but otherwise arbitrary) scatterer, mentioned in Sec. 5.
2.13. A 1D particle had been localized in a very narrow and deep potential well, with the “energy
area” ∫U(x)dx equal to –W, where W > 0. Then (say, at t = 0) the well’s bottom is suddenly lifted up, so
that the particle becomes completely free. Calculate the probability density, w(k), to find the particle in a
state with a certain wave number k at t > 0, and the total final energy of the system.
2.14. Calculate the lifetime of the metastable localized state of a 1D particle in the potential
$$U(x) = -W\delta(x) - Fx, \quad \text{with } W > 0,$$
using the WKB approximation. Formulate the condition of validity of the result.
2.15. Calculate the energy levels and the corresponding eigenfunctions of a 1D particle placed into a flat-bottom potential well of width 2a, with infinitely-high hard walls, and a transparent, short potential barrier in the middle – see the figure on the right. Discuss particle dynamics in the limit when W is very large but still finite.
[Figure: the potential profile U(x), with the hard walls at x = –a and x = +a, and the barrier Wδ(x) at x = 0.]
2.16.* Consider a symmetric system of two potential wells of the type shown in Fig. 21, but with U(0) = U(±∞) = 0 – see the figure on the right. What is the sign of the well interaction force due to their sharing a quantum particle of mass m, for the cases when the particle is in:
(i) a symmetric localized eigenstate: ψS(–x) = ψS(x)?
(ii) an antisymmetric localized eigenstate: ψA(–x) = –ψA(x)?
Use an alternative approach to verify your result for the particular case of delta-functional wells.
[Figure: the two-well potential profile U(x).]
2.17. Derive and analyze the characteristic equation for localized eigenstates of a 1D particle in a rectangular potential well of a finite depth (see the figure on the right):
$$U(x) = \begin{cases} -U_0, & \text{for } |x| \le a/2, \\ 0, & \text{otherwise.} \end{cases}$$
In particular, calculate the number of localized states as a function of well’s width a, and explore the limit U0 << ħ²/2ma².
[Figure: the rectangular potential well U(x) of depth U0, located between x = –a/2 and x = +a/2.]
2.18. Calculate the energy of a 1D particle localized in a potential well of an arbitrary shape U(x), provided that its width a is finite, and the average depth is very small:
$$\left|\overline{U}\right| \ll \frac{\hbar^2}{2ma^2}, \quad \text{where } \overline{U} \equiv \frac{1}{a}\int_{\rm well} U(x)\, dx.$$
2.19. A particle of mass m is moving in a field with the following potential:
$$U(x) = U_0(x) + W\delta(x),$$
where U0(x) is a smooth, symmetric function with U0(0) = 0, growing monotonically at x → ±∞.
(i) Use the WKB approximation to derive the characteristic equation for the particle’s energy spectrum, and
(ii) semi-quantitatively describe the spectrum’s evolution at the increase of |W|, for both signs of this parameter.
Make both results more specific for the quadratic-parabolic potential (111): U0(x) = mω0²x²/2.
2.20. Prove Eq. (189), starting from Eq. (188).
2.21. For the problem discussed at the beginning of Sec. 7, i.e. the 1D particle’s motion in an infinite Dirac comb potential shown in Fig. 24,
$$U(x) = W \sum_{j=-\infty}^{+\infty} \delta(x - ja), \quad \text{with } W > 0,$$
(where j takes integer values), write explicit expressions for the eigenfunctions at the very bottom and at the very top of the lowest energy band. Sketch both functions.
2.22. A 1D particle of mass m moves in an infinite periodic system of very narrow and deep potential wells that may be described by delta functions:
$$U(x) = -W \sum_{j=-\infty}^{+\infty} \delta(x - ja), \quad \text{with } W > 0.$$
(i) Sketch the energy band structure of the system for very small and very large values of the
potential well’s “weight” W, and
(ii) calculate explicitly the ground state energy of the system in these two limits.
2.23. For the system discussed in the previous problem, write explicit expressions for the
eigenfunctions of the system, corresponding to:
(i) the bottom of the lowest energy band,
(ii) the top of that band, and
(iii) the bottom of each higher energy band.
Sketch these functions.
2.24.* The 1D “crystal” analyzed in the last two problems, now extends only to x > 0, with a sharp step to a flat potential plateau at x < 0:
$$U(x) = \begin{cases} -W \displaystyle\sum_{j=1}^{+\infty} \delta(x - ja), \ \text{with } W > 0, & \text{for } x > 0, \\ U_0 > 0, & \text{for } x < 0. \end{cases}$$
Prove that the system can have a set of the so-called Tamm states, localized near the “surface” x = 0, and calculate their energies in the limit when U0 is very large but finite. (Quantify this condition.) 98
2.25. Calculate the whole transfer matrix of the rectangular potential barrier, specified by Eq.
(68), for particle energies both below and above U0.
2.26. Use the results of the previous problem to calculate the transfer matrix of one period of the
periodic Kronig-Penney potential shown in Fig. 31b.
2.27. Using the results of the previous problem, derive the characteristic equations for particle’s
motion in the periodic Kronig-Penney potential, for both E < U0 and E > U0. Try to bring the equations
to a form similar to that obtained in Sec. 7 for the delta-functional barriers – see Eq. (198). Use the
equations to formulate the conditions of applicability of the tight-binding and weak-potential
approximations, in terms of the system’s parameters, and the particle’s energy E.
2.28. For the Kronig-Penney potential, use the tight-binding approximation to calculate the
widths of the allowed energy bands. Compare the results with those of the previous problem (in the
corresponding limit).
2.29. For the same Kronig-Penney potential, use the weak-potential limit formulas to calculate
the energy gap widths. Again, compare the results with those of Problem 27, in the corresponding limit.
2.30. 1D periodic chains of atoms may exhibit what is called the Peierls instability, leading to the Peierls transition to a phase in which atoms are slightly displaced, from the exact periodicity, by alternating displacements Δxj = (-1)^j Δx, with Δx << a, where j is the atom’s number in the chain, and a is its initial period. These displacements lead to the alternation of the coupling amplitudes δn (see Eq. (204)) between close values δn+ and δn-. Use the tight-binding approximation to calculate the resulting change of the nth energy band, and discuss the result.
2.31.* Use Eqs. (1.73)-(1.74) of the lecture notes to derive Eq. (252), and discuss the relation
between these Bloch oscillations and the Josephson oscillations of frequency (1.75).
2.32. A 1D particle of mass m is placed into the following triangular potential well:
$$U(x) = \begin{cases} +\infty, & \text{for } x < 0, \\ Fx, & \text{for } x > 0, \end{cases} \quad \text{with } F > 0.$$
(i) Calculate its energy spectrum using the WKB approximation.
98 In applications to electrons in solid-state crystals, the delta-functional potential wells model the attractive
potentials of atomic nuclei, while U0 represents the workfunction, i.e. the energy necessary for the extraction of an
electron from the crystal to the free space – see, e.g., Sec. 1.1(ii), and also EM Sec. 2.6 and SM Sec. 6.3.
(ii) Estimate the ground state energy using the variational method, with two different trial
functions.
(iii) Calculate the three lowest energy levels, and also the 10th level, with an accuracy better than
0.1%, from the exact solution of the problem.
(iv) Compare and discuss the results.
Hint: The values of the first zeros of the Airy function, necessary for Task (iii), may be found in
many math handbooks, for example, in Table 9.9.1 of the online version of the collection edited by
Abramowitz and Stegun – see MA Sec. 16(i).
2.33. Use the variational method to estimate the ground state energy Eg of a particle in the following potential well:
$$U(x) = -U_0 \exp\{-\alpha x^2\}, \quad \text{with } \alpha > 0, \text{ and } U_0 > 0.$$
Spell out the results in the limits of small and large U0, and give their interpretation.
2.34. For a 1D particle of mass m, placed into a potential well with the following profile,
$$U(x) = a x^{2s}, \quad \text{with } a > 0, \text{ and } s > 0,$$
(i) calculate its energy spectrum using the WKB approximation, and
(ii) estimate the ground state energy using the variational method.
Compare the ground-state energy results for the parameter s equal to 1, 2, 3, and 100.
2.35. Use the variational method to estimate the 1st excited state of the 1D harmonic oscillator.
2.36. Assuming the quantum effects to be small, calculate the lower part of the energy spectrum of the following system: a small bead of mass m, free to move without friction along a ring of radius R, which is rotated about its vertical diameter with a constant angular velocity ω – see the figure on the right. Formulate a quantitative condition of validity of your results.
Hint: This system was used as the analytical mechanics’ “testbed problem” in the CM part of this series, and the reader is welcome to use any relations derived there.
[Figure: a bead on a ring of radius R rotating with the angular velocity ω about its vertical diameter, in the gravity field (force mg).]
2.37. A 1D harmonic oscillator, with mass m and frequency ω0, had been in its ground state; then
an additional force F was suddenly applied, and after that kept constant. Calculate the probability of the
oscillator staying in its ground state.
2.38. A 1D particle of mass m has been placed into a quadratic potential well (111),
$$U(x) = \frac{m\omega_0^2 x^2}{2},$$
and allowed to relax into the ground state. At t = 0, the well was fast accelerated to move with velocity v, without changing its profile, so that at t ≥ 0 the above formula for U is valid with the replacement x → x’ ≡ x – vt. Calculate the probability for the system to still be in the ground state at t > 0.
2.39. Initially, a 1D harmonic oscillator was in its ground state. At a certain moment of time, its spring constant κ is abruptly increased so that its frequency ω0 = (κ/m)^(1/2) is increased by a factor of α, and then is kept constant at the new value. Calculate the probability that after the change, the oscillator is still in its ground state.
2.40. A 1D particle is placed into the following potential well:
$$U(x) = \begin{cases} +\infty, & \text{for } x < 0, \\ m\omega_0^2 x^2/2, & \text{for } x \ge 0. \end{cases}$$
(i) Find its eigenfunctions and eigenenergies.
(ii) This system had been let to relax into its ground state, and then the potential wall at x < 0 was rapidly removed so that the system was instantly turned into the usual harmonic oscillator (with the same m and ω0). Find the probability for the oscillator to remain in its ground state.
2.41. Prove the following formula for the propagator of the 1D harmonic oscillator:
$$G(x, t; x_0, t_0) = \left(\frac{m\omega_0}{2\pi i\hbar \sin[\omega_0(t - t_0)]}\right)^{1/2} \exp\left\{\frac{i m\omega_0}{2\hbar \sin[\omega_0(t - t_0)]}\left[\left(x^2 + x_0^2\right)\cos[\omega_0(t - t_0)] - 2xx_0\right]\right\}.$$
Discuss the relation between this formula and the propagator of a free 1D particle.
2.42. In the context of the Sturm oscillation theorem mentioned in Sec. 9, prove that the number
of eigenfunction’s zeros of a particle confined in an arbitrary but finite potential well always increases
with the corresponding eigenenergy.
Hint: You may like to use the suitably modified Eq. (186).
2.43.* Use the WKB approximation to calculate the lifetime of the metastable ground state of a 1D particle of mass m in the “pocket” of the potential profile
$$U(x) = \frac{m\omega_0^2}{2}x^2 - \alpha x^3.$$
Contemplate the significance of this problem.
Chapter 3. Higher Dimensionality Effects

The description of the basic quantum-mechanical effects, given in the previous chapter, may be extended
to higher dimensions in an obvious way. This is why this chapter is focused on the phenomena (such as
the AB effect and the Landau levels) that cannot take place in one dimension due to topological reasons,
and also on a few key 3D problems (such as the Born approximation in the scattering theory, and the
axially- and spherically-symmetric systems) that are important for numerous applications.
3.1. Quantum interference and the AB effect

In the past two chapters, we have already discussed some effects of the de Broglie wave
interference. For example, standing waves inside a potential well, or even on the top of a potential
barrier, may be considered as a result of interference of incident and reflected waves. However, there are
some remarkable new effects made possible by spatial separation of such waves, and such separation
requires a higher (either 2D or 3D) dimensionality. A good example of wave separation is provided by
the Young-type experiment (Fig. 1) in which particles, emitted by the same source, are passed through
two narrow holes (or slits) in an otherwise opaque partition.
Fig. 3.1. The scheme of the “two-slit” (Young-type) interference experiment.
According to Eq. (1.22), if particle interactions are negligible (which is always true if the emission rate is sufficiently low), the average rate of particle counting by the detector is proportional to the probability density w(r, t) = Ψ(r, t) Ψ*(r, t) to find a single particle at the detector’s location r, where Ψ(r, t) is the solution of the single-particle Schrödinger equation (1.25) for the system. Let us calculate the rate for the case when the incident particles may be represented by virtually-monochromatic waves of energy E (e.g., very long wave packets), so that their wavefunction may be taken in the form given by Eqs. (1.57) and (1.62): Ψ(r, t) = ψ(r) exp{-iEt/ħ}. In this case, in the free-space parts of the system, where U(r) = 0, ψ(r) satisfies the stationary Schrödinger equation (1.78a):
$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi. \qquad (3.1a)$$
With the standard definition k ≡ (2mE)^(1/2)/ħ, it may be rewritten as the 3D Helmholtz equation:
$$\nabla^2\psi + k^2\psi = 0. \qquad (3.1b) \quad \text{[3D Helmholtz equation]}$$
The opaque parts of the partition may be well described as classically forbidden regions, so if their size scale a is much larger than the wavefunction penetration depth δ described by Eq. (2.59), we may use on their surface S the same boundary conditions as for the well’s walls of infinite height:
$$\psi\big|_S = 0. \qquad (3.2)$$
Eqs. (1) and (2) describe the standard boundary problem of the theory of propagation of scalar waves of
any nature. For an arbitrary geometry, this problem does not have a simple analytical solution. However,
for a conceptual discussion of wave interference, we may use certain natural assumptions that will allow
us to find its particular, approximate solution.
First, let us discuss the wave emission, into free space, by a small-size, isotropic source located at the origin of our reference frame. Naturally, the emitted wave should be spherically symmetric: $\psi(\mathbf{r}) = \psi(r)$. Using the well-known expression for the Laplace operator in spherical coordinates,1 we may reduce Eq. (1) to the following ordinary differential equation:
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\psi}{dr}\right) + k^2\psi = 0. \qquad (3.3)$$
Let us introduce a new function, f(r) ≡ rψ(r). Plugging the reciprocal relation ψ = f/r into Eq. (3), we see that it is reduced to the 1D wave equation,
$$\frac{d^2 f}{dr^2} + k^2 f = 0. \qquad (3.4)$$
As was discussed in Sec. 2.2, for a fixed k, the general solution of Eq. (4) may be represented in the form of two traveling waves:
$$f = f_+ e^{ikr} + f_- e^{-ikr}, \qquad (3.5)$$
so that the full wavefunction is
$$\psi(r) = \frac{f_+}{r}e^{ikr} + \frac{f_-}{r}e^{-ikr}, \quad \text{i.e. } \Psi(r, t) = \frac{f_+}{r}e^{i(kr - \omega t)} + \frac{f_-}{r}e^{-i(kr + \omega t)}, \quad \text{with } \omega \equiv \frac{E}{\hbar} = \frac{\hbar k^2}{2m}. \qquad (3.6)$$
If the source is located at point r’ ≠ 0, the obvious generalization of Eq. (6) is
$$\Psi(\mathbf{r}, t) = \frac{f_+}{R}e^{i(kR - \omega t)} + \frac{f_-}{R}e^{-i(kR + \omega t)}, \quad \text{with } R \equiv |\mathbf{R}|, \quad \mathbf{R} \equiv \mathbf{r} - \mathbf{r}'. \qquad (3.7)$$
The first term of this solution describes a spherically-symmetric wave propagating from the
source outward, while the second one, a wave converging onto the source point r’ from large distances.
Though the latter solution is possible at some very special circumstances (say, when the outgoing wave
is reflected back from a spherical shell), for our current problem, only the outgoing waves are relevant,
so that we may keep only the first term (proportional to f+) in Eq. (7). Note that the factor R in the
denominator (that was absent in the 1D geometry) has a simple physical sense: it provides the
independence of the full probability current I = 4πR²j(R), with j(R) ∝ kΨΨ* ∝ 1/R², of the distance R
between the observation point and the source.
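Both statements, i.e. that the outgoing spherical wave satisfies the radial equation (3) and that the factor 1/R makes the flux through any sphere the same, can be checked numerically. The following is a minimal sketch, assuming f+ = 1 and a source at the origin, with the derivatives replaced by central finite differences; the names `psi`, `radial_residual`, and `shell_weight` are illustrative:

```python
import cmath
import math

def psi(r, k, f_plus=1.0):
    """Outgoing spherical wave, the first term of Eq. (3.6) at t = 0."""
    return f_plus * cmath.exp(1j * k * r) / r

def radial_residual(r, k, h=1e-3):
    """Finite-difference estimate of the left-hand side of Eq. (3.3)."""
    def flux(rr):
        # r^2 d(psi)/dr, with a central difference for the derivative
        return rr**2 * (psi(rr + h / 2, k) - psi(rr - h / 2, k)) / h
    radial_laplacian = (flux(r + h / 2) - flux(r - h / 2)) / (h * r**2)
    return radial_laplacian + k**2 * psi(r, k)

def shell_weight(r, k):
    """4 pi r^2 |psi|^2, proportional to the full probability current I."""
    return 4 * math.pi * r**2 * abs(psi(r, k)) ** 2

k = 2.0
radii = [1.0 + 0.5 * i for i in range(9)]
print(max(abs(radial_residual(r, k)) for r in radii))   # small, of the order of h^2
print([round(shell_weight(r, k), 12) for r in radii])   # the same value at every radius
```

The second printed list contains the same number, 4π|f+|², at every radius, which is the distance independence of the full probability current mentioned above.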
1 See, e.g., MA Eq. (10.9) with ∂/∂θ = ∂/∂φ = 0.
Now let us assume that the partition’s geometry is not too complicated – for example, it is either
planar as shown in Fig. 1, or nearly-planar, and consider the region of the particle detector location far
behind the partition (at z >> 1/k), and at a relatively small angle to it: |x| << z. Then it should be
physically clear that the spherical waves (7) emitted by each point inside the slit cannot be perturbed too
much by the opaque parts of the partition, and their only role is the restriction of the set of such emitting
points to the area of the slits. Hence, an approximate solution of the boundary problem is given by the
following Huygens principle: the wave behind the partition looks as if it was the sum of contributions
(7) of point sources located in the slits, with each source’s strength f+ proportional to the amplitude of
the wave arriving at this pseudo-source from the real source – see Fig. 1. This principle finds its
confirmation in the strict wave theory, which shows that with our assumptions, the solution of the
boundary problem (1)-(2) may be represented as the following Kirchhoff integral:2
$$\psi(\mathbf{r}) = c \int_{\rm slits} \frac{\psi(\mathbf{r}')}{R} e^{ikR}\, d^2r', \quad \text{with } c = \frac{k}{2\pi i}. \qquad (3.8) \quad \text{[Kirchhoff integral]}$$
If the source is also far from the partition, its wave’s front is almost parallel to the slit plane, and if the slits are not too broad, we can take ψ(r’) constant (= ψ1,2) at each slit, so that Eq. (8) is reduced to
$$\psi(\mathbf{r}) \approx a''_1 \exp\{ikl''_1\} + a''_2 \exp\{ikl''_2\}, \quad \text{with } a''_{1,2} \equiv \frac{cA_{1,2}}{l''_{1,2}}\psi_{1,2}, \qquad (3.9)$$
where A1,2 are the slit areas, and l”1,2 are the distances from the slits to the detector. The wavefunctions on the slits may be calculated approximately3 by applying the same Eq. (7) to the region before the slits: ψ1,2 ≈ (f+/l’1,2)exp{ikl’1,2}, where l’1,2 are the distances from the source to the slits – see Fig. 1. As a result, Eq. (9) may be rewritten as
$$\psi(\mathbf{r}) \approx a_1 \exp\{ikl_1\} + a_2 \exp\{ikl_2\}, \quad \text{with } l_{1,2} \equiv l'_{1,2} + l''_{1,2}; \quad a_{1,2} \equiv \frac{c f_+ A_{1,2}}{l'_{1,2}\, l''_{1,2}}. \qquad (3.10) \quad \text{[Wavefunction superposition]}$$
(As Fig. 1 shows, each of l1,2 is the full length of the classical path of the particle from the source, through the corresponding slit, and further to the observation point r.)
According to Eq. (10), the resulting rate of particle counting at point r is proportional to
$$w(\mathbf{r}) = \psi(\mathbf{r})\psi^*(\mathbf{r}) = |a_1|^2 + |a_2|^2 + 2|a_1 a_2|\cos\varphi_{12}, \qquad (3.11) \quad \text{[Quantum interference]}$$
where
$$\varphi_{12} \equiv k(l_2 - l_1) \qquad (3.12)$$
is the difference of the total wave phase accumulations along each of two alternative paths. The last
expression may be evidently generalized as
2 For the proof and a detailed discussion of Eq. (8), see, e.g., EM Sec. 8.5.
3 A possible (and reasonable) concern about the application of Eq. (7) to the field in the slits is that it ignores the
effect of opaque parts of the partition. However, as we know from Chapter 2, the main role of the classically
forbidden region is reflecting the incident wave toward its source (i.e. to the left in Fig. 1). As a result, the
contribution of this reflection to the field inside the slits is insignificant if A1,2 >> λ², and even in the opposite case
provides just some rescaling of the amplitudes a1,2, which is not important for our conceptual discussion.
$$\varphi_{12} = \int_C \mathbf{k}\cdot d\mathbf{r}, \qquad (3.13) \quad \text{[Quantum interference: phase difference]}$$
with integration along the virtually closed contour C (see the dashed line in Fig. 1), i.e. from point 1, in
the positive (i.e. counterclockwise) direction all the way to point 2. (From our discussion of the 1D
WKB approximation in Sec. 2.4, we may expect such generalization to be valid even if k changes,
sufficiently slowly, along the paths.)
Our result (11)-(12) shows that the particle counting rate oscillates as a function of the difference
(l2 – l1), which in turn changes with the detector’s position, giving the famous interference pattern, with
the amplitude proportional to the product |a1a2|, and hence vanishing if any of the slits is closed. For
the wave theory, this is a well-known result,4 but for particle physics, it was (and still is :-) rather
shocking. Indeed, our analysis is valid for a very low particle emission rate, so that there is no other way
to interpret the pattern other than resulting from a particle’s interference with itself – or rather the
interference of its de Broglie waves passing through each of two slits.5 Nowadays, such interference is
reliably observed not only for electrons but also for much heavier particles: atoms and molecules,
including very complex organic ones;6 moreover, atomic interferometers are used as ultra-sensitive
instruments for measurements of gravity, rotation, and tilt.7
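The interference pattern described by Eqs. (10)-(12) may be illustrated by a short calculation. The sketch below is based on an assumed planar geometry that is not specified in the text: the source is on the axis at the distance `z_source` before the partition, the slits are at the transverse positions ±d/2, and the detector is at the transverse position x on a plane at the distance `z_det` behind the partition; the common factor c f+ A of the amplitudes a1,2 is set to 1, and all names are illustrative:

```python
import cmath
import math

def paths(x, d, z_source, z_det):
    """Return [(l', l'')] for slits 1 and 2, at the transverse positions -d/2 and +d/2."""
    result = []
    for x_slit in (-d / 2, +d / 2):
        l_in = math.hypot(x_slit, z_source)      # source (on the axis) to the slit
        l_out = math.hypot(x - x_slit, z_det)    # slit to the detector
        result.append((l_in, l_out))
    return result

def counting_rate(x, k, d=1.0, z_source=50.0, z_det=100.0, open_slits=(True, True)):
    """|a1 exp(i k l1) + a2 exp(i k l2)|^2 following Eq. (3.10); closed slits are skipped."""
    psi = 0j
    for (l_in, l_out), is_open in zip(paths(x, d, z_source, z_det), open_slits):
        if is_open:
            psi += cmath.exp(1j * k * (l_in + l_out)) / (l_in * l_out)
    return abs(psi) ** 2

def rate_from_eq_3_11(x, k, d=1.0, z_source=50.0, z_det=100.0):
    """The same rate from Eqs. (3.11)-(3.12)."""
    (l1_in, l1_out), (l2_in, l2_out) = paths(x, d, z_source, z_det)
    a1, a2 = 1 / (l1_in * l1_out), 1 / (l2_in * l2_out)
    phi12 = k * ((l2_in + l2_out) - (l1_in + l1_out))
    return a1**2 + a2**2 + 2 * a1 * a2 * math.cos(phi12)

k = 20.0
for x in (0.0, 5.0, 10.0, 15.0):
    print(x, counting_rate(x, k), rate_from_eq_3_11(x, k))
print(counting_rate(0.0, k, open_slits=(True, False)))  # one slit closed: no interference term
```

The two columns coincide, and at x = 0, where the two paths are equal, the rate is four times that with one slit closed, i.e. the interference term there doubles the sum of the two single-slit rates.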
Let us now discuss a very interesting effect of magnetic field on quantum interference. To
simplify our discussion, let us consider a slightly different version of the two-slit experiment, in which
each of the two alternative paths is constricted to a narrow channel using partial confinement – see Fig.
2. (In this arrangement, moving the particle detector without changing channels’ geometry, and hence
local values of k may be more problematic experimentally, so let us think about its position r as fixed.)
In this case, because of the effect of the walls providing the path confinement, we cannot use Eqs. (10)
for the amplitudes a1,2. However, from the discussions in Sec. 1.6 and Sec. 2.2, it should be clear that
the first of the expressions (10) remains valid, though maybe with a value of k specific for each channel.
Fig. 3.2. The AB effect.
In this geometry, we can apply some local magnetic field B , say normal to the plane of particle
motion, whose lines would pierce, but not touch the contour C drawn along the particle propagation
4 See, e.g., a detailed discussion in EM Sec. 8.4.

5 Here I have to mention the fascinating experiments (first performed in 1987 by C. Hong et al. with photons, and
recently, in 2015, by R. Lopes et al., with non-relativistic particles – helium atoms) on the interference of de
Broglie waves of independent but identical particles, in the same internal quantum state and virtually the same
values of E and k. These experiments raise the important issue of particle indistinguishability, which will be
discussed in Sec. 8.1.
6 See, e.g., the recent demonstration of the quantum interference of oligo-porphyrin molecules, consisting of
~2,000 atoms, with a total mass above 25,000 mp – Y. Fein et al., Nature Physics 15, 1242 (2019).
7 See, e.g., the review paper by A. Cronin, J. Schmiedmayer, and D. Pritchard, Rev. Mod. Phys. 81, 1051 (2009).
channels – see the dashed line in Fig. 2. In classical electrodynamics,8 the external magnetic field’s effect on a particle with electric charge q is described by the Lorentz force
$$\mathbf{F}_B = q\mathbf{v}\times\mathbf{B}, \qquad (3.14)$$
where B is the field value at the point of the particle’s location, so that for the experiment shown in Fig. 2, FB = 0, and the field would not affect the particle motion at all. In quantum mechanics, this is not so, and the field does affect the probability density w, even if B = 0 at all points where the wavefunction ψ(r) is not equal to zero.
In order to describe this surprising effect, let us first develop a general framework for an account
of electromagnetic field effects on a quantum particle, which will also give us some by-product results
important for forthcoming discussions. To do that, we need to calculate the Hamiltonian of a charged
particle in electric and magnetic fields. For an electrostatic field, this is easy. Indeed, from classical
electrodynamics we know that such a field may be represented as a gradient of its electrostatic potential $\phi$,
$$\mathbf{E} = -\nabla\phi(\mathbf{r}), \tag{3.15}$$
so that the force exerted by the field on a particle with electric charge $q$,
$$\mathbf{F}_E = q\mathbf{E}, \tag{3.16}$$
may be described by adding the field-induced potential energy,
$$U(\mathbf{r}) = q\phi(\mathbf{r}), \tag{3.17}$$
to other (possible) components of the full potential energy of the particle. As was already discussed in
Sec. 1.4, such potential energy may be included in the particle’s Hamiltonian operator just by adding it
to the kinetic energy operator – see Eq. (1.41).
However, the magnetic field’s effect is peculiar: since its Lorentz force (14) cannot do any work
on a classical particle:
$$dW_B \equiv \mathbf{F}_B\cdot d\mathbf{r} = \mathbf{F}_B\cdot\mathbf{v}\,dt = q(\mathbf{v}\times\mathbf{B})\cdot\mathbf{v}\,dt = 0, \tag{3.18}$$
the field cannot be represented by any potential energy, so it may not be immediately clear how to
account for it in the Hamiltonian. The crucial help comes from the analytical-mechanics approach to
classical electrodynamics:9 in the non-relativistic limit, the Hamiltonian function of a particle in an
electromagnetic field looks like that in the electric field only:
$$H = \frac{mv^2}{2} + U = \frac{p^2}{2m} + q\phi; \tag{3.19}$$
however, the momentum $\mathbf{p} \equiv m\mathbf{v}$ that participates in this expression is now the difference
$$\mathbf{p} = \mathbf{P} - q\mathbf{A}. \tag{3.20}$$
Here A is the vector potential, defined by the well-known relations for the electric and magnetic fields:10
8 See, e.g., EM Sec. 5.1. Note that Eq. (14), as well as all other formulas of this course, are in the SI units.
10 See, e.g., EM Sec. 6.1, in particular Eqs. (6.7).
$$\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}, \tag{3.21}$$
while $\mathbf{P}$ is the canonical momentum, whose Cartesian components may be calculated (in classical mechanics) from
the Lagrangian function $\mathcal{L}$ using the standard formula of analytical mechanics,
$$P_j \equiv \frac{\partial\mathcal{L}}{\partial v_j}. \tag{3.22}$$
To emphasize the difference between the two momenta, $\mathbf{p} = m\mathbf{v}$ is frequently called the
kinematic momentum (or “mv-momentum”). The distinction between $\mathbf{p}$ and $\mathbf{P} = \mathbf{p} + q\mathbf{A}$ becomes
clearer if we notice that the vector potential is not gauge-invariant: according to the second of Eqs. (21),
at the so-called gauge transformation
$$\mathbf{A} \to \mathbf{A} + \nabla\chi, \tag{3.23}$$
with an arbitrary single-valued scalar gauge function $\chi = \chi(\mathbf{r}, t)$, the magnetic field does not change.
Moreover, according to the first of Eqs. (21), if we make the simultaneous replacement
$$\phi \to \phi - \frac{\partial\chi}{\partial t}, \tag{3.24}$$
the gauge transformation does not affect the electric field either. With that, the gauge function’s choice
does not affect the classical particle’s equation of motion, and hence the velocity $\mathbf{v}$ and momentum $\mathbf{p}$.
Hence, the kinematic momentum is gauge-invariant, while $\mathbf{P}$ is not, because according to Eqs. (20) and
(23), the introduction of $\chi$ changes it by $q\nabla\chi$.
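The gauge invariance of the magnetic field under the transformation (23) can be checked numerically. The following is a minimal sketch, not part of the original notes: the field magnitude `B0`, the sample gauge function inside `grad_chi`, and the test point are arbitrary illustrative choices, and the curl is estimated with central finite differences.

```python
import math

B0 = 2.0  # hypothetical uniform field along z, in arbitrary units


def A_symmetric(x, y, z):
    """Axially symmetric vector potential of the uniform field B0 * n_z."""
    return (-0.5 * B0 * y, 0.5 * B0 * x, 0.0)


def grad_chi(x, y, z):
    """Analytic gradient of the sample gauge function
    chi = 0.5*B0*x*y + 0.3*sin(x)*y*z (an arbitrary illustrative choice)."""
    return (0.5 * B0 * y + 0.3 * math.cos(x) * y * z,
            0.5 * B0 * x + 0.3 * math.sin(x) * z,
            0.3 * math.sin(x) * y)


def A_transformed(x, y, z):
    """Vector potential after the gauge transformation A -> A + grad(chi)."""
    return tuple(a + g for a, g in zip(A_symmetric(x, y, z), grad_chi(x, y, z)))


def curl(A, x, y, z, h=1e-5):
    """Central-difference estimate of curl A at the point (x, y, z)."""
    def d(component, axis):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        return (A(*plus)[component] - A(*minus)[component]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


point = (0.7, -1.3, 0.4)
B_before = curl(A_symmetric, *point)
B_after = curl(A_transformed, *point)
print("B before:", B_before)
print("B after: ", B_after)
```

Both estimates should agree (up to the finite-difference error) with the field $B_0\mathbf{n}_z$, while the two vector potentials themselves differ at every point.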
Now the standard way of transfer to quantum mechanics is to treat the canonical rather than
kinematic momentum as prescribed by the correspondence postulate discussed in Sec. 1.2. This means
that in the wave mechanics, the operator of this variable is still given by Eq. (1.26):11
$$\text{(Canonical momentum operator)}\qquad \hat{\mathbf{P}} = -i\hbar\nabla. \tag{3.25}$$
Hence the Hamiltonian operator corresponding to the classical function (19) is
$$\hat{H} = \frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2 + q\phi \equiv -\frac{\hbar^2}{2m}\left(\nabla - i\frac{q}{\hbar}\mathbf{A}\right)^2 + q\phi, \tag{3.26}$$
so that the stationary Schrödinger equation (1.60) of a particle moving in an electromagnetic field (but
otherwise free) is
$$\text{(Charged particle in EM field)}\qquad -\frac{\hbar^2}{2m}\left(\nabla - i\frac{q}{\hbar}\mathbf{A}\right)^2\psi + q\phi\psi = E\psi. \tag{3.27}$$
We may now repeat all the calculations of Sec. 1.4 for the case $\mathbf{A} \neq 0$, and get the following
generalized expression for the probability current density:
11 The validity of this choice is clear from the fact that if the kinematic momentum were described by this
differential operator, the Hamiltonian operator corresponding to the classical Hamiltonian function (19), and the
corresponding Schrödinger equation, would not describe the magnetic field effects at all.
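$$\mathbf{j} = \frac{\hbar}{2im}\left[\psi^*\left(\nabla - i\frac{q}{\hbar}\mathbf{A}\right)\psi - \text{c.c.}\right] = \frac{1}{2m}\left[\psi^*\hat{\mathbf{p}}\psi + \text{c.c.}\right] = \frac{\hbar}{m}|\psi|^2\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right). \tag{3.28}$$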
We see that the current density is gauge-invariant (as required for any observable) only if the
wavefunction’s phase $\varphi$ changes as
$$\varphi \to \varphi + \frac{q}{\hbar}\chi. \tag{3.29}$$
This may be a point of conceptual concern: since quantum interference is described by the
spatial dependence of the phase $\varphi$, can the observed interference pattern depend on the gauge function’s
choice? (That would not make any sense, because we may change the gauge in our mind.) Fortunately,
this is not so, because the spatial phase difference between two interfering paths, participating in Eq.
(12), is gauge-transformed as
$$\varphi_{12} \to \varphi_{12} + \frac{q}{\hbar}\left(\chi_2 - \chi_1\right). \tag{3.30}$$
But $\chi$ has to be a single-valued function of coordinates, hence in the limit when the points 1 and 2
coincide, $\chi_1 = \chi_2$, so that the phase difference is gauge-invariant, and so is the interference pattern.
However, this phase difference may be affected by the magnetic field, even if it is localized outside
the channels in which the particle propagates. Indeed, in this case, the field cannot affect the particle’s
velocity $\mathbf{v}$ and the probability current density $\mathbf{j}$:
$$\mathbf{j}(\mathbf{r})\big|_{B\neq 0} = \mathbf{j}(\mathbf{r})\big|_{B=0}, \tag{3.31}$$
so that the last form of Eq. (28) yields
$$\nabla\varphi(\mathbf{r})\big|_{B\neq 0} = \nabla\varphi(\mathbf{r})\big|_{B=0} + \frac{q}{\hbar}\mathbf{A}. \tag{3.32}$$
Integrating this equation along the contour C (Fig. 2), for the phase difference between points 1 and 2
we get
$$\varphi_{12}\big|_{B\neq 0} = \varphi_{12}\big|_{B=0} + \frac{q}{\hbar}\int_C \mathbf{A}\cdot d\mathbf{r}, \tag{3.33}$$
where the integral should be taken along the same contour C as before (in Fig. 2, from point 1,
counterclockwise along the dashed line to point 2). But from classical electrodynamics we know12 that
as points 1 and 2 tend to each other, i.e. the contour C becomes closed, the last integral is just the
magnetic flux $\Phi \equiv \int B_n\,d^2r$ through any smooth surface limited by the contour, so that Eq. (33) may be
rewritten as
$$\text{(AB effect)}\qquad \varphi_{12}\big|_{B\neq 0} = \varphi_{12}\big|_{B=0} + \frac{q}{\hbar}\Phi. \tag{3.34a}$$
In terms of the interference pattern, this means a shift of interference fringes, proportional to the
magnetic flux (Fig. 3).
12 See, e.g., EM Sec. 5.3.
Fig. 3.3. Typical results of a two-paths interference experiment by A. Tonomura et al., Phys. Rev.
Lett. 56, 792 (1986), showing the AB effect for electrons well shielded from the applied magnetic
field. In this particular experimental geometry, the AB effect produces a relative shift of the
interference patterns inside and outside the dark ring. (a) $\Phi = \Phi_0'/2$, (b) $\Phi = \Phi_0'$. © 1986 APS.
This phenomenon is usually called the “Aharonov-Bohm” (or just the AB) effect.13 For particles
with a single elementary charge, $q = \pm e$, this result is frequently represented as
$$\varphi_{12}\big|_{B\neq 0} = \varphi_{12}\big|_{B=0} \pm 2\pi\frac{\Phi}{\Phi_0'}, \tag{3.34b}$$
where the fundamental constant $\Phi_0' \equiv 2\pi\hbar/e \approx 4.14\times 10^{-15}$ Wb has the meaning of the magnetic flux
necessary to change $\varphi_{12}$ by $2\pi$, i.e. to shift the interference pattern (11) by one period, and is called the
normal magnetic flux quantum (“normal” for reasons we will soon discuss).
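As a quick numerical illustration of Eqs. (34), the sketch below evaluates the normal flux quantum and the AB phase shift for a given flux. The function name `ab_phase_shift` and the chosen flux values are illustrative assumptions; the constants are the exact SI values of the Planck constant and the elementary charge.

```python
import math

H_PLANCK = 6.62607015e-34          # Planck constant, J*s (exact SI value)
HBAR = H_PLANCK / (2 * math.pi)    # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19         # elementary charge, C (exact SI value)

# Normal flux quantum, Phi_0' = 2*pi*hbar/e, in webers
PHI0_NORMAL = 2 * math.pi * HBAR / E_CHARGE


def ab_phase_shift(flux_wb, q=E_CHARGE):
    """Phase shift (q/hbar)*Phi of Eq. (3.34a), in radians."""
    return q * flux_wb / HBAR


print(f"Phi_0' = {PHI0_NORMAL:.3e} Wb")
for fraction in (0.5, 1.0):
    shift = ab_phase_shift(fraction * PHI0_NORMAL)
    print(f"Phi = {fraction} Phi_0' -> phase shift = {shift / math.pi:.3f} pi")
```

The two flux values correspond to panels (a) and (b) of Fig. 3: half a flux quantum shifts the pattern by half a period, and a full quantum by one period.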
The AB effect may be “almost explained” classically, in terms of Faraday’s electromagnetic
induction. Indeed, a change $\Delta\Phi$ of magnetic flux in time induces a vortex-like electric field $\Delta\mathbf{E}$ around
it. That field is not restricted to the magnetic field’s location, i.e. may reach the particle’s trajectories.
The field’s magnitude (or rather that of its integral along the contour C) may be readily calculated by
integration of the first of Eqs. (21):
$$\Delta V \equiv \oint_C \Delta\mathbf{E}\cdot d\mathbf{r} = -\frac{d\Phi}{dt}. \tag{3.35}$$
I hope that in this expression the reader readily recognizes the integral (“undergraduate”) form of
Faraday’s induction law.14 To calculate the effect of this electric field on the particles, let us assume that
the variable separation described by Eq. (1.57) may be applied to the end points 1 and 2 of the particle’s
alternative trajectories as two independent systems,15 and that the magnetic flux’s change by a certain
amount $\Delta\Phi$ does not change the spatial factors $\psi_{1,2}$, with the phases $\varphi_{1,2}$ included into the
time-dependent factors $a_{1,2}$. Then we may repeat the arguments that were used in Sec. 1.6 at the discussion of
13 I prefer the latter, less personal name, because the effect had been actually predicted by Werner Ehrenberg
and Raymond Siday in 1949, before it was rediscovered (also theoretically) by Y. Aharonov and D. Bohm in
1959. To be fair to Aharonov and Bohm, it was their work that triggered a wave of interest in the phenomenon,
resulting in its first experimental observation by Robert G. Chambers in 1960 and several other groups soon after
that. Later, the experiments were improved using ferromagnetic cores and/or superconducting shielding to provide
a better separation between the electrons and the applied field – as in the work whose result is shown in Fig. 3.
14 See, e.g., EM Sec. 6.1.
15 This assumption may seem a little bit of a stretch, but the resulting relation (37) may be indeed proven for a
rather realistic model, though that would take more time/space than I can afford.
the Josephson effect, and since the change (35) leads to the change of the potential energy difference
$\Delta U = q\Delta V$ between the two points, we may rewrite Eq. (1.72) as
$$\frac{d\varphi_{12}}{dt} = -\frac{\Delta U}{\hbar} = -\frac{q}{\hbar}\Delta V = \frac{q}{\hbar}\frac{d\Phi}{dt}. \tag{3.36}$$
Integrating this relation over the time of the magnetic field’s change, we get
$$\Delta\varphi_{12} = \frac{q}{\hbar}\Delta\Phi, \tag{3.37}$$
which is, superficially, the same result as given by Eq. (34).
However, this interpretation of the AB effect is limited. Indeed, it requires the particle to be in
the system (on the way from the source to the detector) during the flux change, i.e. when the induced
electric field $\Delta\mathbf{E}$ may affect its dynamics. On the contrary, Eq. (34) predicts that the interference pattern
would shift even if the field change has been made when there was no particle in the system, and hence
the field $\Delta\mathbf{E}$ could not be felt by it. Experiment confirms the latter conclusion. Hence, there is something
in the space where a particle propagates (i.e., outside of the magnetic field region), that transfers the
information about even the static magnetic field to the particle. The standard interpretation of this
surprising fact is as follows: the vector potential A is not just a convenient mathematical tool, but a
physical reality (just as its scalar counterpart $\phi$), despite the large freedom of choice we have in
prescribing specific spatial and temporal dependences of these potentials without affecting any
observable – see Eqs. (23)-(24).
To conclude this section, let me briefly discuss the very interesting form taken by the AB effect
in superconductivity. To be applied to this case, our results require two changes. The first one is simple:
since superconductivity may be interpreted as the Bose-Einstein condensate of Cooper pairs with
electric charge $q = -2e$, $\Phi_0'$ has to be replaced by the so-called superconducting flux quantum16
$$\text{(Superconducting flux quantum)}\qquad \Phi_0 \equiv \frac{\pi\hbar}{e} \approx 2.07\times 10^{-15}\ \text{Wb} = 2.07\times 10^{-7}\ \text{Gs}\cdot\text{cm}^2. \tag{3.38}$$
Second, since the pairs are Bose particles and are all condensed in the same (ground) quantum
state, described by the same wavefunction, the total electric current density, proportional to the
probability current density $\mathbf{j}$, may be extremely large: in practical superconducting materials, up to
$\sim 10^{12}$ A/m$^2$. In these conditions, one cannot neglect the contribution of that current into the magnetic
field and hence into its flux $\Phi$, which (according to the Lenz rule of the Faraday induction law) tries to
compensate for changes in external flux. To see possible results of this contribution, let us consider a
closed superconducting loop (Fig. 4). Due to the Meissner effect (which is just another version of the
flux self-compensation), the current and magnetic field penetrate into a superconductor by only a small
distance (called the London penetration depth) $\delta_L \sim 10^{-7}$ m.17 If the loop is made of a superconducting
“wire” that is considerably thicker than $\delta_L$, we may draw a contour deep inside the wire, where the
current density is negligible. According to the last form of Eq. (28), everywhere at the contour,
$$\nabla\varphi - \frac{q}{\hbar}\mathbf{A} = 0. \tag{3.39}$$
16 One more bad, though common term: a metallic wire may (super)conduct, but a quantum hardly can!
17 For more detail, see EM Sec. 6.4.
Integrating this equation along the contour as before (in Fig. 4, from some point 1, all the way around
the ring to the virtually coinciding point 2), we need to have the phase difference $\varphi_{12}$ equal to $2\pi n$,
because the wavefunction $\psi \propto \exp\{i\varphi\}$ in the initial and final points 1 and 2 should be “essentially” the
same, i.e. produce the same observables. As a result, we get
$$\text{(Flux quantization)}\qquad \Phi \equiv \oint_C \mathbf{A}\cdot d\mathbf{r} = \frac{\hbar}{q}\,2\pi n = n\Phi_0. \tag{3.40}$$
This is the famous flux quantization effect,18 which justifies the term “magnetic flux quantum” for the
constant $\Phi_0$ given by Eq. (38).
[Figure: a superconducting loop with the contour C drawn inside it, between the nearby points 1 and 2.]
Fig. 3.4. The magnetic flux quantization in a superconducting loop (schematically).
Unfortunately, in this course I have no space/time to discuss the very interesting effects of
“partial flux quantization” that arise when a superconductor loop is closed with a Josephson junction,
forming the so-called Superconducting QUantum Interference Device – “SQUID”. Such devices are
used, in particular, for supersensitive magnetometry and ultrafast, low-power computing.19
3.2. Landau levels and quantum Hall effect

In the last section, we have used the Schrödinger equation (27) for an analysis of static magnetic
field effects in “almost-1D”, circular geometries shown in Figs. 1, 2, and 4. However, this equation
describes very interesting effects in higher dimensions as well, especially in the 2D case. Let us
consider a quantum particle free to move in the [x, y] plane only (say, due to its strong confinement in
the perpendicular direction z; see the discussion in Sec. 1.8). Taking the confinement energy for the
reference, we may reduce Eq. (27) to a similar equation, but with the Laplace operator acting only in the
directions x and y:
$$-\frac{\hbar^2}{2m}\left(\mathbf{n}_x\frac{\partial}{\partial x} + \mathbf{n}_y\frac{\partial}{\partial y} - i\frac{q}{\hbar}\mathbf{A}\right)^2\psi = E\psi. \tag{3.41}$$
Let us find its solutions for the simplest case when the applied static magnetic field is uniform
and perpendicular to the motion plane:
$$\mathbf{B} = B\,\mathbf{n}_z. \tag{3.42}$$
18 It was predicted in 1949 by Fritz London and experimentally discovered (independently and virtually
simultaneously) in 1961 by two experimental groups: B. Deaver and W. Fairbank, and R. Doll and M. Näbauer.
19 A brief review of these effects, and recommendations for further reading may be found in EM Sec. 6.5.
According to the second of Eqs. (21), this relation imposes the following restriction on the choice of the
vector potential:
Ay Ax
 B, (3.43)
x y
but the gauge transformations still give us a lot of freedom in its choice. The “natural” axially-
symmetric form, A = nB/2, where  = (x2 + y2)1/2 is the distance from some z-axis, leads to
cumbersome math. In 1930, L. Landau realized that the energy spectrum of Eq. (41) may be obtained by
making a much simpler, though very counter-intuitive choice:
Ax  0, Ay  B x  x0 , (3.44)
(with arbitrary x0), which evidently satisfies Eq. (43), though ignores the physical symmetry of the x and
y directions for the field (42).
Now, expanding the eigenfunction into the Fourier integral in the y-direction:
$$\psi(x, y) = \int X_k(x)\exp\{ik(y - y_0)\}\,dk, \tag{3.45}$$
we see that for each component of this integral, Eq. (41) yields a specific equation
$$-\frac{\hbar^2}{2m}\left\{\mathbf{n}_x\frac{d}{dx} + i\mathbf{n}_y\left[k - \frac{q}{\hbar}B\,(x - x_0)\right]\right\}^2 X_k = EX_k. \tag{3.46}$$
Since the two vectors inside the curly brackets are mutually perpendicular, the square has no cross-terms,
so that Eq. (46) reduces to
$$-\frac{\hbar^2}{2m}\frac{d^2X_k}{dx^2} + \frac{q^2}{2m}B^2\left(x - x_0'\right)^2X_k = EX_k, \quad\text{where } x_0' \equiv x_0 + \frac{\hbar k}{qB}. \tag{3.47}$$
But this 1D Schrödinger equation is identical to Eq. (2.261) for a 1D harmonic oscillator,20 with the
center at point $x_0'$, and frequency $\omega_0$ equal to
$$\omega_c = \frac{qB}{m}. \tag{3.48}$$
In the last expression, it is easy to recognize the cyclotron frequency of the classical particle’s rotation in
the magnetic field. (It may be readily obtained using the 2nd Newton law for a circular orbit of radius r,
$$m\frac{v^2}{r} = F_B = qvB, \tag{3.49}$$
and noting that the resulting ratio $v/r = qB/m$ is just the radius-independent angular velocity $\omega_c$ of the
particle’s rotation.) Hence, the energy spectrum for each Fourier component of the expansion (45) is the
same:
20 This result may become a bit less puzzling if we recall that at the classical circular cyclotron motion of a
particle, each of its Cartesian coordinates, including x, performs sinusoidal oscillations with frequency (48), just
as a 1D harmonic oscillator with this frequency.
 1
E n   c  n   , (3.50) Landau
levels
 2
independent of either x0, or y0, or k.
This is a good example of a highly degenerate system: for each eigenvalue En, there are many
similar eigenfunctions that differ only by the positions {x0, y0} of their centers, and the rate k of their
phase change along the y-axis. They may be used to assemble a large variety of linear combinations,
including 2D wave packets whose centers move along classical circular orbits. Note, however, that the
radius of such rotation cannot be smaller than the so-called Landau radius,
$$\text{(Landau radius)}\qquad r_L \equiv \left(\frac{\hbar}{m\omega_c}\right)^{1/2} = \left(\frac{\hbar}{qB}\right)^{1/2}, \tag{3.51}$$
which characterizes the minimum size of the wave packet, and follows from Eq. (2.276) after the
replacement $\omega_0 \to \omega_c$. This radius is remarkably independent of the particle mass, and may be
interpreted in the following way: the scale $BA_{\min}$ of the applied magnetic field’s flux through the
effective area $A_{\min} = 2\pi r_L^2$ of the smallest wave packet is just one normal flux quantum $\Phi_0' \equiv 2\pi\hbar/|q|$.
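The following sketch puts numbers into Eqs. (48), (50), and (51). It assumes, purely for illustration, a free electron (with its vacuum mass, not an effective mass in a solid) in a field of 1 T; the function names are hypothetical helpers, and the charge enters through its magnitude.

```python
import math

H_PLANCK = 6.62607015e-34          # Planck constant, J*s (exact SI value)
HBAR = H_PLANCK / (2 * math.pi)    # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19         # elementary charge, C (exact SI value)
M_ELECTRON = 9.1093837015e-31      # free-electron mass, kg


def cyclotron_frequency(B, q=E_CHARGE, m=M_ELECTRON):
    """Cyclotron frequency of Eq. (3.48), in rad/s (charge magnitude used)."""
    return abs(q) * B / m


def landau_level(n, B, q=E_CHARGE, m=M_ELECTRON):
    """Energy of the n-th Landau level, Eq. (3.50), in joules."""
    return HBAR * cyclotron_frequency(B, q, m) * (n + 0.5)


def landau_radius(B, q=E_CHARGE):
    """Landau radius of Eq. (3.51), in meters; note the absence of the mass."""
    return math.sqrt(HBAR / (abs(q) * B))


B = 1.0                            # assumed field, T
r_L = landau_radius(B)
flux_through_packet = B * 2 * math.pi * r_L ** 2

print(f"omega_c      = {cyclotron_frequency(B):.3e} rad/s")
print(f"hbar*omega_c = {landau_level(1, B) - landau_level(0, B):.3e} J")
print(f"r_L          = {r_L * 1e9:.1f} nm")
print(f"B*A_min / (2*pi*hbar/e) = {flux_through_packet * E_CHARGE / H_PLANCK:.6f}")
```

The last printed ratio checks the interpretation given above: the flux through the area $2\pi r_L^2$ equals one normal flux quantum, for any value of B.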
A detailed analysis of such wave packets (for which we would not have time in this course), in
particular proves the virtually evident fact: the applied magnetic field does not change the average
density $dN_2/dE$ of different 2D states on the energy scale, following from Eq. (1.99), but just
“assembles” them on the Landau levels (see Fig. 5a), so that the number of different orbital states on
each Landau level (per unit area) is
$$n_L \equiv \frac{N_2}{A} = \frac{1}{A}\left.\frac{dN_2}{dE}\right|_{B=0}\Delta E = \frac{1}{A}\left.\frac{dN_2}{d^2k}\right|_{B=0}\frac{d^2k/dk}{dE/dk}\,\Delta E = \frac{1}{A}\,\frac{A}{(2\pi)^2}\,\frac{2\pi k}{\hbar^2k/m}\,\hbar\omega_c = \frac{qB}{2\pi\hbar}. \tag{3.52}$$
This expression may again be interpreted in terms of magnetic flux quanta: $n_L\Phi_0' = B$, i.e. there is one
particular state on each Landau level per each normal flux quantum.
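A short numerical sketch of Eq. (52) follows. The carrier density and field values used in the example are hypothetical, chosen only to show how the number of filled Landau levels may be estimated; `landau_degeneracy` and `filled_levels` are illustrative helper names.

```python
H_PLANCK = 6.62607015e-34          # Planck constant 2*pi*hbar, J*s (exact SI value)
E_CHARGE = 1.602176634e-19         # elementary charge, C (exact SI value)


def landau_degeneracy(B, q=E_CHARGE):
    """Orbital states per unit area on one Landau level, Eq. (3.52), in 1/m^2."""
    return abs(q) * B / H_PLANCK


def filled_levels(n2, B, q=E_CHARGE):
    """Ratio n2/n_L: how many Landau levels the 2D density n2 can fill."""
    return n2 / landau_degeneracy(B, q)


n2 = 3.0e15                        # hypothetical 2D carrier density, 1/m^2
for B in (3.1, 6.2, 12.4):         # hypothetical field values, T
    print(f"B = {B:5.1f} T: n_L = {landau_degeneracy(B):.3e} m^-2, "
          f"n2/n_L = {filled_levels(n2, B):.2f}")
```

Doubling the field doubles the capacity of each level, so the same carrier density fills half as many levels.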
[Figure: energy diagram showing the levels E_n at B = 0 and at B ≠ 0, the level spacing ħω_c, the Fermi energy E_F, and the electrodes.]
Fig. 3.5. (a) The “assembly” of 2D states on Landau levels, and (b) filling the levels with
electrons at the quantum Hall effect.
The most famous application of the Landau levels picture is the explanation of the quantum Hall
effect21. It is usually observed in the “Hall bar” geometry sketched in Fig. 6, where electric current I is
passed through a rectangular conducting sample placed into magnetic field B perpendicular to the
21It was first observed in 1980 by a group led by Klaus von Klitzing, while the classical version (54) of the effect
was first observed by Edwin Hall a century earlier – in 1879.
sample’s plane. The classical analysis of the effect is based on the notion of the Lorentz force (14). As
the magnetic field is turned on, this force starts to deviate the effective charge carriers (electrons or
holes) from their straight motion between the electrodes, bending them toward the insulated sides of the
bar (in Fig. 6, parallel to the x-axis). Here the carriers accumulate, generating a gradually increasing
electric field E until its force (16) exactly balances the Lorentz force (14):
$$qE_y = qv_xB, \tag{3.53}$$
where $v_x$ is the drift velocity of the carriers along the bar (Fig. 6), providing the sustained balance
condition $E_y/v_x = B$ at each point of the sample.
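[Figure: a Hall bar of length l along the x-axis and width w along the y-axis, with the current I, the carrier velocity v and current density j, the electric field E, and the magnetic field B marked.]
Fig. 3.6. The Hall bar geometry. Darker rectangles show external (3D) electrodes.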
With $n_2$ carriers per unit area, in a sample of width $w$, this condition yields the following
classical expression for the so-called Hall resistance $R_H$, remarkably independent of $w$ and $l$:
$$\text{(Classical Hall effect)}\qquad R_H \equiv \frac{V_y}{I_x} = \frac{E_yw}{qn_2v_xw} = \frac{B}{qn_2}. \tag{3.54}$$
This formula is broadly used in practice for the measurement of the 2D density n2 of the charge carriers,
and of the carrier type – electrons with q = –e < 0, or holes with the effective charge q = +e > 0.
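The practical use of Eq. (54) mentioned above may be sketched as follows. The "measured" values in the example are invented for illustration, and the sign convention (B taken positive, so that the sign of R_H follows the sign of q) is an assumption of this sketch.

```python
E_CHARGE = 1.602176634e-19         # elementary charge, C (exact SI value)


def carriers_from_hall(R_H, B):
    """Return (n2, carrier type) from R_H = B/(q*n2), Eq. (3.54).

    Assumes B > 0 and R_H != 0, so that the sign of R_H is that of q.
    """
    q = -E_CHARGE if R_H < 0 else E_CHARGE
    n2 = B / (q * R_H)             # always positive by construction
    kind = "electrons" if q < 0 else "holes"
    return n2, kind


# Hypothetical measurement: R_H = -1040 ohm at B = 0.5 T
n2, kind = carriers_from_hall(-1040.0, 0.5)
print(f"n2 = {n2:.2e} m^-2, carriers: {kind}")
```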
However, in experiments with high-quality (low-defect) 2D well structures, at sub-kelvin
temperatures22 and high magnetic fields, the linear growth of $R_H$ with $B$, described by Eq. (54), is
interrupted by virtually horizontal plateaus (Fig. 7). Most remarkably, the experimental values of $R_H$ on
these plateaus are reproduced with extremely high accuracy (up to $\sim 10^{-9}$) from experiment to experiment
and, even more remarkably, from sample to sample.23 They are described by the following formula:
$$\text{(Quantum Hall effect)}\qquad R_H = \frac{1}{i}R_K, \quad\text{where } R_K \equiv \frac{2\pi\hbar}{e^2}, \tag{3.55}$$
so that
$$R_K = 25.812\,807\,459\,304\ldots\ \text{k}\Omega, \tag{3.56}$$
and $i$ is (only until the end of this section, following tradition!) the plateau number, i.e. a real integer.
22 In some systems, such as the graphene (virtually perfect 2D sheets of carbon atoms – see Sec. 4 below), the
effect may be more stable to thermal fluctuations, due to their topological properties, so that it may be observed
even at room temperature – see, e.g., K. Novoselov et al., Science 315, 1379 (2007). Also note that in some thin
ferromagnetic layers, the quantum Hall effects may be observed in the absence of an external magnetic field – see,
e.g., M. Götz et al., Appl. Phys. Lett. 112, 072102 (2018) and references therein.
23Due to this high accuracy (which is a rare exception in solid-state physics!), since 2018 the von Klitzing
constant RK is used in metrology for the “legal” ohm’s definition, with its value (56) considered fixed – see
Appendix CA: Selected Physical Constants.
Fig. 3.7. A typical record of the integer quantum Hall effect: the Hall resistance $R_H$ as a function of
the magnetic field $B$ (in tesla). The lower trace (with sharp peaks) shows the diagonal element,
$V_x/I_x$, of the resistance tensor. (Adapted from
https://www.nobelprize.org/nobel_prizes/physics/laureates/1998/press.html ).
This effect may be explained using the Landau level picture. The 2D sample is typically in a
weak contact with 3D electrodes whose conductivity electrons, at low temperatures, fill all states with
energies below a certain Fermi energy $E_F$ (see Fig. 5b). According to Eqs. (48) and (50), as $B$ is
increased, the spacing $\hbar\omega_c$ between the Landau levels increases proportionately, so that fewer and fewer
of these levels are below $E_F$ (and hence all their states are filled in equilibrium), and within certain
ranges of field variations, the number $i$ of the filled levels is constant. (In Fig. 5b, $i = 2$.) So, plugging
$n_2 = in_L$ and $q = -e$ into Eq. (54), and using Eq. (52) for $n_L$, we get
$$R_H = \frac{1}{i}\frac{B}{qn_L} = \frac{1}{i}\frac{2\pi\hbar}{e^2}, \tag{3.57}$$
i.e. exactly the experimental result (55).
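The arithmetic behind Eqs. (55)-(57) may be checked with the sketch below. It uses the exact SI values of the Planck constant and the elementary charge; the helper names are illustrative, and the sign of the carrier charge is ignored (only magnitudes are compared).

```python
H_PLANCK = 6.62607015e-34          # Planck constant 2*pi*hbar, J*s (exact SI value)
E_CHARGE = 1.602176634e-19         # elementary charge, C (exact SI value)

# The von Klitzing constant, R_K = 2*pi*hbar/e^2, in ohms
R_K = H_PLANCK / E_CHARGE ** 2


def plateau_resistance(i):
    """Hall resistance on the i-th plateau, Eq. (3.55), in ohms."""
    return R_K / i


def hall_resistance_filled(i, B):
    """Magnitude of Eq. (3.54) with n2 = i*n_L and n_L from Eq. (3.52)."""
    n_L = E_CHARGE * B / H_PLANCK
    return B / (E_CHARGE * i * n_L)


print(f"R_K = {R_K / 1e3:.9f} kOhm")
for i in (1, 2, 3, 4):
    print(f"i = {i}: R_H = {plateau_resistance(i):.3f} Ohm")
```

Note that `hall_resistance_filled(i, B)` returns the same value for any field B, which is the essence of Eq. (57): the field dependence of the numerator is exactly compensated by that of the level capacity.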
This admittedly oversimplified explanation of the quantum Hall effect does not take into account
at least two important factors:
(i) the nonuniformity of the background potential U(x, y) in realistic Hall bar samples, and the
role of the quasi-1D edge channels this nonuniformity produces;24 and
(ii) the Coulomb interaction of the electrons, in high-quality samples leading to the formation of
RH plateaus with not only integer but also fractional values of i (1/3, 2/5, 3/7, etc.).25
Unfortunately, a thorough discussion of these very interesting features is well beyond the
framework of this course.26,27
24 Such quasi-1D regions, with the width of the order of rL, form along the lines where the Landau levels cross the
Fermi surface, and are actually responsible for all the electron transfer at the quantum Hall effect (giving the
pioneering example of what is nowadays called the topological insulators). The particle motion along these
channels is effectively one-dimensional; because of this, it cannot be affected by modest unintentional
nonuniformities of the potential U(x, y). This fact is responsible for the extraordinary accuracy of Eq. (55).
25 This fractional quantum Hall effect was discovered in 1982 by D. Tsui, H. Stormer, and A. Gossard. In
contrast, the effect described by Eq. (55) with an integer i (Fig. 7) is now called the integer quantum Hall effect.
26 For a comprehensive discussion of these effects, I can recommend, e.g., either the monograph by D. Yoshioka,
The Quantum Hall Effect, Springer, 1998, or the review by D. Yennie, Rev. Mod. Phys. 59, 781 (1987). (See also
the later publications cited above.)
27 Note also that the quantum Hall effect is sometimes discussed in terms of the so-called Berry phase, one of the
geometric phases – the notion apparently pioneered by S. Pancharatnam in 1956. However, in the “usual”
3.3. Scattering and diffraction

The second class of quantum effects, which becomes richer in multi-dimensional spaces, is
typically referred to as either diffraction or scattering – depending on the context. In classical physics,
these two terms are used to describe very different effects. The term “diffraction” is used for the
interference of the waves re-emitted by elementary components of extended objects, under the effect of
a single incident wave.28 On the other hand, the term “scattering” is used in classical mechanics to
describe the result of the interaction of a beam of incident particles29 with such an extended object,
called the scatterer – see Fig. 8.
[Figure: incident particles with wave vector k_i hitting a scatterer of size a; scattered particles with wave vector k reaching a detector at distance r >> a, 1/k.]
Fig. 3.8. Scattering (schematically).
Most commonly, the detector of the scattered particles is located at a large distance r >> a from
the scatterer. In this case, the main observable independent of r is the flux (the number per unit time) of
particles scattered in a certain direction, i.e. their flux per unit solid angle $\Omega$. Since it is proportional to
the incident flux of particles per unit area, the efficiency of scattering in a particular direction may be
characterized by the ratio of these two fluxes. This ratio is called the differential cross-section of the
scatterer:
$$\text{(Differential cross-section)}\qquad \frac{d\sigma}{d\Omega} \equiv \frac{\text{flux of scattered particles per unit solid angle}}{\text{flux of incident particles per unit area}}. \tag{3.58}$$
Such terminology and notation stem from the fact that the integral of $d\sigma/d\Omega$ over all scattering angles,
$$\text{(Total cross-section)}\qquad \sigma \equiv \oint\frac{d\sigma}{d\Omega}\,d\Omega = \frac{\text{total flux of scattered particles}}{\text{incident flux per unit area}}, \tag{3.59}$$
evidently having the dimensionality of area, has a simple interpretation as the total cross-section of
scattering. For the simplest case when a solid object scatters all classical particles hitting its surface, but
does not affect the particles flying by it, $\sigma$ is just the geometric area of the scatterer, as observed from
the direction of the incident particles. In classical mechanics, we first calculate the particle’s scattering
quantum Hall effect the Berry phase equals zero, and I believe that this concept should be saved for the discussion
of more topologically involved systems. Unfortunately, I will have no time/space for a discussion of such systems
in this course, and have to refer the interested reader to special literature – see, e.g., either the key papers collected
by A. Shapere and F. Wilczek, Geometric Phases in Physics, World Scientific, 1992, or the monograph by A.
Bohm et al., The Geometric Phase in Quantum Systems, Springer, 2003.
28 The notion of interference is very close to diffraction, but the former term is typically reserved for the wave re-
emission by just a few components, such as two slits in the Young experiment – see Figs. 1 and 2. A detailed
discussion of diffraction and interference of electromagnetic waves may be found in EM Secs. 8.3-8.8.
29 In the context of classical waves, the term “scattering” is typically reserved for wave interaction with
disordered sets of small objects – see, e.g., EM Sec. 8.3.
angle as a function of its impact parameter b and then average the result over all values of b, considered
random. 30
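The classical recipe just described may be illustrated with the simplest example, a hard sphere of radius a. The sketch below is not taken from the notes: it assumes the standard classical-mechanics relations for this case, namely the impact parameter b = a cos(θ/2) for the scattering angle θ and the general formula dσ/dΩ = (b/sin θ)|db/dθ|, and then integrates Eq. (58) over all angles as in Eq. (59).

```python
import math


def hard_sphere_dsigma_domega(theta, a):
    """Classical differential cross-section of a hard sphere of radius a.

    Assumes b = a*cos(theta/2) and dsigma/dOmega = (b/sin(theta))*|db/dtheta|.
    """
    b = a * math.cos(theta / 2)                 # impact parameter for this angle
    db_dtheta = 0.5 * a * math.sin(theta / 2)   # |db/dtheta|
    return b * db_dtheta / math.sin(theta)


def total_cross_section(dsigma_domega, n_steps=2000):
    """Midpoint-rule integral of Eq. (3.59), assuming axial symmetry."""
    d_theta = math.pi / n_steps
    total = 0.0
    for j in range(n_steps):
        theta = (j + 0.5) * d_theta             # midpoints avoid sin(theta) = 0
        total += dsigma_domega(theta) * 2 * math.pi * math.sin(theta) * d_theta
    return total


a = 1.5                                         # hypothetical sphere radius
sigma = total_cross_section(lambda th: hard_sphere_dsigma_domega(th, a))
print(f"sigma = {sigma:.6f}, pi*a^2 = {math.pi * a ** 2:.6f}")
```

In this model the differential cross-section turns out to be independent of θ, and the total cross-section reproduces the geometric area πa² of the sphere, in agreement with the statement above.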
In quantum mechanics, due to the particle/wave duality, a relatively broad, parallel beam of
incident particles of the same energy $E$ may be fairly represented with a plane de Broglie wave (1.88):
$$\psi_i = |\psi_i|\exp\{i\mathbf{k}_i\cdot\mathbf{r}\}, \tag{3.60}$$
with the free-space wave number $k_i = k = (2mE)^{1/2}/\hbar$. As a result, the particle scattering becomes a
synonym of the de Broglie wave diffraction, and (somewhat counter-intuitively) the description of the
effect becomes simpler, excluding the notion of the impact parameter. Indeed, the wave (60)
corresponds to a constant probability current density (1.49):
$$\mathbf{j}_i = |\psi_i|^2\,\frac{\hbar}{m}\,\mathbf{k}_i, \tag{3.61}$$
which is exactly the flux of incident particles per unit area that is used in the denominator of Eq. (58),
while the numerator of that fraction may be simply expressed via the probability current density $\mathbf{j}_s$ of the
scattered de Broglie waves:
$$\frac{d\sigma}{d\Omega} = \frac{j_sr^2}{j_i}, \quad\text{at } r \gg a. \tag{3.62}$$
Hence our task is reduced to the calculation of $j_s$ at sufficiently large distances $r$ from the
scatterer. For that, let us rewrite the stationary Schrödinger equation for the elastic scattering problem
(when the energy $E$ of the scattered particles is the same as that of the incident particles) in the form
$$\left(E - \hat{H}_0\right)\psi = U(\mathbf{r})\psi, \quad\text{with } \hat{H}_0 \equiv -\frac{\hbar^2}{2m}\nabla^2, \text{ and } E = \frac{\hbar^2k^2}{2m}, \tag{3.63}$$
where the potential energy $U(\mathbf{r})$ describes the effect of the scatterer. Looking for the solution of Eq. (63)
in the natural form
$$\psi = \psi_i + \psi_s, \tag{3.64}$$
where $\psi_i$ is the incident wave (60) and $\psi_s$ has the sense of the scattered wave, and taking into account
that the former wave satisfies the free-space Schrödinger equation
$$\hat{H}_0\psi_i = E\psi_i, \tag{3.65}$$
we may reduce Eq. (63) to either of the following equivalent forms:
$$\left(E - \hat{H}_0\right)\psi_s = U(\mathbf{r})\left(\psi_i + \psi_s\right), \qquad \left(\nabla^2 + k^2\right)\psi_s = \frac{2m}{\hbar^2}U(\mathbf{r})\psi. \tag{3.66}$$
For applications, an integral form of this equation is more convenient. To derive it, we may look
at the second of Eqs. (66) as a linear, inhomogeneous differential equation for the function $\psi_s$, thinking
of its right-hand side as a known “source”. The solution of such an equation obeys the linear
superposition principle, i.e. we may represent it as the sum of the waves outgoing from all elementary
volumes $d^3r'$ of the scatterer. Mathematically, this sum may be expressed as either
30 See, e.g., CM Sec. 3.5.
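$$\psi_s(\mathbf{r}) = \frac{2m}{\hbar^2}\int U(\mathbf{r}')\psi(\mathbf{r}')G(\mathbf{r}, \mathbf{r}')\,d^3r', \tag{3.67a}$$
or, equivalently, as31
$$\psi(\mathbf{r}) = \psi_i(\mathbf{r}) + \frac{2m}{\hbar^2}\int U(\mathbf{r}')\psi(\mathbf{r}')G(\mathbf{r}, \mathbf{r}')\,d^3r', \tag{3.67b}$$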
where $G(\mathbf{r}, \mathbf{r}')$ is the spatial Green’s function, defined as the elementary, spherically-symmetric
response of the 3D Helmholtz equation to a point source, i.e. the outward-propagating solution of the
following equation32
$$\left(\nabla^2 + k^2\right)G = \delta(\mathbf{r} - \mathbf{r}'). \tag{3.68}$$
But we already know such a solution of this equation (see Eq. (7) and its discussion):
$$G(\mathbf{r}, \mathbf{r}') = \frac{f_+}{R}e^{ikR}, \quad\text{where } \mathbf{R} \equiv \mathbf{r} - \mathbf{r}', \tag{3.69}$$
so that we need just to calculate the coefficient $f_+$ for Eq. (68). This can be done in several ways, for
example by noticing that at $R \ll k^{-1}$, the second term on the left-hand side of Eq. (68) is negligible, so
that it is reduced to the well-known Poisson equation with a delta-functional right-hand side, which
describes, for example, the electrostatic potential induced by a point electric charge. Either recalling the
Coulomb law or applying the Gauss theorem,33 we readily get the asymptote
$$G \to -\frac{1}{4\pi R}, \quad\text{at } kR \ll 1, \tag{3.70}$$
which is compatible with Eq. (69) only if $f_+ = -1/4\pi$, i.e. if
$$G(\mathbf{r}, \mathbf{r}') = -\frac{1}{4\pi R}e^{ikR}. \tag{3.71}$$
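The result (71) may be verified numerically. The sketch below is an illustration under stated assumptions: it uses the radial form of the Laplace operator for a spherically symmetric function, ∇²G = (1/R) d²(RG)/dR², estimates the derivatives by central finite differences, and takes arbitrary values of k and R. The first check confirms that G satisfies the homogeneous Helmholtz equation at R > 0; the second one (the Gauss-theorem argument) confirms that the flux of ∇G through a small sphere tends to 1, i.e. to the weight of the delta function in Eq. (68).

```python
import cmath
import math


def green(R, k):
    """Green's function of Eq. (3.71) as a function of the distance R > 0."""
    return -cmath.exp(1j * k * R) / (4 * math.pi * R)


def helmholtz_residual(R, k, h=1e-4):
    """(laplacian + k^2) G at R > 0, with laplacian G = (1/R) d^2(R*G)/dR^2."""
    def u(s):
        return s * green(s, k)
    second_derivative = (u(R + h) - 2 * u(R) + u(R - h)) / h ** 2
    return second_derivative / R + k ** 2 * green(R, k)


def source_strength(R, k):
    """Flux of grad G through a sphere of radius R: 4*pi*R^2 * dG/dR."""
    h = 1e-3 * R
    dG_dR = (green(R + h, k) - green(R - h, k)) / (2 * h)
    return 4 * math.pi * R ** 2 * dG_dR


k = 3.0                             # arbitrary wave number
print("residual at R = 0.8:       ", abs(helmholtz_residual(0.8, k)))
print("source strength at kR << 1:", source_strength(1e-3, k))
print("4*pi*R*G at kR << 1:       ", 4 * math.pi * 1e-3 * green(1e-3, k))
```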
Plugging this result into Eq. (67a), we get the following formal solution of Eq. (66):
$$\text{(Scattering problem: formal solution)}\qquad \psi_s(\mathbf{r}) = -\frac{m}{2\pi\hbar^2}\int U(\mathbf{r}')\frac{\psi(\mathbf{r}')}{R}e^{ikR}\,d^3r'. \tag{3.72}$$
Note that if the function $U(\mathbf{r})$ is smooth, the singularity in the denominator is integrable (i.e. not
dangerous); indeed, the contribution of a sphere with some radius $\mathcal{R} \to 0$, with the center at point $\mathbf{r}'$, into
this integral scales as
31 This formula is sometimes called the Lippmann-Schwinger equation, though more frequently this term is
reserved for either its operator form or the resulting equation for the spatial Fourier components of $\psi$ and $\psi_i$.
32 Please notice both the similarity and difference between this Green’s function and the propagator discussed in
Sec. 2.1. In both cases, we use the linear superposition principle to solve wave equations, but while Eq. (67) gives
the solution of the inhomogeneous equation (66), Eq. (2.44) does that for a homogeneous Schrödinger equation.
In the latter case, the elementary wave sources are the elementary parts of the initial wavefunction, rather than of
the equation’s right-hand side as in our current problem.
33 See, e.g., EM Sec. 1.2.
$$\int_{R<\mathcal{R}}\frac{d^3R}{R} = 4\pi\int_0^{\mathcal{R}}\frac{R^2\,dR}{R} = 4\pi\int_0^{\mathcal{R}}R\,dR = 2\pi\mathcal{R}^2 \to 0. \tag{3.73}$$
So far, our result (72) is exact, but its apparent simplicity is deceiving, because the wavefunction
$\psi$ on its right-hand side generally includes not only the incident wave $\psi_i$ but also the scattered wave $\psi_s$
– see Eq. (64). The most straightforward, and most common simplification of this problem, called the
Born approximation,34 is possible if the scattering potential U(r) is in some sense small. (We will derive
the quantitative condition of this smallness in a minute.) Since at U(r) = 0 the scattered wave $\psi_s$ has to
disappear, at small but non-zero U(r), $|\psi_s|$ has to be much smaller than $|\psi_i|$. In this case, on the
right-hand side of Eq. (72) we may ignore $\psi_s$ in comparison with $\psi_i$, getting
$$\psi_s(\mathbf{r}) = -\frac{m}{2\pi\hbar^2}\,|\psi_i|\int U(\mathbf{r}')\,\frac{\exp\{i\mathbf{k}_i\cdot\mathbf{r}'\}}{R}\,e^{ikR}\,d^3r'. \tag{3.74}$$  [Born approximation]
Actually, Eq. (74) gives us even more than we wanted: it evaluates the scattered wave at any
point, including those within the scattering object, while to spell out Eq. (62), we only need to find
the wave far from the scatterer, at $r \to \infty$. However, before going to that limit, we can use this general
formula to find a quantitative criterion of the Born approximation’s validity. For that, let us estimate the
magnitude of the right-hand side of this equation for a scatterer of a linear size ~a, and the potential
magnitude’s scale U0. The results are different in the following two limits:
(i) If ka << 1, then inside the scatterer (i.e., at distances r’ ~ a), both $\exp\{i\mathbf{k}_i\cdot\mathbf{r}'\}$ and the second
exponent under the integral in Eq. (74) change little, so that a crude but fair estimate of the solution’s
magnitude is
$$|\psi_s| \sim \frac{m}{2\pi\hbar^2}\,|\psi_i|\,U_0a^2. \tag{3.75}$$
(ii) In the opposite limit ka >> 1, the function under the integral is nearly periodic in one of the
spatial directions (that of the scattered wave propagation), so that the net integral accumulates only on
distances of the order of the de Broglie wavelength, ~$k^{-1}$, and the integral is correspondingly smaller:
$$|\psi_s| \sim \frac{m}{2\pi\hbar^2}\,|\psi_i|\,U_0\,\frac{a^2}{ka}. \tag{3.76}$$
These relations allow us to spell out the Born approximation’s condition, $|\psi_s| \ll |\psi_i|$, as
$$U_0 \ll \frac{\hbar^2}{ma^2}\max[ka, 1]. \tag{3.77}$$
In the fraction on the right-hand side, we may readily recognize the scale of the kinetic (quantum-confinement)
energy Ea of the particle inside a potential well of a size of the order of a, so that the Born
approximation is valid essentially if the potential energy of the particle’s interaction with the scatterer is
34 Named after M. Born, who was the first to apply this approximation in quantum mechanics. However, the
basic idea of this approach had been developed much earlier (in 1881) by Lord Rayleigh in the context of
electromagnetic wave scattering – see, e.g., EM Sec. 8.3. Note also that the contents of that section repeat some
aspects of our current discussion – perhaps regrettably but unavoidably so, because the Born approximation is a
centerpiece of the theory of scattering/diffraction for both the electromagnetic waves and the de Broglie waves.
Hence I felt I had to cover it in this course for the benefit of the readers who skipped the EM part of my series.
smaller than Ea. Note, however, that the estimates (75) and (76) are not valid in some special situations
when the effects of scattering accumulate in some direction. This is frequently the case for small angles
$\theta$ of scattering by extended objects, when ka >> 1, but $ka\theta \lesssim 1$.
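As a quick numerical reading of the criterion (77), here is a minimal sketch that evaluates the ratio of U0 to the right-hand side of that relation. The function name and the default units (ħ = m = 1) are illustrative assumptions, not part of the original notes; a ratio much smaller than 1 suggests that the estimates (75) and (76) support the Born approximation, apart from the special small-angle situations just mentioned.

```python
def born_ratio(U0, a, k, m=1.0, hbar=1.0):
    """Ratio of |U0| to (hbar^2 / m a^2) * max[ka, 1], cf. Eq. (77).

    Hypothetical helper: the Born approximation is expected to hold
    when the returned value is much smaller than 1.
    """
    energy_scale = hbar**2 / (m * a**2)  # quantum-confinement energy scale E_a
    return abs(U0) / (energy_scale * max(k * a, 1.0))


# The same weak potential at low energy (ka = 0.5) and high energy (ka = 10).
low = born_ratio(U0=0.1, a=1.0, k=0.5)
high = born_ratio(U0=0.1, a=1.0, k=10.0)
```

At a fixed U0, the ratio decreases as ka grows beyond 1, in line with the max[ka, 1] factor of Eq. (77).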
Now let us proceed to large distances r >> r’ ~ a, and simplify Eq. (74) using an approximation
similar to the dipole expansion in electrodynamics.35 Namely, in the denominator’s R, we may ignore r’
in comparison with the much larger r, but the exponents require more care, because even if r’ ~ a << r,
the product kr’ ~ ka may still be of the order of 1. In the first approximation in r’, we can take (Fig. 9a):
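$$R \equiv |\mathbf{r}-\mathbf{r}'| \approx r - \mathbf{n}_r\cdot\mathbf{r}', \tag{3.78}$$
and since the directions of the vectors k and r coincide, i.e. $\mathbf{k} = k\mathbf{n}_r$, we get
$$kR \approx kr - \mathbf{k}\cdot\mathbf{r}', \quad\text{so that } e^{ikR} \approx e^{ikr}e^{-i\mathbf{k}\cdot\mathbf{r}'}. \tag{3.79}$$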
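The quality of the expansion (78) is easy to check numerically. The following minimal sketch (the function name and the sample vectors are illustrative assumptions) compares the exact distance |r − r'| with r − n_r·r' and with the elementary bound r'²/2r on their difference.

```python
import math


def far_field_error(r_vec, rp_vec):
    """Return |r - r'| - (r - n_r . r'), i.e. the error of Eq. (78)."""
    r = math.sqrt(sum(c * c for c in r_vec))
    n_r = [c / r for c in r_vec]  # unit vector toward the detector
    exact = math.sqrt(sum((c - cp) ** 2 for c, cp in zip(r_vec, rp_vec)))
    approx = r - sum(n * cp for n, cp in zip(n_r, rp_vec))
    return exact - approx


# Detector at r = 1000 (along z), source point with |r'| = 3.
err = far_field_error((0.0, 0.0, 1000.0), (1.0, 2.0, 2.0))
bound = 3.0**2 / (2.0 * 1000.0)  # r'^2 / 2r
```

The error in R stays below r'²/2r, so the corresponding phase error in exp{ikR} is small when ka²/r << 1.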
Fig. 3.9. (a) The long-range expansion of R, and (b) the definitions of the scattering vector q and of the angles used below (in particular, the scattering angle $\theta$).
With this replacement, Eq. (74) yields

$$\psi_s(\mathbf{r}) \approx -\frac{m}{2\pi\hbar^2}\,\frac{|\psi_i|}{r}\,e^{ikr}\int U(\mathbf{r}')\exp\{-i(\mathbf{k}-\mathbf{k}_i)\cdot\mathbf{r}'\}\,d^3r'. \tag{3.80}$$
This relation is a particular case of a more general formula36
$$\psi_s = |\psi_i|\,\frac{f(\mathbf{k},\mathbf{k}_i)}{r}\,e^{ikr}, \tag{3.81}$$  [Scattering function: definition]
where f(k, ki) is called the scattering function.37 The physical sense of this function becomes clear from
the calculation of the corresponding probability current density js. For that, generally, we need to use
Eq. (1.47) with the gradient operator having all spherical-coordinate components.38 However, at kr >> 1,
the main contribution to $\nabla\psi_s$, proportional to k >> 1/r, is provided by differentiating the factor $e^{ikr}$,
which changes in the common direction of vectors r and k, so that
35 See, e.g., EM Sec. 8.2.
36 It is easy to prove that this form is an asymptotic form of any solution $\psi_s$ of the scattering problem (even that
beyond the Born approximation) at sufficiently large distances $r \gg a, k^{-1}$.
37 Note that the function f has the dimension of length, and does not account for the incident wave. This is why
sometimes a dimensionless function, S = 1 + 2ikf, is used instead. This function S is called the scattering matrix,
because it may be considered a natural generalization of the 1D matrix S defined by Eq. (2.124), to higher
dimensionality.
38 See, e.g., MA Eq. (10.8).
$$\nabla\psi_s \approx \mathbf{n}_r\frac{\partial\psi_s}{\partial r} \approx i\mathbf{k}\psi_s, \quad\text{at } kr \gg 1, \tag{3.82}$$
and Eq. (1.47) yields
$$\mathbf{j}_s(\theta) = \frac{\hbar}{m}\,|\psi_i|^2\,\frac{|f(\mathbf{k},\mathbf{k}_i)|^2}{r^2}\,\mathbf{k}. \tag{3.83}$$
Plugging this expression, and also Eq. (61) into Eq. (62), for the differential cross-section we get simply
$$\frac{d\sigma}{d\Omega} = |f(\mathbf{k},\mathbf{k}_i)|^2, \tag{3.84}$$
while the total cross-section is
$$\sigma = \oint|f(\mathbf{k},\mathbf{k}_i)|^2\,d\Omega, \tag{3.85}$$
so that the scattering function f(k, ki) gives us everything we need – and in fact more, because the
function also contains information about the phase of the scattered wave.
According to Eq. (80), in the Born approximation the scattering function is reduced to the so-
called Born integral
$$f(\mathbf{k},\mathbf{k}_i) = -\frac{m}{2\pi\hbar^2}\int U(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,d^3r, \tag{3.86}$$  [Born integral]
where for notational simplicity r’ is replaced with r, and the following scattering vector is introduced:
$$\mathbf{q} \equiv \mathbf{k} - \mathbf{k}_i, \tag{3.87}$$
with the length $q = 2k\sin(\theta/2)$, where $\theta$ is the scattering angle between the vectors k and ki – see Fig.
9b. For the differential cross-section, Eqs. (84) and (86) yield39
$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2\left|\int U(\mathbf{r})\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,d^3r\right|^2. \tag{3.88}$$  [Differential cross-section: Born approximation]
This is the main result of this section; it may be further simplified for spherically-symmetric
scatterers, with
$$U(\mathbf{r}) = U(r). \tag{3.89}$$
In this case, it is convenient to represent the exponent in the Born integral as $\exp\{-iqr\cos\vartheta\}$, where $\vartheta$ is
the angle between the vectors r and q (rather than the scattering angle $\theta$ between the vectors k and ki – see
Fig. 9b). Now, for a fixed q, we can take this vector’s direction for the polar axis
of a spherical coordinate system, and reduce Eq. (86) to a 1D integral:
$$\begin{aligned}
f(\mathbf{k},\mathbf{k}_i) &= -\frac{m}{2\pi\hbar^2}\int_0^\infty r^2dr\,U(r)\int_0^{2\pi}d\varphi\int_0^\pi\sin\vartheta\,d\vartheta\,\exp\{-iqr\cos\vartheta\} \\
&= -\frac{m}{2\pi\hbar^2}\int_0^\infty r^2dr\,U(r)\,2\pi\,\frac{2\sin qr}{qr} = -\frac{2m}{\hbar^2q}\int_0^\infty U(r)\sin(qr)\,r\,dr.
\end{aligned} \tag{3.90}$$
39 Note that according to Eq. (88), in the Born approximation the scattering intensity does not depend on the sign
of the potential U, and also that scattering in a certain direction is completely determined by a specific Fourier
component of the function U(r), namely by its harmonic with the wave vector equal to the scattering vector q.
As a simple example, let us use the Born approximation to analyze scattering by the following
spherically-symmetric potential:
$$U(\mathbf{r}) = U_0\exp\left\{-\frac{r^2}{2a^2}\right\}. \tag{3.91}$$
In this particular case, it is better to avoid the temptation to exploit the spherical symmetry by using Eq.
(90), and instead, use the general Eq. (88), because it may be represented as a product of three similar
Cartesian factors:
$$f(\mathbf{k},\mathbf{k}_i) = -\frac{mU_0}{2\pi\hbar^2}\,I_xI_yI_z, \quad\text{with } I_x \equiv \int_{-\infty}^{+\infty}\exp\left\{-\frac{x^2}{2a^2} - iq_xx\right\}dx, \tag{3.92}$$
and similar integrals for Iy and Iz. From Chapter 2, we already know that Gaussian integrals like Ix
may be readily worked out by complementing the exponent to the full square, in our current case giving
$$I_x = (2\pi)^{1/2}a\exp\left\{-\frac{q_x^2a^2}{2}\right\}, \text{ etc.,} \qquad \frac{d\sigma}{d\Omega} = \left(\frac{mU_0}{2\pi\hbar^2}\right)^2|I_xI_yI_z|^2 = 2\pi a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2e^{-q^2a^2}. \tag{3.93}$$
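As a cross-check of the two routes, the radial formula (90) may be evaluated numerically for the Gaussian potential (91) and compared with the closed form that follows from Eqs. (92) and (93), namely f = −(2π)^{1/2}(mU0a³/ħ²)exp{−q²a²/2}. This is only a minimal sketch: the function names, the midpoint-rule integration, the cutoff radius and the units ħ = m = 1 are all illustrative assumptions.

```python
import math


def born_f_radial(q, U, r_max, n=20000, m=1.0, hbar=1.0):
    """Midpoint-rule estimate of Eq. (90): f = -(2m / hbar^2 q) int_0^inf U(r) sin(qr) r dr.

    The infinite upper limit is replaced by r_max, so U(r) must be
    negligible beyond that radius.
    """
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += U(r) * math.sin(q * r) * r
    return -2.0 * m / (hbar**2 * q) * total * h


def born_f_gaussian(q, U0, a, m=1.0, hbar=1.0):
    """Closed form of f for the potential (91), following from Eqs. (92)-(93)."""
    return (-math.sqrt(2.0 * math.pi) * m * U0 * a**3 / hbar**2
            * math.exp(-((q * a) ** 2) / 2.0))


U0, a, q = 0.2, 1.0, 1.7
f_num = born_f_radial(q, lambda r: U0 * math.exp(-r * r / (2.0 * a * a)), r_max=12.0 * a)
f_ref = born_f_gaussian(q, U0, a)
```

The same `born_f_radial` helper may be applied to any other spherically-symmetric U(r) that decays fast enough.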
Now, the total cross-section $\sigma$ is an integral of $d\sigma/d\Omega$ over all directions of the vector k. Since
in our case the scattering intensity does not depend on the azimuthal angle $\varphi$, the only nontrivial
integration is over the scattering angle $\theta$ – see Fig. 9b:
$$\begin{aligned}
\sigma &= \oint\frac{d\sigma}{d\Omega}\,d\Omega = 2\pi\int_0^\pi\frac{d\sigma}{d\Omega}\sin\theta\,d\theta = 4\pi^2a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2\int_0^\pi\exp\left\{-\left(2k\sin\frac{\theta}{2}\right)^2a^2\right\}\sin\theta\,d\theta \\
&= 4\pi^2a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2\int_0^\pi\exp\{-2k^2a^2(1-\cos\theta)\}\,d(1-\cos\theta) = \frac{2\pi^2}{k^2}\left(\frac{mU_0a^2}{\hbar^2}\right)^2\left(1 - e^{-4k^2a^2}\right).
\end{aligned} \tag{3.94}$$
Let us analyze these results. In the low-energy limit, ka << 1 (and hence qa << 1 for any
scattering angle), the scattered wave is virtually isotropic: $d\sigma/d\Omega \approx \text{const}$ – a very typical feature of a
scalar-wave scattering40 by small objects, in any approximation. Note that according to Eq. (77), the
Born expression for $\sigma$, following from Eq. (94) in this limit,
$$\sigma = 8\pi^2a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2, \tag{3.95}$$
40 Note that this is only true for scalar (e.g., the de Broglie) waves, and different for vector ones, in particular the
electromagnetic waves, where the intensity of the dipole radiation, and hence the scattering by small objects
vanishes in the direction of the incident field’s polarization – see, e.g., EM Eqs. (8.26) and (8.139).
is only valid if $\sigma$ is much smaller than the scale $a^2$ of the physical cross-section of the scatterer. In the
opposite, high-energy limit ka >> 1, the scattering is dominated by small angles $\theta \approx q/k \sim 1/ka \sim \lambda/a$:
$$\frac{d\sigma}{d\Omega} \approx 2\pi a^2\left(\frac{mU_0a^2}{\hbar^2}\right)^2\exp\{-k^2a^2\theta^2\}. \tag{3.96}$$
This is, again, very typical for diffraction. Note, however, that due to the smooth character of the
Gaussian potential (91), the diffraction pattern (96) exhibits no oscillations of $d\sigma/d\Omega$ as a function of the
diffraction angle $\theta$.
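The results for the Gaussian scatterer are simple enough to verify numerically. The sketch below (function names, the midpoint rule and the units ħ = m = 1 are illustrative assumptions) integrates the differential cross-section (93) over all scattering angles, with q = 2k sin(θ/2), and compares the result with the closed form (94) and with its low-energy limit (95).

```python
import math


def dsigma_domega(theta, k, a, U0, m=1.0, hbar=1.0):
    """Eq. (93), with the scattering vector length q = 2k sin(theta/2)."""
    q = 2.0 * k * math.sin(theta / 2.0)
    return (2.0 * math.pi * a**2 * (m * U0 * a**2 / hbar**2) ** 2
            * math.exp(-((q * a) ** 2)))


def sigma_numeric(k, a, U0, n=20000):
    """sigma = 2 pi int_0^pi (dsigma/dOmega) sin(theta) dtheta, by the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += dsigma_domega(theta, k, a, U0) * math.sin(theta)
    return 2.0 * math.pi * total * h


def sigma_closed(k, a, U0, m=1.0, hbar=1.0):
    """Eq. (94); -expm1(-x) equals 1 - exp(-x) and stays accurate at small x."""
    return (2.0 * math.pi**2 / k**2 * (m * U0 * a**2 / hbar**2) ** 2
            * (-math.expm1(-4.0 * (k * a) ** 2)))


def sigma_low_energy(a, U0, m=1.0, hbar=1.0):
    """Eq. (95), the ka << 1 limit of Eq. (94)."""
    return 8.0 * math.pi**2 * a**2 * (m * U0 * a**2 / hbar**2) ** 2


s_num = sigma_numeric(k=1.5, a=1.0, U0=0.1)
s_ref = sigma_closed(k=1.5, a=1.0, U0=0.1)
```

In these units, the parameters above correspond to U0 well below ħ²/ma², i.e. to the range where Eq. (77) is expected to hold.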
Such oscillations naturally appear for scatterers with sharp borders. Indeed, let us consider a
uniform spherical scatterer, described by the potential
$$U(\mathbf{r}) = \begin{cases}U_0, & \text{for } r < R,\\ 0, & \text{otherwise.}\end{cases} \tag{3.97}$$
In this case, integration by parts of Eq. (90) readily yields
$$f(\mathbf{k},\mathbf{k}_i) = \frac{2mU_0}{\hbar^2q^3}\left(qR\cos qR - \sin qR\right), \quad\text{so that } \frac{d\sigma}{d\Omega} = \left(\frac{2mU_0}{\hbar^2q^3}\right)^2\left(qR\cos qR - \sin qR\right)^2. \tag{3.98}$$
According to this result, the scattered wave’s intensity drops very fast with q, so that one needs a semi-log
plot (such as shown in Fig. 10) to reveal small diffraction fringes,41 with the nth destructive
interference (zero-intensity) point tending to $qR = (n + 1/2)\pi$ at $n \to \infty$. Since, as Fig. 9b shows, q may
only change from 0 to 2k, these intensity minima are only observable at sufficiently large values of the
parameter kR, when they correspond to real values of the scattering angle $\theta$. (At kR >> 1, approximately
$2kR/\pi$ of these minima, i.e. “dark rings” of low scattering probability, are observable.) On the contrary, at
kR << 1 all allowed values of qR are much smaller than 1, and in this limit, the differential cross-section
does not depend on qR, i.e. the scattering by the sphere (as by any object in this limit) is isotropic.
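The positions of the intensity zeros of Eq. (98), i.e. the positive roots of qR cos qR = sin qR, are easy to find numerically. The following minimal sketch (the function names and the bisection brackets are illustrative assumptions) uses the fact that the function changes sign between qR = nπ and qR = (n + ½)π, and shows how the nth root approaches (n + ½)π from below.

```python
import math


def g(x):
    """Angular factor of Eq. (98): x cos x - sin x, with x = qR."""
    return x * math.cos(x) - math.sin(x)


def nth_zero(n, iterations=80):
    """n-th positive zero of g, by bisection on the bracket [n pi, (n + 1/2) pi]."""
    lo, hi = n * math.pi, (n + 0.5) * math.pi
    sign_lo = g(lo) > 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0.0) == sign_lo:
            lo = mid  # the root is in the upper half of the bracket
        else:
            hi = mid  # the root is in the lower half of the bracket
    return 0.5 * (lo + hi)


zeros = [nth_zero(n) for n in range(1, 6)]
# Distance of each zero from its asymptote (n + 1/2) pi.
gaps = [(n + 0.5) * math.pi - z for n, z in enumerate(zeros, start=1)]
```

Counting the zeros with qR below 2kR gives the number of observable "dark rings" for a given kR.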
Fig. 3.10. The differential cross-section of the Born scattering of a particle by a “hard” (sharp-border) sphere (97), normalized to its geometric cross-section $\sigma_g \equiv \pi R^2$ and the square of the potential’s magnitude parameter $u_0 \equiv U_0/(\hbar^2/2mR^2)$, as a function of the normalized magnitude of the scattering vector q. [Semi-log plot of $(1/u_0^2\sigma_g)\,d\sigma/d\Omega$, on a scale from $10^{-6}$ to 0.1, versus $qR/\pi$ from 0 to 5.]
This example shows that in quantum mechanics the notions of particle scattering and diffraction
are essentially inseparable.
41 Their physics is very similar to that of the Fraunhofer diffraction on a 1D scatterer – see, e.g., EM Sec. 8.4.
The Born approximation, while being very simple and used more than any other scattering
theory, is not without substantial shortcomings, as becomes clear from the following example. It is not
too difficult to prove the so-called optical theorem, strictly valid for an arbitrary scatterer:
$$\text{Im}\,f(\mathbf{k}_i,\mathbf{k}_i) = \frac{k}{4\pi}\,\sigma. \tag{3.99}$$  [Optical theorem]
However, Eq. (86) shows that in the Born approximation, the function f is purely real at q = 0 (i.e. for k
= ki), and hence cannot satisfy the optical theorem. Even more evidently, it cannot describe such a
simple effect as a dark shadow ($\psi \approx 0$) cast by a virtually opaque object (say, with U >> E). There are
several ways to improve the Born approximation, while still keeping its general idea of an approximate
treatment of U.
(i) Instead of the main assumption $\psi_s \propto U_0$, we may use a complete perturbation series:
$$\psi_s = \psi_1 + \psi_2 + \ldots \tag{3.100}$$
with $\psi_n \propto U_0^n$, and find successive approximations $\psi_n$ one by one. In the 1st approximation we return to
the Born formula, but already the 2nd approximation yields
$$\text{Im}\,f_2(\mathbf{k}_i,\mathbf{k}_i) = \frac{k}{4\pi}\,\sigma_1, \tag{3.101}$$
where $\sigma_1$ is the total cross-section calculated in the 1st approximation, so that the optical theorem (99) is
“almost satisfied”.
(ii) As was mentioned above, the Born approximation does not work very well for objects
stretching along the direction (say, x) of the initial wave vector ki. This deficiency may be corrected by
the so-called eikonal42 approximation, which replaces the plane-wave representation (60) of the incident
wave with a WKB-like exponent, though still in the 1st approximation in $U \to 0$:
$$\begin{aligned}
\exp\{ik_0x\} &\to \exp\left\{i\int_0^xk(\mathbf{r}')\,dx'\right\} \equiv \exp\left\{i\int_0^x\frac{[2m(E - U(\mathbf{r}'))]^{1/2}}{\hbar}\,dx'\right\} \\
&\approx \exp\left\{i\left[k_0x - \frac{m}{\hbar^2k_0}\int_0^xU(\mathbf{r}')\,dx'\right]\right\}.
\end{aligned} \tag{3.102}$$
Results of this approach satisfy the optical theorem (99) already in the 1st approximation.
Another way toward quantitative results in the theory of scattering, beyond the Born
approximation, may be pursued for spherically-symmetric potentials (89); I will discuss it in Sec. 8,
after a general discussion of particle motion in such potentials in Sec. 7.
3.4. Energy bands in higher dimensions

In Sec. 2.7, we have discussed the 1D band theory for potential profiles U(x) that obey the
periodicity condition (2.192). For what follows, let us notice that the condition may be rewritten as
$$U(x + X) = U(x), \tag{3.103}$$
42 From the Greek word εἰκών, meaning “image”. In our current context, this term is purely historic.
where $X = \tau a$, with $\tau$ being an arbitrary integer. One may say that the set of points X forms a periodic
1D lattice in the direct (r-) space. We have also seen that each Bloch state (i.e., each eigenstate of the
Schrödinger equation for such a periodic potential) is characterized by the quasimomentum q, and its
energy does not change if q is changed by a multiple of $2\pi/a$. Hence if we form, in the reciprocal (q-)
space, a 1D lattice of points Q = lb, with $b \equiv 2\pi/a$ and integer l, any pair of points from these two
mutually reciprocal lattices satisfies the following rule:
$$\exp\{iQX\} \equiv \exp\left\{il\,\frac{2\pi}{a}\,\tau a\right\} \equiv e^{2\pi il\tau} = 1. \tag{3.104}$$
In this form, the results of Sec. 2.7 may be readily extended to d-dimensional periodic potentials
whose translational symmetry obeys the following natural generalization of Eq. (103):
$$U(\mathbf{r} + \mathbf{R}) = U(\mathbf{r}), \tag{3.105}$$
where the points R, which may be numbered by d integers $\tau_j$, form the so-called Bravais lattice:43
$$\mathbf{R} = \sum_{j=1}^d\tau_j\mathbf{a}_j, \tag{3.106}$$  [Bravais lattice]
with d primitive vectors aj. The simplest example of a 3D Bravais lattice is given by the simple cubic
lattice (Fig. 11a), which may be described by a system of mutually perpendicular primitive vectors aj of
equal length. However, these vectors are not perpendicular in every lattice; for example, Figs. 11b and 11c
show possible sets of the primitive vectors describing, respectively, the face-centered cubic (fcc) lattice
and the body-centered cubic (bcc) lattice. In 3D, the science of crystallography, based on group theory,
distinguishes, by their symmetry properties, 14 Bravais lattices grouped into 7 different lattice
systems.44
Fig. 3.11. The simplest (and most common) 3D Bravais lattices: (a) simple cubic, (b) face-centered cubic
(fcc), and (c) body-centered cubic (bcc), and possible choices of their primitive vector sets (blue arrows).
Note, however, that not all highly symmetric sets of points form Bravais lattices. As probably the
most striking example, the nodes of a very simple 2D honeycomb lattice (Fig. 12a)45 cannot be
43 Named after Auguste Bravais, the crystallographer who introduced this notion in 1850.
44 A very clear, well-illustrated introduction to the Bravais lattices is given in Chapters 4 and 7 of the famous
textbook by N. Ashcroft and N. Mermin, Solid State Physics, Saunders College, 1976.
45 This structure describes, for example, the now-famous graphene: isolated monolayer sheets of carbon atoms
arranged in a honeycomb lattice with an interatomic distance of 0.142 nm.
described by a Bravais lattice – while those of the 2D hexagonal lattice shown in Fig. 12b, can. The
most prominent 3D case of such a non-Bravais structure is the diamond structure (Fig. 12c), which describes, in
particular, silicon crystals.46 In cases like these, the band theory is much facilitated by the fact that
these systems may be described by Bravais lattices with certain groups of points (called primitive unit cells) attached to each lattice site.47 For
example, Fig. 12a shows a possible choice of the primitive vectors for the honeycomb lattice, with the
primitive unit cell formed by any two adjacent points of the original lattice (say, within the dashed ovals
on that panel). Similarly, the diamond lattice may be described as an fcc Bravais lattice with a two-point
primitive unit cell – see Fig. 12c.
Fig. 3.12. Two important periodic structures that require two-point primitive cells for their Bravais lattice
representation: (a) 2D honeycomb lattice and (c) 3D diamond lattice, and their primitive vectors. For contrast,
panel (b) shows the 2D hexagonal structure which forms a Bravais lattice with a single-point primitive cell.
Now we are ready for the following generalization of the 1D Bloch theorem, given by Eqs.
(2.193) and (2.210), to higher dimensions: any eigenfunction of the Schrödinger equation describing
a particle’s motion in the spatially-unlimited periodic potential (105) may be represented either as
$$\psi(\mathbf{r} + \mathbf{R}) = \psi(\mathbf{r})\,e^{i\mathbf{q}\cdot\mathbf{R}}, \tag{3.107}$$
or as
$$\psi(\mathbf{r}) = u(\mathbf{r})\,e^{i\mathbf{q}\cdot\mathbf{r}}, \quad\text{with } u(\mathbf{r} + \mathbf{R}) = u(\mathbf{r}), \tag{3.108}$$  [3D Bloch theorem]
where the quasimomentum q is again a constant of motion, but now it is a vector. The key notion of the
band theory in d dimensions is the reciprocal lattice in the wave-vector (q) space, formed as
$$\mathbf{Q} = \sum_{j=1}^dl_j\mathbf{b}_j, \tag{3.109}$$  [Reciprocal lattice in q-space]
with integer lj, and vectors bj selected in such a way that the following natural generalization of Eq.
(104) is valid for any pair of points of the direct and reciprocal lattices:
46 This diamond structure may be best understood as an overlap of two fcc lattices of side a, mutually shifted by
the vector {1, 1, 1}a/4, so that the distances between each point of the combined lattice and its 4 nearest
neighbors (see the solid gray lines in Fig. 12c) are all equal.
47 A harder case is presented by so-called quasicrystals (whose idea may be traced down to medieval Islamic
tilings, but was discovered in natural crystals, by D. Shechtman et al., only in 1984), which obey a high (say, the
5-fold) rotational symmetry, but cannot be described by a Bravais lattice with any finite primitive unit cell. For a
popular review of quasicrystals see, for example, P. Stephens and A. Goldman, Sci. Amer. 264, #4, 24 (1991).
$$e^{i\mathbf{Q}\cdot\mathbf{R}} = 1. \tag{3.110}$$
One way to describe the physical sense of the lattice Q is to say that according to Eqs. (80)
and/or (86), it gives the set of the vectors $\mathbf{q} \equiv \mathbf{k} - \mathbf{k}_i$ for which the interference of the waves scattered by all
Bravais lattice points is constructive, and hence strongly enhanced.48 Another way to look at the
reciprocal lattice follows from the first formulation of the Bloch theorem, given by Eq. (107): if we add
to the quasimomentum q of a particle any vector Q of the reciprocal lattice, the wavefunction does not
change. This means, in particular, that all information about the system’s eigenfunctions is contained in
just one elementary cell of the reciprocal space q. Its most frequent choice, called the 1st Brillouin zone,
is the set of all points q that are closer to the origin than to any other point of the lattice Q. (Evidently,
the 1st Brillouin zone in one dimension, discussed in Sec. 2.7, falls under this definition – see, e.g., Figs.
2.26 and 2.28.)
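For the simplest geometry, the reduction of an arbitrary quasimomentum to the 1st Brillouin zone takes just one line of code. The sketch below assumes a simple cubic lattice of period a, for which the reciprocal lattice is also cubic, with the period b = 2π/a, so that each Cartesian component of q may be folded independently; for other lattices, a search for the nearest point Q would be needed instead. The function name is hypothetical.

```python
import math


def to_first_bz(q, a=1.0):
    """Fold each Cartesian component of q into the interval [-pi/a, +pi/a).

    Valid only for a simple cubic (or square, or 1D) lattice of period a,
    whose 1st Brillouin zone is a cube with the side b = 2 pi / a.
    """
    b = 2.0 * math.pi / a
    return tuple(c - b * math.floor(c / b + 0.5) for c in q)


b = 2.0 * math.pi
q_folded = to_first_bz((0.75 * b, -0.6 * b, 0.2 * b))  # expected (-0.25 b, 0.4 b, 0.2 b)
```

By the Bloch theorem (107), the folded vector describes the same wavefunction as the original one.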
It is easy to see that the primitive vectors bj of the reciprocal lattice may be constructed as
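$$\mathbf{b}_1 = 2\pi\,\frac{\mathbf{a}_2\times\mathbf{a}_3}{\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}, \quad \mathbf{b}_2 = 2\pi\,\frac{\mathbf{a}_3\times\mathbf{a}_1}{\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}, \quad \mathbf{b}_3 = 2\pi\,\frac{\mathbf{a}_1\times\mathbf{a}_2}{\mathbf{a}_1\cdot(\mathbf{a}_2\times\mathbf{a}_3)}. \tag{3.111}$$  [Reciprocal lattice: primitive vectors]
Indeed, from the “operand rotation rule” of the vector algebra49 it is evident that $\mathbf{a}_j\cdot\mathbf{b}_{j'} = 2\pi\delta_{jj'}$. Hence,
with the account of Eq. (109), the exponent on the left-hand side of Eq. (110) is reduced to
$$e^{i\mathbf{Q}\cdot\mathbf{R}} = \exp\{2\pi i\,(l_1\tau_1 + l_2\tau_2 + l_3\tau_3)\}. \tag{3.112}$$
Since all $l_j$ and all $\tau_j$ are integers, the expression in the parentheses is also an integer, so that the
exponent indeed equals 1, thus satisfying the definition of the reciprocal lattice given by Eq. (110).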
As the simplest example, let us return to the simple cubic lattice of a period a (Fig. 11a), oriented
in space so that
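$$\mathbf{a}_1 = a\mathbf{n}_x, \quad \mathbf{a}_2 = a\mathbf{n}_y, \quad \mathbf{a}_3 = a\mathbf{n}_z. \tag{3.113}$$
According to Eq. (111), its reciprocal lattice is also cubic:
$$\mathbf{Q} = \frac{2\pi}{a}\left(l_x\mathbf{n}_x + l_y\mathbf{n}_y + l_z\mathbf{n}_z\right), \tag{3.114}$$
so that the 1st Brillouin zone is a cube with the side $b = 2\pi/a$.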
Almost equally simple calculations show that the reciprocal lattice of fcc is bcc, and vice versa.
Figure 13 shows the resulting 1st Brillouin zone of the fcc lattice.
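The construction (111) is straightforward to code. The following minimal sketch (plain tuples are used for vectors, and the particular choice of the fcc primitive vectors is an assumption, one of several possible) computes the reciprocal primitive vectors, verifies the relation a_j·b_j' = 2πδ_jj', and shows that for the fcc lattice the result is a set of bcc-type vectors of the form (2π/a){±1, ±1, ±1}.

```python
import math


def cross(u, v):
    """Vector product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(u, v))


def reciprocal_vectors(a1, a2, a3):
    """Primitive vectors b1, b2, b3 of the reciprocal lattice, per Eq. (111)."""
    volume = dot(a1, cross(a2, a3))  # a1 . (a2 x a3)
    scale = 2.0 * math.pi / volume
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3


a = 1.0  # side of the conventional cubic cell
fcc = ((0.0, a / 2, a / 2), (a / 2, 0.0, a / 2), (a / 2, a / 2, 0.0))
b_vectors = reciprocal_vectors(*fcc)
# Matrix of products a_j . b_j', expected to be 2 pi times the unit matrix.
products = [[dot(aj, bj) for bj in b_vectors] for aj in fcc]
```

Applying `reciprocal_vectors` to the resulting b-vectors returns the initial fcc set, since the two lattices are mutually reciprocal.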
The notion of the reciprocal lattice makes the multi-dimensional band theory not much more
complex than that in 1D, especially for numerical calculations, at least for the single-point Bravais
lattices. Indeed, repeating all the steps that have led us to Eq. (2.218), but now with a d-dimensional
Fourier expansion of the functions U(r) and u(r), we readily get its generalization:
48 This is why the notion of the Q-lattice is also the main starting point of X-ray diffraction studies of crystals.
Indeed, it allows rewriting the well-known Bragg condition for diffraction peaks in an extremely simple form: k =
ki + Q, where ki and k are the wave vectors of the, respectively, incident and diffracted waves – see, e.g., EM Sec.
8.4 (where it was more convenient for me to use the notation k0 for ki ).
49 See, e.g., MA Eq. (7.6).
$$\sum_{\mathbf{l}'\neq\mathbf{l}}U_{\mathbf{l}'-\mathbf{l}}\,u_{\mathbf{l}'} = (E - E_{\mathbf{l}})\,u_{\mathbf{l}}, \tag{3.115}$$
where l is now a d-dimensional vector of integer indices lj. The summation in Eq. (115) should be
carried over all essential components of this vector (i.e. over all relevant nodes of the reciprocal lattice),
so that writing a corresponding computer code requires a bit more care than in 1D. However, this is just
a homogeneous system of linear equations, and numerous routines of finding its eigenvalues E are
readily available from both public sources and commercial software packages.
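To make the last remark more concrete, here is a minimal, self-contained sketch of such a calculation for a 2D square lattice. Everything specific in it is an assumption made for illustration: the model potential U = 2U1[cos(2πx/a) + cos(2πy/a)], whose only non-zero Fourier coefficients are U_{±1,0} = U_{0,±1} = U1, the truncation of the index vector to |lx|, |ly| ≤ 2, the units ħ = m = a = 1, the free-particle energies E_l taken in the form of Eq. (116), and the use of a textbook cyclic Jacobi routine in place of a library eigenvalue solver.

```python
import math


def jacobi_eigenvalues(A, sweeps=50, tol=1e-20):
    """Eigenvalues of a real symmetric matrix, by cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                diff = A[q][q] - A[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))


def bloch_bands(qx, qy, U1, a=1.0, lmax=2, hbar=1.0, m=1.0):
    """Sorted eigenvalues E of the system (115) for the assumed model potential."""
    b = 2.0 * math.pi / a
    ls = [(lx, ly) for lx in range(-lmax, lmax + 1) for ly in range(-lmax, lmax + 1)]
    H = [[0.0] * len(ls) for _ in ls]
    for i, (lx, ly) in enumerate(ls):
        # Diagonal: free-particle energy E_l of the l-th plane wave.
        H[i][i] = hbar**2 / (2.0 * m) * ((qx - b * lx) ** 2 + (qy - b * ly) ** 2)
        for j, (mx, my) in enumerate(ls):
            if abs(mx - lx) + abs(my - ly) == 1:  # U_{l'-l} = U1 for adjacent nodes only
                H[i][j] = U1
    return jacobi_eigenvalues(H)


U1 = 0.05
E_X = bloch_bands(math.pi, 0.0, U1)  # middle of a side of the 1st Brillouin zone
gap = E_X[1] - E_X[0]                # close to 2 U1 for a weak potential
```

At this point of the zone boundary, the two lowest levels are split by approximately 2U1, just as in the 1D weak-potential limit, while at U1 = 0 the routine returns the free-particle energies.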
Fig. 3.13. The 1st Brillouin zone of the fcc lattice, and the traditional notation of its main directions. Adapted from
http://en.wikipedia.org/wiki/Band_structure, as a public domain material.
What is indeed more complex than in 1D is the representation (and hence comprehension :-) of the
calculated results and experimental data. Typically, the representation is limited to plotting the Bloch
state eigenenergy as a function of components of the vector q along certain special directions in the
reciprocal space of quasimomentum (see, e.g., the red lines in Fig. 13), typically on a single panel. Fig.
14 shows perhaps the most famous (and certainly the most practically important) of such plots, the band
structure of electrons in crystalline silicon. The dashed horizontal lines mark the so-called indirect gap
of the width ~1.12 eV between the “valence” (nominally occupied) and the next “conduction”
(nominally unoccupied) energy bands.
Fig. 3.14. The band structure of silicon, plotted along

the special directions shown in Fig. 13. (Adapted from
http://www.tf.uni-kiel.de/matwis/amat/semi_en/.)
In order to understand the reason for such complexity, let us see how we would start to calculate
such a picture in the weak-potential approximation, for the simplest case of a 2D square lattice – which
is a subset of the cubic lattice (106), with $\tau_3 = 0$. Its 1st Brillouin zone is of course also a square, of the
area $(2\pi/a)^2$ – see the dashed lines in Fig. 15. Let us draw the lines of the constant energy of a free
particle (U = 0) in this zone. Repeating the arguments of Sec. 2.7 (see especially Fig. 2.28 and its
discussion), we may conclude that Eq. (2.216) should be now generalized as follows,
$$E = \frac{\hbar^2k^2}{2m} = \frac{\hbar^2}{2m}\left[\left(q_x - \frac{2\pi l_x}{a}\right)^2 + \left(q_y - \frac{2\pi l_y}{a}\right)^2\right], \tag{3.116}$$
with all possible integers lx and ly. Considering this result only within the 1st Brillouin zone, we see that
as the particle’s energy E grows, the lines of equal energy, for the lowest energy band, evolve as shown
in Fig. 15. Just like in 1D, the weak-potential effects are only important at the Brillouin zone
boundaries, and may be crudely considered as the appearance of narrow energy gaps, but one can see
that the band structure in q-space is complex enough even without these effects – and becomes even
more involved at higher E.
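The free-particle levels (116) at any point of the 1st Brillouin zone may be listed with a few lines of code. In the following minimal sketch, the function name, the cutoff |lx|, |ly| ≤ 2 and the units ħ = m = a = 1 are illustrative assumptions; in these units the energy scale at the middle of a zone side is E1 = π²/2.

```python
import math


def free_particle_levels(qx, qy, a=1.0, lmax=2, n_levels=4, hbar=1.0, m=1.0):
    """Lowest values of Eq. (116) over the integers |lx|, |ly| <= lmax, sorted."""
    b = 2.0 * math.pi / a
    levels = sorted(
        hbar**2 / (2.0 * m) * ((qx - b * lx) ** 2 + (qy - b * ly) ** 2)
        for lx in range(-lmax, lmax + 1)
        for ly in range(-lmax, lmax + 1)
    )
    return levels[:n_levels]


E1 = math.pi**2 / 2.0                                # (pi hbar)^2 / 2 m a^2 in these units
at_edge = free_particle_levels(math.pi, 0.0)         # middle of a side of the zone
at_corner = free_particle_levels(math.pi, math.pi)   # corner of the zone
```

The two lowest levels coincide (at E1) in the middle of a zone side, and the four lowest levels coincide (at 2E1) at a zone corner; these are exactly the degeneracies that a weak periodic potential lifts.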
Fig. 3.15. The lines of constant energy E of a free particle, within the 1st Brillouin zone of a square
Bravais lattice, for: (a) $E/E_1 \approx 0.95$, (b) $E/E_1 \approx 1.05$; and (c) $E/E_1 \approx 2.05$, where $E_1 \equiv \pi^2\hbar^2/2ma^2$. [Three panels in the $(q_x, q_y)$ plane, each spanning the zone of side $2\pi/a$.]
The tight-binding approximation is usually easier to follow. For example, for the same square 2D
lattice, we may repeat the arguments that have led us to Eq. (2.203), to write 50
$$i\hbar\dot{a}_{0,0} = -\delta_n\left(a_{-1,0} + a_{+1,0} + a_{0,+1} + a_{0,-1}\right), \tag{3.117}$$
where the indices correspond to the deviations of the integers $\tau_x$ and $\tau_y$ from an arbitrarily selected
minimum of the potential energy – and hence of the wavefunction’s “hump”, quasi-localized at this
minimum. Now, looking for the stationary solution of these equations that would obey the Bloch
theorem (107), instead of Eq. (2.206) we get
$$E = E_n + \varepsilon_n = E_n - \delta_n\left(e^{iq_xa} + e^{-iq_xa} + e^{iq_ya} + e^{-iq_ya}\right) \equiv E_n - 2\delta_n\left(\cos q_xa + \cos q_ya\right). \tag{3.118}$$
Figure 16 shows this result, within the 1st Brillouin zone, in two forms: as color-coded lines of
equal energy, and as a 3D plot (also enhanced by color). It is evident that the plots of this function along
different lines on the q-plane, for example along one of the axes (say, qx) and along a diagonal of the 1st
Brillouin zone (say, with qx = qy) give different curves E(q), qualitatively similar to those of silicon (Fig.
50 Actually, using the same values of $\delta_n$ in both directions (x and y) implies some sort of symmetry of the
quasi-localized states. For example, the s-states of axially-symmetric potentials (see the next section) always have such
symmetry.
14). However, the latter structure is further complicated by the fact that the primitive cell of its Bravais
lattice contains 2 atoms – see Fig. 12c and its discussion. In this case, even the tight-binding picture
becomes more complex. Indeed, even if the atoms at different positions of the primitive unit cell are
similar (as they are, for example, in both graphene and silicon), and hence the potential wells near those
points and the corresponding local wavefunctions u(r) are similar as well, the Bloch theorem (which
only pertains to Bravais lattices!) does not forbid them to have different complex probability amplitudes
a(t) whose time evolution should be described by a specific differential equation.
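Returning for a moment to the single-point square lattice, the different cuts of the function (118) mentioned above, along an axis and along a diagonal of the 1st Brillouin zone, may be tabulated as follows. This is a minimal sketch; the function name and the units δn = a = 1 are illustrative assumptions.

```python
import math


def tight_binding_energy(qx, qy, delta_n=1.0, a=1.0):
    """Band energy epsilon_n = E - E_n of Eq. (118)."""
    return -2.0 * delta_n * (math.cos(qx * a) + math.cos(qy * a))


n = 5
# From the zone center to the middle of a zone side (qy = 0).
along_axis = [tight_binding_energy(math.pi * i / n, 0.0) for i in range(n + 1)]
# From the zone center to a zone corner (qx = qy).
along_diagonal = [tight_binding_energy(math.pi * i / n, math.pi * i / n) for i in range(n + 1)]
```

Along the axis, the energy changes from −4δn to 0, while along the diagonal it spans the full band, from −4δn to +4δn.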
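Fig. 3.16. The allowed band energy $\varepsilon_n \equiv E - E_n$ for a square 2D lattice, in the tight-binding
approximation. [Left panel: lines of equal energy in the $(q_x, q_y)$ plane, within the 1st Brillouin zone of side $2\pi/a$; right panel: a 3D plot of $\varepsilon_n$, ranging from $-4\delta_n$ to $+4\delta_n$.]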
As the simplest example, to describe the honeycomb lattice shown in Fig. 12a, we have to
prescribe different probability amplitudes to the “top” and “bottom” points of its primitive cell – say, $\alpha$
and $\beta$, correspondingly. Since each of these points is surrounded by (and hence weakly interacts with) three
neighbors of the opposite type, instead of Eq. (117) we have to write two equations:
$$i\hbar\dot{\alpha} = -\delta_n\sum_{j=1}^3\beta_j, \qquad i\hbar\dot{\beta} = -\delta_n\sum_{j'=1}^3\alpha_{j'}, \tag{3.119}$$
where each summation is over three next-neighbor points. (In these two sums, I am using different
summation indices just to emphasize that these directions are different for the “top” and “bottom” points
of the primitive cell – see Fig. 12a.) Now using the Bloch theorem (107) in the form similar to Eq.
(2.205), we get two coupled systems of linear algebraic equations:
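$$(E - E_n)\,\alpha = -\delta_n\beta\sum_{j=1}^3e^{i\mathbf{q}\cdot\mathbf{r}_j}, \qquad (E - E_n)\,\beta = -\delta_n\alpha\sum_{j'=1}^3e^{i\mathbf{q}\cdot\mathbf{r}'_{j'}}, \tag{3.120}$$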
where rj and r’j’ are the next-neighbor positions, as seen from the top and bottom points, respectively.
Writing the condition of consistency of this system of homogeneous linear equations, we get two equal
and opposite values for energy correction for each value of q:
$$E_\pm = E_n \pm \delta_n\Sigma^{1/2}, \quad\text{where } \Sigma \equiv \sum_{j,j'=1}^3e^{i\mathbf{q}\cdot(\mathbf{r}_j + \mathbf{r}'_{j'})}. \tag{3.121}$$
According to Eq. (120), these two energy bands correspond to the phase shifts (on top of the regular
Bloch shift $\mathbf{q}\cdot\mathbf{r}$) of either 0 or $\pi$ between the adjacent quasi-localized wavefunctions u(r).
The most interesting corollary of such energy symmetry, augmented by the honeycomb lattice’s
symmetry, is that for certain values qD of the vector q (that turn out to be in each of six corners of the
honeycomb-shaped 1st Brillouin zone), the double sum $\Sigma$ vanishes, i.e. the two band surfaces $E_\pm(\mathbf{q})$
touch each other. As a result, in the vicinities of these so-called Dirac points,51 the dispersion relation is
linear:
$$E_\pm\big|_{\mathbf{q}\approx\mathbf{q}_D} \approx E_n \pm \hbar v_n|\tilde{\mathbf{q}}|, \quad\text{where } \tilde{\mathbf{q}} \equiv \mathbf{q} - \mathbf{q}_D, \tag{3.122}$$
with $v_n \propto \delta_n$ being a constant with the dimension of velocity – for graphene, close to $10^6$ m/s. Such a
linear dispersion relation ensures several interesting transport properties of graphene, in particular of the
quantum Hall effect in it – as was already mentioned in Sec. 2. For their more detailed discussion, I have
to refer the reader to special literature.52
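The vanishing of the double sum in Eq. (121) at the zone corners, and the linear dispersion (122) near them, may be checked with a short calculation. The sketch below rests on several assumptions of my own: a particular orientation of the three next-neighbor vectors rj, the relation r'j' = −rj between the two sets (so that the double sum reduces to |S|², with S being the single sum over j), and the units δn = 1 with the next-neighbor distance D = 1.

```python
import cmath
import math

D = 1.0  # next-neighbor distance (about 0.142 nm for graphene), taken as the unit length

# Assumed orientation: positions of the three neighbors of a "top" point.
NEIGHBORS = ((0.0, D),
             (math.sqrt(3) / 2 * D, -D / 2),
             (-math.sqrt(3) / 2 * D, -D / 2))


def neighbor_sum(qx, qy):
    """S(q) = sum_j exp{i q . r_j}; with r'_j' = -r_j, the double sum of Eq. (121) is |S|^2."""
    return sum(cmath.exp(1j * (qx * x + qy * y)) for x, y in NEIGHBORS)


def band_energies(qx, qy, delta_n=1.0, E_n=0.0):
    """The two energies E_n -/+ delta_n |S(q)| of Eq. (121)."""
    s = abs(neighbor_sum(qx, qy))
    return E_n - delta_n * s, E_n + delta_n * s


Q_D = (4.0 * math.pi / (3.0 * math.sqrt(3) * D), 0.0)  # one of the zone corners
eps = 1e-4
# |S| / |q - q_D| near the Dirac point; for this geometry it approaches 3 D / 2.
slope = abs(neighbor_sum(Q_D[0] + eps, Q_D[1])) / eps
```

For this geometry the slope tends to 3D/2, so that in Eq. (122) the constant ħvn equals (3/2)δnD.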
3.5. Axially-symmetric systems

I cannot conclude this chapter (and hence our review of wave mechanics) without addressing the
exact solutions of the stationary Schrödinger equation53 possible in the cases of highly symmetric
functions U(r). Such solutions are very important, in particular, for atomic and nuclear physics, and will
be used in the later chapters of this course.
In some rare cases, such symmetries may be exploited by the separation of variables in Cartesian
coordinates. The most famous (and rather important) example is the d-dimensional harmonic oscillator –
a particle moving inside the potential
$$U = \frac{m\omega_0^2}{2}\sum_{j=1}^dr_j^2. \tag{3.123}$$
Separating the variables exactly as we did in Sec. 1.7 for the rectangular hard-wall box (1.77), for each
degree of freedom we get the Schrödinger equation (2.261) of a 1D oscillator, whose eigenfunctions are
51 This term is based on a (rather indirect) analogy with the Dirac theory of relativistic quantum mechanics, to be
discussed in Chapter 9 below.
52 See, e.g., the reviews by A. Castro Neto et al., Rev. Mod. Phys. 81, 109 (2009) and by X. Lu et al., Appl. Phys.
Rev. 4, 021306 (2017). Note that the transport properties of graphene are determined by coupling of 2p-state
electrons of its carbon atoms (see Secs. 6 and 7 below), whose wavefunctions are proportional to exp{±iφ} rather
than being axially-symmetric, as implied by Eqs. (120). However, due to the lattice symmetry, this fact does not
affect the above dispersion relation E(q).
53 This is my best chance to mention, in passing, that the eigenfunctions ψn(r) of any such problem do not feature
the instabilities typical for the deterministic chaos effects of classical mechanics – see, e.g., CM Chapter 9. (This
is why the term quantum mechanics of classically chaotic systems is preferable to the occasionally used term
“quantum chaos”.) It is curious that at the initial stages of the time evolution of the wavefunctions of such
systems, certain correlation functions of theirs still grow exponentially, resembling the behavior set by the Lyapunov
exponents λ of their classical chaotic dynamics. This growth stops at the so-called Ehrenfest times tE ~ λ^-1 ln(S/ħ),
where S is the action scale of the problem – see, e.g., I. Aleiner and A. Larkin, Phys. Rev. E 55, R1243 (1997). In a stationary quantum
state, the most essential trace of the classical chaos in a system is an unusual statistics of its eigenvalues, in
particular of the energy spectra. We will have a chance for a brief look at such statistics in Chapter 5, but
unfortunately, I will not have time/space to discuss this field in much detail. Perhaps the best available book for
further reading is the monograph by M. Gutzwiller, Chaos in Classical and Quantum Mechanics, Springer, 1991.
given by Eq. (2.284), and the energy spectrum is described by Eq. (2.162). As a result, the total energy
spectrum may be indexed by a vector n = {n1, n2,…, nd} of d independent integer quantum numbers nj:
$$E_{\mathbf{n}} = \hbar\omega_0\left(\sum_{j=1}^{d} n_j + \frac{d}{2}\right), \tag{3.124}$$
each ranging from 0 to ∞. Note that every energy level of this system, with the only exception of its
ground state,
$$\psi_{\rm g} = \prod_{j=1}^{d}\psi_0(r_j) = \frac{1}{\pi^{d/4}x_0^{d/2}}\exp\left\{-\frac{1}{2x_0^2}\sum_{j=1}^{d} r_j^2\right\}, \tag{3.125}$$
is degenerate: several different wavefunctions, each with its own different set of quantum numbers nj,
but the same value of their sum, have the same energy.
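The degeneracy described above is easy to count explicitly. The sketch below enumerates all sets {n1, …, nd} with a given sum, and compares the count with the standard "stars and bars" combinatorial formula. That formula and both function names are additions made for this illustration; they do not appear in the text above.

```python
from itertools import product
from math import comb


def degeneracy_by_counting(total, d):
    """Count sets {n_1, ..., n_d} of non-negative integers whose sum is `total`."""
    return sum(
        1
        for n in product(range(total + 1), repeat=d)
        if sum(n) == total
    )


def degeneracy_closed_form(total, d):
    """Stars-and-bars count of the same sets: C(total + d - 1, d - 1)."""
    return comb(total + d - 1, d - 1)


if __name__ == "__main__":
    for d in (1, 2, 3):
        counts = [degeneracy_by_counting(total, d) for total in range(5)]
        print("d =", d, "degeneracies of the lowest levels:", counts)
```

According to Eq. (124), all states with the same sum of nj share the energy ħω0(Σnj + d/2), so that these counts are the level degeneracies. For d = 1 every count equals 1, i.e. the degeneracy appears only for d ≥ 2.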
However, the harmonic oscillator problem is an exception: for other central- and spherically-
symmetric problems the solution is made easier by using more appropriate curvilinear coordinates. Let
us start with the simplest axially-symmetric problem: the so-called planar rigid rotator (or “rotor”), i.e.
a particle of mass m,54 constrained to move along a plane circle of radius R (Fig. 17).55
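Fig. 3.17. A planar rigid rotator. (The sketch shows the particle of mass m on a circle of radius R centered at point 0, the angle φ, the arc displacement l ≡ Rφ, and the azimuthal unit vector nφ.)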
The classical planar rotator may be described by just one degree of freedom, say the angle
displacement φ (or equivalently the arc displacement l ≡ Rφ) from some reference direction, with the
energy (and the Hamiltonian function) H = p²/2m, where p ≡ mv = m nφ(dl/dt), nφ being the unit vector
in the azimuthal direction – see Fig. 17. This function is similar to that of a free 1D particle (with the
replacement x → l ≡ Rφ), and hence the rotator’s quantum properties may be described by a similar
Hamiltonian operator:
$$\hat{H} = \frac{\hat{p}^2}{2\mathsf{m}}, \quad \text{with } \hat{\mathbf{p}} = -i\hbar\,\mathbf{n}_\varphi\frac{\partial}{\partial l} \equiv -\frac{i\hbar}{R}\,\mathbf{n}_\varphi\frac{\partial}{\partial\varphi}, \tag{3.126}$$
whose eigenfunctions have a similar structure:
$$\psi = Ce^{ikl} \equiv Ce^{ikR\varphi}. \tag{3.127}$$
54 From this point on (until the chapter’s end), I will use this exotic font for the particle’s mass, to avoid any
chance of its confusion with the impending “magnetic” quantum number m, traditionally used in axially-
symmetric problems.
55 This is a reasonable model for the confinement of light atoms, notably hydrogen, in some organic compounds,
but I am addressing this system mostly as the basis for the following, more complex problems.
The “only” new feature is that in the rotator, all observables should be 2π-periodic functions of
the angle φ. Hence, as we have already discussed in the context of the magnetic flux quantization (see
Fig. 4 and its discussion), as the particle makes one turn around the central point 0, its wavefunction’s
phase kRφ may only change by 2πm, with an arbitrary integer m (ranging from –∞ to +∞):
$$\psi_m(\varphi + 2\pi) = \psi_m(\varphi)\,e^{2\pi im}. \tag{3.128}$$
With the eigenfunctions (127), this periodicity condition immediately gives 2πkR = 2πm. Thus, the wave
number k can take only quantized values km = m/R, so that the eigenfunctions should be indexed by this
magnetic quantum number m:
$$\psi_m = C_m\exp\left\{im\frac{l}{R}\right\} \equiv C_m\exp\{im\varphi\}, \tag{3.129}$$
[Planar rotator: eigenfunctions]
and the energy spectrum is discrete:
$$E_m = \frac{p_m^2}{2\mathsf{m}} = \frac{\hbar^2k_m^2}{2\mathsf{m}} = \frac{\hbar^2m^2}{2\mathsf{m}R^2}. \tag{3.130}$$
[Planar rotator: eigenenergies]
This simple model allows exact analysis of the external magnetic field effects on a confined
motion of an electrically charged particle. Indeed, in the simplest case when this field is axially
symmetric (or just uniform) and directed normally to the rotator’s plane, it does not violate the axial
symmetry of the system. According to Eq. (26), in this case, we have to generalize Eq. (126) as
$$\hat{H} = \frac{1}{2\mathsf{m}}\left(-i\hbar\,\mathbf{n}_\varphi\frac{\partial}{\partial l} - q\mathbf{A}\right)^2 \equiv \frac{1}{2\mathsf{m}}\left(-\frac{i\hbar}{R}\,\mathbf{n}_\varphi\frac{\partial}{\partial\varphi} - q\mathbf{A}\right)^2. \tag{3.131}$$
Here, in contrast to the Cartesian gauge choice (44), which was so instrumental for the solution of the
Landau level problem, it is beneficial to take the vector potential in the axially-symmetric form A =
A(ρ)nφ, where ρ ≡ {x, y} is the 2D radius-vector, with the magnitude ρ = (x² + y²)^1/2. Using the well-known
expression for the curl operator in the cylindrical coordinates,56 we can readily check that the
requirement ∇×A = Bnz, with B = const, is satisfied by the following function:
$$\mathbf{A} = \mathbf{n}_\varphi\frac{B\rho}{2}. \tag{3.132}$$
For the planar rotator, ρ = R = const, so that the stationary Schrödinger equation becomes
$$\frac{1}{2\mathsf{m}}\left(-\frac{i\hbar}{R}\frac{\partial}{\partial\varphi} - q\frac{BR}{2}\right)^2\psi_m = E_m\psi_m. \tag{3.133}$$
A little bit surprisingly, this equation is still satisfied with the eigenfunctions (127). Moreover,
since the periodicity condition (128) is also unaffected by the applied magnetic field, we return to the
periodic eigenfunctions (129), independent of B. However, the field does affect the system’s
eigenenergies:
56 See, e.g., MA Eq. (10.5).
$$E_m = \frac{1}{\psi_m}\frac{1}{2\mathsf{m}}\left(-\frac{i\hbar}{R}\frac{\partial}{\partial\varphi} - q\frac{BR}{2}\right)^2\psi_m = \frac{1}{2\mathsf{m}}\left(\frac{\hbar m}{R} - q\frac{BR}{2}\right)^2 = \frac{\hbar^2}{2\mathsf{m}R^2}\left(m - \frac{\Phi}{\Phi_0'}\right)^2, \tag{3.134}$$
[Planar rotator: magnetic field’s effect]
where Φ ≡ πR²B is the magnetic flux through the area limited by the particle’s trajectory, and Φ0’ ≡
2πħ/q is the “normal” magnetic flux quantum we have already met in the AB effect’s context – see Eq.
(34) and its discussion. The field also changes the electric current of the particle in each eigenstate:
$$I_m = q\frac{\hbar}{2i\mathsf{m}R}\left[\psi_m^*\left(\frac{\partial}{\partial\varphi} - \frac{iqR^2B}{2\hbar}\right)\psi_m - \text{c.c.}\right] = q\frac{\hbar}{\mathsf{m}R}|C_m|^2\left(m - \frac{\Phi}{\Phi_0'}\right). \tag{3.135}$$
Normalizing the wavefunction (129) to have Wm = 1, we get |Cm|² = 1/2πR, so that Eq. (135) becomes
$$I_m = \left(m - \frac{\Phi}{\Phi_0'}\right)I_0, \quad \text{with } I_0 \equiv \frac{\hbar q}{2\pi\mathsf{m}R^2}. \tag{3.136}$$
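The results (134) and (136) may be explored with a few lines of code. The sketch below works in dimensionless units (energy in units of ħ²/2mR², current in units of I0, flux in units of Φ0’), and finds the lowest-energy state at a given flux by a brute-force search over m. The function names and the search range `m_max` are assumptions made for this illustration.

```python
def energy(m, flux_ratio):
    """E_m in units of hbar^2 / (2 m R^2), Eq. (134); flux_ratio = Phi / Phi_0'."""
    return (m - flux_ratio) ** 2


def current(m, flux_ratio):
    """I_m in units of I_0, Eq. (136)."""
    return m - flux_ratio


def ground_state_m(flux_ratio, m_max=10):
    """Integer m minimizing E_m at the given flux (brute-force search)."""
    return min(range(-m_max, m_max + 1), key=lambda m: energy(m, flux_ratio))


if __name__ == "__main__":
    for flux_ratio in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
        m = ground_state_m(flux_ratio)
        print(flux_ratio, m, energy(m, flux_ratio), current(m, flux_ratio))
```

In these units, each Im decreases linearly with the flux, while the ground-state value of m switches by one each time Φ/Φ0’ passes a half-integer value, where two adjacent parabolas Em(Φ) cross.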
The functions Em(Φ) and Im(Φ) are shown in Fig. 18. Note that since Φ0’ ∝ 1/q, for any sign of
the particle’s charge q, dIm/dΦ < 0. It is easy to verify that this means that the current is diamagnetic for
any sign of q:57 the field-induced current flows in such a direction that its own magnetic field tends to
compensate the external magnetic flux applied to the loop. This result may be interpreted as a
different manifestation of the AB effect.58 In contrast to the interference experiment that was discussed
in Sec. 1, in the situation shown in Fig. 17 the particle is not absorbed by the detector but travels around
the ring continuously. As a result, its wavefunction is “rigid”: due to the periodicity condition (128), the
quantum number m is discrete, and the applied magnetic field cannot change the wavefunction
gradually. In this sense, the system is similar to a superconducting loop – see Fig. 4 and its discussion.
The difference between these systems is two-fold:
Fig. 3.18. The magnetic field effect on a charged planar rotator: the eigenenergies Em and the currents Im as functions of Φ/Φ0’, for m = –1, 0, and +1. Dashed arrows show possible inelastic transitions between metastable and ground states, due to weak interaction with the environment, as the external magnetic field is slowly increased.
57 This effect, whose qualitative features remain the same for all 2D or 3D localized states (see Chapter 6 below),
is frequently referred to as orbital diamagnetism. In magnetic materials consisting of particles with
uncompensated spins, this effect competes with an opposite effect, spin paramagnetism – see, e.g., EM Sec. 5.5.
58 It is straightforward to check that the final forms of Eqs. (134)-(136) remain valid even if the magnetic field is
localized well inside the rotator’s circumference so that its lines do not touch the particle’s trajectory.
(i) For a single charged particle, in macroscopic systems with practicable values of q, R, and m,
the scale I0 of the induced current is very small. For example, for m = me, q = –e, and R = 1 μm, Eq.
(136) yields I0 ≈ 3 pA.59 With the ring’s inductance L of the order of μ0R,60 the contribution ΦI = LI ~
μ0RI0 ~ 10^-24 Wb of such a small current into the net magnetic flux Φ is negligible in comparison with
Φ0’ ~ 10^-15 Wb, so that the wavefunction quantization does not lead to the constancy of the total
magnetic flux.
(ii) As soon as the magnetic field raises the eigenstate energy Em above that of another eigenstate
Em’, the former state becomes metastable, and a weak interaction of the system with its environment
(which is neglected in our simple model, but will be discussed in Chapter 7) may induce a quantum
transition of the system to the lower-energy state, thus reducing the diamagnetic current’s magnitude –
see the dashed lines in Fig. 18. The flux quantization in superconductors is much more robust to such
perturbations.61
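The estimates quoted in item (i) are easy to reproduce. The sketch below evaluates I0 from Eq. (136), the flux scale μ0RI0, and the flux quantum Φ0’ = 2πħ/|q| for an electron on a ring of radius 1 μm. The constant names and the helper functions are introduced here for illustration only, and the inductance is taken simply as μ0R, i.e. as an order-of-magnitude estimate.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron mass, kg
MU_0 = 4e-7 * math.pi       # magnetic constant, H/m (accurate enough for estimates)


def current_scale(q, mass, radius):
    """Magnitude of I_0 = hbar |q| / (2 pi m R^2), cf. Eq. (136)."""
    return HBAR * abs(q) / (2 * math.pi * mass * radius ** 2)


def flux_quantum(q):
    """Magnitude of the 'normal' flux quantum 2 pi hbar / |q|."""
    return 2 * math.pi * HBAR / abs(q)


RADIUS = 1e-6                                # ring radius, m
I0 = current_scale(E_CHARGE, M_E, RADIUS)    # current scale, A
PHI_I = MU_0 * RADIUS * I0                   # flux created by that current, Wb
PHI_0 = flux_quantum(E_CHARGE)               # flux quantum, Wb

if __name__ == "__main__":
    print("I0 [A]:", I0)
    print("mu0 * R * I0 [Wb]:", PHI_I)
    print("Phi_0' [Wb]:", PHI_0)
```

The numbers confirm the hierarchy used in the argument: the self-induced flux is about nine orders of magnitude smaller than the flux quantum.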
Now let us return, once again, to the key Eq. (129), and see what it gives for one more
important observable, the particle’s angular momentum
$$\mathbf{L} \equiv \mathbf{r}\times\mathbf{p}. \tag{3.137}$$
In this particular geometry, the vector L has just one component, normal to the rotator plane:
$$L_z = Rp. \tag{3.138}$$
In classical mechanics, Lz of the rotator should be conserved (due to the absence of external torque), but
it may take arbitrary values. In quantum mechanics, the situation changes: with p = ħk, our result km =
m/R for the mth eigenstate may be rewritten as
$$(L_z)_m = \hbar Rk_m = \hbar m. \tag{3.139}$$
[Angular momentum quantization]
Thus, the angular momentum is quantized: it may be only a multiple of the Planck constant ħ –
confirming N. Bohr’s guess – see Eq. (1.8). As we will see in Chapter 5, this result is very general
(though it may be modified by spin effects), and the wavefunctions (129) may be interpreted as
eigenfunctions of the angular momentum operator.
Let us see whether this quantization persists in more general, but still axially-symmetric systems.
To implement the planar rotator in our 3D world, we needed to provide rigid confinement of the particle
both in the motion plane and along the 2D radius ρ. Let us consider a more general situation when only
the former confinement is strict, i.e. the case when a 2D particle moves in an arbitrary centrally-
symmetric potential
$$U(\boldsymbol{\rho}) = U(\rho). \tag{3.140}$$
59 Such weak persistent, macroscopic diamagnetic currents in non-superconducting systems have been
experimentally observed by measuring the weak magnetic field induced by the currents, in systems of a large
number (~10^7) of similar conducting rings – see, e.g., L. Lévy et al., Phys. Rev. Lett. 64, 2074 (1990). Due to the
dephasing effects of electron scattering by phonons and other electrons (unaccounted for in our simple theory),
the effect’s observation requires submicron rings and millikelvin temperatures.
60 See, e.g., EM Sec. 5.3.
61 Interrupting a superconducting ring with a weak link (Josephson junction), i.e. forming a SQUID, we may get a
switching behavior similar to that shown with dashed arrows in Fig. 18 – see, e.g., EM Sec. 6.5.
Using the well-known expression for the 2D Laplace operator in polar coordinates,62 we may represent
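the 2D stationary Schrödinger equation in the form
$$-\frac{\hbar^2}{2\mathsf{m}}\left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\varphi^2}\right]\psi + U(\rho)\psi = E\psi. \tag{3.141}$$
Separating the radial and angular variables as63
$$\psi = \mathcal{R}(\rho)\mathcal{F}(\varphi), \tag{3.142}$$
we get, after the division of all terms by ψ and their multiplication by ρ², the following equation:
$$-\frac{\hbar^2}{2\mathsf{m}}\left[\frac{\rho}{\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + \frac{1}{\mathcal{F}}\frac{d^2\mathcal{F}}{d\varphi^2}\right] + \rho^2U(\rho) = \rho^2E. \tag{3.143}$$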
The fraction (d²F/dφ²)/F should be a constant (because all other terms of the equation may be functions
only of ρ), so that for the function F(φ) we get an ordinary differential equation,
$$\frac{d^2\mathcal{F}}{d\varphi^2} + \nu^2\mathcal{F} = 0, \tag{3.144}$$
where ν² is the variable separation constant. The fundamental solutions of Eq. (144) are evidently F ∝
exp{±iνφ}. Now requiring, as we did for the planar rotator, the 2π periodicity of any observable, i.e.
$$\mathcal{F}(\varphi + 2\pi) = \mathcal{F}(\varphi)\,e^{2\pi im}, \tag{3.145}$$
where m is an integer, we see that the constant ν has to be equal to m, and get, for the angular factor, the
same result as for the full wavefunction of the planar rotator – cf. Eq. (129):
$$\mathcal{F}_m = C_me^{im\varphi}, \quad \text{with } m = 0, \pm1, \pm2, \ldots \tag{3.146}$$
Plugging the resulting relation (d²F/dφ²)/F = –m² back into Eq. (143), we may rewrite it as
$$-\frac{\hbar^2}{2\mathsf{m}}\left[\frac{1}{\rho\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) - \frac{m^2}{\rho^2}\right] + U(\rho) = E. \tag{3.147}$$
The physical interpretation of this equation is that the full energy is a sum,
$$E = E_\rho + E_\varphi, \tag{3.148}$$
of the radial-motion part
$$E_\rho = -\frac{\hbar^2}{2\mathsf{m}}\frac{1}{\rho\mathcal{R}}\frac{d}{d\rho}\left(\rho\frac{d\mathcal{R}}{d\rho}\right) + U(\rho), \tag{3.149}$$
and the angular-motion part
62 See, e.g., MA Eq. (10.3) with ∂/∂z = 0.

63 At this stage, I do not want to mark the particular solution (eigenfunction) ψ and corresponding eigenenergy E
with any single index, because based on our experience in Sec. 1.7, we already may expect that in a 2D problem
the role of this index will be played by two integers – two quantum numbers.
$$E_\varphi = \frac{\hbar^2m^2}{2\mathsf{m}\rho^2}. \tag{3.150}$$
Now let us recall that a similar separation exists in classical mechanics,64 because the total
energy of a particle moving in a central field may be represented as
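$$E = \frac{\mathsf{m}}{2}v^2 + U(\rho) = \frac{\mathsf{m}}{2}\left(\dot\rho^2 + \rho^2\dot\varphi^2\right) + U(\rho) = E_\rho + E_\varphi, \tag{3.151}$$
$$\text{with } E_\rho \equiv \frac{p_\rho^2}{2\mathsf{m}} + U(\rho), \quad \text{and } E_\varphi \equiv \frac{\mathsf{m}}{2}\rho^2\dot\varphi^2 = \frac{p_\varphi^2}{2\mathsf{m}} = \frac{L_z^2}{2\mathsf{m}\rho^2}. \tag{3.152}$$
The comparison of the latter relation with Eqs. (139) and (150) gives us grounds to expect that the
quantization rule Lz = ħm may be valid not only for this 2D problem but in 3D cases as well. In Sec. 5.6,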
we will see that this is indeed the case.
Returning to Eq. (147), with our 1D wave mechanics experience we may expect that for any fixed m
this ordinary, linear, second-order differential equation should have (for a motion confined to a certain
finite region of its argument ρ) a discrete energy spectrum described by another integer quantum number
– say, n. This means that the eigenfunctions (142), the corresponding eigenenergies (148), and the radial
functions R(ρ) should be indexed by two quantum numbers, m and n. So, the variable separation is not so “clean” as it
was for the rectangular potential well. Normalizing the angular function F to the full circle, Δφ = 2π, we
may rewrite Eq. (142) as
$$\psi_{m,n} = \mathcal{R}_{m,n}(\rho)\mathcal{F}_m(\varphi) = \frac{1}{(2\pi)^{1/2}}\mathcal{R}_{m,n}(\rho)\,e^{im\varphi}. \tag{3.153}$$
A good (and important) example of an analytically solvable problem of this type is a 2D particle
whose motion is rigidly confined to a disk of radius R, but otherwise free:
$$U(\rho) = \begin{cases} 0, & \text{for } 0 \le \rho < R,\\ +\infty, & \text{for } R < \rho. \end{cases} \tag{3.154}$$
In this case, the solutions Rm,n(ρ) of Eq. (147) are proportional to the Bessel functions of the first kind,
Jm(knρ),65 with the spectrum of possible values kn following from the boundary condition Rm,n(R) = 0.
Let me leave a detailed analysis of this problem for the reader’s exercise.
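As a starting point for that exercise, here is a minimal numerical sketch, using only the standard library. It assumes that, with U = 0 inside the disk, Eq. (147) reduces to the Bessel equation, so that the eigenenergies are E = ħ²k²/2m with kR equal to a root of Jm. The power-series evaluation of Jm, the scan-and-bisect root search, and all function names are choices made for this illustration rather than the author’s method.

```python
import math


def bessel_j(m, x, terms=60):
    """J_m(x) for integer m >= 0 from its power series (adequate for x below ~15)."""
    total = 0.0
    for k in range(terms):
        coefficient = (-1) ** k / (math.factorial(k) * math.factorial(k + m))
        total += coefficient * (x / 2) ** (2 * k + m)
    return total


def bessel_zeros(m, count, start=0.5, step=0.1):
    """First `count` positive roots of J_m, by a sign-change scan plus bisection."""
    zeros = []
    x = start
    while len(zeros) < count:
        lo, hi = x, x + step
        if bessel_j(m, lo) * bessel_j(m, hi) < 0:
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                if bessel_j(m, lo) * bessel_j(m, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
        x += step
    return zeros


def disk_levels(m_max=2, n_max=2):
    """Sorted list of (E / E_0, m, n), with E_0 = hbar^2 / (2 m R^2)."""
    levels = []
    for m in range(m_max + 1):
        for n, root in enumerate(bessel_zeros(m, n_max), start=1):
            levels.append((root ** 2, m, n))
    return sorted(levels)


if __name__ == "__main__":
    for level in disk_levels():
        print(level)
```

Since Eq. (147) depends on m only via m², each level with m ≠ 0 listed by this sketch corresponds to two eigenfunctions (153), with opposite signs of m.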
3.6. Spherically-symmetric systems: Brute force approach

Now let us proceed to the mathematically more involved, but practically even more important
case of the 3D motion, in a spherically-symmetric potential
$$U(\mathbf{r}) = U(r). \tag{3.155}$$
64See, e.g., CM Sec. 3.5.

65 A short summary of properties of these functions, including the most important plots and a useful table of
values, may be found in EM Sec. 2.7.
Let us start, again, with solving the eigenproblem for a rigid rotator – now a spherical rotator,
i.e. a particle confined to move on the spherical surface of radius R. The rotator has two degrees of
freedom because its position on the surface is completely described by two coordinates – say, the polar
angle θ and the azimuthal angle φ. In this case, the kinetic energy we need to consider is limited to its
angular part, so that in the Laplace operator in spherical coordinates66 we may keep only those parts,
with fixed r = R. Because of this, the stationary Schrödinger equation becomes
$$-\frac{\hbar^2}{2\mathsf{m}R^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]\psi = E\psi. \tag{3.156}$$
(Again, we will attach indices to ψ and E in a minute.) With the natural variable separation,
$$\psi = \Theta(\theta)\mathcal{F}(\varphi), \tag{3.157}$$
Eq. (156), with all terms multiplied by sin²θ/ΘF, yields
$$-\frac{\hbar^2}{2\mathsf{m}R^2}\left[\frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\mathcal{F}}\frac{d^2\mathcal{F}}{d\varphi^2}\right] = E\sin^2\theta. \tag{3.158}$$
Just as in Eq. (143), the fraction (d²F/dφ²)/F may be a function of φ only, and hence has to be constant,
giving Eq. (144) for it. So, with the same periodicity condition (145), the azimuthal functions are
expressed by (146) again; in the normalized form,
$$\mathcal{F}_m(\varphi) = \frac{1}{(2\pi)^{1/2}}e^{im\varphi}. \tag{3.159}$$
With that, the fraction (d²F/dφ²)/F in Eq. (158) equals (–m²), and after the multiplication of all terms of
that equation by Θ/sin²θ, it is reduced to the following ordinary linear differential equation for the polar
eigenfunctions Θ(θ):
$$-\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m^2}{\sin^2\theta}\Theta = \varepsilon\Theta, \quad \text{with } \varepsilon \equiv E\Big/\frac{\hbar^2}{2\mathsf{m}R^2}. \tag{3.160}$$
It is common to recast it into an equation for a new function P(ξ) ≡ Θ(θ), with ξ ≡ cos θ:
$$\frac{d}{d\xi}\left[\left(1-\xi^2\right)\frac{dP}{d\xi}\right] + \left[l(l+1) - \frac{m^2}{1-\xi^2}\right]P = 0, \tag{3.161}$$
where a new notation for the normalized energy is introduced: l(l+1) ≡ ε. The motivation for such
notation is that, according to the mathematical analysis of Eq. (161) with integer m,67 it has solutions
only if the parameter l is an integer: l = 0, 1, 2,…, and only if that integer is not smaller than |m|, i.e. if
$$-l \le m \le +l. \tag{3.162}$$
This fact immediately gives the following spectrum of the spherical rotator’s energy E – and, as we will
see later, the angular part of the energy of any spherically-symmetric system:
66 See, e.g., MA Eq. (10.9).

67 This analysis was first carried out by A.-M. Legendre (1752-1833). Just as a historic note: besides many
original mathematical achievements, Dr. Legendre had authored a famous textbook, Éléments de Géométrie,
which dominated teaching geometry through the 19th century.
$$E_l = \frac{\hbar^2l(l+1)}{2\mathsf{m}R^2}, \tag{3.163}$$
[Angular energy spectrum]
so that the only effect of the magnetic quantum number m here is imposing the restriction (162) on the
non-negative integer l – the so-called orbital quantum number. This means, in particular, that each
energy (163) corresponds to (2l + 1) different values of m, i.e. is (2l + 1)–degenerate.
To understand the nature of this degeneracy, we need to explore the corresponding
eigenfunctions of Eq. (161). They are naturally numbered by two integers, m and l, and are called the
associated Legendre functions Plm. (Note that here m is an upper index, not a power!) For the particular,
simplest case m = 0, these functions are the so-called Legendre polynomials Pl(ξ) ≡ Pl0(ξ), which may
be defined as the solutions of the following Legendre equation, resulting from Eq. (161) at m = 0:
$$\frac{d}{d\xi}\left[\left(1-\xi^2\right)\frac{dP}{d\xi}\right] + l(l+1)P = 0, \tag{3.164}$$
[Legendre equation]
but also may be calculated explicitly from the following Rodrigues formula:68
$$P_l(\xi) = \frac{1}{2^ll!}\frac{d^l}{d\xi^l}\left(\xi^2-1\right)^l, \quad l = 0, 1, 2, \ldots \tag{3.165}$$
[Legendre polynomials]
Using this formula, it is easy to spell out a few lowest Legendre polynomials:
$$P_0(\xi) = 1, \quad P_1(\xi) = \xi, \quad P_2(\xi) = \frac{1}{2}\left(3\xi^2 - 1\right), \quad P_3(\xi) = \frac{1}{2}\left(5\xi^3 - 3\xi\right), \ \ldots, \tag{3.166}$$
though such explicit expressions become bulkier and bulkier as l is increased. As these expressions (and
Fig. 19) show, as the argument ξ is increased, all these functions end up at the same point, Pl(+1) = +1,
while starting either at the same point or at the opposite point: Pl(–1) = (–1)^l. On the way between
these two end points, the lth polynomial crosses the horizontal axis exactly l times, i.e. Pl(ξ) has exactly l
roots on this segment.69
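The Rodrigues formula (165) is straightforward to implement with exact rational arithmetic, which gives a quick check of the expressions (166) and of the end-point and root-counting properties just described. In the sketch below, a polynomial is stored as a list of coefficients, lowest power first; this representation and all helper names are choices made for this illustration only.

```python
from fractions import Fraction
from math import factorial


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_diff(a):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def legendre(l):
    """Coefficients of P_l, computed literally from the Rodrigues formula (165)."""
    poly = [Fraction(1)]
    for _ in range(l):
        poly = poly_mul(poly, [Fraction(-1), Fraction(0), Fraction(1)])  # (xi^2 - 1)
    for _ in range(l):
        poly = poly_diff(poly)
    return [c / (2 ** l * factorial(l)) for c in poly]


def poly_eval(a, x):
    """Horner evaluation of the polynomial at the point x."""
    result = 0
    for c in reversed(a):
        result = result * x + c
    return result


def count_sign_changes(a, samples=2000):
    """Number of sign changes on a uniform midpoint grid inside (-1, +1)."""
    values = [poly_eval(a, -1 + (i + 0.5) * 2 / samples) for i in range(samples)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)


if __name__ == "__main__":
    for l in range(5):
        p = legendre(l)
        print(l, [str(c) for c in p], poly_eval(p, 1), poly_eval(p, -1), count_sign_changes(p))
```

The grid-based root count is only a numerical check, adequate for the few lowest polynomials whose roots are well separated.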
Fig. 3.19. A few lowest Legendre polynomials Pl(ξ), with l = 1, 2, 3, and 4, as functions of ξ ≡ cos θ on the segment –1 ≤ ξ ≤ +1.
68 This wonderful formula may be readily proved by plugging it into Eq. (164), but was not so easy to discover!
This was done (independently) by B. O. Rodrigues in 1816, J. Ivory in 1824, and C. Jacobi in 1827.
69 In this behavior, we may readily recognize the “standing wave” pattern typical for all 1D eigenproblems – cf.
Figs. 1.8 and 2.35, as well as the discussion of the Sturm oscillation theorem at the end of Sec. 2.9.
It is also easy to use the Rodrigues formula (165) and the integration by parts to show that on the
segment –1 ≤ ξ ≤ +1, the Legendre polynomials form a full orthogonal set of functions, with the
following normalization rule:
$$\int_{-1}^{+1}P_l(\xi)P_{l'}(\xi)\,d\xi = \frac{2}{2l+1}\delta_{ll'}. \tag{3.167}$$
For m > 0, the associated Legendre functions (now not necessarily polynomials!) may be
expressed via the Legendre polynomials (165) using the following formula:70
$$P_l^m(\xi) = (-1)^m\left(1-\xi^2\right)^{m/2}\frac{d^m}{d\xi^m}P_l(\xi), \tag{3.168}$$
[Associated Legendre functions]
while the functions with a negative magnetic quantum number may be found as
$$P_l^{-m}(\xi) = (-1)^m\frac{(l-m)!}{(l+m)!}P_l^m(\xi), \quad \text{for } m > 0. \tag{3.169}$$
On the segment –1 ≤ ξ ≤ +1, the associated Legendre functions with a fixed index m form a full
orthogonal set, with the normalization relation,
$$\int_{-1}^{+1}P_l^m(\xi)P_{l'}^m(\xi)\,d\xi = \frac{2}{2l+1}\frac{(l+m)!}{(l-m)!}\delta_{ll'}, \tag{3.170}$$
which is evidently a generalization of Eq. (167) for arbitrary m.
Since the difference between the angles θ and φ is to a large extent artificial (due to an arbitrary
direction of the polar axis), physicists prefer to use not the functions Θ(θ) ∝ P_l^m(cos θ) and Fm(φ) ∝ exp{imφ}
separately, but normalized products of the type (157), which are called the spherical harmonics:
$$Y_l^m(\theta,\varphi) \equiv \left[\frac{2l+1}{4\pi}\frac{(l-m)!}{(l+m)!}\right]^{1/2}P_l^m(\cos\theta)\,e^{im\varphi}. \tag{3.171}$$
[Spherical harmonics]
The specific front factor in Eq. (171) is chosen in a way to simplify the following two expressions: the
relation of the spherical harmonics with opposite signs of the magnetic quantum number,
$$Y_l^{-m}(\theta,\varphi) = (-1)^m\left[Y_l^m(\theta,\varphi)\right]^*, \tag{3.172}$$
and the following normalization relation:
$$\oint_{4\pi}Y_l^m(\theta,\varphi)\left[Y_{l'}^{m'}(\theta,\varphi)\right]^*d\Omega = \delta_{ll'}\delta_{mm'}, \tag{3.173}$$
with the integration over the whole solid angle. The last formula shows that on a spherical surface, the
spherical harmonics form an orthonormal set of functions. This set is also full, so that any function
defined on the surface, may be uniquely represented as a linear combination of Ylm.
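The orthonormality relation (173) may be verified numerically for the lowest harmonics. The sketch below uses the standard explicit expressions for all harmonics with l ≤ 2, in the sign convention of Eq. (168), with the prefactor (5/16π)^1/2 for the harmonic with l = 2, m = 0, as required by the normalization. The quadrature (a midpoint rule in θ and a uniform rule in φ), the grid sizes, and the names `HARMONICS` and `overlap` are assumptions made for this illustration.

```python
import cmath
import math

PI = math.pi

# Explicit spherical harmonics with l <= 2, keyed by (l, m); t is theta, p is phi.
HARMONICS = {
    (0, 0): lambda t, p: math.sqrt(1 / (4 * PI)),
    (1, -1): lambda t, p: math.sqrt(3 / (8 * PI)) * math.sin(t) * cmath.exp(-1j * p),
    (1, 0): lambda t, p: math.sqrt(3 / (4 * PI)) * math.cos(t),
    (1, 1): lambda t, p: -math.sqrt(3 / (8 * PI)) * math.sin(t) * cmath.exp(1j * p),
    (2, -2): lambda t, p: math.sqrt(15 / (32 * PI)) * math.sin(t) ** 2 * cmath.exp(-2j * p),
    (2, -1): lambda t, p: math.sqrt(15 / (8 * PI)) * math.sin(t) * math.cos(t) * cmath.exp(-1j * p),
    (2, 0): lambda t, p: math.sqrt(5 / (16 * PI)) * (3 * math.cos(t) ** 2 - 1),
    (2, 1): lambda t, p: -math.sqrt(15 / (8 * PI)) * math.sin(t) * math.cos(t) * cmath.exp(1j * p),
    (2, 2): lambda t, p: math.sqrt(15 / (32 * PI)) * math.sin(t) ** 2 * cmath.exp(2j * p),
}


def overlap(key_a, key_b, n_theta=200, n_phi=16):
    """Approximate the integral of Y_a * conj(Y_b) over the full solid angle."""
    ya, yb = HARMONICS[key_a], HARMONICS[key_b]
    d_theta = PI / n_theta
    d_phi = 2 * PI / n_phi
    total = 0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta          # midpoint rule in theta
        ring = 0j
        for j in range(n_phi):
            phi = j * d_phi                  # uniform rule over the full period in phi
            ring += ya(theta, phi) * complex(yb(theta, phi)).conjugate()
        total += ring * math.sin(theta) * d_theta * d_phi
    return total


if __name__ == "__main__":
    for a in HARMONICS:
        row = [round(abs(overlap(a, b)), 3) for b in HARMONICS]
        print(a, row)
```

Within the quadrature error, the printed matrix of overlap magnitudes is the unit matrix, as Eq. (173) requires.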
Despite a somewhat intimidating character of the formulas given above, they yield quite simple
expressions for the lowest spherical harmonics, which are most important for applications:
70 Note that some texts use different choices for the front factor (called the Condon-Shortley phase) in the
functions Plm, which do not affect the final results for the spherical harmonics Ylm.
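$$l = 0: \quad Y_0^0 = \left(\frac{1}{4\pi}\right)^{1/2}, \tag{3.174}$$
$$l = 1: \quad \begin{cases} Y_1^{-1} = \left(\dfrac{3}{8\pi}\right)^{1/2}\sin\theta\,e^{-i\varphi},\\[4pt] Y_1^{0} = \left(\dfrac{3}{4\pi}\right)^{1/2}\cos\theta,\\[4pt] Y_1^{1} = -\left(\dfrac{3}{8\pi}\right)^{1/2}\sin\theta\,e^{i\varphi}, \end{cases} \tag{3.175}$$
$$l = 2: \quad \begin{cases} Y_2^{-2} = \left(\dfrac{15}{32\pi}\right)^{1/2}\sin^2\theta\,e^{-2i\varphi},\\[4pt] Y_2^{-1} = \left(\dfrac{15}{8\pi}\right)^{1/2}\sin\theta\cos\theta\,e^{-i\varphi},\\[4pt] Y_2^{0} = \left(\dfrac{5}{16\pi}\right)^{1/2}\left(3\cos^2\theta - 1\right),\\[4pt] Y_2^{1} = -\left(\dfrac{15}{8\pi}\right)^{1/2}\sin\theta\cos\theta\,e^{i\varphi},\\[4pt] Y_2^{2} = \left(\dfrac{15}{32\pi}\right)^{1/2}\sin^2\theta\,e^{2i\varphi}, \end{cases} \quad \text{etc.} \tag{3.176}$$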
It is important to understand the general structure and symmetry of these functions. Since the
spherical harmonics with m ≠ 0 are complex, the most popular way of their graphical representation is to
normalize their real and imaginary parts as71
$$Y_{lm} \equiv \sqrt{2}\,(-1)^m\begin{cases} \mathrm{Im}\,Y_l^{|m|} \propto \sin|m|\varphi, & \text{for } m < 0,\\ \mathrm{Re}\,Y_l^{m} \propto \cos m\varphi, & \text{for } m > 0, \end{cases} \tag{3.177}$$
(for m = 0, Y_{l0} ≡ Y_l^0), and then plot the magnitude of these real functions in the spherical coordinates as
the distance from the origin, while using two colors to show their sign – see Fig. 20.
Let us start from the simplest case l = 0. According to Eq. (162), for this lowest orbital quantum
number, there may be only one magnetic quantum number, m = 0. According to Eq. (174), the spherical
harmonic corresponding to that state is just a constant, so that the wavefunction of this so-called s state72
is uniformly distributed over the sphere. Since this function has no gradient in any angular direction, it is
only natural that the angular kinetic energy (163) of the particle equals zero.
According to the same Eq. (162), for l = 1, there are 3 different p states, with m = –1, m = 0, and
m = +1 – see Eq. (175). As the second row of Fig. 20 shows, these states are essentially identical in
structure and are just differently oriented in space, thus readily explaining the 3-fold degeneracy of the
kinetic energy (163). Such a simple explanation, however, is not valid for the 5 different d states (l = 2),
shown in the third row of Fig. 20, as well as the states with higher l: despite their equal energies, they
differ not only by their spatial orientation but their structure as well. All states with m = 0 have a
nonzero gradient only in the θ direction. On the contrary, the states with the ultimate values of m (±l),
change only monotonically (as sin^l θ) in the polar direction, while oscillating in the azimuthal direction.
The states with intermediate values of m provide a gradual transition between these two extremes,
oscillating in both directions, stronger and stronger in the azimuthal direction as m is increased. Still,
the magnetic quantum number, surprisingly, does not affect the angular energy for any l.
71 Such real functions Y_{lm}, which also form a full orthonormal set, and are frequently called the real (or “tesseral”)
spherical harmonics, are more convenient than the complex harmonics Y_l^m for several applications, especially
when the variables of interest are real by definition.
72 The letter names for the states with various values of l stem from the history of optical spectroscopy – for
example, the letter “s” used for states with l = 0, originally denoted the “sharp” optical line series, etc. The
sequence of the letters is as follows: s, p, d, f, g, and then continuing in alphabetical order.
Fig. 3.20. Radial plots of several lowest real spherical harmonics Y_{lm}, with the rows corresponding to l = 0 (s state), l = 1 (p states), l = 2 (d states), and l = 3 (f states), and the columns to m = –3, –2, –1, 0, +1, +2, +3. (Adapted from
https://en.wikipedia.org/wiki/Spherical_harmonics under the CC BY-SA 3.0 license.)
Another counter-intuitive feature of the spherical harmonics follows from the comparison of Eq.
(163) with the second of the classical relations (152). These expressions coincide if we interpret the
constant
$$L^2 = \hbar^2l(l+1), \tag{3.178}$$
as the value of the full angular momentum squared, L² = |L|² (including both its θ and φ components), in
the eigenstate with eigenfunction Y_l^m. On the other hand, the structure (159) of the azimuthal component
F(φ) of the wavefunction is exactly the same as in 2D axially-symmetric problems, implying that Eq.
(139) still gives correct values Lz = ħm for the z-component of the angular momentum. This fact invites
a question: why for any state with l > 0, (Lz)² = m²ħ² ≤ l²ħ² is always less than L² = l(l + 1)ħ²? In other
words, what prevents the angular momentum vector from being fully aligned with the axis z?
Besides the difficulty of answering this question using the above formulas, this analysis (though
mathematically complete), is as intellectually unsatisfactory as the harmonic oscillator analysis in Sec.
2.9. In particular, it does not explain the meaning of the extremely simple relations for the eigenvalues
of the energy and the angular momentum, coexisting with rather complicated eigenfunctions.
We will obtain natural answers to all these questions and concerns in Sec. 5.6 below, and now
proceed to the extension of our wave-mechanical analysis to the 3D motion in an arbitrary spherically-
symmetric potential (155). In this case, we have to use the full form of the Laplace operator in spherical
coordinates.73 The variable separation procedure is an evident generalization of what we have done
before, with the particular solutions of the type
$$\psi = \mathcal{R}(r)\Theta(\theta)\mathcal{F}(\varphi), \tag{3.179}$$
whose substitution into the stationary Schrödinger equation yields
$$-\frac{\hbar^2}{2\mathsf{m}r^2}\left[\frac{1}{\mathcal{R}}\frac{d}{dr}\left(r^2\frac{d\mathcal{R}}{dr}\right) + \frac{1}{\Theta}\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\sin^2\theta}\frac{1}{\mathcal{F}}\frac{d^2\mathcal{F}}{d\varphi^2}\right] + U(r) = E. \tag{3.180}$$
It is evident that the angular part of the left-hand side (the two last terms in the square brackets)
separates from the radial part, and that for the former part we get Eq. (156) again, with the only change,
R → r. This change does not affect the fact that the eigenfunctions of that equation are still the spherical
harmonics (171), which obey Eq. (164). As a result, Eq. (180) gives the following equation for the radial
function R(r):
$$-\frac{\hbar^2}{2\mathsf{m}r^2}\left[\frac{1}{\mathcal{R}}\frac{d}{dr}\left(r^2\frac{d\mathcal{R}}{dr}\right) - l(l+1)\right] + U(r) = E. \tag{3.181}$$
Note that no information about the magnetic quantum number m has crept into this radial equation
(besides setting the limitation (162) for the possible values of l) so that it includes only the orbital
quantum number l.
Let us explore the radial equation for the simplest case when U(r) = 0 – for example, to solve the
eigenproblem for a 3D particle free to move only inside the sphere of radius R – say, confined there by
the potential74
$$U = \begin{cases} 0, & \text{for } 0 \le r < R,\\ +\infty, & \text{for } R < r. \end{cases} \tag{3.182}$$
In this case, Eq. (181) is reduced to
$$-\frac{\hbar^2}{2\mathsf{m}r^2}\left[\frac{1}{\mathcal{R}}\frac{d}{dr}\left(r^2\frac{d\mathcal{R}}{dr}\right) - l(l+1)\right] = E. \tag{3.183}$$
Multiplying both parts of this equality by r²R, and introducing the dimensionless argument ξ ≡ kr, where
k² is defined by the usual relation ħ²k²/2m = E, we obtain the canonical form of this equation,
$$\xi^2\frac{d^2\mathcal{R}}{d\xi^2} + 2\xi\frac{d\mathcal{R}}{d\xi} + \left[\xi^2 - l(l+1)\right]\mathcal{R} = 0, \tag{3.184}$$
satisfied by the so-called spherical Bessel functions of the first and second kind, jl(ξ) and yl(ξ).75 These
functions are directly related to the Bessel functions of semi-integer order,76
73 Again, see MA Eq. (10.9).

74 This problem, besides giving a simple example of the quantization in spherically-symmetric systems, is also an
important precursor for the discussion of scattering by spherically-symmetric potentials in Sec. 8.
75 Alternatively, yl(ξ) are called “spherical Weber functions” or “spherical Neumann functions”.
76 Note that the Bessel functions Jν(ξ) and Yν(ξ) of any order ν obey the universal recurrent formulas and
asymptotic formulas (discussed, e.g., in EM Sec. 2.7), so that many properties of the functions jl(ξ) and yl(ξ) may
be readily derived from these relations and Eqs. (185).
$$j_l(\xi) = \left(\frac{\pi}{2\xi}\right)^{1/2}J_{l+\frac{1}{2}}(\xi), \quad y_l(\xi) = \left(\frac{\pi}{2\xi}\right)^{1/2}Y_{l+\frac{1}{2}}(\xi), \tag{3.185}$$
but are actually much simpler than even the “usual” Bessel functions, such as Jn(ξ) and Yn(ξ) of an
integer order n, because the former ones may be directly expressed via elementary functions:
$$\begin{aligned} j_0(\xi) &= \frac{\sin\xi}{\xi}, & j_1(\xi) &= \frac{\sin\xi}{\xi^2} - \frac{\cos\xi}{\xi}, & j_2(\xi) &= \left(\frac{3}{\xi^3} - \frac{1}{\xi}\right)\sin\xi - \frac{3}{\xi^2}\cos\xi, \ \ldots,\\ y_0(\xi) &= -\frac{\cos\xi}{\xi}, & y_1(\xi) &= -\frac{\cos\xi}{\xi^2} - \frac{\sin\xi}{\xi}, & y_2(\xi) &= -\left(\frac{3}{\xi^3} - \frac{1}{\xi}\right)\cos\xi - \frac{3}{\xi^2}\sin\xi, \ \ldots \end{aligned} \tag{3.186}$$
A few lowest-order spherical Bessel functions are plotted in Fig. 21.
Fig. 3.21. Several lowest-order spherical Bessel functions jl(ξ) and yl(ξ), with l = 0, 1, 2, and 3, on the interval 0 ≤ ξ ≤ 15.
As these formulas and plots show, the functions yl(ξ) are diverging at ξ → 0, and thus cannot be
used in the solution of our current problem (182), so that we have to take
$$\mathcal{R}_l(r) = \text{const}\times j_l(kr). \tag{3.187}$$
Still, even for these functions, with the sole exception of the simplest function j0(ξ), the characteristic
equation jl(kR) = 0, resulting from the boundary condition R(R) = 0, can be solved only numerically.
However, the roots ξl,n of the equation jl(ξ) = 0, where the integer n (= 1, 2, 3,…) is the root’s number,
are tabulated in virtually any math handbook, and we may express the eigenvalues we are interested in,
$$k_{l,n} = \frac{\xi_{l,n}}{R}, \quad E_{l,n} = \frac{\hbar^2k_{l,n}^2}{2\mathsf{m}} = \frac{\hbar^2\xi_{l,n}^2}{2\mathsf{m}R^2}, \tag{3.188}$$
via these tabulated numbers. The table below lists several smallest roots, and the corresponding
eigenenergies (normalized to their natural unit E0 ≡ ħ²/2mR²), in the order of their growth. It shows a
very interesting effect: going up the energy spectrum, first the eigenenergies grow because of increases
of the orbital quantum number l, at the same (lowest) radial quantum number n = 1, due to the growth of
the first roots of functions jl(ξ), but then suddenly the second root of j0(ξ) cuts into this orderly
sequence, just to be followed by the first root of j3(ξ). With the further growth of energy, the sequences
of l and n become even more entangled.
l    n    $\xi_{l,n}$         $E_{l,n}/E_0 = (\xi_{l,n})^2$

0    1    π ≈ 3.1416          π² ≈ 9.87
1    1    4.493               20.19
2    1    5.763               33.21
0    2    2π ≈ 6.283          4π² ≈ 39.48
3    1    6.988               48.83
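The entries of this table may be reproduced with a few lines of code. The Python sketch below is only an illustration under stated assumptions: it builds $j_l(\xi)$ from the explicit formulas (3.186) for l = 0 and 1, plus the standard upward recurrence $j_{l+1} = [(2l+1)/\xi]j_l - j_{l-1}$ (not written out in the text above), and the helper names `spherical_jn`, `bisect`, and `first_roots` are introduced here just for this example.

```python
import math

def spherical_jn(l, x):
    """j_l(x), from j_0 and j_1 of Eq. (3.186) and the upward recurrence (assumed)."""
    j_prev = math.sin(x) / x
    if l == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for m in range(1, l):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def bisect(f, a, b, tol=1e-12):
    """Refine a root of f bracketed by a sign change on [a, b]."""
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0:
            b = mid
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)

def first_roots(l, count, x_start=1.0, x_max=20.0, step=0.01):
    """First `count` roots of j_l; the scan starts at x = 1 because no root lies below pi."""
    roots = []
    x, f_prev = x_start, spherical_jn(l, x_start)
    while x < x_max and len(roots) < count:
        x_next = x + step
        f_next = spherical_jn(l, x_next)
        if f_prev * f_next < 0:
            roots.append(bisect(lambda t: spherical_jn(l, t), x, x_next))
        x, f_prev = x_next, f_next
    return roots

# Energies E/E0 = xi**2 for l = 0..3 and n = 1, 2, sorted in the order of their growth.
levels = sorted(
    (xi**2, l, n)
    for l in range(4)
    for n, xi in enumerate(first_roots(l, 2), start=1)
)
for energy, l, n in levels[:5]:
    print(f"l = {l}, n = {n}: xi = {math.sqrt(energy):.3f}, E/E0 = {energy:.2f}")
```

The sorted list makes the interleaving of the sequences of l and n explicit: the level with l = 0, n = 2 lands between those with l = 2 and l = 3 at n = 1.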
To complete the discussion of our current problem (182), note again that the energy levels, listed in the table above, are (2l + 1)-degenerate because each of them corresponds to (2l + 1) different eigenfunctions, each with a specific value of the magnetic quantum number m:

$$\psi_{n,l,m} = C_{l,n}\,j_l\!\left(\frac{\xi_{l,n}r}{R}\right)Y_l^m(\theta,\varphi), \quad \text{with } -l \le m \le +l. \tag{3.189}$$
3.7. Atoms
Now we are ready to discuss atoms, starting from the simplest, exactly solvable Bohr atom problem, i.e. that of a single particle’s motion in the so-called attractive Coulomb potential77

$$U(r) = -\frac{C}{r}, \quad \text{with } C > 0. \tag{3.190}$$

The natural scales of E and r in this problem are commonly defined by the requirement of equality of the kinetic and potential energy magnitude scales (dropping all numerical coefficients):

$$E_0 \equiv \frac{\hbar^2}{mr_0^2} = \frac{C}{r_0}, \tag{3.191}$$

similar to its particular case (1.13b). Solving this system of two equations, we get78

$$E_0 = \frac{\hbar^2}{mr_0^2} = m\left(\frac{C}{\hbar}\right)^2, \quad \text{and } r_0 = \frac{\hbar^2}{mC}. \tag{3.192}$$
77 Historically, the solution of this problem in 1926, that reproduced the main result (1.12)-(1.13) of the “old” quantum theory developed by N. Bohr in 1912, without its phenomenological assumptions, was the decisive step toward the general acceptance of Schrödinger’s wave mechanics.
78 For the most important case of the hydrogen atom, with $C = e^2/4\pi\varepsilon_0$, these scales are reduced, respectively, to the Bohr radius $r_B$ (1.10) and the Hartree energy $E_H$ (1.13a). Note also that according to Eq. (192), for the so-called hydrogen-like atom (actually, a positive ion) with $C = Z(e^2/4\pi\varepsilon_0)$, these two key parameters are rescaled as $r_0 = r_B/Z$ and $E_0 = Z^2E_H$.
In the normalized units $\varepsilon \equiv E/E_0$ and $\xi \equiv r/r_0$, equation (181) for our current case (190) looks relatively simple,

$$\frac{d^2\mathcal{R}}{d\xi^2} + \frac{2}{\xi}\frac{d\mathcal{R}}{d\xi} - \frac{l(l+1)}{\xi^2}\mathcal{R} + 2\left(\varepsilon + \frac{1}{\xi}\right)\mathcal{R} = 0, \tag{3.193}$$

but unfortunately, its eigenfunctions may be called elementary only in the most generous meaning of the word. With the adequate normalization,

$$\int_0^\infty \mathcal{R}_{n,l}\,\mathcal{R}_{n',l}\,r^2\,dr = \delta_{nn'}, \tag{3.194}$$
these (mutually orthogonal) functions may be represented as

$$\mathcal{R}_{n,l}(r) = \left\{\left(\frac{2}{nr_0}\right)^3\frac{(n-l-1)!}{2n\left[(n+l)!\right]^3}\right\}^{1/2}\exp\left\{-\frac{r}{nr_0}\right\}\left(\frac{2r}{nr_0}\right)^l L_{n-l-1}^{2l+1}\!\left(\frac{2r}{nr_0}\right). \tag{3.195}$$
Here $L_p^q(\xi)$ are the so-called associated Laguerre polynomials, which may be calculated as

$$L_p^q(\xi) = (-1)^q\frac{d^q}{d\xi^q}L_{p+q}(\xi) \tag{3.196}$$

from the simple Laguerre polynomials $L_p(\xi) \equiv L_p^0(\xi)$.79 In turn, the easiest way to obtain $L_p(\xi)$ is to use the following Rodrigues formula:80

$$L_p(\xi) = e^{\xi}\frac{d^p}{d\xi^p}\left(\xi^pe^{-\xi}\right). \tag{3.197}$$

Note that in contrast with the associated Legendre functions $P_l^m$, participating in the spherical harmonics, all $L_p^q$ are just polynomials, and those with small indices p and q are indeed quite simple:

$$\begin{aligned}
L_0^0(\xi) &= 1, & L_1^0(\xi) &= -\xi + 1, & L_2^0(\xi) &= \xi^2 - 4\xi + 2, \\
L_0^1(\xi) &= 1, & L_1^1(\xi) &= -2\xi + 4, & L_2^1(\xi) &= 3\xi^2 - 18\xi + 18, \\
L_0^2(\xi) &= 2, & L_1^2(\xi) &= -6\xi + 18, & L_2^2(\xi) &= 12\xi^2 - 96\xi + 144, \ \ldots
\end{aligned} \tag{3.198}$$
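The list (3.198) may be checked mechanically from Eqs. (3.196)-(3.197). The following Python sketch is an illustration only: it represents a polynomial by the list of its integer coefficients (the k-th element multiplying $\xi^k$), and uses the fact that differentiating $P(\xi)e^{-\xi}$ gives $(P' - P)e^{-\xi}$, so that the Rodrigues formula reduces to p repetitions of the map $P \to P' - P$ starting from $P = \xi^p$. The function names are hypothetical, chosen for this example.

```python
def derivative(coeffs):
    """d/dxi of the polynomial sum_k coeffs[k] * xi**k."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def laguerre(p):
    """Simple Laguerre polynomial L_p(xi) from the Rodrigues formula (3.197)."""
    poly = [0] * p + [1]                      # xi**p
    for _ in range(p):
        d = derivative(poly)
        d += [0] * (len(poly) - len(d))       # pad to the same length
        poly = [a - b for a, b in zip(d, poly)]   # P -> P' - P
    return poly

def associated_laguerre(p, q):
    """L_p^q(xi) from Eq. (3.196): (-1)**q times the q-th derivative of L_{p+q}."""
    poly = laguerre(p + q)
    for _ in range(q):
        poly = derivative(poly)
    return [(-1) ** q * c for c in poly]

# Print the same 3 x 3 block as in Eq. (3.198); coefficients are listed from xi**0 upward.
for q in range(3):
    print([associated_laguerre(p, q) for p in range(3)])
```

Note that this normalization (without the factor 1/p! used in some handbooks) is the one implied by Eq. (3.197), and is the one consistent with the numerical factors in Eq. (3.198).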
Returning to Eq. (195), we see that the natural quantization of the radial equation (193) has brought us a new integer quantum number n. To understand its range, we should notice that according to Eq. (197), the highest power of terms in the polynomial $L_{p+q}$ is (p + q), and hence, according to Eq. (196), that of $L_p^q$ is p, so that the highest power in the polynomial participating in Eq. (195) is (n – l – 1). Since the power cannot be negative to avoid the unphysical divergence of wavefunctions at $r \to 0$, the radial quantum number n has to obey the restriction $n \ge l + 1$. Since l, as we already know, may take values l = 0, 1, 2,…, we may conclude n may only take the following values:
79 In Eqs. (196)-(197), p and q are non-negative integers, with no relation whatsoever to the particle’s momentum
or electric charge. Sorry for this notation, but it is absolutely common, and can hardly result in any confusion.
80 Named after the same B. O. Rodrigues, and belonging to the same class as his other famous result, Eq. (165) for
the Legendre polynomials.
$$n = 1, 2, 3, \ldots \tag{3.199}$$
What makes this relation very important is the following, most surprising result: the eigenenergies corresponding to the wavefunctions (179), which are indexed with three quantum numbers:

$$\psi_{n,l,m} = \mathcal{R}_{n,l}(r)\,Y_l^m(\theta,\varphi), \tag{3.200}$$

depend only on one of them, n:

$$\varepsilon_n = -\frac{1}{2n^2}, \quad \text{i.e. } E_n = -\frac{E_0}{2n^2} = -\frac{1}{2n^2}\,m\left(\frac{C}{\hbar}\right)^2, \tag{3.201}$$

i.e. agree with Bohr’s formula (1.12). For this reason, n is usually called the principal quantum number, and the above relation between it and the “more subordinate” orbital quantum number l is rewritten as

$$l \le n - 1. \tag{3.202}$$
Together with the inequality (162), this gives us the following, very important hierarchy of the three quantum numbers involved in the Bohr atom problem:

$$1 \le n \le \infty \ \Rightarrow\ 0 \le l \le n - 1 \ \Rightarrow\ -l \le m \le +l. \tag{3.203}$$

Taking into account the (2l + 1)-degeneracy related to the magnetic number m, and using the well-known formula for the arithmetic progression,81 we see that the nth energy level (201) has the following orbital degeneracy:

$$g = \sum_{l=0}^{n-1}(2l+1) \equiv 2\sum_{l=0}^{n-1}l + \sum_{l=0}^{n-1}1 = 2\,\frac{n(n-1)}{2} + n \equiv n^2. \tag{3.204}$$
Due to its importance for atoms, let us spell out the hierarchy (203) of a few lowest-energy states, using the traditional state notation, in which the value of n is followed by the letter that denotes the value of l:

n = 1:   l = 0 (one 1s state),      m = 0.                (3.205)

n = 2:   l = 0 (one 2s state),      m = 0,
         l = 1 (three 2p states),   m = 0, ±1.            (3.206)

n = 3:   l = 0 (one 3s state),      m = 0,
         l = 1 (three 3p states),   m = 0, ±1,
         l = 2 (five 3d states),    m = 0, ±1, ±2.        (3.207)
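The same listing, and the degeneracy (3.204), may be generated by a direct enumeration of the hierarchy (3.203). This short Python sketch is only an illustration; the name `orbital_states` and the letter string `L_LETTERS` are introduced here for convenience.

```python
L_LETTERS = "spdfgh"  # traditional letter code for l = 0, 1, 2, 3, 4, 5

def orbital_states(n):
    """All (l, m) pairs allowed at a given n by the hierarchy (3.203)."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]

for n in (1, 2, 3):
    states = orbital_states(n)
    for l in range(n):
        m_values = [m for (ll, m) in states if ll == l]
        print(f"n = {n}: {len(m_values)} {n}{L_LETTERS[l]} state(s), m = {m_values}")
    print(f"n = {n}: orbital degeneracy g = {len(states)}")
```

For each n, the total number of listed states equals n², in agreement with Eq. (3.204).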
Figure 22 shows plots of the radial functions (195) of the listed states. The most important of them is of course the ground (1s) state with n = 1 and hence $E = -E_0/2$. According to Eqs. (195) and (198), its radial function is just a simple decaying exponent

$$\mathcal{R}_{1,0}(r) = \frac{2}{r_0^{3/2}}\,e^{-r/r_0}, \tag{3.208}$$
81 See, e.g., MA Eq. (2.5a).
while its angular distribution is uniform – see Eq. (174). The gap between the ground energy and the energy $E = -E_0/8$ of the lowest excited states (with n = 2) in a hydrogen atom (in which $E_0 = E_H \approx 27.2$ eV) is as large as ~10 eV, so that their thermal excitation requires temperatures as high as ~$10^5$ K, and the overwhelming part of all hydrogen atoms in the visible Universe are in their ground state. Since the atomic hydrogen makes up about 75% of the “normal” matter,82 we are very fortunate that such simple formulas as Eqs. (174) and (208) describe the atomic states prevalent in Mother Nature!
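The numbers quoted in this estimate follow directly from Eq. (3.201). The Python sketch below is a rough illustration, assuming the value $E_H \approx 27.2$ eV from the text and the Boltzmann constant $k_B \approx 8.617 \times 10^{-5}$ eV/K (a standard value not given above); the temperature scale is taken simply as $\Delta E/k_B$.

```python
import math

E_H_EV = 27.2            # Hartree energy, eV (value quoted in the text)
K_B_EV_PER_K = 8.617e-5  # Boltzmann constant, eV/K (assumed standard value)

def level_energy(n):
    """E_n = -E_0 / (2 n**2), Eq. (3.201), for hydrogen (E_0 = E_H)."""
    return -E_H_EV / (2 * n * n)

gap = level_energy(2) - level_energy(1)      # E_0/2 - E_0/8 = (3/8) E_0
temperature_scale = gap / K_B_EV_PER_K       # temperature at which k_B T equals the gap

# Relative thermal weight of one n = 2 state at room temperature (300 K).
room_weight = math.exp(-gap / (K_B_EV_PER_K * 300.0))

print(f"gap = {gap:.1f} eV, T ~ {temperature_scale:.2e} K")
print(f"Boltzmann factor at 300 K: {room_weight:.1e}")
```

The extremely small Boltzmann factor at room temperature illustrates why thermal excitation of hydrogen atoms is negligible at ambient conditions.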
Fig. 3.22. The lowest radial functions of the Bohr atom: $\mathcal{R}_{n,l}\,r_0^{3/2}$ as functions of $r/r_0$ (from 0 to 10), in three panels: n = 1 (1s), n = 2 (2s and 2p), and n = 3 (3s, 3p, and 3d).
According to Eqs. (195) and (198), the radial functions of the lowest excited states, 2s (with n = 2 and l = 0), and 2p (with n = 2 and l = 1) are also not too complicated:

$$\mathcal{R}_{2,0}(r) = \frac{1}{(2r_0)^{3/2}}\left(2 - \frac{r}{r_0}\right)e^{-r/2r_0}, \qquad \mathcal{R}_{2,1}(r) = \frac{1}{(2r_0)^{3/2}}\,\frac{r}{3^{1/2}r_0}\,e^{-r/2r_0}, \tag{3.209}$$

with the former of these states (2s) having a uniform angular distribution, and the three latter (2p) states, with different m = 0, ±1, having simple angular distributions, which differ only by their spatial orientation – see Eq. (175) and the second row of Fig. 20. The most important trend here, clearly visible from the comparison of the two top panels of Fig. 22 as well, is a larger radius of the decay exponent in
82 Excluding the so-far hypothetical dark matter and dark energy.
the radial functions ($2r_0$ for n = 2 instead of $r_0$ for n = 1), and hence a larger radial extension of the states. This trend is confirmed by the following general formula:83

$$\langle r\rangle_{n,l} = \frac{r_0}{2}\left[3n^2 - l(l+1)\right]. \tag{3.210}$$
The second important trend is that at a fixed n, the orbital quantum number l determines how fast the wavefunction changes with r near the origin, and how much it oscillates in the radial direction at larger values of r. For example, the 2s eigenfunction $\mathcal{R}_{2,0}(r)$ is different from zero at r = 0, and “makes one wiggle” (has one root) in the radial direction, while the eigenfunctions 2p equal zero at r = 0 but do not cross the horizontal axis after that. Instead, those wavefunctions oscillate as the functions of an angle – see the second row of Fig. 20. The same trend is clearly visible for n = 3 (see the bottom panel of Fig. 22), and continues for the higher values of n.
The states with $l = l_{max} \equiv n – 1$ may be viewed as crude analogs of the circular motion of a
particle in a plane whose orientation defines the quantum number m. On the other hand, the best
classical image of the s-state (l = 0) is a purely radial, spherically-symmetric motion of the particle to
and from the attracting center. (The latter image is especially imperfect because the motion needs to
happen simultaneously in all radial directions.) The classical language becomes reasonable only for the
highly degenerate Rydberg states, with n >> 1, whose linear superpositions may be used to compose
wave packets closely following the classical (circular or elliptic) trajectories of the particle – just as was
discussed in Sec. 2.2 for the free 1D motion.
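Besides Eq. (210), mathematics gives us several other simple relations for the radial functions $\mathcal{R}_{n,l}$ (and, since the spherical harmonics are normalized to 1, for the eigenfunctions as a whole), including those that we will use later in the course:84

$$\left\langle\frac{1}{r}\right\rangle_{n,l} = \frac{1}{n^2r_0}, \qquad \left\langle\frac{1}{r^2}\right\rangle_{n,l} = \frac{1}{n^3\left(l+\frac{1}{2}\right)r_0^2}, \qquad \left\langle\frac{1}{r^3}\right\rangle_{n,l} = \frac{1}{n^3\,l\left(l+\frac{1}{2}\right)(l+1)\,r_0^3}. \tag{3.211}$$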
In particular, the first of these formulas means that for any eigenfunction $\psi_{n,l,m}$, with all its complicated radial and angular dependencies, there is a simple relation between the potential and full energies:

$$\langle U\rangle_{n,l} = -C\left\langle\frac{1}{r}\right\rangle_{n,l} = -\frac{C}{n^2r_0} = -\frac{E_0}{n^2} = 2E_n, \tag{3.212}$$

so that the average kinetic energy of the particle, $\langle T\rangle_{n,l} = E_n – \langle U\rangle_{n,l}$, is equal to $E_n – 2E_n = –E_n > 0$.
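Relations (3.210)-(3.212) are easy to test numerically for the three simplest states. The Python sketch below is an illustration under stated assumptions: it uses the explicit radial functions (3.208)-(3.209) in the units $r_0 = 1$ and $E_0 = 1$, and a plain Simpson-rule quadrature truncated at $r = 60r_0$ (where the integrands are negligible); the names `RADIAL` and `radial_average` are introduced here only for this example.

```python
import math

# Radial functions (3.208)-(3.209); x = r / r0, values in units of r0**(-3/2).
RADIAL = {
    (1, 0): lambda x: 2.0 * math.exp(-x),
    (2, 0): lambda x: (2.0 - x) * math.exp(-x / 2) / 2 ** 1.5,
    (2, 1): lambda x: x * math.exp(-x / 2) / (2 ** 1.5 * math.sqrt(3.0)),
}

def radial_average(R, power, x_max=60.0, steps=60000):
    """Simpson-rule integral of x**power * R(x)**2 * x**2 over 0 <= x <= x_max."""
    h = x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * R(x) ** 2 * x ** (2 + power)
    return total * h / 3

results = {}
for (n, l), R in RADIAL.items():
    norm = radial_average(R, 0)           # should equal 1, Eq. (3.194)
    mean_r = radial_average(R, 1)         # compare with Eq. (3.210)
    mean_inv_r = radial_average(R, -1)    # compare with the first of Eqs. (3.211)
    energy = -1.0 / (2 * n * n)           # Eq. (3.201), in units of E0
    kinetic = energy + mean_inv_r         # <T> = E_n - <U>, with <U> = -<1/r> in these units
    results[(n, l)] = (norm, mean_r, mean_inv_r, kinetic)
    print(f"n={n}, l={l}: norm={norm:.6f}, <r>={mean_r:.4f}, "
          f"<1/r>={mean_inv_r:.4f}, <T>={kinetic:.4f}")
```

In these units, the printed averages may be compared directly with $[3n^2 - l(l+1)]/2$, $1/n^2$, and $-E_n = 1/2n^2$.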
As in the several previous cases we have met, simple results (201), (210)-(212) are in sharp
contrast with the rather complicated expressions for the corresponding eigenfunctions. Historically this
contrast gave an additional motivation for the development of more general approaches to quantum
mechanics, that would replace, or at least complement our brute-force (wave-mechanics) analysis. A
discussion of such an approach will be the main topic of the next chapter.
83 Note that even at the largest value of l, equal to (n – 1), the second term l(l + 1) in Eq. (210) is equal to ($n^2$ – n), and hence cannot over-compensate the first term $3n^2$.
84 The first of these relations may be readily proved using the Hellmann-Feynman theorem (see Chapter 1); this proof will be offered for the reader’s exercise after a more general form of this theorem has been proved in Chapter 6.
Rather strikingly, the above classification of the quantum numbers, with minor borrowings from the later chapters of this course, allows a semi-quantitative explanation of the whole system of chemical elements. The “only” two additions we need are the following facts:
(i) due to their unavoidable interaction with relatively low-temperature environments, atoms tend
to relax into their lowest-energy state, and
(ii) due to the Pauli principle (valid for electrons as the Fermi particles), each orbital eigenstate
discussed above may be occupied by two electrons with opposite spins.
Of course, atomic electrons do interact, so that their quantitative description requires quantum
mechanics of multiparticle systems, which is rather complex. (Its main concepts will be discussed in
Chapter 8.) However, the lion’s share of this interaction is reduced to simple electrostatic screening, i.e.
a partial compensation of the electric charge of the atomic nucleus, as felt by a particular electron, by
other electrons of the atom. This screening changes quantitative results (such as the energy scale E0)
dramatically; however, the quantum number hierarchy, and hence their classification, is not affected.
The hierarchy of atoms is most often represented as the famous periodic table of chemical
elements,85 whose simple version is shown in Fig. 23. (The table in Fig. 24 presents a sequential list of
the elements and their electron configurations, following the convention already used in Eqs. (205)-
(207), with the additional upper index showing the number of electrons with the indicated values of
quantum numbers n and l.) The number in each table’s cell, and in the first column of the list, is the so-
called atomic number Z, which physically is the number of protons in the particular atomic nucleus, and
hence the number of electrons in an electrically-neutral atom.
1                                                                   2
H                                                                   He
3   4                                           5   6   7   8   9   10
Li  Be                                          B   C   N   O   F   Ne
11  12                                          13  14  15  16  17  18
Na  Mg                                          Al  Si  P   S   Cl  Ar
19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
K   Ca  Sc  Ti  V   Cr  Mn  Fe  Co  Ni  Cu  Zn  Ga  Ge  As  Se  Br  Kr
37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
Rb  Sr  Y   Zr  Nb  Mo  Tc  Ru  Rh  Pd  Ag  Cd  In  Sn  Sb  Te  I   Xe
55  56  57- 72  73  74  75  76  77  78  79  80  81  82  83  84  85  86
Cs  Ba  71  Hf  Ta  W   Re  Os  Ir  Pt  Au  Hg  Tl  Pb  Bi  Po  At  Rn
87  88  89- 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
Fr  Ra  103 Rf  Db  Sg  Bh  Hs  Mt  Ds  Rg  Cn  Uut Fl  Uup Lv  Uus Uuo

Lanthanides:  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
              La  Ce  Pr  Nd  Pm  Sm  Eu  Gd  Tb  Dy  Ho  Er  Tm  Yb  Lu
Actinides:    89  90  91  92  93  94  95  96  97  98  99  100 101 102 103
              Ac  Th  Pa  U   Np  Pu  Am  Cm  Bk  Cf  Es  Fm  Md  No  Lr

Property legend (the color coding of the cells is not reproduced here): alkali metals; alkali-earth metals; rare-earth metals; transition metals; other metals; metalloids; nonmetals; halogens; noble gases.

Fig. 3.23. The periodic table of elements, showing their atomic numbers and chemical symbols, as well as their color-coded basic physical/chemical properties at the so-called ambient (meaning usual laboratory) conditions.
85 Also called the Mendeleev table, after Dmitri Ivanovich Mendeleev who put forward the concept of the quasi-periodicity of chemical element properties as functions of Z phenomenologically in 1869. (The explanation of this periodicity had to wait for 60 more years until the advent of quantum mechanics in the late 1920s.)
Atomic   Atomic   Electron
number   symbol   states

Period 1
1        H        1s¹
2        He       1s²

Period 2 ([He] shell, plus:)
3        Li       2s¹
4        Be       2s²
5        B        2s²2p¹
6        C        2s²2p²
7        N        2s²2p³
8        O        2s²2p⁴
9        F        2s²2p⁵
10       Ne       2s²2p⁶

Period 3 ([Ne] shell, plus:)
11       Na       3s¹
12       Mg       3s²
13       Al       3s²3p¹
14       Si       3s²3p²
15       P        3s²3p³
16       S        3s²3p⁴
17       Cl       3s²3p⁵
18       Ar       3s²3p⁶

Period 4 ([Ar] shell, plus:)
19       K        4s¹
20       Ca       4s²
21       Sc       3d¹4s²
22       Ti       3d²4s²
23       V        3d³4s²
24       Cr       3d⁵4s¹
25       Mn       3d⁵4s²
26       Fe       3d⁶4s²
27       Co       3d⁷4s²
28       Ni       3d⁸4s²
29       Cu       3d¹⁰4s¹
30       Zn       3d¹⁰4s²
31       Ga       3d¹⁰4s²4p¹
32       Ge       3d¹⁰4s²4p²
33       As       3d¹⁰4s²4p³
34       Se       3d¹⁰4s²4p⁴
35       Br       3d¹⁰4s²4p⁵
36       Kr       3d¹⁰4s²4p⁶

Period 5 ([Kr] shell, plus:)
37       Rb       5s¹
38       Sr       5s²
39       Y        4d¹5s²
40       Zr       4d²5s²
41       Nb       4d⁴5s¹
42       Mo       4d⁵5s¹
43       Tc       4d⁵5s²
44       Ru       4d⁷5s¹
45       Rh       4d⁸5s¹
46       Pd       4d¹⁰
47       Ag       4d¹⁰5s¹
48       Cd       4d¹⁰5s²
49       In       4d¹⁰5s²5p¹
50       Sn       4d¹⁰5s²5p²
51       Sb       4d¹⁰5s²5p³
52       Te       4d¹⁰5s²5p⁴
53       I        4d¹⁰5s²5p⁵
54       Xe       4d¹⁰5s²5p⁶

Period 6 ([Xe] shell, plus:)
55       Cs       6s¹
56       Ba       6s²
57       La       5d¹6s²
58       Ce       4f¹5d¹6s²
59       Pr       4f³6s²
60       Nd       4f⁴6s²
61       Pm       4f⁵6s²
62       Sm       4f⁶6s²
63       Eu       4f⁷6s²
64       Gd       4f⁷5d¹6s²
65       Tb       4f⁹6s²
66       Dy       4f¹⁰6s²
67       Ho       4f¹¹6s²
68       Er       4f¹²6s²
69       Tm       4f¹³6s²
70       Yb       4f¹⁴6s²
71       Lu       4f¹⁴5d¹6s²
72       Hf       4f¹⁴5d²6s²
73       Ta       4f¹⁴5d³6s²
74       W        4f¹⁴5d⁴6s²
75       Re       4f¹⁴5d⁵6s²
76       Os       4f¹⁴5d⁶6s²
77       Ir       4f¹⁴5d⁷6s²
78       Pt       4f¹⁴5d⁹6s¹
79       Au       4f¹⁴5d¹⁰6s¹
80       Hg       4f¹⁴5d¹⁰6s²
81       Tl       4f¹⁴5d¹⁰6s²6p¹
82       Pb       4f¹⁴5d¹⁰6s²6p²
83       Bi       4f¹⁴5d¹⁰6s²6p³
84       Po       4f¹⁴5d¹⁰6s²6p⁴
85       At       4f¹⁴5d¹⁰6s²6p⁵
86       Rn       4f¹⁴5d¹⁰6s²6p⁶

Period 7 ([Rn] shell, plus:)
87       Fr       7s¹
88       Ra       7s²
89       Ac       6d¹7s²
90       Th       6d²7s²
91       Pa       5f²6d¹7s²
92       U        5f³6d¹7s²
93       Np       5f⁴6d¹7s²
94       Pu       5f⁶7s²
95       Am       5f⁷7s²
96       Cm       5f⁷6d¹7s²
97       Bk       5f⁹7s²
98       Cf       5f¹⁰7s²
99       Es       5f¹¹7s²
100      Fm       5f¹²7s²
101      Md       5f¹³7s²
102      No       5f¹⁴7s²
103      Lr       5f¹⁴6d¹7s²
104      Rf       5f¹⁴6d²7s²
105      Db       5f¹⁴6d³7s²
106      Sg       5f¹⁴6d⁴7s²
107      Bh       5f¹⁴6d⁵7s²
108      Hs       5f¹⁴6d⁶7s²
109      Mt       5f¹⁴6d⁷7s²
110      Ds       5f¹⁴6d⁸7s²
111      Rg       5f¹⁴6d⁹7s²
112      Cn       5f¹⁴6d¹⁰7s²
113      Uut      5f¹⁴6d¹⁰7s²7p¹
114      Fl       5f¹⁴6d¹⁰7s²7p²
115      Uup      5f¹⁴6d¹⁰7s²7p³
116      Lv       5f¹⁴6d¹⁰7s²7p⁴
117      Uus      5f¹⁴6d¹⁰7s²7p⁵
118      Uuo      5f¹⁴6d¹⁰7s²7p⁶

Fig. 3.24. Atomic electron configurations. The upper index shows the number of electrons in the states with the indicated quantum numbers n (the first digit) and l (letter-coded as was discussed above).
The simplest atom, with Z = 1, is hydrogen (chemical symbol H) – the only atom for which the theory discussed above is quantitatively correct.86 According to Eq. (191), the 1s ground state of its only
electron corresponds to the quantum number values n = 1, l = 0, and m = 0 – see Eq. (205). In most
versions of the periodic table, the cell of H is placed in the top left corner.
In the next atom, helium (symbol He, Z = 2), the same orbital quantum state (1s) is filled with
two electrons. As will be discussed in detail in Chapter 8, electrons of the same atom are actually
indistinguishable, so that their quantum states are not independent and may be entangled. These factors
are important for several properties of helium atoms (and heavier elements as well); however, a bit
counter-intuitively, for the atom classification purposes, they are not crucial, and we may pretend that
two electrons of a helium atom just have “opposite spins”. Due to the twice higher electric charge of the
nucleus of the helium atom, i.e. the twice higher value of the constant C in Eq. (190), resulting in a 4-
fold increase of the constant E0 given by Eq. (192), the binding energy of each electron is crudely 4
times higher than that of the hydrogen atom – though the electron interaction decreases it by about 25%
– see Sec. 8.2. This is why taking one electron away (i.e. the positive ionization of a helium atom) requires relatively high energy, ~24.6 eV, which is not available in the usual chemical reactions. On the other hand, a neutral helium atom cannot bind one more electron (i.e. form a negative ion) either. As a result, helium, like all other elements with fully completed electron shells (the term meaning the sets of states with eigenenergies well separated from higher energy levels), is a chemically inert noble gas, thus starting the whole right-most column of the periodic table, allocated for such elements.
The situation changes rather dramatically as we move to the next element, lithium (Li), with Z =
3 electrons. Two of them are still accommodated by the inner shell with n = 1 (listed in Fig. 24 as the
helium shell [He]), but the third one has to reside in the next shell with n = 2, l = 0, and m = 0, i.e. in the
2s state. According to Eq. (201), the binding energy of this electron is much lower, especially if we take
into account that according to Eqs. (210)-(211), the 1s electrons of the [He] shell are much closer to the
nucleus and almost completely compensate two-thirds of its electric charge +3e. As a result, the 2s-state
electron is approximately, but reasonably well described by Eq. (201) with Z = 1 and n = 2, giving
binding energy close to just 3.4 eV (experimentally, ~5.39 eV), so that a lithium atom can give out that
electron rather easily – to either an atom/ion of another element to form a chemical compound, or to the
common conduction band of the solid-state lithium; as a result, at the ambient conditions, it is a typical
alkali metal. The similarity of chemical properties of lithium and hydrogen, with the chemical valence
of one,87 places Li as the starting element of the second period (row), with the first period limited to
only H and He – see Fig. 23.
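The crude estimates used in the last two paragraphs all follow from Eq. (3.201) with the rescaling $E_0 = Z^2E_H$ mentioned in footnote 78. The Python sketch below is only an illustration: `binding_energy` evaluates $Z_{ef}^2E_H/2n^2$ for an assumed effective nuclear charge $Z_{ef}$ (a notion introduced here for the example, not a parameter of the exact theory), and `effective_charge` inverts the same formula for the experimental ionization energy of lithium quoted above.

```python
import math

E_H_EV = 27.2  # Hartree energy, eV (value quoted in the text)

def binding_energy(z_eff, n):
    """|E_n| = z_eff**2 * E_H / (2 n**2): Eq. (3.201) with E_0 = z_eff**2 * E_H."""
    return z_eff ** 2 * E_H_EV / (2 * n ** 2)

def effective_charge(ionization_ev, n):
    """Effective charge that would reproduce a given ionization energy (inverse of the above)."""
    return n * math.sqrt(2 * ionization_ev / E_H_EV)

hydrogen_1s = binding_energy(1, 1)       # exact hydrogen result
helium_ion_1s = binding_energy(2, 1)     # single electron in the field of Z = 2
lithium_2s = binding_energy(1, 2)        # 2s electron with fully screened nucleus
lithium_z_eff = effective_charge(5.39, 2)

print(f"H  1s: {hydrogen_1s:.1f} eV")
print(f"He+ 1s: {helium_ion_1s:.1f} eV (4 times the hydrogen value)")
print(f"Li 2s, full screening: {lithium_2s:.1f} eV")
print(f"Li 2s, effective charge matching 5.39 eV: {lithium_z_eff:.2f}")
```

The effective charge that matches the measured value is only moderately larger than 1, which is one way to quantify the statement that the [He] shell screens the nucleus almost, but not quite, completely.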
In the next element, beryllium (symbol Be, Z = 4), the 2s state (n = 2, l = 0, m = 0) houses one
more electron, with the “opposite spin”. Due to the higher electric charge of the nucleus, Q = +4e, with
only half of it compensated by 1s electrons of the [He] shell, the binding energy of the 2s electrons is
somewhat higher than that in lithium, so that the ionization energy increases to ~9.32 eV. As a result,
beryllium is also chemically active with the valence of two, but not as active as lithium, and is also metallic in its solid-state phase, but with a lower electric conductivity than lithium.
86 Besides very small fine-structure and hyperfine-splitting corrections – to be discussed, respectively, in Chapters
6 and 8.
87 The chemical valence (or “valency”) is a not very precise term describing the number of atom’s electrons
involved in chemical reactions. For the same atom, especially with a large number of electrons in its outer,
unfilled shell, this number may depend on the chemical compound formed. (For example, the valence of iron is
two in the ferrous oxide, FeO, and three in the ferric oxide, Fe₂O₃.)
Moving in this way along the second row of the periodic table (from Z = 3 to Z = 10), we see a gradual filling of the rest of the total $2n^2 = 2 \times 2^2 = 8$ different electron states of the n = 2 shell (see Eq. (204), with the additional spin degeneracy factor of 2), including two 2s states with m = 0, and six 2p states with m = 0, ±1,88 with a gradually growing ionization potential (up to ~21.6 eV in Ne with Z = 10), i.e. a growing reluctance to conduct electricity or form positive ions. However, the last elements of
the row, such as oxygen (O, with Z = 8) and especially fluorine (F, with Z = 9) can readily pick up extra
electrons to fill up their 2p states, i.e. form negative ions. As a result, these elements are chemically
active, with a double valence for oxygen and a single valence for fluorine. However, the final element of
this row, neon, has its n = 2 shell completely full, and cannot form a stable negative ion. This is why it is
a noble gas, like helium. Traditionally, in the periodic table, such elements are placed right under helium
(Fig. 23), to emphasize the similarity of their chemical properties. But this necessitates making an at
least 6-cell gap in the 1st row. (Actually, the gap is often made larger, to accommodate the next rows –
keep reading.)
Period 3, i.e. the 3rd row of the table, starts exactly like period 2, with sodium (Na, with Z = 11),
also a chemically active alkali metal whose atom features 10 electrons filling the shells with n = 1 and n
= 2 (in Fig. 24, collectively called the neon shell [Ne]), plus one electron in the 3s state (n = 3, l = 0, m =
0), which may be again reasonably well described by the hydrogen atom theory – see, e.g., the red curve
on the last panel of Fig. 22. Continuing along this row, we could naively expect that, according to Eq. (204), and with the account of double spin degeneracy, this period of the table should have $2n^2 = 2 \times 3^2 = 18$ elements, with a gradual, sequential filling of two 3s states, then six 3p states, and then ten 3d states.
However, here we run into a big surprise: after argon (Ar, with Z = 18), a relatively inert element with
the ionization energy of ~15.7 eV due to the fully filled 3s and 3p shells, the next element, potassium
(K, with Z = 19) is an alkali metal again!
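The size of this “surprise” is easy to tabulate. The short Python sketch below is an illustration only: it compares the naive shell capacity $2n^2$ with the actual period lengths, which are obtained here as differences between the atomic numbers of the noble gases closing each row of Fig. 3.23 (the list `NOBLE_GAS_Z` is simply read off that figure).

```python
# Atomic numbers of the noble gases closing periods 1..7, read off Fig. 3.23.
NOBLE_GAS_Z = [2, 10, 18, 36, 54, 86, 118]

def naive_capacity(n):
    """Number of electron states in shell n: n**2 orbital states (3.204) times 2 for spin."""
    return 2 * n * n

# Actual number of elements in each period = difference of consecutive noble-gas Z.
actual_lengths = [z - z_prev for z_prev, z in zip([0] + NOBLE_GAS_Z, NOBLE_GAS_Z)]

for period, actual in enumerate(actual_lengths, start=1):
    naive = naive_capacity(period)
    note = "" if naive == actual else "  <- differs from the naive hydrogen-like count"
    print(f"period {period}: actual {actual:3d}, naive 2n^2 = {naive:3d}{note}")
```

The two counts agree only for the first two periods; starting from period 3, the actual row is shorter than the naive $2n^2$, as discussed next.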
The reason for that is the difference of the actual electron energies from those of the hydrogen
atom, which is due mostly to inter-electron interactions, and gradually accumulates with the growth of
Z. It may be semi-quantitatively understood from the results of Sec. 6. In hydrogen-like atoms/ions, the
electron state energies do not depend on the quantum number l (as well as m) – see Eq. (201). However,
the orbital quantum number does affect the wavefunction of an electron. As Fig. 22 shows, the larger l is, the lower the probability for an electron to be close to the nucleus, where its positive charge is less compensated by other electrons. As a result of this effect (and also the relativistic corrections to be
discussed in Sec. 6.3), the electron’s energy grows with l. Actually, this effect is visible already in
period 2 of the table: it manifests itself in the filling order – the p states after the s states. However, for
potassium (K, with Z = 19) and calcium (Ca, with Z = 20), the energies of the 3d states become so high
that the energies of the two 4s states are lower, and the latter states are filled first. As described by Eq.
(210), and also by the first of Eqs. (211), the effect of the principal number n on the distance from the
nucleus is much stronger than that of l, so that the 4s wavefunctions of K and Ca are relatively far from
the nucleus, and determine the chemical valence (equal to 1 and 2, correspondingly) of these elements.
The next atoms, from Sc (Z = 21) to Zn (Z = 30), with the gradually filled “internal” 3d states, are the
so-called transition metals whose (comparable) ionization energies and chemical properties are
determined by the 4s electrons.
This fact is the origin of the difference between various forms of the “periodic” table. In its most
popular option, shown in Fig. 23, K is used to start the next period 4, and then a new period is started
88 The specific order of filling of the states within each shell follows the so-called Hund rules – see Sec. 8.3.
each time and only when the first electron with the next principal quantum number (n) appears.89 This
topology of the table provides a very clear match of the chemical properties of the first element of each
period (an alkali metal), as well as its last element (a noble gas). It also automatically means making
gaps in all previous rows. Usually, this gap is made between the atoms with completely filled s states
and with those with the first electron in a p state, because here the properties of the elements make a
somewhat larger step. (For example, the step from Be to B makes the material an insulator, but the step
from Mg to Al makes a smaller difference.) As a result, the elements of the same column have only
approximately similar chemical valences and physical properties.
Such a representation is inconvenient for accommodating the lower, longer rows, because the whole table would be too broad. This is why the so-called rare earth elements, including lanthanides (with Z from 57 to 71, of the 6th row, with a gradual filling of the 4f and 5d states) and actinides (Z from 89 to 103, of the 7th row, with a gradual filling of the 5f and 6d states), are usually represented as separate (detached) rows – see Fig. 23. This is quite acceptable for basic chemistry, because the chemical properties of the
elements within each such group are rather close.
To summarize my very short review of this extremely important topic,90 the “periodic table of
elements” is not periodic in the strict sense of the word. Nevertheless, it has had an enormous historic
significance for chemistry, as well as atomic and solid-state physics, and is still very convenient for
many purposes. For our course, the most important aspect of its discussion is the surprising possibility to
describe, at least for classification purposes, such a complex multi-electron system as an atom as a set of
quasi-independent electrons in certain quantum states indexed with the same quantum numbers n, l, and
m as those of the hydrogen atom. This fact enables the use of various perturbation theories, which give
a more quantitative description of atomic properties. Some of these techniques will be reviewed in
Chapters 6 and 8.
3.8. Spherically-symmetric scatterers

The machinery of the Legendre polynomials and the spherical Bessel functions, discussed in Sec. 6, may also be used for analysis of particle scattering by spherically-symmetric potentials (155) beyond the Born approximation (Sec. 3), provided that such a potential U(r) is also localized, i.e. decreases sufficiently fast at $r \to \infty$. (The quantification of this condition is left for the reader’s exercise.)
Indeed, directing the z-axis along the propagation of the incident plane de Broglie wave $\psi_i$, and taking its origin in the center of the scatterer, we may expect the scattered wave $\psi_s$ to be axially symmetric, so that its expansion in the series over the spherical harmonics includes only the terms with m = 0. Hence, the solution (64) of the stationary Schrödinger equation (63) in this case may be represented as91

$$\psi = \psi_i + \psi_s = a_i\left[e^{ikz} + \sum_{l=0}^{\infty}\mathcal{R}_l(r)P_l(\cos\theta)\right], \tag{3.213}$$
89 Another popular option is to return to the first column as soon as an atom has one electron in the s state (like it is in Cu, Ag, and Au, in addition to the alkali metals).
90 For a bit more detailed (but still succinct) discussion of the valence and other chemical aspects of atomic
structure, I can recommend Chapter 5 of the very clear text by L. Pauling, General Chemistry, Dover, 1988.
91 The particular terms in this sum are frequently called partial waves.
where $k \equiv (2mE)^{1/2}/\hbar$ is defined by the energy E of the incident particle, while the radial functions $\mathcal{R}_l(r)$ have to satisfy Eq. (181), and be finite at $r \to 0$. At large distances r >> R, where R is the effective radius of the scatterer, the potential U(r) is negligible, and Eq. (181) is reduced to Eq. (183). In contrast to its analysis in Sec. 6, we should look for its solution using a linear superposition of the spherical Bessel functions of both kinds:

$$\mathcal{R}_l(r) = A_lj_l(kr) + B_ly_l(kr), \quad \text{at } r \gg R, \tag{3.214}$$

because Eq. (183) is now invalid at $r \to 0$, and our former argument for dropping the functions $y_l(kr)$ is no longer valid. In Eq. (214), $A_l$ and $B_l$ are some complex coefficients, determined by the scattering potential U(r), i.e. by the solution of Eq. (181) at r ~ R.

As the explicit expressions (186) show, the spherical Bessel functions $j_l(\xi)$ and $y_l(\xi)$ represent standing de Broglie waves, with equal real amplitudes, so that their simple linear combinations (called the spherical Hankel functions of the first and second kind),

$$h_l^{(1)}(\xi) \equiv j_l(\xi) + iy_l(\xi), \quad \text{and } h_l^{(2)}(\xi) \equiv j_l(\xi) - iy_l(\xi), \tag{3.215}$$

represent traveling spherical waves propagating, respectively, from the origin (i.e. from the center of the scatterer), and toward the origin. In particular, at $\xi \gg 1, l$, i.e. at large distances $r \gg 1/k, l/k$,92

$$h_l^{(1)}(kr) \to \frac{(-i)^{l+1}}{kr}\,e^{ikr}, \qquad h_l^{(2)}(kr) \to \frac{i^{l+1}}{kr}\,e^{-ikr}. \tag{3.216}$$
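The asymptotic form (3.216) may be checked numerically. The Python sketch below is an illustration under stated assumptions: it builds $j_l$ and $y_l$ from the l = 0 and l = 1 expressions of Eq. (3.186), using the standard upward recurrence $f_{l+1} = [(2l+1)/\xi]f_l - f_{l-1}$ valid for both kinds of functions (not written out in the text), and compares $h_l^{(1)}$ with its asymptote; the function names are introduced here only for this example.

```python
import cmath
import math

def spherical_jy(l, x):
    """Pair (j_l(x), y_l(x)) from Eq. (3.186) and the upward recurrence (assumed)."""
    j_prev, y_prev = math.sin(x) / x, -math.cos(x) / x
    if l == 0:
        return j_prev, y_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    y_curr = -math.cos(x) / x**2 - math.sin(x) / x
    for m in range(1, l):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
        y_prev, y_curr = y_curr, (2 * m + 1) / x * y_curr - y_prev
    return j_curr, y_curr

def hankel1(l, x):
    """Spherical Hankel function of the first kind, Eq. (3.215)."""
    j, y = spherical_jy(l, x)
    return complex(j, y)

def hankel1_asymptote(l, x):
    """Outgoing-wave asymptote of Eq. (3.216)."""
    return (-1j) ** (l + 1) * cmath.exp(1j * x) / x

def relative_error(l, x):
    exact = hankel1(l, x)
    return abs(exact - hankel1_asymptote(l, x)) / abs(exact)

for l in range(4):
    print(l, [f"{relative_error(l, x):.1e}" for x in (10.0, 100.0, 1000.0)])
```

For l = 0 the asymptote coincides with the exact function, while for larger l the relative deviation decreases roughly as l(l + 1)/2ξ, in agreement with the condition $\xi \gg l$.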
But using the same physical argument as at the beginning of Sec. 1, we may argue that in the case of a
localized scatterer, there should be no latter waves at r >> R; hence, we have to require the amplitude of
the term proportional to hl(2) to be zero. With the relations reciprocal to Eqs. (215),
jl   
2

1 1
hl    hl2    ,  yl   
2i

1 1

hl    hl2    , (3.217)
which enable us to rewrite Eq. (214) as
R l r  
Al 1
2
 B
 
hl    hl2     l hl1    hl2   
2i

(3.218)
 A  iBl  1  A  iBl  2 
 l hl     l hl  ,
 2   2 
this means that the combination (Al + iBl) has to be equal to zero, so that Bl = iAl. Hence we have just one
unknown coefficient (say, Al) for each l,93 and may rewrite Eq. (218) in an even simpler form:
$$\mathcal{R}_l(r) = A_l\left[j_l(kr) + i y_l(kr)\right] \equiv A_l h_l^{(1)}(kr), \quad \text{at } r \gg R, \qquad (3.219)$$
92 For arbitrary l, this result may be confirmed using Eqs. (185) and the asymptotic formulas for the “usual”
Bessel functions – see, e.g., EM Eqs. (2.135) and (2.152), valid for an arbitrary (not necessarily integer) index n.
93 Moreover, using the conservation of the orbital momentum, to be discussed in Sec. 5.6, it is possible to show
that this complex coefficient may be further reduced to just one real parameter, usually recast as the partial phase
shift δl between the lth spherical harmonics of the incident and scattered waves. However, I will not use this
notion, because practical calculations are more physically transparent (and not more complex) without it.
and use Eqs. (213) and (216) to write the following expression for the scattered wave at large distances:
$$\psi_s \to \frac{a_i}{kr} e^{ikr} \sum_{l=0}^{\infty} (-i)^{l+1} A_l P_l(\cos\theta), \quad \text{at } r \gg R, \frac{1}{k}, \frac{l}{k}. \qquad (3.220)$$
Comparing this expression with the general Eq. (81), we see that for a spherically-symmetric,
localized scatterer,
$$f(\theta) = \frac{1}{k} \sum_{l=0}^{\infty} (-i)^{l+1} A_l P_l(\cos\theta), \qquad (3.221)$$
so that the differential cross-section (84) is
$$\frac{d\sigma}{d\Omega} = \frac{1}{k^2} \left| \sum_{l=0}^{\infty} (-i)^{l+1} A_l P_l(\cos\theta) \right|^2 = \frac{1}{k^2} \sum_{l,l'=0}^{\infty} i^{l'-l} A_l A_{l'}^* P_l(\cos\theta) P_{l'}(\cos\theta). \qquad (3.222)$$
The last expression is more convenient for the calculation of the total cross-section (59):
1 1
d d 2 
 d  2  d cos    2 i l' l
Al Al '  Pl  Pl '  d ,
*
(3.223)
4
d 1
d k l ,l '  0 1
where   cos , because this result may be much simplified by using Eq. (167):94
$$\sigma = \sum_{l=0}^{\infty} \sigma_l, \quad \text{with } \sigma_l = \frac{4\pi}{k^2} \frac{|A_l|^2}{2l+1}. \qquad (3.224)$$
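For the reader's convenience, the reduction of the double sum to a single one may be spelled out as follows, assuming that Eq. (167) is the standard orthogonality relation of the Legendre polynomials (this is its usual form, restated here as an assumption rather than quoted from that section):

```latex
\int_{-1}^{+1} P_l(\xi) P_{l'}(\xi)\, d\xi = \frac{2}{2l+1}\, \delta_{ll'}
\quad \Longrightarrow \quad
\sigma = \frac{2\pi}{k^2} \sum_{l=0}^{\infty} |A_l|^2 \, \frac{2}{2l+1}
       = \sum_{l=0}^{\infty} \frac{4\pi}{k^2} \frac{|A_l|^2}{2l+1}.
```

Only the terms with l' = l survive, and for them the factor $i^{l'-l}$ equals 1, so that the phases of the amplitudes Al drop out of the total cross-section.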
Hence the solution of the scattering problem is reduced to the calculation of the partial wave
amplitudes Al defined by Eq. (219) – and for the total cross-section, merely of their magnitudes. This
task is much facilitated by using the following Rayleigh formula for the expansion of the incident plane
wave’s exponent into a series over the Legendre polynomials,95
$$e^{ikz} = e^{ikr\cos\theta} = \sum_{l=0}^{\infty} i^l (2l+1) j_l(kr) P_l(\cos\theta). \qquad (3.225)$$
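A quick numerical sanity check of the Rayleigh formula (225) may look as follows. This is a sketch under stated assumptions, not a reference implementation: the helper names are hypothetical, the spherical Bessel functions are obtained by a downward (Miller-type) recurrence, which is a common numerical choice rather than anything prescribed in these notes, and the series is simply truncated at a finite lmax.

```python
import cmath
import math

def spherical_jn_list(lmax, x):
    """[j_0(x), ..., j_lmax(x)] for moderate x > 0, via downward recurrence."""
    start = lmax + int(x) + 25              # start well above both lmax and x
    f = [0.0] * (start + 2)
    f[start] = 1e-100                       # arbitrary tiny seed; f[start + 1] = 0
    for l in range(start, 0, -1):
        f[l - 1] = (2 * l + 1) / x * f[l] - f[l + 1]
    # Fix the overall scale using whichever of the exact j_0, j_1 is larger
    j0 = math.sin(x) / x
    j1 = math.sin(x) / x**2 - math.cos(x) / x
    scale = j0 / f[0] if abs(j0) >= abs(j1) else j1 / f[1]
    return [scale * v for v in f[: lmax + 1]]

def legendre_list(lmax, xi):
    """[P_0(xi), ..., P_lmax(xi)] via the Bonnet recurrence."""
    p = [1.0, xi]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * xi * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def rayleigh_sum(kr, theta, lmax):
    """Right-hand side of Eq. (225), truncated at l = lmax."""
    j = spherical_jn_list(lmax, kr)
    p = legendre_list(lmax, math.cos(theta))
    return sum((1j) ** l * (2 * l + 1) * j[l] * p[l] for l in range(lmax + 1))

if __name__ == "__main__":
    kr, theta = 5.0, 0.7
    print(rayleigh_sum(kr, theta, 30))
    print(cmath.exp(1j * kr * math.cos(theta)))
```

The two printed complex numbers should agree closely once lmax exceeds kr by a sufficient margin, because jl(kr) drops very fast at l > kr.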
As the simplest example, let us calculate scattering by a completely opaque and “hard” (meaning
sharp-boundary) sphere, which may be described by the following potential:
$$U(r) = \begin{cases} +\infty, & \text{for } r < R, \\ 0, & \text{for } R < r. \end{cases} \qquad (3.226)$$
In this case, the total wavefunction has to vanish at r ≤ R, and hence for the external problem (r ≥ R) the
sphere enforces the boundary condition ψ ≡ ψ0 + ψs = 0 for all values of θ, at r = R. With Eqs. (213),
(220) and (225), this condition becomes
$$a_i \sum_{l=0}^{\infty} \left[ \mathcal{R}_l(R) + i^l (2l+1) j_l(kR) \right] P_l(\cos\theta) = 0. \qquad (3.227)$$
94 Physically, this reduction of the double sum to a single one means that due to the orthogonality of the spherical
harmonics, the total scattering probability flows due to each partial wave just add up.
95 It may be proved using the Rodrigues formula (165) and integration by parts – the task left for the reader’s
exercise.
Due to the orthogonality of the Legendre polynomials, this condition may be satisfied for all
angles θ only if all the coefficients before all Pl(cos θ) vanish, i.e. if
$$\mathcal{R}_l(R) = -i^l (2l+1) j_l(kR). \qquad (3.228)$$
On the other hand, for r > R, U(r) = 0, so that Eq. (183) is valid, and its outward-wave solution (219) has
to be valid even at r → R, giving
$$\mathcal{R}_l(R) = A_l \left[ j_l(kR) + i y_l(kR) \right]. \qquad (3.229)$$
Requiring the two last expressions to give the same result, we get
$$A_l = -i^l (2l+1) \frac{j_l(kR)}{j_l(kR) + i y_l(kR)}, \qquad (3.230)$$
so that Eqs. (222) and (224) yield:
$$\frac{d\sigma}{d\Omega} = \frac{1}{k^2} \left| \sum_{l=0}^{\infty} (2l+1) \frac{j_l(kR)}{j_l(kR) + i y_l(kR)} P_l(\cos\theta) \right|^2, \qquad \sigma_l = \frac{4\pi}{k^2} (2l+1) \frac{j_l^2(kR)}{j_l^2(kR) + y_l^2(kR)}. \qquad (3.231)$$
As Fig. 25a shows, the first of these results describes an angular structure of the scattered de
Broglie wave, which is qualitatively similar to that given by the Born approximation – cf. Eq. (98) and
Fig. 10.
[Figure 3.25: panel (a) shows (1/σg) dσ/dΩ on a logarithmic scale versus θ/π, for several values of kR; panel (b) shows σ/σg and the partial contributions σl/σg versus kR.]
Fig. 3.25. Particle scattering by an opaque, hard sphere: (a) the differential cross-section
normalized to the geometric cross-section σg ≡ πR^2 of the sphere, as a function of the scattering
angle θ, and (b) the (similarly normalized) total cross-section and its lowest spherical components,
as functions of the dimensionless product kR ∝ E^{1/2}.
Namely, at low particle energies (kR << 1), the scattering is essentially isotropic, while in the
opposite, high-energy limit kR >> 1, it is mostly confined to small angles θ ~ π/kR << 1, and exhibits
numerous local destructive-interference minima at angles θn ~ πn/kR. However, in our current (exact!)
theory, these minima are always finite, because the theory describes effective bending of the de Broglie
waves along the back side of the sphere, which smears the interference pattern.
The same bending is also responsible for a rather counter-intuitive fact, described by the second
of Eqs. (231) and clearly visible in Fig. 25b: even at kR → ∞, the total cross-section σ of scattering
tends to 2σg ≡ 2πR^2, rather than to σg as in the purely-classical scattering theory. (The fact that at kR <<
1, the cross-section is also larger than σg, approaching 4σg at kR → 0, is less surprising, because in this
limit the de Broglie wavelength λ = 2π/k is much longer than the sphere’s radius R, so that the wave’s
propagation is affected by the whole sphere.)
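Both limits may be reproduced by a direct summation of the second of Eqs. (231). The sketch below is an illustration only: the function name and the truncation rule lmax = int(kR) + extra are assumptions of this example, and the spherical Bessel functions are generated by a simple upward recurrence, which is adequate here because only the ratio jl^2/(jl^2 + yl^2) is needed.

```python
import math

def hard_sphere_partial_sigmas(kR, extra=20):
    """sigma_l / sigma_g for l = 0 .. int(kR) + extra, with sigma_g = pi * R^2.

    Uses sigma_l = (4*pi/k^2) * (2l+1) * j_l^2 / (j_l^2 + y_l^2), all at argument kR.
    """
    lmax = int(kR) + extra
    # Explicit l = 0 and l = 1 functions, then f_{l+1} = (2l+1)/x * f_l - f_{l-1}
    j = [math.sin(kR) / kR, math.sin(kR) / kR**2 - math.cos(kR) / kR]
    y = [-math.cos(kR) / kR, -math.cos(kR) / kR**2 - math.sin(kR) / kR]
    for l in range(1, lmax):
        j.append((2 * l + 1) / kR * j[l] - j[l - 1])
        y.append((2 * l + 1) / kR * y[l] - y[l - 1])
    sigmas = []
    for l in range(lmax + 1):
        s = j[l] / math.hypot(j[l], y[l])   # hypot avoids overflow of y_l^2 at large l
        sigmas.append(4.0 * (2 * l + 1) * s * s / kR**2)
    return sigmas

if __name__ == "__main__":
    for kR in (0.01, 1.0, 10.0, 30.0):
        print(kR, sum(hard_sphere_partial_sigmas(kR)))
```

At kR << 1 the sum is dominated by the l = 0 term, equal to 4 sin^2(kR)/(kR)^2, and hence approaches 4; at kR >> 1 the printed ratio should slowly decrease toward 2, as described above.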
The above analysis may be readily generalized to the case of a step-like (sharp but finite) potential
(97) – the problem left for the reader’s exercise. On the other hand, for a finite and smooth scattering
potential U(r), plugging Eq. (225) into Eq. (213) and the result into Eq. (66), requiring the coefficients
before each angular function Pl(cos θ) to be balanced, and canceling the common coefficient ai, we get
the following inhomogeneous generalization of Eq. (181) for the radial functions defined by Eq. (213):
$$\left[ E - U(r) \right] \mathcal{R}_l + \frac{\hbar^2}{2mr^2} \left[ \frac{d}{dr} \left( r^2 \frac{d\mathcal{R}_l}{dr} \right) - l(l+1) \mathcal{R}_l \right] = U(r)\, i^l (2l+1) j_l(kr). \qquad (3.232)$$
This differential equation has to be solved in the whole scatterer volume (i.e. for all r ~ R) with
the boundary conditions for the functions Rl(r) to be finite at r → 0, and to tend to the asymptotic form
(219) at r >> R. The last requirement enables the evaluation of the coefficients Al that are needed for
spelling out Eqs. (222) and (224). Unfortunately, due to the lack of time, I have to refer the reader
interested in such cases to special literature.96
3.9. Exercise Problems
3.1. A particle of energy E is incident (in the figure on the right, within
the plane of the drawing) on a sharp potential step:
$$U(\mathbf{r}) = \begin{cases} 0, & \text{for } x < 0, \\ U_0, & \text{for } 0 < x. \end{cases}$$
Calculate the particle reflection probability R as a function of the incidence
angle θ; sketch and discuss this function for various magnitudes and signs of U0.
3.2.* Analyze how the Landau levels (50) are modified by an additional uniform electric
field E directed along the plane of the particle’s motion. Contemplate the physical meaning of
your result and its implications for the quantum Hall effect in a gate-defined Hall bar. (The area
l×w of such a bar [see Fig. 6] is defined by metallic gate electrodes parallel to the 2D electron gas plane
– see the figure on the right. The negative voltage Vg, applied to the gates, squeezes the 2D gas from the
area under them into the complementary, Hall-bar part of the plane.)
96 See, e.g., J. Taylor, Scattering Theory, Dover, 2006.
3.3. Analyze how the Landau levels (50) are modified if a 2D particle is confined in an
additional 1D potential well $U(x) = m\omega_0^2 x^2/2$.
3.4. Find the stationary states of a spinless, charged 3D particle moving in “crossed” (mutually
perpendicular), uniform electric and magnetic fields, with E << cB. For such states, calculate the
expectation values of the particle’s velocity in the direction perpendicular to both fields, and compare
the result with the solution of the corresponding classical problem.
Hint: You may like to generalize Landau’s solution for 2D particles, discussed in Sec. 2.
3.5. Use the Born approximation to calculate the angular dependence and the total cross-section
of scattering of an incident plane wave propagating along the x-axis, by the following pair of similar
point inhomogeneities:
$$U(\mathbf{r}) = W \left[ \delta\left( \mathbf{r} - \mathbf{n}_z \frac{a}{2} \right) + \delta\left( \mathbf{r} + \mathbf{n}_z \frac{a}{2} \right) \right].$$
Analyze the results in detail. Derive the condition of the Born approximation’s validity for such delta-
functional scatterers.
3.6. Complete the analysis of the Born scattering by a uniform spherical potential (97), started in
Sec. 3, by calculation of its total cross-section. Analyze the result in the limits kR << 1 and kR >>1.
3.7. Use the Born approximation to calculate the differential cross-section of particle scattering
by a very thin spherical shell, whose potential may be approximated as
$$U(r) = W \delta(r - R).$$
Analyze the results in the limits kR << 1 and kR >> 1, and compare them with those for a uniform sphere
considered in Sec. 3.
3.8. Use the Born approximation to calculate the differential and total cross-sections of electron
scattering by a screened Coulomb field of a point charge Ze, with the electrostatic potential
$$\phi(r) = \frac{Ze}{4\pi\varepsilon_0 r} e^{-\lambda r},$$
neglecting spin interaction effects, and analyze the result’s dependence on the screening parameter λ.
Compare the results with those given by the classical (“Rutherford”) formula97 for the unscreened
Coulomb potential (λ → 0), and formulate the condition of Born approximation’s validity in this limit.
3.9. A quantum particle with electric charge Q is scattered by a localized distributed charge with
a spherically-symmetric density ρ(r), and zero total charge. Use the Born approximation to calculate the
differential cross-section of the forward scattering (with the scattering angle θ = 0), and evaluate it for
the scattering of electrons by a hydrogen atom in its ground state.
97 See, e.g., CM Sec. 3.5, in particular Eq. (3.73).
3.10. Reformulate the Born approximation for the 1D case. Use the result to find the scattering
and transfer matrices of a “rectangular” (flat-top) scatterer
$$U(x) = \begin{cases} U_0, & \text{for } |x| < d/2, \\ 0, & \text{otherwise.} \end{cases}$$
Compare the results with those of the exact calculations carried out earlier in Chapter 2, and analyze
how their relationship changes in the eikonal approximation.
3.11. In the tight-binding approximation, find the lowest stationary states of a particle placed into
a system of three similar, weakly coupled potential wells located in the vertices of an equilateral
triangle.
3.12. The figure on the right shows a fragment of a periodic 2D lattice,
with the red and blue points showing the positions of different local potentials.
[Figure: lattice fragment in the [x, y] plane, with two spacings labeled a.]
(i) Find the reciprocal lattice and the 1st Brillouin zone of the system.
(ii) Calculate the wave number k of the monochromatic de Broglie wave
incident along axis x, at which the lattice creates the lowest-order diffraction peak
within the [x, y] plane, and the direction toward this peak.
(iii) Semi-quantitatively, describe the evolution of the intensity of the
peak when the local potentials, represented by the different points, become
similar.
Hint: The order of diffraction on a multidimensional Bravais lattice is a somewhat ambiguous
notion, usually associated with the sum of magnitudes of all integers lj in Eq. (109), for the vector Q that
is equal to q ≡ k – ki.
3.13. For the 2D hexagonal lattice (Fig. 12b):

(i) find the reciprocal lattice Q and the 1st Brillouin zone;
(ii) use the tight-binding approximation to calculate the dispersion relation E(q) for a 2D particle
moving through a potential profile with such periodicity, with an energy close to the energy of the
axially-symmetric states quasi-localized at the potential minima;
(iii) analyze and sketch (or plot) the resulting dispersion relation E(q) inside the 1st Brillouin
zone.
3.14. Complete the tight-binding-approximation calculation of the band structure of the

honeycomb lattice, started at the end of Sec. 4. Analyze the results; in particular prove that the Dirac
points qD are located in the corners of the 1st Brillouin zone, and express the velocity vn participating in
Eq. (122), in terms of the coupling energy δn. Show that the final results do not change if the quasi-
localized wavefunctions are not axially-symmetric, but are proportional to exp{imφ} – as they are, with
m = 1, for the 2pz electrons of carbon atoms in graphene, which are responsible for its transport properties.
3.15. Examine basic properties of the so-called Wannier functions defined as
$$\phi_{\mathbf{R}}(\mathbf{r}) \equiv \text{const} \times \int_{\text{BZ}} \psi_{\mathbf{q}}(\mathbf{r})\, e^{-i\mathbf{q}\cdot\mathbf{R}}\, d^3q,$$
where ψq(r) is the Bloch wavefunction (108), R is any vector of the Bravais lattice, and the integration
over the quasimomentum q is extended over any (e.g., the first) Brillouin zone.
3.16. Evaluate the long-range interaction (the so-called London dispersion force) between two
similar, electrically-neutral atoms or molecules, modeling each of them as an isotropic 3D harmonic
oscillator with the electric dipole moment d = qs, where s is the oscillator’s displacement from its
equilibrium position.
Hint: Represent the total Hamiltonian of the system as a sum of Hamiltonians of independent 1D
harmonic oscillators, and calculate their total ground-state energy as a function of the distance between
the dipoles. 98
3.17. Derive expressions for the stationary wavefunctions and the corresponding energies of a
2D particle of mass m, free to move inside a round disk of radius R. What is the degeneracy of each
energy level? Calculate the five lowest energy levels with an accuracy better than 1%.
3.18. Calculate the ground-state energy of a 2D particle of mass m, localized in a very shallow
flat-bottom potential well
$$U(\rho) = \begin{cases} -U_0, & \text{for } \rho < R, \\ 0, & \text{for } \rho > R, \end{cases} \quad \text{with } 0 < U_0 \ll \frac{\hbar^2}{mR^2}.$$
3.19. Estimate the energy E of the localized ground state of a particle of mass m, in an axially-
symmetric 2D potential well of a finite radius R, with an arbitrary but very small potential U(ρ).
(Quantify this condition.)
3.20. Spell out the explicit form of the spherical harmonics $Y_4^0(\theta, \varphi)$ and $Y_4^4(\theta, \varphi)$.
3.21. Calculate $\langle x \rangle$ and $\langle x^2 \rangle$ in the ground states of the planar and spherical rotators of radius R.
What can you say about $\langle p_x \rangle$ and $\langle p_x^2 \rangle$?
3.22. A spherical rotator, with $r \equiv (x^2 + y^2 + z^2)^{1/2}$ = R = const, of mass m is in a state with the
following wavefunction: $\psi = \text{const} \times (\tfrac{1}{3} + \sin^2\theta)$. Calculate its energy.
3.23. According to the discussion at the beginning of Sec. 5, stationary wavefunctions of a 3D

harmonic oscillator may be calculated as products of three 1D “Cartesian oscillators” – see, in particular
Eq. (125), with d = 3. However, according to the discussion in Sec. 6, the wavefunctions of the type
(200), proportional to the spherical harmonics Ylm, also describe stationary states of this spherically-
symmetric system. Represent the wavefunctions (200) of:
(i) the ground state of the oscillator, and
98 This explanation of the interaction between electrically-neutral atoms was put forward in 1930 by F. London,
on the background of a prior (1928) work by C. Wang. Note that in some texts this interaction is (rather
inappropriately) referred to as the “van der Waals force”, though it is only one, long-range component of the van
der Waals model – see, e.g. SM Sec. 4.1.
(ii) each of its lowest excited states,

as linear combinations of products of 1D oscillator’s stationary wavefunctions. Also, calculate the
degeneracy of the nth energy level of the oscillator.
3.24. Calculate the smallest depth U0 of a spherical, flat-bottom potential well
$$U(r) = \begin{cases} -U_0, & \text{for } r < R, \\ 0, & \text{for } R < r, \end{cases}$$
at which it has a bound (localized) stationary state. Does such a state exist for a very narrow and deep well
U(r) = –Wδ(r), with a positive and finite W?
3.25. A 3D particle of mass m is placed into a spherically-symmetric potential well with –∞ <
U(r) ≤ U(∞) = 0. Relate its ground-state energy to that of a 1D particle of the same mass, moving in the
following potential well:
$$U'(x) = \begin{cases} U(x), & \text{for } x > 0, \\ +\infty, & \text{for } x < 0. \end{cases}$$
In the light of the found relation, discuss the origin of the difference between the solutions of the
previous problem and Problem 2.17.
3.26. Calculate the smallest value of the parameter U0, for which the following spherically-
symmetric potential well,
$$U(r) = -U_0 e^{-r/R}, \quad \text{with } U_0, R > 0,$$
has a bound (localized) eigenstate.
Hint: You may like to introduce the following new variables: $f \equiv r\mathcal{R}$ and $\xi \equiv C e^{-r/2R}$, with an
appropriate choice of the constant C.
3.27. A particle moving in a certain central potential U(r) has a stationary state with the
following wavefunction:
$$\psi = C r^{\alpha} e^{-\beta r} \cos\theta,$$
where C, α, and β > 0 are constants. Calculate:
(i) the probabilities of all possible values of the quantum numbers m and l, and
(ii) the confining potential and the state’s energy.
3.28. Use the variational method to estimate the ground-state energy of a particle of mass m,
moving in the following spherically-symmetric potential:
U r   ar 4 .
3.29. Use the variational method, with the trial wavefunction trial = A/(r + a)b, where both a > 0
and b > 1 are fitting parameters, to estimate the ground-state energy of the hydrogen-like atom/ion with
the nuclear charge +Ze. Compare the solution with the exact result.
3.30. Calculate the energy spectrum of a particle moving in a monotonic, but otherwise arbitrary
attractive spherically-symmetric potential U(r) < 0, in the approximation of very large orbital quantum
numbers l. Formulate the quantitative condition(s) of validity of your theory. Check that for the
Coulomb potential U(r) = –C/r, your result agrees with Eq. (201).
Hint: Try to solve Eq. (181) approximately, introducing the same new function, $f(r) \equiv r\mathcal{R}(r)$, that
was already used in Sec. 1 and in the solutions of a few earlier problems.
3.31. An electron had been in the ground state of a hydrogen-like atom/ion with nuclear charge
Ze, when the charge suddenly changed to (Z + 1)e. 99 Calculate the probabilities for the electron of the
changed system to be:
(i) in the ground state, and
(ii) in the lowest excited state.
3.32. Due to a very short pulse of an external force, the nucleus of a hydrogen-like atom/ion,
initially at rest in its ground state, starts moving with velocity v. Calculate the probability Wg that the
atom remains in its ground state. Evaluate the energy to be given, by the pulse, to a hydrogen atom in
order to reduce Wg to 50%.
3.33. Calculate $\langle x^2 \rangle$ and $\langle p_x^2 \rangle$ in the ground state of a hydrogen-like atom/ion. Compare the
results with Heisenberg’s uncertainty relation. What do these results tell about the electron’s velocity in
the system?
3.34. Use the Hellmann-Feynman theorem (see Problem 1.5) to prove:

(i) the first of Eqs. (211), and
(ii) the fact that for a spinless particle in an arbitrary spherically-symmetric attractive potential
U(r), the ground state is always an s-state (with the orbital quantum number l = 0).
3.35. For the ground state of a hydrogen atom, calculate the expectation values of E and E^2,
where E is the electric field created by the atom, at distances r >> r0 from its nucleus. Interpret the
resulting relation between $\langle E^2 \rangle$ and $\langle E \rangle^2$, at the same observation point.
3.36. Calculate the condition at which a particle of mass m, moving in the field of a very thin
spherically-symmetric shell, with
$$U(r) = W \delta(r - R),$$
and W < 0, has at least one localized (“bound”) stationary state.
99 Such a fast change happens, for example, at the beta-decay, when one of the nucleus’ neutrons spontaneously
turns into a proton, emitting a high-energy electron and a neutrino, which leave the system very fast (instantly on
the atomic time scale), and do not affect directly the atom transition’s dynamics.
3.37. Calculate the lifetime of the lowest metastable state of a particle in the same spherical-shell
potential as in the previous problem, but now with W > 0, for sufficiently large W. (Quantify this
condition.)
3.38. A particle of mass m and energy E is incident on a very thin spherical shell of radius R,
whose localized states were the subject of two previous problems, with an arbitrary “weight” W.
(i) Derive general expressions for the differential and total cross-sections of scattering for this
geometry.
(ii) Spell out the contribution σ0 to the total cross-section σ, due to the spherically-symmetric
component of the scattered de Broglie wave.
(iii) Analyze the result for σ0 in the limits of very small and very large magnitudes of W, for both
signs of this parameter. In particular, in the limit W → +∞, relate the result to the metastable state’s
lifetime τ calculated in the previous problem.
3.39. Calculate the spherically-symmetric contribution σ0 to the total cross-section of particle
scattering by a uniform sphere of radius R, described by the following potential:
$$U(r) = \begin{cases} U_0, & \text{for } r < R, \\ 0, & \text{otherwise,} \end{cases}$$
with an arbitrary U0. Analyze the result in detail, and give an interpretation of its most remarkable
features.
3.40. Use the finite difference method with the step h = a/2 to calculate as many energy levels as
possible, for a particle confined to the interior of:
(i) a square with side a, and
(ii) a cube with side a,
with hard walls. For the square, repeat the calculations, using a finer step: h = a/3. Compare the results
for different values of h with each other and with the exact formulas.
Hint: It is advisable to either first solve (or review the solution of) the similar 1D Problem 1.15,
or start from reading about the finite difference method.100 Also: try to exploit the symmetry of the
systems.
100 See, e.g., CM Sec. 8.5 or EM Sec. 2.11.
Chapter 4. Bra-ket Formalism

The objective of this chapter is to describe Dirac’s “bra-ket” formalism of quantum mechanics, which
not only overcomes some inconveniences of wave mechanics but also allows a natural description of
such intrinsic properties of particles as their spin. In the course of the formalism’s discussion, I will give
only a few simple examples of its application, leaving more involved cases for the following chapters.
4.1. Motivation
As the reader could see from the previous chapters of these notes, wave mechanics gives many
results of primary importance. Moreover, it is mostly sufficient for many applications, for example,
solid-state electronics and device physics. However, in the course of our survey, we have filed several
grievances about this approach. Let me briefly summarize these complaints:
(i) Attempts to analyze the temporal evolution of quantum systems, beyond the trivial time
behavior of the stationary states, described by Eq. (1.62), run into technical difficulties. For example, we
could derive Eq. (2.151) describing the metastable state’s decay and Eq. (2.181) describing the quantum
oscillations in coupled wells, only for the simplest potential profiles, though it is intuitively clear that
such simple results should be common for all problems of this kind. Solving such problems for more
complex potential profiles would entangle the time evolution analysis with the calculation of the spatial
distribution of the evolving wavefunctions – which (as we could see in Secs. 2.9 and 3.6) may be rather
complex even for simple time-independent potentials. Some separation of the spatial and temporal
dependencies is possible using perturbation approaches (to be discussed in Chapter 6), but even those
would lead, in the wavefunction language, to very cumbersome formulas.
(ii) The last statement can also be made concerning other issues that are conceptually
addressable within the wave mechanics, e.g., the Feynman path integral approach, coupling to the
environment, etc. Pursuing them in the wave mechanics language would lead to formulas so bulky that I
had postponed their discussion until we would have a more compact formalism on hand.
(iii) In the discussion of several key problems (for example the harmonic oscillator and
spherically-symmetric potentials), we have run into rather complicated eigenfunctions coexisting with
very simple energy spectra – which imply some simple background physics. It is very important to get this
physics revealed.
(iv) In the wave-mechanics postulates formulated in Sec. 1.2, the quantum mechanical operators
of the coordinate and momentum are treated rather unequally – see Eqs. (1.26b). However, some key
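expressions, e.g., for the fundamental eigenfunction of a free particle,
$$\exp\left\{ i \frac{\mathbf{p} \cdot \mathbf{r}}{\hbar} \right\}, \qquad (4.1)$$
or the harmonic oscillator’s Hamiltonian,
$$\hat{H} = \frac{1}{2m} \hat{p}^2 + \frac{m\omega_0^2}{2} \hat{r}^2, \qquad (4.2)$$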
just beg for a similar treatment of coordinates and momenta.
However, the strongest motivation for a more general formalism comes from wave mechanics’
conceptual inability to describe elementary particles’ spins1 and other internal quantum degrees of
freedom, such as quark flavors or lepton numbers. In this context, let us review the basic facts on spin
(which is very representative and experimentally the most accessible of all internal quantum numbers),
to understand what a more general formalism has to explain – as a minimum.
Figure 1 shows the conceptual scheme of the simplest spin-revealing experiment, first conceived
by Otto Stern in 1921 and implemented by Walther Gerlach in 1922. A collimated beam of electrons
from a natural source, such as a heated cathode, is passed through a gap between the poles of a strong
magnet, whose magnetic field B (in Fig. 1, directed along the z-axis) is nonuniform, so that both Bz
and dBz/dz are not equal to zero. The experiment shows that the beam splits into two beams of equal
intensity.
[Figure 4.1: labels: electron source, collimator, magnet with poles N and S (Bz ≠ 0, ∂Bz/∂z ≠ 0), particle detectors registering W = 50% for each of the two beams; axes y and z.]
Fig. 4.1. The simplest Stern-Gerlach experiment.
This result may be semi-quantitatively explained on classical if somewhat phenomenological

grounds, by assuming that each electron has an intrinsic, permanent magnetic dipole moment m. Indeed,
classical electrodynamics2 tells us that the potential energy U of a magnetic dipole in an external
magnetic field B is equal to (–m·B), so that the force acting on the electron,
$$\mathbf{F} = -\nabla U = \nabla (\mathbf{m} \cdot \mathbf{B}), \qquad (4.3)$$
has a non-zero vertical component
$$F_z = \frac{\partial}{\partial z} (m_z B_z) = m_z \frac{\partial B_z}{\partial z}. \qquad (4.4)$$
Hence if we further assume that electron’s magnetic moment may take only two equally probable
discrete values of mz = ±μ (though such discreteness does not follow from any classical model of the
particle), this may explain the original Stern-Gerlach effect qualitatively. The quantitative explanation
of the beam splitting angle requires the magnitude of μ to be equal (or very close) to the so-called Bohr
magneton3
$$\mu_B \equiv \frac{\hbar e}{2m_e} \approx 0.9274 \times 10^{-23}\ \frac{\text{J}}{\text{T}}. \qquad (4.5)$$
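The numerical value in Eq. (5) may be reproduced by direct arithmetic. The short sketch below assumes the CODATA 2018 values of the constants in SI units (they are not quoted in these notes) and uses illustrative variable names; it also expresses the result in kelvins per tesla, for comparison with the mnemonic rule of footnote 3.

```python
# Assumed CODATA 2018 values, SI units
HBAR = 1.054571817e-34       # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg
K_B = 1.380649e-23           # Boltzmann constant, J/K

def bohr_magneton():
    """mu_B = hbar * e / (2 * m_e), in J/T."""
    return HBAR * E_CHARGE / (2.0 * M_E)

mu_b = bohr_magneton()
print(f"mu_B       = {mu_b:.4e} J/T")
print(f"mu_B / k_B = {mu_b / K_B:.2f} K/T")
```

The second printed number is about 0.67 K/T, i.e. of the order of 1 K/T, which is all the mnemonic is meant to convey.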
1 To the best of my knowledge, the concept of spin as a measure of the internal rotation of a particle was first
suggested by Ralph Kronig, then a 20-year-old student, in January 1925, a few months before two other students,
G. Uhlenbeck and S. Goudsmit – to whom the idea is usually attributed. The concept was then accepted (rather
reluctantly) and developed quantitatively by Wolfgang Pauli.
3 A good mnemonic rule is that it is close to 1 K/T. In the Gaussian units, $\mu_B \equiv \hbar e/2m_e c \approx 0.9274 \times 10^{-20}$ erg/G.
However, as we will see below, this value cannot be explained by any internal motion of the
electron, say its rotation about the z-axis. More importantly, this semi-classical phenomenology cannot
explain, even qualitatively, other experimental results, for example those of the set of multi-stage Stern-
Gerlach experiments shown in Fig. 2. In the first of the experiments, the electron beam is first passed
through a magnetic field (and its gradient) oriented along the z-axis, just as in Fig. 1. Then one of the
two resulting beams is absorbed (or removed from the setup in some other way), while the other one is
passed through a similar but x-oriented field. The experiment shows that this beam is split again into two
components of equal intensity. A classical explanation of this experiment would require an even more
unnatural additional assumption that the initial electrons had random but discrete components of the
magnetic moment simultaneously in two directions, z and x.
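[Figure 4.2: top panel: 100% → SG (z), one output beam absorbed → SG (x) → 50% and 50%; middle panel: 100% → SG (z) → SG (x) → SG (z), with one beam absorbed after each of the first two stages → 50% and 50%; bottom panel: 100% → SG (z), one output beam absorbed → SG (z) → 100% and 0%.]
Fig. 4.2. Three multi-stage Stern-Gerlach experiments. The boxes SG (…) denote magnets similar to one
shown in Fig. 1, with the field oriented in the indicated direction.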
However, even this assumption cannot explain the results of the three-stage Stern-Gerlach
experiment shown on the middle panel of Fig. 2. Here, the previous two-stage setup is complemented
with one more absorber and one more magnet, now with the z-orientation again. Completely counter-
intuitively, it again gives two beams of equal intensity, as if we have not yet filtered out the electrons
with mz corresponding to the lower beam, at the first z-stage. The only way to save the classical
explanation here is to say that maybe, electrons somehow interact with the magnetic field so that the x-
polarized (non-absorbed) beam becomes spontaneously depolarized again somewhere between the two
last stages. But any hope for such an explanation is ruined by the control experiment shown on the
bottom panel of Fig. 2, whose results indicate that no such depolarization happens.
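For orientation only, the three sets of percentages in Fig. 2 can be reproduced by a few lines of arithmetic, if one assumes the standard two-component description of spin-½ that these notes develop only later. In the sketch below, the four amplitude pairs and the squared-overlap rule for probabilities are taken as assumptions of the example, not as results derived at this point; all names are illustrative.

```python
import math

S = 1.0 / math.sqrt(2.0)

# Assumed two-component amplitudes of the beams leaving SG (z) and SG (x) magnets
STATES = {
    "+z": (1.0, 0.0),
    "-z": (0.0, 1.0),
    "+x": (S, S),
    "-x": (S, -S),
}

def probability(outcome, prepared):
    """Squared magnitude of the overlap between the prepared and the outcome states."""
    amplitude = sum(complex(a).conjugate() * b
                    for a, b in zip(STATES[outcome], STATES[prepared]))
    return abs(amplitude) ** 2

# Top panel: SG (z), one beam kept, then SG (x)
top = (probability("+x", "+z"), probability("-x", "+z"))
# Middle panel: SG (z), SG (x) with one beam kept, then SG (z) again
middle = (probability("+z", "+x"), probability("-z", "+x"))
# Bottom panel: SG (z) followed directly by another SG (z)
bottom = (probability("+z", "+z"), probability("-z", "+z"))

print(top, middle, bottom)
```

Within rounding errors, the printed pairs are (0.5, 0.5), (0.5, 0.5), and (1, 0), matching the three panels; the point of the sketch is merely that a two-component amplitude, rather than a classical vector, is enough to encode all three outcomes.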
We will see below that all these (and many more) results find a natural explanation in the so-
called matrix mechanics pioneered by Werner Heisenberg, Max Born, and Pascual Jordan in 1925.
However, the matrix formalism is rather inconvenient for the solution of most problems discussed in
Chapters 1-3, and for a short time, it was eclipsed by E. Schrödinger’s wave mechanics, which had been
put forward just a few months later. However, very soon Paul Adrien Maurice Dirac introduced a more
general bra-ket formalism of quantum mechanics, which provides a generalization of both approaches
and proves their equivalence. Let me describe it, begging for the reader’s patience, because (in
contrast with my usual style), I will not be able to give particular examples of its application for a while
– until all the basic notions of the formalism have been introduced.
4.2. States, state vectors, and linear operators

The basic notion of the general formulation of quantum mechanics is the quantum state of a
system.4 To get some gut feeling of this notion, if a quantum state α of a particle may be adequately
described by wave mechanics, this description is given by the corresponding wavefunction Ψα(r, t).
Note, however, that a quantum state as such is not a mathematical object,5 and can participate in
mathematical formulas only as a “label” – e.g., the index of the wavefunction Ψα. On the other hand,
such a wavefunction is not a state, but a mathematical object (a complex function of space and time)
giving a quantitative description of the state – just as the classical radius vector rα and velocity vα as
real functions of time are mathematical objects describing the motion of the particle in its classical
description – see Fig. 3. Similarly, in the Dirac formalism, a certain quantum state α is described by
either of two mathematical objects, called the state vectors: the ket-vector |α⟩ and bra-vector ⟨α|,6
whose relationship is close to that between the wavefunction Ψα and its complex conjugate Ψα*.
[Figure 4.3: a system in state α, and its mathematical descriptions:
classical mechanics: rα(t), vα(t), etc.;
wave mechanics: either Ψα(r, t) or Ψα*(r, t);
bra-ket formalism: either |α⟩ or ⟨α|.]
Fig. 4.3. Physical state of a system and its descriptions.
One should be cautious with the term “vector” here. The usual geometric vectors, such as $\mathbf{r}$ and
$\mathbf{v}$, are defined in the usual geometric (say, Euclidean) space. In contrast, the bra- and ket-vectors are
defined in a more abstract Hilbert space – the full set of all possible bra- and ket-vectors of a given
system.7 So, despite certain similarities with the geometric vectors, the bra- and ket-vectors are different
mathematical objects, and we need to define the rules of their handling. The primary rules are essentially
postulates, and are justified only by the fact that their corollaries correctly describe all experimental
observations. While there is a general consensus among physicists on what the corollaries are, there are
many possible ways to carve the basic postulate sets from them. Just as in Sec. 1.2, I will not try too hard
to reduce the number of the postulates to the bare minimum, trying instead to keep their physical
meaning transparent.
(i) Ket-vectors. Let us start with ket-vectors – sometimes called just kets for short. Their most
important property is the linear superposition. Namely, if several ket-vectors $|\alpha_j\rangle$ describe possible
states of a quantum system, numbered by the index $j$, then any linear combination (superposition)
$$|\alpha\rangle = \sum_j c_j |\alpha_j\rangle, \qquad \text{(linear superposition of ket-vectors)} \tag{4.6}$$
where $c_j$ are any (possibly complex) c-numbers, also describes a possible state of the same system.8
Actually, since ket-vectors are new mathematical objects, the exact meaning of the right-hand side of
Eq. (6) becomes clear only after we have postulated the following rules of summation of these vectors,
$$|\alpha_j\rangle + |\alpha_{j'}\rangle = |\alpha_{j'}\rangle + |\alpha_j\rangle, \tag{4.7}$$
and their multiplication by an arbitrary c-number:
$$c_j |\alpha_j\rangle = |\alpha_j\rangle c_j. \tag{4.8}$$
Note that in the set of wave mechanics postulates, the statements parallel to Eqs. (7) and (8) were
unnecessary, because the wavefunctions are the usual (albeit complex) functions of space and time, and
we know from the usual algebra that such relations are indeed valid.
4 An attentive reader could notice my smuggling in the term “system” instead of “particle”, which was used in the
previous chapters. Indeed, the bra-ket formalism allows the description of quantum systems much more complex
than a single spinless particle, which is a typical (though not the only possible) subject of wave mechanics.
5 As was expressed nicely by Asher Peres, one of the pioneers of the quantum information theory, “quantum
phenomena do not occur in the Hilbert space, they occur in a laboratory”.
6 The terms bra and ket were suggested to reflect the fact that the pair $\langle\beta|$ and $|\alpha\rangle$ may be considered as the parts
of combinations like $\langle\beta|\alpha\rangle$ (see below), which resemble expressions in the usual angle brackets.
7 I have to confess that this is a somewhat loose definition; it will be refined soon.
As evident from Eq. (6), the complex coefficient $c_j$ may be interpreted as the “weight” of the
state $\alpha_j$ in the linear superposition $\alpha$. One important particular case is $c_j = 0$, showing that the state $\alpha_j$
does not participate in the superposition $\alpha$. The corresponding term of the sum (6), i.e. the product
$$0\,|\alpha_j\rangle, \qquad \text{(null-state vector)} \tag{4.9}$$
has a special name: the null-state vector. (It is important to avoid confusion between the null-state
corresponding to the vector (9), and the ground state of the system, which is frequently denoted by the
ket-vector $|0\rangle$. In some sense, the null-state does not exist at all, while the ground state not only does
exist but frequently is the most important quantum state of the system.)
(ii) Bra-vectors and inner products. Bra-vectors $\langle\alpha|$, which obey rules similar to Eqs. (7) and
(8), are not new, independent objects: a ket-vector $|\alpha\rangle$ and the corresponding bra-vector $\langle\alpha|$ describe
the same state. In other words, there is a unique dual correspondence between $|\alpha\rangle$ and $\langle\alpha|$,9 very similar
(though not identical) to that between a wavefunction $\Psi$ and its complex conjugate $\Psi^*$. The
correspondence between these vectors is described by the following rule: if a ket-vector of a linear
superposition is described by Eq. (6), then the corresponding bra-vector is
$$\langle\alpha| = \sum_j c_j^* \langle\alpha_j| = \sum_j \langle\alpha_j| c_j^*. \qquad \text{(linear superposition of bra-vectors)} \tag{4.10}$$
The mathematical convenience of using two types of vectors rather than just one becomes clear
from the notion of their inner product (due to its second, shorthand form, also called the short bracket):
$$\langle\beta| \cdot |\alpha\rangle \equiv \langle\beta|\alpha\rangle, \qquad \text{(short bracket, or inner product)} \tag{4.11}$$
which is a scalar c-number, in a certain but limited analogy with the scalar product of the usual
geometric vectors. (For one difference, the product (11) may be a complex number.)
The main property of the inner product is its linearity with respect to any of its component
vectors. For example, if a linear superposition $\alpha$ is described by the ket-vector (6), then
$$\langle\beta|\alpha\rangle = \sum_j c_j \langle\beta|\alpha_j\rangle, \tag{4.12}$$
while if Eq. (10) is true, then
$$\langle\alpha|\beta\rangle = \sum_j c_j^* \langle\alpha_j|\beta\rangle. \tag{4.13}$$
In plain English, c-number factors may be moved either into or out of the inner products.
8 One may express the same statement by saying that the vector $|\alpha\rangle$ belongs to the same Hilbert space as all $|\alpha_j\rangle$.
9 Mathematicians like to say that the ket- and bra-vectors of the same quantum system are defined in two
isomorphic Hilbert spaces.
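The second key property of the inner product is
$$\langle\beta|\alpha\rangle = \langle\alpha|\beta\rangle^*. \qquad \text{(inner product: complex conjugate)} \tag{4.14}$$
It is compatible with Eq. (10); indeed, the complex conjugation of both parts of Eq. (12) gives:
$$\langle\beta|\alpha\rangle^* = \sum_j c_j^* \langle\beta|\alpha_j\rangle^* = \sum_j c_j^* \langle\alpha_j|\beta\rangle = \langle\alpha|\beta\rangle. \tag{4.15}$$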
Finally, one more rule: the inner product of the bra- and ket-vectors describing the same state
(called the norm squared) is real and non-negative,
$$\|\alpha\|^2 \equiv \langle\alpha|\alpha\rangle \ge 0. \qquad \text{(norm squared of a state)} \tag{4.16}$$
To give the reader some feeling for the meaning of this rule: we will see below that if some
state $\alpha$ may be described by the wavefunction $\Psi_\alpha(\mathbf{r}, t)$, then
$$\langle\alpha|\alpha\rangle = \int \Psi_\alpha^* \Psi_\alpha \, d^3r \ge 0. \tag{4.17}$$
Hence the role of the bra- and ket-vectors of the same state is very similar to that of complex-conjugate
pairs of its wavefunctions.
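The rules (6)-(16) may be checked numerically. The following is only a minimal sketch, assuming (in anticipation of the matrix representation of Sec. 4.3) that the kets of a three-dimensional Hilbert space are stored as NumPy arrays of complex coefficients in some fixed orthonormal basis. The particular vectors, the coefficients, and the helper name `inner` are hypothetical choices made for this illustration.

```python
import numpy as np

# Assumption: kets are complex 1D arrays of coefficients in a fixed orthonormal basis.
alpha_1 = np.array([1.0, 1.0j, 0.0])
alpha_2 = np.array([0.0, 2.0, 1.0 - 1.0j])
c_1, c_2 = 0.5 + 0.5j, -2.0j

# Eq. (4.6): a linear superposition of kets is again a ket.
alpha = c_1 * alpha_1 + c_2 * alpha_2
beta = np.array([1.0j, -1.0, 3.0])

def inner(bra_state, ket_state):
    """Short bracket <bra_state|ket_state>; np.vdot conjugates its first argument."""
    return np.vdot(bra_state, ket_state)

# Eq. (4.12): linearity of the inner product with respect to the ket.
lhs_12 = inner(beta, alpha)
rhs_12 = c_1 * inner(beta, alpha_1) + c_2 * inner(beta, alpha_2)

# Eq. (4.13): the bra of a superposition carries complex-conjugate coefficients.
lhs_13 = inner(alpha, beta)
rhs_13 = np.conj(c_1) * inner(alpha_1, beta) + np.conj(c_2) * inner(alpha_2, beta)

# Eq. (4.16): the norm squared <alpha|alpha>.
norm_sq = inner(alpha, alpha)
```

In this sketch, `lhs_13` is the complex conjugate of `lhs_12`, in agreement with Eq. (14), and `norm_sq` has a vanishing imaginary part.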
(iii) Operators. One more key notion of the Dirac formalism is quantum-mechanical linear
operators. Just as for the operators discussed in wave mechanics, the function of an operator is to
“generate” one state from another: if $|\alpha\rangle$ is a possible ket of the system, and $\hat{A}$ is a legitimate10
operator, then the following combination,
$$\hat{A}|\alpha\rangle, \tag{4.18}$$
is also a ket-vector describing a possible state of the system, i.e. a ket-vector in the same Hilbert space
as the initial vector $|\alpha\rangle$. An alternative formulation of the same rule is the following clarification of the
notion of the Hilbert space: for the given set of linear operators of a system, its Hilbert space includes all
vectors that may be obtained from each other using the operations of the type (18). In this context, let
me note that the operator set, and hence the Hilbert space of a system, usually (if not always) implies a
certain approximate model of that system. For example, if the coupling of orbital degrees of freedom of a
particle to its spin may be ignored (as it may be for a non-relativistic particle in the absence of an
external magnetic field), we may describe the spin dynamics of the particle using spin operators only. In
this case, the set of all possible spin vectors of the particle forms a Hilbert space separate from that of
the orbital-state vectors of that particle.
10 Here the term “legitimate” means “having a clear sense in the bra-ket formalism”. Some examples of
“illegitimate” expressions are: $|\alpha\rangle\hat{A}$, $\hat{A}\langle\alpha|$, $|\alpha\rangle|\beta\rangle$, and $\langle\alpha|\langle\beta|$. Note, however, that the last two expressions may be
legitimate if $\alpha$ and $\beta$ are states of different systems, i.e. if their state vectors belong to different Hilbert spaces.
We will run into such direct products of the bra- and ket-vectors (sometimes denoted, respectively, as $\langle\alpha| \otimes \langle\beta|$ and
$|\alpha\rangle \otimes |\beta\rangle$) in Chapters 6-10.
As the adjective “linear” in the operator definition implies, the main rule governing the
operators is their linearity with respect to both any superposition of vectors:
$$\hat{A}\left(\sum_j c_j |\alpha_j\rangle\right) = \sum_j c_j \hat{A}|\alpha_j\rangle, \tag{4.19}$$
and any superposition of operators:
$$\left(\sum_j c_j \hat{A}_j\right)|\alpha\rangle = \sum_j c_j \hat{A}_j|\alpha\rangle. \tag{4.20}$$
These rules are evidently similar to Eqs. (1.53)-(1.54) of wave mechanics.

The above rules imply that an operator “acts” on the ket-vector on its right; however, a
combination of the type $\langle\alpha|\hat{A}$ is also legitimate and represents a new bra-vector. It is important that,
generally, this vector does not represent the same state as the ket-vector (18); instead, the bra-vector
isomorphic to the ket-vector (18) is
$$\langle\alpha|\hat{A}^\dagger. \qquad \text{(conjugate operator)} \tag{4.21}$$
This statement serves as the definition of the Hermitian conjugate (also called “Hermitian
adjoint”) $\hat{A}^\dagger$ of the initial operator $\hat{A}$. For an important class of operators, called the Hermitian
operators, the conjugation is inconsequential, i.e. for them
$$\hat{A}^\dagger = \hat{A}. \qquad \text{(Hermitian operator)} \tag{4.22}$$
(This equality, as well as any other operator equation below, means that these operators act identically on
any bra- or ket-vector of the given Hilbert space.) 11
To proceed further, we need one more postulate, sometimes called the associative
axiom of multiplication: just as an ordinary product of scalars, any legitimate bra-ket expression, not
including explicit summations, does not change from an insertion or removal of a pair of parentheses –
meaning as usual that the operation inside them has to be performed first. The first two examples of this
postulate are given by Eqs. (19) and (20), but the associative axiom is more general and means, for
example, that
$$\langle\beta|\left(\hat{A}|\alpha\rangle\right) = \left(\langle\beta|\hat{A}\right)|\alpha\rangle \equiv \langle\beta|\hat{A}|\alpha\rangle. \qquad \text{(long bracket: definition)} \tag{4.23}$$
This last equality serves as the definition of the last form, called the long bracket (evidently, also a
scalar), with an operator sandwiched between a bra-vector and a ket-vector. This definition, when
combined with the definition of the Hermitian conjugate and Eq. (14), yields an important corollary:
$$\langle\beta|\hat{A}|\alpha\rangle = \langle\beta|\left(\hat{A}|\alpha\rangle\right) = \left(\left(\langle\alpha|\hat{A}^\dagger\right)|\beta\rangle\right)^* = \langle\alpha|\hat{A}^\dagger|\beta\rangle^*, \tag{4.24}$$
which is most frequently rewritten as
$$\langle\alpha|\hat{A}|\beta\rangle^* = \langle\beta|\hat{A}^\dagger|\alpha\rangle. \qquad \text{(long bracket: complex conjugate)} \tag{4.25}$$
11 If we consider c-numbers as a particular type of operators (which is legitimate for any Hilbert space), then
according to Eqs. (11) and (21), for them the Hermitian conjugation is equivalent to the simple complex
conjugation, so that only real c-numbers may be considered as a particular type of the Hermitian operators (22).
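Eqs. (23) and (25) may be illustrated with a short numerical sketch. It assumes a finite-dimensional matrix representation in which the Hermitian conjugate of an operator is the conjugate transpose of its matrix, which is what Eq. (65) of Sec. 4.3 establishes. The random operator, the states, and the helper name `long_bracket` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# A generic (non-Hermitian) operator and two states, in a fixed orthonormal basis.
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
A_dag = A.conj().T  # assumption: Hermitian conjugate = conjugate transpose
alpha = rng.normal(size=N) + 1j * rng.normal(size=N)
beta = rng.normal(size=N) + 1j * rng.normal(size=N)

def long_bracket(bra_state, op, ket_state):
    """Long bracket <bra_state|op|ket_state>."""
    return np.vdot(bra_state, op @ ket_state)

# Eq. (4.23): the position of the parentheses does not matter.
ket_first = beta.conj() @ (A @ alpha)   # <beta|(A|alpha>)
bra_first = (beta.conj() @ A) @ alpha   # (<beta|A)|alpha>

# Eq. (4.25): <alpha|A|beta>* = <beta|A_dag|alpha>.
lhs_25 = np.conj(long_bracket(alpha, A, beta))
rhs_25 = long_bracket(beta, A_dag, alpha)
```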
The associative axiom also enables us to comprehend the following definition of one more, outer
product of bra- and ket-vectors:
$$|\beta\rangle\langle\alpha|. \qquad \text{(outer bra-ket product)} \tag{4.26}$$
In contrast to the inner product (11), which is a scalar, this mathematical construct is an operator.
Indeed, the associative axiom allows us to remove parentheses in the following expression:
$$\left(|\beta\rangle\langle\alpha|\right)|\gamma\rangle = |\beta\rangle\langle\alpha|\gamma\rangle. \tag{4.27}$$
But the last short bracket is just a scalar; hence the mathematical object (26), acting on a ket-vector (in
this case, $|\gamma\rangle$), gives a new ket-vector, which is the essence of the operator’s action. Very similarly,
$$\langle\delta|\left(|\beta\rangle\langle\alpha|\right) = \langle\delta|\beta\rangle\langle\alpha| \tag{4.28}$$
– again a typical operator’s action on a bra-vector. So, Eq. (26) defines an operator.
Now let us perform the following calculation. We may use the parentheses’ insertion into the
bra-ket equality following from Eq. (14),
$$\langle\gamma|\beta\rangle\langle\alpha|\delta\rangle = \left(\langle\delta|\alpha\rangle\langle\beta|\gamma\rangle\right)^*, \tag{4.29}$$
to transform it to the following form:
$$\langle\gamma|\left(|\beta\rangle\langle\alpha|\right)|\delta\rangle = \left(\langle\delta|\left(|\alpha\rangle\langle\beta|\right)|\gamma\rangle\right)^*. \tag{4.30}$$
Since this equality should be valid for any state vectors $\langle\gamma|$ and $|\delta\rangle$, its comparison with Eq. (25) gives
the following operator equality:
$$\left(|\beta\rangle\langle\alpha|\right)^\dagger = |\alpha\rangle\langle\beta|. \qquad \text{(outer product: Hermitian conjugate)} \tag{4.31}$$
This is the conjugate rule for outer products; it resembles the rule (14) for inner products but involves the
Hermitian (rather than the usual complex) conjugation.
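The operator nature of the outer product may be seen in a small numerical sketch. It assumes that kets are stored as complex NumPy arrays in a fixed orthonormal basis, so that the outer product of a ket and a bra becomes a square matrix, and that the Hermitian conjugation of a matrix is its conjugate transposition. The three state vectors are hypothetical.

```python
import numpy as np

alpha = np.array([1.0, 2.0j, -1.0])
beta = np.array([0.5j, 1.0, 1.0 + 1.0j])
gamma = np.array([2.0, -1.0j, 0.0])

# Outer product |beta><alpha|: a square matrix, i.e. an operator.
outer = np.outer(beta, alpha.conj())

# Eq. (4.27): (|beta><alpha|)|gamma> = |beta> <alpha|gamma>.
lhs_27 = outer @ gamma
rhs_27 = beta * np.vdot(alpha, gamma)

# Eq. (4.31): the Hermitian conjugate of |beta><alpha| is |alpha><beta|.
outer_dag = outer.conj().T
swapped = np.outer(alpha, beta.conj())
```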
The associative axiom is also valid for the operator “multiplication”:
$$\left(\hat{A}\hat{B}\right)|\alpha\rangle = \hat{A}\left(\hat{B}|\alpha\rangle\right), \qquad \langle\beta|\left(\hat{A}\hat{B}\right) = \left(\langle\beta|\hat{A}\right)\hat{B}, \tag{4.32}$$
showing that the action of an operator product on a state vector is nothing more than the sequential
action of its operands. However, we have to be careful with the operator products; generally, they do not
commute: $\hat{A}\hat{B} \ne \hat{B}\hat{A}$. This is why the commutator – the operator defined as
$$\left[\hat{A}, \hat{B}\right] \equiv \hat{A}\hat{B} - \hat{B}\hat{A}, \qquad \text{(commutator)} \tag{4.33}$$
is a non-trivial and very useful notion. Another similar notion is the anticommutator:12
$$\left\{\hat{A}, \hat{B}\right\} \equiv \hat{A}\hat{B} + \hat{B}\hat{A}. \qquad \text{(anticommutator)} \tag{4.34}$$
Finally, the bra-ket formalism broadly uses two special operators. The null-operator $\hat{0}$ is
defined by the following relations:
$$\hat{0}|\alpha\rangle \equiv 0|\alpha\rangle, \qquad \langle\alpha|\hat{0} \equiv \langle\alpha|0, \qquad \text{(null operator)} \tag{4.35}$$
where $\alpha$ is an arbitrary state; we may say that the null-operator “kills” any state, turning it into the null-
state. Another useful notion is the identity operator, which is defined by the following action (or rather
“inaction” :-) on an arbitrary state vector:
$$\hat{I}|\alpha\rangle \equiv |\alpha\rangle, \qquad \langle\alpha|\hat{I} \equiv \langle\alpha|. \qquad \text{(identity operator)} \tag{4.36}$$
These definitions show that the null-operator and the identity operator are Hermitian.
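The commutator (33) and the anticommutator (34) are easy to explore numerically. The sketch below uses the three Pauli matrices purely as a convenient, well-known example of non-commuting 2×2 Hermitian matrices; they are not introduced in this section, and the helper names `commutator`, `anticommutator`, and `is_hermitian` are hypothetical.

```python
import numpy as np

# Assumption: the Pauli matrices serve only as a standard example of
# non-commuting Hermitian operators of a two-dimensional Hilbert space.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

identity = np.eye(2, dtype=complex)      # matrix of the identity operator
null = np.zeros((2, 2), dtype=complex)   # matrix of the null-operator

def commutator(a, b):
    """Eq. (4.33): [A, B] = AB - BA."""
    return a @ b - b @ a

def anticommutator(a, b):
    """Eq. (4.34): {A, B} = AB + BA."""
    return a @ b + b @ a

def is_hermitian(op):
    """Eq. (4.22) in matrix form: the matrix equals its conjugate transpose."""
    return np.allclose(op, op.conj().T)

xy_commutator = commutator(sigma_x, sigma_y)
xy_anticommutator = anticommutator(sigma_x, sigma_y)
```

For this particular pair, the commutator equals `2j * sigma_z`, while the anticommutator is the null-matrix; any operator commutes with the identity operator.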
4.3. State basis and matrix representation

While some operations in quantum mechanics may be carried out in the general bra-ket
formalism outlined above, many calculations are performed for quantum systems that feature a full and
orthonormal set $\{u\} \equiv \{u_1, u_2, \ldots, u_j, \ldots\}$ of states $u_j$, frequently called a basis. The first of these
terms (“full”) means that any possible state vector of the system (i.e. of its Hilbert space) may be
represented as a unique sum of the type (6) or (10) over its basis vectors:
$$|\alpha\rangle = \sum_j \alpha_j |u_j\rangle, \qquad \langle\alpha| = \sum_j \alpha_j^* \langle u_j|, \qquad \text{(expansion over state basis)} \tag{4.37}$$
so that, in particular, if $\alpha$ is one of the basis states, say $u_{j'}$, then $\alpha_j = \delta_{jj'}$. The second term
(“orthonormal”) means that
$$\langle u_j|u_{j'}\rangle = \delta_{jj'}. \qquad \text{(basis vectors: orthonormality)} \tag{4.38}$$
For the systems that may be described by wave mechanics, examples of the full orthonormal bases are
represented by any full and orthonormal set of eigenfunctions calculated in the previous three chapters
of this course – for the simplest example, see Eq. (1.87).
Due to the uniqueness of the expansion (37), the full set of the coefficients $\alpha_j$ involved in the
expansion of a state $\alpha$ in a certain basis $\{u\}$ gives its complete description – just as the Cartesian
components $A_x$, $A_y$, and $A_z$ of a usual geometric 3D vector $\mathbf{A}$ in a certain reference frame give its
complete description. Still, let me emphasize some differences between such representations of the
quantum-mechanical state vectors and 3D geometric vectors:
(i) a quantum state basis may have a large or even infinite number of states $u_j$, and
(ii) the expansion coefficients $\alpha_j$ may be complex.
12 Another popular notation for the anticommutator (34) is $[\hat{A}, \hat{B}]_+$; it will not be used in these notes.
With these reservations in mind, the analogy with geometric vectors may be pushed further.
Let us inner-multiply both parts of the first of Eqs. (37) by a bra-vector $\langle u_{j'}|$ and then transform the
resulting relation using the linearity rules discussed in the previous section, and Eq. (38):
$$\langle u_{j'}|\alpha\rangle = \langle u_{j'}| \sum_j \alpha_j |u_j\rangle = \sum_j \alpha_j \langle u_{j'}|u_j\rangle = \alpha_{j'}. \tag{4.39}$$
Together with Eq. (14), this means that any of the expansion coefficients in Eq. (37) may be represented
as an inner product:
$$\alpha_j = \langle u_j|\alpha\rangle, \qquad \alpha_j^* = \langle\alpha|u_j\rangle; \qquad \text{(expansion coefficients as inner products)} \tag{4.40}$$
these important relations are analogs of the equalities $A_j = \mathbf{n}_j \cdot \mathbf{A}$ of the usual vector algebra, and
will be used on numerous occasions in this course. With them, the expansions (37) may be rewritten as
$$|\alpha\rangle = \sum_j |u_j\rangle\langle u_j|\alpha\rangle \equiv \sum_j \hat{\Lambda}_j|\alpha\rangle, \qquad \langle\alpha| = \sum_j \langle\alpha|u_j\rangle\langle u_j| \equiv \sum_j \langle\alpha|\hat{\Lambda}_j, \tag{4.41}$$
where
$$\hat{\Lambda}_j \equiv |u_j\rangle\langle u_j|. \qquad \text{(projection operator)} \tag{4.42}$$
Eqs. (41) show that $\hat{\Lambda}_j$ so defined is a legitimate linear operator. This operator, acting on any state
vector of the type (37), singles out just one of its components, for example,
$$\hat{\Lambda}_j|\alpha\rangle = |u_j\rangle\langle u_j|\alpha\rangle = \alpha_j|u_j\rangle, \tag{4.43}$$
i.e. “kills” all components of the linear superposition but one. In the geometric analogy, such an operator
“projects” the state vector on the $j$th “direction”, hence its name – the projection operator. Probably, the
most important property of the projection operators, called the closure (or “completeness”) relation,
immediately follows from Eq. (41): their sum over the full basis is equivalent to the identity operator:
$$\sum_j |u_j\rangle\langle u_j| = \hat{I}. \qquad \text{(closure relation)} \tag{4.44}$$
This means in particular that we may insert the left-hand side of Eq. (44), for any basis, into any bra-ket
relation, at any place – the trick that we will use again and again.
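The expansion coefficients (40), the projection operators (42), and the closure relation (44) may be checked in a minimal numerical sketch. It assumes a three-dimensional Hilbert space whose (hypothetical) orthonormal basis is generated from the QR decomposition of a random complex matrix; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 3

# Assumption: the columns of the unitary factor Q serve as an orthonormal basis {u_j}.
Q, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
u = [Q[:, j] for j in range(N)]

alpha = rng.normal(size=N) + 1j * rng.normal(size=N)  # an arbitrary state vector

# Eq. (4.40): expansion coefficients as inner products <u_j|alpha>.
coeffs = [np.vdot(u_j, alpha) for u_j in u]

# Eq. (4.42): projection operators |u_j><u_j|.
projectors = [np.outer(u_j, u_j.conj()) for u_j in u]

# Eq. (4.43): a projector singles out one component of the superposition.
projected = projectors[0] @ alpha

# Eq. (4.44): the sum of all projectors is the identity operator.
closure = sum(projectors)
```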
Now let us see how the expansions (37) transform the key notions introduced in the last section,
starting from the short bracket (11), i.e. the inner product of two state vectors:
$$\langle\beta|\alpha\rangle = \sum_{j,j'} \langle u_j|\beta_j^* \alpha_{j'}|u_{j'}\rangle = \sum_{j,j'} \beta_j^* \alpha_{j'} \delta_{jj'} = \sum_j \beta_j^* \alpha_j. \tag{4.45}$$
Besides the complex conjugation, this expression is similar to the scalar product of the usual, geometric
vectors. Now, let us explore the long bracket (23):
$$\langle\beta|\hat{A}|\alpha\rangle = \sum_{j,j'} \beta_j^* \langle u_j|\hat{A}|u_{j'}\rangle \alpha_{j'} \equiv \sum_{j,j'} \beta_j^* A_{jj'} \alpha_{j'}. \tag{4.46}$$
Here, the last step uses the very important notion of matrix elements of the operator, defined as
$$A_{jj'} \equiv \langle u_j|\hat{A}|u_{j'}\rangle. \qquad \text{(operator’s matrix elements)} \tag{4.47}$$
As evident from Eq. (46), the full set of the matrix elements completely characterizes the operator, just
as the full set of the expansion coefficients (40) fully characterizes a quantum state. The term “matrix”
means, first of all, that it is convenient to represent the full set of $A_{jj'}$ as a square table (matrix), with the
linear dimension equal to the number of basis states $u_j$ of the system under consideration. By the
way, this number (which may be infinite) is called the dimensionality of its Hilbert space.
As two simplest examples, all matrix elements of the null-operator, defined by Eqs. (35), are
evidently equal to zero (in any basis), and hence it may be represented as a matrix of zeros (called the
null-matrix):
$$\mathrm{0} \equiv \begin{pmatrix} 0 & 0 & \ldots \\ 0 & 0 & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix}, \qquad \text{(null matrix)} \tag{4.48}$$
while for the identity operator $\hat{I}$, defined by Eqs. (36), we readily get
$$I_{jj'} = \langle u_j|\hat{I}|u_{j'}\rangle = \langle u_j|u_{j'}\rangle = \delta_{jj'}, \tag{4.49}$$
i.e. its matrix (naturally called the identity matrix) is diagonal – also in any basis:
$$\mathrm{I} \equiv \begin{pmatrix} 1 & 0 & \ldots \\ 0 & 1 & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix}. \qquad \text{(identity matrix)} \tag{4.50}$$
The convenience of the matrix language extends well beyond the representation of particular
operators. For example, let us use the definition (47) to calculate matrix elements of a product of two
operators:
$$(AB)_{jj''} = \langle u_j|\hat{A}\hat{B}|u_{j''}\rangle. \tag{4.51}$$
Here we may use Eq. (44) for the first (but not the last!) time, inserting the identity operator between the
two operators, and then expressing it via a sum of projection operators:
$$(AB)_{jj''} = \langle u_j|\hat{A}\hat{B}|u_{j''}\rangle = \langle u_j|\hat{A}\hat{I}\hat{B}|u_{j''}\rangle = \sum_{j'} \langle u_j|\hat{A}|u_{j'}\rangle\langle u_{j'}|\hat{B}|u_{j''}\rangle = \sum_{j'} A_{jj'} B_{j'j''}. \qquad \text{(matrix element of an operator product)} \tag{4.52}$$
This result corresponds to the standard “row by column” rule of calculation of an arbitrary element of
the matrix product
$$\mathrm{AB} = \begin{pmatrix} A_{11} & A_{12} & \ldots \\ A_{21} & A_{22} & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} & \ldots \\ B_{21} & B_{22} & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix}. \tag{4.53}$$
Hence a product of operators may be represented (in a fixed basis!) by that of their matrices (in the same
basis).
This is so convenient that the same language is often used to represent not only long brackets,
$$\langle\beta|\hat{A}|\alpha\rangle = \sum_{j,j'} \beta_j^* A_{jj'} \alpha_{j'} = \left(\beta_1^*, \beta_2^*, \ldots\right) \begin{pmatrix} A_{11} & A_{12} & \ldots \\ A_{21} & A_{22} & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \ldots \end{pmatrix}, \qquad \text{(long bracket as a matrix product)} \tag{4.54}$$
but even short brackets:
$$\langle\beta|\alpha\rangle = \sum_j \beta_j^* \alpha_j = \left(\beta_1^*, \beta_2^*, \ldots\right) \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \ldots \end{pmatrix}, \qquad \text{(short bracket as a matrix product)} \tag{4.55}$$
although these equalities require the use of non-square matrices: rows of (complex-conjugate!)
expansion coefficients for the representation of bra-vectors, and columns of these coefficients for the
representation of ket-vectors. With that, the mapping of quantum states and operators on matrices
becomes completely general.
Now let us have a look at the outer product operator (26). Its matrix elements are just
$$\left(|\beta\rangle\langle\alpha|\right)_{jj'} = \langle u_j|\beta\rangle\langle\alpha|u_{j'}\rangle = \beta_j \alpha_{j'}^*. \tag{4.56}$$
These are the elements of a very special square matrix, whose filling requires the knowledge of just $2N$
scalars (where $N$ is the basis size), rather than $N^2$ scalars as for an arbitrary operator. However, a simple
generalization of such an outer product may represent an arbitrary operator. Indeed, let us insert two
identity operators (44), with different summation indices, on both sides of an arbitrary operator:
$$\hat{A} = \hat{I}\hat{A}\hat{I} = \left(\sum_j |u_j\rangle\langle u_j|\right) \hat{A} \left(\sum_{j'} |u_{j'}\rangle\langle u_{j'}|\right), \tag{4.57}$$
and then use the associative axiom to rewrite this expression as
$$\hat{A} = \sum_{j,j'} |u_j\rangle\langle u_j|\hat{A}|u_{j'}\rangle\langle u_{j'}|. \tag{4.58}$$
But the expression in the middle long bracket is just the matrix element (47), so that we may write
$$\hat{A} = \sum_{j,j'} |u_j\rangle A_{jj'} \langle u_{j'}|. \qquad \text{(operator via its matrix elements)} \tag{4.59}$$
The reader has to agree that this formula, which is a natural generalization of Eq. (44), is extremely
elegant.
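Eqs. (47), (52), and (59) may be verified numerically. The sketch below is only an illustration under stated assumptions: each operator is given by a random complex array (its matrix in some underlying reference basis), and a separate, hypothetical orthonormal basis $\{u\}$ is generated by a QR decomposition. The helper names `random_operator` and `matrix_elements` are not from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 3

# Assumption: the columns of Q form a hypothetical orthonormal basis {u_j}.
Q, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
u = [Q[:, j] for j in range(N)]

def random_operator():
    """A generic operator, stored as its matrix in the underlying reference basis."""
    return rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

def matrix_elements(op):
    """Eq. (4.47): A_jj' = <u_j|A|u_j'> for all pairs of basis states."""
    return np.array([[np.vdot(u[j], op @ u[k]) for k in range(N)]
                     for j in range(N)])

A_op, B_op = random_operator(), random_operator()
A, B = matrix_elements(A_op), matrix_elements(B_op)

# Eq. (4.52): matrix elements of the operator product, computed directly.
AB = matrix_elements(A_op @ B_op)

# Eq. (4.59): rebuilding the operator from its matrix elements in the basis {u}.
A_rebuilt = sum(A[j, k] * np.outer(u[j], u[k].conj())
                for j in range(N) for k in range(N))
```

Here `AB` coincides with the “row by column” product `A @ B`, and `A_rebuilt` reproduces `A_op`.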
The matrix representation is so convenient that it makes sense to extend it to one level lower –
from state vector products to the “bare” state vectors resulting from the operator’s action upon a given
state. For example, let us use Eq. (59) to represent the ket-vector (18) as
$$|\alpha'\rangle \equiv \hat{A}|\alpha\rangle = \left(\sum_{j,j'} |u_j\rangle A_{jj'} \langle u_{j'}|\right)|\alpha\rangle = \sum_{j,j'} |u_j\rangle A_{jj'} \langle u_{j'}|\alpha\rangle. \tag{4.60}$$
According to Eq. (40), the last short bracket is just $\alpha_{j'}$, so that
$$|\alpha'\rangle = \sum_{j,j'} |u_j\rangle A_{jj'} \alpha_{j'} = \sum_j \left(\sum_{j'} A_{jj'} \alpha_{j'}\right)|u_j\rangle. \tag{4.61}$$
But the expression in the parentheses is just the coefficient $\alpha'_j$ of the expansion (37) of the resulting
ket-vector (60) in the same basis, so that
$$\alpha'_j = \sum_{j'} A_{jj'} \alpha_{j'}. \tag{4.62}$$
This result corresponds to the usual rule of multiplication of a matrix by a column, so that we may
represent any ket-vector by its column matrix, with the operator’s action looking like
$$\begin{pmatrix} \alpha'_1 \\ \alpha'_2 \\ \ldots \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \ldots \\ A_{21} & A_{22} & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \ldots \end{pmatrix}. \tag{4.63}$$
Absolutely similarly, the operator action on the bra-vector (21), represented by its row-matrix, is
$$\left({\alpha'_1}^*, {\alpha'_2}^*, \ldots\right) = \left(\alpha_1^*, \alpha_2^*, \ldots\right) \begin{pmatrix} (\hat{A}^\dagger)_{11} & (\hat{A}^\dagger)_{12} & \ldots \\ (\hat{A}^\dagger)_{21} & (\hat{A}^\dagger)_{22} & \ldots \\ \ldots & \ldots & \ldots \end{pmatrix}. \tag{4.64}$$
By the way, Eq. (64) naturally raises the following question: what are the elements of the matrix
on its right-hand side, or more exactly, what is the relation between the matrix elements of an operator
and its Hermitian conjugate? The simplest way to answer it is to use Eq. (25) with two arbitrary states
(say, $u_j$ and $u_{j'}$) of the same basis in the role of $\alpha$ and $\beta$. Together with the orthonormality relation (38),
this immediately gives13
$$\left(\hat{A}^\dagger\right)_{jj'} = \left(A_{j'j}\right)^*. \qquad \text{(Hermitian conjugate: matrix elements)} \tag{4.65}$$
Thus, the matrix of the Hermitian-conjugate operator is the complex-conjugated and transposed matrix
of the initial operator. This result exposes very clearly the difference between the Hermitian and the
complex conjugation. It also shows that for the Hermitian operators, defined by Eq. (22),
$$A_{jj'} = A_{j'j}^*, \tag{4.66}$$
i.e. any pair of their matrix elements, symmetric with respect to the main diagonal, should be the
complex conjugate of each other. As a corollary, their main-diagonal elements have to be real:
$$A_{jj} = A_{jj}^*, \quad \text{i.e. } \operatorname{Im} A_{jj} = 0. \tag{4.67}$$
13 For the sake of formula compactness, below I will use the shorthand notation in which the operands of this
equality are written just as $A^\dagger_{jj'}$ and $A^*_{j'j}$. I believe that it leaves little chance for confusion, because the Hermitian
conjugation sign † may pertain only to an operator (or its matrix), while the complex conjugation sign *, to a
scalar – say a matrix element.
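The matrix rules (63)-(67) translate directly into array operations. The following minimal sketch assumes a two-dimensional Hilbert space with hypothetical numerical values of the matrix elements and expansion coefficients; it is meant only to make the row and column bookkeeping explicit.

```python
import numpy as np

# Hypothetical matrix of a generic (non-Hermitian) operator, and a ket column.
A = np.array([[1.0, 2.0 - 1.0j],
              [0.5j, -3.0]])
alpha = np.array([1.0 + 1.0j, 2.0])

# Eq. (4.63): the operator acts on the column of ket coefficients from the left.
alpha_prime = A @ alpha

# Eq. (4.65): the Hermitian-conjugate matrix is the conjugate transpose.
A_dag = A.conj().T

# Eq. (4.64): the row of conjugated coefficients is multiplied by A_dag from the right.
bra_prime = alpha.conj() @ A_dag

# Eqs. (4.66)-(4.67): a Hermitian matrix built from A, used to check its diagonal.
H = A + A_dag
```

In this sketch, `bra_prime` is the complex conjugate of `alpha_prime`, i.e. the row (64) and the column (63) describe the same state, and the main diagonal of `H` is real.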
In order to fully appreciate the special role played by Hermitian operators in quantum theory, let
us introduce the key notions of eigenstates $a_j$ (described by their eigenvectors $|a_j\rangle$ and $\langle a_j|$) and
eigenvalues (c-numbers) $A_j$ of an operator $\hat{A}$, both defined by the equation they have to satisfy:14
$$\hat{A}|a_j\rangle = A_j|a_j\rangle. \qquad \text{(operator: eigenstates and eigenvalues)} \tag{4.68}$$
Let us prove that eigenvalues of any Hermitian operator are real,15
$$A_j = A_j^*, \quad \text{for } j = 1, 2, \ldots, N, \qquad \text{(Hermitian operator: eigenvalues)} \tag{4.69}$$
while the eigenstates corresponding to different eigenvalues are orthogonal:
$$\langle a_j|a_{j'}\rangle = 0, \quad \text{if } A_j \ne A_{j'}. \qquad \text{(Hermitian operator: eigenvectors)} \tag{4.70}$$
The proof of both statements is surprisingly simple. Let us inner-multiply both sides of Eq. (68)
by the bra-vector $\langle a_{j'}|$. On the right-hand side of the result, the eigenvalue $A_j$, as a c-number, may be
taken out of the bracket, giving
$$\langle a_{j'}|\hat{A}|a_j\rangle = A_j \langle a_{j'}|a_j\rangle. \tag{4.71}$$
This equality has to hold for any pair of eigenstates, so that we may swap the indices in Eq. (71), and
write the complex conjugate of the result:
$$\langle a_j|\hat{A}|a_{j'}\rangle^* = A_{j'}^* \langle a_j|a_{j'}\rangle^*. \tag{4.72}$$
Now using Eqs. (14) and (25), together with the Hermitian operator’s definition (22), we may transform
Eq. (72) into the following form:
$$\langle a_{j'}|\hat{A}|a_j\rangle = A_{j'}^* \langle a_{j'}|a_j\rangle. \tag{4.73}$$
Subtracting this equation from Eq. (71), we get
$$0 = \left(A_j - A_{j'}^*\right) \langle a_{j'}|a_j\rangle. \tag{4.74}$$
There are two possibilities to satisfy this relation. If the indices $j$ and $j'$ are equal (denote the
same eigenstate), then the bracket is the state’s norm squared, and cannot be equal to zero. In this case,
the expression in the parentheses (with $j = j'$) has to be zero, proving Eq. (69). On the other hand, if $j$
and $j'$ correspond to different eigenvalues of $\hat{A}$, the parentheses cannot equal zero (we have just proved
that all $A_j$ are real!), and hence the state vectors indexed by $j$ and $j'$ should be orthogonal, i.e. Eq. (70) is valid.
As will be discussed below, these properties make Hermitian operators suitable, in particular, for
the description of physical observables.
14 This equation should look familiar to the reader – see the stationary Schrödinger equation (1.60), which was the
focus of our studies in the first three chapters. We will see soon that that equation is just a particular (coordinate)
representation of Eq. (68) for the Hamiltonian as the operator of energy.
15 The reciprocal statement is also true, provided that the eigenstates of the operator form a full orthonormal basis:
if all eigenvalues of such an operator are real, it is Hermitian (in any basis). This statement may be readily proved
by applying Eq. (93) below to the case when $A_{kk'} = A_k \delta_{kk'}$, with $A_k^* = A_k$.
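The two properties (69) and (70) may be observed numerically. The sketch below assumes a hypothetical random 4×4 Hermitian matrix and deliberately uses the general-purpose solver `np.linalg.eig`, which does not presume Hermiticity, so that the real eigenvalues and orthogonal eigenvectors appear as a result rather than as a built-in assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4

# A hypothetical Hermitian matrix: M plus its conjugate transpose.
M = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
H = M + M.conj().T

# General eigenvalue solver; column j of `eigenvectors` represents |a_j>.
eigenvalues, eigenvectors = np.linalg.eig(H)

# Matrix of all inner products <a_j|a_j'> of the (normalized) eigenvectors.
overlaps = eigenvectors.conj().T @ eigenvectors

# Largest imaginary part among the eigenvalues; Eq. (4.69) predicts zero.
max_imaginary_part = np.max(np.abs(eigenvalues.imag))
```

For a random matrix of this kind, all eigenvalues are different, so that according to Eq. (70) the matrix `overlaps` should be close to the identity matrix.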
4.4. Change of basis, and matrix diagonalization

From the discussion of the last section, it may seem that the matrix language is fully equivalent to,
and in many instances more convenient than, the general bra-ket formalism. In particular, Eqs. (54)-(55)
and (63)-(64) show that any part of any bra-ket expression may be directly mapped on the similar matrix
expression, with the only slight inconvenience of using not only columns but also rows (with their
elements complex-conjugated) for state vector representation. This invites the question: why do we
need the bra-ket language at all? The answer is that the elements of the matrices depend on the particular
choice of the basis set, very much like the Cartesian components of a usual geometric vector depend on
the particular choice of reference frame orientation (Fig. 4), and very frequently, when solving a problem,
it is convenient to use two or more different basis sets for the same system. (Just a bit more patience –
numerous examples will follow soon.)
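[Fig. 4.4. The transformation of components of a 2D vector at a reference frame’s rotation. The figure shows a vector $\boldsymbol{\alpha}$, the “old” axes $x$, $y$ and the rotated “new” axes $x'$, $y'$ with the common origin 0, and the vector’s components $\alpha_x$, $\alpha_y$, $\alpha_{x'}$, $\alpha_{y'}$.]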
With this motivation, let us explore what happens at the transform from one basis, $\{u\}$, to
another one, $\{v\}$ – both full and orthonormal. First of all, let us prove that for each such pair of bases,
and an arbitrary numbering of the states of each basis, there exists such an operator $\hat{U}$ that, first,
$$|v_j\rangle = \hat{U}|u_j\rangle, \tag{4.75}$$
and, second,
$$\hat{U}\hat{U}^\dagger = \hat{U}^\dagger\hat{U} = \hat{I}. \qquad \text{(unitary operator: definition)} \tag{4.76}$$
(Due to the last property,16 $\hat{U}$ is called a unitary operator, and Eq. (75), a unitary transformation.)
A very simple proof of both statements may be achieved by construction. Indeed, let us take
$$\hat{U} \equiv \sum_{j'} |v_{j'}\rangle\langle u_{j'}|, \qquad \text{(unitary operator: construction)} \tag{4.77}$$
– an evident generalization of Eq. (44). Then, using Eq. (38), we obtain
$$\hat{U}|u_j\rangle = \sum_{j'} |v_{j'}\rangle\langle u_{j'}|u_j\rangle = \sum_{j'} |v_{j'}\rangle \delta_{j'j} = |v_j\rangle, \tag{4.78}$$
so that Eq. (75) has been proved. Now, applying Eq. (31) to each term of the sum (77), we get
$$\hat{U}^\dagger = \sum_{j'} |u_{j'}\rangle\langle v_{j'}|, \qquad \text{(conjugate unitary operator)} \tag{4.79}$$
so that
$$\hat{U}\hat{U}^\dagger = \sum_{j,j'} |v_j\rangle\langle u_j|u_{j'}\rangle\langle v_{j'}| = \sum_{j,j'} |v_j\rangle \delta_{jj'} \langle v_{j'}| = \sum_j |v_j\rangle\langle v_j|. \tag{4.80}$$
But according to the closure relation (44), the last expression is just the identity operator, so that one of
Eqs. (76) has been proved. (The proof of the second equality is absolutely similar.) As a by-product of
our proof, we have also got another important expression – Eq. (79). It implies, in particular, that while,
according to Eq. (75), the operator $\hat{U}$ performs the transform from the “old” basis $u_j$ to the “new” basis
$v_j$, its Hermitian adjoint $\hat{U}^\dagger$ performs the reciprocal transform:
$$\hat{U}^\dagger|v_j\rangle = \sum_{j'} |u_{j'}\rangle \delta_{j'j} = |u_j\rangle. \qquad \text{(reciprocal basis transform)} \tag{4.81}$$
16 An alternative way to express Eq. (76) is to write $\hat{U}^\dagger = \hat{U}^{-1}$, but I will try to avoid this language.
Now let us see what the matrix elements of the unitary transform operators look like.
Generally, as was discussed above, the operator’s elements may depend on the basis we calculate them
in, so let us be specific – at least initially. For example, let us calculate the desired matrix elements $U_{jj'}$
in the “old” basis $\{u\}$, using Eq. (77):
$$U_{jj'}\big|_{\text{in } u} \equiv \langle u_j|\hat{U}|u_{j'}\rangle = \langle u_j|\left(\sum_{j''} |v_{j''}\rangle\langle u_{j''}|\right)|u_{j'}\rangle = \sum_{j''} \langle u_j|v_{j''}\rangle \delta_{j''j'} = \langle u_j|v_{j'}\rangle. \tag{4.82}$$
Now performing a similar calculation in the “new” basis $\{v\}$, we get
$$U_{jj'}\big|_{\text{in } v} \equiv \langle v_j|\hat{U}|v_{j'}\rangle = \langle v_j|\left(\sum_{j''} |v_{j''}\rangle\langle u_{j''}|\right)|v_{j'}\rangle = \sum_{j''} \delta_{jj''} \langle u_{j''}|v_{j'}\rangle = \langle u_j|v_{j'}\rangle. \tag{4.83}$$
Surprisingly, the result is the same! This is of course true for the Hermitian conjugate (79) as well:
$$U^\dagger_{jj'}\big|_{\text{in } u} = U^\dagger_{jj'}\big|_{\text{in } v} = \langle v_j|u_{j'}\rangle. \tag{4.84}$$
These expressions may be used, first of all, to rewrite Eq. (75) in a more direct form. Applying
the first of Eqs. (41) to any state $v_{j'}$ of the “new” basis, and then Eq. (82), we get
$$|v_{j'}\rangle = \sum_j |u_j\rangle\langle u_j|v_{j'}\rangle = \sum_j U_{jj'} |u_j\rangle. \qquad \text{(basis transforms: matrix form)} \tag{4.85}$$
Similarly, the reciprocal transform is
$$|u_{j'}\rangle = \sum_j |v_j\rangle\langle v_j|u_{j'}\rangle = \sum_j U^\dagger_{jj'} |v_j\rangle. \tag{4.86}$$
These formulas are very convenient for applications; we will use them already in this section.
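The construction (77) and its consequences (75), (76), and (82)-(83) may be checked in a short numerical sketch. It assumes a three-dimensional Hilbert space with two hypothetical orthonormal bases, each generated by the QR decomposition of a random complex matrix; the helper names `random_orthonormal_basis` and `elements` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 3

def random_orthonormal_basis():
    """A hypothetical orthonormal basis: columns of the unitary factor of a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))
    return [q[:, j] for j in range(N)]

u = random_orthonormal_basis()  # "old" basis
v = random_orthonormal_basis()  # "new" basis

# Eq. (4.77): U = sum over j of |v_j><u_j|.
U = sum(np.outer(v[j], u[j].conj()) for j in range(N))
U_dag = U.conj().T

def elements(op, basis):
    """Matrix elements <b_j|op|b_j'> of an operator in the given basis."""
    return np.array([[np.vdot(basis[j], op @ basis[k]) for k in range(N)]
                     for j in range(N)])

U_in_u = elements(U, u)  # Eq. (4.82)
U_in_v = elements(U, v)  # Eq. (4.83)

# The inner products <u_j|v_j'> that both sets of matrix elements should reproduce.
overlaps = np.array([[np.vdot(u[j], v[k]) for k in range(N)] for j in range(N)])
```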
Next, we may use Eqs. (83)-(84) to express the effect of the unitary transform on the expansion
coefficients $\alpha_j$ of the vectors of an arbitrary state $\alpha$, defined by Eq. (37). As a reminder, in the “old”
basis $\{u\}$ they are given by Eqs. (40). Similarly, in the “new” basis $\{v\}$,
$$\alpha_j\big|_{\text{in } v} = \langle v_j|\alpha\rangle. \tag{4.87}$$
Again inserting the identity operator in its closure form (44) with the internal index $j'$, and then using
Eqs. (84) and (40), we get
$$\alpha_j\big|_{\text{in } v} = \langle v_j|\left(\sum_{j'} |u_{j'}\rangle\langle u_{j'}|\right)|\alpha\rangle = \sum_{j'} \langle v_j|u_{j'}\rangle\langle u_{j'}|\alpha\rangle = \sum_{j'} U^\dagger_{jj'} \langle u_{j'}|\alpha\rangle = \sum_{j'} U^\dagger_{jj'} \alpha_{j'}\big|_{\text{in } u}. \tag{4.88}$$
The reciprocal transform is performed by matrix elements of the operator $\hat{U}$:
$$\alpha_j\big|_{\text{in } u} = \sum_{j'} U_{jj'} \alpha_{j'}\big|_{\text{in } v}. \tag{4.89}$$
So, if the transform (75) from the “old” basis $\{u\}$ to the “new” basis $\{v\}$ is performed by a
unitary operator, the change (88) of state vector components at this transformation requires its
Hermitian conjugate. This fact is similar to the transformation of components of a usual vector at
coordinate frame rotation. For example, for a 2D vector whose actual position in space is fixed (Fig. 4):
$$\begin{pmatrix} \alpha_{x'} \\ \alpha_{y'} \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix} \begin{pmatrix} \alpha_x \\ \alpha_y \end{pmatrix}, \tag{4.90}$$
but the reciprocal transform is performed by a different matrix, which may be obtained from that
participating in Eq. (90) by the replacement $\varphi \to -\varphi$. This replacement has a clear geometric sense: if
the “new” reference frame $\{x', y'\}$ is obtained from the “old” frame $\{x, y\}$ by a counterclockwise
rotation by angle $\varphi$, the reciprocal transformation requires angle $-\varphi$. (In this analogy, the unitary
property (76) of the unitary transform operators corresponds to the equality of the determinants of both
rotation matrices to 1.)
Due to the analogy between expressions (88) and (89) on one hand, and our old friend Eq. (62)
on the other hand, it is tempting to skip indices in these new results by writing
$$|\alpha\rangle_{\text{in } v} = \hat U^\dagger |\alpha\rangle_{\text{in } u}, \qquad |\alpha\rangle_{\text{in } u} = \hat U |\alpha\rangle_{\text{in } v}. \quad \text{(SYMBOLIC ONLY!)} \tag{4.91}$$
Since the matrix elements of Û and Û† do not depend on the basis, such language is not too bad and is
mnemonically useful. However, since in the bra-ket formalism (or at least its version presented in this
course), the state vectors are basis-independent, Eq. (91) has to be treated as a symbolic one, and should
not be confused with the strict Eqs. (88)-(89), and with the rigorous basis-independent vector and
operator equalities discussed in Sec. 2.
Now let us use the same trick of identity operator insertion, repeated twice, to find the
transformation rule for matrix elements of an arbitrary operator:
$$A_{jj'}|_{\text{in } v} \equiv \langle v_j|\hat A|v_{j'}\rangle = \langle v_j|\Big(\sum_k |u_k\rangle\langle u_k|\Big)\hat A\Big(\sum_{k'} |u_{k'}\rangle\langle u_{k'}|\Big)|v_{j'}\rangle = \sum_{k,k'} U^\dagger_{jk}\, A_{kk'}|_{\text{in } u}\, U_{k'j'}; \tag{4.92}$$
absolutely similarly, we may also get
$$A_{jj'}|_{\text{in } u} = \sum_{k,k'} U_{jk}\, A_{kk'}|_{\text{in } v}\, U^\dagger_{k'j'}. \tag{4.93}$$
[Matrix elements’ transforms]
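In the spirit of Eq. (91), we may represent these results symbolically as well, in a compact form:
$$\hat A|_{\text{in } v} = \hat U^\dagger\, \hat A|_{\text{in } u}\, \hat U, \qquad \hat A|_{\text{in } u} = \hat U\, \hat A|_{\text{in } v}\, \hat U^\dagger. \quad \text{(SYMBOLIC ONLY!)} \tag{4.94}$$
As a sanity check, let us apply the first of Eqs. (94) to the identity operator:
$$\hat I|_{\text{in } v} = \big(\hat U^\dagger \hat I\, \hat U\big)_{\text{in } u} = \big(\hat U^\dagger \hat U\big)_{\text{in } u} = \hat I|_{\text{in } u} \tag{4.95}$$
– as it should be. One more (strict rather than symbolic) invariant of the basis change is the trace of any
operator, defined as the sum of the diagonal terms of its matrix:
$$\mathrm{Tr}\,\hat A \equiv \mathrm{Tr}\,\mathrm{A} \equiv \sum_j A_{jj}. \tag{4.96}$$
[Operator’s/matrix’ trace]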
The (easy) proof of this fact, using previous relations, is left for the reader’s exercise.
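A quick numerical check can complement (not replace) that proof. The sketch below, in plain Python, applies the matrix-element transform (92) to a 2×2 example and compares the traces before and after. The particular matrices `A_u` and `U` are arbitrary illustrations chosen here, not taken from the text, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate: transposition plus complex conjugation."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of the diagonal elements, as in Eq. (4.96)."""
    return sum(a[j][j] for j in range(len(a)))

# Assumed example: a Hermitian matrix of some operator in the "old" basis {u}.
A_u = [[2.0, 1 - 1j], [1 + 1j, -0.5]]

# Assumed example of a unitary matrix: its columns are orthonormal.
c, s = math.cos(0.3), math.sin(0.3)
U = [[c, -1j * s], [s, 1j * c]]

# Eq. (4.92) in matrix form: A in v = U^dagger (A in u) U.
A_v = matmul(dagger(U), matmul(A_u, U))

print(trace(A_u), trace(A_v))
```

The two printed traces should agree to within floating-point rounding, while the individual matrix elements of `A_u` and `A_v` generally differ.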
So far, I have implied that both state bases {u} and {v} are known, and the natural question is
where does this information come from in quantum mechanics of actual physical systems. To get a
partial answer to this question, let us return to Eq. (68), which defines the eigenstates and the
eigenvalues of an operator. Let us assume that the eigenstates aj of a certain operator Â form a full and
orthonormal set, and calculate the matrix elements of the operator in the basis {a} of these states, with
an arbitrary numbering. For that, it is sufficient to inner-multiply both sides of Eq. (68), written for
some index j’, by the bra-vector of an arbitrary state aj of the same set:
$$\langle a_j|\hat A|a_{j'}\rangle = \langle a_j|A_{j'}|a_{j'}\rangle. \tag{4.97}$$
The left-hand side of this equality is the matrix element Ajj’ we are looking for, while its right-hand side
is just Aj’δjj’. As a result, we see that the matrix is diagonal, with the diagonal consisting of the
operator’s eigenvalues:
$$A_{jj'} = A_j\,\delta_{jj'}. \tag{4.98}$$
[Matrix elements in eigenstate basis]
In particular, in the eigenstate basis (but not necessarily in an arbitrary basis!), Ajj means the same as Aj.
Thus the important problem of finding the eigenvalues and eigenstates of an operator is equivalent to the
diagonalization of its matrix,17 i.e. finding the basis in which the operator’s matrix acquires the diagonal
form (98); then the diagonal elements are the eigenvalues, and the basis itself is the desirable set of
eigenstates.
17 Note that the expression “matrix diagonalization” is a very common but dangerous jargon. (Formally, a matrix
is just a matrix, an ordered set of c-numbers, and cannot be “diagonalized”.) It is OK to use this jargon if you
remember clearly what it actually means – see the definition above.
To see how this is done in practice, let us inner-multiply Eq. (68) by a bra-vector of the basis
(say, {u}) in which we happen to know the matrix elements Ajj’:
$$\langle u_k|\hat A|a_j\rangle = \langle u_k|A_j|a_j\rangle. \tag{4.99}$$
On the left-hand side, we can (as usual :-) insert the identity operator between the operator Â and the
ket-vector, and then use the closure relation (44) in the same basis {u}, while on the right-hand side, we
can move the eigenvalue Aj (a c-number) out of the bracket, and then insert a summation over the same
index as in the closure, compensating it with the proper Kronecker delta symbol:
$$\sum_{k'} \langle u_k|\hat A|u_{k'}\rangle\langle u_{k'}|a_j\rangle = A_j \sum_{k'} \langle u_{k'}|a_j\rangle\,\delta_{kk'}. \tag{4.100}$$
Moving out the signs of summation over k’, and using the definition (47) of the matrix elements, we get
$$\sum_{k'} \big(A_{kk'} - A_j\,\delta_{kk'}\big)\langle u_{k'}|a_j\rangle = 0. \tag{4.101}$$
But the set of such equalities, for all N possible values of the index k, is just a system of linear,
homogeneous equations for unknown c-numbers ⟨uk’|aj⟩. According to Eqs. (82)-(84), these numbers are
nothing else than the matrix elements Uk’j of a unitary matrix providing the required transformation from
the initial basis {u} to the basis {a} that diagonalizes the matrix A. This system may be represented in
the matrix form:
$$\begin{pmatrix} A_{11} - A_j & A_{12} & \dots \\ A_{21} & A_{22} - A_j & \dots \\ \dots & \dots & \dots \end{pmatrix} \begin{pmatrix} U_{1j} \\ U_{2j} \\ \dots \end{pmatrix} = 0, \tag{4.102}$$
[Matrix diagonalization]
and the condition of its consistency,
$$\begin{vmatrix} A_{11} - A_j & A_{12} & \dots \\ A_{21} & A_{22} - A_j & \dots \\ \dots & \dots & \dots \end{vmatrix} = 0, \tag{4.103}$$
[Characteristic equation for eigenvalues]
plays the role of the characteristic equation of the system. This equation has N roots Aj – the eigenvalues
of the operator Â; after they have been calculated, plugging any of them back into the system (102), we
can use it to find N matrix elements Ukj (k = 1, 2, …N) corresponding to this particular eigenvalue.
However, since the equations (102) are homogeneous, they allow finding Ukj only up to a constant
multiplier. To ensure their normalization, i.e. enforce the unitary character of the matrix U, we may use
the condition that all eigenvectors are normalized (just as the basis vectors are):
$$\langle a_j|a_j\rangle = \sum_k \langle a_j|u_k\rangle\langle u_k|a_j\rangle = \sum_k |U_{kj}|^2 = 1, \tag{4.104}$$
for each j. This normalization completes the diagonalization.18
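The three steps above (characteristic equation, back-substitution into the homogeneous system, normalization) can be sketched in a few lines of plain Python for the simplest case N = 2. This is only an illustration under stated assumptions: the function name `diagonalize_2x2` is hypothetical, the input is assumed to be a 2×2 Hermitian matrix, and the common phase of each eigenvector is fixed by an arbitrary choice made in the code.

```python
import math

def diagonalize_2x2(A):
    """Eigenvalues [A_1, A_2] and unitary matrix U of a 2x2 Hermitian matrix A.

    Column j of the returned U holds the elements U_kj = <u_k|a_j>.
    """
    a11 = complex(A[0][0]).real
    a22 = complex(A[1][1]).real
    a12 = complex(A[0][1])
    if a12 == 0:
        # The matrix is already diagonal: the basis states are the eigenstates.
        return [a11, a22], [[1.0, 0.0], [0.0, 1.0]]
    # Roots of the characteristic equation (4.103) for N = 2.
    mean = (a11 + a22) / 2
    root = math.sqrt(((a11 - a22) / 2) ** 2 + abs(a12) ** 2)
    eigenvalues = [mean + root, mean - root]
    columns = []
    for lam in eigenvalues:
        # First row of the system (4.102): (a11 - lam) U_1j + a12 U_2j = 0.
        u1, u2 = a12, lam - a11
        # Normalization (4.104); the common phase is fixed arbitrarily here.
        norm = math.sqrt(abs(u1) ** 2 + abs(u2) ** 2)
        columns.append((u1 / norm, u2 / norm))
    U = [[columns[0][0], columns[1][0]],
         [columns[0][1], columns[1][1]]]
    return eigenvalues, U

eigenvalues, U = diagonalize_2x2([[0, 1], [1, 0]])
print(eigenvalues)
print(U)
```

For larger N the characteristic equation generally has no closed-form solution, so a numerical eigenvalue routine would be used instead of explicit roots.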

18 A possible slight complication here is that the characteristic equation may give equal eigenvalues for certain
groups of different eigenvectors. In such cases, the requirement of the mutual orthogonality of these degenerate
states should be additionally enforced.
Now (at last!) I can give the reader some examples. As a simple but very important case, let us
diagonalize each of the operators described (in a certain two-function basis {u}, i.e. in two-dimensional
Hilbert space) by the so-called Pauli matrices
$$\sigma_x \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y \equiv \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.105}$$
[Pauli matrices]
Though introduced by a physicist, with a specific purpose to describe electron’s spin, these matrices
have a general mathematical significance, because together with the 2×2 identity matrix, they provide a
full, linearly-independent system – meaning that an arbitrary 2×2 matrix may be represented as
$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} = b\,\mathrm{I} + c_x\sigma_x + c_y\sigma_y + c_z\sigma_z, \tag{4.106}$$
with a unique set of four c-number coefficients b, cx, cy, and cz.
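The uniqueness claim may be made concrete by solving Eq. (106) for the four coefficients: comparing matrix elements gives b ± cz = A11, A22 and cx ∓ i·cy = A12, A21. The short Python sketch below implements this solution and rebuilds the matrix as a check; the function names and the sample matrix are illustrative assumptions, not part of the original notes.

```python
def pauli_coefficients(A):
    """Coefficients (b, c_x, c_y, c_z) of the expansion (4.106) of a 2x2 matrix A."""
    a11, a12 = complex(A[0][0]), complex(A[0][1])
    a21, a22 = complex(A[1][0]), complex(A[1][1])
    b = (a11 + a22) / 2
    c_x = (a12 + a21) / 2
    c_y = 1j * (a12 - a21) / 2
    c_z = (a11 - a22) / 2
    return b, c_x, c_y, c_z

def rebuild(b, c_x, c_y, c_z):
    """Matrix b*I + c_x*sigma_x + c_y*sigma_y + c_z*sigma_z, with sigma from (4.105)."""
    return [[b + c_z, c_x - 1j * c_y],
            [c_x + 1j * c_y, b - c_z]]

A = [[1, 2 + 3j], [4, -5j]]  # arbitrary (not necessarily Hermitian) example
coefficients = pauli_coefficients(A)
print(coefficients)
print(rebuild(*coefficients))
```

For a Hermitian matrix all four coefficients come out real, which is one reason this expansion is convenient for two-level systems.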
Since the matrix σz is already diagonal, with the evident eigenvalues ±1, let us start with
diagonalizing the matrix σx. For it, the characteristic equation (103) is evidently
$$\begin{vmatrix} -A_j & 1 \\ 1 & -A_j \end{vmatrix} = 0, \quad \text{i.e. } A_j^2 - 1 = 0, \tag{4.107}$$
and has two roots, A1,2 = ±1. (Again, the state numbering is arbitrary!) So the eigenvalues of the matrix
σx are the same as of the matrix σz. (The reader may readily check that the eigenvalues of the matrix σy
are also the same.) However, the eigenvectors of the operators corresponding to these three matrices are
different. To find them for σx, let us plug its first eigenvalue, A1 = +1, back into equations (101) spelled
out for this particular case (j = 1; k, k’ = 1, 2):
$$-\langle u_1|a_1\rangle + \langle u_2|a_1\rangle = 0, \qquad \langle u_1|a_1\rangle - \langle u_2|a_1\rangle = 0. \tag{4.108}$$
These two equations are compatible (of course, because the used eigenvalue A1 = +1 satisfies the
characteristic equation), and any of them gives
$$\langle u_1|a_1\rangle = \langle u_2|a_1\rangle, \quad \text{i.e. } U_{11} = U_{21}. \tag{4.109}$$
With that, the normalization condition (104) yields
$$|U_{11}|^2 = |U_{21}|^2 = \frac{1}{2}. \tag{4.110}$$
Although the normalization is insensitive to the simultaneous multiplication of U11 and U21 by the same
phase factor exp{iφ} with any real φ, it is convenient to keep the coefficients real, for example taking φ
= 0, to get
$$U_{11} = U_{21} = \frac{1}{\sqrt{2}}. \tag{4.111}$$
Performing an absolutely similar calculation for the second characteristic value, A2 = –1, we get
U12 = –U22, and we may choose the common phase to have
$$U_{12} = -U_{22} = \frac{1}{\sqrt{2}}, \tag{4.112}$$
so that the whole unitary matrix for diagonalization of the operator corresponding to σx is19
$$\mathrm{U}_x = \mathrm{U}_x^\dagger = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \tag{4.113}$$
[Unitary matrix diagonalizing σx]
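19 Note that though this particular unitary matrix is Hermitian, this is not true for an arbitrary choice of phases φ.
For what follows, it will be convenient to have this result expressed in the ket-relation form – see Eqs.
(85)-(86):
$$|a_1\rangle = U_{11}|u_1\rangle + U_{21}|u_2\rangle = \frac{1}{\sqrt{2}}\big(|u_1\rangle + |u_2\rangle\big), \qquad |a_2\rangle = U_{12}|u_1\rangle + U_{22}|u_2\rangle = \frac{1}{\sqrt{2}}\big(|u_1\rangle - |u_2\rangle\big), \tag{4.114a}$$
$$|u_1\rangle = U^\dagger_{11}|a_1\rangle + U^\dagger_{21}|a_2\rangle = \frac{1}{\sqrt{2}}\big(|a_1\rangle + |a_2\rangle\big), \qquad |u_2\rangle = U^\dagger_{12}|a_1\rangle + U^\dagger_{22}|a_2\rangle = \frac{1}{\sqrt{2}}\big(|a_1\rangle - |a_2\rangle\big). \tag{4.114b}$$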
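As a cross-check of this result, the following plain-Python sketch verifies numerically that the matrix (113) is unitary and that, in the spirit of Eq. (92), it brings the Pauli matrix σx to the diagonal form with the eigenvalues +1 and –1. The helper functions are minimal stand-ins written for this illustration.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate: transposition plus complex conjugation."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

s = 1 / math.sqrt(2)
U_X = [[s, s], [s, -s]]      # Eq. (4.113)
SIGMA_X = [[0, 1], [1, 0]]   # Eq. (4.105)

# Unitarity: U^dagger U should be the identity matrix.
identity_check = matmul(dagger(U_X), U_X)

# Diagonalization: U^dagger sigma_x U should be diag(+1, -1).
diagonal_form = matmul(dagger(U_X), matmul(SIGMA_X, U_X))

print(identity_check)
print(diagonal_form)
```

Up to rounding errors, the second printed matrix coincides with σz, reflecting the fact that σx and σz have the same eigenvalues.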
Now let me show that these results are already sufficient to understand the Stern-Gerlach
experiments described in Sec. 1 – but with two additional postulates. The first of them is that the
interaction of a particle with the external magnetic field, besides that due to its orbital motion, may be
described by the following vector operator of its spin dipole magnetic moment:20
$$\hat{\mathbf m} = \gamma\,\hat{\mathbf S}, \tag{4.115a}$$
[Spin magnetic moment]
where the constant coefficient γ, specific for every particle type, is called the gyromagnetic ratio,21 and
Ŝ is the vector operator of spin, with three Cartesian components:
$$\hat{\mathbf S} = \mathbf n_x \hat S_x + \mathbf n_y \hat S_y + \mathbf n_z \hat S_z. \tag{4.115b}$$
[Spin vector operator]
Here nx,y,z are the usual Cartesian unit vectors in the 3D geometric space (in the quantum-mechanics
sense, just c-numbers, or rather “c-vectors”), while Ŝx,y,z are the “usual” (scalar) operators.
For the so-called spin-½ particles (including the electron), these components may be simply expressed, as
$$\hat S_{x,y,z} = \frac{\hbar}{2}\,\hat\sigma_{x,y,z}, \tag{4.116a}$$
via those of the Pauli vector operator σ̂ ≡ nx σ̂x + ny σ̂y + nz σ̂z, so that we may also write
$$\hat{\mathbf S} = \frac{\hbar}{2}\,\hat{\boldsymbol\sigma}. \tag{4.116b}$$
[Spin-½ operator]
In turn, in the so-called z-basis, each Cartesian component of the latter operator is just the corresponding
Pauli matrix (105), so that it may be also convenient to use the following 3D vector of these matrices:
$$\boldsymbol\sigma \equiv \mathbf n_x \sigma_x + \mathbf n_y \sigma_y + \mathbf n_z \sigma_z = \begin{pmatrix} \mathbf n_z & \mathbf n_x - i\mathbf n_y \\ \mathbf n_x + i\mathbf n_y & -\mathbf n_z \end{pmatrix}. \tag{4.117}$$
[Pauli matrix vector]
The z-basis, in which such matrix representation of σ̂ is valid, is defined as an orthonormal basis
of certain two states, commonly denoted ↑ and ↓, in which the matrix of the operator σ̂z is diagonal, with
eigenvalues, respectively, +1 and –1, and hence the matrix Sz ≡ (ħ/2)σz of Ŝz is also diagonal, with the
eigenvalues +ħ/2 and –ħ/2. Note that we do not “understand” what exactly the states ↑ and ↓ are,22 but
loosely associate them with some internal rotation of a spin-½ particle about the z-axis, with either
positive or negative angular momentum component Sz. However, attempts to use such classical
interpretation for quantitative predictions run into fundamental difficulties – see Sec. 6 below.
20 This was the key point in the electron spin’s description, developed by W. Pauli in 1925-1927.
21 For the electron, with its negative charge q = –e, the gyromagnetic ratio is negative: γe = –ge e/2me, where ge ≈ 2
is the dimensionless g-factor. Due to quantum-electrodynamic (relativistic) effects, this g-factor is slightly higher
than 2: ge = 2(1 + α/2π + …) ≈ 2.002319304…, where α ≡ e²/4πε0ħc ≡ (EH/mec²)^(1/2) ≈ 1/137 is the so-called fine
structure constant. (The origin of its name will be clear from the discussion in Sec. 6.3.)
22 If you think about it, the word “understand” typically means that we can express a new, more complex notion
in terms of those discussed earlier and considered “known”. In our current case, we cannot describe the spin states
by some wavefunction ψ(r), or any other mathematical notion discussed in the previous three chapters. The bra-ket
formalism has been invented exactly to enable mathematical analyses of such “new” quantum states we do not
The second necessary postulate describes the general relation between the bra-ket formalism and
experiment. Namely, in quantum mechanics, each real observable A is represented by a Hermitian
operator Â = Â†, and the result of its measurement,23 in a quantum state α described by a linear
superposition of the eigenstates aj of the operator,
$$|\alpha\rangle = \sum_j \alpha_j |a_j\rangle, \quad \text{with } \alpha_j = \langle a_j|\alpha\rangle, \tag{4.118}$$
may be only one of the corresponding eigenvalues Aj.24 Specifically, if the ket (118) and all eigenkets
|aj⟩ are normalized to 1,
$$\langle\alpha|\alpha\rangle = 1, \qquad \langle a_j|a_j\rangle = 1, \tag{4.119}$$
then the probability of a certain measurement outcome Aj is25
$$W_j = |\alpha_j|^2 \equiv \alpha_j^*\alpha_j \equiv \langle\alpha|a_j\rangle\langle a_j|\alpha\rangle. \tag{4.120}$$
[Quantum measurement postulate]
This relation is evidently a generalization of Eq. (1.22) in wave mechanics. As a sanity check, let us
assume that the set of the eigenstates aj is full, and calculate the sum of the probabilities to find the
system in one of these states:
$$\sum_j W_j = \sum_j \langle\alpha|a_j\rangle\langle a_j|\alpha\rangle = \langle\alpha|\hat I|\alpha\rangle = 1. \tag{4.121}$$
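The postulate (120) and the sanity check (121) are easy to illustrate numerically. In the sketch below, states are represented by their component columns in some basis {u}; the eigenstates are taken, purely as an example, to be the columns of the matrix (113), and the state α is an arbitrary normalized vector chosen for illustration.

```python
import math

def inner(bra, ket):
    """Inner product <bra|ket> of two states given by their component lists."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))

def probabilities(state, eigenstates):
    """W_j = |<a_j|alpha>|^2, Eq. (4.120); all vectors are assumed normalized."""
    return [abs(inner(a, state)) ** 2 for a in eigenstates]

s = 1 / math.sqrt(2)
# Assumed example: the eigenstates a_1, a_2 of sigma_x, Eq. (4.114a), in the basis {u}.
eigenstates = [[s, s], [s, -s]]
# Assumed example of a normalized state: |0.6|^2 + |0.8i|^2 = 1.
alpha = [0.6, 0.8j]

W = probabilities(alpha, eigenstates)
print(W, sum(W))
```

Because the eigenstate set here is full and orthonormal, the printed probabilities add up to 1 within rounding, in agreement with Eq. (121).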
Now returning to the Stern-Gerlach experiment, conceptually the description of the first (z-
oriented) experiment shown in Fig. 1 is the hardest for us, because the statistical ensemble describing
the unpolarized electron beam at its input is mixed (“incoherent”), and cannot be described by a pure
(“coherent”) superposition of the type (6) that has been the subject of our studies so far. (We will
discuss such mixed ensembles in Chapter 7.) However, it is intuitively clear that its results are
compatible with the description of the two output beams as sets of electrons in the pure states ↑ and ↓,
respectively. The absorber following that first stage (Fig. 2) just takes all spin-down electrons out of the
picture, producing an output beam of polarized electrons in the definite ↑ state. For such a beam, the
probabilities (120) are W↑ = 1 and W↓ = 0. This is certainly compatible with the result of the “control”
experiment shown on the bottom panel of Fig. 2: the repeated SG (z) stage does not split such a beam,
keeping the probabilities the same.
initially “understand”. Gradually we get accustomed to these notions, and eventually, as we know more and more
about their properties, start treating them as “known” ones.
23 Here again, just like in Sec. 1.2, the statement implies the abstract notion of “ideal experiments”, deferring the
discussion of real (physical) measurements until Chapter 10.
24 As a reminder, at the end of Sec. 3 we have already proved that such eigenstates corresponding to different
values Aj are orthogonal. If any of these values is degenerate, i.e. corresponds to several different eigenstates, they
should be also selected orthogonal, in order for Eq. (118) to be valid.
25 This key relation, in particular, explains the most common term for the (generally, complex) coefficients αj, the
probability amplitudes.
Now let us discuss the double Stern-Gerlach experiment shown on the top panel of Fig. 2. For
that, let us represent the z-polarized beam in another basis – of the two states (I will denote them as →
and ←) in which, by definition, the matrix Sx is diagonal. But this is exactly the set we called a1,2 in the σx
matrix diagonalization problem solved above. On the other hand, the states ↑ and ↓ are exactly what we
called u1,2 in that problem, because in this basis, we know matrix σ explicitly – see Eq. (117). Hence, in
the application to the electron spin problem, we may rewrite Eqs. (114) as
$$|\rightarrow\rangle = \frac{1}{\sqrt{2}}\big(|\uparrow\rangle + |\downarrow\rangle\big), \qquad |\leftarrow\rangle = \frac{1}{\sqrt{2}}\big(|\uparrow\rangle - |\downarrow\rangle\big), \tag{4.122}$$
$$|\uparrow\rangle = \frac{1}{\sqrt{2}}\big(|\rightarrow\rangle + |\leftarrow\rangle\big), \qquad |\downarrow\rangle = \frac{1}{\sqrt{2}}\big(|\rightarrow\rangle - |\leftarrow\rangle\big). \tag{4.123}$$
[Relation between eigenvectors of Sx and Sz]
Currently for us the first of Eqs. (123) is most important, because it shows that the quantum state
of electrons entering the SG (x) stage may be represented as a coherent superposition of electrons with
Sx = +ħ/2 and Sx = –ħ/2. Notice that the beams have equal probability amplitude moduli, so that
according to Eq. (120), the split beams → and ← have equal intensities, in accordance with
experimental results. (The relative sign of the two ket-vectors, such as the minus sign in the second of
Eqs. (123), is of no consequence for these intensities, but it may have an impact on outcomes of other
experiments – for example, if coherently split beams are brought together again.)
Now, let us discuss the most mysterious (from the classical point of view) multi-stage SG
experiment shown on the middle panel of Fig. 2. After the second absorber has taken out all electrons in,
say, the ← state, the remaining electrons, all in the state →, are passed to the final, SG (z), stage. But
according to the first of Eqs. (122), this state may be represented as a (coherent) linear superposition of
the ↑ and ↓ states, with equal probability amplitudes. The final stage separates electrons in these two
states into separate beams, with equal probabilities W↑ = W↓ = ½ to find an electron in each of them,
thus explaining the experimental results.
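The whole chain of arguments for the multistage experiment reduces to a few inner products, which can be sketched as follows. The four states are written as component pairs in the z-basis, following Eq. (122); the dictionary keys and the function name `probability` are labels invented for this illustration, and each "stage plus absorber" is idealized as a perfect preparation of the selected state.

```python
import math

s = 1 / math.sqrt(2)
# Components (alpha_up, alpha_down) in the z-basis; see Eq. (4.122).
STATES = {
    "up": (1, 0),
    "down": (0, 1),
    "right": (s, s),    # eigenstate of S_x with eigenvalue +hbar/2
    "left": (s, -s),    # eigenstate of S_x with eigenvalue -hbar/2
}

def probability(outcome, prepared):
    """Probability (4.120) to find a system prepared in one state in another."""
    amplitude = sum(complex(a).conjugate() * b
                    for a, b in zip(STATES[outcome], STATES[prepared]))
    return abs(amplitude) ** 2

# Control experiment: SG(z) + absorber, then SG(z) again.
print(probability("up", "up"), probability("down", "up"))

# Double experiment: SG(z) + absorber, then SG(x).
print(probability("right", "up"), probability("left", "up"))

# Three-stage experiment: the "right" beam is passed to the final SG(z) stage.
print(probability("up", "right"), probability("down", "right"))
```

The last line shows that the spin-down component, fully removed after the first stage, reappears with probability close to ½ after the intermediate SG (x) selection.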
To conclude our discussion of the multistage Stern-Gerlach experiment, let me note that though
it cannot be explained in terms of wave mechanics (which operates with scalar de Broglie waves), it has
an analogy in classical theories of vector fields, such as the classical electrodynamics. Indeed, let a plane
electromagnetic wave propagate normally to the plane of the drawing in Fig. 5, and pass through the
linear polarizer 1.
Fig. 4.5. A light polarization sequence similar to the three-stage Stern-Gerlach experiment shown on
the middle panel of Fig. 2.
Similarly to the output of the initial SG (z) stages (including the absorbers) shown in Fig. 2, the
output wave is linearly polarized in one direction – the vertical direction in Fig. 5. Now its electric field
vector has no horizontal component – as may be revealed by the wave’s full absorption in a
perpendicular polarizer 3. However, let us pass the wave through polarizer 2 first. In this case, the
output wave does acquire a horizontal component, as can be, again, revealed by passing it through
polarizer 3. If the angles between the polarization directions 1 and 2, and between 2 and 3, are both
equal to π/4, each polarizer reduces the wave amplitude by a factor of √2, and hence the intensity by a
factor of 2, exactly like in the multistage SG experiment, with the polarizer 2 playing the role of the SG
(x) stage. The “only” difference is that the necessary angle is π/4, rather than π/2 as in the Stern-
Gerlach experiment. In quantum electrodynamics (see Chapter 9 below), which confirms classical
predictions for this experiment, this difference may be attributed to the difference between the integer
spin of electromagnetic field quanta (photons) and the half-integer spin of electrons.
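The classical arithmetic of this analogy may be sketched in a few lines. The code assumes ideal polarizers, each of which projects the field onto its own direction, so that the amplitude is multiplied by the cosine of the angle between the incoming polarization and the polarizer direction (Malus's law). The function name and the angle bookkeeping are assumptions of this sketch, with the angles measured from the horizontal direction.

```python
import math

def through_polarizers(intensity, polarization_angle, polarizer_angles):
    """Intensity of a linearly polarized wave after a sequence of ideal polarizers.

    Each polarizer multiplies the amplitude by cos(angle difference), and hence
    the intensity by its square; the output is polarized along the polarizer.
    """
    for angle in polarizer_angles:
        intensity *= math.cos(angle - polarization_angle) ** 2
        polarization_angle = angle
    return intensity

vertical, diagonal, horizontal = math.pi / 2, math.pi / 4, 0.0

# Wave after polarizer 1 (vertical), sent directly to polarizer 3 (horizontal).
print(through_polarizers(1.0, vertical, [horizontal]))

# The same wave, passed through polarizer 2 (at pi/4) before polarizer 3.
print(through_polarizers(1.0, vertical, [diagonal, horizontal]))
```

The first result is zero within rounding, while the second is close to 1/4: two successive reductions of the intensity by a factor of 2.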
4.5. Observables: Expectation values and uncertainties

After this particular (and hopefully inspiring) example, let us discuss the general relation
between the Dirac formalism and experiment in more detail. The expectation value of an observable
over any statistical ensemble (not necessarily coherent) may be always calculated using the general
statistical rule (1.37). For the particular case of a coherent superposition (118), we can combine that rule
with Eq. (120) and the second of Eqs. (118):
$$\langle A\rangle = \sum_j A_j W_j = \sum_j \alpha_j^* A_j \alpha_j = \sum_j \langle\alpha|a_j\rangle A_j \langle a_j|\alpha\rangle = \langle\alpha|\Big(\sum_j |a_j\rangle A_j \langle a_j|\Big)|\alpha\rangle. \tag{4.124}$$
Now using Eq. (59) for the particular case of the eigenstate basis {a}, for which Eq. (98) is valid, we
arrive at a very simple and important formula26
$$\langle A\rangle = \langle\alpha|\hat A|\alpha\rangle. \tag{4.125}$$
[Expectation value as a long bracket]
This is a clear analog of the wave-mechanics formula (1.23) – and as we will see soon, may be used to
derive it. A big advantage of Eq. (125) is that it does not explicitly involve the eigenvector set of the
corresponding operator, and allows the calculation to be performed in any convenient basis.27
For example, let us consider an arbitrary coherent state α of spin-½,28 and calculate the
expectation values of its components. The calculations are easiest in the z-basis because we know the
matrix elements of the spin operator components in that basis. Representing the ket- and bra-vectors of
the given state as linear superpositions of the corresponding vectors of the basis states ↑ and ↓,
$$|\alpha\rangle = \alpha_\uparrow |\uparrow\rangle + \alpha_\downarrow |\downarrow\rangle, \qquad \langle\alpha| = \langle\uparrow|\,\alpha_\uparrow^* + \langle\downarrow|\,\alpha_\downarrow^*, \tag{4.126}$$
and plugging these expressions into Eq. (125) written for the observable Sz, we get
$$\langle S_z\rangle = \big(\langle\uparrow|\,\alpha_\uparrow^* + \langle\downarrow|\,\alpha_\downarrow^*\big)\,\hat S_z\,\big(\alpha_\uparrow |\uparrow\rangle + \alpha_\downarrow |\downarrow\rangle\big) = \alpha_\uparrow^*\alpha_\uparrow \langle\uparrow|\hat S_z|\uparrow\rangle + \alpha_\downarrow^*\alpha_\downarrow \langle\downarrow|\hat S_z|\downarrow\rangle + \alpha_\uparrow^*\alpha_\downarrow \langle\uparrow|\hat S_z|\downarrow\rangle + \alpha_\downarrow^*\alpha_\uparrow \langle\downarrow|\hat S_z|\uparrow\rangle. \tag{4.127}$$
26 This equality reveals the full beauty of Dirac’s notation. Indeed, initially in this chapter the quantum-
mechanical brackets just resembled the angular brackets used for the statistical averaging. Now we see that in this
particular (but most important) case, the angular brackets of these two types may be indeed equal to each other!
27 Note also that Eq. (120) may be rewritten in a form similar to Eq. (125): Wj = ⟨α|Λ̂j|α⟩, where Λ̂j is the
operator (42) of the state’s projection upon the jth eigenstate aj.
28 For clarity, the noun “spin-½” is used, here and below, to denote the spin degree of freedom of a spin-½
particle, independent of its orbital motion.
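Now there are two equivalent ways (both very simple) to calculate the long brackets in this
expression. The first one is to represent each of them in the matrix form in the z-basis, in which the bra-
and ket-vectors of states ↑ and ↓ are the matrix-rows (1, 0) and (0, 1), or similar matrix-columns – the
exercise highly recommended to the reader. Another (perhaps more elegant) way is to use the general
Eq. (59), in the z-basis, together with the spin-½-specific Eqs. (116a) and (105) to write
$$\hat S_x = \frac{\hbar}{2}\big(|\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\uparrow|\big), \qquad \hat S_y = i\frac{\hbar}{2}\big(-|\uparrow\rangle\langle\downarrow| + |\downarrow\rangle\langle\uparrow|\big), \qquad \hat S_z = \frac{\hbar}{2}\big(|\uparrow\rangle\langle\uparrow| - |\downarrow\rangle\langle\downarrow|\big). \tag{4.128}$$
[Spin-½ component operators]
For our particular calculation, we may plug the last of these expressions into Eq. (127), and use the
orthonormality conditions (38):
$$\langle\uparrow|\uparrow\rangle = \langle\downarrow|\downarrow\rangle = 1, \qquad \langle\uparrow|\downarrow\rangle = \langle\downarrow|\uparrow\rangle = 0. \tag{4.129}$$
Both approaches give (of course) the same result:
$$\langle S_z\rangle = \frac{\hbar}{2}\big(\alpha_\uparrow^*\alpha_\uparrow - \alpha_\downarrow^*\alpha_\downarrow\big). \tag{4.130}$$
This particular result might be also obtained using Eq. (120) for the probabilities W↑ = α↑*α↑
and W↓ = α↓*α↓, namely:
$$\langle S_z\rangle = W_\uparrow\Big(+\frac{\hbar}{2}\Big) + W_\downarrow\Big(-\frac{\hbar}{2}\Big) = \alpha_\uparrow^*\alpha_\uparrow\Big(+\frac{\hbar}{2}\Big) + \alpha_\downarrow^*\alpha_\downarrow\Big(-\frac{\hbar}{2}\Big). \tag{4.131}$$
The formal way (127), based on using Eq. (125), has, however, the advantage of being applicable,
without any change, to finding the expectation values of observables whose operators are not diagonal
in the z-basis, as well. In particular, absolutely similar calculations give
$$\langle S_x\rangle = \alpha_\uparrow^*\alpha_\uparrow \langle\uparrow|\hat S_x|\uparrow\rangle + \alpha_\downarrow^*\alpha_\downarrow \langle\downarrow|\hat S_x|\downarrow\rangle + \alpha_\uparrow^*\alpha_\downarrow \langle\uparrow|\hat S_x|\downarrow\rangle + \alpha_\downarrow^*\alpha_\uparrow \langle\downarrow|\hat S_x|\uparrow\rangle = \frac{\hbar}{2}\big(\alpha_\uparrow^*\alpha_\downarrow + \alpha_\downarrow^*\alpha_\uparrow\big), \tag{4.132}$$
$$\langle S_y\rangle = \alpha_\uparrow^*\alpha_\uparrow \langle\uparrow|\hat S_y|\uparrow\rangle + \alpha_\downarrow^*\alpha_\downarrow \langle\downarrow|\hat S_y|\downarrow\rangle + \alpha_\uparrow^*\alpha_\downarrow \langle\uparrow|\hat S_y|\downarrow\rangle + \alpha_\downarrow^*\alpha_\uparrow \langle\downarrow|\hat S_y|\uparrow\rangle = i\frac{\hbar}{2}\big(\alpha_\uparrow\alpha_\downarrow^* - \alpha_\downarrow\alpha_\uparrow^*\big). \tag{4.133}$$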
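Both routes to these expectation values may be compared numerically. The sketch below evaluates the long bracket (125) directly from the z-basis matrices of the spin components, and separately the closed forms (130), (132), and (133). It assumes units in which ħ = 1, and uses an arbitrary normalized state picked for illustration; the function names are hypothetical.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

# Spin-1/2 component matrices in the z-basis: S = (hbar/2) sigma, Eqs. (4.105), (4.116a).
S_X = [[0, HBAR / 2], [HBAR / 2, 0]]
S_Y = [[0, -1j * HBAR / 2], [1j * HBAR / 2, 0]]
S_Z = [[HBAR / 2, 0], [0, -HBAR / 2]]

def long_bracket(matrix, state):
    """<alpha|A|alpha> of Eq. (4.125) for a state (alpha_up, alpha_down)."""
    total = 0j
    for j in range(2):
        for k in range(2):
            total += complex(state[j]).conjugate() * matrix[j][k] * state[k]
    return total.real  # the bracket is real for a Hermitian matrix

def closed_forms(a_up, a_down):
    """<S_x>, <S_y>, <S_z> from Eqs. (4.132), (4.133), and (4.130)."""
    a_up, a_down = complex(a_up), complex(a_down)
    s_x = HBAR / 2 * (a_up.conjugate() * a_down + a_down.conjugate() * a_up).real
    s_y = (1j * HBAR / 2 * (a_up * a_down.conjugate() - a_down * a_up.conjugate())).real
    s_z = HBAR / 2 * (abs(a_up) ** 2 - abs(a_down) ** 2)
    return s_x, s_y, s_z

state = (0.6, 0.8j)  # assumed example of a normalized state
print([long_bracket(S, state) for S in (S_X, S_Y, S_Z)])
print(closed_forms(*state))
```

The two printed triples should coincide within rounding; for the spin-up state (1, 0) both reduce to ⟨Sz⟩ = ħ/2 with vanishing ⟨Sx⟩ and ⟨Sy⟩.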
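Let us have a good look at a particular spin state, for example the spin-up state ↑. According to
Eq. (126), in this state α↑ = 1 and α↓ = 0, so that Eqs. (130)-(133) yield:
$$\langle S_z\rangle = \frac{\hbar}{2}, \qquad \langle S_x\rangle = \langle S_y\rangle = 0. \tag{4.134}$$
Now let us use the same Eq. (125) to calculate the spin component uncertainties. According to Eqs.
(105) and (116)-(117), the operator of each spin component squared is equal to (ħ/2)²Î, so that the
general Eq. (1.33) yields
$$(\delta S_z)^2 = \langle S_z^2\rangle - \langle S_z\rangle^2 = \langle\uparrow|\hat S_z^2|\uparrow\rangle - \Big(\frac{\hbar}{2}\Big)^2 = \Big(\frac{\hbar}{2}\Big)^2 \langle\uparrow|\hat I|\uparrow\rangle - \Big(\frac{\hbar}{2}\Big)^2 = 0, \tag{4.135a}$$
$$(\delta S_x)^2 = \langle S_x^2\rangle - \langle S_x\rangle^2 = \langle\uparrow|\hat S_x^2|\uparrow\rangle - 0 = \Big(\frac{\hbar}{2}\Big)^2 \langle\uparrow|\hat I|\uparrow\rangle = \Big(\frac{\hbar}{2}\Big)^2, \tag{4.135b}$$
$$(\delta S_y)^2 = \langle S_y^2\rangle - \langle S_y\rangle^2 = \langle\uparrow|\hat S_y^2|\uparrow\rangle - 0 = \Big(\frac{\hbar}{2}\Big)^2 \langle\uparrow|\hat I|\uparrow\rangle = \Big(\frac{\hbar}{2}\Big)^2. \tag{4.135c}$$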
While Eqs. (134) and (135a) are compatible with the classical notion of the angular momentum
of magnitude ħ/2 being directed exactly along the z-axis, this correspondence should not be
overstretched, because such classical picture cannot explain Eqs. (135b) and (135c). The best (but still
imprecise!) classical image I can offer is the spin vector S oriented, on average, in the z-direction, but
still having its x- and y-components strongly “wobbling” (fluctuating) about their zero average values.
It is straightforward to verify that in the x-polarized and y-polarized states the situation is similar,
with the corresponding change of axis indices. Thus, in none of these states do all three spin components
have definite values. Let me show that this is not just an accidental fact, but reflects perhaps the most
profound property of quantum mechanics, the uncertainty relations. For that, let us consider two
measurable observables, A and B, of the same quantum system. There are two possibilities here. If the
operators corresponding to these observables commute,
$$\big[\hat A, \hat B\big] = 0, \tag{4.136}$$
then all matrix elements of the commutator in any orthogonal basis (in particular, in the basis of
eigenstates aj of the operator Â) have to equal zero:
$$\langle a_j|\big[\hat A, \hat B\big]|a_{j'}\rangle \equiv \langle a_j|\hat A\hat B|a_{j'}\rangle - \langle a_j|\hat B\hat A|a_{j'}\rangle = 0. \tag{4.137}$$
In the first bracket of the middle expression, let us act by the (Hermitian!) operator Â on the bra-vector,
while in the second one, on the ket-vector. According to Eq. (68), such action turns the operators into
the corresponding eigenvalues, which may be taken out of the long brackets, so that we get
$$A_j \langle a_j|\hat B|a_{j'}\rangle - A_{j'} \langle a_j|\hat B|a_{j'}\rangle \equiv \big(A_j - A_{j'}\big)\langle a_j|\hat B|a_{j'}\rangle = 0. \tag{4.138}$$
This means that if all eigenstates of operator Â are non-degenerate (i.e. Aj ≠ Aj’ if j ≠ j’), the
matrix of operator B̂ has to be diagonal in the basis {a}, i.e., the eigenstate sets of the operators Â and
B̂ coincide. Such pairs of observables (and their operators) that share their eigenstates, are called
compatible. For example, in the wave mechanics of a particle, its momentum (1.26) and kinetic energy
(1.27) are compatible, sharing their eigenfunctions (1.29). Now we see that this is not accidental,
because each Cartesian component of the kinetic energy is proportional to the square of the
corresponding component of the momentum, and any operator commutes with an arbitrary integer
power of itself:
$$\big[\hat A, \hat A^n\big] \equiv \big[\hat A, \underbrace{\hat A\hat A\cdots\hat A}_{n}\big] = \hat A\,\underbrace{\hat A\hat A\cdots\hat A}_{n} - \underbrace{\hat A\hat A\cdots\hat A}_{n}\,\hat A = 0. \tag{4.139}$$
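Now, what if operators Â and B̂ do not commute? Then the following general uncertainty
relation is valid:
$$\delta A\,\delta B \ge \frac{1}{2}\Big|\big\langle\big[\hat A, \hat B\big]\big\rangle\Big|, \tag{4.140}$$
[General uncertainty relation]
where all expectation values are for the same but arbitrary state of the system. The proof of Eq. (140)
may be divided into two steps, the first one proving the so-called Schwarz inequality for any two
possible states, say α and β:29
$$\langle\alpha|\alpha\rangle\langle\beta|\beta\rangle \ge |\langle\alpha|\beta\rangle|^2. \tag{4.141}$$
[Schwarz inequality]
Its proof may be readily achieved by applying the postulate (16) – that the norm of any legitimate state
of the system cannot be negative – to the state with the following ket-vector:
$$|\delta\rangle \equiv |\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}|\beta\rangle, \tag{4.142}$$
where α and β are possible, non-null states of the system, so that the denominator in Eq. (142) is not
equal to zero. For this case, Eq. (16) gives
$$\Big(\langle\alpha| - \frac{\langle\alpha|\beta\rangle}{\langle\beta|\beta\rangle}\langle\beta|\Big)\Big(|\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}|\beta\rangle\Big) \ge 0. \tag{4.143}$$
Opening the parentheses, we get
$$\langle\alpha|\alpha\rangle - \frac{\langle\alpha|\beta\rangle}{\langle\beta|\beta\rangle}\langle\beta|\alpha\rangle - \frac{\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}\langle\alpha|\beta\rangle + \frac{\langle\alpha|\beta\rangle\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle^2}\langle\beta|\beta\rangle \ge 0. \tag{4.144}$$
After the cancellation of one inner product ⟨β|β⟩ in the numerator and the denominator of the last term,
it cancels with the 2nd (or the 3rd) term. What remains is the Schwarz inequality (141).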
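For readers who prefer to see this last step spelled out, the cancellation may be written explicitly as follows (a worked restatement of the argument above, using only ⟨β|α⟩ = ⟨α|β⟩* and ⟨β|β⟩ > 0):

```latex
\begin{aligned}
\frac{\langle\alpha|\beta\rangle\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle^{2}}\,\langle\beta|\beta\rangle
  &= \frac{\langle\alpha|\beta\rangle\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}
  && \text{(last term of Eq. (4.144))},\\
\langle\alpha|\alpha\rangle - \frac{\langle\alpha|\beta\rangle\langle\beta|\alpha\rangle}{\langle\beta|\beta\rangle}
  &\ge 0
  && \text{(the last term cancels one of the two equal middle terms)},\\
\langle\alpha|\alpha\rangle\langle\beta|\beta\rangle
  &\ge \langle\alpha|\beta\rangle\langle\beta|\alpha\rangle = |\langle\alpha|\beta\rangle|^{2}
  && \text{(multiplying by } \langle\beta|\beta\rangle > 0\text{)}.
\end{aligned}
```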
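29 This inequality is the quantum-mechanical analog of the usual vector algebra’s result α²β² ≥ |α·β|².
Now let us apply this inequality to states
$$|\alpha\rangle \equiv \hat{\tilde A}|\gamma\rangle \quad \text{and} \quad |\beta\rangle \equiv \hat{\tilde B}|\gamma\rangle, \tag{4.145}$$
where, in both relations, γ is the same (but otherwise arbitrary) possible state of the system, and the
deviation operators are defined similarly to the deviations of the observables (see Sec. 1.2):
$$\hat{\tilde A} \equiv \hat A - \langle A\rangle, \qquad \hat{\tilde B} \equiv \hat B - \langle B\rangle. \tag{4.146}$$
With this substitution, and taking into account again that the observable operators Â and B̂ are
Hermitian, Eq. (141) yields
$$\langle\gamma|\hat{\tilde A}^2|\gamma\rangle\,\langle\gamma|\hat{\tilde B}^2|\gamma\rangle \ge \Big|\langle\gamma|\hat{\tilde A}\hat{\tilde B}|\gamma\rangle\Big|^2. \tag{4.147}$$
Since the state γ is arbitrary, we may use Eq. (125) to rewrite this relation as an operator inequality:
$$\delta A\,\delta B \ge \Big|\big\langle\hat{\tilde A}\hat{\tilde B}\big\rangle\Big|. \tag{4.148}$$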
Actually, this is already an uncertainty relation, even “better” (stronger) than its standard form
(140); moreover, it is more convenient in some cases. To prove Eq. (140), we need a couple more
steps. First, let us notice that the operator product participating in Eq. (148) may be recast as
$$\hat{\tilde A}\hat{\tilde B} = \frac{1}{2}\big\{\hat{\tilde A}, \hat{\tilde B}\big\} - \frac{i}{2}\hat C, \quad \text{where } \hat C \equiv i\big[\hat{\tilde A}, \hat{\tilde B}\big]. \tag{4.149}$$
Any anticommutator of Hermitian operators, including that in Eq. (149), is a Hermitian operator, and its
eigenvalues are purely real, so that its expectation value (in any state) is also purely real. On the other
hand, the commutator part of Eq. (149) is just
$$\hat C \equiv i\big[\hat{\tilde A}, \hat{\tilde B}\big] \equiv i\big(\hat A - \langle A\rangle\big)\big(\hat B - \langle B\rangle\big) - i\big(\hat B - \langle B\rangle\big)\big(\hat A - \langle A\rangle\big) = i\big(\hat A\hat B - \hat B\hat A\big) \equiv i\big[\hat A, \hat B\big]. \tag{4.150}$$
Second, according to Eqs. (52) and (65), the Hermitian conjugate of any product of the Hermitian
operators Â and B̂ is just the product of these operators swapped. Using the fact, we may write
$$\hat C^\dagger = \big(i\big[\hat A, \hat B\big]\big)^\dagger = -i\big(\hat A\hat B\big)^\dagger + i\big(\hat B\hat A\big)^\dagger = -i\hat B\hat A + i\hat A\hat B = i\big[\hat A, \hat B\big] = \hat C, \tag{4.151}$$
so that the operator Ĉ is also Hermitian, i.e. its eigenvalues are also real, and thus its expectation value
is purely real as well. As a result, the square of the expectation value of the operator product (149) may
be represented as
$$\Big|\big\langle\hat{\tilde A}\hat{\tilde B}\big\rangle\Big|^2 = \Big\langle\frac{1}{2}\big\{\hat{\tilde A}, \hat{\tilde B}\big\}\Big\rangle^2 + \Big\langle\frac{1}{2}\hat C\Big\rangle^2. \tag{4.152}$$
Since the first term on the right-hand side of this equality cannot be negative, we may write
$$\Big|\big\langle\hat{\tilde A}\hat{\tilde B}\big\rangle\Big|^2 \ge \Big\langle\frac{1}{2}\hat C\Big\rangle^2 = \Big|\Big\langle\frac{i}{2}\big[\hat A, \hat B\big]\Big\rangle\Big|^2, \tag{4.153}$$
and hence continue Eq. (148) as
$$\delta A\,\delta B \ge \Big|\big\langle\hat{\tilde A}\hat{\tilde B}\big\rangle\Big| \ge \frac{1}{2}\Big|\big\langle\big[\hat A, \hat B\big]\big\rangle\Big|, \tag{4.154}$$
thus proving Eq. (140).

For the particular case of operators x̂ and p̂x (or a similar pair of operators for another Cartesian
coordinate), we may readily combine Eq. (140) with Eq. (2.14b) to prove the original Heisenberg’s
uncertainty relation (2.13). For the spin-½ operators defined by Eqs. (116)-(117), it is very simple (and
highly recommended to the reader) to show that
$$\big[\hat\sigma_j, \hat\sigma_{j'}\big] = 2i\,\varepsilon_{jj'j''}\,\hat\sigma_{j''}, \quad \text{i.e. } \big[\hat S_j, \hat S_{j'}\big] = i\hbar\,\varepsilon_{jj'j''}\,\hat S_{j''}, \tag{4.155}$$
[Spin-½: commutation relations]
where εjj’j” is the Levi-Civita permutation symbol – see, e.g., MA Eq. (13.2). As a result, the uncertainty
relations (140) for all Cartesian components of spin-½ systems are similar, for example
$$\delta S_x\,\delta S_y \ge \frac{\hbar}{2}\big|\langle S_z\rangle\big|, \quad \text{etc.} \tag{4.156}$$
[Spin-½: uncertainty relations]
In particular, as we already know, in the ↑ state the right-hand side of this relation equals (ħ/2)²
> 0, so that neither of the uncertainties δSx, δSy can equal zero. As a reminder, our direct calculation
earlier in this section has shown that each of these uncertainties is equal to ħ/2, i.e. their product is equal
to the lowest value allowed by the uncertainty relation (156) – just as the Gaussian wave packets (2.16)
provide the lowest possible value of the product δxδpx, allowed by the Heisenberg relation (2.13).
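The commutation relation (155) and the uncertainty relation (156) can both be checked with a short plain-Python sketch. It assumes units with ħ = 1, uses the fact stated above that each spin component squared equals (ħ/2)²Î, and evaluates both sides of Eq. (156) for a state given by its z-basis components; all function names are hypothetical helpers written for this illustration.

```python
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """Matrix of the commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def expectation(matrix, state):
    """Long bracket <alpha|matrix|alpha>, Eq. (4.125), for a normalized state."""
    return sum(complex(state[j]).conjugate() * matrix[j][k] * state[k]
               for j in range(2) for k in range(2)).real

def uncertainty_sides(state):
    """Both sides of Eq. (4.156), in units with hbar = 1."""
    s_x, s_y, s_z = (0.5 * expectation(m, state)
                     for m in (SIGMA_X, SIGMA_Y, SIGMA_Z))
    # (delta S)^2 = <S^2> - <S>^2, with S^2 = (1/2)^2 I for each component.
    d_x = math.sqrt(max(0.0, 0.25 - s_x ** 2))
    d_y = math.sqrt(max(0.0, 0.25 - s_y ** 2))
    return d_x * d_y, 0.5 * abs(s_z)

print(commutator(SIGMA_X, SIGMA_Y))      # expected to equal 2i * sigma_z
print(uncertainty_sides((1, 0)))         # spin-up state
print(uncertainty_sides((0.6, 0.8j)))    # another (assumed) normalized state
```

For the spin-up state the two sides coincide, in agreement with the statement that this state gives the lowest product allowed by Eq. (156); for the second state the left-hand side is larger.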
4.6. Quantum dynamics: Three pictures

So far in this chapter, I shied away from the discussion of the system’s dynamics, implying that
the bra- and ket-vectors were just their “snapshots” at a certain instant t. Now we are sufficiently
prepared to examine their evolution in time. One of the most beautiful features of quantum mechanics is
that this evolution may be described using any of three alternative “pictures”, giving exactly the same
final results for the expectation values of all observables.
From the standpoint of our wave-mechanics experience, the Schrödinger picture is the most
natural one. In this picture, the operators corresponding to time-independent observables (e.g., to the
Hamiltonian function H of an isolated system) are also constant in time, while the bra- and ket-vectors
evolve in time as
$$\langle\alpha(t)| = \langle\alpha(t_0)|\,\hat u^\dagger(t, t_0), \qquad |\alpha(t)\rangle = \hat u(t, t_0)\,|\alpha(t_0)\rangle. \tag{4.157a}$$
Here û(t, t0) is the time-evolution operator, which obeys the following differential equation:
$$i\hbar\frac{\partial}{\partial t}\hat u = \hat H\hat u, \tag{4.157b}$$
where Ĥ is the Hamiltonian operator of the system – which is always Hermitian: Ĥ† = Ĥ, and t0 is the
initial moment of time. (Note that Eqs. (157) remain valid even if the Hamiltonian depends on time
explicitly.) Differentiating the second of Eqs. (157a) over time t, and then using Eq. (157b) and the
second of Eqs. (157a) again, we can merge these two relations into a single equation, without explicit
use of the time-evolution operator:
$$i\hbar\frac{\partial}{\partial t}|\alpha(t)\rangle = \hat H|\alpha(t)\rangle, \tag{4.158}$$
[Schrödinger equation]
which is frequently more convenient. (However, for some purposes the notion of the time-evolution
operator, together with Eq. (157b), is useful – as we will see in a minute.) While Eq. (158) is a very
natural generalization of the wave-mechanical equation (1.25), and is also frequently called the
Schrödinger equation,30 it still should be considered as a new, more general postulate, which finds its
final justification (as it is usual in physics) in the agreement of its corollaries with experiment – more
exactly, in the absence of a single credible contradiction to an experiment.
Starting the discussion of Eq. (158), let us first consider the case of a time-independent
Hamiltonian, whose eigenstates an and eigenvalues En obey Eq. (68) for this operator:31
Hˆ an  En an , (4.159)
30 Moreover, we will be able to derive Eq. (1.25) from Eq. (158) – see below.
31 I have switched the state index notation from j to n, which was used for numbering stationary states in Chapter
1, to emphasize the special role played by the stationary states $a_n$ in quantum dynamics.
and hence are also time-independent. (Similarly to the wavefunctions $\psi_n$ defined by Eq. (1.60), $a_n$ are
called the stationary states of the system.) Let us use Eqs. (158)-(159) to calculate the law of time
evolution of the expansion coefficients $\alpha_n$ (i.e. the probability amplitudes) defined by Eq. (118), in a
stationary state basis:
$$\frac{d}{dt}\alpha_n(t) = \frac{d}{dt}\langle a_n|\alpha(t)\rangle = \langle a_n|\frac{d}{dt}|\alpha(t)\rangle = \langle a_n|\frac{1}{i\hbar}\hat H|\alpha(t)\rangle = \frac{E_n}{i\hbar}\langle a_n|\alpha(t)\rangle = -\frac{i}{\hbar}E_n\alpha_n. \tag{4.160}$$
This is the same simple equation as Eq. (1.61), and its integration, with the initial moment $t_0$ taken for 0,
yields a similar result – cf. Eq. (1.62):
[Time evolution of probability amplitudes]
$$\alpha_n(t) = \alpha_n(0)\exp\left\{-\frac{i}{\hbar}E_n t\right\}. \tag{4.161}$$
(For an arbitrary initial moment $t_0$, the time t in the exponent is replaced with $t - t_0$.)
In order to illustrate how this result works, let us consider the dynamics of a spin-½ in a
time-independent, uniform external magnetic field $\boldsymbol{\mathcal B}$. To construct the system’s Hamiltonian, we may apply
the correspondence principle to the classical expression for the energy of a magnetic moment $\mathbf m$ in the
external magnetic field $\boldsymbol{\mathcal B}$,32
$$U = -\mathbf m\cdot\boldsymbol{\mathcal B}. \tag{4.162}$$
In quantum mechanics, the operator corresponding to the moment $\mathbf m$ is given by Eq. (115) (suggested by
W. Pauli), so that the spin-field interaction is described by the so-called Pauli Hamiltonian, which may
be, due to Eqs. (116)-(117), represented in several equivalent forms:
[Pauli Hamiltonian: operator]
$$\hat H = -\hat{\mathbf m}\cdot\boldsymbol{\mathcal B} \equiv -\gamma\,\hat{\mathbf S}\cdot\boldsymbol{\mathcal B} \equiv -\gamma\frac{\hbar}{2}\,\hat{\boldsymbol\sigma}\cdot\boldsymbol{\mathcal B}. \tag{4.163a}$$
If the z-axis is aligned with the field’s direction, this expression is reduced to
$$\hat H = -\gamma\mathcal B\,\hat S_z \equiv -\gamma\mathcal B\,\frac{\hbar}{2}\hat\sigma_z. \tag{4.163b}$$
According to Eq. (117), in the z-basis of the spin states ↑ and ↓, the matrix of the operator (163b) is
[Pauli Hamiltonian: z-basis matrix]
$$\mathrm H = -\frac{\gamma\hbar\mathcal B}{2}\,\sigma_z \equiv \frac{\hbar\Omega}{2}\,\sigma_z, \quad\text{where } \Omega \equiv -\gamma\mathcal B. \tag{4.164}$$
The constant $\Omega$ so defined coincides with the classical frequency of the precession, about the z-axis, of
an axially-symmetric rigid body (the so-called symmetric top), with an angular momentum $\mathbf S$ and the
magnetic moment $\mathbf m = \gamma\mathbf S$, induced by the external torque $\boldsymbol\tau = \mathbf m\times\boldsymbol{\mathcal B}$.33 (For an electron, with its negative
gyromagnetic ratio $\gamma_e = -g_e e/2m_e$, neglecting the tiny difference of the $g_e$-factor from 2, we get
$$\Omega = \frac{e}{m_e}\mathcal B, \tag{4.165}$$
so that according to Eq. (3.48), the frequency $\Omega$ coincides with the electron’s cyclotron frequency $\omega_c$.)
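To get a feeling for the scale of Eq. (4.165), the following short sketch evaluates the precession frequency of an electron spin, and the corresponding energy $\hbar\Omega$, in a field of 1 T. The field value and the function names are illustrative assumptions; the constants are the standard SI values, and $g_e$ is taken to be exactly 2, as in the text.

```python
# Numerical illustration of Eq. (4.165) for an electron (g_e taken as exactly 2).
E_CHARGE = 1.602176634e-19      # elementary charge e, in coulombs
M_ELECTRON = 9.1093837015e-31   # electron mass m_e, in kilograms
HBAR = 1.054571817e-34          # reduced Planck constant, in joule-seconds


def precession_frequency(b_field):
    """Omega = (e / m_e) * B, Eq. (4.165); b_field is in teslas."""
    return E_CHARGE / M_ELECTRON * b_field


def quantum_energy_ev(b_field):
    """The energy hbar * Omega, converted from joules to electron-volts."""
    return HBAR * precession_frequency(b_field) / E_CHARGE


omega = precession_frequency(1.0)     # hypothetical field of 1 T
energy = quantum_energy_ev(1.0)
print(f"Omega = {omega:.4e} 1/s, hbar*Omega = {energy:.4e} eV")
```

Even in such a strong field, the energy $\hbar\Omega$ is only of the order of $10^{-4}$ eV, i.e. much smaller than the thermal energy scale at room temperature.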
32 See, e.g., EM Eq. (5.100). As a reminder, we have already used this expression for the derivation of Eq. (3).
33 See, e.g., CM Sec. 4.5, in particular Eq. (4.72), and EM Sec. 5.5, in particular Eq. (5.114) and its discussion.
In order to apply the general Eq. (161) to this case, we need to find the eigenstates $a_n$ and
eigenenergies $E_n$ of our Hamiltonian. However, with our (smart :-) choice of the z-axis, the Hamiltonian
matrix is already diagonal:
$$\mathrm H = \frac{\hbar\Omega}{2}\,\sigma_z \equiv \frac{\hbar\Omega}{2}\begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}, \tag{4.166}$$
meaning that the states ↑ and ↓ are the eigenstates of this system, with the eigenenergies, respectively,
[Spin-½ in magnetic field: eigenenergies]
$$E_\uparrow = +\frac{\hbar\Omega}{2} \quad\text{and}\quad E_\downarrow = -\frac{\hbar\Omega}{2}. \tag{4.167}$$
Note that their difference,
$$\Delta E \equiv \left|E_\uparrow - E_\downarrow\right| = \hbar|\Omega| = \hbar|\gamma\mathcal B|, \tag{4.168}$$
corresponds to the classical energy $2|m\mathcal B|$ of flipping a magnetic dipole with the moment’s magnitude
$m = \gamma\hbar/2$, oriented along the direction of the field $\boldsymbol{\mathcal B}$. Note also that if the product $\gamma\mathcal B$ is positive, then $\Omega$
is negative, so that $E_\uparrow$ is negative, while $E_\downarrow$ is positive. This is in agreement with the classical picture
of a magnetic dipole $\mathbf m$ having negative potential energy when it is aligned with the external magnetic
field $\boldsymbol{\mathcal B}$ – see Eq. (162) again.
So, for the time evolution of the probability amplitudes of these states, Eq. (161) immediately
yields the following expressions:
$$\alpha_\uparrow(t) = \alpha_\uparrow(0)\exp\left\{-\frac{i}{2}\Omega t\right\}, \qquad \alpha_\downarrow(t) = \alpha_\downarrow(0)\exp\left\{+\frac{i}{2}\Omega t\right\}, \tag{4.169}$$
allowing a ready calculation of the time evolution of the expectation values of any observable. In
particular, we can calculate the expectation value of $S_z$ as a function of time by applying Eq. (130) to the
(arbitrary) time moment t:
$$\langle S_z\rangle(t) = \frac{\hbar}{2}\left[\alpha_\uparrow(t)\alpha_\uparrow^*(t) - \alpha_\downarrow(t)\alpha_\downarrow^*(t)\right] = \frac{\hbar}{2}\left[\alpha_\uparrow(0)\alpha_\uparrow^*(0) - \alpha_\downarrow(0)\alpha_\downarrow^*(0)\right] = \langle S_z\rangle(0). \tag{4.170}$$
Thus the expectation value of the spin component parallel to the applied magnetic field remains constant
in time, regardless of the initial state of the system. However, this is not true for the components
perpendicular to the field. For example, Eq. (132), applied to the moment t, gives
$$\langle S_x\rangle(t) = \frac{\hbar}{2}\left[\alpha_\uparrow(t)\alpha_\downarrow^*(t) + \alpha_\downarrow(t)\alpha_\uparrow^*(t)\right] = \frac{\hbar}{2}\left[\alpha_\uparrow(0)\alpha_\downarrow^*(0)\,e^{-i\Omega t} + \alpha_\downarrow(0)\alpha_\uparrow^*(0)\,e^{+i\Omega t}\right]. \tag{4.171}$$
Clearly, this expression describes sinusoidal oscillations with frequency (164). The amplitude
and the phase of these oscillations depend on initial conditions. Indeed, solving Eqs. (132)-(133) for the
probability amplitude products, we get the following relations:
$$\hbar\,\alpha_\uparrow(t)\alpha_\downarrow^*(t) = \langle S_x\rangle(t) - i\langle S_y\rangle(t), \qquad \hbar\,\alpha_\downarrow(t)\alpha_\uparrow^*(t) = \langle S_x\rangle(t) + i\langle S_y\rangle(t), \tag{4.172}$$
valid for any time t. Plugging their values for t = 0 into Eq. (171), we get
$$\langle S_x\rangle(t) = \frac{1}{2}\left[\langle S_x\rangle(0) - i\langle S_y\rangle(0)\right]e^{-i\Omega t} + \frac{1}{2}\left[\langle S_x\rangle(0) + i\langle S_y\rangle(0)\right]e^{+i\Omega t} \equiv \langle S_x\rangle(0)\cos\Omega t - \langle S_y\rangle(0)\sin\Omega t. \tag{4.173}$$
An absolutely similar calculation using Eq. (133) gives
$$\langle S_y\rangle(t) = \langle S_y\rangle(0)\cos\Omega t + \langle S_x\rangle(0)\sin\Omega t. \tag{4.174}$$
These formulas show, for example, that if at moment t = 0 the spin’s state was ↑, i.e. $\langle S_x\rangle(0) = \langle S_y\rangle(0) = 0$,
then the oscillation amplitudes of both “lateral” components of the spin vanish. On the
other hand, if the spin was initially in the state →, i.e. had the definite, largest possible value of $S_x$, equal
to $\hbar/2$ (in classics, we would say “the spin-½ was oriented in the x-direction”), then both expectation
values $\langle S_x\rangle$ and $\langle S_y\rangle$ oscillate in time34 with this amplitude, and with the phase shift $\pi/2$ between them.
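The chain of Eqs. (4.169)-(4.174) is easy to check numerically. The sketch below (with the assumed units $\hbar = 1$, an arbitrary value of $\Omega$, and hypothetical function names) evolves the two probability amplitudes by their phase factors, calculates the spin averages directly from them, and compares the results with the precession formulas.

```python
import cmath
import math

HBAR = 1.0    # assumption: units with hbar = 1
OMEGA = 2.0   # illustrative precession frequency (arbitrary value)


def amplitudes(a_up0, a_down0, t):
    """Eq. (4.169): each amplitude only acquires the phase exp(-i E t / hbar)."""
    return (a_up0 * cmath.exp(-0.5j * OMEGA * t),
            a_down0 * cmath.exp(+0.5j * OMEGA * t))


def spin_averages(a_up, a_down):
    """<S_x>, <S_y>, <S_z> for the state a_up|up> + a_down|down>."""
    x = a_up * a_down.conjugate()
    s_x = HBAR * x.real        # (hbar/2)(x + x*)
    s_y = -HBAR * x.imag       # (i hbar/2)(x - x*)
    s_z = 0.5 * HBAR * (abs(a_up) ** 2 - abs(a_down) ** 2)
    return s_x, s_y, s_z


def precession_check(theta, phi, t):
    """Return the directly calculated averages and those given by
    Eqs. (4.173), (4.174), and (4.170), for a spin initially directed
    along the polar angles (theta, phi)."""
    a_up0 = math.cos(theta / 2)
    a_down0 = math.sin(theta / 2) * cmath.exp(1j * phi)
    sx0, sy0, sz0 = spin_averages(a_up0, a_down0)
    direct = spin_averages(*amplitudes(a_up0, a_down0, t))
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    formula = (sx0 * c - sy0 * s, sy0 * c + sx0 * s, sz0)
    return direct, formula


direct, formula = precession_check(theta=1.0, phi=0.7, t=0.3)
print(direct)
print(formula)
```

The two printed triples should coincide to within rounding errors for any initial orientation and any time.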
So, the quantum-mechanical results for the expectation values of the Cartesian components of
spin-½ are indistinguishable from the classical results for the precession, with the frequency $\Omega = -\gamma\mathcal B$,35
of a symmetric top with the angular momentum of magnitude $S = \hbar/2$, about the field’s direction (our
axis z), under the effect of an external torque $\boldsymbol\tau = \mathbf m\times\boldsymbol{\mathcal B}$ exerted by the field $\boldsymbol{\mathcal B}$ on the magnetic moment
$\mathbf m = \gamma\mathbf S$. Note, however, that the classical language does not describe the large quantum-mechanical
uncertainties of the components, obeying Eqs. (156), which are absent in the classical picture – at least
when it starts from a definite orientation of the angular momentum vector. Also, as we have seen in Sec.
3.5, the component $L_z$ of the angular momentum at the orbital motion of particles is always a multiple of
$\hbar$ – see, e.g., Eq. (3.139). As a result, the angular momentum of a spin-½ particle, with $S_z = \pm\hbar/2$, cannot
be explained by any summation of orbital angular moments of its hypothetical components, i.e. by any
internal rotation of the particle about its axis.
34 This is one more (hopefully, redundant :-) illustration of the difference between the averaging over the
statistical ensemble and that over time: in Eqs. (170), (173)-(174), and also in quite a few relations below, only
the former averaging has been performed, so the results are still functions of time.
35 Note that according to this relation, the gyromagnetic ratio $\gamma$ may be interpreted just as the angular frequency
of the spin precession per unit magnetic field – hence the name. In particular, for electrons, $|\gamma_e| \approx 1.761\times 10^{11}$ s⁻¹T⁻¹;
for protons, the ratio is much smaller, $\gamma_p \equiv g_p e/2m_p \approx 2.675\times 10^{8}$ s⁻¹T⁻¹, mostly because of their larger mass $m_p$, at
a g-factor of the same order as for the electron: $g_p \approx 5.586$. For heavier spin-½ particles, e.g., atomic nuclei with
such spin, the values of $\gamma$ are correspondingly smaller – e.g., $\gamma \approx 8.681\times 10^{6}$ s⁻¹T⁻¹ for the ⁵⁷Fe nucleus.
After this illustration, let us return to the discussion of the general Schrödinger equation (157b)
and prove the following fascinating fact: it is possible to write the general solution of this operator
equation. In the easiest case when the Hamiltonian is time-independent, this solution is an exact analog
of Eq. (161),
$$\hat u(t,t_0) = \hat u(t_0,t_0)\exp\left\{-\frac{i}{\hbar}\hat H\,(t-t_0)\right\} = \exp\left\{-\frac{i}{\hbar}\hat H\,(t-t_0)\right\}. \tag{4.175}$$
To start its proof we should, first of all, understand what a function (in this particular case, the exponent)
of an operator means. In the operator (and matrix) algebra, such nonlinear functions are defined by their
Taylor expansions; in particular, Eq. (175) means that
$$\hat u(t,t_0) = \hat I + \sum_{k=1}^{\infty}\frac{1}{k!}\left[-\frac{i}{\hbar}\hat H\,(t-t_0)\right]^k \equiv \hat I + \frac{1}{1!}\left(-\frac{i}{\hbar}\right)\hat H\,(t-t_0) + \frac{1}{2!}\left(-\frac{i}{\hbar}\right)^2\hat H^2(t-t_0)^2 + \frac{1}{3!}\left(-\frac{i}{\hbar}\right)^3\hat H^3(t-t_0)^3 + \ldots, \tag{4.176}$$
where $\hat H^2 \equiv \hat H\hat H$, $\hat H^3 \equiv \hat H\hat H\hat H$, etc. Working with such series of operator products is not as hard as one
could imagine, due to their regular structure. For example, let us differentiate both sides of Eq. (176)
over t, at constant $t_0$, at the last stage using this equality again – backward:
$$\frac{\partial}{\partial t}\hat u(t,t_0) = \hat 0 + \frac{1}{1!}\left(-\frac{i}{\hbar}\right)\hat H + \frac{1}{2!}\left(-\frac{i}{\hbar}\right)^2\hat H^2\,2(t-t_0) + \frac{1}{3!}\left(-\frac{i}{\hbar}\right)^3\hat H^3\,3(t-t_0)^2 + \ldots$$
$$\equiv \left(-\frac{i}{\hbar}\right)\hat H\left[\hat I + \frac{1}{1!}\left(-\frac{i}{\hbar}\right)\hat H\,(t-t_0) + \frac{1}{2!}\left(-\frac{i}{\hbar}\right)^2\hat H^2(t-t_0)^2 + \ldots\right] \equiv -\frac{i}{\hbar}\hat H\,\hat u(t,t_0), \tag{4.177}$$
so that the differential equation (157b) is indeed satisfied. On the other hand, Eq. (175) also satisfies the
initial condition
$$\hat u(t_0,t_0) = \hat u^\dagger(t_0,t_0) = \hat I \tag{4.178}$$
that immediately follows from the definition (157a) of the evolution operator. Thus, Eq. (175) indeed
gives the (unique) solution for the time evolution operator – in the Schrödinger picture.
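The definition of the operator exponent by its Taylor series can be tried out directly on 2×2 matrices. The following sketch (assuming $\hbar = 1$, $t_0 = 0$, and hypothetical helper names) sums a finite number of terms of the series (4.176) for two simple Hamiltonian matrices, so that the result may be compared with the known closed forms and checked for unitarity.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def taylor_evolution(h, t, hbar=1.0, terms=40):
    """Partial sum of the series (4.176), with t0 = 0, for a 2x2 matrix h."""
    x = [[-1j * t / hbar * h[i][j] for j in range(2)] for i in range(2)]
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]      # the k = 0 term: identity matrix
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term_k = term_(k-1) * x / k, i.e. x**k / k!
        term = [[entry / k for entry in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def max_deviation(a, b):
    """Largest modulus of the element-wise difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


omega, t = 2.0, 1.3                            # arbitrary illustrative values
h_z = [[omega / 2, 0.0], [0.0, -omega / 2]]    # (hbar*Omega/2) sigma_z
h_x = [[0.0, omega / 2], [omega / 2, 0.0]]     # (hbar*Omega/2) sigma_x
u_z = taylor_evolution(h_z, t)
u_x = taylor_evolution(h_x, t)
print(max_deviation(matmul(u_x, dagger(u_x)), [[1, 0], [0, 1]]))
```

For the diagonal matrix the series should reproduce the phase factors $e^{\mp i\Omega t/2}$, and for the non-diagonal one, the matrix $\mathrm I\cos(\Omega t/2) - i\sigma_x\sin(\Omega t/2)$; the printed deviation from unitarity should be at the rounding-error level.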
Now let us allow the operator $\hat H$ to be a function of time, but with the condition that its “values”
(in fact, operators) at different instants commute with each other:
$$\left[\hat H(t'), \hat H(t'')\right] = 0, \quad\text{for any } t', t''. \tag{4.179}$$
(An important non-trivial example of such a Hamiltonian is the time-dependent part of the Hamiltonian
of a particle, due to the effect of a classical, time-dependent, but position-independent force $\mathbf F(t)$,
$$\hat H_F = -\mathbf F(t)\cdot\hat{\mathbf r}. \tag{4.180}$$
Indeed, the radius vector’s operator $\hat{\mathbf r}$ does not depend explicitly on time and hence commutes with
itself, as well as with the c-numbers $\mathbf F(t')$ and $\mathbf F(t'')$.) In this case, it is sufficient to replace, in all the
above formulas, the product $\hat H\,(t - t_0)$ with the corresponding integral over time; in particular, Eq. (175)
is generalized as
[Evolution operator: explicit expression]
$$\hat u(t,t_0) = \exp\left\{-\frac{i}{\hbar}\int_{t_0}^{t}\hat H(t')\,dt'\right\}. \tag{4.181}$$
This replacement means that the first form of Eq. (176) should be replaced with
$$\hat u(t,t_0) = \hat I + \sum_{k=1}^{\infty}\frac{1}{k!}\left(-\frac{i}{\hbar}\right)^k\left[\int_{t_0}^{t}\hat H(t')\,dt'\right]^k \equiv \hat I + \sum_{k=1}^{\infty}\frac{1}{k!}\left(-\frac{i}{\hbar}\right)^k\int_{t_0}^{t}dt_1\int_{t_0}^{t}dt_2\ldots\int_{t_0}^{t}dt_k\,\hat H(t_1)\hat H(t_2)\ldots\hat H(t_k). \tag{4.182}$$
The proof that Eq. (182) satisfies Eq. (157b) is absolutely similar to the one carried out above.
We may now use Eq. (181) to show that the time-evolution operator remains unitary at any
moment, even for a time-dependent Hamiltonian, if it satisfies Eq. (179). Indeed, Eq. (181) yields
$$\hat u(t,t_0)\,\hat u^\dagger(t,t_0) = \exp\left\{-\frac{i}{\hbar}\int_{t_0}^{t}\hat H(t')\,dt'\right\}\exp\left\{+\frac{i}{\hbar}\int_{t_0}^{t}\hat H(t'')\,dt''\right\}. \tag{4.183}$$
Since each of these exponents may be represented with the Taylor series (182), and, thanks to Eq. (179),
different components of these sums may be swapped at will, the expression (183) may be manipulated
exactly as the product of c-number exponents, for example rewritten as
$$\hat u(t,t_0)\,\hat u^\dagger(t,t_0) = \exp\left\{-\frac{i}{\hbar}\left[\int_{t_0}^{t}\hat H(t')\,dt' - \int_{t_0}^{t}\hat H(t'')\,dt''\right]\right\} = \exp\{\hat 0\} = \hat I. \tag{4.184}$$
This property ensures, in particular, that the system state’s normalization does not depend on time:
$$\langle\alpha(t)|\alpha(t)\rangle = \langle\alpha(t_0)|\,\hat u^\dagger(t,t_0)\,\hat u(t,t_0)\,|\alpha(t_0)\rangle = \langle\alpha(t_0)|\alpha(t_0)\rangle. \tag{4.185}$$
The most difficult cases for the explicit solution of Eq. (158) are those where Eq. (179) is
violated.36 It may be proven that in these cases the integral limits in the last form of Eq. (182) should be
truncated (with the factor 1/k! simultaneously dropped, because only one of the k! possible time
orderings of the operator product is now retained), giving the so-called Dyson series
$$\hat u(t,t_0) = \hat I + \sum_{k=1}^{\infty}\left(-\frac{i}{\hbar}\right)^k\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\ldots\int_{t_0}^{t_{k-1}}dt_k\,\hat H(t_1)\hat H(t_2)\ldots\hat H(t_k). \tag{4.186}$$
(If Eq. (179) is satisfied, each such time-ordered integral equals 1/k! of the corresponding
unrestricted integral, so that Eq. (186) reduces to Eq. (182).) Since we would not have time/space to use this relation in our course, I will skip its proof.37
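The role of the commutation condition (4.179) may be illustrated with the simplest possible example, which is an assumption of this sketch rather than a system discussed in the text: a spin-½ whose Hamiltonian (in the units with $\hbar = 1$) equals $\mathbf n_1\cdot\boldsymbol\sigma$ during the first time interval of duration $\tau$, and $\mathbf n_2\cdot\boldsymbol\sigma$ during the second such interval. The exact evolution operator is then the product of two exponents, with the later one on the left, while the formal application of Eq. (4.181) gives a single exponent of the sum. The sketch uses the standard identity $\exp\{-i\varphi\,\mathbf n\cdot\boldsymbol\sigma\} = \mathrm I\cos\varphi - i\,\mathbf n\cdot\boldsymbol\sigma\sin\varphi$, valid for any unit vector $\mathbf n$.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_minus_i(n, angle):
    """exp(-i*angle*(n.sigma)) = I cos(angle) - i (n.sigma) sin(angle), |n| = 1."""
    nx, ny, nz = n
    c, s = math.cos(angle), math.sin(angle)
    return [[c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
            [-1j * s * (nx + 1j * ny), c + 1j * s * nz]]


def max_deviation(a, b):
    """Largest modulus of the element-wise difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def compare(n_first, n_second, tau):
    """H = n_first.sigma during (0, tau), then n_second.sigma during (tau, 2*tau).

    Returns the exact evolution operator (later factor on the left) and the
    result of using Eq. (4.181) regardless of whether Eq. (4.179) holds.
    The sum n_first + n_second is assumed to be nonzero.
    """
    exact = matmul(exp_minus_i(n_second, tau), exp_minus_i(n_first, tau))
    total = [a + b for a, b in zip(n_first, n_second)]   # time integral of H, per tau
    norm = math.sqrt(sum(c * c for c in total))
    naive = exp_minus_i([c / norm for c in total], norm * tau)
    return exact, naive


z_axis, x_axis = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
exact, naive = compare(z_axis, x_axis, 0.5)          # non-commuting pieces
exact_c, naive_c = compare(z_axis, z_axis, 0.5)      # commuting pieces
print(max_deviation(exact, naive), max_deviation(exact_c, naive_c))
```

For the commuting pieces the two results coincide, while for the non-commuting ones they differ already in the second order in $\tau$, which is exactly the situation that requires the time-ordered series (4.186).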
36 We will run into such situations in Chapter 7, but will not need to apply Eq. (186) there.
37 It may be found, for example, in Chapter 5 of J. Sakurai’s textbook – see References.
Let me now return to the general discussion of quantum dynamics to outline its alternative,
Heisenberg picture. For its introduction, let us recall that according to Eq. (125), in quantum mechanics
the expectation value of any observable A is a long bracket. Let us explore an even more general form of
such bracket:
$$\langle\alpha|\hat A|\beta\rangle. \tag{4.187}$$
(In some applications, the states α and β may be different.) As was discussed above, in the Schrödinger
picture the bra- and ket-vectors of the states evolve in time, while the operators of observables remain
time-independent (if they do not explicitly depend on time), so that Eq. (187), applied to a moment t,
may be represented as
$$\langle\alpha(t)|\hat A_{\rm S}|\beta(t)\rangle, \tag{4.188}$$
where the index “S” is added to emphasize the Schrödinger picture. Let us apply the evolution law
(157a) to the bra- and ket-vectors in this expression:
$$\langle\alpha(t)|\hat A_{\rm S}|\beta(t)\rangle = \langle\alpha(t_0)|\,\hat u^\dagger(t,t_0)\,\hat A_{\rm S}\,\hat u(t,t_0)\,|\beta(t_0)\rangle. \tag{4.189}$$
This equality means that if we form a long bracket with bra- and ket-vectors of the initial-time states,
together with the following time-dependent Heisenberg operator38
[Heisenberg operator]
$$\hat A_{\rm H}(t) \equiv \hat u^\dagger(t,t_0)\,\hat A_{\rm S}\,\hat u(t,t_0) = \hat u^\dagger(t,t_0)\,\hat A_{\rm H}(t_0)\,\hat u(t,t_0), \tag{4.190}$$
all experimentally measurable results will remain the same as in the Schrödinger picture:
[Heisenberg picture]
$$\langle\alpha(t)|\hat A|\beta(t)\rangle = \langle\alpha(t_0)|\hat A_{\rm H}(t,t_0)|\beta(t_0)\rangle. \tag{4.191}$$
38 Note that this strict relation is similar in structure to the first of the symbolic Eqs. (94), with the state bases {v}
and {u} loosely associated with the time moments, respectively, t and $t_0$.
For full clarity, let us see how the Heisenberg picture works for the same simple (but very
important!) problem of the spin-½ precession in a z-oriented magnetic field, described (in the z-basis) by
the Hamiltonian matrix (164). In that basis, Eq. (157b) for the time-evolution operator becomes
$$i\hbar\frac{\partial}{\partial t}\begin{pmatrix} u_{11} & u_{12}\\ u_{21} & u_{22}\end{pmatrix} = \frac{\hbar\Omega}{2}\begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}\begin{pmatrix} u_{11} & u_{12}\\ u_{21} & u_{22}\end{pmatrix} = \frac{\hbar\Omega}{2}\begin{pmatrix} u_{11} & u_{12}\\ -u_{21} & -u_{22}\end{pmatrix}. \tag{4.192}$$
We see that in this simple case the differential equations for different matrix elements of the evolution
operator matrix are decoupled, and readily solvable, using the universal initial conditions (178):39
$$\mathrm u(t,0) = \begin{pmatrix} e^{-i\Omega t/2} & 0\\ 0 & e^{i\Omega t/2}\end{pmatrix} \equiv \mathrm I\cos\frac{\Omega t}{2} - i\sigma_z\sin\frac{\Omega t}{2}. \tag{4.193}$$
39 We could of course use this solution, together with Eq. (157), to obtain all the above results for this system
within the Schrödinger picture. In our simple case, the use of Eqs. (161) for this purpose was more
straightforward, but in some cases, e.g., for some time-dependent Hamiltonians, an explicit calculation of the
time-evolution matrix may be the best (or even only practicable) way to proceed.
Now let us use them in Eq. (190) to calculate the Heisenberg-picture operators of spin
components – still in the z-basis. Dropping the index “H” for the notation brevity (the
Heisenberg-picture operators are clearly marked by their dependence on time anyway), we get
$$\mathrm S_x(t) = \mathrm u^\dagger(t,0)\,\mathrm S_x(0)\,\mathrm u(t,0) = \frac{\hbar}{2}\,\mathrm u^\dagger(t,0)\,\sigma_x\,\mathrm u(t,0) = \frac{\hbar}{2}\begin{pmatrix} e^{i\Omega t/2} & 0\\ 0 & e^{-i\Omega t/2}\end{pmatrix}\begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix} e^{-i\Omega t/2} & 0\\ 0 & e^{i\Omega t/2}\end{pmatrix}$$
$$= \frac{\hbar}{2}\begin{pmatrix} 0 & e^{i\Omega t}\\ e^{-i\Omega t} & 0\end{pmatrix} = \frac{\hbar}{2}\left[\sigma_x\cos\Omega t - \sigma_y\sin\Omega t\right] \equiv \mathrm S_x(0)\cos\Omega t - \mathrm S_y(0)\sin\Omega t. \tag{4.194}$$
Absolutely similar calculations of the other spin components yield
$$\mathrm S_y(t) = \frac{\hbar}{2}\begin{pmatrix} 0 & -ie^{i\Omega t}\\ ie^{-i\Omega t} & 0\end{pmatrix} = \frac{\hbar}{2}\left[\sigma_y\cos\Omega t + \sigma_x\sin\Omega t\right] \equiv \mathrm S_y(0)\cos\Omega t + \mathrm S_x(0)\sin\Omega t, \tag{4.195}$$
$$\mathrm S_z(t) = \frac{\hbar}{2}\begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix} = \frac{\hbar}{2}\,\sigma_z = \mathrm S_z(0). \tag{4.196}$$
One practical advantage of these formulas is that they describe the system’s evolution for
arbitrary initial conditions, thus making the analysis of initial state effects very simple. Indeed, since in
the Heisenberg picture the expectation values of observables are calculated using Eq. (191) (with β = α),
with time-independent bra- and ket-vectors, such averaging of Eqs. (194)-(196) immediately returns us
to Eqs. (170), (173), and (174), which were obtained above in the Schrödinger picture. Moreover, these
equations for the Heisenberg operators formally coincide with the classical equations of the
torque-induced precession for c-number variables. (Below we will see that the same exact correspondence is
valid for the Heisenberg picture of the orbital motion.)
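The matrix products in Eqs. (4.194)-(4.196) may be verified by a direct numerical multiplication. The sketch below (with hypothetical helper names, working with the Pauli matrices, i.e. in the units of $\hbar/2$) forms $\mathrm u^\dagger\sigma\mathrm u$ for each of the three matrices and allows a comparison with the right-hand sides of those equations.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def evolution(omega_t):
    """The matrix (4.193) as a function of the product Omega*t."""
    return [[cmath.exp(-0.5j * omega_t), 0], [0, cmath.exp(0.5j * omega_t)]]


def heisenberg(sigma, omega_t):
    """u^dagger sigma u, i.e. Eq. (4.190) in the units of hbar/2."""
    u = evolution(omega_t)
    return matmul(matmul(dagger(u), sigma), u)


def combine(a, coef_a, b, coef_b):
    """The linear combination coef_a * a + coef_b * b of two 2x2 matrices."""
    return [[coef_a * a[i][j] + coef_b * b[i][j] for j in range(2)]
            for i in range(2)]


def max_deviation(a, b):
    """Largest modulus of the element-wise difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


omega_t = 0.9                                  # arbitrary illustrative value
c, s = math.cos(omega_t), math.sin(omega_t)
dev_x = max_deviation(heisenberg(SIGMA_X, omega_t), combine(SIGMA_X, c, SIGMA_Y, -s))
dev_y = max_deviation(heisenberg(SIGMA_Y, omega_t), combine(SIGMA_Y, c, SIGMA_X, s))
dev_z = max_deviation(heisenberg(SIGMA_Z, omega_t), SIGMA_Z)
print(dev_x, dev_y, dev_z)
```

All three printed deviations should be at the rounding-error level for any value of $\Omega t$.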
In order to see that the last fact is by no means a coincidence, let us combine Eqs. (157b) and
(190) to form an explicit differential equation of the Heisenberg operator’s evolution. For that, let us
differentiate Eq. (190) over time:
$$\frac{d}{dt}\hat A_{\rm H} = \frac{\partial\hat u^\dagger}{\partial t}\hat A_{\rm S}\hat u + \hat u^\dagger\frac{\partial\hat A_{\rm S}}{\partial t}\hat u + \hat u^\dagger\hat A_{\rm S}\frac{\partial\hat u}{\partial t}. \tag{4.197}$$
Plugging in the derivatives of the time evolution operator from Eq. (157b) and its Hermitian conjugate,
and multiplying both sides of the equation by $i\hbar$, we get
$$i\hbar\frac{d}{dt}\hat A_{\rm H} = -\hat u^\dagger\hat H\hat A_{\rm S}\hat u + i\hbar\,\hat u^\dagger\frac{\partial\hat A_{\rm S}}{\partial t}\hat u + \hat u^\dagger\hat A_{\rm S}\hat H\hat u. \tag{4.198a}$$
If for the Schrödinger-picture’s Hamiltonian the condition similar to Eq. (179) is satisfied, then,
according to Eqs. (177) or (182), the Hamiltonian commutes with the time evolution operator and its
Hermitian conjugate, and may be swapped with any of them.40 Hence, we may rewrite Eq. (198a) as
$$i\hbar\frac{d}{dt}\hat A_{\rm H} = -\hat H\hat u^\dagger\hat A_{\rm S}\hat u + i\hbar\,\hat u^\dagger\frac{\partial\hat A_{\rm S}}{\partial t}\hat u + \hat u^\dagger\hat A_{\rm S}\hat u\hat H \equiv i\hbar\,\hat u^\dagger\frac{\partial\hat A_{\rm S}}{\partial t}\hat u + \left[\hat u^\dagger\hat A_{\rm S}\hat u,\ \hat H\right]. \tag{4.198b}$$
Now using the definition (190) again, for both terms on the right-hand side, we may write
[Heisenberg equation of motion]
$$i\hbar\frac{d}{dt}\hat A_{\rm H} = i\hbar\left(\frac{\partial\hat A}{\partial t}\right)_{\rm H} + \left[\hat A_{\rm H},\ \hat H\right]. \tag{4.199}$$
This is the so-called Heisenberg equation of motion.

40 Due to the same reason, $\hat H_{\rm H} \equiv \hat u^\dagger\hat H_{\rm S}\hat u = \hat u^\dagger\hat u\hat H_{\rm S} = \hat H_{\rm S}$; this is why the Hamiltonian operator’s index may be
dropped in Eqs. (198)-(199).
Let us see how this equation looks for the same problem of the spin-½ precession in a z-oriented,
time-independent magnetic field, described in the z-basis by the Hamiltonian matrix (164), which does
not depend on time. In this basis, Eq. (199) for the vector operator of spin reads41
$$i\hbar\begin{pmatrix} \dot{\mathbf S}_{11} & \dot{\mathbf S}_{12}\\ \dot{\mathbf S}_{21} & \dot{\mathbf S}_{22}\end{pmatrix} = \frac{\hbar\Omega}{2}\left[\begin{pmatrix} \mathbf S_{11} & \mathbf S_{12}\\ \mathbf S_{21} & \mathbf S_{22}\end{pmatrix}, \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}\right] = \hbar\Omega\begin{pmatrix} 0 & -\mathbf S_{12}\\ \mathbf S_{21} & 0\end{pmatrix}. \tag{4.200}$$
Once again, the equations for different matrix elements are decoupled, and their solution is elementary:
$$\mathbf S_{11}(t) = \mathbf S_{11}(0) = \text{const}, \qquad \mathbf S_{22}(t) = \mathbf S_{22}(0) = \text{const},$$
$$\mathbf S_{12}(t) = \mathbf S_{12}(0)\,e^{+i\Omega t}, \qquad \mathbf S_{21}(t) = \mathbf S_{21}(0)\,e^{-i\Omega t}. \tag{4.201}$$
According to Eq. (190), the initial values of the Heisenberg-picture matrix elements are just the
Schrödinger-picture ones, so that using Eq. (117) we may rewrite this solution in either of two forms:
$$\mathbf S(t) = \frac{\hbar}{2}\left[\mathbf n_x\begin{pmatrix} 0 & e^{+i\Omega t}\\ e^{-i\Omega t} & 0\end{pmatrix} + \mathbf n_y\begin{pmatrix} 0 & -ie^{+i\Omega t}\\ ie^{-i\Omega t} & 0\end{pmatrix} + \mathbf n_z\begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}\right]$$
$$\equiv \frac{\hbar}{2}\begin{pmatrix} \mathbf n_z & \mathbf n_-e^{+i\Omega t}\\ \mathbf n_+e^{-i\Omega t} & -\mathbf n_z\end{pmatrix}, \quad\text{where } \mathbf n_\pm \equiv \mathbf n_x \pm i\mathbf n_y. \tag{4.202}$$
The simplicity of the last expression is spectacular. (Remember, it covers any initial conditions
and all three spatial components of spin!) On the other hand, for some purposes the previous form may
be more convenient; in particular, its Cartesian components give our earlier results (194)-(196).42
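That the solution (4.202) does satisfy the Heisenberg equation of motion (4.199) may also be checked numerically. In the sketch below, the unit vectors of Eq. (4.202) are replaced with three arbitrary real coefficients, so that the matrix describes the spin’s projection on a fixed direction; this replacement, the units with $\hbar = 1$, and all names are assumptions made for the illustration. The time derivative is approximated by a central difference and compared with the commutator divided by $i\hbar$.

```python
import cmath

HBAR, OMEGA = 1.0, 2.0          # assumptions: hbar = 1, arbitrary Omega
H_MATRIX = [[HBAR * OMEGA / 2, 0.0], [0.0, -HBAR * OMEGA / 2]]   # Eq. (4.164)


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spin_matrix(n, t):
    """The last form of Eq. (4.202), with real numbers n = (nx, ny, nz)."""
    nx, ny, nz = n
    upper = 0.5 * HBAR * (nx - 1j * ny) * cmath.exp(+1j * OMEGA * t)
    lower = 0.5 * HBAR * (nx + 1j * ny) * cmath.exp(-1j * OMEGA * t)
    return [[0.5 * HBAR * nz, upper], [lower, -0.5 * HBAR * nz]]


def commutator_rate(s, h):
    """[S, H] / (i hbar): the time derivative prescribed by Eq. (4.199)."""
    sh, hs = matmul(s, h), matmul(h, s)
    return [[(sh[i][j] - hs[i][j]) / (1j * HBAR) for j in range(2)]
            for i in range(2)]


def numerical_rate(n, t, step=1e-5):
    """Central-difference approximation of dS/dt."""
    plus, minus = spin_matrix(n, t + step), spin_matrix(n, t - step)
    return [[(plus[i][j] - minus[i][j]) / (2 * step) for j in range(2)]
            for i in range(2)]


def max_deviation(a, b):
    """Largest modulus of the element-wise difference of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


n, t = (0.3, -0.5, 0.8), 0.7    # arbitrary illustrative values
mismatch = max_deviation(numerical_rate(n, t),
                         commutator_rate(spin_matrix(n, t), H_MATRIX))
print(mismatch)
```

The printed mismatch is limited only by the finite-difference error, and should be many orders of magnitude smaller than the matrix elements themselves.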
41 Using the commutation relations (155), this equation may be readily generalized to the case of an arbitrary
magnetic field $\boldsymbol{\mathcal B}(t)$ and an arbitrary state basis – the exercise highly recommended to the reader.
42 Note that the “values” of the same Heisenberg operator at different moments of time may or may not commute.
For example, consider a free 1D particle, with the time-independent Hamiltonian $\hat H = \hat p^2/2m$. In this case, Eq.
(199) yields the following equations: $i\hbar\,\dot{\hat x} = [\hat x, \hat H] = i\hbar\,\hat p/m$ and $i\hbar\,\dot{\hat p} = [\hat p, \hat H] = 0$, with simple solutions
(similar to those for the classical motion): $\hat p(t) = \text{const} = \hat p(0)$ and $\hat x(t) = \hat x(0) + \hat p(0)\,t/m$, so that
$[\hat x(0), \hat x(t)] = [\hat x(0), \hat p(0)]\,t/m \equiv [\hat x_{\rm S}, \hat p_{\rm S}]\,t/m = i\hbar t/m \ne 0$, for $t \ne 0$.
One of the advantages of the Heisenberg picture is that it provides a clearer link between
classical and quantum mechanics, found by P. Dirac. Indeed, analytical classical mechanics may be used
to derive the following equation of time evolution of an arbitrary function $A(q_j, p_j, t)$ of the generalized
coordinates $q_j$ and momenta $p_j$ of the system, and time t:43
$$\frac{dA}{dt} = \frac{\partial A}{\partial t} - \{A, H\}_{\rm P}, \tag{4.203}$$
where H is the classical Hamiltonian function of the system, and {..,..} is the so-called Poisson bracket
defined, for two arbitrary functions $A(q_j, p_j, t)$ and $B(q_j, p_j, t)$, as
[Poisson bracket]
$$\{A, B\}_{\rm P} \equiv \sum_j\left(\frac{\partial A}{\partial p_j}\frac{\partial B}{\partial q_j} - \frac{\partial A}{\partial q_j}\frac{\partial B}{\partial p_j}\right). \tag{4.204}$$
43 See, e.g., CM Eq. (10.17). The notation there does not use the subscript “P” that is employed in Eqs. (203)-
(205) to distinguish the classical Poisson bracket (204) from the quantum anticommutator (34).
Comparing Eq. (203) with Eq. (199), we see that the correspondence between the classical and quantum
mechanics (in the Heisenberg picture) is provided by the following symbolic relation
[Classical vs. quantum mechanics]
$$\{A, B\}_{\rm P} \leftrightarrow \frac{i}{\hbar}\left[\hat A, \hat B\right]. \tag{4.205}$$
This relation may be used, in particular, for finding appropriate operators for the system’s observables,
if their form is not immediately evident from the correspondence principle.
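As a quick sanity check of the sign convention in the symbolic relation (4.205), here is a short worked example, using only the definition (4.204) and the basic commutator $[\hat x, \hat p] = i\hbar$ of a Cartesian coordinate and the corresponding momentum (taken here as known from wave mechanics). The second line applies Eq. (4.203) to a free 1D particle with $H = p^2/2m$, as an assumed illustrative case:

```latex
\{x, p\}_{\rm P} = \frac{\partial x}{\partial p}\frac{\partial p}{\partial x}
                 - \frac{\partial x}{\partial x}\frac{\partial p}{\partial p} = 0 - 1 = -1,
\qquad
\frac{i}{\hbar}\left[\hat x, \hat p\right] = \frac{i}{\hbar}\, i\hbar = -1 ;
\\
\frac{dx}{dt} = -\{x, H\}_{\rm P}
  = -\left(\frac{\partial x}{\partial p}\frac{\partial H}{\partial x}
         - \frac{\partial x}{\partial x}\frac{\partial H}{\partial p}\right)
  = \frac{\partial H}{\partial p} = \frac{p}{m} .
```

Both sides of the correspondence give the same number, and the classical equation of motion has the same form as the Heisenberg-picture operator equation for the free particle.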
Finally, let us discuss one more alternative picture of quantum dynamics. It is also attributed to
Dirac, and is called either the “Dirac picture”, or (more frequently) the interaction picture. The last
name stems from the fact that this picture is very useful for the perturbative (approximate) approaches
to systems whose Hamiltonians may be partitioned into two parts,
$$\hat H = \hat H_0 + \hat H_{\rm int}, \tag{4.206}$$
where $\hat H_0$ is the sum of relatively simple Hamiltonians of the component subsystems, while the second
term in Eq. (206) represents their weak interaction. (Note, however, that all relations in the balance of
this section are exact and not directly based on the interaction weakness.) In this case, it is natural to
consider, together with the full operator $\hat u(t,t_0)$ of the system’s evolution, which obeys Eq. (157b), a
similarly defined unitary operator $\hat u_0(t,t_0)$ of evolution of the “unperturbed system” described by the
Hamiltonian $\hat H_0$ alone:
$$i\hbar\frac{\partial}{\partial t}\hat u_0 = \hat H_0\hat u_0, \tag{4.207}$$
and also the following interaction evolution operator,
[Interaction evolution operator]
$$\hat u_{\rm I} \equiv \hat u_0^\dagger\,\hat u. \tag{4.208}$$
The motivation for these definitions becomes clearer if we insert the reciprocal relation,
$$\hat u \equiv \hat u_0\hat u_0^\dagger\hat u \equiv \hat u_0\hat u_{\rm I}, \tag{4.209}$$
and its Hermitian conjugate,
$$\hat u^\dagger = \left(\hat u_0\hat u_{\rm I}\right)^\dagger = \hat u_{\rm I}^\dagger\hat u_0^\dagger, \tag{4.210}$$
into the basic Eq. (189):
$$\langle\alpha|\hat A|\beta\rangle = \langle\alpha(t_0)|\,\hat u^\dagger(t,t_0)\,\hat A_{\rm S}\,\hat u(t,t_0)\,|\beta(t_0)\rangle = \langle\alpha(t_0)|\,\hat u_{\rm I}^\dagger(t,t_0)\,\hat u_0^\dagger(t,t_0)\,\hat A_{\rm S}\,\hat u_0(t,t_0)\,\hat u_{\rm I}(t,t_0)\,|\beta(t_0)\rangle. \tag{4.211}$$
This relation shows that any long bracket (187), i.e. any experimentally verifiable result of
quantum mechanics, may be expressed as
$$\langle\alpha|\hat A|\beta\rangle = \langle\alpha_{\rm I}(t)|\hat A_{\rm I}(t)|\beta_{\rm I}(t)\rangle, \tag{4.212}$$
if we assume that both the state vectors and the operators depend on time, with the vectors evolving only
due to the interaction operator $\hat u_{\rm I}$,
[Interaction picture: state vectors]
$$\langle\alpha_{\rm I}(t)| \equiv \langle\alpha(t_0)|\,\hat u_{\rm I}^\dagger(t,t_0), \qquad |\beta_{\rm I}(t)\rangle \equiv \hat u_{\rm I}(t,t_0)\,|\beta(t_0)\rangle, \tag{4.213}$$
while the operators’ evolution is governed by the unperturbed operator $\hat u_0$:
[Interaction picture: operators]
$$\hat A_{\rm I}(t) \equiv \hat u_0^\dagger(t,t_0)\,\hat A_{\rm S}\,\hat u_0(t,t_0). \tag{4.214}$$
These relations describe the interaction picture of quantum dynamics. Let me defer an example
of its use until the perturbative analysis of open quantum systems in Sec. 7.6, and end this section with a
proof that the interaction evolution operator (208) satisfies the following natural equation,
$$i\hbar\frac{\partial}{\partial t}\hat u_{\rm I} = \hat H_{\rm I}\hat u_{\rm I}, \tag{4.215}$$
where $\hat H_{\rm I}$ is the interaction Hamiltonian formed from $\hat H_{\rm int}$ in accordance with the same rule (214):
$$\hat H_{\rm I}(t) \equiv \hat u_0^\dagger(t,t_0)\,\hat H_{\rm int}\,\hat u_0(t,t_0). \tag{4.216}$$
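The proof is very straightforward: first using the definition (208), and then Eqs. (157b) and the
Hermitian conjugate of Eq. (207), we may write
$$i\hbar\frac{\partial}{\partial t}\hat u_{\rm I} \equiv i\hbar\frac{\partial}{\partial t}\left(\hat u_0^\dagger\hat u\right) \equiv i\hbar\frac{\partial\hat u_0^\dagger}{\partial t}\hat u + \hat u_0^\dagger\,i\hbar\frac{\partial\hat u}{\partial t} = -\hat H_0\hat u_0^\dagger\hat u + \hat u_0^\dagger\hat H\hat u = -\hat H_0\hat u_0^\dagger\hat u + \hat u_0^\dagger\left(\hat H_0 + \hat H_{\rm int}\right)\hat u$$
$$\equiv -\hat H_0\hat u_0^\dagger\hat u + \hat u_0^\dagger\hat H_0\hat u + \hat u_0^\dagger\hat H_{\rm int}\hat u \equiv \left(-\hat H_0\hat u_0^\dagger + \hat u_0^\dagger\hat H_0\right)\hat u + \hat u_0^\dagger\hat H_{\rm int}\hat u. \tag{4.217}$$
Since $\hat u_0^\dagger$ may be represented as an exponent of a time integral of $\hat H_0$ (similarly to Eq. (181) relating
$\hat u$ and $\hat H$), these operators commute, so that the parentheses in the last form of Eq. (217) vanish. Now
plugging $\hat u$ from the last form of Eq. (209), we get the equation,
$$i\hbar\frac{\partial}{\partial t}\hat u_{\rm I} = \hat u_0^\dagger\hat H_{\rm int}\hat u_0\hat u_{\rm I} \equiv \left(\hat u_0^\dagger\hat H_{\rm int}\hat u_0\right)\hat u_{\rm I}, \tag{4.218}$$
which is clearly equivalent to the combination of Eqs. (215) and (216).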

As Eq. (215) shows, if the energy scale of the interaction $H_{\rm int}$ is much smaller than that of the
background Hamiltonian $H_0$, the interaction evolution operators $\hat u_{\rm I}$ and $\hat u_{\rm I}^\dagger$, and hence the state vectors
(213) evolve relatively slowly, without fast background oscillations. This is very convenient for the
perturbative approaches to complex interacting systems, in particular to the “open” quantum systems
that weakly interact with their environment – see Sec. 7.6.
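Equation (4.215) may be checked numerically for a simple model, which is an assumption of this sketch rather than an example from the text: a spin-½ with $\hat H_0 = (\hbar\Omega/2)\hat\sigma_z$ and $\hat H_{\rm int} = \varepsilon\hat\sigma_x$, in the units with $\hbar = 1$. Both evolution operators are then known in closed form, via the standard identity $\exp\{-i\varphi\,\mathbf n\cdot\boldsymbol\sigma\} = \mathrm I\cos\varphi - i\,\mathbf n\cdot\boldsymbol\sigma\sin\varphi$, so that the time derivative of the operator (4.208) may be approximated by a central difference and compared with $\hat H_{\rm I}\hat u_{\rm I}$.

```python
import math

OMEGA, EPS = 2.0, 0.3   # hypothetical parameters: H0 = (OMEGA/2) sigma_z, Hint = EPS sigma_x


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def exp_minus_i(n, angle):
    """exp(-i*angle*(n.sigma)) = I cos(angle) - i (n.sigma) sin(angle), |n| = 1."""
    nx, ny, nz = n
    c, s = math.cos(angle), math.sin(angle)
    return [[c - 1j * s * nz, -1j * s * (nx - 1j * ny)],
            [-1j * s * (nx + 1j * ny), c + 1j * s * nz]]


def u_full(t):
    """Evolution operator for the full Hamiltonian H0 + Hint, with t0 = 0."""
    w = math.hypot(EPS, OMEGA / 2)
    return exp_minus_i((EPS / w, 0.0, OMEGA / 2 / w), w * t)


def u_0(t):
    """Evolution operator for the unperturbed Hamiltonian H0 alone."""
    return exp_minus_i((0.0, 0.0, 1.0), OMEGA * t / 2)


def u_int(t):
    """Interaction evolution operator, Eq. (4.208)."""
    return matmul(dagger(u_0(t)), u_full(t))


def h_int_picture(t):
    """Interaction Hamiltonian, Eq. (4.216)."""
    h_int = [[0.0, EPS], [EPS, 0.0]]
    return matmul(matmul(dagger(u_0(t)), h_int), u_0(t))


def residual(t, step=1e-5):
    """Largest element of i d(u_I)/dt - H_I u_I, with a central-difference derivative."""
    plus, minus = u_int(t + step), u_int(t - step)
    rhs = matmul(h_int_picture(t), u_int(t))
    return max(abs(1j * (plus[i][j] - minus[i][j]) / (2 * step) - rhs[i][j])
               for i in range(2) for j in range(2))


print(residual(0.8))
```

The printed residual is limited only by the finite-difference error. Reducing `EPS` makes the matrix returned by `u_int` change more slowly in time, in accordance with the remark above.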
4.7. Coordinate and momentum representations

Now let me show that in application to the orbital motion of a particle, the bra-ket formalism
naturally reduces to the notions and postulates of wave mechanics, which were discussed in Chapter 1.
For that, we first have to modify some of the above formulas for the case of a basis with a continuous
spectrum of eigenvalues. In that case, it is more appropriate to replace discrete indices, such as j, j’, etc.
broadly used above, with the corresponding eigenvalue – just as it was done earlier for functions of the
wave vector – see, e.g., Eqs. (1.88), (2.20), etc. For example, the key Eq. (68), defining the eigenkets
and eigenvalues of an operator, may be conveniently rewritten in the form
$$\hat A|a_A\rangle = A|a_A\rangle. \tag{4.219}$$
More substantially, all sums over such continuous eigenstate sets should be replaced with
integrals. For example, for a full and orthonormal set of the continuous eigenstates $|a_A\rangle$, the closure
relation (44) should be replaced with
[Continuous spectrum: closure relation]
$$\int dA\,|a_A\rangle\langle a_A| = \hat I, \tag{4.220}$$
where the integral is over the whole interval of possible eigenvalues of the observable A.44 Applying this
relation to the ket-vector of an arbitrary state α, we get the following replacement of Eq. (37):
$$|\alpha\rangle \equiv \hat I|\alpha\rangle = \int dA\,|a_A\rangle\langle a_A|\alpha\rangle = \int dA\,\langle a_A|\alpha\rangle\,|a_A\rangle. \tag{4.221}$$
For the particular case when $|\alpha\rangle = |a_{A'}\rangle$, this relation requires that
[Continuous spectrum: state orthonormality]
$$\langle a_A|a_{A'}\rangle = \delta(A - A'); \tag{4.222}$$
this formula replaces the orthonormality condition (38).
According to Eq. (221), in the continuous case the bracket $\langle a_A|\alpha\rangle$ still plays the role of probability
amplitude, i.e. a complex c-number whose modulus squared determines the state $a_A$’s probability – see
the last form of Eq. (120). However, for a continuous observable, the probability to find the system
exactly in a particular state is infinitesimal; instead, we should speak about the probability $dW = w(A)\,dA$
of finding the observable within a small interval $dA \ll A$ near the value A, with probability density
$w(A) \propto |\langle a_A|\alpha\rangle|^2$. The coefficient in this relation may be found by making a similar change from the
summation to integration in the normalization condition (121):
$$\int dA\,\langle\alpha|a_A\rangle\langle a_A|\alpha\rangle = 1. \tag{4.223}$$
Since the total probability $\int w(A)\,dA$ of the system to be in some state should be equal to 1, this means that
[Continuous spectrum: probability density]
$$w(A) = \langle\alpha|a_A\rangle\langle a_A|\alpha\rangle \equiv \left|\langle a_A|\alpha\rangle\right|^2. \tag{4.224}$$
44 The generalization to cases when the eigenvalue spectrum consists of both a continuum interval plus some set
of discrete values, is straightforward, though leads to somewhat cumbersome formulas.
Now let us see how we can calculate the expectation values of continuous observables, i.e. their
ensemble averages. If we speak about the same observable A whose eigenstates are used as the
continuous basis (or any compatible observable), everything is simple. Indeed, inserting Eq. (224) into
the general statistical relation
$$\langle A\rangle = \int w(A)\,A\,dA, \tag{4.225}$$
which is just the obvious continuous version of Eq. (1.37), we get
$$\langle A\rangle = \int\langle\alpha|a_A\rangle\,A\,\langle a_A|\alpha\rangle\,dA. \tag{4.226}$$
Inserting a delta-function to represent this expression as a formally double integral,
$$\langle A\rangle = \int dA\int dA'\,\langle\alpha|a_A\rangle\,A\,\delta(A - A')\,\langle a_{A'}|\alpha\rangle, \tag{4.227}$$
and using the continuous-spectrum version of Eq. (98),
$$\langle a_A|\hat A|a_{A'}\rangle = A\,\delta(A - A'), \tag{4.228}$$
we may write
$$\langle A\rangle = \int dA\int dA'\,\langle\alpha|a_A\rangle\langle a_A|\hat A|a_{A'}\rangle\langle a_{A'}|\alpha\rangle \equiv \langle\alpha|\hat A|\alpha\rangle, \tag{4.229}$$
so that Eq. (4.125) remains valid in the continuous-spectrum case without any changes.
The situation is a bit more complicated for the expectation values of an operator that does not
commute with the basis-generating operator, because its matrix in that basis may not be diagonal. We
will consider (and overcome :-) this technical difficulty very soon, but otherwise we are ready for a
discussion of the relation between the bra-ket formalism and the wave mechanics. (For the notation
simplicity I will discuss its 1D version; its generalization to 2D and 3D cases is straightforward.)
Let us postulate the (intuitively almost evident) existence of a quantum state basis, whose ket-vectors
will be called $|x\rangle$, corresponding to a certain definite value x of the particle’s coordinate. Writing
the following trivial identity:
$$x|x\rangle = x|x\rangle, \tag{4.230}$$
and comparing this relation with Eq. (219), we see that they do not contradict each other if we assume
that x on the left-hand side of this relation is the (Hermitian) operator $\hat x$ of particle’s coordinate, whose
action on a ket- (or bra-) vector is just its multiplication by the c-number x. (This looks like a proof, but
is actually a separate, independent postulate, no matter how plausible.) Hence we may consider vectors
$|x\rangle$ as the eigenstates of the operator $\hat x$. Let me hope that the reader will excuse me if I do not pursue
here a strict proof that this set is full and orthogonal,45 so that we may apply to them Eq. (222):
$$\langle x|x'\rangle = \delta(x - x'). \tag{4.231}$$
Using this basis is called the coordinate representation – the term which was already used at the end of
Sec. 1.1, but without explanation.
In the basis of the x-states, the inner product $\langle a_A|\alpha(t)\rangle$ becomes $\langle x|\alpha(t)\rangle$, and Eq. (223) takes the
following form:
$$w(x, t) = \langle\alpha(t)|x\rangle \langle x|\alpha(t)\rangle = \langle x|\alpha(t)\rangle^{*}\, \langle x|\alpha(t)\rangle . \tag{4.232}$$
Comparing this formula with the basic postulate (1.22) of wave mechanics, we see that they coincide if
the wavefunction of a time-dependent state $\alpha$ is identified with that short bracket:46
$$\Psi_\alpha(x, t) \equiv \langle x|\alpha(t)\rangle . \qquad \text{[wavefunction as inner product]} \tag{4.233}$$
This key formula provides the desired connection between the bra-ket formalism and the wave
mechanics, and should not be too surprising for the (thoughtful :-) reader. Indeed, Eq. (45) shows that
any inner product of two state vectors describing two states is a measure of their coincidence – just as
the scalar product of two geometric vectors is; the orthonormality condition (38) is a particular
manifestation of this fact. In this language, the particular value (233) of a wavefunction $\Psi_\alpha$ at some
point x and moment t characterizes “how much of a particular coordinate x” the state $\alpha$ contains at
time t. (Of course, this informal language is too crude to reflect the fact that $\Psi_\alpha(x, t)$ is a complex
function, which has not only a modulus but also an argument – the quantum-mechanical phase.)
45 Such proof is rather involved mathematically, but physically this fact should be evident.
46 I do not quite like expressions like $\langle x|\Psi\rangle$ used in some papers and even textbooks. Of course, one is free to
replace $\alpha$ with any other letter ($\Psi$ including) to denote a quantum state, but then it is better not to use the same
letter to denote the wavefunction, i.e. an inner product of two state vectors, to avoid confusion.
Now let us rewrite the most important formulas of the bra-ket formalism in the wave mechanics
notation. Inner-multiplying both parts of Eq. (219) by the bra-vector $\langle x|$, and then inserting into the
left-hand side of that relation the identity operator in the form (220) for coordinate x’, we get
$$\int dx'\, \langle x|\hat{A}|x'\rangle \langle x'|a_A\rangle = A\, \langle x|a_A\rangle , \tag{4.234}$$
i.e., using the wavefunction’s definition (233),
$$\int dx'\, \langle x|\hat{A}|x'\rangle\, \Psi_A(x') = A\, \Psi_A(x) , \tag{4.235}$$
where, for the notation brevity, the time dependence of the wavefunction is just implied (with the capital
$\Psi$ serving as a reminder of this fact), and will be restored when needed.
For a general operator, we would have to stop here, because if it does not commute with the
coordinate operator, its matrix in the x-basis is not diagonal, and the integral on the left-hand side of Eq.
(235) cannot be worked out explicitly. However, virtually all quantum-mechanical operators discussed
in this course47 are (space-) local: they depend on only one spatial coordinate, say x. For such operators,
the left-hand side of Eq. (235) may be further transformed as
$$\int \langle x|\hat{A}|x'\rangle\, \Psi(x')\, dx' = \int \langle x|x'\rangle\, \hat{A}\, \Psi(x')\, dx' = \hat{A} \int \delta(x - x')\, \Psi(x')\, dx' = \hat{A}\, \Psi(x) . \tag{4.236}$$
The first step in this transformation may appear as elementary as the last two, with the ket-vector $|x'\rangle$
swapped with the operator depending only on x; however, due to the delta-functional character of the
bracket (231), this step is, in fact, an additional postulate, so that the second equality in Eq. (236)
essentially defines the coordinate representation of the local operator, whose explicit form still needs to
be determined.
Let us consider, for example, the 1D version of the Hamiltonian (1.41),
$$\hat{H} = \frac{\hat{p}_x^2}{2m} + U(\hat{x}) , \tag{4.237}$$
which was the basis of all our discussions in Chapter 2. Its potential-energy part U (which may be time-
dependent as well) commutes with the operator x̂ , i.e. its matrix in the x-basis has to be diagonal. For
such an operator, the transformation (236) is indeed trivial, and its coordinate representation is given
merely by the c-number function U(x).
47 The only substantial exception is the statistical operator $\hat{w}(x, x')$, to be discussed in Chapter 7.
The situation with the momentum operator $\hat{p}_x$ (and hence the kinetic energy $\hat{p}_x^2/2m$), not
commuting with $\hat{x}$, is less evident. Let me show that its coordinate representation is given by the 1D
version of Eq. (1.26), if we postulate that the commutation relation (2.14),
$$[\hat{x}, \hat{p}_x] = i\hbar\hat{I}, \quad \text{i.e.} \quad \hat{x}\hat{p}_x - \hat{p}_x\hat{x} = i\hbar\hat{I} , \tag{4.238}$$
is valid in any representation.48 For that, let us consider the following matrix element,
$\langle x|\hat{x}\hat{p}_x - \hat{p}_x\hat{x}|x'\rangle$. On one hand, we may use Eq. (238), and then Eq. (231), to write
$$\langle x|\hat{x}\hat{p}_x - \hat{p}_x\hat{x}|x'\rangle = \langle x|i\hbar\hat{I}|x'\rangle = i\hbar\,\langle x|x'\rangle = i\hbar\,\delta(x - x') . \tag{4.239}$$
On the other hand, since $\hat{x}|x'\rangle = x'|x'\rangle$ and $\langle x|\hat{x} = \langle x|x$, we may represent the same matrix element as
$$\langle x|\hat{x}\hat{p}_x - \hat{p}_x\hat{x}|x'\rangle = \langle x|x\hat{p}_x - \hat{p}_x x'|x'\rangle = (x - x')\,\langle x|\hat{p}_x|x'\rangle . \tag{4.240}$$
Comparing Eqs. (239) and (240), we get
$$\langle x|\hat{p}_x|x'\rangle = i\hbar\,\frac{\delta(x - x')}{x - x'} . \tag{4.241}$$
As it follows from the definition of the delta function,49 all expressions involving it acquire final sense
only at their integration, in our current case, at that described by Eq. (236). Plugging Eq. (241) into the
left-hand side of that relation, we get
$$\int \langle x|\hat{p}_x|x'\rangle\, \Psi(x')\, dx' = i\hbar \int \frac{\delta(x - x')}{x - x'}\, \Psi(x')\, dx' . \tag{4.242}$$
Since the right-hand-part integral is contributed only by an infinitesimal vicinity of the point x’ = x, we
may calculate it by expanding the continuous wavefunction $\Psi(x')$ into the Taylor series in small (x’ – x),
and keeping only two leading terms of the series, so that Eq. (242) is reduced to
$$\int \langle x|\hat{p}_x|x'\rangle\, \Psi(x')\, dx' = i\hbar \left[ \Psi(x) \int \frac{\delta(x - x')}{x - x'}\, dx' + \int \frac{\delta(x - x')}{x - x'} \left.\frac{\partial \Psi(x')}{\partial x'}\right|_{x'=x} (x' - x)\, dx' \right] . \tag{4.243}$$
Since the delta function may be always understood as an even function of its argument, in our case of
(x – x’), the first term on the right-hand side is proportional to an integral of an odd function in symmetric
limits and is equal to zero, while in the second term the ratio (x’ – x)/(x – x’) equals –1, and we get50
$$\int \langle x|\hat{p}_x|x'\rangle\, \Psi(x')\, dx' = -i\hbar\, \frac{\partial \Psi}{\partial x} . \tag{4.244}$$
Comparing this expression with the right-hand side of Eq. (236), we see that in the coordinate
representation we indeed get the 1D version of Eq. (1.26), which was used so much in Chapter 2,51
$$\hat{p}_x = -i\hbar\, \frac{\partial}{\partial x} . \tag{4.245}$$
48 Another possible approach to the wave mechanics axiomatics is to derive Eq. (238) by postulating the form,
$\hat{T}_X = \exp\{-i\hat{p}_x X/\hbar\}$, of the operator that shifts any wavefunction by distance X along the axis x. In my
approach, this expression will be derived when we need it (in Sec. 5.5), while Eq. (238) is postulated.
49 If necessary, please revisit MA Sec. 14.
50 One more useful expression of this type, which may be proved similarly, is $(\partial/\partial x)\,\delta(x - x') = \delta(x - x')\,\partial/\partial x'$.
51 This means, in particular, that in the sense of Eq. (236), the operator of differentiation is local, despite the fact
that its action on a function f may be interpreted as the limit of the fraction $\Delta f/\Delta x$, involving two points. (In some
axiomatic systems, local operators are defined as arbitrary polynomials of functions and their derivatives.)
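As a rough numerical illustration of Eqs. (238) and (245), the following minimal sketch applies the finite-difference version of $-i\hbar\,\partial/\partial x$ to a smooth test wavefunction and checks that the commutator of $\hat{x}$ and $\hat{p}_x$ acts on it as multiplication by $i\hbar$. The units ($\hbar = 1$), the grid, the test function, and the helper name `p_op` are all assumptions made only for this example.

```python
import numpy as np

hbar = 1.0  # assumption: units with hbar = 1

def p_op(psi, dx):
    """Finite-difference sketch of Eq. (4.245): p_x -> -i*hbar*d/dx."""
    return -1j * hbar * np.gradient(psi, dx)

# Hypothetical grid and smooth, well-localized test wavefunction
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2) * np.exp(0.7j * x)

# (x p - p x) acting on psi; Eq. (4.238) predicts the result i*hbar*psi
comm = x * p_op(psi, dx) - p_op(x * psi, dx)

# Skip the two endpoints, where np.gradient uses one-sided differences
inner = slice(1, -1)
max_err = np.max(np.abs(comm[inner] - 1j * hbar * psi[inner]))
print(f"max |[x,p]psi - i*hbar*psi| = {max_err:.2e}")
```

The residual error is of the order of the squared grid step, as expected for central differences; it is a property of the discretization, not of the commutation relation itself.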
It is straightforward to show (and is virtually evident) that the coordinate representation of any
operator function $f(\hat{p}_x)$ is
$$f\!\left(-i\hbar\, \frac{\partial}{\partial x}\right) . \tag{4.246}$$
In particular, this pertains to the kinetic energy operator in Eq. (237), so the coordinate representation of
this Hamiltonian also takes the very familiar form:
$$\hat{H} = \frac{1}{2m}\left(-i\hbar\, \frac{\partial}{\partial x}\right)^{2} + U(x, t) = -\frac{\hbar^2}{2m}\, \frac{\partial^2}{\partial x^2} + U(x, t) . \tag{4.247}$$
Now returning to the discussion of the general Eq. (235), and comparing its last form with that of
Eq. (236), we see that for a local operator in the coordinate representation, the eigenproblem (219) takes
the form
$$\hat{A}\, \Psi_A(x) = A\, \Psi_A(x) , \qquad \text{[eigenproblem in x-representation]} \tag{4.248}$$
even if the operator $\hat{A}$ does not commute with the operator $\hat{x}$. The most important case of this
coordinate-representation form of the eigenproblem (68) is the familiar Eq. (1.60) for the eigenvalues $E_n$
of the energy of a system with a time-independent Hamiltonian.
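To make the eigenproblem (248) more tangible, here is a minimal numerical sketch for the Hamiltonian (247) with U = 0 between two hard walls, discretized by finite differences and diagonalized. The units ($\hbar = m = 1$), the well width, the grid size, and all variable names are assumptions of this illustration; the reference values are the familiar levels $E_n = (\pi\hbar n/L)^2/2m$ of such a well.

```python
import numpy as np

hbar = m = 1.0        # assumption: illustrative units
L, N = 1.0, 500       # hypothetical well width and number of interior grid points
x = np.linspace(0.0, L, N + 2)[1:-1]   # interior points; Psi = 0 at both walls
dx = x[1] - x[0]

# -(hbar^2/2m) d^2/dx^2 as a tridiagonal matrix (second-order central differences)
main = np.full(N, 2.0)
off = np.full(N - 1, -1.0)
H = (hbar**2 / (2 * m * dx**2)) * (np.diag(main) + np.diag(off, 1) + np.diag(off, -1))

# Potential energy: in the coordinate representation just a c-number function U(x)
U = np.zeros(N)
H += np.diag(U)

E_num = np.linalg.eigvalsh(H)[:3]                       # three lowest eigenvalues
E_exact = (np.pi * hbar * np.arange(1, 4) / L)**2 / (2 * m)
print("numerical:", E_num)
print("exact:    ", E_exact)
```

Replacing the array `U` with any other function of `x` gives the corresponding discretized version of Eq. (247); the accuracy then depends on how well the grid resolves the potential.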
The operator locality also simplifies the expression for its expectation value. Indeed, plugging
the closure relation in the form (231) into the general Eq. (125) twice (written in the first case for x and
in the second case for x’), we get
$$\langle A\rangle = \int dx \int dx'\, \langle\alpha(t)|x\rangle \langle x|\hat{A}|x'\rangle \langle x'|\alpha(t)\rangle = \int dx \int dx'\, \Psi_\alpha^{*}(x, t)\, \langle x|\hat{A}|x'\rangle\, \Psi_\alpha(x', t) . \tag{4.249}$$
Now, Eq. (236) reduces this result to just
$$\langle A\rangle = \int dx \int dx'\, \Psi_\alpha^{*}(x, t)\, \hat{A}\, \Psi_\alpha(x', t)\, \delta(x - x') = \int \Psi_\alpha^{*}(x, t)\, \hat{A}\, \Psi_\alpha(x, t)\, dx , \tag{4.250}$$
i.e. to Eq. (1.23), which had to be postulated in Chapter 1.

Finally, let us discuss the time evolution of the wavefunction, in the Schrödinger picture. For
that, we may use Eq. (233) to calculate the (partial) time derivative of the wavefunction of some state $\alpha$:
$$i\hbar\, \frac{\partial \Psi_\alpha}{\partial t} = i\hbar\, \frac{\partial}{\partial t}\, \langle x|\alpha(t)\rangle . \tag{4.251}$$
Since the coordinate operator $\hat{x}$ does not depend on time explicitly, its eigenstates x are stationary, and
we can swap the time derivative and the time-independent bra-vector $\langle x|$. Now using the Schrödinger-
picture equation (158), and then inserting the identity operator in the continuous form (220) of the
closure relation, written for the coordinate eigenstates,
$$\int dx'\, |x'\rangle \langle x'| = \hat{I} , \tag{4.252}$$
we may continue to develop the right-hand side of Eq. (251) as
$$\langle x|\, i\hbar\, \frac{\partial}{\partial t}\, |\alpha(t)\rangle = \langle x|\hat{H}|\alpha(t)\rangle = \int dx'\, \langle x|\hat{H}|x'\rangle \langle x'|\alpha(t)\rangle = \int dx'\, \langle x|\hat{H}|x'\rangle\, \Psi_\alpha(x') . \tag{4.253}$$
If the Hamiltonian operator is local, we may apply Eq. (236) to the last expression, to get the familiar
form (1.28) of the Schrödinger equation:
$$i\hbar\, \frac{\partial \Psi_\alpha}{\partial t} = \hat{H}\, \Psi_\alpha , \tag{4.254}$$
in which the coordinate representation of the operator $\hat{H}$ is implied.
So, for the local operators that obey Eq. (236), we have been able to derive all the basic notions
and postulates of the wave mechanics from the bra-ket formalism. Moreover, the formalism has allowed
us to get the very useful equation (248) for an arbitrary local operator, which will be repeatedly used
below. (In the first three chapters of this course, we have only used its particular case (1.60) for the
Hamiltonian operator.)
Now let me deliver on my promise to develop a more balanced view of the monochromatic de
Broglie waves (1), which would be more respectful to the evident $\mathbf{r} \leftrightarrow \mathbf{p}$ symmetry of the coordinate
and momentum. Let us do this for the 1D case when the wave may be represented as
$$\Psi_p(x) = a_p \exp\left\{i\, \frac{px}{\hbar}\right\}, \quad \text{for all } -\infty < x < +\infty . \tag{4.255}$$
(For the sake of brevity, from this point to the end of the section, I am dropping the index x in the
notation of the momentum – just as it was done in Chapter 2.) Let us have a good look at this function.
Since it satisfies Eq. (248) for the 1D momentum operator (245),
$$\hat{p}\, \Psi_p = p\, \Psi_p , \tag{4.256}$$
$\Psi_p$ is an eigenfunction of that operator. But this means that we can also write Eq. (219) for the
corresponding ket-vector:
$$\hat{p}\, |p\rangle = p\, |p\rangle , \tag{4.257}$$
and according to Eq. (233), the wavefunction (255) may be represented as
$$\Psi_p(x) = \langle x|p\rangle . \tag{4.258}$$
This expression is quite remarkable in its $x \leftrightarrow p$ symmetry – which may be pursued further on.
Before doing that, however, we have to discuss the normalization of such wavefunctions. Indeed, in this
case, the probability density w(x) (18) is constant, so that its integral
$$\int_{-\infty}^{+\infty} w(x)\, dx = \int_{-\infty}^{+\infty} \Psi_p^{*}(x)\, \Psi_p(x)\, dx \tag{4.259}$$
diverges if $a_p \neq 0$. Earlier in the course, we discussed two ways to avoid this divergence. One is to use a
very large but finite integration volume – see Eq. (1.31). Another way is to work with wave packets of
the type (2.20), possibly of a very large length and hence a very narrow spread of the momentum values.
Then the integral (259) may be required to equal 1 without any conceptual problem.
However, both these methods, convenient for the solution of many particular problems, violate
the $x \leftrightarrow p$ symmetry and hence are inconvenient for our current conceptual discussion. Instead, let us
continue to identify the eigenvectors $\langle p|$ and $|p\rangle$ of the momentum with the bra- and ket-vectors $\langle a_A|$ and
$|a_A\rangle$ of the general theory described at the beginning of this section. Then the normalization condition
(222) becomes
$$\langle p|p'\rangle = \delta(p - p') . \tag{4.260}$$
Inserting the identity operator in the form (252), with the integration variable x’ replaced by x, into the
left-hand side of this equation, and using Eq. (258), we can translate this normalization rule to the
wavefunction language:
$$\int dx\, \langle p|x\rangle \langle x|p'\rangle = \int dx\, \Psi_p^{*}(x)\, \Psi_{p'}(x) = \delta(p - p') . \tag{4.261}$$
For the wavefunction (255), this requirement turns into the following condition:
$$a_p^{*}\, a_{p'} \int_{-\infty}^{+\infty} \exp\left\{-i\, \frac{(p - p')x}{\hbar}\right\} dx = |a_p|^{2}\, 2\pi\hbar\, \delta(p - p') = \delta(p - p') , \tag{4.262}$$
so that, finally, $a_p = e^{i\varphi}/(2\pi\hbar)^{1/2}$, where $\varphi$ is an arbitrary (real) phase, and Eq. (255) becomes52
$$\Psi_p(x) = \langle x|p\rangle = \frac{1}{(2\pi\hbar)^{1/2}} \exp\left\{i\left(\frac{px}{\hbar} + \varphi\right)\right\} . \tag{4.263}$$
Now let us represent an arbitrary wavefunction $\Psi(x)$ as a wave packet of the type (2.20), based
on the wavefunctions (263), taking $\varphi = 0$ for the notation brevity, because the phase may be
incorporated into the (generally, complex) envelope function $\phi(p)$:
$$\Psi(x) = \frac{1}{(2\pi\hbar)^{1/2}} \int \phi(p) \exp\left\{i\, \frac{px}{\hbar}\right\} dp . \qquad \text{[x-representation: wavefunctions]} \tag{4.264}$$
From the mathematical point of view, this is just a 1D Fourier spatial transform, and its reciprocal is
$$\phi(p) = \frac{1}{(2\pi\hbar)^{1/2}} \int \Psi(x) \exp\left\{-i\, \frac{px}{\hbar}\right\} dx . \qquad \text{[p-representation: wavefunctions]} \tag{4.265}$$
These expressions are completely symmetric, and represent the same wave packet; this is why the
functions $\Psi(x)$ and $\phi(p)$ are frequently called the reciprocal representations of a quantum state of the
particle: respectively, its coordinate (x-) and momentum (p-) representations. Using Eq. (258), and Eq.
(263) with $\varphi = 0$, they may be recast into simpler forms,
$$\Psi(x) = \int \phi(p)\, \langle x|p\rangle\, dp, \qquad \phi(p) = \int \Psi(x)\, \langle p|x\rangle\, dx , \tag{4.266}$$
in which the inner products satisfy the basic postulate (14) of the bra-ket formalism:
$$\langle p|x\rangle = \frac{1}{(2\pi\hbar)^{1/2}} \exp\left\{-i\, \frac{px}{\hbar}\right\} = \langle x|p\rangle^{*} . \tag{4.267}$$
52 Repeating such calculation for each Cartesian component of a plane monochromatic wave of arbitrary
dimensionality d, we get $\Psi_{\mathbf{p}} = (2\pi\hbar)^{-d/2} \exp\{i(\mathbf{p}\cdot\mathbf{r}/\hbar + \varphi)\}$.
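The reciprocity of Eqs. (264) and (265) is easy to check numerically. The sketch below is only an illustration under stated assumptions: units with $\hbar = 1$, a Gaussian wave packet with hypothetical parameters `x0` and `p0`, and plain Riemann sums on finite grids instead of the infinite integrals. The matrix `kernel` holds the brackets $\langle p|x\rangle$ of Eq. (267).

```python
import numpy as np

hbar = 1.0                                  # assumption: units with hbar = 1
x = np.linspace(-15.0, 15.0, 1201)          # hypothetical coordinate grid
p = np.linspace(-8.0, 12.0, 801)            # hypothetical momentum grid
dx, dp = x[1] - x[0], p[1] - p[0]

# Normalized Gaussian packet centered at x0, with the average momentum p0
x0, p0 = 1.0, 2.0
psi = np.pi**-0.25 * np.exp(-(x - x0)**2 / 2 + 1j * p0 * x / hbar)

# kernel[i, j] = <p_i|x_j> = exp(-i p_i x_j / hbar) / sqrt(2 pi hbar), Eq. (4.267)
kernel = np.exp(-1j * np.outer(p, x) / hbar) / np.sqrt(2 * np.pi * hbar)

phi = kernel @ psi * dx                     # Eq. (4.265): x- to p-representation
psi_back = kernel.conj().T @ phi * dp       # Eq. (4.264): p- back to x-representation

norm_x = np.sum(np.abs(psi)**2) * dx
norm_p = np.sum(np.abs(phi)**2) * dp
print("norms:", norm_x, norm_p)
print("max |psi_back - psi| =", np.max(np.abs(psi_back - psi)))
```

Direct summation was chosen here over a fast Fourier transform only because it mirrors Eqs. (264)-(267) line by line; for large grids an FFT-based version would be far more economical.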
Next, we already know that in the x-representation, i.e. in the usual wave mechanics, the
coordinate operator $\hat{x}$ is reduced to the multiplication by x, and the momentum operator is proportional
to the partial derivative over the coordinate:
$$\hat{x}\,\big|_{\text{in } x} = x, \qquad \hat{p}\,\big|_{\text{in } x} = -i\hbar\, \frac{\partial}{\partial x} . \qquad \text{[x-representation: operators]} \tag{4.268}$$
It is natural to guess that in the p-representation, the expressions for operators would be reciprocal:
$$\hat{x}\,\big|_{\text{in } p} = +i\hbar\, \frac{\partial}{\partial p}, \qquad \hat{p}\,\big|_{\text{in } p} = p , \qquad \text{[p-representation: operators]} \tag{4.269}$$
with the only difference of one sign, which is due to the opposite signs of the Fourier exponents in Eqs.
(264) and (265). The proof of Eqs. (269) is straightforward; for example, acting by the momentum
operator on the arbitrary wavefunction (264), we get
$$\hat{p}\, \Psi(x) = -i\hbar\, \frac{\partial}{\partial x}\, \Psi(x) = \frac{1}{(2\pi\hbar)^{1/2}} \int \phi(p) \left(-i\hbar\, \frac{\partial}{\partial x} \exp\left\{i\, \frac{px}{\hbar}\right\}\right) dp = \frac{1}{(2\pi\hbar)^{1/2}} \int p\, \phi(p) \exp\left\{i\, \frac{px}{\hbar}\right\} dp , \tag{4.270}$$
and similarly for the operator $\hat{x}$ acting on the function $\phi(p)$. Comparing the final form of Eq. (270) with
the initial Eq. (264), we see that the action of the operators (268) on the wavefunction $\Psi$ (i.e. the state’s
x-representation) gives the same results as the action of the operators (269) on the function $\phi$ (i.e. its p-
representation).
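The equivalence of the operator forms (268) and (269) may be illustrated by computing the same expectation values in both representations. The following sketch is not a proof, just a numerical spot check under stated assumptions: units with $\hbar = 1$, a Gaussian packet with hypothetical center `x0` and average momentum `p0`, finite grids, and a helper function `expect` introduced only for this example.

```python
import numpy as np

hbar = 1.0                                  # assumption: units with hbar = 1
x = np.linspace(-15.0, 15.0, 1201)
p = np.linspace(-8.0, 12.0, 801)
dx, dp = x[1] - x[0], p[1] - p[0]

x0, p0 = 1.0, 2.0                           # hypothetical packet parameters
psi = np.pi**-0.25 * np.exp(-(x - x0)**2 / 2 + 1j * p0 * x / hbar)
# p-representation of the same state, Eq. (4.265), as a Riemann sum
phi = (np.exp(-1j * np.outer(p, x) / hbar) @ psi) * dx / np.sqrt(2 * np.pi * hbar)

def expect(f, op_f, step):
    """Real part of the integral of f* (Op f), as a Riemann sum."""
    return np.real(np.sum(np.conj(f) * op_f) * step)

# Eqs. (4.268): x-representation
x_in_x = expect(psi, x * psi, dx)
p_in_x = expect(psi, -1j * hbar * np.gradient(psi, dx), dx)
# Eqs. (4.269): p-representation
x_in_p = expect(phi, +1j * hbar * np.gradient(phi, dp), dp)
p_in_p = expect(phi, p * phi, dp)

print("<x>:", x_in_x, x_in_p)   # both should be close to x0
print("<p>:", p_in_x, p_in_p)   # both should be close to p0
```

Flipping the sign in front of the derivative over p makes the two values of the average coordinate disagree, which is a quick way to see the role of the sign difference noted after Eq. (269).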
It is illuminating to have one more, different look at this coordinate-momentum duality. For that,
notice that according to Eqs. (82)-(84), we may consider the bracket $\langle x|p\rangle$ as an element of the (infinite-
size) matrix $U_{xp}$ of the unitary transform from the x-basis to the p-basis. Let us use this fact to derive the
general operator transform rule that would be a continuous version of Eq. (92). Say, we want to
calculate the general matrix element of some operator, known in the x-representation, in the p-
representation:
$$\langle p|\hat{A}|p'\rangle . \tag{4.271}$$
Inserting two identity operators (252), written for x and x’, into this bracket, and then using Eq. (258)
and its complex conjugate, and also Eq. (236) (again, valid only for space-local operators!), we get
$$\begin{aligned} \langle p|\hat{A}|p'\rangle &= \int dx \int dx'\, \langle p|x\rangle \langle x|\hat{A}|x'\rangle \langle x'|p'\rangle = \int dx \int dx'\, \Psi_p^{*}(x)\, \langle x|\hat{A}|x'\rangle\, \Psi_{p'}(x') \\ &= \frac{1}{2\pi\hbar} \int dx \int dx'\, \exp\left\{-i\, \frac{px}{\hbar}\right\} \delta(x - x')\, \hat{A}\, \exp\left\{i\, \frac{p'x'}{\hbar}\right\} = \frac{1}{2\pi\hbar} \int dx\, \exp\left\{-i\, \frac{px}{\hbar}\right\} \hat{A}\, \exp\left\{i\, \frac{p'x}{\hbar}\right\} . \end{aligned} \tag{4.272}$$
As a sanity check, for the momentum operator itself, this relation yields:
$$\langle p|\hat{p}|p'\rangle = \frac{1}{2\pi\hbar} \int dx\, \exp\left\{-i\, \frac{px}{\hbar}\right\} \left(-i\hbar\, \frac{\partial}{\partial x}\right) \exp\left\{i\, \frac{p'x}{\hbar}\right\} = \frac{p'}{2\pi\hbar} \int \exp\left\{i\, \frac{(p' - p)x}{\hbar}\right\} dx = p'\, \delta(p' - p) . \tag{4.273}$$
Due to Eq. (257), this result is equivalent to the second of Eqs. (269).
From a thoughtful reader, I anticipate the following natural question: why is the momentum
representation used much less frequently than the coordinate representation – i.e. wave mechanics? The
answer is purely practical: besides the important special case of the 1D harmonic oscillator (to be
revisited in Sec. 5.4), in most systems the orbital-motion Hamiltonian (237) is not $x \leftrightarrow p$ symmetric,
with the potential energy $U(\mathbf{r})$ typically being a more complex function than the kinetic energy $p^2/2m$.
Because of that, it is easier to analyze such systems treating the potential energy operator as just a c-
number multiplier, as it is in the coordinate representation – as it was done in Chapters 1-3.
The most significant exception from this practice is the motion in a periodic potential in the presence
of a coordinate-independent external force $\mathbf{F}(t)$. As was discussed in Secs. 2.7 and 3.4, in such periodic
systems the eigenenergies $E_n(\mathbf{q})$, playing the role of the effective kinetic energy of the particle, may be
rather involved functions of its quasimomentum $\hbar\mathbf{q}$, while its effective potential energy $U_{\mathrm{ef}} = -\mathbf{F}(t)\cdot\mathbf{r}$,
due to the additional force $\mathbf{F}(t)$, is a very simple function of coordinates. This is why detailed analyses of
the quantum effects briefly discussed in Sec. 2.8 (the Bloch oscillations, etc.) and also such statistical
phenomena as drift, diffusion, etc.,53 in solid-state theory are typically based on the momentum (or
rather quasimomentum) representation.
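4.1. Prove that if $\hat{A}$ and $\hat{B}$ are linear operators, and C is a c-number, then:
(i) $(\hat{A}^{\dagger})^{\dagger} = \hat{A}$;
(ii) $(C\hat{A})^{\dagger} = C^{*}\hat{A}^{\dagger}$;
(iii) $(\hat{A}\hat{B})^{\dagger} = \hat{B}^{\dagger}\hat{A}^{\dagger}$;
(iv) the operators $\hat{A}\hat{A}^{\dagger}$ and $\hat{A}^{\dagger}\hat{A}$ are Hermitian.
4.2. Prove that for any linear operators $\hat{A}$, $\hat{B}$, $\hat{C}$, and $\hat{D}$,
$$[\hat{A}\hat{B}, \hat{C}\hat{D}] = \hat{A}\{\hat{B}, \hat{C}\}\hat{D} - \hat{A}\hat{C}\{\hat{B}, \hat{D}\} + \{\hat{A}, \hat{C}\}\hat{D}\hat{B} - \hat{C}\{\hat{A}, \hat{D}\}\hat{B} .$$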

4.3. Calculate all possible binary products $\sigma_j\sigma_{j'}$ (for j, j’ = x, y, z) of the Pauli matrices, defined
by Eqs. (105), and their commutators and anticommutators (defined similarly to those of the
corresponding operators). Summarize the results, using the Kronecker delta and Levi-Civita permutation
symbols.54
4.4. Calculate the following expressions,
(i) $(\mathbf{c}\cdot\boldsymbol{\sigma})^{n}$, and then
(ii) $(b\mathrm{I} + \mathbf{c}\cdot\boldsymbol{\sigma})^{n}$,
for the scalar product $\mathbf{c}\cdot\boldsymbol{\sigma}$ of the Pauli matrix vector $\boldsymbol{\sigma} \equiv \mathbf{n}_x\sigma_x + \mathbf{n}_y\sigma_y + \mathbf{n}_z\sigma_z$ by an arbitrary c-number
geometric vector $\mathbf{c}$, where $n \geq 0$ is an integer, and b is an arbitrary scalar c-number.
Hint: For task (ii), you may like to use the binomial theorem,55 and then transform the result in a
way enabling you to use the same theorem backward.
53 In this series, a brief discussion of these effects may be found in SM Chapter 6.

54 See, e.g., MA Eqs. (13.1) and (13.2).
55 See, e.g. MA Eq. (2.9).
4.5. Use the solution of the previous problem to derive Eqs. (2.191) for the transparency T of a
system of N similar, equidistant, delta-functional potential barriers.
4.6. Use the solution of Problem 4(i) to spell out the following matrix: $\exp\{i\theta\, \mathbf{n}\cdot\boldsymbol{\sigma}\}$, where $\boldsymbol{\sigma}$ is
the 3D vector (117) of the Pauli matrices, $\mathbf{n}$ is a c-number geometric vector of unit length, and $\theta$ is a c-
number scalar.
4.7. Use the solution of Problem 4(ii) to calculate exp{A}, where A is an arbitrary 2×2 matrix.
4.8. Express all elements of the matrix B ≡ exp{A} explicitly via those of the 2×2 matrix A.
Spell out your result for the following matrices:
a a  i i 
A   , A'   ,
a a  i i 
with real a and .
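4.9. Prove that for arbitrary square matrices A and B,
$$\mathrm{Tr}\,(\mathrm{AB}) = \mathrm{Tr}\,(\mathrm{BA}) .$$
Is each diagonal element $(\mathrm{AB})_{jj}$ necessarily equal to $(\mathrm{BA})_{jj}$?
4.10. Calculate the trace of the following 2×2 matrix:
$$\mathrm{A} \equiv (\mathbf{a}\cdot\boldsymbol{\sigma})(\mathbf{b}\cdot\boldsymbol{\sigma})(\mathbf{c}\cdot\boldsymbol{\sigma}) ,$$
where $\boldsymbol{\sigma}$ is the Pauli matrix vector, while $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are arbitrary c-number vectors.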
4.11. Prove that the matrix trace of an arbitrary operator does not change at its arbitrary unitary
transformation.
4.12. Prove that for any two full and orthonormal bases {u} and {v} of the same Hilbert space,
$$\mathrm{Tr}\left(|u_j\rangle \langle v_{j'}|\right) = \langle v_{j'}|u_j\rangle .$$
4.13. Is the 1D scattering matrix S, defined by Eq. (2.124), unitary? What about the 1D transfer
matrix T defined by Eq. (2.125)?
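4.14. Calculate the trace of the following matrix:
$$\exp\{i\mathbf{a}\cdot\boldsymbol{\sigma}\}\, \exp\{i\mathbf{b}\cdot\boldsymbol{\sigma}\} ,$$
where $\boldsymbol{\sigma}$ is the Pauli matrix vector, while $\mathbf{a}$ and $\mathbf{b}$ are c-number geometric vectors.
4.15. Prove the following vector-operator identity:
$$(\boldsymbol{\sigma}\cdot\hat{\mathbf{r}})(\boldsymbol{\sigma}\cdot\hat{\mathbf{p}}) = \mathrm{I}\, \hat{\mathbf{r}}\cdot\hat{\mathbf{p}} + i\boldsymbol{\sigma}\cdot(\hat{\mathbf{r}}\times\hat{\mathbf{p}}) ,$$
where $\boldsymbol{\sigma}$ is the Pauli matrix vector, and I is the 2×2 identity matrix.
Hint: Take into account that the vector operators $\hat{\mathbf{r}}$ and $\hat{\mathbf{p}}$ are defined in the orbital-motion
Hilbert space, different from that of the Pauli vector-matrix $\boldsymbol{\sigma}$, and hence commute with it – even though
they do not commute with each other.
4.16. Let $A_j$ be eigenvalues of some operator $\hat{A}$. Express the following two sums,
$$\Sigma_1 \equiv \sum_j A_j , \qquad \Sigma_2 \equiv \sum_j A_j^{2} ,$$
via the matrix elements $A_{jj'}$ of this operator in an arbitrary basis.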
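4.17. Calculate $\langle\sigma_z\rangle$ of a spin-½ in the quantum state with the following ket-vector:
$$|\alpha\rangle = \mathrm{const} \times \left(|\!\uparrow\rangle + |\!\downarrow\rangle + |\!\rightarrow\rangle + |\!\leftarrow\rangle\right) ,$$
where (↑, ↓) and (→, ←) are the eigenstates of the Pauli matrices $\sigma_z$ and $\sigma_x$, respectively.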
Hint: Double-check whether your solution is general.
4.18. A spin-½ is fully polarized in the positive z-direction. Calculate the probabilities of the
alternative outcomes of a perfect Stern-Gerlach experiment with the magnetic field oriented in an
arbitrary different direction, performed on a particle in this spin state.
4.19. In a certain basis, the Hamiltonian of a two-level system is described by the matrix
$$\mathrm{H} = \begin{pmatrix} E_1 & 0 \\ 0 & E_2 \end{pmatrix}, \quad \text{with } E_1 \neq E_2 ,$$
while the operator of some observable A of this system, by the matrix
$$\mathrm{A} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} .$$
For the system’s state with the energy definitely equal to E1, find the possible results of measurements
of the observable A and the probabilities of the corresponding measurement outcomes.
4.20. Certain states $u_{1,2,3}$ form an orthonormal basis of a system with the following Hamiltonian
$$\hat{H} = -\delta \left(|u_1\rangle \langle u_2| + |u_2\rangle \langle u_3| + |u_3\rangle \langle u_1|\right) + \mathrm{h.c.},$$
where $\delta$ is a real constant, and h.c. means the Hermitian conjugate of the previous expression. Calculate
its stationary states and energy levels. Can you relate this system to any other(s) discussed earlier in the
course?
4.21. Guided by Eq. (2.203), and by the solutions of Problems 3.11 and 4.20, suggest a
Hamiltonian describing particle’s dynamics in an infinite 1D chain of similar potential wells in the tight-
binding approximation, in the bra-ket formalism. Verify that its eigenstates and eigenvalues correspond
to those discussed in Sec. 2.7.
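4.22. Calculate eigenvectors and eigenvalues of the following matrices:
$$\mathrm{A} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad \mathrm{B} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} .$$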

4.23. A certain state  is an eigenstate of each of two operators, Â and B̂ . What can be said
about the corresponding eigenvalues a and b, if the operators anticommute?
4.24. Derive the differential equation for the time evolution of the expectation value of an
observable, using both the Schrödinger picture and the Heisenberg picture of quantum dynamics.
4.25. At t = 0, a spin-½ whose interaction with an external field is described by the Hamiltonian
$$\hat{H} = \mathbf{c}\cdot\hat{\boldsymbol{\sigma}} \equiv c_x\hat{\sigma}_x + c_y\hat{\sigma}_y + c_z\hat{\sigma}_z$$
(where $c_{x,y,z}$ are real c-number constants, and $\hat{\sigma}_{x,y,z}$ are the Pauli operators), was in the state ↑, one of the
two eigenstates of $\hat{\sigma}_z$. In the Schrödinger picture, calculate the time evolution of:
(i) the ket-vector $|\alpha\rangle$ of the spin (in any time-independent basis you like),
(ii) the probabilities to find the spin in the states ↑ and ↓, and
(iii) the expectation values of all three Cartesian components ($\hat{S}_x$, etc.) of the spin vector.
Analyze and interpret the results for the particular case cy = cz = 0.
Hint: Think about the best basis to use for the solution.
4.26. For the same system as in the previous problem, use the Heisenberg picture to calculate the
time evolution of:
(i) all three Cartesian components of the spin operator Ŝ H (t), and
(ii) the expectation values of the spin components.
Compare the latter results with those of the previous problem.
4.27. For the same system as in the two last problems, calculate matrix elements of the operator
$\hat{\sigma}_z$ in the basis of the stationary states of the system.
4.28. In the Schrödinger picture of quantum dynamics, certain three operators satisfy the
following commutation relation:
$$[\hat{A}, \hat{B}] = \hat{C} .$$
4.29. Prove the Bloch theorem given by either Eq. (3.107) or Eq. (3.108).
Hint: Consider the translation operator $\hat{T}_{\mathbf{R}}$ defined by the following result of its action on an
arbitrary function $f(\mathbf{r})$:
$$\hat{T}_{\mathbf{R}}\, f(\mathbf{r}) = f(\mathbf{r} + \mathbf{R}) ,$$
for the case when $\mathbf{R}$ is an arbitrary vector of the Bravais lattice (3.106). In particular, analyze the
commutation properties of this operator, and apply them to an eigenfunction $\psi(\mathbf{r})$ of the stationary
Schrödinger equation for a particle moving in the 3D periodic potential described by Eq. (3.105).
4.30. A constant force F is applied to an (otherwise free) 1D particle of mass m. Calculate the
stationary wavefunctions of the particle in:
(i) the coordinate representation, and
(ii) the momentum representation.
Discuss the relation between the results.
4.31. Use the momentum representation to re-solve the problem discussed at the beginning of
Sec. 2.6, i.e. calculate the eigenenergy of a 1D particle of mass m, localized in a very short potential
well of “weight” W.
4.32. The momentum representation of a certain operator of 1D orbital motion is $p^{-1}$. Find its
coordinate representation.
4.33.* For a particle moving in a 3D periodic potential, develop the bra-ket formalism for the q-
representation, in which a complex amplitude similar to aq in Eq. (2.234) (but generalized to 3D and all
energy bands) plays the role of the wavefunction. In particular, calculate the operators r and v in this
representation, and use the result to prove Eq. (2.237) for the 1D case in the low-field limit.
4.34. A uniform, time-independent magnetic field $\mathbf{B}$ is induced in one semi-space, while the other
semi-space is field-free, with a sharp, plane boundary between these two regions. A monochromatic
beam of non-relativistic, electrically-neutral spin-½ particles with a gyromagnetic ratio $\gamma \neq 0$,56 in a
certain spin state and with a kinetic energy E, is incident on this boundary, from the field-free side, under
angle $\theta$ – see figure on the right. Calculate the coefficient of particle reflection from the boundary.
[Figure: the field region $\mathbf{B}$ and the beam of energy E incident on its boundary under angle $\theta$.]
56 The fact that $\gamma$ may be different from zero even for electrically-neutral particles, such as neutrons, is explained
by the Standard Model of the elementary particles, in which a neutron “consists” (in a broad sense of the word) of
three electrically-charged quarks with zero net charge.
Chapter 5. Some Exactly Solvable Problems

The objective of this chapter is to describe several relatively simple but very important applications of
the bra-ket formalism, including a few core problems of wave mechanics we have already started to
discuss in Chapters 2 and 3.
5.1. Two-level systems

The discussion of the bra-ket formalism in the previous chapter was peppered with numerous
illustrations of its main concepts on the example of “spins-½” – systems with the smallest non-trivial
(two-dimensional) Hilbert space, in which the bra- and ket-vectors of an arbitrary quantum state $\alpha$ may
be represented as a linear superposition of just two basis vectors, for example
$$|\alpha\rangle = \alpha_\uparrow\, |\!\uparrow\rangle + \alpha_\downarrow\, |\!\downarrow\rangle , \tag{5.1}$$
where the states ↑ and ↓ were defined as the eigenstates of the Pauli matrix $\sigma_z$ – see Eq. (4.105). For the
genuine spin-½ particles, such as electrons, placed in a z-oriented time-independent magnetic field,
these states are the “spin-up” and “spin-down” stationary states of the Pauli Hamiltonian
(4.163), with the corresponding two energy levels (4.167). However, an approximate but reasonable
quantum description of some other important systems may also be given in such Hilbert space.
For example, as was discussed in Sec. 2.6, two weakly coupled space-localized orbital states of a
spin-free particle are sufficient for an approximate description of its quantum oscillations between two
potential wells. A similar coupling of two traveling waves explains the energy band splitting in the
weak-potential approximation of the band theory – Sec. 2.7. As will be shown in the next chapter, in
systems with time-independent Hamiltonians, such situation almost unavoidably appears each time
when two energy levels are much closer to each other than to other levels. Moreover, as will be shown
in Sec. 6.5, a similar truncated description is adequate even in cases when two levels $E_n$ and $E_{n'}$ of an
unperturbed system are not close to each other, but the corresponding states become coupled by an
applied ac field of a frequency $\omega$ very close to the difference $(E_n - E_{n'})/\hbar$. Such two-level systems
(alternatively called “spin-½-like” systems) are nowadays the focus of additional attention in view of the
prospects of their use for quantum information processing and encryption.1 This is why I will spend a
bit more time reviewing the main properties of an arbitrary two-level system.
1 In the last context, to be discussed in Sec. 8.5, the two-level systems are usually called qubits.
First, the most general form of the Hamiltonian of a two-level system is represented, in an
arbitrary basis, by a 2×2 matrix
$$\mathrm{H} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix} . \tag{5.2}$$
According to the discussion in Secs. 4.3-4.5, since the Hamiltonian operator has to be Hermitian, the
diagonal elements of the matrix H have to be real, and its off-diagonal elements be complex conjugates
of each other: $H_{21} = H_{12}^{*}$. As a result, we may not only represent H as a linear combination (4.106) of
the identity matrix and the Pauli matrices but also reduce it to a more specific form:
$$\mathrm{H} = b\,\mathrm{I} + \mathbf{c}\cdot\boldsymbol{\sigma} = \begin{pmatrix} b + c_z & c_x - ic_y \\ c_x + ic_y & b - c_z \end{pmatrix} \equiv \begin{pmatrix} b + c_z & c_- \\ c_+ & b - c_z \end{pmatrix}, \qquad c_\pm \equiv c_x \pm ic_y , \tag{5.3}$$
where the scalar b and the Cartesian components of the vector $\mathbf{c}$ are real c-number coefficients:
$$b = \frac{H_{11} + H_{22}}{2}, \quad c_x = \frac{H_{12} + H_{21}}{2} \equiv \mathrm{Re}\, H_{21}, \quad c_y = \frac{H_{21} - H_{12}}{2i} \equiv \mathrm{Im}\, H_{21}, \quad c_z = \frac{H_{11} - H_{22}}{2} . \tag{5.4}$$
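Eqs. (5.3)-(5.4) translate directly into a few lines of code. In the minimal sketch below, the function name `pauli_coefficients` and the sample matrix are hypothetical, introduced only for this illustration; the Pauli matrices are written in their standard form.

```python
import numpy as np

# Identity and Pauli matrices
I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_coefficients(H):
    """Coefficients b and c = (cx, cy, cz) of Eq. (5.4) for a Hermitian 2x2 matrix H."""
    b = (H[0, 0] + H[1, 1]).real / 2
    cx = H[1, 0].real            # Re H21
    cy = H[1, 0].imag            # Im H21
    cz = (H[0, 0] - H[1, 1]).real / 2
    return b, np.array([cx, cy, cz])

# A sample Hermitian matrix (arbitrary illustrative numbers)
H = np.array([[1.0, 0.3 - 0.4j],
              [0.3 + 0.4j, -0.5]])
b, c = pauli_coefficients(H)

# Rebuild H from Eq. (5.3): H = b*I + c.sigma
H_rebuilt = b * I2 + c[0] * sx + c[1] * sy + c[2] * sz
print("b =", b, " c =", c)
print("reconstruction ok:", np.allclose(H_rebuilt, H))
```

The function silently takes real and imaginary parts, so it should be fed only Hermitian matrices; for a non-Hermitian input the reconstruction check would fail.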
If such Hamiltonian does not depend on time, the corresponding characteristic equation (4.103) for the
system’s energy levels $E_\pm$,
$$\begin{vmatrix} b + c_z - E & c_- \\ c_+ & b - c_z - E \end{vmatrix} = 0 , \tag{5.5}$$
is a simple quadratic equation, with the following solutions:
$$E_\pm = b \pm c \equiv b \pm \left(c_+ c_- + c_z^2\right)^{1/2} \equiv b \pm \left(c_x^2 + c_y^2 + c_z^2\right)^{1/2} = \frac{H_{11} + H_{22}}{2} \pm \left[\left(\frac{H_{11} - H_{22}}{2}\right)^{2} + |H_{21}|^{2}\right]^{1/2} . \tag{5.6}$$
The parameter $b \equiv (H_{11} + H_{22})/2$ evidently gives the average energy $E^{(0)}$ of the system, which
does not contribute to the level splitting
$$\Delta E \equiv E_+ - E_- = 2c \equiv 2\left(c_x^2 + c_y^2 + c_z^2\right)^{1/2} = \left[(H_{11} - H_{22})^{2} + 4|H_{21}|^{2}\right]^{1/2} . \tag{5.7}$$
So, the splitting is a hyperbolic function of the coefficient $c_z \equiv (H_{11} - H_{22})/2$. A plot of this function is
the famous level-anticrossing diagram (Fig. 1), which has already been discussed in Sec. 2.7 in the
particular context of the weak-potential limit of the 1D band theory.
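[Figure: the energies $E_\pm$ as functions of $c_z = (H_{11} - H_{22})/2$; at $c_z = 0$ they equal $E^{(0)} \pm |H_{21}|$, so that the levels are separated by the gap $\Delta E$.]
Fig. 5.1. The level-anticrossing diagram for an arbitrary two-level system.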
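The anticrossing described by Eqs. (5.6)-(5.7) may be reproduced by direct diagonalization. The sketch below scans $c_z$ at a fixed coupling and compares the numerical eigenvalues with Eq. (5.6); the coupling value, the scan range, the choice b = 0, and the function name `levels` are arbitrary assumptions of this illustration.

```python
import numpy as np

def levels(H11, H22, H21):
    """E_- and E_+ from Eq. (5.6)."""
    b = (H11 + H22) / 2
    c = np.sqrt(((H11 - H22) / 2)**2 + abs(H21)**2)
    return np.array([b - c, b + c])

H21 = 0.2 + 0.1j                       # fixed coupling (arbitrary illustrative value)
cz_values = np.linspace(-1.0, 1.0, 9)  # scan of cz = (H11 - H22)/2
gaps = []
for cz in cz_values:
    H11, H22 = cz, -cz                 # this choice gives b = 0
    H = np.array([[H11, np.conj(H21)],
                  [H21, H22]])
    E_num = np.linalg.eigvalsh(H)      # eigenvalues in ascending order
    assert np.allclose(E_num, levels(H11, H22, H21))
    gaps.append(E_num[1] - E_num[0])
gaps = np.array(gaps)

print("cz:  ", cz_values)
print("gap: ", gaps)
print("minimum gap:", gaps.min(), " 2|H21| =", 2 * abs(H21))
```

The gap is smallest at $c_z = 0$, where it equals $2|H_{21}|$, and grows as $2|c_z|$ far from the anticrossing, in agreement with Eq. (5.7).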
The physics of the diagram becomes especially clear if the two states of the basis used to spell
out the matrix (2), may be interpreted as the stationary states of two potentially independent subsystems,
with the energies, respectively, $H_{11}$ and $H_{22}$. (For example, in the case of two weakly coupled potential
wells discussed in Sec. 2.6, these are the ground-state energies of two distant wells.) Then the off-
diagonal elements $c_- \equiv H_{12}$ and $c_+ \equiv H_{21} = H_{12}^{*}$ describe the subsystem coupling, and the level
anticrossing diagram shows how the eigenenergies of the coupled system depend (at fixed coupling)
on the difference of the subsystem energies. As was already discussed in Sec. 2.7, the most striking
feature of the diagram is that any non-zero coupling $|c_\pm| \equiv (c_x^2 + c_y^2)^{1/2}$ changes the topology of the
eigenstate energies, creating a gap of the width $\Delta E$.
As follows from our discussions of particular two-level systems in Secs. 2.6 and 4.6, their
dynamics also has a general feature – the quantum oscillations. Namely, if we put any two-level system
into any initial state different from one of its eigenstates ±, and then let it evolve on its own, the
probability of finding the system in any of the “partial” states exhibits oscillations with the frequency
$$\Omega = \frac{\Delta E}{\hbar} \equiv \frac{E_+ - E_-}{\hbar} = \frac{2c}{\hbar}, \tag{5.8}$$
lowest at the exact subsystem symmetry (c_z = 0, i.e. H11 = H22), when it is proportional to the coupling
strength: Ω_min = 2|c_±|/ħ ≡ 2|H12|/ħ = 2|H21|/ħ.
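The oscillation frequency (8) may be checked by a direct numerical integration of the Schrödinger equation for the two probability amplitudes. The sketch below is only an illustration under stated assumptions: units with ħ = 1, made-up coefficients c_x, c_y, c_z (with b = 0), and the standard fourth-order Runge-Kutta scheme. It starts from the first basis state and finds the probability to remain in it after half of the period 2π/Ω and after the full period:

```python
import math

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def rhs(h, psi):
    """Right-hand side of the Schroedinger equation: d(psi)/dt = -(i/hbar) H psi."""
    return [(-1j / HBAR) * (h[r][0] * psi[0] + h[r][1] * psi[1]) for r in range(2)]

def evolve(h, psi, t_total, steps=2000):
    """Integrate over t_total with the classical fourth-order Runge-Kutta scheme."""
    dt = t_total / steps
    for _ in range(steps):
        k1 = rhs(h, psi)
        k2 = rhs(h, [p + 0.5 * dt * k for p, k in zip(psi, k1)])
        k3 = rhs(h, [p + 0.5 * dt * k for p, k in zip(psi, k2)])
        k4 = rhs(h, [p + dt * k for p, k in zip(psi, k3)])
        psi = [p + dt * (q1 + 2 * q2 + 2 * q3 + q4) / 6
               for p, q1, q2, q3, q4 in zip(psi, k1, k2, k3, k4)]
    return psi

# Hypothetical coefficients of H = c . sigma (with b = 0): here c = 0.5.
C_X, C_Y, C_Z = 0.3, 0.0, 0.4
H = [[C_Z, C_X - 1j * C_Y], [C_X + 1j * C_Y, -C_Z]]
C = math.sqrt(C_X ** 2 + C_Y ** 2 + C_Z ** 2)
OMEGA = 2 * C / HBAR                 # oscillation frequency, Eq. (5.8)
PERIOD = 2 * math.pi / OMEGA

UP = [1.0 + 0j, 0.0 + 0j]            # start in the first basis ("partial") state
W_HALF = abs(evolve(H, UP, PERIOD / 2)[0]) ** 2   # probability to stay, at t = T/2
W_FULL = abs(evolve(H, UP, PERIOD)[0]) ** 2       # the same after the full period T
print(W_HALF, W_FULL)
```

For these parameters, the probability should return to 1 after the full period, and drop to (c_z/c)^2 = 0.64 at its middle, so that the oscillations are complete only at c_z = 0.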
In the case discussed in Sec. 2.6, these are the oscillations of a particle between the two coupled
potential wells (or rather of the probabilities to find it in either well) – see, e.g., Eqs. (2.181). On the
other hand, for a spin-½ particle in an external magnetic field, these oscillations take the form of spin
precession in the plane normal to the field, with periodic oscillations of its Cartesian components (or
rather their expectation values) – see, e.g., Eqs. (4.173)-(4.174). Some other examples of the quantum
oscillations in two-level systems may be rather unexpected; for example, the ammonia molecule NH3
(Fig. 2) has two symmetric states that differ by the inversion of the nitrogen atom relative to the plane of
the three hydrogen atoms, which are weakly coupled due to quantum-mechanical tunneling of the
nitrogen atom through the plane of the hydrogen atoms.2 Since for this particular molecule, in the
absence of external fields, the level splitting ΔE corresponds to an experimentally convenient frequency
Ω/2π ≈ 24 GHz, it played an important historic role in the initial development of the atomic frequency
standards and microwave quantum generators (masers) in the early 1950s,3 which paved the way toward
laser technology.
Fig. 5.2. An ammonia molecule and its inversion. (Labels in the figure: N, H, 0.102 nm, 107.8°.)
Now let us discuss a very convenient geometric representation of an arbitrary state α of
(any!) two-level system. As Eq. (1) shows, such a state is completely described by two complex
coefficients (c-numbers) – say, α↑ and α↓. If the vectors of the basis states ↑ and ↓ are normalized, then
these coefficients must obey the following restriction:
$$W_\Sigma = \langle\alpha|\alpha\rangle = \left(\langle\uparrow|\alpha_\uparrow^* + \langle\downarrow|\alpha_\downarrow^*\right)\left(\alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle\right) = \left|\alpha_\uparrow\right|^2 + \left|\alpha_\downarrow\right|^2 = 1. \tag{5.9}$$
2 Since the hydrogen atoms are much lighter, it would be fairer to speak about the tunneling of their triangle
around the (nearly immobile) nitrogen atom.
3 In particular, these molecules were used in the demonstration of the first maser by C. Townes’ group in 1954.
This requirement is automatically satisfied if we take the moduli of α↑ and α↓ equal to the cosine and
sine of the same (real) angle. Thus we may write, for example,
$$\alpha_\uparrow = \cos\frac{\theta}{2}\,e^{i\gamma}, \qquad \alpha_\downarrow = \sin\frac{\theta}{2}\,e^{i(\gamma+\varphi)}. \tag{5.10}$$
Moreover, according to the general Eq. (4.125), if we deal with just one system,4 the common phase
factor exp{iγ} drops out of the calculation of any expectation value, so that we may take γ = 0, and Eq.
(10) is reduced to
$$\alpha_\uparrow = \cos\frac{\theta}{2}, \qquad \alpha_\downarrow = \sin\frac{\theta}{2}\,e^{i\varphi}. \tag{5.11}$$
(Bloch sphere representation)
The reason why the argument of these sine and cosine functions is usually taken in the form θ/2
becomes clear from Fig. 3a: Eq. (11) conveniently maps each state α of a two-level system on a certain
representation point on a unit-radius Bloch sphere,5 with the polar angle θ and the azimuthal angle φ.
Fig. 5.3. The Bloch sphere: (a) the representation of an arbitrary state (solid red point) and the
eigenstates of the Pauli matrices (dotted points), and (b, c) the two-level system’s evolution: (b) in a
constant “field” c directed along the z-axis, and (c) in a field of arbitrary orientation.
In particular, the basis state ↑, described by Eq. (1) with α↑ = 1 and α↓ = 0, corresponds to the
North Pole of the sphere (θ = 0), while the opposite state ↓, with α↑ = 0 and α↓ = 1, to its South Pole
(θ = π). Similarly, the eigenstates → and ← of the matrix σx, described by Eqs. (4.122), i.e. having
α↑ = 1/√2 and α↓ = ±1/√2, correspond to the equator (θ = π/2) points with, respectively, φ = 0 and φ = π.
Two more special points (denoted in Fig. 3a as ⊙ and ⊗) are also located on the sphere’s equator, at
θ = π/2 and φ = ±π/2; it is easy to check that they correspond to the eigenstates of the matrix σy (in the
same z-basis).
4 If you need a reminder of why this condition is crucial, please revisit the discussion at the end of Sec. 1.6. Note
also that the mutual phase shifts between different qubits are important, in particular, for quantum information
processing (see Sec. 8.5 below), so that most discussions of these applications have to start from Eq. (10) rather
than Eq. (11).
5 This representation was suggested in 1946 by the same Felix Bloch who pioneered the energy band theory
discussed in Chapters 2-3.
To understand why such mutually perpendicular location of these three special point pairs on the
Bloch sphere is not accidental, let us plug Eqs. (11) into Eqs. (4.131)-(4.133) for the expectation values
of the spin-½ components. In terms of the Pauli vector operator (4.117), σ ≡ S/(ħ/2), the result is
$$\langle\sigma_x\rangle = \sin\theta\cos\varphi, \qquad \langle\sigma_y\rangle = \sin\theta\sin\varphi, \qquad \langle\sigma_z\rangle = \cos\theta, \tag{5.12}$$
showing that the radius vector of any representation point is just the expectation value of σ.
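The mapping of Eq. (11) onto Eq. (12) may be verified by a few lines of code. In this minimal sketch (the function names and the particular angles are arbitrary choices made for illustration), the expectation values of the Pauli matrices are calculated directly from the two coefficients of the state, in the z-basis, and compared with the spherical-coordinate expressions:

```python
import cmath
import math

def bloch_state(theta, phi):
    """Coefficients (alpha_up, alpha_down) of Eq. (5.11)."""
    return complex(math.cos(theta / 2)), math.sin(theta / 2) * cmath.exp(1j * phi)

def pauli_expectations(alpha_up, alpha_down):
    """Expectation values of sigma_x, sigma_y, sigma_z in the z-basis."""
    overlap = alpha_up.conjugate() * alpha_down
    return (2 * overlap.real,                            # <sigma_x>
            2 * overlap.imag,                            # <sigma_y>
            abs(alpha_up) ** 2 - abs(alpha_down) ** 2)   # <sigma_z>

THETA, PHI = 1.1, 2.3        # arbitrary polar and azimuthal angles
SIGMA = pauli_expectations(*bloch_state(THETA, PHI))
EXPECTED = (math.sin(THETA) * math.cos(PHI),
            math.sin(THETA) * math.sin(PHI),
            math.cos(THETA))             # right-hand sides of Eq. (5.12)
print(SIGMA, EXPECTED)
```

The same functions reproduce the special points: θ = 0 gives the vector (0, 0, 1), while θ = π/2 with φ = π/2 gives (0, 1, 0).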
Now let us use Eq. (3) to see how the representation point moves in various cases, ignoring
the term bI – which, again, describes the offset of the total energy of the system relative to some
reference level, and does not affect its dynamics. First of all, according to Eq. (4.158), in the case c = 0
(when the Hamiltonian operator turns to zero, and hence the state vectors do not depend on time) the
point does not move at all, and its position is determined by initial conditions, i.e. by the system’s
preparation. If c ≠ 0, we may re-use some results of Sec. 4.6, obtained for the Pauli Hamiltonian
(4.163a), which coincides with Eq. (3) if6
$$\mathbf{c} = -\gamma\frac{\hbar}{2}\boldsymbol{\mathcal{B}}. \tag{5.13}$$
In particular, if the field B, and hence the vector c, is directed along the z-axis and is time-independent,
Eqs. (4.170) and (4.173)-(4.174) show that the representation point ⟨σ⟩ on the Bloch sphere rotates
within a plane normal to this axis (see Fig. 3b) with the angular velocity
$$\frac{d\varphi}{dt} = -\gamma\mathcal{B}_z \equiv \frac{2c_z}{\hbar}. \tag{5.14}$$
Almost evidently, since the selection of the coordinate axes is arbitrary, this picture should
remain valid for any orientation of the vector c, with the representation point rotating, on the Bloch
sphere, around its direction, with the angular speed Ω = 2c/ħ – see Fig. 3c. This fact may be proved
using any picture of the quantum dynamics, discussed in Sec. 4.6. Actually, the reader may already have
done that by solving Problems 4.25 and 4.26, just to see that even for the particular, simple initial state
of the system (↑), the final results for the Cartesian components of the vector ⟨σ⟩ are somewhat bulky.
However, this description may be readily simplified, even for arbitrary time dependence of the “field”
vector c(t) in Eq. (3), using the (geometric) vector language.
Indeed, let us rewrite Eq. (3) (again, with b = 0) in the operator form,
$$\hat{H} = \mathbf{c}(t)\cdot\hat{\boldsymbol{\sigma}}, \tag{5.15}$$
valid in an arbitrary basis. According to Eq. (4.199), the corresponding Heisenberg equation of motion
for the jth Cartesian component of the vector-operator σ̂ (which does not depend on time explicitly, so
that ∂σ̂/∂t = 0) is
$$i\hbar\dot{\hat{\sigma}}_j = \left[\hat{\sigma}_j, \hat{H}\right] \equiv \left[\hat{\sigma}_j, \mathbf{c}(t)\cdot\hat{\boldsymbol{\sigma}}\right] \equiv \left[\hat{\sigma}_j, \sum_{j'=1}^{3} c_{j'}(t)\hat{\sigma}_{j'}\right] = \sum_{j'=1}^{3} c_{j'}(t)\left[\hat{\sigma}_j, \hat{\sigma}_{j'}\right]. \tag{5.16}$$
6 This correspondence justifies the use of the term “field” for the vector c.
Now using the commutation relations (4.155), which remain valid in any basis and in any picture of time
evolution,7 we get
$$i\hbar\dot{\hat{\sigma}}_j = 2i\sum_{j'=1}^{3} c_{j'}(t)\,\hat{\sigma}_{j''}\,\varepsilon_{jj'j''}, \tag{5.17}$$
where j” is the index, of the same set {1, 2, 3}, complementary to j and j’ (j” ≠ j, j’), and ε_jj’j” is the
Levi-Civita symbol.8 But it is straightforward to verify that the usual vector product of two 3D vectors
may be represented in a similar Cartesian-component form:
$$\left(\mathbf{a}\times\mathbf{b}\right)_j \equiv \begin{vmatrix} \mathbf{n}_1 & \mathbf{n}_2 & \mathbf{n}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}_j = \sum_{j'=1}^{3} a_{j'} b_{j''}\,\varepsilon_{jj'j''}. \tag{5.18}$$
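As a result, Eq. (17) may be rewritten in a vector form – or rather several equivalent forms:
$$i\hbar\dot{\hat{\sigma}}_j = 2i\left[\mathbf{c}(t)\times\hat{\boldsymbol{\sigma}}\right]_j, \ \text{i.e.}\ i\hbar\dot{\hat{\boldsymbol{\sigma}}} = 2i\,\mathbf{c}(t)\times\hat{\boldsymbol{\sigma}}, \ \text{or}\ \dot{\hat{\boldsymbol{\sigma}}} = \frac{2}{\hbar}\mathbf{c}(t)\times\hat{\boldsymbol{\sigma}}, \ \text{or}\ \dot{\hat{\boldsymbol{\sigma}}} = \boldsymbol{\Omega}(t)\times\hat{\boldsymbol{\sigma}}, \tag{5.19}$$
where the vector Ω is defined as
$$\hbar\boldsymbol{\Omega}(t) \equiv 2\mathbf{c}(t) \tag{5.20}$$
– an evident generalization of Eq. (14).9 As we have seen in Sec. 4.6, any linear relation between two
Heisenberg operators is also valid for the expectation values of the corresponding observables, so that
the last form of Eq. (19) yields:
$$\langle\dot{\boldsymbol{\sigma}}\rangle = \boldsymbol{\Omega}(t)\times\langle\boldsymbol{\sigma}\rangle. \tag{5.21}$$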
But this is the well-known kinematic formula10 for the rotation of a constant-length classical 3D
vector ⟨σ⟩ around the instantaneous direction of the vector Ω(t), with the instantaneous angular velocity
Ω(t). So, the time evolution of the representation point on the Bloch sphere is quite simple, especially in
the case of a time-independent c, and hence Ω – see Fig. 3c.11 Note that it is sufficient to turn off the
field to stop the precession instantly. (Since Eq. (21) is a first-order differential equation, the
representation point has no effective inertia.12) Hence, changing the direction and the magnitude of the
effective external field, it is possible to drive the representation point of a two-level system from any
initial position to any final position on the Bloch sphere, i.e. make the system take any of its possible
quantum states.
7 Indeed, if some three operators in the Schrödinger picture are related as [Â_S, B̂_S] = Ĉ_S, then according to Eq.
(4.190), in the Heisenberg picture:
$$\left[\hat{A}_H, \hat{B}_H\right] \equiv \left[\hat{u}^\dagger\hat{A}_S\hat{u}, \hat{u}^\dagger\hat{B}_S\hat{u}\right] \equiv \hat{u}^\dagger\hat{A}_S\hat{u}\hat{u}^\dagger\hat{B}_S\hat{u} - \hat{u}^\dagger\hat{B}_S\hat{u}\hat{u}^\dagger\hat{A}_S\hat{u} = \hat{u}^\dagger\left[\hat{A}_S, \hat{B}_S\right]\hat{u} = \hat{u}^\dagger\hat{C}_S\hat{u} = \hat{C}_H.$$
8 See, e.g., MA Eq. (9.2). Note that in Eqs. (17)-(18) and similar expressions below, the condition j” ≠ j, j’ may be
(and frequently is) replaced by the summation over not only j’, but also j”, in their right-hand sides.
9 It is also easy to verify that in the particular case Ω = Ωn_z, Eqs. (19) are reduced, in the z-basis, to Eqs. (4.200)
for the spin-½ vector matrix S = (ħ/2)σ.
10 See, e.g., CM Sec. 4.1, in particular Eq. (4.8).
11 The bulkiness of the solutions of Problems 4.25 and 4.26 (which were offered just as useful exercises in
quantum dynamic formalisms) reflects the awkward expression of the resulting circular motion of the vector ⟨σ⟩
(see Fig. 3c) via its Cartesian components.
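For a time-independent vector c, Eq. (21) may be compared with the exact quantum evolution. The sketch below is an illustration under stated assumptions: units with ħ = 1, an arbitrarily chosen vector c and initial state, and the standard closed form exp{–iθ n·σ} = cos θ – i sin θ (n·σ) of the time-evolution operator. It evolves the state vector, calculates ⟨σ⟩ from it, and compares the result with the classical rotation (via the Rodrigues formula) of the initial ⟨σ⟩ about the direction of c by the angle Ωt = 2ct/ħ:

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def evolve_state(c_vec, psi, t):
    """Apply exp{-i (c . sigma) t / hbar} = cos(ct/hbar) - i sin(ct/hbar) (n . sigma)."""
    c = math.sqrt(sum(x * x for x in c_vec))
    nx, ny, nz = (x / c for x in c_vec)
    cs, sn = math.cos(c * t / HBAR), math.sin(c * t / HBAR)
    up, down = psi
    ns_up = nz * up + (nx - 1j * ny) * down        # (n . sigma) psi, upper component
    ns_down = (nx + 1j * ny) * up - nz * down      # (n . sigma) psi, lower component
    return (cs * up - 1j * sn * ns_up, cs * down - 1j * sn * ns_down)

def bloch_vector(psi):
    """Expectation value of the Pauli vector for the state psi = (up, down)."""
    overlap = psi[0].conjugate() * psi[1]
    return (2 * overlap.real, 2 * overlap.imag, abs(psi[0]) ** 2 - abs(psi[1]) ** 2)

def rotate(v, axis, angle):
    """Rodrigues formula: rotate the 3D vector v about the unit vector axis by angle."""
    dot = sum(a * b for a, b in zip(axis, v))
    cross = (axis[1] * v[2] - axis[2] * v[1],
             axis[2] * v[0] - axis[0] * v[2],
             axis[0] * v[1] - axis[1] * v[0])
    cs, sn = math.cos(angle), math.sin(angle)
    return tuple(v[i] * cs + cross[i] * sn + axis[i] * dot * (1 - cs) for i in range(3))

C_VEC = (0.3, -0.5, 0.2)     # hypothetical "field" vector c of arbitrary orientation
PSI0 = (complex(math.cos(0.4)), math.sin(0.4) * cmath.exp(0.9j))  # theta = 0.8, phi = 0.9
T = 1.7
C = math.sqrt(sum(x * x for x in C_VEC))
QUANTUM = bloch_vector(evolve_state(C_VEC, PSI0, T))
CLASSICAL = rotate(bloch_vector(PSI0), tuple(x / C for x in C_VEC), 2 * C * T / HBAR)
print(QUANTUM, CLASSICAL)
```

The two vectors should coincide within the rounding errors, with no bulky Cartesian-component formulas needed.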
In the particular case of a spin-½ in a magnetic field B(t), it is more customary to use Eqs. (13)
and (20) to rewrite Eq. (21) as the following equation for the expectation value of the spin vector S =
(ħ/2)σ:
$$\langle\dot{\mathbf{S}}\rangle = \gamma\langle\mathbf{S}\rangle\times\boldsymbol{\mathcal{B}}(t). \tag{5.22}$$
As we know from the discussion in Chapter 4, such a classical description of the spin’s evolution does
not give a full picture of the quantum reality; in particular, it does not describe the possible large
uncertainties of its components – see, e.g., Eqs. (4.135). The situation, however, is different for a
collection of N >> 1 similar, non-interacting spins, initially prepared to be in the same state – for
example by polarizing all spins with a strong external field B_0, at relatively low temperatures T, with
k_BT << γħB_0. (A practically important example of such a collection is a set of nuclear spins in
macroscopic condensed-matter samples, where the spin interaction with each other and the environment
is typically very small.) For such a collection, Eq. (22) is still valid, while the relative uncertainty of the
resulting sample’s magnetization M = n⟨m⟩ = nγ⟨S⟩ (where n ≡ N/V is the spin density) is proportional
to 1/N^(1/2) << 1. Thus, the evolution of magnetization may be described, with good precision, by the
essentially classical equation (valid for any spin, not necessarily spin-½):
$$\dot{\mathbf{M}} = \gamma\,\mathbf{M}\times\boldsymbol{\mathcal{B}}(t). \tag{5.23}$$
This equation, or the equivalent set of three Bloch equations13 for its Cartesian components,
with the right-hand side augmented with small terms describing the effects of dephasing and relaxation
(to be discussed in Chapter 7), is used, in particular, to describe the magnetic resonance, taking place
when the frequency (4.164) of the spin’s precession in a strong dc magnetic field approaches the
frequency of an additionally applied (and usually weak) ac field.14
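In the absence of the dephasing and relaxation terms, Eq. (23) is straightforward to integrate numerically. The following sketch is only an illustration: the values of γ, of the field, and of the initial magnetization are made up, and the field is taken constant. It uses the fourth-order Runge-Kutta scheme to check that the vector M keeps its length and returns to the initial position after the period 2π/γB:

```python
import math

def cross(a, b):
    """Vector product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def precess(m_vec, b_field, gamma, t_total, steps=4000):
    """RK4 integration of dM/dt = gamma * M x B, Eq. (5.23), for a constant field B."""
    def rhs(m):
        return tuple(gamma * x for x in cross(m, b_field))

    def shifted(m, k, h):
        return tuple(mi + h * ki for mi, ki in zip(m, k))

    dt = t_total / steps
    for _ in range(steps):
        k1 = rhs(m_vec)
        k2 = rhs(shifted(m_vec, k1, 0.5 * dt))
        k3 = rhs(shifted(m_vec, k2, 0.5 * dt))
        k4 = rhs(shifted(m_vec, k3, dt))
        m_vec = tuple(m + dt * (q1 + 2 * q2 + 2 * q3 + q4) / 6
                      for m, q1, q2, q3, q4 in zip(m_vec, k1, k2, k3, k4))
    return m_vec

GAMMA = 2.0                    # hypothetical gyromagnetic ratio
B_FIELD = (0.0, 0.0, 1.5)      # constant field along the z-axis
PERIOD = 2 * math.pi / (GAMMA * 1.5)
M0 = (1.0, 0.0, 0.5)
M_QUARTER = precess(M0, B_FIELD, GAMMA, PERIOD / 4)
M_FULL = precess(M0, B_FIELD, GAMMA, PERIOD)
print(M_QUARTER, M_FULL)
```

For positive γ, the quarter-period result should be close to (0, –1, 0.5), i.e. the rotation about the field proceeds with the angular velocity –γB, in agreement with Eqs. (13) and (20).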
5.2. The Ehrenfest theorem

In Sec. 4.7, we have derived all the basic relations of wave mechanics from the bra-ket
formalism, which will also enable us to get some important additional results in that area. One of them is
a pair of very interesting relations, together called the Ehrenfest theorem. To derive them, for the
simplest case of 1D orbital motion, let us calculate the following commutator:
$$\left[\hat{x}, \hat{p}_x^2\right] \equiv \hat{x}\hat{p}_x\hat{p}_x - \hat{p}_x\hat{p}_x\hat{x}. \tag{5.24}$$
Let us apply the commutation relation (4.238) in the following form:
$$\hat{x}\hat{p}_x = \hat{p}_x\hat{x} + i\hbar\hat{I}, \tag{5.25}$$
to the first term of the right-hand side of Eq. (24) twice, with the goal to move the coordinate operator to
the rightmost position:
$$\hat{x}\hat{p}_x\hat{p}_x = \left(\hat{p}_x\hat{x} + i\hbar\hat{I}\right)\hat{p}_x = \hat{p}_x\hat{x}\hat{p}_x + i\hbar\hat{p}_x = \hat{p}_x\left(\hat{p}_x\hat{x} + i\hbar\hat{I}\right) + i\hbar\hat{p}_x = \hat{p}_x\hat{p}_x\hat{x} + 2i\hbar\hat{p}_x. \tag{5.26}$$
12 This is also true for the classical angular momentum L at its torque-induced precession – see, e.g., CM Sec. 4.5.
13 They were introduced by F. Bloch in the same 1946 paper as the Bloch-sphere representation.
14 The quantum theory of this effect will be discussed in the next chapter.
The first term of this result cancels with the last term of Eq. (24), so that the commutator becomes quite
simple:
$$\left[\hat{x}, \hat{p}_x^2\right] = 2i\hbar\hat{p}_x. \tag{5.27}$$
Let us use this equality to calculate the Heisenberg-picture equation of motion of the operator x̂,
by applying the general Heisenberg equation (4.199) to the 1D orbital motion described by the
Hamiltonian (4.237), but possibly with a more general, time-dependent potential energy U:
$$\frac{d\hat{x}}{dt} = \frac{1}{i\hbar}\left[\hat{x}, \hat{H}\right] = \frac{1}{i\hbar}\left[\hat{x}, \frac{\hat{p}_x^2}{2m} + U(\hat{x}, t)\right]. \tag{5.28}$$
The potential energy operator is a function of the coordinate operator and hence, as we know, commutes
with it. Thus, the right-hand side of Eq. (28) is proportional to the commutator (27), and we get
$$\frac{d\hat{x}}{dt} = \frac{\hat{p}_x}{m}. \tag{5.29}$$
(Heisenberg equation for coordinate)
In this operator equality, we readily recognize the full analog of the classical relation between the
particle’s momentum and its velocity.
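The commutator (27), on which Eq. (29) rests, may be also checked in the coordinate representation, where the momentum operator acts as –iħ∂/∂x. The minimal sketch below makes two simplifying assumptions: units with ħ = 1, and test functions restricted to polynomials, stored as lists of coefficients. (Polynomials are not normalizable wavefunctions, but this is sufficient for testing a purely algebraic operator identity.)

```python
HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def x_op(poly):
    """Multiply a polynomial (coefficients in ascending powers of x) by x."""
    return [0] + list(poly)

def p_op(poly):
    """Apply p = -i*hbar*d/dx in the coordinate representation."""
    return [-1j * HBAR * k * poly[k] for k in range(1, len(poly))] or [0]

def subtract(a, b):
    """Difference of two polynomials of possibly different lengths."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [u - v for u, v in zip(a, b)]

PSI = [1.0, -2.0, 0.5, 3.0]          # arbitrary test polynomial 1 - 2x + 0.5x^2 + 3x^3
LHS = subtract(x_op(p_op(p_op(PSI))), p_op(p_op(x_op(PSI))))   # [x, p^2] acting on PSI
RHS = [2j * HBAR * coeff for coeff in p_op(PSI)]               # 2*i*hbar*p acting on PSI
print(LHS, RHS)
```

Both lists should hold the coefficients of 2ħ^2 ∂ψ/∂x, i.e. –4 + 2x + 18x^2 for this test polynomial.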
Now let us see what a similar procedure gives for the momentum’s derivative:
$$\frac{d\hat{p}_x}{dt} = \frac{1}{i\hbar}\left[\hat{p}_x, \hat{H}\right] = \frac{1}{i\hbar}\left[\hat{p}_x, \frac{\hat{p}_x^2}{2m} + U(\hat{x}, t)\right]. \tag{5.30}$$
The kinetic energy operator commutes with the momentum operator and hence drops from the
right-hand side of this equation. To calculate the remaining commutator of the momentum and potential
energy, let us use the fact that any smooth (infinitely differentiable) function may be represented by its
Taylor expansion:
$$U(\hat{x}, t) = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial\hat{x}^k}\hat{x}^k, \tag{5.31}$$
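where the derivatives of U may be understood as c-numbers (evaluated at x = 0, and the given time t), so
that we may write
$$\left[\hat{p}_x, U(\hat{x}, t)\right] = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial\hat{x}^k}\left[\hat{p}_x, \hat{x}^k\right] = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial\hat{x}^k}\Big(\hat{p}_x\underbrace{\hat{x}\hat{x}\ldots\hat{x}}_{k\ \text{times}} - \underbrace{\hat{x}\hat{x}\ldots\hat{x}}_{k\ \text{times}}\hat{p}_x\Big). \tag{5.32a}$$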
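Applying Eq. (25) k times to the last term in the parentheses, exactly as we did it in Eq. (26), we get
$$\left[\hat{p}_x, U(\hat{x}, t)\right] = \sum_{k=1}^{\infty}\frac{1}{k!}\frac{\partial^k U}{\partial\hat{x}^k}\left(-i\hbar k\hat{x}^{k-1}\right) = -i\hbar\sum_{k=1}^{\infty}\frac{1}{(k-1)!}\frac{\partial^k U}{\partial\hat{x}^k}\hat{x}^{k-1}. \tag{5.32b}$$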
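But the last sum is just the Taylor expansion of the derivative ∂U/∂x. Indeed,
$$\frac{\partial U}{\partial\hat{x}} = \sum_{k'=0}^{\infty}\frac{1}{k'!}\frac{\partial^{k'}}{\partial\hat{x}^{k'}}\left(\frac{\partial U}{\partial\hat{x}}\right)\hat{x}^{k'} = \sum_{k'=0}^{\infty}\frac{1}{k'!}\frac{\partial^{k'+1}U}{\partial\hat{x}^{k'+1}}\hat{x}^{k'} = \sum_{k=1}^{\infty}\frac{1}{(k-1)!}\frac{\partial^k U}{\partial\hat{x}^k}\hat{x}^{k-1}, \tag{5.33}$$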
where at the last step the summation index was changed from k’ to k ≡ k’ + 1. As a result, we may rewrite Eq.
(32b) as
$$\left[\hat{p}_x, U(\hat{x}, t)\right] = -i\hbar\frac{\partial}{\partial\hat{x}}U(\hat{x}, t), \tag{5.34}$$
so that Eq. (30) yields:
$$\frac{d\hat{p}_x}{dt} = -\frac{\partial}{\partial\hat{x}}U(\hat{x}, t). \tag{5.35}$$
(Heisenberg equation for momentum)
This equation also coincides with the classical equation of motion! Moreover, averaging Eqs. (29) and
(35) over the initial state (as Eq. (4.191) prescribes), we get similar results for the expectation values:15
$$\frac{d\langle x\rangle}{dt} = \frac{\langle p_x\rangle}{m}, \qquad \frac{d\langle p_x\rangle}{dt} = -\left\langle\frac{\partial U}{\partial x}\right\rangle. \tag{5.36}$$
(Ehrenfest theorem)
However, it is important to remember that the equivalence between these quantum-mechanical
equations and similar equations of classical mechanics is superficial, and the degree of the similarity
between the two mechanics very much depends on the problem. As one extreme, let us consider the case
when a particle’s state, at any moment between t0 and t, may be accurately represented by one, relatively
px-narrow wave packet. Then we may interpret Eqs. (36) as the equations of the essentially classical
motion of the wave packet’s center, in accordance with the correspondence principle. However, even in
this case, it is important to remember the purely quantum mechanical effects of non-zero wave packet
width and its spread in time, which were discussed in Sec. 2.2.
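One reason for this caution is that the right-hand side of the second of Eqs. (36) is the average of the force, which generally differs from the force at the average position. The following sketch illustrates the difference under assumptions introduced only for this example: a Gaussian probability density of the coordinate, with made-up center and r.m.s. width, and two model potentials, U = x^2/2 and U = x^4/4, in arbitrary units:

```python
import math

def gaussian_average(func, mean, width, points=20001, span=10.0):
    """Trapezoidal-rule average of func(x) over a Gaussian probability density."""
    lo = mean - span * width
    dx = 2 * span * width / (points - 1)
    total = 0.0
    for i in range(points):
        x = lo + i * dx
        density = math.exp(-(x - mean) ** 2 / (2 * width ** 2)) / (width * math.sqrt(2 * math.pi))
        weight = 0.5 if i in (0, points - 1) else 1.0
        total += weight * func(x) * density
    return total * dx

def harmonic_force(x):
    return -x              # -dU/dx for U = x^2/2 (hypothetical units)

def quartic_force(x):
    return -x ** 3         # -dU/dx for U = x^4/4 (hypothetical units)

MEAN, WIDTH = 1.0, 0.5     # packet center <x> and its r.m.s. width
AVG_HARMONIC = gaussian_average(harmonic_force, MEAN, WIDTH)
AVG_QUARTIC = gaussian_average(quartic_force, MEAN, WIDTH)
print(AVG_HARMONIC, harmonic_force(MEAN))
print(AVG_QUARTIC, quartic_force(MEAN))
```

For the harmonic potential, the two numbers coincide, so that Eqs. (36) form a closed system for the averages. For the quartic potential, the average force is –(⟨x⟩^3 + 3⟨x⟩δx^2) = –1.75 rather than –1, and the difference disappears only in the limit of a very narrow packet.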
As an opposite extreme, let us revisit the “leaky” potential well discussed in Sec. 2.5 – see Fig.
2.15. Since both the potential U(x) and the initial wavefunction of that system are symmetric relative to
point x = 0 at all times, the right-hand sides of both Eqs. (36) identically equal zero. Of course, the result
they predict (that the average values of the coordinate and the momentum stay equal to zero at all times)
is correct, but this fact does not tell us much about the rich dynamics of the system: the finite lifetime of
the metastable state, the formation of two wave packets, their waveform and propagation speed (see Fig.
2.17), and about the insights the full solution gives for the quantum measurement theory and the
system’s irreversibility. Another similar example is the energy band theory (Sec. 2.7), with its purely
quantum effect of the allowed energy bands and forbidden energy gaps, of which Eqs. (36) give no clue.
To summarize, the Ehrenfest theorem is important as an illustration of the correspondence
principle, but its predictive power should not be exaggerated.
5.3. The Feynman path integral

As has been already mentioned, even within the realm of wave mechanics, the bra-ket language
may simplify some calculations that would be very bulky using the notation used in Chapters 1-3.
Probably the best example is the famous alternative, path-integral formulation of quantum mechanics.16
15 The equation set (36) constitutes the Ehrenfest theorem, named after its author, P. Ehrenfest.
16 This formulation was developed in 1948 by Richard Phillips Feynman. (According to his recollections, this work
was motivated by a “mysterious” remark by P. Dirac in his pioneering 1930 textbook on quantum mechanics.)
I will review this important concept, cutting one math corner for the sake of brevity.17 (This shortcut
will be clearly marked below.)
Let us inner-multiply both parts of Eq. (4.157a), which is essentially the definition of the
time-evolution operator, by the bra-vector of state x,
$$\langle x|\alpha(t)\rangle = \langle x|\hat{u}(t, t_0)|\alpha(t_0)\rangle, \tag{5.37}$$
insert the identity operator before the ket-vector on the right-hand side, and then use the closure
condition in the form of Eq. (4.252), with x’ replaced with x0:
$$\langle x|\alpha(t)\rangle = \int dx_0\,\langle x|\hat{u}(t, t_0)|x_0\rangle\langle x_0|\alpha(t_0)\rangle. \tag{5.38}$$
According to Eq. (4.233), this equality may be represented as
$$\Psi_\alpha(x, t) = \int dx_0\,\langle x|\hat{u}(t, t_0)|x_0\rangle\,\Psi_\alpha(x_0, t_0). \tag{5.39}$$
Comparing this expression with Eq. (2.44), we see that the long bracket in this relation is nothing other
than the 1D propagator, which was discussed in Sec. 2.2, i.e.
$$G(x, t; x_0, t_0) = \langle x|\hat{u}(t, t_0)|x_0\rangle. \tag{5.40}$$
Let me hope that the reader sees that this equality corresponds to the physical sense of the propagator.
Now let us break the time segment [t0, t] into N (for the time being, not necessarily equal) parts,
by inserting (N – 1) intermediate points (Fig. 4) with
$$t_0 < t_1 < \ldots < t_k < \ldots < t_{N-1} < t, \tag{5.41}$$
and use the definition (4.157) of the time evolution operator to write
$$\hat{u}(t, t_0) = \hat{u}(t, t_{N-1})\,\hat{u}(t_{N-1}, t_{N-2})\ldots\hat{u}(t_2, t_1)\,\hat{u}(t_1, t_0). \tag{5.42}$$
After plugging Eq. (42) into Eq. (40), let us insert the identity operator, again in the closure form
(4.252), but written for x_k rather than x’, between each two partial evolution operators including the time
argument t_k. The result is
$$G(x, t; x_0, t_0) = \int dx_{N-1}\int dx_{N-2}\ldots\int dx_1\,\langle x|\hat{u}(t, t_{N-1})|x_{N-1}\rangle\langle x_{N-1}|\hat{u}(t_{N-1}, t_{N-2})|x_{N-2}\rangle\ldots\langle x_1|\hat{u}(t_1, t_0)|x_0\rangle. \tag{5.43}$$
The physical sense of each integration variable xk is the wavefunction’s argument at time tk – see Fig. 4.
Fig. 5.4. Time partition and coordinate notation at the initial stage of the Feynman path integral’s
derivation. (The time points t_0, t_1, ..., t_k, ..., t_{N-2}, t_{N-1}, t are paired with the coordinates
x_0, x_1, ..., x_k, ..., x_{N-2}, x_{N-1}, x.)
17 A more thorough discussion of the path-integral approach may be found in the famous text by R. Feynman and
A. Hibbs, Quantum Mechanics and Path Integrals, first published in 1965. (For its latest edition by Dover in
2010, the book was emended by D. Styer.) For a more recent monograph, which reviews more applications, see
L. Schulman, Techniques and Applications of Path Integration, Wiley, 1981.
Feynman’s key breakthrough was the realization that if all intervals are taken equal and
sufficiently small, t_k – t_{k-1} = dτ → 0, all the partial brackets participating in Eq. (43) may be expressed
via the free-particle’s propagator, given by Eq. (2.49), even if the particle is not free, but moves in a
stationary potential profile U(x). To show that, let us use either Eq. (4.175) or Eq. (4.181), which, for a
small time interval dτ, give the same result:
$$\hat{u}(\tau + d\tau, \tau) = \exp\left\{-\frac{i}{\hbar}\hat{H}d\tau\right\} = \exp\left\{-\frac{i}{\hbar}\left(\frac{\hat{p}^2}{2m}d\tau + U(\hat{x})d\tau\right)\right\}. \tag{5.44}$$
Generally, an exponent of a sum of two operators may be treated as that of c-number arguments, and in
particular factored into a product of two exponents, only if the operators commute. (In this case, we can
use all the standard algebra for the exponents of c-number arguments.) In our case, this is not so,
because the operator p̂^2/2m does not commute with x̂, and hence with U(x̂). However, it may be
shown18 that for an infinitesimal time interval dτ, the non-zero commutator
$$\left[\frac{\hat{p}^2}{2m}d\tau,\ U(\hat{x})d\tau\right] \neq 0, \tag{5.45}$$
proportional to (dτ)^2, may be ignored in the first, linear approximation in dτ. As a result, we may
factorize the right-hand side in Eq. (44) by writing
$$\hat{u}(\tau + d\tau, \tau)\Big|_{d\tau\to 0} \to \exp\left\{-\frac{i}{\hbar}\frac{\hat{p}^2}{2m}d\tau\right\}\exp\left\{-\frac{i}{\hbar}U(\hat{x})d\tau\right\}. \tag{5.46}$$
(This approximation is very much similar in spirit to the trapezoidal-rule approximation in the usual 1D
integration,19 which is also asymptotically unimpeachable.)
Since the second exponential function on the right-hand side of Eq. (46) commutes with the
coordinate operator, we may move it out of each partial bracket participating in Eq. (43), with U(x)
turning into a c-number function:
$$\langle x + dx|\hat{u}(\tau + d\tau, \tau)|x\rangle = \langle x + dx|\exp\left\{-\frac{i}{\hbar}\frac{\hat{p}^2}{2m}d\tau\right\}|x\rangle\exp\left\{-\frac{i}{\hbar}U(x)d\tau\right\}. \tag{5.47}$$
But the remaining bracket is just the propagator of a free particle, so that for it we may use Eq. (2.49):
$$\langle x + dx|\exp\left\{-\frac{i}{\hbar}\frac{\hat{p}^2}{2m}d\tau\right\}|x\rangle = \left(\frac{m}{2\pi i\hbar\,d\tau}\right)^{1/2}\exp\left\{i\frac{m(dx)^2}{2\hbar\,d\tau}\right\}. \tag{5.48}$$
As a result, the full propagator (43) takes the form
$$G(x, t; x_0, t_0) = \lim_{\substack{d\tau\to 0\\ N\to\infty}}\int dx_{N-1}\int dx_{N-2}\ldots\int dx_1\left(\frac{m}{2\pi i\hbar\,d\tau}\right)^{N/2}\exp\left\{\sum_{k=1}^{N}\left[i\frac{m(dx)^2}{2\hbar\,d\tau} - i\frac{U(x)}{\hbar}d\tau\right]\right\}. \tag{5.49}$$
18 This is exactly the corner I am going to cut because a strict mathematical proof of this (intuitively evident)
statement would take more time/space than I can afford.
19 See, e.g., MA Eq. (5.2).
At N → ∞ and hence dτ ≡ (t – t0)/N → 0, the sum under the exponent in this expression may be
approximated with the corresponding integral:
$$\sum_{k=1}^{N}\frac{i}{\hbar}\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right]_{\tau = t_k}d\tau \to \frac{i}{\hbar}\int_{t_0}^{t}\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right]d\tau, \tag{5.50}$$
and the expression in the square brackets is just the particle’s Lagrangian function L.20 The integral of
this function over time is the classical action S calculated along a particular “path” x(τ).21 As a result,
defining the (1D) path integral as
$$\int(\ldots)D[x(\tau)] \equiv \lim_{\substack{d\tau\to 0\\ N\to\infty}}\left(\frac{m}{2\pi i\hbar\,d\tau}\right)^{N/2}\int dx_{N-1}\int dx_{N-2}\ldots\int dx_1\,(\ldots), \tag{5.51a}$$
(1D path integral: definition)
we can bring our result to the following (superficially simple) form:
$$G(x, t; x_0, t_0) = \int\exp\left\{\frac{i}{\hbar}S[x(\tau)]\right\}D[x(\tau)]. \tag{5.51b}$$
(1D propagator via path integral)
The name “path integral” for the mathematical construct (51a) may be readily explained if we
keep the number N of time intervals large but finite, and also approximate each of the enclosed integrals
with a sum over M >> 1 discrete points along the coordinate axis – see Fig. 5a.
Fig. 5.5. Several 1D classical paths: (a) in the discrete approximation (with the time step dτ, N – 1
intermediate time points, and M coordinate points) and (b) in the continuous limit.
Then the path integral (51a) is the product of (N – 1) sums corresponding to different values of
time τ, each of them with M terms, each of those representing the function under the integral at a
particular spatial point. Multiplying those (N – 1) sums, we get a sum of M^(N–1) terms, each
evaluating the function on a specific set of spatial-temporal points [x, τ], one per intermediate time.
Each of these terms may be now interpreted as the contribution of one of all possible different
continuous classical paths x(τ) from the initial point [x0, t0] to the final
point [x, t]. It is evident that the last interpretation remains true even in the continuous limit N, M → ∞ –
see Fig. 5b.
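For small N and M, the discrete paths of Fig. 5a may be listed explicitly. In the sketch below, which uses made-up grid values and a free particle (U = 0) as the simplest case, each discrete path visits one of the M grid points at each of the (N – 1) intermediate times, so that there are M^(N–1) distinct paths; for each of them, the discretized action, i.e. the sum participating in Eq. (50) without the factor i/ħ, is calculated:

```python
from itertools import product

MASS = 1.0  # hypothetical particle mass, arbitrary units

def discretized_action(path, d_tau):
    """Free-particle version of the sum in Eq. (5.50): sum of m (dx)^2 / (2 d_tau)."""
    return sum(MASS * (b - a) ** 2 / (2 * d_tau) for a, b in zip(path, path[1:]))

def all_paths(x_start, x_end, grid, n_intervals):
    """Every discrete path with one grid point at each of the N - 1 intermediate times."""
    return [(x_start,) + middle + (x_end,)
            for middle in product(grid, repeat=n_intervals - 1)]

GRID = (0.0, 0.5, 1.0, 1.5, 2.0)   # M = 5 hypothetical coordinate values
N_INTERVALS = 4                     # N = 4 time intervals
D_TAU = 1.0 / N_INTERVALS           # total time t - t0 = 1 (arbitrary units)
PATHS = all_paths(0.0, 2.0, GRID, N_INTERVALS)
BEST = min(PATHS, key=lambda path: discretized_action(path, D_TAU))
print(len(PATHS), BEST)
```

With these numbers, the list holds 5^3 = 125 paths, and the smallest action belongs to the uniform motion (0, 0.5, 1, 1.5, 2), i.e. to the discrete version of the classical trajectory of a free particle.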
Why does such path representation of the sum make sense? This is because in the classical limit
the particle follows just a certain path, corresponding to the minimum of the action S_cl. As a result, for
all close trajectories, the difference (S – S_cl) is proportional to the square of the deviation from the
classical trajectory. Hence, for a quasiclassical motion, with S_cl >> ħ, there is a bunch of close
trajectories, with (S – S_cl) << ħ, that give substantial contributions to the path integral. On the other
hand, strongly non-classical trajectories, with (S – S_cl) >> ħ, give phases S/ħ rapidly oscillating from
one trajectory to the next one, and their contributions to the path integral are averaged out.22 As a result,
for a quasiclassical motion, the propagator’s exponent may be evaluated on the classical path only:
$$G_{\mathrm{cl}} \propto \exp\left\{\frac{i}{\hbar}S_{\mathrm{cl}}\right\} = \exp\left\{\frac{i}{\hbar}\int_{t_0}^{t}\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right]d\tau\right\}. \tag{5.52}$$
20 See, e.g., CM Sec. 2.1.
21 See, e.g., CM Sec. 10.3.
The sum of the kinetic and potential energies is the full energy E of the particle, which remains constant
for motion in a stationary potential U(x), so that we may rewrite the expression under this integral as23
$$\left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 - U(x)\right]d\tau = \left[m\left(\frac{dx}{d\tau}\right)^2 - E\right]d\tau = m\frac{dx}{d\tau}dx - E\,d\tau. \tag{5.53}$$
With this replacement, Eq. (52) yields
$$G_{\mathrm{cl}} \propto \exp\left\{\frac{i}{\hbar}\int_{x_0}^{x}m\frac{dx}{d\tau}dx\right\}\exp\left\{-\frac{i}{\hbar}E(t - t_0)\right\} = \exp\left\{\frac{i}{\hbar}\int_{x_0}^{x}p(x)dx\right\}\exp\left\{-\frac{i}{\hbar}E(t - t_0)\right\}, \tag{5.54}$$
where p is the classical momentum of the particle. But (at least, leaving the pre-exponential factor alone)
this is the WKB approximation result that was derived and studied in detail in Chapter 2!
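Both ingredients of this argument, the convergence of the sum (50) to the classical action and the quadratic growth of (S – S_cl) with the deviation from the classical path, are easy to illustrate numerically. The sketch below assumes a 1D harmonic oscillator with made-up parameters, for which the classical action between two given points is known in a closed form, and a smooth trial deformation ε sin[π(τ – t0)/(t – t0)] that vanishes at both ends of the path:

```python
import math

MASS, OMEGA = 1.0, 1.0          # hypothetical oscillator parameters
T_TOTAL, X0, X1 = 1.0, 0.2, 1.0  # time interval t - t0 and the end points

def classical_path(tau):
    """Classical trajectory of the harmonic oscillator from (X0, 0) to (X1, T_TOTAL)."""
    return ((X0 * math.sin(OMEGA * (T_TOTAL - tau)) + X1 * math.sin(OMEGA * tau))
            / math.sin(OMEGA * T_TOTAL))

def action(epsilon, steps=2000):
    """Discretized action (the sum of Eq. (5.50) without i/hbar) along a deformed path."""
    d_tau = T_TOTAL / steps

    def x(k):
        tau = k * d_tau
        return classical_path(tau) + epsilon * math.sin(math.pi * tau / T_TOTAL)

    total = 0.0
    for k in range(1, steps + 1):
        kinetic = MASS * (x(k) - x(k - 1)) ** 2 / (2 * d_tau)
        potential = 0.5 * MASS * OMEGA ** 2 * x(k) ** 2 * d_tau
        total += kinetic - potential
    return total

S_CL = action(0.0)
# Textbook closed form of the oscillator's classical action between the two end points.
S_EXACT = MASS * OMEGA / (2 * math.sin(OMEGA * T_TOTAL)) * (
    (X0 ** 2 + X1 ** 2) * math.cos(OMEGA * T_TOTAL) - 2 * X0 * X1)
RATIO = (action(0.2) - S_CL) / (action(0.1) - S_CL)
print(S_CL, S_EXACT, RATIO)
```

The discretized action on the undeformed path should be close to the closed-form value, while doubling ε should increase (S – S_cl) by a factor very close to 4.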
One may question the value of such a complicated calculation, which yields the results that could
be readily obtained from Schrödinger’s wave mechanics. Feynman’s approach is indeed not used too
often, but it has its merits. First, it has an important philosophical (and hence heuristic) value. Indeed,
Eq. (51) may be interpreted by saying that the essence of quantum mechanics is the exploration, by the
system, of all possible paths x(τ), each of them classical-like, in the sense that the particle’s coordinate x
and velocity dx/dτ are exactly defined simultaneously at each point. The resulting contributions to the
path integral are added up coherently to form the actual propagator G, and via it, the final probability W
∝ |G|^2 of the particle’s propagation from [x0, t0] to [x, t]. As the scale of the action S of the motion
decreases and becomes comparable to ħ, more and more paths produce substantial contributions to this
sum, and hence to W, providing a larger and larger difference between the quantum and classical
properties of the system.
Second, the path integral provides a justification for some simple explanations of quantum
phenomena. A typical example is the quantum interference effects discussed in Sec. 3.1 – see, e.g., Fig.
3.1 and the corresponding text. In that discussion, we used the Huygens principle to argue that at the
two-slit interference, the WKB approximation might be restricted to contributions from two paths that
pass through different slits, but otherwise consist of straight-line segments. To have another look at
that assumption, let us generalize the path integral to multi-dimensional geometries. Fortunately, the
simple structure of Eq. (51b) makes such generalization virtually evident:
$$G(\mathbf{r}, t; \mathbf{r}_0, t_0) = \int\exp\left\{\frac{i}{\hbar}S[\mathbf{r}(\tau)]\right\}D[\mathbf{r}(\tau)], \qquad S = \int_{t_0}^{t}L\left(\mathbf{r}, \frac{d\mathbf{r}}{d\tau}\right)d\tau = \int_{t_0}^{t}\left[\frac{m}{2}\left(\frac{d\mathbf{r}}{d\tau}\right)^2 - U(\mathbf{r})\right]d\tau, \tag{5.55}$$
(3D propagator as a path integral)
where the definition (51a) of the path integral should be also modified correspondingly. (I will not go
into these technical details.) For the Young-type experiment (Fig. 3.1), where a classical particle could
reach the detector only after passing through one of the slits, the classical paths are the straight-line
segments shown in Fig. 3.1, and if they are much longer than the de Broglie wavelength, the propagator
may be well approximated by the sum of two exponents of the type (54), with the phases given by the
integrals of p(r)⋅dr/ħ along each of the two paths – as it was done in Sec. 3.1.
22 This fact may be proved by expanding the difference (S – S_cl) in the Taylor series in the path variation (leaving
only the leading quadratic terms) and working out the resulting Gaussian integrals. This integration, together with
the pre-exponential coefficient in Eq. (51a), gives exactly the pre-exponential factor that we have
already found refining the WKB approximation in Sec. 2.4.
23 The same trick is often used in analytical classical mechanics – say, for proving the Hamilton principle, and for
the derivation of the Hamilton – Jacobi equations (see, e.g., CM Secs. 10.3-4).
Last but not least, the path integral allows simple solutions to some problems that would be hard
to obtain by other methods. As the simplest example, let us consider the problem of tunneling in multi-
dimensional space, sketched in Fig. 6 for the 2D case – just for the graphics’ simplicity. Here, the
potential profile U(x, y) has a saddle-like shape. (Another helpful image is a mountain path between two
summits, in Fig. 6 located on the top and at the bottom of the shown region.) A particle of energy E may
move classically in the left and right regions with U(x, y) < E, but if E is not sufficiently high, it can pass
from one of these regions to another one only via the quantum-mechanical tunneling under the pass. Let
us calculate the transparency of this potential barrier in the WKB approximation, ignoring the possible
pre-exponential factor. 24
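Fig. 5.6. A saddle-type 2D potential profile and the instanton trajectory of a particle of energy E
(schematically). (The sketch shows the x and y axes, the regions with U < E on the left and on the right,
the equipotential lines U = E and U1 > E, U2 > E between them, and a trajectory from r0 to r.)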
According to the evident multi-dimensional generalization of Eq. (54), for the classically forbidden
region, where E < U(x, y), and hence p(r)/ħ = iκ(r), the contributions to the propagator (55) are
proportional to
$$e^{-I}\exp\left\{-\frac{i}{\hbar}E(t - t_0)\right\}, \quad \text{where } I \equiv \int_{\mathbf{r}_0}^{\mathbf{r}}\boldsymbol{\kappa}(\mathbf{r})\cdot d\mathbf{r}, \tag{5.56}$$
where |κ| may be calculated just as in the 1D case – cf. Eq. (2.97):
$$\frac{\hbar^2\kappa^2(\mathbf{r})}{2m} = U(\mathbf{r}) - E. \tag{5.57}$$
24 Actually, one can argue that the pre-exponential factor should be close to 1, just like in Eq. (2.117), especially
if the potential is smooth, in the sense of Eq. (2.107), in all spatial directions. (Let me remind the reader that for
most practical applications of quantum tunneling, the pre-exponential factor is of minor importance.)
Hence the path integral in this region is much simpler than in the classically allowed region,
because the spatial exponents are purely real and there is no complex interference between them. Due to
the minus sign before I in the exponent (56), the largest contribution to G evidently comes from the
trajectory (or a narrow bundle of close trajectories) for which the integral I has the smallest value, so
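that the barrier transparency may be calculated as
3D tunneling in WKB limit:
$$\mathcal{T} \approx |G|^2 \approx e^{-2I} \equiv \exp\left\{-2\int_{\mathbf{r}_0}^{\mathbf{r}} \boldsymbol{\kappa}(\mathbf{r}')\cdot d\mathbf{r}'\right\}, \tag{5.58}$$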
where r and r0 are certain points on the opposite classical turning-point surfaces: U(r) = U(r0) = E – see
Fig. 6.
Thus the barrier transparency problem is reduced to finding the trajectory (including the points r
and r0) that connects the two surfaces and minimizes the functional I. This is of course a well-known
problem of the calculus of variations,25 but it is interesting that the path integral provides a simple
alternative way of solving it. Let us consider an auxiliary problem of particle’s motion in the potential
profile $U_{\rm inv}(\mathbf{r})$ that is inverted relative to the particle’s energy E, i.e. is defined by the following equality:
$$U_{\rm inv}(\mathbf{r}) - E \equiv E - U(\mathbf{r}). \tag{5.59}$$
As was discussed above, at fixed energy E, the path integral for the WKB motion in the classically allowed region of potential $U_{\rm inv}(x, y)$ (that coincides with the classically forbidden region of the original problem) is dominated by the classical trajectory corresponding to the minimum of
$$S_{\rm inv} = \int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{p}_{\rm inv}(\mathbf{r}')\cdot d\mathbf{r}' = \hbar\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{k}_{\rm inv}(\mathbf{r}')\cdot d\mathbf{r}', \tag{5.60}$$
where $k_{\rm inv}$ should be determined from the WKB relation
$$\frac{\hbar^2 k_{\rm inv}^2(\mathbf{r})}{2m} \equiv E - U_{\rm inv}(\mathbf{r}). \tag{5.61}$$
But comparing Eqs. (57), (59), and (61), we see that kinv = κ at each point! This means that the tunneling
path (in the WKB limit) corresponds to the classical (so-called instanton26) trajectory of the same
particle moving in the inverted potential Uinv(r). If the initial point r0 is fixed, this trajectory may be
readily found by the means of classical mechanics. (Note that the initial kinetic energy, and hence the
initial velocity of the instanton launched from point r0 should be zero because by the classical turning
point definition, Uinv(r0) = U(r0) = E.) Thus the problem is further reduced to a simpler task of
maximizing the transparency (58) by choosing the optimal position of r0 on the equipotential surface
U(r0) = E – see Fig. 6. Moreover, for many symmetric potentials, the position of this point may be
readily guessed even without calculations – as it is in Problems 6 and 7, left for the reader’s exercise.
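As a numerical illustration of Eqs. (56)-(58), the following minimal sketch evaluates the WKB exponent I along several straight trial paths through a saddle. Everything specific here is an assumption made for illustration only: the potential (a Gaussian barrier along x with a parabolic valley along y), the parameter values, the units with ħ = m = 1, and the restriction to straight paths y = const. For this symmetric potential the instanton trajectory runs along the line y = 0, so that the displaced paths should give larger values of I.

```python
import numpy as np

# Illustrative units: hbar = m = 1 (an assumption of this sketch).
U0, K_Y, E = 5.0, 1.0, 1.0   # hypothetical barrier height, valley stiffness, energy

def potential(x, y):
    """Hypothetical saddle: a Gaussian barrier along x, a parabolic valley along y."""
    return U0 * np.exp(-x**2) + 0.5 * K_Y * y**2

def wkb_exponent(y0, n_points=20001):
    """I = integral of kappa along the straight path y = y0 between its turning points."""
    e_eff = E - 0.5 * K_Y * y0**2            # energy left for the motion along x
    if not 0.0 < e_eff < U0:
        raise ValueError("no tunneling segment on this path")
    x_turn = np.sqrt(np.log(U0 / e_eff))     # solves U(x_turn, y0) = E
    x = np.linspace(-x_turn, x_turn, n_points)
    # kappa from Eq. (57) with hbar = m = 1; clip guards against round-off below zero
    kappa = np.sqrt(np.clip(2.0 * (potential(x, y0) - E), 0.0, None))
    return float(np.sum(0.5 * (kappa[1:] + kappa[:-1]) * np.diff(x)))  # trapezoid rule

def transparency(y0):
    """WKB estimate exp(-2 I) of Eq. (58), without the pre-exponential factor."""
    return float(np.exp(-2.0 * wkb_exponent(y0)))

if __name__ == "__main__":
    for y0 in (0.0, 0.3, 0.6):
        print(f"y0 = {y0:.1f}: I = {wkb_exponent(y0):.4f}, T ~ {transparency(y0):.3e}")
```

The comparison covers only one family of trial paths; finding the true minimum of I for a less symmetric potential would require integrating the classical equations of motion in the inverted potential (59).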
Note that besides the calculation of the potential barrier’s transparency, the instanton trajectory has one more important implication: the so-called traversal time $\tau_t$ of the classical motion along it, from the point r0 to the point r, in the inverted potential defined by Eq. (59), plays the role of the most important (though not the only one) time scale of the particle’s tunneling under the barrier.27
25 For a concise introduction to the field see, e.g., I. Gelfand and S. Fomin, Calculus of Variations, Dover, 2000, or L. Elsgolc, Calculus of Variations, Dover, 2007.
26 In the quantum field theory, the instanton concept may be formulated somewhat differently, and has more complex applications – see, e.g. R. Rajaraman, Solitons and Instantons, North-Holland, 1987.
5.4. Revisiting harmonic oscillator

Let us return to the 1D harmonic oscillator, now understood as any system, regardless of its physical nature, described by the Hamiltonian (4.237) with the potential energy (2.111):
Harmonic oscillator: Hamiltonian
$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{m\omega_0^2\hat{x}^2}{2}. \tag{5.62}$$
In Sec. 2.9 we have used a “brute-force” (wave-mechanics) approach to analyze the eigenfunctions $\psi_n(x)$ and eigenvalues $E_n$ of this Hamiltonian, and found that, unfortunately, this approach required relatively complex mathematics, which does not enable an easy calculation of its key characteristics. Fortunately, the bra-ket formalism helps to make such calculations.
First, introducing normalized (dimensionless) operators of coordinates and momentum:28
$$\hat{\xi} \equiv \frac{\hat{x}}{x_0}, \qquad \hat{\zeta} \equiv \frac{\hat{p}}{m\omega_0 x_0}, \tag{5.63}$$
where $x_0 \equiv (\hbar/m\omega_0)^{1/2}$ is the natural coordinate scale discussed in detail in Sec. 2.9, we can represent the Hamiltonian (62) in a very simple and $x \leftrightarrow p$ symmetric form:
$$\hat{H} = \frac{\hbar\omega_0}{2}\left(\hat{\xi}^2 + \hat{\zeta}^2\right). \tag{5.64}$$
This symmetry, as well as our discussion of the very similar coordinate and momentum representations in Sec. 4.7, hints that much may be gained by treating the operators $\hat{\xi}$ and $\hat{\zeta}$ on equal footing. Inspired by this clue, let us introduce a new operator
Annihilation operator: definition
$$\hat{a} \equiv \frac{\hat{\xi} + i\hat{\zeta}}{\sqrt{2}} \equiv \left(\frac{m\omega_0}{2\hbar}\right)^{1/2}\left(\hat{x} + i\frac{\hat{p}}{m\omega_0}\right). \tag{5.65a}$$
Since both operators $\hat{\xi}$ and $\hat{\zeta}$ correspond to real observables, i.e. have real eigenvalues and hence are Hermitian (self-adjoint), the Hermitian conjugate of the operator $\hat{a}$ is simply its complex conjugate:
Creation operator: definition
$$\hat{a}^\dagger \equiv \frac{\hat{\xi} - i\hat{\zeta}}{\sqrt{2}} \equiv \left(\frac{m\omega_0}{2\hbar}\right)^{1/2}\left(\hat{x} - i\frac{\hat{p}}{m\omega_0}\right). \tag{5.65b}$$
For a reason that will be clear very soon, $\hat{a}^\dagger$ and $\hat{a}$ (in this order!) are called the creation and annihilation operators.
27 For more on this interesting issue see, e.g., M. Buttiker and R. Landauer, Phys. Rev. Lett. 49, 1739 (1982), and
references therein.
28 This normalization is not really necessary, it just makes the following calculations less bulky – and thus more
aesthetically appealing.
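Now solving the simple system of two linear equations (65) for $\hat{\xi}$ and $\hat{\zeta}$, we get the following reciprocal relations:
$$\hat{\xi} = \frac{\hat{a} + \hat{a}^\dagger}{\sqrt{2}}, \quad \hat{\zeta} = \frac{\hat{a} - \hat{a}^\dagger}{\sqrt{2}\,i}, \quad \text{i.e. } \hat{x} = \left(\frac{\hbar}{m\omega_0}\right)^{1/2}\frac{\hat{a} + \hat{a}^\dagger}{\sqrt{2}}, \quad \hat{p} = \left(\hbar m\omega_0\right)^{1/2}\frac{\hat{a} - \hat{a}^\dagger}{\sqrt{2}\,i}. \tag{5.66}$$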
Our Hamiltonian (64) includes squares of these operators. Calculating them, we have to be careful to avoid swapping the new operators, because they do not commute. Indeed, for the normalized operators (63), Eq. (2.14) gives
$$\left[\hat{\xi}, \hat{\zeta}\right] = \frac{1}{x_0^2 m\omega_0}\left[\hat{x}, \hat{p}\right] = i\hat{I}, \tag{5.67}$$
so that Eqs. (65) yield
Creation-annihilation operators: commutation relation
$$\left[\hat{a}, \hat{a}^\dagger\right] = \frac{1}{2}\left[\hat{\xi} + i\hat{\zeta},\ \hat{\xi} - i\hat{\zeta}\right] = -\frac{i}{2}\left(\left[\hat{\xi}, \hat{\zeta}\right] - \left[\hat{\zeta}, \hat{\xi}\right]\right) = \hat{I}. \tag{5.68}$$
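With such due caution, Eq. (66) gives
$$\hat{\xi}^2 = \frac{1}{2}\left(\hat{a}^2 + \hat{a}^{\dagger 2} + \hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}\right), \qquad \hat{\zeta}^2 = -\frac{1}{2}\left(\hat{a}^2 + \hat{a}^{\dagger 2} - \hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a}\right). \tag{5.69}$$
Plugging these expressions back into Eq. (64), we get
$$\hat{H} = \frac{\hbar\omega_0}{2}\left(\hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}\right). \tag{5.70}$$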
This expression is elegant enough, but may be recast into an even more convenient form. For that, let us rewrite the commutation relation (68) as
$$\hat{a}\hat{a}^\dagger = \hat{a}^\dagger\hat{a} + \hat{I} \tag{5.71}$$
and plug it into Eq. (70). The result is
$$\hat{H} = \frac{\hbar\omega_0}{2}\left(2\hat{a}^\dagger\hat{a} + \hat{I}\right) \equiv \hbar\omega_0\left(\hat{N} + \frac{1}{2}\hat{I}\right), \tag{5.72}$$
where, in the last form, one more (evidently, Hermitian) operator,
Number operator: definition
$$\hat{N} \equiv \hat{a}^\dagger\hat{a}, \tag{5.73}$$
has been introduced. Since, according to Eq. (72), the operators $\hat{H}$ and $\hat{N}$ differ only by the addition of the identity operator and multiplication by a c-number, these operators commute. Hence, according to the general arguments of Sec. 4.5, they share a set of stationary eigenstates n (they are frequently called the Fock states), and we can write the standard eigenproblem (4.68) for the new operator as
$$\hat{N}|n\rangle = N_n|n\rangle, \tag{5.74}$$
where $N_n$ are some eigenvalues that, according to Eq. (72), determine also the energy spectrum of the oscillator:
$$E_n = \hbar\omega_0\left(N_n + \frac{1}{2}\right). \tag{5.75}$$
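So far, we know only that all eigenvalues $N_n$ are real; to calculate them, let us carry out the following calculation – splendid in its simplicity and efficiency. Consider the result of the action of the operator $\hat{N}$ on the ket-vector $\hat{a}^\dagger|n\rangle$. Using the definition (73) and then the associative rule of the bra-ket formalism, we may write
$$\hat{N}\left(\hat{a}^\dagger|n\rangle\right) \equiv \left(\hat{a}^\dagger\hat{a}\right)\left(\hat{a}^\dagger|n\rangle\right) = \hat{a}^\dagger\left(\hat{a}\hat{a}^\dagger\right)|n\rangle. \tag{5.76}$$
Now using the commutation relation (71), and then Eq. (74), we may continue as
$$\hat{a}^\dagger\left(\hat{a}\hat{a}^\dagger\right)|n\rangle = \hat{a}^\dagger\left(\hat{a}^\dagger\hat{a} + \hat{I}\right)|n\rangle \equiv \hat{a}^\dagger\left(\hat{N} + \hat{I}\right)|n\rangle = \hat{a}^\dagger\left(N_n + 1\right)|n\rangle = \left(N_n + 1\right)\hat{a}^\dagger|n\rangle. \tag{5.77}$$
For clarity, let us summarize the result of this calculation:
$$\hat{N}\left(\hat{a}^\dagger|n\rangle\right) = \left(N_n + 1\right)\left(\hat{a}^\dagger|n\rangle\right). \tag{5.78}$$
Performing a similar calculation for the operator $\hat{a}$, we get a similar formula:
$$\hat{N}\left(\hat{a}|n\rangle\right) = \left(N_n - 1\right)\left(\hat{a}|n\rangle\right). \tag{5.79}$$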
It is time to stop calculations for a minute, and translate these results into plain English: if $|n\rangle$ is an eigenket of the operator $\hat{N}$ with the eigenvalue $N_n$, then $\hat{a}^\dagger|n\rangle$ and $\hat{a}|n\rangle$ are also eigenkets of that operator, with the eigenvalues $(N_n + 1)$ and $(N_n - 1)$, respectively. This statement may be vividly represented on the so-called ladder diagram shown in Fig. 7.
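The "similar calculation" behind Eq. (79) may be spelled out as follows. This is only a sketch of one possible route; it uses nothing beyond the definition (73), the eigenproblem (74), and the commutation relation (71) rewritten as $\hat{a}^\dagger\hat{a} = \hat{a}\hat{a}^\dagger - \hat{I}$:

```latex
\begin{aligned}
\hat{N}\left(\hat{a}|n\rangle\right)
 &\equiv \left(\hat{a}^\dagger\hat{a}\right)\hat{a}|n\rangle
  = \left(\hat{a}\hat{a}^\dagger - \hat{I}\right)\hat{a}|n\rangle
  = \hat{a}\left(\hat{a}^\dagger\hat{a} - \hat{I}\right)|n\rangle \\
 &= \hat{a}\left(\hat{N} - \hat{I}\right)|n\rangle
  = \hat{a}\left(N_n - 1\right)|n\rangle
  = \left(N_n - 1\right)\hat{a}|n\rangle .
\end{aligned}
```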
Fig. 5.7. The “ladder diagram” of eigenstates of a 1D harmonic oscillator. Arrows show the actions of the creation and annihilation operators on the eigenstates. (The steps shown are the eigenkets $\hat{a}^\dagger|n\rangle$, $|n\rangle$, and $\hat{a}|n\rangle$, with the eigenvalues $N_n + 1$, $N_n$, and $N_n - 1$ of $\hat{N}$, respectively.)
The operator $\hat{a}^\dagger$ moves the system one step up this ladder, while the operator $\hat{a}$ brings it one step down. In other words, the former operator creates a new excitation of the system,29 while the latter operator kills (“annihilates”) such excitation.30 On the other hand, according to Eq. (74) inner-multiplied by the bra-vector $\langle n|$, the operator $\hat{N}$ does not change the state of the system, but “counts” its position on the ladder:
$$\langle n|\hat{N}|n\rangle = \langle n|N_n|n\rangle = N_n. \tag{5.80}$$
This is why $\hat{N}$ is called the number operator, in our current context meaning the number of the elementary excitations of the oscillator.
29 For electromagnetic field oscillators, such excitations are called photons; for mechanical wave oscillators, phonons, etc.
30 This is exactly why $\hat{a}^\dagger$ is called the creation operator, and $\hat{a}$, the annihilation operator.
This calculation still needs completion. Indeed, we still do not know whether the ladder shown in Fig. 7 shows all eigenstates of the oscillator, and what exactly the numbers $N_n$ are. Fascinatingly enough, both questions may be answered by exploring just one paradox. Let us start with some state n (read: a step of the ladder), and keep going down the ladder, applying the operator $\hat{a}$ again and again. According to Eq. (79), at each step the eigenvalue $N_n$ is decreased by one, so that eventually it should become negative. However, this cannot happen, because any actual eigenstate, including the states represented by kets $|d\rangle \equiv \hat{a}|n\rangle$ and $|n\rangle$, should have a positive norm – see Eq. (4.16). Comparing the norms,
$$\| d \|^2 = \langle n|\hat{a}^\dagger\hat{a}|n\rangle = \langle n|\hat{N}|n\rangle = N_n\langle n|n\rangle, \qquad \| n \|^2 = \langle n|n\rangle, \tag{5.81}$$
we see that both of them cannot be positive simultaneously if $N_n$ is negative.

To resolve this paradox let us notice that the action of the creation and annihilation operators on the stationary states n may consist of not only their promotion to an adjacent step of the ladder diagram but also of their multiplication by some c-numbers:
$$\hat{a}|n\rangle = A_n|n-1\rangle, \qquad \hat{a}^\dagger|n\rangle = A'_n|n+1\rangle. \tag{5.82}$$
(The linear relations (78)-(79) clearly allow that.) Let us calculate the coefficients $A_n$ assuming, for convenience, that all eigenstates, including the states n and (n – 1), are normalized:
$$\langle n|n\rangle = 1, \qquad \langle n-1|n-1\rangle = \langle n|\frac{\hat{a}^\dagger}{A_n^*}\frac{\hat{a}}{A_n}|n\rangle = \frac{1}{A_n^*A_n}\langle n|\hat{N}|n\rangle = \frac{N_n}{A_n^*A_n}\langle n|n\rangle = 1. \tag{5.83}$$
From here, we get $|A_n| = (N_n)^{1/2}$, i.e.
$$\hat{a}|n\rangle = N_n^{1/2}e^{i\varphi_n}|n-1\rangle, \tag{5.84}$$
where $\varphi_n$ is an arbitrary real phase. Now let us consider what happens if all numbers $N_n$ are integers. (Because of the definition of $N_n$, given by Eq. (74), it is convenient to call these integers n, i.e. to use the same letter as for the corresponding eigenstate.) Then when we have come down to the state with n = 0, an attempt to make one more step down gives
$$\hat{a}|0\rangle = 0\,|-1\rangle. \tag{5.85}$$
But according to Eq. (4.9), the state on the right-hand side of this equation is the “null-state”, i.e. does not exist.31 This gives the (only known :-) resolution of the state ladder paradox: the ladder has the lowest step with $N_n = n = 0$.
As a by-product of our discussion, we have obtained a very important relation Nn = n, which
means, in particular, that the state ladder shown in Fig. 7 includes all eigenstates of the oscillator.
31 Please note again the radical difference between the null-state on the right-hand side of Eq. (85) and the state described by the ket-vector $|0\rangle$ on the left-hand side of that relation. The latter state does exist and, moreover, represents the most important, ground state of the system, with n = 0 – see Eqs. (2.274)-(2.275).
Plugging this relation into Eq. (75), we see that the full spectrum of eigenenergies of the harmonic oscillator is described by the simple formula
$$E_n = \hbar\omega_0\left(n + \frac{1}{2}\right), \qquad n = 0, 1, 2, \ldots, \tag{5.86}$$
which was already discussed in Sec. 2.9. It is rather remarkable that the bra-ket formalism has allowed us to derive it without calculating the corresponding (rather cumbersome) wavefunctions $\psi_n(x)$ – see Eqs. (2.284).
Moreover, this formalism may be also used to calculate virtually any matrix element of the oscillator, without using $\psi_n(x)$. However, to do that, we should first calculate the coefficient $A'_n$ participating in the second of Eqs. (82). This may be done similarly to the above calculation of $A_n$; alternatively, since we already know that $A_n = (N_n)^{1/2} = n^{1/2}$, we may notice that according to Eqs. (73) and (82), the eigenproblem (74), which in our new notation for $N_n$ becomes
$$\hat{N}|n\rangle = n|n\rangle, \tag{5.87}$$
may be rewritten as
$$n|n\rangle \equiv \hat{a}^\dagger\hat{a}|n\rangle = \hat{a}^\dagger A_n|n-1\rangle = A_nA'_{n-1}|n\rangle. \tag{5.88}$$
Comparing the first and the last form of this equality, we see that $A'_{n-1} = n/A_n = n^{1/2}$, so that $A'_n = (n+1)^{1/2}\exp(i\varphi'_n)$. Taking all phases $\varphi_n$ and $\varphi'_n$ equal to zero for simplicity, we may spell out Eqs. (82) as32
Fock state ladder:
$$\hat{a}^\dagger|n\rangle = (n+1)^{1/2}|n+1\rangle, \qquad \hat{a}|n\rangle = n^{1/2}|n-1\rangle. \tag{5.89}$$
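Now we can use these formulas to calculate, for example, the matrix elements of the operator $\hat{x}$ in the Fock state basis:
$$\langle n'|\hat{x}|n\rangle = x_0\langle n'|\hat{\xi}|n\rangle = \frac{x_0}{\sqrt{2}}\langle n'|\left(\hat{a} + \hat{a}^\dagger\right)|n\rangle = \frac{x_0}{\sqrt{2}}\left(\langle n'|\hat{a}|n\rangle + \langle n'|\hat{a}^\dagger|n\rangle\right) = \frac{x_0}{\sqrt{2}}\left[n^{1/2}\langle n'|n-1\rangle + (n+1)^{1/2}\langle n'|n+1\rangle\right]. \tag{5.90}$$
Taking into account the Fock state orthonormality:
$$\langle n'|n\rangle = \delta_{n'n}, \tag{5.91}$$
this result becomes
Coordinate’s matrix elements:
$$\langle n'|\hat{x}|n\rangle = \frac{x_0}{\sqrt{2}}\left[n^{1/2}\delta_{n',n-1} + (n+1)^{1/2}\delta_{n',n+1}\right] \equiv \left(\frac{\hbar}{2m\omega_0}\right)^{1/2}\left[n^{1/2}\delta_{n',n-1} + (n+1)^{1/2}\delta_{n',n+1}\right]. \tag{5.92}$$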
Acting absolutely similarly, for the momentum’s matrix elements we get a similar expression:
$$\langle n'|\hat{p}|n\rangle = i\left(\frac{\hbar m\omega_0}{2}\right)^{1/2}\left[-n^{1/2}\delta_{n',n-1} + (n+1)^{1/2}\delta_{n',n+1}\right]. \tag{5.93}$$
32 A useful mnemonic rule for these key relations is that the c-number coefficient in any of them is equal to the square root of the larger of the numbers of the two states it relates.
Hence the matrices of both operators in the Fock-state basis have only two diagonals, adjacent to the
main diagonal; all other elements (including the main-diagonal ones) are zeros.
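The two-diagonal structure of these matrices is easy to check numerically. The sketch below builds truncated matrices of the ladder operators directly from Eq. (89), and then the matrices of the coordinate and momentum from Eq. (66). The function names, the truncation size, and the units ħ = m = ω0 = 1 are illustrative assumptions; the truncation spoils the commutation relation (68) only in the last diagonal element.

```python
import numpy as np

def ladder_matrices(dim):
    """Truncated matrices of a and a^dagger in the Fock basis |0>, ..., |dim-1>."""
    # <n-1| a |n> = n^{1/2}, Eq. (89): a single diagonal just above the main one.
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    return a, a.conj().T

def coordinate_momentum(dim, hbar=1.0, m=1.0, omega0=1.0):
    """Matrices of x and p built from Eq. (66); the default parameters are illustrative."""
    a, a_dag = ladder_matrices(dim)
    x0 = np.sqrt(hbar / (m * omega0))
    x = x0 * (a + a_dag) / np.sqrt(2.0)
    p = m * omega0 * x0 * (a - a_dag) / (np.sqrt(2.0) * 1j)
    return x, p

if __name__ == "__main__":
    dim = 6
    a, a_dag = ladder_matrices(dim)
    x, p = coordinate_momentum(dim)
    print(np.round(np.diag(a_dag @ a), 12))      # diagonal of the number operator
    print(np.round(a @ a_dag - a_dag @ a, 12))   # commutator (68) on the truncated basis
    print(np.round(x, 3))                        # nonzero elements only next to the main diagonal
```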
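The matrix elements of higher powers of these operators, as well as their products, may be handled similarly, though the higher the power, the bulkier the result. For example,
$$\begin{aligned}
\langle n'|\hat{x}^2|n\rangle &= \langle n'|\hat{x}\hat{x}|n\rangle = \sum_{n''=0}^{\infty}\langle n'|\hat{x}|n''\rangle\langle n''|\hat{x}|n\rangle \\
&= \frac{x_0^2}{2}\sum_{n''=0}^{\infty}\left[(n'')^{1/2}\delta_{n',n''-1} + (n''+1)^{1/2}\delta_{n',n''+1}\right]\left[n^{1/2}\delta_{n'',n-1} + (n+1)^{1/2}\delta_{n'',n+1}\right] \\
&= \frac{x_0^2}{2}\left\{\left[n(n-1)\right]^{1/2}\delta_{n',n-2} + \left[(n+1)(n+2)\right]^{1/2}\delta_{n',n+2} + (2n+1)\delta_{n',n}\right\}.
\end{aligned} \tag{5.94}$$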
For applications, the most important of these matrix elements are those on the main diagonal:
$$\langle x^2\rangle = \langle n|\hat{x}^2|n\rangle = \frac{x_0^2}{2}(2n+1). \tag{5.95}$$
This expression shows, in particular, that the expectation value of the oscillator’s potential energy in the nth Fock state is
$$\langle U\rangle = \frac{m\omega_0^2}{2}\langle x^2\rangle = \frac{m\omega_0^2x_0^2}{2}\left(n + \frac{1}{2}\right) \equiv \frac{\hbar\omega_0}{2}\left(n + \frac{1}{2}\right). \tag{5.96}$$
This is exactly one-half of the total energy (86) of the oscillator. As a sanity check, an absolutely similar calculation for the momentum squared, and hence for the kinetic energy $p^2/2m$, yields
$$\langle p^2\rangle = \langle n|\hat{p}^2|n\rangle = \left(m\omega_0x_0\right)^2\left(n + \frac{1}{2}\right) = \hbar m\omega_0\left(n + \frac{1}{2}\right), \quad \text{so that } \left\langle\frac{p^2}{2m}\right\rangle = \frac{\hbar\omega_0}{2}\left(n + \frac{1}{2}\right), \tag{5.97}$$
i.e. both partial energies are equal to $E_n/2$, just as in a classical oscillator.33
Note that according to Eqs. (92) and (93), the expectation values of both x and p in any Fock state are equal to zero:
$$\langle x\rangle = \langle n|\hat{x}|n\rangle = 0, \qquad \langle p\rangle = \langle n|\hat{p}|n\rangle = 0. \tag{5.98}$$
This is why, according to the general Eqs. (1.33)-(1.34), the results (95) and (97) also give the variances of the coordinate and the momentum, i.e. the squares of their uncertainties, $(\delta x)^2$ and $(\delta p)^2$. In particular, for the ground state (n = 0), these uncertainties are
$$\delta x = \frac{x_0}{\sqrt{2}} \equiv \left(\frac{\hbar}{2m\omega_0}\right)^{1/2}, \qquad \delta p = \frac{m\omega_0x_0}{\sqrt{2}} \equiv \left(\frac{\hbar m\omega_0}{2}\right)^{1/2}. \tag{5.99}$$
In the theory of precise measurements (to be reviewed in brief in Chapter 10), these expressions are often called the standard quantum limit.
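Eqs. (95), (97), and (99) may be cross-checked with a few lines of matrix algebra. In the sketch below, the function name, the truncation rule, and the default units ħ = m = ω0 = 1 are assumptions made for illustration. In these units, both ⟨x²⟩ and ⟨p²⟩ in the nth Fock state should equal n + 1/2, so that the product of the two uncertainties is ħ(n + 1/2), reducing to ħ/2 in the ground state.

```python
import numpy as np

def fock_variances(n, dim=None, hbar=1.0, m=1.0, omega0=1.0):
    """Return (<x^2>, <p^2>) in the Fock state n, using truncated matrices of x and p."""
    dim = dim or n + 3                        # spare upper levels keep <n|x^2|n> exact
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    a_dag = a.conj().T
    x0 = np.sqrt(hbar / (m * omega0))
    x = x0 * (a + a_dag) / np.sqrt(2.0)                       # Eq. (66)
    p = m * omega0 * x0 * (a - a_dag) / (np.sqrt(2.0) * 1j)   # Eq. (66)
    x2 = (x @ x)[n, n].real                   # diagonal element, cf. Eq. (95)
    p2 = (p @ p)[n, n].real                   # diagonal element, cf. Eq. (97)
    return x2, p2

if __name__ == "__main__":
    for n in range(4):
        x2, p2 = fock_variances(n)
        print(n, x2, p2, np.sqrt(x2 * p2))    # last column: uncertainty product in units of hbar
```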
33 Still note that operators of the partial (potential and kinetic) energies do not commute with either each other or with the full-energy (Hamiltonian) operator, so that the Fock states n are not their eigenstates. This fact maps onto the well-known oscillations of these partial energies (with the frequency $2\omega_0$) in a classical oscillator, at the full energy staying constant.
5.5. Glauber states and squeezed states

There is a huge difference between a quantum stationary (Fock) state of the oscillator and its classical state. Indeed, let us write the well-known classical equations of motion of the oscillator (using capital letters to distinguish classical variables from the arguments of quantum wavefunctions):34
$$\dot{X} = \frac{P}{m}, \qquad \dot{P} = -\frac{\partial U}{\partial x} = -m\omega_0^2X. \tag{5.100}$$
On the so-called phase plane, with the Cartesian coordinates x and p, these equations describe a clockwise rotation of the representation point {X(t), P(t)} along an elliptic trajectory starting from the initial point {X(0), P(0)}. (The normalization of the momentum by $m\omega_0$, similar to the one performed by the second of Eqs. (63), makes this trajectory pleasingly circular, with a constant radius equal to the oscillations amplitude A, corresponding to the constant full energy
$$E = \frac{m\omega_0^2}{2}A^2, \quad \text{with } A^2 \equiv \left[X(t)\right]^2 + \left[\frac{P(t)}{m\omega_0}\right]^2 = \text{const} = \left[X(0)\right]^2 + \left[\frac{P(0)}{m\omega_0}\right]^2, \tag{5.101}$$
determined by the initial conditions – see Fig. 8.)
Fig. 5.8. Representations of various states of a harmonic oscillator on the phase plane (with the axes x and $p/m\omega_0$). The bold black point represents a classical state with the complex amplitude α, with the dashed line showing its trajectory. The (very imperfect) classical images of the Fock states with n = 0, 1, and 2 are shown in blue. The blurred red spot is the (equally schematic) image of the Glauber state α. Finally, the magenta elliptical spot is a classical image of a squeezed ground state – see below. Arrows show the direction of the states’ evolution in time.
34 If Eqs. (100) are not evident, please consult a classical mechanics course – e.g., CM Sec. 3.2 and/or Sec. 10.1.
For the forthcoming comparison with quantum states, it is convenient to describe this classical motion by the following dimensionless complex variable
$$\alpha(t) \equiv \frac{1}{\sqrt{2}\,x_0}\left[X(t) + i\frac{P(t)}{m\omega_0}\right], \tag{5.102}$$
which is essentially the standard complex-number representation of the representing point’s position on the 2D phase plane, with $|\alpha| = A/\sqrt{2}\,x_0$. With this definition, Eqs. (100) are conveniently merged into one equation,
$$\dot{\alpha} = -i\omega_0\alpha, \tag{5.103}$$
with an evident, very simple solution
$$\alpha(t) = \alpha(0)\exp\left\{-i\omega_0t\right\}, \tag{5.104}$$
where the constant α(0) may be complex, and is just the (normalized) classical complex amplitude of oscillations.35 This equation describes sinusoidal oscillations of both $X(t) \propto \text{Re}[\alpha(t)]$ and $P(t) \propto \text{Im}[\alpha(t)]$, with a phase shift of π/2 between them.
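The merging of Eqs. (100) into Eq. (103) may be verified numerically. The following sketch (with a hypothetical function name and arbitrary illustrative initial conditions) builds α(t) from the standard sinusoidal solution of the classical equations of motion, so that it can be compared with the rotating-phasor solution (104).

```python
import numpy as np

def classical_alpha(t, x_init, p_init, hbar=1.0, m=1.0, omega0=1.0):
    """alpha(t) of Eq. (102), built from the explicit solution of Eqs. (100)."""
    x0 = np.sqrt(hbar / (m * omega0))
    # Standard solution of the classical equations of motion for given X(0), P(0).
    x_t = x_init * np.cos(omega0 * t) + p_init / (m * omega0) * np.sin(omega0 * t)
    p_t = p_init * np.cos(omega0 * t) - m * omega0 * x_init * np.sin(omega0 * t)
    return (x_t + 1j * p_t / (m * omega0)) / (np.sqrt(2.0) * x0)

if __name__ == "__main__":
    t = np.linspace(0.0, 10.0, 101)
    alpha = classical_alpha(t, x_init=1.3, p_init=-0.4, omega0=2.0)
    # Largest deviation from the phasor solution alpha(0) exp(-i omega0 t), Eq. (104).
    print(np.max(np.abs(alpha - alpha[0] * np.exp(-2.0j * t))))
```

The modulus of α stays constant in time, which is just the circular phase-plane trajectory described by Eq. (101).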
On the other hand, according to the basic Eq. (4.161), the time dependence of a Fock state, as of a stationary state of the oscillator, is limited to the phase factor $\exp\{-iE_nt/\hbar\}$. This factor drops out at the averaging (4.125) for any observable. As a result, in this state the expectation values of x, p, or of any function thereof are time-independent. (Moreover, as Eqs. (98) show, $\langle x\rangle = \langle p\rangle = 0$.) Taking into account Eqs. (96)-(97), the closest (though very imperfect) geometric image36 of such a state on the phase plane is a static circle of the radius $A_n = x_0(2n+1)^{1/2}$, along which the wavefunction is uniformly spread – see the blue rings in Fig. 8. For the ground state (n = 0), with the wavefunction (2.275), a better image may be a blurred round spot, of a radius ~$x_0$, at the origin. (It is easy to criticize such blurring, intended to represent the non-vanishing spreads (99), because it fails to reflect the fact that the total energy of the oscillator in the state, $E_0 = \hbar\omega_0/2$, is defined exactly, without any uncertainty.)
So, the difference between a classical state of the oscillator and its Fock state n is very profound. However, the Fock states are not the only possible quantum states of the oscillator: according to the basic Eq. (4.6), any state described by the ket-vector
$$|\alpha\rangle = \sum_{n=0}^{\infty}\alpha_n|n\rangle \tag{5.105}$$
with an arbitrary set of (complex) c-numbers $\alpha_n$, is also its legitimate state, subject only to the normalization condition $\langle\alpha|\alpha\rangle = 1$, giving
$$\sum_{n=0}^{\infty}\left|\alpha_n\right|^2 = 1. \tag{5.106}$$
It is natural to ask: could we select the coefficients $\alpha_n$ in such a special way that the state properties would be closer to the classical one; in particular, that the expectation values $\langle x\rangle$ and $\langle p\rangle$ of the coordinate and momentum would evolve in time as the classical values X(t) and P(t), while the uncertainties of these observables would be, just as in the ground state, given by Eqs. (99), and hence have the smallest possible uncertainty product, $\delta x\,\delta p = \hbar/2$? Let me show that such a Glauber state,37 which is schematically represented in Fig. 8 by a blurred red spot around the classical point {X(t), P(t)}, is indeed possible.
35 See, e.g., CM Chapter 5, especially Eqs. (5.4).
36 I have to confess that such geometric mapping of a quantum state on the phase plane [x, p] is not exactly defined; you may think about colored areas in Fig. 8 as the regions of the observable pairs {x, p} most probably obtained in measurements. A quantitative definition of such a mapping will be given in Sec. 7.3 using the Wigner function, though, as we will see, even such imaging has certain internal contradictions. Still, such cartoons as Fig. 8 have a substantial heuristic value, provided that their limitations are kept in mind.
37 Named after Roy Jay Glauber who studied these states in detail in the mid-1960s, though they had been discussed in brief by Erwin Schrödinger as early as in 1926. Another popular adjective, “coherent”, for the Glauber states is very misleading, because all quantum states of all systems we have studied so far (including the Fock states of the harmonic oscillator) may be represented as coherent (pure) superpositions of the basis states. This is why I will not use this term for the Glauber states.
Conceptually the simplest way to find the corresponding coefficients $\alpha_n$ would be to calculate $\langle x\rangle$, $\langle p\rangle$, $\delta x$, and $\delta p$ for an arbitrary set of $\alpha_n$, and then try to optimize these coefficients to reach our goal. However, this problem may be solved much more easily using wave mechanics. Indeed, let us consider the following wavefunction:
Glauber state: coordinate representation
$$\Psi_\alpha(x, t) = \left(\frac{m\omega_0}{\pi\hbar}\right)^{1/4}\exp\left\{-\frac{m\omega_0}{2\hbar}\left[x - X(t)\right]^2 + i\frac{P(t)x}{\hbar}\right\}. \tag{5.107}$$
Its comparison with Eqs. (2.275) shows that this is just the ground-state wavefunction, but with the center shifted from the origin into the classical point {X(t), P(t)}. A straightforward (though a bit bulky) differentiation over x and t shows that it satisfies the oscillator’s Schrödinger equation, provided that the c-number functions X(t) and P(t) obey the classical equations (100). Moreover, a similar calculation shows that the wavefunction (107) also satisfies the Schrödinger equation of an oscillator under the effect of a pulse of a classical force F(t), provided that the oscillator initially was in its ground state, and that the classical evolution law {X(t), P(t)} in Eq. (107) takes this force into account.38 Since for many experimental implementations of the harmonic oscillator, the ground state may be readily formed (for example, by providing a weak coupling of the oscillator to a low-temperature environment), the Glauber state is usually easier to form than any Fock state with n > 0. This is why the Glauber states are so important and deserve much discussion.
In such a discussion, there is a substantial place for the bra-ket formalism. For example, to calculate the corresponding coefficients in the expansion (105) by wave-mechanical means,
$$\alpha_n(t) \equiv \langle n|\alpha(t)\rangle = \int dx\,\langle n|x\rangle\langle x|\alpha(t)\rangle = \int\psi_n^*(x)\Psi_\alpha(x, t)\,dx, \tag{5.108}$$
we would need to use not only the simple Eq. (107), but also the Fock state wavefunctions $\psi_n(x)$, which are not very appealing – see Eq. (2.284) again. Instead, this calculation may be readily done in the bra-ket formalism, giving us one important byproduct result as well.
Let us start by expressing the double shift of the ground state (by X and P), which has led us to Eq. (107), in the operator language. Forgetting about the P for a minute, let us find the translation operator $\hat{T}_X$ that would produce the desired shift of an arbitrary wavefunction Ψ(x) by a c-number distance X along the coordinate argument x. This means
$$\hat{T}_X\Psi(x) \equiv \Psi(x - X). \tag{5.109}$$
Representing the wavefunction Ψ as the standard wave packet (4.264), we see that
$$\hat{T}_X\Psi(x) = \frac{1}{(2\pi\hbar)^{1/2}}\int\varphi(p)\exp\left\{i\frac{p(x - X)}{\hbar}\right\}dp \equiv \frac{1}{(2\pi\hbar)^{1/2}}\int\varphi(p)\exp\left\{-i\frac{pX}{\hbar}\right\}\exp\left\{i\frac{px}{\hbar}\right\}dp. \tag{5.110}$$
38 For its description, it is sufficient to solve Eqs. (100), with F(t) added to the right-hand side of the second of these equations.
Hence, the shift may be achieved by the multiplication of each Fourier component of the packet, with the momentum p, by $\exp\{-ipX/\hbar\}$. This gives us a hint that the general form of the translation operator, valid in any representation, should be
$$\hat{T}_X = \exp\left\{-i\frac{\hat{p}X}{\hbar}\right\}. \tag{5.111}$$
The proof of this formula is provided merely by the fact that, as we know from Chapter 4, any operator is uniquely determined by the set of its matrix elements in any full and orthogonal basis, in particular the basis of momentum states p. According to Eq. (110), the analog of Eq. (4.235) for the p-representation, applied to the translation operator (which is evidently local), is
$$\int dp'\,\langle p|\hat{T}_X|p'\rangle\varphi(p') = \exp\left\{-i\frac{pX}{\hbar}\right\}\varphi(p), \tag{5.112}$$
so that the operator (111) does exactly the job we need it to.
The operator that provides the shift of momentum by a c-number P is absolutely similar – with the opposite sign under the exponent, due to the opposite sign of the exponent in the reciprocal Fourier transform, so that the simultaneous shift by both X and P may be achieved by the following translation operator:
Translation operator:
$$\hat{T}_\alpha = \exp\left\{i\frac{P\hat{x} - \hat{p}X}{\hbar}\right\}. \tag{5.113}$$
As we already know, for a harmonic oscillator the creation-annihilation operators are more natural, so that we may use Eqs. (66) to recast Eq. (113) as
$$\hat{T}_\alpha = \exp\left\{\alpha\hat{a}^\dagger - \alpha^*\hat{a}\right\}, \quad \text{so that } \hat{T}_\alpha^\dagger = \exp\left\{\alpha^*\hat{a} - \alpha\hat{a}^\dagger\right\}, \tag{5.114}$$
where α (which, generally, may be a function of time) is the c-number defined by Eq. (102). Now, according to Eq. (107), we may form the Glauber state’s ket-vector just as
$$|\alpha\rangle = \hat{T}_\alpha|0\rangle. \tag{5.115}$$
This formula, valid in any representation, is very elegant, but using it for practical calculations (say, of the expectation values of observables) is not too easy because of the exponent-of-operators form of the translation operator. Fortunately, it turns out that a much simpler representation for the Glauber state is possible. To show this, let us start with the following general (and very useful) property of exponential functions of an operator argument: if
$$\left[\hat{A}, \hat{B}\right] = \mu\hat{I}, \tag{5.116}$$
(where $\hat{A}$ and $\hat{B}$ are arbitrary linear operators, and μ is a c-number), then39
$$\exp\left\{+\hat{A}\right\}\hat{B}\exp\left\{-\hat{A}\right\} = \hat{B} + \mu\hat{I}. \tag{5.117}$$
39 A proof of Eq. (117) may be readily achieved by expanding the operator $\hat{f}(\lambda) \equiv \exp\{+\lambda\hat{A}\}\hat{B}\exp\{-\lambda\hat{A}\}$ in the Taylor series with respect to the c-number parameter λ, and then evaluating the result for λ = 1. This simple exercise is left for the reader.
Let us apply Eqs. (116)-(117) to two cases, both with
$$\hat{A} \equiv \alpha^*\hat{a} - \alpha\hat{a}^\dagger, \quad \text{so that } \exp\left\{+\hat{A}\right\} \equiv \hat{T}_\alpha^\dagger, \quad \exp\left\{-\hat{A}\right\} \equiv \hat{T}_\alpha. \tag{5.118}$$
First, let us take $\hat{B} \equiv \hat{I}$; then Eq. (116) is valid with μ = 0, and Eq. (117) yields
$$\hat{T}_\alpha^\dagger\hat{T}_\alpha = \hat{I}. \tag{5.119}$$
This equality means that the translation operator is unitary – not a big surprise, because if we shift a classical point on the phase plane by a complex number (+α) and then by (–α), we certainly must come back to the initial position. Eq. (119) means merely that this fact is true for any quantum state as well.
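Eqs. (114), (115), and (119) can be illustrated numerically on a truncated Fock basis. The sketch below is written under several assumptions: the function names are hypothetical, the basis is cut at 60 levels, and the operator exponent is computed through the eigendecomposition of the Hermitian matrix obtained by multiplying the (anti-Hermitian) exponent by –i. The resulting ket of Eq. (115) is compared with the closed-form Fock coefficients exp{–|α|²/2}αⁿ/(n!)^{1/2}, which are derived later in this section as Eq. (134).

```python
import numpy as np
from math import factorial

def displacement_operator(alpha, dim):
    """Truncated matrix of T_alpha = exp(alpha a^dagger - alpha* a), Eq. (114)."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)
    generator = alpha * a.conj().T - np.conj(alpha) * a      # anti-Hermitian matrix K
    # exp(K) with K = i H: diagonalize the Hermitian matrix H = -i K.
    eigvals, eigvecs = np.linalg.eigh(-1j * generator)
    return (eigvecs * np.exp(1j * eigvals)) @ eigvecs.conj().T

def glauber_coefficients(alpha, n_max):
    """Closed-form Fock coefficients exp(-|alpha|^2/2) alpha^n / (n!)^{1/2}."""
    norm = np.exp(-abs(alpha) ** 2 / 2.0)
    return np.array([norm * alpha**n / np.sqrt(factorial(n)) for n in range(n_max)])

if __name__ == "__main__":
    alpha, dim = 0.8 + 0.6j, 60
    T = displacement_operator(alpha, dim)
    ket_alpha = T[:, 0]                                      # T_alpha |0>, Eq. (115)
    print(np.max(np.abs(T.conj().T @ T - np.eye(dim))))      # unitarity check, Eq. (119)
    print(np.max(np.abs(ket_alpha[:20] - glauber_coefficients(alpha, 20))))
```

The truncation is harmless here only because |α|² is much smaller than the basis size; for larger amplitudes the basis would have to be extended accordingly.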
Second, let us take $\hat{B} \equiv \hat{a}$; in order to find the corresponding parameter μ, we must calculate the commutator on the left-hand side of Eq. (116) for this case. Using, at the due stage of the calculation, Eq. (68), we get
$$\left[\hat{A}, \hat{B}\right] = \left[\alpha^*\hat{a} - \alpha\hat{a}^\dagger, \hat{a}\right] = -\alpha\left[\hat{a}^\dagger, \hat{a}\right] = \alpha\hat{I}, \tag{5.120}$$
so that in this case μ = α, and Eq. (117) yields
$$\hat{T}_\alpha^\dagger\hat{a}\hat{T}_\alpha = \hat{a} + \alpha\hat{I}. \tag{5.121}$$
We have approached the summit of this beautiful calculation. Let us consider the following operator:
$$\hat{T}_\alpha\hat{T}_\alpha^\dagger\hat{a}\hat{T}_\alpha. \tag{5.122}$$
Using Eq. (119), we may reduce this product to $\hat{a}\hat{T}_\alpha$, while the application of Eq. (121) to the same expression (122) yields $\hat{T}_\alpha\hat{a} + \alpha\hat{T}_\alpha$. Hence, we get the following operator equality:
$$\hat{a}\hat{T}_\alpha = \hat{T}_\alpha\hat{a} + \alpha\hat{T}_\alpha, \tag{5.123}$$
which may be applied to any state. Now acting by both sides of this equality on the ground state’s ket $|0\rangle$, and using the fact that $\hat{a}|0\rangle$ is the null-state, while according to Eq. (115), $\hat{T}_\alpha|0\rangle \equiv |\alpha\rangle$, we finally get a very simple and elegant result:40
Glauber state as eigenstate:
$$\hat{a}|\alpha\rangle = \alpha|\alpha\rangle. \tag{5.124}$$
Thus any Glauber state α is one of the eigenstates of the annihilation operator, namely the one with the eigenvalue equal to the c-number parameter α of the state, i.e. to the complex representation (102) of the classical point which is the center of the Glauber state’s wavefunction.41 This fact makes the calculations of all Glauber state properties much simpler.
40 This result is also rather counter-intuitive. Indeed, according to Eq. (89), the annihilation operator $\hat{a}$, acting upon a Fock state n, “beats it down” to the lower-energy state (n – 1). However, according to Eq. (124), the action of the same operator on a Glauber state α does not lead to the state change and hence to any energy change! The resolution of this paradox is given by the representation of the Glauber state as a series of Fock states – see Eq. (134) below. The operator $\hat{a}$ indeed transfers each Fock component of this series to a lower-energy state, but it also re-weighs each term, so that the complete energy of the Glauber state remains constant.
41 This fact means that the spectrum of eigenvalues α in Eq. (124), viewed as an eigenproblem, is continuous – it may be any complex number.
As an example, let us calculate $\langle x\rangle$ in the Glauber state with some c-number α:
$$\langle x\rangle = \langle\alpha|\hat{x}|\alpha\rangle = \frac{x_0}{\sqrt{2}}\langle\alpha|\left(\hat{a} + \hat{a}^\dagger\right)|\alpha\rangle \equiv \frac{x_0}{\sqrt{2}}\left(\langle\alpha|\hat{a}|\alpha\rangle + \langle\alpha|\hat{a}^\dagger|\alpha\rangle\right). \tag{5.125}$$
In the first term in the parentheses, we can apply Eq. (124) directly, while in the second term, we can use the bra-counterpart of that relation, $\langle\alpha|\hat{a}^\dagger = \langle\alpha|\alpha^*$. Now assuming that the Glauber state is normalized, $\langle\alpha|\alpha\rangle = 1$, and using Eq. (102), we get
$$\langle x\rangle = \frac{x_0}{\sqrt{2}}\left(\langle\alpha|\alpha|\alpha\rangle + \langle\alpha|\alpha^*|\alpha\rangle\right) = \frac{x_0}{\sqrt{2}}\left(\alpha + \alpha^*\right) = X. \tag{5.126}$$
Acting absolutely similarly, we may verify that $\langle p\rangle = P$, and that $\delta x$ and $\delta p$ do indeed obey Eqs. (99).
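The verification mentioned in the last sentence may also be done numerically. The sketch below assumes the Fock-state expansion of the Glauber state derived later in this section (Eq. (134)), a truncation at 60 levels, and the illustrative units ħ = m = ω0 = 1; the function name is hypothetical. In these units, Eq. (102) gives X = √2 Re α and P = √2 Im α, while Eqs. (99) give δx = δp = 1/√2.

```python
import numpy as np
from math import factorial

def glauber_moments(alpha, dim=60, hbar=1.0, m=1.0, omega0=1.0):
    """<x>, <p>, delta x, delta p in the Glauber state, on a truncated Fock basis."""
    n = np.arange(dim)
    fact = np.array([float(factorial(int(k))) for k in n])
    # Fock coefficients of the Glauber state, taken from Eq. (134) of the text.
    coeffs = np.exp(-abs(alpha) ** 2 / 2.0) * complex(alpha) ** n / np.sqrt(fact)
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    x0 = np.sqrt(hbar / (m * omega0))
    x = x0 * (a + a.T) / np.sqrt(2.0)                       # Eq. (66)
    p = m * omega0 * x0 * (a - a.T) / (np.sqrt(2.0) * 1j)   # Eq. (66)

    def mean(op):
        return np.vdot(coeffs, op @ coeffs).real            # <alpha| op |alpha>

    x_mean, p_mean = mean(x), mean(p)
    dx = np.sqrt(mean(x @ x) - x_mean**2)
    dp = np.sqrt(mean(p @ p) - p_mean**2)
    return x_mean, p_mean, dx, dp

if __name__ == "__main__":
    print(glauber_moments(0.8 + 0.6j))
```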
As the last sanity check, let us use Eq. (124) to re-calculate the Glauber state’s wavefunction (107). Inner-multiplying both sides of that relation by the bra-vector $\langle x|$, and using the definition (65a) of the annihilation operator, we get
$$\frac{1}{\sqrt{2}\,x_0}\langle x|\left(\hat{x} + i\frac{\hat{p}}{m\omega_0}\right)|\alpha\rangle = \alpha\langle x|\alpha\rangle. \tag{5.127}$$
Since $\langle x|$ is the bra-vector of the eigenstate of the Hermitian operator $\hat{x}$, they may be swapped, with the operator giving its eigenvalue x; acting on that bra-vector by the (local!) operator of momentum, we have to use it in the coordinate representation – see Eq. (4.245). As a result, we get
$$\frac{1}{\sqrt{2}\,x_0}\left(x\langle x|\alpha\rangle + \frac{\hbar}{m\omega_0}\frac{\partial}{\partial x}\langle x|\alpha\rangle\right) = \alpha\langle x|\alpha\rangle. \tag{5.128}$$
But $\langle x|\alpha\rangle$ is nothing else than the Glauber state’s wavefunction $\Psi_\alpha$, so that Eq. (128) gives for it a first-order differential equation
$$\frac{1}{\sqrt{2}\,x_0}\left(x\Psi_\alpha + \frac{\hbar}{m\omega_0}\frac{\partial\Psi_\alpha}{\partial x}\right) = \alpha\Psi_\alpha. \tag{5.129}$$
Chasing $\Psi_\alpha$ and x to the opposite sides of the equation, and using the definition (102) of the parameter α, we can bring this equation to the form (valid at fixed t, and hence fixed X and P):
$$\frac{d\Psi_\alpha}{\Psi_\alpha} = \frac{m\omega_0}{\hbar}\left(-x + X + i\frac{P}{m\omega_0}\right)dx. \tag{5.130}$$
Integrating both parts, we return to Eq. (107).
Now we can use Eq. (124) for finding the coefficients αₙ in the expansion (105) of the Glauber
state α in the series over the Fock states n. Plugging Eq. (105) into both sides of Eq. (124), using the
second of Eqs. (89) on the left-hand side, and requiring the coefficients at each ket-vector |n⟩ in both
parts of the resulting relation to be equal, we get the following recurrence relation:
$$
\alpha_{n+1} = \frac{\alpha}{(n+1)^{1/2}}\,\alpha_n. \tag{5.131}
$$
Applying this relation sequentially for n = 0, 1, 2, etc., we get
$$
\alpha_n = \frac{\alpha^n}{(n!)^{1/2}}\,\alpha_0. \tag{5.132}
$$
Now we can find α₀ from the normalization requirement (106), getting
$$
|\alpha_0|^2 \sum_{n=0}^{\infty} \frac{|\alpha|^{2n}}{n!} = 1. \tag{5.133}
$$
In this sum, we may readily recognize the Taylor expansion of the function exp{|α|²}, so that the final
result (besides an arbitrary common phase multiplier) is
$$
% Margin note: Glauber state vs Fock states
|\alpha\rangle = \exp\left\{-\frac{|\alpha|^2}{2}\right\} \sum_{n=0}^{\infty} \frac{\alpha^n}{(n!)^{1/2}}\,|n\rangle. \tag{5.134}
$$
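The relation between the expansion (134) and the eigenvalue property (124) can be checked numerically. The following is only a minimal sketch, not part of the original derivation: it assumes NumPy, a Fock basis truncated at a hypothetical size `N`, an arbitrary illustrative value of `alpha`, and coordinates measured in units of x₀.

```python
import math
import numpy as np

N = 60                 # Fock-basis truncation (an assumption of this sketch)
alpha = 1.2 + 0.7j     # arbitrary illustrative value of the parameter alpha

# Annihilation operator in the truncated Fock basis: a|n> = n^{1/2}|n-1>
a = np.diag(np.sqrt(np.arange(1, N)), k=1).astype(complex)

# Coefficients alpha_n of the Glauber state, Eq. (134)
coeffs = np.array([
    math.exp(-abs(alpha) ** 2 / 2) * alpha ** n / math.sqrt(math.factorial(n))
    for n in range(N)
])

norm = np.vdot(coeffs, coeffs).real                      # <alpha|alpha>
residual = np.linalg.norm(a @ coeffs - alpha * coeffs)   # norm of (a - alpha)|alpha>

# <x>/x0 from Eq. (125), to be compared with X/x0 = 2^{1/2} Re(alpha) from Eq. (102)
x_over_x0 = (a + a.conj().T) / math.sqrt(2)
x_mean = np.vdot(coeffs, x_over_x0 @ coeffs).real

print(norm, residual, x_mean, math.sqrt(2) * alpha.real)
```

With these illustrative numbers the truncation error is far below the rounding error, so the norm should come out as 1, the residual as essentially zero, and the two last printed values should coincide.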
Hence, if the oscillator is in the Glauber state α, the probabilities Wₙ ≡ αₙαₙ* of finding the
system on the nth energy level (86) obey the well-known Poisson distribution (Fig. 9):
$$
% Margin note: Poisson distribution
W_n = \frac{\langle n \rangle^n}{n!}\,e^{-\langle n \rangle}, \tag{5.135}
$$
where ⟨n⟩ is the statistical average of n – see Eq. (1.37):
$$
\langle n \rangle \equiv \sum_{n=0}^{\infty} n\,W_n. \tag{5.136}
$$
The result of such summation is not necessarily integer! In our particular case, Eqs. (134)-(136) yield
$$
\langle n \rangle = |\alpha|^2. \tag{5.137}
$$
Fig. 5.9. The Poisson distribution (135) for several values of ⟨n⟩ (curves for ⟨n⟩ = 0.3, 1.0, 3.0, and 10;
horizontal axis: n, vertical axis: Wₙ). Note that Wₙ are defined only for integer values of n; the
lines are only guides for the eye.
For applications, perhaps the most important mathematical property of this distribution is
$$
% Margin note: Glauber state: r.m.s. uncertainty
\langle \tilde{n}^2 \rangle \equiv \left\langle \left(n - \langle n \rangle\right)^2 \right\rangle = \langle n \rangle, \quad \text{so that} \quad \delta n \equiv \langle \tilde{n}^2 \rangle^{1/2} = \langle n \rangle^{1/2}. \tag{5.138}
$$
Another important property is that at ⟨n⟩ >> 1, the Poisson distribution approaches the Gaussian
(“normal”) one, with a small relative r.m.s. uncertainty: δn/⟨n⟩ << 1 – the trend clearly visible in Fig. 9.
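The properties (135)-(138) are easy to verify by direct summation. The sketch below uses only the Python standard library; the function name `poisson_weights`, the cut-off `n_max`, and the value ⟨n⟩ = 3 are illustrative assumptions, not anything prescribed by the text.

```python
import math

def poisson_weights(n_mean, n_max):
    """Probabilities W_n of Eq. (135) for n = 0, 1, ..., n_max."""
    return [n_mean ** n * math.exp(-n_mean) / math.factorial(n)
            for n in range(n_max + 1)]

n_mean = 3.0                       # illustrative value of <n> = |alpha|^2
W = poisson_weights(n_mean, 60)    # the tail beyond n = 60 is negligible here

total = sum(W)                                             # normalization
mean = sum(n * w for n, w in enumerate(W))                 # Eq. (136)
variance = sum((n - mean) ** 2 * w for n, w in enumerate(W))  # Eq. (138)

print(total, mean, variance, math.sqrt(variance) / mean)
```

The last printed number is the relative r.m.s. uncertainty δn/⟨n⟩ = ⟨n⟩^(–1/2), which decreases as ⟨n⟩ grows.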
Now let us discuss the Glauber state’s evolution in time. In the wave-mechanics language, it is
completely described by the dynamics (100) of the c-number shifts X(t) and P(t) participating in the
wavefunction (107). Note again that, in contrast to the spread of the wave packet of a free particle,
discussed in Sec. 2.2, in the harmonic oscillator the Gaussian packet of the special width (99) does not
spread at all!
An alternative and equivalent way of dynamics description is to use the Heisenberg equation of
motion. As Eqs. (29) and (35) tell us, such equations for the Heisenberg operators of coordinate and
momentum have to be similar to the classical equations (100):
$$
\dot{\hat{x}}_H = \frac{\hat{p}_H}{m}, \qquad \dot{\hat{p}}_H = -m\omega_0^2\,\hat{x}_H. \tag{5.139}
$$
Now using Eqs. (66), for the Heisenberg-picture creation and annihilation operators we get the equations
$$
\dot{\hat{a}}_H = -i\omega_0\,\hat{a}_H, \qquad \dot{\hat{a}}_H^\dagger = +i\omega_0\,\hat{a}_H^\dagger, \tag{5.140}
$$
which are completely similar to the classical equation (103) for the c-number parameter α and its
complex conjugate, and hence have the solutions identical to Eq. (104):
$$
\hat{a}_H(t) = \hat{a}_H(0)\,e^{-i\omega_0 t}, \qquad \hat{a}_H^\dagger(t) = \hat{a}_H^\dagger(0)\,e^{+i\omega_0 t}. \tag{5.141}
$$
As was discussed in Sec. 4.6, such equations are very convenient, because they enable simple
calculation of time evolution of observables for any initial state of the oscillator (Fock, Glauber, or any
other) using Eq. (4.191). In particular, Eq. (141) shows that regardless of the initial state, the oscillator
always returns to it exactly with the period 2π/ω₀.42 Applied to the Glauber state with α = 0, i.e. the
ground state of the oscillator, such calculation confirms that the Gaussian wave packet of the special
width (99) does not spread in time at all – even temporarily.
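As an illustration of how Eqs. (141) work (a worked example added here, under the assumption that the oscillator is initially in a Glauber state with the parameter α ≡ α(0)), one may average the Heisenberg-picture coordinate operator over that state, using Eq. (124) and its bra-counterpart:

```latex
\langle x \rangle (t)
  = \frac{x_0}{\sqrt{2}} \langle\alpha| \left( \hat{a}_H(t) + \hat{a}_H^\dagger(t) \right) |\alpha\rangle
  = \frac{x_0}{\sqrt{2}} \left( \alpha\, e^{-i\omega_0 t} + \alpha^{*} e^{+i\omega_0 t} \right)
  = X(0)\cos\omega_0 t + \frac{P(0)}{m\omega_0}\sin\omega_0 t ,
```

where the last step uses Eq. (102) in the form α = [X(0) + iP(0)/mω₀]/√2x₀. The result is the classical oscillation law, with the period 2π/ω₀.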
Now let me briefly mention the states whose initial wave packets are still Gaussian, but have
different widths, say δx < x₀/√2. As we already know from Sec. 2.2, the momentum spread δp will be
correspondingly larger, still with the smallest possible uncertainty product: δxδp = ħ/2. Such squeezed
ground state ζ, with zero expectation values of x and p, may be generated from the Fock/Glauber ground
state:
$$
% Margin note: Squeezed ground state
|\zeta\rangle \equiv \hat{S}_\zeta\,|0\rangle, \tag{5.142a}
$$
using the so-called squeezing operator,
$$
% Margin note: Squeezing operator
\hat{S}_\zeta \equiv \exp\left\{\frac{1}{2}\left(\zeta^{*}\hat{a}\hat{a} - \zeta\,\hat{a}^\dagger\hat{a}^\dagger\right)\right\}, \tag{5.142b}
$$
which depends on a complex c-number parameter ζ = re^{iθ}, where r and θ are real. The parameter’s
modulus r determines the squeezing degree; if ζ is real (i.e. θ = 0), then
$$
\delta x = \frac{x_0}{\sqrt{2}}\,e^{-r}, \qquad \delta p = \frac{m\omega_0 x_0}{\sqrt{2}}\,e^{r}, \qquad \text{so that} \quad \delta x\,\delta p = \frac{m\omega_0 x_0^2}{2} = \frac{\hbar}{2}. \tag{5.143}
$$
42 Actually, this fact is also evident from the Schrödinger picture of the oscillator’s time evolution: due to the
exactly equal distances ħω₀ between the eigenenergies (86), the time functions aₙ(t) in the fundamental expansion
(1.69) of its wavefunction oscillate with frequencies nω₀, and hence they all share the same time period 2π/ω₀.
On the phase plane (Fig. 8), this state, with r > 0, may be represented by an oval spot squeezed along
one of two mutually perpendicular axes (hence the state’s name), and stretched by the same factor e^r
along the counterpart axis; the same formulas but with r < 0 describe squeezing along the other axis. On
the other hand, the phase θ of the squeezing parameter ζ determines the angle θ/2 of the
squeezing/stretching axes about the phase plane origin – see the magenta ellipse in Fig. 8. If θ ≠ 0, Eqs.
(143) are valid for the variables {x’, p’} obtained from {x, p} via clockwise rotation by that angle. For
any of such origin-centered squeezed ground states, the time evolution is reduced to an increase of the
angle with the rate ω₀, i.e. to the clockwise rotation of the ellipse, without its deformation, with the
angular velocity ω₀ – see the magenta arrows in Fig. 8. As a result, the uncertainties δx and δp oscillate
in time with the double frequency 2ω₀. Such squeezed ground states may be formed, for example, by a
parametric excitation of the oscillator,43 with a parameter modulation depth close to, but still below the
threshold of the excitation of degenerate parametric oscillations.
By action of an additional external force, the center of a squeezed state may be displaced from
the origin to an arbitrary point {X, P}. Such displaced squeezed state may be described by the action of
the translation operator (113) upon the ground squeezed state, i.e. by the action of the operator product
T̂_α Ŝ_ζ on the usual (Fock / Glauber, i.e. non-squeezed) ground state. Calculations similar to those that
led us from Eq. (114) to Eq. (124), show that such displaced squeezed state is an eigenstate of the
following mixed operator:
$$
\hat{b} \equiv \hat{a}\cosh r + \hat{a}^\dagger e^{i\theta}\sinh r, \tag{5.144}
$$
with the same parameters r and θ, with the eigenvalue
$$
\beta = \alpha\cosh r + \alpha^{*} e^{i\theta}\sinh r, \tag{5.145}
$$
thus generalizing Eq. (124), which corresponds to r = 0. For the particular case α = 0, Eq. (145) yields β
= 0, i.e. the action of the operator (144) on the squeezed ground state ζ yields the null-state. Just as Eq.
(124) in the case of the Glauber states, Eqs. (144)-(145) make the calculation of the basic properties of
the squeezed states (for example, the proof of Eqs. (143) for the case α = θ = 0) very straightforward.
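Equations (142)-(145) can also be checked numerically for the squeezed ground state (α = 0). The sketch below is an illustration only: it assumes NumPy, a truncated Fock basis of hypothetical size `N`, real ζ (θ = 0), coordinates in units of x₀ and momenta in units of mω₀x₀. The operator exponent is calculated via an eigen-decomposition, which is a choice of this sketch rather than anything prescribed by the text.

```python
import numpy as np

N = 80                     # Fock-basis truncation (an assumption of this sketch)
r, theta = 0.5, 0.0        # illustrative squeezing parameters, zeta = r exp(i theta)
zeta = r * np.exp(1j * theta)

a = np.diag(np.sqrt(np.arange(1, N)), k=1).astype(complex)   # annihilation operator
ad = a.conj().T                                              # creation operator

# Exponent of Eq. (142b); it is anti-Hermitian, so that the operator S is unitary
A = 0.5 * (np.conj(zeta) * (a @ a) - zeta * (ad @ ad))

# exp(A) via the eigen-decomposition of the Hermitian matrix H = iA (so A = -iH)
w, V = np.linalg.eigh(1j * A)
S = V @ np.diag(np.exp(-1j * w)) @ V.conj().T

vacuum = np.zeros(N, dtype=complex)
vacuum[0] = 1.0
psi = S @ vacuum                                  # squeezed ground state, Eq. (142a)

x = (a + ad) / np.sqrt(2)                         # x / x0
p = (a - ad) / (1j * np.sqrt(2))                  # p / (m w0 x0)

def rms(op, state):
    """R.m.s. uncertainty of the observable `op` in the normalized `state`."""
    mean = np.vdot(state, op @ state).real
    mean_sq = np.vdot(state, op @ (op @ state)).real
    return np.sqrt(mean_sq - mean ** 2)

dx, dp = rms(x, psi), rms(p, psi)

# Operator (144); by Eq. (145) with alpha = 0, it should annihilate the state
b = a * np.cosh(r) + ad * np.exp(1j * theta) * np.sinh(r)
residual = np.linalg.norm(b @ psi)

print(dx, np.exp(-r) / np.sqrt(2), dp, np.exp(r) / np.sqrt(2), dx * dp, residual)
```

In these units ħ = mω₀x₀² corresponds to 1, so the product `dx * dp` should be close to 1/2, in agreement with Eq. (143).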
Unfortunately, I do not have more time/space for a further discussion of the squeezed states in
this section, but their importance for precise quantum measurements will be discussed in Sec. 10.2
below.44
43 For a discussion and classical theory of this effect, see, e.g., CM Sec. 5.5.
44 For more on the squeezed states see, e.g., Chapter 7 in the monograph by C. Gerry and P. Knight, Introductory
Quantum Optics, Cambridge U. Press, 2005. Also, note the spectacular measurements of the Glauber and
squeezed states of electromagnetic (optical) oscillators by G. Breitenbach et al., Nature 387, 471 (1997), a large
(ten-fold) squeezing achieved in such oscillators by H. Vahlbruch et al., Phys. Rev. Lett. 100, 033602 (2008), and
the first results on the ground state squeezing in micromechanical oscillators, with resonance frequencies ω₀/2π as
low as a few MHz, using their parametric coupling to microwave electromagnetic oscillators – see, e.g., E.
Wollman et al., Science 349, 952 (2015) and/or J.-M. Pirkkalainen et al., Phys. Rev. Lett. 115, 243601 (2015).
5.6. Revisiting spherically-symmetric systems

One more blank spot to fill has been left by our study, in Sec. 3.6, of wave mechanics of particle
motion in spherically-symmetric 3D potentials. Indeed, while the azimuthal components of the
eigenfunctions (the spherical harmonics) of such systems are very simple,
$$
\psi_m = (2\pi)^{-1/2}\,e^{im\varphi}, \quad \text{with } m = 0, \pm 1, \pm 2, \ldots, \tag{5.146}
$$
their polar components include the associated Legendre functions P_l^m(cos θ), which may be expressed
via elementary functions only indirectly – see Eqs. (3.165) and (3.168). This makes all the calculations
less than transparent and, in particular, does not allow a clear insight into the origin of the very simple
energy spectrum of such systems – see, e.g., Eq. (3.163). The bra-ket formalism, applied to the angular
momentum operator, not only enables such insight and produces a very convenient tool for many
calculations involving spherically-symmetric potentials, but also opens a clear way toward the
unification of the orbital momentum with the particle’s spin – the latter task to be addressed in the next
section.
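Let us start by using the correspondence principle to spell out the quantum-mechanical vector
operator of the orbital angular momentum L ≡ r × p of a point particle:
$$
% Margin note: Angular momentum operator
\hat{\mathbf{L}} \equiv \hat{\mathbf{r}} \times \hat{\mathbf{p}} =
\begin{vmatrix}
\mathbf{n}_x & \mathbf{n}_y & \mathbf{n}_z \\
\hat{r}_1 & \hat{r}_2 & \hat{r}_3 \\
\hat{p}_1 & \hat{p}_2 & \hat{p}_3
\end{vmatrix},
\quad \text{i.e. } \hat{L}_j \equiv \sum_{j'=1}^{3} \hat{r}_{j'}\,\hat{p}_{j''}\,\varepsilon_{jj'j''}, \tag{5.147}
$$
where each of the indices j, j’, and j” may take values 1, 2, and 3 (with j” ≠ j, j’), and ε_jj’j” is the Levi-Civita
permutation symbol, which we have already used in Sec. 4.5, and also in Sec. 1 of this chapter, in
similar expressions (17)-(18). From this definition, we can readily calculate the commutation relations
for all Cartesian components of operators L̂, r̂, and p̂; for example,
$$
\left[\hat{L}_j, \hat{r}_{j'}\right]
= \left[\sum_{k=1}^{3} \hat{r}_k \hat{p}_{j''}\varepsilon_{jkj''},\ \hat{r}_{j'}\right]
= -\sum_{k=1}^{3} \hat{r}_k \left[\hat{r}_{j'}, \hat{p}_{j''}\right]\varepsilon_{jkj''}
= -i\hbar \sum_{k=1}^{3} \hat{r}_k\,\delta_{j'j''}\,\varepsilon_{jkj''}
= i\hbar \sum_{k=1}^{3} \hat{r}_k\,\varepsilon_{jj'k}
= i\hbar\,\hat{r}_{j''}\,\varepsilon_{jj'j''}. \tag{5.148}
$$
The summary of all these calculations may be represented in similar compact forms:
$$
% Margin note: Key commutation relations
\left[\hat{L}_j, \hat{r}_{j'}\right] = i\hbar\,\hat{r}_{j''}\,\varepsilon_{jj'j''}, \qquad
\left[\hat{L}_j, \hat{p}_{j'}\right] = i\hbar\,\hat{p}_{j''}\,\varepsilon_{jj'j''}, \qquad
\left[\hat{L}_j, \hat{L}_{j'}\right] = i\hbar\,\hat{L}_{j''}\,\varepsilon_{jj'j''}; \tag{5.149}
$$
the last of them shows that the commutator of two different Cartesian components of the vector-operator
L̂ is proportional to its complementary component.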
Also introducing, in a natural way, the (scalar!) operator of the observable L² ≡ |L|²,
$$
% Margin note: Operator of L^2
\hat{L}^2 \equiv \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2 = \sum_{j=1}^{3} \hat{L}_j^2, \tag{5.150}
$$
it is straightforward to check that this operator commutes with each of the Cartesian components:
$$
\left[\hat{L}^2, \hat{L}_j\right] = 0. \tag{5.151}
$$
This result, at first sight, may seem to contradict the last of Eqs. (149). Indeed, haven’t we learned in
Sec. 4.5 that commuting operators (e.g., L̂² and any of L̂_j) share their eigenstate sets? If yes, shouldn’t
this set be common for all four angular momentum operators? The resolution of this paradox may
be found in the condition that was mentioned just after Eq. (4.138), but (sorry!) was not sufficiently
emphasized there. According to that relation, if an operator has degenerate eigenstates (i.e. if some A_j =
A_j’ even for j ≠ j’), they need not all be shared by another compatible operator.
This is exactly the situation with the orbital angular momentum operators, which may be
schematically shown on a Venn diagram (Fig. 10):45 the eigenstates of the operator L̂² are highly
degenerate,46 and their set is broader than those of any component operator L̂_j (that, as will be shown
below, are non-degenerate – until we consider particle’s spin).
Fig. 5.10. The Venn diagram showing the partitioning of the set of eigenstates of the operator L̂². Each
inner sector (labeled L̂_x, L̂_y, and L̂_z) corresponds to the states shared with one of the Cartesian
component operators L̂_j, while the outer (shaded) ring represents the eigenstates of L̂² that are not
shared with any of L̂_j – for example, all linear combinations of the eigenstates of different component
operators.
Let us focus on just one of these three joint sets of eigenstates – by tradition, of the operators L̂²
and L̂_z. (This tradition stems from the canonical form of the spherical coordinates, in which the polar
angle is measured from the z-axis. Indeed, in the coordinate representation we may write
$$
\hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x = x\left(-i\hbar\frac{\partial}{\partial y}\right) - y\left(-i\hbar\frac{\partial}{\partial x}\right) = -i\hbar\frac{\partial}{\partial\varphi}. \tag{5.152}
$$
Writing the standard eigenproblem for the operator in this representation, L̂_z ψ_m = L_z ψ_m, we see that it
is satisfied by the eigenfunctions (146), with eigenvalues L_z = ħm – which was already conjectured in
Sec. 3.5.) More specifically, let us consider a set of eigenstates {l, m} corresponding to a certain
degenerate eigenvalue of the operator L̂², and all possible eigenvalues of the operator L̂_z, i.e. all
possible quantum numbers m. (At this point, l is just some label of the eigenvalue of the operator L̂²; it
will be defined more explicitly in a minute.) To analyze this set, it is instrumental to introduce the
so-called ladder (also called, respectively, “raising” and “lowering”) operators47
$$
% Margin note: Ladder operators
\hat{L}_\pm \equiv \hat{L}_x \pm i\hat{L}_y. \tag{5.153}
$$
45 This is just a particular example of the Venn diagrams (introduced in the 1880s by John Venn) that show
possible relations (such as intersections, unions, complements, etc.) between various sets of objects, and are a very
useful tool in the general set theory.
46 Note that this particular result is consistent with the classical picture of the angular momentum vector: even
when its length is fixed, the vector may be oriented in various directions, corresponding to different values of its
Cartesian components. However, in the classical picture, all these components may be fixed simultaneously, while
in the quantum picture this is not true.
47 Note a substantial similarity between this definition and Eqs. (65) for the creation/annihilation operators.
It is simple (and hence left for the reader’s exercise) to use this definition and the last of Eqs. (149) to
calculate the following commutators:
$$
% Margin note: Important commutation relations
\left[\hat{L}_+, \hat{L}_-\right] = 2\hbar\hat{L}_z, \quad \text{and} \quad \left[\hat{L}_z, \hat{L}_\pm\right] = \pm\hbar\hat{L}_\pm, \tag{5.154}
$$
and also to use Eqs. (149)-(150) to prove two other important operator relations:
$$
\hat{L}^2 = \hat{L}_z^2 + \hat{L}_+\hat{L}_- - \hbar\hat{L}_z, \qquad \hat{L}^2 = \hat{L}_z^2 + \hat{L}_-\hat{L}_+ + \hbar\hat{L}_z. \tag{5.155}
$$
Now let us rewrite the last of Eqs. (154) as
$$
\hat{L}_z\hat{L}_\pm = \hat{L}_\pm\hat{L}_z \pm \hbar\hat{L}_\pm, \tag{5.156}
$$
and act by its both sides upon the ket-vector |l, m⟩ of an arbitrary common eigenstate:
$$
\hat{L}_z\hat{L}_\pm|l, m\rangle = \hat{L}_\pm\hat{L}_z|l, m\rangle \pm \hbar\hat{L}_\pm|l, m\rangle. \tag{5.157}
$$
Since the eigenvalues of the operator L̂_z are equal to ħm, in the first term of the right-hand side of Eq.
(157) we may write
$$
\hat{L}_z|l, m\rangle = \hbar m\,|l, m\rangle. \tag{5.158}
$$
With that, Eq. (157) may be recast as
$$
\hat{L}_z\left(\hat{L}_\pm|l, m\rangle\right) = \hbar(m \pm 1)\left(\hat{L}_\pm|l, m\rangle\right). \tag{5.159}
$$
In a spectacular similarity with Eqs. (78)-(79) for the harmonic oscillator, Eq. (159) means that
the states L̂±|l, m⟩ are also eigenstates of the operator L̂_z, corresponding to eigenvalues ħ(m ± 1). Thus
the ladder operators act exactly as the creation and annihilation operators of a harmonic oscillator,
moving the system up or down a ladder of eigenstates – see Fig. 11.
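Fig. 5.11. The ladder diagram of the common eigenstates of the operators L̂² and L̂_z. From top to bottom,
the rungs are the eigenkets |l, l⟩, …, L̂₊|l, m⟩, |l, m⟩, L̂₋|l, m⟩, …, |l, –l⟩, with the L̂_z eigenvalues
ħl, …, ħ(m + 1), ħm, ħ(m – 1), …, –ħl; the operators L̂₊ and L̂₋ move the system one rung up and one
rung down, respectively.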
The most significant difference is that now the state ladder must end in both directions, because
an infinite increase of |m|, with whichever sign of m, would cause the expectation values of the operator
$$
\hat{L}_x^2 + \hat{L}_y^2 \equiv \hat{L}^2 - \hat{L}_z^2, \tag{5.160}
$$
which corresponds to a non-negative observable, to become negative. Hence there have to be two states
at both ends of the ladder, with such ket-vectors |l, m_max⟩ and |l, m_min⟩ that
$$
\hat{L}_+|l, m_{\max}\rangle = 0, \qquad \hat{L}_-|l, m_{\min}\rangle = 0. \tag{5.161}
$$
Due to the symmetry of the whole problem with respect to the replacement m → –m, we should have
m_min = – m_max. This m_max is exactly the quantum number traditionally called l, so that
$$
% Margin note: Relation between m and l
-l \le m \le +l. \tag{5.162}
$$
Evidently, this relation of quantum numbers m and l is semi-quantitatively compatible with the
classical image of the angular momentum vector L, of the same length L, pointing in various directions,
thus affecting the value of its component L_z. In this classical picture, L² would be equal to the
square of (L_z)max, i.e. to (ħl)²; in quantum mechanics, however, this is not so. Indeed, applying both parts
of the second of the operator equalities (155) to the top state’s vector |l, m_max⟩ ≡ |l, l⟩, we get
$$
% Margin note: Eigenvalues of L^2
\hat{L}^2|l, l\rangle = \hbar\hat{L}_z|l, l\rangle + \hat{L}_z^2|l, l\rangle + \hat{L}_-\hat{L}_+|l, l\rangle
= \hbar^2 l\,|l, l\rangle + \hbar^2 l^2\,|l, l\rangle + 0
= \hbar^2 l(l+1)\,|l, l\rangle. \tag{5.163}
$$
Since by our initial assumption, all eigenvectors |l, m⟩ correspond to the same eigenvalue of L̂², this
result means that all these eigenvalues are equal to ħ²l(l + 1). Just as in the case of the spin-½ vector
operators discussed in Sec. 4.5, the deviation of this result from ħ²l² may be interpreted as the result of
unavoidable uncertainties (“fluctuations”) of the x- and y-components of the angular momentum, which
give non-zero positive contributions to ⟨L_x²⟩ and ⟨L_y²⟩, and hence to ⟨L²⟩, even if the angular momentum
vector is aligned with the z-axis in the best possible way.
(For various applications of the ladder operators (153), one more relation is convenient:
$$
\hat{L}_\pm|l, m\rangle = \hbar\left[l(l+1) - m(m \pm 1)\right]^{1/2}|l, m \pm 1\rangle. \tag{5.164}
$$
This equality, valid to the multiplier e^{iφ} with an arbitrary real phase φ, may be readily proved from the
above relations in the same way as the parallel Eqs. (89) for the harmonic-oscillator operators (65) were
proved in Sec. 4; due to this similarity, the proof is also left for the reader’s exercise.48)
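The ladder relation (164) fixes the matrices of all angular momentum components in the basis of the kets |l, m⟩, so that Eqs. (149), (154), and (163) may be verified numerically. The following is a minimal sketch, assuming NumPy, units with ħ = 1 by default, and the zero phase in Eq. (164); the helper names `angular_momentum_matrices` and `comm` are hypothetical.

```python
import numpy as np

def angular_momentum_matrices(l, hbar=1.0):
    """Matrices of L_x, L_y, L_z, L_+, L_- in the basis |l, m>, m = l, l-1, ..., -l."""
    m = np.arange(l, -l - 1, -1)
    dim = len(m)
    Lz = hbar * np.diag(m).astype(complex)
    Lp = np.zeros((dim, dim), dtype=complex)          # raising operator L_+
    for col in range(1, dim):
        # Eq. (164): L_+ |l, m> = hbar [l(l+1) - m(m+1)]^{1/2} |l, m+1>
        Lp[col - 1, col] = hbar * np.sqrt(l * (l + 1) - m[col] * (m[col] + 1))
    Lm = Lp.conj().T                                   # lowering operator L_-
    Lx = (Lp + Lm) / 2                                 # inverted Eq. (153)
    Ly = (Lp - Lm) / (2 * 1j)
    return Lx, Ly, Lz, Lp, Lm

def comm(A, B):
    """Commutator [A, B]."""
    return A @ B - B @ A

l = 2
Lx, Ly, Lz, Lp, Lm = angular_momentum_matrices(l)
L2 = Lx @ Lx + Ly @ Ly + Lz @ Lz                       # Eq. (150)

print(np.allclose(comm(Lx, Ly), 1j * Lz))              # last of Eqs. (149)
print(np.allclose(comm(Lp, Lm), 2 * Lz))               # first of Eqs. (154)
print(np.allclose(comm(Lz, Lp), Lp))                   # second of Eqs. (154)
print(np.allclose(L2, l * (l + 1) * np.eye(2 * l + 1)))  # Eq. (163)
```

Since the construction relies only on the commutation relations, the same function may be called with a half-integer argument.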
48 The reader is also challenged to use the commutation relations discussed above to prove one more important
property of the common eigenstates of L̂_z and L̂²:
$$
\langle l, m|\,\hat{r}_j\,|l', m'\rangle = 0, \quad \text{unless } l' = l \pm 1 \text{ and } m' = \text{either } m \pm 1 \text{ or } m.
$$
This property gives the selection rule for the orbital electric-dipole quantum transitions, to be discussed later in
the course, especially in Sec. 9.3. (The final selection rules at these transitions may be affected by the particle’s
spin – see the next section.)
Now let us compare our results with those of Sec. 3.6. Using the expression of Cartesian
coordinates via the spherical ones exactly as this was done in Eq. (152), we get the following
expressions for the ladder operators (153) in the coordinate representation:
$$
% Margin note: Angular momentum operators: coordinate representation
\hat{L}_\pm = \hbar\,e^{\pm i\varphi}\left(\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right). \tag{5.165}
$$
Now plugging this relation, together with Eq. (152), into any of Eqs. (155), we get
$$
\hat{L}^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]. \tag{5.166}
$$
But this is exactly the operator (besides its division by the constant parameter 2mR²) that stands on the
left-hand side of Eq. (3.156). Hence that equation, which was explored by the “brute-force”
(wave-mechanical) approach in Sec. 3.6, may be understood as the eigenproblem for the operator L̂² in the
coordinate representation, with the eigenfunctions Y_l^m(θ,φ) corresponding to the eigenkets |l, m⟩, and the
eigenvalues L² = 2mR²E. As a reminder, the main result of that rather involved analysis was expressed
by Eq. (3.163), which now may be rewritten as
$$
L_l^2 = 2mR^2E_l = \hbar^2 l(l+1), \tag{5.167}
$$
in full agreement with Eq. (163), which was obtained by much more efficient means based on the
bra-ket formalism. In particular, it is fascinating to see how easy it is to operate with the eigenvectors |l, m⟩,
while the coordinate representations of these ket-vectors, the spherical harmonics Y_l^m(θ,φ), may be only
expressed by rather complicated functions – please have one more look at Eq. (3.171) and Fig. 3.20.
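As a quick consistency check of Eqs. (166) and (167) (a worked example added here for illustration, with the normalization constant dropped), one may take the φ-independent function cos θ, which is proportional to the spherical harmonic with l = 1 and m = 0:

```latex
\hat{L}^2 \cos\theta
  = -\hbar^2 \frac{1}{\sin\theta} \frac{\partial}{\partial\theta}
      \left( \sin\theta \, \frac{\partial \cos\theta}{\partial\theta} \right)
  = \hbar^2 \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin^2\theta \right)
  = 2\hbar^2 \cos\theta
  \equiv \hbar^2 l(l+1) \cos\theta \Big|_{l=1} .
```

The same function is evidently an eigenfunction of the operator (152) with the eigenvalue L_z = 0, i.e. m = 0.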
Note that all relations discussed in this section are not conditioned by any particular Hamiltonian
of the system under analysis, though they (as well as those discussed in the next section) are especially
important for particles moving in spherically-symmetric potentials.
5.7. Spin and its addition to orbital angular momentum

The theory described in the last section is useful for much more than orbital motion analysis. In
particular, it helps to generalize the spin-½ results discussed in Chapter 4 to other values of spin s – the
parameter still to be quantitatively defined. For that, let us notice that the commutation relations (4.155)
for spin-½, which were derived from the Pauli matrix properties, may be rewritten in exactly the same
form as Eqs. (149) and (151) for the orbital momentum:
$$
% Margin note: Spin operators: commutation relations
\left[\hat{S}_j, \hat{S}_{j'}\right] = i\hbar\,\hat{S}_{j''}\,\varepsilon_{jj'j''}, \qquad \left[\hat{S}^2, \hat{S}_j\right] = 0. \tag{5.168}
$$
It had been postulated (and then confirmed by numerous experiments) that these relations hold
for quantum particles with any spin. Now notice that all the calculations of the last section have been
based almost exclusively on such relations – the only exception will be discussed imminently. Hence,
we may repeat them for the spin operators, and get the relations similar to Eqs. (158) and (163):
$$
% Margin note: Spin operators: eigenstates and eigenvalues
\hat{S}_z|s, m_s\rangle = \hbar m_s\,|s, m_s\rangle, \qquad \hat{S}^2|s, m_s\rangle = \hbar^2 s(s+1)\,|s, m_s\rangle, \qquad 0 \le s, \quad -s \le m_s \le +s, \tag{5.169}
$$
where m_s is a quantum number parallel to the orbital magnetic number m, and the non-negative constant
s is defined as the maximum value of |m_s|. The c-number s is exactly what is called the particle’s spin.
Now let us return to the only part of our orbital momentum calculations that has not been derived
from the commutation relations. This was the fact, based on the solution (146) of the orbital motion
problems, that the quantum number m (the analog of m_s) may be only an integer. For spin, we do not
have such a solution, so that the spectrum of numbers m_s (and hence its limits ±s) should be found from
the looser requirement that the eigenstate ladder, extending from –s to +s, has an integer number of
steps. Hence, 2s has to be an integer, i.e. the spin s of a quantum particle may be either integer (as it is,
for example, for photons, gluons, and massive bosons W± and Z⁰), or half-integer (e.g., for all quarks
and leptons, notably including electrons).49 For s = ½, this picture yields all the properties of the spin-½
that were derived in Chapter 4 from Eqs. (4.115)-(4.117). In particular, the operators Ŝ² and Ŝ_z have
two common eigenstates (↑ and ↓), with S_z = ħm_s = ±ħ/2, both with S² = s(s + 1)ħ² = (3/4)ħ².
Note that this analogy with the angular momentum sheds new light on the symmetry properties
of spin-½. Indeed, the fact that m in Eq. (146) is integer was derived in Sec. 3.5 from the requirement
that making a full circle around axis z, we should find a similar value of wavefunction ψ_m, which differs
from the initial one by an inconsequential factor exp{2πim} = +1. With the replacement m → m_s = ½,
such operation would multiply the wavefunction by exp{iπ} = –1, i.e. reverse its sign. Of course, spin
properties cannot be described by a usual wavefunction, but this odd parity of electrons, shared by all
other spin-½ particles, is clearly revealed in properties of multiparticle systems (see Chapter 8 below),
and as a result, in their statistics (see, e.g., SM Chapter 2).
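The counting of the ladder steps and the sign reversal at a full turn may be illustrated by a few lines of code. This is a sketch using only the Python standard library; the function names are hypothetical, and the factor exp{2πim_s} is simply the one discussed above, evaluated for each allowed m_s.

```python
import cmath

def magnetic_numbers(s):
    """All values m_s = -s, -s + 1, ..., +s allowed by Eq. (169); 2s must be an integer."""
    two_s = round(2 * s)
    if two_s < 0 or abs(2 * s - two_s) > 1e-12:
        raise ValueError("2s must be a non-negative integer")
    return [-s + k for k in range(two_s + 1)]

def full_turn_factors(s):
    """Factors exp{2 pi i m_s} acquired at a full turn about the z-axis."""
    return [cmath.exp(2j * cmath.pi * m) for m in magnetic_numbers(s)]

for s in (0.5, 1, 1.5):
    print(s, magnetic_numbers(s), [round(f.real) for f in full_turn_factors(s)])
```

For integer s all factors are equal to +1, while for half-integer s they are all equal to –1, for any of the 2s + 1 values of m_s.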
Now we are sufficiently equipped to analyze the situations in which a particle has both the
orbital momentum and the spin – as an electron in an atom. In classical mechanics, such an object, with
the spin S interpreted as the angular momentum of its internal rotation, would be characterized by the total
angular momentum vector J = L + S. Following the correspondence principle, we may assume that
quantum-mechanical properties of this observable may be described by the similarly defined vector
operator:
$$
% Margin note: Total angular momentum
\hat{\mathbf{J}} \equiv \hat{\mathbf{L}} + \hat{\mathbf{S}}, \tag{5.170}
$$
with Cartesian components
$$
\hat{J}_z \equiv \hat{L}_z + \hat{S}_z, \ \text{etc.}, \tag{5.171}
$$
and the magnitude squared equal to
$$
\hat{J}^2 \equiv \hat{J}_x^2 + \hat{J}_y^2 + \hat{J}_z^2. \tag{5.172}
$$
Let us examine the properties of this vector operator. Since its two components (170) describe
different degrees of freedom of the particle, i.e. belong to different Hilbert spaces, they have to be
completely commuting:
$$
\left[\hat{L}_j, \hat{S}_{j'}\right] = 0, \qquad \left[\hat{L}^2, \hat{S}^2\right] = 0, \qquad \left[\hat{L}_j, \hat{S}^2\right] = 0, \qquad \left[\hat{L}^2, \hat{S}_j\right] = 0. \tag{5.173}
$$
The above equalities are sufficient to derive the commutation relations for the operator Ĵ, and
unsurprisingly, they turn out to be absolutely similar to those of its components:
$$
% Margin note: Total momentum: commutation relations
\left[\hat{J}_j, \hat{J}_{j'}\right] = i\hbar\,\hat{J}_{j''}\,\varepsilon_{jj'j''}, \qquad \left[\hat{J}^2, \hat{J}_j\right] = 0. \tag{5.174}
$$
Now repeating all the arguments of the last section, we may derive the following expressions for the
common eigenstates of the operators Ĵ² and Ĵ_z:
$$
% Margin note: Total momentum: eigenstates and eigenvalues
\hat{J}_z|j, m_j\rangle = \hbar m_j\,|j, m_j\rangle, \qquad \hat{J}^2|j, m_j\rangle = \hbar^2 j(j+1)\,|j, m_j\rangle, \qquad 0 \le j, \quad -j \le m_j \le +j, \tag{5.175}
$$
where j and m_j are new quantum numbers.50 Repeating the arguments just made for s and m_s, we may
conclude that j and m_j may be either integer or half-integer.
49 As a reminder, in the Standard Model of particle physics, such hadrons as mesons and baryons (notably
including protons and neutrons) are essentially composite particles. However, at non-relativistic energies, protons
and neutrons may be considered fundamental particles with s = ½.
Before we proceed, one remark on notation: it is very convenient to use the same letter m for
numbering eigenstates of all momentum components participating in Eq. (171), with corresponding
indices (j, l, and s), in particular, to replace what we called m with m_l. With this replacement, the main
results of the last section may be summarized in a form similar to Eqs. (168), (169), (174), and (175):
$$
% Margin note: Orbital momentum: basic properties (new notation)
\left[\hat{L}_j, \hat{L}_{j'}\right] = i\hbar\,\hat{L}_{j''}\,\varepsilon_{jj'j''}, \qquad \left[\hat{L}^2, \hat{L}_j\right] = 0, \tag{5.176}
$$
$$
\hat{L}_z|l, m_l\rangle = \hbar m_l\,|l, m_l\rangle, \qquad \hat{L}^2|l, m_l\rangle = \hbar^2 l(l+1)\,|l, m_l\rangle, \qquad 0 \le l, \quad -l \le m_l \le +l. \tag{5.177}
$$
In order to understand which eigenstates participating in Eqs. (169), (175), and (177) are
compatible with each other, it is straightforward to use Eq. (172), together with Eqs. (168), (173), (174),
and (176) to get the following relations:
$$
\left[\hat{J}^2, \hat{L}^2\right] = 0, \qquad \left[\hat{J}^2, \hat{S}^2\right] = 0, \tag{5.178}
$$
$$
\left[\hat{J}^2, \hat{L}_z\right] \ne 0, \qquad \left[\hat{J}^2, \hat{S}_z\right] \ne 0. \tag{5.179}
$$
This result is represented schematically on the Venn diagram shown in Fig. 12, in which the crossed
arrows indicate the only non-commuting pairs of operators.
Fig. 5.12. The Venn diagram of angular momentum operators, and their mutually-commuting groups.
(Diagram labels: the operators L̂², Ŝ², Ĵ², L̂_z, Ŝ_z, and Ĵ_z; “operators diagonal in the coupled
representation”; “operators diagonal in the uncoupled representation”.)
This means that there are eigenstates shared by all operators within each of the two groups encircled
with colored lines in Fig. 12. The first group (encircled red) consists of all these operators but Ĵ². Hence
there are eigenstates shared by the five remaining operators, and these states correspond to definite values
of the corresponding quantum numbers: l, m_l, s, m_s, and m_j. Actually, only four of these numbers are
independent, because due to Eq. (171) for these compatible operators, for each eigenstate of this group,
their “magnetic” quantum numbers m have to satisfy the following relation:
$$
m_j = m_l + m_s. \tag{5.180}
$$
50 Let me hope that the difference between the quantum number j, and the indices j, j’, j” numbering the Cartesian
components in the relations like Eqs. (168) or (174), is absolutely clear from the context.
Hence the common eigenstates of the operators of this group are fully defined by just four quantum
numbers, for example, l, ml, s, and ms. For some calculations, especially those for the systems whose
Hamiltonians include only the operators of this group, it is convenient51 to use this set of eigenstates as
the basis; frequently this approach is called the uncoupled representation.
However, in some situations we cannot ignore interactions between the orbital and spin degrees
of freedom (in the common jargon, the spin-orbit coupling), which leads in particular to splitting (called
the fine structure) of the atomic energy levels even in the absence of external magnetic field. I will
discuss these effects in detail in the next chapter, and now will only note that they may be described by a
term proportional to the product L̂·Ŝ, in the system’s Hamiltonian. If this term is substantial, the
uncoupled representation becomes inconvenient. Indeed, writing
$$
\hat{J}^2 = \left(\hat{\mathbf{L}} + \hat{\mathbf{S}}\right)^2 = \hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}, \quad \text{so that} \quad 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}} = \hat{J}^2 - \hat{L}^2 - \hat{S}^2, \tag{5.181}
$$
and looking at Fig. 12 again, we see that operator L̂·Ŝ, describing the spin-orbit coupling, does not
commute with operators L̂_z and Ŝ_z. This means that stationary states of the system with such term in
the Hamiltonian do not belong to the uncoupled representation’s basis. On the other hand, Eq. (181)
shows that the operator L̂·Ŝ does commute with all four operators of another group, encircled blue in
Fig. 12. According to Eqs. (178), (179), and (181), all operators of that group also commute with each
other, so that they have common eigenstates, described by the quantum numbers, l, s, j, and m_j. This
group is the basis for the so-called coupled representation of particle states.
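The commutation properties (178), (179), and (181) may be checked on explicit matrices. The sketch below is an illustration under stated assumptions: NumPy, units with ħ = 1, the particular values l = 1 and s = ½, and the uncoupled basis built as a Kronecker product of the orbital and spin bases; the helper names `am_matrices` and `comm` are hypothetical.

```python
import numpy as np

def am_matrices(q):
    """(x, y, z) component matrices for quantum number q, hbar = 1, basis m = q, ..., -q."""
    m = np.arange(q, -q - 1, -1)
    # Raising operator from the ladder relation (164)
    plus = np.diag(np.sqrt(q * (q + 1) - m[1:] * (m[1:] + 1)), k=1).astype(complex)
    minus = plus.conj().T
    return (plus + minus) / 2, (plus - minus) / (2 * 1j), np.diag(m).astype(complex)

def comm(A, B):
    """Commutator [A, B]."""
    return A @ B - B @ A

l, s = 1, 0.5
L_small, S_small = am_matrices(l), am_matrices(s)
dim_l, dim_s = L_small[2].shape[0], S_small[2].shape[0]

# Operators in the product space of the orbital and spin degrees of freedom
L = [np.kron(M, np.eye(dim_s)) for M in L_small]
S = [np.kron(np.eye(dim_l), M) for M in S_small]
J = [Lk + Sk for Lk, Sk in zip(L, S)]                   # Eq. (170)

L2 = sum(M @ M for M in L)
S2 = sum(M @ M for M in S)
J2 = sum(M @ M for M in J)                              # Eq. (172)
LS = sum(Lk @ Sk for Lk, Sk in zip(L, S))               # scalar product L.S

print(np.allclose(comm(J2, L2), 0), np.allclose(comm(J2, S2), 0))        # Eq. (178)
print(np.allclose(comm(J2, L[2]), 0), np.allclose(comm(J2, S[2]), 0))    # Eq. (179): both False
print(np.allclose(2 * LS, J2 - L2 - S2))                                 # Eq. (181)
print(np.allclose(comm(LS, J2), 0), np.allclose(comm(LS, J[2]), 0))      # coupled group
print(np.allclose(comm(LS, L[2]), 0))                                    # False
```

Note that at fixed l and s the operators L̂² and Ŝ² are proportional to the identity matrix, so the first line of the output is trivially true in this sketch; the informative checks are those involving L̂_z, Ŝ_z, and L̂·Ŝ.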
Excluding, for brevity of notation, the quantum numbers l and s, common for both groups, it
is convenient to denote the common ket-vectors of each group as, respectively,
$$
% Margin note: Coupled and uncoupled bases
\begin{aligned}
&|m_l, m_s\rangle, &&\text{for the uncoupled representation's basis}, \\
&|j, m_j\rangle, &&\text{for the coupled representation's basis}.
\end{aligned} \tag{5.182}
$$
As we will see in the next chapter, for the solution of some important problems (e.g., the fine structure
of atomic spectra and the Zeeman effect), we will need the relation between the kets |j, m_j⟩ and the kets
|m_l, m_s⟩. This relation may be represented as the usual linear superposition,
$$
% Margin note: Clebsch-Gordan coefficients: definition
|j, m_j\rangle = \sum_{m_l, m_s} |m_l, m_s\rangle \langle m_l, m_s|j, m_j\rangle. \tag{5.183}
$$
The short brackets in this relation, essentially the elements of the unitary matrix of the transformation
between two eigenstate bases (182), are called the Clebsch-Gordan coefficients.
The best (though imperfect) classical interpretation of Eq. (183) I can offer is as follows. If the
lengths of the vectors L and S (in quantum mechanics associated with the numbers l and s, respectively),
51This is especially true for motion in spherically-symmetric potentials, whose stationary states correspond to
definite l and ml; however, the relations discussed in this section are important for some other problems as well.
and also their scalar product LS, are all fixed, then so is the length of the vector J = L + S – whose
length in quantum mechanics is described by the number j. Hence, the classical image of a specific
eigenket j, mj, in which l, s, j, and mj are all fixed, is a state in which L2, S2, J2, and Jz are fixed.
However, this fixation still allows for arbitrary rotation of the pair of vectors L and S (with a fixed angle
between them, and hence fixed LS and J2) about the direction of the vector J – see Fig. 13.
Hence the components $L_z$ and $S_z$ in these conditions are not fixed, and in classical mechanics
may take a continuum of values, two of which (with the largest and the smallest possible values of $S_z$)
are shown in Fig. 13. In quantum mechanics, these components are quantized, with their states
represented by eigenkets $|m_l, m_s\rangle$, so that a linear combination of such kets is necessary to represent a ket
$|j, m_j\rangle$. This is exactly what Eq. (183) does.
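Fig. 5.13. A classical image of two different quantum states with the same quantum numbers l, s, j, and
$m_j$, but different $m_l$ and $m_s$. (Each of the two panels shows the vectors L, S, and J = L + S, together with their components $L_z$, $S_z$, and $J_z$ along the z-axis.)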
Some properties of the Clebsch-Gordan coefficients $\langle m_l, m_s | j, m_j\rangle$ may be readily established.
For example, the coefficients do not vanish only if the involved magnetic quantum numbers satisfy Eq.
(180). In our current case, this relation is not an elementary corollary of Eq. (171), because the
Clebsch-Gordan coefficients, with the quantum numbers $m_l$, $m_s$ in one state vector, and $m_j$ in the other
state vector, characterize the relation between different groups of the basis states, so we need to prove
this fact. All matrix elements of the null-operator
$$\hat{J}_z - (\hat{L}_z + \hat{S}_z) = \hat{0} \tag{5.184}$$
should equal zero in any basis; in particular
$$\langle j, m_j | \hat{J}_z - (\hat{L}_z + \hat{S}_z) | m_l, m_s \rangle = 0. \tag{5.185}$$
Acting by the operator $\hat{J}_z$ upon the bra-vector, and by the sum $(\hat{L}_z + \hat{S}_z)$ upon the ket-vector, we get
$$\left[ m_j - (m_l + m_s) \right] \langle j, m_j | m_l, m_s \rangle = 0, \tag{5.186}$$
thus proving that
$$\langle m_l, m_s | j, m_j \rangle \equiv \langle j, m_j | m_l, m_s \rangle^* = 0, \quad \text{if } m_j \neq m_l + m_s. \tag{5.187}$$
For the most important case of spin-½ particles (with s = ½, and hence $m_s = \pm\tfrac{1}{2}$), whose
uncoupled representation basis includes 2(2l + 1) states, the restriction (187) enables the representation
of all non-zero Clebsch-Gordan coefficients on the simple “rectangular” diagram shown in Fig. 14.
Indeed, each coupled-representation eigenket $|j, m_j\rangle$, with $m_j = m_l + m_s = m_l \pm \tfrac{1}{2}$, may be related by non-zero
Clebsch-Gordan coefficients to at most two uncoupled-representation eigenstates $|m_l, m_s\rangle$. Since $m_l$
may only take integer values from –l to +l, $m_j$ may only take half-integer values on the interval [–l – ½,
l + ½]. Hence, by the definition of j as $(m_j)_{\max}$, its maximum value has to be l + ½, and for $m_j$ = l + ½,
this value of j is the only possible one. This means that the uncoupled state with $m_l$ = l and $m_s$ = ½
should be identical to the coupled-representation state with j = l + ½ and $m_j$ = l + ½:
$$|j = l + \tfrac{1}{2},\ m_j = l + \tfrac{1}{2}\rangle = |m_l = m_j - \tfrac{1}{2},\ m_s = +\tfrac{1}{2}\rangle. \tag{5.188}$$
In Fig. 14, these two identical states are represented with the top-rightmost point (the uncoupled
representation) and the sloped line passing through it (the coupled representation).
Fig. 5.14. A graphical representation of possible basis states of a spin-½ particle with a fixed l. Each dot
(at an integer $m_l$ from –l to +l on the horizontal axis, and $m_s = \pm\tfrac{1}{2}$ on the vertical axis) corresponds to an uncoupled-representation ket-vector $|m_l, m_s\rangle$, while each sloped line corresponds to one
coupled-representation ket-vector $|j, m_j\rangle$, related by Eq. (183) to the kets $|m_l, m_s\rangle$ whose dots it connects.
However, already the next value of this quantum number, $m_j$ = l – ½, is compatible with two
values of j, so that each $|m_l, m_s\rangle$ ket has to be related to two $|j, m_j\rangle$ kets by two Clebsch-Gordan
coefficients. Since j changes in unit steps, these values of j have to be l ± ½. This choice,
$$j = l \pm \tfrac{1}{2}, \tag{5.189}$$
evidently satisfies all lower values of $m_j$ as well – see Fig. 14.⁵² (Again, only one value, j = l + ½, is
necessary to represent the state with the lowest $m_j$ = –l – ½ – see the bottom-leftmost point of that
diagram.) Note that the total number of the coupled-representation states is 1 + 2×2l + 1 ≡ 2(2l + 1), i.e.
is the same as that in the uncoupled representation. So, for spin-½ systems, each sum (183), for fixed j
and $m_j$ (plus the fixed common parameter l, plus the common s = ½), has at most two terms, i.e. involves
at most two Clebsch-Gordan coefficients.
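The state counting above is easy to confirm by brute-force enumeration. The following Python sketch is only an illustration (the function names are mine, not part of the notes): it lists the uncoupled pairs ($m_l$, $m_s$) and the coupled pairs (j, $m_j$) allowed by Eq. (189) for a spin-½ particle, and counts how many states of each kind share the same $m_j$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def uncoupled_states(l):
    """All (m_l, m_s) pairs for a spin-1/2 particle with orbital number l."""
    return [(Fraction(ml), ms) for ml in range(-l, l + 1) for ms in (-HALF, HALF)]

def coupled_states(l):
    """All (j, m_j) pairs with j = l +/- 1/2, as in Eq. (5.189)."""
    states = []
    for j in (l + HALF, l - HALF):
        if j <= 0:
            continue  # for l = 0, only j = 1/2 exists
        mj = -j
        while mj <= j:
            states.append((j, mj))
            mj += 1
    return states

def counts_per_mj(l):
    """Map m_j -> [number of uncoupled states, number of coupled states]."""
    counts = {}
    for ml, ms in uncoupled_states(l):
        counts.setdefault(ml + ms, [0, 0])[0] += 1
    for j, mj in coupled_states(l):
        counts.setdefault(mj, [0, 0])[1] += 1
    return counts

for l in range(4):
    print(l, len(uncoupled_states(l)), len(coupled_states(l)), 2 * (2 * l + 1))
```

For every $m_j$ the two counts coincide and never exceed two, which is the numerical counterpart of the statement that each sum (183) has at most two terms.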
These coefficients may be calculated in a few steps, all but the last one rather simple even for an
arbitrary spin s. First, the similarity of the vector operators $\hat{\mathbf{J}}$ and $\hat{\mathbf{S}}$ to the operator $\hat{\mathbf{L}}$, expressed by
Eqs. (169), (175), and (177), may be used to argue that the operators $\hat{S}_\pm$ and $\hat{J}_\pm$,
defined similarly to $\hat{L}_\pm$, have matrix elements similar to those given by Eq. (164). Next, acting by
the operator $\hat{J}_\pm = \hat{L}_\pm + \hat{S}_\pm$ upon both parts of Eq. (183), and then inner-multiplying the result by the bra-vector
$\langle m_l, m_s|$ and using the above matrix elements, we may get recurrence relations for the Clebsch-Gordan
coefficients with adjacent values of $m_l$, $m_s$, and $m_j$. Finally, these relations may be sequentially
applied to the adjacent states in both representations, starting from any of the two states common for
them – for example, from the state with the ket-vector (188), corresponding to the top right point in Fig.
14.
52 Eq. (5.189) allows a semi-qualitative classical interpretation in terms of the vector diagrams shown in Fig. 13:
since, according to Eq. (169), s gives the scale of the length of the vector S, if it is small (s = ½), the length of
vector J (similarly scaled by j) cannot deviate much from the length of the vector L (scaled by l) for any spatial
orientation of these vectors, so that j cannot differ from l too much. Note also that for a fixed $m_j$, the alternating
sign in Eq. (189) is independent of the sign of $m_s$ – see also Eqs. (190).
Let me leave these straightforward but a bit tedious calculations for the reader’s exercise, and
just cite the final result of this procedure for s = ½:⁵³
$$\langle m_l = m_j - \tfrac{1}{2},\ m_s = +\tfrac{1}{2} \,|\, j = l \pm \tfrac{1}{2},\ m_j \rangle = \pm \left( \frac{l \pm m_j + \tfrac{1}{2}}{2l + 1} \right)^{1/2},$$
$$\langle m_l = m_j + \tfrac{1}{2},\ m_s = -\tfrac{1}{2} \,|\, j = l \pm \tfrac{1}{2},\ m_j \rangle = \left( \frac{l \mp m_j + \tfrac{1}{2}}{2l + 1} \right)^{1/2}. \tag{5.190}$$
[Clebsch-Gordan coefficients for s = ½]
In this course, these relations will be used mostly in Sec. 6.4 for an analysis of the anomalous Zeeman
effect. Moreover, the angular momentum addition theory described above is also valid for the addition
of angular momenta of multiparticle system components, so we will revisit it in Chapter 8.
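Eqs. (190) may be checked numerically without repeating the recurrence calculation. The sketch below is an illustration under stated assumptions, not the derivation left for the reader: it takes ħ = 1, uses the standard matrix elements of the ladder operators (the positive-root phase convention), and works in the two-state subspace {$|m_j - \tfrac{1}{2}, +\tfrac{1}{2}\rangle$, $|m_j + \tfrac{1}{2}, -\tfrac{1}{2}\rangle$} allowed by Eq. (187). In that subspace it writes the 2×2 matrix of $J^2 = L^2 + S^2 + 2L_zS_z + L_+S_- + L_-S_+$, and tests that the pair of coefficients (190) is a normalized eigenvector with the eigenvalue j(j + 1). All function names are hypothetical.

```python
import math

def cg_spin_half(l, j2, mj2):
    """Eq. (5.190) for given l and the integers j2 = 2j, mj2 = 2m_j.

    Returns (a, b): the amplitudes of |m_l = m_j - 1/2, m_s = +1/2> and
    |m_l = m_j + 1/2, m_s = -1/2> in the coupled state |j, m_j>.
    """
    mj = mj2 / 2.0
    sign = 1 if j2 == 2 * l + 1 else -1        # upper sign: j = l + 1/2
    a = sign * math.sqrt((l + sign * mj + 0.5) / (2 * l + 1))
    b = math.sqrt((l - sign * mj + 0.5) / (2 * l + 1))
    return a, b

def j_squared_block(l, mj2):
    """2x2 matrix of J^2 (hbar = 1) in the two-state subspace with fixed m_j."""
    mj = mj2 / 2.0
    diag_up = l * (l + 1) + 0.75 + (mj - 0.5)  # L^2 + S^2 + 2 m_l m_s, m_s = +1/2
    diag_dn = l * (l + 1) + 0.75 - (mj + 0.5)  # the same for m_s = -1/2
    off = math.sqrt((l + 0.5 + mj) * (l + 0.5 - mj))  # from L+S- + L-S+
    return [[diag_up, off], [off, diag_dn]]

def max_residual(l):
    """Largest violation of J^2 v = j(j+1) v and |v| = 1 over all (j, m_j)."""
    worst = 0.0
    for j2 in (2 * l + 1, 2 * l - 1):
        if j2 <= 0:
            continue
        j = j2 / 2.0
        for mj2 in range(-j2, j2 + 1, 2):
            a, b = cg_spin_half(l, j2, mj2)
            m = j_squared_block(l, mj2)
            r1 = m[0][0] * a + m[0][1] * b - j * (j + 1) * a
            r2 = m[1][0] * a + m[1][1] * b - j * (j + 1) * b
            worst = max(worst, abs(r1), abs(r2), abs(a * a + b * b - 1.0))
    return worst

for l in range(4):
    print(l, max_residual(l))
```

The printed residuals should be at the level of floating-point rounding. At the edge of the diagram ($m_j$ = l + ½), the function returns the single unit coefficient implied by Eq. (188).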
To conclude this section, I have to note that the Clebsch-Gordan coefficients (for arbitrary s)
participate also in the so-called Wigner-Eckart theorem that expresses the matrix elements of spherical
tensor operators, in the coupled-representation basis j, mj, via a reduced set of matrix elements. This
theorem may be useful, for example, for the calculation of the rate of quantum transitions to/from high-n
states in spherically-symmetric potentials. Unfortunately, a discussion of this theorem and its
applications would require a higher mathematical background than I can expect from my readers, and
more time/space than I can afford.54
5.1. Use the discussion in Sec. 1 to find an alternative solution of Problem 4.18.
5.2. A spin-½ is placed into an external magnetic field, with a time-independent orientation, its
magnitude B(t) being an arbitrary function of time. Find explicit expressions for the Heisenberg
operators and the expectation values of all three Cartesian components of the spin, as functions of time,
in a coordinate system of your choice.
5.3. A two-level system is in the quantum state α described by the ket-vector $|\alpha\rangle = \alpha_\uparrow |\uparrow\rangle + \alpha_\downarrow |\downarrow\rangle$,
with given (generally, complex) c-number coefficients $\alpha_{\uparrow\downarrow}$. Prove that we can always select such
a geometric c-number vector c = {$c_x$, $c_y$, $c_z$}, that α is an eigenstate of $\mathbf{c}\cdot\hat{\boldsymbol{\sigma}}$, where $\hat{\boldsymbol{\sigma}}$ is the Pauli vector
operator. Find all possible values of c satisfying this condition, and the second eigenstate (orthogonal to
α) of the operator $\mathbf{c}\cdot\hat{\boldsymbol{\sigma}}$. Give a Bloch-sphere interpretation of your result.
53 For arbitrary spin s, the calculations and even the final expressions for the Clebsch-Gordan coefficients are
rather bulky. They may be found, typically in a table form, mostly in special monographs – see, e.g., A.
Edmonds, Angular Momentum in Quantum Mechanics, Princeton U. Press, 1957.
54 For the interested reader, I can recommend either Sec. 17.7 in E. Merzbacher, Quantum Mechanics, 3rd ed.,
Wiley, 1998, or Sec. 3.10 in J. Sakurai, Modern Quantum Mechanics, Addison-Wesley, 1994.
5.4.* Analyze statistics of the spacing $S \equiv E_+ - E_-$ between energy levels of a two-level system,
assuming that all elements $H_{jj'}$ of its Hamiltonian matrix (2) are independent random numbers, with
equal and constant probability densities within the energy interval of interest. Compare the result with
that for a purely diagonal Hamiltonian matrix, with the similar probability distribution of its random
diagonal elements.
5.5. For a periodic motion of a single particle in a confining potential U(r), the virial theorem of
non-relativistic classical mechanics⁵⁵ is reduced to the following equality:
$$\overline{T} = \frac{1}{2}\, \overline{\mathbf{r}\cdot\nabla U},$$
where T is the particle’s kinetic energy, and the top bar means averaging over the time period of motion.
Prove the following quantum-mechanical version of the theorem for an arbitrary stationary quantum
state, in the absence of spin effects:
$$\langle T \rangle = \frac{1}{2} \langle \mathbf{r}\cdot\nabla U \rangle,$$
where the angular brackets mean the expectation values of the observables.
Hint: Mimicking the proof of the classical virial theorem, consider the time evolution of the
following operator:
$$\hat{G} \equiv \hat{\mathbf{r}}\cdot\hat{\mathbf{p}}.$$
5.6. Calculate, in the WKB approximation, the transparency T of the following saddle-shaped
potential barrier:
$$U(x, y) = U_0 \left( 1 + \frac{xy}{a^2} \right),$$
where $U_0$ > 0 and a are real constants, for tunneling of a 2D particle with energy E < $U_0$.
5.7. Calculate the so-called Gamow factor⁵⁶ for the alpha decay of atomic nuclei, i.e. the
exponential factor in the transparency of the potential barrier resulting from the following simple model
for the alpha-particle’s potential energy as a function of its distance from the nuclear center:
$$U(r) = \begin{cases} U_0 < 0, & \text{for } r < R, \\[4pt] \dfrac{ZZ'e^2}{4\pi\varepsilon_0 r}, & \text{for } R < r, \end{cases}$$
(where Ze = 2e > 0 is the charge of the particle, Z’e > 0 is that of the nucleus after the decay, and R is
the nucleus’ radius), in the WKB approximation.
5.8. Use the WKB approximation to calculate the average time of ionization of a hydrogen atom,
initially in its ground state, made metastable by application of an additional weak, uniform, time-
independent electric field E. Formulate the conditions of validity of your result.
55 See, e.g., CM Problem 1.12.

56 Named after G. Gamow, who made this calculation as early as in 1928.
5.9. For a 1D harmonic oscillator with mass m and frequency $\omega_0$, calculate:
(i) all matrix elements $\langle n|\hat{x}^3|n'\rangle$, and
(ii) the diagonal matrix elements $\langle n|\hat{x}^4|n\rangle$,
where n and n’ are arbitrary Fock states.
5.10. Calculate the sum (over all n > 0) of the so-called oscillator strengths,
$$f_n \equiv \frac{2m}{\hbar^2} \left( E_n - E_0 \right) \left| \langle n|\hat{x}|0\rangle \right|^2,$$
(i) for a 1D harmonic oscillator, and
(ii) for a 1D particle confined in an arbitrary stationary potential.
5.11. Prove the so-called Bethe sum rule,
$$\sum_{n'} \left( E_{n'} - E_n \right) \left| \langle n| e^{ik\hat{x}} |n'\rangle \right|^2 = \frac{\hbar^2 k^2}{2m},$$
valid for a 1D particle moving in an arbitrary time-independent potential U(x), and discuss its relation
with the Thomas-Reiche-Kuhn sum rule whose derivation was the subject of the previous problem.
Hint: Calculate the expectation value, in a stationary state n, of the following double
commutator,
$$\hat{D} \equiv \left[ \left[ \hat{H}, e^{ik\hat{x}} \right], e^{-ik\hat{x}} \right],$$
in two ways: first, just spelling out both commutators, and, second, using the commutation relations
between operators $\hat{p}_x$ and $e^{ik\hat{x}}$, and compare the results.
5.12. Given Eq. (116), prove Eq. (117), using the hint given in the accompanying footnote.
5.13. Use Eqs. (116)-(117) to simplify the following operators:
(i) $\exp\{+ia\hat{x}\}\, \hat{p}_x \exp\{-ia\hat{x}\}$, and
(ii) $\exp\{+ia\hat{p}_x\}\, \hat{x} \exp\{-ia\hat{p}_x\}$,
where a is a c-number.
5.14. For a 1D harmonic oscillator, calculate:
(i) the expectation value of energy, and
(ii) the time evolution of the expectation values of the coordinate and momentum,
provided that in the initial moment (t = 0) it was in the state described by the following ket-vector:
$$|\alpha\rangle = \frac{1}{\sqrt{2}} \left( |31\rangle + |32\rangle \right),$$
where $|n\rangle$ are the ket-vectors of the stationary (Fock) states of the oscillator.
5.15.* Re-derive the London dispersion force’s potential of the interaction of two isotropic 3D
harmonic oscillators (already calculated in Problem 3.16), using the language of mutually-induced
polarization.
5.16. An external force pulse F(t), of a finite time duration T, has been exerted on a 1D harmonic
oscillator, initially in its ground state. Use the Heisenberg-picture equations of motion to calculate the
expectation value of the oscillator’s energy at the end of the pulse.
5.17. Use Eqs. (144)-(145) to calculate the uncertainties δx and δp for a harmonic oscillator in its
squeezed ground state, and in particular, to prove Eqs. (143) for the case θ = 0.
5.18. Calculate the energy of a harmonic oscillator in the squeezed ground state ζ.
5.19.* Prove that the squeezed ground state, described by Eqs. (142) and (144)-(145), may be
sustained by a sinusoidal modulation of a harmonic oscillator’s parameter, and calculate the squeezing
factor r as a function of the parameter modulation depth, assuming that the depth is small, and the
oscillator’s damping is negligible.
5.20. Use Eqs. (148) to prove that the operators $\hat{L}_j$ and $\hat{L}^2$ commute with the Hamiltonian of a
spinless particle placed in any central potential field.
5.21. Use Eqs. (149)-(150) and (153) to prove Eqs. (155).
5.22. Derive Eq. (164), using any of the prior formulas.
5.23. In the basis of common eigenstates of the operators $\hat{L}_z$ and $\hat{L}^2$, described by kets $|l, m\rangle$:
(i) calculate the matrix elements $\langle l, m_1|\hat{L}_x|l, m_2\rangle$ and $\langle l, m_1|\hat{L}_x^2|l, m_2\rangle$;
(ii) spell out your results for diagonal matrix elements (with $m_1 = m_2$) and their y-axis
counterparts; and
(iii) calculate the diagonal matrix elements $\langle l, m|\hat{L}_x\hat{L}_y|l, m\rangle$ and $\langle l, m|\hat{L}_y\hat{L}_x|l, m\rangle$.
5.24. For the state described by the common eigenket $|l, m\rangle$ of the operators $\hat{L}_z$ and $\hat{L}^2$ in a
reference frame {x, y, z}, calculate the expectation values $\langle L_{z'}\rangle$ and $\langle L_{z'}^2\rangle$ in the reference frame whose
z’-axis forms angle θ with the z-axis.
5.25. Write down the matrices of the following angular momentum operators:
$\hat{L}_x$, $\hat{L}_y$, $\hat{L}_z$, and $\hat{L}_\pm$, in the z-basis of the {l, m} states with l = 1.
5.26. Calculate the angular factor of the orbital wavefunction of a particle with a definite value
of $L^2$, equal to $6\hbar^2$, and the largest possible value of $L_x$. What is this value?
5.27. For the state with the wavefunction $\psi = Cxye^{-\lambda r}$, with a real, positive λ, calculate:
(i) the expectation values of the observables $L_x$, $L_y$, $L_z$, and $L^2$, and
(ii) the normalization constant C.
5.28. An angular state of a spinless particle is described by the following ket-vector:
$$|\alpha\rangle = \frac{1}{\sqrt{2}} \left( |l = 3, m = 0\rangle + |l = 3, m = 1\rangle \right).$$
Calculate the expectation values of the x- and y-components of its angular momentum. Is the result
sensitive to a possible phase shift between the component eigenkets?
5.29. A particle is in a quantum state α with the orbital wavefunction proportional to the
spherical harmonic $Y_1^1(\theta, \varphi)$. Find the angular dependence of the wavefunctions corresponding to the
following ket-vectors:
(i) $\hat{L}_x|\alpha\rangle$, (ii) $\hat{L}_y|\alpha\rangle$, (iii) $\hat{L}_z|\alpha\rangle$, (iv) $\hat{L}_+\hat{L}_-|\alpha\rangle$, and (v) $\hat{L}^2|\alpha\rangle$.
5.30. A charged, spinless 2D particle of mass m is trapped in a soft potential well U(x, y) =
$m\omega_0^2(x^2 + y^2)/2$. Calculate its energy spectrum in the presence of a uniform magnetic field B, normal to
the [x, y]-plane of particle’s motion.
5.31. Solve the previous problem for a spinless 3D particle, placed (in addition to a uniform
magnetic field B) into a spherically-symmetric potential well U(r) = $m\omega_0^2 r^2/2$.
5.32. Calculate the spectrum of rotational energies of an axially-symmetric, rigid body.
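5.33. Simplify the following double commutator:
$$\left[ \hat{r}_j, \left[ \hat{L}^2, \hat{r}_{j'} \right] \right].$$
5.34. Prove the following commutation relation:
$$\left[ \hat{L}^2, \left[ \hat{L}^2, \hat{r}_j \right] \right] = 2\hbar^2 \left( \hat{r}_j \hat{L}^2 + \hat{L}^2 \hat{r}_j \right).$$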
5.35. Use the commutation relation proved in the previous problem, and Eq. (148), to prove the
orbital electric-dipole selection rules mentioned in Sec. 5.6 of the lecture notes.
5.36. Express the commutators listed in Eq. (179), $[\hat{J}^2, \hat{L}_z]$ and $[\hat{J}^2, \hat{S}_z]$, via $\hat{L}_j$ and $\hat{S}_j$.
5.37. Find the operator $\hat{T}_\varphi$ describing a quantum state’s rotation by angle φ about a certain axis,
using the similarity of this operation with the shift of a Cartesian coordinate, discussed in Sec. 5. Then
use this operator to calculate the probabilities of measurements of spin-½ components of a beam of
particles with z-polarized spin, by a Stern-Gerlach instrument turned by angle θ within the [z, x] plane,
where y is the axis of particle propagation – see Fig. 4.1.⁵⁷
5.38. The rotation (“angular translation”) operator $\hat{T}_\varphi$ analyzed in the previous problem, and the
linear translation operator $\hat{T}_X$ discussed in Sec. 5, have a similar structure:
$$\hat{T}_\lambda = \exp\left\{ -i\hat{C}\lambda/\hbar \right\},$$
where λ is a real c-number, characterizing the shift, and $\hat{C}$ is a Hermitian operator, which does not
explicitly depend on time.
(i) Prove that such operators $\hat{T}_\lambda$ are unitary.
(ii) Prove that if the shift by λ, induced by the operator $\hat{T}_\lambda$, leaves the Hamiltonian of some
system unchanged for any λ, then C is a constant of motion for any initial state of the system.
(iii) Discuss what the last conclusion means for the particular operators $\hat{T}_X$ and $\hat{T}_\varphi$.
5.39. A particle with spin s is in a state with definite quantum numbers l and j. Prove that the
observable $\mathbf{L}\cdot\mathbf{S}$ also has a definite value, and calculate it.
5.40. For a spin-½ particle in a state with definite quantum numbers l, $m_l$, and $m_s$, calculate the
expectation value of the observable $J^2$, and the probabilities of all its possible values. Interpret your
results in terms of the Clebsch-Gordan coefficients (190).
5.41. Derive general recurrence relations for the Clebsch-Gordan coefficients.
Hint: Using the similarity of the commutation relations discussed in Sec. 7, write the relations
similar to Eqs. (164) for other components of the angular momentum, and apply them to Eq. (170).
5.42. Use the recurrence relations derived in the previous problem to prove Eqs. (190) for the
spin-½ Clebsch-Gordan coefficients.
5.43. A spin-½ particle is in a state with definite values of $L^2$, $J^2$, and $J_z$. Find all possible values
of the observables $S^2$, $S_z$, and $L_z$, the probability of each listed value, and the expectation value for each
of these observables.
5.44. Re-solve the Landau-level problem discussed in Sec. 3.2, for a spin-½ particle. Discuss the
result for the particular case of an electron, with the g-factor equal to 2.
5.45. In the Heisenberg picture of quantum dynamics, find an explicit relation between the
operators of velocity $\hat{\mathbf{v}} \equiv d\hat{\mathbf{r}}/dt$ and acceleration $\hat{\mathbf{a}} \equiv d\hat{\mathbf{v}}/dt$ of a spin-½ particle with electric charge q,
moving in an arbitrary external electromagnetic field. Compare the result with the corresponding
classical expression.
Hint: For the orbital motion’s description, you may use Eq. (3.26).
57 Note that the last task is just a particular case of Problem 4.18 (see also Problem 1).
5.46. A byproduct of the solution of Problem 41 is the following relation for the spin operators
(valid for any spin s):
$$\langle m_s \pm 1|\hat{S}_\pm|m_s\rangle = \hbar \left[ (s \pm m_s + 1)(s \mp m_s) \right]^{1/2}.$$
Use this result to spell out the matrices $S_x$, $S_y$, $S_z$, and $S^2$ of a particle with s = 1, in the z-basis – defined
as the basis in which the matrix $S_z$ is diagonal.
5.47.* For a particle with an arbitrary spin s, moving in a spherically-symmetric field, find the
ranges of the quantum numbers mj and j that are necessary to describe, in the coupled-representation
basis:
(i) all states with a definite quantum number l, and
(ii) a state with definite values of not only l, but also ml and ms.
Give an interpretation of your results in terms of the classical geometric vector diagram (Fig. 13).
5.48. A particle of mass m, with electric charge q and spin s, free to move along a plane ring of a
radius R, is placed into a constant, uniform magnetic field B, directed normally to the ring’s plane.
Calculate the energy spectrum of the system. Explore and interpret the particular form the result takes
when the particle is an electron with the g-factor ge = 2.
Chapter 6. Perturbative Approaches

This chapter discusses several perturbative approaches to problems of quantum mechanics, and their
simplest but important applications including the fine structure of atomic energy levels, and the effects
of external dc and ac electric and magnetic fields on these levels. It continues with a discussion of the
perturbation theory of transitions to continuous spectrum and the Golden Rule of quantum mechanics,
which naturally brings us to the issue of open quantum systems – to be discussed in the next chapter.
6.1. Time-independent perturbations

Unfortunately, only a few problems of quantum mechanics may be solved exactly in an
analytical form. Actually, in the previous chapters we have solved a substantial part of such problems
for a single particle, while for multiparticle systems, the exactly solvable cases are even more rare.
However, most practical problems of physics feature a certain small parameter, and this smallness may
be exploited by various approximate analytical methods giving asymptotically correct results – i.e. the
results whose error tends to zero at the reduction of the small parameter(s). Earlier in the course, we
have explored one of them, the WKB approximation, which is adequate for a particle moving through a
soft potential profile. In this chapter, we will discuss other techniques that are more suitable for other
cases. The historical name for these techniques is the perturbation theory, though it is fairer to speak
about several perturbative approaches, because they differ substantially from one situation to another.
The simplest version of the perturbation theory addresses the problem of stationary states and
energy levels of systems described by time-independent Hamiltonians of the type
$$\hat{H} = \hat{H}^{(0)} + \hat{H}^{(1)}, \tag{6.1}$$
where the operator $\hat{H}^{(1)}$, describing the system’s “perturbation”, is relatively small – in the sense that its
addition to the unperturbed operator $\hat{H}^{(0)}$ results in a relatively small change of the eigenenergies $E_n$ of
the system, and the corresponding eigenstates. A typical problem of this type is the 1D weakly
anharmonic oscillator (Fig. 1), described by the Hamiltonian (1) with
$$\hat{H}^{(0)} = \frac{\hat{p}^2}{2m} + \frac{m\omega_0^2 \hat{x}^2}{2}, \qquad \hat{H}^{(1)} = \alpha\hat{x}^3 + \beta\hat{x}^4 + \dots \tag{6.2}$$
[Weakly anharmonic oscillator]
with sufficiently small coefficients α, β, ….
Fig. 6.1. The simplest application of the perturbation theory: a weakly anharmonic 1D oscillator. (Dashed
lines characterize the unperturbed, harmonic oscillator, with the potential $U^{(0)} = m\omega_0^2x^2/2$ and levels $E_n^{(0)}$; solid lines show the perturbed potential $U^{(0)} + H^{(1)}$ and levels $E_n$.)
I will use this system as our first example, but let me start by describing the perturbative
approach to the general time-independent Hamiltonian (1). In the bra-ket formalism, the eigenproblem
(4.68) for the perturbed Hamiltonian, i.e. the stationary Schrödinger equation of the system, is
$$\left( \hat{H}^{(0)} + \hat{H}^{(1)} \right) |n\rangle = E_n |n\rangle. \tag{6.3}$$
Let the eigenstates and eigenvalues of the unperturbed Hamiltonian, which satisfy the equation
$$\hat{H}^{(0)} |n^{(0)}\rangle = E_n^{(0)} |n^{(0)}\rangle, \tag{6.4}$$
be considered as known. In this case, the solution of problem (3) means finding, first, its perturbed
eigenvalues $E_n$ and, second, the coefficients $\langle n'^{(0)}|n\rangle$ of the expansion of the perturbed state’s vectors $|n\rangle$
in the following series over the unperturbed ones, $|n'^{(0)}\rangle$:
$$|n\rangle = \sum_{n'} |n'^{(0)}\rangle \langle n'^{(0)}|n\rangle. \tag{6.5}$$
Let us plug Eq. (5), with the summation index n’ replaced with n” (just to have a more compact
notation in our forthcoming result), into both sides of Eq. (3):
$$\sum_{n''} \langle n''^{(0)}|n\rangle \hat{H}^{(0)} |n''^{(0)}\rangle + \sum_{n''} \langle n''^{(0)}|n\rangle \hat{H}^{(1)} |n''^{(0)}\rangle = \sum_{n''} \langle n''^{(0)}|n\rangle E_n |n''^{(0)}\rangle, \tag{6.6}$$
and then inner-multiply all terms by an arbitrary unperturbed bra-vector $\langle n'^{(0)}|$ of the system. Assuming
that the unperturbed eigenstates are orthonormal, $\langle n'^{(0)}|n''^{(0)}\rangle = \delta_{n'n''}$, and using Eq. (4) in the first term
on the left-hand side, we get the following system of linear equations
$$\sum_{n''} \langle n''^{(0)}|n\rangle H^{(1)}_{n'n''} = \langle n'^{(0)}|n\rangle \left( E_n - E_{n'}^{(0)} \right), \tag{6.7}$$
where the matrix elements of the perturbation are calculated, by definition, in the unperturbed brackets:
$$H^{(1)}_{n'n''} \equiv \langle n'^{(0)}| \hat{H}^{(1)} |n''^{(0)}\rangle. \tag{6.8}$$
[Perturbation’s matrix elements]
The linear equation system (7) is still exact,¹ and is frequently used for numerical calculations.
(Since the matrix coefficients (8) typically decrease when n’ and/or n” become sufficiently large, the
sum on the left-hand side of Eq. (7) may usually be truncated, still giving an acceptable accuracy of the
solution.) To get analytical results, we need to make approximations. In the simple perturbation theory
we are discussing now, this is achieved by the expansion of both the eigenenergies and the expansion
coefficients into the Taylor series in a certain small parameter μ of the problem:
$$E_n = E_n^{(0)} + E_n^{(1)} + E_n^{(2)} + \dots, \tag{6.9}$$
$$\langle n'^{(0)}|n\rangle = \langle n'^{(0)}|n\rangle^{(0)} + \langle n'^{(0)}|n\rangle^{(1)} + \langle n'^{(0)}|n\rangle^{(2)} + \dots, \tag{6.10}$$
where
$$E_n^{(k)} \propto \langle n'^{(0)}|n\rangle^{(k)} \propto \mu^k. \tag{6.11}$$
1 Please note the similarity of Eq. (7) with Eq. (2.215) of the 1D band theory. Indeed, the latter equation is not
much more than a particular form of Eq. (7) for the 1D wave mechanics, and a specific (periodic) potential U(x)
considered as the perturbation Hamiltonian. Moreover, the whole approximate treatment of the weak-potential
limit in Sec. 2.7 is essentially a particular case of the perturbation theory we are discussing now (in its 1st order).
In order to explore the 1st-order approximation, which ignores all terms O(μ²) and higher, let us
plug only the two first terms of the expansions (9) and (10) into the basic equation (7):
$$\sum_{n''} H^{(1)}_{n'n''} \left( \delta_{n''n} + \langle n''^{(0)}|n\rangle^{(1)} \right) = \left( \delta_{n'n} + \langle n'^{(0)}|n\rangle^{(1)} \right) \left( E_n^{(0)} + E_n^{(1)} - E_{n'}^{(0)} \right). \tag{6.12}$$
Now let us open the parentheses, and disregard all the remaining terms O(μ²). The result is
$$H^{(1)}_{n'n} = \delta_{n'n} E_n^{(1)} + \langle n'^{(0)}|n\rangle^{(1)} \left( E_n^{(0)} - E_{n'}^{(0)} \right). \tag{6.13}$$
This relation is valid for any set of indices n and n’; let us start from the case n = n’,
immediately getting a very simple (and practically, the most important!) result:
$$E_n^{(1)} = H^{(1)}_{nn} \equiv \langle n^{(0)}| \hat{H}^{(1)} |n^{(0)}\rangle. \tag{6.14}$$
[Energy: 1st-order correction]
For example, let us see what this result gives for two first perturbation terms in the weakly anharmonic
oscillator (2):
$$E_n^{(1)} = \alpha \langle n^{(0)}| \hat{x}^3 |n^{(0)}\rangle + \beta \langle n^{(0)}| \hat{x}^4 |n^{(0)}\rangle. \tag{6.15}$$
As the reader knows (or should know :-) from the solution of Problem 5.9, the first bracket equals zero,
while the second one yields²
$$E_n^{(1)} = \frac{3}{4} \beta x_0^4 \left( 2n^2 + 2n + 1 \right). \tag{6.16}$$
Naturally, there should be some non-vanishing contribution to the energies from the (typically, larger)
perturbation proportional to α, so that for its calculation we need to explore the 2nd order of the theory.
However, before doing that, let us complete our discussion of its 1st order.
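The two brackets in Eq. (15) may also be checked by direct matrix multiplication in the Fock basis. The following Python sketch is an illustration only; it assumes the usual relation $\hat{x} = (x_0/\sqrt{2})(\hat{a} + \hat{a}^\dagger)$, which is consistent with the prefactor $(x_0/\sqrt{2})^3$ of Eq. (21), and measures x in units of $x_0$. The helper names are hypothetical. The basis is padded by a few extra states so that the truncation does not affect the elements of interest.

```python
import math

def fock_x(size):
    """Matrix of x/x0 in the Fock basis, assuming x = (x0/sqrt 2)(a + a^dagger)."""
    x = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        x[n][n + 1] = x[n + 1][n] = math.sqrt((n + 1) / 2.0)
    return x

def matmul(a, b):
    """Plain product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def cubic_diagonal(n_max):
    """<n| x^3 |n> / x0^3 for n = 0..n_max."""
    x = fock_x(n_max + 4)                 # padding against truncation errors
    x3 = matmul(matmul(x, x), x)
    return [x3[n][n] for n in range(n_max + 1)]

def quartic_diagonal(n_max):
    """<n| x^4 |n> / x0^4 for n = 0..n_max."""
    x = fock_x(n_max + 4)
    x2 = matmul(x, x)
    x4 = matmul(x2, x2)
    return [x4[n][n] for n in range(n_max + 1)]

for n, (c, q) in enumerate(zip(cubic_diagonal(5), quartic_diagonal(5))):
    formula = 0.75 * (2 * n * n + 2 * n + 1)   # Eq. (6.16) divided by beta x0^4
    print(n, c, q, formula)
```

The cubic column vanishes for every n, while the quartic column reproduces the polynomial of Eq. (16).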
For n’ ≠ n, Eq. (13) may be used to calculate the eigenstates rather than the eigenvalues:
$$\langle n'^{(0)}|n\rangle^{(1)} = \frac{H^{(1)}_{n'n}}{E_n^{(0)} - E_{n'}^{(0)}}, \quad \text{for } n' \neq n. \tag{6.17}$$
This means that the eigenket’s expansion (5), in the 1st order, may be represented as
$$|n^{(1)}\rangle = C |n^{(0)}\rangle + \sum_{n' \neq n} \frac{H^{(1)}_{n'n}}{E_n^{(0)} - E_{n'}^{(0)}} |n'^{(0)}\rangle. \tag{6.18}$$
[States: 1st-order result]
The coefficient $C \equiv \langle n^{(0)}|n^{(1)}\rangle$ cannot be found from Eq. (17); however, requiring the final state n to be
normalized, we see that other terms may provide only corrections O(μ²), so that in the 1st order we
should take C = 1. The most important feature of Eq. (18) is its denominators: the closer are the
unperturbed eigenenergies of two states, the larger is their mutual “interaction” due to the perturbation.
2 A useful exercise for the reader: analyze the relation between Eq. (16) and the result of the classical theory of
such weakly anharmonic (”nonlinear”) oscillator – see, e.g., CM Sec. 5.2, in particular, Eq. (5.49).
This feature also affects the validity condition of the 1st-order approximation, which may be quantified using Eq.
(17): the magnitudes of the brackets it describes have to be much less than the unperturbed bracket
$\langle n^{(0)}|n\rangle^{(0)} = 1$, so that all elements of the perturbation matrix have to be much less than the difference
between the corresponding unperturbed energies. For the anharmonic oscillator’s energy corrections
(16), this requirement is reduced to $E_n^{(1)} \ll \hbar\omega_0$.
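This validity condition may be illustrated by comparing Eq. (16) with a numerical solution of the exact system (7), truncated to a finite number of unperturbed states. The sketch below is only an illustration under several assumptions of mine: energies are measured in units of $\hbar\omega_0$, the perturbation is purely quartic with the dimensionless strength $b \equiv \beta x_0^4/\hbar\omega_0$, the coordinate operator is $\hat{x} = (x_0/\sqrt{2})(\hat{a} + \hat{a}^\dagger)$, and the truncated matrix is diagonalized by the textbook cyclic Jacobi rotation method (chosen here only to keep the code free of external libraries).

```python
import math

def fock_x(size):
    """Matrix of x/x0 in the Fock basis, assuming x = (x0/sqrt 2)(a + a^dagger)."""
    x = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        x[n][n + 1] = x[n + 1][n] = math.sqrt((n + 1) / 2.0)
    return x

def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def jacobi_eigenvalues(matrix, sweeps=60, tol=1e-20):
    """Sorted eigenvalues of a real symmetric matrix, by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    size = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(size) for j in range(i + 1, size))
        if off < tol:
            break
        for p in range(size - 1):
            for q in range(p + 1, size):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                sign = 1.0 if diff >= 0.0 else -1.0
                # tan(2 theta) = 2 a_pq / (a_qq - a_pp), with |theta| <= pi/4
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], sign * diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(size):          # A -> A R (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(size):          # A -> R^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(size))

def quartic_levels(b, size=24):
    """Levels of H0 + beta x^4 from Eq. (6.7) truncated to `size` states."""
    x = fock_x(size + 4)                       # padding keeps the kept block exact
    x2 = matmul(x, x)
    x4 = matmul(x2, x2)
    h = [[b * x4[i][j] + (i + 0.5 if i == j else 0.0) for j in range(size)]
         for i in range(size)]
    return jacobi_eigenvalues(h)

for b in (0.001, 0.01, 0.1):
    numeric = quartic_levels(b)[0]
    first_order = 0.5 + 0.75 * b               # Eq. (6.16) with n = 0
    print(f"b = {b}: truncated Eq. (7): {numeric:.6f}, 1st order: {first_order:.6f}")
```

As b grows, the gap between the two columns increases roughly as b², as should be expected from an approximation that drops the terms O(μ²). The default truncation to 24 states is an arbitrary choice that appears sufficient for the ground state at these values of b; higher levels need a larger basis.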
Now we are ready for going after the 2nd-order approximation to Eq. (7). Let us focus on the case
n’ = n, because as we already know, only this term will give us a correction to the eigenenergies.
Moreover, since the left-hand side of Eq. (7) already has a small factor $H^{(1)}_{n'n''} \propto \mu$, the bracket
coefficients in that part may be taken from the 1st-order result (17). As a result, we get
$$E_n^{(2)} = \sum_{n''} \langle n''^{(0)}|n\rangle^{(1)} H^{(1)}_{nn''} = \sum_{n'' \neq n} \frac{H^{(1)}_{n''n} H^{(1)}_{nn''}}{E_n^{(0)} - E_{n''}^{(0)}}. \tag{6.19}$$
Since $\hat{H}^{(1)}$ has to be Hermitian, we may rewrite this expression as
$$E_n^{(2)} = \sum_{n' \neq n} \frac{\left| H^{(1)}_{n'n} \right|^2}{E_n^{(0)} - E_{n'}^{(0)}} \equiv \sum_{n' \neq n} \frac{\left| \langle n'^{(0)}| \hat{H}^{(1)} |n^{(0)}\rangle \right|^2}{E_n^{(0)} - E_{n'}^{(0)}}. \tag{6.20}$$
[Energy: 2nd-order correction]
This is the much-celebrated 2nd-order perturbation result, which frequently (in sufficiently
symmetric problems) is the first non-vanishing correction to the state energy – for example, from the
cubic term (proportional to α) in our weakly anharmonic oscillator problem (2). To calculate the
corresponding correction, we may use another result of the solution of Problem 5.9:
$$\langle n'| \hat{x}^3 |n\rangle = \left( \frac{x_0}{\sqrt{2}} \right)^3 \Big\{ [n(n-1)(n-2)]^{1/2} \delta_{n',n-3} + 3n^{3/2} \delta_{n',n-1} + 3(n+1)^{3/2} \delta_{n',n+1} + [(n+1)(n+2)(n+3)]^{1/2} \delta_{n',n+3} \Big\}. \tag{6.21}$$
So, according to Eq. (20), we need to calculate
$$E_n^{(2)} = \alpha^2 \left( \frac{x_0}{\sqrt{2}} \right)^6 \sum_{n' \neq n} \frac{\Big\{ [n(n-1)(n-2)]^{1/2} \delta_{n',n-3} + 3n^{3/2} \delta_{n',n-1} + 3(n+1)^{3/2} \delta_{n',n+1} + [(n+1)(n+2)(n+3)]^{1/2} \delta_{n',n+3} \Big\}^2}{\hbar\omega_0 (n - n')}. \tag{6.22}$$
The summation is not as cumbersome as it may look, because at the curly brackets’ squaring, all mixed
products are proportional to the products of different Kronecker deltas and hence vanish, so that we
need to sum up only the squares of each term, finally getting
$$E_n^{(2)} = -\frac{15}{4} \frac{\alpha^2 x_0^6}{\hbar\omega_0} \left( n^2 + n + \frac{11}{30} \right). \tag{6.23}$$
This formula shows that all energy level corrections are negative, regardless of the sign of α.³ On the
contrary, the 1st order correction $E_n^{(1)}$, given by Eq. (16), does depend on the sign of β, so that the net
correction, $E_n^{(1)} + E_n^{(2)}$, may be of any sign.
3 Note that this is correct for the ground-state energy correction $E_g^{(2)}$ of any system, because for this state, the
denominators of all terms of the sum (20) are negative, while their numerators are always non-negative.
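The summation leading from Eq. (22) to Eq. (23) may be cross-checked numerically. The sketch below is only an illustration, with hypothetical helper names; it assumes $\hat{x} = (x_0/\sqrt{2})(\hat{a} + \hat{a}^\dagger)$, builds the matrix of $\hat{x}^3$ in a padded Fock basis, and evaluates the sum (20) in units of $\alpha^2 x_0^6/\hbar\omega_0$, using $E_n^{(0)} - E_{n'}^{(0)} = \hbar\omega_0(n - n')$.

```python
import math

def fock_x(size):
    """Matrix of x/x0 in the Fock basis, assuming x = (x0/sqrt 2)(a + a^dagger)."""
    x = [[0.0] * size for _ in range(size)]
    for n in range(size - 1):
        x[n][n + 1] = x[n + 1][n] = math.sqrt((n + 1) / 2.0)
    return x

def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def second_order_cubic(n):
    """E_n^(2) from the sum (6.20), in units of alpha^2 x0^6 / (hbar omega0)."""
    size = n + 7                      # states up to n + 3 contribute; the rest is padding
    x = fock_x(size)
    x3 = matmul(matmul(x, x), x)
    total = 0.0
    for k in range(size):
        if k != n:
            total += x3[k][n] ** 2 / (n - k)
    return total

def formula_6_23(n):
    """Right-hand side of Eq. (6.23) in the same units."""
    return -3.75 * (n * n + n + 11.0 / 30.0)

for n in range(5):
    print(n, second_order_cubic(n), formula_6_23(n))
```

For n = 0 only the states n’ = 1 and n’ = 3 contribute, giving –9/8 – 2/8 = –11/8, in agreement with Eq. (23).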
The results (18) and (20) are clearly inapplicable to the degenerate case where, in the absence of
perturbation, several states correspond to the same energy level, because of the divergence of their
denominators.4 This divergence hints that in this case, the largest effect of the perturbation is the
degeneracy lifting, e.g., some splitting of the initially degenerate energy level E(0) (Fig. 2), and that for
the analysis of this case we can, in the first approximation, ignore the effect of all other energy levels.
(A careful analysis shows that this is indeed the case until the level splitting becomes comparable with
the distance to other energy levels.)
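Fig. 6.2. Lifting the energy level degeneracy by a perturbation (schematically): for $\hat{H} = \hat{H}^{(0)}$, the N states $1^{(0)}$, $2^{(0)}$, …, $N^{(0)}$ share the same energy $E^{(0)}$, while for $\hat{H} = \hat{H}^{(0)} + \hat{H}^{(1)}$ this level splits into the levels $E_1$, $E_2$, …, $E_N$.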
Limiting the summation in Eq. (7) to the group of N degenerate states with equal $E_{n'}^{(0)} \equiv E^{(0)}$, we
reduce it to
$$\sum_{n''=1}^{N} \langle n''^{(0)}|n\rangle H^{(1)}_{n'n''} = \langle n'^{(0)}|n\rangle \left( E_n - E^{(0)} \right), \tag{6.24}$$
where now n’ and n” number the N states of the degenerate group.⁵ For n = n’, Eq. (24) may be
rewritten as
$$\sum_{n''=1}^{N} \left( H^{(1)}_{n'n''} - E_n^{(1)} \delta_{n'n''} \right) \langle n''^{(0)}|n'\rangle = 0, \quad \text{where } E_n^{(1)} \equiv E_n - E^{(0)}. \tag{6.25}$$
For each n′ = 1, 2, …, N, this is a system of N linear, homogeneous equations (with N terms each) for N
unknown coefficients ⟨n″(0)|n′⟩. In this problem, we may readily recognize the problem of
diagonalization of the perturbation matrix H(1) – cf. Sec. 4.4 and in particular Eq. (4.101). As in the
general case, the condition of self-consistency of the system is:
Initially degenerate system: energy levels
$$\begin{vmatrix} H^{(1)}_{11} - E_n^{(1)} & H^{(1)}_{12} & \dots \\ H^{(1)}_{21} & H^{(1)}_{22} - E_n^{(1)} & \dots \\ \dots & \dots & \dots \end{vmatrix} = 0, \tag{6.26}$$
where now the index n numbers the N roots of this equation, in an arbitrary order. According to the
definition (25) of En(1), the resulting N energy levels En may be found as E(0) + En(1). If the perturbation
matrix is diagonal in the chosen basis of states n(0), the result is extremely simple,
$$E_n - E^{(0)} \equiv E_n^{(1)} = H^{(1)}_{nn}, \tag{6.27}$$
4 This is exactly the reason why such a simple perturbation approach runs into serious problems for systems with a
continuous spectrum, and other techniques (such as the WKB approximation) are often necessary.
5 Note that here the choice of the basis is to some extent arbitrary, because due to the linearity of equations of
quantum mechanics, any linear combination of the states n″(0) is also an eigenstate of the unperturbed
Hamiltonian. However, for using Eq. (25), these combinations have to be orthonormal, as was assumed in the
derivation of Eq. (7).
and formally coincides with Eq. (14) for the non-degenerate case, but now it may give a different result
for each of N previously degenerate states n.
Now let us see what this general theory gives for several important examples. First of all, let us
consider a system with just two degenerate states with energy sufficiently far from all other levels. Then,
in the basis of these two degenerate states, the most general perturbation matrix is
$$H^{(1)} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}. \tag{6.28}$$
This matrix coincides with the general matrix (5.2) of a two-level system. Hence, we come to the very
important conclusion: for a weak perturbation, all properties of any doubly degenerate system are
identical to those of genuine two-level systems, which were the subject of numerous discussions in
Chapter 4 and again in Sec. 5.1. In particular, its eigenenergies are given by Eq. (5.6), and may be
described by the level-anticrossing diagram shown in Fig. 5.1.
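The lifting of a double degeneracy can be illustrated numerically. The sketch below diagonalizes a Hermitian 2×2 matrix of the form (28) and compares the result with the textbook closed form for the eigenvalues of such a matrix, (H11 + H22)/2 ± [((H11 – H22)/2)² + |H12|²]^(1/2); the function names and the sample numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def split_levels(h11, h22, h12):
    """First-order level shifts E(1): eigenvalues of the 2x2 perturbation matrix."""
    h = np.array([[h11, h12], [np.conj(h12), h22]], dtype=complex)
    return np.linalg.eigvalsh(h)  # returned in ascending order

def closed_form(h11, h22, h12):
    """Closed-form eigenvalues of a Hermitian 2x2 matrix."""
    mean = 0.5 * (h11 + h22)
    half_gap = np.sqrt((0.5 * (h11 - h22)) ** 2 + abs(h12) ** 2)
    return np.array([mean - half_gap, mean + half_gap])

# Sample matrix elements in arbitrary energy units (illustrative values)
print(split_levels(0.3, -0.1, 0.2 + 0.1j))
print(closed_form(0.3, -0.1, 0.2 + 0.1j))
```

The distance between the two shifted levels never drops below 2|H12|, which is the anticrossing behavior referred to above.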
6.2. The linear Stark effect

As a more involved example of the level degeneracy lifting by a perturbation, let us discuss the
Stark effect6 – the atomic level splitting by an external electric field. Let us study this effect, in the linear
approximation, for a hydrogen-like atom/ion. Taking the direction of the external electric field E (which
is practically always uniform on the atomic scale) for the z-axis, the perturbation may be represented by
the following Hamiltonian:
$$\hat H^{(1)} = -F\hat z = -q\mathcal{E}\hat z = -q\mathcal{E}\,r\cos\theta. \tag{6.29}$$
(In the last form, the operator sign is dropped, because we will work in the coordinate representation.)
As you (should :-) remember, energy levels of a hydrogen-like atom/ion depend only on the
principal quantum number n – see Eq. (3.201); hence all the states, besides the ground 1s state in which
n = 1 and l = m = 0, have some orbital degeneracy, which grows rapidly with n. Let us consider the
lowest degenerate level with n = 2. Since, according to Eq. (3.203), 0 ≤ l ≤ n – 1, at this level the orbital
quantum number l may equal either 0 (one 2s state, with m = 0) or 1 (three 2p states, with m = 0, ±1).
Due to this 4-fold degeneracy, H(1) is a 4×4 matrix with 16 elements:
$$H^{(1)} = \begin{pmatrix} H_{11} & H_{12} & H_{13} & H_{14} \\ H_{21} & H_{22} & H_{23} & H_{24} \\ H_{31} & H_{32} & H_{33} & H_{34} \\ H_{41} & H_{42} & H_{43} & H_{44} \end{pmatrix}, \tag{6.30}$$
with the rows and columns numbered as follows: 1 for the state {l = 0, m = 0}, 2 for {l = 1, m = 0}, and
3 and 4 for {l = 1, m = ±1}.
6 This effect was discovered experimentally in 1913 by Johannes Stark and independently by Antonio Lo Surdo,
so it is sometimes (and more fairly) called the “Stark – Lo Surdo effect”. Sometimes this name is used with the
qualifier “dc” to distinguish it from the ac Stark effect – the energy level shift under the effect of an ac field – see
Sec. 5 below.
However, there is no need to be scared. First, due to the Hermitian nature of the operator, only
ten of these matrix elements (four diagonal and six off-diagonal ones) may be substantially different
from each other. Moreover, due to a high symmetry of the problem, there are a lot of zeros even among
these elements. Indeed, let us have a look at the angular components Ylm of the corresponding
wavefunctions, with l = 0 and l = 1, described by Eqs. (3.174)-(3.175). For the states with m = ±1, the
azimuthal parts of wavefunctions are proportional to exp{±iφ}; hence the off-diagonal elements H34 and
H43 of the matrix (30), relating these functions, are proportional to
$$\oint d\Omega\, \left(Y_1^{\pm 1}\right)^* \hat H^{(1)}\, Y_1^{\mp 1} \propto \int_0^{2\pi} d\varphi \left(e^{\pm i\varphi}\right)^* e^{\mp i\varphi} = 0. \tag{6.31}$$
The azimuthal-angle symmetry also kills the off-diagonal elements H13, H14, H23, H24 (and hence their
complex conjugates H31, H41, H32, and H42), because they relate states with m = 0 and m = ±1, and hence
are proportional to
$$\oint d\Omega\, \left(Y_l^{0}\right)^* \hat H^{(1)}\, Y_1^{\pm 1} \propto \int_0^{2\pi} d\varphi\; e^{\pm i\varphi} = 0. \tag{6.32}$$
For the diagonal matrix elements H33 and H44, corresponding to l = 1 and m = ±1, the azimuthal-angle
integrals do not vanish, but since the corresponding spherical harmonics depend on the polar angle as
sin θ, these elements are proportional to
$$\oint d\Omega\, \left(Y_1^{\pm 1}\right)^* \hat H^{(1)}\, Y_1^{\pm 1} \propto \int_0^{\pi} \sin\theta\, d\theta\, \sin\theta\cos\theta\sin\theta = \int_{-1}^{+1} \cos\theta\left(1 - \cos^2\theta\right) d(\cos\theta), \tag{6.33}$$
and hence are equal to zero – as any limit-symmetric integral of an odd function. Finally, for the states
2s and 2p with m = 0, the diagonal elements H11 and H22 are also killed by the polar-angle integration:
$$\oint d\Omega\, \left(Y_0^{0}\right)^* \hat H^{(1)}\, Y_0^{0} \propto \int_0^{\pi} \sin\theta\, d\theta \cos\theta = \int_{-1}^{+1} \cos\theta\; d(\cos\theta) = 0, \tag{6.34}$$
$$\oint d\Omega\, \left(Y_1^{0}\right)^* \hat H^{(1)}\, Y_1^{0} \propto \int_0^{\pi} \sin\theta\, d\theta \cos^3\theta = \int_{-1}^{+1} \cos^3\theta\; d(\cos\theta) = 0. \tag{6.35}$$
Hence, the only non-zero elements of the matrix (30) are two off-diagonal elements H12 and H21,
which relate two states with the same m = 0, but different l = {0, 1}, because they are proportional to
$$\oint d\Omega\, \left(Y_0^{0}\right)^* \cos\theta\; Y_1^{0} = \frac{\sqrt{3}}{4\pi} \int_0^{2\pi} d\varphi \int_0^{\pi} \sin\theta\, d\theta \cos^2\theta = \frac{1}{\sqrt{3}} \neq 0. \tag{6.36}$$
What remains is to use Eqs. (3.209) for the radial parts of these functions to complete the calculation of
those two matrix elements:
$$H_{12} = H_{21} = -\frac{q\mathcal{E}}{\sqrt{3}} \int_0^{\infty} r^2 dr\, R_{2,0}(r)\, r\, R_{2,1}(r). \tag{6.37}$$
Due to the additive structure of the function R2,0(r), the integral falls into a sum of two table integrals,
both of the type MA Eq. (6.7d), finally giving
$$H_{12} = H_{21} = 3q\mathcal{E}r_0, \tag{6.38}$$
where r0 is the spatial scale (3.192); for the hydrogen atom, it is just the Bohr radius rB – see Eq. (1.10).
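The radial integral in Eq. (37) can be checked in a few lines. The sketch below assumes the standard normalized radial functions of a hydrogen-like atom, R2,0 = (1/√2)(1 – r/2)exp{–r/2} and R2,1 = (1/2√6) r exp{–r/2} in units of r0 = 1 (presumably what Eqs. (3.209) provide; they are not reproduced here), and uses the table integral of r^k exp{–r} over [0, ∞), equal to k!. The variable names are illustrative.

```python
from math import factorial, sqrt

def table_integral(k):
    """Integral of r^k * exp(-r) over [0, infinity), equal to k!."""
    return factorial(k)

# r^3 * R20 * R21 = (1/sqrt(2)) * (1/(2*sqrt(6))) * (r^4 - r^5/2) * exp(-r)
prefactor = 1 / sqrt(2) / (2 * sqrt(6))
radial = prefactor * (table_integral(4) - table_integral(5) / 2)

# Eq. (6.37): H12 = -(q E / sqrt(3)) * radial integral, here in units of q*E*r0
h12_over_qEr0 = -radial / sqrt(3)
print(radial, h12_over_qEr0)
```

With the opposite sign convention for one of the radial functions, H12 changes sign, but the magnitude 3qEr0, and hence the level splitting, stays the same.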
Thus, the perturbation matrix (30) is reduced to
$$H^{(1)} = \begin{pmatrix} 0 & 3q\mathcal{E}r_0 & 0 & 0 \\ 3q\mathcal{E}r_0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \tag{6.39}$$
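so that the condition (26) of self-consistency of the system (25),
$$\begin{vmatrix} -E_2^{(1)} & 3q\mathcal{E}r_0 & 0 & 0 \\ 3q\mathcal{E}r_0 & -E_2^{(1)} & 0 & 0 \\ 0 & 0 & -E_2^{(1)} & 0 \\ 0 & 0 & 0 & -E_2^{(1)} \end{vmatrix} = 0, \tag{6.40}$$
gives a very simple characteristic equation
$$\left(E_2^{(1)}\right)^2 \left[\left(E_2^{(1)}\right)^2 - \left(3q\mathcal{E}r_0\right)^2\right] = 0, \tag{6.41}$$
with four roots:
Linear Stark effect for n = 2:
$$\left(E_2^{(1)}\right)_{1,2} = 0, \qquad \left(E_2^{(1)}\right)_{3,4} = \pm 3q\mathcal{E}r_0, \tag{6.42}$$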
so that the degeneracy is only partly lifted – see the levels in Fig. 3.7
Fig. 6.3. The linear Stark effect for the level n = 2 of a hydrogen-like atom: the two m = 0 combinations
(2s ± 2p)/√2 are shifted up and down by 3qEr0 from E2(0), while the states with m = ±1 stay at E2(0).
7 The proportionality of this splitting to the small field is responsible for the qualifier “linear” in the name of this
effect. If observable effects grow only as E² (see, e.g., Problem 9), the term quadratic Stark effect is used instead.
Generally, in order to understand the nature of states corresponding to these levels, we should
return to Eq. (25) with each calculated value of E2(1), and find the corresponding expansion coefficients
⟨n″(0)|n′⟩ that describe the perturbed states. However, in our simple case, the outcome of this procedure
is clear in advance. Indeed, since the states with {l = 1, m = ±1} are not affected by the perturbation at
all (in the linear approximation in the electric field), their degeneracy is not lifted, and energy is not
affected – see the middle line in Fig. 3. On the other hand, the partial perturbation matrix connecting the
states 2s and 2p, i.e. the top left 2×2 part of the full matrix (39), is proportional to the Pauli matrix σx,
and we already know the result of its diagonalization – see Eqs. (4.113)-(4.114). This means that the
upper and lower split levels correspond to very simple linear combinations of the previously degenerate
states with m = 0,
$$|\pm\rangle = \frac{1}{\sqrt{2}}\left(|2s\rangle \pm |2p\rangle\right). \tag{6.43}$$
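The roots (42) and the combinations (43) can be reproduced by a direct numerical diagonalization of the matrix (39). In the sketch below, the energy unit is chosen so that 3qEr0 = 1 (an illustrative normalization), and the basis order is 2s, 2p with m = 0, and then the two 2p states with m = ±1.

```python
import numpy as np

a = 1.0  # stands for 3*q*E*r0, in arbitrary energy units (illustrative value)
h1 = np.zeros((4, 4))
h1[0, 1] = h1[1, 0] = a  # the only non-zero elements, H12 = H21

shifts, states = np.linalg.eigh(h1)  # eigenvalues in ascending order
lower = states[:, 0]    # eigenvector of the lowest level
upper = states[:, -1]   # eigenvector of the highest level
print(shifts)
print(lower, upper)
```

Up to an overall sign, the two outer eigenvectors have components of magnitude 1/√2 on the 2s and 2p (m = 0) states only, while the two middle eigenvalues stay at zero.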
Finally, let us estimate the magnitude of the linear Stark effect for a hydrogen atom. For a very
high electric field of E = 3×10⁶ V/m,8 |q| = e ≈ 1.6×10⁻¹⁹ C, and r0 = rB ≈ 0.5×10⁻¹⁰ m, we get a level
splitting of 3qEr0 ≈ 0.8×10⁻²² J ≈ 0.5 meV. This number is much lower than the unperturbed energy of
the level, E2 = –EH/(2×2²) ≈ –3.4 eV, so that the perturbative result is quite applicable. On the other
hand, the calculated splitting is much larger than the resolution limit imposed by the line’s natural width
(~10⁻⁷ E2, see Chapter 9), so that the effect is quite observable even in substantially lower electric fields.
Note, however, that our simple results are quantitatively correct only when the Stark splitting (42) is
much larger than the fine-structure splitting of the same level in the absence of the field – see the next
section.
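The arithmetic of this estimate is easy to reproduce. The sketch below uses slightly more precise values of the elementary charge and the Bohr radius than the rounded ones quoted in the text; the variable names are illustrative.

```python
e_charge = 1.602e-19   # C, elementary charge
field = 3e6            # V/m, close to the breakdown threshold of air
r_bohr = 0.529e-10     # m, Bohr radius

splitting_J = 3 * e_charge * field * r_bohr    # 3 q E r0, in joules
splitting_meV = splitting_J / e_charge * 1e3   # the same, in milli-electron-volts
ratio_to_level = splitting_meV * 1e-3 / 3.4    # relative to |E2| = 3.4 eV

print(splitting_J, splitting_meV, ratio_to_level)
```

The result is close to 0.8×10⁻²² J, i.e. about 0.5 meV, which is some four orders of magnitude below |E2|.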
6.3. Fine structure of atomic levels

Now let us use the same perturbation theory to analyze, also for the simplest case of a hydrogen-
like atom/ion, the so-called fine structure of atomic levels – their degeneracy lifting even in the absence
of external fields. Since the effective speed v of the electron motion in atoms is much smaller than the
speed of light c, the fine structure may be analyzed as a sum of two independent relativistic effects. To
analyze the first of them, let us expand the well-known classical relativistic expression9 for the kinetic
energy T = E – mc² of a free particle with the rest mass m,10
$$T = \left(m^2c^4 + p^2c^2\right)^{1/2} - mc^2 \equiv mc^2\left[\left(1 + \frac{p^2}{m^2c^2}\right)^{1/2} - 1\right], \tag{6.44}$$
into the Taylor series with respect to the small ratio (p/mc)² ~ (v/c)²:
$$T = mc^2\left[1 + \frac{1}{2}\left(\frac{p}{mc}\right)^2 - \frac{1}{8}\left(\frac{p}{mc}\right)^4 + \dots - 1\right] \equiv \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \dots, \tag{6.45}$$
and drop all the terms besides the two spelled-out terms. Of them, the first term is non-relativistic, while
the second one represents the first relativistic correction to T.
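The quality of the two-term truncation of Eq. (45) may be checked numerically. The sketch below works with the dimensionless variable u = p/mc and compares the truncation error with the next term of the binomial series, u⁶/16; the function names are illustrative.

```python
from math import sqrt

def kinetic_exact(u):
    """T / (m c^2) as a function of u = p / (m c), from Eq. (6.44)."""
    return sqrt(1.0 + u * u) - 1.0

def kinetic_two_terms(u):
    """The two terms kept in Eq. (6.45), in the same units."""
    return u**2 / 2.0 - u**4 / 8.0

for u in (0.3, 0.1, 0.01):
    error = abs(kinetic_exact(u) - kinetic_two_terms(u))
    print(u, error, u**6 / 16.0)
```

For a hydrogen atom, where v/c is of the order of 1/137, the neglected terms are smaller than the kept relativistic correction by a factor of the order of (v/c)².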
Following the correspondence principle, the quantum-mechanical problem in this approximation
may be described by the perturbative Hamiltonian (1), whose unperturbed part (whose eigenstates and
eigenenergies were discussed in Sec. 3.5) is
$$\hat H^{(0)} = \frac{\hat p^2}{2m} + \hat U(r), \qquad \hat U(r) = -\frac{C}{r}, \tag{6.46}$$
while the kinetic-relativistic perturbation
8 This value approximately corresponds to the threshold of electric breakdown in air at ambient conditions, due to
the impact ionization. As a result, experiments with higher dc fields are rather difficult.
9 See, e.g., EM Eq. (9.78) – or any undergraduate text on special relativity.
10 This fancy font is used, as in Secs. 3.5-3.8, to distinguish the mass m from the magnetic quantum number m.
Kinetic-relativistic perturbation:
$$\hat H^{(1)} = -\frac{\hat p^4}{8m^3c^2} \equiv -\frac{1}{2mc^2}\left(\frac{\hat p^2}{2m}\right)^2. \tag{6.47}$$
Using Eq. (46), we may rewrite the last formula as
$$\hat H^{(1)} = -\frac{1}{2mc^2}\left(\hat H^{(0)} - \hat U(r)\right)^2, \tag{6.48}$$
so that its matrix elements participating in the characteristic equation (25) for a given degenerate energy
level (3.201), i.e. a given principal quantum number n, are
$$\langle nlm|\hat H^{(1)}|nl'm'\rangle = -\frac{1}{2mc^2}\,\langle nlm|\left(\hat H^{(0)} - \hat U(r)\right)\left(\hat H^{(0)} - \hat U(r)\right)|nl'm'\rangle, \tag{6.49}$$
where the bra- and ket-vectors describe the unperturbed eigenstates, whose eigenfunctions (in the
coordinate representation) are given by Eq. (3.200): ψn,l,m = Rn,l(r)Ylm(θ, φ).
It is straightforward (and hence left for the reader :-) to prove that all off-diagonal elements of
the set (49) are equal to 0. Thus we may use Eq. (27) for each set of the quantum numbers {n, l, m}:
$$\begin{aligned} E^{(1)}_{n,l,m} &\equiv E_{n,l,m} - E_n^{(0)} = \langle nlm|\hat H^{(1)}|nlm\rangle = -\frac{1}{2mc^2}\left\langle\left(\hat H^{(0)} - \hat U(r)\right)^2\right\rangle_{n,l,m} \\ &= -\frac{1}{2mc^2}\left(E_n^2 - 2E_n\langle\hat U\rangle_{n,l} + \langle\hat U^2\rangle_{n,l}\right) = -\frac{1}{2mc^2}\left(\frac{E_0^2}{4n^4} - \frac{E_0}{n^2}\,C\left\langle\frac{1}{r}\right\rangle_{n,l} + C^2\left\langle\frac{1}{r^2}\right\rangle_{n,l}\right), \end{aligned} \tag{6.50}$$
where the index m has been dropped, because the radial wavefunctions Rn,l(r), which affect these
expectation values, do not depend on that quantum number. Now using Eqs. (3.191), (3.201) and the
first two of Eqs. (3.211), we finally get
Kinetic-relativistic energy correction:
$$E^{(1)}_{n,l} = -\frac{mC^4}{2\hbar^4c^2n^4}\left(\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right) \equiv -\frac{2E_n^2}{mc^2}\left(\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right). \tag{6.51}$$
Let us discuss this result. First of all, its last form confirms that the correction (51) is indeed
much smaller than the unperturbed energy En (and hence the perturbation theory is valid) if the latter is
much smaller than the relativistic rest energy mc2 of the particle – as it is for the hydrogen atom. Next,
since in the Bohr problem’s solution n ≥ l + 1, the first fraction in the parentheses of Eq. (51) is always
larger than 1, and hence than 3/4, so that the kinetic relativistic correction to energy is negative for all n
and l. (Actually, this fact could be predicted already from Eq. (47), which shows that the perturbation’s
Hamiltonian is a negative-definite form.) Finally, for a fixed principal number n, the negative
correction’s magnitude decreases with the growth of l. This fact may be interpreted using the second of
Eqs. (3.211): the larger is l (at fixed n), the larger is the particle’s effective distance from the center, and
hence the smaller is its effective velocity, i.e. the smaller is the magnitude of the quantum-mechanical
average of the negative relativistic correction (47) to the kinetic energy.
The result (51) is valid for the Coulomb interaction U(r) = –C/r of any physical nature. However,
if we speak specifically about hydrogen-like atoms/ions, there is also another relativistic correction to
energy, due to the so-called spin-orbit interaction (alternatively called the “spin-orbit coupling”). Its
physics may be understood from the following semi-quantitative classical reasoning: from the “point
of view” of an electron rotating about the nucleus at distance r with velocity v, it is the nucleus, of the
electric charge Ze, that rotates about the electron with the velocity (–v) and hence the time period T =
2πr/v. From the point of view of magnetostatics, such circular motion of the electric charge Q = Ze is
equivalent to a circular dc electric current I = Q/T = (Ze)(v/2πr), which creates, at the electron’s
location, i.e. in the center of the current loop, the magnetic field with the following magnitude:11
$$B_a = \frac{\mu_0}{2r}\,I = \frac{\mu_0}{2r}\,\frac{Zev}{2\pi r} \equiv \frac{\mu_0 Zev}{4\pi r^2}. \tag{6.52}$$
The field’s direction n is perpendicular to the apparent plane of the nucleus’ rotation (i.e. that of the real
rotation of the electron), and hence its vector may be readily expressed via the similarly directed vector
L = mevrn of the electron’s angular (orbital) momentum:
$$\mathbf{B}_a = \frac{\mu_0 Zev}{4\pi r^2}\,\mathbf{n} \equiv \frac{\mu_0 Ze}{4\pi r^3 m_e}\,m_e v r\,\mathbf{n} \equiv \frac{\mu_0 Ze}{4\pi r^3 m_e}\,\mathbf{L} \equiv \frac{Ze}{4\pi\varepsilon_0 r^3 m_e c^2}\,\mathbf{L}, \tag{6.53}$$
where the last step used the basic relation between the SI-unit constants: μ0 ≡ 1/c²ε0.
A more careful (but still classical) analysis of the problem12 brings both good and bad news. The
bad news is that the result (53) is wrong by the so-called Thomas factor of two even for the circular
motion, because the electron moves with acceleration, and the reference frame bound to it cannot be
inertial (as was implied in the above reasoning), so that the effective magnetic field felt by the electron
is actually
$$\mathbf{B} = \frac{Ze}{8\pi\varepsilon_0 r^3 m_e c^2}\,\mathbf{L}. \tag{6.54}$$
The good news is that this result is valid not only for circular but an arbitrary orbital motion in
the Coulomb field U(r). Hence from the discussion in Sec. 4.1 and Sec. 4.4 we may expect that the
quantum-mechanical description of the interaction between this effective magnetic field and the
electron’s spin moment (4.115) is given by the following perturbation Hamiltonian13
$$\hat H^{(1)} = -\hat{\mathbf{m}}\cdot\hat{\mathbf{B}} \equiv -\gamma_e\hat{\mathbf{S}}\cdot\left(\frac{Ze}{8\pi\varepsilon_0 r^3 m_e c^2}\,\hat{\mathbf{L}}\right) \equiv \frac{1}{2m_e^2c^2}\,\frac{Ze^2}{4\pi\varepsilon_0}\,\frac{1}{r^3}\,\hat{\mathbf{S}}\cdot\hat{\mathbf{L}}, \tag{6.55}$$
where at spelling out the electron’s gyromagnetic ratio γe ≡ –gee/2me, the small correction to the value ge
= 2 of the electron’s g-factor (see Sec. 4.4) is ignored, because Eq. (55) is already a small correction.
This expectation is confirmed by the fully-relativistic Dirac theory, to be discussed in Sec. 9.7 below: it
yields, for an arbitrary central potential U(r), the following spin-orbit coupling Hamiltonian:
Spin-orbit coupling:
$$\hat H^{(1)} = \frac{1}{2m_e^2c^2}\,\frac{1}{r}\,\frac{dU(r)}{dr}\,\hat{\mathbf{S}}\cdot\hat{\mathbf{L}}. \tag{6.56}$$
For the Coulomb potential U(r) = –Ze²/4πε0r, this formula is reduced to Eq. (55).
11 See, e.g., EM Sec. 5.1, in particular, Eq. (5.24). Note that such effective magnetic field is induced by any
motion of electrons, in particular that in solids, leading to a variety of spin-orbit effects there – see, e.g., a concise
review by R. Winkler et al., in B. Kramer (ed.), Advances in Solid State Physics 41, 211 (2001).
12 It was carried out first by Llewellyn Thomas in 1926; for a simple review see, e.g., R. Harr and L. Curtis, Am.
J. Phys. 55, 1044 (1987).
13 In the Gaussian units, Eq. (55) is valid without the factor 4πε0 in the denominator; while Eq. (56), “as is”.
As we already know from the discussion in Sec. 5.7, the angular factor of this Hamiltonian
commutes with all the operators of the coupled-representation group (inside the blue line in Fig. 5.12):
L̂², Ŝ², Ĵ², and Ĵz, and hence is diagonal in the coupled-representation basis with definite quantum
numbers l, j, and mj (and of course s = ½). Hence, using Eq. (5.181) to rewrite Eq. (56) as
$$\hat H^{(1)} = \frac{1}{2m_e^2c^2}\,\frac{Ze^2}{4\pi\varepsilon_0}\,\frac{1}{r^3}\,\frac{1}{2}\left(\hat J^2 - \hat L^2 - \hat S^2\right), \tag{6.57}$$
we may again use Eq. (27) for each set {s, l, j, mj}, with common n:
$$E^{(1)}_{n,j,l} = \frac{1}{2m_e^2c^2}\,\frac{Ze^2}{4\pi\varepsilon_0}\left\langle\frac{1}{r^3}\right\rangle_{n,l}\frac{1}{2}\left\langle\hat J^2 - \hat L^2 - \hat S^2\right\rangle_{j,s}, \tag{6.58}$$
where the indices irrelevant for each particular factor have been dropped. Now using the last of Eqs.
(3.211), and similar expressions (5.169), (5.175), and (5.177) for eigenvalues of the involved operators,
we get an explicit expression for the spin-orbit corrections14
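Spin-orbit energy correction:
$$E^{(1)}_{n,j,l} = \frac{1}{2m_e^2c^2}\,\frac{Ze^2}{4\pi\varepsilon_0}\,\frac{\hbar^2}{2r_0^3}\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{n^3\,l\left(l + \frac{1}{2}\right)(l+1)} \equiv \frac{E_n^2}{m_ec^2}\,n\,\frac{j(j+1) - l(l+1) - \frac{3}{4}}{l\left(l + \frac{1}{2}\right)(l+1)}, \tag{6.59}$$
with l and j related by Eq. (5.189): j = l ± ½.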

The last form of its result shows clearly that this correction has the same scale as the kinetic
correction (51).15 In the 1st order of the perturbation theory, they may be just added (with m = me),
giving a surprisingly simple formula for the net fine structure of the nth energy level:
Fine structure of atomic levels:
$$E^{(1)}_{\text{fine}} = \frac{E_n^2}{2m_ec^2}\left(3 - \frac{4n}{j + \frac{1}{2}}\right). \tag{6.60}$$
This simplicity, as well as the independence of the result of the orbital quantum number l, will become
less surprising when (in Sec. 9.7) we see that this formula follows in one shot from the Dirac theory, in
which the Bohr atom’s energy spectrum is numbered only with n and j, but not l. Let us recall that for
an electron (s = ½), according to Eq. (5.189) with 0 ≤ l ≤ n – 1, the quantum number j may take n
positive half-integer values, from ½ to n – ½. Hence, Eq. (60) shows that the fine structure of the nth
Bohr’s energy level has n sub-levels – see Fig. 4.
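The statement that the kinetic correction (51) and the spin-orbit correction (59) add up to the l-independent Eq. (60) can be verified with exact rational arithmetic. In the sketch below all energies are measured in units of En²/mc² (with m = me), the function names are illustrative, and the case l = 0 is left out because Eq. (59) as written is a 0/0 expression there.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def kinetic(n, l):
    """Eq. (6.51), in units of E_n^2 / (m c^2)."""
    return -2 * (Fraction(n) / (l + HALF) - Fraction(3, 4))

def spin_orbit(n, l, j):
    """Eq. (6.59) in the same units; usable as written only for l >= 1."""
    numerator = j * (j + 1) - l * (l + 1) - Fraction(3, 4)
    return n * numerator / (l * (l + HALF) * (l + 1))

def fine(n, j):
    """Eq. (6.60) in the same units."""
    return HALF * (3 - Fraction(4 * n) / (j + HALF))

for n in range(2, 6):
    for l in range(1, n):
        for j in (l - HALF, l + HALF):
            assert kinetic(n, l) + spin_orbit(n, l, j) == fine(n, j)

print(kinetic(2, 1), spin_orbit(2, 1, Fraction(3, 2)), fine(2, Fraction(3, 2)))
```

For example, for n = 2, l = 1, j = 3/2, the two contributions are –7/6 and +2/3, giving the net value –1/2 in these units.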
Please note that according to Eq. (5.175), each of these sub-levels is still (2j + 1)-times
degenerate in the quantum number mj. This degeneracy is very natural, because in the absence of an
external field the system is still isotropic. Moreover, on each fine-structure level (besides the highest one
with j = n – ½), each of the mj-states is doubly-degenerate in the orbital quantum number l = j ± ½ – see
the labels of l in Fig. 4. (According to Eq. (5.190), each of these states, with fixed j and mj, may be
14 The factor l in the denominator does not give a divergence at l = 0, because in this case j = s = ½, so that j(j +
1) = ¾, and the numerator turns into 0 as well. A careful analysis of this case (which may be found, e.g., in G.
Woodgate, Elementary Atomic Structure, 2nd ed., Oxford, 1983), as well as the exact analysis of the hydrogen
atom using the Dirac theory (see Sec. 9.7), show that Eq. (60), which does not include l, is valid even in this case.
15 This is natural, because the magnetic interaction of charged particles is essentially a relativistic effect, of the
same order (~v²/c²) as the kinetic correction (47) – see, e.g., EM Sec. 5.1, in particular Eq. (5.3).
represented as a linear combination of two states with adjacent values of ml, and hence different electron
spin orientations, ms = ±½, weighted with the Clebsch-Gordan coefficients.)
Fig. 6.4. The fine structure of a hydrogen-like atom’s level En: sub-levels with j = ½ (l = 0, 1),
j = 3/2 (l = 1, 2), j = 5/2 (l = 2, 3), ..., up to the highest one with j = n – ½ (l = n – 1).
These details aside, one may crudely say that the relativistic corrections combined make the total
eigenenergy grow with l, contributing to the effect already mentioned at our analysis of the periodic
table of elements in Sec. 3.7. The relative scale of this increase may be characterized by the largest deviation
from the unperturbed energy En, reached for s-states (with l = 0, j = ½):
$$\frac{\left|E^{(1)}_{\max}\right|}{\left|E_n\right|} = \frac{\left|E_n\right|}{m_ec^2}\left(2n - \frac{3}{2}\right) \equiv \left(\frac{Ze^2}{4\pi\varepsilon_0\hbar c}\right)^2\left(\frac{1}{n} - \frac{3}{4n^2}\right) \equiv \alpha^2Z^2\left(\frac{1}{n} - \frac{3}{4n^2}\right), \tag{6.61}$$
where α is the fine-structure (“Sommerfeld’s”) constant,
$$\alpha \equiv \frac{e^2}{4\pi\varepsilon_0\hbar c} \approx \frac{1}{137}, \tag{6.62}$$
(already mentioned in Sec. 4.4), which characterizes the relative strength (or rather weakness :-) of the
electromagnetic effects in quantum mechanics – which in particular makes the perturbative quantum
electrodynamics possible.16 These expressions show that the fine-structure splitting is a very small effect
(~α² ≈ 5×10⁻⁵) for the hydrogen atom, but it rapidly grows (as Z²) with the nuclear charge (i.e. the atomic
number) Z, and becomes rather substantial for the heaviest stable atoms with Z ~ 10².
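To get a feeling for these numbers, the last form of Eq. (61) may be evaluated directly. The sketch below uses the rounded value 1/137.036 of the fine-structure constant; the function name and the choice Z = 92 (a hydrogen-like uranium ion) are illustrative only.

```python
ALPHA = 1 / 137.036  # fine-structure constant, rounded

def relative_fine_structure(n, z=1):
    """Last form of Eq. (6.61): the largest relative correction, reached for s-states."""
    return (ALPHA * z) ** 2 * (1 / n - 3 / (4 * n**2))

print(relative_fine_structure(1))        # hydrogen, n = 1
print(relative_fine_structure(2))        # hydrogen, n = 2
print(relative_fine_structure(1, z=92))  # hydrogen-like ion with Z = 92
```

For hydrogen with n = 1 this gives about 1.3×10⁻⁵, while for Z = 92 the ratio exceeds 0.1, so that the first-order treatment itself can then be expected to lose accuracy.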
6.4. The Zeeman effect

Now, we are ready to review the Zeeman effect – the atomic level splitting by an external
magnetic field.17 Using Eq. (3.26), with q = –e, for the description of the electron’s orbital motion in the
field, and the Pauli Hamiltonian (4.163), with γ = –e/me, for the electron spin’s interaction with the field,
we see that even for a hydrogen-like (i.e. single-electron) atom/ion, neglecting the relativistic effects,
the full Hamiltonian is rather involved:
$$\hat H = \frac{1}{2m_e}\left(\hat{\mathbf{p}} + e\hat{\mathbf{A}}\right)^2 - \frac{Ze^2}{4\pi\varepsilon_0 r} + \frac{e}{m_e}\,\hat{\mathbf{B}}\cdot\hat{\mathbf{S}}. \tag{6.63}$$
16 The expression α² = EH/mec², where EH is the Hartree energy (1.13), i.e. the scale of energies En, is also very
revealing.
17 It was discovered experimentally in 1896 by Pieter Zeeman who, amazingly, was fired from the University of
Leiden for unauthorized use of lab equipment for this work – just to receive a Nobel Prize for it in a few years!
There are several simplifications we may make. First, let us assume that the external field is
spatially uniform on the atomic scale (which is a very good approximation for most cases), so that we can
take its vector potential in an axially-symmetric gauge – cf. Eq. (3.132):
$$\mathbf{A} = \frac{1}{2}\,\mathbf{B}\times\mathbf{r}. \tag{6.64}$$
Second, let us neglect the terms proportional to B², which are small in practical magnetic fields of the
order of a few teslas.18 The remaining term in the effective kinetic energy, describing the interaction
with the magnetic field, is linear in the momentum operator, so that we may repeat the standard classical
calculation19 to reduce it to the product of B by the orbital magnetic moment’s component mz =
–eLz/2me – except that both mz and Lz should be understood as operators now. As a result, the
Hamiltonian (63) reduces to Eq. (1), Ĥ(0) + Ĥ(1), where Ĥ(0) is that of the atom at B = 0, and
Zeeman effect’s perturbation:
$$\hat H^{(1)} = \frac{eB}{2m_e}\left(\hat L_z + 2\hat S_z\right). \tag{6.65}$$
This expression immediately reveals the major complication with the Zeeman effect’s analysis.
Namely, in comparison with the equal orbital and spin contributions to the total angular momentum
(5.170) of the electron, its spin produces a twice larger contribution to the magnetic moment, so that the
right-hand side of Eq. (65) is not proportional to the total angular momentum J. As a result, the effect’s
description is simple only in two limits.
If the magnetic field is so high that its effects are much stronger than the relativistic (fine-
structure) effects discussed in the previous section, we may treat the two terms in Eq. (65) as
independent perturbations of different (orbital and spin) degrees of freedom. Since each of the
perturbation matrices is diagonal in its own z-basis, we can again use Eq. (27) to write
Paschen-Back effect:
$$E - E^{(0)} = \frac{eB}{2m_e}\left(\langle n,l,m_l|\hat L_z|n,l,m_l\rangle + 2\langle m_s|\hat S_z|m_s\rangle\right) = \frac{eB\hbar}{2m_e}\left(m_l + 2m_s\right) \equiv \mu_BB\left(m_l \pm 1\right). \tag{6.66}$$
This result describes splitting of each 2(2l + 1)-degenerate energy level, with certain n and l, into (2l
+ 3) levels (Fig. 5), with the adjacent level distance of μBB, of the order of 10⁻⁴ eV per tesla.
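Fig. 6.5. The Paschen-Back effect: the level En,l(0) splits into equidistant levels separated by μBB. The
unshifted level combines the states {ml = +1, ms = –1/2} and {ml = –1, ms = +1/2}; the level shifted by
+μBB, the states {ml = +2, ms = –1/2} and {ml = 0, ms = +1/2}; and the level shifted by –μBB, the states
{ml = 0, ms = –1/2} and {ml = –2, ms = +1/2}.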
18 Despite its smallness, the quadratic term is necessary for a description of the negative contribution of the orbital
motion to the magnetic susceptibility χm (the so-called orbital diamagnetism, see EM Sec. 5.5), whose analysis,
using Eq. (63), is left for the reader’s exercise.
19 See, e.g., EM Sec. 5.4, in particular Eqs. (5.95) and (5.100).
Note that all the levels, besides the two top and the two bottom ones, remain doubly degenerate. This
limit of the Zeeman effect is sometimes called the Paschen-Back effect – whose simplicity was
recognized only in the 1920s, due to the need for very high magnetic fields for its observation.
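The level count following from Eq. (66) can be confirmed by a simple enumeration. The sketch below lists, for a given l, every shift ml + 2ms (in units of μBB) together with the number of states {ml, ms} that produce it; the function name is illustrative.

```python
from collections import Counter

def paschen_back_levels(l):
    """Map each shift (m_l + 2 m_s), in units of mu_B * B, to its degeneracy."""
    counts = Counter()
    for ml in range(-l, l + 1):
        for two_ms in (-1, 1):  # 2 * m_s for m_s = -1/2 and +1/2
            counts[ml + two_ms] += 1
    return dict(sorted(counts.items()))

for l in (1, 2):
    levels = paschen_back_levels(l)
    print(l, len(levels), levels)
```

For l = 1 the enumeration gives five levels with degeneracies 1, 1, 2, 1, 1, i.e. 2l + 3 levels sharing the 2(2l + 1) initial states.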
In the opposite limit of relatively low magnetic fields, the Zeeman effect takes place on the
background of the much larger fine-structure splitting. As was discussed in Sec. 3, at B = 0 each split
sub-level has a 2(2j + 1)-fold degeneracy corresponding to (2j + 1) different values of the half-integer
quantum number mj, ranging from –j to +j, and two values of the integer l = j ± ½ – see Fig. 4.20 The
magnetic field lifts this degeneracy. Indeed, in the coupled representation discussed in Sec. 5.7, the
perturbation (65) is described by the matrix with elements
$$\begin{aligned} H^{(1)} &= \frac{eB}{2m_e}\,\langle j,m_j|\hat L_z + 2\hat S_z|j',m_{j'}\rangle \equiv \frac{eB}{2m_e}\,\langle j,m_j|\hat J_z + \hat S_z|j',m_{j'}\rangle \\ &= \frac{eB}{2m_e}\left(\hbar m_j\,\delta_{m_jm_{j'}} + \langle j,m_j|\hat S_z|j',m_{j'}\rangle\right). \end{aligned} \tag{6.67}$$
To spell out the second term, let us use the general expansion (5.183) for the particular case s =
½, when (as was discussed in the end of Sec. 5.7) it has at most two non-vanishing terms, with the
Clebsch-Gordan coefficients (5.190):
$$\left|j = l \pm \tfrac{1}{2},\, m_j\right\rangle = \pm\left(\frac{l \pm m_j + \frac{1}{2}}{2l+1}\right)^{1/2}\left|m_l = m_j - \tfrac{1}{2},\, m_s = +\tfrac{1}{2}\right\rangle + \left(\frac{l \mp m_j + \frac{1}{2}}{2l+1}\right)^{1/2}\left|m_l = m_j + \tfrac{1}{2},\, m_s = -\tfrac{1}{2}\right\rangle. \tag{6.68}$$
Taking into account that the operator Ŝz gives non-zero brackets only for ms = ms′, the 2×2 matrix of
elements ⟨ml = mj ∓ ½, ms = ±½| Ŝz |ml = mj ∓ ½, ms = ±½⟩ is diagonal, so we may use Eq. (27) to get
Anomalous Zeeman effect for s = 1/2:
$$\begin{aligned} E - E^{(0)} &= \frac{eB}{2m_e}\left[\hbar m_j + \frac{\hbar}{2}\,\frac{l \pm m_j + \frac{1}{2}}{2l+1} - \frac{\hbar}{2}\,\frac{l \mp m_j + \frac{1}{2}}{2l+1}\right] \\ &\equiv \frac{eB\hbar}{2m_e}\,m_j\left(1 \pm \frac{1}{2l+1}\right) \equiv \mu_BB\,m_j\left(1 \pm \frac{1}{2l+1}\right), \quad \text{for } -j \le m_j \le +j, \end{aligned} \tag{6.69}$$
where the two signs correspond to the two possible values of l = j ∓ ½ – see Fig. 6.
Fig. 6.6. The anomalous Zeeman effect in a hydrogen-like atom/ion: splitting of the level En,j(0) into
equidistant sub-levels with mj = ±1/2, ±3/2, ..., shown separately for l = j – ½ and l = j + ½.
20 In the almost-hydrogen-like, but more complex atoms (such as those of alkali metals), the degeneracy in l may
be lifted by electron-electron Coulomb interaction even in the absence of an external magnetic field.
We see that the magnetic field splits each sub-level of the fine structure, with a given l, into 2j +
1 equidistant levels, with the distance between the levels depending on l. In the late 1890s, when the
Zeeman effect was first observed, there was no notion of spin at all, so that this puzzling result was
called the anomalous Zeeman effect. (In this terminology, the normal Zeeman effect is the one with no
spin splitting, i.e. without the second terms in the parentheses of Eqs. (66), (67), and (69); it was first
observed in 1898 by Thomas Preston in atoms with zero net spin.)
The strict quantum-mechanical analysis of the anomalous Zeeman effect for arbitrary s (which is
important for applications to multi-electron atoms) is conceptually not complex, but requires explicit
expressions for the corresponding Clebsch-Gordan coefficients, which are rather bulky. Let me just cite
the unexpectedly simple result of this analysis:
Anomalous Zeeman effect for arbitrary s:
$$\Delta E = \mu_BB\,m_j\,g, \tag{6.70a}$$
where g is the so-called Landé factor:21
$$g = 1 + \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)}. \tag{6.70b}$$
For s = ½ (and hence j = l ± ½), this factor is reduced to the parentheses in the last forms of Eq. (69).
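The reduction of the Landé factor to the parentheses of Eq. (69) at s = ½ is simple to confirm with exact fractions. In the sketch below, the function name is an illustrative choice; the formula inside it is Eq. (70b) as written.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def lande_g(j, l, s):
    """The Lande factor, Eq. (6.70b)."""
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))

# For s = 1/2, g must reduce to 1 +/- 1/(2l + 1) for j = l +/- 1/2, as in Eq. (6.69)
for l in range(1, 5):
    assert lande_g(l + HALF, l, HALF) == 1 + Fraction(1, 2 * l + 1)
    assert lande_g(l - HALF, l, HALF) == 1 - Fraction(1, 2 * l + 1)

print(lande_g(HALF, 0, HALF))             # pure spin (l = 0)
print(lande_g(Fraction(3, 2), 1, HALF))   # l = 1, j = 3/2
print(lande_g(HALF, 1, HALF))             # l = 1, j = 1/2
```

The three printed values are 2, 4/3, and 2/3, so that the two fine-structure sub-levels with l = 1 are split by the field with distinctly different level spacings.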
It is remarkable that Eqs. (70) may be readily derived using very plausible classical arguments,
similar to those used in Sec. 5.7 – see Fig. 5.13 and its discussion. As was discussed in Sec. 5.6, in the
absence of spin, the quantization of the observable Lz is an extension of the classical picture of the
torque-induced precession of the vector L about the magnetic field’s direction, so that the interaction
energy, proportional to BLz = B·L, remains constant – see Fig. 7a. On the other hand, at the spin-orbit
interaction without an external magnetic field, the Hamiltonian function of the system includes the
product S·L, so that in the stationary state it has to be constant, together with J², L², and S². Hence, this
system’s classical image is a joint precession of the vectors S and L about the direction of the vector J =
L + S, in such a manner that the spin-orbit interaction energy, proportional to the product L·S, remains
constant (Fig. 7b). On this backdrop, the anomalous Zeeman effect in a relatively weak magnetic field B
= Bnz corresponds to a much slower precession of the vector J about the z-axis, “dragging” with it the
vectors L and S, rapidly rotating around it.
Fig. 6.7. Classical images of (a) the orbital angular momentum’s quantization in a magnetic field
(precession of L about B, with B·L = const), and (b) the fine-structure level splitting (joint precession of
L and S about J, with L·S = const; LJ and SJ are the components of L and S along J).
21 This formula is frequently used with capital letters J, S, and L, which denote the quantum numbers of the atom
as a whole.
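This physical picture allows us to conjecture that what is important for the slow precession rate
are only the vectors L and S averaged over the period of their much faster precession around vector J –
in other words, only their components LJ and SJ along the vector J. Classically, these components may
be calculated as
$$\mathbf{L}_J = \frac{\mathbf{L}\cdot\mathbf{J}}{J^2}\,\mathbf{J}, \quad \text{and} \quad \mathbf{S}_J = \frac{\mathbf{S}\cdot\mathbf{J}}{J^2}\,\mathbf{J}. \tag{6.71}$$
The scalar products participating in these expressions may be readily expressed via the squared lengths
of the vectors, using the following geometric formulas:
$$S^2 = (\mathbf{J} - \mathbf{L})^2 = J^2 + L^2 - 2\mathbf{L}\cdot\mathbf{J}, \qquad L^2 = (\mathbf{J} - \mathbf{S})^2 = J^2 + S^2 - 2\mathbf{J}\cdot\mathbf{S}. \tag{6.72}$$
As a result, we get the following time average:
$$\begin{aligned} \overline{L_z + 2S_z} &= \left(\mathbf{L}_J + 2\mathbf{S}_J\right)_z = \left(\frac{\mathbf{L}\cdot\mathbf{J}}{J^2}\,\mathbf{J} + 2\,\frac{\mathbf{S}\cdot\mathbf{J}}{J^2}\,\mathbf{J}\right)_z = \frac{J_z}{J^2}\left(\mathbf{L}\cdot\mathbf{J} + 2\mathbf{S}\cdot\mathbf{J}\right) \\ &= J_z\,\frac{(J^2 + L^2 - S^2) + 2(J^2 + S^2 - L^2)}{2J^2} \equiv J_z\left(1 + \frac{J^2 + S^2 - L^2}{2J^2}\right). \end{aligned} \tag{6.73}$$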
The last move is to smuggle in some quantum mechanics by using, instead of the vector lengths
squared and the z-component of Jz, their eigenvalues given by Eqs. (5.169), (5.175), and (5.177). As a
result, we immediately arrive at the exact Eqs. (70). This coincidence encourages thinking about
quantum mechanics of angular momenta in the classical terms of torque-induced precession, which turns
out to be very fruitful in some more complex problems of atomic and molecular physics.
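The last step above can be tried numerically. The following minimal sketch assumes that the cited eigenvalue formulas are the standard ones, $J^2 \to \hbar^2 j(j+1)$ and similarly for $L^2$ and $S^2$ (these equations are not reproduced in this section), and evaluates the factor in the parentheses of Eq. (6.73), commonly called the Landé factor. The function name `lande_g` is introduced here only for illustration.

```python
from fractions import Fraction

def lande_g(l, s, j):
    """Factor 1 + (J^2 + S^2 - L^2)/(2 J^2) of Eq. (6.73), with X^2 -> x(x+1).

    The common factor hbar^2 cancels, so only quantum numbers are needed.
    """
    l, s, j = Fraction(l), Fraction(s), Fraction(j)
    if j == 0:
        raise ValueError("the ratio in Eq. (6.73) is undefined for j = 0")
    jj, ss, ll = j * (j + 1), s * (s + 1), l * (l + 1)
    return 1 + (jj + ss - ll) / (2 * jj)

half = Fraction(1, 2)
for l, j in [(0, half), (1, half), (1, 3 * half)]:
    print(f"l = {l}, s = 1/2, j = {j}: g = {lande_g(l, half, j)}")
```

Exact rational arithmetic is used so that the familiar values (2 for an s-state, 2/3 and 4/3 for the two p-state sublevels of a spin-½ particle) come out without rounding.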
The high-field and low-field limits of the Zeeman effect, described respectively by Eqs.
(66) and (69), are separated by a medium field range, in which the Zeeman splitting is of the order of the
fine-structure splitting analyzed in Sec. 3. There is no time in this course for a quantitative analysis of
this crossover.22
6.5. Time-dependent perturbations

Now let us proceed to the case when the perturbation $\hat{H}^{(1)}$ in Eq. (1) is a function of time, while
$\hat{H}^{(0)}$ is time-independent. The adequate perturbative approach to this problem, and its results, depend
critically on the relation between the characteristic frequency $\omega$ of the perturbation and the distance
between the initial system’s energy levels:
$$\hbar\omega \leftrightarrow \left|E_n - E_{n'}\right|. \tag{6.74}$$
In the case when all essential frequencies of a perturbation are very small in the sense of Eq.
(74), we are dealing with the so-called adiabatic change of parameters, which may be treated essentially as
a time-independent perturbation (see the previous sections of this chapter). The most interesting
observation here is that the adiabatic perturbation does not allow any significant transfer of the system’s
22 For a more complete discussion of the Stark, Zeeman, and fine-structure effects in atoms, I can recommend, for
example, either the monograph by G. Woodgate cited above, or the one by I. Sobelman, Theory of Atomic Spectra,
Alpha Science, 2006.
probability from one eigenstate to another. For example, in the WKB limit of the orbital motion, the
Bohr quantization rule and its Wilson-Sommerfeld modification (2.110) guarantee that the integral
$$\oint_C \mathbf{p}\cdot d\mathbf{r}, \tag{6.75}$$
taken along the particle’s classical trajectory, is an adiabatic invariant, i.e. does not change at a slow
change of system’s parameters. (It is curious that classical mechanics also guarantees the invariance of
the integral (75), but its proof there23 is much harder than the quantum-mechanical derivation of this
fact, carried out in Sec. 2.4.) This is why even if the perturbation becomes large with time (while
changing sufficiently slowly), we can expect the classification of eigenstates and eigenvalues to persist.
Let us proceed to the harder case when both sides of Eq. (74) are comparable, using for this
discussion the Schrödinger picture of quantum dynamics, given by Eq. (4.158). Combining it with Eq.
(1), we get the Schrödinger equation in the form
$$i\hbar\frac{\partial}{\partial t}\left|\alpha(t)\right\rangle = \left(\hat{H}^{(0)} + \hat{H}^{(1)}(t)\right)\left|\alpha(t)\right\rangle. \tag{6.76}$$
Very much in the spirit of our treatment of the time-independent case in Sec. 1, let us represent the
time-dependent ket-vector of the system with its expansion,
$$\left|\alpha(t)\right\rangle = \sum_n \left|n\right\rangle\left\langle n|\alpha(t)\right\rangle, \tag{6.77}$$
over the full and orthonormal set of the unperturbed, stationary ket-vectors defined by the equation
$$\hat{H}^{(0)}\left|n\right\rangle = E_n\left|n\right\rangle. \tag{6.78}$$
(Note that these kets $|n\rangle$ are exactly what was called $|n^{(0)}\rangle$ in Sec. 1; we may afford a less bulky notation
in this section, because only the lowest orders of the perturbation theory will be discussed.) Plugging the
expansion (77), with n replaced with n’, into both sides of Eq. (76), and then inner-multiplying both its
sides by the bra-vector $\langle n|$ of another unperturbed (and hence time-independent) state of the system, we
get the following set of linear, ordinary differential equations for the expansion coefficients:
$$i\hbar\frac{d}{dt}\langle n|\alpha(t)\rangle = E_n\langle n|\alpha(t)\rangle + \sum_{n'} H^{(1)}_{nn'}(t)\,\langle n'|\alpha(t)\rangle, \tag{6.79}$$
where the matrix elements of the perturbation, in the unperturbed state basis, defined similarly to Eq.
(8), are now functions of time:
$$H^{(1)}_{nn'}(t) \equiv \langle n|\hat{H}^{(1)}(t)|n'\rangle. \tag{6.80}$$
The set of differential equations (79), which are still exact, may be useful for numerical
calculations.24 However, it has a certain technical inconvenience, which becomes clear if we consider its
(evident) solution in the absence of perturbation:25
23 See, e.g., CM Sec. 10.2.

24 Even if the problem under analysis may be described by the wave-mechanics Schrödinger equation (1.25), a
direct numerical integration of that partial differential equation is typically less convenient than that of the
ordinary differential equations (79).
25 This is of course just a more general form of Eq. (1.62) of the wave mechanics of time-independent systems.
$$\langle n|\alpha(t)\rangle = \langle n|\alpha(0)\rangle\exp\left\{-i\frac{E_n}{\hbar}t\right\}. \tag{6.81}$$
We see that these solutions oscillate very fast, and their numerical modeling may represent a challenge
for even the fastest computers. These spurious oscillations (whose frequency, in particular, depends on
the energy reference level) may be partly tamed by looking for the general solution of Eqs. (79) in a
form inspired by Eq. (81):
$$\langle n|\alpha(t)\rangle \equiv a_n(t)\exp\left\{-i\frac{E_n}{\hbar}t\right\}. \tag{6.82}$$
Here $a_n(t)$ are new functions of time (essentially, the stationary states’ probability amplitudes),
which may be used, in particular, to calculate the time-dependent level occupancies, i.e. the probabilities
$W_n$ to find the perturbed system on the corresponding energy levels of the unperturbed system:
$$W_n(t) \equiv \left|\langle n|\alpha(t)\rangle\right|^2 = \left|a_n(t)\right|^2. \tag{6.83}$$
Plugging Eq. (82) into Eq. (79), for these functions we readily get a slightly modified system of
equations:
$$\text{Probability amplitudes' evolution:} \quad i\hbar\,\dot{a}_n = \sum_{n'} a_{n'}H^{(1)}_{nn'}(t)\,e^{i\omega_{nn'}t}, \tag{6.84}$$
where the factors $\omega_{nn'}$, defined by the relation
$$\text{Quantum transition frequencies:} \quad \hbar\omega_{nn'} \equiv E_n - E_{n'}, \tag{6.85}$$
have the physical sense of frequencies of potential quantum transitions between the nth and n’th energy
levels of the unperturbed system. (The conditions when such transitions indeed take place will be clear
soon.) The advantages of Eqs. (84) over Eqs. (79), for both analytical and numerical calculations, are their
independence of the energy reference, and the lower frequencies of oscillations of the right-hand side terms,
especially when the energy levels of interest are close to each other.26
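Since Eqs. (84) are recommended here for numerical work, a minimal integration sketch may be useful. It uses a hand-written fourth-order Runge-Kutta step in plain Python; the helper names (`amplitude_rates`, `rk4_step`, `evolve`) and the two-level example with the coupling `cosine_coupling` are hypothetical illustrations in units with $\hbar = 1$, not something taken from the course.

```python
import cmath
import math

def amplitude_rates(t, a, energies, h1, hbar=1.0):
    """Right-hand side of Eq. (6.84), solved for da_n/dt."""
    rates = []
    for n, e_n in enumerate(energies):
        total = 0j
        for k, e_k in enumerate(energies):
            w_nk = (e_n - e_k) / hbar                     # Eq. (6.85)
            total += a[k] * h1(n, k, t) * cmath.exp(1j * w_nk * t)
        rates.append(total / (1j * hbar))
    return rates

def rk4_step(t, a, dt, energies, h1, hbar=1.0):
    """One classical Runge-Kutta step for the list of amplitudes a."""
    def nudged(slope, factor):
        return [x + factor * s for x, s in zip(a, slope)]
    k1 = amplitude_rates(t, a, energies, h1, hbar)
    k2 = amplitude_rates(t + dt / 2, nudged(k1, dt / 2), energies, h1, hbar)
    k3 = amplitude_rates(t + dt / 2, nudged(k2, dt / 2), energies, h1, hbar)
    k4 = amplitude_rates(t + dt, nudged(k3, dt), energies, h1, hbar)
    return [x + dt / 6 * (p + 2 * q + 2 * r + s)
            for x, p, q, r, s in zip(a, k1, k2, k3, k4)]

def evolve(a0, energies, h1, t_end, steps, hbar=1.0):
    """Integrate Eqs. (6.84) from t = 0 to t_end; returns the final amplitudes."""
    a, dt = [complex(x) for x in a0], t_end / steps
    for i in range(steps):
        a = rk4_step(i * dt, a, dt, energies, h1, hbar)
    return a

def cosine_coupling(n, k, t):
    """Hypothetical Hermitian perturbation: real off-diagonal element 0.1*cos(t)."""
    return 0.0 if n == k else 0.1 * math.cos(t)

final = evolve([1, 0], [0.0, 1.0], cosine_coupling, t_end=20.0, steps=2000)
occupancies = [abs(x) ** 2 for x in final]                # Eq. (6.83)
print(occupancies, sum(occupancies))
```

For a Hermitian perturbation matrix, the sum of the occupancies should stay equal to 1 within the integration error, which gives a simple sanity check of the time step.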
In order to continue our analytical treatment, let us focus on a particular but very important
problem of a sinusoidal perturbation turned on at some moment – which may be taken for t = 0:
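$$\text{Turning on sinusoidal perturbation:} \quad \hat{H}^{(1)}(t) = \begin{cases} 0, & \text{for } t < 0, \\ \hat{A}e^{-i\omega t} + \hat{A}^{\dagger}e^{+i\omega t}, & \text{for } t > 0, \end{cases} \tag{6.86}$$
where the perturbation amplitude operators $\hat{A}$ and $\hat{A}^{\dagger}$,27 and hence their matrix elements,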
26 Note that the relation of Eq. (84) to the initial Eq. (79) is very close to the relation of the interaction picture of
quantum dynamics, discussed at the end of Sec. 4.6, to its Schrödinger picture, with the perturbation Hamiltonian
playing the role of the interaction one – compare Eqs. (1) and Eq. (4.206). Indeed, Eq. (84) could be readily
obtained from the interaction picture, and I did not do this just to avoid using this heavy bra-ket artillery for our
current (relatively) simple problem, and hence to keep its physics more transparent.
27 The notation of the amplitude operators in Eq. (86) is justified by the fact that the perturbation Hamiltonian has
to be self-adjoint (Hermitian), and hence each term on the right-hand side of that relation has to be a Hermitian
conjugate of its counterpart, which is evidently true only if the amplitude operators are also the Hermitian
conjugates of each other. Note, however, that each of these amplitude operators is generally not Hermitian.
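$$\langle n|\hat{A}|n'\rangle \equiv A_{nn'}, \qquad \langle n|\hat{A}^{\dagger}|n'\rangle = A^{*}_{n'n}, \tag{6.87}$$
are time-independent after the turn-on moment. In this case, Eq. (84) yields
$$i\hbar\,\dot{a}_n = \sum_{n'} a_{n'}\left[A_{nn'}e^{i(\omega_{nn'}-\omega)t} + A^{*}_{n'n}e^{i(\omega_{nn'}+\omega)t}\right], \quad \text{for } t > 0. \tag{6.88}$$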
This is, generally, still a nontrivial system of coupled differential equations; however, it allows
simple and explicit solutions in two very important limits. First, let us assume that our system initially
was definitely in one eigenstate n’ (usually, though not necessarily, in the ground state), and that the
occupancies Wn of all other levels stay very low all the time. (We will find the condition when the
second assumption is valid a posteriori – from the solution.) With these assumptions,
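$$a_{n'} = 1; \qquad \left|a_n\right| \ll 1, \quad \text{for } n \neq n', \tag{6.89}$$
Eq. (88) may be readily integrated, giving
$$a_n = -\frac{A_{nn'}}{\hbar(\omega_{nn'}-\omega)}\left[e^{i(\omega_{nn'}-\omega)t} - 1\right] - \frac{A^{*}_{n'n}}{\hbar(\omega_{nn'}+\omega)}\left[e^{i(\omega_{nn'}+\omega)t} - 1\right], \quad \text{for } n \neq n'. \tag{6.90}$$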
This expression describes what is colloquially called the ac excitation of (other) energy levels.
Qualitatively, it shows that the probability $W_n$ (83) of finding the system in each state (“on each energy
level”) of the system does not tend to any constant value but rather oscillates in time. It also shows
that the ac-field-induced transfer of the system from one state to the other one has a clearly resonant
character: the maximum occupancy $W_n$ of a level number $n \neq n'$ grows infinitely when the
corresponding detuning28
$$\Delta_{nn'} \equiv \omega - \omega_{nn'}, \tag{6.91}$$
tends to zero. This conclusion is clearly unrealistic, and is an artifact of our initial assumption (89);
according to Eq. (90), it is satisfied only if29
$$\left|A_{nn'}\right| \ll \hbar\left|\omega - \omega_{nn'}\right|, \tag{6.92}$$
and hence does not allow a deeper analysis of the resonant excitation.
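The integration step behind Eq. (90), and the origin of the condition (92), may be worth writing out. The following sketch assumes, as in Eq. (89), that $a_{n'}$ is held at 1 on the right-hand side of Eq. (88) and that $a_n(0) = 0$; the bound in the last line is an order-of-magnitude estimate added here for illustration only.

```latex
% Eq. (6.88) with a_{n'} = 1 and all other amplitudes neglected on the right-hand side:
i\hbar\,\dot{a}_n = A_{nn'}\,e^{i(\omega_{nn'}-\omega)t} + A^{*}_{n'n}\,e^{i(\omega_{nn'}+\omega)t}.
% Integrating from 0 to t with a_n(0) = 0, and using
% \int_0^t e^{i\nu t'}\,dt' = (e^{i\nu t}-1)/(i\nu):
a_n(t) = \frac{1}{i\hbar}\left[
    A_{nn'}\,\frac{e^{i(\omega_{nn'}-\omega)t}-1}{i(\omega_{nn'}-\omega)}
  + A^{*}_{n'n}\,\frac{e^{i(\omega_{nn'}+\omega)t}-1}{i(\omega_{nn'}+\omega)}\right],
% which is Eq. (6.90), because 1/(i \cdot i) = -1.
% Since |e^{i\nu t}-1| \le 2, the first (resonant) term is bounded as
\left|a_n\right|_{\rm res} \le \frac{2\left|A_{nn'}\right|}{\hbar\left|\omega-\omega_{nn'}\right|},
% so that |a_n| << 1 is guaranteed by the condition (6.92), up to a factor of order unity.
```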
In order to overcome this limitation, we may perform the following trick – very similar to the
one we used for the transfer to the degenerate case in Sec. 1. Let us assume that for a certain level n,
$$\left|\Delta_{nn'}\right| \ll \omega, \ \left|\omega \pm \omega_{n''n}\right|, \ \left|\omega \pm \omega_{n''n'}\right|, \quad \text{for all } n'' \neq n, n' \tag{6.93}$$
(this condition is illustrated in Fig. 8). Then, according to Eq. (90), we may ignore the occupancy of all but
two levels, n and n’, and also the second, non-resonant term with frequency $\omega_{nn'} + \omega \approx 2\omega \gg |\Delta_{nn'}|$ in
Eqs. (88),30 now written for two probability amplitudes, $a_n$ and $a_{n'}$.
28 The notion of detuning is also very useful in the classical theory of oscillations (see, e.g., CM Chapter 5), where
the role of $\omega_{nn'}$ is played by the own frequency $\omega_0$ of the oscillator.
29 Strictly speaking, one more condition is that the number of “resonance” levels is also not too high – see Sec. 6.
30 The second assumption, i.e. the omission of non-resonant terms in the equations for amplitudes is called the
Rotating Wave Approximation (RWA); the same idea in the classical theory of oscillations is the basis of what is
usually called the van der Pol method, and its result, the reduced equations – see, e.g., CM Secs. 5.3-5.5.

Fig. 6.8. The resonant excitation of an energy level.
The result is the following system of two linear equations:
$$i\hbar\,\dot{a}_n = a_{n'}Ae^{-i\Delta t}, \qquad i\hbar\,\dot{a}_{n'} = a_n A^{*}e^{i\Delta t}, \tag{6.94}$$
which uses the shorthand notation $A \equiv A_{nn'}$ and $\Delta \equiv \Delta_{nn'}$. (I will use this notation for a while, until other
energy levels become involved, at the beginning of the next section.) This system may be readily
reduced to a form without explicit time dependence of the right-hand parts, for example by introducing
the following new probability amplitudes, with the same moduli:
$$b_n \equiv a_n e^{i\Delta t/2}, \qquad b_{n'} \equiv a_{n'}e^{-i\Delta t/2}, \tag{6.95}$$
so that
$$a_n = b_n e^{-i\Delta t/2}, \qquad a_{n'} = b_{n'}e^{i\Delta t/2}. \tag{6.96}$$
Plugging these relations into Eq. (94), we get two usual linear first-order differential equations:
$$i\hbar\,\dot{b}_n = -\frac{\hbar\Delta}{2}b_n + Ab_{n'}, \qquad i\hbar\,\dot{b}_{n'} = A^{*}b_n + \frac{\hbar\Delta}{2}b_{n'}. \tag{6.97}$$
As the reader knows very well by now, the general solution of such a system is a linear combination of
two exponential functions, $\exp\{\lambda t\}$, with the exponents $\lambda$ that may be found by plugging any of these
functions into Eq. (97), and requiring the consistency of the two resulting linear algebraic equations. In
our case, the consistency condition (i.e. the characteristic equation of the system) is
$$\begin{vmatrix} -\hbar\Delta/2 - i\hbar\lambda & A \\ A^{*} & \hbar\Delta/2 - i\hbar\lambda \end{vmatrix} = 0, \tag{6.98}$$
and has two solutions $\lambda_{\pm} = \pm i\Omega$, where
$$\text{Rabi oscillations' frequency:} \quad \Omega \equiv \left(\frac{\Delta^2}{4} + \frac{|A|^2}{\hbar^2}\right)^{1/2}, \quad \text{i.e. } 2\Omega = \left(\Delta^2 + 4\frac{|A|^2}{\hbar^2}\right)^{1/2}. \tag{6.99}$$
The coefficients at the exponents are determined by initial conditions. If, as was assumed before,
the system was completely on the level n’ initially (at t = 0), i.e. if $a_{n'}(0) = 1$, $a_n(0) = 0$, so that $b_{n'}(0) = 1$,
$b_n(0) = 0$ as well, then Eqs. (97) yield, in particular:
$$b_n(t) = -i\frac{A}{\hbar\Omega}\sin\Omega t, \tag{6.100}$$
so that the nth level occupancy is
$$\text{Rabi formula:} \quad W_n = \left|b_n\right|^2 = \frac{|A|^2}{\hbar^2\Omega^2}\sin^2\Omega t = \frac{|A|^2}{|A|^2 + (\hbar\Delta/2)^2}\sin^2\Omega t. \tag{6.101}$$
This is the famous Rabi oscillation formula.31 If the detuning is large in comparison with $|A|/\hbar$,
though still small in the sense of Eq. (93), the frequency $2\Omega$ of the Rabi oscillations is completely
determined by the detuning, and their amplitude is small:
$$W_n(t) = 4\frac{|A|^2}{(\hbar\Delta)^2}\sin^2\frac{\Delta t}{2} \ll 1, \quad \text{for } |A|^2 \ll (\hbar\Delta)^2, \tag{6.102}$$
a result that could be obtained directly from Eq. (90), just neglecting the second term on its right-hand
side. However, now we may also analyze the results of an increase of the perturbation amplitude: it
leads not only to an increase of the amplitude of the probability oscillations, but also of their frequency
(see Fig. 9). Ultimately, at $|A| \gg \hbar|\Delta|$ (for example, at the exact resonance, $\Delta = 0$, i.e. $\omega_{nn'} = \omega$, so that
$E_n = E_{n'} + \hbar\omega$), Eqs. (99) and (101) give $\Omega = |A|/\hbar$ and $(W_n)_{\max} = 1$, i.e. describe a periodic, full
“repumping” of the system from one level to another and back, with a frequency proportional to the
perturbation amplitude.32
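The Rabi formula can be checked against a direct numerical solution of the two-level equations. The sketch below, in units with $\hbar = 1$ and with illustrative parameter values only, evaluates Eqs. (6.99) and (6.101) in `rabi_occupancy` and compares the result with a Runge-Kutta integration of Eqs. (6.94) in `integrate_rwa`; both function names are hypothetical.

```python
import cmath
import math

def rabi_occupancy(t, amp, detuning, hbar=1.0):
    """W_n(t) from Eqs. (6.99) and (6.101)."""
    if amp == 0:
        return 0.0
    rabi = math.sqrt(detuning ** 2 / 4 + abs(amp) ** 2 / hbar ** 2)
    return (abs(amp) / (hbar * rabi)) ** 2 * math.sin(rabi * t) ** 2

def integrate_rwa(t_end, amp, detuning, steps=4000, hbar=1.0):
    """RK4 integration of Eqs. (6.94) from a_n = 0, a_n' = 1; returns |a_n|^2."""
    amp = complex(amp)

    def rates(t, a_n, a_np):
        phase = cmath.exp(-1j * detuning * t)
        return (a_np * amp * phase / (1j * hbar),
                a_n * amp.conjugate() / phase / (1j * hbar))

    a_n, a_np, dt = 0j, 1 + 0j, t_end / steps
    for i in range(steps):
        t = i * dt
        k1 = rates(t, a_n, a_np)
        k2 = rates(t + dt / 2, a_n + dt / 2 * k1[0], a_np + dt / 2 * k1[1])
        k3 = rates(t + dt / 2, a_n + dt / 2 * k2[0], a_np + dt / 2 * k2[1])
        k4 = rates(t + dt, a_n + dt * k3[0], a_np + dt * k3[1])
        a_n += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a_np += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return abs(a_n) ** 2

for amp in (0.1, 0.3, 1.0, 3.0):      # values of |A|/(hbar*Delta), with Delta = 1
    print(amp, rabi_occupancy(2.0, amp, 1.0), integrate_rwa(2.0, amp, 1.0))
```

The two columns should agree to within the integration error for any amplitude, because Eq. (6.101) is the exact solution of Eqs. (6.94); the approximation is in the reduction of Eqs. (6.88) to Eqs. (6.94), not in this step.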
Fig. 6.9. The Rabi oscillations for several values of the normalized amplitude of ac perturbation. [Plot of $W_n$ versus $t/(2\pi/\Delta)$; the curves are labeled 0.1, 0.3, 1, and 3.]
This effect is a close analog of the quantum oscillations in two-level systems with time-
independent Hamiltonians, which were discussed in Secs. 2.6 and 5.1. Indeed, let us revisit, for a
moment, their discussion started at the end of Sec. 1 of this chapter, now paying more attention to the
time evolution of the system under the perturbation. As was argued in that section, the most general
perturbation Hamiltonian lifting the two-fold degeneracy of an energy level, in an arbitrary basis, has
the matrix (28). Let us describe the system’s dynamics using, again, the Schrödinger picture,
representing the ket-vector of an arbitrary state of the system in the form (5.1), where ↑ and ↓ are the
31 It was derived in 1937 by Isaac Rabi, in the context of his group’s pioneering experiments with the ac
(practically, microwave) excitation of quantum states, using molecular beams in vacuum.
32 As Eqs. (82), (96), and (99) show, the lowest frequency in the system is $\omega_l = \omega_{n'} - \Delta/2 + \Omega$, so that at $A \to 0$
(and $\Delta > 0$), $\omega_l \approx \omega_{n'} + |A|^2/\hbar^2\Delta$. This effective shift of the lowest energy level (which may be measured by another “probe”
field of a different frequency) is a particular case of the ac Stark effect, which was already mentioned in Sec. 2.
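time-independent states of the basis in which Eq. (28) is written (now without any obligation to associate
these states with the z-basis of any spin-½). Then, the Schrödinger equation (4.158) yields
$$i\hbar\begin{pmatrix}\dot{\alpha}_{\uparrow} \\ \dot{\alpha}_{\downarrow}\end{pmatrix} = \mathrm{H}^{(1)}\begin{pmatrix}\alpha_{\uparrow} \\ \alpha_{\downarrow}\end{pmatrix} \equiv \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22}\end{pmatrix}\begin{pmatrix}\alpha_{\uparrow} \\ \alpha_{\downarrow}\end{pmatrix} = \begin{pmatrix} H_{11}\alpha_{\uparrow} + H_{12}\alpha_{\downarrow} \\ H_{21}\alpha_{\uparrow} + H_{22}\alpha_{\downarrow}\end{pmatrix}. \tag{6.103}$$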
As we know (for example, from the discussion in Sec. 5.1), the average of the diagonal elements
of the matrix gives just a common shift of the system’s energy; for the purpose of the dynamics analysis,
it may be absorbed into the energy reference level. Also, the Hamiltonian operator has to be Hermitian,
so that the off-diagonal elements of its matrix have to be complex-conjugate. With this, Eqs. (103) are
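reduced to the form,
$$i\hbar\,\dot{\alpha}_{\uparrow} = -\frac{\xi}{2}\alpha_{\uparrow} + H_{12}\alpha_{\downarrow}, \qquad i\hbar\,\dot{\alpha}_{\downarrow} = H_{12}^{*}\alpha_{\uparrow} + \frac{\xi}{2}\alpha_{\downarrow}, \quad \text{with } \xi \equiv H_{22} - H_{11}, \tag{6.104}$$
which is absolutely similar to Eqs. (97). In particular, these equations describe the quantum oscillations
of the probabilities $W_{\uparrow} = |\alpha_{\uparrow}|^2$ and $W_{\downarrow} = |\alpha_{\downarrow}|^2$ with the frequency33
$$2\Omega = \left(\frac{\xi^2}{\hbar^2} + 4\frac{|H_{12}|^2}{\hbar^2}\right)^{1/2}. \tag{6.105}$$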
The similarity of Eqs. (97) and (104), and hence of Eqs. (99) and (105), shows that the “usual”
quantum oscillations and the Rabi oscillations have essentially the same physical nature, besides that in
the latter case the external ac signal quantum $\hbar\omega$ bridges the separated energy levels, effectively
reducing their difference $(E_n - E_{n'})$ to a much smaller difference $-\hbar\Delta \equiv (E_n - E_{n'}) - \hbar\omega$. Also, since the
Hamiltonian (28) is similar to that given by Eq. (5.2), the dynamics of such a system with two ac-
coupled energy levels, within the limits (93) of the perturbation theory, is completely similar to that of a
time-independent two-level system. In particular, its state may be similarly represented by a point on the
Bloch sphere shown in Fig. 5.3, with its dynamics described, in the Heisenberg picture, by Eq. (5.19).
This fact is very convenient for the experimental implementation of quantum information systems (to be
discussed in more detail in Sec. 8.5), because it enables qubit manipulations in a broad variety of
physical systems with well-separated energy levels, using external ac (usually either microwave or
optical) sources.
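One way to see where the frequency (6.105) comes from is to note that it equals the splitting of the two eigenenergies of the matrix in Eq. (6.103), divided by $\hbar$. The short sketch below checks this for arbitrary illustrative matrix elements; the function names are hypothetical, and the closed-form eigenvalues of a 2×2 Hermitian matrix are the standard ones.

```python
import math

def eigen_energies(h11, h12, h22):
    """Eigenvalues of the Hermitian matrix [[h11, h12], [conj(h12), h22]]."""
    mean = (h11 + h22) / 2
    half_gap = math.sqrt(((h22 - h11) / 2) ** 2 + abs(h12) ** 2)
    return mean - half_gap, mean + half_gap

def oscillation_frequency(h11, h12, h22, hbar=1.0):
    """The frequency 2*Omega of Eq. (6.105), with xi = h22 - h11."""
    xi = h22 - h11
    return math.sqrt(xi ** 2 / hbar ** 2 + 4 * abs(h12) ** 2 / hbar ** 2)

h11, h12, h22 = 0.3, 0.4 + 0.2j, 1.1      # illustrative numbers only, hbar = 1
low, high = eigen_energies(h11, h12, h22)
print(high - low, oscillation_frequency(h11, h12, h22))
```

The average of the diagonal elements drops out of the difference, in agreement with the remark that it only shifts the energy reference level.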
Note, however, that according to Eq. (90), if the system has energy levels other than n and n’,
they also become occupied to some extent. Since the sum of all occupancies equals 1, this means that
$(W_n)_{\max}$ may approach 1 only if the excitation amplitude is very small, and hence the state
manipulation time scale $\mathcal{T} = 2\pi/\Omega = 2\pi\hbar/|A|$ is very long. The ultimate limit in this sense is provided by
the harmonic oscillator where all energy levels are equidistant, and the probability repumping between
all of them occurs at comparable rates. In particular, in this system the implementation of the full Rabi
oscillations is impossible even at the exact resonance.34
33 By the way, Eq. (105) gives a natural generalization of the relations obtained for the frequency of such
oscillations in Sec. 2.6, where the coupled potential wells were assumed to be exactly similar, so that $\xi = 0$.
Moreover, Eqs. (104) give a long-promised proof of Eqs. (2.201), and hence a better justification of Eqs. (2.203).
34 From Sec. 5.5, we already know what happens to the ground state of an oscillator at its external sinusoidal (or
any other) excitation: it turns into a Glauber state, i.e. a superposition of all Fock states – see Eq. (5.134).
However, I would not like these quantitative details to obscure from the reader the most
important qualitative (OK, maybe semi-quantitative :-) conclusion of this section’s analysis: a resonant
increase of the interlevel transition intensity at $\omega \to \omega_{nn'}$. As will be shown later in the course, in a
quantum system coupled to its environment at least slightly (hence in reality, in any quantum system),
such increase is accompanied by a sharp increase of the external field’s absorption, which may be
measured. This effect has numerous practical applications including spectroscopies based on the
electron paramagnetic resonance (EPR) and nuclear magnetic resonance (NMR), which are broadly used
in material science, chemistry, and medicine. Unfortunately, I will not have time to discuss the related
technical issues and methods (in particular, interesting ac pulsing techniques, including the so-called
Ramsey interferometry) in detail, and have to refer the reader to special literature.35
6.6. Quantum-mechanical Golden Rule

One of the results of the previous section, Eq. (102), may be used to derive one of the most important
and nontrivial results of quantum mechanics. For that, let us consider the case when the perturbation
causes quantum transitions from a discrete energy level En’ into a group of eigenstates with a very dense
(essentially continuous) spectrum En – see Fig. 10a.
Fig. 6.10. Deriving the Golden Rule: (a) the energy level scheme, and (b) the function under the integral (108). [In panel (b), the horizontal axis is $\Delta_{nn'}t$.]
If, for all states n of the group, the following conditions are satisfied
$$\left|A_{nn'}\right|^2 \ll \hbar^2\Delta_{nn'}^2 \ll \hbar^2\omega_{nn'}^2, \tag{6.106}$$
then Eq. (102) coincides with the result that would follow from Eq. (90). This means that we may apply
Eq. (102), with the indices n and n’ duly restored, to any level n of our tight group. As a result, the total
probability of having our system transferred from the initial level n’ to that group is
$$W(t) = \sum_n W_n(t) = \frac{4}{\hbar^2}\sum_n \frac{\left|A_{nn'}\right|^2}{\Delta_{nn'}^2}\sin^2\frac{\Delta_{nn'}t}{2}. \tag{6.107}$$
Now comes the main, absolutely beautiful trick: let us assume that the summation over n is
limited to a tight group of very similar states whose matrix elements $A_{nn'}$ are virtually equal (we will
check the validity of this assumption later on), so that we can take $|A_{nn'}|^2$ out of the sum in Eq. (107) and
then replace the sum with the corresponding integral:
35 For introductions see, e.g., J. Wertz and J. Bolton, Electron Spin Resonance, 2nd ed., Wiley, 2007; J. Keeler,
Understanding NMR Spectroscopy, 2nd ed., Wiley, 2010.
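$$W(t) = \frac{4\left|A_{nn'}\right|^2}{\hbar^2}\int\frac{1}{\Delta_{nn'}^2}\sin^2\frac{\Delta_{nn'}t}{2}\,dn = \frac{4\left|A_{nn'}\right|^2\rho_n t}{\hbar}\int\frac{1}{(\Delta_{nn'}t)^2}\sin^2\frac{\Delta_{nn'}t}{2}\,d(-\Delta_{nn'}t), \tag{6.108}$$
where $\rho_n$ is the density of the states n on the energy axis:
$$\text{Density of states:} \quad \rho_n \equiv \frac{dn}{dE_n}. \tag{6.109}$$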
This density and the matrix element $A_{nn'}$ have to be evaluated at $\Delta_{nn'} = 0$, i.e. at energy $E_n = E_{n'} + \hbar\omega$,
and are assumed to be constant within the final state group. At fixed $E_{n'}$, the function under the integral
(108) is even and decreases fast at $|\Delta_{nn'}t| \gg 1$ (see Fig. 10b). Hence we may introduce a dimensionless
integration variable $\xi \equiv -\Delta_{nn'}t$, and extend the integration over it formally from $-\infty$ to $+\infty$. Then the
integral in Eq. (108) is reduced to a table one,36 and yields
$$W(t) = \frac{4\left|A_{nn'}\right|^2\rho_n t}{\hbar}\int_{-\infty}^{+\infty}\frac{1}{\xi^2}\sin^2\frac{\xi}{2}\,d\xi = \frac{4\left|A_{nn'}\right|^2\rho_n t}{\hbar}\,\frac{\pi}{2} \equiv \Gamma t, \tag{6.110}$$
where the constant
$$\text{Golden Rule:} \quad \Gamma = \frac{2\pi}{\hbar}\left|A_{nn'}\right|^2\rho_n \tag{6.111}$$
is called the transition rate.37 This is one of the most famous and useful results of quantum mechanics,
its Golden Rule38, which deserves much discussion.
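The transfer from the sum (6.107) to the linear growth (6.110) can be illustrated numerically. The sketch below assumes a hypothetical group of equidistant final levels with equal matrix elements, so that $\rho_n = 1/(\hbar\,\delta\omega)$ for a level spacing $\hbar\,\delta\omega$; all numbers are illustrative, in units with $\hbar = 1$, and the finite number of levels slightly truncates the integral.

```python
import math

def total_probability(t, amp, spacing, n_levels, hbar=1.0):
    """Sum (6.107) over final levels with detunings (k + 1/2)*spacing."""
    total = 0.0
    for k in range(-n_levels, n_levels):
        detuning = (k + 0.5) * spacing          # never exactly zero
        total += math.sin(detuning * t / 2) ** 2 / detuning ** 2
    return 4 * abs(amp) ** 2 / hbar ** 2 * total

def golden_rule_rate(amp, spacing, hbar=1.0):
    """Eq. (6.111) with rho_n = 1/(hbar*spacing) for equidistant levels."""
    rho = 1.0 / (hbar * spacing)
    return 2 * math.pi / hbar * abs(amp) ** 2 * rho

amp, spacing, n_levels = 1e-3, 0.01, 5000       # illustrative values only
rate = golden_rule_rate(amp, spacing)
for t in (5.0, 10.0, 20.0):
    print(t, total_probability(t, amp, spacing, n_levels), rate * t)
```

Each term of the sum oscillates in time, yet the total grows nearly linearly, closely following $\Gamma t$ as long as $\Gamma t \ll 1$ and $t$ is much shorter than $2\pi/\delta\omega$.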
First of all, let us reproduce the reasoning already used in Sec. 2.5 to show that the meaning of
the rate $\Gamma$ is much deeper than Eq. (110) seems to imply. Indeed, due to the conservation of the total
probability, $W_{n'} + W = 1$, we can rewrite that equation as
$$\dot{W}_{n'}\Big|_{t=0} = -\Gamma. \tag{6.112}$$
Evidently, this result cannot be true for all times, otherwise the probability $W_{n'}$ would become negative.
The reason for this apparent contradiction is that Eq. (110) was obtained in the assumption that initially,
the system was completely on level n’: $W_{n'}(0) = 1$. Now, if at the initial moment the value of $W_{n'}$ is
different, the result (110) has to be multiplied by that number, due to the linear relation (88) between
$da_n/dt$ and $a_{n'}$. Hence, instead of Eq. (112) we get a differential equation similar to Eq. (2.159),
$$\dot{W}_{n'}\Big|_{t\geq 0} = -\Gamma W_{n'}, \tag{6.113}$$
which, for a time-independent $\Gamma$, has the evident solution,
36 See, e.g., MA Eq. (6.12).

37 In some texts, the density of states in Eq. (111) is replaced with a formal expression $\sum_n \delta(E_n - E_{n'} - \hbar\omega)$. Indeed,
applied to a finite energy interval $\Delta E_n$ with $\Delta n \gg 1$ levels, it gives the same result: $\Delta n \equiv (dn/dE_n)\Delta E_n \equiv \rho_n\Delta E_n$.
Such replacement may be technically useful in some cases, but is incorrect for $\Delta n \sim 1$, and hence should be used
with the utmost care, so that for most applications the more explicit form (111) is preferable.
38 Sometimes Eq. (111) is called “Fermi’s Golden Rule”. This is rather unfair, because this result was developed
mostly by the same P. A. M. Dirac in 1927, and Enrico Fermi’s role was not much more than advertising it, under
the name of “Golden Rule No. 2”, in his influential lecture notes on nuclear physics, which were published much
later, in 1950. (To be fair to Fermi, he has never tried to pose as the Golden Rule’s author.)
$$\text{Initial occupancy's decay:} \quad W_{n'}(t) = W_{n'}(0)\,e^{-\Gamma t}, \tag{6.114}$$
describing the exponential decay of the initial state’s occupancy, with the time constant $\tau = 1/\Gamma$.
I am inviting the reader to review this fascinating result again: by the summation of periodic
oscillations (102) over many levels n, we have got an exponential decay (114) of the probability. This
trick becomes possible because the effective range $\Delta E_n$ of the state energies $E_n$ giving substantial
contributions to the integral (108), shrinks with time: $\Delta E_n \sim \hbar/t$.39 However, since most of the decay
(114) takes place within the time interval of the order of $\tau \equiv 1/\Gamma$, the range of the participating final
energies may be estimated as
$$\Delta E_n \sim \frac{\hbar}{\tau} \equiv \hbar\Gamma. \tag{6.115}$$
This estimate is very instrumental for the formulation of conditions of the Golden Rule’s validity. First,
we have assumed that the matrix elements of the perturbation and the density of states are independent
of the energy within the interval (115). This gives the following requirement:
$$\Delta E_n \sim \hbar\Gamma \ll E_n - E_{n'} \sim \hbar\omega. \tag{6.116}$$
Second, for the transfer from the sum (107) to the integral (108), we need the number of states within
that energy interval, $\Delta N_n = \rho_n\Delta E_n$, to be much larger than 1. Merging Eq. (116) with Eq. (92) for all the
energy levels $n'' \neq n, n'$ not participating in the resonant transition, we may summarize all conditions of
the Golden Rule validity as
$$\text{Golden Rule's validity:} \quad \rho_n^{-1} \ll \hbar\Gamma \ll \hbar\left|\omega \pm \omega_{n'n''}\right|. \tag{6.117}$$
(The reader may ask whether I have neglected the condition expressed by the first of Eqs. (106).
However, for $\Delta_{nn'} \sim \Delta E_n/\hbar \sim \Gamma$, this condition is just $|A_{nn'}|^2 \ll (\hbar\Gamma)^2$, so that plugging it into Eq. (111),
$$\Gamma \ll \frac{2\pi}{\hbar}(\hbar\Gamma)^2\rho_n, \tag{6.118}$$
and canceling one $\Gamma$ and one $\hbar$, we see that it coincides with the first relation in Eq. (117) above.)
Let us have a look at whether these conditions may be satisfied in practice, at least in some
cases. For example, let us consider the optical ionization of an atom, with the released electron confined
in a volume of the order of $1\ \text{cm}^3 \equiv 10^{-6}\ \text{m}^3$. According to Eq. (1.90), with E of the order of the atomic
ionization energy $E_n - E_{n'} = \hbar\omega \sim 1$ eV, the density of electron states in that volume is of the order of
$10^{21}$ 1/eV, while the right-hand side of Eq. (117) is of the order of $E_n \sim 1$ eV. Thus the conditions (117)
provide an approximately 20-orders-of-magnitude range for acceptable values of $\hbar\Gamma$. This illustration
should give the reader a taste of why the Golden Rule is applicable to so many situations.
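The order-of-magnitude estimate above is easy to reproduce. Eq. (1.90) is not reproduced in this section; the sketch below assumes it is the standard 3D free-particle density of orbital states, $dN/dE = V m^{3/2}(2E)^{1/2}/2\pi^2\hbar^3$, without a spin factor. The constants are rounded reference values, and the function name is hypothetical.

```python
import math

HBAR = 1.054_571_8e-34      # reduced Planck constant, J*s
M_E = 9.109_383_7e-31       # electron mass, kg
EV = 1.602_176_6e-19        # one electron-volt, J

def free_particle_dos(volume_m3, energy_j, mass_kg=M_E):
    """Assumed 3D free-particle density of orbital states dN/dE, in 1/J."""
    return (volume_m3 * mass_kg ** 1.5 * math.sqrt(2 * energy_j)
            / (2 * math.pi ** 2 * HBAR ** 3))

# Electron confined to 1 cm^3, at an energy of 1 eV
rho_per_ev = free_particle_dos(1e-6, 1.0 * EV) * EV
print(f"rho_n ~ {rho_per_ev:.1e} states per eV")
```

With these inputs the result is a few times $10^{21}$ states per eV, so that $\rho_n^{-1}$ is indeed some twenty orders of magnitude below 1 eV.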
Finally, the physical picture of the initial state’s decay (which will also be the key to our
discussion of quantum-mechanical “open” systems in the next chapter) is also very important.
According to Eq. (114), the external excitation transfers the system into the continuous spectrum of
levels n, and it never comes back to the initial level n’. However, it was derived from the quantum
mechanics of Hamiltonian systems, whose equations are invariant with respect to time reversal. This
39 This is one more appearance of the “energy-time uncertainty relation”, which was discussed in Sec. 2.5.
paradox is a result of our generalization (113) of the exact result (112). This trick, breaking the
time-reversal symmetry, is absolutely adequate for the physics under study. Indeed, some gut feeling of the
physical sense of this irreversibility may be obtained from the following observation. As Eq. (1.86)
illustrates, the distance between the adjacent orbital energy levels tends to zero only if the system’s size
goes to infinity. This means that our assumption of the continuous energy spectrum of the final states n
essentially requires these states to be broadly extended in space – being either free, or essentially free de
Broglie waves. Thus the Golden Rule corresponds to the (physically justified) assumption that in an
infinitely large system, the traveling de Broglie waves excited by a local source and propagating
outward from it, would never come back, and even if they did, unpredictable phase shifts introduced by
minor uncontrollable perturbations on their way would never allow them to sum up in the coherent way
necessary to bring the system back into the initial state n’. (This is essentially the same situation which
was discussed, for a particular 1D wave-mechanical system, in Sec. 2.5.)40
To get a feeling of the Golden Rule at work, let us apply it to the following simple problem –
which is a toy model of the photoelectric effect, briefly discussed in Sec. 1.1(ii). A 1D particle is
initially trapped in the ground state of a narrow potential well, described by Eq. (2.158):
$$U(x) = -\mathcal{W}\delta(x), \quad \text{with } \mathcal{W} > 0. \tag{6.119}$$
Let us calculate the rate $\Gamma$ of the particle’s “ionization” (i.e. its excitation into a group of extended,
delocalized states) by a weak classical sinusoidal force of amplitude $F_0$ and frequency $\omega$, suddenly
turned on at some instant, say t = 0.
As a reminder, the initial localized state (in our current notation, n’) of such a particle was
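already found in Sec. 2.6:
$$\psi_{n'}(x) = \kappa^{1/2}\exp\{-\kappa|x|\}, \quad \text{with } \kappa \equiv \frac{m\mathcal{W}}{\hbar^2}, \quad E_{n'} = -\frac{\hbar^2\kappa^2}{2m} = -\frac{m\mathcal{W}^2}{2\hbar^2}. \tag{6.120}$$
The final, extended states n, with a continuous spectrum, for this problem exist only at energies $E_n > 0$,
so that the excitation rate is different from zero only for frequencies
$$\omega > \omega_{\min} \equiv \frac{\left|E_{n'}\right|}{\hbar} = \frac{m\mathcal{W}^2}{2\hbar^3}. \tag{6.121}$$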
The weak sinusoidal force may be described by the following perturbation Hamiltonian,
$$\hat{H}^{(1)} = -F(t)\hat{x} = -F_0\hat{x}\cos\omega t \equiv -\frac{F_0}{2}\hat{x}\left(e^{i\omega t} + e^{-i\omega t}\right), \quad \text{for } t > 0, \tag{6.122}$$
so that according to Eq. (86), which serves as the amplitude operator’s definition, in this case
$$\hat{A} = \hat{A}^{\dagger} = -\frac{F_0}{2}\hat{x}. \tag{6.123}$$
The matrix elements $A_{nn'}$ that participate in Eq. (111) may be readily calculated in the coordinate
representation:
40 This situation is also similar to the irreversible increase of entropy of macroscopic systems, despite the fact that
their microscopic components obey reversible laws of motion, which is postulated in thermodynamics and
explained in statistical physics – see, e.g., SM Secs. 1.2 and 2.2.
$$A_{nn'} = \int_{-\infty}^{+\infty}\psi_n^{*}(x)\hat{A}(x)\psi_{n'}(x)\,dx = -\frac{F_0}{2}\int_{-\infty}^{+\infty}\psi_n^{*}(x)\,x\,\psi_{n'}(x)\,dx. \tag{6.124}$$
Since, according to Eq. (120), the initial $\psi_{n'}$ is a symmetric function of x, non-vanishing contributions
to this integral are given only by antisymmetric functions $\psi_n(x)$, proportional to $\sin k_n x$, with the wave
number $k_n$ related to the final energy by the well-familiar equality (1.89):
$$\frac{\hbar^2 k_n^2}{2m} = E_n. \tag{6.125}$$
As we know from Sec. 2.6 (see in particular Eq. (2.167) and its discussion), such antisymmetric
functions, with $\psi_n(0) = 0$, are not affected by the zero-centered delta-functional potential (119), so that
their density $\rho_n$ is the same as that in completely free space, and we could use Eq. (1.100). However,
since that relation was derived for traveling waves, it is more prudent to repeat its derivation for
standing waves, confining them to an artificial segment [–l/2, +l/2], long in the sense
$$k_n l, \ \kappa l \gg 1, \tag{6.126}$$
so it does not affect the initial localized state and the excitation process. Then the confinement
requirement $\psi_n(\pm l/2) = 0$ immediately yields the condition $k_n l/2 = n\pi$, so that Eq. (1.100) is indeed valid,
but only for positive values of $k_n$, because $\sin k_n x$ with $k_n \to -k_n$ does not describe an independent
standing-wave eigenstate. Hence the final state density is
$$\rho_n \equiv \frac{dn}{dE_n} = \frac{dn}{dk_n}\bigg/\frac{dE_n}{dk_n} = \frac{l}{2\pi}\bigg/\frac{\hbar^2 k_n}{m} = \frac{lm}{2\pi\hbar^2 k_n}. \tag{6.127}$$
It may look troubling that the density of states depends on the artificial segment’s length l, but
the same l also participates in the final wavefunctions’ normalization factor,41
$$\psi_n = \left(\frac{2}{l}\right)^{1/2}\sin k_n x, \tag{6.128}$$
and hence in the matrix element (124):
$$A_{nn'} = -\frac{F_0}{2}\left(\frac{2\kappa}{l}\right)^{1/2}\int_{-l/2}^{+l/2}\sin k_n x\; e^{-\kappa|x|}\,x\,dx \equiv -\frac{F_0}{2i}\left(\frac{2\kappa}{l}\right)^{1/2}\left[\int_0^{l/2}e^{(ik_n-\kappa)x}\,x\,dx - \int_0^{l/2}e^{-(ik_n+\kappa)x}\,x\,dx\right]. \tag{6.129}$$
These two integrals may be readily worked out by parts. Taking into account that due to the condition
(126), their upper limits may be extended to $\infty$, the result is
$$A_{nn'} = -\left(\frac{2\kappa}{l}\right)^{1/2}F_0\,\frac{2k_n\kappa}{\left(k_n^2+\kappa^2\right)^2}. \tag{6.130}$$
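The closed form (6.130) can be checked by a direct quadrature of the integral in Eq. (6.129). The sketch below uses the composite Simpson rule on the half-segment [0, l/2], where the integrand of the first form of Eq. (6.129) is even, so that the prefactor $F_0/2$ becomes $F_0$; parameter values are illustrative, and both function names are hypothetical.

```python
import math

def matrix_element_numeric(k, kappa, length, f0=1.0, steps=20000):
    """Simpson-rule value of -F0*sqrt(2*kappa/l) * int_0^{l/2} sin(kx) exp(-kappa x) x dx."""
    upper = length / 2
    h = upper / steps                       # steps must be even for Simpson's rule

    def integrand(x):
        return math.sin(k * x) * math.exp(-kappa * x) * x

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return -f0 * math.sqrt(2 * kappa / length) * total * h / 3

def matrix_element_closed(k, kappa, length, f0=1.0):
    """Eq. (6.130), valid when k*l and kappa*l are both much larger than 1."""
    return -math.sqrt(2 * kappa / length) * f0 * 2 * k * kappa / (k ** 2 + kappa ** 2) ** 2

k, kappa, length = 1.5, 1.0, 60.0           # kappa*l/2 = 30, so condition (6.126) holds
print(matrix_element_numeric(k, kappa, length), matrix_element_closed(k, kappa, length))
```

The agreement relies on the condition (6.126): for a short segment the exponential tail cut off at l/2 would no longer be negligible, and the two numbers would differ.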
Note that the matrix element is a smooth function of $k_n$ (and hence of $E_n$), so that an important condition
of the Golden Rule, the virtual constancy of $A_{nn'}$ on the interval $\Delta E_n \sim \hbar\Gamma \ll E_n$, is satisfied. So, the
general Eq. (111) is reduced, for our problem, to the following expression:
41 The normalization to infinite volume, using Eq. (4.263), is also possible, but physically less transparent.
$$\Gamma = \frac{2\pi}{\hbar}\left[\left(\frac{2\kappa}{l}\right)^{1/2}F_0\,\frac{2k_n\kappa}{\left(k_n^2+\kappa^2\right)^2}\right]^2\frac{lm}{2\pi\hbar^2 k_n} = \frac{8F_0^2 mk_n\kappa^3}{\hbar^3\left(k_n^2+\kappa^2\right)^4}, \tag{6.131}$$
which is independent of the artificially introduced l, thus justifying its use.

Note that due to the above definitions of $k_n$ and $\kappa$, the expression in the parentheses in the
denominator of the last expression does not depend on the potential well's "weight" $W$, and is a function
of only the excitation frequency $\omega$ (and the particle's mass):
$$\frac{\hbar^2\left(k_n^2+\kappa^2\right)}{2m} \equiv E_n - E_{n'} = \hbar\omega. \tag{6.132}$$
As a result, Eq. (131) may be recast simply as
$$\Gamma = \frac{F_0^2 W^3 k_n}{2\hbar(\hbar\omega)^4}. \tag{6.133}$$
What is hidden here is that $k_n$, defined by Eq. (125) with $E_n = E_{n'} + \hbar\omega$, is a function of the external
force's frequency, changing as $\omega^{1/2}$ at $\omega \gg \omega_{\min}$ (so that $\Gamma$ drops as $\omega^{-7/2}$ at $\omega \to \infty$), and as $(\omega - \omega_{\min})^{1/2}$
when $\omega$ approaches the "red boundary" (121) of the ionization effect, so that $\Gamma \propto (\omega - \omega_{\min})^{1/2} \to 0$ in
that limit as well.
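The frequency dependence just described may be explored with a short script. This is only a sketch: it assumes the standard delta-well relations $\kappa = mW/\hbar^2$ and $E_{n'} = -\hbar^2\kappa^2/2m$ (which are what makes Eqs. (131) and (133) equivalent), and the function name and the default unit values $F_0 = W = m = \hbar = 1$ are illustrative choices, not anything prescribed by the text.

```python
import math

def ionization_rate(omega, F0=1.0, W=1.0, m=1.0, hbar=1.0):
    """Return the rate in the two forms of Eqs. (6.131) and (6.133)."""
    kappa = m * W / hbar**2                    # assumed delta-well result
    E_bound = -hbar**2 * kappa**2 / (2 * m)    # energy of the localized state
    E_final = E_bound + hbar * omega           # Eq. (6.132)
    if E_final <= 0:
        return 0.0, 0.0                        # below the "red boundary"
    k = math.sqrt(2 * m * E_final) / hbar
    rate_131 = 8 * F0**2 * m * k * kappa**3 / (hbar**3 * (k**2 + kappa**2) ** 4)
    rate_133 = F0**2 * W**3 * k / (2 * hbar * (hbar * omega) ** 4)
    return rate_131, rate_133

for omega in (0.4, 0.51, 1.0, 10.0, 100.0):
    print(omega, ionization_rate(omega))
```

In these units the red boundary is at $\omega_{\min} = 1/2$; doubling a very large $\omega$ should reduce the rate by a factor close to $2^{7/2}$.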
A conceptually very similar, but a bit more involved analysis of this effect in a more realistic
3D case, namely the hydrogen atom's ionization by an optical wave, is left for the reader's exercise.
6.7. Golden Rule for step-like perturbations

Now let us reuse some of our results for a perturbation being turned on at t = 0, but after that
time-independent:
[Step-like perturbation]
$$\hat H^{(1)}(t) = \begin{cases} 0, & \text{for } t < 0, \\ \hat H = \text{const}, & \text{for } t \ge 0. \end{cases} \tag{6.134}$$
A superficial comparison of this equality and the former Eq. (86) seems to indicate that we may use all
our previous results, taking $\omega = 0$ and replacing $\hat A + \hat A^\dagger$ with $\hat H^{(1)}$. However, that conclusion (which
would give us a wrong factor of 2 in the result) does not take into account the fact that analyzing both
the two-level approximation in Sec. 5, and the Golden Rule in Sec. 6, we have dropped the second
(non-resonant) term in Eq. (90). In our current case (134), with $\omega = 0$, there is no such difference between
these terms. This is why it is more prudent to use the general Eq. (84),
$$i\hbar\dot a_n = \sum_{n'} a_{n'} H_{nn'} e^{i\omega_{nn'} t}, \tag{6.135}$$
in which the matrix element of the perturbation is now time-independent at t > 0. We see that it is
formally equivalent to Eq. (88) with only the first (resonant) term kept, if we make the following
replacements:
$$\hat A \to \hat H, \qquad \Delta_{nn'} \equiv \omega - \omega_{nn'} \to -\omega_{nn'}. \tag{6.136}$$
Let us use this equivalency to consider the results of coupling between a discrete-energy state n’,
into which the particle is initially placed, and a dense group of states with a quasi-continuum spectrum,
in the same energy range. Figure 11a shows an example of such a system: a particle is initially (say, at t
= 0) placed into a potential well separated by a penetrable potential barrier from a formally infinite
region with a continuous energy spectrum. Let me hope that the physical discussion in the last section
makes the outcome of such an experiment evident: the particle will gradually and irreversibly tunnel out
of the well, so that the probability Wn’(t) of its still residing in the well will decay in accordance with Eq.
(114). The rate of this decay may be found by making the replacements (136) in Eq. (111):
2 2
 H nn '  n , (6.137)

where the states n and n’ now have virtually the same energy.42
Fig. 6.11. Tunneling from a discrete-energy state $n'$: (a) to a state continuum, and (b) to another
discrete-energy state $n$.
It is very informative to compare this result, semi-quantitatively, with Eq. (105) for a symmetric
(En = En’) system of two potential wells separated by a similar potential barrier – see Fig. 11b. For the
symmetric case, i.e.  = 0, Eq. (105) is reduced to simply
$$\Omega = \frac{1}{\hbar}\left|H_{nn'}\right|_{\rm con}. \tag{6.138}$$
Here I have used the index “con” (from “confinement”) to emphasize that this matrix element is
somewhat different from the one participating in Eq. (137), even if the potential barriers are similar.
Indeed, in the latter case, the matrix element,
$$H_{nn'} \equiv \langle n|\hat H|n'\rangle = \int \psi_n^* \hat H \psi_{n'}\,dx, \tag{6.139}$$
has to be calculated for two wavefunctions $\psi_n$ and $\psi_{n'}$ confined to spatial intervals of the same scale $l_{\rm con}$,
while in Eq. (137), the wavefunctions $\psi_n$ are extended over a much larger distance $l \gg l_{\rm con}$ – see Fig.
11. As Eq. (128) tells us, in the 1D model this means an additional small factor of the order of $(l_{\rm con}/l)^{1/2}$.
Now using Eq. (128) as a crude but suitable model for the final-state wavefunctions, we arrive at the
following estimate, independent of the artificially introduced length $l$:
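$$\Gamma \sim \frac{2\pi}{\hbar}\left|H_{nn'}\right|^2_{\rm con}\frac{l_{\rm con}}{l}\,\rho_n \sim \frac{2\pi}{\hbar}\left|H_{nn'}\right|^2_{\rm con}\frac{l_{\rm con}}{l}\,\frac{lm}{2\pi\hbar^2 k_n} \sim \frac{\left|H_{nn'}\right|^2_{\rm con}}{\hbar\,\Delta E_{n'}} \equiv \frac{\hbar\Omega^2}{\Delta E_{n'}}, \tag{6.140}$$
where $\Delta E_{n'} \sim \hbar^2/m l_{\rm con}^2$ is the scale of the differences between the eigenenergies of the particle in an
unperturbed potential well. Since the condition of validity of Eq. (138) is $\hbar\Omega \ll \Delta E_{n'}$, we see that
$$\Gamma \sim \frac{\hbar\Omega}{\Delta E_{n'}}\,\Omega \ll \Omega. \tag{6.141}$$
42 The condition of validity of Eq. (137) is again given by Eq. (117), just with $\omega = 0$ in the upper limit for $\Gamma$.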
This (sufficiently general43) perturbative result confirms the conclusion of a more particular
analysis carried out at the end of Sec. 2.6: the rate of the (irreversible) quantum tunneling into a state
continuum is always much lower than the frequency of (reversible) quantum oscillations between
discrete states separated by the same potential barrier – at least for the case when both are much lower
than $\Delta E_{n'}/\hbar$, so that the perturbation theory is valid. A very handwaving interpretation of this result is
that the particle oscillates between the confined state in the well and the space-extended states behind
the barrier many times before finally "deciding to perform" an irreversible transition into the unconfined
continuum. This qualitative picture is consistent with experimentally observable effects of dispersive
electromagnetic environments on electron tunneling.44
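The exponential decay at the Golden-Rule rate (137) may be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the tunneling problem of Fig. 11a: one discrete level (energy 0) is coupled, with the same real matrix element `V`, to a finite band of equidistant levels with spacing `delta`, so that $\rho_n = 1/\delta$. Units with $\hbar = 1$, the band size `K`, and all names are hypothetical choices. The Schrödinger equation for the amplitudes is integrated with the classical fourth-order Runge-Kutta scheme.

```python
import math

def survival_probability(V=0.05, delta=0.05, K=100, t_end=3.0, dt=0.01):
    """Integrate i da/dt = H a (hbar = 1) for one level coupled to 2K+1 levels E_k = k*delta."""
    energies = [k * delta for k in range(-K, K + 1)]

    def rhs(a0, b):
        da0 = -1j * V * sum(b)
        db = [-1j * (E * bk + V * a0) for E, bk in zip(energies, b)]
        return da0, db

    a0 = 1.0 + 0.0j                      # particle initially in the discrete state
    b = [0.0j] * len(energies)
    for _ in range(round(t_end / dt)):   # classical RK4 steps
        k1a, k1b = rhs(a0, b)
        k2a, k2b = rhs(a0 + 0.5 * dt * k1a, [x + 0.5 * dt * y for x, y in zip(b, k1b)])
        k3a, k3b = rhs(a0 + 0.5 * dt * k2a, [x + 0.5 * dt * y for x, y in zip(b, k2b)])
        k4a, k4b = rhs(a0 + dt * k3a, [x + dt * y for x, y in zip(b, k3b)])
        a0 += dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6
        b = [x + dt * (p + 2 * q + 2 * r + s) / 6
             for x, p, q, r, s in zip(b, k1b, k2b, k3b, k4b)]
    return abs(a0) ** 2

V, delta, t = 0.05, 0.05, 3.0
gamma = 2 * math.pi * V**2 / delta       # Eq. (6.137) with rho_n = 1/delta, hbar = 1
print(survival_probability(V, delta, t_end=t), math.exp(-gamma * t))
```

The two printed numbers should be close, with a small residual difference due to the finite width of the model band; the agreement is expected only for times well below the recurrence time $2\pi\hbar/\delta$ of the discrete quasi-continuum.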
Let me conclude this section (and this chapter) with the application of Eq. (137) to a very
important case, which will provide a smooth transition to the next chapter’s topic. Consider a composite
system consisting of two component systems, a and b, with the energy spectra sketched in Fig. 12.
Fig. 6.12. Energy relaxation in system a (levels $n'_a$, $n_a$) due to its weak coupling, via the interaction
$\hat H^{(1)} = \hat A(a)\hat B(b)$, to system b (levels $n_b$, $n'_b$), which serves as the environment of a.
Let the systems be completely independent initially. The independence means that in the absence
of their coupling, the total Hamiltonian of the system may be represented as a sum of two operators:
$$\hat H^{(0)} = \hat H_a(a) + \hat H_b(b), \tag{6.142}$$
where the arguments a and b symbolize the non-overlapping sets of the degrees of freedom of the two
systems. Such operators, belonging to their individual, different Hilbert spaces, naturally commute.
Similarly, the eigenkets of the system may be naturally factored as
$$|n\rangle = |n_a\rangle \otimes |n_b\rangle. \tag{6.143}$$
The direct product sign $\otimes$ is used here (and below) to denote the formation of a joint ket-vector from the
kets of the independent systems, belonging to different Hilbert spaces. Evidently, the order of operands
in such a product may be changed at will. As a result, the system's eigenenergies separate into a sum, just as the
Hamiltonian (142) does:
$$\hat H^{(0)}|n\rangle = \left(\hat H_a + \hat H_b\right)|n_a\rangle\otimes|n_b\rangle = \left(\hat H_a|n_a\rangle\right)\otimes|n_b\rangle + \left(\hat H_b|n_b\rangle\right)\otimes|n_a\rangle = \left(E_{n_a} + E_{n_b}\right)|n\rangle. \tag{6.144}$$
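Eq. (144) is easy to verify on a small example. The sketch below assumes two hypothetical two-level component systems with arbitrarily chosen Hamiltonian matrices `H_a` and `H_b`; it builds $\hat H^{(0)}$ as $\hat H_a \otimes \hat I + \hat I \otimes \hat H_b$ in the product basis (with the operand order fixed as a, then b) and checks that the product of eigenvectors is an eigenvector with the summed eigenvalue. All helper names are illustrative.

```python
def kron(A, B):
    """Kronecker (direct) product of two square matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def kron_vec(u, v):
    """Components of the product ket |u> (x) |v>."""
    return [x * y for x in u for y in v]

H_a = [[0.0, 1.0], [1.0, 0.0]]   # eigenvector (1, 1)/sqrt(2), eigenvalue +1
H_b = [[2.0, 0.0], [0.0, 5.0]]   # eigenvector (0, 1), eigenvalue 5
H0 = add(kron(H_a, identity(2)), kron(identity(2), H_b))   # Eq. (6.142)

n_a, E_a = [2 ** -0.5, 2 ** -0.5], 1.0
n_b, E_b = [0.0, 1.0], 5.0
n = kron_vec(n_a, n_b)                    # Eq. (6.143)
lhs = matvec(H0, n)
rhs = [(E_a + E_b) * x for x in n]        # Eq. (6.144)
print(lhs, rhs)
```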
43 It is straightforward to verify that the estimate (141) is valid for similar problems of any spatial dimensionality,
not just for the 1D case we have analyzed.
44 See, e.g., P. Delsing et al., Phys. Rev. Lett. 63, 1180 (1989).
In such composite systems, the relatively weak interaction of their components may usually be
represented as a bilinear product of two Hermitian operators, each depending only on the degrees of
freedom of one component system:
$$\hat H^{(1)} = \hat A(a)\,\hat B(b). \tag{6.145}$$
A very common example of such an interaction is the electric-dipole interaction between an
atomic-scale system (with a linear size of the order of the Bohr radius $r_B \sim 10^{-10}$ m) and the electromagnetic
field at optical frequencies $\omega \sim 10^{16}$ s$^{-1}$, with the wavelength $\lambda = 2\pi c/\omega \sim 10^{-6}$ m $\gg r_B$: 45
$$\hat H^{(1)} = -\hat{\mathbf d}\cdot\hat{\mathbf E}, \quad \text{with } \hat{\mathbf d} \equiv \sum_k q_k \hat{\mathbf r}_k, \tag{6.146}$$
where the electric dipole moment $\mathbf d$ depends only on the positions $\mathbf r_k$ of the charged particles (numbered
with index $k$) of the atomic system, while the electric field $\mathbf E$ is a function of only the
electromagnetic field's degrees of freedom – to be discussed in Chapter 9 below.
Returning to the general situation shown in Fig. 12, if the component system a was initially in an
excited state n’a, the interaction (145), turned on at some moment of time, may bring it into another
discrete state na of a lower energy – for example, the ground state. In the process of this transition, the
released energy, in the form of an energy quantum
$$\hbar\omega \equiv E_{n'a} - E_{na}, \tag{6.147}$$
is picked up by the system b:
$$E_{nb} = E_{n'b} + \hbar\omega \equiv E_{n'b} + \left(E_{n'a} - E_{na}\right), \tag{6.148}$$
so that the total energy E = Ea + Eb of the system does not change. (If the states na and n’b are the ground
states of the two component systems, as they are in most applications of this analysis, and we take the
ground state energy Eg = Ena + En’b of the composite system for the reference, then Eq. (148) gives
merely Enb = En’a.) If the final state nb of the system b is inside a state group with a quasi-continuous
energy spectrum (Fig. 12), the process has the exponential character (114)46 and may be interpreted as
the effect of energy relaxation of the system a, with the released energy quantum $\hbar\omega$ absorbed by the
system b. Note that since the quasi-continuous spectrum essentially requires a system of large spatial
size, such a model is very convenient for the description of the environment b of the quantum system a. (In
physics, the "environment" typically means all the Universe – less the system under consideration.)
If the relaxation rate $\Gamma$ is sufficiently low, it may be described by the Golden Rule (137). Since
the perturbation (145) does not depend on time explicitly, and the total energy $E$ does not change, this
relation, with the account of Eqs. (143) and (145), takes the form
[Golden Rule for coupled systems]
$$\Gamma = \frac{2\pi}{\hbar}\left|A_{nn'}\right|^2\left|B_{nn'}\right|^2\rho_n, \quad \text{where } A_{nn'} \equiv \langle n_a|\hat A|n'_a\rangle, \text{ and } B_{nn'} \equiv \langle n_b|\hat B|n'_b\rangle, \tag{6.149}$$
and $\rho_n$ is the density of the final states of the system b at the relevant energy (147). In particular, Eq.
(149), with the dipole Hamiltonian (146), enables a very straightforward calculation of the natural
linewidth of atomic electric-dipole transitions. However, such a calculation has to be postponed until
Chapter 9, in which we will discuss the electromagnetic field quantization – i.e., the exact nature of the
45See, e.g., EM Sec. 3.1, in particular Eq. (3.16), in which letter p is used for the electric dipole moment.
46Such process is spontaneous: it does not require any external agent, and starts as soon as either the interaction
(145) has been turned on, or (if it is always on) as soon as the system a is placed into the excited state n’a.
states $n_b$ and $n'_b$ for this problem, and hence will be able to calculate $B_{nn'}$ and $\rho_n$. Instead, I will now
proceed to a general discussion of the effects of quantum systems interaction with their environment,
toward which the situation shown in Fig. 12 provides a clear conceptual path.
6.1. Use Eq. (14) to prove the following general form of the Hellmann-Feynman theorem (whose
proof in the wave-mechanics domain was the task of Problem 1.5):
$$\frac{\partial E_n}{\partial\lambda} = \left\langle n\left|\frac{\partial\hat H}{\partial\lambda}\right|n\right\rangle,$$
where $\lambda$ is an arbitrary c-number parameter.
6.2. Establish a relation between Eq. (16) and the result of the classical theory of weakly
anharmonic (“nonlinear”) oscillations at negligible damping.
Hint: You may like to use N. Bohr’s reasoning discussed in Problem 1.1.
6.3. A weak, time-independent force $F$ is exerted on a 1D particle that was placed into a
hard-wall potential well
$$U(x) = \begin{cases} 0, & \text{for } 0 < x < a, \\ +\infty, & \text{otherwise.} \end{cases}$$
Calculate, sketch, and discuss the 1st-order perturbation of its ground-state wavefunction.
6.4. A time-independent force $\mathbf F = \mu(\mathbf n_x y + \mathbf n_y x)$, where $\mu$ is a small constant, is applied to a 3D
harmonic oscillator of mass $m$ and frequency $\omega_0$, located at the origin. Calculate, in the first order of the
perturbation theory, the effect of the force upon the ground state energy of the oscillator, and its lowest
excited energy level. How small should the constant $\mu$ be for your results to be quantitatively correct?
6.5. A 1D particle of mass $m$ is localized at a narrow potential well that may be approximated
with a delta function:
$$U(x) = -W\delta(x), \quad \text{with } W > 0.$$
Calculate the change of its ground state energy by an additional weak, time-independent force $F$, in the
first non-vanishing approximation of the perturbation theory. Discuss the limits of validity of this result,
taking into account that at $F \ne 0$, the localized state of the particle is metastable.
6.6. Use the perturbation theory to calculate the eigenvalues of the operator $\hat L^2$ in the limit $|m| \approx l \gg 1$,
by purely wave-mechanical means.
Hint: Try the following substitution: $\Theta(\theta) = f(\theta)/\sin^{1/2}\theta$.
6.7. In the lowest non-vanishing order of the perturbation theory, calculate the shift of the
ground-state energy of an electrically charged spherical rotator (i.e. a particle of mass m, free to move
over a spherical surface of radius R) due to a weak, uniform, time-independent electric field E.
6.8. Use the perturbation theory to evaluate the effect of a time-independent, uniform electric
field E on the ground state energy Eg of a hydrogen atom. In particular:
(i) calculate the 2nd-order shift of Eg, neglecting the extended unperturbed states with E > 0, and
bring the result to the simplest analytical form you can,
(ii) find the lower and the upper bounds on the shift, and
(iii) discuss the simplest experimental manifestations of this quadratic Stark effect.
6.9. A particle of mass $m$, with electric charge $q$, is in its ground s-state with a given energy $E_g < 0$,
being localized by a very short-range, spherically-symmetric potential well. Calculate its static
electric polarizability $\alpha$.
6.10. In some atoms, the charge-screening effect of other electrons on the motion of each of them
may be reasonably well approximated by the replacement of the Coulomb potential (3.190), $U = -C/r$,
with the so-called Hulthén potential
$$U = -\frac{C/a}{\exp\{r/a\} - 1} \to -C \times \begin{cases} 1/r, & \text{for } r \ll a, \\ \exp\{-r/a\}/a, & \text{for } a \ll r. \end{cases}$$
Assuming that the effective screening radius $a$ is much larger than $r_0 \equiv \hbar^2/mC$, use the perturbation
theory to calculate the energy spectrum of a single particle of mass $m$, moving in this potential, in the
lowest order needed to lift the $l$-degeneracy of the levels.
6.11. In the lowest non-vanishing order of the perturbation theory, calculate the correction to
energies of the ground state and all lowest excited states of a hydrogen-like atom/ion, due to electron’s
penetration into its nucleus, modeling it as a spinless, uniformly charged sphere of radius R << rB/Z.
6.12. Prove that the kinetic-relativistic correction operator (48) indeed has only diagonal matrix
elements in the basis of unperturbed Bohr atom states (3.200).
6.13. Calculate the lowest-order relativistic correction to the ground-state energy of a 1D
harmonic oscillator.
6.14. Use the perturbation theory to calculate the contribution to the magnetic susceptibility $\chi_m$
of a dilute gas, that is due to the orbital motion of a single electron inside each gas particle. Spell out
your result for a spherically-symmetric ground state of the electron, and give an estimate of the
magnitude of this orbital susceptibility.
6.15. How to calculate the energy level degeneracy lifting, by a time-independent perturbation,
in the 2nd order of the perturbation in $\hat H^{(1)}$, assuming that it is not lifted in the 1st order? Carry out such a
calculation for a plane rotator of mass $m$ and radius $R$, carrying electric charge $q$, and placed into a
weak, uniform, constant electric field E.
6.16.* The Hamiltonian of a quantum system is slowly changed in time.

(i) Develop a theory of quantum transitions in the system, and spell out its result in the 1st order
in the speed of the change.
(ii) Use the 1st-order result to calculate the probability that a finite-time pulse of a slowly
changing force F(t) drives a 1D harmonic oscillator, initially in its ground state, into an excited state.
(iii) Compare the last result with the exact one.
6.17. Use the single-particle model to calculate the complex electric permittivity $\varepsilon(\omega)$ of a dilute
gas of similar atoms, due to their induced electric polarization by a weak external ac field, for a field
frequency $\omega$ very close to one of quantum transition frequencies $\omega_{nn'}$. Based on the result, calculate and
estimate the absorption cross-section of each atom.
Hint: In the single-particle model, atom’s properties are determined by Z similar, non-interacting
electrons, each moving in a similar static attracting potential, generally different from the Coulomb one,
because it is contributed not only by the nucleus, but also by other electrons.
6.18. Use the solution of the previous problem to generalize the expression for the London
dispersion force between two atoms (whose calculation in the harmonic-oscillator model was the subject
of Problems 3.16 and 5.15) to the single-particle model with an arbitrary energy spectrum.
6.19. Use the solution of the previous problem to calculate the potential energy of interaction of
two hydrogen atoms, both in their ground state, separated by distance r >> rB.
6.20. In a certain quantum system, distances between the three lowest energy levels are slightly
different – see the figure on the right ($\xi \ll \omega_{1,2}$). Assuming that the involved matrix elements of the
perturbation Hamiltonian are known, and are all proportional to the external ac field's amplitude, find
the time necessary to populate the first excited level almost completely (with a given precision $\varepsilon \ll 1$),
using the Rabi oscillation effect, if at t = 0 the system is completely in its ground state.
(Figure: energy levels $E_0$, $E_1$, and $E_2$, with the level spacings $\hbar\omega_1$ and $\hbar\omega_2$ differing by a small amount set by $\xi$.)
6.21.* Analyze the possibility of a slow transfer of a system from one of its energy levels to another
one (in the figure on the right, from level 1 to level 3), using the scheme shown in that figure, in which
the monochromatic external excitation amplitudes $A_+$ and $A_-$ may be slowly changed at will.
(Figure: energy levels $E_1$, $E_2$, and $E_3$, with two external excitations of amplitudes $A_+$ and $A_-$.)
6.22. A weak external force pulse F(t), of a finite time duration, is applied to a 1D harmonic
oscillator that initially was in its ground state.
(i) Calculate, in the lowest non-vanishing order of the perturbation theory, the probability that
the pulse drives the oscillator into its lowest excited state.
(ii) Compare the result with the exact solution of the problem.
(iii) Spell out the perturbative result for a Gaussian-shaped waveform,
$$F(t) = F_0 \exp\{-t^2/\tau^2\},$$
and analyze its dependence on the scale $\tau$ of the pulse duration.
6.23. A spatially-uniform, but time-dependent external electric field E(t) is applied, starting from
t = 0, to a charged plane rotator, initially in its ground state.
(i) Calculate, in the lowest non-vanishing order in the field’s strength, the probability that by
time t > 0, the rotator is in its nth excited state.
(ii) Spell out and analyze your results for a constant-magnitude field rotating, with a constant
angular velocity $\omega$, within the rotator's plane.
(iii) Do the same for a monochromatic field of frequency $\omega$, with a fixed polarization.
6.24. A spin-½ with a gyromagnetic ratio $\gamma$ is placed into a magnetic field including a
time-independent component B0, and a perpendicular field of a constant magnitude Br, rotated with a
constant angular velocity $\omega$. Can this magnetic resonance problem be reduced to one already discussed
in Chapter 6?
6.25. Develop a general theory of quantum excitations of the higher levels of a discrete-spectrum
system, initially in the ground state, by a weak time-dependent perturbation, up to the 2nd order. Spell
out and discuss the result for the case of monochromatic excitation, with a nearly perfect tuning of its
frequency $\omega$ to the half of a certain quantum transition frequency $\omega_{n0} \equiv (E_n - E_0)/\hbar$.
6.26. A heavy, relativistic particle, with electric charge $q = Ze$, passes by a hydrogen atom,
initially in its ground state, with an impact parameter $b$ within the range $r_B \ll b \ll r_B/\alpha$, where
$\alpha \approx 1/137$ is the fine structure constant. Calculate the probabilities of the atom's transition to its lowest
excited states.
6.27. A particle of mass $m$ is initially in the localized ground state, with energy $E_g < 0$, of a very
small, spherically-symmetric potential well. Calculate the rate of its delocalization by an applied
classical force $\mathbf F(t) = \mathbf n F_0 \cos\omega t$ with a time-independent direction $\mathbf n$.
6.28.* Calculate the rate of ionization of a hydrogen atom, initially in its ground state, by a
classical, linearly polarized electromagnetic wave with an electric field's amplitude $E_0$, and a frequency
$\omega$ within the range
$$\hbar/m_e r_B^2 \ll \omega \ll c/r_B,$$
where $r_B$ is the Bohr radius. Recast your result in terms of the cross-section of electromagnetic wave
absorption. Discuss briefly what changes of the theory would be necessary if either of the above
conditions had been violated.
6.29.* Use the quantum-mechanical Golden Rule to derive the general expression for the electric
current I through a weak tunnel junction between two conductors, biased with dc voltage V, treating the
conductors as degenerate Fermi gases of electrons with negligible direct interaction. Simplify the result
in the low-voltage limit.
Hint: The electric current flowing through a weak tunnel junction is so low that it does not
substantially perturb the electron states inside each conductor.
6.30.* Generalize the result of the previous problem to the case when a weak tunnel junction is
biased with voltage $V(t) = V_0 + A\cos\omega t$, with $\hbar\omega$ generally comparable with $eV_0$ and $eA$.
6.31.* Use the quantum-mechanical Golden Rule to derive the Landau-Zener formula (2.257).
Chapter 7. Open Quantum Systems

This chapter discusses the effects of a weak interaction of a quantum system with its environment. Some
part of this material is on the fine line between quantum mechanics and (quantum) statistical physics.
Here I will only cover those aspects of the latter field1 that are of key importance for the major goals of
this course, including the discussion of quantum measurements in Chapter 10.
7.1. Open systems, and the density matrix

All the way until the last part of the previous chapter, we have discussed quantum systems
isolated from their environment. Indeed, from the very beginning, we have assumed that we are dealing
with the statistical ensembles of systems as similar to each other as only allowed by the laws of quantum
mechanics. Each member of such an ensemble, called pure or coherent, may be described by the same
state vector $|\alpha\rangle$ – in the wave mechanics case, by the same wavefunction $\Psi_\alpha$. Even the discussion at the
end of the last chapter, in which one component system (in Fig. 6.12, system b) may be used as a model
of the environment of its counterpart (system a), was still based on the assumption of a pure initial state
(6.143) of the composite system. If the interaction of the two components of such a system is described
by a certain Hamiltonian (the one given by Eq. (6.145) for example), and the energy spectrum of each
component system is discrete, for state $\alpha$ of the composite system at an arbitrary instant we may write
$$|\alpha\rangle = \sum_n \alpha_n |n\rangle = \sum_n \alpha_n |n_a\rangle \otimes |n_b\rangle, \tag{7.1}$$
with a unique correspondence between the eigenstates $n_a$ and $n_b$.
However, in many important cases, our knowledge of a quantum system’s state is even less
complete.2 These cases fall into two categories. The first case is when a relatively simple quantum
system s of our interest (say, an electron or an atom) is in a weak3 but substantial contact with its
environment e – here understood in the most general sense, say, as the whole Universe less system s
– see Fig. 1. Then there is virtually no chance of making two or more experiments with exactly the same
composite system because that would imply a repeated preparation of the whole environment (including
the experimenter :-) in a certain quantum state – a rather challenging task, to put it mildly. Then it makes
much more sense to consider a statistical ensemble of another kind – a mixed ensemble, with random
states of the environment, though possibly with its macroscopic parameters (e.g., temperature, pressure,
etc.) known with high precision. Such ensembles will be the focus of the analysis in this chapter.
Much of this analysis will pertain also to another category of cases – when the system of our
interest is isolated from its environment, at present, with acceptable precision, but our knowledge of its
state is still incomplete for some other reason. Most typically, the system could be in contact with its
1 A broader discussion of statistical mechanics and physical kinetics, including those of quantum systems, may be
found in the SM part of this series.
2 Indeed, a system, possibly apart from our Universe as a whole (who knows? – see below), is never exactly
coherent, though in many cases, such as the ones discussed in the previous chapters, deviations from the
coherence may be ignored with acceptable accuracy.
3 If the interaction between a system and its environment is very strong, their very partition is impossible.
environment at earlier times, and its reduction to a pure state is impracticable. So, this second category
of cases may be considered as a particular case of the first one, and may be described by the results of its
analysis, with certain simplifications – which will be spelled out in appropriate places of my narrative.
Fig. 7.1. A quantum system of interest (s), its environment (e), and their weak interaction
(VERY schematically :-).
In classical physics, the analysis of mixed statistical ensembles is based on the notion of the
probability W (or the probability density w) of each detailed (“microscopic”) state of the system of
interest.4 Let us see how such an ensemble may be described in quantum mechanics. In the case when
the coupling between the system of our interest and its environment is so weak that they may be clearly
separated, we can still use state vectors of their states, defined in completely different Hilbert spaces.
Then the most general quantum state of the whole Universe, still assumed to be pure,5 may be described
as the following linear superposition:
[Universe: quantum state]
$$|\alpha\rangle = \sum_{j,k} \alpha_{jk} |s_j\rangle \otimes |e_k\rangle. \tag{7.2}$$
The "only" difference of such a state from the superposition described by Eq. (1), is that there is
no one-to-one correspondence between the states of our system and its environment. In other words, a
certain quantum state $s_j$ of the system of interest may coexist with different states $e_k$ of its environment.
This is exactly the quantum-mechanical description of a mixed state of the system s.
Of course, the huge size of the Hilbert space of the environment, i.e. of the number of the $|e_k\rangle$
factors in the superposition (2), strips us of any practical opportunity to make direct calculations using
that sum. For example, according to the basic Eq. (4.125), to find the expectation value of an arbitrary
observable $A$ in the state (2), we would need to calculate the long bracket
$$\langle A\rangle = \langle\alpha|\hat A|\alpha\rangle = \sum_{j,j';\,k,k'} \alpha^*_{jk}\alpha_{j'k'} \langle e_k| \otimes \langle s_j|\,\hat A\,|s_{j'}\rangle \otimes |e_{k'}\rangle. \tag{7.3}$$
Even if we assume that each of the sets {s} and {e} is full and orthonormal, Eq. (3) still includes a
double sum over the enormous basis state set of the environment!
However, let us consider a limited, but the most important subset of operators – those of intrinsic
observables, which depend only on the degrees of freedom of the system of our interest (s). These
operators do not act upon the environment's degrees of freedom, and hence in Eq. (3), we may move the
environment's bra-vectors $\langle e_k|$ over all the way to the ket-vectors $|e_{k'}\rangle$. Assuming, again, that the set of
environmental eigenstates is full and orthonormal, Eq. (3) is now reduced to
$$\langle A\rangle = \sum_{j,j';\,k,k'} \alpha^*_{jk}\alpha_{j'k'} \langle s_j|\hat A|s_{j'}\rangle \langle e_k|e_{k'}\rangle = \sum_{jj'} A_{jj'} \sum_k \alpha^*_{jk}\alpha_{j'k}. \tag{7.4}$$
4 See, e.g., SM Sec. 2.1.
5 Whether this assumption is true is an interesting issue, still being debated (more by philosophers than by
physicists), but it is widely believed that its solution is not critical for the validity of the results of this approach.
This is already a big relief because we have "only" a single sum over $k$, but the main trick is still
ahead. After the summation over $k$, the second sum in the last form of Eq. (4) is some function $w$ of the
indices $j$ and $j'$, so that, according to Eq. (4.96), this relation may be represented as
[Intrinsic observable: expectation value]
$$\langle A\rangle = \sum_{jj'} A_{jj'} w_{j'j} = \mathrm{Tr}(\mathrm{Aw}), \tag{7.5}$$
where the matrix w, with the elements
[Density matrix: definition]
$$w_{j'j} \equiv \sum_k \alpha^*_{jk}\alpha_{j'k}, \quad \text{i.e. } w_{jj'} \equiv \sum_k \alpha_{jk}\alpha^*_{j'k}, \tag{7.6}$$
is called the density matrix of the system.6 Most importantly, Eq. (5) shows that the knowledge of this
matrix allows the calculation of the expectation value of any intrinsic observable $A$ (and, according to
the general Eqs. (1.33)-(1.34), its r.m.s. fluctuation as well, if needed), even for the very general state
(2). This is why we should have a good look at the density matrix.
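To make Eqs. (4)-(6) more tangible, here is a minimal sketch for a hypothetical composite system in which both the system of interest and its "environment" have just two basis states. The coefficient tables `alpha_product` and `alpha_entangled`, the choice of the observable (the Pauli matrix $\sigma_x$), and all function names are illustrative assumptions. The script builds the density matrix by Eq. (6) and compares the trace formula (5) with the direct sum (4).

```python
def density_matrix(alpha):
    """w[j][j'] = sum_k alpha[j][k] * conj(alpha[j'][k]), Eq. (7.6)."""
    n_s, n_e = len(alpha), len(alpha[0])
    return [[sum(alpha[j][k] * alpha[jp][k].conjugate() for k in range(n_e))
             for jp in range(n_s)] for j in range(n_s)]

def expectation_via_trace(A, w):
    """<A> = Tr(A w) = sum_{jj'} A[j][j'] * w[j'][j], Eq. (7.5)."""
    n = len(w)
    return sum(A[j][jp] * w[jp][j] for j in range(n) for jp in range(n))

def expectation_direct(A, alpha):
    """The last form of Eq. (7.4), summed directly over j, j', and k."""
    n_s, n_e = len(alpha), len(alpha[0])
    return sum(alpha[j][k].conjugate() * alpha[jp][k] * A[j][jp]
               for j in range(n_s) for jp in range(n_s) for k in range(n_e))

s = 2 ** -0.5
# alpha[j][k]: coefficient of |s_j> (x) |e_k> in the superposition (7.2)
alpha_product = [[s + 0j, 0j], [s + 0j, 0j]]      # (|s_1> + |s_2>)|e_1> / sqrt(2)
alpha_entangled = [[s + 0j, 0j], [0j, s + 0j]]    # (|s_1>|e_1> + |s_2>|e_2>) / sqrt(2)
sigma_x = [[0, 1], [1, 0]]

for alpha in (alpha_product, alpha_entangled):
    w = density_matrix(alpha)
    print(w, expectation_via_trace(sigma_x, w), expectation_direct(sigma_x, alpha))
```

In the first case each state of the system is paired with a single state of the environment and the off-diagonal elements of w survive; in the second case the same diagonal elements come with vanishing off-diagonal ones, so that the expectation value of $\sigma_x$ vanishes.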
First of all, we know from the general discussion in Chapter 4, fully applicable to the pure state
(2), that the expansion coefficients in superpositions of this type may be always expressed as short brackets
of the type (4.40); in our current case, we may write
$$\alpha_{jk} = \left(\langle e_k| \otimes \langle s_j|\right)|\alpha\rangle. \tag{7.7}$$
Plugging this expression into Eq. (6), we get
$$w_{jj'} \equiv \sum_k \alpha_{jk}\alpha^*_{j'k} = \langle s_j|\left(\sum_k \langle e_k|\alpha\rangle\langle\alpha|e_k\rangle\right)|s_{j'}\rangle \equiv \langle s_j|\hat w|s_{j'}\rangle. \tag{7.8}$$
We see that from the point of view of our system (i.e. in its Hilbert space whose basis states may be numbered
by the index $j$ only), the density matrix is indeed just the matrix of some construct,7
[Density operator: definition]
$$\hat w \equiv \sum_k \langle e_k|\alpha\rangle\langle\alpha|e_k\rangle, \tag{7.9}$$
which is called the density (or "statistical") operator. As it follows from the definition (9), in contrast to
the density matrix this operator does not depend on the choice of a particular basis $s_j$ – just as all linear
operators considered earlier in this course. However, in contrast to them, the density operator does
depend on the composite system's state $\alpha$, including the state of the system s as well. Still, in the j-space
it is mathematically just an operator whose matrix elements obey all relations of the bra-ket formalism.
In particular, due to its definition (6), the density operator is Hermitian:
$$w^*_{jj'} = \sum_k \alpha^*_{jk}\alpha_{j'k} = \sum_k \alpha_{j'k}\alpha^*_{jk} = w_{j'j}, \tag{7.10}$$
6This notion was suggested in 1927 by John von Neumann.

7 Note that the "short brackets" in this expression are not c-numbers, because the state $\alpha$ is defined in a larger
Hilbert space (of the environment plus the system of interest) than the basis states $e_k$ (of the environment only).
so that according to the general analysis of Sec. 4.3, in the Hilbert space of the system s, there should be
a certain basis {w} in which the matrix of this operator is diagonal:
$$w_{jj'}\big|_{\text{in } w} = w_j\delta_{jj'}. \tag{7.11}$$
Since any operator, in any basis, may be represented in the form (4.59), in the basis {w} we may write
[Statistical operator in w-basis]
$$\hat w = \sum_j |w_j\rangle w_j \langle w_j|. \tag{7.12}$$
This expression is reminiscent of, but not equivalent to, Eq. (4.44) for the identity operator, which has been used
so many times in this course, and in the basis $w_j$ has the form
$$\hat I = \sum_j |w_j\rangle\langle w_j|. \tag{7.13}$$
In order to comprehend the meaning of the coefficients $w_j$ participating in Eq. (12), let us use Eq.
(5) to calculate the expectation value of any observable $A$ whose eigenstates coincide with those of the
special basis {w}, and whose matrix is, therefore, diagonal in this basis:
[Expectation value of $w_j$-compatible variable]
$$\langle A\rangle = \mathrm{Tr}(\mathrm{Aw}) = \sum_{jj'} A_{jj'} w_j\delta_{jj'} = \sum_j A_j w_j, \tag{7.14}$$
where $A_j$ is just the expectation value of the observable $A$ in the state $w_j$. Hence, to comply with the
general Eq. (1.37), the real c-number $w_j$ must have the physical sense of the probability $W_j$ of finding the
system in the state $w_j$. As a result, we may rewrite Eq. (12) in the form
$$\hat w = \sum_j |w_j\rangle W_j \langle w_j|. \tag{7.15}$$
In the ultimate case when only one of the probabilities (say, Wj”) is different from zero,
$$W_j = \delta_{jj''}, \tag{7.16}$$
the system is in a coherent (pure) state |wj”⟩. Indeed, it is fully described by one ket-vector |wj”⟩, and we
can use the general rule (4.86) to represent it in another (arbitrary) basis {s} as a coherent superposition
$$|w_{j''}\rangle = \sum_{j'} U^\dagger_{j''j'}\, |s_{j'}\rangle = \sum_{j'} U^*_{j'j''}\, |s_{j'}\rangle, \tag{7.17}$$
where U is the unitary matrix of transform from the basis {w} to the basis {s}. According to Eqs. (11)
and (16), in such a pure state the density matrix is diagonal in the {w} basis,
$$w_{jj'}\big|_{\text{in } w} = \delta_{j,j''}\,\delta_{j',j''}, \tag{7.18a}$$
but not in an arbitrary basis. Indeed, using the general rule (4.92), we get
$$w_{jj'}\big|_{\text{in } s} = \sum_{l,l'} U^\dagger_{jl}\, w_{ll'}\big|_{\text{in } w}\, U_{l'j'} = U^\dagger_{jj''}\, U_{j''j'} = U^*_{j''j}\, U_{j''j'}. \tag{7.18b}$$
To make this result more transparent, let us denote the matrix elements Uj”j ≡ ⟨wj”|sj⟩ (which, for
a fixed j”, depend on just one index j) by αj; then
$$w_{jj'}\big|_{\text{in } s} = \alpha^*_j\, \alpha_{j'}, \tag{7.19}$$ [Density matrix: pure state]
so that the N² elements of the whole N×N matrix are determined by just one string of N c-numbers αj. For
example, for a two-level system (N = 2),
$$w\big|_{\text{in } s} = \begin{pmatrix} \alpha_1\alpha_1^* & \alpha_2\alpha_1^* \\ \alpha_1\alpha_2^* & \alpha_2\alpha_2^* \end{pmatrix}. \tag{7.20}$$
We see that the off-diagonal terms are, colloquially, “as large as the diagonal ones”, in the following
sense:
$$w_{12}\, w_{21} = w_{11}\, w_{22}. \tag{7.21}$$
Since the diagonal terms have the sense of the probabilities W1,2 to find the system in the corresponding
state, we may represent Eq. (20) in the form
$$w_{\text{pure state}} = \begin{pmatrix} W_1 & (W_1 W_2)^{1/2} e^{i\varphi} \\ (W_1 W_2)^{1/2} e^{-i\varphi} & W_2 \end{pmatrix}. \tag{7.22}$$
The physical sense of the (real) constant φ is the phase shift between the coefficients in the linear
superposition (17), which represents the pure state |wj”⟩ in the basis {s1,2}.
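As a quick numerical illustration of Eqs. (19)-(22), the following minimal sketch builds the density matrix of a two-level pure state from a pair of amplitudes and checks the relation (21). The amplitude values are hypothetical, chosen only for illustration; the final check Tr(w²) = 1 is a standard purity test that is not used in the text above.

```python
import numpy as np

# Hypothetical amplitudes alpha_j of a pure state in the basis {s} (N = 2);
# they are normalized: 0.6**2 + 0.8**2 = 1.
alpha = np.array([0.6, 0.8 * np.exp(1j * 0.7)])

# Eq. (19): w_jj' = alpha_j^* alpha_j'
w = np.outer(alpha.conj(), alpha)

W1, W2 = w[0, 0].real, w[1, 1].real   # diagonal elements: probabilities
phi = np.angle(w[0, 1])               # phase of the off-diagonal element, Eq. (22)

print("Hermitian:", np.allclose(w, w.conj().T))
print("Eq. (21) holds:", np.isclose(w[0, 1] * w[1, 0], w[0, 0] * w[1, 1]))
print("Tr(w^2) =", np.trace(w @ w).real)
```

With these amplitudes, the diagonal elements are W1 = 0.36 and W2 = 0.64, and the phase φ of Eq. (22) equals the relative phase 0.7 of the two amplitudes.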
Now let us consider a different statistical ensemble of two-level systems, whose member states are
identical in all aspects (including equal probabilities W1,2 in the same basis s1,2), except
that the phase shifts φ are random, with the phase probability uniformly distributed over the
trigonometric circle. Then the ensemble averaging is equivalent to the averaging over φ from 0 to 2π,⁸
which kills the off-diagonal terms of the density matrix (22), so that the matrix becomes diagonal:
$$w_{\text{classical mixture}} = \begin{pmatrix} W_1 & 0 \\ 0 & W_2 \end{pmatrix}. \tag{7.23}$$
The mixed statistical ensemble with the density matrix diagonal in the stationary state basis is called the
classical mixture and represents the limit opposite to the pure (coherent) state.
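The phase averaging leading from Eq. (22) to Eq. (23) may be sketched numerically as follows. The function names and the probabilities 0.3 and 0.7 are hypothetical, and the uniform phase distribution is represented by equally spaced phases on the circle (an assumption made to keep the example deterministic).

```python
import numpy as np

def pure_state_matrix(W1, W2, phi):
    """Density matrix (22) of a two-level pure state with phase shift phi."""
    off = np.sqrt(W1 * W2) * np.exp(1j * phi)
    return np.array([[W1, off], [np.conj(off), W2]])

def phase_averaged_matrix(W1, W2, n_phases=360):
    """Average Eq. (22) over phases spread uniformly over [0, 2*pi)."""
    phases = 2 * np.pi * np.arange(n_phases) / n_phases
    return sum(pure_state_matrix(W1, W2, p) for p in phases) / n_phases

w_mix = phase_averaged_matrix(0.3, 0.7)
print(np.round(w_mix, 12))                 # off-diagonal elements average out
print("Tr(w^2) =", np.trace(w_mix @ w_mix).real)
```

The diagonal elements survive the averaging unchanged, while Tr(w²) drops below 1, which is one common way to tell a mixture from a pure state.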
After this example, the reader should not be much shocked by the main claim⁹ of statistical
mechanics that any large ensemble of similar systems in thermodynamic (or “thermal”) equilibrium is
exactly such a classical mixture. Moreover, for systems in thermal equilibrium with a much larger
environment of a fixed temperature T (such an environment is usually called a heat bath), statistical
physics gives a very simple expression, called the Gibbs distribution, for the probabilities Wn:¹⁰
$$W_n = \frac{1}{Z} \exp\left\{-\frac{E_n}{k_B T}\right\}, \quad \text{with } Z \equiv \sum_n \exp\left\{-\frac{E_n}{k_B T}\right\}, \tag{7.24}$$ [Gibbs distribution]
8 For a system with a time-independent Hamiltonian, such averaging is especially plausible in the basis of the
stationary states n of the system, in which the phase φ is just the difference of integration constants in Eq. (4.158),
and its randomness may be naturally produced by minor fluctuations of the energy difference E1 – E2. In Sec. 3
below, we will study the dynamics of this dephasing process.
9 This fact follows from the basic postulate of statistical physics, called the microcanonical distribution – see,
e.g., SM Sec. 2.2.
10 See, e.g., SM Sec. 2.4. The Boltzmann constant kB is only needed if the temperature is measured in non-energy
units – say in kelvins.
where En is the eigenenergy of the corresponding stationary state, and the normalization coefficient Z is
called the statistical sum.
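For a finite set of levels, Eq. (24) takes only a few lines of code. The sketch below uses a hypothetical three-level spectrum, with energies measured in the same units as kBT; the function name is an assumption of this example.

```python
import numpy as np

def gibbs_probabilities(energies, kT):
    """Return (W_n, Z) of Eq. (24) for a finite list of eigenenergies."""
    energies = np.asarray(energies, dtype=float)
    boltzmann = np.exp(-energies / kT)
    Z = boltzmann.sum()               # the statistical sum
    return boltzmann / Z, Z

# Hypothetical spectrum E_n = 0, 1, 2 at k_B T = 1.
W, Z = gibbs_probabilities([0.0, 1.0, 2.0], kT=1.0)
print("W_n =", W, " Z =", Z, " sum of W_n =", W.sum())

# Shifting the energy reference changes Z, but not the probabilities W_n.
W_shifted, Z_shifted = gibbs_probabilities([5.0, 6.0, 7.0], kT=1.0)
print(np.allclose(W, W_shifted), Z_shifted / Z)
```

The second call illustrates that the choice of the energy reference level affects only Z, which becomes relevant when comparing expressions for Z taken from different sources.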
A detailed analysis of classical and quantum ensembles in thermodynamic equilibrium is a major
focus of statistical physics courses (such as the SM of this series) rather than this course of quantum
mechanics. However, I would still like to draw the reader’s attention to the key fact that, in contrast
with the similarly-looking Boltzmann distribution for single particles,¹¹ the Gibbs distribution is
general, not limited to classical statistics. In particular, for a quantum gas of indistinguishable particles,
it is fully compatible with the quantum statistics (such as the Bose-Einstein or Fermi-Dirac
distributions) of the component particles. For example, if we use Eq. (24) to calculate the average
energy of a 1D harmonic oscillator of frequency ω0 in thermal equilibrium, we easily get¹²
$$W_n = \exp\left\{-\frac{n\hbar\omega_0}{k_B T}\right\} \left(1 - \exp\left\{-\frac{\hbar\omega_0}{k_B T}\right\}\right), \quad Z = \frac{\exp\{-\hbar\omega_0 / 2k_B T\}}{1 - \exp\{-\hbar\omega_0 / k_B T\}}, \tag{7.25}$$
$$\langle E \rangle = \sum_{n=0}^{\infty} W_n E_n = \frac{\hbar\omega_0}{2} \coth\frac{\hbar\omega_0}{2k_B T} \equiv \frac{\hbar\omega_0}{2} + \frac{\hbar\omega_0}{\exp\{\hbar\omega_0 / k_B T\} - 1}. \tag{7.26a}$$
The final form of the last result,
$$\langle E \rangle = \frac{\hbar\omega_0}{2} + \hbar\omega_0 \langle n \rangle, \quad \text{with } \langle n \rangle = \frac{1}{\exp\{\hbar\omega_0 / k_B T\} - 1} \to \begin{cases} 0, & \text{for } k_B T \ll \hbar\omega_0, \\ k_B T / \hbar\omega_0, & \text{for } \hbar\omega_0 \ll k_B T, \end{cases} \tag{7.26b}$$
may be interpreted as an addition, to the ground-state energy ħω0/2, of the average number ⟨n⟩ of
thermally-induced excitations, with the energy ħω0 each. In the harmonic oscillator, whose energy levels
are equidistant, such a language is completely appropriate, because the transfer of the system from any
level to the one just above it adds the same amount of energy, ħω0. Note that the above expression for
⟨n⟩ is actually the Bose-Einstein distribution (for the particular case of zero chemical potential); we see
that it does not contradict the Gibbs distribution (24) of the total energy of the system, but rather
immediately follows from it.
Because of the fundamental importance of Eq. (26) for virtually all fields of physics, let me draw
the reader’s attention to its main properties. At low temperatures, kBT << ħω0, there are virtually no
excitations, ⟨n⟩ → 0, and the average energy of the oscillator is dominated by that of its ground state. In
the opposite limit of high temperatures, ⟨n⟩ → kBT/ħω0 >> 1, and ⟨E⟩ approaches the classical value kBT.
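The result (26) may be checked numerically by summing Eq. (24) over the oscillator levels En = ħω0(n + ½) directly. This is only a sketch: the series is truncated at a hypothetical n_max, and the energies are measured in units of ħω0 by default.

```python
import numpy as np

def oscillator_average_energy(kT, hbar_omega=1.0, n_max=2000):
    """<E> from the Gibbs distribution (24), by a direct sum over n < n_max."""
    n = np.arange(n_max)
    E = hbar_omega * (n + 0.5)
    weights = np.exp(-(E - E[0]) / kT)   # common factor exp(-E_0/kT) cancels in W_n
    W = weights / weights.sum()
    return (W * E).sum()

def closed_form(kT, hbar_omega=1.0):
    """Right-hand side of Eq. (26a)."""
    return 0.5 * hbar_omega / np.tanh(0.5 * hbar_omega / kT)

for kT in (0.1, 1.0, 10.0):
    print(kT, oscillator_average_energy(kT), closed_form(kT))
```

At kBT = 0.1 ħω0 the sum is very close to the ground-state energy ħω0/2, while at kBT = 10 ħω0 it is close to the classical value kBT, in agreement with the two limits of Eq. (26b).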
7.2. Coordinate representation, and the Wigner function

For many applications of the density operator, its coordinate representation is convenient. (I will
only discuss it for the 1D case; the generalization to multi-dimensional cases is straightforward.)
Following Eq. (4.47), it is natural to define the following function of two arguments (traditionally, also
called the density matrix):
11 See, e.g., SM Sec. 2.8.

12 See, e.g., SM Sec. 2.5 – but mind a different energy reference level, E0 = ħω0/2, used for example in SM Eqs.
(2.68)-(2.69), affecting the expression for Z. Actually, the calculation, using Eqs. (24) and (5.86), is so
straightforward that it is highly recommended to the reader as a simple exercise.
$$w(x, x') \equiv \langle x | \hat{w} | x' \rangle. \tag{7.27}$$ [Density matrix: coordinate representation]
Inserting, into the right-hand side of this definition, two closure conditions (4.44) for an arbitrary (but
full and orthonormal) basis {s}, and then using Eq. (4.233),¹³ we get
$$w(x, x') = \sum_{j,j'} \langle x | s_j \rangle \langle s_j | \hat{w} | s_{j'} \rangle \langle s_{j'} | x' \rangle = \sum_{j,j'} \psi_j(x)\, w_{jj'}\big|_{\text{in } s}\, \psi^*_{j'}(x'). \tag{7.28}$$
In the special basis {w}, in which the density matrix is diagonal, this expression is reduced to
$$w(x, x') = \sum_j \psi_j(x)\, W_j\, \psi^*_j(x'). \tag{7.29}$$
Let us discuss the properties of this function. At coinciding arguments, x’ = x, this is just the
probability density:¹⁴
$$w(x, x) = \sum_j \psi_j(x)\, W_j\, \psi^*_j(x) = \sum_j w_j(x)\, W_j \equiv w(x). \tag{7.30}$$
However, the density matrix gives more information about the system than just the probability density.
As the simplest example, let us consider a pure quantum state, with Wj = δj,j’, so that Ψ(x) = ψj’(x), and
$$w(x, x') = \psi_{j'}(x)\, \psi^*_{j'}(x') \equiv \Psi(x)\, \Psi^*(x'). \tag{7.31}$$
We see that the density matrix carries the information not only about the modulus but also the phase of
the wavefunction. (Of course one may argue rather convincingly that in this ultimate limit the density-
matrix description is redundant because all this information is contained in the wavefunction itself.)
How may the density matrix be interpreted? In the simple case (31), we can write
$$|w(x, x')|^2 = w(x, x')\, w^*(x, x') = \Psi(x) \Psi^*(x)\, \Psi(x') \Psi^*(x') = w(x)\, w(x'), \tag{7.32}$$
so that the modulus squared of the density matrix is just the joint probability density to find the
system at the point x and the point x’. For example, for a simple wave packet with a spatial extent δx,
w(x,x’) has an appreciable magnitude only if both points are not farther than ~δx from the packet center,
and hence from each other. The interpretation becomes more complex if we deal with an incoherent
mixture of several wavefunctions, for example, the classical mixture describing the thermodynamic
equilibrium. In this case, we can use Eq. (24) to rewrite Eq. (29) as follows:
$$w(x, x') = \sum_n \psi_n(x)\, W_n\, \psi^*_n(x') = \frac{1}{Z} \sum_n \psi_n(x) \exp\left\{-\frac{E_n}{k_B T}\right\} \psi^*_n(x'). \tag{7.33}$$
As the simplest example, let us find the density matrix of a free (1D) particle in
thermal equilibrium. As we know very well by now, in this case, the set of energies Ep = p²/2m of
stationary states (monochromatic waves) forms a continuum, so that we need to replace the sum (33)
with an integral, using for example the “delta-normalized” traveling-wave eigenfunctions (4.264):
$$w(x, x') = \frac{1}{2\pi\hbar Z} \int_{-\infty}^{+\infty} \exp\left\{\frac{ipx}{\hbar}\right\} \exp\left\{-\frac{p^2}{2mk_B T}\right\} \exp\left\{-\frac{ipx'}{\hbar}\right\} dp. \tag{7.34}$$
13 For now, I will focus on a fixed time instant (say, t = 0), and hence write ψ(x) instead of Ψ(x, t).
14 This fact is the origin of the density matrix’s name.
This is a usual Gaussian integral, and may be worked out, as we have done repeatedly in Chapter 2 and
beyond, by completing the exponent to the full square of the momentum p plus a constant. The
statistical sum Z may also be readily calculated:¹⁵
$$Z = (2\pi m k_B T)^{1/2}. \tag{7.35}$$
However, for what follows it is more useful to write the result for the product wZ (the so-called un-
normalized density matrix):
$$w(x, x')\, Z = \left(\frac{mk_B T}{2\pi\hbar^2}\right)^{1/2} \exp\left\{-\frac{mk_B T (x - x')^2}{2\hbar^2}\right\}. \tag{7.36}$$ [Free particle: thermal equilibrium]
This is a very interesting result: the density matrix depends only on the difference of its
arguments, dropping to zero fast as the distance between the points x and x’ exceeds the following
characteristic scale (called the correlation length)
$$x_c \equiv \left\langle (x - x')^2 \right\rangle^{1/2} = \frac{\hbar}{(mk_B T)^{1/2}}. \tag{7.37}$$ [Correlation length]
This length may be interpreted in the following way. It is straightforward to use Eq. (24) to verify that
the average energy ⟨E⟩ = ⟨p²/2m⟩ of a free particle in thermal equilibrium, i.e. in the classical mixture
(33), equals kBT/2. Hence the average magnitude of the particle’s momentum may be estimated as
$$p_c \equiv \langle p^2 \rangle^{1/2} = (2m \langle E \rangle)^{1/2} = (mk_B T)^{1/2}, \tag{7.38}$$
so that xc is of the order of the minimal length allowed by the Heisenberg-like “uncertainty relation”:
$$x_c = \hbar / p_c. \tag{7.39}$$
Note that with the growth of temperature, the correlation length (37) goes to zero, and the
density matrix (36) tends to a delta function:
$$w(x, x')\, Z\, \big|_{T \to \infty} \to \delta(x - x'). \tag{7.40}$$
Since in this limit the average kinetic energy of the particle is not smaller than its potential energy in any
fixed potential profile, Eq. (40) is the general property of the density matrix (33).
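The Gaussian integral (34) and its result (36) are easy to compare numerically. The sketch below evaluates the momentum integral by a plain Riemann sum on a truncated uniform grid (the cutoff p_max and the number of grid points are assumptions of this example) and works by default in units with ħ = m = kBT = 1.

```python
import numpy as np

def wZ_numeric(dx, m=1.0, kT=1.0, hbar=1.0, p_max=12.0, n=4801):
    """Product w(x, x') Z from the integral (34), for dx = x - x'."""
    p = np.linspace(-p_max, p_max, n) * np.sqrt(m * kT)   # grid in units of (m kT)^(1/2)
    dp = p[1] - p[0]
    integrand = np.exp(1j * p * dx / hbar) * np.exp(-p**2 / (2 * m * kT))
    return (integrand.sum() * dp / (2 * np.pi * hbar)).real

def wZ_closed(dx, m=1.0, kT=1.0, hbar=1.0):
    """Right-hand side of Eq. (36)."""
    return np.sqrt(m * kT / (2 * np.pi * hbar**2)) * np.exp(-m * kT * dx**2 / (2 * hbar**2))

for dx in (0.0, 0.5, 2.0):
    print(dx, wZ_numeric(dx), wZ_closed(dx))

# At the correlation length (37), the density matrix drops by the factor exp(-1/2).
x_c = 1.0   # hbar / (m kT)^(1/2) in these units
print(wZ_numeric(x_c) / wZ_numeric(0.0), np.exp(-0.5))
```

Since the integrand is a smooth, rapidly decaying function, the simple Riemann sum is sufficient here; a different quadrature could be used equally well.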
Let us discuss the following curious feature of Eq. (36): if we replace kBT with ħ/i(t – t0), and x’
with x0, the un-normalized density matrix wZ for a free particle turns into the particle’s propagator – cf.
Eq. (2.49). This is not just an accidental coincidence. Indeed, in Chapter 2 we saw that the propagator of
a system with an arbitrary stationary Hamiltonian may be expressed via the stationary eigenfunctions as
$$G(x, t; x_0, t_0) = \sum_n \psi_n(x) \exp\left\{-i \frac{E_n}{\hbar} (t - t_0)\right\} \psi^*_n(x_0). \tag{7.41}$$
15 Due to the delta-normalization of the eigenfunction, the density matrix (34) for the free particle (and any system
with a continuous eigenvalue spectrum) is normalized as
$$\int_{-\infty}^{+\infty} w(x, x')\, Z\, dx' = \int_{-\infty}^{+\infty} w(x, x')\, Z\, dx = 1.$$
Comparing this expression with Eq. (33), we see that the replacements
$$\frac{i(t - t_0)}{\hbar} \to \frac{1}{k_B T}, \quad x_0 \to x', \tag{7.42}$$
turn the pure-state propagator G into the un-normalized density matrix wZ of the same system in
thermodynamic equilibrium. This important fact, rooted in the formal similarity of the Gibbs distribution
(24) with the Schrödinger equation’s solution (1.69), enables a theoretical technique of the so-called
thermodynamic Green’s functions, which is especially productive in condensed matter physics.16
For our current purposes, we can employ Eq. (42) to re-use some of the wave mechanics results,
in particular, the following formula for the harmonic oscillator’s propagator
$$G(x, t; x_0, t_0) = \left(\frac{m\omega_0}{2\pi i \hbar \sin[\omega_0 (t - t_0)]}\right)^{1/2} \exp\left\{-\frac{m\omega_0 \left[(x^2 + x_0^2) \cos[\omega_0 (t - t_0)] - 2xx_0\right]}{2i\hbar \sin[\omega_0 (t - t_0)]}\right\}, \tag{7.43}$$
which may be readily proved to satisfy the Schrödinger equation for the Hamiltonian (5.62), with the
appropriate initial condition: G(x, t0; x0, t0) = δ(x – x0). Making the substitution (42), we immediately get
$$w(x, x')\, Z = \left(\frac{m\omega_0}{2\pi\hbar \sinh(\hbar\omega_0 / k_B T)}\right)^{1/2} \exp\left\{-\frac{m\omega_0 \left[(x^2 + x'^2) \cosh(\hbar\omega_0 / k_B T) - 2xx'\right]}{2\hbar \sinh(\hbar\omega_0 / k_B T)}\right\}. \tag{7.44}$$ [Harmonic oscillator in thermal equilibrium]
As a sanity check, at very low temperatures, kBT << ħω0, both hyperbolic functions participating in this
expression are very large and nearly equal, and it yields
$$w(x, x')\, Z\, \big|_{T \to 0} \to \left[\left(\frac{m\omega_0}{\pi\hbar}\right)^{1/4} \exp\left\{-\frac{m\omega_0 x^2}{2\hbar}\right\}\right] \times \exp\left\{-\frac{\hbar\omega_0}{2k_B T}\right\} \times \left[\left(\frac{m\omega_0}{\pi\hbar}\right)^{1/4} \exp\left\{-\frac{m\omega_0 x'^2}{2\hbar}\right\}\right]. \tag{7.45}$$
In each of the expressions in square brackets we can readily recognize the ground state’s wavefunction
(2.275) of the oscillator, while the middle exponent is just the statistical sum (24) in the low-temperature
limit when it is dominated by the ground-level contribution:
$$Z\, \big|_{T \to 0} \to \exp\left\{-\frac{\hbar\omega_0}{2k_B T}\right\}. \tag{7.46}$$
As a result, Z in both parts of Eq. (45) may be canceled, and the density matrix in this limit is described
by Eq. (31), with the ground state as the only state of the system. This is natural when the temperature is
too low for the thermal excitation of any other state.
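Besides the low-temperature limit, Eq. (44) may be checked at an arbitrary temperature by summing the series (33) over the oscillator’s eigenfunctions directly. The sketch below assumes units with ħ = m = ω0 = 1, generates the normalized eigenfunctions by the standard three-term recurrence for Hermite functions, and truncates the sum at a hypothetical n_max.

```python
import numpy as np

def hermite_functions(x, n_max):
    """Normalized oscillator eigenfunctions psi_0 ... psi_{n_max-1} at the point(s) x."""
    x = np.asarray(x, dtype=float)
    psi = np.empty((n_max,) + x.shape)
    psi[0] = np.pi ** -0.25 * np.exp(-x**2 / 2)
    if n_max > 1:
        psi[1] = np.sqrt(2.0) * x * psi[0]
    for n in range(1, n_max - 1):
        psi[n + 1] = np.sqrt(2.0 / (n + 1)) * x * psi[n] - np.sqrt(n / (n + 1)) * psi[n - 1]
    return psi

def wZ_sum(x, xp, kT, n_max=80):
    """Product w(x, x') Z from the sum (33), with E_n = n + 1/2."""
    n = np.arange(n_max)
    boltzmann = np.exp(-(n + 0.5) / kT)
    return float(np.sum(hermite_functions(x, n_max) * boltzmann * hermite_functions(xp, n_max)))

def wZ_closed(x, xp, kT):
    """Right-hand side of Eq. (44)."""
    a = 1.0 / kT   # hbar*omega_0 / (k_B T) in these units
    prefactor = np.sqrt(1.0 / (2 * np.pi * np.sinh(a)))
    return prefactor * np.exp(-((x**2 + xp**2) * np.cosh(a) - 2 * x * xp) / (2 * np.sinh(a)))

for kT in (0.5, 1.0, 2.0):
    print(kT, wZ_sum(0.3, -0.7, kT), wZ_closed(0.3, -0.7, kT))
```

The truncation is harmless as long as the Boltzmann factor of the last retained level is negligible; at higher temperatures, n_max would have to be increased accordingly.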
16 I will have no time to discuss this technique and have to refer the interested reader to special literature.
Probably, the most famous text of that field is A. Abrikosov, L. Gor’kov, and I. Dzyaloshinski, Methods of
Quantum Field Theory in Statistical Physics, Prentice-Hall, 1963. (Later reprintings are available from Dover.)
Returning to arbitrary temperatures, Eq. (44) at coinciding arguments gives the following
expression for the probability density:¹⁷
$$w(x, x)\, Z \equiv w(x)\, Z = \left(\frac{m\omega_0}{2\pi\hbar \sinh(\hbar\omega_0 / k_B T)}\right)^{1/2} \exp\left\{-\frac{m\omega_0 x^2}{\hbar} \tanh\frac{\hbar\omega_0}{2k_B T}\right\}. \tag{7.47}$$
This is just a Gaussian function of x, with the following variance:
$$\langle x^2 \rangle = \frac{\hbar}{2m\omega_0} \coth\frac{\hbar\omega_0}{2k_B T}. \tag{7.48}$$
To compare this result with our earlier ones, it is useful to recast it as
$$\langle U \rangle = \frac{m\omega_0^2}{2} \langle x^2 \rangle = \frac{\hbar\omega_0}{4} \coth\frac{\hbar\omega_0}{2k_B T}. \tag{7.49}$$
Comparing this expression with Eq. (26), we see that the average value of potential energy is exactly
one-half of the total energy – the other half being the average kinetic energy. This is what we could
expect, because according to Eqs. (5.96)-(5.97), such relation holds for each Fock state and hence
should also hold for their classical mixture.
Unfortunately, besides the trivial case (30) of coinciding arguments, it is hard to give a
straightforward interpretation of the density function in terms of the system’s measurements. This is a
fundamental difficulty, which has been well explored in terms of the Wigner function (sometimes called
the “Wigner-Ville distribution”)18 defined as
$$W(X, P) \equiv \frac{1}{2\pi\hbar} \int w\left(X + \frac{\tilde{X}}{2},\, X - \frac{\tilde{X}}{2}\right) \exp\left\{-\frac{iP\tilde{X}}{\hbar}\right\} d\tilde{X}. \tag{7.50}$$ [Wigner function: definition]
From the mathematical standpoint, this is just the Fourier transform of the density matrix in one of two
new coordinates defined by the following relations (see Fig. 2):
$$X \equiv \frac{x + x'}{2}, \quad \tilde{X} \equiv x - x', \quad \text{so that } x \equiv X + \frac{\tilde{X}}{2}, \quad x' \equiv X - \frac{\tilde{X}}{2}. \tag{7.51}$$
Physically, the new argument X may be interpreted as the average position of the particle during
the time interval (t – t’), while X̃, as the distance passed by it during that time interval, so that P
characterizes the momentum of the particle during that motion. As a result, the Wigner function is a
mathematical construct intended to characterize the system’s probability distribution simultaneously in
the coordinate and the momentum space – for 1D systems, on the phase plane [X, P], which we had
discussed earlier – see Fig. 5.8. Let us see how fruitful this intention is.
17 I have to confess that this notation is imperfect, because strictly speaking, w(x, x’) and w(x) are different
functions, and so are the functions w(p, p’) and w(p) used below. In the perfect world, I would use different letters
for them all, but I desperately want to stay with “w” for all the probability densities, and there are not so many
good fonts for this letter. Let me hope that the difference between these functions is clear from their arguments
and the context.
18 It was introduced in 1932 by Eugene Wigner on the basis of a general (Weyl-Wigner) transform suggested by
Hermann Weyl in 1927 and re-derived in 1948 by Jean Ville on a different mathematical basis.
Fig. 7.2. The coordinates X and X̃ employed in the Weyl-Wigner transform (50). They differ from the coordinates
obtained by the rotation of the reference frame by the angle π/4 only by factors √2 and 1/√2, describing scale stretch.
First of all, we may write the Fourier transform reciprocal to Eq. (50):
$$w\left(X + \frac{\tilde{X}}{2},\, X - \frac{\tilde{X}}{2}\right) = \int W(X, P) \exp\left\{\frac{iP\tilde{X}}{\hbar}\right\} dP. \tag{7.52}$$
For the particular case X̃ = 0, this relation yields
$$w(X) \equiv w(X, X) = \int W(X, P)\, dP. \tag{7.53}$$
Hence the integral of the Wigner function over the momentum P gives the probability density to find the
system at point X – just as it does for a classical distribution function wcl(X, P).19
Next, the Wigner function has a similar property for integration over X. To prove this fact, we
may first introduce the momentum representation of the density matrix, in full analogy with its
coordinate representation (27):
$$w(p, p') \equiv \langle p | \hat{w} | p' \rangle. \tag{7.54}$$
Inserting, as usual, two identity operators, in the form given by Eq. (4.252), into the right-hand side of
this equality, we get the following relation between the momentum and coordinate representations:
$$w(p, p') = \iint dx\, dx'\, \langle p | x \rangle \langle x | \hat{w} | x' \rangle \langle x' | p' \rangle = \frac{1}{2\pi\hbar} \iint dx\, dx' \exp\left\{-\frac{ipx}{\hbar}\right\} w(x, x') \exp\left\{+\frac{ip'x'}{\hbar}\right\}. \tag{7.55}$$
This is of course nothing else than the unitary transform of an operator from the x-basis to the p-basis,
similar to the first form of Eq. (4.272). For coinciding arguments, p = p’, Eq. (55) is reduced to
$$w(p) \equiv w(p, p) = \frac{1}{2\pi\hbar} \iint dx\, dx'\, w(x, x') \exp\left\{-\frac{ip(x - x')}{\hbar}\right\}. \tag{7.56}$$
Now using Eq. (29) and then Eq. (4.265), this function may be represented as
$$w(p) = \frac{1}{2\pi\hbar} \sum_j W_j \iint dx\, dx'\, \psi_j(x)\, \psi^*_j(x') \exp\left\{-\frac{ip(x - x')}{\hbar}\right\} = \sum_j W_j\, \varphi_j(p)\, \varphi^*_j(p), \tag{7.57}$$
and hence interpreted as the probability density of the particle’s momentum at value p. Now, in the
variables (51), Eq. (56) has the form
19 Such a function, used to express the probability dW to find the system in a small area of the phase plane
as dW = wcl(X, P)dXdP, is a major notion of the (1D) classical statistics – see, e.g., SM Sec. 2.1.
$$w(p) = \frac{1}{2\pi\hbar} \iint w\left(X + \frac{\tilde{X}}{2},\, X - \frac{\tilde{X}}{2}\right) \exp\left\{-\frac{ip\tilde{X}}{\hbar}\right\} d\tilde{X}\, dX. \tag{7.58}$$
Comparing this equality with the definition (50) of the Wigner function, we see that
$$w(P) = \int W(X, P)\, dX. \tag{7.59}$$
Thus, according to Eqs. (53) and (59), the integrals of the Wigner function over either the
coordinate or momentum give the probability densities to find the system at a certain value of the
counterpart variable. This is of course the main requirement for any quantum-mechanical candidate for
the best analog of the classical probability density, wcl(X,P).
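The marginal property (53) may be illustrated numerically for a pure state, using the definition (50) with w(x, x’) = Ψ(x)Ψ*(x’). The Gaussian wave packet, its parameters, the units with ħ = 1, and the integration grids in the following sketch are all assumptions of this example.

```python
import numpy as np

HBAR = 1.0   # units with hbar = 1 (an assumption of this sketch)

def packet(x, x0=0.5, p0=1.5, sigma=1.0):
    """Hypothetical Gaussian wave packet, normalized to unit probability."""
    norm = (2 * np.pi * sigma**2) ** -0.25
    return norm * np.exp(-(x - x0) ** 2 / (4 * sigma**2) + 1j * p0 * x / HBAR)

def wigner_pure(psi, X, P, half_width=20.0, n=4001):
    """Eq. (50) for a pure state, evaluated by a Riemann sum over X-tilde."""
    xt = np.linspace(-half_width, half_width, n)
    dxt = xt[1] - xt[0]
    integrand = psi(X + xt / 2) * np.conj(psi(X - xt / 2)) * np.exp(-1j * P * xt / HBAR)
    return (integrand.sum() * dxt).real / (2 * np.pi * HBAR)

def position_marginal(psi, X, p_max=10.0, n_p=801):
    """Right-hand side of Eq. (53): the integral of W(X, P) over P."""
    P = np.linspace(-p_max, p_max, n_p)
    dP = P[1] - P[0]
    return sum(wigner_pure(psi, X, p) for p in P) * dP

X = 0.2
print(position_marginal(packet, X), abs(packet(X)) ** 2)
```

The two printed numbers should agree, i.e. the integral of the Wigner function over P reproduces the probability density w(X) = |Ψ(X)|² of the packet.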
Let us see how the Wigner function looks for the simplest systems at thermodynamic
equilibrium. For a free 1D particle, we can use Eq. (34), ignoring for simplicity the normalization issues:
$$W(X, P) \propto \int_{-\infty}^{+\infty} \exp\left\{-\frac{mk_B T \tilde{X}^2}{2\hbar^2}\right\} \exp\left\{-\frac{iP\tilde{X}}{\hbar}\right\} d\tilde{X}. \tag{7.60}$$
The usual Gaussian integration yields:
$$W(X, P) = \text{const} \times \exp\left\{-\frac{P^2}{2mk_B T}\right\}. \tag{7.61}$$
We see that the function is independent of X (as it should be for this translational-invariant system), and
coincides with the Gibbs distribution (24). We could get the same result directly from classical statistics.
This is natural because as we know from Sec. 2.2, the free motion is essentially not quantized – at least
in terms of its energy and momentum.
Now let us consider a substantially quantum system, the harmonic oscillator. Plugging Eq. (44)
into Eq. (50), for that system in thermal equilibrium it is easy to show (and hence is left for the reader’s
exercise) that the Wigner function is also Gaussian, now in both its arguments:
$$W(X, P) = \text{const} \times \exp\left\{-C \left[\frac{m\omega_0^2 X^2}{2} + \frac{P^2}{2m}\right]\right\}, \tag{7.62}$$
though the coefficient C is now different from 1/kBT, and tends to that limit only at high temperatures,
kBT >> ħω0. Moreover, for a Glauber state, the Wigner function also gives a very plausible result – a
Gaussian distribution similar to Eq. (62), but properly shifted from the origin to the central point of the
state – see Sec. 5.5.²⁰
Unfortunately, for some other possible states of the harmonic oscillator, e.g., any pure Fock state
with n > 0, the Wigner function takes negative values in some regions of the [X, P] plane – see Fig. 3.21
(Such plots were the basis of my, admittedly very imperfect, classical images of the Fock states in Fig.
5.8.)
20 Please note that in the notation of Sec. 5.5, the capital letters X and P mean not the arguments of the Wigner
function, but the Cartesian coordinates of the central point (5.102), i.e. the classical complex amplitude of the
oscillations.
21 Spectacular experimental measurements of this function (for n = 0 and n = 1) were carried out recently by E.
Bimbard et al., Phys. Rev. Lett. 112, 033601 (2014).
Fig. 7.3. The Wigner functions W(X, P) of a harmonic oscillator, in a few of its stationary
(Fock) states n: (a) n = 0, (b) n = 1; (c) n = 5. Graphics by J. S. Lundeen; adapted from
http://en.wikipedia.org/wiki/Wigner_function as a public-domain material.
The same is true for most other quantum systems and their states. Indeed, this fact could be
predicted just by looking at the definition (50) applied to a pure quantum state, in which the density
function may be factored – see Eq. (31):
$$W(X, P) = \frac{1}{2\pi\hbar} \int \Psi\left(X + \frac{\tilde{X}}{2}\right) \Psi^*\left(X - \frac{\tilde{X}}{2}\right) \exp\left\{-\frac{iP\tilde{X}}{\hbar}\right\} d\tilde{X}. \tag{7.63}$$
Changing the argument P (say, at fixed X), we are essentially changing the spatial “frequency” (wave
number) of the wavefunction product’s Fourier component we are calculating, and we know that their
Fourier images typically change sign as the frequency is changed. Hence the wavefunctions should have
some high-symmetry properties to avoid this effect. Indeed, the Gaussian functions (describing, for
example, the Glauber states, and in their particular case, the ground state of the harmonic oscillator)
have such symmetry, but many other functions do not.
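The sign change is easy to see numerically from Eq. (63). The following sketch (in units with ħ = m = ω0 = 1, with only the n = 0 and n = 1 eigenfunctions implemented, and with an assumed integration grid) compares the Wigner functions of the two lowest Fock states of the harmonic oscillator.

```python
import numpy as np

HBAR = 1.0   # units with hbar = m = omega_0 = 1 (an assumption of this sketch)

def fock(n, x):
    """Harmonic oscillator eigenfunctions, for n = 0 and n = 1 only."""
    ground = np.pi ** -0.25 * np.exp(-x**2 / 2)
    if n == 0:
        return ground
    if n == 1:
        return np.sqrt(2.0) * x * ground
    raise ValueError("only n = 0 and n = 1 are implemented in this sketch")

def wigner_fock(n, X, P, half_width=20.0, n_points=4001):
    """Eq. (63) for the Fock state n, evaluated by a Riemann sum over X-tilde."""
    xt = np.linspace(-half_width, half_width, n_points)
    dxt = xt[1] - xt[0]
    product = fock(n, X + xt / 2) * np.conj(fock(n, X - xt / 2))
    integrand = product * np.exp(-1j * P * xt / HBAR)
    return (integrand.sum() * dxt).real / (2 * np.pi * HBAR)

print("n = 0, origin:", wigner_fock(0, 0.0, 0.0))        # positive
print("n = 1, origin:", wigner_fock(1, 0.0, 0.0))        # negative
print("n = 1, X = 1.5, P = 0:", wigner_fock(1, 1.5, 0.0))  # positive again
```

For n = 1 the wavefunction is odd, so that at X = 0 the product Ψ(X̃/2)Ψ*(–X̃/2) is negative for all X̃, and the function at the origin is negative; in these units the direct integration gives the values +1/π and –1/π for n = 0 and n = 1, respectively.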
Hence if the Wigner function were taken seriously as the quantum-mechanical analog of the
classical probability density wcl(X, P), we would need to interpret the negative probability of finding the
particle in certain elementary intervals dXdP – which is hard to do. However, the function is still used
for a semi-quantitative interpretation of mixed states of quantum systems.
7.3. Open system dynamics: Dephasing

So far we have discussed the density operator as something given at a particular time instant.
Now let us discuss how it is formed, i.e. its evolution in time, starting from the simplest case when the
probabilities Wj participating in Eq. (15) are time-independent – for one reason or another, to be discussed in
a moment. In this case, in the Schrödinger picture, we may rewrite Eq. (15) as
$$\hat{w}(t) = \sum_j |w_j(t)\rangle\, W_j\, \langle w_j(t)|. \tag{7.64}$$
Taking a time derivative of both sides of this equation, multiplying them by iħ, and applying Eq. (4.158)
to the basis states wj, with the account of the fact that the Hamiltonian operator is Hermitian, we get
$$\begin{aligned} i\hbar \dot{\hat{w}} &= i\hbar \sum_j \left( |\dot{w}_j(t)\rangle\, W_j\, \langle w_j(t)| + |w_j(t)\rangle\, W_j\, \langle \dot{w}_j(t)| \right) \\ &= \sum_j \left( \hat{H} |w_j(t)\rangle\, W_j\, \langle w_j(t)| - |w_j(t)\rangle\, W_j\, \langle w_j(t)| \hat{H} \right) \\ &= \hat{H} \sum_j |w_j(t)\rangle\, W_j\, \langle w_j(t)| - \sum_j |w_j(t)\rangle\, W_j\, \langle w_j(t)|\, \hat{H}. \end{aligned} \tag{7.65}$$
Now using Eq. (64) again (twice), we get the so-called von Neumann equation²²
$$i\hbar \dot{\hat{w}} = \left[\hat{H}, \hat{w}\right]. \tag{7.66}$$ [von Neumann equation]
Note that this equation is similar in structure to Eq. (4.199) describing the time evolution, in the
Heisenberg picture, of operators with no explicit time dependence:
$$i\hbar \dot{\hat{A}} = \left[\hat{A}, \hat{H}\right], \tag{7.67}$$
besides the opposite order of the operators in the commutator – equivalent to the change of sign of the
right-hand side. This should not be too surprising, because Eq. (66) belongs to the Schrödinger picture
of quantum dynamics, while Eq. (67), to its Heisenberg picture.
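A minimal numerical check of the von Neumann equation (66) for a two-level system is sketched below. The Hamiltonian and the initial density matrix are hypothetical; the evolution w(t) = U w(0) U† with U = exp{–iHt/ħ} is what Eq. (64) gives when each basis state evolves according to the Schrödinger equation, and the time derivative is approximated by a central finite difference.

```python
import numpy as np

HBAR = 1.0
H = np.array([[1.0, 0.3], [0.3, -1.0]])                  # hypothetical Hermitian Hamiltonian
w0 = np.array([[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]])    # hypothetical initial density matrix

def evolve(w_init, t):
    """w(t) = U w(0) U^dagger, with U built from the eigen-decomposition of H."""
    energies, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * energies * t / HBAR)) @ V.conj().T
    return U @ w_init @ U.conj().T

t, dt = 0.8, 1e-5
w_t = evolve(w0, t)
lhs = 1j * HBAR * (evolve(w0, t + dt) - evolve(w0, t - dt)) / (2 * dt)   # i*hbar*dw/dt
rhs = H @ w_t - w_t @ H                                                   # [H, w]

print("von Neumann equation satisfied:", np.allclose(lhs, rhs, atol=1e-7))
print("eigenvalues W_j at t = 0:", np.linalg.eigvalsh(w0))
print("eigenvalues W_j at t:    ", np.linalg.eigvalsh(w_t))
```

The eigenvalues of the density matrix, i.e. the probabilities Wj, stay constant during such evolution, in accordance with the assumption under which Eq. (66) was derived.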
The most important case when the von Neumann equation is (approximately) valid is when the
“own” Hamiltonian Ĥ s of the system s of our interest is time-independent, and its interaction with the
environment is so small that its effect on the system’s evolution during the considered time interval is
negligible, but it had lasted so long that it gradually put the system into a non-pure state – for example,
but not necessarily, into the classical mixture (24).23 (This is an example of the second case discussed in
Sec. 1, when we need the mixed-ensemble description of the system even if its current interaction with
the environment is negligible.) If the interaction with the environment is stronger, and hence is not
negligible at the considered time interval, Eq. (66) is generally not valid,24 because the probabilities Wj
may change in time. However, this equation may still be used for a discussion of one major effect of the
environment, namely dephasing (also called “decoherence”), within a simple model.
Let us start with the following general model of a system interacting with its environment, which
will be used throughout this chapter:
$$\hat{H} = \hat{H}_s + \hat{H}_e\{\lambda\} + \hat{H}_{\text{int}}\{\lambda\}, \tag{7.68}$$ [Interaction with environment]
22 In some texts, it is called the “Liouville equation”, due to its philosophical proximity to the classical Liouville
theorem for the classical distribution function wcl(X, P) – see, e.g., SM Sec. 6.1 and in particular Eq. (6.5).
23 In the last case, the statistical operator is diagonal in the stationary state basis and hence commutes with the
Hamiltonian. Hence the right-hand side of Eq. (66) vanishes, and it shows that in this basis, the density matrix is
completely time-independent.
24 Very unfortunately, this fact is not explained in some textbooks, which quote the von Neumann equation
without proper qualifications.
where {} denotes the (huge) set of degrees of freedom of the environment.25 Evidently, this model is
useful only if we may somehow tame the enormous size of the Hilbert space of these degrees of
freedom, and so work out the calculations all way to a practicably simple result. This turns out to be
possible mostly if the elementary act of interaction of the system and its environment is in some sense
small. Below, I will describe several cases when this is true; the classical example is the Brownian
particle interacting with the molecules of the surrounding gas or fluid.26 (In this example, a single hit by
a molecule changes the particle’s momentum by a minor fraction.) On the other hand, the model (68) is
not very productive for a particle interacting with the environment consisting of similar particles, when
a single collision may change its momentum dramatically. In such cases, the methods discussed in the
next chapter are more relevant.
Now let us analyze a very simple model of an open two-level quantum system, with its intrinsic
Hamiltonian having the form
$$\hat{H}_s = c_z \hat{\sigma}_z, \tag{7.69}$$
similar to the Pauli Hamiltonian (4.163),²⁷ and a factorable, bilinear interaction – cf. Eq. (6.145) and its
discussion:
$$\hat{H}_{\text{int}} = -\hat{f}\{\lambda\}\, \hat{\sigma}_z, \tag{7.70}$$
where f̂ is a Hermitian operator depending only on the set {λ} of environmental degrees of freedom
(“coordinates”), defined in their Hilbert space – different from that of the two-level system. As a result,
the operators f̂{λ} and Ĥe{λ} commute with σ̂z – and with any other intrinsic operator of the two-level
system. Of course, any realistic Ĥe{λ} is extremely complex, so that how much we will be able to
achieve without specifying it, may be a pleasant surprise for the reader.
Before we proceed to the analysis, let us recognize two examples of two-level systems that may
be described by this model. The first example is a spin-½ in an external magnetic field of a fixed
direction (taken for the axis z), which includes both an average component B̄z and a random
(fluctuating) component B̃z(t) induced by the environment. As it follows from Eq. (4.163b), it may be
described by the Hamiltonian (68)-(70) with
$$c_z = -\frac{\hbar\gamma}{2} \bar{B}_z, \quad \text{and } \hat{f} = \frac{\hbar\gamma}{2} \hat{\tilde{B}}_z(t). \tag{7.71}$$
25 Note that by writing Eq. (68), we are treating the whole system, including the environment, as a Hamiltonian
one. This can always be done if the accounted part of the environment is large enough so that the processes in the
system s of our interest do not depend on the type of boundary between this part and the “external” (even larger)
environment; in particular, we may assume the total system to be closed, i.e. Hamiltonian.
26 The theory of the Brownian motion, the effect first observed experimentally by biologist Robert Brown in the
1820s, was pioneered by Albert Einstein in 1905 and developed in detail by Marian Smoluchowski in 1906-1907
and Adriaan Fokker in 1913. Due to this historic background, in some older texts, the approach described in the
balance of this chapter is called the “quantum theory of the Brownian motion”. Let me, however, emphasize that
due to the later progress of experimental techniques, quantum-mechanical behaviors, including the environmental
effects in them, have been observed in a rapidly growing number of various quasi-macroscopic systems, for which
this approach is quite applicable. In particular, this is true for most systems being explored as possible qubits of
prospective quantum computing and encryption systems – see Sec. 8.5 below.
27 As we know from Secs. 4.6 and 5.1, such Hamiltonian is sufficient to lift the energy level degeneracy.
Another example is a particle in a symmetric double-well potential Us (Fig. 4), with a barrier
between the wells sufficiently high to be practically impenetrable, and an additional force F(t), exerted by
the environment, so that the total potential energy is U(x, t) = Us(x) – F(t)x. If the force, including its
static part F̄ and fluctuations F̃(t), is sufficiently weak, we can neglect its effects on the shape of
potential wells and hence on the localized wavefunctions ψL,R, so that the force effect is reduced to the
variation of the difference EL – ER = F(t)Δx between the eigenenergies. As a result, the system may be
described by Eqs. (68)-(70) with
$$c_z = -\bar{F} \Delta x / 2; \quad \hat{f} = \hat{\tilde{F}}(t)\, \Delta x / 2. \tag{7.72}$$
Fig. 7.4. Dephasing in a double-well system.
Let us start our general analysis of the model described by Eqs. (68)-(70) by writing the equation
of motion for the Heisenberg operator ̂ z t  :
   
iˆ z  ˆ z , Hˆ  (c z  fˆ ) ˆ z , ˆ z  0, (7.73)
showing that in our simple model (68)-(70), the operator ˆ z does not evolve in time. What does this
mean for the observables? For an arbitrary density matrix of any two-level system,
w w12 
w   11 , (7.74)
 w21 w22 
we can readily calculate the trace of the operator $\hat\sigma_z\hat w$. Indeed, since the operator traces are
basis-independent, we can do this in any basis, in particular in the usual z-basis:
$$ \mathrm{Tr}(\hat\sigma_z\hat w) = \mathrm{Tr}(\sigma_z\mathrm{w}) = \mathrm{Tr}\left[\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\right] = w_{11} - w_{22} = W_1 - W_2. \tag{7.75} $$
Since, according to Eq. (5), $\hat\sigma_z$ may be considered the operator for the difference of the number
of particles in the basis states 1 and 2, in the case (73) the difference W1 – W2 does not depend on time,
and since the sum of these probabilities is also fixed, W1 + W2 = 1, both of them are constant. The
physics of this simple result is especially clear for the model shown in Fig. 4: since the potential barrier
separating the potential wells is so high that tunneling through it is negligible, the interaction with the
environment cannot move the system from one well into another one.
It may look like nothing interesting may happen in such a simple situation, but in a minute we
will see that this is not true. Due to the time independence of W1 and W2, we may use the von Neumann
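equation (66) to describe the density matrix evolution. In the usual z-basis:
$$\begin{aligned}
i\hbar\dot{\mathrm{w}} \equiv i\hbar\begin{pmatrix} \dot w_{11} & \dot w_{12} \\ \dot w_{21} & \dot w_{22} \end{pmatrix} &= [\mathrm{H}, \mathrm{w}] = (c_z + \hat f)[\sigma_z, \mathrm{w}] \\
&\equiv (c_z + \hat f)\left[\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \begin{pmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \end{pmatrix}\right] = (c_z + \hat f)\begin{pmatrix} 0 & 2w_{12} \\ -2w_{21} & 0 \end{pmatrix}.
\end{aligned} \tag{7.76}$$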
This result means that while the diagonal elements, i.e., the probabilities of the states, do not evolve in
time (as we already know), the off-diagonal elements do change; for example,
$$ i\hbar\dot w_{12} = 2(c_z + \hat f)w_{12}, \tag{7.77} $$
with a similar but complex-conjugate equation for w21. The solution of this linear differential equation
(77) is straightforward, and yields
$$ w_{12}(t) = w_{12}(0)\exp\left\{-i\frac{2c_z}{\hbar}t\right\}\exp\left\{-i\frac{2}{\hbar}\int_0^t \hat f(t')\,dt'\right\}. \tag{7.78} $$
The first exponent is a deterministic c-number factor, while in the second one $\hat f(t) \equiv \hat f\{\lambda(t)\}$ is still an
operator in the Hilbert space of the environment, but from the point of view of the two-level system of
our interest, it is a random function of time. The time-average part of this function may be included in
cz, so in what follows, we will assume that it equals zero.
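The matrix algebra behind Eqs. (75) and (76) is easy to verify numerically. The following Python sketch is only an illustration: the helper names and the particular test matrix are assumptions, not part of the derivation. It checks that $[\sigma_z, \mathrm{w}]$ has zero diagonal elements and off-diagonal elements $2w_{12}$ and $-2w_{21}$, and that $\mathrm{Tr}(\sigma_z\mathrm{w}) = w_{11} - w_{22}$.

```python
# Illustrative check of Eqs. (7.75)-(7.76) for a 2x2 density matrix.
# All helper names and the test matrix below are hypothetical.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    """Sum of the diagonal elements."""
    return a[0][0] + a[1][1]

SIGMA_Z = [[1, 0], [0, -1]]

# An arbitrary Hermitian matrix with unit trace (assumed values).
w = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]

comm = commutator(SIGMA_Z, w)      # should equal [[0, 2*w12], [-2*w21, 0]]
diff = trace(matmul(SIGMA_Z, w))   # should equal w11 - w22
```

Since the commutator has no diagonal part, the probabilities $W_{1,2}$ cannot change, while the off-diagonal elements acquire the factor 2 that appears in Eq. (77).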
Let us start from the limit when the environment behaves classically.28 In this case, the operator
in Eq. (78) may be considered as a classical random function of time f(t), provided that we average its
effects over a statistical ensemble of many functions f(t) describing many (macroscopically similar)
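experiments. For a small time interval t = dt → 0, we can use the Taylor expansion of the exponent,
truncating it after the quadratic term:
$$\begin{aligned}
\left\langle\exp\left\{-i\frac{2}{\hbar}\int_0^{dt} f(t')\,dt'\right\}\right\rangle &\approx 1 + \left\langle -i\frac{2}{\hbar}\int_0^{dt} f(t')\,dt'\right\rangle + \frac{1}{2}\left\langle\left(-i\frac{2}{\hbar}\int_0^{dt} f(t')\,dt'\right)\left(-i\frac{2}{\hbar}\int_0^{dt} f(t'')\,dt''\right)\right\rangle \\
&\equiv 1 - i\frac{2}{\hbar}\int_0^{dt}\langle f(t')\rangle\,dt' - \frac{2}{\hbar^2}\int_0^{dt}dt'\int_0^{dt}dt''\,\langle f(t')f(t'')\rangle \equiv 1 - \frac{2}{\hbar^2}\int_0^{dt}dt'\int_0^{dt}dt''\,K_f(t' - t'').
\end{aligned} \tag{7.79}$$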
Here we have used the facts that the statistical average of f(t) is equal to zero, while the second average,
called the correlation function, in a statistically- (i.e. macroscopically-) stationary state of any
environment may only depend on the time difference $\tau \equiv t' - t''$:
Correlation function:
$$ \langle f(t')f(t'')\rangle = K_f(t' - t'') \equiv K_f(\tau). \tag{7.80} $$
If this difference is much larger than some time scale $\tau_c$, called the correlation time of the environment,
the values f(t') and f(t") are completely independent (uncorrelated), as illustrated in Fig. 5a, so that at
$\tau \to \pm\infty$, the correlation function has to tend to zero. On the other hand, at $\tau = 0$, i.e. t' = t", the correlation
function is just the variance of f:
28 This assumption is not in contradiction with the need for the quantum treatment of the two-level system s,
because a typical environment is large, and hence has a very dense energy spectrum, with the distances between
adjacent levels that may be readily bridged by thermal excitations of small energies, often making it essentially classical.
$$ K_f(0) = \langle f^2\rangle, \tag{7.81} $$
and has to be positive. As a result, the function looks (semi-quantitatively) as shown in Fig. 5b.
[Fig. 7.5. (a) A typical random process f(t) and (b) its correlation function $\langle f(t')f(t'')\rangle$ as a function of the time difference t' – t", with the characteristic width $\tau_c$ – schematically.]
Hence, if we are only interested in time differences $\tau$ much longer than $\tau_c$, which is typically
very short, we may approximate $K_f(\tau)$ well with a delta function of the time difference. Let us take it in
the following form, convenient for later discussion:
Phase diffusion coefficient:
$$ K_f(\tau) \approx \hbar^2 D_\varphi\,\delta(\tau), \tag{7.82} $$
where $D_\varphi$ is a positive constant called the phase diffusion coefficient. The origin of this term stems from
the very similar effect of classical diffusion of Brownian particles in a highly viscous medium. Indeed,
the particle’s velocity in such a medium is approximately proportional to the external force. Hence, if
the random hits of a particle by the medium’s molecules may be described by a force that obeys a law
similar to Eq. (82), the velocity (along any Cartesian coordinate) is also delta-correlated:
$$ \langle v(t)\rangle = 0, \qquad \langle v(t')v(t'')\rangle = 2D\,\delta(t' - t''). \tag{7.83} $$
Now we can integrate the kinematic relation $\dot x = v$ to calculate the particle's displacement from its initial
position during a time interval [0, t], and its variance:
$$ x(t) - x(0) = \int_0^t v(t')\,dt', \tag{7.84} $$
$$ \left\langle\left(x(t) - x(0)\right)^2\right\rangle = \left\langle\int_0^t v(t')\,dt'\int_0^t v(t'')\,dt''\right\rangle = \int_0^t dt'\int_0^t dt''\,\langle v(t')v(t'')\rangle = \int_0^t dt'\int_0^t dt''\,2D\,\delta(t' - t'') = 2Dt. \tag{7.85} $$
This is the famous law of diffusion, showing that the r.m.s. deviation of the particle from the initial point
grows with time as $(2Dt)^{1/2}$, where the constant D is called the diffusion coefficient.
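The diffusion law (85) may be illustrated with a simple Monte Carlo experiment. In the sketch below, which is an illustration under stated assumptions rather than a part of the derivation, the delta-correlated velocity (83) is replaced by independent Gaussian displacements with the variance 2Ddt on each time step dt; the Gaussian statistics, the function name, and all parameter values are assumptions made only for this example.

```python
import random

def mean_square_displacement(D, t, n_steps=20, n_traj=5000, seed=1):
    """Monte Carlo estimate of <(x(t) - x(0))^2> for delta-correlated velocity.

    On each step dt the displacement is taken as a Gaussian random number
    with variance 2*D*dt, the discrete counterpart of Eq. (7.83).
    (Gaussian increments are an assumption of this sketch.)
    """
    rng = random.Random(seed)
    dt = t / n_steps
    sigma = (2.0 * D * dt) ** 0.5
    total = 0.0
    for _ in range(n_traj):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, sigma)
        total += x * x
    return total / n_traj
```

For example, `mean_square_displacement(0.5, 2.0)` should return a value close to 2Dt = 2, within the statistical scatter of a few percent expected for 5,000 trajectories.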
Returning to the diffusion of the quantum-mechanical phase, with Eq. (82) the last term of Eq. (79),
with its double integral, equals $2D_\varphi dt$, so that the statistical average of Eq. (78) is
$$ \langle w_{12}(dt)\rangle = w_{12}(0)\exp\left\{-i\frac{2c_z}{\hbar}dt\right\}\left(1 - 2D_\varphi dt\right). \tag{7.86} $$
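Applying this formula to sequential time intervals,
$$ \langle w_{12}(2dt)\rangle = \langle w_{12}(dt)\rangle\exp\left\{-i\frac{2c_z}{\hbar}dt\right\}\left(1 - 2D_\varphi dt\right) = w_{12}(0)\exp\left\{-i\frac{2c_z}{\hbar}2dt\right\}\left(1 - 2D_\varphi dt\right)^2, \tag{7.87} $$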
etc., for a finite time t = Ndt, in the limit N → ∞ and dt → 0 (at fixed t) we get
$$ \langle w_{12}(t)\rangle = w_{12}(0)\exp\left\{-i\frac{2c_z}{\hbar}t\right\}\times\lim_{N\to\infty}\left(1 - 2D_\varphi t\,\frac{1}{N}\right)^N. \tag{7.88} $$
By the definition of the natural logarithm base e,29 this limit is just $\exp\{-2D_\varphi t\}$, so that, finally:
Two-level system: dephasing
$$ \langle w_{12}(t)\rangle = w_{12}(0)\exp\left\{-i\frac{2c_z}{\hbar}t\right\}\exp\left\{-2D_\varphi t\right\} \equiv w_{12}(0)\exp\left\{-i\frac{2c_z}{\hbar}t\right\}\exp\left\{-\frac{t}{T_2}\right\}. \tag{7.89} $$
So, due to coupling to the environment, the off-diagonal elements of the density matrix decay
with some dephasing time $T_2 = 1/2D_\varphi$, providing a natural evolution from the density matrix (22) of a
pure state to the diagonal matrix (23), with the same probabilities W1,2, describing a fully dephased
(incoherent) classical mixture.30
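The exponential decay (89) may also be seen in a direct numerical experiment. The sketch below averages the random factor of Eq. (78) over an ensemble of phase trajectories. It assumes, only for this illustration, that f(t) is a Gaussian process, so that with the correlation function (82) the phase $(2/\hbar)\int f\,dt'$ accumulated on each time step dt is a Gaussian random number with the variance $4D_\varphi dt$; the function name and the parameter values are hypothetical.

```python
import cmath
import math
import random

def average_phase_factor(D_phi, t, n_steps=20, n_traj=5000, seed=2):
    """Ensemble average of exp{-i*phi(t)} for a randomly diffusing phase.

    phi(t) = (2/hbar) * integral of f(t') dt'. With <f f> = hbar^2 D_phi delta,
    each step dt adds an increment of variance 4*D_phi*dt
    (Gaussian increments are an assumption of this sketch).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(4.0 * D_phi * t / n_steps)
    acc = 0j
    for _ in range(n_traj):
        phi = sum(rng.gauss(0.0, sigma) for _ in range(n_steps))
        acc += cmath.exp(-1j * phi)
    return acc / n_traj
```

With $D_\varphi = 0.25$ and t = 1 (in arbitrary reciprocal units), the real part of the result should be close to $\exp\{-2D_\varphi t\} = \exp\{-t/T_2\} \approx 0.61$, and the imaginary part close to zero, up to the statistical noise of the finite ensemble.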
This simple model offers a very clear look at the nature of the decoherence: the random “force”
f(t), exerted by the environment, “shakes” the energy difference between two eigenstates of the system
and hence the instantaneous velocity $2(c_z + f)/\hbar$ of their mutual phase shift φ(t) – cf. Eq. (22). Due to the
randomness of the force, φ(t) performs a random walk around the trigonometric circle, so that the
average of its trigonometric functions exp{±iφ} over time gradually tends to zero, killing the off-
diagonal elements of the density matrix. Our analysis, however, has left open two important issues:
(i) Is this approach valid for a quantum description of a typical environment?
(ii) If yes, what is physically the $D_\varphi$ that was formally defined by Eq. (82)?
7.4. Fluctuation-dissipation theorem

Similar questions may be asked about a more general situation, when the Hamiltonian $\hat H_s$ of the
system of interest (s), in the composite Hamiltonian (68), is not specified at all, but the interaction
between that system and its environment still has a bilinear form similar to Eqs. (70) and (6.130):
$$ \hat H_{int} = -\hat F\{\lambda\}\,\hat x, \tag{7.90} $$

where x is some observable of our system s – say, its generalized coordinate or generalized momentum.
It may look incredible that in this very general situation one still can make a very simple and powerful
statement about the statistical properties of the generalized force F, under only two (interrelated)
conditions – which are satisfied in a huge number of cases of interest:
(i) the coupling of system s of interest to its environment e is weak – in the sense that the
perturbation theory (see Chapter 6) is applicable, and
29 See, e.g., MA Eq. (1.2a) with $n = -N/2D_\varphi t$.

30 Note that this result is valid only if the approximation (82) may be applied at the time interval dt which, in turn,
should be much smaller than the $T_2$ in Eq. (89), i.e. if the dephasing time is much longer than the environment's
correlation time $\tau_c$. This requirement may always be satisfied by making the coupling to the environment
sufficiently weak. In addition, in typical environments, $\tau_c$ is very short. For example, in the original Brownian
motion experiments with a-few-μm pollen grains in water, it is of the order of the average interval between
sequential molecular impacts, of the order of $10^{-21}$ s.
(ii) the environment may be considered as staying in thermodynamic equilibrium, with a certain
temperature T, regardless of the process in the system of interest.31
This famous statement is called the fluctuation-dissipation theorem (FDT).32 Due to the
importance of this fundamental result, let me derive it.33 Since by writing Eq. (68) we treat the whole
system (s + e) as a Hamiltonian one, we may use the Heisenberg equation (4.199) to write
$$ i\hbar\dot{\hat F} = [\hat F, \hat H] = [\hat F, \hat H_e], \tag{7.91} $$
because, as was discussed in the last section, the operator $\hat F\{\lambda\}$ commutes with both $\hat H_s$ and $\hat x$. Generally,
very little may be done with this equation, because the time evolution of the environment’s Hamiltonian
depends, in turn, on that of the force. This is where the perturbation theory becomes indispensable. Let
us decompose the force operator into the following sum:
$$ \hat F\{\lambda\} = \langle\hat F\rangle + \hat{\tilde F}(t), \quad \text{with } \langle\hat{\tilde F}(t)\rangle = 0, \tag{7.92} $$
where (here and on, until further notice) the sign ⟨…⟩ means the statistical averaging over the
environment alone, i.e. over an ensemble with absolutely similar evolutions of the system s, but random
states of its environment.34 From the point of view of the system s, the first term of the sum (still an
operator!) describes the average response of the environment to the system dynamics (possibly,
including such irreversible effects as friction), and has to be calculated with a proper account of their
interaction – as we will do later in this section. On the other hand, the last term in Eq. (92) represents
random fluctuations of the environment, which exist even in the absence of the system s. Hence, in the
first non-zero approximation in the interaction strength, the fluctuation part may be calculated ignoring
the interaction, i.e. treating the environment as being in thermodynamic equilibrium:
$$ i\hbar\dot{\hat{\tilde F}} = \left[\hat{\tilde F}, \hat H_e\big|_{eq}\right]. \tag{7.93} $$
Since in this approximation the environment’s Hamiltonian does not have an explicit dependence on
time, the solution of this equation may be written by combining Eqs. (4.190) and (4.175):
31 The most frequent example of the violation of this condition is the environment’s overheating by the energy
flow from system s. Let me leave it to the reader to estimate the overheating of a standard physical laboratory
room by a typical dissipative quantum process – the emission of an optical photon by an atom. (Hint: it is
extremely small.)
32 The FDT was first derived by Herbert Callen and Theodore Allen Welton in 1951, on the background of an
earlier derivation of its classical limit by Harry Nyquist in 1928.
33 The FDT may be proved in several ways that are shorter than the one given below – see, e.g., either the proof in
SM Secs. 5.5 and 5.6 (based on H. Nyquist’s arguments), or the original paper by H. Callen and T. Welton, Phys.
Rev. 83, 34 (1951) – wonderful in its clarity. The longer approach I will describe here, besides giving the
important Green-Kubo formula (109) as a byproduct, is a very useful exercise in the operator manipulation and
the perturbation theory in its integral form – different from the differential forms used in Chapter 6. If the reader
is not interested in this exercise, they may skip the derivation and jump straight to the result expressed by Eq.
(134), which uses the notions defined by Eqs. (114) and (123).
34 For usual (“ergodic”) environments, without intrinsic long-term memories, this statistical averaging over an
ensemble of environments is equivalent to averaging over intermediate times – much longer than the correlation
time $\tau_c$ of the environment, but still much shorter than the characteristic time of evolution of the system under
analysis, such as the dephasing time T2 and the energy relaxation time T1 – both still to be calculated.
$$ \hat F(t) = \exp\left\{+\frac{i}{\hbar}\hat H_e\big|_{eq}\,t\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e\big|_{eq}\,t\right\}. \tag{7.94} $$
Let us use this relation to calculate the correlation function of the fluctuations $\tilde F(t)$, defined
similarly to Eq. (80), but taking care of the order of the time arguments (very soon we will see why):
$$ \langle\tilde F(t)\tilde F(t')\rangle = \left\langle\exp\left\{+\frac{i}{\hbar}\hat H_e t\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t\right\}\exp\left\{+\frac{i}{\hbar}\hat H_e t'\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t'\right\}\right\rangle. \tag{7.95} $$
(Here, for the notation brevity, the thermal equilibrium of the environment is just implied.) We may
calculate this expectation value in any basis, and the best choice for it is evident: in the environment’s
stationary-state basis, the density operator of the environment, its Hamiltonian, and hence the exponents
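in Eq. (95) are all represented by diagonal matrices. Using Eq. (5), the correlation function becomes
$$\begin{aligned}
\langle\tilde F(t)\tilde F(t')\rangle &= \mathrm{Tr}\left[\hat w\exp\left\{\frac{i}{\hbar}\hat H_e t\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t\right\}\exp\left\{\frac{i}{\hbar}\hat H_e t'\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t'\right\}\right] \\
&\equiv \sum_n\left[\hat w\exp\left\{\frac{i}{\hbar}\hat H_e t\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t\right\}\exp\left\{\frac{i}{\hbar}\hat H_e t'\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t'\right\}\right]_{nn} \\
&= \sum_{n,n'}W_n\exp\left\{\frac{i}{\hbar}E_n t\right\}F_{nn'}\exp\left\{-\frac{i}{\hbar}E_{n'}t\right\}\exp\left\{\frac{i}{\hbar}E_{n'}t'\right\}F_{n'n}\exp\left\{-\frac{i}{\hbar}E_n t'\right\} \\
&\equiv \sum_{n,n'}W_n\left|F_{nn'}\right|^2\exp\left\{\frac{i}{\hbar}\left(E_n - E_{n'}\right)(t - t')\right\}.
\end{aligned} \tag{7.96}$$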
Here Wn are the Gibbs distribution probabilities given by Eq. (24), with the environment’s temperature
T, and $F_{nn'} \equiv F_{nn'}(0)$ are the Schrödinger-picture matrix elements of the interaction force operator.
We see that though the correlator (96) is a function of the difference $\tau \equiv t - t'$ only (as it should
be for fluctuations in a macroscopically stationary system), it may depend on the order of its arguments.
This is why let us mark this particular correlation function with the upper index “+”,
$$ K_F^+(\tau) \equiv \langle\tilde F(t)\tilde F(t')\rangle = \sum_{n,n'}W_n\left|F_{nn'}\right|^2\exp\left\{\frac{i\tilde E\tau}{\hbar}\right\}, \quad \text{where } \tilde E \equiv E_n - E_{n'}, \tag{7.97} $$
while its counterpart, with the swapped times t and t', with the upper index “–”:
$$ K_F^-(\tau) \equiv K_F^+(-\tau) = \langle\tilde F(t')\tilde F(t)\rangle = \sum_{n,n'}W_n\left|F_{nn'}\right|^2\exp\left\{-\frac{i\tilde E\tau}{\hbar}\right\}. \tag{7.98} $$
So, in contrast with classical processes, in quantum mechanics the correlation function of fluctuations
$\tilde F$ is not necessarily time-symmetric:
$$ K_F^+(\tau) - K_F^-(\tau) \equiv K_F^+(\tau) - K_F^+(-\tau) = \langle\tilde F(t)\tilde F(t') - \tilde F(t')\tilde F(t)\rangle = 2i\sum_{n,n'}W_n\left|F_{nn'}\right|^2\sin\frac{\tilde E\tau}{\hbar} \neq 0, \tag{7.99} $$
so that $\hat F(t)$ gives one more example of a Heisenberg-picture operator whose “values”, taken in
different moments of time, generally do not commute – see Footnote 49 in Chapter 4. (A good sanity
check here is that at $\tau = 0$, i.e. at t = t', the difference (99) between $K_F^+$ and $K_F^-$ vanishes.)
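The structure of the correlator (96) and its time asymmetry may be checked on a toy model. The sketch below uses a hypothetical three-level “environment” (the level energies, the Hermitian force matrix, the temperature, and the units with ħ = 1 are all assumptions made for illustration). It calculates $\langle\hat F(t)\hat F(t')\rangle$ directly, as the trace of the matrix product in the energy basis, and compares it with the double sum over the levels; swapping the time arguments gives the complex-conjugate value.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def gibbs_weights(energies, kT):
    """Equilibrium probabilities W_n = exp(-E_n/kT) / Z."""
    boltz = [math.exp(-e / kT) for e in energies]
    z = sum(boltz)
    return [b / z for b in boltz]

def heisenberg(F, energies, t):
    """Matrix elements F_nn'(t) = F_nn' exp{i (E_n - E_n') t / hbar}."""
    n = len(energies)
    return [[F[a][b] * cmath.exp(1j * (energies[a] - energies[b]) * t / HBAR)
             for b in range(n)] for a in range(n)]

def correlator(F, energies, kT, t, t_prime):
    """<F(t) F(t')> = Tr[w F(t) F(t')], with w diagonal in the energy basis."""
    W = gibbs_weights(energies, kT)
    Ft = heisenberg(F, energies, t)
    Ftp = heisenberg(F, energies, t_prime)
    n = len(energies)
    return sum(W[a] * Ft[a][b] * Ftp[b][a] for a in range(n) for b in range(n))

def closed_form(F, energies, kT, tau):
    """Double sum of W_n |F_nn'|^2 exp{i (E_n - E_n') tau / hbar}."""
    W = gibbs_weights(energies, kT)
    n = len(energies)
    return sum(W[a] * abs(F[a][b]) ** 2
               * cmath.exp(1j * (energies[a] - energies[b]) * tau / HBAR)
               for a in range(n) for b in range(n))

# Hypothetical 3-level environment; F is Hermitian with zero diagonal,
# so that its equilibrium average vanishes (a pure "fluctuation" operator).
E = [0.0, 1.0, 2.5]
F = [[0.0, 1.0, 0.5j], [1.0, 0.0, 0.3], [-0.5j, 0.3, 0.0]]
```

For any pair of times, `correlator(F, E, kT, t, t_prime)` coincides with `closed_form(F, E, kT, t - t_prime)`, and the difference between the two time orderings is purely imaginary, vanishing at t = t'.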
Now let us return to the force operator’s decomposition (92), and calculate its first (average)
component. To do that, let us write the formal solution of Eq. (91) as follows:
$$ \hat F(t) = \frac{1}{i\hbar}\int_{-\infty}^{t}\left[\hat F(t'), \hat H_e(t')\right]dt'. \tag{7.100} $$
On the right-hand side of this relation, we still cannot treat the Hamiltonian of the environment as an
unperturbed (equilibrium) one, even if the effect of our system (s) on the environment is very weak,
because this would give zero statistical average of the force F(t). Hence, we should make one more step
of our perturbative treatment, taking into account the effect of the force on the environment. To do this,
let us use Eqs. (68) and (90) to write the (so far, exact) Heisenberg equation of motion for the
environment's Hamiltonian,
$$ i\hbar\dot{\hat H}_e = \left[\hat H_e, \hat H\right] = -\hat x\left[\hat H_e, \hat F\right], \tag{7.101} $$
and its formal solution, similar to Eq. (100), but for time t' rather than t:
$$ \hat H_e(t') = -\frac{1}{i\hbar}\int_{-\infty}^{t'}\hat x(t'')\left[\hat H_e(t''), \hat F(t'')\right]dt''. \tag{7.102} $$
Plugging this equality into the right-hand side of Eq. (100), and averaging the result (again, over the
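environment only!), we get
$$ \langle\hat F(t)\rangle = \frac{1}{\hbar^2}\int_{-\infty}^{t}dt'\int_{-\infty}^{t'}dt''\,\hat x(t'')\left\langle\left[\hat F(t'), \left[\hat H_e(t''), \hat F(t'')\right]\right]\right\rangle. \tag{7.103} $$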
This is still an exact result, but now it is ready for an approximate treatment, implemented by
averaging in its right-hand side over the unperturbed (thermal-equilibrium) state of the environment.
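This may be done absolutely similarly to that in Eq. (96), at the last step using Eq. (94):
$$\begin{aligned}
\left\langle\left[\hat F(t'), \left[\hat H_e(t''), \hat F(t'')\right]\right]\right\rangle &= \mathrm{Tr}\left\{\mathrm{w}\left[\mathrm{F}(t'), \left[\mathrm{H}_e, \mathrm{F}(t'')\right]\right]\right\} \\
&= \mathrm{Tr}\left\{\mathrm{w}\left[\mathrm{F}(t')\mathrm{H}_e\mathrm{F}(t'') - \mathrm{F}(t')\mathrm{F}(t'')\mathrm{H}_e - \mathrm{H}_e\mathrm{F}(t'')\mathrm{F}(t') + \mathrm{F}(t'')\mathrm{H}_e\mathrm{F}(t')\right]\right\} \\
&= \sum_{n,n'}W_n\left[F_{nn'}(t')E_{n'}F_{n'n}(t'') - F_{nn'}(t')F_{n'n}(t'')E_n - E_nF_{nn'}(t'')F_{n'n}(t') + F_{nn'}(t'')E_{n'}F_{n'n}(t')\right] \\
&\equiv -\sum_{n,n'}W_n\tilde E\left|F_{nn'}\right|^2\left[\exp\left\{\frac{i\tilde E(t' - t'')}{\hbar}\right\} + \mathrm{c.c.}\right].
\end{aligned} \tag{7.104}$$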
Now, if we try to integrate each term of this sum, as Eq. (103) seems to require, we will see that the
lower-limit substitution (at t', t" → –∞) is uncertain because the exponents oscillate without decay. This
mathematical difficulty may be overcome by the following physical reasoning. As illustrated by the
example considered in the previous section, coupling to a disordered environment makes the “memory
horizon” of the system of our interest (s) finite: its current state does not depend on its history beyond a
certain time scale.35 As a result, the function under the integrals of Eq. (103), i.e. the sum (104), should
35 Actually, this is true for virtually any real physical system – in contrast to idealized models such as a
dissipation-free oscillator that swings for ever and ever with the same amplitude and phase, thus “remembering”
the initial conditions.
self-average at a certain finite time. A simplistic technique for expressing this fact mathematically is just
dropping the lower-limit substitution; this would give the correct result for Eq. (103). However, a better
(mathematically more acceptable) trick is to first multiply the functions under the integrals by,
respectively, $\exp\{-\varepsilon(t - t')\}$ and $\exp\{-\varepsilon(t' - t'')\}$, where ε is a very small positive constant, then carry out
the integration, and after that follow the limit ε → 0. The physical justification of this procedure may be
provided by saying that the system's behavior should not be affected if its interaction with the
environment was not kept constant but rather turned on gradually – say, exponentially with an
infinitesimal rate ε. With this modification, Eq. (103) becomes
$$ \langle\hat F(t)\rangle = -\frac{1}{\hbar^2}\sum_{n,n'}W_n\tilde E\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\int_{-\infty}^{t}dt'\int_{-\infty}^{t'}dt''\,\hat x(t'')\left[\exp\left\{\frac{i\tilde E(t' - t'')}{\hbar} + \varepsilon(t'' - t)\right\} + \mathrm{c.c.}\right]. \tag{7.105} $$
This double integration is over the area shaded in Fig. 6, which makes it obvious that the order of
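integration may be changed to the opposite one as
$$ \int_{-\infty}^{t}dt'\int_{-\infty}^{t'}dt''\ldots = \int_{-\infty}^{t}dt''\int_{t''}^{t}dt'\ldots = \int_{-\infty}^{t}dt''\int_{t''-t}^{0}d(t' - t)\ldots = \int_{-\infty}^{t}dt''\int_{0}^{\tau}d\tau'\ldots, \tag{7.106} $$
where $\tau' \equiv t - t'$, and $\tau \equiv t - t''$.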

[Fig. 7.6. The 2D integration area in Eqs. (105) and (106): the part of the [t', t"] plane with t" ≤ t' ≤ t.]
As a result, Eq. (105) may be rewritten as a single integral,
Average environment's response:
$$ \langle\hat F(t)\rangle = \int_{-\infty}^{t}G(t - t'')\,\hat x(t'')\,dt'' \equiv \int_{0}^{\infty}G(\tau)\,\hat x(t - \tau)\,d\tau, \tag{7.107} $$
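whose kernel,
$$\begin{aligned}
G(\tau > 0) &\equiv -\frac{1}{\hbar^2}\sum_{n,n'}W_n\tilde E\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\int_0^{\tau}\left[\exp\left\{\frac{i\tilde E(\tau - \tau')}{\hbar} - \varepsilon\tau\right\} + \mathrm{c.c.}\right]d\tau' \\
&= -\frac{2}{\hbar}\lim_{\varepsilon\to 0}\sum_{n,n'}W_n\left|F_{nn'}\right|^2\sin\frac{\tilde E\tau}{\hbar}\,e^{-\varepsilon\tau} \equiv -\frac{2}{\hbar}\sum_{n,n'}W_n\left|F_{nn'}\right|^2\sin\frac{\tilde E\tau}{\hbar},
\end{aligned} \tag{7.108}$$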
does not depend on the particular law of evolution of the system (s) under study, i.e. provides a general
characterization of its coupling to the environment.
In Eq. (107) we may readily recognize the most general form of the linear response of a system
(in our case, the environment), taking into account the causality principle, where $G(\tau)$ is the response
function (also called the “temporal Green's function”) of the environment. Now comparing Eq. (108)
with Eq. (99), we get a wonderfully simple universal relation,
Green-Kubo formula:
$$ \left\langle\left[\hat{\tilde F}(\tau), \hat{\tilde F}(0)\right]\right\rangle = -i\hbar G(\tau), \tag{7.109} $$
that emphasizes once again the quantum nature of the correlation function’s time asymmetry. (This
relation, called the Green-Kubo (or just “Kubo”) formula after the works by Melville Green (1954) and
Ryogo Kubo (1957), does not come up in the easier derivations of the FDT, mentioned in the beginning
of this section.)
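However, for us the relation between the function $G(\tau)$ and the force's anti-commutator,
$$ \left\langle\left\{\hat{\tilde F}(t + \tau), \hat{\tilde F}(t)\right\}\right\rangle \equiv \left\langle\hat{\tilde F}(t + \tau)\hat{\tilde F}(t) + \hat{\tilde F}(t)\hat{\tilde F}(t + \tau)\right\rangle \equiv K_F^+(\tau) + K_F^-(\tau), \tag{7.110} $$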
is much more important, because of the following reason. Eqs. (97)-(98) show that the so-called
symmetrized correlation function,
$$ K_F(\tau) \equiv \frac{K_F^+(\tau) + K_F^-(\tau)}{2} = \frac{1}{2}\left\langle\left\{\hat{\tilde F}(\tau), \hat{\tilde F}(0)\right\}\right\rangle = \lim_{\varepsilon\to 0}\sum_{n,n'}W_n\left|F_{nn'}\right|^2\cos\frac{\tilde E\tau}{\hbar}\,e^{-2\varepsilon|\tau|} \equiv \sum_{n,n'}W_n\left|F_{nn'}\right|^2\cos\frac{\tilde E\tau}{\hbar}, \tag{7.111} $$
which is an even function of the time difference τ, looks very similar to the response function (108),
“only” with another trigonometric function under the sum, and a constant front factor.36 This similarity
may be used to obtain a direct algebraic relation between the Fourier images of these two functions of τ.
Indeed, the function (111) may be represented as the Fourier integral37
$$ K_F(\tau) = \int_{-\infty}^{+\infty}S_F(\omega)\,e^{-i\omega\tau}d\omega = 2\int_0^{+\infty}S_F(\omega)\cos\omega\tau\,d\omega, \tag{7.112} $$
with the reciprocal transform
$$ S_F(\omega) = \frac{1}{2\pi}\int_{-\infty}^{+\infty}K_F(\tau)\,e^{i\omega\tau}d\tau = \frac{1}{\pi}\int_0^{+\infty}K_F(\tau)\cos\omega\tau\,d\tau, \tag{7.113} $$
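of the symmetrized spectral density of the variable F, defined as
Symmetrized spectral density:
$$ S_F(\omega)\,\delta(\omega + \omega') \equiv \frac{1}{2}\left\langle\hat F_\omega\hat F_{\omega'} + \hat F_{\omega'}\hat F_\omega\right\rangle \equiv \frac{1}{2}\left\langle\left\{\hat F_\omega, \hat F_{\omega'}\right\}\right\rangle, \tag{7.114} $$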
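where the function $\hat F_\omega$ (also a Heisenberg operator rather than a c-number!) is defined as
$$ \hat F_\omega \equiv \frac{1}{2\pi}\int_{-\infty}^{+\infty}\hat{\tilde F}(t)\,e^{i\omega t}dt, \quad \text{so that } \hat{\tilde F}(t) = \int_{-\infty}^{+\infty}\hat F_\omega\,e^{-i\omega t}d\omega. \tag{7.115} $$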
The physical meaning of the function $S_F(\omega)$ becomes clear if we write Eq. (112) for the
particular case $\tau = 0$:
36 For the heroic reader who has suffered through the calculations up to this point: our conceptual work is done!
What remains is just some simple math to bring the relation between Eqs. (108) and (111) to an explicit form.
37 Due to their practical importance, and certain mathematical issues of their justification for random functions,
Eqs. (112)-(113) have their own grand name, the Wiener-Khinchin theorem, though the math rigor aside, they are
just a straightforward corollary of the standard Fourier integral transform (115).
 
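$$ K_F(0) \equiv \left\langle\hat{\tilde F}^2\right\rangle = \int_{-\infty}^{+\infty}S_F(\omega)\,d\omega \equiv 2\int_0^{+\infty}S_F(\omega)\,d\omega. \tag{7.116} $$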
This formula implies that if we pass the function F(t) through a linear filter cutting from its frequency
spectrum a narrow band dω of physical (positive) frequencies, then the variance $\langle F_f^2\rangle$ of the filtered
signal $F_f(t)$ would be equal to $2S_F(\omega)d\omega$ – hence the name “spectral density”.38
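The transform pair (112)-(113) may be tried on a simple example. The sketch below assumes, purely for illustration, an exponentially decaying correlation function $K(\tau) = \exp\{-|\tau|/\tau_c\}$ with $\tau_c = 1$ (this model and all function names are assumptions, not results of this section). For it, Eq. (113) gives the Lorentzian $S(\omega) = \tau_c/\pi(1 + \omega^2\tau_c^2)$, which the code reproduces by a straightforward trapezoidal integration.

```python
import math

TAU_C = 1.0  # assumed correlation time (arbitrary units)

def K_exp(tau):
    """Hypothetical correlation function with K(0) = 1."""
    return math.exp(-abs(tau) / TAU_C)

def S_exact(omega):
    """Analytic cosine transform (7.113) of K_exp: a Lorentzian."""
    return TAU_C / (math.pi * (1.0 + (omega * TAU_C) ** 2))

def spectral_density(K, omega, tau_max, n=40_000):
    """S(omega) = (1/pi) * integral from 0 to tau_max of K(tau) cos(omega tau),
    evaluated with the trapezoidal rule; tau_max must be much larger
    than the correlation time for the truncation to be harmless."""
    h = tau_max / n
    total = 0.5 * (K(0.0) + K(tau_max) * math.cos(omega * tau_max))
    for k in range(1, n):
        tau = k * h
        total += K(tau) * math.cos(omega * tau)
    return total * h / math.pi
```

In agreement with Eq. (116), the integral of this Lorentzian over all frequencies equals K(0) = 1. In the limit $\tau_c \to 0$ at fixed area under $K(\tau)$, the spectral density becomes flat, as it should be for a delta-correlated process.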
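Let us use Eqs. (111) and (113) to calculate the spectral density of fluctuations $\tilde F(t)$ in our
model, using the same ε-trick as at the derivation of Eq. (108), to quench the upper-limit substitution:
$$\begin{aligned}
S_F(\omega) &= \sum_{n,n'}W_n\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\frac{1}{2\pi}\int_{-\infty}^{+\infty}\cos\frac{\tilde E\tau}{\hbar}\,e^{-\varepsilon|\tau|}e^{i\omega\tau}d\tau \\
&= \frac{1}{2\pi}\sum_{n,n'}W_n\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\int_0^{+\infty}\left[\exp\left\{\frac{i\tilde E\tau}{\hbar}\right\} + \mathrm{c.c.}\right]e^{-\varepsilon\tau}e^{i\omega\tau}d\tau \\
&= \frac{1}{2\pi}\sum_{n,n'}W_n\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\left[\frac{1}{i(\tilde E/\hbar - \omega) + \varepsilon} + \frac{1}{-i(\tilde E/\hbar + \omega) + \varepsilon}\right].
\end{aligned} \tag{7.117}$$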
Now it is a convenient time to recall that each of the two summations here is over the eigenenergies of
the environment, whose spectrum is virtually continuous because of its large size, so that we may
transform each sum into an integral – just as this was done in Sec. 6.6:
$$ \sum_n\ldots \to \int\ldots dn = \int\ldots\rho(E_n)\,dE_n, \tag{7.118} $$
where $\rho(E) \equiv dn/dE$ is the environment's density of states at a given energy. This transformation yields
1 2 1 1 
S F    lim  0  dE nW ( E n )  E n  dE n'  ( E n' ) Fnn'  ~   . (7.119)
2  ~
 
i E /      i  E /       
Since the expression inside the square bracket depends only on a specific linear combination of two
energies, namely on $\tilde E \equiv E_n - E_{n'}$, it is convenient to introduce also another, linearly-independent
combination of the energies, for example, the average energy $\bar E \equiv (E_n + E_{n'})/2$, so that the state energies
may be represented as
$$ E_n = \bar E + \frac{\tilde E}{2}, \qquad E_{n'} = \bar E - \frac{\tilde E}{2}. \tag{7.120} $$
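With this notation, Eq. (119) becomes
$$\begin{aligned}
S_F(\omega) = \frac{\hbar}{2\pi}\lim_{\varepsilon\to 0}\int d\bar E\,\Bigg[&\int d\tilde E\;W\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E - \frac{\tilde E}{2}\right)\left|F_{nn'}\right|^2\frac{1}{i(\tilde E - \hbar\omega) + \hbar\varepsilon} \\
+ &\int d\tilde E\;W\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E - \frac{\tilde E}{2}\right)\left|F_{nn'}\right|^2\frac{1}{-i(\tilde E + \hbar\omega) + \hbar\varepsilon}\Bigg].
\end{aligned} \tag{7.121}$$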
38 An alternative popular measure of the spectral density of a process F(t) is $S_F(\nu) \equiv \langle F_f^2\rangle/d\nu = 4\pi S_F(\omega)$, where
ν = ω/2π is the “cyclic” frequency (measured in Hz).
Due to the smallness of the parameter ħε (which should be much smaller than all genuine energies of
the problem, including $k_BT$, ħω, $E_n$, and $E_{n'}$), each of the internal integrals in Eq. (121) is dominated by
an infinitesimal vicinity of one point, $\tilde E_\pm = \pm\hbar\omega$. In these vicinities, the state densities, the matrix
elements, and the Gibbs probabilities do not change considerably, and may be taken out of the integral,
which may be then worked out explicitly:39
$$\begin{aligned}
S_F(\omega) &= \frac{\hbar}{2\pi}\lim_{\varepsilon\to 0}\int d\bar E\,\rho_+\rho_-\left[W_+\left|F_+\right|^2\int_{-\infty}^{+\infty}\frac{d\tilde E}{\hbar\varepsilon + i(\tilde E - \hbar\omega)} + W_-\left|F_-\right|^2\int_{-\infty}^{+\infty}\frac{d\tilde E}{\hbar\varepsilon - i(\tilde E + \hbar\omega)}\right] \\
&= \frac{\hbar}{2\pi}\lim_{\varepsilon\to 0}\int d\bar E\,\rho_+\rho_-\left[W_+\left|F_+\right|^2\int_{-\infty}^{+\infty}\frac{\hbar\varepsilon - i(\tilde E - \hbar\omega)}{(\tilde E - \hbar\omega)^2 + (\hbar\varepsilon)^2}\,d\tilde E + W_-\left|F_-\right|^2\int_{-\infty}^{+\infty}\frac{\hbar\varepsilon + i(\tilde E + \hbar\omega)}{(\tilde E + \hbar\omega)^2 + (\hbar\varepsilon)^2}\,d\tilde E\right] \\
&= \frac{\hbar}{2}\int\rho_+\rho_-\left[W_+\left|F_+\right|^2 + W_-\left|F_-\right|^2\right]d\bar E,
\end{aligned} \tag{7.122}$$
where the indices ± mark the functions' values at the special points $\tilde E_\pm = \pm\hbar\omega$, i.e. $E_n = E_{n'} \pm \hbar\omega$. The
physics of these points becomes simple if we interpret the state n, for which the equilibrium Gibbs
distribution function equals $W_n$, as the initial state of the environment, and n' as its final state. Then the
top-sign point corresponds to $E_{n'} = E_n - \hbar\omega$, i.e. to the result of emission of one energy quantum ħω of
the “observation” frequency ω by the environment to the system s of our interest, while the bottom-sign
point, $E_{n'} = E_n + \hbar\omega$, corresponds to the absorption of such a quantum by the environment. As Eq. (122)
shows, both processes give similar, positive contributions into the force fluctuations.
The situation is different for the Fourier image of the response function $G(\tau)$,40
Generalized susceptibility:
$$ \chi(\omega) \equiv \int_0^{+\infty}G(\tau)\,e^{i\omega\tau}d\tau, \tag{7.123} $$
that is usually called either the generalized susceptibility or the response function – in our case, of the
environment. Its physical meaning is that according to Eq. (107), the complex function χ(ω) = χ'(ω) +
iχ"(ω) relates the Fourier amplitudes of the generalized coordinate and the generalized force: 41
$$ \langle\hat F_\omega\rangle = \chi(\omega)\,\hat x_\omega. \tag{7.124} $$
The physics of its imaginary part χ"(ω) is especially clear. Indeed, if x represents a sinusoidal classical
process, say
$$ x(t) = x_0\cos\omega t \equiv \frac{x_0}{2}e^{-i\omega t} + \frac{x_0}{2}e^{+i\omega t}, \quad \text{i.e. } x_\omega = x_{-\omega} = \frac{x_0}{2}, \tag{7.125} $$
39 Using, e.g., MA Eq. (6.5a). (The imaginary parts of the integrals vanish, because the integration in infinite
limits may be always re-centered to the finite points ±ħω.) A math-enlightened reader may have noticed that the
integrals might be taken without the introduction of small ε, using the Cauchy theorem – see MA Eq. (15.1).
40 The integration in Eq. (123) may be extended to the whole time axis, –∞ < τ < +∞, if we complement the
definition (107) of the function G(τ) for τ > 0 with its definition as G(τ) = 0 for τ < 0, in correspondence with the
causality principle.
41 In order to prove this relation, it is sufficient to plug the expression $\hat x(t) = \hat x_\omega e^{-i\omega t}$, or any sum of such exponents,
into Eq. (107) and then use the definition (123). This (simple) exercise is highly recommended to the reader.
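then, in accordance with the correspondence principle, Eq. (124) should hold for the c-number complex
amplitudes $F_\omega$ and $x_\omega$, enabling us to calculate the time dependence of the force as
$$\begin{aligned}
F(t) &= F_\omega e^{-i\omega t} + F_{-\omega}e^{+i\omega t} = \chi(\omega)x_\omega e^{-i\omega t} + \chi(-\omega)x_{-\omega}e^{+i\omega t} = \frac{x_0}{2}\left[\chi(\omega)e^{-i\omega t} + \chi^*(\omega)e^{+i\omega t}\right] \\
&= \frac{x_0}{2}\left[\left(\chi' + i\chi''\right)e^{-i\omega t} + \left(\chi' - i\chi''\right)e^{+i\omega t}\right] \equiv x_0\left[\chi'(\omega)\cos\omega t + \chi''(\omega)\sin\omega t\right].
\end{aligned} \tag{7.126}$$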
We see that χ"(ω) weighs the force's part (frequently called quadrature) that is π/2-shifted from the
coordinate x, i.e. is in phase with its velocity, and hence characterizes the time-average power flow from
the system into its environment, i.e. the energy dissipation rate:42
$$ \bar P = -\overline{F(t)\dot x(t)} = -\overline{x_0\left[\chi'(\omega)\cos\omega t + \chi''(\omega)\sin\omega t\right]\left(-\omega x_0\sin\omega t\right)} = \frac{x_0^2}{2}\,\omega\chi''(\omega). \tag{7.127} $$
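The time averaging in Eq. (127) is easy to reproduce numerically. The following sketch (with a hypothetical function name and arbitrary, assumed values of $x_0$, ω, χ' and χ") averages the instant power $-F(t)\dot x(t)$ over one period of the oscillations (125), with the force given by Eq. (126).

```python
import math

def average_power(x0, omega, chi_re, chi_im, n=10_000):
    """Time average of P = -F(t) * dx/dt over one period, for
    x(t) = x0 cos(wt) and F(t) = x0 [chi' cos(wt) + chi'' sin(wt)]."""
    period = 2.0 * math.pi / omega
    dt = period / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt  # midpoint rule
        force = x0 * (chi_re * math.cos(omega * t) + chi_im * math.sin(omega * t))
        velocity = -omega * x0 * math.sin(omega * t)
        total += -force * velocity * dt
    return total / period
```

For instance, `average_power(0.3, 2.0, 1.7, 0.4)` should agree with $x_0^2\omega\chi''/2 = 0.036$, while the in-phase part χ' of the susceptibility gives no contribution to the average power, whatever its value.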
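Let us calculate this function from Eqs. (108) and (123), just as we have done for the spectral
density of fluctuations:
$$\begin{aligned}
\chi''(\omega) &= \mathrm{Im}\int_0^{+\infty}G(\tau)\,e^{i\omega\tau}d\tau = \frac{2}{\hbar}\sum_{n,n'}W_n\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\mathrm{Im}\int_0^{+\infty}\frac{1}{2i}\left[\exp\left\{-i\frac{\tilde E\tau}{\hbar}\right\} - \mathrm{c.c.}\right]e^{i\omega\tau}e^{-\varepsilon\tau}d\tau \\
&= \sum_{n,n'}W_n\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\mathrm{Im}\left[\frac{1}{-\tilde E + \hbar\omega + i\hbar\varepsilon} + \frac{1}{-\tilde E - \hbar\omega - i\hbar\varepsilon}\right] \\
&= \sum_{n,n'}W_n\left|F_{nn'}\right|^2\lim_{\varepsilon\to 0}\left[\frac{\hbar\varepsilon}{(\tilde E + \hbar\omega)^2 + (\hbar\varepsilon)^2} - \frac{\hbar\varepsilon}{(\tilde E - \hbar\omega)^2 + (\hbar\varepsilon)^2}\right].
\end{aligned} \tag{7.128}$$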
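Making the transition (118) from the double sum to the double integral, and then the integration variable
replacement (120), we get
$$\begin{aligned}
\chi''(\omega) = \lim_{\varepsilon\to 0}\int d\bar E\,\Bigg[&\int_{-\infty}^{+\infty}W\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E - \frac{\tilde E}{2}\right)\left|F_{nn'}\right|^2\frac{\hbar\varepsilon}{(\tilde E + \hbar\omega)^2 + (\hbar\varepsilon)^2}\,d\tilde E \\
- &\int_{-\infty}^{+\infty}W\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E + \frac{\tilde E}{2}\right)\rho\!\left(\bar E - \frac{\tilde E}{2}\right)\left|F_{nn'}\right|^2\frac{\hbar\varepsilon}{(\tilde E - \hbar\omega)^2 + (\hbar\varepsilon)^2}\,d\tilde E\Bigg].
\end{aligned} \tag{7.129}$$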
Now using the same argument about the smallness of the parameter ħε as above, we may take the state
densities, the matrix elements of force, and the Gibbs probabilities out of the integrals, and work out the
remaining integrals, getting a result very similar to Eq. (122):
$$ \chi''(\omega) = \pi\int\rho_+\rho_-\left[W_-\left|F_-\right|^2 - W_+\left|F_+\right|^2\right]d\bar E. \tag{7.130} $$
42 The sign minus in Eq. (127) is due to the fact that according to Eq. (90), F is the force exerted on our system (s)
by the environment, so that the force exerted by our system on the environment is –F. With this sign clarification,
the expression $P = -F\dot x \equiv -Fv$ for the instant power flow is evident if x is the usual Cartesian coordinate of a
1D particle. However, according to analytical mechanics (see, e.g., CM Chapters 2 and 10), it is also valid for any
{generalized coordinate, generalized force} pair which forms the interaction Hamiltonian (90).
In order to relate these two results, it is sufficient to notice that according to Eq. (24), the Gibbs
probabilities $W_\pm$ are related by a coefficient depending on only the temperature T and observation
frequency ω:
$$ W_\pm \equiv W\!\left(\bar E + \frac{\tilde E_\pm}{2}\right) = W\!\left(\bar E \pm \frac{\hbar\omega}{2}\right) = \frac{1}{Z}\exp\left\{-\frac{\bar E \pm \hbar\omega/2}{k_BT}\right\} = W(\bar E)\exp\left\{\mp\frac{\hbar\omega}{2k_BT}\right\}, \tag{7.131} $$
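so that both the spectral density (122) and the dissipative part (130) of the generalized susceptibility
may be expressed via the same integral over the environment energies (here the equality $|F_+|^2 = |F_-|^2$,
which follows from the Hermiticity of the force operator, has been used):
$$ S_F(\omega) = \frac{\hbar}{2}\cosh\!\left(\frac{\hbar\omega}{2k_BT}\right)\int\rho_+\rho_-W(\bar E)\left[\left|F_+\right|^2 + \left|F_-\right|^2\right]d\bar E, \tag{7.132} $$
$$ \chi''(\omega) = \pi\sinh\!\left(\frac{\hbar\omega}{2k_BT}\right)\int\rho_+\rho_-W(\bar E)\left[\left|F_+\right|^2 + \left|F_-\right|^2\right]d\bar E, \tag{7.133} $$
and hence are universally related as
Fluctuation-dissipation theorem:
$$ S_F(\omega) = \frac{\hbar}{2\pi}\,\chi''(\omega)\coth\frac{\hbar\omega}{2k_BT}. \tag{7.134} $$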
This is, finally, the much-celebrated Callen-Welton fluctuation-dissipation theorem (FDT). It
reveals a fundamental, intimate relationship between these two effects of the environment (“no
dissipation without fluctuation”) – hence the name. A curious feature of the FDT is that Eq. (134)
includes the same function of temperature as the average energy (26) of a quantum oscillator of
frequency ω, though, as the reader could witness, the notion of the oscillator was by no means used in its
derivation. As we will see in the next section, this fact leads to rather interesting consequences and even
conceptual opportunities.
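In the classical limit, $\hbar\omega \ll k_{\rm B}T$, the FDT is reduced to
$$S_F(\omega) = \frac{\hbar}{2\pi}\,\chi''(\omega)\,\frac{2k_{\rm B}T}{\hbar\omega} \equiv \frac{k_{\rm B}T}{\pi}\,\frac{{\rm Im}\,\chi(\omega)}{\omega}. \tag{7.135}$$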
In most systems of interest, the last fraction is close to a finite (positive) constant within a substantial
range of relatively low frequencies. Indeed, expanding the right-hand side of Eq. (123) into the Taylor
series in small $\omega$, we get
$$\chi(\omega) = \chi(0) + i\omega\eta + \ldots, \quad\text{with } \chi(0) \equiv \int_0^\infty G(\tau)\,d\tau, \text{ and } \eta \equiv \int_0^\infty G(\tau)\,\tau\,d\tau. \tag{7.136}$$
Since the temporal Green’s function G is real by definition, the Taylor expansion of $\chi''(\omega) \equiv {\rm Im}\,\chi(\omega)$ at
$\omega = 0$ starts with the linear term $\eta\omega$, where $\eta$ is a certain real coefficient, and unless $\eta = 0$, is dominated
by this term at small $\omega$. The physical sense of the constant $\eta$ becomes clear if we consider an
environment that provides a force described by a simple, well-known kinematic friction law
$$\hat F = -\eta\dot{\hat x}, \quad\text{with } \eta \geq 0, \tag{7.137}$$
where $\eta$ is usually called the drag coefficient. For the Fourier images of coordinate and force, this gives
the relation $F_\omega = i\omega\eta x_\omega$, so that according to Eq. (124),
$$\text{Ohmic dissipation:}\quad \chi(\omega) = i\omega\eta, \quad\text{i.e. } \frac{\chi''(\omega)}{\omega} \equiv \frac{{\rm Im}\,\chi(\omega)}{\omega} = \eta \geq 0. \tag{7.138}$$
With this approximation, and in the classical limit, the FDT (134) is reduced to the well-known Nyquist
formula:43
$$\text{Nyquist formula:}\quad S_F(\omega) = \frac{k_{\rm B}T}{\pi}\,\eta, \quad\text{i.e. } \langle F^2\rangle_{d\nu} = 4k_{\rm B}T\eta\,d\nu. \tag{7.139}$$
According to Eq. (112), if such a constant spectral density44 persisted at all frequencies, it would
correspond to a delta-correlated process F(t), with
$$K_F(\tau) = 2\pi S_F(0)\,\delta(\tau) = 2k_{\rm B}T\eta\,\delta(\tau) \tag{7.140}$$
- cf. Eqs. (82) and (83). Since in the classical limit the right-hand side of Eq. (109) is negligible, and the
correlation function may be considered an even function of time, the symmetrized function under the
integral in Eq. (113) may be rewritten just as $\langle F(\tau)F(0)\rangle$. In the limit of relatively low observation
frequencies (in the sense that $\omega$ is much smaller than not only the quantum frontier $k_{\rm B}T/\hbar$ but also the
frequency scale of the function $\chi''(\omega)/\omega$), Eq. (138) may be used to recast Eq. (135) in the form45
$$\eta \equiv \lim_{\omega\to 0}\frac{\chi''(\omega)}{\omega} = \frac{1}{k_{\rm B}T}\int_0^\infty \langle F(\tau)F(0)\rangle\,d\tau. \tag{7.141}$$
To conclude this section, let me return for a minute to the questions formulated in our earlier
discussion of dephasing in the two-level model. In that problem, the dephasing time scale is $T_2 = 1/2D_\varphi$.
Hence the classical approach to the dephasing, used in Sec. 3, is adequate if $\hbar D_\varphi \ll k_{\rm B}T$. Next, we may
 
identify the operators fˆ and ̂ z participating in Eq. (70) with, respectively,  F̂ and x̂ participating
in the general Eq. (90). Then the comparison of Eqs. (82), (89), and (140) yields
$$\text{Classical dephasing:}\quad \frac{1}{T_2} \equiv 2D_\varphi = \frac{4k_{\rm B}T}{\hbar^2}\,\eta, \tag{7.142}$$
43 Actually, the 1928 work by H. Nyquist was about the electronic noise in resistors, just discovered
experimentally by his Bell Labs colleague John Bertrand Johnson. For an Ohmic resistor, as the dissipative
“environment” of the electric circuit it is connected with, Eq. (137) is just Ohm’s law, and may be recast as
either V = –R(dQ/dt) = RI, or I = –G(dΦ/dt) = GV. Thus for the voltage V across an open circuit, η
corresponds to its resistance R, while for current I in a short circuit, to its conductance G = 1/R. In this case, the
fluctuations described by Eq. (139) are referred to as the Johnson-Nyquist noise. (Because of this important
application, any model leading to Eq. (138) is commonly referred to as the Ohmic dissipation, even if the physical
nature of the variables x and F is quite different from voltage and current.)
44 A random process whose spectral density may be reasonably approximated by a constant is frequently called
the white noise, because it is a random mixture of all possible sinusoidal components with equal weights,
resembling the spectral composition of the natural white light.
45 Note that in some fields (especially in physical kinetics and chemical physics), this particular limit of the
Nyquist formula is called the Green-Kubo (or just “Kubo”) formula. However, in view of the FDT
development history (described above), it is much more reasonable to associate these names with Eq. (109) – as it
is done in most fields of physics.
so that, for the model described by Eq. (137) with a temperature-independent drag coefficient $\eta$, the rate
of dephasing by a classical environment is proportional to its temperature.
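The two limits of the FDT discussed above can be checked numerically. The following is only a minimal sketch, assuming the Ohmic model (138), so that χ″(ω) = ηω; the drag coefficient, the temperature, and the two test frequencies are arbitrary illustrative values, and the function names are hypothetical.

```python
import numpy as np

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def s_f_fdt(omega, eta, temperature):
    """S_F from Eq. (7.134), assuming the Ohmic chi''(omega) = eta * omega."""
    x = HBAR * omega / (2.0 * K_B * temperature)
    return HBAR * eta * omega / (2.0 * np.pi * np.tanh(x))


def s_f_nyquist(eta, temperature):
    """Classical limit, Eq. (7.139): S_F = k_B * T * eta / pi."""
    return K_B * temperature * eta / np.pi


def s_f_zero_point(omega, eta):
    """Low-temperature limit of Eq. (7.134): S_F = hbar * eta * omega / (2 pi)."""
    return HBAR * eta * omega / (2.0 * np.pi)


eta = 1.0e-3          # hypothetical drag coefficient, kg/s
temperature = 300.0   # K
omega_low = 2.0 * np.pi * 1.0e6    # hbar*omega << k_B*T
omega_high = 2.0 * np.pi * 1.0e15  # hbar*omega >> k_B*T

ratio_low = s_f_fdt(omega_low, eta, temperature) / s_f_nyquist(eta, temperature)
ratio_high = s_f_fdt(omega_high, eta, temperature) / s_f_zero_point(omega_high, eta)
print(f"FDT / Nyquist at low frequency:     {ratio_low:.6f}")
print(f"FDT / zero-point at high frequency: {ratio_high:.6f}")
```

Both ratios should come out very close to 1: at low frequencies the FDT reproduces the Nyquist formula, while at high frequencies only the temperature-independent (quantum) part of the noise survives.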
7.5. The Heisenberg-Langevin approach

The fluctuation-dissipation theorem offers a very simple and efficient, though limited approach
to the analysis of the system of interest (s in Fig. 1). It is to write its Heisenberg equations (4.199) of
motion of the relevant operators, which would now include the environmental force operator, and
explore these equations using the Fourier transform and the Wiener-Khinchin theorem (112)-(113). This
approach to classical equations of motion is commonly associated with the name of Langevin,46 so that
its extension to dynamics of Heisenberg-picture operators is frequently referred to as the Heisenberg-
Langevin (or “quantum Langevin”, or “Langevin-Lax”47) approach to open system analysis.
Perhaps the best way to describe this method is to demonstrate how it works for the very
important case of a 1D harmonic oscillator, so that the generalized coordinate x of Sec. 4 is just the
oscillator’s coordinate. For the sake of simplicity, let us assume that the environment provides the
simple Ohmic dissipation described by Eq. (137) – which is a very good approximation in many cases.
As we already know from Chapter 5, the Heisenberg equations of motion for operators of coordinate and
momentum of the oscillator, in the presence of an external force F(t), are
$$\dot{\hat x} = \frac{\hat p}{m}, \qquad \dot{\hat p} = -m\omega_0^2\hat x + \hat F, \tag{7.143}$$
so that using Eqs. (92) and (137), we get
$$\dot{\hat x} = \frac{\hat p}{m}, \qquad \dot{\hat p} = -m\omega_0^2\hat x - \eta\dot{\hat x} + \hat{\tilde F}(t). \tag{7.144}$$
Combining Eqs. (144), we may write their system as a single differential equation
$$m\ddot{\hat x} + \eta\dot{\hat x} + m\omega_0^2\hat x = \hat{\tilde F}(t), \tag{7.145}$$
which is similar to the well-known classical equation of motion of a damped oscillator under the effect
of an external force. In view of Eqs. (5.29) and (5.35), whose corollary is the Ehrenfest theorem (5.36),
this may not look surprising, but please note again that the approach discussed in the previous section
justifies such a quantitative description of the drag force in quantum mechanics – necessarily in parallel
with the accompanying fluctuation force.
For the Fourier images of the operators, defined similarly to Eq. (115), Eq. (145) gives the
following relation,
46 A 1908 work by Paul Langevin was the first systematic development of Einstein’s ideas (1905) on the
Brownian motion, using the random force language, as an alternative to Smoluchowski’s approach using the
probability density language – see Sec. 6 below.
47 Indeed, perhaps the largest credit for the extension of the Langevin approach to quantum systems belongs to
Melvin J. Lax, whose work in the early 1960s was motivated mostly by quantum electronics applications – see,
e.g., his monograph M. Lax, Fluctuation and Coherent Phenomena in Classical and Quantum Physics, Gordon
and Breach, 1968, and references therein.
$$\hat x_\omega = \frac{\hat F_\omega}{m(\omega_0^2 - \omega^2) - i\eta\omega}, \tag{7.146}$$
which should be also well known to the reader from the classical theory of forced oscillations.48
However, since these Fourier components are still Heisenberg-picture operators, and their “values” for
different $\omega$ generally do not commute, we have to tread carefully. The best way to proceed is to write a
copy of Eq. (146) for frequency (–$\omega'$), and then combine these equations to form a symmetrical
combination similar to that used in Eq. (114). The result is
$$\frac{1}{2}\left\langle \hat x_\omega\hat x_{-\omega'} + \hat x_{-\omega'}\hat x_\omega\right\rangle = \frac{1}{\left|m(\omega_0^2-\omega^2) - i\eta\omega\right|^2}\,\frac{1}{2}\left\langle \hat F_\omega\hat F_{-\omega'} + \hat F_{-\omega'}\hat F_\omega\right\rangle. \tag{7.147}$$
Since the spectral density definition similar to Eq. (114) is valid for any observable, in particular for x,
Eq. (147) allows us to relate the symmetrized spectral densities of coordinate and force:
$$S_x(\omega) = \frac{S_F(\omega)}{\left|m(\omega_0^2-\omega^2) - i\eta\omega\right|^2} \equiv \frac{S_F(\omega)}{m^2(\omega_0^2-\omega^2)^2 + \eta^2\omega^2}. \tag{7.148}$$
Now using an analog of Eq. (116) for x, we can calculate the coordinate’s variance:
$$\langle x^2\rangle = K_x(0) = \int_{-\infty}^{+\infty} S_x(\omega)\,d\omega = 2\int_0^\infty \frac{S_F(\omega)\,d\omega}{m^2(\omega_0^2-\omega^2)^2 + \eta^2\omega^2}, \tag{7.149}$$
where now, in contrast to the notation used in Sec. 4, the sign $\langle\ldots\rangle$ means averaging over the usual
statistical ensemble of many systems of interest – in our current case, of many harmonic oscillators.
If the coupling to the environment is so weak that the drag coefficient $\eta$ is small (in the sense
that the oscillator’s dimensionless Q-factor is large, $Q \equiv m\omega_0/\eta \gg 1$), this integral is dominated by the
resonance peak in a narrow vicinity, $|\omega - \omega_0| \equiv |\xi| \ll \omega_0$, of its resonance frequency, and we can take
the relatively smooth function $S_F(\omega)$ out of the integral, thus reducing it to a table form:49
$$\begin{aligned}
\langle x^2\rangle &\approx 2S_F(\omega_0)\int_0^\infty \frac{d\omega}{m^2(\omega_0^2-\omega^2)^2 + \eta^2\omega^2} \approx 2S_F(\omega_0)\int_{-\infty}^{+\infty} \frac{d\xi}{(2m\omega_0\xi)^2 + (\eta\omega_0)^2} \\
&= 2S_F(\omega_0)\,\frac{1}{(\eta\omega_0)^2}\int_{-\infty}^{+\infty}\frac{d\xi}{(2m\xi/\eta)^2 + 1} = 2S_F(\omega_0)\,\frac{1}{(\eta\omega_0)^2}\,\frac{\pi\eta}{2m} = \frac{\pi S_F(\omega_0)}{\eta m\omega_0^2}.
\end{aligned} \tag{7.150}$$
With the account of the FDT (134) and of Eq. (138), this gives50
$$\langle x^2\rangle = \frac{\pi}{\eta m\omega_0^2}\,\frac{\hbar\eta\omega_0}{2\pi}\coth\frac{\hbar\omega_0}{2k_{\rm B}T} \equiv \frac{\hbar}{2m\omega_0}\coth\frac{\hbar\omega_0}{2k_{\rm B}T}. \tag{7.151}$$
48 If necessary, see CM Sec. 5.1.

49 See, e.g., MA Eq. (6.5a).
50 Note that this calculation remains correct even if the dissipation’s dispersion law deviates from the Ohmic
model (138), provided that the drag coefficient $\eta$ is replaced with its effective value ${\rm Im}\,\chi(\omega_0)/\omega_0$, because the
effects of the environment are only felt, by the oscillator, at its oscillation frequency.
But this is exactly Eq. (48), which was derived in Sec. 2 from the Gibbs distribution, without any
explicit account of the environment – though keeping it in mind by using the notion of the thermally-
equilibrium ensemble.51
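The agreement between the Heisenberg-Langevin result and the Gibbs-distribution result can also be checked numerically, without the high-Q approximation. The sketch below integrates Eq. (149) directly, with the FDT (134) and the Ohmic model (138) plugged in, in the assumed units ħ = m = ω₀ = k_B = 1; the grid size, the frequency cutoff, and the parameter values η = 0.01 and T = 0.5 are arbitrary choices, and the function names are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Plain trapezoidal rule (written out to avoid NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def x2_langevin(eta, temperature, omega_max=50.0, n_points=2_000_001):
    """<x^2> from Eq. (7.149), with S_F taken from the FDT for Ohmic damping.

    Units: hbar = m = omega_0 = k_B = 1 (an assumption of this sketch).
    """
    omega = np.linspace(1e-6, omega_max, n_points)
    s_f = eta * omega / (2.0 * np.pi * np.tanh(omega / (2.0 * temperature)))
    s_x = s_f / ((1.0 - omega**2) ** 2 + (eta * omega) ** 2)
    return 2.0 * trapezoid(s_x, omega)


def x2_gibbs(temperature):
    """<x^2> from Eq. (7.151), in the same units."""
    return 0.5 / np.tanh(0.5 / temperature)


eta = 0.01         # Q = m * omega_0 / eta = 100
temperature = 0.5  # k_B * T in units of hbar * omega_0
ratio = x2_langevin(eta, temperature) / x2_gibbs(temperature)
print(f"numerical integral / Eq. (7.151) = {ratio:.4f}")
```

For such a high-Q oscillator the ratio should be close to 1, with a small residual deviation of the order of 1/Q that reflects the finite width of the resonance peak.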
Notice that in the final form of Eq. (151) the coefficient $\eta$, which characterizes the oscillator-to-
environment interaction strength, has canceled! Does this mean that in Sec. 4 we toiled in vain? By no
means. First of all, the result (150), augmented by the FDT (134), has an important conceptual value.
For example, let us consider the low-temperature limit $k_{\rm B}T \ll \hbar\omega_0$ where Eq. (151) is reduced to
$$\langle x^2\rangle = \frac{\hbar}{2m\omega_0} \equiv \frac{x_0^2}{2}. \tag{7.152}$$
Let us ask a naïve question: what exactly is the origin of this coordinate’s uncertainty? From the point of
view of the usual quantum mechanics of absolutely closed (Hamiltonian) systems, there is no doubt: this
non-vanishing variance of the coordinate is the result of the finite spatial extension of the ground-state
wavefunction (2.275), reflecting Heisenberg’s uncertainty relation – which in turn results from the fact
that the operators of coordinate and momentum do not commute. However, from the point of view of the
Heisenberg-Langevin equation (145), the variance (152) is an inalienable part of the oscillator’s
response to the fluctuation force $\tilde F(t)$ exerted by the environment at frequencies $\omega \sim \omega_0$. Though it is
impossible to refute the former, absolutely legitimate point of view, in many applications it is easier to
subscribe to the latter standpoint and treat the coordinate’s uncertainty as the result of the so-called
quantum noise of the environment, which, in equilibrium, obeys the FDT (134). This notion has
received numerous confirmations in experiments that did not include any oscillators with their own
frequencies $\omega_0$ close to the noise measurement frequency $\omega$.52
The second advantage of the Heisenberg-Langevin approach is that it is possible to use Eq.
(148) to calculate the (experimentally measurable!) distribution $S_x(\omega)$, i.e. decompose the fluctuations
into their spectral components. This procedure is not restricted to the limit of small $\eta$ (i.e. of large Q);
for any damping, we may just plug the FDT (134) into Eq. (148). For example, let us have a look at the
so-called quantum diffusion. A free 1D particle, moving in a viscous medium providing it with the
Ohmic damping (137), may be considered as the particular case of a 1D harmonic oscillator (145), but
with $\omega_0 = 0$, so that combining Eqs. (134) and (149), we get
$$\langle x^2\rangle = 2\int_0^\infty \frac{S_F(\omega)\,d\omega}{(m\omega^2)^2 + (\eta\omega)^2} = 2\int_0^\infty \frac{1}{(m\omega^2)^2 + (\eta\omega)^2}\,\frac{\hbar}{2\pi}\,\eta\omega\coth\frac{\hbar\omega}{2k_{\rm B}T}\,d\omega. \tag{7.153}$$
This integral has two divergences. The first one, of the type $\int d\omega/\omega^2$ at the lower limit, is just a
classical effect: according to Eq. (85), the particle’s displacement variance grows with time, so it cannot
have a finite time-independent value that Eq. (153) tries to calculate. However, we still can use that
result to single out the quantum effects on diffusion – say, by comparing it with a similar but purely
classical case. These effects are prominent at high frequencies, especially if the quantum noise
overcomes the thermal noise before the dynamic cut-off, i.e. if
51 By the way, the simplest way to calculate $S_F(\omega)$, i.e. to derive the FDT, is to require that Eqs. (48) and (150)
give the same result for an oscillator with any eigenfrequency $\omega$. This is exactly the approach used by H. Nyquist
(for the classical case) – see also SM Sec. 5.5.
52 See, for example, R. Koch et al., Phys. Rev. B 26, 74 (1982).
$$\frac{k_{\rm B}T}{\hbar} \ll \frac{\eta}{m}. \tag{7.154}$$
In this case, there is a broad range of frequencies where the quantum noise gives a substantial
contribution to the integral:
$$\text{Quantum diffusion:}\quad \langle x^2\rangle_Q \approx 2\,\frac{1}{\eta^2}\,\frac{\hbar}{2\pi}\,\eta\int_{k_{\rm B}T/\hbar}^{\eta/m}\frac{d\omega}{\omega} \equiv \frac{\hbar}{\pi\eta}\int_{k_{\rm B}T/\hbar}^{\eta/m}\frac{d\omega}{\omega} = \frac{\hbar}{\pi\eta}\ln\frac{\hbar\eta}{mk_{\rm B}T} \sim \frac{\hbar}{\eta}. \tag{7.155}$$
Formally, this contribution diverges at either $m \to 0$ or $T \to 0$, but this logarithmic (i.e. extremely weak)
divergence is readily quenched by almost any change of the environment model at very high
frequencies, where the “Ohmic” approximation (136) becomes unrealistic.
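The logarithmic estimate of the quantum contribution can be compared with a direct numerical integration of the integrand of Eq. (153) over the same frequency window. This is only a rough sketch in the assumed units ħ = k_B = 1, with arbitrary illustrative parameters chosen so that the window spans six decades; since the estimate is of logarithmic accuracy, agreement only within a few percent should be expected.

```python
import numpy as np


def quantum_diffusion_integral(eta, mass, temperature, n_points=200_001):
    """Integrand of Eq. (7.153) integrated from k_B*T/hbar to eta/m.

    Units: hbar = k_B = 1 (an assumption of this sketch). The integration
    uses the substitution u = ln(omega), i.e. d(omega) = omega * du.
    """
    u = np.linspace(np.log(temperature), np.log(eta / mass), n_points)
    omega = np.exp(u)
    noise = eta * omega / (np.pi * np.tanh(omega / (2.0 * temperature)))
    response = 1.0 / ((mass * omega**2) ** 2 + (eta * omega) ** 2)
    y = noise * response * omega
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(u)))


def quantum_diffusion_estimate(eta, mass, temperature):
    """Logarithmic estimate, Eq. (7.155), in the same units."""
    return np.log(eta / (mass * temperature)) / (np.pi * eta)


eta, mass, temperature = 1.0, 1.0e-6, 1.0  # eta/m = 1e6 >> k_B*T/hbar = 1
numeric = quantum_diffusion_integral(eta, mass, temperature)
estimate = quantum_diffusion_estimate(eta, mass, temperature)
print(f"numerical / logarithmic estimate = {numeric / estimate:.3f}")
```

The residual difference comes from the two edges of the window, where coth is not yet equal to 1 and where the inertial term (mω²)² starts to compete with the damping term.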
The Heisenberg-Langevin approach is very powerful, because its straightforward generalizations
enable analyses of fluctuations in virtually arbitrary linear systems, i.e. the systems described by linear
differential (or integro-differential) equations of motion, including those with many degrees of freedom,
and distributed systems (continua), and such systems prevail in many fields of physics. However, this
approach also has its limitations. The main one is that if the equations of motion of the Heisenberg
operators are not linear, there is no linear relation, such as Eq. (146), between the Fourier images of the
generalized forces and the generalized coordinates, and as a result, there is no simple relation, such as
Eq. (148), between their spectral densities. In other words, if the Heisenberg equations of motion are
nonlinear, there is no regular simple way to use them to calculate the statistical properties of the
observables.
For example, let us return to the dephasing problem described by Eqs. (68)-(70), and assume that
the deterministic and fluctuating parts of the effective force –f exerted by the environment, are
characterized by relations similar, respectively, to Eqs. (124) and (134). Now writing the Heisenberg
equations of motion for the two remaining spin operators, and using the commutation relations between
them, we get
1
i
 1
i
   2

  
2
 ~ˆ
ˆ x  ˆ x , Hˆ  ˆ x , c z  fˆ ˆ z   ˆ y c z  fˆ   ˆ y  c z  ˆ z  f  ,
  
(7.156)
and a similar equation for $\hat\sigma_y$. Such nonlinear equations cannot be used to calculate the statistical
properties of the Pauli operators in this system exactly – at least analytically.

For some calculations, this problem may be circumvented by linearization: if we are only
interested in small fluctuations of the observables, their nonlinear Heisenberg equations of motion, such
as Eq. (156), may be linearized with respect to small deviations of the operators about their (generally,
time-dependent) deterministic “values”, and then the resulting linear equations for the operator
variations may be solved either as has been demonstrated above, or (if the deterministic “values” evolve
in time) using their Fourier expansions. Sometimes such an approach gives relatively simple and important
results,53 but for many other problems, it is insufficient, leaving a lot of space for alternative
methods.
53For example, the formula used for processing the experimental results by R. Koch et al. (mentioned above), had
been derived in this way. (This derivation will be suggested to the reader as an exercise.)
7.6. Density matrix approach

The main alternative approach to the dynamics of open quantum systems, which is essentially a
generalization of the one discussed in Sec. 2, is to extract the final results of interest from the dynamics
of the density operator of our system s. Let us discuss this approach in detail.54
We already know that the density matrix allows the calculation of the expectation value of any
observable of the system s – see Eq. (5). However, our initial recipe (6) for the density matrix element
calculation, which requires the knowledge of the exact state (2) of the whole Universe, is not too
practicable, while the von Neumann equation (66) for the density matrix evolution is limited to cases in
which probabilities Wj of the system states are fixed – thus excluding such important effects as the
energy relaxation. However, such effects may be analyzed using a different assumption – that the system
of interest interacts only with a local environment that is very close to its thermally-equilibrium state
described, in the stationary-state basis, by a diagonal density matrix with the elements (24).
This calculation is facilitated by the following general observation. Let us number the basis
states of the full local system (the system of our interest plus its local environment) by l, and use Eq. (5)
to write
$$\langle A\rangle = {\rm Tr}\,(\hat A\hat w_l) \equiv \sum_{l,l'} A_{ll'}w_{l'l} = \sum_{l,l'}\langle l|\hat A|l'\rangle\langle l'|\hat w_l|l\rangle, \tag{7.157}$$
where $\hat w_l$ is the density operator of this local system. At a weak interaction between the system s and the
local environment e, their states reside in different Hilbert spaces, so that we can write
$$|l\rangle = |s_j\rangle\otimes|e_k\rangle, \tag{7.158}$$
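and if the observable A depends only on the coordinates of the system s of our interest, we may reduce
Eq. (157) to the form similar to Eq. (5):
$$\begin{aligned}
\langle A\rangle &= \sum_{j,j';\,k,k'} \langle e_k|\otimes\langle s_j|\,\hat A\,|s_{j'}\rangle\otimes|e_{k'}\rangle\;\langle e_{k'}|\otimes\langle s_{j'}|\,\hat w_l\,|s_j\rangle\otimes|e_k\rangle \\
&= \sum_{j,j'} A_{jj'}\,\langle s_{j'}|\Big(\sum_k \langle e_k|\hat w_l|e_k\rangle\Big)|s_j\rangle = {\rm Tr}_j\,(\hat A\hat w),
\end{aligned} \tag{7.159}$$
where
$$\hat w \equiv \sum_k \langle e_k|\hat w_l|e_k\rangle = {\rm Tr}_k\,\hat w_l, \tag{7.160}$$
showing how exactly the density operator $\hat w$ of the system s may be calculated from $\hat w_l$.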
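The reduction of the full density operator to that of the system s is just a partial trace over the environment's states, and is easy to illustrate numerically. The following is a minimal sketch, not a part of the original derivation: the dimensions (a 2-state system and a 3-state environment), the random matrices, and the function name `partial_trace_env` are all arbitrary assumptions, and the basis is ordered as |s_j⟩⊗|e_k⟩, in agreement with `np.kron`.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_s, dim_e = 2, 3  # hypothetical dimensions of system s and environment e
dim_l = dim_s * dim_e

# A random density matrix of the composite system: Hermitian, positive, unit trace.
m = rng.normal(size=(dim_l, dim_l)) + 1j * rng.normal(size=(dim_l, dim_l))
w_l = m @ m.conj().T
w_l /= np.trace(w_l).real


def partial_trace_env(w_full, dim_sys, dim_env):
    """Reduced density matrix w = Tr_k(w_l), cf. Eq. (7.160)."""
    w4 = w_full.reshape(dim_sys, dim_env, dim_sys, dim_env)
    return np.einsum("ikjk->ij", w4)  # sum over the coinciding environment indices


w = partial_trace_env(w_l, dim_s, dim_e)

# An observable acting only on the system s: A (x) identity in the full space.
a = rng.normal(size=(dim_s, dim_s))
a_sys = a + a.T
a_full = np.kron(a_sys, np.eye(dim_e))

mean_full = np.trace(a_full @ w_l).real  # Eq. (7.157)
mean_reduced = np.trace(a_sys @ w).real  # Eq. (7.159)
print(mean_full, mean_reduced)
```

The two printed expectation values should coincide up to rounding errors, and the reduced matrix `w` keeps the unit trace of `w_l`.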
Now comes the key physical assumption of this approach: since we may select the local
environment e to be much larger than the system s of our interest, we may consider the composite
system l as a Hamiltonian one, with time-independent probabilities of its stationary states, so that for the
description of the evolution in time of its full density operator $\hat w_l$ (again, in contrast to that, $\hat w$, of the
system of our interest) we may use the von Neumann equation (66). Partitioning its right-hand side in
accordance with Eq. (68), we get:
$$i\hbar\dot{\hat w}_l = [\hat H_s, \hat w_l] + [\hat H_e, \hat w_l] + [\hat H_{\rm int}, \hat w_l]. \tag{7.161}$$
54As in Sec. 4, the reader not interested in the derivation of the basic equation (181) of the density matrix
evolution may immediately jump to the discussion of this equation and its applications.
The next step is to use the perturbation theory to solve this equation in the lowest order in $\hat H_{\rm int}$, which
would yield, for the evolution of w, a non-vanishing contribution due to the interaction. For that, Eq.
(161) is not very convenient, because its right-hand side contains two other terms, of a much larger scale
than the interaction Hamiltonian. To mitigate this technical difficulty, the interaction picture that was
discussed at the end of Sec. 4.6, is very natural. (It is not necessary though, and I will use this picture
mostly as an exercise of its application – unfortunately, the only example I can afford in this course.)
As a reminder, in that picture (whose entities will be marked with index “I”, with the unmarked
operators assumed to be in the Schrödinger picture), both the operators and the state vectors (and hence
the density operator) depend on time. However, the time evolution of the operator of any observable A is
described by an equation similar to Eq. (67), but with the unperturbed part of the Hamiltonian only – see
Eq. (4.214). In model (68), this means
$$i\hbar\dot{\hat A}_I = [\hat A_I, \hat H_0], \tag{7.162}$$
where the unperturbed Hamiltonian consists of two parts defined in different Hilbert spaces:
$$\hat H_0 \equiv \hat H_s + \hat H_e. \tag{7.163}$$
On the other hand, the state vector’s dynamics is governed by the interaction evolution operator $\hat u_I$ that
obeys Eqs. (4.215). Since this equation, using the interaction-picture Hamiltonian (4.216),
$$\hat H_I \equiv \hat u_0^\dagger\hat H_{\rm int}\hat u_0, \tag{7.164}$$
is absolutely similar to the ordinary Schrödinger equation using the full Hamiltonian, we may repeat all
arguments given at the beginning of Sec. 3 to prove that the dynamics of the density operator in the
interaction picture of a Hamiltonian system is governed by the following analog of the von Neumann
equation (66):
$$i\hbar\dot{\hat w}_I = [\hat H_I, \hat w_I], \tag{7.165}$$
where the index l is dropped for the notation simplicity. Since this equation is similar in structure (with
the opposite sign) to the Heisenberg equation (67), we may use the solution Eq. (4.190) of the latter
equation to write its analog:
$$\hat w_I(t) = \hat u_I(t,0)\,\hat w_l(0)\,\hat u_I^\dagger(t,0). \tag{7.166}$$
It is also straightforward to verify that in this picture, the expectation value of any observable A may be
found from an expression similar to the basic Eq. (5):
$$\langle A\rangle = {\rm Tr}\,(\hat A_I\hat w_I), \tag{7.167}$$
showing again that the interaction and Schrödinger pictures give the same final results.
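The equivalence of the two pictures can be illustrated on a small numerical model. The sketch below is only an assumption-laden toy example: a two-level system coupled to a three-level "environment" by a factorable interaction of the type –xF, with arbitrarily chosen matrices, ħ = 1, and time-independent Hamiltonians (so that the interaction-picture evolution operator is simply the product of the two exponentials shown in the code).

```python
import numpy as np


def evolution_operator(h, t, hbar=1.0):
    """exp(-i*h*t/hbar) for a Hermitian matrix h, via its eigendecomposition."""
    energies, vecs = np.linalg.eigh(h)
    return (vecs * np.exp(-1j * energies * t / hbar)) @ vecs.conj().T


sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Hypothetical model: all numbers below are arbitrary illustrative choices.
h_s = 0.7 * sigma_z                             # system Hamiltonian
x_op = sigma_x                                  # system "coordinate"
h_e = np.diag([0.0, 1.0, 2.5]).astype(complex)  # environment Hamiltonian
f_op = np.array([[0.0, 0.3, 0.1],
                 [0.3, 0.0, 0.2],
                 [0.1, 0.2, 0.0]], dtype=complex)  # environment "force"

id_s, id_e = np.eye(2), np.eye(3)
h_0 = np.kron(h_s, id_e) + np.kron(id_s, h_e)   # unperturbed Hamiltonian
h_full = h_0 - np.kron(x_op, f_op)              # plus the interaction -x*F

# Factorable initial density operator of the composite system.
w_s0 = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
w_e = np.diag([0.6, 0.3, 0.1]).astype(complex)
w_0 = np.kron(w_s0, w_e)

t = 3.0
u = evolution_operator(h_full, t)
u_0 = evolution_operator(h_0, t)
u_i = u_0.conj().T @ u  # interaction-picture evolution operator

a_op = np.kron(x_op, id_e)
w_schr = u @ w_0 @ u.conj().T        # Schroedinger picture: the state evolves
a_int = u_0.conj().T @ a_op @ u_0    # interaction picture: operator evolves with H_0
w_int = u_i @ w_0 @ u_i.conj().T     # ... and the density operator with u_I

mean_schr = np.trace(a_op @ w_schr).real
mean_int = np.trace(a_int @ w_int).real
print(mean_schr, mean_int)
```

The two expectation values should agree to rounding accuracy, as they must because the two descriptions differ only by a unitary transformation under the trace.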
In the most frequent case of factorable interaction (90),55 Eq. (162) is simplified for both
operators participating in that product – for each one in its own way. In particular, for $\hat A = \hat x$, it yields
$$i\hbar\dot{\hat x}_I = [\hat x_I, \hat H_0] \equiv [\hat x_I, \hat H_s] + [\hat x_I, \hat H_e]. \tag{7.168}$$
Since the coordinate operator is defined in the Hilbert space of our system s, it commutes with the
Hamiltonian of the environment, so that we finally get
$$i\hbar\dot{\hat x}_I = [\hat x_I, \hat H_s]. \tag{7.169}$$
On the other hand, if $\hat A = \hat F$, this operator is defined in the Hilbert space of the environment, and
commutes with the Hamiltonian of the unperturbed system s. As a result, we get
$$i\hbar\dot{\hat F}_I = [\hat F_I, \hat H_e]. \tag{7.170}$$
55 A similar analysis of a more general case, when the interaction with the environment has to be represented as a
sum of products of the type (90), may be found, for example, in the monograph by K. Blum, Density Matrix
Theory and Applications, 3rd ed., Springer, 2012.
This means that with our time-independent unperturbed Hamiltonians, $\hat H_s$ and $\hat H_e$, the time
evolution of the interaction-picture operators is rather simple. In particular, the analogy between Eq.
(170) and Eq. (93) allows us to immediately write the following analog of Eq. (94):
$$\hat F_I(t) = \exp\left\{+\frac{i}{\hbar}\hat H_e t\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t\right\}, \tag{7.171}$$
so that in the stationary-state basis n of the environment,
$$\left(\hat F_I\right)_{nn'}(t) = \exp\left\{+\frac{i}{\hbar}E_n t\right\}F_{nn'}(0)\exp\left\{-\frac{i}{\hbar}E_{n'}t\right\} \equiv F_{nn'}(0)\exp\left\{i\,\frac{E_n - E_{n'}}{\hbar}\,t\right\}, \tag{7.172}$$
and similarly (but in the basis of the stationary states of system s) for operator $\hat x$. As a result, the right-
hand side of Eq. (164) may be also factored:
$$\begin{aligned}
\hat H_I(t) &\equiv \hat u_0^\dagger(t,0)\,\hat H_{\rm int}\,\hat u_0(t,0) = \exp\left\{\frac{i}{\hbar}\big(\hat H_s + \hat H_e\big)t\right\}\big(-\hat x\hat F\big)\exp\left\{-\frac{i}{\hbar}\big(\hat H_s + \hat H_e\big)t\right\} \\
&= -\left[\exp\left\{\frac{i}{\hbar}\hat H_s t\right\}\hat x\exp\left\{-\frac{i}{\hbar}\hat H_s t\right\}\right]\left[\exp\left\{\frac{i}{\hbar}\hat H_e t\right\}\hat F(0)\exp\left\{-\frac{i}{\hbar}\hat H_e t\right\}\right] \equiv -\hat x_I(t)\hat F_I(t).
\end{aligned} \tag{7.173}$$
So, the transfer to the interaction picture has taken some time, but now it enables a smooth ride.56
Indeed, just as in Sec. 4, we may rewrite Eq. (165) in the integral form:
$$\hat w_I(t) = \frac{1}{i\hbar}\int_{-\infty}^{t}\left[\hat H_I(t'), \hat w_I(t')\right]dt'; \tag{7.174}$$
plugging this result into the right-hand side of Eq. (165), we get
$$\dot{\hat w}_I(t) = -\frac{1}{\hbar^2}\int_{-\infty}^{t}\left[\hat H_I(t), \left[\hat H_I(t'), \hat w_I(t')\right]\right]dt' = -\frac{1}{\hbar^2}\int_{-\infty}^{t}\left[\hat x(t)\hat F(t), \left[\hat x(t')\hat F(t'), \hat w_I(t')\right]\right]dt', \tag{7.175}$$
where, for the notation’s brevity, from this point on I will strip the operators $\hat x$ and $\hat F$ of their index
“I”. (I hope their time dependence indicates the interaction picture clearly enough.)
56 If we used either the Schrödinger or the Heisenberg picture instead, the forthcoming Eq. (175) would pick up a
rather annoying multitude of fast-oscillating exponents, of different time arguments, on its right-hand side.
So far, this equation is exact (and cannot be solved analytically), but this is a good time to notice
that even if we approximate the density operator on its right-hand side by its unperturbed, factorable
“value” (corresponding to no interaction between the system s and its thermally-equilibrium
environment e),57
$$\hat w_I(t') \to \hat w(t')\otimes\hat w_e, \quad\text{with } \langle e_n|\hat w_e|e_{n'}\rangle = W_n\delta_{nn'}, \tag{7.176}$$
where $e_n$ are the stationary states of the environment and $W_n$ are the Gibbs probabilities (24), Eq. (175)
still describes nontrivial time evolution of the density operator. This is exactly the first non-vanishing
approximation (in the weak interaction) we have been looking for. Now using Eq. (160), we find the
equation of evolution of the density operator of the system of our interest:
$$\dot{\hat w}(t) = -\frac{1}{\hbar^2}\int_{-\infty}^{t}{\rm Tr}_n\left[\hat x(t)\hat F(t), \left[\hat x(t')\hat F(t'), \hat w(t')\hat w_e\right]\right]dt', \tag{7.177}$$
where the trace is over the stationary states of the environment. To spell out the right-hand side of Eq.
(177), note again that the coordinate and force operators commute with each other (but not with
themselves at different time moments!) and hence may be swapped at will, so that we may write
$$\begin{aligned}
{\rm Tr}_n[\ldots,[\ldots,\ldots]] &= \hat x(t)\hat x(t')\hat w(t')\,{\rm Tr}_n\big[\hat F(t)\hat F(t')\hat w_e\big] - \hat x(t)\hat w(t')\hat x(t')\,{\rm Tr}_n\big[\hat F(t)\hat w_e\hat F(t')\big] \\
&\quad - \hat x(t')\hat w(t')\hat x(t)\,{\rm Tr}_n\big[\hat F(t')\hat w_e\hat F(t)\big] + \hat w(t')\hat x(t')\hat x(t)\,{\rm Tr}_n\big[\hat w_e\hat F(t')\hat F(t)\big] \\
&\equiv \hat x(t)\hat x(t')\hat w(t')\sum_{n,n'}F_{nn'}(t)F_{n'n}(t')W_n - \hat x(t)\hat w(t')\hat x(t')\sum_{n,n'}F_{nn'}(t)W_{n'}F_{n'n}(t') \\
&\quad - \hat x(t')\hat w(t')\hat x(t)\sum_{n,n'}F_{nn'}(t')W_{n'}F_{n'n}(t) + \hat w(t')\hat x(t')\hat x(t)\sum_{n,n'}W_nF_{nn'}(t')F_{n'n}(t).
\end{aligned} \tag{7.178}$$
Since the summation over both indices n and n’ in this expression is over the same energy level set (of
all stationary states of the environment), we may swap these indices in any of the sums. Doing this only
in the terms including the factors $W_{n'}$, we turn them into $W_n$, so that this factor becomes common:
$$\begin{aligned}
{\rm Tr}_n[\ldots,[\ldots,\ldots]] = \sum_{n,n'}W_n\Big[&\hat x(t)\hat x(t')\hat w(t')F_{nn'}(t)F_{n'n}(t') - \hat x(t)\hat w(t')\hat x(t')F_{n'n}(t)F_{nn'}(t') \\
&- \hat x(t')\hat w(t')\hat x(t)F_{n'n}(t')F_{nn'}(t) + \hat w(t')\hat x(t')\hat x(t)F_{nn'}(t')F_{n'n}(t)\Big].
\end{aligned} \tag{7.179}$$
Now using Eq. (172), we get
$$\begin{aligned}
{\rm Tr}_n[\ldots,[\ldots,\ldots]] &= \sum_{n,n'}W_n|F_{nn'}|^2\Big[\hat x(t)\hat x(t')\hat w(t')\exp\Big\{-\frac{i\tilde E(t-t')}{\hbar}\Big\} - \hat x(t)\hat w(t')\hat x(t')\exp\Big\{+\frac{i\tilde E(t-t')}{\hbar}\Big\} \\
&\qquad\qquad - \hat x(t')\hat w(t')\hat x(t)\exp\Big\{-\frac{i\tilde E(t-t')}{\hbar}\Big\} + \hat w(t')\hat x(t')\hat x(t)\exp\Big\{+\frac{i\tilde E(t-t')}{\hbar}\Big\}\Big] \\
&\equiv \sum_{n,n'}W_n|F_{nn'}|^2\cos\frac{\tilde E(t-t')}{\hbar}\,\big[\hat x(t),[\hat x(t'),\hat w(t')]\big] - i\sum_{n,n'}W_n|F_{nn'}|^2\sin\frac{\tilde E(t-t')}{\hbar}\,\big[\hat x(t),\{\hat x(t'),\hat w(t')\}\big].
\end{aligned} \tag{7.180}$$
Comparing the two double sums participating in this expression with Eqs. (108) and (111), we see that
they are nothing else than, respectively, the symmetrized correlation function and the temporal Green’s
function (multiplied by $\hbar/2$) of the time-difference argument $\tau = t - t' \geq 0$. As the result, Eq. (177) takes
a compact form:
$$\text{Density matrix: time evolution}\quad \dot{\hat w}(t) = -\frac{1}{\hbar^2}\int_{-\infty}^{t}K_F(t-t')\big[\hat x(t),[\hat x(t'),\hat w(t')]\big]dt' + \frac{i}{2\hbar}\int_{-\infty}^{t}G(t-t')\big[\hat x(t),\{\hat x(t'),\hat w(t')\}\big]dt'. \tag{7.181}$$
57 For the notation simplicity, the fact that here (and in all following formulas) the density operator $\hat w$ of the
system s of our interest is taken in the interaction picture, is just implied.
Let me hope that the readers (especially the ones who have braved through this derivation) enjoy
this beautiful result as much as I do. It gives an equation for the time evolution of the density operator of
the system of our interest (s), with the effects of its environment represented only by two real, c-number
functions of $\tau$: one ($K_F$) describing the fluctuation force exerted by the environment, and the other one
(G) representing the ensemble-averaged environment’s response to the system’s evolution. And most
spectacularly, these are exactly the same functions that participate in the alternative, Heisenberg-
Langevin approach to the problem, and hence are related to each other by the fluctuation-dissipation
theorem (134).
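The operator bookkeeping behind this result may be verified numerically. When the four operator products of the double commutator are weighed with cosines, they form the nested commutator [x, [x', w]]; when two of them change sign (as happens for the sine parts of the exponents), they form the commutator with an anticommutator, [x, {x', w}]. The sketch below checks both identities on random matrices; the matrix size and the helper names are arbitrary assumptions, and the matrices merely stand in for x(t), x(t'), and w(t').

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4  # arbitrary matrix size for this illustration


def random_hermitian(n):
    """A random Hermitian matrix standing in for an operator."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return m + m.conj().T


def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a


def anticomm(a, b):
    """Anticommutator {a, b}."""
    return a @ b + b @ a


x_t = random_hermitian(dim)   # stands for x(t)
x_tp = random_hermitian(dim)  # stands for x(t')
w = random_hermitian(dim)     # stands for w(t')

# Combination multiplying the cosine (even) parts of the four exponents.
even_part = x_t @ x_tp @ w - x_t @ w @ x_tp - x_tp @ w @ x_t + w @ x_tp @ x_t
# Combination multiplying the sine (odd) parts, up to a common factor.
odd_part = x_t @ x_tp @ w + x_t @ w @ x_tp - x_tp @ w @ x_t - w @ x_tp @ x_t

even_ok = np.allclose(even_part, comm(x_t, comm(x_tp, w)))
odd_ok = np.allclose(odd_part, comm(x_t, anticomm(x_tp, w)))
print(even_ok, odd_ok)
```

Both checks are pure algebra and hold for any three matrices, so they do not depend on the particular random seed.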
After a short celebration, let us acknowledge that Eq. (181) is still an integro-differential
equation, and needs to be solved together with Eq. (169) for the system coordinate’s evolution. Such
equations do not allow explicit analytical solutions, besides a few very simple (and not very interesting)
cases. For most applications, further simplifications should be made. One of them is based on the fact
(which was already discussed in Sec. 3) that both environmental functions participating in Eq. (181)
tend to zero when their argument $\tau$ becomes much larger than the environment’s correlation time $\tau_c$,
independent of the system-to-environment coupling strength. If the coupling is sufficiently weak, the
time scales $T_{nn'}$ of the evolution of the density matrix elements, following from Eq. (181), are much
longer than this correlation time, and also the characteristic time scale of the coordinate operator’s
evolution. In this limit, all arguments t’ of the density operator, giving substantial contributions to the
right-hand side of Eq. (181), are so close to t that it does not matter whether its argument is t’ or just t.
This simplification, $w(t') \to w(t)$, is known as the Markov approximation.58
However, this approximation alone is still insufficient for finding the general solution of Eq.
(181). Substantial further progress is possible in two important cases. The most important of them is
when the intrinsic Hamiltonian $\hat H_s$ of the system s of our interest does not depend on time explicitly
and has a discrete eigenenergy spectrum $E_n$,59 with well-separated levels:
$$|E_n - E_{n'}| \gg \frac{\hbar}{T_{nn'}}. \tag{7.182}$$
Let us see what this condition yields for Eq. (181), rewritten for the matrix elements in the
stationary state basis, in the Markov approximation:
58 Named after Andrey Andreyevich Markov (1856-1922; in older Western literature, “Markoff”), a
mathematician famous for his general theory of the so-called Markov processes, whose future development is
completely determined by its present state, but not its pre-history.
59 Here, rather reluctantly, I will use this standard notation, $E_n$, for the eigenenergies of our system of interest (s),
in hope that the reader would not confuse these discrete energy levels with the quasi-continuous energy levels of
its environment (e), participating in particular in Eqs. (108) and (111). As a reminder, by this stage of our
calculations, the environment levels have disappeared from our formulas, leaving behind their functionals $K_F(\tau)$
and $G(\tau)$.
$$\dot w_{nn'} = -\frac{1}{\hbar^2}\int_{-\infty}^{t}K_F(t-t')\big[\hat x(t),[\hat x(t'),\hat w]\big]_{nn'}dt' + \frac{i}{2\hbar}\int_{-\infty}^{t}G(t-t')\big[\hat x(t),\{\hat x(t'),\hat w\}\big]_{nn'}dt'. \tag{7.183}$$
After spelling out the commutators, the right-hand side of this expression includes four operator
products, which differ “only” by the operator order. Let us first have a look at one of these products,
$$\big[\hat x(t)\hat x(t')\hat w\big]_{nn'} \equiv \sum_{m,m'}x_{nm}(t)\,x_{mm'}(t')\,w_{m'n'}, \tag{7.184}$$
where the indices m and m’ run over the same set of stationary states of the system s of our interest as
the indices n and n’. According to Eq. (169) with a time-independent $H_s$, the matrix elements $x_{nn'}$ (in the
stationary state basis) oscillate in time as $\exp\{i\omega_{nn'}t\}$, so that
$$\big[\hat x(t)\hat x(t')\hat w\big]_{nn'} = \sum_{m,m'}x_{nm}x_{mm'}\exp\{i(\omega_{nm}t + \omega_{mm'}t')\}\,w_{m'n'}, \tag{7.185}$$
where on the right-hand side, the coordinate matrix elements are in the Schrödinger picture, and the
usual notation (6.85) is used for the quantum transition frequencies:
$$\hbar\omega_{nn'} \equiv E_n - E_{n'}. \tag{7.186}$$
According to the condition (182), frequencies ωnn’ with n ≠ n’ are much higher than the speed of
evolution of the density matrix elements (in the interaction picture!) – on both the left-hand and
right-hand sides of Eq. (183). Hence, on the right-hand side of Eq. (183), we may keep only the terms that do
not oscillate with these frequencies ωnn’, because rapidly-oscillating terms would give negligible
contributions to the density matrix dynamics.60 For that, in the double sum (185) we should save only
the terms proportional to the difference (t – t’) because they will give (after the integration over t’) a
slowly changing contribution to the right-hand side.61 These terms should have ωnm + ωmm’ = 0, i.e. (En –
Em) + (Em – Em’) ≡ En – Em’ = 0. For a non-degenerate energy spectrum, this requirement means m’ = n;
as a result, the double sum is reduced to a single one:

$$\left[\hat x(t)\,\hat x(t')\,\hat w\right]_{nn'} \approx w_{nn'}\sum_{m} x_{nm}x_{mn}\exp\{i\omega_{nm}(t-t')\} \equiv w_{nn'}\sum_{m}|x_{nm}|^2\exp\{i\omega_{nm}(t-t')\}. \tag{7.187}$$
Another product, [ŵx̂(t’)x̂(t)]nn’, which appears on the right-hand side of Eq. (183), may be simplified
in exactly the same way, giving

$$\left[\hat w\,\hat x(t')\,\hat x(t)\right]_{nn'} \approx \sum_{m}|x_{n'm}|^2\exp\{i\omega_{n'm}(t'-t)\}\,w_{nn'}. \tag{7.188}$$

These expressions hold whether n and n’ are equal or not. The situation is different for two other
products on the right-hand side of Eq. (183), with ŵ sandwiched between x̂(t) and x̂(t’). For example,

$$\left[\hat x(t)\,\hat w\,\hat x(t')\right]_{nn'} = \sum_{m,m'} x_{nm}(t)\,w_{mm'}\,x_{m'n'}(t') = \sum_{m,m'} x_{nm}\,w_{mm'}\,x_{m'n'}\exp\{i(\omega_{nm}t + \omega_{m'n'}t')\}. \tag{7.189}$$
60 This is essentially the same rotating-wave approximation (RWA) as was used in Sec. 6.5.
61 As was already discussed in Sec. 4, the lower-limit substitution (t’ = –∞) in the integrals participating in Eq.
(183) gives zero, due to the finite-time “memory” of the system, expressed by the decay of the correlation and
response functions at large values of the time delay τ = t – t’.
For this term, the same requirement of having a fast oscillating function of (t – t’) only, yields a different
condition: ωnm + ωm’n’ = 0, i.e.

$$(E_n - E_m) + (E_{m'} - E_{n'}) = 0. \tag{7.190}$$

Here the double sum’s reduction is possible only if we make an additional assumption that all interlevel
energy distances are unique, i.e. our system of interest has no equidistant levels (such as in the harmonic
oscillator). For the diagonal elements (n = n’), the RWA requirement is reduced to m = m’, giving sums
over all diagonal elements of the density matrix:

$$\left[\hat x(t)\,\hat w\,\hat x(t')\right]_{nn} \approx \sum_{m}|x_{nm}|^2\exp\{i\omega_{nm}(t-t')\}\,w_{mm}. \tag{7.191}$$

(Another similar term, [x̂(t’)ŵx̂(t)]nn, is just a complex conjugate of (191).) However, for off-diagonal
matrix elements (n ≠ n’), the situation is different: Eq. (190) may be satisfied only if m = n and also m’
= n’, so that the double sum is reduced to just one, non-oscillating term:

$$\left[\hat x(t)\,\hat w\,\hat x(t')\right]_{nn'} \approx x_{nn}\,w_{nn'}\,x_{n'n'}, \quad \text{for } n \neq n'. \tag{7.192}$$

The second similar term, [x̂(t’)ŵx̂(t)]nn’, is exactly the same, so that in one of the integrals of Eq. (183),
these terms add up, while in the second one, they cancel.
This is why the final equations of evolution look different for diagonal and off-diagonal
elements of the density matrix. For the former case (n = n’), Eq. (183) is reduced to the so-called master
equation62 relating diagonal elements wnn of the density matrix, i.e. the energy level occupancies Wn:63

$$\dot W_n = \sum_{m\neq n}|x_{nm}|^2\int_0^{\infty}\Big\{-\frac{1}{\hbar^2}K_F(\tau)\,(W_n - W_m)\left[\exp\{i\omega_{nm}\tau\} + \exp\{-i\omega_{nm}\tau\}\right] + \frac{i}{2\hbar}G(\tau)\,(W_n + W_m)\left[\exp\{i\omega_{nm}\tau\} - \exp\{-i\omega_{nm}\tau\}\right]\Big\}\,d\tau, \tag{7.193}$$
where τ ≡ t – t’. Changing the summation index notation from m to n’, we may rewrite the master
equation in its canonical form

$$\dot W_n = \sum_{n'\neq n}\left(\Gamma_{n'\to n}W_{n'} - \Gamma_{n\to n'}W_n\right), \tag{7.194}$$

where the coefficients

$$\Gamma_{n'\to n} \equiv |x_{nn'}|^2\int_0^{\infty}\left[\frac{2}{\hbar^2}K_F(\tau)\cos\omega_{nn'}\tau - \frac{1}{\hbar}G(\tau)\sin\omega_{nn'}\tau\right]d\tau, \tag{7.195}$$

are called the interlevel transition rates.64 Eq. (194) has a very clear physical meaning of the level
occupancy dynamics (i.e. the balance of the probability flows ΓW) due to quantum transitions between
the energy levels (see Fig. 7), in our current case caused by the interaction between the system of our
interest and its environment.
62 The master equations, first introduced to quantum mechanics in 1928 by W. Pauli, are sometimes called the
“Pauli master equations”, or “kinetic equations”, or “rate equations”.
63 As Eq. (193) shows, the term with m = n would vanish and thus may be legitimately excluded from the sum.
64 As Eq. (193) shows, the result for Γn→n’ is described by Eq. (195) as well, provided that the indices n and n’ are
swapped in all components of its right-hand side, including the swap ωnn’ → ωn’n = –ωnn’.
[Figure 7.7 diagram: two energy levels, En (occupancy Wn) and En’ (occupancy Wn’), connected by arrows
labeled Γn’→n and Γn→n’, with further arrows to higher and lower levels.]
Fig. 7.7. Probability flows in a discrete-spectrum system. Solid arrows: the exchange between the two
energy levels, n and n’, described by one term in the master equation (194); dashed arrows: other
transitions to/from these two levels.
The Fourier transforms (113) and (123) enable us to express the two integrals in Eq. (195) via,
respectively, the symmetrized spectral density SF(ω) of environment force fluctuations and the
imaginary part χ”(ω) of the generalized susceptibility, both at frequency ω = ωnn’. After that we may use
the fluctuation-dissipation theorem (134) to exclude the former function, getting finally65

$$\Gamma_{n'\to n} = \frac{1}{\hbar}|x_{nn'}|^2\chi''(\omega_{nn'})\left[\coth\frac{\hbar\omega_{nn'}}{2k_BT} - 1\right] \equiv \frac{2}{\hbar}|x_{nn'}|^2\frac{\chi''(\omega_{nn'})}{\exp\{(E_n - E_{n'})/k_BT\} - 1}. \tag{7.196}$$

Note that since the imaginary part χ” of the generalized susceptibility is an odd function of
frequency, Eq. (196) is in compliance with the Gibbs distribution for arbitrary temperature. Indeed,
according to this equation, the ratio of the “up” and “down” rates for each pair of levels equals

$$\frac{\Gamma_{n'\to n}}{\Gamma_{n\to n'}} = \frac{\chi''(\omega_{nn'})}{\exp\{(E_n - E_{n'})/k_BT\} - 1}\Big/\frac{\chi''(\omega_{n'n})}{\exp\{(E_{n'} - E_n)/k_BT\} - 1} = \exp\left\{-\frac{E_n - E_{n'}}{k_BT}\right\}. \tag{7.197}$$

On the other hand, according to the Gibbs distribution (24), in thermal equilibrium the level populations
should be in the same proportion. Hence, Eq. (196) complies with the so-called detailed balance
equation,

$$W_n\Gamma_{n\to n'} = W_{n'}\Gamma_{n'\to n}, \tag{7.198}$$
valid in the equilibrium for each pair {n, n’}, so that all right-hand sides of all Eqs. (194), and hence the
time derivatives of all Wn vanish – as they should. Thus, the stationary solution of the master equations
indeed describes the thermal equilibrium correctly.
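The statement above can be checked numerically. The following is a minimal sketch, assuming the Ohmic form χ”(ω) = ηω in Eq. (196), units with ħ = 1, and an arbitrary three-level system; the function names, the energies E, and the matrix x are illustrative assumptions, not taken from the text. It builds the generator of Eqs. (194) and verifies that the Gibbs distribution makes all right-hand sides vanish.

```python
import numpy as np

def transition_rate(E_to, E_from, x2, eta, kT, hbar=1.0):
    """Rate Gamma_{n'->n} of Eq. (196) with E_to = E_n, E_from = E_n', x2 = |x_nn'|^2.
    Assumption: Ohmic dissipation, chi''(omega) = eta * omega."""
    omega = (E_to - E_from) / hbar                      # omega_nn'
    return (2.0 / hbar) * x2 * eta * omega / np.expm1((E_to - E_from) / kT)

def rate_matrix(E, x, eta, kT, hbar=1.0):
    """Generator M of the master equations (194), so that dW/dt = M @ W."""
    N = len(E)
    M = np.zeros((N, N))
    for n in range(N):
        for k in range(N):
            if k != n:
                M[n, k] = transition_rate(E[n], E[k], abs(x[n, k]) ** 2, eta, kT, hbar)
    M -= np.diag(M.sum(axis=0))                         # total escape rate from each level
    return M

# Illustrative (hypothetical) non-degenerate spectrum and coordinate matrix
E = np.array([0.0, 1.0, 2.7])
x = np.array([[0.2, 1.0, 0.4],
              [1.0, -0.1, 0.7],
              [0.4, 0.7, 0.3]])
eta, kT = 0.1, 0.8

M = rate_matrix(E, x, eta, kT)
W_gibbs = np.exp(-E / kT) / np.exp(-E / kT).sum()       # Gibbs distribution (24)
residual = M @ W_gibbs                                  # should vanish, per Eq. (198)
```

The expression would divide by zero for degenerate levels, which is consistent with the non-degeneracy assumption made in the derivation.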
The system of master equations (194), frequently complemented by additional terms on their
right-hand sides, describing interlevel transitions due to other factors (e.g., by an external ac force with a
frequency close to one of ωnn’), is the key starting point for practical analyses of many quantum systems,
notably including optical quantum amplifiers and generators (lasers). It is important to remember that
they are strictly valid only in the rotating-wave approximation, i.e. if Eq. (182) is well satisfied for all n
and n’ of substance.
65 It is straightforward (and highly recommended to the reader) to show that at low temperatures (kBT << |En’ –
En|), Eq. (196) gives the same result as the Golden Rule formula (6.111), with A = x. (The low-temperature
condition ensures that the initial occupancy of the excited level n is negligible, as was assumed at the derivation
of Eq. (6.111).)
For a particular but very important case of a two-level system (with, say, E1 > E2), the rate Γ1→2
may be interpreted (especially in the low-temperature limit kBT << ħω12 = E1 – E2, when Γ1→2 >> Γ2→1)
as the reciprocal characteristic time 1/T1 ≡ Γ1→2 of the energy relaxation process that brings the
diagonal elements of the density matrix to their thermally-equilibrium values (24). For the Ohmic
dissipation described by Eqs. (137)-(138), Eq. (196) yields

$$\frac{1}{T_1} \equiv \Gamma_{1\to 2} = \frac{2}{\hbar^2}|x_{12}|^2\,\eta \times \begin{cases} \hbar\omega_{12}, & \text{for } k_BT \ll \hbar\omega_{12}, \\ k_BT, & \text{for } \hbar\omega_{12} \ll k_BT. \end{cases} \tag{7.199}$$
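As a brief check, both limits of Eq. (199) follow from Eq. (196) in a couple of lines. The sketch below assumes the Ohmic form χ”(ω) = ηω and E1 > E2, takes n’ = 1 and n = 2 in Eq. (196), and uses ω21 = –ω12 and |x21| = |x12|:

```latex
\Gamma_{1\to 2}
  = \frac{2}{\hbar}\,|x_{21}|^2\,\frac{\eta\,\omega_{21}}{\exp\{(E_2-E_1)/k_BT\}-1}
  = \frac{2}{\hbar}\,|x_{12}|^2\,\frac{\eta\,\omega_{12}}{1-\exp\{-\hbar\omega_{12}/k_BT\}}
  \;\to\;
  \frac{2}{\hbar^2}\,|x_{12}|^2\,\eta \times
  \begin{cases}
    \hbar\omega_{12}, & k_BT \ll \hbar\omega_{12} \quad (\text{the exponent is negligible}),\\
    k_BT, & \hbar\omega_{12} \ll k_BT \quad (1-e^{-\xi}\approx \xi,\ \xi \equiv \hbar\omega_{12}/k_BT).
  \end{cases}
```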
This relaxation time T1 should not be confused with the characteristic time T2 of the off-diagonal
element decay, i.e. dephasing, which was already discussed in Sec. 3. In this context, let us see what
Eqs. (183) have to say about the dephasing rates. Taking into account our intermediate results (187)-
(192), and merging the non-oscillating components (with m = n and m = n’) of the sums in Eqs. (187) and
(188) with the terms (192), which also do not oscillate in time, we get the following equation:66

$$\dot w_{nn'} = -\int_0^{\infty}\Big\{\frac{1}{\hbar^2}K_F(\tau)\Big[\sum_{m\neq n}|x_{nm}|^2\exp\{i\omega_{nm}\tau\} + \sum_{m\neq n'}|x_{n'm}|^2\exp\{-i\omega_{n'm}\tau\} + (x_{nn} - x_{n'n'})^2\Big] - \frac{i}{2\hbar}G(\tau)\Big[\sum_{m\neq n}|x_{nm}|^2\exp\{i\omega_{nm}\tau\} - \sum_{m\neq n'}|x_{n'm}|^2\exp\{-i\omega_{n'm}\tau\}\Big]\Big\}\,d\tau\; w_{nn'}, \quad \text{for } n \neq n'. \tag{7.200}$$
In contrast with Eq. (194), the right-hand side of this equation includes both a real and an imaginary
part, and hence it may be represented as

$$\dot w_{nn'} = -\left(1/T_{nn'} + i\Delta_{nn'}\right)w_{nn'}, \tag{7.201}$$

where both factors 1/Tnn’ and Δnn’ are real. As Eq. (201) shows, the second term in the right-hand side of
this equation causes slow oscillations of the matrix elements wnn’, which, after returning to the
Schrödinger picture, add just small corrections67 to the unperturbed frequencies (186) of their
oscillations, and are not important for most applications. More important is the first term, proportional to

$$\frac{1}{T_{nn'}} = \int_0^{\infty}\Big\{\frac{1}{\hbar^2}K_F(\tau)\Big[\sum_{m\neq n}|x_{nm}|^2\cos\omega_{nm}\tau + \sum_{m\neq n'}|x_{n'm}|^2\cos\omega_{n'm}\tau + (x_{nn} - x_{n'n'})^2\Big] + \frac{1}{2\hbar}G(\tau)\Big[\sum_{m\neq n}|x_{nm}|^2\sin\omega_{nm}\tau + \sum_{m\neq n'}|x_{n'm}|^2\sin\omega_{n'm}\tau\Big]\Big\}\,d\tau, \quad \text{for } n \neq n', \tag{7.202}$$

because it describes the effect completely absent without the environment coupling: exponential decay
of the off-diagonal matrix elements, i.e. the dephasing. Comparing the first two terms of Eq. (202) with
Eq. (195), we see that the dephasing rates may be described by a very simple formula:

$$\frac{1}{T_{nn'}} = \frac{1}{2}\Big(\sum_{m\neq n}\Gamma_{n\to m} + \sum_{m\neq n'}\Gamma_{n'\to m}\Big) + \frac{\pi}{\hbar^2}(x_{nn} - x_{n'n'})^2 S_F(0) = \frac{1}{2}\Big(\sum_{m\neq n}\Gamma_{n\to m} + \sum_{m\neq n'}\Gamma_{n'\to m}\Big) + \frac{k_BT}{\hbar^2}\,\eta\,(x_{nn} - x_{n'n'})^2, \quad \text{for } n \neq n', \tag{7.203}$$

where the low-frequency drag coefficient η is again defined as the limit of χ”(ω)/ω at ω → 0 – see Eq. (138).
66 Sometimes Eq. (200) (in any of its numerous alternative forms) is called the Redfield equation, after the 1965
work by A. Redfield. Note, however, that in the mid-1960s several other authors, notably including (in the
alphabetical order) H. Haken, W. Lamb, M. Lax, W. Louisell, and M. Scully, also made major contributions to
the very fast development of the density-matrix approach to open quantum systems.
67 Such corrections are sometimes called Lamb shifts, due to their conceptual similarity to the genuine Lamb shift
– the effect first observed experimentally in 1947 by Willis Lamb and Robert Retherford: a minor difference
between energy levels of the 2s and 2p states of hydrogen, due to the electric-dipole coupling of hydrogen atoms
to the free-space electromagnetic environment. (These energies are equal not only in the non-relativistic theory
described in Sec. 3.6 but also in the relativistic theory (see Secs. 6.3, 9.7), if the electromagnetic environment is
ignored.) The explanation of the Lamb shift by H. Bethe, in the same year 1947, essentially launched the whole
field of quantum electrodynamics – to be briefly discussed in Chapter 9.
This result shows that two effects yield independent contributions to the dephasing. The first of
them may be interpreted as a result of “virtual” transitions of the system, from the levels n and n’ of our
interest, to other energy levels m; according to Eq. (195), this contribution is proportional to the
strength of coupling to the environment at relatively high frequencies ωnm and ωn’m. (If the energy
quanta ħω of these frequencies are much larger than the thermal fluctuation scale kBT, then only the
lower levels, with Em < max[En, En’] are important.) On the contrary, the second contribution is due to
low-frequency, essentially classical fluctuations of the environment, and hence to the low-frequency
dissipative susceptibility. In the Ohmic dissipation case, when the ratio η ≡ χ”(ω)/ω is
frequency-independent, both contributions are of the same order, but their exact relation depends on the matrix
elements xnn’ of a particular system.
For example, returning for a minute to the two-level system discussed in Sec. 3, described by our
current theory with the replacement x̂ → σ̂z, the high-frequency contributions to dephasing vanish
because of the absence of transitions between energy levels, while the low-frequency contribution yields

$$\frac{1}{T_2} \equiv \frac{1}{T_{12}} = \frac{k_BT}{\hbar^2}\,\eta\,(x_{nn} - x_{n'n'})^2 = \frac{k_BT}{\hbar^2}\,\eta\left[(\sigma_z)_{11} - (\sigma_z)_{22}\right]^2 = \frac{4k_BT}{\hbar^2}\,\eta, \tag{7.204}$$

thus exactly reproducing the result (142) of the Heisenberg-Langevin approach.68 Note also that the
expression for T2 is very close in structure to Eq. (199) for T1 (in the high-temperature limit). However,
for the simple interaction model (70) that was explored in Sec. 3, the off-diagonal elements of the
operator x̂ = σ̂z in the stationary-state z-basis vanish, so that T1 → ∞, while T2 stays finite. The physics
of this result is very clear, for example, for the two-well implementation of the model (see Fig. 4 and its
discussion): it is suitable for the case of a very high energy barrier between the wells, which inhibits
tunneling, and hence any change of the well occupancies. However, T1 may become finite, and
comparable with T2, if tunneling between the wells is substantial.69
68 The first form of Eq. (203), as well as the analysis of Sec. 3, implies that low-frequency fluctuations of any
other origin, not taken into account in our current analysis (say, an unintentional noise from experimental
equipment), may also contribute to dephasing; such “technical fluctuations” are indeed a very serious challenge
for the experimental implementation of coherent qubit systems – see Sec. 8.5 below.
69 As was discussed in Sec. 5.1, the tunneling may be described by using, instead of Eq. (70), the full two-level
Hamiltonian (5.3). Let me leave for the reader’s exercise to spell out the equations for the time evolution of the
density matrix elements of this system, and of the expectation values of the Pauli operators, for this case.
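The second form of Eq. (203) is simple enough to be coded directly. The sketch below is only an illustration: the function name, the convention that Gamma[a, b] holds the rate of the transition a → b, and the numerical values are assumptions made here, and units with ħ = 1 are used. Its last lines reproduce the two-level result (204) for the coupling via σz.

```python
import numpy as np

def dephasing_rate(n, n2, Gamma, x_diag, eta, kT, hbar=1.0):
    """1/T_nn' from the second form of Eq. (203), for n != n'.

    Assumed conventions: Gamma[a, b] is the rate of the transition a -> b,
    x_diag[a] is the diagonal matrix element x_aa."""
    levels = range(len(x_diag))
    virtual = 0.5 * (sum(Gamma[n, m] for m in levels if m != n)
                     + sum(Gamma[n2, m] for m in levels if m != n2))
    low_frequency = kT * eta * (x_diag[n] - x_diag[n2]) ** 2 / hbar ** 2
    return virtual + low_frequency

# Two-level system coupled via sigma_z: no interlevel transitions, x_11 = +1, x_22 = -1
eta, kT = 0.2, 1.5
Gamma = np.zeros((2, 2))
sigma_z_diag = np.array([1.0, -1.0])
inv_T2 = dephasing_rate(0, 1, Gamma, sigma_z_diag, eta, kT)   # equals 4*kT*eta for hbar = 1
```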
Because of the reason explained above, the derivation of Eqs. (200)-(204) is not valid for
systems with equidistant energy spectra – for example, the harmonic oscillator. For this particular, but
very important system, with its simple matrix elements xnn’ given by Eqs. (5.92), it is longish but
straightforward to repeat the above calculations, starting from (183), to obtain an equation similar in
structure to Eq. (200), but with two other terms, proportional to wn±1,n’±1, on its right-hand side.
Neglecting the minor Lamb-shift term, the equation reads

$$\dot w_{nn'} = -\delta\Big\{\left[(n_e + 1)(n + n') + n_e(n + n' + 2)\right]w_{nn'} - 2(n_e + 1)\left[(n+1)(n'+1)\right]^{1/2}w_{n+1,n'+1} - 2n_e(nn')^{1/2}w_{n-1,n'-1}\Big\}. \tag{7.205}$$

Here δ is the effective damping coefficient,70

$$\delta \equiv \frac{x_0^2}{2\hbar}\,\mathrm{Im}\,\chi(\omega_0) \equiv \frac{\mathrm{Im}\,\chi(\omega_0)}{2m\omega_0}, \tag{7.206}$$

equal to just η/2m for the Ohmic dissipation, and ne is the equilibrium number of oscillator’s excitations,
given by Eq. (26b), with the environment’s temperature T. (I am using this new notation because in
dynamics, the instant expectation value ⟨n⟩ may be time-dependent, and is generally different from its
equilibrium value ne.)
As a remark: the derivation of Eq. (205) might be started at a bit earlier point, from the Markov
approximation applied to Eq. (181), expressing the coordinate operator via the creation-annihilation
operators (5.65). This procedure gives the result in the operator (i.e. basis-independent) form:71

$$\dot{\hat w} = -\delta\Big[(n_e + 1)\left(\{\hat a^\dagger\hat a, \hat w\} - 2\hat a\hat w\hat a^\dagger\right) + n_e\left(\{\hat a\hat a^\dagger, \hat w\} - 2\hat a^\dagger\hat w\hat a\right)\Big]. \tag{7.207}$$
In the Fock state basis, this equation immediately reduces to Eq. (205); however, Eq. (207) may be more
convenient for some applications.
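The reduction of Eq. (207) to Eq. (205) in the Fock basis can be checked numerically. The following is a minimal sketch with the operators truncated to N levels; the function names and the random test matrix are illustrative assumptions. Because the truncated product ââ† is wrong on the highest retained level, the comparison is restricted to n, n’ ≤ N – 2.

```python
import numpy as np

def rhs_operator(w, n_e, delta):
    """Right-hand side of Eq. (207), with a and a^dagger truncated to N levels."""
    N = w.shape[0]
    a = np.diag(np.sqrt(np.arange(1, N)), k=1)      # a|n> = sqrt(n)|n-1>
    ad = a.conj().T
    anti = lambda A, B: A @ B + B @ A               # anticommutator {A, B}
    return -delta * ((n_e + 1) * (anti(ad @ a, w) - 2 * a @ w @ ad)
                     + n_e * (anti(a @ ad, w) - 2 * ad @ w @ a))

def rhs_components(w, n_e, delta):
    """Right-hand side of Eq. (205), element by element, for n, n' <= N - 2."""
    N = w.shape[0]
    out = np.zeros_like(w)
    for n in range(N - 1):
        for k in range(N - 1):                      # k plays the role of n'
            upper = w[n + 1, k + 1]
            lower = w[n - 1, k - 1] if (n > 0 and k > 0) else 0.0
            out[n, k] = -delta * (
                ((n_e + 1) * (n + k) + n_e * (n + k + 2)) * w[n, k]
                - 2 * (n_e + 1) * np.sqrt((n + 1) * (k + 1)) * upper
                - 2 * n_e * np.sqrt(n * k) * lower)
    return out

# Arbitrary Hermitian test matrix (not a physical state; the identity is linear in w)
rng = np.random.default_rng(0)
N = 8
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
w = A + A.conj().T

lhs = rhs_operator(w, n_e=0.3, delta=0.5)[:N - 1, :N - 1]
rhs = rhs_components(w, n_e=0.3, delta=0.5)[:N - 1, :N - 1]
max_mismatch = np.max(np.abs(lhs - rhs))            # expected to be at rounding-error level
```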
Returning to Eq. (205), we see that it relates only the elements wnn’ located at the same distance
(n – n’) from the principal diagonal of the density matrix. This means, in particular, that the dynamics of
the diagonal elements wnn of the matrix, i.e. the Fock state probabilities Wn, is independent of the
off-diagonal elements, and may be represented in the form (194), truncated to the transitions between the
adjacent energy levels only (n’ = n ± 1):

$$\dot W_n = \left(\Gamma_{n+1\to n}W_{n+1} - \Gamma_{n\to n+1}W_n\right) + \left(\Gamma_{n-1\to n}W_{n-1} - \Gamma_{n\to n-1}W_n\right), \tag{7.208}$$

with the following rates:

$$\Gamma_{n+1\to n} = 2\delta(n+1)(n_e+1), \quad \Gamma_{n\to n+1} = 2\delta(n+1)n_e, \quad \Gamma_{n-1\to n} = 2\delta n\,n_e, \quad \Gamma_{n\to n-1} = 2\delta n(n_e+1). \tag{7.209}$$

70 This coefficient participates prominently in the classical theory of damped oscillations (see, e.g., CM Sec. 5.1),
in particular defining the oscillator’s Q-factor as Q ≡ ω0/2δ, and the decay time of the amplitude A and the energy
E of free oscillations: A(t) = A(0)exp{–δt}, E(t) = E(0)exp{–2δt}.
71 Sometimes Eq. (207) is called the Lindblad equation, but I believe this terminology is inappropriate. It is true
that its structure falls into a general category of equations, suggested by G. Lindblad in 1976 for the density
operators in the Markov approximation, whose diagonalized form in the interaction picture is

$$\dot{\hat w} = \sum_j\gamma_j\left(2\hat L_j\hat w\hat L_j^\dagger - \{\hat L_j^\dagger\hat L_j, \hat w\}\right).$$

However, Eq. (207) was derived much earlier (by L. Landau in 1927 for zero temperature, and by M. Lax in 1960
for an arbitrary temperature), and in contrast to the general Lindblad equation, spells out the participating
operators L̂j and coefficients γj for a particular physical system – the harmonic oscillator.
Since according to the definition of ne, given by Eq. (26b),

$$n_e = \frac{1}{\exp\{\hbar\omega_0/k_BT\} - 1}, \quad \text{so that} \quad n_e + 1 = \frac{\exp\{\hbar\omega_0/k_BT\}}{\exp\{\hbar\omega_0/k_BT\} - 1} \equiv -\frac{1}{\exp\{-\hbar\omega_0/k_BT\} - 1}, \tag{7.210}$$

taking into account Eqs. (5.92), (186), (206), and the antisymmetry of the function χ”(ω), we see that these
rates are again described by Eq. (196), even though the last formula was derived for non-equidistant
energy spectra.
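This agreement is easy to confirm numerically. The sketch below assumes Ohmic dissipation, so that χ”(ω) = ηω and δ = η/2m, and uses the standard oscillator matrix elements |x(n, n+1)|² = (ħ/2mω0)(n + 1); the function name and the parameter values are illustrative assumptions.

```python
import numpy as np

def gamma_196(E_to, E_from, x2, eta, kT, hbar=1.0):
    """Eq. (196) for the transition E_from -> E_to, assuming chi''(omega) = eta * omega."""
    omega = (E_to - E_from) / hbar
    return (2.0 / hbar) * x2 * eta * omega / np.expm1((E_to - E_from) / kT)

hbar, m, omega0, eta, kT = 1.0, 1.0, 1.0, 0.05, 0.7     # illustrative values
delta = eta / (2 * m)                                    # Eq. (206) for the Ohmic case
n_e = 1.0 / np.expm1(hbar * omega0 / kT)                 # Eq. (210)

n = 3                                                    # any level number
x2 = hbar / (2 * m * omega0) * (n + 1)                   # |x_{n,n+1}|^2
E = lambda k: hbar * omega0 * (k + 0.5)                  # oscillator energy levels

rate_down = gamma_196(E(n), E(n + 1), x2, eta, kT, hbar) # Gamma_{n+1 -> n} from Eq. (196)
rate_up = gamma_196(E(n + 1), E(n), x2, eta, kT, hbar)   # Gamma_{n -> n+1} from Eq. (196)

rate_down_209 = 2 * delta * (n + 1) * (n_e + 1)          # first of Eqs. (209)
rate_up_209 = 2 * delta * (n + 1) * n_e                  # second of Eqs. (209)
```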
Hence the only substantial new feature of the master equation for the harmonic oscillator is that
the decay of the off-diagonal elements of its density matrix is scaled by the same parameter (2δ) as that
of the decay of its diagonal elements, i.e. there is no radical difference between the dephasing and
energy-relaxation times T2 and T1. This fact may be interpreted as the result of the independence of the
energy level distances, ħω0, of the fluctuations F(t) exerted on the oscillator by the environment, so that
their low-frequency density, SF(0), does not contribute to the dephasing. (This fact also follows formally
from Eq. (203), taking into account that for the oscillator, xnn = xn’n’ = 0.)
The simple equidistant structure of the oscillator’s spectrum makes it possible to readily solve
the system of Eqs. (208), with n = 0, 1, 2, …, for some important cases. In particular, if the initial state
of the oscillator is a classical mixture, with no off-diagonal elements, its further relaxation proceeds as
such a mixture: wnn’(t) = 0 for all n’ ≠ n.72 In particular, it is straightforward to use Eq. (208) to verify
that if the initial classical mixture obeys the Gibbs distribution (25), but with a temperature Ti different
from that of the environment (Te), then the relaxation process is reduced to a simple exponential
transient of the effective temperature from Ti to Te:

$$W_n(t) = \exp\left\{-n\frac{\hbar\omega_0}{k_BT_{ef}(t)}\right\}\left(1 - \exp\left\{-\frac{\hbar\omega_0}{k_BT_{ef}(t)}\right\}\right), \quad \text{with } T_{ef}(t) = T_i e^{-2\delta t} + T_e\left(1 - e^{-2\delta t}\right), \tag{7.211}$$

with the corresponding evolution of the expectation value of the full energy E – cf. Eq. (26b):

$$\langle E\rangle(t) = \frac{\hbar\omega_0}{2} + \hbar\omega_0\langle n\rangle(t), \quad \langle n\rangle(t) = \frac{1}{\exp\{\hbar\omega_0/k_BT_{ef}(t)\} - 1} \to n_e \ \text{ at } t \to \infty. \tag{7.212}$$
However, if the initial state of the oscillator is different (say, corresponds to some upper Fock
state), the relaxation process, described by Eqs. (208)-(209), is more complex – see, e.g., Fig. 8. At low
temperatures (Fig. 8a), it may be interpreted as a gradual “roll” of the probability distribution down the
energy staircase, with a gradually decreasing velocity dn/dt ∝ n. However, at substantial temperatures,
with kBT ~ ħω0 (Fig. 8b), this “roll-down” is saturated when the level occupancies Wn(t) approach their
equilibrium values (25).73
72 Note, however, that this is not true for many applications, in which a damped oscillator is also under the effect
of an external time-dependent field, which may be described by additional, typically off-diagonal terms on the
right-hand side of Eqs. (205).
[Figure 7.8 plots: level occupancies Wn (logarithmic scale, from 0.01 to 1) for n = 0, 1, 2, … as functions of
2δt (from 0 to 6); panel (a): T = 0, ⟨n⟩e = 0; panel (b): kBT = ħω0/2, ⟨n⟩e ≈ 0.157.]
Fig. 7.8. Relaxation of a harmonic oscillator, initially in its 5th Fock state, at: (a) T = 0, and (b) T > 0. Note
that in the latter case, even the energy levels with n > 5 get populated, due to their thermal excitation.
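Curves of this kind may be reproduced by direct numerical integration of Eqs. (208)-(209). The following is a minimal sketch under stated assumptions: the level ladder is truncated at N levels (upward transitions from the top retained level are switched off), time is measured in units that make 2δ = 1, and a plain fourth-order Runge-Kutta stepper is used; all function names are illustrative. It also tracks the average excitation number, which, according to the rates (209), should obey d⟨n⟩/dt = –2δ(⟨n⟩ – ne).

```python
import numpy as np

def oscillator_generator(N, n_e, delta=0.5):
    """Matrix M of Eqs. (208)-(209), dW/dt = M @ W, truncated to levels 0..N-1."""
    n = np.arange(N)
    up = 2 * delta * (n + 1) * n_e              # Gamma_{n -> n+1}
    up[-1] = 0.0                                # truncation: nothing above level N-1
    down = 2 * delta * n * (n_e + 1)            # Gamma_{n -> n-1}
    M = np.diag(-(up + down))
    M += np.diag(up[:-1], k=-1)                 # M[n+1, n] = Gamma_{n -> n+1}
    M += np.diag(down[1:], k=1)                 # M[n-1, n] = Gamma_{n -> n-1}
    return M

def relax(W0, M, t_max, dt=1e-3):
    """Integrate dW/dt = M @ W with the classical 4th-order Runge-Kutta scheme."""
    W = np.array(W0, dtype=float)
    for _ in range(int(round(t_max / dt))):
        k1 = M @ W
        k2 = M @ (W + 0.5 * dt * k1)
        k3 = M @ (W + 0.5 * dt * k2)
        k4 = M @ (W + dt * k3)
        W = W + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return W

N = 40
levels = np.arange(N)
W0 = np.zeros(N)
W0[5] = 1.0                                     # initial 5th Fock state, as in Fig. 8

n_e_warm = 1.0 / np.expm1(2.0)                  # k_B T = hbar*omega_0/2, i.e. n_e ~ 0.157
W_cold = relax(W0, oscillator_generator(N, 0.0), t_max=1.0)       # T = 0, at 2*delta*t = 1
W_warm = relax(W0, oscillator_generator(N, n_e_warm), t_max=1.0)  # T > 0, at 2*delta*t = 1

mean_n_cold = levels @ W_cold
mean_n_warm = levels @ W_warm
```

With these settings, the computed averages may be compared with ne + (5 – ne)exp{–2δt} at 2δt = 1; a larger N or a smaller dt tightens the agreement at higher temperatures.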
The analysis of this process may be simplified in the case when W(n, t) ≡ Wn(t) is a smooth
function of the energy level number n, limited to high levels: n >> 1. In this limit, we may use the
Taylor expansion of this function (written for the points n ± 1), truncated to three leading terms:

$$W_{n\pm 1}(t) \equiv W(n \pm 1, t) \approx W(n, t) \pm \frac{\partial W(n, t)}{\partial n} + \frac{1}{2}\frac{\partial^2 W(n, t)}{\partial n^2}. \tag{7.213}$$

Plugging this expression into Eqs. (208)-(209), we get for the function W(n, t) a partial differential
equation, which may be recast in the following form:

$$\frac{\partial W}{\partial t} = -\frac{\partial}{\partial n}\left[f(n)\,W\right] + \frac{\partial^2}{\partial n^2}\left[d(n)\,W\right], \quad \text{with } f(n) \equiv 2\delta(n_e - n), \quad d(n) \equiv 2\delta\left(n_e + ½\right)n. \tag{7.214}$$

Since at n >> 1, the oscillator’s energy E is close to ħω0n, this energy diffusion equation (sometimes
incorrectly called the Fokker-Planck equation – see below) essentially describes the time evolution of
the continuous probability density w(E, t), which may be defined as w(E, t) ≡ W(E/ħω0, t)/ħω0.74
73 The reader may like to have a look at the results of nice measurements of such functions Wn(t) in microwave
oscillators, performed using their coupling with Josephson-junction circuits: H. Wang et al., Phys. Rev. Lett. 101,
240401 (2008), and with Rydberg atoms: M. Brune et al., Phys. Rev. Lett. 101, 240402 (2008).
74 In the classical limit ne >> 1, Eq. (214) is analytically solvable for any initial conditions – see, e.g., the paper by
B. Zeldovich et al., Sov. Phys. JETP 28, 308 (1969), which also gives some more intricate solutions of Eqs.
(208)-(209). Note, however, that the most important properties of the damped harmonic oscillator (including its
relaxation dynamics) may be analyzed more simply by using the Heisenberg-Langevin approach discussed in the
previous section.
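The origin of the drift and diffusion coefficients in Eq. (214) may be traced as follows. This is only a sketch of the intermediate step, in which the shorthand Γ↑(n) and Γ↓(n) for the “up” and “down” rates of Eqs. (209) is introduced here for convenience, and the expansion (213) is applied to the products Γ(n)W(n, t):

```latex
% Shorthand (introduced here): \Gamma_\uparrow(n) \equiv \Gamma_{n\to n+1} = 2\delta(n+1)n_e,
%                              \Gamma_\downarrow(n) \equiv \Gamma_{n\to n-1} = 2\delta n(n_e+1).
\dot W_n = \left[\Gamma_\uparrow W\right]_{n-1} - \left[\Gamma_\uparrow W\right]_{n}
         + \left[\Gamma_\downarrow W\right]_{n+1} - \left[\Gamma_\downarrow W\right]_{n}
  \approx -\frac{\partial}{\partial n}\Big[\left(\Gamma_\uparrow - \Gamma_\downarrow\right)W\Big]
          + \frac{1}{2}\frac{\partial^2}{\partial n^2}\Big[\left(\Gamma_\uparrow + \Gamma_\downarrow\right)W\Big],
\\[4pt]
f(n) = \Gamma_\uparrow - \Gamma_\downarrow = 2\delta\left[(n+1)n_e - n(n_e+1)\right] = 2\delta\,(n_e - n),
\\[4pt]
d(n) = \tfrac{1}{2}\left(\Gamma_\uparrow + \Gamma_\downarrow\right)
     = 2\delta\left(n_e + \tfrac{1}{2}\right)n + \delta n_e
     \approx 2\delta\left(n_e + \tfrac{1}{2}\right)n \quad \text{at } n \gg 1 .
```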
This continuous approximation naturally reminds us of the need to discuss dissipative systems
with a continuous spectrum. Unfortunately, for such systems the few (relatively :-) simple results that
may be obtained from the basic Eq. (181), are essentially classical in nature and are discussed in detail
in the SM part of this series. Here, I will give only a simple illustration. Let us consider a 1D particle
that interacts weakly with a thermally-equilibrium environment, but otherwise is free to move along the
x-axis. As we know from Chapters 2 and 5, in this case, the most convenient basis is that of the
momentum eigenstates p. In the momentum representation, the density matrix is just the c-number
function w(p, p’), defined by Eq. (54), which was already discussed in brief in Sec. 2. On the other hand,
the coordinate operator, which participates in the right-hand side of Eq. (181), has the form given by the
first of Eqs. (4.269),

$$\hat x = i\hbar\frac{\partial}{\partial p}, \tag{7.215}$$
dual to the coordinate-representation formula (4.268). As we already know, such operators are local –
see, e.g., Eq. (4.244). Due to this locality, the whole right-hand side of Eq. (181) is local as well, and
hence (within the framework of our perturbative treatment) the interaction with the environment affects
only the diagonal values w(p, p) of the density matrix, i.e. the momentum probability density w(p).
Let us find the equation governing the evolution of this function in time in the Markov
approximation, when the time scale of the density matrix evolution is much longer than the correlation
time τc of the environment, i.e. the time scale of the functions KF(τ) and G(τ). In this approximation, we
may take the matrix elements out of the first integral of Eq. (181),

$$-\frac{1}{\hbar^2}\int_{-\infty}^{t}K_F(t-t')\,dt'\left[\hat x(t),\left[\hat x(t'),\hat w(t')\right]\right] \approx -\frac{1}{\hbar^2}\int_0^{\infty}K_F(\tau)\,d\tau\left[\hat x,\left[\hat x,\hat w\right]\right] = -\frac{\pi}{\hbar^2}S_F(0)\left[\hat x,\left[\hat x,\hat w\right]\right] = -\frac{k_BT}{\hbar^2}\,\eta\left[\hat x,\left[\hat x,\hat w\right]\right], \tag{7.216}$$
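and calculate the last double commutator in the Schrödinger picture. This may be done either using an
explicit expression for the matrix elements of the coordinate operator or in a simpler way – using the
same trick as at the derivation of the Ehrenfest theorem in Sec. 5.2. Namely, expanding an arbitrary
function f(p) into the Taylor series in p,

$$f(p) = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k f}{\partial p^k}\,p^k, \tag{7.217}$$

(with all derivatives taken at p = 0) and using Eq. (215), we can prove the following simple commutation relation:

$$\left[\hat x, f\right] = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k f}{\partial p^k}\left[\hat x, p^k\right] = \sum_{k=1}^{\infty}\frac{1}{k!}\frac{\partial^k f}{\partial p^k}\left(i\hbar k p^{k-1}\right) = i\hbar\sum_{k=1}^{\infty}\frac{1}{(k-1)!}\frac{\partial^{k-1}}{\partial p^{k-1}}\left(\frac{\partial f}{\partial p}\right)p^{k-1} = i\hbar\frac{\partial f}{\partial p}. \tag{7.218}$$

Now applying this result sequentially, first to w and then to the resulting commutator, we get

$$\left[\hat x,\left[\hat x, w\right]\right] = \left[\hat x, i\hbar\frac{\partial w}{\partial p}\right] = i\hbar\frac{\partial}{\partial p}\left(i\hbar\frac{\partial w}{\partial p}\right) = -\hbar^2\frac{\partial^2 w}{\partial p^2}. \tag{7.219}$$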
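Relations (218)-(219) can be verified mechanically on polynomials, which play the role of the truncated Taylor series (217). The sketch below checks the operator identities [D, f] = f’ and [D, [D, f]] = f” for D ≡ ∂/∂p acting on an arbitrary test polynomial g; since Eq. (215) gives x̂ = iħD, these identities are equivalent to Eqs. (218) and (219), with the factors iħ and (iħ)² = –ħ². The helper names and the sample polynomials are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import Polynomial

def D(g):
    """Derivative over p; by Eq. (215), the coordinate operator is i*hbar*D."""
    return g.deriv()

def mult(f):
    """Operator of multiplication by the function f(p)."""
    return lambda g: f * g

def comm(A, B):
    """Commutator [A, B] of two operators, itself an operator acting on g."""
    return lambda g: A(B(g)) - B(A(g))

f = Polynomial([1.0, -2.0, 0.5, 3.0])          # sample f(p), standing in for w(p)
g = Polynomial([0.3, 1.0, -1.0, 2.0, 0.7])     # arbitrary test function

# [D, f] g - f' g, which should vanish identically, cf. Eq. (218)
single = comm(D, mult(f))(g) - f.deriv() * g
# [D, [D, f]] g - f'' g, which should vanish identically, cf. Eq. (219)
double = comm(D, comm(D, mult(f)))(g) - f.deriv(2) * g

single_ok = np.allclose(single.coef, 0.0)
double_ok = np.allclose(double.coef, 0.0)
```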
It may look like the second integral in Eq. (181) might be simplified similarly. However, it
vanishes at p’ → p and t’ → t, so that to calculate the first non-vanishing contribution from that integral
for p = p’, we have to take into account the small difference τ ≡ t – t’ ~ τc between the arguments of the
coordinate operators under that integral. This may be done using Eq. (169) with the free-particle’s
Hamiltonian consisting of the kinetic-energy contribution alone:

$$\hat x(t') \approx \hat x(t) - \tau\dot{\hat x} = \hat x - \tau\frac{1}{i\hbar}\left[\hat x, \hat H_s\right] = \hat x - \tau\frac{1}{i\hbar}\left[\hat x, \frac{\hat p^2}{2m}\right] = \hat x - \tau\frac{\hat p}{m}, \tag{7.220}$$
where the exact argument of the operator on the right-hand side is already unimportant and may be
taken for t. As a result, we may use the last of Eqs. (136) to reduce the second term on the right-hand
side of Eq. (181) to

$$\frac{i}{2\hbar}\int_{-\infty}^{t}G(t-t')\left[\hat x(t),\left\{\hat x(t'),\hat w(t')\right\}\right]dt' \approx \frac{i}{2\hbar}\int_0^{\infty}G(\tau)\,d\tau\left[\hat x,\left\{-\frac{\hat p}{m}\tau,\hat w\right\}\right] = \frac{\eta}{2i\hbar}\left[\hat x,\left\{\frac{\hat p}{m},\hat w\right\}\right]. \tag{7.221}$$

In the momentum representation, the momentum operator and the density matrix w are just c-numbers
and commute, so that, applying Eq. (218) to the product pw, we get

$$\left[\hat x,\left\{\frac{\hat p}{m},\hat w\right\}\right] = \left[\hat x, 2\frac{p}{m}w\right] = 2i\hbar\frac{\partial}{\partial p}\left(\frac{p}{m}w\right), \tag{7.222}$$

and may finally reduce the integro-differential equation Eq. (181) to a partial differential equation:

$$\frac{\partial w}{\partial t} = -\frac{\partial}{\partial p}\left(Fw\right) + \eta k_BT\frac{\partial^2 w}{\partial p^2}, \quad \text{with } F \equiv -\eta\frac{p}{m}. \tag{7.223}$$
This is the 1D form of the famous Fokker-Planck equation describing the classical statistics of
motion of a particle (in our particular case, of a free particle) in an environment providing a linear drag
characterized by the coefficient η; it belongs to the same drift-diffusion type as Eq. (214). The first, drift
term on its right-hand side describes the particle’s deceleration due to the drag force (137), F = –ηp/m =
–ηv, provided by the environment. The second, diffusion term on the right-hand side of Eq. (223)
describes the effect of fluctuations: the random walk of the particle’s momentum around its average
(drift-affected, and hence time-dependent) value. The walk obeys the law similar to Eq. (85), but with the
momentum-space diffusion coefficient

$$D_p = \eta k_BT. \tag{7.224}$$

This is the reciprocal-space version of the fundamental Einstein relation between the dissipation
(friction) and fluctuations, in this classical limit represented by their thermal energy scale kBT.75
Just for the reader’s reference, let me note that the Fokker-Planck equation (223) may be readily
generalized to the 3D motion of a particle under the effect of an additional external force,76 and in this
more general form is the basis for many important applications; however, due to its classical character,
its discussion is also left for the SM part of this series.77
75 Note that Eq. (224), as well as the original Einstein’s relation between the diffusion coefficient D in the direct
space and temperature, may be derived much more simply by other means – for example, from the Nyquist formula
(139). These issues are discussed in detail in SM Chapter 5.
76 Moreover, Eq. (223) may be generalized to the motion of a quantum particle in an additional periodic potential
U(r). In this case, due to the band structure of the energy spectrum (which was discussed in Secs. 2.7 and 3.4),
the coupling to the environment produces not only a continuous drift-diffusion of the probability density in the
space of the quasimomentum ħq but also quantum transitions between different energy bands at the same ħq –
see, e.g., K. Likharev and A. Zorin, J. Low Temp. Phys. 59, 347 (1985).
To summarize our discussion of the two alternative approaches to the analysis of quantum
systems interacting with a thermally-equilibrium environment, described in the last three sections, let
me emphasize again that they give different descriptions of the same phenomena, and are characterized
by the same two functions G(τ) and KF(τ). Namely, in the Heisenberg-Langevin approach, we describe
the system by operators that change (fluctuate) in time, even in the thermal equilibrium, while in the
density-matrix approach, the system is described by non-fluctuating probability functions, such as Wn(t)
or w(p, t), which are stationary in equilibrium. In the cases when a problem may be solved analytically
to the end by both methods (for example, for a harmonic oscillator), they give identical results.
7.1. Calculate the density matrix of a two-level system whose Hamiltonian is described, in a
certain basis, by the following matrix:

$$\mathrm{H} = \mathbf{c}\cdot\boldsymbol{\sigma} \equiv c_x\sigma_x + c_y\sigma_y + c_z\sigma_z,$$

where σk are the Pauli matrices and cj are c-numbers, in thermodynamic equilibrium at temperature T.
7.2. In the usual z-basis, spell out the density matrix of a spin-½ with gyromagnetic ratio γ:
(i) in the pure state with the spin definitely directed along the z-axis,
(ii) in the pure state with the spin definitely directed along the x-axis,
(iii) in thermal equilibrium at temperature T, in a magnetic field directed along the z-axis, and
(iv) in thermal equilibrium at temperature T, in a magnetic field directed along the x-axis.
7.3. Calculate the Wigner function of a harmonic oscillator:

(i) in thermodynamic equilibrium at temperature T,
(ii) in the ground state, and
(iii) in the Glauber state with dimensionless complex amplitude α.
Discuss the relation between the first of the results and the Gibbs distribution.
7.4. Calculate the Wigner function of a harmonic oscillator, with mass m and frequency ω0, in its
first excited stationary state (n = 1).
7.5.* A harmonic oscillator is weakly coupled to an Ohmic environment.

(i) Use the rotating-wave approximation to write the reduced equations of motion for the
Heisenberg operators of the complex amplitude of oscillations.
(ii) Calculate the expectation values of the correlators of the fluctuation force operators
participating in these equations, and express them via the average number ⟨n⟩ of thermally-induced
excitations in equilibrium, given by the second of Eqs. (26b).
77 See SM Secs. 5.6-5.7. For a more detailed discussion of quantum effects in dissipative systems with continuous
spectra see, e.g., either U. Weiss, Quantum Dissipative Systems, 2nd ed., World Scientific, 1999, or H.-P. Breuer
and F. Petruccione, The Theory of Open Quantum Systems, Oxford U. Press, 2007.
7.6. Calculate the average potential energy of long-range electrostatic interaction between two
similar isotropic, 3D harmonic oscillators, each with the electric dipole moment d = qs, where s is the
oscillator’s displacement from its equilibrium position, at arbitrary temperature T.
7.7. A semi-infinite string with mass $\mu$ per unit length is attached to a wall and stretched with a
constant force (tension) $\mathscr{T}$. Calculate the spectral density of the transverse force exerted on the wall, in
thermal equilibrium at temperature T.
7.8.* Calculate the low-frequency spectral density of small fluctuations of the voltage V across a
Josephson junction, shunted with an Ohmic conductor, and biased with a dc external current $I > I_c$.
Hint: You may use Eqs. (1.73)-(1.74) to describe the junction’s dynamics, and assume that the
shunting conductor remains in thermal equilibrium.
7.9. Prove that in the interaction picture of quantum dynamics, the expectation value of an
arbitrary observable A may be indeed calculated using Eq. (167).
7.10. Show that the quantum-mechanical Golden Rule (6.149) and the master equation (196)
give the same results for the rate of spontaneous quantum transitions $n' \to n$ in a system with a discrete
energy spectrum, weakly coupled to a low-temperature heat bath (with $k_{\rm B}T \ll \hbar\omega_{nn'}$).
Hint: You may start by establishing a relation between the function $\chi''(\omega_{nn'})$, which participates
in Eq. (196), and the density of states $\rho_n$, which participates in the Golden Rule formula, using the
particular case of sinusoidal classical oscillations in the system of interest.
7.11. For a harmonic oscillator with weak Ohmic dissipation, use Eqs. (208)-(209) to find the
time evolution of the expectation value $\langle E\rangle$ of the oscillator’s energy for an arbitrary initial state, and
compare the result with that following from the Heisenberg-Langevin approach.
7.12. Derive Eq. (219) in an alternative way, using an expression dual to Eq. (4.244).
7.13. A particle in a system of two coupled potential wells (see, e.g., Fig. 7.4 in the lecture notes)
is weakly coupled to an Ohmic environment.
(i) Derive equations describing the time evolution of the density matrix elements.
(ii) Solve these equations in the low-temperature limit, when the energy level splitting is much
larger than kBT, to calculate the time evolution of the probability of finding the particle in one of the
wells, after it had been placed there at t = 0.
7.14.* A spin-½ with gyromagnetic ratio $\gamma$ is placed into the magnetic field $\mathbf{B}(t) = \mathbf{B}_0 + \tilde{\mathbf{B}}(t)$
with an arbitrary but relatively small time-dependent component, and is also weakly coupled to a
dissipative environment. Derive differential equations describing the time evolution of the expectation
values of spin’s Cartesian components, at arbitrary temperature.
Chapter 8. Multiparticle Systems

This chapter provides a brief introduction to quantum mechanics of systems of similar particles, with
special attention to the case when they are indistinguishable. For such systems, theory predicts (and
experiment confirms) very specific effects even in the case of negligible explicit (“direct”) interactions
between the particles. These effects notably include the Bose-Einstein condensation of bosons and the
exchange interaction of fermions.
8.1. Distinguishable and indistinguishable particles

The importance of quantum systems of many similar particles is probably self-evident; just the
very fact that most atoms include several/many electrons is sufficient to attract our attention. There are
also important systems where the total number of electrons is much higher than in one atom; for
example, a cubic centimeter of a typical metal houses ~$10^{23}$ conduction electrons that cannot be
attributed to particular atoms, and have to be considered as common parts of the system as the whole.
Though quantum mechanics offers virtually no exact analytical results for systems of substantially
interacting particles,1 it reveals very important new quantum effects even in the simplest cases when
particles do not interact, at least explicitly (directly).
If non-interacting particles are either different from each other by their nature, or physically
similar but still distinguishable because of other reasons, everything is simple – at least, conceptually.
Then, as was already discussed in Sec. 6.7, a system of two particles, 1 and 2, each in a pure quantum
state, may be described by a state vector which is a direct product,
$$|\alpha\rangle = |\beta\rangle_1 \otimes |\beta'\rangle_2, \tag{8.1a}$$
of single-particle vectors, describing their states $\beta$ and $\beta'$ defined in different Hilbert spaces. (Below, I
will frequently use, for such direct product, the following convenient shorthand:
$$|\alpha\rangle = |\beta\beta'\rangle, \tag{8.1b}$$
in which the particle’s number is coded by the state symbol’s position.) Hence the permuted state
$$\hat{P}|\beta\beta'\rangle = |\beta'\beta\rangle \equiv |\beta'\rangle_1 \otimes |\beta\rangle_2, \tag{8.2}$$
where $\hat{P}$ is the permutation operator defined by Eq. (2), is clearly different from the initial one.
This operator may be also used for states of systems of identical particles. In physics, the last
term may be used to describe:
(i) the “really elementary” particles like electrons, which (at least at this stage of development of
physics) are considered as structure-less entities, and hence are all identical;
1 As was emphasized in Sec. 7.3, for such systems of similar particles the powerful methods discussed in the last
chapter, based on the separation of the whole Universe into a “system of our interest” and its “environment”,
typically do not work well – mostly because the quantum state of the “particle of interest” may be substantially
correlated (in particular, entangled) with those of similar particles forming its “environment” – see below.
(ii) any objects (e.g., hadrons or mesons) that may be considered as a system of “more
elementary” particles (e.g., quarks and gluons), but are placed in the same internal quantum state – most
simply, though not necessarily, in the ground state.2
It is important to note that identical particles still may be distinguishable – say by their clear
spatial separation. Such systems of similar but distinguishable particles (or subsystems) are broadly
discussed nowadays in the context of quantum computing and encryption – see Sec. 5 below. This is
why it is insufficient to use the term “identical particles” if we want to say that they are genuinely
indistinguishable, so below I will use the latter term, despite it being rather unpleasant grammatically.
It turns out that for a quantitative description of systems of indistinguishable particles we need to
use, instead of direct products of the type (1), linear combinations of such products, for example of
$|\beta\beta'\rangle$ and $|\beta'\beta\rangle$.3 To see this, let us discuss the properties of the permutation operator defined by Eq. (2).
Consider an observable A, and a system of eigenstates of its operator:
$$\hat{A}|a_j\rangle = A_j|a_j\rangle. \tag{8.3}$$
If the particles are indistinguishable, the observable’s expectation value should not be affected by their
permutation. Hence the operators $\hat{A}$ and $\hat{P}$ have to commute and share their eigenstates. This is why
the eigenstates of the operator $\hat{P}$ are so important: in particular, they include the eigenstates of the
Hamiltonian, i.e. the stationary states of a system of indistinguishable particles.
Let us have a look at the action of the permutation operator squared, on an elementary ket-vector
product:
$$\hat{P}^2|\beta\beta'\rangle = \hat{P}\left(\hat{P}|\beta\beta'\rangle\right) = \hat{P}|\beta'\beta\rangle = |\beta\beta'\rangle, \tag{8.4}$$
i.e. $\hat{P}^2$ brings the state back to its original form. Since any pure state of a two-particle system may be
represented as a linear combination of such products, this result does not depend on the state, and may
be represented as the following operator relation:
$$\hat{P}^2 = \hat{I}. \tag{8.5}$$
Now let us find the possible eigenvalues $P_j$ of the permutation operator. Acting by both sides of Eq. (5)
on any of eigenstates $|\alpha_j\rangle$ of the permutation operator, we get a very simple equation for its eigenvalues:
$$P_j^2 = 1, \tag{8.6}$$
2 Note that from this point of view, even complex atoms or molecules, in the same internal quantum state, may be
considered on the same footing as the “really elementary” particles. For example, the already mentioned recent
spectacular interference experiments by R. Lopes et al., which require particle identity, were carried out with
couples of 4He atoms in the same internal quantum state.
3 A very legitimate question is why, in this situation, we need to introduce the particles’ numbers to start with. A
partial answer is that in this approach, it is much simpler to derive (or guess) the system Hamiltonians from the
correspondence principle – see, e.g., Eq. (27) below. Later in this chapter, we will discuss an alternative approach
(the so-called “second quantization”), in which particle numbering is avoided. While that approach is more
logical, writing adequate Hamiltonians (which, in particular, would avoid spurious self-interaction of the
particles) within it is more challenging – see Sec. 3 below.
with two possible solutions:

$$P_j = \pm 1. \tag{8.7}$$
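Relations (5)-(7) are easy to cross-check numerically. The following is only a minimal sketch, assuming that each particle has just two basis states (labeled 0 and 1 here purely for illustration) and that the two-particle basis is ordered as |00⟩, |01⟩, |10⟩, |11⟩; the variable names are not taken from the lecture notes.

```python
import numpy as np

# Assumption: two single-particle basis states per particle (indices 0 and 1),
# two-particle basis ordered as |00>, |01>, |10>, |11>.
dim = 2
P = np.zeros((dim * dim, dim * dim))
for i in range(dim):
    for j in range(dim):
        # The permutation operator maps |i>_1 |j>_2 onto |j>_1 |i>_2.
        P[j * dim + i, i * dim + j] = 1.0

# Operator relation (5): P squared is the identity.
P_squared_is_identity = np.allclose(P @ P, np.eye(dim * dim))

# Eigenvalues of P (a real symmetric matrix), cf. Eq. (7).
eigenvalues = np.sort(np.linalg.eigvalsh(P))

print(P_squared_is_identity)
print(eigenvalues)
```

In this four-dimensional example, the eigenvalue –1 appears once and the eigenvalue +1 three times, which matches the count of one antisymmetric and three symmetric two-particle states.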
Let us find the eigenstates of the permutation operator in the simplest case when each of the
component particles can be only in one of two single-particle states – say, $\beta$ and $\beta'$. Evidently, none of
the simple products $|\beta\beta'\rangle$ and $|\beta'\beta\rangle$, taken alone, qualifies as an eigenstate – unless the states $\beta$ and
$\beta'$ are identical. This is why let us try their linear combination
$$|\alpha_j\rangle = a|\beta\beta'\rangle + b|\beta'\beta\rangle, \tag{8.8}$$
so that
$$\hat{P}|\alpha_j\rangle = P_j|\alpha_j\rangle = a|\beta'\beta\rangle + b|\beta\beta'\rangle. \tag{8.9}$$
For the case $P_j = +1$ we have to require the states (8) and (9) to be the same, so that a = b, giving the so-called symmetric eigenstate4
$$|\alpha_+\rangle = \frac{1}{\sqrt{2}}\left(|\beta\beta'\rangle + |\beta'\beta\rangle\right), \tag{8.10}$$
where the front coefficient guarantees the orthonormality of the two-particle state vectors, provided that
the single-particle vectors are orthonormal. Similarly, for $P_j = -1$ we get a = –b, i.e. an antisymmetric
eigenstate
$$|\alpha_-\rangle = \frac{1}{\sqrt{2}}\left(|\beta\beta'\rangle - |\beta'\beta\rangle\right). \tag{8.11}$$
These are the simplest (two-particle, two-state) examples of entangled states, defined as multiparticle
system states whose vectors cannot be factored into a direct product (1) of single-particle vectors.
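Whether a given two-particle vector can be factored into a direct product may be tested numerically. The sketch below uses a technique that is not discussed in the text above, namely the singular value decomposition of the coefficient matrix of the ket (its Schmidt decomposition): the ket is a direct product if and only if exactly one singular value is nonzero. The helper name `schmidt_rank` and the two-component representation of the single-particle states are assumptions made for illustration.

```python
import numpy as np

def schmidt_rank(ket, dim=2, tol=1e-12):
    """Count the nonzero singular values of the dim x dim coefficient matrix of a two-particle ket."""
    coefficients = np.asarray(ket, dtype=complex).reshape(dim, dim)
    singular_values = np.linalg.svd(coefficients, compute_uv=False)
    return int(np.sum(singular_values > tol))

# Assumption: two orthonormal single-particle states, represented as unit vectors.
b = np.array([1.0, 0.0])
b_prime = np.array([0.0, 1.0])

product_state = np.kron(b, b_prime)                                  # a direct product
alpha_plus = (np.kron(b, b_prime) + np.kron(b_prime, b)) / np.sqrt(2)   # symmetric eigenstate
alpha_minus = (np.kron(b, b_prime) - np.kron(b_prime, b)) / np.sqrt(2)  # antisymmetric eigenstate

ranks = [schmidt_rank(product_state), schmidt_rank(alpha_plus), schmidt_rank(alpha_minus)]
print(ranks)
```

A rank of 1 marks a factorable state, while any larger rank marks an entangled one; in this sketch, both the symmetric and the antisymmetric combinations have rank 2.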
So far, our math does not preclude either sign of Pj, in particular the possibility that the sign
would depend on the state (i.e. on the index j). Here, however, comes in a crucial fact: all
indistinguishable particles fall into two groups: 5
(i) bosons, particles with integer spin s, for whose states Pj = +1, and
(ii) fermions, particles with half-integer spin, with Pj = –1.
In the non-relativistic theory we are discussing now, this key fact should be considered as an
experimental one. (The relativistic quantum theory, whose elements will be discussed in Chapter 9,
offers proof that the half-integer-spin particles cannot be bosons and the integer-spin ones cannot be
fermions.) However, our discussion of spin in Sec. 5.7 enables the following handwaving interpretation
of the difference between these two particle species. In the free space, the permutation of particles 1 and
2 may be viewed as a result of their pair’s common rotation by angle $\varphi = \pm\pi$ about a properly selected z-axis. As we have seen in Sec. 5.7, at the rotation by this angle, the state vector $|\beta\rangle$ of a particle with a
definite quantum number $m_s$ acquires an extra factor $\exp\{\pm i m_s\pi\}$. As we know, the quantum number $m_s$
ranges from –s to +s, in unit steps. As a result, for bosons, with integer s, $m_s$ can take only integer
values, so that $\exp\{\pm i m_s\pi\} = \pm 1$, and the product of two such factors in the state product $|\beta\beta'\rangle$ is
equal to +1. On the contrary, for the fermions with their half-integer s, all $m_s$ are half-integer as well, so
that $\exp\{\pm i m_s\pi\} = \pm i$, and the product of two such factors in vector $|\beta\beta'\rangle$ is equal to $(\pm i)^2 = -1$.
4 As in many situations we have met earlier, the kets given by Eqs. (10) and (11) may be multiplied by $\exp\{i\varphi\}$
with an arbitrary real phase $\varphi$. However, until we discuss coherent superpositions of various states $\alpha$, there is no
good motivation for taking the phase different from 0; that would only clutter the notation.
5 Sometimes this fact is described as having two different “statistics”: the Bose-Einstein statistics of bosons and
Fermi-Dirac statistics of fermions, because their statistical distributions in thermal equilibrium are indeed
different – see, e.g., SM Sec. 2.8. However, this difference is actually deeper: we are dealing with two different
quantum mechanics.
The most impressive corollaries of Eqs. (10) and (11) are for the case when the partial states of
the two particles are the same: $\beta = \beta'$. The corresponding Bose state $\alpha_+$, defined by Eq. (10), is possible;
in particular, at sufficiently low temperatures, a set of non-interacting Bose particles condenses on the
ground state – the so-called Bose-Einstein condensate (“BEC”).6 The most fascinating feature of the
condensates is that their dynamics is governed by quantum mechanical laws, which may show up in the
behavior of their observables with virtually no quantum uncertainties7 – see, e.g., Eqs. (1.73)-(1.74).
On the other hand, if we take $\beta = \beta'$ in Eq. (11), we see that state $\alpha_-$ becomes the null-state, i.e.
cannot exist at all. This is the mathematical expression of the Pauli exclusion principle:8 two
indistinguishable fermions cannot be placed into the same quantum state. (As will be discussed below,
this is true for systems with more than two fermions as well.) Probably, the key importance of this
principle is self-evident: if it were not valid for electrons (that are fermions), all electrons of each atom
would condense in their ground (1s-like) state, and all the usual chemistry (and biochemistry, and
biology, including dear us!) would not exist. The Pauli principle makes fermions implicitly interacting
even if they do not interact directly, i.e. in the usual sense of this word.
8.2. Singlets, triplets, and the exchange interaction

Now let us discuss possible approaches to quantitative analyses of identical particles, starting
from a simple case of two spin-½ particles (say, electrons), whose explicit interaction with each other
and the external world does not involve spin. The description of such a system may be based on
factorable states with ket-vectors
$$|\alpha_-\rangle = |o_{12}\rangle \otimes |s_{12}\rangle, \tag{8.12}$$
with the orbital state vector $|o_{12}\rangle$ and the spin vector $|s_{12}\rangle$ belonging to different Hilbert spaces. It is
frequently convenient to use the coordinate representation of such a state, sometimes called the spinor:
$$\langle\mathbf{r}_1, \mathbf{r}_2|\alpha_-\rangle = \langle\mathbf{r}_1, \mathbf{r}_2|o_{12}\rangle \otimes |s_{12}\rangle \equiv \psi(\mathbf{r}_1, \mathbf{r}_2)|s_{12}\rangle. \tag{8.13}$$
Since the spin-½ particles are fermions, the particle permutation has to change the sign:
$$\hat{P}\,\psi(\mathbf{r}_1, \mathbf{r}_2)|s_{12}\rangle \equiv \psi(\mathbf{r}_2, \mathbf{r}_1)|s_{21}\rangle = -\psi(\mathbf{r}_1, \mathbf{r}_2)|s_{12}\rangle, \tag{8.14}$$
of either the orbital factor or the spin factor.
6 For a quantitative discussion of the Bose-Einstein condensation see, e.g., SM Sec. 3.4. Examples of such
condensates include superfluids like helium, Cooper-pair condensates in superconductors, and BECs of weakly
interacting atoms.
7 For example, for a coherent condensate of N >> 1 particles, Heisenberg’s uncertainty relation takes the form
$\delta x\,\delta p = \delta x\,\delta(Nmv) \ge \hbar/2$, so that its coordinate x and velocity v may be measured simultaneously with much higher
precision than those of a single particle.
8 It was first formulated for electrons by Wolfgang Pauli in 1925, on the background of less general rules
suggested by Gilbert Lewis (1916), Irving Langmuir (1919), Niels Bohr (1922), and Edmund Stoner (1924) for
the explanation of experimental spectroscopic data.

In particular, in the case of symmetric orbital factor,
$$\psi(\mathbf{r}_2, \mathbf{r}_1) = \psi(\mathbf{r}_1, \mathbf{r}_2), \tag{8.15}$$
the spin factor has to obey the relation
$$|s_{21}\rangle = -|s_{12}\rangle. \tag{8.16}$$
Let us use the ordinary z-basis (where z, in the absence of an external magnetic field, is an arbitrary
spatial axis) for both spins. In this basis, the ket-vector of any two spins-½ may be represented as a
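linear combination of the following four basis vectors:
$$|\uparrow\uparrow\rangle, \quad |\downarrow\downarrow\rangle, \quad |\uparrow\downarrow\rangle, \quad \text{and} \quad |\downarrow\uparrow\rangle. \tag{8.17}$$
The first two kets evidently do not satisfy Eq. (16), and cannot participate in the state. Applying to the
remaining kets the same argumentation as has resulted in Eq. (11), we get
$$|s_{12}\rangle = |s_-\rangle \equiv \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right). \tag{8.18}$$
Such an orbital-symmetric and spin-antisymmetric state is called the singlet.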

The origin of this term becomes clear from the analysis of the opposite (orbital-antisymmetric
and spin-symmetric) case:
$$\psi(\mathbf{r}_2, \mathbf{r}_1) = -\psi(\mathbf{r}_1, \mathbf{r}_2), \quad |s_{12}\rangle = |s_{21}\rangle. \tag{8.19}$$
For the composition of such a symmetric spin state, the first two kets of Eq. (17) are completely
acceptable (with arbitrary weights), and so is an entangled spin state that is the symmetric combination
of the two last kets, similar to Eq. (10):
$$|s_+\rangle \equiv \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right), \tag{8.20}$$
so that the general spin state is a triplet:
$$|s_{12}\rangle = c_+|\uparrow\uparrow\rangle + c_-|\downarrow\downarrow\rangle + c_0\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right). \tag{8.21}$$
Note that any such state (with any values of the coefficients c satisfying the normalization condition),
corresponds to the same orbital wavefunction and hence the same energy. However, each of these three
states has a specific value of the z-component of the net spin – evidently equal to, respectively, $+\hbar$, $-\hbar$,
and 0. Because of this, even a small external magnetic field lifts their degeneracy, splitting the energy
level in three; hence the term “triplet”.
In the particular case when the particles do not interact at all, for example
$$\hat{H} = \hat{h}_1 + \hat{h}_2, \quad \hat{h}_k = \frac{\hat{p}_k^2}{2m} + \hat{u}(\mathbf{r}_k), \quad \text{with } k = 1, 2, \tag{8.22}$$
the two-particle Schrödinger equation for the symmetrical orbital wavefunction (15) is obviously
satisfied by the direct products,
$$\psi(\mathbf{r}_1, \mathbf{r}_2) = \psi_n(\mathbf{r}_1)\psi_{n'}(\mathbf{r}_2), \tag{8.23}$$
of single-particle eigenfunctions, with arbitrary sets n, n’ of quantum numbers. For the particular but
very important case n = n’, this means that the eigenenergy of the (only acceptable) singlet state,
$$\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right)\psi_n(\mathbf{r}_1)\psi_n(\mathbf{r}_2), \tag{8.24}$$
is just $2\varepsilon_n$, where $\varepsilon_n$ is the single-particle energy level.9 In particular, for the ground state of the system,
such singlet spin state gives the lowest energy $E_g = 2\varepsilon_g$, while any triplet spin state (19) would require
one of the particles to be in a different orbital state, i.e. in a state of higher energy, so that the total
energy of the system would be also higher.
Now moving to the systems in which two indistinguishable spin-½ particles do interact, let us
consider, as their simplest but important10 example, the lower energy states of a neutral atom11 of helium
– more exactly, 4He. Such an atom consists of a nucleus with two protons and two neutrons, with the
total electric charge q = +2e, and two electrons “rotating” about the nucleus. Neglecting the small
relativistic effects that were discussed in Sec. 6.3, the Hamiltonian describing the electron motion may
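be expressed as
$$\hat{H} = \hat{h}_1 + \hat{h}_2 + \hat{U}_{int}, \quad \hat{h}_k = \frac{\hat{p}_k^2}{2m} - \frac{2e^2}{4\pi\varepsilon_0 r_k}, \quad \hat{U}_{int} = \frac{e^2}{4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{8.25}$$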
As with most problems of multiparticle quantum mechanics, the eigenvalue/eigenstate problem
for this Hamiltonian does not have an exact analytical solution, so let us carry out its approximate
analysis considering the electron-electron interaction Uint as a perturbation. As was discussed in Chapter
6, we have to start with the “0th-order” approximation in which the perturbation is ignored, so that the
Hamiltonian is reduced to the sum (22). In this approximation, the ground state of the atom is the singlet
(24), with the orbital factor
$$\psi_g(\mathbf{r}_1, \mathbf{r}_2) = \psi_{100}(\mathbf{r}_1)\psi_{100}(\mathbf{r}_2), \tag{8.26}$$
and energy $2\varepsilon_g$. Here each factor $\psi_{100}(\mathbf{r})$ is the single-particle wavefunction of the ground (1s) state of
the hydrogen-like atom with Z = 2, with quantum numbers n = 1, l = 0, and m = 0 – hence the
wavefunctions’ indices. According to Eqs. (3.174) and (3.208),
$$\psi_{100}(\mathbf{r}) = Y_0^0(\theta, \varphi)R_{1,0}(r) = \frac{1}{\sqrt{4\pi}}\frac{2}{r_0^{3/2}}e^{-r/r_0}, \quad \text{with } r_0 = \frac{r_{\rm B}}{Z} = \frac{r_{\rm B}}{2}, \tag{8.27}$$
and according to Eqs. (3.191) and (3.201), in this approximation the total ground state energy is
$$E_g^{(0)} = 2\varepsilon_g^{(0)} = 2\left(-\frac{\varepsilon_0}{2n^2}\right)_{n=1,\,Z=2} = 2\left(-\frac{Z^2E_{\rm H}}{2}\right)_{Z=2} = -4E_{\rm H} \approx -109\ \text{eV}. \tag{8.28}$$
This is still somewhat far (though not terribly far!) from the experimental value $E_g \approx -78.8$ eV – see the
bottom level in Fig. 1a.
9 In this chapter, I try to use lower-case letters for all single-particle observables (in particular, $\varepsilon$ for their
energies), in order to distinguish them as clearly as possible from the system’s observables (including the total
energy E of the system), which are typeset in capital letters.
10 Indeed, helium makes up more than 20% of all “ordinary” matter of our Universe.
11 Note that the positive ion He$^{+1}$ of this atom, with just one electron, is fully described by the hydrogen-like atom
theory with Z = 2, whose ground-state energy, according to Eq. (3.191), is $-Z^2E_{\rm H}/2 = -2E_{\rm H} \approx -54.4$ eV.
[Figure: panel (a) plots the level energies ΔE (eV) against the orbital quantum number l for “parahelium” and “orthohelium”, with level labels 1s (ground state), 2s, 2p, 3s, 3p, 3d; panel (b) shows an excited level $\varepsilon_{100} + \varepsilon_{nlm}$ shifted by $E_{dir}$ and split by $\pm E_{ex}$ into the singlet (“parahelium”) and triplet (“orthohelium”) states.]
Fig. 8.1. The lower energy levels of a helium atom: (a) experimental data and (b) a schematic structure
of an excited state in the first order of the perturbation theory. On panel (a), all energies are referred to
that ($-2E_{\rm H} \approx -54.4$ eV) of the ground state of the positive ion He$^{+1}$, so that their magnitudes are the
(readily measurable) energies of the atom’s single ionization starting from the corresponding state of the
neutral atom. Note that the “spin direction” nomenclature on panel (b) is rather crude: it does not reflect
the difference between the entangled states $s_+$ and $s_-$.
Making a minor (but very useful) detour from our main topic, let us note that we can get a much
better agreement with experiment by calculating the electron interaction energy in the 1st order of the
perturbation theory. Indeed, in application to our system, Eq. (6.14) reads
$$E_g^{(1)} = \langle g|\hat{U}_{int}|g\rangle = \int d^3r_1\int d^3r_2\,\psi_g^*(\mathbf{r}_1, \mathbf{r}_2)\,U_{int}(\mathbf{r}_1, \mathbf{r}_2)\,\psi_g(\mathbf{r}_1, \mathbf{r}_2). \tag{8.29}$$
Plugging in Eqs. (25)-(27), we get
$$E_g^{(1)} = \left(\frac{1}{4\pi}\frac{4}{r_0^3}\right)^2\int d^3r_1\int d^3r_2\,\frac{e^2}{4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|}\exp\left\{-\frac{2(r_1 + r_2)}{r_0}\right\}. \tag{8.30}$$
As may be readily evaluated analytically (this exercise is left for the reader), this expression equals
$(5/4)E_{\rm H}$, so that the corrected ground state energy,
$$E_g \approx E_g^{(0)} + E_g^{(1)} = \left(-4 + \frac{5}{4}\right)E_{\rm H} \approx -74.8\ \text{eV}, \tag{8.31}$$
is much closer to experiment.
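The value (5/4)E_H of the integral (30) may also be checked by a simple numerical quadrature. The sketch below works in units of E_H and r_B, and relies on one ingredient that is not derived in the text above: the standard electrostatic potential created by a spherically symmetric 1s charge cloud, proportional to $[1 - (1 + Zr)e^{-2Zr}]/r$ in these units. This formula, the grid parameters, the value E_H ≈ 27.211 eV, and all variable names are assumptions of this illustration.

```python
import numpy as np

E_H_eV = 27.211   # assumed value of the Hartree energy, in eV
Z = 2.0           # nuclear charge of helium, in units of e

# Radial grid in units of r_B; the 1s density is negligible well before r = 15.
r = np.linspace(1e-6, 15.0, 300_001)
dr = r[1] - r[0]

# Probability density |psi_100|^2 of one electron, normalized to 1.
density = (Z**3 / np.pi) * np.exp(-2.0 * Z * r)

# Assumed closed form: interaction energy (in E_H) of a point electron at
# distance r with the spherically symmetric cloud of the other 1s electron.
potential = (1.0 - (1.0 + Z * r) * np.exp(-2.0 * Z * r)) / r

# Trapezoidal rule for the remaining one-dimensional radial integral.
integrand = 4.0 * np.pi * r**2 * density * potential
E1 = np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dr

E_g_eV = (-4.0 + E1) * E_H_eV   # corrected ground-state energy, cf. Eq. (31)
print(E1, E_g_eV)
```

With these assumptions, `E1` comes out close to 1.25 (in units of E_H), and `E_g_eV` close to –74.8 eV.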
There is still room here for a ready improvement, using the variational method discussed in Sec.
2.9. For our particular case of the 4He atom, we may try to use, as the trial state, the orbital
wavefunction given by Eqs. (26)-(27), but with the atomic number Z considered as an adjustable
parameter $Z_{ef} < Z = 2$ rather than a fixed number. The physics behind this approach is that the electric
charge density $\rho(\mathbf{r}) = -e|\psi(\mathbf{r})|^2$ of each electron forms a negatively charged “cloud” that reduces the
effective charge of the nucleus, as seen by the other electron, to $Z_{ef}e$, with some $Z_{ef} < 2$. As a result, the
single-particle wavefunction spreads further in space (with the scale $r_0 = r_{\rm B}/Z_{ef} > r_{\rm B}/Z$), while keeping its
functional form (27) nearly intact. Since the kinetic energy T in the system’s Hamiltonian (25) is
proportional to $r_0^{-2} \propto Z_{ef}^2$, while the potential energy is proportional to $r_0^{-1} \propto Z_{ef}^1$, we can write
$$E_g(Z_{ef}) = \left(\frac{Z_{ef}}{2}\right)^2\langle T_g\rangle_{Z=2} + \frac{Z_{ef}}{2}\langle U_g\rangle_{Z=2}. \tag{8.32}$$
Now we can use the fact that according to Eq. (3.212), for any stationary state of a hydrogen-like
atom (just as for the classical circular motion in the Coulomb potential), $\langle U\rangle = 2E$, and hence $\langle T\rangle = E - \langle U\rangle = -E$. Using Eq. (30), and adding the correction (31) to the potential energy, we get
$$E_g(Z_{ef}) = \left[4\left(\frac{Z_{ef}}{2}\right)^2 + \left(-8 + \frac{5}{4}\right)\frac{Z_{ef}}{2}\right]E_{\rm H}. \tag{8.33}$$
This expression allows an elementary calculation of the optimal value of $Z_{ef}$, and the corresponding
minimum of the function $E_g(Z_{ef})$:
$$(Z_{ef})_{opt} = 2\left(1 - \frac{5}{32}\right) = 1.6875, \quad (E_g)_{min} \approx -2.85E_{\rm H} \approx -77.5\ \text{eV}. \tag{8.34}$$
Given the trial state’s crudeness, this number is in surprisingly good agreement with the experimental
value cited above, with a difference of the order of 1%.
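The minimization leading to Eq. (34) can be reproduced with a brute-force scan of the trial energy (33). This is only a sketch: the grid limits and step, the value E_H ≈ 27.211 eV, and the function name `trial_energy` are assumptions made for illustration.

```python
import numpy as np

E_H_eV = 27.211   # assumed value of the Hartree energy, in eV

def trial_energy(z_ef):
    """Trial ground-state energy of Eq. (33), in units of E_H."""
    x = z_ef / 2.0
    return 4.0 * x**2 + (-8.0 + 5.0 / 4.0) * x

# Scan the effective charge between 1 and 2 and pick the lowest energy.
z_grid = np.linspace(1.0, 2.0, 100_001)
energies = trial_energy(z_grid)
k = int(np.argmin(energies))
z_opt, e_min = z_grid[k], energies[k]

print(z_opt, e_min, e_min * E_H_eV)
```

The scan should land on $Z_{ef} \approx 1.6875$ and $E_g \approx -2.85E_{\rm H}$, in agreement with the closed-form result; since the trial energy is a quadratic function of $Z_{ef}$, the same numbers follow from the position of the parabola's vertex.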
Now let us return to the main topic of this section – the effects of particle (in this case, electron)
indistinguishability. As we have just seen, the ground-level energy of the helium atom is not affected
directly by this fact, but the situation is different for its excited states – even the lowest ones. The
reasonably good precision of the perturbation theory, which we have seen for the ground state, tells us
that we can base our analysis of wavefunctions ($\psi_e$) of the lowest excited state orbitals, on products like
$\psi_{100}(\mathbf{r}_k)\psi_{nlm}(\mathbf{r}_{k'})$, with n > 1. To satisfy the fermion permutation rule, $P_j = -1$, we have to take the orbital
factor of the state in either the symmetric or the antisymmetric form:
$$\psi_e(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2}}\left[\psi_{100}(\mathbf{r}_1)\psi_{nlm}(\mathbf{r}_2) \pm \psi_{nlm}(\mathbf{r}_1)\psi_{100}(\mathbf{r}_2)\right], \tag{8.35}$$
with the proper total permutation asymmetry provided by the corresponding spin factor (18) or (21), so
that the upper/lower sign in Eq. (35) corresponds to the singlet/triplet spin state. Let us calculate the
expectation values of the total energy of the system in the first order of the perturbation theory. Plugging
Eq. (35) into the 0th-order expression
$$E_e^{(0)} = \int d^3r_1\int d^3r_2\,\psi_e^*(\mathbf{r}_1, \mathbf{r}_2)\left(\hat{h}_1 + \hat{h}_2\right)\psi_e(\mathbf{r}_1, \mathbf{r}_2), \tag{8.36}$$
we get two groups of similar terms that differ only by the particle index. We can merge the terms of
each pair by changing the notation as ($\mathbf{r}_1 \to \mathbf{r}$, $\mathbf{r}_2 \to \mathbf{r}'$) in one of them, and ($\mathbf{r}_1 \to \mathbf{r}'$, $\mathbf{r}_2 \to \mathbf{r}$) in the
counterpart term. Using Eq. (25), and the mutual orthogonality of the wavefunctions $\psi_{100}(\mathbf{r})$ and $\psi_{nlm}(\mathbf{r})$,
we get the following result:
$$E_e^{(0)} = \int\psi_{100}^*(\mathbf{r})\left(-\frac{\hbar^2\nabla_r^2}{2m} - \frac{2e^2}{4\pi\varepsilon_0 r}\right)\psi_{100}(\mathbf{r})\,d^3r + \int\psi_{nlm}^*(\mathbf{r}')\left(-\frac{\hbar^2\nabla_{r'}^2}{2m} - \frac{2e^2}{4\pi\varepsilon_0 r'}\right)\psi_{nlm}(\mathbf{r}')\,d^3r' = \varepsilon_{100} + \varepsilon_{nlm}, \quad \text{with } n > 1. \tag{8.37}$$
It may be interpreted as the sum of eigenenergies of two separate single particles, one in the ground state
100, and another in the excited state nlm – although actually the electron states are entangled. Thus, in
the 0th order of the perturbation theory, the electron entanglement does not affect their energy.
However, the potential energy of the system also includes the interaction term $U_{int}$, which does
not allow such separation. Indeed, in the 1st approximation of the perturbation theory, the total energy $E_e$
of the system may be expressed as $\varepsilon_{100} + \varepsilon_{nlm} + E_{int}^{(1)}$, with
$$E_{int}^{(1)} = \langle U_{int}\rangle = \int d^3r_1\int d^3r_2\,\psi_e^*(\mathbf{r}_1, \mathbf{r}_2)\,U_{int}(\mathbf{r}_1, \mathbf{r}_2)\,\psi_e(\mathbf{r}_1, \mathbf{r}_2). \tag{8.38}$$
Plugging Eq. (35) into this result, using the symmetry of the function $U_{int}$ with respect to the particle
number permutation, and the same particle coordinate re-numbering as above, we get
$$E_{int}^{(1)} = E_{dir} \pm E_{ex}, \tag{8.39}$$
with the following, deceptively similar expressions for the two components of this sum/difference:
$$E_{dir} = \int d^3r\int d^3r'\,\psi_{100}^*(\mathbf{r})\,\psi_{nlm}^*(\mathbf{r}')\,U_{int}(\mathbf{r}, \mathbf{r}')\,\psi_{100}(\mathbf{r})\,\psi_{nlm}(\mathbf{r}'), \tag{8.40}$$
$$E_{ex} = \int d^3r\int d^3r'\,\psi_{100}^*(\mathbf{r})\,\psi_{nlm}^*(\mathbf{r}')\,U_{int}(\mathbf{r}, \mathbf{r}')\,\psi_{nlm}(\mathbf{r})\,\psi_{100}(\mathbf{r}'). \tag{8.41}$$
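Since the single-particle orbitals can be always made real, both components are positive – or at
least non-negative. However, their physics and magnitude are different. The integral (40), called the
direct interaction energy, allows a simple semi-classical interpretation as the Coulomb energy of
interacting electrons, each distributed in space with the electric charge density $\rho(\mathbf{r}) = -e\psi^*(\mathbf{r})\psi(\mathbf{r})$:12
$$E_{dir} = \int d^3r\int d^3r'\,\frac{\rho_{100}(\mathbf{r})\,\rho_{nlm}(\mathbf{r}')}{4\pi\varepsilon_0|\mathbf{r} - \mathbf{r}'|} \equiv \int\rho_{100}(\mathbf{r})\,\phi_{nlm}(\mathbf{r})\,d^3r \equiv \int\rho_{nlm}(\mathbf{r})\,\phi_{100}(\mathbf{r})\,d^3r, \tag{8.42}$$
where $\phi(\mathbf{r})$ are the electrostatic potentials created by the electron “charge clouds”:13
$$\phi_{100}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int d^3r'\,\frac{\rho_{100}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}, \quad \phi_{nlm}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int d^3r'\,\frac{\rho_{nlm}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}. \tag{8.43}$$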
However, the integral (41), called the exchange interaction energy, evades a classical
interpretation, and (as it is clear from its derivation) is the direct corollary of electrons’
indistinguishability. The magnitude of Eex is also very much different from Edir because the function
under the integral (41) disappears in the regions where the single-particle wavefunctions $\psi_{100}(\mathbf{r})$ and
$\psi_{nlm}(\mathbf{r})$ do not overlap. This is in full agreement with the discussion in Sec. 1: if two particles are
identical but well separated, i.e. their wavefunctions do not overlap, the exchange interaction disappears,

13 Note that the result for Edir correctly reflects the basic fact that a charged particle does not interact with itself,
even if its wavefunction is quantum-mechanically spread over a finite space volume. Unfortunately, this is not
true for some popular approximate theories of multiparticle systems – see Sec. 4 below.
i.e. measurable effects of particle indistinguishability vanish. (In contrast, the integral (40) decreases
with the growing electron separation only slowly, due to the long-range Coulomb interaction.)
Figure 1b shows the structure of an excited energy level, with certain quantum numbers n > 1, l,
and m, given by Eqs. (39)-(41). The upper, so-called parahelium14 level, with the energy
$$E_{para} = \left(\varepsilon_{100} + \varepsilon_{nlm}\right) + E_{dir} + E_{ex} > \varepsilon_{100} + \varepsilon_{nlm}, \tag{8.44}$$
corresponds to the symmetric orbital state and hence to the singlet spin state (18), while the lower,
orthohelium level, with
$$E_{orth} = \left(\varepsilon_{100} + \varepsilon_{nlm}\right) + E_{dir} - E_{ex} < E_{para}, \tag{8.45}$$
corresponds to the degenerate triplet spin state (21).
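To get a feeling for the relative magnitudes of the two integrals, Eqs. (40) and (41) may be evaluated numerically for the simplest excited level, n = 2, l = 0 (the 1s2s configuration). The sketch below is an illustration under strong assumptions: it uses unscreened hydrogen-like orbitals with Z = 2, works in units of E_H and r_B, and exploits the spherical symmetry of s-states to reduce each double integral to radial quadratures. The helper names and grid parameters are hypothetical; since this is only the first order of the perturbation theory, close quantitative agreement with the experimental levels should not be expected.

```python
import numpy as np

def cumulative_trapezoid(y, dr):
    """Running trapezoidal integral of y on a uniform grid, starting from 0."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * dr)))

def coulomb_integral(f, g, r):
    """Double 3D integral of f(r) g(r') / |r - r'| for spherically symmetric f and g."""
    dr = r[1] - r[0]
    charge_inside = cumulative_trapezoid(4.0 * np.pi * r**2 * g, dr)
    outer = cumulative_trapezoid(4.0 * np.pi * r * g, dr)
    potential = outer[-1] - outer                  # shells outside radius r
    potential[1:] += charge_inside[1:] / r[1:]     # charge inside radius r
    return cumulative_trapezoid(4.0 * np.pi * r**2 * f * potential, dr)[-1]

Z = 2.0
r = np.linspace(0.0, 30.0, 300_001)   # radius in units of r_B

# Hydrogen-like 1s and 2s orbitals (real, normalized), lengths in r_B.
psi_100 = np.sqrt(Z**3 / np.pi) * np.exp(-Z * r)
psi_200 = np.sqrt(Z**3 / (8.0 * np.pi)) * (1.0 - Z * r / 2.0) * np.exp(-Z * r / 2.0)

E_dir = coulomb_integral(psi_100**2, psi_200**2, r)                  # Eq. (40)
E_ex = coulomb_integral(psi_100 * psi_200, psi_100 * psi_200, r)     # Eq. (41)

eps_sum = -Z**2 / 2.0 - Z**2 / 8.0    # eps_100 + eps_200, in units of E_H
E_para = eps_sum + E_dir + E_ex       # Eq. (44)
E_orth = eps_sum + E_dir - E_ex       # Eq. (45)
print(E_dir, E_ex, E_para, E_orth)
```

Within these assumptions, the exchange energy is roughly ten times smaller than the direct one, and the triplet (orthohelium) level lies below the singlet (parahelium) level by $2E_{ex}$.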

This degeneracy may be lifted by an external magnetic field, whose effect on the electron spins15
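is described by the following evident generalization of the Pauli Hamiltonian (4.163),
$$\hat{H}_{field} = -\gamma\,\hat{\mathbf{s}}_1\cdot\mathbf{B} - \gamma\,\hat{\mathbf{s}}_2\cdot\mathbf{B} \equiv -\gamma\,\hat{\mathbf{S}}\cdot\mathbf{B}, \quad \text{with } \gamma = \gamma_e \equiv -\frac{e}{m_e} \equiv -\frac{2\mu_{\rm B}}{\hbar}, \tag{8.46}$$
where
$$\hat{\mathbf{S}} \equiv \hat{\mathbf{s}}_1 + \hat{\mathbf{s}}_2, \tag{8.47}$$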
is the operator of the (vector) sum of the system of two spins.16 To analyze this effect, we need first to
make one more detour, to address the general issue of spin addition. The main rule17 here is that in a full
analogy with the net spin of a single particle, defined by Eq. (5.170), the net spin operator (47) of any
system of two spins, and its component $\hat{S}_z$ along the (arbitrarily selected) z-axis, obey the same
commutation relations (5.168) as the component operators, and hence have the properties similar to
those expressed by Eqs. (5.169) and (5.175):
$$\hat{S}^2|S, M_S\rangle = \hbar^2S(S + 1)|S, M_S\rangle, \quad \hat{S}_z|S, M_S\rangle = \hbar M_S|S, M_S\rangle, \quad \text{with } -S \le M_S \le +S, \tag{8.48}$$
where the ket vectors correspond to the coupled basis of joint eigenstates of the operators of $S^2$ and $S_z$
(but not necessarily all component operators – see again the Venn diagram shown in Fig. 5.12 and its discussion,
with the replacements $S, L \to s_{1,2}$ and $J \to S$). Repeating the discussion of Sec. 5.7 with these
replacements, we see that in both coupled and uncoupled bases, the net magnetic number MS is simply
expressed via those of the components
14 This terminology reflects the historic fact that the observation of two different hydrogen-like spectra,
corresponding to the opposite signs in Eq. (39), was first taken as evidence for two different species of 4He, which
were called, respectively, the “orthohelium” and the “parahelium”.
15 As we know from Sec. 6.4, the field also affects the orbital motion of the electrons, so that the simple analysis
based on Eq. (46) is strictly valid only for the s excited state (l = 0, and hence m = 0). However, the orbital effects
of a weak magnetic field do not affect the triplet level splitting we are analyzing now.
16 Note that similarly to Eqs. (22) and (25), here the uppercase notation of the component spins is replaced with
their lowercase notation, to avoid any possibility of their confusion with the total spin of the system.
17 Since we already know that the spin of a particle is physically nothing more than a (specific) part of its angular
momentum, the similarity of the properties (48) of the sum (47) of spins of different particles to those of the sum
(5.170) of different spin components of the same particle it very natural, but still has to be considered as a new
fact – confirmed by a vast body of experimental data.
$$M_S = (m_s)_1 + (m_s)_2. \tag{8.49}$$
However, the net spin quantum number S (in contrast to the Nature-given spins s1,2 of its elementary
components) is not universally definite, and we may immediately say only that it has to obey the
following analog of the relation |l – s| ≤ j ≤ (l + s) discussed in Sec. 5.7:
$$|s_1 - s_2| \le S \le s_1 + s_2. \tag{8.50}$$
What exactly S is (within these limits), depends on the spin state of the system.
For the simplest case of two spin-½ components, each with s = ½ and ms = ±½, Eq. (49) gives
three possible values of MS, equal to 0 and ±1, while Eq. (50) limits the possible values of S to just either
0 or 1. Using the last of Eqs. (48), we see that the possible combinations of the quantum numbers are
$$\begin{cases} S = 0,\\ M_S = 0, \end{cases} \quad \text{and} \quad \begin{cases} S = 1,\\ M_S = 0, \pm 1. \end{cases} \tag{8.51}$$
It is virtually evident that the singlet spin state s– belongs to the first class, while the simple (separable)
triplet states ↑↑ and ↓↓ belong to the second class, with MS = +1 and MS = –1, respectively. However,
for the entangled triplet state s+, evidently with MS = 0, the value of S is less obvious. Perhaps the easiest
way to recover it18 is to use the “rectangular diagram”, similar to that shown in Fig. 5.14, but redrawn for
our case of two spins, i.e., with the replacements ml → (ms)1 = ±½, ms → (ms)2 = ±½ – see Fig. 2.
[Fig. 8.2. The “rectangular diagram” showing the relation between the uncoupled-representation states (dots)
and the coupled-representation states (straight lines) of a system of two spins-½ – cf. Fig. 5.14.
Horizontal axis: (ms)1; vertical axis: (ms)2; each takes the values ±½.]
Just as at the addition of various angular momenta of a single particle, the top-right and bottom-left
corners of this diagram correspond to the factorable triplet states ↑↑ and ↓↓, which participate in
both the uncoupled-representation and coupled-representation bases, and have the largest value of S, i.e.
1. However, the entangled states s±, which are linear combinations of the uncoupled-representation
states ↑↓ and ↓↑, cannot have the same value of S, so that for the triplet state s+, S has to take the value
different from that (0) of the singlet state, i.e. 1. With that, the first of Eqs. (48) gives the following
expectation values for the square of the net spin operator:
$$\langle S^2 \rangle = \begin{cases} 2\hbar^2, & \text{for each triplet state},\\ 0, & \text{for the singlet state}. \end{cases} \tag{8.52}$$
18 Another, a bit longer but perhaps more prudent way is to directly calculate the expectation values of Ŝ² for
the states s±, and then find S by comparing the results with the first of Eqs. (48); it is highly recommended to the
reader as a useful exercise.
Note that for the entangled triplet state s+, whose ket-vector (20) is a linear superposition of two kets of
states with opposite spins, this result is highly counter-intuitive, and shows how careful we should be
interpreting entangled quantum states. (As will be discussed in Chapter 10, the entanglement brings
even more surprises for quantum measurements.)
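The direct calculation recommended in footnote 18 may also be sketched numerically. The block below is only an illustration, under the assumptions that ħ = 1, that the uncoupled basis is ordered as ↑↑, ↑↓, ↓↑, ↓↓, and that the helper names (kron, total_spin_squared, expectation) are hypothetical, introduced just for this check:

```python
# Two spins-1/2 in units of hbar = 1; uncoupled basis order: uu, ud, du, dd.
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
ID = [[1, 0], [0, 1]]

def kron(a, b):
    """Direct (Kronecker) product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def total_spin_squared():
    """Matrix of S^2 = (s1 + s2)^2, summed over the x, y, and z components."""
    s2 = [[0] * 4 for _ in range(4)]
    for s in (SX, SY, SZ):
        comp = [[x + y for x, y in zip(r1, r2)]
                for r1, r2 in zip(kron(s, ID), kron(ID, s))]
        square = matmul(comp, comp)
        s2 = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(s2, square)]
    return s2

def expectation(op, ket):
    """Real part of <ket| op |ket> for a normalized ket."""
    n = len(ket)
    value = sum(complex(ket[i]).conjugate() * op[i][j] * ket[j]
                for i in range(n) for j in range(n))
    return value.real

R = 2 ** -0.5
STATES = {
    "up-up": [1, 0, 0, 0],
    "down-down": [0, 0, 0, 1],
    "s+": [0, R, R, 0],     # entangled triplet state
    "s-": [0, R, -R, 0],    # singlet state
}
S2 = total_spin_squared()
RESULTS = {name: expectation(S2, ket) for name, ket in STATES.items()}
for name, value in RESULTS.items():
    print(f"<S^2> in state {name}: {value:.3f}")
```

Within rounding errors, the three triplet states should give 2 (i.e. 2ħ²), and the singlet state 0, in agreement with Eq. (52).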
Now we may return to the particular issue of the magnetic field effect on the triplet state of the
⁴He atom. Directing the z-axis along the field, we may reduce Eq. (46) to
$$\hat{H}_{\rm field} = -\gamma_e \hat{S}_z \mathcal{B} \equiv 2\mu_{\rm B} \mathcal{B}\, \frac{\hat{S}_z}{\hbar}. \tag{8.53}$$
Since all three triplet states (21) are eigenstates, in particular, of the operator Ŝz, and hence of the
Hamiltonian (53), we may use the second of Eqs. (48) to calculate their energy change simply as
$$\Delta E_{\rm field} = 2\mu_{\rm B} \mathcal{B} M_S = 2\mu_{\rm B} \mathcal{B} \times \begin{cases} +1, & \text{for the factorable triplet state } \uparrow\uparrow,\\ \phantom{+}0, & \text{for the entangled triplet state } s_+,\\ -1, & \text{for the factorable triplet state } \downarrow\downarrow. \end{cases} \tag{8.54}$$
This splitting of the “orthohelium” level is schematically shown in Fig. 1b.19
8.3. Multiparticle systems

Leaving several other problems on two-particle systems for the reader’s exercise, let me proceed
to the discussion of systems with N > 2 indistinguishable particles, whose list notably includes atoms,
molecules, and condensed-matter systems. In this case, Eq. (7) for fermions is generalized as
$$\hat{\mathcal{P}}_{kk'} |\alpha_-\rangle = -|\alpha_-\rangle, \quad \text{for all } k, k' = 1, 2, ..., N, \tag{8.55}$$
where the operator Pˆkk ' permutes particles with numbers k and k’. As a result, for systems with non-
directly-interacting fermions, the Pauli principle forbids any state in which any two particles have
similar single-particle wavefunctions. Nevertheless, it permits two fermions to have similar orbital
wavefunctions, provided that their spins are in the singlet state (18), because this satisfies the
permutation requirement (55). This fact is of paramount importance for the ground state of the systems
whose Hamiltonians do not depend on spin because it allows the fermions to be in their orbital single-
particle ground states, with two electrons of the spin singlet sharing the same orbital state. Hence, for
the limited (but very important!) goal of finding ground-state energies of multi-fermion systems with
negligible direct interaction, we may ignore the actual singlet spin structure, and reduce the Pauli
19 It is interesting that another very important two-electron system, the hydrogen (H2) molecule, which was briefly
discussed in Sec. 2.6, also has two similarly named forms, parahydrogen and orthohydrogen. However, their
difference is due to two possible (respectively, singlet and triplet) states of the system of two spins of the two
hydrogen nuclei – protons, which are also spin-½ particles. The resulting energy of the parahydrogen is lower
than that of the orthohydrogen by only ~15 meV per molecule – the difference comparable with kBT at room
temperature (~26 meV). As a result, at the ambient conditions, the equilibrium ratio of these two spin isomers is
close to 3:1. Curiously, the theoretical prediction of this minor effect by W. Heisenberg (together with F. Hund) in
1927 was cited in his 1932 Nobel Prize award as the most noteworthy application of quantum theory.
exclusion principle to the simple picture of single-particle orbital energy levels, each “occupied with
two fermions”.
As a very simple example, let us find the ground-state energy of five fermions, confined in a hard-
wall, cubic-shaped 3D volume of side a, ignoring their direct interaction. From Sec. 1.7, we know the
single-particle energy spectrum of the system:
$$\varepsilon_{n_x, n_y, n_z} = \varepsilon_0 \left(n_x^2 + n_y^2 + n_z^2\right), \quad \text{with } \varepsilon_0 \equiv \frac{\pi^2 \hbar^2}{2ma^2}, \text{ and } n_x, n_y, n_z = 1, 2, ... \tag{8.56}$$
so that the lowest-energy states are:

- one ground state with {nx,ny,nz} = {1,1,1}, and energy ε111 = (1²+1²+1²)ε0 = 3ε0, and
- three excited states, with {nx,ny,nz} equal to either {2,1,1}, or {1,2,1}, or {1,1,2}, with equal
energies ε211 = ε121 = ε112 = (2²+1²+1²)ε0 = 6ε0.
According to the above simple formulation of the Pauli principle, each of these orbital energy levels can
accommodate up to two fermions. Hence the lowest-energy (ground) state of the five-fermion system is
achieved by placing two of them on the ground level ε111 = 3ε0, and the remaining three particles in any
of the degenerate “excited” states of energy 6ε0, so that the ground-state energy of the system is
$$E_g = 2 \times 3\varepsilon_0 + 3 \times 6\varepsilon_0 \equiv 24\varepsilon_0 = \frac{12\pi^2\hbar^2}{ma^2}. \tag{8.57}$$
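The level-filling recipe used above is easy to sketch in code. The following block is just an illustration, with the energy measured in units of ε0; the function name ground_state_energy and its parameters n_max (the largest quantum number included in the level list) and per_orbital (two fermions per orbital state, as in the simple formulation of the Pauli principle) are assumptions made for this sketch only:

```python
from itertools import product

def ground_state_energy(n_fermions, n_max=6, per_orbital=2):
    """Ground-state energy (in units of eps0) of non-interacting fermions
    in a hard-wall cubic box, with up to per_orbital particles per orbital."""
    if n_fermions > per_orbital * n_max ** 3:
        raise ValueError("increase n_max")
    # Orbital energies n_x^2 + n_y^2 + n_z^2, in ascending order.
    levels = sorted(nx * nx + ny * ny + nz * nz
                    for nx, ny, nz in product(range(1, n_max + 1), repeat=3))
    energy, remaining, highest = 0, n_fermions, 0
    for eps in levels:
        if remaining == 0:
            break
        placed = min(per_orbital, remaining)
        energy += placed * eps
        remaining -= placed
        highest = eps
    # Orbitals with a quantum number above n_max start at (n_max + 1)^2 + 2;
    # the truncated list is reliable only below that energy.
    if highest >= (n_max + 1) ** 2 + 2:
        raise ValueError("increase n_max")
    return energy

print(ground_state_energy(5))   # five fermions, as in Eq. (57)
```

For five fermions, the function fills the level 3ε0 with two particles and the three degenerate orbitals of energy 6ε0 with the remaining three, reproducing the value 24ε0 of Eq. (57).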
Moreover, in many cases, relatively weak interaction between fermions does not blow up such a
simple quantum state classification scheme qualitatively, and the Pauli principle allows tracing the order
of single-particle state filling. This is exactly the simple approach that has been used in our discussion of
atoms in Sec. 3.7. Unfortunately, it does not allow for a more specific characterization of the ground
states of most atoms, in particular the evaluation of the corresponding values of the quantum numbers S,
L, and J that characterize the net angular momenta of the atom, and hence its response to an external
magnetic field. These numbers are defined by relations similar to Eqs. (48), each for the corresponding
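vector operator of the net angular momenta:
$$\hat{\mathbf{S}} \equiv \sum_{k=1}^{N} \hat{\mathbf{s}}_k, \quad \hat{\mathbf{L}} \equiv \sum_{k=1}^{N} \hat{\mathbf{l}}_k, \quad \hat{\mathbf{J}} \equiv \sum_{k=1}^{N} \hat{\mathbf{j}}_k; \tag{8.58}$$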
note that these definitions are consistent with Eq. (5.170) applied both to the angular momenta sk, lk, and
jk of each particle, and to the full vectors S, L, and J. When the numbers S, L, and J for a state are
known, they are traditionally recorded in the form of the so-called Russell-Saunders symbols:20
$$^{2S+1}\mathcal{L}_J, \tag{8.59}$$
where S and J are the corresponding values of these quantum numbers, while ℒ is a capital letter,
encoding the quantum number L – via the same spectroscopic notation as for single particles (see Sec.
3.6): ℒ = S for L = 0, ℒ = P for L = 1, ℒ = D for L = 2, etc. (The reason why the front superscript of
the Russell-Saunders symbol lists 2S + 1 rather than just S, is that according to the last of Eqs. (48), it
20 Named after H. Russell and F. Saunders, whose pioneering (circa 1925) processing of experimental spectral-
line data has established the very idea of vector addition of the electron spins, described by the first of Eqs. (58).
shows the number of possible values of the quantum number MS, which characterizes the state’s spin
degeneracy, and is called its multiplicity.)
For example, for the simplest, hydrogen atom (Z = 1), with its single electron in the ground 1s
state, L = l = 0, S = s = ½, and J = S = ½, so that its Russell-Saunders symbol is ²S₁/₂. Next, the
discussion of the helium atom (Z = 2) in the previous section has shown that in its ground state L = 0
(because of the 1s orbital state of both electrons), and S = 0 (because of the singlet spin state), so that the
total angular momentum also vanishes: J = 0. As a result, the Russell-Saunders symbol is ¹S₀. The
structure of the next atom, lithium (Z = 3) is also easy to predict, because, as was discussed in Sec. 3.7,
its ground-state electron configuration is 1s²2s¹, i.e. includes two electrons in the “helium shell”, i.e. on
the 1s orbitals (now we know that they are actually in a singlet spin state), and one electron in the 2s
state, of higher energy, also with zero orbital momentum, l = 0. As a result, the total L in this state is
evidently equal to 0, and S is equal to ½, so that J = ½, meaning that the Russell-Saunders symbol of
lithium is ²S₁/₂, just as for hydrogen. Even in the next atom, beryllium (Z = 4), with the ground state
configuration 1s²2s², the symbol is readily predictable, because none of its electrons has non-zero orbital
momentum, giving L = 0. Also, each electron pair is in the singlet spin state, i.e. we have S = 0, so that
J = 0 – the quantum number set described by the Russell-Saunders symbol ¹S₀ – just as for helium.
However, for the next, boron atom (Z = 5), with its ground-state electron configuration 1s²2s²2p¹
(see, e.g., Fig. 3.24), there is no obvious way to predict the result. Indeed, this atom has two pairs of
electrons, with opposite spins, on its two lowest s-orbitals, giving zero contributions to the net S, L, and
J. Hence these total quantum numbers may be only contributed by the last, fifth electron with s = ½ and
l = 1, giving S = ½, L = 1. As was discussed in Sec. 5.7 for the single-particle case, the vector addition
of the angular momenta S and L enables two values of the quantum number J: either L + S = 3/2 or L – S
= ½. Experiment shows that the difference between the energies of these two states is very small (~2
meV), so that at room temperature (with kBT ≈ 26 meV) they are both occupied, with the genuine
ground state having J = ½, so that its Russell-Saunders symbol is ²P₁/₂.
Such energy differences, which become larger for heavier atoms, are determined both by the
Coulomb and spin-orbit21 interactions between the electrons. Their quantitative analysis is rather
involved (see below), but the results tend to follow simple phenomenological Hund rules, with the
following hierarchy:
Rule 1. For a given electron configuration, the ground state has the largest possible S, and hence
the largest multiplicity 2S + 1.
Rule 2. For a given S, the ground state has the largest possible L.
Rule 3. For given S and L, J has its smallest possible value, |L – S|, if the given sub-shell {n, l}
is filled not more than by half, while in the opposite case, J has its largest possible value, L + S.
Let us see how these rules work for the boron atom we have just discussed. For it, the Hund
Rules 1 and 2 are satisfied automatically, while the sub-shell {n = 2, l = 1}, which can house up to 2(2l
+ 1) = 6 electrons, is filled with just one 2p electron, i.e. by less than a half of the maximum value. As a
result, the Hund Rule 3 predicts the ground state’s value J = ½, in agreement with experiment.
21 In light atoms, the spin-orbit interaction is so weak that it may be reasonably well described as an interaction of
the total momenta L and S of the system – the so-called LS (or “Russell-Saunders”) coupling. On the other hand,
in very heavy atoms, the interaction is effectively between the net momenta jk = lk + sk of the individual electrons
– the so-called jj coupling. This is the reason why in such atoms the Hund rule 3 may be violated.
Generally, for lighter atoms, the Hund rules are well obeyed. However, the lower down the Hund rule
hierarchy, the less “powerful” the rules are, i.e. the more often they are violated in heavier atoms.
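For an atom with just one partly filled sub-shell, the three Hund rules may be turned into a short routine. The sketch below is an illustration only: it assumes that all other shells are closed (and hence contribute nothing to S, L, and J), it encodes the rules by the usual recipe of filling the orbitals ml = l, l – 1, …, –l first with spin-up and then with spin-down electrons, and the function names hund_ground_term and term_symbol are hypothetical:

```python
from fractions import Fraction

LETTERS = "SPDFGHIK"   # spectroscopic letters for L = 0, 1, 2, ... (J is skipped)

def hund_ground_term(l, n_electrons):
    """Return (S, L, J) predicted by the Hund rules for n_electrons
    in a single sub-shell with the orbital quantum number l."""
    capacity = 2 * (2 * l + 1)
    if not 0 <= n_electrons <= capacity:
        raise ValueError("a sub-shell {n, l} holds at most 2(2l + 1) electrons")
    half = Fraction(1, 2)
    # Rules 1 and 2: spin-up orbitals first (largest S), starting from the
    # largest m_l (largest L); only then the spin-down orbitals.
    slots = [(m, half) for m in range(l, -l - 1, -1)]
    slots += [(m, -half) for m in range(l, -l - 1, -1)]
    occupied = slots[:n_electrons]
    S = sum(ms for _, ms in occupied)
    L = abs(sum(m for m, _ in occupied))
    # Rule 3: the smallest J if the sub-shell is not more than half-filled.
    J = abs(L - S) if n_electrons <= capacity // 2 else L + S
    return S, L, J

def term_symbol(S, L, J):
    """Russell-Saunders symbol written as plain text, e.g. '2P1/2'."""
    return f"{str(2 * S + 1)}{LETTERS[L]}{str(J)}"

for name, l, n in (("boron", 1, 1), ("carbon", 1, 2), ("oxygen", 1, 4)):
    print(name, term_symbol(*hund_ground_term(l, n)))
```

For boron (one 2p electron) the routine returns the symbol 2P1/2 discussed above; for carbon and oxygen it gives 3P0 and 3P2. For heavier atoms, where the rules may be violated, such output should be treated only as a first guess.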
Now let us discuss possible approaches to a quantitative theory of multiparticle systems – not
only atoms. As was discussed in Sec. 1, if fermions do not interact directly, the stationary states of the
system have to be the antisymmetric eigenstates of the permutation operator, i.e. satisfy Eq. (55). To
understand how such states may be formed from the single-electron ones, let us return for a minute to
the case of two electrons, and rewrite Eq. (11) in the following compact form:
$$|\alpha_-\rangle = \frac{1}{\sqrt{2}} \left( |\beta\beta'\rangle - |\beta'\beta\rangle \right) \equiv \frac{1}{\sqrt{2}} \begin{vmatrix} |\beta\rangle & |\beta'\rangle \\ |\beta\rangle & |\beta'\rangle \end{vmatrix}, \tag{8.60a}$$
where the two columns of the determinant correspond to the single-particle states 1 and 2, its two rows
correspond to the particle numbers 1 and 2, and the direct product signs are just implied. In this way,
the Pauli principle is mapped on the well-known property of matrix determinants: if any two columns of
a matrix coincide, its determinant vanishes. This Slater determinant approach22 may be readily
generalized to N fermions occupying any N (not necessarily the lowest-energy) single-particle states
β, β’, β’’, etc.:
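$$|\alpha_-\rangle = \frac{1}{(N!)^{1/2}} \begin{vmatrix} |\beta\rangle & |\beta'\rangle & |\beta''\rangle & \cdots \\ |\beta\rangle & |\beta'\rangle & |\beta''\rangle & \cdots \\ |\beta\rangle & |\beta'\rangle & |\beta''\rangle & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{vmatrix}. \tag{8.60b}$$
(Slater determinant: its N columns correspond to the state list, and its N rows to the particle list.)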
The Slater determinant form is extremely nice and compact – in comparison with direct writing
of a sum of N! products, each of N ket factors. However, there are two major problems with using it for
practical calculations:
(i) For the calculation of any bra-ket product (say, within the perturbation theory) we still need to
spell out each bra- and ket-vector as a sum of component terms. Even for a limited number of electrons
(say N ~ 10² in a typical atom), the number N! ~ 10¹⁶⁰ of terms in such a sum is impracticably large for
any analytical or numerical calculation.
(ii) In the case of interacting fermions, the Slater determinant does not describe the eigenvectors
of the system; rather the stationary state is a superposition of such basis functions, i.e. of the Slater
determinants – each for a specific selection of N states from the full set of single-particle states – that is
generally larger than N.
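Both the antisymmetry of the Slater determinant and the rapid growth of the number of its terms may be illustrated by a direct expansion. The following sketch makes several assumptions for illustration only: single-particle states are represented by text labels, a product ket by a tuple of such labels (with the position in the tuple playing the role of the particle number), and the function names are hypothetical:

```python
from itertools import permutations
from math import factorial, sqrt

def permutation_sign(perm):
    """Parity (+1 or -1) of a permutation of the indices 0, 1, ..., N - 1."""
    perm, sign = list(perm), 1
    for i in range(len(perm)):
        while perm[i] != i:          # each swap puts one index in its place
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def slater_state(states):
    """Expand the Slater determinant of the listed single-particle states.
    Returns {product ket: amplitude}; an empty dict means the null ket."""
    n = len(states)
    norm = 1 / sqrt(factorial(n))
    ket = {}
    for perm in permutations(range(n)):
        product_ket = tuple(states[p] for p in perm)
        ket[product_ket] = ket.get(product_ket, 0.0) + permutation_sign(perm) * norm
    return {k: amp for k, amp in ket.items() if abs(amp) > 1e-12}

two = slater_state(("b", "b'"))
print(two)                                  # the two terms of Eq. (60a)
print(len(slater_state(("b", "b'", "b''"))))  # N! = 6 terms for N = 3
print(slater_state(("b", "b'", "b")))       # two equal states: nothing survives
```

The last call shows the Pauli principle at work: with two coinciding single-particle states, all terms cancel in pairs. Since the loop runs over all N! permutations, this brute-force expansion is practicable only for very small N, which is exactly the problem (i) above.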
For atoms and simple molecules, whose filled-shell electrons may be excluded from an explicit
analysis (by describing their effects, approximately, with effective pseudo-potentials), the effective
number N may be reduced to a smaller number Nef of the order of 10, so that Nef! < 10⁶, and the Slater
determinants may be used for numerical calculations – for example, in the Hartree-Fock theory – see the
next section. However, for condensed-matter systems, such as metals and semiconductors, with the
22 It was suggested in 1929 by John C. Slater.
number of free electrons of the order of 10²³ per cm³, this approach is generally unacceptable, though
with some smart tricks (such as using the crystal’s periodicity) it may be still used for some approximate
(also mostly numerical) calculations.
These challenges make the development of a more general theory that would not use particle
numbers (which are superficial for indistinguishable particles to start with) a must for getting any final
analytical results for multiparticle systems. The most effective formalism for this purpose, which avoids
particle numbering at all, is called the second quantization.23 Actually, we have already discussed a
particular version of this formalism, for the case of the 1D harmonic oscillator, in Sec. 5.4. As a
reminder, after the definition (5.65) of the “creation” and “annihilation” operators via those of the
particle’s coordinate and momentum, we have derived their key properties (5.89),
$$\hat{a}|n\rangle = n^{1/2}\, |n-1\rangle, \quad \hat{a}^\dagger |n\rangle = (n+1)^{1/2}\, |n+1\rangle, \tag{8.61}$$
where |n⟩ are the stationary (Fock) states of the oscillator. This property allows an interpretation of the
operators’ actions as the creation/annihilation of a single excitation with the energy ħω0 – thus justifying
the operator names. In the next chapter, we will show that such excitation of an electromagnetic field
mode may be interpreted as a massless boson with s = 1, called the photon.
In order to generalize this approach to arbitrary bosons, not appealing to a specific system, we
may use relations similar to Eq. (61) to define the creation and annihilation operators. The definitions
look simple in the language of the so-called Dirac states, described by ket-vectors
$$|N_1, N_2, ..., N_j, ...\rangle, \tag{8.62}$$
where Nj is the state occupancy, i.e. the number of bosons in the single-particle state j. Let me
emphasize that here the indices 1, 2, …j,… number single-particle states (including their spin parts)
rather than particles. Thus the very notion of an individual particle’s number is completely (and for
indistinguishable particles, very relevantly) absent from this formalism. Generally, the set of single-
particle states participating in the Dirac state may be selected arbitrarily, provided that it is full and
orthonormal in the sense
$$\langle N_1', N_2', ..., N_{j}', ... | N_1, N_2, ..., N_j, ...\rangle = \delta_{N_1 N_1'}\, \delta_{N_2 N_2'} ...\, \delta_{N_j N_j'} ..., \tag{8.63}$$
though for systems of non- (or weakly) interacting bosons, using the stationary states of individual
particles in the system under analysis is almost always the best choice.
Now we can define the particle (boson) annihilation operator as follows:
$$\hat{a}_j |N_1, N_2, ..., N_j, ...\rangle \equiv N_j^{1/2}\, |N_1, N_2, ..., N_j - 1, ...\rangle. \tag{8.64}$$
Note that the pre-ket coefficient, similar to that in the first of Eqs. (61), guarantees that any attempt to
annihilate a particle in an initially unpopulated state gives the non-existing (“null”) state:
$$\hat{a}_j |N_1, N_2, ..., 0_j, ...\rangle = 0, \tag{8.65}$$
23 It was invented (first for photons and then for arbitrary bosons) by P. Dirac in 1927, and then modified in 1928
for fermions by E. Wigner and P. Jordan. Note that the term “second quantization” is rather misleading for the
non-relativistic applications we are discussing here, but finds certain justification in the quantum field theory.
where the symbol 0j means zero occupancy of the jth state. According to Eq. (63), an equivalent way to
write Eq. (64) is
$$\langle N_1', N_2', ..., N_j', ... |\, \hat{a}_j\, | N_1, N_2, ..., N_j, ...\rangle = N_j^{1/2}\, \delta_{N_1 N_1'}\, \delta_{N_2 N_2'} ...\, \delta_{N_j', N_j - 1} ... \tag{8.66}$$
According to the general Eq. (4.65), the matrix element of the Hermitian conjugate operator â†j is
$$\begin{aligned} \langle N_1', N_2', ..., N_j', ... |\, \hat{a}_j^\dagger\, | N_1, N_2, ..., N_j, ...\rangle &= \langle N_1, N_2, ..., N_j, ... |\, \hat{a}_j\, | N_1', N_2', ..., N_j', ...\rangle^* \\ &= \langle N_1, N_2, ..., N_j, ... |\, (N_j')^{1/2}\, | N_1', N_2', ..., N_j' - 1, ...\rangle \\ &= (N_j')^{1/2}\, \delta_{N_1 N_1'}\, \delta_{N_2 N_2'} ...\, \delta_{N_j, N_j' - 1} ... \\ &= (N_j + 1)^{1/2}\, \delta_{N_1 N_1'}\, \delta_{N_2 N_2'} ...\, \delta_{N_j + 1, N_j'} ..., \end{aligned} \tag{8.67}$$
meaning that the boson creation operator acts as
$$\hat{a}_j^\dagger |N_1, N_2, ..., N_j, ...\rangle = (N_j + 1)^{1/2}\, |N_1, N_2, ..., N_j + 1, ...\rangle, \tag{8.68}$$
in total compliance with the second of Eqs. (61). In particular, this particle creation operator allows the
description of the generation of a single particle from the vacuum (not null!) state |0, 0, …⟩:
$$\hat{a}_j^\dagger |0, 0, ..., 0_j, ..., 0\rangle = |0, 0, ..., 1_j, ..., 0\rangle, \tag{8.69}$$
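and hence a product of such operators may create, from the vacuum, a multiparticle state with an
arbitrary set of occupancies: 24
$$\underbrace{\hat{a}_1^\dagger \hat{a}_1^\dagger ... \hat{a}_1^\dagger}_{N_1 \text{ times}}\ \underbrace{\hat{a}_2^\dagger \hat{a}_2^\dagger ... \hat{a}_2^\dagger}_{N_2 \text{ times}} ...\ |0, 0, ...\rangle = (N_1!\, N_2!\, ...)^{1/2}\, |N_1, N_2, ...\rangle. \tag{8.70}$$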
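The definitions (64) and (68) are simple enough to be coded directly, which gives a quick check of the coefficient in Eq. (70). In this sketch (an illustration under stated assumptions, not a general-purpose tool), a Dirac state is represented by a tuple of occupancies, the state index j is zero-based, a zero coefficient stands for the null ket, and the function names are hypothetical:

```python
from math import factorial, isclose, prod, sqrt

def create(j, coeff, occ):
    """Eq. (68): creation operator acting on coeff * |occ>."""
    new_occ = occ[:j] + (occ[j] + 1,) + occ[j + 1:]
    return coeff * sqrt(occ[j] + 1), new_occ

def annihilate(j, coeff, occ):
    """Eq. (64); a zero coefficient stands for the null ket of Eq. (65)."""
    if occ[j] == 0:
        return 0.0, occ
    new_occ = occ[:j] + (occ[j] - 1,) + occ[j + 1:]
    return coeff * sqrt(occ[j]), new_occ

def build_from_vacuum(target):
    """Apply the creation operator N_j times for each state j to |0, 0, ...>."""
    coeff, occ = 1.0, (0,) * len(target)
    for j, n_j in enumerate(target):
        for _ in range(n_j):
            coeff, occ = create(j, coeff, occ)
    return coeff, occ

target = (2, 3, 0)
coeff, occ = build_from_vacuum(target)
expected = sqrt(prod(factorial(n) for n in target))   # (N1! N2! ...)^(1/2)
print(occ, coeff, expected)
```

For the occupancies (2, 3, 0) both numbers should be close to (2!·3!)^{1/2} = 12^{1/2}, so that the normalized Dirac state is obtained by dividing the operator product by this factor.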
Next, combining Eqs. (64) and (68), we get
$$\hat{a}_j^\dagger \hat{a}_j |N_1, N_2, ..., N_j, ...\rangle = N_j\, |N_1, N_2, ..., N_j, ...\rangle, \tag{8.71}$$
so that, just as for the particular case of harmonic oscillator excitations, the number-counting operator
$$\hat{N}_j \equiv \hat{a}_j^\dagger \hat{a}_j \tag{8.72}$$
“counts” the number of particles in the jth single-particle state, while preserving the whole multiparticle
state. Acting on a state by the creation-annihilation operators in the reverse order, we get
$$\hat{a}_j \hat{a}_j^\dagger |N_1, N_2, ..., N_j, ...\rangle = (N_j + 1)\, |N_1, N_2, ..., N_j, ...\rangle. \tag{8.73}$$
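Eqs. (71) and (73) may be checked in the same tuple-of-occupancies representation of Dirac states. Again, this is only a sketch: the state index j is zero-based, a zero coefficient marks the null ket, and the helper names a_dagger_a and a_a_dagger are hypothetical:

```python
from math import isclose, sqrt

def create(j, coeff, occ):
    """Eq. (68): boson creation operator acting on coeff * |occ>."""
    return coeff * sqrt(occ[j] + 1), occ[:j] + (occ[j] + 1,) + occ[j + 1:]

def annihilate(j, coeff, occ):
    """Eq. (64); a zero coefficient stands for the null ket."""
    if occ[j] == 0:
        return 0.0, occ
    return coeff * sqrt(occ[j]), occ[:j] + (occ[j] - 1,) + occ[j + 1:]

def a_dagger_a(j, occ):
    """Coefficient c in (a_j-dagger a_j)|occ> = c |occ>, cf. Eq. (71)."""
    coeff, mid = annihilate(j, 1.0, occ)
    if coeff == 0.0:
        return 0.0                      # empty state: N_j = 0
    coeff, out = create(j, coeff, mid)
    assert out == occ                   # the state itself is preserved
    return coeff

def a_a_dagger(j, occ):
    """Coefficient c in (a_j a_j-dagger)|occ> = c |occ>, cf. Eq. (73)."""
    coeff, mid = create(j, 1.0, occ)
    coeff, out = annihilate(j, coeff, mid)
    assert out == occ
    return coeff

state = (3, 0, 5)
for j in range(len(state)):
    print(j, a_dagger_a(j, state), a_a_dagger(j, state))
```

For each j, the two printed coefficients should be close to Nj and Nj + 1, so that their difference is 1 regardless of the state.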
Eqs. (71) and (73) show that for any state of a multiparticle system (which may be represented as a
linear superposition of Dirac states with all possible sets of numbers Nj), we may write
$$\hat{a}_j \hat{a}_j^\dagger - \hat{a}_j^\dagger \hat{a}_j \equiv \left[\hat{a}_j, \hat{a}_j^\dagger\right] = \hat{I}, \tag{8.74}$$
24The resulting Dirac state is not an eigenstate of every multiparticle Hamiltonian. However, we will see below
that for a set of non-interacting particles it is a stationary state, so that the full set of such states may be used as a
good basis in perturbation theories of systems of weakly interacting particles.
again in agreement with what we had for the 1D oscillator – cf. Eq. (5.68). According to Eqs. (63), (64),
and (68), the creation and annihilation operators corresponding to different single-particle states do
commute, so that Eq. (74) may be generalized as
$$\left[\hat{a}_j, \hat{a}_{j'}^\dagger\right] = \hat{I}\, \delta_{jj'}, \tag{8.75}$$
while the similar operators commute, regardless of which states they act upon:
$$\left[\hat{a}_j^\dagger, \hat{a}_{j'}^\dagger\right] = \left[\hat{a}_j, \hat{a}_{j'}\right] = \hat{0}. \tag{8.76}$$
(Eqs. (75)-(76) are the commutation relations for bosonic operators.)
As was mentioned earlier, a major challenge in the Dirac approach is to rewrite the Hamiltonian
of a multiparticle system, which naturally carries particle numbers k (see, e.g., Eq. (22) for k = 1, 2), in the
second quantization language, in which there are no such numbers. Let us start with single-particle
components of such Hamiltonians, i.e. operators of the type
$$\hat{F} \equiv \sum_{k=1}^{N} \hat{f}_k, \tag{8.77}$$
where all N operators f̂k are similar, besides that each of them acts on one specific (kth) particle, and N
is the total number of particles in the system, which is evidently equal to the sum of single-particle state
occupancies:
$$N = \sum_j N_j. \tag{8.78}$$
The most important examples of such operators are the kinetic energy of N similar single particles, and
their potential energy in an external field:
$$\hat{T} \equiv \sum_{k=1}^{N} \frac{\hat{p}_k^2}{2m}, \quad \hat{U} \equiv \sum_{k=1}^{N} \hat{u}(\mathbf{r}_k). \tag{8.79}$$
For bosons, instead of the Slater determinant (60), we have to write a similar expression, but
without the sign alternation at permutations:
$$|N_1, ..., N_j, ...\rangle = \left( \frac{N_1!\, ...\, N_j!\, ...}{N!} \right)^{1/2} \sum_P |\underbrace{...\beta\beta'\beta''...}_{N \text{ operands}}\rangle, \tag{8.80}$$
sometimes called the permanent. Note again that the left-hand side of this relation is written in the Dirac
notation (that does not use particle numbering), while on its right-hand side, just as in the relations of
Secs. 1 and 2, the particle numbers are coded with the positions of the single-particle states inside the state
vectors, and the summation is over all different permutations of the states in the ket – cf. Eq. (10).
(According to the basic combinatorics,25 there are N!/(N1!...Nj!...) such permutations, so that the front
coefficient in Eq. (80) ensures the normalization of the Dirac state, provided that the single-particle
states β, β’, … are normalized.) Let us use Eq. (80) to spell out the following matrix element for a
system with (N – 1) particles:
25 See, e.g., MA Eq. (2.3).
$$\begin{aligned} &\langle ...N_j, ...N_{j'} - 1, ... |\, \hat{F}\, | ...N_j - 1, ...N_{j'}, ...\rangle \\ &\quad = \frac{N_1!\, ...\, (N_j - 1)!\, ...\, (N_{j'} - 1)!\, ...}{(N-1)!}\, (N_j N_{j'})^{1/2} \sum_{P_{N-1}} \sum_{P'_{N-1}} \langle ...\beta\beta'\beta''... |\, \sum_{k=1}^{N-1} \hat{f}_k\, | ...\beta\beta'\beta''...\rangle, \end{aligned} \tag{8.81}$$
where all non-specified occupation numbers in the corresponding positions of the bra- and ket-vectors
are equal to each other. Each single-particle operator f̂k participating in the operator sum, acts on the
bra- and ket-vectors of states only in one (kth) position, giving the following result, independent of the
position number:
$$\langle \underbrace{\beta_j}_{\text{in } k\text{th position}} |\, \hat{f}_k\, | \underbrace{\beta_{j'}}_{\text{in } k\text{th position}} \rangle = \langle \beta_j |\, \hat{f}\, | \beta_{j'} \rangle \equiv f_{jj'}. \tag{8.82}$$
Since in both permutation sets participating in Eq. (81), with (N – 1) state vectors each, all positions are
equivalent, we can fix the position (say, take the first one) and replace the sum over k with the
multiplication of the bracket by (N – 1). The fraction of permutations with the necessary bra-vector
(with number j) in that position is Nj/(N – 1), while that with the necessary ket-vector (with number j’)
in the same position is Nj’/(N – 1). As the result, the permutation sum in Eq. (81) reduces to
$$(N-1)\, \frac{N_j}{N-1}\, \frac{N_{j'}}{N-1}\, f_{jj'} \sum_{P_{N-2}} \sum_{P'_{N-2}} \langle ...\beta\beta'\beta''... | ...\beta\beta'\beta''...\rangle, \tag{8.83}$$
where our specific position k is now excluded from both the bra- and ket-vector permutations. Each of
these permutations now includes only (Nj – 1) states j and (Nj’ – 1) states j’, so that, using the state
orthonormality, we finally arrive at a very simple result:
$$\begin{aligned} &\langle ...N_j, ...N_{j'} - 1, ... |\, \hat{F}\, | ...N_j - 1, ...N_{j'}, ...\rangle \\ &\quad = \frac{N_1!\, ...\, (N_j - 1)!\, ...\, (N_{j'} - 1)!\, ...}{(N-1)!}\, (N_j N_{j'})^{1/2}\, (N-1)\, \frac{N_j}{N-1}\, \frac{N_{j'}}{N-1}\, f_{jj'}\, \frac{(N-2)!}{N_1!\, ...\, (N_j - 1)!\, ...\, (N_{j'} - 1)!\, ...} \\ &\quad \equiv (N_j N_{j'})^{1/2} f_{jj'}. \end{aligned} \tag{8.84}$$
On the other hand, let us calculate matrix elements of the following operator:
$$\sum_{j, j'} f_{jj'}\, \hat{a}_j^\dagger \hat{a}_{j'}. \tag{8.85}$$
A direct application of Eqs. (64) and (68) shows that the only non-vanishing elements are
$$\langle ...N_j, ...N_{j'} - 1, ... |\, f_{jj'}\, \hat{a}_j^\dagger \hat{a}_{j'}\, | ...N_j - 1, ..., N_{j'}, ...\rangle = (N_j N_{j'})^{1/2} f_{jj'}. \tag{8.86}$$
But this is exactly the last form of Eq. (84), so that in the basis of Dirac states, the single-particle
operator (77) may be represented, in the Dirac language, as
$$\hat{F} = \sum_{j, j'} f_{jj'}\, \hat{a}_j^\dagger \hat{a}_{j'}. \tag{8.87}$$
This beautifully simple relation is the key formula of the second quantization theory, and is
essentially the Dirac-language analog of Eq. (4.59) of the single-particle quantum mechanics. Each term
of the sum (87) may be described by a very simple mnemonic rule: for each pair of single-particle states
j and j’, annihilate a particle in the state j’, create one in the state j, and weigh the result with the
corresponding single-particle matrix element. One of the corollaries of Eq. (87) is that the expectation
value of an operator whose eigenstates coincide with the Dirac states is
$$\langle F \rangle \equiv \langle ...N_j, ... |\, \hat{F}\, | ...N_j, ...\rangle = \sum_j f_{jj} N_j, \tag{8.88}$$
with an evident physical interpretation as the sum of single-particle expectation values over all states,
weighed by the occupancy of each state.
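The mnemonic rule for Eq. (87) translates into code almost word by word. The sketch below evaluates matrix elements of the operator (87) between two boson Dirac states; it is an illustration only, assuming a real matrix f of single-particle matrix elements (with arbitrarily chosen numbers), occupancy tuples for Dirac states, zero-based state indices, and a hypothetical function name matrix_element:

```python
from math import isclose, sqrt

def matrix_element(bra, ket, f):
    """<bra| sum over j, j' of f[j][j'] a_j-dagger a_j' |ket> for bosons."""
    total = 0.0
    for j in range(len(ket)):
        for jp in range(len(ket)):
            if ket[jp] == 0:
                continue                                # null ket, Eq. (65)
            coeff = sqrt(ket[jp])                       # annihilate in j', Eq. (64)
            mid = ket[:jp] + (ket[jp] - 1,) + ket[jp + 1:]
            coeff *= sqrt(mid[j] + 1)                   # create in j, Eq. (68)
            out = mid[:j] + (mid[j] + 1,) + mid[j + 1:]
            if out == bra:                              # orthonormality, Eq. (63)
                total += f[j][jp] * coeff
    return total

# Arbitrary single-particle matrix elements f_jj', chosen for illustration.
f = [[1.0, 0.3, 0.0],
     [0.3, 2.0, 0.5],
     [0.0, 0.5, 4.0]]

state = (2, 1, 3)
print(matrix_element(state, state, f))          # sum of f_jj * N_j, Eq. (88)
print(matrix_element((2, 2, 0), (1, 3, 0), f))  # (N_j N_j')^(1/2) f_jj', Eq. (86)
```

With these numbers, the first result should be close to 2·1.0 + 1·2.0 + 3·4.0 = 16, and the second one to (2·3)^{1/2}·0.3, i.e. to the right-hand side of Eq. (86) with Nj = 2 and Nj’ = 3.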
Proceeding to fermions, which have to obey the Pauli principle, we immediately notice that any
occupation number Nj may only take two values, 0 or 1. To account for that, and also make the key
relation (87) valid for fermions as well, the fermion creation-annihilation operators are defined by the
following relations:
$$\hat{a}_j |N_1, N_2, ..., 0_j, ...\rangle = 0, \quad \hat{a}_j |N_1, N_2, ..., 1_j, ...\rangle = (-1)^{\Sigma(1, j-1)}\, |N_1, N_2, ..., 0_j, ...\rangle, \tag{8.89}$$
$$\hat{a}_j^\dagger |N_1, N_2, ..., 0_j, ...\rangle = (-1)^{\Sigma(1, j-1)}\, |N_1, N_2, ..., 1_j, ...\rangle, \quad \hat{a}_j^\dagger |N_1, N_2, ..., 1_j, ...\rangle = 0, \tag{8.90}$$
where the symbol Σ(J, J’) means the sum of all occupancy numbers in the states with numbers from J to
J’, including the border points:
$$\Sigma(J, J') \equiv \sum_{j=J}^{J'} N_j, \tag{8.91}$$
so that the sum participating in Eqs. (89)-(90) is the total occupancy of all states with the numbers below
j. (The states are supposed to be numbered in a fixed albeit arbitrary order.) As a result, these relations
may be conveniently summarized in the following verbal form: if an operator replaces the jth state’s
occupancy with the opposite one (either 1 with 0, or vice versa), it also changes the sign before the
result if (and only if) the total number of particles in the states with j’ < j is odd.
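Let us use this (perhaps somewhat counter-intuitive) sign alternation rule to spell out the ket-
vector |1, 1⟩ of a completely filled two-state system, formed from the vacuum state |0, 0⟩ in two different
ways. If we start by creating a fermion in the state 1, we get
$$\hat{a}_1^\dagger |0, 0\rangle = (-1)^0\, |1, 0\rangle \equiv |1, 0\rangle, \quad \hat{a}_2^\dagger \hat{a}_1^\dagger |0, 0\rangle = \hat{a}_2^\dagger |1, 0\rangle = (-1)^1\, |1, 1\rangle \equiv -|1, 1\rangle, \tag{8.92a}$$
while if the operator order is different, the result is
$$\hat{a}_2^\dagger |0, 0\rangle = (-1)^0\, |0, 1\rangle \equiv |0, 1\rangle, \quad \hat{a}_1^\dagger \hat{a}_2^\dagger |0, 0\rangle = \hat{a}_1^\dagger |0, 1\rangle = (-1)^0\, |1, 1\rangle \equiv |1, 1\rangle, \tag{8.92b}$$
so that
$$\left( \hat{a}_1^\dagger \hat{a}_2^\dagger + \hat{a}_2^\dagger \hat{a}_1^\dagger \right) |0, 0\rangle = 0. \tag{8.93}$$
Since the action of any of these operator products on any initial state rather than the vacuum one also
gives the null ket, we may write the following operator equality:
$$\hat{a}_1^\dagger \hat{a}_2^\dagger + \hat{a}_2^\dagger \hat{a}_1^\dagger \equiv \left\{ \hat{a}_1^\dagger, \hat{a}_2^\dagger \right\} = \hat{0}. \tag{8.94}$$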
It is straightforward to check that this result is valid for Dirac vectors of an arbitrary length, and does
not depend on the occupancy of other states, so that we may generalize it as
$$\left\{ \hat{a}_j^\dagger, \hat{a}_{j'}^\dagger \right\} = \left\{ \hat{a}_j, \hat{a}_{j'} \right\} = \hat{0}; \tag{8.95}$$
these equalities hold for j = j’ as well. On the other hand, an absolutely similar calculation shows that
the mixed creation-annihilation anticommutators do depend on whether the states are different or not:26
$$\left\{ \hat{a}_j, \hat{a}_{j'}^\dagger \right\} = \hat{I}\, \delta_{jj'}. \tag{8.96}$$
(Eqs. (95)-(96) are the commutation relations for fermionic operators.)
These equations look very much like Eqs. (75)-(76) for bosons, “only” with the replacement of
commutators with anticommutators. Since the core laws of quantum mechanics, including the operator
compatibility (Sec. 4.5) and the Heisenberg equation (4.199) of operator evolution in time, involve
commutators rather than anticommutators, one might think that all the behavior of bosonic and
fermionic multiparticle systems should be dramatically different. However, the difference is not as big
as one could expect; indeed, a straightforward check shows that the sign factors in Eqs. (89)-(90) just
compensate those in the Slater determinant, and thus make the key relation (87) valid for the fermions as
well. (Indeed, this is the very goal of the introduction of these factors.)
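The sign rule of Eqs. (89)-(90), and the resulting anticommutation relations (95)-(96), may be verified by brute force for a small number of single-particle states. The sketch below is only an illustration: a superposition of Dirac states is stored as a dictionary {occupancies: amplitude}, an empty dictionary stands for the null ket, the state index j is zero-based (so that the occupancies below j are occ[:j]), and all function names are hypothetical:

```python
from itertools import product

def annihilate(j, ket):
    """Eq. (89) applied to every Dirac state of the superposition."""
    out = {}
    for occ, amp in ket.items():
        if occ[j] == 1:
            sign = (-1) ** sum(occ[:j])          # (-1)^Sigma(1, j-1)
            new = occ[:j] + (0,) + occ[j + 1:]
            out[new] = out.get(new, 0) + sign * amp
    return out

def create(j, ket):
    """Eq. (90) applied to every Dirac state of the superposition."""
    out = {}
    for occ, amp in ket.items():
        if occ[j] == 0:
            sign = (-1) ** sum(occ[:j])
            new = occ[:j] + (1,) + occ[j + 1:]
            out[new] = out.get(new, 0) + sign * amp
    return out

def combine(a, b):
    """Sum of two superpositions, with zero amplitudes dropped."""
    out = dict(a)
    for occ, amp in b.items():
        out[occ] = out.get(occ, 0) + amp
    return {occ: amp for occ, amp in out.items() if amp != 0}

def anticommutator(op_a, op_b, ket):
    """{A, B}|ket> = AB|ket> + BA|ket>; each operator is a (function, j) pair."""
    (fa, ja), (fb, jb) = op_a, op_b
    return combine(fa(ja, fb(jb, ket)), fb(jb, fa(ja, ket)))

def relations_hold(n_states=3):
    """Check Eqs. (95)-(96) on every Dirac state of n_states fermion states."""
    for occ in product((0, 1), repeat=n_states):
        ket = {occ: 1}
        for j, jp in product(range(n_states), repeat=2):
            if anticommutator((create, j), (create, jp), ket):
                return False
            if anticommutator((annihilate, j), (annihilate, jp), ket):
                return False
            expected = ket if j == jp else {}
            if anticommutator((annihilate, j), (create, jp), ket) != expected:
                return False
    return True

vacuum = {(0, 0): 1}
print(create(1, create(0, vacuum)))   # Eq. (92a): amplitude -1 for |1, 1>
print(create(0, create(1, vacuum)))   # Eq. (92b): amplitude +1 for |1, 1>
print(relations_hold())
```

Such a check covers only the listed basis states, but since the operators are linear, this is sufficient for any superposition of them.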
To illustrate this fact on the simplest example, let us examine what the second quantization
formalism says about the dynamics of non-interacting particles in the system whose single-particle
properties we have discussed repeatedly, namely two nearly-similar potential wells, coupled by
tunneling through the separating potential barrier – see, e.g., Figs. 2.21 or 7.4. If the coupling is so small
that the states localized in the wells are only weakly perturbed, then in the basis of these states, the
single-particle Hamiltonian of the system may be represented by the 2×2 matrix (5.3). With the energy
reference selected at the middle between the energies of unperturbed states, the coefficient b vanishes,
and this matrix is reduced to
$$\mathrm{h} = \mathbf{c} \cdot \boldsymbol{\sigma} \equiv \begin{pmatrix} c_z & c_- \\ c_+ & -c_z \end{pmatrix}, \quad \text{with } c_\pm \equiv c_x \pm i c_y, \tag{8.97}$$
and its eigenvalues to
$$\varepsilon_\pm = \pm c, \quad \text{with } c \equiv |\mathbf{c}| = \left( c_x^2 + c_y^2 + c_z^2 \right)^{1/2}. \tag{8.98}$$
Now following the recipe (87), we can use Eq. (97) to represent the Hamiltonian of the whole system of
particles in terms of the creation-annihilation operators:
$$\hat{H} = c_z\, \hat{a}_1^\dagger \hat{a}_1 + c_-\, \hat{a}_1^\dagger \hat{a}_2 + c_+\, \hat{a}_2^\dagger \hat{a}_1 - c_z\, \hat{a}_2^\dagger \hat{a}_2, \tag{8.99}$$
where â†1,2 and â1,2 are the operators of creation and annihilation of a particle in the corresponding
potential well. (Again, in the second quantization approach the particles are not numbered at all!) As
Eq. (72) shows, the first and the last terms of the right-hand side of Eq. (99) describe the particle
energies ε1,2 = ±cz in uncoupled wells,
$$c_z\, \hat{a}_1^\dagger \hat{a}_1 = c_z \hat{N}_1 \equiv \varepsilon_1 \hat{N}_1, \quad -c_z\, \hat{a}_2^\dagger \hat{a}_2 = -c_z \hat{N}_2 \equiv \varepsilon_2 \hat{N}_2, \tag{8.100}$$
26 A by-product of this calculation is a proof that the operator defined by Eq. (72) counts the number of particles Nj
(now equal to either 1 or 0), just as it does for bosons.
while the sum of the middle two terms is the second-quantization description of tunneling between the
wells.
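The eigenvalues (98) of the matrix (97) are easy to check numerically. The following is only a minimal sketch in plain Python: the function names and the coefficient values are illustrative assumptions, not part of the original text. Since the matrix has zero trace, verifying that its square equals c² times the identity matrix confirms that its eigenvalues are ±c.

```python
import math

def pauli_hamiltonian(cx, cy, cz):
    """Return the 2x2 matrix c.sigma of Eq. (97) as nested lists of complex numbers."""
    c_plus = complex(cx, cy)    # c_+ = c_x + i c_y
    c_minus = complex(cx, -cy)  # c_- = c_x - i c_y
    return [[complex(cz), c_minus],
            [c_plus, complex(-cz)]]

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues(cx, cy, cz):
    """Eigenvalues -c and +c of Eq. (98), with c = |c|."""
    c = math.sqrt(cx**2 + cy**2 + cz**2)
    return (-c, +c)

# Hypothetical coefficient values, in arbitrary energy units
cx, cy, cz = 0.3, -0.4, 1.2
h = pauli_hamiltonian(cx, cy, cz)
h_squared = matmul2(h, h)
c = eigenvalues(cx, cy, cz)[1]

# A traceless 2x2 matrix whose square is c^2 * identity has eigenvalues +c and -c
print(h_squared, c**2)
```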
Now we can use the general Eq. (4.199) of the Heisenberg picture to spell out the equations of
motion of the creation-annihilation operators. For example,
$$i\hbar\dot{\hat a}_1 = \left[\hat a_1, \hat H\right] = c_z\left[\hat a_1, \hat a_1^\dagger \hat a_1\right] + c_-\left[\hat a_1, \hat a_1^\dagger \hat a_2\right] + c_+\left[\hat a_1, \hat a_2^\dagger \hat a_1\right] - c_z\left[\hat a_1, \hat a_2^\dagger \hat a_2\right]. \tag{8.101}$$
Since the Bose and Fermi operators satisfy different commutation relations, one could expect the right-
hand side of this equation to be different for bosons and fermions. However, it is not so. Indeed, all
commutators on the right-hand side of Eq. (101) have the following form:
$$\left[\hat a_j, \hat a_{j'}^\dagger \hat a_{j''}\right] \equiv \hat a_j \hat a_{j'}^\dagger \hat a_{j''} - \hat a_{j'}^\dagger \hat a_{j''} \hat a_j. \tag{8.102}$$
As Eqs. (74) and (94) show, the first pair product of operators on the right-hand side may be recast as
$$\hat a_j \hat a_{j'}^\dagger = \hat I \delta_{jj'} \pm \hat a_{j'}^\dagger \hat a_j, \tag{8.103}$$
where the upper sign pertains to bosons and the lower one to fermions, while according to Eqs. (76) and
(95), the very last pair product in Eq. (102) is
$$\hat a_{j''} \hat a_j = \pm \hat a_j \hat a_{j''}, \tag{8.104}$$
with the same sign convention. Plugging these expressions into Eq. (102), we see that regardless of the
particle type, there is a universal (and generally very useful) commutation relation
$$\left[\hat a_j, \hat a_{j'}^\dagger \hat a_{j''}\right] = \hat a_{j''} \delta_{jj'}, \tag{8.105}$$
valid for both bosons and fermions. As a result, the Heisenberg equation of motion for the operator $\hat a_1$,
and the equation for $\hat a_2$ (which may be obtained absolutely similarly), are also universal:27
$$i\hbar\dot{\hat a}_1 = c_z \hat a_1 + c_- \hat a_2, \qquad i\hbar\dot{\hat a}_2 = c_+ \hat a_1 - c_z \hat a_2. \tag{8.106}$$
This is a system of two coupled, linear differential equations, which is similar to the equations
for the c-number probability amplitudes of single-particle wavefunctions of a two-level system – see,
e.g., Eq. (2.201) and the model solution of Problem 4.25. Their general solution is a linear superposition
$$\hat a_{1,2}(t) = \sum_\pm \hat\alpha_{1,2}^{(\pm)} \exp\{\lambda_\pm t\}. \tag{8.107}$$
As usual, to find the exponents $\lambda_\pm$, it is sufficient to plug in a particular solution $\hat a_{1,2}(t) = \hat\alpha_{1,2}\exp\{\lambda t\}$
into Eq. (106) and require that the determinant of the resulting homogeneous, linear system for the
“coefficients” (actually, time-independent operators) $\hat\alpha_{1,2}$ equals zero. This gives us the following
characteristic equation
$$\begin{vmatrix} c_z - i\hbar\lambda & c_- \\ c_+ & -c_z - i\hbar\lambda \end{vmatrix} = 0, \tag{8.108}$$
with two roots $\lambda_\pm = \pm i\Omega/2$, where $\Omega \equiv 2c/\hbar$ – cf. Eq. (5.20). Now plugging each of the roots, one by one,
into the system of equations for $\hat\alpha_{1,2}$, we can find these operators, and hence the general solution of
system (106) for arbitrary initial conditions.
27 Equations of motion for the creation operators $\hat a_{1,2}^\dagger$ are just the Hermitian conjugates of Eqs. (106), and do not
add any new information about the system’s dynamics.
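For reference, the roots of the characteristic equation (108) follow from a direct expansion of the determinant. The short derivation below is a sketch that assumes the notation $\lambda$ for the exponent and $\Omega$ for the oscillation frequency, and uses $c_- c_+ = c_x^2 + c_y^2$ from Eq. (97):

$$(c_z - i\hbar\lambda)(-c_z - i\hbar\lambda) - c_- c_+ = (i\hbar\lambda)^2 - c_z^2 - \left(c_x^2 + c_y^2\right) = -\hbar^2\lambda^2 - c^2 = 0,$$

$$\lambda_\pm = \pm\frac{ic}{\hbar} \equiv \pm\frac{i\Omega}{2}, \qquad \Omega \equiv \frac{2c}{\hbar}.$$

So the exponents are purely imaginary, and the operators oscillate in time rather than grow or decay, as they should for a Hermitian Hamiltonian.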
Let us consider the simple case $c_y = c_z = 0$ (meaning in particular that the wells are exactly
aligned, see Fig. 2.21), so that $\hbar\Omega/2 \equiv c = c_x$; then the solution of Eq. (106) is
$$\hat a_1(t) = \hat a_1(0)\cos\frac{\Omega t}{2} - i\hat a_2(0)\sin\frac{\Omega t}{2}, \qquad \hat a_2(t) = -i\hat a_1(0)\sin\frac{\Omega t}{2} + \hat a_2(0)\cos\frac{\Omega t}{2}. \tag{8.109}$$
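Multiplying the first of these relations by its Hermitian conjugate, and ensemble-averaging the result, we
get
$$\langle N_1\rangle \equiv \left\langle \hat a_1^\dagger(t)\hat a_1(t)\right\rangle = \left\langle \hat a_1^\dagger(0)\hat a_1(0)\right\rangle\cos^2\frac{\Omega t}{2} + \left\langle \hat a_2^\dagger(0)\hat a_2(0)\right\rangle\sin^2\frac{\Omega t}{2} - i\left\langle \hat a_1^\dagger(0)\hat a_2(0) - \hat a_2^\dagger(0)\hat a_1(0)\right\rangle\sin\frac{\Omega t}{2}\cos\frac{\Omega t}{2}. \tag{8.110}$$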
Let the initial state of the system be a single Dirac state, i.e. have a definite number of particles
in each well; in this case, only the two first terms on the right-hand side of Eq. (110) are different from
zero, giving:28
$$\langle N_1\rangle = N_1(0)\cos^2\frac{\Omega t}{2} + N_2(0)\sin^2\frac{\Omega t}{2}. \tag{8.111}$$
For one particle, initially placed in either well, this gives us our old result (2.181) describing the usual
quantum oscillations of the particle between two wells with the frequency $\Omega$. However, Eq. (111) is
valid for any set of initial occupancies; let us use this fact. For example, starting from two particles, with
initially one particle in each well, we get $\langle N_1\rangle = 1$, regardless of time. So, the occupancies do not
oscillate, and no experiment may detect the quantum oscillations, though their frequency $\Omega$ is still
formally present in the time evolution equations. This fact may be interpreted as the simultaneous
quantum oscillations of two particles between the wells, exactly in anti-phase. For bosons, we can go on
to even larger occupancies by preparing the system, for example, in the state with $N_1(0) = N$, $N_2(0) = 0$.
According to Eq. (111), in this case the quantum oscillation amplitude increases N-fold;
this is a particular manifestation of the general fact that bosons can be (and evolve in time) in the same
quantum state. On the other hand, for fermions we cannot increase the initial occupancies beyond 1, so
that the largest oscillation amplitude is obtained when we initially fill just one well.
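The three cases just discussed can be reproduced with a few lines of code. This is only an illustrative sketch of Eq. (111) and of its complementary form for the second well; the function names and the value of the frequency are assumptions made for the example.

```python
import math

def mean_occupancy_1(t, n1_0, n2_0, omega):
    """<N_1>(t) from Eq. (111), for initial occupancies n1_0 and n2_0."""
    phase = omega * t / 2.0
    return n1_0 * math.cos(phase) ** 2 + n2_0 * math.sin(phase) ** 2

def mean_occupancy_2(t, n1_0, n2_0, omega):
    """<N_2>(t): the complementary result for the second well."""
    phase = omega * t / 2.0
    return n1_0 * math.sin(phase) ** 2 + n2_0 * math.cos(phase) ** 2

omega = 2.0 * math.pi          # hypothetical frequency: one period per unit time
half_period = math.pi / omega  # time at which omega * t / 2 = pi / 2

# One particle, initially in well 1: full transfer to well 2 after half a period
print(mean_occupancy_1(half_period, 1, 0, omega))
# One particle in each well: the occupancy stays equal to 1 at any time
print(mean_occupancy_1(0.37, 1, 1, omega))
# Five bosons, all initially in well 1: the oscillation swing is five particles
print(mean_occupancy_1(0.0, 5, 0, omega) - mean_occupancy_1(half_period, 5, 0, omega))
```

The sum of the two functions is time-independent, which is the sanity check mentioned in the footnote to Eq. (111).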
The Dirac approach may be readily generalized to more complex systems. For example, Eq. (99)
implies that an arbitrary system of potential wells with weak tunneling coupling between the adjacent
wells may be described by the Hamiltonian
$$\hat H = \sum_j \varepsilon_j \hat a_j^\dagger \hat a_j + \sum_{\{j,j'\}} \left(\delta_{jj'} \hat a_j^\dagger \hat a_{j'} + \text{h.c.}\right), \tag{8.112}$$
where the symbol {j, j’} means that the second sum is restricted to pairs of next-neighbor wells – see,
e.g., Eq. (2.203) and its discussion. Note that this Hamiltonian is still a quadratic form of the creation-
annihilation operators, so the Heisenberg-picture equations of motion of these operators are still linear,
and their exact solutions, though possibly cumbersome, may be studied in detail. Due to this fact, the
Hamiltonian (112) is widely used for the study of some phenomena, for example, the very interesting
Anderson localization effects, in which a random distribution of the localized-site energies $\varepsilon_j$ prevents
tunneling particles, within a certain energy range, from spreading to unlimited distances.29
28 For the second well’s occupancy, the result is complementary, $\langle N_2\rangle(t) = N_1(0)\sin^2(\Omega t/2) + N_2(0)\cos^2(\Omega t/2)$, giving in
particular a good sanity check: $\langle N_1\rangle(t) + \langle N_2\rangle(t) = N_1(0) + N_2(0) = \text{const}$.
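As a sketch of why the equations of motion stay linear, one may apply the universal relation (105) to each term of the Hamiltonian (112). Writing its tunneling coefficients as $\delta_{jj'}$, with $\delta_{j'j} \equiv \delta_{jj'}^*$ for the Hermitian-conjugate terms (a notational assumption made only for this illustration), one gets

$$i\hbar\dot{\hat a}_j = \left[\hat a_j, \hat H\right] = \varepsilon_j \hat a_j + \sum_{j'} \delta_{jj'} \hat a_{j'},$$

where the sum runs over the wells j' adjacent to the well j. For just two wells, with $\varepsilon_{1,2} = \pm c_z$ and $\delta_{12} = c_-$, this system reduces to Eqs. (106).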
8.4. Perturbative approaches

The situation becomes much more difficult if we need to account for direct interactions between
the particles. Let us assume that the interaction may be reduced to that between their pairs (as is the
case for the Coulomb interaction and most other interactions30), so that it may be described by the
following “pair-interaction” Hamiltonian
$$\hat U_{\rm int} = \frac{1}{2}\sum_{\substack{k,k'=1 \\ k\neq k'}}^{N} \hat u_{\rm int}(\mathbf{r}_k, \mathbf{r}_{k'}), \tag{8.113}$$
with the front factor of ½ compensating the double-counting of each particle pair. The translation of this
operator to the second-quantization form may be done absolutely similarly to the derivation of Eq. (87),
and gives a similar (though naturally more involved) result
$$\hat U_{\rm int} = \frac{1}{2}\sum_{j,j',l,l'} u_{jj'll'}\, \hat a_j^\dagger \hat a_{j'}^\dagger \hat a_{l'} \hat a_l, \tag{8.114}$$
where the two-particle matrix elements are defined similarly to Eq. (82):
$$u_{jj'll'} \equiv \left\langle \beta_j \beta_{j'} \right| \hat u_{\rm int} \left| \beta_l \beta_{l'} \right\rangle. \tag{8.115}$$
The only new feature of Eq. (114) is a specific order of the indices of the creation operators. Note the
mnemonic rule of writing this expression, similar to that for Eq. (87): each term corresponds to moving
a pair of particles from states l and l’ to states j’ and j (in this order!) factored with the corresponding
two-particle matrix element (115).
However, with such a term taken into account, the resulting Heisenberg equations of the time
evolution of the creation/annihilation operators are nonlinear, so that solving them and calculating
observables from the results is usually impossible, at least analytically. The only case when some
general results may be obtained is the weak interaction limit. In this case, the unperturbed Hamiltonian
contains only single-particle terms such as (79), and we can always (at least conceptually :-) find such a
basis of orthonormal single-particle states $\beta_j$ in which that Hamiltonian is diagonal in the Dirac
representation:
$$\hat H^{(0)} = \sum_j \varepsilon_j^{(0)} \hat a_j^\dagger \hat a_j. \tag{8.116}$$
29 For a review of the 1D version of this problem, see, e.g., J. Pendry, Adv. Phys. 43, 461 (1994).
30 A simple but important example from the condensed matter theory is the so-called Hubbard model, in which
particle repulsion limits their number on each of localized sites to either 0, or 1, or 2, with negligible interaction of
the particles on different sites – though the next-neighbor sites are still connected by tunneling – as in Eq. (112).
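Now we can use Eq. (6.14), in this basis, to calculate the interaction energy as a first-order perturbation:
$$E_{\rm int}^{(1)} = \left\langle N_1, N_2, \ldots \right| \hat U_{\rm int} \left| N_1, N_2, \ldots \right\rangle = \left\langle N_1, N_2, \ldots \right| \frac{1}{2}\sum_{j,j',l,l'} u_{jj'll'}\, \hat a_j^\dagger \hat a_{j'}^\dagger \hat a_{l'} \hat a_l \left| N_1, N_2, \ldots \right\rangle$$
$$\equiv \frac{1}{2}\sum_{j,j',l,l'} u_{jj'll'} \left\langle N_1, N_2, \ldots \right| \hat a_j^\dagger \hat a_{j'}^\dagger \hat a_{l'} \hat a_l \left| N_1, N_2, \ldots \right\rangle. \tag{8.117}$$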
Since, according to Eq. (63), the Dirac states with different occupancies are orthogonal, the last long
bracket is different from zero only for three particular subsets of its indices:
(i) j ≠ j’, l = j, and l’ = j’. In this case, the four-operator product in Eq. (117) is equal to
$\hat a_j^\dagger \hat a_{j'}^\dagger \hat a_{j'} \hat a_j$, and applying the commutation rules twice, we can bring it to the so-called normal ordering,
with each creation operator standing immediately to the left of the corresponding annihilation operator, thus
forming the particle number operator (72):
$$\hat a_j^\dagger \hat a_{j'}^\dagger \hat a_{j'} \hat a_j = \pm \hat a_j^\dagger \hat a_{j'}^\dagger \hat a_j \hat a_{j'} = \pm \hat a_j^\dagger \left(\pm \hat a_j \hat a_{j'}^\dagger\right) \hat a_{j'} = \hat a_j^\dagger \hat a_j \hat a_{j'}^\dagger \hat a_{j'} = \hat N_j \hat N_{j'}, \tag{8.118}$$
with the same sign of the final result for bosons and fermions.
(ii) j ≠ j’, l = j’, and l’ = j. In this case, the four-operator product is equal to $\hat a_j^\dagger \hat a_{j'}^\dagger \hat a_j \hat a_{j'}$, and
bringing it to the form $\hat N_j \hat N_{j'}$ requires only one commutation:
$$\hat a_j^\dagger \hat a_{j'}^\dagger \hat a_j \hat a_{j'} = \hat a_j^\dagger \left(\pm \hat a_j \hat a_{j'}^\dagger\right) \hat a_{j'} = \pm \hat a_j^\dagger \hat a_j \hat a_{j'}^\dagger \hat a_{j'} = \pm \hat N_j \hat N_{j'}, \tag{8.119}$$
with the upper sign for bosons and the lower sign for fermions.
(iii) All indices are equal to each other, giving $\hat a_j^\dagger \hat a_{j'}^\dagger \hat a_{l'} \hat a_l = \hat a_j^\dagger \hat a_j^\dagger \hat a_j \hat a_j$. For fermions, such an
operator (that “tries” to create or to kill two particles in a row, in the same state) immediately gives the
null-vector. In the case of bosons, we may use Eq. (74) to commute the internal pair of operators, getting
$$\hat a_j^\dagger \hat a_j^\dagger \hat a_j \hat a_j = \hat a_j^\dagger \left(\hat a_j \hat a_j^\dagger - \hat I\right) \hat a_j = \hat N_j \left(\hat N_j - \hat I\right). \tag{8.120}$$
Note, however, that this expression formally covers the fermion case as well (always giving zero). As a
result, Eq. (117) may be rewritten in the following universal form:
$$E_{\rm int}^{(1)} = \frac{1}{2}\sum_{\substack{j,j' \\ j\neq j'}} N_j N_{j'} \left(u_{jj'jj'} \pm u_{jj'j'j}\right) + \frac{1}{2}\sum_j N_j (N_j - 1)\, u_{jjjj}. \tag{8.121}$$
The corollaries of this important result are very different for bosons and fermions. In the former
case, the last term usually dominates, because the matrix elements (115) are typically the largest when
all basis functions coincide. Note that this term allows a very simple interpretation: the number of the
diagonal matrix elements it sums up for each state (j) is just the number of interacting particle pairs
residing in that state.
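Eq. (121) is simple enough to evaluate directly once the matrix elements (115) are known. The sketch below is a hypothetical illustration: the function name, the callable `u` supplying the matrix elements, and the constant test value used for them are all assumptions, not something defined in the text.

```python
def first_order_interaction_energy(occupancies, u, bosons=True):
    """Evaluate Eq. (121) for given occupancies N_j.

    occupancies: list of occupation numbers N_j
    u: callable u(j, jp, l, lp) returning the matrix element u_{jj'll'}
    bosons: True selects the upper (+) sign, False the lower (-) sign
    """
    if not bosons and any(n not in (0, 1) for n in occupancies):
        raise ValueError("fermion occupancies must be 0 or 1")
    sign = 1.0 if bosons else -1.0
    n_states = len(occupancies)
    energy = 0.0
    # First sum: pairs of different states, direct and exchange terms
    for j in range(n_states):
        for jp in range(n_states):
            if jp == j:
                continue
            energy += 0.5 * occupancies[j] * occupancies[jp] * (
                u(j, jp, j, jp) + sign * u(j, jp, jp, j))
    # Second sum: particle pairs residing in the same state
    for j in range(n_states):
        energy += 0.5 * occupancies[j] * (occupancies[j] - 1) * u(j, j, j, j)
    return energy

def constant_u(j, jp, l, lp):
    """Hypothetical model in which every matrix element equals 1."""
    return 1.0

# Three bosons in one state: 3 * 2 / 2 = 3 interacting pairs
print(first_order_interaction_energy([3, 0], constant_u, bosons=True))
# Two fermions in different states: direct and exchange terms cancel in this model
print(first_order_interaction_energy([1, 1], constant_u, bosons=False))
```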
In contrast, for fermions the last term is zero, and the interaction energy is proportional to the
difference of the two terms inside the first parentheses. To spell them out, let us consider the case when
there is no direct spin-orbit interaction. Then the vectors $|\beta_j\rangle$ of the single-particle state basis may be
represented as direct products $|o_j\rangle \otimes |m_j\rangle$ of their orbital and spin-orientation parts. (Here, for the brevity
of notation, I am using m instead of ms.) For spin-½ particles, including electrons, mj may equal
either +½ or –½; in this case the spin part of the first matrix element, proportional to $u_{jj'jj'}$, equals
$$\left(\langle m| \otimes \langle m'|\right)\left(|m\rangle \otimes |m'\rangle\right), \tag{8.122}$$
where, as in the general Eq. (115), the position of a particular state vector in each direct product is
encoding the particle’s number. Since the spins of different particles are defined in different Hilbert
spaces, we may move their state vectors around to get
$$\left(\langle m| \otimes \langle m'|\right)\left(|m\rangle \otimes |m'\rangle\right) = \langle m|m\rangle_1 \times \langle m'|m'\rangle_2 = 1, \tag{8.123}$$
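for any pair of j and j’. On the other hand, the second matrix element, $u_{jj'j'j}$, is factored by
$$\left(\langle m| \otimes \langle m'|\right)\left(|m'\rangle \otimes |m\rangle\right) = \langle m|m'\rangle_1 \times \langle m'|m\rangle_2 = \delta_{mm'}. \tag{8.124}$$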
In this case, it is convenient to rewrite Eq. (121) in the coordinate representation, using single-
particle wavefunctions called spin-orbitals
$$\psi_j(\mathbf{r}) \equiv \langle \mathbf{r}|\beta_j\rangle = \langle \mathbf{r}|o_j\rangle \otimes |m_j\rangle. \tag{8.125}$$
They differ from the spatial parts of the usual orbital wavefunctions of the type (4.233) only in that their
index j should be understood as the set of the orbital-state and the spin-orientation indices.31 Also, due to
the Pauli-principle restriction of numbers Nj to either 0 or 1, Eq. (121) may be also rewritten without the
explicit occupancy numbers, with the understanding that the summation is extended only over the pairs
of occupied states. As a result, it becomes
$$E_{\rm int}^{(1)} = \frac{1}{2}\sum_{\substack{j,j' \\ j\neq j'}} \int d^3r \int d^3r' \left[\psi_j^*(\mathbf{r})\psi_{j'}^*(\mathbf{r}')\, u_{\rm int}(\mathbf{r},\mathbf{r}')\, \psi_j(\mathbf{r})\psi_{j'}(\mathbf{r}') - \psi_j^*(\mathbf{r})\psi_{j'}^*(\mathbf{r}')\, u_{\rm int}(\mathbf{r},\mathbf{r}')\, \psi_{j'}(\mathbf{r})\psi_j(\mathbf{r}')\right]. \tag{8.126}$$
In particular, for a system of two electrons, we may limit the summation to just two states (j, j’ =
1, 2). As a result, we return to Eqs. (39)-(41), with the bottom (minus) sign in Eq. (39), corresponding to
the triplet spin states. Hence, Eq. (126) may be considered as the generalization of the direct and
exchange interaction balance picture to an arbitrary number of orbitals and an arbitrary total number N
of electrons. Note, however, that this formula cannot correctly describe the energy of the singlet spin
states, corresponding to the plus sign in Eq. (39), and also of the entangled triplet states.32 The reason is
that the description of entangled spin states, given in particular by Eqs. (18) and (20), requires linear
superpositions of different Dirac states. (A proof of this fact is left for the reader’s exercise.)
31 The spin-orbitals (125) are also close to spinors (13), except that the former definition takes into account that
the spin s of a single particle is fixed, so that the spin-orbital may be indexed by the spin’s orientation m ≡ ms
only. Also, if an orbital index is used, it should be clearly distinguished from j, i.e. the set of the orbital and spin
indices. This is why I believe that the frequently met notation of spin-orbitals as $\psi_{j,s}(\mathbf{r})$ may lead to confusion.
32 Indeed, due to the condition j’ ≠ j, and Eq. (124), the calculated negative exchange interaction is limited to
electron state pairs with the same spin direction – such as the factorable triplet states (↑↑ and ↓↓) of a two-
electron system, in which the contribution of Eex, given by Eq. (41), to the total energy is also negative.
Now comes a very important fact: the approximate result (126), added to the sum of unperturbed
energies $\varepsilon_j^{(0)}$, equals the sum of exact eigenenergies of the so-called Hartree-Fock equation:33
$$\left(-\frac{\hbar^2}{2m}\nabla^2 + u(\mathbf{r})\right)\psi_j(\mathbf{r}) + \sum_{j'\neq j}\int \left[\psi_{j'}^*(\mathbf{r}')\, u_{\rm int}(\mathbf{r},\mathbf{r}')\, \psi_j(\mathbf{r})\psi_{j'}(\mathbf{r}') - \psi_{j'}^*(\mathbf{r}')\, u_{\rm int}(\mathbf{r},\mathbf{r}')\, \psi_{j'}(\mathbf{r})\psi_j(\mathbf{r}')\right] d^3r' = \varepsilon_j \psi_j(\mathbf{r}), \tag{8.127}$$
where u(r) is the external-field potential acting on each particle separately – see the second of Eqs. (79).
An advantage of this equation in comparison with Eq. (126) is that it allows the (approximate)
calculation of not only the energy spectrum of the system, but also the corresponding spin-orbitals,
taking into account their electron-electron interaction. Of course, Eq. (127) is an integro-differential
rather than just a differential equation. There are, however, efficient methods of numerical solution of
such equations, typically based on iterative methods. One more important practical trick is the exclusion
of the filled internal electron shells (see Sec. 3.7) from the explicit calculations, because the shell states
are virtually unperturbed by the valence electron effects involved in typical atomic phenomena and
chemical reactions. In this approach, the Coulomb field of the shells, described by fixed, pre-calculated,
and tabulated pseudo-potentials, is added to that of the nuclei. This approach dramatically cuts the
computing resources necessary for systems of relatively heavy atoms, enabling a pretty accurate
simulation of electronic and chemical properties of rather complex molecules, with thousands of
electrons.34 As a result, the Hartree-Fock approximation has become the de-facto baseline of all so-
called ab-initio (“first-principle”) calculations in the very important field of quantum chemistry.35
In departures from this baseline, there are two opposite trends. For higher accuracy (and typically
smaller systems), several more complex “post-Hartree-Fock methods”, notably including the configuration
interaction method,36 have been developed.
There is also a strong opposite trend of extending such ab initio (“first-principle”) methods to
larger systems while sacrificing some of the results’ accuracy and reliability. The ultimate limit of this
trend is applicable when the single-particle wavefunction overlaps are small and hence the exchange
interaction is negligible. In this limit, the last term in the square brackets in Eq. (127) may be ignored,
and the multiplier $\psi_j(\mathbf{r})$ taken out of the integral, so that Eq. (127) is reduced to a differential equation –
formally just the Schrödinger equation for a single particle in the following self-consistent effective
potential:
$$u_{\rm ef}(\mathbf{r}) = u(\mathbf{r}) + u_{\rm dir}(\mathbf{r}), \qquad u_{\rm dir}(\mathbf{r}) = \sum_{j'\neq j}\int \psi_{j'}^*(\mathbf{r}')\, u_{\rm int}(\mathbf{r},\mathbf{r}')\, \psi_{j'}(\mathbf{r}')\, d^3r'. \tag{8.128}$$
This is the so-called Hartree approximation – that gives reasonable results for some systems,37
especially those with low electron density.
33 This equation was suggested in 1929 by Douglas Hartree for the direct interaction and extended to the
exchange interaction by Vladimir Fock in 1930. To verify its compliance with Eq. (126), it is sufficient to
multiply all terms of Eq. (127) by $\psi_j^*(\mathbf{r})$, integrate them over all r-space (so that the right-hand side would give
$\varepsilon_j$), and then sum these single-particle energies over all occupied states j.
34 For condensed-matter systems, this and other computational methods are applied to single elementary
spatial cells, with a limited number of electrons in them, using cyclic boundary conditions.
35 See, e.g., A. Szabo and N. Ostlund, Modern Quantum Chemistry, Revised ed., Dover, 1996.
36 That method, in particular, allows the calculation of proper linear superpositions of the Dirac states (such as the
entangled states for N = 2, discussed above) which are missing in the generic Hartree-Fock approach – see, e.g.,
the just-cited monograph by Szabo and Ostlund.
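The word “self-consistent” may be illustrated by a toy iteration. The sketch below is not an atomic Hartree-Fock code: it assumes a hypothetical two-site model (in the spirit of the Hubbard model mentioned above) with site energies e1 and e2, tunneling amplitude t, and on-site repulsion u, occupied by two electrons with opposite spins in the same orbital. Each electron then moves in an effective site potential proportional to the occupation probability created by the other one, and since the spins are opposite, there is no exchange term. All names and parameter values are assumptions made for the illustration.

```python
import math

def ground_state_density(a, b, t):
    """Site probabilities (n1, n2) in the lowest eigenstate of [[a, -t], [-t, b]]."""
    d = 0.5 * (a - b)
    root = math.sqrt(d * d + t * t)
    s = d + root  # equals a - E_ground; positive for t != 0
    # The unnormalized ground-state eigenvector is (t, s)
    n1 = t * t / (t * t + s * s)
    return n1, 1.0 - n1

def self_consistent_hartree(e1, e2, t, u, mixing=0.5, tol=1e-12, max_iter=1000):
    """Iterate the site densities until the effective potential reproduces them."""
    n1, n2 = 0.5, 0.5  # initial guess: uniform density
    for iteration in range(1, max_iter + 1):
        # Effective site energies: bare energy plus the mean repulsion field
        a = e1 + u * n1
        b = e2 + u * n2
        new_n1, new_n2 = ground_state_density(a, b, t)
        if abs(new_n1 - n1) < tol:
            return new_n1, new_n2, iteration
        # Linear mixing of old and new densities damps the iteration
        n1 = (1.0 - mixing) * n1 + mixing * new_n1
        n2 = 1.0 - n1
    raise RuntimeError("self-consistency not reached")

# Hypothetical parameters, in units of the tunneling amplitude t
n1_free, _ = ground_state_density(-0.5, 0.5, 1.0)   # no interaction
n1_scf, n2_scf, steps = self_consistent_hartree(-0.5, 0.5, 1.0, 2.0)
print(n1_free, n1_scf, steps)
```

In this toy model the repulsion pushes the density toward the uniform distribution, so the self-consistent occupation of the lower site is smaller than the non-interacting one.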
However, in dense electron systems (such as typical atoms, molecules, and condensed matter)
the exchange interaction, described by the second term in the square brackets of Eqs. (126)-(127), may
be as high as ~30% of the direct interaction, and frequently cannot be ignored. The tendency of taking
this interaction in the simplest possible form is currently dominated by the Density Functional Theory,38
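universally known by its acronym DFT. In this approach, the equation solved for each eigenfunction
$\psi_j(\mathbf{r})$ is a differential, Schrödinger-like Kohn-Sham equation
$$\left[-\frac{\hbar^2}{2m}\nabla^2 + u(\mathbf{r}) + u_{\rm dir}^{\rm KS}(\mathbf{r}) + u_{\rm xc}(\mathbf{r})\right]\psi_j(\mathbf{r}) = \varepsilon_j \psi_j(\mathbf{r}), \tag{8.129}$$
where
$$u_{\rm dir}^{\rm KS}(\mathbf{r}) = -e\phi(\mathbf{r}), \qquad \phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int d^3r'\, \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}, \qquad \rho(\mathbf{r}) = -e\,n(\mathbf{r}), \tag{8.130}$$
and n(r) is the total electron density in a particular point, calculated as
$$n(\mathbf{r}) \equiv \sum_j \psi_j^*(\mathbf{r})\psi_j(\mathbf{r}). \tag{8.131}$$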
The most important feature of the Kohn-Sham Hamiltonian is the simplified description of the
exchange and correlation effects by the effective exchange-correlation potential uxc(r). This potential is
calculated in various approximations, most of them valid only in the limit when the number of electrons
in the system is very high. The simplest of them (proposed by Kohn et al. in the 1960s) is the Local
Density Approximation (LDA) in which the effective exchange potential at each point is a function only
of the electron density n at the same point r, taken from the theory of a uniform gas of free electrons.39
However, for many tasks of quantum chemistry, the accuracy given by the LDA is insufficient, because
inside molecules the density n typically changes very fast. As a result, DFT has become widely accepted
in that field only after the introduction, in the 1980s, of more accurate, though more cumbersome
models for uxc(r), notably the so-called Generalized Gradient Approximations (GGAs). Due to its
relative simplicity, DFT enables the calculation, with the same computing resources and reasonable
precision, of some properties of much larger systems than the methods based on the Hartree-Fock theory.
As a result, it has become a very popular tool of ab initio calculations. This popularity is enhanced by
the availability of several advanced DFT software packages, some of them in the public domain.
37 An extreme example of the Hartree approximation is the Thomas-Fermi model of heavy atoms (with Z >> 1), in
which atomic electrons, at each distance r from the nucleus, are treated as an ideal, uniform Fermi gas, with a
certain density n(r) corresponding to the local value $u_{\rm ef}(r)$, but a global value of their highest full single-particle
energy, $\varepsilon = 0$, to ensure the equilibrium. (The analysis of this model is left for the reader’s exercise.)
38 It was developed by Walter Kohn and his associates (notably Pierre Hohenberg) in 1965-66, and
eventually (in 1998) was marked with a Nobel Prize in Chemistry for W. Kohn.
39 Just for the reader’s reference: for a uniform, degenerate Fermi-gas of electrons (with the Fermi energy $\varepsilon_{\rm F} \gg k_{\rm B}T$),
the most important, exchange part $u_{\rm x}$ of $u_{\rm xc}$ may be calculated analytically: $u_{\rm x} = -(3/4\pi)e^2 k_{\rm F}/4\pi\varepsilon_0$, where the
Fermi momentum $k_{\rm F} = (2m_{\rm e}\varepsilon_{\rm F})^{1/2}/\hbar$ is defined by the electron density: $n = 2(4\pi/3)k_{\rm F}^3/(2\pi)^3 \equiv k_{\rm F}^3/3\pi^2$.
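To get a feeling for the magnitudes involved, the uniform-gas relations quoted in the footnote above may be evaluated numerically. This is only a sketch under stated assumptions: the formulas are used as read from that footnote, the function names are hypothetical, and the electron density is an illustrative value of the order of that of conduction electrons in a typical metal.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # electric constant, F/m

def fermi_wave_number(n):
    """k_F (in 1/m) from the electron density n (in 1/m^3): n = k_F^3 / (3 pi^2)."""
    return (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)

def exchange_potential(n):
    """u_x = -(3 / 4 pi) e^2 k_F / (4 pi epsilon_0), in joules."""
    k_f = fermi_wave_number(n)
    return -(3.0 / (4.0 * math.pi)) * E_CHARGE ** 2 * k_f / (4.0 * math.pi * EPSILON_0)

n_example = 8.5e28  # assumed illustrative density, 1/m^3
k_f = fermi_wave_number(n_example)
u_x_ev = exchange_potential(n_example) / E_CHARGE  # converted to electron-volts
print(k_f, u_x_ev)
```

With these assumptions, the exchange potential comes out at a few electron-volts, i.e. comparable with typical Fermi energies of metals.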
Please note, however, that despite this undisputable success, this approach has its problems.
From my personal point of view, the most offensive of them is the implicit assumption of the unphysical
Coulomb interaction of an electron with itself (by dropping, on the way from Eq. (128) to Eq. (130), the
condition j’ ≠ j at the calculation of $u_{\rm dir}^{\rm KS}$). As a result, for a reasonable description of some effects, the
available DFT packages are either inapplicable at all or require substantial artificial tinkering.40
Unfortunately, because of lack of time/space, for details I have to refer the interested reader to
specialized literature.41
8.5. Quantum computation and cryptography

Now I have to review the emerging fields of quantum computation and encryption. (Since these
fields are closely related, they are often referred to under the common title of “quantum information
science”, though this term is somewhat misleading, de-emphasizing physical aspects of the topic.) These
fields are currently the subject of intensive research and development efforts, which have already brought,
besides an enormous body of hype, some results of general importance. My coverage, by necessity
short, will focus on these results, referring the reader interested in details to special literature.42 Because
of the very active stage of the fields, I will also provide quite a few references to recent publications,
making the style of this section closer to a brief literature review than to a textbook’s section.
Presently, most work on quantum computation and encryption is based on systems of spatially
separated (and hence distinguishable) two-level systems – in this context, universally called qubits.43
Due to this distinguishability, the issues that were the focus of the first sections of this chapter, including
the second quantization approach, are irrelevant here. On the other hand, systems of qubits have some
interesting properties that have not been discussed in this course yet.
First of all, a system of N >> 1 qubits may contain much more information than the same number
of N classical bits. Indeed, according to the discussions in Chapter 4 and Sec. 5.1, an arbitrary pure state
of a single qubit may be represented by its ket vector (4.37) – see also Eq. (5.1):
$$|\alpha\rangle_{N=1} = \alpha_1 |u_1\rangle + \alpha_2 |u_2\rangle, \tag{8.132}$$
where {uj} is any orthonormal two-state basis. It is natural and common to employ, as uj, the eigenstates
aj of the observable A that is eventually measured in the particular physical implementation of the qubit
– say, a certain Cartesian component of spin-½. It is also common to write the kets of these base states
as $|0\rangle$ and $|1\rangle$, so that Eq. (132) takes the form
$$|\alpha\rangle_{N=1} = a_0 |0\rangle + a_1 |1\rangle \equiv \sum_{j=0,1} a_j |j\rangle. \tag{8.133}$$
40 As just a few examples, see N. Simonian et al., J. Appl. Phys. 113, 044504 (2013); M. Medvedev et al., Science
355, 49 (2017); A. Hutama et al., J. Phys. Chem. C 121, 14888 (2017).
41 See, e.g., either the monograph by R. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules,
Oxford U. Press, 1994, or the later textbook J. A. Steckel and D. Sholl, Density Functional Theory: Practical
Introduction, Wiley, 2009. For a popular review and references to more recent work in this still-developing field,
see A. Zangwill, Phys. Today 68, 34 (July 2015).
42 Despite the recent flood of new books on the field, one of its first surveys, by M. Nielsen and I. Chuang,
Quantum Computation and Quantum Information, Cambridge U. Press, 2000, is perhaps still the best one.
43 In some texts, the term qubit (or “Qbit”, or “Q-bit”) is used instead for the information contents of a two-level
system – very much like the classical bit of information (in this context, frequently called “Cbit” or “C-bit”)
describes the information contents of a classical bistable system – see, e.g., SM Sec. 2.2.
(Here, and in the balance of this section, the letter j is used to denote an integer equal to either 0 or 1.)
According to this relation, any state $\alpha$ of a qubit is completely defined by two complex c-numbers $a_j$,
i.e. by 4 real numbers. Moreover, due to the normalization condition $|a_0|^2 + |a_1|^2 = 1$, we need just 3
independent real numbers – say, the Bloch sphere coordinates $\theta$ and $\varphi$ (see Fig. 5.3), plus the common
phase $\gamma$, which becomes important only when we consider coherent states of a several-qubit system.
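The counting of three real parameters may be made explicit by a small sketch. It assumes one common convention for the Bloch-sphere parametrization (polar angle theta, azimuthal angle phi, common phase gamma); the function name and this particular convention are assumptions of the example, not taken from the text.

```python
import cmath
import math

def qubit_amplitudes(theta, phi, gamma=0.0):
    """Amplitudes (a0, a1) of a single-qubit pure state.

    Assumed convention: a0 = exp(i gamma) cos(theta / 2),
                        a1 = exp(i gamma) exp(i phi) sin(theta / 2).
    """
    common = cmath.exp(1j * gamma)
    a0 = common * math.cos(theta / 2.0)
    a1 = common * cmath.exp(1j * phi) * math.sin(theta / 2.0)
    return a0, a1

a0, a1 = qubit_amplitudes(theta=1.1, phi=0.7, gamma=0.3)
norm = abs(a0) ** 2 + abs(a1) ** 2
# The common phase gamma drops out of the single-qubit probabilities
b0, b1 = qubit_amplitudes(theta=1.1, phi=0.7, gamma=2.5)
print(norm, abs(a0) ** 2, abs(b0) ** 2)
```

In this parametrization, the normalization condition is satisfied automatically for any values of the three parameters.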
This is a good time to note that a qubit is very much different from any classical bistable system
used to store single bits of information – such as two possible voltage states of the usual SRAM cell
(essentially, a positive-feedback loop of two transistor-based inverters). Namely, the stationary states of
a classical bistable system, due to its nonlinearity, are stable with respect to small perturbations, so that
they may be very robust to unintentional interaction with their environment. In contrast, the qubit’s state
may be disturbed (i.e. its representation point on the Bloch sphere shifted) by even minor perturbations,
because it does not have such an internal state stabilization mechanism.44 Due to this reason, qubit-based
systems are rather vulnerable to environment-induced drifts, including the dephasing and relaxation
discussed in the previous chapter, creating major experimental challenges – see below.
Now, if we have a system of 2 qubits, the vectors of its arbitrary pure state may be represented as
a sum of 2² = 4 terms,45
$$|\alpha\rangle_{N=2} = a_{00}|00\rangle + a_{01}|01\rangle + a_{10}|10\rangle + a_{11}|11\rangle \equiv \sum_{j_1,j_2=0,1} a_{j_1 j_2}|j_1 j_2\rangle, \tag{8.134}$$
with four complex coefficients, i.e. eight real numbers, subject to just one normalization condition,
which follows from the requirement $\langle\alpha|\alpha\rangle = 1$:
$$\sum_{j_1,j_2=0,1} \left|a_{j_1 j_2}\right|^2 = 1. \tag{8.135}$$
The evident generalization of Eqs. (133)-(134) to an arbitrary pure state of an N-qubit system is
a sum of $2^N$ terms:
$$|\alpha\rangle_N = \sum_{j_1,j_2,\ldots,j_N=0,1} a_{j_1 j_2 \ldots j_N}|j_1 j_2 \ldots j_N\rangle, \tag{8.136}$$
including all possible combinations of 0s and 1s for indices j, so that the state is fully described by $2^N$
complex numbers, i.e. $2\cdot 2^N \equiv 2^{N+1}$ real numbers, with only one constraint, similar to Eq. (135), imposed
by the normalization condition. Let me emphasize that this exponential growth of the information
contents would not be possible without the qubit state entanglement. Indeed, in the particular case when
qubit states are not entangled, i.e. are factorable:
$$|\alpha\rangle_N = |\alpha_1\rangle|\alpha_2\rangle \ldots |\alpha_N\rangle, \tag{8.137}$$
where each $|\alpha_n\rangle$ is described by an equality similar to Eq. (133) with its individual expansion
coefficients, the system state description requires only 3N – 1 real numbers – e.g., N sets {$\theta$, $\varphi$, $\gamma$} less
one common phase.
44 In this aspect as well, the information processing systems based on qubits are closer to classical analog
computers (which were popular once, but nowadays are used for a few special applications only) than to
classical digital ones.
45 Here and in most instances below I use the same shorthand notation as was used at the beginning of this chapter
– cf. Eq. (1b). In this short form, qubit’s number is coded by the order of its state index inside a full ket-vector,
while in the long form, such as in Eq. (137), it is coded by the order of its single-qubit vector in a full direct
product.
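The difference between the two counts, and the notion of factorability itself, may be illustrated by a short sketch for N = 2. The function names are hypothetical; the factorability test uses the fact that a two-qubit pure state is a direct product if and only if its 2×2 matrix of coefficients has zero determinant.

```python
import math

def real_parameters_general(n_qubits):
    """Real numbers describing an arbitrary pure state (136): 2^(N+1)."""
    return 2 ** (n_qubits + 1)

def real_parameters_factorable(n_qubits):
    """Real numbers describing a factorable state (137), as counted in the text: 3N - 1."""
    return 3 * n_qubits - 1

def product_state(q1, q2):
    """Amplitudes (a00, a01, a10, a11) of the direct product of two single-qubit states."""
    return tuple(x * y for x in q1 for y in q2)

def is_factorable(a00, a01, a10, a11, tol=1e-12):
    """A two-qubit pure state is factorable iff a00 * a11 - a01 * a10 = 0."""
    return abs(a00 * a11 - a01 * a10) < tol

r = 1.0 / math.sqrt(2.0)
unentangled = product_state((r, r), (1.0, 0.0))  # (|0> + |1>)|0> / sqrt(2)
entangled = (r, 0.0, 0.0, r)                     # (|00> + |11>) / sqrt(2)

print(real_parameters_general(10), real_parameters_factorable(10))
print(is_factorable(*unentangled), is_factorable(*entangled))
```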
However, it would be wrong to project this exponential growth of information contents directly
on the capabilities of quantum computation, because this process has to include the output information
readout, i.e. qubit state measurements. Due to the fundamental intrinsic uncertainty of quantum systems,
the measurement of a single qubit even in a pure state (133) generally may give either of two results,
with probabilities $W_0 = |a_0|^2$ and $W_1 = |a_1|^2$. To comply with the general notion of computation, any
quantum computer has to provide certain (or virtually certain) results, and hence the probabilities Wj
have to be very close to either 0 or 1, so that before the measurement, each measured qubit has to be in a
basis state – either 0 or 1. This means that the computational system with N output qubits, just before the
final readout, has to be in one of the factorable states
$$|\alpha\rangle_N = |j_1\rangle|j_2\rangle \ldots |j_N\rangle \equiv |j_1 j_2 \ldots j_N\rangle, \tag{8.138}$$
which is a very small subset even of the set of all unentangled states (137), and whose maximum
information contents is just N classical bits.
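The readout constraint may be illustrated by listing the measurement probabilities of a given N-qubit state. The sketch below assumes that the amplitudes of Eq. (136) are stored as a list in the binary order of their indices; the function name and this storage convention are assumptions of the example.

```python
import math

def measurement_probabilities(amplitudes):
    """Map each bit string j1 j2 ... jN to the probability |a_{j1 j2 ... jN}|^2.

    amplitudes: list of 2^N complex numbers, ordered as 00...0, 00...1, ..., 11...1.
    """
    n_states = len(amplitudes)
    n_qubits = n_states.bit_length() - 1
    if n_qubits < 1 or 2 ** n_qubits != n_states:
        raise ValueError("the number of amplitudes must be 2^N with N >= 1")
    return {format(index, "0{}b".format(n_qubits)): abs(a) ** 2
            for index, a in enumerate(amplitudes)}

r = 1.0 / math.sqrt(2.0)
# A basis state of the type (138): the readout is certain
certain = measurement_probabilities([0.0, 0.0, 1.0, 0.0])
# A superposition: two different readouts, each with probability 1/2
uncertain = measurement_probabilities([r, 0.0, 0.0, r])
print(certain, uncertain)
```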
Now the reader may start thinking that this constraint strips quantum computations of any
advantages over their classical counterparts, but such a view is also superficial. To show that, let us
consider the scheme of the most actively explored type of quantum computation, shown in Fig. 3.46
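[Figure 8.3: N horizontal qubit lines (qubit 1, qubit 2, …, qubit N). From left to right: the classical bits $(j_1)_{\rm in}, (j_2)_{\rm in}, \ldots, (j_N)_{\rm in}$ of the input number; qubit state preparation, giving the kets $|(j_n)_{\rm in}\rangle$ that form $|\alpha\rangle_{\rm in}$; the unitary transform $\hat U$; the kets $|(j_n)_{\rm out}\rangle$ that form $|\alpha\rangle_{\rm out}$; qubit state measurement, giving the classical bits $(j_1)_{\rm out}, (j_2)_{\rm out}, \ldots, (j_N)_{\rm out}$ of the output number.]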
Fig. 8.3. The baseline scheme of quantum computation.
Here each horizontal line (sometimes called a “wire”47) corresponds to a single qubit, tracing its
time evolution in the same direction as at the usual time function plots: from left to right. This means
that the left column $|\alpha\rangle_{\rm in}$ of ket-vectors describes the initial state of the qubits,48 while the right column
$|\alpha\rangle_{\rm out}$ describes their final (but pre-measurement) state. The box labeled U represents the qubit evolution
in time due to their specially arranged interactions between each other and/or external drive “forces”.
Besides these forces, during this evolution the system is supposed to be ideally isolated from the
dephasing and energy-dissipating environment, so that the process may be described by a unitary
operator defined in the $2^N$-dimensional Hilbert space of N qubits:
$$|\alpha\rangle_{\rm out} = \hat U |\alpha\rangle_{\rm in}. \tag{8.139}$$
46 Numerous modifications of this “baseline” scheme have been suggested, for example with the number of output
qubits different from that of input qubits, etc. Some other options are discussed at the end of this section.
47 The notion of “wires” stems from the similarity between such quantum schemes and the drawings describing
classical computation circuits – see, e.g., Fig. 4a below. In the classical case, the lines may be indeed understood
as physical wires connecting physical devices: logic gates and/or memory cells. In this context, note that classical
computer components also have non-zero time delays, so that even in this case the left-to-right device ordering is
useful to indicate the timing of (and frequently the causal relation between) the signals.
With the condition that the input and output states have the simple form (138), this equality reads
$$|j_1\rangle_{\rm out} |j_2\rangle_{\rm out} \ldots |j_N\rangle_{\rm out} = \hat{U} |j_1\rangle_{\rm in} |j_2\rangle_{\rm in} \ldots |j_N\rangle_{\rm in}. \tag{8.140}$$
The art of quantum computer design consists of selecting such unitary operators Û that would:
- satisfy Eq. (140),
- be physically implementable, and
- enable substantial performance advantages of the quantum computation over its classical
counterparts with similar functionality, at least for some digital functions (algorithms).
I will have time/space to demonstrate the possibility of such advantages on just one, perhaps the
simplest example – the so-called Deutsch problem,49 on the way discussing several common notions and
issues of this field. Let us consider the family of single-bit classical Boolean functions $j_{\rm out} = f(j_{\rm in})$. Since
both j are Boolean variables, i.e. may take only values 0 and 1, there are evidently only 4 such functions
– see the first four columns of the following table:

f     f(0)   f(1)   class      F    f(1) - f(0)
f1    0      0      constant   0     0
f2    0      1      balanced   1    +1            (8.141)
f3    1      0      balanced   1    -1
f4    1      1      constant   0     0

Of them, the functions f1 and f4, whose values are independent of their arguments, are called constants,
while the functions f2 (called “YES” or “IDENTITY”) and f3 (“NOT” or “INVERSION”) are called
balanced. The Deutsch problem is to determine the class of a single-bit function, implemented in a
“black box”, as being either constant or balanced, using just one experiment.
48 As was discussed in Chapter 7, the preparation of a pure state (133) is (conceptually :-) straightforward. Placing
a qubit into a weak contact with an environment of temperature $T \ll \Delta/k_{\rm B}$, where $\Delta$ is the difference between
energies of the eigenstates 0 and 1, we may achieve its relaxation into the lowest-energy state. Then, if the qubit
must be set into a different pure state, it may be driven there by the application of a pulse of a proper external
classical “force”. For example, if an actual spin-½ is used as the qubit, a pulse of a magnetic field, with proper
direction and duration, may be applied to arrange its precession to the required Bloch sphere point – see Fig. 5.3c.
However, in most physical implementations of qubits, a more practicable way for that step is to use a proper part
of the Rabi oscillation period – see Sec. 6.5.
49 It is named after David Elieser Deutsch, whose 1985 paper (motivated by an inspirational but not very specific
publication by Richard Feynman in 1982) launched the whole field of quantum computation.
Classically, this is clearly impossible, and the simplest way to perform the function’s
classification involves two similar black boxes f – see Fig. 4a.50 It also uses the so-called exclusive-OR
(XOR for short) gate whose output is described by the following function F of its two Boolean
arguments j1 and j2:51
$$F(j_1, j_2) \equiv j_1 \oplus j_2 \equiv \begin{cases} 0, & \text{if } j_1 = j_2, \\ 1, & \text{if } j_1 \neq j_2. \end{cases} \tag{8.142}$$
In the particular circuit shown in Fig. 4a, the gate produces the following output:
$$F = f(0) \oplus f(1), \tag{8.143}$$
which is equal to 1 if $f(0) \neq f(1)$, i.e. if the function f is balanced, and to 0 in the opposite case – see
column F in the table of Eq. (141).
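
As a small illustration of the classical scheme of Fig. 4a, the table of Eq. (141) and the XOR output of Eq. (143) may be sketched in Python. The names `FUNCTIONS` and `classify_classically` are introduced here only for illustration; each function f is represented, by assumption, as the pair (f(0), f(1)).

```python
# The four single-bit Boolean functions of Eq. (8.141), stored as (f(0), f(1)).
FUNCTIONS = {"f1": (0, 0), "f2": (0, 1), "f3": (1, 0), "f4": (1, 1)}


def classify_classically(f):
    """Return F = f(0) XOR f(1), Eq. (8.143): 0 for constant f, 1 for balanced f.

    Note that both values f(0) and f(1) are read, i.e. the black box
    has to be used twice (or two copies of it are needed).
    """
    return f[0] ^ f[1]


for name, f in FUNCTIONS.items():
    label = "balanced" if classify_classically(f) else "constant"
    print(name, f, label)
```

The point of the sketch is only that the classical classification needs two evaluations of f, in contrast with the single-shot requirement of the Deutsch problem.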
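[Figure: (a) classical circuit: two copies of the black box f, fed with the inputs 0 and 1, send their outputs f(0) and f(1) to an XOR gate with the output F; (b) quantum circuit: the source qubit (prepared in the state $|0\rangle$) and the target qubit (prepared in the state $|1\rangle$) each pass a Hadamard gate H, then the two-qubit gate f, then one more Hadamard gate H each; the state labels show $\frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$ after the first gates, $\frac{1}{\sqrt{2}}(|0\rangle + (-1)^F |1\rangle)$ and $\pm\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$ after the gate f, and $|F\rangle$ and $\pm|1\rangle$ at the output.]
Fig. 8.4. The simplest (a) classical and (b) quantum ways to classify a single-bit Boolean function f.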
On the other hand, as will be shown below, any of the four functions f may be implemented
quantum-mechanically, for example (Fig. 5a) as a unitary transform of two input qubits, acting as
follows on each basis component $|j_1 j_2\rangle \equiv |j_1\rangle |j_2\rangle$ of the general input state (134):
$$\hat{f} |j_1 j_2\rangle = |j_1\rangle |j_2 \oplus f(j_1)\rangle, \tag{8.144}$$
where f is the corresponding classical Boolean function – see the table in Eq. (141).
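[Figure: (a) gate f: the upper (source) wire carries $j_1$ unchanged, while the lower (target) wire turns $j_2$ into $j_2 \oplus f(j_1)$; (b) gate C: the upper wire carries $j_1$ unchanged, while the lower wire turns $j_2$ into $j_2 \oplus j_1$.]
Fig. 8.5. Two-qubit quantum gates: (a) a two-qubit function f and (b) its particular case C (CNOT), and their actions on a basis state.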
In the particular case when f in Eq. (144) is just the YES function: f(j) = f2(j) = j, this “circuit” is
reduced to the so-called CNOT gate, a key ingredient of many other quantum computation schemes,
performing the following two-qubit transform:
50 Alternatively, we may perform two sequential experiments on the same black box f, first recording, and then
recalling the first experiment’s result. However, the Deutsch problem calls for a single-shot experiment.
51 The XOR sign $\oplus$ should not be confused with the sign $\otimes$ of the direct product of state vectors (which
in this section is just implied).
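$$\hat{C} |j_1 j_2\rangle = |j_1\rangle |j_2 \oplus j_1\rangle. \tag{8.145a}$$
Let us use Eq. (142) to spell out this function for all four possible input qubit combinations:
$$\hat{C} |00\rangle = |00\rangle, \quad \hat{C} |01\rangle = |01\rangle, \quad \hat{C} |10\rangle = |11\rangle, \quad \hat{C} |11\rangle = |10\rangle. \tag{8.145b}$$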
In plain English, this means that acting on a basis state j1j2, the CNOT gate leaves the state of the first,
source qubit (shown by the upper horizontal line in Fig. 5) intact, but flips the state of the second, target
qubit if the first one is in the basis state 1. In even simpler words, the state j1 of the source qubit controls
the NOT function acting on the target qubit; hence the gate’s name CNOT – the semi-acronym of
“Controlled NOT”.
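
The action of the gates f and C on the basis states may be sketched as follows. This is only an illustration under stated assumptions: a basis state is represented by the pair of bits (j1, j2), a function f by the pair (f(0), f(1)), and the names `f_gate`, `cnot`, and `BASIS` are hypothetical.

```python
BASIS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # all basis states (j1, j2)


def f_gate(j1, j2, f):
    """Eq. (8.144): (j1, j2) -> (j1, j2 XOR f(j1)), with f given as (f(0), f(1))."""
    return j1, j2 ^ f[j1]


def cnot(j1, j2):
    """Eq. (8.145a): the particular case f(j) = j, i.e. f = (0, 1)."""
    return f_gate(j1, j2, (0, 1))


# Truth table of the CNOT gate, to be compared with Eq. (8.145b).
cnot_table = {state: cnot(*state) for state in BASIS}
print(cnot_table)

# For each of the four functions f, the gate only permutes the basis states;
# a permutation of an orthonormal basis is a unitary transform.
for f in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    images = sorted(f_gate(j1, j2, f) for j1, j2 in BASIS)
    print(f, images == BASIS)
```

The permutation check is the reason why Eq. (144) may describe a physically realistic (unitary) gate even for the constant functions f, which are not invertible as classical single-bit maps.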
For the quantum function (144), with an arbitrary and unknown f, the Deutsch problem may be
solved within the general scheme shown in Fig. 3, with the particular structure of the unitary-transform
box U spelled out in Fig. 4b, which involves just one implementation of the function f. Here the single-qubit
quantum gate H performs the Hadamard (or “Walsh-Hadamard” or “Walsh”) transform,52 whose
operator is defined by the following actions on the qubit’s basis states:
$$\hat{\mathcal{H}} |0\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right), \quad \hat{\mathcal{H}} |1\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right), \tag{8.146}$$
- see also the two leftmost state label columns in Fig. 4b.53 Since this operator has to be linear (to be
quantum-mechanically realistic), it needs to perform the action (146) on the basis states even when they
are parts of a linear superposition – as they are, for example, for the two right Hadamard gates in Fig.
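4b. For example, as immediately follows from Eqs. (146) and the operator’s linearity,
$$\hat{\mathcal{H}} \left( \hat{\mathcal{H}} |0\rangle \right) = \hat{\mathcal{H}} \left[ \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) \right] = \frac{1}{\sqrt{2}} \left( \hat{\mathcal{H}} |0\rangle + \hat{\mathcal{H}} |1\rangle \right) = \frac{1}{\sqrt{2}} \left[ \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) + \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) \right] = |0\rangle. \tag{8.147a}$$
Absolutely similarly, we may get54
$$\hat{\mathcal{H}} \left( \hat{\mathcal{H}} |1\rangle \right) = |1\rangle. \tag{8.147b}$$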
Now let us carry out a sequential analysis of the “circuit” shown in Fig. 4b. Since the input states
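of the gate f in this particular circuit are described by Eqs. (146), its output state’s ket is
$$\hat{f} \left( \hat{\mathcal{H}} |0\rangle \, \hat{\mathcal{H}} |1\rangle \right) = \hat{f} \left[ \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) \right] = \frac{1}{2} \left( \hat{f} |00\rangle - \hat{f} |01\rangle + \hat{f} |10\rangle - \hat{f} |11\rangle \right). \tag{8.148}$$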
Now we may apply Eq. (144) to each component in the parentheses:
52 Named after mathematicians J. Hadamard (1865-1963) and J. Walsh (1895-1973). To avoid any chance of
confusion between the Hadamard transform’s operator $\hat{\mathcal{H}}$ and the general Hamiltonian operator $\hat{H}$, in these
notes they are typeset using different fonts.
53 Note that according to Eq. (146), the operator $\hat{\mathcal{H}}$ does not belong to the class of transforms $\hat{U}$ described by Eq.
(140) – while the whole “circuit” shown in Fig. 4b, does – see below.
54 Since the states 0 and 1 form a full basis of a single qubit, both Eqs. (147) may be summarized as an operator
equality: $\hat{\mathcal{H}}^2 = \hat{I}$. It is also easy to verify that the Hadamard transform of an arbitrary state may be represented
on the Bloch sphere (Fig. 5.3) as a $\pi$-rotation about the direction that bisects the angle between the x- and z-axes.
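$$\begin{aligned} \hat{f} |00\rangle - \hat{f} |01\rangle + \hat{f} |10\rangle - \hat{f} |11\rangle &\equiv \hat{f} |0\rangle |0\rangle - \hat{f} |0\rangle |1\rangle + \hat{f} |1\rangle |0\rangle - \hat{f} |1\rangle |1\rangle \\ &= |0\rangle |0 \oplus f(0)\rangle - |0\rangle |1 \oplus f(0)\rangle + |1\rangle |0 \oplus f(1)\rangle - |1\rangle |1 \oplus f(1)\rangle \\ &= |0\rangle \left( |0 \oplus f(0)\rangle - |1 \oplus f(0)\rangle \right) + |1\rangle \left( |0 \oplus f(1)\rangle - |1 \oplus f(1)\rangle \right). \end{aligned} \tag{8.149}$$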
Note that the content of the first parentheses of the last expression, characterizing the state of the target
qubit, is equal to $(|0\rangle - |1\rangle) \equiv (-1)^0 (|0\rangle - |1\rangle)$ if f(0) = 0 (and hence $0 \oplus f(0) = 0$ and $1 \oplus f(0) = 1$), and to
$(|1\rangle - |0\rangle) \equiv (-1)^1 (|0\rangle - |1\rangle)$ in the opposite case f(0) = 1, so that both cases may be described in one shot by
rewriting the parentheses as $(-1)^{f(0)} (|0\rangle - |1\rangle)$. The second parentheses are controlled in absolutely the same way
by the value of f(1), so that the outputs of the gate f are unentangled:
$$\hat{f} \left( \hat{\mathcal{H}} |0\rangle \, \hat{\mathcal{H}} |1\rangle \right) = \frac{1}{2} \left[ (-1)^{f(0)} |0\rangle + (-1)^{f(1)} |1\rangle \right] \left( |0\rangle - |1\rangle \right) = \pm \frac{1}{\sqrt{2}} \left( |0\rangle + (-1)^F |1\rangle \right) \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right), \tag{8.150}$$
where the last step has used the fact that the classical Boolean function F, defined by Eq. (142), is equal
to $\pm[f(1) - f(0)]$ – please compare the last two columns in Eq. (141). The front sign $\pm$ in Eq. (150), equal to $(-1)^{f(0)}$, may
be prescribed to any of the component ket-vectors – for example to that of the target qubit, as shown by
the third column of state labels in Fig. 4b.
This intermediate result is already rather remarkable. Indeed, it shows that, despite the
superficial impression one could get from Fig. 5, the gates f and C, being “controlled” by the source
qubit, may change that qubit’s state as well! This fact (partly reflected by the vertical direction of the
control lines in Figs. 4 and 5, symbolizing the same stage of the system’s time evolution) shows how
careful one should be interpreting quantum-computational “circuits”, thriving on qubits’ entanglement,
because the “signals” on different sections of a “wire” may differ – see Fig. 4b again.
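At the last stage of the circuit shown in Fig. 4b, the qubit components of the state (150) are fed
into one more pair of Hadamard gates, whose outputs therefore are
$$\hat{\mathcal{H}} \left[ \frac{1}{\sqrt{2}} \left( |0\rangle + (-1)^F |1\rangle \right) \right] = \frac{1}{\sqrt{2}} \left( \hat{\mathcal{H}} |0\rangle + (-1)^F \hat{\mathcal{H}} |1\rangle \right), \quad \text{and} \quad \hat{\mathcal{H}} \left[ \pm \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) \right] = \pm \frac{1}{\sqrt{2}} \left( \hat{\mathcal{H}} |0\rangle - \hat{\mathcal{H}} |1\rangle \right). \tag{8.151}$$
Now using Eqs. (146) again, we see that the output state ket-vectors of the source and target qubits are,
respectively,
$$\frac{1 + (-1)^F}{2} |0\rangle + \frac{1 - (-1)^F}{2} |1\rangle, \quad \text{and} \quad \pm |1\rangle. \tag{8.152}$$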
Since, according to Eq. (142), the Boolean function F may take only values 0 or 1, the final state of the
source qubit is always one of its basis states j, namely the one with j = F. Its measurement tells us
whether the function f, participating in Eq. (144), is constant or balanced – see Eq. (141) again.55
Thus, the quantum circuit shown in Fig. 4b indeed solves the Deutsch problem in one shot.
Reviewing our analysis, we may see that this is possible because the unitary transform performed by the
quantum gate f is applied to the superposition states (146) rather than to the basis states. Due to this trick,
the quantum state components depending on f(0) and f(1) are processed simultaneously, in parallel. This
55 Note that the last Hadamard transform of the target qubit (i.e. the Hadamard gate shown in the lower right
corner of Fig. 4b) is not necessary for the Deutsch problem’s solution – though it should be included if we want
the whole circuit to satisfy the condition (140).
quantum parallelism may be extended to circuits with many (N >> 1) qubits and, for some tasks,
provide a dramatic performance increase – for example, reducing the necessary circuit component
number from $O(2^N)$ to $O(N^p)$, where p is a finite (and not very big) number.
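
The whole analysis of the circuit shown in Fig. 4b may be checked with a short numerical sketch. It is not part of the original derivation: the two-qubit state is stored, by assumption, as a list of four amplitudes with the index 2·j1 + j2, and the helper names `hadamard_on`, `f_gate`, and `deutsch` are hypothetical.

```python
import math

S = 1 / math.sqrt(2)


def hadamard_on(state, qubit):
    """Apply the Hadamard transform (8.146) to qubit 0 (source) or 1 (target)."""
    new = [0.0] * 4
    for index, amp in enumerate(state):
        bits = [index >> 1, index & 1]          # (j1, j2)
        for k in (0, 1):                        # new value of the transformed qubit
            sign = -1.0 if (bits[qubit] == 1 and k == 1) else 1.0
            out = list(bits)
            out[qubit] = k
            new[2 * out[0] + out[1]] += S * sign * amp
    return new


def f_gate(state, f):
    """Apply Eq. (8.144), |j1 j2> -> |j1, j2 XOR f(j1)>, to every component."""
    new = [0.0] * 4
    for index, amp in enumerate(state):
        j1, j2 = index >> 1, index & 1
        new[2 * j1 + (j2 ^ f[j1])] += amp
    return new


def deutsch(f):
    """Run the circuit of Fig. 8.4b once; return the measured source bit."""
    state = [0.0, 1.0, 0.0, 0.0]                # |01>: source in |0>, target in |1>
    state = hadamard_on(hadamard_on(state, 0), 1)
    state = f_gate(state, f)                    # the single use of the black box
    state = hadamard_on(hadamard_on(state, 0), 1)
    p1 = state[2] ** 2 + state[3] ** 2          # probability of source bit = 1
    return int(round(p1))


for name, f in {"f1": (0, 0), "f2": (0, 1), "f3": (1, 0), "f4": (1, 1)}.items():
    print(name, "balanced" if deutsch(f) else "constant")
```

In this sketch the function `f_gate` is called only once per run, while the measured source bit reproduces the column F of the table in Eq. (141).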
However, this efficiency comes at a high price. Indeed, let us discuss the possible physical
implementation of quantum gates, starting from the single-qubit case, on an example of the Hadamard
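gate (146). With the linearity requirement, its action on the arbitrary state (133) should be
$$\hat{\mathcal{H}} |\alpha\rangle = a_0 \hat{\mathcal{H}} |0\rangle + a_1 \hat{\mathcal{H}} |1\rangle = a_0 \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) + a_1 \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) = \frac{1}{\sqrt{2}} \left( a_0 + a_1 \right) |0\rangle + \frac{1}{\sqrt{2}} \left( a_0 - a_1 \right) |1\rangle, \tag{8.153}$$
meaning that the state probability amplitudes in the end (t = T) and in the beginning (t = 0) of the qubit
evolution in time have to be related as
$$a_0(T) = \frac{a_0(0) + a_1(0)}{\sqrt{2}}, \quad a_1(T) = \frac{a_0(0) - a_1(0)}{\sqrt{2}}. \tag{8.154}$$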
This task may be again performed using the Rabi oscillations, which were discussed in Sec. 6.5,
i.e. by applying to the qubit (a two-level system), for a limited time period T, a weak sinusoidal external
signal of frequency $\omega$ equal to the intrinsic quantum oscillation frequency $\omega_{nn'}$ defined by Eq. (6.85).
The analysis of the Rabi oscillations was carried out in Sec. 6.5, even for non-vanishing (though small)
detuning $\omega - \omega_{nn'}$, but only for the particular initial conditions when at t = 0 the system was fully in
one of the basis states (there labeled as n’), i.e. the counterpart state (there labeled n) was empty. For
our current purposes we need to find the amplitudes $a_{0,1}(t)$ for arbitrary initial conditions $a_{0,1}(0)$, subject
only to the time-independent normalization condition $|a_0|^2 + |a_1|^2 = 1$. For the case of exact tuning,
$\omega = \omega_{nn'}$, the solution of the system (6.94) is elementary,56 and gives the following result:57
$$a_0(t) = a_0(0) \cos \Omega t - i a_1(0) e^{i\varphi} \sin \Omega t, \qquad a_1(t) = a_1(0) \cos \Omega t - i a_0(0) e^{-i\varphi} \sin \Omega t, \tag{8.155}$$
where $\Omega$ is the Rabi oscillation frequency (6.99), in the exact-tuning case proportional to the amplitude
$|A|$ of the external ac drive $A = |A| \exp\{i\varphi\}$ – see Eq. (6.86). Comparing these expressions with Eqs.
(154), we see that for $t = T = \pi/4\Omega$ and $\varphi = \pi/2$ they “almost” coincide, besides the opposite sign of
$a_1(T)$. Conceptually the simplest way to correct this deficiency is to follow the ac “$\pi/4$-pulse”, just
discussed, by a short dc “$\pi$-pulse” of the duration $T' = \pi\hbar/\Delta$, which temporarily creates a small additional
energy difference $\Delta$ between the basis states 0 and 1. According to the basic Eq. (1.62), such difference
creates an additional phase difference $\Delta T'/\hbar$ between the states, equal to $\pi$ for the “$\pi$-pulse”.
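
This two-pulse recipe may be checked numerically. The sketch below assumes Eq. (155) in the form spelled out in the code (with the Rabi phase Ωt as the parameter `theta`), and models the dc pulse simply as a relative phase shift π applied to the amplitude a1; which of the two basis states acquires this phase is an assumption of the sketch, and the other choice would change only the common phase of the result. The function names are hypothetical.

```python
import cmath
import math


def rabi_pulse(a0, a1, theta, phi):
    """Exact-tuning evolution of the form (8.155), with theta = Omega * t."""
    c, s = math.cos(theta), math.sin(theta)
    b0 = a0 * c - 1j * a1 * cmath.exp(1j * phi) * s
    b1 = a1 * c - 1j * a0 * cmath.exp(-1j * phi) * s
    return b0, b1


def hadamard_by_pulses(a0, a1):
    """ac 'pi/4-pulse' with phi = pi/2, followed by a dc 'pi-pulse'."""
    b0, b1 = rabi_pulse(a0, a1, math.pi / 4, math.pi / 2)
    return b0, b1 * cmath.exp(1j * math.pi)   # relative phase shift pi


def hadamard_exact(a0, a1):
    """The required relation (8.154)."""
    s = 1 / math.sqrt(2)
    return s * (a0 + a1), s * (a0 - a1)


# An arbitrary normalized test state: |a0|^2 + |a1|^2 = 0.36 + 0.64 = 1.
a0, a1 = 0.6, 0.8j
pulsed = hadamard_by_pulses(a0, a1)
exact = hadamard_exact(a0, a1)
print(max(abs(p - e) for p, e in zip(pulsed, exact)))
```

The printed deviation should be at the level of the floating-point rounding errors, for any normalized pair of input amplitudes.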
Another way (that may be also useful for two-qubit operations) is to use another, auxiliary
energy level E2 whose distances from the basic levels E1 and E0 are significantly different from the
difference (E1 – E0) – see Fig. 6a. In this case, the weak external ac field tuned to any of the three
potential quantum transition frequencies $\omega_{nn'} \equiv (E_n - E_{n'})/\hbar$ initiates such transitions between the
corresponding states only, with a negligible perturbation of the third state. (Such transitions may be
56 An alternative way to analyze the qubit evolution is to use the Bloch equation (5.21), with an appropriate
function $\boldsymbol{\Omega}(t)$ describing the control field.
57 To comply with our current notation, the coefficients $a_{n'}$ and $a_n$ of Sec. 6.5 are replaced with $a_0$ and $a_1$.
again described by Eqs. (155), with the appropriate index changes.) For the Hadamard transform
implementation, it is sufficient to apply (after the already discussed $\pi/4$-pulse of frequency $\omega_{10}$, and with
the initially empty level $E_2$), an additional $\pi$-pulse of frequency $\omega_{20}$, with any phase $\varphi$. Indeed, according
to the first of Eqs. (155), with the due replacement $a_1(0) \to a_2(0) = 0$, such pulse flips the sign of the
amplitude $a_0(t)$, while the amplitude $a_1(t)$, not involved in this additional transition, remains unchanged.
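[Figure: (a) three energy levels $E_0$, $E_1$, $E_2$ (states 0, 1, 2) with the transition frequencies $\omega_{10}$, $\omega_{20}$, $\omega_{21}$; (b) the levels 00, {01, 10}, and 11 of a two-qubit system with $\Delta_1 = \Delta_2$; (c) the levels 00, 01, 10, and 11 of a two-qubit system with $\Delta_1 \neq \Delta_2$.]
Fig. 8.6. Energy-level schemes used for unitary transformations of (a) single qubits and (b, c) two-qubit systems.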
Now let me describe the conceptually simplest (though, for some qubit types, not the most
practically convenient) scheme for the implementation of two-qubit gates, on an example of the CNOT
gate whose operation is described by Eq. (145). For that, evidently, the involved qubits have to interact
for some time T. As was repeatedly discussed in the two last chapters, in most cases such interaction of
two subsystems is factorable – see Eq. (6.145). For qubits, i.e. two-level systems, each of the component
operators may be represented by a $2 \times 2$ matrix in the basis of states 0 and 1. According to Eq. (4.106),
such matrix may be always expressed as a linear combination $(b\mathrm{I} + \mathbf{c} \cdot \boldsymbol{\sigma})$, where b and three Cartesian
components of the vector c are c-numbers. Let us consider the simplest form of such factorable
interaction Hamiltonian:
$$\hat{H}_{\rm int}(t) = \begin{cases} \kappa \, \hat{\sigma}_z^{(1)} \hat{\sigma}_z^{(2)}, & \text{for } 0 < t < T, \\ 0, & \text{otherwise}, \end{cases} \tag{8.156}$$
where the upper index is the qubit number and $\kappa$ is a c-number constant.58 According to Eq. (4.175), by
the end of the interaction period, this Hamiltonian produces the following unitary transform:
$$\hat{U}_{\rm int} = \exp\left\{ -\frac{i}{\hbar} \hat{H}_{\rm int} T \right\} \equiv \exp\left\{ -\frac{i}{\hbar} \kappa \, \hat{\sigma}_z^{(1)} \hat{\sigma}_z^{(2)} T \right\}. \tag{8.157}$$
Since in the basis of unperturbed two-bit basis states $|j_1 j_2\rangle$, the product operator $\hat{\sigma}_z^{(1)} \hat{\sigma}_z^{(2)}$ is diagonal, so is
the unitary operator (157), with the following action on these states:
58 The assumption of simultaneous time independence of the basis state vectors and the interaction operator
(within the time interval 0 < t < T) is possible only if the basis state energy difference $\Delta$ of both qubits is exactly
the same. In this case, the simple physical explanation of the time evolution (156) follows from Figs. 6b,c, which
show the spectrum of the total energy $E = E_1 + E_2$ of the two-bit system. In the absence of interaction (Fig. 6b),
the energies of two basis states, 01 and 10, are equal, enabling even a weak qubit interaction to cause their
substantial evolution in time – see Sec. 6.7. If the qubit energies are different (Fig. 6c), the interaction may still be
reduced, in the rotating-wave approximation, to Eq. (156), by compensating the energy difference $(\Delta_1 - \Delta_2)$ with
an external ac signal of frequency $\omega = (\Delta_1 - \Delta_2)/\hbar$ – see Sec. 6.5.
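$$\hat{U}_{\rm int} |j_1 j_2\rangle = \exp\left\{ i\varphi \, \sigma_z^{(1)} \sigma_z^{(2)} \right\} |j_1 j_2\rangle, \tag{8.158}$$
where $\varphi \equiv -\kappa T/\hbar$, and $\sigma_z$ are the eigenvalues of the Pauli matrix $\sigma_z$ for the basis states of the
corresponding qubit: $\sigma_z = +1$ for $|j\rangle = |0\rangle$, and $\sigma_z = -1$ for $|j\rangle = |1\rangle$. Let me, for clarity, spell out Eq. (158)
for the particular case $\varphi = -\pi/4$ (corresponding to the qubit coupling time $T = \pi\hbar/4\kappa$):
$$\hat{U}_{\rm int} |00\rangle = e^{-i\pi/4} |00\rangle, \quad \hat{U}_{\rm int} |01\rangle = e^{i\pi/4} |01\rangle, \quad \hat{U}_{\rm int} |10\rangle = e^{i\pi/4} |10\rangle, \quad \hat{U}_{\rm int} |11\rangle = e^{-i\pi/4} |11\rangle. \tag{8.159}$$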
In order to compensate the undesirable parts of this joint phase shift of the basis states, let us
now apply similar individual “rotations” of each qubit by angle $\varphi' = +\pi/4$, using the following product
of two independent operators, plus (just for the result’s clarity) a common, and hence inconsequential,
phase shift $\varphi'' = -\pi/4$:59
$$\hat{U}_{\rm com} = \exp\left\{ i\varphi' \left( \hat{\sigma}_z^{(1)} + \hat{\sigma}_z^{(2)} \right) + i\varphi'' \right\} \equiv \exp\left\{ i\frac{\pi}{4} \hat{\sigma}_z^{(1)} \right\} \exp\left\{ i\frac{\pi}{4} \hat{\sigma}_z^{(2)} \right\} e^{-i\pi/4}. \tag{8.160}$$
Since this operator is also diagonal in the $|j_1 j_2\rangle$ basis, it is easy to calculate the change of the basis states
by the total unitary operator $\hat{U}_{\rm tot} \equiv \hat{U}_{\rm com} \hat{U}_{\rm int}$:
$$\hat{U}_{\rm tot} |00\rangle = |00\rangle, \quad \hat{U}_{\rm tot} |01\rangle = |01\rangle, \quad \hat{U}_{\rm tot} |10\rangle = |10\rangle, \quad \hat{U}_{\rm tot} |11\rangle = -|11\rangle. \tag{8.161}$$
This result already shows the main “miracle action” of two-qubit gates, such as the one shown in Fig.
4b: the source qubit is left intact (only if it is in one of the basis states!), while the state of the target
qubit is altered. True, this change (of the sign) is still different from the CNOT operator’s action (145),
but may be readily used for its implementation by sandwiching of the transform $\hat{U}_{\rm tot}$ between two
Hadamard transforms of the target qubit alone:
$$\hat{C} = \hat{\mathcal{H}}^{(2)} \hat{U}_{\rm tot} \hat{\mathcal{H}}^{(2)}. \tag{8.162}$$
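
The chain of Eqs. (158)-(162) may be verified by a direct matrix calculation. The following sketch assumes the basis ordering 00, 01, 10, 11 (index 2·j1 + j2) and the phase values φ = –π/4, φ’ = +π/4, φ” = –π/4 used above; the names `total_phase`, `H2`, `U_TOT`, and `matmul` are introduced only for this illustration.

```python
import cmath
import math

BASIS = [(0, 0), (0, 1), (1, 0), (1, 1)]   # (j1, j2), matrix index = 2*j1 + j2
S = 1 / math.sqrt(2)


def sigma_z(j):
    """Eigenvalue of the Pauli matrix sigma_z: +1 for j = 0, -1 for j = 1."""
    return 1 if j == 0 else -1


def total_phase(j1, j2):
    """Diagonal element of U_tot = U_com * U_int, per Eqs. (8.158) and (8.160)."""
    phi, phi1, phi2 = -math.pi / 4, math.pi / 4, -math.pi / 4
    u_int = cmath.exp(1j * phi * sigma_z(j1) * sigma_z(j2))
    u_com = cmath.exp(1j * phi1 * (sigma_z(j1) + sigma_z(j2)) + 1j * phi2)
    return u_com * u_int


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


# Hadamard transform of the target (second) qubit alone: identity on j1.
H2 = [[(S * (-1) ** (j2 * k2) if j1 == k1 else 0.0) for (j1, j2) in BASIS]
      for (k1, k2) in BASIS]

# Diagonal matrix of U_tot; expected diag(1, 1, 1, -1), see Eq. (8.161).
U_TOT = [[(total_phase(*BASIS[r]) if r == c else 0.0) for c in range(4)]
         for r in range(4)]

C = matmul(H2, matmul(U_TOT, H2))          # Eq. (8.162)
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
print(all(abs(C[r][c] - CNOT[r][c]) < 1e-12 for r in range(4) for c in range(4)))
```

Here the column index of each matrix labels the input basis state, and the row index labels the output one, so that the matrix `CNOT` is just the table of Eq. (145b).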
So, we have spent quite a bit of time on the discussion of the CNOT gate,60 and now I can
reward the reader for their effort with a bit of good news: it has been proved that an arbitrary unitary
transform that satisfies Eq. (140), i.e. may be used within the general scheme outlined in Fig. 3, may be
decomposed into a set of CNOT gates, possibly augmented with simpler single-qubit gates – for
example, the Hadamard gate plus the $\pi/2$ rotation discussed above.61 Unfortunately, I have no time for a
59 As Eq. (4.175) shows, each of the component unitary transforms $\exp\{i\varphi' \hat{\sigma}_z\}$ may be created by applying to
each qubit, for time interval $T' = \hbar\varphi'/\kappa'$, a constant external field described by Hamiltonian $\hat{H} = -\kappa' \hat{\sigma}_z$. We
already know that for a charged, spin-½ particle, such Hamiltonian may be created by applying a z-oriented
external dc magnetic field – see Eq. (4.163). For most other physical implementations of qubits, the organization
of such a Hamiltonian is also straightforward – see, e.g., Fig. 7.4 and its discussion.
60 As was discussed above, this gate is identical to the two-qubit gate shown in Fig. 5a for f = f2, i.e. f(j) = j. The
implementation of the gate f for the 3 other possible functions f requires straightforward modifications, whose
analysis is left for the reader’s exercise.
61 This fundamental importance of the CNOT gate was perhaps a major reason why David Wineland, the leader of
the NIST group that had demonstrated its first experimental implementation in 1995 (following the theoretical
suggestion by J. Cirac and P. Zoller), was awarded the 2012 Nobel Prize in Physics – shared with Serge Haroche,
the leader of another group working towards quantum computation.
detailed discussion of more complex circuits.62 The most famous of them is the scheme for integer
number factoring, suggested in 1994 by Peter Williston Shor.63 Due to its potential practical importance
for breaking broadly used communication encryption schemes such as the RSA code,64 this opportunity
has incited much enthusiasm and triggered experimental efforts to implement quantum gates and circuits
using a broad variety of two-level quantum systems. By now, the following experimental options have
given the most significant results:65
(i) Trapped ions. The first experimental demonstrations of quantum state manipulation
(including the already mentioned first CNOT gate) have been carried out using deeply cooled atoms in
optical traps, similar to those used in frequency and time standards. Their total spins are natural qubits,
whose states may be manipulated using the Rabi transfers excited by suitably tuned lasers. The spin
interactions with the environment may be very weak, resulting in large dephasing times T2 – up to a few
seconds. Since the distances between ions in the traps are relatively large (of the order of a micron),
their direct spin-spin interaction is even weaker, but the ions may be made effectively interacting either
via their mechanical oscillations about the potential minima of the trapping field, or via photons in
external electromagnetic resonators (“cavities”).66 Perhaps the main challenge of using this approach for
quantum computation is poor “scalability”, i.e. the enormous experimental difficulty of creating and
managing large ordered systems of individually addressable qubits. So far, only a-few-qubit systems
have been demonstrated.67
(ii) Nuclear spins are also typically very weakly connected to their environment, with dephasing
times T2 exceeding 10 seconds in some cases. Their eigenenergies E0 and E1 may be split by external dc
magnetic fields (typically, of the order of 10 T), while the interstate Rabi transfers may be readily
achieved by using the nuclear magnetic resonance, i.e. the application of external ac fields with
frequencies $\omega = (E_1 - E_0)/\hbar$ – typically, of a few hundred MHz. The challenges of this option include the
weakness of spin-spin interactions (typically mediated through molecular electrons), resulting in a very
slow spin evolution, whose time scale $\hbar/\kappa$ may become comparable with T2, and also very small level
separations E1 – E0, corresponding to a few K, i.e. much smaller than the room temperature, creating a
challenge of qubit state preparation.68 Despite these challenges, the nuclear spin option was used for the
first implementation of the Shor algorithm for factoring of a small number ($15 = 5 \times 3$) as early as 2001.69
However, the extension of this success to larger systems, beyond the set of spins inside one molecule, is
extremely challenging.
62 For that, the reader may be referred to either the monographs by Nielsen-Chuang and Reiffel-Polak, cited
above, or to a shorter (but much more formal) textbook by N. Mermin, Quantum Computer Science, Cambridge
U. Press, 2007.
63 A clear description of this algorithm may be found in several accessible sources, including Wikipedia – see the
article Shor’s Algorithm.
64 Named after R. Rivest, A. Shamir, and L. Adleman, the authors of the first open publication of the code in
1977, but actually invented earlier (in 1973) by C. Cocks.
65 For a discussion of other possible implementations (such as quantum dots and dopants in crystals) see, e.g., T.
Ladd et al., Nature 464, 45 (2010), and references therein.
66 A brief discussion of such interactions (so-called Cavity QED) will be given in Sec. 9.4 below.
67 See, e.g., S. Debnath et al., Nature 536, 63 (2016). Note also the related work on arrays of trapped, optically-
coupled neutral atoms – see, e.g., J. Perczel et al., Phys. Rev. Lett. 119, 023603 (2017) and references therein.
68 This challenge may be partly mitigated using ingenious spin manipulation techniques such as refocusing – see,
e.g., either Sec. 7.7 in Nielsen and Chuang, or the J. Keeler’s monograph cited at the end of Sec. 6.5.
69 B. Lanyon et al., Phys. Rev. Lett. 99, 250505 (2001).
(iii) Josephson-junction devices. Much better scalability may be achieved with solid-state
devices, especially using superconductor integrated circuits including weak contacts – Josephson
junctions (see their brief discussion in Sec. 1.6). The qubits of this type are based on the fact that the
energy U of such a junction is a highly nonlinear function of the Josephson phase difference $\varphi$ – see Sec.
1.6. Indeed, combining Eqs. (1.73) and (1.74), we can readily calculate $U(\varphi)$ as the work W of an
external circuit increasing the phase from, say, zero to some value $\varphi$:
$$U(\varphi) - U(0) = \int_{\varphi'=0}^{\varphi'=\varphi} dW = \int_{\varphi'=0}^{\varphi'=\varphi} IV \, dt = \frac{\hbar I_c}{2e} \int_{\varphi'=0}^{\varphi'=\varphi} \sin\varphi' \, \frac{d\varphi'}{dt} \, dt = \frac{\hbar I_c}{2e} \left( 1 - \cos\varphi \right). \tag{8.163}$$
There are several options of using this nonlinearity for creating qubits;70 currently the leading
option, called the phase qubit, uses the two lowest eigenstates localized in one of the potential wells of
the periodic potential (163). A major problem of such qubits is that at the very bottom of this well the
potential $U(\varphi)$ is almost quadratic, so that the energy levels are nearly equidistant – cf. Eqs. (2.262),
(6.16), and (6.23). This is even more true for the so-called “transmons” (and “Xmons”, and “Gatemons”,
and several other very similar devices71) – the currently used versions of phase qubits, where a Josephson
junction is made a part of an external electromagnetic oscillator, making its relative total nonlinearity
(anharmonicity) even smaller. As a result, the external rf drive of frequency $\omega = (E_1 - E_0)/\hbar$, used to
arrange the state transforms described by Eq. (155), may induce simultaneous undesirable transitions to
(and between) higher energy levels. This effect may be mitigated by a reduction of the ac drive
amplitude, but at a price of the proportional increase of the operation time and hence of dephasing – see
below. (I am leaving a quantitative estimate of such an increase for the reader’s exercise.)
Since the coupling of Josephson-junction qubits may be most readily controlled (and, very
importantly, kept stable if so desired), they have been used to demonstrate the largest prototype quantum
computing systems to date, despite quite modest dephasing times T2 – for purely integrated circuits, in
the tens of microseconds at best, even at operating temperatures in tens of mK. By the time of this
writing (mid-2019), several groups have announced chips with a few dozen such qubits, but to the
best of my knowledge, only their smaller subsets could be used for high-fidelity quantum operations.72
(iv) Optical systems, attractive because of their inherently enormous bandwidth, pose a special
challenge for quantum computation: due to the virtual linearity of most electromagnetic media at
reasonable light power, the implementation of qubits (i.e. two-level systems), and interaction
Hamiltonians such as the one given by Eq. (156), is problematic. In 2001, a very smart way around this
70 The “most quantum” option in this technology is to use Josephson junctions very weakly coupled to their
dissipative environment (so that the effective resistance shunting the junction is much higher than the quantum
resistance unit $R_Q \equiv (\pi/2)\hbar/e^2 \sim 10^4\ \Omega$). In this case, the Josephson phase variable $\varphi$ behaves as a coordinate of a
1D quantum particle, moving in the $2\pi$-periodic potential (163), forming the energy band structure E(q) similar to
those discussed in Sec. 2.7. Both theory and experiment show that in this case, the quantum states in adjacent
Brillouin zones differ by the charge of one Cooper pair 2e. (This is exactly the effect responsible for the Bloch
oscillations of frequency (2.252).) These two states may be used as the basis states of charge qubits.
Unfortunately, such qubits are rather sensitive to charged impurities, randomly located in the junction’s vicinity,
causing uncontrollable changes of its parameters, so that currently, to the best of my knowledge, this option is not
actively pursued.
71 For a recent review of these devices see, e.g., G. Wendin, Repts. Progr. Phys. 80, 106001 (2017), and
references therein.
72 See, e.g., C. Song et al., Phys. Rev. Lett. 119, 180511 (2017) and references therein.
hurdle was invented.73 In this KLM scheme (also called the “linear optical quantum computing”),
nonlinear elements are not needed at all, and quantum gates may be composed just of linear devices
(such as optical waveguides, mirrors, and beam splitters), plus single-photon sources and detectors.
However, estimates show that this approach requires a much larger number of physical components than
those using nonlinear quantum systems such as usual qubits,74 so that right now it is not very popular.
So, despite more than two decades of large-scale efforts, the progress of quantum computing
development has been rather modest. The main culprit here is the unintentional coupling of qubits to
their environment, leading most importantly to their state dephasing, and eventually to errors. Let me
discuss this major issue in detail.
Of course, some error probability exists in classical digital logic gates and memory cells as
well.75 However, in this case, there is no conceptual problem with the device state measurement, so that
the error may be detected and corrected in many ways. Conceptually,76 the simplest of them is the so-
called majority voting logic – using several similar logic circuits working in parallel and fed with
identical input data. Evidently, two such devices can detect a single error in one of them, while three
devices in parallel may correct such error, by taking two coinciding output signals for the genuine one.
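
The classical majority voting logic may be sketched in a few lines. The function names `detect_error` and `majority_vote` are hypothetical, and the outputs of the parallel devices are modeled simply as a list of bits.

```python
from collections import Counter


def detect_error(output_a, output_b):
    """Two parallel devices: a disagreement reveals (but cannot locate) an error."""
    return output_a != output_b


def majority_vote(outputs):
    """Three or more parallel devices: take the value given by their majority."""
    value, count = Counter(outputs).most_common(1)[0]
    if 2 * count <= len(outputs):
        raise ValueError("no majority among the outputs")
    return value


print(detect_error(1, 0))        # the two devices disagree: an error is detected
print(majority_vote([1, 1, 0]))  # a single error in the third device is corrected
```

This scheme relies on the possibility of reading out and comparing the device outputs without disturbing them, which is exactly what is not available for qubits.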
For quantum computation, the general idea of using several devices (say, qubits) for coding the
same information remains valid; however, there are two major complications. First, as we know from
Chapter 7, the environment’s dephasing effect may be described as a slow random drift of the
probability amplitudes $a_j$, leading to the deviation of the output state $|\alpha\rangle_{\rm fin}$ from the required form (140),
and hence to a non-vanishing probability of wrong qubit state readout – see Fig. 3. Hence the quantum
error correction has to protect the result not against possible random state flips $0 \leftrightarrow 1$, as in classical
digital computers, but against these “creeping” analog errors.
Second, the qubit state is impossible to copy exactly (clone) without disturbing it, as shown
by the following simple calculation.77 Cloning some state $|\alpha\rangle$ of one qubit to another qubit that is
initially in an independent state (say, the basis state $|0\rangle$), without any change of $|\alpha\rangle$, means the following
transformation of the two-qubit ket: $|\alpha 0\rangle \to |\alpha\alpha\rangle$. If we want such transform to be performed by a real
quantum system, whose evolution is described by a unitary operator $\hat{u}$, and to be correct for an arbitrary
state $|\alpha\rangle$, it has to work not only for both basis states of the qubit:
$$\hat{u} |00\rangle = |00\rangle, \quad \hat{u} |10\rangle = |11\rangle, \tag{8.164}$$
but also for their arbitrary linear combination (133). Since the operator $\hat{u}$ has to be linear, we may use
that relation, and then Eq. (164) to write
73 E. Knill et al., Nature 409, 46 (2001).

74 See, e.g., Y. Li et al., Phys. Rev. X 5, 041007 (2015).
75 In modern integrated circuits, such “soft” (runtime) errors are created mostly by the high-energy neutron
component of cosmic rays, and also by the α-particles emitted by radioactive impurities in silicon chips and their
packaging.
76 Practically, the majority voting logic increases circuit complexity and power consumption, so that it is used
only at the most critical points. Since in modern digital integrated circuits the bit error rate is very small (< 10⁻⁵), in
most of them, less radical but also less penalizing schemes are used – if used at all.
77 Amazingly, this simple no-cloning theorem was discovered as late as 1982 (to the best of my knowledge,
independently by W. Wootters and W. Zurek, and by D. Dieks), in the context of work toward quantum
cryptography – see below.
$$\hat{u}|\alpha 0\rangle = \hat{u}\left(a_0|0\rangle + a_1|1\rangle\right)|0\rangle = a_0\hat{u}|00\rangle + a_1\hat{u}|10\rangle = a_0|00\rangle + a_1|11\rangle. \tag{8.165}$$
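On the other hand, the desired result of the state cloning is
$$|\alpha\alpha\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)\left(a_0|0\rangle + a_1|1\rangle\right) = a_0^2|00\rangle + a_0a_1\left(|10\rangle + |01\rangle\right) + a_1^2|11\rangle, \tag{8.166}$$
i.e. is evidently different, so that, for an arbitrary state α and an arbitrary unitary operator û,
$$\hat{u}|\alpha 0\rangle \neq |\alpha\alpha\rangle \qquad \text{(no-cloning theorem)}, \tag{8.167}$$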
meaning that the qubit state cloning is indeed impossible.78 This problem may be, however, indirectly
circumvented – for example, in the way shown in Fig. 7a.
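[Figure: (a) a CNOT gate acting on the source qubit in state a0|0⟩ + a1|1⟩ and an ancilla qubit in state 0; (b) a three-qubit circuit with two Hadamard gates (H) on the source qubit, with the intermediate states labeled A through F.]
Fig. 8.7. (a) Quasi-cloning, and (b) detection and correction of dephasing errors in a single qubit.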
Here the CNOT gate, whose action is described by Eq. (145), entangles an arbitrary input state
(133) of the source qubit with a basis initial state of an ancillary target qubit – frequently called the
ancilla. Using Eq. (145), we can readily calculate the output two-qubit state’s vector:
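$$|\alpha\rangle_{N=2} = \hat{C}\left(a_0|0\rangle + a_1|1\rangle\right)|0\rangle = a_0\hat{C}|00\rangle + a_1\hat{C}|10\rangle = a_0|00\rangle + a_1|11\rangle \qquad \text{(quasi-cloning)}. \tag{8.168}$$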
We see that this circuit does perform the operation (165), i.e. gives the initial source qubit’s probability
amplitudes a0 and a1 equally to two qubits, i.e. duplicates the input information. However, in contrast
with the “genuine” cloning, it changes the state of the source qubit as well, making it entangled with the
target (ancilla) qubit. Such “quasi-cloning” is the key element of most suggested quantum error
correction techniques.
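The difference between the quasi-cloning (168) and the impossible “genuine” cloning (166) may be checked numerically. The following Python sketch is only an illustration: the helper names `product`, `cnot`, and `overlap2`, and the convention of listing the two-qubit amplitudes in the order 00, 01, 10, 11 (source qubit first), are my assumptions rather than notation used elsewhere in this course.

```python
import math

def product(a, b):
    """Direct product of two single-qubit states [x0, x1]: amplitudes of 00, 01, 10, 11."""
    return [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]

def cnot(state):
    """CNOT with the first qubit as the source (control): swaps amplitudes of 10 and 11."""
    c00, c01, c10, c11 = state
    return [c00, c01, c11, c10]

def overlap2(u, v):
    """Squared magnitude of the inner product of two state vectors."""
    return abs(sum(complex(x).conjugate() * y for x, y in zip(u, v))) ** 2

s = 1 / math.sqrt(2)
alpha = [s, s]                                # a0 = a1 = 1/sqrt(2)
quasi_clone = cnot(product(alpha, [1, 0]))    # a0|00> + a1|11>, as in Eq. (8.168)
true_clone = product(alpha, alpha)            # the product state of Eq. (8.166)

print(quasi_clone)
print(true_clone)
print(overlap2(true_clone, quasi_clone))      # less than 1: the two states differ
```

For a basis input state (say, a0 = 1, a1 = 0) the two results coincide, in agreement with Eq. (164); the difference appears only for superpositions.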
Consider, for example, the three-qubit “circuit” shown in Fig. 7b, which uses two ancilla qubits
– see the two lower “wires”. At its first two stages, the double application of the quasi-cloning produces
an intermediate state A with the following ket-vector:
$$|A\rangle = a_0|000\rangle + a_1|111\rangle, \tag{8.169}$$
which is an evident generalization of Eq. (168).79 Next, subjecting the source qubit to the Hadamard
transform (146), we get the three-qubit state B represented by the state vector
78 Note that this does not mean that two (or several) qubits cannot be put into the same, arbitrary quantum state –
theoretically, with arbitrary precision. Indeed, they may be first set into their lowest-energy stationary states, and
then driven into the same arbitrary state (133) by exerting on them similar classical external fields. So, the no-
cloning theorem pertains only to qubits in unknown states α – but this is exactly what we need for error correction
– see below.
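$$|B\rangle = a_0\frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right)|00\rangle + a_1\frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right)|11\rangle. \tag{8.170}$$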
Now let us assume that at this stage, the source qubit comes into contact with a dephasing
environment – in Fig. 7b, symbolized by the single-qubit “gate” φ. As we know from Chapter 7 (see
Eq. (7.22) and its discussion, and also Sec. 7.3), its effect may be described by a random shift of the
relative phase of two states:80
$$|0\rangle \to e^{i\varphi}|0\rangle, \qquad |1\rangle \to e^{-i\varphi}|1\rangle. \tag{8.171}$$
As a result, for the intermediate state C (see Fig. 7b) we may write
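$$|C\rangle = a_0\frac{1}{\sqrt{2}}\left(e^{i\varphi}|0\rangle + e^{-i\varphi}|1\rangle\right)|00\rangle + a_1\frac{1}{\sqrt{2}}\left(e^{i\varphi}|0\rangle - e^{-i\varphi}|1\rangle\right)|11\rangle. \tag{8.172}$$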
At this stage, in this simple theoretical model, the coupling with the environment is completely
stopped (ahh, if this could be possible! we might have quantum computers by now :-), and the source
qubit is fed into one more Hadamard gate. Using Eqs. (146) again, for the state D after this gate we get
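$$|D\rangle = a_0\left(\cos\varphi\,|0\rangle + i\sin\varphi\,|1\rangle\right)|00\rangle + a_1\left(i\sin\varphi\,|0\rangle + \cos\varphi\,|1\rangle\right)|11\rangle. \tag{8.173}$$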
Now the qubits are passed through the second, similar pair of CNOT gates – see Fig. 7b. Using Eq.
(145), for the resulting state E we readily get the following expression:
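$$|E\rangle = a_0\cos\varphi\,|000\rangle + a_0\,i\sin\varphi\,|111\rangle + a_1\,i\sin\varphi\,|011\rangle + a_1\cos\varphi\,|100\rangle, \tag{8.174a}$$
whose right-hand side may be evidently grouped as
$$|E\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)\cos\varphi\,|00\rangle + \left(a_1|0\rangle + a_0|1\rangle\right)i\sin\varphi\,|11\rangle. \tag{8.174b}$$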
This is already a rather remarkable result. It shows that if we measured the ancilla qubits at stage
E, and both results corresponded to states 0, we might be 100% sure that the source qubit (which is not
affected by these measurements!) is in its initial state even after the interaction with the environment.
The only result of an increase of this unintentional interaction (as quantified by the r.m.s. magnitude of
the random phase shift φ) is the growth of the probability,
$$W = \sin^2\varphi, \tag{8.175}$$
of getting the opposite result, which signals a dephasing-induced error in the source qubit. Such implicit
measurement, without disturbing the source qubit, is called quantum error detection.
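The chain of Eqs. (169)-(175) is easy to verify with a small state-vector simulation. The Python sketch below is one possible way to do this, under my own assumptions: qubit 0 is the source, qubits 1 and 2 are the ancillas, the basis states are indexed by the bit string q0 q1 q2, and all helper names (`apply_1q`, `apply_cnot`, `stage_E`, etc.) are hypothetical.

```python
import cmath
import math

N = 3  # qubit 0 is the source; qubits 1 and 2 are the ancillas; basis index = bits q0 q1 q2

def bit(index, q):
    """Value (0 or 1) of qubit q in the basis state with the given index."""
    return (index >> (N - 1 - q)) & 1

def apply_1q(state, gate, q):
    """Apply a 2x2 matrix (nested lists) to qubit q of the state vector."""
    mask = 1 << (N - 1 - q)
    new = [0j] * len(state)
    for index, amp in enumerate(state):
        b = bit(index, q)
        new[index & ~mask] += gate[0][b] * amp  # output component with qubit q = 0
        new[index | mask] += gate[1][b] * amp   # output component with qubit q = 1
    return new

def apply_cnot(state, control, target):
    """Flip the target qubit in every basis state where the control qubit is 1."""
    mask = 1 << (N - 1 - target)
    new = [0j] * len(state)
    for index, amp in enumerate(state):
        new[index ^ mask if bit(index, control) else index] += amp
    return new

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]  # Hadamard gate

def dephasing(phi):
    """Phase shift of Eq. (8.171): |0> -> e^{i phi}|0>, |1> -> e^{-i phi}|1>."""
    return [[cmath.exp(1j * phi), 0], [0, cmath.exp(-1j * phi)]]

def stage_E(a0, a1, phi):
    """State E of Fig. 7b for the source state a0|0> + a1|1> and ancillas in state 0."""
    state = [0j] * 8
    state[0b000], state[0b100] = a0, a1
    for target in (1, 2):                       # two quasi-cloning CNOTs: stage A
        state = apply_cnot(state, 0, target)
    state = apply_1q(state, H, 0)               # stage B
    state = apply_1q(state, dephasing(phi), 0)  # stage C
    state = apply_1q(state, H, 0)               # stage D
    for target in (1, 2):                       # second pair of CNOTs: stage E
        state = apply_cnot(state, 0, target)
    return state

def ancilla_error_probability(state):
    """Probability of finding both ancilla qubits in state 1."""
    return sum(abs(amp) ** 2 for index, amp in enumerate(state)
               if bit(index, 1) and bit(index, 2))

a0, a1, phi = 0.6, 0.8j, 0.4
E = stage_E(a0, a1, phi)
# The two printed numbers should agree, as required by Eq. (8.175):
print(ancilla_error_probability(E), math.sin(phi) ** 2)
```

The four nonvanishing amplitudes of the list `E` may also be compared, one by one, with the coefficients in Eq. (174a).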
An even more impressive result may be achieved by the last component of the circuit, the so-
called Toffoli (or “CCNOT”) gate, denoted by the rightmost symbol in Fig. 7b. This three-qubit gate is
conceptually similar to the CNOT gate discussed above, besides that it flips the basis state of its target
qubit only if both source qubits are in state 1. (In the circuit shown in Fig. 7b, the former role is played
79 Such a state is also the 3-qubit example of the so-called Greenberger-Horne-Zeilinger (GHZ) states, which are
frequently called the “most entangled” states of a system of N > 2 qubits.
80 Let me emphasize again that Eq. (171) is strictly valid only if the interaction with the environment is a pure
dephasing, i.e. does not include the energy relaxation of the qubit or its thermal activation to the higher-energy
eigenstate; however, it is a reasonable description of errors in the frequent case when T2 << T1.
by our source qubit, while the latter role, by the two ancilla qubits.) According to its definition, the
Toffoli gate does not affect the first parentheses in Eq. (174b), but flips the source qubit’s states in the
second parentheses, so that for the output three-qubit state F we get
$$|F\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)\cos\varphi\,|00\rangle + \left(a_0|0\rangle + a_1|1\rangle\right)i\sin\varphi\,|11\rangle. \tag{8.176a}$$
Obviously, this result may be factored as
$$|F\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)\left(\cos\varphi\,|00\rangle + i\sin\varphi\,|11\rangle\right) \qquad \text{(quantum error correction)}, \tag{8.176b}$$
showing that now the source qubit is again fully unentangled from the ancilla qubits. Moreover,
calculating the norm squared of the second operand, we get
$$\left(\cos\varphi\,\langle 00| - i\sin\varphi\,\langle 11|\right)\left(\cos\varphi\,|00\rangle + i\sin\varphi\,|11\rangle\right) = \cos^2\varphi + \sin^2\varphi = 1, \tag{8.177}$$
so that the final state of the source qubit exactly coincides with its initial state. This is the famous
miracle of quantum state correction, taking place “automatically” – without any qubit measurements,
and for any random phase shift φ.
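The effect of the Toffoli gate on the state (174a) may also be checked numerically. In the following Python sketch (an illustration under my own assumptions, not a general simulator), a three-qubit state is stored as a dictionary mapping bit strings “source, ancilla, ancilla” to amplitudes, and the hypothetical function `source_fidelity` calculates the probability ⟨α|ρ|α⟩ of finding the source qubit in its initial state α, with the ancilla qubits traced out.

```python
import math

def state_E(a0, a1, phi):
    """Three-qubit state E of Eq. (8.174a); keys are bit strings 'source, ancilla, ancilla'."""
    c, s = math.cos(phi), 1j * math.sin(phi)
    return {"000": a0 * c, "111": a0 * s, "011": a1 * s, "100": a1 * c}

def toffoli_on_source(state):
    """Flip the source (first) bit wherever both ancilla bits equal 1."""
    new = {}
    for bits, amp in state.items():
        if bits[1:] == "11":
            bits = ("1" if bits[0] == "0" else "0") + "11"
        new[bits] = new.get(bits, 0) + amp
    return new

def source_fidelity(state, a0, a1):
    """<alpha|rho|alpha>, where rho is the source qubit's density matrix (ancillas traced out)."""
    alpha = (a0, a1)
    total = 0.0
    for anc in ("00", "01", "10", "11"):
        # Unnormalized source amplitudes for this particular state of the ancillas
        branch = [state.get(str(b) + anc, 0) for b in (0, 1)]
        overlap = sum(complex(x).conjugate() * y for x, y in zip(alpha, branch))
        total += abs(overlap) ** 2
    return total

a0, a1, phi = 0.6, 0.8j, 0.4
E = state_E(a0, a1, phi)
F = toffoli_on_source(E)
print(source_fidelity(E, a0, a1))  # below 1: at stage E the source qubit is still corrupted
print(source_fidelity(F, a0, a1))  # equal to 1 within rounding: the source state is restored
```

The second number stays at 1 for any value of `phi`, in agreement with Eqs. (176)-(177).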
The circuit shown in Fig. 7b may be further improved by adding Hadamard gate pairs, similar to
that used for the source qubit, to the ancilla qubits as well. It is straightforward to show that if the
dephasing is small in the sense that the W given by Eq. (175) is much less than 1, this modified circuit
may provide a substantial error probability reduction (to ~W²) even if the ancilla qubits are also
subjected to a similar dephasing as the source qubit, at the same stage – i.e. between the two
Hadamard gates. A perfect automatic correction of any error (not only of an inner dephasing of a
qubit and its relaxation/excitation, but also of the mutual dephasing between qubits) of any used qubit
needs even more parallelism. The first circuit of that kind, based on nine parallel qubits, which is a
natural generalization of the circuit discussed above, was invented in 1995 by the same P. Shor. Later,
five-qubit circuits enabling similar error correction were suggested. (The further parallelism reduction
has been proved impossible.)
However, all these results assume that the error correction circuits as such are perfect, i.e.
completely isolated from the environment. In the real world, this cannot be done. Now the key question
is what maximum level Wmax of the error probability in each gate (including those in the used error
correction scheme) can be automatically corrected, and how many qubits with W < Wmax would be
required to implement quantum computers producing important practical results – first of all, factoring
of large numbers.81 To the best of my knowledge, estimates of these two related numbers have been
made only for some very specific approaches, and they are rather pessimistic. For example, using the so-
called surface codes, which employ many physical qubits for coding an informational one, and hence
increase its fidelity, Wmax may be increased to a few times 10⁻³, but then we would need ~10⁸ physical
qubits for the implementation of Shor’s algorithm.82 This is very far from what currently looks doable
using the existing approaches.
Because of this hard situation, the current development of quantum computing is focused on
finding at least some problems that could be within the reach of either the existing systems, or their
immediate extensions, and simultaneously would present some practical interest – a typical example of a
81 In order to compete with the existing classical factoring algorithms, such numbers should have at least 10³ bits.
82 A. Fowler et al., Phys. Rev. A 86, 032324 (2012).
technology in the search for applications. Currently, to the best of my knowledge, all suggested
problems of this kind address either specially crafted mathematical problems,83 or properties of some
simple physical systems – such as the molecular hydrogen84 or the deuteron (the deuterium’s nucleus,
i.e. the proton-neutron system).85 In the latter case, the interaction between the qubits of the
computational system is organized so that the system’s Hamiltonian is similar to that of the quantum
system of interest. (For this work, quantum simulation is a more adequate name than “quantum
computation”.86)
Such simulations are pursued by some teams using schemes different from that shown in Fig. 3.
Of those, the most developed is the so-called adiabatic quantum computation,87 which drops the hardest
requirement of negligible interaction with the environment. In this approach, the qubit system is first
prepared in a certain initial state, and then is let evolve on its own, with no effort to couple-uncouple
qubits by external control signals during the evolution.88 Due to the interaction with the environment, in
particular the dephasing and the energy dissipation it imposes, the system eventually relaxes to a final
incoherent state, which is then measured. (This resembles the scheme shown in Fig. 3, with the important
difference that the transform U should not necessarily be unitary.) From numerous runs of such an
experiment, the outcome statistics may be revealed. Thus, in this approach the interaction with the
environment is allowed to play a certain role in the system evolution, though every effort is made to
reduce it, thus slowing down the relaxation process – hence the word “adiabatic” in the name of this
approach. This slowness allows the system to exhibit some quantum properties, in particular quantum
tunneling89 through the energy barriers separating close energy minima in the multi-dimensional space
of states. This tunneling creates a substantial difference in the final state statistics from that in purely
classical systems, where such barriers may be overcome only by thermally-activated jumps over them.90
Due to technical difficulties of the organization and precise control of long-range interaction in
multi-qubit systems, the adiabatic quantum computing demonstrations so far have been limited to a few
simple arrays described by the so-called extended quantum Ising (“spin-glass”) model
$$\hat{H} = -J\sum_{\{j,j'\}}\hat{\sigma}_z^{(j)}\hat{\sigma}_z^{(j')} - \sum_j h_j\hat{\sigma}_z^{(j)}, \tag{8.178}$$
where the curly brackets denote the summation over pairs of close (though not necessarily closest)
neighbors. Though the Hamiltonian (178) is the traditional playground of phase transitions theory (see,
83 F. Arute et al., Nature 574, 505 (2019). Note that the claim of the first achievement of “quantum supremacy”,
made in this paper, refers only to an artificial, specially crafted mathematical problem, and does not change my
assessment of the current status of this technology.
84 P. O’Malley et al., Phys. Rev. X 6, 031007 (2016).
85 E. Dumitrescu et al., Phys. Rev. Lett. 120, 210501 (2018).
86 To the best of my knowledge, this idea was first put forward by Yuri I. Manin in his book Computable and
Incomputable published in 1980, i.e. before the famous 1982 paper by Richard Feynman. Unfortunately, since the
book was in Russian, this suggestion was acknowledged by the international community only much later.
87 Note that the qualifier “quantum” is important in this term, to distinguish this research direction from the
classical adiabatic (or “reversible”) computation – see, e.g., SM Sec. 2.3 and references therein.
88 Recently, some hybrids of this approach with the “usual” scheme of quantum computation have been
demonstrated, in particular, using some control of inter-bit coupling during the relaxation process – see, e.g., R.
Barends et al., Nature 534, 222 (2016).
89 As a reminder, this process was repeatedly discussed in this course, starting from Sec. 2.3.
90 A quantitative discussion of such jumps may be found in SM Sec. 5.6.
e.g., SM Chapter 4), to the best of my knowledge there are not many practically important tasks that
could be achieved by studying the statistics of its solutions. Moreover, even for this limited task, the
speed of the largest experimental adiabatic quantum “computers”, with several hundreds of Josephson-
junction qubits91 is still comparable with that of classical, off-the-shelf semiconductor processors (with
the dollar cost lower by many orders of magnitude), and no dramatic change of this comparison is
predicted for realistic larger systems.
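Since the Hamiltonian (178), as written, includes only the operators σ̂z, it is diagonal in the z-basis, so that its eigenstates are just the classical spin configurations. For a small array, its lowest-energy configurations may therefore be found by a brute-force classical search, as in the following Python sketch. The function names, the 4-spin ring, and the parameter values are my assumptions made purely for illustration; note that the cost of such a search grows as 2^N, which is exactly why it is useless for large arrays.

```python
from itertools import product

def ising_energy(spins, pairs, J, h):
    """Energy of one z-basis configuration; spins[j] = +1 or -1 is the eigenvalue of sigma_z on site j."""
    coupling = -J * sum(spins[j] * spins[k] for j, k in pairs)
    field = -sum(hj * sj for hj, sj in zip(h, spins))
    return coupling + field

def ground_states(n, pairs, J, h, tol=1e-12):
    """Exhaustive search over all 2**n configurations; returns (lowest energy, list of minimizers)."""
    best, configs = None, []
    for spins in product((+1, -1), repeat=n):
        energy = ising_energy(spins, pairs, J, h)
        if best is None or energy < best - tol:
            best, configs = energy, [spins]
        elif abs(energy - best) <= tol:
            configs.append(spins)
    return best, configs

# Hypothetical example: a ring of 4 spins with J > 0 and a small local field on site 0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
energy, configs = ground_states(4, ring, J=1.0, h=[0.1, 0.0, 0.0, 0.0])
print(energy, configs)
```

With all local fields set to zero, the same search returns two degenerate configurations (all spins up and all spins down), while for J < 0 on this ring it returns the two alternating ones.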
To summarize the current (circa mid-2019) situation with the quantum computation
development, it faces a very hard challenge of mitigating the effects of unintentional coupling with the
environment. This problem is exacerbated by the lack of algorithms, beyond Shor’s factoring, that
would give quantum computation a substantial advantage over the classical competition in solving real-
world problems, and hence a much broader potential customer base that would provide the field with the
necessary long-term motivation and resources. So far, even the leading experts in this field abstain from
predictions on when quantum computation may become a self-supporting commercial technology.92
There seem to be somewhat better prospects for another application of entangled qubit systems,
namely to telecommunication cryptography.93 The goal here is more modest: to replace the currently
dominating classical encryption, based on the public-key RSA code mentioned above, that may be
broken by factoring very large numbers, with a quantum encryption system that would be fundamentally
unbreakable. The basis of this opportunity is the measurement postulate and the no-cloning theorem: if a
message is carried over by a qubit, it is impossible for an eavesdropper (in cryptography, traditionally
called Eve) to either measure or copy it faithfully, without also disturbing its state. However, as we have
seen from the discussion of Fig. 7a, state quasi-cloning using entangled qubits is possible, so that the
issue is far from being simple, especially if we want to use a publicly distributed quantum key, in some
sense similar to the classical public key used in the RSA encryption. Unfortunately, I do not have
time/space to discuss various options for quantum encryption, but cannot help demonstrating how
counter-intuitive they may be, using the famous example of the so-called quantum teleportation (Fig. 8).94
Suppose that some party A (in cryptography, traditionally called Alice) wants to send to party B
(Bob) the full information about the pure quantum state α of a qubit, unknown to either party. Instead of
sending her qubit directly to Bob, Alice asks him to send her one qubit (β) of a pair of other qubits,
prepared in a certain entangled state, for example in the singlet state described by Eq. (11); in our
current notation
$$|\beta\beta'\rangle = \frac{1}{\sqrt{2}}\left(|01\rangle - |10\rangle\right). \tag{8.179}$$
The initial state of the whole three-qubit system may be represented in the form
91 See, e.g., R. Harris et al., Science 361, 162 (2018). Similar demonstrations with trapped-ion systems so far have
been on a smaller scale, with a few tens of qubits – see, e.g., J. Zhang et al., Nature 551, 601 (2017).
92 See the publication Quantum Computing: Progress and Prospects, The National Academies Press, 2019.
93 This field was pioneered in the 1970s by S. Wiesner. Its important theoretical aspect (which I, unfortunately,
also will not be able to cover) is the distinguishability of different but close quantum states – for example, of an
original qubit set, and that slightly corrupted by noise. A good introduction to this topic may be found, for
example, in Chapter 9 of the monograph by Nielsen and Chuang, cited above.
94 This procedure had been first suggested in 1993 by Charles Henry Bennett, and then repeatedly demonstrated
experimentally – see, e.g., L. Steffen et al., Nature 500, 319 (2013), and literature therein.
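$$|\alpha\beta\beta'\rangle = \left(a_0|0\rangle + a_1|1\rangle\right)|\beta\beta'\rangle = \frac{a_0}{\sqrt{2}}|001\rangle - \frac{a_0}{\sqrt{2}}|010\rangle + \frac{a_1}{\sqrt{2}}|101\rangle - \frac{a_1}{\sqrt{2}}|110\rangle, \tag{8.180a}$$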
which may be equivalently rewritten as the following linear superposition,
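$$\begin{aligned}
|\alpha\beta\beta'\rangle ={}& \frac{1}{2}|\alpha\beta\rangle_s^{+}\left(-a_1|0\rangle + a_0|1\rangle\right) + \frac{1}{2}|\alpha\beta\rangle_s^{-}\left(a_1|0\rangle + a_0|1\rangle\right) \\
&+ \frac{1}{2}|\alpha\beta\rangle_e^{+}\left(-a_0|0\rangle + a_1|1\rangle\right) + \frac{1}{2}|\alpha\beta\rangle_e^{-}\left(-a_0|0\rangle - a_1|1\rangle\right),
\end{aligned} \tag{8.180b}$$
of the following four states of the qubit pair αβ:
$$|\alpha\beta\rangle_s^{\pm} \equiv \frac{1}{\sqrt{2}}\left(|00\rangle \pm |11\rangle\right), \qquad |\alpha\beta\rangle_e^{\pm} \equiv \frac{1}{\sqrt{2}}\left(|01\rangle \pm |10\rangle\right). \tag{8.181}$$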
[Figure: stages (a)-(e) of the exchange between Alice and Bob, with one qubit sent from Bob to Alice and two classical bits sent from Alice to Bob.]
Fig. 8.8. Sequential stages of a “quantum teleportation” procedure: (a) the initial state with entangled qubits β and β’, (b) the back transfer of the qubit β, (c) the measurement of the pair αβ, (d) the forward transfer of two classical bits with the measurement results, and (e) the final state, with the state of the qubit β’ mirroring the initial state of the qubit α.
After having received qubit β from Bob, Alice measures which of these four states the pair
αβ has. This may be achieved, for example, by measurement of one observable represented by the
operator $\hat{\sigma}_z^{(\alpha)}\hat{\sigma}_z^{(\beta)}$ and another one corresponding to $\hat{\sigma}_x^{(\alpha)}\hat{\sigma}_x^{(\beta)}$ – cf. Eq. (156). (Since all four states (181)
are eigenstates of both these operators, these two measurements do not affect each other and may be
performed in any order.) The measured eigenvalue of the former operator enables distinguishing the
couples of states (181) with different values of the lower index, while the latter measurement
distinguishes the states with different upper indices.
Then Alice reports the measurement result (which may be coded with just two classical bits) to
Bob over a classical communication channel. Since the measurement places the pair αβ definitely into
the corresponding state, Bob’s remaining qubit β’ is now definitely in the unentangled single-qubit
state that is represented by the corresponding parentheses in Eq. (180b). Note that each of these
parentheses contains both coefficients a0,1, i.e. the whole information about the initial state of the qubit
α. If Bob likes, he may now use appropriate single-qubit operations, similar to those
discussed earlier in this section, to move his qubit β’ into the state exactly similar to the initial state of
qubit α. (This fact does not violate the no-cloning theorem (167), because the measurement has already
changed the state of α.) This is, of course, a “teleportation” only in a very special sense of this term, but
a good example of the importance of qubit entanglement’s preservation at their spatial transfer. For this
course, this was also a good primer for the forthcoming discussion of the EPR paradox and Bell’s
inequalities in Chapter 10.
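The teleportation procedure may be verified with a few lines of Python. In the sketch below, all names are hypothetical, the three-qubit amplitudes are indexed by the bit string α β β’, and the particular single-qubit “fix-up” operations (built of the Pauli operators σx and σz) are my own inference from the four parentheses of Eq. (180b), not something specified in the text above. For each of the four measurement outcomes (181), the code calculates the outcome’s probability and the fidelity of Bob’s corrected qubit to the initial state α.

```python
import math

S = 1 / math.sqrt(2)

# The four states (8.181) of the pair (alpha, beta): amplitudes of 00, 01, 10, 11
BELL = {
    "s+": [S, 0, 0, S],
    "s-": [S, 0, 0, -S],
    "e+": [0, S, S, 0],
    "e-": [0, S, -S, 0],
}

def X(q):
    """Pauli sigma_x: swaps the two amplitudes of a single-qubit state [x0, x1]."""
    return [q[1], q[0]]

def Z(q):
    """Pauli sigma_z: flips the sign of the amplitude of state 1."""
    return [q[0], -q[1]]

# Bob's fix-up operations for each reported outcome (an assumption inferred from Eq. (8.180b))
FIX = {
    "s+": lambda q: Z(X(q)),
    "s-": X,
    "e+": Z,
    "e-": lambda q: q,
}

def initial_state(a0, a1):
    """Eq. (8.180a): amplitudes indexed by 4*alpha + 2*beta + beta'."""
    psi = [0j] * 8
    psi[0b001], psi[0b010] = a0 * S, -a0 * S
    psi[0b101], psi[0b110] = a1 * S, -a1 * S
    return psi

def bob_state(psi, bell):
    """Project the pair (alpha, beta) onto one of the states (8.181); return Bob's unnormalized qubit."""
    return [sum(complex(bell[p]).conjugate() * psi[2 * p + k] for p in range(4))
            for k in (0, 1)]

def teleport(a0, a1):
    """For each outcome, return (probability, fidelity of Bob's corrected qubit to alpha)."""
    psi = initial_state(a0, a1)
    results = {}
    for name, bell in BELL.items():
        q = bob_state(psi, bell)
        prob = abs(q[0]) ** 2 + abs(q[1]) ** 2
        fixed = FIX[name](q)
        overlap = complex(a0).conjugate() * fixed[0] + complex(a1).conjugate() * fixed[1]
        results[name] = (prob, abs(overlap) ** 2 / prob)
    return results

for name, (prob, fidelity) in teleport(0.6, 0.8j).items():
    print(name, prob, fidelity)
```

Each outcome should show up with probability 1/4 and unit fidelity; note that for two of the outcomes, Bob’s corrected state differs from α by an inconsequential common phase factor of -1.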
Returning for just a minute to quantum cryptography: since its most common quantum key
distribution protocols95 require just a few simple quantum gates, whose experimental implementation is
not a large technological challenge, the main focus of the current effort is on decreasing the single-
photon dephasing in long electromagnetic-wave transmission channels,96 with sufficiently high qubit
transfer fidelity. The recent progress was rather impressive, with the demonstrated transfer of entangled
qubits over landlines longer than 100 km,97 and over at least one satellite-based line longer than 1,000
km;98 and also the whole quantum key distribution over a comparable distance, though so far only at a very
low rate.99 Let me hope that if not the author of this course, then its readers will see this technology
used in practical secure telecommunication systems.
8.1. Prove that Eq. (30) indeed yields Eg(1) = (5/4)EH.
8.2. For a dilute gas of helium atoms in their ground state, with n atoms per unit volume,
calculate its:
(i) electric susceptibility χe, and
(ii) magnetic susceptibility χm,
and compare the results.
Hint: You may use the model solution of Problems 6.8 and 6.14, and the results of the variational
description of the helium atom’s ground state in Sec. 2.
8.3. Calculate the expectation values of the following observables: s1·s2, S² ≡ (s1 + s2)², and Sz ≡
s1z + s2z, for the singlet and triplet states of the system of two spins-½, defined by Eqs. (18) and (21),
directly, without using the general Eq. (48). Compare the results with those for the system of two
classical geometric vectors of length ħ/2 each.
8.4. Discuss the factors 1/√2 that participate in Eqs. (18) and (20) for the entangled states of the
system of two spins-½, in terms of Clebsch-Gordan coefficients similar to those discussed in Sec. 5.7.
8.5.* Use the perturbation theory to calculate the contribution into the so-called hyperfine
splitting of the ground energy of the hydrogen atom,100 due to the interaction between the spins of its
nucleus (proton) and electron.
95 Two of them are the BB84 suggested in 1984 by C. Bennett and G. Brassard, and the EPRBE suggested in
1991 by A. Ekert. For details, see, e.g., either Sec. 12.6 in the repeatedly cited monograph by Nielsen and
Chuang, or the review by N. Gisin et al., Rev. Mod. Phys. 74, 145 (2002).
96 For their quantitative discussion see, e.g., EM Sec. 7.8.
97 See, e.g., T. Herbst et al., Proc. Natl. Acad. Sci. 112, 14202 (2015), and references therein.
98 J. Yin et al., Science 356, 1140 (2017).
99 H.-L. Yin et al., Phys. Rev. Lett. 117, 190501 (2016).
100 This effect was discovered experimentally by A. Michelson in 1881 and explained theoretically by W. Pauli in
1924.
Hint: The proton’s magnetic moment operator is described by the same Eq. (4.115) as the
electron, but with a positive gyromagnetic factor γp = gpe/2mp ≈ 2.675×10⁸ s⁻¹T⁻¹, whose magnitude is
much smaller than that of the electron (γe ≈ 1.761×10¹¹ s⁻¹T⁻¹), due to the much higher mass, mp ≈
1.673×10⁻²⁷ kg ≈ 1,835 me. (The g-factor of the proton is also different, gp ≈ 5.586.101)
8.6. In the simple case of just two similar spin-interacting particles, distinguishable by their
spatial location, the famous Heisenberg model of ferromagnetism102 is reduced to the following
Hamiltonian:
$$\hat{H} = -J\,\hat{\mathbf{s}}_1\cdot\hat{\mathbf{s}}_2 - \gamma\mathbf{B}\cdot\left(\hat{\mathbf{s}}_1 + \hat{\mathbf{s}}_2\right),$$
where J is the spin interaction constant, γ is the gyromagnetic ratio of each particle, and B is the
external magnetic field. Find the stationary states and energies of this system for spin-½ particles.
8.7. Two particles, both with spin-½ but different gyromagnetic ratios γ1 and γ2, are placed into an
external magnetic field B. In addition, their spins interact as in the Heisenberg model:
$$\hat{H}_{\text{int}} = -J\,\hat{\mathbf{s}}_1\cdot\hat{\mathbf{s}}_2.$$
Find the eigenstates and eigenenergies of the system.
8.8. Two similar spin-½ particles, with gyromagnetic ratio γ, localized at two points separated by
distance a, interact via the field of their magnetic dipole moments. Calculate stationary states and
energies of the system.
8.9. Consider the permutation of two identical particles, each of spin s. How many different
symmetric and antisymmetric spin states can the system have?
8.10. For a system of two identical particles with s = 1:

(i) List all spin states forming the uncoupled-representation basis.
(ii) List all possible pairs {S, MS} of the quantum numbers describing the states of the coupled-
representation basis – see Eq. (48).
(iii)Which of the {S, MS} pairs describe the states symmetric, and which the states
antisymmetric, with respect to the particle permutation?
8.11. Represent the operators of the total kinetic energy and the total orbital angular momentum
of a system of two particles, with masses m1 and m2, as combinations of terms describing the center-of-
mass motion and the relative motion. Use the results to calculate the energy spectrum of the so-called
101 The anomalously large value of the proton’s g-factor results from the composite quark-gluon structure of this
particle. (An exact calculation of gp remains a challenge for quantum chromodynamics.)
102 It was suggested in 1926, independently by W. Heisenberg and P. Dirac. A discussion of thermal motion
effects on this and other similar systems (especially the Ising model of ferromagnetism) may be found in SM
Chapter 4.
positronium – a metastable “atom”103 consisting of one electron and its positively charged antiparticle,
the positron.
8.12. Two particles with similar masses m and charges q are free to move along a round, plane
ring of radius R. In the limit of strong Coulomb interaction of the particles, find the lowest eigenenergies
of the system, and sketch the system of its energy levels. Discuss possible effects of particle
indistinguishability.
8.13. Low-energy spectra of many diatomic molecules may be well described by modeling the
molecule as a system of two particles connected with a light and elastic, but very stiff spring. Calculate
the energy spectrum of a molecule within this model. Discuss possible effects of nuclear spins on
spectra of the so-called homonuclear diatomic molecules, formed by two similar atoms.
8.14. Two indistinguishable spin-½ particles are attracting each other at contact:
$$U(x_1, x_2) = -W\delta(x_1 - x_2), \quad \text{with } W > 0,$$
but are otherwise free to move along the x-axis. Find the energy and the orbital wavefunction of the
ground state of the system.
8.15. Calculate the energy spectrum of the system of two identical spin-½ particles, moving
along the x-axis, which is described by the following Hamiltonian:
$$\hat{H} = \frac{\hat{p}_1^2}{2m_0} + \frac{\hat{p}_2^2}{2m_0} + \frac{m_0\omega_0^2}{2}\left(x_1^2 + x_2^2 + \varepsilon x_1x_2\right),$$
and the degeneracy of each energy level.
8.16.* Two indistinguishable spin-½ particles are confined to move around a circle of radius R,
and interact only at a very short arc distance l ≡ Rφ ≡ R(φ1 – φ2) between them, so that the interaction
potential U may be well approximated with a delta function of φ. Find the ground state and its energy,
for the following two cases:
(i) the “orbital” (spin-independent) repulsion: $\hat{U} = W\delta(\varphi)$,
(ii) the spin-spin interaction: $\hat{U} = -W\,\hat{\mathbf{s}}_1\cdot\hat{\mathbf{s}}_2\,\delta(\varphi)$,
both with constant W > 0. Analyze the trends of your results in the limits W → 0 and W → ∞.
8.17. Two particles of mass M, separated by two much lighter particles of mass m << M, are
placed on a ring of radius R – see the figure on the right. The particles
strongly repel at contact, but otherwise, each of them is free to move along the ring.
Calculate the lower part of the energy spectrum of the system.
103 Its lifetime (either 0.124 ns or 138 ns, depending on the parallel or antiparallel configuration of the component
spins), is limited by the weak interaction of its components, which causes their annihilation with the emission of
several gamma-ray photons.
8.18. N indistinguishable spin-½ particles move in a spherically-symmetric quadratic potential

U(r) = mω0²r²/2. Neglecting the direct interaction of the particles, find the ground-state energy of the
system.
8.19. Use the Hund rules to find the values of the quantum numbers L, S, and J in the ground
states of the atoms of carbon and nitrogen. Write down the Russell-Saunders symbols for these states.
8.20. N >> 1 indistinguishable, non-interacting quantum particles are placed in a hard-wall,

rectangular box with sides ax, ay, and az. Calculate the ground-state energy of the system, and the
average forces it exerts on each face of the box. Can we characterize the forces by certain pressure P?
Hint: Consider separately the cases of bosons and fermions.
8.21.* Explore the Thomas-Fermi model104 of a heavy atom, with the nuclear charge Q = Ze >>
e, in which the interaction between electrons is limited to their contribution to the common electrostatic
potential φ(r). In particular, derive the ordinary differential equation obeyed by the radial distribution of
the potential, and use it to estimate the effective radius of the atom.
8.22.* Use the Thomas-Fermi model, explored in the previous problem, to calculate the total
binding energy of a heavy atom. Compare the result with that for the simpler model, in that the Coulomb
electron-electron interaction is completely ignored.
8.23. A system of three similar spin-½ particles is described by the Heisenberg Hamiltonian (cf.
Problems 6 and 7):
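$$\hat{H} = -J\left(\hat{\mathbf{s}}_1\cdot\hat{\mathbf{s}}_2 + \hat{\mathbf{s}}_2\cdot\hat{\mathbf{s}}_3 + \hat{\mathbf{s}}_3\cdot\hat{\mathbf{s}}_1\right),$$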
where J is the spin interaction constant. Find the stationary states and energies of this system, and give
an interpretation of your results.
8.24. For a system of three spins-½, find the common eigenstates and eigenvalues of the
operators Ŝz and Ŝ², where
$$\hat{\mathbf{S}} \equiv \hat{\mathbf{s}}_1 + \hat{\mathbf{s}}_2 + \hat{\mathbf{s}}_3$$
is the vector operator of the total spin of the system. Do the corresponding quantum numbers S and MS
obey Eqs. (48)?
8.25. Explore basic properties of the Heisenberg model (which was the subject of Problems 6, 7,
and 23), for a 1D chain of N spins-½:
$$\hat{H} = -J\sum_{\{j,j'\}}\hat{\mathbf{s}}_j\cdot\hat{\mathbf{s}}_{j'} - \gamma\mathbf{B}\cdot\sum_j\hat{\mathbf{s}}_j, \quad \text{with } J > 0,$$
where the summation is over all N spins, with the symbol {j, j’} meaning that the first sum is only over
the adjacent spin pairs. In particular, find the ground state of the system and its lowest excited states in
the absence of external magnetic field B , and also the dependence of their energies on the field.
104 It was suggested in 1927, independently, by L. Thomas and E. Fermi.
Hint: For the sake of simplicity, you may assume that the first sum includes the term $\hat{\mathbf{s}}_N\cdot\hat{\mathbf{s}}_1$ as
well. (Physically, this means that the chain is bent into a closed loop. 105)
8.26. Compose the simplest model Hamiltonians, in terms of the second quantization formalism,
for systems of indistinguishable particles moving in the following external potentials:
(i) two weakly coupled potential wells, with on-site particle interactions (giving additional
energy J per each pair of particles in the same potential well), and
(ii) a periodic 1D potential, with the same particle interactions, in the tight-binding limit.
8.27. For each of the Hamiltonians composed in the previous problem, derive the Heisenberg
equations of motion for particle creation/annihilation operators:
(i) for bosons, and
(ii) for fermions.
8.28. Express the ket-vectors of all possible Dirac states for the system of three indistinguishable
(i) bosons, and
(ii) fermions,
via those of the single-particle states β, β’, and β” they occupy.
8.29. Explain why the general perturbative result (8.126), when applied to the 4He atom, gives
the correct106 expression (8.29) for the ground singlet state, and correct Eqs. (8.39)-(8.42) (with the
minus sign in the first of these relations) for the excited triplet states, but cannot describe these results,
with the plus sign in Eq. (8.39), for the excited singlet state.
8.30. For a system of two distinct qubits (i.e. two-level systems), introduce a reasonable
uncoupled-representation z-basis, and write in this basis the 4×4 matrix of the operator that swaps their
states.
8.31. Find a time-independent Hamiltonian that can cause the qubit evolution described by Eqs.
(155). Discuss the relation between your result and the time-dependent Hamiltonian (6.86).
105 Note that for dissipative spin systems, differences between low-energy excitations of open-end and closed-end
1D chains may be substantial even in the limit N → ∞ – see, e.g., SM Sec. 4.5. However, for our Hamiltonian
(and hence dissipation-free) system, the differences are relatively small.
106 Correct in the sense of the first order of the perturbation theory.
Chapter 9. Introduction to Relativistic Quantum Mechanics

The brief introduction to relativistic quantum mechanics, presented in this chapter, consists of two very
different parts. Its first part is a discussion of the basic elements of the quantum theory of the
electromagnetic field (usually called quantum electrodynamics, QED), including the field quantization
scheme, photon statistics, radiative atomic transitions, the spontaneous and stimulated radiation, and
so-called cavity QED. We will see, in particular, that the QED may be considered as the relativistic
quantum theory of particles with zero rest mass – photons. The second part of the chapter is a brief
review of the relativistic quantum theory of particles with non-zero rest mass, including the Dirac
theory of spin-½ particles. These theories mark the point of entry into a more complete relativistic
quantum theory – the quantum field theory – which is beyond the scope of this course.1
9.1. Electromagnetic field quantization2

Classical physics gives us3 the following general relativistic relation between the momentum p
and energy E of a free particle with rest mass m, which may be simplified in two limits – non-relativistic
and ultra-relativistic:
 
Free particle’s relativistic energy:
$$E = \left[(pc)^2 + (mc^2)^2\right]^{1/2} \to \begin{cases} mc^2 + p^2/2m, & \text{for } p \ll mc, \\ pc, & \text{for } p \gg mc. \end{cases} \tag{9.1}$$
In both limits, the transfer from classical to quantum mechanics is easier than in the arbitrary case. Since
all the previous part of this course was committed to the first, non-relativistic limit, I will now jump to a
brief discussion of the ultra-relativistic limit p >> mc, for a particular but very important system – the
electromagnetic field. Since the excitations of this field, called photons, are currently believed to have
zero rest mass m,4 the ultra-relativistic relation E = pc is exactly valid for any photon energy E, and the
quantization scheme is rather straightforward.
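The two limits of Eq. (1) are easy to check numerically. The following minimal Python sketch compares the exact expression with both approximations; the function names and the sample momenta (10⁻³ m_e c and 10³ m_e c for an electron) are illustrative assumptions, not part of the original notes.

```python
import math

C = 299_792_458.0          # speed of light, m/s
M_E = 9.1093837015e-31     # electron rest mass, kg (sample particle, an assumption)

def energy_exact(p, m):
    """Exact relativistic energy E = [(pc)^2 + (mc^2)^2]^(1/2)."""
    return math.hypot(p * C, m * C**2)

def energy_nonrel(p, m):
    """Non-relativistic limit: rest energy plus p^2/2m, valid for p << mc."""
    return m * C**2 + p**2 / (2 * m)

def energy_ultrarel(p):
    """Ultra-relativistic limit E = pc, valid for p >> mc (exact for m = 0)."""
    return p * C

p_slow = 1e-3 * M_E * C    # p << mc
p_fast = 1e+3 * M_E * C    # p >> mc

# Relative errors of the two approximations in their domains of validity
rel_err_slow = abs(energy_nonrel(p_slow, M_E) / energy_exact(p_slow, M_E) - 1)
rel_err_fast = abs(energy_ultrarel(p_fast) / energy_exact(p_fast, M_E) - 1)
print(rel_err_slow, rel_err_fast)
```

The first error scales as (p/mc)⁴/8 and the second as (mc/p)²/2, so that both approximations improve rapidly away from the crossover region p ~ mc.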
As usual, the quantization has to be based on the classical theory of the system – in this case, the
Maxwell equations. As the simplest case, let us consider the electromagnetic field inside a finite free-
space volume limited by ideal walls, which reflect incident waves perfectly.5 Inside the volume, the
Maxwell equations give a simple wave equation6 for the electric field
$$\nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0, \tag{9.2}$$
1 Note that some material covered in this chapter is frequently taught as a part of the quantum field theory. I will
focus on the most important results that may be obtained without starting the heavy engines of that theory.
2 The described approach was pioneered by the same P. A. M. Dirac as early as 1927.
3 See, e.g., EM Chapter 9.
4 By now this fact has been verified experimentally with an accuracy of at least ~10⁻²² m_e – see S. Eidelman et al.,
Phys. Lett. B 592, 1 (2004).
5 In the case of finite energy absorption in the walls, or in the wave propagation media (say, described by complex
constants ε and μ), the system is not energy-conserving (Hamiltonian), i.e. interacts with some dissipative
environment. Specific cases of such interaction will be considered in Sections 2 and 3 below.
6 See, e.g., EM Eq. (7.3), for the particular case ε = ε₀, μ = μ₀, so that v² ≡ 1/εμ = 1/ε₀μ₀ ≡ c².
and an absolutely similar equation for the magnetic field B . We may look for the general solution of Eq.
(2) in the variable-separating form
$$\mathbf{E}(\mathbf{r}, t) = \sum_j p_j(t)\,\mathbf{e}_j(\mathbf{r}). \tag{9.3}$$
Physically, each term of this sum is a standing wave whose spatial distribution and polarization
(“mode”) are described by the vector function ej(r), while the temporal dynamics, by the function pj(t).
Plugging an arbitrary term of this sum into Eq. (2), and separating the variables exactly as we did, for
example, in the Schrödinger equation in Sec. 1.5, we get
$$\frac{\nabla^2 \mathbf{e}_j}{\mathbf{e}_j} = \frac{1}{c^2}\frac{\ddot{p}_j}{p_j} = \text{const} \equiv -k_j^2, \tag{9.4}$$
so that the spatial distribution of the mode satisfies the 3D Helmholtz equation:
Equation for spatial distribution:
$$\nabla^2 \mathbf{e}_j + k_j^2\,\mathbf{e}_j = 0. \tag{9.5}$$
The set of solutions of this equation, with appropriate boundary conditions, determines the set of the
functions ej, and simultaneously the spectrum of the wave number magnitudes kj. The latter values
determine the mode eigenfrequencies, following from Eq. (4):
$$\ddot{p}_j + \omega_j^2 p_j = 0, \quad \text{with } \omega_j \equiv k_j c. \tag{9.6}$$
There is a big philosophical difference between the quantum-mechanical approaches to Eqs. (5)
and (6), despite their single origin (4). The first (Helmholtz) equation may be rather difficult to solve in
realistic geometries,7 but it remains intact in the basic quantum electrodynamics, with the scalar
components of the vector functions e_j(r) still treated (at each point r) as c-numbers. In contrast, the
classical Eq. (6) is readily solvable (giving sinusoidal oscillations with frequency ω_j), but this is exactly
where we can make the transfer to quantum mechanics, because we already know how to quantize a
mechanical 1D harmonic oscillator, which in classical mechanics obeys the same equation.
As usual, we need to start with the appropriate Hamiltonian – the operator corresponding to the
classical Hamiltonian function H of the proper set of generalized coordinates and momenta. The
electromagnetic field’s Hamiltonian function (which in this case coincides with the field’s energy) is8
$$H = \int d^3r \left(\frac{\varepsilon_0 E^2}{2} + \frac{B^2}{2\mu_0}\right). \tag{9.7}$$
Let us represent the magnetic field in a form similar to Eq. (3),9
7 See, e.g., various problems discussed in EM Chapter 7, especially in Sec. 7.9.

8 See, e.g., EM Sec. 9.8, in particular, Eq. (9.225). Here I am using SI units, with ε₀μ₀ ≡ c⁻²; in the Gaussian units,
the coefficients ε₀ and μ₀ disappear, but there is an additional common factor 1/4π in the equation for energy.
However, if we modify the normalization conditions (see below) accordingly, all the subsequent results, starting
from Eq. (10), look similar in any system of units.
9 Here I am using the letter q_j, instead of x_j, for the generalized coordinate of the field oscillator, in order to
emphasize the difference between the former variable, and one of the Cartesian coordinates, i.e. one of the
arguments of the c-number functions e and b.
$$\mathbf{B}(\mathbf{r}, t) = \sum_j \omega_j q_j(t)\,\mathbf{b}_j(\mathbf{r}). \tag{9.8}$$
Since, according to the Maxwell equations, in our case the magnetic field satisfies the equation similar
to Eq. (2), the time-dependent amplitude qj of each of its modes bj(r) obeys an equation similar to Eq.
(6), i.e. in the classical theory also changes in time sinusoidally, with the same frequency ω_j. Plugging
Eqs. (3) and (8) into Eq. (7), we may recast it as
$$H = \sum_j \left[\frac{p_j^2}{2}\int \varepsilon_0\, e_j^2(\mathbf{r})\,d^3r + \frac{\omega_j^2 q_j^2}{2}\int \frac{1}{\mu_0}\, b_j^2(\mathbf{r})\,d^3r\right]. \tag{9.9}$$
Since the distribution of constant factors between two multiplication operands in each term of Eq. (3) is
so far arbitrary, we may fix it by requiring the first integral in Eq. (9) to equal 1. It is straightforward to
check that according to the Maxwell equations, which give a specific relation between vectors E and
B,10 this normalization makes the second integral in Eq. (9) equal 1 as well, and Eq. (9) becomes
$$H = \sum_j H_j, \quad H_j = \frac{p_j^2}{2} + \frac{\omega_j^2 q_j^2}{2}. \tag{9.10a}$$
Note that p_j is the legitimate generalized momentum corresponding to the generalized coordinate q_j,
because it is equal to ∂L/∂q̇_j, where L is the Lagrangian function of the field – see EM Eq. (9.217):
$$L = \int d^3r \left(\frac{\varepsilon_0 E^2}{2} - \frac{B^2}{2\mu_0}\right) = \sum_j L_j, \quad L_j = \frac{p_j^2}{2} - \frac{\omega_j^2 q_j^2}{2}. \tag{9.10b}$$
Hence we can carry out the standard quantization procedure, namely declare Hj, pj, and qj the
quantum-mechanical operators related exactly as in Eq. (10a),
Electromagnetic mode’s Hamiltonian:
$$\hat{H}_j = \frac{\hat{p}_j^2}{2} + \frac{\omega_j^2 \hat{q}_j^2}{2}. \tag{9.11}$$
We see that this Hamiltonian coincides with that of a 1D harmonic oscillator with the mass m_j formally
equal to 1,11 and the eigenfrequency equal to ω_j. However, in order to use Eq. (11) in the general Eq.
(4.199) for the time evolution of Heisenberg-picture operators p̂_j and q̂_j, we need to know the
commutation relation between these operators. To find it, let us calculate the Poisson bracket (4.204)
for the functions A = q_j’ and B = p_j”. Since in the classical Hamiltonian mechanics all
generalized coordinates q_j and the corresponding momenta p_j have to be considered independent
arguments of H, only one term (with j = j’ = j”) in only one of the sums in Eq. (12) (namely, the one with
j’ = j”) gives a non-zero value (−1), so that
$$\{q_{j'}, p_{j''}\}_{\rm P} \equiv \sum_j \left(\frac{\partial q_{j'}}{\partial p_j}\frac{\partial p_{j''}}{\partial q_j} - \frac{\partial q_{j'}}{\partial q_j}\frac{\partial p_{j''}}{\partial p_j}\right) = -\delta_{j'j''}. \tag{9.12}$$
Hence, according to the general quantization rule (4.205), the commutation relation of the operators
corresponding to qj’ and pj” is
10 See, e.g., EM Eq. (7.6).
11 Selecting a different normalization of the functions e_j(r) and b_j(r), we could readily arrange any value of m_j,
and the choice corresponding to m_j = 1 is the best one just for the notation simplicity.
$$\left[\hat{q}_{j'}, \hat{p}_{j''}\right] = i\hbar\,\delta_{j'j''}, \tag{9.13}$$
i.e. is exactly the same as for the usual Cartesian components of the radius-vector and momentum of a
mechanical particle – see Eq. (2.14).
As the reader already knows, Eqs. (11) and (13) open for us several alternative ways to proceed:
(i) Use the Schrödinger-picture wave mechanics based on wavefunctions Ψ_j(q_j, t). As we know
from Sec. 2.9, this way is inconvenient for most tasks, because the eigenfunctions of the harmonic
oscillator are rather clumsy.
(ii) A substantially better way (for the harmonic oscillator case) is to write the equations of the
time evolution of the operators qˆ j (t ) and pˆ j (t ) in the Heisenberg picture of quantum dynamics.
(iii) An even more convenient approach is to use equations similar to Eqs. (5.65) to decompose
the Heisenberg operators q̂_j(t) and p̂_j(t) into the creation-annihilation operators â_j†(t) and â_j(t), and
work with these operators.
In this chapter, I will mostly use the last route. Replacing m with m_j ≡ 1, and ω₀ with ω_j, the last
forms of Eqs. (5.65) become
$$\hat{a}_j = \left(\frac{\omega_j}{2\hbar}\right)^{1/2}\left(\hat{q}_j + i\frac{\hat{p}_j}{\omega_j}\right), \quad \hat{a}_j^\dagger = \left(\frac{\omega_j}{2\hbar}\right)^{1/2}\left(\hat{q}_j - i\frac{\hat{p}_j}{\omega_j}\right). \tag{9.14}$$
Due to Eq. (13), the creation-annihilation operators obey the commutation relation similar to Eq. (5.68),
$$\left[\hat{a}_j, \hat{a}_{j'}^\dagger\right] = \hat{I}\,\delta_{jj'}. \tag{9.15}$$
As a result, according to Eqs. (3) and (8), the quantum-mechanical operators of the electric and
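magnetic fields are sums over all field oscillators:
Electromagnetic fields’ operators:
$$\hat{\mathbf{E}}(\mathbf{r}, t) = i\sum_j \left(\frac{\hbar\omega_j}{2}\right)^{1/2}\mathbf{e}_j(\mathbf{r})\left(\hat{a}_j^\dagger - \hat{a}_j\right), \tag{9.16a}$$
$$\hat{\mathbf{B}}(\mathbf{r}, t) = \sum_j \left(\frac{\hbar\omega_j}{2}\right)^{1/2}\mathbf{b}_j(\mathbf{r})\left(\hat{a}_j^\dagger + \hat{a}_j\right), \tag{9.16b}$$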
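and Eq. (11) for the jth mode’s Hamiltonian becomes
$$\hat{H}_j = \hbar\omega_j\left(\hat{a}_j^\dagger\hat{a}_j + \frac{1}{2}\hat{I}\right) \equiv \hbar\omega_j\left(\hat{n}_j + \frac{1}{2}\hat{I}\right), \quad \text{with } \hat{n}_j \equiv \hat{a}_j^\dagger\hat{a}_j, \tag{9.17}$$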
absolutely similar to Eq. (5.72) for a mechanical oscillator.
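Relations (13), (14), and (17) may be verified numerically in a truncated Fock basis. The sketch below is only an illustration under stated assumptions: units with ħ = 1, an arbitrary sample frequency ω = 2, and a basis cut off at N = 30 states (a cutoff that inevitably spoils the last diagonal element of every operator product, so that the comparison is made on the upper-left (N − 1)×(N − 1) block).

```python
import numpy as np

HBAR = 1.0    # units with hbar = 1 (an assumption of this sketch)
OMEGA = 2.0   # sample mode frequency (arbitrary)
N = 30        # number of Fock states kept; the truncation is an approximation

# Annihilation operator in the Fock basis: a|n> = sqrt(n)|n-1>
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
a_dag = a.conj().T

# Coordinate and momentum operators, obtained by inverting Eq. (14) with m_j = 1
q = np.sqrt(HBAR / (2 * OMEGA)) * (a + a_dag)
p = 1j * np.sqrt(HBAR * OMEGA / 2) * (a_dag - a)

comm_qp = q @ p - p @ q                            # should be i*hbar*I, Eq. (13)
H = (p @ p + OMEGA**2 * (q @ q)) / 2               # Eq. (11)
H_expected = HBAR * OMEGA * (a_dag @ a + 0.5 * np.eye(N))   # Eq. (17)

block = slice(0, N - 1)   # discard the last row/column affected by the cutoff
print(np.allclose(comm_qp[block, block], 1j * HBAR * np.eye(N - 1)))
print(np.allclose(H[block, block], H_expected[block, block]))
```

The diagonal of H reproduces the equidistant ladder ħω(n + 1/2) for all retained states except the very last one, which is an artifact of the finite basis rather than of the theory.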
Now comes a very important conceptual step. From Sec. 5.4 we know that the eigenstates
(Fock states) n_j of the Hamiltonian (17) have energies
Electromagnetic mode’s eigenenergies:
$$E_j = \hbar\omega_j\left(n_j + \frac{1}{2}\right), \quad n_j = 0, 1, 2, \ldots \tag{9.18}$$
and, according to Eq. (5.89), the operators â_j and â_j† act on the eigenkets of these partial states as
$$\hat{a}_j|n_j\rangle = n_j^{1/2}\,|n_j - 1\rangle, \quad \hat{a}_j^\dagger|n_j\rangle = (n_j + 1)^{1/2}\,|n_j + 1\rangle, \tag{9.19}$$
regardless of the quantum states of other modes. These rules coincide with the definitions (8.64) and
(8.68) of bosonic creation-annihilation operators, and hence their action may be considered as the
creation/annihilation of certain bosons. Such a “particle” (actually, an excitation, with energy ħω_j, of an
electromagnetic field oscillator) is exactly what is, strictly speaking, called a photon. Note immediately
that according to Eq. (16), such an excitation does not change the spatial distribution of the jth mode of
the field. So, such a “global” photon is an excitation created simultaneously at all points of the field
confinement region.
If this picture is too contrary to the intuitive image of a particle, please recall that in Chapter 2,
we discussed a similar situation with the fundamental solutions of the Schrödinger equation of a free
non-relativistic particle: they represent sinusoidal de Broglie waves existing simultaneously in all points
of the particle confinement region. The (partial :-) reconciliation with the classical picture of a moving
particle might be obtained by using the linear superposition principle to assemble a quasi-localized wave
packet, as a group of sinusoidal waves with close wave numbers. In the same way, we may form a
wave packet using a linear superposition of the “global” photons with close values of k_j (and hence ω_j),
to form a quasi-localized photon. An additional simplification here is that the dispersion relation for
electromagnetic waves (at least in free space) is linear:
$$\frac{\partial\omega_j}{\partial k_j} = c = \text{const}, \quad \text{i.e. } \frac{\partial^2\omega_j}{\partial k_j^2} = 0, \tag{9.20}$$
so that, according to Eq. (2.39a), the electromagnetic wave packets (i.e. space-localized photons) do not
spread out during their propagation. Note also that due to the fundamental classical relations p = nE/c
for the linear momentum of the traveling electromagnetic wave packet of energy E, propagating along
the direction n ≡ k/k, and L = ±nE/ω_j for its angular momentum,12 such a photon may be ascribed the
linear momentum p = nħω_j/c ≡ ħk and the angular momentum L = ±nħ, with the sign depending on the
direction of its circular polarization (“helicity”).
This electromagnetic field quantization scheme should look very straightforward, but it raises an
important conceptual issue of the ground state energy. Indeed, Eq. (18) implies that the total ground-
state (i.e., the lowest) energy of the field is
Ground-state energy of EM field:
$$E_g = \sum_j (E_g)_j = \sum_j \frac{\hbar\omega_j}{2}. \tag{9.21}$$
Since for any realistic model of the field-confining volume, either infinite or not, the density of
electromagnetic field modes only grows with frequency,13 this sum diverges on its upper limit, leading
to infinite ground-state energy per unit volume. This infinite-energy paradox cannot be dismissed by
declaring the ground-state energy of field oscillators unobservable, because this would contradict
numerous experimental observations – starting perhaps from the famous Casimir effect.14 The
12 See, e.g., EM Sections 7.7 and 9.8.

13 See, e.g., Eq. (1.1), which is similar to Eq. (1.90) for the de Broglie waves, derived in Sec. 1.7.
14 This effect was predicted in 1948 by Hendrik Casimir and Dirk Polder, and confirmed semi-quantitatively in
experiments by M. Sparnaay, Nature 180, 334 (1957). After this, and several other experiments, a decisive error
bar reduction (to about 5%), providing a quantitative confirmation of the Casimir formula (23), was achieved by
conceptually simplest implementation of this effect involves two parallel, perfectly conducting plates of
area A, separated by a vacuum gap of thickness t << A^{1/2} (Fig. 1).
Fig. 9.1. The simplest geometry of the Casimir effect manifestation. (Sketch labels: gap thickness t, axis z.)
Rather counter-intuitively, the plates attract each other with a force F proportional to the area A
and rapidly increasing with the decrease of t, even in the absence of any explicit electromagnetic field
sources. The effect’s explanation is that the energy of each electromagnetic field mode, including its
ground-state energy, exerts average pressure,
$$\langle P_j \rangle = -\frac{\partial E_j}{\partial V}, \tag{9.22}$$
on the walls constraining it to volume V. While the field’s pressure on the external surfaces of the plates
is due to the contributions (22) of all free-space modes, with arbitrary values of kz (the z-component of
the wave vector kj), in the gap between the plates the spectrum of kz is limited to the multiples of π/t, so
that the pressure on the internal surfaces is lower. This is why the net force exerted on the plates may be
calculated as the sum of the contributions (22) from all “missing” low-frequency modes in the gap, with
the minus sign. In the simplest model when the plates are made of an ideal conductor, which provides
boundary conditions E_τ = B_n = 0 on their surfaces,15 such calculation is quite straightforward (and is
hence left for the reader’s exercise), and its result is
Casimir effect:
$$F = -\frac{\pi^2 A \hbar c}{240\, t^4}. \tag{9.23}$$
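To get a feeling for the magnitude of the Casimir force, Eq. (23) may be evaluated for sample parameters. In the following sketch, the plate area A = 1 cm² and the gap t = 1 μm are illustrative assumptions, and the negative sign of the returned value is taken to mean attraction.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 299_792_458.0        # speed of light, m/s

def casimir_force(area, gap):
    """Ideal-conductor Casimir force, F = -pi^2 * A * hbar * c / (240 * t^4).

    area is in m^2, gap in m; a negative result is read here as attraction.
    """
    return -math.pi**2 * area * HBAR * C / (240 * gap**4)

AREA = 1e-4   # 1 cm^2 (hypothetical sample value)
GAP = 1e-6    # 1 micron (hypothetical sample value)

force = casimir_force(AREA, GAP)   # about -1.3e-7 N
pressure = force / AREA            # about -1.3e-3 Pa
print(f"F = {force:.3e} N, F/A = {pressure:.3e} Pa")
```

Because of the t⁻⁴ scaling, halving the gap increases the force by a factor of 16, which is why the effect is measurable only at sub-micron to micron separations.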
Note that for such calculation, the high-frequency divergence of Eq. (21) is not important,
because it participates in the forces exerted on all surfaces of each plate, and cancels out from the net
pressure. In this way, the Casimir effect not only confirms Eq. (21), but also teaches us an important
lesson on how to deal with the divergences of such sums at ω_j → ∞. The lesson is: just get accustomed
to the idea that the divergence exists, and ignore this fact while you can, i.e. if the final result you are
interested in is finite. However, for some more complex problems of quantum electrodynamics (and the
S. Lamoreaux, Phys. Rev. Lett. 78, 5 (1997) and by U. Mohideen and A. Roy, Phys. Rev. Lett. 81, 4549 (1998).
Note also that there are other experimental confirmations of the reality of the ground-state electromagnetic field,
including, for example, the experiments by R. Koch et al. already discussed in Sec. 7.5, and the recent spectacular
direct observations by C. Riek et al., Science 350, 420 (2015).
15 For realistic conductors, the reduction of t below ~1 μm causes significant deviations from this simple model,
and hence from Eq. (23). The reason is that for gaps so narrow, the depth of field penetration into the conductors
(see, e.g., EM Sec. 6.2), at the important frequencies ω ~ c/t, becomes comparable with t, and an adequate theory
of the Casimir effect has to involve a certain model of the penetration. (It is curious that in-depth analyses of this
problem, pioneered in 1956 by E. Lifshitz, have revealed a deep relation between the Casimir effect and the
London dispersion force which was the subject of Problems 3.16, 5.15, and 6.18 – for a review see, e.g., either I.
Dzhyaloshinskii et al., Sov. Phys. Uspekhi 4, 153 (1961), or K. Milton, The Casimir Effect, World Scientific,
2001. Recent experiments in the 100 nm – 2 μm range of t, with an accuracy better than 1%, have allowed not
only to observe the effects of field penetration on the Casimir force, but even to make a selection between some
approximate models of the penetration – see D. Garcia-Sanchez et al., Phys. Rev. Lett. 109, 027202 (2012).)
quantum theory of any other fields), this simplest approach becomes impossible, and then more
complex, renormalization techniques become necessary. For their study, I have to refer the reader to a
quantum field theory course – see the references at the end of this chapter.
9.2. Photon absorption and counting

As a matter of principle, the Casimir effect may be used to measure quantum effects not only in
the free-space electromagnetic field but also in the field arriving from active sources – lasers, etc.
However, usually such studies may be done by simpler detectors, in which the absorption of a photon by
a single atom leads to its ionization. This ionization, i.e. the emission of a free electron, triggers an
avalanche reaction (e.g., an electric discharge in a Geiger-type counter), which may be readily registered
using appropriate electronic circuitry. In good photon counters, the first step, the “trigger” atom
ionization, is the bottleneck of the whole process (the photon count), so that to analyze their statistics, it
is sufficient to consider the field’s interaction with just this atom.
Its ionization is a quantum transition from a discrete initial state of the atom to its final, ionized
state with a continuous energy spectrum, induced by an external electromagnetic field. This is exactly
the situation shown in Fig. 6.12, so we may apply to it the Golden Rule of quantum mechanics in the
form (6.149), with the system a associated with the electromagnetic field, and system b with the trigger
atom. The atom’s size is typically much smaller than the radiation wavelength, so that the field-atom
interaction may be adequately described in the electric dipole approximation (6.146)
$$\hat{H}_{\rm int} = -\hat{\mathbf{E}}\cdot\hat{\mathbf{d}}, \tag{9.24}$$

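where d̂ is the dipole moment’s operator. Hence we may associate this operator with the operand B̂ in
Eqs. (6.145)-(6.149), while the electric field operator Ê is associated with the operand Â in those
relations. First, let us assume that our field consists of only one mode e_j(r) of frequency ω. Then we can
keep only one term in the sum (16a), and drop the index j, so that Eq. (6.149) may be rewritten as
$$\Gamma = \frac{2\pi}{\hbar}\left|\langle \text{fin}|\hat{E}(\mathbf{r}, t)|\text{ini}\rangle\right|^2 \left|\langle \text{fin}|\hat{\mathbf{d}}(t)\cdot\mathbf{n}_e|\text{ini}\rangle\right|^2 \rho_a = \frac{2\pi}{\hbar}\frac{\hbar\omega}{2}\left|\langle \text{fin}|\left[\hat{a}^\dagger(t) - \hat{a}(t)\right]e(\mathbf{r})|\text{ini}\rangle\right|^2 \left|\langle \text{fin}|\hat{\mathbf{d}}(t)\cdot\mathbf{n}_e|\text{ini}\rangle\right|^2 \rho_a, \tag{9.25}$$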
where n_e ≡ e(r)/e(r) is the local direction of the vector e(r), symbols “ini” and “fin” denote the initial
and final states of the corresponding system (the electromagnetic field in the first long bracket, and the
atom in the second bracket), and the density ρ_a of the continuous atomic states should be calculated at its
final energy E_fin = E_ini + ħω.
As a reminder, in the Heisenberg picture of quantum dynamics, the initial and final states are
time-independent, while the creation-annihilation operators are functions of time. In the Golden Rule
formula (25), as in any perturbative result, this time dependence has to be calculated ignoring the
perturbation – in this case, the field-atom interaction. For the field’s creation-annihilation operators, this
dependence coincides with that of the usual 1D oscillator – see Eq. (5.141), in which ω₀ should be, in
our current notation, replaced with ω:
$$\hat{a}(t) = \hat{a}(0)e^{-i\omega t}, \quad \hat{a}^\dagger(t) = \hat{a}^\dagger(0)e^{+i\omega t}. \tag{9.26}$$
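Hence Eq. (25) becomes
$$\Gamma = \pi\omega\left|\langle \text{fin}|\left[\hat{a}^\dagger(0)e^{i\omega t} - \hat{a}(0)e^{-i\omega t}\right]e(\mathbf{r})|\text{ini}\rangle\right|^2 \left|\langle \text{fin}|\hat{\mathbf{d}}(t)\cdot\mathbf{n}_e|\text{ini}\rangle\right|^2 \rho_a. \tag{9.27a}$$
Now let us multiply the first long bracket by exp{iωt}, and the second one by exp{−iωt}:
$$\Gamma = \pi\omega\left|\langle \text{fin}|\left[\hat{a}^\dagger(0)e^{2i\omega t} - \hat{a}(0)\right]e(\mathbf{r})|\text{ini}\rangle\right|^2 \left|\langle \text{fin}|\hat{\mathbf{d}}(t)\cdot\mathbf{n}_e\,e^{-i\omega t}|\text{ini}\rangle\right|^2 \rho_a. \tag{9.27b}$$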
This, mathematically equivalent form of the previous relation shows more clearly that at resonant
photon absorption, only the annihilation operator gives a significant time-averaged contribution to the
first bracket matrix element. (As a reminder, the quantum-mechanical Golden Rule for time-dependent
perturbations is a result of averaging over a time interval much larger than 1/ω – see Sec. 6.6.) Similarly,
according to Eq. (4.199), the Heisenberg operator of the dipole moment, corresponding to the increase
of atom’s energy by ħω, has the Fourier components that differ in frequency from ω only by ~Γ << ω,
so that its time dependence virtually compensates the additional factor in the second bracket of Eq.
(27b), and this bracket also may have a substantial time average. Hence, in the first bracket we may
neglect the fast-oscillating term, whose average over time interval ~1/Γ is very close to zero.16
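The suppression of the fast-oscillating term by time averaging is easy to illustrate numerically. The sketch below averages the factor exp{2iωt} from Eq. (27b) over a window T >> 1/ω and compares it with the time-independent factor of the annihilation term; the values ω = 1 and T = 1000 (in arbitrary units) and the sampling density are assumptions made only for this illustration.

```python
import numpy as np

def window_average(omega, T, samples=200_001):
    """Average exp(2i*omega*t) and the constant 1 over the interval 0 <= t <= T.

    The mean over a uniform grid is used as a simple approximation
    of the time integral divided by T.
    """
    t = np.linspace(0.0, T, samples)
    counter_rotating = np.exp(2j * omega * t)   # factor of the a-dagger term
    resonant = np.ones_like(t)                  # factor of the a term
    return counter_rotating.mean(), resonant.mean()

OMEGA = 1.0      # sample frequency (arbitrary units)
T = 1000.0       # averaging window, with OMEGA * T >> 1

fast, slow = window_average(OMEGA, T)
print(abs(fast), abs(slow))   # the first is of order 1/(OMEGA*T), the second is 1
```

Analytically, the magnitude of the first average is bounded by 1/ωT, so that it vanishes in the limit of long averaging, while the second one stays equal to 1.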
Now let us assume, first, that we use the same detector, characterized by the same matrix
element of the quantum transition, i.e. the same second bracket in Eq. (27), and the same final state
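density ρ_a, for measurement of various electromagnetic fields – or just of the same field at different
points r. Then we are only interested in the behavior of the first, field-related bracket, and may write
$$\Gamma \propto \left|\langle \text{fin}|\hat{a}\,e(\mathbf{r})|\text{ini}\rangle\right|^2 \equiv \langle \text{fin}|\hat{a}\,e(\mathbf{r})|\text{ini}\rangle \langle \text{fin}|\hat{a}\,e(\mathbf{r})|\text{ini}\rangle^* = \langle \text{ini}|\hat{a}^\dagger e^*(\mathbf{r})|\text{fin}\rangle \langle \text{fin}|\hat{a}\,e(\mathbf{r})|\text{ini}\rangle, \tag{9.28}$$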
where the creation-annihilation operators are implied to be taken at t = 0, i.e. in the Schrödinger picture,
and the initial and final states are those of the field alone. Second, let us now calculate the total rate of
transitions to all available final states of the given mode e(r). If such states formed a full and
orthonormal set, we could use the closure relation (4.44), applied to the final states, to write
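Photon counting rate:
$$\Gamma \propto \sum_{\text{fin}} \langle \text{ini}|\hat{a}^\dagger e^*(\mathbf{r})|\text{fin}\rangle \langle \text{fin}|\hat{a}\,e(\mathbf{r})|\text{ini}\rangle = \langle \text{ini}|\hat{a}^\dagger\hat{a}|\text{ini}\rangle\, e^*(\mathbf{r})e(\mathbf{r}) = \langle n \rangle_{\text{ini}} \left|e(\mathbf{r})\right|^2, \tag{9.29}$$
where, for a given field mode, ⟨n⟩_ini is the expectation value of the operator n̂ ≡ â†â for the initial state
of the electromagnetic field. In the more realistic case of fields in relatively large volumes, V >> λ³, with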
their virtually continuous spectrum of final states, the middle equality in this relation is not strictly valid,
but it is correct to a constant multiplier,17 which we are currently not interested in. Note, however, that
Eq. (29) may be substantially wrong for high-Q electromagnetic resonators (“cavities”), which may
make just one (or a few) modes available for transitions. (Quantum electrodynamics of such cavities will
be briefly discussed in Sec. 4 below.)
Let us apply Eq. (29) to several possible quantum states of the mode.
16This is essentially the same rotating wave approximation (RWA), which was already used in Sec. 6.5 and
beyond – see, e.g., the transition from Eq. (6.90) to the first of Eqs. (6.94).
17 As the Golden Rule shows, this multiplier is proportional to the density ρ_f of the final states of the
field.
(i) First, as a sanity check, the ground initial state, n = 0, gives no photon absorption at all. The
interpretation is easy: the ground-state field cannot emit a photon that would ionize an atom in the
counter. Again, this does not mean that the ground-state “motion” is not observable (if you still think so,
please review the Casimir effect discussion in Sec. 1), just that it cannot ionize the trigger atom –
because it does not have any spare energy for doing that.
(ii) All other coherent states (Fock, Glauber, squeezed, etc.) of the field oscillator give the same
counting rate, provided that their ⟨n⟩_ini is the same. This result may be less evident if we apply Eq. (29)
to the interference of two light beams from the same source – say, in the double-slit or the Bragg-
scattering configurations. In this case, we may represent the spatial distribution of the field as a sum
$$e(\mathbf{r}) = e_1(\mathbf{r}) + e_2(\mathbf{r}). \tag{9.30}$$
Here each term describes one possible wave path, so that the operator product in Eq. (29) may be a
rapidly changing function of the detector position. For this configuration, our result (29) means that the
interference pattern (and its contrast) are independent of the particular state of the electromagnetic
field’s mode.
(iii) Surprisingly, the last statement is also valid for a classical mixture of the different
eigenstates of the same field mode, for example for its thermal-equilibrium state. Indeed, in this case we
need to average Eq. (29) over the corresponding classical ensemble, but it would only result in a
different meaning of averaging ⟨n⟩ in that equation; the field part describing the interference pattern is not
affected.
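The independence of the interference contrast from the field’s state, following from Eqs. (29) and (30), may be illustrated with a toy model. In the sketch below, the two paths are represented by two scalar plane waves crossing at a small angle; this particular field geometry, the wavelength (taken for the length unit), and the sample values of ⟨n⟩ are assumptions made only for the illustration.

```python
import numpy as np

def count_rate(x, n_avg, k=2 * np.pi, sin_theta=0.1):
    """Counting rate (up to a constant factor) from Eq. (29): <n> * |e1 + e2|^2.

    e1 and e2 model two plane waves crossing at angles +/- theta,
    observed along the coordinate x measured in wavelengths.
    """
    e1 = np.exp(+1j * k * sin_theta * x)
    e2 = np.exp(-1j * k * sin_theta * x)
    return n_avg * np.abs(e1 + e2) ** 2

def visibility(rate):
    """Interference contrast (max - min) / (max + min)."""
    return (rate.max() - rate.min()) / (rate.max() + rate.min())

x = np.linspace(0.0, 20.0, 4001)   # detector positions
# Different <n> may correspond to Fock, Glauber, or thermal states alike
vis = {n_avg: visibility(count_rate(x, n_avg)) for n_avg in (0.5, 1.0, 7.3)}
print(vis)
```

The average photon number only rescales the whole pattern, so that the computed contrast is the same for every state with a non-zero ⟨n⟩.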
The last result may look a bit counter-intuitive because common sense tells us that the
stochasticity associated with thermal equilibrium has to suppress the interference pattern contrast. These
expectations are (partly :-) justified because a typical thermal source of radiation produces many field
modes j, rather than one mode we have analyzed. These modes may have different wave numbers kj and
hence different field distribution functions ej(r), resulting in shifted interference patterns. Their
summation would indeed smear the interference, suppressing its contrast.
So the use of one photon detector is not the best way to distinguish different quantum states of an
electromagnetic field mode. This task, however, may be achieved using the photon counting correlation
technique shown in Fig. 2.18
Fig. 9.2. Photon count correlation measurement. (Schematic: a light source, a semi-transparent mirror, detectors 1 and 2 with a controllable delay, and a count statistics calculation unit.)
18 It was pioneered as early as the mid-1950s (i.e. before the advent of lasers), by Robert Hanbury Brown and
Richard Twiss. Their second experiment was also remarkable for the rather unusual light source – the star Sirius!
(Their work was an effort to improve astrophysics interferometry techniques.)
In this experiment, the counter rate correlation may be characterized by the so-called second-
order correlation function of the counting rates,
Second-order correlation function:
$$g^{(2)}(\tau) \equiv \frac{\langle \Gamma_1(t)\,\Gamma_2(t - \tau)\rangle}{\langle \Gamma_1(t)\rangle \langle \Gamma_2(t)\rangle}, \tag{9.31}$$
where the averaging may be carried out either over many similar experiments, or over a relatively long
time interval Δt >> τ, with usual field sources – due to their ergodicity. Using the normalized correlation
function (31) is very convenient because the characteristics of both detectors and the beam splitter (e.g.,
a semi-transparent mirror, see Fig. 2) drop out from this fraction.
Very unexpectedly for the mid-1950s, Hanbury Brown and Twiss discovered that the correlation
function depends on time delay τ in the way shown (schematically) with the solid line in Fig. 3. It is
evident from Eq. (31) that if the counting events are completely independent, g(2)(τ) should be equal to 1
– which is always the case in the limit τ → ∞. (As will be shown in the next section, the characteristic
time of this approach is usually between 10⁻¹¹ s and 10⁻⁸ s, so that for its measurement, the delay time
control may be provided just by moving one of the detectors by a human-scale distance, from a few
millimeters to a few meters.) Hence, the observed behavior at τ → 0 corresponds to a positive
correlation of detector counts at small time delays, i.e. to a higher probability of the nearly simultaneous
arrival of photons to both counters. This counter-intuitive effect is called photon bunching.
Fig. 9.3. Photon bunching (solid line) and antibunching for various n (dashed lines). The lines approach level g(2) = 1 at τ → ∞ (on the time scale depending on the light source). (Plot of g(2) versus τ, with vertical-axis marks 0, 1, and 2; the lowest dashed line is labeled n = 1.)
Let us use our simple single-mode model to analyze this experiment. Now the elementary
quantum process characterized by the numerator of Eq. (31), is the correlated, simultaneous ionization
of two trigger atoms, at two spatial-temporal points {r₁, t} and {r₂, t − τ}, by the same field mode, so
that we need to make the following replacement in the first of Eqs. (25):
$$\hat{E}(\mathbf{r}, t) \to \text{const} \times \hat{E}(\mathbf{r}_1, t)\,\hat{E}(\mathbf{r}_2, t - \tau). \tag{9.32}$$
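Repeating all the manipulations done above for the single-counter case, we get
$$\langle \Gamma_1(t)\,\Gamma_2(t - \tau)\rangle \propto \langle \text{ini}|\hat{a}^\dagger(t)\,\hat{a}^\dagger(t - \tau)\,\hat{a}(t - \tau)\,\hat{a}(t)|\text{ini}\rangle\, e^*(\mathbf{r}_1)e^*(\mathbf{r}_2)e(\mathbf{r}_1)e(\mathbf{r}_2). \tag{9.33}$$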
Plugging this expression, as well as Eq. (29) for single-counter rates, into Eq. (31), we see that the field
distribution factors (as well as the detector-specific brackets and the density of states ρ_a) cancel, giving a
very simple final expression:
$$g^{(2)}(\tau) = \frac{\langle \hat{a}^\dagger(t)\,\hat{a}^\dagger(t - \tau)\,\hat{a}(t - \tau)\,\hat{a}(t)\rangle}{\langle \hat{a}^\dagger(t)\,\hat{a}(t)\rangle^2}, \tag{9.34}$$
where the averaging should be carried out, as before, over the initial state of the field.
Still, the calculation of this expression for arbitrary τ may be quite complex, because in many
cases the relaxation of the correlation function to the asymptotic value g(2)(∞) is due to the interaction of
the light source with the environment, and hence requires the open-system techniques that were
discussed in Chapter 7. However, the zero-delay value g(2)(0) may be calculated straightforwardly,
because the time arguments of all operators are equal, so that we may write
Zero-delay correlation:
$$g^{(2)}(0) = \frac{\langle \hat{a}^\dagger\hat{a}^\dagger\hat{a}\hat{a}\rangle}{\langle \hat{a}^\dagger\hat{a}\rangle^2}. \tag{9.35}$$
Let us evaluate this ratio for the simplest states of the field.
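(i) The nth Fock state. In this case, it is convenient to act with the annihilation operators upon the
ket-vectors, and with the creation operators, upon the bra-vectors, using Eqs. (19):
Photon antibunching:
$$g^{(2)}(0) = \frac{\langle n |\, \hat{a}^\dagger \hat{a}^\dagger \hat{a} \hat{a}\, | n \rangle}{\langle n |\, \hat{a}^\dagger \hat{a}\, | n \rangle^2} = \frac{\langle n-2 |\, [n(n-1)]^{1/2}\, [n(n-1)]^{1/2}\, | n-2 \rangle}{\left( \langle n-1 |\, n^{1/2}\, n^{1/2}\, | n-1 \rangle \right)^2} = \frac{n(n-1)}{n^2} = 1 - \frac{1}{n} < 1. \tag{9.36}$$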
We see that the correlation function at small delays is suppressed rather than enhanced – see the dashed
lines in Fig. 3. This photon antibunching effect has a very simple handwaving explanation: a single
photon emitted by the wave source may be absorbed by just one of the detectors. For the initial state n =
1, this is the only option, and it is very natural that Eq. (36) predicts no simultaneous counts at τ = 0.
Despite this theoretical simplicity, reliable observations of the antibunching were not carried out
until 1977,19 due to the experimental difficulty of driving electromagnetic field oscillators into their
Fock states – see Sec. 4 below.
(ii) The Glauber state α. A similar procedure, but now using Eq. (5.124) and its Hermitian
conjugate, ⟨α|â† = ⟨α|α*, yields
Glauber field statistics:
$$g^{(2)}(0) = \frac{\langle \alpha |\, \hat{a}^\dagger \hat{a}^\dagger \hat{a} \hat{a}\, | \alpha \rangle}{\langle \alpha |\, \hat{a}^\dagger \hat{a}\, | \alpha \rangle^2} = \frac{\alpha^* \alpha^* \alpha \alpha}{(\alpha^* \alpha)^2} = 1, \tag{9.37}$$
for any parameter α. We see that the result is different from that for the Fock states, unless in the latter
case n → ∞. (We know that the Fock and Glauber properties should also coincide for the ground state,
but at that state the correlation function’s value is uncertain, because there are no photon counts at all.)
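The results (36) and (37) are easy to check numerically. The following is only a minimal sketch, assuming a Fock basis truncated to a finite dimension (the value `DIM = 80` and all function names are illustrative choices, not part of the notes); it evaluates Eq. (35) directly with matrices of the creation and annihilation operators.

```python
import numpy as np

def annihilation(dim):
    """Annihilation operator in a Fock basis truncated to `dim` states: a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

def g2_zero(state, a):
    """Zero-delay correlation <a+ a+ a a> / <a+ a>^2 of Eq. (35) for a pure state vector."""
    ad = a.conj().T
    numerator = np.vdot(state, ad @ ad @ a @ a @ state).real
    mean_n = np.vdot(state, ad @ a @ state).real
    return numerator / mean_n**2

def fock_state(n, dim):
    """Unit vector representing the nth Fock state."""
    state = np.zeros(dim, dtype=complex)
    state[n] = 1.0
    return state

def glauber_state(alpha, dim):
    """Glauber-state amplitudes c_n ~ alpha^n / sqrt(n!), built recursively, then normalized."""
    c = np.zeros(dim, dtype=complex)
    c[0] = 1.0
    for n in range(1, dim):
        c[n] = c[n - 1] * alpha / np.sqrt(n)
    return c / np.linalg.norm(c)

DIM = 80  # illustrative truncation; must be well above the photon numbers involved
a = annihilation(DIM)
for n in (1, 2, 5, 20):
    print(f"Fock n = {n}: g2(0) = {g2_zero(fock_state(n, DIM), a):.4f}")
print(f"Glauber alpha = 1.5+0.5j: g2(0) = {g2_zero(glauber_state(1.5 + 0.5j, DIM), a):.6f}")
```

The Fock-state values follow 1 – 1/n, while the Glauber-state value stays at 1 as long as the truncation dimension is much larger than |α|².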
(iii) Classical mixture. From Chapter 7, we know that such statistical ensembles cannot be
described by single state vectors, and require the density matrix w for their description. Here, we may
combine Eqs. (35) and (7.5) to write
$$g^{(2)}(0) = \frac{\text{Tr}\left( \hat{w}\, \hat{a}^\dagger \hat{a}^\dagger \hat{a} \hat{a} \right)}{\left[ \text{Tr}\left( \hat{w}\, \hat{a}^\dagger \hat{a} \right) \right]^2}. \tag{9.38}$$
19 By H. J. Kimble et al., Phys. Rev. Lett. 39, 691 (1977). For a detailed review of photon antibunching, see, e.g.,
H. Paul, Rev. Mod. Phys. 54, 1061 (1982).
Spelling out this expression is easy for the field in thermal equilibrium at some temperature T,
because its density matrix is diagonal in the basis of Fock states n – see Eqs. (7.24):
$$w_{nn'} = W_n \delta_{nn'}, \qquad W_n = \frac{1}{Z} \exp\left\{ -\frac{E_n}{k_\text{B} T} \right\} = \lambda^n \Big/ \sum_{n=0}^{\infty} \lambda^n, \qquad \text{where } \lambda \equiv \exp\left\{ -\frac{\hbar\omega}{k_\text{B} T} \right\}. \tag{9.39}$$
So, for the operators in the numerator and denominator of Eq. (38) we also need just the diagonal terms
of the operator products, which have already been calculated – see Eq. (36). As a result, we get
$$g^{(2)}(0) = \frac{\sum_{n=0}^{\infty} W_n\, n(n-1)}{\left( \sum_{n=0}^{\infty} W_n\, n \right)^2} = \frac{\sum_{n=0}^{\infty} \lambda^n n(n-1) \times \sum_{n=0}^{\infty} \lambda^n}{\left( \sum_{n=0}^{\infty} \lambda^n n \right)^2}. \tag{9.40}$$
One of the three series involved in this expression is just the usual geometric progression,
$$\sum_{n=0}^{\infty} \lambda^n = \frac{1}{1 - \lambda}, \tag{9.41}$$
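and the remaining two series may be readily calculated by its differentiation over the parameter λ:
$$\sum_{n=0}^{\infty} \lambda^n n \equiv \lambda \sum_{n=0}^{\infty} \lambda^{n-1} n = \lambda \frac{d}{d\lambda} \sum_{n=0}^{\infty} \lambda^n = \lambda \frac{d}{d\lambda} \frac{1}{1 - \lambda} = \frac{\lambda}{(1 - \lambda)^2},$$
$$\sum_{n=0}^{\infty} \lambda^n n(n-1) \equiv \lambda^2 \sum_{n=0}^{\infty} \lambda^{n-2} n(n-1) = \lambda^2 \frac{d^2}{d\lambda^2} \sum_{n=0}^{\infty} \lambda^n = \lambda^2 \frac{d^2}{d\lambda^2} \frac{1}{1 - \lambda} = \frac{2\lambda^2}{(1 - \lambda)^3}, \tag{9.42}$$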
and for the correlation function we get an extremely simple result independent of the parameter λ and
hence of temperature:
Photon bunching:
$$g^{(2)}(0) = \frac{\left[ 2\lambda^2 / (1 - \lambda)^3 \right] \left[ 1 / (1 - \lambda) \right]}{\left[ \lambda / (1 - \lambda)^2 \right]^2} \equiv 2. \tag{9.43}$$
This is exactly the photon bunching effect first observed by Hanbury Brown and Twiss – see Fig.
3. We see that in contrast to antibunching, this is an essentially classical (statistical) effect. Indeed, Eq.
(43) allows a purely classical derivation. In the classical theory, the counting rate (of a single counter) is
proportional to the wave intensity I, so that Eq. (31) with τ = 0 is reduced to
$$g^{(2)}(0) = \frac{\langle I^2 \rangle}{\langle I \rangle^2}, \qquad \text{with } I \propto \overline{E^2(t)} \propto E_\omega E_\omega^*. \tag{9.44}$$
For a sinusoidal field, the intensity is constant, and g(2)(0) = 1. (This is also evident from Eq. (37),
because the classical state may be considered as a Glauber state with α → ∞.) On the other hand, if the
intensity fluctuates (either in time, or from one experiment to another), the averages in Eq. (44) should
be calculated as
$$\langle I^k \rangle = \int_0^{\infty} w(I)\, I^k\, dI, \qquad \text{with } \int_0^{\infty} w(I)\, dI = 1, \text{ and } k = 1, 2, \tag{9.45}$$
where w(I) is the probability density. For classical statistics, the probability is an exponential function of
the electromagnetic field energy, and hence its intensity:
$$w(I) = C e^{-\beta I}, \qquad \text{where } \beta \propto 1 / k_\text{B} T, \tag{9.46}$$
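so that Eqs. (45) yield:20
$$\int_0^{\infty} C \exp\{-\beta I\}\, dI \equiv C / \beta = 1, \quad \text{and hence } C = \beta,$$
$$\langle I^k \rangle = \int_0^{\infty} w(I)\, I^k\, dI = C \int_0^{\infty} \exp\{-\beta I\}\, I^k\, dI = \frac{1}{\beta^k} \int_0^{\infty} \exp\{-\xi\}\, \xi^k\, d\xi = \begin{cases} 1/\beta, & \text{for } k = 1, \\ 2/\beta^2, & \text{for } k = 2. \end{cases} \tag{9.47}$$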
Plugging these results into Eq. (44), we get g(2)(0) = 2, in complete agreement with Eq. (43).
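Both routes to the thermal-light value of g(2)(0) can be verified numerically. The sketch below is only an illustration under stated assumptions: the quantum sums of Eq. (40) are truncated at a hypothetical `n_max`, and the classical integrals of Eq. (45) are replaced with a midpoint sum up to a hypothetical cutoff of 50/β; the function names are not from the notes.

```python
import numpy as np

def thermal_g2_quantum(lam, n_max=2000):
    """Eq. (40) with Fock-state probabilities W_n proportional to lam**n, sums cut at n_max."""
    n = np.arange(n_max + 1)
    W = lam ** n
    W = W / W.sum()
    return (W * n * (n - 1)).sum() / (W * n).sum() ** 2

def thermal_g2_classical(beta, points=200_000, cutoff=50.0):
    """Eq. (44) with w(I) = beta * exp(-beta * I), integrated by the midpoint rule."""
    dI = cutoff / beta / points
    I = (np.arange(points) + 0.5) * dI
    w = beta * np.exp(-beta * I)
    mean_I = (w * I).sum() * dI
    mean_I2 = (w * I**2).sum() * dI
    return mean_I2 / mean_I**2

for lam in (0.1, 0.5, 0.9):
    print(f"lambda = {lam}: quantum g2(0) = {thermal_g2_quantum(lam):.6f}")
for beta in (0.5, 1.0, 3.0):
    print(f"beta = {beta}: classical g2(0) = {thermal_g2_classical(beta):.6f}")
```

Within the truncation errors, every line prints the value 2, independent of λ (i.e. of temperature) and of β.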
For some field states, including the squeezed ground states ζ discussed at the end of Sec. 5.5,
values g(2)(0) may be even higher than 2 – the so-called super-bunching. Analyses of two cases of such
super-bunching are offered for the reader’s exercise – see the problem list at the end of the chapter.
9.3. Photon emission: spontaneous and stimulated

In our simple model of photon counting, considered in the last section, the trigger atom in the
counter absorbed a photon. Now let us have a look at the opposite process of spontaneous emission of
photons by an atom in an excited state, still using the same electric-dipole approximation (24) for the
atom-to-field interaction. For this, we may still use the Golden Rule for the model depicted in Fig. 6.12,
but now the roles have changed: we have to associate the operator Â with the electric dipole moment of
the atom, while the operator B̂ , with the electric field, so that the continuous spectrum of the system b
represents the plurality of the electromagnetic field modes into which the spontaneous radiation may
happen. Since now the transition increases the energy of the electromagnetic field, and decreases that of
the atom, after the multiplication of the field bracket in Eq. (27a) by exp{-iωt}, and the second, by
exp{+iωt}, we may keep only the photon creation operator whose time evolution (26) compensates this
additional fast “rotation”. As a result, the Golden Rule takes the following form:
Spontaneous photon emission rate:
$$\Gamma_s = \pi\omega \left| \langle \text{fin} |\, \hat{a}^\dagger\, | 0 \rangle \right|^2 \left| \langle \text{fin} |\, \hat{\mathbf{d}} \cdot \mathbf{e}(\mathbf{r})\, | \text{ini} \rangle \right|^2 \rho_f, \tag{9.48}$$
where all operators and states are time-independent (i.e. taken in the Schrödinger picture), and ρ_f is the
density of final states of the electromagnetic field – which in this problem plays the role of the atom’s
environment.21 Here the electromagnetic field oscillator has been assumed to be initially in the ground
state – the assumption that will be changed later in this section.
This relation, together with Eq. (19), shows that for the field’s matrix element to be different from
zero, the final state of the field has to be the first excited Fock state, n = 1. (By the way, this is exactly
20See, e.g., MA Eq. (6.7c) with n = 0 and n = 1.

21 Here the sum over all electromagnetic field modes j may be smuggled back. Since in the quasi-static
approximation k_j a << 1, which is necessary for the interaction representation by Eq. (24), the matrix elements in
Eq. (48) are virtually independent of the direction of the wave vectors, and their magnitudes are fixed by ω, the
summation is reduced to the calculation of the total ρ_f for all modes, and the averaging of e²(r) – see below.
the most practicable way of generating an excited Fock state of a field oscillator.) With that, Eq. (48)
yields
$$\Gamma_s = \pi\omega \left| \langle \text{fin} |\, \hat{\mathbf{d}} \cdot \mathbf{e}(\mathbf{r})\, | \text{ini} \rangle \right|^2 \rho_f = \pi\omega \left| \langle \text{fin} |\, \hat{d}\, e_d(\mathbf{r})\, | \text{ini} \rangle \right|^2 \rho_f, \tag{9.49}$$
where the density ρ_f of the excited electromagnetic field states should be calculated at the energy E =
ħω, and e_d is the Cartesian component of the vector e(r) along the electric dipole’s direction. The
expression for the density ρ_f was our first formula in this course – see Eq. (1.1).22 From it, we get
$$\rho_f \equiv \frac{dN}{dE} = V \frac{\omega^2}{\pi^2 \hbar c^3}, \tag{9.50}$$
where the bounding volume V should be large enough to ensure the spectrum’s virtual continuity: V >> λ³ =
(2πc/ω)³. Because of that, in the normalization condition used to simplify Eq. (9), we may consider e²(r)
constant. Let us represent this square as a sum of squares of the three Cartesian components of the
vector e(r), one of which (e_d) is aligned with the dipole’s direction; due to the space isotropy we may write
$$e^2 = e_d^2 + e_{\perp 1}^2 + e_{\perp 2}^2 = 3 e_d^2. \tag{9.51}$$
As a result, the normalization condition yields
$$e_d^2 = \frac{1}{3 \varepsilon_0 V}, \tag{9.52}$$
and Eq. (49) gives the famous (and very important) formula23
Free-space spontaneous emission rate:
$$\Gamma_s = \frac{1}{4\pi\varepsilon_0} \frac{4\omega^3}{3\hbar c^3} \left| \langle \text{fin} |\, \hat{\mathbf{d}}\, | \text{ini} \rangle \right|^2 \equiv \frac{1}{4\pi\varepsilon_0} \frac{4\omega^3}{3\hbar c^3} \langle \text{fin} |\, \hat{\mathbf{d}}\, | \text{ini} \rangle \cdot \langle \text{fin} |\, \hat{\mathbf{d}}\, | \text{ini} \rangle^*. \tag{9.53}$$
Leaving a comparison of this formula with the classical theory of radiation,24 and the exact
evaluation of Γ_s for a particular transition in the hydrogen atom, for the reader’s exercises, let me just
estimate its order of magnitude. Assuming that d ~ er_B ≡ eħ²/m_e(e²/4πε0) and ħω ~ E_H ≡ m_e(e²/4πε0)²/ħ²,
and taking into account the definition (6.62) of the fine structure constant α ≈ 1/137, we get
$$\frac{\Gamma}{\omega} \sim \left( \frac{e^2}{4\pi\varepsilon_0 \hbar c} \right)^3 \equiv \alpha^3 \sim 3 \times 10^{-7}. \tag{9.54}$$
This estimate shows that the emission lines at atomic transitions are typically very sharp. With the
present-day availability of high-speed electronics, it also makes sense to evaluate the time scale τ = 1/Γ
of the typical quantum transition: for a typical optical frequency ω ~ 3×10^15 s^-1, it is close to 1 ns. This is
22 If the same atom is placed into a high-Q resonant cavity (see, e.g., EM 7.9), the rate of its photon emission is
strongly suppressed at frequencies between the cavity resonances (where ρ_f → 0) – see, e.g., the review by S.
Haroche and D. Kleppner, Phys. Today 42, 24 (Jan. 1989). On the other hand, the emission is strongly (by a factor
~ (λ³/V)Q, where V is the cavity’s volume) enhanced at resonance frequencies – the so-called Purcell effect,
discovered by E. Purcell in the 1940s. For a brief discussion of this and other quantum electrodynamic effects in
cavities, see the next section.
23 This was the breakthrough result obtained by P. Dirac in 1927, which jumpstarted the whole field of quantum
electrodynamics. An equivalent expression was obtained from more formal arguments in 1930 by V. Weisskopf
and E. Wigner, so that sometimes Eq. (53) is (very unfairly) called the “Weisskopf-Wigner formula”.
exactly the time constant that determines the time-delay dependence of the photon counting statistics of
the spontaneously emitted radiation – see Fig. 3. Colloquially, this is the temporal scale of the photon
emitted by an atom.25
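The order-of-magnitude estimate (54) and the resulting time scale may be reproduced with a few lines of arithmetic. This is just a sketch under stated assumptions: the SI constants are typed-in CODATA 2018 values, the function name `spontaneous_rate` is hypothetical, and the dipole moment and frequency are the same crude scales d = e·r_B and ħω = E_H used in the text, not data for any real transition.

```python
import math

# SI values of fundamental constants (CODATA 2018), typed in for self-containment
HBAR = 1.054571817e-34     # J*s
E_CHARGE = 1.602176634e-19  # C
EPS0 = 8.8541878128e-12    # F/m
C_LIGHT = 2.99792458e8     # m/s
M_E = 9.1093837015e-31     # kg

def spontaneous_rate(omega, d):
    """Free-space spontaneous emission rate of Eq. (53) for a dipole matrix element of magnitude d."""
    return (1.0 / (4 * math.pi * EPS0)) * 4 * omega**3 * d**2 / (3 * HBAR * C_LIGHT**3)

coulomb = E_CHARGE**2 / (4 * math.pi * EPS0)  # e^2 / (4 pi eps0)
alpha = coulomb / (HBAR * C_LIGHT)            # fine structure constant
r_B = HBAR**2 / (M_E * coulomb)               # Bohr radius
omega_H = M_E * coulomb**2 / HBAR**3          # Hartree energy divided by hbar

ratio = spontaneous_rate(omega_H, E_CHARGE * r_B) / omega_H
tau = 1.0 / (alpha**3 * 3e15)                 # time scale for omega ~ 3e15 1/s, as in the text

print(f"1/alpha          = {1 / alpha:.3f}")
print(f"alpha^3          = {alpha**3:.3e}")
print(f"Gamma/omega      = {ratio:.3e}  (equals (4/3) alpha^3 for these scales)")
print(f"tau = 1/Gamma    = {tau:.2e} s")
```

With these scales, Eq. (53) gives exactly (4/3)α³ for the ratio Γ/ω; the estimate (54) drops the numerical factor 4/3, which is immaterial at this level of accuracy.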
Note, however, that the above estimate of τ is only valid for a transition with a non-zero electric-dipole
matrix element. If it equals zero, i.e. the transition does not satisfy the selection rules,26 – say,
due to the initial and final state symmetry – it is “forbidden”. The “forbidden” transition may still take
place due to a different, smaller interaction (say, via a magnetic dipole field of the atom, or its
quadrupole electric field27), but takes much longer. In some cases the increase of τ is rather dramatic –
sometimes to hours! Such long-lasting radiation is called the luminescence – or the fluorescence if the
initial atom’s excitation was due to external radiation of a higher frequency, followed first by non-radiative
transitions down the ladder of energy levels.
Now let us consider a more general case when the electromagnetic field mode of frequency ω is
initially in an arbitrary Fock state n, and from it may either get energy ħω from the atomic system
(photon emission) or, vice versa, give such energy back to the atom (photon absorption). For the photon
emission rate, an evident generalization of Eq. (48) gives
$$\frac{\Gamma_e}{\Gamma_s} \equiv \frac{\Gamma_{n \to \text{fin}}}{\Gamma_{0 \to 1}} = \frac{\left| \langle \text{fin} |\, \hat{a}^\dagger\, | n \rangle \right|^2}{\left| \langle 1 |\, \hat{a}^\dagger\, | 0 \rangle \right|^2}, \tag{9.55}$$
where both brackets should be calculated in the Schrödinger picture, and Γ_s is the spontaneous emission
rate (48) of the same atomic system. According to the second of Eqs. (19), at the photon emission, the
final field state has to be the Fock state with n’ = n + 1, and Eq. (55) yields
Stimulated photon emission rate:
$$\Gamma_e = (n + 1)\, \Gamma_s. \tag{9.56}$$
Thus the initial field increases the photon emission rate; this effect is called the stimulated emission of
radiation. Note that the spontaneous emission may be considered as a particular case of the stimulated
emission for n = 0, and hence interpreted as the emission stimulated by the ground state of the
electromagnetic field – one more manifestation of the non-trivial nature of this “vacuum” state.
On the other hand, following the arguments of Sec. 2,28 for the description of radiation
absorption, the photon creation operator has to be replaced with the annihilation operator, giving the
rate ratio
25 The scale cτ of the spatial extension of the corresponding wave packet is surprisingly macroscopic – in the
range of a few millimeters. Such a “human” size of spontaneously emitted photons makes the usual optical table,
with its 1-cm-scale components, the key equipment for many optical experiments – see, e.g., Fig. 2.
26 As was already discussed in Sec. 5.6, for a single spin-less particle moving in a spherically-symmetric potential
(e.g., a hydrogen-like atom), the orbital selection rules are simple: the only allowed electric-dipole transitions are
those with Δl ≡ l_fin – l_ini = ±1 and Δm ≡ m_fin – m_ini = 0 or ±1. The simplest example of the transition that does not
satisfy this rule, i.e. is “forbidden”, is that between the s-states (l = 0) with n = 2 and n = 1; because of that, the
lifetime of the lowest excited s-state of a hydrogen atom is as long as ~0.15 s.
27 See, e.g., EM Sec. 8.9.
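28 Note, however, a major difference between the rate Γ discussed in Sec. 2, and Γ_a in Eq. (57). In our current
case, the atomic transition is still between two discrete energy levels (see Fig. 4 below), so that the rate Γ_a is
$$\frac{\Gamma_a}{\Gamma_s} = \frac{\left| \langle \text{fin} |\, \hat{a}\, | n \rangle \right|^2}{\left| \langle 1 |\, \hat{a}^\dagger\, | 0 \rangle \right|^2}. \tag{9.57}$$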
According to this relation and the first of Eqs. (19), the final state of the field at the photon absorption
has to be the Fock state with n’ = n – 1, and Eq. (57) yields
Photon absorption rate:
$$\Gamma_a = n\, \Gamma_s. \tag{9.58}$$
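The rate ratios (55)–(58) are just squared matrix elements of the creation and annihilation operators, so they can be tabulated directly. Below is a minimal sketch, assuming a Fock basis truncated just above the states involved; the function name `rate_ratios` is an illustrative choice.

```python
import numpy as np

def rate_ratios(n):
    """Return (Gamma_e / Gamma_s, Gamma_a / Gamma_s) for the field initially in the nth Fock state."""
    dim = n + 2                                   # enough states to hold |n+1>
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)  # annihilation operator
    a_dag = a.T                                   # creation operator (real matrix)
    ket_n = np.zeros(dim)
    ket_n[n] = 1.0
    ket_0 = np.zeros(dim)
    ket_0[0] = 1.0
    # Each operator connects |n> to a single final state, so the squared norm of the
    # resulting vector equals the squared modulus of the only nonzero matrix element.
    spontaneous = np.linalg.norm(a_dag @ ket_0) ** 2   # |<1|a+|0>|^2
    emission = np.linalg.norm(a_dag @ ket_n) ** 2 / spontaneous
    absorption = np.linalg.norm(a @ ket_n) ** 2 / spontaneous
    return emission, absorption

for n in range(5):
    e, ab = rate_ratios(n)
    print(f"n = {n}: Gamma_e/Gamma_s = {e:.1f}, Gamma_a/Gamma_s = {ab:.1f}")
```

The emission column reads n + 1 and the absorption column reads n, with the n = 0 row reproducing pure spontaneous emission and no absorption.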
The results (56) and (58) are usually formulated in terms of relations between the Einstein
coefficients A and B defined in the way shown in Fig. 4, where the two energy levels are those of the
atom, Γ_a is the rate of energy absorption from the electromagnetic field in its nth Fock state, and Γ_e is that
of energy emission into the field, initially in the same state. In this notation, Eqs. (56) and (58) yield29
Einstein coefficients’ relation:
$$A_{21} = B_{21} = B_{12}, \tag{9.59}$$
because each of these coefficients equals the spontaneous emission rate Γ_s.
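Fig. 9.4. The Einstein coefficients on the atomic quantum transition diagram – cf. Fig. 7.6. (Two atomic
levels with occupation probabilities W2 and W1, separated by ΔE = ħω, are connected by the absorption
rate Γ_a = B12 n and the emission rate Γ_e = A21 + B21 n.)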
I cannot resist the temptation to use this point for a small detour – an alternative derivation of the
Bose-Einstein statistics for photons. Indeed, in the thermodynamic equilibrium, the average probability
flows between levels 1 and 2 (see Fig. 4 again) should be equal:30
$$W_2 \Gamma_e = W_1 \Gamma_a, \tag{9.60}$$
where W1 and W2 are the probabilities for the atomic system to occupy the corresponding levels, so that
Eqs. (56) and (58) yield
$$W_2 \Gamma_s \left( 1 + \langle n \rangle \right) = W_1 \Gamma_s \langle n \rangle, \qquad \text{i.e. } \frac{W_2}{W_1} = \frac{\langle n \rangle}{\langle n \rangle + 1}, \tag{9.61}$$
where ⟨n⟩ is the average number of photons in the field causing the interstate transitions. But, on the
other hand, for an atomic subsystem only weakly coupled to its electromagnetic environment, we ought
to have the Gibbs distribution of these probabilities:
$$\frac{W_2}{W_1} = \frac{\exp\{-E_2 / k_\text{B} T\}}{\exp\{-E_1 / k_\text{B} T\}} = \exp\left\{ -\frac{\Delta E}{k_\text{B} T} \right\} = \exp\left\{ -\frac{\hbar\omega}{k_\text{B} T} \right\}. \tag{9.62}$$
proportional to ρ_f, the density of final states of the electromagnetic field, i.e. the same density as in Eq. (48) and
beyond, while the rate (27) is proportional to ρ_a, the density of final (ionized) states of the “trigger” atom – more
exactly, of the electron released at its ionization.
29 These relations were conjectured, from very general arguments, by Albert Einstein as early as 1916.
30 This is just a particular embodiment of the detailed balance equation (7.198).
Requiring Eqs. (61) and (62) to give the same result for the probability ratio, we get the Bose-Einstein
distribution for the electromagnetic field in thermal equilibrium:
$$\langle n \rangle = \frac{1}{\exp\{\hbar\omega / k_\text{B} T\} - 1} \tag{9.63}$$
– the same result as that obtained in Sec. 7.1 by other means – see Eq. (7.26b).
Now returning to the discussion of Eqs. (56) and (58), their very important implication is the
possibility to achieve the stimulated emission of coherent radiation using the level occupancy inversion.
Indeed, if the ratio W2/W1 is larger than that given by Eq. (62), the net power flow from the atomic
system into the electromagnetic field,
$$\text{power} = \hbar\omega\, \Gamma_s \left[ W_2 \left( \langle n \rangle + 1 \right) - W_1 \langle n \rangle \right], \tag{9.64}$$
may be positive. The necessary inversion may be produced in several ways, notably by intensive
quantum transitions to level 2 from an even higher energy level (which, in turn, is populated, e.g., by
absorption of external radiation, usually called pumping, at a higher frequency).
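The detailed-balance argument of Eqs. (60)–(63) and the sign of the power flow (64) can be illustrated with a short calculation. This is a sketch only: the variable x stands for the ratio ħω/k_BT, and the function names and the sample occupancies are illustrative assumptions rather than anything fixed by the text.

```python
import math

def mean_photons_from_balance(x):
    """Solve <n>/(<n> + 1) = W2/W1, Eq. (61), with the Gibbs ratio W2/W1 = exp(-x) of Eq. (62)."""
    ratio = math.exp(-x)
    return ratio / (1.0 - ratio)

def bose_einstein(x):
    """Bose-Einstein distribution of Eq. (63), with x = hbar*omega / (k_B * T)."""
    return 1.0 / math.expm1(x)

def net_power(W2, W1, n, hbar_omega=1.0, gamma_s=1.0):
    """Net power flow from the atomic system into the field, Eq. (64)."""
    return hbar_omega * gamma_s * (W2 * (n + 1) - W1 * n)

x = 1.0
n_eq = mean_photons_from_balance(x)
W1_eq = 1.0 / (1.0 + math.exp(-x))   # Gibbs occupancies of a two-level system
W2_eq = 1.0 - W1_eq
print(f"<n> from detailed balance: {n_eq:.6f}, Bose-Einstein: {bose_einstein(x):.6f}")
print(f"power at equilibrium:      {net_power(W2_eq, W1_eq, n_eq):+.2e}")
print(f"power with inversion:      {net_power(0.6, 0.4, 10.0):+.2f}")
```

At thermal equilibrium the two probability flows cancel, while any occupancy ratio W2/W1 above ⟨n⟩/(⟨n⟩ + 1) gives a positive net power.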
A less obvious, but crucial feature of the stimulated emission is spelled out by Eq. (55): as was
mentioned above, it shows that the final state of the field after the absorption of energy ħω from the
atom is a pure (coherent) Fock state (n + 1). Colloquially, one may say that the new, (n + 1)st photon
emitted from the atom is automatically in phase with the n photons that had been in the field mode
initially, i.e. joins them coherently.31 The idea of stimulated emission of coherent radiation using
population inversion32 was first implemented in the early 1950s in the microwave range (masers) and in
1960 in the optical range (lasers). Nowadays, lasers are ubiquitous components of almost all high-tech
systems and constitute one of the cornerstones of our technological civilization.
A quantitative discussion of laser operation is well beyond the framework of this course, and I
have to refer the reader to special literature,33 but still would like to briefly mention two key points:
(i) In a typical laser, each generated electromagnetic field mode is in its Glauber (rather than the
Fock) state, so that Eqs. (56) and (58) are applicable only for the n averaged over the Fock-state
decomposition of the Glauber state – see Eq. (5.134).
(ii) Since in a typical laser n >> 1, its operation may be well described using quasiclassical
theories that use Eq. (64) to describe the electromagnetic energy balance (with the addition of a term
describing the energy loss due to field absorption in external components of the laser, including the
useful load), plus the equation describing the balance of occupancies W1,2 due to all interlevel transitions
– similar to Eq. (60), but including also the contribution(s) from the particular population inversion
mechanism used in the laser. In this approach, the role of quantum mechanics in laser science is
essentially reduced to the calculation of the parameter Γ_s for the particular system.
This role becomes more prominent when one needs to describe fluctuations of the laser field.
Here two approaches are possible, following the two options discussed in Chapter 7. If the fluctuations
31 It is straightforward to show that this fact is also true if the field is initially in the Glauber state – which is more
typical for modes in practical lasers.
32 This idea may be traced back at least to an obscure 1939 publication by V. Fabrikant.
33 I can recommend, for example, P. Milonni and J. Eberly, Laser Physics, 2nd ed., Wiley, 2010, and a less
technical text by A. Yariv, Quantum Electronics, 3rd ed., Wiley, 1989.
are relatively small, one can linearize the Heisenberg equations of motion of the field oscillator
operators near their stationary-lasing “values”, with the Langevin “forces” (also time-dependent
operators) describing the fluctuation sources, and use these Heisenberg-Langevin equations to calculate
the radiation fluctuations, just as was described in Sec. 7.5. On the other hand, near the lasing threshold,
the field fluctuations are relatively large, smearing the phase transition between the no-lasing and lasing
states. Here the linearization is not an option, but one can use the density-matrix approach described in
Sec. 7.6, for the fluctuation analysis.34 Note that while the laser fluctuations may look like a peripheral
issue, pioneering research in that field has led to the development of the general theory of open quantum
systems, which was discussed in Chapter 7.
9.4. Cavity QED

Now I have to visit, at least in passing, the field of cavity quantum electrodynamics (usually
called cavity QED for short) – the art and science of creating and using the entanglement between
quantum states of an atomic system (either an atom, or an ion, or a molecule, etc.) and the
electromagnetic field in a macroscopic volume called the resonant cavity (or just “resonator”, or just
“cavity”). This field is very popular nowadays, especially in the context of the quantum computation
and communication research discussed in Sec. 8.5.35
The discussion in the previous section was based on the implicit assumption that the energy
spectrum of the electromagnetic field interacting with an atomic subsystem is essentially continuous, so
that its final state is spread among many field modes, effectively losing its coherence with the quantum
state of the atomic subsystem. This assumption has justified using the quantum-mechanical Golden Rule
for the calculation of the spontaneous and stimulated transition rates. However, the assumption becomes
invalid if the electromagnetic field is contained inside a relatively small volume, with its linear size
comparable with the radiation wavelength. If the walls of such a cavity mostly reflect, rather than
absorb, radiation, then in the 0th approximation the energy dissipation may be disregarded, and the
particular solutions e_j(r) of the Helmholtz equation (5) correspond to discrete, well-separated mode
wave numbers k_j and hence well-separated frequencies ω_j.36 Due to the energy conservation, an atomic
transition corresponding to energy ΔE = |E_ini – E_fin| may be effective only if the corresponding quantum
transition frequency Ω ≡ ΔE/ħ is close to one of these resonance frequencies.37 As a result of such
resonant interaction, the quantum states of the atomic system and the resonant electromagnetic mode
may become entangled.
A very popular approximation for the quantitative description of this effect is the so-called Rabi
model,38 in which the atom is treated as a two-level system interacting with a single electromagnetic
field mode of the resonant cavity. (As was shown in Sec. 6.5, this model is justified, e.g., if transitions
34 This path has been developed (also in the mid-1960s), by several researchers, notably including M. Scully and
W. Lamb – see, e.g., M. Sargent III, M. Scully, and W. Lamb, Jr., Laser Physics, Westview, 1977.
35 This popularity was demonstrated, for example, by the award of the 2012 Nobel Prize in Physics to cavity QED
experimentalists S. Haroche and D. Wineland.
36 The calculation of such modes and corresponding frequencies for several simple cavity geometries was the
subject of EM Sec. 7.8 of this series.
37 On the contrary, if Ω is far from any ω_j, the interaction is suppressed; in particular, the spontaneous emission
rate may be much lower than that given by Eq. (53) – so that this result is not as fundamental as it may look.
38 After the pioneering work by I. Rabi in 1936-37.
between all other energy level pairs have considerably different frequencies.) As the reader knows well
from Chapters 4-6 (see in particular Sec. 5.1), any two-level system may be described, just as a spin-½,
by the Hamiltonian bÎ + c·σ̂. Since we may always select the energy origin so that b = 0, and the state
basis in which c = c n_z, the Hamiltonian of the atomic subsystem may be taken in the diagonal form
$$\hat{H}_a = c\, \hat{\sigma}_z \equiv \frac{\hbar\Omega}{2} \hat{\sigma}_z, \tag{9.65}$$
where ħΩ ≡ 2c = ΔE is the difference between the energy levels in the absence of interaction with the
field. Next, according to Eq. (17), ignoring the constant ground-state energy ħω/2 (which may be always
added to the energy at the end – if necessary), the contribution of a single field mode of frequency ω to
the total Hamiltonian of the system is
$$\hat{H}_f = \hbar\omega\, \hat{a}^\dagger \hat{a}. \tag{9.66}$$
Finally, according to Eq. (16a), the electric field of the mode may be represented as
$$\hat{\mathcal{E}}(\mathbf{r}, t) = \frac{1}{i} \left( \frac{\hbar\omega}{2} \right)^{1/2} \mathbf{e}(\mathbf{r}) \left( \hat{a} - \hat{a}^\dagger \right), \tag{9.67}$$
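so that in the electric-dipole approximation (24), the cavity-atom interaction may be represented as a
product of the field by some (say, y-) Cartesian component39 of the Pauli spin-½ operator:
$$\hat{H}_\text{int} = \text{const} \times \hat{\sigma}_y \times \hat{\mathcal{E}} = \text{const} \times \hat{\sigma}_y \times \left( \frac{\hbar\omega}{2} \right)^{1/2} \frac{1}{i} \left( \hat{a} - \hat{a}^\dagger \right) \equiv i\hbar\kappa\, \hat{\sigma}_y \left( \hat{a} - \hat{a}^\dagger \right), \tag{9.68}$$
where κ is a coupling constant (with the dimension of frequency). The sum of these three terms,
Rabi Hamiltonian:
$$\hat{H} = \hat{H}_a + \hat{H}_f + \hat{H}_\text{int} = \frac{\hbar\Omega}{2} \hat{\sigma}_z + \hbar\omega\, \hat{a}^\dagger \hat{a} + i\hbar\kappa\, \hat{\sigma}_y \left( \hat{a} - \hat{a}^\dagger \right), \tag{9.69}$$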
giving a very reasonable description of the system, is called the Rabi Hamiltonian. Despite its apparent
simplicity, using this Hamiltonian for calculations is not that straightforward.40 Only in the case when
the electromagnetic field is large and hence may be treated classically, the results following from Eq.
(69) are reduced to Eqs. (6.94) describing, in particular, the Rabi oscillations discussed in Sec. 6.3.
The situation becomes simpler in the most important case when the frequencies ω and Ω are very
close, enabling an effective interaction between the cavity field and the atom even if the coupling
constant κ is relatively small. Indeed, if both the κ and the so-called detuning (defined similarly to the
detuning used in Sec. 6.5),
$$\xi \equiv \Omega - \omega, \tag{9.70}$$
39 The exact component is not important for final results, while intermediate formulas simplify if the interaction is
proportional to either pure σ̂_x or pure σ̂_y.
40 For example, an exact quasi-analytical expression for its eigenenergies (as zeros of a Taylor series in the
parameter κ, with coefficients determined by a recurrence relation) was found only recently – see D. Braak, Phys.
Rev. Lett. 107, 100401 (2011).
are much smaller than ω ≈ Ω, the Rabi Hamiltonian may be simplified using the rotating-wave
approximation, already used several times in this course. For this, it is convenient to use the spin ladder
operators, defined quite similarly to those of the orbital angular momentum – see Eqs. (5.153):
$$\hat{\sigma}_\pm \equiv \hat{\sigma}_x \pm i \hat{\sigma}_y, \qquad \text{so that } \hat{\sigma}_y = \frac{\hat{\sigma}_+ - \hat{\sigma}_-}{2i}. \tag{9.71}$$
From Eq. (4.105), it is very easy to find the matrices of these operators in the standard z-basis,
$$\sigma_+ = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad \sigma_- = \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}, \tag{9.72}$$
and their commutation rules – which turn out to be naturally similar to Eqs. (5.154):
$$\left[ \hat{\sigma}_+, \hat{\sigma}_- \right] = 4 \hat{\sigma}_z, \qquad \left[ \hat{\sigma}_z, \hat{\sigma}_\pm \right] = \pm 2 \hat{\sigma}_\pm. \tag{9.73}$$
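The matrices (72) and the commutation rules (73) are quick to confirm numerically. The sketch below simply builds the ladder operators from the standard Pauli matrices per Eq. (71); the helper name `commutator` is an illustrative choice.

```python
import numpy as np

# Standard Pauli matrices in the z-basis
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Spin ladder operators as defined by Eq. (71), without the factor 1/2
sigma_plus = sigma_x + 1j * sigma_y
sigma_minus = sigma_x - 1j * sigma_y

def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return A @ B - B @ A

print("sigma_+ =\n", sigma_plus.real)
print("sigma_- =\n", sigma_minus.real)
print("[sigma_+, sigma_-] == 4 sigma_z:",
      np.allclose(commutator(sigma_plus, sigma_minus), 4 * sigma_z))
print("[sigma_z, sigma_+] == +2 sigma_+:",
      np.allclose(commutator(sigma_z, sigma_plus), 2 * sigma_plus))
print("[sigma_z, sigma_-] == -2 sigma_-:",
      np.allclose(commutator(sigma_z, sigma_minus), -2 * sigma_minus))
```

Note that with this normalization the ladder operators have matrix elements equal to 2 rather than 1, which is why the coupling term of the Hamiltonian carries the factor ħκ/2.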
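In this notation, the Rabi Hamiltonian becomes
$$\hat{H} = \frac{\hbar\Omega}{2} \hat{\sigma}_z + \hbar\omega\, \hat{a}^\dagger \hat{a} + \frac{\hbar\kappa}{2} \left( \hat{\sigma}_+ - \hat{\sigma}_- \right) \left( \hat{a} - \hat{a}^\dagger \right), \tag{9.74}$$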
and it is straightforward to use Eq. (4.199) and (73) to derive the Heisenberg-picture equations of
motion for the involved operators. (Doing this, we have to remember that operators of the “spin”
subsystem, on one hand, and of the field mode, on the other hand, are defined in different Hilbert spaces
and hence commute – at least at coinciding time moments.) The result (so far, exact!) is
i i
aˆ  iaˆ  ˆ   ˆ  , aˆ †  iaˆ †  ˆ   ˆ  ,
2 2 (9.75)
  †
ˆ   iΩˆ   2i  aˆ  aˆ ˆ z ,  
ˆ z  i  aˆ  aˆ ˆ   ˆ  .
 †
   
At negligible coupling,   0, these equations have simple solutions,
aˆ t   e it , aˆ † t   e it , ˆ  t   e iΩt , ˆ z t   const , (9.76)

and the small terms proportional to κ on the right-hand sides of Eqs. (75) cannot affect these time
evolution laws dramatically even if κ is not exactly zero. Of those terms, ones with frequencies close to
the “basic” frequency of each variable would act in resonance and hence may have a substantial impact
on the system’s dynamics, while non-resonant terms may be ignored. In this rotating-wave
approximation, Eqs. (75) are reduced to a much simpler system of equations:
$$\dot{\hat{a}} = -i\omega \hat{a} - \frac{i\kappa}{2} \hat{\sigma}_-, \qquad \dot{\hat{a}}^\dagger = i\omega \hat{a}^\dagger + \frac{i\kappa}{2} \hat{\sigma}_+,$$
$$\dot{\hat{\sigma}}_+ = i\Omega\, \hat{\sigma}_+ - 2i\kappa\, \hat{a}^\dagger \hat{\sigma}_z, \qquad \dot{\hat{\sigma}}_- = -i\Omega\, \hat{\sigma}_- + 2i\kappa\, \hat{a} \hat{\sigma}_z, \qquad \dot{\hat{\sigma}}_z = i\kappa \left( \hat{a}^\dagger \hat{\sigma}_- - \hat{a} \hat{\sigma}_+ \right). \tag{9.77}$$
Alternatively, these equations of motion may be obtained exactly from the Rabi Hamiltonian
(74), if it is preliminarily cleared of the terms proportional to σ̂+â† and σ̂–â, which oscillate fast and hence
self-average to produce virtually zero effect:
Jaynes-Cummings Hamiltonian:
$$\hat{H} = \frac{\hbar\Omega}{2} \hat{\sigma}_z + \hbar\omega\, \hat{a}^\dagger \hat{a} + \frac{\hbar\kappa}{2} \left( \hat{\sigma}_+ \hat{a} + \hat{\sigma}_- \hat{a}^\dagger \right), \qquad \text{at } \kappa, |\xi| \ll \omega, \Omega. \tag{9.78}$$
This is the famous Jaynes-Cummings Hamiltonian,41 which is the basic model used in the cavity QED and
its applications.42 To find its eigenstates and eigenenergies, let us note that at negligible interaction (κ
→ 0), the spectrum of the total energy E of the system, which in this limit is the sum of two independent
contributions from the atomic and cavity-field subsystems,
$$E \big|_{\kappa \to 0} = \pm \frac{\hbar\Omega}{2} + \hbar\omega \left( n - \frac{1}{2} \mp \frac{1}{2} \right) = E_n \pm \frac{\hbar\xi}{2}, \qquad \text{with } n = 1, 2, \ldots, \tag{9.79}$$
consists43 of close level pairs (Fig. 5) centered to values
$$E_n \equiv \hbar\omega \left( n - \frac{1}{2} \right). \tag{9.80}$$
(At the exact resonance ω = Ω, i.e. at ξ = 0, each pair merges into one double-degenerate level En.)
Since at κ → 0 the two subsystems do not interact, the eigenstates corresponding to the sublevels of the
nth pair may be represented by direct products of their independent state vectors:
$$| + \rangle \equiv | \uparrow \rangle \otimes | n - 1 \rangle \quad \text{and} \quad | - \rangle \equiv | \downarrow \rangle \otimes | n \rangle, \tag{9.81}$$
where the first ket of each product represents the state of the two-level (spin-½-like) atomic subsystem,
and the second ket, that of the field oscillator.
Fig. 9.5. The energy spectrum (79) of the Jaynes-Cummings Hamiltonian in the limit κ << |ξ|: the atomic
levels ±ħΩ/2 and the field levels ħωn combine into the total-system level pairs E_n ± ħξ/2, centered at
E_1 = ħω/2, E_2 = 3ħω/2, ..., above the ground level E_g = –ħΩ/2.
Note again that the energy is referred to the ground-state energy ħω/2 of the cavity field.
As we know from Chapter 6, even weak interaction may lead to strong coherent mixing44 of
quantum states with close energies (in this case, the two states (81) within each pair with the same n),
41 It was first proposed and analyzed in 1963 by two engineers, Edwin Jaynes and Fred Cummings, in a Proc.
IEEE publication, and it took the physics community a while to recognize and acknowledge the fundamental
importance of that work.
42 For most applications, the baseline Hamiltonian (78) has to be augmented by additional term(s) describing, for
example, the incoming radiation and/or the system’s coupling to the environment, for example, due to the
electromagnetic energy loss in a finite-Q-factor cavity – see Eq. (7.68).
43 Only the ground state level E_g = –ħΩ/2 is non-degenerate – see Fig. 5.
while their mixing with the states with farther energies is still negligible. Hence, at 0 < κ, |ξ| << ω, Ω, a
good approximation of the eigenstate with E ≈ E_n is given by a linear superposition of the states (81):
$$|\alpha_n\rangle = c_+|\alpha_+\rangle + c_-|\alpha_-\rangle \equiv c_+\,|\!\uparrow\rangle\otimes|n-1\rangle + c_-\,|\!\downarrow\rangle\otimes|n\rangle, \tag{9.82}$$  (Jaynes-Cummings eigenstates)
with certain c-number coefficients c_±. This relation describes the entanglement of the atomic eigenstates
↑ and ↓ with the Fock states number n and n – 1 of the field mode. Let me leave the (straightforward)
calculation of the coefficients (c_±) for each of two entangled states (for each n) for the reader’s exercise.
(The result for the corresponding two eigenenergies (E_n)_± may be again represented by the same
anticrossing diagram as shown in Figs. 2.29 and 5.1, now with the detuning ξ as the argument.) This
calculation shows, in particular, that at ξ = 0 (i.e. at Ω = ω), |c_+| = |c_–| = 1/√2 for both states of the pair.
This fact may be interpreted as a (coherent!) equal sharing of an energy quantum ħω = ħΩ by the atom
and the cavity field at the exact resonance.
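The equal sharing at resonance may be checked numerically. The sketch below assumes (this is not spelled out in the text above) that within each pair (81) the Jaynes-Cummings Hamiltonian reduces, in units of ħ and with the energy counted from the pair's center, to a 2×2 matrix with the diagonal elements ±ξ/2 and the off-diagonal elements κ√n. The function name `jc_pair` and its arguments are illustrative.

```python
import math

def jc_pair(xi, kappa, n):
    """Eigenenergies and coefficients (c_plus, c_minus) of Eq. (82) for one pair.

    Assumed 2x2 block (hbar = 1, energy counted from the pair's center),
    in the basis {|up, n-1>, |down, n>}:
        [[ xi/2,  g   ],
         [ g,    -xi/2]],   with g = kappa * sqrt(n).
    """
    g = kappa * math.sqrt(n)
    half_gap = math.hypot(xi / 2.0, g)        # (xi^2/4 + g^2)^(1/2)
    theta = math.atan2(g, xi / 2.0)           # mixing angle
    upper = (+half_gap, (math.cos(theta / 2), math.sin(theta / 2)))
    lower = (-half_gap, (-math.sin(theta / 2), math.cos(theta / 2)))
    return upper, lower

# At exact resonance both coefficients have the magnitude 1/sqrt(2);
# far from it, each eigenstate is nearly a single component state.
resonant_upper, resonant_lower = jc_pair(xi=0.0, kappa=0.05, n=3)
detuned_upper, detuned_lower = jc_pair(xi=10.0, kappa=0.05, n=1)
```

Under the same assumption, the two eigenenergies of the pair are split by 2(ξ²/4 + κ²n)^(1/2), which is the anticrossing dependence on the detuning mentioned above.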
A (hopefully, self-evident) by-product of the calculation of c_± is the fact that the dynamics of
the state α_n described by Eq. (82) is similar to that of the generic two-level system that was repeatedly
discussed in this course – the first time in Sec. 2.6 and then in Chapters 4-6. In particular, if the
composite system had been initially prepared to be in one component state, for example ↑⊗0 (i.e.
with the atom excited, while the cavity in its ground state), and then allowed to evolve on its own, after
some time interval Δt ~ 1/κ it may be found definitely in the counterpart state ↓⊗1, including the first
excited Fock state n = 1 of the field mode. If the process is allowed to continue, after the equal time
interval Δt, the system returns to the initial state ↑⊗0, etc. This most striking prediction of the Jaynes-
Cummings model was directly observed, by G. Rempe et al., only in 1987, although less directly this
model was repeatedly confirmed by numerous experiments carried out in the 1960s and 1970s.
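A minimal sketch of this periodic energy exchange, under the same assumption as for the eigenstates: within one pair, the Hamiltonian is taken as a 2×2 matrix with the diagonal elements ±ξ/2 and the off-diagonal elements κ√n (in units of ħ). The function name is hypothetical; the formula is the standard two-level (Rabi) result for such a matrix.

```python
import math

def transfer_probability(t, kappa, xi=0.0, n=1):
    """Probability to find the pair in |down, n> at time t, starting from |up, n-1>."""
    g = kappa * math.sqrt(n)
    rabi = math.hypot(g, xi / 2.0)            # half of the level splitting
    if rabi == 0.0:
        return 0.0                            # no coupling, no detuning: nothing evolves
    return (g / rabi) ** 2 * math.sin(rabi * t) ** 2

kappa = 0.2
t_transfer = math.pi / (2 * kappa)            # complete transfer at resonance
p_half_period = transfer_probability(t_transfer, kappa)
p_full_period = transfer_probability(2 * t_transfer, kappa)
```

At resonance and n = 1, the complete transfer takes the time π/2κ, in agreement with the estimate Δt ~ 1/κ; at a non-zero detuning, the maximum transfer probability drops below 1.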
This quantized version of the Rabi oscillations can only persist in time if the inevitable
electromagnetic energy losses (not described by the basic Jaynes-Cummings Hamiltonian) are somehow
compensated – for example, by passing a beam of particles, externally excited into the higher-energy
state ↑, through the cavity. If the losses become higher, the dissipation suppresses quantum coherence, in
our case the coherence between two components of each pair (82), as was discussed in Chapter 7. As a
result, the transition from the higher-energy atomic state ↑ to the lower-energy state ↓, giving energy ħω
to the cavity (n – 1 → n), which is then rapidly drained into the environment, becomes incoherent, so
that the system’s dynamics is reduced to the Purcell effect, already mentioned in Sec. 3. A quantitative
analysis of this effect is left for the reader’s exercise.
The number of interesting physics games one can play with such systems – say by adding
external sources of radiation at a frequency close to ω and Ω, in particular with manipulated time-
dependent amplitude and/or phase – is virtually unlimited.45 Unfortunately, my time/space allowance for
the cavity QED is over, and for further discussion, I have to refer the interested reader to special
literature.46
44 In some fields, especially chemistry, such mixing is frequently called hybridization.

45 Most of them may be described by adding new terms to the basic Jaynes-Cummings Hamiltonian (78).
46 I can recommend, for example, either C. Gerry and P. Knight, Introductory Quantum Optics, Cambridge U.
Press, 2005, or G. Agarwal, Quantum Optics, Cambridge U. Press, 2012.
9.5. The Klein-Gordon and relativistic Schrödinger equations

Now let me switch gears and discuss the basics of relativistic quantum mechanics of particles
with a non-zero rest mass m. In the ultra-relativistic limit pc >> mc² the quantization scheme of such
particles may be essentially the same as for electromagnetic waves, but for the intermediate energy
range, pc ~ mc², a more general approach is necessary. Historically, the first attempts47 to extend the
non-relativistic wave mechanics into the relativistic energy range were based on performing the same
transitions from classical observables to their quantum-mechanical operators as in the non-relativistic
limit:
$$\mathbf{p} \to \hat{\mathbf{p}} = -i\hbar\nabla, \qquad E \to \hat{H} = i\hbar\frac{\partial}{\partial t}. \tag{9.83}$$
The substitution of these operators, acting on the Schrödinger-picture wavefunction Ψ(r,t), into the
classical relation (1) between the energy E and momentum p (for a free particle) leads to the
following formulas:
Table 9.1. Deriving the Klein-Gordon equation for a free relativistic particle.48

                     | Non-relativistic limit | Relativistic case
Classical mechanics  | $E = \frac{1}{2m}p^2$ | $E^2 = c^2p^2 + (mc^2)^2$
Wave mechanics       | $i\hbar\frac{\partial\Psi}{\partial t} = \frac{1}{2m}(-i\hbar\nabla)^2\Psi$ | $\left(i\hbar\frac{\partial}{\partial t}\right)^2\Psi = c^2(-i\hbar\nabla)^2\Psi + (mc^2)^2\Psi$

The resulting equation for the non-relativistic limit, in the left-bottom cell of the table, is just the
usual Schrödinger equation (1.28) for a free particle. Its relativistic generalization, in the right-bottom
cell, usually rewritten as
$$\left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2\right)\Psi + \mu^2\Psi = 0, \quad\text{with } \mu \equiv \frac{mc}{\hbar}, \tag{9.84}$$  (Klein-Gordon equation)
is called the Klein-Gordon (or sometimes “Klein-Gordon-Fock”) equation. The fundamental solutions
of this equation are the same plane, monochromatic waves
$$\Psi(\mathbf{r}, t) \propto \exp\{i(\mathbf{k}\cdot\mathbf{r} - \omega t)\} \tag{9.85}$$
as in the non-relativistic case. Indeed, such waves are eigenstates of the operators (83), with
eigenvalues, respectively,
$$\mathbf{p} = \hbar\mathbf{k}, \quad\text{and}\quad E = \hbar\omega, \tag{9.86}$$
so that their substitution into Eq. (84) immediately returns us to Eq. (1) with the replacements (86):
47 This approach was suggested in 1926-1927, i.e. virtually simultaneously, by (at least) V. Fock, E. Schrödinger,
O. Klein and W. Gordon, J. Kudar, T. de Donder and F.-H. van der Dungen, and L. de Broglie.
48 Note that in the left, non-relativistic column of this table, the energy is referred to the rest energy mc², while in
its right, relativistic column, it is referred to zero – see Eq. (1).

$$E_\pm = \hbar\omega_\pm = \pm\left[(\hbar c k)^2 + (mc^2)^2\right]^{1/2}. \tag{9.87}$$
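As a quick numerical illustration of the two branches (87), the sketch below evaluates them for a given momentum and returns the energy needed to lift the system from the lower branch to the upper one at the same momentum. The electron's rest energy is used only as an illustrative default, and the function names are hypothetical.

```python
import math

MC2_EV = 510_998.95           # electron rest energy mc^2 in eV (illustrative choice)

def kg_branches(pc_ev, mc2_ev=MC2_EV):
    """Both branches E_+ and E_- of Eq. (87), in eV, for a given product pc (in eV)."""
    e_plus = math.hypot(pc_ev, mc2_ev)        # [(pc)^2 + (mc^2)^2]^(1/2)
    return e_plus, -e_plus

def branch_gap(pc_ev, mc2_ev=MC2_EV):
    """Energy difference E_+(p) - E_-(p) between the two branches."""
    e_plus, e_minus = kg_branches(pc_ev, mc2_ev)
    return e_plus - e_minus

gap_at_rest = branch_gap(0.0)                 # the smallest gap, 2mc^2
gap_relativistic = branch_gap(5 * MC2_EV)     # grows with the momentum
```

The smallest gap, reached at p = 0, equals 2mc², i.e. about 1.02 MeV for the constant used here.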
Though one may say that this dispersion relation is just a simple combination of the classical
relation (1) and the same basic quantum-mechanical relations (86) as in the non-relativistic limit, it attracts
our attention to the fact that the energy ħω as a function of the momentum ħk has two branches, with
E_–(p) = –E_+(p) – see Fig. 6a. Historically, this fact has played a very important role in spurring the
fundamental idea of particle-antiparticle pairs. In this idea (very similar to the concept of electrons and
holes in semiconductors, which was discussed in Sec. 2.8), what we call the “vacuum” actually
corresponds to all quantum states of the lower branch, with energies E_–(p) < 0, being completely filled,
while the states on the upper branch, with energies E_+(p) > 0, are empty. Then an externally supplied
energy,
$$\Delta E = E_+ - E_- \equiv E_+ + (-E_-) \geq 2mc^2 > 0, \tag{9.88}$$
may bring the system from the lower branch to the upper one (Fig. 6b). The resulting excited state is
interpreted as a combination of a particle (formally, of the infinite spatial extension) with the energy E_+
and the momentum p, and a “hole” (antiparticle) of the positive energy (–E_–) and the momentum –p.
This idea49 has led to a search for, and discovery of the positron: the electron’s antiparticle with charge q
= +e, in 1932, and later of the antiproton and other antiparticles.
[Figure 9.6: two panels, (a) and (b), plotting ω versus k, with the branches E_+ and E_– and the energy ΔE marked.]
Fig. 9.6. (a) The free-particle dispersion relation resulting from the Klein-Gordon and Dirac
equations, and (b) the scheme of creation of a particle-antiparticle pair from the vacuum.
Free particles of a finite spatial extension may be described, in this approach, just as in the non-
relativistic Schrödinger equation, by wave packets, i.e. linear superpositions of the de Broglie waves
(85) with close wave vectors k, and the corresponding values of ω given by Eq. (87), with the positive
sign for the “usual” particles, and negative sign for antiparticles – see Fig. 6a above. Note that to form,
from a particle’s wave packet, a similar wave packet for the antiparticle, with the same phase and group
velocities (2.33a) in each direction, we need to change the sign not only before ω, but also before k, i.e.
to replace all component wavefunctions (85), and hence the full wavefunction, with their complex
conjugates.
Of more formal properties of Eq. (84), it is easy to prove that its solutions satisfy the same
continuity equation (1.52), with the probability current density j still given by Eq. (1.47), but a different
expression for the probability density w – which becomes very similar to that for j:
$$w = \frac{i\hbar}{2mc^2}\left(\Psi^*\frac{\partial\Psi}{\partial t} - \text{c.c.}\right), \qquad \mathbf{j} = \frac{i\hbar}{2m}\left(\Psi\nabla\Psi^* - \text{c.c.}\right) \tag{9.89}$$
49 Due to the same P. A. M. Dirac!
– very much in the spirit of the relativity theory, treating space and time on equal footing. (In the non-
relativistic limit p/mc → 0, Eq. (84) allows the reduction of this expression for w to the non-relativistic
Eq. (1.22): w → ΨΨ*.)
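For a single plane wave (85), these expressions may be evaluated directly. The sketch below does this in the units ħ = c = 1 (an assumption made for brevity), taking Ψ = 1 at the origin, for a wave propagating along the x-axis; the function name is hypothetical.

```python
import math

def kg_density_and_current(k, m, branch=+1):
    """w and j_x of Eq. (89) for the wave (85) at r = 0, t = 0, in units hbar = c = 1."""
    omega = branch * math.hypot(k, m)         # Eq. (87)
    psi = 1.0 + 0.0j                          # plane-wave value at the origin
    dpsi_dt = -1j * omega * psi               # time derivative of exp{i(kx - omega t)}
    dpsi_dx = 1j * k * psi                    # x-derivative of the same wave
    a = psi.conjugate() * dpsi_dt
    w = (1j / (2 * m)) * (a - a.conjugate())  # (i/2m)(Psi* dPsi/dt - c.c.)
    b = psi * dpsi_dx.conjugate()
    j = (1j / (2 * m)) * (b - b.conjugate())  # (i/2m)(Psi dPsi*/dx - c.c.)
    return w.real, j.real

w_slow, j_slow = kg_density_and_current(k=0.01, m=1.0)
```

For the upper branch this gives w = E/mc² and j = p/m, so that w tends to ΨΨ* = 1 at p/mc → 0, while the ratio j/w = pc²/E coincides with the group velocity dE/dp following from Eq. (87).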
The Klein-Gordon equation may be readily generalized to describe a particle moving in external
fields; for example, the electromagnetic field effects on a particle with charge q may be described by the
same replacement as in the non-relativistic limit (see Sec. 3.1):
$$\hat{\mathbf{p}} \to \hat{\mathbf{P}} - q\mathbf{A}(\mathbf{r}, t), \qquad \hat{H} \to \hat{H} - q\phi(\mathbf{r}, t), \tag{9.90}$$
where P̂ = –iħ∇ is the canonical momentum operator (3.25), and the vector- and scalar potentials, A
and φ, should be treated appropriately – either as c-number functions if the electromagnetic field
quantization is not important for the particular problem, or as operators (see Secs. 1-4 above) if it is.
However, the practical value of the resulting relativistic Schrödinger equation is rather limited,
for two main reasons. First of all, it does not give the correct description of particles with spin. For
example, for the hydrogen-like atom/ion problem, i.e. the motion of an electron with the electric charge
–e, in the Coulomb central field of an immobile nucleus with charge +Ze, the equation may be readily
solved exactly50 and yields the following spectrum of (doubly-degenerate) energy levels:
$$E = mc^2\left(1 + \frac{Z^2\alpha^2}{\lambda^2}\right)^{-1/2}, \quad\text{with } \lambda \equiv n + \left[\left(l + \tfrac{1}{2}\right)^2 - Z^2\alpha^2\right]^{1/2} - \left(l + \tfrac{1}{2}\right), \tag{9.91}$$
where n = 1, 2,… and l = 0, 1,…, n – 1 are the same quantum numbers as in the non-relativistic theory
(see Sec. 3.6), and α ≈ 1/137 is the fine structure constant (6.62). The three leading terms of the Taylor
expansion of this result in the small parameter Zα are as follows:
$$E \approx mc^2\left[1 - \frac{Z^2\alpha^2}{2n^2} - \frac{Z^4\alpha^4}{2n^4}\left(\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right)\right]. \tag{9.92}$$
The first of these terms is just the rest energy of the particle. The second term,
$$E_n = -mc^2\frac{Z^2\alpha^2}{2n^2} \equiv -\frac{mZ^2e^4}{(4\pi\varepsilon_0)^2\hbar^2}\frac{1}{2n^2} \equiv -\frac{E_0}{2n^2}, \quad\text{with } E_0 \equiv Z^2E_\mathrm{H}, \tag{9.93}$$
reproduces the non-relativistic Bohr’s formula (3.201). Finally, the third term,
$$-mc^2\frac{Z^4\alpha^4}{2n^4}\left(\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right) \equiv -\frac{2E_n^2}{mc^2}\left(\frac{n}{l + \frac{1}{2}} - \frac{3}{4}\right), \tag{9.94}$$
is just the perturbative kinetic-relativistic contribution (6.51) to the fine structure of the Bohr levels (93).
However, as we already know from Sec. 6.3, for a spin-½ particle such as the electron, the spin-orbit
interaction (6.55) gives an additional contribution to the fine structure, of the same order, so that the net
result, confirmed by experiment, is given by Eq. (6.60), i.e. is different from Eq. (94). This is very
natural, because the relativistic Schrödinger equation does not have the very notion of spin.
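The quality of the three-term expansion (92) may be checked against the exact result (91) numerically. The sketch below works with the dimensionless ratio E/mc²; the numerical value of α and the function names are illustrative assumptions.

```python
import math

ALPHA = 1 / 137.035999        # fine structure constant (approximate value)

def kg_level(n, l, Z=1, alpha=ALPHA):
    """E/mc^2 from the exact expression (91)."""
    za2 = (Z * alpha) ** 2
    lam = n + math.sqrt((l + 0.5) ** 2 - za2) - (l + 0.5)
    return (1 + za2 / lam ** 2) ** -0.5

def kg_level_expanded(n, l, Z=1, alpha=ALPHA):
    """E/mc^2 from the three leading terms (92) of the expansion in Z*alpha."""
    za2 = (Z * alpha) ** 2
    return 1 - za2 / (2 * n ** 2) - za2 ** 2 / (2 * n ** 4) * (n / (l + 0.5) - 0.75)

# Largest deviation between (91) and (92) over the levels with n <= 3, for Z = 1
worst = max(abs(kg_level(n, l) - kg_level_expanded(n, l))
            for n in range(1, 4) for l in range(n))
```

For Z = 1, the deviation is of the order of (Zα)⁶, i.e. far below the third, fine-structure term, whose dependence on l at fixed n lifts the degeneracy of the Bohr levels.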
Second, even for massive spinless particles (such as the Higgs bosons), for which this equation is
believed to be valid, the most important problems are related to particle interactions at high energies of
50 This task is left for the reader’s exercise.
the order of ΔE ~ 2mc² and beyond – see Eq. (88). Due to the possibility of creation and annihilation of
particle-antiparticle pairs at such energies, the number of particles participating in such interactions is
typically considerable (and variable), and the adequate description of the system is given not by the
relativistic Schrödinger equation (which is formulated in single-particle terms), but by the quantum field
theory – to which I will devote only a few sentences at the very end of this chapter.
9.6. Dirac’s theory

The real breakthrough toward the quantum relativistic theory of electrons (and other spin-½
fermions) was achieved in 1928 by P. A. M. Dirac. For that time, the structure of his theory was highly
nontrivial. Namely, while formally preserving, in the coordinate representation, the same Schrödinger-
picture equation of quantum dynamics as in the non-relativistic quantum mechanics,51
$$i\hbar\frac{\partial\Psi}{\partial t} = \hat{H}\Psi, \tag{9.95}$$
it postulates that the wavefunction Ψ it describes is not a scalar complex function of time and
coordinates, but a four-component column-vector (sometimes called the bispinor) of such functions, its
Hermitian-conjugate bispinor Ψ† being a 4-component row-vector of their complex conjugates:
$$\Psi = \begin{pmatrix} \Psi_1(\mathbf{r}, t) \\ \Psi_2(\mathbf{r}, t) \\ \Psi_3(\mathbf{r}, t) \\ \Psi_4(\mathbf{r}, t) \end{pmatrix}, \qquad \Psi^\dagger = \left(\Psi_1^*(\mathbf{r}, t),\ \Psi_2^*(\mathbf{r}, t),\ \Psi_3^*(\mathbf{r}, t),\ \Psi_4^*(\mathbf{r}, t)\right), \tag{9.96}$$
and that the Hamiltonian participating in Eq. (95) is a 4×4 matrix defined in the Hilbert space of
bispinors Ψ. For a free particle, the postulated Hamiltonian looks amazingly simple: 52
51 After the “naturally-relativistic” form of the Klein-Gordon equation (84), this apparent return to the non-
relativistic Schrödinger equation may look very counter-intuitive. However, it becomes a bit less surprising taking
into account the fact (whose proof is left for the reader’s exercise) that Eq. (84) may be also recast into the form
(95) for a two-component column-vector Ψ (sometimes called spinor), with a Hamiltonian which may be
represented by a 2×2 matrix – and hence expressed via the Pauli matrices (4.105) and the identity matrix I.
52 Moreover, if the time derivative participating in Eq. (95), and the three coordinate derivatives participating (via
the momentum operator) in Eq. (97), are merged into one 4-vector operator ∂/∂x_k ≡ {∇, ∂/∂(ict)}, the Dirac
equation (95) may be rewritten in an even simpler, manifestly Lorentz-invariant 4-vector form (with the implied
summation over the repeated index k = 1, ..., 4 – see, e.g., EM Sec. 9.4):
$$\left(\hat{\gamma}_k\frac{\partial}{\partial x_k} + \mu\right)\Psi = 0, \quad\text{where } \hat{\boldsymbol{\gamma}} \equiv \{\hat{\gamma}_1, \hat{\gamma}_2, \hat{\gamma}_3\} = \begin{pmatrix} \hat{0} & -i\hat{\boldsymbol{\sigma}} \\ i\hat{\boldsymbol{\sigma}} & \hat{0} \end{pmatrix}, \quad \hat{\gamma}_4 = \hat{\beta},$$
where μ ≡ mc/ħ – just as in Eq. (84). Note also that, very counter-intuitively, the Dirac Hamiltonian (97) is linear
in the momentum, while the non-relativistic Hamiltonian of a particle, as well as the relativistic Schrödinger
equation, are quadratic in p. In my humble opinion, the Dirac theory (including the concept of antiparticles it has
inspired) may compete for the title of the most revolutionary theoretical idea in physics of all time, despite such
strong contenders as Newton’s laws, Maxwell’s equations, Gibbs’ statistical distribution, Bohr’s theory of the
hydrogen atom, and Einstein’s general relativity.
$$\hat{H} = c\hat{\boldsymbol{\alpha}}\cdot\hat{\mathbf{p}} + \hat{\beta}mc^2, \tag{9.97}$$  (Free-particle’s Hamiltonian)
where p̂ = –iħ∇ is the same 3D vector operator of momentum as in the non-relativistic case, while the
operators α̂ and β̂ may be represented in the following shorthand 2×2 form:
$$\hat{\boldsymbol{\alpha}} \equiv \begin{pmatrix} \hat{0} & \hat{\boldsymbol{\sigma}} \\ \hat{\boldsymbol{\sigma}} & \hat{0} \end{pmatrix}, \qquad \hat{\beta} \equiv \begin{pmatrix} \hat{I} & \hat{0} \\ \hat{0} & -\hat{I} \end{pmatrix}. \tag{9.98a}$$  (Dirac operators)
The operator α̂, composed of the Pauli vector operators σ̂, is also a vector in the usual 3D
space, with each of its 3 Cartesian components being a 4×4 matrix. The particular form of the 2×2
matrices corresponding to the operators σ̂ and Î in Eq. (98a) depends on the basis selected for the spin
state representation; for example, in the standard z-basis, in which the Cartesian components of σ̂ are
represented by the Pauli matrices (4.105), the 4×4 matrix form of Eq. (98a) is
$$\alpha_x = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad \alpha_y = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad \alpha_z = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{9.98b}$$
It is straightforward to use Eqs. (98) to verify that the matrices α_x, α_y, α_z and β satisfy the following
relations:
$$\alpha_x^2 = \alpha_y^2 = \alpha_z^2 = \beta^2 = I, \tag{9.99}$$
$$\alpha_x\alpha_y + \alpha_y\alpha_x = \alpha_y\alpha_z + \alpha_z\alpha_y = \alpha_z\alpha_x + \alpha_x\alpha_z = \alpha_x\beta + \beta\alpha_x = \alpha_y\beta + \beta\alpha_y = \alpha_z\beta + \beta\alpha_z = 0, \tag{9.100}$$
i.e. anticommute.
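This verification is easily delegated to a computer. The following minimal sketch stores the z-basis matrices (98b) as nested lists and checks Eqs. (99) and (100) by direct multiplication; the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """The sum ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(len(a))] for i in range(len(a))]

# Matrices (98b) in the standard z-basis
ALPHA_X = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
ALPHA_Y = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
ALPHA_Z = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

def check_dirac_algebra():
    """True if all four matrices square to I (99) and mutually anticommute (100)."""
    matrices = [ALPHA_X, ALPHA_Y, ALPHA_Z, BETA]
    for i, a in enumerate(matrices):
        if matmul(a, a) != IDENTITY:
            return False
        for b in matrices[i + 1:]:
            if anticommutator(a, b) != ZERO:
                return False
    return True

algebra_ok = check_dirac_algebra()
```

Since all matrix elements are 0, ±1, or ±i, the arithmetic is exact, so the matrices may be compared with the exact equality.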
Using these relations, and acting essentially as in Sec. 1.4, it is straightforward to
show that any solution to the Dirac equation obeys the probability conservation law, i.e. the continuity
equation (1.52), with the probability density:
$$w = \Psi^\dagger\Psi, \tag{9.101}$$
and the probability current,
$$\mathbf{j} = \Psi^\dagger c\hat{\boldsymbol{\alpha}}\Psi, \tag{9.102}$$
looking almost as in the non-relativistic wave mechanics – cf. Eqs. (1.22) and (1.47). Note, however, the
Hermitian conjugation used in these formulas instead of the complex conjugation, to form the scalars w,
j_x, j_y, and j_z from the 4-component state vectors (96).
This close similarity is extended to the fundamental, plane-wave solutions of the Dirac equation
in free space. Indeed, plugging such a solution, in the form
$$\Psi = u\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \equiv \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{9.103}$$
into Eqs. (95) and (97), we see that they are indeed satisfied, provided that a system of four coupled,
linear algebraic equations for four complex c-number amplitudes u_{1,2,3,4} is satisfied. The condition of its
consistency yields the same dispersion relation (87), i.e. the same two-branch diagram shown in Fig. 6,
as follows from the Klein-Gordon equation. The difference is that plugging each value of ω, given by
Eq. (87), back into the system of the linear equations for four amplitudes u, we get two solutions for
their vector u ≡ (u_1, u_2, u_3, u_4) for each of the two energy branches – see Fig. 6 again. In the standard z-
basis of spin operators, they may be represented as follows:
$$\text{for } E = E_+ > 0: \quad u_{+\uparrow} = c_{+\uparrow}\begin{pmatrix} 1 \\ 0 \\ \dfrac{cp_z}{E_+ + mc^2} \\ \dfrac{cp_+}{E_+ + mc^2} \end{pmatrix}, \quad u_{+\downarrow} = c_{+\downarrow}\begin{pmatrix} 0 \\ 1 \\ \dfrac{cp_-}{E_+ + mc^2} \\ \dfrac{-cp_z}{E_+ + mc^2} \end{pmatrix}, \tag{9.104a}$$
$$\text{for } E = E_- < 0: \quad u_{-\uparrow} = c_{-\uparrow}\begin{pmatrix} \dfrac{cp_z}{E_- - mc^2} \\ \dfrac{cp_+}{E_- - mc^2} \\ 1 \\ 0 \end{pmatrix}, \quad u_{-\downarrow} = c_{-\downarrow}\begin{pmatrix} \dfrac{cp_-}{E_- - mc^2} \\ \dfrac{-cp_z}{E_- - mc^2} \\ 0 \\ 1 \end{pmatrix}, \tag{9.104b}$$
where p_± ≡ p_x ± ip_y, and c_± are normalization coefficients.
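The following sketch checks, for an arbitrarily chosen momentum, that the four columns (104) are indeed eigenvectors of the free-particle Hamiltonian (97) with the energies (87). It uses the z-basis form (98b) of the Dirac matrices, sets c = 1 by default, and takes all normalization coefficients equal to 1; the function names are illustrative.

```python
import math

def dirac_hamiltonian(px, py, pz, m, c=1.0):
    """4x4 matrix of Eq. (97) acting on the amplitudes u of the plane wave (103)."""
    p_plus, p_minus = complex(px, py), complex(px, -py)
    mc2 = m * c * c
    return [[mc2, 0, c * pz, c * p_minus],
            [0, mc2, c * p_plus, -c * pz],
            [c * pz, c * p_minus, -mc2, 0],
            [c * p_plus, -c * pz, 0, -mc2]]

def bispinors(px, py, pz, m, c=1.0):
    """Unnormalized columns (104), returned as {label: (E, u)}."""
    p_plus, p_minus = complex(px, py), complex(px, -py)
    mc2 = m * c * c
    e_plus = math.sqrt(c * c * (px * px + py * py + pz * pz) + mc2 * mc2)
    e_minus = -e_plus
    dp, dm = e_plus + mc2, e_minus - mc2      # denominators in (104a) and (104b)
    return {
        "+up": (e_plus, [1, 0, c * pz / dp, c * p_plus / dp]),
        "+down": (e_plus, [0, 1, c * p_minus / dp, -c * pz / dp]),
        "-up": (e_minus, [c * pz / dm, c * p_plus / dm, 1, 0]),
        "-down": (e_minus, [c * p_minus / dm, -c * pz / dm, 0, 1]),
    }

def residual(h, energy, u):
    """Largest component of |H u - E u|."""
    return max(abs(sum(h[i][k] * u[k] for k in range(4)) - energy * u[i])
               for i in range(4))

h = dirac_hamiltonian(0.3, -0.4, 1.2, m=1.0)
residuals = {label: residual(h, energy, u)
             for label, (energy, u) in bispinors(0.3, -0.4, 1.2, m=1.0).items()}
```

All four residuals vanish within the rounding errors, confirming that each energy branch is doubly degenerate.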
The simplest interpretation of these solutions is that Eq. (103), with the vectors u_+ given by Eq.
(104a), represents a spin-½ particle (say, an electron), while with the vectors u_– given by Eq. (104b), it
represents an antiparticle (a positron), and the two solutions for each particle, indexed with opposite
arrows, correspond to two possible directions of the spin-½, σ_z = ±1, i.e. S_z = ±ħ/2. This interpretation
is indeed solid in the non-relativistic limit, when two last components of the vector (104a), and two first
components of the vector (104b) are negligibly small:
$$u_{+\uparrow} \to \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad u_{+\downarrow} \to \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad u_{-\uparrow} \to \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad u_{-\downarrow} \to \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \quad\text{for } \frac{p_{x,y,z}}{mc} \to 0. \tag{9.105}$$
However, at arbitrary energies, the physical picture is more complex. To show this, let us use the
Dirac equation to calculate the Heisenberg-picture law of time evolution of the operator of some
Cartesian component of the orbital angular momentum L ≡ r×p, for example of L_x = yp_z – zp_y, taking
into account that the Dirac operators (98a) commute with those of r and p, and also the Heisenberg
commutation relations (2.14):
$$i\hbar\frac{\partial\hat{L}_x}{\partial t} = \left[\hat{L}_x, \hat{H}\right] = c\hat{\boldsymbol{\alpha}}\cdot\left[\left(\hat{y}\hat{p}_z - \hat{z}\hat{p}_y\right), \hat{\mathbf{p}}\right] = -i\hbar c\left(\hat{\alpha}_z\hat{p}_y - \hat{\alpha}_y\hat{p}_z\right), \tag{9.106}$$
with similar relations for two other Cartesian components. Since the right-hand side of these equations is
different from zero, the orbital angular momentum is generally not conserved – even for a free particle! Let us,
however, consider the following vector operator,
$$\hat{\mathbf{S}} \equiv \frac{\hbar}{2}\begin{pmatrix} \hat{\boldsymbol{\sigma}} & \hat{0} \\ \hat{0} & \hat{\boldsymbol{\sigma}} \end{pmatrix}. \tag{9.107a}$$  (Spin operator in Dirac’s theory)
According to Eqs. (4.105), its Cartesian components, in the z-basis, are represented by 4×4 matrices
$$S_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad S_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad S_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{9.107b}$$
Let us calculate the Heisenberg-picture law of time evolution of these components, for example
$$i\hbar\frac{\partial\hat{S}_x}{\partial t} = \left[\hat{S}_x, \hat{H}\right] = c\left[\hat{S}_x, \left(\hat{\alpha}_x\hat{p}_x + \hat{\alpha}_y\hat{p}_y + \hat{\alpha}_z\hat{p}_z\right)\right]. \tag{9.108}$$
A direct calculation of the commutators of the matrices (98) and (107) yields
$$\left[\hat{S}_x, \hat{\alpha}_x\right] = 0, \qquad \left[\hat{S}_x, \hat{\alpha}_y\right] = i\hbar\hat{\alpha}_z, \qquad \left[\hat{S}_x, \hat{\alpha}_z\right] = -i\hbar\hat{\alpha}_y, \tag{9.109}$$
so that we finally get
$$i\hbar\frac{\partial\hat{S}_x}{\partial t} = i\hbar c\left(\hat{\alpha}_z\hat{p}_y - \hat{\alpha}_y\hat{p}_z\right), \tag{9.110}$$
with similar expressions for the other two components of the operator. Comparing this result with Eq.
(106), we see that any Cartesian component of the operator defined similarly to Eq. (5.170),
$$\hat{\mathbf{J}} \equiv \hat{\mathbf{L}} + \hat{\mathbf{S}}, \tag{9.111}$$
is an integral of motion,53 so that this operator may be interpreted as the one representing the total
angular momentum of the particle. Hence, the operator (107) may be interpreted as the spin operator of a
spin-½ particle (e.g., electron). As it follows from the last of Eq. (107b), in the non-relativistic limit the
columns (105) represent the eigenkets of the z-component of that operator, with eigenvalues S_z = ±ħ/2,
with the sign corresponding to the arrow index. So, the Dirac theory provides a justification for spin-
½ – or, somewhat more humbly, replaces the Pauli Hamiltonian postulate (4.163) with that of a simpler
(and hence more plausible), Lorentz-invariant Hamiltonian (97).
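The matrix part of this argument, i.e. the commutators (109), may be verified by a direct multiplication. The sketch below assembles the 4×4 matrices from the Pauli matrices according to Eqs. (98a) and (107a), in units with ħ = 1; the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """The difference ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def blocks(upper_left, upper_right, lower_left, lower_right):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [upper_left[r] + upper_right[r] for r in range(2)]
    bottom = [lower_left[r] + lower_right[r] for r in range(2)]
    return top + bottom

def scaled(factor, a):
    return [[factor * x for x in row] for row in a]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# Pauli matrices (x, y, z) in the z-basis
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ZERO2 = [[0, 0], [0, 0]]
ALPHA = [blocks(ZERO2, s, s, ZERO2) for s in SIGMA]               # Eq. (98a)
SPIN = [scaled(0.5, blocks(s, ZERO2, ZERO2, s)) for s in SIGMA]   # Eq. (107a), hbar = 1
BETA = blocks([[1, 0], [0, 1]], ZERO2, ZERO2, [[-1, 0], [0, -1]])

# Deviations from Eq. (109), and from the vanishing commutator of S_x with beta
errors = [
    max_abs_diff(commutator(SPIN[0], ALPHA[0]), scaled(0, ALPHA[0])),
    max_abs_diff(commutator(SPIN[0], ALPHA[1]), scaled(1j, ALPHA[2])),
    max_abs_diff(commutator(SPIN[0], ALPHA[2]), scaled(-1j, ALPHA[1])),
    max_abs_diff(commutator(SPIN[0], BETA), scaled(0, BETA)),
]
```

The last entry confirms that the rest-energy term of the Hamiltonian (97) does not contribute to Eq. (108).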
Note, however, that this simple interpretation, fully separating a particle from its antiparticle, is
not valid for the exact solutions (103)-(104), so that generally the eigenstates of the Dirac Hamiltonian
are certain linear (coherent) superpositions of the components describing the particle and its antiparticle
– each with both directions of spin. This fact leads to several interesting effects, including the so-called
Klein paradox at the reflection of a relativistic electron from a potential barrier.54
53 It is straightforward to show that this result remains valid for a particle in any central field U(r).
54 See, e.g., A. Calogeracos and N. Dombey, Contemp. Phys. 40, 313 (1999).
9.7. Low-energy limit

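The generalization of Dirac’s theory to the case of a (spin-½) particle with an electric charge q,
moving in a classically-described electromagnetic field, may be obtained using the same replacement
(90). As a result, Eq. (95) turns into
$$\left[c\hat{\boldsymbol{\alpha}}\cdot\left(-i\hbar\nabla - q\mathbf{A}\right) + mc^2\hat{\beta} + \left(q\phi - \hat{H}\right)\right]\Psi = 0, \tag{9.112}$$  (Dirac equation in EM field)
where the Hamiltonian operator Ĥ is understood in the sense of Eq. (95), i.e. as the partial time
derivative with the multiplier iħ. Let us prepare this equation for a low-energy approximation by acting
on its left-hand side by a similar square bracket but with the opposite sign before the last parentheses –
also an operator! Using Eqs. (99) and (100), and the fact that the space- and time-independent
operators α̂ and β̂ commute with the spin-independent, c-number functions A(r, t) and φ(r, t), as well
as with the Hamiltonian operator iħ∂/∂t, the result is
$$\left\{c^2\left[\hat{\boldsymbol{\alpha}}\cdot\left(-i\hbar\nabla - q\mathbf{A}\right)\right]^2 + \left(mc^2\right)^2 + c\left[\hat{\boldsymbol{\alpha}}\cdot\left(-i\hbar\nabla - q\mathbf{A}\right), \left(q\phi - \hat{H}\right)\right] - \left(q\phi - \hat{H}\right)^2\right\}\Psi = 0. \tag{9.113}$$
A direct calculation of the first square bracket, using Eqs. (98) and (107), yields
$$\left[\hat{\boldsymbol{\alpha}}\cdot\left(-i\hbar\nabla - q\mathbf{A}\right)\right]^2 = \left(-i\hbar\nabla - q\mathbf{A}\right)^2 - 2q\hat{\mathbf{S}}\cdot\left(\nabla\times\mathbf{A}\right). \tag{9.114}$$
But the last vector product on the right-hand side is just the magnetic field – see, e.g., Eqs. (3.21):
$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{9.115}$$
Similarly, we may use the first of Eqs. (3.21), for the electric field,
$$\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}, \tag{9.116}$$
to simplify the commutator participating in Eq. (113):
$$\left[\hat{\boldsymbol{\alpha}}\cdot\left(-i\hbar\nabla - q\mathbf{A}\right), \left(q\phi - \hat{H}\right)\right] \equiv -q\hat{\boldsymbol{\alpha}}\cdot\left[\hat{H}, \mathbf{A}\right] - i\hbar q\hat{\boldsymbol{\alpha}}\cdot\left[\nabla, \phi\right] = -i\hbar q\hat{\boldsymbol{\alpha}}\cdot\frac{\partial\mathbf{A}}{\partial t} - i\hbar q\hat{\boldsymbol{\alpha}}\cdot\nabla\phi \equiv i\hbar q\hat{\boldsymbol{\alpha}}\cdot\mathbf{E}. \tag{9.117}$$
As a result, Eq. (113) becomes
$$\left\{c^2\left(-i\hbar\nabla - q\mathbf{A}\right)^2 - \left(q\phi - \hat{H}\right)^2 + \left(mc^2\right)^2 - 2qc^2\hat{\mathbf{S}}\cdot\mathbf{B} + i\hbar cq\hat{\boldsymbol{\alpha}}\cdot\mathbf{E}\right\}\Psi = 0. \tag{9.118}$$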
So far, this is an exact result, equivalent to Eq. (112), but it is more convenient for an analysis of
the low-energy limit, in which not only the energy offset E – mc² (which is just the energy used in the
non-relativistic mechanics), but also the electrostatic energy of the particle, qφ, are much smaller than
the rest energy mc². In this limit, the second and third terms of Eq. (118) almost cancel, and introducing
the offset Hamiltonian
$$\hat{\tilde{H}} \equiv \hat{H} - mc^2\hat{I}, \tag{9.119}$$
we may approximate their difference, up to the first non-zero term, as
$$\left(q\phi\hat{I} - \hat{H}\right)^2 - \left(mc^2\right)^2\hat{I} \equiv \left(q\phi\hat{I} - mc^2\hat{I} - \hat{\tilde{H}}\right)^2 - \left(mc^2\right)^2\hat{I} \approx 2mc^2\left(\hat{\tilde{H}} - q\phi\hat{I}\right). \tag{9.120}$$
As a result, after the division of all terms by 2mc², Eq. (118) may be approximated as
$$\hat{\tilde{H}}\Psi = \left[\frac{1}{2m}\left(-i\hbar\nabla - q\mathbf{A}\right)^2 + q\phi - \frac{q}{m}\hat{\mathbf{S}}\cdot\mathbf{B} + \frac{i\hbar q}{2mc}\hat{\boldsymbol{\alpha}}\cdot\mathbf{E}\right]\Psi. \tag{9.121}$$  (Low-energy Hamiltonian)
Let us discuss this important result. The first two terms in the square brackets give the non-
relativistic Hamiltonian (3.26), which was extensively used in Chapter 3 for the discussion of charged
particle motion. Note again that the contribution of the vector potential A to that Hamiltonian is
essentially relativistic, in the following sense: when used for the description of magnetic interaction of
two charged particles, due to their orbital motion with speed v << c, the magnetic interaction is a factor
of (v/c)² smaller than the electrostatic interaction of the particles.55 The reason why we did discuss the
effects of A in Chapter 3 was that it was used there to describe external magnetic fields, keeping our
analysis valid even for the cases when that field is strong because of being produced by relativistic
effects – such as aligned spins of a permanent magnet.
The next, third term in the square brackets of Eq. (121) should be also familiar to the reader: this
is the Pauli Hamiltonian – see Eqs. (4.3), (4.5), and (4.163). When justifying this form of interaction in
Chapter 4, I referred mostly to the results of Stern-Gerlach-type experiments, but it is extremely
pleasing that this result56 follows from such a fundamental relativistic treatment as Dirac's theory. As we
already know from the discussion of the Zeeman effect in Sec. 6.4, the magnetic field effects on the
orbital motion of an electron (described by the orbital angular momentum L) and its spin S are of the
same order, though quantitatively different.
Finally, the last term in the square brackets of Eq. (121) is also not quite new for us: in
particular, it describes the spin-orbit interaction. Indeed, in the case of a classical, spherically-symmetric
electric field E corresponding to the potential φ(r) = U(r)/q, this term may be reduced to Eq. (6.56):
$$\hat{H}_\mathrm{so} = \frac{1}{2m^2c^2}\hat{\mathbf{S}}\cdot\hat{\mathbf{L}}\,\frac{1}{r}\frac{dU}{dr} \equiv -\frac{q}{2m^2c^2}\hat{\mathbf{S}}\cdot\hat{\mathbf{L}}\,\frac{1}{r}E. \tag{9.122}$$  (Spin-orbit coupling)
The proof of this correspondence requires a bit of additional work.57 Indeed, in Eq. (121), the term
responsible for the spin-orbit interaction acts on 4-component wavefunctions, while the Hamiltonian
(122) is supposed to act on non-relativistic state vectors with an account of spin, whose coordinate
representation may be given by 2-component spinors:58
55 This difference may be traced by classical means – see, e.g., EM Sec. 5.1.
56 Note that in this result, the g-factor of the particle is still equal to exactly 2 – see Eq. (4.115) and its discussion
in Sec. 4.4. In order to describe the small deviation of g_e from 2, the electromagnetic field should be quantized
(just as this was discussed in Secs. 1-4 of this chapter), and its potentials A and φ, participating in Eq. (121),
should be treated as operators – rather than as c-number functions as was assumed above.
57 The only facts immediately evident from Eq. (121) are that the term we are discussing is proportional to the
electric field, as required by Eq. (122), and that it is of the proper order of magnitude. Indeed, Eqs. (101)-(102)
imply that in the Dirac theory, cα̂ plays the role of the velocity operator, so that the expectation values of the term
are of the order of ħqvE/2mc². Since the expectation values of the operators participating in the Hamiltonian (122)
scale as S ~ ħ/2 and L ~ mvr, the spin-orbit interaction energy has the same order of magnitude.
58 In this course, the notion of spinor (popular in some textbooks) was not used much; it was introduced earlier
only for two-particle states – see Eq. (8.13). For a single particle, such definition is reduced to ψ(r)|s⟩, whose
representation in a particular spin-½ basis is the column (123). Note that such spinors may be used as a basis for
an expansion of the spin-orbitals ψ_j(r) defined by Eq. (8.125), where the index j is used for numbering both the
spin’s orientation (i.e. the particular component of the spinor's column) and the orbital eigenfunction.
$$\psi = \begin{pmatrix} \psi_\uparrow \\ \psi_\downarrow \end{pmatrix}. \tag{9.123}$$
The simplest way to prove the equivalence of these two expressions is not to use Eq. (121)
directly, but to return to the Dirac equation (112), for the particular case of motion in a static electric
field but no magnetic field, when Dirac’s Hamiltonian is reduced to
$$\hat{H} = c\hat{\boldsymbol{\alpha}}\cdot\hat{\mathbf{p}} + \hat{\beta}mc^2 + U(\mathbf{r}), \quad\text{with } U = q\phi. \tag{9.124}$$
Since this Hamiltonian is time-independent, we may look for its 4-component eigenfunctions in the form
$$\Psi(\mathbf{r}, t) = \begin{pmatrix} \psi_+(\mathbf{r}) \\ \psi_-(\mathbf{r}) \end{pmatrix}\exp\left\{-i\frac{E}{\hbar}t\right\}, \tag{9.125}$$
where each of ψ_± is a 2-component column of the type (123), representing two spin states of the particle
(index +) and its antiparticle (index –). Plugging Eq. (125) into Eq. (95) with the Hamiltonian (124),
and using Eq. (98a), we get the following system of two linear equations:
$$\left[E - mc^2 - U(\mathbf{r})\right]\psi_+ - c\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\psi_- = 0, \qquad \left[E + mc^2 - U(\mathbf{r})\right]\psi_- - c\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\psi_+ = 0. \tag{9.126}$$
Expressing ψ_– from the latter equation, and plugging the result into the former one, we get the following
single equation for the particle’s spinor:
$$\left[E - mc^2 - U(\mathbf{r}) - c^2\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\frac{1}{E + mc^2 - U(\mathbf{r})}\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right]\psi_+ = 0. \tag{9.127}$$
So far, this is an exact equation for eigenstates and eigenvalues of the Hamiltonian (124), but it
may be substantially simplified in the low-energy limit when both the potential energy59 and the non-
relativistic eigenenergy
$$\tilde{E} \equiv E - mc^2 \tag{9.128}$$
are much lower than mc². Indeed, in this case, the expression in the denominator of the last term in the
brackets of Eq. (127) is close to 2mc². Since (σ̂·p̂)² = p̂², with that replacement, Eq. (127) is reduced to the
non-relativistic Schrödinger equation, similar for both spin components of ψ_+, and hence giving spin-
degenerate energy levels. To recover small relativistic and spin-orbit effects, we need a slightly more
accurate approximation:
$$\frac{1}{E + mc^2 - U(\mathbf{r})} \equiv \frac{1}{2mc^2 + \tilde{E} - U(\mathbf{r})} \equiv \frac{1}{2mc^2}\left[1 + \frac{\tilde{E} - U(\mathbf{r})}{2mc^2}\right]^{-1} \approx \frac{1}{2mc^2}\left[1 - \frac{\tilde{E} - U(\mathbf{r})}{2mc^2}\right], \tag{9.129}$$
in which Eq. (127) is reduced to
$$\left[\tilde{E} - U(\mathbf{r}) - \frac{\hat{p}^2}{2m} + \hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\frac{\tilde{E} - U(\mathbf{r})}{(2mc)^2}\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right]\psi_+ = 0. \tag{9.130}$$
As Eq. (5.34) shows, the operators of the momentum and of a function of coordinates commute as
$$\left[\hat{\mathbf{p}}, U(\mathbf{r})\right] = -i\hbar\nabla U, \tag{9.131}$$
59 Strictly speaking, this requirement is imposed on the expectation values of U(r) in the eigenstates to be found.
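so that the last term in the square brackets of Eq. (130) may be rewritten as
$$\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\,\frac{\tilde{E} - U(\mathbf{r})}{(2mc)^2}\,\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}} \equiv \frac{\tilde{E} - U(\mathbf{r})}{(2mc)^2}\,\hat{p}^2 + \frac{i\hbar}{(2mc)^2}\left(\hat{\boldsymbol{\sigma}}\cdot\nabla U\right)\left(\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right). \tag{9.132}$$
Since in the low-energy limit, both terms on the right-hand side of this relation are much smaller
than the three leading terms of Eq. (130), we may replace the first term’s numerator with its non-
relativistic approximation p̂²/2m. With this replacement, the term coincides with the first relativistic
correction to the kinetic energy operator – see Eq. (6.47). The second term, proportional to the electric
field E = –∇φ = –∇U/q, may be transformed further on, using a readily verifiable identity
$$\left(\hat{\boldsymbol{\sigma}}\cdot\nabla U\right)\left(\hat{\boldsymbol{\sigma}}\cdot\hat{\mathbf{p}}\right) \equiv \left(\nabla U\right)\cdot\hat{\mathbf{p}} + i\hat{\boldsymbol{\sigma}}\cdot\left[\left(\nabla U\right)\times\hat{\mathbf{p}}\right]. \tag{9.133}$$
Of the two terms on the right-hand side of this relation, only the second one depends on spin,60 giving
the following spin-orbital interaction contribution to the Hamiltonian,
$$\hat{H}_\mathrm{so} = \frac{\hbar}{(2mc)^2}\hat{\boldsymbol{\sigma}}\cdot\left[\left(\nabla U\right)\times\hat{\mathbf{p}}\right] \equiv \frac{q}{2m^2c^2}\hat{\mathbf{S}}\cdot\left[\left(\nabla\phi\right)\times\hat{\mathbf{p}}\right]. \tag{9.134}$$
For a central potential φ(r), its gradient has only the radial component: ∇φ = (dφ/dr)r/r = –Er/r, and with
the angular momentum definition (5.147), Eq. (134) is (finally!) reduced to Eq. (122).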
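The algebraic core of the identity (133) is the Pauli-matrix relation (σ·a)(σ·b) = a·b + iσ·(a×b). The sketch below checks it numerically for two ordinary (commuting, c-number) vectors a and b, which is an assumption made for illustration: in Eq. (133) itself the second vector is the operator p̂, and the identity holds as written because ∇U stands to the left of p̂ in each term. The function names are illustrative.

```python
def sigma_dot(v):
    """2x2 matrix of the scalar product sigma . v in the z-basis."""
    x, y, z = v
    return [[z, complex(x, -y)], [complex(x, y), -z]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def identity_residual(a, b):
    """Largest element of |(sigma.a)(sigma.b) - [a.b + i sigma.(a x b)]|."""
    lhs = matmul2(sigma_dot(a), sigma_dot(b))
    s = sigma_dot(cross(a, b))
    d = dot(a, b)
    rhs = [[d + 1j * s[0][0], 1j * s[0][1]],
           [1j * s[1][0], d + 1j * s[1][1]]]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

check = identity_residual((1.0, 2.0, 3.0), (-0.5, 0.7, 2.0))
```

For b = a the vector product vanishes, and the same relation reproduces the equality (σ·a)² = a² used above for the reduction of Eq. (127).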
As was shown in Sec. 6.3, the perturbative treatment of Eq. (122), together with the kinetic-
relativistic correction (6.47), in the hydrogen-like atom/ion problem, leads to the fine structure of each
Bohr level En, given by Eq. (6.60):
2E  4n 
ΔE fine   n2  3  . (9.135)
mc  j  ½ 
This result receives a confirmation from the surprising fact that for the hydrogen-like atom/ion problem,
the Dirac equation may be solved exactly – without any assumptions. I do not have the time/space to
reproduce the solution,61 and will only list the final result for the energy spectrum:
H-like atom, eigenenergies:
$$\frac{E}{mc^2} = \left\{1 + \frac{Z^2\alpha^2}{\left[n + \left((j + \tfrac{1}{2})^2 - Z^2\alpha^2\right)^{1/2} - (j + \tfrac{1}{2})\right]^2}\right\}^{-1/2}. \tag{9.136}$$
Here n = 1, 2, … is the same principal quantum number as in Bohr’s theory, while j is the quantum
number specifying the eigenvalues (5.175) of J², in our case of a spin-½ particle taking half-integer
values: j = l ± ½ = 1/2, 3/2, 5/2, … – see Eq. (5.189). This is natural, because due to the spin-orbit
interaction, the orbital momentum and spin are not conserved, while their vector sum, J = L + S, is – at
least in the absence of an external field. Each energy level (136) is doubly-degenerate, with two
eigenstates representing two directions of the spin. (In the low-energy limit, we may say: corresponding
to two values of l = j ∓ ½, at fixed j.)
60The first term gives a small spin-independent energy shift, which is very difficult to verify experimentally.
61Good descriptions of the solution are available in many textbooks (the older the better :-) – see, e.g., Sec. 53 in
L. Schiff, Quantum Mechanics, 3rd ed., McGraw-Hill (1968).
Speaking of that limit (when E – mc² ~ E_H << mc²): since according to Eq. (1.13) for E_H, the
square of the fine-structure constant α ≡ e²/4πε₀ħc may be represented as the ratio E_H/mc², we may
follow this limit by expanding Eq. (136) into the Taylor series in (Zα)² << 1. The result,
$$E \approx mc^2\left[1 - \frac{Z^2\alpha^2}{2n^2} - \frac{Z^4\alpha^4}{2n^4}\left(\frac{n}{j + \tfrac{1}{2}} - \frac{3}{4}\right)\right], \tag{9.137}$$
has the same structure, and allows the same interpretation as Eq. (92), but with the last term coinciding
with Eq. (6.60) – and with experimental results. Historically, this correct description of the fine structure
of the atomic levels provided the decisive proof of Dirac’s theory.
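To get a feeling for the numbers, the exact spectrum (136) may be compared with its expansion (137) directly. The sketch below does this for an electron; the constant values are the standard CODATA ones, and the function names are illustrative assumptions rather than anything used in these notes.

```python
import math

ALPHA = 7.2973525693e-3   # fine-structure constant (CODATA value)
MC2_EV = 510998.95        # electron rest energy m c^2, in eV


def dirac_energy(n, j, Z=1):
    """Exact result, Eq. (136): E / mc^2 for quantum numbers n and j."""
    za2 = (Z * ALPHA) ** 2
    k = j + 0.5
    denom = n + math.sqrt(k * k - za2) - k
    return (1.0 + za2 / denom**2) ** -0.5


def dirac_energy_expanded(n, j, Z=1):
    """Taylor expansion in (Z alpha)^2, Eq. (137): E / mc^2."""
    za2 = (Z * ALPHA) ** 2
    return 1.0 - za2 / (2 * n**2) - za2**2 / (2 * n**4) * (n / (j + 0.5) - 0.75)


def fine_splitting_ev(n, j_hi, j_lo, Z=1):
    """Energy difference (in eV) between two j-sublevels of the same n."""
    return (dirac_energy(n, j_hi, Z) - dirac_energy(n, j_lo, Z)) * MC2_EV


# Hydrogen, n = 2: splitting between j = 3/2 and j = 1/2
splitting_n2 = fine_splitting_ev(2, 1.5, 0.5)
# Residual of the expansion for the ground state, of the order of alpha^6
expansion_error = abs(dirac_energy(1, 0.5) - dirac_energy_expanded(1, 0.5))
```

For hydrogen (Z = 1), the n = 2 splitting comes out at roughly 4.5×10⁻⁵ eV, i.e. about 11 GHz in frequency units, while the difference between Eqs. (136) and (137) is of the order of (Zα)⁶, as it should be for the first omitted term of the series.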
However, even such an impressive theory does not have too many direct applications. The main
reason for that was already discussed in brief at the end of Sec. 5: due to the possibility of creation and
annihilation of particle-antiparticle pairs by an energy influx higher than 2mc², the number of particles
participating in high-energy interactions is not fixed. An adequate general description of such situations
is given by the quantum field theory, in which the particle’s wavefunction is treated as a field to be
quantized, using so-called field operators ψ̂(r, t) – very much similar to the electromagnetic field
operators (16). The Dirac equation follows from such theory in the single-particle approximation.
As was mentioned above on several occasions, the quantum field theory is well beyond the
time/space limits of this course, and I have to stop here, referring the interested reader to one of several
excellent textbooks on this discipline.62 However, I would strongly encourage the students going in this
direction to start by playing with the field operators on their own, taking clues from Eqs. (16), but
replacing the creation/annihilation operators â_j† and â_j of the electromagnetic field oscillators with
those of the general second quantization formalism outlined in Sec. 8.3.
9.1. Prove the Casimir formula, given by Eq. (23), by calculating the net force F = PA exerted by
the electromagnetic field, in its ground state, on two perfectly conducting parallel plates of area A,
separated by a vacuum gap of width t << A^{1/2}.
Hint: Calculate the field energy in the gap volume with and without the account of the plate
effect, and then apply the Euler-Maclaurin formula63 to the difference between these two results.
9.2. Electromagnetic radiation by some single-mode quantum sources may have such a high
degree of coherence that it is possible to observe the interference of waves from two independent
sources with virtually the same frequency, incident on one detector.
(i) Generalize Eq. (29) to this case.
62 For a gradual introduction see, e.g., either L. Brown, Quantum Field Theory, Cambridge U. Press (1994) or R.
Klauber, Student Friendly Quantum Field Theory, Sandtrove (2013). On the other hand, M. Srednicki, Quantum
Field Theory, Cambridge U. Press (2007) and A. Zee, Quantum Field Theory in a Nutshell, 2nd ed., Princeton
(2010), among others, offer steeper learning curves.
63 See, e.g., MA Eq. (2.12a).
(ii) Use this generalized expression to show that incident waves in different Fock states do not
create an interference pattern.
9.3. Calculate the zero-delay value g⁽²⁾(0) of the second-order correlation function of a
single-mode electromagnetic field in the so-called Schrödinger-cat state:64 a coherent superposition of two
Glauber states, with equal but sign-opposite parameters α, and a certain phase shift between them.
9.4. Calculate the zero-delay value g⁽²⁾(0) of the second-order correlation function of a
single-mode electromagnetic field in the squeezed ground state ζ defined by Eq. (5.142).
9.5. Calculate the rate of spontaneous photon emission (into unrestricted free space) by a
hydrogen atom, initially in the 2p state (n = 2, l = 1) with m = 0. Would the result be different for
m = ±1? for the 2s state (n = 2, l = 0, m = 0)? Discuss the relation between these quantum-mechanical results
and those given by the classical theory of radiation for the simplest classical model of the atom.
9.6. An electron has been placed on the lowest excited level of a spherically-symmetric,
quadratic potential well U(r) = m_eω²r²/2. Calculate the rate of its relaxation to the ground state, with the
emission of a photon (into unrestricted free space). Compare the rate with that for a similar transition of
the hydrogen atom, for the case when the radiation frequencies of these two systems are equal.
9.7. Derive an analog of Eq. (53) for the spontaneous photon emission into the free space, due to
a change of the magnetic dipole moment m of a small-size system.
9.8. A spin-½ particle, with a gyromagnetic ratio γ, is in its orbital ground state in a dc magnetic
field B₀. Calculate the rate of its spontaneous transition from the higher to the lower energy level, with
the emission of a photon into the free space. Evaluate this rate for an electron in a field of 10 T, and
discuss the implications of this result for laboratory experiments with electron spins.
9.9. Calculate the rate of spontaneous transitions between the two sublevels of the ground state
of a hydrogen atom, formed as a result of its hyperfine splitting. Discuss the implications of the result
for the width of the 21-cm spectral line of hydrogen.
9.10. Find the eigenstates and eigenvalues of the Jaynes-Cummings Hamiltonian (78), and
discuss their behavior near the resonance point ω = Ω.
9.11. Analyze the Purcell effect, mentioned in Secs. 3 and 4, quantitatively; in particular,
calculate the so-called Purcell factor F_P defined as the ratio of the rate Γ_s of an atom’s spontaneous
emission into a resonant cavity tuned exactly to the quantum transition frequency, to that into the free
space.
9.12. Prove that the Klein-Gordon equation (84) may be rewritten in the form similar to the
non-relativistic Schrödinger equation (1.25), but for a two-component wavefunction, with the Hamiltonian
represented (in the usual z-basis) by the following 2×2 matrix:
$$\mathrm{H} = -\frac{\hbar^2}{2m}\left(\sigma_z + i\sigma_y\right)\nabla^2 + mc^2\sigma_z.$$
Use your solution to discuss the physical meaning of the wavefunction’s components.
64 Its name stems from the well-known Schrödinger cat paradox, which is (very briefly) discussed in Sec. 10.1.
9.13. Calculate and discuss the energy spectrum of a relativistic, spinless, charged particle placed
into an external uniform, time-independent magnetic field B. Use the result to formulate the condition
of validity of the non-relativistic theory in this situation.
9.14. Prove Eq. (91) for the energy spectrum of a hydrogen-like atom/ion, starting from the
relativistic Schrödinger equation.
Hint: A mathematical analysis of Eq. (3.193) shows that its eigenvalues are given by Eq. (3.201),
ε_n = –1/2n², with n = l + 1 + n_r, where n_r = 0, 1, 2,…, even if the parameter l is not an integer.
9.15. Derive a general expression for the differential cross-section of elastic scattering of a
spinless relativistic particle by a static potential U(r), in the Born approximation, and formulate the
conditions of its validity. Use these results to calculate the differential cross-section of scattering of a
particle with the electric charge –e by the Coulomb electrostatic potential φ(r) = Ze/4πε₀r.
9.16. Starting from Eqs. (95)-(98), prove that the probability density w given by Eq. (101) and
the probability current density j defined by Eq. (102) do indeed satisfy the continuity equation (1.52):
∂w/∂t + ∇·j = 0.
9.17. Calculate the commutator of the operator L̂² and Dirac’s Hamiltonian of a free particle.
Compare the result with that for the non-relativistic Hamiltonian, and interpret the difference.
9.18. Calculate commutators of the operators Ŝ² and Ĵ² with Dirac’s Hamiltonian (97), and give
an interpretation of the results.
9.19. In the Heisenberg picture of quantum dynamics, derive an equation describing the time
evolution of free electron’s velocity in the Dirac theory. Solve the equation for the simplest state, with
definite energy and momentum, and discuss the solution.
9.20. Calculate the eigenstates and eigenenergies of a relativistic spin-½ particle with charge q,
placed into a uniform, time-independent external magnetic field B. Compare the calculated energy
spectrum with those following from the non-relativistic theory and the relativistic Schrödinger equation.
9.21.* Following the discussion at the very end of Section 7, introduce quantum field operators
ψ̂ that would be related to the usual wavefunctions ψ just as the electromagnetic field operators (16)
are related to the classical electromagnetic fields, and explore basic properties of these operators. (For
this preliminary study, consider the fixed-time situation.)
Chapter 10. Making Sense of Quantum Mechanics

This (rather brief) chapter addresses some conceptually important issues of quantum measurements and
quantum state interpretation. Please note that some of these issues are still subjects of debate1 –
fortunately not affecting quantum mechanics’ practical results, discussed in the previous chapters.
10.1. Quantum measurements

The knowledge base outlined in the previous chapters gives us a sufficient background for a (by
necessity, very brief) discussion of quantum measurements.2 Let me start by reminding the reader of the
only postulate of the quantum theory that relates it to experiment – so far, meaning perfect
measurements. In the simplest case when the system is in a coherent (pure) quantum state, its ket-vector
may be represented as a linear superposition
$$|\alpha\rangle = \sum_j \alpha_j |a_j\rangle, \tag{10.1}$$
where |a_j⟩ are the eigenstates of the operator of an observable A, related to its eigenvalues A_j by Eq.
(4.68):
$$\hat{A}|a_j\rangle = A_j|a_j\rangle. \tag{10.2}$$
In such a state, the outcome of every single measurement of the observable A may be uncertain, but is
restricted to the set of eigenvalues A_j, with the jth outcome probability equal to
$$W_j = |\alpha_j|^2. \tag{10.3}$$
As was discussed in Chapter 7, the state of the system (or rather of the statistical ensemble of
macroscopically similar systems we are using for this particular series of similar experiments) may not be
coherent, and hence even more uncertain than the state described by Eq. (1). Hence, the
measurement postulate means that even if the system is in this (the least uncertain) state, the
measurement outcomes are still probabilistic.3
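The statistical content of the measurement postulate is easy to illustrate numerically. The following minimal sketch draws single-shot outcomes A_j with the probabilities |α_j|² of Eq. (3), for an arbitrarily chosen two-component state; the particular coefficients, eigenvalues, and function names are assumptions made for the illustration only.

```python
import numpy as np


def outcome_probabilities(alpha):
    """Probabilities W_j = |alpha_j|^2, normalized to unit sum."""
    w = np.abs(np.asarray(alpha, dtype=complex)) ** 2
    return w / w.sum()


def simulate_measurements(alpha, eigenvalues, shots, seed=0):
    """Draw `shots` single-measurement outcomes from the set of eigenvalues."""
    rng = np.random.default_rng(seed)
    w = outcome_probabilities(alpha)
    indices = rng.choice(len(w), size=shots, p=w)
    return np.asarray(eigenvalues)[indices]


alpha = [0.6, 0.8j]         # illustrative expansion coefficients
eigenvalues = [+1.0, -1.0]  # illustrative eigenvalues A_j
outcomes = simulate_measurements(alpha, eigenvalues, shots=100_000)
frequency_plus = np.mean(outcomes == +1.0)  # approaches W_0 = 0.36
```

Each individual outcome is one of the eigenvalues, and only the relative frequencies over the ensemble approach the probabilities W_j.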
If we believe that a particular measurement may be done perfectly, and do not worry too much
how exactly, we are subscribing to the mathematical notion of measurement, that was, rather reluctantly,
used in these notes – up to this point. However, the actual (physical) measurements are always
imperfect, first of all because of the huge gap between the energy-time scale ħ ~ 10⁻³⁴ J·s of the quantum
phenomena in “microscopic” systems such as atoms, and the “macroscopic” scale of the direct human
perception, so that the role of the instruments bridging this gap (Fig. 1), is highly nontrivial.
1 For an excellent review of these controversies, as presented in a few leading textbooks, I highly recommend J.
Bell’s paper in the collection by A. Miller (ed.), Sixty-Two Years of Uncertainty, Plenum, 1989.
2 “Quantum measurements” is a very unfortunate and misleading term; it would be more sensible to speak about
“measurements of observables in quantum mechanical systems”. However, the former term is so common and
compact that I will use it – albeit rather reluctantly.
3 The measurement outcomes become definite only in the trivial case when the system is definitely in one of the
eigenstates a_j, say a_0; then α_j = δ_{j,0}exp{iφ}, and W_j = δ_{j,0}.
Fig. 10.1. The general scheme of a quantum measurement. (Diagram labels: quantum system; interaction; back action; instrument with a macroscopic pointer; output to a human observer.)
Besides the famous Bohr-Einstein discussion in the mid-1930s, which will be briefly reviewed
in Sec. 3, the founding fathers of quantum mechanics did not pay much attention to these issues,
apparently for the following reason. At that time it looked like the experimental instruments (at
least the best of them :-) were doing exactly what the measurement postulate was telling. For example,
the z-oriented Stern-Gerlach experiment (Fig. 4.1) turns two complex coefficients α↑ and α↓, describing
the spin state of the incoming electrons, into a set of particle-counter clicks, with the rates proportional
to, respectively, |α↑|² and |α↓|². The crude internal nature of these instruments makes more detailed
questions unnatural. For example, each click of a Geiger counter involves an effective disappearance of
one observed electron in a zillion-particle electric discharge avalanche it has triggered. A century ago, it
looked much more important to extend the newly born quantum mechanics to more complex systems
(such as atomic nuclei, etc.) than to think about the physics of such instruments.
However, since that time the experimental techniques, notably including high-vacuum and low-
temperature systems, micro- and nano-fabrication, and low-noise electronics, have improved quite
dramatically. In particular, we now may observe quantum-mechanical behavior of more and more
macroscopic objects – such as the micromechanical oscillators mentioned in Sec. 2.9. Moreover, some
“macroscopic quantum systems” (in particular, special systems of Josephson junctions, see below) have
properties enabling their use as essential parts of measurement setups. Such developments are making
the line separating the “micro” and “macro” worlds finer and finer, so that more inquisitive inquiries
into the physical nature of quantum measurements are not so hopeless now. In my personal scheme of
things,4 these inquiries may be grouped as follows:
(i) Does a quantum measurement involve any laws besides those of quantum mechanics? In
particular, should it necessarily involve a human/intelligent observer? (The last question is not as
laughable as it may look – see below.)
(ii) What is the state of the measured system just after a single-shot measurement – meaning a
measurement process limited to a time interval much shorter than the time scale of the measured
system’s evolution? (This question is a necessary part of any discussion of repeated measurements and
of their ultimate form – continuous monitoring of a certain observable.)
(iii) If a measurement of an observable A has produced a certain outcome Aj, what statements
may be made about the state of the system just before the measurement? (This question is most closely
related to various interpretations of quantum mechanics.)
Let me discuss these issues in the listed order. First of all, I am happy to report that there is a
virtual consensus of physicists on some aspects of these issues. According to this consensus, any
reasonable quantum measurement needs to result in a certain, distinguishable state of a macroscopic
output component of the measurement instrument – see Fig. 1. (Traditionally, this component is called a
4 Again, this list and some other issues discussed in the balance of this section are still controversial.
pointer, though its role may be played by a printer or a plotter, an electronic circuit sending out the
result as a number, etc.). This requirement implies that the measurement process should have the
following features:
- provide a large “signal gain”, i.e. some means of mapping the quantum state with its ħ-scale of
action (i.e. of the energy-by-time product) onto a macroscopic position of the pointer with a much larger
action scale, and
- if we want to approach the fundamental limit of uncertainty, given by Eq. (3), the instrument
should introduce as few additional fluctuations (“noise”) as permitted by the laws of physics.
Both these requirements are fulfilled in a well-designed Stern-Gerlach experiment – see Fig. 4.1
again. Indeed, the magnetic field gradient, splitting the electron beam, turns the minuscule (microscopic)
energy difference (4.167) between two spin-polarized states into a macroscopic difference between the
final positions of two output beams, where their detectors may be located. However, as was noted
above, the internal physics of the particle detectors (say, Geiger counters) at this measurement is rather
complex, and would not allow us to discuss some aspects of the measurement, in particular to answer
the second of the inquiries we are working on.
This is why I will describe the scheme of a very similar “single-shot” measurement of a
two-level quantum system, which shares the simplicity, high gain, and low internal noise of the
Stern-Gerlach apparatus, but has the advantage that in certain hardware implementations,5 the measurement
process allows a thorough, quantitative theoretical description. Let us measure a particle trapped in a
double-well potential (Fig. 2), where x is some continuous generalized coordinate – not necessarily a
mechanical displacement. Let the particle be initially in a pure quantum state, with the energy close to
the well’s bottom. Then, as we know from the discussion of such systems in Secs. 2.6 and 5.1, the state
may be described by a ket-vector similar to that of spin-½:
$$|\alpha\rangle = \alpha_{\rightarrow}|{\rightarrow}\rangle + \alpha_{\leftarrow}|{\leftarrow}\rangle, \tag{10.4}$$
where the component states → and ← are described by wavefunctions localized near the potential well
bottoms at x ≈ ±x₀ – see the blue lines in Fig. 2. Our goal is to measure in which well the particle resides
at a certain time instant, say at t = 0. For that, let us rapidly change, at that moment, the potential profile
of the system, so that at t > 0, near the origin, it may be well described by an inverted parabola:
$$U(x) \approx -\frac{m\lambda^2}{2}x^2, \quad \text{for } t > 0,\ |x| \ll x_f. \tag{10.5}$$
2
5 The scheme may be implemented, for example, using a simple Josephson-junction circuit called the balanced
comparator – see, e.g., T. Walls et al., IEEE Trans. on Appl. Supercond. 17, 136 (2007), and references therein.
Experiments have demonstrated that this system may have a measurement variance dominated by the theoretically
expected quantum-mechanical uncertainty, at practicable experimental conditions (at temperatures below ~ 1K).
A conceptual advantage of this system is that it is based on externally-shunted Josephson junctions, i.e. the
devices whose quantum-mechanical model, including its part describing the coupling to the environment, is in a
quantitative agreement with experiment – see, e.g., D. Schwartz et al., Phys. Rev. Lett. 55, 1547 (1985).
Colloquially, the balanced comparator is a high-gain instrument with a “well-documented Hamiltonian”,
eliminating the need for speculations about the environmental effects. In particular, the dephasing process in it,
and its time T₂, are well described by Eqs. (7.89) and (7.142), with the coefficients η equal to the Ohmic
conductances G of the shunts.
It is straightforward to verify that the Heisenberg equations of motion in such an inverted
potential describe exponential growth of the operator x̂ in time (proportional to exp{λt}) and hence a
similar, proportional growth of the expectation value ⟨x⟩ and its r.m.s. uncertainty δx.6 At this “inflation”
stage, the coherence between the two component states → and ← is still preserved, i.e. the time
evolution of the system is, in principle, reversible.
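The verification mentioned above takes just a couple of lines. The sketch below assumes the simplest Hamiltonian compatible with Eq. (5), namely that of a particle of mass m in the inverted parabolic potential with the constant λ, with no coupling to the environment:

```latex
% Assumed model: \hat{H} = \hat{p}^2/2m - m\lambda^2\hat{x}^2/2
\begin{align}
i\hbar\,\dot{\hat{x}} &= [\hat{x}, \hat{H}] = i\hbar\,\frac{\hat{p}}{m}, \qquad
i\hbar\,\dot{\hat{p}} = [\hat{p}, \hat{H}] = i\hbar\, m\lambda^2\hat{x}, \\
\ddot{\hat{x}} &= \lambda^2\hat{x}, \\
\hat{x}(t) &= \hat{x}(0)\cosh\lambda t + \frac{\hat{p}(0)}{m\lambda}\sinh\lambda t
\;\approx\; \frac{1}{2}\left[\hat{x}(0) + \frac{\hat{p}(0)}{m\lambda}\right]e^{\lambda t},
\quad \text{for } \lambda t \gg 1.
\end{align}
```

Since at this stage the whole operator is multiplied by the same c-number factor, both the expectation value of the coordinate and its r.m.s. uncertainty grow with the same exponent, so that their ratio stays constant.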
Fig. 10.2. The potential inversion, as viewed on the (a) “macroscopic” and (b) “microscopic” scales of the generalized coordinate x. (Plot labels: U(x, t) versus x, before and after the inversion at t = 0, with the points ±x₀ and ±x_f marked.)
Now let the system be weakly coupled, also at t > 0, to a dissipative (e.g., Ohmic) environment.
As we know from Chapter 7, such coupling ensures the state’s dephasing on some time scale T₂. If
$$x_0 \ll x_0\exp\{\lambda T_2\},\ x_f, \tag{10.6}$$
then the process, after the potential inversion, consists of two stages, well separated in time:
- the already discussed “inflation” stage, preserving the component states’ coherence, and
- the dephasing stage, at which the coherence of the component states → and ← is gradually
suppressed as described by Eq. (7.89), i.e. the density matrix of the system is gradually reduced to the
diagonal form describing a classical mixture of two probability packets with the probabilities (3) equal
to, respectively, W_→ = |α_→|² and W_← = |α_←|² ≡ 1 – |α_→|².
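The reduction of the density matrix to the diagonal form at the dephasing stage may be illustrated by the following minimal sketch. It assumes the simplest possible law, a purely exponential decay of the off-diagonal elements with the time constant T₂, at constant diagonal elements; this is only a stand-in for the detailed result (7.89), and all names are illustrative.

```python
import numpy as np


def dephased_density_matrix(alpha_right, alpha_left, t, T2):
    """Density matrix in the (right, left) basis after dephasing for time t.

    Assumption: off-diagonal elements decay as exp(-t/T2); diagonal ones
    (the probabilities W) are not affected.
    """
    a = np.array([alpha_right, alpha_left], dtype=complex)
    a = a / np.linalg.norm(a)      # enforce normalization
    rho = np.outer(a, a.conj())    # pure-state density matrix at t = 0
    decay = np.exp(-t / T2)
    rho[0, 1] *= decay
    rho[1, 0] *= decay
    return rho


def purity(rho):
    """Tr(rho^2): 1 for a pure state, smaller for a classical mixture."""
    return float(np.real(np.trace(rho @ rho)))


rho_initial = dephased_density_matrix(0.6, 0.8j, t=0.0, T2=1.0)
rho_final = dephased_density_matrix(0.6, 0.8j, t=30.0, T2=1.0)
```

At t >> T₂ the matrix is diagonal, with the elements W_→ and W_← unchanged, while the purity drops from 1 to W_→² + W_←².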
Besides dephasing, the environment gives the motion certain kinematic friction, with the drag
coefficient η (7.141), so that the system eventually settles to rest at one of the macroscopically separated
minima x = ±x_f of the inverted potential (Fig. 2a), thus ensuring a high “signal gain” x_f/x₀ >> 1. As a
result, the final probability density distribution w(x) along the x-axis has two narrow, well-separated
peaks. But this is just the situation that was discussed in Sec. 2.5 – see, in particular, Fig. 2.17. Since
that discussion is very important, let me repeat – or rather rephrase it. The final state of the system is a
classical mixture of two well-separated states, with the respective probabilities W_→ and W_←, whose sum
equals 1. Now let us use some detector to test whether the system is in one of these states – say the right
6 Somewhat counter-intuitively, the latter growth improves the measurement’s fidelity. Indeed, it does not affect
the intrinsic “signal-to-noise ratio” ⟨x⟩/δx, while making the intrinsic (say, quantum-mechanical) uncertainty much
larger than the possible noise contribution by the later measurement stage(s).
one. (If x_f is sufficiently large, the noise contribution of this detector into the measurement uncertainty is
negligible,7 and its physics is unimportant.) If the system has been found at this location (again, the
probability of this outcome is W_→ = |α_→|²), the probability to find it at the counterpart (left) location at a
subsequent detection turns to zero.
This probability “reduction” is a purely classical (or if you like, mathematical) effect of the
statistical ensemble’s re-definition: W_← equals zero not in the initial ensemble of all similar experiments
(where it equals |α_←|²), but only in the re-defined ensemble of experiments in which the system had been
found at the right location. Of course, which ensemble to use, i.e. what probabilities to register/publish
is a purely accounting decision, which should be made by a human (or otherwise intelligent :-) observer.
If we are only interested in an objective recording of results of a pre-fixed sequence of experiments (i.e.
the members of a pre-defined, fixed statistical ensemble), there is no need to include such an observer in
any discussion. In any case, this detection/registration process, very common in classical statistics,
leaves no space for any mysterious “wave packet reduction” – understood as a hypothetical process that
would not obey the regular laws of quantum mechanical evolution.
The state dephasing and ensemble re-definition at measurements are at the core of several
paradoxes, of which the so-called quantum Zeno paradox is perhaps the most spectacular.8 Let us return
to a two-level system with the unperturbed Hamiltonian given by Eq. (4.166), the quantum oscillation
period 2π/Ω much longer than the single-shot measurement time, and the system initially (at t = 0)
definitely in one of the partial quantum states – for example, a certain potential well of the double-well
potential. Then, as we know from Secs. 2.6 and 4.6, the probability to find the system in this initial state
at time t > 0 is
$$W(t) = \cos^2\frac{\Omega t}{2} \equiv 1 - \sin^2\frac{\Omega t}{2}. \tag{10.7}$$
If the time is small enough (t = dt << 1/Ω), we may use the Taylor expansion to write
$$W(dt) \approx 1 - \frac{\Omega^2 dt^2}{4}. \tag{10.8}$$
Now, let us use some good measurement scheme (say, the potential inversion discussed above)
to measure whether the system is still in this initial state. If it is (as Eq. (8) shows, the probability of
such an outcome is nearly 100%), then the system, after the measurement, is in the same state. Let us
allow it to evolve again, with the same Hamiltonian. Then the evolution of W will follow the same law
7 At the balanced-comparator implementation mentioned above, the final state detection may be readily performed
using a “SQUID” magnetometer based on the same Josephson junction technology – see, e.g., EM Sec. 6.5. In
this case, the distance between the potential minima ±x_f is close to one superconducting flux quantum (3.38),
while the additional uncertainty induced by the SQUID may be as low as a few millionths of that amount.
8 This name, coined by E. Sudarshan and B. Misra in 1977 (though the paradox had been discussed in detail by
A. Turing in 1954) is due to its superficial similarity to the classical paradoxes by the ancient Greek philosopher
Zeno of Elea. By the way, just for fun, let us have a look at what happens when Mother Nature is discussed by
people that do not understand math and physics. The most famous of the classical Zeno paradoxes is the case of
Achilles and Tortoise: the fast runner Achilles can apparently never overtake the slower Tortoise, because (in
Aristotle’s words) “the pursuer must first reach the point whence the pursued started, so that the slower must
always hold a lead”. For a physicist, the paradox has a trivial, obvious resolution, but here is what a philosopher
writes about it – not in some year BC, but in the 2010 AD: "Given the history of 'final resolutions', from Aristotle
onwards, it's probably foolhardy to think we've reached the end.” For me, this is a sad symbol of modern
philosophy.
as in Eq. (7). Thus, when the system is measured again at time 2dt, the probability to find it in the same
state both times is
$$W(2dt) \approx W(dt)\left(1 - \frac{\Omega^2 dt^2}{4}\right) \approx \left(1 - \frac{\Omega^2 dt^2}{4}\right)^2. \tag{10.9}$$
After repeating this cycle N times (with the total time t = Ndt still much less than N^{1/2}/Ω), the probability
that the system is still in its initial state is
$$W(Ndt) \equiv W(t) \approx \left(1 - \frac{\Omega^2 dt^2}{4}\right)^N = \left(1 - \frac{\Omega^2 t^2}{4N^2}\right)^N \approx 1 - \frac{\Omega^2 t^2}{4N}. \tag{10.10}$$
Comparing this result with Eq. (7), we see that the process of the system’s transfer to the opposite partial
state has been slowed down rather dramatically, and in the limit N → ∞ (at fixed t), its evolution is
virtually stopped by the measurement process. There is of course nothing mysterious here; the evolution
slowdown is due to the quantum state dephasing at each measurement.
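The slowdown is easy to see in numbers. The following minimal sketch compares the free evolution (7) with the survival probability after N equally spaced ideal measurements, and with the estimate (10); each measurement is assumed to be instantaneous and to restart the evolution from the initial state, and the function names are illustrative.

```python
import math


def survival_free(omega, t):
    """Eq. (7): probability to find the system in its initial state at time t."""
    return math.cos(omega * t / 2) ** 2


def survival_with_measurements(omega, t, n):
    """Survival probability after n ideal measurements spaced by dt = t/n."""
    return survival_free(omega, t / n) ** n


def survival_zeno_estimate(omega, t, n):
    """The last form of Eq. (10), valid at omega * t << n**0.5."""
    return 1 - (omega * t) ** 2 / (4 * n)


omega, t = 1.0, math.pi  # without measurements, a full transfer by this time
w_free = survival_free(omega, t)
w_measured = {n: survival_with_measurements(omega, t, n) for n in (10, 100, 1000)}
w_estimate = survival_zeno_estimate(omega, t, 100)
```

For Ωt = π the unmeasured system would be transferred to the opposite state completely, while already N = 100 measurements keep it in the initial state with a probability above 97%, in agreement with the estimate (10).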
This may be the only acceptable occasion for me to mention, very briefly, one more famous – or
rather infamous – Schrödinger cat paradox, so much overplayed in popular publications.9 For this thought
experiment, there is no need to discuss the (rather complicated :-) physics of the cat. As soon as the
charged particle, produced at the radioactive decay, reaches the Geiger counter, the initial coherent
superposition of the two possible quantum states (“the decay has happened”/“the decay has not
happened”) of the system is rapidly dephased, i.e. reduced to their classical mixture, leading,
correspondingly, to the classical mixture of the final macroscopic states “cat dead”/“cat alive”. So,
despite attempts by numerous authors, without a proper physics background, to represent this situation
as a mystery whose discussion needs involvement of professional philosophers, hopefully the reader
knows enough about dephasing from Chapter 7, to ignore all this babble.
10.2. QND measurements

I hope that the above discussion has sufficiently illuminated the issues of the group (i), so let me
proceed to the question group (ii), in particular to the general issue of the back action of the instrument
upon the system under measurement – symbolized with the back arrow in Fig. 1. In the instruments like
the Geiger counter, such back action is large: the instrument essentially destroys (“demolishes”) the
state of the system under measurement. Even the “cleaner” potential-inversion measurement, shown in
Fig. 2, fully destroys the initial coherence of the system, i.e. perturbs it rather substantially.
However, in the 1970s it was understood that this is not really necessary. For example, in Sec.
7.3, we have already discussed an example of a two-level system coupled with its environment and
described by the Hamiltonian (7.68)-(7.70):
$$\hat{H} = \hat{H}_s + \hat{H}_{int} + \hat{H}_e\{\lambda\}, \quad \text{with } \hat{H}_s = c_z\hat{\sigma}_z, \text{ and } \hat{H}_{int} = -\hat{f}\{\lambda\}\hat{\sigma}_z, \tag{10.11}$$
so that
$$\left[\hat{H}_s, \hat{H}_{int}\right] = 0. \tag{10.12}$$
9I fully agree with S. Hawking who has been quoted to say, “When I hear about the Schrödinger cat, I reach for
my gun.” The only good aspect of this popularity is that the formulation of this paradox should be so well
known to the reader that I do not need to waste time/space repeating it.
Comparing this equality with Eq. (4.199), applied to the explicitly-time-independent Hamiltonian Ĥ_s,
$$i\hbar\,\dot{\hat{H}}_s = \left[\hat{H}_s, \hat{H}\right] = \left[\hat{H}_s, \hat{H}_s + \hat{H}_{int} + \hat{H}_e\{\lambda\}\right] = \left[\hat{H}_s, \hat{H}_{int}\right] = 0, \tag{10.13}$$
we see that in the Heisenberg picture, the Hamiltonian operator (and hence the energy) of the system of
our interest does not change in time. On the other hand, if the “environment” in this discussion is the
instrument used for the measurement (see Fig. 1 again), the interaction can change its state, so it may be
used to measure the system’s energy – or another observable whose operator commutes with the
interaction Hamiltonian. Such a trick is called a quantum non-demolition (QND), or sometimes
“back-action-evading”, measurement.10 Due to the lack of back action of the instrument on the corresponding
variable, such measurements allow its continuous monitoring. Let me present a fine example of an
actual measurement of this kind – see Fig. 3.11
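Before turning to that experiment, the commutation condition (12) may be checked on a toy model. The sketch below represents the environment (i.e. the instrument) by a small Hilbert space with random Hermitian matrices standing in for the operators f̂ and Ĥ_e; these stand-ins, the dimension, and all names are assumptions of the illustration, not a model of any real instrument.

```python
import numpy as np

SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)


def random_hermitian(dim, rng):
    """A random Hermitian matrix, used as a stand-in for environment operators."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2


def commutator(a, b):
    return a @ b - b @ a


def max_abs(m):
    return float(np.max(np.abs(m)))


env_dim, c_z = 4, 1.0
rng = np.random.default_rng(0)
f = random_hermitian(env_dim, rng)                           # stand-in for f{lambda}
h_s = c_z * np.kron(SIGMA_Z, np.eye(env_dim))                # system, as in Eq. (11)
h_int = -np.kron(SIGMA_Z, f)                                 # QND-type coupling
h_e = np.kron(np.eye(2), random_hermitian(env_dim, rng))     # environment alone

# Eq. (13): the system's Hamiltonian commutes with the full one
qnd_residual = max_abs(commutator(h_s, h_s + h_int + h_e))

# Counter-example: a coupling via sigma_x does not commute with h_s
h_int_bad = -np.kron(SIGMA_X, f)
bad_residual = max_abs(commutator(h_s, h_int_bad))
```

With the σ_z coupling the commutator vanishes to rounding accuracy, so that the system's energy is conserved while the instrument's state may change; replacing σ_z with σ_x in the coupling destroys this property.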
Fig. 10.3. QND measurements of single electron’s energy by Peil and Gabrielse: (a) the
experimental setup’s core, and (b) a record of the thermal excitation and spontaneous relaxation
of the Fock states. © 1999 APS; reproduced with permission.
In this experiment, a single electron is captured in a Penning trap – a combination of a (virtually)
uniform magnetic field B and a quadrupole electric field.12 This electric field stabilizes the cyclotron
orbits but does not have any noticeable effect on electron motion in the plane perpendicular to the
magnetic field, and hence on its Landau level energies – see Eq. (3.50):
$$E_n = \hbar\omega_c\left(n + \frac{1}{2}\right), \quad \text{with } \omega_c = \frac{eB}{m_e}. \tag{10.14}$$
(In the cited work, with B ≈ 5.3 T, the cyclic frequency ω_c/2π was about 147 GHz, so that the Landau
level splitting ħω_c was close to 10⁻²² J, i.e. corresponded to k_BT at T ~ 10 K, while the physical
temperature of the system might be reduced well below that, down to 80 mK). Now note that the
10 For a detailed discussion of this field see, e.g., V. Braginsky and F. Khalili (ed. by K. Thorne), Quantum
Measurement, Cambridge U. Press, 1992; for an earlier review, see V. Braginsky et al., Science 209, 547 (1980).
11 S. Peil and G. Gabrielse, Phys. Rev. Lett. 83, 1287 (1999).
12 It is similar to the 2D system discussed in EM Sec. 2.7, but with additional rotation about one of the axes.
analogy between a Landau-level particle and a harmonic oscillator goes beyond the energy spectrum
(14). Indeed, since the Hamiltonian of a 2D particle in a perpendicular magnetic field may be reduced to
Eq. (3.47), similar to that of a 1D oscillator, we may repeat all procedures of Sec. 5.4 and rewrite this
effective Hamiltonian in terms of the creation-annihilation operators – see Eq. (5.72):
$$\hat{H}_s = \hbar\omega_c\left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right). \tag{10.15}$$
In the Peil and Gabrielse experiment, the trapped electron had one more degree of freedom –
along the magnetic field. The electric field of the Penning trap created a soft confining potential along
this direction (vertical in Fig. 3a; I will take it for the z-axis), so that small electron oscillations along
that axis could be well described as those of a 1D harmonic oscillator of much lower eigenfrequency, in
that particular experiment with $\omega_z/2\pi \approx 64$ MHz. This frequency could be measured very accurately
(with error ~1 Hz) by sensitive electronics whose electric field does affect the z-motion of the electron,
but not its motion in the perpendicular plane. In an exactly uniform magnetic field, the two modes of
electron motion would be completely uncoupled. However, the experimental setup included two special
superconducting rings made of niobium (see Fig. 3a), which slightly distorted the magnetic field and
created an interaction between the modes, which might be well approximated by the Hamiltonian13
$$\hat{H}_{\rm int} = \text{const} \times \left(\hat{a}^\dagger\hat{a} + \frac{1}{2}\right)\hat{z}^2, \tag{10.16}$$
so that the main condition (12) of a QND measurement was very closely satisfied. At the same time, the
coupling (16) ensured that a change of the Landau level number n by 1 changed the z-oscillation
eigenfrequency by ~12.4 Hz. Since this shift was substantially larger than electronics’ noise, rare
spontaneous changes of n (due to a weak uncontrolled coupling of the electron to the environment)
could be readily measured – moreover, continuously monitored – see Fig. 3b. The record shows
spontaneous excitations of the electron to higher Landau levels, with its sequential relaxation, just as
described by Eqs. (7.208)-(7.210). The detailed data statistics analysis showed that there was virtually
no effect of the measuring instrument on these processes – at least on the scale of minutes, i.e. as many
as ~$10^{13}$ cyclotron orbit periods.14
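The role of the coupling (16) can be illustrated numerically. The sketch below assumes that the QND condition referred to as Eq. (12) amounts to the commutation of the interaction Hamiltonian with the Hamiltonian (15) of the monitored mode. It builds both operators in a truncated Fock basis and compares the coupling (16) with a hypothetical coupling that is linear in a cyclotron-mode coordinate. All helper names, the truncation size, and the units ($\hbar\omega_c$ = const = 1) are illustrative choices.

```python
import math

def annihilation(dim):
    """Annihilation operator truncated to the lowest `dim` Fock states."""
    a = [[0.0] * dim for _ in range(dim)]
    for k in range(1, dim):
        a[k - 1][k] = math.sqrt(k)
    return a

def identity(dim):
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def add(m1, m2, c2=1.0):
    """Return m1 + c2 * m2."""
    return [[x + c2 * y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def matmul(m1, m2):
    cols = list(zip(*m2))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in m1]

def kron(m1, m2):
    """Kronecker product: m1 acts on the cyclotron mode, m2 on the z-mode."""
    return [[x * y for x in r1 for y in r2] for r1 in m1 for r2 in m2]

def commutator_norm(m1, m2):
    """Largest absolute element of the commutator [m1, m2]."""
    diff = add(matmul(m1, m2), matmul(m2, m1), c2=-1.0)
    return max(abs(x) for row in diff for x in row)

DIM = 5                                                 # Fock states kept per mode
a = annihilation(DIM)
a_dag = transpose(a)
n_half = add(matmul(a_dag, a), identity(DIM), c2=0.5)   # a†a + 1/2
z = add(a, a_dag)                                       # z-coordinate, arbitrary units
z2 = matmul(z, z)

h_s = kron(n_half, identity(DIM))       # Eq. (10.15) with hbar * omega_c = 1
h_int = kron(n_half, z2)                # Eq. (10.16) with const = 1
h_lin = kron(add(a, a_dag), identity(DIM))   # hypothetical linear coupling, for contrast

qnd_defect = commutator_norm(h_s, h_int)     # vanishes: the level number n is not disturbed
lin_defect = commutator_norm(h_s, h_lin)     # finite: such a coupling would change n
print(qnd_defect, lin_defect)
```

The first commutator vanishes because the interaction (16) depends on the cyclotron mode only through the operator $\hat{a}^\dagger\hat{a}$, while the second one does not, so a coupling of the latter type would cause transitions between the Landau levels.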
It is important, however, to note that any measurement – QND or not – cannot avoid the
uncertainty relations between incompatible variables; in the particular case described above, continuous
monitoring of the Landau state number n does not allow the simultaneous monitoring of its quantum
phase (which may be defined exactly as in the harmonic oscillator). In this context, it is natural to
wonder whether the QND measurement concept may be extended from quadratic-form variables like
energy to “usual” observables such as coordinates and momenta, whose uncertainties are bound by the
ordinary Heisenberg’s relation (1.35). The answer is yes, but the required methods are a bit more tricky.
For example, let us place an electrically charged particle into a uniform electric field $\boldsymbol{\mathcal{E}} = \mathbf{n}_x\mathcal{E}(t)$
of an instrument, so that their interaction Hamiltonian is
13 Here I have simplified the real situation a bit. Actually, in that experiment, there was an electron spin’s
contribution to the interaction Hamiltonian as well, but since the used high magnetic field polarized the spins
quite reliably, their only effect was a constant shift of the frequency $\omega_z$, which is not important for our discussion.
14 See also the conceptually similar experiments, performed by different means: G. Nogues et al., Nature 400, 239
(1999).
$$\hat{H}_{\rm int} = -q\hat{\mathcal{E}}(t)\hat{x}. \tag{10.17}$$

Such interaction may certainly pass the information on the time evolution of the coordinate x to the
instrument. However, in this case, Eq. (12) is not satisfied – at least for the kinetic-energy part of the
particle’s Hamiltonian; as a result, the interaction distorts its time evolution. Indeed, writing the
Heisenberg equation (4.199) for the x-component of the momentum, we get
$$\dot{\hat{p}} = \dot{\hat{p}}\Big|_{\mathcal{E}=0} + q\hat{\mathcal{E}}(t). \tag{10.18}$$
On the other hand, integrating Eq. (5.139) for the coordinate operator evolution,15 we get the expression
$$\hat{x}(t) = \hat{x}(t_0) + \frac{1}{m}\int_{t_0}^{t}\hat{p}(t')\,dt', \tag{10.19}$$
which shows that the perturbations (18) of the momentum eventually find their way to the coordinate
evolution, not allowing its unperturbed sequential measurements.
However, for such an important particular system as a harmonic oscillator, the following trick is
possible. For this system, Eqs. (5.139) with the addition (18) may be readily combined to give a second-
order differential equation for the coordinate operator, that is absolutely similar to the classical equation
of motion of the system, and has a similar solution:16
$$\hat{x}(t) = \hat{x}(t)\Big|_{\mathcal{E}=0} + \frac{q}{m\omega_0}\int_0^t \hat{\mathcal{E}}(t')\sin\omega_0(t - t')\,dt'. \tag{10.20}$$
This formula confirms that generally, the external field $\mathcal{E}(t)$ (in our case, the sensing field of the
measurement instrument) affects the time evolution law – of course. However, Eq. (20) shows that if the
field is applied only at moments $t'_n$ separated by intervals T/2, where $T = 2\pi/\omega_0$ is the oscillation period,
its effect on the coordinate vanishes at similarly spaced observation instants $t_n = t'_n + (m + 1/2)T$, where m is
an integer. This is the idea of stroboscopic QND measurements. Of course, according to Eq. (18), even such
a measurement strongly perturbs the oscillator momentum, so that even if the values $x_n$ are measured with high
accuracy, Heisenberg’s uncertainty relation is not violated.
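The stroboscopic idea may be checked with a few lines of Python. The sketch below treats the field as a c-number and models each sensing pulse as a delta function, $\mathcal{E}(t') = A_n\delta(t' - t'_n)$, so that the integral in Eq. (20) turns into a sum. The pulse areas, the units (q = m = T = 1), and the function name `kick_response` are arbitrary illustrative assumptions.

```python
import math

def kick_response(t, kick_times, kick_areas, omega0, q=1.0, m=1.0):
    """Perturbations (dx, dp) at time t caused by delta-like field pulses.

    Each pulse is modeled as E(t') = area * delta(t' - t_kick), so the integral
    in Eq. (10.20) reduces to a sum of sin(omega0 * (t - t_kick)) terms.
    """
    dx = 0.0
    dp = 0.0
    for t_kick, area in zip(kick_times, kick_areas):
        if t >= t_kick:
            phase = omega0 * (t - t_kick)
            dx += q * area / (m * omega0) * math.sin(phase)
            dp += q * area * math.cos(phase)   # dp = m * d(dx)/dt
    return dx, dp

PERIOD = 1.0                                   # oscillation period T, arbitrary units
OMEGA0 = 2 * math.pi / PERIOD
kick_times = [0.5 * PERIOD * n for n in range(6)]   # pulses repeated every T/2
kick_areas = [0.3, -1.2, 0.7, 2.0, -0.5, 1.1]       # arbitrary pulse strengths

# Observation instant: half a period after the last pulse
t_obs = kick_times[-1] + 0.5 * PERIOD
dx_obs, dp_obs = kick_response(t_obs, kick_times, kick_areas, OMEGA0)

# Observation at a "wrong" instant, shifted by a tenth of the period
dx_off, _ = kick_response(t_obs + 0.1 * PERIOD, kick_times, kick_areas, OMEGA0)

print(dx_obs, dp_obs, dx_off)
```

At the stroboscopic instant, the coordinate perturbation vanishes to machine precision, while the momentum perturbation stays finite. Shifting the observation instant by a fraction of the period restores the coordinate perturbation as well.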
A direct implementation of the stroboscopic measurements is technically complicated, but this
initial idea has opened a way to more practicable solutions. For example, it is straightforward to use the
Heisenberg equations of motion to show that if the coupling of two harmonic oscillators, with
coordinates x and X, and unperturbed frequencies $\omega$ and $\Omega$, is modulated in time as
$$\hat{H}_{\rm int} \propto \hat{x}\hat{X}\cos\omega t\,\cos\Omega t, \tag{10.21}$$
15 This simple relation is limited to 1D systems with Hamiltonians of the type (1.41), but by now the reader
certainly knows enough to understand that this discussion may be readily generalized to many other systems.
16 Note in particular that the function $\sin\omega_0\tau$ (with $\tau \equiv t - t'$) under the integral, divided by $\omega_0$, is nothing more
than the temporal Green’s function $G(\tau)$ of a loss-free harmonic oscillator – see, e.g., CM Sec. 5.1.
then the process in one of the oscillators (say, that with frequency $\Omega$) does not affect dynamics of one of
the quadrature components of the counterpart oscillator, defined by relations17
$$\hat{x}_1 \equiv \hat{x}\cos\omega t - \frac{\hat{p}}{m\omega}\sin\omega t, \qquad \hat{x}_2 \equiv \hat{x}\sin\omega t + \frac{\hat{p}}{m\omega}\cos\omega t, \tag{10.22}$$
while this component’s motion does affect the dynamics of one of the quadrature components of the
counterpart oscillator. (For the counterpart couple of quadrature components, the information transfer
goes in the opposite direction.) This scheme has been successfully used for QND measurements.18
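The physical sense of the quadrature components may be illustrated numerically: for an unperturbed oscillator, both combinations (22) stay constant in time. The minimal sketch below assumes the sign convention $x_1 = x\cos\omega t - (p/m\omega)\sin\omega t$, $x_2 = x\sin\omega t + (p/m\omega)\cos\omega t$, and uses c-number values of x and p in place of the operators; all parameter values are arbitrary.

```python
import math

def free_oscillator(t, x0, p0, m, omega):
    """Unperturbed evolution of the coordinate and momentum of a harmonic oscillator."""
    x = x0 * math.cos(omega * t) + p0 / (m * omega) * math.sin(omega * t)
    p = p0 * math.cos(omega * t) - m * omega * x0 * math.sin(omega * t)
    return x, p

def quadratures(t, x, p, m, omega):
    """Quadrature components, with the sign convention assumed for Eq. (10.22)."""
    x1 = x * math.cos(omega * t) - p / (m * omega) * math.sin(omega * t)
    x2 = x * math.sin(omega * t) + p / (m * omega) * math.cos(omega * t)
    return x1, x2

M, OMEGA, X0, P0 = 2.0, 3.0, 0.7, -1.1     # arbitrary test values

samples = []
for t in (0.0, 0.4, 1.3, 5.0):
    x, p = free_oscillator(t, X0, P0, M, OMEGA)
    samples.append(quadratures(t, x, p, M, OMEGA))

for x1, x2 in samples:
    print(f"x1 = {x1:.6f}, x2 = {x2:.6f}")
```

At all sampled instants, $x_1$ equals the initial coordinate and $x_2$ equals the initial momentum divided by $m\omega$. A force acting on the oscillator therefore shows up as a slow drift of these components rather than as a fast oscillation.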
Please note that the last two QND measurement examples are based on the idea of a periodic
change of a certain parameter in time – either in the short-pulse form or the sinusoidal form. If the only
goal of a QND measurement is a sensitive measurement of a weak classical force acting on a quantum
probe system, i.e. a 1D oscillator of eigenfrequency $\omega_0$, it may be implemented much more simply – just by
modulating an oscillator’s parameter with a frequency $\omega \approx 2\omega_0$. From the classical dynamics, we know
that if the depth of such modulation exceeds a certain threshold value, it results in the excitation of the
so-called degenerate parametric oscillations with frequency $\omega/2 \approx \omega_0$, and one of two opposite phases.19
In the language of Eq. (22), the parametric excitation means exponential growth of one of the quadrature
components (with its sign depending on initial conditions), while the counterpart component is
suppressed. Close to, but below the excitation threshold, the parameter modulation boosts all
fluctuations of the almost-excited component, including its quantum-mechanical uncertainty, and
suppresses (squeezes) those of the counterpart component. The result is a squeezed state, already
discussed in Sec. 5.5 of this course (see in particular Eqs. (5.143) and Fig. 5.8), which allows one to
notice the effect of an external force on the oscillator on the backdrop of a quantum uncertainty much
smaller than the standard quantum limit (5.99).
In electrical engineering, this fact may be conveniently formulated in terms of the noise parameter
$\Theta_N$ of a linear amplifier – essentially the tool for continuous monitoring of an input “signal” – e.g., a
microwave or optical waveform.20 Namely, $\Theta_N$ of “usual” (say, transistor or maser) amplifiers, which are
equally sensitive to both quadrature components of the signal, has the minimum value $\hbar\omega/2$, due to
the quantum uncertainty pertinent to the quantum state of the amplifier itself (which therefore plays the
role of its “quantum noise”) – the fact that was recognized in the early 1960s.21 On the other hand, a
17 The physical sense of these relations should be clear from Fig. 5.8: they define a system of coordinates rotating
clockwise with the angular velocity equal to $\omega$, so that the point representing unperturbed classical oscillations
with that frequency is at rest in this rotating frame. (The “probability cloud” representing a Glauber state is also
stationary in the coordinates [x1, x2].) The reader familiar with the classical theory of oscillations may notice that the
observables x1 and x2 so defined are just the Poincaré plane coordinates (“RWA variables”) – see, e.g., CM Sec.
5.3-5.6, and especially Fig. 5.9, where these coordinates are denoted as u and v.
18 The first, initially imperfect QND experiments were reported by R. Slusher et al., Phys. Rev. Lett. 55, 2409
(1985), and other groups soon after this, using nonlinear interactions of optical waves. Later, the results were
much improved – see, e.g., P. Grangier et al., Nature 396, 537 (1998), and references therein. Recently, such
experiments were extended to mechanical systems – see, e.g., F. Lecocq et al., Phys. Rev. X 5, 041037 (2015).
19 See, e.g., CM Sec. 5.5, and also Fig. 5.8 and its discussion in Sec. 5.6.
20 For a quantitative definition of the latter parameter, suitable for the quantum sensitivity range ($\Theta_N \sim \hbar\omega$) as
well, see, e.g., I. Devyatov et al., J. Appl. Phys. 60, 1808 (1986). In the classical noise limit ($\Theta_N \gg \hbar\omega$), it
coincides with $k_BT_N$, where $T_N$ is a more popular measure of electronics’ noise, called the noise temperature.
21 See, e.g., H. Haus and J. Mullen, Phys. Rev. 128, 2407 (1962).
degenerate parametric amplifier, sensitive to just one quadrature component, may have $\Theta_N$ well below
$\hbar\omega/2$, due to its ground state squeezing.22
Let me note that the parameter-modulation schemes of the QND measurements are not limited to
harmonic oscillators, and may be applied to other important quantum systems, notably including two-
level (i.e. spin-½-like) systems.23 Such measurements may be an important tool for the further progress
of quantum computation and cryptography.24
Finally, let me mention that the composite systems consisting of a quantum subsystem, and a
classical subsystem performing its continuous weakly-perturbing measurement and using its results for
providing a specially crafted feedback to the quantum subsystem, may have some curious properties, in
particular mock a quantum system detached from the environment.25
10.3. Hidden variables and local reality

Now we are ready to proceed to the discussion of the last, hardest group (iii) of the questions
posed in Sec. 1, namely on the state of a quantum system just before its measurement. After a very
important but inconclusive discussion of this issue by Albert Einstein and his collaborators on one side,
and Niels Bohr on the other side, in the mid-1930s, such discussions resumed in the 1950s.26 They
led to a key contribution by John Stewart Bell in the early 1960s, summarized as the so-called Bell’s
inequalities, and then to experimental work on better and better verification of these inequalities.
(Besides that work, the recent progress, in my humble view, has been rather marginal.)
The central question may be formulated as follows: what had been the “real” state of a quantum-
mechanical system just before a virtually perfect single-shot measurement was performed on it, and
gave a certain, documented outcome? To be specific, let us focus again on the example of Stern-Gerlach
measurements of spin-½ particles – because of their conceptual simplicity.27 For a single-component
system (in this case a single spin-½) the answer to the posed question may look evident. Indeed, as we
know, if the spin is in a pure (least-uncertain) state $\alpha$, i.e. its ket-vector may be expressed in the form
similar to Eq. (4),
$$|\alpha\rangle = \alpha_\uparrow|\uparrow\rangle + \alpha_\downarrow|\downarrow\rangle, \tag{10.23}$$
where, as usual, $\uparrow$ and $\downarrow$ denote the states with definite spin orientations along the z-axis, the
probabilities of the corresponding outcomes of the z-oriented Stern-Gerlach experiment are $W_\uparrow = |\alpha_\uparrow|^2$
and $W_\downarrow = |\alpha_\downarrow|^2$. Then it looks natural to suggest that if a particular experiment gave the outcome
corresponding to the state $\uparrow$, the spin had been in that state just before the experiment. For a classical
22 See, e.g., the spectacular experiments by B. Yurke et al., Phys. Rev. Lett. 60, 764 (1988). Note also that the
squeezed ground states of light are now used to improve the sensitivity of interferometers in gravitational wave
detectors – see, e.g., the recent review by R. Schnabel, Phys. Repts. 684, 1 (2017), and the later paper by F.
Acernese et al., Phys. Rev. Lett. 123, 231108 (2019).
23 See, e.g., D. Averin, Phys. Rev. Lett. 88, 207901 (2002).
24 See, e.g., G. Jaeger, Quantum Information: An Overview, Springer, 2006.
25 See, e.g., the monograph by H. Wiseman and G. Milburn, Quantum Measurement and Control, Cambridge U.
Press (2009), more recent experiments by R. Vijay et al., Nature 490, 77 (2012), and references therein.
26 See, e.g., J. Wheeler and W. Zurek (eds.), Quantum Theory and Measurement, Princeton U. Press, 1983.
27 As was discussed in Sec. 1, the Stern-Gerlach-type experiments may be readily made virtually perfect, provided
that we do not care about the evolution of the system after the single-shot measurement.
system such an answer would be certainly correct, and the fact that the probability $W_\uparrow = |\alpha_\uparrow|^2$, defined for
the statistical ensemble of all experiments (regardless of their outcome), may be less than 1, would
merely reflect our ignorance about the real state of this particular system before the measurement –
which just reveals the real situation.
However, as was first argued in the famous EPR paper published in 1935 by A. Einstein, B.
Podolsky, and N. Rosen, such an answer becomes impossible in the case of an entangled quantum
system, if only one of its components is measured with an instrument. The original EPR paper discussed
thought experiments with a pair of 1D particles prepared in a quantum state in which both the sum of their
momenta and the difference of their coordinates simultaneously have definite values: p1 + p2 = 0, x1 – x2
= a.28 However, usually this discussion is recast into an equivalent Stern-Gerlach experiment shown in
Fig. 4a.29 A source emits rare pairs of spin-½ particles, propagating in opposite directions. The particle
spin states are random, but with the net spin of the pair definitely equal to zero. After the spatial
separation of the particles has become sufficiently large (see below), the spin state of each of them is
measured with a Stern-Gerlach detector, with one of them (in Fig. 4a, SG1) somewhat closer to the
particle source, so it makes the measurement first, at a time t1 < t2.
Fig. 10.4. (a) General scheme of two-particle Stern-Gerlach experiments, and (b) the orientation of the
detectors, assumed at Wigner’s derivation of Bell’s inequality (36).
First, let the detectors be oriented along the same direction, say the z-axis. Evidently, the
probability of each detector to give any of the values $S_z = \pm\hbar/2$ is 50%. However, if the first detector had
given the result $S_z = -\hbar/2$, then even before the second detector’s measurement, we know that the latter
will give the result $S_z = +\hbar/2$ with the 100% probability. So far, this situation still allows for a classical
interpretation, just as for the single-particle measurements: we may fancy that the second particle has a
definite spin before the measurement, and the first measurement just removes our ignorance about that
reality. In other words, the change of the probability of the outcome $S_z = +\hbar/2$ at the second detection
from 50% to 100% is due to the statistical ensemble re-definition: the 50% probability of this detection
belongs to the ensemble of all experiments, while the 100% probability, to the sub-ensemble of
experiments with the $S_z = -\hbar/2$ outcome of the first experiment.
However, let the source generate the spin pairs in the entangled, singlet state (8.18),
$$|s_{12}\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right), \tag{10.24}$$
28 This is possible because the corresponding operators commute: $[\hat{p}_1 + \hat{p}_2, \hat{x}_1 - \hat{x}_2] = [\hat{p}_1, \hat{x}_1] - [\hat{p}_2, \hat{x}_2] = 0$.

29 Another equivalent but experimentally more convenient (and as a result, frequently used) technique is the
degenerate parametric excitation of entangled optical photon pairs – see, e.g., the publications cited at the end of
this section.
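that certainly satisfies the above assumptions: the probability of each value of $S_z$ of any particle is 50%,
and the sum of both $S_z$ is definitely zero, so that if the first detector’s result is $S_z = -\hbar/2$, then the state of
the remaining particle is $\uparrow$, with zero uncertainty. Now let us use Eqs. (4.123) to represent the same state
(24) in a different form:
$$|s_{12}\rangle = \frac{1}{\sqrt{2}}\left[\frac{1}{\sqrt{2}}\left(|\rightarrow\rangle + |\leftarrow\rangle\right)\frac{1}{\sqrt{2}}\left(|\rightarrow\rangle - |\leftarrow\rangle\right) - \frac{1}{\sqrt{2}}\left(|\rightarrow\rangle - |\leftarrow\rangle\right)\frac{1}{\sqrt{2}}\left(|\rightarrow\rangle + |\leftarrow\rangle\right)\right]. \tag{10.25}$$
Opening the parentheses (carefully, without swapping the ket-vector order, which encodes the particle
numbers!), we get an expression similar to Eq. (24), but now for the x-basis:
$$|s_{12}\rangle = \frac{1}{\sqrt{2}}\left(|\leftarrow\rightarrow\rangle - |\rightarrow\leftarrow\rangle\right). \tag{10.26}$$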
Hence if we use the first detector (closest to the particle source) to measure $S_x$ rather than $S_z$, then after it
had given a certain result (say, $S_x = -\hbar/2$), we know for sure, before the second particle spin’s
measurement, that its $S_x$ component definitely equals $+\hbar/2$.
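The basis change leading from Eq. (24) to Eq. (26) is simple enough to verify by brute force. The sketch below represents the two-particle kets as 4-component lists of real amplitudes and assumes the usual convention for the $S_x$ eigenstates, $|\rightarrow\rangle = (|\uparrow\rangle + |\downarrow\rangle)/\sqrt{2}$ and $|\leftarrow\rangle = (|\uparrow\rangle - |\downarrow\rangle)/\sqrt{2}$. With another phase convention, the x-basis expression could differ by an inessential overall factor.

```python
import math

S = 1 / math.sqrt(2)
UP, DOWN = [1.0, 0.0], [0.0, 1.0]     # S_z eigenstates
RIGHT, LEFT = [S, S], [S, -S]         # S_x eigenstates (+hbar/2 and -hbar/2), assumed convention

def kron(u, v):
    """Two-particle product state; the first factor describes particle 1."""
    return [a * b for a in u for b in v]

def combine(u, v, cu, cv):
    """Linear superposition cu * u + cv * v."""
    return [cu * a + cv * b for a, b in zip(u, v)]

def overlap(u, v):
    return sum(a * b for a, b in zip(u, v))   # all amplitudes here are real

singlet_z = combine(kron(UP, DOWN), kron(DOWN, UP), S, -S)        # Eq. (10.24)
singlet_x = combine(kron(LEFT, RIGHT), kron(RIGHT, LEFT), S, -S)  # same structure in the x-basis

def joint_probability(state1, state2, pair_state):
    """Probability to find particle 1 in state1 and particle 2 in state2."""
    return overlap(kron(state1, state2), pair_state) ** 2

p_left_right = joint_probability(LEFT, RIGHT, singlet_z)
p_left_left = joint_probability(LEFT, LEFT, singlet_z)
print(singlet_z, singlet_x, p_left_right, p_left_left)
```

The two amplitude lists coincide, so the singlet has the same form in the z- and x-bases. The joint probabilities confirm the perfect anticorrelation of the $S_x$ outcomes: opposite results appear with probability 1/2 each, and equal results never.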
So, depending on the experiment performed on the first particle, the second particle, before its
measurement, may be in one of two states – either with a definite component Sz or with a definite
component Sx, in each case with zero uncertainty. Evidently, this situation cannot be interpreted in
classical terms if the particles do not interact during the measurements. A. Einstein was deeply unhappy
with such situation because it did not satisfy what, in his view, was the general requirement to any
theory, which nowadays is called the local reality. His definition of this requirement was as follows:
“The real factual situation of system 2 is independent of what is done with system 1 that is spatially
separated from the former”. (Here the term “spatially separated” is not defined, but from the context, it
is clear that Einstein meant the detector separation by a superluminal interval, i.e. by distance
$$|\mathbf{r}_1 - \mathbf{r}_2| > c\,|t_1 - t_2|, \tag{10.27}$$
where the measurement time difference on the right-hand side includes the measurement duration.) In
Einstein’s view, since quantum mechanics did not satisfy the local reality condition, it could not be
considered a complete theory of Nature.
This situation naturally raises the question of whether something (usually called hidden
variables) may be added to the quantum-mechanical description to enable it to satisfy the local reality
requirement. The first definite statement in this regard was John von Neumann’s “proof”30 (first famous,
then infamous :-) that such variables cannot be introduced; for a while, his work satisfied the quantum
mechanics practitioners, who apparently did not pay much attention.31 A major new contribution to the
problem was made only in the 1960s by J. Bell.32 First of all, he has found an elementary (in his words,
“foolish”) error in von Neumann’s logic, which voids his “proof”. Second, he has demonstrated that
Einstein’s local reality condition is incompatible with conclusions of quantum mechanics – that had
been, by that time, confirmed by too many experiments to be seriously questioned.
30 In his very early book J. von Neumann, Mathematische Grundlagen der Quantenmechanik [Mathematical
Foundations of Quantum Mechanics], Springer, 1932. (The first English translation was published only in 1955.)
31 Perhaps it would not satisfy A. Einstein, but reportedly he did not know about the von Neumann’s publication
before signing the EPR paper.
32 See, e. g., either J. Bell, Rev. Mod. Phys. 38, 447 (1966) or J. Bell, Foundations of Physics 12, 158 (1982).
Let me describe a particular version of Bell’s result (suggested by E. Wigner), using the same
EPR pair experiment (Fig. 4a), in which each SG detector may be oriented in any of 3 directions: a, b, or c
– see Fig. 4b. As we already know from Chapter 4, if a fully-polarized beam of spin-½ particles is
passed through a Stern-Gerlach apparatus forming angle $\varphi$ with the polarization axis, the probabilities of
two alternative outcomes of the experiment are
$$W(\varphi_+) = \cos^2\frac{\varphi}{2}, \qquad W(\varphi_-) = \sin^2\frac{\varphi}{2}. \tag{10.28}$$
Let us use this formula to calculate all joint probabilities of measurement outcomes, starting from the
detectors 1 and 2 oriented, respectively, in the directions a and c. Since the angle between the negative
direction of the a-axis and the positive direction of the c-axis is $\varphi_{a-,c+} = \pi - \varphi$ (see the dashed arrow in
Fig. 4b), we get
$$W(a_+ \wedge c_+) = W(a_+)\,W(c_+ \mid a_+) = W(a_+)\,W(\varphi_{a-,c+}) = \frac{1}{2}\cos^2\frac{\pi - \varphi}{2} = \frac{1}{2}\sin^2\frac{\varphi}{2}, \tag{10.29}$$
where $W(x \wedge y)$ is the joint probability of both outcomes x and y, while $W(x \mid y)$ is the conditional
probability of the outcome x, provided that the outcome y has happened. (The first equality in Eq. (29) is
the well-known identity of the probability theory.) Absolutely similarly,
$$W(c_+ \wedge b_+) = W(c_+)\,W(b_+ \mid c_+) = \frac{1}{2}\sin^2\frac{\varphi}{2}, \tag{10.30}$$
$$W(a_+ \wedge b_+) = W(a_+)\,W(b_+ \mid a_+) = \frac{1}{2}\cos^2\frac{\pi - 2\varphi}{2} = \frac{1}{2}\sin^2\varphi. \tag{10.31}$$
Now note that for any angle $\varphi$ smaller than $\pi/2$ (as in the case shown in Fig. 4b), trigonometry gives
$$\frac{1}{2}\sin^2\varphi \ge \frac{1}{2}\sin^2\frac{\varphi}{2} + \frac{1}{2}\sin^2\frac{\varphi}{2} \equiv \sin^2\frac{\varphi}{2}. \tag{10.32}$$
(For example, for $\varphi \to 0$ the left-hand side of this inequality tends to $\varphi^2/2$, while the right-hand side, to
$\varphi^2/4$.) Hence the quantum-mechanical result gives, in particular,
Quantum-mechanical result:
$$W(a_+ \wedge b_+) \ge W(a_+ \wedge c_+) + W(c_+ \wedge b_+), \quad \text{for } |\varphi| \le \pi/2. \tag{10.33}$$
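The probabilities (29)-(31) may also be obtained directly from the singlet state (24), without the conditional-probability argument. The sketch below does this for detector axes a, c, and b assumed to lie in one plane at angles 0, $\varphi$, and $2\varphi$ to the z-axis (the geometry implied by Eqs. (29)-(31)), using the standard spin-up state $\cos(\theta/2)|\uparrow\rangle + \sin(\theta/2)|\downarrow\rangle$ for an axis at angle $\theta$. The function names are illustrative.

```python
import math

def spin_up_along(theta):
    """z-basis amplitudes of the spin-up state along an axis at angle theta to the z-axis."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def kron(u, v):
    """Two-particle product state; the first factor describes particle 1."""
    return [a * b for a in u for b in v]

SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]   # Eq. (10.24) in the z-basis

def joint_plus_plus(theta1, theta2):
    """Probability that detectors 1 and 2, at angles theta1 and theta2, both give '+'."""
    product = kron(spin_up_along(theta1), spin_up_along(theta2))
    amplitude = sum(a * b for a, b in zip(product, SINGLET))
    return amplitude ** 2

def bell_terms(phi):
    """Return W(a+ & b+), W(a+ & c+), W(c+ & b+) for axes a, c, b at angles 0, phi, 2*phi."""
    return (joint_plus_plus(0.0, 2 * phi),
            joint_plus_plus(0.0, phi),
            joint_plus_plus(phi, 2 * phi))

w_ab, w_ac, w_cb = bell_terms(math.pi / 3)
print(w_ab, w_ac, w_cb)

# Left-hand side minus right-hand side of Eq. (10.33), for phi strictly between 0 and pi/2
margins = []
for k in range(1, 20):
    lhs, term1, term2 = bell_terms(k * math.pi / 40)
    margins.append(lhs - (term1 + term2))
```

For $\varphi = \pi/3$, this gives $W(a_+ \wedge b_+) = 3/8$ against $W(a_+ \wedge c_+) + W(c_+ \wedge b_+) = 1/4$, in agreement with Eqs. (29)-(31), and the margin stays positive over the whole sampled range of angles.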
On the other hand, we can get a different inequality for these probabilities without calculating
them from any particular theory, but using the local reality assumption. For that, let us prescribe some
probability to each of $2^3 = 8$ possible outcomes of a set of three spin measurements. (Due to zero net
spin of particle pairs, the probabilities of the sets shown in both columns of the table have to be equal.)

Detector 1        Detector 2        Probability    Contributes to
a+ ∧ b+ ∧ c+      a- ∧ b- ∧ c-      W1
a+ ∧ b+ ∧ c-      a- ∧ b- ∧ c+      W2             W(a+ ∧ c+)
a+ ∧ b- ∧ c+      a- ∧ b+ ∧ c-      W3             W(a+ ∧ b+), W(c+ ∧ b+)
a+ ∧ b- ∧ c-      a- ∧ b+ ∧ c+      W4             W(a+ ∧ c+), W(a+ ∧ b+)
a- ∧ b+ ∧ c+      a+ ∧ b- ∧ c-      W5
a- ∧ b+ ∧ c-      a+ ∧ b- ∧ c+      W6
a- ∧ b- ∧ c+      a+ ∧ b+ ∧ c-      W7             W(c+ ∧ b+)
a- ∧ b- ∧ c-      a+ ∧ b+ ∧ c+      W8

From the local-reality point of view, these measurement options are independent, so we may
write (see the last column of the table):
$$W(a_+ \wedge c_+) = W_2 + W_4, \qquad W(c_+ \wedge b_+) = W_3 + W_7, \qquad W(a_+ \wedge b_+) = W_3 + W_4. \tag{10.34}$$
On the other hand, since no probability may be negative (by its very definition), we may always write
$$W_3 + W_4 \le (W_2 + W_4) + (W_3 + W_7). \tag{10.35}$$
Plugging into this inequality the values of these two parentheses, given by Eq. (34), we get
Bell’s inequality (local-reality theory):
$$W(a_+ \wedge b_+) \le W(a_+ \wedge c_+) + W(c_+ \wedge b_+). \tag{10.36}$$
This is Bell’s inequality, which has to be satisfied by any local-reality theory; it directly
contradicts the quantum-mechanical result (33) – opening the issue to direct experimental testing. Such
tests were started in the late 1960s, but the first results were vulnerable to two criticisms:
(i) The detectors were not fast enough and not far enough to have the relation (27) satisfied. This
is why, as a matter of principle, there was a chance that information on the first measurement outcome
had been transferred (by some, mostly implausible, means) to particles before the second measurement –
the so-called locality loophole.
(ii) The particle/photon detection efficiencies were too low to have sufficiently small error bars
for both parts of the inequality – the detection loophole.
Gradually, these loopholes have been closed.33 As expected, substantial violations of the Bell
inequalities (36) (or their equivalent forms) have been proved, essentially rejecting any possibility to
reconcile quantum mechanics with Einstein’s local reality requirement.
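The logic leading to Eq. (36) may be illustrated by a simple numerical experiment: whatever non-negative probabilities $W_1, \dots, W_8$ are prescribed to the eight rows of the table, the sums (34) obey Bell’s inequality. The Python sketch below samples such probability sets at random. The sampling procedure, the seed, and all names are illustrative assumptions, and such sampling is of course a sanity check rather than a proof.

```python
import random

# Rows of the table: the outcomes (a, b, c) that detector 1 would give;
# due to the zero net spin, detector 2 gives the opposite outcome along each axis.
ROWS = [(+1, +1, +1), (+1, +1, -1), (+1, -1, +1), (+1, -1, -1),
        (-1, +1, +1), (-1, +1, -1), (-1, -1, +1), (-1, -1, -1)]
AXIS = {"a": 0, "b": 1, "c": 2}

def joint_plus_plus(weights, axis1, axis2):
    """W(axis1+ & axis2+): detector 1 gives '+' along axis1, detector 2 gives '+' along axis2."""
    return sum(w for w, row in zip(weights, ROWS)
               if row[AXIS[axis1]] == +1 and row[AXIS[axis2]] == -1)

def random_weights(rng):
    """A random set of eight non-negative probabilities W1..W8 with unit sum."""
    raw = [rng.random() for _ in ROWS]
    total = sum(raw)
    return [x / total for x in raw]

rng = random.Random(0)        # fixed seed, for reproducibility only
violations = 0
for _ in range(10_000):
    w = random_weights(rng)
    lhs = joint_plus_plus(w, "a", "b")                                # W3 + W4
    rhs = joint_plus_plus(w, "a", "c") + joint_plus_plus(w, "c", "b") # (W2 + W4) + (W3 + W7)
    if lhs > rhs + 1e-15:     # small tolerance for floating-point rounding
        violations += 1

print("violations of Eq. (10.36):", violations)
```

No sampled set violates the inequality. By contrast, the quantum-mechanical values following from Eqs. (29)-(31) at, for example, $\varphi = \pi/3$ are 3/8 on the left-hand side against 1/8 + 1/8 on the right-hand side, so no assignment of the probabilities $W_1, \dots, W_8$ can reproduce them.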
10.4. Interpretations of quantum mechanics

The fact that quantum mechanics is incompatible with local reality makes its reconciliation with
our (classically-bred) “common sense” rather challenging. Here is a brief list of the major interpretations
of quantum mechanics that try to provide at least a partial reconciliation of this kind.
(i) The so-called Copenhagen interpretation – to which most physicists adhere. This
“interpretation” does not really interpret anything; it just accepts the intrinsic stochasticity of
measurement results in quantum mechanics, and the absence of local reality, essentially saying: “Do not
worry; this is just how it is; live with it”. I generally subscribe to this school of thought, with the
following qualification. While the Copenhagen interpretation implies statistical ensembles (otherwise,
how would you define the probability? – see Sec. 1.3), its most frequently stated formulations34 do not
put a sufficient emphasis on their role, in particular on the ensemble re-definition as the only point of
human observer’s involvement in a nearly-perfect measurement process – see Sec.1 above. The most
33 Important milestones in that way were the experiments by A. Aspect et al., Phys. Rev. Lett. 49, 91 (1982) and
M. Rowe et al., Nature 409, 791 (2001). Detailed reviews of the experimental situation were given, for example,
by M. Genovese, Phys. Repts. 413, 319 (2005) and A. Aspect, Physics 8, 123 (2015); see also the later paper by J.
Handsteiner et al., Phys. Rev. Lett. 118, 060401 (2017). Presently, a high-fidelity demonstration of the Bell
inequality violation has become a standard test in virtually every experiment with entangled qubits used for
quantum encryption research – see Sec. 8.5, in particular the paper by J. Lin cited there.
34 With certain pleasant exceptions – see, e.g. L. Ballentine, Rev. Mod. Phys. 42, 358 (1970).
famous objection to the Copenhagen interpretation belongs to A. Einstein: “God does not play dice.”
OK, when Einstein speaks, we all should listen, but perhaps when God speaks (through experimental
results), we have to pay even more attention.
(ii) Non-local reality. After the dismissal of J. von Neumann’s “proof” by J. Bell, to the best of
my knowledge, there has been no proof that hidden parameters could not be introduced, provided that
they do not imply the local reality. Of constructive approaches, perhaps the most notable contribution
was made by David Joseph Bohm,35 who developed the initial Louis de Broglie’s interpretation of the
wavefunction as a “pilot wave”, making it quantitative. In the wave-mechanics version of this concept,
the wavefunction governed by the Schrödinger equation, just guides a “real”, point-like classical particle
whose coordinates serve as hidden variables. However, this concept does not satisfy the notion of local
reality. For example, the measurement of the particle’s coordinate at a certain point r1 has to instantly
change the wavefunction everywhere in space, including the points r2 in the superluminal range (27).
After A. Einstein’s private criticism, D. Bohm essentially abandoned his theory.36
(iii) The many-worlds interpretation, introduced in 1957 by Hugh Everett and popularized in the
1960s and 1970s by Bryce DeWitt. In this interpretation, all possible measurement outcomes do happen,
splitting the Universe into the corresponding number of “parallel multiverses”, so that from one of them,
other multiverses and hence other outcomes cannot be observed. Let me leave to the reader an estimate
of the rate at which the parallel multiverses have to be constantly generated (say, per second), taking
into account that such generation should take place not only at explicit lab experiments but at every
irreversible process – such as fission of every atomic nucleus or an absorption/emission of every photon,
everywhere in each multiverse – whether its result is formally recorded or not. Nicolaas van Kampen
has called this a “mind-boggling fantasy”.37 Even the main proponent of this interpretation, B. DeWitt,
has confessed: “The idea is not easy to reconcile with common sense.” I agree.
(iv) Quantum logic. In desperation, some physicists turned philosophers have decided to dismiss
the formal logic we are using – in science and elsewhere. From what (admittedly, very little) I have read
about this school of thought, it seems that from its point of view, definite statements like “the SG
detector has found the spin to be directed along the magnetic field” should not necessarily be either true
or false. OK, if we dismiss the formal logic, I do not know how we can use any scientific theory to make
any predictions – until the quantum logic experts tell us what to replace it with. To the best of my
knowledge, so far they have not done that. I personally trust the opinion by J. Bell, who certainly gave
more thought to these issues: “It is my impression that the whole vast subject of Quantum Logic has
arisen […] from the misuse of a word.”
As far as I know, none of these interpretations has yet provided a suggestion on how it might
be tested experimentally to exclude other ones. On the positive side, there is a virtual consensus that
quantum mechanics makes correct (if sometimes probabilistic) predictions, which do not contradict any
reliable experimental results we are aware of. Maybe, this is not that bad for a scientific theory.38
35 D. Bohm, Phys. Rev. 85, 165; 180 (1952).

36 See, e.g., Sec. 22.19 of his (generally very good) textbook D. Bohm, Quantum Theory, Dover, 1979.
37 N. van Kampen, Physica A 153, 97 (1988). By the way, I highly recommend the very reasonable summary of
the quantum measurement issues, given in this paper, though believe that the quantitative theory of dephasing,
discussed in Chapter 7 of this course, might give additional clarity to some of van Kampen’s statements.
38 For the reader who is not satisfied with this “positivistic” approach, and wants to improve the situation, my
earnest advice is to start not from square one, but from reading what other (including some very clever!) people
thought about it. The review collection by J. Wheeler and W. Zurek, cited above, may be a good starting point.


while in a 1D system of length l >> 1/k,

1.9. Exercise problems

64 To be discussed in Sec. 3.2.

1.5.* Prove the so-called Hellmann-Feynman theorem:67

1.11. At t = 0, a 1D particle of mass m is placed into a hard-wall, flat-bottom potential well

Chapter 2. 1D Wave Mechanics

   Ψ Ψ  ΨΨ   Ψ  2 Ψ  Ψ 2 Ψ , (1.45)