The Conceptual Framework of Quantum Field Theory
Anthony Duncan
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Anthony Duncan 2012
The moral rights of the author have been asserted
First Edition published in 2012
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging in Publication Data
Data available
ISBN 978–0–19–957326–4
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
Preface
In the roughly six decades since modern quantum field theory came of age with the
introduction in the late 1940s of covariant field theory, supplemented by renormaliza-
tion ideas, there has been a steady stream of expository texts aimed at introducing
each new generation of physicists to the concepts and techniques of this central area of
modern theoretical physics. Each decade has produced one or more “classics”, attuned
to the background, needs, and interests of students wishing to acquire a proficiency
in the subject adequate for the beginning researcher at the time. In the 1950s the
seminal text of Jauch and Rohrlich, Theory of Photons and Electrons, provided the
first systematic textbook treatment of the Feynman diagram technique for quantum
electrodynamics, while more or less simultaneously the first field-theoretic attacks on
the strong interactions were presented in the two-volume Mesons and Fields of Bethe,
de Hoffmann, and Schweber. The 1960s saw the appearance of the massive treatise
by Schweber, Introduction to Relativistic Quantum Field Theory, which addressed
in much greater detail formal aspects of the theory, including the LSZ asymptotic
formalism and the Wightman axiomatic approach. The dominant text of the late 1960s
was undoubtedly the two-volume text of Bjorken and Drell, Relativistic Quantum
Mechanics and Relativistic Quantum Fields, which combined a thorough introduction
to Feynman graph technology (in volume 1) with a more formal introduction to
Lagrangian field theory (in volume 2). In the 1970s the emergence of non-abelian
gauge theories as the overwhelmingly favored candidates for a successful field-theoretic
description of weak and strong interactions coincided with the emergence of functional
(path-integral) methods as the appropriate technical tool for quantization of gauge
theories. In due course, these methods received full treatment with the appearance
in 1980 of Itzykson and Zuber’s encyclopedic Quantum Field Theory. In a similar
way, the surge to prominence of supersymmetric field theories throughout the 1980s
necessitated a full account of supersymmetry, which is the sole subject of the third
volume of Weinberg’s comprehensive three-volume The Quantum Theory of Fields,
the first edition of which appeared in 1995.
With such a selection of classic expository treatises (not to mention many other
fine texts not listed above—with apologies to authors of same!) one may well doubt
the need for yet another introductory treatment of quantum field theory. Nevertheless,
in the course of teaching the subject to graduate students (typically, second year) over
the last 25 years, I have been struck by the number of occasions on which important
conceptual issues are raised by questions in the classroom which require a careful
explanation not to be found in any of the readily available textbooks on quantum
field theory. To give just a small sample of the sort of questions one encounters in the
classroom setting: “Of the plethora of quantum fields introduced to describe Nature at
subatomic scales, why do so few (basically, only electromagnetism and gravity) have
classical macroscopic correlates?”; “If there are many possible quantum fields available
to ‘represent’ a given particle, can, or in what sense does, quantum field theory
prescribe a unique all-time dynamics?”; “If the interaction picture does not exist,
as implied by Haag’s theorem, why (or in what sense) are the formulas derived in
this picture for the S-matrix still valid?”; “Are there non-perturbative phenomena
amenable to treatment using perturbative (i.e., graph-theoretical) methods?”; and so
on. None of these questions require an answer if one’s attitude in learning quantum
field theory amounts to a purely pragmatic desire to “start with a Lagrangian and
compute a process to two loops”. However, if the aim is to arrive at a truly deep
and satisfying comprehension of the most powerful, beautiful, and effective theoretical
edifice ever constructed in the physical sciences, the pedagogical approach taken by
the instructor has to be quite a bit different from that adopted in the “classics”
enumerated above.
In the present work, an attempt is made to provide an introduction to quantum
field theory emphasizing conceptual issues frequently neglected in more “utilitarian”
treatments of the subject. The book is divided into four parts, entitled respec-
tively, “Origins”, “Dynamics”, “Symmetries”, and “Scales”. Although the emphasis is
conceptual—the aim is to build the theory up systematically from some clearly stated
foundational concepts—and therefore to a large extent anti-historical, I have included
two historical chapters in the “Origins” section which trace the evolution of the modern
theory from the earliest “penumbra” of quantum-field-theoretical phenomena detected
by Planck and Einstein in the early years of the twentieth century to the emergence,
in the late 1940s, of the recognizable structure of modern quantum field theory, in the
form of quantum electrodynamics. The reader anxious to proceed with the business
of logically developing the framework of modern field theory is at liberty to skim, or
even entirely omit, this historical introduction.
The three remaining sections of the book follow a step-by-step reconstruction of
this framework beginning with just a few basic assumptions: relativistic invariance,
the basic principles of quantum mechanics, and the prohibition of physical action at a
distance embodied in the clustering principle. The way in which these physical ingre-
dients combine to engender some of the most dramatic results of relativistic quantum
field theory is outlined qualitatively in Chapter 3, which also contains a summary of
the topics treated in later chapters. Subsequent chapters in the “Dynamics” section
of the book lay out the basic structure of quantum field theory arising from the
sequential insertion of quantum-mechanical, relativistic, and locality constraints. The
rather extended treatment of free fields allows us to discuss important conceptual
issues (e.g., the classical limit of field theory) in greater depth than usually found in
the standard texts. Some applications of perturbation theory to some simple theories
and processes are discussed in Chapter 7, after the construction of covariant fields for
general spin has been explained. A deeper discussion of interacting field theories is
initiated in Chapters 9, 10, and 11, where we treat first general features shared by all
interacting theories (Chapter 9) and then aspects amenable to formal perturbation
expansions (Chapter 10). The “Dynamics” section concludes with a discussion of
“non-perturbative” aspects of field theory—a rather imprecise methodological term
encompassing a wide variety of very different physical processes. In Chapter 11 we
attempt to clarify the extent to which certain features of field theory are “intrinsically”
To the beginning student, quantum field theory all too often takes on the appear-
ance of a multi-headed Hydra, with many intertwined parts, the understanding of any
one of which seems to require a prior understanding of the rest of the frightening
anatomy of the whole beast. The motivation for the present work was the author’s
desire to provide an introduction to modern quantum field theory in which this rich
and complex structure is seen to arise naturally from a few basic conceptual inputs, in
contrast to the more typical approach in which Lagrangian field theory is presented as
a theoretical fait accompli and then subsequently shown to have the desired physical
features.
Much (perhaps most) of the attitude towards quantum field theory expressed in this
book is the result of innumerable conversations, over four decades, with colleagues and
students. For laying the foundations of my knowledge of field theory I wish especially
to thank my predoctoral and post-doctoral mentors (Steven Weinberg and Al Mueller,
respectively). In the case of the present work I am extremely grateful to Estia Eichten,
Michel Janssen, Adam Leibovich, Max Niedermaier, Sergio Pernice, and Ralph Roskies
for reading extensive parts of the manuscript, and for many useful comments and
suggestions. Any remaining solecisms of style or content are, of course, entirely the
responsibility of the author.
Contents
of the final quarter-century of the 1800s. The three great edifices of classical physics—
Newtonian mechanics (amplified and deepened, of course, by the contributions of
Laplace, Lagrange, Hamilton, and many others), electromagnetic theory, only recently
completed by Maxwell (in Philosophical Transactions, vol. 155, 1865), and, somewhat
later, put in a form recognizable to the modern student of the subject by Hertz,
and thermodynamics, which reached essential conceptual completeness at the hands
of Clausius, also around 1865—stood as precise descriptions of natural phenomena,
each apparently unassailable in its natural domain of applicability. In a sense, further
progress keeping strictly within the limits of each of these disciplines had become
difficult or impossible. However, precisely at this time, natural phenomena requiring
the simultaneous application of more than one of these formal structures began to
demand the attention of physicists. It is possible to trace the origins of the core
disciplines of twentieth-century physics—quantum theory, statistical mechanics, and
relativity—to developments at the interfaces of the three basic classical frameworks.
We can summarize these developments very briefly as follows:
In our outline of the conceptual origins of quantum mechanics and quantum field
theory, the first item above holds pride of place: firstly, because by common consent
quantum theory begins with Planck’s introduction of a new universal constant of
Nature in his blackbody distribution formula of 1900, and secondly, because, as we shall
see later, the first explicitly quantum field-theoretic calculation, Jordan’s derivation of
the mean-square energy fluctuations in a subvolume of a cavity containing (a one-
dimensional version of) electromagnetic radiation in the final section of the Drei-
Männer-Arbeit (1925) of Born, Heisenberg, and Jordan (Born et al., 1926), came
directly out of an attempt to reproduce a remarkable result of Einstein dating from
1909 (Einstein, 1909b,a) in which the apparent paradox of simultaneous wave and
particle behavior of light was first exposed with full clarity. The essence of quantum
field theory is to provide a unified dynamical framework in which these apparently dis-
parate behaviors can coexist in a conceptually consistent fashion. The electromagnetic
radiation contained in a cavity at thermal equilibrium therefore plays a central role
in the conceptual origins both of quantum theory generally and quantum field theory
in particular. For this reason we shall retell in this chapter the history of thermal
radiation in some detail, paying particular attention to those aspects important for
understanding the conceptual origins of quantum field theory. The story will lead
us continuously from the early arguments surrounding the role of the Second Law
of Thermodynamics (and the “arrow of time” it implies) in blackbody radiation, to
the appearance of the first truly quantum-field-theoretical analysis of electromagnetic
radiation. We begin in the next section by describing some important milestones on
the way to the understanding of blackbody radiation as it stood in 1900 when Planck
took the first steps along the road to modern quantum theory.
of fire!). The precise nature of thermal radiation became the subject of intense study
in the nineteenth century, and as we shall see, led directly to the discovery of the
quantum principle. Wedgwood, the porcelain manufacturer, observed in the 1790s
that heated bodies all became red at the same temperature. The Scottish physicist
Balfour Stewart noted (in 1858) that a block of rock salt at 100 °C strongly absorbs
the radiation emitted by a similar block at the same temperature, and suggested the
rule of equality of radiating and absorbing power of bodies for rays of any given type
(i.e., wavelength). About a year later (independently of Stewart) Kirchhoff put all
of this phenomenology into a comprehensible framework by the use of very general
thermodynamic arguments (Kirchhoff, 1859, 1860).
The arguments given by Kirchhoff established the universal character of blackbody,
or “cavity”, radiation—the radiation filling the interior of a hollow material cavity, the
walls of which are maintained at a fixed temperature T . It is important to understand
at the outset the role of the cavity in these arguments. After heating the cavity to
the desired temperature T , a fixed amount of radiant energy fills the interior. To
examine the nature of this radiation we are at liberty to drill a very small hole in
the walls, as the small amount of radiant energy emerging will not sensibly disturb
the established equilibrium. Note that from the point of view of the external world,
the punctured cavity is essentially “black”, in the sense that any radiation entering
through the pinhole will have to scatter around in the interior of the cavity for a
very long time before having an opportunity to escape. Such a cavity is therefore
(effectively) a perfect absorber, or “black body”.1 The essence of the problem is that
radiation in the cavity is forced to interact with the walls (or contents, if any) of the
cavity until thermal equilibrium is reached.
Kirchhoff showed that the energy (per unit volume, and per wavelength interval) of
the cavity radiation was uniform and isotropic throughout the interior. The arguments
he gave were subsequently simplified by Pringsheim. The latter observed that by
inserting a reflecting plane surface at some point inside the cavity, the equality of
radiation in opposite directions follows, as otherwise the unequal radiation pressure
exerted on the two faces would allow the spontaneous conversion of heat to work,
thereby violating the Second Law of Thermodynamics. The existence of a pressure
exerted by radiation reflected from a surface is crucial to these arguments: the precise
relation (to be discussed further below) between this pressure and the energy density of
the radiation had been derived earlier by Maxwell (Maxwell, 1873). Similar arguments
employing two mirrors can be used to prove the isotropy and homogeneity of the
radiation in the cavity. The existence of filters selectively absorbing and transmitting
radiation of different wavelengths means that these statements hold separately within
each interval of wavelength. Thus, we can define a function φ(λ, T ) such that φ(λ, T )Δλ
is the radiant energy per unit volume between wavelengths λ, λ + Δλ anywhere in the
cavity.
1 The use of the term “blackbody”, though historically predominant, frequently confuses the beginning
student, who wonders quite naturally how a truly black body can radiate! In the forthcoming discussion we
prefer the use of the term “cavity (or thermal) radiation”. The German terminology, “Normal Spektrum”,
is not particularly illuminating either.
Early work on cavity radiation 5
Next, one easily sees that φ(λ, T ) is the same (universal!) function for any cavity,
irrespective of size, shape, or material constitution. This can be established by
connecting two cavities at the same temperature by a thin tube allowing radiation
to pass in either direction. Unequal radiation densities would result in unequal fluxes
down the tube, and hence in a spontaneous flow of heat between two systems at
the same temperature. Again, by placing filters which pass only a limited range of
wavelengths in the tube, this equality is found to hold in each wavelength interval
λ, λ + Δλ.
The universal function φ(λ, T ) appears in the description of radiation emitted from
any hot surface in the following way. At thermal equilibrium the radiation impinging
on the surface must be balanced by the total radiation leaving, as otherwise the surface
would grow progressively cooler or hotter. Suppose the surface (say, the interior wall
of our cavity discussed above) absorbs a fraction Aλ of the incident radiation flux eλ
(at wavelength λ). It is clear that the radiation emitted (as opposed to reflected), or
the “emissive power” of the surface Eλ must just equal Aλ eλ , or equivalently, the ratio
Eλ /Aλ of emissive power to absorption coefficient is a universal function (essentially
our old friend φ) of wavelength and temperature. This result, of course, lies at the
core of Stewart’s observations mentioned above. It also explains why thermos flasks
are silvered (making Aλ small) in order to get them to radiate less.
The total energy density of cavity radiation (at all wavelengths) is evidently just
the integral of Kirchhoff’s function φ:
$$\rho(T) = \int_0^\infty \phi(\lambda, T)\,d\lambda \tag{1.1}$$
In 1879 Josef Stefan proposed on the basis of some preliminary experiments the
form
$$\rho(T) = aT^4 \tag{1.2}$$
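$$\text{đ}Q = dU + \text{đ}W = d\!\left(\frac{4}{3}\pi r^3\rho\right) + \frac{1}{3}\,4\pi r^2\rho\,dr \tag{1.3}$$
$$dS = \frac{\text{đ}Q}{T} = \frac{4}{3}\pi r^3\,\frac{d\rho}{T} + \frac{16}{3}\pi r^2\rho\,\frac{dr}{T} = \frac{4}{3}\pi r^3\,\frac{d\rho}{dT}\,\frac{dT}{T} + \frac{16}{3}\pi r^2\rho\,\frac{dr}{T} \tag{1.4}$$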
2 We remind the reader that U and S are state functions, unlike work W and heat Q: only the changes
in the latter are meaningful, whence the difference in notation in the associated differentials đW, đQ.
6 Origins I: From the arrow of time to the first quantum field
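Equating $\frac{\partial}{\partial T}\frac{\partial S}{\partial r}$ to $\frac{\partial}{\partial r}\frac{\partial S}{\partial T}$ we find
$$\frac{d\rho}{dT} = \frac{4}{T}\,\rho \tag{1.5}$$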
which immediately implies the $T^4$ law stated above, and now known as the Stefan–Boltzmann Law.
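For readers who want the integration step spelled out, Eq. (1.5) separates directly. The following is a short worked sketch; the integration constant is written as a only to match the notation of Eq. (1.2):

```latex
\frac{d\rho}{\rho} = 4\,\frac{dT}{T}
\quad\Longrightarrow\quad
\ln\rho = 4\ln T + \text{const}
\quad\Longrightarrow\quad
\rho(T) = a\,T^{4}.
```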
Serious attempts to measure experimentally the intensity and spectral composition
of blackbody radiation began with Langley, the American astronomer, who invented
the bolometer, a device for measuring the intensity of radiation in the infrared, using
the principle of temperature-dependent resistance of a thin filament placed in the path
of the rays after they were refracted through a rock-salt prism. These measurements
(1886) extended up to wavelengths of about 5 μm. He found “a real though slight
progression of the point of maximum heat towards the shorter wave-lengths as the
temperature rises” and the asymmetric form of the maximum of φ(λ, T ), steeper on
the shorter wavelength side. By 1895 the greatly improved measurements of Paschen
established the rule that the wavelength λm of maximum intensity was inversely
proportional to the temperature T . Paschen’s measurements led him to propose, in
1896, the form
$$\phi(\lambda, T) = B\lambda^{-C}\exp\!\left(-\frac{A}{\lambda T}\right) \tag{1.6}$$
with the constant C somewhere in the range of 5–6.
In 1893 Wien derived, using purely thermodynamic arguments, an important
constraint on the Kirchhoff function φ(λ, T ). Consider once again the spherical cavity
used above in the derivation of the Stefan–Boltzmann Law. Imagine that the sphere
undergoes slow, adiabatic compression, where the radius contracts steadily at speed v
(v ≪ c). At every reflection from this inwardly contracting sphere light of wavelength
λ suffers a Doppler shift to wavelength λ(1 − 2v/c). During a contraction by Δr,
occurring in time Δr/v, the light undergoes cΔr/2vr reflections across a diameter of
the sphere (it can be shown that light not incident perpendicular to the walls suffers a
smaller Doppler shift each time, but is reflected correspondingly more frequently: the
net result is the same). The result is a total blue shift to wavelength
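$$\left(1 - \frac{2v}{c}\right)^{c\Delta r/2vr}\lambda \sim \left(1 - \frac{\Delta r}{r}\right)\lambda$$
so the wavelength is shifted by $\Delta\lambda/\lambda = -\Delta r/r$ in this adiabatic compression. As there is
no heat transfer, $dS = 0$ and (see Eq. (1.4) above)
$$\frac{4}{3}\pi r^3\,\frac{4\rho}{T}\,\frac{dT}{T} = -\frac{16}{3}\pi r^2\rho\,\frac{dr}{T} \quad\Rightarrow\quad \frac{dr}{r} = -\frac{dT}{T}$$
so $r \propto 1/T$.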
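The compounding of many small Doppler shifts into a single factor of roughly 1 − Δr/r can be checked numerically. The sketch below is only an illustration: the function name and the chosen values of v/c and Δr/r are assumptions, not taken from the text.

```python
import math

def compound_doppler_factor(v_over_c, dr_over_r):
    """Wavelength factor accumulated during a contraction by dr.

    Each reflection multiplies the wavelength by (1 - 2v/c), and the text
    counts c*dr / (2*v*r) reflections across a diameter.
    """
    n_reflections = dr_over_r / (2.0 * v_over_c)
    return (1.0 - 2.0 * v_over_c) ** n_reflections

dr_over_r = 1e-3  # illustrative fractional contraction (assumed value)
for v_over_c in (1e-2, 1e-4, 1e-6):
    exact = compound_doppler_factor(v_over_c, dr_over_r)
    linear = 1.0 - dr_over_r
    print(f"v/c = {v_over_c:.0e}: factor = {exact:.9f}, 1 - dr/r = {linear:.9f}")
```

As v/c decreases the compounded factor approaches exp(−Δr/r), which agrees with 1 − Δr/r to first order in the small contraction.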
The essence of the thermodynamic argument given by Wien lies in the observation
that the adiabatic process described here gives at every stage cavity radiation in
$$\lambda^5\phi = f(\lambda/r) = f(\lambda T)$$
which Wien called the “$T^5$” law, but is now commonly called the Wien Displacement
Law. If we know the radiation function $\phi(\lambda_1, T_1)$ at temperature $T_1$, the law allows us
to “displace” it into the appropriate curve for any other temperature $T_2$, as
$$\phi(\lambda, T_2) = \left(\frac{T_2}{T_1}\right)^5 \phi\!\left(\lambda\frac{T_2}{T_1}, T_1\right)$$
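The displacement rule can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: the particular one-variable function f used here (an exponential of the Wien type) and the constants A and B are illustrative choices, since the displacement law itself leaves f undetermined.

```python
import math

A, B = 1.0, 1.0  # illustrative constants in arbitrary units (assumed)

def f(x):
    """Illustrative choice for the one-variable function f(lambda * T)."""
    return B * math.exp(-A / x)

def phi(lam, T):
    """Kirchhoff function obeying lambda^5 * phi = f(lambda * T)."""
    return f(lam * T) / lam**5

def displaced_phi(lam, T2, T1):
    """phi at temperature T2, built only from the curve known at T1."""
    ratio = T2 / T1
    return ratio**5 * phi(lam * ratio, T1)

for lam in (0.05, 0.2, 1.0):
    print(lam, phi(lam, 2.0), displaced_phi(lam, 2.0, 1.0))
```

The two printed columns agree for every wavelength, whatever form is chosen for f, because the rule depends only on the scaling structure of the law.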
It follows immediately from the Wien Displacement Law that the total energy density
$$\rho(T) = \int d\lambda\,\phi(\lambda, T) = \int d\lambda\,\frac{1}{\lambda^5}\,f(\lambda T) = T^4\int_0^\infty dx\,\frac{1}{x^5}\,f(x) \equiv aT^4 \tag{1.10}$$
satisfies the Stefan–Boltzmann Law (provided, of course, that the integral converges:
a condition by no means to be taken for granted, as we shall see). Another imme-
diate corollary is the result later verified experimentally by Paschen (but suggested
previously by several workers in this field, notably H. F. Weber) that the maximum in
wavelength displaces inversely with the temperature. Finally, the Wien Displacement
Law immediately fixes the value of the constant C in the form proposed by Paschen
(see (1.6)) to be 5 exactly. Independently of Paschen’s work, Wien in 1896 arrived
at the form (1.6) on the basis of an ad hoc assumption concerning the emission of
radiation by molecules distributed according to a Maxwellian velocity distribution.
This form, which is now a complete specification of the Kirchhoff function φ, was
called the Wien Distribution Law, and was to play, with its “corrected” version, the
Planck Distribution Law, a critical role in the evolution of attempts to understand
quantization of the electromagnetic field.
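The inverse relation between the peak wavelength and the temperature can be seen numerically for the form (1.6) with C = 5. The sketch below assumes illustrative values A = B = 1 in arbitrary units and uses a simple ternary search (a choice made here for brevity, not a method from the text); setting the derivative of (1.6) to zero gives the peak at λT = A/5.

```python
import math

A, B = 1.0, 1.0  # illustrative constants, arbitrary units (assumed)

def wien_phi(lam, T):
    """Eq. (1.6) with the exponent C fixed to 5."""
    return B * lam**-5 * math.exp(-A / (lam * T))

def peak_wavelength(T, lo=1e-3, hi=10.0, iters=200):
    """Locate the maximum of the single-peaked curve by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if wien_phi(m1, T) < wien_phi(m2, T):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

for T in (0.5, 1.0, 2.0, 4.0):
    lam_m = peak_wavelength(T)
    print(f"T = {T}: lambda_max = {lam_m:.6f}, lambda_max * T = {lam_m * T:.6f}")
```

The product of peak wavelength and temperature stays fixed at A/5 as T varies, which is the displacement of the maximum described above.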
3 For a beautiful retelling of this remarkable period in the development of statistical heat theory, see the
biography by Martin Klein of Paul Ehrenfest (Klein, 1970).
Planck’s route to the quantization of energy 9
incident on a charged oscillator into outgoing spherical waves. This subject was
explored with great thoroughness in a series of five papers in the Berliner Berichte
(1897–99) entitled “Über irreversible Strahlungsvorgänge” (“On irreversible radiation
processes”) (Planck, 1900a). The subject of absorption and re-emission of electromag-
netic radiation from an oscillator led Planck naturally into the subject of thermal
cavity radiation. Here the oscillators constitute the material of the walls of the cavity,
absorbing and re-emitting the radiation in the interior. The universal character of the
thermal radiation discussed above allowed Planck the freedom of making a very simple
model of the constituent particles of the walls (essentially charged simple harmonic
oscillators), as the spectral distribution of the cavity radiation would have to be
independent of the specific material constitution of the cavity once equilibrium is
reached.
The irreversibility that Planck relies upon in his radiation studies can be seen
clearly in the damped oscillator equation that he derived as a prelude to his studies
of the coupled field-oscillator problem:
$$m\frac{d^2x}{dt^2} + kx - \frac{2e^2}{3c^3}\frac{d^3x}{dt^3} = eE\cos(2\pi\nu t) \tag{1.11}$$
The third (“radiation damping”) term has three time-derivatives and evidently
changes sign under time-reversal. It arises because the damping force times the velocity
must give the power lost to radiation, which is proportional to the acceleration of the
charged particle squared. The average of the third term above times the velocity
dx/dt over a cycle of the periodic system is easily seen to be the same as the average
power radiated, by a single integration by parts. Planck was particularly impressed
by the fact that the irreversibility in this system arises without any recourse to non-
conservative processes, in which ordered energy is lost (as in friction or air resistance)
to disordered heat. Instead, the energy appears to flow irreversibly from an ordered
source (an incoming plane wave incident on the oscillator) to an equally ordered form:
outgoing spherical radiation.
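The integration by parts mentioned above can be written out as follows. This is a sketch in which τ denotes one period of the motion (a symbol introduced here for illustration) and the boundary term vanishes by periodicity; the final expression is the standard Larmor power for a charge e in Gaussian units.

```latex
-\frac{2e^2}{3c^3}\,\frac{1}{\tau}\int_0^{\tau}\frac{d^3x}{dt^3}\,\frac{dx}{dt}\,dt
= -\frac{2e^2}{3c^3}\,\frac{1}{\tau}
  \left[\frac{d^2x}{dt^2}\,\frac{dx}{dt}\right]_0^{\tau}
  + \frac{2e^2}{3c^3}\,\frac{1}{\tau}\int_0^{\tau}\left(\frac{d^2x}{dt^2}\right)^{2}dt
= \frac{2e^2}{3c^3}\,\left\langle\left(\frac{d^2x}{dt^2}\right)^{2}\right\rangle .
```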
In 1898 Boltzmann succeeded in convincing Planck (in a paper entitled “On
the supposedly(!) irreversible radiation processes” (Boltzmann, 1898)) that the hope
of deriving irreversible phenomena from electromagnetic theory without additional
statistical assumptions was bound to fail, as Maxwell’s equations are just as invariant
under time-reversal as those of classical mechanics. In fact (Boltzmann claimed), in
the course of a careful derivation of radiation damping one is forced to apply boundary
conditions to the fields which amount to a field analog of the assumption of molecular
disorder implicit in the Boltzmann approach to gas theory.4 Planck admitted this
promptly and abandoned the attempt at a “microscopic” explanation of irreversibility
based on electrodynamics.
In the fifth of his papers on irreversible radiation processes (Planck, 1899), Planck
derived a crucial formula relating the distribution function for cavity radiation to the
average energy of his fictional oscillators (at equilibrium). Before stating this formula,
4 See (Klein, 1970), (Kuhn, 1978) for masterful expositions of the remarkable developments at the
interface of mechanics and heat theory summarized all too briefly above.
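a slight change in notation will be convenient. Let ρ(ν, T)dν be the energy per unit volume
of cavity radiation in the frequency interval (ν, ν + dν), where νλ = c, $|d\nu| = \frac{c}{\lambda^2}\,d\lambda$, so
that
$$\rho(\nu, T)\,d\nu = \rho(\nu, T)\,\frac{c}{\lambda^2}\,d\lambda = \phi(\lambda, T)\,d\lambda$$
$$\rho(\nu, T) = \frac{\lambda^2}{c}\,\phi(\lambda, T) = \frac{c}{\nu^2}\,\phi\!\left(\frac{c}{\nu}, T\right) \tag{1.12}$$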
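The change of variables in (1.12) is designed so that the energy in a band is the same whether it is counted in wavelength or in frequency. The following sketch checks this numerically; the Wien-type form chosen for φ, the unit choice c = 1, and the midpoint integrator are all assumptions made for illustration.

```python
import math

c = 1.0          # speed of light in arbitrary units (assumed for illustration)
A, B = 1.0, 1.0  # illustrative constants for a Wien-type phi (assumed)
T = 1.0

def phi(lam, T):
    """Energy density per unit wavelength (illustrative Wien-type form)."""
    return B * lam**-5 * math.exp(-A / (lam * T))

def rho(nu, T):
    """Eq. (1.12): rho(nu, T) = (c / nu^2) * phi(c / nu, T)."""
    return (c / nu**2) * phi(c / nu, T)

def midpoint_integral(func, a, b, n=20000):
    """Plain midpoint rule, adequate for these smooth integrands."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

lam1, lam2 = 0.1, 0.5
energy_lambda = midpoint_integral(lambda lam: phi(lam, T), lam1, lam2)
energy_nu = midpoint_integral(lambda nu: rho(nu, T), c / lam2, c / lam1)
print(energy_lambda, energy_nu)
```

The two integrals agree to the accuracy of the quadrature, since the wavelength band [λ1, λ2] and the frequency band [c/λ2, c/λ1] contain the same radiation.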
In terms of ρ(ν, T), the $T^5$ law takes the form
$$\rho(\nu, T) = \nu^3 f\!\left(\frac{\nu}{T}\right) \tag{1.13}$$
and the Wien Distribution Law is
$$\rho(\nu, T) = \alpha\nu^3\exp(-\beta\nu/T) \tag{1.14}$$
where the new constants α, β are related to those appearing in the Paschen result
(1.6) by $\alpha = B/c^4$, $\beta = A/c$ (recall that the constant C was fixed previously by purely
thermodynamic reasoning to be 5). The equation derived by Planck, obtained by
equating at equilibrium the energy absorbed and emitted by the oscillator, stated
simply
$$E(\nu_0, T) = \frac{c^3}{8\pi\nu_0^2}\,\rho(\nu_0, T) \tag{1.15}$$
It relates the average energy of an oscillator of natural frequency $\nu_0 = \frac{1}{2\pi}\sqrt{k/m}$ to
the blackbody distribution function. That such a relation must exist is physically
clear. Planck showed that if the left-hand side exceeded the right, energy would flow
from the oscillators to the electromagnetic field, while if the intensity of radiation at
ν0 became large enough that the right-hand side exceeded the left the oscillators
would tend to absorb energy from the field. The importance of this equation (a
full derivation of which we must unfortunately forgo, in the interests of brevity) in
Planck’s intellectual journey can scarcely be overemphasized: it allowed him to restrict
the application of energy quantization to the material oscillators alone (left-hand side
of (1.15)), while relying on the equilibrium condition to transfer the resultant average
distribution of energy by main force, as it were, to the continuous electromagnetic
radiation (right-hand side) in the interior of the cavity. Planck would continue to
insist on the continuous, purely classical character of electromagnetic radiation for the
next 25 years.
In his final paper on irreversible radiation processes (see (Planck, 1900a)), Planck
gave a “derivation” of the Wien Law based on purely thermodynamic arguments
together with the crucial formula (1.15) above. This was done by making a plausible
assumption for the form of the entropy S of the oscillator as a function of energy E,
using for the inverse temperature $T^{-1} = \partial S/\partial E$, and solving for E as a function of T.
Planck showed that his assumption for S(E) implied that the entropy of the whole
system (oscillators plus radiation) would necessarily increase in time, in agreement
with the Second Law of Thermodynamics. He was also under the (as it later turned
out, erroneous) impression that this was the only possible choice for S(E) consistent
with the Second Law. Consequently, at this point Planck was quite convinced that
he had finally managed a complete derivation of the blackbody spectrum from pure
thermodynamics (even if he had now to agree with Boltzmann that the Second Law
had a statistical rather than absolute significance, even in radiation phenomena).
On the afternoon of Sunday, 7 October 1900, Planck was visited at home by an
experimental colleague from the Physikalisch-Technische Reichsanstalt (the Physical-
Technical Imperial Institute, or PTR), H. Rubens. He learnt from Rubens that recent
experiments at the PTR had established incontrovertible deviations from the Wien
Distribution Law on the infrared (low-frequency) side. In particular, the intensity was
roughly proportional to temperature in this regime, instead of the saturation at high
temperatures implied by the Wien Law (1.14). Planck realized that a more general
form for the oscillator entropy S(E) would in turn allow the derivation of a modified
distribution law
$$\rho(\nu, T) = \alpha\nu^3\,\frac{1}{\exp(\beta\nu/T) - 1} \tag{1.16}$$
which clearly reproduces the Wien Law at higher frequencies, but behaves like
$$\rho(\nu, T) \sim \alpha\nu^3\,\frac{1}{(\beta\nu/T)} \sim \frac{\alpha}{\beta}\,T\nu^2 \tag{1.17}$$
in the infrared (small ν), showing the desired linear behavior with T . This interpolating
formula, which Planck appears to have constructed in the few hours following the visit
of Rubens, was checked within the next week and a half and found to match exactly
the experimental data.
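The two limits of the interpolating formula (1.16) can be checked with a short numerical sketch. The values α = β = 1 and T = 1 are illustrative assumptions in arbitrary units, and the Wien form is taken to be αν³ exp(−βν/T), which is what (1.16) reduces to at large ν.

```python
import math

alpha, beta = 1.0, 1.0  # illustrative constants, arbitrary units (assumed)

def planck_rho(nu, T):
    """Eq. (1.16); expm1 keeps the denominator accurate when beta*nu/T is small."""
    return alpha * nu**3 / math.expm1(beta * nu / T)

def wien_rho(nu, T):
    """Wien form: alpha * nu^3 * exp(-beta * nu / T)."""
    return alpha * nu**3 * math.exp(-beta * nu / T)

def infrared_rho(nu, T):
    """Eq. (1.17): (alpha / beta) * T * nu^2."""
    return (alpha / beta) * T * nu**2

T = 1.0
for nu in (1e-3, 1e-1, 1.0, 10.0, 30.0):
    p = planck_rho(nu, T)
    print(f"nu = {nu:g}: Planck/Wien = {p / wien_rho(nu, T):.6f}, "
          f"Planck/infrared = {p / infrared_rho(nu, T):.6f}")
```

The first ratio tends to 1 at high frequency and the second tends to 1 at low frequency, so the single formula joins the two regimes seen in the measurements.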
Planck was perfectly aware that his interpolating formula was nothing more than
an enlightened guess at this stage, and he began right away to search for a proper
understanding of the formula (1.16). His strategy was precisely the inverse of the one
he had followed heretofore. He used (1.15) to obtain the average oscillator energy,
assuming the validity of the Planck distribution (1.16):
$$E(\nu, T) = \frac{h\nu}{\exp(\beta\nu/T) - 1} \tag{1.18}$$
where $h \equiv \alpha c^3/8\pi$. He then reconstructed the corresponding expression for oscillator
entropy as a function of energy, using the thermodynamic relation T dS = dE valid
for a reversible transformation involving transfer of heat but no external work. Here,
oscillators of a fixed natural frequency ν (called νo above) are considered. Solving
(1.18) for 1/T as a function of E:
$$\frac{1}{T} = \frac{1}{\beta\nu}\ln\!\left(1 + \frac{h\nu}{E}\right) \tag{1.19}$$
$$S_N = k\ln\!\left(\frac{(P + N - 1)!}{P!\,(N - 1)!}\right) \sim k\bigl((N + P)\ln(N + P) - P\ln P - N\ln N\bigr)$$
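The quality of the Stirling approximation used in the last step can be gauged numerically. The sketch below compares the exact logarithm of the counting factor, computed with the log-gamma function, against the leading Stirling form; the function names and the ratio P/N = 3 are illustrative assumptions.

```python
import math

def log_multiplicity_exact(P, N):
    """ln[(P + N - 1)! / (P! (N - 1)!)] via log-gamma, avoiding huge factorials."""
    return math.lgamma(P + N) - math.lgamma(P + 1) - math.lgamma(N)

def log_multiplicity_stirling(P, N):
    """Leading Stirling form: (N + P) ln(N + P) - P ln P - N ln N."""
    return (N + P) * math.log(N + P) - P * math.log(P) - N * math.log(N)

for N in (10, 1_000, 100_000):
    P = 3 * N  # illustrative ratio P/N (assumed)
    exact = log_multiplicity_exact(P, N)
    approx = log_multiplicity_stirling(P, N)
    print(f"N = {N}: exact = {exact:.4f}, Stirling = {approx:.4f}, "
          f"relative difference = {(approx - exact) / exact:.2e}")
```

The relative difference shrinks as N grows, because the neglected terms are only logarithmic in N and P while the leading terms grow linearly.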
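The average entropy of each oscillator is $S = \frac{1}{N}S_N$, while the average energy of a single
oscillator is $E = \frac{1}{N}E_N = \frac{P}{N}\epsilon$. Consequently
$$S = k\left\{\left(1 + \frac{E}{\epsilon}\right)\ln\!\left(1 + \frac{E}{\epsilon}\right) - \frac{E}{\epsilon}\ln\!\left(\frac{E}{\epsilon}\right)\right\} \tag{1.21}$$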
= hν! The arguments outlined above were presented in Planck’s paper in Annalen der
Physik 4 (1901),p. 553, “Über das Gesetz der Energieverteilung im Normalspectrum”
(“On the law of energy distribution for the normal [i.e., blackbody] spectrum”). The
famous Planck’s constant h appears here for the first time. In modern notation, the
blackbody distribution thus takes the form
$$\rho(\nu, T) = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{\exp(h\nu/kT) - 1} \tag{1.22}$$
From the experimental fits, Planck determined $h = 6.55 \times 10^{-27}$ erg·sec, and k (Boltzmann’s
constant) $= 1.346 \times 10^{-16}$ erg/degree. The latter value allowed Planck to
obtain the first decently accurate value for Avogadro’s number N = R/k (where R
is the gas constant).
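The arithmetic behind this estimate is short enough to reproduce. The sketch below uses Planck's fitted k as quoted above; the gas constant and the modern Avogadro number are rounded present-day values, assumed here for comparison rather than taken from Planck's paper.

```python
R_GAS = 8.314e7        # erg / (mol K); modern rounded value (assumption)
K_PLANCK = 1.346e-16   # erg / K; Planck's fitted value quoted in the text
N_MODERN = 6.022e23    # per mol; modern value, used only for comparison

# Avogadro's number from N = R / k.
avogadro_planck = R_GAS / K_PLANCK
# Fractional excess over the modern value.
relative_excess = avogadro_planck / N_MODERN - 1.0
print(f"{avogadro_planck:.3e}", f"{relative_excess:.1%}")
```

With these inputs the estimate lands within a few percent of the modern value.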
It is a strange historical irony that Planck’s modification of Wien’s Law,
motivated by the pressure of the Kurlbaum–Rubens experimental results, was actually
a move towards a “more classical” result: as Einstein was to emphasize in his epochal
1905 paper (Einstein, 1905a), in which the revolutionary idea of field quantization
was introduced, the Wien Law is in a sense an extreme manifestation of the quantal
properties of light. The deviations observed from this law in the infrared by Kurlbaum
and Rubens are harbingers of the reappearance of the classical wave-like aspects of
electromagnetic phenomena. To understand this we must realize that despite Planck’s
heroic efforts to obtain a rigorous and unique classical result for the distribution func-
tion of cavity radiation throughout the 1890s, leading up to the quantum-theoretically
correct Planck distribution (1.22), the first derivation of the blackbody distribution
based on a consistent and full application of classical principles is actually due to Lord
Rayleigh. In a short (two-page) paper published in 1900 (Rayleigh, 1900) Rayleigh
derived the correct classical form of the distribution function from the classical
equipartition theorem applied directly to the electromagnetic modes in the cavity.
Consider a cubical L × L × L box containing electromagnetic radiation in the form of
standing waves. A typical standing wave mode takes the form
$$\sin\left(\frac{n_1\pi x}{L}\right)\sin\left(\frac{n_2\pi y}{L}\right)\sin\left(\frac{n_3\pi z}{L}\right)$$
where the associated frequency is $\nu = \frac{c}{2L}|n|$ and n is the vector with (positive) integer
Cartesian components (n1 , n2 , n3 ). The number of such modes in the shell (|n|, |n| +
d|n|) (octant of positive components only!—an error of Rayleigh’s later corrected by
Jeans, see below) is evidently
$$\frac{1}{8}\,4\pi|n|^2\,d|n| = 4\pi\frac{L^3}{c^3}\nu^2\,d\nu$$
and each of these modes receives a total of 2kT at equilibrium by the equipartition
principle (namely, $\frac{1}{2}kT$ each into electric and magnetic field energy, and each of two
polarization modes). Thus the energy per unit volume in the field in the frequency
interval (ν, ν + dν) is
$$\rho(\nu, T)\,d\nu = \frac{1}{L^3}\,2kT\left(4\pi\frac{L^3}{c^3}\nu^2\right)d\nu = 8\pi\frac{\nu^2}{c^3}kT\,d\nu \tag{1.23}$$
—a result which has since become known as the Rayleigh–Jeans Law. (The error
mentioned above of an overall factor of eight made by Rayleigh in his original paper was
subsequently corrected by Jeans. As Pais points out in his biography of Einstein (Pais,
1982), the correction was made also in Einstein’s 1905 paper on the light quantum, so
the result should perhaps more properly be called the Rayleigh–Jeans–Einstein Law.)
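The octant factor of 1/8 in the mode count can be illustrated by brute force. The sketch below counts triples of positive integers inside a sphere of radius R and compares the total with one eighth of the sphere's volume; it checks the cumulative count rather than the shell count, and the function names are hypothetical.

```python
import math

def count_modes(radius):
    """Count triples of positive integers (n1, n2, n3) with |n| <= radius."""
    r2 = radius * radius
    count = 0
    for n1 in range(1, radius + 1):
        for n2 in range(1, radius + 1):
            rem = r2 - n1 * n1 - n2 * n2
            if rem < 1:
                continue
            # Number of integers n3 >= 1 with n3^2 <= rem.
            count += math.isqrt(rem)
    return count

def octant_volume(radius):
    """One eighth of the volume of a sphere: the continuum estimate."""
    return (4.0 / 3.0) * math.pi * radius**3 / 8.0

# Ratio of the exact lattice count to the continuum estimate for growing radius.
ratios = {R: count_modes(R) / octant_volume(R) for R in (20, 40, 80)}
print(ratios)
```

The ratio approaches one from below as R grows, the deficit being a surface effect. Using the full sphere instead of the octant would overcount by the factor of eight mentioned in the text.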
Rayleigh was perfectly aware that this result could not be correct: the total energy
contained in the cavity radiation, when integrated over all frequencies, would then be
infinite! Instead, he assumed that it was correct only for the “graver modes” (i.e., lower
frequencies) and that the distribution was modified for some as yet unknown reason
at higher frequencies (Rayleigh simply inserted an exponential suppression factor at
high frequencies, and the resultant formula was in fact his final result). In any event
the simple linear dependence on temperature in the Rayleigh–Jeans Law flies in the
face of experience: a bar of steel at room temperature (300 K, say) does not emit
radiation at one-tenth the blinding intensity of a bar at 3000 K! The infinite amount
of energy present in the classical radiation field under equipartition would later (1911)
be referred to by Ehrenfest (Ehrenfest, 1911) as the “ultraviolet catastrophe”.
Of course, if Planck had finished his Boltzmannian calculation of the average
oscillator energy by taking the energy units $\epsilon$ to zero, as Boltzmann had done
previously in his discussion of gas theory, he would have arrived precisely at Rayleigh’s
result (though he does not seem to have been aware of Rayleigh’s work during the
critical period leading up to the 1901 paper), as the Rayleigh–Jeans Law is simply the
h → 0 limit of the Planck distribution. That he did not do so is probably due to a
combination of reasons:
1. He does not seem to have regarded equipartition as a fundamental guiding
principle to the same extent as other physicists of a more “mechanist” bent.
2. Planck attacked the problem from the point of view of the behavior of the
oscillators at thermal equilibrium, rather than by directly considering the modes
of the electromagnetic field itself, which would have led much more quickly to
the (wrong!) classical result.
3. The result obtained by setting the energy units to zero would not have agreed
with the Wien Law, with which Planck had started and which he knew to be
empirically correct at higher frequencies.
was to insist for almost another full quarter century, was a continuous, fully classical
entity regulated by Maxwell’s equations. The situation was to change dramatically
with Einstein’s remarkable 1905 paper, “On a heuristic point of view concerning
the creation and conversion of light” (Einstein, 1905a). Although this paper is now
commonly referred to as the “photoelectric paper”, Einstein spends much more time
in it on an analysis of the volume-dependence of blackbody radiation (pp. 92–102)
than on the brief discussion (pp. 104–105) of the photoelectric effect.
After pointing out that a strictly classical analysis must necessarily lead to the
Rayleigh–Jeans result (1.23), with its inescapable concomitant ultraviolet catastrophe,
Einstein goes on to analyse cavity radiation in the high-frequency domain, drawing
some extraordinarily non-classical conclusions from the quintessentially “classical” (at
least from an historical point of view) Wien Law. Einstein’s approach in this paper is
radically different from Planck’s. He focusses first and foremost on the thermodynamic
and statistical properties of the electromagnetic radiation in the interior of the cavity.
Taking a cavity of volume V0 and considering only the electromagnetic radiation in
the frequency interval (ν, ν + dν), the energy E of such radiation in the high-frequency
domain where Wien’s Law (1.14) holds is given by
$$E = \frac{8\pi h\nu^3}{c^3}\,V_0\,e^{-h\nu/kT}\,d\nu \tag{1.24}$$
Solving this equation for 1/T and repeating the integration procedure of (1.20) to obtain
an expression for the entropy S of the electromagnetic radiation in this frequency
interval, one finds (the 0 subscript indicates that the radiation in the entire cavity of
volume V0 is being considered—we shall shortly consider radiation in a subcavity)
$$S_0 = -\frac{kE}{h\nu}\left\{\ln\left(\frac{E}{8\pi h\nu^3 V_0\,d\nu/c^3}\right) - 1\right\} \tag{1.25}$$
The same amount of radiation confined to a smaller volume V would lead to an entropy
S with exactly the same form as (1.25) but with V0 replaced with V . Accordingly, the
difference in entropy for the two situations is
$$S - S_0 = \frac{kE}{h\nu}\ln\left(\frac{V}{V_0}\right) = k\ln\left(\frac{V}{V_0}\right)^{E/h\nu} \tag{1.26}$$
The fundamental Boltzmannian association of entropy with the probability W of the
associated microstates of the system, S = k ln W , then leads to the conclusion that
$$W = \left(\frac{V}{V_0}\right)^{E/h\nu} \tag{1.27}$$
i.e., that the probability of an energy fluctuation leading to a concentration of all the
electromagnetic radiation in the frequency interval (ν, ν + dν) in the subvolume V
of the full cavity V0 takes exactly the form which we would expect if that radiation
consisted of $E/h\nu$ “mutually independent energy quanta” (each of energy hν) moving
freely throughout the cavity, in complete analogy to the behavior of molecules in a
gas. This is as far from the classical picture of electromagnetic radiation as extended
waves subject to mutual (destructive and constructive) interference as it is possible to
get. The result (1.26)—extraordinarily simple, but profoundly baffling, from a classical
point of view—clearly had a deep impact on Einstein’s thinking. He was to hold firmly
to the concept of energy (and later momentum) quantization of the electromagnetic
field over the next 20 years—a period of time in which the majority of physicists
were firmly on Planck’s side and resistant to any notion of quantization of the sacred
classical Maxwellian fields.
The centrality of blackbody radiation to Einstein’s thinking about the nature of
the electromagnetic field is clear once one reflects on the number of occasions on
which he would return to the subject: to take the most prominent cases, in 1909 in
two papers (Einstein, 1909b,a) (one entitled “On the present status of the radiation
problem”, the other “On the development of our conceptions on the nature and
constitution of radiation”) in which energy fluctuations were once more used as a
diagnostic for exposing the underlying properties of radiation, and in 1917, in the
famous “A-B coefficients” paper (Einstein, 1916, 1917), of critical importance in
the later development of dispersion theory by Kramers, and thereafter in the 1925
development of matrix mechanics at the hands of Heisenberg, Born, and Jordan.5 Here
we briefly review Einstein’s results of 1909, which proved to be a critical inspiration
for Jordan’s introduction in 1925, in the last section of the “Three-Man” paper of
Born, Heisenberg, and Jordan (Born et al., 1926), of the first true quantum field.
In returning to the problem of energy fluctuations in cavity radiation, Einstein
decided to relax the simplifying assumption of high-frequency (or low-density) radia-
tion described by the Wien Law, and to enquire into the implications of the full Planck
distribution (1.16), valid at all densities and frequencies, for the fluctuation properties
of thermal radiation. In this case, instead of considering the highly non-Gaussian
process whereby a fluctuation would concentrate 100% of the radiation energy in
a given interval (ν, ν + dν) in a subvolume V (giving the result (1.27), later to be
called “Einstein’s first fluctuation theorem” by Jordan) Einstein decided to calculate
the mean-square energy fluctuation of the energy in this interval in the subvolume
V . The formula for such mean-square fluctuations is a standard result of statistical
mechanics:
$$\langle(\Delta E)^2\rangle = kT^2\,\frac{d\langle E\rangle}{dT} \tag{1.28}$$
where T is the temperature and ⟨E⟩ the mean energy, which in this case is clearly just
V ρ(ν, T )dν. We can distinguish three interesting choices for the energy distribution
ρ(ν, T ) and corresponding mean-square energy fluctuation. We shall distinguish the
results obtained for the mean-square energy fluctuation in each case by a subscript
indicating the assumed form for the universal Kirchhoff distribution function ρ(ν, T ):
“RJ” for the completely classical Rayleigh–Jeans form, “W” for the Wien Law, and
“P” for the final result of Planck. In the case of the Rayleigh–Jeans Law valid at low
frequencies,
5 For a thorough study of the role played by dispersion theory in the birth of modern quantum mechanics,
see the two-part paper by M. Janssen and the present author (Duncan and Janssen, 2007a,b).
First inklings of field quantization: Einstein and energy fluctuations 17
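$$\rho_{RJ} = \frac{8\pi}{c^3}\nu^2 kT \tag{1.29}$$
$$\Rightarrow\ \langle(\Delta E)^2\rangle_{RJ} = \frac{c^3\langle E\rangle_{RJ}^2}{8\pi\nu^2 V\,d\nu} \tag{1.30}$$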
while the Wien case, valid at high frequencies, yields
$$\rho_W = \frac{8\pi h}{c^3}\nu^3 e^{-h\nu/kT} \tag{1.31}$$
$$\Rightarrow\ \langle(\Delta E)^2\rangle_W = h\nu\langle E\rangle_W \tag{1.32}$$
and, using the Planck distribution formula valid at all frequencies, one obtains instead
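$$\rho_P = \frac{8\pi h}{c^3}\,\frac{\nu^3}{e^{h\nu/kT} - 1} \tag{1.33}$$
$$\Rightarrow\ \langle(\Delta E)^2\rangle_P = \frac{c^3\langle E\rangle_P^2}{8\pi\nu^2 V\,d\nu} + h\nu\langle E\rangle_P \tag{1.34}$$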
The energy fluctuation result (1.30) for the classical Rayleigh–Jeans regime would
be reproduced by Lorentz (Lorentz, 1916) a few years later independently of thermo-
dynamic considerations by considering the fluctuations of energy of electromagnetic
radiation due to constructive and destructive interference in a subvolume (with an
assumption of random phases of the component waves). The fact that the mean-square
energy fluctuation comes out to be proportional to the square of the mean energy can
be considered to be the characteristic feature of classical waves in this context.
The linear energy dependence of the Wien result for the squared energy fluctuation
is on the other hand immediately suggestive of quantization, as we would expect a
fluctuation of $\sqrt{N}$ from N particles of energy hν to lead to a mean-square energy
fluctuation $(\sqrt{N}h\nu)^2 = h\nu \cdot Nh\nu = h\nu E$. The full Planck result (1.34), however,
leads to a mean-square fluctuation which appears to be the purely additive result
of wave and particle contributions.6 In 1909 Einstein was to interpret this remarkable
result (later dubbed by Jordan “Einstein’s second fluctuation theorem”, to distinguish
it from (1.27), the “first fluctuation theorem”) as evidence for two statistically—and
therefore structurally—independent causes for energy fluctuation, and insisted in a
lecture at the Salzburg Naturforscherversammlung (1909) that “the next phase of
the development of theoretical physics will bring us a theory of light that can be
interpreted as a kind of fusion of the wave and emission (i.e., particle) theories”. In
the next section we shall see that Pascual Jordan, in his introduction of the first
quantum field, was to establish conclusively that two separate physical mechanisms
for energy fluctuation are not necessary: rather, once the kinematic demands of the
new quantum theory are properly implemented in the description of the modes of the
electromagnetic field, the result (1.34) emerges precisely and naturally from a unified
dynamical framework.
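The additive wave-plus-particle structure of (1.34) can be verified numerically from (1.28) and (1.33). The following is a rough sketch, assuming units with h = k = c = 1 and an arbitrary small frequency window; the derivative in (1.28) is taken by a central finite difference, and all function names are hypothetical.

```python
import math

H = K = C = 1.0  # natural units, an assumption made for illustration

def mean_energy_planck(nu, T, V=1.0, dnu=1e-3):
    """Mean energy V * rho_P(nu, T) * dnu in the window, using (1.33)."""
    rho = 8.0 * math.pi * H * nu**3 / C**3 / math.expm1(H * nu / (K * T))
    return V * rho * dnu

def fluctuation_from_1_28(nu, T, V=1.0, dnu=1e-3, eps=1e-6):
    """Mean-square fluctuation k T^2 dE/dT, with a central finite difference."""
    dE_dT = (mean_energy_planck(nu, T + eps, V, dnu)
             - mean_energy_planck(nu, T - eps, V, dnu)) / (2.0 * eps)
    return K * T**2 * dE_dT

def wave_and_particle_terms(nu, T, V=1.0, dnu=1e-3):
    """The two terms on the right-hand side of (1.34)."""
    E = mean_energy_planck(nu, T, V, dnu)
    wave = C**3 * E**2 / (8.0 * math.pi * nu**2 * V * dnu)
    particle = H * nu * E
    return wave, particle

direct = fluctuation_from_1_28(1.0, 1.0)
wave, particle = wave_and_particle_terms(1.0, 1.0)
wave_hi, particle_hi = wave_and_particle_terms(20.0, 1.0)   # Wien regime
wave_lo, particle_lo = wave_and_particle_terms(0.01, 1.0)   # Rayleigh-Jeans regime
print(direct, wave + particle)
```

In this sketch the ratio of the wave term to the particle term equals the Planck mean occupation number of the mode, so it is tiny in the Wien regime and large in the Rayleigh–Jeans regime.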
6 Note, however, that in the Wien regime of large ν, the first term on the right-hand side of (1.34) is
exponentially smaller than the second, agreeing with Einstein’s 1905 assertion of purely particle behavior
in this limit.
1.5 The first true quantum field: Jordan and energy fluctuations
The evolution of understanding of quantum physics in the twenty years between
Einstein’s introduction of light quanta and the development of the modern for-
mal structure of quantum mechanics initiated by Heisenberg’s famous Umdeutung
(=“Reinterpretation”) paper of 1925 (Heisenberg, 1925) is a fascinating and com-
plex story of frequent frustration punctuated by occasional leaps of understanding
leading to yet further frustration. The final stages of this development, a fusion of
correspondence principle arguments with Einstein’s radiation theory (the “A and
B coefficients”) of 1917 leading to the quantum dispersion theory of Kramers, and
thence to Heisenberg’s reinterpretation of the kinematics of electrons in terms of
non-commuting quantities, have been described many times (and in great detail
in some recent work of M. Janssen and the present author (Duncan and Janssen,
2007a,b)). Here, we shall assume that the reader is familiar with the basic principles
of quantum mechanics and focus our attention on the developments directly related
to the quantization of fields—specifically, of the electromagnetic field, as this was the
classical field of immediate phenomenological importance at the time, given the need
to understand the interactions of atomic systems with light quanta.
The immediate successor to Heisenberg’s Umdeutung paper—the work in which
Heisenberg, without any reliance on the mathematics of matrix algebra, introduced
the basic ideas of matrix mechanics—is the “Two-Man” paper of Born and Jordan
“On Quantum Mechanics” (Born and Jordan, 1925), written in the late summer of
1925 as a formal amplification of the Heisenberg approach. In this paper, Born and
Jordan derive the commutation relations of coordinates and momenta that we now
regard as the fundamental defining characteristic of quantum phenomena. The first
part of the paper clarifies, using explicit matrix methods, many of the “magical”
results obtained by Heisenberg, but it is in the fourth chapter (entitled “Observations
on electrodynamics”) that the subject of quantization of fields is raised for the first
time in the context of the new mechanics. The way in which this idea is introduced is
of some relevance to the explicit calculations to follow in the “Three-Man” paper of
Born, Heisenberg, and Jordan, and deserves an extensive quote:
A cavity with electromagnetic oscillations constitutes a system of infinitely many degrees of
freedom. Nevertheless, the basic principles developed in the preceding sections, which admittedly
only concern systems of a single degree of freedom, are sufficient to handle this case as well, given
that it goes over to a system of uncoupled oscillators once analyzed in terms of eigenmodes. There
is hardly any possible doubt, how such a system is to be treated (our emphasis). In particular, the
circumstance that the basic equations of electromagnetism are linear is of importance, for it then
follows that the virtual oscillators (eigenmodes) are harmonic, and it is precisely for harmonic
oscillators, in contradistinction to other systems, that the validity of energy conservation is
independent of the quantum condition.
7 It is best to draw the veil of charity over the attempts made at the very end to derive Heisenberg’s
connection between matrix elements of the electron coordinate and the transition amplitude for atomic
transitions. This connection would first be properly elucidated in Dirac’s seminal work of 1927, to be
discussed in Chapter 2.
true insight into the physics of wave–particle duality. We will present a brief summary
of Jordan’s argument in the following (for more details, see the work cited above).
The problem of interpretation of “Einstein’s second fluctuation theorem” (1.34)
(as Jordan termed Einstein’s 1909 result) involving wave and particle terms had
been addressed by Ehrenfest just prior to the 3M paper in a paper (Ehrenfest, 1925)
which is of interest to us only in one aspect: Ehrenfest introduces a one-dimensional
model of cavity radiation which leads to a considerable technical simplification in the
calculation of energy fluctuations. Unfortunately (for Ehrenfest), the paper precedes
the Umdeutung paper of Heisenberg, so the non-commutativity of the eigenmode
variables is unrecognized, and incorrect results necessarily follow in the quantum case.
Still, the model of Ehrenfest was exactly the technical tool Jordan used to carry
through a correct, post-Umdeutung calculation of the energy fluctuations in cavity
radiation. The model of Ehrenfest which Jordan uses imagines a string of length l,
fixed at both ends, and of constant elasticity and constant mass density. This is simply
a one-dimensional analog of an electromagnetic field where the fixing of the string at
the ends corresponds to an electric field component forced to vanish at the conducting
sides of a box. The displacement of the string at location x (with 0 ≤ x ≤ l) and time
t is denoted u(x, t). The wave equation for the string (the analog of the free Maxwell
equations for this simple model) is then
$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 \tag{1.35}$$
Note that the velocity of propagation is set to unity here. The boundary conditions
u(0, t) = u(l, t) = 0 for all times t express that the string is fixed at both ends. The
general solution of this problem can be written as a Fourier series
$$u(x, t) = \sum_{k=1}^{\infty} q_k(t)\sin(\omega_k x), \qquad \omega_k \equiv \frac{k\pi}{l} \tag{1.36}$$
where the dot indicates a time-derivative and the subscript x a partial derivative with
respect to x. The terms $\dot{u}^2$ and $u_x^2$ are the analogs of the densities of the electric and
the magnetic field energy, respectively, in this simple model of blackbody radiation.
Inserting (1.36) for u(x, t) in (1.38), one finds
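$$H = \frac{1}{2}\int_0^l dx \sum_{j,k=1}^{\infty}\Big(\dot{q}_j(t)\dot{q}_k(t)\sin(\omega_j x)\sin(\omega_k x) + \omega_j\omega_k\,q_j(t)q_k(t)\cos(\omega_j x)\cos(\omega_k x)\Big) \tag{1.39}$$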
The functions {sin (ωk x)}k in (1.36) are orthogonal on the interval (0, l), i.e.,
$$\int_0^l dx\,\sin(\omega_j x)\sin(\omega_k x) = \frac{l}{2}\delta_{jk} \tag{1.40}$$
The same is true for the functions {cos (ωk x)}k . It follows that the integral in (1.39)
only gives contributions for j = k. The double sum thus turns into the single sum:
$$H = \sum_{j=1}^{\infty}\frac{l}{4}\left(\dot{q}_j^2(t) + \omega_j^2 q_j^2(t)\right) = \sum_{j=1}^{\infty} H_j \tag{1.41}$$
This expression shows that the vibrating string can be replaced by an infinite
number of uncoupled oscillators, one for every mode of the string, just as described
in the extended quote from chapter 4 of the Born–Jordan paper given previously.
Moreover, the distribution of the energy over the frequencies of these oscillators
is constant in time. Since there is no coupling between the oscillators, there is no
mechanism for transferring energy from one mode to another. The spatial distribution
of the energy in a given frequency range over the length of the string, however, varies
in time. In analogy to Einstein’s considerations of 1909, Jordan now sets out to study
the fluctuations of the energy in a narrow frequency interval (ω, ω + Δω) in a small
segment of the string, namely the region 0 ≤ x ≤ a, a ≪ l. The total energy in that
frequency range will be constant but the fraction located in that small segment will
fluctuate. Jordan derived an expression for the mean-square energy fluctuation of this
energy, first in classical theory, then in matrix mechanics. Here we shall abbreviate the
discussion by going directly to the quantum mechanical case. Accordingly, quantities
like q(t), q̇(t) in the forthcoming equations must be considered to be non-commuting
matrices. However, the time-development of these matrices involves exactly the usual
periodic functions as in the classical case.
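The contrast between the full string and a segment can be made concrete with a simple quadrature. The sketch below estimates the overlap integral of two sine modes by the midpoint rule, first over (0, l) as in (1.40) and then over a shorter segment (0, a). The values l = 1 and a = 0.3 are arbitrary illustrative choices (a is not taken especially small here), and the function name is hypothetical.

```python
import math

def overlap(j, k, upper, l=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of sin(w_j x) sin(w_k x) over (0, upper)."""
    wj, wk = j * math.pi / l, k * math.pi / l
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.sin(wj * x) * math.sin(wk * x)
    return total * h

l, a = 1.0, 0.3
full_diag = overlap(2, 2, l)     # (1.40) predicts l/2
full_off = overlap(2, 3, l)      # (1.40) predicts zero
segment_off = overlap(2, 3, a)   # not constrained to vanish on (0, a)
print(full_diag, full_off, segment_off)
```

The off-diagonal overlap vanishes on the full interval but not on the segment, which is why cross terms between different modes survive once the integration is restricted to part of the string.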
Changing the upper boundary of the integral in (1.39) from l to a (a ≪ l) and
restricting the sums over j to correspond to a narrow angular frequency range (ω, ω +
Δω) (i.e., ω < j(π/l) < ω + Δω and ω < k(π/l) < ω + Δω), we find the instantaneous
energy in that frequency range in a small segment (0, a) ⊂ (0, l) of the string, here
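denoted $E_{(a,\omega)}$:
$$E_{(a,\omega)}(t) = \frac{1}{2}\int_0^a dx \sum_{j,k}\Big(\dot{q}_j(t)\dot{q}_k(t)\sin(\omega_j x)\sin(\omega_k x) + \omega_j\omega_k\,q_j(t)q_k(t)\cos(\omega_j x)\cos(\omega_k x)\Big) \tag{1.42}$$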
The functions {sin (ωk x)}k and the functions {cos (ωk x)}k are not orthogonal on
the interval (0, a), so both terms with j = k and terms with j ≠ k will contribute to
the instantaneous energy $E_{(a,\omega)}(t)$ in (1.42). First consider the (j = k) terms. On the
assumption that a is large enough for the integrals over $\sin^2(\omega_j x)$ and $\cos^2(\omega_j x)$ to
range over many periods corresponding to $\omega_j$, these terms are given by
$$E^{(j=k)}_{(a,\omega)}(t) \approx \frac{a}{4}\sum_j\left(\dot{q}_j^2(t) + \omega_j^2 q_j^2(t)\right) = \frac{a}{l}\sum_j H_j(t). \tag{1.43}$$
Since we are dealing with a system of uncoupled oscillators, the energy of the individual
oscillators is constant, even at the quantum level (recall the emphasis on this point in
the Born–Jordan paper). Since all terms $H_j(t)$ are constant, $E^{(j=k)}_{(a,\omega)}(t)$ is constant too
and equal to its time average:
$$\overline{E^{(j=k)}_{(a,\omega)}(t)} = E^{(j=k)}_{(a,\omega)}(t). \tag{1.44}$$
Since the time averages $\overline{\dot{q}_j(t)\dot{q}_k(t)}$ and $\overline{q_j(t)q_k(t)}$ vanish for $j \neq k$, the ($j \neq k$) terms
in (1.42) do not contribute to its time average:
$$\overline{E^{(j\neq k)}_{(a,\omega)}(t)} = 0. \tag{1.45}$$
From (1.46) it follows that the ($j \neq k$) terms in (1.42) give the instantaneous deviation
$\Delta E_{(a,\omega)}(t)$ of the energy in this frequency range in the segment (0, a) of the string
from its mean (time average) value:
$$\Delta E_{(a,\omega)}(t) \equiv E_{(a,\omega)}(t) - \overline{E_{(a,\omega)}(t)} = E^{(j\neq k)}_{(a,\omega)}(t). \tag{1.47}$$
We now integrate the ($j \neq k$) terms in (1.42) to find $\Delta E_{(a,\omega)}$. From now on, we suppress
the explicit display of the time-dependence of $\Delta E_{(a,\omega)}$, $q_j$ and $\dot{q}_j$.
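$$\Delta E_{(a,\omega)} = \frac{1}{4}\int_0^a dx \sum_{j\neq k}\Big(\dot{q}_j\dot{q}_k\big[\cos((\omega_j - \omega_k)x) - \cos((\omega_j + \omega_k)x)\big] + \omega_j\omega_k\,q_j q_k\big[\cos((\omega_j - \omega_k)x) + \cos((\omega_j + \omega_k)x)\big]\Big)$$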
Defining the expressions within square brackets as (cf. 3M paper, ch. 4, Eq. (45 ))
$$\Delta E_{(a,\omega)} = \frac{1}{4}\sum_{j\neq k} K_{jk}\left(\dot{q}_j\dot{q}_k + \omega_j\omega_k\,q_j q_k\right) \tag{1.51}$$
From (1.41) above it is apparent that the individual oscillators are formally identical
to point particles of mass m = l/2. The subsequent calculations can be considerably
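simplified by introducing the now familiar$^8$ raising and lowering operators $a_j^\dagger(t)$, $a_j(t)$
$$a_j(t) = \sqrt{\frac{l\omega_j}{4\hbar}}\,q_j(t) + i\sqrt{\frac{1}{l\omega_j\hbar}}\,p_j(t) = a_j(0)e^{-i\omega_j t} \tag{1.52}$$
$$a_j^\dagger(t) = \sqrt{\frac{l\omega_j}{4\hbar}}\,q_j(t) - i\sqrt{\frac{1}{l\omega_j\hbar}}\,p_j(t) = a_j^\dagger(0)e^{i\omega_j t} \tag{1.53}$$
satisfying
$$[a_j(t), a_k^\dagger(t)] = \delta_{jk} \tag{1.54}$$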
In fact, the calculations of the 3M paper involve only the pj and qj matrices, and their
commutation relation, and are mathematically perfectly equivalent to results obtained
with the linear combinations defined in (1.53). The introduction of operators which
raise or lower the excitation level of the individual eigenmodes will become central
in our later development of the modern formalism of quantum field theory. To the
extent that the excitation levels {nj } are identified (as they clearly are in the 3M
paper) with the number of light quanta (i.e., photons, in modern terminology) with
frequency ωj , operators raising and lowering these levels are clearly identifiable as
the particle creation and destruction operators of modern field theory. Later, in our
systematic development of field theory, they will turn out to be the technical tool
ideally suited to the introduction of physically sensible local interactions as well as
dealing effortlessly with the statistics of properly symmetrized multi-particle states.
Here they are introduced simply in order to allow us to write the expression for the
energy fluctuation in a maximally compact fashion. We remind the reader (Baym,
8 See (Baym, 1990) for a discussion of this now standard method for solving the quantized harmonic
oscillator.
1990) that the effect of these operators for a single mode (the jth, say) on an eigenstate
$|n_j\rangle$ of $H_j$ is (at time 0)
$$a_j(0)|n_j\rangle = \sqrt{n_j}\,|n_j - 1\rangle \tag{1.55}$$
$$a_j^\dagger(0)|n_j\rangle = \sqrt{n_j + 1}\,|n_j + 1\rangle \tag{1.56}$$
In terms of the $a_j$ and $a_j^\dagger$ operators the instantaneous energy fluctuation takes the
very simple form:
$$\Delta E_{(a,\omega)} = \frac{\hbar}{l}\sum_{j\neq k} K_{jk}\sqrt{\omega_j\omega_k}\;a_j^\dagger(t)a_k(t) \tag{1.58}$$
In other words, the operator (or “matrix”, in the language of the 3M paper) represent-
ing energy fluctuations is simply a sum of terms each of which takes a single photon
in a given energy level and transfers it to a different energy level. What could be more
natural?
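The action of such hopping terms on occupation-number states is easy to mimic on a computer. The following is a minimal sketch, assuming a two-mode system and a toy representation of states as dictionaries from occupation tuples to amplitudes; the helper names are hypothetical. It applies the rules (1.55) and (1.56) and evaluates a diagonal matrix element of the form $a_j^\dagger a_k a_k^\dagger a_j$, the combination that arises when the fluctuation operator is squared.

```python
import math

def annihilate(j, state):
    """Apply a_j to a state stored as {occupation tuple: amplitude}, using (1.55)."""
    out = {}
    for occ, amp in state.items():
        n = occ[j]
        if n == 0:
            continue  # a_j annihilates a state with no quanta in mode j
        new = occ[:j] + (n - 1,) + occ[j + 1:]
        out[new] = out.get(new, 0.0) + amp * math.sqrt(n)
    return out

def create(j, state):
    """Apply a_j^dagger to a state, using (1.56)."""
    out = {}
    for occ, amp in state.items():
        n = occ[j]
        new = occ[:j] + (n + 1,) + occ[j + 1:]
        out[new] = out.get(new, 0.0) + amp * math.sqrt(n + 1)
    return out

def inner(bra, ket):
    """Inner product of two states with real amplitudes."""
    return sum(amp * ket.get(occ, 0.0) for occ, amp in bra.items())

ket = {(3, 5): 1.0}                       # n_0 = 3, n_1 = 5
hopped = create(0, annihilate(1, ket))    # one quantum moved from mode 1 to mode 0
# Diagonal element of a_0^dagger a_1 a_1^dagger a_0 (rightmost operator acts first).
expect = inner(ket, create(0, annihilate(1, create(1, annihilate(0, ket)))))
print(hopped, expect)
```

The single hop produces the state with occupations (4, 4), and the diagonal element equals $n_0(n_1 + 1) = 18$ for this choice of occupation numbers.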
The question now arises concerning in which state to evaluate the squared energy
fluctuation $\Delta E^2_{(a,\omega)}$. The 1909 calculations of Einstein refer to cavity radiation in
thermal equilibrium at a specified temperature T —i.e., to the evaluation of the mean-
square fluctuation in a canonical thermal ensemble of states—but a careful perusal
of the Jordan calculations of the 3M paper shows that the temperature never enters!
Instead, Jordan calculates the quantum dispersion of the energy in a single, pure state
of the field $|\{n_l\}\rangle$, characterized by specifying all excitation levels $n_l$, l = 1, 2, 3, ... of
the field (recall (1.41)):
$$H|\{n_l\}\rangle = \sum_j\left(n_j + \frac{1}{2}\right)\hbar\omega_j\,|\{n_l\}\rangle \tag{1.59}$$
The expectation value of $\Delta E^2_{(a,\omega)}$ in this eigenstate of the full energy operator H is
necessarily time-independent, so the time-averaging is moot. One has simply
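$$\langle\{n_l\}|\Delta E^2_{(a,\omega)}|\{n_l\}\rangle = \frac{\hbar^2}{l^2}\sum_{j\neq k,\;j'\neq k'} K_{jk}K_{j'k'}\sqrt{\omega_j\omega_k\omega_{j'}\omega_{k'}}\;\langle\{n_l\}|a_j^\dagger a_k a_{j'}^\dagger a_{k'}|\{n_l\}\rangle \tag{1.60}$$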
The diagonal matrix element in (1.60) only receives non-vanishing contributions when
the indices $j, k, j', k'$ satisfy $j = k' \neq k = j'$, i.e., when the photon destroyed in mode
$k'$ (by $a_{k'}$) and the photon created into the different mode $j'$ (by $a_{j'}^\dagger$) are replaced in
mode $j = k'$ (by $a_j^\dagger$) and removed in mode $k = j'$ (by $a_k$). Thus we obtain
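$$\langle\{n_l\}|\Delta E^2_{(a,\omega)}|\{n_l\}\rangle = \frac{\hbar^2}{l^2}\sum_{j\neq k} K_{jk}^2\,\omega_j\omega_k\,\langle\{n_l\}|a_j^\dagger a_k a_k^\dagger a_j|\{n_l\}\rangle \tag{1.61}$$
$$= \frac{\hbar^2}{l^2}\sum_{j\neq k} K_{jk}^2\,\omega_j\omega_k\,\langle\{n_l\}|a_j^\dagger a_j\,(a_k^\dagger a_k + 1)|\{n_l\}\rangle \tag{1.62}$$
$$= \frac{\hbar^2}{l^2}\sum_{j\neq k} K_{jk}^2\,n_j(n_k + 1)\,\omega_j\omega_k \tag{1.63}$$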
where in going from (1.61) to (1.62) we have used the commutation relation (1.54),
and used the fact that the operator $a_j^\dagger a_j$ has eigenvalue $n_j$ (i.e., the excitation level)
for the jth mode. At this point a modern derivation of the thermal fluctuation would
immediately perform a canonical ensemble average (by multiplying by the Boltzmann
weight $e^{-\beta\sum_j n_j\hbar\omega_j}$ and summing over excitation levels to obtain the weighted average
of the quantity in (1.63)). As the double sum is over non-identical indices $j \neq k$, the
thermal averages factorize and the result is to replace the fixed occupation number $n_j$
(resp. $n_k$) by the Planck mean occupation number of that mode $\bar{n}_j = 1/(e^{\beta\hbar\omega_j} - 1)$ (resp.
$\bar{n}_k$). This is manifestly a smooth function of the discrete index j, a condition required
by the next step Jordan takes in simplifying (1.63): namely, the (implicit) assumption
that the dependence of the summand is sufficiently smooth that we can, with negligible
error, replace the double sum with a double frequency integral. It should be emphasized
once again that Jordan does not perform a thermal average in the 3M paper: rather,
his derivation must be considered as valid for the pure state quantum dispersion,
assuming that the given pure state has a photon occupation number dependence which
is adequately smooth with respect to the variation of the discrete mode index over the
finite interval under consideration to allow the replacement of sums by integrals. The
further simplification of (1.63) therefore proceeds by the replacements
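$$\sum_j \rightarrow \frac{l}{\pi}\int d\omega \tag{1.64}$$
$$\sum_k \rightarrow \frac{l}{\pi}\int d\omega' \tag{1.65}$$
where $\omega = \frac{j\pi}{l}$, $\omega' = \frac{k\pi}{l}$.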
Resuming our calculation, we first note that if a is very large compared to the
wavelengths associated with the frequencies in the narrow range (ω, ω + Δω), we
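can set$^9$
$$\int d\omega'\,f(\omega')\,\frac{\sin^2((\omega - \omega')a)}{(\omega - \omega')^2} = \int d\omega'\,f(\omega')\,\pi a\,\delta(\omega - \omega') = \pi a f(\omega) \tag{1.66}$$
9 Here we use the fact that the integrand becomes highly peaked for a → ∞, with total weight determined
by the definite integral $\int_{-\infty}^{+\infty}\frac{\sin^2(x)}{x^2}\,dx = \pi$.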
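where δ(x) is the Dirac δ-function and f(x) is an arbitrary function. As pointed out
previously, the sine function in (1.66) is the dominant part of the $K_{jk}^2$ factor in (1.63)
in a narrow frequency interval, so we finally obtain, with the indicated translations
from sums to integrals,
$$\begin{aligned}
\langle\{n_l\}|\Delta E^2_{(a,\omega)}|\{n_l\}\rangle &= \frac{\hbar^2}{\pi^2}\int d\omega\int d\omega'\,\pi a\,\delta(\omega - \omega')\,n(\omega)(n(\omega') + 1)\,\omega\omega' \\
&= \frac{a}{\pi}\int_{\omega}^{\omega+\Delta\omega}\left(n(\omega)^2 + n(\omega)\right)\hbar^2\omega^2\,d\omega \\
&\approx \frac{a}{\pi}\left(n(\omega)^2 + n(\omega)\right)\hbar^2\omega^2\,\Delta\omega
\end{aligned} \tag{1.67}$$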
We once again (and for the last time!) re-emphasize that the smooth variation of the
occupation numbers nj is a precondition for the validity of the result (1.67). This
smoothness is, of course, guaranteed once the thermal average is performed to replace
fixed occupation numbers $n_j$ by their thermal averages $\bar{n}_j$, as the latter are just the
smooth Planck function $1/(e^{\beta\hbar\omega_j} - 1)$.
The result (1.67) is just Einstein’s second fluctuation theorem (à la Jordan) in
slightly disguised form. Writing all quantities as functions of the cyclic frequency ν
rather than angular frequency ω, and using an overbar as a shorthand for the diagonal
expectation values in Eqs. (1.60)–(1.67), we find:
$$\overline{\Delta E^2_{(a,\nu)}} = 2a\Delta\nu\left((n(\nu)h\nu)^2 + (n(\nu)h\nu)h\nu\right) \tag{1.68}$$
We now introduce the excitation energy—the difference between the total energy and
the zero-point energy. Jordan and his co-authors call this the “thermal energy” (p. 377,
p. 384). Although the intuition behind it is clear, this terminology is misleading. The
term “thermal energy” suggests that the authors consider a thermal ensemble of energy
eigenstates—what we would call a mixed state—while as has been made clear in the
preceding, in fact they are dealing with individual energy eigenstates, i.e., pure states.
The term “excitation energy” is therefore preferable in this context. The excitation
energy E(ν) in the narrow frequency range (ν, ν + Δν) in the entire string in the state
$\{n_\nu\}$ is
$$E(\nu) = N(\nu)\,\big(n(\nu)h\nu\big) = 2l\Delta\nu\,\big(n(\nu)h\nu\big) \tag{1.69}$$
where we used that N (ν) = 2lΔν is the number of modes between ν and ν + Δν for
our one-dimensional string. On average there will be a fraction a/l of this energy in
the small segment (0, a) of the string:
$$\overline{E_{(a,\nu)}} = \frac{a}{l}\,E(\nu) = 2a\Delta\nu\,(n(\nu)h\nu) \qquad (1.70)$$
Substituting $\overline{E_{(a,\nu)}}/2a\Delta\nu$ for n(ν)hν in (1.68), we arrive at the final result of this
section of the Dreimännerarbeit (ch. 4, Eq. (55)):
$$\overline{\Delta E_{(a,\nu)}^2} = \frac{\overline{E_{(a,\nu)}}^{\,2}}{2a\Delta\nu} + h\nu\,\overline{E_{(a,\nu)}} \qquad (1.71)$$
—precisely the analog, for a one-dimensional system of waves (with unit wave speed),
of the Einstein result (1.34) (recall that in three dimensions, the number of electromagnetic
modes in volume V in a narrow frequency interval is just 8πν²V dν/c³).
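As a numerical cross-check of the chain (1.67), (1.68), (1.71), the following Python sketch evaluates all three expressions for a single narrow band. The frequency, bandwidth, segment length, and temperature are hypothetical values chosen only for illustration (they are not physically calibrated for a unit-wave-speed string), and the occupation number is assumed to take its thermal Planck value.

```python
import math

H = 6.62607015e-34        # Planck constant h in J s
HBAR = H / (2 * math.pi)  # reduced Planck constant
K_B = 1.380649e-23        # Boltzmann constant in J/K


def planck_occupation(nu, temperature):
    """Mean occupation number 1/(exp(h nu / k T) - 1) of a mode of frequency nu."""
    return 1.0 / math.expm1(H * nu / (K_B * temperature))


def fluctuation_omega(n, omega, d_omega, a):
    """Last line of (1.67): (a/pi) (n^2 + n) hbar^2 omega^2 d_omega."""
    return (a / math.pi) * (n**2 + n) * HBAR**2 * omega**2 * d_omega


def fluctuation_nu(n, nu, d_nu, a):
    """Right-hand side of (1.68): 2 a d_nu [(n h nu)^2 + (n h nu) h nu]."""
    return 2 * a * d_nu * ((n * H * nu) ** 2 + (n * H * nu) * H * nu)


def fluctuation_from_mean_energy(mean_energy, nu, d_nu, a):
    """Right-hand side of (1.71): E^2 / (2 a d_nu) + h nu E."""
    return mean_energy**2 / (2 * a * d_nu) + H * nu * mean_energy


# Hypothetical illustrative inputs (assumptions, not taken from the text)
nu, d_nu, a, temperature = 5.0e13, 1.0e9, 1.0e-3, 300.0

n = planck_occupation(nu, temperature)
mean_energy = 2 * a * d_nu * n * H * nu  # equation (1.70)

via_omega = fluctuation_omega(n, 2 * math.pi * nu, 2 * math.pi * d_nu, a)
via_nu = fluctuation_nu(n, nu, d_nu, a)
via_energy = fluctuation_from_mean_energy(mean_energy, nu, d_nu, a)

print(via_omega, via_nu, via_energy)
```

The three printed numbers should agree up to floating-point rounding, since the passage from ω to ν only uses ω = 2πν and ħω = hν, and the passage to (1.71) only uses (1.70).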
Jordan’s derivation of Einstein’s peculiar “hybrid” formula for the energy fluctuations
in cavity radiation is used in the final paragraph of the 3M paper as further evidence,
independent of the dispersion considerations that had originally motivated Heisenberg,
for the validity of the matrix mechanical procedure of maintaining the form of the
classical dynamical equations while reinterpreting the kinematical ingredients of
these equations as non-commuting matrix quantities. But
Jordan himself regarded the result as having far greater significance, insofar as it
pointed the way to a general procedure for extending the principles of quantum
mechanics to field systems with infinitely many degrees of freedom, including, as
Jordan was to show a few years later, systems of particles satisfying Fermi statistics.
Jordan was later (in 1962, in a comment to van der Waerden) to refer to his fluctuation
calculation in the 3M paper as “almost the most important contribution I ever made
to quantum mechanics.”
The negative reaction of Jordan’s contemporaries—including his co-authors on the
3M paper!—to the fluctuation calculation is of interest in its own right. Heisenberg
seems to have been worried about potential divergences, and later (in 1930) was to
publish a paper (Heisenberg, 1931) showing that the mean-square fluctuation in a
subvolume of the cavity is in fact infinite if one considers the electromagnetic energy
integrated over all frequencies. Although mathematically correct, this calculation is
irrelevant as a criticism of Jordan’s result in the 3M paper, which explicitly considers
the energy fluctuations in a finite frequency interval, which are perfectly finite, as we
have seen. It is also irrelevant from a physical point of view: the isolation of energy
in a subvolume requires the introduction of enclosing physical filters of small but
necessarily finite thickness, and the structure of these filters will eventually be resolved
if we consider photons of arbitrarily high frequency and small wavelength. In fact,
Heisenberg found that if the walls of the enclosing subvolume are smeared out (for
example, if we replace the θ-function θ(a − x) implementing the restriction of the
range of the integral (1.42) by a smooth function interpolating between 0 and 1),
the integral over frequencies can be extended to infinity with a finite result for the
mean-square energy fluctuation. Born was initially less vocal in opposition to Jordan’s
result, but later (in 1939), in exile in Edinburgh, published in collaboration with his
assistant Klaus Fuchs (later famous after being discovered as a Soviet spy!) a paper
(Born and Fuchs, 1939a,b) essentially retracting the entire calculation. The retraction
had itself in short order to be retracted when serious technical errors were discovered
by Pauli’s assistant Markus Fierz.10
10 For a detailed discussion of the reception of the Jordan fluctuation calculation, see the previously cited
work of Duncan and Janssen (Duncan and Janssen, 2008).
would finally emerge by mid-century. The second task, also beginning in 1927, was
taken up with great intensity and focus by Jordan and collaborators: the extension of
the notion of field quantization to the treatment of matter fields, in particular fields
with elementary excitations of fermionic character, which in consequence could never
possess a classical counterpart analogous to the electromagnetic field of Maxwell. Our
review of the historical evolution of quantum field theoretical concepts continues in
the next chapter with a discussion of these developments.
2 Origins II: Gestation and birth of interacting field theory: from Dirac to Shelter Island
The convoluted evolution of quantum field theory in the period from the emergence
of modern quantum mechanics in the late 1920s until the completion of the fully
covariant and renormalizable quantum electrodynamics of the early 1950s seems at first
sight a reprise of the extended birth pangs of quantum mechanics itself in the period
from Planck’s quantization of the distribution of energy among thermally equilibrated
oscillators in 1900 to the development of matrix mechanics by Heisenberg in 1925. In
both cases, and in contrast to the development of special and general relativity by
Einstein, progress (often slow and halting) was due to the efforts of many physicists
working along several lines of enquiry, with each new insight often opening up new
questions and new difficulties.
However, at least in hindsight, it is apparent that there is a considerable difference
in the intellectual background of the two efforts. The physicists struggling with the
development of quantum mechanics in the first quarter of the twentieth century were
faced with the need to construct an entirely novel, and in many respects completely
counter-intuitive, type of physical theory, in which many of the basic concepts of classi-
cal physics seemed no longer applicable. Indeed, adherence to these concepts frequently
obstructed rather than assisted the understanding of the complex of microscopic
phenomena steadily being uncovered on the experimental front. As with relativity,
and even more radically, the new theory seemed to demand the demolition of some of
the most deeply held presuppositions of classical physics, and it was totally unclear,
almost to the very end (i.e., the Heisenberg–Schrödinger revolution of 1925–26), what
nexus of consistently interrelated concepts could replace them.
By contrast, at least from 1930 on, the physical requirements and conceptual struc-
ture needed for an adequate quantum theory of fields were fairly clear: the quantum
mechanical substructure needed to follow a clear set of by now well-established rules,
and the resultant theory should obviously respect the precepts of special relativity,
yielding transition probabilities indifferent to one’s choice of inertial frame. In a way,
the fact that the quantum mechanical and relativistic foundations needed for an
adequate physical theory were completely clear by the early 1930s made the apparent
inconsistencies and frequently infinite results obtained in early calculations in quantum
electrodynamics even more frustrating for the leading theorists in the field, who (as
in the case of Heisenberg’s willingness to introduce a discretization of space and a
Introducing interactions: Dirac and the beginnings of quantum electrodynamics 31
universal length unit) often felt the need for radical modifications at a fundamental
level—modifications which we now understand to have been quite unnecessary.
A comprehensive (and comprehensible) account of the history of quantum field
theory from the Dreimännerarbeit of 1925 to Dyson’s renormalization analysis of 1949,
effectively completing the formal framework of perturbative quantum electrodynamics
(QED), would require an entire treatise.1 In this chapter, constraints of space will
limit us to a highly selective account of some of the major breakthroughs along the
way. Many important and interesting contributions made to the early development
of QED will be passed over in silence. The papers discussed are primarily those
which had a definitive impact in (a) the development of the formal structure of
quantum field theory, and (b) uncovering and (partially) resolving the conceptual
difficulties occasioned by the lack of explicit relativistic covariance and by the
appearance of ultraviolet divergencies in early calculations of quantum electrodynamic
processes.
1 Indeed, the subject has been tackled already, in the excellent book by Schweber (Schweber, 1994),
which the reader is encouraged to consult for more extensive details on many of the developments discussed
below. See also the book by Miller (Miller, 1994), containing a number of the original papers in translation.
The problem addressed by Dirac in (Dirac, 1927b) was that of determining the
“perturbation of an assembly satisfying the Einstein–Bose statistics” due to the
interaction of the assembly with an atomic system (in fact, with an electron in such
a system). The terminology here seems somewhat strange from a modern point of
view: Dirac refers frequently to the set of independent modes of the non-interacting
electromagnetic field (say, quantized in a box of finite volume so that the allowed
modes form a discrete set, as in the Jordan calculation in the 3M paper) as an
“assembly of independent systems”. He then sets about to write a Hamiltonian for
the electromagnetic field in terms of these mode variables—one sufficiently general
to accommodate both the free electromagnetic field and the possible perturbation
which might be induced by inserting an atomic system (described with the usual
non-relativistic quantum formalism) into the field.
At this point Dirac introduces two technical devices which would become central
features of quantum field theory: firstly, the use of a canonical transformation from the
$(p_r, q_r)$ type variables (such as those employed by Jordan in describing the individual
harmonic oscillator systems for the rth field mode; cf. Section 1.5) to amplitude ($\sqrt{N_r}$)
and phase ($\theta_r$) variables, and thence to destruction $b_r$ and creation $b_r^\dagger$ operators which
lower and raise, respectively, by one the number of photons associated with the rth
mode. The use of such variables actually goes back to a paper by London (London,
1926),2 where the harmonic oscillator spectrum (and eigenfunctions) are derived using
this technique.
Secondly, he employs for the first time the interaction picture of time development
wherein the operators of the theory evolve via time-dependent unitary transformations
due only to the free part of the Hamiltonian. The Hamiltonian introduced by Dirac
for his Einstein–Bose “assembly” (i.e., the electromagnetic field) takes the form
$$H = \sum_{r,s} b_r^\dagger H_{rs} b_s, \qquad H_{rs} = W_r\delta_{rs} + v_{rs} \qquad (2.1)$$
where Wr is the energy of a free photon in the rth mode (namely, Wr = hνr ), and
the vrs form a matrix representing the effect of a perturbation (for example, due to
the presence of an atomic electron in some atomic state) on the electromagnetic field.
The operators br are defined in terms of the aforesaid conjugate amplitude and angle
variables3
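$$b_r = e^{-i\theta_r/\hbar}\sqrt{N_r}, \qquad [N_r, \theta_s] = -i\hbar\,\delta_{rs} \qquad (2.2)$$
whence one finds, using the representation $\theta_r = i\hbar\,\partial/\partial N_r$, that
$e^{-i\theta_r/\hbar} = e^{\partial/\partial N_r}$. Thus, $e^{\partial/\partial N_r} f(N_r) = f(N_r + 1)\,e^{\partial/\partial N_r}$, and we find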
2 See (Duncan and Janssen, 2009, p. 358) for a discussion of this important but not well known paper.
3 The proper definition of a well-defined phase operator involves some subtle mathematical problems
which at that time were not appreciated by Dirac, and which will be addressed in detail in Section 8.2.
Here, we note simply that Dirac’s results can be obtained without relying on the ill-defined phase operators
θr , but purely in terms of the algebra of creation b†r and annihilation br operators, which are perfectly well
defined.
$$b_r = e^{-i\theta_r/\hbar}\sqrt{N_r} = \sqrt{N_r + 1}\,e^{-i\theta_r/\hbar} \qquad (2.3)$$
$$b_r^\dagger = \sqrt{N_r}\,e^{i\theta_r/\hbar} = e^{i\theta_r/\hbar}\sqrt{N_r + 1} \qquad (2.4)$$
$$N_r = b_r^\dagger b_r \qquad (2.5)$$
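The two orderings in (2.3) and the identity (2.5) can be checked on a truncated occupation-number basis. The Python sketch below is only an illustration under stated assumptions: the truncation `N_MAX` and all helper names are hypothetical, and the exponential of the phase operator is modeled simply as a shift of the occupation number by one unit, in line with the representation $e^{-i\theta_r/\hbar} = e^{\partial/\partial N_r}$ used above.

```python
import math

N_MAX = 6  # hypothetical truncation of the occupation-number basis


def basis(n):
    """Amplitude list for the number eigenstate |n>, 0 <= n <= N_MAX."""
    return [1.0 if k == n else 0.0 for k in range(N_MAX + 1)]


def lower_shift(state):
    """Action of exp(-i theta/hbar): psi(n) -> psi(n + 1), i.e. |n> -> |n-1>."""
    return state[1:] + [0.0]


def raise_shift(state):
    """Action of exp(+i theta/hbar): |n> -> |n+1> (top state is truncated away)."""
    return [0.0] + state[:-1]


def apply_function_of_n(func, state):
    """Multiply each amplitude by func(n), for operators diagonal in n."""
    return [func(k) * amp for k, amp in enumerate(state)]


def destroy(state):
    """b = exp(-i theta/hbar) sqrt(N), the first form in (2.3)."""
    return lower_shift(apply_function_of_n(math.sqrt, state))


def destroy_alt(state):
    """b = sqrt(N + 1) exp(-i theta/hbar), the second form in (2.3)."""
    return apply_function_of_n(lambda k: math.sqrt(k + 1), lower_shift(state))


def create(state):
    """b^dagger = sqrt(N) exp(+i theta/hbar), the first form in (2.4)."""
    return apply_function_of_n(math.sqrt, raise_shift(state))


def close(u, v):
    """Component-wise comparison of two amplitude lists."""
    return len(u) == len(v) and all(
        math.isclose(x, y, abs_tol=1e-12) for x, y in zip(u, v)
    )


for n in range(1, N_MAX + 1):
    # Both orderings in (2.3) give sqrt(n) |n-1>
    assert close(destroy(basis(n)), destroy_alt(basis(n)))
    # (2.5): b^dagger b |n> = n |n>
    assert close(create(destroy(basis(n))), [n * x for x in basis(n)])
```

The shift picture sidesteps the ill-defined phase operator itself: only the number-changing action of its exponential is used, which is all the algebra of $b_r$ and $b_r^\dagger$ requires.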
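$$
\begin{aligned}
i\hbar\frac{\partial}{\partial t}\psi(n_1, n_2, .., n_r, ..; t) &= \langle n_1, n_2, ...|H|\psi; t\rangle \\
&= \sum_{r,s} H_{rs}\sqrt{n_r}\sqrt{n_s + 1 - \delta_{rs}}\;\psi(n_1, ..., n_r - 1, ..., n_s + 1, ...; t)
\end{aligned}
\qquad (2.8)
$$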
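The coefficient appearing in (2.8) follows from letting $b_r^\dagger b_s$ act on the state with occupation numbers $(..., n_r - 1, ..., n_s + 1, ...)$. A minimal Python sketch of that bookkeeping is given below; the three-mode occupation numbers and all function names are hypothetical, introduced only for illustration.

```python
import math


def apply_destroy(s, occ):
    """b_s |occ> = sqrt(n_s) |..., n_s - 1, ...>; returns (factor, new_occ)."""
    if occ[s] == 0:
        return 0.0, None
    new_occ = list(occ)
    new_occ[s] -= 1
    return math.sqrt(occ[s]), tuple(new_occ)


def apply_create(r, occ):
    """b_r^dagger |occ> = sqrt(n_r + 1) |..., n_r + 1, ...>."""
    new_occ = list(occ)
    new_occ[r] += 1
    return math.sqrt(occ[r] + 1), tuple(new_occ)


def apply_hop(r, s, occ):
    """b_r^dagger b_s |occ>: moves one quantum from mode s to mode r."""
    f_s, mid = apply_destroy(s, occ)
    if mid is None:
        return 0.0, None
    f_r, final = apply_create(r, mid)
    return f_r * f_s, final


def factor_2_8(r, s, n):
    """Coefficient sqrt(n_r) sqrt(n_s + 1 - delta_rs) appearing in (2.8)."""
    delta = 1 if r == s else 0
    return math.sqrt(n[r]) * math.sqrt(n[s] + 1 - delta)


def source_state(r, s, n):
    """Occupation numbers (..., n_r - 1, ..., n_s + 1, ...) on the right of (2.8)."""
    m = list(n)
    m[r] -= 1
    m[s] += 1
    return tuple(m)


n = (2, 0, 3)  # hypothetical three-mode occupation numbers
for r in range(len(n)):
    for s in range(len(n)):
        if n[r] == 0:
            continue  # no contribution: the source state would need n_r - 1 < 0
        start = source_state(r, s, n)
        factor, final = apply_hop(r, s, start)
        assert final == n
        assert math.isclose(factor, factor_2_8(r, s, n))
        assert sum(final) == sum(start)  # total number of quanta is unchanged
```

The last assertion makes explicit that every term of a Hamiltonian bilinear in $b_r^\dagger$ and $b_s$ leaves the total occupation number unchanged.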
The bilinear (in creation and destruction operators) form (2.1) chosen by Dirac for
the Hamiltonian of the electromagnetic field means, of course, that the number of
photons is necessarily conserved: the perhaps most characteristic feature of quantum
field theories—particle creation and/or annihilation—is completely missing in this first
attempt at an interacting theory of photons! Indeed, each destruction operator bs is
accompanied by a creation operator b†r , so the time evolution of the system under
the Hamiltonian (2.1) amounts simply to the continual transitioning of photons from
one mode to another. This proves to be something of an embarrassment when Dirac
addresses the basic object of the paper: the calculation of probabilities for the emission
or absorption of photons by an electron (in a bound state of an atom). This difficulty
is finessed by assuming that the “disappearance” of a photon in a pure absorption
process really corresponds to the transition of the photon from a finite-energy (and
hence detectable) state to a zero energy (and hence undetectable) state, which is
possible, of course, for a massless particle with zero momentum. Indeed, Dirac assumes
with α a complex number with modulus going to infinity if (as Dirac assumes) there
are infinitely many zero-energy photons present. Dirac assumes that the terms in H
in (2.1) with r ≠ 0, s = 0 (pure emission events) then lead (via vanishingly small $v_{r0}$
coefficients) to finite matrix elements by setting the limit (for infinitely many
zero-energy photons)
which then implies a correspondingly finite amplitude $v_{0s}\alpha^* = v_s^*$ for a pure absorption
event, in which a finite-energy photon in mode s is transferred to the zero-energy
reservoir. The net effect of these shenanigans is the appearance of a term (necessarily
hermitian, given the hermiticity of the original interaction matrix vrs ) linear in the
4 In modern terminology, Dirac was imagining a Bose–Einstein condensate of zero-energy photons. While
this idea is physically incorrect in the present context, Dirac’s intuition of the physical imperceptibility of
very-low-energy photons is a critical component in a proper understanding of the problem of infrared
divergences in scattering amplitudes in quantum electrodynamics, as we shall see in Section 19.2.
5 Strictly speaking, this is not what Dirac does. He assumes a definite number $N_0$ of zero-energy photons,
in which case the matrix elements of the Hamiltonian between initial and final states would necessarily
vanish! We are “fixing up” his argument here to yield the desired result, which he could, of course,
have obtained directly by assuming from the outset an hermitian term linear in creation and destruction
operators. Dirac was perfectly aware that the desired interaction Hamiltonian was necessarily linear in the
electromagnetic vector potential and hence in the amplitude variable $\sqrt{N_r}$, thence also in the
creation–destruction variables, as he shows in the final Section 7 of his paper (see below).
creation and destruction operators, and capable of initiating the desired pure emission
and absorption events:
$$H_{\rm lin} = \sum_r \left( v_r\, b_r^\dagger + v_r^*\, b_r \right)$$
For an interaction term of linear type, the Einstein results for the A and B coeffi-
cients describing spontaneous emission and absorption (Einstein, 1916, 1917) follow
immediately once we re-express Hlin in terms of occupation number/phase variables:
$$H_{\rm lin} = \sum_r \left( v_r\, e^{i\theta_r/\hbar}\sqrt{N_r + 1} + v_r^*\, e^{-i\theta_r/\hbar}\sqrt{N_r} \right) \qquad (2.15)$$
Recall (cf. (2.3, 2.4)) that the operators $e^{i\theta_r/\hbar}$ (resp. $e^{-i\theta_r/\hbar}$) increase (resp.
decrease) the associated mode number eigenvalue $n_r$ by one and hence effect the emission
(resp. absorption) of a photon in the rth mode; the corresponding probabilities are
therefore proportional to $n_r + 1$ (resp. $n_r$), with $n_r$ the initial-state occupation number. Note
that the same factor $|v_r|^2$ appears in both probabilities (in Einstein’s language, this
is the equality of the coefficients for absorption and induced emission): it involves an
atomic state matrix element which Dirac will identify once the interaction Hamiltonian
is fully specified in terms of electron and electromagnetic field quantities. In any event,
with the usual mode counting of classical plane wave modes in a box of volume V,
giving $\frac{V}{c^3}\nu^2\,d\nu\,d\Omega$ modes of a given polarization into solid angle dΩ in the frequency
interval dν, one finds for the total energy of the radiation in this interval (assuming
the occupation numbers $n_r$ smoothly varying),
$$n_r \cdot h\nu_r \cdot \frac{V}{c^3}\nu_r^2\,d\nu_r\,d\Omega \equiv \frac{V}{c}\,I(\nu_r)\,d\nu_r\,d\Omega \;\Rightarrow\; n_r = \frac{c^2}{h\nu_r^3}\,I(\nu_r) \qquad (2.16)$$
where the specific intensity of the ambient radiation (in the initial state) I(νr )
corresponds to the radiative flux in the particular frequency interval and solid angle
(with flux defined as usual as the energy density times the speed of light). Thus, if
the absorption rate is proportional to $n_r$ and hence to $I(\nu_r)$, the emission rate is
correspondingly proportional to $I(\nu_r) + h\nu_r^3/c^2$, with the first and second terms
corresponding respectively to induced and spontaneous emission (the latter present even
in the absence of ambient radiation). These are Einstein’s laws for the emission and
absorption of radiation, in the form presented by Dirac.
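The relation (2.16) and the resulting emission-to-absorption ratio can be illustrated numerically. In the Python sketch below, the black-body form of the specific intensity (per polarization) is an assumption introduced only to have a concrete I(ν) to work with, and the frequency and temperature are hypothetical values; with that choice the ratio of emission to absorption rates reduces to the Boltzmann factor exp(hν/kT).

```python
import math

H = 6.62607015e-34   # Planck constant in J s
C = 2.99792458e8     # speed of light in m/s
K_B = 1.380649e-23   # Boltzmann constant in J/K


def occupation_from_intensity(intensity, nu):
    """n_r = c^2 I(nu) / (h nu^3), the inversion stated in (2.16)."""
    return C**2 * intensity / (H * nu**3)


def planck_intensity(nu, temperature):
    """Assumed black-body intensity per polarization:
    (h nu^3 / c^2) / (exp(h nu / k T) - 1)."""
    return (H * nu**3 / C**2) / math.expm1(H * nu / (K_B * temperature))


def emission_to_absorption_ratio(intensity, nu):
    """(n_r + 1) / n_r = (I + h nu^3 / c^2) / I."""
    return (intensity + H * nu**3 / C**2) / intensity


# Hypothetical illustrative inputs
nu, temperature = 5.0e14, 5800.0

intensity = planck_intensity(nu, temperature)
n_r = occupation_from_intensity(intensity, nu)
ratio = emission_to_absorption_ratio(intensity, nu)
boltzmann = math.exp(H * nu / (K_B * temperature))

print(n_r, ratio, boltzmann)
```

For a small occupation number the ratio is large, which is the regime where the spontaneous term $h\nu_r^3/c^2$ dominates over the induced term.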
The remaining task facing Dirac was to go beyond the results of Einstein by
providing a complete route to a calculation of the absolute (rather than relative)
absorption and emission (spontaneous and induced) rates for specified transitions of
an electron between distinct atomic stationary states, thereby putting on a firm basis
(at last!) Heisenberg’s basic, but as yet purely hypothetical, intuition in his seminal
Umdeutung paper connecting the transition amplitudes for radiative processes in
atoms to the matrix elements of the electron’s coordinate operator. Dirac accomplishes
this, in the final section of (Dirac, 1927b), by a correspondence principle argument
identical in spirit to those used by Kramers, Born, and Heisenberg in the dispersion
theory precursors to matrix mechanics. We shall briefly reprise the argument here, in
modern notation.
We begin with the Fourier expansion for the classical vector potential in radiation
(i.e., Coulomb) gauge, ∇ ·A = 0, in a finite box of volume V ,
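$$\vec{A}(\vec{r}, t) = \frac{1}{\sqrt{V}}\sum_{\vec{k},\lambda}\frac{1}{\sqrt{2\omega_k}}\left( a_{\vec{k},\lambda}\,\vec{\epsilon}_{\vec{k},\lambda}\, e^{i\vec{k}\cdot\vec{r} - i\omega_k t} + \text{c.c.} \right) \qquad (2.17)$$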
where we are using natural units (so ħ = c = 1), the polarization index λ takes two
values, with a transverse polarization vector $\vec{k}\cdot\vec{\epsilon}_{\vec{k},\lambda} = 0$
$\vec{\epsilon}_{\vec{k},\lambda}\cdot\vec{p}\; e^{-i\vec{k}\cdot\vec{r}}$ for the mode r corresponding
to wavevector k and polarization λ. In the electric dipole approximation the wavelength
$k^{-1}$ of the photon is assumed much larger than atomic dimensions, and the exponential
can be replaced by unity. The matrix element of this interaction operator between
distinct initial and final atomic states $|i\rangle$, $|f\rangle$ thus involves matrix elements of the
component of the electron momentum in the direction of the photon polarization,
$\langle f|\vec{\epsilon}_{\vec{k},\lambda}\cdot\vec{p}\,|i\rangle$, multiplied by the dependence on the photon occupation numbers of the
electromagnetic field found earlier by considering matrix elements of the $b_r$ and $b_r^\dagger$
photon destruction and creation operators.6 Dirac shows that these results agree in
detail with the matrix-mechanical expressions for the Einstein A and B coefficients.
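To give a feeling for how well the electric dipole approximation holds, the following Python sketch compares $k^{-1}$ with a typical atomic dimension. The choice of the hydrogen Lyman-alpha line and of the Bohr radius as the atomic scale is an assumption made purely for illustration; the text itself does not single out a transition.

```python
import cmath
import math

BOHR_RADIUS = 5.29177e-11        # m, taken here as the atomic length scale
LYMAN_ALPHA_WAVELENGTH = 1.21567e-7  # m, illustrative optical transition

# Wavenumber k = 2 pi / lambda, so k^{-1} is the reduced wavelength
k = 2 * math.pi / LYMAN_ALPHA_WAVELENGTH
k_times_size = k * BOHR_RADIUS

# Deviation of exp(-i k r) from unity at r of the order of the Bohr radius
phase_factor = cmath.exp(-1j * k_times_size)
deviation = abs(phase_factor - 1)

print(k_times_size, deviation)
```

With these inputs both printed numbers are of order a few times 10⁻³, so replacing the exponential by unity neglects corrections at the sub-percent level in the amplitude.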
Dirac was perfectly aware that his fully quantum-mechanical discussion of the
interaction of electrons with the electromagnetic field suffered from a serious short-
coming: the continuing treatment of the electrons as non-relativistic particles, which in
particular made a proper derivation of the relativistic fine-structure effects of electrons
in atomic bound states impossible. Initial attempts to produce a fully relativistic
wave equation proceeded by replacing the non-relativistic (free-particle) Schrödinger
equation
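$$i\hbar\frac{\partial}{\partial t}\psi(\vec{r}, t) = -\frac{\hbar^2}{2m}\nabla^2\psi(\vec{r}, t) \qquad (2.20)$$
implementing (via the standard associations $H \to i\hbar\,\partial/\partial t$, $\vec{p} \to \frac{\hbar}{i}\vec{\nabla}$) the non-relativistic
energy-momentum relation $H = \vec{p}^{\,2}/2m$, with the fully relativistic Klein–Gordon equation
(we take c=1)7
$$\left\{\hbar^2\left(\frac{\partial^2}{\partial t^2} - \nabla^2\right) + m^2\right\}\psi(\vec{r}, t) = 0 \qquad (2.21)$$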
incorporating the correct relativistic relation $H^2 = \vec{p}^{\,2} + m^2$. These attempts having
failed in the archetypal test case of the hydrogen atom spectrum, Dirac tackled the
problem of the coupling of a relativistic electron to the electromagnetic radiation in a
seminal paper (Dirac, 1928) which settled once and for all the kinematical aspects of
the relativistic treatment of spin-1/2 particles. The incorporation of an electromagnetic
coupling in (2.21), by the usual minimal coupling replacement $\vec{p} \to \vec{p} - e\vec{A}$, $H \to H - eA_0$
(with $\vec{A}$, $A_0$ the vector and scalar potentials respectively) led to a wave-equation
for the electron with two immediate, and serious, drawbacks:
1. The presence of a second-order derivative in time ran counter to the fundamental
quantum-mechanical principle that the state of a quantum system at any time
is determined solely by knowledge of its state at any given earlier time (unlike
the situation in the configuration space formulation of classical mechanics, say,
where both coordinates and velocities needed to be specified at an initial time to
allow the subsequent time evolution to be computed). All of the highly successful
quantum transformation theory developed up to this point, primarily by Dirac
6 The matrix elements of the electron momentum operator can be converted to matrix elements of the
coordinate operator by a simple commutator trick; see (Baym, 1990), chapter 13.
7 We note for historical accuracy, that Schrödinger had actually tried the relativistic Klein–Gordon
equation first in his treatment of the hydrogen atom, only to discover to his dismay that it led to incorrect
fine-structure predictions.
(Dirac, 1927a) and Jordan (Jordan, 1927a,b), was predicated on this assumption,
which clearly required the dynamical evolution equation of a quantum system to
be first order in time.
2. Secondly, the second-order time-derivative had the highly unpleasant ancillary
effect of introducing solutions with both positive and negative energy (with
the latter, in the minimally coupled case, corresponding to particles of opposite
charge to those of positive energy). This is due simply to the fact that the equation
(2.21) determines only the square of the energy, but not its sign.
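The sign ambiguity described in the second point can be made concrete with a plane wave. The Python sketch below works in one spatial dimension with ħ = c = 1 (both simplifying assumptions), applies the Klein–Gordon operator of (2.21) by central finite differences, and evaluates the residual for several trial energies; the mass, momentum, step size, and function names are hypothetical.

```python
import cmath
import math


def plane_wave(energy, momentum):
    """psi(x, t) = exp(i (p x - E t)) in units with hbar = c = 1."""
    return lambda x, t: cmath.exp(1j * (momentum * x - energy * t))


def second_derivative(f, u, step=1e-4):
    """Central finite-difference estimate of f''(u)."""
    return (f(u + step) - 2 * f(u) + f(u - step)) / step**2


def klein_gordon_residual(psi, x, t, mass):
    """Value of (d^2/dt^2 - d^2/dx^2 + m^2) psi at (x, t); cf. (2.21)."""
    d2t = second_derivative(lambda tt: psi(x, tt), t)
    d2x = second_derivative(lambda xx: psi(xx, t), x)
    return d2t - d2x + mass**2 * psi(x, t)


mass, momentum = 1.0, 0.5          # hypothetical illustrative values
energy = math.sqrt(momentum**2 + mass**2)

residual_positive = abs(klein_gordon_residual(plane_wave(+energy, momentum), 0.3, 0.7, mass))
residual_negative = abs(klein_gordon_residual(plane_wave(-energy, momentum), 0.3, 0.7, mass))
residual_wrong = abs(klein_gordon_residual(plane_wave(momentum**2 / (2 * mass), momentum), 0.3, 0.7, mass))

print(residual_positive, residual_negative, residual_wrong)
```

Both signs of the energy give a residual consistent with zero up to discretization error, while the non-relativistic energy p²/2m does not, reflecting the fact that the second-order equation fixes only the square of the energy.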
Dirac showed that it was possible to solve the first of these problems, and obtain a
fully relativistic equation first order in the time-derivative, with the highly desirable
byproduct of providing a natural explanation of the hitherto mysterious double-
valuedness (rather quaintly termed “duplexity” by Dirac) of electron states, due to the
intrinsic spin. However, as Dirac frankly admitted, the resultant equation, involving a
four-component wavefunction (in other words, twice the desired “duplexity” account-
ing for spin), was still plagued with the unwanted appearance of negative-energy,
opposite-charge solutions. The desired relativistic equation for the free particle case,
if linear in the time-derivative (hence in the energy operator $p_0 \equiv H = i\hbar\,\partial/\partial t$), must
necessarily be linear also in the spatial momentum operators $\vec{p} = -i\hbar\vec{\nabla}$, and Dirac
showed that the simplest possibility, unique up to obvious similarity transformations,
was obtained by setting
{γμ pμ − m}ψ(r, t) = 0 (2.22)
where the γμ , μ = 0, 1, 2, 3 were algebraic objects satisfying the anticommutation
algebra8
8 Our notation vis-à-vis the Dirac equation and algebra differs slightly from Dirac’s, in accord with
modern usage.
with the energy term necessarily produced two +1 and two −1 eigenvalues on
diagonalization.
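In modern notation the anticommutation algebra of the γ matrices is usually written $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$. The Python sketch below checks this relation for one concrete choice, the standard Dirac representation with metric signature (+, −, −, −); both the representation and the signature are assumptions made for illustration, since other equivalent choices exist. It also reads off the diagonal entries of $\gamma^0$, which in this representation exhibit the two +1 and two −1 eigenvalues.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker product of two square matrices."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def anticommutator(a, b):
    """{A, B} = AB + BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(row_ab, row_ba)]
            for row_ab, row_ba in zip(ab, ba)]


I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],      # sigma_x
    [[0, -1j], [1j, 0]],   # sigma_y
    [[1, 0], [0, -1]],     # sigma_z
]
I_SIGMA_Y = [[0, 1], [-1, 0]]  # i * sigma_y

# Dirac representation (assumed): gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [kron(SIGMA[2], I2)] + [kron(I_SIGMA_Y, s) for s in SIGMA]
METRIC = [1, -1, -1, -1]  # assumed signature (+, -, -, -)


def satisfies_clifford_algebra(gammas, metric):
    """True if {gamma^mu, gamma^nu} = 2 g^{mu nu} times the identity."""
    size = len(gammas[0])
    for mu, g_mu in enumerate(gammas):
        for nu, g_nu in enumerate(gammas):
            expected = 2 * metric[mu] if mu == nu else 0
            anti = anticommutator(g_mu, g_nu)
            for i in range(size):
                for j in range(size):
                    if anti[i][j] != (expected if i == j else 0):
                        return False
    return True


gamma0_diagonal = [GAMMA[0][i][i] for i in range(4)]
print(satisfies_clifford_algebra(GAMMA, METRIC), gamma0_diagonal)
```

All matrix entries here are small Gaussian integers, so the comparisons are exact and no floating-point tolerance is needed.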
The successful application of the Dirac equation (with the coupling of the electron
to the electromagnetic field accomplished via the usual minimal coupling procedure of
replacing $p_\mu \to p_\mu - \frac{e}{c}A_\mu$ in the free particle equation (2.22)) to the relativistic
fine-structure of the hydrogen atom, and the fact that the new theory automatically yielded
the correct, and previously utterly mysterious, gyromagnetic ratio of 2 for the electron,
led to the immediate acceptance of the Dirac equation as the correct foundation for a
fully relativistic treatment of the interaction of electrons and photons. However, the
conceptual wave-mechanical framework in which the equation was conceived and born
was to prove a stubborn hindrance to the early acceptance of a fully second-quantized
formalism for matter fields such as the electron.
The irony is that Dirac, having pioneered the application of a second-quantized
formalism (with creation and annihilation operators at the center) for dealing with
the electromagnetic field, continued to insist on the use of first-quantization ideas for
the electron. In particular, multi-electron states would be described relativistically in
the same way as one was now accustomed to treat them in non-relativistic contexts,
such as in multi-electron atoms, using wavefunctions defined on a 3N -dimensional
coordinate space (for N electrons) and with interactions handled approximately via
Hartree–Fock mean field techniques.
The absence of disastrous instabilities incurred by the unavoidable transitions
between positive- and negative-energy states once the Dirac electron was coupled to
the electromagnetic field was legislated by fiat, by the assertion that the physical
“vacuum” (i.e., no-particle) state actually consisted in having all negative-energy
electron states filled, so that the normal positive electron states were simply (by
the Pauli exclusion principle) denied the opportunity of dropping down to any of
the infinitely many otherwise available negative-energy states (with the release of
electromagnetic gamma-radiation of energy ≥ 2mc²). The absence of a negative-energy
electron was then interpreted as the presence of a positive-energy particle of positive
charge, interpreted first by Weyl and Dirac as the proton, but as difficulties arose with
this proposal, as a (so far) unseen particle of equal mass and opposite charge to the
electron.
This approach was soon confirmed by Anderson’s experimental discovery of the
positron in 1932. However, the idea of an unobservable “filled sea” of negative-energy
electrons—in some sense the conceptual progeny of Dirac’s earlier idea, discussed
above, of a sea of zero-energy photons—would prove to be an extremely persistent
distraction during the 1930s, leading to a vast amount of ultimately unprofitable,
and extremely complicated, formalism, before the whole rickety framework9 could be
thrown overboard and replaced by a conceptually clean and technically efficient Fock
9 Among many other artificialities, the negative-sea idea required the introduction of mathematically
murky subtractions in the operators representing the total energy and charge of the system in order to
implement the absence of energy and charge in the vacuum state—the dubious character of the necessary
subtractions being considerably amplified by the assumption that the infinitely many occupied negative-
energy electron states were interacting and the corresponding multi-particle wavefunction therefore had
to be treated by extremely unconvincing Hartree–Fock approximation techniques incorporating, at least
roughly, the interactions of the occupied negative-energy electrons.
space (i.e., occupation number) formalism in which positive-energy electron states and
“negative-energy hole” states (i.e., positive-energy positron states) could be treated in
a completely symmetric way. This required that the ideas of second-quantization, fully
accepted for radiation after Dirac’s 1927 paper, should be applied with equal force to
matter (i.e., fermionic) fields. The unquestionable leader in this point of view, as we
shall now see, was Pascual Jordan.
applied external fields. At this point, however, the incorporation of Fermi statistics in
a second-quantized formalism was not yet fully understood (a deficit soon to be erased
in the paper of Jordan and Wigner discussed below), so the particles are assumed to
obey bosonic statistics. By writing the Hamiltonian operator of such a system (we
assume that no external fields are present to simplify slightly the formalism) in the
form
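$$H = \frac{\hbar^2}{2\mu}\int \vec{\nabla}\phi^\dagger(\vec{r}, t)\cdot\vec{\nabla}\phi(\vec{r}, t)\,d^3r + \frac{e^2}{2}\int \frac{:\phi^\dagger(\vec{r}, t)\phi(\vec{r}, t)\phi^\dagger(\vec{r}\,', t)\phi(\vec{r}\,', t):}{|\vec{r} - \vec{r}\,'|}\,d^3r\,d^3r' \qquad (2.25)$$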
with the field operator expanded in discrete plane wave modes (say, by quantizing the
system in a box of volume V )
$$\phi(\vec{r}, t) = \frac{1}{\sqrt{V}}\sum_{\vec{k}} e^{i\vec{k}\cdot\vec{r}}\, b_{\vec{k}}(t) \qquad (2.26)$$
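and where the : . . . : notation appearing in (2.25) embodies the instruction10 to move
all conjugate operators $b_{\vec{k}}^\dagger$ to the left of the $b_{\vec{k}}$ when (2.26) is inserted into (2.25),
giving
$$H = \sum_{\vec{k}} E(\vec{k})\, b_{\vec{k}}^\dagger b_{\vec{k}} + \frac{e^2}{2}\sum_{\vec{k},\vec{k}',\vec{q},\vec{q}\,'} A(\vec{k}'\vec{q}\,'|\vec{k}\vec{q})\, b_{\vec{k}}^\dagger b_{\vec{q}}^\dagger\, b_{\vec{k}'} b_{\vec{q}\,'} \qquad (2.27)$$
$$A(\vec{k}'\vec{q}\,'|\vec{k}\vec{q}) \equiv \frac{1}{V^2}\int d^3r\,d^3r'\,\frac{e^{-i(\vec{k} - \vec{k}')\cdot\vec{r} - i(\vec{q} - \vec{q}\,')\cdot\vec{r}\,'}}{|\vec{r} - \vec{r}\,'|} \qquad (2.28)$$
$$[b_{\vec{k}}(t), b_{\vec{k}'}^\dagger(t)] = \delta_{\vec{k}\vec{k}'} \qquad (2.29)$$
$$[b_{\vec{k}}(t), b_{\vec{k}'}(t)] = 0 = [b_{\vec{k}}^\dagger(t), b_{\vec{k}'}^\dagger(t)] \qquad (2.30)$$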
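The transition to the occupation number basis, and the interpretation of the b's and
b†'s as destruction and creation operators, is then made exactly as in Dirac’s work, via
the transcription (see (2.2, 2.3, 2.4, 2.5))
$$b_{\vec{k}} \to e^{-\frac{i}{\hbar}\theta_{\vec{k}}}\sqrt{N_{\vec{k}}}, \qquad b_{\vec{k}}^\dagger \to \sqrt{N_{\vec{k}}}\, e^{+\frac{i}{\hbar}\theta_{\vec{k}}} \qquad (2.31)$$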
10 This instruction, in modern terminology the “normal-ordering” of the Hamiltonian operator, is essential
to remove infinite contributions to the energy arising from the Coulomb self-energy of the individual
particles, leaving only the electrostatic interaction energy of charged particle pairs (see (Jordan and Klein,
1927), Section 3). In the tangled history of 1930s field theory, the normal-ordering instruction was frequently
referred to as the “Klein–Jordan trick”.
The algebra (2.29, 2.30) implies that the Hamiltonian equation of motion of the field
operator φ(r, t), expressed in coordinate space, amounts to the field equations
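$$i\hbar\frac{\partial}{\partial t}\phi(\vec{r}, t) = -\frac{\hbar^2}{2\mu}\Delta\phi(\vec{r}, t) + eV(\vec{r}, t)\phi(\vec{r}, t) \qquad (2.32)$$
$$\Delta V(\vec{r}, t) = -4\pi e\,\phi^\dagger\phi(\vec{r}, t) \qquad (2.33)$$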
One sees immediately that (2.32) is formally identical with the non-relativistic (time-
dependent) Schrödinger equation for a single particle of mass μ and charge e interact-
ing with a Coulomb potential V , if we interpret φ(r, t) as a c-number wavefunction. Eq.
(2.33) is then simply the Poisson equation giving the electrostatic potential in terms
of the particle probability density |φ(r, t)|². Jordan and Klein refer to the process
of replacing the c-number coordinate space Schrödinger wavefunction by a q-number
field operator as a “quantization of de Broglie waves”. This terminology would soon be
replaced by the denotation second quantization,11 expressing the fact that the original
c-number (or “first-quantized”) equations of wave mechanics would lead to a correct
treatment of multi-particle systems, including the proper treatment of the statistics
(i.e., bosonic or fermionic), by the simple expedient of quantizing (i.e., replacing by
operators satisfying suitable commutation relations) the expansion amplitudes of the
single-particle wavefunctions.
The form of the Hamiltonian (2.27), when expressed in terms of creation and
destruction operators, makes it clear that the theory (like Dirac’s first version of
the Hamiltonian for quantum electrodynamics) conserves particle number: each term
contains an equal number of b's and b†'s. Thus the time-dependent Schrödinger equation
acts independently in each sector of the state space corresponding to a given fixed
number of particles. Introducing a Schrödinger wavefunction for a given total number
of particles N (in analogy to (2.7)), Jordan and Klein show after a simple calculation
that the Hamiltonian (2.27) generates precisely the appropriate non-relativistic time-
dependent Schrödinger equation (analogous to (2.8)) for the given N -particle system.
Moreover, the Bose–Einstein symmetry of the multi-particle wavefunction is manifest
throughout.
The extension of the second-quantization procedure to systems of particles obeying
fermionic statistics followed within a few months of the Jordan–Klein paper, in an
article of Jordan and E. Wigner entitled “On the Pauli Exclusion Principle” (Jordan
and Wigner, 1928). The non-relativistic Schrödinger wave mechanical formalism for a
system of N fermions, wherein the non-relativistic Schrödinger Hamiltonian acts on
fully antisymmetric wavefunctions in the N -particle coordinate space, was shown to be
equivalent to a second-quantized version in which the Hamiltonian appears in the form
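(2.27)12 but with creation and destruction operators a†κ , aκ subject to anticommutation
relations13
\[ \{a_\kappa, a^\dagger_\lambda\} = \delta_{\kappa\lambda} \tag{2.34} \]
\[ \{a_\kappa, a_\lambda\} = \{a^\dagger_\kappa, a^\dagger_\lambda\} = 0 \tag{2.35} \]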
11 In an interview with Thomas Kuhn in June 1963 for the Archive for the History of Quantum Physics
(session 3, p. 9), Jordan claims to have been the first to employ this terminology. See (Duncan and Janssen,
2008, p. 642).
12 See (Jordan and Wigner, 1928), Eqs. (66a), (66b).
13 The anticommutator of two operators A and B is defined as {A, B} ≡ AB + BA.
Completing the formalism for free fields: Jordan, Klein, Wigner, Pauli, and Heisenberg 43
with the indices κ, λ labeling a complete set of single-particle states (which may
require both a continuous as well as discrete specification, with a corresponding
interpretation of the Kronecker δ as a Dirac δ-function). The vanishing of the square of
the creation operator a†κ (implied by (2.35), setting κ = λ) immediately incorporates
the Pauli exclusion principle denying the possibility of multiply occupied fermionic
states. However, the treatment of Jordan and Wigner is still entirely within the
framework of non-relativistic physics: their paper appeared almost simultaneously with
Dirac’s relativistic equation for the electron, the correct second-quantized formulation
of which, as we shall see, was as yet still several years in the future.
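The fermionic anticommutation relations, {aκ, a†λ} = δκλ with all other anticommutators vanishing, can be checked on explicit matrices. The following minimal Python sketch builds such matrices for a few modes using the tensor-product construction now associated with Jordan and Wigner. The number of modes, the basis ordering, and the helper names (kron, annihilator, and so on) are illustrative assumptions introduced here, not notation from the original paper.

```python
# Minimal sketch: explicit matrices for a few fermionic modes.
# All names are illustrative; integer arithmetic keeps the checks exact.

def kron(A, B):
    """Kronecker (tensor) product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def dagger(A):
    """Hermitian conjugate; the matrices here are real, so this is a transpose."""
    return [list(col) for col in zip(*A)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def zeros(n):
    return [[0] * n for _ in range(n)]

def anticommutator(A, B):
    return add(matmul(A, B), matmul(B, A))

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]      # sign string that enforces anticommutation between modes
LOWER = [[0, 1], [0, 0]]   # empties a single occupied mode

def annihilator(k, n_modes):
    """a_k = Z x ... x Z x LOWER x I x ... x I, with k leading factors of Z."""
    out = [[1]]
    for factor in [Z] * k + [LOWER] + [I2] * (n_modes - k - 1):
        out = kron(out, factor)
    return out

N_MODES = 3                # assumed small number of single-particle states
DIM = 2 ** N_MODES
a = [annihilator(k, N_MODES) for k in range(N_MODES)]

# {a_k, a_l^dagger} = delta_kl
delta_ok = all(
    anticommutator(a[k], dagger(a[l])) == (identity(DIM) if k == l else zeros(DIM))
    for k in range(N_MODES) for l in range(N_MODES))
# {a_k, a_l} = 0
zero_ok = all(anticommutator(a[k], a[l]) == zeros(DIM)
              for k in range(N_MODES) for l in range(N_MODES))
# (a_k^dagger)^2 = 0: no mode can be occupied twice
pauli_ok = all(matmul(dagger(a[k]), dagger(a[k])) == zeros(DIM)
               for k in range(N_MODES))

print(delta_ok, zero_ok, pauli_ok)
```

The last check is the matrix form of the exclusion principle: applying the same creation operator twice gives zero.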
The formal developments outlined so far, while of critical importance in establish-
ing the role of second quantization in enforcing the wave–particle connection while
maintaining the correct symmetry of multi-particle states, left the issue of relativistic
invariance essentially untouched. This shortcoming was remedied by the seminal paper
of Heisenberg and Pauli (Heisenberg and Pauli, 1929), which put in place the formalism
of Lagrangian field theory, still (eighty years later) at the core of modern field theory.
Heisenberg and Pauli begin with the canonical formalism of classical field theory,
where an action functional defined as the spacetime integral of the Lagrangian gives
rise to field equations via a variational principle:
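\[ \delta\int \mathcal{L}(\phi_\alpha, \vec\nabla\phi_\alpha, \dot\phi_\alpha)\,d^4x = 0 \;\Rightarrow\; \frac{\partial\mathcal{L}}{\partial\phi_\alpha} = \frac{\partial}{\partial x^\mu}\,\frac{\partial\mathcal{L}}{\partial\left(\frac{\partial\phi_\alpha}{\partial x^\mu}\right)} \tag{2.36} \]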
Here φα (xμ ) are a set of spacetime fields labeled by a discrete index α, and the
covariant form of the Euler–Lagrange equation on the right-hand side of (2.36) suggests
that, at least classically, relativistic invariance of the field equations will follow once
the Lagrangian L is chosen to be a Lorentz scalar functional. The transition to a
Hamiltonian framework is made by the standard procedure: one introduces canonically
conjugate “momentum” fields πα = ∂L/∂φ̇α,14 and sets the Hamiltonian density equal to
the Legendre transform of the Lagrangian density,
\[ \mathcal{H}(\phi_\alpha, \vec\nabla\phi_\alpha, \pi_\alpha) \equiv \sum_\alpha \pi_\alpha\dot\phi_\alpha - \mathcal{L} \tag{2.37} \]
Exactly as in classical point mechanics, one is then easily able to verify that, in the
absence of explicit time-dependence, the spatial integral of the density H is temporally
constant, and hence deserves the appellation “total energy of the system”. The fields
φα and πα at all spatial points but on the same time-slice are subjected to quantization
by the usual replacement of the classical Poisson bracket by commutators
14 Some notational warnings: Heisenberg and Pauli use Qα (resp. Pα) for φα (resp. πα) to emphasize
the analogy to Q, P variables in classical mechanics: the analogy is made even more complete by a lattice
formulation wherein the spatial continuum is discretized so that the x dependence also becomes discrete.
\[ [\pi_\alpha(\vec x,t), \phi_\beta(\vec y,t)] = \frac{\hbar}{i}\,\delta_{\alpha\beta}\,\delta^3(\vec x-\vec y) \tag{2.38} \]
with the desirable consequence that the Hamiltonian operator H defined as the
(conserved) spatial integral of the Hamiltonian density, H ≡ ∫ ℋ d³x, generates the
time evolution of any dynamical variable F (πα , φα ) constructed from the fields and
their conjugates on a time-slice:
\[ \frac{\partial F}{\partial t} = \frac{i}{\hbar}\,[H, F] \tag{2.39} \]
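A single field mode behaves like a harmonic oscillator, so the content of (2.39) can be checked on small matrices. The sketch below is only an illustration under stated assumptions: units with ħ = ω = 1, a truncated number basis, and the single-mode Hamiltonian H = a†a + 1/2, none of which is taken from Heisenberg and Pauli. It verifies that (i/ħ)[H, x] reproduces the momentum p, which is the oscillator equation of motion ẋ = p.

```python
import math

N_LEVELS = 6  # size of the truncated oscillator basis (illustrative choice)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def combine(A, B, ca=1, cb=1):
    """Return ca*A + cb*B, element by element."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def dagger(A):
    """Hermitian conjugate of a complex matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

# Lowering operator: a|n> = sqrt(n)|n-1>, so a[m][n] = sqrt(n) when n = m + 1.
a = [[complex(math.sqrt(n)) if n == m + 1 else 0j for n in range(N_LEVELS)]
     for m in range(N_LEVELS)]
a_dag = dagger(a)

half = [[0.5 + 0j if i == j else 0j for j in range(N_LEVELS)]
        for i in range(N_LEVELS)]
H = combine(matmul(a_dag, a), half)                        # H = a†a + 1/2
x = combine(a, a_dag, 1 / math.sqrt(2), 1 / math.sqrt(2))  # x = (a + a†)/sqrt(2)
p = combine(a, a_dag, -1j / math.sqrt(2), 1j / math.sqrt(2))  # p = -i(a - a†)/sqrt(2)

commutator = combine(matmul(H, x), matmul(x, H), 1, -1)    # [H, x]
x_dot = [[1j * v for v in row] for row in commutator]      # (i/hbar)[H, x] with hbar = 1

# Largest entrywise deviation between (i/hbar)[H, x] and p.
max_dev = max(abs(u - v) for ru, rv in zip(x_dot, p) for u, v in zip(ru, rv))
print(max_dev)
```

Writing H in terms of the number operator keeps the check exact in the truncated basis, apart from floating-point rounding; building H from truncated x and p matrices would instead spoil the canonical commutator at the highest level.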
The article of Heisenberg and Pauli runs for 61 pages in the Zeitschrift für Physik,
and we can do no more here than to summarize a few of the critical contributions it
makes to the conceptual development of quantum field theory:
1. The relativistic invariance of the action (i.e., choice of a scalar Lagrange func-
tion) is shown to lead to relativistically invariant commutation relations, in the
sense that the vanishing of the equal-time commutators of fields and conjugate
(momentum) fields in one inertial frame implies the vanishing at equal time in
another inertial frame (vis-à-vis the time coordinate of the new frame). From
this follows the vanishing of the commutators of local observables built from the fields and their
conjugates at pairs of points with strictly space-like separation (as such point
pairs can always be brought to equal time by a suitable Lorentz transformation).
This is the first clear statement of the microcausality principle which lies at the
heart of relativistic field theory.15
2. The canonical formalism was applied to the electromagnetic field, starting from
the classically well-known Lagrangian function
\[ \mathcal{L} = -\frac{1}{4}F_{\mu\nu}(x)F^{\mu\nu}(x), \qquad F_{\mu\nu} \equiv \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x) \tag{2.40} \]
15 The earlier derivation of field commutation relations for the electromagnetic field by Jordan and Pauli
(Jordan and Pauli, 1928) had considered only the free electromagnetic field, and the commutators were
evaluated at arbitrary spacetime separations, so the specific aspect of space-like commutativity was not
emphasized.
momentum field π0 = εȦ0 for the scalar potential A0.16 Heisenberg and Pauli
then suggest that the parameter ε be set to zero at the end of all calculations to
recover the correct gauge-invariant results.
3. The Lagrangian formalism is developed for the Dirac relativistic electron equa-
tion. In modern notation, one writes an action (spacetime integral of the
Lagrangian),
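\[ S_{\rm Dirac} = \int \mathcal{L}_{\rm Dirac}\,d^4x = \int \bar\psi(x)\,(i\gamma^\mu\partial_\mu - m)\,\psi(x)\,d^4x, \qquad \bar\psi \equiv \psi^\dagger\gamma_0 \tag{2.41} \]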
16 In two extremely influential papers in 1929 and 1930, Enrico Fermi (Fermi, 1929, 1930) was able
to develop a consistent Hamiltonian quantum electrodynamics in the covariant Lorentz gauge ∂μ Aμ = 0,
and to establish its equivalence to the transverse gauge ∇ ·A = 0 used by Dirac. In a second paper on
the quantization of wave fields (Heisenberg and Pauli, 1930), Heisenberg and Pauli showed that Fermi’s
covariant gauge results corresponded to a choice of unity for their parameter ε, and that the gauge condition
∂μ Aμ = 0 should be interpreted as a constraint on the allowed states, not as an operator identity (a “q-
Zahlrelation”).
interaction with the electromagnetic field is turned on). The calculation yields (Eq.
(115) in (Heisenberg and Pauli, 1929)) a linear divergence identical with the classical
Coulomb self-energy result: namely, $e^2/|\vec x - \vec x| = \infty$. This divergent energy (together with
the ubiquitous vacuum energy arising from the sum of ½hν zero-point energies
from each mode of the electromagnetic field) would plague attempts to arrive at a
mathematically consistent quantum electrodynamics throughout the 1930s and early
1940s. As only electron states are considered (the interpretation of the negative-energy
solutions in terms of positrons, with a correct treatment of the latter, is yet to come),
the Heisenberg–Pauli quantum electrodynamics conserves electron number, and the
additive infinity in the electron energy can therefore be dropped “as one is only
interested in energy differences”. The development of positron theory, and its proper
application by Weisskopf to the electron self-energy problem in 1939, would lead to a
softening of the divergence from linear to logarithmic, but the extreme discomfort felt
by physicists working on early quantum electrodynamics in the presence of numerous
divergent expressions would only be assuaged in the late 1940s with the development
of a covariant renormalization procedure.
17 For a convenient compendium of many of the important papers, see (Miller, 1994).
Problems with interacting fields: infinite seas, divergent integrals, and renormalization 47
The way out of the impasse created by the fundamentally untenable notion of a
negative-energy sea of electrons was shown very clearly in a paper by V. A. Fock in
1933 (Fock, 1933), and somewhat less clearly (as the basic idea is submerged in a large
quantity of speculation on other unrelated topics) in an almost simultaneous article
by Furry and Oppenheimer (Furry and Oppenheimer, 1934), although it must be
admitted that the basic lessons of both papers seem to have been pretty much ignored
by the theoretical community, which for the most part went right on calculating in
terms of electrons and holes.
We shall describe briefly the ideas of Fock here, as this paper can be regarded
as the seminal work responsible for the term “Fock space”, which provides the basic
kinematical scaffolding for modern (operator) formulations of relativistic field theory.
Fock proposed that instead of treating electrons as the primary objects of the theory
and positrons as derived concepts (i.e., holes in a negative-energy sea of electrons), the
latter should appear in the theory along with electrons in a completely symmetrical
way. By this time, the experimentally well-established phenomenological symmetry
between the two types of particle (identical mass, opposite charge) certainly made
this proposal a plausible one. Thus, the Hamiltonian of the free theory (i.e., with
electromagnetic interactions switched off) was assumed to take the form18
\[ H_0 = \sum_\sigma \int d^3p\; E(p)\left(b^\dagger(\vec p,\sigma)\,b(\vec p,\sigma) + d^\dagger(\vec p,\sigma)\,d(\vec p,\sigma)\right) + \text{infinite constant} \tag{2.43} \]
where the creation and destruction operators for electrons (b† , b) and positrons (d† , d),
as already clear from the work of Jordan and Wigner, must obey anticommutation
relations to enforce Fermi–Dirac statistics:
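\[ \{b(\vec p,\sigma), b^\dagger(\vec p\,',\sigma')\} = \delta_{\sigma\sigma'}\,\delta^3(\vec p-\vec p\,') \tag{2.44} \]
\[ \{d(\vec p,\sigma), d^\dagger(\vec p\,',\sigma')\} = \delta_{\sigma\sigma'}\,\delta^3(\vec p-\vec p\,') \tag{2.45} \]
\[ \{b(\vec p,\sigma), b(\vec p\,',\sigma')\} = \{d(\vec p,\sigma), d(\vec p\,',\sigma')\} = 0 \tag{2.46} \]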
18 We have taken the liberty of introducing modern notation here: in Fock’s paper, the momentum-spin
pair p, σ is denoted by the single variable q, the electron (resp. positron) destruction operators b(p, σ)
(resp. d(p, σ)) are denoted φ(q, 1) (resp. φ(q, 2)), and the relativistic energy E(p) = √(p² + m²) is given as
a matrix element of the single-particle Dirac Hamiltonian.
(here h.c. denotes “hermitian conjugate”) while conserving electric charge (as electrons
and positrons are created or destroyed in tandem by the operator Hu ), necessarily
results in a theory in which the number of particles (either electrons or positrons)
is not conserved. The importance of these prescient remarks would soon become
clear when it became apparent that the gauge-invariant treatment of the coupling
to electromagnetism involved a four-vector charge current j μ (x) containing exactly
odd terms of this sort (in addition to even terms conserving separately electron and
positron number).
In retrospect, the advantages of a charge symmetric quantum electrodynamics
should certainly have become completely manifest after the appearance of the paper
of Pauli and Weisskopf in 1934 (Pauli and Weisskopf, 1934), in which a fully
gauge-invariant theory of massive charged scalar (i.e., spinless) particles coupled
to electromagnetism was written down classically and then subjected to canonical
quantization à la Heisenberg and Pauli. Thus, temporarily switching off the coupling to
the electromagnetic field (and setting ħ = c = 1), one starts with a free Hamiltonian for
the scalar field ψ (where for c-number fields the † means simply complex conjugation)
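\[ H_0 = \int\left\{\frac{\partial\psi^\dagger}{\partial t}\frac{\partial\psi}{\partial t} + \vec\nabla\psi^\dagger\cdot\vec\nabla\psi + m^2\psi^\dagger\psi\right\}d^3x \tag{2.48} \]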
and then (via the usual identification of conjugate momentum fields) introduces
quantization via the equal-time commutation relations
\[ [\pi(\vec x,t), \psi(\vec x\,',t)] = -i\,\delta^3(\vec x-\vec x\,'), \qquad \pi \equiv \frac{\partial\psi^\dagger}{\partial t} \tag{2.49} \]
This theory, with fields satisfying a Klein–Gordon equation (which follows from
(2.48) and (2.49)) with classical solutions of both negative and positive energy,
provided Pauli and Weisskopf with a clear analog of the problems in electron theory
which led Dirac to the desperate expedient of a negative-energy sea: with the crucial
difference that the absence of an exclusion principle made the notion of viewing the
physical vacuum as a state with all negative-energy states filled (each one infinitely
many times, as we are dealing with bosons here!) even more manifestly absurd than
in the fermionic case.
Fortunately, Pauli and Weisskopf were able to show that the quantized version
of the theory possessed a perfectly sensible interpretation, provided the particles and
antiparticles of the theory were put on a completely equal footing (as in the work of
Fock, which, however, is not referenced in (Pauli and Weisskopf, 1934)). A Fourier
expansion of the scalar field ψ(r, 0) (at time zero), incorporating the commutation
relations (2.49), and working in a box of volume V so that the allowed momenta are
discrete, gives (with some slight modifications of notation to accommodate modern
taste)
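\[ \psi(\vec r,0) = \frac{i}{\sqrt{V}}\sum_{\vec k}\frac{1}{\sqrt{2E(k)}}\left(a(\vec k)\,e^{i\vec k\cdot\vec r} - b^\dagger(-\vec k)\,e^{-i\vec k\cdot\vec r}\right) \tag{2.50} \]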
where a(k) (resp. b(k)) are now interpreted as destruction operators for a particle
of spatial momentum k (resp. an antiparticle of momentum −k) and the hermitian
conjugate operators are the corresponding creation operators. The Hamiltonian (2.48)
then becomes (cf. Fock’s (2.43))
\[ H_0 = \sum_{\vec k} E(k)\left(a^\dagger(\vec k)\,a(\vec k) + b^\dagger(\vec k)\,b(\vec k) + 1\right) \tag{2.51} \]
with the infinite term $\sum_{\vec k} E(k)\cdot 1$ interpreted as a vacuum zero-point energy which
“can be deleted in all applications”. Once this is done, the vacuum, with zero energy, is
simply the state |0⟩ annihilated by H0 via a(k)|0⟩ = b(k)|0⟩ = 0 for all k: the noisome
negative-energy sea is simply banished from the theory. The divergenceless four-vector
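$J_\mu(x) \equiv i\left(\frac{\partial\psi^\dagger}{\partial x^\mu}\,\psi - \psi^\dagger\,\frac{\partial\psi}{\partial x^\mu}\right)$ (with $\partial^\mu J_\mu = 0$ following from the Klein–Gordon equations
of the free theory) then leads in the usual way to a conserved charge operator
\[ Q \equiv \int J_0(\vec r,t)\,d^3r = \sum_{\vec k}\left(a^\dagger(\vec k)\,a(\vec k) - b^\dagger(\vec k)\,b(\vec k)\right) \tag{2.52} \]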
(via the usual minimal coupling replacement ∂μ ψ → (∂μ − ieAμ )ψ in the Lagrangian,
where Aμ is the electromagnetic four-vector potential) can be carried through in
a straightforward way. One arrives at an interacting theory in which (a) charge
conservation (and the vanishing of the four-divergence ∂ μ Jμ ) is still exact, and (b)
new photon-mediated pair-creation and annihilation processes appear in the theory,
exactly of the sort expected in the Dirac hole theory from transitions between positive-
and negative-energy electron states (but, finally, without the need for an invisible
infinite background of charged particles!).
In hindsight, the advantages of a second-quantized formalism in which electrons
and positrons are treated symmetrically seem so compelling that it is difficult to
understand the persistence of the hole-theory perspective years after the works of
Fock and Pauli–Weisskopf discussed above. Nevertheless, the hole-theory point of view
remained prominent even up to the late 1940s, and the troublesome charge and mass
divergences which would undermine the confidence of many of the early practitioners
of quantum electrodynamics first made their appearance in the context of calculations
performed on the basis of a vacuum consisting of an invisible filled sea of negative-
energy electrons. By 1930, divergent field-theoretic quantities had already made their
appearance in the form of the zero-point energy of the electromagnetic field and the
infinite sea of negative-energy electrons, as well as in the linear divergence in the self-
energy of the electron encountered by Heisenberg and Pauli. In his presentation at
the 1933 Solvay conference (Dirac, 1933), Dirac pointed out that the alteration in the
charge density of the background sea of filled electron states induced by the insertion
of a test charge could be interpreted as a polarizability of the vacuum, leading to
an effective screening of the bare test charge by a factor (1 − (2α/3π) ln(Λ/mc)),19 where α is
the fine-structure constant and Λ is a momentum cutoff which Dirac assumed should
correspond to the inverse electron Compton wavelength, above which the theory was
presumably unreliable. A perturbative calculation involving an intermediate state in
which a negative electron changes state—necessarily to a positive-energy state, as all
other negative-energy states are filled—corresponds in the Fock point of view to the
appearance of a virtual electron–positron pair, so that the screening can alternatively
be viewed as due to the preferential orientation of these virtual dipoles with respect
to the applied field, much as in the classical theory of polarization. Dirac made clear
that the “observed” charges measured on electrically charged particles necessarily
differed, as a result of this polarization of the vacuum, from the “true” charges carried
by these particles. This observation clearly contains the germ of the idea of charge
renormalization, and more generally the realization that physically observed proper-
ties may—indeed must—contain built-in modifications as a consequence of radiative
interaction effects, necessarily complicating the interpretation of the “true” (or, in
modern terminology, “bare”) parameters appearing in the fundamental Hamiltonian
of the theory.
Further calculations of vacuum polarization in the mid-1930s, by Furry and
Oppenheimer (Furry and Oppenheimer, 1934), Peierls (Peierls, 1934), and Weisskopf
(Weisskopf, 1936), confirmed the presence of a logarithmic ultraviolet divergence in the
19 The order α correction appears to differ from the correct value by a factor of 2, but the reason for this
is unclear.
charge screening factor. However, it was generally accepted (perhaps “hoped” would
be more accurate here) that the screening of the “true” charges would operate in a
universal and field-independent way, and could therefore be consistently absorbed once
and for all into a uniform redefinition of electric charge. Of course, this maneuver had
the inevitable consequence of making the “true” charges appearing in the Hamiltonian
cutoff dependent (a situation which persists to the present in local quantum field
theories), and the unconscious presupposition that these “true” charges were somehow
physically meaningful could only be satisfied by the expedient, considered desperate
at the time, but now understood (cf. Chapter 16) to be an ineluctable feature of
any realistic field theory, of assuming an actual breakdown of the theory at some
high momentum, which would then cut off the divergent integrals and allow these
underlying charges to take finite values.
Another classic example of the dominance of the hole-theory language, even when
the results were equivalent to those obtainable via a second-quantized formalism with
only positive-energy electrons and positrons, is Weisskopf’s own calculation of the
divergent self-energy of the electron in quantum electrodynamics in 1939 (Weisskopf,
1939), which is phrased throughout in hole-theory language, despite the fact that the
subtractions performed to remove the unpleasant—and clearly unobserved—attributes
of the negative-energy sea precisely correspond to the rewriting in terms of electron
and positron operators suggested by Fock, suggesting that the lessons of second
quantization have, at least subliminally, been absorbed. The second-order (in the
electron charge e) correction to the energy of an electron at rest (with momentum
0 and spin σ) arises from a Coulomb self-energy term, corresponding to the diagonal
matrix element of the Coulomb energy in first order,
was written as a single sum (over discrete momentum modes, with box normalization,
where the single index q contains a spatial momentum q, a spin index σ and a
discrete energy sign index to distinguish between positive-energy and negative-energy
modes)
\[ \psi(\vec r) = \sum_q \phi_q(\vec r)\,a_q \tag{2.58} \]
with the wavefunctions φq relabeled as u (resp. v) for positive (resp. negative) energy
solutions. The expression ρ(r ) = eψ† (r )ψ(r ), taken literally, of course, contains an
infinite background charge in the vacuum due to the negative-energy sea. If we insert
(2.59) in this formula, and reorder the charge density via the “Klein–Jordan trick”
of normal-ordering, whereby all destruction operators are moved to the right of all
creation operators (with a change of sign for each interchange required, as we are
dealing with fermions), one finds
\[ \rho(\vec r) = {:}\rho(\vec r){:} + e\sum_{\vec q\sigma} v^\dagger_{\vec q\sigma}v_{\vec q\sigma} = {:}\rho(\vec r){:} + e\sum_{\vec q\sigma} \tag{2.60} \]
with the divergent second term on the right-hand side the sum of the (negative)
charges for each electron in a filled negative-energy state. This term arises from
reordering terms of the form $d_{\vec q\sigma}\,d^\dagger_{\vec q\,'\sigma'} = \delta_{\vec q\vec q\,'}\,\delta_{\sigma\sigma'} - d^\dagger_{\vec q\,'\sigma'}\,d_{\vec q\sigma}$ appearing in ρ(r), using
the anticommutation relations (2.45). By contrast, the normal-ordered charge density
: ρ(r ) : vanishes as physically required in the vacuum state, as the destruction (resp.
creation) operators are deployed on the right (resp. left) side of the expression,
and therefore encounter immediately the vacuum state |0⟩ (resp. ⟨0|), giving zero.
The subtractions performed by Weisskopf amount to the replacement of the charge-
density operator ρ(r ) in (2.54) by its normal-ordered version : ρ(r ) :. When this
normal-ordered expression is used in the evaluation of the charge–charge correlation
function G̃(ξ) defined in (2.57), one finds, returning to infinite volume and continuous
momenta,20
\[ \tilde G(\vec\xi) = e^2\int\frac{d^3q}{(2\pi)^3}\,\frac{m}{E(q)}\,e^{i\vec q\cdot\vec\xi} \tag{2.61} \]
20 The calculation is considerably simplified by using Wick expansion techniques described in Chapter
10; one also needs the appropriate normalization properties of the Dirac spinor functions uqσ , vqσ , defined
and discussed in Chapter 7. See Chapter 10, Problem 5.
On the other hand, if the positron contributions are ignored, one finds a contribution
to G̃(ξ) proportional to δ³(ξ) (due to the appearance of an extra factor of E(q)/m in the
integral), which leads, when substituted into (2.56), to the classic linear divergence
in the Coulomb self-energy, as found previously by Heisenberg and Pauli. Inserting
(2.61) into (2.56), one finds instead the logarithmically divergent integral21
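\[ \Delta E_{\rm Coul} = \frac{e^2}{2}\int\frac{d^3q}{(2\pi)^3}\,\frac{m}{E(q)}\int\frac{e^{i\vec q\cdot\vec\xi}}{4\pi|\vec\xi|}\,d^3\xi = \frac{e^2}{2}\int\frac{d^3q}{(2\pi)^3}\,\frac{m}{E(q)\,q^2} \tag{2.62} \]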
Weisskopf was also able to show in his 1939 paper (correcting an earlier error pointed
out by Furry in which he had found a quadratic divergence) that the other, transverse
contribution to the electron self-energy was likewise given in terms of a logarithmically
divergent integral, and that logarithmic divergences of this kind persisted to all higher
orders of perturbation theory (in the electron charge).
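The degree of divergence of the momentum integral in (2.62) can be made concrete with a short numerical sketch. It assumes a sharp momentum cutoff Λ, units with m = 1, and a simple midpoint rule; these choices and the function name are illustrative only. Since ∫d³q/(E(q)q²) = 4π∫dq/E(q), the integral grows logarithmically with Λ when E(q) = √(q² + m²), but linearly if E(q) is replaced by the constant m.

```python
import math

def radial_integral(cutoff, m=1.0, relativistic=True, steps=200_000):
    """Midpoint-rule estimate of 4*pi * integral_0^cutoff dq / E(q).

    relativistic=True uses E(q) = sqrt(q^2 + m^2); otherwise E(q) = m.
    """
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        q = (i + 0.5) * h
        energy = math.sqrt(q * q + m * m) if relativistic else m
        total += h / energy
    return 4.0 * math.pi * total

for cutoff in (1e2, 1e3, 1e4):
    rel = radial_integral(cutoff)
    nonrel = radial_integral(cutoff, relativistic=False)
    # Compare with the leading behaviours 4*pi*ln(cutoff/m) and 4*pi*cutoff/m.
    print(cutoff, rel, 4 * math.pi * math.log(cutoff), nonrel)
```

Doubling the cutoff adds roughly 4π ln 2 to the relativistic result, whereas it doubles the result obtained with E(q) replaced by m.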
The lack of manifest covariance in Weisskopf’s calculation,22 performed only for
an electron at rest, with the electromagnetic field in radiation gauge, concealed the
crucial fact that the lowest-order correction to the self-energy of an electron in motion,
with momentum p ≠ 0, would take the form δE(p) ∼ δm²/(2E(p)), with δm² a divergent
shift in the squared rest-mass, corresponding to the change E(p) = √(p² + m²) →
√(p² + m² + δm²). In other words, the disturbing ultraviolet divergences appearing
in the electron self-energy were really divergences in the (Lorentz-invariant) rest-
mass, and could therefore be removed by an (admittedly divergent) redefinition—or
renormalization—of the “bare” mass m appearing in the defining Hamiltonian of
the theory. As we shall now see, this crucial realization, essential for a consistent
formulation of quantum electrodynamics, would come only after another decade had
passed, with the appearance of a fully covariant formulation, and more importantly, a
transparent calculational scheme vastly simplifying the otherwise onerous higher-order
calculations needed for a full understanding of the theory.
The wartime years 1939–45 brought an almost complete halt to research in funda-
mental issues in physics—such as the issues of consistency and calculability in quantum
electrodynamics—as the discovery of nuclear fission in 1939 redirected the attention
of the leading practitioners of subatomic physics to the urgent question of the military
applicability of the potentially vast (and perhaps accessible) stores of energy in the
nuclei of atoms. One important development in this period was Heisenberg’s introduc-
tion (Heisenberg, 1943a,b, 1944) of the concept of the S-matrix, which attempted to
replace a detailed microscopic prescription of the Hamiltonian dynamics of a quantum
system (with the concomitant appearance of apparently intractable divergences) with
a specification of only the phenomenologically “observable” aspects—in particular,
the unitary scattering (or S-) matrix encoding the amplitudes with which particular
21 Inserting a cutoff at |q| = Λ in the integral, with Λ >> m, one finds $\int \frac{d^3q}{E(q)\,q^2} \sim 4\pi\int_m^\Lambda \frac{dq}{q} \sim 4\pi\ln(\Lambda/m)$.
22 In a footnote, Weisskopf admits that the direct calculation of the energy shift for electrons in motion
is complicated by ambiguities in the subtraction of quadratic divergences appearing at various stages of
his calculation. These concerns could, and would only, be put to rest with the development of a manifestly
Lorentz-covariant formulation of QED in the late 1940s.
incoming states resolve to particular outgoing states in a scattering event. This project
would receive an extended (but nonetheless finite) rebirth in the late 1950s and 1960s
in the S-matrix theory approach to strong interactions, as frustration with the inability
of field-theoretic methods to yield useful quantitative descriptions of strong-interaction
processes mounted. But the resolution of the divergence difficulties of quantum elec-
trodynamics in the late 1940s and early 1950s, together with the new beautiful and
powerful perturbation-theoretic apparatus which (owing to the small value of the fine-structure
constant, α ∼ 1/137) allowed ever more accurate calculation (in agreement
with ever more accurate experiments!) of measured quantities such as the hydrogen
fine-structure and electron magnetic moment, meant that the S-matrix would remain
a useful auxiliary, if not the central, quantity in quantum electrodynamics.
In June 1947, the National Academy of Sciences of the US sponsored a three-
day conference (25 participants, with an emphasis given to the younger generation
of theorists, who dominated the conference numerically, although, as we shall see,
it was the experimental results reported there by Rabi and Lamb that had a really
dramatic effect in stimulating theoretical progress) on “The Foundations of Quantum
Mechanics”, to be held on a small island, Shelter Island, at the tip of Long Island
in New York State. The three rapporteurs chosen to lead the discussions were
V. Weisskopf, J. R. Oppenheimer, and H. A. Kramers. For these physicists (and
in contrast to present usage) the term “foundations of quantum mechanics” meant
primarily, not quantum measurement theory, but rather the accumulated difficulties
and confusions of the preceding two decades in developing consistent quantum field
theories to describe electrodynamics (and to a lesser extent, the strong and weak
interactions). Weisskopf, in his abstract prepared and distributed in advance of the
meeting, was quite explicit about these failures: “Certain well known attempts have
been made in the last fifteen years to overcome a series of fundamental problems.
All these attempts seem to have failed at an early stage.”23 Weisskopf explicitly
mentions the need for obtaining finite results in a “reliable” way in the presence of
divergent contributions to the electron self-energy and to the vacuum polarization.
Kramers, in his abstract, also emphasized the need for a consistent treatment of
divergences in “hole theory”, and mentions in passing that the meson theory of
nuclear forces offered no respite from similar difficulties, but rather “brought new
divergence sorrows”.
The first day of the Shelter Island conference was primarily given over to the new
experimental results of Lamb and Rabi on hydrogen spectroscopy. In particular, the
discovery of an unequivocal deviation from the hydrogen fine structure given by the
Dirac theory, in which states of equal j (electron total, i.e., spin plus orbital, angular
momentum) and neighboring l (orbital angular momentum) were exactly degenerate,
was presented by Lamb in his measurement of the 2S–2P splitting (for j = 1/2), which
corresponded to an energy of order α³ Rydbergs, in contrast to the Dirac formula for
the hydrogen relativistic fine structure, in which only even powers of the fine-structure
constant appear:
23 See Schweber, op. cit., for a detailed account of the run-up to Shelter Island and the discussions in the
conference itself.
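\[ E_{nj} = mc^2\left\{\left(1 + \left(\frac{\alpha}{n - j - \frac{1}{2} + \sqrt{(j+\frac{1}{2})^2 - \alpha^2}}\right)^2\right)^{-1/2} - 1\right\}, \qquad j + \frac{1}{2} \le n, \quad n = 1, 2, 3, \ldots \tag{2.63} \]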
It was clear to all the participants that a reliable calculation of this new “Lamb shift”,
requiring the subtraction of the divergent self-energy corrections for the electron in
two distinct atomic bound states, would be an ideal test of the adequacy of any
proposed quantum electrodynamic theory, inasmuch as the desired finite-energy shifts
would have to be very carefully disentangled from the divergent electron self-energy
contributions which were sure to appear in any higher-order calculation.
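A direct evaluation of (2.63) illustrates why Lamb's result was so striking. The formula depends only on n and j, so it assigns the 2S1/2 and 2P1/2 levels exactly the same energy, and its fine-structure splitting between j = 1/2 and j = 3/2 starts at order α⁴mc². The Python sketch below is only an illustration; the numerical value of α and the function name are assumptions introduced here.

```python
import math

ALPHA = 1 / 137.036  # approximate fine-structure constant (assumed value)

def dirac_energy(n, j, alpha=ALPHA):
    """Binding energy E_nj in units of m c^2, following (2.63)."""
    k = j + 0.5
    if k > n:
        raise ValueError("need j + 1/2 <= n")
    denom = n - k + math.sqrt(k * k - alpha * alpha)
    return (1.0 + (alpha / denom) ** 2) ** -0.5 - 1.0

# n = 2: the formula has no dependence on l, so 2S_1/2 and 2P_1/2 share this value.
e_2_half = dirac_energy(2, 0.5)
e_2_three_half = dirac_energy(2, 1.5)   # 2P_3/2
splitting = e_2_three_half - e_2_half

# Ratio of the n = 2 splitting to its leading-order estimate alpha^4 / 32.
print(e_2_half, splitting, splitting / (ALPHA ** 4 / 32))
```

Because α enters only through α², any shift of order α³ Rydbergs between states of equal n and j has to come from physics outside this formula.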
On the second day of the conference, Kramers gave a very important talk in which
the essential conceptual content of mass renormalization was very clearly laid out,
albeit in the context of a purely classical theory of a non-relativistic electron interacting
with the electromagnetic field. Kramers emphasized—and Bethe’s calculation of the
Lamb shift just two days later, on the train home, showed that his arguments fell on
fertile ground—that the measured mass of the electron should be regarded as already
containing the divergent self-energy contributions, and that calculations should be
reorganized to express the desired physical observables in terms of this physical mass,
rather than the “intrinsic” or “bare” mass appearing in the Hamiltonian. During the
conference it was realized that the weak logarithmic divergence in the electron self-
energy would in fact cancel in the calculation of the energy difference ΔE between
the $2S_{1/2}$ and $2P_{1/2}$ states (as the electron in both states receives the same
self-energy correction)—a point emphasized by Weisskopf in his report on the divergence
difficulties of hole theory. On the final day of the conference, Feynman presented his
spacetime (in modern language, “path-integral”) approach to quantum mechanics,
which would lead within a year to his reformulation of quantum electrodynamics in
terms of Feynman diagrams and a set of explicitly relativistically covariant calcula-
tional rules.
The calculations by Bethe of the Lamb shift (immediately following the Shelter
Island conference) were performed for a non-relativistic electron, for which the self-
energy corrections are linearly rather than logarithmically divergent (as the momentum
integral in (2.62) becomes linearly divergent in the non-relativistic limit when we
replace $E(q) \to mc^2$), with the result that the energy shift ΔE calculated by Bethe
still contained a logarithmic divergence. Given that the correct relativistic treatment
converts the linearly divergent behavior of the integral in (2.62) to logarithmic once
q > mc, Bethe simply introduced a cutoff in his logarithmically divergent integral at
q ∼ mc, obtaining a finite result which agreed very well with Lamb’s measurements
(1040 MHz for the associated frequency, as compared to the observed 1000 MHz).
But the need for a relativistically correct calculation was urgently felt by all the
theorists now engaged in the hunt for a fully consistent quantum electrodynamics.
Bethe’s conversations with Feynman at Cornell in the next few months provided
a strong impetus for the latter’s development (Feynman, 1949a,b) of a manifestly
relativistic classical Lagrangian formalism, extended to quantum electrodynamics by
the sum-over-histories (path-integral) approach that Feynman had already developed
for ordinary quantum mechanics.
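Bethe's estimate quoted above is easy to reproduce numerically. The sketch below assumes the standard form of his non-relativistic result for an S state, ΔE = (8/3π) α³ Ry (Z⁴/n³) ln(mc²/⟨E⟩), together with the average excitation energy ⟨E⟩ ≈ 17.8 Ry for the 2S state quoted in his 1947 paper. Neither the formula nor that number appears in the text above, and all constants are rounded, so this is an illustration rather than a derivation.

```python
import math

ALPHA = 1 / 137.035999          # fine-structure constant (rounded)
RYDBERG_HZ = 3.2898419603e15    # Rydberg frequency in Hz (rounded)
RYDBERG_EV = 13.605693          # Rydberg energy in eV (rounded)
MC2_EV = 510998.95              # electron rest energy in eV (rounded)

# Average excitation energy in Rydbergs: an input taken from Bethe's 1947
# paper, not from the text above (assumption).
AVG_EXCITATION_RY = 17.8


def bethe_shift_hz(n=2, z=1):
    """Non-relativistic Lamb shift of an nS level with the cutoff at mc^2."""
    log_term = math.log(MC2_EV / (AVG_EXCITATION_RY * RYDBERG_EV))
    prefactor = 8.0 / (3.0 * math.pi) * ALPHA ** 3 * RYDBERG_HZ
    return prefactor * z ** 4 / n ** 3 * log_term


shift_hz = bethe_shift_hz()
print(f"Bethe estimate for 2S: {shift_hz / 1e6:.0f} MHz")
```

The logarithm is only weakly sensitive to the exact position of the cutoff, which is why a cutoff of order mc rather than a precisely fixed value already gives a result close to the 1040 MHz cited above.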
24 For excellent treatments of the genesis and later spread of the use of the Feynman diagrammatic
approach, see (Wüthrich, 2010) and (Kaiser, 2005).
3
Dynamics I: The physical ingredients
of quantum field theory: dynamics,
symmetries, scales
In the preceding chapters we have presented an all too brief review of some of the
critical episodes in the historical evolution of modern quantum field theory, up to
the point where renormalized covariant quantum field theory, epitomized by the
astonishing quantitative successes of quantum electrodynamics beginning in the late
1940s and continuing to the present day, reached a state of technical (if not conceptual)
completion. While this historical account is remarkably fascinating in its own right,
it runs somewhat at cross-purposes to the account of field theory which is the major
motivation of this book: namely, to present local quantum field theory as the natural,
and in a certain sense, almost inevitable framework arising from the application of
a few basic principles which lie at the very core of modern physical science. These
principles fall into three basic categories: those involved in the specification of the
dynamics of the sought-for microphysical theory, those concerned with the specification
of the symmetries of the theory, and finally, those principles having to do with the
behavior of the theory at different distance (or energy/momentum) scales. The rest
of the book is therefore organized with a view to exploring how different conceptual
strands in each of these three areas are woven together to produce the fabric of modern
quantum field theory. In contrast to the procedure followed in the first two chapters,
our approach for the rest of the book will be resolutely antihistorical: we shall introduce
the basic principles from which relativistic quantum field theory can be constructed
with little or no attention to the role played (explicitly or implicitly) by the invocation
of such principles in the actual historical record. In particular, the order in which topics
are discussed will have in general no connection to the actual historical sequence of
events discussed in the “Origins” section of the book. In this chapter the main themes
will be introduced, as far as possible, in a non-technical and qualitative (but, we hope,
illuminating) manner. The technicalities will follow in due course in the ensuing
chapters!
Local relativistic quantum field theory is based on three basic principles which
in combination lead to a powerful and elegant formalism which appears to allow a
remarkably accurate description (the so-called “Standard Model”) of at least three
of the four fundamental forces in Nature: the strong, weak, and electromagnetic
interactions. These principles are (1) quantum mechanics, (2) special relativity, and
(3) clustering (cluster decomposition).
Items 1 and 2 above are, of course, the basic ingredients of modern physics. One
often encounters the assertion that quantum field theory arises from the marriage of
these two. In fact, the addition of special relativity to quantum mechanics leads to
no remarkably novel physics. Later we shall see that it is quite easy to write down
scattering amplitudes which fulfill both the requirements of unitarity and Lorentz
invariance. In a sense, such theories are just as unconstrained as non-relativistic
quantum theory prior to the addition of the principle of special relativity: e.g.,
Hamiltonians can be written in terms of essentially arbitrary covariant functions of
momenta, much as we are allowed to invent potential energy functions with abandon
in elementary non-relativistic quantum theory.
The characteristic phenomena of relativistic field theory only appear once we insist
on the third principle: clustering, i.e., the factorization of the S-matrix1 containing
the scattering amplitude for an arbitrary process as the product of two independent
amplitudes in the event of two spatially far separated scattering subprocesses. This
principle, which seems intuitively obvious, is surely a precondition for the success of
experimental science. It relieves us of the obligation to specify completely the state of
the entire world outside the laboratory prior to a correct interpretation of the results
of an experiment. However, the inclusion of item 3 greatly increases the complexity
of the resultant formalism, and means that it is no longer possible to write exactly
S-matrices satisfying all the desired properties in spacetimes of more than one space
and one time dimension. Rather, we must resort to various approximative schemes. This is
the bad part. On the other hand, the inclusion of the clustering requirement means,
as we shall see, that the construction of an appropriate Hamiltonian (dynamics) is
now far more constrained. Arbitrary interaction potentials are no longer allowed: the
potential between far-separated electric charges is forced to be $1/r$ and not $r^{-3.5}$, etc.
Moreover, we are led ineluctably to the formalism of local quantum field theories, with
two immediate and unavoidable consequences:2
(a) an explanation of the existence of antimatter, with each particle having an
antiparticle of exactly equal mass and opposite additive quantum numbers,3 and
(b) the Spin-Statistics theorem, which clarifies one of the great mysteries of non-
relativistic quantum theory: the contrasting symmetry properties of the wavefunctions
of particles of integer (bosonic) versus half-integer (fermionic) spin.
A simple and intuitive picture of the emergence of antimatter as a natural conse-
quence of the basic physical ingredients of local field theory goes back to the work of
Feynman (Feynman, 1949b) on quantum electrodynamics in the late 1940s. The results
cited in item (a) above are special cases of the more general TCP theorem valid in
any local relativistic quantum field theory: the invariance of scattering amplitudes
under simultaneous interchange of particles with antiparticles (the “C” operation),
spatial reflection (or parity, the “P” operation), and time reversal (the “T” operation).
A beautifully simple argument to illustrate property (a) has been given by Weinberg
(Weinberg, 1972), although, as pointed out above, the underlying ideas were first
elucidated by Feynman. Consider a process such as that illustrated in Fig. 3.1, where
a positive pion (π+) emitted by a proton (P) at spacetime point x travels to a neutron
(N) and is absorbed at spacetime point y. The idea of locality here amounts to the
statement that the neutron and proton interact via local emission and absorption
events of a third intermediary particle. On the one hand, the mutual indeterminacy
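[Fig. 3.1: Exchange of a π+ between a proton (P) and a neutron (N), emitted at x and absorbed at y (left); after a boost, the same process appears as exchange of a π− emitted at y and absorbed at x (right). The time axis is indicated.]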
2 The primary character of these results is emphasized, and rigorous proofs given, in the seminal work
of Streater and Wightman (Streater and Wightman, 1978).
3 Recently, the charge-to-mass ratio of the antiproton and proton was measured by Penning trap
techniques to be equal to within about 1 part in $10^{12}$!
of position and velocity in quantum mechanics allows for the possibility that the
spacetime points x and y are actually space-like separated (“tunneling outside the
classical light-cone”, as it were).4
But if the interval between the emission and absorption is space-like, relativity tells
us that it is possible to find an inertial frame in which $y^0 < x^0$, i.e., spacetime point
y precedes spacetime point x, so the same event appears in this new frame as in the
figure on the right. An observer in the new frame will naturally interpret this as the
emission of a particle (at y) from the neutron, turning it into a proton. Such a particle
must be negatively charged, if we are to maintain charge conservation, but with the
same mass (as its kinematics is identical to that of the original π+, with a spatially and
temporally reversed path). This particle is just the π−, the antiparticle for the original
positively charged pion. This example contains in a nutshell the intimate association
between spatiotemporal reflection and particle–antiparticle interchange characteristic
of local theories and exemplified in the TCP theorem.
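The frame dependence of the time ordering used in this argument can be checked with a few lines of Python. This is a minimal sketch in 1+1 dimensions with c = 1; the function names, the sign convention for the interval, and the event coordinates are illustrative choices made here. For a space-like pair of events the sign of the time difference flips once the boost velocity exceeds Δt/Δx, while the invariant interval is unchanged; for a time-like pair no boost can reverse the order.

```python
import math


def boost(t, x, v):
    """Lorentz boost along x with velocity v, in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)


def interval_squared(dt, dx):
    """Invariant interval with signature (+, -): negative means space-like."""
    return dt * dt - dx * dx


# Emission at the origin; absorption one time unit later and two length
# units away (hypothetical numbers chosen to give a space-like separation).
dt, dx = 1.0, 2.0

for v in (0.0, 0.4, 0.5, 0.8):
    dt_new, dx_new = boost(dt, dx, v)
    print(f"v = {v:.1f}: y0 - x0 = {dt_new:+.3f}, "
          f"interval^2 = {interval_squared(dt_new, dx_new):+.3f}")
```

In the frame with v = 0.8 the absorption event precedes the emission event, which is the situation described by the right-hand panel of the figure.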
Insofar as the characteristic features of relativistic field theory require at a min-
imum the implementation of unitary quantum dynamics, Lorentz symmetry, and
locality, our exploration of the conceptual framework of field theory must begin with
a detailed examination of these physical ingredients. This will allow us to build up the
technical framework appropriate to the task of weaving together the desired physical
properties into a unified and consistent dynamical theory. This will be our object in
this second section of the book (entitled “Dynamics”, Chapters 3–11), where we shall
concentrate on the most general features shared by essentially all relativistic local
quantum field theories (which we henceforth denote “LQFTs”).
The necessary input from quantum theory will be reviewed in Chapter 4, which
will also contain a brief review of those results from quantum scattering theory needed
for later development of the theory. Chapter 5 describes the kinematics of relativistic
quantum mechanics, which incorporates the requirements of Lorentz symmetry (but
not yet the clustering principle), leading to an enormous class of interacting theories
almost all of which display bizarre and completely unphysical long-distance behavior.
A natural way to restrict the form of the interactions—by introducing local fields—
is introduced here and shown to incorporate the requirements of Lorentz-invariance
(though, as yet, with no proof of the desired clustering properties). The restriction
to physically sensible theories compatible with the clustering principle is effected
in Chapter 6, which shows how the huge class of quantum theories incorporating
special relativity can be systematically “pruned” to yield theories which display
4 The reader may be momentarily disturbed by the apparent superluminal transmission of influence by
the exchanged pion, which would seem to run counter to the requirement that physical signals/effects can
only be transmitted at most at the speed of light (“Einstein causality”). Here, as in the EPR paradox, it is
important to keep in mind that quantum theory is fundamentally a theory of the statistics of microscopic
processes, and that the formalism can (and does!) contain apparently non-local features on an event by
event basis, provided only that these features do not result in a measurable transmission of statistically
measurable properties at faster than light speed. In the quantum information community, this is referred to
as the “no-signalling” property of quantum mechanics. We shall see later, in Chapter 9, that measurements
performed in two space-like separated domains of spacetime are guaranteed to yield statistically independent
results, as a rigorous consequence of microcausality: i.e., the property of space-like commutativity of local
field operators used to construct the hermitian operators embodying the said measurements.
5 By “action-at-a-distance” effects, we do not refer here to the psychologically unsettling effects involving
non-local transitions in entangled wavefunctions, commonly referred to as the “EPR paradox”, but to
physically observable non-local phenomena: namely, those leading to superluminal transmission of physical
signals. See the preceding footnote.
6 For example, in the case of confinement, discussed in Chapter 19, the theory contains fields which do
not correspond to finite-energy particle states at all!
quantum states) means that symmetry arguments can play a much more significant
role in the resolution of the dynamics in a quantum mechanical problem than in a
comparable classical problem.7 For example, the invariance of the Hamiltonian under
a symmetry operation means that it must commute with the operator generating the
transformation. Such a commutation property already implies a partial diagonalization
of the Hamiltonian, as matrix elements connecting states with different eigenvalues
of the symmetry generator must vanish. In some cases (e.g., the O(4) symmetry of
the hydrogen atom, or in completely integrable quantum systems) the symmetry is
sufficiently large to allow a complete resolution of the spectrum, or the dynamics. The
third section of this book (entitled “Symmetries”) will therefore examine some of the
important ways in which symmetry considerations are woven into the fabric of modern
quantum field theory.
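The partial diagonalization implied by a commuting symmetry generator can be seen in a small toy model. The sketch below is an illustration with made-up numbers, not an example taken from the text: a particle hopping on four lattice sites with a reflection-symmetric on-site energy. The Hamiltonian commutes with the reflection (parity) operator P, so its matrix elements between even and odd states vanish and the 4×4 problem splits into two 2×2 blocks.

```python
import math

N = 4                              # number of lattice sites (toy model)
ONSITE = [1.0, 0.3, 0.3, 1.0]      # reflection-symmetric on-site energies


def hamiltonian():
    h = [[0.0] * N for _ in range(N)]
    for i in range(N):
        h[i][i] = ONSITE[i]
        if i + 1 < N:
            h[i][i + 1] = h[i + 1][i] = -1.0   # nearest-neighbour hopping
    return h


def reflection():
    """Parity operator P: site i -> site N-1-i."""
    return [[1.0 if j == N - 1 - i else 0.0 for j in range(N)] for i in range(N)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def matrix_element(bra, op, ket):
    return sum(bra[i] * op[i][j] * ket[j] for i in range(N) for j in range(N))


S = 1.0 / math.sqrt(2.0)
EVEN = [[S, 0, 0, S], [0, S, S, 0]]      # eigenvectors of P with eigenvalue +1
ODD = [[S, 0, 0, -S], [0, S, -S, 0]]     # eigenvectors of P with eigenvalue -1

H, P = hamiltonian(), reflection()
HP, PH = matmul(H, P), matmul(P, H)
commutator_norm = max(abs(HP[i][j] - PH[i][j]) for i in range(N) for j in range(N))
cross_block = [[matrix_element(e, H, o) for o in ODD] for e in EVEN]
even_block = [[matrix_element(a, H, b) for b in EVEN] for a in EVEN]

print("max |[H, P]|    =", commutator_norm)
print("even-odd block  =", cross_block)
print("even-even block =", even_block)
```

Only the two 2×2 blocks need to be diagonalized; a larger symmetry group would split the problem further.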
There is an important distinction between spacetime symmetries, involving sym-
metry transformations affecting exclusively the universal underlying spatiotemporal
framework of field theories, and internal symmetries in which the symmetry transfor-
mations act in specific non-geometrical ways on the assorted fields in the theory.8 By
far the most important example of the first type of symmetry is Poincaré invariance—
item 2 in the discussion at the beginning of this chapter—whereby the physical content
of special relativity is injected into relativistic field theory. The extension of this
symmetry to supersymmetry (SUSY), wherein the Poincaré group is enlarged to a
graded extension and spacetime to an enlarged “superspace” containing conventional
space and time as well as a Grassmannian component, should probably be included in
this category, purely on the basis of the extremely powerful formal analogy between
operations carried out in normal spacetime and the extended superspace of SUSY.
In Chapter 12, devoted to continuous spacetime symmetry, we develop the canonical
formalism of Lagrangian field theory as the natural solution to the problem of gener-
ating, in as painless a process as possible, Hamiltonian energy densities that lead to a
quantum field theory with fully Lorentz-invariant dynamics. The general connection
between symmetries and conservation laws, expressed in the form most natural to
field theory (Noether’s theorem) is also given here, together with its application to
the case of Poincaré symmetry, conformal symmetry, and global internal symmetries.
Chapter 12 concludes with an introduction to the extension of Poincaré symmetry to
the super-Poincaré algebra of supersymmetry.
Discrete spacetime symmetries (reflection or parity symmetry P, and time-reversal
symmetry T) are treated in Chapter 13, together with charge-conjugation invariance
symmetry C (a symmetry under interchange of particles and antiparticles): despite
the “internal” appearance of the latter, the fact that we are dealing throughout with
local theories immediately introduces an intimate and unbreakable connection with
the P and T symmetries, making it natural to treat the C symmetry on the same
7 For example, the application of a symmetry transformation to a possible classical phase-space trajectory
will, of course, yield another possible classical trajectory, but does not directly assist in the explicit solution
of either: on the other hand, the fact that symmetries imply conservation laws and hence invariants of the
motion is clearly of great utility in resolving the dynamics in many important classical problems.
8 The terminology here has evolved over time: Wigner (Wigner, 1979b) speaks in the first case of
“classical” or “geometric” symmetries, and in the second, of “dynamical” or “non-geometric” symmetries.
footing with these. Our treatment of discrete spacetime symmetries concludes with
proofs of the TCP and Spin-Statistics theorems, using techniques of axiomatic field
theory introduced in Chapter 9.
Although discrete internal symmetries have played some role in constructing
models of elementary particle interactions beyond the Standard Model, by far the
most important internal symmetries have turned out to be the continuous ones
corresponding to transformations which form compact, finite-dimensional Lie groups.
Internal symmetries may either be “global”, where the dynamics is invariant under
an application of the same symmetry transformation to the field quantities at all
spacetime points, or “local”, in which the invariance persists even for spacetime
dependent transformations. Evidently, every local (or “gauge”) symmetry contains
ipso facto a global subsymmetry. However, the presence of a local symmetry has
extraordinarily deep ramifications for the dynamics of the theory displaying such a
symmetry, far beyond the comparatively simple implications of global symmetry.
In Chapter 14 the role of global symmetries in LQFT is examined. We shall see that
exact global symmetries are rare, indeed, if one takes gravitational effects into account,
probably non-existent! Nevertheless, approximate global symmetries play an enormously
important role in modern field theory. The appearance of massless Goldstone
particles once an exact global symmetry is spontaneously broken is equally significant,
and a proof of the Goldstone theorem embodying this
phenomenon is given in Section 14.2. Dynamical aspects of spontaneous symmetry-
breaking (SSB) are examined in Section 14.3, where we see that the essence of SSB
resides in the energetics of the theory in the infrared (i.e., at long distances).
The additional rich structure introduced when a LQFT displays a local gauge
symmetry is studied in Chapter 15, where we show how such symmetries require
a generalization of the canonical Lagrangian/Hamiltonian formalism discussed in
Section 12.3 in order to handle the presence of constraints entailed by the presence
of local symmetries. The concept of a local symmetry is introduced in Section 15.1
with a simple example from classical mechanics, the lessons of which are extrapolated
to a wide class of constrained Hamiltonian systems in Section 15.2, where we intro-
duce the Dirac constrained Hamiltonian theory, and the Faddeev–deWitt functional
quantization method for such systems. The quantization of gauge theories using this
functional (path-integral) method is then explained, first using abelian gauge theory
in Section 15.3, where the technical complications are minimal. The extension to
non-abelian gauge theories is performed, again using path-integral methods (which
in this case are vastly more efficient than the canonical operator approach) applied
to the constrained Hamiltonian in Section 15.4, leading to the Feynman rules for
general (unbroken) non-abelian gauge theories. The existence of quantum anomalies
in the chiral currents of internal global symmetries is explored in Section 15.5, where
we see that the classical current conservation implied by Noether’s theorem may be
violated by quantum effects, yielding a non-vanishing divergence of the Noether current
explicitly proportional to Planck’s constant. The peculiar features of spontaneous
symmetry breaking in the presence of local (as opposed to global) gauge symmetry are
the subject of Section 15.6, where we explain the famous “Higgs phenomenon” in the
context of the electroweak sector of the Standard Model, and outline the derivation
of the Feynman rules for a general spontaneously broken local gauge theory.
The class of theories obtained from the three basic requirements discussed at the
beginning of this chapter turns out to display a very important feature: that of scale
separation, which will here be vaguely defined as the weak coupling of physics at
widely varying distance scales. This property, and its consequences, will be the central
theme of the fourth, and final, section of this book, entitled “Scales”. Much of the
confusion over troublesome “infinities” which plagued the development of interacting
field theory in the 1930s and 1940s, as described in Chapter 2, derived from a failure
to appreciate this characteristic property of LQFTs.
Unlike the situation in classically chaotic systems, where small perturbations at
very short distance scales can propagate rapidly up to much longer scales (the famous
“butterfly in China leading to a hurricane in the Atlantic” effect), LQFTs can be
“tailored” to accurately reflect the physics in some given range of length scales even if
we are completely ignorant of the “true microphysics” which obtains at much shorter
distances. This property is as indispensable to the theoretical success of field theory
as the cluster decomposition property is for the practicability of experimental science.
Our unavoidable ignorance—in a direct empirical sense—of the behavior of matter
at distance scales much smaller than the reciprocal of the highest experimentally
attainable particle momenta would be disastrous if we were dealing with theories in
which complicated (and unknown!) details of the interactions at very short distances
propagated up to the much longer scales presently accessible.
For example, there is no doubt that quantum gravity effects will drastically alter
the structure of spacetime on distance scales corresponding to the inverse of the Planck
mass, i.e., at distances below about $10^{-34}$ cm.9 Nevertheless, quantum electrodynamics
correctly predicts the anomalous magnetic moment of the electron to an astonishing
nine significant figures, all in terms of integrals extending in principle up to infinite
energy (or, in coordinate space, down to zero distance). Evidently, this remarkably
accurate result means that the long-distance behavior of quantum electrodynamic
systems must be insensitive to the detailed structure of the interactions at such very
short distances.
Obviously, theories in which unknown short-distance structure infects the behavior
of amplitudes at much longer scales would be as intractable from the point of view
of theoretical predictability in quantum physics as chaotic systems are in the classical
arena. So scale separation in the sense of the isolation of very short-distance physics
(from the behavior at accessible scales) is as crucial to the formulation of successful
theories as the isolation of long-distance effects entailed by clustering (item 3 above)
was for the correct interpretation of experimental results. In Chapter 16 we discuss
various aspects of scale separation: the critical role it plays in leading to quantitative
predictions at accessible energy scales, the introduction of regularization techniques
to quantify and simplify the study of scale sensitivity, the relevance of power counting
methods in LQFTs, the extremely important concept of effective Lagrangians, and
the classification of operators into relevant, marginal, and irrelevant on the basis of
their scaling behavior. At this stage, the point of view first introduced by Wilson
9 In theories with extra dimensions, the effective distance scale at which quantum gravity effects become
significant can in fact be much larger than this value.
10 The central role of the renormalization group in the understanding of second-order phase transitions
was first set forth in the seminal work of K. Wilson (Wilson, 1971; Wilson and Kogut, 1974).
11 From this point of view LQFTs may be the analog in physics of the “conserved core processes” in
Kirschner and Gerhart’s theory of facilitated biological evolution (Kirschner and Gerhart, 2005).
4
Dynamics II: Quantum mechanical
preliminaries
At one level, quantum field theories can be regarded as a very special subclass of all
quantum theories: theories based on a kinematical structure consisting of a state space
which is typically an infinite-dimensional complex Hilbert space, and a dynamical
structure in which time-evolution is effected by a deterministic unitary transformation
of the state vectors determined by the linear operator representing the energy of the
system (the “Hamiltonian”).1
In addition, this theoretical scaffolding needs to be supplemented with the stochas-
tic postulate of quantum mechanical measurement theory: “detection” of a state
$|\Psi\rangle$ (via interaction with a suitable macroscopic measurement apparatus) given a
previously prepared state $|\Phi\rangle$ occurs with probability given by the absolute square of
the inner product of the (suitably normalized) state vectors: $|\langle\Psi|\Phi\rangle|^2$. From the vast
variety of possible quantum theories (distinguished by the structure of the Hilbert
space representing the particular system under study, as well as by the variety of
possible physically sensible Hamiltonians, measurable physical quantities, etc.) our
object in this book is to select the minuscule subset of theories in which the relativistic
invariance of special relativity is implemented exactly, and in which physical processes
localized in space-like separated regions are strictly independent (i.e., no faster-than-
light transmission of physically measurable effects). Our task in this chapter is to
review and assemble just those parts of the basic underlying quantum-mechanical
structure which will be critical in realizing these relativity and locality constraints in
the following chapters. This will also serve as a convenient opportunity to introduce the
reader to the particular notational idiosyncrasies of the author. We will begin with
a review of the basic operator formalism underlying standard quantum mechanics,
paying particular attention to dynamics (time evolution) and symmetries. Then we
turn to the reformulation of quantum dynamics as a sum over histories (the “path-
integral” approach) due to Feynman and Dirac, which has turned out to be of enormous
conceptual and technical utility in quantum field theory. Finally, we review those
aspects of quantum scattering theory which will be central in teasing out the intricate
physical content of field theory.
1 This text assumes that the reader is familiar with the basic formalism and technical apparatus of
non-relativistic quantum mechanics, at the level of an advanced undergraduate or beginning graduate course.
Conventions and notation used throughout generally coincide with those of Gordon Baym’s excellent text
“Lectures in Quantum Mechanics” (Baym, 1990).
where H is the self-adjoint operator representing the energy of the system: the
“Hamiltonian”. The Planck constant $\hbar = h/2\pi$ will henceforth be set to unity (for the rest
of this book, with a few exceptions, natural units will hold sway: $\hbar = c = 1$). The
subscript H appearing in (4.1) reminds us that we are in the “Heisenberg picture” of
time development. In this picture, the quantum state of a particular system is a fixed
vector $|\alpha\rangle$ in the Hilbert space appropriate for the system in question.2 Taking the
time-derivative of the finite time-evolution (4.1) yields the commutation property
$$\frac{\partial O_H(t)}{\partial t} = i[H, O_H(t)] \tag{4.2}$$
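The step from (4.1) to (4.2) is a one-line computation. The sketch below assumes that (4.1) has the form $O_H(t) = e^{iHt} O_H(0) e^{-iHt}$, which is the form used in (4.7), and that $H$ is time-independent:

```latex
\begin{align*}
\frac{\partial O_H(t)}{\partial t}
  &= iH\,e^{iHt}\,O_H(0)\,e^{-iHt} + e^{iHt}\,O_H(0)\,e^{-iHt}\,(-iH) \\
  &= i\,H\,O_H(t) - i\,O_H(t)\,H \\
  &= i\,[H, O_H(t)] .
\end{align*}
```

The only input is that $H$ commutes with $e^{\pm iHt}$, so it can be moved freely through the exponentials.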
To put some meat on these rather abstract bones, consider a spinless point particle
of mass m, described by non-relativistic kinematics, and moving on a one-dimensional
line (say, the x-axis). In this case the Hilbert space of states is just the linear space of
complex, Lebesgue square-integrable functions,
$$|\alpha\rangle \to \psi_\alpha(x), \qquad \int_{-\infty}^{+\infty} |\psi_\alpha(x)|^2\,dx < \infty \tag{4.3}$$
which is commonly denoted $L^2(\mathbb{R})$ (the $\mathbb{R}$ refers to the functions $\psi_\alpha$ being defined on
the entire real axis: if our particle were constrained to move in the interval $a < x < b$,
we would denote the corresponding Hilbert space $L^2(a, b)$). Of course, our linear
space needs an inner product to be a Hilbert space, so if $|\beta\rangle$ is another state vector,
representing the square-integrable function $\psi_\beta(x)$,
$$\langle\beta|\alpha\rangle = \int_{-\infty}^{+\infty} \psi_\beta(x)^*\,\psi_\alpha(x)\,dx \tag{4.4}$$
For the special case of systems such as the harmonic (or anharmonic) oscillators studied
in Heisenberg’s original work, the energy eigenstates of the Hamiltonian
form such a complete orthonormal basis (in other words, the spectrum of the Hamil-
tonian is completely discrete), and the matrix elements of any physical observable
O in this basis have a purely oscillatory time-dependence determined by the energy
differences between the states:
$$\langle n|O_H(t)|m\rangle = \langle n|e^{iHt}\,O_H(0)\,e^{-iHt}|m\rangle = e^{i(E_n - E_m)t}\,\langle n|O_H(0)|m\rangle \tag{4.7}$$
2 We will use the Dirac bra-ket notation throughout this book: see Baym, op. cit.
3 Contrary to assertions in many texts, this is true even for operators with a partially or fully continuous
spectrum: matrix mechanics is not restricted to situations where the spectrum is fully discrete! Of course,
in many, indeed most, cases the coordinate space representation of wave mechanics is technically more
convenient.
where the subscript S indicates that we are in the Schrödinger picture. In this picture,
physical observables are represented by time-independent self-adjoint operators OS . By
convention, states and observables in the Heisenberg and Schrödinger picture coincide
at time t = 0:
$${}_S\langle\alpha;t|O_S|\alpha;t\rangle_S = \langle\alpha|e^{iHt}\,O_S\,e^{-iHt}|\alpha\rangle = \langle\alpha|O_H(t)|\alpha\rangle \tag{4.11}$$
In both the Heisenberg and Schrödinger pictures, the time-evolution of the system
is treated exactly, i.e., with the full energy operator H of the system. However, it
is frequently the case—for quantum field theories, almost always the case—that the
exact dynamics is too complicated for an analytic solution to be available. A standard
tactic is then to split the full Hamiltonian $H$ into “free” ($H_0$) and “interaction” ($V$)
parts
$$H = H_0 + V \tag{4.12}$$
There are obviously an infinite number of ways in which such a split can be done,
but the split is only useful if (a) $H_0$ generates an analytically simple dynamics, and
(b) the effects of $V$ represent a quantitatively small “perturbation” on the evolution
induced by $H_0$. Then one can hope to obtain useful results by expanding the desired
physical quantities in powers of the “small” interaction $V$. In order to facilitate such an
expansion, Dirac introduced a third version of quantum-mechanical time-development,
which is now universally referred to as the “interaction picture”. In the interaction
picture, states and observables share the burden of carrying the time development of
the system. In particular, operators retain the time-development characteristic of the
Heisenberg picture, but only the free part H0 of the Hamiltonian is used:
Once again, these choices ensure that expectation values of an observable are the same
as those computed in (say) the Heisenberg picture:
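$$\begin{aligned}
{}_{ip}\langle\alpha;t|O_{ip}(t)|\alpha;t\rangle_{ip} &= \langle\alpha|e^{iHt}e^{-iH_0t}\,e^{iH_0t}O_Se^{-iH_0t}\,e^{iH_0t}e^{-iHt}|\alpha\rangle \\
&= \langle\alpha|e^{iHt}\,O_S\,e^{-iHt}|\alpha\rangle \\
&= \langle\alpha|O_H(t)|\alpha\rangle
\end{aligned} \tag{4.15}$$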
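The equality of expectation values in the three pictures can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-level system; the matrices `H0`, `V`, `O_S`, the state `alpha`, and the helper `expm_herm` are illustrative choices, not taken from the text. The interaction-picture state and operator are read off from the first line of (4.15).

```python
import numpy as np

def expm_herm(A, s):
    """Return exp(s*A) for a Hermitian matrix A and a complex scalar s."""
    w, v = np.linalg.eigh(A)
    return (v * np.exp(s * w)) @ v.conj().T

# Hypothetical two-level system (all numbers are illustrative assumptions).
H0 = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)   # "free" part
V = np.array([[0.0, 0.3], [0.3, 0.0]], dtype=complex)     # "interaction" part
H = H0 + V
O_S = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)   # Schrodinger observable
alpha = np.array([1.0, 1.0j]) / np.sqrt(2.0)              # state at t = 0
t = 0.7

# Heisenberg picture: fixed state, O_H(t) = e^{iHt} O_S e^{-iHt}.
O_H = expm_herm(H, 1j * t) @ O_S @ expm_herm(H, -1j * t)
ev_heis = np.vdot(alpha, O_H @ alpha)

# Schrodinger picture: fixed operator, |alpha;t>_S = e^{-iHt}|alpha>.
psi_S = expm_herm(H, -1j * t) @ alpha
ev_schr = np.vdot(psi_S, O_S @ psi_S)

# Interaction picture: O_ip(t) = e^{iH0 t} O_S e^{-iH0 t},
# |alpha;t>_ip = e^{iH0 t} e^{-iHt}|alpha>.
O_ip = expm_herm(H0, 1j * t) @ O_S @ expm_herm(H0, -1j * t)
psi_ip = expm_herm(H0, 1j * t) @ psi_S
ev_int = np.vdot(psi_ip, O_ip @ psi_ip)

print(ev_heis.real, ev_schr.real, ev_int.real)
```

All three printed numbers should agree to rounding error, since the pictures differ only in how the same unitary factors are distributed between states and operators.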
From (4.14) it follows that time evolution within the interaction picture (say from
time t0 to a later time t) is accomplished by the unitary operators
The unitary operator (4.16) also gives directly the transformation of the operators
from interaction to Heisenberg picture:
$$\frac{d}{dt}U(t,t_0) = e^{iH_0t}\,(iH_0 - iH)\,e^{-iH(t-t_0)}\,e^{-iH_0t_0} \tag{4.19}$$
$$\phantom{\frac{d}{dt}U(t,t_0)} = -iV_{ip}(t)\,U(t,t_0) \tag{4.20}$$
where
4 For an introduction to asymptotic expansions, the short treatise of Erdelyi, “Asymptotic Expansions”
(Dover, 1956) is very useful.
$$U(t,t_0) = 1 - i\int_{t_0}^{t} V_{ip}(t_1)\,U(t_1,t_0)\,dt_1 \tag{4.22}$$
which can be straightforwardly iterated (by reinserting the right-hand side in the
integral) to yield the formal expansion
$$U(t,t_0) = \sum_{n=0}^{\infty}(-i)^n\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\cdots\int_{t_0}^{t_{n-1}}dt_n\;V_{ip}(t_1)V_{ip}(t_2)\cdots V_{ip}(t_n) \tag{4.23}$$
$$T(V_{ip}(t_1)V_{ip}(t_2)) \equiv V_{ip}(t_1)V_{ip}(t_2), \quad t_1 > t_2 \tag{4.24}$$
$$\phantom{T(V_{ip}(t_1)V_{ip}(t_2))} \equiv V_{ip}(t_2)V_{ip}(t_1), \quad t_2 > t_1 \tag{4.25}$$
we can expand the region of integration so that both t1 and t2 run from t0 to t. In
fact,
$$\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\;V_{ip}(t_1)V_{ip}(t_2) = \frac{1}{2}\int_{t_0}^{t}dt_1\int_{t_0}^{t}dt_2\;T(V_{ip}(t_1)V_{ip}(t_2)) \tag{4.26}$$
The factor of $\frac{1}{2}$ just compensates for the inclusion of the upper triangular region in
the figure, which contributes equally (by the reordering action of the T symbol) to the
lower. This can obviously be generalized to the nth term in the series. We simply allow
all the integrations to go from $t_0$ to $t$, and compensate with a factor $\frac{1}{n!}$. A T symbol
must be included to ensure that the operators are always in the proper time-sequence,
no matter what sector of the multi-dimensional integration region we happen to be
in. In other words,
The canonical (operator) framework 75
[Figure: the triangular integration region $t_0 \le t_2 < t_1 \le t$ in the $(t_1, t_2)$ plane.]
$$U(t,t_0) = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int_{t_0}^{t}dt_1\,dt_2\cdots dt_n\;T\{V_{ip}(t_1)\cdots V_{ip}(t_n)\} \tag{4.27}$$
This formula will play a central role in our discussion of scattering theory, both for
non-relativistic quantum systems later in this chapter, and for relativistic quantum
field theories, where it will afford us a simple criterion for understanding how Lorentz-
invariance can be guaranteed in the simplest field theories (cf. Chapter 5, Section 5).
The resemblance of (4.27) to an exponential series suggests the convenient notation:
$$U(t,t_0) = T\left\{\exp\left(-i\int_{t_0}^{t}V_{ip}(\tau)\,d\tau\right)\right\} \tag{4.28}$$
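The role of the time-ordering symbol in (4.28) can be illustrated numerically. The sketch below assumes a hypothetical two-level system (`H0`, `V`, and the helper names are illustrative, not from the text). It builds the time-ordered exponential as a product of short-time factors with later times to the left, and compares it with the closed form $e^{iH_0t}e^{-iH(t-t_0)}e^{-iH_0t_0}$, whose derivative is displayed in (4.19). A naive exponential of the integrated interaction, without time ordering, is included for contrast.

```python
import numpy as np

def expm_herm(A, s):
    """Return exp(s*A) for a Hermitian matrix A and a complex scalar s."""
    w, v = np.linalg.eigh(A)
    return (v * np.exp(s * w)) @ v.conj().T

# Hypothetical two-level system (illustrative assumption).
H0 = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
V = np.array([[0.0, 0.3], [0.3, 0.0]], dtype=complex)
H = H0 + V

def V_ip(t):
    """Interaction-picture potential e^{iH0 t} V e^{-iH0 t}."""
    return expm_herm(H0, 1j * t) @ V @ expm_herm(H0, -1j * t)

def U_exact(t, t0):
    """Closed form e^{iH0 t} e^{-iH(t-t0)} e^{-iH0 t0}."""
    return expm_herm(H0, 1j * t) @ expm_herm(H, -1j * (t - t0)) @ expm_herm(H0, -1j * t0)

def U_time_ordered(t, t0, steps):
    """Product of short-time factors, later times multiplied on the left."""
    dt = (t - t0) / steps
    U = np.eye(2, dtype=complex)
    for k in range(steps):
        t_mid = t0 + (k + 0.5) * dt
        U = expm_herm(V_ip(t_mid), -1j * dt) @ U
    return U

def U_naive(t, t0, steps):
    """exp(-i * integral of V_ip) with no time ordering (generally wrong)."""
    dt = (t - t0) / steps
    integral = sum(V_ip(t0 + (k + 0.5) * dt) for k in range(steps)) * dt
    return expm_herm(integral, -1j)

err_ordered = np.linalg.norm(U_time_ordered(1.0, 0.0, 400) - U_exact(1.0, 0.0))
err_naive = np.linalg.norm(U_naive(1.0, 0.0, 400) - U_exact(1.0, 0.0))
print(err_ordered, err_naive)
```

The ordered product converges to the exact operator as the step count grows, while the naive exponential retains a finite error because $V_{ip}(t)$ at different times does not commute.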
where qH (t) is the position operator for the particle in Heisenberg representation. The
connection of such states to the conventionally defined Heisenberg states (specified at
t = 0) follows immediately from (4.1):
This result is clearly translation-invariant: the amplitude depends only on the elapsed
time T = tf − ti , so we may as well consider simply K(qf , T ; qi , 0), with no loss of
generality. The simple harmonic oscillator provides a concrete example: the Hamilto-
nian is5
$$H_{sho} = \frac{\mathbf{p}^2}{2m} + \frac{1}{2}m\omega^2\mathbf{q}^2 \tag{4.32}$$
so the propagator satisfies the differential equation
$$i\frac{\partial}{\partial T}K(q_f,T;q_i,0) = -\frac{1}{2m}\frac{\partial^2 K(q_f,T;q_i,0)}{\partial q_f^2} + \frac{1}{2}m\omega^2q_f^2\,K(q_f,T;q_i,0) \tag{4.33}$$
5 We employ the standard device of bold-face notation to distinguish operators from c-numbers.
6 This rotation by 90 degrees in the complex plane of the time variable is called a “Wick rotation”.
where the δ-function limit is now apparent in the increasingly peaked Gaussians
(normalized to unity) appearing on the right-hand side of (4.37).
The Euclidean propagators $K_E$ are sometimes referred to as “heat kernels”. Indeed,
the free-particle propagator $K_E^{(0)}$ satisfies the one-dimensional diffusion equation
where we have introduced an alternative notation $(|\Phi\rangle, |\Psi\rangle)$ for the complex inner
product of two Hilbert space vectors which will be useful in the following. Translational
invariance of the laws of physics is a fundamental symmetry which has survived from
the times of Galileo and Newton, and we are certainly entitled to expect that if the
entire apparatus that prepared the system in the state $|\Psi\rangle$ and the detection apparatus
which on interaction with the system will project it onto the eigenstate $|\Phi\rangle$ are both
translated by the same fixed spatial vector $\vec a$, the measurement probability P in (4.40)
should be unchanged. The translation of a physical system by displacement $\vec a$ is of
course effected by the unitary operator
where $\vec p$ is the total 3-momentum vector for the system. Similarly, a rotation of
a physical system around the direction of the vector $\vec\alpha$ by an angle given by the
magnitude $|\vec\alpha|$ is implemented on the Hilbert space of states by the unitary operator
$U_{rot}(\vec\alpha) = e^{-i\vec\alpha\cdot\vec J}$, where $\vec J$ is the total angular momentum of the system.7 Returning
to the translation case, it is no surprise that if we replace
$$|\Phi\rangle \to U_{trans}(\vec a)|\Phi\rangle, \quad |\Psi\rangle \to U_{trans}(\vec a)|\Psi\rangle \tag{4.42}$$
then (by the unitarity of $U_{trans}$)
$$P \to |(U_{trans}(\vec a)|\Phi\rangle, U_{trans}(\vec a)|\Psi\rangle)|^2 = |(|\Phi\rangle, U^\dagger_{trans}(\vec a)U_{trans}(\vec a)|\Psi\rangle)|^2 = |\langle\Phi|\Psi\rangle|^2 = P \tag{4.43}$$
In general, the symmetries of physics can be expressed mathematically as groups of
transformations (e.g., in the case just above, the succession of two translations $\vec a$, $\vec b$
is equivalent to the combined translation $\vec a + \vec b$, the translation $-\vec a$ is the inverse of
the translation $\vec a$, etc.). A particular element g of such a symmetry transformation
group will be associated with some Hilbert space operator S(g) (just as translation
by $\vec a$ above was associated with the unitary operator $e^{-i\vec p\cdot\vec a}$). And the statement that
physics is invariant under such a group of transformations amounts to the requirement
$$|(S(g)|\Phi\rangle, S(g)|\Psi\rangle)|^2 = |(|\Phi\rangle, |\Psi\rangle)|^2 \tag{4.44}$$
for arbitrary states $|\Phi\rangle$, $|\Psi\rangle$ and arbitrary group elements g. This requirement clearly
holds if the symmetry group is implemented by unitary operators S(g), such as the
operator $U_{trans}(\vec a)$ discussed above.
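The group structure of translations and the invariance requirement (4.44) can be illustrated with a small numerical sketch. It assumes a hypothetical periodic lattice of `N` sites, on which translation by `a` sites is a cyclic shift; this discretization and the name `U_trans` are illustrative stand-ins for the continuum operator, not the text's construction.

```python
import numpy as np

N = 8  # number of sites on a hypothetical periodic lattice

def U_trans(a):
    """Matrix shifting lattice amplitudes by a sites: (U psi)_i = psi_{i-a}."""
    return np.roll(np.eye(N), a, axis=0)

rng = np.random.default_rng(0)

def random_state():
    """Random normalized complex vector standing in for a state."""
    vec = rng.normal(size=N) + 1j * rng.normal(size=N)
    return vec / np.linalg.norm(vec)

phi, psi = random_state(), random_state()

# Transition probability before and after translating both states.
P_before = abs(np.vdot(phi, psi)) ** 2
P_after = abs(np.vdot(U_trans(3) @ phi, U_trans(3) @ psi)) ** 2

# Group properties: composition and inverse.
composition_ok = np.allclose(U_trans(2) @ U_trans(3), U_trans(5))
inverse_ok = np.allclose(U_trans(-2), U_trans(2).conj().T)

print(P_before, P_after, composition_ok, inverse_ok)
```

The two probabilities agree because the shift matrix is unitary, which is the discrete analogue of the cancellation $U^\dagger_{trans}U_{trans} = 1$ in (4.43).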
In fact, there is another option, as Wigner was the first to demonstrate, in his
famous unitarity–antiunitarity theorem (Wigner, 1959). The other option—indeed the
only other possibility compatible with (4.44)—is that S(g) be an antiunitary operator.
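An operator T is antiunitary if, for some complete orthonormal basis $\{|n\rangle\}$ of the state
space (which we shall assume here to be separable, i.e., to allow a denumerable basis),
$$(T|n\rangle, T|m\rangle) = \delta_{nm} \tag{4.45}$$
$$T\sum_n a_n|n\rangle = \sum_n a_n^*\,T|n\rangle \tag{4.46}$$
The property (4.46) indicates that T is an antilinear operator. The symmetry requirement (4.44) follows immediately:
$$|\Phi\rangle = \sum_n a_n|n\rangle, \quad |\Psi\rangle = \sum_m b_m|m\rangle \tag{4.47}$$
$$\begin{aligned}
|(T|\Phi\rangle, T|\Psi\rangle)| &= \Big|\Big(\sum_n a_n^*\,T|n\rangle, \sum_m b_m^*\,T|m\rangle\Big)\Big| \\
&= \Big|\sum_{n,m} a_n b_m^*\,(T|n\rangle, T|m\rangle)\Big|
\end{aligned} \tag{4.48}$$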
A symmetry group cannot consist purely of antiunitary operators, for the simple reason
that the product of two antilinear operators must be linear. Indeed, the only case of
physical interest in which the antiunitary option is required is for the discrete group
consisting of (i) the identity and (ii) the time-reversal operation t → −t. That time
reversal should entail a complex conjugation is plausible once we consider that the
time-dependence of quantum states in the energy basis involves the factor $e^{-iEt}$ with
the energy eigenvalue E real. For a classical particle the time-reversal operation is
easily described in phase-space as the mapping taking q(t), p(t) to
where the subscript “tr” denotes the time-reversed trajectory. In quantum mechanics,
the corresponding mapping is realized by an antiunitary operator T (the need for
the “anti” will be shortly apparent) with the Heisenberg operators (we omit the “H”
subscript to avoid clogging the notation) transforming like
It can be readily shown (see, for example, (Messiah, 1966), Chapter XV) that the
most general antilinear operator satisfying (4.49) takes the form
T = UK (4.58)
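A small numerical sketch of the form (4.58) may help. It assumes the standard reading in which U is unitary and K denotes complex conjugation of the components in a fixed basis; the random matrix `U` and the vectors below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# A random unitary matrix obtained from a QR decomposition (illustrative).
Z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
U, _ = np.linalg.qr(Z)

def T(vec):
    """Antiunitary operator T = U K: conjugate the components, then apply U."""
    return U @ np.conj(vec)

phi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
a = 0.3 + 0.8j

inner = np.vdot(phi, psi)          # (Phi, Psi)
inner_T = np.vdot(T(phi), T(psi))  # (T Phi, T Psi)

# The inner product is complex conjugated, so its modulus is preserved.
conjugated_ok = np.isclose(inner_T, np.conj(inner))
# Antilinearity: T(a Phi) = a* T(Phi).
antilinear_ok = np.allclose(T(a * phi), np.conj(a) * T(phi))
# The product of two antilinear operators is linear: T(T(a Phi)) = a T(T(Phi)).
square_linear_ok = np.allclose(T(T(a * phi)), a * T(T(phi)))

print(conjugated_ok, antilinear_ok, square_linear_ok)
```

The last check illustrates why a symmetry group cannot consist purely of antiunitary operators: applying T twice gives a linear operator.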
In other words, the amplitude that the initial state $|\Psi\rangle$ will be found to have evolved
to the state $|\Phi\rangle$ after time T is equal to the corresponding amplitude for the symmetry
rotated states $U(g)|\Psi\rangle$ and $U(g)|\Phi\rangle$. Now suppose that G is a finite-dimensional
linear Lie group: namely, a group of matrices parameterizable by a finite set of group
parameters $\omega^\alpha$ and finite-dimensional generator matrices $T_\alpha$,
If the Lie group in question is unitary or orthogonal (e.g., the rotation group) then the
group parameters $\omega^\alpha$ can be chosen to be real and the generators $T_\alpha$ to be hermitian
8 This definition of a “dynamical symmetry” differs from the usage introduced by Wigner (cf. Chapter
3), where the term was reserved for symmetries of a non-geometrical character.
matrices. For non-unitary groups, such as the homogeneous Lorentz group, discussed in
greater detail in the following chapter, some of the generators must be non-hermitian
(if we follow usual convention and continue to parameterize the group in terms of
real parameters ω α ). In either case, the discussion of Wigner’s theorem above makes
clear that individual group operations g must be represented on the Hilbert space of
quantum states by unitary operators (putting aside the special case of time reversal
for the moment) U (g), with
The commutativity of the Hamiltonian with arbitrary group operations U(g) then
implies, for symmetry operations infinitesimally close to the identity, $g = 1 - i\omega^\alpha T_\alpha$,
$U(g) = 1 - i\omega^\alpha J_\alpha$, that
$$[J_\alpha, H] = 0 \tag{4.67}$$
[J, H] = 0 (4.69)
$e^{-i\vec\alpha\cdot\vec J}$, acting on states in Hilbert space. The general requirement of dynamical
rotational invariance (4.62) implies in particular that the propagator $K(\vec r_f, T; \vec r_i, 0) = \langle\vec r_f|e^{-iHT}|\vec r_i\rangle$ for detecting the particle at $\vec r_f$ at time T if localized initially at time zero
at $\vec r_i$ should satisfy, for any fixed rotation $R(\vec\alpha)$
9 See Goldstein (Goldstein, 2002), Chapter 9 for a review of the essential properties of canonical
transformations.
10 This is a generating function of the second type, $F_2$, in the notation of Goldstein, op. cit. We only
consider time-independent generating functions here: the new Hamiltonian is then equal to the old one,
re-expressed in terms of the new canonical variables.
of quantum transformation theory in the late 1920s. Jordan showed that the operator
Ucan implementing this transformation for the quantum kinematic variables, for the
special case of classical generating functions of the form
$$F(q,P) = \sum_n f_n(q)\,g_n(P) \tag{4.73}$$
with C an arbitrary constant. With this form Jordan could show (cf. Problem 2 at
the end of this chapter)
$$q = U_{can}^{-1}\,Q\,U_{can} \tag{4.75}$$
$$p = U_{can}^{-1}\,P\,U_{can} \tag{4.76}$$
In the formula (4.74) the round bracket expressions (Q, P ) and (fn (Q), gn (P )) imply a
specific ordering of the non-commuting operators Q and P : one is instructed to order
all Qs to the left of all P s in the formally expanded exponential in (4.74). Although
a formal demonstration of (4.74, 4.75, 4.76) is straightforward, the actual existence
of Ucan is not guaranteed, and in general the operator obtained in this fashion is not
even unitary! We will shortly provide an example of the problems that can arise in
this connection—but first, a “nice” canonical transformation where all the desired
properties of a quantum symmetry obtain in an unproblematic way. We consider the
generating function
$$F(q,P) = \frac{1}{2d}\left(bP^2 + 2qP - cq^2\right) \tag{4.77}$$
where a, b, c, d are real constants satisfying ad − bc = 1. The result is a linear canonical
transformation
Q = aq + bp (4.78)
P = cq + dp (4.79)
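That the generating function (4.77) reproduces the linear map (4.78), (4.79) can be checked with the classical relations for a generating function of the second type, $p = \partial F/\partial q$ and $Q = \partial F/\partial P$. The sketch below uses illustrative values of b, c, d (with a fixed by $ad - bc = 1$); the function names are hypothetical.

```python
import numpy as np

# Illustrative constants; a is fixed by the constraint ad - bc = 1.
b, c, d = 0.5, 0.4, 2.0
a = (1.0 + b * c) / d

def dF_dq(q, P):
    """Partial derivative of F(q, P) = (b P^2 + 2 q P - c q^2) / (2d) w.r.t. q."""
    return (P - c * q) / d

def dF_dP(q, P):
    """Partial derivative of F(q, P) with respect to P."""
    return (b * P + q) / d

def new_variables(q, p):
    """Solve p = dF/dq for P, then evaluate Q = dF/dP."""
    P = c * q + d * p
    Q = dF_dP(q, P)
    return Q, P

q, p = 0.7, -1.3
Q, P = new_variables(q, p)

# Determinant of the linear map (q, p) -> (Q, P); for one degree of freedom
# this equals the Poisson bracket {Q, P}.
jacobian_det = np.linalg.det(np.array([[a, b], [c, d]]))

print(Q, a * q + b * p, P, jacobian_det)
```

The unit determinant is the statement that the map is canonical.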
In this case the general formula for the symmetry representative Ucan (4.74) gives
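$$U_{can} = \exp\left(i\left\{\frac{b}{2d}P^2 - \frac{c}{2d}Q^2 + \left(\frac{1}{d}-1\right)(Q,P)\right\}\right) \tag{4.80}$$
We now derive an explicit formula for this operator as an integral kernel $\mathcal{U}(Q,Q')$
acting on coordinate wavefunctions $\psi(Q)$ (so that the operator P in (4.80) becomes
$\frac{1}{i}\frac{\partial}{\partial Q}$):
$$U_{can}\,\psi(Q) = \int\mathcal{U}(Q,Q')\,\psi(Q')\,dQ' \tag{4.81}$$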
where the arbitrary constant C in (4.74) is chosen to ensure unitarity of the resultant
kernel:
$$\int\mathcal{U}(Q,Q')\,\mathcal{U}^*(Q'',Q')\,dQ' = \delta(Q - Q'') \tag{4.83}$$
$$q \to Q = f(q) \tag{4.84}$$
$$p \to P = \frac{1}{f'(q)}\,p \tag{4.85}$$
$$U_{can} = C\exp\left(i\,(f(Q) - Q, P)\right) \tag{4.87}$$
We assume that f (q) is (a) monotone increasing, and (b) invertible. Unfortunately,
irrespective of the choice of C, Ucan fails to be norm-preserving for general non-linear
choices of the reparameterization function f (q), as
$$F(q,P) = f(q)P + \frac{1}{2i}\ln|f'(q)| \tag{4.89}$$
The relation between the new and old momentum variables is now
$$p = \frac{1}{2}\left(f'(q)P + Pf'(q)\right) \tag{4.90}$$
consistent with hermiticity of both p and P; the symmetry representative becomes
$$U_{can} = |f'(Q)|^{1/2}\exp\left(i\,(f(Q) - Q, P)\right) \tag{4.91}$$
generating the norm-preserving action
defined in (4.36), but with the initial and final (imaginary) times shifted to general
values, so that the elapsed time T = tf − ti . We divide the finite interval (ti , tf ) into
N subintervals of size τ = T /N by introducing N − 1 intermediate times
$$t_n = t_i + n\tau, \quad n = 1, 2, 3, \ldots, N-1 \tag{4.94}$$
11 For a rigorous introduction to conditional Wiener measures, including appropriate convergence proofs
for potential theory, see Chapter 3 of (Glimm and Jaffe, 1987).
The functional (path-integral) framework 87
these intermediate times. Inserting a complete set of intermediate states, the finite-
time heat kernel (4.93), can be written as a multiple convolution of the N kernels
which accomplish the time-evolution of the system over the temporal subintervals
$(t_n, t_{n+1})$, $n = 0, 1, 2, \ldots, N-1$:
$$K_E(q_f,t_f;q_i,t_i) = \int\prod_{n=1}^{N-1}dq_n\,\langle q_f|e^{-\tau H_{sho}}|q_{N-1}\rangle\langle q_{N-1}|e^{-\tau H_{sho}}|q_{N-2}\rangle\cdots$$
One easily sees, from the exact result (4.36), that in the limit of large N (T fixed,
τ → 0), the individual kernel factors in (4.95) become
Inserting this result in the expression (4.95) for the finite-time evolution kernel, we
find
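$$K_E(q_f,t_f;q_i,t_i) = \lim_{N\to\infty}\int\prod_{n=1}^{N-1}dq_n\,e^{-S_E} \tag{4.97}$$
$$S_E \equiv \tau\sum_{n=0}^{N-1}\left(\frac{m}{2}\left(\frac{q_{n+1}-q_n}{\tau}\right)^2 + \frac{1}{2}m\omega^2\,\frac{q_{n+1}^2+q_n^2}{2}\right) \tag{4.98}$$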
The limit on the right-hand side of (4.97), of course, just yields our Euclidean
propagator (4.36). Formally, identifying qn = q(tn ), the corresponding limit of the
exponent in (4.98) yields the Euclidean action
$$S_E = \int_{t_i}^{t_f}\left\{\frac{1}{2}m\dot q^2(t) + \frac{1}{2}m\omega^2q^2(t)\right\}dt \tag{4.99}$$
and the above-mentioned limit can therefore be interpreted as an integral of $e^{-S_E}$ over
all “paths” q(t) subject to the boundary conditions $q(t_i) = q_i$, $q(t_f) = q_f$. In fact, it can
be shown that this limit defines a countably additive measure (called a conditional
Wiener measure to indicate the boundary conditions) over the space of continuous
functions q(t) defined on the interval (ti , tf ) (see (Glimm and Jaffe, 1987)), whence
the full weight of Lebesgue integration theory can be brought to bear to give a rigorous
meaning to this “functional integral” over the space of continuous functions. We shall
use the notation Dq(t) to indicate the measure defining this integral, as follows:
$$K_E(q_f,t_f;q_i,t_i) = \int_{q(t_i)=q_i,\;q(t_f)=q_f}e^{-S_E}\,Dq(t) \tag{4.100}$$
The connection of SE to the conventional action S, defined as the real time integral
of the Lagrangian
$$S = \int_{t_i}^{t_f}\left\{\frac{1}{2}m\dot q^2(t) - \frac{1}{2}m\omega^2q^2(t)\right\}dt \tag{4.101}$$
can be seen if we make the reverse analytic continuation back to real time
$$t \to e^{i(\frac{\pi}{2}-\epsilon)}\,t = (i+\epsilon)\,t \tag{4.102}$$
$$\tau = \frac{t}{N} \to (i+\epsilon)\,\tau \tag{4.103}$$
where the rotation is by an angle $\frac{\pi}{2}-\epsilon$ in the complex plane, with $\epsilon$ a positive
infinitesimal quantity, to be set to zero after the integrations in (4.97) are done. The
need for this maneuver is apparent if we examine the discretized Euclidean action
(4.98) after the continuation (4.103):
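$$\begin{aligned}
-S_E \to{}& i\tau\sum_{n=0}^{N-1}\left(\frac{m}{2}\left(\frac{q_{n+1}-q_n}{\tau}\right)^2 - \frac{1}{2}m\omega^2\,\frac{q_{n+1}^2+q_n^2}{2}\right) \\
&- \epsilon\tau\sum_{n=0}^{N-1}\left(\frac{m}{2}\left(\frac{q_{n+1}-q_n}{\tau}\right)^2 + \frac{1}{2}m\omega^2\,\frac{q_{n+1}^2+q_n^2}{2}\right)
\end{aligned} \tag{4.104}$$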
$$L = \frac{1}{2}m\dot q^2(t) - \frac{1}{2}m\omega^2q^2(t) \tag{4.106}$$
We again emphasize that (4.105) must be interpreted as containing a hidden regularizing “$\epsilon$” term in the exponent in order to be meaningful. Note that we have temporarily
reintroduced Planck’s constant in (4.105), abandoning the choice of natural units
($\hbar = 1$) for a moment. The classical limit $\hbar \to 0$ clearly results in strong damping
except where the phase of the exponent is stationary, which is the extremal action
principle of classical mechanics selecting the classical path $q_{cl}(t)$:
$$\frac{\delta}{\delta q(t)}\int L\,dt\,\bigg|_{q=q_{cl}} = 0 \tag{4.107}$$
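$$\mathrm{Tr}\left(e^{-H_{sho}(t_f-t_i)}\right) = \int dQ\,\langle Q|e^{-H_{sho}(t_f-t_i)}|Q\rangle \tag{4.108}$$
$$= \int dQ\int_{q(t_i)=q(t_f)=Q}e^{-S_E}\,Dq(t) \tag{4.109}$$
$$= \int_{q(t_i)=q(t_f)}e^{-S_E}\,Dq(t) \tag{4.110}$$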
In the last line the integration over all paths which begin and end at a given coordinate
Q, followed by an integration over all Q, has been replaced by a functional integral
over all periodic paths satisfying $q(t_i) = q(t_f)$. Choosing $t_i = -\beta/2$, $t_f = +\beta/2$, we see
that this functional integral actually provides a simple reformulation of the finite-temperature partition function $Z \equiv \mathrm{Tr}(e^{-\beta H_{sho}})$ of the simple harmonic oscillator.
Thus the path integral also provides, in its Euclidean version, an alternative tool
for computations in quantum statistical mechanics as well as for real-time quantum
mechanics.
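The statement that the periodic Euclidean path integral reproduces the partition function can be tested numerically at finite N. The sketch below is a rough illustration under stated assumptions: the coordinate integrals are replaced by sums on a uniform grid (the grid range and spacing are arbitrary choices), and each short-time kernel is taken as the free Gaussian factor $(m/2\pi\tau)^{1/2}e^{-m(q'-q)^2/2\tau}$ times the potential factor from (4.98). The result is compared with the well-known closed form $Z = 1/(2\sinh(\beta\omega/2))$ in units with $\hbar = 1$.

```python
import numpy as np

m, omega, beta = 1.0, 1.0, 2.0   # illustrative parameters
N = 64                           # number of time slices
tau = beta / N

# Uniform grid for the coordinate integrals (an assumption of this sketch).
q = np.linspace(-6.0, 6.0, 241)
h = q[1] - q[0]

# Short-time kernel: free Gaussian factor times the symmetrized potential factor.
dq = q[:, None] - q[None, :]
q2sum = q[:, None] ** 2 + q[None, :] ** 2
kernel = np.sqrt(m / (2 * np.pi * tau)) * np.exp(
    -m * dq ** 2 / (2 * tau) - tau * m * omega ** 2 * q2sum / 4
)

# Each matrix product performs one intermediate integral; the trace closes the path.
M = h * kernel
Z_lattice = np.trace(np.linalg.matrix_power(M, N))

Z_exact = 1.0 / (2.0 * np.sinh(beta * omega / 2.0))
print(Z_lattice, Z_exact)
```

The two values agree up to the discretization error in $\tau$, which shrinks as N is increased at fixed $\beta$.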
$$H = \frac{1}{2m}\mathbf{p}^2 + V(\mathbf{q}) = H_0 + V \tag{4.111}$$
Under rather loose conditions on the potential V (q) (e.g., it is sufficient that V be a
polynomial in q, bounded below so as to guarantee a “bottom” to the energy spectrum)
it is possible to show (Glimm and Jaffe, 1987) that for infinitesimal time intervals
$\tau = (t_f - t_i)/N$, $e^{-H\tau} \simeq e^{-H_0\tau}e^{-V\tau}$, in the sense that
$$\langle q_f|e^{-H(t_f-t_i)}|q_i\rangle = \langle q_f|(e^{-H\tau})^N|q_i\rangle = \lim_{N\to\infty}\langle q_f|(e^{-H_0\tau}e^{-V\tau})^N|q_i\rangle \tag{4.112}$$
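The product formula in (4.112) can be illustrated in a finite-dimensional setting. The sketch below replaces $H_0$ and $V$ by two small non-commuting symmetric matrices (hypothetical stand-ins, not the operators of the text) and compares $(e^{-H_0\tau}e^{-V\tau})^N$ with $e^{-(H_0+V)}$ for increasing N, with total time set to 1.

```python
import numpy as np

def expm_sym(A, s):
    """Return exp(s*A) for a real symmetric matrix A and a real scalar s."""
    w, v = np.linalg.eigh(A)
    return (v * np.exp(s * w)) @ v.T

# Hypothetical stand-ins for H0 and V; they do not commute.
H0 = np.diag([0.5, 1.5])
V = np.array([[1.0, 0.4], [0.4, 0.5]])

exact = expm_sym(H0 + V, -1.0)   # e^{-(H0 + V) T} with T = 1

def trotter(N):
    """(e^{-H0 tau} e^{-V tau})^N with tau = 1/N."""
    tau = 1.0 / N
    step = expm_sym(H0, -tau) @ expm_sym(V, -tau)
    return np.linalg.matrix_power(step, N)

errors = {N: np.linalg.norm(trotter(N) - exact) for N in (10, 100, 1000)}
for N, err in errors.items():
    print(N, err)
```

The error falls roughly like 1/N, reflecting the fact that the mismatch in each factor is of order $\tau^2$ times the commutator of the two pieces.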
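$$\begin{aligned}
K_E(q_f,t_f;q_i,t_i) = \lim_{N\to\infty}\int\prod_{n=1}^{N-1}dq_n\;&\langle q_f|e^{-H_0\tau}e^{-V\tau}|q_{N-1}\rangle \\
\cdot\,&\langle q_{N-1}|e^{-H_0\tau}e^{-V\tau}|q_{N-2}\rangle\cdots\langle q_{n+1}|e^{-H_0\tau}e^{-V\tau}|q_n\rangle\cdots\langle q_1|e^{-H_0\tau}e^{-V\tau}|q_i\rangle
\end{aligned} \tag{4.113}$$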
Once again we focus attention on the individual matrix elements representing the
propagation amplitude over the infinitesimal intervals τ :
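$$\begin{aligned}
\langle q_{n+1}|e^{-H_0\tau}e^{-V(\mathbf{q})\tau}|q_n\rangle &= e^{-V(q_n)\tau}\,\langle q_{n+1}|e^{-H_0\tau}|q_n\rangle \\
&= e^{-V(q_n)\tau}\,\langle q_{n+1}|e^{-\frac{1}{2m}\mathbf{p}^2\tau}|q_n\rangle \\
&= e^{-V(q_n)\tau}\int\frac{dp_n}{2\pi}\,\langle q_{n+1}|p_n\rangle\langle p_n|e^{-\frac{1}{2m}\mathbf{p}^2\tau}|q_n\rangle \\
&= e^{-V(q_n)\tau}\int\frac{dp_n}{2\pi}\,e^{ip_n(q_{n+1}-q_n)-\frac{1}{2m}p_n^2\tau}
\end{aligned} \tag{4.114}$$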
Notice that each time interval is now associated with an auxiliary momentum variable
pn , n = 0, 1, . . . N − 1. Inserting the result (4.114) into the integral representation
(4.113), we find
$$K_E(q_f,t_f;q_i,t_i) = \lim_{N\to\infty}\int\prod_{n=1}^{N-1}dq_n\prod_{n=0}^{N-1}\frac{dp_n}{2\pi}\;e^{\,i\sum_{n=0}^{N-1}p_n(q_{n+1}-q_n)-\sum_{n=0}^{N-1}H(p_n,q_n)\tau} \tag{4.115}$$
where $H(p_n,q_n) = \frac{1}{2m}p_n^2 + V(q_n)$ is the c-number valued classical energy associated
with momentum $p_n$ and coordinate $q_n$. Although the integrals in (4.115) involve an
oscillatory factor, the second part of the exponent is negative and real, resulting
in absolute convergence of the integrals (provided, as mentioned above, that the
potential V (q) is bounded below!). As in the special case of the simple harmonic
oscillator, the limit defines a conditional Wiener measure over continuous phase-space
paths ({q(t), p(t)}, ti < t < tf ) with the boundary condition q(ti ) = qi , q(tf ) = qf and
p(ti ), p(tf ) unrestricted (all momentum integrals in (4.115) range from −∞ to +∞).
Of course, the Gaussian integral over the intermediate momentum pn in (4.114)
can easily be evaluated to leave us with
$$\int\frac{dp_n}{2\pi}\,e^{ip_n(q_{n+1}-q_n)-\frac{1}{2m}p_n^2\tau} = \left(\frac{m}{2\pi\tau}\right)^{1/2}e^{-\frac{1}{2}m\frac{(q_{n+1}-q_n)^2}{\tau}} \tag{4.116}$$
which we immediately recognize as the first factor in (4.96), the infinitesimal time
propagator for the simple harmonic oscillator. The product over all time intervals of
the remaining factor in (4.96) gives (with $V(q) = \frac{1}{2}m\omega^2q^2$ for the simple harmonic
oscillator)
$$\prod_{n=0}^{N-1}e^{-\tau\frac{m\omega^2}{2}\frac{q_{n+1}^2+q_n^2}{2}} = e^{-\tau\sum_{n=0}^{N-1}V(q_n)-\frac{m\omega^2}{4}(q_f^2-q_i^2)\tau} \to e^{-\tau\sum_{n=0}^{N-1}V(q_n)}, \quad \tau\to 0 \tag{4.117}$$
corresponding to the product of the potential terms: i.e., the first factor (outside the
integral) in (4.114). This establishes the equivalence of our new integral representation
(4.115), involving the Hamiltonian of the system and an integration over phase-space
paths in coordinate and momentum, with the results obtained previously (for the
harmonic oscillator) in which the exponent involved the Lagrangian (a function of
coordinates and velocities) and an integration over paths in coordinate space solely.
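The Gaussian momentum integral (4.116), which converts the Hamiltonian form into the Lagrangian form, can be checked by direct quadrature. The sketch below uses illustrative values for m, $\tau$ and the coordinate difference, and a plain Riemann sum on a wide momentum grid (an assumption of this sketch; the integrand decays fast enough that the cutoff is harmless).

```python
import numpy as np

m, tau = 1.0, 0.1   # illustrative mass and time step
delta_q = 0.3       # q_{n+1} - q_n

# Wide, fine momentum grid; the integrand is negligible at the endpoints.
p = np.linspace(-60.0, 60.0, 24001)
dp = p[1] - p[0]

integrand = np.exp(1j * p * delta_q - p ** 2 * tau / (2 * m)) / (2 * np.pi)
lhs = np.sum(integrand) * dp

rhs = np.sqrt(m / (2 * np.pi * tau)) * np.exp(-m * delta_q ** 2 / (2 * tau))

print(lhs.real, rhs)
```

The imaginary part of the numerical integral vanishes by symmetry, and the real part matches the closed form with the kinetic term $\frac{1}{2}m(\Delta q/\tau)^2\tau$ in the exponent.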
The Wiener measure defined by the limit in (4.115) will be indicated by a notation
analogous to that used previously in the Lagrangian formulation (4.100): namely,
$$K_E(q_f,t_f;q_i,t_i) = \int DpDq\;e^{\int_{t_i}^{t_f}(ip(t)\dot q(t)-H(p(t),q(t)))\,dt} \tag{4.118}$$
Implicit in this expression are (i) the boundary conditions $q(t_i) = q_i$, $q(t_f) = q_f$, and
(ii) the $\frac{1}{2\pi}$ factors in the measure for the momentum integrations, visible in (4.115).
A Hamiltonian path integral representation for the real-time propagation ampli-
tude K(qf , tf ; qi , ti ) can be recovered from the Euclidean version (4.115) by the
analytic continuation discussed above in the Lagrangian case: namely, we rotate
$\tau \to (i+\epsilon)\tau$, obtaining
$$K(q_f,t_f;q_i,t_i) = \lim_{N\to\infty}\int\prod_{n=1}^{N-1}dq_n\prod_{n=0}^{N-1}\frac{dp_n}{2\pi}\;e^{\,i\sum_{n=0}^{N-1}\left(p_n(q_{n+1}-q_n)-(1-i\epsilon)H(p_n,q_n)\tau\right)} \tag{4.119}$$
The integrals here are oscillatory except for the real factors $e^{-\epsilon H(p_n,q_n)\tau}$, which ensure
absolute convergence provided H is bounded below (and grows for large $q_n$, $p_n$). Again,
one typically uses an abbreviated notation for this real-time version of the Hamiltonian
path integral:
$$K(q_f,t_f;q_i,t_i) = \int DpDq\;e^{\,i\int_{t_i}^{t_f}(p(t)\dot q(t)-H(p(t),q(t)))\,dt} \tag{4.120}$$
but it must be understood that this integral is given meaning by a hidden $i\epsilon$ factor,
as in (4.119). If Planck’s constant is once more made explicit (as in (4.105)) and
the stationary phase approximation applied to the classical limit $\hbar \to 0$, we find the
modified Hamilton’s principle (Leech, 1965) as the condition selecting the classical
path in phase space:
$$\delta\int_{t_i}^{t_f}(p(t)\dot q(t) - H(p(t),q(t)))\,dt = 0 \tag{4.121}$$
of matrix elements between these initial and final states of products of Heisenberg
operators. It turns out that the path-integral method is ideally suited for the repre-
sentation of such matrix elements, but only if the corresponding operator product is
time-ordered. As a simple example, consider the propagation of the system from initial
time ti to final time tf with two intermediate times t1 , t2 specified, and t1 > t2 . The
product qH (t1 )qH (t2 ) is then time-ordered (later operator to the left), and
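$$\begin{aligned}
\langle q_f,t_f|q_H(t_1)q_H(t_2)|q_i,t_i\rangle &= \langle q_f|e^{-iHt_f}e^{iHt_1}\,\mathbf{q}\,e^{-iHt_1}e^{iHt_2}\,\mathbf{q}\,e^{-iHt_2}e^{iHt_i}|q_i\rangle \\
&= \langle q_f|e^{-iH(t_f-t_1)}\,\mathbf{q}\,e^{-iH(t_1-t_2)}\,\mathbf{q}\,e^{-iH(t_2-t_i)}|q_i\rangle
\end{aligned} \tag{4.122}$$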
We now repeat the steps leading from (4.112) to (4.113), dividing the time interval
(ti , tf ) into N subintervals. For very large N , we may assume that the times t1 , t2 are
arbitrarily close to the discrete times $t_{n_1}$, $t_{n_2}$. The only modifications to the previous
calculation are therefore the appearance of the position operator $\mathbf{q}$ before the states
$|q_{n_1}\rangle$ and $|q_{n_2}\rangle$ in the matrix elements (4.114), leading to the additional c-number
factor $q_{n_1}q_{n_2}$ in the full functional integral. In the continuum limit, this additional
factor is just $q(t_1)q(t_2)$, so we obtain, in analogy to (4.120)
$$\langle q_f,t_f|q_H(t_1)q_H(t_2)|q_i,t_i\rangle = \int DpDq\;q(t_1)q(t_2)\,e^{\,i\int_{t_i}^{t_f}(p(t)\dot q(t)-H(p(t),q(t)))\,dt} \tag{4.123}$$
It is important to realize that the order of the products q(t1 )q(t2 ) inside the path
integral (4.123) is irrelevant, as at this stage we are dealing with c-number real-valued
functions which multiply commutatively. However, the above argument shows that the
path integral automatically computes the matrix element of the time-ordered product
of the corresponding Heisenberg operators. In other words, irrespective of the time
order of t1 and t2 , we have
$$\langle q_f,t_f|T(q_H(t_1)q_H(t_2))|q_i,t_i\rangle = \int DpDq\;q(t_1)q(t_2)\,e^{\,i\int_{t_i}^{t_f}(p(t)\dot q(t)-H(p(t),q(t)))\,dt} \tag{4.124}$$
The arguments of the preceding section leading to the result (4.118) for the imaginary
time kernel may be repeated with the insertion of imaginary time Heisenberg operators
(which evolve according to $q_H(t) = e^{Ht}q_H(0)e^{-Ht}$): unsurprisingly, the corresponding
matrix element (which we distinguish with the subscript E) has the path-integral
representation
$$\langle q_f,t_f|T(q_H(t_1)q_H(t_2))|q_i,t_i\rangle_E = \int DpDq\;q(t_1)q(t_2)\,e^{\int_{t_i}^{t_f}(ip(t)\dot q(t)-H(p(t),q(t)))\,dt} \tag{4.125}$$
It is, of course, straightforward to repeat the above argument to establish that the
insertion of arbitrary multi-nomials in both the coordinate q(t) and momentum p(t)
values at distinct times in the integral (4.124) results in an integral representation
for the matrix element of the corresponding time-ordered products of the Heisenberg
coordinate qH (t) and momentum pH (t) operators. The same will be true were terms
in the exponent of (4.124) involving products of p(t) and q(t) at different times to be
present. Such terms do not appear in (4.124) as it stands, and one may wonder why
they ever would! However, the fact that the path-integral formulation involves only
commuting c-number functions—either on coordinate or on phase-space—leads to the
following puzzle, which perhaps has already occurred to the reader. If the Hamiltonian
H(p, q) contains terms with both coordinate and momentum operators, with some
specified ordering (chosen, of course, to maintain the self-adjoint property of H), how
can this be reflected in the path integral where different orders of multiplication of
the c-number valued p(t) and q(t) seem manifestly equivalent? For example, if at the
operator level the Hamiltonian contained a term (λ a real constant)
$$V_1 \equiv \lambda\,\mathbf{p}\mathbf{q}^2\mathbf{p} \tag{4.126}$$
$$V_1 = V_2 + \lambda\hbar^2 \tag{4.128}$$
so that the two Hamiltonians definitely lead to different quantum dynamics. The
solution to this quandary is to realize that in situations like this the apparently
innocent continuum limit (4.119) develops ambiguities which must be resolved by
temporarily separating the times of the coordinates and momenta to indicate the
desired ordering. Thus propagators for a Hamiltonian involving the term V1 above
would be generated by a path integral in which the c-number Hamiltonian H(p(t), q(t))
in (4.120) contains a term
where δ is a small time-interval which is only set to zero after the N → ∞ limit in
(4.119) is carried out. Likewise, if we desire the propagator in a theory containing the
term V2 above, we need to regularize the path integral by adding a term
$$\frac{\lambda}{2}\left(p(t+\delta)^2q(t)^2 + p(t)^2q(t+\delta)^2\right) \tag{4.130}$$
to H(p(t), q(t)) in (4.120), with the limit δ → 0 performed only after the discrete
time-limit defining the path integral is performed (i.e., τ → 0).
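The operator-ordering difference quoted in (4.128) can be verified with truncated matrices. The sketch below makes two assumptions that should be kept in mind: it takes $V_2 = \frac{\lambda}{2}(\mathbf{p}^2\mathbf{q}^2 + \mathbf{q}^2\mathbf{p}^2)$, the symmetric ordering suggested by the regularized term (4.130), and it represents $\mathbf{q}$ and $\mathbf{p}$ by harmonic-oscillator ladder matrices truncated at dimension `N`, in units with $\hbar = 1$. Truncation spoils the commutator only near the bottom-right corner, so the comparison is restricted to the upper-left block.

```python
import numpy as np

N = 30      # truncation dimension of the oscillator basis (assumption)
lam = 0.7   # illustrative coupling

# Ladder operator and the canonical pair with [q, p] = i (hbar = 1).
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
q = (a + a.T) / np.sqrt(2)
p = (a - a.T) / (1j * np.sqrt(2))

V1 = lam * (p @ q @ q @ p)
V2 = 0.5 * lam * (p @ p @ q @ q + q @ q @ p @ p)

# Products of four operators are exact away from the truncation edge.
block = slice(0, N - 4)
difference = (V1 - V2)[block, block]

print(np.allclose(difference, lam * np.eye(N - 4)))
```

The difference is a multiple of the identity, $\lambda\hbar^2$, so the two orderings shift all energies by a constant; more general orderings differ by non-trivial operators.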
non-perturbative formulations of quantum field theory begin with the study of such
expectation values of (products of) the field operators. This technique also lies at
the heart of modern approaches to the numerical computation of the spectrum of
theories such as quantum chromodynamics (the field theory describing the strong
interactions of quarks and gluons), where perturbative methods fail. We begin with
the result (4.124) for the real-time expectation value of the product of two Heisenberg
coordinate operators:
$$\langle q_f,+t|T(q_H(t_1)q_H(t_2))|q_i,-t\rangle = \int DpDq\;q(t_1)q(t_2)\,e^{\,i\int_{-t}^{+t}(p(t)\dot q(t)-H(p(t),q(t)))\,dt} \tag{4.131}$$
Note that we have chosen to evolve the system over the symmetric time-interval from
−t to +t, and that the $1 - i\epsilon$ factor needed to make the path integral well-defined (see
(4.119)) is explicitly displayed in the corresponding matrix element. We now assume
that our quantum system has a unique ground state $|0\rangle$ of energy $E_0$, separated in
energy from the first excited state (or states) by a finite gap $E_1 - E_0$. If we insert a
complete set of energy eigenstates $1 = \sum_n|n\rangle\langle n|$ (where $H|n\rangle = E_n|n\rangle$), we find
$$\begin{aligned}
e^{-iHt(1-i\epsilon)}|q_i,0\rangle &= \sum_n e^{-iE_nt(1-i\epsilon)}\,\langle n|q_i,0\rangle\,|n\rangle \\
&\to e^{-iE_0t(1-i\epsilon)}\,\langle 0|q_i,0\rangle\,|0\rangle + O(e^{-\epsilon(E_1-E_0)t}), \quad t\to\infty
\end{aligned}$$
$$\langle q_f,0|e^{-iHt(1-i\epsilon)} \to \langle 0|\,\langle q_f,0|0\rangle\,e^{-iE_0t(1-i\epsilon)} + O(e^{-\epsilon(E_1-E_0)t}), \quad t\to\infty \tag{4.132}$$
In other words, the infinite time limit of the path integral acts as a “low-pass filter”,
effectively selecting out the ground-state component of the initial and final states. If
we divide the matrix element in (4.131) by the same quantity without the T-product
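(i.e., by $\langle q_f,t|q_i,-t\rangle$) and take the large time limit, the exponential time factors and
overlap factors $\langle 0|q_i,0\rangle$ and $\langle q_f,0|0\rangle$ cancel, leaving
$$\begin{aligned}
\lim_{t\to\infty}\frac{\langle q_f,+t|T(q_H(t_1)q_H(t_2))|q_i,-t\rangle}{\langle q_f,t|q_i,-t\rangle} &= \frac{\langle 0|T(q_H(t_1)q_H(t_2))|0\rangle}{\langle 0|0\rangle} \\
&= \frac{\int DpDq\;q(t_1)q(t_2)\,e^{\,i\int_{-\infty}^{+\infty}(p(t)\dot q(t)-H(p(t),q(t)))\,dt}}{\int DpDq\;e^{\,i\int_{-\infty}^{+\infty}(p(t)\dot q(t)-H(p(t),q(t)))\,dt}}
\end{aligned} \tag{4.133}$$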
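The “low-pass filter” mechanism behind (4.133) is easiest to see in imaginary time, where the damping supplied by the $\epsilon$ term becomes an explicit decaying exponential. The sketch below is a Euclidean analogue under stated assumptions: a hypothetical three-level Hamiltonian `H`, an arbitrary observable `O`, and generic boundary vectors `qi`, `qf` with non-zero overlap on the ground state. It is not the real-time computation of the text.

```python
import numpy as np

# Hypothetical three-level system with a non-degenerate ground state.
H = np.array([[0.0, 0.2, 0.0],
              [0.2, 1.0, 0.3],
              [0.0, 0.3, 2.5]])
O = np.array([[1.0, 0.5, 0.0],
              [0.5, -1.0, 0.5],
              [0.0, 0.5, 2.0]])
qi = np.array([1.0, 1.0, 1.0])     # stand-in for the initial boundary state
qf = np.array([1.0, 0.5, -0.2])    # stand-in for the final boundary state

w, v = np.linalg.eigh(H)
ground = v[:, 0]
exact = ground @ O @ ground        # ground-state expectation value of O

def ratio(t):
    """<qf| e^{-Ht} O e^{-Ht} |qi> divided by <qf| e^{-2Ht} |qi>."""
    E = (v * np.exp(-w * t)) @ v.T
    return (qf @ E @ O @ E @ qi) / (qf @ E @ E @ qi)

for t in (2.0, 5.0, 20.0):
    print(t, ratio(t), exact)
```

The ratio approaches the ground-state expectation value with corrections suppressed by $e^{-(E_1-E_0)t}$, and the dependence on the boundary vectors cancels between numerator and denominator.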
with $V(\vec r)$ a rotationally invariant potential, $V(R(\vec\alpha)\vec r) = V(\vec r)$. The dynamical rotational invariance expressed by (4.70) follows immediately by making a change of
integration variable in the functional integral $\vec r(t) \to R(\vec\alpha)\vec r(t)$ (rendered precise by
discretization, as in (4.104)). All that is required is (a) the invariance property of
the Lagrangian, $L(R(\vec\alpha)\dot{\vec r}, R(\vec\alpha)\vec r) = L(\dot{\vec r}, \vec r)$, and (b) the invariance of the functional
measure, which follows from the unimodular property of the rotation matrices,
$\det(R(\vec\alpha)) = 1$.
For the canonical symmetries discussed previously, in which new coordinates and
momenta are introduced which are in general non-linear functions of the old ones (cf.
Section 4.1.3), the realization of the canonical symmetry at the quantum level can
involve some subtle issues. Classically, for generating functions lacking explicit time-
dependence, the new Hamiltonian (expressed in the new variables) is algebraically
equal to the old Hamiltonian. In the quantum case there may be additional “anomalous” terms of order $\hbar$ (or higher powers of $\hbar$). In the path-integral formalism these
terms appear as (a) a consequence of a non-trivial Jacobian in the change of variables
qn , pn in the (discretized) Hamiltonian version of the path integral (4.119), and (b)
time reordering of coordinate and momentum variables when a discretized version of
the continuous classical contact transformation is implemented (recall our discussion
above of operator ordering in the path-integral context). We shall return later in the
book to the important issue of quantum anomalies in classical symmetries in quantum
field theory: the reader interested in a more thorough discussion of these issues in non-
relativistic quantum mechanics is referred to the work of Swanson (Swanson, 1993).
where g(α) is the folding function. One may also define “out-states” correspondingly
as physical states in which the system goes over to a definite number of free outgoing
particles after the collision. These are not the states prepared in any conceivable
accelerator, but they are the states measured by the detectors after the collision has
taken place. The amplitude that a given incoming state $|\alpha\rangle_{in}$ will then be found to be
in the state $|\beta\rangle_{out}$ by a detector measurement after the collision is just the overlap
all the amplitudes for possible final states β arising from a given initial state α. The
probability interpretation of quantum mechanics requires that the sum of the absolute
squares of these amplitudes must be unity (something has to happen!). This is just the
property of a unitary matrix. The unitarity of the S-matrix, which we shall see below
follows from the hermiticity of the Hamiltonian, is one of the fundamental constraints
which we will have to keep in mind when building quantum field theories.
$$|\Psi\rangle \to \Psi$$
$$\langle\Phi|\Psi\rangle \to (\Phi,\Psi)$$
$$\|\,|\Psi\rangle\,\| \to \|\Psi\| \tag{4.142}$$
In the Schrödinger picture we have time-dependent states Ψ(t), and the question arises
whether it makes sense to consider infinite time limits (“far past” or “far future”) of
such states. In general, a sequence of states Ψn is said to converge weakly to Υ if
for any fixed Hilbert space vector Φ. In particular, a sequence of states has a weak
limit if each component of these states in a complete orthonormal basis converges
13 For a more detailed introduction to the relevant concepts from functional analysis, see (Newton, 1966),
Chapter 6.
Ψn → Υ (4.144)
The reason for the appellation “weak” becomes apparent when we consider that the
sequence of states (1, 0, 0, 0, . . .), (0, 1, 0, 0, . . .), (0, 0, 1, 0, . . .), etc., specified by listing
their components in a denumerable orthonormal basis, converges weakly to zero,14 even
though the norm of each state in the sequence is unity! Another equally off-putting
case is given by the wavefunction for a localized wave-packet, given in coordinate
space by
$$\Psi(\vec r, t) = \langle \vec r\,|\Psi, t\rangle = \int g(\vec p)\, e^{i\vec p\cdot\vec r - i\frac{\vec p^{\,2}}{2m}t}\, d^3p \tag{4.145}$$
which also converges weakly to zero at large (negative or positive) times, due to the
famous spreading of the wave-packet, which implies that the overlap of Ψ(t) with
any (normalizable, and hence essentially localized) state must vanish at large times.
Our intuitive feeling of “convergence” conforms more closely to a stronger requirement
than that implied by (4.143). We say that the sequence of states Ψn converges strongly
to the state vector Υ if the norm of the difference vectors converges to zero:
$$\lim_{n\to\infty} \Psi_n \Rightarrow \Upsilon \tag{4.147}$$
With this definition, neither of the examples of weak convergence given in the
preceding paragraph survives (as the individual states in the sequence have fixed norm
and can clearly not converge in the strong sense to zero!). Correspondingly, weak (resp.
strong) convergence of a sequence of operators On to a limit operator O can be defined
by requiring that for every fixed state Φ, On Φ → OΦ (resp. On Φ ⇒ OΦ).
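The two examples of weak-but-not-strong convergence discussed above can be checked numerically. The sketch below is illustrative only: the fixed vector with components c_n = 1/n, the one-dimensional Gaussian packet with ħ = 1, and the function names are all assumptions made for the example. For the packet, the overlap (Ψ(0), Ψ(t)) reduces to the integral of |g(p)|² exp(−ip²t/2m), which for a Gaussian of momentum width s has modulus (1 + s⁴t²/m²)^(−1/4).

```python
import numpy as np

# Example 1: orthonormal basis vectors e_1, e_2, ... tested against one fixed vector.
N = 100_000
phi = 1.0 / np.arange(1, N + 1)   # square-summable components c_n = 1/n (arbitrary choice)

def basis_overlap(n):
    """(phi, e_n): the n-th component of phi, which tends to zero as n grows."""
    return phi[n - 1]

def basis_norm(n):
    """Norm of e_n, i.e. the distance from e_n to the would-be strong limit 0."""
    e_n = np.zeros(N)
    e_n[n - 1] = 1.0
    return np.linalg.norm(e_n)

# Example 2: free Gaussian wave-packet in one dimension (hbar = 1).
def packet_overlap(t, m=1.0, s=1.0, p_max=12.0, n_pts=200_001):
    """(Psi(0), Psi(t)) = integral of |g(p)|^2 exp(-i p^2 t / 2m) dp."""
    p = np.linspace(-p_max, p_max, n_pts)
    dp = p[1] - p[0]
    weight = np.exp(-p**2 / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)  # normalized |g(p)|^2
    return np.sum(weight * np.exp(-1j * p**2 * t / (2 * m))) * dp

for n in (10, 1000, 100_000):
    print(n, basis_overlap(n), basis_norm(n))   # overlap shrinks, norm stays fixed
for t in (0.0, 10.0, 100.0):
    print(t, abs(packet_overlap(t)))            # overlap decays as the packet spreads
```

In both cases the overlap with a fixed vector goes to zero while the norm of each state stays at unity, so neither sequence can converge strongly to zero.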
14 Indeed, the overlap of these vectors with any finite-norm vector with components $c_n$ gives just $c_n \to 0$, $n \to \infty$, as the $c_n$ must go to zero if $\sum_n |c_n|^2 < \infty$.
found which approach each other arbitrarily closely in the “strong” sense in the limit
t → −∞, and that this unique association of free with interacting states allows us to
define a mapping from an arbitrary (finite norm) free state Ψ(t) evolving according
to H0 to an interacting state Ψin (t) evolving according to the full Hamiltonian H,
provided that the interaction potential V is sufficiently short-ranged. To see how to do
this, we work for the moment in coordinate representation and imagine that we are
provided with a normalizable solution Ψ(r, t) of the free time-dependent Schrödinger
equation
$$-\frac{1}{2m}\Delta\Psi(\vec r, t) = i\frac{\partial\Psi(\vec r, t)}{\partial t} \tag{4.148}$$
For definiteness, such a solution is given by (4.145), with g( p) a smearing function
peaked at some 3-momentum p0 and some width Δp, such that the resultant wave-
packet is localized in the neighborhood of the spatial origin at time t = 0. For any given
large negative time T , we now define an interacting solution Ψ(T ) (r, t) associated with
this freely evolving state as the solution of the interacting Schrödinger equation
$$-\frac{1}{2m}\Delta\Psi^{(T)}(\vec r, t) + V(\vec r)\Psi^{(T)}(\vec r, t) = i\frac{\partial\Psi^{(T)}(\vec r, t)}{\partial t} \tag{4.149}$$
subject to the boundary condition
we expect that in the far past the influence of a localized potential must vanish for
wave-packets then localized far from the center of the potential, i.e.,
and that in this limit Ψ(T ) (0) (i.e., the interacting solution matched in the far past
to a specified free wave-packet, run forward to time zero) converges to a well-defined
Heisenberg state (recall that the various representations are defined to coincide at
time zero):
where U (t, t0 ) is the time-development operator in the interaction picture (see (4.16)).
It then follows from (4.153) that U (0, T ) has a strong limit when T is taken to the
infinite past:
U (0, T ) ⇒ Ω− , T → −∞ (4.155)
where the Møller wave operator Ω− maps the free state Ψ onto the in-state Ψin
associated with it by the procedure described above:
Ψin = Ω− Ψ (4.156)
An exactly analogous procedure, this time taking the limit T → +∞, can be used
to associate any freely evolving wave-packet solution Ψ with an interacting state
converging strongly to it in the far future: the associated Heisenberg state is then
called Ψout :
Ψout = Ω+ Ψ (4.157)
U (0, T ) ⇒ Ω+ , T → +∞ (4.158)
In the case that the potential V admits bound states (a not infrequent situation!),
the above discussion conceals some subtleties which we shall mention briefly here. In
this situation the scattering eigenstates of the full Hamiltonian H are not complete,
as the bound state(s) are missing. Indeed, the Møller wave operators are in this case
norm-preserving maps from the full Hilbert space H(=L2 (R3 )) spanned by the free
solutions Ψ to the subspace Hscat spanned by the interacting scattering states Ψin (or
Ψout ). Indeed, for a bound state of energy Eb , normalized eigenstate Ψb , consider the
overlap matrix element
Recall that Ψ here represents a localized wave-packet (e.g., the state given in (4.145))
so that e−iH0 T Ψ will spread in the limit T → −∞ to a state with pointwise vanishing
probability density, and hence vanishing overlap with any stationary, localized bound-
state wavefunction Ψb . Thus, the right-hand side of (4.159) vanishes in the limit
T → −∞. As Ω− is the strong limit of U (0, T ) as T → −∞, it follows that this Møller
operator maps an arbitrary free state onto the proper subspace of states orthogonal
15 We also note here that the Coulomb potential fails to satisfy the falloff condition posited above, and
indeed, non-relativistic Coulomb scattering exhibits a number of subtleties, which, however, will not concern
us further here. Related field-theoretic subtleties in defining the scattering matrix for theories with massless
particles will be considered explicitly in Section 19.2.
HΩ± = Ω± H0 (4.161)
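The distinction between a norm-preserving map and a unitary one, which is the situation of the Møller operators when bound states are present, can be made concrete with the shift matrix O_{nm} = δ_{n,m+1}. The sketch below is an illustration under a stated assumption: the infinite matrix is truncated to a rectangular (N+1)×N block so that it can be stored, and the variable names are arbitrary.

```python
import numpy as np

N = 6
# Truncated shift operator: O_{nm} = delta_{n, m+1}, mapping N components into N+1.
O = np.zeros((N + 1, N))
for m in range(N):
    O[m + 1, m] = 1.0

v = np.arange(1.0, N + 1)             # an arbitrary input vector
w = O @ v                             # shifted vector: first component is empty

print(np.linalg.norm(v), np.linalg.norm(w))   # equal norms
print(O.T @ O)                        # identity: the map is an isometry
print(O @ O.T)                        # projector missing the first basis vector
```

O†O = 1 expresses norm preservation, while OO† is only a projector onto the subspace of vectors with vanishing first component, just as the range of a Møller operator misses the bound states.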
We are finally equipped with the necessary tools to introduce the central concept
of scattering theory: the S-matrix, which we discussed qualitatively in the preceding
section. We recall that the Hilbert space L2 (R3 ) is separable, so the free solutions Ψ
may be expanded in a discrete orthonormal basis of finite-norm states Ψα (α a discrete
index)
$$\Psi = \sum_\alpha (\Psi_\alpha, \Psi)\,\Psi_\alpha \tag{4.162}$$
where the limits in the last two lines are strong, which together with the unitarity
of U (t, t0 ) at finite t, t0 and the completeness of the free states Ψα ensures that the
infinite discrete matrix S is unitary in the standard way (even in the presence of bound
states).17 In practice, it is more convenient for obvious reasons to use continuum-
normalized states: we go over to wave-packets of arbitrarily well-defined momentum,
for example, in which limit the S-matrix amplitudes remain well defined (again, with
16 As a simple example of a norm-preserving but non-unitary operator, consider the operator represented
by the infinite discrete matrix Onm = δn,m+1 , which maps an arbitrary vector in l2 , the Hilbert space
of square-summable infinite complex sequences, into a shifted vector of equal norm but one with no first
component.
17 The unitarity of the S-matrix defined as an overlap of in- and out-states even in the presence of unitary
defects in various interaction-picture operators will (fortunately) persist in quantum field theory, where the
defect will become total, given Haag’s theorem for the non-existence of the interaction picture, discussed
below in Chapter 10.
the proviso of suitably localized interaction potentials). Then the discrete index α
above is replaced by a specification of the momentum (and if present, spins) of the
incoming particle. It should be emphasized that the matrix S defined above is to be
thought of as the matrix of an operator S acting in the full Hilbert space spanned
by freely-evolving wave-packets. Energy conservation requires commutation of S with
H0 , the free Hamiltonian. Indeed, Ψα and Ψβ can be chosen to be free wave-packets
of arbitrarily well-defined H0 eigenvalue, in which case we certainly require Sβα to
vanish if Eβ ≠ Eα , and this is ensured by the intertwining property (4.161):
satisfies
$$U(t, -\infty) = 1 - i\int_{-\infty}^{t} V_{ip}(t_1)\,U(t_1, -\infty)\,dt_1 \tag{4.167}$$
or equivalently
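$$\Omega^- = 1 - i\int_{-\infty}^{t} e^{-iH_0 t}\,V_{ip}(t_1)\,e^{iH_0 t_1}\,\Omega^-\,e^{-iH_0(t_1 - t)}\,dt_1 \tag{4.169}$$
$$= 1 - i\int_{-\infty}^{t} e^{iH_0(t_1 - t)}\,V\,\Omega^-\,e^{-iH_0(t_1 - t)}\,dt_1 \tag{4.170}$$
$$= 1 - i\int_{-\infty}^{0} e^{iH_0 t_1}\,V\,\Omega^-\,e^{-iH_0 t_1}\,dt_1 \tag{4.171}$$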
We now take the matrix element of this result between two free wave-packet solutions
Ψβ and Ψα which are arbitrarily close to energy eigenstates with energy Eβ , Eα . The
free time-development factors e±iH0 t1 mean that the centroids of these packets are
moved very far from our presumably localized potential V at large negative times t1 ,
so we may at no cost insert an adiabatic switching factor $e^{\epsilon t_1}$ ($\epsilon > 0$) multiplying the
potential: for very small ε (ε will be taken to zero at the very end), switching off the
interaction potential at very large negative times will have no effect if the wave-packets
are still very far from the potential center. We then obtain
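$$(\Psi_\beta, \Omega^-\Psi_\alpha) = (\Psi_\beta, \Psi_\alpha) - i\int_{-\infty}^{0} e^{i(E_\beta - E_\alpha - i\epsilon)t_1}\,dt_1\,(\Psi_\beta, V\Omega^-\Psi_\alpha) \tag{4.172}$$
$$= (\Psi_\beta, \Psi_\alpha) - \frac{1}{E_\beta - E_\alpha - i\epsilon}\,(\Psi_\beta, V\Omega^-\Psi_\alpha) \tag{4.173}$$
$$= (\Psi_\beta, \Psi_\alpha) + \left(\Psi_\beta, \frac{1}{E_\alpha - H_0 + i\epsilon}\,V\Omega^-\Psi_\alpha\right) \tag{4.174}$$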
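For readers who wish to see the intermediate step, the time integral in (4.172) can be done explicitly. The following is a sketch of that elementary computation, written with ε denoting the adiabatic switching parameter; it assumes only that Ψβ is (arbitrarily close to) an H0 eigenstate of energy Eβ.

```latex
\begin{align*}
-i\int_{-\infty}^{0} e^{i(E_\beta - E_\alpha - i\epsilon)t_1}\,dt_1
  &= -i\left[\frac{e^{i(E_\beta - E_\alpha - i\epsilon)t_1}}
                  {i(E_\beta - E_\alpha - i\epsilon)}\right]_{-\infty}^{0}
   = -\frac{1}{E_\beta - E_\alpha - i\epsilon}
   = \frac{1}{E_\alpha - E_\beta + i\epsilon}
\end{align*}
```

The lower limit does not contribute because the modulus of the exponential is $e^{\epsilon t_1}$, which vanishes as $t_1 \to -\infty$. Since H0 is hermitian and acts on Ψβ with eigenvalue Eβ, the number Eβ in the denominator may be replaced by the operator H0 inside the inner product, which gives (4.174).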
As the Ψβ can be chosen to run over a complete basis of the Hilbert space, we
may remove it, obtaining the desired Lippmann–Schwinger equation, relating free to
interacting scattering states:
$$\Psi_{\alpha,\rm in} = \Psi_\alpha + \frac{1}{E_\alpha - H_0 + i\epsilon}\,V\,\Psi_{\alpha,\rm in} \tag{4.175}$$
In a similar fashion one may derive the Lippmann–Schwinger equation for out-states:
$$\Psi_{\alpha,\rm out} = \Psi_\alpha + \frac{1}{E_\alpha - H_0 - i\epsilon}\,V\,\Psi_{\alpha,\rm out} \tag{4.176}$$
Note that by multiplying both sides of (4.175) by Eα − H0 (at which point the iε
becomes irrelevant) we find
so that the interacting scattering state has the same energy relative to the full
Hamiltonian H as the free state Ψα which matches it in the far past has relative
to H0 .
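To see the Lippmann–Schwinger equation at work as an integral equation, the following sketch solves it by iteration in a simple model. Everything model-specific here is an assumption made for illustration: a one-dimensional potential V(x) = gδ(x) with ħ = 1, and the outgoing-wave Green function G(x, y) = −(im/k) exp(ik|x − y|) for 1/(E − H0 + iε). With a δ-function potential the integral collapses to the value of the wavefunction at the origin, so the iteration involves a single complex number; it converges when mg/k < 1. The function names are hypothetical.

```python
import numpy as np

def ls_iterate(k, m=1.0, g=0.3, n_iter=200):
    """Iterate psi(0) -> 1 + G(0,0) * g * psi(0), the LS equation evaluated at x = 0."""
    G00 = -1j * m / k                 # Green function at coincident points (assumed form)
    psi0 = 1.0 + 0.0j                 # zeroth order: the free plane wave at x = 0
    for _ in range(n_iter):
        psi0 = 1.0 + G00 * g * psi0   # one more term of the Born series
    return psi0

def psi_in(x, k, m=1.0, g=0.3):
    """In-state wavefunction rebuilt from the converged value at the origin."""
    psi0 = ls_iterate(k, m, g)
    return np.exp(1j * k * x) - 1j * m * g / k * np.exp(1j * k * np.abs(x)) * psi0

k, m, g = 1.0, 1.0, 0.3
t = psi_in(5.0, k, m, g) / np.exp(1j * k * 5.0)   # transmitted amplitude (x > 0)
r = -1j * m * g / k * ls_iterate(k, m, g)         # coefficient of exp(-ikx) for x < 0
print(t, r, abs(t) ** 2 + abs(r) ** 2)
```

The converged value agrees with the closed form k/(k + img), and the reflection and transmission probabilities sum to one, as unitarity requires.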
By carrying the time evolution in (4.166) all the way forward to t = +∞ we of
course obtain the S-matrix element Sβα :
At this point it will be convenient to return to Dirac notation for states and matrix
elements, as the fine points of convergence that infest time-dependent scattering theory
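have already been discussed adequately for our purposes. The Lippmann–Schwinger
equations (4.175, 4.176) will henceforth be written
$$|\alpha\rangle_{\rm in} = |\alpha\rangle + \frac{1}{E_\alpha - H_0 + i\epsilon}\,V|\alpha\rangle_{\rm in}, \qquad |\alpha\rangle_{\rm out} = |\alpha\rangle + \frac{1}{E_\alpha - H_0 - i\epsilon}\,V|\alpha\rangle_{\rm out} \tag{4.183}$$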
We will also abandon our previous insistence on normalizable (wave-packet) states
and allow the label α to denote continuum orthonormalized states of well-defined
momentum and energy (possibly containing a discrete spin index as well). The usual
completeness relations will then involve integrals (as well as spin sums), denoted
formally $\int d\alpha \ldots$ Similarly, the notation δαβ will denote a product of continuous
δ-functions (in the momentum variables) and discrete Kronecker δs (for any discrete
spin indices).
We now turn to the derivation of some formal scattering theorems of great
importance. By taking the adjoint of the Lippmann–Schwinger (LS) equation for an
in-state, we find
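$${}_{\rm in}\langle\beta| = \langle\beta| + {}_{\rm in}\langle\beta|V\frac{1}{E_\beta - H_0 - i\epsilon}$$
$$\Rightarrow\ {}_{\rm in}\langle\beta|V|\alpha\rangle_{\rm in} = {}_{\rm in}\langle\beta|V|\alpha\rangle + {}_{\rm in}\langle\beta|V\frac{1}{E_\alpha - H_0 + i\epsilon}V|\alpha\rangle_{\rm in}$$
$$= T^*_{\alpha\beta} + {}_{\rm in}\langle\beta|V\frac{1}{E_\alpha - H_0 + i\epsilon}V|\alpha\rangle_{\rm in} \tag{4.184}$$
$$= \langle\beta|V|\alpha\rangle_{\rm in} + {}_{\rm in}\langle\beta|V\frac{1}{E_\beta - H_0 - i\epsilon}V|\alpha\rangle_{\rm in}$$
$$= T_{\beta\alpha} + {}_{\rm in}\langle\beta|V\frac{1}{E_\beta - H_0 - i\epsilon}V|\alpha\rangle_{\rm in} \tag{4.185}$$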
In (4.184) we have used the LS equation on the right, and the definition (4.181) of
the T-matrix; in (4.185) one uses the LS equation on the left. Subtract the right-hand
sides of (4.184) and (4.185) to obtain
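$$T_{\beta\alpha} - T^*_{\alpha\beta} = {}_{\rm in}\langle\beta|V\left\{\frac{1}{E_\alpha - H_0 + i\epsilon} - \frac{1}{E_\beta - H_0 - i\epsilon}\right\}V|\alpha\rangle_{\rm in} \tag{4.186}$$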
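If we insert a complete set of free eigenstates of H0 , with $\int d\gamma\,|\gamma\rangle\langle\gamma| = 1$, on the right
of the second V in (4.186), we find
$$T_{\beta\alpha} - T^*_{\alpha\beta} = \int d\gamma\; {}_{\rm in}\langle\beta|V|\gamma\rangle\langle\gamma|V|\alpha\rangle_{\rm in}\left\{\frac{1}{E_\alpha - E_\gamma + i\epsilon} - \frac{1}{E_\beta - E_\gamma - i\epsilon}\right\} \tag{4.187}$$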
At this point we remind the reader of a famous identity: observe that
$$\frac{1}{x + i\epsilon} = \frac{x}{x^2 + \epsilon^2} - i\frac{\epsilon}{x^2 + \epsilon^2} \tag{4.188}$$
The function $\frac{\epsilon}{x^2+\epsilon^2}$ is a highly peaked function of x (for ε small) which integrates
to π: in other words, it is just πδ(x). The odd function $\frac{x}{x^2+\epsilon^2}$ is a regularized form
A particularly important special case of the above result occurs for forward scattering:
for example, when α, β refer to two-particle states with identical momenta, i.e., the
elastic scattering amplitude in the limit of zero exchanged momentum. Then (4.189)
becomes
$$\mathrm{Im}(T_{\alpha\alpha}) = -\pi\int d\gamma\,\delta(E_\gamma - E_\alpha)\,|T_{\gamma\alpha}|^2 \tag{4.190}$$
Evidently, the right-hand side of this relation describes the integrated probability (or
total cross-section) for the given initial state $|\alpha\rangle_{\rm in}$ to evolve into an arbitrary final
state $|\gamma\rangle_{\rm out}$ (naturally, of the same energy, hence the δ-function). The optical theorem
relates this to the imaginary part of the forward scattering amplitude Tαα . The
term “optical” derives from the special case where the scattering is that of photons off
neutral atoms in a medium, in which case the forward scattering amplitude is related
to the index of refraction and the integrated scattering cross-section to the absorption.
The Generalized Optical theorem (G.O.T.) is really nothing but the previously
advertised unitarity of the S-matrix, somewhat disguised, as the following brief
computation shows:
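$$(S^\dagger S)_{\beta\alpha} = \int d\gamma\, S^\dagger_{\beta\gamma} S_{\gamma\alpha} = \int d\gamma\, S^*_{\gamma\beta} S_{\gamma\alpha} \tag{4.191}$$
$$= \delta_{\beta\alpha} \tag{4.192}$$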
where all terms save the final Kronecker δ cancel in the penultimate line, courtesy of
the Generalized Optical theorem (4.189).
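The equivalence between unitarity and an optical-theorem relation can be checked in a finite-dimensional toy setting. The sketch below makes several assumptions for illustration only: the continuum of states is replaced by five discrete channels, the energy δ-function and the factor 2π are absorbed into the definition S = 1 − iT, and the unitary matrix is generated at random from a QR decomposition. With these conventions unitarity is equivalent to T − T† = −iT†T, whose diagonal part reads Im T_αα = −½ Σ_γ |T_γα|².

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                             # number of discrete channels (toy choice)
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
S, _ = np.linalg.qr(A)                            # the Q factor of a complex matrix is unitary

T = 1j * (S - np.eye(n))                          # defined so that S = 1 - i T

lhs = T - T.conj().T                              # analogue of T_{beta alpha} - T*_{alpha beta}
rhs = -1j * T.conj().T @ T                        # analogue of the sum over intermediate states

forward = np.imag(np.diag(T))                     # Im T_{alpha alpha}
total = -0.5 * np.sum(np.abs(T) ** 2, axis=0)     # -1/2 sum_gamma |T_{gamma alpha}|^2

print(np.allclose(lhs, rhs))
print(np.allclose(forward, total))
```

The negative sign of the forward imaginary part mirrors the sign in (4.190); the numerical factor differs only because of the simplified normalization adopted here.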
Of course, the above results still leave us a fair way from an actual scattering
experiment. In particular, we need to be able to convert information about S-matrix
elements into a precise statement of how many particles of given momentum and
type will emerge per unit time when a given target is placed in a particle beam of a
given intensity. Also of interest are the cases in which the Ψα,in state in (4.164) is a
one-particle state, while the Ψβ,out state may contain two, three, or more particles—
corresponding to the decay of an unstable particle. The relevant formulas connecting
the S-matrix elements to the desired phenomenological cross-sections and rates are
derived in Appendix B.
4.4 Problems
1. For the evolution operator U (t, t0 ) (4.16), verify the semigroup property
2. (a) Show that the Jordan operator Ucan given in (4.74) effects the appropriate
similarity transformation between old and new canonical coordinates (4.75,
4.76).
(b) Show that the operator (4.80) implementing linear canonical transformations
is (up to a multiplicative constant) equivalent to the unitary kernel (4.81).
(Hint: apply the operator (4.80) to the coordinate space wavefunction written
as a Fourier transform of the momentum-space wavefunction.)
3. The object of the following exercise is to build intuition for the very important
concepts of in/out states and the S-matrix, in a simple example where every-
thing can be worked out explicitly. The model being considered is that of a
one-dimensional repulsive δ-function potential, V (x) = gδ(x), g > 0 (see Baym
(Baym, 1990), p.113). Scattering experiments are performed by firing particles
of mass m and momentum k (ħ is set to unity throughout!) in from the left or
the right. By energy conservation, this results in outgoing particles (either to
the left or right) with the same magnitude of momentum k. Use a normalization
where the free particle plane wave moving left to right is given by a wavefunction
$\langle x|k\rangle = e^{ikx}$.
(a) Show that in coordinate space the Lippmann–Schwinger equation for right-
moving in-states takes the form (k > 0)
$$\Psi^{(+)}_{\rm in}(x; k) \equiv \langle x|k\rangle_{\rm in} = e^{ikx} + \int dy\, G_{\rm in}(x, y)\,V(y)\,\Psi^{(+)}_{\rm in}(y; k) \tag{4.193}$$
In this chapter we shall begin to explore the implications of the basic underlying
symmetry of relativistic quantum field theory: the Poincaré group incorporating
the symmetry of such theories under Lorentz transformations and translations in
spacetime. This is, of course, the second of the three fundamental physical ingredients
of quantum field theory discussed at length in Chapter 3 (the other two being
quantum theory and the locality principle). Minkowski’s introduction in 1908 of a
four-dimensional space, in which the Lorentz transformations of special relativity
could be interpreted as rotations preserving an indefinite metric, does not expand
the physical content of relativity, but it vastly improves our ability to visualize the
physical structure of the theory, and just as importantly, greatly simplifies the search
for theories satisfying the constraints of relativistic invariance.
Although we assume the reader to be familiar with the basic tenets of special
relativity, as formulated in spacetime concepts, we begin this chapter with a brief
review of the Lorentz and Poincaré groups, which provide the kinematic underpinnings
for the description of particle states in field theory. This will also provide a convenient
opportunity for introducing the reader to the notational conventions that will prevail
in the rest of this book.
1 Note that the physical energy and three spatial momentum components are conventionally taken as
the 0,1,2,3 components respectively of a contravariant vector kμ .
The Lorentz and Poincaré groups 109
$$x_\mu = g_{\mu\nu}x^\nu$$
The components of gμν are g00 = 1, g11 = g22 = g33 = −1, with all off-diagonal ele-
ments zero. The Lorentz transformation (5.1) preserves the relativistic scalar product
of any two four-vectors:
for all xρ , yρ . Differentiating first with respect to y ρ and then with respect to xν , one
finds
4x4 matrices satisfying (5.3) are said to lie in the fundamental representation of the
Lorentz group. By raising the index ρ we may rewrite this as
which implies
Because of the signs involved in raising and lowering indices, it should be noted that
the Λ matrices are not orthogonal in general (this will hold only for the subgroup of
purely spatial rotations).
Condition (5.3) can be written as the matrix equation
$$\Lambda^T g \Lambda = g \tag{5.6}$$
from which we find, by taking the determinant of both sides, that det(Λ)² = 1, i.e.,
det(Λ) = ±1. The set of Lorentz transformations with det(Λ) = +1 forms the “proper
Lorentz transformations” (obviously a subgroup). The class of transformations considered
can be refined further by considering only those transformations corresponding to
physically realizable changes of inertial frame. Such “orthochronous” transformations
leave the sign of the time component unchanged and are characterized by $\Lambda^0{}_0 > 0$.
The combined application of the proper and orthochronous requirements leads us
to the restricted Lorentz group, which contains all (and only) physically accessible2
Lorentz transformations: namely, those corresponding to physically accessible changes
of inertial frame. The unimodularity of the proper Lorentz transformations implies
the invariance of four-dimensional spacetime integrals under change of variable corre-
sponding to a Lorentz transform (as the Jacobian det(Λ) is unity):
2 We exclude “looking in the mirror” (parity transformations with $\Lambda^0{}_0 = 1$, det(Λ) = −1) from the set
of physically accessible changes of frame.
$$\int d^4x' \ldots = \int d^4x \ldots \tag{5.7}$$
for $x' = \Lambda x$.
The kinematic discussion above focussed on descriptions of coordinates of events:
in particle physics, the description of scattering processes is almost always exclusively
in terms of energies and momenta of the scattering particles.3 Using natural units,
ħ = c = 1, an isolated stable particle of mass m and well-defined spatial momentum
$\vec k$ has energy $E(k) = \sqrt{\vec k^2 + m^2}$. This “mass-shell condition” relating the relativistic
energy and momentum is more simply written k² ≡ k · k = m². We shall frequently
encounter integrals over the possible four-momenta of a stable particle subject to (i)
the mass-shell condition, and (ii) positivity of the energy. Under proper orthochronous
transformations, the invariance of the relativistic product (and the sign of the energy)
therefore implies the (not at all obvious!) invariance of the measure $d^3k/E(k)$:
$$\int d^4k\,\theta(k_0)\,\delta(k^2 - m^2)\,f(k_0, \vec k) = \int \frac{d^3k}{2E(k)}\,f(E(k), \vec k) \tag{5.8}$$
—a result which we shall employ on numerous occasions.
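The invariance of the measure d³k/E(k) can be verified directly: under a boost, the Jacobian of the map from old to new spatial momenta must equal the ratio E(k′)/E(k). The sketch below checks this numerically for a boost along the z-axis in the sign convention of (5.10). The finite-difference Jacobian, the sample momentum and the function names are illustrative assumptions.

```python
import numpy as np

def energy(k, m):
    """On-shell energy E(k) = sqrt(k^2 + m^2) in natural units."""
    return np.sqrt(k @ k + m * m)

def boost_momentum_z(k, omega, m):
    """Spatial momentum after a z-boost of rapidity omega applied to the on-shell four-momentum."""
    E = energy(k, m)
    return np.array([k[0], k[1], np.cosh(omega) * k[2] - np.sinh(omega) * E])

def jacobian(k, omega, m, h=1e-6):
    """Central-difference Jacobian matrix d k'_i / d k_j of the boost map."""
    J = np.zeros((3, 3))
    for j in range(3):
        dk = np.zeros(3)
        dk[j] = h
        J[:, j] = (boost_momentum_z(k + dk, omega, m)
                   - boost_momentum_z(k - dk, omega, m)) / (2 * h)
    return J

k = np.array([0.3, -1.2, 0.8])
m, omega = 0.5, 0.9
k_new = boost_momentum_z(k, omega, m)

det_J = np.linalg.det(jacobian(k, omega, m))
ratio = energy(k_new, m) / energy(k, m)
print(det_J, ratio)   # d^3k' = (E'/E) d^3k, hence d^3k'/E' = d^3k/E
```

Analytically the only nontrivial entry is ∂k′_z/∂k_z = cosh ω − sinh ω k_z/E, which is exactly E′/E.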
We list here some special Lorentz transformations of particular interest:
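1. Rotation by θ around the z-axis
$$\Lambda^\mu{}_\nu = R(\theta)^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(\theta) & \sin(\theta) & 0 \\ 0 & -\sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{5.9}$$
2. Boost with rapidity ω along z-axis
$$\Lambda^\mu{}_\nu = B(\omega)^\mu{}_\nu = \begin{pmatrix} \cosh(\omega) & 0 & 0 & -\sinh(\omega) \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sinh(\omega) & 0 & 0 & \cosh(\omega) \end{pmatrix} \tag{5.10}$$
where the rapidity is related to the velocity of the boost by
$$\cosh(\omega) = \frac{1}{\sqrt{1 - v^2}} = \gamma, \qquad \sinh(\omega) = \frac{v}{\sqrt{1 - v^2}} = v\gamma \tag{5.11}$$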
Note that the boost matrix B(ω) (in contrast to the rotation matrix R(θ)) is not
orthogonal.
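The defining properties of the restricted Lorentz group are easy to test on the two special transformations just listed. The following sketch builds R(θ) and B(ω) as in (5.9) and (5.10) and checks the condition ΛᵀgΛ = g, the determinant, the sign of the time-time component, and orthogonality. The helper names are arbitrary choices for the example.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])      # metric with signature (+, -, -, -)

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1.0]])

def boost_z(omega):
    ch, sh = np.cosh(omega), np.sinh(omega)
    return np.array([[ch, 0, 0, -sh], [0, 1, 0, 0], [0, 0, 1, 0], [-sh, 0, 0, ch]])

def is_lorentz(L):
    return np.allclose(L.T @ g @ L, g)

def is_restricted(L):
    """Proper (det = +1) and orthochronous (time-time component positive)."""
    return is_lorentz(L) and np.isclose(np.linalg.det(L), 1.0) and L[0, 0] > 0

def is_orthogonal(L):
    return np.allclose(L.T @ L, np.eye(4))

R, B = rotation_z(0.4), boost_z(0.7)
print(is_restricted(R), is_restricted(B))      # both lie in the restricted group
print(is_orthogonal(R), is_orthogonal(B))      # only the rotation is orthogonal

# Rapidities add under composition of collinear boosts, unlike velocities.
print(np.allclose(boost_z(0.7) @ boost_z(0.5), boost_z(1.2)))
v = np.tanh(0.7)
print(np.isclose(np.cosh(0.7), 1 / np.sqrt(1 - v**2)))   # cosh(omega) = gamma
```

The additivity of rapidity is the practical reason for parametrizing boosts by ω rather than by the velocity v = tanh ω.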
The restricted Poincaré group (sometimes referred to as the inhomogeneous
(restricted) Lorentz group) is the set of transformations consisting of the combined
3 Oscillation experiments, such as in the neutral kaon system, are a notable exception.
effect of restricted Lorentz transformations (i.e., those Λ satisfying det(Λ) = +1, $\Lambda^0{}_0 > 0$)
and spacetime displacements. Unless otherwise explicitly stated, we shall assume
henceforth that the restriction to proper, orthochronous Lorentz transformations
applies in our further discussions of both the homogeneous Lorentz group and its
inhomogeneous extension, the Poincaré group. In defining the latter, by convention,
the Lorentz transformation is performed first, followed by the displacement. Thus an
element of the Poincaré group is specified by a pair (Λ, a), with action
For the special case of the subgroup of proper Lorentz transformations (a = 0), the
element Λ is represented on the state space by the unitary operator U (Λ). We now
turn to the properties of the U (Λ), beginning with their action on states with spinless
particles.
To prove this, integrate both sides of (5.16) with a smooth test function f (k)/2E(k):
$$\int d^3k\,\frac{f(\vec k)}{2E(k)}\,2E(k)\,\delta^3(\vec k - \vec k') = f(\vec k') \tag{5.17}$$
where on the second line we have made a change of variable k → Λ−1 k. This ensures
the invariance of the Hilbert-space inner product under Lorentz transformations:
With non-covariant normalization, we simply drop the factor of 2E(k) on the right-
hand side of (5.15).
A state $|k\rangle$ will look like $|\Lambda k\rangle$ to a boosted or rotated observer. The Lorentz group
is realized on the space of states by operators U (Λ) defined to effect precisely this
change:
$$U(\Lambda)|k\rangle \equiv |\Lambda k\rangle \tag{5.20}$$
5 The Spin-Statistics theorem asserting the necessity of Bose statistics for particles of integer spin, and
Fermi statistics for particles of half-integral spin, is one of the seminal results of local field theory: we shall
discuss it in Chapters 7 and 13.
for the spinless bosonic particles under consideration in this section, the positive sign
is to be taken in the symmetrization. Note that the completeness sums acquire extra
combinatoric factors for identical particles: for example, the decomposition of the
identity in the multi-particle Fock space of a single boson takes the form
$$1 = \sum_N \frac{1}{N!}\int d^3k_1\,d^3k_2 \ldots d^3k_N\,|k_1, k_2, .., k_N\rangle\langle k_1, k_2, .., k_N| \tag{5.22}$$
as the reader may readily verify by squaring the above expression. Rather surprisingly,
the very “big” Fock space obtained by the construction outlined here is still neverthe-
less, like L2 , a separable Hilbert space: it is spanned by a denumerable orthonormal
basis.6
The operator implementing a Lorentz transformation Λ on a multi-particle state
takes the unsurprising form:
With the covariant normalization of states defined above these operators are in fact
unitary. For example, on single-particle states,
6 For a clear discussion of this frequently misunderstood feature, see (Streater and Wightman, 1978).
In either case, we shall often abbreviate the Lorentz action on a multi-particle state
as $U(\Lambda)|\alpha\rangle = |\Lambda\alpha\rangle$.
We can view the multi-particle Fock space described above as a basis of eigenstates
of a free Hamiltonian H0 corresponding to having all self-interactions of our putative
stable, spinless particle switched off. The energy of a multi-particle state is then simply
$$H_0|k_1, k_2, .., k_N\rangle = \sum_{i=1}^{N} E(k_i)\,|k_1, k_2, .., k_N\rangle \tag{5.28}$$
$$\vec P^{(0)}|k_1, k_2, .., k_N\rangle = \sum_{i=1}^{N} \vec k_i\,|k_1, k_2, .., k_N\rangle \tag{5.29}$$
We can consider H0 and P (0) as the time and spatial components respectively
of an energy-momentum four-vector operator P (0)μ for free (hence the superscript
(0)
notation) particles. As we know from general quantum theory, these operators
generate infinitesimal translations in time and space. Taking the matrix element of
the Lorentz transformed energy-momentum operator between covariantly normalized
single particle states:
with a similar result for the matrix element in a general multi-particle state. Thus the
four operators (P (0)0 , P (0)i , i = 1, 2, 3) transform as expected under Lorentz transfor-
mations
Finally, recalling that a general element (Λ, a) of the Poincaré group is defined as a
Lorentz transformation Λ followed by a translation by the four-vector aμ , we see that
the unitary representation of (Λ, a) in the free Fock space is given by
$$U(\Lambda, a) = e^{iP^{(0)}\cdot a}\,U(\Lambda) \tag{5.32}$$
A few lines of algebra, employing the transformation property (5.31), confirm that
these operators do indeed furnish a unitary representation of the full Poincaré group
(cf. (5.13))
U (Λ1 , a1 )U (Λ2 , a2 ) = U (Λ1 Λ2 , a1 + Λ1 a2 ) (5.33)
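The composition law (5.33) simply mirrors what happens when two Poincaré transformations are applied in succession to a spacetime point, with each element (Λ, a) acting as x → Λx + a (Lorentz transformation first, then displacement). A minimal numerical sketch follows; the helper names `act` and `compose`, and the sample parameters, are assumptions introduced for illustration.

```python
import numpy as np

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1.0]])

def boost_z(omega):
    ch, sh = np.cosh(omega), np.sinh(omega)
    return np.array([[ch, 0, 0, -sh], [0, 1, 0, 0], [0, 0, 1, 0], [-sh, 0, 0, ch]])

def act(element, x):
    """Action of (Lambda, a) on a spacetime point: x -> Lambda x + a."""
    L, a = element
    return L @ x + a

def compose(e1, e2):
    """Group product (L1, a1)(L2, a2) = (L1 L2, a1 + L1 a2), as in (5.33)."""
    L1, a1 = e1
    L2, a2 = e2
    return (L1 @ L2, a1 + L1 @ a2)

e1 = (boost_z(0.6), np.array([1.0, 0.0, -2.0, 0.5]))
e2 = (rotation_z(1.1), np.array([0.3, 4.0, 0.0, -1.0]))
x = np.array([2.0, -1.0, 0.7, 3.0])

successive = act(e1, act(e2, x))        # apply e2 first, then e1
combined = act(compose(e1, e2), x)      # apply the single composed element
print(np.allclose(successive, combined))
```

Note that the displacement a2 is itself Lorentz-transformed by Λ1 in the product, which is why the Poincaré group is a semidirect rather than a direct product of translations and Lorentz transformations.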
7 Of course, in string theory even supposedly point-like quarks and leptons acquire a one-dimensional
string structure in ten dimensions, or a membrane structure in eleven-dimensional M-theory.
8 The reader may recall that a semisimple group is one with no invariant abelian subgroups.
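$$J_z|0, \sigma\rangle = \sigma|0, \sigma\rangle$$
$$(J_x \pm iJ_y)|0, \sigma\rangle = \sqrt{(j \mp \sigma)(j \pm \sigma + 1)}\,|0, \sigma \pm 1\rangle$$
$$\vec J^{\,2}|0, \sigma\rangle = j(j + 1)|0, \sigma\rangle \tag{5.34}$$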
or
$$|\vec k, \lambda\rangle \equiv \sqrt{\frac{m}{E(k)}}\,U(L(k))|0, \lambda\rangle \tag{5.38}$$
9 Throughout this section we shall make heavy use of the machinery of angular momentum and the
rotation group: for a review, see Chapters 15 and 17 of (Baym, 1990).
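$$\vec J\cdot\hat k\,|\vec k, \lambda\rangle = \sqrt{\frac{m}{E(k)}}\,U(R(\hat k))\,U^\dagger(R(\hat k))\,\vec J\cdot\hat k\,U(R(\hat k))\,U(B(|\vec k|))|0, \lambda\rangle$$
$$= \sqrt{\frac{m}{E(k)}}\,U(R(\hat k))\,J_z\,U(B(|\vec k|))|0, \lambda\rangle$$
$$= \sqrt{\frac{m}{E(k)}}\,U(R(\hat k))\,U(B(|\vec k|))\,J_z|0, \lambda\rangle$$
$$= \lambda|\vec k, \lambda\rangle \tag{5.39}$$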
where in the third line we have used the fact that rotations around the z axis (affecting
only the transverse x and y coordinates) commute with boosts along the z axis.
Of course, both the $|\vec k, \sigma\rangle$, σ = −j, −j + 1, .., +j and the $|\vec k, \lambda\rangle$, λ = −j, .., +j states
form complete sets and can be expressed as linear combinations of one another
(specifically: $|\vec k, \lambda\rangle = \sum_\sigma D^j_{\sigma\lambda}(R(\hat k))|\vec k, \sigma\rangle$). The helicity states $|\vec k, \lambda\rangle$ are of particular
utility in dealing with massless or highly energetic particles, as we shall see below when
we address the massless case explicitly. To summarize our options, we may either label
the discrete spin state of our particle by the Jz eigenvalue of a comoving observer (as
in (5.37)), or by the eigenvalue of the component of angular momentum J · k̂ along
the direction of motion of the particle (as in (5.38)).
Now we turn to the Lorentz transformation properties of these states. Evidently
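$$U(\Lambda)|\vec k, \sigma\rangle = \sqrt{\frac{m}{E(k)}}\,U(\Lambda)U(L(k))|0, \sigma\rangle = \sqrt{\frac{m}{E(k)}}\,U(\Lambda L(k))|0, \sigma\rangle$$
$$= \sqrt{\frac{m}{E(k)}}\,U(L(\Lambda k))\,U(L^{-1}(\Lambda k)\Lambda L(k))|0, \sigma\rangle$$
$$\equiv \sqrt{\frac{m}{E(k)}}\,U(L(\Lambda k))\,U(W(\Lambda, k))|0, \sigma\rangle \tag{5.40}$$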
10 See Baym (Baym, 1990), Chapter 17. Our notation vis-à-vis rotation group matters agrees with this
reference.
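$$U(\Lambda)|\vec k, \sigma\rangle = \sqrt{\frac{m}{E(k)}}\,U(L(\Lambda k))\sum_{\sigma'} D^j_{\sigma'\sigma}(W(\Lambda, k))|0, \sigma'\rangle$$
$$= \sqrt{\frac{E(\Lambda k)}{E(k)}}\,\sum_{\sigma'} D^j_{\sigma'\sigma}(W(\Lambda, k))|\Lambda k, \sigma'\rangle \tag{5.41}$$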
For j = 0 we have trivially $D^j_{\sigma'\sigma} = \delta_{\sigma'0}\delta_{\sigma 0}$, and we regain the transformation law (5.26)
for spinless particle states. A straightforward calculation, employing the unitarity of
the D j matrices and U (Λ) operators, confirms the non-covariant normalization of these
states:
Finally, we note that the extension of the unitary representation U (Λ) for the
homogeneous Lorentz group to the full Poincaré group including translations follows
exactly the same lines as in the preceding section for spinless particles: in particular,
the spin indices are unaffected by the application of the energy-momentum operator
P (0)μ .
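The key structural fact behind (5.40), namely that W(Λ, k) = L⁻¹(Λk) Λ L(k) leaves the rest-frame four-momentum invariant and is therefore a pure rotation, can be confirmed numerically. The sketch below makes an assumption about the standard boost: L(k) is taken to be the pure (rotation-free) boost carrying (m, 0, 0, 0) to (E(k), k), which may differ by a convention from the definition adopted earlier in the chapter. The function names and the sample Λ are likewise illustrative.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])

def standard_boost(k, m):
    """Pure boost taking the rest momentum (m, 0, 0, 0) to (E(k), k) (assumed convention)."""
    k = np.asarray(k, dtype=float)
    E = np.sqrt(k @ k + m * m)
    L = np.eye(4)
    L[0, 0] = E / m
    L[0, 1:] = L[1:, 0] = k / m
    L[1:, 1:] += np.outer(k, k) / (m * (E + m))
    return L

def lorentz_inverse(L):
    """Inverse of a Lorentz matrix, using Lambda^{-1} = g Lambda^T g."""
    return g @ L.T @ g

def wigner_rotation(Lam, k, m):
    """W(Lambda, k) = L^{-1}(Lambda k) Lambda L(k)."""
    k = np.asarray(k, dtype=float)
    k4 = np.concatenate(([np.sqrt(k @ k + m * m)], k))
    k4_new = Lam @ k4
    return lorentz_inverse(standard_boost(k4_new[1:], m)) @ Lam @ standard_boost(k, m)

def boost_x(omega):
    ch, sh = np.cosh(omega), np.sinh(omega)
    return np.array([[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1.0]])

m = 1.0
k = np.array([0.0, 0.4, 1.5])
Lam = boost_x(0.8)                       # a boost not collinear with k
W = wigner_rotation(Lam, k, m)

rest = np.array([m, 0.0, 0.0, 0.0])
print(np.allclose(W @ rest, rest))       # W is in the little group of the rest momentum
print(np.allclose(W[1:, 1:].T @ W[1:, 1:], np.eye(3)), np.linalg.det(W[1:, 1:]))
```

For non-collinear boosts the spatial block of W is a nontrivial rotation; its representative D^j(W) is what mixes the spin components in (5.41).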
The above discussion for massive particles undergoes significant modifications for
massless particles. As far as we know, the only exactly massless particles in Nature
are the photon (spin j = 1) and the (hypothetical) graviton (spin j = 2). In either
case, the one particle states are labeled by a discrete helicity index which takes only
two possible values, +j, −j: the intermediate values −j + 1, . . . , j − 1 which would be
present for a massive particle are missing. The reason for this, as we shall now see,
is that the little group for massless particles is quite different from that for massive
particles (the rotation group O(3), as discussed above).
First, let us revert to covariant normalization of states, to avoid inconvenient
factors of $\sqrt{E(k)}$ in the transformation formulas. We cannot choose the state of the
particle at rest as our standard state for massless particles; instead, our standard
state will be defined as the particle with a standard momentum (and energy) +μ in
the z-direction, i.e., with four-momentum k0 = (μ, 0, 0, μ). The boost operator B(k)
will now be defined as the boost in the z-direction with rapidity $\omega = \ln(k/\mu)$, i.e., it takes
the standard state with four-momentum k0 = (μ, 0, 0, μ) to four-momentum (k, 0, 0, k).
The rotation R(k̂) then takes this to a general light-like state with four-momentum
(|k|, k). Thus, if we label the (as yet unspecified) spin quantum number(s) needed
to specify fully the one-particle state τ , a general massless particle state can be
defined as
Note that we must use the helicity type transformation of (5.36) here, rather than
the spin type transformation (5.35), as our standard state has non-zero momen-
tum. Following exactly the steps performed leading to (5.40), this time in covariant
normalization (so the square-root factors are absent), we find the general Lorentz
must now be an element of the little group for massless particle states: namely, it must
be a Lorentz transformation leaving the standard vector k0 = (μ, 0, 0, μ) invariant. As
in the massive case, this condition specifies a three-parameter subgroup of the homoge-
neous Lorentz group. However, this group, as we shall now see, is not the conventional
O(3) rotation group (obviously, as rotations around the x and y directions manifestly
do not leave the standard vector unchanged), with its attendant fully understood finite-
dimensional representations labeled by j, σ quantum numbers. Instead, it turns out to
be a non-compact, non-semisimple group: namely, the Euclidean group of rotations and
translations in two dimensions, with a completely different representation structure to
the O(3) rotation group.
The little group of Lorentz transformations leaving our standard light-like four-
vector k0 = (μ, 0, 0, μ) invariant clearly still retains a remnant of the full rotation
group: namely, rotations R(θ) by an angle θ around the z-axis, as given in (5.9).
Defining a two-dimensional vector ξ ≡ (ξ1 , ξ2 ), the reader may easily verify that the
two-parameter set of Lorentz transformations defined by
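$$T(\vec\xi)^\mu{}_\nu = \begin{pmatrix} 1 + \tfrac12\vec\xi^{\,2} & \xi_1 & \xi_2 & -\tfrac12\vec\xi^{\,2} \\ \xi_1 & 1 & 0 & -\xi_1 \\ \xi_2 & 0 & 1 & -\xi_2 \\ \tfrac12\vec\xi^{\,2} & \xi_1 & \xi_2 & 1 - \tfrac12\vec\xi^{\,2} \end{pmatrix} \tag{5.46}$$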
and is therefore structurally identical to the group of translations in the x-y plane.
Indeed, the two-vector ξ transforms as expected under a rotation around the z-axis:
$$R(\theta)\,T(\vec\xi)\,R^{-1}(\theta) = T(R_\theta\vec\xi) \tag{5.48}$$
where Rθ ξ ≡ (cos (θ)ξ1 + sin (θ)ξ2 , − sin (θ)ξ1 + cos (θ)ξ2 ). Moreover, (5.48) implies
that these two-dimensional translations form an invariant abelian subgroup of the
full three-parameter Euclidean group of two-dimensional translations and rotations
$W(\vec\xi, \theta) \equiv T(\vec\xi)R(\theta)$ which comprise the little group for a massless particle:
so, as advertised previously, our little group is definitely not semisimple, nor compact
(as the translations are unbounded).
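The claimed properties of the matrices T(ξ) in (5.46) can all be checked by direct multiplication. The sketch below verifies numerically that T(ξ) is a Lorentz transformation, that it leaves the standard light-like vector (μ, 0, 0, μ) invariant, that two such transformations compose by adding their parameters, and that rotations about the z-axis act on ξ as in (5.48). The function names and sample values are arbitrary choices made for the example.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1.0]])

def translation(xi):
    """Little-group 'translation' T(xi) of (5.46) for xi = (xi1, xi2)."""
    x1, x2 = xi
    a = 0.5 * (x1 * x1 + x2 * x2)
    return np.array([[1 + a, x1, x2, -a],
                     [x1, 1.0, 0.0, -x1],
                     [x2, 0.0, 1.0, -x2],
                     [a, x1, x2, 1 - a]])

def rotate_2d(theta, xi):
    """R_theta xi as defined below (5.48)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([c * xi[0] + s * xi[1], -s * xi[0] + c * xi[1]])

mu = 2.0
k0 = np.array([mu, 0.0, 0.0, mu])          # standard light-like four-momentum
xi, eta, theta = np.array([0.7, -1.3]), np.array([2.1, 0.4]), 0.9
T = translation(xi)
R = rotation_z(theta)

print(np.allclose(T.T @ g @ T, g))                                   # Lorentz transformation
print(np.allclose(T @ k0, k0))                                       # leaves k0 invariant
print(np.allclose(T @ translation(eta), translation(xi + eta)))      # abelian composition
print(np.allclose(R @ T @ np.linalg.inv(R), translation(rotate_2d(theta, xi))))
```

Because ξ can be arbitrarily large, the entries of T(ξ) are unbounded, which is the concrete sense in which the little group fails to be compact.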
What is the representation structure of this Euclidean group? The unitary representative
of the rotation part R(θ) is just $U(R(\theta)) = e^{i\theta J_z}$ so if we label the
eigenvalue of the hermitian generator Jz by λ, we must include this “helicity”
value (the component of angular momentum along the direction of motion) in the
specification of quantum numbers τ which give a full labeling of our standard state
|k0 , τ . In principle, the hermitian generators of the translation part of the little
group could also take on non-vanishing “momentum” eigenvalues π = (π1 , π2 ), with
π ·ξ
U (T (ξ))|k
be labeled by the three parameter set (π , λ). The one massless particle with which we
have (extensive!) experience, the photon, definitely has non-zero helicity states, but
there is no empirical evidence for any further degrees of freedom corresponding to the
π variables, which we must therefore set to zero. Accordingly, we shall assume that
one-particle massless states in general are fully specified by a momentum and single
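helicity variable, with a standard state $|k_0, \lambda\rangle$ satisfying
\[
U(W(\xi, \theta))\,|k_0, \lambda\rangle = e^{i\lambda\theta}\,|k_0, \lambda\rangle
\]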
The only remnant of the rotation group here is the one-dimensional abelian subgroup
of rotations around the z-axis, for the standard state, or more generally, around the
direction of motion of the massless particle. In particular, we are missing the raising
and lowering operators constructed from the other two generators (Jx , Jy ) which would
normally allow us to (a) move stepwise from helicity λ to λ ± 1, thereby filling out a full
2j + 1-dimensional O(3) representation of spin j states, and (b) establish the integral
or half-integral quantization of the maximal helicity. In fact, for massless particles the
helicity is actually a Lorentz-invariant: it is exactly the same number for the state
$|k, \lambda\rangle = U(L(k))|k_0, \lambda\rangle$ obtained by boosting the standard state $|k_0, \lambda\rangle$ for a particle
moving in the z-direction with the same helicity, as we see by the calculation leading
to (5.39). Physically, massless particles of different helicity are, from the standpoint
of the proper Lorentz group (i.e., absent improper parity transformations), completely
distinct and unrelated objects. If we include parity transformations, of course, which
reverse momentum but leave angular momentum unchanged, the helicity changes sign,
so if a massless particle participates in parity conserving interactions (as the photon
certainly does), then if the particle exists with helicity +λ it must also exist with
helicity −λ. But the representation theory of the little group clearly says nothing
about intermediate helicity values (such as 0 for the photon, or +1,0, and –1 for the
spin-2 graviton).
What about the quantization condition for maximal helicity (≡ spin of the par-
ticle), given the absence of the full O(3) rotation group? At first, one might suppose
that as a rotation by angle θ around the momentum vector of our helicity λ state $|k, \lambda\rangle$
generates a phase factor $e^{i\lambda\theta}$, a rotation by 2π must return us to the original state,
ensuring integer quantization of helicity. Of course, the doubly connected structure of
the rotation group is the critical property saving us from this conclusion, and reinstating
the possibility of the spinor type half-integral representations. Recall that
any continuous one-dimensional path R(α), 0 ≤ α ≤ 1 through the group manifold of
O(3) in which the starting element R(0) is the identity but the final rotation R(1) is
by an odd multiple of 2π (around the z-axis, say) cannot be continuously shrunk to a
point (the identity) in the group manifold (i.e., to the path R(α) = R(0) = 1, 0 ≤ α ≤ 1):
the rotation group in three dimensions is not simply connected. On the other hand,
paths starting at the identity and ending at rotations involving a multiple of 4π can
be so shrunk.11 The corresponding unitary representatives U (R(α)) must generate
continuously varying phases applied to our massless one-particle state, which then
implies that
\[
e^{i(4\pi n)\lambda} = 1 \;\Rightarrow\; \lambda = 0, \pm\tfrac{1}{2}, \pm 1, \pm\tfrac{3}{2}, \ldots \tag{5.51}
\]
In other words, as we expect, only integer or half-integer values for the helicity of
a massless particle are allowed. This is an example of a topological, rather than an
algebraic, quantization of a quantum number.
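
As a small illustration of condition (5.51), the sketch below (plain Python, not from the original text) scans trial helicities in steps of 1/6 and keeps those for which a 4π rotation produces a unit phase. The step size, the scanned range and the function names are arbitrary choices made for this example.

```python
import cmath
import math
from fractions import Fraction

def rotation_phase(helicity, angle):
    """Phase exp(i * helicity * angle) picked up under a rotation by `angle`."""
    return cmath.exp(1j * float(helicity) * angle)

def is_allowed(helicity, tol=1e-9):
    """Condition (5.51) with n = 1: a rotation by 4*pi must give phase +1."""
    return abs(rotation_phase(helicity, 4 * math.pi) - 1) < tol

# Trial helicities -2, -11/6, ..., 2 (illustrative grid).
candidates = [Fraction(k, 6) for k in range(-12, 13)]
allowed = [h for h in candidates if is_allowed(h)]
print([str(h) for h in allowed])

# A 2*pi rotation is only required to give +1 or -1:
print(rotation_phase(Fraction(1, 2), 2 * math.pi))  # close to -1
print(rotation_phase(Fraction(1), 2 * math.pi))     # close to +1
```

Only integer and half-integer values survive the scan, and the half-integer ones are exactly those whose phase under a single 2π rotation is −1.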
[U (Λ), S] = 0 (5.55)
11 The reader may verify this immediately by the magical “twisted belt” experiment: a belt held flat at
both ends, but with a single twist by 360 degrees in the middle cannot be flattened by moving only the
right end around (while held flat), whereas if subjected to a double twist of two full rotations, the belt
“unwinds” easily if we move its right end appropriately, keeping the left end fixed.
We remind the reader (cf. Section 4.3) that the S-matrix is defined as a unitary
operator in the basis of free particle states: the behavior of these states under the
action of U (Λ) is precisely the content of the preceding sections of this chapter.
The requirement of Poincaré invariance adds invariance under spacetime translations
to the above (i.e., we require commutativity of S with the larger class of unitary
symmetry operators U (Λ, a)). This, of course, implies commutativity of S with the
infinitesimal generators of spacetime translations, namely the (free) energy-momentum
operator $P^{(0)\mu}$:
\[
[P^{(0)\mu}, S] = 0 \tag{5.56}
\]
There is, in fact, a trivial way to ensure both unitarity of S and the invariance
requirements (5.55, 5.56). Let us write the full Hamiltonian of the theory as
H = H0 + V (5.57)
where the free part H0 is defined by (5.28). Recall that the S-matrix is
obtained as the infinite time limit of the interaction-picture time-evolution operator:
S = limT →∞ U (T, −T ). Moreover, for general times t, t0 , this operator is constructed
(cf. (4.27)) as a sum of products of the interaction operator Vip (t) (i.e., the interac-
tion part of the Hamiltonian, in the interaction picture):
Thus, the S-matrix involves a sum of products of the operators H0 and V . Energy-
momentum conservation (5.56) is therefore ensured if12
for all t. In fact, a very simple Ansatz would seem to do the trick. Consider the
operator (for the rest of this section we use covariant normalization throughout, and
drop vector symbols for three-momenta in the states, assumed to be multi-particle
states of a single spinless particle):
\[
V = \int \prod_{i=1}^{2} \frac{d^3k'_i}{2E(k'_i)}\,\frac{d^3k_i}{2E(k_i)}\;
\delta^4(k'_1 + k'_2 - k_1 - k_2)\, h(k'_1, k'_2, k_1, k_2)\, |k'_1 k'_2\rangle\langle k_1 k_2| \tag{5.61}
\]
so
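\[
U(\Lambda) V U^{\dagger}(\Lambda) = \int \prod_{i} \frac{d^3k'_i}{2E(k'_i)}\,\frac{d^3k_i}{2E(k_i)}\;
\delta^4(k'_1 + k'_2 - k_1 - k_2)\, h(k'_1, ..)\, |\Lambda k'_1, \Lambda k'_2\rangle\langle \Lambda k_1, \Lambda k_2| \tag{5.65}
\]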
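Changing variables from $p_i$ to $\Lambda p_i$ and using the invariance of $\frac{d^3p}{2E(p)}$ and the four-
dimensional δ-function, we find
\[
U(\Lambda) V U^{\dagger}(\Lambda) = \int \prod_{i} \frac{d^3k'_i}{2E(k'_i)}\,\frac{d^3k_i}{2E(k_i)}\;
\delta^4(k'_1 + k'_2 - k_1 - k_2)\,
h(\Lambda^{-1}k'_1, \Lambda^{-1}k'_2, \Lambda^{-1}k_1, \Lambda^{-1}k_2)\, |k'_1, k'_2\rangle\langle k_1, k_2| \tag{5.66}
\]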
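
The invariance of the measure d³p/2E(p) used in this change of variables can be checked numerically. The sketch below is illustrative only: it takes a boost along the z-axis with rapidity ω acting on an on-shell momentum of mass m, estimates the Jacobian of the map p → Λp by central differences, and compares it with E(Λp)/E(p). The mass, momentum, rapidity and step size are arbitrary choices.

```python
import math

def energy(p, m):
    """On-shell energy E(p) = sqrt(m^2 + |p|^2)."""
    return math.sqrt(m * m + p[0] ** 2 + p[1] ** 2 + p[2] ** 2)

def boost_z(p, m, omega):
    """Spatial momentum after a boost of rapidity omega along z."""
    e = energy(p, m)
    return [p[0], p[1], math.cosh(omega) * p[2] + math.sinh(omega) * e]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def jacobian_det(p, m, omega, h=1e-5):
    """det(d p'_i / d p_j) estimated by central finite differences."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = boost_z(plus, m, omega), boost_z(minus, m, omega)
        for i in range(3):
            jac[i][j] = (fp[i] - fm[i]) / (2 * h)
    return det3(jac)

def measure_ratio(p, m, omega):
    """(d^3p' / E(p')) divided by (d^3p / E(p)); equals 1 if the measure is invariant."""
    e_before = energy(p, m)
    e_after = energy(boost_z(p, m, omega), m)
    return jacobian_det(p, m, omega) * e_before / e_after

print(measure_ratio([0.4, -0.2, 1.3], m=1.0, omega=0.8))
```

The Jacobian of the boost is E(Λp)/E(p), which is exactly compensated by the 1/E factor in the measure.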
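\[
V = \sum_{N=2}^{\infty} \int \prod_{i=1}^{N} \frac{d^3k'_i}{2E(k'_i)}\,\frac{d^3k_i}{2E(k_i)}\;
\delta^4\Big(\sum_i k'_i - \sum_i k_i\Big)\, h^{(N)}(k'_i, k_i)\, |k'_1 k'_2 .. k'_N\rangle\langle k_1 k_2 .. k_N| \tag{5.68}
\]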
Alas, this seemingly trivial method of generating a profusion of theories with Lorentz-
invariant interactions (and number-conserving Lorentz-invariant scattering for any
number of particles) possesses some fatal flaws:
1. The “convenient” property that the free and interaction parts of the Hamiltonian
commute, [H0 , V ] = 0, induced by the need to complete the 3-momentum con-
servation δ-function in (5.68) to a four-dimensional energy-momentum δ-function
13 We shall later encounter “persistent interactions” in local quantum field theory which also result
in divergent contributions to the infinite-time propagation of the system encapsulated by the S-matrix,
although these effects can be removed by appropriate choice of the interaction operator V . However, for
interaction operators V of the general class specified by (5.68), it is impossible to avoid a divergent, and
hence physically meaningless, S-matrix.
14 Einstein’s original use of the term “spukhafte Fernwirkung” referred to the peculiar (from a clas-
sical standpoint) statistical correlations of entangled quantum states, as in the famous EPR effect.
These correlations, while perhaps psychologically disturbing, do not lead to the physically unacceptable
action-at-a-distance effect of the kind discussed here (which we might perhaps call an “entsetzliche
Fernwirkung”!).
A simple condition for Lorentz-invariant scattering 125
Correspondingly, the interaction operator in interaction picture Vip (t) will be given
by a spatial integral of a spacetime field Hint (x, t) (≡ Hint (x): as usual, coordinate
vectors without three-vector symbols are to be taken to be spacetime four-vectors):
\[
V_{ip}(t) = \int d^3x\, \mathcal{H}_{int}(\vec{x}, t) \tag{5.70}
\]
Accordingly, the formal expansion (4.27) for the S-matrix S = U (+∞, −∞) becomes
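\[
S = \sum_{n=0}^{\infty} \frac{(-i)^n}{n!} \int_{-\infty}^{+\infty} dt_1\, dt_2 .. dt_n\; T\{V_{ip}(t_1) \ldots V_{ip}(t_n)\} \tag{5.72}
\]
\[
\phantom{S} = \sum_{n=0}^{\infty} \frac{(-i)^n}{n!} \int d^4x_1\, d^4x_2 .. d^4x_n\; T\{\mathcal{H}_{int}(x_1) .. \mathcal{H}_{int}(x_n)\} \tag{5.73}
\]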
Ignore for the time being the presence of a time-ordering symbol in (5.72). Then the
desired Lorentz-invariance of the S-matrix
since the change xi → Λxi could be erased by the change of variable (with unit
Jacobian) xi → Λ−1 xi .15
Unfortunately, such a change of variable can in general change the ordering in time
of the various interaction operators, which will not in general commute at different
times (this would require [H0 , V ] = 0 and we have already seen in the preceding
section why this is untenable). When can this happen? Only if two of the spacetime
arguments, $x_i$ and $x_j$ say, differ by a space-like interval, $(x_i - x_j)^2 < 0$. There would
then exist Lorentz transformations Λ for which the time-ordering symbol will reverse
the order of the corresponding interaction Hamiltonians. Unless we insist that these
operators commute in this situation, the argument leading to (5.74) will break down.
We may clearly avoid this outcome by insisting on commutativity of interaction energy
densities at space-like separations. This result is, of course, intuitively plausible, as it
asserts the non-interference of measurements of (interaction) energy performed at
space-like separations. The hand-waving argument presented above suggests that the
two conditions
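\[
\tau(x_1, x_2; n) \equiv T\{\mathcal{H}_{int}(x_1)\mathcal{H}_{int}(x_2)\} = \theta(n \cdot (x_1 - x_2))\,\mathcal{H}_{int}(x_1)\mathcal{H}_{int}(x_2)
+ \theta(n \cdot (x_2 - x_1))\,\mathcal{H}_{int}(x_2)\mathcal{H}_{int}(x_1) \tag{5.79}
\]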
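
The frame dependence hidden in (5.79) sits entirely in the sign of n·(x1 − x2). As a rough illustration (not from the original text), the sketch below takes the time-like unit vector n = (cosh ω, 0, 0, sinh ω) for frames boosted along z, with metric signature (+,−,−,−) assumed, and records which operator the time-ordering places on the left for a space-like and for a time-like separation. The sample events and rapidities are arbitrary.

```python
import math

def minkowski_dot(a, b):
    """a.b with assumed signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def frame_normal(omega):
    """Time-like unit vector n for a frame boosted along z with rapidity omega."""
    return (math.cosh(omega), 0.0, 0.0, math.sinh(omega))

def leftmost_operator(n, x1, x2):
    """Label (1 or 2) of the operator that T{...} puts on the left in (5.79)."""
    d = tuple(u - v for u, v in zip(x1, x2))
    return 1 if minkowski_dot(n, d) > 0 else 2

def orderings(x1, x2, rapidities):
    """Set of orderings encountered as the frame is varied."""
    return {leftmost_operator(frame_normal(w), x1, x2) for w in rapidities}

origin = (0.0, 0.0, 0.0, 0.0)
spacelike_event = (0.5, 0.0, 0.0, 2.0)   # (t, x, y, z): interval 0.25 - 4 < 0
timelike_event = (2.0, 0.0, 0.0, 0.5)    # interval 4 - 0.25 > 0
rapidities = [-2.0, -1.0, 0.0, 1.0, 2.0]

print(orderings(spacelike_event, origin, rapidities))  # both orderings occur
print(orderings(timelike_event, origin, rapidities))   # a single ordering
```

For the space-like pair the ordering flips once tanh ω exceeds 0.25, so the product is frame-independent only if the two operators commute; for the time-like pair no boost changes the ordering.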
15 The order in which the similarity transformation is performed, with the U (Λ) operator on the left
and the U † (Λ) on the right, is dictated by the need for two successive transformations to follow the group
composition law: U (Λ2 )U (Λ1 ) = U (Λ2 Λ1 ).
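\[
\Pi_{\mu\nu} \frac{\partial}{\partial n_\nu}\, \tau(x_1, x_2; n) = \Pi_{\mu\nu}\,(x_1 - x_2)^\nu\, \delta(n \cdot (x_1 - x_2))\,[\mathcal{H}_{int}(x_1), \mathcal{H}_{int}(x_2)] \tag{5.80}
\]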
In our original frame, $n^\mu = (1, 0, 0, 0)$, the δ-function sets the times of spacetime
points $x_1$, $x_2$ equal, so the commutator of the interaction energy densities in (5.80)
is an equal-time commutator (ETC). The points $x_1$ and $x_2$ must therefore either be
coincident or space-like separated, $(x_1 - x_2)^2 < 0$. Locality (5.77) implies the vanishing
of the commutator in the latter case, but we still have the possibility of δ-function-type
singularities in the coincidence limit at equal time $x_1^0 = x_2^0 = t$, including possible
terms with spatial derivatives of the coincidence δ-function $\delta^3(\vec{x}_1 - \vec{x}_2)$,
\[
[\mathcal{H}_{int}(\vec{x}_1, t), \mathcal{H}_{int}(\vec{x}_2, t)] = C(\vec{x}_1, t)\,\delta^3(\vec{x}_1 - \vec{x}_2) + D^\rho(\vec{x}_1, t)\,\Pi_{\rho\sigma} \frac{\partial}{\partial x_1^\sigma}\, \delta^3(\vec{x}_1 - \vec{x}_2) + \ldots \tag{5.81}
\]
where the ellipsis indicates the possible presence of higher (spatial) derivatives. The
fact that the commutator in (5.81) involves the same field operator Hint twice
actually eliminates the first term on the right-hand side as the commutator must
be antisymmetric under the exchange of x1 and x2 (in other words, in this case,
C(x, t) = 0). The second (and if present, higher) terms on the right-hand side of (5.81),
involving derivatives of δ-functions in an equal-time commutator, are called Schwinger
terms. If no such terms are present, the commutator is called “ultralocal”. Inserting
(5.81) in the result (5.80) we find
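\[
\begin{aligned}
\Pi_{\mu\nu} \frac{\partial}{\partial n_\nu}\, \tau(x_1, x_2; n)
&= \Pi_{\mu\nu}\,(x_1 - x_2)^\nu\, \Pi_{\rho\sigma}\, D^\rho(x_1)\, \frac{\partial}{\partial x_1^\sigma}\, \delta^4(x_1 - x_2) + .. \\
&= -\Pi_{\mu\rho}\, D^\rho(x_1)\, \delta^4(x_1 - x_2) + ..
\end{aligned}
\tag{5.82}
\]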
(where we have used $(x_1 - x_2)^\nu\, \delta^4(x_1 - x_2) = 0$). Note that the preceding equations
are to be interpreted in the usual distributional sense: as equalities holding after
integration with appropriately smooth, rapidly decreasing c-number functions. This
allows us to remove the derivative from the δ-function in the Schwinger term and apply
it to $D^\rho(x_1)(x_1 - x_2)^\nu$, obtaining the final result (5.82). We see that if $D^\rho(x) \neq 0$
the time-ordered product τ (x1 , x2 ; n) contributing to the second-order term in the
S-matrix is indeed frame-dependent—thereby ruining the Lorentz-invariance of the
S-matrix—in the presence of Schwinger terms in the equal-time commutator of Hint .
This is not an empty possibility: we shall see in Chapter 12 that this is exactly
the situation in derivatively-coupled field theories, for example.16 For the time being
we shall satisfy ourselves with the requirement of “ultralocality” in the equal-time
commutators of Hint : namely, the absence of any derivative terms in the general form of
the ETC (5.81). With this proviso, it can then be shown that the qualitative argument
given previously for the Lorentz-invariance of the S-matrix (5.73) is indeed correct.
Having satisfied the requirements of invariance under the homogeneous portion of
the Poincaré group (i.e., the homogeneous Lorentz group) by choosing the interaction
energy density to be an ultralocal Lorentz scalar field, what can we say about the
inhomogeneous part: namely, invariance under translations in space and time? The
corresponding conservation laws require the S-matrix to commute with the free state
energy ($P_0^{(0)} = H_0$) and momentum ($P_i^{(0)}$, i = 1, 2, 3) operators. The fact that our free
and interaction operators H0 and V are time-independent ensures energy conservation
(cf. (4.182)); but what about spatial momentum conservation? Recall that this is
assured once we have $[P_i^{(0)}, V] = 0$ (5.59). It turns out that we get this for free once
Hint is chosen to be a Lorentz scalar field satisfying (5.76). First, we note that by the
usual property of interaction-picture operators,
\[
i[H_0, \mathcal{H}_{int}(\vec{x}, t)] = i[P_0^{(0)}, \mathcal{H}_{int}(\vec{x}, t)] = \frac{\partial}{\partial t}\, \mathcal{H}_{int}(\vec{x}, t) \tag{5.83}
\]
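We saw previously (cf. 5.31) that the energy-momentum four-vector operator $P_\mu^{(0)}$
transforms in the expected way under Lorentz transformations
\[
U^{\dagger}(\Lambda)\, P_\mu^{(0)}\, U(\Lambda) = \Lambda_\mu{}^{\nu}\, P_\nu^{(0)} \tag{5.84}
\]
For example, for a boost along the z-axis, choosing Λ to be the Lorentz transformation
given by (5.10), and writing U (Λ) = U (ω),
\[
U^{\dagger}(\omega)\, H_0\, U(\omega) = \Lambda_0{}^{\nu}\, P_\nu^{(0)} = \cosh(\omega)\, H_0 + \sinh(\omega)\, P_3^{(0)} \tag{5.85}
\]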
Thus
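\[
\begin{aligned}
i\,U^{\dagger}(\omega)[H_0, \mathcal{H}_{int}(\vec{x}, t)]\,U(\omega)
&= i[U^{\dagger}(\omega) H_0 U(\omega),\; U^{\dagger}(\omega) \mathcal{H}_{int}(\vec{x}, t) U(\omega)] \\
&= i[\cosh(\omega) H_0 + \sinh(\omega) P_3^{(0)},\; \mathcal{H}_{int}(\Lambda^{-1}x)]
\end{aligned}
\tag{5.86}
\]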
16 For such theories it is possible (cf. Section 12.1) to concoct an additional non-covariant term in the
interaction density which cancels everywhere the effect of the Schwinger term—a so-called “covariantizing
seagull”. The Lagrangian formalism developed in Chapter 12 provides an automatic and foolproof procedure
for generating the correct form of such covariantizing terms, if needed.
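\[
\begin{aligned}
i\,U^{\dagger}(\omega)[H_0, \mathcal{H}_{int}(\vec{x}, t)]\,U(\omega)
&= \frac{\partial}{\partial t}\, \mathcal{H}_{int}(\Lambda^{-1}x) \\
&= \frac{\partial}{\partial t}\, \mathcal{H}_{int}(x^1, x^2, \cosh(\omega)x^3 + \sinh(\omega)t,\; \cosh(\omega)t + \sinh(\omega)x^3) \\
&= \Big(\sinh(\omega) \frac{\partial}{\partial x'^3} + \cosh(\omega) \frac{\partial}{\partial t'}\Big)\, \mathcal{H}_{int}(\vec{x}', t'), \qquad x' \equiv \Lambda^{-1}x
\end{aligned}
\tag{5.87}
\]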
Comparing coefficients of sinh(ω) in (5.86) and (5.87), we find
\[
i[P_3^{(0)}, \mathcal{H}_{int}(\Lambda^{-1}x)] = i[P_3^{(0)}, \mathcal{H}_{int}(x')] = \frac{\partial}{\partial x'^3}\, \mathcal{H}_{int}(x') \tag{5.88}
\]
Evidently, for any of the spatial components
\[
i[P_i^{(0)}, \mathcal{H}_{int}(x)] = \frac{\partial}{\partial x^i}\, \mathcal{H}_{int}(x) \tag{5.89}
\]
which implies
\[
i[P_i^{(0)}, V] = i\Big[P_i^{(0)}, \int d^3x\, \mathcal{H}_{int}(x)\Big] = \int d^3x\, \frac{\partial}{\partial x^i}\, \mathcal{H}_{int}(x) = 0 \tag{5.90}
\]
exactly the condition for momentum conservation. We shall henceforth consider the
definition of a general Lorentz scalar field A(x) to include the transformation property
(5.76) under the HLG as well as the commutation relations with the energy-momentum
four-vector
\[
i[P_\mu^{(0)}, A(x)] = \frac{\partial}{\partial x^\mu}\, A(x) \tag{5.91}
\]
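
Passing from the infinitesimal relation (5.91) to finite translations relies on the operator expansion of e^P Q e^{−P} in nested commutators, which the text invokes next. That expansion is easy to test on small matrices. The following is a minimal sketch using hand-rolled 2×2 complex matrix routines (no external libraries); the matrices P and Q are arbitrary illustrative choices and the truncation at 40 terms is an assumption adequate for entries of this size.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(a, s):
    return [[s * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def commutator(a, b):
    return add(matmul(a, b), scale(matmul(b, a), -1.0))

def expm(a, terms=40):
    """Matrix exponential from its Taylor series (fine for small matrices)."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result, term = identity, identity
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)   # a^k / k!
        result = add(result, term)
    return result

def nested_commutator_series(p, q, terms=40):
    """Q + [P,Q] + [P,[P,Q]]/2! + [P,[P,[P,Q]]]/3! + ..."""
    result, term = q, q
    for k in range(1, terms):
        term = scale(commutator(p, term), 1.0 / k)   # k-fold commutator / k!
        result = add(result, term)
    return result

P = [[0.3j, 0.2], [-0.1, 0.4j]]
Q = [[1.0, 0.5j], [2.0, -1.0]]

lhs = matmul(expm(P), matmul(Q, expm(scale(P, -1.0))))
rhs = nested_commutator_series(P, Q)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

With P replaced by i P·a and Q by a field obeying (5.91), each nested commutator supplies one more derivative, so the series becomes the Taylor expansion of the translated field.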
From (5.91), a standard application of the Baker–Campbell–Hausdorff formula
\[
e^{P} Q\, e^{-P} = Q + [P, Q] + \frac{1}{2!}[P, [P, Q]] + \frac{1}{3!}[P, [P, [P, Q]]] + \ldots \tag{5.92}
\]
leads to the very important translation property for scalar fields in the interaction
picture
\[
e^{iP_\mu^{(0)} a^\mu}\, A(x)\, e^{-iP_\mu^{(0)} a^\mu} = A(x + a) \tag{5.93}
\]
for any fixed displacement four-vector aμ . We can combine the transformation require-
ments under the HLG (5.76) with the translation property (5.93) to obtain the
transformation of our scalar field under a general element (Λ, a) of the Poincaré group,
with unitary representative $U(\Lambda, a) = e^{iP^{(0)}\cdot a}\, U(\Lambda)$:
5.6 Problems
1. Calculate explicitly (as a 4x4 matrix) the commutator [Λ1 , Λ2 ] of a boost Λ1 with
rapidity ω1 along the x axis with a boost Λ2 with rapidity ω2 along the y axis. If
the rapidities ω1 , ω2 are infinitesimal, what type of transformation does [Λ1 , Λ2 ]
induce?
2. (a) Show the following connection between helicity and spin states:
\[
|\vec{p}, \lambda\rangle = \sum_{\sigma} D^{j}_{\sigma\lambda}(R(\hat{p}))\, |\vec{p}, \sigma\rangle
\]
(b) Prove that helicity states transform as follows under Lorentz transformations:
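\[
U(\Lambda)|\vec{p}, \lambda\rangle = \sqrt{\frac{E(\Lambda p)}{E(p)}}\, \sum_{\lambda'} D^{j}_{\lambda'\lambda}(W(\Lambda, p))\, |\Lambda\vec{p}, \lambda'\rangle
\]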
where
3. Verify that the massless particle little group displacement transformations T (ξ)
in (5.46) are indeed Lorentz transformations. Also, verify the group composition
rule (5.47) and the transformation law (5.48).
4. The transformation property of scalar fields under spatial rotations follows from
the infinitesimal version of the transformation property:
Choosing Λ to be the infinitesimal rotation $(R_i x)_j = x_j + \epsilon_{ijk}\, x_k\, \delta\theta$, show that
the scalar field A(x) has the following commutation relation with the angular
momentum components Ji , i = 1, 2, 3:
5. Show that a product H(x) = A(x)B(x)C(x).. of local, Lorentz scalar fields A(x),
B(x), C(x),.. is itself a local, Lorentz scalar field: i.e., verify
1 The cluster decomposition principle for the S-matrix seems to have been first articulated by Wichmann
and Crichton (Wichmann and Crichton, 1963): the proof that factorization of scattering probabilities
extends to the scattering amplitudes was given by Taylor (Taylor, 1966).
Clustering and the smoothness of scattering amplitudes 133
\[
S_{k',k} = \delta^3(k' - k) \equiv S^c_{k',k} \tag{6.1}
\]
\[
S_{k'_1 k'_2, k_1 k_2} = S^c_{k'_1,k_1} S^c_{k'_2,k_2} + S^c_{k'_1,k_2} S^c_{k'_2,k_1} + S^c_{k'_1 k'_2, k_1 k_2} \tag{6.2}
\]
which can be expressed graphically as indicated in Fig. 6.2. Note that in the present
discussion we assume that the incoming particles are identical (hence, indistinguish-
able) Bose particles.
The three-particle connected part is similarly defined by subtracting off those parts
of the full 3-3 scattering amplitude in which some proper subset of the particles pass
through unaffected by interactions. Again, this can be pictorially represented as shown
in Fig. 6.3 (here, “perms” refers to the appropriate set of terms with initial and
final particle momenta permuted to ensure Bose symmetry of the amplitude). As
the connected 2-2 amplitude has already been defined in (6.2), the fully connected
3-3 amplitude $S^c_{k'_1 k'_2 k'_3, k_1 k_2 k_3}$ is defined inductively as the full 3-3 amplitude minus
the contributions from situations in which some proper subset of particles interacts
separately from the others.
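
The inductive subtraction has the same combinatorial structure as the passage from moments to cumulants. The sketch below is a deliberately stripped-down toy, not the construction of the text: momentum labels and Bose permutations are suppressed, each "amplitude" is a single rational number depending only on the number of particles, and the full n-particle quantity is the sum over all ways of splitting the n particles into clusters of products of connected parts. All names and sample values are hypothetical.

```python
from fractions import Fraction

def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition                      # `first` alone
        for i in range(len(partition)):                  # `first` joins block i
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]

def cluster_product(partition, connected):
    product = Fraction(1)
    for block in partition:
        product *= connected[len(block)]
    return product

def full_from_connected(connected, n):
    """Full n-particle quantity as a sum over all cluster decompositions."""
    return sum(cluster_product(p, connected)
               for p in set_partitions(list(range(n))))

def connected_from_full(full, nmax):
    """Inductive definition: subtract every decomposition with 2 or more clusters."""
    connected = {}
    for n in range(1, nmax + 1):
        disconnected = sum(cluster_product(p, connected)
                           for p in set_partitions(list(range(n)))
                           if len(p) > 1)
        connected[n] = full[n] - disconnected
    return connected

toy_connected = {1: Fraction(1), 2: Fraction(1, 3), 3: Fraction(-1, 5)}
toy_full = {n: full_from_connected(toy_connected, n) for n in (1, 2, 3)}
print(toy_full)
print(connected_from_full(toy_full, 3) == toy_connected)
```

In this label-free setting the relation between the two families is the familiar exponential one: the generating function of the full quantities is the exponential of that of the connected ones.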
Since $S^c_{k'_1 k'_2, k_1 k_2}$, $S^c_{k'_1 k'_2 k'_3, k_1 k_2 k_3}$, etc., all come from the $\delta(E_\alpha - E_\beta)T_{\beta\alpha}$ (i.e., interaction)
part of S (recall (4.182)), and T is assumed to conserve total spatial momentum,
we expect these connected parts all to have a full four-dimensional δ-function of
four-momentum conservation: $\delta^4(P' - P)$, with $P$, $P'$ the total initial and final four-
momenta for the connected subprocess. For clustering to hold, as we shall now see,
it is crucial that this be the only δ-function in the connected parts of the S-matrix.
Of course, the disconnected parts may have many more, as energy-momentum must
be conserved separately in each disconnected subprocess. To see the necessity for the
above assertion, consider an N→N process (N particles in, N out) where $N = n_1 + n_2$,
with n1 particles scattering far from the other n2 . Cluster decomposition requires
that the S-matrix factor into a product of S-matrices for n1 → n1 and n2 → n2
scattering separately. (We are assuming here for simplicity of notation only that the
scatterings conserve the number of particles, which is certainly not the case in general
in relativistic quantum theory.) The general expansion of the S-matrix in terms of
connected parts means that we can write the graphical representation for the full
N→N process as in Fig. 6.4. For the S-matrix to cluster—in other words, for the overall
process to factorize into a product of independent n1 → n1 and n2 → n2 scatterings
when the two sets of particles are spatially far separated—the extra terms containing
connected parts with both qs and ks must vanish when we form wave-packets for
the incoming and outgoing particles and then move all q-type particles far from all
k-type ones.
The separation can be achieved by introducing a large three-vector Δ and position-
ing one subset of particles around −Δ and the other around Δ. The wave-packets are
constructed in the usual way: replace2 the plane wave $e^{ik\cdot x}$ by $\int d^3\tilde{k}\, g(\tilde{k}; k)\, e^{i\tilde{k}\cdot(x - [\Delta + \xi])}$,
i.e., a wave-packet of momentum centered around k (if g is strongly peaked there) and
peaked in coordinate space at Δ + ξ. A typical connected transition amplitude for
such a set of wave-packets will thus take the form:
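\[
\begin{aligned}
\int & d^3\tilde{k}'_1\, d^3\tilde{k}'_2 .. d^3\tilde{q}'_{n_2}\; d^3\tilde{k}_1 .. d^3\tilde{q}_{n_2}\;\; g(\tilde{k}'_1; k'_1) .. g(\tilde{q}_{n_2}; q_{n_2}) \\
& \cdot\, e^{-i\sum_j \tilde{k}'_j\cdot(\Delta + \xi'_j)\, -\, i\sum_j \tilde{q}'_j\cdot(-\Delta + \eta'_j)\, +\, i\sum_j \tilde{k}_j\cdot(\Delta + \xi_j)\, +\, i\sum_j \tilde{q}_j\cdot(-\Delta + \eta_j)}\;
S^c_{\tilde{k}'_1 .. \tilde{q}'_{n_2},\, \tilde{k}_1 .. \tilde{q}_{n_2}}
\end{aligned}
\tag{6.3}
\]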
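
The role of smoothness in an expression like (6.3) can be seen in a one-dimensional caricature. Once the overall momentum-conservation δ-function is used, the Δ-dependent phases combine into e^{−2iQ·Δ}, with Q the momentum transferred between the two clusters. The sketch below assumes a Gaussian wave-packet profile and an arbitrarily chosen smooth function of Q standing in for the connected amplitude (both are illustrative assumptions, not taken from the text), and integrates numerically with the trapezoid rule.

```python
import cmath
import math

def packet(q, width=1.0):
    """Gaussian stand-in for the product of wave-packet profiles."""
    return math.exp(-q * q / (2.0 * width ** 2))

def smooth_amplitude(q):
    """Arbitrary smooth function standing in for the connected amplitude."""
    return 1.0 / (1.0 + q * q)

def separated_amplitude(delta, n=4001, qmax=8.0):
    """|integral dQ packet(Q) smooth_amplitude(Q) exp(-2 i Q delta)|, trapezoid rule."""
    h = 2.0 * qmax / (n - 1)
    total = 0j
    for i in range(n):
        q = -qmax + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * packet(q) * smooth_amplitude(q) * cmath.exp(-2j * q * delta)
    return abs(total * h)

def dangerous_amplitude(delta):
    """Same integral with an extra delta(Q): the phase drops out entirely."""
    return packet(0.0)

for delta in (0.0, 3.0, 6.0):
    print(delta, separated_amplitude(delta), dangerous_amplitude(delta))
```

The smooth integrand gives a contribution that dies away as the separation grows, whereas an additional δ-function in the momentum transfer would leave a term that does not decrease with Δ at all.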
2 For the rest of this section we drop three-vector notation to avoid overcrowding the formulas—but
re-emphasize to the reader that the requirements of clustering are exclusively spatial ones!
[Fig. 6.4: graphical expansion of the full N→N amplitude: the product of the connected-part expansions for the k-type and for the q-type particles, plus terms connecting k’s and q’s.]
The utility of the generating functional defined in this way follows from the ease with
which it allows us to extract the connected amplitudes. Indeed, if we define a connected
functional S c (j ∗ , j) as the logarithm of the full generating functional S defined above,
so that
one easily sees that (for the case where S is the scattering operator) the amplitudes
encoded in S c are exactly the connected amplitudes defined above. For example,
3 For fermions, a similar procedure can be used, but with source functions which take values in an
anticommuting Grassmann field. We shall return to this later in the book when we discuss the path-integral
formulation of quantum field theory.
4 For a review of functional calculus, see Appendix A.
\[
\langle k'_1 k'_2 k'_3|V|k_1 k_2 k_3\rangle = \langle k'_1 k'_2 k'_3|V_{12} + V_{23} + V_{31}|k_1 k_2 k_3\rangle \tag{6.9}
\]
For example, the system being considered might be the three electrons of a lithium
atom, and V above the total electrostatic interaction energy of the electrons. Explicitly,
Note that the remaining integral in (6.10) defines a smooth function of $k'_1 - k_1$
provided $V_{12}$ goes to zero sufficiently rapidly for $x_- \to \infty$. In general, the multi-particle
matrix elements of H decompose as follows:
\[
H_{k',k} = \langle k'|H|k\rangle \equiv E(k)\,\delta^3(k' - k) \equiv H^c_{k',k} \tag{6.11}
\]
\[
\begin{aligned}
H_{k'_1 k'_2, k_1 k_2} \equiv{}& H^c_{k'_1,k_1}\,\delta^3(k'_2 - k_2) + H^c_{k'_1,k_2}\,\delta^3(k'_2 - k_1) + H^c_{k'_2,k_1}\,\delta^3(k'_1 - k_2) \\
& + H^c_{k'_2,k_2}\,\delta^3(k'_1 - k_1) + H^c_{k'_1 k'_2, k_1 k_2}
\end{aligned}
\tag{6.12}
\]
which defines the 2-2 connected piece $H^c_{k'_1 k'_2, k_1 k_2}$. Graphically, (6.12) may be represented
as shown in Fig. 6.6. Similarly, in the three-body sector we have a decomposition
\[
\begin{aligned}
H_{k'_1 k'_2 k'_3, k_1 k_2 k_3} \equiv{}& H^c_{k'_1,k_1}\,\delta^3(k'_2 - k_2)\,\delta^3(k'_3 - k_3) + \text{perms.} \\
& + H^c_{k'_1 k'_2, k_1 k_2}\,\delta^3(k'_3 - k_3) + \text{perms.} \\
& + H^c_{k'_1 k'_2 k'_3, k_1 k_2 k_3}
\end{aligned}
\tag{6.13}
\]
Note that in the simple non-relativistic example mentioned above, with only two-
body forces present, the connected part of the Hamiltonian in the one-particle sector
is given just by the matrix elements of the free Hamiltonian H0 (see (6.11)), while the
connected 3-3 part $H^c_{k'_1 k'_2 k'_3, k_1 k_2 k_3}$ in (6.13) is actually zero. In more general situations
in many-body theory, there are intrinsically three (or higher) body forces. Intuitively,
this corresponds to a situation where a single interaction alters the momenta of all
three (or more) participating particles. In relativistic quantum field theory, particle
number is no longer conserved in interactions, and the existence of persistent inter-
actions (cf. Chapter 10) can lead to interaction contributions even in the one-particle
sector (depending on exactly how the full Hamiltonian is split into free and interacting
parts).
To summarize the foregoing discussion for non-relativistic scattering, we expect
that the connected N-N part of the Hamiltonian should not contain any partial
δ-functions conserving momentum of any subset M < N of the N interacting particles.
This intuitive expectation turns out to be precisely the requirement that the resultant
S-matrix possess the desired cluster-decomposition properties. More formally, the con-
nected part of the Hamiltonian has matrix elements obtained by Fourier-transforming
some multi-particle potential-energy function
\[
H^c_{k'_1 k'_2 .., k_1 k_2 ..} = \int d^3x'_1\, d^3x'_2 .. d^3x_1\, d^3x_2 ..\; V(x'_1, x'_2, .., x_1 ..)\;
e^{-ik'_1\cdot x'_1}\, e^{-ik'_2\cdot x'_2} .. e^{ik_1\cdot x_1}\, e^{ik_2\cdot x_2} .. \tag{6.14}
\]
where the potential energy function V should only depend on differences of coordinates
(translation invariance). The invariance of V under an equal shift of all coordinates
leads directly to momentum conservation: H c must contain an overall conservation
δ-function, namely $\delta^3(k'_1 + k'_2 + .. - k_1 - k_2 - ..)$. The only way to have additional
δ-functions conserving subsets of momenta is to have V constant when some subset
of coordinates is moved en-bloc far away from some other subset. We can prevent this
by insisting that V is a smooth5 function of differences of coordinates falling to zero
when any two coordinates separate to infinity with the others fixed.
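
The link between the fall-off of V and the absence of extra δ-functions can be made concrete on a small periodic lattice. The sketch below is a toy under stated assumptions, not the construction of the text: one spatial dimension, N = 6 sites, two distinguishable particles, a local potential depending only on x1 − x2, and discrete momenta k = 0..N−1 (so δ-functions become Kronecker deltas modulo N). All names are hypothetical.

```python
import cmath
import math
from itertools import product

N = 6  # number of lattice sites (illustrative)

def matrix_element(potential, k1p, k2p, k1, k2):
    """Lattice analogue of (6.14) for a local potential V(x1 - x2)."""
    total = 0j
    for x1 in range(N):
        for x2 in range(N):
            phase = cmath.exp(-2j * math.pi *
                              ((k1p - k1) * x1 + (k2p - k2) * x2) / N)
            total += potential((x1 - x2) % N) * phase
    return total / N

def support(potential, tol=1e-9):
    """All momentum assignments with a non-vanishing matrix element."""
    return {ks for ks in product(range(N), repeat=4)
            if abs(matrix_element(potential, *ks)) > tol}

def short_range(d):
    """Falls to zero beyond nearest neighbours."""
    return {0: 1.0, 1: 0.3, N - 1: 0.3}.get(d, 0.0)

def constant(d):
    """Does not fall off when the particles separate."""
    return 1.0

conserving = {ks for ks in product(range(N), repeat=4)
              if (ks[0] + ks[1] - ks[2] - ks[3]) % N == 0}
separately_conserving = {ks for ks in product(range(N), repeat=4)
                         if ks[0] == ks[2] and ks[1] == ks[3]}

print(support(short_range) == conserving)            # only the overall delta
print(support(constant) == separately_conserving)    # an extra, dangerous delta
```

Translation invariance alone enforces overall momentum conservation; the potential that does not decay with separation conserves the momentum of each particle individually, which is the lattice version of an additional δ-function.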
The basic theorem specifying the smoothness properties needed in the con-
nected Hamiltonian amplitudes to guarantee the clustering property for the S-matrix
was proven by Weinberg (Weinberg, 1964b), in a classic study of multi-particle
scattering:
Theorem 6.1 If $H^c_{k'_1 k'_2 .., k_1 k_2 ..} = \delta^3(\sum k' - \sum k)$ times a smooth function of the
momenta $k$, $k'$, then $S^c_{k'_1 k'_2 .., k_1 k_2 ..} = \delta^3(\sum k' - \sum k)$ times a smooth function.
We shall derive this result for the same system used in our discussions in this
and the preceding section: namely, for multi-particle scattering of a single species of
spinless boson. However, there will be no restriction to non-relativistic scattering. In
particular, the scattering amplitudes need not conserve particle number. To facilitate
the otherwise rather tricky combinatorics, we shall use the generating functional
technique described in the preceding section. As in the case of S-matrix amplitudes,
the introduction of associated functionals
5 Here “smooth” is being used in a somewhat loose sense: we are certainly not requiring analyticity
in the momentum variables, for example. Rather, here and henceforth in our discussion of clustering, the
reader should interpret the property “smooth” as simply implying the absence of singularities of δ-function
strength.
Hamiltonians leading to clustering theories 141
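\[
\mathcal{H}(j^*, j) \equiv \sum_{M,N} \int \frac{d^3k'_1 ... d^3k'_M}{M!}\, \frac{d^3k_1 ... d^3k_N}{N!}\;
j^*(k'_1) ... j^*(k'_M)\, H_{k'_1 .. k'_M,\, k_1 .. k_N}\, j(k_1) ... j(k_N) \tag{6.15}
\]
\[
\mathcal{H}^c(j^*, j) \equiv \sum_{M,N} \int \frac{d^3k'_1 ... d^3k'_M}{M!}\, \frac{d^3k_1 ... d^3k_N}{N!}\;
j^*(k'_1) ... j^*(k'_M)\, H^c_{k'_1 .. k'_M,\, k_1 .. k_N}\, j(k_1) ... j(k_N) \tag{6.16}
\]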
allows the inductive sequence defining connected parts of the Hamiltonian to be solved
very simply. Namely, defining
\[
\mathcal{F}(j^*, j) \equiv \int d^3k\, j^*(k)\, j(k) \tag{6.17}
\]
For example, the reader may easily check (take again number conserving theories with
M = N for simplicity) that applying the functional derivative $\frac{\delta^4}{\delta j^*(k'_1)\, \delta j^*(k'_2)\, \delta j(k_1)\, \delta j(k_2)}$
to (6.18) (and then setting j = j ∗ = 0) leads directly to the connectedness structure
(6.12) for the two-body sector (as illustrated in Fig. 6.6).
We are interested in the clustering properties of the S-matrix, which the reader
will recall from Section 4.3 is defined as the set of multi-particle matrix elements of
the infinite time limit U (+∞, −∞) of the finite time evolution operator U (t, t0 ) in the
interaction picture, satisfying (4.20):
\[
i \frac{\partial}{\partial t}\, U(t, t_0) = V_{ip}(t)\, U(t, t_0) \tag{6.19}
\]
It will suffice to establish the desired smoothness property for the matrix elements of
U (t, t0 ) at finite times t, t0 as it will then carry over trivially to the large time limit.
We need just one further technical tool to carry through the argument. It is easy to
see that the functional associated with the product V U of two linear operators V and
U is given by
where V (resp. U ) are the functionals associated with the Fock space operators V
(resp. U ), and L is the linking operator defined as
\[
L \equiv \exp\Big(\int d^3k\, \frac{\delta^2}{\delta J(k)\, \delta J^*(k)}\Big) \tag{6.21}
\]
The role of the exponential in (6.21) is to produce (on expansion) a series of terms,
each of which ties together the momentum of the final-state particles in the matrix
element of U with the initial-state particles in the matrix element of V (for some
specific intermediate state in the operator product V U ). There is no substitute here
for the reader writing out a few simple examples to verify this result (see Problem 1
at the end of this chapter).
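
One such simple example can be run mechanically in a single-mode toy model. The sketch below is an illustration under explicit assumptions, none of which come from the text: a single momentum mode, so that the sources are ordinary numbers and the linking operator reduces to exp(∂²/∂J∂J*); unnormalized states |n⟩ = (a†)ⁿ|0⟩ with ⟨n|n⟩ = n!, so that the resolution of the identity is Σ |n⟩⟨n|/n!; and a truncation to occupation numbers 0..3. Matrix elements are random numbers.

```python
import random
from math import factorial

NMAX = 3  # truncation of the occupation number (toy assumption)
SIZE = NMAX + 1

def functional(matrix):
    """Coefficient of (j*)^M j^N in the associated functional: <M|A|N> / (M! N!)."""
    return [[matrix[M][N] / (factorial(M) * factorial(N)) for N in range(SIZE)]
            for M in range(SIZE)]

def derivative_at_zero(power, order):
    """(d/dJ)^order acting on J^power, evaluated at J = 0."""
    return factorial(power) if power == order else 0

def link(v, u):
    """Apply exp(d^2/dJ dJ*) to V(j*, J) U(J*, j), then set J = J* = 0."""
    out = [[0.0] * SIZE for _ in range(SIZE)]
    for M in range(SIZE):
        for N in range(SIZE):
            for n in range(SIZE):            # power of J in V(j*, J)
                for m in range(SIZE):        # power of J* in U(J*, j)
                    for r in range(SIZE):    # term r of the exponential series
                        out[M][N] += (v[M][n] * u[m][N]
                                      * derivative_at_zero(n, r)
                                      * derivative_at_zero(m, r)
                                      / factorial(r))
    return out

def operator_product(V, U):
    """<M|VU|N> from inserting the identity sum_n |n><n| / n!."""
    return [[sum(V[M][n] * U[n][N] / factorial(n) for n in range(SIZE))
             for N in range(SIZE)] for M in range(SIZE)]

rng = random.Random(0)
V = [[rng.uniform(-1, 1) for _ in range(SIZE)] for _ in range(SIZE)]
U = [[rng.uniform(-1, 1) for _ in range(SIZE)] for _ in range(SIZE)]

linked = link(functional(V), functional(U))
direct = functional(operator_product(V, U))
max_diff = max(abs(linked[M][N] - direct[M][N])
               for M in range(SIZE) for N in range(SIZE))
print(max_diff)
```

Each term of the exponential series ties r outgoing lines of U to r incoming lines of V, and the 1/r! of the series supplies the weight of the r-particle intermediate state.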
Suppressing the passive initial time t0 , which plays no role in the following, we shall
denote the generating functional for the matrix elements of U (t, t0 ) by U(j ∗ , j; t), the
functional for the matrix elements of the interaction operator Vip (t) by V(j ∗ , j; t),
and the corresponding connected quantities as usual by attaching a superscript “c”.
Then the equation of motion (6.19) translates to a corresponding time-development
equation for the associated functionals:
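\[
\begin{aligned}
i \frac{\partial}{\partial t}\, \mathcal{U}(j^*, j; t)
&= i \frac{\partial}{\partial t} \exp\big(\mathcal{U}^c(j^*, j; t)\big) \\
&= i \frac{\partial \mathcal{U}^c(j^*, j; t)}{\partial t} \exp\big(\mathcal{U}^c(j^*, j; t)\big) \\
&= (\mathcal{VU})(j^*, j; t) \\
&= L\, \mathcal{V}(j^*, J; t)\, \mathcal{U}(J^*, j; t)\big|_{J = J^* = 0} \\
&= L\big\{e^{\mathcal{F}(j^*, J)}\, \mathcal{V}^c(j^*, J; t)\, e^{\mathcal{U}^c(J^*, j; t)}\big\}\big|_{J = J^* = 0}
\end{aligned}
\tag{6.22}
\]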
which (a) commutes with A and B (remember that the functional derivatives with
respect to J and J ∗ act independently), and (b) acts as the generator of translations
on functionals of J ∗ :
\[
\exp\Big(\int d^3k\, j^*(k)\, \frac{\delta}{\delta J^*(k)}\Big)\, G(J^*) = G(J^* + j^*) \tag{6.25}
\]
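
Property (6.25) is the functional version of Taylor's theorem. A one-variable illustration, with an ordinary polynomial standing in for the functional G and a number a standing in for the source j* (both purely illustrative choices), can be written as follows.

```python
from math import factorial

def evaluate(coeffs, x):
    """Value of the polynomial sum_k coeffs[k] * x**k."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def exp_shift(coeffs, a, x):
    """Apply exp(a d/dx) = sum_n a^n/n! (d/dx)^n to the polynomial, then evaluate at x."""
    total, current, n = 0.0, list(coeffs), 0
    while current:                       # the series terminates for a polynomial
        total += a ** n / factorial(n) * evaluate(current, x)
        current = derivative(current)
        n += 1
    return total

G = [1.0, -2.0, 0.5, 3.0]   # G(x) = 1 - 2x + 0.5 x^2 + 3 x^3
a, x = 0.7, 1.3
print(exp_shift(G, a, x), evaluate(G, x + a))
```

The exponential of the derivative sums the Taylor series of G about x, which is why the operator in (6.25) simply shifts the argument of the functional.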
Once the factor $e^{\mathcal{F}(j^*, J)}$ is to the left of the linking operator (containing derivatives
with respect to J) the source J can be set to zero, leaving only the second two terms
on the right-hand side of (6.26). Inserting this result on the final line of (6.22),
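\[
\begin{aligned}
i \frac{\partial \mathcal{U}^c(j^*, j; t)}{\partial t} \exp\big(\mathcal{U}^c(j^*, j; t)\big)
&= L \exp\Big(\int d^3k\, j^*(k)\, \frac{\delta}{\delta J^*(k)}\Big)\big\{\mathcal{V}^c(j^*, J; t)\, e^{\mathcal{U}^c(J^*, j; t)}\big\}\Big|_{J = J^* = 0} \\
&= L\big\{\mathcal{V}^c(j^*, J; t)\, e^{\mathcal{U}^c(J^* + j^*, j; t)}\big\}\big|_{J = J^* = 0}
\end{aligned}
\tag{6.27}
\]
whence
\[
i \frac{\partial \mathcal{U}^c(j^*, j; t)}{\partial t} = L\big\{\mathcal{V}^c(j^*, J; t)\, e^{\mathcal{U}^c(J^* + j^*, j; t) - \mathcal{U}^c(j^*, j; t)}\big\}\big|_{J = J^* = 0} \tag{6.28}
\]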
The essential result we desire is already contained, in disguised form, in Eq. (6.28).
On the right-hand side, we find an exponential which if expanded gives a sum of
terms corresponding to a product of r (say) terms of the form U c (J ∗ + j ∗ , j; t) −
U c (j ∗ , j; t), each of which would vanish when we evaluate the final expression at J ∗ = 0,
as indicated in the formula. The only way to avoid this is if each such term receives at
least one derivative $\frac{\delta}{\delta J^*}$ from the linking operator on the left, which will then attach the
corresponding connected U c factor to the connected Hamiltonian term V c (j ∗ , J; t). In
other words, the only terms which survive involve r connected factors of the evolution
amplitude U c connected to a single factor of the connected interaction Hamiltonian
V c , as indicated schematically in Fig. 6.7, for the special case of r =3.
Now suppose that at a given time $U^c$ has only a single overall δ-function of
momentum conservation. By assumption, so does $V^c$. It is apparent that each term
on the right of Fig. 6.7 is completely connected, so that after integration over internal
momenta it can have only a single overall δ-function. Thus, $\frac{\partial}{\partial t}U^c$ has only a single
δ-function, which implies that $U^c$ retains this property for all time. At time $t = t_0$
however, $U(t_0,t_0) = 1$, so initially the only non-vanishing connected piece of $U$ is the
one-particle matrix element $\langle k'|U^c|k\rangle = \delta^3(k'-k)$, which indeed has but a single
δ-function. Apart from this single overall δ-function then, we have (by assumption) only
smooth functions of momenta, which remain smooth when combined and integrated
via the linking operator. This establishes the desired result, as stated in theorem 6.1.
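[Fig. 6.7: graphical form of Eq. (6.28), $i\,\partial U^c/\partial t = \sum_{r=1}^{\infty}$ (a single $V^c$ vertex linked to $r$ factors of $U^c$).]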
The importance of the above result is clear: we now have a precise criterion
for choosing interaction Hamiltonians which lead to properly clustering S-matrices.
In particular, the connected part of the matrix elements of Vip should contain no
dangerous δ-functions (in the terminology of (Weinberg, 1964b)). It turns out that the
extraction of the connected part of the Hamiltonian by either the inductive scheme
described above, or by functional methods, is rather inconvenient in general. Instead,
a technical device known as “second quantization”6 allows us to display the connected
parts of H with great ease. This device, involving the introduction of the famous
“creation” and “annihilation” operators, will also have an important added bonus:
it will be trivial to ensure the proper symmetry (bosonic or fermionic) of the states.
But the real advantage of the creation–annihilation technology is the ease with which
clustering can be incorporated in the theory.
Much of the preceding discussion of clustering applies quite generally to scattering
processes in non-relativistic quantum mechanics, as well as to scattering in relativistic
quantum field theory. Certainly, we do not expect locality, in the sense of space-
like commutativity of field operators, to play any role in the non-relativistic case.
However, we shall soon see that this much more special property, sometimes termed
microcausality, dovetails effortlessly with the clustering requirement once we add the
condition of Lorentz-invariance of scattering amplitudes. The natural structure of local
quantum field theory emerges inexorably once these basic concepts are fused.
6 The terminology “second quantization” arose in the late 1920s to describe the process, pioneered by
Jordan and Dirac, of replacing the c-number wavefunctions for multi-particle systems by a single spacetime-
dependent q-number operator field: it was soon realized that the creation and annihilation algebra provided
an extremely convenient operator basis for constructing such fields. See Section 2.2.
7 Again, in the interests of avoiding notational overload, the three-vector arrow indications on spatial
momenta are omitted in what follows. Whether a momentum label refers to a spatial, or four momentum,
should (we hope) be obvious from context.
Constructing clustering Hamiltonians: second quantization 145
$$
a(k)\,|k_1,k_2,..k_N\rangle \equiv \sum_{r=1}^{N}(\pm)^{r-1}\,\delta^3(k-k_r)\,|k_1,..k_{r-1},k_{r+1},..k_N\rangle \tag{6.29}
$$
$$
\langle k'_1,..k'_N|k_1,..k_N\rangle = \sum_P(\pm)^P\,\delta^3(k'_1-k_{P(1)})...\delta^3(k'_N-k_{P(N)}) \tag{6.31}
$$
Here $\sum_P$ denotes a sum over all permutations $P$ of the integers $1, 2, ...N$, with $(\pm)^P$
inserting a minus sign for fermions if the permutation P is odd.
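The permutation sum in (6.31) can be made concrete by replacing the continuous momenta with discrete labels, so that each $\delta^3$ becomes a Kronecker delta. The following is a minimal sketch under that assumption; the function names `permutation_sign` and `overlap` are illustrative and do not come from the text.

```python
from itertools import permutations

def permutation_sign(perm):
    """Sign of a permutation (tuple of indices), from the parity of its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def overlap(bra, ket, fermions=False):
    """Discrete analogue of (6.31): sum over P of (+-)^P prod_i delta(bra[i], ket[P(i)])."""
    if len(bra) != len(ket):
        return 0                      # states with different particle number are orthogonal
    total = 0
    for perm in permutations(range(len(ket))):
        if all(bra[i] == ket[perm[i]] for i in range(len(bra))):
            total += permutation_sign(perm) if fermions else 1
    return total

print(overlap((1, 2), (2, 1)))                    # bosons: symmetric under exchange
print(overlap((1, 2), (2, 1), fermions=True))     # fermions: sign flip under exchange
print(overlap((1, 1), (1, 1), fermions=True))     # two identical fermions: zero norm
```

For identical bosonic labels the overlap counts the permutations that leave the labels fixed, which is the discrete counterpart of the several δ-function terms contributing at coincident momenta.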
As suggested by the notation, a† is really the hermitian conjugate of a, as the
following computation shows:
This implies that if $|0\rangle$ is the state with no particles (the “vacuum”) and $\langle\psi|$ an
arbitrary bra state
as the bra state $\langle k,\psi|$ must contain at least one particle. Since $\langle\psi|$ is arbitrary, it
follows that the annihilation operators $a(k)$ all annihilate the vacuum, $a(k)|0\rangle = 0$.
(Note that we shall always normalize the vacuum to unity, $\langle 0|0\rangle = 1$.)
The commutator properties of the creation and annihilation operators will be
crucial. Thus, observe
$$
{} = \sum_{r=1}^{N}\delta^3(k-k_r)\,|k_1..k_{r-1}\,k'\,k_{r+1}..k_N\rangle \tag{6.34}
$$
whereas
$$
{} = \delta^3(k-k')\,|k_1..k_N\rangle \pm \sum_{r=1}^{N}\delta^3(k-k_r)\,|k_1..k_{r-1}\,k'\,k_{r+1}..\rangle \tag{6.35}
$$
Hence
More concisely, we have derived, for bosons (resp. fermions), the fundamental com-
mutation (resp. anticommutation) relations
For the case of fermions, this implies a† (k)a† (k) + a† (k)a† (k) = 2a† (k)a† (k) = 0, i.e.,
the Pauli exclusion principle forbidding the addition of two identical fermionic particles
to any state.
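The anticommutation relations and the exclusion principle can be checked mechanically in a toy model with a few discrete fermionic modes. The sketch below is only an illustration under that assumption: kets are stored as sorted tuples of mode labels with a sign picked up on sorting, `annihilate` follows the alternating-sign rule of (6.29), and all function names are hypothetical.

```python
def canonical(modes):
    """Sort a fermionic ket |k1..kN>; return (sign, sorted tuple), with sign 0 if a mode repeats."""
    modes = list(modes)
    sign = 1
    for i in range(len(modes)):                   # bubble sort: each swap flips the sign
        for j in range(len(modes) - 1 - i):
            if modes[j] > modes[j + 1]:
                modes[j], modes[j + 1] = modes[j + 1], modes[j]
                sign = -sign
    if any(modes[i] == modes[i + 1] for i in range(len(modes) - 1)):
        return 0, ()
    return sign, tuple(modes)

def add(state, modes, amp):
    """Accumulate amp * |modes> into state, in canonical (sorted) form."""
    sign, key = canonical(modes)
    if sign:
        state[key] = state.get(key, 0) + sign * amp

def create(k, state):
    """a_dagger(k): prepend the label k to every ket."""
    out = {}
    for modes, amp in state.items():
        add(out, (k,) + modes, amp)
    return out

def annihilate(k, state):
    """a(k): remove k from position r with sign (-1)**r (r counted from 0), cf. (6.29)."""
    out = {}
    for modes, amp in state.items():
        for r, kr in enumerate(modes):
            if kr == k:
                add(out, modes[:r] + modes[r + 1:], (-1) ** r * amp)
    return out

def anticommutator(k, kp, state):
    """{a(k), a_dagger(kp)} applied to state, with zero amplitudes dropped."""
    out = dict(annihilate(k, create(kp, state)))
    for key, amp in create(kp, annihilate(k, state)).items():
        out[key] = out.get(key, 0) + amp
    return {key: amp for key, amp in out.items() if amp != 0}

psi = {(1, 3): 1}                                 # the two-fermion state |1,3>
print(anticommutator(2, 2, psi), anticommutator(1, 2, psi))
print(create(2, create(2, {(): 1})))              # adding the same fermion twice
```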
Next we derive the behavior of the creation and annihilation operators under
Lorentz transformations. Recall that with non-covariant normalization of the states,
$$
U(\Lambda)\,|k,k_1,...\rangle = \sqrt{\frac{E(\Lambda k)}{E(k)}}\,\prod_i\sqrt{\frac{E(\Lambda k_i)}{E(k_i)}}\;|\Lambda k,\Lambda k_1,...\rangle \tag{6.39}
$$
$$
U(\Lambda)\,a^\dagger(k) = \sqrt{\frac{E(\Lambda k)}{E(k)}}\;a^\dagger(\Lambda k)\,U(\Lambda) \tag{6.41}
$$
Multiplying (6.41) on the right by U † (Λ) and using unitarity of the U (Λ)
$$
U(\Lambda)\,a^\dagger(k)\,U^\dagger(\Lambda) = \sqrt{\frac{E(\Lambda k)}{E(k)}}\;a^\dagger(\Lambda k) \tag{6.42}
$$
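The energy-ratio factor in these transformation laws reflects the fact that, for states normalized with a bare $\delta^3$, the momentum-space volume element changes under a boost as $d^3(\Lambda k) = \frac{E(\Lambda k)}{E(k)}\,d^3k$. The sketch below checks this Jacobian numerically for a boost along the $z$ axis. It is an illustration only: the function names, the choice of a $z$-boost parametrized by a rapidity, and the sample numbers are assumptions, not taken from the text.

```python
import math

def energy(k, m):
    """On-shell energy E(k) = sqrt(k^2 + m^2) for a spatial momentum k = (kx, ky, kz)."""
    return math.sqrt(m * m + sum(c * c for c in k))

def boost_z(k, m, rapidity):
    """Spatial momentum after a boost along z; kx and ky are unchanged."""
    kx, ky, kz = k
    return (kx, ky, math.cosh(rapidity) * kz + math.sinh(rapidity) * energy(k, m))

def jacobian_z(k, m, rapidity, h=1e-6):
    """Central-difference estimate of d(Lambda k)_z / d k_z.

    The transverse components are untouched by the boost, so the Jacobian matrix
    is triangular and this single entry equals its determinant.
    """
    kx, ky, kz = k
    up = boost_z((kx, ky, kz + h), m, rapidity)[2]
    down = boost_z((kx, ky, kz - h), m, rapidity)[2]
    return (up - down) / (2 * h)

k, m, rapidity = (0.3, -0.4, 1.2), 1.0, 0.8       # arbitrary sample values
ratio = energy(boost_z(k, m, rapidity), m) / energy(k, m)
print(jacobian_z(k, m, rapidity), ratio)          # the two numbers should agree
```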
we have
$$
H = \sum_{M,M'}\frac{1}{M!\,M'!}\int d^3k'_1..d^3k'_{M'}\,d^3k_1..d^3k_M\;
h_{M'M}(k'_1,..k'_{M'},k_1,..k_M)\,a^\dagger(k'_1)..a^\dagger(k'_{M'})\,a(k_1)..a(k_M) \tag{6.47}
$$
Similarly, $\langle k'_1k'_2|H|k_1k_2\rangle = h_{22}(k'_1,k'_2,k_1,k_2)$ + terms involving $h_{11}\,\delta^3(..)$. Having
determined $h_{11}$ in the preceding step, this fixes $h_{22}$, and so on. We may now state the
critical result which validates the importance of an expansion in terms of creation and
annihilation operators:
Theorem 6.2
This remarkable theorem shows that the expansion (6.47) directly yields the connected
matrix elements of the Hamiltonian in terms of the expansion functions $h_{N'N}$. The
proof is simple. Consider a general $(N', N)$ matrix element of $H$, expressed graphically
as shown in Fig. 6.8. The first terms on the right-hand side indicate possible
disconnected contributions: in particular, each term here must contain at least two
δ-functions (of course, some of the terms, for example the first term on the right-hand
side, only appear if $N' = N$). The last term on the right is the fully connected piece,
with only a single overall δ-function of momentum conservation. The terms in the
expansion $H = \sum_{M'M}\ldots$ with $M < N$ or $M' < N'$ do not contain enough creation or
annihilation operators to affect all the particles in the final and initial states: thus such
terms contribute only to the disconnected part. The terms with $M > N$ or $M' > N'$
give a vanishing contribution, by attempting to destroy more particles than are present
in the initial state or create more than are present in the final state. So the only part
of $H$ to contribute to the fully connected matrix element is $h_{N'N}$! Q.E.D.
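The content of the theorem can be illustrated in a toy bosonic model with three discrete modes, where $\delta^3$ becomes a Kronecker delta and the integrals in (6.47) become sums. The sketch below keeps only the $h_{11}$ and $h_{22}$ terms and evaluates two-particle matrix elements directly. Everything in it is an assumption made for illustration: the mode set, the function names, and the arbitrary coefficient tables `h11` and `h22` (the latter chosen symmetric within each pair of arguments).

```python
from collections import Counter
from itertools import product
from math import factorial

MODES = range(3)

def create(k, state):
    """a_dagger(k) on a dict {sorted tuple of modes: amplitude} of unnormalized bosonic kets."""
    out = {}
    for modes, amp in state.items():
        key = tuple(sorted(modes + (k,)))
        out[key] = out.get(key, 0) + amp
    return out

def annihilate(k, state):
    """a(k): sum over r of delta(k, k_r) times the ket with k_r removed, cf. (6.29)."""
    out = {}
    for modes, amp in state.items():
        for r, kr in enumerate(modes):
            if kr == k:
                key = modes[:r] + modes[r + 1:]
                out[key] = out.get(key, 0) + amp
    return out

def overlap(bra, state):
    """<bra|state>, using the permutation-sum normalization of (6.31) for bosons."""
    norm = 1
    for n in Counter(bra).values():
        norm *= factorial(n)
    return norm * state.get(tuple(sorted(bra)), 0)

def apply_H(h11, h22, state):
    """Discrete version of (6.47), truncated to the (1,1) and (2,2) terms."""
    out = {}
    def accumulate(piece, coeff):
        for key, amp in piece.items():
            out[key] = out.get(key, 0) + coeff * amp
    for kp, k in product(MODES, repeat=2):
        accumulate(create(kp, annihilate(k, state)), h11[kp, k])
    for kp1, kp2, k1, k2 in product(MODES, repeat=4):
        piece = create(kp1, create(kp2, annihilate(k1, annihilate(k2, state))))
        accumulate(piece, h22[kp1, kp2, k1, k2] / 4)          # 1/(2! 2!)
    return out

h11 = {(a, b): 1.0 + a + 10 * b for a, b in product(MODES, repeat=2)}
h22 = {(a, b, c, d): (a * b + 1.0) * (c + d + 2.0) for a, b, c, d in product(MODES, repeat=4)}

initial = {(0, 1): 1}
connected = overlap((2, 2), apply_H(h11, h22, initial))        # no label shared with initial state
with_spectator = overlap((1, 2), apply_H(h11, h22, initial))   # label 1 can pass straight through
print(connected, h22[2, 2, 0, 1])
print(with_spectator, h22[1, 2, 0, 1] + h11[2, 0])
```

When the final labels are disjoint from the initial ones only $h_{22}$ survives; when one label is shared, the extra $h_{11}$ term times a Kronecker delta appears, which is the discrete counterpart of the disconnected contribution.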
The theorem of the preceding section requiring $H^c$ to contain at most a single
delta-function may now be applied directly to the coefficient functions $h_{N'N}$ to obtain
the following immediate corollary of theorem 6.2:
Constructing a relativistic, clustering theory 149
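[Fig. 6.8: graphical decomposition of a general $(N', N)$ matrix element of $H$ into disconnected and fully connected parts.]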
Corollary 6.3 $h_{N'N}(k'_1,k'_2,..\,k_1,k_2..) = \delta^3(k'_1+k'_2+..-k_1-k_2-..)\cdot f(k'_1,k'_2,..)$
$\Longrightarrow$ clustering property of the S-matrix, provided $f$ is a smooth function of momenta.
$$
i\left[P^{(0)}_\mu,\mathcal{H}_{int}(x)\right] = \frac{\partial}{\partial x^\mu}\mathcal{H}_{int}(x) \tag{6.49}
$$
8 Strictly speaking, ultralocal, with no spacetime-derivatives appearing in the commutator contact terms
(cf. the discussion in Section 5.5).
We also know that Hint may be written as an expansion in destruction and creation
operators (6.47), a, a† s. Let us therefore try to construct an object from a single a (or
a† ) which satisfies the above conditions. Then we can rely on the fact that local scalar
fields form an algebraic “ring”: the product of a set of local scalar fields is again a
local scalar field. To prove this, suppose A(x), B(x), C(x), . . . comprise a set of local
scalar fields. We want to show that A(x)B(x)C(x) . . . satisfies (6.49, 6.50, 6.51).
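$$
\begin{aligned}
{} &= \frac{\partial A(x)}{\partial x^\mu}B(x)C(x).. + A(x)\frac{\partial B(x)}{\partial x^\mu}C(x).. + ..\\
&= \frac{\partial}{\partial x^\mu}\left(A(x)B(x)C(x)..\right)
\end{aligned}
$$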
We remind the reader that all operators considered here are in the interaction
picture: taking $\mu = 0$ we see that $P^{(0)}_0 = H_0$, the free Hamiltonian, generates the
time development for all fields in the theory.
2. From unitarity of U (Λ), U † (Λ)U (Λ) = 1:
3. If the points x1 , x2 are space-like separated, then each of the operators A(x1 ),
B(x1 ), C(x1 ),... commutes with each of A(x2 ), B(x2 ), C(x2 ), .., by (6.51), whence
For the time being imagine that we are dealing with a single spinless boson, with
a(k) the destruction operator for a particle of spatial momentum k.9 The most general
operator linear in the destruction operator must take the form:10
$$
\phi^{(+)}(x) = \int d^3k\, f(x;k)\,a(k) \tag{6.52}
$$
$$
\begin{aligned}
{} &= -i\int d^3k\, k_\mu f(x;k)\,a(k)\\
&= \int d^3k\, \partial_\mu f(x;k)\,a(k)
\end{aligned}
\tag{6.53}
$$
which implies
Note that the $k$ in the exponential factor in (6.55) is the four-vector momentum, with
$k_0 \equiv E(k) \equiv \sqrt{k^2+m^2}$: such a four-vector momentum is said to be “on-mass-shell”.
The result is that f (x; k), and therefore φ(+) (x), necessarily satisfy the Klein–Gordon
equation (Klein, 1926; Gordon, 1926)
$$
f(k) = \frac{1}{(2\pi)^{3/2}}\,\frac{1}{\sqrt{2E(k)}} \tag{6.60}
$$
φ(+) (x) and its hermitian conjugate φ(+)† (given by an analogous formula involving
the creation operator a† ) will be the basic ingredients out of which we shall construct
our (necessarily hermitian) interaction Hamiltonians.
We must still address the question of locality, (6.51). For bosons we have the
commutation relations $[a(k),a(k')] = [a^\dagger(k),a^\dagger(k')] = 0$ among creation and destruction
operators separately, so automatically
$$
\left[\phi^{(+)}(x),\phi^{(+)}(y)\right] = 0 = \left[\phi^{(+)\dagger}(x),\phi^{(+)\dagger}(y)\right] \tag{6.62}
$$
On the other hand, the interaction Hamiltonian must be hermitian, and must therefore
be built out of both φ(+) and φ(+)† . Thus, we must also check the commutation relation
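$$
\begin{aligned}
\left[\phi^{(+)}(x),\phi^{(+)\dagger}(y)\right]
&= \frac{1}{(2\pi)^3}\int\frac{d^3k}{\sqrt{2E(k)}}\,\frac{d^3k'}{\sqrt{2E(k')}}\;e^{-i(k\cdot x-k'\cdot y)}\,\delta^3(k-k')\\
&= \frac{1}{(2\pi)^3}\int\frac{d^3k}{2E(k)}\;e^{-ik\cdot(x-y)}\\
&\equiv \Delta_+(x-y;m)
\end{aligned}
\tag{6.63}
$$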
where the last line follows because Δ+ is a c-number (no creation or annihilation
operators), and therefore commutes with all operators in the theory, in particular with
the U (Λ). The frame-independence of this invariant function means that for z ≡ x − y
Changing variables in the radial momentum integral to β ≡ k/m,
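$$
\begin{aligned}
\Delta_+(z;m) &= \frac{m^2}{4\pi^2}\int_0^\infty\frac{\beta\,d\beta}{\sqrt{\beta^2+1}}\;\frac{\sin(\beta m|z|)}{m|z|}\\
&= \frac{m^2}{4\pi^2}\,\frac{1}{m\sqrt{-z^2}}\,K_1\!\left(m\sqrt{-z^2}\right)
\end{aligned}
\tag{6.65}
$$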
[Fig. 6.9: Behavior of the invariant function $\Delta_+$ (real part shown for $z^2 > 0$), plotted against $z^2$, with the space-like and time-like regions indicated.]
11 In the time-like region with $z^2 > 0$, $\Delta_+$ can be shown to take the form
$$
\Delta_+(z;m) = \frac{m}{8\pi\sqrt{z^2}}\left(N_1(m\sqrt{z^2}) \pm iJ_1(m\sqrt{z^2})\right),\qquad \operatorname{sgn}(z_0) = \pm
$$
which displays oscillatory behavior, in contrast to the exponential decrease of $K_1$ in the space-like region
(Fig. 6.9).
$$
\Delta_+(z;m) \sim \frac{m^2}{4\sqrt{2}}\left(\pi m\sqrt{-z^2}\right)^{-3/2}e^{-m\sqrt{-z^2}},\qquad -z^2\to\infty \tag{6.66}
$$
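The closed form (6.65) and its large-distance limit (6.66) can be compared numerically. The sketch below evaluates $K_1$ from the standard integral representation $K_1(x) = \int_0^\infty e^{-x\cosh t}\cosh t\,dt$ with a plain trapezoidal rule, so that no special-function library is needed. The function names, the cutoff `t_max` and the step count are illustrative choices, not taken from the text.

```python
import math

def bessel_k1(x, t_max=10.0, steps=20000):
    """K_1(x) via K_1(x) = int_0^inf exp(-x cosh t) cosh t dt (trapezoidal rule)."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)                    # t = 0 endpoint, half weight
    for n in range(1, steps + 1):                 # the t_max endpoint is negligibly small
        c = math.cosh(n * h)
        total += math.exp(-x * c) * c
    return total * h

def delta_plus_spacelike(s, m):
    """Eq. (6.65), with s = sqrt(-z^2) > 0 the invariant space-like separation."""
    return m * m / (4 * math.pi**2) * bessel_k1(m * s) / (m * s)

def delta_plus_asymptotic(s, m):
    """Eq. (6.66), expected to be accurate only for m*s >> 1."""
    x = m * s
    return m * m / (4 * math.sqrt(2)) * (math.pi * x) ** -1.5 * math.exp(-x)

m = 1.0
for s in (2.0, 10.0, 40.0):
    print(s, delta_plus_spacelike(s, m) / delta_plus_asymptotic(s, m))
```

The printed ratio should approach 1 from above as the separation grows, consistent with the first correction to the leading asymptotic form of $K_1$ being of relative order $1/(m\sqrt{-z^2})$.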
Unfortunately for the Lorentz-invariance of the theory, exponential falloff is not
good enough here: a non-zero commutator will result in non-Lorentz-invariant con-
tributions to the S-matrix, as the argument following (5.75) in Section 5.5 makes
clear. An important example of such a failure is provided by scalar number-conserving
interaction Hamiltonians. Consider a theory in which the interaction Hamiltonian has
an expansion of the form (6.47) with the only non-zero term having $M = M' = 2$. The
perturbative expansion of the S-matrix then yields amplitudes for processes which can
arise from a succession of 2-2 particle scatterings. Writing the interaction Hamiltonian
as a spatial integral of an hermitian scalar density, the interaction energy density in
the simplest such theory must then take the form
have separate and independent destruction $a^c(k)$ and creation $a^{c\dagger}(k)$ operators for the
antiparticle, satisfying (for bosons)
$$
\left[a^c(k),a^{c\dagger}(k')\right] = \delta^3(k-k') \tag{6.70}
$$
$$
\left[a^c(k),a(k')\right] = \left[a^c(k),a^\dagger(k')\right] = 0 \tag{6.71}
$$
$$
\left[a^c(k),a^c(k')\right] = 0 \tag{6.72}
$$
etc. Although the heuristic argument given in Chapter 3 implies equality of particle
and antiparticle masses, we temporarily allow the antiparticle to have an independent
mass mc . Now destruction of a particle and creation of the antiparticle at a spacetime
point x must be treated on the same footing (as they both describe the same physical
event in different frames), which suggests that we write a single field (which we shall
call a canonical scalar field) containing both terms:
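$$
\begin{aligned}
\phi(x) &= \frac{1}{(2\pi)^{3/2}}\int\frac{d^3k}{\sqrt{2E(k)}}\left(a(k)\,e^{-ik\cdot x} + a^{c\dagger}(k)\,e^{ik\cdot x}\right) &\quad (6.73)\\
&\equiv \phi^{(+)}(x) + \phi^{(-)}(x) &\quad (6.74)
\end{aligned}
$$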
We note here in passing that $\phi(x)$, like its positive frequency part $\phi^{(+)}(x)$, automatically
satisfies the Klein–Gordon equation (6.56): $(\Box + m^2)\phi(x) = 0$, as only on-mass-shell
four-momenta occur in the plane-wave exponentials $e^{\pm ik\cdot x}$.
The basic commutation relations (6.70–6.72) immediately imply
for all x, y (i.e., not necessarily space-like separated). The necessity to include both
φ and φ† in an hermitian Hamiltonian requires that we also check for locality in the
commutator
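$$
\begin{aligned}
\left[\phi(x),\phi^\dagger(y)\right] &= \frac{1}{(2\pi)^3}\int\frac{d^3k}{2E(k)}\left(e^{-ik\cdot(x-y)} - e^{ik\cdot(x-y)}\right)\\
&= \Delta_+(x-y;m) - \Delta_+(y-x;m_c)
\end{aligned}
\tag{6.76}
$$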
If the particle and antiparticle masses m, mc differ, this commutator will still fail to
vanish even for x − y space-like. On the other hand, exact equality of particle and
antiparticle masses m = mc implies (since Δ+ (z; m) in (6.65) is even in z for z space-
like) that the full invariant function
vanishes for space-like separation z (Δ+ (z; m) is even in z for z space-like), and we
have the desired locality
$$
\left[\phi(x),\phi^\dagger(y)\right] = 0 \quad\text{for } (x-y)^2 < 0 \tag{6.78}
$$
Of course, as a consequence of (6.62), we already have [φ(x), φ(y)] = [φ† (x), φ† (y)] = 0
for all x, y. The fact that strict locality of our fields implies an exact equality of
particle and antiparticle masses is a consequence of the famous “TCP theorem” of local
quantum field theory, which we shall discuss further in Section 13.4. For the proton
and antiproton, the equality has been established experimentally to nine significant
figures.
With the local fields φ(x) and φ† (x) in hand we may multiply them freely to obtain
an hermitian scalar interaction Hamiltonian density guaranteed to have the right
locality and Lorentz transformation properties, thereby ensuring Lorentz-invariance
of the scattering amplitudes of the theory.12 A famous example is “phi-4” theory, with
interaction
$$
\mathcal{H}_{int}(x) = \lambda\,\phi^\dagger(x)^2\,\phi(x)^2 \tag{6.79}
$$
commutes with Hint (x), and is thus exactly conserved by the dynamics of this theory
(see also Problem 3).
In certain cases a particle may have identical quantum numbers to the antiparticle
(e.g., photons, π0 , ρ0 ,..): in other words, a = ac and the canonical local field φ repre-
senting this particle is hermitian, φ(x) = φ† (x). Such particles (and their associated
canonical field) are called “self-conjugate”. A “real” version of the phi-4 theory (6.79)
describing the self-interactions of a self-conjugate neutral boson would therefore take
the form
$$
\mathcal{H}_{int}(x) = \lambda\,\phi(x)^4 \tag{6.81}
$$
With this lengthy digression into the requirements of locality (as a way to ensure
Lorentz-invariance of the S-matrix) concluded, we may finally return to examine
the constraints due to cluster decomposition. Recall that clustering required the
interaction part of the Hamiltonian to take the form
12 Of course, we must still ensure ultralocality of the equal-time commutators, as discussed in Section
5.5. In particular, derivatively-coupled theories will in general produce Schwinger terms in the commutator,
requiring additional non-covariant “seagull” terms in the interaction to correct the resultant defects in
Lorentz-invariance. We shall see in Chapter 12 how the Lagrangian formalism automatically solves the
problem of generating the appropriate seagull terms.
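$$
\begin{aligned}
V &= \sum_{M,M'}\frac{1}{M!\,M'!}\int\prod_i\frac{d^3k'_i}{\sqrt{2E(k'_i)}}\prod_i\frac{d^3k_i}{\sqrt{2E(k_i)}}\;
h_{M',M}(k'_1,..k'_{M'},k_1,..k_M)\,a^\dagger(k'_1)..a^\dagger(k'_{M'})\,a(k_1)..a(k_M)\\
&= \sum_{M,M'}\frac{1}{M!\,M'!}\int\prod_i\frac{d^3k'_i}{\sqrt{2E(k'_i)}}\prod_i\frac{d^3k_i}{\sqrt{2E(k_i)}}\;
\delta^3(k'_1+k'_2+..k'_{M'}-k_1-..k_M)\,f_{M'M}(k'_1..k_M)\,a^\dagger(k'_1)....a(k_M)
\end{aligned}
\tag{6.82}
$$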
From the fact that a(k) (a† (k)) removes (resp. adds) a particle of energy E(k) from
any state on which it acts, it follows that
where now the exponential factors contain four-vector dot-products. It is now straight-
forward to show that the scalar-field transformation property U (Λ)Hint (x)U † (Λ) =
Hint (Λx) requires that
erty [Hint (x), Hint (y)] = 0, (x − y) space-like, will hold, giving us a Lorentz-invariant
theory.
There is one tricky point here which has perhaps already occurred to the reader:
namely, the form (6.83) in which we wrote the interaction Hamiltonian. In this
expression, all creation operators appear to the left of all destruction operators. This is
called the normal-ordered product. Given any product of fields, ABC.., in which each
field A, B, C.. is a linear combination of creation and annihilation parts, we form the
normal-ordered product : ABC.. : by multiplying out the field product in the normal
way and then moving all a† s to the left and all as to the right in each of the resultant
terms, ignoring any resultant commutator terms! For example (for a self-conjugate
field):
which is, of course, a c-number. For field reorderings at the same spacetime point
we encounter a divergent c-number Δ+ (0; m) (the momentum integral in (6.64) is
quadratically divergent for z = 0)—our first encounter with the ubiquitous ultraviolet
delicacies of local field theory. We shall see how to deal with the sensitivity of the
theory to high-momentum contributions in the fourth section of this book, but for
the time being we may simply imagine inserting a cutoff on divergent momentum
integrals at some very high value, reflecting our inevitable ignorance of very-short-
distance physics. Thus, the rearrangement implied by normal ordering within a single
interaction term does not affect the locality properties of the interaction Hamiltonian
density, as a normal-ordered product can be rearranged into a linear combination of
ordinary powers of the local field. For example (see Problem 4):
with A, B c-number constants related to Δ+ (0; m). Accordingly, we are free to take
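$$
\begin{aligned}
\mathcal{H}_{int}(x) &= \frac{\lambda}{4!}:\phi(x)^4:\\
&\equiv \frac{\lambda}{4!}\left(\phi^{(+)4} + 4\phi^{(-)}\phi^{(+)3} + 6\phi^{(-)2}\phi^{(+)2} + 4\phi^{(-)3}\phi^{(+)} + \phi^{(-)4}\right)
\end{aligned}
$$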
which is, despite the reorderings, still a local field in virtue of (6.88). The free
Hamiltonian, $H_0 = \int d^3k\,E(k)\,a^\dagger(k)a(k)$, is also given (see Problem 5) by the integral
of a spatial energy density, itself a sum of normal-ordered products
$$
H_0 = \int d^3x\,\mathcal{H}_0(\vec{x},t) = \int d^3x\, :\left(\frac{\dot\phi^2}{2} + \frac{1}{2}|\vec\nabla\phi|^2 + \frac{1}{2}m^2\phi^2\right): \tag{6.89}
$$
Local fields, non-localizable particles! 159
where the divergent c-number commutator term by which the normal-ordered expres-
sion differs from the corresponding expression without normal ordering is just the
infamous “zero-point” energy of a free scalar field theory, which we have already
encountered in Chapter 1 in Jordan’s seminal calculation of field energy fluctuations.
The dependence on the time t at which the field operators in (6.89) are taken is
spurious—not surprisingly, as H0 is clearly time-independent in the interaction picture.
In summary, we have a theory with complete Hamiltonian
$$
H = \int d^3x\, :\left(\frac{1}{2}\dot\phi^2 + \frac{1}{2}|\vec\nabla\phi|^2 + \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4\right): \tag{6.90}
$$
where all the field operators are taken at t = 0, say.
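The coefficients 1, 4, 6, 4, 1 in the normal-ordered quartic interaction can be reproduced by brute force: write the self-conjugate field as $\phi = \phi^{(-)} + \phi^{(+)}$, expand the fourth power into all 16 ordered words, and move every $\phi^{(-)}$ (creation part) to the left while discarding the commutator terms. For a bosonic field no signs arise, so a normal-ordered word is fixed by how many $\phi^{(-)}$ factors it contains. The function name below is an illustrative assumption.

```python
from collections import Counter
from itertools import product

def normal_ordered_expansion(power):
    """Expand (phi_minus + phi_plus)**power and normal-order each word.

    Returns a dict mapping (number of phi_minus, number of phi_plus) to the
    number of ordered words that reduce to that normal-ordered monomial.
    """
    counts = Counter()
    for word in product("-+", repeat=power):
        n_minus = word.count("-")
        counts[(n_minus, power - n_minus)] += 1
    return dict(counts)

for (n_minus, n_plus), coefficient in sorted(normal_ordered_expansion(4).items()):
    print(f"{coefficient} x phi(-)^{n_minus} phi(+)^{n_plus}")
```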
Let us return briefly to the very strong analyticity constraint assumed above
for $f_{M'M}$. The Taylor series for $f_{M'M}$ in powers of momenta translates to higher
derivatives of the fields in coordinate space. The immediate problem induced by such
derivatives—failure of ultralocality—is not, in fact, fatal: as we shall see in Chapter
12, the Lagrangian formalism enables the construction of Lorentz invariant theories
even for interaction Hamiltonians with arbitrarily many spacetime-derivatives. The
justification for this assumption really lies in the basic feature needed to insulate
us from our ignorance of physics at very short distance scales—the scale separation
property—discussed in Chapter 3 (and in much greater detail in Chapter 16), and in
the further property of renormalizability possessed by a small subclass of local quantum
field theories, in which the low-momentum physics can, at least perturbatively (and in
a few special cases exactly, i.e., even non-perturbatively), be completely isolated from
the behavior of the theory at arbitrarily high momentum. In fact, the requirement of
renormalizability will constrain the suitable range of theories in any more than one
spacetime dimension to polynomial interactions in the fields: the sum over $M, M'$ must
terminate! In addition, for theories of spinless particles in four spacetime dimensions,
the coefficient function $f_{M'M}$ must actually be momentum-independent. The necessity
for this will be examined in great detail in the fourth section of this book when we
consider the physical origin and role of renormalizability. We shall also see then that
more general effective field theories, with arbitrarily high derivatives, and powers of the
field, which are the natural end results of the imposition only of Lorentz-invariance and
clustering requirements, lie inevitably at the microscopic “core” of any local quantum
field theory—even the perturbatively renormalizable ones.
and are thus, in the general sense of quantum measurement theory, mutually compati-
ble hermitian (hence, measurable) observables of the theory: states can be constructed
in which $\phi_{f_1}$ and $\phi_{f_2}$ take simultaneously sharp values.13
The exact localizability of relativistic quantum fields does not, however, extend
to the quantal manifestations of these fields: the particles whose interactions have
motivated the introduction of the field concept in the first place! Recall the situation
in non-relativistic quantum theory, where a massive particle like an electron can be
assigned a wavefunction ψ(x, t) which at some time t0 exactly vanishes outside an
arbitrarily small bounded spatial region v1 , thereby localizing the particle exactly
inside the given region, at least at some instant of time.14 More generally, for a
non-relativistic many particle quantum system, a number density operator $N(\vec{x},t)$ can
be defined such that the operators $N_v \equiv \int_v d^3x\,N(\vec{x},t_0)$ defined for arbitrary spatial
regions at some instant t0 exactly commute with each other for non-overlapping spatial
regions. Thus, it makes perfect sense in such a theory to speak of a definite number
of particles in a precisely well defined spatial volume at a given time.
All of this falls apart in relativistic field theory. Let us illustrate the basic issues in
the simplest case, that of a massive spinless boson described by a self-conjugate scalar
field φ(x) (as given by (6.73) with ac = a). We first note that the failure of strict
localizability for relativistic particles is a completely kinematic issue: interactions of
the field are not relevant here. The number operator counting the total number of
particles in a state can be written as a spatial integral of a number density N (x, t)
(see Problem 6)
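$$
\begin{aligned}
N &\equiv \int d^3k\,a^\dagger(k)a(k) &\quad (6.92)\\
&= \int d^3x\,N(\vec{x},t), &\quad (6.93)\\
N(\vec{x},t) &= i\,\phi^{(-)}(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\,\phi^{(+)}(\vec{x},t) &\quad (6.94)
\end{aligned}
$$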
where the antisymmetric time-derivative symbol in (6.94) is defined as
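$$
A(t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\,B(t) \equiv A(t)\frac{\partial B(t)}{\partial t} - \frac{\partial A(t)}{\partial t}B(t) = A(t)\dot{B}(t) - \dot{A}(t)B(t) \tag{6.95}
$$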
13 We shall see how to do this explicitly in Chapter 8 when we discuss coherent states of a quantum field.
14 Of course, the instantaneous spreading of the wave-packet allowed in non-relativistic theory will produce
a non-vanishing wavefunction outside of v1 for t > t0 .
In this relativistic theory, the equal-time commutator of the number density operator
at spatially distinct points $\vec{x} \neq \vec{y}$ does not vanish. We shall need the basic commutators
(Problem 7)
$$
\Delta_+(\vec{x},0;m) = \int\frac{d^3k}{(2\pi)^3\,2\sqrt{m^2c^4+\vec{k}^2c^2}}\;e^{i\vec{k}\cdot\vec{x}} \tag{6.100}
$$
and
d3 k 1 2 4
Note that although the relativistic function Δ+ is not zero, but rather falls exponen-
tially (recall (6.66)), implying the same behavior for the function Δ̃+ , by (6.102), in
the formal non-relativistic limit c → ∞, we may expand
$$
\Delta_+(\vec{x},0;m) \sim \frac{1}{2mc^2}\left(\delta^3(\vec{x}) + \frac{1}{2m^2c^2}\vec\nabla^2\delta^3(\vec{x}) + ..\right) \tag{6.103}
$$
thereby recovering local behavior for all relevant commutators. However, in the
relativistic case, both $\Delta_+(\vec{x},0;m)$ and $\tilde\Delta_+(\vec{x},0;m)$ have a dominant asymptotic
exponential falloff $\sim e^{-m|\vec{x}|}$.
With these ingredients, a short calculation yields
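$$
\begin{aligned}
\left[N(\vec{x},t),N(\vec{y},t)\right] = {}& \Delta_+(\vec{x}-\vec{y},0;m)\left(\dot\phi^{(-)}(\vec{x},t)\dot\phi^{(+)}(\vec{y},t) - \dot\phi^{(-)}(\vec{y},t)\dot\phi^{(+)}(\vec{x},t)\right)\\
&+ \tilde\Delta_+(\vec{x}-\vec{y},0;m)\left(\phi^{(-)}(\vec{x},t)\phi^{(+)}(\vec{y},t) - \phi^{(-)}(\vec{y},t)\phi^{(+)}(\vec{x},t)\right)
\end{aligned}
\tag{6.104}
$$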
Accordingly, measurements of the number of particles $N_{v_1} \equiv \int_{v_1} d^3x\,N(\vec{x},t)$,
$N_{v_2} \equiv \int_{v_2} d^3x\,N(\vec{x},t)$ in two spatially non-overlapping volumes $v_1$, $v_2$ will mutually interfere,
with the level of interference falling exponentially as the separation (smallest distance
between points in $v_1$ and $v_2$) is increased, on the scale of the Compton wavelength
$m^{-1}$ of the particle. In the non-relativistic limit, after appropriately absorbing the
leading prefactor ($\propto 1/mc^2$) in (6.103) into the normalization of the operators, the commutator
The difficulty in defining localized states for relativistic particles can be seen with
even greater clarity if we examine the behavior of various field observables for a
one-particle state $|\psi\rangle$, defined by specifying a momentum wavefunction $\psi(k)$:
$$
|\psi\rangle \equiv \int d^3k\,\psi(k)\,a^\dagger(k)|0\rangle = \int d^3k\,\psi(k)\,|k\rangle,\qquad \langle\psi|\psi\rangle = \int d^3k\,|\psi(k)|^2 = 1 \tag{6.105}
$$
We now consider the expectation value of various field observables in this one-particle
state at some fixed time, say t =0. For example, the expectation value of the number
density is (suppressing the time-variable in the fields, as we are at t =0)
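$$
\begin{aligned}
\langle\psi|N(\vec{x},0)|\psi\rangle &= i\int d^3k\,d^3k'\,\psi^*(k')\psi(k)\,\langle k'|\phi^{(-)}(\vec{x})\dot\phi^{(+)}(\vec{x}) - \dot\phi^{(-)}(\vec{x})\phi^{(+)}(\vec{x})|k\rangle\\
&= \frac{1}{2(2\pi)^3}\int d^3k\,d^3k'\,\psi^*(k')\psi(k)\,e^{i(\vec{k}-\vec{k}')\cdot\vec{x}}\left\{\sqrt{\frac{E(k)}{E(k')}} + \sqrt{\frac{E(k')}{E(k)}}\right\}\\
&= \mathrm{Re}\left(\chi^*(\vec{x})\,\tilde\chi(\vec{x})\right)
\end{aligned}
\tag{6.106}
$$
$$
\chi(\vec{x}) \equiv \frac{1}{(2\pi)^{3/2}}\int d^3k\,\frac{\psi(k)}{\sqrt{E(k)}}\,e^{i\vec{k}\cdot\vec{x}} \tag{6.107}
$$
$$
\tilde\chi(\vec{x}) \equiv \frac{1}{(2\pi)^{3/2}}\int d^3k\,\sqrt{E(k)}\,\psi(k)\,e^{i\vec{k}\cdot\vec{x}} \tag{6.108}
$$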
and the expectation value of the number density (6.106) reduces to the conventional
probability density of non-relativistic quantum mechanics |ψ(x)|2 , as first proposed
by Max Born in 1926.
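The step from the double momentum integral in (6.106) to $\mathrm{Re}(\chi^*\tilde\chi)$ is a pure rearrangement and can be checked numerically. The sketch below does so in a deliberately simplified setting: one spatial dimension, a handful of discrete momenta, a random complex wavefunction, and the common $(2\pi)^{-3}$ factor and integration measure dropped from both sides. All of these choices, and the function name, are assumptions made only for illustration.

```python
import cmath
import math
import random

def number_density_both_ways(x, momenta, psi, m=1.0):
    """Return (double sum as in (6.106), Re(chi^* chi_tilde)) for discrete 1D momenta."""
    energies = [math.sqrt(k * k + m * m) for k in momenta]
    double_sum = 0j
    for k, a, Ek in zip(momenta, psi, energies):
        for kp, b, Ekp in zip(momenta, psi, energies):
            weight = math.sqrt(Ek / Ekp) + math.sqrt(Ekp / Ek)
            double_sum += 0.5 * b.conjugate() * a * cmath.exp(1j * (k - kp) * x) * weight
    # discrete analogues of chi (6.107) and chi_tilde (6.108)
    chi = sum(a / math.sqrt(Ek) * cmath.exp(1j * k * x)
              for k, a, Ek in zip(momenta, psi, energies))
    chi_tilde = sum(a * math.sqrt(Ek) * cmath.exp(1j * k * x)
                    for k, a, Ek in zip(momenta, psi, energies))
    return double_sum.real, (chi.conjugate() * chi_tilde).real

random.seed(0)
momenta = [-2.0 + 0.5 * n for n in range(9)]
psi = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in momenta]
lhs, rhs = number_density_both_ways(0.3, momenta, psi)
print(lhs, rhs)                                   # the two numbers should agree
```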
In the relativistic case, the non-commutativity of number operators for non-
overlapping regions means, however, that even if we choose ψ(k) in (6.105) so that
the expectation value of N vanishes exactly outside some compact spatial volume
v1 , this does not mean that the state is an eigenstate, with zero eigenvalue, of the
number operator for another non-overlapping spatial volume v2 . In fact, such a state
will even have a non-zero-energy distribution outside the volume v1 . This is easily
seen by examining the energy density, given by the operator (6.89) (recall that we are
interested only in a free system here, and that we are reinstating explicit factors of
the velocity of light to facilitate the non-relativistic limit, while keeping natural units
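$$
\langle\psi|\mathcal{H}_0(\vec{x},0)|\psi\rangle = \frac{1}{2}\left(|\tilde\chi(\vec{x})|^2 + c^2\,\vec\nabla\chi^*(\vec{x})\cdot\vec\nabla\chi(\vec{x}) + m^2c^4\,|\chi(\vec{x})|^2\right) \tag{6.112}
$$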
We can localize our one-particle state exactly with respect to the number density
operator either by choosing $\psi(k) \propto \sqrt{E(k)}$ (in which case $\chi(\vec{x}) \propto \delta^3(\vec{x})$ but $\tilde\chi$ is not a
point distribution, but rather our old friend $\tilde\Delta_+$ from (6.101)) or by choosing
$\psi(k) \propto 1/\sqrt{E(k)}$ (in which case $\tilde\chi(\vec{x}) \propto \delta^3(\vec{x})$, and $\chi$ reduces to $\Delta_+$), but in either case the energy
density involves terms which do not vanish for $\vec{x} \neq 0$, but rather fall off exponentially
away from the origin at a rate determined once again by the Compton wavelength of
the particle, specifically (apart from power prefactors) like $e^{-2m|\vec{x}|}$. The non-relativistic
limit, using (6.109, 6.110), is just as expected:
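$$
\langle\psi|\mathcal{H}_0(\vec{x},0)|\psi\rangle \to mc^2\,|\psi(\vec{x})|^2 + \frac{1}{2m}|\vec\nabla\psi(\vec{x})|^2 \tag{6.113}
$$
which, on integration over space, gives the rest energy $mc^2$ plus the expectation value
of the non-relativistic kinetic energy operator $-\frac{1}{2m}|\vec\nabla|^2$.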
The peculiar resistance of relativistic particles described by local fields15 to local-
ization of their physical attributes is really a manifestation of a deep complemen-
tarity principle at play between particle and field aspects in relativistic quantum
field theories. Indeed, the fluctuations of the energy of blackbody radiation in a
subvolume of a cavity even for states in which the number of photons of each
mode (in the full system) is completely definite was the critical piece of information
used by Jordan to carry through the first real calculation in quantum field theory,
as we saw in Chapter 1. We also recall from Heisenberg’s original “gamma-ray
microscope” argument for the uncertainty principle that we can often gain some
physical understanding of complementary quantities in quantum physics by thought
experiments in which physical processes are invoked to effect a “measurement” of a
given quantity. From the field point of view, attempts to localize physical attributes
of a relativistic particle (energy, number of quanta) in a finite volume of order the
Compton wavelength necessarily entail interaction with other fields with momentum
(and hence, relativistically, energy) components on the order of the inverse Compton
wavelength. Such interactions can produce additional “virtual” particle–antiparticle
15 One of the most remarkable examples of the “fuzzy” character of localization in quantum field theories
is the Reeh–Schlieder theorem (see (Reeh and Schlieder, 1961); also (Streater and Wightman, 1978), theorem
4-2) of axiomatic quantum field theory, which asserts that an arbitrary physical state can be approximated
arbitrarily well by applying polynomial functions of field operators localized in any finite open region of
spacetime to the vacuum: even a region arbitrarily far separated from the “location” of the particles in the
given state!
pairs, vitiating the desired localization of the original particle.16 Perhaps the confus-
ing disparity between exactly localizable fields and our stubbornly “fuzzy” particle
states is best understood in the following terms. The locality principle operating
at the field level is really an implementation of the point-like nature of the particle
interactions: we construct interaction Hamiltonians by multiplying the relevant fields
at exactly the same spacetime point. The structureless character of an elementary
particle is, from this point of view, a statement about the way it interacts with
other elementary particles (or, in the case of purely self-coupled theories, with itself),
and not a statement about our ability to localize the physical characteristics (energy,
momentum, charge, etc.) of the associated particle at a dimensionless spatial point,
which, as the preceding discussion shows, is intrinsically impossible in a relativistic
theory.
16 We shall later return to the underlying particle–field complementarity (in the form of the number-phase
mutual uncertainty principle) at work here in Chapter 8, when we examine the classical limit of quantum
field theory.
17 The S-matrix approach to strong interactions, pioneered in the late 1950s and 1960s by Chew,
Mandelstam, Regge, and many others, while leading to many important and lasting results, faded once
the efficacy of quantum chromodynamics in addressing a much wider variety of strong dynamics processes
became apparent in the 1970s and 1980s.
18 See the paper of Toll (Toll, 1956), for a review and careful discussion of the logical foundations.
From microcausality to analyticity 165
where causality requires $T(t-t') = 0$, $t < t'$: the output at any time $t$ can only depend
on prior values of the input signal at time $t'$. The Fourier transform of the transfer
function is given by
$$
\tilde{T}(\omega) = \int_{-\infty}^{+\infty}e^{i\omega t}\,T(t)\,dt = \int_0^{+\infty}e^{i\omega t}\,T(t)\,dt \tag{6.115}
$$
The restriction of the time variable t to positive values in the above integral means that
T̃ (ω) may be analytically continued from real values of ω to the upper-half-plane of the
complex frequency plane, Im(ω) > 0, as the integral acquires an additional convergence
factor $e^{-\mathrm{Im}(\omega)t}$ as we move off the real ω axis into the upper half plane of ω. In fact,
for square-integrable transfer functions, the connection goes both ways, as has been
shown by Titchmarsh (Titchmarsh, 1948): upper-half-plane analytic functions square
integrable along any line parallel to the real axis are inevitably the Fourier transforms
of causal transfer functions (i.e., functions vanishing for negative time). This upper-
half-plane analyticity of T̃ (ω) allows the derivation of important dispersion relations,
relating the real and imaginary parts of T̃ (ω) for real values of ω. Such dispersion
relations are of enormous phenomenological importance in high-energy physics: they
played a central role in the S-matrix approach to strong interaction physics which
dominated particle physics in the late 1950s and through the 1960s.
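The statement about upper-half-plane analyticity can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical causal transfer function $T(t) = e^{-\gamma t}$ for $t \ge 0$ (this choice, and the names `transfer_transform` and `closed_form`, are ours and do not come from the text). For this choice the one-sided integral in (6.115) evaluates to $1/(\gamma - i\omega)$, whose only singularity sits at $\omega = -i\gamma$ in the lower half plane.

```python
import numpy as np

def transfer_transform(omega, gamma=1.0, t_max=60.0, n=600_001):
    """Trapezoid-rule estimate of the one-sided integral in (6.115).

    Assumed (hypothetical) causal transfer function: T(t) = exp(-gamma t), t >= 0.
    omega may be complex; Im(omega) > 0 only improves the convergence.
    """
    t = np.linspace(0.0, t_max, n)
    integrand = np.exp(1j * omega * t) * np.exp(-gamma * t)
    dt = t[1] - t[0]
    return dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

def closed_form(omega, gamma=1.0):
    """Analytic value 1 / (gamma - i omega); the pole is at omega = -i gamma."""
    return 1.0 / (gamma - 1j * omega)

if __name__ == "__main__":
    for omega in (0.7, 0.7 + 0.5j, -2.0 + 3.0j):
        print(omega, transfer_transform(omega), closed_form(omega))
```

The numerical and analytic values should agree both on the real axis and at points with Im(ω) > 0, which is the content of the continuation argument for this particular example.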
The connection between analyticity and causality is also familiar in ordinary
quantum scattering theory. Form a wave-packet of incoming waves by smearing plane
waves in the usual way:
$$\psi_{\rm in}(z,t) = \int d\omega\, g(\omega)\, e^{i\omega(\frac{z}{c} - t)}$$
The scattering amplitude f (ω) then gives the outgoing spherical scattered wave as
(see Fig. 6.10)
$$\psi_{\rm scat}(r,t) = \int d\omega\, f(\omega)\,g(\omega)\, e^{i\omega(\frac{r}{c} - t)}$$
[Fig. 6.10 Scattering from a localized target. Labels: incoming component $\psi_{\rm in}(z,t) = g(\omega)e^{i\omega(\frac{z}{c}-t)}$, scattered component $\psi_{\rm scat}(r,t) = f(\omega)g(\omega)e^{i\omega(\frac{r}{c}-t)}$.]
Now choose g(ω) so the incoming packet arrives at the scattering center at t = 0:
$$\psi_{\rm in}(0,t) = 0,\ t < 0, \quad \text{with} \quad g(\omega) = \int_{0}^{+\infty} \frac{dt}{2\pi}\,\psi_{\rm in}(0,t)\,e^{i\omega t}$$
g(ω) certainly exists for real ω, a fortiori for Im(ω) > 0, where the integral has an
additional real exponential convergence. In fact, this implies (by reasoning exactly
analogous to that presented previously for causal signals) analyticity of g(ω) in the
entire upper-half-plane. But causality requires that there be no scattered wave ahead
of the incident one! In other words,
$$\psi_{\rm scat}(r,t) = 0, \qquad t - \frac{r}{c} < 0$$
so by the same argument, f (ω)g(ω) is upper-half-plane analytic. This implies that the
scattering amplitude f (ω) cannot have singularities in the upper-half-plane.
In field theory, a simple, and physically important, illustration of the connection
between microcausality and analyticity can be found in the process of forward scat-
tering, in which a massless particle (e.g., a photon) scatters with zero momentum
exchange off a target (e.g., a neutral atom, or a proton). The complications of spin
are completely irrelevant to the argument we shall present, so we shall discuss this
process for a massless spinless boson described by a field φ(x). The target particle is
assumed to be stable with respect to emission of the φ-particle. We also assume that the
interaction between our scalar “photon” and the target can be treated perturbatively
to second order in an interaction Hamiltonian of form
$$H_{\rm int}(x) = e\,J(x)\,\phi(x) \tag{6.116}$$
where φ(x) is the canonical scalar field for the “photon” in the interaction picture,
and hence given explicitly by (6.73) (with $a = a^c$, as we assume a self-conjugate
boson).
The fields describing the internal dynamics of the target are all contained in the
“current”19 J(x) which evolves dynamically according to a Hamiltonian H0 which,
despite the suggestive subscript, is “free” only in the sense that the field φ is absent,
while all other aspects of the internal dynamics of the target are treated exactly. For
example, if the target is a proton, J(x) would be constructed from the appropriate
quark fields and would evolve dynamically with the full Hamiltonian of quantum
chromodynamics, the gauge theory assumed to describe strong dynamics. We need only
assume that J(x) is a local scalar field with the usual translation property (cf (5.93))
$$J(x) = e^{iP^{(0)}_\mu \cdot x^\mu}\,J(0)\,e^{-iP^{(0)}_\mu \cdot x^\mu} \tag{6.117}$$
19 The terminology is justified by the form of the interaction in the physical case of electron–proton
scattering, where $H_{\rm int}(x) = eJ_\mu(x)A^\mu(x)$, with $J_\mu(x)$ the electromagnetic current for the strongly
interacting fields, and $A^\mu(x)$ the vector potential field mediating the photon. Here, for simplicity, we
assume that the current is a Lorentz scalar field.
and microcausality
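$$S^{(2)}(k',q';k,q) = \frac{(-ie)^2}{2}\int d^4x\,d^4y\,\langle k',q'|T\{J(x)\phi(x)J(y)\phi(y)\}|k,q\rangle \tag{6.119}$$
or, writing out the T-product explicitly,
$$S^{(2)}(k',q';k,q) = -\frac{e^2}{2}\int d^4x\,d^4y\,\{\theta(x^0-y^0)\langle k'|J(x)J(y)|k\rangle\langle q'|\phi(x)\phi(y)|q\rangle + x \leftrightarrow y\} \tag{6.120}$$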
Here the field φ is free and commutes with the current J, as the dynamics of the
latter by definition lacks all reference to φ. This allows the factorization of the matrix
element indicated in (6.120). From the explicit expression for φ in terms of creation
and annihilation operators, we find
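$$\langle q'|\phi(x)\phi(y)|q\rangle = \frac{1}{(2\pi)^3}\,\frac{1}{2E(q)}\,\big(e^{iq'\cdot x - iq\cdot y} + x \leftrightarrow y\big) \tag{6.121}$$
whence
$$S^{(2)}(k',q';k,q) = \frac{-e^2}{2(2\pi)^3}\,\frac{1}{2E(q)}\int d^4x\,d^4y\,\big(e^{iq'\cdot x - iq\cdot y} + x \leftrightarrow y\big)\langle k'|T\{J(x)J(y)\}|k\rangle \tag{6.122}$$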
and that the second term on the right-hand side of (6.127) makes no contribution
to the forward scattering amplitude. This follows from the following brief
computation:
$$\begin{aligned}
\int d^4z\,e^{iq\cdot z}\langle k|J(0)J(z)|k\rangle &= \sum_n \int d^4z\,e^{iq\cdot z}\langle k|J(0)|n\rangle\langle n|J(z)|k\rangle \\
&= \sum_n \int d^4z\,e^{i(q+P_n-k)\cdot z}\langle k|J(0)|n\rangle\langle n|J(0)|k\rangle \\
&= (2\pi)^4 \sum_n \delta^4(q+P_n-k)\,\langle k|J(0)|n\rangle\langle n|J(0)|k\rangle
\end{aligned} \tag{6.128}$$
where we have inserted a complete set $|n\rangle$ of eigenstates of the target system energy-
momentum $P^{(0)}$, $P^{(0)\mu}|n\rangle = P_n^\mu|n\rangle$, and used the translation property (6.117). By
assumption, the stability of the target system $|p\rangle$ to emission of a “photon” implies
that there are no states $|n\rangle$ with momentum $k - q$ and non-vanishing matrix element
$\langle n|J(0)|k\rangle$.20 Thus, the forward scattering amplitude $T(k,q;k,q)$ can be written as
the Fourier transform of a retarded commutator:
$$T(k,q;k,q) = \int d^4z\,e^{iq\cdot z}\,\theta(z^0)\,\langle k|[J(z),J(0)]|k\rangle \tag{6.129}$$
20 For example, our target system might be an atom in its ground state, or the proton.
space-like z ensures that the spacetime integral over z is restricted to the forward
light-cone, i.e., coordinates z with $z^0 > 0$, $z^2 \ge 0$. Our massless “photon” has energy-
momentum $q^\mu = (\omega, \omega\hat{q})$, with $\hat{q}$ a unit spatial vector, so (6.129) can be written
(suppressing the dependence on the target momentum k and photon direction $\hat{q}$)
$$T(\omega) = \int d^4z\,e^{i\omega(z^0 - \hat{q}\cdot\vec{z})}\,\theta(z^0)\,\langle k|[J(z),J(0)]|k\rangle \tag{6.130}$$
6.7 Problems
1. Verify that the linking operator L defined in (6.21) correctly constructs the
contribution to the 2-2 matrix elements of the product V U of two Fock space
operators V and U arising from two-(bosonic)particle intermediate states. You
will only need the M = N = 2 terms in the expansions of the V(j ∗ , j) and U(j ∗ , j)
functionals. Also, recall the form of the completeness relation for the Fock space
of a single boson, (5.22).
2. Consider a bosonic theory with Hamiltonian
$$H = \int d^3k_1'\,d^3k_1\,h_{11}(k_1',k_1)\,a^\dagger(k_1')a(k_1) + \frac{1}{4}\int d^3k_1'\,d^3k_2'\,d^3k_1\,d^3k_2\,h_{22}(k_1',k_2',k_1,k_2)\,a^\dagger(k_1')a^\dagger(k_2')a(k_1)a(k_2)$$
Calculate the 3-3 matrix element $\langle q_1' q_2' q_3'|H|q_1 q_2 q_3\rangle$ explicitly in terms of the
functions $h_{11}$, $h_{22}$. Express your result graphically to display the connectedness
structure of this matrix element.
3. A conserved four-vector field (called a “current”) J μ (x) can be defined as follows
for a free complex scalar field φ(x):
counts electric charge (defined as e times the difference in the number of particles
and antiparticles), and is time-independent.
4. Show that the normal-ordered product of four self-conjugate scalar fields at the
same spacetime point can be written in terms of even powers of the field (and is
therefore itself local); i.e., show
$$:\phi(x)^4: \;=\; \phi(x)^4 + A\,\phi(x)^2 + B$$
where A, B are c-numbers (independent of x).
5. The following exercise shows that the free Hamiltonian H0 can be written as an
integral of a density constructed from the local self-conjugate field φ.
(a) Show that
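$$H_0 = \int d^3k\,E(k)\,a^\dagger(k)a(k)$$
i.e., show that an arbitrary free state $|k_1, k_2, \ldots, k_n\rangle$ has the appropriate eigenvalue
relative to this operator.
(b) Next, show that
$$\frac{1}{2}\int d^3x\,\Big\{\Big(\frac{\partial\phi}{\partial t}\Big)^2 + |\vec{\nabla}\phi|^2 + m^2\phi^2\Big\} = \int d^3k\,E(k)\,\big(a^\dagger(k)a(k) + a(k)a^\dagger(k)\big)/2 = H_0 + \frac{1}{2}\,\delta^3(0)\int d^3k\,E(k)$$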
Note that the singular zero-point energy is removed by normal-ordering H0 , as
in (6.89).
6. Verify the expressions for the number operator (6.92, 6.93, 6.94).
7. Verify the commutator results listed in Eqs. (6.96–6.99).
8. Show that the expectation value of the free scalar energy density in the state
(6.105) is as given in (6.112).
9. Let $g_k(x) = \frac{1}{\sqrt{(2\pi)^3\,2E(k)}}\,e^{-ik\cdot x}$ (k and x are four-vectors, with $k_0 = E(k)$). Show
that the destruction operator a(k) may be reconstructed from a self-conjugate
scalar field by
$$a(k) = i\int d^3x\,\big(g_k^*(\vec{x},t)\,\partial_0\phi(\vec{x},t) - \phi(\vec{x},t)\,\partial_0 g_k^*(\vec{x},t)\big) \tag{6.132}$$
7
Dynamics V: Construction of local covariant fields
In many of the standard texts on quantum field theory, the introduction of fields
representing particles of low spin (zero, $\frac{1}{2}$, or one; there is no direct phenomenological
evidence for elementary particles of any higher spin) is a fairly ad hoc matter.
Relativistic wave equations are introduced and shown to have “nice” covariance properties.
A Lagrangian formalism is then constructed for which these equations are just the
Euler–Lagrange equations of the theory, corresponding to the extremal condition
on the classical action. Finally, a canonical quantization procedure is carried out:
conjugate momentum fields are introduced, and the resultant Hamiltonian is shown
to be the appropriate energy operator for particles of the desired mass and spin.
In this chapter we shall eschew this ad hoc methodology in favor of a more
direct, constructive approach. The relativistic wave equations satisfied by the covariant
fields representing particles of low spin are shown to be automatic consequences
of the representation theory of the Poincaré group, which can be used to write a
completely general expression for the fields transforming according to an arbitrary
finite-dimensional representation of the Lorentz group and representing particles of
arbitrary mass and spin.
The advantage of this approach is twofold. In the first place, it will become apparent
in a completely natural way that there is an inevitable, but completely classifiable,
fluidity in the association of fields to particles: many different covariant fields may be
used with equal validity to represent a particle of a given spin, although there is usually
a “best” (i.e., most convenient) choice. Secondly, the formalism allows us to solve the
problem of constructing covariant fields in one fell swoop: the special cases of spin
zero, $\frac{1}{2}$, or 1 then follow from the general result simply by inserting $j = 0, \frac{1}{2}, 1$ into the
master formula. In particular, the Spin-Statistics theorem associating particles with
integral (resp. half-integral) spin with fields of bosonic (resp. fermionic) type (at the
free field level) will emerge naturally in the framework of this formalism.
simple scalar canonical fields can never produce particles of non-zero spin acting on
the vacuum. From the defining transformation law for a scalar field
one finds, choosing Λ to be the infinitesimal rotation by δθ around the ith axis,
$(R_i x)_j = x_j + \epsilon_{ijk}\,x_k\,\delta\theta$, and $U(\Lambda) = e^{-i\delta\theta J_i}$,
Recall that the spin of a particle is simply the residual angular momentum it possesses
when at rest: i.e., at zero linear momentum. Any state, formed by φ acting on the
vacuum with zero spatial momentum (which we achieve by integrating over 3-space to
project out the creation operator at zero momentum), must then have zero angular
momentum as well, as we discover by a simple integration by parts:
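$$\begin{aligned}
J_i \int d^3x\,\phi(x)|0\rangle &= i\int d^3x\,\epsilon_{ijk}\,x_k\,\partial_j\phi(x)|0\rangle \\
&= -i\int d^3x\,\phi(x)\,\epsilon_{ijk}\,\partial_j x_k\,|0\rangle \\
&= 0
\end{aligned} \tag{7.3}$$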
It is clear that more general fields are needed to describe particles of non-vanishing
spin. Nevertheless, it will be important to be able to combine such fields to again
produce an interaction Hamiltonian density which is itself a Lorentz scalar field,
satisfying (7.1) above. The solution is to construct fields that transform according
to definite finite-dimensional representations of the homogeneous Lorentz group. For
example, two vector fields $A^\mu(x)$, $B^\mu(x)$ transforming like
$$U(\Lambda)\,(A,B)^\mu(x)\,U^\dagger(\Lambda) = (\Lambda^{-1})^\mu{}_\nu\,(A,B)^\nu(\Lambda x) \tag{7.4}$$
can be coupled together to make a scalar field $C(x) \equiv A^\mu(x)B_\mu(x)$, which is easily seen
to satisfy (7.1). Moreover, C(x) is local (commutes with itself at space-like separation)
if A, B are. The original four-fermion theory of the weak interactions, dating from the
1930s, involved a weak interaction Hamiltonian of precisely this form.
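The invariance of the contraction $A^\mu B_\mu$ rests on the defining property $\Lambda^T g \Lambda = g$ of a Lorentz transformation. A minimal numerical sketch for ordinary c-number four-vectors (not field operators) follows; the helper names `boost_z`, `rotation_z` and `minkowski_dot` are illustrative choices of ours, and the metric signature (+,−,−,−) is assumed.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed Minkowski metric, signature (+,-,-,-)

def boost_z(theta):
    """Boost along z with rapidity theta, acting on contravariant components."""
    L = np.eye(4)
    L[0, 0] = L[3, 3] = np.cosh(theta)
    L[0, 3] = L[3, 0] = np.sinh(theta)
    return L

def rotation_z(angle):
    """Rotation about the z axis by the given angle."""
    R = np.eye(4)
    R[1, 1] = R[2, 2] = np.cos(angle)
    R[1, 2] = -np.sin(angle)
    R[2, 1] = np.sin(angle)
    return R

def minkowski_dot(a, b):
    """Invariant product a^mu g_{mu nu} b^nu."""
    return a @ g @ b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.normal(size=4), rng.normal(size=4)
    lam = boost_z(0.8) @ rotation_z(0.3)
    print(minkowski_dot(a, b), minkowski_dot(lam @ a, lam @ b))
```

The two printed numbers should coincide to rounding error, which is the c-number counterpart of the statement that C(x) transforms as a scalar.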
A general covariant field will be a set of field operators transforming according to
a general finite-dimensional representation of the homogeneous Lorentz group realized
by the finite-dimensional matrices Mnm (Λ):
Lorentz scalar fields can be constructed from such covariant fields by coupling them
together with invariant tensors $t_{n_1 n_2 \ldots}$ of the HLG; namely
$$t_{n_1 n_2 \ldots}\,M_{n_1 m_1}(\Lambda^{-1})\,M_{n_2 m_2}(\Lambda^{-1})\cdots = t_{m_1 m_2 \ldots}, \quad \text{all } \Lambda \tag{7.7}$$
Finite-dimensional representations of the homogeneous Lorentz group 173
Of course, the simplest example of such a tensor is the two index Minkowski-space
metric tensor $g_{\mu\nu}$, employed above to construct a scalar field C as the invariant
dot-product of two vector fields $A^\mu$, $B^\nu$.
In addition we shall require translation invariance just as for scalar fields (cf(5.91)):
and locality
$$[\phi_n(x), \phi_m(y)] = [\phi_n(x), \phi_m^\dagger(y)] = 0, \qquad (x-y)^2 < 0 \tag{7.9}$$
which implies that there are six independent parameters in the specification of an
arbitrary infinitesimal Lorentz transformation (three angles and three boosts). In a
general finite-dimensional representation, Λ is represented by the matrix
$$M_{nm}(\Lambda) = \delta_{nm} + \frac{i}{2}\,\Omega_{\mu\nu}\,(J^{\mu\nu})_{nm} + O(\Omega^2) \tag{7.13}$$
where the six independent matrices $J^{12}, J^{23}, J^{31}, J^{01}, J^{02}, J^{03}$ are the generators of
rotations (around the z, x, and y axes, respectively) and boosts (along the x, y, and z
axes, respectively). Now consider some definite fixed $\bar{\Lambda} = 1 + \bar{\Omega}$ which we subject to
a similarity transformation with a general $M(\Lambda)$:
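Comparing (7.17) and (7.18), we see that the generators $J^{\mu\nu}$ transform as second-rank
contravariant tensors under the Lorentz group:
Next, starting from (7.19), choose Λ itself infinitesimal, $\Lambda^\mu{}_\nu = g^\mu{}_\nu + \omega^\mu{}_\nu$, so that keeping
terms of O(ω), we obtain
$$\begin{aligned}
\frac{i}{2}\,[\omega_{\rho\sigma}J^{\rho\sigma}, J^{\mu\nu}] &= (\omega_\rho{}^\mu\, g_\sigma{}^\nu + g_\rho{}^\mu\, \omega_\sigma{}^\nu)\,J^{\rho\sigma} \\
&= \omega_{\rho\sigma}\,(g^{\sigma\mu}J^{\rho\nu} - g^{\rho\nu}J^{\mu\sigma}) \\
&= \frac{1}{2}\,\omega_{\rho\sigma}\,(g^{\sigma\mu}J^{\rho\nu} + g^{\sigma\nu}J^{\mu\rho} - g^{\rho\nu}J^{\mu\sigma} - g^{\rho\mu}J^{\sigma\nu})
\end{aligned} \tag{7.20}$$
from which follows immediately the full Lie algebra of the HLG:
$$[J^{\mu\nu}, J^{\rho\sigma}] = i\,(g^{\mu\sigma}J^{\rho\nu} + g^{\nu\sigma}J^{\mu\rho} - g^{\rho\mu}J^{\sigma\nu} - g^{\rho\nu}J^{\mu\sigma}) \tag{7.21}$$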
If μ, ν are both spatial indices, we are dealing with the rotation subgroup of the HLG,
and the corresponding generators are therefore just the familiar angular momentum
operators:
$$J_1 \equiv J^{32} = J_{32} \tag{7.22}$$
$$J_2 \equiv J^{13} = J_{13} \tag{7.23}$$
$$J_3 \equiv J^{21} = J_{21} \tag{7.24}$$
The physical content of the algebra is made more transparent by defining boost
generators
$$K_1 \equiv J_{10} \tag{7.26}$$
$$K_2 \equiv J_{20} \tag{7.27}$$
$$K_3 \equiv J_{30} \tag{7.28}$$
The structure of the HLG is further clarified by considering the linear combinations
$$A_i \equiv \frac{1}{2}(J_i - iK_i), \qquad B_i \equiv \frac{1}{2}(J_i + iK_i) \tag{7.31}$$
$$[A_i, B_j] = 0 \tag{7.33}$$
Evidently, the HLG can be regarded as the direct product of two “angular momentum
groups”! A note of caution is necessary here: we parenthesize “angular momentum
groups” because the generators Ai , Bi are not hermitian, so that the groups they
generate are not, strictly speaking, the usual unitary SU(2) group, but rather a
complexified version. Nevertheless, the resolution of the full HLG algebra into two
commuting subalgebras tremendously simplifies (in fact, effectively solves) the problem
of classifying all the finite-dimensional representations of the HLG.
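The decomposition into two commuting “angular momentum” algebras can be checked concretely in the four-vector representation. The sketch below assumes the generators $(J^{\mu\nu})^\alpha{}_\beta = i(g^{\nu\alpha}\delta^\mu_\beta - g^{\mu\alpha}\delta^\nu_\beta)$ with metric signature (+,−,−,−), together with the definitions (7.22)–(7.28) and (7.31); the helper names are ours. It also verifies the standard fact that the four-vector is the $(\frac{1}{2},\frac{1}{2})$ representation, so that both Casimirs equal $\frac{1}{2}(\frac{1}{2}+1) = \frac{3}{4}$.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed metric signature (+,-,-,-)

def J_upper(mu, nu):
    """Four-vector generator (J^{mu nu})^alpha_beta (assumed explicit form)."""
    M = np.zeros((4, 4), dtype=complex)
    for a in range(4):
        for b in range(4):
            M[a, b] = 1j * (g[nu, a] * (mu == b) - g[mu, a] * (nu == b))
    return M

# Rotations J_1 = J^{32}, J_2 = J^{13}, J_3 = J^{21}; boosts K_i = J_{i0} = -J^{i0}.
J = [J_upper(3, 2), J_upper(1, 3), J_upper(2, 1)]
K = [-J_upper(i, 0) for i in (1, 2, 3)]
A = [(J[i] - 1j * K[i]) / 2 for i in range(3)]
B = [(J[i] + 1j * K[i]) / 2 for i in range(3)]

def comm(X, Y):
    return X @ Y - Y @ X

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices drawn from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def su2_residual(G):
    """Largest deviation of [G_i, G_j] from i eps_{ijk} G_k."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            rhs = sum(1j * levi_civita(i, j, k) * G[k] for k in range(3))
            worst = max(worst, np.abs(comm(G[i], G[j]) - rhs).max())
    return worst

if __name__ == "__main__":
    print("max |[A_i, B_j]| =", max(np.abs(comm(a, b)).max() for a in A for b in B))
    print("su(2) residuals  =", su2_residual(A), su2_residual(B))
```

Both residuals should vanish to rounding error, while the matrices A_i and B_i themselves are visibly non-hermitian, in line with the note of caution above.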
General representations of the HLG can be labeled by a spin pair (A, B), where
A, B are integers or half-integers. Within a representation (A, B), a state is labeled
by (a, b), where as usual $a = -A, -A+1, \ldots, +A$, $b = -B, -B+1, \ldots, +B$. Since the
usual angular momentum $\vec{J} = \vec{A} + \vec{B}$,
we can only describe particles of spin j by fields
(A, B) where A and B can be coupled together to make angular momentum j, i.e.,
$$|A - B| \le j \le |A + B| \tag{7.34}$$
On the other hand, any field satisfying this constraint can be used to represent a
particle of the given spin! This is a central feature of field theory which appears at
this point clearly for the first time: there is no unique correspondence between particles
and fields. A given particle can be represented by a variety of covariant fields, and (as
we shall see later) a given field can also represent many particle states.
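The non-uniqueness of the field assigned to a given spin is easy to tabulate. The following sketch lists the pairs (A, B) satisfying (7.34) up to a chosen maximal label; the function names and the cutoff `max_label` are illustrative choices of ours. Note that we also impose the usual angular-momentum addition rule that j differs from A + B by an integer, which is implicit in the phrase “can be coupled together”.

```python
from fractions import Fraction

def half_integers(max_label):
    """All labels 0, 1/2, 1, ... up to max_label inclusive."""
    n = int(2 * Fraction(max_label))
    return [Fraction(k, 2) for k in range(n + 1)]

def allowed_representations(j, max_label=Fraction(3, 2)):
    """Pairs (A, B) with |A - B| <= j <= A + B and A + B - j an integer."""
    j = Fraction(j)
    reps = []
    for A in half_integers(max_label):
        for B in half_integers(max_label):
            in_range = abs(A - B) <= j <= A + B
            integer_step = (A + B - j).denominator == 1
            if in_range and integer_step:
                reps.append((A, B))
    return reps

if __name__ == "__main__":
    for spin in (0, Fraction(1, 2), 1):
        print(spin, [(str(A), str(B)) for A, B in allowed_representations(spin, 1)])
```

With the cutoff set to 1, spin 1/2 admits (0, 1/2), (1/2, 0), (1/2, 1) and (1, 1/2), while spin 1 admits (0, 1), (1, 0), (1/2, 1/2) and (1, 1).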
The construction of scalar quantities under the HLG is analogous to the problem
of coupling non-zero angular momenta to net zero spin in the theory of the rotation
group—except that we have here to worry about two “rotation groups”! We shall
soon see that the behavior of these representations under (i) complex conjugation,
and (ii) spatial inversion (parity) play a particularly important role in understanding
how to accomplish this. First, we make some comments about the effect of complex
conjugation. This becomes an issue for spinorial (half-integral) representations: the
fundamental (spin-$\frac{1}{2}$) representation of SU(2), for example, is complex, but pseudoreal.
In other words, the 2-spinor representation of SU(2) is isomorphic to its complex
conjugate. In the case of the HLG, with its doubled SU(2) structure, conjugation has
the additional effect of interchanging the A and B quantum numbers. To see this,
consider the M(Λ) representation matrix for the ($\frac{1}{2}$,0) representation, with Λ a finite
HLG element with finite parameters $\Omega_{\mu\nu}$ (cf. Eq. (7.13) for the infinitesimal case):
$$M^{(\frac{1}{2},0)}(\Lambda) = e^{\frac{i}{2}\Omega_{\mu\nu}J^{\mu\nu}} \tag{7.35}$$
Defining angle and boost vectors by $\vec{\phi} = (\Omega_{23}, \Omega_{31}, \Omega_{12})$, $\vec{\xi} = (\Omega_{10}, \Omega_{20}, \Omega_{30})$, and
recalling that $J^{i0} = -K_i = -\frac{i}{2}\sigma_i$ (i=1,2,3), $J^{ij} = -\epsilon_{ijk}J_k = -\frac{1}{2}\epsilon_{ijk}\sigma_k$ for the ($\frac{1}{2}$,0)
representation, one finds
$$M^{(\frac{1}{2},0)}(\Lambda) = e^{\frac{1}{2}\vec{\xi}\cdot\vec{\sigma} - \frac{i}{2}\vec{\phi}\cdot\vec{\sigma}} \tag{7.36}$$
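Likewise, the corresponding finite transformation matrix for the (0,$\frac{1}{2}$) representation
is
$$M^{(0,\frac{1}{2})}(\Lambda) = e^{-\frac{1}{2}\vec{\xi}\cdot\vec{\sigma} - \frac{i}{2}\vec{\phi}\cdot\vec{\sigma}} \tag{7.37}$$
Thus if a ($\frac{1}{2}$,0) spinor $\chi_\alpha$, α = 1, 2 transforms under a Lorentz transformation as
$$\chi_\alpha \to \big(e^{\frac{1}{2}\vec{\xi}\cdot\vec{\sigma} - \frac{i}{2}\vec{\phi}\cdot\vec{\sigma}}\big)_{\alpha\beta}\,\chi_\beta \tag{7.38}$$
the conjugate spinor will transform as
$$\chi^*_\alpha \to \big(e^{\frac{1}{2}\vec{\xi}\cdot\vec{\sigma}^* + \frac{i}{2}\vec{\phi}\cdot\vec{\sigma}^*}\big)_{\alpha\beta}\,\chi^*_\beta \tag{7.39}$$
With the matrix
$$C_s \equiv i\sigma_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{7.40}$$
we have
$$C_s\,\vec{\sigma}\,C_s^{-1} = -\vec{\sigma}^*, \qquad C_s\,\vec{\sigma}^*\,C_s^{-1} = -\vec{\sigma} \tag{7.41}$$
whence the conjugation transformation (7.39) becomes
$$\begin{aligned}
(C_s\chi^*)_\alpha &\to \big(C_s\,e^{\frac{1}{2}\vec{\xi}\cdot\vec{\sigma}^* + \frac{i}{2}\vec{\phi}\cdot\vec{\sigma}^*}\,C_s^{-1}\big)_{\alpha\beta}\,(C_s\chi^*)_\beta \\
&= \big(e^{-\frac{1}{2}\vec{\xi}\cdot\vec{\sigma} - \frac{i}{2}\vec{\phi}\cdot\vec{\sigma}}\big)_{\alpha\beta}\,(C_s\chi^*)_\beta
\end{aligned} \tag{7.42}$$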
Comparing this result with (7.37), we see that the conjugate spinor $C_s\chi^*$ transforms
appropriately for a (0,$\frac{1}{2}$) representation: conjugation has reversed the roles of the A
and B quantum numbers. This is hardly surprising, given Eqs. (7.31), as the $\vec{J}$ (resp.
$\vec{K}$) generators are represented by hermitian (resp. antihermitian) matrices.
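The interchange of the two representations under conjugation lends itself to a direct numerical check. The sketch below builds the matrices (7.36) and (7.37) for arbitrary boost and angle vectors and confirms that $C_s\,M^{(\frac{1}{2},0)*}\,C_s^{-1} = M^{(0,\frac{1}{2})}$. The function names are ours, and the matrix exponential is evaluated with the closed form $e^{\vec{a}\cdot\vec{\sigma}} = \cosh s + (\sinh s/s)\,\vec{a}\cdot\vec{\sigma}$, $s^2 = \vec{a}\cdot\vec{a}$, which holds for complex $\vec{a}$ because $(\vec{a}\cdot\vec{\sigma})^2 = \vec{a}\cdot\vec{a}$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
Cs = 1j * sigma[1]  # C_s = i sigma_2, as in (7.40)

def exp_sigma(a):
    """exp(a . sigma) for a complex 3-vector a, via (a . sigma)^2 = (a . a) 1."""
    a = np.asarray(a, dtype=complex)
    a_dot_sigma = np.einsum("i,ijk->jk", a, sigma)
    s = np.sqrt(a @ a)  # note: a @ a does not conjugate
    if abs(s) < 1e-14:
        return I2 + a_dot_sigma  # nilpotent case: the series stops at first order
    return np.cosh(s) * I2 + (np.sinh(s) / s) * a_dot_sigma

def M_half_zero(xi, phi):
    """Finite transformation (7.36) in the (1/2, 0) representation."""
    xi, phi = np.asarray(xi, dtype=float), np.asarray(phi, dtype=float)
    return exp_sigma(0.5 * xi - 0.5j * phi)

def M_zero_half(xi, phi):
    """Finite transformation (7.37) in the (0, 1/2) representation."""
    xi, phi = np.asarray(xi, dtype=float), np.asarray(phi, dtype=float)
    return exp_sigma(-0.5 * xi - 0.5j * phi)

if __name__ == "__main__":
    xi, phi = [0.3, -0.2, 0.5], [0.1, 0.4, -0.7]
    conjugated = Cs @ np.conj(M_half_zero(xi, phi)) @ np.linalg.inv(Cs)
    print(np.abs(conjugated - M_zero_half(xi, phi)).max())
```

The printed deviation should be at the level of rounding error for any real choice of the two parameter vectors.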
Next, we consider the effects of a parity transformation: namely, an improper
Lorentz transformation (i.e., with determinant equal to negative unity) reversing the
sign of the three spatial coordinates while leaving the time coordinate unchanged.
Evidently, the angular momentum generators J, with two spatial indices (see (7.22))
are even under a parity transformation, while the boost generators K, with a single
spatial index (see (7.26)) are odd. In common parlance, angular momentum J is an
axial vector, the boost vector K a polar vector. A glance at the definitions (7.31) shows
that the parity transformation has the effect of interchanging the A and B labels
of a given irreducible representation (A, B) of the HLG. If the interactions of our
particles exactly conserve parity (as in the strong and electromagnetic interactions),
then the Hamiltonian cannot change under a parity transformation, so that if we
employ fields transforming under a representation (A, B), with A ≠ B (note: such
fields are termed chiral), then the Hamiltonian must also contain, symmetrically,
fields transforming according to the representation (B, A). Indeed, for fermions, the
Spin-Statistics theorem (discussed below) implies j half-integral, whence, by (7.34),
we necessarily have A ≠ B. Parity may be preserved in this case either by employing
conjugate representations as discussed above (leading to Majorana fermions, cf. Section
7.4.1), or by the use of reducible representations of the HLG, such as (A, B) ⊕ (B, A)
(Dirac fermions, Section 7.4.2).
7.3 Local covariant fields for massive particles of any spin: the
Spin-Statistics theorem
We now turn to the task of explicitly constructing local canonical covariant fields,
linear in creation and annihilation operators, for (massive) particles of any spin. This
section will correspondingly be algebraically somewhat more dense than most, but
the end results will more than merit the effort expended: just a few pages will suffice
to establish the general form of covariant fields of any spin, from which with very
little further effort flows the whole panoply of relativistic wave equations (Klein–
Gordon, Dirac, Maxwell–Proca, etc.) typically introduced in a more or less ad hoc
fashion in earlier generations of field theory texts.1 Moreover, a profound consequence
of relativistic field theory—the Spin-Statistics connection—will emerge naturally as
part of the construction.
The most general canonical field linear in creation and annihilation operators can
be written
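$$\phi_n(x) = \sum_\sigma \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E(k)}}\,\big(u_n(k,\sigma)\,a(k,\sigma)\,e^{-ik\cdot x} + v_n(k,\sigma)\,a^{c\dagger}(k,\sigma)\,e^{ik\cdot x}\big) \tag{7.43}$$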
Our task is to determine the coefficient functions un (k, σ), vn (k, σ) in order to satisfy
the requirements of Poincaré covariance and locality, namely (7.5, 7.8, 7.9). First, note
d3 k E(Λk)
†
U (Λ)φn (x)U (Λ) =
(2π)3/2 2E(k) E(k)
σσ
1 The construction of covariant fields for any spin in a unified way employing the representation theory
of the Poincaré group was first carried out in a seminal paper by Weinberg (Weinberg, 1964a).
The un (k, σ), vn (k, σ) should be regarded as connection coefficients between the finite-
dimensional (n index) field representations of the HLG and the infinite-dimensional
unitary Fock-space representation of the single-particle states (labeled by k and
σ). The constraints (7.45) and (7.46) will turn out to uniquely determine these
coefficients and hence the desired covariant field operators for any spin, up to an
obvious normalization and phase freedom. They also imply, as we shall see, that
the covariant field operators necessarily satisfy certain partial differential equations
in coordinate space (commonly called “relativistic wave equations”): in conventional
presentations of field theory these equations arise (somewhat magically) as Euler–
Lagrange equations of relativistically invariant actions. Here the constraints appear
naturally as a consequence of connecting the covariance of the field operator to the
underlying unitary Fock-space structure. We shall now show how these constraints
may be explicitly solved for arbitrary spin j.
Recall the definition of the Wigner rotation: for general Lorentz transformation Λ,
so that a unitary rotation on the field (m) index can be transferred to a unitary spin-j
rotation on the particle-spin index (σ). In an irreducible representation, this constraint
will determine un (0, σ) up to overall normalization.
Similarly, choosing k = 0, Λ = L−1 (k),
so
$$u_n(k,\sigma) = \sum_m M_{nm}(L(k))\,u_m(0,\sigma) \tag{7.50}$$
so the full coefficient function un (k, σ) is determined by a boost once we have used
(7.48) to fix the un s at zero momentum.
For the $v_n$ coefficient functions, using (7.46), and the property of rotation matrices
$$D^j_{\sigma\sigma'}(W(\Lambda,k)) = (-)^{\sigma'-\sigma}\,D^j_{-\sigma',-\sigma}(W^{-1}(\Lambda,k)) \tag{7.51}$$
one finds
$$\sum_\sigma D^j_{\sigma\sigma'}(R)\,(-)^{j+\sigma}\,v_n(0,-\sigma) = \sum_m M_{nm}(R)\,(-)^{j+\sigma'}\,v_m(0,-\sigma') \tag{7.52}$$
which implies $(-)^{j+\sigma}v_n(0,-\sigma) = \xi\,u_n(0,\sigma)$ (ξ a so far arbitrary constant) as both
satisfy (7.48), while the $v_n$s at non-zero momentum are again obtained by a boost:
$$v_n(k,\sigma) = \sum_m M_{nm}(L(k))\,v_m(0,\sigma) \tag{7.53}$$
so that finally
We now turn to the task of explicitly solving the constraints (7.48, 7.50). Here
we shall need the results of the preceding section, in which the finite-dimensional
representations of the HLG (i.e., the structure of the representation matrices Mnm (Λ))
were classified. We shall use the (AB; ab) notation described above to identify specific
representations of the HLG and components within a representation. Thus
$$\phi_n \to \phi^{(AB)}_{ab}, \qquad u_n \to u^{(AB)}_{ab} \tag{7.55}$$
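Then (7.57) just says that the $|\sigma\rangle$ states transform as an irreducible representation of
the rotations induced by the angular momentum operator $\vec{J} = \vec{A} + \vec{B}$, corresponding
to eigenvalue $\vec{J}^2 = j(j+1)$. Proof: under a simultaneous rotation $e^{-i\vec{A}\cdot\vec{\alpha}}\,e^{-i\vec{B}\cdot\vec{\alpha}}$
$$|Aa, Bb\rangle \to \sum_{a'b'} D^A_{a'a}(\vec{\alpha})\,D^B_{b'b}(\vec{\alpha})\,|Aa', Bb'\rangle \tag{7.59}$$
\begin{align}
e^{-i\vec{J}\cdot\vec{\alpha}}|\sigma\rangle &= \sum_{ab} u_{ab}(0,\sigma)\,e^{-i\vec{A}\cdot\vec{\alpha}}\,e^{-i\vec{B}\cdot\vec{\alpha}}\,|Aa, Bb\rangle \tag{7.60} \\
&= \sum_{ab,a'b'} u_{ab}(0,\sigma)\,D^A_{a'a}(\vec{\alpha})\,D^B_{b'b}(\vec{\alpha})\,|Aa', Bb'\rangle \tag{7.61} \\
&= \sum_{\sigma',a'b'} D^j_{\sigma'\sigma}(\vec{\alpha})\,u_{a'b'}(0,\sigma')\,|Aa', Bb'\rangle \tag{7.62} \\
&= \sum_{\sigma'} D^j_{\sigma'\sigma}(\vec{\alpha})\,|\sigma'\rangle \tag{7.63}
\end{align}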
i.e., the $|\sigma\rangle$ span a spin j irreducible representation of the rotation group. The
coefficients which perform the desired coupling (7.58) are just the familiar Clebsch–
Gordon coefficients (unique up to a phase), so we now know the coefficient functions
at zero momentum:
$$u^{AB}_{ab}(0,\sigma) = \langle A\,B\,a\,b\,|\,j\,\sigma\rangle \tag{7.64}$$
$$v^{AB}_{ab}(0,\sigma) = \xi^{AB}\,(-1)^{j-\sigma}\,\langle A\,B\,a\,b\,|\,j\,{-\sigma}\rangle \tag{7.65}$$
From (7.50), once the coefficient functions are known at zero momentum, they can
be boosted to any non-zero momentum. For this, we need the general expression for
the boost $M_{nm}(L(k))$ in the (AB) representation. An infinitesimal boost in the ith
direction is realized by the $J_{i0} = K_i$ generator. For example, if k is in the z-direction,
and θ is the rapidity angle of the boost,
$$L^\rho{}_\sigma(k) \equiv B^\rho{}_\sigma(\theta) = \begin{pmatrix} \cosh\theta & 0 & 0 & \sinh\theta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \sinh\theta & 0 & 0 & \cosh\theta \end{pmatrix}$$
The derivative of this matrix at θ = 0 has unit entries in the (0,3) and (3,0) positions and zeros elsewhere,
which is equal to (in the four-vector representation) $i(J_{30})^\rho{}_\sigma = iK_3$. (In the four-vector
representation, the explicit formula for the generators is $(J_{\mu\nu})^\rho{}_\sigma = i(g_\nu{}^\rho\, g_{\mu\sigma} - g_\mu{}^\rho\, g_{\nu\sigma})$;
some boring algebra establishes that these matrices satisfy the Lie algebra (7.21)). It
follows that z-boosts in a general (AB) representation are realized by
with (using (7.54)) an obvious corresponding equation for vab (k, σ).
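The “boring algebra” mentioned above for the four-vector generators can be delegated to a few lines of code. The sketch below assumes the explicit form $(J^{\mu\nu})^\alpha{}_\beta = i(g^{\nu\alpha}\delta^\mu_\beta - g^{\mu\alpha}\delta^\nu_\beta)$ (the quoted formula with μ, ν raised) and metric signature (+,−,−,−); the function names are ours. It checks all 256 index combinations of (7.21) and confirms that $iJ_{30}$ reproduces the generator of the z-boost matrix.

```python
import itertools
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed metric signature (+,-,-,-)

def J_upper(mu, nu):
    """Four-vector generator (J^{mu nu})^alpha_beta (assumed explicit form)."""
    M = np.zeros((4, 4), dtype=complex)
    for a in range(4):
        for b in range(4):
            M[a, b] = 1j * (g[nu, a] * (mu == b) - g[mu, a] * (nu == b))
    return M

def J_lower(mu, nu):
    """J_{mu nu}: for a diagonal metric, lowering just multiplies by g_mumu g_nunu."""
    return g[mu, mu] * g[nu, nu] * J_upper(mu, nu)

def lie_algebra_residual():
    """Largest violation of (7.21) over all index combinations."""
    worst = 0.0
    for mu, nu, rho, sig in itertools.product(range(4), repeat=4):
        lhs = J_upper(mu, nu) @ J_upper(rho, sig) - J_upper(rho, sig) @ J_upper(mu, nu)
        rhs = 1j * (g[mu, sig] * J_upper(rho, nu) + g[nu, sig] * J_upper(mu, rho)
                    - g[rho, mu] * J_upper(sig, nu) - g[rho, nu] * J_upper(mu, sig))
        worst = max(worst, np.abs(lhs - rhs).max())
    return worst

if __name__ == "__main__":
    print("residual of (7.21):", lie_algebra_residual())
    print("i J_30 =\n", (1j * J_lower(3, 0)).real)
```

The residual should vanish identically, and the printed matrix is the derivative of the z-boost matrix with respect to the rapidity at θ = 0.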
Now that we know how to build covariant fields, transforming according to definite
representations of the HLG, it remains to be seen whether we can also arrange for
microcausality. For interaction Hamiltonians built purely from scalar fields, we saw
that Lorentz-invariance of the S-matrix hinged on the local commutativity property:
For fermions, [φ(x), φ† (y)]− is not even a c-number, but something quadratic in
creation and annihilation operators, so it certainly cannot vanish identically in the
space-like region. However [φ(x), φ† (y)]+ ≡ {φ(x), φ† (y)} is a c-number, so it is at least
possible for the anticommutator of fermionic fields to vanish in the space-like region.
Therefore, recalling that the commutator of products of an even number of fields
can always be rewritten as a sum of terms involving only anticommutators (for exam-
ple, [AB, CD]− = A{B, C}D − AC{B, D} + {A, C}DB − C{A, D}B), it follows that
local commutativity of Hint can be assured simply by building it out of an even number
of fermionic fields, and insisting on space-like anticommutativity for the elementary
fermionic fields.
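The operator identity quoted in the parenthesis is purely algebraic, so it can be tested with arbitrary matrices standing in for the four operators. A minimal sketch (function names ours):

```python
import numpy as np

def comm(X, Y):
    """Commutator [X, Y]_-."""
    return X @ Y - Y @ X

def anti(X, Y):
    """Anticommutator {X, Y}."""
    return X @ Y + Y @ X

def commutator_via_anticommutators(A, B, C, D):
    """Right-hand side of [AB, CD] = A{B,C}D - AC{B,D} + {A,C}DB - C{A,D}B."""
    return (A @ anti(B, C) @ D - A @ C @ anti(B, D)
            + anti(A, C) @ D @ B - C @ anti(A, D) @ B)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B, C, D = (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)) for _ in range(4))
    print(np.abs(comm(A @ B, C @ D) - commutator_via_anticommutators(A, B, C, D)).max())
```

Since every term on the right contains exactly one anticommutator of elementary factors, the whole expression vanishes whenever those anticommutators do, which is the point of the argument in the text.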
The general structure of a commutator (or anticommutator—as usual, upper signs
refer to bosons, lower to fermions) of two covariant fields is
A2 B2 † d3 k 1 d 3 k 2 1 B1 −ik1 ·x
[φA 1 B1
a1 b1 (x), φa2 b2 (y)]∓ = [uA
a1 b1 (k1 , σ1 )e a(k1 , σ1 )
3
(2π) 2E(k1 )2E(k2 ) σ1 σ2
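Define
$$\begin{aligned}
N^{A_1B_1A_2B_2}_{a_1b_1a_2b_2}(k) &\equiv \sum_\sigma u^{A_1B_1}_{a_1b_1}(k,\sigma)\,u^{A_2B_2}_{a_2b_2}(k,\sigma)^* \\
&= \sum_{a_1'b_1'a_2'b_2'} (e^{-\theta\hat{k}\cdot\vec{A}})_{a_1a_1'}\,(e^{\theta\hat{k}\cdot\vec{B}})_{b_1b_1'}\,(e^{-\theta\hat{k}\cdot\vec{A}})^*_{a_2a_2'}\,(e^{\theta\hat{k}\cdot\vec{B}})^*_{b_2b_2'} \\
&\qquad \cdot \sum_\sigma \langle A_1B_1a_1'b_1'|j\sigma\rangle\,\langle A_2B_2a_2'b_2'|j\sigma\rangle
\end{aligned} \tag{7.73}$$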
N (k) has a definite parity under k → −k . This is easiest to see by choosing k̂ along
μ μ
while
NaA11bB 1 A2 B2
1 a 2 b2
(−k) = eθ (a1 +a2 −b1 −b2 ) A1 B1 a1 b1 |jσ A2 B2 a2 b2 |jσ (7.75)
σ
where eθ = −k m+|k| = −e−θ , and the interchange of as and bs results from the change
NaA11bB 1 A2 B2
1 a 2 b2
(−k) = (−1)(a1 +a2 −b1 −b2 ) NaA11bB 1 A2 B2
1 a2 b2
(k)
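$$\begin{aligned}
\sum_\sigma v^{A_1B_1}_{a_1b_1}(k,\sigma)\,v^{A_2B_2}_{a_2b_2}(k,\sigma)^* &= \xi^{A_1B_1}\,\xi^{A_2B_2*} \sum_\sigma u^{A_1B_1}_{a_1b_1}(k,-\sigma)\,u^{A_2B_2}_{a_2b_2}(k,-\sigma)^* \\
&= \xi^{A_1B_1}\,\xi^{A_2B_2*}\,N^{A_1B_1A_2B_2}_{a_1b_1a_2b_2}(k)
\end{aligned} \tag{7.77}$$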
We can now use the parity property (7.76) to write this result as
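$$\begin{aligned}
[\phi^{A_1B_1}_{a_1b_1}(x), \phi^{A_2B_2\dagger}_{a_2b_2}(y)]_\mp &= \int \frac{d^3k}{(2\pi)^3\,2E(k)}\,\Big(N^{A_1B_1A_2B_2}_{a_1b_1a_2b_2}(k)\,e^{-ik\cdot(x-y)} \\
&\qquad \mp (-1)^{2B_1+2B_2+2j}\,N^{A_1B_1A_2B_2}_{a_1b_1a_2b_2}(-k)\,\xi^{A_1B_1}\,\xi^{A_2B_2*}\,e^{+ik\cdot(x-y)}\Big) \\
&= N^{A_1B_1A_2B_2}_{a_1b_1a_2b_2}\Big(i\frac{\partial}{\partial x}\Big)\,F(x-y)
\end{aligned} \tag{7.79}$$
where
$$F(x-y) \equiv \int \frac{d^3k}{(2\pi)^3\,2E(k)}\,\Big(e^{-ik\cdot(x-y)} \mp (-1)^{2B_1}\,\xi^{A_1B_1}\,(-1)^{2B_2}\,\xi^{A_2B_2*}\,(-1)^{2j}\,e^{+ik\cdot(x-y)}\Big) \tag{7.80}$$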
If we compare this result with (6.76) we see immediately that space-like commutativity
(resp. anticommutativity) is assured if and only if
Take first $(A_1, B_1) = (A_2, B_2)$ and recall that $\xi^{AB}$ is a pure phase. Then we conclude
$$(-1)^{2j} = \pm 1 \tag{7.82}$$
which is solved by setting all $\xi^{AB} = (-1)^{2B}\xi$, with $\xi = e^{i\theta}$ a universal phase factor. In
fact, an overall universal phase in all destruction parts is unobservable, so with no loss
of generality we can set $\xi = 1$.2 Referring back to (7.54), we thus obtain a completely
general expression for a local covariant field of any spin and Lorentz representation:
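$$\phi^{AB}_{ab}(x) = \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E(k)}} \sum_\sigma \Big(u^{AB}_{ab}(k,\sigma)\,e^{-ik\cdot x}\,a(k,\sigma) + (-)^{2B}(-)^{j-\sigma}\,u^{AB}_{ab}(k,-\sigma)\,e^{ik\cdot x}\,a^{c\dagger}(k,\sigma)\Big) \tag{7.84}$$
with $u^{AB}_{ab}(k,\sigma)$ given by (7.69).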
For a spinless particle, j = 0, the simplest covariant field satisfying (7.34) is clearly
obtained by taking A = B = 0, whence the u coefficient function in (7.84) becomes
unity (by (7.64) and (7.69)), and we recover the canonical scalar field (6.73) of Chapter
6, satisfying (automatically) the Klein–Gordon equation $(\Box + m^2)\phi(x) = 0$. The only
other cases of real importance in the Standard Model of elementary particle physics
2 In certain cases, e.g., the Majorana field discussed below, the choice ξ = −1 will be more convenient.
are for $j = \frac{1}{2}, 1$. These two important special cases are therefore given separate and
detailed attention in the two following sections.
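$$u(0, \tfrac{1}{2}) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad u(0, -\tfrac{1}{2}) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
With choice of phase $\xi^{\frac{1}{2}0} = -1$, the zero momentum v-spinor given by (7.65) is easily
seen to be
$$u(k,\sigma) = e^{-\theta\hat{k}\cdot\vec{\sigma}/2}\,u(0,\sigma), \qquad v(k,\sigma) = e^{-\theta\hat{k}\cdot\vec{\sigma}/2}\,C_s\,u(0,\sigma) \tag{7.86}$$
where the rapidity of the boost is given by $\cosh\theta = \frac{E(k)}{m}$. Up to normalization, the
unique covariant field of type ($\frac{1}{2}$,0) is given by (7.84). It is conventional for spinor
fields of non-zero mass to include an additional $\sqrt{m}$ in the definition of the field. We
thus obtain
$$\chi(x) = \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{m}{2E(k)}} \sum_\sigma \Big(e^{-\theta\hat{k}\cdot\vec{\sigma}/2}\,u(0,\sigma)\,b(k,\sigma)\,e^{-ik\cdot x} + e^{-\theta\hat{k}\cdot\vec{\sigma}/2}\,C_s\,u(0,\sigma)\,b^\dagger(k,\sigma)\,e^{ik\cdot x}\Big) \tag{7.87}$$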
Note that the antiparticle creation operator $b^\dagger$ is just the hermitian adjoint of the
particle annihilation operator: our field is self-conjugate. The spinor index (on χ and
u(0, σ)) has been suppressed: the reader is invited to visualize the left- and right-
hand sides of (7.87) as a column two-vector of field operators. Next we introduce
the augmented Pauli matrices $\sigma_\mu$, μ = 0, 1, 2, 3 where $\sigma_0 \equiv 1$ and the $\sigma_i$, i = 1, 2, 3 are
the usual Pauli matrices. Thus, $\sigma^\mu\partial_\mu = \partial_0 - \vec{\sigma}\cdot\vec{\nabla}$, and a straightforward calculation,
using the conjugation properties (7.41), leads to the Majorana equation
Here, and throughout this section, we use the asterisk of complex conjugation to
indicate both normal complex conjugation (of numbers) and hermitian conjugation
of operators, while the † symbol is reserved for the combination of conjugation (both
types) and transposition of 2-spinors or 2x2 matrices.
The free Hamiltonian H0 for our particle is given by the usual expression in terms
of creation and annihilation operators:
$$
H_0 = \sum_\sigma \int d^3k\, E(k)\, b^\dagger(\vec{k},\sigma)\, b(\vec{k},\sigma) \tag{7.89}
$$
It is not difficult to show that this operator can be written as a spatial integral of a
free energy density
$$
H_0 = \int d^3x\, : \chi^\dagger i\partial_0\chi + \frac{m}{2}\left(\chi^\dagger C_s\chi^* + \chi^T C_s\chi\right) : \tag{7.90}
$$
Note that the normal-ordering symbol :....: is defined for fermions by moving all
creation operators to the left of all annihilation operators (as for bosons), but with an
extra minus sign for each transposition. Use of the Majorana equation (7.88) allows
us to rewrite this in the more usual form (see Problem 1):
$$
H_0 = \int d^3x\, : \chi^\dagger i\vec{\sigma}\cdot\vec{\nabla}\chi + \frac{m}{2}\left(\chi^T C_s\chi - \chi^\dagger C_s\chi^*\right) : \tag{7.91}
$$
Note that both the kinetic (spatial derivative) term and the mass term in this
expression are hermitian operators (in particular, χ† Cs χ∗ = −(χT Cs χ)† ).
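The step from (7.90) to (7.91) can be sketched in two lines. This is only an outline of the substitution (Problem 1 asks for the full argument); it assumes the Majorana equation in the explicit form quoted later as (7.124), and it does not track operator-ordering subtleties inside the normal-ordering symbol.

```latex
\begin{align*}
i\partial_0\chi &= i\,\vec{\sigma}\cdot\vec{\nabla}\chi - m\,C_s\chi^{*}
  && \text{(Majorana equation, solved for } i\partial_0\chi\text{)} \\
\chi^{\dagger} i\partial_0\chi
  + \tfrac{m}{2}\left(\chi^{\dagger}C_s\chi^{*} + \chi^{T}C_s\chi\right)
 &= \chi^{\dagger} i\,\vec{\sigma}\cdot\vec{\nabla}\chi
  - m\,\chi^{\dagger}C_s\chi^{*}
  + \tfrac{m}{2}\left(\chi^{\dagger}C_s\chi^{*} + \chi^{T}C_s\chi\right) \\
 &= \chi^{\dagger} i\,\vec{\sigma}\cdot\vec{\nabla}\chi
  + \tfrac{m}{2}\left(\chi^{T}C_s\chi - \chi^{\dagger}C_s\chi^{*}\right)
\end{align*}
```

The time derivative is traded for a spatial derivative at the cost of flipping the sign of the $\chi^\dagger C_s\chi^*$ piece of the mass term.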
The mass term in (7.91) is actually the integral of a Lorentz scalar field. Recall
that a general covariant field transforms under the HLG as
Accordingly
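$$
U(\Lambda)\,\chi_\alpha(x)(C_s)_{\alpha\beta}\chi_\beta(x)\,U^\dagger(\Lambda)
= \chi_\alpha(\Lambda x)\left(M^T(\Lambda^{-1})\,C_s\,M(\Lambda^{-1})\right)_{\alpha\beta}\chi_\beta(\Lambda x)
= \chi^T(\Lambda x)\,C_s\,\chi(\Lambda x) \tag{7.94}
$$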
so the bilinear S(x) = χT (x)Cs χ(x) is indeed a Lorentz scalar field. Such a bilinear
can therefore be used to construct Lorentz scalar interaction densities of Yukawa form,
coupling the Majorana fermion to a self-conjugate spinless scalar φ, for example:
Hint (x) = λYuk φ(x)(χT (x)Cs χ(x) + h.c.) = λYuk φ(χT Cs χ − χ† Cs χ∗ ) (7.95)
What if our $(\tfrac{1}{2},0)$ field is not self-conjugate? Then it is easy to see that the
mass term in the Hamiltonian must be constructed as a bilinear in χ and χ∗ (to
reproduce the required number operators $b^\dagger b$, $b^{c\dagger} b^c$ in $H_0$), but then the resultant
field is not a Lorentz scalar (specifically, the scalar property fails under boosts: see
Problem 2). This still leaves the option of a massless spin-½ non-self-conjugate fermion
transforming according to $(\tfrac{1}{2},0)$. Such fermions are called left-handed Weyl fermions:⁴
we return to them in subsection 7.4.5, when we address the massless case and introduce
the two-component Weyl field (Weyl, 1929). On the other hand, massive non-self-conjugate
spin-½ particles are easily treated by using a reducible representation
of the HLG containing both $(\tfrac{1}{2},0)$ and $(0,\tfrac{1}{2})$ components, as first introduced by
Dirac (Dirac, 1928) in the late 1920s. We now turn to a study of such reducible
fields.
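we may conveniently display the corresponding field $\psi^{AB}_{ab}$ as a column 4-spinor
(generally called a “Dirac 4-spinor”, or “bispinor”), as follows:
$$
\psi = \begin{pmatrix}
\psi^{\frac{1}{2}\,0}_{\frac{1}{2}\,0} \\[2pt]
\psi^{\frac{1}{2}\,0}_{-\frac{1}{2}\,0} \\[2pt]
\psi^{0\,\frac{1}{2}}_{0\,\frac{1}{2}} \\[2pt]
\psi^{0\,\frac{1}{2}}_{0\,-\frac{1}{2}}
\end{pmatrix}
$$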
where the “A” (resp. “B”) generators of HLG act only on the top (resp. bottom) two
components of the 4-spinor:
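$$
\vec{A} = \begin{pmatrix} \tfrac{1}{2}\vec{\sigma} & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\vec{B} = \begin{pmatrix} 0 & 0 \\ 0 & \tfrac{1}{2}\vec{\sigma} \end{pmatrix}
$$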
which then gives (recall (7.31)) the following 4x4 matrices for the generators of
rotations and boosts:
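$$
\vec{J} = \begin{pmatrix} \tfrac{1}{2}\vec{\sigma} & 0 \\ 0 & \tfrac{1}{2}\vec{\sigma} \end{pmatrix} \tag{7.96}
$$
$$
\vec{K} = \begin{pmatrix} \tfrac{i}{2}\vec{\sigma} & 0 \\ 0 & -\tfrac{i}{2}\vec{\sigma} \end{pmatrix} \tag{7.97}
$$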
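As a quick consistency check, the matrices (7.96) and (7.97) can be tested numerically against the commutation relations of the Lorentz algebra. The sketch below assumes the standard form of those relations, $[J_i,J_j] = i\epsilon_{ijk}J_k$, $[J_i,K_j] = i\epsilon_{ijk}K_k$, $[K_i,K_j] = -i\epsilon_{ijk}J_k$ (the chapter's own statement of the algebra is not reproduced in this section); all function names are illustrative.

```python
# Numerical check that J (7.96) and K (7.97) satisfy the Lorentz algebra.
# Assumption: the algebra is taken in its standard form (see lead-in).
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def block_diag(a, b):
    """Return the 4x4 matrix diag(a, b) built from two 2x2 blocks."""
    return [a[0] + [0, 0], a[1] + [0, 0], [0, 0] + b[0], [0, 0] + b[1]]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def eps(i, j, k):
    """Levi-Civita symbol for indices taking values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) / 2

def eps_sum(i, j, mats):
    """Return sum_k eps(i, j, k) * mats[k]."""
    return [[sum(eps(i, j, k) * mats[k][r][c] for k in range(3))
             for c in range(4)] for r in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

J = [block_diag(scale(0.5, s), scale(0.5, s)) for s in PAULI]      # (7.96)
K = [block_diag(scale(0.5j, s), scale(-0.5j, s)) for s in PAULI]   # (7.97)

def lorentz_algebra_holds():
    for i in range(3):
        for j in range(3):
            if not close(commutator(J[i], J[j]), scale(1j, eps_sum(i, j, J))):
                return False
            if not close(commutator(J[i], K[j]), scale(1j, eps_sum(i, j, K))):
                return False
            if not close(commutator(K[i], K[j]), scale(-1j, eps_sum(i, j, J))):
                return False
    return True
```

Calling `lorentz_algebra_holds()` exercises all nine index pairs for each of the three relations; the relative sign between the upper and lower blocks of K is what makes the last relation come out with a minus sign.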
Our next task is to construct the connection coefficients u(k, σ), v(k, σ) which
determine the canonical field operator for this representation. As usual, we start at
zero momentum: recalling the trivial Clebsch–Gordan values $\langle \tfrac{1}{2}\, 0\, \tfrac{1}{2}\, 0 \,|\, \tfrac{1}{2}\, \tfrac{1}{2}\rangle = 1$ etc., one
finds
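$$
u(0, \tfrac{1}{2}) = \begin{pmatrix} \tfrac{1}{\sqrt{2}} \\ 0 \\ \tfrac{1}{\sqrt{2}} \\ 0 \end{pmatrix}, \qquad
u(0, -\tfrac{1}{2}) = \begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ 0 \\ \tfrac{1}{\sqrt{2}} \end{pmatrix}
$$
where the additional $\sqrt{2}$ factors are conventional to normalize the 4-spinors. Recall
(cf. (7.83)) that in general we have $v_{ab}(0,\sigma) = \xi(-1)^{2B}(-1)^{j-\sigma}u_{ab}(0,-\sigma)$. For Dirac
spinors it is conventional to choose the arbitrary phase $\xi = -1$, giving
$$
v(0, \tfrac{1}{2}) = \begin{pmatrix} 0 \\ -\tfrac{1}{\sqrt{2}} \\ 0 \\ \tfrac{1}{\sqrt{2}} \end{pmatrix}, \qquad
v(0, -\tfrac{1}{2}) = \begin{pmatrix} \tfrac{1}{\sqrt{2}} \\ 0 \\ -\tfrac{1}{\sqrt{2}} \\ 0 \end{pmatrix}
$$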
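The zero-momentum v-spinors can be reproduced mechanically from the u-spinors with the phase rule quoted above. The following is a minimal sketch of that bookkeeping, taking the upper two components to have B = 0 and the lower two B = 1/2, as in the 4-spinor layout of this section; the dictionary and function names are illustrative only.

```python
from math import sqrt

R = 1 / sqrt(2)

# Zero-momentum u-spinors as displayed in the text.
U0 = {+0.5: [R, 0, R, 0], -0.5: [0, R, 0, R]}

# Zero-momentum v-spinors as displayed in the text (for comparison).
V0_DISPLAYED = {+0.5: [0, -R, 0, R], -0.5: [R, 0, -R, 0]}

def v0(sigma, xi=-1, j=0.5):
    """Apply v_ab(0, sigma) = xi (-1)^{2B} (-1)^{j - sigma} u_ab(0, -sigma).

    Upper two components: B = 0, so (-1)^{2B} = +1.
    Lower two components: B = 1/2, so (-1)^{2B} = -1.
    """
    sign = (-1) ** round(j - sigma)
    src = U0[-sigma]
    upper = [xi * (+1) * sign * x for x in src[:2]]
    lower = [xi * (-1) * sign * x for x in src[2:]]
    return upper + lower

def matches_displayed(sigma, tol=1e-15):
    return all(abs(a - b) < tol for a, b in zip(v0(sigma), V0_DISPLAYED[sigma]))
```

With the conventional choice ξ = −1, `matches_displayed(0.5)` and `matches_displayed(-0.5)` both hold; flipping `xi` to +1 reverses the overall sign of both v-spinors.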
The spinor coefficient functions at non-zero momentum are obtained from these by a
boost:
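$$
u(\vec{k},\sigma) = \begin{pmatrix} e^{-\theta\hat{k}\cdot\vec{\sigma}/2} & 0 \\ 0 & e^{\theta\hat{k}\cdot\vec{\sigma}/2} \end{pmatrix} u(0,\sigma) \tag{7.98}
$$
and similarly for $v(\vec{k},\sigma)$. Introducing the matrix
$$
\beta \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
$$
one easily verifies that $\beta u(0,\sigma) = u(0,\sigma)$, whence $B(k)u(k,\sigma) = u(k,\sigma)$, where
$$
B(k) \equiv \begin{pmatrix} 0 & e^{-\theta\hat{k}\cdot\vec{\sigma}} \\ e^{\theta\hat{k}\cdot\vec{\sigma}} & 0 \end{pmatrix}, \qquad
e^{\theta\hat{k}\cdot\vec{\sigma}} = \frac{E(k)}{m} + \frac{\vec{k}\cdot\vec{\sigma}}{m} \tag{7.99}
$$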
We may now establish contact with the conventional Dirac formalism by introducing
the Dirac matrices:
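$$
\gamma_0 = \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\gamma_i = \begin{pmatrix} 0 & -\sigma_i \\ \sigma_i & 0 \end{pmatrix} \tag{7.100}
$$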
If we now recall the general expression for a covariant field, we find that the Dirac
field ψ(x), given by
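$$
\psi(x) = \sum_\sigma \int \frac{d^3k}{(2\pi)^{3/2}} \sqrt{\frac{m}{E(k)}}\,
\left( u(\vec{k},\sigma)\,e^{-ik\cdot x}\,b(\vec{k},\sigma) + v(\vec{k},\sigma)\,e^{ik\cdot x}\,d^\dagger(\vec{k},\sigma) \right) \tag{7.103}
$$
satisfies the Dirac equation
$$
\left(i\gamma^\mu\frac{\partial}{\partial x^\mu} - m\right)\psi(x) = 0 \tag{7.104}
$$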
7.4.3 Diracology
The Dirac matrices introduced in (7.100) are readily seen to anticommute with each
other; together with the fact that $\gamma_0^2 = 1$, $\gamma_i^2 = -1$, we conclude that
$$
\{\gamma_\mu, \gamma_\nu\} = 2g_{\mu\nu} \tag{7.105}
$$
where the 4x4 identity matrix is understood on the right-hand side of (7.105),
multiplying the metric element $2g_{\mu\nu}$. The antiparticle coefficient functions v(k, σ) are
conventionally defined as
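$$
v(\vec{k},\sigma) = \gamma_5\,(-1)^{\frac{1}{2}+\sigma}\,u(\vec{k},-\sigma) \tag{7.106}
$$
$$
\gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{7.107}
$$
$$
\{\gamma_5, \gamma^\mu\} = 0 \tag{7.108}
$$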
(NB: spacetime indices are raised and lowered on the γ-matrices in the usual way: the
spatial ones get a minus sign).
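The algebraic statements of this subsection are easy to confirm numerically. The sketch below builds the lower-index matrices of (7.100), raises the spatial index with a minus sign as in the note above, and checks the anticommutation relations together with the properties of $\gamma_5$; the product formula $\gamma_5 = -i\gamma^0\gamma^1\gamma^2\gamma^3$ is the one quoted later in this section. Helper names are illustrative.

```python
# Check the Clifford algebra and gamma_5 properties in the representation (7.100).
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

METRIC = [1, -1, -1, -1]
GAMMA_LO = [blocks(Z2, I2, I2, Z2)] + [blocks(Z2, neg(s), s, Z2) for s in PAULI]
GAMMA_UP = [[[METRIC[mu] * x for x in row] for row in GAMMA_LO[mu]]
            for mu in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

def clifford_holds():
    """{gamma_mu, gamma_nu} = 2 g_{mu nu} times the identity."""
    for mu in range(4):
        for nu in range(4):
            g = METRIC[mu] if mu == nu else 0
            target = [[2 * g * (i == j) for j in range(4)] for i in range(4)]
            if not close(anticommutator(GAMMA_LO[mu], GAMMA_LO[nu]), target):
                return False
    return True

def gamma5():
    """gamma_5 = -i gamma^0 gamma^1 gamma^2 gamma^3."""
    prod = matmul(matmul(matmul(GAMMA_UP[0], GAMMA_UP[1]), GAMMA_UP[2]), GAMMA_UP[3])
    return [[-1j * x for x in row] for row in prod]

def gamma5_anticommutes():
    g5 = gamma5()
    return all(close(anticommutator(g5, g), ZERO4) for g in GAMMA_UP)
```

In this representation `gamma5()` evaluates to the block matrix diag(1, −1) of (7.107), which is why the upper and lower two components of a 4-spinor carry opposite chirality.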
In almost any calculation involving Dirac fields, we encounter spin sums analogous
to (7.73):
$$
N(k)_{nm} \equiv \sum_\sigma u_n(\vec{k},\sigma)\,u^*_m(\vec{k},\sigma) \tag{7.109}
$$
At zero momentum, from the explicit result for the zero-momentum spinors derived
above we have
$$
N(0)_{nm} \equiv \sum_\sigma u_n(0,\sigma)\,u^*_m(0,\sigma) = \frac{1}{2}(1 + \gamma_0)_{nm} \tag{7.110}
$$
Applying the boost matrix in (7.98) on the left and the right of N (0), we find
$$
N(k) = \frac{1}{2}B(k)\gamma_0 + \frac{1}{2}\gamma_0 = \frac{1}{2}\left(\frac{\not{k}}{m} + 1\right)\gamma_0 \tag{7.111}
$$
Another very useful concept in Dirac theory is the “Dirac adjoint” of a 4-spinor.
Namely:
$$
\bar{\psi}_n \equiv \psi^\dagger_m\,\gamma^0_{mn} \tag{7.112}
$$
etc. The spin sum (7.109) usually is needed in the equivalent form
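$$
\sum_\sigma u_n(\vec{k},\sigma)\,\bar{u}_m(\vec{k},\sigma) = \frac{1}{2}\left(\frac{\not{k}}{m} + 1\right)_{nm} \tag{7.113}
$$
which follows from (7.111) as $\gamma_0^2 = 1$. For the antiparticle spinors, the result is similar,
with an obvious change of sign:
$$
\sum_\sigma v_n(\vec{k},\sigma)\,\bar{v}_m(\vec{k},\sigma) = \frac{1}{2}\left(\frac{\not{k}}{m} - 1\right)_{nm} \tag{7.114}
$$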
where the second line follows from the boost equation (7.98):
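$$
u(-\vec{k},\sigma') = \begin{pmatrix} e^{\theta\hat{k}\cdot\vec{\sigma}/2} & 0 \\ 0 & e^{-\theta\hat{k}\cdot\vec{\sigma}/2} \end{pmatrix} u(0,\sigma') \tag{7.117}
$$
$$
v(\vec{k},\sigma) = \begin{pmatrix} e^{-\theta\hat{k}\cdot\vec{\sigma}/2} & 0 \\ 0 & e^{\theta\hat{k}\cdot\vec{\sigma}/2} \end{pmatrix} v(0,\sigma) \tag{7.118}
$$
$$
\bar{u}(\vec{k},\sigma)\,\gamma_0\,u(\vec{k},\sigma') = \frac{E(k)}{m}\,\delta_{\sigma\sigma'} = \bar{v}(\vec{k},\sigma)\,\gamma_0\,v(\vec{k},\sigma') \tag{7.119}
$$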
Local covariant fields for spin-½ (spinor fields) 191
With the help of (7.115–7.119) one may easily establish the field-theoretic formula for
the free Dirac Hamiltonian:
$$
H_0 = \sum_\sigma \int d^3k\, E(k) \left( b^\dagger(\vec{k},\sigma)\,b(\vec{k},\sigma) + d^\dagger(\vec{k},\sigma)\,d(\vec{k},\sigma) \right) \tag{7.120}
$$
Similarly, one finds the following expression for the charge operator (for a Dirac particle
of charge e; see Problem 4):
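$$
Q = \sum_\sigma \int d^3k \left( e\,b^\dagger(\vec{k},\sigma)\,b(\vec{k},\sigma) - e\,d^\dagger(\vec{k},\sigma)\,d(\vec{k},\sigma) \right) \tag{7.122}
$$
$$
\phantom{Q} = \int d^3x\; e : \bar{\psi}(x)\gamma^0\psi(x) : \tag{7.123}
$$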
$$
\psi_M = \begin{pmatrix} \chi \\ -C_s\chi^* \end{pmatrix}
$$
with χ(x) the two-component field of Section 7.4.1, satisfying the Majorana equation
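(7.88) and its conjugate:⁶
$$
i(\partial_0 - \vec{\sigma}\cdot\vec{\nabla})\chi = -m\,C_s\,\chi^* \tag{7.124}
$$
$$
-i(\partial_0 - \vec{\sigma}^*\cdot\vec{\nabla})\chi^* = -m\,C_s\,\chi \tag{7.125}
$$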
whence one easily verifies that the four-component field ψM satisfies the Dirac equation
(7.104):
$$
\left(i\gamma^\mu\frac{\partial}{\partial x^\mu} - m\right)\psi_M(x) = 0 \tag{7.126}
$$
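The verification of (7.126) can be outlined component by component. The sketch below rests on assumptions that are not spelled out in this section: that $C_s$ is real with $C_s^2 = -1$ and $\vec{\sigma}C_s = -C_s\vec{\sigma}^*$ (all true for the common choice $C_s = i\sigma_2$), and that the upper-index spatial matrices are $\gamma^i = -\gamma_i$ with $\gamma_i$ as in (7.100).

```latex
\begin{align*}
i\gamma^\mu\partial_\mu
 &= i\begin{pmatrix} 0 & \partial_0 + \vec{\sigma}\cdot\vec{\nabla} \\
                     \partial_0 - \vec{\sigma}\cdot\vec{\nabla} & 0 \end{pmatrix},
 \qquad
 \psi_M = \begin{pmatrix} \chi \\ -C_s\chi^{*} \end{pmatrix} \\[4pt]
\text{lower:}\quad
 & i(\partial_0 - \vec{\sigma}\cdot\vec{\nabla})\chi - m(-C_s\chi^{*})
   = -m\,C_s\chi^{*} + m\,C_s\chi^{*} = 0
   && \text{by (7.124)} \\[4pt]
\text{upper:}\quad
 & -i(\partial_0 + \vec{\sigma}\cdot\vec{\nabla})C_s\chi^{*} - m\chi
   = C_s\left[-i(\partial_0 - \vec{\sigma}^{*}\cdot\vec{\nabla})\chi^{*}\right] - m\chi \\
 &\qquad = C_s(-m\,C_s\chi) - m\chi = m\chi - m\chi = 0
   && \text{by (7.125) and } C_s^2 = -1
\end{align*}
```

The lower component is the Majorana equation itself, while the upper component is its conjugate, moved through $C_s$.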
5 Alternatively, one can take the $(0,\tfrac{1}{2})$ field as the starting point and write
$\psi_M = \begin{pmatrix} C_s\chi^* \\ \chi \end{pmatrix}$.
6 We temporarily return to the convention of Section 7.4.1, where an asterisk is used to represent both
complex conjugation and hermitian conjugation.
Likewise, the free Hamiltonian (7.91) for the Majorana field takes exactly the same
form as the Dirac Hamiltonian (7.121), with an extra factor of $\tfrac{1}{2}$ to compensate for
the doubling of the Majorana field in the four-component notation:
$$
H_0 = \frac{1}{2}\int d^3x\, : \bar{\psi}_M(x)\,(i\vec{\gamma}\cdot\vec{\nabla} + m)\,\psi_M(x) : \tag{7.127}
$$
Indeed, one can think of the Dirac field as a 4-spinor composed of two independent
Majorana fields
$$
\psi_M = \begin{pmatrix} \chi_1 \\ -C_s\chi_2^* \end{pmatrix}
$$
$$
\vec{J}^\dagger = \vec{J} = \gamma_0\,\vec{J}\,\gamma_0 \tag{7.128}
$$
$$
\vec{K}^\dagger = -\vec{K} = \gamma_0\,\vec{K}\,\gamma_0 \tag{7.129}
$$
Next, note that the matrix element $N(k)_{nm} = \sum_\sigma u_n(k,\sigma)\,u^*_m(k,\sigma)$ of the spin-sum
matrix introduced above is a spinor dot-product of $u_n$ and $u_m$, and therefore invariant
under simultaneous rotations with the rotation matrix $D^j_{\sigma\sigma'}$. From (7.45), one
finds
$$
\sum_{\sigma,\,m_1 m_2} M_{n_1 m_1}(\Lambda^{-1})\,M^*_{n_2 m_2}(\Lambda^{-1})\,u_{m_1}(\Lambda k,\sigma)\,u^*_{m_2}(\Lambda k,\sigma)
$$
from which
As expected, the four γ matrices transform as a four-vector under the HLG. The
matrix γ5 introduced in (7.107) transforms as a pseudoscalar:
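$$
\gamma_5 \equiv -i\gamma^0\gamma^1\gamma^2\gamma^3 = -\frac{i}{4!}\,\epsilon_{\mu\nu\rho\sigma}\,\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma
$$
$$
\begin{aligned}
M(\Lambda)\,\gamma_5\,M^{-1}(\Lambda)
 &= -\frac{i}{4!}\,\epsilon_{\mu\nu\rho\sigma}\,\Lambda^{\mu}{}_{\mu'}\Lambda^{\nu}{}_{\nu'}\Lambda^{\rho}{}_{\rho'}\Lambda^{\sigma}{}_{\sigma'}\,\gamma^{\mu'}\gamma^{\nu'}\gamma^{\rho'}\gamma^{\sigma'} \\
 &= -\frac{i}{4!}\,\epsilon_{\mu'\nu'\rho'\sigma'}\,\det(\Lambda)\,\gamma^{\mu'}\gamma^{\nu'}\gamma^{\rho'}\gamma^{\sigma'} \\
 &= \det(\Lambda)\,\gamma_5
\end{aligned} \tag{7.138}
$$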
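$$
\begin{aligned}
 &= \left(M^\dagger(\Lambda^{-1})\,\gamma_0\right)_{m'n}\,\psi^\dagger_{m'}(\Lambda x) \\
 &= \left(\gamma_0\,M(\Lambda)\right)_{m'n}\,\psi^\dagger_{m'}(\Lambda x)
\end{aligned}
$$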
where we have used the conjugation property (7.131) between the second and third
lines. From (7.139) and (7.141) follow directly
Accordingly, the field S(x) ≡ ψ̄(x)ψ(x) is a scalar field. On the other hand, using
(7.138),
we see that there is an extra minus sign for transformations including a spatial
reflection, so that P (x) ≡ ψ̄γ5 ψ is a pseudoscalar field. Not surprisingly, we can
construct a vector field employing the Dirac matrices in the obvious way:
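$$
\begin{aligned}
U(\Lambda)\,\bar{\psi}_n(x)(\gamma_\mu)_{nn'}\psi_{n'}(x)\,U^\dagger(\Lambda)
 &= M_{mn}(\Lambda)\,(\gamma_\mu)_{nn'}\,M_{n'm'}(\Lambda^{-1})\,\bar{\psi}_m(\Lambda x)\,\psi_{m'}(\Lambda x) \\
 &= \Lambda^{\rho}{}_{\mu}\,\bar{\psi}_m(\Lambda x)\,(\gamma_\rho)_{mm'}\,\psi_{m'}(\Lambda x)
\end{aligned} \tag{7.144}
$$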
Thus, Vμ (x) ≡ ψ̄γμ ψ(x) is indeed a vector field. The corresponding result for the four-
vector field Aμ (x) ≡ ψ̄γ5 γμ ψ contains an extra det(Λ) factor (which would be –1 if
the transformation Λ is improper, containing a spatial reflection), so we conclude that
Aμ is an axial vector field.
As the Dirac field has four components, it is apparent that there must be sixteen
independent bilinears constructible as linear combinations of ψ̄n (x)ψm (x). So far, the
S, P, Vμ , and Aμ fields provide us with 1+1+4+4=10 independent operators. The six
remaining independent operators form a second-rank antisymmetric Lorentz tensor
$$
T_{\mu\nu}(x) \equiv \bar{\psi}(x)\,\frac{i}{4}[\gamma_\mu,\gamma_\nu]\,\psi(x) \tag{7.146}
$$
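The counting 1 + 1 + 4 + 4 + 6 = 16 can be backed up by checking that the sixteen matrices sandwiched between ψ̄ and ψ in S, P, Vμ, Aμ and Tμν are linearly independent. One way to see this, sketched below, is to verify that they are mutually orthogonal under the trace inner product Tr(A†B): a diagonal Gram matrix with non-zero diagonal entries implies independence. The representation (7.100) and γ5 from (7.107) are used; the helper names are illustrative.

```python
# Trace-orthogonality of the 16 Dirac bilinear matrices in the representation (7.100).
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]

def blocks(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
G5 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]   # (7.107)
GAMMA_LO = [blocks(Z2, I2, I2, Z2)] + [blocks(Z2, neg(s), s, Z2) for s in PAULI]

def build_basis():
    """Return the matrices for S, P, V_mu, A_mu and T_{mu nu} (mu < nu)."""
    basis = [I4, G5]
    basis += GAMMA_LO
    basis += [matmul(G5, g) for g in GAMMA_LO]
    for mu in range(4):
        for nu in range(mu + 1, 4):
            ab = matmul(GAMMA_LO[mu], GAMMA_LO[nu])
            ba = matmul(GAMMA_LO[nu], GAMMA_LO[mu])
            basis.append([[0.25j * (ab[i][j] - ba[i][j]) for j in range(4)]
                          for i in range(4)])
    return basis

def trace_inner(a, b):
    """Tr(a^dagger b)."""
    return sum(a[k][i].conjugate() * b[k][i] for i in range(4) for k in range(4))

def is_orthogonal_basis(basis, tol=1e-12):
    for p, a in enumerate(basis):
        for q, b in enumerate(basis):
            g = abs(trace_inner(a, b))
            if (p == q and g < tol) or (p != q and g > tol):
                return False
    return True
```

`len(build_basis())` is 16, matching the dimension of the space of 4x4 matrices, so the five families together exhaust the independent bilinears.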
Of course, fields with non-trivial Lorentz transformation properties, such as Vμ , Aμ ,
and Tμν can still be contracted in the standard way to produce Lorentz scalar fields
suitable for use in an interaction Hamiltonian: e.g., Vμ V μ , Tμν T μν etc. Combinations
like Vμ Aμ are pseudoscalar fields and will lead to parity violation if included in the
interaction Hamiltonian density. Precisely this occurs in the effective Fermi theory
of the weak interactions, as we shall see. Another important type of interaction is
the Yukawa interaction between a scalar field φ(x) (or pseudoscalar π(x)) and a Dirac
field ψ(x):
Note that both (7.147) and (7.148) are parity conserving interactions (as Hint is even
under parity in both cases)!
while the Dirac field now takes the form (with E(k) = |k|)
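$$
\psi(x) = \sum_\sigma \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E(k)}}
\left( u(\vec{k},\sigma)\,e^{-ik\cdot x}\,b(\vec{k},\sigma) + v(\vec{k},\sigma)\,e^{ik\cdot x}\,d^\dagger(\vec{k},\sigma) \right) \tag{7.150}
$$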
There is one further subtlety of which we must be cognizant in taking the massless limit
of such a field. The discussion of massless particle states in Section 5.3 made it clear
that such states always appear with a definite value of helicity—i.e., of their angular
momentum resolved along the direction of motion of the particle—rather than along
an arbitrary z-axis, as in the spin states employed in the discussion of the massive
Dirac field above. The relation between spin states $|\vec{k},\sigma\rangle$ and helicity states $|\vec{k},\lambda\rangle$ is
very simple (recall Problem 2(a) of Chapter 5):
$$
|\vec{k},\lambda\rangle = \sum_\sigma D^{\frac{1}{2}}_{\sigma\lambda}(R(\hat{k}))\,|\vec{k},\sigma\rangle
$$
where R(k̂) is the rotation from the z-axis into the direction of momentum k̂ (with a
similar relation for the antiparticle states). There is a corresponding relation between
the creation operators
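$$
b^\dagger(\vec{k},\lambda) = \sum_\sigma D^{\frac{1}{2}}_{\sigma\lambda}(R(\hat{k}))\,b^\dagger(\vec{k},\sigma)
$$
$$
d^\dagger(\vec{k},\lambda) = \sum_\sigma D^{\frac{1}{2}}_{\sigma\lambda}(R(\hat{k}))\,d^\dagger(\vec{k},\sigma)
$$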
the Dirac field (7.150) can be rewritten entirely in terms of helicity spinors and
creation–annihilation operators:
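$$
\psi(x) = \sum_\lambda \int \frac{d^3k}{(2\pi)^{3/2}}\frac{1}{\sqrt{2E(k)}}
\left( u(\vec{k},\lambda)\,e^{-ik\cdot x}\,b(\vec{k},\lambda) + v(\vec{k},\lambda)\,e^{ik\cdot x}\,d^\dagger(\vec{k},\lambda) \right) \tag{7.153}
$$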
Recall that the Dirac field transforms according to the reducible representation
$(\tfrac{1}{2},0)\oplus(0,\tfrac{1}{2})$, with the upper (resp. lower) two components transforming as $(\tfrac{1}{2},0)$ (resp.
$(0,\tfrac{1}{2})$). If we label the upper two components $\psi_L$ and the lower two $\psi_R$, then it is easy
to see that in the limit m → 0 the Dirac equation (7.104) decouples into separate
equations for these two chiral fields:⁷
$$
\sigma^\mu\partial_\mu\psi_L(x) = 0 \tag{7.154}
$$
$$
\bar{\sigma}^\mu\partial_\mu\psi_R(x) = 0 \tag{7.155}
$$
where $\sigma_\mu = (1, \sigma_i)$, $\bar{\sigma}_\mu = (1, -\sigma_i)$. These are called the “left-handed” and
“right-handed” Weyl equations,⁸ respectively (Weyl, 1929). The first equation (for the $(\tfrac{1}{2},0)$
field) coincides in form, not surprisingly, with the previously discussed Majorana
equation (7.88). Here, however, our field is not assumed to be self-conjugate: our
particle is allowed to carry a non-zero additive conserved quantum number, for
example, opposite in sign to that of the antiparticle (e.g., lepton number in the
7 Recall from the discussion in Section 7.2 that half-integral spin fields must be described by an (A, B)
representation with A ≠ B. In the spin-½ case, the chiral character (left- or right-handed) of the Weyl field
is directly correlated with the eigenvalue of γ5; cf. (7.107).
8 These equations, although evidently Lorentz-covariant, were rejected by Pauli in his famous Handbuch
article (Pauli, 1933), p. 226, for violating invariance under spatial reflections (parity)—a shortcoming which
was later transformed into a virtue when parity non-conservation was discovered in the 1950s.
original form of weak interaction theory, with massless neutrinos). The change in sign
of the spin matrices between the equations for ψL and ψR suggests that they describe
(massless) particles of opposite helicity. We shall now demonstrate this explicitly.
Return for a moment to the massive case, in the original spin representation. We
shall concentrate on the upper two components, so there will be a “L” subscript
everywhere. From (7.98), the finite momentum spinor is obtained by a boost from the
zero-momentum state:
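$$
u_L(\vec{k},\sigma) = \sqrt{2m}\;e^{-\theta\hat{k}\cdot\vec{\sigma}/2}\,u_L(0,\sigma) \tag{7.156}
$$
$$
\begin{aligned}
u_L(\vec{k},\lambda)_n
 &= \frac{1}{\sqrt{2}}\sum_\sigma D^{\frac{1}{2}}_{\sigma\lambda}(R(\hat{k}))\left(\sqrt{E(k)+m} - \sqrt{E(k)-m}\;\hat{k}\cdot\vec{\sigma}\right)_{n\sigma} \\
 &= \frac{1}{\sqrt{2}}\left\{\left(\sqrt{E(k)+m} - \sqrt{E(k)-m}\;\hat{k}\cdot\vec{\sigma}\right)D^{\frac{1}{2}}(R(\hat{k}))\right\}_{n\lambda}
\end{aligned}
$$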
again with the understanding that spin or helicity values $+\tfrac{1}{2}$ (resp. $-\tfrac{1}{2}$) need to be
translated appropriately into row or column indices 1 (resp. 2) when evaluating matrix
elements of the rotation or spin matrices. Here we have used the definition of R(k̂) as
the rotation taking the z-axis into the k̂ direction, whence
$$
D^{\frac{1}{2}}(R(\hat{k}))^{-1}\;\hat{k}\cdot\vec{\sigma}\;D^{\frac{1}{2}}(R(\hat{k})) = \sigma_3 \tag{7.160}
$$
$$
u_L(\vec{k},+\tfrac{1}{2})_n = \frac{1}{\sqrt{2}}\,D^{\frac{1}{2}}_{n1}(R(\hat{k}))\left(\sqrt{E(k)+m} - \sqrt{E(k)-m}\right) \tag{7.161}
$$
and clearly vanishes in the zero-mass (or high-energy) limit. On the other hand, the
negative helicity (or “left-handed”) spinor survives in this limit (whence our choice of
the subscript “L” in labeling the upper two components of the Dirac 4-spinor):
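$$
\begin{aligned}
u_L(\vec{k},-\tfrac{1}{2})_n
 &= \frac{1}{\sqrt{2}}\,D^{\frac{1}{2}}_{n2}(R(\hat{k}))\left(\sqrt{E(k)+m} + \sqrt{E(k)-m}\right) \\
 &\to \sqrt{2E(k)}\;D^{\frac{1}{2}}_{n2}(R(\hat{k})), \qquad m \to 0
\end{aligned} \tag{7.162}
$$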
In other words, the chirality of the field (defined as the eigenvalue of γ5 ) and the
helicity of the particle (the eigenvalue of k̂ · σ ) have become linked in the high-energy,
or zero-mass limit.
The helicity interpretation of these spinors is again a consequence of (7.160), which
implies that the first column of the 2x2 rotation matrix $D^{\frac{1}{2}}(R(\hat{k}))$ is an eigenvector of
the helicity operator k̂ · σ with eigenvalue +1, while the second column appearing in
(7.162) is an eigenvector with eigenvalue –1. A similar examination of the properties
of the v spinors associated with the antiparticle states shows that in the massless limit
only the positive helicity state survives. Thus our left-handed Weyl field ψL describes
a massless particle with negative helicity, but whose antiparticle is necessarily positive
helicity. The right-handed Weyl field ψR correspondingly describes a positive helicity
particle, paired with a negative helicity antiparticle. As we shall see later, only the
ψL -type neutrino fields participate in weak interactions in the Standard Model of
elementary particle interactions, even though the particles themselves now appear to
have non-zero mass—suggesting that we are dealing with Dirac fermions after all!
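The locking of chirality to helicity in the massless limit can be illustrated numerically. The sketch below assumes a particular parametrization of the rotation, $D^{1/2}(R(\hat{k})) = e^{-i\phi\sigma_3/2}e^{-i\theta\sigma_2/2}$ for $\hat{k}$ with polar angles (θ, φ); the text only requires that R takes the z-axis into $\hat{k}$, so this is one convenient choice rather than the book's convention. It then evaluates the upper two components of the helicity spinors following the pattern of (7.161) and (7.162).

```python
import cmath
from math import cos, sin, sqrt

def rotation_d_half(theta, phi):
    """Assumed form of D^{1/2}(R(khat)): exp(-i phi s3/2) exp(-i theta s2/2)."""
    c, s = cos(theta / 2), sin(theta / 2)
    em, ep = cmath.exp(-0.5j * phi), cmath.exp(0.5j * phi)
    return [[em * c, -em * s], [ep * s, ep * c]]

def khat_dot_sigma(theta, phi):
    nx, ny, nz = sin(theta) * cos(phi), sin(theta) * sin(phi), cos(theta)
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def helicity_spinors(energy, mass, theta, phi):
    """Return (u_L(k,+1/2), u_L(k,-1/2)) as two-component lists.

    Implements (1/sqrt2) {(sqrt(E+m) - sqrt(E-m) khat.sigma) D}_{n lambda}.
    """
    d = rotation_d_half(theta, phi)
    ks = khat_dot_sigma(theta, phi)
    a, b = sqrt(energy + mass), sqrt(energy - mass)
    mat = [[sum(((a if n == r else 0) - b * ks[n][r]) * d[r][lam] for r in range(2))
            / sqrt(2) for lam in range(2)] for n in range(2)]
    plus = [mat[0][0], mat[1][0]]      # helicity +1/2: first column
    minus = [mat[0][1], mat[1][1]]     # helicity -1/2: second column
    return plus, minus

def norm(vec):
    return sqrt(sum(abs(x) ** 2 for x in vec))

def helicity_eigenvalue_residual(theta, phi):
    """Largest deviation of khat.sigma D from D sigma_3, cf. (7.160)."""
    d = rotation_d_half(theta, phi)
    ks = khat_dot_sigma(theta, phi)
    signs = [1, -1]
    return max(abs(sum(ks[n][r] * d[r][lam] for r in range(2)) - signs[lam] * d[n][lam])
               for n in range(2) for lam in range(2))
```

For example, with `energy=50.0` and `mass=0.01` the positive-helicity norm is of order m/√(2E) while the negative-helicity norm is close to √(2E) = 10, which is the numerical content of (7.161) and (7.162).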
where $M(\Lambda)^{\mu}{}_{\nu} = \Lambda^{\mu}{}_{\nu}$ is simply $R^{\mu}{}_{\nu}$ for a pure rotation R. For μ = 0 this reduces to
$$
\sum_\sigma D^{(1)}_{\sigma\sigma'}(R)\,u^0(0,\sigma) = u^0(0,\sigma') \tag{7.164}
$$
which implies that $u^0(0,\sigma) = 0$. As (7.50) implies that the coefficient functions $u^\mu(k,\sigma)$
at non-zero momentum are obtained from those at zero momentum by a boost
$$
k_\mu u^\mu(k,\sigma) = 0 \tag{7.166}
$$
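$$
W^\mu(x) = \sum_\sigma \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}
\left( u^\mu(\vec{k},\sigma)\,a(\vec{k},\sigma)\,e^{-ik\cdot x} + v^\mu(\vec{k},\sigma)\,a^{c\dagger}(\vec{k},\sigma)\,e^{ik\cdot x} \right) \tag{7.167}
$$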
we have
$$
(\partial^\rho\partial_\rho + m^2)\,W^\mu(x) = 0 \tag{7.168}
$$
$$
\partial_\mu W^\mu(x) = 0 \tag{7.169}
$$
where the vanishing divergence of W μ follows directly from the transversality property
(7.166). The field equations (7.168), (7.169) can be summarized succinctly by defining
the tensor field
$$
\partial_\mu F^{\mu\nu} + m^2 W^\nu = 0 \tag{7.171}
$$
referred to as the Maxwell–Proca equation (Proca, 1936), is easily seen to yield both
the constraints (7.168), (7.169) which build in the correct transformation properties
of the covariant field W μ under Lorentz transformations. In the special case of a self-
conjugate vector field, W μ is hermitian and the antiparticle piece is just the hermitian
adjoint of the particle piece, so (7.167) simplifies to
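$$
Z^\mu(x) = \sum_\sigma \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}
\left( \epsilon^\mu(\vec{k},\sigma)\,a(\vec{k},\sigma)\,e^{-ik\cdot x} + \epsilon^\mu(\vec{k},\sigma)^*\,a^\dagger(\vec{k},\sigma)\,e^{ik\cdot x} \right) \tag{7.172}
$$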
where we have (borrowing from electroweak theory) used the letter Z for a neutral
(i.e., self-conjugate) vector boson, and (from quantum electrodynamics) the more
conventional notation $\epsilon^\mu$ for the corresponding polarization vector.
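Returning to the Maxwell–Proca equation (7.171), the way it packages both (7.168) and (7.169) can be sketched as follows. The derivation assumes the standard antisymmetric field strength $F^{\mu\nu} \equiv \partial^\mu W^\nu - \partial^\nu W^\mu$ (the defining equation is not reproduced in this section) and $m \neq 0$.

```latex
\begin{align*}
0 &= \partial_\nu\left(\partial_\mu F^{\mu\nu} + m^2 W^\nu\right)
   = \underbrace{\partial_\nu\partial_\mu F^{\mu\nu}}_{=\,0\ \text{(antisymmetry)}}
     + m^2\,\partial_\nu W^\nu
  &&\Longrightarrow\quad \partial_\nu W^\nu = 0 \\[4pt]
0 &= \partial_\mu F^{\mu\nu} + m^2 W^\nu
   = \partial_\mu\partial^\mu W^\nu - \partial^\nu(\partial_\mu W^\mu) + m^2 W^\nu
   = (\partial_\mu\partial^\mu + m^2)\,W^\nu
  &&\text{using } \partial_\mu W^\mu = 0
\end{align*}
```

The divergence condition is therefore not an independent input: for non-zero mass it follows from the single second-order equation, and it is exactly this step that fails when m = 0.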
Once again, as in the case of spin-½ particles, the massless limit exhibits special
features. We can proceed as before by examining the behavior of the massive field
discussed above in the limit m → 0. The discussion of massless particle representations
in Section 5.3 again indicates that the helicity representation provides the appropriate
description in this limit. From Eqs. (7.64, 7.65) the zero-momentum polarization
vectors for a massive j = 1 particle (with $\xi^{AB} = (-1)^{2B}$) are easily seen to be
$$
\epsilon^\mu(0,\sigma=0) = \begin{pmatrix} 0 \\ 0 \\ 0 \\ -1 \end{pmatrix}, \qquad
\epsilon^\mu(0,\sigma=\pm 1) = \begin{pmatrix} 0 \\ \pm\tfrac{1}{\sqrt{2}} \\ \tfrac{i}{\sqrt{2}} \\ 0 \end{pmatrix}
$$
with the polarization vectors $v^\mu(0,\sigma)$ for the antiparticle term given by the conjugates
$\epsilon^{\mu *}(0,\sigma)$ as in (7.172).
The corresponding vectors for a particle moving in the positive z-direction (so that
the spin σ and helicity λ specifications are equivalent) with momentum k = kẑ are
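$$
\epsilon^\mu(\vec{k},\lambda=0) = \begin{pmatrix} -\tfrac{k}{m} \\ 0 \\ 0 \\ -\tfrac{E(k)}{m} \end{pmatrix}, \qquad
\epsilon^\mu(\vec{k},\lambda=\pm 1) = \begin{pmatrix} 0 \\ \pm\tfrac{1}{\sqrt{2}} \\ \tfrac{i}{\sqrt{2}} \\ 0 \end{pmatrix}
$$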
Note that the maximal helicity states λ = ±1 are unaffected by the boost. On the
other hand, the zero-helicity mode is singular in the massless limit (or, for fixed mass,
in the high-energy limit). This seems quite at variance with the situation for massless
spin-½ Weyl fields, where, for example, the $(\tfrac{1}{2},0)$ left-handed field gave a well-defined
left-handed spinor in the massless limit, while the right-handed spinor vanished as the
mass was taken to zero (cf. (7.161)). Of course, if we start with an exactly massless
j = 1 particle from the outset, the considerations of Section 5.3 assure us that helicity
λ = +1 and λ = −1 states transform separately and irreducibly under the proper
HLG, and in particular do not mix with a zero-helicity mode, which can be eliminated
completely from the theory. On the other hand, if we wish to have parity conserving
interactions, as is certainly the case for the photon, then both λ = +1 and λ = −1
states must appear.
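The behavior of the three polarization vectors under a boost along z is simple enough to check directly. The sketch below applies the boost matrix to the rest-frame vectors quoted above and verifies transversality, $k_\mu\epsilon^\mu = 0$, and the normalization $\epsilon^*\cdot\epsilon = -1$; it assumes the metric signature (+, −, −, −), and the function names are illustrative.

```python
from math import sqrt

S2 = 1 / sqrt(2)

# Rest-frame polarization vectors as quoted in the text (components 0..3).
REST = {
    0: [0, 0, 0, -1],
    +1: [0, +S2, 1j * S2, 0],
    -1: [0, -S2, 1j * S2, 0],
}

def boost_z(k, m):
    """Boost taking a particle of mass m at rest to momentum k along z."""
    e = sqrt(k * k + m * m)
    return [
        [e / m, 0, 0, k / m],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [k / m, 0, 0, e / m],
    ]

def boosted_polarization(lam, k, m):
    lorentz = boost_z(k, m)
    return [sum(lorentz[mu][nu] * REST[lam][nu] for nu in range(4)) for mu in range(4)]

def minkowski_dot(a, b):
    """a.b with signature (+, -, -, -); no complex conjugation applied."""
    return a[0] * b[0] - sum(a[i] * b[i] for i in range(1, 4))

def transversality(lam, k, m):
    four_momentum = [sqrt(k * k + m * m), 0, 0, k]
    return minkowski_dot(four_momentum, boosted_polarization(lam, k, m))

def norm_squared(lam, k, m):
    eps = boosted_polarization(lam, k, m)
    conj = [complex(x).conjugate() for x in eps]
    return minkowski_dot(conj, eps)
```

For k = 3 and m = 0.5 the zero-helicity vector has components of size k/m = 6 and E/m ≈ 6.08, while its invariant norm stays at −1: the growth as m → 0 is entirely in the direction of the four-momentum itself.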
Still, the singularity of the massless limit for the zero-helicity mode is somewhat
unsettling: it would seem to imply that we could distinguish between an exactly mass-
less photon and one with a mass of $10^{-80}$ eV, for example. In fact, the interactions of
a photon with charged particles enjoy a gauge symmetry which ensures the decoupling
of the zero-helicity mode and restores the smoothness of the massless limit, as well
as softening the high-energy behavior of the theory, rendering it renormalizable. We
shall return to these issues (of gauge symmetry and renormalizability) in much greater
detail in the “Symmetries and Scales” sections of the book, but it may be useful here
Local covariant fields for spin-1 (vector fields) 201
to provide a brief explanation of the role of gauge symmetry in taming the massless
limit for the unwanted λ = 0 mode.
Suppose the interactions of our spin-1 field Z μ (we use the Z notation as our field
is still massive) are insensitive to altering Z μ by longitudinal (i.e., gradient) fields of
the form ∂ μ Λ(x), for arbitrary Λ(x). Choose Λ(x) to be the hermitian field
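$$
\Lambda(x) = \frac{i}{m}\sum_\lambda \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}
\left( a(\vec{k},\lambda)\,e^{-ik\cdot x} - a^\dagger(\vec{k},\lambda)\,e^{ik\cdot x} \right) \tag{7.173}
$$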
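$$
\vec{A}(x) = \sum_{\lambda=\pm 1} \int \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}
\left( \vec{\epsilon}(\vec{k},\lambda)\,a(\vec{k},\lambda)\,e^{-ik\cdot x} + \vec{\epsilon}(\vec{k},\lambda)^*\,a^\dagger(\vec{k},\lambda)\,e^{ik\cdot x} \right) \tag{7.174}
$$
$$
\vec{\nabla}\cdot\vec{A}(x) = 0 \tag{7.175}
$$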
From (7.174), the recovery of the usual form for the free Hamiltonian as a sum of
electric and magnetic field energies is straightforward (see Problem 9):
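$$
H_0 = \sum_{\lambda=\pm 1}\int d^3k\, E(k)\,a^\dagger(\vec{k},\lambda)\,a(\vec{k},\lambda)
= \frac{1}{2}\int d^3x\, : (\vec{E}^2 + \vec{B}^2) : \tag{7.176}
$$
where $\vec{E} \equiv -\dfrac{\partial\vec{A}}{\partial t}$, $\vec{B} \equiv \vec{\nabla}\times\vec{A}$.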
Instead of developing the concept of covariant fields starting from the somewhat
ad hoc assumption of relativistic wave equations such as (7.168, 7.169, 7.171), the
systematic construction followed in the preceding sections shows clearly that the
form of the field equations, and the corresponding free Hamiltonian, satisfied by
local covariant fields is actually specified uniquely by the underlying particle state
transformation properties, once we insist on choosing the simplest representation of
the HLG (for each spin) that will do the job.
$$
H^{(A)}_{\rm int}(x) = \frac{\lambda}{4!}\,\phi(x)^4 \tag{7.177}
$$
2. (Theory B) With λ, φ(x) as in theory A above, and with ψ(x) a non-self-conjugate
scalar field of mass m
3. (Theory C) With λ, φ(x) as in theory B, but with ψ(x) a spin-½ Dirac field of
mass m, subject to a Yukawa interaction
$$
H^{(C)}_{\rm int}(x) = \lambda\,\bar{\psi}(x)\psi(x)\phi(x) \tag{7.179}
$$
The desired phenomenological quantities are then given by (see Appendix B for a
derivation of these essential results, specifically (B.17) and (B.27)):
1. The differential decay rate of a one-particle state α to final states β:
with ki the final state momenta. Of course, when considering spinless particles,
the sum over final-state spins will be irrelevant.
2. The differential cross-section for a two-particle state α to scatter into final
states β:
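$$
d\sigma(\alpha\to\beta) = \frac{(2\pi)^4}{v_\alpha}\,\delta^4(P_\alpha - P_\beta)\,|T_{\beta\alpha}|^2\,d\beta \tag{7.183}
$$
where the relative velocity in the initial state $v_\alpha$ is given by
$$
v_\alpha = \frac{\sqrt{(k_1\cdot k_2)^2 - m_1^2 m_2^2}}{E_1 E_2} \tag{7.184}
$$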
and the final-state phase-space is as above.
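The relative velocity (7.184) is a one-line function of the two initial four-momenta. The sketch below evaluates it with the metric signature (+, −, −, −), reading the masses off from the on-shell momenta; the function names are illustrative.

```python
from math import sqrt

def minkowski_dot(a, b):
    """a.b for four-vectors (E, px, py, pz) with signature (+, -, -, -)."""
    return a[0] * b[0] - sum(a[i] * b[i] for i in range(1, 4))

def relative_velocity(k1, k2):
    """v_alpha of (7.184) for on-shell four-momenta k1, k2."""
    m1_sq = minkowski_dot(k1, k1)
    m2_sq = minkowski_dot(k2, k2)
    invariant = minkowski_dot(k1, k2) ** 2 - m1_sq * m2_sq
    return sqrt(invariant) / (k1[0] * k2[0])

# Center-of-mass collision of two particles of unit mass, |k| = 0.75, E = 1.25.
CM_K1 = [1.25, 0.0, 0.0, 0.75]
CM_K2 = [1.25, 0.0, 0.0, -0.75]

# Fixed-target collision: unit-mass projectile on a unit-mass particle at rest.
LAB_K1 = [1.25, 0.0, 0.0, 0.75]
LAB_K2 = [1.0, 0.0, 0.0, 0.0]
```

In the center-of-mass example the result is 2|k|/E = 1.2, the sum of the two particle speeds, and in the fixed-target example it reduces to the projectile speed |k|/E = 0.6.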
The perturbative expansion of the S-matrix
$$
S = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int d^4x_1 \cdots d^4x_n\; T\{H_{\rm int}(x_1)\cdots H_{\rm int}(x_n)\} \tag{7.185}
$$
will be used to extract the T-matrix element for some simple processes in the theories
specified above. We will evaluate the lowest-order non-trivial contribution to S for
each of the following processes:
1. φ − φ scattering in theory A (first order in λ).
2. φ decay in theory B (first order in λ).
$$
Dp \equiv \frac{1}{(2\pi)^{3/2}\sqrt{2E(p)}}\,d^3p \tag{7.188}
$$
In order to change the incoming state $|k_1 k_2\rangle$ to the outgoing one $|k_1' k_2'\rangle$ (with different
momenta), we must choose the annihilation part of two of the φ(x) fields to get rid
of the incoming particles and the creation part of the remaining two to produce the
outgoing ones. A typical matrix element appearing in (7.186) might, for example, be
$$
\langle k_1' k_2'|\,a^\dagger(p_1)\,a(p_2)\,a^\dagger(p_3)\,a(p_4)\,|k_1 k_2\rangle \tag{7.189}
$$
$$
a^\dagger(p_1)a(p_2)a^\dagger(p_3)a(p_4) = a^\dagger(p_1)a^\dagger(p_3)a(p_2)a(p_4) + \delta^3(p_2 - p_3)\,a^\dagger(p_1)a(p_4) \tag{7.190}
$$
$$
\langle k_1' k_2'|\,a^\dagger(p_1)a(p_4)\,|k_1 k_2\rangle = \delta^3(k_1' - p_1)\,\delta^3(k_1 - p_4)\,\delta^3(k_2' - k_2) + \text{permutations} \tag{7.191}
$$
corresponding to disconnected terms of the structure displayed in Fig. 6.6, in which one
of the particles passes through the process completely unaffected by the interaction,
while the other particle suffers a self-interaction induced by the interaction. Such
persistent self-interactions, present even for an isolated particle, are typically absent
in non-relativistic scattering, but are an intrinsic feature of relativistic field theory. As
we shall see in our more detailed discussion of perturbation theory in Chapter 10, and
in even more detail in Part 4 of the book (on “Scales”), they result in renormalizations
of the attributes of single-particle states: specifically, of the mass and normalization
of the one particle states of the theory. For now, we ignore such effects, concentrating
on the fully connected contributions to the scattering exemplified by the right-most
diagram in Fig. 6.6.
Some simple theories and processes 205
Fully connected contributions to the scattering arise from keeping the normal-
ordered piece from each of the six possible ways in which we can pick two creation
and two annihilation terms from the product of four fields in (7.186). Relabeling the
momentum integration variables, we find six equivalent contributions. For example,
we can take the term
$$
\int d^4x \prod_{i=1}^{4} Dp_i\;\langle k_1' k_2'|\,a^\dagger(p_1)e^{ip_1\cdot x}\,a^\dagger(p_2)e^{ip_2\cdot x}\,a(p_3)e^{-ip_3\cdot x}\,a(p_4)e^{-ip_4\cdot x}\,|k_1 k_2\rangle \tag{7.192}
$$
and then multiply by 6. In (7.192) the annihilation operator a(p4 ) can remove either the
particle k1 , in which case a(p3 ) must remove k2 , or the other way around. Likewise,
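there are two possibilities for creating the final state. Recalling that $a(p_4)|k_1\ldots\rangle = \delta^3(p_4 - k_1)|\ldots\rangle$,
$\langle k_1'\ldots|a^\dagger(p_1) = \delta^3(p_1 - k_1')\langle\ldots|$, we see that the only interesting parts
of (7.192) evaluate to
$$
\int d^4x \prod_{i=1}^{4} Dp_i\; e^{i(p_1+p_2-p_3-p_4)\cdot x}\,\left(\delta^3(p_1 - k_1')\delta^3(p_2 - k_2') + p_1 \leftrightarrow p_2\right)
\left(\delta^3(p_3 - k_1)\delta^3(p_4 - k_2) + p_3 \leftrightarrow p_4\right) \tag{7.194}
$$
$$
= \frac{4}{(2\pi)^6\sqrt{2E_1\cdot 2E_2\cdot 2E_1'\cdot 2E_2'}}\,(2\pi)^4\,\delta^4(k_1' + k_2' - k_1 - k_2) \tag{7.195}
$$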
$$
d\sigma = \frac{\lambda^2}{64\pi^2}\,\frac{1}{\sqrt{(k_1\cdot k_2)^2 - M^4}}\;\delta^4(k_1' + k_2' - k_1 - k_2)\,\frac{d^3k_1'\,d^3k_2'}{E_1' E_2'} \tag{7.198}
$$
At this point we have to be a little more specific about the kinematic conditions for the
scattering. Let us then assume we are performing the experiment in a collider, in the
center-of-mass frame of the two incoming particles. Thus the four-vector momenta
take the form $k_1 = (E, \vec{k})$, $k_2 = (E, -\vec{k})$, $k_1' = (E_1', \vec{k}_1')$, $k_2' = (E_2', \vec{k}_2')$. One finds
If we have detectors deployed with angular resolution, we should leave the integral
over solid angles undone; on the other hand, the detector will “blip” once the particle
enters the opening angle of the detector, irrespective of the magnitude of momentum
$|\vec{k}_1'|$, so we should integrate over this variable. This gives the final result, a differential
cross-section
$$
\frac{d\sigma}{d\Omega_{\hat{k}_1'}} = \frac{\lambda^2}{256\pi^2 E^2} + O(\lambda^3) = \frac{\lambda^2}{256\pi^2(\vec{k}^2 + M^2)} + O(\lambda^3) \tag{7.200}
$$
Note that the scattering (to lowest order) is isotropic (“s-wave”): independent of the
angle between $\vec{k}_1$ and $\vec{k}_1'$.
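The passage from (7.198) to (7.200) can be outlined as follows. This is a sketch of the phase-space reduction in the center-of-mass frame, using only the kinematics stated above; the intermediate steps are reconstructed here and are not quoted from the text.

```latex
\begin{align*}
(k_1\cdot k_2)^2 - M^4 &= (E^2 + \vec{k}^2)^2 - (E^2 - \vec{k}^2)^2 = 4E^2\vec{k}^2
 &&\Rightarrow\ \sqrt{(k_1\cdot k_2)^2 - M^4} = 2E|\vec{k}| \\
\int d^3k_2'\;\delta^4(k_1'+k_2'-k_1-k_2) &= \delta(2E_1' - 2E)
 &&(\vec{k}_2' = -\vec{k}_1',\ E_2' = E_1') \\
\delta(2E_1' - 2E) &= \frac{E}{2|\vec{k}|}\,\delta(|\vec{k}_1'| - |\vec{k}|)
 &&\left(\frac{dE_1'}{d|\vec{k}_1'|} = \frac{|\vec{k}_1'|}{E_1'}\right) \\
\frac{d\sigma}{d\Omega_{\hat{k}_1'}}
 &= \frac{\lambda^2}{64\pi^2}\cdot\frac{1}{2E|\vec{k}|}\cdot\frac{|\vec{k}|^2}{E^2}\cdot\frac{E}{2|\vec{k}|}
  = \frac{\lambda^2}{256\pi^2E^2}
\end{align*}
```

All dependence on the scattering angle drops out because the lowest-order amplitude is a constant, leaving only the flux and phase-space factors.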
Inserting these forms in (7.201) and performing the spacetime and momentum inte-
grals, we obtain
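$$
S_{k_1k_2,k} = \frac{-i\lambda}{(2\pi)^{9/2}\sqrt{2E\cdot 2E_1\cdot 2E_2}}\,(2\pi)^4\,\delta^4(k_1 + k_2 - k) + O(\lambda^2) \tag{7.202}
$$
Stripping off the δ-function, this gives the T-matrix element
$$
T_{k_1k_2,k} = \frac{\lambda}{(2\pi)^{3/2}\sqrt{2E\cdot 2E_1\cdot 2E_2}} + O(\lambda^2) \tag{7.203}
$$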
The general decay formula (7.181) can now be applied to get the differential decay
rate:
$$
d\Gamma = 2\pi\,\delta^4(k_1 + k_2 - k)\,\frac{\lambda^2}{(2\pi)^3}\,\frac{1}{2E\cdot 2E_1\cdot 2E_2}\,d^3k_1\,d^3k_2 + O(\lambda^3) \tag{7.204}
$$
Once again we have reached the point where a choice of frame needs to be made.
Clearly, it is easiest to pick the rest frame for the decaying particle (simple time-
dilation arguments give the decay rate in other frames). So let us take k = (M, 0),
k1 = (E1 , k1 ), k2 = (E2 , k2 ). One then finds (integrating out k2 , and neglecting higher
orders)
$$
d\Gamma = \frac{\lambda^2}{(2\pi)^2}\,\frac{1}{2M(2E_1)^2}\,\delta(2E_1 - M)\,|\vec{k}_1|^2\,d|\vec{k}_1|\,d\Omega_{\hat{k}_1} \tag{7.205}
$$
which, of course, makes sense only if M > 2m, as assumed initially (otherwise the
energy δ-function constraint in the k1 integral is never satisfied, and we simply get zero:
the φ particle is stable). Note that the above calculation gives the rate for detecting
ψ particles at any angle (in this theory, the particle and antiparticle are in principle
distinguishable; of course, the ψ c simply comes out in the opposite direction to the ψ
in the rest frame of the φ). The decay is isotropic, so the total decay rate (#decays/sec
in all directions) is just, to second order in λ,
$$
\Gamma_{\rm tot} = \frac{\lambda^2}{16\pi}\,\frac{\sqrt{M^2 - 4m^2}}{M^2} \tag{7.207}
$$
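The total rate (7.207) can be cross-checked against the angular distribution implied by (7.205). The sketch below carries out the $|\vec{k}_1|$ integration with the energy δ-function by hand (the Jacobian is noted in a comment) and compares 4π times the result with the closed form; function names are illustrative, and the coupling and masses are treated as plain numbers in consistent units.

```python
from math import pi, sqrt

def total_decay_rate(lam, big_m, m):
    """Closed form (7.207); zero below the two-particle threshold M <= 2m."""
    if big_m <= 2 * m:
        return 0.0
    return lam ** 2 * sqrt(big_m ** 2 - 4 * m ** 2) / (16 * pi * big_m ** 2)

def decay_rate_per_solid_angle(lam, big_m, m):
    """dGamma/dOmega obtained from (7.205) by integrating over |k1|.

    The delta function delta(2 E1 - M) fixes |k1| = k0 = sqrt(M^2 - 4 m^2) / 2
    and contributes the Jacobian E1 / (2 k0), since d(2 E1)/d|k1| = 2 |k1| / E1.
    """
    if big_m <= 2 * m:
        return 0.0
    k0 = 0.5 * sqrt(big_m ** 2 - 4 * m ** 2)
    e1 = 0.5 * big_m
    jacobian = e1 / (2 * k0)
    return lam ** 2 / (2 * pi) ** 2 / (2 * big_m * (2 * e1) ** 2) * k0 ** 2 * jacobian
```

Since the decay is isotropic, `4 * pi * decay_rate_per_solid_angle(lam, M, m)` should agree with `total_decay_rate(lam, M, m)`; both vanish continuously as M approaches the threshold 2m from above.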
The matrix element in the joint Fock space of multi-particle ψ and φ states can be
factorized in an obvious way:
The vacuum expectation value (vev) of the time-ordered product of two fields
appears in almost every perturbative calculation in field theory: it has been dubbed
the Feynman propagator (see Fig. 7.1). Conventionally, a factor of i is included in the
definition, as follows:
(Strictly speaking, this is the propagator for the φ field which should be distinguished
from a similar object for the ψ field, not needed in this calculation.) Note that the
[Figure: location of the poles of the integrand in the complex $k^0$ plane, at $k^0 = -E(k) + i\epsilon$ and $k^0 = E(k) - i\epsilon$.]
Accordingly, if x0 > 0 (resp. x0 < 0), the integrand of (7.215) vanishes exponentially
fast in the negative (resp. positive) imaginary direction in the complex plane of the
k0 integration variable, and we can use Cauchy’s theorem to close the contour in the
lower (resp. upper) half-plane, in which case we pick up the residue of the pole at
$k^0 = E(k) - i\epsilon$ (resp. $k^0 = -E(k) + i\epsilon$). The result of the $k_0$ integration is then
$$
\Delta_F(x) = -i\int \frac{d^3k}{(2\pi)^3\,2E(k)}\,e^{-ik\cdot x}, \qquad x^0 > 0 \tag{7.217}
$$
$$
\Delta_F(x) = -i\int \frac{d^3k}{(2\pi)^3\,2E(k)}\,e^{ik\cdot x}, \qquad x^0 < 0 \tag{7.218}
$$
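The contour argument can be tested numerically at fixed three-momentum. The sketch below assumes the standard Fourier integrand $e^{-ik^0 t}/((k^0)^2 - E^2 + i\epsilon)$ for the $k^0$ integration (the full representation (7.215) is not reproduced in this section), keeps ε finite so that the integrand is smooth on the real axis, and compares a brute-force trapezoidal integral with the residue result. The cutoff and step size are arbitrary numerical choices.

```python
import cmath
import math

def k0_integral(t, energy, eps, cutoff=400.0, step=0.005):
    """Trapezoidal estimate of (1/2pi) * integral over k0 in [-cutoff, cutoff] of
    exp(-i k0 t) / (k0^2 - energy^2 + i eps)."""
    n = int(round(2 * cutoff / step))
    total = 0j
    for j in range(n + 1):
        k0 = -cutoff + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(-1j * k0 * t) / (k0 * k0 - energy * energy + 1j * eps)
    return total * step / (2 * math.pi)

def residue_result(t, energy, eps):
    """Value obtained by closing the contour on the appropriate side.

    The poles sit at k0 = +omega (lower half-plane) and k0 = -omega (upper
    half-plane), with omega = sqrt(energy^2 - i eps). For either sign of t the
    residue theorem gives -i exp(-i omega |t|) / (2 omega).
    """
    omega = cmath.sqrt(energy * energy - 1j * eps)
    return -1j * cmath.exp(-1j * omega * abs(t)) / (2 * omega)
```

As ε → 0 the residue result tends to $-i\,e^{-iE|t|}/(2E)$, which is the time dependence appearing under the $d^3k$ integrals in (7.217) and (7.218); for example `k0_integral(1.0, 1.0, 0.5)` and `residue_result(1.0, 1.0, 0.5)` should agree to within the truncation error of the cutoff.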
It is apparent from the Fourier representation (7.215) that the Feynman propagator is
a Lorentz-invariant function of its spacetime argument x, despite its definition (7.214)
involving θ-functions of a frame-dependent time. The resolution of this apparent
paradox becomes clear if we recall the locality of the φ(x) field: under a Lorentz
transformation Λ altering the time-ordering of φ(x1 ) and φ(x2 ), the space-like separa-
tion of x1 and x2 ensures that the order of field operators is irrelevant, as the field is
local. This is just a special case of the argument given in Section 5.5 for the Lorentz-
invariance of an S-matrix constructed perturbatively in terms of the time-ordered
products of local scalar interaction Hamiltonians.
Returning to the calculation of the S-matrix element for elastic ψ − ψ c scattering
in (7.212), we can now write (at second order in λ)
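$$
\begin{aligned}
S_{k_1'k_2',\,k_1k_2}
 &= -i\,\frac{\lambda^2}{2}\,(\mathrm{PSF})\int d^4x_1\,d^4x_2\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 - M^2 + i\epsilon} \\
 &\quad\times e^{ik\cdot(x_2 - x_1)}\left(e^{-i(k_1+k_2)\cdot x_2 + i(k_1'+k_2')\cdot x_1} + e^{-i(k_1-k_1')\cdot x_2 + i(k_2'-k_2)\cdot x_1} + x_1 \leftrightarrow x_2\right) \\
 &= -i\lambda^2\,(\mathrm{PSF})\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 - M^2 + i\epsilon}\,(2\pi)^8\,\big\{\delta^4(k - k_1 - k_2)\,\delta^4(k - k_1' - k_2') \\
 &\qquad\qquad + \delta^4(k - k_1 + k_1')\,\delta^4(k + k_2 - k_2')\big\}
\end{aligned}
$$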
We see that the final amplitude is the sum of two distinct pieces, depending
respectively on the square of the total incoming four-momentum, $s \equiv (k_1 + k_2)^2$ (equal
to four times the square of the particle energy in the CM frame) and on the square of
the four-momentum transferred from the ψ to $\psi^c$, $t \equiv (k_1 - k_1')^2$. In the CM frame,
the second term will, of course, lead to an angular dependence of the differential
cross-section. The two terms have a simple graphical interpretation (see Fig. 7.2: note
that the direction of the arrows for the ψ particles indicates “charge”, rather than
momentum, flow, if we associate positive charge with the ψ and negative charge with
the antiparticle ψ c ). The result for the S-matrix amplitude can be read off immediately
from the following simple Feynman rules:
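[Fig. 7.2: the two second-order diagrams for ψ–ψ^c scattering in Theory B, with the internal φ line carrying momentum k1 + k2 (annihilation) or k1 − k1′ (exchange).]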
with
$$
Dp \equiv \frac{1}{(2\pi)^{3/2}}\sqrt{\frac{m}{E(p)}}\;d^3p \tag{7.223}
$$
We shall consider the scattering of an incoming ψ particle (with momentum and spin
p1 , σ1 ) on an anti-ψ particle (momentum and spin p2 , σ2 ). The outgoing particles
are similarly labeled, with primes added. As before, we work to second order in the
interaction (7.179). Following steps exactly analogous to those leading to the term
(7.210) in theory B, we encounter a matrix element of the form
$$
\mathrm{PSF} = \frac{m^2}{(2\pi)^6\sqrt{E(p_1)\,E(p_2)\,E(p_1')\,E(p_2')}} \tag{7.225}
$$
Note the appearance of a minus sign in two of the four terms, due to the need for
an odd number of transpositions of fermionic creation and destruction operators in
the fields in order to destroy the incoming particles and create the final-state ones.
As we shall see shortly when we discuss the crossing symmetry of these amplitudes,
the minus sign is an indication of a generalized notion of Fermi antisymmetrization
of amplitudes, applicable in local relativistic theories to exchange of particles between
the initial and final states (and not only, as in non-relativistic quantum mechanics,
to exchange of identical particles in either the initial or final state). Nevertheless, the
entire expression is symmetric under x1 ↔ x2 , just as in the bosonic case of Theory
B. This allows us, as there, to factor the fermionic matrix element entirely from the
vacuum expectation of time-ordered φ fields, so we recover, as before, a Feynman
propagator iΔF (x1 − x2 ) for the φ field. After substituting the Fourier transform
expression (7.215) and performing the spacetime integrals over x1 and x2 , we obtain
the final result for the second-order scattering amplitude:
Apart from the minus sign, the result (7.226) is very similar to the result (7.220), with
a contribution from scalar exchange interfering with one from particle–antiparticle
annihilation, as indicated in Fig. 7.3. Our previous list of Feynman rules needs only
the following obvious additions for spin-$\frac12$ Dirac particles:
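[Fig. 7.3: scalar-exchange and annihilation graphs for the fermionic process, with external lines labeled $p_1,\sigma_1$; $p_2,\sigma_2$; $p_1',\sigma_1'$; $p_2',\sigma_2'$ and internal lines carrying momenta $p_1-p_1'$ and $p_1+p_2$.]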
1. Drawing particle lines with the arrow pointing upward (i.e., from the initial to the
final state), each final-state particle is associated with a factor
$\frac{1}{(2\pi)^{3/2}}\sqrt{\frac{m}{E(p')}}\,\bar{u}(p'\sigma')$, and each initial-state particle with a factor
$\frac{1}{(2\pi)^{3/2}}\sqrt{\frac{m}{E(p)}}\,u(p\sigma)$.
2. Adopting the convention that antiparticle lines should be drawn with the arrow
pointing downward (i.e., from final to initial state), each initial-state antiparticle
is associated with a factor $\frac{1}{(2\pi)^{3/2}}\sqrt{\frac{m}{E(p)}}\,\bar{v}(p\sigma)$, and each final-state antiparticle
with a factor $\frac{1}{(2\pi)^{3/2}}\sqrt{\frac{m}{E(p')}}\,v(p'\sigma')$.
In other processes in Theory C, such as φ-ψ scattering (see Problem 12), the
Feynman–Dirac propagator iSF (x1 − x2 ), involving the time-ordered product of a ψ
with a ψ̄ field (see Problem 6), will make its appearance. After all Fourier transforms
are performed, this leads, as expected, to a factor of iSF (p) for every internal fermion
line carrying four-momentum p.
(we are using momenta labeled by “p”s rather than “k”s to avoid confusion in the
crossing rules given below). A calculation along similar lines to that carried out above
for ψ − ψ c scattering yields an amplitude for this process proportional to
\[
\frac{1}{(p_1-p_1')^2-M^2+i\epsilon}+\frac{1}{(p_1-p_2')^2-M^2+i\epsilon} \tag{7.228}
\]
corresponding to the Feynman graphs shown in Fig. 7.4. One easily sees that this
amplitude transforms to the corresponding result for ψ − ψ c scattering in (7.220)
with the simple replacements:
\[
\begin{aligned}
p_1 &\to k_1\\
p_1' &\to k_1'\\
p_2 &\to -k_2'\\
p_2' &\to -k_2
\end{aligned}
\]
i.e., by twisting the antiparticle lines around and changing the sign of their momenta.
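The crossing substitution can be checked numerically. The following is a minimal sketch, assuming the primed assignments $p_2 \to -k_2'$, $p_2' \to -k_2$ and using arbitrary illustrative CM-frame values (the mass, momentum, and angle are assumptions, not taken from the text). It confirms that the two denominators in (7.228) turn into the $t$ and $s$ invariants of $\psi$-$\psi^c$ scattering.

```python
import math

def on_shell(mass, px, py, pz):
    """Four-momentum (E, px, py, pz) of a free particle of the given mass."""
    return (math.sqrt(mass**2 + px**2 + py**2 + pz**2), px, py, pz)

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def neg(p):
    return tuple(-a for a in p)

def square(p):
    """Invariant square p.p with metric signature (+, -, -, -)."""
    return p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2

# Illustrative CM-frame kinematics; the numerical values are arbitrary assumptions.
mass, k, angle = 1.0, 0.7, 0.4
k1 = on_shell(mass, 0.0, 0.0, k)
k2 = on_shell(mass, 0.0, 0.0, -k)
k1_out = on_shell(mass, k * math.sin(angle), 0.0, k * math.cos(angle))
k2_out = on_shell(mass, -k * math.sin(angle), 0.0, -k * math.cos(angle))

s = square(add(k1, k2))      # total incoming four-momentum squared
t = square(sub(k1, k1_out))  # momentum transfer squared

# Crossing substitution: p1 -> k1, p1' -> k1', p2 -> -k2', p2' -> -k2.
p1, p1_out, p2, p2_out = k1, k1_out, neg(k2_out), neg(k2)
crossed_first = square(sub(p1, p1_out))   # (p1 - p1')^2
crossed_second = square(sub(p1, p2_out))  # (p1 - p2')^2
```

Note that the crossed momenta $p_2$ and $p_2'$ carry negative energy here, so the check probes the algebraic form of the amplitude rather than a physical scattering configuration.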
[Fig. 7.4: Feynman graphs for the amplitude (7.228).]
7.7 Problems
1. Verify the expression given in the text for the free Hamiltonian of a massive
Majorana field (given in (7.87)):
\[
H_0 = \int d^3x\,:\left(\chi^\dagger i\vec\sigma\cdot\vec\nabla\chi + \frac{m}{2}\left(\chi^T C_s\chi - \chi^\dagger C_s\chi^*\right)\right):
\]
Namely, show that the above expression reduces to the desired energy summation
formula in terms of creation and annihilation operators:
\[
H_0 = \sum_\sigma\int d^3k\,E(k)\,b^\dagger(k,\sigma)\,b(k,\sigma)
\]
As for the Majorana case, remember that for fermions the normal ordering
includes an extra minus sign for each transposition of fermion operators, by
definition.
4. Convince yourself that J μ (x) ≡ e : ψ̄(x)γ μ ψ(x) : is the quantized version of the
conventional electric current four-vector for a charged Dirac particle described
by a free Dirac field ψ(x), by showing
(a) current conservation (use the Dirac equation for ψ):
∂μ J μ (x) = 0
counts electric charge (defined as e times the difference in the number of particles
and antiparticles), and is time-independent.
5. Prove the following equal-time anticommutator relation for the Dirac field:
\[
\{\psi_n(\vec x,t),\psi_m^\dagger(\vec y,t)\} = \delta^3(\vec x-\vec y)\,\delta_{nm}
\]
6. Use the result (7.113) for the spin sums for a Dirac particle
\[
\sum_\sigma u(p,\sigma)\bar{u}(p,\sigma) = \frac{\not{p}+m}{2m}
\]
(and the corresponding result (7.114) for the v spin functions) to derive the
momentum-space formula for the Dirac propagator, defined as follows (note the
minus sign!):
\[
S_F(p) = i\,\frac{\not{p}+m}{p^2-m^2+i\epsilon}
\]
7. The relation between the four-vector notation and the (A=1/2,B=1/2) notation
for the fundamental representation of the Lorentz group is given by
\[
\begin{aligned}
v_{\frac12,\frac12} &= \frac{1}{\sqrt2}(v^1 - iv^2)\\
v_{\frac12,-\frac12} &= \frac{1}{\sqrt2}(v^0 - v^3)\\
v_{-\frac12,\frac12} &= -\frac{1}{\sqrt2}(v^0 + v^3)\\
v_{-\frac12,-\frac12} &= -\frac{1}{\sqrt2}(v^1 + iv^2)
\end{aligned}
\]
8. Construct the spin functions $u^\mu(k,\sigma)$, $v^\mu(k,\sigma)$, in four-vector notation, for a
$j = 1$ massive boson, starting from the spin functions $u^{(\frac12\frac12)}_{\pm\frac12,\pm\frac12}$ etc., in (AB)
notation. Do this in the following steps:
(a) First, construct the four-vectors $u^\mu(0,\sigma)$, $v^\mu(0,\sigma)$ for the particle at rest,
using (7.64, 7.54), with $\xi_{AB} = (-1)^{2B}$, and the translation dictionary
supplied in Problem 7. Verify that $v^\mu(0,\sigma) = (u^\mu(0,\sigma))^*$.
(b) Show that the polarization vectors derived in (a) satisfy
\[
\sum_\sigma u^\mu(0,\sigma)\,u^{\nu*}(0,\sigma) = -g^{\mu\nu} + g_0^{\mu}g_0^{\nu}
\]
(Consider the cases of $\mu,\nu$ both spatial, then one time, one spatial, etc.)
(c) Now, show that for non-zero momentum,
\[
\sum_\sigma u^\mu(k,\sigma)\,u^{\nu*}(k,\sigma) = -\left(g^{\mu\nu} - \frac{k^\mu k^\nu}{m^2}\right)
\]
Use the boost operator $L^\mu{}_\nu(k)$, recalling that $L^\mu{}_0 = k^\mu/m$, and that $L^\mu{}_\nu(k)$
is a Lorentz transformation.
(d) The polarization vectors in helicity representation are
\[
u^\mu(k,\lambda) = \sum_\sigma D^j_{\sigma\lambda}(R(\hat{k}))\,u^\mu(k,\sigma)
\]
Note that in this case the vacuum expectation value of the time-ordered product
of the ψ field occurs, in the form
\[
= i\int\frac{d^4q}{(2\pi)^4}\,\frac{e^{-iq\cdot(x_1-x_2)}}{q^2-m^2+i\epsilon}
\]
11. Again, in Theory B of Section 7.6, calculate the lowest-order connected S-matrix
element for the annihilation process
\[
\psi(p_1) + \psi^c(p_2) \to \phi(k_1) + \phi(k_2)
\]
The precise way in which underlying microphysical processes, governed by the laws
of quantum mechanics, merge into a phenomenal realm describable by classical laws
has been a source of intense discussion and controversy from the very earliest days of
quantum theory. At its core, this subject leads inexorably to quantum measurement
theory, a subject long regarded by physicists of a more practical bent as an intellectual
black hole, quite capable of permanently absorbing any physicist careless enough to
stray within its event horizon. Nevertheless, much can be understood of the way
in which quantum phenomena merge into and “mimic” classical physics without
making a definite commitment to the ultimate role or character of measurement
processes in quantum theory. Typically, one approaches this topic by identifying a set
of “complementarity” relations between quantities which have a precise meaning in
classical physics but cannot be simultaneously “sharp” once the theory is quantized.
The archetype of such relations is, of course, the Heisenberg uncertainty principle
relating the dispersion (or roughly, the “uncertainty”) in the position and momentum
observables for a non-relativistic point particle. Such relations follow directly, by
a straightforward exercise in linear algebra, from the non-commuting character of
the associated quantum-mechanical operators. From the complementarity point of
view, the classical limit amounts to a regime in which the dispersion of the relevant
observables (in the states of interest) becomes much smaller than their mean values,
although, of course, still restricted by the appropriate quantum-mechanical uncertainty
principle. Our task in this chapter will be to identify and examine those states in a
relativistic quantum field theory for which the field observables of interest take on an
essentially classical character.
momentum eigenstates, this description obscures the spacetime aspects of the field.
From a spatiotemporal point of view, the conjugate variables analogous to the “q”s and
“p”s of non-relativistic quantum theory are, for a spinless field $\phi$, the field variables
$\phi(\vec x,t)$ and their time-derivatives $\pi(\vec x,t) \equiv \frac{\partial\phi(\vec x,t)}{\partial t}$ at each spatial point $\vec x$ at some fixed
time $t$. Locality ensures that the $\phi(\vec x,t)$ commute among themselves (i.e., for any
$\vec x,\vec y$, $[\phi(\vec x,t),\phi(\vec y,t)] = 0$), as do the $\pi(\vec x,t)$, while (with $x^0 = y^0 = t$, and the measure
$Dk \equiv \frac{d^3k}{(2\pi)^{3/2}\sqrt{2E(k)}}$)
\[
\begin{aligned}
[\pi(\vec x,t),\phi(\vec y,t)] &= \int Dk\,Dq\,\left[-iE(k)a(k)e^{-ik\cdot x}+iE(k)a^\dagger(k)e^{ik\cdot x},\;a(q)e^{-iq\cdot y}+a^\dagger(q)e^{iq\cdot y}\right]\\
&= \int\frac{d^3k}{(2\pi)^3\,2E(k)}\left(-iE(k)e^{i\vec k\cdot(\vec x-\vec y)}-iE(k)e^{-i\vec k\cdot(\vec x-\vec y)}\right)\\
&= -i\int\frac{d^3k}{(2\pi)^3}\,e^{i\vec k\cdot(\vec x-\vec y)} = -i\delta^3(\vec x-\vec y)
\end{aligned}\tag{8.1}
\]
Note that the analogy to the fundamental $[p,q]$ commutation relation is somewhat
obscured here by the use of natural units in which $\hbar$ is set to unity: restoring the $\hbar$, it
appears on the right-hand side of (8.1) as expected.
The next step, in analogy to the well-known procedure in non-relativistic quantum
mechanics, would naturally be a derivation of an inequality for the product of the
dispersion of the two non-commuting field operators, analogous to $\Delta p\cdot\Delta x \ge \hbar/2$
in particle quantum mechanics. At this point we encounter an embarrassment: the
dispersions of the field operators $\phi(x)$, or $\pi(x)$, defined at a single spacetime point,
are typically infinite! In the simplest possible Fock-space state, the vacuum, for
example, one finds that $\langle 0|\phi(x)^2|0\rangle = \infty$. Mathematically, the reason for this is that
the field operator $\phi(x)$ is in fact not well-defined on the Hilbert space of normalizable
Fock-space states: the unit norm state $|0\rangle$ is taken into an infinite-norm state $\phi(x)|0\rangle$
by the action of the local field $\phi(x)$, whence the divergence of the vacuum expectation
value $\langle 0|\phi(x)^2|0\rangle = (\phi(x)|0\rangle,\phi(x)|0\rangle)$. Local fields such as $\phi(x)$ and $\pi(x)$ should instead
be regarded as operator-valued distributions: they yield well-defined$^1$ operators only
after smearing with sufficiently smooth c-number test functions (cf. the discussion of
localization in Section 6.5). At the very least, we must construct spatially smeared
operators: e.g., we might replace the field at the origin by
\[
\bar\phi = \frac{1}{(2\pi a^2)^{3/2}}\int d^3x\,e^{-\frac{\vec x^2}{2a^2}}\,\phi(\vec x,t) \tag{8.2}
\]
The normalization factor is chosen so that the smeared object is identical to the
original for constant fields. The square of this operator has a perfectly finite vacuum
expectation value: in other words, φ̄ maps the vacuum state to a normalizable state
1 The smeared field operators will still be unbounded operators, as are typically the position and
momentum operators in ordinary quantum mechanics: hence, their domain will be a proper subset of
the full Hilbert space, but at least it will not be empty! A systematic discussion of the use of smeared
operators in field theory will be an indispensable part of our introduction to axiomatic quantum field
theory in Chapter 9.
in the Hilbert space (in mathematical lingo, the vacuum state lies in the domain of
the operator $\bar\phi$). A short calculation shows that
\[
\langle 0|\bar\phi^2|0\rangle = \frac{1}{4\pi^2}\int_0^\infty\frac{k^2}{\sqrt{k^2+m^2}}\,e^{-a^2k^2}\,dk \sim \frac{1}{8\pi^2a^2},\qquad a \ll \frac{1}{m} \tag{8.3}
\]
which shows clearly the re-emergence of the quadratic divergence in the vacuum
expectation value in the local limit $a \to 0$. We see that it is not even possible to define
a dispersion $\Delta\phi(x)$ for the local field $\phi(x)$, while for the smeared field the dispersion
is perfectly finite: $\Delta\bar\phi \equiv \sqrt{\langle 0|\bar\phi^2|0\rangle - \langle 0|\bar\phi|0\rangle^2} = \sqrt{\langle 0|\bar\phi^2|0\rangle}$. Physically, the restriction
to smeared fields is also perfectly reasonable, as we can hardly expect any conceivable
measurement apparatus to have infinitely fine resolution, either in space or in time.
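The small-$a$ behavior quoted in (8.3) can be checked numerically. The sketch below is only an illustration: the function name `smeared_vev`, the substitution $u = ak$, and the Simpson-rule parameters are assumptions introduced here, not part of the text. It evaluates the integral in (8.3) and compares it with $1/(8\pi^2a^2)$.

```python
import math

def smeared_vev(a, m, n_steps=20000, u_max=8.0):
    """Evaluate <0|phibar^2|0> from (8.3) by Simpson's rule (requires m > 0).

    With u = a*k the integral becomes
        (1 / (4 pi^2 a^2)) * Int_0^inf u^2 / sqrt(u^2 + (a m)^2) * exp(-u^2) du,
    and the Gaussian factor makes the cutoff at u_max harmless.
    """
    eps = a * m

    def integrand(u):
        return u * u / math.sqrt(u * u + eps * eps) * math.exp(-u * u)

    h = u_max / n_steps  # n_steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(u_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0
    return integral / (4.0 * math.pi**2 * a * a)

# Ratio to the asymptotic form 1/(8 pi^2 a^2) for a << 1/m.
m = 1.0
a = 0.01
ratio = smeared_vev(a, m) * 8.0 * math.pi**2 * a * a
# Halving the smearing radius should roughly quadruple the expectation value.
growth = smeared_vev(a / 2.0, m) / smeared_vev(a, m)
```

For $am \ll 1$ the ratio approaches one and the growth factor approaches four, which is the quadratic divergence in the local limit $a \to 0$.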
The preceding discussion focussed on a spin-0 (hence bosonic) field. What about
spin-$\frac12$ fields, which, by the Spin-Statistics theorem, necessarily describe fermionic
particles? We recall from the discussion of blackbody radiation in Chapter 1 that the
classical “Rayleigh–Jeans” regime corresponds to the limit in which the denominator
of the Planck distribution (1.22) vanishes, corresponding to a large occupation number
in modes of any given frequency. We shall see in the next section that the classical limit
for a quantum field necessarily requires such large occupation numbers. For a fermionic
field, the corresponding particle quanta are restricted by the exclusion principle to an
occupancy of either zero or one for each distinct quantum mode, so a classical limit for
such fermionic fields is simply impossible. Of course, there are many particle states of
such fields which exhibit classical behavior (the $\sim 10^{26}$ electrons, protons and neutrons
bound together into a spherical billiard ball certainly behave perfectly classically in
many contexts), but the classical physics appropriate in this circumstance is particle
mechanics, rather than classical field theory. In this chapter we are concerned with
the approach to the latter, so fermionic fields will no longer be considered.
Returning, then, to bosonic fields, the only classical fields of importance are the
electromagnetic and gravitational fields. As conventional quantum field theories of
gravity are at best effective theories valid only at low energies, we shall concentrate
on the electromagnetic field, which in any case deserves special consideration given its
unique role in the gestation and birth of modern quantum mechanics and field theory.
This leads us to the consideration of complementarity issues for the field A(x) (7.174)
introduced in Section 7.5 for the description of massless spin-1 particles. We recall that
this field undergoes a non-trivial modification under gauge transformations, leaving
the physics invariant (classically, it is the spatial part of the four-vector potential). The
associated physical (hence gauge-invariant) fields are the electric field and magnetic
fields: in the radiation (or Coulomb) gauge in which we constructed A(x), these take
the form $\vec E = \frac{\partial\vec A}{\partial t}$ and $\vec B = \vec\nabla\times\vec A$ (or $B_i = \epsilon_{ijk}\partial_jA_k$, $i,j,k = 1,2,3$). We can think of
the electric field, involving a time-derivative, as the conjugate momentum field to the
magnetic field, which involves only space derivatives, and hence only combinations of
field values on a given time-slice. The commutation relation analogous to the result
(8.1) for a canonical scalar field is (see Problem 1)
The reader may easily verify that the corresponding equal-time commutators among
electric and magnetic fields vanish separately.
Equal time commutation relations of the type (8.1) and (8.4) will later (cf.
Chapter 12) play a central role in the development of a Lagrangian formalism for
field theory, of enormous utility in the incorporation of spacetime and local gauge
symmetries, which necessarily take a complicated and obscure form in the Hamiltonian
approach we have followed so far. Here we note that the complementary role played
by electric and magnetic fields leads to physical consequences of enormous importance
in modern condensed matter physics and field theory. In the BCS theory of super-
conductivity, for example, the appearance of a condensate of charged Cooper pairs in
the superconducting state implies an essentially infinite dispersion in the local electric
fields, requiring the dispersion of the local magnetic field to vanish in the interior of
the superconductor: the famous Meissner effect. A dual Meissner effect, in which the
chromoelectric field is forced to zero (in this case as a result of large chromomagnetic
fluctuations) outside of thin tubes carrying the conserved flux required by Gauss’s
Law, is believed to lie at the core of color confinement in quantum chromodynamics
(cf. Chapter 19). The non-zero zero-point energy of the electromagnetic field ($\frac12\hbar\omega$
for each quantized mode of the field) can be viewed as a direct consequence of the
non-existence of states in which both electric $\vec E$ and magnetic $\vec B$ fields have vanishing
expectation values and dispersion, allowing zero energy ($\propto \vec E^2 + \vec B^2$). And so on . . .
In the next section we shall examine closely the conditions for “classical” behavior
of a quantum field. This will turn out to be more easily accomplished in momentum
space—i.e., by examining the multi-particle states for individual quantized modes of
the field, rather than in terms of the spatiotemporally defined fields appearing in the
commutators above. Before doing that, a brief historical digression is in order. As we
recall from the historical account in Chapters 1 and 2, the birth of quantum field
theory was accompanied by much uncertainty about the validity of a straightforward
extension of quantum-mechanical principles to systems with infinitely many degrees
of freedom. An early critique which caused a great deal of unease appeared in a
paper of Landau and Peierls, appearing in 1930 and published the following year
(see (Landau and Peierls, 1983)), in which the arbitrarily precise measurability of
even individual components of the quantized electric or magnetic fields employing
a charged point particle was denied (for fields smeared over a temporal extent $\Delta t$,
Landau and Peierls claimed an intrinsic dispersion $\Delta E, \Delta B \gtrsim \frac{1}{(\Delta t)^2}$). The basic
difficulty was that the acceleration of the test particle in the presence of the field to be
measured would lead to an uncontrollable radiation emission and concomitant energy-
momentum loss. In a famous (and famously unread2 ) paper, Bohr and Rosenfeld
(Bohr and Rosenfeld, 1983) subjected the measurability of the field components of the
free electromagnetic field to a typically exhaustive examination, and concluded that
the uncertainty relations holding among appropriately smeared averages of various
components of the electric and magnetic field operators are precisely consistent with
2 As Pais relates in his biography of Bohr (Pais, 1991): “It [the Bohr–Rosenfeld paper] has been read by
very very few of the aficionados . . . As a friend of Bohr’s and mine once said to me: ‘It is a very good paper
that one does not have to read. You just have to know that it exists.’ ”
the $\Delta p\Delta x \ge \frac{\hbar}{2}$ constraint holding for the dispersion of the momentum and position
of an extended test body, sufficiently massive to allow the response of the test body
to the field to lead to negligible accelerations, thereby minimizing the uncontrollable
radiation emissions that had bothered Landau and Peierls.
The Bohr–Rosenfeld paper played an important historical role in reassuring the
quantum community that the quantization of systems with infinitely many degrees of
freedom did not lead to conceptual inconsistencies or phenomenologically pathological
results, but that quantum field theories should indeed be viewed simply as quantum
systems of a new type, where the constraints on measurability bear just the same
relation to the underlying formal operator framework as in non-relativistic point-
particle quantum mechanics. From a practical point of view, the enormous body of
phenomenological information—the entire field of quantum optics—gathered over the
three-quarters of a century that have elapsed since the Bohr–Rosenfeld paper has
indeed allowed measurements of the quantum properties of light with unparalleled
precision. However, these measurements typically address the behavior of states in
which a limited number of quantized modes of the electromagnetic field are excited:
in other words, we are concerned with the field in momentum rather than coordinate
space. In the next section we shall see that the transition to classical behavior (as well
as the non-classical deviations therefrom which are the bread and butter of quantum
optics) is most easily studied in the occupation number space of these modes.
wavevectors (i.e., momenta) becomes a discrete sum in the usual way, so that we
can speak sensibly of a specific individual mode. Instead of imposing the physical
constraints of an actual rectangular laser cavity (electric field vanishing at the plane
boundaries, for example), we shall simply assume our cubical box of volume V to be a
three-torus topologically, so that the fields satisfy simple periodic boundary conditions
in each of the three Cartesian coordinates. The transition from continuous to discrete
momenta, and from creation–annihilation operators with δ distribution commutators
to ones with Kronecker δs, is then made via
\[
\int d^3k \to \frac{(2\pi)^3}{V}\sum_{\vec k},\qquad (k_x,k_y,k_z) = \frac{2\pi}{L}(n_x,n_y,n_z)
\]
\[
\sqrt{\frac{(2\pi)^3}{V}}\;a(\vec k,\lambda) \to a_{\vec k,\lambda}
\]
where we use subscripts (rather than functional dependence) to indicate discretely
defined objects—for example, the creation–annihilation operators, which now satisfy
\[
[a_{\vec k,\lambda},a^\dagger_{\vec k',\lambda'}] = \delta_{\lambda\lambda'}\,\delta_{\vec k\vec k'} \tag{8.5}
\]
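With these translations, and the assumption that only one mode is occupied, we can
effectively truncate the quantized spin-1 field $\vec A$ given in (7.174) to
\[
\vec A(x) = \frac{1}{\sqrt{2E(k)V}}\left(a_{\vec k,\lambda}\,\vec\epsilon_{\vec k,\lambda}\,e^{-ik\cdot x} + a^\dagger_{\vec k,\lambda}\,\vec\epsilon^{\,*}_{\vec k,\lambda}\,e^{ik\cdot x}\right) \tag{8.6}
\]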
To simplify the notation, we shall henceforth drop the subscripts k, λ indicating the
specific mode under consideration, and write simply
\[
\vec E(\vec x,t) = i\vec C\left(a\,e^{i\vec k\cdot\vec x-i\omega t} - a^\dagger e^{-i\vec k\cdot\vec x+i\omega t}\right) \tag{8.8}
\]
If our state space were finite-dimensional, we could apply the polar decomposition
theorem for finite-dimensional operators, which asserts that we can find a semipositive-
definite hermitian operator N and unitary operator E, with
\[
a = E\sqrt{N},\qquad a^\dagger = \sqrt{N}\,E^\dagger \tag{8.10}
\]
\[
N = a^\dagger a \tag{8.11}
\]
EN − N E = E (8.12)
[N , Φ] = i (8.13)
then the commutation relation (8.12), with $E = e^{i\Phi}$, indeed follows directly using the
usual multiple-commutator formulas. In analogy to $[q,p] = i\hbar$ leading to the uncertainty
relation $\Delta q\Delta p \ge \frac{\hbar}{2}$, one then concludes that $\Delta N\Delta\Phi \ge \frac12$, and it is immediately
apparent that a classical limit, with both amplitude and phase defined to high relative
precision, necessarily requires the number operator $N$ to take on large values, with
$\Delta N \ll \bar N$ (but $\Delta N \gg 1$, thus still allowing $\Delta\Phi \ll 2\pi$).
While this conclusion is basically correct, the argument leading to it is completely
fallacious. For example, if we take the expectation value of the purported commutation
relation (8.13) in the vacuum state $|0\rangle$ for our mode, we obtain
There are two sources of trouble here. First, the polar decomposition theorem does not
extend in general to the existence of polar pairs N , E in infinite-dimensional spaces.
Secondly, the transition from a unitary E to a phase operator Φ clearly cannot lead to
the desired commutation relation (8.13), as the latter is clearly inconsistent. In fact,
the definition and construction of a well-defined phase operator in quantum mechanics
is a notoriously slippery subject—which is hardly surprising, given that even classically
the phase variable is not uniquely defined, but given only modulo 2π.
We shall circumvent the first problem by dealing with a truncated system, where
only states with a maximum occupation number $N$ are considered. It will become
apparent in the next section that the coherent states that most mimic classical
behavior have very rapidly decreasing components for large occupation numbers $N$,
with the probability of detecting more than $N$ quanta falling more rapidly than $\left(\frac{e\bar n}{N}\right)^N$
for $N > \bar n \equiv \langle N\rangle$ once we consider states with occupancy $N$ exceeding the expectation
value $\bar n$ of the number operator. Such states also have exponentially small ($\sim e^{-\bar n}$)
probability of occupancy of the vacuum state $|0\rangle$. Thus, we expect to make only
very small errors by simply truncating the state space at some very high, but finite,
occupation number $N$. The second problem, defining a phase operator, can be avoided
simply by observing that the specification of the phase $\theta$ modulo $2\pi$ amounts to
specifying the real numbers $\cos\theta$ and $\sin\theta$, or equivalently, the complex number $e^{i\theta}$,
corresponding to our unitary operator $E$ above. There will be no difficulty obtaining a
well-defined unitary $E$ (or its hermitian “real” and “imaginary” parts $C \equiv \frac12(E + E^\dagger)$
and $S \equiv \frac{1}{2i}(E - E^\dagger)$) once we have truncated the space.
In keeping with our demand that only a maximum number N of photons can
occupy any given mode of the electromagnetic field, we define new creation operators
by demanding that
\[
a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle,\quad n < N,\qquad a^\dagger|N\rangle = 0 \tag{8.15}
\]
The destruction operator a is then given by the adjoint of a† . The algebra is realized
by (N + 1) × (N + 1) matrices, which are easily seen to give the modified commutator
The restriction to states with extremely (factorially) small components in the $|N\rangle$ (or
higher) mode means that the effect of the last term on the right, with a projection
operator onto the highest mode, is effectively negligible. It is, of course, required,
because the trace of the commutator of two finite-dimensional matrices must vanish.
Note that we still have the usual number operator $N = a^\dagger a$ with matrix elements
$N_{nm} = n\delta_{nm}$, $0 \le n \le N$.
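The truncated algebra is easy to inspect explicitly. The following is a minimal sketch in plain Python (the helper names `truncated_ladder`, `matmul`, and `commutator` are hypothetical, introduced only for this illustration). It builds the $(N+1)\times(N+1)$ matrices defined by (8.15) and evaluates $[a, a^\dagger]$. What one finds directly from (8.15) is a diagonal matrix with entries $1$ for $n < N$ and $-N$ for the highest mode, i.e., the identity minus $(N+1)$ times the projector onto $|N\rangle$, so that the trace vanishes as required.

```python
import math

def truncated_ladder(n_max):
    """Return (a, a_dagger) as (n_max+1) x (n_max+1) real matrices, per (8.15)."""
    dim = n_max + 1
    a_dagger = [[0.0] * dim for _ in range(dim)]
    for n in range(n_max):
        # a_dagger |n> = sqrt(n+1) |n+1>, and a_dagger |n_max> = 0
        a_dagger[n + 1][n] = math.sqrt(n + 1)
    # The matrices are real, so the adjoint is just the transpose.
    a = [[a_dagger[j][i] for j in range(dim)] for i in range(dim)]
    return a, a_dagger

def matmul(x, y):
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    dim = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(dim)] for i in range(dim)]

n_max = 5  # illustrative cutoff; any positive integer works
a, a_dagger = truncated_ladder(n_max)
comm = commutator(a, a_dagger)
diagonal = [comm[n][n] for n in range(n_max + 1)]
trace = sum(diagonal)
number_diag = [matmul(a_dagger, a)[n][n] for n in range(n_max + 1)]
```

The diagonal of `matmul(a_dagger, a)` reproduces $N_{nm} = n\delta_{nm}$, so the number operator is unaffected by the cutoff.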
In this finite-dimensional space the application of the polar decomposition theorem
is unproblematic, and we find that
\[
a = E\sqrt{N},\qquad a^\dagger = \sqrt{N}\,E^\dagger \tag{8.18}
\]
\[
E = \sum_{n=0}^{N-1}|n\rangle\langle n+1| + |N\rangle\langle 0| \tag{8.19}
\]
$E$ is basically the lowering operator $a$ without the square-root factors, except for the
action on the vacuum, which, as mentioned above, has exponentially small occupancy
in the quasi-classical states (see Section 8.3 below) in which we shall be interested. $E$
is the quantum analog of the complex classical phase $e^{i\theta}$. It is a normal operator (i.e.,
being unitary, it commutes with its adjoint), so its hermitian and antihermitian parts
$C \equiv \frac12(E + E^\dagger)$ and $S \equiv \frac{1}{2i}(E - E^\dagger)$ commute
\[
[C,S] = 0 \tag{8.20}
\]
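These properties of the truncated phase operator can be verified directly. The sketch below uses plain Python lists of complex numbers; the helper names and the cutoff value are assumptions made for the illustration. It builds $E$ from (8.19), then checks that $E$ is unitary, that $a = E\sqrt{N}$ holds exactly in the truncated space, and that $C$ and $S$ commute.

```python
import math

def matmul(x, y):
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def dagger(x):
    dim = len(x)
    return [[x[j][i].conjugate() for j in range(dim)] for i in range(dim)]

def max_abs_difference(x, y):
    return max(abs(u - v) for row_x, row_y in zip(x, y) for u, v in zip(row_x, row_y))

n_max = 6  # illustrative cutoff
dim = n_max + 1

# E = sum_{n=0}^{N-1} |n><n+1| + |N><0|, as in (8.19)
E = [[0j] * dim for _ in range(dim)]
for n in range(n_max):
    E[n][n + 1] = 1 + 0j
E[n_max][0] = 1 + 0j
E_dag = dagger(E)

identity = [[(1 + 0j) if i == j else 0j for j in range(dim)] for i in range(dim)]
sqrt_N = [[complex(math.sqrt(i)) if i == j else 0j for j in range(dim)] for i in range(dim)]

# Truncated lowering operator: a |n+1> = sqrt(n+1) |n>
a = [[0j] * dim for _ in range(dim)]
for n in range(n_max):
    a[n][n + 1] = complex(math.sqrt(n + 1))

C = [[(E[i][j] + E_dag[i][j]) / 2 for j in range(dim)] for i in range(dim)]
S = [[(E[i][j] - E_dag[i][j]) / 2j for j in range(dim)] for i in range(dim)]

unitarity_error = max_abs_difference(matmul(E, E_dag), identity)
polar_error = max_abs_difference(matmul(E, sqrt_N), a)
commutator_error = max_abs_difference(matmul(C, S), matmul(S, C))
```

The extra term $|N\rangle\langle 0|$ is what makes $E$ a cyclic shift and hence unitary; it drops out of $E\sqrt{N}$ because $\sqrt{N}$ annihilates the vacuum.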
where once again we have neglected operators such as $|N\rangle\langle 0|$, which simply exchange
the very small amplitudes for our state to have either 0 or N photons. Together with
[E † , N ] ≈ −E † , this then yields
The final inequality implies that the discriminant of the quadratic equation for λ must
be negative or zero, i.e.,
The answer to the question posed by the heading of this Section—namely, when
is a quantum field classical?—is therefore quite straightforward. The behavior of such
a field is classical only in the context of states with high occupation number for
individual modes of the field. It is therefore necessary that the field (a) be bosonic in
character, (b) be associated with stable particles, in order that a classical configuration,
once established, be maintained over macroscopic time-scales, and (c) have a sensible
non-interacting limit, so that the previous discussion, which assumed many-particle,
but non-interacting, particle states is indeed applicable.
In the Standard Model, the only bosonic fields associated with stable elementary
particles4 are the gauge-vector bosons mediating electromagnetic (the photon) and
strong (the gluons of quantum chromodynamics) interactions. We shall later see that
the non-abelian gauge dynamics of the gluons of the strong interactions results in a
qualitative alteration of the state space once interactions are turned on: namely, the
phenomenon of color confinement (cf. Sections 19.3 and 19.4) in which the physical
particle states of the theory are related to the underlying local (quark and gluon)
fields in an extremely complicated way, so condition (c) is violated in this case. We
are left (apart from gravity) with the quantized electromagnetic field as the single
phenomenologically relevant example of a relativistic quantum field with a sensible
classical limit. It is therefore hardly surprising, given the primary role played by
correspondence principle arguments in the historical evolution of quantum mechanics
and quantum field theory, that the unravelling of the quantum properties of light was
at the core of the most critical developments in this history, as we have already seen
in Chapters 1 and 2.
4 The Standard Model spin-0 Higgs is, of course, elementary but unstable: see Chapter 7, Problem 12.
with the commutator $Z = -i[X,Y] = \frac12$. We shall soon see that the states which
saturate the mutual uncertainty relation for the hermitian operators $X$, $Y$ (called
“quadrature” operators in the quantum optics literature) do the same for the number-phase
uncertainty relations. The inequality (8.24) becomes an equality when the
discriminant vanishes and $\lambda = \frac{Z}{2(\Delta Y)^2}$. If we further require that the dispersions in
$X$ and $Y$ are equal,$^5$ then $(\Delta X)^2 = (\Delta Y)^2 = \frac12 Z = \frac14$, so $\lambda = 1$, and the state $|\psi\rangle$
must satisfy
Henceforth, as is usual in quantum theory, we shall relabel the coherent state $|\psi\rangle$ by
its eigenvalue with respect to the destruction operator $a$, thus $|\psi\rangle \to |\alpha\rangle$.
The interpretation of the complex number $\alpha$ becomes apparent if we return to the
expression for the quantized single-mode electric field (8.8). The expectation values
of the destruction $a$ and creation $a^\dagger$ operators in the coherent state $|\alpha\rangle$ are $\alpha$ and
$\alpha^*$ respectively, so writing $\alpha = Ae^{i\theta}$, the expectation value of the electric field in the
coherent state is
\[
\langle\alpha|\vec E(\vec x,t)|\alpha\rangle = -2\vec C A\sin(\vec k\cdot\vec x - \omega t + \theta) \tag{8.38}
\]
i.e., exactly the classical monochromatic wave form (8.9). Thus the norm $A$ and phase
$\theta$ of the complex eigenvalue $\alpha$ encode the amplitude and phase of the corresponding
classical field. We shall shortly see how to extend this result to construct coherent
5 Minimum uncertainty states in which $\Delta X \neq \Delta Y$ are also of considerable interest in quantum optics:
they are the so-called “squeezed” states.
states with an arbitrary preassigned spatiotemporal behavior for the field expectation
value, even in a fully interacting theory!
Note that we have so far not imposed a mode cutoff at some large occupation
number N as in the previous section. Our next task, then, is to verify the assertions
made concerning the negligible effects of such a cutoff in the classical regime. First,
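note that the expectation value of the number operator $N$ in the state $|\alpha\rangle$ is
\[
\bar N = \sum_n n\,c_n^2 = e^{-|\alpha|^2}\sum_n n\,\frac{|\alpha|^{2n}}{n!} = e^{-|\alpha|^2}|\alpha|^2\frac{\partial}{\partial|\alpha|^2}e^{|\alpha|^2} = |\alpha|^2 \tag{8.39}
\]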
Thus, the probability $P(n)$ of finding exactly $n$ photons in our coherent state
characterized by an average photon number $\bar N$ is precisely the Poisson distribution
$P(n) = e^{-\bar N}\frac{\bar N^n}{n!}$. Moreover, the classical limit requires that we choose $|\alpha|^2 = \bar N \gg 1$
in order to achieve simultaneously a small fractional dispersion in the classical amplitude
(related to $\bar N$) and in the phase (which requires $\Delta N \gg 1$). In this limit the
Poisson distribution becomes approximately Gaussian,
\[
P(n) \sim \frac{1}{\sqrt{2\pi}\,\Delta N}\,e^{-\frac12\frac{(n-\bar N)^2}{(\Delta N)^2}} \tag{8.40}
\]
\[
|\langle N|\alpha\rangle| \sim \frac{1}{(2\pi N)^{1/4}}\,e^{-\bar N/2}\left(\frac{e\bar N}{N}\right)^{N/2} \ll 1 \tag{8.42}
\]
are negligibly small, allowing us to use freely the phase operators E, C, and S
introduced in the previous section without worrying about the mutilation of the basic
commutation relations (8.17) and (8.21) induced by our mode cutoff.
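To get a feeling for how small the cutoff amplitude actually is, one can compare the estimate (8.42) with the exact Poisson amplitude $\sqrt{e^{-\bar N}\bar N^N/N!}$. The following is a rough sketch; the function names and the sample values $\bar N = 100$, $N = 200$ are arbitrary assumptions chosen for illustration.

```python
import math

def exact_amplitude(n_cut, n_bar):
    """|<N|alpha>| = sqrt(exp(-n_bar) * n_bar**N / N!), computed via logarithms."""
    log_probability = -n_bar + n_cut * math.log(n_bar) - math.lgamma(n_cut + 1)
    return math.exp(0.5 * log_probability)

def stirling_amplitude(n_cut, n_bar):
    """The large-N estimate (8.42), obtained from Stirling's formula for N!."""
    return (math.exp(-n_bar / 2.0)
            * (math.e * n_bar / n_cut) ** (n_cut / 2.0)
            / (2.0 * math.pi * n_cut) ** 0.25)

n_bar, n_cut = 100.0, 200  # mean photon number and mode cutoff (illustrative)
exact = exact_amplitude(n_cut, n_bar)
estimate = stirling_amplitude(n_cut, n_bar)
relative_error = abs(estimate / exact - 1.0)
```

With a cutoff at only twice the mean occupation number, the amplitude is already far below $10^{-8}$, and it falls off rapidly as the cutoff is raised further.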
With the mode cutoff in place, the unitary operator $E$ defined in (8.19) has
eigenstates of the form
\[
|\phi\rangle = \frac{1}{\sqrt{N+1}}\sum_{n=0}^{N}e^{in\phi}|n\rangle \tag{8.43}
\]
with $E|\phi\rangle = e^{i\phi}|\phi\rangle$, provided the eigenphases $\phi$ take the discrete values
$\phi = \frac{2\pi m}{N+1}$, $m = 0, 1, 2, \ldots N$. The corresponding eigenvalues of the hermitian (resp.
antihermitian) parts $C$ (resp. $S$) of $E$ are, of course, $\cos\phi$ (resp. $\sin\phi$). The amplitude
for measuring a phase $\phi$ in a coherent state $|\alpha\rangle$ is therefore
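\[
e^{-|\alpha|^2/2}\,\frac{|\alpha|^n}{\sqrt{n!}} \sim \frac{1}{(2\pi)^{1/4}\sqrt{\Delta N}}\,e^{-\frac14\frac{(n-\bar N)^2}{(\Delta N)^2}} \tag{8.45}
\]
The phase amplitude (8.44) can therefore be evaluated in the limit of large $N$ as a
Fourier sum approximated by a Gaussian integral
\[
\langle\phi|\alpha\rangle \propto \int e^{-in(\phi-\theta)}\,e^{-\frac14\frac{(n-\bar N)^2}{(\Delta N)^2}}\,dn \propto e^{-i\bar N(\phi-\theta)}\,e^{-(\Delta N)^2(\phi-\theta)^2} \tag{8.46}
\]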
frequency part φ(+) of the field, built entirely from destruction operators a(q). To
construct such states, introduce the operator
\[
S = \exp\left(\frac12\int d^3q\,(2\pi)^{3/2}\sqrt{2E(q)}\,\tilde f(q)\,a^\dagger(q)\right) \tag{8.51}
\]
where f˜(q) is for the present an unspecified c-number function of spatial momentum
q (as usual, we simplify the notation by omitting arrows for spatial vectors, e.g., q,
where the context is clear). A straightforward calculation using the Baker–Campbell–
Hausdorff6 expansion formula gives
1
S −1 a(k)S = a(k) + (2π)3/2 2E(k)f˜(k) (8.52)
2
and hence
$$S^{-1}\phi^{(+)}(x,0)\,S = \phi^{(+)}(x,0) + \frac{1}{2}f(x) \tag{8.53}$$
where f(x) is the Fourier transform of f̃ in (8.51). Thus, if we define the coherent state
$$|f\rangle \equiv S|0\rangle \tag{8.54}$$
we see (since φ^(+)|0⟩ = 0) that our coherent state is indeed an eigenstate of the positive frequency part of the field
$$\phi^{(+)}(x,0)|f\rangle = \frac{1}{2}f(x)|f\rangle, \qquad \langle f|\phi^{(-)}(x,0) = \langle f|\,\frac{1}{2}f(x) \tag{8.55}$$
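A single-mode analogue makes the eigenvalue property (8.55) concrete. The sketch below is a simplification, not the field-theoretic construction itself: it replaces the integral over modes by one oscillator, with a c-number `c` standing in for the coefficient of a†(q) in the exponent of (8.51), and truncates the number basis at a hypothetical cutoff.

```python
import math

def coherent_vector(c, cutoff):
    """Number-basis components of exp(c a^dagger)|0>: the n-th entry is c^n / sqrt(n!)."""
    return [c ** n / math.sqrt(math.factorial(n)) for n in range(cutoff + 1)]

def apply_annihilation(vec):
    """(a v)_n = sqrt(n + 1) v_{n+1}; the top level has no partner after truncation."""
    return [math.sqrt(n + 1) * vec[n + 1] for n in range(len(vec) - 1)]

if __name__ == "__main__":
    c = 0.7 + 0.2j
    state = coherent_vector(c, 30)
    lowered = apply_annihilation(state)
    # Each retained component of a|f> should equal c times the same component of |f>.
    print(max(abs(lowered[n] - c * state[n]) for n in range(len(lowered))))
```

The state is left unnormalized, as in (8.54); the eigenvalue relation does not depend on the normalization.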
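where we have taken f to be real. Thus, as promised, we have constructed a state |f⟩ with a field expectation value exactly equal to the preassigned function f(x): since φ = φ^(+) + φ^(−), the two relations in (8.55) give ⟨f|φ(x, 0)|f⟩/⟨f|f⟩ = f(x). The same eigenvalue relations give, for normal-ordered products of the field,
$$\frac{\langle f|:\phi^2(x,0):|f\rangle}{\langle f|f\rangle} = f(x)^2$$
$$\frac{\langle f|:\phi^4(x,0):|f\rangle}{\langle f|f\rangle} = f(x)^4$$
$$\frac{\langle f|:|\nabla\phi(x,0)|^2:|f\rangle}{\langle f|f\rangle} = |\nabla f(x)|^2 \tag{8.57}$$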
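$$S^{-1}\pi^{(+)}(x,0)\,S = \pi^{(+)}(x,0) - \frac{i}{2}f_E(x) \tag{8.58}$$
where π^(+)(x, 0) is the positive frequency part of φ̇(x, 0), and
$$f_E(x) \equiv \int d^3k\, E(k)\,\tilde{f}(k)\,e^{ik\cdot x} \tag{8.59}$$
$$\pi^{(+)}(x,0)|f\rangle = -\frac{i}{2}f_E(x)|f\rangle, \qquad \langle f|\pi^{(-)}(x,0) = \langle f|\,\frac{i}{2}f_E(x) \tag{8.60}$$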
from which it follows that
Thus the expectation value of the full Hamiltonian density in the state |f⟩ is
$$\frac{\langle f|\mathcal{H}(x,0)|f\rangle}{\langle f|f\rangle} = \frac{1}{2}|\nabla f(x)|^2 + \frac{1}{2}m^2 f(x)^2 + \frac{\lambda}{4!}f(x)^4 \tag{8.62}$$
As the expectation value of the full Hamiltonian, obtained as the spatial integral of
(8.62), is time-independent, it is given, at any time, by the exact functional
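$$\frac{\langle f|H|f\rangle}{\langle f|f\rangle} = \int d^3x\left(\frac{1}{2}|\nabla f(x)|^2 + \frac{1}{2}m^2 f(x)^2 + \frac{\lambda}{4!}f(x)^4\right) \tag{8.63}$$
Fig. 8.1 Energy density for various coherent states with spontaneous symmetry-breaking. (Axes: ⟨H⟩/V versus f, with the value f0 marked.)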
Signs, stability, symmetry-breaking 235
Rewriting the Hamiltonian (8.64) in terms of the shifted field, one finds
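$$H = \int d^3x : \frac{1}{2}\dot{\hat{\phi}}^2 + \frac{1}{2}|\nabla\hat{\phi}|^2 + m^2\hat{\phi}^2 + \sqrt{\frac{\lambda}{6}}\,m\,\hat{\phi}^3 + \frac{\lambda}{4!}\hat{\phi}^4 : \tag{8.67}$$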
(An additive constant in the energy density, irrelevant for our present discussion,
but of profound importance in modern theories of cosmological inflation, has been
discarded).
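The coefficients of the shifted Hamiltonian can be recovered by Taylor-expanding the potential about its off-center minimum. The sketch below assumes that the non-derivative part of (8.64), which is not reproduced in this passage, is V(φ) = −m²φ²/2 + λφ⁴/4!, the single-field analogue of the potential in (8.69); the function names are illustrative.

```python
import math

def potential(phi, m, lam):
    # Assumed single-field potential: -m^2 phi^2 / 2 + lam phi^4 / 4!
    return -0.5 * m ** 2 * phi ** 2 + lam * phi ** 4 / 24

def shifted_coefficients(m, lam):
    """Minimum phi0 and the coefficients of h, h^2, h^3, h^4 in V(phi0 + h) - V(phi0)."""
    phi0 = m * math.sqrt(6 / lam)
    linear = -m ** 2 * phi0 + lam * phi0 ** 3 / 6        # V'(phi0), vanishes at the minimum
    quadratic = 0.5 * (-m ** 2 + lam * phi0 ** 2 / 2)    # V''(phi0) / 2!
    cubic = lam * phi0 / 6                               # V'''(phi0) / 3!
    quartic = lam / 24                                   # V''''(phi0) / 4!
    return phi0, (linear, quadratic, cubic, quartic)

if __name__ == "__main__":
    m, lam = 1.5, 0.3
    phi0, coeffs = shifted_coefficients(m, lam)
    print(coeffs)
    print(m ** 2, math.sqrt(lam / 6) * m, lam / 24)  # values to compare with (8.67)
```

The quadratic coefficient comes out as +m², and the cubic coefficient as m times the square root of λ/6, matching the corresponding terms of (8.67) under the stated assumption.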
Note that a physically sensible positive sign has reappeared in front of the quadratic
mass term. Now, however, the theory contains a cubic as well as quartic interac-
tion term. The original φ → −φ symmetry of (8.64), which would have guaranteed
conservation of evenness or oddness of the number of particles in any scattering
process, has been broken. The breaking is really due to an asymmetrical choice of
an off-centered minimum of a symmetric potential energy curve. In other words,
the symmetry is broken by the ground state, not by the underlying dynamics. If
we had chosen random values for the coefficients of the φ̂2 , φ̂3 , φ̂4 terms in (8.67),
the symmetry-breaking would have been explicit in the dynamics: indeed, in this
case, there would be a unique lowest-energy state (vacuum), and no non-trivial
symmetry operation (such as φ → −φ) connecting distinct degenerate vacuum states.
Only for the special case where these couplings and masses are related as indicated
in (8.67) can the theory be regarded as possessing an underlying symmetry broken
only by the choice of a single asymmetric vacuum—from a set of degenerate ones
connected by the symmetry—dictated by historical circumstance, just as the direction
of spontaneous magnetization in a ferromagnet lowered below its Curie tempera-
ture will depend on the presence of small external magnetic fields to resolve the
rotational ambiguity. Symmetry-breaking of this type is referred to as “spontaneous
symmetry-breaking”.
The symmetry of the theory defined by the Hamiltonian (8.64) is a discrete one:
φ → −φ. If the symmetry undergoing spontaneous breakdown is continuous, as in
the case of rotational symmetry mentioned above in the context of ferromagnetism,
there is a remarkable new feature, first noticed by Goldstone (Goldstone, 1961): the
appearance of an exactly massless scalar particle, or “Goldstone boson”. Imagine a
theory with three scalar particles of identical mass m. (For example, the neutral and
charged pions π0 , π ± form such a triplet, if we ignore the slight differences in mass of
236 Dynamics VI: The classical limit of quantum fields
the neutral and charged types due to the weak and electromagnetic interactions.) The
free Hamiltonian for such a system may be written thus:
$$H_0 = \int d^3x : \frac{1}{2}\dot{\vec{\phi}}^{\,2} + \frac{1}{2}|\nabla\vec{\phi}|^2 + \frac{1}{2}m^2\vec{\phi}^{\,2} : \tag{8.68}$$
Here the three scalar fields have been written as the three “components” φ1 , φ2 , φ3 of a
three-dimensional field vector φ. This Hamiltonian possesses a continuous symmetry of
rotations in field space, whereby φi → Rij φj , with Rij a 3 × 3 orthogonal matrix, i.e.,
an element of the fundamental representation of the group O(3), the rotation group
in three dimensions. Note that the symmetry is a “global” one: exactly the same
rotation is applied to the field vector at all spacetime points (otherwise, the spatial
gradient or time-derivative terms in the Hamiltonian would not be left invariant).
Such a symmetry is commonly referred to as an “isospin” symmetry. One may write
down interactions which also respect the isospin symmetry of the free Hamiltonian,
e.g., H_int = (λ/4!)(φ²)², which is clearly also invariant under rotations in field space. Now
suppose that the full Hamiltonian is actually
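$$\begin{aligned}
H &= \int d^3x : \frac{1}{2}\dot{\vec{\phi}}^{\,2} + \frac{1}{2}|\nabla\vec{\phi}|^2 - \frac{1}{2}m^2\vec{\phi}^{\,2} + \frac{\lambda}{4!}(\vec{\phi}^{\,2})^2 : \\
&\equiv \int d^3x : \frac{1}{2}\dot{\vec{\phi}}^{\,2} + \frac{1}{2}|\nabla\vec{\phi}|^2 + P(\vec{\phi}) :
\end{aligned} \tag{8.69}$$
Once again, the state of minimum energy does not correspond to zero vacuum expectation value of the fields φ. Rather, there is a whole family of minimum-energy states corresponding to the various field orientations with fixed magnitude
$$|\vec{\phi}| = \sqrt{\frac{6}{\lambda}}\,m \tag{8.70}$$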
A physical Fock space is constructed by imagining (and here the analogy to ferromag-
netism becomes essentially complete) that some fluctuation has “tickled” the system
into choosing a definite direction in field space for ⟨0|φ|0⟩. By the rotational invariance
of (8.69), we may as well call this direction the “3” direction, and rewrite the theory
in terms of a shifted field:
$$\phi_i(x) = \sqrt{\frac{6}{\lambda}}\,m\,\delta_{i3} + \hat{\phi}_i(x) \tag{8.71}$$
As in the theory (8.64), the physical vacuum state will be one in which the vacuum
expectation value of the shifted field φ̂ vanishes. With this substitution we find that the “potential energy” part P(φ) of H becomes
$$P \to m^2\hat{\phi}_3^2 + \sqrt{\frac{\lambda}{6}}\,m\,\hat{\phi}_3\,\hat{\phi}_i\hat{\phi}_i + \frac{\lambda}{4!}(\hat{\phi}_i\hat{\phi}_i)^2 \tag{8.72}$$
Note the appearance once again of a cubic term which violates the rotational symmetry
(because the third direction in field space is singled out via φ̂3 ). But in contrast to the
discrete case, the theory now contains only a single massive particle, corresponding to
φ̂3: there is no mass term for φ̂1, φ̂2! These two directions correspond to the “flat” directions in field space near the point φi = √(6/λ) m δi3. Note that the Hamiltonian after
the shift is still invariant under rotations around the 3-axis (which leave φ̂3 fixed),
so the symmetry-breaking has reduced the symmetry group of the theory from O(3)
(rotations around the 1, 2, or 3 axes in field space) to O(2) (rotations only around the 3
axis, represented by 2x2 matrices mixing the 1 and 2 coordinates). The two generators
corresponding to the lost symmetries (rotations around the 1 and 2 directions) may
be associated with the two massless Goldstone bosons appearing in (8.72).
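The counting of massless modes can be read off from the matrix of second derivatives of the potential P in (8.69) at the chosen minimum. The following is a minimal sketch using the analytic Hessian of P = −m²φ²/2 + (λ/4!)(φ²)²; the function name and sample parameters are illustrative.

```python
import math

def potential_hessian(phi, m, lam):
    """Second derivatives d^2 P / d phi_i d phi_j of P = -m^2 phi^2 / 2 + lam (phi^2)^2 / 24."""
    phi_sq = sum(component * component for component in phi)
    hessian = []
    for i in range(3):
        row = []
        for j in range(3):
            diagonal = (-m * m + lam * phi_sq / 6) if i == j else 0.0
            row.append(diagonal + lam * phi[i] * phi[j] / 3)
        hessian.append(row)
    return hessian

if __name__ == "__main__":
    m, lam = 2.0, 0.5
    v = m * math.sqrt(6 / lam)           # magnitude (8.70) of the vacuum field
    for row in potential_hessian([0.0, 0.0, v], m, lam):
        print(row)
```

At the point φi = √(6/λ) m δi3 the matrix is diagonal with entries 0, 0, 2m²: two flat directions, and a curvature 2m² along the 3-direction, which corresponds to the term m²φ̂3² in (8.72).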
The appearance of a massless scalar for every broken generator of a continuous
global symmetry is a very general property of local quantum field theories. Note that
our arguments in this section are based on an identification of the physical masses of
particles with the coefficients of quadratic terms in the Hamiltonian, which is only,
strictly speaking, valid to lowest order in perturbation theory in the interactions: we
shall see later that the interactions will in general renormalize the physical masses.
Nevertheless, the masslessness of the modes induced by spontaneous breaking of global
continuous symmetries turns out to be exact to all orders of perturbation theory, as
we shall show when we return to the subject in Section 14.3.⁷ Some more examples
of spontaneous symmetry-breaking in theories with continuous global symmetries are
given in Problems 4, 5, and 6 at the end of this chapter.
One may wonder why the Goldstone theorem has assumed such central importance
in modern particle physics: after all, there do not appear to be any massless spinless
particles in Nature! There are two reasons:
(1) There are spontaneously broken continuous global symmetries which would be
exact except for the presence of small explicit terms breaking the symmetry
in the Hamiltonian. In such a case (see Problem 4) one gets scalar particles
of small mass, called “pseudo-Goldstone bosons”. This is the case for pions
in the strong interactions, which are the pseudo-Goldstone bosons of broken
approximate chiral symmetry.
(2) The spontaneous symmetry-breaking may occur in a theory where the contin-
uous symmetry is a local gauge symmetry, and in this case each Goldstone
boson appearing due to a broken generator of the symmetry group is in a sense
“absorbed” into the gauge field of the corresponding gauge particle, making the
latter massive (the so-called “Higgs” mechanism, discussed in detail in Section
15.6, which explains the mass splitting between the photon and the massive W
and Z bosons of modern electroweak theory).
We shall return later, in the “Symmetries” section of the book (cf. Chapter 14), to a
much more detailed examination of the physics of degenerate vacua and spontaneous
symmetry-breaking, both for global and local symmetries of the underlying theory.
For the time being, the preceding discussion provides a suggestive example of the
7 The first general proof of this was given by Goldstone, Salam, and Weinberg, Phys. Rev. 127 (1962),
965.
power and utility of the coherent state formalism in connecting certain macroscopic
properties of a quantum field theory with the microscopic dynamics.
8.5 Problems
1. Starting with the vector field A for a massless spin-1 particle given in (7.174), with the electric field defined as E = ∂A/∂t and the magnetic field B = ∇ × A (or B_i = ε_ijk ∂_j A_k, i, j, k = 1, 2, 3), derive the commutation relation:
2. Show that in the state (8.37), the probability of detecting more than N quanta falls more rapidly than (e n̄/N)^N for N > n̄ ≡ N once we consider states with occupancy N exceeding the expectation value n̄ of the number operator.
3. Show that the states (8.43) are eigenstates of the operator (8.19), with eigenvalues e^{iφ}, φ = 2πm/(N + 1), m = 0, 1, 2, ..., N.
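4. Let σ and π⃗ (the vector symbol over π refers to an internal “flavor” index, not to ordinary space!) be a set of four self-conjugate scalar fields, interacting via the Hamiltonian
$$H = \int d^3x \left\{ : \frac{1}{2}\left(\frac{\partial\sigma}{\partial t}\right)^2 + \frac{1}{2}|\nabla\sigma|^2 + \frac{1}{2}\sum_{i=1}^{3}\left(\left(\frac{\partial\pi_i}{\partial t}\right)^2 + |\nabla\pi_i|^2\right) - \frac{1}{2}\mu^2(\sigma^2 + \vec{\pi}^{\,2}) + \lambda(\sigma^2 + \vec{\pi}^{\,2})^2 : \right\}$$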
(a) Show that the ground state of this system has non-vanishing expectation
value for one of the fields. (It is conventional to call this the σ field—why
is this purely a matter of convention?). Show that this corresponds to a
breaking of the global symmetry group (what is it?) of H. Show that there
are massless particles (how many?) in the theory.
(b) What happens to the spectrum if a term kσ (k a real constant) is added
to H?
(c) If μ = 80 MeV, λ = 0.04, k = 0, calculate the difference in energy density in J/m³ between the false vacuum with σ = π⃗ = 0 and one of the coherent states minimizing the energy.
5. In the theory of the preceding Problem, show that the σ particle is unstable and
calculate its lifetime (inverse of the total decay rate) to lowest order in λ.
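6. Let χ⃗1 and χ⃗2 be two three-vectors of self-conjugate scalar fields interacting via a Hamiltonian density with polynomial part
$$P(\vec{\chi}_1, \vec{\chi}_2) = -\frac{1}{2}\mu^2(\vec{\chi}_1^{\,2} + \vec{\chi}_2^{\,2}) + \lambda_1(\vec{\chi}_1^{\,2} + \vec{\chi}_2^{\,2})^2 + \lambda_2\,\vec{\chi}_1\cdot\vec{\chi}_2\,(\vec{\chi}_1^{\,2} + \vec{\chi}_2^{\,2})$$
where λ1, λ2 > 0, and λ1 > λ2/2.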
(a) What is the global symmetry group of the Hamiltonian?
(b) Find the configuration that minimizes the polynomial P and show that it breaks the global symmetry (i.e., find ⟨χ⃗1⟩ and ⟨χ⃗2⟩). What is the residual
global symmetry in this case after spontaneous symmetry-breaking? How
many generators of the original symmetry group were broken?
(c) Identify (as linear combinations of the original fields) the massless and the
massive fields after spontaneous symmetry-breaking, and find the masses of
those with non-zero mass.
9
Dynamics VII: Interacting fields:
general aspects
choice of inertial frame is necessary in order to specify the relevant time variable for
the state dynamics, and the field operators depend only on the spatial variables, to the
detriment of manifest Lorentz covariance of the theory. Our next task will therefore
be to reformulate local quantum field theory in the Heisenberg picture, avoiding as
far as possible any reference to perturbative expansions of the quantities discussed.
This is a long chapter, and we beg the reader’s patience insofar as the treat-
ment requires perhaps a somewhat higher level of mathematical sophistication than
previously necessary. The arguments have been spelled out in great detail to avoid
confusions, and the required additional mathematics (mainly distribution theory) has
been explained just enough to make the proofs comprehensible. However, the effort
is justified by the central importance of the topics discussed, which encompass the
essential conceptual content of interacting local quantum field theories, expressed, in
the absence of an explicit “from the ground up” mathematical construction of four-
dimensional field theories, in as precise a form as mathematical physicists have been
able to achieve to the present date. In particular, a clear understanding of the variety
of connections between particles and fields, discussed at length in Section 9.6, is really
not possible without the insight into the nature of the interpolating field given by the
asymptotic formalism developed in Sections 9.3 and 9.4.
with U(t, 0) = e^{iH0 t} e^{−iHt}. In the case of a quantum-mechanical system with a finite
number of degrees of freedom, there are no hidden subtleties here: the operator
U (t, 0) is properly unitary, acting within a single Hilbert space spanned by either
the complete set of eigenstates of the free Hamiltonian H0 or the full Hamiltonian H.
We shall see later (cf. the discussion of Haag’s theorem, Section 10.5) that none of
these nice properties obtain for a quantum field theory with infinitely many degrees
of freedom. Thus, strictly speaking, the manipulations that follow in this section are
only valid if the field theory is fully regularized—by which we mean that both infrared
(finite-volume V ) and ultraviolet (short-distance a, or high-momentum Λ) cutoffs are
imposed, reducing the number of allowed momentum modes to a finite number (∼ V/a³).
In fact, we have already noted (cf. Section 8.3, final paragraph) that such cutoffs are
necessary to have a well-defined expression for the Hamiltonian in terms of the fields.
The attentive reader will, of course, complain that such cutoffs, while perhaps
restoring quantum-mechanical sanity, will clearly do violence to the two other sacred
ingredients which we have taken such pains to implement: Lorentz-invariance and
locality. And it is certainly far from obvious that the cutoffs can be removed at the
end of our calculations in a way that fully restores these desiderata. These are all
important issues, to which we shall return on several occasions later in the book. But
for the time being we shall throw caution to the winds and proceed as though the
conventional manipulations of non-relativistic quantum theory make sense in the field
theory case as well. A rigorous scattering theory, constructed without recourse to a
mathematically dubious interaction picture, will be the subject of Sections 9.3 and
9.4. In the meantime we shall assume that the interaction picture makes sense. Also,
we shall restrict the discussion to the case of a single, self-conjugate massive scalar
field with polynomial self-interactions, as the complications entailed by non-zero spin
are completely irrelevant to the physical issues of importance here.
We therefore define a Heisenberg field operator φH (x, t) in terms of the correspond-
ing interaction-picture field via (9.1) as
$$[\phi_H(\vec{x},t), \phi_H(\vec{y},t)] = U^\dagger(t,0)\,[\phi(\vec{x},t), \phi(\vec{y},t)]\,U(t,0) = 0, \qquad \vec{x} \neq \vec{y} \tag{9.3}$$
The extension of this equal-time result to full microcausality (i.e., space-like commuta-
tivity) requires examination of the Lorentz transformation properties of φH , to which
we now turn.
We recall that the construction of the interaction part V of the Hamiltonian
as the spatial integral of a density implies the momentum conservation property
(5.90), [P^(0), V] = [P^(0), H0] = 0, where P^(0) is the spatial momentum operator on
the Fock space of eigenstates of H0 (the so-called “bare” states of field theory).
As the interaction does not alter spatial momentum, we can drop the (0) subscript,
as the same operator will measure the momentum of all states after interactions are
switched on. In particular, the commutator of this spatial momentum operator with
the Heisenberg field defined in (9.2) is
$$i[P_i, \phi_H(\vec{x},t)] = U^\dagger(t,0)\,i[P_i, \phi(\vec{x},t)]\,U(t,0) = U^\dagger(t,0)\,\frac{\partial}{\partial x_i}\phi(\vec{x},t)\,U(t,0) = \frac{\partial}{\partial x_i}\phi_H(\vec{x},t) \tag{9.4}$$
as the spatial momentum operator Pi commutes with H, H0 , hence with U (t, 0). Of
course, the energy component of the four-vector energy-momentum operator which
generates the time-evolution of the Heisenberg field is the full Hamiltonian H:
$$i[P_0, \phi_H(\vec{x},t)] = i[H, \phi_H(\vec{x},t)] = \frac{\partial}{\partial t}\phi_H(\vec{x},t) \tag{9.5}$$
The finite field translation property (5.93) for the interaction-picture field φ therefore
takes the obvious analogous form for the Heisenberg picture field:
$$e^{iP_\mu a^\mu}\,\phi_H(x)\,e^{-iP_\mu a^\mu} = \phi_H(x + a) \tag{9.6}$$
with P_μ = (H, P_i^(0)) = (H, P_i).
Our previous discussion of scattering theory in Heisenberg representation (cf.
Section 4.3) was based on the specification of the Heisenberg state of a scattering
system in terms of the behavior of the system in the far past (the “in-states” |α⟩_in) or
Field theory in Heisenberg representation: heuristics 243
the far future (the “out-states” |β⟩_out). The former correspond to a set of far separated
incoming particles such as those prepared in the beams of a high-energy collider, the
latter to the set of outgoing detected particles, in interesting cases, following a collision.
In the case of non-relativistic potential scattering theory, and assuming the absence
of bound states of the incoming or outgoing particles, either the in- or the out-states
can be shown to provide a complete set of eigenstates of the full Hamiltonian H. In other words, the Hilbert space of the theory H can be identified with either the space H_in spanned by the |α⟩_in or the space H_out spanned by the |β⟩_out states. We have no option but to assume that the corresponding property remains valid in relativistic field theory, where a basis for the in (resp. out) states is provided by the (continuum normalized) multi-particle states |k1, k2, ..., kN⟩_in (resp. |k1, k2, ..., kM⟩_out) with
$$P^\mu\,|k_1, k_2, \ldots, k_N\rangle_{in} = \sum_{n=1}^{N} k_n^\mu\,|k_1, k_2, \ldots, k_N\rangle_{in} \tag{9.7}$$
$$P^\mu\,|k_1, k_2, \ldots, k_M\rangle_{out} = \sum_{n=1}^{M} k_n^\mu\,|k_1, k_2, \ldots, k_M\rangle_{out} \tag{9.8}$$
Here again we emphasize that we are taking for simplicity the theory of a single stable
spinless particle (with no bound states), so the states are fully specified by listing the
momenta of the particles.
The principle of asymptotic completeness asserts that the Hilbert spaces Hin , Hout
are identical, and can be identified with the full Hilbert space H of the theory. The
physical reasoning behind this assumption is in fact very straightforward. Imagine
an arbitrary finite-energy state of the system at, say, time t = 0, corresponding to
the moment of collision of some arbitrary set of formerly separated particles. The
state of the system near t = 0 is, of course, extremely complicated: all we know (in
a massive theory with short range interactions) is that the energy and momentum
of the field(s) is concentrated in a small spatial region around the interaction vertex.
It is physically clear that any such state will eventually evolve (absent bound states)
into a linear combination of Fock states consisting of sets of a finite number (≤ E/m,
where E is the total energy and m the mass of the field quantum) of outgoing stable
particles receding to infinity, with, of course, the total energy and momentum of the
system conserved at every stage. In other words, since the entire history of a system
is encapsulated (in Heisenberg representation) by its specification at some arbitrary
time (in this case, the far future), an arbitrary Heisenberg state of the system must
be resolvable into a linear combination of multi-particle out-states: the |β⟩_out span
the physical Hilbert space. Note that in a theory with unstable particles, these do
not form part of the asymptotic Hilbert space: only the stable particles persisting
at late time (including stable bound states, if such exist) are to be included in the
list of |β⟩_out. We shall return later in this chapter to a more detailed discussion of
the nature of the Hilbert space in relativistic field theory. For the time being we
note that the completeness of the in-states is assured given that of the out-states
by the TCP theorem of local field theory (cf. Section 13.4): the joint operation of
Choosing a countable basis¹ |α⟩_in for H_in, the discrete matrix S_βα is unitary (cf. Section 4.3.1), and Lorentz-invariance of the theory, _out⟨Λβ|Λα⟩_in = _out⟨β|α⟩_in, then
implies (see Problem 1) the corresponding transformation property on the (once again,
continuum-normalized) out-states:
Note the subscript H on UH (Λ): these are not the same unitary operators as the U (Λ)
introduced earlier (cf. (5.23)) implementing Lorentz transformations on the bare multi-particle states |α⟩ which form a basis of eigenstates of the free Hamiltonian H0, and
which satisfy
1 The Fock space of field theory, as a countable direct sum of finite tensor products of one-particle spaces,
is, somewhat surprisingly, a separable Hilbert space; see (Streater and Wightman, 1978).
the theory. After obtaining the relevant result linking matrix elements of interaction
and Heisenberg picture operators, we shall return to the issue of removal of the cutoff
(essential, of course, for restoring the locality and Lorentz transformation properties of
the theory).
The formula we need, of fundamental importance in both relativistic field theory
and many-body theory, is due to Gell–Mann and Low (Gell-Mann and Low, 1951). We
begin with the unitary Møller wave operators Ω∓ = U (0, ∓∞), connecting bare states
|α⟩ (eigenstates of H0) to the corresponding in- and out-states (cf. Section 4.3.2):
Now consider a general matrix element between arbitrary in- and out-states of a time-
ordered product of m Heisenberg fields
$${}_{out}\langle\beta|T\{\phi_H(x_1)\phi_H(x_2)\cdots\phi_H(x_m)\}|\alpha\rangle_{in} = {}_{out}\langle\beta|\phi_H(y_1)\phi_H(y_2)\cdots\phi_H(y_m)|\alpha\rangle_{in} \tag{9.14}$$
where the set (y1, y2, ..., ym) of spacetime coordinates are obtained by subjecting the original set (x1, x2, ..., xm) to a permutation in order to effect the desired time-ordering t1 ≡ y1^0 > t2 ≡ y2^0 > ... > tm ≡ ym^0. Using (9.2, 9.13) and the semigroup property of the U(t, t0) (cf. Problem 1 in Chapter 4), we find
$${}_{out}\langle\beta|\phi_H(y_1)\cdots\phi_H(y_m)|\alpha\rangle_{in} = \langle\beta|U(+\infty, t_1)\,\phi(y_1)\,U(t_1, t_2)\,\phi(y_2)\cdots\phi(y_m)\,U(t_m, -\infty)|\alpha\rangle \tag{9.15}$$
Note that the left-hand side of (9.15) involves purely Heisenberg states and operators,
while the right-hand side contains only interaction-picture states and operators.
Further progress requires that we limit ourselves to the formal perturbative expansion
of the theory (the reasons for which will become apparent below):
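$$\tau(y_1, \ldots, y_m) \equiv \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int \langle\beta|T\{\phi(y_1)\phi(y_2)\cdots\phi(y_m)\,H_{int}(z_1)H_{int}(z_2)\cdots H_{int}(z_n)\}|\alpha\rangle\, d^4z_1\, d^4z_2\cdots d^4z_n \tag{9.16}$$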
As the integrations over the spacetime coordinates z1 , z2 , ...zn in (9.16) are performed,
the time-ordering symbol will redistribute the n Hint operators among the already
time-ordered field operators φ(y1 ), φ(y2 ), etc. The full integration can evidently be
subdivided into subregions in which n0 of the Hint operators occur at times later
than t1 ≡ y1^0, n1 occur between times t1 and t2, and so on, with nm interactions prior to the earliest field time tm, and n = n0 + n1 + ... + nm. There are n!/(n0! n1! ... nm!) equivalent ways of selecting the particular Hint operators to be placed in these temporal
intervals. Accordingly, our m-point function τ (y1 , ...ym ) (commonly referred to as a
“Feynman m-point function”) in (9.16) may be re-expressed (relabeling the z1 , .., zn
coordinates)
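$$\tau(y_1, \ldots, y_m) = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\sum_{\substack{n_0, n_1, \ldots, n_m;\\ n_0 + n_1 + \cdots + n_m = n}}\frac{n!}{n_0!\,n_1!\cdots n_m!}\;\cdot\;\langle\beta|T\{H_{int}(z_{0,1})\cdots H_{int}(z_{0,n_0})\}\,\phi(y_1)\,T\{H_{int}(z_{1,1})\cdots H_{int}(z_{1,n_1})\}\,\phi(y_2)\cdots$$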
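The combinatorial factor can be sanity-checked by brute force: distributing n labelled interaction operators among the m + 1 temporal intervals can be done in (m + 1)^n ways in total, and the multinomial coefficients summed over all occupation numbers (n0, ..., nm) with n0 + ... + nm = n must reproduce that total. The sketch below uses hypothetical helper names.

```python
from math import factorial

def multinomial(counts):
    """n! / (n0! n1! ... nm!) for n = sum(counts)."""
    result = factorial(sum(counts))
    for count in counts:
        result //= factorial(count)
    return result

def compositions(n, parts):
    """All tuples of `parts` non-negative integers that sum to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest

def total_assignments(n, m):
    """Sum of the multinomial factors over all (n0, ..., nm) with n0 + ... + nm = n."""
    return sum(multinomial(counts) for counts in compositions(n, m + 1))

if __name__ == "__main__":
    n, m = 4, 2
    print(total_assignments(n, m), (m + 1) ** n)
```

The agreement of the two numbers is just the multinomial theorem applied to (1 + 1 + ... + 1)^n, which is why no region of the z-integration is missed or double-counted in the subdivision.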
The sums over the individual n0 , n1 , etc. T-products of interaction operators can now
be reassembled into the corresponding interaction-picture time-evolution operators.
For example,
$$\sum_{n_0}\frac{(-i)^{n_0}}{n_0!}\int \theta(z^0_{0,i} - t_1)\,T\{H_{int}(z_{0,1})\cdots H_{int}(z_{0,n_0})\}\prod_j d^4z_{0,j} = U(+\infty, t_1) \tag{9.18}$$
and similarly for the sums involving time-ordered operators sandwiched in the other
temporal regions between the φ(yi ) operators. In the end we recover exactly
$$\tau(y_1, y_2, \ldots, y_m) = \langle\beta|U(+\infty, t_1)\,\phi(y_1)\,U(t_1, t_2)\,\phi(y_2)\cdots\phi(y_m)\,U(t_m, -\infty)|\alpha\rangle \tag{9.19}$$
i.e., the right-hand side of (9.15). In other words, with the provisos given earlier vis-
à-vis regularization of the theory, we have the Gell–Mann–Low formula
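$${}_{out}\langle\beta|\phi_H(y_1)\phi_H(y_2)\cdots\phi_H(y_m)|\alpha\rangle_{in} = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int \langle\beta|T\{\phi(y_1)\phi(y_2)\cdots\phi(y_m)\,H_{int}(z_1)\cdots H_{int}(z_n)\}|\alpha\rangle\, d^4z_1\cdots d^4z_n$$
Note that as a special case, taking m = 0 (no φH fields), we recover our previous perturbative expression (5.73) for the S-matrix S_βα ≡ _out⟨β|α⟩_in.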
With this result in hand we can return to the question of the Lorentz
transformation properties of the Heisenberg field φH (x). Of course, the presence
of regularizing cutoffs reducing our field theory to a finite number of degrees of
freedom explicitly breaks the invariance under the continuous HLG (for example, if
we formulate the theory on a discrete spacetime lattice, the usual prelude to attempts
at a rigorous construction of the continuum limit of a relativistic field theory).
We shall see in Part 4 of this book, with our treatment of covariant renormalized
perturbation theory, that the restoration of Lorentz-invariance can indeed be proven
rigorously in a number of four-dimensional field theories (the so-called “perturbatively
renormalizable” theories), but only in the context of the formal perturbative expansion
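$$U^\dagger(\Lambda)\,H_{int}(x)\,U(\Lambda) = H_{int}(\Lambda^{-1}x), \qquad U^\dagger(\Lambda)\,\phi(x)\,U(\Lambda) = \phi(\Lambda^{-1}x) \tag{9.23}$$
Let us begin with (9.22), for arbitrary in |α⟩_in and out |β⟩_out states (recall that these separately form complete sets, by the principle of asymptotic completeness). Introducing an arbitrary fixed Lorentz transformation Λ
$$\begin{aligned}
{}_{out}\langle\beta|U_H^\dagger(\Lambda)\,\phi_H(x)\,U_H(\Lambda)|\alpha\rangle_{in} &= {}_{out}\langle\Lambda\beta|\phi_H(x)|\Lambda\alpha\rangle_{in} \\
&= \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int \langle\Lambda\beta|T\{\phi(x)H_{int}(z_1)\cdots H_{int}(z_n)\}|\Lambda\alpha\rangle\, d^4z_1\cdots d^4z_n \\
&= \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int \langle\beta|U^\dagger(\Lambda)\,T\{\phi(x)H_{int}(z_1)\cdots H_{int}(z_n)\}\,U(\Lambda)|\alpha\rangle\, d^4z_1\cdots d^4z_n
\end{aligned} \tag{9.25}$$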
By the arguments of Section 5.5, the T-product of ultralocal scalar fields (and we
are here assuming that Hint (x) is ultralocal) is Lorentz-covariant: i.e., the similarity
transformation by U(Λ) may be taken inside the T-product,² giving
2 The reader will recall that the basic reason for this is that the only cases in which Λ can interchange
the time-ordering of two fields involve space-like separations where the fields already commute, by locality.
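$$\begin{aligned}
{}_{out}\langle\beta|U_H^\dagger(\Lambda)\,\phi_H(x)\,U_H(\Lambda)|\alpha\rangle_{in}
&= \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int \langle\beta|T\{U^\dagger(\Lambda)\,\phi(x)H_{int}(z_1)\cdots H_{int}(z_n)\,U(\Lambda)\}|\alpha\rangle\, d^4z_1\cdots d^4z_n \\
&= \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int \langle\beta|T\{\phi(\Lambda^{-1}x)H_{int}(\Lambda^{-1}z_1)\cdots H_{int}(\Lambda^{-1}z_n)\}|\alpha\rangle\, d^4z_1\cdots d^4z_n
\end{aligned} \tag{9.26}$$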
Changing integration variables from zi to wi ≡ Λ−1 zi , and recalling that the Λ are
unimodular (det(Λ) = 1),
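$$\begin{aligned}
{}_{out}\langle\beta|U_H^\dagger(\Lambda)\,\phi_H(x)\,U_H(\Lambda)|\alpha\rangle_{in}
&= \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int \langle\beta|T\{\phi(\Lambda^{-1}x)H_{int}(w_1)\cdots H_{int}(w_n)\}|\alpha\rangle\, d^4w_1\cdots d^4w_n \\
&= {}_{out}\langle\beta|\phi_H(\Lambda^{-1}x)|\alpha\rangle_{in}, \qquad \forall\,\alpha, \beta, \Lambda
\end{aligned} \tag{9.27}$$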
As α, β run over arbitrary members of complete sets, this establishes (at least in the
context of perturbation theory!) the desired operator scalar field property (9.24) for the
fully interacting Heisenberg field φH . The extension of the equal-time commutativity
property (9.3) to full space-like commutativity, [φH(x), φH(y)] = 0, (x − y)² < 0, is
now a straightforward exercise (see Problem 2).
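The change of variables used above relies on the unimodularity of Λ, so that the measure d⁴z is unchanged. As an illustration only, the sketch below checks this for one concrete transformation, a boost along the x-axis with a hypothetical rapidity, and also confirms that the Minkowski interval is preserved; the metric signature (+, −, −, −) is assumed.

```python
import math

def boost_x(rapidity):
    """4x4 Lorentz boost along the x-axis acting on (t, x, y, z)."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0.0, 0.0],
            [sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def determinant(matrix):
    """Determinant by Laplace expansion along the first row (fine for a 4x4 matrix)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total

def transform(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]

def interval(vector):
    """Minkowski square t^2 - x^2 - y^2 - z^2."""
    return vector[0] ** 2 - vector[1] ** 2 - vector[2] ** 2 - vector[3] ** 2

if __name__ == "__main__":
    boost = boost_x(0.8)
    z = [1.0, 0.3, -2.0, 0.5]
    print(determinant(boost))                     # Jacobian of z -> Lambda z
    print(interval(z), interval(transform(boost, z)))
```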
The Heisenberg field φH (x) plays a fundamental role in field theory for a very sim-
ple reason: knowledge of its matrix elements is tantamount to a complete description
of the dynamics of the theory. In particular—and this is the primary topic of this
chapter—it contains all the information needed to reconstruct both the Heisenberg
in-states |α⟩_in and the out-states |β⟩_out, and therefore, the exact scattering matrix S_βα = _out⟨β|α⟩_in of the theory. In order to see how this comes about, it is convenient
to introduce some auxiliary fields which incorporate the kinematical structure of the
asymptotic states. We shall do this explicitly for the in-states, with the understanding
that the entire procedure can be carried through, mutatis mutandis, for the out-states
by the simple device of replacing “in” with “out” everywhere. We first note that
creation and annihilation operators can be defined on the multi-particle Fock space
H_in exactly as for the bare states |k1 .. kN⟩ in (6.29, 6.30)
$$a_{in}(k)\,|k_1, k_2, \ldots, k_N\rangle_{in} \equiv \sum_{r=1}^{N}(\pm)^{r-1}\,\delta^3(k - k_r)\,|k_1, \ldots, k_{r-1}, k_{r+1}, \ldots, k_N\rangle_{in} \tag{9.28}$$
We shall be concerned for the time being with the simplest case, a bosonic spinless
particle, so only positive signs need be taken in (9.28). A free field φin (x) can then
be defined in complete analogy to the interaction-picture field φ(x) in (6.73) (with
a = ac , as we are also restricting ourselves to a self-conjugate scalar field) by setting
$$\phi_{in}(x) = \frac{1}{(2\pi)^{3/2}}\int \frac{d^3k}{\sqrt{2E(k)}}\left(a_{in}(k)\,e^{-ik\cdot x} + a_{in}^\dagger(k)\,e^{ik\cdot x}\right) \tag{9.30}$$
We shall see later that the mass m appearing in the interaction-picture field φ (and in
the coefficient of φ2 in the interaction Hamiltonian density) and denoting the energy of
single-particle eigenstates of H0 at zero momentum is by no means guaranteed to agree
with the actual physical mass mph : i.e., the energy eigenvalue of the full Hamiltonian
H on a single particle in (or out) state with zero spatial momentum. On the other hand, the energy function E(k) appearing in (9.30) is given by √(k² + m_ph²), and the four-momenta appearing in the complex exponentials in the integral are on-mass-shell for the physical mass, k · k = m_ph². This implies that the in-field φin satisfies the free
Klein–Gordon equation, but with the physical mass:
The reader may easily verify that the spacetime translation of φin is implemented
by the full energy-momentum vector Pμ , of which the in-states in (9.28, 9.29) are
eigenstates:
So the in-field φin , despite satisfying a free field (i.e., covariant linear) equation,
is most definitely a Heisenberg picture operator. As indicated earlier, a Heisenberg
out-field φout (x) may be introduced in an exactly analogous fashion, starting from
creation and destruction operators for the out-states: it will likewise satisfy the free
Klein–Gordon equation (9.31) (replacing, of course, “in” with “out”). The Heisen-
berg field φH , defined by transformation from the free interaction-picture field φ,
on the other hand, certainly does not satisfy a free Klein–Gordon equation, as is
apparent from (9.22) giving its matrix elements: only the first, n = 0, term in the perturbative expansion of these matrix elements satisfies the Klein–Gordon equation (albeit with the mass m associated with the free Hamiltonian H0), while higher terms
involving insertions of the interaction Hamiltonian have a much more complicated
behavior.
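The role of the mass shell can be illustrated with a single plane-wave mode. The sketch below, restricted to one spatial dimension for brevity and using made-up values for the physical and bare masses, applies a finite-difference Klein–Gordon operator to exp(−i(Et − kx)) with E = √(k² + m_ph²): the residual vanishes (up to discretization error) only when the mass in the operator is the one defining the shell.

```python
import cmath
import math

def plane_wave(t, x, k, mass):
    """exp(-i k.x) with k on the mass shell of `mass` (one spatial dimension)."""
    energy = math.sqrt(k * k + mass * mass)
    return cmath.exp(-1j * (energy * t - k * x))

def klein_gordon_residual(t, x, k, wave_mass, operator_mass, step=1e-3):
    """|(d_t^2 - d_x^2 + operator_mass^2) f| for a plane wave f on the wave_mass shell."""
    def f(tt, xx):
        return plane_wave(tt, xx, k, wave_mass)
    d2_t = (f(t + step, x) - 2 * f(t, x) + f(t - step, x)) / step ** 2
    d2_x = (f(t, x + step) - 2 * f(t, x) + f(t, x - step)) / step ** 2
    return abs(d2_t - d2_x + operator_mass ** 2 * f(t, x))

if __name__ == "__main__":
    m_ph, m_bare, k = 1.3, 1.0, 0.7   # illustrative values only
    print(klein_gordon_residual(0.4, -0.2, k, m_ph, m_ph))    # close to zero
    print(klein_gordon_residual(0.4, -0.2, k, m_ph, m_bare))  # about |m_ph^2 - m_bare^2|
```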
It is apparent that knowledge of the φin and φout fields allows us to reconstruct the
complete set of in- and out-states, and hence their overlap, the S-matrix. This follows
from the formula (6.132) (Problem 8 in Chapter 6), appropriately generalized for the
in- (or out-)field case:
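$$a_{in}(k) = i\int d^3x\,\left\{g_k^*(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\,\phi_{in}(\vec{x},t)\right\} \tag{9.33}$$
$$a_{in}^\dagger(k) = -i\int d^3x\,\left\{g_k(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\,\phi_{in}(\vec{x},t)\right\} \tag{9.34}$$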
250 Dynamics VII: Interacting fields: general aspects
3 This expression is typically used in theoretical physics, as indeed here, as a euphemism for “mathe-
matically incorrect, but nevertheless suggestive”.
4 Although the steps that follow are mathematically unjustified, the result is in fact correct within
the context of renormalized perturbation theory, if we choose a split between the free and interacting
Hamiltonians which eliminates, order by order in perturbation theory, persistent interactions modifying the
mass and normalization of the single-particle states. In particular, we shall assume that the mass term in
H0 is taken to be the physical mass mph .
Field theory in Heisenberg representation: heuristics 251
sections the need for smearing (both fields and states) in any careful mathematical
treatment of field theory will be met head on and explicitly.
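Observe that the unitary rotation effecting the transformation from the interaction to
the Heisenberg picture,
$$\phi_H(\vec{x},t) = U^\dagger(t,0)\,\phi(\vec{x},t)\,U(t,0) \tag{9.37}$$
applies as well to the time-derivative of $\phi_H$. Using the equations of motion for the
time-development operator $U(t,0)$, namely $\frac{\partial}{\partial t}U(t,0) = -iV_{\rm ip}(t)U(t,0)$ and
$\frac{\partial}{\partial t}U^\dagger(t,0) = +iU^\dagger(t,0)V_{\rm ip}(t)$, we find
$$\frac{\partial}{\partial t}\phi_H(\vec{x},t) = U^\dagger(t,0)\Big(\frac{\partial\phi(\vec{x},t)}{\partial t} + i[V_{\rm ip}(t),\phi(\vec{x},t)]\Big)U(t,0) \tag{9.38}$$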
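The commutator term vanishes if $V_{\rm ip}(t) = \int d^3y\,\mathcal{H}_{\rm int}(\vec{y},t)$, provided the interaction
Hamiltonian density is an ultralocal scalar field, as discussed in Chapter 5:
$$[\mathcal{H}_{\rm int}(\vec{y},t),\phi(\vec{x},t)] = 0 \;\Rightarrow\; [V_{\rm ip}(t),\phi(\vec{x},t)] = 0 \tag{9.39}$$
so we have, defining (cf. Section 8.1) $\pi_H(\vec{x},t) \equiv \dot\phi_H(\vec{x},t)$, $\pi(\vec{x},t) \equiv \dot\phi(\vec{x},t)$,
$$\pi_H(\vec{x},t) = U^\dagger(t,0)\,\pi(\vec{x},t)\,U(t,0) \tag{9.40}$$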
Thus, both the Heisenberg field and its time-derivative are obtained as unitary trans-
forms of the corresponding interaction-picture free fields. This implies a corresponding
unitary relation between the operators a†H (k; t) and a† (k) built from these fields via
(9.35) and (9.36).
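In our earlier discussion of scattering theory in Chapter 4 (cf. (4.155, 4.156)), the
in-state $|\alpha\rangle_{\rm in}$ arises as a strong limit from the corresponding bare state $|\alpha\rangle$ via
$$|\alpha\rangle_{\rm in} = \lim_{t\to-\infty} U^\dagger(t,0)|\alpha\rangle \tag{9.41}$$
which suggests that the following limits obtain for the action of the interacting operator
$a^\dagger_H(k;t)$ on an arbitrary normalizable in-state $|\alpha\rangle_{\rm in}$:$^{5}$
$$\lim_{t\to-\infty} a^\dagger_H(k;t)|\alpha\rangle_{\rm in} = \lim_{t\to-\infty} U^\dagger(t,0)\,a^\dagger(k)\,U(t,0)|\alpha\rangle_{\rm in} \tag{9.42}$$
$$= \lim_{t\to-\infty} U^\dagger(t,0)|k,\alpha\rangle \tag{9.43}$$
$$= |k,\alpha\rangle_{\rm in} \tag{9.44}$$
$$= a^\dagger_{\rm in}(k)|\alpha\rangle_{\rm in} \tag{9.45}$$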
with a similar result for the action of the operators ain (k), aH (k; t). As the Heisenberg
field φH and the in-field φin are constructed in a precisely analogous fashion from their
corresponding creation and destruction parts, this line of argument suggests that the
5 Were a(k) a bounded operator, this, together with the unitarity (and hence automatic boundedness) of
U and U † would justify going from (9.42) to (9.43). As we shall see below, it is not, so again this argument
is suggestive, not rigorous. A mathematically correct theory of scattering will be given in Section 9.3.
Heisenberg field approximates the free in-field φin in the far past. A completely anal-
ogous argument leads to the corresponding result for the far future: for t → +∞, the
action of φH is equivalent to that of the free-field φout . In other words, the Heisenberg
field φH interpolates between the free fields φin , φout describing far-separated free
particles in the distant past and similar sets of far separated free particles in the far
future: for this reason, the Heisenberg field is sometimes referred to as an interpolating
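field for the particle in question. Roughly speaking, we may write
$$\phi_H(x) \to \phi_{\rm in}(x)\quad (t\to-\infty), \qquad \phi_H(x) \to \phi_{\rm out}(x)\quad (t\to+\infty) \tag{9.46}$$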
although we must caution the reader that we are here glossing over important
mathematical subtleties, which will be fully dealt with in our treatment of Haag–
Ruelle scattering theory in Section 9.3.
We conclude this section by noting an important connection between the S-matrix,
defined previously as the set of overlap amplitudes between in- and out-states, and
the unitary operator implementing the mapping between in- and out-creation and
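annihilation operators. Defining a unitary operator $S$ by
$$S|\alpha\rangle_{\rm out} = |\alpha\rangle_{\rm in} \tag{9.47}$$
we see that our scattering amplitudes $S_{\beta\alpha} \equiv {}_{\rm out}\langle\beta|\alpha\rangle_{\rm in}$ are just the matrix elements
of $S$ in the in-basis:
$${}_{\rm in}\langle\beta|S|\alpha\rangle_{\rm in} = {}_{\rm out}\langle\beta|S^\dagger S|\alpha\rangle_{\rm in} = {}_{\rm out}\langle\beta|\alpha\rangle_{\rm in} = S_{\beta\alpha} \tag{9.48}$$
Furthermore,
$$S\,a^\dagger_{\rm out}(k)|\alpha\rangle_{\rm out} = S|k,\alpha\rangle_{\rm out} = |k,\alpha\rangle_{\rm in} = a^\dagger_{\rm in}(k)|\alpha\rangle_{\rm in} = a^\dagger_{\rm in}(k)\,S|\alpha\rangle_{\rm out} \tag{9.49}$$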
This implies that the in- and out-fields (cf. (9.30), together with the analogous
expression for the out-field) are similarly related by a unitary rotation implemented
by the S-operator:
It should be emphasized that the results (9.47–9.52) rely only on asymptotic com-
pleteness, and make absolutely no reference to either the interaction picture or the
perturbative expansion thereof.
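The statement that the overlaps ${}_{\rm out}\langle\beta|\alpha\rangle_{\rm in}$ coincide with the matrix elements of $S$ in the in-basis is pure linear algebra, and can be illustrated in a finite-dimensional toy model. In the sketch below the two-dimensional "in" and "out" bases are arbitrary hypothetical choices (columns of two unitary matrices), and $S$ is built as $\sum_\alpha |\alpha\rangle_{\rm in}\,{}_{\rm out}\langle\alpha|$; nothing here models actual scattering dynamics.

```python
import cmath
import math


def unitary(theta, phase):
    """A 2x2 unitary matrix; its columns serve as an orthonormal basis."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phase)
    return [[complex(c), -s * e], [s * e.conjugate(), complex(c)]]


def dagger(m):
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def column(m, j):
    return [m[0][j], m[1][j]]


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]


def inner(u, v):
    """<u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


V_IN = unitary(0.3, 0.5)     # columns: toy |alpha>_in (hypothetical)
V_OUT = unitary(1.1, -0.2)   # columns: toy |alpha>_out (hypothetical)
S = matmul(V_IN, dagger(V_OUT))  # S = sum_alpha |alpha>_in out<alpha|


def overlap(beta, alpha):
    """S_{beta alpha} = out<beta|alpha>_in."""
    return inner(column(V_OUT, beta), column(V_IN, alpha))


def s_in_basis(beta, alpha):
    """in<beta| S |alpha>_in."""
    return inner(column(V_IN, beta), apply(S, column(V_IN, alpha)))


max_mismatch = max(abs(overlap(b, a) - s_in_basis(b, a))
                   for a in range(2) for b in range(2))
# S maps each out-state to the in-state with the same label.
map_defect = max(abs(x - y)
                 for a in range(2)
                 for x, y in zip(apply(S, column(V_OUT, a)), column(V_IN, a)))
print(max_mismatch, map_defect)
```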
Field theory in Heisenberg representation: axiomatics 253
6 Here, “non-trivial perturbation” can be as innocent as a shift in the mass of a free field, as we shall
see in Section 10.5.
account essentially all possibilities (in particular, the situations encountered in the
Standard Model of modern particle physics), will follow in Section 9.6. We shall state
the axioms of our theory, divided into the three groups outlined above, with each
axiom followed by (sometimes extensive!) explanatory comments.
We begin with the State axioms, specifying features of the state space of the
theory:
1. Axiom Ia: The state space $\mathcal{H}$ of the system is a separable Hilbert space. It carries
a unitary representation $U(\Lambda,a)$ ($\Lambda$ an element of the HLG, $a$ a coordinate four-vector)
of the proper inhomogeneous Lorentz group (i.e., the Poincaré group).
Thus, for all $|\alpha\rangle \in \mathcal{H}$, $|\alpha\rangle \to U(\Lambda,a)|\alpha\rangle \in \mathcal{H}$, with the $U(\Lambda,a)$ satisfying the
Poincaré group composition rule $U(\Lambda_1,a_1)U(\Lambda_2,a_2) = U(\Lambda_1\Lambda_2, a_1 + \Lambda_1 a_2)$ (cf. 5.14).
Comments: Our Hilbert space H is a countable direct sum of multi-particle
spaces corresponding to a definite number of particles. The multi-particle space
corresponding to a fixed finite number of particles is a finite tensor product of
separable L2 spaces, each with a countable basis, and is therefore itself separable.
The separability of $\mathcal{H}$ follows trivially. The reader is free to visualize $\mathcal{H}$ as the
space of in-states $\mathcal{H}_{\rm in}$ (or out-states, $\mathcal{H}_{\rm out}$) described in the preceding section,
with the action of the $U(\Lambda,a)$ given by $e^{iP\cdot a}U_H(\Lambda)$ with $P^\mu$, $U_H(\Lambda)$ as defined
in (9.7, 9.9).
2. Axiom Ib: The infinitesimal generators $P_\mu$ of the translation subgroup $T(a) =
U(1,a)$ of the Poincaré group have a spectrum $p_\mu$ restricted to the forward light-cone,
$p^0 \ge 0$, $p^2 \ge 0$.
Comments: In accordance with our intuition of asymptotic completeness—
that all Heisenberg states of the system correspond to field disturbances which
eventually resolve into a finite number of well-separated stable particles of finite
energy, and with individual four-momenta on or within the forward light-cone—
the total energy-momentum pμ of any state of the system must be resolvable into
a sum of four-vectors, each of which is within the forward light-cone, and must
therefore itself lie in this region.
3. Axiom Ic: There is a unique state $|0\rangle$ (up to a unimodular phase, of course),
called the “vacuum”, with the isolated eigenvalue $p_\mu = 0$ of $P_\mu$. It is unit
normalized, $\langle 0|0\rangle = 1$.
Comments: In particular, we assume the absence of spontaneous symmetry-
breaking, as discussed in Section 8.4. The need to accommodate simultaneously
a discretely normalized vacuum and continuously normalized multi-particle states
in an infinite-volume interacting theory will be seen later to be intimately
connected with the difficulties associated with the interaction picture and the
famous “Haag’s theorem” (Section 10.5).
4. Axiom Id: The theory has a mass gap: the squared-mass operator $P^2 = P_\mu P^\mu$
has an isolated eigenvalue $m^2 > 0$, and the spectrum of $P^2$ is empty between 0
and $m^2$. The subspace $\mathcal{H}_1$ of $\mathcal{H}$ corresponding to the eigenvalue $m^2$ carries an
irreducible spin-0 representation of the HLG. These are the single-particle states
of the theory. The remaining spectrum of $P^2$ is continuous, and begins at $(2m)^2$.
Comments: This axiom specifically excludes quantum electrodynamics with a
massless photon (the only known massless particle): indeed, the structure of the
state space and the asymptotic dynamics, and in particular the definition of
Fig. 9.1 The spectrum of the energy-momentum operator $P^\mu$ ($P^0$ vertical axis, arbitrary
spatial component $P^i$ horizontal axis, $m = 2$). The dashed lines enclose the support region
for the function $\tilde f^{(1)}(p)$ introduced in Section 9.3, with $a = \frac{1}{2}$, $b = 2$.
where
$$\tilde f(\vec{p}) \equiv \frac{1}{(2\pi)^{3/2}\sqrt{2E(p)}}\int f(x)\,e^{iE(p)x^0 - i\vec{p}\cdot\vec{x}}\,d^4x \tag{9.55}$$
Standard theorems ensure that the Fourier transform function $\tilde f$ is also a function
(of the spatial momentum vector $\vec{p}$) of Schwartz type. A dense subset of $\mathcal{H}$ can be
obtained by considering all normalizable n-particle states of the form
$$|g_1,g_2,\ldots,g_n\rangle = \int d^3q_1\,d^3q_2\cdots d^3q_n\; g_1(\vec{q}_1)\cdots g_n(\vec{q}_n)\,|\vec{q}_1,\vec{q}_2,\ldots,\vec{q}_n\rangle \tag{9.56}$$
where $\int d^3q\,|g_i(\vec{q})|^2 < \infty$, $i = 1,2,\ldots,n$ (for example, the $g_i$ may be chosen from a
complete basis of $L^2$). The reader may easily verify (see Problem 3) that the state
$\phi_f|g_1,g_2,\ldots,g_n\rangle$ has finite norm. It is also easy to see that a further application of a
second smeared field, $\phi_{f_1}$ say, still produces a finite norm state (thus, $\phi_f D \subset D$).
That the smeared fields are unbounded operators is also unsurprising if we recall
the discussion of Section 8.3, where normalized coherent states were constructed
with arbitrary (and therefore, arbitrarily large) expectation values of the field.
Note that smearing of the original local field $\phi(x)$ is unnecessary if we are only
concerned with matrix elements $\langle\beta|\phi(x)|\alpha\rangle$ with $|\alpha\rangle, |\beta\rangle \in D$, which are perfectly
well-defined functions of $x$. However, we must frequently consider the states (in
the Hilbert space) obtained by sequential application of field operators, in which
case the smeared (or, in the language of Haag, “almost local”) operators $\phi_f$ are
unavoidable. Finally, we note that in keeping with our restriction to a system of
a single self-conjugate particle, the hermiticity of $\phi(x)$ translates to the following
property for the smeared fields (for $f(x)$ in general complex):
$$\phi_f^\dagger = \phi_{f^*} \tag{9.57}$$
7 The space of Schwartz test functions consists of $C^\infty$ (i.e., infinitely many times continuously differentiable)
functions of fast decrease (i.e., falling faster than any power as the spacetime coordinates go to ∞).
$$\langle 0|\phi_{f_1}\cdots\phi_{f_n}|0\rangle = \int f_1(x_1)\cdots f_n(x_n)\,\langle 0|\phi(x_1)\cdots\phi(x_n)|0\rangle\,d^4x_1\cdots d^4x_n$$
$$= \int f_1(x_1)\cdots f_n(x_n)\,W(x_1,\ldots,x_n)\,d^4x_1\cdots d^4x_n \tag{9.58}$$
Again, the situation with free fields motivates the assumption that the distributions
$W(x_1,x_2,\ldots,x_n)$ are tempered: namely, continuous linear functionals on
the space of fast-falling Schwartz test functions in the 4n-dimensional space of
the combined spacetime coordinates $x_1,x_2,\ldots,x_n$. We remind the reader$^{8}$ that
a tempered distribution $W(z)$ (here $z$ is a vector in the combined coordinate
spacetime of all the test functions in (9.58)) can always be written as a finite
derivative of a polynomially bounded continuous function:
immediately leads to (9.61) after multiplying both sides by f (x), integrating over
the four-coordinate x, and appropriate changes of variable in the integration. The
reader may also verify that $U(\Lambda,a)\phi_f = \phi_{f_{\Lambda,a}}U(\Lambda,a)$, following directly from
(9.61), implies that the domain $D$ is Poincaré-invariant, namely $U(\Lambda,a)D = D$,
as $f_{\Lambda,a}$ is of Schwartz type if $f$ is. The infinitesimal generators of translations are
just the energy-momentum operators whose spectral properties are delineated in
the first set of axioms above: thus, $U(1,a) = e^{iP\cdot a}$, while the general Poincaré
group element can be expressed $U(\Lambda,a) = U(1,a)U(\Lambda,0) = U(1,a)U(\Lambda)$, with
8 For a review of the essential facts concerning tempered distributions, see (Streater and Wightman,
1978).
U (Λ) the generators of the HLG (just the UH (Λ) of the preceding Section).
The transformation law for a general (unsmeared) covariant field, given as a
set of component fields φn (x) transforming according to a finite-dimensional
representation M (Λ) of the HLG, is a straightforward generalization of (9.62)
(cf. (7.5)):
The Poincaré group composition rule implies $U(\Lambda)U(1,a)U^\dagger(\Lambda) = U(1,\Lambda a)$,
whence, using $U(1,a) = e^{iP\cdot a}$ and taking $a$ infinitesimal, we obtain
Taking Λσρ = gσρ + ωσρ in (9.66), a short calculation gives the commutation
relations between the generators of translations (the energy-momentum four-
vector) and those of the HLG:
The full Lie algebra of the Poincaré group is completed by observing that the
subgroup of translations is abelian, so the generators P μ commute with each
other:
[P ρ , P σ ] = 0 (9.68)
Comments: If the fields in question are fermionic, then, of course, the commu-
tator in (9.69) should be replaced by an anticommutator. This requirement is an
obvious consequence of our previous assumption of space-like commutativity at
the level of unsmeared fields, e.g., (7.9). The restriction to compact supports for
the smearing fields is not strictly necessary: it suffices that the supports of f1
and f2 are non-overlapping and mutually space-like.
4. Axiom IId: The set of states obtained by applying arbitrary polynomials in the
smeared fields $\phi_f$ (with all possible Schwartz functions $f$) to the vacuum state $|0\rangle$
(cf. Axiom Ic) is dense in the Hilbert space $\mathcal{H}$. This axiom is sometimes expressed
as the “cyclicity of the vacuum”.
Comments: In other words, an arbitrary physical state can be approximated
arbitrarily well by linear combinations of states of the form $\phi_{f_1}\phi_{f_2}\cdots\phi_{f_n}|0\rangle$
(including, of course, the case n = 0, i.e., the vacuum itself). This so-called
Cyclicity axiom incorporates our desire that the complete dynamical information
of the theory is contained in the single field φ. Of course, in theories containing
several different types of particles (which are not bound states of each other,
for example), all such independent fields must be allowed to act on the vacuum
to produce the full Hilbert space of the theory. For the time being though, as
emphasized above, we wish to focus on the simplest possible case, that of a single
spinless particle associated with a single scalar field. The reader may easily verify
that for free fields, a dense set of normed states of the form (9.56) can indeed be
obtained by applying smeared (free) fields to the vacuum. Axiom IId is sometimes
formulated in a different but equivalent way: as the statement that the set of
(appropriately smeared) fields φf are irreducible—or in other words, that the only
bounded operator that commutes with all the φf is a multiple of the identity.
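The Poincaré composition rule $U(\Lambda_1,a_1)U(\Lambda_2,a_2) = U(\Lambda_1\Lambda_2, a_1+\Lambda_1a_2)$ of Axiom Ia, and its consequence $U(\Lambda)U(1,a)U^\dagger(\Lambda) = U(1,\Lambda a)$, mirror the way the pairs $(\Lambda,a)$ act on spacetime points, $x \to \Lambda x + a$. A minimal sketch of that underlying group law follows; it works with the defining action on four-vectors rather than with the Hilbert-space operators themselves, and the particular boosts, rotations and translations are arbitrary hypothetical values.

```python
import math


def boost_z(rapidity):
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, 0, 0, sh], [0, 1, 0, 0], [0, 0, 1, 0], [sh, 0, 0, ch]]


def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def distance(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ZERO = [0.0, 0.0, 0.0, 0.0]


def compose(g1, g2):
    """(L1, a1)(L2, a2) = (L1 L2, a1 + L1 a2)."""
    (l1, a1), (l2, a2) = g1, g2
    return matmul(l1, l2), add(a1, matvec(l1, a2))


def act(g, x):
    """x -> L x + a."""
    lam, a = g
    return add(matvec(lam, x), a)


g1 = (matmul(boost_z(0.4), rotation_z(0.9)), [0.5, -1.0, 2.0, 0.3])
g2 = (matmul(rotation_z(-0.2), boost_z(-1.1)), [1.5, 0.2, -0.7, 0.4])
x = [0.1, 0.2, 0.3, 0.4]
composition_defect = distance(act(compose(g1, g2), x), act(g1, act(g2, x)))

# Conjugating a translation by a Lorentz transformation: (L,0)(1,a)(L^-1,0) = (1, L a).
lam = matmul(boost_z(0.7), rotation_z(0.3))
lam_inv = matmul(rotation_z(-0.3), boost_z(-0.7))
a = [0.2, 1.0, -0.4, 0.6]
conj = compose(compose((lam, ZERO), (IDENTITY, a)), (lam_inv, ZERO))
conjugation_defect = distance(conj[1], matvec(lam, a))
identity_defect = max(abs(conj[0][i][j] - IDENTITY[i][j])
                      for i in range(4) for j in range(4))
print(composition_defect, conjugation_defect, identity_defect)
```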
Before stating the axioms of the third category—those necessary to complete the
connection between particles and fields—we shall discuss two fundamental results
which already follow from the axioms previously stated: specifically, from a combi-
nation of the spectral and locality properties of the theory. The first result concerns
the analyticity properties of the Wightman distributions: indeed, we have already
seen in Section 6.6 that a close connection exists between locality and analyticity of
amplitudes in field theory. The second result is the Ruelle Clustering theorem, which is
the precise correlate, in field-theoretic terms, of the clustering property of the S-matrix
which was one of our primary motivations in developing the formalism of local field
theory (cf. Chapter 6). The exact connection will become apparent after our discussion
of Haag–Ruelle scattering theory in the next Section.
The analyticity domain of the Wightman functions follows directly from the trans-
lation property IIb, together with the spectral properties of the energy-momentum
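operator $P$ expressed in Axioms Ib and Ic. Thus, writing $\phi(x_i) = e^{iP\cdot x_i}\phi(0)e^{-iP\cdot x_i}$,
we may write (defining $\xi_i \equiv x_i - x_{i+1}$)
$$W(x_1,x_2,\ldots,x_n) = \langle 0|\phi(0)e^{-iP\cdot(x_1-x_2)}\phi(0)e^{-iP\cdot(x_2-x_3)}\cdots e^{-iP\cdot(x_{n-1}-x_n)}\phi(0)|0\rangle$$
$$= \langle 0|\phi(0)e^{-iP\cdot\xi_1}\phi(0)e^{-iP\cdot\xi_2}\phi(0)\cdots e^{-iP\cdot\xi_{n-1}}\phi(0)|0\rangle \equiv \mathcal{W}(\xi_1,\ldots,\xi_{n-1}) \tag{9.70}$$
The function $\mathcal{W}$ has the Fourier representation
$$\mathcal{W}(\xi_1,\ldots,\xi_{n-1}) = \int \tilde{\mathcal{W}}(k_1,\ldots,k_{n-1})\,e^{-i\sum_{i=1}^{n-1}k_i\cdot\xi_i}\prod_{i=1}^{n-1}\frac{d^4k_i}{(2\pi)^4} \tag{9.71}$$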
Inserting (9.70) in the inverse Fourier formula for W̃ and performing the ξi integrals,
we find
$$\zeta_i = \xi_i - i\eta_i \tag{9.73}$$
with all $\eta_i$ positive time-like ($\eta_i^2 > 0$) (but the $\xi_i$ arbitrary real) therefore leads
to the already well-defined integral (9.71) acquiring an additional real exponential
damping factor (as $k_i\cdot\eta_i > 0$ for all $i$). As long as we stay away from coincident point
singularities therefore, we see that the Wightman functions are analytic in this
multi-dimensional complex domain, called the forward tube $\mathcal{T}_{n-1}$. An extremely important
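theorem of axiomatic field theory, due to Hall and Wightman (Hall and Wightman,
1957), shows that a further analytic continuation is possible, by the simple device
of extending the Lorentz-invariance property for real coordinates and real Lorentz
transformations $\Lambda$
$$\mathcal{W}(\Lambda\xi_1,\ldots,\Lambda\xi_{n-1}) = \mathcal{W}(\xi_1,\ldots,\xi_{n-1}) \tag{9.74}$$
to complex Lorentz transformations (i.e., complex $4\times 4$ matrices $\Lambda$ with $\det(\Lambda) = 1$ and
$\Lambda g\Lambda^T = g$), and defining the analytic extension to the extended tube $\mathcal{T}'_{n-1}$ consisting
of points of the form $(\Lambda\zeta_1,\ldots,\Lambda\zeta_{n-1})$, $\zeta_i \in \mathcal{T}_{n-1}$ by
$$\mathcal{W}(\Lambda\zeta_1,\ldots,\Lambda\zeta_{n-1}) = \mathcal{W}(\zeta_1,\ldots,\zeta_{n-1}), \quad (\Lambda\zeta_1,\ldots,\Lambda\zeta_{n-1}) \in \mathcal{T}'_{n-1} \tag{9.75}$$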
This is obvious given that, by the Hall–Wightman theorem, the Wightman functions
depend only on the dot-products ζi · ζj , but can also be seen by the fact that we can
connect the unit Lorentz transformation Λ = 1 to the spacetime reflection Λ = −1
(with det(Λ) = +1!) by a continuous analytic path Λ(θ), θ : 0 → π in the complex
Lorentz group, by taking
$$\Lambda^\mu{}_\nu(\theta) = \begin{pmatrix} \cos\theta & 0 & 0 & -i\sin\theta \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ -i\sin\theta & 0 & 0 & \cos\theta \end{pmatrix} = e^{i\theta(J_3 - iK_3)} = e^{2i\theta A_3} \tag{9.77}$$
The reader should check that this is indeed a Lorentz transformation (i.e., satisfies
$\det(\Lambda) = +1$, $\Lambda^T g\Lambda = g$)! It is complex insofar as the boost angle (coefficient of $iK_3$ in
the exponent) is imaginary. The result (9.76), involving a reversal of both the temporal
(T) and spatial (P) coordinates, will turn out to be a crucial part of the demonstration
of the TCP theorem in the very general framework of axiomatic field theory given
in Section 13.4. An obvious generalization of (9.76) to the Wightman function for a
product $\phi^{A_1B_1}(x_1)\phi^{A_2B_2}(x_2)\cdots\phi^{A_nB_n}(x_n)$ of fields in arbitrary representations of the
HLG follows directly from (9.77):
$$\mathcal{W}^{A_1B_1,A_2B_2,\ldots,A_nB_n}(\zeta_1,\ldots,\zeta_{n-1}) = (-1)^{\sum_i 2A_i}\,\mathcal{W}^{A_1B_1,A_2B_2,\ldots,A_nB_n}(-\zeta_1,\ldots,-\zeta_{n-1}) \tag{9.78}$$
by analytic continuation from $\theta = 0$ to $\theta = \pi$ in (9.77), and recalling that the
3-component of an angular momentum $A_{i3}$ differs from the $A_i$ value by an integer.$^{9}$
Using local commutativity (Axiom IIc), it can be also shown that the analyticity
domains in the difference variables ξ1 , .., ξn−1 for different permutations of the original
arguments x1 , . . . xn can be connected to obtain analyticity in a permuted extended
tube (see (Streater and Wightman, 1978) for further details). A particular case of great
importance is the imaginary time continuation of the Wightman functions, yielding
the Euclidean Schwinger functions of the theory
$$S(x_1,x_2,\ldots,x_n) \equiv W((\vec{x}_1,-ix_1^4),(\vec{x}_2,-ix_2^4),\ldots,(\vec{x}_n,-ix_n^4)) \tag{9.79}$$
where the $x_i^\alpha$, $\alpha = 1,2,3,4$ are real. Unlike the Wightman functions, the Schwinger
functions are permutation symmetric in their arguments (the Wightman functions
involve field operators which do not commute at time-like separation). This can
be shown directly from the axioms, but will become essentially trivial in the path-
integral formalism, so we shall postpone further discussion of the Schwinger functions
to Section 10.3, where we shall explore the properties of functional integrals in field
theory.
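The check of (9.77) requested of the reader can also be done numerically. The following sketch (an illustration, not the author's method) verifies for a few values of $\theta$ that the matrix satisfies $\Lambda^T g\Lambda = g$ and $\det\Lambda = 1$, and that it reaches the full spacetime reflection $\Lambda = -1$ at $\theta = \pi$. The metric is taken as $g = {\rm diag}(1,-1,-1,-1)$; helper names are hypothetical.

```python
import math
from itertools import permutations

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
MINUS_ONE = [[-1 if i == j else 0 for j in range(4)] for i in range(4)]


def complex_lorentz(theta):
    """The matrix of (9.77): a rotation in the 1-2 plane and an imaginary boost in the 0-3 plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0, 0, -1j * s],
            [0, c, s, 0],
            [0, -s, c, 0],
            [-1j * s, 0, 0, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [[m[j][i] for j in range(4)] for i in range(4)]


def determinant(m):
    """Leibniz expansion; adequate for a 4x4 matrix."""
    total = 0
    for perm in permutations(range(4)):
        inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


def metric_defect(theta):
    lam = complex_lorentz(theta)
    return max_abs_diff(matmul(transpose(lam), matmul(G, lam)), G)


thetas = [k * math.pi / 8 for k in range(9)]
worst_metric_defect = max(metric_defect(t) for t in thetas)
worst_det_defect = max(abs(determinant(complex_lorentz(t)) - 1) for t in thetas)
endpoint_defect = max_abs_diff(complex_lorentz(math.pi), MINUS_ONE)
print(worst_metric_defect, worst_det_defect, endpoint_defect)
```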
We turn now to the issue of clustering—in particular, the result originally obtained
by Ruelle (Ruelle, 1962) on the large distance asymptotics of the Wightman functions.
In order to state this result, we need to introduce the concept of a smeared field
localized around a point $x$, $\phi_f(x)$. Our original definition, $\phi_f \equiv \int f(x)\phi(x)\,d^4x$, with
$f(x)$ a function falling faster than any power as $x$ moves away from the origin $x = 0$,
should be regarded as producing a smeared field localized in a very small, but finite,
region near $x = 0$. A corresponding field (or product of fields; see below) localized
9 A different choice of complex continuation from $\Lambda = 1$ to $\Lambda = -1$ could be made, resulting in the factor
$(-1)^{\sum_i 2B_i}$. This is, however, identical to $(-1)^{\sum_i 2A_i}$ as the spin $j_i$ associated with each field differs from
$A_i + B_i$ by an integer, and the vacuum expectation value of a product of fields can only differ from zero if
the spins they carry can couple to zero.
around an arbitrary coordinate x (in Haag’s language (Haag, 1992), an almost local
field) can be obtained by the standard process of translation:
$$\phi_f(x) \equiv e^{iP\cdot x}\,\phi_f\,e^{-iP\cdot x} = \int f(y-x)\,\phi(y)\,d^4y \tag{9.80}$$
In fact, we can define more general types of smeared fields as smeared polynomials in
the basic field $\phi(x)$. For example, taking $f(x_1,x_2)$ to be a Schwartz function of rapid
decrease in the pair of coordinates $(x_1,x_2)$, we can define a smeared bilocal operator,
localized again in the neighborhood of the origin, by
$$\phi_f \equiv \int f(x_1,x_2)\,\phi(x_1)\phi(x_2)\,d^4x_1\,d^4x_2 \tag{9.81}$$
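carries a precise four-momentum: it alters the momentum of any state on which it acts
by precisely $p$. Indeed, if $|\alpha\rangle$, $|\beta\rangle$ are states of well-defined four-momentum $P_\alpha$, $P_\beta$, then
evidently
$$\langle\beta|\int\phi_f(x)\,e^{-ip\cdot x}\,d^4x\,|\alpha\rangle = \int e^{-ip\cdot x}\,\langle\beta|e^{iP\cdot x}\phi_f e^{-iP\cdot x}|\alpha\rangle\,d^4x$$
$$= \int e^{+i(P_\beta - P_\alpha - p)\cdot x}\,\langle\beta|\phi_f|\alpha\rangle\,d^4x \;\propto\; \delta^4(P_\beta - (P_\alpha + p)) \tag{9.83}$$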
This result depends on the fast decrease property of the smearing functions f1 , f2 as
well as the tempered distribution character of the Wightman distributions. We shall
explain the proof in somewhat greater detail than usual in order to give the reader
some idea of the flavor of the reasoning in axiomatic quantum field theory. First, note
that
$$\langle 0|[\phi_{f_1}(-a),\phi_{f_2}(+a)]|0\rangle = \int f_1(x_1)f_2(x_2)\,\langle 0|[\phi(x_1-a),\phi(x_2+a)]|0\rangle\,d^4x_1\,d^4x_2 \tag{9.85}$$
The estimates we shall need rely on introducing a completely Euclidean metric on the
multiple coordinate space of all the fields under consideration. Thus, we shall define
eight-vectors $z = (x_1,x_2)$, $\alpha = (a,-a)$, with the Euclidean norm $|z|_E^2 = \sum_\mu\big((x_1^\mu)^2 +
(x_2^\mu)^2\big)$, $|\alpha|_E^2 = 2|a|^2$. The product of smearing functions $f_1$, $f_2$ is likewise a fast falling
function of $z$: $F(z) \equiv f_1(x_1)f_2(x_2)$. The VEV of the commutator is the difference of
two Wightman distributions, $W(x_1-a, x_2+a) - W(x_2+a, x_1-a)$, which we shall
write as a single tempered distribution of the eight-dimensional variable $z - \alpha$:
$$\langle 0|[\phi_{f_1}(-a),\phi_{f_2}(+a)]|0\rangle = \int F(z)\,W(z-\alpha)\,d^8z \tag{9.86}$$
A little geometric reasoning shows that if $|z|_E < |a|$, the spacetime points $x_1 - a$ and
$x_2 + a$ for the local fields appearing in the commutator on the right-hand side of (9.85)
must be space-like, so that the integrand vanishes unless $|z|_E \ge |a|$: accordingly, the
polynomially-bounded function $\mathcal{P}$, and hence the integrand as a whole, must vanish
except for $|z|_E > |a|$:
$$|\langle 0|[\phi_{f_1}(-a),\phi_{f_2}(+a)]|0\rangle| = \Big|\int_{|z|_E>|a|}\mathcal{P}(z-\alpha)\,D^mF(z)\,d^8z\Big| \tag{9.89}$$
The fast decrease property of the smearing functions implies that for any power $N$
there exists a constant $C_2$ such that
$$|D^mF(z)| < C_2\,|z|_E^{-(N+2p+9)}\,(1+|z|_E^2)^{-p} \tag{9.90}$$
where $p$ is the power already appearing in (9.87). Inserting (9.90) in (9.89), and using
$d^8z = S_8|z|_E^7\,d|z|_E$, with $S_8$ the surface area of the eight-dimensional unit sphere,
$$|\langle 0|[\phi_{f_1}(-a),\phi_{f_2}(+a)]|0\rangle| < C_1C_2S_8\,(1+|\alpha|_E^2)^p\,\Big|\int_{|z|_E>|a|}|z|_E^{-N-2p-2}\,d|z|_E\Big|$$
$$< \frac{C_1C_2S_8}{N+2p+1}\,(1+2|a|^2)^p\,|a|^{-N-2p-1} \tag{9.91}$$
which establishes the desired behavior for large $|a|$ as promised in (9.84).
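Two steps of this proof lend themselves to a quick numerical sanity check. The sketch below assumes that $a$ is a purely spatial displacement, $a = (0,\vec{a})$ (an assumption made for the illustration), samples random eight-vectors $z = (x_1,x_2)$ with $|z|_E < |a|$, and confirms that $x_1 - a$ and $x_2 + a$ are then always space-like separated. It also evaluates the right-hand side of (9.91), with the unknown constant $C_1C_2S_8$ set to 1, to show that it falls faster than $|a|^{-N}$. All names and sample values are hypothetical.

```python
import math
import random


def euclid_norm(v):
    return math.sqrt(sum(c * c for c in v))


def minkowski_sq(v):
    """v.v with signature (+,-,-,-)."""
    return v[0] ** 2 - sum(c * c for c in v[1:])


def random_z(radius, rng):
    """Random eight-vector z = (x1, x2) with Euclidean norm below radius."""
    direction = [rng.gauss(0.0, 1.0) for _ in range(8)]
    scale = 0.999 * radius * rng.random() / euclid_norm(direction)
    return [scale * c for c in direction]


def all_spacelike(a, samples=20000, seed=1):
    """True if (x1 - a) - (x2 + a) is space-like for every sampled z with |z|_E < |a|."""
    rng = random.Random(seed)
    radius = euclid_norm(a)
    for _ in range(samples):
        z = random_z(radius, rng)
        x1, x2 = z[:4], z[4:]
        separation = [(x1[m] - a[m]) - (x2[m] + a[m]) for m in range(4)]
        if minkowski_sq(separation) >= 0:
            return False
    return True


def commutator_bound(a_norm, n, p, c1c2s8=1.0):
    """Right-hand side of (9.91), with C1*C2*S8 replaced by c1c2s8."""
    return (c1c2s8 / (n + 2 * p + 1)
            * (1 + 2 * a_norm ** 2) ** p * a_norm ** (-n - 2 * p - 1))


A = [0.0, 3.0, -1.0, 2.0]  # assumed purely spatial displacement
spacelike_ok = all_spacelike(A)
# |a|^N times the bound should still decrease as |a| grows (here N = 3, p = 2).
scaled = [r ** 3 * commutator_bound(r, 3, 2) for r in (10.0, 100.0, 1000.0)]
print(spacelike_ok, scaled)
```

The sampling is of course no substitute for the short geometric argument: with $|\Delta t| + |\Delta\vec{x}| \le 2|z|_E$ for $\Delta = x_1 - x_2$, the spatial part of the separation has length at least $2|\vec{a}| - |\Delta\vec{x}| > |\Delta t|$.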
We now turn to a discussion of the clustering properties of our field theory. In
analogy to the discussion of connected parts of the S-matrix in Sections 6.1 and 6.2,
$$W(x_1) = W^c(x_1)$$
$$W(x_1,x_2) = W^c(x_1)W^c(x_2) + W^c(x_1,x_2), \ \text{etc.} \tag{9.95}$$
shows us that the full amplitude corresponding to the vacuum expectation value of
a product of fields has a cluster decomposition into a sum of terms in which the
fields are partitioned in all possible ways into clusters, with the fields in each cluster
appearing inside a connected matrix element. Connected versions of the vacuum-
expectation-values of the corresponding smeared (“almost local”) operators will be
defined similarly, thus
$$\langle 0|\phi_f(x_1)\phi_f(x_2)|0\rangle_c \equiv \langle 0|\phi_f(x_1)\phi_f(x_2)|0\rangle - \langle 0|\phi_f(x_1)|0\rangle\langle 0|\phi_f(x_2)|0\rangle \tag{9.96}$$
and so on. The Ruelle Clustering theorem amounts to the intuitively plausible assertion
that the connected expectation values so defined decrease rapidly when the points of
localization of the fields are taken far apart spatially:
Theorem 9.1 For a set of $n$ spacetime coordinates $x_1, x_2, \ldots, x_n$ at equal time, $x_i =
(t,\vec{x}_i)$, with spatial diameter $d \equiv \max_{i,j}|\vec{x}_i - \vec{x}_j|$, for any positive $N$ there exists a
constant $C_N$ such that for large $d$
$$\langle 0|\phi_{f_1}(x_1)\phi_{f_2}(x_2)\cdots\phi_{f_n}(x_n)|0\rangle_c < C_N\,d^{-N} \tag{9.97}$$
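The recursive structure of the cluster decomposition (9.95), (9.96) can be made concrete in a commuting toy model, where the role of the vacuum expectation values is played by moments of classical random variables and the connected parts are the familiar cumulants. This is only an analogue: the variables commute, so the permutation-symmetry caveat for Wightman functions does not arise. The probability tables and function names are hypothetical.

```python
from itertools import combinations
from math import prod

# Variables 0 and 1 are correlated; variable 2 is independent of both.
PAIR = {(1, 1): 0.4, (1, -1): 0.1, (-1, 1): 0.1, (-1, -1): 0.4}
THIRD = {2: 0.3, -1: 0.7}
JOINT = [(p * q, (x0, x1, x2))
         for (x0, x1), p in PAIR.items()
         for x2, q in THIRD.items()]


def full(indices):
    """Full n-point function: the moment E[X_i1 ... X_in]."""
    return sum(weight * prod(values[i] for i in indices)
               for weight, values in JOINT)


def connected(indices):
    """Connected part, obtained by subtracting every term in which the
    cluster containing the first argument is a proper subset."""
    indices = tuple(indices)
    first, rest = indices[0], indices[1:]
    total = full(indices)
    for size in range(len(rest)):
        for chosen in combinations(range(len(rest)), size):
            cluster = (first,) + tuple(rest[i] for i in chosen)
            remainder = tuple(rest[i] for i in range(len(rest))
                              if i not in chosen)
            total -= connected(cluster) * full(remainder)
    return total


print(connected((0, 1)), connected((0, 2)), connected((0, 1, 2)))
```

For two arguments the recursion reduces to the subtraction in (9.96). Connected functions that mix the independent variable with the others vanish, which is the classical counterpart of the decoupling that the Clustering theorem asserts asymptotically for far-separated fields.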
10 In much of the axiomatic quantum field theory literature, the term “truncated” is used instead of
“connected” for the distributions defined in (9.92–9.94). In the interests of honesty, we should admit here
that the clustering expansion of (9.92, etc.) is, strictly speaking, well-defined only for n-point functions
that are permutation-symmetric on their arguments. We shall shortly be restricting ourselves to the special
case where the spacetime coordinates x1 , . . . , xn are far space-like separated, in which limit the necessary
symmetry of the Wightman functions is restored.
Clustering theorem in its full generality here,$^{11}$ but indicate (following the extremely
streamlined discussion of Haag, (Haag, 1992)) how the argument goes for the case
n = 2.
A crucial ingredient in the argument leading to clustering is the assumption made
in Axiom Id: the theory has a mass gap, with the minimum energy of states orthogonal
to the vacuum equal to m > 0. Choose positive numbers a, b such that 0 < a < b < m.
Define the function $F_+(p_0)$ on the energy variable $p_0$ (for a four-vector $p$) as
$$F_+(p_0) \equiv \frac{e^{-K/(p_0-a)^2}}{e^{-K/(p_0-a)^2} + e^{-K/(p_0-b)^2}}, \quad a < p_0 < b \tag{9.98}$$
$$\equiv 0, \quad p_0 \le a \tag{9.99}$$
$$\equiv 1, \quad p_0 \ge b \tag{9.100}$$
which clearly vanishes for |p0 | > b. The three functions F+ , F0 and F− form a partition
of unity, and are illustrated in Fig. 9.2 for a specific choice of parameters.
[Fig. 9.2: plot of the functions $F_+$, $F_0$ and $F_-$ against $p_0$]
11 For an accessible, but careful and rigorous proof, see Jost (Jost, 1965), Chapter VI, especially
pp. 126–130.
It is easily seen that, like $F_+$, the functions $F_0$ and $F_-$ are also $C^\infty$ (infinitely many times
continuously differentiable).
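A minimal sketch of these cutoff functions follows. It implements $F_+$ exactly as in (9.98)–(9.100), and assumes (the defining equations are not reproduced in this excerpt) that $F_-(p_0) = F_+(-p_0)$ and $F_0 = 1 - F_+ - F_-$, which is consistent with the stated partition of unity and with $F_0$ vanishing for $|p_0| > b$. The values of $a$, $b$ and $K$ are arbitrary hypothetical choices with $0 < a < b < m$ for $m = 2$.

```python
import math

A, B, K = 0.5, 1.5, 1.0  # hypothetical parameters, 0 < A < B < m


def f_plus(p0):
    """Smooth step of (9.98)-(9.100): 0 for p0 <= A, 1 for p0 >= B."""
    if p0 <= A:
        return 0.0
    if p0 >= B:
        return 1.0
    rising = math.exp(-K / (p0 - A) ** 2)
    falling = math.exp(-K / (p0 - B) ** 2)
    return rising / (rising + falling)


def f_minus(p0):
    """Assumed mirror image of f_plus, selecting negative energy transfer."""
    return f_plus(-p0)


def f_zero(p0):
    """Assumed remainder, so that the three functions sum to one."""
    return 1.0 - f_plus(p0) - f_minus(p0)


grid = [A + i * (B - A) / 50 for i in range(51)]
values = [f_plus(p) for p in grid]
is_monotone = all(x <= y for x, y in zip(values, values[1:]))
print(f_plus(1.0), f_zero(0.0), f_zero(1.7), is_monotone)
```

With these parameters the midpoint value `f_plus(1.0)` is 1/2 by symmetry, and `f_zero` equals 1 on $|p_0| \le a$ and 0 on $|p_0| \ge b$.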
We now consider the vacuum-expectation-value of a pair φf1 (x1 ), φf2 (x2 ) of almost
local operators localized around spacetime points x1 , x2 at equal time (the n = 2 case
of the Clustering theorem). Any such operator can be split into three parts as follows,
utilizing the energy functions F±,0 introduced above:
with $\tilde f(p)$ the Fourier transform of the Schwartz smearing function $f(x)$. Note that
$F_{\pm,0}(p_0)\tilde f(p)$ is $C^\infty$ if $\tilde f(p)$ is, so the smearing of these three operators in coordinate
space is with fast-decreasing functions: i.e., they are almost local if $\phi_f(x)$ is. By the
momentum shift argument of (9.83), and the support in energy of the functions F±,0 ,
we know the following:
1. The operators of form $\phi_f^{(+)}(x)$ increase the energy of any state they act on by at
least $a > 0$.
2. The operators of form $\phi_f^{(0)}(x)$ change the energy of any state they act on by at
most $b < m$.
3. The operators of form $\phi_f^{(-)}(x)$ decrease the energy of any state they act on by at
least $a > 0$.
The State axioms (specifically, the spectral assumptions Ib, Ic, and Id) then imply
$$\langle 0|\phi_f^{(+)}(x)|\alpha\rangle = 0, \ \forall|\alpha\rangle \;\Rightarrow\; \langle 0|\phi_f^{(+)}(x) = 0 \tag{9.104}$$
$$\phi_f^{(0)}(x)|0\rangle = C|0\rangle, \quad C = \langle 0|\phi_f^{(0)}(x)|0\rangle \tag{9.105}$$
$$\phi_f^{(-)}(x)|0\rangle = 0 \tag{9.106}$$
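Accordingly,
$$\langle 0|\phi_{f_1}(x_1)\phi_{f_2}(x_2)|0\rangle = \langle 0|\big(\phi^{(0)}_{f_1}(x_1) + \phi^{(-)}_{f_1}(x_1)\big)\big(\phi^{(+)}_{f_2}(x_2) + \phi^{(0)}_{f_2}(x_2)\big)|0\rangle$$
$$= \langle 0|\phi^{(0)}_{f_1}(x_1)\phi^{(0)}_{f_2}(x_2)|0\rangle + \langle 0|[\phi^{(-)}_{f_1}(x_1),\phi^{(+)}_{f_2}(x_2)]|0\rangle$$
$$+ \langle 0|[\phi^{(0)}_{f_1}(x_1),\phi^{(+)}_{f_2}(x_2)]|0\rangle + \langle 0|[\phi^{(-)}_{f_1}(x_1),\phi^{(0)}_{f_2}(x_2)]|0\rangle \tag{9.107}$$
By (9.84), the three commutators on the right-hand side of (9.107) fall faster than any
power of $d \equiv |\vec{x}_1 - \vec{x}_2|$, while, by (9.105),
$$\langle 0|\phi^{(0)}_{f_1}(x_1)\phi^{(0)}_{f_2}(x_2)|0\rangle = \langle 0|\phi^{(0)}_{f_1}(x_1)|0\rangle\langle 0|\phi^{(0)}_{f_2}(x_2)|0\rangle$$
$$= \langle 0|\phi_{f_1}(x_1)|0\rangle\langle 0|\phi_{f_2}(x_2)|0\rangle \tag{9.108}$$
Thus
$$\langle 0|\phi_{f_1}(x_1)\phi_{f_2}(x_2)|0\rangle_c \equiv \langle 0|\phi_{f_1}(x_1)\phi_{f_2}(x_2)|0\rangle - \langle 0|\phi_{f_1}(x_1)|0\rangle\langle 0|\phi_{f_2}(x_2)|0\rangle$$
$$< C_N\,d^{-N}, \quad d \to \infty \tag{9.109}$$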
which is just the n = 2 case of Theorem 9.1. The extension to higher values of n
(for which we refer the reader to the above-cited book of Jost (Jost, 1965)) requires
an inductive argument relying on the fact that as the spatial diameter d of the
set of n points becomes large, we may always find two subsets of points separated
by an arbitrarily large spatial distance. It is also important to realize that the
above argument is equally valid for the more general class of composite almost local
operators: e.g., of the form (9.81). Indeed, we shall employ the Clustering theorem in
just such a circumstance in the next section, when it is combined with the Haag–Ruelle
scattering theory to establish the clustering properties of the S-matrix which served
as a crucial motivation in our first stabs at constructing local field theory.
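The purely algebraic part of the n = 2 argument, namely (9.104)–(9.108), can be checked in a finite-dimensional toy model with a gapped spectrum. In the sketch below the smooth functions $F_{\pm,0}$ are replaced by a sharp threshold on the energy transfer (a simplifying assumption, harmless here because the toy spectrum is discrete), and the two "field operators" are arbitrary hypothetical Hermitian matrices. The toy has no spatial structure, so it illustrates only the decomposition, not the fall-off of the commutators.

```python
ENERGIES = [0.0, 1.0, 1.4, 2.3]  # vacuum, then states above a gap m = 1
CUT = 0.5                        # sharp threshold with 0 < CUT < m
N = len(ENERGIES)

A = [[0.3, 0.5, -0.2, 0.1],
     [0.5, -0.4, 0.7, 0.2],
     [-0.2, 0.7, 0.1, -0.6],
     [0.1, 0.2, -0.6, 0.8]]
B = [[-0.1, 0.4, 0.3, -0.5],
     [0.4, 0.6, -0.3, 0.2],
     [0.3, -0.3, 0.9, 0.1],
     [-0.5, 0.2, 0.1, -0.7]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(N)] for i in range(N)]


def vev(m):
    """Vacuum expectation value <0|m|0>; the vacuum is basis state 0."""
    return m[0][0]


def energy_part(op, kind):
    """Keep matrix elements <i|op|j> whose energy transfer E_i - E_j is
    above CUT ('+'), below -CUT ('-'), or in between ('0')."""
    def keep(i, j):
        shift = ENERGIES[i] - ENERGIES[j]
        if kind == '+':
            return shift > CUT
        if kind == '-':
            return shift < -CUT
        return abs(shift) <= CUT
    return [[op[i][j] if keep(i, j) else 0.0 for j in range(N)]
            for i in range(N)]


A0, Am = energy_part(A, '0'), energy_part(A, '-')
B0, Bp = energy_part(B, '0'), energy_part(B, '+')

lhs = vev(matmul(A, B))
factorized = vev(matmul(A0, B0))
commutators = (vev(commutator(Am, Bp)) + vev(commutator(A0, Bp))
               + vev(commutator(Am, B0)))
connected_part = lhs - vev(A) * vev(B)
print(lhs, factorized + commutators, connected_part, commutators)
```

The first two printed numbers agree, reproducing (9.107), and the last two agree, showing that the connected two-point function is carried entirely by the three commutator terms.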
We can now state our final set of axioms, those that connect the particle(s) and
field(s) of the theory. We may call them the particle–field duality axioms. They are
essential in the development of any comprehensible theory of particle scattering in the
context of field theory.
1. Axiom IIIa: For some one-particle state $|\alpha\rangle = \int g(\vec{k})|\vec{k}\rangle\,d^3k$ ($g(\vec{k}) \in L^2$) with
discrete eigenvalue $m^2$ of the squared-mass operator (cf. Axiom Id), the smeared
field $\phi_f(x)$ has a non-vanishing matrix element from this single-particle state to
the vacuum, $\langle 0|\phi_f(x)|\alpha\rangle \neq 0$.
Comments: If this situation holds, we call φf (x) an interpolating Heisenberg
field for the given particle.
2. Axiom IIIb: (Asymptotic completeness.) The Hilbert spaces $\mathcal{H}_{\rm in}$ (resp. $\mathcal{H}_{\rm out}$)
corresponding to multi-particle states of far-separated, freely moving stable
particles in the far past (resp. far future) are unitarily equivalent, and may be
identified with the full Hilbert space $\mathcal{H}$ of the system (which, from the cyclicity
axiom IId, can be regarded as the space generated by application of the smeared
fields to the vacuum). Thus, this axiom again connects particle concepts (the
asymptotic in- and out-states) with a space H defined in terms of the action of
the basic field(s) of the theory.
Comments: As discussed in the preceding section, this assumption is almost
unavoidable physically, as it incorporates a vast amount of phenomenological
experience of particle interactions. On the other hand, from a brutally utilitarian
point of view, it may be thought to be partially unnecessary: if there are physical
states which do not correspond to the states produced and detected in high-
energy accelerators, who needs to know, given that essentially all of our insights
into the nature of fundamental microphysical interactions are obtained from
such accelerator experiments? The assumed unitarity of the S-matrix would
only then require Hin = Hout , with both of these asymptotically defined spaces
being (perhaps) proper subsets of the full Hilbert space H. Indeed, the Haag–
Ruelle scattering theory of the next section can only establish the existence of
the asymptotic spaces as such subsets. Moreover, even in the few cases where
we have maximum mathematical control: e.g., the explicitly constructed field
theories corresponding to polynomially self-coupled scalar fields in two spacetime
dimensions, where the validity and consistency of the axioms of type I and II can
be explicitly checked by construction of the Hilbert space and operators of the
theory, the validity of Axiom IIIb remains, in the words of Jaffe and Glimm
((Glimm and Jaffe, 1987), p. 275), “a very deep (and open) mathematical ques-
tion”. Our attitude for the remainder of this book, in the absence of conclusive
evidence to the contrary, will simply be to assume the validity of asymptotic
completeness, and to treat the asymptotic spaces Hin,out as equivalent to the full
physical Hilbert space of the field theory.
The axiomatic framework outlined in this section is the basis for famous proofs
(Streater and Wightman, 1978) of the Spin-Statistics and TCP theorems which estab-
lish rigorously these fundamental properties of local field theory (already discussed
qualitatively in Chapter 3) in a very general context. We have already discussed
(in Section 7.3) the Spin-Statistics theorem, albeit in the context of free interaction-
picture fields. We shall return to both the Spin-Statistics and TCP theorems later, in
the “Symmetries” section of the book (cf. Section 13.4). Another deep and beautiful
result of the axiomatic approach—the Wightman reconstruction theorem ((Streater
and Wightman, 1978), Section 3-4)—allows us to recover all essential features of the
particle-oriented Fock-space formulation of field theory starting only with a set of
Wightman functions satisfying the above axioms. Here, the cyclicity of the vacuum
embodied in Axiom IId is critical, ensuring that arbitrary states of the Hilbert space
of the system can be approximated to arbitrary accuracy by applying polynomials of
the smeared field to the vacuum. An arbitrary matrix element of the smeared field
between physical states can consequently be approximated by linear combinations
of Wightman functions (integrated with smearing functions) to arbitrary accuracy.
The appropriate transformation properties of the states and field operator under the
Poincaré group are also explicitly demonstrated in the process of the reconstruction.
We shall be performing a similar reconstruction shortly using the scattering theory
of Haag and Ruelle. In this approach an explicit formula for the asymptotic in- and
out-states of the theory is given in terms of limits of appropriately smeared Heisenberg
fields acting on the vacuum. To the extent that we accept asymptotic completeness
(Axiom IIIb), the construction of the asymptotic in- or out-spaces is tantamount to
recovering the full physical Hilbert space of the theory.
and by Axiom IIIa we will henceforth assume that it is an interpolating field for the
spinless particle of the theory, with a non-vanishing vacuum to single-particle matrix
element. The latter is in fact determined up to a single overall normalization constant
of the field, by Lorentz-invariance. Using covariantly normalized one-particle states,
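$$
{}_{\rm cov}\langle \vec{k}'|\vec{k}\rangle_{\rm cov} = 2E(k)\,\delta^3(\vec{k}'-\vec{k}),
$$
$$
\begin{aligned}
{}_{\rm cov}\langle k|\phi_1(x)|0\rangle &= \int f^{(1)}(y-x)\;{}_{\rm cov}\langle k|\phi(y)|0\rangle\, d^4y \\
&= \int f^{(1)}(y-x)\;{}_{\rm cov}\langle k|e^{iP\cdot y}\phi(0)e^{-iP\cdot y}|0\rangle\, d^4y \\
&= \int f^{(1)}(y-x)\,e^{ik\cdot y}\;{}_{\rm cov}\langle k|\phi(0)|0\rangle\, d^4y \\
&= {}_{\rm cov}\langle k|\phi(0)|0\rangle\,\tilde f^{(1)}(k)\,e^{ik\cdot x}
\end{aligned}
\tag{9.110}
$$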
12 Also, bound states of two particles, with p2 = 4m2 − E, are specifically excluded.
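$$
{}_{\rm cov}\langle k|\phi(0)|0\rangle = {}_{\rm cov}\langle k|U^{\dagger}(\Lambda)\phi(0)U(\Lambda)|0\rangle = {}_{\rm cov}\langle \Lambda k|\phi(0)|0\rangle \tag{9.111}
$$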
Choosing $\Lambda$ to be the boost which takes momentum $\vec{k}$ to zero, we obtain the matrix
element ${}_{\rm cov}\langle \vec{0}|\phi(0)|0\rangle$. This matrix element (again, assumed non-zero by Axiom IIIa) is
thus a constant dependent on the normalization of the basic Heisenberg field $\phi(x)$. For
convenience, we may choose the normalization here to agree with that of a free field,
for which ${}_{\rm cov}\langle k|\phi(0)|0\rangle = \frac{1}{(2\pi)^{3/2}}$, but it must be remembered that the normalization
of the field is often conventionally fixed by other requirements, such as commutation
relations, which will lead to a different normalization (more on this at the end of
this section, when we derive the asymptotic condition). Thus, we may simply take,
switching back to non-covariantly normalized states (recall $|k\rangle_{\rm cov} = \sqrt{2E(k)}\,|k\rangle$, cf.
(5.15)),
$$
\langle k|\phi_1(x)|0\rangle = \frac{1}{(2\pi)^{3/2}\sqrt{2E(k)}}\,\tilde f^{(1)}(k)\,e^{ik\cdot x} \tag{9.112}
$$
We also note at this point that $\partial\phi_1(x)/\partial t$ is an almost local field if $\phi_1(x)$ is, as the time-
derivative of the fast-decreasing $C^\infty$ smearing function $f^{(1)}$ is still fast decreasing.
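Next, let $g(\vec x,t)$ be a positive-energy solution of the Klein–Gordon equation:
$$
g(\vec x,t) = \int \frac{d^3p}{2E(p)}\,\tilde g(\vec p)\,e^{i(\vec p\cdot\vec x - E(p)t)},\qquad E(p)\equiv\sqrt{\vec p^{\,2}+m^2} \tag{9.113}
$$
and define
$$
\phi_{1,g}(t) \equiv -i\int d^3x\,\Big\{g(\vec x,t)\overleftrightarrow{\frac{\partial}{\partial t}}\phi_1(\vec x,t)\Big\} \tag{9.114}
$$
where $A\overleftrightarrow{\frac{\partial}{\partial t}}B \equiv A\,\frac{\partial B}{\partial t} - \frac{\partial A}{\partial t}\,B$.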
Both terms in (9.114) therefore correspond to the spatial smearing of an almost local
field with a single particle wavefunction solution of the Klein–Gordon equation. The
admittedly somewhat clumsy subscript “1” is maintained as a reminder that this field
has been engineered to produce only one-particle states when it acts on the vacuum:
we shall later define a similar field without this restriction, and will need to be able
to distinguish the two. Note the similarity of (9.114) to the creation operator defined
in (9.36) of Section 9.1. It is easy to see that the state obtained by applying this field
to the vacuum is time-independent:
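$$
\frac{\partial}{\partial t}\phi_{1,g}(t)|0\rangle = -i\int d^3x\,\Big\{g(\vec x,t)\frac{\partial^2}{\partial t^2}\phi_1(\vec x,t) - \phi_1(\vec x,t)\frac{\partial^2}{\partial t^2}g(\vec x,t)\Big\}|0\rangle \tag{9.115}
$$
$$
= -i\int d^3x\,\Big\{g(\vec x,t)\frac{\partial^2}{\partial t^2}\phi_1(\vec x,t) - \phi_1(\vec x,t)(\vec\nabla^2 - m^2)g(\vec x,t)\Big\}|0\rangle \tag{9.116}
$$
In going from (9.115) to (9.116) we have used the fact that the single particle wavefunction
$g(\vec x,t)$ satisfies the Klein–Gordon equation $\big(\frac{\partial^2}{\partial t^2} - \vec\nabla^2 + m^2\big)g = 0$. Transferring
the spatial gradients by an integration by parts (using the fast decrease of $g$ in x-space),
we find
$$
\frac{\partial}{\partial t}\phi_{1,g}(t)|0\rangle = -i\int d^3x\, g(\vec x,t)\,(\Box + m^2)\,\phi_1(\vec x,t)|0\rangle \tag{9.117}
$$
As the energy-momentum operator $P_\mu$ annihilates the vacuum
$$
\frac{\partial}{\partial t}\phi_{1,g}(t)|0\rangle = 0 \tag{9.119}
$$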
A little thought reveals that this property is no longer maintained for multiple
applications of the field φ1,g (at the same time t) to the vacuum. However, the
resultant time-dependent multi-particle states will be shown below to have a well-
defined (strong) limit for t → ±∞. This is, in fact, the central result of the Haag–
Ruelle approach to scattering.
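The momentum wavefunction of the one-particle state $\phi_{1,g}(t)|0\rangle = \phi_{1,g}(0)|0\rangle$,
defined as the overlap of this state with the non-covariantly (continuum) normalized
state $|k\rangle$, follows straightforwardly from (9.112, 9.114):
$$
\begin{aligned}
\psi_{1,g}(\vec k) \equiv \langle k|\phi_{1,g}(t)|0\rangle &= -i\int \tilde g(\vec p)\,\Big(e^{-ip\cdot x}\overleftrightarrow{\frac{\partial}{\partial t}}e^{ik\cdot x}\Big)\,\frac{\tilde f^{(1)}(k)}{(2\pi)^{3/2}\sqrt{2E(k)}}\,d^3x\,\frac{d^3p}{2E(p)} \\
&= (2\pi)^{3/2}\,\frac{\tilde g(\vec k)\,\tilde f^{(1)}(k)}{\sqrt{2E(k)}}
\end{aligned}
\tag{9.120}
$$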
$$
\vec p\cdot\vec v - E(p) \sim -m/\gamma - \frac{1}{2}\sum_{i,j=1}^{3}(p_i - m\gamma v_i)\,M_{ij}\,(p_j - m\gamma v_j) \tag{9.123}
$$
with $M_{ij} = \frac{1}{m\gamma}(\delta_{ij} - v_i v_j)$ a symmetric 3x3 matrix with eigenvalues $\frac{1}{m\gamma}$, $\frac{1}{m\gamma}$
and $\frac{1}{m\gamma^3}$, and hence determinant $\frac{1}{m^3\gamma^5}$. The Gaussian integration around the stationary phase
point gives us the desired asymptotic behavior at large $t$:
with $C$ an irrelevant constant containing the mass $m$, $\pi$, etc. What if the momentum-space
wavefunction $\tilde g(\vec p)$ vanishes at (and in some neighborhood of) the stationary
phase point $\vec p = m\gamma\vec v$? For example, $\tilde g$ may have compact support in momentum space,
and simply vanish in some neighborhood of $m\gamma\vec v$, in which case not only the leading,
but also all higher-order terms in the stationary phase expansion, will vanish. In this
case it can be shown that $g(\vec v t, t)$ vanishes for large $t$ faster than any power of $t$.14
We shall, however, only need the weaker result encapsulated in (9.124). Finally, we
note that for $\vec x$ (or $\vec v$) pointed along the direction of the particle’s motion, the $t^{-3/2}$
falloff is just the expected spreading of the wave-packet due to the non-zero spread in
momentum space, which leads to the particle being delocalized over a region of linear
dimension $\propto t$, with $|g|^2 t^3 \sim$ constant at large times.
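The stationary-phase data quoted around (9.123) can be checked numerically. The following is a minimal sketch, not taken from the text: the function names and the sample values of the mass and velocity are illustrative assumptions. It evaluates the exponent $\vec p\cdot\vec v - E(p)$ at $\vec p = m\gamma\vec v$, builds $M_{ij}$, and compares $-M_{ij}$ with a finite-difference Hessian of the exponent.

```python
import math

def phase(m, v, p):
    """Exponent p.v - E(p) that multiplies t in g(v t, t)."""
    return sum(pi * vi for pi, vi in zip(p, v)) - math.sqrt(m * m + sum(pi * pi for pi in p))

def gamma_factor(v):
    """Lorentz factor for a velocity 3-vector v with |v| < 1 (units with c = 1)."""
    return 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v))

def m_matrix(m, v):
    """M_ij = (delta_ij - v_i v_j) / (m gamma), as quoted below (9.123)."""
    g = gamma_factor(v)
    return [[((1.0 if i == j else 0.0) - v[i] * v[j]) / (m * g) for j in range(3)]
            for i in range(3)]

def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def hessian(f, p, h=1e-4):
    """Central finite-difference Hessian of f at the point p."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                q = list(p)
                q[i] += si * h
                q[j] += sj * h
                return f(q)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

# Illustrative (assumed) sample values: unit mass, an arbitrary subluminal velocity.
m, v = 1.0, (0.3, -0.2, 0.5)
g = gamma_factor(v)
p0 = [m * g * vi for vi in v]          # stationary point p = m gamma v
print(phase(m, v, p0), -m / g)         # value of the exponent at the stationary point
print(det3(m_matrix(m, v)), 1.0 / (m**3 * g**5))
```

The Hessian of the exponent at the stationary point should agree with $-M_{ij}$ up to finite-difference error, which is what makes the Gaussian integral produce the $t^{-3/2}$ prefactor.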
Before statement and proof of the Haag Asymptotic theorem, we shall need two
important preliminary results. The first is a fairly direct consequence of the Ruelle
Clustering theorem discussed in the preceding section. The essential physical content
of the description of a physical system in terms of in/out-states is the intuition that
for large times, past or future, the particles become physically isolated and cease to
interact significantly with one another. In the field context, this turns out to be exactly
equivalent to the fast decrease of connected VEVs of almost local fields at large spatial
separation: the large-distance falloff of field expectation values is converted to a large
time falloff via the asymptotic kinematics of the single particle wavefunctions given in
(9.124).
First, let us temporarily use the notation φ1,g (t) to generically denote any field
obtained by spatially smearing an almost local field φ(x, t) with a single particle
13 The observation that such wave-packets obey the correct relativistic kinematics was the primary
motivation for de Broglie’s introduction of the wave hypothesis for particles, and as such the seminal
development initiating the path to Schrödinger’s wave mechanics.
14 For a tight proof of this result, originally due to Ruelle, see the above-cited book of Jost (Jost, 1965),
Chapter 6.
We note that our previously defined field $\phi_{1,g}(t)$ in (9.114) is just the sum of two
such fields, with the almost local field being either $\phi_1$ or $\partial\phi_1/\partial t$, and the single-particle
wavefunction either $g(\vec x,t)$ (defined in (9.113)) or its time-derivative $\partial g(\vec x,t)/\partial t$. Our first
preliminary result states
Lemma 9.2 For large times, t → ±∞,
To establish this result, we begin by noting that the amplitude Mm,n (t) has a cluster
expansion as a sum of terms in which the m + n fields are distributed into Nc separate
clusters, with the mr φ†1,g (t) and nr φ1,g (t) fields in the rth cluster inside a separate
connected VEV of the form (9.97). The rth cluster will give a contribution of the form
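$$
\begin{aligned}
&\int \langle 0|\phi_1(\vec x_1,t)\cdots\phi_1(\vec x_{m_r+n_r},t)|0\rangle_c\; G_1(\vec x_1,t)\cdots G_{m_r+n_r}(\vec x_{m_r+n_r},t)\; d^3x_1\cdots d^3x_{m_r+n_r} \\
&= \int \langle 0|\phi_1(\vec x_1,t)\,\phi_1(\vec x_1+\vec\xi_2,t)\cdots\phi_1(\vec x_1+\vec\xi_{m_r+n_r},t)|0\rangle_c \\
&\qquad\cdot\; G_1(\vec x_1,t)\cdots G_{m_r+n_r}(\vec x_1+\vec\xi_{m_r+n_r},t)\; d^3x_1\, d^3\xi_2\cdots d^3\xi_{m_r+n_r} \\
&= \int d^3x_1\, G_1(\vec x_1,t)\int \langle 0|\phi_1(\vec 0,t)\,\phi_1(\vec\xi_2,t)\cdots\phi_1(\vec\xi_{m_r+n_r},t)|0\rangle_c \\
&\qquad\cdot\; G_2(\vec x_1+\vec\xi_2,t)\cdots G_{m_r+n_r}(\vec x_1+\vec\xi_{m_r+n_r},t)\; d^3\xi_2\cdots d^3\xi_{m_r+n_r}
\end{aligned}
\tag{9.127}
$$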
Here we use the generic notation $G_i(\vec x,t)$ to denote either a positive-energy wavefunction
$g(\vec x,t)$ (appearing in the $\phi_{1,g}$ fields) or a negative-energy wavefunction $g^*(\vec x,t)$
(appearing in the $\phi^\dagger_{1,g}$ fields). Note that the asymptotic behavior at large time of the $g^*$
wavefunctions is given directly by complex conjugating (9.124), and involves the same
$|t|^{-3/2}$ falloff as the $g$ functions. In passing from the first to the second line we have
shifted integration variables by defining $\vec x_n \equiv \vec x_1 + \vec\xi_n$, $n = 2, \ldots, m_r + n_r$. The last line
is obtained using the translation property of the $\phi_1$ fields.
At this point we invoke the Ruelle Clustering theorem, which asserts that the
connected vacuum expectation value appearing in the penultimate line of (9.127) is a
fast decreasing function of the ξ variables. In the asymptotic limit of large t then, each
such variable can be regarded as restricted to a finite range. If we change to velocity
space for the x1 ≡ v1 t variable, we note that the Gn functions become asymptotically,
using (9.124),
$$
\Big|\int \langle 0|\phi_1(\vec x_1,t)\cdots\phi_1(\vec x_{m_r+n_r},t)|0\rangle_c \prod_{i=1}^{m_r+n_r} G_i(\vec x_i,t)\; d^3x_1\cdots d^3x_{m_r+n_r}\Big| \sim |t|^{3-\frac{3}{2}(m_r+n_r)} \tag{9.129}
$$
Multiplying this behavior for all $N_c$ clusters, with $\sum_{r=1}^{N_c} m_r \equiv m$ and $\sum_{r=1}^{N_c} n_r \equiv n$,
we find that the contribution to $M_{m,n}(t)$ from $N_c$ clusters with $N = m + n$ total
fields behaves asymptotically like $t^{3N_c - \frac{3}{2}N}$. As $\langle 0|\phi_1|0\rangle = 0$ ($\phi_1|0\rangle$ is a single-particle
state, hence orthogonal to the vacuum), all clusters must have at least two fields. If
any cluster has three (or more) fields, the total number of fields must satisfy $N \geq 2(N_c - 1) + 3$,
which implies $3N_c - \frac{3}{2}N \leq -\frac{3}{2}$: i.e., a vanishing $t^{-3/2}$ behavior as $t \to \pm\infty$. Evidently,
the only way to obtain a non-vanishing result at large time is to have all clusters
contain exactly two fields, in which case $N = 2N_c$ and the power falloff is eliminated.
Moreover, the only pairings that survive at large time involve a $\phi^\dagger_{1,g}$ field paired with
a $\phi_{1,g}$ field. If we take instead two $\phi_{1,g}$ fields:
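$$
\begin{aligned}
&\int \langle 0|\phi_1(\vec x_1,t)\phi_1(\vec x_2,t)|0\rangle_c\; g_1(\vec x_1,t)\,g_2(\vec x_2,t)\; d^3x_1\, d^3x_2 \\
&= \int \Delta(\vec x_1-\vec x_2)\,\tilde g_1(\vec p_1)\,\tilde g_2(\vec p_2)\; e^{i(\vec p_1\cdot\vec x_1 - E(p_1)t + \vec p_2\cdot\vec x_2 - E(p_2)t)}\; \frac{d^3p_1}{2E(p_1)}\frac{d^3p_2}{2E(p_2)}\; d^3x_1\, d^3x_2
\end{aligned}
\tag{9.130}
$$
Here $\Delta(\vec x_1 - \vec x_2) \equiv \langle 0|\phi_1(\vec x_1,t)\phi_1(\vec x_2,t)|0\rangle_c$ is fast decreasing for large $|\vec x_1 - \vec x_2|$ by the
Ruelle Clustering theorem, so that its Fourier transform $\tilde\Delta(\vec p)$ is smooth ($C^\infty$).
Changing to center-of-mass variables $\vec X \equiv \frac{\vec x_1 + \vec x_2}{2}$, $\vec x \equiv \vec x_1 - \vec x_2$, and performing the
spatial integrals, one finds
$$
\begin{aligned}
\langle 0|\phi_{1,g_1}(t)\phi_{1,g_2}(t)|0\rangle_c &= \int \langle 0|\phi_1(\vec x_1,t)\phi_1(\vec x_2,t)|0\rangle_c\; g_1(\vec x_1,t)\,g_2(\vec x_2,t)\; d^3x_1\, d^3x_2 \\
&= (2\pi)^3\int \tilde g_1(\vec p_1)\,\tilde g_2(\vec p_2)\; e^{-i(E(p_1)+E(p_2))t}\; \tilde\Delta\Big(\frac{\vec p_1-\vec p_2}{2}\Big)\,\delta^3(\vec p_1+\vec p_2)\; \frac{d^3p_1}{2E(p_1)}\frac{d^3p_2}{2E(p_2)} \\
&= (2\pi)^3\int \tilde g_1(\vec p_1)\,\tilde g_2(-\vec p_1)\; e^{-2iE(p_1)t}\; \tilde\Delta(\vec p_1)\; \frac{d^3p_1}{4E(p_1)^2}
\end{aligned}
\tag{9.131}
$$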
The smooth momentum dependence of all factors in the integral then implies the fast
decrease (faster than any power) of (9.131) as t → ±∞. A similar result obtains for a
cluster consisting of two $\phi^\dagger_{1,g}$ fields. On the other hand, if we take a pairing of a $\phi^\dagger_{1,g}$
with a $\phi_{1,g}$ field, the complex time exponentials cancel, and the result is non-vanishing,
and time-independent. Finally, we note that if $m = n$, allowing a complete pairing, as
in the second line of Lemma 9.2, the remainder term must involve at least two clusters
with three (or more) fields, and hence a falloff of $t^{-3}$ at large time. This concludes the
demonstration of Lemma 9.2.
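The power counting in this proof is easy to tabulate. The sketch below is an illustration only (the function names are assumptions, not notation from the text): it enumerates every way of distributing $N$ fields into clusters of at least two fields and evaluates the exponent $3N_c - \frac{3}{2}N$ obtained by multiplying (9.129) over the clusters.

```python
def partitions_min2(n, smallest=2):
    """Yield all non-decreasing lists of cluster sizes (each >= 2) that sum to n."""
    if n == 0:
        yield []
        return
    for first in range(smallest, n + 1):
        for rest in partitions_min2(n - first, first):
            yield [first] + rest

def falloff_exponent(cluster_sizes):
    """Power of |t| from (9.129) multiplied over clusters: 3*Nc - (3/2)*N."""
    return 3 * len(cluster_sizes) - 1.5 * sum(cluster_sizes)

def leading_terms(n_fields):
    """Map each exponent to the list of clusterings of n_fields that produce it."""
    table = {}
    for sizes in partitions_min2(n_fields):
        table.setdefault(falloff_exponent(sizes), []).append(sizes)
    return table

# Tabulate the exponents for a few small field counts.
for n_fields in (4, 5, 6):
    for exponent, clusterings in sorted(leading_terms(n_fields).items(), reverse=True):
        print(n_fields, exponent, clusterings)
```

For even $N$ the only clustering with exponent zero is the complete pairing, and the next exponent down is $-3$; for odd $N$ no complete pairing exists and the best available exponent is $-\frac{3}{2}$.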
Our second preliminary result concerns the symmetry under permutation of the
states obtained by applying the φ1,g fields to the vacuum. We state it as the following
Lemma.
Lemma 9.3 Define the time-dependent state $|\Psi,t\rangle$ as follows:
Then for large time $t$, the distance between these two state vectors has the asymptotic
behavior
$$
\big(|\Psi,t\rangle - |\Psi',t\rangle,\; |\Psi,t\rangle - |\Psi',t\rangle\big)^{1/2} \sim \frac{1}{|t|^{3/2}} \tag{9.134}
$$
This result follows as an immediate consequence of Lemma 9.2, as the squared distance
between the states is
$$
\big(|\Psi,t\rangle - |\Psi',t\rangle,\; |\Psi,t\rangle - |\Psi',t\rangle\big) = \langle\Psi,t|\Psi,t\rangle + \langle\Psi',t|\Psi',t\rangle - \langle\Psi,t|\Psi',t\rangle - \langle\Psi',t|\Psi,t\rangle \tag{9.135}
$$
Each of the inner products appearing on the right-hand side of (9.135) is an amplitude
of the form $M_{m,m}(t)$, so by Lemma 9.2 approaches asymptotically a sum of $m$ cluster
pairs which is symmetric under permutation of any two fields. Thus the leading terms
cancel, leaving a remainder of order $t^{-3}$, whence Lemma 9.3.
The proof of the Haag Asymptotic theorem follows very quickly from these results.
First, the theorem itself.
Theorem 9.4 The time-dependent state vector
with momentum wavefunctions $\psi_{1,g_1}(\vec k), \ldots, \psi_{1,g_n}(\vec k)$ (defined in (9.120)); thus, the
states in (9.137) have the inner product structure corresponding to the continuum
non-covariant normalization of (5.21) (with plus signs everywhere as we are
considering only bosons here). All of the above holds with the replacements t → +∞
and “in”→“out” everywhere.
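The strong convergence is easily established by taking the time-derivative of $|\Psi,t\rangle$:
$$
\begin{aligned}
\frac{\partial}{\partial t}|\Psi,t\rangle &= \Big(\frac{\partial}{\partial t}\phi_{1,g_1}(t)\Big)\phi_{1,g_2}(t)\cdots\phi_{1,g_n}(t)|0\rangle + \phi_{1,g_1}(t)\Big(\frac{\partial}{\partial t}\phi_{1,g_2}(t)\Big)\cdots\phi_{1,g_n}(t)|0\rangle \\
&\quad + \ldots + \phi_{1,g_1}(t)\phi_{1,g_2}(t)\cdots\Big(\frac{\partial}{\partial t}\phi_{1,g_n}(t)\Big)|0\rangle
\end{aligned}
\tag{9.138}
$$
The final term on the right-hand side of (9.138) vanishes by (9.119) as the time-
derivative of $\phi_{1,g_n}$ acts directly on the vacuum state. However, all the other $(n-1)$
terms correspond to permutations of a similar term in which the field with the time-
derivative is moved to the extreme right, and therefore, by Lemma 9.3, have norm
of order $|t|^{-3/2}$. But $\big\|\frac{\partial}{\partial t}|\Psi,t\rangle\big\| < \frac{C}{|t|^{3/2}} \Rightarrow \big\||\Psi,t\rangle - |\Psi,t'\rangle\big\| < \frac{2C}{|T|^{1/2}}$ for $|t|, |t'| > |T|$.
Thus the sequence of states $|\Psi,t\rangle$ for large $|t|$ is a Cauchy sequence in the Hilbert space
and must converge in norm to a limit vector $|\Psi\rangle_{\rm in}$.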
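The step from the bound on the time-derivative to the Cauchy property is a one-line integration. As a sketch, taking $t > t' > T > 0$ for definiteness (the far-past case is identical with the times replaced by their absolute values):

$$
\big\||\Psi,t\rangle - |\Psi,t'\rangle\big\| = \Big\|\int_{t'}^{t}\frac{\partial}{\partial s}|\Psi,s\rangle\, ds\Big\| \leq \int_{t'}^{t}\Big\|\frac{\partial}{\partial s}|\Psi,s\rangle\Big\|\, ds < \int_{T}^{\infty}\frac{C}{s^{3/2}}\, ds = \frac{2C}{T^{1/2}}
$$

The right-hand side can be made arbitrarily small by taking $T$ large, which is the Cauchy criterion.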
The second part of the theorem, establishing that the in- (or out-)states defined
as such limits have the appropriate inner-product structure to define a Fock space
of independent many-particle states, follows directly from Lemma 9.3, as the inner
product ${}_{\rm in}\langle g'_1 g'_2 \ldots g'_n|g_1 g_2 \ldots g_n\rangle_{\rm in}$ is simply the limit for $t \to -\infty$ of the amplitude
Mn,n (t) defined in Lemma 9.2, the result of which is a symmetric sum of overlaps
of single-particle states (see Problem 5).
The physical interpretation of the states resulting from the limiting processes of
Theorem 9.4 is fairly clear. The smeared fields correspond to wave-packets which
overlap less and less as the state is run either forwards or backwards in time. The
asymptotic convergence can be greatly improved,15 from the $t^{-3/2}$ behavior used
above, to faster than any power of the time, if the momentum wavefunctions of
the particles have non-overlapping support in momentum space: thus $\tilde g_i(\vec p) \neq 0$ implies
$\tilde g_j(\vec p) = 0$ for $i \neq j$. In this situation, the particle velocities are “pointed” in different
directions and the separation at large time is ensured, without recourse to wave-packet
spreading. In particular, in this case, the reader may easily verify (Problem 6) that
the coordinate space overlap of the single-particle wavefunctions of different particles
remains exactly zero at all times. Thus the states constructed by the Haag procedure
satisfy our intuitive picture of widely separated free particles in either the far past or
the far future. We also note here without proof that the in- and out-states as defined
in Theorem 9.4 can be shown to have the correct transformation properties under the
Poincaré operators U (Λ, a).
It is extremely important to realize that the construction of asymptotic multi-
particle states by the limiting procedure of Theorem 9.4 remains perfectly valid
if the underlying field φ(x) is itself a more general type of almost local field, such
as the bilocal operator of (9.81), provided only that Axiom IIIa holds: namely, that
the one particle state of the stable particle whose in- and out-states we wish to
15 For further details, the reader is encouraged to consult the technical literature: e.g., the above-cited
book of Jost (Jost, 1965).
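$$
\begin{aligned}
S_{n_1+n_2\to n_1+n_2} = \lim_{T\to\infty}\langle 0|\,&\phi^\dagger_{1,g'_1}(+T)\cdots\phi^\dagger_{1,g'_{n_1}}(+T)\;\phi^\dagger_{1,h'_1}(+T)\cdots\phi^\dagger_{1,h'_{n_2}}(+T) \\
\cdot\;&\phi_{1,g_1}(-T)\cdots\phi_{1,g_{n_1}}(-T)\;\phi_{1,h_1}(-T)\cdots\phi_{1,h_{n_2}}(-T)\,|0\rangle
\end{aligned}
\tag{9.139}
$$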
The limit $T \to \infty$ is, of course, a mathematical formality: in a typical high-energy
scattering experiment the particles interact only in a spacetime region of microscopic
dimensions. So the limit is very rapidly attained already when $T$ is some very small
value (e.g., $10^{-23}$ seconds for a typical strong interaction scattering event). We shall
therefore fix $T$ at some very small but finite value, at which point the S-matrix
amplitude (9.139) has achieved its limit value to any preassigned level of precision,
and enquire about the behavior of the combined scattering amplitude when the two
groups of particles are separated by distance $2|\vec\Delta| \gg T$. In this limit the commutator
of any almost local field of “g” type appearing in (9.139) with a field of “h” type falls
faster than any inverse power of $|\vec\Delta|$, so we may rearrange (9.139) (for fixed $T$) as
follows:
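$$
\begin{aligned}
S_{n_1+n_2\to n_1+n_2} &= \langle 0|\,\phi^\dagger_{1,g'_1}(+T)\cdots\phi^\dagger_{1,g'_{n_1}}(+T)\;\phi_{1,g_1}(-T)\cdots\phi_{1,g_{n_1}}(-T) \\
&\qquad\cdot\;\phi^\dagger_{1,h'_1}(+T)\cdots\phi^\dagger_{1,h'_{n_2}}(+T)\;\phi_{1,h_1}(-T)\cdots\phi_{1,h_{n_2}}(-T)\,|0\rangle + o(|\vec\Delta|^{-N}) \\
&\equiv \langle 0|\Phi_g(0,-\vec\Delta)\,\Phi_h(0,+\vec\Delta)|0\rangle + o(|\vec\Delta|^{-N})
\end{aligned}
\tag{9.140}
$$
Note that the two groups of fields, those involving “g” wavefunctions and those
involving “h” wavefunctions, can be combined into the single operators $\Phi_g(0,-\vec\Delta)$ and
$\Phi_h(0,+\vec\Delta)$ which are almost local operators localized around the indicated spacetime
points. For example, the product field $\Phi_g(0,\vec\Delta)$ can be viewed as the smearing of the
multi-local product $\phi(x_1)\phi(x_2)\ldots\phi(x_{2n_1})$ with a smearing function of Schwartz type
falling fast for $|\vec x_i - \vec\Delta|$ much larger than all other distance scales in the problem.
Now, by definition,
$$
\langle 0|\Phi_g(0,-\vec\Delta)\,\Phi_h(0,+\vec\Delta)|0\rangle = \langle 0|\Phi_g(0,-\vec\Delta)|0\rangle\,\langle 0|\Phi_h(0,+\vec\Delta)|0\rangle + \langle 0|\Phi_g(0,-\vec\Delta)\,\Phi_h(0,+\vec\Delta)|0\rangle_c \tag{9.141}
$$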
The connected term on the right-hand side of (9.141), by the Ruelle Clustering theorem
9.1, falls faster than any inverse power of the cluster separation |Δ|, whence the
desired factorization of the S-matrix amplitude for the combined process into S-matrix
amplitudes representing the separate scattering of “g”-type and “h”-type particles.
We conclude this section by employing the Haag–Ruelle theory to derive the long
promised direct connection between the interpolating Heisenberg field φ(x) and the
free in (resp. out) fields φin (x) (resp. φout (x)) defined in Section 9.1. This connection,
usually referred to as the Asymptotic Condition, was already “derived” heuristically
by manipulations involving interaction-picture operators (see (9.45)). Here we shall see
that the precise result we need, which serves as the starting point for the extremely
important scattering theory formalism of Lehmann, Symanzik, and Zimmermann to
which we turn in the next section, follows from the Haag–Ruelle theory (and hence,
from the axioms of Section 9.2) without any reference to an interaction picture
or perturbation theory. We begin by defining a smeared field $\phi_g(t)$, analogous to
our $\phi_{1,g}(t)$ fields, except that the initial smearing function $f^{(1)}(x)$ is now taken to
be a general Schwartz function $f(x)$ of fast decrease, with four-dimensional Fourier
transform $\tilde f(k)$ which is not restricted to a region of support sandwiching the one-particle
mass hyperboloid as previously. Eventually, in fact, we may even allow $f(x)$ to
approach a $\delta$-function (i.e., take $\tilde f(k)$ constant). For the time being though, our new
field $\phi_g(t)$ will be obtained by smearing the almost local field $\phi_f(x)$, defined exactly as
in (9.80), with a positive-energy single particle wavefunction $g(\vec x,t)$ as in (9.113). We
note that $\phi_g(t)|0\rangle$ is no longer a single particle state, nor is it time-independent.
However, as far as the preconditions for Lemma 9.2 are concerned, $\phi_g(t)$ is just as good
as our previous $\phi_{1,g}(t)$ field. We may therefore conclude that, picking for definiteness
the limit for large negative time $t \to -\infty$,
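$$
\begin{aligned}
&\langle 0|\phi^\dagger_{1,g'_1}(t)\cdots\phi^\dagger_{1,g'_m}(t)\;\phi_g(t)\;\phi_{1,g_1}(t)\cdots\phi_{1,g_n}(t)|0\rangle \\
&\to \sum_{i=1}^{m}\langle 0|\phi^\dagger_{1,g'_i}(t)\phi_g(t)|0\rangle_c\;\langle 0|\phi^\dagger_{1,g'_1}(t)\cdots\widehat{\phi^\dagger_{1,g'_i}(t)}\cdots\phi^\dagger_{1,g'_m}(t)\;\phi_{1,g_1}(t)\cdots\phi_{1,g_n}(t)|0\rangle \\
&\quad + \sum_{j=1}^{n}\langle 0|\phi_g(t)\phi_{1,g_j}(t)|0\rangle_c\;\langle 0|\phi^\dagger_{1,g'_1}(t)\cdots\phi^\dagger_{1,g'_m}(t)\;\phi_{1,g_1}(t)\cdots\widehat{\phi_{1,g_j}(t)}\cdots\phi_{1,g_n}(t)|0\rangle, \\
&\qquad (t \to -\infty)
\end{aligned}
\tag{9.142}
$$
with remainder terms of relative order $|t|^{-3/2}$. Fields omitted from the second vacuum
expectation value on each line (and coupled to the far past field $\phi_g(t)$) are indicated
by the hat notation. The vanishing of $\langle 0|\phi_{1,g}(t)|0\rangle$ was previously assured by the fact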
that $\phi_{1,g}(t)|0\rangle$ is a one-particle state. This is no longer true for the new field $\phi_g$:
here we must explicitly assume the vanishing of the VEV of $\phi(x)$. For example, we
may assume that our basic interpolating field transforms non-trivially under some
symmetry unbroken by the vacuum (e.g., there is no spontaneous symmetry-breaking
along the lines discussed in Section 8.4: if there is such a symmetry-breaking, we must
shift the field as described there to remove its vacuum expectation value). Thus, the
dominant terms at large time involve only clusters with pairings of the $\phi_g(t)$ field with
either a $\phi^\dagger_{1,g'_i}(t)$ field or a $\phi_{1,g_j}(t)$ field, which are then omitted from the rest of the
amplitude, as indicated by the notation $\widehat{\phi^\dagger_{1,g'_i}(t)}$ or $\widehat{\phi_{1,g_j}(t)}$. We have reassembled
the clusters not containing the special field $\phi_g(t)$ into full amplitudes (i.e., without the
“c” subscript). Furthermore, we may also eliminate the connected requirement on the
two-field amplitudes appearing in (9.142), as $\langle 0|\phi^\dagger_{1,g'_i}(t)|0\rangle = 0$, $\langle 0|\phi_{1,g_j}(t)|0\rangle = 0$,
However, recalling that $\phi_{1,g}$ fields connect only to one-particle states, $\phi_{1,g_j}(t)|0\rangle$ is
simply the time-independent state $|g_j\rangle_{\rm in} = \int\psi_{1,g_j}(\vec k)\,|\vec k\rangle_{\rm in}\,d^3k$. Likewise,
$\langle 0|\phi^\dagger_{1,g'_i}(t) = \int\psi^*_{1,g'_i}(\vec k)\;{}_{\rm in}\langle\vec k|\,d^3k$. On the other hand, the vacuum to one-particle matrix elements of
$\phi_g(t)$ are determined up to normalization by Lorentz-invariance, as $\phi_g$ is obtained by
smearing a local scalar field $\phi(x)$, assumed to be an interpolating field for the particle
in question, so that, exactly as in (9.112), but replacing $f^{(1)} \to f$,
$$
{}_{\rm in}\langle k|\phi_f(x)|0\rangle = \frac{Z^{1/2}}{(2\pi)^{3/2}\sqrt{2E(k)}}\,\tilde f(k)\,e^{ik\cdot x} \tag{9.145}
$$
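$$
\begin{aligned}
\langle 0|\phi^\dagger_{1,g'_i}(t)\phi_g(t)|0\rangle &= -i\int\psi^*_{1,g'_i}(\vec k)\,\Big\{g(\vec x,t)\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm in}\langle k|\phi_f(\vec x,t)|0\rangle\Big\}\,d^3x\,d^3k \\
&= -i\int\psi^*_{1,g'_i}(\vec k)\,\tilde g(\vec p)\,\Big\{e^{i(\vec p\cdot\vec x-E(p)t)}\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm in}\langle k|\phi_f(\vec x,t)|0\rangle\Big\}\,d^3x\,d^3k\,\frac{d^3p}{2E(p)}
\end{aligned}
\tag{9.146}
$$
$$
\int e^{i(\vec p\cdot\vec x-E(p)t)}\overleftrightarrow{\frac{\partial}{\partial t}}\,e^{i(E(k)t-\vec k\cdot\vec x)}\,d^3x = 2i(2\pi)^3E(k)\,\delta^3(\vec p-\vec k) \tag{9.147}
$$
$$
\int e^{i(\vec p\cdot\vec x-E(p)t)}\overleftrightarrow{\frac{\partial}{\partial t}}\,e^{-i(E(k)t-\vec k\cdot\vec x)}\,d^3x = 0 \tag{9.148}
$$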
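The second integral vanishes, as it is proportional to $(E(k) - E(p))\,\delta^3(\vec p + \vec k)$. Inserting
(9.147) in (9.146) we obtain
$$
\langle 0|\phi^\dagger_{1,g'_i}(t)\phi_g(t)|0\rangle = Z^{1/2}\int\psi^*_{1,g'_i}(\vec k)\,\psi_g(\vec k)\,d^3k \tag{9.149}
$$
with $\psi_g(\vec k)$ defined analogously to $\psi_{1,g}(\vec k)$ as in (9.120), but with $\tilde f(k)$ replacing $\tilde f^{(1)}(k)$,
$$
\psi_g(\vec k) = (2\pi)^{3/2}\,\frac{\tilde g(\vec k)\,\tilde f(k)}{\sqrt{2E(k)}} \tag{9.150}
$$
On the other hand, $\langle 0|\phi_g(t)\phi_{1,g_j}(t)|0\rangle = \langle 0|\phi_g(t)|g_j\rangle_{\rm in}$ involves the integral in (9.148)
and vanishes identically.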
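The time-derivative algebra behind (9.147) and (9.148) can be checked numerically. In the sketch below (function names are illustrative assumptions), the spatial integration is taken as already done, so that the delta function has set $E(p) = E(k)$; only the time dependence of the two-sided derivative $f\,\partial_t g - (\partial_t f)\,g$ is evaluated, using central differences.

```python
import cmath

def two_sided_dt(f, g, t, h=1e-5):
    """Return f(t) * g'(t) - f'(t) * g(t), with derivatives by central differences."""
    df = (f(t + h) - f(t - h)) / (2 * h)
    dg = (g(t + h) - g(t - h)) / (2 * h)
    return f(t) * dg - df * g(t)

def time_part_9_147(E_p, E_k, t):
    """Time dependence of the integrand in (9.147): e^{-i E_p t} against e^{+i E_k t}."""
    return two_sided_dt(lambda s: cmath.exp(-1j * E_p * s),
                        lambda s: cmath.exp(+1j * E_k * s), t)

def time_part_9_148(E_p, E_k, t):
    """Time dependence of the integrand in (9.148): e^{-i E_p t} against e^{-i E_k t}."""
    return two_sided_dt(lambda s: cmath.exp(-1j * E_p * s),
                        lambda s: cmath.exp(-1j * E_k * s), t)

# Illustrative (assumed) values; equal energies mimic the effect of the delta function.
E, t = 1.7, 0.9
print(time_part_9_147(E, E, t), 2j * E)   # opposite-sign frequencies: factor 2iE
print(time_part_9_148(E, E, t))           # same-sign frequencies: cancels
```

With equal energies the first combination reduces to the time-independent factor $2iE(k)$ of (9.147), while the second is proportional to $E(k) - E(p)$ and vanishes, as stated for (9.148).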
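Returning once again to (9.142), we see that the second line vanishes identically,
so by applying Theorem 9.4 to the left-hand side, and to the multi-particle amplitudes
multiplying $\langle 0|\phi^\dagger_{1,g'_i}(t)\phi_g(t)|0\rangle$, we obtain
$$
\begin{aligned}
{}_{\rm in}\langle g'_1,..,g'_m|\phi_g(t)|g_1,..,g_n\rangle_{\rm in} \to\; & Z^{1/2}\sum_{i=1}^{m}\int\psi^*_{1,g'_i}(\vec k)\,\psi_g(\vec k)\,d^3k \\
&\cdot\;{}_{\rm in}\langle g'_1,..[g'_i]..,g'_m|g_1,g_2,..g_n\rangle_{\rm in},\qquad t\to-\infty
\end{aligned}
\tag{9.151}
$$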
We now consider a smeared field φin,g (t) defined in complete analogy to φg (t) (with
the same smearing functions), but starting from the free local field φin (x) defined in
(9.30) rather than the interacting field φ(x). Recall that both φin (x) and φ(x) are
Heisenberg fields, evolving with the dynamics specified by the full Hamiltonian. The
contribution to φin,g (t) from the destruction operator in φin is found to vanish using
(9.148), and the creation term becomes time-independent:
$$
\phi_{{\rm in},g}(t) = \int\psi_g(\vec k)\,a^\dagger_{\rm in}(\vec k)\,d^3k \tag{9.152}
$$
Sandwiching this result between the bra- and ket-states of (9.151), and recalling that
creation operators acting to the left destroy particles,
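$$
\begin{aligned}
{}_{\rm in}\langle g'_1,..,g'_m|\phi_{{\rm in},g}(t)|g_1,..,g_n\rangle_{\rm in} &= \int\psi^*_{1,g'_1}(\vec k_1)\,\psi^*_{1,g'_2}(\vec k_2)\cdots\psi^*_{1,g'_m}(\vec k_m)\,\psi_g(\vec k) \\
&\qquad\cdot\;{}_{\rm in}\langle k_1,k_2,..k_m|a^\dagger_{\rm in}(\vec k)|g_1,...,g_n\rangle_{\rm in}\;d^3k_1..d^3k_m\,d^3k \\
&= \sum_{i=1}^{m}\int\psi^*_{1,g'_1}(\vec k_1)\cdots\psi^*_{1,g'_m}(\vec k_m)\,\psi_g(\vec k)\,\delta^3(\vec k-\vec k_i) \\
&\qquad\cdot\;{}_{\rm in}\langle k_1..[k_i]..k_m|g_1,...g_n\rangle_{\rm in}\;d^3k_1...d^3k_m\,d^3k \\
&= \sum_{i=1}^{m}\int\psi^*_{1,g'_i}(\vec k)\,\psi_g(\vec k)\,d^3k\;\cdot\;{}_{\rm in}\langle g'_1,..[g'_i]..,g'_m|g_1,g_2,..g_n\rangle_{\rm in}
\end{aligned}
\tag{9.153}
$$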
which is precisely the same as the limiting behavior on the right-hand side of (9.151),
up to the normalization factor of $Z^{1/2}$. The bra and ket in-states in (9.151) and (9.153)
run over a dense subset of the Hilbert space $H_{\rm in}$ (indeed, choosing the $\psi_{1,g}(\vec k)$ from
a countable basis of $L^2(\mathbb{R}^3)$, they run over a countable basis of $H_{\rm in}$), so, provided
Axiom IIIb (asymptotic completeness) holds, and we are allowed to identify Hin with
the full Hilbert space H of the theory, the stated equality in the limit amounts to weak
convergence (i.e., matrix element by matrix element) of the smeared interpolating
field φg (t) to the smeared (and time-independent) free in-field φin,g as t → −∞. All
of the above holds, of course, in the far future limit t → +∞, with “in” replaced by
“out” everywhere. We also note that as we shall be employing the asymptotic limit of
the Heisenberg field φ(x) only in matrix elements between (normalizable) states, the
initial smearing of φ(x) is unnecessary (see the comments following Axiom IIa in the
preceding section): we may set f˜(k) = 1, φf (x) = φ(x) in (9.145).
The weak equivalence of φg (t) and φin,g (t) (resp. φout,g (t)) at large negative (resp.
positive) times, in other words, the limiting behavior just established, with complete
mathematical rigor
with the corresponding result for the far future limit, is commonly referred to as the
Asymptotic Condition, and will be the critical starting point for our treatment of the
scattering theory of Lehmann, Symanzik, and Zimmermann (LSZ) in the following
section. It replaces our previous heuristic result (9.46). The Asymptotic Condition
assures us that all of the information contained in the in- and out-states of the theory,
and in particular in their overlap, the S-matrix, is already implicit in the behavior
of the interpolating Heisenberg field(s) of the theory. The LSZ theory, and in particular
the explicit link it provides between the S-matrix and vacuum expectation values of
the associated interpolating fields, is of absolutely central importance in modern field
theory. Indeed, the formula it gives us for the S-matrix in terms of expectation values
of time-ordered Heisenberg fields will be of much greater practical utility, both within
the confines of perturbation theory and beyond, than expressions of the type (9.139)
obtained directly from the Haag–Ruelle approach.
theory, provided that the given Heisenberg field has a non-vanishing matrix element
from the vacuum to the single-particle state of the particle in question:
$$
{}_{\rm in}\langle k|\phi(x)|0\rangle = \frac{Z^{1/2}}{(2\pi)^{3/2}\sqrt{2E(k)}}\,e^{ik\cdot x},\qquad Z \neq 0 \tag{9.155}
$$
We have been able to verify (cf. (9.154)), without any recourse to the interaction picture
or perturbation theory, that for arbitrary normalizable in-states $|\alpha\rangle_{\rm in}$, $|\beta\rangle_{\rm in}$, with
$g(\vec x,t)$ a positive-energy solution of the Klein–Gordon equation, as in (9.113),
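$$
\begin{aligned}
&{}_{\rm in}\langle\beta|\,{-i}\int d^3x\,\Big\{g(\vec x,t)\overleftrightarrow{\frac{\partial}{\partial t}}\phi(\vec x,t)\Big\}|\alpha\rangle_{\rm in} \\
&\to Z^{1/2}\;{}_{\rm in}\langle\beta|\,{-i}\int d^3x\,\Big\{g(\vec x,t)\overleftrightarrow{\frac{\partial}{\partial t}}\phi_{\rm in}(\vec x,t)\Big\}|\alpha\rangle_{\rm in},\qquad t\to-\infty
\end{aligned}
\tag{9.156}
$$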
The basic reason for this limiting behavior is that the smeared Heisenberg field on
the left, sandwiched between in-states which correspond physically in the far past to
states with widely separated free particles, samples a localized region of spacetime
which is effectively the vacuum, and when appropriately folded with a positive-energy
solution of the Klein–Gordon equation, acts like a free field in creating an additional
free particle in that region. Although the Haag Asymptotic theorem provides an
explicit formula for the S-matrix in terms of large time limits of appropriately
smeared Wightman distributions, it turns out that the matrix elements specified
by the theorem are only computable in a rather cumbersome way in perturbation
theory, so while this result is of great conceptual value, it is of rather limited practical
utility.16 In this section we shall derive alternative expressions for the S-matrix
which are particularly suitable for perturbative evaluation, while still allowing the
application of non-perturbative methods in those situations where perturbation theory
is invalid.
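Comparing (9.34) and (9.113, 9.120), we see that in the limit of plane wave solutions
of well-defined momentum $\vec k$, $\tilde g(\vec p) = \frac{\sqrt{2E(p)}}{(2\pi)^{3/2}}\,\delta^3(\vec p - \vec k)$, $\psi_g(\vec p) = \delta^3(\vec p - \vec k)$ and the
smeared in-field operator on the right-hand side of (9.156) becomes simply the creation
operator $a^\dagger_{\rm in}(\vec k)$ for a particle of well-defined momentum. With a realistic particle
wavefunction with some dispersion in momentum, and momentum-space wavefunction
$\psi_g(\vec p)$, we may denote the corresponding creation operator $a^\dagger_{{\rm in},g} = \int\psi_g(\vec p)\,a^\dagger_{\rm in}(\vec p)\,d^3p$, so
$$
-i\int d^3x\; g(\vec x,t)\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm in}\langle\beta|\phi(\vec x,t)|\alpha\rangle_{\rm in} \to Z^{1/2}\;{}_{\rm in}\langle\beta|a^\dagger_{{\rm in},g}|\alpha\rangle_{\rm in},\qquad t\to-\infty \tag{9.157}
$$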
16 Specifically, the matrix elements in an expression like (9.139) involve non-time-ordered fields, due to
the f (1) (x) smearing of the original local fields. The graphical techniques of perturbation theory are, on
the other hand, tailor-made for time-ordered products. This is clear both in the functional framework (cf.
Section 4.2), in which functional integrals naturally yield such time-ordered operator matrix elements, or
from the Gell–Mann–Low theorem proved in Section 9.1.
As |αin , |βin are arbitrary, we may take the conjugate of (9.157) to obtain
↔
∗ ∂
in β|φ( → Z 1/2 in β|ain,g |αin , t → −∞
3
i d x g (x, t) x, t)|αin (9.158)
∂t
We now observe that, assuming asymptotic completeness (Axiom IIIb), the Hilbert spaces $\mathcal{H}_{\rm in}$ and $\mathcal{H}_{\rm out}$ coincide with each other (and, although we do not need it here, with the full physical Hilbert space $\mathcal{H}$ of the theory). In other words, any (normalizable) $|\beta\rangle_{\rm in}$ state is also an element of $\mathcal{H}_{\rm out}$. Accordingly, the ${}_{\rm in}\langle\beta|$ bra states in (9.157, 9.158) may be replaced by arbitrary out-states:
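$$-i\int d^3x\; g(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm out}\langle\beta|\phi(\vec{x},t)|\alpha\rangle_{\rm in} \;\to\; Z^{1/2}\,{}_{\rm out}\langle\beta|a^\dagger_{{\rm in},g}|\alpha\rangle_{\rm in}, \quad t\to-\infty \tag{9.159}$$

$$i\int d^3x\; g^*(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm out}\langle\beta|\phi(\vec{x},t)|\alpha\rangle_{\rm in} \;\to\; Z^{1/2}\,{}_{\rm out}\langle\beta|a_{{\rm in},g}|\alpha\rangle_{\rm in}, \quad t\to-\infty \tag{9.160}$$

$$-i\int d^3x\; g(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm out}\langle\beta|\phi(\vec{x},t)|\alpha\rangle_{\rm in} \;\to\; Z^{1/2}\,{}_{\rm out}\langle\beta|a^\dagger_{{\rm out},g}|\alpha\rangle_{\rm in}, \quad t\to+\infty \tag{9.161}$$

$$i\int d^3x\; g^*(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm out}\langle\beta|\phi(\vec{x},t)|\alpha\rangle_{\rm in} \;\to\; Z^{1/2}\,{}_{\rm out}\langle\beta|a_{{\rm out},g}|\alpha\rangle_{\rm in}, \quad t\to+\infty \tag{9.162}$$

with $a_{{\rm out},g}$, $a^\dagger_{{\rm out},g}$ defined in complete analogy to the corresponding in-operators, but starting from the free field $\phi_{\rm out}(x)$. The asymptotic conditions (9.159–9.162) will be the starting points for our derivation of the famous (and indispensable) LSZ reduction formulas for the S-matrix.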
We begin with the S-matrix element for the scattering of $n$ incoming scalar particles, described by momentum-space wavefunctions $\psi_{g_1}(\vec{k}), \ldots, \psi_{g_n}(\vec{k})$, into $m$ outgoing particles, with wavefunctions $\psi_{g_1'}(\vec{k}), \ldots, \psi_{g_m'}(\vec{k})$. These wavefunctions are assumed to have disjoint support in momentum space: in particular, no incoming particle wavefunction has non-vanishing overlap with an outgoing particle wavefunction, as we wish to exclude uninteresting disconnected contributions to the S-matrix in which a particle passes through without interaction. After deriving the reduction formula, we shall take the limit in which the particle wave-packets approach plane waves (i.e., the $\psi(\vec{k})$ approach δ-functions), to make contact with the LSZ formulas as usually stated in field-theory textbooks. Thus
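$$S_{g_1'..g_m',\,g_1..g_n} = {}_{\rm out}\langle g_1',\ldots,g_m'|g_1,\ldots,g_n\rangle_{\rm in} = {}_{\rm out}\langle g_1',\ldots,g_m'|\,a^\dagger_{{\rm in},g_1}\,|g_2,\ldots,g_n\rangle_{\rm in} \tag{9.163}$$

On the other hand, the operator $a^\dagger_{{\rm out},g_1}$ annihilates the out-state on its left,

$${}_{\rm out}\langle g_1',\ldots,g_m'|\,a^\dagger_{{\rm out},g_1} = 0$$

as the application of the destruction operator leads to a sum of terms involving overlap integrals $\int g_1^*(\vec{k})\,g_i'(\vec{k})\,d^3k$, which all vanish by assumption. We therefore have

$$S_{g_1'..g_m',\,g_1..g_n} = {}_{\rm out}\langle g_1',\ldots,g_m'|g_1,\ldots,g_n\rangle_{\rm in} = {}_{\rm out}\langle g_1',\ldots,g_m'|\,(a^\dagger_{{\rm in},g_1} - a^\dagger_{{\rm out},g_1})\,|g_2,\ldots,g_n\rangle_{\rm in} \tag{9.165}$$

as the $a^\dagger_{{\rm out},g_1}$ operator acting to the left as a destruction operator gives zero by the preceding argument. Matrix elements of $a^\dagger_{{\rm in},g_1}$, $a^\dagger_{{\rm out},g_1}$ are given as the asymptotic limits (9.159) and (9.161), so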
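$$\begin{aligned}
S_{g_1'..g_m',\,g_1..g_n} &= iZ^{-1/2}\Big(\lim_{t\to+\infty} - \lim_{t\to-\infty}\Big)\int d^3x\; g_1(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm out}\langle g_1',..,g_m'|\phi(\vec{x},t)|g_2,..,g_n\rangle_{\rm in} \\
&= iZ^{-1/2}\int d^3x\int_{-\infty}^{+\infty} dt\;\frac{\partial}{\partial t}\Big\{ g_1(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\;{}_{\rm out}\langle g_1',\ldots,g_m'|\phi(\vec{x},t)|g_2,\ldots,g_n\rangle_{\rm in}\Big\}
\end{aligned} \tag{9.166}$$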
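The time-derivative inside the integrand (9.166) can be rewritten recalling that the wavefunction $g_1(\vec{x},t)$ is a solution of the Klein–Gordon equation (cf. (9.113)):

$$\begin{aligned}
\frac{\partial}{\partial t}\Big(g_1(\vec{x},t)\,\overleftrightarrow{\frac{\partial}{\partial t}}\,....\Big) &= g_1(\vec{x},t)\,\frac{\partial^2}{\partial t^2}\,.... - \frac{\partial^2 g_1(\vec{x},t)}{\partial t^2}\,.... \\
&= g_1(\vec{x},t)\,\frac{\partial^2}{\partial t^2}\,.... - \big((\vec{\nabla}^2 - m^2)\,g_1(\vec{x},t)\big)\,....
\end{aligned} \tag{9.167}$$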
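As an illustrative check of the identity (9.167), the following sketch compares a finite-difference time derivative of the two-sided combination $g\,\overleftrightarrow{\partial_t}\,h = g\,\dot{h} - \dot{g}\,h$ with the right-hand side of (9.167). It is only a numerical sanity check under stated assumptions: the plane-wave time dependence chosen for `g`, the Gaussian stand-in `h` for the matrix element, and the values of `M`, `K` and `A` are all hypothetical choices, not taken from the text.

```python
import cmath
import math

M = 1.0                      # particle mass (illustrative value)
K = 0.7                      # magnitude of the spatial momentum (illustrative value)
E = math.sqrt(K * K + M * M) # on-shell energy, so that g solves the Klein-Gordon equation
A = 0.2                      # width parameter of the stand-in matrix element (hypothetical)

def g(t):
    # Time dependence of a plane-wave Klein-Gordon solution at a fixed spatial point.
    return cmath.exp(-1j * E * t)

def dg(t):
    return -1j * E * g(t)

def h(t):
    # Arbitrary smooth stand-in for the matrix element of phi(x, t).
    return math.exp(-A * t * t)

def dh(t):
    return -2.0 * A * t * h(t)

def d2h(t):
    return (4.0 * A * A * t * t - 2.0 * A) * h(t)

def two_sided(t):
    # g <-> d/dt h  =  g h' - g' h
    return g(t) * dh(t) - dg(t) * h(t)

def lhs(t, eps=1e-5):
    # Central finite difference of the two-sided combination.
    return (two_sided(t + eps) - two_sided(t - eps)) / (2.0 * eps)

def rhs(t):
    # (nabla^2 - m^2) acting on the plane wave gives -(K^2 + M^2) times g.
    laplacian_minus_mass = -(K * K + M * M)
    return g(t) * d2h(t) - laplacian_minus_mass * g(t) * h(t)

def max_mismatch(times=(-2.0, -0.5, 0.0, 0.8, 3.0)):
    return max(abs(lhs(t) - rhs(t)) for t in times)

if __name__ == "__main__":
    print(max_mismatch())
```

The cross terms $\dot{g}\dot{h}$ cancel in the derivative, which is why only second time derivatives survive; the Klein–Gordon equation then trades $\partial_t^2 g$ for spatial derivatives and the mass term.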
Inserting (9.167) in (9.166), and integrating by parts to transfer the spatial gradients from the wavefunction $g_1$ (the fast spatial decrease of which ensures the absence of surface terms17 ) to the matrix element, we obtain

$$S_{g_1'..g_m',\,g_1..g_n} = iZ^{-1/2}\int d^4x\; g_1(\vec{x},t)\,(\Box_x + m^2)\;{}_{\rm out}\langle g_1',\ldots,g_m'|\phi(x)|g_2,\ldots,g_n\rangle_{\rm in} \tag{9.168}$$

where $\Box_x \equiv \frac{\partial}{\partial x^\mu}\frac{\partial}{\partial x_\mu}$.
We note that in (9.168), the number of particles in the incoming
state has been reduced by one, and been replaced by an appropriately smeared
Heisenberg field operator sandwiched between the (remaining) incoming and outgoing
states. A result of this type is called an “LSZ reduction formula”. The notion that a
smeared Heisenberg field can be used to create (or destroy) in- or outgoing particles
should hardly be surprising, given the Haag Asymptotic theorem of the preceding
section, but we note the important difference here that the S-matrix element is given
in terms of an integral of a matrix element of such a field over all spacetime, and in
particular over all time, rather than as a limit for large time.
17 Recall that the matrix elements of $\phi(x)$ are tempered distributions (i.e., finite derivatives of a polynomially bounded continuous function), while the $C^\infty$ function $g_1(\vec{x},t)$ decreases faster than any power of $|\vec{x}|$ at any given $t$.
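The process of “reducing” particles from the incoming or outgoing state can be continued, as follows. We focus our attention next on an outgoing particle, say the one with wavefunction $\psi_{g_1'}$. Begin with the matrix element under the integral in (9.168):

$$\begin{aligned}
{}_{\rm out}\langle g_1',\ldots,g_m'|\phi(x)|g_2,\ldots,g_n\rangle_{\rm in} &= {}_{\rm out}\langle g_2',\ldots,g_m'|\,a_{{\rm out},g_1'}\,\phi(x)|g_2,\ldots,g_n\rangle_{\rm in} \\
&= iZ^{-1/2}\lim_{t'\to+\infty}\int d^3x'\; g_1'^{*}(\vec{x}',t')\,\overleftrightarrow{\frac{\partial}{\partial t'}}\;{}_{\rm out}\langle g_2',..,g_m'|\phi(x')\phi(x)|g_2,..,g_n\rangle_{\rm in}
\end{aligned} \tag{9.169}$$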
As the spacetime point $x$ is fixed in (9.169), the product of Heisenberg fields appearing in the matrix element is automatically time-ordered in the stated limit, so we may write

$${}_{\rm out}\langle g_1',\ldots,g_m'|\phi(x)|g_2,\ldots,g_n\rangle_{\rm in} = iZ^{-1/2}\lim_{t'\to+\infty}\int d^3x'\; g_1'^{*}(\vec{x}',t')\,\overleftrightarrow{\frac{\partial}{\partial t'}}\;{}_{\rm out}\langle g_2',..,g_m'|T(\phi(x')\phi(x))|g_2,..,g_n\rangle_{\rm in} \tag{9.170}$$
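If the far-future time limit in (9.170) is replaced by one in the far past, so that $t'\to-\infty$, we note that the time-ordering would imply

$$\begin{aligned}
& iZ^{-1/2}\lim_{t'\to-\infty}\int d^3x'\; g_1'^{*}(\vec{x}',t')\,\overleftrightarrow{\frac{\partial}{\partial t'}}\;{}_{\rm out}\langle g_2',..,g_m'|T(\phi(x')\phi(x))|g_2,..,g_n\rangle_{\rm in} \\
&\quad = iZ^{-1/2}\lim_{t'\to-\infty}\int d^3x'\; g_1'^{*}(\vec{x}',t')\,\overleftrightarrow{\frac{\partial}{\partial t'}}\;{}_{\rm out}\langle g_2',..,g_m'|\phi(x)\phi(x')|g_2,..,g_n\rangle_{\rm in} \\
&\quad = {}_{\rm out}\langle g_2',..,g_m'|\phi(x)\,a_{{\rm in},g_1'}|g_2,..,g_n\rangle_{\rm in} = 0
\end{aligned} \tag{9.171}$$

using the asymptotic condition (9.160), and the fact that the in-state particle wavefunctions are non-overlapping with $\psi_{g_1'}$. The expression in (9.170) may therefore be replaced by one in which the limits at $t'\to+\infty$ and $t'\to-\infty$ are subtracted, leading to an integral over $t'$ of the time-derivative, just as in (9.166):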
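$${}_{\rm out}\langle g_1',\ldots,g_m'|\phi(x)|g_2,\ldots,g_n\rangle_{\rm in} = iZ^{-1/2}\int d^3x'\int_{-\infty}^{+\infty} dt'\;\frac{\partial}{\partial t'}\Big\{ g_1'^{*}(\vec{x}',t')\,\overleftrightarrow{\frac{\partial}{\partial t'}}\;{}_{\rm out}\langle g_2',..,g_m'|T(\phi(x')\phi(x))|g_2,..,g_n\rangle_{\rm in}\Big\} \tag{9.172}$$

Once again, using the fact that $g_1'(x')$ is a solution of the Klein–Gordon equation, and integrating by parts, one may convert this to the form

$${}_{\rm out}\langle g_1',\ldots,g_m'|\phi(x)|g_2,\ldots,g_n\rangle_{\rm in} = iZ^{-1/2}\int d^4x'\; g_1'^{*}(\vec{x}',t')\,(\Box_{x'} + m^2)\;{}_{\rm out}\langle g_2',..,g_m'|T(\phi(x')\phi(x))|g_2,..,g_n\rangle_{\rm in} \tag{9.173}$$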
286 Dynamics VII: Interacting fields: general aspects
Inserting (9.173) into (9.168), we obtain a result in which two particles (one incoming, the other outgoing) have been “reduced out” of the original $n \to m$ amplitude:

$$S_{g_1'..g_m',\,g_1..g_n} = (iZ^{-1/2})^2\int d^4x\,d^4x'\; g_1(\vec{x},t)\,g_1'^{*}(\vec{x}',t')\,(\Box_x + m^2)(\Box_{x'} + m^2)\;\cdot\;{}_{\rm out}\langle g_2',..,g_m'|T(\phi(x')\phi(x))|g_2,..,g_n\rangle_{\rm in} \tag{9.174}$$
This process may evidently be continued (and we encourage the reader to carry it at least one step further; see Problem 7), removing all the incoming and outgoing particles from the initial and final states, and leading to the final LSZ reduction formula, giving the multi-particle S-matrix element in terms of an integral involving the vacuum-expectation-value of the time-ordered-product of $n + m$ Heisenberg interpolating fields (the $n + m$ point Feynman amplitude) for the particle undergoing scattering:

$$S_{g_1'..g_m',\,g_1..g_n} = (iZ^{-1/2})^{m+n}\int\prod_{i=1}^{n}\prod_{j=1}^{m} g_i(x_i)\,g_j'^{*}(x_j')\,(\Box_{x_i} + m^2)(\Box_{x_j'} + m^2)\;\cdot\;{}_{\rm out}\langle 0|T(\phi(x_1')..\phi(x_m')\phi(x_1)..\phi(x_n))|0\rangle_{\rm in}\;d^4x_i\,d^4x_j' \tag{9.175}$$

or, writing $K_x \equiv \Box_x + m^2$ for the Klein–Gordon operator,

$$S_{g_1'..g_m',\,g_1..g_n} = (iZ^{-1/2})^{m+n}\int\prod_{i=1}^{n}\prod_{j=1}^{m} g_i(x_i)\,g_j'^{*}(x_j')\;\cdot\;K_{x_i}K_{x_j'}\;{}_{\rm out}\langle 0|T(\phi(x_1')..\phi(x_m')\phi(x_1)..\phi(x_n))|0\rangle_{\rm in}\;d^4x_i\,d^4x_j' \tag{9.176}$$
It will be convenient to define an intermediate quantity from which the S-matrix amplitude can be extracted via (9.176). Leaving out for the time being the normalization factors and Klein–Gordon operators, we define the Feynman Green functions in both coordinate and momentum space in the obvious way:

$$G(x_1',\ldots,x_n) \equiv {}_{\rm out}\langle 0|T(\phi(x_1')..\phi(x_m')\phi(x_1)..\phi(x_n))|0\rangle_{\rm in} \tag{9.177}$$

$$\tilde{G}(k_1',\ldots,k_n) \equiv \int e^{+i\sum_j k_j'\cdot x_j' \,-\, i\sum_i k_i\cdot x_i}\;G(x_1',\ldots,x_n)\;d^4x_1'\ldots d^4x_n \tag{9.178}$$

Note that the momenta appearing in (9.178) may be arbitrary four-vectors, not necessarily satisfying the on-mass-shell condition $k_i\cdot k_i = k_j'\cdot k_j' = m^2$. In other words, the LSZ formula provides us with a natural off-mass-shell extension of S-matrix elements. If we integrate by parts over the spacetime coordinates $x_i$, $x_j'$ in (9.176) we may write the S-matrix element as
$$S_{k_1'..k_m',\,k_1..k_n} = \prod_{i=1}^{n}\frac{-iZ^{-1/2}(k_i^2 - m^2)}{(2\pi)^{3/2}\sqrt{2E(k_i)}}\;\prod_{j=1}^{m}\frac{-iZ^{-1/2}(k_j'^2 - m^2)}{(2\pi)^{3/2}\sqrt{2E(k_j')}}\;\tilde{G}(k_1',\ldots,k_n) \tag{9.179}$$
1. The asymptotic conditions used to derive the formula hold, by the Haag–
Ruelle theory, for any almost local field φ(x) with a non-vanishing vacuum to
single particle matrix element (9.155). In particular, they hold for almost local
composite fields (i.e., multi-local combinations of the local fields appearing in
the Hamiltonian defining the dynamics of the theory, as in (9.81)) with such
a non-vanishing matrix element. Such fields must be used, as we shall see in
more detail in the next section, if the particle in question is a bound state.
Even if the particle corresponds to an elementary local field in the theory, there
is no unique interpolating field giving the correct S-matrix for its scattering!
For example, if $\phi(x)$ is a local interpolating field for the particle in question, with $\langle k|\phi(x)|0\rangle \neq 0$, then for general values of $a, b, c, \ldots$ we certainly would expect that $\phi'(x) = a\phi(x) + b\phi(x)^2 + c\phi(x)^3 + \ldots$ would also have a non-vanishing vacuum to single particle matrix element, and the LSZ formula will hold equally well using this field instead of $\phi(x)$. Of course, the Green function $G(x_1',\ldots,x_n)$ (and the normalization constant $Z$) will clearly be different with different fields:
only the multiple pole residue of the on-mass-shell limit of its Fourier transform
is guaranteed to be independent of the choice of field, as it gives the presum-
ably unique physical S-matrix amplitude for the scattering of a specific stable
particle.
2. The existence of simple poles in the off-shellness variables $k^2 - m^2$ for all external (incoming and outgoing) particles is a rigorous consequence of the Haag–
Ruelle/LSZ theory, and depends critically on the assumed mass gap in the theory.
In a theory such as QED, with a strictly massless photon, this result no longer
holds. In fact, the singularities of charged-particle Green functions in the on-shell
limit are softer than simple poles, and connected S-matrix elements for specified
finite numbers of such incoming and outgoing particles vanish, as we shall see
in Section 19.1. The problem is that, with a strictly massless particle, it takes
essentially no energy to produce any number of extra very-low-energy particles,
so that the probability of finding a strictly finite number in any process where
a physical interaction has occurred is zero. Of course, in actual experiments the
detector resolution is finite, and ultra-soft photons are undetectable. Giving the
photon a very small mass (smaller than the detector resolution) restores sanity:
a non-vanishing S-matrix, and sensible cross-sections, rates, etc. We shall return
to this subject in Chapter 19.
3. If the fields φ(x) appearing in (9.176) are ultralocal (cf. Section 5.5), the T-
product defines a Lorentz scalar Green function, and the Lorentz invariance of
the resulting S-matrix is manifest (recall that the non-covariant energy square-
root factors are associated with our choice of non-covariantly normalized states).
However, as just discussed, it is perfectly possible to use fields which are almost
local (e.g., composite fields) but not strictly local, in which case the off-shell Green
functions, both in coordinate and momentum space, are not Lorentz-invariant.
However, the Lorentz-invariance of the S-matrix, which follows rigorously from
the Haag–Ruelle theory, assures us that this property still holds in the on-shell
limit (i.e., for the residue of the multi-pole term). This situation, in which a
symmetry of the theory is only recovered in the on-shell limit, is actually quite
common in field theory, as we shall see later in Part 3 of the book when we study
symmetries in field theory in detail.
4. The extension to particles and fields with non-vanishing spin is straightforward.
One begins from the generalization of (9.33, 9.34), which reconstruct the destruc-
tion and creation operators from the relevant free covariant (in- or out-)field, and
applies the asymptotic condition precisely as above. An example for spin-1/2 Dirac
fields is given in Problem 8.
5. The crossing symmetry of S-matrix amplitudes discussed in a few simple examples
in Section 7.6 is seen to be an almost trivial consequence of the basic LSZ formula
(9.176). In the case of the self-conjugate scalar field for which this formula
applies, particles and antiparticles are identical, so the statement that initial-
state particles (resp. antiparticles) can be exchanged with final-state antiparti-
cles (resp. particles) by the simple expedient of inserting a minus sign in the
corresponding four-momentum follows from (a) the symmetry of the T-product
under exchange of the spacetime coordinates of the fields, and (b) the form of
the Fourier transform, which contains a factor $e^{+ik_j'\cdot x_j'}$ for final-state particles
(or antiparticles) and a factor $e^{-ik_i\cdot x_i}$ for initial-state particles (or antiparticles).
In the event that we are dealing with non-self-conjugate fields, with distinct
particles and antiparticles, the need to interchange particles and antiparticles
when we cross from initial to final states is a simple consequence of the fact that
the in- and out-fields contain positive frequency parts corresponding to particle
destruction operators and negative-frequency parts corresponding to antiparticle
creation operators. If the reader retraces the derivation of the LSZ formula in
such a case, starting with the obvious generalization of the basic asymptotic
conditions (9.159–9.162) for the complex field case (replacing, for example, $a^\dagger_{{\rm in},g}$
in (9.159) with $a^{c\dagger}_{{\rm in},g}$), the general form of the crossing rule for the S-matrix will
become immediately evident.
6. We note that the reduction formula (9.176), containing the Green function ${}_{\rm out}\langle 0|T(\phi(x_1')..\phi(x_m')\phi(x_1)..\phi(x_n))|0\rangle_{\rm in}$, is precisely in a form amenable to perturbative treatment via the Gell–Mann–Low formula (9.21), taking $|\alpha\rangle$, $|\beta\rangle$ to be the
Spectral properties of field theory 289
vacuum state. This convenient form explains why the LSZ, rather than the Haag–
Ruelle, approach has dominated the treatment of scattering processes in field
theory. It will be the starting point for our treatment of covariant perturbation
theory in Chapter 10.
7. Obviously, any calculation of the S-matrix amplitude using (9.176) must include
a knowledge of the normalization constant Z, appearing in (9.155). This constant
is conventionally, and somewhat misleadingly, referred to as the “wavefunction
renormalization constant” for the particle, although it is clear from the way
we have introduced it that it is more properly associated with the choice of
interpolating field. We shall see in Section 9.5 how to extract it from the
behavior of the two-point Feynman amplitude G(x1 , x2 )—commonly called the
“full Feynman propagator” of the theory.
8. The translation property of the Green function

$$G(x_1' + a,\ldots,x_n + a) = G(x_1',\ldots,x_n)$$

for any fixed four-vector displacement $a$, valid for both local and almost-local fields, implies (see Problem 9) energy-momentum conservation: namely

$$S_{k_1'..k_m',\,k_1..k_n} \propto \delta^4(k_1' + .. + k_m' - k_1 - ... - k_n) \tag{9.181}$$
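where we assume that the integral over four-momentum can be interchanged with the sum over states (see below). Next, note that the function in brackets in (9.182),

$$f(p) \equiv \sum_\alpha |\langle 0|\phi(0)|\alpha\rangle|^2\,\delta^4(p - P_\alpha) \tag{9.183}$$

has, by the spectral axioms of our theory (cf. Section 9.2, Axioms Ib, Ic, Id), support only for $p^2 \ge 0$, and if we assume $\langle 0|\phi(0)|0\rangle = 0$, only for $p^2 \ge m^2$, where $m$ is the single-particle mass. In fact, the support of $f(p)$ is restricted to the one-particle mass hyperboloid $p^2 = m^2$ and the multi-particle continuum starting at $p^2 = (2m)^2$ (or, if there is a symmetry $\phi \to -\phi$, at $p^2 = (3m)^2$). Moreover,

$$\begin{aligned}
f(p) &= \sum_\alpha |\langle 0|U^\dagger(\Lambda)U(\Lambda)\phi(0)U^\dagger(\Lambda)U(\Lambda)|\alpha\rangle|^2\,\delta^4(p - P_\alpha) \\
&= \sum_\alpha |\langle 0|\phi(0)|\Lambda\alpha\rangle|^2\,\delta^4(p - P_\alpha) \\
&= \sum_\alpha |\langle 0|\phi(0)|\alpha\rangle|^2\,\delta^4(p - \Lambda^{-1}P_\alpha) \\
&= \sum_\alpha |\langle 0|\phi(0)|\alpha\rangle|^2\,\delta^4(\Lambda p - P_\alpha) = f(\Lambda p)
\end{aligned} \tag{9.184}$$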
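where the spectral function $\rho(p^2)$ is positive (or zero) with support on the spectrum of the squared mass operator $P^2$ as indicated above, and the normalization factor $\frac{1}{(2\pi)^3}$ is chosen for later convenience. We finally obtain, writing $\rho(p^2) = \int\delta(p^2 - \mu^2)\rho(\mu^2)\,d\mu^2$,

$$\begin{aligned}
W(x) &= \int\frac{d^4p}{(2\pi)^3}\,\theta(p_0)\,\rho(p^2)\,e^{-ip\cdot x} \\
&= \int_0^\infty \rho(\mu^2)\,W_0(x;\mu)\,d\mu^2
\end{aligned} \tag{9.186}$$

where

$$W_0(x;\mu) \equiv \int\frac{d^4p}{(2\pi)^3}\,\theta(p_0)\,\delta(p^2 - \mu^2)\,e^{-ip\cdot x} = \frac{1}{(2\pi)^3}\int\frac{d^3p}{2E(p)}\,e^{-ip\cdot x} = \Delta_+(x;\mu) \tag{9.187}$$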
Here $\Delta_+(x;\mu)$ is the invariant function arising from the two-point function of a free, canonically normalized scalar field of mass $\mu$ (cf. Chapter 6, (6.63)). This remarkable result (that the Wightman two-point function of an arbitrary scalar interacting Heisenberg field can be written as the positively weighted average of the corresponding free field Wightman functions for fields of varying mass, with a positive weight-function containing all the non-trivial interaction physics of the theory) is called the Källén–Lehmann representation of the two-point function. We note that it implies the vanishing of the VEV of the space-like commutator, $\langle 0|[\phi(x_1),\phi(x_2)]|0\rangle = 0$ for $(x_1 - x_2)^2 < 0$, as the invariant function $\Delta_+(x;\mu)$ is symmetric at space-like points, $\Delta_+(x;\mu) = \Delta_+(-x;\mu)$ for $x^2 < 0$, even though we have not assumed locality of our field. Vanishing of the matrix element of the space-like commutator between arbitrary states would require locality (Axiom IIc, Section 9.2).
The spectral representation (9.186) implies that the Fourier transform W̃ (p) of
W (x) is basically the invariant spectral function θ(p0 )ρ(p2 ): as W (x) is a well-defined
tempered distribution, by the basic axioms of Section 9.2, its Fourier transform is
likewise well-defined, and the defining sum for the spectral function (9.184) must
therefore be convergent. We may therefore expect that the interchange of integration
and summation performed above is in this case quite legal. As we shall see below, this
is not necessarily the case for the spectral representation of other two-point functions.
Although it would require an exact solution of the interacting field theory to
calculate the full spectral function ρ(p2 ), the contribution from one-particle states
is calculable up to a normalization constant from (9.155). Thus
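$$\begin{aligned}
\frac{1}{(2\pi)^3}\,\theta(p_0)\,\rho_{\rm 1part}(p^2) &= \int d^3k\,|\langle 0|\phi(0)|\vec{k}\rangle_{\rm in}|^2\,\delta^4(p - k) = \frac{Z}{(2\pi)^3\,2E(p)}\,\delta(p_0 - E(p))\,\theta(p_0) \\
\Rightarrow\; \rho_{\rm 1part}(p^2) &= Z\,\delta\big((p_0 - E(p))(p_0 + E(p))\big) = Z\,\delta(p^2 - m^2)
\end{aligned} \tag{9.188}$$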
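The last step of (9.188) uses the fact that, on the positive-energy branch selected by $\theta(p_0)$, $\delta(p_0^2 - E^2)$ acts as $\delta(p_0 - E)/(2E)$. A minimal numerical sketch of this step is given below, assuming a narrow Gaussian as a stand-in for the δ-function; the width `sigma`, the grid size `n`, and the test function are hypothetical choices made only for illustration.

```python
import math

def nascent_delta(y, sigma):
    # Narrow normalized Gaussian used as a stand-in for the Dirac delta function.
    return math.exp(-0.5 * (y / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def smeared_one_particle_weight(E, f, sigma=1e-3, n=200_000):
    """Midpoint-rule estimate of the integral over p0 > 0 of
    delta_sigma(p0^2 - E^2) * f(p0), restricted to 0 < p0 < 2E."""
    upper = 2.0 * E
    step = upper / n
    total = 0.0
    for i in range(n):
        p0 = (i + 0.5) * step
        total += nascent_delta(p0 * p0 - E * E, sigma) * f(p0)
    return total * step

def expected_weight(E, f):
    # What theta(p0) * delta(p0^2 - E^2) should give when integrated against f.
    return f(E) / (2.0 * E)

if __name__ == "__main__":
    energy = 1.5
    test_function = lambda p0: 1.0 / (1.0 + p0 * p0)
    print(smeared_one_particle_weight(energy, test_function),
          expected_weight(energy, test_function))
```

Restricting the integration to $p_0 > 0$ plays the role of $\theta(p_0)$: the second root at $p_0 = -E$ never enters.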
A spectral representation for the Feynman two-point Green function of the theory $G(x_1,x_2) = \langle 0|T(\phi(x_1)\phi(x_2))|0\rangle$, sometimes called the full propagator, can be derived following the pattern for $W(x_1,x_2)$. In fact, as

$$G(x_1,x_2) = \theta(x_1^0 - x_2^0)\,W(x_1,x_2) + \theta(x_2^0 - x_1^0)\,W(x_2,x_1)$$

and the spectral representation for $W(x_1,x_2)$ gives a linear superposition of free field Wightman functions for mass $\mu$ weighted by $\rho(\mu^2)$, one finds the obvious result (again, defining $x = x_1 - x_2$):

$$G(x) = \int\rho(\mu^2)\,G_0(x;\mu)\,d\mu^2 \tag{9.190}$$
where the corresponding free-field time-ordered Green function G0 (x; μ) for a particle
of mass μ is just i times the free Feynman propagator ΔF (x; μ) introduced in Section
7.6 (cf. (7.215)). The Fourier transform is accordingly a weighted average of the
momentum-space free Feynman propagator:
$$-i\tilde{G}(p) \equiv \hat{\Delta}_F(p^2) = \int\rho(\mu^2)\,\frac{1}{p^2 - \mu^2 + i\epsilon}\,d\mu^2 \tag{9.191}$$
The one-particle contribution to the full propagator, which we now denote $\hat{\Delta}_F(p^2)$, can be isolated and displayed explicitly, using (9.188):

$$\hat{\Delta}_F(p^2) = \frac{Z}{p^2 - m^2 + i\epsilon} + \int_{M^2_{\rm multi}}^{\infty}\rho(\mu^2)\,\frac{1}{p^2 - \mu^2 + i\epsilon}\,d\mu^2 \tag{9.192}$$

where $M^2_{\rm multi}$ is the lowest squared-mass threshold for multi-particle states (i.e., $4m^2$ if $\langle 0|\phi(0)|k_1,k_2\rangle_{\rm in} \neq 0$, $9m^2$ if the first non-vanishing multi-particle matrix element occurs for three-particle states, $\langle 0|\phi(0)|k_1,k_2,k_3\rangle_{\rm in}$, and so on). This result gives us the promised interpretation of the normalization constant $Z$ appearing in the LSZ reduction formulas: it is simply the residue of the momentum-space full propagator of the interpolating field at the single-particle pole. The same procedures, whether perturbative or non-perturbative, which are applied to the calculation of the $n + m$ point Green function $\tilde{G}(k_1',..,k_m',k_1,..k_n)$ in the LSZ formula can be used to calculate the two-point function and extract the required constant $Z$.
An important difference in the spectral representations of the two-point Wightman
and Feynman functions is immediately apparent in (9.191): the representation for the
Fourier transform of the time-ordered two-point function contains an integral, the
convergence of which evidently requires that
$$\int_{M^2_{\rm multi}}^{\infty}\frac{\rho(\mu^2)}{\mu^2}\,d\mu^2 < \infty \tag{9.193}$$
Unfortunately, the axioms introduced in Section 9.2 are not adequate to ensure
the existence of this integral in all cases. Basically, the problem is that although
$\theta(x_1^0 - x_2^0)$ and $\langle 0|\phi(x_1)\phi(x_2)|0\rangle$ are separately fine distributions (with well-defined
Fourier transforms), their product, occurring in the definition of the time-ordered
Green function, is not necessarily a well-defined distribution, with an unambiguous
Fourier transform. A classic example is given by the product of the θ and δ-functions,
θ(x)δ(x) =?, which is well-defined (namely, zero) on the subset of test functions
vanishing at x = 0, but when extended to a distribution on the full Schwartz space
necessarily involves an undetermined constant, θ(x)δ(x) = Cδ(x) (we can think of
the constant C as our (arbitrary) choice for the “value” of the step function θ(x) at
x = 0). We can also regard convergence problems in the final integral result (9.191) as
due to an unjustified interchange of summation and integration in the process of the
“derivation”, as mentioned above.
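The ambiguity $\theta(x)\delta(x) = C\delta(x)$ can be made concrete with a toy regularization, sketched below. This is only an illustration under stated assumptions, not a construction taken from the text: the δ-function is replaced by a Gaussian of width `a`, and the step is allowed to jump at `x = c * a`, so that the offset `c` (a hypothetical parameter) encodes how the step is positioned relative to the smeared δ-function as both are sharpened together.

```python
import math

def gaussian_delta(x, a):
    # Normalized Gaussian of width a, a regularized stand-in for delta(x).
    return math.exp(-0.5 * (x / a) ** 2) / (a * math.sqrt(2.0 * math.pi))

def step(x, x0):
    # Sharp step function whose jump sits at x0.
    return 1.0 if x > x0 else 0.0

def theta_delta_coefficient(c, a=1e-2, f=math.cos, n=100_000, half_width=0.2):
    """Midpoint-rule estimate of the integral of theta(x - c*a) * delta_a(x) * f(x),
    for a smooth test function f normalized so that f(0) = 1."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += step(x, c * a) * gaussian_delta(x, a) * f(x)
    return total * h

def expected_coefficient(c):
    # Small-a limit: the fraction of the Gaussian weight lying to the right of c*a.
    return 0.5 * math.erfc(c / math.sqrt(2.0))

if __name__ == "__main__":
    for offset in (-1.0, 0.0, 1.0):
        print(offset, theta_delta_coefficient(offset), expected_coefficient(offset))
```

Every choice of `c` yields a perfectly sensible limit as `a` shrinks, but the limiting constant $C$ differs from one choice to the next, which is the sense in which the product is defined only up to a multiple of $\delta(x)$.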
What do we actually know about the asymptotic behavior of the spectral function
ρ(μ2 ) at large μ2 ? This behavior will clearly hinge on (i) the specific field(s) appearing
in the time-ordered product (e.g., elementary versus composite), and (ii) the details
of the dynamics (i.e., interactions) in the theory, which determine the multi-particle
matrix elements in the sum (9.185) defining the spectral function. For the specific
example under consideration here, our two-point function involves an elementary
hermitian spinless field, and the dynamics is assumed to be specified by an interaction
Hamiltonian leading to a perturbatively renormalizable theory,18 such as $H_{\rm int} = \frac{\lambda}{4!}\phi^4$ (or more generally, $H_{\rm int} = \frac{\lambda_3}{3!}\phi^3 + \frac{\lambda_4}{4!}\phi^4$). In such theories, as we shall see later in our study (cf. Chapter 18) of the scaling properties of local field theories, the use of
renormalization group techniques allows us to derive the asymptotic behavior both of
$\hat{\Delta}_F(p^2)$ and $\rho(\mu^2)$, to any finite order of perturbation theory.19 The result is that in each order of perturbation theory, the spectral function falls like $\frac{1}{\mu^2}$ × powers of $\ln\mu^2$, ensuring the convergence of the spectral representation. For the super-renormalizable $\phi^3$ theory, the falloff is even faster (Barton, 1965): $\rho(\mu^2) \sim \frac{1}{\mu^4}$ (times logarithms).
Thus, in these cases, at least within the context of perturbation theory, the Lehmann
representation (9.192) is on a quite firm footing. Note that this representation implies
that the momentum-space full Feynman propagator $\hat{\Delta}_F(p^2)$ can be regarded as the limit of an analytic function $F(w)$ as $w \to p^2 + i\epsilon$: i.e., as the complex variable $w$ approaches the real axis from above, with

$$F(w) = \frac{Z}{w - m^2} + \int_{M^2_{\rm multi}}^{\infty}\rho(w')\,\frac{1}{w - w'}\,dw' \tag{9.194}$$
18 The definition and study of renormalizable field theories will be one of our primary objects in Part
4 of this book. For the time being, the reader is invited to think of such theories as ones in which a
well-defined continuum limit exists at the perturbative level: namely, there is a well-defined asymptotic
expansion of Feynman functions of the theory in powers of suitably defined coupling parameter(s), with the
contributions at each order specified in terms of a finite number of parameters.
19 There is a large amount of circumstantial evidence—though as yet no complete proof—that renormal-
izable self-interacting scalar field theories in four spacetime dimensions do not possess a non-trivial—i.e.,
interacting—continuum limit, even though the perturbative expansion is order-by-order well-defined. In
other words, there is no set of Wightman functions satisfying all axioms of Section 9.2 whose asymptotic
expansion in a suitably defined coupling constant agrees with the renormalized perturbative expansion of a
φ4 theory. Such theories, as we shall explain in Part 4 of the book, still have a perfectly sensible interpretation
as effective field theories. For super-renormalizable theories, such as φ3 theory in four dimensions (alas, with
a spectrum unbounded below; cf. Section 8.4), or φ4 theory in two or three spacetime dimensions, the
continuum limit exists. Asymptotically free theories such as QCD, based on a non-abelian gauge group, are
also thought to have a well-defined continuum limit beyond perturbation theory.
With $\rho(w')$ falling at least as fast as $1/w'$, this representation implies that $F(w)$ is a real analytic function in the complex plane of $w$, with a simple pole at $w = m^2$, and cuts on the positive real axis beginning at $w = M^2_{\rm multi}$. There are multiple cuts because the spectral function $\rho(w')$ is the sum of n-particle contributions $\rho_n(p^2)$ which switch on at progressively higher values of $w'$: at $w' = (2m)^2$ for the two-particle states $|\alpha\rangle$ in (9.185), at $w' = (3m)^2$ for the three-particle states, and so on. Using the familiar identity

$$\frac{1}{w - w' \pm i\epsilon} = P\,\frac{1}{w - w'} \mp i\pi\,\delta(w - w') \tag{9.195}$$

the discontinuity of $F(w)$ across the cut for positive real $w = p^2$ is given by

$$F(p^2 + i\epsilon) - F(p^2 - i\epsilon) = -2i\pi\,\rho(p^2) = -2i\pi\sum_{n=2}^{\infty}\rho_n(p^2) \tag{9.196}$$
allowing the reconstruction of the full analytic function F (w) anywhere in the complex
w-plane from knowledge of its residue at the single-particle pole and its discontinuity
along the cut(s) on the positive real axis.
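The discontinuity formula (9.196) can be checked numerically for a model spectral function, as in the sketch below. The function `rho` is a hypothetical choice (it merely vanishes below threshold and falls like $1/w'$ above it), and the finite offset `eta` stands in for the infinitesimal $\epsilon$. The difference of the two integrands in (9.194) is taken in closed form, $\frac{1}{p^2 + i\eta - w'} - \frac{1}{p^2 - i\eta - w'} = \frac{-2i\eta}{(p^2 - w')^2 + \eta^2}$, and the substitution $w' = p^2 + \eta\tan\theta$ turns the narrow Lorentzian into a flat measure in $\theta$. The pole term of (9.194) is omitted, since away from $p^2 = m^2$ its contribution to the difference vanishes as $\eta \to 0$.

```python
import math

M2 = 4.0   # multi-particle threshold M_multi^2 in units with m = 1 (illustrative value)

def rho(w):
    # Hypothetical model spectral function: zero below threshold, ~1/w at large w.
    return math.sqrt(1.0 - M2 / w) / w if w > M2 else 0.0

def discontinuity(p2, eta=1e-3, n=200_000):
    """Estimate F(p2 + i*eta) - F(p2 - i*eta) for the cut integral of (9.194).

    With w' = p2 + eta*tan(theta), the Lorentzian weight times dw' becomes d(theta),
    so the difference reduces to -2i times the integral of rho over theta.
    """
    theta_min = math.atan((M2 - p2) / eta)   # image of the threshold w' = M2
    theta_max = 0.5 * math.pi                # image of w' -> infinity
    h = (theta_max - theta_min) / n
    total = 0.0
    for i in range(n):
        theta = theta_min + (i + 0.5) * h
        total += rho(p2 + eta * math.tan(theta))
    return -2j * total * h

if __name__ == "__main__":
    p2 = 10.0
    print(discontinuity(p2), -2j * math.pi * rho(p2))
```

Below the threshold (for example at `p2 = 2.0`) the same routine returns a value that shrinks with `eta`, reflecting the absence of a cut there.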
One further consequence of the Lehmann representation (9.192) is worth com-
menting on at this point. The positivity of the spectral function (which goes back,
of course, to the underlying positivity of the metric of our Hilbert space) clearly
implies that the 1/p2 behavior of the free propagator at large p2 cannot be damped
(to a more rapid decrease) by interactions, as the contribution of the integral is non-
negative. At one point, attempts were made to construct a renormalizable theory of
quantum gravity by introducing higher derivative terms in the Lagrangian with the
effect of damping the high-momentum behavior of the graviton propagator in order to
eliminate the proliferation of ultraviolet divergences in higher orders of perturbation
theory that plague the conventional Einstein–Hilbert theory. The Lehmann spectral
representation shows that such a damping can be possible only in the presence of
negative metric states in the theory.
The derivation of the spectral representation given above depended, apparently,
only on a few very basic properties of the Heisenberg field φ(x) appearing in the
time-ordered product: specifically, hermiticity and the appropriate transformation
properties of the fields under the Poincaré group, together, of course, with the com-
pleteness sum appropriate to a positive metric Hilbert space. However, as we indicated
previously, the derivation also involves interchanges of summation and integration
which are potential sources of disaster. In this case, disaster means a non-convergent
spectral representation. In such a circumstance, the resultant dispersion relation needs
We shall see later, in Chapter 15, that the current $J^\mu_{\rm em,had}(x)$ is a composite field, involving terms quadratic in quark fields. The transverse tensor on the right-hand side simply expresses the conservation of the current $\partial_\mu J^\mu_{\rm em,had}(x) = 0$, so the interesting physics is contained in the scalar function $\omega(q^2)$. A dispersion relation for $\omega(q^2)$ can be “derived” along the same lines as the Lehmann representation (9.191) for the
or, equivalently,

$$F(w) = F(0) + \frac{w}{\pi}\int_{M^2_{\rm multi}}^{\infty}\frac{{\rm Im}(F(w'))}{w'(w' - w)}\,dw' \tag{9.200}$$
The constant F (0) = ω(q 2 = 0) appearing on the right-hand side indicates the appear-
ance of an ambiguity in the definition of the T-product of two currents, even though the
currents themselves are individually perfectly well-defined. In this case the ambiguity
involves just the constant C0 discussed previously (i.e., C1 , C2 , .. = 0), and a single
subtraction produces sufficient inverse powers of the integration variable w to ensure
convergence of the spectral integral.21
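The once-subtracted form (9.200) can be tested on a toy function for which everything is known in closed form. The sketch below assumes the hypothetical example $F(w) = -\ln(1 - w/M^2)$, which has $F(0) = 0$ and a constant imaginary part $\pi$ just above its cut $w' > M^2$. For such a non-decreasing imaginary part the unsubtracted integral $\int dw'/(w' - w)$ diverges logarithmically, while the subtracted integrand falls like $1/w'^2$ and converges. The threshold value and the choice of $F$ are illustrative assumptions only.

```python
import math

M2 = 4.0                 # threshold M_multi^2 (illustrative value)
IM_F_ON_CUT = math.pi    # Im F(w' + i*epsilon) for the toy function, constant on the cut

def F_exact(w):
    # Toy function with a branch cut for w > M2; valid here for real w < M2.
    return -math.log(1.0 - w / M2)

def F_from_dispersion(w, n=100_000):
    """Evaluate F(0) + (w/pi) * integral of Im F(w') / (w' (w' - w)) over w' > M2,
    for real w < M2.

    The substitution w' = M2 / u maps the cut onto 0 < u <= 1 and turns the
    integrand into Im F / (M2 - w*u), which is smooth on the whole interval.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += IM_F_ON_CUT / (M2 - w * u)
    f_at_zero = 0.0      # subtraction constant F(0) of the toy function
    return f_at_zero + (w / math.pi) * total * h

if __name__ == "__main__":
    for w in (-3.0, 0.0, 2.0):
        print(w, F_from_dispersion(w), F_exact(w))
```

In this toy case the subtraction constant happens to vanish; in general $F(0)$ is an independent input that the dispersion integral cannot supply, which is the ambiguity discussed in the text.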
The appearance of ambiguities in the time-ordered products of composite oper-
ators like the electromagnetic current discussed above, with the concomitant need
for subtractions in the associated dispersion relation, may well provoke feelings of
unease in the attentive reader. Our development of the LSZ reduction formalism in
Section 9.4 was motivated by the desire to achieve a computationally convenient
representation of general S-matrix amplitudes: the completely rigorous (and well-
defined!) representation (9.139) following from the Haag Asymptotic Theorem 9.4,
involving the vacuum expectation value of ordinary (i.e., not time-ordered) products of
smeared field operators, is extremely cumbersome to implement either in perturbation
theory or with available non-perturbative techniques. On the other hand, the LSZ
formula (9.175) can be developed straightforwardly in perturbation theory, as we
shall explain in detail in the next chapter, using the Gell–Mann–Low formula (9.21).
Moreover, ground-state (i.e., vacuum) expectation values of time-ordered products of
Heisenberg operators may be readily transcribed into a path-integral formulation (cf.
20 The current $J^\mu_{\rm em,had}(x)$ is assumed to contain Heisenberg fields with only strong interactions (the electromagnetic interactions are switched off in these fields), so that there is no pole corresponding to a
stable particle (e.g., the photon) for which the current interpolates. The rho resonance appears as a pole of
ω, but on the unphysical second sheet, as the rho is unstable.
21 Again, as in the case of the scalar-field two-point function, renormalization group techniques allow us
to derive the relevant asymptotic behavior, in the case of an asymptotically free theory like QCD, to all
orders of perturbation theory.
General aspects of the particle–field connection 297
Section 4.2), which we shall see is of enormous importance in both perturbative and
non-perturbative approaches to field theory.
Unfortunately, we now realize that the time-ordered products appearing in LSZ-
type formulas, once developed perturbatively à la Gell–Mann–Low (thus introducing
composite interaction Hamiltonian operators into the time-ordered products), are
likely to contain undefined ambiguities! The ambiguities, of course, arise from the
multiplication of distributions and are manifested as short-distance singularities in
the resultant products, or, in momentum space, as the familiar ultraviolet divergences
appearing in perturbative loop integrals for the Feynman functions of the theory. If the
field theory is regulated at short distance, say by replacing continuous spacetime by a
discrete spacetime lattice (thereby effectively introducing a high-momentum ultraviolet
cutoff in the theory), the ambiguities are eliminated. Of course, to recover the full
(continuous) Poincaré invariance of the theory, the spacing of the lattice points must
eventually be taken to zero, or equivalently, the ultraviolet cutoff taken to infinity, and
the question then arises as to the existence of a well-defined and unambiguous limit
for the S-matrix amplitudes when this is done. From a physical point of view, the
sensitivity of a field theory to the insertion of an ultraviolet (UV) cutoff is equivalent
to the question of the sensitivity of the low-energy (low means momenta much smaller
than the UV cutoff) predictions of the theory to our inevitable ignorance of new
physics at much higher momenta (i.e., much smaller distance scales). The study of the
sensitivity of local field theories at low energies to alterations in their short-distance
structure will be the primary focus of Chapters 16 and 17 of this book. For the present,
the reader should be reassured that for the class of field theories called “perturbatively
renormalizable theories”, the ambiguities appearing in the continuum limit (UV cutoff
going to infinity) in the Feynman Green functions appearing in the LSZ formula can
be shown to be completely absorbable in a finite set of low-energy parameters (masses
and couplings) which uniquely determine (order by order, for all orders of perturbation
theory) the S-matrix amplitudes of the theory, up to terms which fall as a power
(usually, at least quadratic) of the low-energy mass and momentum scales divided by
the UV cutoff. The latter correction terms are not unique, but depend on the precise
details of the regularization (i.e., how the UV cutoff is introduced), reflecting the
aforesaid ambiguities present in the underlying time-ordered products.
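Schematically, the statement above can be summarized as follows. This is only a sketch: the symbols $\Lambda$ (the UV cutoff), $m$ and $g$ (standing for the finite set of low-energy masses and couplings), $p$ (a typical low-energy momentum) and $\mathcal{A}$ (an S-matrix amplitude) are notation introduced here for illustration and are not taken from the text.

```latex
\mathcal{A}_{\Lambda}(p;\, m, g) \;=\; \mathcal{A}_{\rm ren}(p;\, m, g)
  \;+\; O\!\left(\frac{p^{2}}{\Lambda^{2}},\ \frac{m^{2}}{\Lambda^{2}}\right),
\qquad \Lambda \to \infty .
```

In this notation $\mathcal{A}_{\rm ren}$ is fixed, order by order, by $m$ and $g$ alone, while the coefficients of the power-suppressed remainder depend on how the cutoff is introduced.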
variety of ways in which the concepts of particle and field are linked in the panoply of
field theories which are known to be relevant in high-energy physics.
We shall start at the particle end, by noting two important classifications which
can be applied to particles. By “particle”, we mean simply a state which to some
appropriately high degree of approximation can be regarded as an eigenstate of
the squared-mass operator $P^2$ and the spin (i.e., $J^2$, $J_z$ in the frame where $\vec{P} = 0$).
Particles can therefore be associated with irreducible representations of the HLG, as
discussed in Sections 5.2 and 5.3. Beginning with this rather vague specification, one
finds that the zoo of particles encountered in the Particle Data Book may be broken
down into subcategories on the basis of the following two fundamental classifications:
22 We shall remain within the context of Standard Model physics in our examples, to avoid too many ifs,
ands, or buts! Thus, the possibility of “exotic” processes such as proton decay is ignored.
one of the possible decay products of the Higgs consists of an electron–positron pair,
but there will be no single particle pole, as the Higgs itself is unstable. Instead, the
Higgs resonance is revealed as a pole below the real axis on the second Riemann
sheet at a much higher energy (at the present time, > 115 GeV).23 In the case of
stable composite particles, e.g., the proton, interpolating local (or almost local, cf.
Section 9.2) fields can be constructed which go right into the Haag–Ruelle or LSZ
formulas to determine the S-matrix for proton interactions. The reader will recall that
the Haag Asymptotic theorem in particular was perfectly valid if the underlying field
involved an appropriately smeared product of local fields: in the case of the proton,
taking quantum chromodynamics as the underlying dynamical theory, the appropriate
interpolating field involves the product of three quark fields (two up and one down)
coupled to zero color.
It is now possible to give a more precise definition of the concept of “elementary”,
for either fields or particles. We shall define as elementary any field appearing in the
fundamental Hamiltonian (or Lagrangian) specifying the exact dynamics of the theory
at the distance scales in question. It is assumed that such a Hamiltonian can be given
in an explicit analytic form: it can be written down, without approximation, on a
finite piece of paper! Any particle interpolated for by such a field can be rightly called
“elementary”: the “point-like” character of the particle is reflected in the fact that
its exact interacting dynamics has a precise finite expression in terms of products of
local fields at a single spacetime point. With this terminology, the quark, lepton, Higgs
and gauge boson fields of the Standard Model are elementary. Correspondingly, the
leptons, Higgs and electroweak gauge bosons of the Standard Model are elementary
particles as well.
What about the quarks and gluons, which we have omitted from our elementary
particle list? Quantum chromodynamics (QCD), which we shall discuss in detail later
in the book, provides a particularly stark example of the perils of the naive dictum “for
every field, a particle”. In confining theories such as QCD, the dynamics is specified via
a Lagrangian involving quark and gluon fields which do not interpolate for any of the
physical particles (“hadrons”) in the theory.24 In fact, these fields strictly speaking are
not defined on the physical Hilbert space at all! However, multi-local combinations of
the quark and gluon fields, appropriately coupled to be invariant under the local gauge
symmetry of the theory, form well-defined almost local operators which do interpolate
for physical particles, and are well defined on the physical state space. Thus, the
product of three quark fields provides us with an interpolating field for the proton,
the product of a quark field and its conjugate a pion field, and so on.
Although we have decided, in agreement with the definition proposed at the
beginning of this section, to withhold the appellation “particle” from quarks and
23 For an excellent discussion of the role and properties of unstable particles in a field-theory context,
see (Brown, 1992), Section 6.3. An elementary discussion in standard quantum theory, including the
interpretation of resonances as poles on the second Riemann sheet, can be found in (Baym, 1990), Chapter 4.
24 Remarkably, there are even difficulties in the description of charged particle states in quantum
electrodynamics, due to the masslessness of the photon: thus, the electron does not possess a conventional
set of asymptotic states à la Haag–Ruelle theory—in the language of Schroer, it is an “infraparticle”. More
on this in Sections 19.1 and 19.2.
gluons, and speak only of quark and gluon fields, it is certainly true that perturbative
calculations in QCD can be performed treating the omnipresent quarks and gluons as
particles in just the same way that electrons and photons are so treated in quantum
electrodynamics. Again, this is a question of the relevant spatiotemporal scales of
the process: in a sense which can be made mathematically precise, QCD possesses
a property of asymptotic freedom which renders the interaction arbitrarily weak at
progressively smaller spatiotemporal scales (or equivalently, higher-energy scales),
so that perturbative calculations become correspondingly more accurate. How these
calculations can be connected to an actual S-matrix amplitude, in which necessarily
the field energy must be allowed to dissipate over a (relatively) large spatial and
temporal region, resulting in a hadronization of the underlying quark and gluon degrees
of freedom, requires an understanding of an important scale separation property of
renormalizable field theories, called “factorization”, which will be an important topic
of investigation in Part 4 of this book.
In summary, we once again emphasize the absence of any sacred one-to-one connec-
tion between particles and fields. A given particle (stable or unstable, elementary or
composite) may be “represented” by many different local or almost local fields: if the
particle is stable, and therefore represented in the asymptotic in- and out-states of the
theory, any field with a non-vanishing vacuum to single-particle matrix element serves
as an appropriate interpolating field for it. On the other hand, it may be convenient,
and in the case of gauge theories indispensable, to introduce fields which interpolate
for none of the physical particles of the theory, but in which products of such fields
do interpolate for these particles. In such confining theories (cf. Section 19.3) we
may loosely speak of the physical particle states as (stable or unstable) bound states
of the underlying quark and gluon “particles”, even though there is no attainable
physical circumstance in which these latter can be realized as isolated entities with
the characteristics expected of a particle: well-defined mass, spin, energy-momentum,
and (in the case of QCD) a definite color quantum number derived from the putative
interpolating field.
We conclude this section with a discussion of a subject which naturally arises when
one thinks carefully about the nature of the particle–field connection in quantum
field theory: the uniqueness (or otherwise) of the dynamical evolution specified by
the theory, given the enormous fluidity of field representations available for the same
particle, which nevertheless yield precisely the same S-matrix for scattering when
all is said and done. At the pure particle level, the time development of the theory
is certainly uniquely specified, in an almost trivial sense, once we assume asymptotic
completeness. The multi-particle in-states $|k_1, k_2, \ldots, k_N\rangle_{\rm in}$, for example, form by
assumption a complete set, and the time-evolution of these states is trivially unique,
as they are eigenstates of the full Hamiltonian $H$ (with eigenvalue $E = \sum_{i=1}^{N} E(k_i)$).
However, this obviously begs the question of what such states “look like” at any times
subsequent to the far past. In particular, knowledge of the particular combination
of outgoing sets of freely receding particles that a given in-state represents in the
far future requires that we have at our disposal the complicated connection between
the in- and out-fields (φin and φout ) of the theory, which then allows construction
of the S-matrix, yielding the desired scattering amplitudes. As we have seen, such a
connection is automatically afforded by the Heisenberg interpolating field φ(x) of the
theory, which converges (weakly) to φin (resp. φout ) in the far past (resp. far future).
But the interpolating field is subject to a considerable freedom of choice. There are
clearly an infinity of possible fields which interpolate between any two specified in-
and out-fields: we need only ensure that the given Heisenberg field has a non-vanishing
vacuum to single particle matrix element for the particle in question. Still, there is
an intuitive feeling that a proper theory should, at least in principle, uniquely specify
the physical situation not just at asymptotic times, but at finite intermediate times
as well: for example, at the time $t \approx 0$ at which a scattering event takes place.
The conceptual difficulty here primarily springs from the need to specify more
clearly what one means by the phrase “physical situation” above. In standard quantum
mechanics, the specification of the physical state at some time would require a
measurement of a complete set of compatible observables. In field theory this is
enlarged to the notion of local measurements, exploring properties of the system in
bounded (microscopic) regions of spacetime. In other words, we would need to be
able to measure matrix elements of local (or almost local) operators in or between
specified physical states. To make this more concrete, let us take a famous example,
of enormous phenomenological importance: the hadronic electromagnetic current
$J^\mu_{\rm em,had}(x)$ discussed in Section 9.5. A measurement of a general matrix element
(typically called a “form factor”) ${}_{\rm out}\langle\beta|J^\mu_{\rm em,had}(x)|\alpha\rangle_{\rm in}$ for general spacetime points $x$ is
actually possible given the fact that the current couples linearly and gauge-invariantly
to the photon field (cf. Chapter 15), which in turn is coupled in an accurately
computable (via perturbation theory) way to leptons. The momentum transferred
from scattered leptons (electrons, say: see Fig. 9.3) varies over the entire space-like
domain, giving directly the Fourier transform (to momentum variable k) with respect
to x of this matrix element. Unlike the situation with the S-matrix, this corresponds
[Fig. 9.3 Electron scattering off a hadronic state, allowing extraction of the space-like form
factor. Diagram labels: incoming and outgoing $e^-$; ${}_{\rm out}\langle\beta|\tilde{J}^\mu_{\rm em,had}(k)|\alpha\rangle_{\rm in}$.]
$$J^\mu_{\rm em,had}(x) = \tfrac{2}{3}\,e\,{:}\bar{u}(x)\gamma^\mu u(x){:} \;-\; \tfrac{1}{3}\,e\,{:}\bar{d}(x)\gamma^\mu d(x){:} \qquad (9.202)$$
and the physically directly measurable local matrix elements of this object are then
determined (in principle!) uniquely by the fact that the dynamics of the quark
25 Of course, the completeness of the asymptotic states assures that such an expression is in principle
possible.
fields (and gluon fields through which they interact) is specified by a Hamilto-
nian/Lagrangian of definite form. Of course, the connection between the elementary
quark and gluon fields in (9.202) and the asymptotic nucleon and pion fields appearing
in (9.201) is exceedingly complicated, involving the intricacies of confinement in a
strongly coupled theory. Nevertheless, great progress in direct calculation of form
factors starting from the QCD Lagrangian has been possible using the techniques
of lattice gauge theory at low energy, or perturbative QCD, at high energy (cf.
Chapters 18 and 19).
The above example makes clear that local quantum field theory, although conceptu-
ally stimulated, as we saw in Chapters 5 and 6, by the requirements of S-matrix theory
(in particular, the desire to construct Hamiltonians which lead to Lorentz-invariant
and clustering S-matrices), specifies, in certain cases essentially uniquely, details of
the dynamics which go beyond a pure S-matrix philosophy, which treats scattering
amplitudes connecting the behavior of systems long before and long after interactions
occur, but remains agnostic with regard to what happens “in between”. Indeed, given
a precise specification of the dynamics in terms of elementary fields, local observables
can be constructed, and in some cases even measured, which give us a window into the
behavior of the theory in finite regions of spacetime—behavior which should certainly
be a central component of the conceptual content of any self-respecting field theory.26
9.7 Problems
1. Show, using the Lorentz-invariance of the S-matrix, that the transformation
property (9.9) under the HLG of the in-states of the theory transfers to the
corresponding out-state transformation property (9.11).
2. Show that the scalar field transformation property (9.24) of the Heisenberg field
allows us to extend the equal-time commutativity property (9.3) to full space-
like commutativity:
3. Show that the full Heisenberg field can be reconstructed from the time-dependent
creation and destruction operators $a^\dagger_H(k;t)$, $a_H(k;t)$ defined by (9.36)
via
$$\phi_H(x) = \int d^3k\,\big(g_k(x)\,a_H(k;t) + g_k^*(x)\,a^\dagger_H(k;t)\big) \qquad (9.203)$$
where $g_k(x) \equiv \frac{1}{\sqrt{(2\pi)^3\,2E(k)}}\,e^{-ik\cdot x}$. (A precisely analogous formula relates $\phi_{\rm in}(x)$
(resp. $\phi_{\rm out}(x)$) to the in (resp. out) creation and destruction operators
$a^\dagger_{\rm in}(k)$, $a_{\rm in}(k)$ (resp. $a^\dagger_{\rm out}(k)$, $a_{\rm out}(k)$).)
4. With $|g_1, g_2, \ldots, g_n\rangle$ defined as in (9.56), show that $\phi_f\,|g_1, g_2, \ldots, g_n\rangle$ is a state of
finite norm, with $\phi_f$ a free scalar field smeared with a Schwartz-type function.
26 For a discussion of the role of almost local operators in devising thought experiments involving particle
detectors which act as probes of particles in arbitrary localized regions of spacetime, see Section II.4.3 of
(Haag, 1992).
5. Show that the function $F_+(p_0)$ (and hence, trivially, $F_0(p_0)$ and $F_-(p_0)$) defined
in (9.98, 9.99, 9.100) is infinitely many times continuously differentiable for arbitrary
real $p_0$. (Note that the differentiability is trivial except at the two singular
boundary points $p_0 = a, b$.)
6. Show that the sum of overlap integrals for the inner product of two in-state
vectors obtained from Lemma 9.3 in the large time limit agrees with the inner
product following from the Fock space metric (5.21) on continuum-normalized
states, given the definition (9.137).
7. Show that if two momentum-space wavefunctions are non-overlapping,
$g_i^*(\vec{p})\,g_j(\vec{p}) = 0$, the coordinate space overlap of the corresponding single-particle
wavefunctions $g_{i,j}(\vec{x}, t)$ (cf. (9.113)) vanishes for all time.
8. The asymptotic conditions (9.157, 9.158) can be used to derive an important
explicit relation (called the Yang–Feldman equation) between the interacting
Heisenberg field φH (x) and the asymptotic free in-field φin (x).
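(a) First, show that with $a^\dagger_H(k;t)$ and $g_k(x)$ as in Problem 3, for arbitrary
states $|\alpha\rangle_{\rm in}$, $|\beta\rangle_{\rm in}$,
$$ {}_{\rm in}\langle\beta|a^\dagger_H(k;t)|\alpha\rangle_{\rm in} = \sqrt{Z}\;{}_{\rm in}\langle\beta|a^\dagger_{\rm in}(k)|\alpha\rangle_{\rm in} \;-\; i\int_{-\infty}^{t} dt' \int d^3x'\, g_k(x')\,(\Box_{x'} + m_{\rm ph}^2)\,{}_{\rm in}\langle\beta|\phi_H(x')|\alpha\rangle_{\rm in} $$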
9. Starting with the formula (9.174), in which one incoming particle and one
outgoing particle have been reduced out of the asymptotic states, carry the LSZ
process one step further by reducing out the incoming particle with wavefunction
g2 to obtain an expression with the time-ordered product of three Heisenberg
fields.
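10. Derive the following reduction formula for reducing out a single incoming Dirac
particle:
$$S_{k'_1\sigma'_1..k'_m\sigma'_m,\;k_1\sigma_1..k_n\sigma_n} = \frac{i}{Z^{1/2}}\int {}_{\rm out}\langle k'_1\sigma'_1..k'_m\sigma'_m|\bar\psi(x_1)|k_2\sigma_2..k_n\sigma_n\rangle_{\rm in}\,\big(i\overleftarrow{\slashed{\partial}}_{x_1} + m\big) \qquad (9.205)$$
$$\cdot\;\frac{1}{(2\pi)^{3/2}}\sqrt{\frac{m}{E(k_1)}}\;u(k_1,\sigma_1)\,e^{-ik_1\cdot x_1}\,d^4x_1 \qquad (9.206)$$
You should first derive the following Fourier transform formula for the creation
operator for a Dirac particle in the in-state, in terms of the Dirac in-field defined
with normalization as in (7.104):
$$b^\dagger_{\rm in}(k,\sigma) = \int \frac{1}{(2\pi)^{3/2}}\sqrt{\frac{m}{E(k)}}\;\psi^\dagger_{\rm in}(x)\,u(k,\sigma)\,e^{-ik\cdot x}\,d^3x$$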
11. Derive the translation property
$$G(x_1, \ldots, x_n) = G(x_1 - a, \ldots, x_n - a)$$
for the n-point Green functions, and use it to establish the energy-momentum
conservation property for the S-matrix (as given by LSZ (9.176)):
$$S_{k'_1..k'_m,\;k_1..k_n} \propto \delta^4(k'_1 + .. + k'_m - k_1 - \ldots - k_n)$$
10
Dynamics VIII: Interacting fields: perturbative aspects
freedom, the interaction picture typically does not exist—a result usually referred to
as Haag’s theorem. The problem can be removed by a full regularization of the theory
in which both short- and long-distance cutoffs are introduced, leading to a theory
with a finite (indeed, arbitrarily large!) number of independent quantum-mechanical
degrees of freedom. Unfortunately, such regularizations inevitably result in a (one
hopes, temporary) loss of the full Poincaré symmetry of the theory, and the task then
remains to establish the return of this symmetry as the regularization is removed.
In fact, any application of the usual formal “theorems” (such as the Gell–Mann–
Low theorem of Section 9.1, (9.21)) of interaction-picture perturbative expansions in
field theory necessarily requires the insertion of a regularization in order to obtain
unambiguous results, due to unavoidable ambiguities in time-ordered Feynman Green
functions involving composite operators (such as interaction Hamiltonian densities),
as discussed at the end of Section 9.5. On the other hand, the LSZ formula (9.176)
derived in Section 9.4 is completely independent of any reasoning relying on interaction-
picture arguments. Of course, the problem remains of actually calculating the Feynman
n-point amplitudes contained in the LSZ formula (i.e., the VEV of the time-ordered
products of n Heisenberg fields) in order to obtain the desired S-matrix elements.
These Green functions can only be obtained analytically in a handful of toy field
theories in 1+1 spacetime dimensions: in any realistic case, we necessarily must have
recourse to approximative methods. These are basically of two kinds:
$$\cdot\; K_{x_i} K_{x'_j}\; {}_{\rm out}\langle 0|T(\phi_H(x'_1)..\phi_H(x'_m)\,\phi_H(x_1)..\phi_H(x_n))|0\rangle_{\rm in}\; d^4x_i\, d^4x'_j \qquad (10.1)$$
The VEV of the time-ordered product of m + n fields appearing here (in common
parlance, the m + n-point Green function of the theory) has a formal perturbative
expansion via the Gell–Mann–Low theorem (9.21),
$$ {}_{\rm out}\langle 0|T\{\phi_H(y_1)\phi_H(y_2)\,..\,\phi_H(y_m)\}|0\rangle_{\rm in} = \sum_{p=0}^{\infty}\frac{(-i)^p}{p!}\,\langle 0|T\{\phi(y_1)\phi(y_2)..\phi(y_m)\;\cdots $$
which provides the needed formal expansion of the Green function appearing in
the LSZ formula in powers of the interaction. We once again emphasize that in a
continuum field theory with no short-distance cutoff, the T-products appearing on
the right-hand side of (10.2) contain ambiguities, so a suitable regularization (e.g.,
on a spacetime lattice) is implied to make the individual terms in the perturbative
expansion meaningful. All the fields appearing on the right are, of course, free fields,
as they are in the interaction picture.
Our task in this section is to derive an important technical result, called Wick’s
theorem, which will facilitate the computation of these T-products of free fields.
Although, for the purposes of the LSZ formula, we clearly only need the VEV of the
T-products, we shall derive the more general result contained in Wick’s theorem, giving
the T-products as a sum of terms involving products of normal-ordered products of the
fields and c-number two-point Green functions (i.e., free Feynman propagators). As
a normal-ordered product of fields vanishes when it encounters the vacuum either on
the left (bra-state) or right (ket-state), only terms involving products of Feynman
propagators (and no normal-ordered products) will actually survive in the VEV
appearing on the right of (10.2). Nevertheless, the more general operator result derived
below is important in other contexts, and worth the small additional effort required.
The reader should also note that we have reverted to the notation used prior to Section
9.2, wherein unsubscripted fields, such as φ(x), are free interaction-picture fields (as in
Chapters 7 and 8), while Heisenberg fields are explicitly distinguished by a subscript
“H”, as in φH (x).
Wick’s theorem is usually proved by an induction procedure, starting with the
result for two fields—the simplest non-trivial case. A short calculation, using the fact
that the positive (destruction) and negative (creation) frequency parts of the free field
operator φ(x) have a c-number commutator, shows that the difference between the
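time-ordered and normal-ordered product of two fields is itself a c-number. The T-
product of φ(x1 ) and φ(x2 ) is symmetric in its arguments, so with no loss of generality
we may assume $x_1^0 > x_2^0$, whence
$$T(\phi(x_1)\phi(x_2)) - {:}\phi(x_1)\phi(x_2){:} \;=\; [\phi^{(+)}(x_1), \phi^{(-)}(x_2)] \qquad (10.3)$$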
On the other hand, a c-number is equal to its VEV, and by definition the normal-
ordered product vanishes when sandwiched between vacuum states, so taking the VEV
of the operator difference above we find
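$$T(\phi(x_1)\phi(x_2)) - {:}\phi(x_1)\phi(x_2){:} \;=\; \langle 0|T(\phi(x_1)\phi(x_2))|0\rangle \qquad (10.4)$$
or
$$T(\phi(x_1)\phi(x_2)) \;=\; {:}\phi(x_1)\phi(x_2){:} + \langle 0|T(\phi(x_1)\phi(x_2))|0\rangle \qquad (10.5)$$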
1 See Bjorken and Drell (Bjorken and Drell, 1965), for the standard operatorial proof.
We now consider a very simple interaction Hamiltonian, linear in the canonical scalar
field, so that
$$V_{\rm ip}(t) = \int H_{\rm int}(\vec{x},t)\,d^3x = \int j(\vec{x},t)\,\phi(\vec{x},t)\,d^3x \qquad (10.7)$$
where j(x, t) is an unspecified real c-number source function, with respect to which
we shall later wish to perform functional derivatives. This hermitian interaction
Hamiltonian determines a unitary evolution operator U (t, t0 ) in the usual way
(cf. (4.28)):
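$$U(t,t_0) = T\Big\{\exp\Big(-i\int_{t_0}^{t} V_{\rm ip}(\tau)\,d\tau\Big)\Big\} \qquad (10.8)$$
with
$$\frac{\partial U(t,t_0)}{\partial t} = -i\Big(\int j(\vec{x},t)\,\phi(\vec{x},t)\,d^3x\Big)\,U(t,t_0) \qquad (10.9)$$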
The expansion (4.27) shows that time-ordered products of arbitrarily many φ(x) fields
can be obtained from knowledge of U (t, t0 ) by differentiating it with respect to the
source functions j(x). Next, note that if we define a new (unitary) operator E(t, t0 )
by omitting the time-ordering
$$E(t,t_0) \equiv \exp\Big(-i\int_{t_0}^{t} V_{\rm ip}(\tau)\,d\tau\Big) = e^{-i\int j(x)\phi(x)\,d^4x} = e^{-i\int j(x)\,(\phi^{(+)}(x)+\phi^{(-)}(x))\,d^4x} \qquad (10.10)$$
the connection to a normal-ordered quantity is immediate using the BCH formula
(10.6):
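$$\begin{aligned}
{:}E(t,t_0){:} &= {:}e^{-i\int j(x)\,(\phi^{(+)}(x)+\phi^{(-)}(x))\,d^4x}{:} = e^{-i\int j(x)\phi^{(-)}(x)\,d^4x}\; e^{-i\int j(x)\phi^{(+)}(x)\,d^4x} \\
&= e^{-\frac{1}{2}\int j(x_1)j(x_2)\,[\phi^{(-)}(x_1),\,\phi^{(+)}(x_2)]\,d^4x_1 d^4x_2}\; E(t,t_0) \qquad (10.11)
\end{aligned}$$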
where the time-integrals associated with the spacetime coordinates x1 , x2 are implicitly
assumed to go from t0 to t. Our objective of finding a connection between the time-
ordered and normal-ordered field products is therefore accomplished if we can find a
simple relation between U (t, t0 ) and E(t, t0 ). We do this by studying the time-evolution
equation satisfied by E(t, t0 ):
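$$\frac{\partial E(t,t_0)}{\partial t} = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\Big(e^{-i\Delta t\,V_{\rm ip}(t) - i\int_{t_0}^{t} V_{\rm ip}(\tau)\,d\tau} - e^{-i\int_{t_0}^{t} V_{\rm ip}(\tau)\,d\tau}\Big) \qquad (10.12)$$
Now, define $A \equiv -i\Delta t\, V_{\rm ip}(t)$, $B \equiv -i\int_{t_0}^{t} V_{\rm ip}(\tau)\,d\tau$, and apply (10.6), expanded to first
order in $\Delta t$, so that
$$e^{A+B} = \Big(1 - \tfrac{1}{2}[A,B] + A + O((\Delta t)^2)\Big)\,e^{B} \qquad (10.13)$$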
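The operator identity behind (10.11) and (10.13) can be checked in a small finite-dimensional model. The sketch below assumes that (10.6) has the usual form $e^{A}e^{B} = e^{A+B}e^{\frac{1}{2}[A,B]}$, valid when $[A,B]$ commutes with both $A$ and $B$; this form is inferred from the way the formula is used here, not quoted from the text. Since no pair of finite matrices has a commutator proportional to the identity, the c-number commutator of the field theory is modelled by the central element of the 3×3 Heisenberg algebra. All names (`exp_nilpotent`, `X`, `Y`, and so on) are illustrative.

```python
from fractions import Fraction

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul(P, Q):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(3)] for i in range(3)]

def scale(c, P):
    return [[c * P[i][j] for j in range(3)] for i in range(3)]

def commutator(P, Q):
    return add(matmul(P, Q), scale(-1, matmul(Q, P)))

def exp_nilpotent(M):
    """exp(M) for a strictly upper-triangular 3x3 matrix, for which M^3 = 0,
    so the exponential series stops after the quadratic term."""
    return add(add(IDENTITY, M), scale(Fraction(1, 2), matmul(M, M)))

# Generators of the Heisenberg algebra: [X, Y] is central (it commutes with X and Y).
X = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
Y = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]

a, b = Fraction(2, 3), Fraction(-5, 7)   # arbitrary exact coefficients
A, B = scale(a, X), scale(b, Y)
C = commutator(A, B)

ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
central = commutator(C, A) == ZERO and commutator(C, B) == ZERO

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))                      # e^A e^B
rhs = matmul(exp_nilpotent(add(A, B)),
             exp_nilpotent(scale(Fraction(1, 2), C)))                 # e^{A+B} e^{[A,B]/2}

print("commutator is central:", central)
print("BCH identity holds:   ", lhs == rhs)
```

Exact rational arithmetic is used so that the comparison is an equality check rather than a floating-point tolerance test.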
Now let t → +∞ and t0 → −∞, so that in effect we are studying the functionals
U (+∞, −∞) = S[j], E(+∞, −∞) = E[j], with
$$S[j] = T\Big\{\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big)\Big\} \qquad (10.18)$$
$$E[j] = {:}\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big){:} \qquad (10.19)$$
and the integrals extend over all spacetime. Of course, S[j] is just the S-matrix (more
precisely, the S-operator whose matrix elements constitute the S-matrix) for the system
with interaction Hamiltonian (10.7). Using the normal-ordering result (10.11) relating
E to : E :, we find
$$S[j] = \exp\Big(\tfrac{1}{2}\int d^4x_1\, d^4x_2\; j(x_1)j(x_2)\,\big\{[\phi^{(-)}(x_1),\phi^{(+)}(x_2)] - \theta(x_1^0 - x_2^0)\,[\phi(x_1),\phi(x_2)]\big\}\Big)\; E[j] \qquad (10.20)$$
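$$\begin{aligned}
&\langle 0|\,[\phi^{(-)}(x_1), \phi^{(+)}(x_2)] - \theta(x_1^0 - x_2^0)\,[\phi(x_1), \phi(x_2)]\,|0\rangle \\
&\quad= \langle 0|\,{-\phi^{(+)}(x_2)\phi^{(-)}(x_1)} - \theta(x_1^0 - x_2^0)\,[\phi(x_1), \phi(x_2)]\,|0\rangle \\
&\quad= \langle 0|\,{-\phi(x_2)\phi(x_1)} - \theta(x_1^0 - x_2^0)\,[\phi(x_1), \phi(x_2)]\,|0\rangle \\
&\quad= \langle 0|\,{-(\theta(x_1^0 - x_2^0) + \theta(x_2^0 - x_1^0))\,\phi(x_2)\phi(x_1)} - \theta(x_1^0 - x_2^0)\,(\phi(x_1)\phi(x_2) - \phi(x_2)\phi(x_1))\,|0\rangle \\
&\quad= -\langle 0|\,\theta(x_1^0 - x_2^0)\,\phi(x_1)\phi(x_2) + \theta(x_2^0 - x_1^0)\,\phi(x_2)\phi(x_1)\,|0\rangle \\
&\quad= -\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle = -i\Delta_F(x_1, x_2) \qquad (10.21)
\end{aligned}$$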
Inserting this result in (10.20) we find the desired final result (Wick’s theorem in
functional notation)
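$$T\Big\{\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big)\Big\} = {:}\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big){:}\;\; e^{-\frac{1}{2}\int j(x_1)\,\overbracket{\phi(x_1)\phi(x_2)}\,j(x_2)\,d^4x_1 d^4x_2} \qquad (10.22)$$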
where we have introduced the concept of the “contraction of the fields φ(x1 ) and
φ(x2 )”, written $\overbracket{\phi(x_1)\phi(x_2)}$ and defined in this case simply as the Feynman two-point
function of the fields in question:
$$\overbracket{\phi(x_1)\phi(x_2)} \equiv \langle 0|T(\phi(x_1)\phi(x_2))|0\rangle = i\Delta_F(x_1, x_2) \qquad (10.23)$$
The time-ordered and normal-ordered products of fields are recovered simply by taking
the desired number of functional derivatives of (10.22) with respect to the c-number
source function j(x). Thus, the explicit expansion
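$$T\Big\{\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big)\Big\} = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int d^4x_1\, d^4x_2\,..\,d^4x_n\; j(x_1)..j(x_n)\,T\{\phi(x_1)..\phi(x_n)\} \qquad (10.24)$$
implies
$$T\{\phi(x_1)\phi(x_2)..\phi(x_n)\} = i^n\,\frac{\delta^n}{\delta j(x_1)\,\delta j(x_2)\,..\,\delta j(x_n)}\;T\Big\{\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big)\Big\}\Big|_{j=0} \qquad (10.25)$$
and similarly
$${:}\phi(x_1)\phi(x_2)..\phi(x_n){:} = i^n\,\frac{\delta^n}{\delta j(x_1)\,\delta j(x_2)\,..\,\delta j(x_n)}\;{:}\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big){:}\,\Big|_{j=0} \qquad (10.26)$$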
The special case n = 2 derived above, (10.5), emerges immediately by taking the
second functional derivative of (10.22) with respect to j(x1 ), j(x2 ) (and setting j = 0).
In general, the application of functional derivatives to (10.22) clearly results in an
expansion giving the T-product of n fields as a sum of terms where normal products
of all possible subsets of the fields are multiplied by products of contractions (i.e.,
Feynman propagators) of the remaining fields. The reader is strongly encouraged to
verify this explicitly for the case n = 4 (see Problem 2).
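The combinatorial structure of the expansion just described can be enumerated mechanically. The following sketch lists, for n bosonic fields, every way of contracting an even-sized subset of the fields in pairs while leaving the rest under the normal-ordering symbol. The function names (`pairings`, `wick_terms`, `render`) and the text rendering of each term are illustrative choices made here, and the angle brackets in the output stand for contractions.

```python
from itertools import combinations

def pairings(items):
    """Yield every way of splitting `items` completely into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def wick_terms(n):
    """Yield (uncontracted_labels, contractions) for T{phi(x1)..phi(xn)}."""
    labels = list(range(1, n + 1))
    for k in range(0, n + 1, 2):                  # k = number of contracted fields
        for contracted in combinations(labels, k):
            uncontracted = tuple(l for l in labels if l not in contracted)
            for pairs in pairings(list(contracted)):
                yield uncontracted, pairs

def render(term):
    """Format one term as a normal-ordered product times contractions."""
    uncontracted, pairs = term
    parts = []
    if uncontracted:
        parts.append(":" + "".join(f"phi(x{l})" for l in uncontracted) + ":")
    parts += [f"<phi(x{a})phi(x{b})>" for a, b in pairs]
    return " ".join(parts) if parts else "1"

terms = list(wick_terms(4))
for term in terms:
    print(render(term))

# Terms with no normal-ordered factor are the only ones that survive in the VEV.
vev_terms = [t for t in terms if not t[0]]
print(len(terms), "terms in total,", len(vev_terms), "fully contracted")
```

For n = 4 this gives ten terms: one with no contraction, six with a single contraction, and three fully contracted ones, the last being the only survivors in the vacuum expectation value.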
The preceding formulation of Wick’s theorem connects extremely naturally with
the functional (path-integral) formulation of field theory which we shall discuss in
Section 10.3 (indeed, we have chosen to derive Wick’s theorem in the functional
language for precisely this reason), and can be extended to fermionic fields by using
the Grassmann algebra technology to be discussed in Section 10.3.2 (Evans et al.,
1998). For fermions, the two-point Feynman Green functions are necessarily defined
with a minus sign in the anti-time-ordered part (cf. Chapter 7, Problems 5 and 11);
for example, for a Dirac field
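$$\overbracket{\psi_m(x_1)\bar\psi_n(x_2)} \equiv \theta(x_1^0 - x_2^0)\,\langle 0|\psi_m(x_1)\bar\psi_n(x_2)|0\rangle - \theta(x_2^0 - x_1^0)\,\langle 0|\bar\psi_n(x_2)\psi_m(x_1)|0\rangle$$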
while the corresponding two-point contractions for two ψ or two ψ̄ fields vanish:
$$\overbracket{\psi_m(x_1)\psi_n(x_2)} = \overbracket{\bar\psi_m(x_1)\bar\psi_n(x_2)} = 0 \qquad (10.27)$$
The normal-ordered product of two fermionic fields likewise contains additional minus
signs when creation or annihilation parts of the fields are interchanged to effect the
normal-ordering. With these changes, and introducing2 c-number sources η(x), η̄(x)
(instead of the commuting j(x) above) which anticommute for arbitrary spacetime
points x (e.g., {η(x), η(y)} = 0, ∀x, y), one again recovers the basic result (10.22).3
The result for time-ordered products of fermionic fields, i.e., once the anticommuting
sources are removed by functional differentiation, is as above—namely, an expansion
containing all possible subsets of fields under the normal product, multiplied by all
possible contractions of the remaining fields—with the proviso that an extra minus
sign must be included in each term where fermionic fields on the right-hand side (both
in the normal-ordered parts as well as in the contractions) appear in an order which
is an odd permutation of the order in which they appear in the time-ordered product
on the left (as a result of the difference in sign obtained by reordering the attached
sources on the left versus the right).
Another important generalization of Wick’s theorem, more transparently obtained
in an operatorial proof, states that in the expansion of time-ordered products involving
(under the time-ordering symbol) already normal-ordered products of fields, contrac-
tions of the fields within each such normal-ordered product are omitted in the Wick
expansion. The interested reader is referred to the standard texts, e.g., (Bjorken and
Drell, 1965), for a complete operatorial proof of this extension.
$$\langle 0|T\{\phi(x'_1)..\phi(x'_m)\,\phi(x_1)..\phi(x_n)\,H_{\rm int}(z_1)...H_{\rm int}(z_p)\}|0\rangle \qquad (10.28)$$
The calculation is conveniently divided into two stages: (a) a Wick expansion of the
T-product appearing in (10.28), yielding the coordinate space Feynman rules of the
theory, and (b) application of the Klein–Gordon operators and evaluation of the final
(Fourier) integrals over xi , xj in (10.1), yielding the momentum-space Feynman rules
of the theory.
2 A detailed description of the properties of such fermionic c-number functions is deferred to Section
10.3.2, when we consider fermionic functional integration.
3 The desired generalization of Wick’s theorem, where bosonic and/or fermionic fields appear under the
time-ordered product, can also be accomplished by using an inductive operatorial proof. See, for example,
Bjorken and Drell, Relativistic Quantum Fields, Section 17.4 (Bjorken and Drell, 1965).
For the moment, assume that we are dealing with a $\phi^4$ theory (specifically, Theory
A of Section 7.6, with $H_{\rm int}(x) = \frac{\lambda}{4!}\phi(x)^4$) of self-interacting massive scalar particles.4
Thus the matrix element in (10.28) contains the T-product of N = m + n + 4p free
scalar fields, at spacetime points which we may relabel temporarily y1 , y2 , ...yN (with
some of the yi repeated four times). Wick’s theorem then gives the VEV in (10.28) as
a sum of terms of the form
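$$\overbracket{\phi(y_1)\phi(y_2)}\;...\;\overbracket{\phi(y_{p+1})\phi(y_{p+2})}\;...\;\overbracket{\phi(y_{N-1})\phi(y_N)}, \quad N \text{ even}, \; 0 \text{ otherwise} \qquad (10.29)$$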
Here the scalar fields φ(y1 ), ...φ(yN ) are a permutation of the fields occurring in the
particular term of interest extracted from the product
so the spacetime coordinates $y_1, y_2, .., y_N$ are selected from the $x_i$, $i = 1, 2, ..n$, $x'_j$,
$j = 1, 2, ..m$, $z_k$, $k = 1, 2, ..p$. Recall from (10.1) that the spacetime coordinates $x_i$ are
associated with the n incoming particles, the coordinates $x'_j$ with the m outgoing
particles, and the $z_k$ coordinates with the spacetime points at which interactions occur.
Thus, the contractions occurring in (10.29) are of three kinds:
1. Contractions between an external spacetime point (i.e., one of the $x_i$ or $x'_j$) and
an interaction point zk .
2. Contractions between two interaction points zk .
3. Contractions between two external spacetime points.
We may dispose immediately of the last case, in which external particles are connected
directly rather than through the intermediacy of an interaction, as it actually leads
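to a vanishing disconnected contribution.5 Taking the external points to be $x_1$, $x'_1$ for
simplicity, the integrals over $x_1$, $x'_1$ factorize from the rest of the expression in (10.1)
$$\begin{aligned}
&\int e^{ik'_1\cdot x'_1 - ik_1\cdot x_1}\,K_{x_1}K_{x'_1}\,\overbracket{\phi(x'_1)\phi(x_1)}\;d^4x_1\,d^4x'_1 \\
&\quad= \int e^{ik'_1\cdot x'_1 - ik_1\cdot x_1}\,K_{x_1}K_{x'_1}\,i\Delta_F(x'_1 - x_1)\,d^4x_1\,d^4x'_1 \\
&\quad= -i\int e^{ik'_1\cdot x'_1 - ik_1\cdot x_1}\,K_{x_1}\,\delta^4(x'_1 - x_1)\,d^4x_1\,d^4x'_1 \\
&\quad= i(k_1^2 - m_{ph}^2)\int e^{i(k'_1 - k_1)\cdot x_1}\,d^4x_1 \;\to\; 0, \qquad k_1^2 \to m_{ph}^2
\end{aligned} \tag{10.30}$$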
4 We shall see later that a sensible perturbation theory—one in which the amplitudes are expanded in
terms of physically accessible low-energy parameters—requires a split of free and interacting Hamiltonians in
which quadratic terms, called counterterms, are also transferred from the free to the interaction Hamiltonian.
Our theory may also contain $\phi^3$ terms, of course. The discussion given here is readily generalized to include
the corresponding additional graphs in which two or three field lines connect at a spacetime point. Below,
we consider the case of two distinct interacting scalar fields φ, ψ (Theory B from Section 7.6) in some detail.
5 The reader will recall that although disconnected contributions in which a particle passes through
without interacting with the others are certainly present in general S-matrix amplitudes, we explicitly
removed them in the process of deriving the LSZ formula, so we should not expect to see them re-emerging
here!
316 Dynamics VIII: Interacting fields: perturbative aspects
and vanish once the on-mass-shell limit is taken for the external momentum k1 (note
that we have integrated the Klein–Gordon operator by parts onto the exponential in
the step leading to (10.30)).
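The combinatorics behind the three kinds of contraction can be made concrete with a small enumeration. The following is a minimal sketch, not part of the derivation above: it assumes a single $\phi^4$ interaction point (p = 1) with two incoming and two outgoing external points, and the labels `x1`, `x2`, `x1'`, `x2'`, `z:a` ... `z:d` are hypothetical names for the eight field slots. It lists every complete set of pairwise contractions and counts those in which each external point is tied to the vertex.

```python
def pairings(items):
    """Yield every complete pairing (set of Wick contractions) of the items."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def kind(a, b):
    """Classify a contraction by how many of its endpoints are external."""
    externals = sum(label.startswith("x") for label in (a, b))
    return {2: "external-external", 1: "external-vertex", 0: "vertex-vertex"}[externals]


# Hypothetical labels: incoming points x1, x2, outgoing points x1', x2',
# and the four field slots sitting at one phi^4 interaction point z.
fields = ["x1", "x2", "x1'", "x2'", "z:a", "z:b", "z:c", "z:d"]

all_pairings = list(pairings(fields))
fully_attached = [
    p for p in all_pairings
    if all(kind(a, b) == "external-vertex" for a, b in p)
]

print(len(all_pairings))    # number of complete contractions of 8 fields
print(len(fully_attached))  # those with every external point tied to z
```

Of the 7!! = 105 complete contractions, only 4! = 24 attach all four external points to the vertex; every other pairing contains at least one external-external contraction of the type disposed of in (10.30).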
The first two types of contraction described above form the building blocks of a
graphical representation of the perturbative amplitudes of the theory first introduced
by Feynman in his seminal work (Feynman, 1949a) on quantum electrodynamics in the
late 1940s. Contractions between fields at an external point and an interaction point
(or vertex) are referred to as the external legs of the graph. Contractions between
two interaction vertices are the internal lines of the graph. For a simple illustration,
consider Theory B in Section 7.6, with interaction Hamiltonian
$$H_{\rm int}(x) = \lambda\,\psi^\dagger(x)\psi(x)\phi(x) \tag{10.31}$$
with φ a self-conjugate, ψ a complex scalar field. In this theory, the only non-zero
contractions are $\overbracket{\phi(x_1)\phi(x_2)} = i\Delta_F(x_1 - x_2; M)$ and $\overbracket{\psi(x_1)\psi^\dagger(x_2)} = i\Delta_F(x_1 - x_2; m)$.
In Section 7.6.3 we computed the second-order (n=2) contribution to $\psi$–$\psi^c$ scattering
in a theory with this interaction Hamiltonian. In this order the relevant LSZ formula
contains the T-product $\langle 0|T\{\psi_H^\dagger(x'_1)\psi_H(x'_2)\psi_H(x_1)\psi_H^\dagger(x_2)\}|0\rangle$, the expansion of which
to second order employing the Gell–Mann–Low formula (10.2) contains the following
T-product of ten interaction-picture fields:
$$\langle 0|T\{\psi^\dagger(x'_1)\psi(x'_2)\psi(x_1)\psi^\dagger(x_2)\,\psi^\dagger(z_1)\psi(z_1)\phi(z_1)\,\psi^\dagger(z_2)\psi(z_2)\phi(z_2)\}|0\rangle \tag{10.32}$$
In the application of Wick’s theorem to this expression, we recall that only connected
contributions, in which any spacetime point can be connected with any other by a
continuous sequence of contractions, are phenomenologically relevant. It will soon
become clear that these are precisely the contributions which give rise (after the
Fourier transformation effected by integrating over the external particle exponential
factors) to a single overall δ-function of four-momentum conservation (cf. Chapter 6).
In the particular case here, there are just four possible connected contractions, which
correspond to two topologically distinct graphs, duplicated by the symmetry z1 ↔ z2 ,
which simply produces a factor of 2 when the integrals over z1 , z2 are performed. Thus,
the relevant part of the Wick expansion of (10.32) is
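$$\begin{aligned}
\Big\{&\overbracket{\psi(x_1)\psi^\dagger(z_1)}\;\overbracket{\psi(z_1)\psi^\dagger(x_2)}\;\overbracket{\psi(x'_2)\psi^\dagger(z_2)}\;\overbracket{\psi(z_2)\psi^\dagger(x'_1)} \\
+\;&\overbracket{\psi(x_1)\psi^\dagger(z_1)}\;\overbracket{\psi(z_2)\psi^\dagger(x_2)}\;\overbracket{\psi(x'_2)\psi^\dagger(z_2)}\;\overbracket{\psi(z_1)\psi^\dagger(x'_1)}\Big\}\,\overbracket{\phi(z_1)\phi(z_2)} + (z_1 \leftrightarrow z_2)
\end{aligned} \tag{10.33}$$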
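[Figure: the two coordinate-space graphs for the contractions above, with external points $x_1$, $x_2$, $x'_1$, $x'_2$ and vertices $z_1$, $z_2$, plus the $z_1 \leftrightarrow z_2$ copies.]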
fields appearing in the fundamental LSZ formula (10.1), which need only the further
application of some spacetime-derivatives (in the Klein–Gordon operators) and Fourier
integrals to yield the desired scattering amplitude, seem by Wick’s theorem to dissolve
into sums of at first sight perfectly well-defined products of free propagators (i.e.,
the Feynman functions ΔF (x)). These propagators are indeed—as distributions—well
defined. For example, their Fourier transforms exist, also as well-defined distributions
(namely, the familiar $1/(q^2 - m^2 + i\epsilon)$ factors). The problem arises from the fact
that products of well-defined distributions are not necessarily well-defined, although in
particular cases they may well be so. Problems with the multiplication of distributions
typically arise when the distributions being multiplied have coincident singularities.
In the situation here discussed, (10.33), no two Feynman propagators have the same
coordinate space arguments, so this situation does not arise. The fourth-order graph
shown in Fig. 10.2, on the other hand, evidently contains the square of the propagator
$\Delta_F(z_2 - z_3; m)$,
Fig. 10.2 Coordinate space contractions for $\psi$–$\psi^c$ scattering in fourth order, containing a
closed loop.
and this object is not a well-defined distribution! How do we know this? From the
fundamental theorem guaranteeing the existence of the Fourier transform of any
decent tempered distribution. If we attempt to compute this Fourier transform for
the indicated squared propagator, we find (writing z for z2 − z3 )
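$$\begin{aligned}
\Pi(q) &\equiv \int \Delta_F^2(z; m)\,e^{iq\cdot z}\,d^4z \\
&= \int \frac{e^{-ik_1\cdot z}}{k_1^2 - m^2 + i\epsilon}\,\frac{e^{-ik_2\cdot z}}{k_2^2 - m^2 + i\epsilon}\,\frac{d^4k_1}{(2\pi)^4}\frac{d^4k_2}{(2\pi)^4}\,e^{iq\cdot z}\,d^4z \\
&= \int (2\pi)^4\delta^4(q - k_1 - k_2)\,\frac{1}{k_1^2 - m^2 + i\epsilon}\,\frac{1}{k_2^2 - m^2 + i\epsilon}\,\frac{d^4k_1}{(2\pi)^4}\frac{d^4k_2}{(2\pi)^4} \\
&= \int \frac{1}{k_1^2 - m^2 + i\epsilon}\,\frac{1}{(q - k_1)^2 - m^2 + i\epsilon}\,\frac{d^4k_1}{(2\pi)^4} = \infty!
\end{aligned} \tag{10.35}$$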
The infinity arises when we integrate (as, in a continuum theory with no short-
distance or high-momentum cutoff, we must) over all of four-dimensional momentum
space: we then discover that the integrand behaves, for $k_1 \gg q$, like $d^4k_1/k_1^4$, and
is therefore logarithmically divergent. The (infinite) ambiguity is in this case of a very
simple form: it is removed by a single subtraction at an arbitrary value of q, and
therefore amounts to a single overall additive constant in the definition of the product
distribution. For example, if we subtract at q = 0, we obtain
$$\Pi(q) - \Pi(0) = \int \frac{2q\cdot k_1 - q^2}{(k_1^2 - m^2 + i\epsilon)^2\,\big((q - k_1)^2 - m^2 + i\epsilon\big)}\,\frac{d^4k_1}{(2\pi)^4} \tag{10.36}$$
which is perfectly finite, as the integrand now behaves like $1/k_1^5$ at large values of $k_1$.
We shall shortly see that the integrated momentum k1 appears in graphs containing
loops as in Fig. 10.2: accordingly, such momenta are usually called “loop momenta”.
Going back to coordinate space (by an inverse Fourier transform), we see that the
(additive) ambiguity in the square of our Feynman propagator amounts to the Fourier
transform of an undetermined constant, i.e., to a four-dimensional δ-function $\delta^4(z)$,
corresponding as expected to the short-distance limit in which the spacetime vertices z2
and z3 coincide. On the other hand, if we work—as we shall henceforth in this chapter
assume we are doing—in a regularized version of the theory6 with a suitably chosen
high-momentum cutoff, the divergence in (10.35) is removed and our amplitudes will
be well-defined at all stages. Soon, we shall see that ultraviolet divergences arising
from ill-defined products of coordinate-space distributions are associated with graphs
containing loops, leading to unbounded integrations in momentum space.
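The logarithmic growth with the cutoff is easy to check numerically. The sketch below is an illustration under stated assumptions rather than the integral (10.35) itself: it uses the Euclidean (Wick-rotated) analogue of $\Pi(0)$, in which $1/(k^2 - m^2 + i\epsilon)$ is replaced by $1/(k^2 + m^2)$ and the four-dimensional angular integration gives $d^4k/(2\pi)^4 \to k^3\,dk/(8\pi^2)$. The function names and the sharp cutoff $|k| < \Lambda$ are choices made here for illustration.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def loop_integral(cutoff, m=1.0, n=200_000):
    """Euclidean analogue of Pi(0), restricted to |k| < cutoff."""
    integrand = lambda k: k**3 / (k * k + m * m) ** 2
    return simpson(integrand, 0.0, cutoff, n) / (8 * math.pi**2)


def loop_integral_exact(cutoff, m=1.0):
    """Closed form of the same radial integral, for comparison."""
    r = 1.0 + (cutoff / m) ** 2
    return (math.log(r) + 1.0 / r - 1.0) / (16 * math.pi**2)


# Raising the cutoff by a factor of ten adds a fixed amount, ln(10)/(8 pi^2),
# which is the signature of a logarithmic divergence.
growth_per_decade = loop_integral(1000.0) - loop_integral(100.0)
expected = math.log(10.0) / (8 * math.pi**2)
print(growth_per_decade, expected)
```

Each further decade in the cutoff adds the same increment, so the integral has no limit as the cutoff is removed, whereas a subtracted integrand falling like $1/k^5$ would converge.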
Returning to our initial task, the evaluation of the nth order perturbative con-
tributions to the S-matrix element in (10.1), we see that the final result is obtained
6 A detailed account of suitable regularizations is postponed to Chapter 16, where we begin the discussion
of the sensitivity of field-theory amplitudes to short-distance cutoffs in the theory.
by applying Klein–Gordon operators $\Box + m_{ph}^2$ (recall: $m_{ph}$ is the actual physical mass
of the particle) to the outer vertex on each of the external legs (for both incoming
and outgoing particles). Equivalently, we may integrate the derivatives by parts onto
the plane-wave exponentials, obtaining factors of $m_{ph}^2 - k_i^2$ (resp. $m_{ph}^2 - k_i'^2$) for each
of the incoming (resp. outgoing) particles in the process. These, together with the
indicated phase-space factors involving the square-root of the particle energies, then
multiply the Fourier transform with respect to the external momenta of the coordinate
space product of contractions (i.e., free propagators) arising from Wick’s theorem as
discussed above. In our explicit example (10.33), for example, the reader may easily
verify (see Problem 3) that the Fourier transform of the indicated contractions yields
(with a factor of two due to the z1 ↔ z2 symmetry), and integrating also over the
interaction vertex points z1 , z2 , as required by (10.2),
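$$\int e^{ik'_1\cdot x'_1 + ik'_2\cdot x'_2 - ik_1\cdot x_1 - ik_2\cdot x_2}\,G_{\rm scatt}(x'_1, x'_2, x_1, x_2, z_1, z_2)\,\prod_i d^4x_i\,d^4x'_i\,d^4z_i \tag{10.37}$$
in terms of the momentum-space Feynman propagators $\tilde\Delta_F(k; m) = \frac{1}{k^2 - m^2 + i\epsilon}$. In the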
process of performing the integrations over the locations of the interactions vertices
zi , four-dimensional δ-functions implementing energy-momentum conservation at each
vertex are generated. After integrating over the momenta carried by each propagator,
we are left in this case (as the graph is connected: cf. Chapter 6) with a single overall δ-
function enforcing energy-momentum conservation for the entire process. The products
of Feynman propagators appearing in (10.37) have a clear graphical interpretation: the
relevant graphs are in fact just those displayed in Fig. 7.2, in our original discussion of
this scattering process in Chapter 7. We may summarize the ingredients of the above
calculation in a general way valid for the perturbative calculation of arbitrary S-matrix
amplitudes of our theory (with interaction Hamiltonian Hint (x) = λψ† (x)ψ(x)φ(x)),
as a set of Feynman rules associating specific algebraic expressions with each of the
graphical elements:
1. At each 4-vertex, a factor $-i\lambda$, and a four-momentum conservation factor which
ensures that the sum of incoming momenta to the vertex equals the sum of
outgoing momenta: namely, $(2\pi)^4\delta^4(\Sigma)$ (where $\Sigma$ is a shorthand notation for the
difference in the total incoming and outgoing four-momenta at the vertex).
2. For each line carrying momentum $q$, a factor $i\tilde\Delta_F(q) = \frac{i}{q^2 - m^2 + i\epsilon}$.
3. Integrate over all internal momenta qi : i.e., those associated with the internal
lines of the diagram, connecting two interaction vertices. The external lines corresponding
to propagators beginning at one of the external points $x_i$, $x'_i$ are fixed
at the corresponding external particle momenta. In general there will be more
internal momenta present than δ-functions available to fix them, leaving some
number L of remaining four-momentum integrals. If we think (as a consequence
of rule 1) of the four-momentum as a conserved “fluid” flowing through the graph,
it is immediately apparent that the number L of such remaining integrals must
correspond to the number of independent closed loops around which an arbitrary
The extension of these rules to other field theories, containing fields of non-zero
spin (e.g., Dirac or vector fields), and other types of (polynomial) self-interactions,
is straightforward. Internal lines for a Dirac particle connecting vertices at y1 and y2
become propagators iSF (y1 − y2 ), and so on. There is one additional rule, perhaps
not immediately obvious from the preceding, concerning the contraction of fermionic
fields in a closed loop: in such cases, an additional minus sign must be inserted in the
amplitude, as a consequence of fermionic statistics. Consider, for example, Theory C
of Section 7.6, with interaction Hamiltonian
$$H^{(C)}_{\rm int}(x) = \lambda\,\bar\psi_a(x)\psi_a(x)\phi(x) \tag{10.38}$$
where ψ(x) is a massive Dirac field (with the Dirac index here denoted a =1,2,3,4) and
φ(x) a self-conjugate scalar field. Among the various Wick contractions of a second-
order contribution to the S-matrix expansion (10.1) with this interaction will occur
terms corresponding to the graph illustrated in Fig. 10.2, where now the bold lines
represent fermions and the wiggly lines the scalar particle. In this case we have fermion
fields at two separate vertices (at spacetime points z2 and z3 ) fully contracted so that
$$T(\bar\psi_{a_1}(z_2)\psi_{a_1}(z_2)\bar\psi_{a_2}(z_3)\psi_{a_2}(z_3)\ldots) = -\overbracket{\psi_{a_1}(z_2)\bar\psi_{a_2}(z_3)}\;\overbracket{\psi_{a_2}(z_3)\bar\psi_{a_1}(z_2)} \tag{10.39}$$
The minus sign arises because the reordering of the fields between the left and
right-hand side of (10.39) involves an odd permutation, and hence, by the fermionic
extension of Wick’s theorem discussed in the previous Section, an additional minus
sign. In graphical calculations this requirement is summarized in the simple rule: closed
fermion loops get an extra minus sign.7
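The parity of the reordering in (10.39) can be verified mechanically. In the minimal sketch below (the function names are introduced here for illustration), the fields at the vertices of a closed loop are numbered in their original order $\bar\psi_1\psi_1\bar\psi_2\psi_2\ldots$, and the reordering that brings contracted partners next to each other is taken to be the cyclic shift moving the first $\bar\psi$ to the far right, as on the right-hand side of (10.39).

```python
def permutation_sign(perm):
    """Sign of a permutation given as the list of images of 0..n-1."""
    sign = 1
    seen = [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if length % 2 == 0:  # a cycle of even length is an odd permutation
            sign = -sign
    return sign


def closed_loop_sign(n_vertices):
    """Sign picked up when 2n fermion fields are cyclically shifted by one."""
    n_fields = 2 * n_vertices
    # Original order: psibar_1 psi_1 psibar_2 psi_2 ...; the new order starts
    # at psi_1 and ends with psibar_1.
    reordered = list(range(1, n_fields)) + [0]
    return permutation_sign(reordered)


print(closed_loop_sign(2))  # the two-vertex loop of (10.39)
```

Since a closed loop always involves an even number 2n of fermion fields, the shift is a single cycle of even length and therefore odd, so the sign is −1 for any number of vertices on the loop.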
There are several non-trivial issues which arise in interpreting the results of a
perturbative calculation of S-matrix amplitudes along the lines discussed above, some
of which have already been touched on above. Basically, obstructions to obtaining
well-defined results at each given order of perturbation theory may arise both in the
ultraviolet (short distance) and infrared (long time) domains. Even then, when a
well-defined regularized result has been obtained order by order, it must be kept in mind
that the resultant (infinite) series is never a convergent Taylor expansion, but only, at
best, an asymptotic expansion of the exact S-matrix amplitude. Let us examine these
three sets of problems in more detail.
We have already touched on the difficulties at short distance—the famous “ultra-
violet divergences” of perturbative quantum field theory—which go back to the ambi-
guities and/or divergences which arise when distributions with coincident spacetime
singularities are multiplied. These divergences are most easily seen in momentum
space, in the Fourier transforms of the products of coordinate space Green functions,
as ultraviolet divergences of loop integrals, such as (10.35), due to insufficiently
rapid falloff of the product of momentum-space propagators at large momentum.
In order to give meaning to the amplitudes we must introduce some appropriate
regularization of the loop integrals appearing in the amplitudes: namely, a large
momentum (or short distance) cutoff rendering all amplitudes finite and well defined in
all orders of perturbation theory. Here, “appropriate” means a regularization doing the
least violence manageable to the important underlying symmetries (such as Poincaré
invariance) of the theory, in such a way that eventually removal of the cutoff can be
accomplished with the restoration of these symmetries. Equivalently, we need to show
that the sensitivity of the Feynman amplitudes of the theory to the presence of the
short-distance cutoff, which can be viewed as an expression of our ignorance of as
yet unexplored new physics at very short distance scales, is for all practical purposes
negligible. These issues of scale sensitivity of field theory—and the whole machinery
of renormalization theory needed to address them adequately—will be the primary
topic of discussion in Part 4 of this book. For the time being we will simply assume,
when deriving or using perturbative results in the interaction picture, that the field
7 Of course, exactly as discussed previously for the scalar theory of Fig. 10.2, the multiplication of
two Dirac propagators with the same singularity (at z2 → z3 ) results in an undefined result, visible as a
divergence in the Fourier transform due to UV contributions to the corresponding loop integral. In this case
the divergence is even more severe, resulting in the appearance of two arbitrary constants. More on all this
in Chapters 16 and 17.
theory has been suitably regularized and that only well-defined expressions occur in
the evaluation of the Gell–Mann–Low expansion (10.2).
Another set of problems for our perturbative analysis arises from the large-distance
(or large-time) regime: specifically from the presence in a relativistic field theory
of persistent interactions which continue to affect the propagation of particles even
when (long before or long after a scattering interaction) they are well-separated from
one another and moving freely. Firstly, persistent interactions are present even in
the absence of scattering particles, i.e., in the evolution of the vacuum state. Such
“disconnected vacuum bubbles” appear with equal probability at all spacetime points
and lead to a spacetime volume dependent phase in the overlap of the in- and out-vacua
of the theory. Secondly, persistent interactions can occur as “radiative corrections”
on any of the incoming or outgoing particle lines in a scattering process. In Fig.
10.3 we display a Feynman diagram in λφ4 theory in which both types of persistent
interactions are present. We shall discuss the proper treatment of the external line
radiative corrections below. The role of the vacuum fluctuations leads to one of
the most interesting issues in modern cosmology: the cosmological constant, and its
“abnormally low” value (the so-called “cosmological constant problem”). The fact
that the energy of the discrete vacuum state is shifted by interactions is a perfectly
normal situation in any quantum-mechanical theory. The actual level of the ground-
state energy is not physically relevant in flat-space quantum field theory as it is an
unobservable quantity, but in the presence of gravity the absolute level of energies
(and energy-densities: specifically, the components of the energy-momentum tensor)
is clearly relevant. Staying with the flat-space theories which are our focus in this
book, it is easy to see that the net effect of the vacuum bubbles, defined as the set of
graphs arising from the m = 0 special case of the Gell–Mann–Low theorem (10.2),
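$${}_{\rm out}\langle 0|0\rangle_{\rm in} = \sum_{p=0}^{\infty}\frac{(-i)^p}{p!}\int \langle 0|T\{H_{\rm int}(z_1)H_{\rm int}(z_2)\cdots H_{\rm int}(z_p)\}|0\rangle\,d^4z_1\,d^4z_2\cdots d^4z_p \tag{10.41}$$
is to generate a phase whose exponent is proportional to $\delta E\,VT$,
which is singular in the limit where spatial volume $V$ and temporal extent $T$ go to
infinity. Here $\delta E$ is the interaction-induced shift in the vacuum spatial energy-density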
[Figure 10.3 labels: correction to incoming propagator; “free-floating” vacuum fluctuation]
Fig. 10.3 A disconnected Feynman diagram in $\lambda\phi^4$ theory displaying persistent interactions.
(which in fact typically contains ultraviolet divergences). The presence of the spacetime
volume factor V T is to be expected, as the disconnected vacuum bubbles, such as
that shown in Fig. 10.3, are free to float over the entire spacetime volume, so the
evaluation of the Feynman diagram necessarily produces a factor of V T . In fact, this
divergence is the only singularity of an infrared nature in a theory of massive particles,
and may be removed by the simple device of taking only connected contributions to
S-matrix elements, which automatically dispenses with the noisome vacuum fluctua-
tions. Effectively, this amounts to using the same vacuum state (either in or out) on
both sides of (10.1), or equivalently, to dividing a general S-matrix element ${}_{\rm out}\langle\beta|\alpha\rangle_{\rm in}$
(as given by the LSZ formula, say) by the vacuum phase ${}_{\rm out}\langle 0|0\rangle_{\rm in}$.
Returning now to the second class of persistent interactions, those giving rise to
radiative corrections on the external legs of the diagrams, we recall from the discussion
of scattering theory in Section 9.4 that the generation of the appropriate in- and out-
states whose overlap gives the desired S-matrix element is formally accomplished by
taking the on-(physical)mass-shell limit for the four-momentum associated with each
external particle, $k_i^2 \to m_{ph}^2$. In addition, the momentum-space $m + n$ point Green
function (where m, n are the number of outgoing or incoming particles, respectively)
appearing in the LSZ formula (cf. (10.1)) must be multiplied by factors $k_i^2 - m_{ph}^2$
for each external particle, so that the on-mass-shell limit gives a well-defined finite
result if and only if this Green function has simple poles in each of the off-shellness
variables $k_i^2 - m_{ph}^2$, with the final S-matrix amplitude given as the residue of these
poles. A careless execution of perturbation theory will result in amplitudes which,
at any fixed order of perturbation theory, fail to have the required pole behavior!
In particular, taking φ4 scalar theory as an explicit example, if we begin with the
Hamiltonian
$$H = \int d^3x\,:\!\Big\{\tfrac{1}{2}\dot\phi^2 + \tfrac{1}{2}|\vec\nabla\phi|^2 + \tfrac{1}{2}m^2\phi^2 + \tfrac{\lambda}{4!}\phi^4\Big\}\!: \tag{10.43}$$
the free momentum-space Feynman propagator $\tilde\Delta_F(k)$ will clearly have a pole at
$k^2 = m^2$, where m is the so-called “bare mass”, corresponding to the coefficient of $\phi^2$
in the Hamiltonian, but not to the actual physical mass $m_{ph}$, which differs from m as
a consequence of the interaction term. This is hardly unexpected: the energy of states
in quantum mechanics (in this case, the single particle at rest) is typically shifted
from the unperturbed value once interactions are switched on. We can restore mass
stability in our perturbation theory by splitting the full Hamiltonian in a different
way, thereby forcing the free part H0 to yield a single particle state with the correct
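physical mass, by defining
$$m^2 = m_{ph}^2 + \delta m^2$$
and transferring the $\delta m^2\phi^2$ “mass counterterm” into the interaction part of the
Hamiltonian, so that we now have
$$H_0 = \int d^3x\,:\!\Big\{\tfrac{1}{2}\dot\phi^2 + \tfrac{1}{2}|\vec\nabla\phi|^2 + \tfrac{1}{2}m_{ph}^2\phi^2\Big\}\!: \tag{10.46}$$
$$V = \int d^3x\,:\!\Big\{\tfrac{1}{2}\delta m^2\phi^2 + \tfrac{\lambda}{4!}\phi^4\Big\}\!: \tag{10.47}$$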
with the coefficient $\delta m^2$ adjusted order by order in the expansion in the “bare
coupling” λ to ensure that the pole of the full Heisenberg propagator remains at the
required physical value (namely, at $k^2 = m_{ph}^2$).8 The resultant perturbative amplitudes
are said to be “on-shell renormalized”. It is also possible (and frequently convenient)
to employ a more general class of perturbative splits, in which the poles of the free
propagator are not at the physical mass, but at some “intermediate mass” (coinciding
neither with the bare mass in the Hamiltonian nor the physical particle mass): this
requires the reorganization of the full Green functions into a factorized product of
full propagators on the external legs (with poles at the correct physical mass, by the
Lehmann representation of Section 9.5) and “amputated” Green functions which can
be computed perturbatively with intermediate mass renormalization. How this can be
accomplished will be described below in Section 10.4, where the relevant topological
concepts are introduced.
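The bookkeeping role of the mass counterterm can be illustrated with a short numerical check. The sketch below makes several assumptions that should be kept in mind: it takes the split $m^2 = m_{ph}^2 + \delta m^2$ implied by comparing (10.43) with (10.46) and (10.47), it assigns the quadratic term $\frac{1}{2}\delta m^2\phi^2$ a two-point vertex factor $-i\,\delta m^2$ by analogy with rule 1, and it works at an off-shell real $k^2$ so that the $i\epsilon$ can be dropped. The numerical values are arbitrary.

```python
def free_propagator(k2, mass2):
    """i / (k^2 - mass^2), evaluated off shell so i*epsilon can be dropped."""
    return 1j / (k2 - mass2)


def with_counterterm_insertions(k2, m_ph2, delta_m2, order):
    """Sum chains of propagators with 0..order insertions of -i*delta_m2."""
    d = free_propagator(k2, m_ph2)
    vertex = -1j * delta_m2  # assumed two-point vertex from (1/2) delta_m2 phi^2
    return sum(d * (vertex * d) ** n for n in range(order + 1))


# Arbitrary illustrative values (units of mass squared).
k2, m_ph2, delta_m2 = 5.0, 1.0, 0.3

resummed = with_counterterm_insertions(k2, m_ph2, delta_m2, order=60)
shifted_pole = free_propagator(k2, m_ph2 + delta_m2)
print(resummed, shifted_pole)
```

The geometric series of counterterm insertions, taken alone, moves the pole from $m_{ph}^2$ back to $m_{ph}^2 + \delta m^2$. Keeping the pole of the full propagator at $m_{ph}^2$ therefore requires the self-energy contributions generated by the $\lambda\phi^4$ term to compensate this shift order by order, which is the adjustment of $\delta m^2$ described above.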
Finally, there remains the “inconvenient truth” that a perturbative expansion of
field-theoretic amplitudes as a formal expansion in some suitably defined coupling
constant(s) of the theory at best provides an asymptotic expansion to the exact
amplitude, not a convergent (Taylor) expansion capable of yielding results of arbitrary
accuracy (at least in principle) by pushing the calculation to sufficiently high orders.
This should not be surprising if we recall that the same is true even in very simple
(non-field-theoretic) models in non-relativistic quantum theory. The discrete energy
eigenvalues En (λ) of the one-dimensional anharmonic oscillator Hamiltonian,
$$H = \frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2 + \lambda x^4 \tag{10.48}$$
when developed in an expansion in powers of the anharmonicity λ, produce an asymptotic
expansion $E_n(\lambda) \sim \sum_p C_p^{(n)}\lambda^p$ in which the coefficients $C_p^{(n)}$ grow factorially with
p, so that the series in fact has zero radius of convergence. The lack of analyticity in
the λ variable is hardly surprising, if we consider the dramatically different behavior
of the theory if λ is taken negative (real), however small: the Hamiltonian then lacks
a ground state, as the $\lambda x^4$ part of the potential energy eventually becomes arbitrarily
negative for large enough x. The Rayleigh–Schrödinger perturbation expansion is still
useful of course, treated as an asymptotic expansion, provided λ is sufficiently small,
allowing an accurate estimate of the energy to be obtained by summing the first few
terms of the series (before the summands begin to increase). In certain cases (the
8 We remind the reader that we are assuming a fully regularized theory: in particular, all UV divergences
have been taken care of, so the perturbative expansion of the Green functions yields finite well-defined results
everywhere.
Path-integral formulation of field theory 325
so-called “Borel summable” expansions; cf. Section 11.3) the information contained in
the perturbative coefficients in fact suffices to determine the exact amplitudes, but this
is not the case in most field theories of interest (in particular, in the gauge theories
which underlie the Standard Model of particle physics), and we must face the fact
that such theories contain qualitatively important “non-perturbative physics”, going
beyond any information which can be gleaned from a purely perturbative approach.
We will return to the issue of the non-convergence of perturbation theory, and the
extent to which “truly non-perturbative” information can be accessed by alternative
methods, in Chapter 11.
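The behavior of an asymptotic series (initial improvement followed by eventual divergence) can be seen in a toy model simpler than the anharmonic oscillator. The following sketch is an illustration only and swaps in a different object: instead of the energy levels $E_n(\lambda)$ it uses the ordinary integral $Z(\lambda) = \int dx\,e^{-x^2/2 - \lambda x^4}/\sqrt{2\pi}$, whose formal expansion in λ has coefficients $(-1)^p(4p-1)!!/p!$. The value λ = 0.01 and the truncation at 30 terms are arbitrary choices.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


lam = 0.01


def integrand(x):
    return math.exp(-0.5 * x * x - lam * x**4) / math.sqrt(2 * math.pi)


# "Exact" value by direct numerical integration.
exact = simpson(integrand, -10.0, 10.0, 20_000)

# Formal series: <x^(4p)> = (4p-1)!! for a unit Gaussian, so
# term_p = (-lam)^p (4p-1)!!/p!, and term_(p+1)/term_p = -lam (4p+1)(4p+3)/(p+1).
errors = []
term, partial = 1.0, 0.0
for p in range(31):
    partial += term
    errors.append(abs(partial - exact))
    term *= -lam * (4 * p + 1) * (4 * p + 3) / (p + 1)

best_order = min(range(len(errors)), key=errors.__getitem__)
print(best_order, errors[best_order], errors[-1])
```

The error shrinks for the first few orders, reaches a minimum at an order of roughly 1/(16λ), and then grows without bound as the factorial growth of the coefficients overwhelms the powers of λ.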
The analogous quantity for a free field theory, a generating functional for the vacuum
expectation value of time-ordered free field operators, has already been introduced
and discussed in Section 10.1 from an operatorial standpoint. Thus (introducing the
subscript 0 to indicate a free field theory, and taking a free massive scalar field to be
specific) we expect that the functional
$$Z_0[j] \equiv \langle 0|S[j]|0\rangle = \langle 0|T\Big\{\exp\Big(-i\int j(x)\phi(x)\,d^4x\Big)\Big\}|0\rangle \tag{10.50}$$
plays a role analogous to the Z[j] in (10.49). We emphasize at this point that an overall
multiplicative constant in Z0 [j] is irrelevant when the normalized n-point functions
(compare (4.135)) are computed, as these involve derivatives of Z0 divided by Z0 .
We seek a functional integral representation for this object analogous to (10.49):
such a representation clearly involves canonical “momenta” complementary to the
coordinate degrees of freedom, which in this case are evidently the field operators at
any spatial point (on a given time-slice). The appropriate complementary quantities
were discussed in our treatment of the classical limit in Section 8.1: for our simple
scalar theory, the field operator and its time-derivative form quantum-mechanically
conjugate variables:
$$[\pi(\vec x, t), \phi(\vec y, t)] = -i\delta^3(\vec x - \vec y), \qquad \pi(\vec x, t) \equiv \frac{\partial\phi(\vec x, t)}{\partial t} \tag{10.51}$$
We shall proceed by formally imitating the representation (10.49), with the appropriate
modifications for field theory, and then checking explicitly to ensure that the resultant
functional indeed yields the desired Feynman functions of the free field theory.
The generalization of the path-integral formula to include interactions will then be
straightforward.
Essentially, the only modification needed to convert (10.49) into a field theory
formula is to include a spatial integral, as our “coordinates” and “momenta” are now
the values of the field φ(x, t) and its time-derivative π(x, t) at all spatial points (at
any given time). This immediately yields the path integral
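$$Z_0[j] = \int \mathcal{D}\phi\,\mathcal{D}\pi\,\exp\Big\{i\int \big(\pi(x)\dot\phi(x) - \mathcal{H}_0(\pi(x), \phi(x)) - j(x)\phi(x)\big)\,d^4x\Big\} \tag{10.52}$$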
where, as expected, the time integrals of (10.49) have been augmented to spacetime
integrals. As discussed in Section 4.2.4, expressions of this kind are purely formal in
nature—they must be given a precise meaning by the following maneuvers:
1. An appropriate regularization of the spacetime continuum, so that we are dealing
with a multiple integral over a well-defined discrete set of (many!) integration
variables. An obvious way to do this is simply to imagine defining the theory on
a finite discrete spacetime lattice: in other words, in addition to discretizing the
time variable as was done in Section 4.2.1, we also replace the spatial integrals
by sums over a finite spatial lattice.9 Space and time-derivatives are, of course,
replaced by appropriate difference quantities. The need for a short-distance cutoff
to obtain well-defined results is, of course, familiar by now from our discussion
of the difficulties that can arise from multiplying distributions in perturbation
theory; here we imagine also a long-distance cutoff (our spacetime lattice is
of finite extent) so that the functional integral becomes a multi- but finite-
dimensional one. At an appropriate point, once the path integrals have been
9 We have previously referred to a field theory regularized in this way as a fully regularized theory.
performed, the continuum (i.e., zero lattice spacing) limit can be taken to obtain
the desired continuum results, and the spatial volume can then be allowed to go
to infinity.
2. Even after the functional integral is converted by the aforesaid discretization into
a conventional multi-dimensional integral, a further regularization, of a different
kind, is required to obtain well-defined results. Again, as discussed in Chapter
4, the integral (10.52) as it stands is not absolutely convergent, as the integrand
involves a complex undamped exponential. Just as in the quantum-mechanical
case, we shall see that the inclusion of an appropriate $i\epsilon$ factor is needed to ensure
absolute convergence of the integral in all directions in field space.
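The effect of the convergence factor in item 2 can be seen already for a single integration variable. The sketch below is a one-dimensional stand-in, not the field-theory integral: it replaces the undamped integrand $e^{iax^2/2}$ by $e^{(ia - \epsilon)x^2/2}$ with a finite positive ε and compares a direct numerical integration with the closed form $\sqrt{2\pi/(\epsilon - ia)}$. The parameter values, the integration window, and the function names are choices made here for illustration.

```python
import cmath
import math


def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def damped_gaussian(a, eps, half_width=40.0, n=80_000):
    """Integrate exp((i a - eps) x^2 / 2) over a window wide enough that the
    damping factor exp(-eps x^2 / 2) makes the neglected tails negligible."""
    f = lambda x: cmath.exp(0.5 * (1j * a - eps) * x * x)
    return simpson(f, -half_width, half_width, n)


a, eps = 1.0, 0.25
numeric = damped_gaussian(a, eps)
# Gaussian formula, valid for Re(eps - i a) > 0 with the principal square root.
closed_form = cmath.sqrt(2 * math.pi / (eps - 1j * a))
print(numeric, closed_form)
```

With ε > 0 the integrand is absolutely integrable and the numerical value is independent of the window once it is wide enough; for ε = 0 the modulus of the integrand is unity everywhere and no such statement holds.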
Does the formula (10.52) reproduce the correct Green functions, even for our very
simple case of a free (massive) scalar field? The Hamiltonian density in this case is
given by (6.89)10
$$\mathcal{H}_0 = \tfrac{1}{2}\pi(x)^2 + \tfrac{1}{2}|\vec\nabla\phi(x)|^2 + \tfrac{1}{2}m^2\phi(x)^2 \tag{10.53}$$
We shall assume that spacetime has been appropriately discretized and gradients and
time-derivatives realized in such a way that the shift-invariance of the path integral
is preserved. To avoid unnecessarily complicating the notation, however, we shall
continue to use continuum notation, and to write the functional integral measure as
$\mathcal{D}\phi$ (rather than $\prod_i d\phi(x_i)$, for example, where $x_i$ indicates the discretized spacetime
lattice points). By shift-invariance, we mean simply
$$\int \mathcal{D}\phi\,F[\phi + \chi] = \int \mathcal{D}\phi\,F[\phi] \tag{10.54}$$
$$\int \mathcal{D}\pi\,F[\pi + \chi] = \int \mathcal{D}\pi\,F[\pi] \tag{10.55}$$
for any fixed (c-number) function χ(x), and functional F , assuming that the integral
exists (is absolutely convergent) in the first place. Inserting (10.53) into (10.52), we
find that the dependence of the integrand on the momentum field π(x) is Gaussian:
$$e^{i\int (\pi(x)\dot\phi(x) - \frac{1}{2}\pi(x)^2)\,d^4x} = e^{\frac{i}{2}\int \dot\phi(x)^2\,d^4x}\cdot e^{-\frac{i}{2}\int (\pi(x) - \dot\phi(x))^2\,d^4x} \tag{10.56}$$
If we perform the functional integral over the π(x) field first, holding the φ(x) field
fixed, then the shift invariance property assumed for the Dπ integral implies (taking
$\chi = -\dot\phi$ in (10.55)) that the latter decouples completely as a multiplicative factor
$C \equiv \int \mathcal{D}\pi\,e^{-\frac{i}{2}\int \pi(x)^2\,d^4x}$, leaving only the integral over the “coordinate” quantities,
i.e., the field φ(x):
10 The normal-ordering, here omitted, amounts to a shift of the Hamiltonian by a fixed c-number: this
affects the path integral by a multiplicative factor, which evidently cancels in formulas like (4.135) giving
the n-point Green functions of the theory.
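$$Z_0[j] = C\int \mathcal{D}\phi\,\exp\Big\{i\int \big(\tfrac{1}{2}\dot\phi(x)^2 - \tfrac{1}{2}|\vec\nabla\phi(x)|^2 - \tfrac{1}{2}m^2\phi(x)^2 - j(x)\phi(x)\big)\,d^4x\Big\} \tag{10.57}$$
$$\equiv C\int \mathcal{D}\phi\,\exp\Big\{i\int \big(\mathcal{L}_0(\dot\phi, \vec\nabla\phi, \phi) - j\phi\big)\,d^4x\Big\}, \qquad \mathcal{L}_0 = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}m^2\phi^2 \tag{10.58}$$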
The result of integrating out the field momentum variables is a path integral only over
the field coordinate variables φ(x), with the integrand containing a function (the “free
field Lagrangian”) $\mathcal{L}_0(\dot\phi, \vec\nabla\phi, \phi)$ only of the field and its space and time-derivatives,
in complete analogy to (4.105) where the mechanical Lagrangian of the quantized
point particle makes a similar appearance. The appearance of the Lorentz-invariant
scalar combination $\partial_\mu\phi(x)\partial^\mu\phi(x) = \dot\phi(x)^2 - |\vec\nabla\phi(x)|^2$ at this stage is certainly an
encouraging development, and we shall see later in Chapter 12, when we develop
the canonical formalism for field theory, that it is precisely the Poincaré invariant
properties of the action (defined as the spacetime integral of the Lagrangian) that
render the Lagrangian such a useful, even indispensable, tool in field theory.
Of course, we still have to face the fact that the integral (10.57) contains an
oscillating integrand of absolute value unity, and is therefore certainly not absolutely
convergent (much less uniquely specified). The integral can be given a definite meaning
by introducing an appropriately signed small imaginary part in front of every term
(quadratic in the field) in the Lagrangian which becomes unbounded in the course of
the integration. Thus, we replace the real Lagrangian function appearing in (10.57) by
$$\mathcal{L}_{0\delta} = \frac{1}{2}e^{i\delta}\dot\phi^2 - \frac{1}{2}e^{-i\delta}\left(|\vec\nabla\phi|^2 + m^2\phi^2\right) \tag{10.59}$$
where δ is a small positive quantity to be taken to zero at the end of the calculations.
This corresponds, as the reader may easily verify, to just the “rotation” of the
time variable $t \to e^{-i\delta}t$ needed to obtain a well-defined real-time functional integral
(cf. Section 4.2.1). The real part of $i\mathcal{L}_{0\delta}$ is readily seen to be negative definite, so
the integrand is exponentially damped11 in all regions where either the field or its
spacetime-derivatives become large. We can now explicitly verify that the resultant
path integral is well-defined by evaluating it, using once again the shift property
(10.54). Define the differential operator
$$K \equiv e^{i\delta}\frac{\partial^2}{\partial t^2} - e^{-i\delta}(\vec\nabla^2 - m^2) \tag{10.60}$$
The inverse of this operator (or kernel) is a Green function G(x, y) defined by
$$K\,G(x,y) = \delta^4(x-y) \tag{10.61}$$
11 Of course, we should include a similar convergence factor in the integral over π(x) performed previously,
which should strictly speaking be taken to be the absolutely convergent integral $\int\mathcal{D}\pi\, e^{-\frac{i}{2}e^{-i\delta}\int\pi(x)^2\,d^4x}$.
See the discussion below for the interacting case, where all appropriate convergence factors are inserted ab
initio.
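where the operator K acts on the spacetime coordinate x. Writing G(x, y) as a Fourier
transform
$$G(x,y) = \int\tilde G(k)\,e^{-ik\cdot(x-y)}\frac{d^4k}{(2\pi)^4} \tag{10.62}$$
we find
$$\tilde G(k) = -\frac{1}{e^{i\delta}k_0^2 - e^{-i\delta}(\vec k^2+m^2)} = -e^{-i\delta}\frac{1}{k_0^2 - e^{-2i\delta}(\vec k^2+m^2)} \to -\frac{1}{k^2-m^2+i\epsilon(\vec k^2+m^2)}, \quad \epsilon\equiv 2\delta,\ \delta\to 0 \tag{10.63}$$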
If the limit $\epsilon \to 0$ is taken after all integrals are performed, the (positive!) non-covariant
factor $\vec k^2 + m^2$ is irrelevant and we obtain for G(x, y) a well-defined covariant
distribution.12 The Green function G(x, y), or equivalently, the inverse of the operator
K, is therefore, up to a sign, just our old friend the free Feynman propagator:
$$G(x,y) = -\int\frac{e^{-ik\cdot(x-y)}}{k^2-m^2+i\epsilon}\frac{d^4k}{(2\pi)^4} = -\Delta_F(x-y) \tag{10.64}$$
once we remove the regulators and return to an infinite continuous spacetime, as our
use of continuum notation suggests we have done. Were δ zero, the energy integral
(over k0 ) in the continuum theory would run directly along the real axis through these
poles, and the result of the integration would be ill-defined. We shall now see that the
evaluation of the path integral (10.57) involves exactly the Green function G(x, y), so
that the original lack of definition in the functional integral can be traced precisely to
the need to generate a well-defined distribution for the two-point function of the field.
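The small-δ limit in (10.63) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the test momenta are illustrative assumptions. It compares the exact regulated kernel with its leading small-δ form, and shows that at finite δ the kernel stays finite even when $k_0$ sits exactly on the mass shell.

```python
import cmath
import math

def green_ft_regulated(k0, kvec_sq, m, delta):
    """Exact momentum-space kernel of (10.63) at finite delta."""
    return -1.0 / (cmath.exp(1j * delta) * k0**2
                   - cmath.exp(-1j * delta) * (kvec_sq + m**2))

def green_ft_small_delta(k0, kvec_sq, m, delta):
    """Leading small-delta form: -1/(k^2 - m^2 + i*eps*(kvec^2 + m^2)), eps = 2*delta."""
    eps = 2.0 * delta
    k_sq = k0**2 - kvec_sq          # Minkowski square k^2 = k0^2 - |kvec|^2
    return -1.0 / (k_sq - m**2 + 1j * eps * (kvec_sq + m**2))

# Arbitrary off-shell test point (hypothetical values chosen for illustration).
k0, kvec_sq, m = 1.3, 0.5, 1.0
for delta in (1e-2, 1e-3, 1e-4):
    exact = green_ft_regulated(k0, kvec_sq, m, delta)
    approx = green_ft_small_delta(k0, kvec_sq, m, delta)
    print(delta, abs(exact - approx))   # the difference shrinks with delta

# On the mass shell, k0^2 = kvec^2 + m^2, the regulated kernel is finite
# for delta > 0 and grows like 1/delta as the regulator is removed.
on_shell = green_ft_regulated(math.sqrt(kvec_sq + m**2), kvec_sq, m, 1e-3)
print(abs(on_shell))
```

The growth of the on-shell value as δ decreases is the numerical counterpart of the statement that, at δ = 0, the energy integral would run directly through the poles.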
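To complete the evaluation of (10.57), we observe that, using integration by parts,
$$\int(\mathcal{L}_\delta(\dot\phi,\vec\nabla\phi,\phi) - j\phi)\,d^4x = -\int\left(\frac{1}{2}\phi(x)K\phi(x) + j(x)\phi(x)\right)d^4x \tag{10.65}$$
$$= -\int\left\{\frac{1}{2}(\phi(x)+K^{-1}j(x))\,K\,(\phi(x)+K^{-1}j(x)) - \frac{1}{2}j(x)K^{-1}j(x)\right\}d^4x \tag{10.66}$$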
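The completion of the square in (10.65, 10.66) has an exact finite-dimensional analog, in which K becomes a symmetric matrix and φ, j become vectors. The sketch below is illustrative only: the 2×2 matrix and the vectors are arbitrary assumptions, and exact rational arithmetic is used so that the two sides can be compared without rounding.

```python
from fractions import Fraction

def mat_vec(K, v):
    """Matrix-vector product K v."""
    return [sum(K[i][j] * v[j] for j in range(len(v))) for i in range(len(K))]

def dot(u, v):
    """Ordinary dot product, the discrete stand-in for the spacetime integral."""
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(K):
    """Inverse of a 2x2 matrix (the discrete stand-in for K^{-1})."""
    (a, b), (c, d) = K
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quadratic_with_source(K, phi, j):
    """(1/2) phi.K.phi + j.phi, the bracket on the right of (10.65)."""
    return Fraction(1, 2) * dot(phi, mat_vec(K, phi)) + dot(j, phi)

def completed_square(K, phi, j):
    """(1/2)(phi + K^-1 j).K.(phi + K^-1 j) - (1/2) j.K^-1.j, as in (10.66)."""
    K_inv_j = mat_vec(inverse_2x2(K), j)
    shifted = [p + s for p, s in zip(phi, K_inv_j)]
    return (Fraction(1, 2) * dot(shifted, mat_vec(K, shifted))
            - Fraction(1, 2) * dot(j, K_inv_j))

# Hypothetical symmetric "kernel", field configuration and source.
K = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
phi = [Fraction(1), Fraction(-2)]
j = [Fraction(3), Fraction(5)]
print(quadratic_with_source(K, phi, j), completed_square(K, phi, j))
```

The identity relies on K being symmetric, which in the continuum corresponds to the integration by parts used to reach (10.65).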
12 We shall see, however, in Chapter 17, that the momentum dependence of the $i\epsilon$ term is crucial
in guaranteeing the absolute convergence of Minkowski-space Feynman integrals, needed to establish
rigorously the efficacy of the subtraction procedures used to renormalize the Minkowski-space amplitudes
of a perturbatively renormalizable theory.
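The integral over φ in (10.58) can now be shifted using (10.54) to eliminate the
dependence on the source function j, giving a (convergent!) constant factor
$$C' = \int\mathcal{D}\phi\, e^{-\frac{i}{2}\int\phi(x)K\phi(x)\,d^4x} \tag{10.67}$$
multiplying the source-dependent exponential left over from the last term of (10.66). Using
$K^{-1} = G = -\Delta_F$ from (10.64), we obtain
$$Z_0[j] = CC'\,e^{\frac{i}{2}\int j(x)K^{-1}j(x)\,d^4x} = CC'\,e^{-\frac{i}{2}\int j(x)\Delta_F(x-y)j(y)\,d^4x\,d^4y} \tag{10.68}$$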
The reader will easily confirm that this result is precisely Wick’s theorem for the
Feynman n-point Green functions in (10.22, 10.23), previously derived by operatorial
methods (note that the normal-ordered exponential on the right-hand side of (10.22)
simply becomes unity once the vacuum expectation value is taken: once expanded, only
the first term in the expansion survives). As indicated above, the overall multiplicative
constant $CC'$ is irrelevant once the functional derivatives needed to extract the n-point
Green function are taken,
$$\langle 0|T\{\phi(x_1)\phi(x_2)\cdots\phi(x_n)\}|0\rangle = i^n\,\frac{1}{Z_0[j]}\left.\frac{\delta^n Z_0[j]}{\delta j(x_1)\delta j(x_2)\cdots\delta j(x_n)}\right|_{j=0} \tag{10.69}$$
as the generating functional Z0 [j] appears in both the numerator and the denominator.
We may therefore take simply, for the generating functional of Green functions of our
free scalar field theory,
$$Z_0[j] = e^{-\frac{i}{2}\int j(x)\Delta_F(x-y)j(y)\,d^4x\,d^4y} \tag{10.70}$$
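Differentiating a Gaussian generating functional such as (10.70) n times and setting j = 0 produces a sum over all ways of pairing the n points, which is the combinatorial content of Wick's theorem. The following sketch enumerates those pairings. The function names are hypothetical, and the `propagator` argument is an assumed stand-in for the full two-point function (including whatever factors of i the conventions attach to it).

```python
def wick_pairings(points):
    """Yield every way of grouping `points` (a list) into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in wick_pairings(remaining):
            yield [(first, partner)] + tail

def free_n_point(points, propagator):
    """Sum over pairings of products of two-point functions (zero for odd n)."""
    points = list(points)
    if len(points) % 2:
        return 0
    total = 0
    for pairing in wick_pairings(points):
        term = 1
        for a, b in pairing:
            term *= propagator(a, b)
        total += term
    return total

# Four labeled points give the three familiar pairings.
for pairing in wick_pairings(["x1", "x2", "x3", "x4"]):
    print(pairing)
```

The number of pairings of 2n points is (2n − 1)!!, so the free six-point function already contains 15 terms.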
$$\mathcal{H}_\delta = e^{-i\delta}\left\{\frac{1}{2}\pi(x)^2 + \frac{1}{2}|\vec\nabla\phi(x)|^2 + \frac{1}{2}m(a)^2\phi(x)^2 + \frac{1}{4!}\lambda(a)\phi(x)^4\right\} \tag{10.72}$$
with the full Feynman Green functions of the interacting theory given by functional
derivatives of the functional Z[j]:
$$\langle 0|T\{\phi_H(x_1)\phi_H(x_2)\cdots\phi_H(x_n)\}|0\rangle = i^n\,\frac{1}{Z[j]}\left.\frac{\delta^n Z[j]}{\delta j(x_1)\delta j(x_2)\cdots\delta j(x_n)}\right|_{j=0} \tag{10.73}$$
Note that the factors of $e^{+i\delta}$ and $e^{-i\delta}$ appear in precisely the right places to guarantee
the absolute convergence of the remaining integral over φ(x): not surprisingly, as
the original integral, before the π(x) field was integrated out, was clearly absolutely
convergent (with the absolute value of the integrand falling exponentially in the large
field regime).
If we temporarily restore Planck’s constant $\hbar$ in the Lagrangian form of the path
integral (it divides the action: namely, the spacetime integral of the Lagrangian),
$$Z[j] = C\int\mathcal{D}\phi\, e^{\frac{i}{\hbar}\int(\mathcal{L}_\delta(\dot\phi,\vec\nabla\phi,\phi)-j\phi)\,d^4x} \tag{10.75}$$
one immediately sees that $\frac{1}{\hbar}$ multiplies the Klein–Gordon operator K in the quadratic
part of the action, and therefore the free propagator $\Delta_F(x)$, essentially the inverse
of K, should be regarded as proportional to a factor of $\hbar$. The interaction vertices of
the theory, associated with the higher than quadratic part of $\mathcal{L}_\delta$, should each carry a
factor of $1/\hbar$. These factors of $\hbar$ will be of importance in Section 10.4 in sorting out the
“classical” (i.e., lowest order in $\hbar$) from “quantum” contributions to the amplitudes of
the theory. The latter will turn out to be associated with the presence of closed loops
in the graphs.
The reader may recall that in our treatment of path integrals for a quantum
mechanical particle in Chapter 4, we described an alternative technique for eliminating
the noisome oscillations in the Minkowski space (i.e., real-time) formulation. In this
approach one analytically continues the amplitudes to imaginary time, basically by the
“Wick rotation” t → −it. Equivalently, the path integral is derived ab initio for matrix
elements of the bounded hermitian operator $e^{-Ht}$, rather than the usual unitary real-time
development operator $e^{-iHt}$. One then obtains, in the quantum-mechanics case,
path integrals such as
$$K_E(q_f,t_f;q_i,t_i) = \int e^{-\int_{t_i}^{t_f}(\frac{1}{2}m\dot q^2(t)+\frac{1}{2}m\omega^2q^2(t))\,dt}\,\mathcal{D}q(t) = \int e^{-S_E(\dot q,q)}\,\mathcal{D}q(t) \tag{10.76}$$
which involves a purely real integrand which is exponentially damped for large q(t)
and q̇(t), and therefore yields a well-defined absolutely convergent (multi-dimensional)
integral once the usual discretization of the time interval ti ≤ t ≤ tf is carried out. In
the field-theory case we observe that taking $\delta = \frac{\pi}{2}$ in (10.71, 10.72) (which clearly
leads to a convergent path integral) is equivalent to an analytical continuation to
imaginary time t → −it of the Minkowski path integral: we simply reinterpret the
$e^{-i\delta}$ factor in $\mathcal{H}_\delta$ as part of the time variable in $d^4x = d^3x\,dt$. Note that the term
$\int\pi(x)\dot\phi(x)\,d^4x = \int\pi(x)\,d\phi(x)\,d^3x$ is unchanged in this continuation. We then arrive
at the Euclidean generating functional
$$Z_E[j] = \int\mathcal{D}\phi\,\mathcal{D}\pi\, e^{\int(i\pi(x)\dot\phi(x)+j(x)\phi(x))\,d^4x - H[\pi,\phi]} \tag{10.77}$$
$$H[\pi,\phi] \equiv \int\left\{\frac{1}{2}\pi(x)^2 + \frac{1}{2}|\vec\nabla\phi(x)|^2 + \frac{1}{2}m(a)^2\phi(x)^2 + \frac{1}{4!}\lambda(a)\phi(x)^4\right\}d^4x \tag{10.78}$$
Integrating out the π field in a (by now) familiar fashion, we obtain, up to an irrelevant
multiplicative constant,
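$$Z_E[j] = \int\mathcal{D}\phi\, e^{-S_E[\phi]+\int j(x)\phi(x)\,d^4x} \tag{10.79}$$
$$S_E[\phi] \equiv \int\left\{\frac{1}{2}\dot\phi(x)^2 + \frac{1}{2}|\vec\nabla\phi(x)|^2 + \frac{1}{2}m(a)^2\phi(x)^2 + \frac{1}{4!}\lambda(a)\phi(x)^4\right\}d^4x \tag{10.80}$$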
which can be regarded as the “Euclidean Lagrangian” version of the functional integral
for this field theory (cf. (4.99, 4.100)). The positive real Euclidean Action functional
SE [φ] appears here in analogy to the corresponding quantity SE (q̇, q) for the quantum-
mechanical particle appearing in (10.76). The functional derivatives of ZE [j] yield
the Euclidean Schwinger functions S(x1 , x2 , ..., xn ) of the theory, which we briefly
discussed in Section 9.2:
$$S(x_1,x_2,\ldots,x_n) = \frac{1}{Z_E[j]}\left.\frac{\delta^n Z_E[j]}{\delta j(x_1)\delta j(x_2)\cdots\delta j(x_n)}\right|_{j=0} \tag{10.81}$$
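The step from (10.77) to (10.79), integrating out the π field, reduces at a single lattice point to the one-variable identity $\int d\pi\, e^{i\pi v - \pi^2/2} = \sqrt{2\pi}\,e^{-v^2/2}$, with v playing the role of $\dot\phi$. The sketch below checks this by direct quadrature; the cutoff, step count and test value of v are illustrative assumptions, not quantities from the text.

```python
import cmath
import math

def pi_integral(v, cutoff=12.0, steps=4800):
    """Trapezoidal estimate of the integral of exp(i*p*v - p^2/2) over p."""
    h = 2.0 * cutoff / steps
    total = 0j
    for n in range(steps + 1):
        p = -cutoff + n * h
        weight = 0.5 if n in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * cmath.exp(1j * p * v - 0.5 * p * p)
    return total * h

def closed_form(v):
    """sqrt(2*pi) * exp(-v^2/2): the Gaussian in v left behind by the p integral."""
    return math.sqrt(2.0 * math.pi) * math.exp(-0.5 * v * v)

v = 0.7   # hypothetical value of the "velocity" at one lattice point
print(pi_integral(v), closed_form(v))
```

The constant $\sqrt{2\pi}$ per lattice point is the “irrelevant multiplicative constant”, while the surviving factor $e^{-v^2/2}$ supplies the $\frac{1}{2}\dot\phi^2$ term of the Euclidean action (10.80).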
13 For a careful recent study of the mathematical status of complex Langevin methods, see (Aarts et al.,
2010).
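$$\sum_{n=0}^{+\infty}\frac{(-i)^n}{n!}\int d^4z_1\cdots d^4z_n\int\mathcal{D}\phi\;\phi(x_1)\cdots\phi(x_m)\,\mathcal{H}_{int}(z_1)\cdots\mathcal{H}_{int}(z_n)\,e^{i\int\mathcal{L}_0\,d^4z} \tag{10.83}$$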
14 In order to avoid overburdening the notation, we henceforth omit $e^{i\delta}$ factors, and the δ subscript on
the Lagrangian functional.
sponding behavior for the fermionic functional integrals discussed below is of great
importance in numerical approaches to non-perturbative quantum field theory. Let
us take another look at the basic (source-free) Gaussian path integral for a free self-
conjugate scalar field, in the Euclidean formulation, as given by
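$$Z_0[0] = \int\mathcal{D}\phi\, e^{-\frac{1}{2}\int\phi(x)K_E\phi(x)\,d^4x} \tag{10.84}$$
$$K_E = -\Box + m^2, \qquad \Box \equiv \sum_{i=1}^{4}\frac{\partial^2}{\partial x_i^2} \tag{10.85}$$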
The implicit regularization of this path integral on a finite Euclidean spacetime lattice
(with N points) implies that the operator KE is actually replaced by a real, symmetric,
positive-definite $N\times N$ matrix $K_{ij}$, so our integral actually reads
$$Z_0[0] = \int\prod_i d\phi_i\, e^{-\frac{1}{2}\sum_{i,j}\phi_iK_{ij}\phi_j} \tag{10.86}$$
where the field variables φi representing the value of φ(x) at the lattice point xi are real
(and range from −∞ to +∞). Let Oij be the orthogonal matrix (of unit determinant)
which diagonalizes Kij :
$$K_{ij} = \sum_k\lambda_kO_{ik}O_{jk} \;\Rightarrow\; \sum_{i,j}\phi_iK_{ij}\phi_j = \sum_k\lambda_k\hat\phi_k^2 \tag{10.87}$$
where we have introduced new field variables $\hat\phi_k = \sum_i\phi_iO_{ik}$ related to the original
ones by a unit Jacobian. The eigenvalues λk are all positive: otherwise our Euclidean
path integral would be divergent! Changing integration variables to the φ̂k then,
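$$Z_0[0] = \int\prod_{k=1}^{N}d\hat\phi_k\, e^{-\frac{1}{2}\sum_{k=1}^{N}\lambda_k\hat\phi_k^2} = \frac{(2\pi)^{N/2}}{\prod_{k=1}^{N}\sqrt{\lambda_k}} = (2\pi)^{N/2}\det(K)^{-\frac{1}{2}} \tag{10.88}$$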
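The determinant formula (10.88) can be verified by brute force for a two-point “lattice” (N = 2). In the sketch below the matrix entries, the integration cutoff and the grid step are arbitrary illustrative choices. A plain Riemann sum on a grid is compared with $(2\pi)^{N/2}\det(K)^{-1/2}$.

```python
import math

def gaussian_integral_2d(K, cutoff=8.0, step=0.05):
    """Riemann-sum estimate of the integral of exp(-phi.K.phi/2) over two real variables."""
    n = int(round(2.0 * cutoff / step))
    total = 0.0
    for a in range(n + 1):
        x = -cutoff + a * step
        for b in range(n + 1):
            y = -cutoff + b * step
            quad = K[0][0] * x * x + 2.0 * K[0][1] * x * y + K[1][1] * y * y
            total += math.exp(-0.5 * quad)
    return total * step * step

def closed_form(K):
    """(2*pi)^(N/2) * det(K)^(-1/2) for N = 2."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return 2.0 * math.pi / math.sqrt(det)

# Hypothetical real, symmetric, positive-definite 2x2 matrix.
K = [[2.0, 0.5], [0.5, 1.0]]
print(gaussian_integral_2d(K), closed_form(K))
```

If K had a non-positive eigenvalue the grid sum would grow without bound as the cutoff is raised, in line with the remark that the Euclidean integral would then diverge.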
The Gaussian integral corresponding to (10.84) for the doublet of fields φ1 , φ2 can be
rewritten as an integral over the single complex field ψ (and its conjugate field ψ ∗ ,
treated as an independent variable)
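$$\int\mathcal{D}\phi_1\mathcal{D}\phi_2\, e^{-\frac{1}{2}\sum_{i=1}^{2}\int\phi_i(x)K_E\phi_i(x)\,d^4x} = \int\mathcal{D}\psi\,\mathcal{D}\psi^*\, e^{-\int\psi^*(x)K_E\psi(x)\,d^4x}$$
$$\propto \det(K_E)^{-1} \tag{10.90}$$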
where the final line gives the explicit evaluation of the regularized path integral, in
this case yielding the inverse determinant of the Euclidean Klein–Gordon operator KE ,
without the square-root (due to the doubling of the degrees of freedom in the case of
a complex scalar field). We shall shortly see that the entire content of the distinction
between Bose–Einstein and Fermi–Dirac statistics for fields in the functional approach
lies in the power of the determinant that appears in the basic Gaussian integral, which
is negative for bosonic fields but positive for fermionic ones. This at first sight minor
distinction leads, in fact, to an enormous increase in difficulty of numerical evaluation
of the path integral for fermionic as opposed to bosonic fields.
suggests (by analogy to the corresponding result (8.1) for scalar fields) that the
conjugate “momentum” field to ψ(x) is π(x) = iψ † (x). If we formally imitate the
Hamiltonian path integral for scalar fields (10.52), where now the free Hamiltonian
density H0 is that appropriate for the Dirac field (Problem 3 in Chapter 7),
$$\mathcal{H}_0 = i\bar\psi\,\vec\gamma\cdot\vec\nabla\psi + m\bar\psi\psi \tag{10.92}$$
measure is completely innocent, as det(γ0 ) = 1. Note that the action (the non-source
part of the exponent in (10.94)) has already assumed a Lorentz-invariant form: for
fermionic theories the Lagrangian is obtained directly by including the π ψ̇ term with
the Hamiltonian, without the need for an integration over the conjugate momentum
π field. Again, the reason for this will become clear in our discussion of the canonical
formalism in Chapter 12.
Defining the differential operator
$$D \equiv i\gamma^\mu\partial_\mu - m = i\not\!\partial - m \tag{10.95}$$
in analogy to the Klein–Gordon operator K introduced previously for the scalar field,
we find that the functional Z0 can be formally evaluated by the same procedure of
completion of the square used in the bosonic case:
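$$Z_0[\eta,\bar\eta] = \int\mathcal{D}\psi\,\mathcal{D}\bar\psi\; e^{i\int\{(\bar\psi-\bar\eta D^{-1})D(\psi-D^{-1}\eta)-\bar\eta D^{-1}\eta\}\,d^4x} \tag{10.96}$$
$$= C\,e^{-i\int\bar\eta D^{-1}\eta\,d^4x} \tag{10.97}$$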
where we have assumed that the integrals over ψ, ψ̄ are shift-invariant, as in the bosonic
case. This already looks quite promising, as the Feynman–Dirac propagator defined
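by $iS_F(x-y) = \langle 0|T(\psi(x)\bar\psi(y))|0\rangle$ (cf. Problem 5 in Chapter 7)
$$S_F(x-y) = \int\frac{\not\!p+m}{p^2-m^2+i\epsilon}\,e^{-ip\cdot(x-y)}\frac{d^4p}{(2\pi)^4} = \int\frac{e^{-ip\cdot(x-y)}}{\not\!p-m+i\epsilon}\frac{d^4p}{(2\pi)^4} \tag{10.98}$$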
Unfortunately, plausible as it seems, the above argument conceals some deep flaws
in the analogistic reasoning used to arrive at (10.100). In particular, the crucial minus
sign embedded in the definition of the fermionic T-product
$$T(\psi_\alpha(x)\bar\psi_\beta(y)) \equiv \theta(x^0-y^0)\,\psi_\alpha(x)\bar\psi_\beta(y) - \theta(y^0-x^0)\,\bar\psi_\beta(y)\psi_\alpha(x) \tag{10.102}$$
is not reproduced by the double functional derivative in (10.101), if the field sources
η, η̄ take values in a commutative field (such as the complex numbers). The change
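of sign when fermion fields are reordered implies a similar change of sign when the
order of the corresponding functional derivatives with respect to the sources is interchanged:
$$\frac{\delta^2}{\delta\eta_\beta(y)\,\delta\bar\eta_\alpha(x)} = -\frac{\delta^2}{\delta\bar\eta_\alpha(x)\,\delta\eta_\beta(y)} \tag{10.103}$$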
This problem can (indeed, must) be fixed by insisting that the fermionic source
functions η, η̄ take values in a Grassmann algebra, any two elements of which anti-
commute with each other. This then implies that the c-number fermionic fields ψ, ψ̄
appearing in (10.94) must also be anticommuting Grassmann numbers. Otherwise,
when the source terms in the exponent are expanded, only linear terms would survive,
and we would conclude that Green functions involving more than one ψ (or $\bar\psi$)
field vanish! For example, if $\psi(x), \bar\psi(x)$ are conventional complex-valued c-number
functions, commuting with the Grassmann source functions,
$$\left(\int\bar\psi(x)\eta(x)\,d^4x\right)^2 = \int\bar\psi(x)\bar\psi(y)\,\eta(x)\eta(y)\,d^4x\,d^4y = 0 \tag{10.104}$$
as the product η(x)η(y) is antisymmetric under exchange of x and y, while the product
ψ̄(x)ψ̄(y) is symmetric. This is avoided if we also require ψ̄(x) (and likewise ψ(x)) to
take values in a Grassmann algebra. As indicated above, we have implicitly defined
the theory on a spacetime lattice, so that the fields (and sources) are defined on a
finite discrete set of spacetime coordinates xi , i = 1, 2, ...N . Our Grassmann algebra
consists of the set of all multi-nomials (with complex coefficients) generated by the
4N Grassmann numbers ψ(xi ), ψ̄(xi ), η(xi ), η̄(xi ). Denoting these generically by χi ,
the only properties attributed to these numbers are
$$\chi_i\chi_j = -\chi_j\chi_i, \qquad \chi_i^2 = 0 \tag{10.105}$$
As the square of any given Grassmann number vanishes, the possible multi-nomials
involve each of the 4N independent Grassmann numbers at most once. The linear
space of allowed multi-nomials is therefore $2^{4N}$-dimensional, consisting of the one-dimensional
space with no Grassmanns (i.e., the complex numbers C), the 4N-dimensional space
spanned by the Grassmann generators appearing singly, the $\frac{4N(4N-1)}{2}$-dimensional space spanned
by products of two distinct Grassmanns $\chi_i\chi_j$, i < j, and so on.
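The structure just described is small enough to model directly. In the following sketch (an illustration under assumed conventions, not the author's construction) a basis multi-nomial is represented by a strictly increasing tuple of generator indices, and the product of two basis elements returns a sign together with the sorted result. The function names are hypothetical.

```python
from itertools import combinations

def multiply_monomials(a, b):
    """Product of two basis monomials given as increasing index tuples.

    Returns (sign, monomial); sign is 0 when a generator would appear twice,
    since the square of any Grassmann generator vanishes.
    """
    if set(a) & set(b):
        return 0, ()
    merged = a + b
    # Each inversion is one anticommutation needed to sort the product.
    inversions = sum(1 for p in range(len(merged))
                     for q in range(p + 1, len(merged))
                     if merged[p] > merged[q])
    return (-1) ** inversions, tuple(sorted(merged))

def basis_monomials(n_generators):
    """All products of distinct generators, including the empty product (the unit)."""
    return [c for degree in range(n_generators + 1)
            for c in combinations(range(n_generators), degree)]

# One lattice site (N = 1) has 4N = 4 generators, hence a 2^4-dimensional algebra.
basis = basis_monomials(4)
print(len(basis))
print(multiply_monomials((0,), (1,)), multiply_monomials((1,), (0,)))
```

Counting the basis elements of degree two reproduces the 4N(4N − 1)/2 products of two distinct generators quoted above.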
Of course, in order to pursue this strategy we must next ensure that we under-
stand the meaning of functional derivatives (i.e., in our discretized system, partial
differentiation with respect to a given Grassmann element χi ) and functional (path)
integrals (in the case of Z0 above, multi-dimensional integration over the ψ(xi ), ψ̄(xi )
holding η(xi ), η̄(xi ) fixed). The definition of the derivative is obvious once we recall
that functions F (χi ) of the χi can at most depend linearly on any given χj . Any term
containing χj can therefore be rearranged so that the single factor of χj is moved to
the extreme left (with a concomitant factor of –1 if the rearrangement involves an odd
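permutation of Grassmann numbers), giving
$$F(\chi_i) = A(\chi_i, i\neq j) + \chi_j\,B(\chi_i, i\neq j) \tag{10.106}$$
The partial derivative with respect to $\chi_j$ is then defined in the natural way as
$$\frac{\partial F}{\partial\chi_j} \equiv B(\chi_i, i\neq j) \tag{10.107}$$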
The reader is invited to check that, with this definition, the desired antisymmetry of
the second derivatives expressed in (10.103) indeed obtains: namely
$$\frac{\partial^2F}{\partial\chi_i\,\partial\chi_j} = -\frac{\partial^2F}{\partial\chi_j\,\partial\chi_i} \tag{10.108}$$
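The antisymmetry (10.108) can be confirmed mechanically. The sketch below assumes a particular representation, chosen here for illustration: a function F is a dictionary mapping increasing index tuples (basis monomials) to coefficients, and the derivative moves $\chi_j$ to the far left, picking up one sign per generator passed, before stripping it off as in (10.107).

```python
def grassmann_derivative(F, j):
    """Left derivative d/d(chi_j) of F = {monomial: coefficient}."""
    result = {}
    for monomial, coeff in F.items():
        if j not in monomial:
            continue                      # the chi_j-independent part A drops out
        position = monomial.index(j)
        sign = (-1) ** position           # chi_j passes `position` generators
        rest = monomial[:position] + monomial[position + 1:]
        result[rest] = result.get(rest, 0) + sign * coeff
    return result

# Hypothetical test function: 2 + chi_0 + 3 chi_0 chi_1 + 5 chi_0 chi_1 chi_2 - 4 chi_1 chi_2
F = {(): 2, (0,): 1, (0, 1): 3, (0, 1, 2): 5, (1, 2): -4}

d0_d1 = grassmann_derivative(grassmann_derivative(F, 1), 0)
d1_d0 = grassmann_derivative(grassmann_derivative(F, 0), 1)
print(d0_d1, d1_d0)   # equal up to an overall sign
```

Applying the same derivative twice always gives an empty dictionary (zero), since the first derivative removes every occurrence of that generator.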
where we have also assumed that a constant Grassmann number can be factored out
of the integral. Thus any term lacking a particular Grassmann variable ψi vanishes
when integrated over ψi . As any term containing the Grassmann variable ψi more than
once vanishes, the only potentially non-vanishing integral is that of the variable ψi
appearing linearly, which we can normalize to any value we please (as overall constants
in the generating functional Z are irrelevant, as emphasized repeatedly above). It is
conventional to define
$$\int d\chi_i\,\chi_i = 1 \tag{10.112}$$
for any Grassmann variable χi : i.e., for any of the field variables ψi , ψ̄i appearing in
the regularized path integral.
With this interpretation of the fermionic integrations in (10.94) we may justify
the result for the free generating function (10.100), at least at the level of lattice-
regularized fields. At this level, in contrast to the situation for bosonic fields, there are
no convergence problems due to oscillating integrands: the expansion of the exponential
terminates at a finite order, with terms containing each ψi and ψ̄i variable once and
only once, and the value of the path integral (as a function of the remaining source
quantities $\eta_i, \bar\eta_i$) is just the coefficient of $\prod_{i=1}^{N}\bar\psi_i\psi_i$ in this expanded quantity.15 So
where does the requirement for an $i\epsilon$ in the denominator of the Fourier transformed
Feynman–Dirac propagator come from? In the bosonic case, this small imaginary
displacement arose naturally from the requirement of regularizing the oscillating
Minkowski integrand, as the integration range for the field variables was infinite,
leading to a failure of absolute convergence of the (finitely) multi-dimensional integral.
Here the finite-dimensional fermionic integral gives a perfectly finite result provided
that the $4N\times 4N$ matrix (4 Dirac, N spacetime degrees of freedom) representing the
Dirac operator D on our finite spacetime lattice is invertible. The discrete energies
and momenta allowed on such a lattice mean that the singularity at $k_0 = \sqrt{\vec k^2+m^2}$
encountered by a continuous energy integral over $k_0$ is “missed” when the integral is
converted to a finite sum. In the fermionic case, the need for inclusion of an $i\epsilon$ prescription
for avoiding this singularity only appears once the lattice spacing is taken to
zero and the sum goes over to a continuous integration along the real energy axis. The
need for including the $i\epsilon$ in the correct way at this point, i.e., as a negative imaginary
part in the mass (see (10.98)), is dictated by our desire to recover the correct causal
(i.e., time-ordered) propagator in agreement with the operator version of the theory.
For the interacting field theories of primary interest in modern particle physics—
the gauge field theories of the Standard Model—the dependence of the Hamiltonian
(or Lagrangian) on the Fermi fields is always quadratic, so we may in general write
the fermionic part of the full functional integral (which may, of course, also contain
further integrations over bosonic fields),
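$$\int\mathcal{D}\bar\psi\,\mathcal{D}\psi\; e^{i\int(\bar\psi(x)D(\phi)\psi(x)-\bar\eta(x)\psi(x)-\bar\psi(x)\eta(x))\,d^4x} \to \int\prod_{i=1}^{4N}d\bar\psi_i\,d\psi_i\; e^{i(\bar\psi_iD(\phi)_{ij}\psi_j-\bar\eta_i\psi_i-\bar\psi_i\eta_i)} \tag{10.113}$$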
where the notation D(φ) indicates the possible dependence of the differential operator
in the quadratic part on a generic set of bosonic fields φ, and the arrow a suitable
discretization of the continuum theory in which the fermion fields are placed on
a spacetime lattice. The indices i, j run from 1 to 4N , where N is the number
of spacetime lattice points and the 4 comes from the discrete Dirac index. After
completion of the square, exactly as previously for the free theory, this becomes
$$e^{-i\bar\eta_iD(\phi)^{-1}_{ij}\eta_j}\int\prod_{i=1}^{4N}d\bar\psi_i\,d\psi_i\; e^{i\bar\psi_iD(\phi)_{ij}\psi_j} = e^{-i\bar\eta_iD(\phi)^{-1}_{ij}\eta_j}\,\det[D(\phi)] \tag{10.114}$$
The integral of the (exponential of the) source-free quadratic fermion action gives
simply the determinant of the $4N\times 4N$ matrix of the discretized operator D(φ)! The
reason for this is quite simple: in order to obtain a non-vanishing contribution, the
rules for Grassmann integration described above imply that only the term in which
the expansion of the exponential contains each ψi and each ψ̄i once and only once
contributes. The coefficient of such terms involves the multiplication of 4N matrix
15 Alternatively, we see that the value of the multi-dimensional Grassmann integral is also given by taking
a derivative of the integrand with respect to each and every one of the ψi and ψ̄i : for Grassmann quantities,
integration = differentiation!
Graphical concepts: N -particle irreducibility 341
elements Dij with each row and column of the matrix appearing exactly once, and
with a sign indicating the sign of the permutation needed to place the ψi in the same
order as the ψ̄i . This is exactly the definition of the determinant of the matrix Dij .
Note that the result is precisely the inverse of that obtained for the integration of
Gaussian complex bosonic path integrals (cf. (10.90)). As indicated previously, the
path integrals for the interacting gauge field theories contained in the Standard Model
are at most quadratic in the elementary spin-$\frac{1}{2}$ fields (leptons and quarks) of the
theory, so that in principle the fermionic integrations can all be performed yielding
determinantal functionals of the remaining bosonic fields, leaving only bosonic path
integrals to be performed. In the case of lattice QCD, by far the greatest part of the
numerical difficulties encountered with stochastic estimations of the resultant bosonic
path integrals derives from the evaluation of the fermionic determinant associated
with integrating out the quark fields.
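The statement that the source-free Grassmann Gaussian integral is a determinant can be checked for small matrices by expanding the exponential explicitly. The sketch below rests on labeled assumptions: the generator $\bar\psi_i$ is assigned index 2i and $\psi_i$ index 2i+1, a matrix M stands in for the combination iD(φ), and the “integral” is read off, as described above, as the coefficient of $\prod_i\bar\psi_i\psi_i$ in $\exp(\bar\psi_iM_{ij}\psi_j)$.

```python
from fractions import Fraction
from itertools import permutations

def monomial_product(a, b):
    """Sign and sorted result of multiplying two basis monomials (index tuples)."""
    if set(a) & set(b):
        return 0, ()
    merged = a + b
    inversions = sum(1 for p in range(len(merged))
                     for q in range(p + 1, len(merged)) if merged[p] > merged[q])
    return (-1) ** inversions, tuple(sorted(merged))

def multiply(F, G):
    """Product of two Grassmann-algebra elements stored as {monomial: coefficient}."""
    result = {}
    for ma, ca in F.items():
        for mb, cb in G.items():
            sign, mono = monomial_product(ma, mb)
            if sign:
                result[mono] = result.get(mono, 0) + sign * ca * cb
    return result

def grassmann_gaussian(M):
    """Coefficient of prod_i psibar_i psi_i in exp(sum_ij psibar_i M_ij psi_j)."""
    n = len(M)
    action = {}
    for i in range(n):
        for j in range(n):
            sign, mono = monomial_product((2 * i,), (2 * j + 1,))
            action[mono] = action.get(mono, 0) + sign * Fraction(M[i][j])
    term = {(): Fraction(1)}
    for k in range(1, n + 1):             # build action^k / k! step by step
        term = {m: c / k for m, c in multiply(term, action).items()}
    return term.get(tuple(range(2 * n)), Fraction(0))

def determinant(M):
    """Leibniz formula, adequate for the tiny matrices used here."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for p in range(n) for q in range(p + 1, n) if perm[p] > perm[q])
        product = 1
        for row in range(n):
            product *= M[row][perm[row]]
        total += (-1) ** inversions * product
    return total

M = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]     # hypothetical 3x3 example
print(grassmann_gaussian(M), determinant(M))
```

Only the order-n term of the exponential can contain every generator exactly once, which is why the loop stops at k = n. The cost of this brute-force expansion grows exponentially with n, so it serves only as a check of the combinatorics, not as a numerical method.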
There is one further important property of Grassmann integrals which we shall
need to take into account in our discussion of anomalies in Chapter 15. The peculiar
property of fermionic path integrals, whereby the fermion determinant appears to a
positive power (as in (10.114)) when a Gaussian integral is performed, has a correlate in
the behavior of the Grassmann integral under a change of fermionic variables. Suppose
we wish to change variables from a set of Grassmann quantities $\psi_i$, i = 1, 2, ..., N to
$\psi'_i = C_{ij}\psi_j$, where the $C_{ij}$ are bosonic in character (i.e., complex scalars). The
requirement that the new variables satisfy
$$1 = \int\prod_{i=1}^{N}d\psi'_i\prod_{i=1}^{N}\psi'_i = \int\prod_{i=1}^{N}d\psi'_i\; C_{1i_1}\psi_{i_1}C_{2i_2}\psi_{i_2}\cdots C_{Ni_N}\psi_{i_N} \tag{10.115}$$
implies that the Jacobian J of the change of variables, with $\prod_{i=1}^{N}d\psi'_i = J\prod_{i=1}^{N}d\psi_i$,
is given by
$$J = \det(C)^{-1} \tag{10.116}$$
16 The third difficulty, or “inconvenient truth”, identified in Section 10.2, the intrinsic non-convergence
of perturbative expansions, will be addressed in detail in the subsequent chapter.
removed. The resolution of these two issues is greatly facilitated by the introduction of
graphical concepts which allow us to reorganize the Feynman diagrams of the theory
in an intuitively powerful way. In the first case, the relevant concept is that of an
amputated diagram, in the second, of proper, or one-particle irreducible vertices.
The graphical representation of the Green functions of an interacting field theory,
as described in Section 10.2, suggests that important features of these amplitudes can
be directly visualized and correlated with corresponding properties of the Feynman
graphs which represent these amplitudes at any given order of perturbation theory.
The most basic such property is that of connectedness, first discussed in the context
of S-matrix elements in Section 6.1. This is a special case of the more general concept
of “N -particle irreducibility”. A graph (or set of graphs) contributing to a given n-
point function, or (via the LSZ formula) S-matrix element, is said to be N -particle
irreducible if it remains connected when any N internal lines of the graph are cut.
Thus, the connected diagrams are 0-particle irreducible: they contain the subset of
one-particle irreducible (or 1PI for short) diagrams which remain connected if only
a single internal line is cut. The 1PI diagrams in turn contain as a subset the 2PI
diagrams which remain connected even when two internal lines are cut, and so on.
The physical significance of the connected contributions to S-matrix elements, and
to the Green functions of the theory was already discussed at length in Sections 6.1
and 9.3. The physical interpretation of the 1PI and higher irreducible graphs will
be discussed in due course below. Our task here is to connect these concepts to the
functional approach to field theory introduced in the preceding section.
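The definition of N-particle irreducibility translates directly into a brute-force test on a graph. In the sketch below, which is an illustration with assumed conventions, a diagram is specified only by its interaction vertices and its internal lines (external legs are ignored), and every choice of N internal lines is cut in turn to see whether the remainder stays connected.

```python
from itertools import combinations

def is_connected(vertices, lines):
    """Depth-first search connectivity test; `lines` may contain repeated pairs."""
    vertices = list(vertices)
    if not vertices:
        return True
    adjacency = {v: set() for v in vertices}
    for a, b in lines:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)

def is_n_particle_irreducible(vertices, internal_lines, n):
    """True if the graph stays connected after cutting any n internal lines."""
    if not is_connected(vertices, internal_lines):
        return False
    for cut in combinations(range(len(internal_lines)), n):
        remaining = [line for idx, line in enumerate(internal_lines) if idx not in cut]
        if not is_connected(vertices, remaining):
            return False
    return True

# Hypothetical two-vertex examples: single exchange, one-loop bubble, "sunset".
exchange = [("a", "b")]
bubble = [("a", "b"), ("a", "b")]
sunset = [("a", "b"), ("a", "b"), ("a", "b")]
for name, lines in (("exchange", exchange), ("bubble", bubble), ("sunset", sunset)):
    print(name, [is_n_particle_irreducible(["a", "b"], lines, n) for n in range(4)])
```

With n = 0 the test reduces to plain connectedness, so the nesting of connected, 1PI and 2PI graphs described above is visible in the printed lists.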
The functional derivatives of Z[j] (divided by Z[0]) yield the full set of contri-
butions to the Green functions of the theory—both connected and disconnected.17
Exactly as for the generating functional of S-matrix elements, where the connection
between full and connected S-matrix amplitudes is extremely simple when expressed
in terms of generating functionals, as discussed in Section 6.1,
with P (φ) a polynomial (higher than quadratic) in the fields, the individual terms of
which induce the three-point, four-point, etc. interaction vertices of the theory. As a
concrete example we shall imagine in the following that both trilinear and quadrilinear
interactions are present—we may take, for example, $P(\phi) = \frac{\lambda_3}{3!}\phi^3 + \frac{\lambda_4}{4!}\phi^4$. Thus, the
17 However, as discussed previously, the division by Z[0] eliminates the disconnected vacuum fluctuations
accompanying any scattering process.
graphs of the theory contain elementary vertices from which either three or four lines
emerge. The usual factors of $e^{i\delta}$ needed for convergence of the Minkowski path integral
are suppressed here to avoid overburdening the notation.
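The derivatives of (10.118), in analogy to the functional derivatives of Z[j] for the
full Green functions (see (10.73)), lead directly to the cluster decomposition recursion
formulas (cf. (9.92, 9.93, 9.94)) relating connected to full amplitudes
$$\frac{i\,\delta W}{\delta j(x_1)} = \frac{i}{Z}\frac{\delta Z}{\delta j(x_1)} \tag{10.120}$$
$$\frac{i^2\delta^2W}{\delta j(x_1)\delta j(x_2)} = \frac{i^2}{Z}\frac{\delta^2Z}{\delta j(x_1)\delta j(x_2)} - \frac{i\,\delta W}{\delta j(x_1)}\frac{i\,\delta W}{\delta j(x_2)} \tag{10.121}$$
$$\frac{i^3\delta^3W}{\delta j(x_1)\delta j(x_2)\delta j(x_3)} = \frac{i^3}{Z}\frac{\delta^3Z}{\delta j(x_1)\delta j(x_2)\delta j(x_3)} - \left(\frac{i\,\delta W}{\delta j(x_1)}\frac{i^2\delta^2W}{\delta j(x_2)\delta j(x_3)} + \text{perms}\right) - \frac{i\,\delta W}{\delta j(x_1)}\frac{i\,\delta W}{\delta j(x_2)}\frac{i\,\delta W}{\delta j(x_3)} \tag{10.122}$$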
and so on. After the functional derivatives are performed, the desired n-point functions
are obtained by setting the source j = 0. These formulas display directly the removal
of disconnected contributions from the full set of graphs generated by the generating
functional Z[j]. We shall henceforth only consider connected amplitudes, relieving us
of the obligation to further complicate the notation by a sub(or super)script “c”, to
distinguish the connected Green functions generated by W [j] from those (containing
all graphs) generated by Z[j].
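The recursion relating connected to full amplitudes has a familiar single-variable analog: the relation between moments and cumulants of a probability distribution, which is what (10.120) to (10.122) reduce to when all points coincide and the factors of i are dropped. The sketch below uses the standard moment-cumulant recursion; treating it as a model of the field-theory formulas is an assumption made here for illustration.

```python
from math import comb

def connected_from_full(moments):
    """Convert full n-point values (moments) to connected ones (cumulants).

    moments[n-1] holds the n-th moment; the recursion used is
    kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) * kappa_k * m_{n-k}.
    """
    cumulants = []
    for n in range(1, len(moments) + 1):
        value = moments[n - 1]
        for k in range(1, n):
            value -= comb(n - 1, k - 1) * cumulants[k - 1] * moments[n - k - 1]
        cumulants.append(value)
    return cumulants

# A Gaussian with zero mean and variance 2: moments 0, 2, 0, 12.
print(connected_from_full([0, 2, 0, 12]))
# The same Gaussian shifted to mean 1: moments 1, 3, 7, 25.
print(connected_from_full([1, 3, 7, 25]))
```

For the Gaussian (free) case the connected three- and four-point values vanish: the full four-point value 12 is made up entirely of the three disconnected products of two-point values.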
Before deriving the general form of the generating functional for the one-particle
irreducible diagrams of the theory, we shall illustrate the basic idea with some
examples, which will help to bring home the critical importance of these new graphical
concepts in exposing exactly the pole structure in the external momenta needed to
make the LSZ formula for S-matrix elements work. Recall from our discussion of
persistent interactions in Section 10.2 that the appearance of the correct set of poles
in the external particle momenta is by no means automatic in perturbation theory.
The heuristic account that follows is intended to reveal in a visually intuitive way
the appropriate reorganization of perturbation theory which will allow the direct and
unambiguous application of the LSZ formula to the extraction of S-matrix elements.
Consider the set of all connected contributions to the four-point Green function in
the theory with generating functional (10.119). All such graphs can be displayed in
the generic form exhibited in Fig. 10.4: the blobs labeled $\hat\Delta_F$ on the four external lines
represent all possible contributions to the full interacting two-point function (Feynman
propagator): they contain all possible self-interactions of the external particles coming
into and receding from the central collision process. The central blob, labeled $G^{(4)}_{\rm amp}$, is
the “amputated” connected four-point function of the theory, and represents the part
of the process where the initial and final particles can no longer be considered to be
moving freely and independently of one another. Thus, the spacetime points labeled
$z_1, z_2, z'_1, z'_2$ represent interaction vertices, indeed the first interactions (as we move
in towards the central collision process from the external points $x_1, x_2, x'_1, x'_2$) which
Fig. 10.4 General structure of connected four-point function $G^{(4)}(x_1, x_2, x'_1, x'_2)$.
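$$G^{(4)}(x_1,x_2,x'_1,x'_2) = \int\hat\Delta_F(x_1-z_1)\,\hat\Delta_F(x_2-z_2)\,\hat\Delta_F(x'_1-z'_1)\,\hat\Delta_F(x'_2-z'_2)\cdot G^{(4)}_{\rm amp}(z_1,z_2,z'_1,z'_2)\,d^4z_1\,d^4z_2\,d^4z'_1\,d^4z'_2 \tag{10.123}$$
$$\tilde G^{(4)}(k_1,k_2,k'_1,k'_2) = \hat\Delta_F(k_1^2)\,\hat\Delta_F(k_2^2)\,\hat\Delta_F(k_1'^2)\,\hat\Delta_F(k_2'^2)\,\tilde G^{(4)}_{\rm amp}(k_1,k_2,k'_1,k'_2) \tag{10.126}$$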
Fig. 10.5 Feynman graphs contributing to the full Feynman propagator $\hat\Delta_F$ in $\phi^4$ theory.
The 2-2 scattering amplitude in this theory is given by the LSZ formula (9.179): we
must multiply the momentum-space four-point function $\tilde G^{(4)}$ by external leg factors
of $\frac{-iZ^{-1/2}(k^2-m_{ph}^2)}{(2\pi)^{3/2}\sqrt{2E(k)}}$ for every external momentum k, and then take the on-mass-shell
limit $k^2 \to m_{ph}^2$. By the Lehmann representation (9.192), each of the external leg full
propagators in (10.126) produces in this limit a simple pole, with residue Z, at exactly
the (squared) physical mass $m_{ph}^2$ (which may not be the same as the squared masses
employed in the free propagators from which the diagrams are constructed!), so that
the factors of $(k^2 - m_{ph}^2)$ for each initial or final-state particle are cancelled, and we
obtain for the 2-2 S-matrix element:
$$S_{k'_1,k'_2,k_1,k_2} = \prod_{i=1}^{2}\frac{-iZ^{1/2}}{(2\pi)^{3/2}\sqrt{2E(k_i)}}\;\prod_{j=1}^{2}\frac{-iZ^{1/2}}{(2\pi)^{3/2}\sqrt{2E(k'_j)}}\;\tilde G^{(4)}_{\rm amp}(k_1,k_2,k'_1,k'_2) \tag{10.127}$$
with the amputated four-point function $\tilde G^{(4)}_{\rm amp}$ evaluated at on-mass-shell momenta
(i.e., $k_i^2 = k_i'^2 = m_{ph}^2$). The potentially singular behavior associated with the on-mass-shell
limit (or equivalently, the infinite time propagation of self-interacting particles
into and out of the process) has been taken care of in (10.127): we may continue
on with the evaluation of $\tilde G^{(4)}_{\rm amp}(k_1,k_2,k'_1,k'_2)$ in perturbation theory, confident that
the pole residue corresponding to the desired S-matrix element has been properly
extracted. In particular, there is no need to use an on-mass-shell renormalization
scheme of the sort described in Section 10.2, in which the pole of the free propagator
is shifted to the physical value order by order in perturbation theory by the choice of
suitable counterterms. The mass appearing in the free propagators constituting the
perturbative expansion of the amputated Green function $\tilde G^{(4)}_{\rm amp}$ may be conveniently
chosen at some “intermediate” value, depending on the particular renormalization
scheme employed, as we shall see later in our detailed discussion of renormalization
theory in Chapter 17.
Before going on to a discussion of proper, or “one-particle-irreducible” (1PI), Green
functions, a short digression on some elementary graphical counting rules will equip
us with some results which facilitate the reorganization of perturbation theory implied
by the introduction of the 1PI condition. Recall that a tree diagram is defined as a
connected graph which is rendered disconnected by cutting any single internal line. Let
such a graph have I internal lines and V vertices. We imagine a “pruning” process
whereby vertices are removed one by one by cutting a single internal line, starting at
the outermost “branches” of the tree. Each such removal leaves the quantity I − V + 1
unchanged, as a single vertex and a single internal line have been removed. Eventually
we arrive at the remnant core of the graph, with a single vertex, and no remaining
internal lines, for which the quantity I − V + 1 is zero. We conclude that for any tree
graph, I − V + 1 = 0.
More generally, a connected graph may contain L independent loops, resulting in
momentum space in L independent four-momenta integrations corresponding to the
free flow of momentum around each loop. We again consider the quantity I − V + 1,
this time reducing the number of loops one at a time by cutting a single internal
line in each loop, without altering the total number of vertices, and leaving the graph
connected at each stage. In this case we reduce the number of loops L, and the quantity
I − V + 1, by one at each cut, until we arrive at a connected graph with no loops, i.e.,
a tree graph, with I − V + 1 = 0. This establishes that the number of loops L in our
initial graph is given precisely by the combination I − V + 1. The reader is invited to
verify this rule by drawing a few simple graphs.
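The counting rule can also be checked mechanically. The following is a minimal sketch, not part of the original argument: it counts independent loops by adding internal lines one at a time (a line whose two ends are already connected closes a loop) and compares the result with I − V + 1. The function names and the four test graphs are hypothetical choices made for illustration.

```python
def count_loops(num_vertices, internal_lines):
    """Count independent loops of a connected graph by building it line by line.

    internal_lines is a list of (u, v) vertex pairs. Repeated pairs and
    lines that start and end on the same vertex are allowed.
    """
    parent = list(range(num_vertices))

    def find(a):
        # Follow parent pointers to the representative of a's component.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    loops = 0
    for u, v in internal_lines:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            loops += 1               # the line closes a new independent loop
        else:
            parent[root_u] = root_v  # the line joins two separate pieces

    if len({find(a) for a in range(num_vertices)}) != 1:
        raise ValueError("graph is not connected")
    return loops


def loops_from_counting_rule(num_vertices, internal_lines):
    """L = I - V + 1 for a connected graph."""
    return len(internal_lines) - num_vertices + 1


# Hypothetical test graphs: (number of vertices, list of internal lines).
GRAPHS = {
    "tree (chain of three vertices)": (3, [(0, 1), (1, 2)]),
    "one-loop bubble": (2, [(0, 1), (0, 1)]),
    "two-loop sunset": (2, [(0, 1), (0, 1), (0, 1)]),
    "tadpole": (1, [(0, 0)]),
}

for name, (v, lines) in GRAPHS.items():
    print(name, count_loops(v, lines), loops_from_counting_rule(v, lines))
```

The two counts agree for every connected graph, since each added line either joins two pieces (raising I while effectively absorbing one vertex into the connected part) or closes a loop.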
The study of the short-distance (or ultraviolet) singularity structure of field theory
amplitudes is greatly facilitated by the introduction of the concept of one-particle-
irreducible diagrams, and in particular, by the use of a functional, analogous to W [j]
for connected graphs, which automatically generates such 1PI diagrams. In addition,
the graphs produced will be amputated: in other words, they are subsets of the graphs
describing the connected amputated functions $G^{(n)}_{amp}$ discussed previously, where those
diagrams which can be disconnected by cutting a single internal line are discarded. It
is intuitively obvious that the full set of amputated connected diagrams (in coordinate
space) can be reconstituted from the 1PI diagrams by convolving products of the latter
with full Feynman propagators Δ̂F connecting the separate 1PI pieces.
[Fig. 10.6: two amputated three-point functions $G^{(3)}_{amp}$ joined by a single full propagator $\hat\Delta_F$, with a full propagator $\hat\Delta_F$ on each of the four external legs.]
For example, in Fig. 10.6, we see that a class of one-particle-reducible contributions
to the connected four-point function has the structure of two three-point 1PI graphs
$G^{(3)}_{amp}$ connected by a single full Feynman propagator. Note that $G^{(3)}_{amp}$ is automatically
1PI (why?). In momentum space the full graph decomposes algebraically into a product
of the momentum-space amplitudes for the two $G^{(3)}_{amp}$ functions times the
momentum-space Feynman propagator $\hat\Delta_F(p^2)$, where p is the definite four-momentum (fixed by
energy-momentum conservation) passing between the two separate 1PI pieces. Thus
the singularity structure of the full connected amplitudes—in momentum space, the
dependence on the ultraviolet cutoff of the loop integrals in any given graph—is
decomposable into independent pieces: namely, the proper (or 1PI) subgraphs into
which it can be decomposed.18 The study of the short-distance sensitivity of general
amplitudes in field theory can therefore be reduced to a study of the cutoff dependence
of the proper, or 1PI, Green functions of the theory.
In addition to simplifying the study of ultraviolet behavior, the introduction of
the 1PI concept leads to two further important insights into the physics of local field
theories. First, we shall see that the generating functional for 1PI graphs (sometimes
referred to as the “effective action”) also has a direct energetic interpretation which
plays a critical role in the analysis of spontaneous symmetry-breaking, to which we
shall return in Part 3. In fact, it plays a role in field theory analogous to that played
by the free energy in thermodynamics. Secondly, if we reintroduce Planck’s constant $\hbar$
as an explicit signature of quantum effects, an expansion of the effective action in
powers of $\hbar$ is found to (a) yield the classical action as the zeroth order term, and (b)
correspond precisely to a reorganization of the perturbation theory according to the
number of loops.
We turn now to the task of constructing a functional for the 1PI graphs of the
theory. To avoid annoying factors of i we shall work in the Euclidean formulation.
Furthermore, for the reasons adduced immediately above, Planck’s constant $\hbar$ will
be reintroduced both in the path integral and in the definition of the generating
functionals. Thus, the exponents appearing in the path integral acquire an explicit
factor of $1/\hbar$, which can be traced back to the reintroduction of $\hbar$ in the Hamiltonian
evolution, $e^{-iHt} \to e^{-iHt/\hbar}$, or, in the imaginary time formulation, $e^{-Ht} \to e^{-Ht/\hbar}$.
Our Euclidean path integral now reads
$$Z[j] = \int D\phi\, e^{-\frac{1}{\hbar}\int(\frac{1}{2}\phi(x)K\phi(x)+P(\phi(x))-j(x)\phi(x))d^4x} = \int D\phi\, e^{-\frac{1}{\hbar}(S[\phi]-\int j(x)\phi(x)d^4x)} \tag{10.128}$$
where now $K = -\Box + m^2$ is the Euclidean Klein–Gordon operator, which is self-adjoint
and positive-definite, and the coordinate integrations are over a Euclidean
four-space. Of course, we expect to be able, at least in principle, to return to the
physical Minkowski-space amplitudes by the process of Wick rotation. The generating
functional for the connected diagrams is now defined as
$$W[j] = \hbar\ln Z[j] = \sum_n \frac{1}{n!}\int G^{(n)}(x_1,x_2,\ldots,x_n)\,j(x_1)j(x_2)\cdots j(x_n)\,d^4x_1 d^4x_2\cdots d^4x_n \tag{10.129}$$
where we shall for simplicity adopt the same notation $G^{(n)}$ for the Euclidean n-point
Green functions (previously denoted $S(x_1,\ldots,x_n)$, in (10.81)) of the theory as that
18 In coordinate space it is clear that the ultraviolet singularities, which arise from multiplication of
Feynman propagators (which are distributions) at coincident vertices, must be localized within the 1PI
pieces, as the propagators connecting separate 1PI parts of the diagram appear independently.
which amounts to
$$K\phi_{cl}(x) + P'(\phi_{cl}(x)) = j(x) \tag{10.131}$$
In the source-free limit (j = 0), this equation is simply the non-linear classical field
equation corresponding to the classical least action principle (in Euclidean space, of
course). In the presence of a source, it determines the classical field φcl implicitly as a
functional of the external source j. The leading saddle-point approximation to Z[j] is
given by simply evaluating the exponential at its extremal point:
$$Z_{cl}[j] = e^{-\frac{1}{\hbar}\int(\frac{1}{2}\phi_{cl}(x)K\phi_{cl}(x)+P(\phi_{cl}(x))-j(x)\phi_{cl}(x))d^4x} \tag{10.132}$$
whence we obtain, by (10.129), the classical limit of the generating functional W for
connected amplitudes
$$W_{cl}[j] = -\int\Big(\frac{1}{2}\phi_{cl}(x)K\phi_{cl}(x) + P(\phi_{cl}(x)) - j(x)\phi_{cl}(x)\Big)d^4x \tag{10.133}$$
we find that
$$\frac{\delta W_{cl}[j]}{\delta j(x)} = \int d^4y\,\frac{\delta\phi_{cl}(y)}{\delta j(x)}\big(-K\phi_{cl}(y) - P'(\phi_{cl}(y)) + j(y)\big) + \phi_{cl}(x) = \phi_{cl}(x) \tag{10.135}$$
where the term in brackets in the integral has vanished in virtue of the field equation
(10.131).
The classical field equation (10.131) can be solved iteratively in increasing powers
of the source function j. Let us temporarily restrict the polynomial P (φ) to correspond
to $\phi^4$ theory: we take $P(\phi) = \frac{\lambda}{4!}\phi^4$, $P'(\phi) = \frac{\lambda}{3!}\phi^3$. Next, we rewrite (10.131) as follows:
$$\phi_{cl}(x) = K^{-1}j(x) - \frac{\lambda}{3!}K^{-1}(\phi_{cl}^3)(x) \tag{10.136}$$
or more explicitly, using the Green function for K, which is just the Euclidean
propagator $\Delta_E(x)$ (with Fourier transform $\frac{1}{k^2+m^2}$, so that $K\Delta_E(x) = \delta^4(x)$),
$$\phi_{cl}(x) = \int\Delta_E(x-x_1)j(x_1)d^4x_1 - \frac{\lambda}{3!}\int\Delta_E(x-z)\phi_{cl}(z)^3d^4z \tag{10.137}$$
Reinserting the left-hand side result for φcl on the right, we find, through order j 3 ,
$$\phi_{cl}(x) = \int\Delta_E(x-x_1)j(x_1)d^4x_1 - \frac{\lambda}{3!}\int\Delta_E(x-z)\,\Delta_E(z-x_1)j(x_1)\Delta_E(z-x_2)j(x_2)\Delta_E(z-x_3)j(x_3)\,d^4x_1d^4x_2d^4x_3d^4z + O(j^5) \tag{10.138}$$
Ignoring signs, coupling constants, and combinatoric factors, these terms can be
graphically represented as indicated in Fig. 10.7: it is apparent that one has generated
a set of tree graphs with a preferred external point (the argument x of the classical
field φcl (x)), and source functions j(xi ) attached to all other external points of the
graph. The third term in Fig. 10.7 shows also one of the terms arising at order j 5 (of
second order in the bare coupling λ, as there are two interaction vertices, at z1 and
z2 ), which we obtain by iterating (10.138) one more time.
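The iteration can be tried out in a zero-dimensional toy version of the problem, in which the field is a single number and K is a positive constant, so that the field equation reads $K\phi + \frac{\lambda}{3!}\phi^3 = j$. This is only a sketch under that assumption: the values of `K_TOY` and `LAM_TOY` and the function names are hypothetical, and no spacetime integrals are involved.

```python
K_TOY = 1.0     # hypothetical stand-in for the operator K
LAM_TOY = 0.5   # hypothetical coupling lambda


def solve_classical_field(j, iterations=60):
    """Iterate phi -> (j - lambda/3! * phi**3) / K, starting from phi = 0."""
    phi = 0.0
    for _ in range(iterations):
        phi = (j - LAM_TOY / 6.0 * phi**3) / K_TOY
    return phi


def tree_series(j):
    """Zero-dimensional analog of (10.138), truncated after the j**3 term."""
    return j / K_TOY - (LAM_TOY / 6.0) * j**3 / K_TOY**4


for j in (0.1, 0.2):
    phi = solve_classical_field(j)
    residual = K_TOY * phi + LAM_TOY / 6.0 * phi**3 - j   # field equation
    print(j, phi, tree_series(j), phi - tree_series(j), residual)
```

Doubling j multiplies the difference between the iterated solution and the truncated series by roughly 2^5, consistent with the first neglected term being of order j^5.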
If we insert this iterative solution for φcl into (10.133), we obtain the connected
generating functional in the classical limit as an explicit functional of the source
function j(x), expanded formally in increasing powers of j(x):
$$W_{cl}[j] = \frac{1}{2}\int j(x_1)\Delta_E(x_1-x_2)j(x_2)d^4x_1d^4x_2 - \frac{\lambda}{4!}\int\Big(\prod_{i=1}^{4}\int\Delta_E(z-x_i)j(x_i)d^4x_i\Big)d^4z + O(j^6) \tag{10.139}$$
Fig. 10.7 Formal expansion of the classical field $\phi_{cl}(x)$ in powers of j.
with the graphical representation indicated in Fig. 10.8 (again ignoring combinatoric
factors, signs, etc., and with one of the $O(j^6)$ terms not shown explicitly in (10.139)
indicated graphically). As expected, only connected tree graphs are present, some of
which are, however, clearly one-particle-reducible (i.e., become disconnected when
a single internal line is cut). At the level of tree graphs, of course, the only 1PI
graphs involve at most a single interaction vertex: as soon as more than one vertex is
present, there must be an internal line connecting two vertices (the graph as a whole is
connected!), the removal of which will disconnect the diagram (as no loops are present).
Thus any generating functional for 1PI diagrams can, at the classical (leading order
in $\hbar$) level, only contain a finite set of terms corresponding to the interaction vertices
of the theory (the monomial terms in $P(\phi)$).
[Fig. 10.8: graphical representation of the expansion (10.139) of $W_{cl}[j]$ in powers of j: source functions $j(x_i)$ joined by propagators $\Delta_E$, with interaction vertices at $z_1$ and $z_2$.]
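The relations (10.133), (10.135), and (10.139) can be checked numerically in the same kind of zero-dimensional toy model, where the field is a single number and K is a positive constant. The sketch below is an illustration under that assumption; `K`, `LAM`, and the function names are hypothetical choices.

```python
K, LAM = 1.0, 0.5   # hypothetical values of the "operator" K and coupling lambda


def phi_cl(j, iterations=60):
    """Solve K*phi + lambda/3! * phi**3 = j by fixed-point iteration."""
    phi = 0.0
    for _ in range(iterations):
        phi = (j - LAM / 6.0 * phi**3) / K
    return phi


def w_cl(j):
    """Zero-dimensional analog of (10.133), evaluated at the classical field."""
    phi = phi_cl(j)
    return -(0.5 * K * phi**2 + LAM / 24.0 * phi**4 - j * phi)


def w_cl_series(j):
    """Zero-dimensional analog of (10.139), truncated after the j**4 term."""
    return 0.5 * j**2 / K - (LAM / 24.0) * j**4 / K**4


j, h = 0.1, 1e-5
dw_dj = (w_cl(j + h) - w_cl(j - h)) / (2 * h)   # central finite difference
print(dw_dj - phi_cl(j))          # analog of (10.135): should be very small
print(w_cl(j) - w_cl_series(j))   # remainder of order j**6
```

In this toy setting the derivative of the classical generating function reproduces the classical field to within finite-difference accuracy, and the truncated tree series differs from the exact value only at order j^6.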
The key to finding such a functional turns out to be the Legendre transform: we
re-express the information contained in the functional W[j] as a functional Γ[φ] of
the derivatives $\frac{\delta W[j]}{\delta j(x)} \equiv \phi(x)$, in such a way that no information is lost in trading in
the source j for the field φ, or vice versa. The field φ(x) obtained by functionally
differentiating W[j] with respect to the source j(x) is sometimes (confusingly) called
the “classical field”, although properly speaking that term should be applied to its
$\hbar \to 0$ limit $\phi_{cl}(x) = \frac{\delta W_{cl}[j]}{\delta j(x)}$ (cf. (10.135)). Referring to the path-integral representation for
$W[j] = \hbar\ln Z[j]$, and recalling that derivatives of Z[j] generate vacuum-expectation-
values of the corresponding Heisenberg fields (here, continued to Euclidean space), we
see that $\phi(x) = \frac{\hbar}{Z[j]}\frac{\delta Z[j]}{\delta j(x)}$ is just the normalized VEV of the Heisenberg field $\phi_H(x)$
in the presence of the source term. It can therefore be viewed as a c-number source
function, like j(x), and in that sense is “classical”.19 As usual, the Legendre transform
allowing us to incorporate the complete information in W[j] in a recoverable way20 as
a functional of the derivative $\phi(x) = \frac{\delta W[j]}{\delta j(x)}$ is given by
$$\Gamma[\phi] \equiv -W[j] + \int j(x)\phi(x)d^4x \tag{10.140}$$
where, of course, the dependence on j(x) on the right-hand side must be eliminated
in favor of φ(x) by inverting the equation $\phi(x) = \frac{\delta W[j]}{\delta j(x)}$. The Legendre transformation
is defined in order to preserve the information encoded in W[j]. Using the chain rule
(namely, (10.134), with the roles of j and φ interchanged), we find
$$\frac{\delta\Gamma[\phi]}{\delta\phi(x)} = j(x) + \int d^4y\,\frac{\delta j(y)}{\delta\phi(x)}\frac{\delta}{\delta j(y)}\Big(-W[j] + \int j(z)\phi(z)d^4z\Big)\Big|_{\phi}$$
$$= j(x) + \int d^4y\,\frac{\delta j(y)}{\delta\phi(x)}\Big(-\frac{\delta W[j]}{\delta j(y)} + \phi(y)\Big) = j(x) \tag{10.141}$$
so that $W[j] = -\Gamma[\phi] + \int j(x)\phi(x)d^4x$ allows reconstruction of W[j] from Γ[φ] once φ
is re-expressed in terms of j via (10.141). Moreover, we see that the two-point functions
$$\frac{\delta^2W[j]}{\delta j(x)\delta j(y)} = \frac{\delta\phi(y)}{\delta j(x)} \tag{10.142}$$
19 We shall follow the usual confusing practice of using the same symbol φ(x) to refer to (a) the quantum
(operator) free field, (b) the c-number field integrated over in the path-integral formalism, and (c) the
independent field variable for the Legendre transform Γ[φ] of W [j].
20 See (Callen, 1960), Section 5.2, for a lucid geometrical introduction to Legendre transforms.
and
$$\frac{\delta^2\Gamma[\phi]}{\delta\phi(x)\delta\phi(y)} = \frac{\delta j(x)}{\delta\phi(y)} \tag{10.143}$$
are functional inverses of one another. In particular, their discretized versions corre-
spond to matrices which are inverses of one another.
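The inverse relation between the two second-derivative matrices can be seen numerically on a lattice of just two points. The sketch below is an illustration, not the author's construction: it takes a hypothetical convex function W(J) of two sources, builds Γ(φ) = −W(J) + J·φ by solving ∂W/∂J = φ with Newton's method, and multiplies the analytic Hessian of W by a finite-difference Hessian of Γ. The matrix `DELTA` and the coefficient `G` are arbitrary illustrative values.

```python
DELTA = [[1.0, 0.3], [0.3, 0.8]]   # hypothetical symmetric, positive-definite matrix
G = 0.6                            # hypothetical quartic coefficient in W


def w(J):
    quad = 0.5 * sum(J[a] * DELTA[a][b] * J[b] for a in range(2) for b in range(2))
    return quad + G / 24.0 * (J[0]**4 + J[1]**4)


def grad_w(J):
    return [sum(DELTA[a][b] * J[b] for b in range(2)) + G / 6.0 * J[a]**3
            for a in range(2)]


def hess_w(J):
    return [[DELTA[a][b] + (G / 2.0 * J[a]**2 if a == b else 0.0)
             for b in range(2)] for a in range(2)]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def source_from_field(phi, steps=50):
    """Solve grad_w(J) = phi for J by Newton iteration."""
    J = [0.0, 0.0]
    for _ in range(steps):
        g = grad_w(J)
        r = [g[a] - phi[a] for a in range(2)]
        inv = inverse_2x2(hess_w(J))
        J = [J[a] - sum(inv[a][b] * r[b] for b in range(2)) for a in range(2)]
    return J


def gamma(phi):
    """Legendre transform: Gamma(phi) = -W(J) + J.phi at the J that gives phi."""
    J = source_from_field(phi)
    return -w(J) + sum(J[a] * phi[a] for a in range(2))


def hess_gamma(phi, h=1e-3):
    """Central finite-difference Hessian of gamma."""
    def shifted(a, sa, b, sb):
        p = list(phi)
        p[a] += sa * h
        p[b] += sb * h
        return gamma(p)
    return [[(shifted(a, 1, b, 1) - shifted(a, 1, b, -1)
              - shifted(a, -1, b, 1) + shifted(a, -1, b, -1)) / (4 * h * h)
             for b in range(2)] for a in range(2)]


phi = [0.4, -0.2]
J = source_from_field(phi)
hw, hg = hess_w(J), hess_gamma(phi)
product = [[sum(hw[a][c] * hg[c][b] for c in range(2)) for b in range(2)]
           for a in range(2)]
print(product)   # close to the 2x2 identity matrix
```

Convexity of W is what makes the Newton inversion well defined here, which is the discrete counterpart of the invertibility assumption on δW/δj.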
It should be noted here that the assumption of invertibility (i.e., solvability of
$\frac{\delta W[j]}{\delta j} = \phi$ for j in terms of φ, or of $\frac{\delta\Gamma[\phi]}{\delta\phi} = j$ for φ in terms of j) is not
automatically assured for arbitrary functionals. In the case of the Legendre transformation
connecting Lagrangians to Hamiltonians in mechanics (or field theory), for example,
there are important cases (in the case of field theories, in situations involving local
gauge symmetries, for example) where the form of the Lagrangian is such that it is not
possible to solve uniquely for the velocities (or time-derivatives of the fields) in terms of
the momenta defined as $p = \frac{\partial L}{\partial\dot q}$ (or, for field theory, $\pi \equiv \frac{\delta L}{\delta\dot\phi}$), so the Hamiltonian is not
We see that in the classical limit, the generating functional Γ[φ] coincides with the
classical action S[φ] of the theory. For this reason, the full (i.e., $\hbar \neq 0$) Γ[φ] is generally
referred to as the “effective action” of the field theory, which in some sense generalizes
the classical action to include quantum effects. If in the usual way we regard Γ[φ] as
a generating functional of n-point vertex functions $\Gamma^{(n)}(x_1,x_2,\ldots,x_n)$ of the theory,
$$\Gamma^{(n)}(x_1,x_2,\ldots,x_n) = \frac{\delta^n\Gamma[\phi]}{\delta\phi(x_1)\delta\phi(x_2)\cdots\delta\phi(x_n)}\Big|_{\phi=0} \tag{10.145}$$
we see that in the classical limit (for the specific example $P(\phi) = \frac{\lambda}{4!}\phi^4$), the only
surviving vertex functions are for n = 2 and n = 4:
$$\Gamma^{(2)}_{cl}(x_1,x_2) = K_{x_1}\delta^4(x_1-x_2) \tag{10.146}$$
$$\Gamma^{(4)}_{cl}(x_1,x_2,x_3,x_4) = \lambda\,\delta^4(x_1-x_2)\delta^4(x_2-x_3)\delta^4(x_3-x_4) \tag{10.147}$$
Note that all the one-particle-reducible terms present in the connected tree graphs
generated by $W_{cl}[j]$ have disappeared: instead we have only a two-point vertex given
by the Klein–Gordon operator K (or equivalently, the inverse propagator $\Delta_E^{-1}$), and a
four-point fully amputated vertex corresponding to a single interaction point: in other
words, the only amputated one-particle-irreducible graphs possible at the tree level.
The appearance of the inverse propagator in $\Gamma^{(2)}_{cl}$ can be regarded as a result of the
amputation process whereby a factor of the inverse propagator $\Delta_E^{-1}$ is applied for each
external point, leading in the case of the two-point function to $\Delta_E^{-1}\cdot\Delta_E\cdot\Delta_E^{-1} = \Delta_E^{-1}$.
To go beyond the classical limit (i.e., the tree diagrams of the theory) we need
to look a little more carefully at the structure of the effective action functional Γ[φ].
A convenient way to do this is to start with the functional W [j], as an expansion in
powers of the source j, and carry out the Legendre transformation to Γ[φ] order by
order in this formal expansion. We shall now revert to the usual procedure followed
through the rest of the book, and reinstate natural units $\hbar = 1$, keeping in mind that
an expansion in explicit powers of Planck’s constant amounts to nothing more than a
reorganization of the perturbation theory according to the number of loops. Moreover,
to keep the resultant formulas from expanding to intolerable lengths on the printed
page, we shall assume that spacetime has been discretized and work in a purely discrete
framework, in which spacetime points x are replaced by indices i (i = 1, 2, . . . N , where
the spacetime lattice xi has N points), and sources and fields are localized by attaching
the relevant index: $J_i \equiv j(x_i)$ or $\phi_j \equiv \phi(x_j)$, for example. Integrals over spacetime will
be replaced by summations over (typically) repeated indices, and operators K (resp.
Green functions $\Delta_E(x,y)$, $G^{(3)}(x,y,z)$, etc.) by appropriately multi-indexed objects
$K_{ij}$ (resp. $\Delta_{ij}$, $G^{(3)}_{ijk}$, etc.). Connected contributions are readily identified as terms which
cannot be algebraically factored into two or more parts involving non-overlapping sets
of summed indices. We also allow for polynomial interactions P (φ) including both
even and odd powers of the field φ, so that amplitudes G(n) are in general non-
vanishing, even for odd n. By definition, W [j] is the generating functional of the
n-point connected amplitudes G(n) , so in this discrete notation we have simply
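$$W[J] = \frac{1}{2}J_i\hat\Delta_{ij}J_j + \frac{1}{3!}G^{(3)}_{ijk}J_iJ_jJ_k + \frac{1}{4!}G^{(4)}_{ijkl}J_iJ_jJ_kJ_l + \ldots \tag{10.148}$$
whence
$$\phi_i = \frac{\partial W}{\partial J_i} = \hat\Delta_{ij}J_j + \frac{1}{2}G^{(3)}_{ijk}J_jJ_k + \frac{1}{3!}G^{(4)}_{ijkl}J_jJ_kJ_l + O(J^4) \tag{10.149}$$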
Note that here Δ̂ij is the discrete form of the full Euclidean propagator Δ̂E , as W [j]
generates the complete interacting connected Green functions of the theory, and K̂ will
now be defined as the (discrete) inverse of this full two-point function (which of course,
in the free field limit, reduces to a discretized version of the Euclidean Klein–Gordon
operator $K = -\Box + m^2$).
The last relation can now be inverted, order by order in increasing powers of φ
(which is of order J, by (10.149)), to yield Ji as a function of φi . Through terms of
order φ3 we find (see Problem 9), using the shorthand (K̂φ)i ≡ K̂ij φj ,
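$$J_i = (\hat K\phi)_i - \frac{1}{2}\hat K_{ij}G^{(3)}_{jkl}(\hat K\phi)_k(\hat K\phi)_l + \frac{1}{2}\hat K_{ij_1}G^{(3)}_{j_1k_1l_1}(\hat K\phi)_{l_1}\hat K_{k_1k_2}G^{(3)}_{k_2l_2j_2}(\hat K\phi)_{l_2}(\hat K\phi)_{j_2}$$
$$- \frac{1}{3!}\hat K_{ij}G^{(4)}_{jklm}(\hat K\phi)_k(\hat K\phi)_l(\hat K\phi)_m + O(\phi^4) \tag{10.150}$$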
Reinserting this result in the expansion for W [J], (10.148), we find after some
straightforward algebra (Problem 10), the effective action through terms of order φ4 :
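$$\Gamma[\phi] = -W[J(\phi)] + J_i\phi_i$$
$$= \frac{1}{2}\phi_i\hat K_{ij}\phi_j - \frac{1}{3!}G^{(3)}_{ijk}(\hat K\phi)_i(\hat K\phi)_j(\hat K\phi)_k$$
$$- \frac{1}{4!}\Big\{G^{(4)}_{ijkl}(\hat K\phi)_i(\hat K\phi)_j(\hat K\phi)_k(\hat K\phi)_l - 3G^{(3)}_{i_1j_1k_1}(\hat K\phi)_{j_1}(\hat K\phi)_{k_1}\hat K_{i_1i_2}G^{(3)}_{i_2j_2k_2}(\hat K\phi)_{j_2}(\hat K\phi)_{k_2}\Big\} + O(\phi^5) \tag{10.151}$$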
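The series inversion (10.150) and the resulting expansion (10.151) can be sanity-checked on a "lattice" with a single point, where every index takes one value and the amplitudes are plain numbers. The sketch below assumes, purely for illustration, that W(J) is exactly the polynomial (10.148) truncated after the quartic term; the values of `DELTA`, `G3`, and `G4` are hypothetical.

```python
DELTA, G3, G4 = 1.0, 0.3, -0.5   # hypothetical full propagator and connected amplitudes
K_HAT = 1.0 / DELTA              # inverse full propagator


def w(J):
    """Single-site version of (10.148), truncated after the J**4 term."""
    return 0.5 * DELTA * J**2 + G3 / 6.0 * J**3 + G4 / 24.0 * J**4


def field(J):
    """phi = dW/dJ, the single-site version of (10.149)."""
    return DELTA * J + 0.5 * G3 * J**2 + G4 / 6.0 * J**3


def source_series(phi):
    """Single-site version of (10.150), through order phi**3."""
    kp = K_HAT * phi
    return (kp - 0.5 * K_HAT * G3 * kp**2
            + 0.5 * K_HAT * G3 * kp * K_HAT * G3 * kp**2
            - K_HAT * G4 * kp**3 / 6.0)


def gamma_series(phi):
    """Single-site version of (10.151), through order phi**4."""
    kp = K_HAT * phi
    return (0.5 * phi * K_HAT * phi - G3 * kp**3 / 6.0
            - (G4 * kp**4 - 3.0 * G3 * kp**2 * K_HAT * G3 * kp**2) / 24.0)


def errors(J):
    """Truncation errors of the two series at the field produced by source J."""
    phi = field(J)
    gamma_exact = -w(J) + J * phi
    return abs(source_series(phi) - J), abs(gamma_series(phi) - gamma_exact)


for J in (0.02, 0.01):
    print(J, errors(J))
```

Halving the source reduces the first error by a factor close to 2^4 and the second by a factor close to 2^5, as expected for remainders of order φ^4 and φ^5 respectively.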
We observe that the individual terms appearing in Γ[φ] are again connected, consisting
of the connected G(n) amplitudes singly, or tied together by K̂ operators, which
remove a single internal (full) propagator when two separate G(n) are connected
(so as to avoid doubling the connecting Feynman propagator, as the G(n) are not
amputated). Moreover, the inverse (full) propagators K̂ attached to all external legs
(i.e., to factors of the external field φ) ensure the removal of all external legs, so
that Γ[φ] generates amputated diagrams—exactly the amplitudes leading to a smooth
perturbative evaluation of the LSZ formula, as discussed previously. If we define a
general n-point proper vertex Γ(n) as the n-th derivative at zero field of Γ[φ] (with a
minus included to take care of the sign change in going from W to Γ):
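$$\Gamma^{(n)}_{i_1i_2\ldots i_n} \equiv -\frac{\partial^n\Gamma[\phi]}{\partial\phi_{i_1}\partial\phi_{i_2}\cdots\partial\phi_{i_n}}\Big|_{\phi=0} \tag{10.152}$$
we see that the two-point proper vertex is just $\Gamma^{(2)}_{ij} = -\hat K_{ij}$, i.e., (minus) the inverse
full propagator. Similarly, $\Gamma^{(3)}_{ijk}$, the three-point proper vertex, is just the amputated
three-point connected Green function
$$\Gamma^{(3)}_{ijk} = \hat K_{ii'}\hat K_{jj'}\hat K_{kk'}G^{(3)}_{i'j'k'} \tag{10.153}$$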
which, for obvious reasons, is automatically 1PI. At order φ4 a new feature appears,
as the proper vertex is given by a combination of terms, specifically, the full connected
four-point Green function G(4) , with the four external legs amputated (by the K̂
factors), minus three other terms which are clearly one-particle-reducible in character:
A glance at Fig. 10.9, which shows a decomposition of the amputated G(4) amplitude
into parts built from 1PI amplitudes (connected with full propagators), clarifies the
meaning of the subtracted terms in (10.154): they serve to precisely remove from the
full connected four-point amplitude its one-particle-reducible pieces. In other words,
the first term on the right-hand side of Fig. 10.9, consisting of the 1PI four-point
contributions to the full connected four-point amplitude, is identical to $\Gamma^{(4)}_{ijkl}$. So,
at least through vertices with four external legs, the effective action constructed by
Legendre transformation does indeed seem to generate the amputated one-particle-
irreducible diagrams of the theory, and only such diagrams. Of course, it is hardly
obvious from the above that this remains true for n-point amplitudes with arbitrarily
large n.
A simple proof that the Legendre transform Γ[φ] indeed generates all the one-
particle-irreducible, and only the one-particle-irreducible, graphs for arbitrary pow-
ers of φ can be given using a trick described by Zinn–Justin (Zinn–Justin, 1989).
The individual contributions on the right-hand side of (10.154) are recognized as
corresponding to connected graphs in the sense that any external index (like i) can
be connected to any other (like l) by a continuous sequence of indices connected by
the multi-index connected Green functions $\hat K$, $G^{(3)}$, $G^{(4)}$, etc. For example, taking the
second term on the right-hand side of (10.154), corresponding to the second graph on
the right-hand side in Fig. 10.8, the index i can be connected to l by the sequence
$i \to i' \to m \to n \to l' \to l$, passing sequentially through $\hat K_{ii'}$, $G^{(3)}_{mi'k'}$, $\hat K_{mn}$, $G^{(3)}_{nj'l'}$, and
$\hat K_{ll'}$. We can investigate the effect of cutting internal lines on the graphs generated
by Γ[φ] by introducing a separable perturbation of the Klein–Gordon operator in the
quadratic part of the action, as follows:
Fig. 10.9 Decomposition of $G^{(4)}_{amp\,ijkl}$ in terms of 1PI vertices: crossbars indicate that the external
legs are “amputated”.
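$$S_\epsilon[\phi] = S[\phi] + \frac{\epsilon}{2}\sum_{i,j}\phi_i\phi_j = S[\phi] + \frac{\epsilon}{2}\Big(\sum_i\phi_i\Big)^2 \tag{10.155}$$
$$\frac{1}{2}\sum_{i,j}\phi_i(K_{ij} + \epsilon M_{ij})\phi_j, \qquad M_{ij} = 1,\ \forall i,j \tag{10.156}$$
$$W_\epsilon[J] = W[J] - \frac{\epsilon}{2}\frac{1}{Z[J]}\sum_{i,j}\frac{\partial^2Z[J]}{\partial J_i\partial J_j}$$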
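Recall here that in going over from the external source $J_i$ to the classical field $\phi_i$, we
have $\frac{\partial W[J]}{\partial J_i} = \phi_i$, while $\frac{\partial^2W[J]}{\partial J_i\partial J_j}$ is the inverse of the matrix $\frac{\partial^2\Gamma[\phi]}{\partial\phi_i\partial\phi_j}$ (cf. (10.142) and
(10.143)). Now, the chain rule formula
$$\frac{\partial}{\partial\epsilon}\Big|_{\phi_i} = \frac{\partial}{\partial\epsilon}\Big|_{J_i} + \frac{\partial J_i}{\partial\epsilon}\Big|_{\phi_i}\frac{\partial}{\partial J_i} \tag{10.160}$$
applied to the Legendre relation for the perturbed effective action $\Gamma_\epsilon[\phi]$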
21 As Γ[φ] generates amputated graphs, the only lines present are internal lines!
gives
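$$0 = \frac{\partial\Gamma_\epsilon[\phi]}{\partial\epsilon}\Big|_{\phi_i} + \frac{\partial W_\epsilon[J]}{\partial\epsilon}\Big|_{J_i} + \frac{\partial J_i}{\partial\epsilon}\Big|_{\phi_i}\frac{\partial}{\partial J_i}\big(W_\epsilon[J] - J_j\phi_j\big)$$
$$= \frac{\partial\Gamma_\epsilon[\phi]}{\partial\epsilon}\Big|_{\phi_i} + \frac{\partial W_\epsilon[J]}{\partial\epsilon}\Big|_{J_i} + \frac{\partial J_i}{\partial\epsilon}\Big|_{\phi_i}(\phi_i - \phi_i) = \frac{\partial\Gamma_\epsilon[\phi]}{\partial\epsilon}\Big|_{\phi_i} + \frac{\partial W_\epsilon[J]}{\partial\epsilon}\Big|_{J_i}$$
so that the first-order shift in the effective action $\Gamma_\epsilon[\phi]$ is just minus that in $W_\epsilon[J]$,
whence, from (10.159),
$$\Gamma_\epsilon[\phi] = \Gamma[\phi] + \frac{\epsilon}{2}\Big(\sum_i\phi_i\Big)^2 + \frac{\epsilon}{2}\sum_{i,j}\Big(\frac{\partial^2\Gamma[\phi]}{\partial\phi_i\partial\phi_j}\Big)^{-1} \tag{10.162}$$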
using the fact that the second derivative matrices of Γ[φ] and W [J] are inverses.
The second term on the right-hand side of (10.162) is just the perturbation originally
introduced in the action (10.155), which must, of course, appear in the $O(\hbar^0)$ (classical)
part of the full effective action Γ [φ]. It generates the expected disconnected part of
the two-point function (inverse full propagator). The third term contains the effect of
disconnecting a single internal line in all the higher-order vertices, and a little thought
shows that it consists entirely of connected diagrams. Indeed, as a function of the $J_i$
sources, it expands into obviously connected graphs, as it is just $\sum_{i,j}\frac{\partial^2W}{\partial J_i\partial J_j}$. But when
each of the $J_i = \frac{\partial\Gamma[\phi]}{\partial\phi_i}$ is inserted as a function of the $\phi_i$ into the latter expression, it
simply expands the previous connected graph (from W [J]) by a connected extension,
so even when expanded as a function of the φi only connected diagrams are obtained.
Note that the combinatoric factors relating different 1PI contributions are preserved
in Γ[φ]: the process of going from W to Γ involves (as we see from (10.154), and in Fig.
10.9) simply (a) amputating external lines, and (b) removing en bloc any one-particle-
reducible graphs. This concludes the proof that the graphs generated by Γ[φ] remain
connected even when any single internal line is cut, and hence must correspond exactly
to the amputated 1PI graphs appearing in the connected n-point functions generated
by W [j].
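For a free (Gaussian) theory the first-order formula (10.162) can be compared with an exact result. With $\hbar = 1$ and action $\frac{1}{2}\phi_iK_{ij}\phi_j$, the Gaussian integral gives $\Gamma[\phi] = \frac{1}{2}\phi_iK_{ij}\phi_j + \frac{1}{2}\ln\det K$ up to a K-independent constant, so shifting every $K_{ij}$ by a small amount changes Γ in a way that can be computed directly. The sketch below is an illustration on a hypothetical two-point lattice; the symbol `eps` for the perturbation parameter and the matrix `K` are assumptions of the example.

```python
import math

K = [[2.0, -1.0], [-1.0, 2.0]]   # hypothetical positive-definite 2x2 quadratic form


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def gamma_free(phi, eps):
    """Exact Gaussian effective action with K_ij -> K_ij + eps (hbar = 1),
    dropping a constant that does not depend on eps or phi."""
    k_eps = [[K[a][b] + eps for b in range(2)] for a in range(2)]
    quad = 0.5 * sum(phi[a] * k_eps[a][b] * phi[b]
                     for a in range(2) for b in range(2))
    return quad + 0.5 * math.log(det2(k_eps))


def predicted_shift(phi, eps):
    """The two O(eps) terms on the right-hand side of (10.162)."""
    k_inv = inverse2(K)   # inverse of the second-derivative matrix of Gamma
    return (0.5 * eps * sum(phi)**2
            + 0.5 * eps * sum(k_inv[a][b] for a in range(2) for b in range(2)))


phi, eps = [0.7, -0.4], 1e-4
exact_shift = gamma_free(phi, eps) - gamma_free(phi, 0.0)
print(exact_shift, predicted_shift(phi, eps))
```

The two numbers differ only at order eps squared, which is the accuracy to which (10.162) is stated. This checks the formula in the free case only; it says nothing by itself about interacting theories.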
Historically, the concept of one-particle-irreducibility was first introduced by Dyson
(Dyson, 1949) in the context of the three-point function (electron–electron–photon
vertex) in quantum electrodynamics, where the 1PI graphs contributing to this three-
point function were referred to as the “proper vertex part”. As we shall see later
in Part 4, the systematic discussion of renormalizability pioneered by Dyson makes
critical use of the fact that the ultraviolet sensitivity of the theory (in other words,
the divergence structure of the loop integrals appearing in a general graph) can be
fully analysed in terms of 1PI diagrams, as loop integrals in separate 1PI pieces of a
larger graph are algebraically decoupled from one another.
Fig. 10.10 A one-particle-reducible contribution to the 6-point function $G^{(6)}(p_1,p_2,\ldots,p_6)$.
[Diagram: two 1PI vertices, with external momenta $p_1, p_2, p_3$ and $p_4, p_5, p_6$, joined by a single line with propagator $\frac{1}{(p_1+p_2+p_3)^2-m^2+i\epsilon}$.]
A more systematic treatment of n-particle irreducibility was carried out by
Symanzik a decade later (Symanzik, 1960), emphasizing the fact that the study of
the singularity structure of amplitudes, as a function of the external momenta, was
intimately related to the ability to decompose the contributing graphs by cutting
one, two, or more internal lines. For example, the only amputated graphs capable of
producing single-particle pole singularities of the form $1/((\sum p_i)^2 - m^2)$, where the
summed momenta $p_i$ represent some non-trivial subset of the (appropriately signed)
external momenta of the graph, involve one-particle-reducible diagrams such as the
one shown in Fig. 10.10 in $\phi^4$ theory (we are back in Minkowski space, hence the
$i\epsilon$ in the propagator). The 1PI four-point vertices on either side of the central line
cannot contain any single-particle poles, which arise only when single lines connect
separate connected parts of the graph. Similarly, the ability to decompose a graph
into two pieces by cutting two internal lines, as in Fig. 10.11, implies a branch-point
structure in the variable $t \equiv (p_1 - p_2)^2$ when the two-particle threshold is reached
(at $t = 4m^2$).22 Correspondingly, such two-particle thresholds are absent from 2PI
(or two-particle-irreducible) amplitudes, which can only be disconnected by cutting
at least three internal lines of the graphs contributing to such amplitudes, which are
commonly referred to as “Bethe–Salpeter kernels”. They play, as we shall see in the
next chapter, an important role in understanding the physics of threshold bound
states in quantum field theory. Finally, with the recognition of the importance of
spontaneous symmetry breaking in quantum field theory in the 1960s, the central role
of the effective action in understanding the energetics of broken symmetry became
clear, starting with the seminal paper of Jona-Lasinio (Jona-Lasinio, 1964), a topic
to which we return in Chapter 14.
[Fig. 10.11: two 2PI pieces joined by two internal lines, with external momenta $p_1$, $p_2$, $p_3$, $p_4$.]
22 The reader may recall that exactly such branch-points appeared in our discussion of the Lehmann
representation for the full propagator in momentum space in Section 9.5, with new cuts appearing precisely
at squared-momentum values $w \equiv p^2$ at which new intermediate multi-particle states become kinematically
possible.
How to stop worrying about Haag’s theorem
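$$H = H_0 + V \tag{10.163}$$
$$H_0 = \int d^3x\,\mathcal{H}_0(\vec x,t) = \int d^3x :\Big\{\frac{\dot\phi^2}{2} + \frac{1}{2}|\vec\nabla\phi|^2 + \frac{1}{2}m_1^2\phi^2\Big\}: \tag{10.164}$$
$$V = \int d^3x\,\mathcal{H}_{int}(\vec x,t) = \int d^3x\,\frac{1}{2}\delta m^2 :\phi^2:\,, \qquad \delta m^2 \equiv m_2^2 - m_1^2 \tag{10.165}$$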
Thus, our unperturbed field is a free field φ1 (x) of mass m1 , while our full “interacting”
Heisenberg field is likewise a free scalar field φ2 (x), but of mass m2 . We should hardly
expect to encounter difficulties with the interaction picture in so simple a case, but as
we shall soon see, appearances in these matters are definitely deceiving!
On the basis of our treatment of the interaction picture in Section 4.3, we would
expect that the ground states of the two Hamiltonians $H_0$ (i.e., the vacuum $|0\rangle_1$ for
scalar particles of mass $m_1$) and H (the vacuum $|0\rangle_2$ for particles of mass $m_2$) would be
with $\Omega^{-}$ the unitary Møller wave operator defined in Section 4.3. Of course, $|0\rangle_1$ is
the Fock vacuum with respect to the destruction operators obtained in the usual way
from $\phi_1(x)$:
$$a_1(k) = \frac{i}{\sqrt{(2\pi)^3 2E_1(k)}}\int d^3x\,e^{-i\vec k\cdot\vec x}\{\dot\phi_1(\vec x,0) - iE_1(k)\phi_1(\vec x,0)\} \tag{10.168}$$
The formula for the destruction operator in terms of the field is time-independent,
so we have chosen to set t = 0 in the last line. We recall from Chapter 4 that, by
convention, the various pictures of time-development in quantum theory (interaction,
Schrödinger, Heisenberg) are presumed to coincide at time t = 0. In particular, our
interaction-picture field φ1 (x, t) and “Heisenberg” field φ2 (x, t) must coincide at t = 0,
as must (cf. (9.40)) their respective time-derivatives:
where our “Heisenberg” field φ2 (x) is expressed in the usual way in terms of the
destruction and creation operators a2 (k), a†2 (k) appropriate for particles of mass m2 :
$$\phi_2(x) = \int\frac{d^3q}{\sqrt{2E_2(q)(2\pi)^3}}\big(a_2(q)e^{-iq\cdot x} + a_2^\dagger(q)e^{iq\cdot x}\big) \tag{10.171}$$
Using the identity relations (10.169, 10.170), inserting (10.171) into (10.168), and
carrying out the x and q integrations, we find the connection between the creation–
destruction operators for the particles of mass m1 and m2 :
$$[a_1(k), a_1^\dagger(k')] = [a_2(k), a_2^\dagger(k')] = \delta^3(k - k') \tag{10.175}$$
In a system with a finite number of degrees of freedom (for example, if the system were
quantized on a discrete spacetime lattice with a finite number of points, with discrete
values for the spatial momenta k), standard results going back to von Neumann ensure
the existence of a well-defined unitary transformation relating the a1 (k) and a2 (k). If
the vacua $|0\rangle_1$, $|0\rangle_2$ (satisfying respectively $a_1(k)|0\rangle_1 = 0$, $a_2(k)|0\rangle_2 = 0$, $\forall k$) are indeed
related by a proper unitary transformation, as in (10.166), we should expect to be able
to expand the $|0\rangle_1$ vacuum in terms of Fock states for particle 2, and obtain thereby
a unit-normalized state:
$$|0\rangle_1 = g_0|0\rangle_2 + \int d^3k_1\,g_1(k_1)|k_1\rangle_2 + \frac{1}{2}\int d^3k_1d^3k_2\,g_2(k_1,k_2)|k_1,k_2\rangle_2 + \ldots \tag{10.176}$$
with $g_0 = {}_2\langle0|0\rangle_1$ and
$${}_1\langle0|0\rangle_1 = 1 = |g_0|^2 + \int d^3k_1\,|g_1(k_1)|^2 + \frac{1}{2}\int d^3k_1d^3k_2\,|g_2(k_1,k_2)|^2 + \ldots \tag{10.177}$$
On the other hand, from (10.172), we know that
If we successively take the overlap of (10.178) with ${}_2\langle0|$, ${}_2\langle k_1|$, ${}_2\langle k_1k_2|$, .., we find easily,
using the expansion (10.176),
$$g_1(k) = 0 \tag{10.179}$$
$$g_2(k_1,k_2) = -\frac{\beta(k)}{\alpha(k)}\,\delta^3(k_1+k_2)\,{}_2\langle0|0\rangle_1 \tag{10.180}$$
$$g_3(k_1,k_2,k_3) = 0, \text{ and so on} \tag{10.181}$$
Inserting the result (10.180) into (10.177) however, the integral diverges, as it involves
the square of a δ-function. As $\beta(k) \neq 0$ if $m_1 \neq m_2$, the finite norm of $|0\rangle_1$ therefore
requires the overlap ${}_2\langle0|0\rangle_1 = g_0$ to vanish. The recursion implied by (10.178) then
leads to the conclusion that $g_2 = g_4 = g_6 = g_{2n} = 0$. As the states with odd numbers
of particles do not appear, we have arrived at an evident contradiction: ${}_1\langle0|0\rangle_1 = 0$! In
fact, the argument extends to arbitrary multi-particle states of particle 1, which all
have vanishing overlap with those of particle 2. It is apparent that the Fock spaces
of scalar particles of differing mass cannot be consistently incorporated within the
same separable Hilbert space, allowing the desired unitary transformation between
corresponding states. This result is evidently a direct consequence of the continuum
normalization of the two-particle states, which in turn can be traced back to the fact
that we have quantized our system in an infinite spatial box. Our problems in this
particular example arise from the fact that we cannot obtain a properly normalized
discrete vacuum state by mixing another such state with multi-particle states built
from pairs of continuum-normalized opposite-momentum particles.
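The origin of the divergence can be displayed in one line. The following is a sketch obtained by inserting (10.180) into the two-particle term of (10.177); the identification of $\delta^3(0)$ with $V/(2\pi)^3$ is the usual box-regularized reading and is an assumption at this point of the argument.

```latex
\frac{1}{2}\int d^3k_1\, d^3k_2\, |g_2(\vec k_1,\vec k_2)|^2
  = \frac{|g_0|^2}{2}\int d^3k_1\, d^3k_2\,
    \left|\frac{\beta(k_1)}{\alpha(k_1)}\right|^2
    \left[\delta^3(\vec k_1+\vec k_2)\right]^2
  = \frac{|g_0|^2}{2}\,\delta^3(0)\int d^3k\,
    \left|\frac{\beta(k)}{\alpha(k)}\right|^2 ,
\qquad
\delta^3(0) \;\leftrightarrow\; \frac{V}{(2\pi)^3} \;\to\; \infty .
```

The remaining momentum integral is finite and non-zero whenever the masses differ, so the normalization sum can only stay finite if $g_0 = 0$.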
362 Dynamics VIII: Interacting fields: perturbative aspects
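particle states discretely normalized, like the vacuum state from which they arise by
application of creation operators. To distinguish discrete from continuous momenta,
we shall indicate the former by subscripts (rather than arguments) in the following.
Thus, the scalar field $\phi_1$ now takes the form
$$\phi_1(x) = V^{-1/2}\sum_{\vec k}\frac{1}{\sqrt{2E_{1k}}}\left(a_{1\vec k}\,e^{-ik\cdot x} + a^\dagger_{1\vec k}\,e^{ik\cdot x}\right) \tag{10.182}$$
with a similar equation for $\phi_2$. The infinite volume limit is obtained from (10.182) with
the usual transcription $\frac{1}{V}\sum_{\vec k} \to \int\frac{d^3k}{(2\pi)^3}$ and the normalization $a_{1\vec k} \to \frac{(2\pi)^{3/2}}{V^{1/2}}\,a_1(\vec k)$ (so
that $[a_{1\vec k}, a^\dagger_{1\vec k'}] = \delta_{\vec k\vec k'}$). The discrete analog of (10.172–10.174) is
$$a_{1\vec k} = \alpha_k\,a_{2\vec k} + \beta_k\,a^\dagger_{2,-\vec k} \tag{10.183}$$
$$\alpha_k = \frac{1}{2}\left(\sqrt{\frac{E_{1k}}{E_{2k}}} + \sqrt{\frac{E_{2k}}{E_{1k}}}\right) \tag{10.184}$$
$$\beta_k = \frac{1}{2}\left(\sqrt{\frac{E_{1k}}{E_{2k}}} - \sqrt{\frac{E_{2k}}{E_{1k}}}\right) \tag{10.185}$$
with $E_{ik} = \sqrt{\vec k^2 + m_i^2}$ as before, but with $\vec k$ discrete. Recognizing that the expansion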
(10.176) involves only pairs of opposite momentum particles k, −k (which can appear
multiply for each discrete k), we divide the set of all pairs into a set of independent
pairs k, −k with k in the right hemisphere kx > 0 in order to avoid double-counting.
The Bogoliubov transformation can be carried out separately on each such pair as the
creation–annihilation operators for distinct k commute. Focussing our attention on a
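specific discrete $\vec k$, we can expand the $|0\rangle_1$ vacuum as follows:
$$|0\rangle_1 = \sum_N c_N|N\rangle_2, \qquad |N\rangle_2 \equiv \frac{1}{N!}\left(a^\dagger_{2\vec k}\,a^\dagger_{2,-\vec k}\right)^N|0\rangle_2, \qquad {}_2\langle N|N\rangle_2 = 1 \tag{10.186}$$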
The requirement $a_{1\vec k}|0\rangle_1 = 0$, using (10.183), then leads directly to the simple recursion
relation (see Problem 11):
$$c_n = -\frac{\beta_k}{\alpha_k}\,c_{n-1} \;\Rightarrow\; c_N = \left(-\frac{\beta_k}{\alpha_k}\right)^N c_0 \tag{10.187}$$
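The recursion can be checked numerically in a truncated two-mode Fock space. The sketch below is illustrative only: the function names, the cutoff `n_max` and the sample energies are assumptions introduced here, and $c_0$ is fixed by requiring unit norm of the state. It builds $a_{1\vec k}$ from (10.183) as a matrix and verifies that the state with $c_N = (-\beta_k/\alpha_k)^N c_0$ is annihilated by it.

```python
import numpy as np

def bogoliubov_coefficients(E1, E2):
    """alpha_k and beta_k of (10.184)-(10.185) for mode energies E1, E2."""
    r = np.sqrt(E1 / E2)
    return 0.5 * (r + 1.0 / r), 0.5 * (r - 1.0 / r)

def annihilation_matrix(n_max):
    """Annihilation operator truncated to occupation numbers 0..n_max."""
    return np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)

def check_pair_vacuum(E1, E2, n_max=40):
    """Return (|a_1k psi|, |psi|, c0*alpha) for the candidate vacuum psi."""
    alpha, beta = bogoliubov_coefficients(E1, E2)
    a = annihilation_matrix(n_max)
    identity = np.eye(n_max + 1)
    a2_k = np.kron(a, identity)              # a_{2,k}: acts on mode k
    a2_minus_k_dag = np.kron(identity, a.T)  # a^dagger_{2,-k}: acts on mode -k
    a1_k = alpha * a2_k + beta * a2_minus_k_dag  # equation (10.183)

    ratio = -beta / alpha
    c0 = np.sqrt(1.0 - ratio**2)             # fixed by unit normalization
    psi = np.zeros((n_max + 1) ** 2)
    for N in range(n_max + 1):
        # |N>_2 is the number state with N quanta in each of the modes k, -k
        psi[N * (n_max + 1) + N] = c0 * ratio**N
    return np.linalg.norm(a1_k @ psi), np.linalg.norm(psi), c0 * alpha

residual, norm, c0_alpha = check_pair_vacuum(E1=1.0, E2=2.0)
print(residual, norm, c0_alpha)
```

For energies that are not too different the ratio $|\beta_k/\alpha_k|$ is well below one, so the truncation at `n_max` has a negligible effect on the norm.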
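The normalization condition ${}_1\langle 0|0\rangle_1 = 1 = \sum_N |c_N|^2 = |c_0|^2\,\frac{1}{1-\beta_k^2/\alpha_k^2}$ thus gives
$$|c_0| = \sqrt{1-\beta_k^2/\alpha_k^2} \tag{10.188}$$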
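The overlap between the two vacua ${}_2\langle 0|0\rangle_1$ is then given, up to an irrelevant phase, by
the product$^{23}$ of $c_0$ factors of the form (10.188) for all independent pairs, labeled by
$\vec k$ values in the right hemisphere:
$${}_2\langle 0|0\rangle_1 = \prod_{\vec k,\;k_x>0}\sqrt{1-\beta_k^2/\alpha_k^2} \tag{10.189}$$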
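In the large volume limit, $\sum_{\vec k} \to \frac{V}{2(2\pi)^3}\int d^3k$ (the extra factor of 1/2 arises from the
restriction of $\vec k$ to the right hemisphere) and this becomes
$${}_2\langle 0|0\rangle_1 = \exp\left\{-\frac{V}{2(2\pi)^3}\int d^3k\,\ln\left\{\frac{E_1(k)+E_2(k)}{2\sqrt{E_1(k)E_2(k)}}\right\}\right\} \tag{10.190}$$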
so that the momentum integral in (10.190) is ultraviolet convergent for spatial dimen-
sions three or less (it is manifestly convergent for small k). However, the presence in
the exponent of a volume factor multiplying this finite integral (in three dimensions)
shows immediately the vanishing of the overlap of the two vacua in the infinite volume
limit found earlier. In the usual terminology employed by axiomatic field theorists,
the Fock spaces built on these two vacua are said to be “unitarily inequivalent”
spaces. It should be noted that the ultraviolet dependence of the integral appearing
in expressions for the vacuum overlap is extremely theory (and spacetime dimension)
dependent: for example, an analogous computation for two free spin-1/2 fields of different
mass gives a momentum integral divergent for spatial dimensions greater than one
(rather than three, in the scalar case). The important point is that, irrespective
of the ultraviolet behavior, the overlap between normalized vacua for free fields of
differing mass necessarily vanishes in the infinite volume limit as a consequence of the
geometrical/kinematical mismatch in Hilbert space between discretely normalized and
continuum normalized states.
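The behavior of the overlap (10.190) can be illustrated numerically. The following is a rough sketch, not a precision calculation: the masses, cutoffs, grid and function names are illustrative assumptions, and the logarithm is rewritten as $\ln\cosh\theta$ with $\theta = \frac{1}{2}\ln(E_1/E_2)$ purely for numerical stability at large $k$.

```python
import numpy as np

def log_alpha(k, m1, m2):
    """ln[(E1+E2)/(2 sqrt(E1 E2))] written as ln cosh(theta), theta = (1/2) ln(E1/E2)."""
    theta = 0.25 * np.log1p((m1**2 - m2**2) / (k**2 + m2**2))
    return np.log1p(2.0 * np.sinh(0.5 * theta) ** 2)

def momentum_integral(m1, m2, k_max, n_points=20001):
    """Trapezoid estimate of int d^3k ln{...} in three dimensions, cut off at k_max."""
    k = np.geomspace(1e-4, k_max, n_points)
    f = 4.0 * np.pi * k**2 * log_alpha(k, m1, m2)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(k)))

def vacuum_overlap(volume, m1, m2, k_max=1.0e3):
    """Overlap of the two vacua according to (10.190) for a box of the given volume."""
    integral = momentum_integral(m1, m2, k_max)
    return float(np.exp(-volume * integral / (2.0 * (2.0 * np.pi) ** 3)))

# The integral barely changes when the cutoff is raised (ultraviolet convergence),
# while the overlap falls exponentially with the volume.
print(momentum_integral(1.0, 2.0, 1.0e2), momentum_integral(1.0, 2.0, 1.0e3))
for volume in (1.0e1, 1.0e2, 1.0e3):
    print(volume, vacuum_overlap(volume, 1.0, 2.0))
```

The integrand falls off like $k^{-4}$ at large momentum, which is why raising the cutoff changes the result only at the level of $1/k_{\max}$.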
The above example has been generalized to a much broader statement concerning
the non-existence of the interaction picture in essentially all cases in which field
theory Hamiltonians H0 (“free”) and H (“interacting”), defined in an infinite-volume
continuum spacetime supporting the full Poincaré group of spacetime invariances,
differ non-trivially (i.e., by the integral of some local operator density) from each
other. We shall state below (though not prove fully) the modern version of this “Haag’s
theorem”: it is an inescapable consequence of the spectral and field axioms (Ia-d, IIa-d)
of Section 9.2.
Before going on to discuss a more general proof, let us emphasize what Haag’s
theorem does not say about interacting field theories. In particular, there is no difficulty
23 The vacuum of our theory may be regarded as the direct product of vacua for each momentum mode
separately, so that overlaps of the form ${}_2\langle 0|0\rangle_1$ factorize as indicated.
whatsoever in establishing a well-defined unitary relation between the in- and out-states
of an interacting field theory: the overlaps ${}_{\rm out}\langle\beta|\alpha\rangle_{\rm in} = S_{\beta\alpha}$ are taken between
states living in spaces spanned by a complete basis of eigenstates of the same Hamiltonian
operator H. Indeed, the Haag–Ruelle and LSZ scattering theories developed in
Sections 9.3 and 9.4 lead to a perfectly well-defined, and unitary, S-matrix, on the basis
of exactly the same axiomatic framework which can be used to establish the validity of
Haag’s theorem. The LSZ formula, for example, gives a rigorous connection between
well-defined Green functions (time-ordered products of the full Heisenberg fields) and
this unitary S-matrix. Direct non-perturbative evaluation of the Green functions of
the theory (say, by lattice field theory methods) therefore completely circumvents any
difficulty with the non-existence of interaction picture, as the latter is simply not
employed at any point.
Of course, in many cases the only sensible approach to the evaluation of the Green
functions is via perturbation theory, which is inescapably rooted in an interaction-
picture formalism, as we discussed earlier in this chapter in Sections 10.1 and
10.2. If we now return to our initial example, with unperturbed and perturbation
Hamiltonians given in (10.164, 10.165), we discover with some surprise that the
formal execution of the perturbative evaluation in interaction-picture of the n-point
Green functions of particle 2 (described by the “full” Hamiltonian H) leads to
no particular difficulties. For example, using the Gell–Mann–Low theorem (9.21),
we have the formal perturbative expansion for the two-point Feynman function
${}_2\langle 0|T(\phi_2(x)\phi_2(y))|0\rangle_2 = i\Delta_F^{(2)}(x-y)$ of the “interacting” field $\phi_2$ in terms of the
propagators of the “free” field $\phi_1$:
Expanding the vacuum-expectations inside the integrals using the Wick theorem
(recall that fields inside a normal product symbol are not contracted), one finds
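$$\begin{aligned}
i\Delta_F^{(2)}(x-y) &= i\Delta_F^{(1)}(x-y) \\
&\quad - i\frac{\delta m^2}{2}\,(2)\int i\Delta_F^{(1)}(x-z_1)\,i\Delta_F^{(1)}(z_1-y)\,d^4z_1 \\
&\quad + \frac{1}{2}\left(\frac{-i\,\delta m^2}{2}\right)^2 (8)\int i\Delta_F^{(1)}(x-z_1)\,i\Delta_F^{(1)}(z_1-z_2)\,i\Delta_F^{(1)}(z_2-y)\,d^4z_1\,d^4z_2 \\
&\quad + \dots
\end{aligned} \tag{10.193}$$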
The factors (2) and (8) in the second and third lines are combinatoric factors
reflecting the number of equivalent ways by which the fields can be contracted to
give the displayed product of propagators. On transforming to momentum space, the
How to stop worrying about Haag’s theorem 365
convolutions over spacetime arguments become simple products, and the perturbative
series reveals itself as a simple geometric expansion:
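$$\begin{aligned}
\Delta_F^{(2)}(k) &= \Delta_F^{(1)}(k) + \delta m^2\left(\Delta_F^{(1)}(k)\right)^2 + (\delta m^2)^2\left(\Delta_F^{(1)}(k)\right)^3 + \dots \\
&= \frac{1}{k^2-m_1^2+i\epsilon} + \delta m^2\left(\frac{1}{k^2-m_1^2+i\epsilon}\right)^2 + (\delta m^2)^2\left(\frac{1}{k^2-m_1^2+i\epsilon}\right)^3 + \dots \\
&= \frac{1}{k^2-m_1^2-\delta m^2+i\epsilon} \\
&= \frac{1}{k^2-m_2^2+i\epsilon}
\end{aligned} \tag{10.194}$$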
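The resummation in (10.194) is easy to check numerically at a fixed off-shell momentum. The sketch below is a minimal illustration under stated assumptions: it works with a real $k^2$ away from both mass shells (so the $i\epsilon$ is dropped), the function names and sample values are introduced here, and the series is only summed where $|\delta m^2\,\Delta_F^{(1)}(k)| < 1$, the closed form being its analytic continuation elsewhere.

```python
def free_propagator(k_squared, mass_squared):
    """Momentum-space propagator 1/(k^2 - m^2) at real, off-shell k^2."""
    return 1.0 / (k_squared - mass_squared)

def resummed_propagator(k_squared, m1_squared, delta_m_squared, n_terms):
    """Partial sum of the geometric series in (10.194), through n_terms terms."""
    d1 = free_propagator(k_squared, m1_squared)
    return sum(d1 * (delta_m_squared * d1) ** n for n in range(n_terms))

k_squared, m1_squared, delta_m_squared = 10.0, 1.0, 2.0
m2_squared = m1_squared + delta_m_squared
for n_terms in (1, 2, 5, 20, 60):
    print(n_terms, resummed_propagator(k_squared, m1_squared, delta_m_squared, n_terms))
print("closed form:", free_propagator(k_squared, m2_squared))
```

With these values the expansion parameter is $\delta m^2/(k^2-m_1^2) = 2/9$, so the partial sums approach the mass-$m_2$ propagator geometrically.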
$$= i\,\frac{1}{V}\sum_{\vec k}\int\frac{dk_0}{2\pi}\,\frac{e^{-ik\cdot(x_1-x_2)}}{k^2-m_1^2+i\epsilon} \tag{10.195}$$
Here the time component k0 is still continuous (as the time dimension is still
unbounded), but the spatial components of k are discrete as appropriate for periodic
modes in our finite box. We may now repeat essentially verbatim the steps leading
from (10.192) to (10.194) to obtain for the “full” propagator,
$$\Delta_F^{(2)}(x_1-x_2) = \frac{1}{V}\sum_{\vec k}\int\frac{dk_0}{2\pi}\,\frac{e^{-ik\cdot(x_1-x_2)}}{k^2-m_2^2+i\epsilon} \tag{10.196}$$
—precisely the expression for the free propagator of a particle of mass m2 , also
quantized in a box of volume V . The desired infinite volume result for the “inter-
acting” theory can then be obtained trivially by taking the V → ∞ limit in (10.196),
thereby returning to the conventional result (7.215). The recovery of the full Poincaré
invariance of the infinite-volume continuum theory, with continuous and unbounded
four-momenta, is perfectly straightforward in this case, as we are dealing with theories
without non-trivial interactions: in particular, the execution of the perturbation expan-
sion does not lead to ultraviolet-divergent loop integrals. Also, there are no divergent
phases in the evolution of the vacuum states (either for particle 1 or particle 2) due to
disconnected vacuum graphs. Nevertheless, we emphasize once again that, from the
point of view of Haag’s theorem, the interaction picture for our toy system in infinite
volume is just as pathological as in cases in which non-trivial interactions (involving
higher than quadratic terms in the fields) are present.
The situations in which the interaction picture is typically employed, to compute
the scattering amplitudes of an interacting field theory say, involve perturbations of
the free theory of a much more complicated nature than the innocent shift of mass
discussed above, and we should hardly be surprised if the negative conclusions reached
above concerning the existence of proper unitary transformations relating the Fock
states of the free and interacting field remain in force in these more physically relevant
circumstances. The proof of the generalized Haag theorem is usually accomplished
in two steps. One first establishes that the unitary equivalence of two irreducible
field operators (recall that from Axiom IId of Section 9.2, this implies that the only
operator commuting with all fields is a multiple of the identity) implies the equality of
their equal-time Wightman functions (VEVs of field products). The second, and more
difficult, step uses analyticity properties of the Wightman functions following from the
spectral and field axioms of Section 9.3 (and embodied in the Hall–Wightman theorem)
to extend this equality to Wightman functions for arbitrary spacetime arguments. We
shall outline the proof of the first part here, and refer the interested reader to the
literature for the more technically challenging second step. An alternative proof, due
to Jost and Schroer, and involving only the two-point function, will be relegated to
an exercise for the reader (see Problem 13).
Let us suppose that we have two scalar24 fields φ1 (x, t) and φ2 (x, t), with associated
canonical momentum fields π1 (x, t) ≡ φ̇1 (x, t), π2 (x, t) ≡ φ̇2 (x, t), and related at any
given time t by a well-defined unitary operator V (t) defined in a single Hilbert space
accommodating both fields (and equal to the interaction-picture operator U (t, 0) in our
previous notation: the change is occasioned by the desire to avoid confusion with the
U (R, a) unitary representatives of the Euclidean group introduced below):
The Euclidean subgroup of the Poincaré group, consisting of spatial rotations R and
translations a is realized on the Hilbert space by unitary operators Ui (R, a), i = 1, 2
which have the usual action on our local fields (cf. (5.94)):
24 The generalization to fields transforming under non-trivial representations of the Lorentz group is
unproblematic.
U (Λ) (resp. UH (Λ)) introduced in Section 9.1. Note that we have at this stage already
committed ourselves to continuum-normalized multi-particle states, by insisting on
the invariance of the theory under the continuous Euclidean group. Finally, we shall
assume that there is a unique invariant state (vacuum) for each set of Euclidean group
representatives:
U1† (R, a)V (t)U2 (R, a)V † (t)φ1 (x, t) = U1† (R, a)V (t)U2 (R, a)φ2 (x, t)V † (t)
= U1† (R, a)V (t)φ2 (Rx + a, t)U2 (R, a)V † (t)
= U1† (R, a)φ1 (Rx + a, t)V (t)U2 (R, a)V † (t)
= φ1 (x, t)U1† (R, a)V (t)U2 (R, a)V † (t) (10.202)
so that the operator U1†(R, a)V(t)U2(R, a)V†(t) commutes with all fields φ1(x, t)
on timeslice t. An exactly similar sequence of manipulations (using (10.198,10.200))
establishes that this commutativity holds also with the π1(x, t) operators. The creation
and annihilation operators appropriate for free field φ1 can be reconstructed from the
φ1(x, t) and π1(x, t), so commutativity of U1†(R, a)V(t)U2(R, a)V†(t) is thus established
with all such operators, which implies that it must act as a multiple of the identity
in the Fock space of field φ1 (this is the irreducibility property, here invoked for fields
and their conjugate momenta on a single time-slice). Thus
U1† (R, a)V (t)U2 (R, a)V † (t) = c(R, a) (10.203)
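The four steps of (10.202) each use one input relation. As a sketch of the form these relations must take for the chain to go through, written in the notation of this section (the precise form and numbering of the corresponding displayed equations is an assumption here):

```latex
% Unitary relation between the two sets of canonical fields at time t
\phi_2(\vec x,t) = V^\dagger(t)\,\phi_1(\vec x,t)\,V(t), \qquad
\pi_2(\vec x,t) = V^\dagger(t)\,\pi_1(\vec x,t)\,V(t)

% Euclidean covariance of each field (i = 1, 2)
U_i(R,\vec a)\,\phi_i(\vec x,t)\,U_i^\dagger(R,\vec a) = \phi_i(R\vec x+\vec a,t), \qquad
U_i(R,\vec a)\,\pi_i(\vec x,t)\,U_i^\dagger(R,\vec a) = \pi_i(R\vec x+\vec a,t)

% Invariance of each vacuum
U_i(R,\vec a)\,|0\rangle_i = |0\rangle_i
```

With these, the first line of (10.202) uses $V^\dagger\phi_1 = \phi_2 V^\dagger$, the second uses $U_2\,\phi_2(\vec x,t) = \phi_2(R\vec x+\vec a,t)\,U_2$, the third uses $V\phi_2 = \phi_1 V$, and the fourth uses $U_1^\dagger\,\phi_1(R\vec x+\vec a,t) = \phi_1(\vec x,t)\,U_1^\dagger$.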
The fact that U1,2 (R, a) form (infinite-dimensional) representations of the Euclidean
group (cf. (5.14):
implies that the c(R, a) must likewise form a one-dimensional representation of the
Euclidean group:
Some simple group-theoretic reasoning (see Problem 12) leads to the conclusion that
the only such representation is the trivial one, c(R, a) = 1, whence
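$$\begin{aligned}
{}_1\langle 0|\phi_1(\vec x_1,t)\phi_1(\vec x_2,t)\dots\phi_1(\vec x_n,t)|0\rangle_1
&= {}_1\langle 0|V(t)V^\dagger(t)\phi_1(\vec x_1,t)V(t)V^\dagger(t)\phi_1(\vec x_2,t)\,..\,V(t)V^\dagger(t)\phi_1(\vec x_n,t)V(t)V^\dagger(t)|0\rangle_1 \\
&= {}_2\langle 0|\phi_2(\vec x_1,t)\phi_2(\vec x_2,t)\dots\phi_2(\vec x_n,t)|0\rangle_2
\end{aligned} \tag{10.210}$$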
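The passage from $c(R,\vec a) = 1$ to (10.210) can be sketched as follows. This is an outline under the assumptions that the fields are related by $\phi_2 = V^\dagger\phi_1 V$ and that each set of Euclidean representatives has a unique invariant state; the phase $\gamma(t)$ is a label introduced here for illustration.

```latex
c(R,\vec a) = 1
\;\Rightarrow\; V(t)\,U_2(R,\vec a)\,V^\dagger(t) = U_1(R,\vec a)
\;\Rightarrow\; U_2(R,\vec a)\,V^\dagger(t)|0\rangle_1
      = V^\dagger(t)\,U_1(R,\vec a)|0\rangle_1 = V^\dagger(t)|0\rangle_1

% uniqueness of the U_2-invariant state:
V^\dagger(t)|0\rangle_1 = e^{i\gamma(t)}\,|0\rangle_2

% regrouping the inserted factors V V^\dagger = 1:
{}_1\langle 0|V\,(V^\dagger\phi_1 V)\cdots(V^\dagger\phi_1 V)\,V^\dagger|0\rangle_1
   = {}_2\langle 0|\phi_2\cdots\phi_2|0\rangle_2
```

The phase drops out because it enters once through the ket and once, complex conjugated, through the bra.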
Recall that our fields φ1 , φ2 are supposed to represent free and fully interacting
fields, respectively, so this result is already astonishing, as we should certainly not
expect, even at equal time, the free-field vacuum-expectation-values to coincide with
the corresponding very complicated interacting ones. The final nail in the coffin of
the interaction picture is inserted by the realization that the very strong analyticity
constraints on the spacetime Wightman functions (cf. Section 9.2) allow the equality
expressed in (10.210) to be extended to arbitrary values of the spacetime coordinates
of the fields. Note that these analyticity properties follow from the full panoply of
Wightman axioms (of type I and II) discussed in Section 9.2: in particular, locality, full
Poincaré (not just Euclidean group) invariance, and the usual spectral properties. The
insertion of θ-functions leads to a similar conclusion for the Feynman (time-ordered)
Green functions of fields φ1 and φ2 , from which we conclude (via LSZ) that the S-
matrix of the interacting field φ2 is equal to that of the free field φ1 : namely, unity.
Thus, non-trivial interactions are excluded once we make the evidently overly strong
assumption of well-defined (“proper”) unitary equivalence of the representations for
the two fields. The interested reader is encouraged to follow the more detailed accounts
of this second step in the argument leading to Haag’s theorem, involving an application
of the fundamental Hall–Wightman theorem (Hall and Wightman, 1957)(cf. also
Section 9.2) on analytic domains of Wightman functions, in (Barton, 1963) and
(Greenberg, 1959).
A slightly different route (Streater and Wightman, 1978) to Haag’s theorem utilizes
the two-point function only, the analyticity properties of which are essentially trivial,
as we have seen in Section 9.5. Taking n = 2 in (10.210), and with $\phi_1(x)$ a canonically
normalized free scalar field of mass m, we have for the “interacting field” $\phi_2$
$${}_2\langle 0|\phi_2(\vec x_1,t)\phi_2(\vec x_2,t)|0\rangle_2 = \Delta_+(\vec x_1-\vec x_2, 0; m) \tag{10.211}$$
where Δ+ is the invariant function of (6.63). For x1 , x2 any pair of space-like separated
points, the corresponding times t1 , t2 can be brought to equality by an appropriate
Lorentz boost, so from the Lorentz-invariance properties of Δ+ we conclude that the
two-point Wightman function of the φ2 field must coincide with that of the free field
for space-like separations of x1 − x2 . The equality can be analytically extended to the
time-like domain: for example, we need only appeal to the Lehmann representation for
the two-point function derived in Section 9.5, where the spectral function is already
fully determined by knowledge of the two-point function in the space-like region.
Finally, one utilizes a theorem of Jost and Schroer (Jost, 1961) (see also Problem 13),
wherein it is shown that any field whose two-point function coincides with that of a
free field must itself be a free field (evidently, of the same mass).
The non-existence of the interaction picture for any Poincaré invariant local field
theory with essentially any non-trivial split (other than a trivial c-number one) into
free and interacting parts of the Hamiltonian is, of course, an unpleasant fact of
life given the enormous utility of perturbative Feynman graph technology in modern
particle physics. The attitude of the present author to this circumstance has already
been outlined above, in the discussion of the perturbative expansion of the two-point
function in the toy model of a scalar mass shift. The interaction-picture formalism
can be reinstated with complete mathematical rigor by a full regularization of the
field theory, in which both spatial infrared (i.e., finite volume) and ultraviolet (i.e.,
finite lattice spacing) cutoffs are introduced. The resultant theory, at the price of loss
of Poincaré invariance, is now a quantum-mechanical system with a finite number of
independent degrees of freedom, and the interaction picture makes perfect sense. The
problem is now transferred to the issue of regaining sensible (in particular, Poincaré
invariant!) results in the limit when these cutoffs are removed, after the perturbative
expansion of the n-point functions needed for evaluation of the S-matrix has been
performed. Note that the perturbative contributions obtained at each finite order of
perturbation theory are completely well-defined in this cutoff theory (although, as
emphasized previously, the expansion is only an asymptotic one, with the sum of
perturbative contributions diverging because of factorial growth of the coefficients, as
we shall see in the next chapter).
We consider first the behavior of the cutoff perturbative amplitudes as the spatial
volume of the system is allowed to go to infinity. From the discussion in Section 10.2
of persistent interactions, we know that in a theory of massive particles, the only
volume singularity of the n-point Feynman functions appears in the phase ${}_{\rm out}\langle 0|0\rangle_{\rm in}$
accumulated in the vacuum due to the vacuum energy density shift induced by interactions.
This phase is removed by the simple expedient of considering only connected
contributions to the S-matrix: it would in any event disappear subsequently once the
S-matrix amplitudes are squared to determine the probability of scattering processes.
The infinite volume limit is perfectly smooth in the remaining connected amplitudes,
as the appearance of momentum integrals extending down to zero momentum is
unproblematic in a theory with massive particles due to the absence of infrared
divergences (on this matter, cf. Chapter 19).
It is important to realize that contrary to assertions one sometimes encounters
in discussions of Haag’s theorem, the vacuum fluctuations encountered in interaction
picture, corresponding to the interaction-induced shift in the ground-state energy of
the theory (and present even when the theory is fully regulated), are not the root
cause of the non-existence of the interaction picture. Indeed, Haag’s theorem applies
in full force to supersymmetric field theories (see Section 12.6) in which the vacuum
energy fluctuations cancel identically between bosonic and fermionic contributions.
Nevertheless, the interacting n-point functions of these theories most certainly differ
from their free limits, guaranteeing the non-existence of the interaction picture by
the arguments given above. Indeed, there are typically mass shifts (equal for bosonic
25 The non-renormalization theorems of the superpotential ensure the absence of renormalization in the
mass terms in the Lagrangian, but there are typically shifts in the poles of the propagator due to non-trivial
wavefunction renormalizations. See Section 12.6 for an explanation of these arcane terms.
26 An important exception to this occurs with Hamiltonian lattice formulations of field theory, provided,
of course, that the lattice Hamiltonian is constructed to be properly hermitian.
10.6 Problems
1. Verify that the operator E (t, t0 ), defined in (10.15), satisfies the same first-order
equation (10.16) and initial condition as E(t, t0 ), whence E (t, t0 ) = E(t, t0 ).
2. Determine the Wick expansion of T (φ(x1 )φ(x2 )φ(x3 )φ(x4 )) by taking the fourth
functional derivative of (10.22) with respect to j(x1 ), j(x2 ), j(x3 ), j(x4 ) and
setting the sources to zero.
3. Perform the indicated spacetime integrations in (10.37) to obtain the
momentum-space expression for 2-2 scattering in the theory with interaction
(10.31).
4. The object of this exercise is to work out the lowest-order perturbative contri-
butions to the vacuum energy density shift δE (see (10.42)) induced by a λφ4
interaction.
(a) First assume that the interaction is not normal-ordered, $H_{\rm int} = \frac{\lambda}{4!}\phi^4$. There
is then a contribution to the vacuum-to-vacuum amplitude of first order in
λ. Show that after an overall integral over spacetime (interpreted as V · T ,
spatial volume times temporal extent) is extracted, the energy density shift
is found to be
$$\delta E = \frac{\lambda}{8}\left(i\Delta_F(0)\right)^2 = -\frac{\lambda}{8}\left(\int\frac{1}{k^2-m^2+i\epsilon}\,\frac{d^4k}{(2\pi)^4}\right)^2 \tag{10.212}$$
where the fields are at time zero. The calculation is greatly simplified by shifting
the first charge density (at r − ξ/2) to a slightly positive time ε, whereupon
the product of charge densities can be taken to be time-ordered. The Wick
expansion now yields a sum of two terms (of the form iSF : ψψ † :, with SF
the Feynman propagator for a Dirac particle), which can be reduced (using the
Fourier formula for SF ) to a three-dimensional momentum integral for G̃(ξ) by
explicitly performing the energy integral (after which the time shift can be set
to zero). Show that one obtains the result (2.61) quoted previously,
$$\tilde G(\vec\xi) = e^2\int\frac{d^3q}{(2\pi)^3}\,\frac{m}{E(q)}\,e^{i\vec q\cdot\vec\xi} \tag{10.214}$$
leading to a logarithmically divergent Coulomb self-energy.
6. (a) Consider the one-dimensional integral
$$Z(\lambda,m) \equiv \int_{-\infty}^{+\infty} e^{-\frac{1}{2}m^2x^2-\frac{\lambda}{4}x^4}\,dx \sim \sum_n c_n\lambda^n \tag{10.215}$$
Note that this result shows that Z(λ, m) is a real analytic function of λ,
with a cut of standard fractional power type along the negative real axis,
and behaving for large |λ| like |λ|−1/4 .
7. Let φ(x) be a free real scalar field with Feynman propagator ΔF (x). Show that
$\langle 0|T(e^{i\phi(x)}e^{-i\phi(0)})|0\rangle = e^{i(\Delta_F(x)-\Delta_F(0))}$ by
(a) Expanding out the operator exponentials and using Wick’s theorem, and
(b) by using path-integral methods (i.e., evaluate the path integral for Z0 (j)
with j(z) = δ(z) − δ(z − x)).
8. Calculate the four-point function of four free Dirac fermion fields
$$\langle 0|T(\psi(x)\psi(y)\bar\psi(z)\bar\psi(w))|0\rangle$$
by taking four functional derivatives with respect to Grassmann sources
η(x), η̄(x) of Z0 [η, η̄], the generating functional for the free fermion field theory.
Check that the relative signs for the terms you obtain agree with Wick’s theorem.
9. Verify the expansion for the sources Ji in terms of the φi in (10.150).
10. Verify the result (10.151) for the effective potential through terms of order φ4 .
11. Verify the recursion relation (10.187) giving the amplitudes for multiple pairs
of the quanta of field φ2 in the vacuum of field φ1 , where φ1 (resp. φ2 ) are free
scalar fields of mass m1 (resp. m2 ).
12. Starting with the representation equation (10.206), show
(a) That for R = 1, the only solutions are $c(1,\vec a) = e^{i\vec p\cdot\vec a}$, with $\vec p$ some fixed
three-vector.
(b) Using the fact that the one-dimensional (zero angular momentum) repre-
sentation of the rotation group is trivial, c(R, 0) = 1, and the result of part
(a), show that for any R, a, c(R, a) = 1.
13. Suppose that the two-point Wightman function of a Heisenberg field φH is
known to coincide with that for a canonically normalized free field of mass m
1. Most obviously, we have the classic processes in which a small coupling (in the
case of QED, or more generally, the standard model of electroweak interactions,
Dynamics IX: Interacting fields: non-perturbative aspects 375
this would be the fine structure constant α ∼1/137) allows extremely accurate
calculations of a given process simply by evaluating and summing the perturba-
tive contributions up to a finite loop order. In the case of QED this has been
done up to the four-loop order ($\alpha^4$) for quantities such as the electron anomalous
magnetic moment.
2. Next, we have situations in which a physical quantity, albeit in a weakly coupled
theory, necessarily involves an infinite number of interactions between the con-
stituent particles, and hence an infinite number of Feynman graphs. Bound states
such as the hydrogen atom in a weakly coupled theory such as QED clearly fall
into this category, as the permanent association of the proton and electron clearly
requires that they exchange photons over an infinite time span, in contrast to the
situation in unbound electron–proton scattering, where the scattering amplitudes
can be perturbatively evaluated, with exchange of many photons suppressed
by higher powers of α (uncompensated by kinematic enhancements due to the
bound-state threshold, as we shall see below). Of course, the calculation of all
Feynman diagrams contributing to a process is beyond our calculational powers,
and it will soon become clear that even if that were possible, the resultant series
is in fact a divergent asymptotic expansion, and cannot therefore be summed
directly to yield a meaningful answer! Instead, we shall see that for a certain
class of bound-state problems, the kinematic region important for the permanent
binding of the constituent particles identifies a dominant component of the
(infinite) set of perturbative amplitudes which can be convergently summed,
and which represent the leading contributions to the bound-state properties (in
an expansion in the available weak coupling). For lack of a better term, we may
refer to such situations as “perturbatively non-perturbative” processes in field
theory.
3. Finally, there are those physical processes in which the relevant coupling strength
is large, so that an asymptotic expansion, even if formally available to high order,
is simply useless in extracting quantitative (and in many cases, even qualitative)
features of the physics. Quark confinement and chiral symmetry-breaking in
QCD are archetypal examples of this type. We may (again, for lack of a better
term) refer to these cases as the “essentially non-perturbative” processes in field
theory. The Feynman graph approach is of little if any utility here: instead,
numerical approaches in which the Euclidean functional integral of the discretized
theory is evaluated directly by statistical Monte Carlo techniques (as in lattice
gauge theory) provide the most fruitful line of attack. Indeed, the use of such
methods has allowed us to obtain, starting from the QCD Lagrangian, and with
accuracy now approaching in many cases the level of a few percent, many detailed
predictions of hadron spectrum and structure.
In this chapter, after explaining the nature of the (unavoidable) divergence in per-
turbation theory, we shall give some examples of the various types of non-perturbative
phenomena encountered in the field theories of importance in the Standard Model
of particle physics. The physics of weakly coupled threshold bound states will be
explained, as it is crucial for understanding the classic successes of QED in the
hydrogen atom spectrum, for example. The limitations of perturbation theory in
strongly coupled theories, and the extent to which perturbation theory can even
in principle be regarded as determining the exact amplitudes of the theory, will be
explained, as well as the role played by Borel-summability (or its absence) of the
perturbative series. Conventional wisdom holds that perturbative information by itself
is virtually useless in non-Borel-summable theories, but we shall see that in at least one
iconic case (the anharmonic “double-well” oscillator) purely perturbative information
can be “massaged” to obtain a rigorously convergent sequence of approximants to the
amplitudes of a non-Borel theory. We conclude the chapter with some brief remarks
on numerical approaches to non-perturbative field theory.
we have already seen in Section 8.4 that the spectrum of the theory becomes
unbounded below if the sign of the coupling λ becomes negative. For this theory, the
functional integral representation of the Euclidean generating functional (or vacuum
to vacuum amplitude, setting the source function to zero)$^1$
$$Z(\lambda,m) = \int\mathcal{D}\phi\; e^{-\int d^dx\,\{\frac{1}{2}\nabla\phi\cdot\nabla\phi+\frac{1}{2}m^2\phi^2+\frac{\lambda}{4}\phi^4\}} \tag{11.2}$$
clearly diverges if we allow the real part of λ to become negative. Even for the d = 0-
dimensional case, where the integral (11.2) degenerates to a one-dimensional integral
(as the field is defined at a single point, where we may denote its value x, and there
is no gradient term)
$$Z(\lambda,m) \equiv \int_{-\infty}^{+\infty} e^{-\frac{1}{2}m^2x^2-\frac{\lambda}{4}x^4}\,dx \tag{11.3}$$
1 Here, the ∇ operator is the Euclidean d-gradient. As usual, we assume that the functional integral is
made well-defined—regularized—by an appropriate discretization, both in the infrared and the ultraviolet:
e.g., on a finite lattice.
On the (non-)convergence of perturbation theory 377
simple arguments show (cf. Problem 6 in Chapter 10) that the function Z(λ, m) is
analytic in the complex plane of λ with a cut on the negative real axis extending up
to the origin, so that a formal expansion around λ = 0 cannot converge. Instead, if
we expand the “interaction” x4 term inside the integral, we arrive at an asymptotic
expansion2
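$$Z(\lambda,m) \sim \sum_n c_n\lambda^n, \qquad c_n = (-1)^n\sqrt{2}\,\frac{1}{(m^2)^{2n+1/2}}\,\frac{\Gamma(2n+\frac{1}{2})}{\Gamma(n+1)} \tag{11.4}$$
2 The reader is reminded that the series $\sum_n c_n\lambda^n$ is said to be asymptotic to a function $f(\lambda)$ if
$|f(\lambda)-\sum_{n=0}^{N} c_n\lambda^n| = O(\lambda^{N+1})$ as $\lambda \to 0$ for any fixed N.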
[Fig. 11.1 Electron–positron pair assembly leading to energetic instability (for α < 0). Labels in the figure: N electrons, N positrons, r.]
3 The heuristic argument given here, while correct for scalar electrodynamics—i.e., for theories with
spinless charged “electrons” and “positrons”—ignores Fermi exchange effects which suppress configurations
with fermions in highly overlapping states. More careful arguments, first given by (Parisi, 1977), lead
to perturbative coefficients in QED rising like $\sqrt{n!}$. In the path-integral approach described below, the
dominant behavior arises from saddle-points in a combined effective action arising from the free photon
contribution and the determinant obtained by integrating out the electron field: see (Ioffe et al., 2010),
Section 5.8.
so that for large β (low temperature) the ground-state energy dominates. The
analog of the generating functional of connected graphs W becomes in this limit
W ≡ ln Z ∼ −βE0 (λ)—in other words, just the ground-state energy (up to a factor
of −β). The instability of the system as we analytically continue from positive real
λ to negative λ is manifested, as usual in quantum mechanics, in the appearance
of an imaginary part in the analytically continued energy eigenvalue E0 (λ): indeed,
just as in the zero-dimensional toy integral case discussed previously, the imaginary
part is simply the signal of a cut appearing along the negative real axis in the
complex plane of λ, with Z(λ = −|λ| + iε) = Z∗(λ = −|λ| − iε) as Z(λ) (dropping
the at present uninteresting β dependence) is real-analytic (real for positive real λ).
As for the one-dimensional integral, the value of Z on the top and bottom lips of
the cut can be computed by rotating the phase of the “field” variable q(t) in tandem
with the phase of the coupling so as to preserve a negative real part (and hence
convergence) in the exponent $-S_E$ appearing in (11.7). Thus, we let $\lambda \to e^{i\phi}|\lambda|$,
$q(t) \to q(t)\,e^{-i\phi(\frac14 - \delta/\pi)}$ (with δ small positive) and arrive after rotating $\phi \to \pm\pi$ with
where $q_\theta(t) = e^{i\theta} q(t)$, $\theta = \mp(\frac{\pi}{4} - \delta)$. The discontinuity of Z across the cut on the
negative axis is then given by subtracting the two integrals (11.9) for the two signs
of θ. Lipatov realised that the resultant difference of path integrals can be deformed
further to pass through saddle points (extrema of the Euclidean action $S_E(q)$) which
dominate the result for small negative Re(λ). The extremum of the action corresponds
to trajectories of our particle $q_{cl}(t)$ satisfying the equation
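$$\left.\frac{\delta S_E(q)}{\delta q(t)}\right|_{q=q_{cl}} = -\ddot q_{cl}(t) + q_{cl}(t) + \lambda q_{cl}(t)^3 = 0 \tag{11.10}$$
$$q_{cl}(t) = \pm\sqrt{\frac{-1}{\lambda}}, \qquad S_{cl} = \int_{-\beta/2}^{\beta/2}\left(\tfrac12 \dot q_{cl}(t)^2 + \tfrac12 q_{cl}(t)^2 + \tfrac{\lambda}{4} q_{cl}(t)^4\right)dt = -\beta/(4\lambda) \tag{11.11}$$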
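The constant saddle point in (11.11) can be checked by direct substitution. The following is a minimal sketch, assuming the Euclidean Lagrangian read off from the integrand of (11.11); the function name `constant_saddle` and the sample values of λ and β are illustrative choices, not taken from the text.

```python
import math

def constant_saddle(lam: float, beta: float):
    """Constant extremum q_cl = sqrt(-1/lam) (needs lam < 0) and its action."""
    if lam >= 0:
        raise ValueError("a real constant saddle requires lam < 0")
    q_cl = math.sqrt(-1.0 / lam)
    # Left-hand side of (11.10); the acceleration term vanishes for a constant path.
    residual = q_cl + lam * q_cl**3
    # The integrand of (11.11) is time independent, so the integral is beta times it.
    action = beta * (0.5 * q_cl**2 + 0.25 * lam * q_cl**4)
    return q_cl, residual, action

q_cl, residual, action = constant_saddle(lam=-0.5, beta=3.0)
print(q_cl, residual, action, -3.0 / (4.0 * -0.5))
```

The last two printed numbers compare the integrated action with the closed form −β/(4λ).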
[Figure: the potential V(q) plotted against q, with marked points A, B, C, O, and D.]
The imaginary part of the energy displayed in (11.16) is directly connected to the
tunneling rate for a zero-energy particle initially localized around the origin to
escape through the barrier formed by the potential $\frac12 q^2 - \frac{|\lambda|}{4} q^4$ appearing in our
original functional integral (11.7), as one may verify with a simple WKB calculation
(see Problem 1). The connection of the instability generated by such tunneling to
the large-order behavior of the coefficients appearing in the Rayleigh–Schrödinger
perturbation theory for the ground-state energy $E_0(\lambda) \sim \sum_n c_n \lambda^n$ is established by
use of analyticity, which allows us to connect the behavior of E0 (λ) for positive real
λ to the discontinuity across the cut on the negative real axis.
First, note that for large coupling λ, the energy $E_0(\lambda)$ has the asymptotic behavior
$\lambda^{1/3}$, by a simple scaling argument (see Problem 2). This means that the function
$f(\lambda) \equiv (E_0(\lambda) - \frac12)/\lambda$ is (a) analytic in the complex plane of λ, cut along the negative
real axis, and (b) behaves for large |λ| like $\lambda^{-2/3}$. Accordingly, f(λ) satisfies the Cauchy
formula
$$f(\lambda) = \frac{1}{2\pi i}\oint_C \frac{f(\lambda')}{\lambda' - \lambda}\,d\lambda' \tag{11.17}$$
where C is the contour indicated in Fig. 11.3. Replacing f (λ) by E0 (λ), and expanding
the contour C to infinite size, whereupon the curved parts go to zero, while the two
straight portions along the negative real axis combine to give the imaginary part of
E0 , giving
Expanding the denominator factor $(\lambda' - \lambda)^{-1}$ in powers of λ inside the integral yields
the desired asymptotic expansion and an explicit formula for the leading behavior at
large order n of the coefficients $c_n$ as an integral over the cut discontinuity of $E_0$:
$$c_n = \frac{1}{\pi}\int_{-\infty}^{0}\frac{\mathrm{Im}\,E_0(\lambda)}{\lambda^{n+1}}\,d\lambda \sim K(-1)^{n+1}\left(\tfrac34\right)^{n+\frac12}\Gamma\!\left(n+\tfrac12\right)(1 + O(1/n)) \tag{11.19}$$
where the corrections of order 1/n arise from the corrections of relative order λ to the
leading behavior of the discontinuity for small negative λ. The important features to
note here are first, that the coefficients rise factorially with order, as anticipated by
our intuitive arguments, and second, that they alternate in sign. The latter property
will move to center stage in Section 11.3, when we discuss the Borel summability (or
absence thereof) of perturbative expansions in field theory.
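The practical meaning of the factorial growth in (11.19) can be illustrated numerically. The sketch below keeps only the leading large-order form of $c_n$, with the unknown constant K set to 1 purely as an assumption (it does not affect where the smallest term sits), and locates the order at which the terms $|c_n|\lambda^n$ stop decreasing. The function names are hypothetical.

```python
import math

def log_abs_leading_coeff(n: int, K: float = 1.0) -> float:
    """log|c_n| for the leading form K (3/4)^(n+1/2) Gamma(n+1/2) of (11.19)."""
    return math.log(K) + (n + 0.5) * math.log(0.75) + math.lgamma(n + 0.5)

def sign_leading_coeff(n: int) -> int:
    """Sign factor (-1)^(n+1) of the leading form of c_n."""
    return 1 if (n + 1) % 2 == 0 else -1

def smallest_term_order(lam: float, n_max: int = 500) -> int:
    """Order n in [1, n_max) at which |c_n| lam^n is smallest."""
    log_terms = [log_abs_leading_coeff(n) + n * math.log(lam)
                 for n in range(1, n_max)]
    return 1 + log_terms.index(min(log_terms))

for lam in (0.2, 0.1, 0.05):
    print(lam, smallest_term_order(lam))
```

Since the ratio of successive terms is $\frac34(n+\frac12)\lambda$ in magnitude, the terms shrink only up to an order of roughly $4/(3\lambda)$, after which the series begins to diverge.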
For spacetime dimensions d = 2 or 3, we move into the realm of field theory proper,
but the large-order analysis proceeds along much the same lines as for the anharmonic
oscillator: the dominant contribution to the discontinuity in the partition function
(11.2) when analytically continued to negative (small) λ is given by the saddle-point
contribution to the functional integral arising from finite-action solutions (as before,
called “instantons”) of the classical Euclidean field equation describing the extrema
of the action:
where Δ is the Laplacian in d dimensions. The extremal value of the action Scl at a
solution of (11.20) can be simplified, using (11.20) to eliminate the derivative terms:
$$S_{cl} = \int d^dx\,\Big\{-\tfrac12\phi_{cl}(x)\Delta\phi_{cl}(x) + \tfrac12 m^2\phi_{cl}^2(x) + \tfrac14\lambda\phi_{cl}^4(x)\Big\} = -\frac{\lambda}{4}\int d^dx\,\phi_{cl}^4(x) > 0 \tag{11.21}$$
The leading contribution is proportional to $e^{-S_{cl}}$, so we are really looking for the finite-action
solution with the minimum value for $S_{cl}$. From (11.21) it is clear that finite
action requires that $\phi_{cl}(x) \to 0$ as $|x| \to \infty$. It can further be shown (see (Zinn–Justin,
1989), op. cit.) that the minimal action solutions correspond to spherical symmetry,
$\phi_{cl}(x - x_0) = \frac{m}{\sqrt{-\lambda}}\,u(r)$, $r \equiv m|x - x_0|$, where the instanton solution is centered at an
arbitrary (Euclidean) spacetime point $x_0$ (exactly analogous to the time $t_0$ appearing
in the instanton solution (11.12) for d = 1). The rescaled dimensionless function u(r)
satisfies the ordinary non-linear differential equation
$$\frac{d^2u(r)}{dr^2} = -\frac{d-1}{r}\frac{du(r)}{dr} - \frac{d}{du}\left(-\tfrac12 u^2(r) + \tfrac14 u^4(r)\right) \tag{11.22}$$
with the boundary condition $u(r) \to 0$ as $r \to \infty$. In the previously discussed case of the
anharmonic oscillator (d = 1) the radial coordinate r corresponded to the time, and
the equation (11.22) had a ready-made mechanical interpretation in terms of Newton’s
Law for a unit mass particle in a potential $V(u) = -\frac12 u^2 + \frac14 u^4$. This remains the case
[Figure: instanton profiles u(r) plotted against r for d = 1 (anharmonic oscillator), d = 2, and d = 3.]
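For d = 1 the profile equation (11.22) reduces to $u'' = u - u^3$, which has the closed-form decaying solution $u(r) = \sqrt2/\cosh r$. This is a standard result quoted here as an assumption, since the explicit instanton formula is not reproduced in this passage. The sketch below checks the differential equation by central differences and evaluates $\int u^4\,dr$ numerically; all function names are illustrative.

```python
import math

def u_instanton_1d(r: float) -> float:
    """Candidate d = 1 profile u(r) = sqrt(2) / cosh(r)."""
    return math.sqrt(2.0) / math.cosh(r)

def ode_residual(r: float, h: float = 1e-3) -> float:
    """u'' - (u - u^3), the d = 1 form of (11.22), via central differences."""
    u = u_instanton_1d
    second = (u(r + h) - 2.0 * u(r) + u(r - h)) / h**2
    return second - (u(r) - u(r) ** 3)

def integral_u4(r_max: float = 20.0, n: int = 40000) -> float:
    """Trapezoid estimate of the integral of u^4 over the whole line."""
    h = 2.0 * r_max / n
    total = 0.5 * (u_instanton_1d(-r_max) ** 4 + u_instanton_1d(r_max) ** 4)
    total += sum(u_instanton_1d(-r_max + i * h) ** 4 for i in range(1, n))
    return h * total

print([ode_residual(r) for r in (0.3, 1.0, 2.5)])
print(integral_u4(), 16.0 / 3.0)
```

With m = 1 and $\phi_{cl} = u/\sqrt{-\lambda}$, inserting $\int u^4\,dr = 16/3$ into (11.21) gives $S_{cl} = 4/(3|\lambda|)$, which is the exponent one would expect to underlie the factor $(3/4)^n$ in (11.19).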
Just as in the perturbative expansion (11.19) for the anharmonic ground-state energy,
the presence of the essential singularity induced by $e^{A_d/\lambda}$ will lead to factorially growing
contributions (with oscillating sign) at large order, so the perturbation theory (even
when ultraviolet cutoffs are in place) is divergent.
With the technology we have described in this section in place, there is a great
temptation to draw the conclusion that the dominant behavior at large orders of per-
turbation theory somehow ought to determine the dominant physical behavior of the
corresponding amplitudes. This temptation must be strenuously resisted, for (at least)
two reasons. First, one must bear in mind that even if a unique resummation procedure
were available to convert the information contained in the perturbative coefficients (to
all orders) into well-defined convergent approximants to the exact field-theoretic amplitudes,
there is absolutely no guarantee that the dominant portions of the large-order
perturbative coefficients actually translate into the dominant parts of the amplitudes
in the physical regime of interest. In fact, we shall see in the next section that bound
states in field theory provide an immediate counterexample to any such claim.
Secondly, the analysis performed above was entirely carried out in the context of
the Euclidean functional integral: actual physical amplitudes need to be obtained from
those calculated in Euclidean space by an analytic continuation to Minkowski space.
Unfortunately, it is well known that asymptotic estimates of analytic functions cannot
in general be analytically continued: in other words, it is not true that the analytic
continuation of an asymptotic series yields in general a correct asymptotic expansion
for the analytically continued function. Of course, this would be possible were the
expansions in question convergent Taylor series, but we have just seen that this is
essentially never the case in a non-trivial interacting field theory. These obstacles
have unfortunately severely limited the extraction of quantitatively useful physical
information from a large and elegant body of work on instanton solutions, especially in
quantum chromodynamics, where the tunneling processes described by instantons are
almost certainly connected in a deep way to the chiral symmetries (and the breaking
thereof) of the theory. A fortunate exception is in the theory of critical phenomena,
in which the Borel resummation (cf. Section 11.3) of the large-order behavior of φ4
theory in d = 3 dimensions has been used to extract highly accurate results (often
We then say that the state $|P\rangle$ is a bound state of the particles interpolated for by A
and B. In the terminology of Section 9.2, the bilocal operator
$$C_x(X) = T\big(A(X + \tfrac{x}{2})\,B(X - \tfrac{x}{2})\big) \tag{11.27}$$
(suitably smeared over x) acts as an almost local field (centered at the point X) which
interpolates for the bound-state particle. The matrix element
“Perturbatively non-perturbative” processes: threshhold bound states 387
$$\Phi_P(x) \equiv (2\pi)^{3/2}\sqrt{2E(P)}\,\langle 0|T\big(A(\tfrac{x}{2})B(-\tfrac{x}{2})\big)|P\rangle, \qquad E(P) \equiv \sqrt{\vec P^{\,2} + M_B^2} \tag{11.28}$$
is called the “Bethe–Salpeter wavefunction” of the bound state. The energy square-
root factor for the bound state is included for convenience as our bound-state ket is
non-covariantly normalized. As we shall see, it plays a role analogous to that of the
Schrödinger wavefunction in non-relativistic quantum theory: in particular, the bound-state
mass is determined by an eigenvalue equation involving this function.5 The
Källén–Lehmann representation (cf. Section 9.5) tells us that the existence of a single
particle asymptotic state of mass $M_B$ implies a pole of the form $1/(P^2 - M_B^2)$ in the
Feynman two-point function of any Heisenberg field that interpolates for the bound-
state particle (see (9.192)). The pole arises in the usual way from the contribution of
single-particle intermediate states to the two-point function
where we have temporarily suppressed the relative coordinates x, y which are held
finite and fixed while the pole in the Fourier transform of G
$$\tilde G(P) = \int \langle 0|T\big(C_x(X)\,C_y^\dagger(0)\big)|0\rangle\, e^{iP\cdot X}\,d^dX \tag{11.30}$$
which then gives rise (see (9.191, 9.192)) to the expected single-particle pole in the
bound-state propagator
$$-i\tilde G(P) = -i\int e^{iP\cdot X}\langle 0|T\{A(X + x/2)B(X - x/2)A^\dagger(y/2)B^\dagger(-y/2)\}|0\rangle\,d^dX \;\to\; \frac{\Phi_P(x)\Phi_P^*(y)}{P^2 - M_B^2 + i\epsilon}, \qquad P^2 \to M_B^2 \tag{11.32}$$
5 However, one must be careful, in the relativistic field theory case, not to attach the usual probabilistic
interpretation to this function: recall the difficulties entailed in attempts to define a position operator in
field theory, discussed in detail in Section 6.5.
6 The mathematically fastidious reader may imagine smooth smearing functions of rapid decrease in x
and y attached to the equations that follow, so that we are really dealing with almost local operators Cf (X)
in the sense of Section 9.2.
Fig. 11.5 Bound-state pole contribution to the scattering amplitude in a binding channel.
$$= \int e^{iP\cdot X + iq\cdot x - ip\cdot y}\,\langle 0|T\{A(X + x/2)B(X - x/2)A^\dagger(y/2)B^\dagger(-y/2)\}|0\rangle\,d^dX\,d^dx\,d^dy \tag{11.33}$$
where we have omitted a spacetime integral over the fourth field (which deletes
the uninteresting $(2\pi)^4\delta^4(\sum P)$ energy-momentum conservation factor from the
amplitude) in the second line, and made the change of variables $x = x_1 - x_2$,
$X = \frac{x_1 + x_2 - y_1}{2}$, $y = y_1$ in the last line. Comparing (11.33) with (11.32), we see that the 2-2
scattering amplitude of our A and B particles will display a simple pole as the total
incoming momentum P is taken onto the mass shell for the bound state $P^2 \to M_B^2$:
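$$\tilde G^{(4)}(\tfrac{P}{2} + q, \tfrac{P}{2} - q, \tfrac{P}{2} + p, \tfrac{P}{2} - p) \to i\,\frac{\Phi_P(q)\Phi_P^*(p)}{P^2 - M_B^2 + i\epsilon}, \qquad P^2 \to M_B^2 \tag{11.34}$$
$$\tilde G_P(q,p) \equiv -\frac{1}{(2\pi)^d}\,\tilde G^{(4)}(\tfrac{P}{2} + q, \tfrac{P}{2} - q, \tfrac{P}{2} + p, \tfrac{P}{2} - p) \tag{11.35}$$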
The iteration is depicted graphically in Fig. 11.6, where the crossbars on the outgoing
(top) propagators indicate that they have been amputated, with the result that the
disconnected part simply becomes a δ-function equating the initial and final relative
momenta p and q. Note that two-particle irreducibility in this context is defined as
the property that the graph remains connected when a single A line and a single B
line is cut: in other words, we imagine cutting the graphs in Fig. 11.6 horizontally (in
the so-called “s-channel” for the scattering amplitude). Particle lines bearing an arrow
in Fig. 11.6 should be regarded as full propagators (including self-energy corrections,
etc.) for the A and B fields (see below).
[Figure: iteration of the 2PI kernels K_P, with outgoing momenta P/2 ± q and incoming momenta P/2 ± p.]
The series of 2PI kernels in Fig. 11.6 can be re-expressed as an integral equation
for $\tilde G_P$ with kernel $K_P$ and inhomogeneous part $I(q,p) = \delta^d(q - p)$
$$\hat\Delta_F^{-1}(\tfrac{P}{2} + q)\,\hat\Delta_F^{-1}(\tfrac{P}{2} - q)\,\tilde G_P(q,p) = I(q,p) + \int K_P(q,k)\,\tilde G_P(k,p)\,\frac{d^dk}{(2\pi)^d} \tag{11.36}$$
as depicted in Fig. 11.7. Here, $\hat\Delta_F(\frac{P}{2} + p)$ (resp. $\hat\Delta_F(\frac{P}{2} - p)$) represent full propagators
for the A (resp. B) fields (i.e., the Feynman two-point functions for the fully interacting
Heisenberg fields; cf. Section 9.5). To avoid overburdening the notation, we use the
same symbol for both propagators, even though the fields A and B may be distinct:
which propagator is meant will be clear from the context. The iteration of (11.36)
clearly generates the succession of 2PI segments indicated in Fig. 11.6.
Note that the kernel $K_P(q,k)$ is defined to be amputated with respect to both
incoming and outgoing legs (with momenta $\frac{P}{2} \pm k$ and $\frac{P}{2} \pm q$), in order to avoid
doubling the internal propagators. The relevance of this representation of the 2-2
amplitude is that the kernel $K_P(q,k)$ does not contain a single particle bound state
pole, inasmuch as two-particle intermediate states are absent by definition in the
graphical expansion of $K_P$. Thus, if we take the on-mass-shell limit $P^2 \to M_B^2$ for
the bound state in (11.36), only the $\tilde G_P$ factors contain the pole term arising from
(11.34), so that, identifying the residues of this pole on both sides of (11.36), we find
the famous Bethe–Salpeter equation (Salpeter and Bethe, 1951):
$$\hat\Delta_F^{-1}(\tfrac{P}{2} + q)\,\Phi_P(q)\,\hat\Delta_F^{-1}(\tfrac{P}{2} - q) = \int K_P(q,k)\,\Phi_P(k)\,\frac{d^dk}{(2\pi)^d} \tag{11.37}$$
This result has the obvious graphical depiction indicated in Fig. 11.8.

Fig. 11.8 Bethe–Salpeter equation determining the bound state wavefunction $\Phi_P(q)$.

An alternative version, in which the Bethe–Salpeter wavefunction is itself amputated: i.e.,
$$\Psi_P(q) \equiv \hat\Delta_F^{-1}(\tfrac{P}{2} + q)\,\Phi_P(q)\,\hat\Delta_F^{-1}(\tfrac{P}{2} - q) \tag{11.38}$$
leads to a slightly different Bethe–Salpeter equation for ΨP (q):
$$\Psi_P(q) = \int K_P(q,k)\,\hat\Delta_F(\tfrac{P}{2} + k)\,\Psi_P(k)\,\hat\Delta_F(\tfrac{P}{2} - k)\,\frac{d^dk}{(2\pi)^d} \tag{11.39}$$
The simplest example of a bound state arising in a weakly coupled system, and
amenable to perturbative analysis, is found in our old friend, the scalar $\phi^4$ theory,
in two or three spacetime dimensions. Thus, we imagine a self-interacting non-self-conjugate
massive scalar field φ with interaction Hamiltonian density
$$\mathcal{H}_{int}(z) = \frac{\lambda}{4} : (\phi^\dagger(z)\phi(z))^2 : \tag{11.40}$$
and free momentum-space propagator $\Delta_F(p) = \frac{1}{p^2 - m^2 + i\epsilon}$. The normal ordering means
that graphs in which a scalar line leaves and returns to the same vertex are excluded
(cf. Section 10.1), so any loop integral contains at least two scalar propagators and is
ultraviolet convergent in d = 2 or 3 (as $\int d^dk/k^4 < \infty$ as far as the large momentum
contribution is concerned, if d < 4). The theory is therefore ultraviolet finite ab
initio, although, of course, there will be finite corrections which convert the bare
mass m appearing in the free propagator to the physical mass $m_{ph}$ of our scalar
particle. These renormalization effects are not particularly relevant to the physics
of the bound-state formation which is our primary interest here, and will not be
emphasized in the following, although in a real calculation of bound-state properties
one would need to re-express the final results in terms of the measurable physical
mass. In any case, for the theory in question, we now take $A(x) = \phi(x)$, $B(x) = \phi^\dagger(x)$,
so we are studying the possibility of particle–antiparticle binding in the $\phi$–$\phi^c$
channel.
The lowest-order contributions, through order $\lambda^2$, to the kernel $K_P$ are indicated
graphically in Fig. 11.9; analytically, one finds that the following terms correctly
generate the 2-2 scattering amplitude $\tilde G_P$ through $O(\lambda^2)$ when (11.36) is
iterated:
$$K_P(q,p) = i\big(\lambda - \lambda^2 F((p-q)^2) - \tfrac12\lambda^2 F((p+q)^2)\big) + O(\lambda^3), \tag{11.41}$$
Fig. 11.9 Low-order contributions to the 2PI kernel $K_P(q,p)$ (arrows indicate charge flow: up
for particles, down for antiparticles).
$$F(k^2) = \frac{1}{4\pi}\,\frac{1}{\sqrt{k^2}}\,\ln\left(\frac{\sqrt{k^2} + 2m}{\sqrt{4m^2 - k^2}}\right), \qquad d = 3 \tag{11.44}$$
The value of $F(k^2)$ for general $k^2$ can be obtained by analytic continuation of these
formulas: in particular, one finds that $F(k^2)$ is a real analytic function of $k^2$ with a cut
on the positive real axis for $4m^2 \le k^2$ of square-root (resp. logarithmic) type in d = 2
(resp. 3), and no other singularities (the apparent square-root branch point at $k^2 = 0$
is spurious: only even powers of $\sqrt{k^2}$ appear in the Taylor expansion around k = 0).
We now search for the appropriate conditions for a bound-state pole to develop
in the 2-2 scattering amplitude, where for convenience we work in the rest frame
of the bound state and set $P^0 = \sqrt{P^2} = 2m - \kappa^2/m$ at the bound-state pole. Thus,
the binding energy $E_{bind} = \kappa^2/m$ of the bound state is parameterized in terms of the
variable κ, which is a measure of the distance from the bound-state pole to the
two-(free-)particle threshold at $P^0 = 2m$. The iteration of the leading-order kernel (just
the constant iλ) clearly generates a tower of bubble graphs indicated in Fig. 11.10(a).
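These graphs form a geometric series; excluding the trivial disconnected contribution
to $\tilde G_P(q,p)$, one has the sum (n is the number of loops)
$$\tilde G_P^{(0)}(q,p) = \frac{i\lambda}{(2\pi)^d}\sum_{n=0}^{\infty}\big(-\lambda F(P^2)\big)^n\,\Delta_F(\tfrac{P}{2} + q)\Delta_F(\tfrac{P}{2} - q)\Delta_F(\tfrac{P}{2} + p)\Delta_F(\tfrac{P}{2} - p)$$
$$= \frac{i\lambda}{(2\pi)^d}\,\frac{1}{1 + \lambda F(P^2)}\,\Delta_F(\tfrac{P}{2} + q)\Delta_F(\tfrac{P}{2} - q)\Delta_F(\tfrac{P}{2} + p)\Delta_F(\tfrac{P}{2} - p) \tag{11.45}$$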
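The resummation in (11.45) is just the geometric series in the single variable $x = \lambda F(P^2)$. As a small numerical sketch (the function names are hypothetical), one can compare the truncated tower of n-loop bubbles with the closed form $1/(1+x)$ and watch the closed form blow up as $x \to -1$, i.e. as $|\lambda| F(P^2) \to 1$ for negative λ:

```python
def bubble_tower_partial_sum(x: float, n_loops: int) -> float:
    """Sum over n = 0..n_loops of (-x)^n, with x standing for lambda * F(P^2)."""
    return sum((-x) ** n for n in range(n_loops + 1))

def bubble_tower_closed_form(x: float) -> float:
    """Resummed geometric series 1 / (1 + x), valid as a series for |x| < 1."""
    return 1.0 / (1.0 + x)

for x in (-0.5, -0.9, -0.99):
    print(x, bubble_tower_partial_sum(x, 2000), bubble_tower_closed_form(x))
```

No finite truncation of the tower shows the pole; it appears only in the summed expression.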
The superscript (0) here indicates that this is the leading contribution to a reorganized
set of perturbative contributions to $\tilde G_P$: we shall soon see that this tower of graphs
determines the leading-order properties of the bound state for weak coupling. For a
true bound state to be present at weak coupling (small λ) the value of the bubble
integral $F(P^2)$ must increase correspondingly at the bound-state pole to allow contributions
of arbitrary order to remain comparable, thereby keeping the constituents
bound for an infinite time. If the bound state is to be present at arbitrarily weak
coupling, this means that F must become singular: this can only happen if the bound-state
momentum $P = (2m - \kappa^2/m, \vec 0)$ approaches the two-particle threshold, i.e., as
$\kappa/m \to 0$, when the bubble integral has the asymptotic behavior
$$F(P^2) \sim \frac{1}{8m\kappa} + O(1), \qquad \kappa/m \to 0, \quad d = 2, \tag{11.46}$$
$$F(P^2) \sim \frac{1}{8m\pi}\ln\left(\frac{2m}{\kappa}\right) + O\!\left(\kappa^2\ln\left(\frac{2m}{\kappa}\right)\right), \qquad \kappa/m \to 0, \quad d = 3 \tag{11.47}$$
Fig. 11.10 (a) 2-2 amplitude from iteration of leading-order kernel; (b) subleading tower from
a single insertion of a higher-order kernel.
In other words, the bound state must become non-relativistic, with binding energy
much smaller than the rest energy of the system. Since F is positive, we see also
that the existence of a pole in (11.45) requires that the coupling λ be negative,
corresponding to an attractive local point interaction between the constituent scalars.
Of course, this would lead if taken literally to a theory with a spectrum unbounded
below (cf. Section 8.4), but in d = 2 or 3 we are free to add a $\phi^6$ interaction with
an arbitrarily weak coupling to restore spectral sanity, without sensibly altering the
bound-state properties (or the renormalizability of the theory), so we shall ignore this
difficulty and proceed henceforth with negative λ.
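Referring to (11.45) we see that the summed bubble graphs of Fig. 11.10(a) have
a pole at $|\lambda| F(P^2) = 1$, which for small λ, using the asymptotic forms (11.46, 11.47),
occurs when
$$\kappa \sim \frac{|\lambda|}{8m}, \qquad E_{bind} \sim \frac{\lambda^2}{64m^3}, \qquad d = 2 \tag{11.48}$$
$$\kappa \sim 2m\exp\left(-\frac{8\pi m}{|\lambda|}\right), \qquad E_{bind} \sim 4m\exp\left(-\frac{16\pi m}{|\lambda|}\right), \qquad d = 3 \tag{11.49}$$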
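The d = 3 estimate in (11.49) can be compared with a direct numerical solution of the pole condition $|\lambda| F(P^2) = 1$, using the explicit bubble function (11.44) evaluated at $P^0 = 2m - \kappa^2/m$. The sketch below works in units with m = 1 by default and assumes $|\lambda|$ is small enough that the root lies in $0 < \kappa < m$; the bisection routine and all names are illustrative choices.

```python
import math

def bubble_d3(kappa: float, m: float = 1.0) -> float:
    """F(P^2) of (11.44) at P^0 = sqrt(P^2) = 2m - kappa^2/m, below threshold."""
    p0 = 2.0 * m - kappa**2 / m
    # 4 m^2 - p0^2 rewritten as (2m - p0)(2m + p0) to avoid cancellation at small kappa.
    gap = (kappa**2 / m) * (2.0 * m + p0)
    return math.log((p0 + 2.0 * m) / math.sqrt(gap)) / (4.0 * math.pi * p0)

def kappa_at_pole_d3(lam_abs: float, m: float = 1.0) -> float:
    """Solve |lambda| F(P^2) = 1 for kappa by bisection on (0, m)."""
    lo, hi = 0.0, m
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lam_abs * bubble_d3(mid, m) > 1.0:
            lo = mid  # bubble still too large: the pole sits at larger kappa
        else:
            hi = mid
    return 0.5 * (lo + hi)

def kappa_asymptotic_d3(lam_abs: float, m: float = 1.0) -> float:
    """Leading weak-coupling estimate of kappa from (11.49)."""
    return 2.0 * m * math.exp(-8.0 * math.pi * m / lam_abs)

for lam_abs in (8.0, 5.0, 3.0):
    print(lam_abs, kappa_at_pole_d3(lam_abs), kappa_asymptotic_d3(lam_abs))
```

The two columns agree increasingly well as $|\lambda|/m$ decreases, as expected from the size of the correction term in (11.47).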
In one space and one time dimension (d = 2), the local $\frac{\lambda}{4}(\phi^\dagger\phi)^2$ interaction generates a
δ-function potential V(x) = gδ(x) in the non-relativistic limit, with the dimensionless
(in natural units) coupling given by $g = \frac{\lambda}{4m^2}$ (note that the coupling λ has dimension
mass squared in d = 2). The reader may easily verify that such a potential in one
spatial dimension does indeed lead (for λ < 0) in a system of reduced mass m/2
to a single bound state with the stated binding energy. The threshold singularity
in two-space, one-time dimensions (d = 3) is much weaker—only logarithmic rather
than linear—with the result that the binding energy vanishes exponentially for small
coupling, with an essential singularity in the dependence of binding energy on coupling.
This is actually the situation that arises for Cooper pairs in the BCS theory of super-
conductivity, where an arbitrarily weak phonon-induced attractive coupling results
in an exponentially small binding (and energy gap) in three spatial dimensions, but
with the system effectively reduced to the two-dimensional Fermi surface of available
electron states (see (Ziman, 1964), p. 330).
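The δ-function check mentioned above for d = 2 can be made explicit. For $V(x) = g\delta(x)$ with g < 0 and reduced mass μ (ħ = 1), the bound-state wavefunction is $e^{-\kappa|x|}$, and integrating the Schrödinger equation across x = 0 fixes $\kappa = -\mu g$ and a binding energy $\kappa^2/(2\mu)$. The sketch below, with hypothetical function names and arbitrary sample values, inserts μ = m/2 and $g = \lambda/(4m^2)$ and compares with (11.48):

```python
def delta_well_bound_state(g: float, mu: float):
    """Bound state of V(x) = g * delta(x), g < 0, reduced mass mu (hbar = 1)."""
    if g >= 0:
        raise ValueError("a bound state requires an attractive coupling g < 0")
    # Derivative jump at x = 0: psi'(0+) - psi'(0-) = 2 * mu * g * psi(0).
    kappa = -mu * g
    e_bind = kappa**2 / (2.0 * mu)
    return kappa, e_bind

def scalar_pair_binding_d2(lam: float, m: float):
    """Insert g = lam / (4 m^2) and reduced mass m / 2, as stated in the text."""
    return delta_well_bound_state(lam / (4.0 * m**2), mu=0.5 * m)

lam, m = -0.4, 1.3
kappa, e_bind = scalar_pair_binding_d2(lam, m)
print(kappa, abs(lam) / (8.0 * m))        # compare with kappa in (11.48)
print(e_bind, lam**2 / (64.0 * m**3))     # compare with E_bind in (11.48)
```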
The asymptotic behavior indicated in (11.46, 11.47) is readily understood with a
simple power-counting argument. The region of the loop integral responsible for the
dominant contribution at weak coupling to the one-loop bubble integral $F(P^2)$ (with
F defined in (11.42)) corresponds to the non-relativistic scaling $|\vec l| \sim \kappa$, $l_0 \sim \kappa^2$ (the
latter following if we perform the $l_0$ integration picking up the pole at
$l_0 = \frac{\kappa^2}{2m} - m + \sqrt{\vec l^{\,2} + m^2} - i\epsilon$, with $|\vec l| \sim \kappa \ll m$). Thus in d = 2 dimensions each of the denominators
in the loop integral is of order $\kappa^2$, while the $d^2l$ phase-space is of order $\kappa^3$, leading
to the overall 1/κ threshold singularity for small κ. In d = 3 dimensions, the power-counting
leads to $\kappa^0$, corresponding to a logarithmic dependence on κ when the spatial
integral over l is performed. Each insertion of a higher-order piece of the 2PI kernel
KP in the sequence of bubble graphs, such as the diagram indicated in Fig. 11.10(b),
reduces the strength of the threshold divergence at any given order in powers of λ.
For example, in d = 2, the order $\lambda^7$ graph indicated in Fig. 11.10(b) produces only
a $1/\kappa^5$ threshold singularity, one less power of 1/κ than the corresponding $\lambda^7$ graph
in the leading tower of bubble graphs shown in Fig. 11.10(a). The reader may easily
verify that adding terms of divergent structure $\lambda^{n+1}/\kappa^{n-1}$ to the geometric series in
(11.45) (in contrast to the leading series, with terms of order $\lambda^{n+1}/\kappa^n$), corresponds
to an order $\lambda^2$ contribution to the value of κ at the pole, and hence a higher order
(by one power of λ) contribution to the binding energy of the bound state (i.e., in
the d = 2 case, a contribution of order $\lambda^3$ to $E_{bind}$ in (11.48)). In fact, the inclusion
of successively more complicated kernel contributions is necessary to compute the
bound-state properties to successively higher accuracy, with the result that Ebind (for
example) becomes a divergent asymptotic expansion in λ.
The factorial divergence of perturbation theory in φ4 theory discussed in the
preceding section has not, of course, been eliminated: we must expect the total number
of graphs contributing to the 2-2 amplitude G̃P to grow factorially with the power n of
λ. But the remarkable simplification allowed by the existence of threshold singularities
which ensure the persistence of the binding at arbitrarily weak coupling leads to the
(highly fortunate) result that to first approximation the properties of a non-relativistic
threshold bound state are determined by a tiny subset of Feynman graphs (the bubble
diagrams of Fig. 11.10(a)) which form (in this scalar binding case) the simplest of all
summable series: a geometric expansion! The factorial growth with order of the full
set of graphs contributing to the scattering amplitude translates, when the graphs
are reordered into towers on the basis of the strength of the threshold singularities
they exhibit, into an infinite asymptotic expansion for the ground-state properties in
powers of the weak coupling λ.
Threshold singularities of power strength, and therefore qualitatively similar to the
d = 2 self-coupled scalar situation, reappear in massless gauge theories (both abelian
and non-abelian varieties) in 3+1 spacetime dimensions. Some low-order contributions
to the kernel in the case of an “onium” bound state in quantum electrodynamics (e.g.,
positronium, the bound state of an electron and positron) are shown in Fig. 11.11. In
this case, our field A(x) is the electron Dirac field ψ(x) and B(x) is ψ̄(x). As we have
seen, threshold bound states are intrinsically non-relativistic in the weak coupling
region where perturbative resummation is useful, and in gauge theories this singles
out a particular gauge—Coulomb (or “radiation”) gauge—as particularly useful in
isolating the graphs with the strongest threshold singularities (Duncan, 1976). In this
case the leading properties of onium bound states are determined by iteration of the
2PI kernel corresponding to exchange of a single Coulomb photon (Fig. 11.11(a)), with
the two additional powers of κ per loop arising from the two extra space dimensions
cancelled by the $1/\kappa^2$ behavior of the momentum-space Coulomb propagator $1/\vec l^{\,2}$ (for
$|\vec l| \sim \kappa$, where, of course, we need the exchanged photon to be massless). The iteration
of this kernel leads to a series of ladder graphs (see Fig. 11.12), with the property that
each additional loop brings an extra factor of electron charge squared (and therefore α,
the fine-structure constant), as well as an additional 1/κ power threshold singularity.
Just as in $\phi^4$ theory in d = 2, we consequently expect a pole to develop for κ ∼ αm,
giving a binding energy of order $\alpha^2 m$.
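The power counting used for the scalar bubbles and for the Coulomb ladder can be summarized in a few lines of bookkeeping. The sketch below simply adds exponents of κ under the non-relativistic scaling $l_0 \sim \kappa^2$, $|\vec l| \sim \kappa$; the function names and the `exchange_power` parameter (the κ-scaling of the kernel exchanged in each loop) are illustrative, not notation from the text.

```python
def kappa_power_per_loop(d: int, exchange_power: int = 0) -> int:
    """Net power of kappa contributed by one loop near threshold.

    The measure d^d l scales as kappa^2 (energy) times kappa^(d-1) (space);
    each of the two constituent propagators scales as kappa^-2;
    exchange_power is the kappa-scaling of the kernel in that loop
    (0 for a contact interaction, -2 for a Coulomb exchange 1/l^2).
    """
    measure = 2 + (d - 1)
    propagators = -2 * 2
    return measure + propagators + exchange_power

def leading_tower_kappa_power(d: int, n_loops: int, exchange_power: int = 0) -> int:
    """Total power of kappa of the n-loop graph in the leading tower."""
    return n_loops * kappa_power_per_loop(d, exchange_power)

print(kappa_power_per_loop(2))                      # contact interaction, d = 2
print(kappa_power_per_loop(3))                      # contact interaction, d = 3
print(kappa_power_per_loop(4, exchange_power=-2))   # Coulomb exchange, d = 4
```

A result of −1 per loop signals a power-law threshold singularity, so that each extra loop costs a factor of coupling over κ, while a result of 0 corresponds to the logarithmic case.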
We shall discuss the Feynman rules for gauge theories in Chapter 15: for present
purposes, we need only know that in Coulomb gauge the $A_0$ propagator is instantaneous,
corresponding to a factor $i/\vec l^{\,2}$ where $\vec l$ is the spatial momentum carried by
the Coulomb line, and attaches with a factor $-ie\gamma_0$ at the charged fermion (i.e.,
electron or positron) line. The result is that the leading set of threshold singularities
is generated by iteration of the (fully amputated) kernel indicated in Fig. 11.11(a):
namely, $K_P(q,k) \sim \frac{ie^2}{|\vec q - \vec k|^2}(\gamma_0)(\gamma_0)$. Note that the kernel has four Dirac subscripts (not
Fig. 11.11 (a) Coulomb exchange kernel; (b) transverse photon exchange kernel; (c, d) kernels
contributing to Lamb shift.
Fig. 11.12 Tower of ladder graphs generating the leading threshold singularities (of order
$\alpha^{n+1}/\kappa^n$) in a massless gauge theory. Arrows denote charge flow; the momentum flow is upwards
on both fermion lines.
shown): the corresponding Dirac index dependence is given by the direct product of
two $\gamma_0$ matrices. The full fermion propagator (in accordance with our notation from
Chapter 7) is now written $\hat S_F(\frac{P}{2} + k)$ for the electron line and $\hat S_F(-\frac{P}{2} + k)$ for the
corresponding positron line (which is an electron propagator pointing downward, hence
with momentum reversed): in ladder approximation these full propagators become just
the free ones. Moreover, the leading threshold singularities are generated in the non-relativistic
kinematic domain where $k_0 \sim O(\kappa^2)$, $|\vec k| \sim O(\kappa)$, and we are as usual in the
frame where $P = (2m - \kappa^2/m, \vec 0)$, so that we can replace
$$\hat S_F(\tfrac{P}{2} + k) \to \frac{\frac{\not{P}}{2} + \not{k} + m}{(\frac{P}{2} + k)^2 - m^2 + i\epsilon} \to P_+\,\frac{1}{k_0 - \frac{\vec k^2 + \kappa^2}{2m} + i\epsilon}, \qquad P_+ \equiv \frac{1 + \gamma_0}{2}$$
$$\hat S_F(-\tfrac{P}{2} + k) \to \frac{-\frac{\not{P}}{2} + \not{k} + m}{(-\frac{P}{2} + k)^2 - m^2 + i\epsilon} \to -P_-\,\frac{1}{k_0 + \frac{\vec k^2 + \kappa^2}{2m} - i\epsilon}, \qquad P_- \equiv \frac{1 - \gamma_0}{2} \tag{11.50}$$
Note that the appearance of the $P_+$ (resp. $P_-$) projection operator on the electron
(resp. positron) line explains why transverse photon exchange between the lines is
absent in the leading non-relativistic approximation: a transverse photon vertex comes
with a spatial γ matrix $\vec\gamma$, and the projection operators on either side will then cause
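$$\cdots\;\frac{1}{k_0 - \frac{\vec k^2 + \kappa^2}{2m} + i\epsilon}\;\frac{1}{k_0 + \frac{\vec k^2 + \kappa^2}{2m} - i\epsilon}\;\frac{d^4k}{(2\pi)^4}$$
$$= e^2\int \frac{1}{|\vec q - \vec k|^2}\,\frac{m}{\vec k^2 + \kappa^2}\,P_+\Psi_P(\vec k)P_-\,\frac{d^3k}{(2\pi)^3} \tag{11.51}$$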
In passing from the second to the third line, we have used $\gamma_0 P_+ = P_+$, $P_-\gamma_0 = -P_-$
(the relative minus sign ensuring the attractive coupling between the electron and
its antiparticle), and performed the integration over $k_0$ to pick up the pole at
$k_0 = -\frac{\vec k^2 + \kappa^2}{2m} + i\epsilon$. Note that the absence of $q_0$ in the leading order Coulomb kernel
implies that $\Psi_P(q)$ (in the leading approximation) is really a function only of spatial
momentum, $\Psi_P(\vec q)$, so that the only $k_0$ dependence in the integral is that displayed
explicitly in the fermion propagators. From (11.51) we conclude immediately that (in
the ladder approximation) (a) $\Psi_P(q)$ is in fact only a function of the spatial vector $\vec q$,
and (b) that the Dirac structure of $\Psi_P(\vec q)$ must be (in the representation (7.100)), as
a consequence of the projection operators $P_\pm$ in (11.51),
$$\Psi_P(\vec q) \propto \begin{pmatrix} A(\vec q) & -A(\vec q) \\ A(\vec q) & -A(\vec q) \end{pmatrix} \tag{11.52}$$
$$-\frac{1}{m}\nabla^2\phi(\vec r) - \frac{e^2}{4\pi r}\phi(\vec r) = -\frac{\kappa^2}{m}\phi(\vec r) \tag{11.54}$$
which is exactly the non-relativistic Schrödinger equation for two equal mass m particles
binding via an attractive Coulomb potential $V(r) = -\frac{e^2}{4\pi r}$ to form a bound state with
binding energy $-\frac{\kappa^2}{m}$.
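As a quick consistency check of (11.54), the trial function $\phi(r) = e^{-\kappa r}$ solves the equation when $\kappa = \alpha m/2$, giving a ground-state binding of $\kappa^2/m = \alpha^2 m/4$. The sketch below assumes natural units with $e^2/(4\pi) = \alpha$ and uses standard rounded values for α and the electron rest energy, which are inputs supplied here rather than numbers from the text; the function names are illustrative.

```python
ALPHA = 1.0 / 137.035999   # fine-structure constant (assumed input)
M_E_EV = 510998.95         # electron rest energy in eV (assumed input)

def coulomb_ground_state(alpha: float, m: float):
    """kappa and kappa^2/m for the 1s solution exp(-kappa r) of (11.54)."""
    kappa = 0.5 * alpha * m
    return kappa, kappa**2 / m

def residual_11_54(r: float, alpha: float, m: float) -> float:
    """(LHS - RHS) / phi of (11.54) for phi = exp(-kappa r), with e^2/(4 pi) = alpha."""
    kappa, _ = coulomb_ground_state(alpha, m)
    # Radial Laplacian of exp(-kappa r), divided by exp(-kappa r).
    laplacian_over_phi = kappa**2 - 2.0 * kappa / r
    return -laplacian_over_phi / m - alpha / r + kappa**2 / m

kappa, e_bind = coulomb_ground_state(ALPHA, M_E_EV)
print(e_bind)   # positronium ground-state binding energy in eV
print(residual_11_54(r=1.0e-3, alpha=ALPHA, m=M_E_EV))
```

The printed binding energy is half the hydrogen value, as expected for a reduced mass of m/2.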
Just as in the case of self-coupled scalar theories, the higher-order contribution
to two-particle irreducible kernels (such as the graphs displayed in Fig. 11.11(b,c,d))
reduce the strength of the threshold singularities at any given order of perturbation
theory, and consequently result in higher-order (in α) shifts in the binding energy (and
from which we conclude that the function B appearing in (11.56) is an analytic function
of its argument at the origin, with a convergent Taylor series with radius of convergence
$\frac14 m^4$ (due to the square-root branch point appearing at $z = -\frac14 m^4$). The integral
If we substitute the explicit result (11.4) for the $c_n$ into (11.61) we find the dominant
asymptotic behavior $d_n \sim (-1)^n (\frac{4}{m^4})^n$, leading to the radius of convergence quoted
above (using the ratio test). Note that the appearance of an oscillating sign factor in
the coefficients ensures that the singularity of B(z) occurs on the negative real axis
for z: the Borel transform is well-defined (by analytic continuation of its power series
around z = 0) and non-singular on the entire positive real axis where the integral
(11.56) reconstructing Z(λ, m) runs. In such cases one refers to the original divergent
asymptotic expansion as Borel summable: the full partition function is recoverable in
such cases from a knowledge of the perturbative expansion coefficients, which after
division by a factorial, yield Taylor coefficients with power behavior at large order,
and define a non-singular function B(z) (for positive real z) which can at least in
principle be used to reconstruct the desired Z(λ, m) for arbitrary coupling λ.
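The mechanism can be seen in a textbook toy series that is not the partition function discussed above: $\sum_n (-1)^n n!\,\lambda^n$. Dividing the coefficients by n! gives the Borel transform $B(z) = 1/(1+z)$, whose only singularity is at z = −1 on the negative axis, so the Laplace-type integral $\int_0^\infty e^{-t} B(\lambda t)\,dt$ is well defined for λ > 0. The normalization of this integral is the common textbook one and may differ from the convention of (11.56); the function names are illustrative.

```python
import math

def partial_sum(lam: float, order: int) -> float:
    """Truncation of the divergent toy series sum_n (-1)^n n! lam^n."""
    return sum((-1) ** n * math.factorial(n) * lam**n for n in range(order + 1))

def borel_sum(lam: float, t_max: float = 60.0, n: int = 6000) -> float:
    """Simpson estimate of the integral of exp(-t) / (1 + lam t) over [0, t_max].

    n must be even; the tail beyond t_max is negligible for t_max = 60.
    """
    h = t_max / n
    f = lambda t: math.exp(-t) / (1.0 + lam * t)
    total = f(0.0) + f(t_max)
    total += 4.0 * sum(f((2 * i - 1) * h) for i in range(1, n // 2 + 1))
    total += 2.0 * sum(f(2 * i * h) for i in range(1, n // 2))
    return total * h / 3.0

lam = 0.1
print(borel_sum(lam))
print([partial_sum(lam, k) for k in (4, 8, 9, 20, 40)])
```

The low-order truncations hover around the Borel sum, while the high-order ones run away: the alternating signs are what keep the singularity of B(z) off the integration contour.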
The extension of the above ideas to dimensions d ≥ 1 (i.e., the anharmonic oscillator
for d = 1, fully regularized $\phi^4$ theories in d ≥ 2) with a positive sign mass term is
quite straightforward. The Euclidean action of the discretized theory may be written
$$S(\phi_i) = \frac12\sum_{i,j}\phi_i K_{ij}\phi_j + \frac{\lambda}{4}\sum_i \phi_i^4 \tag{11.62}$$
The appropriate analyticity of B(z) and hence the property of Borel summability has
been rigorously established for such theories, although for considerations of space we
shall not attempt to provide a proof here.9
9 For discussion and further references, see (Glimm and Jaffe, 1987), Section 23.2.
The singularity at positive real values is a consequence of the fact that the coefficients
in a power series expansion (in this case, in powers of $\sqrt{z}$), while only growing at a
power rate (rather than factorially) now lack the $(-1)^n$ oscillating sign factor present
in the Borel-summable case, which resulted in a singularity of B(z) for negative real
z, safely away from the contour of integration of the Borel transform (11.56). For
d ≥ 1 (quantum mechanics or field theory), the appearance of a singularity of the
Borel transform on the positive real axis is associated with the presence of tunneling
phenomena for physical (i.e., positive) values of the coupling λ: recall that, for the
Borel summable cases discussed in Section 11.1, energetic instabilities exemplified by
the instanton solutions responsible for the leading large-order behavior only occurred
once we had analytically continued the coupling λ to negative real values. With a
negative squared-mass term, tunneling between distinct local minima of the potential
energy function already occurs for physical (i.e., positive) values of λ. The presence
of instanton solutions (extrema of the Euclidean action) for physical values of the
gauge coupling in non-abelian gauge theories like QCD means that such theories must
also necessarily develop Borel singularities on the positive axis, and are therefore not
Borel-summable.
The discussion up to this point has implicitly assumed that the exact amplitudes
of our theory are expressed in terms of a well-defined path-integral representation: i.e.,
“Essentially non-perturbative” processes: non-Borel-summabilityin field theory 403
the theory is fully regularized in both the ultraviolet and the infrared to reduce the
number of degrees of freedom (effectively) to a finite level. In Part 4 of the book we
shall examine the process of reorganizing weak-coupling perturbation theory in order
to eliminate the dependence of the amplitudes on these cutoffs in favor of renormalized
amplitudes defined in terms of physically accessible low-energy parameters of the the-
ory. In particular, the formal expansion of the amplitudes of the theory in powers of the
bare coupling parameter(s) appearing in the cutoff theory is replaced by an expansion
in powers of cutoff-independent renormalized coupling(s). This reorganization of the
perturbation series can result in the appearance of new singularities of the Borel
transform on the positive real axis, called “UV renormalons”, which once again vitiate
the reconstruction of the non-perturbative amplitude from perturbative information.
Even if the ultraviolet cutoff is maintained, in the infinite volume limit for massless
field theories, infrared divergences can appear which similarly induce positive real
singularities of the Borel transform (in this case, called “IR renormalons”), again
destroying the Borel summability of the theory. The lesson from all of this is clear:
the property of Borel summability is an extremely fragile one, and one which we can
hardly ever expect to be present in interesting relativistic field theories.
The failure of the Borel resummation technique suggests that the question formulated
earlier in this section (whether or not a formal perturbative expansion encodes
sufficient information to reconstruct the exact generating function(al) of the theory)
should be answered in the negative in all such cases.
This conclusion is unwarranted, as we shall now see. The Borel transform is only one of
a variety of reconstruction techniques which attempt to connect perturbative compu-
tations with the exact amplitudes of theories defined by path integrals. In particular,
the path integral for scalar field theories, with either sign of the mass term, may
be reconstructed by use of methods which go under the generic name of “optimized
perturbation theory”. The particular form of optimized perturbation theory which we
shall describe here is sometimes referred to as the “linear δ expansion”. The basic
idea is to construct a series of approximants to the path integral which only require
perturbative calculations (defined here as path integrals with Gaussian exponents only:
hence analytically computable), but nevertheless can be shown to converge rigorously
to the exact answer. One interpolates between a “tunable” Gaussian
approximation and the exact action by introducing an auxiliary interpolating variable
δ, 0 ≤ δ ≤ 1, and a variational parameter μ—effectively a variable bare mass.
Thus, one writes, for a theory with Euclidean action S(φ, m, λ), an interpolating
action
$$S_\delta \equiv \delta\, S(\phi, m, \lambda) + (1-\delta)\, S_0(\phi, \mu) \qquad (11.67)$$
where S0 is quadratic in the field variable. The partition function can then be formally
expanded in the δ variable
$$Z = \int D\phi\, e^{-\delta S - (1-\delta)S_0} \sim \sum_{n=0}^{\infty} c_n(m, \lambda; \mu)\,\delta^n \qquad (11.68)$$
where the evaluation of the $c_n$ involves the usual perturbative manipulations: i.e.,
integrals with a Gaussian action S0 . The correct theory is, of course, recovered in the
$$Z_N \equiv \sum_{n=0}^{N} c_n(m, \lambda; \mu) \qquad (11.69)$$
$$R_N \equiv Z - Z_N(\mu_N) \qquad (11.70)$$
goes exponentially to zero for large N . We shall provide the full proof here only for the
case of the quartic toy integral: the full argument for d = 1 (the anharmonic oscillator,
either single or double-well) can be found in (Duncan and Jones, 1993).
We begin with a useful identity which isolates the N th approximant ZN defined
in (11.69) (see Problem 6)
$$Z_N = \oint_C \frac{dz}{2\pi i}\,\frac{1}{z^{N+1}}\,\frac{1 - z^{N+1}}{1 - z}\int D\phi\; e^{-zS - (1-z)S_0} \qquad (11.71)$$
where C is a small circular contour enclosing the origin (and excluding z = 1). The
condition of minimal sensitivity requires that we extremize ZN with respect to the
variational parameter μ, the dependence on which is entirely contained in the “free”
action S0 (cf. (11.67)):
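$$\frac{\partial Z_N}{\partial\mu} = 0 \;\Rightarrow\; \oint_C \frac{dz}{2\pi i}\int D\phi\;\frac{\partial S_0}{\partial\mu}\,\frac{1}{z^{N+1}}\,e^{-zS - (1-z)S_0} = 0$$
$$\Rightarrow\; 0 = \int D\phi\;\frac{\partial S_0}{\partial\mu}\,(S - S_0)^N e^{-S_0} \qquad (11.72)$$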
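The projection performed by the contour kernel in (11.71) can be illustrated numerically for a single field configuration. The sketch below is a stand-in rather than a proof: it replaces the path integral by the function $Z(z) = e^{-zs}$ (one configuration with $S - S_0 = s$, the factor $e^{-S_0}$ omitted), and evaluates the contour integral by the trapezoidal rule on a circle of radius less than 1. All names and the choice of test function are ours.

```python
import cmath
import math

def partial_sum_by_contour(Z, N, radius=0.5, points=512):
    """Trapezoidal estimate of (1/(2 pi i)) * contour integral over |z| = radius
    of z^-(N+1) * (1 - z^(N+1)) / (1 - z) * Z(z)."""
    total = 0.0 + 0.0j
    for k in range(points):
        z = radius * cmath.exp(2j * math.pi * k / points)
        kernel = (1 - z**(N + 1)) / ((1 - z) * z**(N + 1))
        # dz = i z dtheta, hence dz / (2 pi i) = z dtheta / (2 pi)
        total += z * kernel * Z(z)
    return total / points

def direct_partial_sum(s, N):
    """Sum of the first N + 1 Taylor coefficients of exp(-z s)."""
    return sum((-s)**n / math.factorial(n) for n in range(N + 1))
```

The two functions agree to rounding error, since the kernel $z^{-(N+1)}(1 + z + \dots + z^N)$ picks out exactly the coefficients $c_0, \dots, c_N$.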
We shall also need the following identity, which will facilitate the evaluation of the
remainder term (see Problem 7):
$$e^{-f} - \sum_{n=0}^{N}\frac{(-f)^n}{n!} = \frac{1}{N!}\,e^{-f}\int_0^{|f|} e^{\xi\,\mathrm{sign}(f)}\,\xi^N\,d\xi \qquad (11.73)$$
valid for odd N . The large order asymptotics discussed henceforth will implicitly
assume that we are dealing with odd orders only. Using this identity, it follows
immediately that the remainder term RN at order N (odd) can be written as the
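$$R_N = A_N + B_N \qquad (11.74)$$
$$A_N \equiv \frac{1}{N!}\int D\phi\;\theta(S - S_0)\,e^{-S}\int_0^{S - S_0} e^{\xi}\,\xi^N\,d\xi \qquad (11.75)$$
$$B_N \equiv \frac{1}{N!}\int D\phi\;\theta(S_0 - S)\,e^{-S}\int_0^{S_0 - S} e^{-\xi}\,\xi^N\,d\xi \qquad (11.76)$$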
For large (resp. small) fields φ, we have S > S0 (resp. S < S0 ), so we can refer to AN
(resp. BN ) as the strong (resp. weak) field contributions to the remainder term.
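The identity (11.73) on which this split rests can be spot-checked numerically for both signs of $f$. This is a minimal sketch using a composite Simpson rule for the $\xi$ integral; the function names and step count are our own choices.

```python
import math

def taylor_remainder(f, N):
    """Left side of (11.73): exp(-f) minus its Taylor polynomial of order N."""
    return math.exp(-f) - sum((-f)**n / math.factorial(n) for n in range(N + 1))

def integral_form(f, N, steps=20000):
    """Right side of (11.73), with the xi integral done by Simpson's rule
    (steps must be even)."""
    if f == 0:
        return 0.0
    sign = 1.0 if f > 0 else -1.0
    upper = abs(f)
    h = upper / steps

    def g(xi):
        return math.exp(sign * xi) * xi**N

    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return math.exp(-f) * (total * h / 3.0) / math.factorial(N)
```

For odd N the two sides agree for either sign of $f$. For even N and positive $f$ they differ by an overall sign, which is why the restriction to odd orders is needed.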
In the zero-dimensional case, there is a single spacetime point, the single field
variable φ is called x, and our free and full actions become
$$S_0 = (m^2 + \lambda\mu)\,x^2, \qquad S = m^2x^2 + \lambda x^4 \qquad (11.77)$$
where inessential factors of $\frac{1}{2}$ and $\frac{1}{4}$ in our original quartic toy integral (11.55) have
been dropped. The PMS condition (11.72) determining $\mu_N$ becomes in this case
$$\int_0^{\infty} x^2\,(x^4 - \mu_N x^2)^N\,e^{-(m^2 + \lambda\mu_N)x^2}\,dx = 0 \qquad (11.78)$$
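At the lowest odd order, $N = 1$, the condition (11.78) can be solved in closed form, which gives a quick feel for how the optimization works. The sketch below assumes the zero-dimensional actions $S = m^2x^2 + \lambda x^4$ and $S_0 = (m^2 + \lambda\mu)x^2$ (inferred from the integrand of (11.78)), integrates over the whole real line, and uses Gaussian moments; with these assumptions (11.78) reduces to $\lambda\mu^2 + m^2\mu - \frac{5}{2} = 0$. Function names are ours.

```python
import math

def exact_Z(m2, lam, half_width=8.0, steps=20000):
    """Simpson estimate of Z = integral of exp(-m2 x^2 - lam x^4) over the real line."""
    h = 2.0 * half_width / steps

    def w(x):
        return math.exp(-m2 * x * x - lam * x**4)

    total = w(-half_width) + w(half_width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * w(-half_width + i * h)
    return total * h / 3.0

def first_order_Z(m2, lam, mu):
    """Z_1 = c_0 + c_1 for the free action S0 = (m2 + lam*mu) x^2."""
    a = m2 + lam * mu
    norm = math.sqrt(math.pi / a)   # Gaussian normalization
    x2 = 1.0 / (2.0 * a)            # <x^2> in the Gaussian weight
    x4 = 3.0 / (4.0 * a * a)        # <x^4> in the Gaussian weight
    return norm * (1.0 - lam * (x4 - mu * x2))

def pms_mu_first_order(m2, lam):
    """Positive root of lam*mu^2 + m2*mu - 5/2 = 0, i.e. (11.78) at N = 1."""
    return (-m2 + math.sqrt(m2 * m2 + 10.0 * lam)) / (2.0 * lam)
```

For $m^2 = \lambda = 1$ the optimized first-order approximant lies within a few percent of the exact integral, whereas the unoptimized choice $\mu = 0$ (ordinary first-order perturbation theory) is off by more than half.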
For large $N$, the integral in (11.80) is dominated by two saddle points at which $S_N'(u) = 0$,
one at $u = u_> > 1$ contributing with a positive sign, the other at $u = u_<$, $0 < u_< < 1$,
contributing with a negative sign. One readily finds
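$$u_> = \frac{1}{2} + \frac{1}{\alpha_N}\left(1 + \sqrt{1 + \frac{\alpha_N^2}{4}}\right) \qquad (11.81)$$
$$u_< = \frac{1}{2} + \frac{1}{\alpha_N}\left(1 - \sqrt{1 + \frac{\alpha_N^2}{4}}\right) \qquad (11.82)$$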
As the two contributions must cancel to satisfy the PMS condition, the values of the
effective action function SN (u) at these two points must agree in the large N limit:
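$$S_N(u_>) = S_N(u_<) \;\Rightarrow\; 2\sqrt{1 + \frac{\alpha_N^2}{4}} = \ln\frac{\sqrt{1 + \frac{\alpha_N^2}{4}} + 1}{\sqrt{1 + \frac{\alpha_N^2}{4}} - 1} \;\Rightarrow\; \alpha_N = 1.325487\ldots \equiv \alpha_0 \qquad (11.83)$$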
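The transcendental condition in (11.83) is easily solved numerically. The following is a minimal bisection sketch; the bracket $[0.5, 3]$ and the function names are our own choices.

```python
import math

def pms_mismatch(alpha):
    """Left side minus right side of the matching condition in (11.83)."""
    s = math.sqrt(1.0 + alpha * alpha / 4.0)
    return 2.0 * s - math.log((s + 1.0) / (s - 1.0))

def solve_alpha0(lo=0.5, hi=3.0, tol=1e-12):
    """Bisection for the root of pms_mismatch, which changes sign on [lo, hi]."""
    f_lo = pms_mismatch(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = pms_mismatch(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The root reproduces the quoted value of $\alpha_0$, and half of it agrees numerically with the decay constant 0.6627 appearing in (11.91).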
we find
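$$A_N = \frac{(\lambda\mu_N^2)^{N+1}}{2N!}\sqrt{\mu_N}\int_1^{\infty} du\; e^{-m^2\mu_N u - \lambda\mu_N^2 u^2}\,u^{N+\frac{1}{2}}\,(u-1)^{N+1}\int_0^1 \sigma^N e^{\lambda\mu_N^2 u(u-1)\sigma}\,d\sigma \qquad (11.87)$$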
Given that u > 1 in the integral above, the σ integral satisfies the obvious inequality
$$\int_0^1 \sigma^N e^{\lambda\mu_N^2 u(u-1)\sigma}\,d\sigma < e^{\lambda\mu_N^2 u(u-1)} \qquad (11.88)$$
Inserting the PMS scaling behavior found earlier in (11.79, 11.80, 11.83), this becomes
$$A_N < \frac{(\lambda\mu_N^2)^{N+1}}{2N!}\sqrt{\mu_N}\int_1^{\infty} du\; u^{\frac{1}{2}}\,(u-1)\,e^{-N S_N(u)} \qquad (11.90)$$
and we see that the u integral is dominated by exactly the same saddle-point at u = u>
found earlier in implementing the PMS condition. In other words, at large (odd) N ,
the $u$ integral gives a contribution proportional to $\frac{1}{\sqrt{N}}\,e^{-N S_N(u_>)}$, with $u_>$ given in
(11.81) (with $\alpha_N = \alpha_0 = 1.325\ldots$). Inserting this result and using Stirling’s formula for
the factorial, a short calculation gives the desired asymptotic behavior
$$A_N \sim C N^{\frac{1}{4}}\, e^{N(1 + \ln\alpha_0 - S_N(u_>))} = C N^{\frac{1}{4}}\, e^{-0.6627\ldots N} \qquad (11.91)$$
In this case, the PMS scaling turns out to be $\mu_N \sim N^{2/3}$. The strong-field contribution
to the remainder $A_N$ dies exponentially as in the quartic integral case (i.e., like
$e^{-\mathrm{const}\cdot N}$), while the weak-field contribution goes like
$$B_N < \frac{\lambda\beta\mu_N^2}{4\sqrt{2\pi N}}\; e^{-\frac{1}{\lambda\beta}(N/\mu_N)^2} \sim N^{5/6}\, e^{-C N^{2/3}/(\lambda\beta)} \qquad (11.97)$$
In the case of the anharmonic oscillator, techniques are available for the calculation
of the δ expansion perturbation theory coefficients to high order (e.g., N ∼ 75), and
these convergence results can therefore be checked explicitly. In the field-theory case,
of course, higher loop calculations become simply impractical, so one has to hope for
convergence at moderate values of N (less than 5, say).
Note that the convergence of the optimized approximants in (11.97) is lost at
large Euclidean time extent β—a problem which is, of course, exacerbated in higher
dimension where β becomes βV , with V the spatial volume. This means that with the
particular interpolation chosen here (a variable bare mass), the optimized perturbation
theory is not really useful in the field theory context, even though in principle it implies
that exact results are reconstructible from “perturbative information” in the finite-
volume theory for either sign of the mass term. One might hope to eliminate the volume
dependence by attempting an optimized expansion for the connected amplitudes of the
theory—by studying an optimized expansion of ln Z rather than Z, for example—but
the convergence proof with the PMS optimized interpolation approach as described
above breaks down in this case (Duncan and Jones, 1993). Of course, this procedure
relied on a very specific choice for the interpolation between free and full actions (using
a variational bare mass term, in particular), and it may very well be possible that a
more ingenious interpolation scheme would allow a convergent reorganization of the
perturbative expansion for connected amplitudes even in non-Borel-summable field theories.
The just-described examples of the δ expansion indicate that the question posed
at the beginning of this section—is the full content of field theory already present
in perturbatively computable amplitudes (perhaps in a highly disguised form!)?—
cannot be answered definitively in the negative, even for non-Borel-summable theories.
However, as a practical matter, we must admit that for “essentially non-perturbative”
processes in a strongly coupled non-Borel-summable field theory, those in which no
summable subset of perturbation theoretic contributions can be shown to incorporate
the dominant contribution to the desired amplitudes, a description in terms of Feyn-
man diagrams yields at best a crude qualitative (and very possibly misleading) picture
of the underlying physics. Very little can be learned, for example, about the physics
of quark confinement by studying Feynman graphs of interacting quarks and gluons,
however complicated.
In cases like these, where perturbation theory completely fails us, how can we hope
to make progress in making reliable, quantitatively accurate, predictions in a relativistic
field theory? If we give up on the most ambitious goal—explicitly calculating the full
amplitudes of the theory from a finite (and small) set of experimentally determinable
masses and couplings—much can be achieved simply by exploiting general structural
features which we expect our strongly coupled field theory to possess. In the 1950s and
1960s, for example, the failure of local field theory models to provide a quantitative
description of strong interaction processes (as they could hardly do prior to the
discovery of quantum chromodynamics (QCD), and the development of appropriate
non-perturbative techniques for dealing with QCD) led many theorists to adopt a
highly positivistic approach, in which one attempted to constrain strong interaction
scattering amplitudes (incorporated in the S-matrix of the theory), as the only directly
measurable objects, on the basis of a set of “sacred” principles, primarily Lorentz-
invariance, unitarity, crossing invariance, and a principle of maximal analyticity, which
asserted the analyticity of scattering amplitudes as functions of the complexified
kinematical variables except at points where singularities were necessitated by the
appearance of thresholds. The clever application of dispersion relations which could
be derived on the basis of these fundamental assumptions led to many important and
experimentally verifiable predictions in strong interaction physics, despite the fact
that the correct underlying local field theory had yet to be identified. Typically, in
dispersion theory one derives relations between amplitudes: the complete calculation
of a specific amplitude from first principles cannot, of course, be expected in the
absence of a specific microtheory, although at the time there were hopes that a
self-consistent “bootstrap” program for hadronic amplitudes would suffice to “almost
uniquely” determine the S-matrix for hadronic scattering.
The development of current algebra in the late 1960s provides another example
of the profitable exploitation of general symmetry assumptions to derive important
relations between amplitudes, this time on the basis of an assumed commutator algebra
of the currents of the theory associated with chiral symmetry. The results of current
algebra follow purely from the assumed current commutators, and are compatible
with a variety of underlying field-theoretic models (or “effective Lagrangians”) which
share the same current algebra (cf. Section 16.6), so the verification of current algebra
predictions for low-energy multipion scattering (say) brings us no closer to a unique
underlying dynamics than the results of the S-matrix approach. Consequently, if, as we
now believe, the dynamics of the strong interactions is just as precisely defined by an
underlying local quantum field theory (quantum chromodynamics) as the interactions
of electrons and photons were found to be by quantum electrodynamics in the 1950s, a
full test of such a theory must necessarily include a sufficiently accurate determination
of enough of the phenomenological content of the theory to allow us to conclude that
the specific quantum field theory chosen is indeed the correct one.
Fortunately, the last 30 years have seen the development of powerful new numerical
techniques for reliably extracting much of the non-perturbative content of strongly
coupled field theories. These techniques, which go under the general heading of “lattice
field theory”, mimic at a numerical level the rigorous construction of a continuum field
theory (in those cases where a construction is possible; see (Glimm and Jaffe, 1987)),
starting with a full regularization of the theory on a finite spacetime lattice (with
lattice spacing a and spacetime volume V ) and then taking the continuum limit a → 0
and the infinite volume limit V → ∞, sometimes referred to as the “thermodynamic
limit” (in that order). In the case of four-dimensional massless Yang–Mills theories
(coupled to Nf fermionic quarks, provided the number of quark types Nf does not
exceed a critical number; cf. Chapter 15) this limit is believed to yield well-defined
Green functions—in particular, a set of Wightman functions for the local operators of
the theory with zero color which satisfy the Wightman axioms, and a theory with a
non-zero mass gap in the spectrum. A rigorous proof of this assertion is likely to be
extremely difficult: it is one of the seven Millennium Prize Problems announced by the
Clay Mathematics Institute in May 2000!
Nevertheless, assuming the existence of the continuum and thermodynamic limits,
a sequence of approximants to the exact Euclidean Schwinger functions of the theory
can be obtained by evaluating the corresponding functional integrals numerically
(typically, by Monte Carlo simulation methods) on a finite hypercubical $L \times L \times L \times L$
spacetime lattice (with $L = \hat{L}a$, $\hat{L}$ integer, $a$ the lattice spacing), and then increasing
L as the lattice spacing is appropriately scaled towards zero. The statistical errors
incurred in such a numerical approach can be determined by standard statistical
techniques. The systematic errors are of two kinds: short-distance errors due to the
finite lattice spacing a, and long-distance, due to the finite extent of the lattice
both spatially and temporally. The former turn out to be simply of a power nature
$a^p$, with the power $p$ depending on the particular observable being measured. For
a theory with a mass gap $m$, the finite volume corrections fall exponentially, at
least as fast as $e^{-mL}$, as the physical size of the lattice is increased. In any event,
the existence of a continuum limit ensures that the approximants to the desired
Schwinger functions systematically approach the correct non-perturbative results,
unlike the situation with partial summands of the formal perturbative expansion for
these functions. In particular, using the methods of lattice gauge theory (described in
greater detail in Sections 19.3 and 19.4), it has been possible to (a) verify the presence
of a linearly rising potential at long distances (and Coulombic behavior at short
distances) between static color sources (quark confinement), as illustrated in a typical
quenched (i.e., pure gauge theory) calculation in Fig. 11.13 (from (Duncan et al.,
1995)), and (b) compute the spectrum of the low-lying hadrons from first principles
(i.e., starting from the QCD Lagrangian) to within a few percent and verify agree-
ment with the observed particle masses. (For a summary of some recent results, see
(Kuramashi, 2008).)
Despite the enormous progress that has been made in obtaining quantitatively reli-
able non-perturbative information with the methods of lattice field theory (especially
in the case of lattice QCD), the restrictions imposed in this approach to numerical
estimates of the Euclidean path integral lead to some serious drawbacks. There are
two main areas where lattice field theory leaves much to be desired:
[Figure: V(R) (GeV), vertical axis, plotted against R (fermi), horizontal axis.]
Fig. 11.13 Static quark–antiquark potential (pure gauge theory) from lattice simulation.
Problems 411
In summary, it is clear that we are still far from having a comprehensive and universally
applicable strategy—a “magic bullet”, as it were—for dealing with strongly-coupled
field theories. For the time being we must instead make do with a patchwork of
techniques which provide complementary (but far from complete) information about
the physics of such theories.
10 See, however, footnote 13 of Chapter 10 for a potentially useful Langevin simulation approach to
complex actions.
11.4 Problems
1. The instability of the ground state for an anharmonic oscillator with negative
$\lambda$ (i.e. $V(q) = \frac{1}{2}q^2 - \frac{|\lambda|}{4}q^4$) can be studied by the standard WKB formula. The
tunneling amplitude for a particle of zero energy to tunnel from the origin $q = 0$ to
the other side of the barrier (at $q_t = \pm\sqrt{2/|\lambda|}$) is proportional to $e^{-\int_0^{q_t}\sqrt{2V(q)}\,dq}$.
Evaluate the integral and compare with the exponential term in (11.16) (note:
the imaginary part of the energy is related to the decay rate, i.e. the square of
the tunneling amplitude).
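2. In the Hamiltonian for the anharmonic oscillator ($m = \hbar = 1$)
$$H = -\frac{1}{2}\frac{d^2}{dq^2} + \frac{1}{2}q^2 + \frac{\lambda}{4}q^4 \qquad (11.98)$$
$$H' = \lambda^{-1/3}H = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{4}x^4 + \frac{1}{2}\lambda^{-2/3}x^2 \qquad (11.99)$$
The last term is a Kato perturbation (see (Kato, 1995)) of the first two,
so the expansion of $\lambda^{-1/3}E(\lambda)$ in powers of $\lambda^{-2/3}$ is analytic:
$E(\lambda) = \lambda^{1/3}\sum_{n=0}^{\infty} a_n\lambda^{-2n/3}$ is a convergent series.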
3. Verify the one-loop results (11.43,11.44) for the scalar loop integrals in space-time
dimensions 2 and 3, respectively (the identity (16.67) is useful).
4. Consider fermion–antifermion ($\psi$–$\psi^c$) scattering via exchange of a massive
spinless boson φ, as in Theory C of Section 7.6, in spacetime dimension d = 4.
(a) Show that the one-loop graph (fourth order in the Yukawa coupling λ) arising
from two successive exchanges of a φ (i.e., the graph displayed in Fig. 11.12
with n = 1) is infrared finite in the on-threshold limit p, q → 0. In this theory
a bound state cannot form at weak coupling by infrared enhancement of the
coupling strength: instead, the coupling itself must become large to encourage
the persistent rescattering needed for bound-state formation.
(b) Repeat the steps leading to (11.54) in this theory (i.e., study the Bethe–
Salpeter equation in the ladder approximation, treating the fermion propa-
gators non-relativistically) to show that the resultant Schrödinger equation
contains a Yukawa potential with range 1/M , where M is the mass of the
φ. Of course, in this theory, even if a bound state exists, the ladder graphs
do not play a preferred role in the formation of the bound state, unlike the
situation for threshold bound states. The formation of the bound state in
this case is an essentially non-perturbative phenomenon.
5. The effect of higher-order kernels in shifting the mass of a threshold bound state
can be calculated perturbatively by the following procedure. We shall imagine
inserting a single higher-order kernel (as, for example, in Fig. 11.10(b)) in the
graphs for the 2-2 scattering amplitude.
(a) Show that the first-order change in the amplitude $\tilde{G}_P(q, p)$ resulting from a
shift $\Delta K_P(q, p)$ in the kernel $K_P(q, p)$ is given by
$$\Delta\tilde{G}_P(q, p) = \int\frac{d^dq'\,d^dq''}{(2\pi)^{d}}\;\tilde{G}_P(q, q')\,\Delta K_P(q', q'')\,\tilde{G}_P(q'', p) \qquad (11.100)$$
(b) By taking $P$ to the bound-state pole and extracting the pole-term on both
sides of (11.100) using (11.34), show that the first-order shift in the (squared)
bound-state mass induced by $\Delta K_P(q, p)$ is
$$\Delta M_B^2 = -i\int\frac{d^dq'\,d^dq''}{(2\pi)^{2d}}\;\Phi_P^*(q')\,\Delta K_P(q', q'')\,\Phi_P(q'') \qquad (11.101)$$
This formula is the field-theory analog of the familiar expression for
the energy shift in non-degenerate first-order perturbation theory in non-
relativistic quantum mechanics.
6. Verify the contour-integral identity (11.71) for the partial summand of the
asymptotic expansion of the partition function of a general scalar field theory.
7. Verify the identity (11.73) needed for the estimation of the remainder term in an
asymptotic expansion.
12
Symmetries I: Continuous spacetime
symmetry: why we need Lagrangians
in field theory
For most beginning students of quantum field theory, an early surprise is in store
when they encounter, for the first time since facing unpleasant problems in classical
mechanics, typically involving absurdly complicated devices requiring the insertion of
peculiar constraints, the notion of a Lagrangian as the fundamental object specifying
the dynamical behavior of the theory. Certainly, such a creature plays little or no role
in non-relativistic quantum theory, where the Hamiltonian, the explicit determinant
of time evolution of the system, reigns supreme. Our first objective in this chapter
is to understand the peculiar, and indispensable, utility of the Lagrangian approach
to dynamics in relativistic quantum field theories. Our emphasis initially will be to
underscore the facility with which a Lagrangian approach incorporates the desired—
in fact, indispensable—spacetime symmetries of a relativistic field theory. In later
chapters we shall see that the Lagrangian is an equally useful object in simplifying the
treatment of local gauge symmetries, which are in some sense an amalgam of internal
and spacetime symmetry.
1 The assignment of parity quantum numbers to local fields will be explained in Section 13.1.
The problem with derivatively coupled theories: seagulls, Schwinger terms, and T ∗ products 415
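$$\frac{(-ig)^2}{2}\int d^4z_1\,d^4z_2\;\langle p_1'p_2'|:\bar\psi\gamma^\mu\gamma_5\psi(z_1)\,\bar\psi\gamma^\nu\gamma_5\psi(z_2):|p_1p_2\rangle\cdot\langle 0|T(\partial_\mu\phi(z_1)\partial_\nu\phi(z_2))|0\rangle \qquad (12.2)$$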
So far, on the surface, we seem to be dealing with a perfectly covariant expression, with
the Lorentz indices μ, ν properly contracted, but of course trouble lurks potentially in
the time-ordered product, where an explicit choice of inertial frame is presupposed.
Without the derivatives on the scalar field, this T-product is just the free scalar
propagator iΔF (z1 − z2 ), which we saw in Chapter 7 is perfectly Lorentz-invariant
(as a function of the spacetime separation z1 − z2 ). In this case, we find
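$$\frac{\partial}{\partial z_2^\nu}\langle 0|T(\phi(z_1)\phi(z_2))|0\rangle = \langle 0|T(\phi(z_1)\partial_\nu\phi(z_2))|0\rangle + \delta_{\nu 0}\,\delta(z_1^0 - z_2^0)\,[\phi(z_2), \phi(z_1)] \qquad (12.3)$$
$$= \langle 0|T(\phi(z_1)\partial_\nu\phi(z_2))|0\rangle \qquad (12.4)$$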
where the commutator appearing in (12.3) (in virtue of time-derivatives acting on the
θ-functions defining the T-product) vanishes by locality of φ(z). Inserting the second
derivative, however, we find
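$$\langle 0|T^*(\partial_\mu\phi(z_1)\partial_\nu\phi(z_2))|0\rangle \equiv \frac{\partial}{\partial z_1^\mu}\frac{\partial}{\partial z_2^\nu}\langle 0|T(\phi(z_1)\phi(z_2))|0\rangle \qquad (12.5)$$
$$= \langle 0|T(\partial_\mu\phi(z_1)\partial_\nu\phi(z_2))|0\rangle + \delta_{\mu 0}\,\delta(z_1^0 - z_2^0)\,[\phi(z_1), \partial_\nu\phi(z_2)] \qquad (12.6)$$
$$= \langle 0|T(\partial_\mu\phi(z_1)\partial_\nu\phi(z_2))|0\rangle + i\,\delta_{\mu 0}\,\delta_{\nu 0}\,\delta^4(z_1 - z_2) \qquad (12.7)$$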
where in going from (12.6) to (12.7) we have used the equal-time commutator (8.1)
of φ with φ̇. Now the T∗ -product defined in (12.5) is itself perfectly covariant, as it is
simply the second spacetime-derivative of the Lorentz-invariant scalar propagator:
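$$\langle 0|T^*(\partial_\mu\phi(z_1)\partial_\nu\phi(z_2))|0\rangle = i\int\frac{k_\mu k_\nu}{k^2 - m^2 + i\epsilon}\,e^{-ik\cdot(z_1 - z_2)}\,\frac{d^4k}{(2\pi)^4} \qquad (12.8)$$
$$\langle 0|T(\partial_\mu\phi(z_1)\partial_\nu\phi(z_2))|0\rangle = \langle 0|T^*(\partial_\mu\phi(z_1)\partial_\nu\phi(z_2))|0\rangle - i\,\delta_{\mu 0}\,\delta_{\nu 0}\,\delta^4(z_1 - z_2) \qquad (12.9)$$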
which in turn means that the 2-2 scattering amplitude (12.2) contains, in addition to
a perfectly covariant contribution (from the T ∗ -product), the non-covariant piece
$$i\,\frac{g^2}{2}\int d^4z\;\langle p_1'p_2'|:\bar\psi\gamma_0\gamma_5\psi(z)\,\bar\psi\gamma_0\gamma_5\psi(z):|p_1p_2\rangle \qquad (12.10)$$
Referring back to Section 5.5, the reader may easily verify (see Problem 1) that
the difficulty here was already identified in the general expression (5.81), where we
showed that the appearance of spatial derivatives of a δ-function in the equal-time
commutator of a non-ultralocal interaction Hamiltonian density spelled potential
disaster for the Lorentz covariance of the theory. In this case, the cure is easy to
find, as the non-covariant piece is itself local, and can be cancelled by augmenting
the interaction Hamiltonian in (12.1) by the four-fermion operator in (12.10), with an
opposite sign:
$$\mathcal{H}_{\rm int}(x) = g\,\bar\psi\gamma^\mu\gamma_5\psi\,\partial_\mu\phi(x) + \frac{1}{2}g^2(\bar\psi\gamma_0\gamma_5\psi)^2(x) \qquad (12.11)$$
The contribution to first order of the second term in (12.11) to the 2-2 scattering
amplitude is easily seen to exactly cancel the undesired non-covariant piece (12.10).
Moreover, this new term in the interaction Hamiltonian—dubbed the “seagull” vertex
in the original literature—appears in one-to-one association with every internal scalar
line in the Feynman graphs of the theory, serving to cancel the non-covariant Schwinger
term in the scalar propagator wherever the latter chooses to pop up. The appearance
of a non-covariant term in the Hamiltonian should not cause alarm: the energy density
is itself not a covariant object (as in the free Hamiltonian density, cf. (6.89)).
In general, when non-covariant terms appear in a theory (via an interaction
Hamiltonian failing to be an ultralocal scalar field), it is a non-trivial task to guess the
appropriate seagull terms needed to restore Lorentz-invariance of the theory. In certain
cases (e.g., gauge theories with certain choices of gauge) the required terms may even
be non-local! We clearly need an effective means of assuring Lorentz-invariance of the
theory ab initio—which, we shall soon see, is exactly what the canonical formalism,
in its Lagrangian version, is guaranteed to supply.
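$$\mathcal{H} = \mathcal{H}_0 + \mathcal{H}_{\rm int}$$
$$= \frac{1}{2}\dot\phi^2 + \frac{1}{2}|\vec\nabla\phi|^2 + \frac{1}{2}m^2\phi^2 + \bar\psi(i\vec\gamma\cdot\vec\nabla + M)\psi + g\,\bar\psi\gamma^\mu\gamma_5\psi\,\partial_\mu\phi + \frac{1}{2}g^2(\bar\psi\gamma^0\gamma_5\psi)^2 \qquad (12.12)$$
$$= \frac{1}{2}\dot\phi^2 + \frac{1}{2}|\vec\nabla\phi|^2 + \frac{1}{2}m^2\phi^2 + \bar\psi(i\vec\gamma\cdot\vec\nabla + M)\psi + g\,\bar\psi\gamma^0\gamma_5\psi\,\dot\phi - g\,\bar\psi\vec\gamma\gamma_5\psi\cdot\vec\nabla\phi + \frac{1}{2}g^2(\bar\psi\gamma^0\gamma_5\psi)^2 \qquad (12.13)$$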
In the interaction picture, the identification of conjugate field variables is trivial, as the
equal-time commutation relations of the free field operators are exactly computable.
For scalar fields, we have
$$[\phi_H(\vec y, t), \pi_H^\phi(\vec x, t)] = i\,\delta^3(\vec x - \vec y)$$
$$\{\psi_H(\vec y, t), \pi_H^\psi(\vec x, t)\} = i\,\delta^3(\vec x - \vec y) \qquad (12.20)$$
where we have dropped the Dirac indices on the fermionic fields to avoid notational
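overload. This means that if we replace $\dot\phi$ by $\pi^\phi$ and $\bar\psi$ by $-i\pi^\psi\gamma^0$ in (12.13),
$$\mathcal{H} = \frac{1}{2}(\pi^\phi)^2 + \frac{1}{2}|\vec\nabla\phi|^2 + \frac{1}{2}m^2\phi^2 - i\pi^\psi\gamma^0(i\vec\gamma\cdot\vec\nabla + M)\psi - ig\,\pi^\psi\gamma_5\psi\,\pi^\phi + ig\,\pi^\psi\gamma^0\vec\gamma\gamma_5\psi\cdot\vec\nabla\phi - \frac{1}{2}g^2(\pi^\psi\gamma_5\psi)^2 \qquad (12.21)$$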
and recall that the total Hamiltonian $H = \int\mathcal{H}(\vec x, t)\,d^3x$ is a constant of the dynamics,
we can re-express the total Hamiltonian density immediately in terms of Heisenberg
fields simply by subscripting all the fields with H:
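$$\mathcal{H} = \frac{1}{2}(\pi_H^\phi)^2 + \frac{1}{2}|\vec\nabla\phi_H|^2 + \frac{1}{2}m^2\phi_H^2 - i\pi_H^\psi\gamma^0(i\vec\gamma\cdot\vec\nabla + M)\psi_H - ig\,\pi_H^\psi\gamma_5\psi_H\,\pi_H^\phi + ig\,\pi_H^\psi\gamma^0\vec\gamma\gamma_5\psi_H\cdot\vec\nabla\phi_H - \frac{1}{2}g^2(\pi_H^\psi\gamma_5\psi_H)^2 \qquad (12.22)$$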
The fact that the full Hamiltonian of the theory has been expressed in terms of pairs of
fields satisfying canonical commutation (or anticommutation relations) (12.20) means,
as we shall soon see, that the dynamical equations of the theory can be rewritten
as differential functional equations, the field equivalent of the first-order (in time)
Hamiltonian equations of classical mechanics. In the next section we shall see that a
simple Legendre transformation of these functional equations will lead us to a very
simple criterion for ensuring the eventual Lorentz-invariance of our field theory. This is
a particularly pressing objective, given that H in (12.22) displays no vestige whatsoever
of the underlying Lorentz-invariance of the theory! Nor indeed should it: as indicated
previously, H is a spatial energy density, clearly a frame-dependent, non-Lorentz-
invariant object.
The first step in the derivation of the Hamiltonian field equations is a generalization
of a familiar identity in ordinary quantum mechanics: the fact that the commutation of
the momentum with a function of the coordinate operator is equivalent to a derivative
of the latter. We shall begin with the case of bosonic fields, satisfying commutation,
rather than anticommutation, relations. For example, from (12.20) one finds (for n
integer)
$$[\phi_H^n(\vec y, t), \pi_H^\phi(\vec x, t)] = i\,n\,\phi_H^{n-1}(\vec y, t)\,\delta^3(\vec x - \vec y) \qquad (12.23)$$
and
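One easily generalizes these simple cases to establish that for any polynomial function
$F$ of $\phi_H$, $\pi_H^\phi$, $\vec\nabla\phi_H$, $\vec\nabla\pi_H^\phi$,
$$[F(\phi_H(\vec y, t), \pi_H^\phi(\vec y, t), \vec\nabla\phi_H(\vec y, t), \vec\nabla\pi_H^\phi(\vec y, t)),\; \pi_H^\phi(\vec x, t)] = i\left(\frac{\partial F}{\partial\phi_H} + \frac{\partial F}{\partial\vec\nabla\phi_H}\cdot\vec\nabla_y\right)\delta^3(\vec x - \vec y) \qquad (12.25)$$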
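The quantum-mechanical identity that (12.23) generalizes can be checked mechanically in a single-mode setting. The sketch below is only an analog, not the field-theory statement: it represents the coordinate $q$ and the momentum $p = -i\,d/dq$ as operations on polynomials in $q$ (coefficient lists, lowest power first), so the $\delta^3(\vec x - \vec y)$ factor is absent. All function names are ours.

```python
def times_q_power(poly, n):
    """Multiply a polynomial (coefficients, lowest power first) by q**n."""
    return [0j] * n + [complex(c) for c in poly]

def apply_p(poly):
    """Apply p = -i d/dq to a polynomial in q."""
    return [-1j * k * complex(c) for k, c in enumerate(poly)][1:] or [0j]

def subtract(a, b):
    """Coefficient-wise difference a - b, padding the shorter list with zeros."""
    size = max(len(a), len(b))
    a = a + [0j] * (size - len(a))
    b = b + [0j] * (size - len(b))
    return [x - y for x, y in zip(a, b)]

def commutator_qn_p(poly, n):
    """([q^n, p] f)(q) = q^n (p f) - p (q^n f) for the polynomial f = poly."""
    return subtract(times_q_power(apply_p(poly), n),
                    apply_p(times_q_power(poly, n)))
```

Acting on $f(q) = 1 + 2q + 3q^2$ with $n = 3$, the commutator returns $3i\,q^2 f(q)$, i.e. $i\,n\,q^{n-1}$ times $f$, which is the single-mode counterpart of (12.23).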
Canonical formalism in quantum field theory 419
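A similar argument, interchanging the roles of the field $\phi_H$ and its conjugate momentum
$\pi_H^\phi$, gives
$$[F(\phi_H(\vec y, t), \pi_H^\phi(\vec y, t), \vec\nabla\phi_H(\vec y, t), \vec\nabla\pi_H^\phi(\vec y, t)),\; \phi_H(\vec x, t)] = -i\left(\frac{\partial F}{\partial\pi_H^\phi} + \frac{\partial F}{\partial\vec\nabla\pi_H^\phi}\cdot\vec\nabla_y\right)\delta^3(\vec x - \vec y) \qquad (12.26)$$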
In particular, taking for our function F the total Hamiltonian density H of a scalar
theory, with the total Hamiltonian H given as a (time-independent) functional of the
fields
$$H = \int\mathcal{H}(\phi_H(\vec y, t), \pi_H^\phi(\vec y, t), \vec\nabla\phi_H(\vec y, t), \vec\nabla\pi_H^\phi(\vec y, t))\,d^3y \qquad (12.27)$$
the results (12.25, 12.26) show that commutation of a field with the full Hamilto-
nian can be re-expressed as a functional derivative2 with respect to the canonically
conjugate field:
φ δH ∂H x· ∂H
[H, πH (x, t)] = i = i( −∇ ) (12.28)
δφH (x, t) ∂φH (x, t)
∂ ∇φH (x, t)
δH ∂H x· ∂H
[H, φH (x, t)] = −i φ
= −i( φ
−∇ φ
) (12.29)
δπH (x, t) ∂πH (x, t)
∂ ∇πH (x, t)
$$\frac{\delta H}{\delta\phi_H(x)} = -\dot{\pi}^\phi_H(x), \qquad \frac{\delta H}{\delta\pi^\phi_H(x)} = \dot{\phi}_H(x) \tag{12.30}$$
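As a quick illustration of how the Hamiltonian equations (12.30) operate, consider a free scalar field. The Hamiltonian density used below is the standard free Klein–Gordon form, assumed here for the purpose of the example rather than quoted from (12.22):
$$\mathcal{H}_0 = \tfrac{1}{2}(\pi^\phi_H)^2 + \tfrac{1}{2}(\vec{\nabla}\phi_H)^2 + \tfrac{1}{2}m^2\phi_H^2$$
$$\frac{\delta H_0}{\delta\pi^\phi_H} = \pi^\phi_H = \dot{\phi}_H, \qquad \frac{\delta H_0}{\delta\phi_H} = m^2\phi_H - \nabla^2\phi_H = -\dot{\pi}^\phi_H$$
$$\Rightarrow\quad \ddot{\phi}_H - \nabla^2\phi_H + m^2\phi_H = 0$$
Eliminating the momentum field thus returns the Klein–Gordon equation, the simplest instance of a covariant field equation emerging from a manifestly non-covariant Hamiltonian density.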
2 In evaluating the functional derivatives in this formula, the full Hamiltonian H is assumed to be written
as a spatial integral over the fields on time-slice t. As H is conserved in time, we are of course free to choose
any time-slice on which to express the Hamiltonian as a spatial integral.
3 Cf. the discussion of Grassmann functional derivatives in Section 10.3.2.
420 Symmetries I: Continuous spacetime symmetry: why we need Lagrangians in field theory
$$\frac{\delta H}{\delta\psi_H(x)} = -\dot{\pi}^\psi_H(x), \qquad \frac{\delta H}{\delta\pi^\psi_H(x)} = \dot{\psi}_H(x) \tag{12.31}$$
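Our toy model with Hamiltonian density (12.22) furnishes an immediate and
convenient example: for the scalar field, we find (suppressing the spacetime coordinate,
and writing $\pi^\psi_H$ in terms of the original $\bar{\psi}_H$)
$$\frac{\delta H}{\delta\phi_H(x)} = m^2\phi_H - \vec{\nabla}\cdot(\vec{\nabla}\phi_H - g\bar{\psi}_H\vec{\gamma}\gamma_5\psi_H) = -\dot{\pi}^\phi_H \tag{12.32}$$
$$\frac{\delta H}{\delta\pi^\phi_H(x)} = \pi^\phi_H + g\bar{\psi}_H\gamma^0\gamma_5\psi_H = \dot{\phi}_H \tag{12.33}$$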
Note the presence of a fermionic term (indicated here in bold face) in the scalar
conjugate momentum field $\pi^\phi_H = \dot{\phi}_H - \boldsymbol{g\bar{\psi}_H\gamma^0\gamma_5\psi_H}$: this will very soon play a crucial
role in cancelling the non-covariant effects of the seagull term. Combining these
equations (by eliminating $\pi^\phi_H$), we arrive at the Lorentz-covariant Heisenberg field
equation for the scalar field $\phi_H$:
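The elimination can be sketched as follows. This is a reconstruction from (12.32) and (12.33), assuming the sign convention $\gamma^\mu\partial_\mu = \gamma^0\partial_0 - \vec{\gamma}\cdot\vec{\nabla}$ implied by (12.36). Differentiating (12.33) with respect to time and inserting (12.32),
$$\dot{\pi}^\phi_H = \ddot{\phi}_H - g\,\partial_0(\bar{\psi}_H\gamma^0\gamma_5\psi_H) = -m^2\phi_H + \vec{\nabla}\cdot(\vec{\nabla}\phi_H - g\bar{\psi}_H\vec{\gamma}\gamma_5\psi_H)$$
so that, with $\Box \equiv \partial_0^2 - \nabla^2$,
$$(\Box + m^2)\phi_H = g\,\partial_\mu(\bar{\psi}_H\gamma^\mu\gamma_5\psi_H)$$
Both sides transform as Lorentz scalars, and the time-derivative of the fermionic part of $\pi^\phi_H$ supplies exactly the $\mu = 0$ component needed to complete the four-divergence on the right-hand side.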
Note that the quartic (in the ψH field) seagull term in the Hamiltonian plays no role
so far: the scalar Heisenberg field equation would still be Lorentz-covariant without
this term. The situation is quite different for the fermionic equation of motion. The
second of Eqs. (12.31) applied to (12.22) gives directly4
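$$-i\gamma^0(i\vec{\gamma}\cdot\vec{\nabla} + M)\psi_H - ig\gamma_5\psi_H\pi^\phi_H + ig\gamma^0\vec{\gamma}\gamma_5\psi_H\cdot\vec{\nabla}\phi_H - \boldsymbol{g^2(\pi^\psi_H\gamma_5\psi_H)\gamma_5\psi_H} = \dot{\psi}_H \tag{12.35}$$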
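where the term arising from the quartic seagull contribution is highlighted in boldface
type. Multiplying both sides by $i\gamma^0$ and inserting (12.33) to eliminate the scalar
momentum field $\pi^\phi_H$, we find
$$(i\vec{\gamma}\cdot\vec{\nabla} + M)\psi_H + g\gamma^0\gamma_5\psi_H\dot{\phi}_H - g\vec{\gamma}\gamma_5\psi_H\cdot\vec{\nabla}\phi_H = i\gamma^0\dot{\psi}_H$$
$$\Rightarrow\ (i\gamma^\mu\partial_\mu - M)\psi_H = g\gamma^\mu\gamma_5\psi_H\,\partial_\mu\phi_H \tag{12.36}$$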
so that the Heisenberg field equation for the Dirac field $\psi_H$ is also manifestly Lorentz-covariant.
It is precisely at this point that we see that the contribution from the
seagull term in the interaction Hamiltonian has cancelled the non-covariant term
introduced by the fermionic component of the scalar momentum field $\pi^\phi_H$. In other
words, the strange necessity for a four-fermion seagull interaction in order to preserve
Lorentz-invariant S-matrix amplitudes for fermion scattering, is directly correlated
with the construction of a Hamiltonian leading to Lorentz-covariant field equations
for the fermionic Heisenberg field of the theory. In this toy theory it was not too
difficult to guess the type of extra non-covariant term needed in the interaction
4 The reader may verify that the first Hamiltonian equation simply produces an equation for $\psi^\dagger_H$
equivalent to the adjoint of the equation obtained from the second equation; see Problem 3.
General condition for Lorentz-invariant field theory 421
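$$\frac{dF}{d\phi} \equiv \frac{\partial F}{\partial\phi} - \vec{\nabla}\cdot\frac{\partial F}{\partial\vec{\nabla}\phi} \tag{12.37}$$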
In other words, the Euler derivative acting on the Hamiltonian density is equivalent
to the functional derivative acting on the spatially integrated Hamiltonian density (or
Hamiltonian). The Hamiltonian equations (12.28, 12.29) can thus be written as a pair
of partial differential equations for the Hamiltonian density $\mathcal{H} = \mathcal{H}(\pi^\phi, \phi, \vec{\nabla}\phi)$, which
we assume here can be written in such a way that the momentum fields $\pi^\phi$ appear
without spatial derivatives, as is generally the case for theories of interacting spin-0
and spin-$\tfrac{1}{2}$ particles.5 Namely,
$$\left.\frac{d\mathcal{H}}{d\phi}\right|_{\pi^\phi} = -\dot{\pi}^\phi, \tag{12.38}$$
$$\left.\frac{d\mathcal{H}}{d\pi^\phi}\right|_{\phi} = \dot{\phi} \tag{12.39}$$
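The equivalence between the Euler derivative of the density and the functional derivative of the integrated Hamiltonian can be checked numerically. The sketch below is an illustration only: it assumes a free scalar field on a periodic one-dimensional lattice (the lattice, its spacing, and all function names are hypothetical choices made for this example, not part of the text), and compares a finite-difference estimate of $\delta H/\delta\phi$ with the Euler derivative $m^2\phi - \nabla^2\phi$.

```python
import math

def lattice_hamiltonian(phi, pi, a, m):
    """H = a * sum_i [pi_i^2/2 + ((phi_{i+1}-phi_i)/a)^2/2 + m^2 phi_i^2/2], periodic in i."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        grad = (phi[(i + 1) % n] - phi[i]) / a  # forward-difference gradient
        total += 0.5 * pi[i] ** 2 + 0.5 * grad ** 2 + 0.5 * m ** 2 * phi[i] ** 2
    return a * total

def functional_derivative_phi(phi, pi, a, m, i, eps=1e-4):
    """Central-difference estimate of (1/a) dH/dphi_i, the lattice stand-in for delta H / delta phi(x)."""
    up = list(phi)
    dn = list(phi)
    up[i] += eps
    dn[i] -= eps
    dH = lattice_hamiltonian(up, pi, a, m) - lattice_hamiltonian(dn, pi, a, m)
    return dH / (2.0 * eps * a)

def euler_derivative_phi(phi, a, m, i):
    """Euler derivative of the free density: m^2 phi_i minus the lattice Laplacian of phi at site i."""
    n = len(phi)
    laplacian = (phi[(i + 1) % n] - 2.0 * phi[i] + phi[(i - 1) % n]) / a ** 2
    return m ** 2 * phi[i] - laplacian

# Arbitrary smooth test configuration (assumed values, for illustration only).
n_sites, spacing, mass = 16, 0.1, 1.5
phi = [math.sin(2 * math.pi * i / n_sites) + 0.3 * math.cos(4 * math.pi * i / n_sites)
       for i in range(n_sites)]
pi_field = [0.2 * math.cos(2 * math.pi * i / n_sites) for i in range(n_sites)]

max_gap = max(
    abs(functional_derivative_phi(phi, pi_field, spacing, mass, i)
        - euler_derivative_phi(phi, spacing, mass, i))
    for i in range(n_sites)
)
print("largest mismatch between the two derivatives:", max_gap)
```

The factor $1/a$ in `functional_derivative_phi` converts an ordinary partial derivative with respect to a site variable into the lattice analogue of a functional derivative; the integration by parts hidden in the Euler derivative shows up as the backward difference of the forward-difference gradient.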
The restoration of spacetime symmetry clearly requires that we reintroduce the time-
derivative φ̇ in favor of the canonical momentum πφ , inasmuch as spatial gradients
of the field φ are already in evidence. This can be done without loss of dynamical
information by the use of a Legendre transformation: one introduces a Lagrange density
5 The canonical treatment of theories of spin-1 fields involving a local gauge symmetry introduces further
subtleties which we shall defer to Chapter 15.
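$$\mathcal{L}(\dot{\phi}, \phi, \vec{\nabla}\phi) \equiv \pi^\phi\dot{\phi} - \mathcal{H}(\pi^\phi, \vec{\nabla}\pi^\phi, \phi, \vec{\nabla}\phi) \tag{12.40}$$
where $\pi^\phi$ is to be expressed in terms of $\dot{\phi}$ (and possibly $\phi$ and $\vec{\nabla}\phi$ as well) by solving
the second Hamiltonian equation
$$\left.\frac{d\mathcal{H}}{d\pi^\phi}\right|_{\phi} = \dot{\phi} \tag{12.41}$$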
Why the process of Legendre transformation should not only succeed in reintroducing
time-derivatives, but do so in just such a way as to manifest directly the Lorentz-
invariance of the theory is not obvious a priori (although we are about to demonstrate
this explicitly starting with the Hamiltonian equations of motion): the underlying
reason for the crucial role played by the Legendre transform in connecting a Lorentz-
invariant formulation of the theory with the energy operator of the theory will become
clear in the next section, when we discuss the Action Principle and Noether’s theorem.
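The mechanics of the Legendre transformation can be illustrated in a zero-dimensional toy version of the problem. The sketch below is an assumption-laden illustration, not the construction used in the text: it takes a single degree of freedom with $H(\pi,\phi) = \tfrac{1}{2}(\pi + c)^2 + \tfrac{1}{2}m^2\phi^2$, where the constant $c$ is a hypothetical classical stand-in for the fermion bilinear $g\bar{\psi}\gamma^0\gamma_5\psi$, solves $\partial H/\partial\pi = \dot{\phi}$ numerically for $\pi$, and forms $L = \pi\dot{\phi} - H$.

```python
def legendre_lagrangian(hamiltonian, phidot, phi, bracket=(-100.0, 100.0), step=1e-5):
    """Return (pi, L) with L = pi*phidot - H(pi, phi), where pi solves dH/dpi = phidot.

    Assumes H is convex in pi, so that dH/dpi is increasing and bisection applies.
    """
    def dH_dpi(p):
        # Central-difference derivative of H with respect to the momentum.
        return (hamiltonian(p + step, phi) - hamiltonian(p - step, phi)) / (2.0 * step)

    lo, hi = bracket
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dH_dpi(mid) < phidot:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    return p, p * phidot - hamiltonian(p, phi)

M_SQ = 2.0  # m^2 (assumed value)
C = 0.4     # stand-in number for g * psibar gamma^0 gamma_5 psi (assumed value)

def toy_hamiltonian(p, phi):
    """H = (pi + c)^2 / 2 + m^2 phi^2 / 2."""
    return 0.5 * (p + C) ** 2 + 0.5 * M_SQ * phi ** 2

p_sol, lag = legendre_lagrangian(toy_hamiltonian, 0.7, 1.3)
# Closed form for comparison: pi = phidot - c, L = phidot^2/2 - c*phidot - m^2 phi^2/2.
expected = 0.5 * 0.7 ** 2 - C * 0.7 - 0.5 * M_SQ * 1.3 ** 2
print(p_sol, lag, expected)
```

In this toy setting the shift of the momentum by $c$ reappears in the Lagrangian as a term linear in $\dot{\phi}$, while the term quadratic in $c$ present in $H$ drops out of $L$.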
Returning to (12.40), we find for the Euler derivative of the Lagrangian with respect
to φ (holding φ̇ fixed),
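$$\frac{d\mathcal{L}}{d\phi} = \frac{d\pi^\phi}{d\phi}\dot{\phi} - \frac{d\mathcal{H}}{d\pi^\phi}\frac{d\pi^\phi}{d\phi} - \left.\frac{d\mathcal{H}}{d\phi}\right|_{\pi^\phi} \tag{12.42}$$
$$= -\left.\frac{d\mathcal{H}}{d\phi}\right|_{\pi^\phi} \tag{12.43}$$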
where in (12.45) we have used the first Hamiltonian equation (12.38). Comparing
(12.43) and (12.45), we conclude that
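$$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{\phi}} = \frac{d\mathcal{L}}{d\phi} = \frac{\partial\mathcal{L}}{\partial\phi} - \vec{\nabla}\cdot\frac{\partial\mathcal{L}}{\partial\vec{\nabla}\phi} \tag{12.46}$$
$$\Rightarrow\ \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \frac{\partial\mathcal{L}}{\partial\phi} \tag{12.47}$$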
$$\mathcal{L}_0 = \bar{\psi}(i\gamma^0\partial_0 - i\vec{\gamma}\cdot\vec{\nabla} - M)\psi \tag{12.48}$$
Thus, we always have $\pi^\psi = i\bar{\psi}\gamma^0$, and the Legendre transform (starting with the
Hamiltonian, say) simply amounts to inserting $\pi^\psi\dot{\psi}$ into the (negative) Hamiltonian
density $-\bar{\psi}(i\vec{\gamma}\cdot\vec{\nabla} + M)\psi$ (thereby covariantizing it), and replacing $\pi^\psi \to i\bar{\psi}\gamma^0$
throughout (in both free and interaction terms).
Once again, our toy derivative-coupled theory, defined by the Hamiltonian (12.22),
provides a convenient explicit example. The Lagrangian is easily constructed: we have
(now including the fermionic contributions)
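$$\mathcal{L} = \pi^\phi\dot{\phi} + \pi^\psi\dot{\psi} - \mathcal{H} \tag{12.49}$$
$$\pi^\phi = \dot{\phi} - g\bar{\psi}\gamma^0\gamma_5\psi \tag{12.50}$$
$$\pi^\psi = i\psi^\dagger = i\bar{\psi}\gamma^0 \tag{12.51}$$
$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi\,\partial^\mu\phi - m^2\phi^2) + \bar{\psi}(i\partial\!\!\!/ - M)\psi - g\bar{\psi}\gamma^\mu\gamma_5\psi\,\partial_\mu\phi \tag{12.52}$$
$$= \mathcal{L}_0 + \mathcal{L}_{\rm int} \tag{12.53}$$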
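$$\mathcal{L}_0 \equiv \frac{1}{2}(\partial_\mu\phi\,\partial^\mu\phi - m^2\phi^2) + \bar{\psi}(i\partial\!\!\!/ - M)\psi \tag{12.54}$$
$$\mathcal{L}_{\rm int} \equiv -g\bar{\psi}\gamma^\mu\gamma_5\psi\,\partial_\mu\phi \tag{12.55}$$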
with an interaction term Lint which is clearly a Lorentz scalar, and just the negative
of the interaction Hamiltonian density (12.1) with which we began the chapter, in an
attempt to construct a derivatively-coupled theory of pions and nucleons. The free
Lagrangian L0 is a sum of scalar and Dirac field contributions which the reader may
easily verify lead directly (by reversing the Legendre transformation, to return to a
Hamiltonian) to the usual free scalar and Dirac Hamiltonians incorporated in (12.22).
We see now clearly that our original criterion for Lorentz-invariance, introduced
in Section 5.5—namely, to construct interaction Hamiltonians as scalar densities
built from products of the underlying covariant fields—was not too far from the
mark, except that the correct prescription in general requires that we choose the
Lagrangian to be a Lorentz scalar. In scalar theories without derivative coupling,
one has Hint = −Lint , so the two prescriptions in fact coincide. In the present case,
the presence of time-derivatives of the φ field in the interaction results in an extra
non-scalar contribution—the quartic seagull term!—to the Hamiltonian interaction
density (cf. (12.13)). Precisely such a term is needed, as we saw in Section 12.1,
to restore the Lorentz-invariance of the amplitudes of the theory, by cancelling the
non-covariant Schwinger terms which appear in propagators of the gradient field ∂μ φ
appearing in the interaction. Of course, in practice it is far easier to start with a Lorentz
scalar Lagrangian and generate the correct Hamiltonian (including any non-covariant
seagull interactions, if necessary) by an algebraically trivial6 Legendre transformation
than to try to guess the form of the interaction Hamiltonian needed to absorb non-
covariant terms in the propagators. In the following section we shall give a much more
general discussion of the Lagrangian formalism, in which it will become clear that it
provides the natural framework for incorporating and expressing the symmetries of the
theory (including symmetries beyond those directly associated with Lorentz/Poincaré
invariance).
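The origin of the seagull term in the Legendre transformation can be made explicit by isolating the terms of (12.52) that contain $\dot{\phi}$. The following is a sketch that treats the fields classically (operator ordering is ignored) and uses the shorthand $J^0 \equiv \bar{\psi}\gamma^0\gamma_5\psi$, introduced here only for this illustration:
$$\mathcal{L} \supset \tfrac{1}{2}\dot{\phi}^2 - gJ^0\dot{\phi}, \qquad \pi^\phi = \frac{\partial\mathcal{L}}{\partial\dot{\phi}} = \dot{\phi} - gJ^0$$
$$\pi^\phi\dot{\phi} - \left(\tfrac{1}{2}\dot{\phi}^2 - gJ^0\dot{\phi}\right) = \tfrac{1}{2}\dot{\phi}^2 = \tfrac{1}{2}(\pi^\phi)^2 + g\,\pi^\phi J^0 + \tfrac{1}{2}g^2(J^0)^2$$
The last term is quartic in the fermion field and is not a Lorentz scalar: it is the seagull contribution. The fermionic piece $\pi^\psi\dot{\psi} = i\bar{\psi}\gamma^0\dot{\psi}$ cancels against the time-derivative part of the Dirac Lagrangian and therefore plays no role in this counting. The result is consistent with (12.33), since differentiating the right-hand side with respect to $\pi^\phi$ gives $\pi^\phi + gJ^0 = \dot{\phi}$.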
The preceding discussion has focussed on the operator formulation of field theory
(in particular, on the Heisenberg field equations of the theory). A completely parallel
discussion of the relation between Hamiltonian and Lagrangian formulations can be
given using the path-integral formulation. In this approach, the relevance of a Lorentz-
invariant Lagrangian to the appearance of fully Lorentz-invariant amplitudes can be
seen in a much more direct fashion, as it allows us to circumvent completely the
appearance of non-covariant Schwinger terms and seagull vertices, and demonstrate
directly a set of Feynman rules with no non-covariant elements. We shall see that under
fairly general circumstances, the Legendre transformation connecting a field-theoretic
Hamiltonian and Lagrangian is exactly equivalent to a functional Fourier transform.
We begin with the bosonic case. Suppose that our Hamiltonian density is a function
of N bosonic fields φn , n = 1, 2, . . . N , where the φn may be individually scalar fields,
6 The asserted triviality is, however, absent in the presence of local gauge symmetries, where the canonical
procedure becomes quite delicate, as we shall see in Chapter 15.
so, as promised, the functional Fourier transform induced by the integration over
momentum fields in the path integral has effected precisely the algebraic Legendre
transformation from the Hamiltonian to the Lagrangian. The full generating functional
of the theory is then obtained by a further functional integration over the φn , including,
as usual, for convenience, source functions to allow us to generate the n-point functions
of the theory by functional differentiation,
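$$Z[j] = \int\mathcal{D}\phi_n\, e^{i\int(\mathcal{L}(\dot{\phi}_n,\phi_n,\vec{\nabla}\phi_n) - j_n(x)\phi_n(x))\,d^4x} \equiv \int\mathcal{D}\phi_n\, e^{\frac{i}{\hbar}I[\phi_n,\partial_\mu\phi_n] - i\int j_n(x)\phi_n(x)\,d^4x} \tag{12.60}$$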
where we have reinserted the usually invisible factor of Planck’s constant in the final
expression, for reasons shortly to become apparent. For theories where the Lagrange
7 The form of Hamiltonian density assumed here takes care, for example, of the situations encountered
in the quantization of massive or massless abelian gauge vector fields coupled to scalar or fermionic matter
fields (see Problems 4 and 5)—in the massless abelian case, under the proviso that the Hamiltonian has
been evaluated in a “physical” gauge in which all the gauge freedom has been removed. The more subtle
aspects of the canonical quantization procedure, which emerge in theories with a local gauge symmetry,
will be discussed in detail in Chapter 15.
density L ends up being a Lorentz scalar therefore, the Feynman rules (vertices and
propagators) generated by this path integral will clearly lead to Green functions (and
eventually, via LSZ, to S-matrix amplitudes) behaving appropriately under Lorentz
transformation. In practice, as emphasized previously, we begin by specifying the
dynamics of the theory in terms of a Lorentz-invariant action I (= spacetime integral
of the Lagrangian density). The classical Principle of Least Action amounts, as is
apparent from the functional integral (12.60), to a stationary phase approximation
in which the integral is dominated, in the limit of very small $\hbar$, by fields $\phi_n^{\rm cl}(x)$
which lead to extremal values of the action integral $I$. We shall see in the next
section that these fields are precisely those satisfying the Euler–Lagrange equations.
From the point of view of quantum field theory, the discussion here shows that the
computation of Green functions and scattering amplitudes can in fact proceed entirely
at the Lagrangian level, using the representation (12.60), with no need to refer to the
Hamiltonian of the theory (which is frequently a much more complicated object than
the Lagrangian, especially, as we shall see later, in gauge theories).
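The stationarity of the action at the classical solution, which underlies the stationary phase argument, can be illustrated in a one-particle analogue. The sketch below is a toy example under stated assumptions (a unit-mass harmonic oscillator, a midpoint discretization of the action, and a particular endpoint-preserving variation, all chosen for this illustration and not taken from the text). It checks that the action changes only at second order when the classical path is perturbed.

```python
import math

def discrete_action(path, dt, omega):
    """Sum over time steps of (v^2/2 - omega^2 x_mid^2/2) * dt for a unit-mass oscillator."""
    total = 0.0
    for k in range(len(path) - 1):
        velocity = (path[k + 1] - path[k]) / dt
        midpoint = 0.5 * (path[k] + path[k + 1])
        total += (0.5 * velocity ** 2 - 0.5 * omega ** 2 * midpoint ** 2) * dt
    return total

def varied_path(eps, steps, total_time, omega):
    """Classical path sin(omega t) plus eps times a bump that vanishes at both endpoints."""
    dt = total_time / steps
    return [math.sin(omega * k * dt) + eps * math.sin(math.pi * k / steps)
            for k in range(steps + 1)]

STEPS, TOTAL_TIME, OMEGA = 400, 1.0, 1.0
DT = TOTAL_TIME / STEPS

def action_at(eps):
    """Discretized action evaluated on the classical path displaced by eps times the bump."""
    return discrete_action(varied_path(eps, STEPS, TOTAL_TIME, OMEGA), DT, OMEGA)

EPS = 1e-2
# Coefficient of eps in the action: vanishes (up to discretization error) on the classical path.
first_order = (action_at(EPS) - action_at(-EPS)) / (2 * EPS)
# Twice the coefficient of eps^2: the first non-vanishing response to the variation.
second_order = (action_at(EPS) + action_at(-EPS) - 2 * action_at(0.0)) / EPS ** 2
print(first_order, second_order)
```

Because the phase $I/\hbar$ is stationary only near such paths, neighbouring paths add coherently there and cancel elsewhere as $\hbar \to 0$; the same reasoning carries over to field configurations in (12.60).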
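In the fermionic case, the situation is even simpler. We saw earlier that the
fermionic Lagrangian $\mathcal{L}(\psi, \partial_\mu\psi)$ is algebraically identical to $\pi^\psi\dot{\psi} - \mathcal{H}(\pi^\psi, \psi, \vec{\nabla}\psi)$ (taking
just a single Fermi field for simplicity). Thus the transition between the two
formulations in the path integral context does not even require us to perform a
functional integral: we simply replace $\mathcal{D}\pi^\psi \to \mathcal{D}\bar{\psi}$ and $\pi^\psi \to i\bar{\psi}\gamma^0$ to convert the
Hamiltonian functional integral into the Lagrangian one:
$$\int\mathcal{D}\pi^\psi\mathcal{D}\psi\, e^{i\int(\pi^\psi\dot{\psi} - \mathcal{H}(\pi^\psi,\psi,\vec{\nabla}\psi))\,d^4x} \to \int\mathcal{D}\bar{\psi}\mathcal{D}\psi\, e^{i\int\mathcal{L}(\psi,\partial_\mu\psi)\,d^4x} \tag{12.61}$$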
$1/p^2$ at large $p$, which we saw in Section 9.5 is incompatible with the Källén–Lehmann
spectral representation of the two-point function of a local field theory formulated on
a positive-definite Hilbert space.8 On the other hand, any interaction (higher than
quadratic) terms in the Lagrangian involving more than single derivatives of the fields
turn out (in four spacetime dimensions) to violate perturbative renormalizability, as
we shall see in Part 4 of the book. Indeed, the renormalizable gauge field theories of the
Standard Model all satisfy our basic assumption and involve Lagrangians which can
be written in the form $\mathcal{L}(\phi_n, \partial_\mu\phi_n)$, where $n$ is an index labeling the independent fields
of the theory. The extremal condition for an action obtained from such a Lagrangian
is therefore (again relying on the freedom to integrate by parts)
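$$0 = \delta I = \int\delta\mathcal{L}\,d^4x$$
$$= \int\left(\frac{\partial\mathcal{L}}{\partial\phi_n(x)}\delta\phi_n(x) + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n(x))}\partial_\mu\delta\phi_n(x)\right)d^4x$$
$$= \int\left(\frac{\partial\mathcal{L}}{\partial\phi_n(x)} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n(x))}\right)\delta\phi_n(x)\,d^4x, \quad \forall\,\delta\phi_n(x) \tag{12.62}$$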
As the first-order variation of the action must vanish for arbitrary local variations
δφn (x) of the independent fields of the theory, the Euler–Lagrange equations follow
directly:
$$\partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n(x))} = \frac{\partial\mathcal{L}}{\partial\phi_n(x)} \tag{12.63}$$
We have already seen that the covariant form of this expression ensures that the
dynamical equations of the theory, as encapsulated in these Euler–Lagrange equations,
will be rendered compatible with the demands of special relativity by the simple
device of choosing the Lagrange density to be a Lorentz scalar constructed from the
underlying fields φn .
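As a concrete instance of (12.63), consider a single self-interacting scalar field. The Lagrangian below is the standard $\lambda\phi^4$ form with a mass term, assumed here for illustration:
$$\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4$$
$$\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \partial^\mu\phi, \qquad \frac{\partial\mathcal{L}}{\partial\phi} = -m^2\phi - \frac{\lambda}{3!}\phi^3$$
$$\Rightarrow\quad \Box\phi + m^2\phi + \frac{\lambda}{3!}\phi^3 = 0$$
Every term in the resulting equation is a Lorentz scalar, as expected for a scalar Lagrangian density built from a scalar field.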
Our task for the remainder of this section is to display the ubiquitous role played
by the action formulation in the study of symmetries, employing as our basic tool
the beautiful result of Emmy Noether, dating from 1918 (translation of original
paper in (Noether, 1971)), that connects the symmetries of a theory defined in terms
of an action functional with conserved currents expressing the exact conservation
laws implied by the dynamics of the theory. The Noether theorem allows for the
discussion of the symmetries of the theory—whether spacetime related or “internal”—
in a completely unified way, and therefore simplifies enormously the task of reading
off from a given action the symmetries of the theory, or conversely, the construction
of actions representing theories with desired conservation laws.
Noether’s theorem predates by several years the introduction of quantized fields
(indeed, of quantum mechanics itself, in its post-Heisenberg–Schrödinger form), and
concerns the symmetry and conservation properties of classical field theories, such
8 For an explicit demonstration of the failure of positivity in the context of canonical quantization of
higher-derivative theories, see (Bernard and Duncan, 1975).
$$x^\mu \to x'^\mu = x^\mu + \omega^\mu{}_\nu x^\nu \tag{12.67}$$
$$\phi_n(x) \to \phi'_n(x') = M_{nm}(\Lambda)\phi_m(x) \tag{12.68}$$
For example, if $\phi(x)$ is a scalar field, transforming like $\phi(x) \to \phi'(x') = \phi(x)$, its
four-gradient vector field transforms like
$$\partial_\mu\phi(x) \to \frac{\partial}{\partial x'^\mu}\phi'(x') = \frac{\partial x^\nu}{\partial x'^\mu}\frac{\partial}{\partial x^\nu}\phi(x) = \Lambda_\mu{}^\nu\,\partial_\nu\phi(x) \tag{12.69}$$
in accordance with (12.68). Of course, in making the Lorentz transformation in (12.64),
we must also transform the domain of integration $\Omega$ to $\Omega'$, where $x' \in \Omega'$ if and only if
$x \in \Omega$, thereby ensuring the invariance of $I_\Omega$ under Lorentz transformations provided
$\mathcal{L}$ is constructed as a Lorentz scalar composite of the component fields $\phi_n$.
Noether’s theorem, the stress-energy tensor, and all that stuff 429
Leaving aside temporarily the special case of Lorentz transformations, we see that
invariance of the action under (12.65, 12.66) amounts to
$$\delta I_\Omega = \int_{\Omega'}\mathcal{L}(\phi'_n(x'), \partial'_\mu\phi'_n(x'))\,d^4x' - \int_\Omega\mathcal{L}(\phi_n(x), \partial_\mu\phi_n(x))\,d^4x = 0 \tag{12.70}$$
$$\det\left(\frac{\partial x'^\mu}{\partial x^\nu}\right) = 1 + \frac{\partial}{\partial x^\mu}\delta x^\mu \tag{12.71}$$
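so, changing variables from $x'$ back to $x$ in the first integral, and neglecting second-order
infinitesimals,
$$\delta I_\Omega = \int_\Omega\left[\mathcal{L}(\phi_n + \delta\phi_n, \partial_\mu\phi_n + \delta\partial_\mu\phi_n)\left(1 + \frac{\partial}{\partial x^\mu}\delta x^\mu\right) - \mathcal{L}(\phi_n, \partial_\mu\phi_n)\right]d^4x$$
$$= \int_\Omega\left[\frac{\partial\mathcal{L}}{\partial\phi_n}\delta\phi_n + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\delta(\partial_\mu\phi_n) + \mathcal{L}\frac{\partial}{\partial x^\mu}(\delta x^\mu)\right]d^4x, \quad \forall\,\Omega$$
$$\Rightarrow\ \frac{\partial\mathcal{L}}{\partial\phi_n}\delta\phi_n + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\delta(\partial_\mu\phi_n) + \mathcal{L}\frac{\partial}{\partial x^\mu}(\delta x^\mu) = 0 \tag{12.72}$$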
There is a subtlety in the middle term of the final expression (12.72), arising from the
fact that the variation δ and spacetime-derivative ∂μ do not in general commute:
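$$\partial'_\mu\phi'_n(x') = \frac{\partial}{\partial x'^\mu}\phi'_n(x')$$
$$= \frac{\partial}{\partial x'^\mu}(\phi_n(x) + \delta\phi_n(x))$$
$$= \frac{\partial x^\nu}{\partial x'^\mu}\frac{\partial}{\partial x^\nu}(\phi_n(x) + \delta\phi_n(x))$$
so we find
$$\partial'_\mu\phi'_n(x') = \left(g^\nu{}_\mu - \frac{\partial\delta x^\nu}{\partial x^\mu}\right)(\partial_\nu\phi_n(x) + \partial_\nu\delta\phi_n(x))$$
$$= \partial_\mu\phi_n(x) + \partial_\mu\delta\phi_n(x) - \partial_\nu\phi_n(x)\frac{\partial\delta x^\nu}{\partial x^\mu}$$
$$\Rightarrow\ \delta(\partial_\mu\phi_n(x)) = \partial_\mu\delta\phi_n(x) - \partial_\nu\phi_n(x)\frac{\partial\delta x^\nu}{\partial x^\mu} \tag{12.73}$$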
Inserting δ(∂μ φn ) from (12.73) in the invariance condition (12.72) we obtain
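$$\frac{\partial\mathcal{L}}{\partial\phi_n}\delta\phi_n + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\left(\partial_\mu\delta\phi_n - \partial_\nu\phi_n\frac{\partial\delta x^\nu}{\partial x^\mu}\right) + \mathcal{L}\frac{\partial}{\partial x^\mu}(\delta x^\mu) = 0 \tag{12.74}$$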
One final rearrangement of this identity leads to a convenient form for the statement
of Noether’s theorem. Define the intrinsic change in $\phi_n$ as $\delta^*\phi_n \equiv \delta\phi_n - \delta x^\mu\,\partial\phi_n/\partial x^\mu$: this
is the change in the field other than that due to the shift in coordinates. Multiplying
out the factors and cancelling terms, a little algebra shows
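$$\delta^*\phi_n\left\{\frac{\partial\mathcal{L}}{\partial\phi_n} - \partial_\nu\frac{\partial\mathcal{L}}{\partial(\partial_\nu\phi_n)}\right\} + \partial_\mu\left\{\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\delta\phi_n + \left[g^{\mu\nu}\mathcal{L} - \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial^\nu\phi_n\right]\delta x_\nu\right\}$$
$$= \frac{\partial\mathcal{L}}{\partial\phi_n}\delta\phi_n + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\left(\partial_\mu\delta\phi_n - \partial_\nu\phi_n\frac{\partial\delta x^\nu}{\partial x^\mu}\right) + \mathcal{L}\frac{\partial}{\partial x^\mu}(\delta x^\mu) = 0 \tag{12.75}$$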
For fields satisfying the Euler–Lagrange equations of motion, the term proportional
to δ ∗ φn vanishes, and we are left with the desired Noether theorem:
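$$\partial_\mu J^\mu(x) = 0, \qquad J^\mu \equiv \left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial^\nu\phi_n - g^{\mu\nu}\mathcal{L}\right]\delta x_\nu - \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\delta\phi_n \tag{12.76}$$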
To summarize: invariances of the action under infinitesimal variations taking the form
(12.65, 12.66) stand in one-to-one correspondence with conserved currents J μ , each of
which in turn gives rise to a conserved charge Q, preserved under the dynamics entailed
by the Euler–Lagrange equations of the theory. After quantization, the latter equations
simply embody the dynamics of the Heisenberg fields of the theory, so we may at least
hope that quantized currents formed by simply replacing the classical fields from
which the Noether currents are built with the corresponding Heisenberg fields will
also provide conserved objects at the quantum level. This is by no means guaranteed
a priori: as we mentioned earlier, the necessity for careful definition of the composite
operators appearing in the currents, due to ordering difficulties and/or short-distance
singularities, may interfere with the implicit smoothness properties assumed in the
purely classical derivation of the Noether theorem, leading to a “quantum anomaly”,
or violation of a classically conserved quantity at the quantum level. The examples
given below (apart from dilatation symmetry) will not, however, be infected with the
anomaly disease, to which we return in Chapter 15.
We shall shortly see that the combination of fields multiplying δxν in (12.76) plays
a fundamental role for relativistically invariant theories. The free μ, ν indices suggest
that we define a second-rank “energy-momentum” tensor T μν , the physical significance
of which will shortly emerge, as
$$T^{\mu\nu} \equiv \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial^\nu\phi_n - g^{\mu\nu}\mathcal{L} \tag{12.79}$$
Applications of Noether’s theorem 431
$$x^\mu \to x'^\mu = x^\mu + \epsilon\,g^{\mu\sigma}, \quad \sigma = 0, 1, 2, 3 \tag{12.83}$$
$$\delta\phi_n(x) = 0 \tag{12.84}$$
Indeed, this is simply the statement of invariance of the spacetime integral under
the change of variables embodied in (12.83). The quantity $\epsilon$ is an arbitrary positive
infinitesimal constant, which will be divided out of the definition of the current at the
end. There are evidently four independent symmetries, corresponding to time ($\sigma = 0$)
and space ($\sigma = 1, 2, 3$) translations, and giving rise to the currents (cf. (12.80))
$$J^{\mu\sigma} = T^{\mu\sigma}, \quad \sigma = 0, 1, 2, 3 \tag{12.85}$$
The reason for attaching the name “energy-momentum tensor” to $T^{\mu\nu}$ is now apparent.
Relabeling the conserved charge associated with $J^{\mu\sigma}$ as $P^\sigma$ (rather than $Q^\sigma$, say), we
have from (12.79),
$$P^\sigma = \int d^3x\left\{\frac{\partial\mathcal{L}}{\partial\dot{\phi}_n}\partial^\sigma\phi_n - g^{0\sigma}\mathcal{L}\right\} \tag{12.86}$$
and in particular, for the time-component (recall that the conjugate momentum fields
are $\pi_n \equiv \partial\mathcal{L}/\partial\dot{\phi}_n$)
$$P^0 = \int d^3x\,\{\pi_n\dot{\phi}_n - \mathcal{L}\} = \int d^3x\,\mathcal{H}(\vec{x},t) = H \tag{12.87}$$
the corresponding charge is simply the full Hamiltonian, given as the Legendre trans-
form of the Lagrangian. The fundamental role played by the Legendre transform
in connecting the Lagrangian and Hamiltonian forms is therefore seen to emerge
inescapably from the Noether treatment of the time-translational symmetry of the
theory. For the spatial components of the four-vector Pσ we find
$$\vec{P} = \int d^3x\,\pi_n(\vec{x},t)\,\vec{\nabla}\phi_n(\vec{x},t) \tag{12.88}$$
The interpretation of this spatial vector as the spatial momentum can be seen if we
take the case of purely bosonic fields, and impose the standard equal-time commutator
relations
so that P generates spatial translations of the fields of the theory, as expected of the
spatial momentum operator. The Noether charges Pσ given in (12.86) are therefore
nothing but our old friend the energy-momentum four-vector of the theory, with
the “columns” of the energy-momentum tensor T μν giving the associated conserved
currents (thus, ∂μ T μν = 0).
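The conservation law $\partial_\mu T^{\mu\nu} = 0$ can be verified directly in a simple case. As a sketch, take a single scalar field with $\mathcal{L} = \tfrac{1}{2}\partial_\rho\phi\,\partial^\rho\phi - V(\phi)$, where the potential $V$ is left arbitrary (an assumption made for this illustration), so that (12.79) gives $T^{\mu\nu} = \partial^\mu\phi\,\partial^\nu\phi - g^{\mu\nu}\mathcal{L}$. Then
$$\partial_\mu T^{\mu\nu} = (\Box\phi)\,\partial^\nu\phi + \partial^\mu\phi\,\partial_\mu\partial^\nu\phi - \partial^\nu\mathcal{L}, \qquad \partial^\nu\mathcal{L} = \partial_\rho\phi\,\partial^\nu\partial^\rho\phi - V'(\phi)\,\partial^\nu\phi$$
$$\Rightarrow\quad \partial_\mu T^{\mu\nu} = \left(\Box\phi + V'(\phi)\right)\partial^\nu\phi = 0$$
where the final equality holds for fields satisfying the Euler–Lagrange equation $\Box\phi + V'(\phi) = 0$. The conservation is thus a consequence of the dynamics, not an identity valid for arbitrary field configurations.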
The physical interpretation of the energy-momentum tensor defined in (12.79) is
actually quite subtle. From the point of view of “flat space” field theory (formulated
in Minkowski space), the only relevant property of the tensor density T μν (x) is that
the spatial integrals $\int T^{0\sigma}\,d^3x$ reproduce the conserved energy-momentum four-vector
components Pσ which implement spacetime translations on the Heisenberg fields of the
theory. The whole axiomatic formulation of interacting field theory à la Wightman,
for example, only relies on the existence of the conserved generators of the Poincaré
group, not on the presence of a set of “charge” densities which can be integrated to
give these operators. Thus, nothing is altered from the point of view of Minkowski
field theory if we “redistribute” the energy and momentum density on any time-slice
as long as the spatial integral preserves the total energy and momentum of the field as
given by Pσ . For example, we can certainly alter the “canonical” energy-momentum
tensor (12.79) (to which we now add a “c” subscript to indicate its special origin in
the canonical Noether procedure) by adding a “superpotential” term:
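$$T_c^{\mu\nu} \equiv \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_n)}\partial^\nu\phi_n - g^{\mu\nu}\mathcal{L} \ \to\ T^{\mu\nu} = T_c^{\mu\nu} + \partial_\lambda S^{\lambda\mu\nu}, \qquad S^{\lambda\mu\nu} = -S^{\mu\lambda\nu} \tag{12.91}$$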
as the divergence of the added term (on the μ index) is automatically zero due to
the antisymmetry property of the superpotential, so that the modified tensor leads to
exactly the same energy-momentum vector as the canonical version.
The actual local distribution of energy and momentum only acquires physical
significance if there are fields in the theory which couple directly, and locally, to the
energy-momentum tensor. In fact, once we include gravitational effects along the lines
of general relativity, such a field appears immediately in the form of the now dynamical
spacetime metric gμν (x). Once a generally covariant action functional is constructed
for the particular matter fields in the background metric gμν (x), the variation of the
action with respect to the metric is precisely the energy-momentum tensor T μν (x) of
these fields. In particular, in the weak field limit for the gravitational field, where
we expand gμν (x) = ημν + hμν (x), and now ημν = diag(1, −1, −1, −1) is the fixed
Minkowski metric (which we have heretofore simply called gμν !), hμν (x) acts as an
interpolating field for gravitons, and the term hμν (x)Tμν (x) of first-order in the metric
deviation is the appropriate interaction Lagrangian density if we wish to compute S-
matrix amplitudes for processes involving one graviton and multiple matter particles.
The specific choice of the spatial dependence of Tμν (x) clearly becomes physically
significant in this case. In particular, the Tμν tensor obtained by metric variation of
the generally covariant matter action is manifestly a symmetric tensor, Tμν = Tνμ , a
property not guaranteed by the Noether expression (12.79), and indeed there are cases
where the need to construct a generally covariant action necessitates the addition
of a superpotential term to the canonical tensor Tcμν in order to obtain a properly
symmetric tensor. For scalar field theory (e.g., λφ4 theory), the canonical tensor Tcμν
is already symmetric,
$$T_c^{\mu\nu} = \partial^\mu\phi\,\partial^\nu\phi - g^{\mu\nu}\mathcal{L} \tag{12.92}$$
but leads to ultraviolet divergences when used as the current coupled to gravitons in
single graviton-multi-scalar scattering amplitudes, as shown by Callan, Coleman, and
Jackiw (Callan et al., 1970). The situation is remedied by adding a term $-\frac{1}{12}R(x)\phi^2(x)$
to the generally covariant Lagrangian density, where $R(x)$ is the curvature scalar (so
that the added term vanishes in flat space), thereby modifying the energy-momentum
$$T^{\mu\nu}_{\rm impr}(x) = T_c^{\mu\nu}(x) - \frac{1}{6}(\partial^\mu\partial^\nu - g^{\mu\nu}\Box)(\phi^2(x)) \tag{12.93}$$
We shall see shortly that this improved tensor also leads to simplified forms for the
currents associated with dilatation and conformal symmetry.
Example 2: Invariance under the homogeneous Lorentz group.
In this case the invariance obtains once we have constructed an action as the spacetime
integral of a scalar Lagrangian density, built out of properly contracted covariant fields.
The infinitesimal transformations are those given in (12.67, 12.68), repeated here for
convenience:
$$x^\mu \to x'^\mu = x^\mu + \omega^\mu{}_\nu x^\nu \tag{12.94}$$
$$\phi_n(x) \to \phi'_n(x') = M_{nm}(\Lambda)\phi_m(x) \tag{12.95}$$
By definition (cf. Section 7.2) the representatives Mnm (Λ) of a Lorentz transformation
Λ in a particular (finite-dimensional) representation (which may be a direct sum of
irreducible representations) are given in terms of the generators (J μν )nm and the
infinitesimal rotation angles and boost rapidities ωμν by
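$$M_{nm}(\Lambda) = \delta_{nm} + \frac{i}{2}\omega_{\mu\nu}(J^{\mu\nu})_{nm} + O(\omega^2), \qquad \omega_{\mu\nu} = -\omega_{\nu\mu} \tag{12.96}$$
so
$$\delta\phi_n(x) \equiv \phi'_n(x') - \phi_n(x) = \frac{i}{2}\omega_{\mu\nu}(J^{\mu\nu})_{nm}\phi_m(x) \tag{12.97}$$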
There are six independent choices for the ωμν corresponding to rotations around or
boosts along the three spatial axes. Let us pick a specific one by choosing a pair κ, λ
with 0 ≤ κ < λ ≤ 3, and setting
shall use the same letter M to denote both the current and its charge:
$$M^{\kappa\lambda} \equiv \int d^3x\,M^{0\kappa\lambda}, \qquad \dot{M}^{\kappa\lambda} = 0 \tag{12.100}$$
the first two terms of which clearly give the orbital angular momentum of the system,
whereas the last term, present only for fields transforming non-trivially under the
Lorentz group, must correspond to spin angular momentum. That the total charge
Mij indeed corresponds to the total angular momentum operator follows from the
fact that it generates, by commutation with the field, and employing the usual equal-
time commutation relations, the correct infinitesimal variation:
A similar result can be obtained for the commutation with the boost operator M0i —
an exercise we leave for the reader (see Problem 6). The result (12.102) indicates that
our Noether charges are indeed the correct generators of the HLG in the state space of
the quantum field theory. Indeed, for a general Lorentz transformation Λ the covariant
field transformation law is (cf.(7.5))
Note that we are working entirely in Heisenberg representation here, so that these
U (Λ)s are really the UH (Λ) of Section 9.1, acting on the fully interacting Heisen-
berg fields of the theory, and in the state space spanned by eigenstates of the full
Hamiltonian: we are omitting the H subscript for simplicity of notation. If we take
Λ infinitesimally close to the identity, $\Lambda^\mu{}_\nu = g^\mu{}_\nu + \omega^\mu{}_\nu$, the corresponding unitary
operators are expressed in terms of the Hilbert space generators $M^{\mu\nu}$ of infinitesimal
Lorentz transformations (see (9.64)), for which we use the same notation as the Noether
charges found above, as they will shortly be seen to be identical:
$$U(\Lambda) = 1 + \frac{i}{2}\omega_{\mu\nu}M^{\mu\nu} + O(\omega^2) \tag{12.104}$$
Inserting (12.104) into (12.103), we find, on expanding every term to first order in ω,
the commutation relation for the generators Mμν with our field φn :
which agrees with (12.102) if we set (μ, ν) → (i, j). One may also compute the
commutators of the various Noether charges with each other, using again the equal-
time commutators of the theory. In this way, the verification of the full Poincaré
algebra, as given in (9.65, 9.67, 9.68), can be carried out explicitly starting from the
expressions for the Noether charges given above.
so that in any theory with exact dilatation symmetry (so that $e^{-i\rho D}|\psi\rangle$ is a physical
state if $|\psi\rangle$ is), the mass spectrum of particles must either be exactly zero or continuous:
clearly not the world we live in! Even for massless theories, we shall see that the
classical Noether dilatation (and conformal) currents and charges are in general
broken by quantum effects (anomalies) once interactions are present, so the formal
existence of the conformal extension of the Poincaré group may seem at first sight to
be a matter of purely formal interest.11 Nevertheless, the nature of the breaking of
the conformal group in interacting field theories is now completely understood, and
has deep and important connections to the renormalization group properties of such
theories which we shall study in detail in Part 4 of the book. Accordingly, we shall
give a brief description of the dilatation current and its connection to the trace of
the energy-momentum tensor, starting again from the general Noether prescription
for construction of a conserved current for a classical symmetry of the action.
First, we need to establish appropriate transformation rules for the fields of the
theory, which we shall take for simplicity to be self-conjugate spin-zero scalars. The
10 The rather strange (and highly non-linear!) expression (12.107) for the conformal transformation can
be understood once we re-express it as a sequence: coordinate inversion $I$, translation $T$, coordinate
inversion $I$, where $Ix^\mu \equiv x^\mu/x^2$, $Tx^\mu = x^\mu - c^\mu$. It follows immediately that the conformal transformations
form an abelian subgroup of the full conformal group.
11 An important exception arises in the case of two-dimensional conformal field theories: it turns out that
there is a rich plethora of such non-trivial field theories displaying exact conformal invariance, which have
been the subject of intensive study in the last thirty years. The situation with exactly conformally invariant
theories in four dimensions is murkier; cf. Section 15.5.
provided we choose the real number $d$ (called the “scale dimension” of the field $\phi$)
equal to unity. For example,
$$\int\phi'(x')^4\,d^4x' = \int e^{4d\rho}\phi(x)^4\,e^{-4\rho}\,d^4x = \int\phi(x)^4\,d^4x \quad \text{if } d = 1 \tag{12.113}$$
On the other hand, this invariance is destroyed the moment we include terms in the
Lagrangian with dimensionful coefficients, such as a mass term $\tfrac{1}{2}m^2\phi^2$ or interaction
terms other than $\phi^4$, i.e., $\lambda^{(n)}\phi^n$, $n \neq 4$.
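The bookkeeping behind this statement is simple enough to automate. The sketch below is an illustration under stated assumptions: it takes the dilatation to act as $x' = e^{-\rho}x$, $\phi'(x') = e^{d\rho}\phi(x)$ (as read off from (12.113)), so that each derivative contributes a factor $e^{\rho}$ and the measure a factor $e^{-4\rho}$; the function name and its signature are hypothetical.

```python
def scaling_exponent(n_fields, n_derivs, scale_dim, spacetime_dim=4):
    """Exponent k such that the integrated monomial is multiplied by exp(k * rho).

    Counts scale_dim per field, +1 per derivative, and -spacetime_dim from the measure,
    assuming x -> exp(-rho) x and phi -> exp(scale_dim * rho) phi.
    """
    return n_fields * scale_dim + n_derivs - spacetime_dim

terms = {
    "kinetic (d phi)^2": scaling_exponent(2, 2, 1),
    "phi^4 interaction": scaling_exponent(4, 0, 1),
    "mass term phi^2": scaling_exponent(2, 0, 1),
    "phi^6 interaction": scaling_exponent(6, 0, 1),
}
for name, exponent in terms.items():
    status = "invariant" if exponent == 0 else "not invariant"
    print(f"{name}: exp({exponent} rho), {status}")
```

With $d = 1$ only the kinetic and quartic terms have vanishing exponent; a non-zero exponent would have to be compensated by rescaling a dimensionful coefficient, which is not part of the symmetry transformation.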
Referring back to the specification of a Noether symmetry in terms of the infinites-
imal transformations of coordinates (12.65) and fields (12.66), our general expression
for the associated Noether current (12.80) gives, for the present case of dilatation
symmetry, (taking ρ infinitesimal, and dividing by −ρ to obtain the conventional
normalization)
$$J^\mu_{\mathrm{dil},c} = T_c^{\mu\nu}x_\nu + d\,\frac{\partial L}{\partial(\partial_\mu\phi)}\,\phi = T_c^{\mu\nu}x_\nu + (\partial^\mu\phi)\phi \tag{12.114}$$
The subscript “c” in the dilatation current and the energy-momentum tensor indicates
that we are using the canonical energy-momentum tensor, as described previously,
without the “improvement” necessary once we couple quantized matter fields to
gravitation. For the theory defined by action (12.111), the canonical energy-momentum
tensor is just
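$$T_c^{\mu\nu} = \partial^\mu\phi\,\partial^\nu\phi - \frac{1}{2}g^{\mu\nu}\partial_\rho\phi\,\partial^\rho\phi + \frac{\lambda}{4!}g^{\mu\nu}\phi^4 \tag{12.115}$$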
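and the conservation of the dilatation current at the classical level follows immediately
(using $\partial_\mu T_c^{\mu\nu} = 0$, and the classical field equation $\Box\phi + \frac{\lambda}{3!}\phi^3 = 0$)
$$\partial_\mu J^\mu_{\mathrm{dil},c} = T^{\ \mu}_{c\ \mu} + \partial_\mu(\phi\,\partial^\mu\phi) = -\partial_\mu\phi\,\partial^\mu\phi + 4\frac{\lambda}{4!}\phi^4 + \partial_\mu\phi\,\partial^\mu\phi + \phi\,\Box\phi = 0 \tag{12.116}$$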
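As a quick numerical cross-check of (12.116), the following sketch evaluates the canonical tensor (12.115) at a single spacetime point, for arbitrary values of the field and its gradient, and confirms that the trace plus the divergence of $\phi\,\partial^\mu\phi$ vanishes once the field equation is imposed. The metric signature (+,−,−,−), the function names, and the sample inputs are assumptions made purely for illustration.

```python
import numpy as np

# Metric g^{mu nu} = diag(+1, -1, -1, -1); the signature is an assumption of this sketch.
g = np.diag([1.0, -1.0, -1.0, -1.0])

def canonical_trace(dphi_up, phi, lam):
    """Trace g_{mu nu} T_c^{mu nu} of the canonical tensor (12.115).

    dphi_up holds the four components of d^mu phi at one point.
    """
    dphi_sq = dphi_up @ g @ dphi_up                      # d_rho phi d^rho phi
    T = np.outer(dphi_up, dphi_up) - 0.5 * g * dphi_sq + (lam / 24.0) * g * phi**4
    return np.sum(g * T)                                 # contract with g_{mu nu}

def dilatation_divergence(dphi_up, phi, lam):
    """Right-hand side of (12.116), using box(phi) = -(lam/3!) phi^3."""
    dphi_sq = dphi_up @ g @ dphi_up
    box_phi = -(lam / 6.0) * phi**3
    return canonical_trace(dphi_up, phi, lam) + dphi_sq + phi * box_phi
```

For instance, `dilatation_divergence(np.array([0.3, -1.2, 0.5, 0.8]), 0.7, 0.3)` should vanish up to rounding error, whatever values are chosen for the gradient, the field, and the coupling.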
current by defining
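$$\begin{aligned}
J^\mu_{\mathrm{dil,impr}} &\equiv J^\mu_{\mathrm{dil},c} + \frac{1}{6}\partial_\sigma(x^\mu\partial^\sigma - x^\sigma\partial^\mu)(\phi^2)\\
&= T_c^{\mu\nu}x_\nu + (\partial^\mu\phi)\phi + \frac{1}{6}\partial_\sigma(x^\mu\partial^\sigma - x^\sigma\partial^\mu)(\phi^2)\\
&= T^{\mu\nu}_{\mathrm{impr}}x_\nu
\end{aligned}\tag{12.117}$$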
where $T^{\mu\nu}_{\mathrm{impr}}$ is the improved energy-momentum tensor (12.93), and conservation of
the dilatation current now reduces simply to the tracelessness of this tensor
$$\partial_\mu J^\mu_{\mathrm{dil,impr}} = T^{\ \ \ \mu}_{\mathrm{impr}\ \mu} = 0 \tag{12.118}$$
The reader may easily verify that the inclusion of a mass term $\frac{1}{2}m^2\phi^2$ in the Lagrangian
results in a non-vanishing divergence of the current, and trace of the energy-momentum
tensor:
$$\partial_\mu J^\mu_{\mathrm{dil,impr}} = T^{\ \ \ \mu}_{\mathrm{impr}\ \mu} = 2m^2\phi^2 \tag{12.119}$$
These results, while classically valid, turn out to be incorrect once we quantize our field
theory: there are additional terms, proportional to Planck’s constant, which appear on
the right-hand side of both (12.118) and (12.119). They provide our first example of the
famous quantum anomalies (in the present case, the “trace anomaly”) of interacting
quantum field theory, which we shall discuss in detail in Chapter 15. The important
lesson which we need to take away from the present discussion is that the extension
of the Poincaré group to the larger conformal group cannot be carried through in
interacting quantum field theories (in four dimensions12 ). This is in contrast to the
supersymmetric extension of the Poincaré group which we shall discuss below, where
it turns out to be perfectly possible to construct a wide class of interacting field
theories with an exact global supersymmetry which extends the conventional Poincaré
symmetry of relativistic field theory.
12 It turns out that in two dimensions, interacting conformal field theories can be constructed. Also, there
appears to be a very special class of supersymmetric field theories in four dimensions that possess exact
conformal invariance, even though they are interacting.
The action functional $I_\Omega$ in (12.64) will then be invariant (for $\omega$ infinitesimal) under
the variations,
$$\delta\phi_n(x) = i\omega q_n\phi_n(x),\qquad \delta\phi_n^*(x) = -i\omega q_n\phi_n^*(x),\qquad \delta x^\mu = 0 \tag{12.121}$$
The set of transformations of the type (12.120) clearly forms a commutative (abelian)
group. Now $L$, being real classically (or hermitian, once quantized), must contain
both $\phi_n$ and $\phi_n^*$, so the associated Noether current (12.76) can be written
$$J^\mu = \sum_n\Big\{\frac{\partial L}{\partial(\partial_\mu\phi_n)}(-iq_n\phi_n) + \frac{\partial L}{\partial(\partial_\mu\phi_n^*)}(iq_n\phi_n^*)\Big\} \tag{12.122}$$
The charge density $J^0$ has a simple equal-time commutation relation with the fields
of the theory:
The physical interpretation of this conserved quantity for the quantized theory is very
simple: the field φn may be considered as interpolating for a particle of “charge” qn .
Each term in the Lagrangian must contain a product of fields for which the phase
factor $e^{i\omega\sum(\pm q_n)} = 1$, so that the interaction terms in the Lagrangian lead to graphs
where the charge inserted by the incoming lines exactly balances that removed by
the outgoing lines. Depending on the particular Lagrangian under consideration, the
charge Q may represent electric charge, baryon number, lepton number, strangeness,
or indeed any globally conserved quantum number, depending on the particular set of
phase transformations chosen. Of course, as in the case of the spacetime symmetries
discussed previously, the Noether charge Q also serves as the infinitesimal generator
of the transformation (12.121), as we discover by integrating (12.124) over y :
symmetry transformation
$$\phi_n(x) \to \phi_n'(x) = (e^{i\omega_\alpha t_\alpha})_{nm}\phi_m(x) \equiv M_{nm}(\omega_\alpha)\phi_m(x) \tag{12.126}$$
where the $t_\alpha$ are a set of infinitesimal generators for some group of linear transformations. For real fields we shall assume that the $t_\alpha$ are hermitian pure imaginary, so
that the resultant matrix transformation $e^{i\omega_\alpha t_\alpha}$ of the fields is a real orthogonal one,
while if the fields are complex, the generators are hermitian and the matrix unitary.
In the example to be considered shortly, the only complex fields are Dirac fields, and
the Lagrangian only contains derivatives of the $\phi_n$s, not of the $\phi_n^*$s (i.e., the $\bar\psi$s), so
the Noether current $J^\mu_\alpha$ associated with the $\alpha$th generator takes the same form for
both real and complex fields
$$J^\mu_\alpha(x) = -i\sum_n\frac{\partial L}{\partial(\partial_\mu\phi_n(x))}(t_\alpha)_{nm}\phi_m(x) \tag{12.127}$$
In analogy with (12.124) we have the equal-time commutation of the charge density
with the fields
$$[J^0_\alpha(\vec y,t), \phi_n(\vec x,t)] = -\delta^3(\vec x - \vec y)(t_\alpha)_{nm}\phi_m(\vec y,t) \tag{12.128}$$
The isospin-invariant effective meson field theory of the 1950s provides a suitable
example: we assume that pions and nucleons interact via a (in this case, non-derivative
coupled) Yukawa interaction, with basic fields of the theory taken as a nucleon doublet
$N(x) = (p(x), n(x))$ (where $p(x)$ and $n(x)$ are Dirac fields for the proton and neutron,
assumed to have identical mass $M$) and a triplet of real pion fields $\pi_\alpha(x)$, $\alpha = 1, 2, 3$
(with $\pi_3$ interpolating for the neutral pion, and $\frac{1}{\sqrt{2}}(\pi_1 + i\pi_2)$ for the positively charged
pion, all assumed to have identical mass $m_\pi$). The Lagrangian for the full system (with
obvious implicit summations over internal indices) is then taken to be
$$L = \bar N(i\partial\!\!\!/ - M)N + \frac{1}{2}(\partial_\mu\vec\pi\cdot\partial^\mu\vec\pi - m_\pi^2\,\vec\pi\cdot\vec\pi) - ig\bar N\gamma_5\vec\tau N\cdot\vec\pi \tag{12.129}$$
where the $i$ in the interaction term is there for hermiticity (for $g$ real). The 2x2 matrices
$\vec\tau$ are one-half the usual Pauli matrices, $\tau_\alpha = \frac{1}{2}\sigma_\alpha$, $\alpha = 1, 2, 3$, so the matrix group in
question is just SU(2). The reader may easily verify, using the Lie algebra of SU(2),
$[\tau_\alpha, \tau_\beta] = i\epsilon_{\alpha\beta\gamma}\tau_\gamma$, that the Lagrangian (12.129) is invariant under the following set of
global infinitesimal transformations ($\vec\omega$ are spacetime-independent)
$$N(x) \to (1 + i\vec\omega\cdot\vec\tau)N(x) \tag{12.130}$$
$$\bar N(x) \to \bar N(x)(1 - i\vec\omega\cdot\vec\tau) \tag{12.131}$$
$$\vec\pi(x) \to \vec\pi(x) + \vec\pi(x)\times\vec\omega \tag{12.132}$$
from which one may read off directly the vector of conserved Noether currents (12.76)
for this theory
exactly the Lie algebra of the rotation group, allowing us to take over the entire
machinery of angular momentum in the discussion of isospin symmetry and conser-
vation. The proof of this result for a general global internal symmetry (for bosonic
fields) is deferred to the exercises at the end of this chapter (see Problem 7).
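The invariance claimed for (12.129) under (12.130)–(12.132) can be checked numerically. The sketch below verifies the SU(2) algebra of the matrices $\tau_\alpha = \frac{1}{2}\sigma_\alpha$ and then confirms that the isospin structure of the Yukawa term, $N^\dagger\tau_\alpha N\,\pi_\alpha$, changes only at second order in $\vec\omega$. Dirac indices are suppressed (so $\bar N$ is replaced by $N^\dagger$), and all function names and sample values are illustrative assumptions.

```python
import numpy as np

sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
tau = 0.5 * sigma   # tau_a = sigma_a / 2

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2

def su2_algebra_error():
    """Largest deviation from [tau_a, tau_b] = i eps_abc tau_c."""
    err = 0.0
    for a in range(3):
        for b in range(3):
            comm = tau[a] @ tau[b] - tau[b] @ tau[a]
            rhs = sum(1j * levi_civita(a, b, c) * tau[c] for c in range(3))
            err = max(err, np.abs(comm - rhs).max())
    return err

def isovector_bilinear(N, pi):
    """N^dagger tau_a N pi_a: isospin structure of the Yukawa term."""
    return sum((N.conj() @ tau[a] @ N) * pi[a] for a in range(3))

def transformed(N, pi, omega):
    """First-order transformations (12.130) and (12.132)."""
    w_tau = sum(omega[a] * tau[a] for a in range(3))
    return N + 1j * (w_tau @ N), pi + np.cross(pi, omega)
```

With $|\vec\omega|$ of order $10^{-6}$, the change in `isovector_bilinear` under `transformed` should be of order $10^{-12}$, i.e., quadratic in the transformation parameter, as expected for an exact symmetry truncated at first order.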
Our discussion of Noether’s theorem so far has been carried out for the most
part in a classical context: issues of operator ordering, regularization of operator
products, and so on, have been resolutely ignored. At first sight, it may seem possible
to circumvent these issues by resorting to a path-integral approach, and indeed, it is
both important and enlightening to understand the precise way in which invariance
and conservation intertwine in the context of the functional integral quantization
of field theory. Inasmuch as the functional integral involves an action built from c-
number fields, we might at first expect that our discussion to this point will carry
over fairly directly to quantum field theory realized via path-integral concepts. In
fact, we shall later see in our discussion of anomalous currents in Chapter 15 that
subtleties arising from operator regularization cannot simply be dodged in a functional
formalism: rather, they reappear in an unexpected location (specifically, in the case
of Noether’s theorem, in the definition of the functional measure).
We shall conclude this section by giving a brief description of the functional version
of Noether’s theorem for the case of non-anomalous symmetries, where the aforesaid
subtleties do not enter. The end result (the functional analog of (12.76)) will be a set of
identities—the so-called Ward–Takahashi identities—satisfied by the Feynman Green
functions of the theory. Let us start with a theory of N fields φn (x), n = 1, 2, ..., N with
a global internal symmetry (12.126), with Lagrangian L(φn , ∂μ φn ). The fields may in
fact be bosonic or fermionic, but here we shall assume for simplicity only bosonic
fields, to avoid having to keep careful track of minus signs arising from interchange of
Grassmann fields or sources. The symmetry parameters $\omega_\alpha$ in (12.126) are spacetime
constants, as we are dealing with a global symmetry of the theory, but if the Lagrangian
can be written (as the notation L(φn , ∂μ φn ) implicitly suggests) so that the fields
appear with at most a single spacetime-derivative, it is easily seen to be invariant if
442 Symmetries I: Continuous spacetime symmetry: why we need Lagrangians in field theory
(12.140)
As this identity must hold for arbitrary ωα (x), we may functionally differentiate with
respect to ωα and obtain
Beyond Poincaré: supersymmetry and superfields 443
$$\int D\phi_n\,\{i\partial_\mu J^\mu_\alpha(x) + j_n(x)(t_\alpha)_{nm}\phi_m(x)\}\,e^{i\int\{L - j_n\phi_n\}d^4x} = 0 \tag{12.141}$$
This result is effectively Noether’s theorem in functional form. The Green functions
of the theory are obtained by functionally differentiating with respect to the sources
$j_n(x)$ (cf. (10.73)): if we apply $i^p\frac{\delta^p}{\delta j_{n_1}(y_1)\cdots\delta j_{n_p}(y_p)}$ to (12.141) and then set the sources
$j_n$ to zero, we obtain the Ward–Takahashi identities (in coordinate space) associated
with the internal symmetry (12.126):
$$\frac{\partial}{\partial x^\mu}\langle 0|T(J^\mu_\alpha(x)\phi_{n_1}(y_1)\cdots\phi_{n_p}(y_p))|0\rangle = -\sum_{r=1}^{p}\delta^4(x - y_r)(t_\alpha)_{n_r m}\langle 0|T(\phi_{n_1}(y_1)\cdots\phi_{n_{r-1}}(y_{r-1})\phi_m(x)\phi_{n_{r+1}}(y_{r+1})\cdots\phi_{n_p}(y_p))|0\rangle \tag{12.142}$$
For the first fifty years of quantum field theory, until the mid-1970s, the Poincaré
symmetries were thought to represent the maximal set of spacetime symmetries: in
other words, it was implicitly assumed that one could not expand the algebra (12.143–
12.145) consistently by introducing further generators with non-trivial commutation
relations with the Mμν , P ρ . This prejudice was reinforced by the famous “no-go”
theorem of Coleman and Mandula (Coleman and Mandula, 1967), which showed that
the Poincaré algebra was indeed maximal in this sense, on the basis of assumptions
which seemed unexceptionable at the time. An important implicit assumption was that
the generators of the algebra were bosonic in character: acting on bosonic states, they
produced bosonic states, and on fermionic ones, fermionic states. The relaxation of this
last assumption is the critical step in allowing the existence of supersymmetry (SUSY)
algebras which expand the original Poincaré algebra stated above. The simplest
possible extension of the Poincaré algebra turns out13 to involve the introduction
of a Majorana 4-spinor set of generators $Q_\alpha$ ($\alpha = 1, 2, 3, 4$)
$$\begin{pmatrix} C_sQ^* \\ Q \end{pmatrix} \tag{12.146}$$
13 The most general form of the supersymmetry algebra was first derived by Haag, Lopuszanski, and
Sohnius (Haag et al., 1975): see below.
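$$[\dot\phi(\vec x,t), \phi^*(\vec y,t)] = [\dot\phi^*(\vec x,t), \phi(\vec y,t)] = -i\delta^3(\vec x - \vec y) \tag{12.151}$$
$$\{\psi_\alpha(\vec x,t), \bar\psi_\beta(\vec y,t)\} = (\gamma^0)_{\alpha\beta}\,\delta^3(\vec x - \vec y) \tag{12.152}$$
$$\{\psi_\alpha(\vec x,t), \psi_\beta(\vec y,t)\} = (i\gamma^2)_{\alpha\beta}\,\delta^3(\vec x - \vec y) \tag{12.153}$$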
The unusual form of the anticommutation relation (12.153) arises because of the
Majorana property of $\psi$ (namely, $\psi^* = i\gamma^2\psi$, implying the equivalence of (12.152)
and (12.153)). The equations of motion $\Box\phi = 0$, $\partial\!\!\!/\psi = 0$ imply conservation (i.e., zero
divergence) of the fermionic current
$$J^\mu = \sqrt{2}\,\{(\partial\!\!\!/\phi)\gamma^\mu\psi_R + (\partial\!\!\!/\phi^*)\gamma^\mu\psi_L\},\qquad \partial_\mu J^\mu = 0 \tag{12.154}$$
where $\psi_L = P_L\psi = \frac{1+\gamma_5}{2}\psi$, $\psi_R = P_R\psi = \frac{1-\gamma_5}{2}\psi$ are the upper and lower 2-spinor
components of $\psi$ respectively. The current $J^\mu$ gives rise in the usual way to an
associated conserved charge $Q_\alpha$
$$Q_\alpha = \sqrt{2}\int d^3x\,(\partial\!\!\!/\phi(\vec x,t)\gamma^0P_R + \partial\!\!\!/\phi^*(\vec x,t)\gamma^0P_L)_{\alpha\beta}\,\psi_\beta(\vec x,t) \tag{12.155}$$
14 Notational alert: In SUSY, it is conventional to use the ∗ symbol for both complex conjugation (of
complex and Grassmann numbers), as well as hermitian conjugation (of operators). We shall adhere to this
policy throughout this section.
15 We remove the annoying $\sqrt{2}$ here, inserted for later convenience in normalizing the charges.
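$$\lambda \equiv \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$$
$$P \equiv p^\mu\sigma_\mu,\qquad \det(P) = p_\mu p^\mu = p^2 \tag{12.160}$$
$$P \to P' \equiv \lambda P\lambda^\dagger \tag{12.161}$$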
The group SL(2,C) is six-dimensional: four complex numbers contain eight (real)
degrees of freedom, but the two constraints setting the real part of αδ − βγ to 1 and
the imaginary part to zero reduce the dimensionality to 6, the correct number for the
HLG. Infinitesimally, we can write
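$$\lambda = 1 + \Big(\frac{i}{4}\epsilon_{ijk}\omega^{ij} + \frac{1}{2}\omega^{0k}\Big)\sigma_k \tag{12.164}$$
corresponding to
$$\Lambda^{\mu}{}_{\nu} = g^{\mu}{}_{\nu} + \omega^{\mu}{}_{\nu} \tag{12.165}$$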
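The statement that the map (12.161) realizes a Lorentz transformation rests on two facts: $\det(P) = p^2$, and $\det(\lambda P\lambda^\dagger) = \det(P)$ whenever $\det\lambda = 1$. A minimal numerical sketch of both is given below. We assume here that $P = p^0\mathbf{1} + \vec p\cdot\vec\sigma$ (the determinant does not depend on the sign chosen for the spatial part); the helper names and the random sampling are illustrative assumptions.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def P_matrix(p):
    """Hermitian 2x2 matrix built from a 4-vector p = (p0, p1, p2, p3).

    Assumed convention: P = p0 * 1 + p . sigma.
    """
    return p[0] * np.eye(2, dtype=complex) + sum(p[i + 1] * PAULI[i] for i in range(3))

def lorentz_norm(p):
    """p^2 with signature (+, -, -, -)."""
    return p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2

def random_sl2c(rng):
    """Random complex 2x2 matrix rescaled to unit determinant."""
    m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    return m / np.sqrt(np.linalg.det(m))

def transform(lam, P):
    """The action (12.161): P -> lam P lam^dagger."""
    return lam @ P @ lam.conj().T
```

The transformed matrix is again hermitian, so it defines a new real 4-vector $p'$ with $p'^2 = p^2$, which is the defining property of a Lorentz transformation.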
$$Q = \begin{pmatrix} Q_1 \\ Q_2 \end{pmatrix},\qquad Q \to \lambda Q \tag{12.166}$$
Suppose we manage to find a pair of operators Qa on the state space, with the
transformation under HLG
$$[K_i, Q_a] = -\frac{i}{2}(\sigma_i)_{ab}Q_b \tag{12.172}$$
Finally, recalling the (A,B) notation (cf. Section 7.2) where we define generators $\vec A \equiv \frac{1}{2}(\vec J - i\vec K)$, $\vec B \equiv \frac{1}{2}(\vec J + i\vec K)$,
these become
$$[A_i, Q_a] = 0 \tag{12.173}$$
$$[B_i, Q_a] = \frac{1}{2}(\sigma_i)_{ab}Q_b \tag{12.174}$$
so the 2-spinor $Q_a$ corresponds to what we previously (in Chapter 7) called the $(0, \frac{1}{2})$
representation of the HLG. The reader will recall that such a spinor is conventionally
written as the lower half of a Dirac 4-spinor.
An important notational convention: In SUSY, it is conventional to use the ∗
symbol for both complex conjugation (of complex and Grassmann numbers), as well as
hermitian conjugation (of operators)—the dagger symbol is reserved for column vectors
and matrices of operators. Moreover, the ∗ symbol applied to products of Grassmann
objects is defined to reverse the order, in analogy to the property of hermitian adjoints
of operators.
The conjugation matrix $C_s \equiv i\sigma_2$ has the property
Accordingly,
so $QC_sQ$ is a scalar: the $C_s$-matrix can be used to couple two $(\frac{1}{2}, 0)$ reps (or two $(0, \frac{1}{2})$
reps) to a Lorentz scalar.
In the forthcoming sections we shall be needing a number of simple algebraic
properties of Grassmann Majorana spinors, which are defined and studied in Appendix
C. The relevant results are gathered there, and we strongly recommend that the reader
spend a few minutes at this point in gaining some familiarity with the essential prop-
erties of these objects, the basic ingredients from which we construct supersymmetric
theories.
In the above, the indices $a, b$ run over the values (1,2): $Q_a$, $Q_a^*$ are 2-spinors. In fact, the
Haag–Lopuszanski–Sohnius theorem allows for an even more general algebra with $N$
independent Grassmannian generators, $Q_{ar}$, $a = 1, 2$, $r = 1, 2, \ldots, N$, with an algebra
where the $Z_{rs}$ commute with everything and are called “central charges”. This
algebra is called “N-extended” supersymmetry and leads to consistent field theories for
$1 \le N \le 8$. The case of $N = 1$ (“simple supersymmetry”) is of the greatest phenomenological importance, and is the only one we shall consider in our brief introduction to
SUSY.16
The N=1 SUSY algebra, Eqs. (12.183, 12.184, 12.185), is more frequently written
in a four-component Dirac notation: as described previously, we group the generators
Qa , Q∗a into a single Majorana 4-spinor Qα , α = 1, 2, 3, 4 (see (12.146)) and
The basic SUSY anticommutation relations, Eqs. (12.183, 12.185), can now be
expressed as a single equation:
$$= \begin{pmatrix} 0 & -2C_s(\sigma_\mu P^\mu)^TC_s \\ 2\sigma_\mu P^\mu & 0 \end{pmatrix}_{\alpha\beta} \tag{12.189}$$
16 The derivation of Eqs. (12.186, 12.187) is given in full in Weinberg (Weinberg, 1995b), Chapter 2.
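$$\{Q_\alpha, \bar Q_\beta\} = 2P^0\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}_{\alpha\beta} + 2P^i\begin{pmatrix} 0 & -\sigma_i \\ \sigma_i & 0 \end{pmatrix}_{\alpha\beta} \tag{12.190}$$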
giving finally
together with
$$[P^\mu, Q_\alpha] = [P^\mu, \bar Q_\alpha] = 0 \tag{12.192}$$
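The two block matrices in (12.190) can be recognized as Dirac matrices in a chiral-type representation. The sketch below assumes $\gamma^0$ off-diagonal with unit blocks and $\gamma^i$ with blocks $\pm\sigma_i$, together with the signature (+,−,−,−); both choices are assumptions picked to match the block structure of (12.190), not conventions quoted from the text. It checks the Clifford algebra for these matrices and confirms that the right-hand side of (12.190) equals $2\gamma^\mu P_\mu$, the combination that reappears in (12.207).

```python
import numpy as np

Z2 = np.zeros((2, 2), dtype=complex)
I2 = np.eye(2, dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

# Assumed representation: gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]

def clifford_error():
    """Largest deviation from {gamma^mu, gamma^nu} = 2 g^{mu nu}."""
    err = 0.0
    for mu in range(4):
        for nu in range(4):
            anti = GAMMA[mu] @ GAMMA[nu] + GAMMA[nu] @ GAMMA[mu]
            err = max(err, np.abs(anti - 2 * METRIC[mu, nu] * np.eye(4)).max())
    return err

def rhs_12_190(p_up):
    """Right-hand side of (12.190) for numerical components P^mu."""
    out = 2 * p_up[0] * np.block([[Z2, I2], [I2, Z2]])
    for i, s in enumerate(PAULI):
        out = out + 2 * p_up[i + 1] * np.block([[Z2, -s], [s, Z2]])
    return out

def two_gamma_dot_p(p_up):
    """2 gamma^mu P_mu, with the index lowered by the metric."""
    p_low = METRIC @ p_up
    return 2 * sum(GAMMA[mu] * p_low[mu] for mu in range(4))
```

Under these assumptions `rhs_12_190(p)` and `two_gamma_dot_p(p)` agree for any numerical 4-vector `p`, which is simply the statement that the spatial blocks in (12.190) carry the sign produced by lowering the index on $P^i$.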
12.6.3 Superfields
The simplest way to construct field theories with supersymmetric invariant
Lagrangians is to introduce a Grassmannian extension of ordinary spacetime:
If the superfield S has overall bosonic character, then the coefficient fields Sn will be
bosonic for n even and fermionic for n odd. If (as we shall take here) the leading term
(no θ’s) is a scalar bosonic field, we call S(x, θ) a “scalar superfield”. So a superfield is
a handy way of grouping together bosonic and fermionic fields in a “supermultiplet”.
We need the analog of the bosonic spacetime (infinitesimal) translation property
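$$K_\alpha \equiv (\gamma_5\epsilon)_{\alpha\gamma}\frac{\partial}{\partial\theta_\gamma} - i\gamma^\mu_{\alpha\gamma}\theta_\gamma\partial_\mu \tag{12.198}$$
$$K_\alpha = -\frac{\partial}{\partial\bar\theta_\alpha} - i(\gamma^\mu\theta)_\alpha\partial_\mu \tag{12.199}$$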
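$$\begin{aligned}
\{K_\alpha, \bar K_\beta\} &= \Big\{(\gamma_5\epsilon)_{\alpha\gamma}\frac{\partial}{\partial\theta_\gamma},\ i(\gamma_5\epsilon\gamma^\mu)_{\beta\delta}\theta_\delta\partial_\mu\Big\} + \Big\{-i\gamma^\mu_{\alpha\gamma}\theta_\gamma\partial_\mu,\ \frac{\partial}{\partial\theta_\beta}\Big\} && (12.203)\\
&= -i(\gamma_5\epsilon\gamma^\mu\epsilon\gamma_5)_{\beta\alpha}\partial_\mu - i\gamma^\mu_{\alpha\beta}\partial_\mu && (12.204)\\
&= -2i\gamma^\mu_{\alpha\beta}\partial_\mu && (12.205)
\end{aligned}$$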
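This establishes that the $K_\alpha$ have the same algebra as the $Q_\alpha$, in the sense that, given
a superfield with the property
(from which follows $[\bar Q_\beta, S(x,\theta)] = [Q_\gamma(\gamma_5\epsilon)_{\gamma\beta}, S] = -i\bar K_\beta S$), then the SUSY algebra
(12.191) is properly realized via
$$\begin{aligned}
[\{Q_\alpha, \bar Q_\beta\}, S(x,\theta)] &= Q_\alpha[\bar Q_\beta, S] + [Q_\alpha, S]\bar Q_\beta + \bar Q_\beta[Q_\alpha, S] + [\bar Q_\beta, S]Q_\alpha\\
&= -iQ_\alpha(\bar K_\beta S) - i(K_\alpha S)\bar Q_\beta - i\bar Q_\beta(K_\alpha S) - i(\bar K_\beta S)Q_\alpha\\
&= i\bar K_\beta[Q_\alpha, S] + iK_\alpha[\bar Q_\beta, S]\\
&= \{\bar K_\beta, K_\alpha\}S\\
&= -2i\gamma^\mu_{\alpha\beta}\partial_\mu S\\
&= 2\gamma^\mu[P_\mu, S]
\end{aligned}\tag{12.207}$$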
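A glance at (12.198) and (12.202) shows that $\{K_\alpha, K_\beta\} = 0 = \{\bar K_\alpha, \bar K_\beta\}$. Moreover, a
covariant (superspace) derivative can be defined as follows, by a simple change of sign
from (12.198):
$$D_\alpha \equiv (\gamma_5\epsilon)_{\alpha\gamma}\frac{\partial}{\partial\theta_\gamma} + i\gamma^\mu_{\alpha\gamma}\theta_\gamma\partial_\mu \tag{12.208}$$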
The change of sign relative to the definition of the Kα implies the anticommutation
relations
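$$D_\alpha S(x,\theta) = 0 \tag{12.211}$$
$$\begin{aligned}
S(x,\theta) = {}& C(x) - i\bar\theta\gamma_5\omega(x) - \frac{i}{2}(\bar\theta\gamma_5\theta)M(x) - \frac{1}{2}(\bar\theta\theta)N(x) - \frac{1}{2}\bar\theta\gamma_5\gamma_\mu\theta\,V^\mu(x)\\
& - i(\bar\theta\gamma_5\theta)\,\bar\theta\Big(\lambda(x) - \frac{i}{2}\partial\!\!\!/\omega(x)\Big) - \frac{1}{4}(\bar\theta\gamma_5\theta)^2\Big(D(x) - \frac{1}{2}\Box C(x)\Big)
\end{aligned}\tag{12.212}$$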
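where $C(x)$, $D(x)$, $M(x)$, $N(x)$ and $V^\mu(x)$ are bosonic fields while the fields $\lambda(x)$ and
$\omega(x)$ are fermionic (Majorana) 4-spinors. The peculiar combinations used to define the
$D(x)$ and $\lambda(x)$ terms are chosen to simplify the transformation laws for these fields, as
we shall soon see. In analogy to (12.197), an infinitesimal SUSY transformation of $S$
is generated by an infinitesimal Grassmann “translation” in superspace $\xi_\alpha$ as follows:
$$\begin{aligned}
\delta S = \bar\xi_\alpha K_\alpha S ={}& \bar\xi\Big(-\frac{\partial}{\partial\bar\theta} - i\gamma^\mu\theta\,\partial_\mu\Big)S(x,\theta)\\
={}& \bar\xi\Big\{i\gamma_5\omega + i\gamma_5\theta M + \theta N + \gamma_5\gamma_\mu\theta\,V^\mu + 2i(\gamma_5\theta)\,\bar\theta\Big(\lambda - \frac{i}{2}\partial\!\!\!/\omega\Big)\\
& \quad + i(\bar\theta\gamma_5\theta)\Big(\lambda - \frac{i}{2}\partial\!\!\!/\omega\Big) + \gamma_5\theta\,(\bar\theta\gamma_5\theta)\Big(D - \frac{1}{2}\Box C\Big)\Big\}\\
& - i(\bar\xi\gamma^\mu\theta)\Big\{\partial_\mu C - i\bar\theta\gamma_5\partial_\mu\omega - \frac{i}{2}(\bar\theta\gamma_5\theta)\partial_\mu M\\
& \quad - \frac{1}{2}\bar\theta\theta\,\partial_\mu N - \frac{1}{2}\bar\theta\gamma_5\gamma_\nu\theta\,\partial_\mu V^\nu - i(\bar\theta\gamma_5\theta)\,\bar\theta\Big(\partial_\mu\lambda - \frac{i}{2}\partial_\mu\partial\!\!\!/\omega\Big)\Big\}
\end{aligned}\tag{12.213}$$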
The term with no θs corresponds to the change in the C(x) field, so we immediately
can read off
$$\delta C(x) = i\bar\xi\gamma_5\omega(x) \tag{12.214}$$
Notice that the SUSY transformation has turned a bosonic scalar field C into a
fermionic spinor field ω. Next, terms with a single θ:
whence
To deal with the terms with two θs we will need a Fierz rearrangement theorem—
(C.26) from Appendix C:
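$$\theta_\alpha\bar\theta_\beta = -\frac{1}{4}\delta_{\alpha\beta}\,\bar\theta\theta + \frac{1}{4}(\gamma_5\gamma_\mu)_{\alpha\beta}\,\bar\theta\gamma_5\gamma^\mu\theta - \frac{1}{4}(\gamma_5)_{\alpha\beta}\,\bar\theta\gamma_5\theta \tag{12.217}$$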
We can now rearrange the terms with two θs in (12.213) as follows:
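$$\begin{aligned}
&+ \frac{1}{4}\bar\xi\gamma^\mu\gamma_5\partial_\mu\omega\;\bar\theta\theta - \frac{1}{4}\bar\xi\gamma^\mu\gamma_5\gamma_\nu\gamma_5\partial_\mu\omega\,(\bar\theta\gamma_5\gamma^\nu\theta) + \frac{1}{4}\bar\xi\gamma^\mu\gamma_5\gamma_5\partial_\mu\omega\;\bar\theta\gamma_5\theta\\
&= -\frac{i}{2}\bar\theta\theta\;\bar\xi\gamma_5(\lambda - i\partial\!\!\!/\omega)\\
&\quad + i\bar\theta\gamma_5\theta\Big\{-\frac{1}{2}\bar\xi\Big(\lambda - \frac{i}{2}\partial\!\!\!/\omega\Big) + \bar\xi\Big(\lambda - \frac{i}{2}\partial\!\!\!/\omega\Big) - \frac{i}{4}\bar\xi\partial\!\!\!/\omega\Big\}\\
&\quad + i\bar\theta\gamma_5\gamma^\mu\theta\Big\{\frac{1}{2}\bar\xi\gamma_\mu\lambda - \frac{i}{4}\bar\xi\gamma_\mu\partial\!\!\!/\omega + \frac{i}{4}\bar\xi\gamma^\nu\gamma_5\gamma_\mu\gamma_5\partial_\nu\omega\Big\}\\
&= -\frac{i}{2}\bar\theta\theta\;\bar\xi\gamma_5(\lambda - i\partial\!\!\!/\omega) \quad \Big(\to -\frac{1}{2}\bar\theta\theta\,\delta N\Big) && (12.218)\\
&\quad + \frac{i}{2}\bar\theta\gamma_5\theta\;\bar\xi(\lambda - i\partial\!\!\!/\omega) \quad \Big(\to -\frac{i}{2}\bar\theta\gamma_5\theta\,\delta M\Big) && (12.219)\\
&\quad + \frac{i}{2}\bar\theta\gamma_5\gamma^\mu\theta\Big\{\bar\xi\gamma_\mu\lambda - \frac{i}{2}\bar\xi(\gamma^\nu\gamma_\mu + \gamma_\mu\gamma^\nu)\partial_\nu\omega\Big\} \quad \Big(\to -\frac{1}{2}\bar\theta\gamma_5\gamma_\mu\theta\,\delta V^\mu\Big) && (12.220)
\end{aligned}$$
$$\delta M(x) = -\bar\xi(\lambda(x) - i\partial\!\!\!/\omega(x)) \tag{12.221}$$
$$\delta N(x) = i\bar\xi\gamma_5(\lambda(x) - i\partial\!\!\!/\omega(x)) \tag{12.222}$$
$$\delta V^\mu(x) = -i\bar\xi\gamma^\mu\lambda(x) - \bar\xi\partial^\mu\omega(x) \tag{12.223}$$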
The terms with three $\theta$s in (12.213) give $\delta(\lambda(x) - \frac{i}{2}\partial\!\!\!/\omega(x))$. Using (C.12, C.13),
and identities (1) and (2) from Appendix C, we have
Using these identities, we see that the terms with three θs in (12.213) can be rewritten
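$$(\bar\theta\gamma_5\theta)\,\bar\theta\Big\{\Big(D - \frac{1}{2}\Box C\Big)\gamma_5\xi + \frac{1}{2}\gamma^\mu\xi\,\partial_\mu M + \frac{i}{2}\gamma_5\gamma^\mu\xi\,\partial_\mu N - \frac{i}{2}\partial_\mu V\!\!\!\!/\,\gamma^\mu\xi\Big\} \tag{12.228}$$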
so that we can read off
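$$\delta\Big(\lambda - \frac{i}{2}\partial\!\!\!/\omega\Big) = \Big\{\frac{i}{2}\partial\!\!\!/M - \frac{1}{2}\gamma_5\partial\!\!\!/N + \frac{1}{2}\partial_\mu V\!\!\!\!/\,\gamma^\mu + i\Big(D - \frac{1}{2}\Box C\Big)\gamma_5\Big\}\xi \tag{12.229}$$
which, together with the transformation law (12.216) for $\omega$ gives the desired SUSY
transformation of the $\lambda$ field
$$\delta\lambda(x) = \Big(\frac{1}{2}[\partial_\mu V\!\!\!\!/(x), \gamma^\mu] + i\gamma_5D(x)\Big)\xi \tag{12.230}$$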
Finally, the (very important!) transformation law for the D field is obtained by
looking at the term in δS with four θs (see (C.20) from Appendix C):
$$\delta D(x) = \bar\xi\gamma_5\partial\!\!\!/\lambda(x) \tag{12.235}$$
In other words, the D term of any scalar superfield transforms as a total spacetime-
derivative under an infinitesimal SUSY transformation, so if K is any such field, or
product of such fields, a SUSY invariant action can be obtained simply by taking the
D part of K as the Lagrangian density:
$$I = \int d^4x\,[K]_D \tag{12.236}$$
A simpler way to see this is to realize that the above action is really the integral of K
over all of superspace
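$$I = \int d^4x\,d\theta_\alpha\,K(x,\theta) \tag{12.237}$$
$$D_{L\alpha} \equiv \Big(\frac{1+\gamma_5}{2}\Big)_{\alpha\beta}D_\beta \tag{12.238}$$
$$D_{R\alpha} \equiv \Big(\frac{1-\gamma_5}{2}\Big)_{\alpha\beta}D_\beta \tag{12.239}$$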
$$x^\mu_\pm \equiv x^\mu \mp i\theta_R^T\epsilon\gamma^\mu\theta_L \tag{12.245}$$
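These are cooked up so that $x_+$ vanishes under a right derivative, $x_-$ under a left
derivative:
$$\begin{aligned}
D_{R\alpha}x^\mu_+ &= \Big(-\epsilon_{\alpha\beta}\frac{\partial}{\partial\theta_{R\beta}} + i(\gamma^\nu\theta_L)_\alpha\partial_\nu\Big)\big(x^\mu - i\theta_{R\gamma}(\epsilon\gamma^\mu)_{\gamma\delta}\theta_{L\delta}\big)\\
&= i(\epsilon^2\gamma^\mu)_{\alpha\delta}\theta_{L\delta} + i(\gamma^\mu\theta_L)_\alpha = 0
\end{aligned}\tag{12.246}$$
and
$$\begin{aligned}
D_{L\alpha}x^\mu_- &= \Big(\epsilon_{\alpha\beta}\frac{\partial}{\partial\theta_{L\beta}} + i(\gamma^\nu\theta_R)_\alpha\partial_\nu\Big)\big(x^\mu + i\theta_{R\gamma}(\epsilon\gamma^\mu)_{\gamma\delta}\theta_{L\delta}\big)\\
&= -i\epsilon_{\alpha\beta}\theta_{R\gamma}(\epsilon\gamma^\mu)_{\gamma\beta} + i(\gamma^\mu\theta_R)_\alpha\\
&= i\theta_{R\gamma}(\epsilon\gamma^\mu\epsilon)_{\gamma\alpha} + i(\gamma^\mu\theta_R)_\alpha\\
&= -i(\gamma^\mu\theta_R)_\alpha + i(\gamma^\mu\theta_R)_\alpha = 0
\end{aligned}\tag{12.247}$$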
where the expansion must terminate at the term quadratic in θL (which has only
two independent components !). Note that φ, F are complex scalar fields (two real
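If we expand the left-chiral field $\Phi$ around the bosonic $x^\mu$ coordinate, we find that
it can be expressed as follows in terms of conventional spacetime fields:
$$\begin{aligned}
\Phi(x,\theta) = {}& \phi(x) - i\theta_R^T\epsilon\gamma^\mu\theta_L\,\partial_\mu\phi(x) + \frac{1}{2}(-i)^2\,\theta_R^T\epsilon\gamma^\mu\theta_L\;\theta_R^T\epsilon\gamma^\nu\theta_L\,\partial_\mu\partial_\nu\phi(x)\\
& - \sqrt{2}\,\theta_L^T\epsilon\psi_L(x) + i\sqrt{2}\,\theta_L^T\epsilon\,\partial_\mu\psi_L(x)\;\theta_R^T\epsilon\gamma^\mu\theta_L + \mathcal{F}(x)\,\theta_L^T\epsilon\theta_L
\end{aligned}\tag{12.250}$$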
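In order to compare the fields in (12.250) more easily with our original component
fields for the full scalar superfield $S$, we will need the following identities:
$$\theta_R^T\epsilon\gamma^\mu\theta_L = \bar\theta\gamma_5\gamma^\mu\frac{1+\gamma_5}{2}\theta = \frac{1}{2}\bar\theta\gamma_5\gamma^\mu\theta \tag{12.251}$$
$$\theta_R^T\epsilon\gamma^\mu\theta_L\;\theta_R^T\epsilon\gamma^\nu\theta_L = -\frac{1}{4}g^{\mu\nu}(\bar\theta\gamma_5\theta)^2 \tag{12.252}$$
$$\theta_L^T\epsilon\psi_L = \bar\theta\gamma_5\psi_L = \bar\theta\psi_L \tag{12.253}$$
$$\theta_R^T\epsilon\gamma^\mu\theta_L\;\theta_L^T\epsilon\,\partial_\mu\psi_L = -\frac{1}{2}\bar\theta\gamma_5\theta\;\bar\theta\gamma_5\partial\!\!\!/\psi_L = \frac{1}{2}\bar\theta\gamma_5\theta\;\bar\theta\partial\!\!\!/\psi_L \tag{12.254}$$
$$\theta_L^T\epsilon\theta_L = \bar\theta\gamma_5\frac{1+\gamma_5}{2}\theta = \bar\theta\,\frac{1+\gamma_5}{2}\,\theta \tag{12.255}$$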
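Inserting these results in (12.250) we obtain
$$\begin{aligned}
\Phi(x,\theta) = {}& \phi(x) - \frac{i}{2}\bar\theta\gamma_5\gamma^\mu\theta\,\partial_\mu\phi(x) - \sqrt{2}\,\bar\theta\psi_L(x) + \bar\theta\,\frac{1+\gamma_5}{2}\,\theta\,\mathcal{F}(x)\\
& + \frac{i}{\sqrt{2}}\bar\theta\gamma_5\theta\;\bar\theta\partial\!\!\!/\psi_L(x) + \frac{1}{8}(\bar\theta\gamma_5\theta)^2\,\Box\phi(x)
\end{aligned}\tag{12.256}$$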
An extremely important special case occurs when Φ and Φ̃ are conjugates of each
other. This will be the case if the bosonic fields are related in the obvious way, φ̃ =
φ∗ , F̃ = F ∗ , and the upper components of ψL are related to the lower components
of ψR in the usual charge-conjugation way familiar from the 4-spinor version of a
Majorana field:
$$\psi_R = \begin{pmatrix} 0 \\ 0 \\ \psi_1 \\ \psi_2 \end{pmatrix}$$
and
$$\psi_L = \begin{pmatrix} \psi_2^* \\ -\psi_1^* \\ 0 \\ 0 \end{pmatrix}$$
Powers of a left-chiral field Φ are clearly left-chiral (i.e., satisfy (12.240)); likewise,
powers of right-chiral fields are right-chiral (satisfy (12.241)). However a product like
ΦΦ̃ is not chiral, although it is still a scalar superfield (of the S-type). If Φ̃ = Φ∗ as
discussed above, it is also hermitian. An hermitian Lagrangian can also be obtained
by taking the “real part” of a chiral field (or power of chiral fields), so consider
$$S_c(x,\theta) \equiv \frac{1}{\sqrt{2}}(\Phi + \Phi^*) \tag{12.258}$$
$$\phi = \frac{A + iB}{\sqrt{2}} \tag{12.259}$$
$$\mathcal{F} = \frac{F - iG}{\sqrt{2}} \tag{12.260}$$
and with ψL , ψR the upper and lower components of a single Majorana fermion field
ψ as indicated above, we see that taking the real part of the chiral field Φ yields a
constrained real superfield with components
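$$\begin{aligned}
S_c = {}& A(x) - \bar\theta\psi(x) + \frac{1}{2}\bar\theta\gamma_5\gamma^\mu\theta\,\partial_\mu B(x) + \frac{1}{2}\bar\theta\theta\,F(x) - \frac{i}{2}\bar\theta\gamma_5\theta\,G(x)\\
& - \frac{i}{2}\bar\theta\gamma_5\theta\;\bar\theta\gamma_5\partial\!\!\!/\psi(x) + \frac{1}{8}(\bar\theta\gamma_5\theta)^2\,\Box A(x)
\end{aligned}\tag{12.261}$$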
Comparing with the component expression for the general scalar superfield (12.212)
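$$\begin{aligned}
S(x,\theta) = {}& C(x) - i\bar\theta\gamma_5\omega(x) - \frac{i}{2}(\bar\theta\gamma_5\theta)M(x) - \frac{1}{2}(\bar\theta\theta)N(x) - \frac{1}{2}\bar\theta\gamma_5\gamma_\mu\theta\,V^\mu(x)\\
& - i(\bar\theta\gamma_5\theta)\,\bar\theta\Big(\lambda(x) - \frac{i}{2}\partial\!\!\!/\omega(x)\Big) - \frac{1}{4}(\bar\theta\gamma_5\theta)^2\Big(D(x) - \frac{1}{2}\Box C(x)\Big)
\end{aligned}\tag{12.262}$$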
So we see that the real part of our chiral field is just a real scalar superfield with the
identifications
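$$\lambda = D = 0 \tag{12.263}$$
$$C \to A \tag{12.264}$$
$$M \to G \tag{12.265}$$
$$N \to -F \tag{12.266}$$
$$V_\mu \to -\partial_\mu B \tag{12.267}$$
$$\omega \to -i\gamma_5\psi \tag{12.268}$$
$$\delta M(x) = -\bar\xi(\lambda(x) - i\partial\!\!\!/\omega(x)) \tag{12.269}$$
$$\delta N(x) = i\bar\xi\gamma_5(\lambda(x) - i\partial\!\!\!/\omega(x)) \tag{12.270}$$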
We will see shortly that this is the term responsible for non-trivial interactions in the
simplest SUSY models.
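$$\begin{aligned}
\frac{1}{2}[\Phi^*\Phi]_{\theta^4} &= \frac{1}{8}(\bar\theta\gamma_5\theta)^2\Big\{\frac{1}{2}(\phi^*\Box\phi + \phi\,\Box\phi^*) - \partial^\mu\phi^*\partial_\mu\phi - 2\mathcal{F}^*\mathcal{F} - i(\bar\psi_R\partial\!\!\!/\psi_L + \bar\psi_L\partial\!\!\!/\psi_R)\Big\}\\
&= \frac{1}{8}(\bar\theta\gamma_5\theta)^2\Big\{\frac{1}{2}(\phi^*\Box\phi + \phi\,\Box\phi^*) - \partial^\mu\phi^*\partial_\mu\phi - 2\mathcal{F}^*\mathcal{F} - i\bar\psi\partial\!\!\!/\psi\Big\}
\end{aligned}\tag{12.279}$$
which should be compared with $-\frac{1}{4}(\bar\theta\gamma_5\theta)^2(D(x) - \frac{1}{2}\Box C(x))$, with $C(x) = \frac{1}{2}\phi^*\phi$. This
gives the desired D-term:
$$D = \partial^\mu\phi^*\partial_\mu\phi + \mathcal{F}^*\mathcal{F} + \frac{i}{2}\bar\psi\partial\!\!\!/\psi \tag{12.280}$$
which is exactly the free Lagrangian (12.149) studied earlier for a massless complex
spin-0 scalar $\phi$ and a massless spin-$\frac{1}{2}$ Majorana field $\psi$, together with an auxiliary
complex scalar $\mathcal{F}$ with (at this stage) no interesting dynamics.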
The other term in the general action (12.272) comes from the F-term in a poly-
nomial f , typically called the superpotential (and at most cubic for renormalizability)
in $\Phi$. Recall that the component field decomposition for a general chiral field reads
$\Phi = \phi(x_+) - \sqrt{2}\,\theta_L^T\epsilon\psi_L(x_+) + \mathcal{F}(x)\,\theta_L^T\epsilon\theta_L$, so the F-term corresponds to the term
quadratic in $\theta_L$:
$$[f(\Phi)]_{\theta_L^2} = \frac{\partial^2f(\phi(x))}{\partial\phi^2}\,\theta_L^T\epsilon\psi_L(x)\;\theta_L^T\epsilon\psi_L(x) + \frac{\partial f(\phi)}{\partial\phi}\,\mathcal{F}(x)\,\theta_L^T\epsilon\theta_L \tag{12.281}$$
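Recall the form of a Majorana spinor $\theta^T = (\theta_2^*, -\theta_1^*, \theta_1, \theta_2)$, from which we obtain
$$\theta_L^T\epsilon\theta_L = 2\theta_1^*\theta_2^*$$
$$\theta_L^T\epsilon\psi_L = \theta_1^*\psi_2^* - \theta_2^*\psi_1^*$$
$$(\theta_L^T\epsilon\psi_L)^2 = 2\theta_1^*\theta_2^*\psi_2^*\psi_1^* = -\frac{1}{2}\,\theta_L^T\epsilon\theta_L\,(\bar\psi)_L\psi_L$$
allowing us to read off the coefficient of $\theta_L^T\epsilon\theta_L$ in (12.281):
$$[f(\Phi)]_F = \mathcal{F}(x)\frac{\partial f}{\partial\phi}(\phi(x)) - \frac{1}{2}\frac{\partial^2f}{\partial\phi^2}(\phi(x))\,\bar\psi_L\psi_L \tag{12.282}$$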
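The Grassmann manipulations leading to (12.282) are easy to get wrong by a sign, so a mechanical check is worthwhile. The sketch below implements a minimal Grassmann algebra (elements stored as dictionaries from sorted generator tuples to coefficients) and verifies the component identities quoted above. It assumes that the two-component bilinears are contracted with the antisymmetric matrix $\epsilon = i\sigma_2$, i.e., $a^T\epsilon b = a_1b_2 - a_2b_1$, which reproduces the component expressions in the text; the generator labels and function names are arbitrary choices made for illustration.

```python
from itertools import product

def gmul(x, y):
    """Product of two Grassmann elements stored as {sorted index tuple: coeff}."""
    out = {}
    for (ix, cx), (iy, cy) in product(x.items(), y.items()):
        if set(ix) & set(iy):
            continue  # a repeated generator squares to zero
        seq = ix + iy
        # Sign of the permutation that sorts seq, obtained by counting inversions.
        inversions = sum(1 for i in range(len(seq))
                         for j in range(i + 1, len(seq)) if seq[i] > seq[j])
        key = tuple(sorted(seq))
        out[key] = out.get(key, 0) + (-1) ** inversions * cx * cy
    return {k: v for k, v in out.items() if v != 0}

def gadd(x, y, sign=1):
    """x + sign * y."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + sign * v
    return {k: v for k, v in out.items() if v != 0}

def scale(x, c):
    return {k: c * v for k, v in x.items()}

def gen(i):
    return {(i,): 1}

def eps_bilinear(a, b):
    """a^T eps b with eps = i sigma_2 (assumed contraction): a_1 b_2 - a_2 b_1."""
    return gadd(gmul(a[0], b[1]), gmul(a[1], b[0]), sign=-1)

# Generator labels (arbitrary): 0 = theta_1^*, 1 = theta_2^*, 2 = psi_1^*, 3 = psi_2^*
theta_L = [gen(1), scale(gen(0), -1)]   # (theta_2^*, -theta_1^*)
psi_L = [gen(3), scale(gen(2), -1)]     # (psi_2^*, -psi_1^*)

theta_theta = eps_bilinear(theta_L, theta_L)
theta_psi = eps_bilinear(theta_L, psi_L)
psi_psi = eps_bilinear(psi_L, psi_L)
theta_psi_squared = gmul(theta_psi, theta_psi)
```

With these definitions `theta_theta` represents $2\theta_1^*\theta_2^*$, and `theta_psi_squared` coincides both with $2\theta_1^*\theta_2^*\psi_2^*\psi_1^*$ and with $-\frac{1}{2}$ times the product of `theta_theta` and `psi_psi`. The last equality matches the form used in the text provided $(\bar\psi)_L\psi_L$ is identified with $\psi_L^T\epsilon\psi_L$, which is an assumption of this sketch.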
Adding the hermitian adjoint, the total Lagrange density corresponding to the action
(12.272) thus becomes
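$$\begin{aligned}
L = {}& \partial_\mu\phi^*\partial^\mu\phi + \mathcal{F}^*\mathcal{F} + \frac{i}{2}\bar\psi\partial\!\!\!/\psi + \mathcal{F}\frac{\partial f}{\partial\phi} + \mathcal{F}^*\Big(\frac{\partial f}{\partial\phi}\Big)^*\\
& - \frac{1}{2}\frac{\partial^2f}{\partial\phi^2}\bar\psi_L\psi_L - \frac{1}{2}\Big(\frac{\partial^2f}{\partial\phi^2}\Big)^*(\bar\psi_L\psi_L)^*
\end{aligned}\tag{12.283}$$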
The Euler–Lagrange equation allows us to eliminate the non-dynamical field F
$$\frac{\partial L}{\partial\mathcal{F}} = 0 \;\Rightarrow\; \mathcal{F}(x) = -\Big(\frac{\partial f(\phi)}{\partial\phi}\Big)^* \tag{12.284}$$
whereupon the Lagrangian becomes
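$$L = \partial_\mu\phi^*\partial^\mu\phi + \frac{i}{2}\bar\psi\partial\!\!\!/\psi - P(\phi) - \frac{1}{2}\frac{\partial^2f}{\partial\phi^2}\bar\psi_L\psi_L - \frac{1}{2}\Big(\frac{\partial^2f}{\partial\phi^2}\Big)^*(\bar\psi_L\psi_L)^* \tag{12.285}$$
with $P(\phi) \equiv \big|\frac{\partial f(\phi)}{\partial\phi}\big|^2$.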
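The elimination of the auxiliary field is a purely algebraic step, which the following sketch illustrates with ordinary complex numbers standing in for the values of $\mathcal{F}$ and $\partial f/\partial\phi$ at a point. It checks that inserting (12.284) into the $\mathcal{F}$-dependent terms of (12.283) yields $-|\partial f/\partial\phi|^2$, and that any deviation $\delta$ from the solution changes those terms by exactly $|\delta|^2$, so the solution is indeed a stationary point. The function names and sample values are illustrative assumptions.

```python
def auxiliary_terms(F, df):
    """F^* F + F f' + F^* (f')^*: the F-dependent terms of (12.283).

    F and df are complex numbers standing in for the field values at one point.
    """
    value = F.conjugate() * F + F * df + F.conjugate() * df.conjugate()
    return value.real   # the imaginary parts cancel identically

def on_shell_F(df):
    """Solution (12.284) of the Euler-Lagrange equation for F."""
    return -df.conjugate()

def potential(df):
    """P = |f'|^2 as defined below (12.285)."""
    return abs(df) ** 2
```

For example, with `df = 0.8 - 1.5j` one should find `auxiliary_terms(on_shell_F(df), df)` equal to `-potential(df)` up to rounding, which is the origin of the term $-P(\phi)$ in (12.285).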
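In general, there will be a minimum of $f(\phi)$ where $\frac{\partial f}{\partial\phi} = 0$, putting the polynomial
$P(\phi)$ at its absolute minimum. In this case, SUSY is an unbroken global symmetry:
$$\langle 0|[\bar\xi_LQ, \psi_L]|0\rangle \propto \langle 0|\mathcal{F}|0\rangle \propto \langle 0|\frac{\partial f}{\partial\phi}|0\rangle = 0 \tag{12.286}$$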
implying that the generators Q annihilate the vacuum. There are ways to evade this
however (e.g., see (Weinberg, 1995b), Section 26.5, for a discussion of O’Raifeartaigh
breaking). One can show that if SUSY is unbroken at the lowest order, it will remain
unbroken to all orders of perturbation theory. Obviously, given the notable absence of
superpartners of equal mass to the known elementary particles, broken supersymmetry
is clearly the norm, if indeed supersymmetry is present in Nature at all.
Finally, note that since the fundamental SUSY algebra (12.191) implies that the
Hamiltonian P0 can be constructed as a product of Q and Q̄ operators, the exact
vacuum energy must vanish if SUSY is unbroken (i.e., the disconnected vacuum
energy graphs which determine the shift in vacuum energy due to interactions must
cancel identically between bosonic and fermionic loop contributions to all orders of
perturbation theory).
We conclude our abbreviated survey of supersymmetry by giving a simple explicit
example: historically, the first four-dimensional field theory in which supersymme-
try was demonstrated, the Wess–Zumino model (1974), obtained by taking for the
superpotential
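$$f(\phi) = \frac{1}{2}m\phi^2 + \frac{\sqrt{2}}{3}\lambda\phi^3 \tag{12.287}$$
$$\frac{\partial f}{\partial\phi} = m\phi + \sqrt{2}\lambda\phi^2 \tag{12.288}$$
$$\frac{\partial^2f}{\partial\phi^2} = m + 2\sqrt{2}\lambda\phi \tag{12.289}$$
We can re-express the complex scalar $\phi = \frac{1}{\sqrt{2}}(A + iB)$, where $A$, $B$ are real (i.e.,
hermitian) scalar fields. Then
$$P(\phi) = \Big|\frac{\partial f}{\partial\phi}\Big|^2 = \frac{1}{2}m^2(A^2 + B^2) + m\lambda A(A^2 + B^2) + \frac{\lambda^2}{2}(A^2 + B^2)^2 \tag{12.290}$$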
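The expansion (12.290) can be confirmed numerically by evaluating $|\partial f/\partial\phi|^2$ directly from (12.288) with $\phi = (A + iB)/\sqrt{2}$ and comparing with the polynomial in $A$ and $B$. The sketch below does this for arbitrary real inputs; the function names and test values are illustrative assumptions.

```python
import math

def dfdphi(phi, m, lam):
    """Derivative (12.288) of the Wess-Zumino superpotential."""
    return m * phi + math.sqrt(2) * lam * phi**2

def P_from_superpotential(A, B, m, lam):
    """|df/dphi|^2 evaluated at phi = (A + iB)/sqrt(2)."""
    phi = complex(A, B) / math.sqrt(2)
    return abs(dfdphi(phi, m, lam)) ** 2

def P_expanded(A, B, m, lam):
    """Right-hand side of (12.290) in terms of the real fields A and B."""
    r2 = A * A + B * B
    return 0.5 * m * m * r2 + m * lam * A * r2 + 0.5 * lam * lam * r2 * r2
```

The two functions should agree to rounding accuracy for any real `A`, `B`, `m`, `lam`. Note that the cubic term is odd in $A$ but even in $B$, reflecting the different roles of the scalar and pseudoscalar components.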
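$$\begin{aligned}
-\frac{1}{2}\frac{\partial^2f}{\partial\phi^2}\bar\psi_L\psi_L - \frac{1}{2}\Big(\frac{\partial^2f}{\partial\phi^2}\Big)^*(\bar\psi_L\psi_L)^* &= -\frac{1}{2}m(\bar\psi_L\psi_L + \bar\psi_R\psi_R) - \sqrt{2}\lambda\frac{A + iB}{\sqrt{2}}\bar\psi_L\psi_L - \sqrt{2}\lambda\frac{A - iB}{\sqrt{2}}\bar\psi_R\psi_R\\
&= -\frac{1}{2}m\bar\psi\psi - \lambda A\bar\psi\psi - i\lambda B\bar\psi\gamma_5\psi
\end{aligned}\tag{12.291}$$
$$\begin{aligned}
L = {}& \frac{1}{2}(\partial_\mu A)^2 + \frac{1}{2}(\partial_\mu B)^2 - \frac{1}{2}m^2(A^2 + B^2) + \frac{i}{2}\bar\psi\partial\!\!\!/\psi - \frac{m}{2}\bar\psi\psi\\
& - \lambda A\bar\psi\psi - i\lambda B\bar\psi\gamma_5\psi - m\lambda A(A^2 + B^2) - \frac{\lambda^2}{2}(A^2 + B^2)^2
\end{aligned}\tag{12.292}$$
The overall $\frac{1}{2}$ in the fermion kinetic term is the appropriate normalization for a self-conjugate Majorana field (recall the similar factor of $\frac{1}{2}$ when we go from complex
to real scalar fields). Given that, we see that the scalar and spin-$\frac{1}{2}$ fields correspond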
12.7 Problems
1. Show that the interaction Hamiltonian density (12.1) is non-ultralocal: i.e., its
equal-time commutator $[H_{int}(\vec x,t), H_{int}(\vec y,t)]$ contains a gradient of a $\delta$-function
(cf. (5.81)).
2. In the derivatively-coupled theory defined by interaction Hamiltonian density
(12.11), the interaction part V of the full Hamiltonian is given by
$$V = \int\Big\{g\bar\psi\gamma^\mu\gamma_5\psi\,\partial_\mu\phi(\vec y,0) + \frac{1}{2}g^2(\bar\psi\gamma^0\gamma_5\psi(\vec y,0))^2\Big\}d^3y \tag{12.293}$$
with all the (unsubscripted) fields in interaction picture and taken at time t = 0.
(a) Show that
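$$\begin{aligned}
\frac{\partial}{\partial t}\phi_H(\vec x,t) &= e^{iHt}e^{-iH_0t}\dot\phi(\vec x,t)e^{iH_0t}e^{-iHt} + e^{iHt}\,i[V, \phi(\vec x,0)]\,e^{-iHt}\\
&= \pi^\phi_H(\vec x,t) + e^{iHt}\,i[V, \phi(\vec x,0)]\,e^{-iHt}
\end{aligned}\tag{12.294}$$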
(b) Use the equal-time commutation relations (12.14) to evaluate the commutator
in part (a), thereby recovering (12.33):
$$\frac{\partial}{\partial t}\phi_H(\vec x,t) = \pi^\phi_H(\vec x,t) + g\bar\psi_H\gamma^0\gamma_5\psi_H(\vec x,t) \tag{12.295}$$
3. Show that the field equation for $\psi_H^\dagger$ (or $\bar\psi_H$) obtained by applying the first of
the Hamiltonian equations (12.31) to the Hamiltonian (12.22) is equivalent, after
taking its adjoint, to the Dirac Heisenberg field equation (12.36) for $\psi_H$.
4. The Hamiltonian density for a massive neutral vector field, described by a Lorentz
vector field Zμ , coupled to a four-vector source field J μ , which involves fields other
than Zμ (for example, we might have J μ = eψ̄γ μ ψ, with ψ a Dirac field) is given
by
H = H0 + Hint (12.296)
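$$H_0 = \frac{1}{2}\Big\{\vec\pi^2 + |\vec\nabla\times\vec Z|^2 + m^2\vec Z^2 + \frac{1}{m^2}(\vec\nabla\cdot\vec\pi)^2\Big\} \tag{12.297}$$
$$H_{int} = -\frac{1}{m^2}J^0\,\vec\nabla\cdot\vec\pi - \vec J\cdot\vec Z + \frac{1}{2m^2}(J^0)^2 \tag{12.298}$$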
17 Again anticipating later discussions of renormalizability, it can be shown that the underlying
supersymmetry ensures that counterterms are induced by radiative corrections only in the D-terms, not in the
F-terms. Accordingly, there are no mass or coupling renormalizations in this theory: the bare $m$, $\lambda$ can be
chosen at their fixed physical values! However, as the kinetic part of the Lagrangian derives from the D-term
of (12.272), there is a non-trivial wavefunction (field) renormalization: in other words, $Z \neq 1$.
Problems 465
$$L = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{m^2}{2}Z_\mu Z^\mu - J^\mu Z_\mu,\qquad F_{\mu\nu} \equiv \partial_\mu Z_\nu - \partial_\nu Z_\mu \tag{12.299}$$
4 2
Proceed as follows. First, show that Z 0 is a dependent field: namely one that
can be expressed, via the Euler–Lagrange equations of motion, uniquely and
entirely in terms of the other canonical fields and their conjugate momenta at
the same time. Do this by writing down the equation of motion for Z 0 and
showing that it reduces to
$$Z^0 = \frac{1}{m^2}(J^0 - \vec\nabla\cdot\vec\pi),\qquad \pi_i = \frac{\partial L}{\partial\dot Z_i} = \dot Z_i - \partial_iZ^0 \tag{12.300}$$
Show that (12.300) implies the formula $\pi_i = K^{-1}_{ij}(\dot Z_j - \frac{1}{m^2}\partial_jJ^0)$ obtained previously in the Hamiltonian framework. Finally, eliminate $Z^0$ completely from
the Lagrangian (12.299), and show that the resultant expression agrees with
that found in part (a).
(c) Now carry out the canonical procedure in the usual direction, by starting with
the Lagrangian, and eliminating $\dot{\vec{Z}}$ in favor of $\vec{\pi}$ in $H = \vec{\pi}\cdot\dot{\vec{Z}} - \mathcal{L}$, to check that
the original Hamiltonian (12.296–12.298) is recovered.
(d) In the functional integral approach, the fact that Z 0 is a dependent field
manifests itself in the Gaussian dependence of the Lagrangian density on Z 0
(and its spatial derivatives). Show that if we explicitly integrate out Z 0 in the
path integral
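$$\int \mathcal{D}Z^\mu\, e^{i\int\left(-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}m^2 Z_\mu Z^\mu - J_\mu Z^\mu\right)d^4x} \;\to\; \int \mathcal{D}\vec{Z}\, e^{i\int \mathcal{L}(\vec{Z}, \dot{\vec{Z}}, J^0, \vec{J})\,d^4x} \tag{12.301}$$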
the resultant Lagrangian L is exactly the expression found in part (a). (Hint:
note the identity $K^{-1}_{ij} = \delta_{ij} + \frac{1}{-\Delta + m^2}\partial_i\partial_j$.) Its disgustingly non-covariant (and
non-local) appearance is seen to arise from the fact that the Legendre transform
from the Hamiltonian side naturally produces a Lagrangian density with depen-
dent fields eliminated—which we see clearly in this instance serves to disguise
the underlying Lorentz-invariance of the theory. The same situation arises when
redundant fields, associated with local gauge symmetries, are present, as we
shall see in Chapter 15. Of course, the sensible way to ensure Lorentz-invariance
is to start with a Lorentz-invariant Lagrangian, from which we go (if needed)
to the Hamiltonian.
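The hint in part (d) can be checked numerically in momentum space. The sketch below assumes $K_{ij} = \delta_{ij} - \partial_i\partial_j/m^2$, which is the operator obtained by inserting the expression for $Z^0$ into the one for $\pi_i$ in (12.300); the function names are illustrative only. With $\partial_j \to i k_j$ the operator and its claimed inverse become ordinary 3x3 matrices whose product should be the identity.

```python
# Momentum-space check of the hint K^{-1}_{ij} = delta_ij + d_i d_j / (-Laplacian + m^2).
# Assumption of this sketch: K_ij = delta_ij - d_i d_j / m^2 (read off from (12.300)).
# With d_j -> i k_j:  K(k) = delta_ij + k_i k_j / m^2,
#                     Kinv(k) = delta_ij - k_i k_j / (k^2 + m^2).

def delta(i, j):
    return 1.0 if i == j else 0.0

def K(k, m):
    return [[delta(i, j) + k[i] * k[j] / m**2 for j in range(3)] for i in range(3)]

def K_inv(k, m):
    k2 = sum(c * c for c in k)
    return [[delta(i, j) - k[i] * k[j] / (k2 + m**2) for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(3)) for j in range(3)] for i in range(3)]

def is_identity(mat, tol=1e-9):
    return all(abs(mat[i][j] - delta(i, j)) < tol for i in range(3) for j in range(3))

k, m = (0.3, -1.2, 2.5), 0.7
print(is_identity(matmul(K(k, m), K_inv(k, m))))
```

The check holds for any real momentum and any non-zero mass, since the off-diagonal structure is always proportional to $k_i k_j$.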
5. The Hamiltonian density for a massless gauge vector field Aμ , coupled to a
conserved current J μ (which, as in Problem 4, we assume to depend on a separate
set of fields), is given in the axial gauge A3 = 0 by
H = H0 + Hint (12.302)
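$$H_0 = \frac{1}{2}\pi_i\left(\delta_{ij} + \frac{\partial_i\partial_j}{\partial_3^2}\right)\pi_j + \frac{1}{2}|\vec{\nabla}\times\vec{A}|^2 \tag{12.303}$$
$$H_{int} = \partial_i\pi_i\,\frac{1}{\partial_3^2}\,J^0 - \frac{1}{2}J^0\,\frac{1}{\partial_3^2}\,J^0 - J_i A_i \tag{12.304}$$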
where the spatial indices i, j run over the values 1,2. Perform the Legendre
transformation to obtain the Lagrange density
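$$\mathcal{L} = \frac{1}{2}\dot{A}_i^2 + \frac{1}{2}(\partial_i\dot{A}_i - J^0)\,\frac{1}{\Delta}\,(\partial_j\dot{A}_j - J^0) - \frac{1}{2}|\vec{\nabla}\times\vec{A}|^2 + J_i A_i \tag{12.305}$$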
(b) Starting instead with the Lagrangian
$$\mathcal{L}_{QED} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - J^\mu A_\mu, \qquad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu \tag{12.306}$$
show that the equation of motion for A0 implies that it is a dependent field:
$$\Delta A_0 = \partial_i\dot{A}_i - J^0 \;\Rightarrow\; A_0 = \frac{1}{\Delta}(\partial_i\dot{A}_i - J^0) = \frac{1}{\partial_3^2}(\partial_i\pi_i - J^0) \tag{12.307}$$
fields φn and their conjugate momenta πn , the commutation relations for the charge
densities
$$[J^0_\alpha(\vec{x}, t), J^0_\beta(\vec{y}, t)] = i f_{\alpha\beta\gamma}\,\delta^3(\vec{x} - \vec{y})\,J^0_\gamma(\vec{x}, t) \tag{12.309}$$
8. Rederive the Ward–Takahashi identity (12.142) using operator methods. One may
assume without loss of generality that the y1 , y2 , ..yp are already time-ordered. One
can then write the T-product as a sum of terms with explicit θ-functions enforcing
the time-ordering of the current relative to the φn fields, thereby facilitating the
application of the spacetime-derivative. The contact terms arise from the μ = 0
derivative, which results in a series of equal-time commutators, at which point
(12.128) may be employed.
9. Verify that the variation in the Lagrangian density (12.149) induced by the
infinitesimal SUSY transformations listed in (12.157) is a space-time divergence,
as given in (12.158). The Grassmann identity (C.12) from Appendix C will be
useful here.
10. Here, we shall check that the SUSY current (12.154) is indeed the appropriate
conserved current of the form (12.82), in a situation in which there is a non-trivial
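variation in the Lagrangian density (12.149) (given by a spacetime divergence).
From the SUSY variations (again ignoring the noisome $\sqrt{2}$ normalization) of the
fields given in (12.157), and $\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \partial^\mu\phi^*$, $\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^*)} = \partial^\mu\phi$, and $\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} = \frac{i}{2}\bar{\psi}\gamma^\mu$,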
show that the conventional Noether current takes the form (including the Grass-
mann infinitesimal ξ)
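$$J^\mu_{Noeth} = i(\partial^\mu\phi^*)\,\bar{\xi}P_R\psi + i(\partial^\mu\phi)\,\bar{\xi}P_L\psi + \frac{i}{2}\bar{\psi}\gamma^\mu(\not\!\partial\phi)P_R\xi + \frac{i}{2}\bar{\psi}\gamma^\mu(\not\!\partial\phi^*)P_L\xi \tag{12.311}$$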
Show that the correct conserved current J μ in (12.154) is obtained from this by
adding in the K μ correction term arising from (12.158). Some of the Grassmann
identities from Appendix C (specifically, (C.12) and (C.13)) will be useful in
interchanging the order $\bar{\psi}\cdots\xi \to \bar{\xi}\cdots\psi$.
11. In this exercise we shall verify explicitly the anticommutation algebra of the
supersymmetry generators for the theory given by Lagrangian (12.149), as given in
(12.147). The fundamental equal-time (anti)commutation relations of the theory
(12.151,12.152,12.153) will be used: note that if φa , φb (resp. ψ, ψ̄) are bosonic
(resp. fermionic) fields, then the anticommutator {φa ψ, ψ̄φb } can be rearranged
into φa {ψ, ψ̄}φb − ψ̄[φa , φb ]ψ. A short calculation then shows that the anticom-
mutator {Qα , Q̄β } can be written as the sum of a bosonic and fermionic part,
where for example
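$$\{Q_\alpha, \bar{Q}_\beta\}_{bos} = 2\int d^3x\,\left(\not\!\partial\phi\,\gamma^0 P_R\,\not\!\partial\phi^* + \not\!\partial\phi^*\,\gamma^0 P_L\,\not\!\partial\phi\right)_{\alpha\beta} \tag{12.312}$$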
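The rearrangement of $\{\phi_a\psi, \bar{\psi}\phi_b\}$ quoted in this problem can be spelled out in a few lines. The sketch assumes that the bosonic fields commute with $\psi$ and $\bar{\psi}$ at equal times, and suppresses spinor indices:

```latex
\begin{aligned}
\{\phi_a\psi,\ \bar{\psi}\phi_b\}
 &= \phi_a\,\psi\bar{\psi}\,\phi_b + \bar{\psi}\,\phi_b\phi_a\,\psi \\
 &= \phi_a\{\psi,\bar{\psi}\}\phi_b - \phi_a\,\bar{\psi}\psi\,\phi_b + \bar{\psi}\,\phi_b\phi_a\,\psi \\
 &= \phi_a\{\psi,\bar{\psi}\}\phi_b - \bar{\psi}\,(\phi_a\phi_b - \phi_b\phi_a)\,\psi \\
 &= \phi_a\{\psi,\bar{\psi}\}\phi_b - \bar{\psi}\,[\phi_a,\phi_b]\,\psi .
\end{aligned}
```

The first term feeds the fermionic part of the anticommutator and the second, through the bosonic equal-time commutators, the bosonic part.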
The invariance of a quantum field theory under the physically realizable transfor-
mations embodied in the Poincaré group—the proper orthochronous elements of the
homogeneous Lorentz group corresponding to a realizable change of inertial frame,
together with spacetime translations—has been built into the foundations of the
theory from the very beginning, and many decades of experimental investigation
at the subatomic level (where effects of gravitation can safely be neglected) have
confirmed that this symmetry, if broken, can only fail at an extremely subtle and
quantitatively minute level. Formally, the Lorentz group admits an obvious extension
if we allow the discrete operations of space and time reflection, corresponding to the
improper (because det(Λ) = −1) Lorentz transformations1 ΛP = diag(1, −1, −1, −1)
and ΛT = diag(−1, 1, 1, 1), respectively. These operations generate potential symme-
tries of a quantum field theory, but we must emphasize the term “potential” here, for
the simple reason that only some of the interactions in Nature (specifically, the strong
and electromagnetic) appear to possess an exact invariance under spatial (parity)
and temporal reflection (time reversal). The failure of parity invariance in the weak
interactions, proposed by Lee and Yang in 1956 and confirmed experimentally shortly afterwards, came as a great shock at
the time, but we have since come to realize that Nature does not share the human
prejudice that the laws of Nature should take exactly the same form whether expressed
in a left- or right-handed coordinate system.
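The statements about $\Lambda_P$ and $\Lambda_T$ are easy to verify directly. The following is a minimal sketch in plain Python (the helper names are illustrative, not notation from the text): it checks that both reflections preserve the metric diag(1, −1, −1, −1), that both have determinant −1, and that their product has determinant +1 but is not orthochronous.

```python
# Diagonal Lorentz matrices are stored as their four diagonal entries.
METRIC = (1, -1, -1, -1)      # g = diag(+,-,-,-)
LAMBDA_P = (1, -1, -1, -1)    # space reflection
LAMBDA_T = (-1, 1, 1, 1)      # time reflection

def determinant(diag):
    result = 1
    for entry in diag:
        result *= entry
    return result

def preserves_metric(diag):
    # For a diagonal Lambda, Lambda^T g Lambda is diagonal with entries d * g * d.
    return all(d * g * d == g for d, g in zip(diag, METRIC))

def is_orthochronous(diag):
    # Orthochronous transformations have Lambda^0_0 >= 1.
    return diag[0] >= 1

def compose(a, b):
    return tuple(x * y for x, y in zip(a, b))

for name, lam in [("P", LAMBDA_P), ("T", LAMBDA_T), ("PT", compose(LAMBDA_P, LAMBDA_T))]:
    print(name, determinant(lam), preserves_metric(lam), is_orthochronous(lam))
```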
Although this chapter is entitled “discrete spacetime symmetries”, it turns out that
it is essential to include the discrete symmetry of charge conjugation—the interchange
of particles and antiparticles—in almost the same breath when discussing parity
and time reversal, even though the connection to “spacetime” is less than obvious
for a symmetry involving the swapping of particles for antiparticles. The reason
for this was already indicated in an heuristic fashion in Chapter 3, where we saw
that the assumptions of relativistic invariance, quantum mechanics, and locality of
interactions implied that exchange of a virtual particle is physically indistinguishable
from the spatiotemporally reflected exchange of the corresponding antiparticle. Our
main objective in this chapter will be to show that the combination of these three
discrete operations—parity, time reversal, and charge conjugation—is necessarily an
exact symmetry of any local relativistic quantum field theory, even when (as is the
case in the Standard Model of elementary particles as it presently stands) none of
the individual operations represents an exact symmetry of the theory. This result—
commonly referred to as the TCP theorem—will be obtained first for the case of
theories with a dynamics specified by a Lagrangian (or Hamiltonian) density, and
secondly, in a more general framework, by appealing to the irreducible fundamental
properties of local field theory as incorporated in the Wightman axioms of Section 9.2.
As a natural concomitant of the axiomatic proof of the TCP theorem, we shall also
indicate how the Spin-Statistics connection (previously discussed in Section 7.3, in
the context of Hamiltonian densities built from polynomials in local fields) also arises
rigorously from the spectral and locality axioms of the Wightman formulation of field
theory, with no commitment needed to a specific Lagrangian/Hamiltonian dynamics.
$$P|0\rangle = |0\rangle$$
where the complex unimodular number η is called the “intrinsic parity” of the given
particle. Likewise, for a two-particle state
leading to the following pair of transformation equations for the creation and annihi-
lation operators (note: P is unitary)
Note that we must in general allow for the charge conjugate antiparticle to have a
different intrinsic parity (more on this later):
Of course, an exactly similar procedure can be used to define parity operators Pin
(resp. Pout ) acting on the in- (resp. out-)states of the theory, simply by appending a
subscript “in” or “out” to the state vectors and creation–annihilation operators in the
preceding. The presence of an exact parity symmetry (“parity conservation”) in the
theory then amounts to the statement that a single parity operator Pin = Pout effects
the parity transformation for both sets of asymptotic states. This occurs in the event
that the full Hamiltonian of the theory commutes with the parity operator P defined
on the interaction-picture states as above. Namely, if we can choose the intrinsic parity
quantum numbers η for the participating particles such that [P, H0 ] = [P, V ] = 0, it
follows in the usual way that [P, Vint ] = 0, and hence [P, S] = 0. This in turn implies
where the prime indicates states with reversed momenta but unchanged spins (see
Fig. 13.1 for an example). Exactly the same result obtains by applying a similarity
operation to the sequence of out annihilation and in creation operators whose vacuum
expectation value expresses the above S-matrix element, and using the identity
of Pin and Pout . As a consequence of the unimodularity of the intrinsic parities,
$|S_{\beta\alpha}| = |S_{\beta'\alpha'}|$, which generally prevents the appearance in the scattering amplitudes
of mixtures of scalar and pseudoscalar functions of spin and momentum (for many
explicit examples, see Chapter 3 of the excellent book of Sakurai (Sakurai, 1964)).
As usual, we build symmetries into the Hamiltonian (or Lagrangian) density H
(resp. L) by constructing fields which have simple transformation properties under the
symmetry group—in this case, the discrete parity transformation. As above, we begin
by considering the effect of the parity operation as defined on a free interaction-picture
(or in- or out-)field. We determined in Section 7.3 the general form of a free local
covariant field φAB (x) transforming according to an irreducible (AB) representation
of the Lorentz group, and restate the final result (7.84) here:
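$$\phi^{AB}_{ab}(x) = \int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E(k)}}\sum_\sigma\left(u^{AB}_{ab}(\vec{k},\sigma)\,e^{-ik\cdot x}\,a(\vec{k},\sigma) + (-)^{2B}(-)^{j-\sigma}\,u^{AB}_{ab}(\vec{k},-\sigma)\,e^{ik\cdot x}\,a^{c\dagger}(\vec{k},\sigma)\right) \tag{13.10}$$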
From (13.6–13.8) we find immediately for the transformation of this field under P
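$$P\phi^{AB}_{ab}(\vec{x},t)P^{-1} = \int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}\sum_\sigma\left(u^{AB}_{ab}(\vec{k},\sigma)\,e^{-ik\cdot x}\,\eta\,a(-\vec{k},\sigma) + (-)^{2B+j-\sigma}\,u^{AB}_{ab}(\vec{k},-\sigma)\,e^{ik\cdot x}\,\eta^{c*}\,a^{c\dagger}(-\vec{k},\sigma)\right) \tag{13.11}$$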
472 Symmetries II: Discrete spacetime symmetries
[Figure 13.1: four panels showing the amplitude $S_{\beta\alpha}$ and its transforms $S_{P\beta\,P\alpha}$, $S_{C\beta\,C\alpha}$ and $S_{T\alpha\,T\beta}$.]
Fig. 13.1 A weak interaction process ($e^- + \mu^+ \to \nu_e + \bar{\nu}_\mu$) and its P, C, and T transforms.
The momentum (straight arrow) and spin (squiggly arrow) vectors are to be taken literally: we
have chosen an example with incoming and outgoing particles of definite helicity.
Recall the explicit formula (7.69) derived earlier for the coefficient functions $u^{AB}_{ab}$:
We can reverse the sign of the momentum by inverting the boost rapidity angle θ, and
also use the Clebsch–Gordan identity $\langle ABa'b'|j\sigma\rangle = (-)^{A+B-j}\langle BAb'a'|j\sigma\rangle$, obtaining
$$u^{AB}_{ab}(-\vec{k},\sigma) = \sum_{a'b'}(e^{\theta\hat{k}\cdot\vec{A}})_{aa'}(e^{-\theta\hat{k}\cdot\vec{B}})_{bb'}(-)^{A+B-j}\langle BAb'a'|j\sigma\rangle = (-)^{A+B-j}\,u^{BA}_{ba}(\vec{k},\sigma) \tag{13.13}$$
Parity properties of a general local covariant field 473
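Changing integration variable $\vec{k} \to -\vec{k}$ in (13.11), and defining the parity inverted
spacetime point $\tilde{x} = (x^0, -\vec{x})$:
$$P\phi^{AB}_{ab}(\vec{x},t)P^{-1} = (-)^{A+B-j}\int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}\sum_\sigma\left(u^{BA}_{ba}(\vec{k},\sigma)\,e^{-ik\cdot\tilde{x}}\,\eta\,a(\vec{k},\sigma) + (-)^{2B-2A}\,\eta^{c*}\,(-)^{2A+j-\sigma}\,u^{BA}_{ba}(\vec{k},-\sigma)\,e^{ik\cdot\tilde{x}}\,a^{c\dagger}(\vec{k},\sigma)\right) \tag{13.14}$$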
The field appearing in (13.14) must be one of the covariant fields listed in Section
7.3 (cf. (7.84)), if a Hamiltonian constructed from such fields is to stand a chance of
commuting with the parity operator P. Indeed, it is almost the field $\phi^{BA}_{ba}$! For this to
work, however, we must choose
This establishes the well-known theorem that the intrinsic parity $\eta\eta^c$ of a particle–
antiparticle system is $(-)^{2j}$ (hence negative for fermions), a result critical for the
understanding of positronium decay, for example.
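As a small illustration of this theorem, the sketch below tabulates the intrinsic parity $(-)^{2j}$ of a particle–antiparticle pair. The extra orbital factor $(-1)^L$ used for a bound state of relative orbital angular momentum $L$ is the standard one and is an assumption of this sketch rather than something derived in the text; the function names are illustrative.

```python
from fractions import Fraction

def pair_intrinsic_parity(j):
    """eta * eta^c = (-)^(2j) for a particle of spin j and its antiparticle."""
    two_j = 2 * Fraction(j)
    if two_j.denominator != 1 or two_j < 0:
        raise ValueError("spin must be a non-negative integer or half-integer")
    return -1 if two_j % 2 else 1

def bound_state_parity(j, orbital_l):
    """Assumed total parity: intrinsic pair parity times (-1)^L."""
    return pair_intrinsic_parity(j) * (-1) ** orbital_l

# Positronium (j = 1/2) in an S-wave and in a P-wave, and a spin-0 pair in an S-wave.
print(bound_state_parity(Fraction(1, 2), 0))
print(bound_state_parity(Fraction(1, 2), 1))
print(bound_state_parity(0, 0))
```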
Incorporating the constraint (13.15), we find the desired transformation law of an
arbitrary covariant field under the parity operation:
As a specific example, for a Dirac field, with $A + B = j = \frac{1}{2}$, the effect of the parity
transformation evidently reverses the upper and lower components of the Dirac
4-spinor (as well as the spatial coordinate x, of course), so (cf. (7.107)) we have simply
The transformation rule (13.16) for the free interaction-picture fields (or, with
appropriate subscripts, in- or out-fields) applies whether or not the interacting
dynamics of the theory is invariant under parity. Only if the interaction part of the
Hamiltonian commutes with P however, can we extend (13.16) to the full Heisenberg
fields $\phi^{AB}_H(\vec{x},t)$, as in this case P commutes with the transformation operator
$U(t,0) = e^{iH_0t}e^{-iHt}$ effecting the transformation from interaction to Heisenberg picture (cf.
(9.1)).² In this case, parity symmetry is equivalent to the statement that the full
Heisenberg fields satisfy
with $P|0\rangle = |0\rangle$, where $|0\rangle = |0\rangle_{in} = |0\rangle_{out}$ is the vacuum of the full Hamiltonian of the
theory. The transformation property (13.18) transfers immediately to a corresponding
transformation rule for the Wightman functions of the theory, and thence, by the
2 Of course, we must acknowledge here the usual embarrassment occasioned by Haag’s theorem: in
our appeal to the interaction picture, we presume some regularization which restores the existence of the
interaction picture, which either preserves parity or a sufficiently close simulacrum thereto that the passage
to a continuum limit is smooth vis-à-vis the parity properties of the theory.
Haag–Ruelle theory, to the exact S-matrix elements, and we recover the phenomeno-
logical constraints (13.9). In a theory like the weak interactions, on the other hand,
where parity is broken, there does not—indeed cannot—exist a single unitary operator
P(= Pin = Pout ) effecting the transformation (13.18) on the Heisenberg fields of the
theory. Exactly the same situation holds with respect to the discrete symmetries of
charge conjugation and time reversal which we are about to consider: the reader should
keep in mind that although these operations are conveniently defined in terms of free
fields and states, their extension to the fully interacting Heisenberg fields of the theory
presupposes that the symmetry is in fact unbroken. As we shall see, the only operation
for which this is in fact the case in the real world (assuming the general validity of the
Wightman axioms) is the combined application of time-reversal, charge-conjugation,
and parity (in any order), the TCP operator.
Provided we choose $\zeta = \zeta_c^*$, the resultant field is simply related to the hermitian
conjugate of φ (allowing us to construct hermitian, local, and charge conjugation
invariant interactions, by balancing φ(x) and φ† (x) factors in the interaction density):
The general analysis for non-zero spin is a bit messy, though straightforward. First,
we shall need the conjugation property for a general rotation matrix Dj (see, for
example, (Messiah, 1966), Appendix C, Eqs. (62, 64)):
Charge-conjugation properties of a general local covariant field 475
$$(e^{-\theta\hat{k}\cdot\vec{A}})^*_{aa'} = (Y^{(A)}e^{\theta\hat{k}\cdot\vec{A}}Y^{(A)\dagger})_{aa'} = (-1)^{2A+a+a'}(e^{\theta\hat{k}\cdot\vec{A}})_{-a,-a'} \tag{13.25}$$
providing the information needed to derive the relevant complex conjugation property:
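$$\begin{aligned}
u^{AB}_{ab}(\vec{k},\sigma)^* &= \sum_{a'b'}(-)^{2A+a+a'}(e^{\theta\hat{k}\cdot\vec{A}})_{-a,-a'}\,(-)^{2B+b+b'}(e^{-\theta\hat{k}\cdot\vec{B}})_{-b,-b'}\,\langle ABa'b'|j\sigma\rangle \\
&= (-)^{A+B-j}(-)^{j+\sigma}(-)^{A+a}(-)^{B+b}\,u^{BA}_{-b,-a}(\vec{k},-\sigma) \\
&= (-)^{A+B-j}\sum_{\sigma'a'b'}Y^{(j)}_{\sigma\sigma'}Y^{(A)}_{aa'}Y^{(B)}_{bb'}\,u^{BA}_{b'a'}(\vec{k},\sigma')
\end{aligned} \tag{13.26}$$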
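The component form of the conjugation property (13.25) can be checked by hand for the simplest non-trivial case $A = \frac{1}{2}$. The sketch below assumes the generators are $\vec{A} = \vec{\sigma}/2$ (Pauli matrices), so that $e^{\theta\hat{k}\cdot\vec{A}} = \cosh(\theta/2) + \hat{k}\cdot\vec{\sigma}\,\sinh(\theta/2)$ for a unit vector $\hat{k}$; the function names are illustrative.

```python
import math

def boost_matrix(theta, n):
    """exp(theta * n.sigma / 2) for a unit vector n, written out in closed form."""
    c, s = math.cosh(theta / 2), math.sinh(theta / 2)
    n1, n2, n3 = n
    return [[complex(c + n3 * s), (n1 - 1j * n2) * s],
            [(n1 + 1j * n2) * s, complex(c - n3 * s)]]

def conjugation_property_holds(theta, n, tol=1e-12):
    """Check (e^{-theta n.A})^*_{aa'} = (-1)^{2A+a+a'} (e^{theta n.A})_{-a,-a'} for A = 1/2."""
    lhs = boost_matrix(-theta, n)
    rhs = boost_matrix(theta, n)
    for i in range(2):              # index 0 <-> a = +1/2, index 1 <-> a = -1/2
        for k in range(2):
            a, a_prime = 0.5 - i, 0.5 - k
            sign = (-1) ** int(round(1 + a + a_prime))   # 2A = 1
            # a -> -a corresponds to index -> 1 - index
            if abs(lhs[i][k].conjugate() - sign * rhs[1 - i][1 - k]) > tol:
                return False
    return True

print(conjugation_property_holds(0.8, (1 / 3, 2 / 3, 2 / 3)))
```

The same pattern of signs is what the matrix $Y^{(A)}$ encodes for general $A$.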
The transformation property of the general covariant field (13.10) under the
C-operation now follows straightforwardly:
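$$\begin{aligned}
C\phi^{AB}_{ab}(x)C^{-1} &= \int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}\sum_\sigma\left(u^{AB}_{ab}(\vec{k},\sigma)\,\zeta\,a^c(\vec{k},\sigma)\,e^{-ik\cdot x} + (-)^{2B}(-)^{j-\sigma}\,u^{AB}_{ab}(\vec{k},-\sigma)\,\zeta_c^*\,a^\dagger(\vec{k},\sigma)\,e^{ik\cdot x}\right) \\
&= (-)^{A+B-j}\int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}\sum_{\sigma\sigma'}\sum_{a'b'}Y^{(A)}_{aa'}Y^{(B)}_{bb'}\left\{Y^{(j)}_{\sigma\sigma'}\,u^{BA}_{b'a'}(\vec{k},\sigma')\,\zeta^*\,a^{c\dagger}(\vec{k},\sigma)\,e^{ik\cdot x} + (-)^{2B}(-)^{j-\sigma}\,Y^{(j)}_{-\sigma,-\sigma'}\,u^{BA}_{b'a'}(\vec{k},-\sigma')\,\zeta_c\,a(\vec{k},\sigma)\,e^{-ik\cdot x}\right\}^\dagger \\
&= (-)^{A+B-j}\sum_{a'b'}Y^{(A)}_{aa'}Y^{(B)}_{bb'}\int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}\sum_\sigma\left\{(-)^{2B}\,\zeta_c\,u^{BA}_{b'a'}(\vec{k},\sigma)\,a(\vec{k},\sigma)\,e^{-ik\cdot x} + \zeta^*\,(-)^{j+\sigma}\,u^{BA}_{b'a'}(\vec{k},-\sigma)\,a^{c\dagger}(\vec{k},\sigma)\,e^{ik\cdot x}\right\}^\dagger
\end{aligned} \tag{13.27}$$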
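Using the relation $(-)^{j+\sigma} = (-)^{j-\sigma}(-)^{2j} = (-)^{j-\sigma}(-)^{2A+2B}$ (as $2j - 2\sigma$ is even, and
A and B must be able to couple to j), this becomes
$$C\phi^{AB}_{ab}(x)C^{-1} = (-)^{A+B-j}\sum_{a'b'}Y^{(A)}_{aa'}Y^{(B)}_{bb'}\int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E}}\sum_\sigma\left\{\zeta_c\,(-)^{2B}\,u^{BA}_{b'a'}(\vec{k},\sigma)\,a(\vec{k},\sigma)\,e^{-ik\cdot x} + \zeta^*\,(-)^{2B}(-)^{2A}(-)^{j-\sigma}\,u^{BA}_{b'a'}(\vec{k},-\sigma)\,a^{c\dagger}(\vec{k},\sigma)\,e^{ik\cdot x}\right\}^\dagger \tag{13.28}$$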
Comparing the field in the last expression with the general result (7.84), we see that
we recover the (hermitian adjoint) of a covariant field if and only if
$$\zeta_c = \zeta^* \tag{13.29}$$
As in the case of parity, if $A \neq B$ we must include both $\phi^{AB}$ and $\phi^{BA}$ fields in
the Hamiltonian if the theory is to be invariant under C-conjugation. However, the
combined operation of charge conjugation and parity, CP, does not mix these two
types, so CP-invariant theories are possible in the presence of asymmetric (chiral)
representations:
$$CP\,\phi^{AB}_{ab}(\vec{x},t)\,(CP)^{-1} = \eta\zeta(-)^{2(B-j)}\sum_{a'b'}Y^{(A)}_{aa'}Y^{(B)}_{bb'}\,\phi^{AB\dagger}_{a'b'}(-\vec{x},t) \tag{13.31}$$
Again, in analogy to (13.17) for the parity operator, the general result (13.30)
reduces in the Dirac field case ($(\frac{1}{2}, 0) \oplus (0, \frac{1}{2})$ representation) to
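Another example: choosing ζ for a j=1, $(\frac{1}{2}, \frac{1}{2})$ field. The connection between the four-
vectorial notation $\phi_\mu$ and the $\phi^{AB}_{ab}$ notation is as follows
$$\begin{aligned}
\phi^{\frac{1}{2}\frac{1}{2}}_{\frac{1}{2},\frac{1}{2}} &= \frac{1}{\sqrt{2}}(\phi_1 - i\phi_2) \\
\phi^{\frac{1}{2}\frac{1}{2}}_{\frac{1}{2},-\frac{1}{2}} &= \frac{1}{\sqrt{2}}(\phi_0 - \phi_3) \\
\phi^{\frac{1}{2}\frac{1}{2}}_{-\frac{1}{2},\frac{1}{2}} &= -\frac{1}{\sqrt{2}}(\phi_0 + \phi_3) \\
\phi^{\frac{1}{2}\frac{1}{2}}_{-\frac{1}{2},-\frac{1}{2}} &= -\frac{1}{\sqrt{2}}(\phi_1 + i\phi_2)
\end{aligned} \tag{13.33}$$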
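A quick consistency check on the dictionary (13.33): the antisymmetric combination $2(\phi_{++}\phi_{--} - \phi_{+-}\phi_{-+})$ of the four components should reproduce the Minkowski square $\phi_0^2 - \phi_1^2 - \phi_2^2 - \phi_3^2$. The sketch below verifies this numerically; the choice of this particular invariant and the helper names are assumptions of the sketch, not taken from the text.

```python
import math

def to_ab_components(phi):
    """Map (phi_0, phi_1, phi_2, phi_3) to the (1/2,1/2) components, keyed by (2a, 2b)."""
    p0, p1, p2, p3 = phi
    r = 1 / math.sqrt(2)
    return {
        (+1, +1): r * (p1 - 1j * p2),
        (+1, -1): r * (p0 - p3),
        (-1, +1): -r * (p0 + p3),
        (-1, -1): -r * (p1 + 1j * p2),
    }

def minkowski_square(phi):
    p0, p1, p2, p3 = phi
    return p0 ** 2 - p1 ** 2 - p2 ** 2 - p3 ** 2

def antisymmetric_invariant(c):
    return 2 * (c[(1, 1)] * c[(-1, -1)] - c[(1, -1)] * c[(-1, 1)])

phi = (1.5, -0.4, 2.0, 0.3)
components = to_ab_components(phi)
print(abs(antisymmetric_invariant(components) - minkowski_square(phi)) < 1e-12)
```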
Time-reversal properties of a general local covariant field 477
As in the case of parity and charge conjugation, the antiunitary character of the
time reversal operator allows for an arbitrary phase factor τ (resp. τ c )—or intrinsic
“time-parity”—for the particle (resp. antiparticle) in question. The transformation
property (13.36) transfers in the usual way to a transformation property of creation
and destruction operators under a similarity transformation with T :
Recalling that T contains the complex-conjugation operator K (cf. (4.60)), one finds for
the time-reversal transformation of the covariant field (13.10), using (13.38, 13.39),
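$$T\phi^{AB}_{ab}(x)T^{-1} = \int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E(k)}}\sum_{\sigma\sigma'}\left(u^{AB*}_{ab}(\vec{k},\sigma)\,e^{ik\cdot x}\,\tau\,Y^{(j)}_{\sigma\sigma'}\,a(-\vec{k},\sigma') + (-)^{2B}(-)^{j-\sigma}\,u^{AB*}_{ab}(\vec{k},-\sigma)\,e^{-ik\cdot x}\,\tau^{c*}\,Y^{(j)}_{\sigma\sigma'}\,a^{c\dagger}(-\vec{k},\sigma')\right) \tag{13.41}$$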
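Changing integration variable $\vec{k} \to -\vec{k}$, defining $x_t = (\vec{x}, -t)$, and using the reflection
and conjugation properties (13.13, 13.26) of the $u^{AB}_{ab}(\vec{k},\sigma)$, a few lines of algebra (see
Problem 3) leave us with
$$T\phi^{AB}_{ab}(x)T^{-1} = \tau\sum_{a'b'}Y^{(A)}_{aa'}Y^{(B)}_{bb'}\int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2E(k)}}\sum_\sigma\left(u^{AB}_{a'b'}(\vec{k},\sigma)\,e^{-ik\cdot x_t}\,a(\vec{k},\sigma) + \frac{\tau^{c*}}{\tau}(-)^{2B}(-)^{j-\sigma}\,u^{AB}_{a'b'}(\vec{k},-\sigma)\,e^{ik\cdot x_t}\,a^{c\dagger}(\vec{k},\sigma)\right) \tag{13.42}$$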
The right-hand side of this expression is clearly a covariant field of type $\phi^{AB}_{a'b'}$, provided
we make the by now usual identification
$$\tau^c = \tau^* \tag{13.43}$$
[Figure 13.2: two panels showing the amplitude $S_{\beta\alpha}$ and its TCP transform $S_{\Theta\alpha\,\Theta\beta}$.]
Fig. 13.2 The weak interaction process ($e^- + \mu^+ \to \nu_e + \bar{\nu}_\mu$) of Fig. 13.1 and its Θ = TCP
transform. The TCP theorem ensures equality (up to phase) of the amplitudes for the process
and its TCP transform in any (weakly) local relativistic quantum field theory describing the
interactions responsible for the process.
ηζτ = 1 (13.49)
It follows that the action obtained as the spacetime integral of any hermitian, Lorentz-
scalar Lagrangian density L(φ, ∂μ φ, φ† , ∂μ φ† ) is invariant under the combined TCP
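$$T_{\mu\nu}(x) = \frac{\partial\mathcal{L}}{\partial(\partial^\mu\phi)}\,\partial_\nu\phi(x) + \frac{\partial\mathcal{L}}{\partial(\partial^\mu\phi^\dagger)}\,\partial_\nu\phi^\dagger(x) - g_{\mu\nu}\mathcal{L} \tag{13.51}$$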
Recalling that the Hamiltonian H of the theory is simply the spatial integral of the
density T00 (x, 0),
$$H = \int d^3x\,T_{00}(\vec{x},0) \;\Rightarrow\; \Theta H\Theta^{-1} = H \;\Rightarrow\; [\Theta, H] = 0 \tag{13.53}$$
so the Hamiltonian of our theory is also invariant under the TCP operation.
The above argument is easily generalized to fields $\phi^{AB}_{ab}(x)$ in arbitrary representations
of the HLG: remarkably, we find that any such fields have (up to a sign) exactly
the same transformation property as the scalar field above. Combining (13.16, 13.30,
13.44) (or more directly, (13.31) and (13.44)), a few lines of algebra lead to
$$\Theta\,\phi^{AB}_{ab}(x)\,\Theta^{-1} = (\eta\zeta\tau)^*(-)^{2A}\,\phi^{AB\dagger}_{ab}(-x) = (-)^{2A}\,\phi^{AB\dagger}_{ab}(-x) \tag{13.54}$$
where we have inserted the phase choice (13.49). The argument for invariance of a
Lagrangian field theory constructed from such fields now follows directly. First, we note
that the four-vector spacetime gradient $\partial_\mu$, which transforms according to the $(\frac{1}{2}, \frac{1}{2})$
representation of the HLG, when applied to a $\phi^{AB}$-type field multiplet, generates, by
the usual Clebsch–Gordan machinery of angular momentum addition, a combination
of irreducible fields of type $\phi^{A'B'}$ (with $A' = |A \pm \frac{1}{2}|$, $B' = |B \pm \frac{1}{2}|$), which individually
transform under T, C, and P just as we have indicated above. Thus, we may assume
henceforth that the (scalar!) Lagrangian density of our theory is simply a sum of terms
involving products of φAB fields in which the independent A and B representation
labels are separately coupled to zero spin. In any such product, the sign factors $(-)^{2A}$
must also multiply to give +1, and we therefore have the strong reflection property
for the scalar Lagrangian density
where we have again assumed an hermitian Lagrangian in the last step. The invariance
of the action of the theory, given as the spacetime integral of the Lagrangian density,
The TCP and Spin-Statistics theorems 481
now follows as before. A similar argument3 ensures the strong reflection property for
the Hamiltonian density T00 (x), and hence the commutativity of the Hamiltonian with
the Θ operator.
The preceding discussion of TCP symmetry is based on a specification of the
dynamics in terms of a Lagrangian (or associated Hamiltonian) built from local fields.
A much more general understanding of the origins of TCP symmetry can be given using
the Wightman axiomatic formalism discussed in Section 9.2, with no commitment
whatsoever to a detailed dynamics of the fields. The basic ingredients which, once
present, ensure the exact TCP invariance of a field theory are (a) the spectral
and Lorentz transformation properties needed to establish the analytic properties of
the Wightman functions embodied in the Hall–Wightman theorem (specifically, the
property (9.76) following directly from that theorem), and (b) a weaker version of
space-like commutativity of the fields (called weak local commutativity, or WLC for
short), in which the vanishing of the relevant commutators (or anticommutators) is
only required in the vacuum expectation value. Without giving detailed proofs, we
shall outline the argument here.
We see, comparing (13.50) and (13.54), that the behavior of a general covariant
field under the TCP operation is effectively identical to that of a scalar field, so to
strip out inessential details and present the essence of the argument most clearly, we
shall assume that we are dealing with a theory of a single real self-interacting scalar
field φ(x). The invariance of the theory (and its vacuum) under TCP implies the
existence of an antiunitary operator Θ satisfying (13.50): this then means that the
n-point Wightman function defined by4
which follows immediately from the fact that $\langle 0|\Theta O\Theta^{-1}|0\rangle = \langle 0|O^\dagger|0\rangle$ for any operator
$O$ (by antiunitarity of Θ), if we substitute for $O$ the product of field operators appearing
in (13.56). By translation invariance, in terms of the displacements $\xi_i \equiv x_i - x_{i+1}$,
this can then be written
We recall from Section 9.3 that the Haag–Ruelle scattering theory assures us that the
entire phenomenological content of the theory contained in the S-matrix is uniquely
recoverable from knowledge of the Wightman functions, so the essential physical
content of TCP invariance (discussed in greater detail below) can be considered to be
fully incorporated in the condition (13.58) on the Wightman functions of the theory.
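The vacuum identity $\langle 0|\Theta O\Theta^{-1}|0\rangle = \langle 0|O^\dagger|0\rangle$ invoked in this argument can be made explicit. The sketch below writes inner products as $(\alpha, \beta)$ and assumes only that Θ is antiunitary, $(\Theta\alpha, \Theta\beta) = (\beta, \alpha)$, and leaves the vacuum invariant:

```latex
\begin{aligned}
\langle 0|\Theta O\Theta^{-1}|0\rangle
 &= (\Theta 0,\ \Theta O\,0)
   && \text{using } \Theta|0\rangle = |0\rangle = \Theta^{-1}|0\rangle \\
 &= (O\,0,\ 0)
   && \text{antiunitarity: } (\Theta\alpha, \Theta\beta) = (\beta, \alpha) \\
 &= \langle 0|O^\dagger|0\rangle .
\end{aligned}
```

For a unitary operator the second step would instead return $(0, O\,0)$, which is why no hermitian conjugate appears in the parity and charge-conjugation analogues.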
3 In this case, the stress-energy tensor density transforms like a symmetric rank 2 tensor—i.e., a
combination of (0,0) and (1,1) fields. In either case, the sum of A quantum numbers in each term must be
integer, so the product of (−)2A factors again gives unity.
4 We shall be dealing with the full Wightman functions of the theory for the remainder of this chapter,
and omit the “H” subscript indicating the omnipresent Heisenberg fields.
Next, we need to review briefly, and then extend somewhat, the analytic properties
of Wightman functions discussed previously in Section 9.2. A set of $n - 1$ complex 4-coordinates
$(\zeta_1, \ldots, \zeta_{n-1})$ is said to lie in the extended tube $T_{n-1}$ if $\zeta_i = \Lambda z_i$ where
Λ is an arbitrary complex Lorentz transformation and the $z_i$, $i = 1, 2, \ldots, n-1$ lie
in the forward tube (i.e., $z_i = \xi_i - i\eta_i$, $\eta_i^2 > 0$, $\eta_i^0 > 0$). The Hall–Wightman theorem
establishes the analyticity of the Wightman functions continued into the whole of the
extended tube, from which the reflection property (9.76) is easily deduced:
Note that this result hangs only on the spectral and Lorentz transformation axioms of
Section 9.2 (in particular Axioms Ia-d, IIa-b): in particular, there has as yet been no
appeal to the locality axiom IIc (space-like commutativity or anticommutativity of the
fields). A weakened version of the locality property is all that is needed to complete
the proof of the TCP theorem, in the form (13.58). Let (ξ1 , ξ2 , . . . , ξn−1 ) be a set of
real space-like four-vectors with space-like convex hull, i.e.,
$$\Big(\sum_{i=1}^{n-1}\lambda_i\xi_i\Big)^2 < 0 \quad \forall\lambda_i, \quad \sum_i\lambda_i = 1, \quad 0 \le \lambda_i \le 1 \tag{13.60}$$
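Condition (13.60) and its consequence for the separations $x_i - x_j$ can be explored numerically. The sketch below (plain Python, illustrative names, arbitrary example vectors) samples convex combinations of a set of difference vectors $\xi_i$ and also tests every partial sum $\xi_i + \ldots + \xi_{j-1}$. Random sampling can only refute (13.60), not prove it, so a `True` from the sampler is suggestive rather than conclusive.

```python
import random

def interval(v):
    """Minkowski square v.v with signature (+,-,-,-)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2

def hull_looks_spacelike(xis, samples=2000, seed=0):
    """Sample convex combinations of the xi_i and test (13.60) on each sample."""
    rng = random.Random(seed)
    for _ in range(samples):
        weights = [rng.random() + 1e-12 for _ in xis]
        total = sum(weights)
        combo = [sum(w / total * xi[mu] for w, xi in zip(weights, xis)) for mu in range(4)]
        if interval(combo) >= 0:
            return False
    return True

def all_separations_spacelike(xis):
    """Check x_i - x_j = xi_i + ... + xi_{j-1} for every pair i < j."""
    for start in range(len(xis)):
        partial = [0.0, 0.0, 0.0, 0.0]
        for xi in xis[start:]:
            partial = [p + c for p, c in zip(partial, xi)]
            if interval(partial) >= 0:
                return False
    return True

# Each vector has |t| < z with z > 0, so every convex combination stays space-like.
JOST_EXAMPLE = [(0.5, 0.0, 0.0, 1.0), (-0.3, 0.2, 0.0, 1.0), (0.8, 0.0, -0.1, 2.0)]
# Individually space-like, but the midpoint (0.9, 0, 0, 0) is time-like.
COUNTEREXAMPLE = [(0.9, 0.0, 0.0, 1.0), (0.9, 0.0, 0.0, -1.0)]

print(hull_looks_spacelike(JOST_EXAMPLE), all_separations_spacelike(JOST_EXAMPLE))
print(hull_looks_spacelike(COUNTEREXAMPLE), all_separations_spacelike(COUNTEREXAMPLE))
```

The counterexample shows why space-likeness of the individual $\xi_i$ is not enough: the convex-hull condition is what guarantees that all the $x_i - x_j$ are space-like.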
Note that this implies that for any subset $S$ of $(1, 2, \ldots, n-1)$, $\sum_{i\in S}\xi_i$ is also space-like
(take $\lambda_i = 1/|S|$ for $i \in S$ and zero otherwise), and hence, if $\xi_i = x_i - x_{i+1}$, then for $i < j$ we have $x_i - x_j = (x_i - x_{i+1}) +
(x_{i+1} - x_{i+2}) + \ldots + (x_{j-1} - x_j) = \xi_i + \ldots + \xi_{j-1}$ also space-like, and hence $x_i - x_j$ is
space-like for all $i \neq j$: the set $\xi_i$ is totally space-like. Locality of the theory then
implies that we can rearrange the operators φ(x1 )φ(x2 ) . . . φ(xn ) into the reversed
order φ(xn )φ(xn−1 ) . . . φ(x1 ), and therefore, for the vacuum expectation value obtain
The condition (13.61) on the Wightman functions is strictly weaker than the full
requirement of space-like commutativity at the operator level, as we are only requiring
the commutativity to be effective at the level of the vacuum expectation value: it is
usually given the name “weak local commutativity” in the literature. Real points
in 4(n − 1)-dimensional space satisfying (13.60) are called Jost points, and have a
remarkable characteristic: they can all be shown to lie in the extended tube Tn−1 ! In
other words, given any set (ξ1 , ξ2 , . . . , ξn−1 ) satisfying (13.60), we can find a complex
Lorentz transformation Λ such that ξi = Λzi , with zi in the interior of the (complex)
forward tube. Some explicit examples are given in Problem 4.5 Now, the reader will
doubtless recall that two analytic functions of a single complex variable that agree
on any finite neighborhood of the real axis must continue to agree when analytically
continued to the rest of the complex plane. An exactly similar phenomenon in the
case of analytic functions of several complex variables ensures that the equality
W (ξ1 , . . . , ξn−1 ) = W (−ξn−1 , . . . , −ξ1 ) valid for real Jost points (and following from
5 For the full proof of this assertion, see (Streater and Wightman, 1978), theorem 2-12.
weak local commutativity) can be extended to points in the entire extended tube Tn−1 :
where the final equality obtains in consequence of (13.59). The extended tube, of
course, contains the forward tube, consisting of points ζi = ξi − iηi with the ηi future
time-like, and the ξi arbitrary spacetime points (in particular, not necessarily space-
like!). Taking the boundary limit ηi → 0 in (13.62) we replace the ζi with real points
ξi everywhere and obtain
which is precisely the desired TCP theorem, in the form (13.58). The extension of
this result to fields in arbitrary representations of the HLG, and in the fermionic case,
satisfying space-like anticommutativity (in the weak form) can be found in (Streater
and Wightman, 1978) (theorem 4-7).
The immediate phenomenological consequences of TCP symmetry are among the
most precisely tested predictions of quantum field theory. In particular, the fact that
the exact Hamiltonian H of the world commutes with the TCP operator implies an
exact relation between the static properties, such as mass and magnetic moment (if
any), of a particle and its antiparticle. If $|\vec{k} = 0, \sigma\rangle$ is the one-particle state of a stable
particle at rest, with $H|\vec{k} = 0, \sigma\rangle = m_{ph}|\vec{k} = 0, \sigma\rangle$ ($m_{ph}$ the physical mass), then TCP
invariance requires that the state $TCP|\vec{k} = 0, \sigma\rangle$ be an eigenstate of H with exactly
the same eigenvalue (i.e., $m_{ph}$). But this state is just the one-particle state of the
antiparticle at rest (with spin reversed, which by rotational invariance cannot alter
the energy). Thus the particle and antiparticle masses must be exactly equal, any
deviation implying a violation of TCP invariance. This equality has been tested in the
case of the proton/antiproton to better than one part in $10^8$. Even more precise tests
of TCP invariance are available in the neutral kaon system, where the $K^0$ and $\bar{K}^0$
masses⁶ are found to agree to within one part in $10^{18}$!
In similar fashion, TCP symmetry implies that the magnetic moments of particle
and antiparticle are equal in magnitude and opposite in sign. This follows from the
fact that the single-particle state in the presence of a static magnetic field B must
have the same energy as its TCP conjugate, which is an antiparticle of reversed spin.
On the other hand, the magnetic field $\vec{B}$ is unchanged under TCP ($\vec{B}$ changes sign
under C and T, as is clear by considering the behavior of steady currents giving rise
to $\vec{B}$, but is unchanged under P, as it is an axial vector). Thus the magnetic moments,
like the spins, must be equal and opposite, although the phenomenological checks in
this case are much coarser than is the case for particle masses.
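The sign bookkeeping behind this argument can be laid out in a few lines. In the sketch below the signs for $\vec{B}$ are those stated in the text; the signs for the spin (an axial vector that is odd under T and untouched by C) and the parametrization of the energy shift as $E = -g\,\vec{S}\cdot\vec{B}$ are assumptions of the sketch.

```python
# Sign acquired by each quantity under C, P and T.
SIGNS = {
    "B":    {"C": -1, "P": +1, "T": -1},   # as stated in the text
    "spin": {"C": +1, "P": +1, "T": -1},   # assumed: axial vector, odd under T
}

def tcp_sign(quantity):
    s = SIGNS[quantity]
    return s["T"] * s["C"] * s["P"]

def antiparticle_coefficient(g):
    """Coefficient g_bar in E = -g_bar * S_bar . B_bar for the TCP-conjugate state.

    Equal energies require g_bar * tcp_sign("spin") * tcp_sign("B") = g.
    """
    return g / (tcp_sign("spin") * tcp_sign("B"))

print(tcp_sign("B"), tcp_sign("spin"), antiparticle_coefficient(2.0))
```

With the field unchanged and the spin reversed, equal energies force the coefficient relating moment to spin to flip sign between particle and antiparticle.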
Consequences of TCP for typical scattering processes such as the weak process
depicted in Fig. 13.2, in which none of the discrete symmetries T, C, or P are separately
conserved, are typically much more difficult to check with comparable precision to the
static properties described above: one typically finds that the TCP transform of a
6 See (Bloch, 2006) for a recent review of the implications of TCP violation for neutral kaon physics.
By analytic continuation to the full extended tube, followed by the return to the real
boundary of the forward tube (exactly as above for TCP), we may therefore conclude
that for arbitrary real four-coordinates xi , i = 1, 2, . . . , n the Wightman functions (now
written as a function of the full set of n coordinates, rather than the n − 1 coordinate
differences) satisfy
$$W^{A_1B_1,\ldots,A_nB_n}(x_1,\ldots,x_n) = (-1)^{P + 2\sum_i A_i}\,W^{A_nB_n,\ldots,A_1B_1}(-x_n,\ldots,-x_1) \tag{13.66}$$
For $n = 2$, and taking the field $\phi^{A_2B_2} = (\phi^{A_1B_1})^\dagger$ (so $A_2 = B_1$, $B_2 = A_1$, cf. Section
13.2) this implies, in operator language,
$$\langle 0|\phi^{A_1B_1}(x_1)(\phi^{A_1B_1})^\dagger(x_2)|0\rangle = (-1)^{P + 2(A_1+B_1)}\,\langle 0|(\phi^{A_1B_1})^\dagger(-x_2)\,\phi^{A_1B_1}(-x_1)|0\rangle \tag{13.67}$$
Multiplying both sides of (13.66) by test functions $f(x_1)f^*(x_2)$ and integrating over
$x_1$, $x_2$, and noting that $(-1)^{2(A_1+B_1)} = (-1)^{2j}$, where j is the spin carried by the field
$\phi^{A_1B_1}$, we obtain
$$\int f(x_1)\langle 0|\phi^{A_1B_1}(x_1)(\phi^{A_1B_1})^\dagger(x_2)|0\rangle f^*(x_2)\,d^4x_1\,d^4x_2 = (-1)^{P+2j}\int f^*(x_2)\langle 0|(\phi^{A_1B_1})^\dagger(-x_2)\phi^{A_1B_1}(-x_1)|0\rangle f(x_1)\,d^4x_2\,d^4x_1 \tag{13.68}$$
which amounts to
$$\Big\|\int f^*(x)\,(\phi^{A_1B_1}(x))^\dagger|0\rangle\,d^4x\Big\|^2 = (-1)^{P+2j}\,\Big\|\int f(-x)\,\phi^{A_1B_1}(x)|0\rangle\,d^4x\Big\|^2 \tag{13.69}$$
As the squared norms in (13.69) cannot vanish (if the smeared fields give zero acting
on the vacuum, the Wightman functions are zero and we have no theory!), we must
have P + 2j an even integer: in other words, Fermi fields (where P = 1) correspond
to half-integer spin, while Bose fields (P = 0) correspond to integer spin.
13.5 Problems
1. Let P be the unitary operator
$$P = \exp\Big(-i\theta\int d^3k\;a^\dagger(\vec k)\big(a(\vec k) + \lambda\,a(-\vec k)\big)\Big) \tag{13.70}$$
where a(k) is the destruction operator for a spinless, self-conjugate boson. Find
θ and λ (in terms of η) so that P will be the parity operator for a particle of
intrinsic parity η:
where Cnm is a numerical 4x4 matrix. Write out this matrix explicitly.
(b) Show that the matrix C of part (a) has the following properties
$$C = -C^{-1} \tag{13.73}$$
$$C\gamma_\mu C = \gamma_\mu^T \tag{13.74}$$
Use the results of parts (a,b) above to establish that this current is odd under
charge conjugation:
(d) Show that the expression for $j_\mu(x)$ given in part (c) is equivalent to normal-ordering,
namely $j_\mu(x) = {:}\bar\psi(x)\gamma_\mu\psi(x){:}$ (where we are now dealing with
interaction-picture free fields, of course).
2. Verify, using the reflection and positivity results (13.13,13.26), the steps leading
from (13.41) to (13.42).
3. (a) Show that for a single real space-like vector ρ, there exists a complex Lorentz
transformation Λ such that ρ = Λζ, where ζ is in the forward tube (i.e., ζ =
ξ − iη, where η is future time-like, $\eta^2 > 0$, $\eta^0 > 0$). (Hint: choose a coordinate
system in which $\rho^1 = \rho^2 = 0$, $|\rho^0| < \rho^3$. Then consider the transformation
(9.77), for a suitable choice of θ.)
(b) Now suppose the two real space-like vectors ρ1 , ρ2 have space-like convex
hull. Show that they form a point in the extended tube T2 : i.e., there exists
a complex Λ such that ρ1 = Λζ1 , ρ2 = Λζ2 , with ζ1 , ζ2 in the forward tube.
4. Suppose a set of n real points $(\rho_1, \rho_2, \ldots, \rho_n)$ is in the extended tube $T_n$. Show
that their convex hull (i.e., the set of spacetime points of the form $\sum_i \lambda_i\rho_i$, $\sum_i \lambda_i = 1$,
$0 \le \lambda_i \le 1$) consists entirely of space-like points.
14
Symmetries III: Global symmetries
in field theory
The symmetries discussed in the preceding two chapters have been primarily those
involving the transformation properties of relativistic field theories under the continu-
ous and discrete parts of the homogeneous Lorentz group. In this chapter our focus will
be on internal global symmetries: those symmetries involving spacetime-independent
transformations of the fields which leave the dynamics of the theory invariant (or
almost invariant, in the case of weakly broken global symmetries). We have already
seen some examples of such symmetries in the context of Noether’s theorem in
Section 12.4, where we considered symmetries under global phase transformations
(cf. (12.120)) forming a commutative (abelian) group, or more generally, symmetries
in which the transformation of a multiplet of fields under a linear (non-abelian) matrix
group, as in (12.126), leaves the Lagrangian invariant.
The term “global” here refers to the fact that the same phase or matrix trans-
formation is applied to the fields of the theory at all spacetime points: the far richer
physics that emerges when the symmetry is local, allowing arbitrary dependence of
the transformation on spacetime location, will be the subject of the next chapter.
The formal application of Noether’s theorem in the presence of a global symmetry
leads, as we saw in Section 12.4, to the existence of a conserved current $J^\mu_\alpha(x)$ for each
generator $t_\alpha$ of the global symmetry group, with the conserved charges associated with
each current,
$$Q_\alpha \equiv \int J^0_\alpha(x)\,d^3x \tag{14.1}$$
mimicking the commutation relations of the Lie algebra of the symmetry group:
$$Q_\alpha|0\rangle = 0 \tag{14.3}$$
In fact, the set of such generators must itself span the Lie algebra of a subgroup H ⊂ G,
as $Q_\alpha|0\rangle = Q_\beta|0\rangle = 0 \Rightarrow [Q_\alpha, Q_\beta]|0\rangle = 0$. In this case, we say that the symmetry group
H is realized in the Wigner–Weyl mode, while the generators $Q_\alpha$, $\alpha = m + 1, \ldots, n$,
under which the vacuum is not invariant correspond to the Nambu–Goldstone mode of
symmetry realization. Since, by assumption, the full symmetry group G is preserved
by the dynamics, the broken generators still commute with the Hamiltonian, and
therefore lead (when exponentiated and applied to any minimum-energy vacuum state)
to new states which are also minimum-energy states: i.e., the vacuum is degenerate.
This is precisely the situation discussed previously in Chapter 8 under the heading
“spontaneous symmetry-breaking”.
By contrast, the surviving Wigner–Weyl symmetry group implies the existence
of degenerate multiplets of particle states, which must span finite-dimensional repre-
sentations of H: in this case the existence of the underlying symmetry is manifestly
visible in the spectrum of the theory and in symmetry constraints on the transition
amplitudes of the theory (as in the case of isospin symmetry, for example).1 For the
broken generators, the non-invariance of the ground state of the theory transfers to
complicated transformation properties of the multi-particle states built on it, and
the existence of an underlying exact dynamical symmetry can be far from obvious
phenomenologically. However, the spontaneous breaking of an exact global symmetry
leaves a remarkable phenomenological residue which can scarcely be missed: the
appearance of an exactly massless particle, called a “Goldstone boson”, for each
broken generator of the original exact global group G. We shall prove this result—the
Goldstone–Salam–Weinberg theorem—in Section 14.2.
1 Of course, we may have m = 0, in which case H is null, the entire group G is spontaneously broken,
and we have only the Nambu–Goldstone mode of realization, or m = n, in which case H = G, and the
entire symmetry is realized in Wigner–Weyl mode.
In fact, for reasons to be discussed in Section 14.1, exact global symmetries (which
are not associated with a local gauge symmetry) are very rare in Nature: indeed, it
is possible that there are none at all! Instead, we find many examples of approximate
global symmetries, in which the breaking of the symmetry is in some (appropriate)
sense small, allowing us to exploit the symmetry by taking it to be exact at zeroth
order, and treating the effects of the symmetry-breaking perturbatively. The absence
of exactly zero-mass Goldstone bosons in Nature also suggests that there are no
dynamically exact (with exactly conserved charges) but spontaneously broken global
symmetries. However, the chiral symmetry of quantum chromodynamics provides a
clear example of a global symmetry which is (a) weakly broken by explicit non-
symmetric terms in the Lagrangian, and (b) spontaneously broken by the vacuum.
In this situation, one finds in place of the exactly zero-mass Goldstone bosons which
would necessarily appear if the small symmetry-breaking terms in the Lagrangian were
turned off, “light” spinless particles—pseudo-Goldstone bosons—associated with each
spontaneously broken generator of the chiral group.2 These light particles are just
the pions, whose squared masses are just 2% of the squared mass of the proton, for
example. The diagnostic techniques available for determining the presence or absence
of spontaneous breaking of a global symmetry will be the subject of Section 14.3:
it turns out that the concept of the effective action introduced as a graph theoretic
concept in Chapter 10 plays an essential role here.
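The 2% figure quoted above for the pions can be checked with a line of arithmetic. The sketch below assumes rounded mass values in MeV (139.57 and 134.98 for the charged and neutral pion, 938.27 for the proton); these numbers are not given in the text and are supplied here only for illustration.

```python
# Rounded particle masses in MeV (assumed values, not taken from the text).
M_PION_CHARGED = 139.57
M_PION_NEUTRAL = 134.98
M_PROTON = 938.27


def squared_mass_ratio(m_light, m_heavy):
    """Ratio of squared masses, the quantity quoted in the text."""
    return (m_light / m_heavy) ** 2


ratio_charged = squared_mass_ratio(M_PION_CHARGED, M_PROTON)
ratio_neutral = squared_mass_ratio(M_PION_NEUTRAL, M_PROTON)
print(f"charged pion: {100 * ratio_charged:.2f}%  neutral pion: {100 * ratio_neutral:.2f}%")
```

Both ratios come out close to 0.02, which is the sense in which the pions are "light" on the hadronic scale.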
of matter with respect to antimatter in the visible Universe requires either that we
introduce the asymmetry in an ad hoc fashion as an initial condition at some early
point in the evolution of the Universe following the Big Bang, while retaining exact
conservation (ignoring the effects of exponentially small anomalous violations), or, as
assumed by essentially all present practitioners of early Universe cosmology, we must
introduce a small breaking of baryon number (say) to account for the fact that the
ratio of baryon to photon number densities $n_B/n_\gamma$ at the epoch of nucleosynthesis is on
the order of $10^{-8}$.
In fact, the structure of local quantum field theories, and in particular theories
based on local gauge interactions, already implies that exact conservation of global
symmetries is extremely fragile, and easily subject to violation on several grounds. This
is in contradistinction to the presumed exact character of the local gauge symmetries
described in the next chapter, both in the Standard Model and in Grand Unified
(or even, superstring) extensions thereof. These symmetries may undergo spontaneous
breaking, but are as far as we know exact symmetries at the dynamical level.
There are several reasons for the apparent fragility of global symmetries in local
quantum field theories. Here is a (partial) list:
through the wormhole. The conventional wisdom (see (Peccei, 1988), Section 2.7)
suggests that gravitational effects inevitably destroy global charge conservation.
The present author is an agnostic on this issue, and would prefer to wait for
a manifestly consistent quantum gravitational description of black holes and/or
wormholes before pronouncing a definitive verdict.
cancelling the first term in the integral in (14.6) and giving the desired result
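$$\langle\alpha|J^\mu_\alpha(x)|0\rangle = \langle\alpha|U^\dagger(\Lambda)J^\mu_\alpha(x)U(\Lambda)|0\rangle = \Lambda^\mu{}_\nu\,\langle\alpha|J^\nu_\alpha(\Lambda^{-1}x)|0\rangle \tag{14.9}$$
But the matrix elements in (14.9) can clearly be translated to x = 0, as the states
$|\alpha\rangle$ have zero energy-momentum (as mentioned above, $[P^\mu, Q_\alpha] = 0$), allowing us to
conclude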
theories, the theory must be quantized either in a non-covariant “physical” gauge, with
a positive-definite Hilbert space, or in a covariant (“Gupta–Bleuler”) gauge in which
the Hilbert space is not positive-definite (even though the negative or zero metric
unphysical modes can be shown to decouple from physical transition amplitudes). As
both explicit covariance of the fields and positive-definiteness of the Hilbert space are
ingredients of the reasoning leading to (14.11), we must conclude that spontaneous
breaking in theories with exact local gauge symmetries has distinctive features not
present in the global case (in particular, we cannot expect a massless Goldstone boson
to appear). In fact, we shall see that the peculiarities (generally dubbed the “Higgs
mechanism”) of theories with both local gauge symmetry and spontaneous symmetry-breaking,
to be discussed in Section 15.7, play a central role in the electroweak sector
of the Standard Model of elementary particle interactions.
3 Note that, as we shall see in our discussion of the Higgs phenomenon, it is perfectly possible for a
theory with exact dynamical local gauge symmetry to have no massless particles: one simply arranges for
the symmetry group G to be completely broken by the vacuum state, leading to a theory with a non-zero
mass gap. In such cases, the possibility of evasion of the Goldstone theorem arises from the indicated mutual
incompatibility of manifest covariance and positive-definiteness in the local gauge case.
The physics underlying the appearance of massless particles when a continuous
global symmetry is broken by the vacuum becomes clearer if we approach the problem
from another angle. Let us assume that our theory admits an exact continuous global
symmetry G (which could be abelian or non-abelian) yielding Noether charges $Q_\alpha$
which generate the appropriate infinitesimal transformations on a finite set of local
(or almost local) fields $\phi_n(x)$ spanning a representation of G with matrix generators
$t_\alpha$ (cf. Section 12.4, Example 5):
The fields $\phi_n(x)$ here may be elementary or composite: we require only that they fill
out a finite-dimensional representation of the global symmetry group G. If the vacuum
is invariant under G, i.e., $Q_\alpha|0\rangle = 0$, $\forall\alpha$, we must have
$$\langle 0|[Q_\alpha, \phi_n(x)]|0\rangle = 0 \;\Rightarrow\; (t_\alpha)_{nm}\langle 0|\phi_m(x)|0\rangle = (t_\alpha)_{nm}\langle 0|\phi_m(0)|0\rangle = 0, \quad \forall\alpha \tag{14.13}$$
On the other hand, if there are generator(s) $t_\alpha$ which do not annihilate the vector of
vacuum expectation values,
Spontaneous breaking of global symmetries: dynamical aspects 495
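$$\begin{aligned}
N_{\alpha n} &= \sum_n \int \big\{\langle 0|J^0_\alpha(0)|n\rangle\langle n|\phi_n(0)|0\rangle\,e^{-iP_n\cdot x} - \langle 0|\phi_n(0)|n\rangle\langle n|J^0_\alpha(0)|0\rangle\,e^{iP_n\cdot x}\big\}\,d^3x \\
&= \sum_{n \neq |0\rangle} (2\pi)^3\delta^3(\vec P_n)\big\{\langle 0|J^0_\alpha(0)|n\rangle\langle n|\phi_n(0)|0\rangle\,e^{-iE_nt} - \langle 0|\phi_n(0)|n\rangle\langle n|J^0_\alpha(0)|0\rangle\,e^{iE_nt}\big\} \\
&\neq 0, \quad \forall t
\end{aligned} \tag{14.15}$$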
Note that the physical vacuum does not appear among the sum over states $|n\rangle$ in
(14.15), as the two terms in brackets cancel in this case. In fact, the charge $Q_\alpha(t)$
is time-independent by assumption, so differentiating with respect to time, we must
have, for all t,
$$\sum_{n \neq |0\rangle} \delta^3(\vec P_n)\,E_n\big\{\langle 0|J^0_\alpha(0)|n\rangle\langle n|\phi_n(0)|0\rangle\,e^{-iE_nt} + \langle 0|\phi_n(0)|n\rangle\langle n|J^0_\alpha(0)|0\rangle\,e^{iE_nt}\big\} = 0 \tag{14.16}$$
The conditions (14.15, 14.16) together imply that there must be non-vacuum states
$|n\rangle$ for which $\langle 0|J^0_\alpha(0)|n\rangle \neq 0$ and $\langle n|\phi_n(0)|0\rangle \neq 0$, and that for such states the energy $E_n$
must vanish whenever the spatial momentum $\vec P_n$ of the state is zero. Such states can only be single-particle
zero-mass states, and the requirement $\langle 0|J^0_\alpha(0)|n\rangle \neq 0$ is simply the statement
that the current $J^\mu_\alpha$ is an interpolating field for the corresponding Goldstone particle.
If the components of $J^\mu_\alpha$ are bosonic fields, the particle is a Goldstone boson. However,
note that in supersymmetric theories, we can have global Noether currents of fermionic
type (recall (12.154)), and the corresponding Goldstone mode will be a fermion, if the
symmetry is spontaneously broken.
The effective action Γ[φ] was defined in Section 10.4 as the functional Legendre
transform of the generating functional W[j] of connected Green functions:
$$Z[j] = \int D\phi\; e^{-(S[\phi] - \int j(x)\phi(x)\,d^4x)} \tag{14.17}$$
In the limit of infinite spacetime volume Ω, the integral is dominated by the point or
points at which the maximal value of the exponent is achieved. In the examples to be
considered below there will be a unique such point, and we have simply (for Ω → ∞)
$$W[j] = \sup_\phi\,(j\phi - P(\phi)) - \sup_\phi\,(-P(\phi)) = \sup_\phi\,(j\phi - P(\phi)) + \inf_\phi P(\phi) \tag{14.21}$$
For convenience, we shall choose the field polynomial P(φ) so that the second term
on the right vanishes, and we have simply
$$W[j] = \sup_\phi\,(j\phi - P(\phi)) \tag{14.22}$$
For example, we may take the scalar theory discussed in Section 8.4, with either
$$P_+(\phi) = \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4 \tag{14.23}$$
with no SSB, or the theory with a negative sign in the mass term,
$$P_-(\phi) = -\frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4 + \frac{3m^4}{2\lambda} = \frac{\lambda}{4!}(\phi^2 - v^2)^2, \quad v \equiv \sqrt{\frac{6}{\lambda}}\,m \tag{14.24}$$
Taking first the positive-sign case, one easily sees that for any real j there is a
unique maximum of $j\phi - P(\phi)$ when $\phi = vx$, $v \equiv \sqrt{6/\lambda}\,m$, $x^3 + x - \frac{6j}{\lambda v^3} = 0$, with the
cubic equation having only one real root. The location of the root can be found readily
for small j, and we find that the solution for φ depends analytically on j, with
$$\phi = \frac{6}{\lambda v^2}\,j - \frac{216}{\lambda^3 v^8}\,j^3 + O(j^5) \tag{14.25}$$
and
$$W_+(j) = \frac{3}{\lambda v^2}\,j^2 - \frac{54}{\lambda^3 v^8}\,j^4 + O(j^6) \tag{14.26}$$
Evidently, in this case, $W_+(j)$ is everywhere differentiable, and the relation $\phi = W_+'(j)$
reduces to the cubic equation above, with a unique 1-1 mapping between φ and j. The
Legendre transform $\Gamma_+(\phi)$ of $W_+(j)$ can therefore be constructed in the usual fashion
(as in (14.19)), and we find
$$\Gamma_+(\phi) = j\phi - W_+(j) = \frac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4 = P_+(\phi) \tag{14.27}$$
which is just the classical action in this constant-field model (S(φ) → P+ (φ)). There
are no loop graphs in this model, as there are no non-zero momenta to integrate over,
so this agrees with our discovery in Section 10.4 that the effective action reduces to
the classical Lagrangian at tree level.
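The positive-sign results above lend themselves to a quick numerical check. The following is a minimal sketch, assuming the parameter values λ = 1.5, v = 2 (so m = 1) and using helper names (`phi_of_j`, `w_plus`, `gamma_plus`) that are introduced here purely for illustration. It solves the cubic stationarity condition by bisection, compares the result with the small-j expansions (14.25) and (14.26), and then evaluates the Legendre transform as a supremum over j to compare with $P_+(\phi)$.

```python
LAM = 1.5                 # quartic coupling (assumed illustrative value)
V = 2.0                   # v = sqrt(6/lam) * m, so m = 1 here
M2 = LAM * V**2 / 6.0     # m^2 recovered from v and lam


def p_plus(phi):
    """Field polynomial P_+ of (14.23)."""
    return 0.5 * M2 * phi**2 + LAM * phi**4 / 24.0


def phi_of_j(j):
    """Solve j = P_+'(phi) by bisection; P_+' is strictly increasing."""
    def slope(phi):
        return M2 * phi + LAM * phi**3 / 6.0 - j
    lo, hi = -1.0, 1.0
    while slope(lo) > 0.0:      # widen the bracket until it contains the root
        lo *= 2.0
    while slope(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def w_plus(j):
    """W_+(j) = sup_phi (j*phi - P_+(phi)), evaluated at the unique maximum."""
    phi = phi_of_j(j)
    return j * phi - p_plus(phi)


def gamma_plus(phi, j_max=50.0):
    """sup_j (j*phi - W_+(j)) by ternary search; the argument is concave in j."""
    lo, hi = -j_max, j_max
    for _ in range(100):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if a * phi - w_plus(a) < b * phi - w_plus(b):
            lo = a
        else:
            hi = b
    j = 0.5 * (lo + hi)
    return j * phi - w_plus(j)


j = 0.05
phi_series = 6.0 * j / (LAM * V**2) - 216.0 * j**3 / (LAM**3 * V**8)
w_series = 3.0 * j**2 / (LAM * V**2) - 54.0 * j**4 / (LAM**3 * V**8)
print(phi_of_j(j) - phi_series)          # residual of order j^5
print(w_plus(j) - w_series)              # residual of order j^6
print(gamma_plus(1.0) - p_plus(1.0))     # difference between Gamma_+ and P_+
```

All three printed differences should be tiny, consistent with (14.25) to (14.27); the ternary search is only valid for field values whose conjugate source lies inside the assumed window `j_max`.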
The situation is quite different in the case where the mass term has a negative
sign, as the maximum of $j\phi - P_-(\phi)$ occurs at a solution of the cubic equation
$\phi(\phi^2 - v^2) = \frac{6}{\lambda}j$, which has three real solutions for $|j| < \frac{\lambda}{9\sqrt{3}}v^3$ (for example, at j = 0 we find
solutions at φ = 0, ±v). For small positive j, the solution giving the absolute maximum
of $j\phi - P_-(\phi)$ is $\phi = v + \frac{3}{\lambda v^2}j + O(j^2)$, while for small negative j we must choose the
solution at $\phi = -v + \frac{3}{\lambda v^2}j + O(j^2)$. There is clearly a discontinuity in φ as a function
of j as we pass through j = 0, which asserts itself as a cusp (discontinuity in the first
derivative) of $W_-(j)$:
$$W_-(j) = v|j| + \frac{3}{2\lambda v^2}\,j^2 + \ldots \tag{14.28}$$
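The cusp can be made visible numerically. The sketch below, again assuming λ = 1.5 and v = 2 and using an illustrative helper `w_minus`, evaluates $W_-(j) = \sup_\phi (j\phi - P_-(\phi))$ by a coarse scan over φ followed by a local refinement, and then compares the one-sided slopes at j = 0 and the small-j behaviour with (14.28).

```python
LAM, V = 1.5, 2.0   # assumed illustrative parameters (m = 1)


def p_minus(phi):
    """Double-well polynomial P_- of (14.24)."""
    return LAM * (phi**2 - V**2) ** 2 / 24.0


def w_minus(j, phi_max=8.0, step=0.01):
    """W_-(j) = sup_phi (j*phi - P_-(phi)): coarse scan, then local refinement."""
    def objective(phi):
        return j * phi - p_minus(phi)
    n = int(round(2 * phi_max / step))
    best = max((-phi_max + k * step for k in range(n + 1)), key=objective)
    lo, hi = best - step, best + step
    for _ in range(60):          # ternary search inside the winning cell
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) < objective(b):
            lo = a
        else:
            hi = b
    return objective(0.5 * (lo + hi))


h = 1e-3
right_slope = (w_minus(h) - w_minus(0.0)) / h
left_slope = (w_minus(0.0) - w_minus(-h)) / h
j = 0.01
two_term = V * abs(j) + 3.0 * j**2 / (2.0 * LAM * V**2)
print(right_slope, left_slope)      # close to +v and -v respectively
print(w_minus(j) - two_term)        # residual of order j^3
```

The one-sided difference quotients approach +v and −v, so the first derivative jumps by 2v at the origin, which is the cusp described in the text.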
The dramatic alteration in shape of W (j) as a result of the change of sign in the
mass term is clearly visible in Fig. 14.1. The failure of the derivative of W− (j) to be
well defined at j = 0 invalidates the usual procedure for constructing the Legendre
transform Γ− (φ) of W− (j). Note that the cusp in W− (j) is a direct consequence of
having taken the infinite volume limit Ω → ∞ in the defining integral (14.20): for finite
Ω the integral defines an infinitely differentiable function of j, and there are absolutely
no difficulties in defining a Legendre transform in the usual fashion. We can obtain
a unique and well-defined infinite-volume Γ− (φ), which, moreover, coincides with the
usual definition in those cases where there are no difficulties with differentiability of
W(j), by defining the Legendre transform in general$^4$ as follows,
$$\Gamma(\phi) \equiv \sup_j\,(j\phi - W(j)) \tag{14.29}$$
4 For a beautiful introduction to the applications of convexity theory in classical thermodynamics, leading
to the “sup” definition of the Legendre transform given here, see the Introduction to (Israel, 1978) by
A. S. Wightman.
[Figure 14.1: plot of $W_+(j)$ and $W_-(j)$ against j, for −20 ≤ j ≤ 20.]
Fig. 14.1 The infinite volume connected generating functions $W_\pm(j)$ for the case of zero-dimensional
scalar theories $P_\pm(\phi)$ with positive or negative squared mass terms (parameters
m = 1, λ = 1.5, v = 2). Note the cusp of $W_-(j)$ at j = 0.
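With the parameters quoted in the caption, the effect of a "sup" Legendre transform on the cusped function $W_-(j)$ can be explored directly. The sketch below assumes the standard convex-conjugate form $\Gamma(\phi) = \sup_j (j\phi - W(j))$ and uses illustrative helper names; it is not taken from the text. It evaluates $W_-(j)$ by brute force and then maximizes over j with a ternary search, which is legitimate because $j\phi - W(j)$ is concave in j.

```python
LAM, V = 1.5, 2.0   # parameters quoted in the caption of Fig. 14.1 (m = 1)


def p_minus(phi):
    """Double-well polynomial P_- of (14.24)."""
    return LAM * (phi**2 - V**2) ** 2 / 24.0


def w_minus(j, phi_max=6.0, step=0.02):
    """W_-(j) = sup_phi (j*phi - P_-(phi)): coarse scan, then local refinement."""
    def objective(phi):
        return j * phi - p_minus(phi)
    n = int(round(2 * phi_max / step))
    best = max((-phi_max + k * step for k in range(n + 1)), key=objective)
    lo, hi = best - step, best + step
    for _ in range(60):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if objective(a) < objective(b):
            lo = a
        else:
            hi = b
    return objective(0.5 * (lo + hi))


def gamma_minus(phi, j_max=20.0):
    """sup_j (j*phi - W_-(j)); valid while the maximizing j lies inside (-j_max, j_max)."""
    lo, hi = -j_max, j_max
    for _ in range(60):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if a * phi - w_minus(a) < b * phi - w_minus(b):
            lo = a
        else:
            hi = b
    j = 0.5 * (lo + hi)
    return j * phi - w_minus(j)


for phi in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(phi, gamma_minus(phi), p_minus(phi))
```

For |φ| < v the supremum is attained at the cusp j = 0 and the transform comes out flat (zero with this normalization), while for |φ| > v it reproduces $P_-(\phi)$: numerically, the transform is the convex hull of the double well.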
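A finite maximum obtains for arbitrarily large |j| provided W(j) rises at least linearly
with j for large |j|, i.e., W(j) is convex for large j. Recall that a real function W(j)
is convex if
$$W(\alpha j_1 + (1 - \alpha)j_2) \le \alpha W(j_1) + (1 - \alpha)W(j_2), \quad 0 < \alpha < 1 \tag{14.30}$$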
An exactly analogous definition holds for convex functionals W[j] on a function space
of real functions j(x). In fact, returning temporarily to the full field theory case, the
convexity of the generating functional W[j] (not just at large j) follows directly from
its functional integral representation
$$\exp W[j] = \int \exp\Big(\int j(x)\phi(x)\,d^4x\Big)\,d\mu, \quad d\mu \equiv \exp(-S[\phi])\,D\phi \Big/ \int \exp(-S[\phi])\,D\phi \tag{14.31}$$
as an integral over a normalized positive measure dμ, $\int d\mu = 1$. Integrals over such
measures satisfy a Hölder inequality (see (Rudin, 1966), p. 62)
$$\int f^\alpha g^{1-\alpha}\,d\mu \le \Big(\int f\,d\mu\Big)^\alpha \Big(\int g\,d\mu\Big)^{1-\alpha}, \quad 0 < \alpha < 1 \tag{14.32}$$
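A small numerical experiment may help to make the Hölder inequality concrete. The sketch below replaces the functional measure by a discretized one-variable measure proportional to $\exp(-\Omega P_-(\phi))$ on a grid (the grid, the value Ω = 1, and all function names are assumptions made for illustration) and checks both (14.32) for exponential test functions and the resulting convexity of $W(j) = \ln\int e^{j\phi}d\mu$.

```python
import math

LAM, V, OMEGA = 1.5, 2.0, 1.0   # OMEGA plays the role of a finite volume (assumed)


def p_minus(phi):
    """Double-well polynomial P_- of (14.24)."""
    return LAM * (phi**2 - V**2) ** 2 / 24.0


# Discretized, normalized positive measure d(mu) ~ exp(-OMEGA * P_-(phi)) d(phi).
GRID = [-6.0 + 0.01 * k for k in range(1201)]
WEIGHTS = [math.exp(-OMEGA * p_minus(phi)) for phi in GRID]
NORM = sum(WEIGHTS)
MU = [w / NORM for w in WEIGHTS]


def average(func):
    """Integral of func over the normalized measure."""
    return sum(m * func(phi) for m, phi in zip(MU, GRID))


def w_of_j(j):
    """W(j) = ln of the mu-average of exp(j*phi), cf. (14.31)."""
    return math.log(average(lambda phi: math.exp(j * phi)))


def holder_gap(j1, j2, alpha):
    """Right side minus left side of (14.32) for f = exp(j1*phi), g = exp(j2*phi)."""
    lhs = average(lambda phi: math.exp((alpha * j1 + (1.0 - alpha) * j2) * phi))
    rhs = (average(lambda phi: math.exp(j1 * phi)) ** alpha
           * average(lambda phi: math.exp(j2 * phi)) ** (1.0 - alpha))
    return rhs - lhs


j1, j2, alpha = -1.0, 2.0, 0.3
print(holder_gap(j1, j2, alpha))                                   # non-negative
print(alpha * w_of_j(j1) + (1 - alpha) * w_of_j(j2)
      - w_of_j(alpha * j1 + (1 - alpha) * j2))                     # non-negative
```

Taking the logarithm of the two sides of the inequality turns the first printed gap into the second, which is the convexity statement for W evaluated on the chosen pair of sources.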
Setting $f = \exp(\int j_1(x)\phi(x)\,d^4x)$, $g = \exp(\int j_2(x)\phi(x)\,d^4x)$, we see that (14.32) immediately
implies, taking the logarithm,
The convexity of W (j) in our toy model (with or without SSB) is apparent from a
glance at Fig. 14.1. It is easy to verify that the property of convexity of W (j) carries
over to Γ(φ), provided that we use the “sup” definition of the latter. Indeed,
$$H_{\bar j} \equiv \frac{p^2}{2} + \frac{\lambda}{24}(x^2 - v^2)^2 - \bar j x = -\frac{1}{2}\frac{d^2}{dx^2} + P(x) - \bar j x \equiv H_0 - \bar j x \tag{14.35}$$
[Figure 14.2: plot against φ, for −4 ≤ φ ≤ 4, of the curves $\Gamma_+(\phi)$, $P_+(\phi)$, $\Gamma_-(\phi)$, $P_-(\phi)$, and $\Gamma_-(\phi)$ at Ω = VT = 1.]
Fig. 14.2 The infinite volume effective potential $\Gamma_\pm(\phi)$ for theories defined by action $P_\pm(\phi)$
(same parameters as in Fig. 14.1). The effective potential $\Gamma_-(\phi)$ at finite volume, Ω = 1, for
the symmetry-broken case is also shown.
[Figure 14.3: plot against x, for −1.5 ≤ x ≤ 1.5, of P(x), $\psi_+(x)$, and $\psi_-(x)$.]
Fig. 14.3 Potential energy function P(x) and approximate ground states $\psi_\pm(x)$ for a one-dimensional
anharmonic oscillator (λ = 300, m = v = 1).
Here the “field” φ has been replaced by the quantum coordinate x, and we have
restricted the source function j(t) to be independent of time, $j(t) = \bar j$ = constant.
Evidently we are dealing with a “double-well” anharmonic oscillator, where we now
also assume that we are in a regime where λ is taken large for v fixed, so that the
two potential basins are separated by a large energy barrier (see Fig. 14.3). In this
situation, the Gaussian wavefunctions $\psi_\pm(x) = \langle x|\pm\rangle$ centered on x = ±v,
$$\psi_\pm(x) \equiv C e^{-\frac{\omega}{2}(x \mp v)^2}, \quad C = (\omega/\pi)^{1/4}, \quad \omega \equiv \sqrt{\frac{\lambda v^2}{3}} \tag{14.36}$$
are approximate degenerate ground-state eigenfunctions of $H_0$, with approximate
eigenvalue ω/2, due to the exponentially suppressed tunneling amplitude for the
particle to transition between the two potential basins. A short calculation shows
that $\gamma \equiv \langle +|H_0|-\rangle \sim O(\omega^2 e^{-\omega v^2})$ for λ (hence ω) large, and fixed v. We may therefore
represent approximately the low-energy sector of this theory by truncating the Hilbert
space to the two-dimensional subspace spanned by the states $|+\rangle$, $|-\rangle$, in which space
the Hamiltonian $H_{\bar j}$ takes the matrix form
$$H_{\bar j} = \begin{pmatrix} \frac{\omega}{2} - v\bar j & \gamma \\ \gamma & \frac{\omega}{2} + v\bar j \end{pmatrix} \tag{14.37}$$
using the fact that the expectation values of the coordinate x in the highly localized
states represented by wavefunctions $\psi_+(x)$, $\psi_-(x)$ are +v, −v approximately. The
source-free Hamiltonian $H_0$ has, as is well known, a unique non-degenerate ground
state, with (in the approximation (14.37)) the symmetric eigenfunction $\frac{1}{\sqrt{2}}(\psi_+(x) + \psi_-(x))$,
and energy $\frac{\omega}{2} - \gamma$. The antisymmetric state with wavefunction $\frac{1}{\sqrt{2}}(\psi_+(x) - \psi_-(x))$
lies at an energy 2γ above this ground state, and will be suppressed in the
partition function $Z(\bar j)$ (for arbitrary $\bar j$) if we choose T large enough that $e^{-2\gamma T} \ll 1$,
as we shall henceforth do. Switching on the source $\bar j$, the energy of the new ground
state $|0, \bar j\rangle$ becomes
$$E_{0,\bar j} = \frac{\omega}{2} - \sqrt{\gamma^2 + v^2\bar j^2} \tag{14.38}$$
with $Z(\bar j) = e^{-TE_{0,\bar j}} + O(e^{-2T\gamma})$. The generating function $W(\bar j)$ becomes, up to exponentially
small corrections,
$$W(\bar j) = \frac{1}{T}\ln Z(\bar j) = -E_{0,\bar j} \tag{14.39}$$
The expectation value of the coordinate x (recall that this is the analog of the field
operator φ in our zero-spatial-dimensional model) in the ground state of $H_{\bar j}$ can be
calculated directly (by diagonalizing $H_{\bar j}$) or by taking the derivative of $W(\bar j)$: in either
case one finds
$$\bar x = \frac{dW(\bar j)}{d\bar j} = \frac{v^2\bar j}{\sqrt{\gamma^2 + v^2\bar j^2}} \tag{14.40}$$
We pointed out previously that there are no difficulties in defining the Legendre
transform in the usual way at finite spatiotemporal volume, which is certainly the
case in our model (for finite T ). And indeed, for small j̄, when our approximations
are valid, we see that there is no problem with differentiability, or with inverting
the relation between j̄ and x̄, so we may define the effective potential Γ(x̄) with the
conventional Legendre transform,
This is the promised energetic interpretation of the effective potential $\Gamma(\bar x)$: it is the
expectation of the source-free Hamiltonian $H_0$ in the ground state of the sourced Hamiltonian
$H_{\bar j}$ where the source $\bar j$ is chosen to give the value $\bar x$ for the expectation value
of position in that ground state. In our crude approximation, one finds (see Problem
3) that for $|\bar x| < v$, $\Gamma(\bar x)$ is a convex function, with $\Gamma(\bar x) = \frac{\omega}{2} - \gamma\sqrt{1 - \bar x^2/v^2}$. The
overlap matrix element γ is exponentially small in our model by our choice of a large
quartic coupling λ, but it is automatically small in a true field theory, as the overlap
of the states $|v\rangle$ (respectively, $|-v\rangle$) characterized by having field expectation values
$\langle\phi(x)\rangle = +v$ (respectively −v) is exponentially suppressed in the spatial volume V:
$\langle +v|-v\rangle \sim e^{-KV}$, essentially because this overlap involves the multiplication of order
V tunneling amplitudes connecting the field variable at each spatial point$^5$ between the
two vacuum values +v and −v (see Problem 4 for an explicit example). We see that in
the infinite-volume limit (where γ → 0) in the field theory cases, the effective potential
develops a flat section, as in the zero-dimensional model (see Fig. 14.2), connecting
the two classical minima of the field potential. Of course, in the quantum case there
is an additional zero-point energy (which could be removed by normal-ordering): just
the ubiquitous ω/2 appearing in the preceding formulas.
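The two-state truncation is simple enough to check numerically. The sketch below assumes ω = 10 and v = 1 (the values implied by λ = 300 in Fig. 14.3), together with an illustrative tunnelling element γ = 4.5 × 10⁻³ that is not quoted in the text. It finds the ground state of the 2×2 matrix (14.37) by minimizing the Rayleigh quotient over real unit vectors (cos t, sin t), then compares the energy, the position expectation value, and the expectation of $H_0$ with the closed forms given above; only γ² enters these closed forms, so the sign chosen for γ does not matter here.

```python
import math

OMEGA, V = 10.0, 1.0      # omega = sqrt(lam v^2 / 3) for lam = 300, v = 1
GAMMA = 4.5e-3            # illustrative tunnelling matrix element (assumed value)


def ground_state(j_bar, coarse=2000):
    """Minimize the Rayleigh quotient of (14.37) over states (cos t, sin t)."""
    def energy(t):
        c, s = math.cos(t), math.sin(t)
        return ((0.5 * OMEGA - V * j_bar) * c * c
                + (0.5 * OMEGA + V * j_bar) * s * s
                + 2.0 * GAMMA * c * s)
    step = math.pi / coarse
    best = min((k * step for k in range(coarse)), key=energy)
    lo, hi = best - step, best + step
    for _ in range(80):          # ternary search for the local minimum
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if energy(a) > energy(b):
            lo = a
        else:
            hi = b
    t = 0.5 * (lo + hi)
    return energy(t), t


def observables(j_bar):
    """Return (E_0, <x>, <H_0>) in the ground state of the sourced Hamiltonian."""
    e0, t = ground_state(j_bar)
    x_bar = V * math.cos(2.0 * t)                  # uses <+|x|+> = +v, <-|x|-> = -v
    h0 = 0.5 * OMEGA + GAMMA * math.sin(2.0 * t)   # expectation of the source-free H_0
    return e0, x_bar, h0


j_bar = 2.0e-3
e0, x_bar, h0 = observables(j_bar)
root = math.sqrt(GAMMA**2 + (V * j_bar) ** 2)
print(e0 - (0.5 * OMEGA - root))                                        # check of (14.38)
print(x_bar - V**2 * j_bar / root)                                      # check of (14.40)
print(h0 - (0.5 * OMEGA - GAMMA * math.sqrt(1.0 - (x_bar / V) ** 2)))   # check of Gamma
print((j_bar * x_bar + e0) - h0)                                        # Legendre transform vs <H_0>
```

The last line compares the conventional Legendre transform $\bar j\bar x - W(\bar j)$ with the expectation of $H_0$, which is the energetic interpretation stated in the text; all four differences should be negligible.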
Finally, we consider field theory proper, specifically the double-well scalar theory
of Section 8.4, with Euclidean generating functional
$$Z[j] = \int D\phi\; e^{-(S[\phi] - \int j(x)\phi(x)\,d^4x)}, \quad S[\phi] = \int \Big(\frac{1}{2}(\partial_\mu\phi)^2 + P_-(\phi)\Big)\,d^4x \tag{14.42}$$
with $P_-(\phi)$ given in (14.24). In the discussion that follows we shall consider
features of spontaneous symmetry-breaking at the lowest order of the loop expansion:
the reader will recall from the discussion in Section 10.4 that the effective action at
the leading order of a formal expansion in Planck’s constant ħ (which, as we showed
there, was equivalent to the perturbative loop expansion) reproduces the classical
action. We now know that in the spontaneously broken case (14.42), the properly
defined (with either the “sup”, or equivalently, the minimum-energy definition as in
(14.41)) effective potential is convex, and at the leading order in ħ amounts to the
convex hull of the double-well field potential $P_-(\phi)$. There will be higher-order loop
corrections to the quantities discussed below, but they generally do not affect the
qualitative features of global spontaneous symmetry breakdown.$^6$
As previously, we begin by working at finite spacetime volume Ω = V T , and restrict
the source function j(x) to be a spacetime constant, j(x) = j̄. A connected finite-
volume generating function W (j̄) is then defined in the usual way by taking the infinite
Euclidean time limit to project out the lowest-energy state $|0, \bar j\rangle$ in the presence of the
source $\bar j$, as follows:
$$W_V(\bar j) \equiv \frac{1}{V}\lim_{T\to\infty}\frac{1}{T}\ln\frac{Z(\bar j)}{Z(0)} \tag{14.43}$$
5 We imagine throughout that our theory is regularized at short distance, say on a spatial lattice, so
that V can be regarded simply as enumerating the finite number of spatial lattice points.
6 An interesting exception is provided by the Coleman–Weinberg phenomenon (Coleman and Weinberg,
1973), in which a theory without spontaneous breaking at the classical level develops a non-vanishing field
expectation value at the one-loop level due to radiative corrections. The features of vacuum structure,
clustering, etc., discussed below, apply in full force to such theories, once these loop effects are included.
For finite spatial volume V, $\Gamma_V(\bar\phi)$ is convex and smooth, but develops cusps (discontinuities
in the first derivative) and a flat section for $-v < \bar\phi < +v$ in the infinite
volume limit. In this limit, the theory has two exactly degenerate minimum-energy
states (in the absence of a source), which we may denote, following our previous
notation, $|+v\rangle$ and $|-v\rangle$, with $\langle\pm v|\phi(x)|\pm v\rangle = \pm v$. As pointed out above, these
states are strictly orthogonal in the infinite volume limit: indeed, the matrix element
of any local operator taken between $|+v\rangle$ and $|-v\rangle$ also vanishes, as the local field
cannot “twist” the scalar field expectation value from −v to +v over all spacetime
points (see Problem 4). As
and
$$\langle -v|H_0|+v\rangle = 0 \tag{14.46}$$
with
the infinite volume effective potential is immediately seen to be constant in the range
$-v < \bar\phi < +v$:
the range −v to +v: instead the system prefers the appropriate linear combination
(14.47) of the two degenerate vacua of the system.
The mixed states$^7$ $\alpha|-v\rangle + \beta|+v\rangle \equiv |\bar\phi\rangle$ defined in (14.47) are perfectly well-defined
normalized states in the Hilbert space of the theory, but, with the exception
of the two extreme cases $|\pm v\rangle$ (|α| = 0 or 1), they are not physically acceptable vacua.
Indeed, a Fock space built on such states will necessarily result in a dramatic failure of
the Ruelle clustering property discussed in Section 9.2, and hence in the basic property
of cluster decomposition which constitutes one of the pillars on which we constructed
the entire framework of local quantum field theory. To see this in a simple example,
consider the connected part of the Wightman two-point function defined with respect
to a mixed vacuum $|\bar\phi\rangle$, $-v < \bar\phi < v$:
$$\langle\bar\phi|\phi(x_1)\phi(x_2)|\bar\phi\rangle_c \equiv \langle\bar\phi|\phi(x_1)\phi(x_2)|\bar\phi\rangle - \langle\bar\phi|\phi(x_1)|\bar\phi\rangle\langle\bar\phi|\phi(x_2)|\bar\phi\rangle \tag{14.50}$$
In the first term on the right-hand side, we may insert a complete set of states
where the primed sum runs over non-vacuum states, beginning with single-particle
states separated (in this theory with a broken discrete symmetry) by a non-zero mass
gap m from the degenerate vacuum states $|\pm v\rangle$. If $x_1$, $x_2$ are spacetime coordinates
separated by a large space-like separation R, i.e., $(x_1 - x_2)^2 = -R^2$, then the primed
sum has a Källén–Lehmann representation (cf. Section 9.5) as a spectral integral over
free two-point functions with mass μ ≥ m, which fall at least as fast as $e^{-mR}$ at large
R. Up to these exponentially falling terms, therefore, and using the vacuum orthogonality
property $\langle\pm v|\phi(x)|\mp v\rangle = 0$, we find that in an infinite volume theory, for
R large
7 The terminology “mixed” here being applied to non-extremal vacuum states should not be confused
with the sense of “mixed” as distinguished from “pure” states in statistical physics: the states $|\bar\phi\rangle$ are
pure states for all α, in the statistical sense, corresponding to definite rays in the Hilbert space. The
thermodynamic analogy is with systems with coexisting phases: see Wightman, footnote 4, op. cit.
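The failure of clustering in a mixed vacuum can be illustrated with elementary arithmetic. The sketch below keeps only the two vacuum intermediate states at large space-like separation and uses the stated properties $\langle\pm v|\phi(x)|\pm v\rangle = \pm v$ and $\langle\pm v|\phi(x)|\mp v\rangle = 0$; the function name and the normalization step are illustrative assumptions, and exponentially small non-vacuum contributions are dropped.

```python
def mixed_vacuum_limits(alpha, beta, v):
    """Large-separation limits in the state alpha|-v> + beta|+v>.

    Returns (one_point, connected), where one_point is the field expectation
    value and connected is the limiting connected two-point function, keeping
    only the vacuum intermediate states |+v> and |-v>.
    """
    norm = abs(alpha) ** 2 + abs(beta) ** 2
    weight_minus = abs(alpha) ** 2 / norm     # probability of the -v vacuum
    weight_plus = abs(beta) ** 2 / norm       # probability of the +v vacuum
    one_point = v * (weight_plus - weight_minus)
    two_point = v**2 * (weight_plus + weight_minus)   # each vacuum contributes v^2
    return one_point, two_point - one_point**2


for alpha, beta in ((0.0, 1.0), (1.0, 0.0), (0.6, 0.8), (1.0, 1.0)):
    phi_bar, connected = mixed_vacuum_limits(alpha, beta, v=2.0)
    print(alpha, beta, phi_bar, connected)
```

In this approximation the connected function tends to $v^2 - \bar\phi^2$, which vanishes only at the extreme points $\bar\phi = \pm v$; for any genuinely mixed vacuum it stays finite at arbitrarily large separation.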
$$H = \int d^3x\;{:}\Big(\frac{1}{2}\dot{\vec\phi}^{\,2} + \frac{1}{2}|\vec\nabla\vec\phi|^2 + P(\vec\phi)\Big){:}, \quad P(\vec\phi) = \frac{\lambda}{24}(\vec\phi^{\,2} - v^2)^2, \quad v = \sqrt{\frac{6}{\lambda}}\,m \tag{14.55}$$
the global O(3) symmetry of the Hamiltonian, under the transformations $\vec\phi \to R\vec\phi$,
with R an orthogonal rotation, is spontaneously broken, and the vacuum sector can be
parameterized by the points on the surface of a sphere $|\vec\phi| \le v$ in field space. There is
thus a continuous infinity of orthogonal minimum-energy states in the infinite-volume
limit, and the effective potential $\Gamma(\vec\phi)$, which is again necessarily convex, is constant
within and on the boundary of this sphere. A physically realistic theory, satisfying
the constraints of clustering, again requires that we choose vacua corresponding to
an extreme point, on the surface of the sphere, with $|\vec\phi| = v$. Of course, Goldstone’s
theorem (our symmetry is now continuous!) asserts that the field theory constructed
on such a vacuum state necessarily has zero-mass particle states. Denoting a particular
clustering vacuum state by $|\vec v\rangle$, where $\vec v$ is a three-vector with magnitude $v = \sqrt{6/\lambda}\,m$,
the proof of the Goldstone theorem outlined in the preceding section indicates that the
state $\vec J^{\,0}(x)|\vec v\rangle$ contains single-particle Goldstone particle states, so that the spatial
integral, giving the effect of the charges $\vec Q$ on $|\vec v\rangle$, produces a state with zero spatial
momentum (and therefore energy) Goldstone particles. On the other hand, the effect of
the O(3) charges is simply to perform an infinitesimal rotation in field space, so at least
formally, the vacua $|\vec v\rangle$ with $|\vec\phi| = v$ can be constructed from each other by application
of the finite rotation $e^{i\vec\omega\cdot\vec Q}$; in other words, by constructing coherent states containing
infinitely many zero-energy Goldstone modes. At infinite volume these states become
orthogonal, and the associated formally unitary operations become improper, much
as the interaction-picture operators in Haag’s theorem (cf. Section 10.5).
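For the O(3) example, the broken generators can be counted by acting with each rotation generator on the vacuum expectation value. The sketch below assumes the real antisymmetric form $(T_a)_{bc} = -\epsilon_{abc}$ for the generators and an expectation value pointing along the third field direction; both choices, and the function names, are illustrative conventions rather than notation from the text.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def rotate_infinitesimally(a, vec):
    """Action of the real generator T_a, (T_a)_{bc} = -eps_{abc}, on a field vector."""
    return [sum(-levi_civita(a, b, c) * vec[c] for c in range(3)) for b in range(3)]


def broken_generators(vev, tol=1e-12):
    """Indices of generators that do not annihilate the vacuum expectation value."""
    return [a for a in range(3)
            if any(abs(x) > tol for x in rotate_infinitesimally(a, vev))]


vev = [0.0, 0.0, 2.0]                # a clustering vacuum with |v| = 2
print(broken_generators(vev))        # [0, 1]
print(broken_generators([0.0, 0.0, 0.0]))   # [] (symmetric vacuum: nothing broken)
```

Two of the three generators move the chosen vacuum, while rotations about the direction of the expectation value leave it fixed, so with one Goldstone boson per broken generator this model would carry two massless modes.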
Although the definition of an effective action (or the associated effective potential,
for constant fields) in terms of a Legendre transform is very convenient for perturbative
calculations (one simply sums the 1PI graphs), it is not particularly useful in situations
where spontaneous symmetry-breaking occurs at a non-perturbative level, such as in
strongly coupled scalar theories (in calculating upper bounds on the Higgs mass,
for example) or in quantitative studies of chiral symmetry-breaking in quantum
chromodynamics. A more convenient object in these cases, where we must resort to
explicit numerical simulation of the lattice-regularized field theory, is provided by
the constraint effective potential introduced in (O’Raifertaigh et al., 1986). For the
scalar theory (14.42), for example, define a functional UΩ [φ], which we shall call the
constraint effective action, by the functional integral
$$e^{-U_\Omega[\phi]} \equiv \int D\hat\phi\,\delta(\phi(x) - \hat\phi(x))\,e^{-S[\hat\phi]} \qquad (14.56)$$
where the field theory is defined at finite spacetime volume Ω. Note that the connected
generating functional WΩ [j] is completely reconstructible from the knowledge of UΩ [φ],
by a functional Laplace transformation
$$W_\Omega[j] = \int D\phi\,e^{-\left(U_\Omega[\phi] - \int j(x)\phi(x)\,d^4x\right)} \qquad (14.57)$$
Restricting ourselves to spacetime constant fields, we can similarly define the con-
straint effective potential UΩ (φ)
$$e^{-U_\Omega(\phi)} \equiv \int D\hat\phi\,\delta\!\left(\phi - \frac{1}{\Omega}\int\hat\phi(x)\,d^4x\right)e^{-S[\hat\phi]} \qquad (14.58)$$
It is trivial to impose the δ-function constraint in (14.58) in a numerical (e.g.,
Monte Carlo) simulation of the lattice-regularized theory: for example, on a spacetime
lattice of Ω = N points, we can just set $\hat\phi_N = N\phi - \sum_{i=1}^{N-1}\hat\phi_i$, and then simulate the
remaining system of N − 1 field variables by standard statistical sampling methods.
In the case of our zero-dimensional toy theory (14.20), one sees immediately that
U (φ̄) = P (φ̄): the constraint effective potential therefore has the same double-well
structure as the classical field potential in the symmetry-breaking case, and is evidently
not convex, unlike the conventional effective potential Γ(φ̄). In proper field theory
(models with kinetic terms), one finds (O’Raifertaigh et al., 1986) that in the infinite
volume limit, Ω → ∞, UΩ (φ̄) approaches the previously discussed convex function
Γ∞ (φ̄): in particular, the flat regions describing mixed vacua are recovered in this
limit. For an application of the constraint effective potential approach to the problem
of the Higgs boson mass limit in electroweak theory, see (Kuti and Shen, 1988).
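The elimination of one site variable described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the one-dimensional periodic chain, the couplings `m2` and `lam`, and the function names are hypothetical choices made here, not taken from the text or from (O’Raifertaigh et al., 1986). The point is simply that the last site is never sampled; it is recomputed from the constraint after every proposal.

```python
import math
import random


def complete_configuration(phi_bar, free_sites):
    """Return all N site values, with the last one fixed by the constraint."""
    n_sites = len(free_sites) + 1
    last = n_sites * phi_bar - sum(free_sites)
    return list(free_sites) + [last]


def lattice_action(config, m2=-1.0, lam=6.0):
    """Toy action on a periodic 1D chain (hypothetical choice of couplings)."""
    n_sites = len(config)
    total = 0.0
    for i, phi in enumerate(config):
        neighbour = config[(i + 1) % n_sites]
        total += 0.5 * (neighbour - phi) ** 2
        total += 0.5 * m2 * phi ** 2 + lam / 24.0 * phi ** 4
    return total


def metropolis_sweep(free_sites, phi_bar, rng, step=0.5):
    """One Metropolis sweep over the N-1 free sites at fixed lattice average."""
    free = list(free_sites)
    s_old = lattice_action(complete_configuration(phi_bar, free))
    for i in range(len(free)):
        old_value = free[i]
        free[i] = old_value + rng.uniform(-step, step)
        s_new = lattice_action(complete_configuration(phi_bar, free))
        if rng.random() < math.exp(min(0.0, s_old - s_new)):
            s_old = s_new          # accept the proposal
        else:
            free[i] = old_value    # reject and restore
    return free


rng = random.Random(1234)
phi_bar = 0.3
free = [0.0] * 7                   # N = 8 sites, 7 of them free
for _ in range(200):
    free = metropolis_sweep(free, phi_bar, rng)
config = complete_configuration(phi_bar, free)
print(sum(config) / len(config))   # lattice average stays at phi_bar
```

Because the constraint is linear in the site variables, solving it for the last site introduces only a constant Jacobian, which is why the remaining N − 1 variables can be sampled with the unmodified Boltzmann weight.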
14.4 Problems
1. Show that in the toy model defined by the integral (14.20), in the symmetry-
breaking case with potential P− (φ), the infinite volume limit of the effective
potential (with the “sup” definition) Γ− (φ) equals P− (φ) for |φ| > v, while Γ− (φ)
is constant for −v ≤ φ ≤ +v.
2. Calculate the Hamiltonian matrix element $\gamma \equiv \langle +|H_0|-\rangle$ for the Hamiltonian H0
in (14.35) between the Gaussian approximate ground states ψ± (x) given in (14.36).
3. Verify the result $\Gamma(\bar x) = \frac{\omega}{2} - \gamma\sqrt{1 - \bar x^2/v^2}$ for the effective potential for |x̄| < v
in the anharmonic oscillator, using the two-dimensional truncation to the subspace
spanned by $|\pm\rangle$.
4. The coherent states of a scalar field of mass m with different expectation values for
the field become orthogonal in the infinite-volume limit. To see this, we begin by
considering the field quantized at finite volume (at time zero):
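$$\phi(\vec x, 0) = \frac{1}{\sqrt V}\sum_{\vec k}\frac{1}{\sqrt{2E_k}}\left(a_{\vec k}\,e^{i\vec k\cdot\vec x} + a^\dagger_{\vec k}\,e^{-i\vec k\cdot\vec x}\right),\qquad [a_{\vec k}, a^\dagger_{\vec k'}] = \delta_{\vec k,\vec k'} \qquad (14.59)$$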
(b) Show that $\langle -v|+v\rangle$ vanishes exponentially in the infinite volume limit V → ∞.
(c) Show that $\langle -v|\phi(\vec x, 0)^2|+v\rangle$ vanishes exponentially in the infinite-volume limit
V → ∞. Note that this matrix element actually vanishes identically even at
finite volume if we normal order the squared field: why? (See discussion of
coherent states at end of Section 8.3.)
15
Symmetries IV: Local symmetries
in field theory
The preceding three chapters have been devoted to examining the consequences of
two main types of symmetry in quantum field theory: those in which the dynamics
of the theory is invariant under certain transformations on the kinematical spacetime
scaffolding of the theory (we have called these “spacetime symmetries”), and those in
which the symmetry transformations operate globally (i.e., identically at all points in
spacetime) and linearly on the set of independent fields present in the theory (calling
these “internal global symmetries”). If only fields of spin zero and 1/2 are present, these
are in fact the only types of symmetry that are relevant in relativistic field theory. Once
spin-1 fields are present, however, the situation changes radically. The formulation
of renormalizable interacting field theories for spin-1 particles turns out to lead us
inexorably to the introduction of a new type of symmetry—local gauge symmetry—
which represents, in some sense, an amalgam of spacetime and internal symmetry.
Anticipating the discussion of scale dependence of Lagrangian field theories in
Part 4 of the book, we shall see that the survival to low energies of non-trivial
interactions of spin-1 particles guarantees the presence of local gauge invariance in
the dynamics of the theory describing these low-energy processes. Of course, local
gauge invariance was already fully present in the classical electrodynamics perfected
by the great work of Maxwell in the 1860s, and the incorporation of a local gauge
principle in relativistic quantum field theory was implicit in the very earliest works on
quantum electrodynamics.1 However, a full appreciation of the extraordinarily deep
implications of local gauge symmetry for all the fundamental interactions in Nature
had to await the development of the concepts and techniques of modern quantum field
theory. In this chapter we begin our study of these implications.
1 The modern “gauge” terminology, however, goes back to the work of Weyl, beginning in 1919.
$$L = \frac{1}{2}\dot r^2 - V(r^2) \qquad (15.1)$$
Now suppose that the motion of our particle is observed in a frame of reference
attached to a turntable (situated just below the half-line along which the particle
is moving) on which are inscribed perpendicular axes measuring two coordinates q1
and q2 . The turntable is allowed to execute capricious rotations in the course of
time, but at any time we have $r = \sqrt{q_1^2 + q_2^2}$. If we substitute this relation into the
Lagrangian, we find a new Lagrangian in terms of the q1 , q2 degrees of freedom which
describe the dynamics of the system as observed in the frame of reference affixed to the
turntable:
$$L = \frac{1}{2}\frac{(q_1\dot q_1 + q_2\dot q_2)^2}{q_1^2 + q_2^2} - V(q_1^2 + q_2^2) \qquad (15.2)$$
If we now wish to study the quantum mechanics of such a system, we must construct
a Hamiltonian via a Legendre transformation, and impose canonical commutation
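relations on the conjugate momentum-coordinate pairs p1 , q1 and p2 , q2 , where
$$p_1 = \frac{\partial L}{\partial\dot q_1} = q_1\,\frac{q_1\dot q_1 + q_2\dot q_2}{q_1^2 + q_2^2} \qquad (15.3)$$
$$p_2 = \frac{\partial L}{\partial\dot q_2} = q_2\,\frac{q_1\dot q_1 + q_2\dot q_2}{q_1^2 + q_2^2} \qquad (15.4)$$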
It is immediately clear that the Legendre transform does not exist in this case, for the
simple reason that it is impossible to solve uniquely for the velocities q̇1 , q̇2 in terms of
the conjugate momenta p1 , p2 , as the pair of equations (15.3, 15.4) are degenerate. In
fact, we have the identity (or “primary constraint”—one following directly from the
structure of the Lagrangian),
$$q_1 p_2 - q_2 p_1 = 0 \qquad (15.5)$$
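$$L = \frac{1}{2}\frac{(\vec q\cdot\dot{\vec q})^2}{\vec q^{\,2}} - V(\vec q^{\,2}) \qquad (15.8)$$
In this case there are three primary constraints following immediately from $\vec p = \vec q\,\frac{\vec q\cdot\dot{\vec q}}{\vec q^{\,2}}$,
$\vec q\times\vec p = 0$:
$$\chi_i \equiv \epsilon_{ijk}\,q_j p_k = 0,\qquad i = 1, 2, 3 \qquad (15.9)$$
which are just the angular momentum components Li , generating rotations around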
the three spatial axes. In this case the set of gauge transformations
with R(t) an orthogonal O(3) rotation clearly form a non-abelian group. Note that the
commutators (or at the classical level, the Poisson brackets) of the primary constraints
in this case form a closed algebra—indeed, just the Lie algebra of the rotation group.
Constraints which close in this way are referred to as “first-class” constraints, and are
always associated with the presence of superfluous “gauge” degrees of freedom which
can be eliminated by an appropriate gauge-fixing procedure. In order to understand
how to do this in a more general way, we must now turn to a brief discussion of the
theory of constrained Hamiltonian systems.
$$L = \frac{1}{2}(\dot q_1^2 + \dot q_2^2) + q_0(q_1\dot q_2 - q_2\dot q_1) - V(q_1^2 + q_2^2) \qquad (15.11)$$
The absence of any dependence on q̇0 immediately implies the primary constraint
$$p_0 = \frac{\partial L}{\partial\dot q_0} = 0 \qquad (15.12)$$
However, the requirement that this constraint be preserved in the time evolution,
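$$\frac{\partial p_0}{\partial t} = \frac{\partial}{\partial t}\frac{\partial L}{\partial\dot q_0} = \frac{\partial L}{\partial q_0} = 0 \qquad (15.13)$$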
where we have employed the Euler–Lagrange equation for the coordinate q0 , amounts
to the further secondary constraint (setting the non-dynamical q0 = 0)
$$\frac{\partial L}{\partial q_0} = q_1\dot q_2 - q_2\dot q_1 = q_1 p_2 - q_2 p_1 = 0 \qquad (15.14)$$
which is just the primary constraint (15.5) arising from the Lagrangian (15.2). In the
next section it will become apparent that both Lagrangians have identical physical
content as constrained Hamiltonian systems. The important distinction, as we shall
see, is between those constraints whose Poisson brackets (or, in the quantum case,
commutators) with each other vanish once the constraints themselves are imposed
(so-called first-class constraints) and those with non-vanishing Poisson brackets on
the constraint surface (second-class constraints, which we do not consider further
here, as our primary interest lies in the Hamiltonian interpretation of local gauge
symmetries).
space2 than we can devote to it here, so we shall restrict ourselves to the elements of
the theory directly relevant to the canonical treatment, and quantization, of theories
with local gauge symmetries.
Let us return to the simple example described in the preceding section, with
Lagrangian (15.2). Ignoring temporarily the inconvenient absence of a unique relation
between velocities and momenta, we see, using (15.3, 15.4), that we can re-express the
Hamiltonian function for this theory, initially given as
$$H = p_1\dot q_1 + p_2\dot q_2 - L = \frac{1}{2}\frac{(q_1\dot q_1 + q_2\dot q_2)^2}{q_1^2 + q_2^2} + V(q_1^2 + q_2^2) \qquad (15.15)$$
in a number of equivalent ways: for example,
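$$H = \frac{1}{2}p_1^2\left(1 + \frac{q_2^2}{q_1^2}\right) + V(q_1^2 + q_2^2) \qquad (15.16)$$
$$= \frac{1}{2}p_2^2\left(1 + \frac{q_1^2}{q_2^2}\right) + V(q_1^2 + q_2^2) \qquad (15.17)$$
$$= \frac{1}{2}(p_1^2 + p_2^2) + V(q_1^2 + q_2^2),\ \ldots \qquad (15.18)$$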
The lack of a unique inversion for the velocities in terms of the momenta manifests
itself in the multiplicity of equivalent expressions for the Hamiltonian in the three lines
above, which are clearly equal once we take the primary constraint χ(q1 , q2 , p1 , p2 ) =
q1 p2 − q2 p1 = 0 into account. In fact, the set of Hamiltonians given by (the “T”
subscript denotes “total Hamiltonian”, including constraints, in Dirac’s language)
$$H_T = \frac{1}{2}(p_1^2 + p_2^2) + V(q_1^2 + q_2^2) - \lambda(t)\,\chi(q_1, q_2, p_1, p_2) \qquad (15.19)$$
with λ(t) an arbitrary function of time (either explicitly, and/or through an arbitrary
function of the coordinates q, p), are all equivalent in this sense. If we derive Hamiltonian
equations of motion $\dot q = \partial H/\partial p$, $\dot p = -\partial H/\partial q$ in the usual way from (15.19), treating
2 For a careful and very thorough treatment of the full theory of constrained systems, with emphasis on
gauge theories, see (Henneaux and Teitelboim, 1992).
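$$\dot q_1 = p_1,\qquad \dot p_1 = -\frac{\partial V}{\partial q_1} \qquad (15.23)$$
$$\dot q_2 = p_2,\qquad \dot p_2 = -\frac{\partial V}{\partial q_2} \qquad (15.24)$$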
In other words, the arbitrariness of the constraint term in (15.19) precisely incorporates
the gauge freedom in the solutions of the underlying one-dimensional problem when
viewed in the floating turntable frame, if we interpret the Lagrange multiplier function
λ(t) as the angular velocity of the turntable θ̇(t) at any given time.
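The statement that λ(t) only reorients the turntable can be checked numerically. The sketch below integrates the Hamiltonian equations that follow from (15.19) for two different multiplier functions and compares the gauge-invariant radius. It rests on assumptions made here for illustration: the potential is taken to be V(q²) = q²/2, the integrator is an ordinary fourth-order Runge-Kutta scheme, and all function names are hypothetical.

```python
import math


def rhs(state, lam):
    """Hamilton's equations from H_T = (p1^2+p2^2)/2 + V - lam*(q1*p2 - q2*p1),
    with the illustrative choice V(q^2) = q^2 / 2."""
    q1, q2, p1, p2 = state
    return (p1 + lam * q2,
            p2 - lam * q1,
            -q1 + lam * p2,
            -q2 - lam * p1)


def evolve(state, lam_of_t, t_final=1.0, n_steps=2000):
    """Classical fourth-order Runge-Kutta integration of the flow."""
    dt = t_final / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = rhs(state, lam_of_t(t))
        s2 = tuple(s + 0.5 * dt * k for s, k in zip(state, k1))
        k2 = rhs(s2, lam_of_t(t + 0.5 * dt))
        s3 = tuple(s + 0.5 * dt * k for s, k in zip(state, k2))
        k3 = rhs(s3, lam_of_t(t + 0.5 * dt))
        s4 = tuple(s + dt * k for s, k in zip(state, k3))
        k4 = rhs(s4, lam_of_t(t + dt))
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        t += dt
    return state


def radius(state):
    return math.hypot(state[0], state[1])


def chi(state):
    q1, q2, p1, p2 = state
    return q1 * p2 - q2 * p1


start = (1.0, 0.0, 0.3, 0.0)                 # on the constraint surface chi = 0
fixed = evolve(start, lambda t: 0.0)         # turntable at rest
spun = evolve(start, lambda t: 0.5 + math.sin(3.0 * t))  # capricious rotation
print(radius(fixed), radius(spun))           # same radial coordinate
print(chi(fixed), chi(spun))                 # constraint stays at zero
```

The individual coordinates q1, q2 differ between the two runs, but the radius and the vanishing of χ do not, which is the content of the gauge freedom in this model.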
At this point, it is useful to recall that the classical Hamiltonian equations of
the theory, (15.20), can be expressed in a form which is particularly suggestive when
one wishes to make the transition to quantum theory, in terms of the Poisson bracket
{F, G} defined on arbitrary functions F (qi , pi ), G(qi , pi ) on the (unconstrained) phase-
space as follows
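$$\{F, G\} \equiv \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \qquad (15.25)$$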
Using the Poisson brackets, the dynamical evolution on phase-space (i.e., Eqs. (15.20))
amounts to
$$\dot q_i = \{q_i, H_T\},\qquad \dot p_i = \{p_i, H_T\} \qquad (15.26)$$
Equivalently, we can say that the total Hamiltonian acts as the generator of infinitesi-
mal time translations: for example, qi (t + δt) = qi (t) + {qi , δt · HT }, etc. The primary
constraint χ(q1 , q2 , p1 , p2 ) = q1 p2 − q2 p1 = 0 is itself left invariant under Hamiltonian
evolution, {χ, HT } = 0: this is physically obvious in our toy model, as the con-
straint is just the angular momentum which is preserved under the two-dimensional
motion of our particle in the central potential V (q12 + q22 ). Thus the constraint, once
applied as an initial condition at time t = 0, will automatically be satisfied at any
later time on trajectories following the Hamiltonian evolution (15.26). Or, in yet
other words, the three-dimensional constraint surface obtained by restricting the
four-dimensional phase-space (q1 , q2 , p1 , p2 ) to points satisfying χ(q1 , q2 , p1 , p2 ) = 0 is
invariant under Hamiltonian evolution. However, this three-dimensional space, as it
is odd-dimensional, cannot act as a proper dynamical phase-space (with an equal
number of “p’s” and “q’s”). In fact, it is clearly still too large, as it contains distinct
Constrained Hamiltonian systems 515
points representing physically equivalent states of the system: those related by a gauge
transformation
$$(\vec q, \vec p) \to (\vec q\,' = R\vec q,\ \vec p\,' = R\vec p) \qquad (15.27)$$
where R is a 2x2 rotation matrix (see (15.22)). As the rotation R varies over all
possible rotation angles 0 ≤ θ < 2π, the points $\vec q\,'$, $\vec p\,'$ trace out a one-dimensional
“gauge orbit” of physically equivalent points in phase-space. In the preceding section
we saw that the gauge ambiguity of the system defined by Lagrangian (15.2) could
be eliminated by imposing a “gauge condition” (such as ψ(q, p ) = q2 = 0), at which
point we recover a non-singular Lagrangian (15.7) with perfectly regular canonical
properties. An appropriately chosen gauge condition ψ(q, p ) defines a surface in the
original four-dimensional phase-space of our unconstrained system which intersects
the gauge orbit passing through any given point in phase-space exactly once. The
imposition of such a condition means that the gauge freedom of the unconstrained
system has been completely eliminated: in the turntable model of the previous section,
it means that we have specified unambiguously the orientation of the turntable at every
moment in time.
Note that the constraint function χ(q, p ) acts as the infinitesimal generator of
gauge transformations (i.e., O(2) rotations), as
Note also that a necessary condition for the gauge freedom to be completely eliminated
is that the Poisson bracket {ψ, χ} of the gauge-fixing function and the constraint be
non-zero: otherwise put, once on the gauge-fixed surface, any gauge transformation,
and in particular any infinitesimal gauge transformation, must move us off that
surface. Simple axial gauges such as ψ = q2 clearly satisfy this requirement, as we see
from (15.28).
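Both bracket statements made above (the constraint is conserved, and the axial gauge condition has a non-vanishing bracket with it) can be checked at a sample phase-space point. The following is a minimal sketch using central finite differences for the partial derivatives in (15.25); the quartic potential, the sample point and the function names are assumptions introduced here for illustration only.

```python
def poisson_bracket(f, g, point, eps=1e-5):
    """Central-difference estimate of {f, g} at point = (q1, q2, p1, p2)."""
    def partial(func, index):
        up = list(point)
        down = list(point)
        up[index] += eps
        down[index] -= eps
        return (func(*up) - func(*down)) / (2.0 * eps)

    total = 0.0
    for i in range(2):                      # q_i is slot i, p_i is slot i + 2
        total += partial(f, i) * partial(g, i + 2)
        total -= partial(f, i + 2) * partial(g, i)
    return total


def chi(q1, q2, p1, p2):
    return q1 * p2 - q2 * p1                # primary constraint


def hamiltonian(q1, q2, p1, p2):
    r2 = q1 * q1 + q2 * q2
    return 0.5 * (p1 * p1 + p2 * p2) + 0.25 * r2 * r2   # hypothetical V = (q^2)^2 / 4


def axial_gauge(q1, q2, p1, p2):
    return q2                               # gauge condition psi = q2


point = (0.7, -0.2, 0.4, 1.1)
print(poisson_bracket(chi, hamiltonian, point))   # close to 0: chi is conserved
print(poisson_bracket(axial_gauge, chi, point))   # close to q1 = 0.7: nonzero
```

With the sign convention of (15.25) one finds {q2, χ} = q1, so the axial gauge fails to fix the rotation only at q1 = 0, that is, at the origin of the gauge-fixed line.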
The three-dimensional version of our toy model, (15.8), has three primary first-
class constraints (15.9) whose Poisson brackets are just the Lie algebra of the gauge
group O(3):
$$\{\chi_i, \chi_j\} = \epsilon_{ijk}\,\chi_k \qquad (15.30)$$
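The closure of the three constraints under the Poisson bracket can be confirmed numerically. The sketch below assumes that the constraints are the components of q × p (consistent with χ3 = q1 p2 − q2 p1 quoted in the text) and evaluates the brackets by central differences at an arbitrary sample point; the helper names and the sample values are hypothetical.

```python
def poisson_bracket(f, g, q, p, eps=1e-4):
    """Central-difference {f, g} for functions of 3-vectors q and p."""
    def partial(func, which, index):
        q_up, p_up, q_dn, p_dn = list(q), list(p), list(q), list(p)
        if which == "q":
            q_up[index] += eps
            q_dn[index] -= eps
        else:
            p_up[index] += eps
            p_dn[index] -= eps
        return (func(q_up, p_up) - func(q_dn, p_dn)) / (2.0 * eps)

    return sum(partial(f, "q", i) * partial(g, "p", i)
               - partial(f, "p", i) * partial(g, "q", i) for i in range(3))


def chi(i):
    """i-th component of q x p (indices 0, 1, 2)."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lambda q, p: q[j] * p[k] - q[k] * p[j]


q = [0.3, -1.2, 0.8]
p = [1.5, 0.4, -0.6]
for i in range(3):
    j, k = (i + 1) % 3, (i + 2) % 3
    # cyclic (i, j, k) has epsilon_ijk = +1, so {chi_i, chi_j} should equal chi_k
    print(poisson_bracket(chi(i), chi(j), q, p), chi(k)(q, p))
```

Since each bracket is again a linear combination of the constraints, it vanishes wherever the constraints themselves vanish, which is the first-class property.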
The reader will recall that a set of constraints is said to be first-class if their Poisson
brackets vanish once the constraints themselves are imposed, which is certainly the
case if they form a closed Lie algebra as here. The gauge orbits in this model correspond
to spheres of fixed radius $|\vec q|$, $|\vec p|$ for the coordinate and momentum vectors. Again,
a complete gauge-fixing (amounting to selecting a single representative point on each
gauge orbit) is easily achieved by the axial gauge corresponding to imposing, say,
ψ1 = q1 = 0, ψ2 = q2 = 0, which at the Lagrangian level amounts to rotating the $\vec q$
vector at each time into the z-direction. Note that on the gauge-fixed surface, the
constraint χ3 = q1 p2 − q2 p1 is automatically satisfied: there are only two independent
first-class constraints, χ1 and χ2 which act non-trivially on this surface, and indeed
the non-degeneracy of the determinant
$$H_T = h(q_1, .., q_f, p_1, .., p_f) + \sum_{m=1}^{r}\lambda_m\chi_m \qquad (15.32)$$
and that the set of first-class constraints χm (qi , pi ) = χm (q, p), m = 1, . . . , r generate
gauge transformations whose orbits intersect uniquely the submanifold defined by a
set of r gauge conditions ψm (qi , pi ) = 0, m = 1, . . . , r. As we saw earlier, this implies
that the determinant of the Poisson bracket matrix {ψm , χn } be non-vanishing on
the constraint surface. We shall assume that the gauge conditions are chosen to have
vanishing Poisson brackets with each other:
{ψm , ψn } = 0 (15.33)
The commutativity of the gauge-fixing conditions implies that we can find a canonical
transformation to a new set of 2f coordinates and momenta, which we shall label
(Q∗1 , .., Q∗f −r , Q1 , .., Qr , P1∗ , .., Pf∗−r , P1 , .., Pr ), where the ψm play the role of the last
r momenta (which necessarily commute)
$$\det\{\chi_m, P_n\} = \det\left(\frac{\partial\chi_m}{\partial Q_n}\right) \neq 0 \qquad (15.35)$$
which ensures that our first-class constraints, re-expressed in the new variables,
χm (Q∗i , Pi∗ , Qm , Pm ) = 0 (15.36)
can be solved uniquely for the r new coordinates Qm , m = 1, .., r as functions of the
constrained starred variables only (once the Pm = 0 gauge conditions are applied):
$$H^*(Q_i^*, P_i^*) = h(q_1, .., q_f, p_1, .., p_f)\Big|_{P_m = 0,\ Q_m = f_m(Q_i^*, P_i^*)} \qquad (15.38)$$
Once again, resorting to our simple toy model as a concrete example, we find that our
original Hamiltonian (15.19) becomes, in terms of the constrained starred variables
Q∗1 , P1∗ , in the gauge ψ = αq1 + βq2 = 0,
$$H^*(Q_1^*, P_1^*) = \frac{1}{2}(P_1^*)^2 + V((Q_1^*)^2) \qquad (15.39)$$
This Hamiltonian has exactly the form we would obtain by the conventional canonical
procedure beginning from the non-singular Lagrangian (15.7), which the reader will
recall was obtained by exploiting the gauge symmetry of the singular Lagrangian
(15.2) to eliminate the gauge freedom in the system ab initio (by setting q2 = 0). The
reader is strongly encouraged to carry through the gauge-fixing procedure in the three-
dimensional version of the model, with Lagrangian (15.8) and an O(3) non-abelian
gauge symmetry: the end result will be exactly the same Hamiltonian, representing a
theory with a single physical degree of freedom.
The fully constrained Hamiltonians in (15.38, 15.39) can be subjected to quanti-
zation in the normal way, by imposing the canonical commutator condition3
$$[Q_i^*, P_j^*] = i\hbar\,\delta_{ij} \qquad (15.40)$$
3 As pointed out originally by Dirac, the classical to quantum transition in this context amounts simply
to the replacement $\{\ldots, \ldots\} \to -\frac{i}{\hbar}[\ldots, \ldots]$.
We are now going to do something which at first sight seems very strange indeed: we
wish to write an equivalent functional-integral representation for the kernel K, but
in terms of the full set of unconstrained variables q1 , . . . , qf , p1 , . . . , pf from which we
started. In other words, we wish to restore the physically superfluous gauge degrees
of freedom which we have just expended so much effort to eliminate! For the simple
mechanical examples considered so far, such a maneuver would be completely unnec-
essary and, indeed, pointless, but for the gauge field theories which we are about to
explore it is precisely the unconstrained version of the theory which manifests directly
the critical (from a field-theoretic point of view) locality and Poincaré invariance
properties which we are enjoined to preserve at all costs.
First, we restore the original set of 2f coordinates and momenta at any given
time in the functional integral measure by inserting δ-functions which incorporate the
procedures by which we originally eliminated the 2r coordinates and momenta Qm , Pm
to obtain the constrained Hamiltonian H ∗ :
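$$\prod_{i=1}^{f-r} DQ_i^*\,DP_i^* \;\to\; \prod_{i=1}^{f-r} DQ_i^*\,DP_i^* \prod_{m=1}^{r} DQ_m\,DP_m\,\delta(P_m)\,\delta(Q_m - f_m(Q_i^*, P_i^*))$$
$$= \prod_{i=1}^{f} Dq_i\,Dp_i \prod_{m=1}^{r}\delta(\psi_m)\,\delta(Q_m - f_m(Q_i^*, P_i^*)) \qquad (15.42)$$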
where we have used the fact that the canonical measure $\prod_{i=1}^{f} Dq_i\,Dp_i$ is invariant
under the canonical transformation to the (Q∗1 , .., Q∗f −r , Q1 , .., Qr , P1∗ , .., Pf∗−r ,
P1 , .., Pr ) variables. The δ-functions of the Qm coordinates in (15.42) can be traded
in for δ-functions of the first-class constraints χm at the cost of the Jacobian (15.35):
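$$\prod_{m=1}^{r}\delta(Q_m - f_m(Q_i^*, P_i^*)) = \prod_{m=1}^{r}\delta(\chi_m)\,\frac{\partial(\chi_1, \chi_2, .., \chi_r)}{\partial(Q_1, Q_2, .., Q_r)} = \prod_{m=1}^{r}\delta(\chi_m)\cdot\det\{\chi_m, \psi_n\} \qquad (15.43)$$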
The reader will recall (see, for example, (Goldstein, 2002), Section 9.1) that in Hamilto-
nian systems the combination pi q̇i − H is unchanged under a canonical transformation
up to an additive total time-derivative dF (where F is the generating function of the
canonical transformation), which in the exponent of the path integral will lead to
an overall phase factor $e^{\frac{i}{\hbar}(F_f - F_i)}$, where Fi (resp. Ff ) are the initial (resp. final)
values of the generating function over the time evolution from ti to tf . In the field
theory case, we shall be letting the initial and final times go to −∞ and +∞, where
the fields can be safely switched off, so we may ignore this factor here. Recalling
that the constrained Hamiltonian H ∗ in (15.41) is precisely obtained by subjecting
the unconstrained Hamiltonian h(q1 , .., qf , p1 , .., pf ) to the δ-function constraints in
(15.42), we see that our expression for the propagation kernel in terms of a path
integral over constrained variables can be written
$$K(t_f, t_i) = \int\prod_{i=1}^{f} Dq_i\,Dp_i \prod_{m=1}^{r}\delta(\chi_m)\,\delta(\psi_m)\,\det\{\chi_m, \psi_n\}\;e^{\frac{i}{\hbar}\int_{t_i}^{t_f}(p_i\dot q_i - h(q_i, p_i))\,dt}$$
Abelian gauge theory as a constrained Hamiltonian system 519
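$$= \int\prod_{i=1}^{f} Dq_i\,Dp_i \prod_{m=1}^{r} D\lambda_m\,\delta(\psi_m)\,\det\{\chi_m, \psi_n\}\;e^{\frac{i}{\hbar}\int_{t_i}^{t_f}(p_i\dot q_i - h(q_i, p_i) - \lambda_m\chi_m)\,dt}$$
$$= \int\prod_{i=1}^{f} Dq_i\,Dp_i \prod_{m=1}^{r} D\lambda_m\,\delta(\psi_m)\,\det\{\chi_m, \psi_n\}\;e^{\frac{i}{\hbar}\int_{t_i}^{t_f}(p_i\dot q_i - H_T(q_i, p_i))\,dt} \qquad (15.44)$$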
$$A_\mu(x) \to A^\Lambda_\mu(x) \equiv A_\mu(x) + \partial_\mu\Lambda(x) \qquad (15.50)$$
where Λ(x) is an arbitrary twice-differentiable function (as we wish the field strengths
Fμν to remain well-defined after the gauge transformation). The invariance of the term
in the action involving Jμ is apparent after we use integration by parts to transfer the
Π0 = −F 00 = 0 (15.52)
while for the spatial components of the field Aμ , the conjugate momentum fields are
recognized as the electric field components of Maxwellian electrodynamics:
Πi = −F 0i = F0i = ∂0 Ai − ∂i A0 ≡ E i , i = 1, 2, 3 (15.53)
for arbitrary functionals F, G: in particular, the Poisson brackets for the spatial vector
potential and its conjugate (electric) field are simply
χ(x) ≡ J 0 − ∂i E i = J 0 − ∂i Πi = 0 (15.57)
while the effect of a gauge-transformation on the momentum field variables (i.e., the
electric field) is null:
$$\left\{\Pi^i(\vec x),\ \int\lambda(\vec y)\,\chi(\vec y)\,d^3y\right\} = 0 \qquad (15.60)$$
—i.e., the electric field, unlike the vector potential, is gauge-invariant, and therefore
possesses a direct physical meaning. Just as in the mechanical examples of the
preceding sections, we are free to make such a transformation of the canonical fields
independently at different times, so we recognize the gauge invariance generated by
the first-class constraints of this theory as just the local gauge invariance (15.50)—
restricted to the dynamical canonical fields Ai , of course.
We can now proceed to the construction of an unconstrained total Hamiltonian à
la Dirac, imitating the procedure followed in the preceding section, where our starting
4 The current J μ should be regarded here as either a fixed external field, or built out of matter fields
which have vanishing Poisson brackets with the Aμ fields.
We have used electric/magnetic field notation $E^i = \Pi^i$, $B^i = \frac{1}{2}\epsilon_{ijk}F_{jk}$ in the final line.
Note that the time component of the four-vector potential A0 now plays the role of the
Lagrange multiplier term in the Dirac total Hamiltonian, as it multiplies the Gauss’s
Law first-class constraint:
$$H = H_T = \int(\mathcal{H}_{EM} + A_0(x)\,\chi(x))\,d^3x \qquad (15.62)$$
$$\mathcal{H}_{EM} = \frac{1}{2}(\vec E^2 + \vec B^2) - \vec J\cdot\vec A \qquad (15.63)$$
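$$\psi \equiv \partial^i A_i(\vec x, t) = 0 = \vec\nabla\cdot\vec A(\vec x, t) \qquad (15.64)$$
$$\vec\nabla\cdot(\vec A + \vec\nabla\Lambda) = \vec\nabla\cdot\vec A = 0 \;\Rightarrow\; \vec\nabla^2\Lambda(\vec x) = 0 \;\Rightarrow\; \Lambda(\vec x) = \text{constant} \qquad (15.65)$$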
Alternatively, one could choose an axial gauge, where the gauge transformation is
chosen to remove (say) the third component of the field,
ψ = A3 (x, t) = 0 (15.66)
which bears an obvious similarity to the gauge choices we made earlier in our mechanical
examples (cf. (15.7)). Both of these conditions destroy the manifest Lorentz-invariance
of the theory, of course. The Coulomb gauge is marginally superior
in that it preserves at least the rotational symmetry of the theory (the O(3) subgroup
of the HLG), and we shall adopt it for the time being in our construction of a fully
constrained Hamiltonian theory.
For Coulomb gauge, we may now write down, using (15.44), the functional integral
representation of the generating functional Z giving the vacuum persistence amplitude
in the presence of the external current J μ (we henceforth set ħ = 1 as we return to
field theory proper):
$$Z_{EM} = \int DE^i\,DA_i\,DA_0\,\delta(\psi)\,\det\{\chi, \psi\}\;e^{i\int(E^i\dot A_i - \mathcal{H}_{EM} - A_0\chi)\,d^4x} \qquad (15.67)$$
with the Hamiltonian energy density given in (15.63). The DFP determinant appearing
in this expression involves the Poisson bracket of the Gauss’s Law constraint (15.57)
with the Coulomb gauge constraint (15.64) (at equal times, so we suppress time)
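$$\left\{J^0(\vec x) - \frac{\partial}{\partial x^i}E^i(\vec x),\ \frac{\partial}{\partial y^j}A_j(\vec y)\right\} = \frac{\partial}{\partial x^i}\frac{\partial}{\partial y^j}\{A_j(\vec y), E^i(\vec x)\} = \vec\nabla^2\delta^3(\vec x - \vec y) \qquad (15.68)$$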
so the corresponding determinant5
$$D_{FP} \equiv \det\{\chi(\vec x), \psi(\vec y)\} = \det(\vec\nabla^2\delta^3(\vec x - \vec y)) \qquad (15.69)$$
is a field-independent constant, which can be given a well-defined value by a full
regularization of the theory (for example, by introducing IR and UV cutoffs via a
lattice), but is of no physical significance, as we recall from Chapter 10 that overall
multiplicative factors in the functional integral for the vacuum amplitude of a field
theory disappear once we compute the connected Green functions of the theory.
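To make the phrase "field-independent constant" concrete, one can look at a lattice-regularized stand-in for the operator in (15.69). The sketch below is a toy version only, under assumptions chosen here: a one-dimensional spatial lattice with unit spacing and Dirichlet ends (so that the constant zero mode is absent), with an exact determinant computed using rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction


def laplacian_matrix(n_sites):
    """1D lattice Laplacian with Dirichlet ends: (M Lambda)_i =
    Lambda_{i+1} - 2 Lambda_i + Lambda_{i-1}, lattice spacing set to 1."""
    matrix = [[Fraction(0)] * n_sites for _ in range(n_sites)]
    for i in range(n_sites):
        matrix[i][i] = Fraction(-2)
        if i > 0:
            matrix[i][i - 1] = Fraction(1)
        if i + 1 < n_sites:
            matrix[i][i + 1] = Fraction(1)
    return matrix


def determinant(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    rows = [row[:] for row in matrix]
    size = len(rows)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, size):
                rows[r][c] -= factor * rows[col][c]
    return det


for n_sites in (2, 3, 4, 5):
    print(n_sites, determinant(laplacian_matrix(n_sites)))
```

The matrix is built without any reference to the gauge field, so its determinant (here $(-1)^N(N+1)$ for N sites) is a pure number depending only on the regularization, which is why it drops out of connected Green functions.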
Discarding the determinant factor in (15.67), we see that the dependence of the
exponent on the momentum (electric) fields is quadratic, so these may be integrated
out in the usual fashion by completing the square (and again dropping irrelevant
overall factors), leaving a functional integral over the original vector potential field
variables (A0 , Ai ) = Aμ :
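$$Z_{EM} = \int DA_\mu\,DE^i\,\delta(\vec\nabla\cdot\vec A)\;e^{i\int\left(E^i(\partial_0 A_i - \partial_i A_0) - \frac{1}{2}(\vec E^2 + \vec B^2) - J^\mu A_\mu\right)d^4x} \qquad (15.70)$$
$$= \int DA_\mu\,\delta(\vec\nabla\cdot\vec A)\;e^{i\int\left(\frac{1}{2}F_{0i}^2 - \frac{1}{4}F_{jk}^2 - J^\mu A_\mu\right)d^4x} \qquad (15.71)$$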
5 The determinant factor det{χ, ψ} appearing in the functional integral (15.67) over spacetime fields is a
product of the factor DFP given here at each discrete time, once the theory is regularized, for example, on a
spacetime lattice: thus, for the present case of Coulomb gauge, $\det\{\chi, \psi\} = D_{FP}^{N_t}$, where Nt is the number
of points in the time direction. Similarly, the δ-function gauge-fixing constraint δ(ψ) implicitly involves a
product of δ-functions enforcing the constraint at each spacetime point: $\delta(\psi) = \prod_{\vec x, t}\delta(\vec\nabla\cdot\vec A(\vec x, t))$.
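$$= \int DA_\mu\,\delta(\vec\nabla\cdot\vec A)\;e^{i\int\left(-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - J^\mu A_\mu\right)d^4x} \qquad (15.72)$$
$$= \int DA_\mu\,\delta(\vec\nabla\cdot\vec A)\;e^{i\int\mathcal{L}_{EM}\,d^4x} \qquad (15.73)$$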
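For the reader who wishes to see the intermediate algebra, the Gaussian integration over the electric field and the recombination into the covariant Lagrangian can be spelled out as follows. This is a worked sketch, assuming the metric signature (+,−,−,−) and the definition $B^i = \frac{1}{2}\epsilon_{ijk}F_{jk}$ used above; overall constants from the Gaussian integral are dropped.

```latex
% Assumptions: signature (+,-,-,-); B^i = (1/2) \epsilon_{ijk} F_{jk};
% field-independent normalization constants are discarded.
\begin{align*}
E^i F_{0i} - \tfrac12 E^i E^i
  &= -\tfrac12\,(E^i - F_{0i})(E^i - F_{0i}) + \tfrac12 F_{0i}F_{0i},
  \qquad F_{0i} = \partial_0 A_i - \partial_i A_0, \\
\int DE^i\; e^{\,i\int \left(E^i F_{0i} - \frac12 E^i E^i\right) d^4x}
  &= \text{const}\times e^{\,i\int \frac12 F_{0i}F_{0i}\, d^4x}, \\
\tfrac12 B^i B^i
  &= \tfrac18\,\epsilon_{ijk}\epsilon_{ilm}F_{jk}F_{lm} = \tfrac14 F_{jk}F_{jk}, \\
-\tfrac14 F_{\mu\nu}F^{\mu\nu}
  &= -\tfrac14\left(2F_{0i}F^{0i} + F_{jk}F^{jk}\right)
   = \tfrac12 F_{0i}F_{0i} - \tfrac14 F_{jk}F_{jk}.
\end{align*}
```

The first two lines give the step from (15.70) to (15.71); the last two give the step from (15.71) to (15.72).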
Note that the original, manifestly local, gauge- and Poincaré-invariant action for our
abelian electrodynamics has re-emerged: the only fly in the ointment is the δ-function
enforcing the (non-Lorentz-invariant) Coulomb gauge restriction independently on
each time-slice. The loss of Lorentz symmetry is only apparent: indeed, we may already
anticipate that the theory must, despite appearances, preserve Lorentz symmetry,
as the Euler–Lagrange equations (15.48) of our starting Lagrangian are perfectly
covariant. We shall now demonstrate this highly desirable feature of the constrained
formalism by showing that, despite appearances, the functional integral (15.67) is
independent of the choice of gauge-fixing function ψ, leaving us free to replace the non-
covariant Coulomb (or axial) gauge choices by a perfectly Lorentz-covariant choice—
for example, “Landau (or Lorentz) gauge” ∂μ Aμ = 0.
Recalling (cf. (15.50)) the notation $A^\Lambda_\mu \equiv A_\mu + \partial_\mu\Lambda$ for the effect of a finite gauge
transformation on the gauge field Aμ , consider the functional Δcoul [A] of Aμ defined
implicitly by
$$\Delta_{coul}[A]\cdot\int D\Lambda\,\delta(\vec\nabla\cdot\vec A^\Lambda(\vec x, t)) = 1 \qquad (15.74)$$
We again remind the reader (see footnote 5) that the δ-function constraint in the
functional integral implies a product of δ-functions at each and every spacetime
point—a statement which can be given a precise meaning by regularizing the theory
on a spacetime lattice. To the extent that the gauge fixing imposed by ψ = 0 picks a
unique gauge field with ψ(AΛ ) = 0 on the orbit passing through an arbitrary field A,
the functional integral over Λ in (15.74) receives its entire contribution from exactly
one gauge function. In particular, for fields A already on the gauge surface $\vec\nabla\cdot\vec A = 0$,
this must occur at Λ = 0, and we have
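$$\Delta_{coul}[A]^{-1} = \int D\Lambda\,\prod_t\delta(\vec\nabla^2\Lambda(\vec x, t))$$
$$= \prod_t\int D\Lambda(\vec x, t)\,\delta\!\left(\int\mathcal{M}(\vec x, \vec y)\,\Lambda(\vec y, t)\,d^3y\right) \qquad (15.75)$$
$$\mathcal{M}(\vec x, \vec y) \equiv \vec\nabla^2\delta^3(\vec x - \vec y) \qquad (15.76)$$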
6 This is clear if we discretize also the spatial dependence of the fields, and recall that for any matrix
$M_{ij}$, $\int\prod_i d\Lambda_i\,\delta(M_{ij}\Lambda_j) = (\det M)^{-1}$.
where we have used the shift invariance of the functional integral over Λ under Λ →
Λ − Λ′ (for Λ′(x) any fixed function on spacetime). The gauge-invariance is, of course,
a triviality in this particular case (abelian gauge theory in Coulomb gauge), as we
previously saw that the determinant is a field-independent constant. But this will
not be the case once we repeat the procedure for non-abelian theories, so we shall
proceed as though the DFP determinants we encounter are non-trivial functionals
of the gauge field. One may similarly define a covariant DFP functional associated
with the gauge choice ψ = ∂ μ Aμ (x) − f (x) = 0, where f (x) is for the time being a
perfectly arbitrary, but fixed, function of spacetime (thus, Landau gauge corresponds
to taking f (x) = 0 identically):
$$\Delta^f_{cov}[A]\cdot\int D\Lambda\,\delta(\partial^\mu A^\Lambda_\mu - f) = 1 \qquad (15.78)$$
In going from the second to the third equations we have made the functional change
of variable Aμ → AΛ (there is no Jacobian as this amounts to an additive shift of
the integration variables); in the fourth equation we have used the gauge-invariance
of the functionals Δcoul , Δfcov and of the Lagrangian LEM and sourced fields Oi ; and
in the final line we have employed the definition (15.74) to remove all evidence of
the original Coulomb gauge-fixing. Our final result (15.80) is manifestly Lorentz-
covariant, as all spacetime indices are properly contracted. If we set f = 0 we recover
the functional integral for abelian gauge theory in Landau gauge. Note that, as we
pointed out earlier, despite appearances, Δfcov [A] in (15.80) is in fact independent of
f , and we may henceforth write it simply as Δcov [A]. As our starting point (15.79) for
ZEM is clearly independent of the choice of the arbitrary function f , we may multiply
it by an irrelevant constant factor obtained by a functional integral over all functions
f with a damped Gaussian factor (ξ a positive real number)
$$C \equiv \int Df\;e^{-\frac{i}{2\xi}\int f(x)^2\,d^4x} \qquad (15.81)$$
obtaining
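$$Z_{EM}[K_i] = \int Df\,DA_\mu\,\Delta_{cov}[A]\,\delta(\partial^\mu A_\mu - f)\;e^{-\frac{i}{2\xi}\int f(x)^2\,d^4x + i\int(\mathcal{L}_{EM} + K_i O_i)\,d^4x}$$
$$= \int DA_\mu\,\Delta_{cov}[A]\;e^{i\int\left(\mathcal{L}_{EM}(A) - \frac{1}{2\xi}(\partial^\mu A_\mu)^2 + K_i O_i\right)d^4x} \qquad (15.82)$$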
The functional integral (15.82) defines the partition function (or vacuum persistence
amplitude) for abelian gauge theory in the so-called covariant ξ-gauges (first intro-
duced for non-abelian theories by ’t Hooft (T’Hooft, 1971)).
We reiterate that in the present case of abelian gauge theory, the DFP determinant
factor Δcov [A] is in fact a field-independent constant, and may be omitted completely:
we retain it here in anticipation of the fact that for non-abelian gauge theories,
it develops a non-trivial structure and must be kept in order to arrive at unitary
amplitudes. These gauges are extremely useful in performing perturbative calculations
in gauge theories (abelian or non-abelian): the disappearance of the arbitrary constant
ξ from all expressions for gauge-invariant quantities at the end of the calculation
provides a very useful check on the intermediate manipulations. That the Green
functions of the theory (vacuum expectation values of time-ordered products of the
Oi fields, obtained by functionally differentiating ZEM [Ki ] with respect to the source
functions Ki (x)) are ξ-independent is clear from the fact that our original expression
(15.79) for ZEM [Ki ] did not contain the parameter ξ at all. It is a straightforward
matter to extract the perturbative Feynman rules in the ξ-gauge for abelian quantum
electrodynamics from (15.82), but we shall defer this task to the more interesting
case of non-abelian gauge theory, where the full power of the DeWitt–Faddeev–Popov
approach becomes manifest. The Feynman rules for abelian gauge theory are in any
event obtained trivially from the non-abelian ones by a simple reduction, so we lose no
information by proceeding directly, as we shall shortly do, to the case of non-abelian
gauge field theory.
We may promote our abelian gauge theory to a full-fledged quantum electrody-
namics (QED), in which the external source is now the quantized four-vector current
arising from Dirac fermions (e.g., the electron) of charge e and mass m,
We must also include the usual kinetic term in the Lagrangian for the fermions, thereby
arriving at the Lagrangian for QED:
\[
\mathcal{L}_{QED} = -\frac{1}{4} F_{\mu\nu}F^{\mu\nu} + \bar\psi\,(i\slashed{D} - m)\,\psi, \qquad D_\mu \equiv \partial_\mu + ieA_\mu \tag{15.84}
\]
This Lagrangian is invariant under the local gauge transformations consisting of the
following joint transformations on the Aμ and ψ fields:
The invariance of the Fμν F μν and mass mψ̄ψ terms under these transformations is
obvious. The covariant derivative Dμ preserves the form of the gauge transformation
of the charged field, as under (15.85–15.87),
from which the invariance of the kinetic fermion term ψ̄γ μ Dμ ψ follows immediately.
Note that if we take the commutator of two covariant derivatives acting on the fermion
field ψ, the derivative terms on ψ cancel, and we are left with the field tensor Fμν :
As both ψ(x) and [Dμ , Dν ]ψ(x) must transform identically under (15.86), we see that
the commutator must be gauge-invariant—an observation that will simplify our search
below for a non-abelian generalization of the gauge-field kinetic term in (15.84).
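The cancellation of the derivative terms can be checked in a few lines. The sketch below uses only D_μ = ∂_μ + ieA_μ from (15.84) and the usual definition F_μν = ∂_μ A_ν − ∂_ν A_μ; the overall factor ie is what this convention produces, and should be treated as an assumption if the field tensor is normalized differently elsewhere.

```latex
\begin{aligned}
D_\mu D_\nu \psi
 &= \partial_\mu\partial_\nu\psi + ie\,(\partial_\mu A_\nu)\,\psi
    + ie\,A_\nu\,\partial_\mu\psi + ie\,A_\mu\,\partial_\nu\psi - e^2 A_\mu A_\nu\,\psi \\
% every term except ie(\partial_\mu A_\nu)\psi is symmetric in (\mu,\nu) and drops out
[D_\mu, D_\nu]\,\psi
 &= ie\,(\partial_\mu A_\nu - \partial_\nu A_\mu)\,\psi \;=\; ie\,F_{\mu\nu}\,\psi
\end{aligned}
```

Since the result is a multiplication operator with no derivatives acting on ψ, its gauge invariance follows at once from the covariance of D_μ ψ.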
The transformations embodied in (15.85–15.87) form an abelian (commutative)
group, as successive transformations with gauge functions Λ1 (x), Λ2 (x), etc., per-
formed in any order lead to the same final result. The global gauge symmetry obtained
by restricting the gauge functions Λ(x) to spacetime constants clearly amounts to a
U(1) phase transformation on the charged fermion field ψ, so we refer to a theory
528 Symmetries IV: Local symmetries in field theory
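\[
Z_{QED}[\vec{j}, \eta, \bar\eta] = \int \mathcal{D}A_\mu\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\delta(\vec\nabla\cdot\vec{A})\;
  e^{\,i\int\left(\mathcal{L}_{QED} + \vec{j}\cdot\vec{A} + \bar\eta\psi + \bar\psi\eta\right) d^4x} \tag{15.90}
\]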
Note that in Coulomb gauge we need only include a source j for the spatial part A
of the four-vector potential, as this is the part of the gauge field that interpolates for
the asymptotic (transverse) photon states (cf. the discussion at the end of Section
7.5). The source terms in (15.90) are clearly not locally gauge-invariant, so the
reader may well wonder how we can manage the conversion of this functional into
a manifestly Lorentz-covariant form along the lines of the maneuvers leading to the
covariant functional (15.82) above, which required that the exponent in the path
integral be exactly invariant under an arbitrary local gauge transformation of the
fields. In fact, the Feynman Green functions (T-products of gauge and Dirac fields)
are not gauge-invariant as such, nor do they need to be. Instead, we shall see that the
physical information they contain, in the form of on-mass-shell S-matrix elements,
is preserved under local gauge transformations. For example, we recall that the
LSZ formula requires that we subject the generating functional ZQED [j, η, η̄] to the
operation
\[
\int d^4x\; e^{ik\cdot x}\;\vec\epsilon^{\,*}(k,\lambda)\cdot\frac{\delta}{\delta\vec{j}(x)}\; Z_{QED} \tag{15.91}
\]
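\[
\vec{j}\cdot\vec{A} + \bar\eta\psi + \bar\psi\eta \;\to\; \vec{j}\cdot(\vec{A} - \vec\nabla\Lambda) + e^{ie\Lambda}\,\bar\eta\psi + e^{-ie\Lambda}\,\bar\psi\eta \tag{15.92}
\]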
where we have integrated by parts to transfer the spatial gradient to the complex
exponential, and used the fact that the photon polarization vectors are transverse, k ·
ε(k, λ) = 0. One can similarly establish that the Λ-dependence of the fermionic source
terms visible in (15.92) does not affect the result once the on-mass-shell projection is
made for initial- or final-state fermions as required in the LSZ formula (cf. (9.206);
also Problem 1).
Non-abelian gauge theory: construction and functional integral formulation 529
The matrix identity [tα , [tβ , tγ ]] + [tβ , [tγ , tα ]] + [tγ , [tα , tβ ]] = 0 then implies the Jacobi
identity constraint on the structure constants fαβγ of the group
\[
\psi_n(x) \to \psi'_n(x) \equiv U_{nm}(x)\,\psi_m(x), \qquad U \equiv \exp\left(-ig\Lambda_\alpha(x)\,t_\alpha\right) \tag{15.96}
\]
For the gauge theories of the standard model, we are dealing with unitary groups:
the generator matrices tα are hermitian, the gauge functions Λα are real, and the finite
group transformation matrices U are therefore unitary. In analogy to the abelian
gauge transformation (15.86), it is conventional to include a factor of the gauge
coupling constant g (analogous to the electric charge e in the QED case) in the
gauge parameters defining U . The dynamical (as opposed to “gauge-kinematical”)
role of the gauge coupling constant will become clear shortly when we construct the
full Lagrangian of the theory. The non-abelian groups associated with the strong and
weak interactions are in addition “special” unitary, satisfying the additional constraint
det(U ) = 1 (corresponding to the generators tα being traceless, as we see immediately
using the identity ln det(U ) = Tr ln (U )). For the purposes of the present discussion, we
may as well restrict ourselves to the special unitary groups SU(N ) in which the matter
fields ψn (x) fill out the fundamental representation of the group. The dimension of
SU(N ) (i.e., the number of linearly independent traceless hermitian N × N generator
matrices tα ) is n_g = N² − 1. Thus, for the gauge group SU(3), the gauge index α runs
over the values 1,2,. . . ,8. In addition to the fundamental representation of dimension
N , the adjoint representation, of dimension ng , will play a central role in the following.
We remind the reader that a multiplet of real fields Vα (x), α = 1, 2, .., ng transforms
according to the adjoint representation of SU(N ) if, under the gauge transformation
Unm (x) in (15.96),
Note that the similarity transformation in (15.97) preserves the traceless, hermitian
character of the tα Vα matrix (if Vα are real fields, as we assume), so the fact that
the generators tα of SU(N ) are a complete basis for all N × N traceless hermitian
matrices ensures that the Vα are well-defined by the above procedure.
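The algebraic facts used above can be made explicit. The sketch below assumes the standard commutation relations [t_α, t_β] = i f_αβγ t_γ with real, totally antisymmetric structure constants (the normalization of f_αβγ is an assumption here, not something fixed by the surrounding text). Inserting them into the matrix Jacobi identity gives the quadratic constraint on the structure constants, and a parameter count reproduces n_g = N² − 1.

```latex
% Assumption: [t_\alpha, t_\beta] = i f_{\alpha\beta\gamma} t_\gamma
\begin{aligned}
[t_\alpha,[t_\beta,t_\gamma]]
  &= i f_{\beta\gamma\delta}\,[t_\alpha, t_\delta]
   = -\,f_{\beta\gamma\delta}\, f_{\alpha\delta\epsilon}\; t_\epsilon \\
% summing the three cyclic permutations and using linear independence of the t_\epsilon:
0 &= f_{\beta\gamma\delta} f_{\alpha\delta\epsilon}
   + f_{\gamma\alpha\delta} f_{\beta\delta\epsilon}
   + f_{\alpha\beta\delta} f_{\gamma\delta\epsilon} \\
% counting: a hermitian N x N matrix has N real diagonal entries and
% N(N-1)/2 complex off-diagonal entries; tracelessness removes one parameter
n_g &= N + 2\cdot\tfrac{1}{2}N(N-1) - 1 = N^2 - 1
   \qquad (\mathrm{SU}(2):\ 3,\quad \mathrm{SU}(3):\ 8)
\end{aligned}
```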
The transformations (15.96) are the field-theoretic analogs of the non-abelian gauge
transformations (15.10) leaving invariant the Lagrangian (15.8) in the mechanical
example of Section 15.1. In that mechanical model, the coordinate vector q(t) of
the point particle moves along a trajectory in three-dimensional space, but only
the radial coordinate r(t) ≡ √(q(t) · q(t)) possesses physical significance: the individual
Cartesian coordinates can “wobble” furiously, as though the entire system is being
viewed from the standpoint of an inebriated experimentalist shaking (rotationally) the
coordinate axes in a random fashion. Exactly the same arbitrariness attaches to the
physical interpretation of the internal symmetry axes in the case of a gauge field
theory.
To take a concrete example, consider the role of the color quantum number
in quantum chromodynamics (QCD)—the local field theory describing the strong
interaction sector of the Standard Model. QCD is a gauge theory exhibiting an exact
invariance under local gauge transformations of the form (15.96), where independent
unitary SU(3) rotations of the three quark fields ψn (x) at arbitrary spacetime points
leave the physics unchanged. If we label the three quarks (fancifully and, of course,
arbitrarily) as “red”, “green”, and “blue”, we see that the attachment of any particular
color label to any particular quark at any given time is a completely arbitrary choice:
the color “axes” may be unitarily rotated in a completely random way during the
dynamical evolution of the system without altering any physical observable. Indeed,
the physical observables—in a gauge field theory, those associated with local or almost
local operators (in the language of Chapter 9)—are precisely those which (in analogy
to mechanical quantities which depend only on the radial coordinate in the “turntable
model” of Section 15.1) are gauge-invariant: that is, they are unchanged under the
transformations (15.96). For the gauge group SU(N ), such gauge-invariant observables,
built from fermionic fields in the fundamental representation (the quark fields of QCD,
for example) include composite operators such as
to present just a few examples. These constructs are easily seen to be invariant under
ψ(x) → U (x)ψ(x), ψ̄(x) → ψ̄(x)U † (x) for U ∈ SU (N ): we need only take into account
the fact that the action of the gauge matrices U (x) leaves the implicit Dirac indices
in the ψn fields unaltered (or in other words, the U matrices commute with the
γ matrices implementing the Dirac algebra for our spin-1/2 fields). Local fields such
as S(x), J^μ(x), N(x), and so on, are said to be “colorless” or “color neutral”, and
represent the only local observables with unambiguous physical content in a local
gauge field theory. From the axiomatic Wightman point of view discussed in Section
9.2, the Wightman functions (vacuum expectation values of products) of such gauge-
invariant local fields contain the entire physical content of the theory: the vacuum
is cyclic with respect to the algebra generated by all local gauge-invariant fields (cf.
Section 9.2, Axiom IId). In particular, the complete S-matrix of a gauge theory like
QCD with an exact (unbroken) non-abelian gauge symmetry is determined in principle
from a knowledge of such functions, as the asymptotic Fock space of physical states
consists entirely—as we shall see in Chapter 19—of colorless multi-particle states.
Just as in the abelian case, the construction of a gauge-covariant derivative for
fermionic matter fields transforming non-trivially under the gauge group requires the
existence of vector fields Aαμ (one for each independent gauge transformation). Such
fields are needed to absorb the term involving a spacetime-derivative of the local gauge
functions Λα (x) in the kinetic part of the matter Lagrangian. It is convenient to “pack”
these fields into the adjoint matrix (dimension N × N )
Aμ ≡ tα Aαμ (15.101)
We then require, from (15.97), that under global (i.e., spacetime-independent) gauge
transformations,
Aμ → U Aμ U † , U ∈ SU (N ) (15.102)
and demand that Dμ ψ transform identically to ψ for any set of matter fields in the
fundamental representation:
The inclusion of an inhomogeneous term in the transformation rule for the gauge field
Aμ under local gauge transformations,
\[
A_\mu(x) \to A^U_\mu(x) = U(x)\,A_\mu(x)\,U^\dagger(x) + \frac{i}{g}\,(\partial_\mu U(x))\,U^\dagger(x) \tag{15.105}
\]
is easily seen to do the trick:
Dμ ψ(x) → (∂μ + igU (x)Aμ (x)U † (x) − (∂μ U (x))U † (x))U (x)ψ(x)
= (∂μ U (x))ψ(x) + U (x)∂μ ψ(x) + igU (x)Aμ (x)ψ(x) − (∂μ U (x))ψ(x)
= U (x)(∂μ + igAμ (x))ψ(x) = U (x)Dμ ψ(x) (15.106)
thereby guaranteeing the local gauge-invariance of the fermionic part of the Lagrangian
(as ψ̄ → ψ̄(x)U † (x) under a local gauge transformation)
\[
\mathcal{L}_{ferm} = \bar\psi\,(i\slashed{D} - m)\,\psi, \qquad D_\mu \equiv \partial_\mu + igA_\mu \tag{15.107}
\]
Note that the gauge invariance requires all members of the fermion multiplet to have
the same mass: if m is a mass matrix, then U † mU = m for all U ∈ SU (N ) implies m
a multiple of the identity (by Schur’s lemma).
A gauge-covariantly transforming field tensor suitable for constructing a kinetic
Lagrangian for the adjoint gauge fields Aαμ is constructed along exactly the same
lines as (15.89) for the abelian theory. We note that derivatives of the matter field
cancel in the commutator [Dμ , Dν ]:
As the covariant derivatives (or products thereof) preserve the transformation property
(15.96) of ψ(x), under ψ(x) → U (x)ψ(x),
Fμν (x)ψ(x) → U (x)Fμν (x)ψ(x) ⇒ Fμν (x) → U (x)Fμν (x)U † (x) (15.109)
so the matrix Fμν (x) = tα Fαμν transforms exactly as required for the adjoint rep-
resentation (15.97), and is built from ng = N 2 − 1 antisymmetric tensor fields Fαμν
related to the underlying vector fields Aαμ by (15.108):
It is now a trivial matter to construct the non-abelian version of the kinetic gauge
Lagrangian −(1/4) F_μν F^μν in the abelian case: we simply take
\[
\mathcal{L}_{gauge} = -\frac{1}{4}\,\mathrm{Tr}(F_{\mu\nu}F^{\mu\nu}) = -\frac{1}{4}\,F_{\alpha\mu\nu}F_\alpha^{\mu\nu} \tag{15.111}
\]
where it is conventional to normalize the group generators by Tr(tα tβ ) = δαβ . The
invariance of Lgauge under local gauge transformations is now obvious from the
transformation property (15.109) of the non-abelian field tensor. Including the matter
fields, we have arrived at the Yang–Mills Lagrangian
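\[
\mathcal{L}_{YM} = \mathcal{L}_{gauge} + \mathcal{L}_{ferm} = -\frac{1}{4}\,\mathrm{Tr}(F_{\mu\nu}F^{\mu\nu}) + \sum_a \bar\psi_a\,(i\slashed{D} - m_a)\,\psi_a \tag{15.112}
\]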
where we have made the obvious generalization of allowing for several fermionic
multiplets ψa , of different mass, all transforming according to the fundamental repre-
sentation of the gauge group SU(N ).
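The field tensor entering (15.111) and (15.112) can be written out from the covariant derivative D_μ = ∂_μ + igA_μ of (15.107). The following is a sketch under two stated assumptions: the commutation relations [t_β, t_γ] = i f_βγα t_α with totally antisymmetric f, and the normalization [D_μ, D_ν] = ig F_μν. The resulting component form agrees with the combination ∂_0 A_αi − ∂_i A_α0 − g f_αβγ A_β0 A_γi that appears in the Coulomb-gauge functional integral of this section.

```latex
\begin{aligned}
[D_\mu, D_\nu]\,\psi
  &= ig\left(\partial_\mu A_\nu - \partial_\nu A_\mu + ig\,[A_\mu, A_\nu]\right)\psi
   \;\equiv\; ig\,F_{\mu\nu}\,\psi \\
% with A_\mu = t_\alpha A_{\alpha\mu} and [t_\beta, t_\gamma] = i f_{\beta\gamma\alpha} t_\alpha (assumed):
ig\,[A_\mu, A_\nu]
  &= ig\,A_{\beta\mu}A_{\gamma\nu}\,[t_\beta, t_\gamma]
   = -\,g\, f_{\alpha\beta\gamma}\; t_\alpha\, A_{\beta\mu}A_{\gamma\nu} \\
F_{\alpha\mu\nu}
  &= \partial_\mu A_{\alpha\nu} - \partial_\nu A_{\alpha\mu}
     - g\, f_{\alpha\beta\gamma}\, A_{\beta\mu}A_{\gamma\nu}
\end{aligned}
```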
We note at this point the characteristic (and remarkable) feature of non-abelian
gauge theories, which dramatically sets them apart from their abelian cousins such
as quantum electrodynamics: even in the absence of matter (scalar or Dirac fields),
the gauge fields themselves form a highly non-trivial interacting field theory, with the
The analysis of the dynamics of the classical system defined by the Lagrangian
(15.112) as a constrained Hamiltonian system proceeds in exact analogy to the abelian
case: as there, we find that the fields Aα0 have vanishing conjugate momenta, since
\[
\frac{\partial \mathcal{L}_{YM}}{\partial \dot{A}_{\alpha\mu}} = -F_\alpha^{0\mu} \tag{15.116}
\]
As Fα00 = 0, we have the primary constraints (one for each generator of the group)
while the equations of motion for the Aα0 amount to secondary constraints (which
guarantee the preservation in time of the primary constraints) which are just the
non-abelian version of Gauss’s Law (see Problem 3):
As in the abelian case, the Eαi are the conjugate momentum fields for the Aαi fields
Note that these constraints generate via Poisson brackets, as in the abelian case, and
as in the mechanical examples of Sections 15.1 and 15.2, the infinitesimal local gauge
transformations (15.113–15.115): for example, for the spatial gauge field (taking the
Poisson bracket at equal times, and suppressing the time coordinate)
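\[
\Big\{A^i_\alpha(\vec{y}),\, \int d^3x\; \lambda_\beta(\vec{x})\,\chi_\beta(\vec{x})\Big\}
 = -\int d^3x\; \lambda_\beta(\vec{x})\,\Big\{A^i_\alpha(\vec{y}),\; \partial_j E_{\beta j}(\vec{x}) - g f_{\beta\gamma\delta}\, A_{\gamma j}(\vec{x})\, E_{\delta j}(\vec{x})\Big\}
\]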
in agreement with (15.115). It follows that the Poisson bracket algebra of the con-
straints among themselves must imitate the Lie algebra of the underlying local gauge
group,7
and therefore that the set of constraints (15.122) are indeed first-class as their Poisson
brackets form a closed algebra. The reader will note the analogy to the constraint
algebra in the mechanical non-abelian example (with gauge group O(3)) studied
earlier, (15.30). Now that the primary and secondary constraints have been identified,
we can proceed to the construction of the total Hamiltonian, following steps analogous
to those leading to (15.61, 15.63). One obtains (see Problem 4), ignoring for the time
being the free fermion kinetic parts (thus, we consider only the parts of the action
involving the gauge field),
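\[
H = H_T = \int \left(\mathcal{H}_{YM} + A_{\alpha 0}(x)\,\chi_\alpha(x)\right) d^3x \tag{15.125}
\]
\[
\mathcal{H}_{YM} = \frac{1}{2}\left(\vec{E}_\alpha^{\,2} + \vec{B}_\alpha^{\,2}\right) - \vec{J}_\alpha\cdot\vec{A}_\alpha, \qquad B_{\alpha i} = \frac{1}{2}\,\epsilon^{ijk} F_{\alpha jk} \tag{15.126}
\]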
In order to proceed to a canonical quantization of this theory, we have, as usual, two
choices: either (a) the gauge freedom is eliminated ab initio by imposing a physical
gauge choice—one which reduces the number of degrees of freedom in the theory
in accordance with the freedom implicit in the gauge symmetry to the point where
7 Classical Poisson brackets can be defined also for the fermionic fields, with appropriate attention to
signs: the charge densities Jα 0 have vanishing Poisson brackets with the gauge fields and momenta, and
the result is that they satisfy separately the algebra (15.124): see (12.309) for the corresponding quantum
commutator result.
where it is conventional to simply choose the space-like unit vector n̂i = δi3 , so the
gauge freedom is employed to move an arbitrary gauge field along a gauge orbit to the
unique point where Aα3 (x) = 0 (for all x) (see Problem 5). In this case, it is easy to
see that the DFP determinant Δaxial [A] is in fact a field-independent constant (as in
the abelian case), and may therefore be omitted from the functional integral. Despite
this simplifying feature, axial gauge is not a very popular choice, as it clearly destroys
manifest Lorentz-invariance: not just boosts, but also (except for rotations around the
n̂-axis) rotational symmetry.
A less objectionable choice of physical gauge is supplied by Coulomb gauge, which
at least retains manifest rotational symmetry, and is also physically desirable in
situations where a non-relativistic limit plays an important physical role (as in bound-
state problems with heavy quarks, for example):
\[
\psi_\alpha \equiv \vec\nabla\cdot\vec{A}_\alpha(x) = 0 \tag{15.128}
\]
with apologies for the need to temporarily appropriate the ψ symbol from the fermionic
fields of the theory, in order to maintain conformity with our previous notation
for the gauge fixing condition. In fact, the gauge choice (15.128) is faulty in one
important respect: it fails to satisfy the important requirement that the gauge orbit,
through an arbitrary configuration, intersects the gauge-fixed surface once, and only
once. Instead, as pointed out by Gribov (Gribov, 1978), gauge orbits passing through
“large fields” can intersect the gauge-fixed surface multiple times (the famous “Gribov
copies”). In particular (cf. the discussion following (15.28, 15.29)), infinitesimal gauge
transformations should definitely move us from a point on the gauge-fixed surface to a
point off the surface. Instead, we find that for fields A α “sufficiently large” (in a sense
soon to become clear), there exist infinitesimal transformations λα which preserve the
Coulomb gauge condition (15.128). Referring to (15.114), we see that this is the case
if (on a given time-slice, suppressing the time variable)
for some non-trivial λβ (x). It is easy to exhibit examples of this phenomenon. For
example, taking the gauge group to be SU(2), we have f_αβγ = ε_αβγ , and making the
Ansatz (all indices α, β, .., i, j, .. now run over the values 1,2,3):
∂μ Aμα = 0 (15.133)
Nevertheless, these gauges are perfectly appropriate if our interests are purely pertur-
bative: in particular, if we are content with finding a consistent set of Feynman rules
for generating the formal asymptotic expansion of the Green functions of the theory in
powers of the coupling constant g. The reason for this is simply that the perturbative
calculation of Green functions from the functional integral amounts to a saddle-point
expansion around the Gaussian functional integrand represented by the free part of
the action, and the results obtained to any finite order of a saddle-point expansion
only depend on the structure of the integrand in the infinitesimal neighborhood of the
saddle point. The preceding discussion makes it clear that in this neighborhood (of
infinitesimally small fields) the troublesome Gribov copies are, in fact, absent.
Restricting our attention to perturbation theory therefore, we may construct an
unconstrained functional integral in Coulomb gauge proceeding in analogy to the
8 We recall that arbitrarily weak potentials do not bind in three dimensions, absent long-range Coulomb-
like behavior.
abelian case (cf. (15.69)). In this case, however, the DFP determinant constructed
from the Poisson bracket of the constraints with the Coulomb gauge conditions,
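\[
\Delta_{coul}[\vec{A}_\alpha] = \det\left(\Delta\,\delta_{\alpha\beta} + g f_{\alpha\beta\gamma}\,\vec{A}_\gamma\cdot\vec\nabla\right) \tag{15.135}
\]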
The reader will recognize here the reappearance of the same operator whose zero-mode
eigenfunctions signaled the appearance of the Gribov ambiguity in (15.129). The non-
abelian Dirac–Faddeev functional integral analogous to (15.67) (over gauge-field degrees
of freedom only: the fermions will be inserted later), using the expression
(15.125) for the total Hamiltonian density,9 now becomes (cf. (15.73)):
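\begin{align}
Z_{YM} &= \int \mathcal{D}A_{\alpha\mu}\,\mathcal{D}E_{\alpha i}\;\Delta_{coul}[\vec{A}_\alpha]\,\delta(\vec\nabla\cdot\vec{A}_\alpha) \notag\\
 &\qquad \cdot\; e^{\,i\int\left(E^i_\alpha(\partial_0 A_{\alpha i} - \partial_i A_{\alpha 0} - g f_{\alpha\beta\gamma}A_{\beta 0}A_{\gamma i}) - \frac{1}{2}(\vec{E}_\alpha^{\,2} + \vec{B}_\alpha^{\,2}) - J^\mu_\alpha A_{\alpha\mu}\right) d^4x} \tag{15.136}\\
 &= \int \mathcal{D}A_\mu\;\delta(\vec\nabla\cdot\vec{A}_\alpha)\,\Delta_{coul}[\vec{A}_\alpha]\;
    e^{\,i\int\left(\frac{1}{2}F_{\alpha 0i}^2 - \frac{1}{4}F_{\alpha jk}^2 - J^\mu_\alpha A_{\alpha\mu}\right) d^4x} \tag{15.137}\\
 &= \int \mathcal{D}A_\mu\;\delta(\vec\nabla\cdot\vec{A}_\alpha)\,\Delta_{coul}[\vec{A}_\alpha]\;
    e^{\,i\int\left(-\frac{1}{4}F_{\alpha\mu\nu}F_\alpha^{\mu\nu} - J^\mu_\alpha A_{\alpha\mu}\right) d^4x} \tag{15.138}\\
 &= \int \mathcal{D}A_\mu\;\delta(\vec\nabla\cdot\vec{A}_\alpha)\,\Delta_{coul}[\vec{A}_\alpha]\;
    e^{\,i\int \mathcal{L}_{YM}\, d^4x} \tag{15.139}
\end{align}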
where LYM is the local and Lorentz scalar Yang–Mills Lagrangian (15.112), minus
the fermion kinetic piece. Of course, manifest Lorentz-invariance is still broken by
the δ-function enforcing the non-covariant Coulomb gauge condition and by the DFP
functional Δcoul which only depends on the spatial components of the gauge-field.
As for abelian gauge theories, a choice of gauge which preserves the manifest
Lorentz symmetry and locality properties of the underlying dynamics is usually
preferable to the non-covariant choices which lead to a straightforward canonical
treatment. The conversion can be made (again, as in QED) by recognizing that the
DFP determinant (15.135) can be written as the inverse of a functional integral which
averages the gauge-fixing δ-function over all gauge transformations (cf. (15.74)). This
“averaging over a group” requires the notion of Hurwitz measure: the integration over
all elements U of a continuous Lie group G is uniquely defined by the two conditions
\[
\int dU\, f(UV) = \int dU\, f(VU) = \int dU\, f(U), \qquad V \in G \quad \text{(shift invariance)} \tag{15.140}
\]
and
\[
\int dU = 1 \quad \text{(normalization)} \tag{15.141}
\]
For example, the Lie group SU(2) is defined in the fundamental representation by unit
determinant 2×2 unitary matrices U = iσ · u + u_4 · 1 with u² + u_4² = 1: in other words,
it is topologically the unit sphere in four dimensions (the three-sphere S³). The Hurwitz
measure turns out in this case to be the obvious choice: dU = dΩ/(2π²), where dΩ is the solid angle
in four dimensions. It is easy to verify the shift invariance condition (see Problem 6)
for this definition. For a local gauge symmetry we have the obvious generalization of
the single Hurwitz integral to a functional integral DU (x) over independent gauge
elements U (x) at each spacetime point.
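The SU(2) statements above can be checked explicitly. The sketch below writes out U in components using the standard Pauli matrices (the explicit matrix form is supplied here for illustration) and verifies that the constraint on (u, u_4) is exactly unit determinant and unitarity. The normalization constant is the surface area 2π² of the unit sphere in four dimensions.

```latex
\begin{aligned}
U = u_4\,\mathbf{1} + i\,\vec\sigma\cdot\vec{u}
  &= \begin{pmatrix} u_4 + i u_3 & u_2 + i u_1 \\ -u_2 + i u_1 & u_4 - i u_3 \end{pmatrix} \\
\det U &= u_1^2 + u_2^2 + u_3^2 + u_4^2 = 1 \\
U U^\dagger &= (u_4 + i\vec\sigma\cdot\vec{u})(u_4 - i\vec\sigma\cdot\vec{u})
             = u_4^2 + (\vec\sigma\cdot\vec{u})^2 = (u_4^2 + \vec{u}^{\,2})\,\mathbf{1} = \mathbf{1} \\
\int d\Omega &= 2\pi^2 \quad\Longrightarrow\quad \int dU = \frac{1}{2\pi^2}\int d\Omega = 1
\end{aligned}
```

One way to see the shift invariance: multiplication U → VU (or UV) with V in SU(2) is linear in the four real components (u, u_4) and preserves u² + u_4² = det U, so it acts as a rotation of the sphere, under which dΩ is unchanged.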
Now consider the functional F [A] defined by the Hurwitz functional integral
\[
F[\vec{A}] \equiv \int \mathcal{D}U\;\delta(\vec\nabla\cdot\vec{A}^U_\alpha) \tag{15.142}
\]
where A^U_α is the spatial gauge field after being subjected to the finite local gauge
transformation U (x) (as in (15.105)). We shall now evaluate this functional integral
for gauge fields A_α which are (a) sufficiently weak that no Gribov copies exist, and (b)
in Coulomb gauge, i.e., satisfy ∇ · A_α = 0. It is clear that the δ-function in the integral
is supported exactly at the identity value for the local gauge function U (x) = e^{ig tα Λα (x)}
(as we are already in Coulomb gauge). In the neighborhood of the identity, we may
replace the finite group parameters Λα by their infinitesimal limits λα (x), and obtain
(up to an irrelevant constant factor), using (15.114), and the identity from footnote 6,
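\begin{align*}
F[\vec{A}] &= \int \mathcal{D}\lambda_\alpha\;\delta\left(\vec\nabla\cdot(\vec{A}_\alpha + \vec\nabla\lambda_\alpha + g f_{\alpha\beta\gamma}\,\lambda_\beta\,\vec{A}_\gamma)\right) \\
 &= \int \mathcal{D}\lambda_\alpha\;\delta\left((\Delta\,\delta_{\alpha\beta} + g f_{\alpha\beta\gamma}\,\vec{A}_\gamma\cdot\vec\nabla)\,\lambda_\beta\right)
\end{align*}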
Thus, we find, in analogy to (15.74) in the abelian case, that the DFP determinant
provides a partition of unity over all gauge transformations
\[
\Delta_{coul}[\vec{A}_\alpha]\int \mathcal{D}U\;\delta(\vec\nabla\cdot\vec{A}^U_\alpha) = 1 \tag{15.144}
\]
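The step from the δ-functional of a linear operator to an inverse determinant is the functional version of an elementary finite-dimensional identity. A sketch, in which M is a generic invertible real n × n matrix introduced here purely for illustration; in the functional case the identity holds formally, up to a field-independent constant:

```latex
% change variables \lambda' = M\lambda, so that d^n\lambda = d^n\lambda' / |\det M|
\int d^n\lambda\;\delta^{(n)}(M\lambda) = \frac{1}{|\det M|}
\qquad\Longrightarrow\qquad
F[\vec{A}] = \frac{1}{\det\left(\Delta\,\delta_{\alpha\beta} + g f_{\alpha\beta\gamma}\,\vec{A}_\gamma\cdot\vec\nabla\right)}
           = \Delta_{coul}[\vec{A}_\alpha]^{-1}
```

Multiplying through by Δ_coul then gives the partition of unity (15.144).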
We are now in a position to repeat the steps analogous to those leading from
(15.79) to (15.80) in the abelian case, thereby making the transition from the non-
covariant Coulomb gauge to a covariant gauge specified by the generalized Landau
gauge condition
where the fα (x) are, for the time being, arbitrary but fixed c-number functions. We
shall include the fermion dynamics completely at this point, by including the usual
integrals over Grassmannian integration variables, and begin with the functional
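\[
Z_{YM}[K_i] = \int \mathcal{D}A_{\alpha\mu}\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\delta(\vec\nabla\cdot\vec{A}_\alpha)\,\Delta_{coul}[\vec{A}_\alpha]\;
  e^{\,i\int\left(\mathcal{L}_{YM} + K_i(x)O_i(x)\right) d^4x} \tag{15.147}
\]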
where now LYM is the full non-abelian Lagrangian (15.112), including fermion kinetic
terms, and the Oi (x) are an arbitrary set of gauge-invariant operators whose Green
functions we wish to compute, by taking functional derivatives with respect to the asso-
ciated sources Ki (x). Defining, in analogy to (15.78), a covariant DFP functional by
\[
1 = \Delta^f_{cov}[A]\cdot\int \mathcal{D}U\;\delta(\partial^\mu A^U_{\alpha\mu} - f_\alpha) \tag{15.148}
\]
we can easily show (see Problem 7), just as in the abelian case, that for fields in
the generalized Landau gauge (15.146), (a) Δfcov ≡ Δcov is in fact independent of the
arbitrary functions fα (x), (b) is a gauge-invariant functional, Δcov [AV ] = Δcov [A]
for any local gauge transformation V (x), and (c) is the determinant of the covariant
analog of the Coulomb operator (15.135),
The steps leading from (15.79) to (15.80) can be repeated more or less verbatim, by
inserting the partition of unity (15.148) (with the functional integral over abelian
gauge transformations ∫DΛ . . . now replaced by the corresponding non-abelian
Hurwitz functional integral ∫DU ), and using a shift of the integration variables
which corresponds to a local (finite) gauge transformation on all the fields (leaving
the Lagrangian L_YM and the gauge-invariant operators Oi unchanged)
\[
A_{\alpha\mu}(x) \to A^{U^{-1}}_{\alpha\mu}(x), \qquad \psi \to U^{-1}\psi(x) \tag{15.150}
\]
Note that the shift of variables (15.150) has unit Jacobian. This is obvious for the
fermionic fields, which undergo unitary rotation at each spacetime point by the matrix
U −1 (x) which has unit determinant. The gauge fields in the adjoint representation
transform as follows (cf. (15.105)):
\[
t_\alpha A^{U^{-1}}_{\alpha\mu} = U^{-1}\,t_\alpha A_{\alpha\mu}\,U - \frac{i}{g}\,U^{-1}\partial_\mu U \tag{15.151}
\]
As the second term on the right is an additive shift, and the first corresponds to an
orthogonal rotation of the Aαμ in the α indices, the Jacobian of the transformation
on the gauge fields is also unity. The upshot is that we obtain (after removing the
Coulomb DFP determinant in the form of the partition of unity (15.144)) a manifestly
covariant expression for the same functional ZYM :
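\[
Z_{YM}[K_i] = \int \mathcal{D}A_{\alpha\mu}\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\delta(\partial^\mu A_{\alpha\mu} - f_\alpha)\,\Delta_{cov}[A]\;
  e^{\,i\int\left(\mathcal{L}_{YM} + K_i(x)O_i(x)\right) d^4x} \tag{15.152}
\]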
As our starting point did not contain the functions fα , the functional ZYM must also
be independent of the fα (despite their appearance in the δ-function), and we may
therefore multiply (15.152) by a physically irrelevant constant factor
\[
C \equiv \int \mathcal{D}f_\alpha\; e^{-\frac{i}{2\xi}\int f_\alpha(x)^2\, d^4x} \tag{15.153}
\]
and interchange the functional integrals over the gauge and fermion fields with those
over the fα to obtain the non-abelian analog of (15.82):
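\[
Z_{YM} = \int \mathcal{D}A_{\alpha\mu}\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\Delta_{cov}[A]\;
  e^{\,i\int\left(\mathcal{L}_{YM} - \frac{1}{2\xi}(\partial^\mu A_{\alpha\mu})^2 + K_i O_i\right) d^4x} \tag{15.154}
\]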
The presence of the non-trivial (and unknown!) functional Δcov [A] in this formula
makes it unsuitable for practical perturbative calculations. Instead, it is convenient
to re-express this functional in terms of a functional integral representation, where
the determinant in (15.149) is generated by integrating over complex Grassmannian
“ghost fields” ωα (x), ω̄α (x), using the Gaussian fermionic integral (10.114),
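\[
\Delta_{cov}[A] = \det\left(\delta_{\alpha\beta}\,\Box + g f_{\alpha\beta\gamma}\,\partial^\mu A_{\gamma\mu}\right)
 = \int \mathcal{D}\omega_\alpha\,\mathcal{D}\bar\omega_\alpha\;
   e^{\,i\int \bar\omega_\alpha\left(\delta_{\alpha\beta}\,\Box + g f_{\alpha\beta\gamma}\,\partial^\mu A_{\gamma\mu}\right)\omega_\beta\, d^4x} \tag{15.155}
\]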
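The reason anticommuting variables are needed here can be seen in a finite-dimensional analog. In the sketch below, M is a generic n × n matrix introduced for illustration, and the ordering of the Grassmann measure is chosen so that the one-variable integral equals +m; with the factor of i in the exponent of (15.155), the result differs only by a field-independent constant.

```latex
\begin{aligned}
% one pair of Grassmann variables, with \int d\omega\,\omega = 1 and \int d\omega\, 1 = 0:
\int d\bar\omega\, d\omega\; e^{-\bar\omega\, m\, \omega}
  &= \int d\bar\omega\, d\omega\,\left(1 + m\,\omega\,\bar\omega\right) = m \\
% n pairs (suitable ordering convention for the measure assumed):
\int \prod_i d\bar\omega_i\, d\omega_i\; e^{-\bar\omega_i M_{ij}\,\omega_j} &= \det M \\
% compare commuting complex variables, which give the inverse determinant:
\int \prod_i d\phi_i^*\, d\phi_i\; e^{-\phi_i^* M_{ij}\,\phi_j} &\;\propto\; \frac{1}{\det M}
\end{aligned}
```

A determinant in the numerator of the integrand, as required for Δ_cov[A], is therefore reproduced by anticommuting scalar fields rather than by ordinary ones.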
We emphasize that the ghost fields are introduced here as a purely technical device:
they evidently do not correspond to physical fields or particles in the theory, and
in particular there are no asymptotic states associated with them. As the theory is
based on an underlying hermitian Hamiltonian, we expect the perturbative S-matrix
constructed from a Fock space of gauge and fermion particles (and no “ghost
particles”) to be unitary on its own.10 Inserting (15.155) in (15.154), we obtain
our final result for the generating functional of a theory of coupled Yang–Mills and
fermionic matter fields, in a covariant ξ-gauge:
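\[
Z_{YM}[K_i] = \int \mathcal{D}A_{\alpha\mu}\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}\omega_\alpha\,\mathcal{D}\bar\omega_\alpha\;
  e^{\,i\int\left(\mathcal{L}_{YM} + \mathcal{L}_{gh} - \frac{1}{2\xi}(\partial^\mu A_{\alpha\mu})^2 + K_i O_i\right) d^4x} \tag{15.156}
\]
\[
\mathcal{L}_{YM} = -\frac{1}{4}\,F_{\alpha\mu\nu}F_\alpha^{\mu\nu} + \sum_a \bar\psi_a\,(i\slashed{D} - m_a)\,\psi_a \tag{15.157}
\]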
The advantage of the ghost field representation of the DFP determinant functional
is apparent in (15.156): the path integral takes the standard form, as an integral
10 By arguments analogous to those given in Section 15.3 for abelian gauge theory, the on-shell S-matrix
can be shown to be the same in a physical gauge (such as Coulomb or axial gauge), where only manifestly
physical gauge degrees of freedom are present, as in the covariant gauges under discussion. It is therefore
clear that perturbative unitarity must hold on a Fock space constructed from the interpolating
fields associated with the gauge and matter fields of the theory, sans ghosts.
The choices ξ = 0 (resp. 1) are referred to as Landau (resp. Feynman) gauge, but as
mentioned previously, it is often convenient to leave the gauge parameter ξ unfixed
at intermediate stages, as its disappearance at the end in physically meaningful
quantities is a powerful check (guaranteed by gauge invariance) on the correctness
of the calculation.
The interaction vertices of the theory are associated with the cubic and quartic
terms in the total Lagrangian. Thus, denoting the gauge, ghost and Fermi fields
by A, ω, ψ generically, there are vertices in the graphs of the theory corresponding
(schematically) to A³, A⁴, ψ̄ψA, and ω̄ωA field products (see Fig. 15.1 for the Feynman
rules for the bosonic vertices of the theory; also, Problem 9). In addition, we must
remember that the Grassmann nature of the ghost fields inserts minus signs (analogous
to those for the physical fermion fields of the theory) when ghost fields are exchanged
in Wick products, as well as in closed loops of ghost propagators (cf. (10.40)). As
emphasized above, the ghost propagators only appear as internal lines, as the ghost
fields do not correspond to physical asymptotic particle states.
A deeper understanding of the role of ghost fields in the quantization of local gauge
theories has been provided by the beautiful theory developed by Becchi, Rouet, and
Stora (Becchi et al., 1976), and Tyutin (Iofa and Tyutin, 1976), where the first-class
constraints appearing in a theory with local gauge symmetry are reinterpreted in terms
of an exact global supersymmetry of the full Lagrangian density (e.g., the exponent
in the path-integral expression (15.156), including ghost and gauge-fixing terms). The
BRST theory (like its historical antecedent, the Gupta–Bleuler quantization method)
is necessarily formulated on a state space with indefinite metric, but the existence of
a global supersymmetry turns out to be exactly what is needed for the ghost states,
together with longitudinal polarizations of massless gauge mesons, to decouple from
the positive-definite subspace of physical states. The derivation of the Ward–Takahashi
identities which summarize the content of the local symmetry at the level of the Green
functions of the theory is also considerably simplified in the BRST approach. We shall
not describe this approach further here, but refer the reader to the original papers and
accounts in textbooks devoted specifically to quantization of gauge field theories.11
11 For a readable account, see (Taylor, 1976), Chapter 12. The general graded cohomology BRST theory
of Hamiltonian systems with first-class constraints can be found in (Henneaux and Teitelboim, 1992).
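[Figure 15.1: vertex diagrams. Each external leg is labeled by (color index, Lorentz index, momentum).]
(a) legs (αμ, p1), (βν, p2), (γρ, p3):
\[ -ig f_{\alpha\beta\gamma}\left[g_{\mu\nu}(p_1 - p_2)_\rho + g_{\nu\rho}(p_2 - p_3)_\mu + g_{\rho\mu}(p_3 - p_1)_\nu\right] \]
(b) legs (αμ, p1), (βν, p2), (γρ, p3), (δσ, p4):
\[ -g^2\left[f_{\alpha\beta\epsilon}f_{\gamma\delta\epsilon}(g_{\mu\rho}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\rho}) + f_{\alpha\gamma\epsilon}f_{\delta\beta\epsilon}(g_{\mu\sigma}g_{\nu\rho} - g_{\mu\nu}g_{\rho\sigma}) + f_{\alpha\delta\epsilon}f_{\beta\gamma\epsilon}(g_{\mu\nu}g_{\rho\sigma} - g_{\mu\rho}g_{\nu\sigma})\right] \]
(c) legs (α, p1), (β, p2), (γμ, p3)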
Fig. 15.1 Feynman rules for the (a) triple gluon, (b) quadruple gluon, and (c) ghost-ghost-
gluon vertices in QCD. The arrows indicate direction of momentum flow.
x0 → −ix4 , xi → xi , i = 1, 2, 3, ∂0 → i∂4 , ∂ i → ∂i
i 1
Aα0 → Aα4 , Aαi → Aαi , i = 1, 2, 3,
g g
Non-abelian gauge theory: construction and functional integral formulation 543
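\[ F_{\alpha 0i} \to \frac{i}{g} F_{\alpha 4i} \ (i = 1,2,3), \quad F_{\alpha ij} \to \frac{1}{g} F_{\alpha ij} \ (i,j = 1,2,3) \]
\[ \gamma^0 \to \hat\gamma_4, \quad \gamma^i \to i\hat\gamma_i \ (i = 1,2,3), \quad \{\hat\gamma_\mu, \hat\gamma_\nu\} = 2\delta_{\mu\nu} \tag{15.160} \]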
where we have also rescaled the gauge field by a factor of the inverse coupling
constant. On performing these replacements in the Minkowski functional integral
(15.156) we obtain (ignoring source terms, and taking for simplicity just a single
Dirac field in the fundamental representation) the Euclidean functional integral:
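\[ Z_{YM,E} = \int \mathcal{D}A_{\alpha\mu}\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}\omega_\alpha\,\mathcal{D}\bar\omega_\alpha\; e^{-\int (\mathcal{L}_{YM,E} + \mathcal{L}_{gh} + \frac{1}{2\xi g^2}(\partial_\mu A_{\alpha\mu})^2)\,d^4x} \tag{15.161} \]
\[ \mathcal{L}_{YM,E} = \frac{1}{4g^2}(F_{\alpha\mu\nu})^2 - \bar\psi(i\hat{\slashed{D}} - m)\psi \tag{15.162} \]
\[ F_{\alpha\mu\nu} = \partial_\mu A_{\alpha\nu} - \partial_\nu A_{\alpha\mu} - f_{\alpha\beta\gamma}A_{\beta\mu}A_{\gamma\nu}, \quad \hat{\slashed{D}} = \hat\gamma_\mu(-i\partial_\mu + t_\alpha A_{\alpha\mu}) \tag{15.163} \]
\[ \mathcal{L}_{gh,E} = \bar\omega_\alpha(\delta_{\alpha\beta}\partial_\mu\partial_\mu + f_{\alpha\beta\gamma}\partial_\mu A_{\gamma\mu})\omega_\beta \tag{15.164} \]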
boundary conditions), this operator will have a purely discrete spectrum, with
the usual orthogonality properties holding for its eigenfunctions:
\[ \hat{\slashed{D}}\phi_i(x) = \lambda_i\phi_i(x), \quad \lambda_i \ \text{real}, \quad \int \phi_i^\dagger(x)\phi_j(x)\,d^4x = \delta_{ij} \tag{15.167} \]
where the eigenfunctions φi (x) carry implicitly discrete Dirac and fundamental
representation gauge group indices (whence the † appearing in the orthonormality
relation). These properties turn out to be crucial in the analysis of the chiral
properties of gauge theory, as we shall see shortly in our discussion of the axial
anomaly. The self-adjointness of $\hat{\slashed{D}}$ is also critical in non-perturbative evaluations
of the Euclidean Green functions in QCD by Monte Carlo simulation of the
lattice-regularized functional integral, as the fermion fields when integrated out
yield the determinant of the Dirac operator $Q \equiv i\hat{\slashed{D}} - m$ which can be shown to
be real, as a consequence of
also leads to a Noether symmetry and, at the classical level, a conserved axial Noether
current:
Of course, if a mass term mψ̄ψ = mψ̄L ψR + mψ̄R ψL is added to the Lagrangian, the
chiral symmetry is lost, and the current (15.170) will develop a non-zero divergence.
The remarkable discovery of Adler, Bell, and Jackiw was that even in the absence of
a mass term quantum effects inevitably produce a non-vanishing divergence of J5μ if
the fermions are coupled to vector gauge fields via an exact local gauge symmetry. We
shall (following Fujikawa) exhibit this result using the functional version of Noether’s
theorem: i.e., by deriving the Ward–Takahashi identities for the chiral current of
fundamental representation fermions coupled to gauge fields (either abelian or non-
abelian). We shall see that an anomalous divergence, proportional to Planck’s constant
—and therefore, explicitly a quantum effect—emerges once the Jacobian arising
from a functional change of variable corresponding to a local chiral transformation
is carefully evaluated. As a preparation for the discussion here, the reader may find
it convenient to briefly review the derivation of the Ward–Takahashi identities for a
non-anomalous current that concludes Section 12.5.
The chiral transformations (15.169) act only on the fermion fields of the theory, so
we need only consider the fermionic part of the gauge theory path integral, with
the gauge fields “frozen” at some fixed, but unspecified, values (which may later
be integrated over after the pure gauge parts of the action are included). We shall
work in Euclidean space, so our starting point is the fermionic functional integral
(cf. (15.162))
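\[ Z_{ferm}[\eta,\bar\eta] = \int \mathcal{D}\psi\,\mathcal{D}\bar\psi\; e^{\frac{1}{\hbar}\int \mathcal{L}_{ferm}\,d^4x + \int(\bar\eta(x)\psi(x) + \bar\psi(x)\eta(x))\,d^4x} \tag{15.171} \]
\[ \mathcal{L}_{ferm} = \bar\psi(x)(i\hat{\slashed{D}} - m)\psi(x) \tag{15.172} \]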
We have explicitly indicated the factor of Planck’s constant (normally set to unity)
in the exponent of the functional integrand in order to clarify the quantum origins
of the anomaly. We now retrace the procedure followed in Section 12.5 in deriving
the functional form of the Noether theorem for a vector symmetry, by considering
the effect of a functional change of variables induced by an infinitesimal local chiral
transformation:
Note that the Wick rotation to Euclidean space converts our previous Minkowski space
definition of γ5 to its Euclidean version γ5 = γ̂1 γ̂2 γ̂3 γ̂4 , which turns out to be exactly
the same matrix—i.e., (7.107). We can expand the general Grassmann fields ψ(x), ψ̄(x)
in a complete set of normalized eigenfunctions of the self-adjoint Dirac operator $\hat{\slashed{D}}$ (cf.
(15.167)), thereby giving a precise meaning to the functional integral in terms of a
discrete multi-dimensional Grassmann integral
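\[ \psi(x) = \sum_i \phi_i(x)\psi_i, \quad \bar\psi(x) = \sum_i \phi_i^\dagger(x)\bar\psi_i \tag{15.174} \]
\[ \mathcal{D}\psi\,\mathcal{D}\bar\psi \equiv \prod_i d\psi_i\,d\bar\psi_i \tag{15.175} \]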
Note that the eigenfunctions φi (x) contain a hidden (four-dimensional) Dirac index
and a gauge group index corresponding to the fundamental representation of the gauge
group (one-dimensional for the abelian case, such as QED, or N -dimensional for a
SU(N ) non-abelian gauge group). They depend on the c-number gauge fields buried
in the covariant derivative D̂. Under (15.173), the fermionic Lagrangian transforms to
\[ +\, i\int \omega(x)\,(\bar\eta(x)\gamma_5\psi(x) + \bar\psi(x)\gamma_5\eta(x))\,d^4x \tag{15.177} \]
The Dirac conjugate fields ψ̄ produce a similar result, as the sign of the ω term in
(15.173) is the same as for ψ (due to the fact that ψ̄ = ψ † γ̂4 , giving an extra minus
sign when the γ5 is commuted through γ̂4 ):
\[ \bar\psi'_i = \sum_j \bar\psi_j C_{ji} \tag{15.178} \]
The effect of the change of variable (15.173) is therefore to introduce a Jacobian factor
(cf. (10.116))
where we again remind the reader that we are working to first order in the gauge
parameter ω(x). Note that if we had been working with the vector symmetry induced
by the phase transformation $\psi \to e^{i\omega}\psi$ (with no $\gamma_5$), there would be no mass term in
the variation of the Lagrangian, as in (15.176), and the functional Jacobian would be
a product of the determinants of matrices $C$ and $\bar C$ given by
\[ C_{ij} = \delta_{ij} + i\int \phi_i^\dagger(x)\omega(x)\phi_j(x)\,d^4x, \quad \bar C_{ij} = \delta_{ij} - i\int \phi_i^\dagger(x)\omega(x)\phi_j(x)\,d^4x \tag{15.180} \]
which is simply (to order ω) unity, and we would recover the standard (non-anomalous)
Ward–Takahashi identities (as in Section 12.5). For the chiral current, however, the
functional Jacobian (15.179) is a non-trivial, though gauge-invariant functional of
the gauge fields (as ψ(x) → U (x)ψ(x) induces the change φi (x) → U (x)φi (x), which
leaves Cij unchanged), which we shall shortly evaluate explicitly. Before doing that, let
us assemble the various pieces needed to obtain the chiral Ward–Takahashi identity,
which is simply the statement that the functional Zferm is unchanged by the functional
change of variables (15.173)—provided, of course, that we take into account properly
any non-trivial Jacobians induced by the change. After subjecting Zferm to the change
of variable, we find that the first-order (in ω) change in the integral takes the form
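\[ 0 = \delta Z_{ferm}[\eta,\bar\eta] = \omega(x)\,W[\eta,\bar\eta;x] + O(\omega^2) \]
\[ \Rightarrow\ 0 = W[\eta,\bar\eta;x] = \int \mathcal{D}\psi\,\mathcal{D}\bar\psi\,\Big\{ i(\bar\eta(x)\gamma_5\psi(x) + \bar\psi(x)\gamma_5\eta(x)) - 2iA(x) + \frac{1}{\hbar}\big(\partial_\mu(\bar\psi\hat\gamma_\mu\gamma_5\psi(x)) - 2im\bar\psi\gamma_5\psi(x)\big)\Big\} \cdot e^{\frac{1}{\hbar}\int \mathcal{L}_{ferm}\,d^4x + \int(\bar\eta(x)\psi(x) + \bar\psi(x)\eta(x))\,d^4x} \tag{15.181} \]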
which is the analog of the Noether functional theorem (12.141) for our anomalous
symmetry. The Ward–Takahashi (WT) identities (analogous to (12.142)) are obtained
by differentiating W[η, η̄; x] (=0) with respect to the fermionic sources η(x), η̄(x)
and then setting the sources to zero, whereupon we obtain a set of relations among
the Euclidean n-point (Schwinger) functions of the theory, which are the analytic
We see that even in the massless limit, when the mass term on the right-hand side
vanishes, and the chiral symmetry (15.169) becomes an exact Noether symmetry at the
classical level, there is a remaining non-zero contribution coming from the functional
anomaly AM , the quantum origins of which are apparent in the explicit prefactor of
Planck’s constant multiplying the anomaly.
In order to compute the explicit form of the functional anomaly A (back in
Euclidean space), we must recall that although we have already discretized the
spectrum of $\hat{\slashed{D}}$ by placing the system in a finite spacetime volume, there is as
yet no short-distance (or high-momentum) cutoff, and the determinant therefore
involves an ill-defined product of infinitely many eigenvalues. We may regularize it
in a gauge-invariant way by observing that the eigenvalues $\lambda_i$ are gauge-invariant
(see Problem 10), so that the inclusion of a factor $e^{-\lambda_i^2/\Lambda^2}$ in the trace in (15.179),
where Λ is an ultraviolet cutoff, amounts to a smooth gauge-invariant tempering of the
short-distance modes of the theory, which should be removed after evaluation of the
determinant by letting the cutoff Λ → ∞. Thus, we define the regularized functional
12 The contact terms lose a factor of i as a consequence of the Wick rotation of the four-dimensional
δ-functions, where one coordinate—the time—is rotated by a factor of i.
Explicit quantum-breaking of global symmetries: anomalies 549
anomaly AΛ as
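\[ A_\Lambda \equiv \sum_i \phi_i^\dagger(x)\gamma_5 e^{-\lambda_i^2/\Lambda^2}\phi_i(x) = \sum_i \phi_i^\dagger(x)\gamma_5 e^{-\hat{\slashed{D}}^2/\Lambda^2}\phi_i(x) \tag{15.185} \]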
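By the usual spectral analysis of self-adjoint operators, the discrete completeness sum
is equivalent to one over plane wavefunctions, as for any operator O, writing in Dirac
notation $\phi_i(x) = \langle x|i\rangle$,
\[ \sum_i \phi_i^\dagger(x)\,O\,\phi_i(x) = \sum_i \langle i|x\rangle\langle x|O|i\rangle = \langle x|\mathrm{Tr}(O)|x\rangle = \int \frac{d^4k}{(2\pi)^4}\,\phi_k(x)^\dagger\,\mathrm{Tr}(O)\,\phi_k(x), \quad \phi_k(x) = e^{ik\cdot x} \tag{15.186} \]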
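where the trace operation Tr extends over the discrete γ-matrix and internal gauge
indices only. With $O = e^{-\hat{\slashed{D}}^2/\Lambda^2}$ we therefore have
\[ A_\Lambda = \int \frac{d^4k}{(2\pi)^4}\,\mathrm{Tr}\big(\gamma_5\, e^{-ik\cdot x}\, e^{-\hat{\slashed{D}}^2/\Lambda^2}\, e^{ik\cdot x}\big) \tag{15.187} \]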
We may write the discrete trace Tr = trD trG where we explicitly separate the traces
over Dirac (trD ) and fundamental representation gauge (trG ) indices. Now
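\[ \hat\gamma_\mu\hat\gamma_\nu = \frac{1}{2}\{\hat\gamma_\mu,\hat\gamma_\nu\} + \frac{1}{2}[\hat\gamma_\mu,\hat\gamma_\nu] = \delta_{\mu\nu} + \frac{1}{2}[\hat\gamma_\mu,\hat\gamma_\nu] \tag{15.188} \]
\[ \hat{\slashed{D}}^2 = D_\mu D_\mu + \frac{1}{4}[\hat\gamma_\mu,\hat\gamma_\nu]\,[D_\mu,D_\nu] = D_\mu D_\mu - \frac{i}{4}[\hat\gamma_\mu,\hat\gamma_\nu]\,F_{\mu\nu} \tag{15.189} \]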
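The Clifford-algebra relations behind (15.188) can be checked numerically. The sketch below is only an illustration: it assumes one particular explicit representation of the hermitian Euclidean matrices (a chiral one, which the text does not specify), and all helper names are hypothetical. It verifies hermiticity, $\{\hat\gamma_\mu,\hat\gamma_\nu\} = 2\delta_{\mu\nu}$, and a few traces involving $\gamma_5 = \hat\gamma_1\hat\gamma_2\hat\gamma_3\hat\gamma_4$.

```python
# Explicit 4x4 Euclidean gamma matrices in a chiral representation.
# The representation is an assumed, illustrative choice; the identities
# checked below do not depend on it.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

def block(tl, tr, bl, br):
    # Assemble a 4x4 matrix from four 2x2 blocks.
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def scale(c, m):
    return [[c * x for x in row] for row in m]

ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# gamma_1..gamma_3 have off-diagonal blocks -i*sigma and +i*sigma; gamma_4 swaps chiralities.
GAMMA = [block(ZERO2, scale(-1j, s), scale(1j, s), ZERO2) for s in PAULI]
GAMMA.append(block(ZERO2, ONE2, ONE2, ZERO2))

def product(*mats):
    out = ONE4
    for m in mats:
        out = matmul(out, m)
    return out

GAMMA5 = product(*GAMMA)  # gamma_5 = gamma_1 gamma_2 gamma_3 gamma_4

def clifford_algebra_holds():
    for mu in range(4):
        if not close(dagger(GAMMA[mu]), GAMMA[mu]):
            return False
        for nu in range(4):
            ab = matmul(GAMMA[mu], GAMMA[nu])
            ba = matmul(GAMMA[nu], GAMMA[mu])
            anti = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]
            if not close(anti, scale(2 if mu == nu else 0, ONE4)):
                return False
    return True

def gamma5_two_gamma_traces_vanish():
    return all(abs(trace(product(GAMMA5, GAMMA[mu], GAMMA[nu]))) < 1e-12
               for mu in range(4) for nu in range(4))
```

In this representation $\gamma_5$ comes out diagonal, so its trace and the trace of $\gamma_5\hat\gamma_\mu\hat\gamma_\nu$ vanish, while the trace of $\gamma_5\hat\gamma_1\hat\gamma_2\hat\gamma_3\hat\gamma_4 = \gamma_5^2$ equals 4.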
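The factors of $e^{-ik\cdot x}\ldots e^{ik\cdot x}$ in (15.187) merely serve to translate the covariant
derivative $D_\mu$ by the four-vector $k_\mu$,
\[ = e^{-k_\mu k_\mu/\Lambda^2}\cdot e^{-\frac{1}{\Lambda^2}\left(2k_\mu D_\mu + D_\mu D_\mu - \frac{i}{4}[\hat\gamma_\mu,\hat\gamma_\nu]F_{\mu\nu}\right)} \tag{15.191} \]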
At this point we shall need some simple Dirac trace identities, which we leave to the
reader to check (remembering that our Euclidean γ matrices are now all hermitian
\[ A_M = \frac{i}{16\pi^2}\,\mathrm{tr}_G(F_{\mu\nu}\tilde F^{\mu\nu}) \tag{15.198} \]
and the axial current divergence (15.184) becomes
\[ \partial_\mu J_{5\mu}(x) = 2im\bar\psi\gamma_5\psi(x) - \frac{1}{8\pi^2}\,\mathrm{tr}_G(F_{\mu\nu}\tilde F^{\mu\nu}) \tag{15.199} \]
If we return to the canonical normalization of the gauge fields (by reversing the scaling
$A_\mu \to \frac{1}{g}A_\mu$ in (15.160)), the anomaly term is seen to acquire an explicit factor of
the squared coupling constant, and we obtain the usual form for the axial current
\[ \partial_\mu J_{5\mu}(x) = 2im\bar\psi\gamma_5\psi(x) - \frac{g^2}{8\pi^2}\,\mathrm{tr}_G(F_{\mu\nu}\tilde F^{\mu\nu}) \tag{15.200} \]
\[ T^\mu_{\ \mu} = \frac{\beta(g)}{2g}\,\mathrm{Tr}(F_{\mu\nu}F^{\mu\nu}) + (1 + \gamma(g))\,m\bar\psi\psi \tag{15.201} \]
Here the functions β(g), γ(g) are well-defined functions of the renormalized coupling
g which are related to the coupling and mass renormalizations of the theory and can
be calculated explicitly order by order in perturbation theory. The β(g) function in
particular—the famous “β function” of the renormalization group—plays a critical
role in understanding the scaling behavior of gauge theories, and will be discussed
in detail later in the book. The intimate connection between the trace anomaly and
the scaling behavior of interacting field theories should come as no surprise when
we recall its origin in our attempt to formulate a Noether current for the classical
dilatation symmetry of a Lagrangian with no dimensionful couplings (such as the
Yang–Mills Lagrangian (15.112) with all fermion masses zero). The trace anomaly
can also be understood at low orders (one loop) from a functional integral point
of view (Fujikawa, 1981)—again, as for the chiral anomaly, the culprit is a non-
trivial functional Jacobian—but it is difficult to obtain a rigorous all-orders result,
as in (15.201), by this technique. Although the field theories of the Standard Model
have non-vanishing β(g) functions and are definitely not conformally (or dilatation)
invariant, there are examples of supersymmetric field theories (N =4 supersymmetric
Yang–Mills is the classic case) where the trace anomalies contributed by the various
fields of the theory cancel and the β function appears to vanish to all orders of
perturbation theory, suggesting an exactly conformally invariant (and even UV finite!)
theory. The cautionary verb “appears” is used here because of the annoying fact that
there is no known ultraviolet regulator which can be used to give a definite meaning
to all the perturbative amplitudes of the theory while preserving both the local gauge
invariance and the global supersymmetry which are essential ingredients in the formal
arguments leading to the asserted conformal invariance.
As a final example of the important role played by quantum anomalies in modern
particle theory, we may mention here the purely gravitational anomalies that arise in
field theories in higher dimensions—in particular, in theories with local supersymmetry
(supergravity theories). In this case, the anomalous current is the energy-momentum
tensor itself! In any generally covariant theory of gravity, the graviton must couple
to a covariantly conserved energy-momentum tensor arising from the matter fields,
and it turns out (Alvarez-Gaumé and Witten, 1983) that the required cancellation of
potential anomalies in the Ward identity expressing this conservation requires very
careful choice of the fermionic representation content of the theory. The observation
that the required gravitational anomaly cancellations corresponded to supergravity
theories (in ten dimensions) which were the low-energy limits of a special class of
superstring theories played a seminal role in the renaissance of string theory in the
mid-1980s.
massless particles in the spectrum of the theory. In point of fact, spontaneously broken
symmetries are much more prevalent than massless particles in Nature, so there must
clearly exist a mechanism for avoiding the consequences of the Goldstone theorem in
most cases. Sometimes, of course, the global symmetry is only approximate, so the
associated Goldstone modes are merely “light” particles, rather than exactly massless
ones. Such creatures are then referred to as “pseudo-Goldstone” particles.
But in the case of electroweak interactions in the Standard Model, we encounter
a situation in which the spontaneous breakdown is associated with a local symmetry,
corresponding to a Lagrangian which is exactly locally gauge-invariant but in which
the vacuum (ground state) of the theory breaks the associated global charge. It should
be emphasized that the underlying local gauge symmetry is always present, as it
is simply the reflection of a redundancy in the field variables in the un-gauge-fixed
Lagrangian: indeed, a famous theorem due to Elitzur (Elitzur, 1975) assures us that
the vacuum-expectation-value of any non-gauge-invariant quantity always vanishes
in a theory with an exact local symmetry, in the absence of gauge-fixing. Once a
gauge is fixed, however, to remove the redundant degrees of freedom, the remaining
(discrete!) global symmetry may undergo spontaneous symmetry-breaking exactly
along the lines discussed in the previous chapter. The phrase “spontaneous breaking
of local gauge symmetry” is therefore in some sense a misnomer, but a convenient
one, if we think of it as a short circumlocution for “spontaneous breaking of remnant
global symmetry after removal of redundant gauge degrees of freedom by appropriate
gauge-fixing”.
In the presence of local gauge symmetry, the conditions discussed in Section
14.2 for the applicability of the Goldstone theorem are not present, and instead
of producing massless Goldstone particles we move in exactly the opposite direc-
tion, with the emergence of massive vector particles corresponding to gauge fields
with no mass term in the Lagrangian! This remarkable phenomenon—discovered
in the 1960s by Higgs (Higgs, 1964) (and independently, by several other work-
ers), but already prefigured in the Ginzburg–Landau model of superconductivity
(where the appearance of a photon “mass” underlies the exponential Meissner screen-
ing of the magnetic field in the superconducting medium)—is at the core of our
present understanding of the electroweak sector of the Standard Model of elementary
particles. The physical mechanism underlying the Higgs phenomenon can be com-
pletely understood in a simple abelian model (which Higgs himself used to illustrate
the essential idea). We start with the Lagrangian for a complex scalar field φ(x)
coupled gauge-invariantly to a vector field Aν , with polynomial self-coupling P (φ∗ φ)
for the scalar field:
\[ \mathcal{L} = -\frac{1}{4}F_{\nu\rho}F^{\nu\rho} + (\partial_\nu - igA_\nu)\phi^* \cdot (\partial^\nu + igA^\nu)\phi - P(\phi^*\phi) \tag{15.202} \]
which is clearly invariant under the local abelian transformations
the vacuum state occurs for vanishing expectation value (VEV) of the scalar field,
$\langle 0|\phi(x)|0\rangle = 0$, and the gauge symmetry is preserved by the vacuum. The theory then
corresponds quite simply to the scalar quantum electrodynamics of a charged massive
spinless particle coupled to a massless photon. If the quadratic coefficient is negative,
on the other hand,
the classical energy density is minimized for fields with magnitude $|\phi(x)| = \frac{\mu}{\sqrt{2\lambda}} \equiv v$,
and we must expect the quantum scalar field to acquire a non-vanishing VEV as well,
which to lowest order in the coupling is just the value v. In the absence of a coupling
to the gauge field (i.e., setting g = 0) we would, of course, simply shift the scalar field
by defining φ(x) = v + φ̂(x), and discover on rewriting the Lagrangian in terms of the
shifted field that the real component of the field φ̂R possesses a sensible (positive) non-
zero mass term, while the imaginary part φ̂I has no quadratic part and corresponds
to the massless Goldstone mode.
For $g \neq 0$ the result is altogether different. The physical spectrum of the theory
is most easily exposed in this case by employing the full—and exact—local gauge
symmetry of the theory to rotate the complex field to a real value. Thus, writing
$\phi(x) = \frac{1}{\sqrt{2}}(\phi_R(x) + i\phi_I(x))$, where $\phi_R$, $\phi_I$ are self-conjugate (and with the canonical
normalization of their kinetic terms), the gauge symmetry (15.203) can clearly be used
to set φI (x) = 0 identically. In this “unitary” gauge, the Lagrangian becomes
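\[ \mathcal{L} = -\frac{1}{4}F_{\nu\rho}F^{\nu\rho} + \frac{1}{2}(\partial_\nu - igA_\nu)\phi_R \cdot (\partial^\nu + igA^\nu)\phi_R - P\big(\tfrac{1}{2}\phi_R^2\big) \tag{15.208} \]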
If we now shift the single remaining field $\phi_R$ by its VEV $v = \frac{\mu}{\sqrt{\lambda}}$ to reflect the
appropriate VEV for the ground state
\[ \phi_R(x) \equiv \frac{\mu}{\sqrt{\lambda}} + \psi_R(x) \tag{15.209} \]
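\[ \mathcal{L} = -\frac{1}{4}F_{\nu\rho}F^{\nu\rho} + \frac{\mu^2 g^2}{2\lambda}A_\nu A^\nu + \frac{1}{2}(\partial_\nu - igA_\nu)\psi_R \cdot (\partial^\nu + igA^\nu)\psi_R - \mu^2\psi_R^2 + \frac{\mu g^2}{\sqrt{\lambda}}A_\nu A^\nu\psi_R - \mu\sqrt{\lambda}\,\psi_R^3 - \frac{1}{4}\lambda\psi_R^4 \tag{15.210} \]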
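The scalar self-interaction terms in (15.210) follow from expanding the potential about the shifted field. As a rough check, and assuming the quartic form $P(x) = -\mu^2 x + \lambda x^2$ with $x = \frac{1}{2}\phi_R^2$ (the form (15.213) uses for the doublet case), the sketch below expands $-P$ in powers of $\psi_R$ with exact rational arithmetic. The function names and the sample values $\mu = 3$, $\sqrt{\lambda} = 2$ are illustrative assumptions.

```python
from fractions import Fraction

def poly_mul(p, q):
    # Multiply two polynomials given as coefficient lists (lowest power first).
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    return [c * a for a in p]

def shifted_potential(mu, sqrt_lam):
    """Coefficients in powers of psi_R of -P(phi_R^2/2), with phi_R = mu/sqrt(lam) + psi_R.

    Assumes P(x) = -mu^2 x + lam x^2 (an assumption taken over from (15.213)).
    """
    mu = Fraction(mu)
    sqrt_lam = Fraction(sqrt_lam)
    lam = sqrt_lam ** 2
    phi = [mu / sqrt_lam, Fraction(1)]                 # phi_R = a + psi_R
    x = poly_scale(Fraction(1, 2), poly_mul(phi, phi))  # x = phi_R^2 / 2
    return poly_add(poly_scale(mu ** 2, x), poly_scale(-lam, poly_mul(x, x)))

coeffs = shifted_potential(3, 2)
```

For these values the linear term cancels, and the quadratic, cubic and quartic coefficients are $-\mu^2 = -9$, $-\mu\sqrt{\lambda} = -6$ and $-\lambda/4 = -1$, matching the last terms of (15.210).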
13 The quantization of this massive vector field can be carried out explicitly along the lines of Problem 4,
Chapter 12. From the point of view of Dirac Hamiltonian theory, the primary constraint Π0 = 0 gives rise to
a secondary constraint (equation of motion for $A_0$) which contains the combination $m_A^2 A_0 + \vec\nabla\cdot\vec\Pi$, which
has non-vanishing Poisson brackets with Π0 : i.e., we have a pair of second-class constraints. This is therefore
a theory without the first-class constraints characteristic of a gauge theory—not surprisingly, as we have
eliminated the gauge symmetry by a choice of gauge.
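\[ \mathcal{L} = -\frac{1}{4}\vec F_{\nu\rho}\cdot\vec F^{\nu\rho} - \frac{1}{4}G_{\nu\rho}G^{\nu\rho} - P(\phi) + \Big[\big(\partial_\nu - i\frac{g'}{2}B_\nu - i\frac{g}{2}\vec\tau\cdot\vec A_\nu\big)\phi\Big]^\dagger \big(\partial^\nu - i\frac{g'}{2}B^\nu - i\frac{g}{2}\vec\tau\cdot\vec A^\nu\big)\phi \tag{15.211} \]
\[ F_\alpha^{\nu\rho} = \partial^\nu A_\alpha^\rho - \partial^\rho A_\alpha^\nu + g\epsilon_{\alpha\beta\gamma}A_\beta^\nu A_\gamma^\rho, \quad G^{\nu\rho} = \partial^\nu B^\rho - \partial^\rho B^\nu \tag{15.212} \]
\[ P(\phi) = -\mu^2\phi^\dagger\phi + \lambda(\phi^\dagger\phi)^2 \tag{15.213} \]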
where τ are the Pauli matrices, and we have adopted the conventional coupling sign
and normalizations (involving a change of sign relative to (15.202) and a factor of $\frac{1}{2}$).
Again, the physical spectrum is most readily revealed in unitary gauge, so we use the
local SU(2) gauge freedom to rotate the scalar doublet field to eliminate the upper
component and the imaginary part of the lower component, leaving only the real part of
the lower component, which is then shifted to remove (at lowest order of perturbation
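theory) the VEV associated with the minimum of $P(\phi)$ at $|\phi| = \frac{\mu}{\sqrt{2\lambda}} \equiv \frac{1}{\sqrt{2}}v$:
\[ \phi(x) = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}(v + H(x)) \end{pmatrix} \]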
The single remaining self-conjugate scalar field H(x) interpolates for the famous, but
as yet undiscovered,14 Higgs particle of the Standard Model. The vacuum expectation
value of the scalar doublet
\[ \langle\phi\rangle = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}v \end{pmatrix} \]
generates in the scalar kinetic term in (15.211), as in the Higgs abelian model discussed
previously, a mass term for the four vector bosons of the theory, in this case involving
a squared mass matrix
\[ \frac{1}{2}M^2_{\alpha\beta} = \langle\phi^\dagger\rangle\, T_\alpha T_\beta\, \langle\phi\rangle, \quad \alpha,\beta = 1,2,3,Y \tag{15.214} \]
where we have combined the four generators of SU(2)×U(1) in a single notation, with
the Y index referring to the weak hypercharge abelian U (1) subgroup. Thus (using Y
also to indicate the value of the “hypercharge” associated with the U(1) subgroup,
14 As this book goes to press, there are intriguing indications at the Large Hadron Collider at CERN
(Geneva, Switzerland) of a possible Higgs signal at a mass of approximately 125 GeV.
Spontaneous symmetry-breaking in theories with a local gauge symmetry 557
which must be assigned separately to the various field multiplets in the theory)
\[ T_i = \frac{g}{2}\tau_i \ (i = 1,2,3), \quad T_Y = \frac{g'}{2}Y \ \Big(= \frac{g'}{2} \ \text{for } \phi\Big), \quad [T_i,T_j] = ig\epsilon_{ijk}T_k, \quad [T_i,Y] = 0 \tag{15.215} \]
The vector mass matrix separates into two uncoupled sectors, with the (α, β) = 1, 2
sector giving
\[ \frac{1}{2}M^2 = \frac{1}{8}g^2v^2\begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix} \tag{15.216} \]
where we have defined a complex massive vector field $W_\nu = \frac{1}{\sqrt{2}}(A_{1\nu} - iA_{2\nu})$ with mass
$gv/2$. In the 3-Y subspace we have the 2×2 squared mass matrix
\[ \frac{1}{2}M^2 = \frac{1}{8}v^2\begin{pmatrix} g^2 & -gg' \\ -gg' & g'^2 \end{pmatrix} \tag{15.218} \]
\[ Z_\nu \equiv \frac{g'B_\nu - gA_{3\nu}}{\sqrt{g^2 + g'^2}} \tag{15.220} \]
\[ A_\nu \equiv \frac{gB_\nu + g'A_{3\nu}}{\sqrt{g^2 + g'^2}} \tag{15.221} \]
where the self-conjugate field $Z_\nu$ has mass $m_Z = \frac{1}{2}v\sqrt{g^2 + g'^2}$, while the $A_\nu$ field is
massless. The existence of a zero mode in the mass matrix (15.214) is clearly associated
with the existence of a linear combination of generators $\frac{1}{2}(\tau_3 + Y) = \frac{1}{g}T_3 + \frac{1}{g'}T_Y$
which annihilates the VEV of the scalar doublet (which has Y = 1):
\[ \frac{1}{2}(\tau_3 + 1)\begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}v \end{pmatrix} = 0 \]
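The statements about the neutral sector can be verified with a few lines of exact arithmetic. The sketch below assumes the matrix (15.218) in the $(A_3, B)$ basis, with $g$ the SU(2) coupling and $g'$ the U(1) coupling; the numerical values $g = 3$, $g' = 4$, $v = 2$ are purely illustrative (chosen so that $\sqrt{g^2 + g'^2} = 5$) and the function names are hypothetical.

```python
from fractions import Fraction

def neutral_mass_matrix(g, gp, v):
    """Squared mass matrix M^2 in the (A_3, B) basis, i.e. twice the matrix in (15.218)."""
    pref = Fraction(v * v, 4)
    return [[pref * g * g, -pref * g * gp],
            [-pref * g * gp, pref * gp * gp]]

def apply(m, vec):
    return [sum(m[i][j] * vec[j] for j in range(2)) for i in range(2)]

g, gp, v = 3, 4, 2          # illustrative values only
M2 = neutral_mass_matrix(g, gp, v)

photon = [gp, g]            # unnormalized (A_3, B) components of A = g' A_3 + g B
z_boson = [-g, gp]          # unnormalized (A_3, B) components of Z = g' B - g A_3
mz_squared = Fraction(v * v, 4) * (g * g + gp * gp)

photon_image = apply(M2, photon)    # expected to vanish: massless mode
z_image = apply(M2, z_boson)        # expected to equal mz_squared * z_boson
```

The photon direction is annihilated by $M^2$, while the $Z$ direction is an eigenvector with eigenvalue $\frac{1}{4}v^2(g^2 + g'^2)$, which is 25 for these values, so $m_Z = 5 = \frac{1}{2}v\sqrt{g^2 + g'^2}$.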
Thus there is an unbroken U(1) subgroup of the original SU(2)×U(1) local gauge
symmetry, which must be associated with a massless gauge particle. This is, of course,
the photon, in the modern electroweak theory. One may easily check that the W, W †
and Z fields transform under the generator $\frac{1}{g}T_3 + \frac{1}{g'}T_Y = \frac{1}{2}(\tau_3 + Y) \equiv Q$ as fields of
electric charge –1, +1, and 0 respectively. The discovery in 1983 of a neutral massive
vector Z boson in the weak interactions (in addition to the long-suspected charged
weak carriers W ± ) was a dramatic confirmation that the particular pattern of local
symmetry-breaking described here indeed conforms to reality. Of course, the real value
of such a model lies in its ability to accurately depict the weak interactions of the
fundamental fermions of the theory: the leptons and quarks. We shall briefly describe
the leptonic sector of the electroweak theory here, as proposed in Weinberg’s seminal
paper (Weinberg, 1967), before going on to discuss the functional quantization and
derivation of Feynman rules for a general spontaneously broken local gauge theory.
The electroweak sector of the Standard Model is a chiral gauge theory: that is to say,
left- and right-handed parts of the Dirac fermion fields of the theory (which the reader
will recall from Chapter 7, fall into separate representations of the proper homogeneous
Lorentz group) are placed in different representations of the gauge group. Thus, if ψ is
a Dirac 4-spinor field, $\psi_L = P_L\psi = \frac{1+\gamma_5}{2}\psi$ and $\psi_R = P_R\psi = \frac{1-\gamma_5}{2}\psi$ are the left-handed
and right-handed 2-spinor components of ψ respectively. This means that ψL and ψR
may be in SU(2) multiplets of different dimensionality, and may be assigned different
weak hypercharge quantum numbers YL and YR . One recovers the conventional V − A
structure of the charged weak currents by placing the left-handed part of the electron
field eL together with the purely left-handed Weyl electron neutrino field in a SU(2)
doublet $L_e$ (with weak hypercharge $Y_L = -1$),
\[ L_e(x) = \begin{pmatrix} \nu_e(x) \\ e_L(x) \end{pmatrix} \]
and the right-handed part of the electron field eR in a SU(2) singlet field Re , with weak
hypercharge YR =-2. We may also think of this chiral arrangement as corresponding to
the inclusion of γ5 factors (via chiral projection operators PL , PR ) in the gauge group
generators,
\[ T_i = \frac{g}{2}\tau_i P_L, \quad T_Y = \frac{g'}{2}(Y_L P_L + Y_R P_R) \tag{15.222} \]
which, of course, satisfy the commutation relations (15.215). The charge operator then
becomes $Q = \frac{1}{2}(\tau_3 - 1)P_L - P_R$, giving electric charge –1 to both components eL and
eR of the electron field, and zero charge to the neutrino, as desired. The fermionic
(leptonic) part of the Lagrangian, with these choices, becomes
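\[ \mathcal{L}_{lept} = \bar L_e\big(i\slashed{\partial} - \frac{1}{2}g'\slashed{B} + \frac{1}{2}g\,\vec\tau\cdot\vec{\slashed{A}}\big)L_e + \bar R_e\big(i\slashed{\partial} - g'\slashed{B}\big)R_e \tag{15.223} \]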
Note that there is so far no mass term for the electron field, as a direct coupling of the
left- and right-handed parts of the electron field would violate the SU(2) symmetry.
When the $\vec A_\nu$ and $B_\nu$ fields are rewritten in terms of the physical $W_\nu$, $Z_\nu$, $A_\nu$ fields,
one recovers (see Problem 13), in addition to the long known V − A structure for the
charged weak currents (mediated by the W fields), a new set of neutral weak current
interactions due to the massive Z boson, as well as, of course, conventional quantum
electrodynamics for the interaction of the electron and photon fields. The muon and tau
leptons (with their associated neutrinos) may be included by essentially “xeroxing” the
structure above twice. Masses arise naturally in this model for the charged leptons once
Yukawa interactions, exactly invariant under the local gauge symmetry, are included
between the leptons and the scalar field doublet:
If we recall that the scalar field φ lies in a SU(2) doublet with weak hypercharge 1,
with Le and Re having weak hypercharges –1 and –2 respectively, we see that the cubic
Yukawa coupling here is invariant under both the SU(2) and U(1) parts of the gauge
group. Moreover, once the field is shifted to extract the vacuum expectation value,
a mass term $-G_e(\bar e_L \frac{v}{\sqrt{2}} e_R + \bar e_R \frac{v}{\sqrt{2}} e_L) = -m_e\bar e e$, $m_e = G_e\frac{v}{\sqrt{2}}$, emerges automatically
for the electron. Muon and tau masses emerge similarly: they involve completely
independent Yukawa couplings Gμ , Gτ , so we cannot expect any obvious connection
between the charged lepton masses (on the basis of symmetry requirements), although,
of course, the wide disparity (as yet, completely mysterious) of these masses is at the
very least aesthetically disturbing.
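The quantum-number bookkeeping of the leptonic sector is easy to tabulate. The sketch below uses the hypercharges quoted above and the charge formula $Q = \frac{1}{2}(\tau_3 + Y)$; it assumes that the Yukawa coupling has the schematic structure $\bar L_e\,\phi\,R_e$ plus hermitian conjugate, which is the standard form but is an assumption here. The dictionary and function names are illustrative.

```python
from fractions import Fraction

# (weak isospin component T3 = tau_3 / 2, weak hypercharge Y) as quoted in the text
FIELDS = {
    "nu_e": (Fraction(1, 2), -1),
    "e_L": (Fraction(-1, 2), -1),
    "e_R": (Fraction(0), -2),
    "phi_upper": (Fraction(1, 2), 1),
    "phi_lower": (Fraction(-1, 2), 1),
}

def electric_charge(name):
    # Q = (tau_3 + Y) / 2 = T3 + Y / 2
    t3, y = FIELDS[name]
    return t3 + Fraction(y, 2)

def yukawa_hypercharge():
    # Net hypercharge of the assumed term Lbar_e * phi * R_e;
    # the conjugated doublet contributes -Y_L.
    return -FIELDS["e_L"][1] + FIELDS["phi_lower"][1] + FIELDS["e_R"][1]
```

With these assignments the neutrino and the lower (VEV-carrying) scalar component are neutral, both electron chiralities have charge –1, and the net hypercharge of the Yukawa term vanishes, as U(1) invariance requires.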
The presence of γ5 factors in the fermionic generators of our chiral SU(2)×U(1)
gauge theory should alert us to the possibility of anomalies, and indeed the con-
servation of the Noether gauge currents of the purely leptonic electroweak theory
described here is broken by quantum anomalies, which would render the theory non-
renormalizable, and even prevent the execution of the functional quantization process
to be described below (where we assume the absence of any non-trivial functional
Jacobians in the functional integral). It is an extraordinary—and highly suggestive—
feature of the electroweak theory that the quantum anomalies in the gauge currents
are exactly cancelled once quark fields (in one-to-one correspondence with the lepton
fields) are introduced with the appropriate quantum numbers (see Problem 11).
We now turn to the long-promised derivation of the Feynman rules for a sponta-
neously broken gauge theory. We shall emphasize the derivation of the propagators of
the theory, as the possibility of obtaining a renormalizable theory hinges most directly
on the high-momentum behavior of the propagators: in particular, we wish to show
that the disastrous asymptotic behavior (of order $k_\mu k_\nu/k^2$) of the massive vector boson
propagator in a unitary gauge can be removed by a choice of gauge which both (a)
maintains manifest Lorentz covariance, and (b) damps the high-momentum behavior
to the same level as that of a scalar propagator: i.e., $1/k^2$. Complete details of the
derivation of the Feynman rules in broken gauge theories can be found in the classic
articles by Weinberg (Weinberg, 1973) and Abers and Lee (Abers and Lee, 1973).
We begin by slightly altering the notation used in the examples discussed above:
the generator matrices will now not contain factors of the coupling constant, and we
return to our original sign conventions for the coupling(s), as incorporated in (15.113–
15.115). We shall assume that our scalar field multiplets consist of purely real fields
(we can, of course, always decompose a complex scalar field into two real fields by
writing $\phi = \frac{1}{\sqrt{2}}(\phi_R + i\phi_I)$), with the generator matrices $T_\alpha$ real and antisymmetric,
so that the covariant derivative on the scalar fields reads
Note that, as in electroweak theory, the coupling g may vary from one simple subgroup
of the full local gauge group G to another: to avoid overcomplicating the notation, we
shall avoid indicating this explicitly below. The fermions fill, as usual, complex (but
possibly chiral) representations of G, and we use, as previously, hermitian generators
tα in the fermionic representations, with covariant derivative
the Lagrangian
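\[ \mathcal{L} = -\frac{1}{4}F_{\alpha\mu\nu}F_\alpha^{\mu\nu} + \frac{1}{2}(D_\mu\phi)^T D^\mu\phi + \bar\psi(i\slashed{D} - m)\psi - \bar\psi\Gamma_i\psi\,\phi_i - P(\phi) \tag{15.230} \]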
is invariant, provided the fermion mass (matrix) m commutes with the generators,
[tα , m] = 0, and the Yukawa couplings and the scalar polynomial are appropriately
chosen: namely,
The vacuum expectation value will be removed in the usual fashion by defining a
shifted field $\phi'_i \equiv \phi_i - v_i$, so that the action of an infinitesimal gauge transformation
on the scalar and vector fields becomes
We shall now impose a gauge condition as a joint constraint on the gauge and scalar
fields of the theory. The form of the constraint is at first sight rather peculiar, but will
shortly be seen to give an algebraically convenient set of Feynman rules. We impose
the local gauge condition
\[ \partial^\mu A_{\alpha\mu}(x) - \xi g\langle v, T_\alpha\phi'(x)\rangle = f_\alpha(x), \quad \langle v, T_\alpha\phi'\rangle \equiv v_i(T_\alpha)_{ij}\phi'_j \tag{15.235} \]
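\[ \Delta_{cov}[A,\phi'] = \det\big(\delta_{\alpha\beta}\Box + gf_{\alpha\beta\gamma}\partial^\mu A_{\gamma\mu} - \xi g^2\langle v, T_\alpha T_\beta v\rangle - \xi g^2\langle v, T_\alpha T_\beta\phi'\rangle\big) \tag{15.236} \]
\[ \mathcal{L}_{gh} = \bar\omega_\alpha\big(\delta_{\alpha\beta}\Box + gf_{\alpha\beta\gamma}\partial^\mu A_{\gamma\mu} + \xi g^2\langle T_\alpha v, T_\beta v\rangle - \xi g^2\langle v, T_\alpha T_\beta\phi'\rangle\big)\omega_\beta \tag{15.237} \]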
One notes here the appearance of (a) a ghost mass matrix $\xi g^2\langle T_\alpha v, T_\beta v\rangle$ and, (b)
in addition to the ghost-vector vertex, a ghost-scalar coupling term. Precisely as in
the unbroken case, one may establish that the generating functional of the theory
is independent of the choice of the arbitrary functions fα , which we may therefore
integrate over, with a Gaussian modulating factor as in (15.153), to obtain the path
integral (minus sources) for our spontaneously broken gauge theory:
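\[ Z_{SBGT} = \int \mathcal{D}A_{\alpha\mu}\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\,\mathcal{D}\omega_\alpha\,\mathcal{D}\bar\omega_\alpha\,\mathcal{D}\phi'\; e^{i\int \mathcal{L}_{tot}\,d^4x} \]
\[ \mathcal{L}_{tot} = -\frac{1}{4}F_{\alpha\mu\nu}F_\alpha^{\mu\nu} - \frac{1}{2\xi}\big(\partial^\mu A_{\alpha\mu} - \xi g\langle v, T_\alpha\phi'\rangle\big)^2 + \frac{1}{2}(D_\mu(v + \phi'))^2 - P(v + \phi') + \mathcal{L}_{ferm} + \mathcal{L}_{gh} \tag{15.238} \]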
The utility of the peculiar choice of gauge condition (15.235) now becomes apparent
on examining the scalar field kinetic term,
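\[ \frac{1}{2}(D_\mu(v + \phi'))^2 = \frac{1}{2}(D_\mu\phi')^2 + \frac{g^2}{2}\langle T_\alpha v, T_\beta v\rangle A_{\alpha\mu}A_\beta^\mu - \frac{1}{2}\langle gT_\alpha v, (\partial_\mu - gT_\beta A_{\beta\mu})\phi'\rangle A_\alpha^\mu - \frac{1}{2}\langle(\partial^\mu - gT_\alpha A_\alpha^\mu)\phi', gT_\beta v\rangle A_{\beta\mu} \]
\[ = \frac{1}{2}\langle D_\mu\phi', D^\mu\phi'\rangle + \frac{1}{2}M^2_{\alpha\beta}A_{\alpha\mu}A_\beta^\mu + g^2A_\alpha^\mu A_{\beta\mu}\langle T_\alpha v, T_\beta\phi'\rangle - gA_\alpha^\mu\langle T_\alpha v, \partial_\mu\phi'\rangle \tag{15.239} \]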
The final term in (15.239), mixing the scalar and gauge fields, combines with the cross-
term from the gauge-fixing part of the total Lagrangian to produce a total derivative,
which then vanishes after integration over spacetime (recall that the Tα matrices are
real antisymmetric):
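\[ -\frac{1}{2\xi}\big(-2\xi g\,\partial^\mu A_{\alpha\mu}\langle v, T_\alpha\phi'\rangle\big) - gA_\alpha^\mu\langle T_\alpha v, \partial_\mu\phi'\rangle = \partial^\mu\big(gA_{\alpha\mu}\langle v, T_\alpha\phi'\rangle\big) \tag{15.241} \]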
The propagators of the theory are associated with the parts of Ltot quadratic in the
various fields, and now that unwanted mixing terms have been eliminated, these can
easily be read off from the quadratic scalar, gauge, and fermion Lagrangians:
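\[ \mathcal{L}^{\rm quad}_{\rm scal} = \frac{1}{2}(\partial_\mu\phi'_i)^2 - \frac{1}{2}\frac{\partial^2 P}{\partial\phi_i\partial\phi_j}(\phi = v)\,\phi'_i\phi'_j - \frac{\xi g^2}{2}\big(\langle v, T_\alpha\phi'\rangle\big)^2 \tag{15.242} \]
\[ \begin{aligned}
\mathcal{L}^{\rm quad}_{\rm gauge} &= -\frac{1}{4}(\partial_\mu A_{\alpha\nu} - \partial_\nu A_{\alpha\mu})^2 - \frac{1}{2\xi}(\partial_\mu A^\mu_\alpha)^2 + \frac{1}{2}M^2_{\alpha\beta}A_{\alpha\mu}A^\mu_\beta \\
&\to \frac{1}{2}A_{\alpha\mu}\Big((\delta_{\alpha\beta}\Box + M^2_{\alpha\beta})g^{\mu\nu} + \delta_{\alpha\beta}\big(\tfrac{1}{\xi} - 1\big)\partial^\mu\partial^\nu\Big)A_{\beta\nu}
\end{aligned} \tag{15.243} \]
\[ \mathcal{L}^{\rm quad}_{\rm ferm} = \bar\psi(i\slashed{D} - M_f)\psi, \qquad M_f = m + \Gamma_i v_i \tag{15.244} \]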
\[ M^2_{ij} = \frac{\partial^2 P}{\partial\phi_i\partial\phi_j}(\phi = v) + \xi g^2 (T_\alpha v)_i (T_\alpha v)_j \tag{15.245} \]
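\[ \frac{\partial^2 P}{\partial\phi_k\partial\phi_i}(T_\alpha)_{ij}\phi_j + \frac{\partial P}{\partial\phi_i}(T_\alpha)_{ik} = 0 \;\Rightarrow\; \frac{\partial^2 P}{\partial\phi_k\partial\phi_i}(\phi = v)\,(T_\alpha v)_i = 0 \tag{15.246} \]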
The gauge vector propagator can be read off easily from (15.243): we need the
Green function for the operator (□ + M²)g^{μν} + (1/ξ − 1)∂^μ∂^ν, which the reader will
easily verify corresponds to a Feynman propagator given by
\[ \int d^4k\;\cdots\Big[g_{\mu\nu} - (1-\xi)\frac{k_\mu k_\nu}{k^2 - \xi M^2}\Big]\cdots \]
with the squared-mass matrix M 2 (carrying the α, β indices) given by (15.240). Notice
that the ξ-dependence is entirely in the longitudinal part (proportional to kμ kν ) of the
momentum-space propagator. The poles at k 2 = ξM 2 cannot correspond to physical
particle states as they depend on the arbitrary gauge parameter ξ: indeed, we know
that the S-matrix is independent of ξ and therefore cannot have any such poles in
single-particle cuts of amplitudes. However, these poles occur at exactly the same
mass eigenvalues as those given by the Goldstone mode part of the scalar propagator,
the second term on the right-hand side of (15.245). Indeed, suppose that ρi is an
eigenvector of the latter matrix:
Multiplying both sides by (Tβ v)i , summing over i, and defining χα ≡ (Tα v)j ρj , we
find
\[ \xi M^2_{\beta\alpha}\chi_\alpha = \lambda\chi_\beta \tag{15.249} \]
This means that the unphysical poles in the longitudinal part of the gauge propa-
gators can (indeed, by gauge invariance, must) be cancelled by poles at exactly the
same locations in the scalar propagators. The Feynman ξ = 1 gauge choice gives a
particularly simple momentum-space propagator, proportional to gμν . Of course, as
a check on more complicated higher-order perturbative calculations, it may be useful
to retain the general form to ensure that all ξ-dependent terms cancel in the final
physical result (see Problem 14).
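The matching of the unphysical pole positions can be illustrated numerically in a small toy case. The sketch below assumes an SO(3) gauge group with real antisymmetric generators (T_a)_ij = −ε_aij and arbitrary sample values for ξ, g and the vacuum expectation value v; none of these choices come from the text. It builds the Goldstone part of the scalar mass matrix and the matrix ξM², picks an eigenvector ρ of the former, and checks that χ_α = (T_α v)_j ρ_j satisfies (15.249) with the same eigenvalue.

```python
# Toy check of the pole-matching argument for SO(3) broken by a triplet vev.
# Assumptions (not from the text): generators (T_a)_ij = -eps_aij and the
# sample values of XI, G, V below.
XI, G = 2.0, 0.5
V = [1.0, 2.0, 2.0]


def eps(a, i, j):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - i) * (i - j) * (j - a) / 2


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def matvec(X, u):
    """Matrix acting on a vector."""
    return [sum(x * y for x, y in zip(row, u)) for row in X]


# B[a][i] = (T_a v)_i
B = [[sum(-eps(a, i, j) * V[j] for j in range(3)) for i in range(3)]
     for a in range(3)]
Bt = [list(col) for col in zip(*B)]

# xi * M^2_{ab} = xi g^2 <T_a v, T_b v>   (gauge side)
gauge = [[XI * G**2 * x for x in row] for row in matmul(B, Bt)]
# xi g^2 (T_a v)_i (T_a v)_j              (Goldstone part of the scalar matrix)
scalar = [[XI * G**2 * x for x in row] for row in matmul(Bt, B)]

rho = B[0]                        # T_0 v is orthogonal to v, so it is an eigenvector
lam = matvec(scalar, rho)[1] / rho[1]
chi = matvec(B, rho)              # chi_a = (T_a v)_j rho_j
lhs = matvec(gauge, chi)          # xi M^2 chi
rhs = [lam * c for c in chi]      # lambda chi
```

In this toy case the two sides agree component by component, with λ = ξ g² |v|² for the two broken directions.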
The large momentum behavior of the gauge vector and scalar propagators is
uniformly 1/k² regardless of the value of the gauge parameter,¹⁵ and regardless of whether
we are in the symmetry-broken (M² ≠ 0) or unbroken (M² = 0) phase of the theory.
This is in complete consonance with the intuition developed from our discussions of
spontaneous symmetry-breaking in Chapter 14, as a phenomenon linked to the large-
distance energetic properties of the theory, but essentially decoupled from the short-
distance, or large momentum, properties of the amplitudes. The soft behavior of the
vector propagators in these renormalizable ξ-gauges plays a crucial role in establishing
the renormalizability of a spontaneously broken gauge theory with massive spin-1
particles, as we shall see in Part 4.
The discovery of the renormalizable SU(2)×U(1) gauge field theory for the weak
and electromagnetic interactions in the early 1970s was followed in short order by
15 The limit ξ → ∞ is singular, and we see that we recover the unitary gauge momentum-space propagator
in this limit, with the numerator factor gμν − kμ kν /M 2 characteristic of a massive Maxwell–Proca field,
and with a propagator of order unity, rather than 1/k2 at large momentum.
564 Symmetries IV: Local symmetries in field theory
16 For an excellent and comprehensive introduction to the phenomenology of the Standard Model, see
(Donoghue et al., 1992).
the Large Hadron Collider (LHC), which should reach total energies of 10 or more
TeV in the center-of-mass frame within the next few years.
15.7 Problems
1. The object of this exercise is to verify that the change in fermionic Green functions
induced by a local gauge transformation in an abelian gauge theory does not affect
the on-shell S-matrix amplitudes of the theory, as given by the LSZ formula.
Suppose there is a single incoming (resp. outgoing) fermion carrying momentum p
(resp. p′). Show that the change in the S-matrix amplitude induced by the gauge
transformation Λ(x) is proportional to
\[ \int d^4x\,d^4x'\; e^{ip'\cdot x' - ip\cdot x}\,\big(e^{ie(\Lambda(x') - \Lambda(x))} - 1\big)\;\bar u(p')(\slashed{p}' - m)\,\langle 0|T(\psi_H(x')\bar\psi_H(x)\ldots)|0\rangle\,(\slashed{p} - m)u(p) \tag{15.250} \]
where we have suppressed spin labels and the gauge transformation function Λ(x)
is assumed to go to zero rapidly at large x. The change (15.250) will vanish if the
poles in the Fourier transform of the T-product of the form 1/(p̸′ − m), 1/(p̸ − m)
are absent. Show that these poles are indeed absent by demonstrating that the
Fourier transform f(q′, q) of f(x, x′) = e^{ie(Λ(x′)−Λ(x))} − 1 takes the form
7. Verify the expression (15.149) for the DFP functional in the covariant gauge
∂ μ Aαμ (x) = fα (x).
8. By examining the quadratic (in gauge fields) part of the action in the generating
functional (15.156), show that the gauge field propagator in the covariant ξ-gauge
is a Green function for the operator □g^{μν} + (1/ξ − 1)∂^μ∂^ν, and is given by (15.159).
9. By considering the lowest-order tree expressions for the three- and four-point
functions for gauge vector scattering in momentum space, verify the Feynman
vertex factors given in Fig. 15.1, parts (a) and (b).
10. Show that the eigenvalues λ_i of the Euclidean Dirac operator D̂[A] = γ̂^μ(−i∂_μ +
A_μ) are gauge-invariant: namely, show that if D̂[A]φ_i(x) = λ_i φ_i(x), then
\[ \hat D[A^U]\,U(x)\phi_i(x) = \lambda_i\,U(x)\phi_i(x), \qquad A^U_\mu = U A_\mu U^\dagger + i(\partial_\mu U)U^\dagger \tag{15.253} \]
11. The appearance of an anomaly in the currents associated with a local gauge
symmetry would destroy (at the quantum level) the local gauge symmetry of the
theory, with dire consequences for the renormalizability of the theory, as we shall
see in Part 4. In a chiral theory such as the electroweak sector of the Standard
Model, the appearance of γ5 factors in the generators of the local gauge group
(due to the fact that left and right handed fermionic fields transform differently
under the gauge group) signals the potential existence of such anomalies. Now
suppose that the generators are decomposed into right and left handed parts (cf.
(15.222)), T_α = t^L_α P_L + t^R_α P_R, P_L = (1+γ5)/2, P_R = (1−γ5)/2, where now the
t^L_α, t^R_α do not contain γ5 factors. It can be shown (see (Donoghue et al., 1992), for example)
that the anomaly in the current ψ̄γ^μ T_α ψ is proportional to ε^{μνρσ} F_{βμν} F_{γρσ} times a
difference of traces over the left- and right-handed fields:
\[ A_\alpha \propto {\rm Tr}\big(t^L_\alpha\{t^L_\beta, t^L_\gamma\}\big) - {\rm Tr}\big(t^R_\alpha\{t^R_\beta, t^R_\gamma\}\big) \tag{15.254} \]
12. Consider a broken gauge theory based on the gauge group G= SU(5). The
symmetry-breaking is implemented by coupling the twenty-four-dimensional
adjoint representation of gauge bosons to a real twenty-four-dimensional Higgs
scalar representation, which can be conveniently represented as a traceless hermi-
tian 5x5 matrix Φ, with Φ → U † ΦU giving the action of the group (where U is
a 5x5 unitary matrix of determinant 1). The most general scalar polynomial of
degree ≤ 4 symmetric under G is
What is the remaining unbroken symmetry group in this case? Find the spectrum
of vector mesons in the symmetry-broken phase.
13. Work out the form of the neutral leptonic current sector (electron generation
only) of the electroweak SU(2)×U(1) theory, by extracting the interaction terms
in (15.223) containing the photon (Aμ ) and Z-boson (Zμ ) fields. It is conventional
to introduce the Weinberg angle θ_W, with g′/g = tan θ_W, so
Show that the leptonic interactions involving these neutral fields take the form
14. The presence of neutral weak currents mediated by a massive Z meson necessitates
the choice of SU(2)xU(1) as the electroweak gauge group. The basic QED anni-
hilation process e+ + e− → μ+ + μ− now requires, in addition to the usual graph
with an intermediate virtual photon, inclusion of a graph with an intermediate Z
boson.
(a) Assuming the mass of the electron vanishes, show that the tree amplitude for
this process is ξ-independent.
(b) Show that the presence of the Z graph leads to a forward-backward asymmetry
in the process (i.e., terms linear in cos(θ) in the center-of-mass differential
cross-section).
(c) Show that the tree amplitude from the above two graphs is not ξ-independent
if the electron and muon masses are not neglected. Explain what other graph
or graphs have to be taken into account in this case to restore a gauge-invariant
result.
16
Scales I: Scale sensitivity of field
theory amplitudes and effective field
theories
The history of the natural sciences since the late 1800s (if we temporarily set aside
astronomy as primarily concerned with Nature “in the large”) has to a great extent
involved an attempt to decipher the behavior of matter at ever smaller distance scales.
Qualitative descriptions of biological organisms at the macroscopic level have been
supplanted by an astonishingly detailed understanding of the underlying biochem-
istry of life; the complex profusion of chemical phenomena revealed empirically by
nineteenth-century and early-twentieth-century chemists is now understood to follow,
in many cases with the detailed quantitative support of sophisticated algorithms of
quantum chemistry, from a precise mathematical formulation based on Schrödinger’s
equation applied to atoms and molecules; the phenomenology of nuclei (fission, fusion,
radioactivity, etc.) has been reduced to the behavior of more “elementary” constituents
(quarks and gluons) obeying precise dynamical laws; and so on.
It is apparent from these examples that the type of theory or descriptive frame-
work appropriate for the description of the same phenomena at different levels of
“magnification” can vary enormously. Intricate details of the underlying microscopic
dynamics of a physical process may simply be irrelevant to achieving an adequate
“qualitative” understanding of the process as viewed at larger distance scales. Much
of nuclear physics can be understood perfectly well by treating the proton and neutron
as point-like fermions interacting non-relativistically via spin-dependent short-range
potentials, with absolutely no understanding of the underlying non-abelian local gauge
theory giving rise to these particles and their interactions.
A remarkable property of local quantum field theory, not shared by any of the
larger-scale phenomenologies mentioned above (which we now, of course, believe to be
consequences of the underlying field-theoretic phenomena), is that it is structurally
amenable to a precise mathematical description of the way in which the form of the
dynamical laws changes as the phenomena are examined at varying distance-scales.
As we shall shortly see, the representation of the theory at a given distance- (or
energy-)scale will turn out to be specified by an “effective Lagrangian” fixed in terms
of an infinite vector of dimensionless couplings, and the variation of these couplings
with the distance scale determined by a “renormalization group equation” which is
in essence the infinitesimal Lie algebra corresponding to the “renormalization group”
transformations associated with finite changes of the scale at which we examine the
theory. In this chapter we shall begin the task of exploring these remarkable features
of local quantum field theory.
1 The existence of very-high-energy cosmic rays offers us a tantalizing, but unfortunately very narrow,
window into physics at much higher energies than those accessible in accelerators, as do indirect cosmological
arguments involving the very early Universe, but the vast majority of our detailed information about
subatomic dynamics derives from the much more precise information gleaned from terrestrial accelerator
experiments.
General structure of local effective Lagrangians 571
where the functions f_{M′M}(k′_1, .., k_M) are smooth functions of momenta, expandable in
joint Taylor expansions in the four-momenta k′_1, .., k′_{M′}, k_1, ..., k_M. It will be convenient
as follows,
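\[ H_{\rm int}(x) = \sum_{M'M}\frac{(2\pi)^{\frac{3}{2}(M'+M)}}{M'!\,M!}\; f_{M'M}\Big(-i\frac{\partial}{\partial x'_1},\ldots,+i\frac{\partial}{\partial x_M}\Big)\,\phi^{(-)}(x'_1)\cdots\phi^{(+)}(x_M)\Big|_{x'_i = x_i = x} \tag{16.3} \]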
where the spacetime-derivatives in the functions fM M are converted to the appro-
priate momentum dependences if we insert the definition of the scalar field (16.2) in
(16.1). We see that the interaction Hamiltonian density is necessarily an (infinite)
expansion involving multi-nomials in the scalar field and all possible spacetime-
derivatives thereof. At this stage, we have not yet inserted the demands of special
relativity. The discussion in Chapter 12 reveals the appropriate further constraints
which the as yet unspecified functions fM M for this Hamiltonian must satisfy to
yield relativistically invariant scattering amplitudes: they must arise by the standard
canonical procedure whereby the interaction Hamiltonian is derived via Legendre
transformation of a Lorentz scalar Lagrangian density L(φ, ∂ν φ, ∂ν ∂ρ φ, . . .), where now
we must allow the Lagrangian to contain arbitrarily many derivatives and powers of
the local scalar field φ(x) = φ(+) (x) + φ(−) (x) (with positive and negative frequency
parts of the field paired throughout to satisfy the demands of locality) in order to
obtain the general expansion for the (interaction) Hamiltonian density as indicated in
(16.3). A Lagrangian of this type, containing effectively all possible terms consistent
with cluster decomposition and Lorentz-invariance, is sometimes called a “Wilsonian
effective Lagrangian”,2 in order to reflect the profound contributions made in the
understanding of the scale sensitivity of local theories in the early 1970s by Ken
Wilson (Wilson, 1971).
The reader with prior acquaintance with standard treatments of perturbative quan-
tum field theory may object to the use of a scalar Lagrangian with “non-renormalizable
terms” of higher than (mass) dimension 4 (in four spacetime dimensions) which are
well known to lead to ultraviolet (large momentum) divergences in the loop integrals
of perturbation theory which are not removable via the usual processes of mass,
coupling, and wavefunction renormalization of the bare parameters of the theory. We
shall return to the whole matter of perturbative renormalizability, and to its relation
with the Wilsonian approach, in the next chapter. For now, this objection provides
us with the opportunity to fully realize, and put into effect, the qualitative insights
of the preceding section concerning the inescapable limitations on any local theory
formulated on a flat Minkowski spacetime due to the unavoidable dissolution of this
kinematic structure at very short distances (or large momenta). We therefore admit
frankly that our Lagrangian field theory, with its associated path integral, must be
2 At this point we should alert the reader to a dangerous source of terminological confusion. The use
of the word “effective” in this chapter will be completely restricted to the sense indicated here, where we
imagine writing an exact representation of only part of the physical content of the theory, basically by a
change of variable in the functional integral defining the theory at short distances. The notion of an “effective
action” Γ, as used in Chapter 10 in reference to the generating functional of the one-particle-irreducible
n-point functions of the theory, plays no role here, and to avoid confusion with the aforesaid Γ we shall try
to stick to the phrase “effective Lagrangian”, while avoiding the perfectly natural term “effective action”
for the spacetime integral thereof.
(16.7)
where the dots represent terms with a total of 4, 6, 8, etc., spacetime-derivatives acting
on the fields (coupled, of course, to an overall Lorentz scalar).
We have used our freedom to rescale the field to set the coefficient of the free
kinetic term (∂_ν φ_Λ)² to be exactly 1/2. The mass term is now concealed in the term
a_0 φ_Λ², while the coefficient a_2 corresponds to the usual dimensionless quartic coupling
constant λ in our previous discussions of λφ⁴ theory. Now, however, we have an
infinite series of additional interaction terms (note: n is even), corresponding to new
four-point vertices arising from the derivative coupling (∂_ν φ_Λ)² φ_Λ², six-point vertices
from φ_Λ⁶, (∂_ν φ_Λ)² φ_Λ⁴, and so on. Recalling that the action S_E in (16.5) must be
dimensionless, implying mass dimension of 1 (from the kinetic term) for the field
φ_Λ, we see that the coupling constants a_n (resp. a′_n) must have mass dimension 2 − n
(resp. −n). It will be convenient to rescale these couplings in terms of dimensionless
ones by extracting the appropriate powers of the cutoff (itself of mass dimension 1):
\[ a_n \equiv g_n\Lambda^{2-n}, \quad n = 0, 2, 4, \ldots \]
\[ a'_n \equiv g'_n\Lambda^{-n}, \quad n = 2, 4, 6, \ldots \tag{16.8} \]
For the present, we shall be assuming that our theory is weakly coupled—in other words
that the dimensionless couplings g_n, g′_n, . . . corresponding to interaction terms (i.e.,
those higher than quadratic in the field) are all of order unity, or perhaps somewhat
smaller,3 in which case a formal asymptotic expansion in these variables becomes
quantitatively useful.
3 The concept of “order unity” possesses a somewhat elastic connotation in field theory, as it is not always
obvious what the relevant expansion variable ought to be. The fine-structure constant α = e2 /4π = 1/137
seems to be two orders of magnitude smaller than “order unity”, but the electric charge e ∼ 0.3 is clearly
much closer to unity. Nevertheless, for many amplitudes in QED, an expansion in powers of α is appropriate,
in the sense that the coefficients of powers of α, at low orders of perturbation theory, are fairly close to 1.
Scaling properties of effective Lagrangians: relevant, marginal, and irrelevant operators 575
field theory can be expanded formally in powers of Planck's constant ħ, with the
lowest-order contributions corresponding to the tree (no-loop) graphs of the theory,
the one-loop graphs contributing with one extra power of ħ, the two-loop graphs with
two extra powers, and so on. We can now ask about the perturbative contributions of
the operators O_n, O′_n appearing in the general action (16.7) to some n-point function of
the theory, where we assume that the incoming and outgoing momenta of the process
under consideration are all of order E ≪ Λ. The only dimensionful scales present
at the tree graph level are the energy scale of the process E (which permeates the
internal propagators) and the UV cutoff Λ, with the dependence on the latter arising
only from the explicit dependence of the couplings in (16.8) on Λ: in particular, at
tree level there are no loop integrals extending up to and cut off at Λ to introduce
further Λ-dependence. This means that the contribution of a particular operator at
the energy scale E can be estimated by a trivial dimensional argument, essentially by
just counting the mass dimension of the operator (integrated over spacetime), whence
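\[ \int d^4x\,O_n \equiv \int d^4x\,\phi_\Lambda^{2+n} \sim E^{n-2} \tag{16.9} \]
\[ \int d^4x\,O'_n \equiv \int d^4x\,(\partial_\nu\phi_\Lambda)^2\phi_\Lambda^{n} \sim E^{n}, \ \text{etc.} \tag{16.10} \]
Including the coupling constants in (16.8) we see that these operators contribute to
tree amplitudes at the relative order
\[ O_n \to g_n\Big(\frac{E}{\Lambda}\Big)^{n-2}, \quad n = 0, 2, 4, \ldots \tag{16.11} \]
\[ O'_n \to g'_n\Big(\frac{E}{\Lambda}\Big)^{n}, \quad n = 2, 4, 6, \ldots \tag{16.12} \]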
and so on for the higher operators. This means that for the infrared physics with
which we are concerned, where the energy E of the processes we are studying is much
smaller than the ultraviolet cutoff Λ of the theory, the most important, or relevant,
operator is O_0 = φ², the mass operator, whose effects grow quadratically as we lower
the energy. This is hardly surprising if we consider the mass expansion of the free
(Euclidean) propagator
\[ \frac{1}{k^2 + m^2} \sim \frac{1}{k^2} - \frac{m^2}{k^4} + \frac{m^4}{k^6} + \ldots \tag{16.13} \]
where we see that increasing powers of the mass correspond to larger and larger
contributions in the infrared region k ≪ m. The quartic coupling operator O_2, by
contrast, contributes equally at all energy scales (in the tree amplitudes): it is therefore
termed a “marginal” operator, which we should regard as a technical designation of its
scaling behavior, and not (given its importance in generating non-trivial interactions)
as a demeaning comment on its importance for the theory! Higher-dimension operators
such as O_4 = φ⁶ and O′_2 = (∂_ν φ)² φ² (both of mass dimension 6 and contributing at
order (E/Λ)² for E ≪ Λ) contribute at a progressively smaller level to the low-energy
physics, the higher their dimension, and are termed “irrelevant”, from the point of view
of tree amplitude scaling.

Fig. 16.1 Some tree graph contributions to the 2-4 scalar scattering amplitude (first graph: two g_2 vertices joined by a propagator of order 1/E²; second graph: a single g_4/Λ² vertex).
A simple example is given by the tree graphs displayed in Fig. 16.1, representing
contributions to the 2-4 scattering amplitude: the first graph arises from the quartic
coupling g_2 φ⁴ in second order (and is of order g_2²/E², where the incoming and outgoing
momenta are of order E), while the second graph, arising from the higher-dimension
term (g_4/Λ²)φ⁶, is very small, of order E²/Λ² relative to the first (assuming all dimensionless
couplings of order unity, or in any event much closer to unity than the ratio E²/Λ²), in
agreement with the scaling deduced previously in (16.11).
The reader should once again guard against attaching the colloquial meaning of
terms such as “irrelevant” to the physics generated by the corresponding operators:
the dimension-six four-fermion operator of Fermi weak interaction theory, responsible
for β-decay, for example, is “irrelevant” from this point of view, but the associated
vast phenomenology of radioactivity is hardly so. A higher-dimension operator may
initiate processes with very low amplitude (hence, rare processes), which may, however,
be of a sufficiently different type from the processes induced by marginal or relevant
operators as to stand out phenomenologically, and even to play an important role
in uncovering details of the physics emerging at shorter distances (as in the case
of the electroweak theory supplanting the Fermi theory of weak interactions). For
reasons that will become clear in the next chapter, the classification into “relevant”,
“marginal”, and “irrelevant” operators (of mass dimension <4, 4, and >4 respectively,
in four spacetime dimensions) is mirrored in the terminology of renormalization
theory by the terms “super-renormalizable”, “strictly renormalizable”, and “non-
renormalizable”, respectively.
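The tree-level power counting of (16.11) and (16.12) is simple enough to put in a few lines of code. The sketch below is only an illustration: the function names and the sample scales are hypothetical, and the couplings are taken to be dimensionless numbers of order one, as assumed in the text.

```python
def classify(mass_dimension, spacetime_dim=4):
    """Label an operator by its mass dimension, as in the tree-level scaling argument."""
    if mass_dimension < spacetime_dim:
        return "relevant"
    if mass_dimension == spacetime_dim:
        return "marginal"
    return "irrelevant"


def tree_suppression(mass_dimension, energy, cutoff, spacetime_dim=4):
    """Relative tree-level weight (E/Lambda)^(dim - 4) for an order-one dimensionless coupling."""
    return (energy / cutoff) ** (mass_dimension - spacetime_dim)


# Mass dimensions in four spacetime dimensions (the scalar field has dimension 1).
operators = {"phi^2": 2, "phi^4": 4, "phi^6": 6, "(d phi)^2 phi^2": 6}

# Hypothetical scales: E = 100 and Lambda = 10^4 in the same units.
report = {name: (classify(dim), tree_suppression(dim, 1e2, 1e4))
          for name, dim in operators.items()}
```

With these sample scales the mass operator is enhanced by (Λ/E)², the quartic coupling is unaffected, and both dimension-6 operators are suppressed by (E/Λ)².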
Fig. 16.2 Some tree and one-loop contributions to the 2-2 scalar scattering amplitude.

When loop effects are included, the situation becomes more complicated, and much
more interesting. If we take 2-2 scattering as a test case, the graphs in Fig. 16.2
illustrate some simple low-order contributions to the process: the lowest-order tree
graph corresponding to the quartic coupling g_2 φ⁴ (we will drop the Λ subscript on
the fields here with the reminder that it simply instructs us to cut off the momenta
on all internal propagators at |k| = Λ), the three one-loop graphs arising at second
order in g_2, and the one-loop graph coming from the first-order contribution of the
dimension 6 “irrelevant” operator (g_4/Λ²)φ⁶. Setting m² = 2a_0 = 2g_0 Λ², we shall assume
that the momenta in the process and the unperturbed mass m are all much smaller
than the cutoff Λ. The final graph, arising from contracting two out of the six lines
emerging from the six-point vertex associated with the higher-dimension φ⁶ operator,
contains the cutoff one-loop integral
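\[ \int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 + m^2}\,\theta(\Lambda^2 - k^2) = \frac{1}{8\pi^2}\int_0^\Lambda dk\,\frac{k^3}{k^2 + m^2} = \frac{1}{16\pi^2}\Big(\Lambda^2 - m^2\ln\frac{\Lambda^2}{m^2}\Big) + O\Big(\frac{m^2}{\Lambda^2}\Big) \tag{16.14} \]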
The loop integral (which in the absence of a cutoff would be quadratically divergent)
therefore produces a large factor proportional to the cutoff squared, which will cancel
the inverse factor of Λ² (in the coupling g_4/Λ²) which we have previously used to argue
for the “irrelevance” of the φ6 operator at low energies. Of course, the result is just a
momentum-independent constant contribution to the amplitude, of exactly the same
form as the tree contribution proportional to g2 . In fact, a short calculation (see
Problem 1) gives the following result for the truncated four-point function arising
from the graphs in Fig. 16.2 (normalized to begin with g2 )
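\[ \Gamma^{(4)}(k_1, k_2, k_3, k_4) = g_2 - \frac{3}{4\pi^2}\,g_2^2\,\{I(s, m^2, \Lambda^2) + I(t, m^2, \Lambda^2) + I(u, m^2, \Lambda^2)\} + \frac{15}{16\pi^2}\,g_4 + O\Big(\frac{m^2, k_i^2}{\Lambda^2}\Big) \tag{16.15} \]
\[ I(p^2, m^2, \Lambda^2) \equiv \int_0^1\Big(\ln\Big(\frac{\Lambda^2}{x(1-x)p^2 + m^2}\Big) - 1\Big)dx \tag{16.16} \]
\[ s \equiv (k_1 + k_2)^2, \quad t \equiv (k_1 - k_3)^2, \quad u \equiv (k_1 - k_4)^2 \tag{16.17} \]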
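The function I(p², m², Λ²) of (16.16) is easy to evaluate numerically, which makes its logarithmic cutoff dependence explicit. The sketch below uses Simpson's rule on the Feynman-parameter integral for Euclidean p² ≥ 0. The function name `bubble_I` and the sample arguments are illustrative assumptions.

```python
import math


def bubble_I(p2, m2, cutoff2, steps=2000):
    """Simpson's rule for int_0^1 [ln(Lambda^2 / (x(1-x) p^2 + m^2)) - 1] dx, with p^2 >= 0."""
    h = 1.0 / steps

    def f(x):
        return math.log(cutoff2 / (x * (1.0 - x) * p2 + m2)) - 1.0

    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


at_zero = bubble_I(0.0, 1.0, 1e4)        # equals ln(Lambda^2/m^2) - 1 at p^2 = 0
base = bubble_I(5.0, 1.0, 1e4)
doubled_cutoff = bubble_I(5.0, 1.0, 4e4) # cutoff Lambda doubled, so Lambda^2 -> 4 Lambda^2
```

Doubling Λ shifts I by ln 4 whatever the momentum, so the one-loop g_2² terms depend only logarithmically on the cutoff, while the g_4 term in (16.15) is a cutoff-independent constant.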
4 For the purposes of the present discussion, 1/137 is a number of order unity, to be distinguished from
the much tinier ratio of scales, ∼ 10^{−15}, between, say, the LHC energy E ∼ 10⁴ GeV and the Planck energy
Λ ∼ 10¹⁹ GeV.
As Fourier modes of different momentum are orthogonal (in coordinate space), the
source term in the functional integral (16.5) depends only on the infrared field φμ (x)
(as the overlap of φ(μ,Λ) and jμ vanishes):
\[ \int d^4x\; j_\mu(x)\phi_\Lambda(x) = \int d^4x\; j_\mu(x)\phi_\mu(x) \tag{16.22} \]
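If we factor the functional measure in the path integral (16.5) as indicated in (16.20),
we see that the source term can be taken outside the integral over the momentum
shell field φ_{(μ,Λ)}, and the resulting integral used to define a new effective Lagrangian
L_μ(φ_μ):
\[ \begin{aligned}
Z_\Lambda[j_\mu] &= \int D\phi_\mu\,D\phi_{(\mu,\Lambda)}\; e^{-\int d^4x\,\mathcal{L}_\Lambda(\phi_\Lambda) + \int d^4x\, j_\mu(x)\phi_\Lambda(x)} \\
&= \int D\phi_\mu\; e^{\int d^4x\, j_\mu(x)\phi_\mu(x)}\int D\phi_{(\mu,\Lambda)}\; e^{-\int d^4x\,\mathcal{L}_\Lambda(\phi_\mu + \phi_{(\mu,\Lambda)})} \\
&\equiv \int D\phi_\mu\; e^{-\int d^4x\,\mathcal{L}_\mu(\phi_\mu) + \int d^4x\, j_\mu(x)\phi_\mu(x)}
\end{aligned} \tag{16.23} \]
where
\[ \int d^4x\,\mathcal{L}_\mu(\phi_\mu) \equiv -\ln\Big(\int D\phi_{(\mu,\Lambda)}\; e^{-\int d^4x\,\mathcal{L}_\Lambda(\phi_\mu + \phi_{(\mu,\Lambda)})}\Big) \tag{16.24} \]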
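The logic of (16.24) can be seen in a zero-dimensional toy analogue, in which a single "infrared" variable x and a single "shell" variable y stand in for φ_μ and φ_(μ,Λ). This is a sketch under that assumption, not the field-theory calculation, and the quadratic "action" and its couplings A, B, C are hypothetical. Integrating out y numerically gives an effective action for x whose quadratic coefficient is shifted from A to A − C²/B, the toy version of a running coupling.

```python
import math

A, B, C = 1.0, 4.0, 1.0   # hypothetical couplings of the toy "action"


def toy_action(x, y):
    """S(x, y) = A x^2/2 + B y^2/2 + C x y, with x the low mode and y the high mode."""
    return 0.5 * A * x * x + 0.5 * B * y * y + C * x * y


def effective_action(x, y_max=10.0, steps=4000):
    """-ln of the integral over y of exp(-S(x, y)), by Simpson's rule on [-y_max, y_max]."""
    h = 2 * y_max / steps
    total = 0.0
    for i in range(steps + 1):
        y = -y_max + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.exp(-toy_action(x, y))
    return -math.log(total * h / 3)


shift = effective_action(1.5) - effective_action(0.0)
expected = 0.5 * (A - C * C / B) * 1.5**2   # effective quadratic coefficient times x^2/2
```

The x-independent part of the result, here −(1/2) ln(2π/B), plays the role of the source-independent constant that drops out of connected amplitudes.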
The new effective Lagrangian, L_μ, can be expanded⁵ in powers of the infrared field
φ_μ and its derivatives, just as our original effective Lagrangian defining the theory at
the high scale Λ, but of course, with coefficients which depend on the scale μ:
\[ \mathcal{L}_\mu = \frac{1}{2}a'_0(\mu)(\partial_\nu\phi_\mu)^2 + \sum_{n\ge 0} a_n(\mu)\,O_n(\phi_\mu) + \sum_{n>0} a'_n(\mu)\,O'_n(\phi_\mu) + \ldots \tag{16.25} \]
Note that the coefficient a′_0 (often written Z, the wavefunction renormalization
constant) of the kinetic term (which we were free to choose to be 1/2 at the high scale,
by rescaling the field) now becomes a function of μ as well. The effective running
couplings a_n(μ), a′_n(μ), . . . of course satisfy the boundary condition
\[ a'_0(\Lambda) = 1, \qquad a_n(\Lambda) = a_n = g_n\Lambda^{2-n}, \qquad a'_n(\Lambda) = a'_n = g'_n\Lambda^{-n}, \quad n \ge 2 \tag{16.26} \]
5 Strictly speaking, the locality of the effective Lagrangian defined by this procedure depends on certain
smoothness properties which are not present with the sharp momentum cutoff envisaged here. In the next
section we shall remedy this difficulty and derive an exact equation for the cutoff dependence of the local
effective Lagrangian which arises once the momentum cutoff is appropriately chosen.
The renormalization group 581
the loop integrals when any internal propagator exceeds momentum Λ. Note that the
field φ in (16.28) is not cut off, but contains all momentum modes and is independent
of the scale at which we are examining the theory, so derivatives with respect to scale
do not affect the fields. (Alternatively, we may simply choose to pick once and for all
a fixed “ultimate” UV cutoff for this field—the Planck scale ΛPl , say—reflecting our
certain knowledge that a representation of the physics in terms of Minkowski space
fields must fail at this point; see below.) The precise form of D is unimportant, but
for definiteness we can take, for example,
and contains all possible local field dependent terms: accordingly, the sums begin at
n = 0, including the quadratic mass O_0 = φ² and kinetic O′_0 = (∂_ν φ)² terms, whose
coefficients must be allowed to change as we change the scale. We shall assume, as
previously, that we are only interested in the physics up to some scale μ much lower
than a “top” UV scale Λ_UV at which the “bare” couplings g_n, g′_n are initially set, with
a_n = g_n Λ_UV^{2−n}, a′_n = g′_n Λ_UV^{−n} as before (cf. (16.26)). Accordingly, the external source j(x)
introduced to probe field modes associated with the desired scattering amplitudes need
contain only momentum modes up to μ:
The generating functional describing the physics at any scale Λ is given by the path
integral
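\[ \begin{aligned}
Z_\Lambda[j_\mu] &= \int D\phi\; e^{-\int d^4x\,(\mathcal{L}_0(\phi,\Lambda) + \mathcal{L}_{\rm int}(\phi,\Lambda)) + \int d^4x\, j_\mu(x)\phi(x)} \\
&\equiv \int D\phi\; e^{-S_0[\phi,\Lambda] - S_{\rm int}[\phi,\Lambda] + \int \tilde j_\mu(q)\tilde\phi(-q)\frac{d^4q}{(2\pi)^4}}
\end{aligned} \tag{16.32} \]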
We now claim that there exists a unique evolution with Λ of the interaction Lagrangian
Lint (φ, Λ) (from Λ = ΛU V down to Λ = μ) such that the low-energy physics is exactly
preserved—in other words, which leaves the generating functional W [jμ ] = ln Z[jμ ]
of the connected low-momentum amplitudes of the theory invariant up to source-
independent terms:
\[ \frac{\partial}{\partial\Lambda}W_\Lambda[j_\mu] = \text{independent of } j_\mu \tag{16.33} \]
Thus, when we perform functional derivatives with respect to jμ to extract the con-
nected low-momentum n-point amplitudes of the theory, the Λ dependence disappears
(for any Λ in the range μ < Λ < ΛU V ), as long as we use the effective Lagrangian
Lint (φ, Λ) appropriate for that scale.
The derivation of the renormalization group flow equation for the Lagrangian
Lint is facilitated by a functional integral identity, based on the observation that the
functional integral of a total functional derivative vanishes provided the integrand has
the usual falloff (in our case, exponential) for large values of the field. Namely, we have
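\[ \int D\tilde\phi\;\frac{\delta}{\delta\tilde\phi(k)}\Big\{\Big(\tilde\phi(k)\,D(k^2/\Lambda^2) + \frac{1}{2}\frac{(2\pi)^4}{k^2 + m^2}\frac{\delta}{\delta\tilde\phi(-k)}\Big)\, e^{-S_0[\phi,\Lambda] - S_{\rm int}[\phi,\Lambda] + \int\tilde j_\mu(q)\tilde\phi(-q)\frac{d^4q}{(2\pi)^4}}\Big\} = 0 \tag{16.34} \]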
This functional identity holds for all values of the momentum k, but for reasons shortly
to become apparent we shall apply it only in the regime |k| > μ, where by assumption
j̃μ (k) = 0. Thus, when working out the functional derivatives in (16.34), we can ignore
any factors of j̃μ (k) (or j̃μ (−k)) that appear. We shall also suppose that our fields are
restricted to a large spacetime box of volume V , so that infrared singular functional
derivatives such as
\[ \frac{\delta}{\delta\tilde\phi(k)}\tilde\phi(k) = \delta^4(0) = \frac{1}{(2\pi)^4}\int d^4x\; e^{i0\cdot x} = \frac{V}{(2\pi)^4} \tag{16.35} \]
are given a definite meaning. (These terms will, in any case, later turn out to be irrel-
evant disconnected vacuum terms.) Carrying out the indicated functional derivatives
in (16.34), and using
\[ \frac{\delta S_0}{\delta\tilde\phi(-k)} = \frac{k^2 + m^2}{(2\pi)^4}\,D(k^2/\Lambda^2)\,\tilde\phi(k) \tag{16.36} \]
one finds after a short calculation (see Problem 2) that it can be rewritten as
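\[ \begin{aligned}
\frac{1}{2}D\Big(\frac{k^2}{\Lambda^2}\Big)\delta^4(0)\,Z_\Lambda[j_\mu] = \int D\tilde\phi\,\Big\{ &\frac{1}{2}\frac{k^2 + m^2}{(2\pi)^4}\,D\Big(\frac{k^2}{\Lambda^2}\Big)^2\tilde\phi(k)\tilde\phi(-k) \\
&+ \frac{1}{2}\frac{(2\pi)^4}{k^2 + m^2}\Big[\frac{\delta^2 S_{\rm int}}{\delta\tilde\phi(k)\delta\tilde\phi(-k)} - \frac{\delta S_{\rm int}}{\delta\tilde\phi(k)}\frac{\delta S_{\rm int}}{\delta\tilde\phi(-k)}\Big]\Big\}\, e^{-S_0 - S_{\rm int} + \int j_\mu\phi\,d^4x}
\end{aligned} \tag{16.37} \]
Note that the two terms in curly braces in (16.34) are chosen such that a contribution
of the form φ̃(k) D(k²/Λ²) δS_int/δφ̃(k) cancels between them, leaving just the terms given
here.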
We can now return to our main focus: how to choose the effective Lagrangian
Lint (φ, Λ) at any given scale Λ to ensure that we obtain the same low-momentum
amplitudes, by functionally differentiating the generating functional ZΛ [jμ ]. A glance
at (16.32) shows that the differential variation of this functional with Λ arises from
two sources: the Λ dependence of the propagator via the cutoff function D(k 2 /Λ2 )
embedded in the free Lagrangian L0 , and the Λ-dependence of the “interaction” part
Lint (through the Λ-dependent coupling parameters contained in the latter). In fact,
we clearly have, differentiating (16.32),
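\[ \Lambda\frac{\partial Z_\Lambda[j_\mu]}{\partial\Lambda} = -\int D\tilde\phi\,\Big\{\int\frac{1}{2}(k^2 + m^2)\,\tilde\phi(k)\tilde\phi(-k)\,\Lambda\frac{\partial D(k^2/\Lambda^2)}{\partial\Lambda}\frac{d^4k}{(2\pi)^4} + \Lambda\frac{\partial S_{\rm int}}{\partial\Lambda}\Big\}\; e^{-S_0 - S_{\rm int} + \int j_\mu\phi\,d^4x} \tag{16.38} \]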
Our choice of cutoff function D(k 2 /Λ2 ) implies (see (16.29)) that derivatives of D
with respect to the scale Λ, if we keep Λ above the infrared scale μ, are exponentially
negligible (of order $e^{-\alpha(1-k^2/\Lambda^2)}$) in the infrared region |k| < μ < Λ. The support of
both sides of the following identity is therefore precisely in the region of validity |k| > μ
of (16.37):
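$$\Lambda\frac{\partial D(k^2/\Lambda^2)}{\partial\Lambda} = -D\Big(\frac{k^2}{\Lambda^2}\Big)^2\,\Lambda\frac{\partial D^{-1}}{\partial\Lambda} \qquad (16.39)$$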
Using (16.39), (16.38) can be re-expressed
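$$\Lambda\frac{\partial Z_\Lambda[j_\mu]}{\partial\Lambda} = \int\mathcal{D}\tilde\phi\,\Big\{\frac{1}{2}\int(k^2+m^2)\,D\Big(\frac{k^2}{\Lambda^2}\Big)^2\tilde\phi(k)\tilde\phi(-k)\,\Lambda\frac{\partial D^{-1}}{\partial\Lambda}\frac{d^4k}{(2\pi)^4} - \Lambda\frac{\partial S_{\rm int}}{\partial\Lambda}\Big\}\cdot e^{-S_0-S_{\rm int}+\int j_\mu\phi\,d^4x} \qquad (16.40)$$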
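and using the identity obtained by integrating both sides of (16.37) with the measure
$\int d^4k\,\Lambda\frac{\partial D^{-1}}{\partial\Lambda}\cdots$, we obtain
$$\Lambda\frac{\partial Z_\Lambda[j_\mu]}{\partial\Lambda} = \frac{1}{2}\,\delta^4(0)\int\Lambda\frac{\partial\ln(D(k^2/\Lambda^2))}{\partial\Lambda}\,d^4k\cdot Z_\Lambda[j_\mu] \qquad (16.42)$$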
6 The reader may find it convenient at this point to review the discussion of disconnected graphs and
vacuum energy in Section 10.2.
be seen to arise from the cutoff-dependence of the zero-point energy associated with
the free Lagrangian L0 (see Problem 3).
The equation (16.41) gives the desired variation in the form of the effective
Lagrangian with the scale at which we probe the physics. Moreover, given that
the starting effective Lagrangian (at the UV cutoff) yields a well-defined convergent
functional integral representation for ZΛ , this renormalization group equation is non-
perturbatively valid, as it is based on exact manipulations of the functional integral.
We note immediately that the space of free Lagrangians (i.e., those Lagrangians
quadratic in the field, but with arbitrarily many spacetime-derivatives) is preserved
under renormalization group transformations, as $\Lambda\frac{\partial S_{\rm int}}{\partial\Lambda}$ is clearly quadratic in the
fields (dropping physically ignorable constant terms) if $S_{\rm int}$ is. However, if there are
interactions present (with our φ → −φ symmetry, terms quartic or higher in the fields),
the non-linear functional equation (16.41) produces an infinite-dimensional mixing of
the operators in the general expansion (16.30). The reason for this is simply that this
expansion, re-expressed in terms of momentum-space fields, can be written
$$S_{\rm int} = \sum_L\frac{1}{L!}\int h_L(k_1,k_2,\ldots,k_L;\Lambda)\,\delta^4\Big(\sum_i k_i\Big)\,\tilde\phi(k_1)\tilde\phi(k_2)\cdots\tilde\phi(k_L)\prod_i d^4k_i \qquad (16.44)$$
where the functions $h_L$ are scalar functions (under Euclidean rotations of their four-momentum
arguments) expandable in powers of their momentum arguments. Inserting
this form in (16.41) we find that the form of the effective Lagrangian is preserved under
renormalization group transformation
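$$\Lambda\frac{\partial S_{\rm int}}{\partial\Lambda} = \sum_L\frac{1}{L!}\int\big(\hat h_L^{(1)}(k_1,\ldots,k_L) - \hat h_L^{(2)}(k_1,\ldots,k_L)\big)\,\delta^4\Big(\sum_i k_i\Big)\,\tilde\phi(k_1)\cdots\tilde\phi(k_L)\prod_i d^4k_i \qquad (16.45)$$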
where the functions
$$\hat h_L^{(1)}(k_1,\ldots,k_L) = \sum_{M+N=L}\frac{L!}{M!\,N!}\,F(k^2)\,h_{M+1}(k,k_1,\ldots,k_M)\cdots$$
Fig. 16.3 Graphical representation of the renormalization group evolution of the effective
Lagrangian. (Diagram not reproduced; the thick internal lines are labelled $F(k^2)$.)
of the local operator basis in terms of which we choose to express our cutoff theory,
exactly as expressed in the renormalization group equations (16.27).
The physical interpretation of the two terms in (16.46, 16.47) is illustrated in
Fig. 16.3. The graph on the left (corresponding to (16.46)) illustrates the differential
change in a typical effective vertex (in this case, six-point) due to the differential vari-
ation F of the propagator, represented by the thick line, which in this case connects
two vertices as an internal line in a tree graph. The graph on the right (corresponding
to (16.47)) describes the variation in the effective vertex (in this case, a four-point
vertex) due to the differential variation of the cutoff propagator in an internal line in
a loop graph.
We have already seen an explicit example of the effect of the latter term in (16.15),
where the irrelevant term $\frac{g_4}{\Lambda^2}\phi^6$ was shown to lead to an order unity modification of
the marginal four-point vertex (unsuppressed by inverse powers of the high scale) as a
consequence of the large loop integral in the final graph of Fig. 16.2. Let us see how to
reproduce this result from the point of view of our new renormalization group equation.
We shall assume that at the initial high scale Λ, all the dimensionless couplings except
$g_2$ and $g_4$ are negligible, and we shall also ignore terms of order $g_2^2$, relative to $g_2$ and
$g_4$. The evolution of the $g_4$ vertex is determined by terms of order $g_2^2$ (from (16.46))
or $g_6$ (from (16.47)) both of which we shall neglect: we thereby conclude that to the
desired accuracy $\Lambda\frac{\partial}{\partial\Lambda}\frac{g_4}{\Lambda^2}$ is negligible, and we may replace $\frac{g_4(\Lambda)}{\Lambda^2}$ by $\frac{g_4(\mu)}{\mu^2}$ at any lower
scale μ. The evolution of $g_2$ arising from (16.47) is determined by
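$$\Lambda\frac{\partial g_2}{\partial\Lambda} = -\frac{(2\pi)^4}{2}\,\frac{30}{(2\pi)^8}\,\frac{g_4(\Lambda)}{\Lambda^2}\int\Lambda\frac{\partial D(k^2/\Lambda^2)^{-1}}{\partial\Lambda}\,\frac{d^4k}{k^2+m^2}$$
$$\approx -\frac{(2\pi)^4}{2}\,\frac{g_4(\mu)}{\mu^2}\,\frac{30}{(2\pi)^8}\int\Lambda\frac{\partial D(k^2/\Lambda^2)^{-1}}{\partial\Lambda}\,\frac{d^4k}{k^2+m^2} \qquad (16.49)$$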
We may approach the sharp momentum cutoff used in Section 16.3 by choosing a
large value for the parameter α in (16.29), whereupon we may replace D(k 2 /Λ2 )−1 by
a step function θ(Λ − |k|). The integral in (16.50) is then restricted to the momentum
shell μ < |k| < Λ, and we find (replacing $\frac{g_4(\mu)}{\mu^2}$ by $\frac{g_4(\Lambda)}{\Lambda^2}$ as indicated above)
$$g_2(\mu) = g_2(\Lambda) + \frac{15}{16\pi^2}\,g_4(\Lambda) + O\Big(\frac{m^2,\mu^2}{\Lambda^2}\Big) \qquad (16.51)$$
which can be seen to agree with the order unity shift in g2 induced by the six-point
vertex obtained earlier in (16.15).
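The coefficient in (16.51) can be checked with a few lines of arithmetic. The sketch below is only an illustration under the sharp-cutoff assumption made in the text: integrating (16.49) in Λ with $g_4/\Lambda^2$ held fixed turns the Λ-derivative of the step function into an integral over the shell μ < |k| < Λ, and the four-dimensional angular integration contributes a factor $2\pi^2$. The function name and the sample values of the scales are illustrative choices, not taken from the text.

```python
import math

def g2_shift_coefficient(cutoff, mu, m):
    """Coefficient c in g2(mu) - g2(cutoff) = c * g4(cutoff), sharp shell cutoff."""
    # Closed-form antiderivative of k^3 / (k^2 + m^2), the radial shell integrand.
    def antiderivative(k):
        return 0.5 * (k**2 - m**2 * math.log(k**2 + m**2))

    # Integral of d^4k / (k^2 + m^2) over mu < |k| < cutoff (solid angle 2*pi^2).
    shell = 2.0 * math.pi**2 * (antiderivative(cutoff) - antiderivative(mu))
    # Numerical prefactor of (16.49): ((2 pi)^4 / 2) * 30 / (2 pi)^8.
    prefactor = (2.0 * math.pi)**4 / 2.0 * 30.0 / (2.0 * math.pi)**8
    # g4/Lambda^2 is held fixed, so the shell integral is divided by cutoff^2.
    return prefactor * shell / cutoff**2

coefficient = g2_shift_coefficient(cutoff=1000.0, mu=1.0, m=1.0)
expected = 15.0 / (16.0 * math.pi**2)
print(coefficient, expected)
```

For scales m, μ far below Λ the two printed numbers should agree up to the small relative corrections of order $m^2/\Lambda^2$, $\mu^2/\Lambda^2$ (times a logarithm) indicated in (16.51).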
The renormalization group flow equation (16.45) gives an exact description of
the appropriate form taken by the dynamics of the theory once phenomenologically
inaccessible short-distance modes of the field are averaged out, but it nevertheless
leaves us with a complicated, and not very practical, end result, as our effective
Lagrangian contains an infinite number of terms. Although by lowering the cutoff
from some very high (and experimentally unreachable) value ΛU V to a value μ
close to experimental energies we have ensured that large loop integrals involving
powers of ratios of ΛU V to the low scale μ have been eliminated, there remains the
obvious difficulty that the calculation of a low-energy amplitude seems to require
the inclusion of contributions from an infinite number of vertices in the low-energy
Lagrangian.
It is a remarkable property of local quantum field theory that for a certain
subset of theories, the sensitivity of the low-energy amplitudes to all but a finite
number of coupling parameters—in particular, those associated with the marginal
and relevant operators in the effective Lagrangian—is reduced to inverse powers of
the high cutoff. From the renormalization group point of view, this occurs because the
renormalization group flow has the property that the point describing the “location”
of the Lagrangian in the infinite-dimensional coupling space of the $g_n(\mu), g'_n(\mu), \ldots$ is
attracted, for $\mu \ll \Lambda_{UV}$, onto a finite-dimensional submanifold (of dimension equal to
the number of marginal and relevant operators), up to corrections of order $\frac{\mu}{\Lambda_{UV}}$ to some
(typically even) power (modulo logarithms of $\frac{\mu}{\Lambda_{UV}}$). As a consequence, up to usually
negligible corrections, we find that for these theories the low-energy amplitudes can be
parameterized by just a finite set of parameters—namely, those needed to locate the
theory on the finite-dimensional attractive submanifold, and which can in principle be
determined by making an equal number of independent experimental measurements.
The insensitivity asserted here is actually demonstrated in a perturbative setting: one
shows that the formal expansion of an arbitrary scattering amplitude in powers of the
marginal and relevant interaction couplings defined at low momentum, holding the
irrelevant couplings fixed at the high cutoff scale $\Lambda_{UV}$, depends on the latter only through
inverse powers of $\Lambda_{UV}$. We then say that the marginal and relevant operators of the
effective theory form a “perturbatively renormalizable set”. Exactly how this works
will be the topic of Section 17.4 in the next Chapter.
Although the physical content of the renormalization group is most easily displayed
using momentum cutoff regularization schemes of the type we have used up to this
point, such schemes have distinct disadvantages from a calculational point of view
once one goes beyond the lowest orders of perturbation theory. Moreover, in theories
with local gauge symmetry, such cutoff schemes turn out to be incompatible with the
local symmetry, with the unwanted result that the renormalization group evolution
7 The use of the term “thermodynamic” here does not imply any connection to finite-temperature
phenomena: it is a carry-over from the close formal analogy between the Euclidean quantum functional
integral and the canonical partition sums of classical thermodynamics.
Regularization methods in field theory 589
formalism, we are already familiar with the notion that local quantum fields should
really be regarded as operator-valued distributions, and that the multiplication of such
distributions can lead to ambiguities (or singularities) in the continuum theory. Thus,
the matrix elements of the operators $O_n$, $O'_n$ etc., defined in Section 16.2 are actually
infinite, if we insist on working in a continuum theory where the UV cutoff is infinite.
Of course, the whole point of the philosophy espoused here is that a cutoff is not only
technically but physically required, and with the momentum cutoff in place (say, by
employing the modified propagator (16.29)), there are no ultraviolet divergences in any
of the loop integrals we encounter, so the operators of the theory have perfectly well-
defined matrix elements (and lead to well-defined perturbative corrections to n-point
functions of the theory when inserted into the graphs for some process). The actual
value of these matrix elements will, of course, depend on the cutoff, so we must keep
in mind that operators such as $O_0(x) \equiv \phi^2(x)$ or $O'_0(x) \equiv (\partial_\nu\phi(x))^2$ have, strictly
speaking, no meaning until we specify an ultraviolet regularization procedure, such
as (in the momentum-shell framework) a value for the UV cutoff Λ, and the specific
form of the cutoff (e.g., the smooth function (16.29)). The renormalization group flow
equation (16.41) of the preceding section expresses the fact that the infinite set of
operators so defined, at any given scale Λ, form a complete set, in the sense that
we need only alter their coefficients in the effective Lagrangian in order to obtain an
exactly equivalent description of the low-energy amplitudes of the theory at any other
scale μ < Λ. We once again remind the reader that this equation is an exact non-
perturbative statement about the amplitudes of the effective field theory, assuming
only that we start with a well-defined functional integral at the high scale: in the
proof of (16.41), we have employed only exact functional integral identities, with no
need to expand the exponent of the functional integral in a perturbative series.
Let us explore in a little more detail the freedom we have to choose different sets of
operator products in an effective field theory without altering the physical content of
the theory. At this point we shall resort to perturbation theory to gain some concrete
intuition about the variability entailed by this freedom of choice. Staying for the time
being with the momentum cutoff approach, let us consider the one particle to one
particle matrix elements of $O_0(x) \equiv \phi^2(x)$,
Ignoring uninteresting initial and final-state factors, the matrix element $\langle k'|O_0(0)|k\rangle$
receives, in addition to the tree-graph contribution (first graph in Fig. 16.4), a one-loop
contribution of order $g_2$ from the graph on the right in Fig. 16.4:
$$\langle k'|O_0(0)|k\rangle = 1 - 12g_2\int\frac{D(l^2/\Lambda^2)^{-1}\,D((q-l)^2/\Lambda^2)^{-1}}{(l^2+m^2)((q-l)^2+m^2)}\frac{d^4l}{(2\pi)^4} + \cdots \qquad (16.53)$$
$$= 1 - 12g_2\,I(q^2;\Lambda^2,m^2) + \cdots \qquad (16.54)$$
where the dots represent other perturbative corrections which are not of interest to
us presently. The one-loop integral I(q 2 ; Λ2 , m2 ) can be expanded in powers of the
momentum variable q << Λ, m (see Problem 4):
Fig. 16.4 (diagram not reproduced): tree graph plus one-loop graph, with the operator insertion marked ×; the loop lines carry momenta $l$ and $q-l$.
$$I(q^2;\Lambda^2,m^2) = \sum_n f_n\Big(\frac{\Lambda^2}{m^2}\Big)\Big(\frac{q^2}{m^2}\Big)^n \qquad (16.55)$$
With a little thought one establishes that the coefficient functions fn , which contain
the dependence on the UV cutoff Λ, and therefore incorporate the conventionality of
our particular regularization of the operator O0 (x), contain at worst a logarithmic
divergence $\ln(\Lambda^2/m^2)$ in the limit Λ → ∞, plus vanishing corrections involving inverse
powers of the cutoff. For example, taking the first term in the low-momentum expan-
sion, and assuming the parameter α in (16.29) large, so that the cutoff is essentially a
step function at Λ, we find
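$$f_0\Big(\frac{\Lambda^2}{m^2}\Big) = \frac{1}{16\pi^2}\Big(\ln(\Lambda^2/m^2) - 1 + O\Big(\frac{m^2}{\Lambda^2}\Big)\Big) \qquad (16.56)$$
with the $f_n(\frac{\Lambda^2}{m^2})$ for $n \ge 1$ given by dimensionless constants plus corrections of $O(\frac{m^2}{\Lambda^2})$.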
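As a rough numerical cross-check of (16.56), one can evaluate the zero-momentum integral directly with a sharp cutoff. The sketch below assumes the step-function limit described above; with $u = l^2$ the integral becomes $\frac{1}{16\pi^2}\int_0^{\Lambda^2} u\,du/(u+m^2)^2$, and the substitution $u = m^2(e^t-1)$ (an illustrative choice made here, not the text's) reduces the integrand to $1-e^{-t}$. The function names are hypothetical.

```python
import math

def f0_numeric(ratio, steps=20000):
    """Midpoint-rule value of I(q^2 = 0) for a sharp cutoff, ratio = Lambda^2/m^2."""
    upper = math.log(1.0 + ratio)          # range of t after u = m^2 (e^t - 1)
    h = upper / steps
    total = sum(1.0 - math.exp(-(i + 0.5) * h) for i in range(steps))
    return total * h / (16.0 * math.pi**2)

def f0_leading(ratio):
    """Leading large-cutoff form quoted in (16.56)."""
    return (math.log(ratio) - 1.0) / (16.0 * math.pi**2)

ratio = 1.0e6
print(f0_numeric(ratio), f0_leading(ratio))
```

The difference between the two values should shrink like $m^2/\Lambda^2$ as the ratio is increased, in line with the correction term in (16.56).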
A similar calculation, again including just the two graphs appearing in Fig. 16.4, gives
for the one-particle matrix element of $O'_0 = (\partial_\nu\phi)^2$ (the only difference being the
appearance of a dot product of the four-momenta entering and leaving the two-point
vertex of the $O'_0$ operator),
$$\langle k'|O'_0(0)|k\rangle = k'\cdot k - 12g_2\int\frac{l\cdot(l-q)\,D(l^2/\Lambda^2)^{-1}\,D((q-l)^2/\Lambda^2)^{-1}}{(l^2+m^2)((q-l)^2+m^2)}\frac{d^4l}{(2\pi)^4} + \cdots$$
$$= k'\cdot k - 12g_2\,I'(q^2;\Lambda^2,m^2) + \cdots \qquad (16.57)$$
where
$$I'(q^2;\Lambda^2,m^2) = m^2\sum_n f'_n\Big(\frac{\Lambda^2}{m^2}\Big)\Big(\frac{q^2}{m^2}\Big)^n \qquad (16.58)$$
In this case a quadratic dependence on the cutoff appears in the leading coefficient
function $f'_0$. Again, taking α large so that $D(l^2/\Lambda^2)^{-1}$ is approximately a step function
θ(Λ − |l|), one finds
$$f'_0\Big(\frac{\Lambda^2}{m^2}\Big) = \frac{1}{16\pi^2}\Big(\frac{\Lambda^2}{m^2} - 2\ln\Big(\frac{\Lambda^2}{m^2}\Big) + 1\Big) \qquad (16.59)$$
$$f'_1\Big(\frac{\Lambda^2}{m^2}\Big) = \frac{1}{16\pi^2}\Big(-\frac{1}{2}\ln\Big(\frac{\Lambda^2}{m^2}\Big) + \frac{2}{3}\Big) \qquad (16.60)$$
while the $f'_n$ are Λ independent constants for $n \ge 2$. Of course, for a more general
choice of cutoff function (for example, keeping the parameter α in the cutoff function
finite), the coefficients $f_n$, $f'_n$ (and their generalizations to all orders of perturbation
theory, as well as the corresponding coefficients for all possible local operators) will
be different, although exactly the same low-energy physics can be reproduced by an
appropriate (different) linear combination of the new set of regularized operators as
defined by the altered cutoff method, as we have seen in the preceding section. The
appearance of power-dependence (quadratic, in the case of (16.59)) on the ultraviolet
cutoff Λ in loop integrals, as we have already seen in Section 16.2, is responsible for
the mixing of operators of different mass dimension in the momentum cutoff approach
to the renormalization group.
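The quadratic cutoff dependence in (16.59) can be made concrete in the same way. The sketch below again assumes the sharp-cutoff limit: at $q = 0$ the integral $I'$ reduces to $\frac{m^2}{16\pi^2}\int_0^{\Lambda^2/m^2} v^2\,dv/(v+1)^2$, and the substitution $v = e^t - 1$ (chosen here for numerical convenience) gives the integrand $e^t - 2 + e^{-t}$. Function names are hypothetical.

```python
import math

def f0_prime_numeric(ratio, steps=20000):
    """Midpoint-rule value of I'(q^2 = 0)/m^2 for a sharp cutoff, ratio = Lambda^2/m^2."""
    upper = math.log(1.0 + ratio)          # range of t after v = e^t - 1
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(t) - 2.0 + math.exp(-t)
    return total * h / (16.0 * math.pi**2)

def f0_prime_leading(ratio):
    """Large-cutoff form quoted in (16.59): power term, logarithm, and constant."""
    return (ratio - 2.0 * math.log(ratio) + 1.0) / (16.0 * math.pi**2)

ratio = 1.0e4
print(f0_prime_numeric(ratio), f0_prime_leading(ratio))
```

Unlike $f_0$, the result here is dominated by the term linear in $\Lambda^2/m^2$; this is the power dependence that drives the mixing of operators of different dimension in the cutoff scheme.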
We shall now see that an alternative cutoff procedure can be used to give a precise
meaning to the matrix elements of an arbitrary local operator at any order of per-
turbation theory, with the remarkable additional feature that all power-dependences
on the cutoff are removed, leaving only terms with a logarithmic dependence on the
cutoff scale (such as the $\ln(\frac{\Lambda^2}{m^2})$ terms visible in (16.59, 16.60)). First, note that the
loop integral appearing in (16.53), with the cutoff functions $D^{-1}$ omitted, would in
fact be ultraviolet-convergent in any (integer) spacetime dimension less than four: it
is only logarithmically divergent in four dimensions after all, right at the edge, as it
were, of being a convergent integral at large momenta. This suggests a dimensional
regularization approach whereby we temporarily imagine carrying out the integral in
a general spacetime dimension d < 4, and then examine the behavior of the result
as we analytically continue the resultant expression back to the physical spacetime
dimension d = 4. To see how to do this, first note the expression for the radial phase-
space in a general d-dimensional Euclidean integral
$$\int d^dl = \frac{2\pi^{d/2}}{\Gamma(d/2)}\int_0^\infty l^{d-1}\,dl \qquad (16.61)$$
In order to preserve dimensional consistency, so that our expression for the regulated
amplitude retains the same dimension in powers of mass regardless of the dimension d,
we shall append the appropriate power of the ultraviolet scale Λ to each loop integral
to maintain overall mass dimension 4: thus loop integrals will appear as $\Lambda^{4-d}\int d^dl\cdots$.
Accordingly, we find that the one-loop integral in question, in d-dimensions, becomes
$$I_d(q^2;\Lambda^2,m^2) \equiv \Lambda^{4-d}\int\frac{d^dl}{(2\pi)^d\,(l^2+m^2)((q-l)^2+m^2)} \qquad (16.62)$$
$$= \frac{\Lambda^{4-d}}{(2\pi)^d}\int_0^1 dx\int\frac{d^dl}{(l^2-2xq\cdot l+xq^2+m^2)^2} \qquad (16.63)$$
$$= \frac{\Lambda^{4-d}}{(2\pi)^d}\int_0^1 dx\int\frac{d^dl}{(l^2+x(1-x)q^2+m^2)^2} \qquad (16.64)$$
$$= \frac{\pi^{d/2}\,\Lambda^{4-d}}{(2\pi)^d\,\Gamma(d/2)}\int_0^1 dx\int_0^\infty\frac{2\,l^{d-1}}{(l^2+M^2)^2}\,dl, \qquad (16.65)$$
$$M^2 \equiv x(1-x)q^2+m^2 \qquad (16.66)$$
whence we find
$$I_d(q^2;\Lambda^2,m^2) = \frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}}\int_0^1\Big(\frac{m^2+x(1-x)q^2}{\Lambda^2}\Big)^{\frac{d}{2}-2}dx \qquad (16.69)$$
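The step from the radial integral in (16.65) to the Γ-function form (16.69) rests on the standard result $\int_0^\infty 2\,l^{d-1}\,dl/(l^2+M^2)^2 = \Gamma(d/2)\,\Gamma(2-d/2)\,(M^2)^{d/2-2}$, valid for $0 < d < 4$. A minimal numerical sketch of this identity follows; the substitution $l = M\tan\theta$, the sample dimensions, and the function names are illustrative choices made here rather than anything prescribed by the text.

```python
import math

def radial_integral(d, mass_sq, steps=20000):
    """Midpoint-rule estimate of int_0^inf 2 l^(d-1) / (l^2 + M^2)^2 dl, 0 < d < 4."""
    # With l = M tan(theta) the integrand becomes
    # 2 * M^(d-4) * sin(theta)^(d-1) * cos(theta)^(3-d) on (0, pi/2).
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.sin(theta)**(d - 1.0) * math.cos(theta)**(3.0 - d)
    return 2.0 * mass_sq**(d / 2.0 - 2.0) * total * h

def gamma_form(d, mass_sq):
    """Closed form Gamma(d/2) * Gamma(2 - d/2) * (M^2)^(d/2 - 2)."""
    return math.gamma(d / 2.0) * math.gamma(2.0 - d / 2.0) * mass_sq**(d / 2.0 - 2.0)

for dimension in (2.0, 2.5, 3.0):
    print(dimension, radial_integral(dimension, 2.0), gamma_form(dimension, 2.0))
```

Multiplying the closed form by the prefactor $\pi^{d/2}\Lambda^{4-d}/((2\pi)^d\,\Gamma(d/2))$ of (16.65) cancels the $\Gamma(d/2)$ and leaves the combination $\Gamma(2-\frac{d}{2})/(4\pi)^{d/2}$ appearing in (16.69).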
We see that the right-hand side of (16.69) provides an analytic continuation of our
originally four-dimensional loop integral to general complex spacetime dimensions d,
with a finite result in the region Re(d) < 4, and indeed analytic save at the poles of the
Γ function at d = 4, 6, 8, . . .. Of course, the poles of the Γ function at zero (and negative
integer) values of its argument mean, not surprisingly, that the integral becomes
divergent once we attempt to return to the physical spacetime dimension d = 4. At
this point in the complex d-plane, our continued loop-integral has a Laurent expansion
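in the variable $\epsilon \equiv 4-d$, the first few terms of which (using the Γ function property
$\Gamma(z) \sim \frac{1}{z} - \gamma + O(z)$, $z \to 0$, with γ the Euler–Mascheroni constant) are found to be
$$I_d(q^2;\Lambda^2,m^2) \sim \frac{1}{16\pi^2}\Big(\frac{2}{\epsilon} + \ln(4\pi) - \gamma + \ln(\Lambda^2/m^2) - \int_0^1\ln\Big(1+x(1-x)\frac{q^2}{m^2}\Big)dx + O(\epsilon)\Big) \qquad (16.70)$$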
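The Laurent expansion (16.70) is easy to test numerically at $q = 0$, where the Feynman-parameter integral in (16.69) is trivial. The sketch below compares the exact expression $\Gamma(\epsilon/2)\,(4\pi)^{-(2-\epsilon/2)}\,(\Lambda^2/m^2)^{\epsilon/2}$ with the pole-plus-finite terms for small ε; the function names and the sample ratio $\Lambda^2/m^2 = 100$ are arbitrary illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def dimreg_integral_q0(eps, ratio):
    """(16.69) at q = 0 with d = 4 - eps and ratio = Lambda^2 / m^2."""
    half_eps = eps / 2.0              # equals 2 - d/2
    return math.gamma(half_eps) / (4.0 * math.pi)**(2.0 - half_eps) * ratio**half_eps

def laurent_q0(eps, ratio):
    """Pole and finite terms of (16.70) at q = 0; the O(eps) remainder is dropped."""
    return (2.0 / eps + math.log(4.0 * math.pi) - EULER_GAMMA
            + math.log(ratio)) / (16.0 * math.pi**2)

for eps in (1e-2, 1e-3, 1e-4):
    print(eps, dimreg_integral_q0(eps, 100.0) - laurent_q0(eps, 100.0))
```

The printed differences should fall roughly linearly with ε, as expected for the $O(\epsilon)$ remainder in (16.70).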
We now define8 the minimally subtracted dimensionally regularized matrix element of
our O0 operator arising from the one-loop graph in Fig. 16.4 by simply omitting the
pure pole term in (16.70), leaving the remaining “finite part” (henceforth indicated
by the notation FP) in the d → 4 limit:
8 The ubiquitous appearance of the annoying factor of $\ln(4\pi) - \gamma$ accompanying the pole in $\epsilon$ has led
to a modified minimal subtraction scheme, wherein the Laurent expansion is made in a shifted variable $\bar\epsilon$,
with $\frac{2}{\bar\epsilon} \equiv \frac{2}{\epsilon} + \ln(4\pi) - \gamma$, and poles in $\bar\epsilon$ are then dropped. This is commonly referred to as the “MS-bar”
scheme.
$$\mathrm{FP}\;I_d(q^2;\Lambda^2,m^2) \equiv \frac{1}{16\pi^2}\Big(\ln(\Lambda^2/m^2) + \ln(4\pi) - \gamma - \int_0^1\ln\Big(1+x(1-x)\frac{q^2}{m^2}\Big)dx\Big) \qquad (16.71)$$
The regularized amplitude (16.71) can be expanded in powers of the momentum as in
(16.55): for example, referring to (16.56), we see that the leading coefficient $f_0(\frac{\Lambda^2}{m^2})$
has exactly the same logarithmic cutoff dependence in the momentum cutoff and
dimensional regularization schemes, differing only by an overall additive constant, up
to terms of $O(\frac{m^2}{\Lambda^2})$, suppressed by inverse powers of the cutoff. It can be easily shown
(see Problem 5) that all higher coefficients fn , n ≥ 1 are in fact identical up to such
terms in the two regularization schemes.
For the operator under discussion therefore, O0 = φ2 , there would seem to be no
important differences between the use of a momentum cutoff or the pole subtraction
approach leading to (16.71). If we look instead at the operator $O'_0 = (\partial_\nu\phi)^2$, with one-particle
matrix elements given in (16.58), the situation is very different. Here, the one-loop
integral, containing the extra factor of $l\cdot(l-q)$ in the numerator, has a quadratic
dependence on the ultraviolet cutoff, resulting in the appearance of terms quadratic in
Λ in the leading coefficient $f'_0$ (see (16.59)). On the other hand, the dependence of the
dimensionally regularized amplitude on the cutoff Λ appears only through the prefactor
$\Lambda^{4-d} = \Lambda^\epsilon$, and in developing the Laurent expansion of the regularized Feynman
integral in powers of $\epsilon$ it is apparent that only powers of logarithms of the cutoff
Λ can appear and not whole integer powers, via $\Lambda^\epsilon = 1 + \epsilon\ln\Lambda + \frac{1}{2}\epsilon^2(\ln\Lambda)^2 + \ldots$.
A straightforward calculation, following exactly the steps used above to arrive at
the minimally subtracted matrix element corresponding to (16.58), gives for the matrix
element of $O'_0$,
$$I'_d(q^2;\Lambda^2,m^2) \equiv \frac{\Lambda^{4-d}}{(2\pi)^d}\int\frac{l\cdot(l-q)\,d^dl}{(l^2+m^2)((q-l)^2+m^2)} \qquad (16.72)$$
$$\mathrm{FP}\;I'_d(q^2;\Lambda^2,m^2) = \frac{1}{16\pi^2}\Big\{Aq^2 + Bm^2 - \Big(\frac{1}{2}q^2+2m^2\Big)\ln(\Lambda^2/m^2) + \int_0^1(3x(1-x)q^2+2m^2)\ln\Big(1+x(1-x)\frac{q^2}{m^2}\Big)dx\Big\} \qquad (16.73)$$
$$A = \frac{1}{2}(\gamma-\ln(4\pi)) - \frac{1}{6}, \qquad B = 2(\gamma-\ln(4\pi)) - 1 \qquad (16.74)$$
Comparing with the result (16.59) for the zero-momentum amplitude in the momen-
tum cutoff scheme, we see that the term proportional to Λ2 has, as expected,
disappeared: only a logarithm of the cutoff appears, which is in fact restricted to the
coefficients $f'_0$, $f'_1$, where it appears with the same coefficient in both the momentum
and dimensional regularization schemes.
Our discussion of dimensional regularization has clearly been very restricted: we
have considered only a few simple low-order perturbative contributions to a particular
matrix element of the two simplest local operators of our theory. It would clearly
be very desirable to derive a non-perturbative renormalization group equation for an
effective Lagrangian defined in terms of such operators, along the lines of the derivation
given in the preceding section for the momentum cutoff scheme. Unfortunately, the
obviously very formal prescription given here for obtaining finite matrix elements of
local operators, by simply eliminating the pure pole parts at d = 4 in Feynman loop
integrals analytically continued to complex dimensionality, cannot be extended beyond
the perturbatively expanded amplitudes of the field theory, as we simply have no way
of giving a sensible non-perturbatively valid definition of a local quantum field theory
in other than integer dimensions. For example, we do not know how to write down
the analog of the functional integral (16.32) for the exact generating functional of a
theory in non-integer dimensions, whose dynamics is specified in terms of an effective
Lagrangian expanded in local operators, with perturbative matrix elements defined
by dimensional pole subtraction.
Nevertheless, as we shall see in the next section, many important applications of
effective field theories may be carried out completely in the context of perturbation
theory, and in such cases the dimensional regularization approach is extraordinarily
useful. We have already seen a glimmer of why this might be the case in the examples
above: unlike the situation in a momentum cutoff scheme, integer powers of the renor-
malization scale which would otherwise result in the mixing of operators of different
dimension as the scale is changed are simply absent in dimensional regularization—a
fact which enormously simplifies the power-counting behavior of effective field theories.
In particular, the contribution of higher-dimension “irrelevant” operators to low-
energy amplitudes will remain “small” (in a precisely quantifiable sense) even when
loop integrals are considered, provided we employ dimensional regularization methods
to define these integrals.
Another very important advantage of dimensional regularization (over the momen-
tum cutoff approach) is the ease with which it incorporates local vector gauge
symmetries, which are formally preserved in this approach, as the form of the
Lagrangian for such symmetries remains unaltered in (integer) dimensions other than
the physical one. This turns out to have the very pleasant consequence that the
Ward identities of the theory expressing the local gauge symmetry are preserved
under dimensional regularization. We shall return to these issues in the subsequent
chapters. In particular, a consistent definition of regularized local composite operators,
extending the low-order examples given above, but valid to all orders of perturbation
theory, requires graph-theoretical technology that we will develop in the next two
chapters when we consider perturbative renormalization in more detail. The general
procedure—the “normal product formalism” of Zimmermann—for obtaining well-
defined local composite operators will be explained in detail in Section 18.1.
Before leaving the issue of regularization, we should comment on one potentially
confusing issue which may already have crossed the reader’s mind in connection
with the absence of power-dependence on the cutoff scale in dimensionally regularized
amplitudes. We previously emphasized the difficulty (due to just such power-dependences
in a momentum cutoff scheme) with maintaining “small” values (i.e.,
much smaller than the cutoff scale of the theory) for the coefficients of the relevant
operators in an effective Lagrangian defined by momentum cutoff. This “fine-tuning”
difficulty is most dramatically manifested in the cosmological constant and hierarchy
(Higgs mass) problems, briefly discussed earlier. These issues are not obviated by
the existence of a regularization scheme (dimensional regularization), where power-
Effective field theories: a compendium 595
Φm = fm (φn ; Λ) (16.75)
Note that (a) the number M of smeared fields $\Phi_m$ may be different from (typically, smaller
than) the original number N of short-distance fields $\phi_n$, and that (b) the smearing
functions fm may be linear or non-linear in character, may depend on a sliding energy
scale Λ, and are not in general invertible—the smearing in this sense entailing a “loss
of information” as regards the full local physics of the theory. If the theory is originally
specified in terms of the original φn fields via a (Euclidean) Lagrangian L(φn ), then
the effective Lagrangian Leff (Φm ) associated with the smeared fields Φm is defined by
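$$e^{-\int d^4x\,\mathcal{L}_{\rm eff}(\Phi_m)} \equiv \int\prod_n\mathcal{D}\phi_n\;\delta(\Phi_m - f_m(\phi_n;\Lambda))\,e^{-\int d^4x\,\mathcal{L}(\phi_n)} \qquad (16.76)$$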
Provided we are only interested in the n-point functions of the new fields Φm , we can
discard completely the original microscopic Lagrangian L(φn ) in favor of the effective
theory defined by Leff (Φm ), as the generating functional for the Φm can be written
entirely in terms of the latter,
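$$Z[J_m] = \int\prod_n\mathcal{D}\phi_n\;e^{-\int d^4x\,\mathcal{L}(\phi_n)+\int d^4x\,J_m(x)f_m(\phi_n;\Lambda)} \qquad (16.77)$$
$$= \int\prod_m\mathcal{D}\Phi_m\;e^{-\int d^4x\,\mathcal{L}_{\rm eff}(\Phi_m)+\int d^4x\,J_m(x)\Phi_m(x)} \qquad (16.78)$$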
as we can see by introducing the definition (16.76) on the right-hand side of (16.78). We
note here that the smearing functions are subject to certain smoothness requirements
in order to ensure that the resultant effective Lagrangian (16.76) can be expanded
in multinomials of local products of the smeared fields Φm (see the discussion
following (16.47)).
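The definition (16.76) can be illustrated by a zero-dimensional toy model, in which the functional integral collapses to an ordinary integral. The sketch below is an assumption-laden illustration, not a construction from the text: two real variables $\phi_1, \phi_2$ have the quadratic "Lagrangian" $\frac{1}{2}a\phi_1^2 + \frac{1}{2}b\phi_2^2 + c\,\phi_1\phi_2$, the smeared variable is simply $\Phi = \phi_1$, and integrating out $\phi_2$ should produce $L_{\rm eff}(\Phi) = \frac{1}{2}(a - c^2/b)\Phi^2$ up to a constant. All names and numerical values are hypothetical.

```python
import math

def integrate_out_phi2(Phi, a, b, c, half_width=12.0, steps=4000):
    """Midpoint-rule estimate of the integral over phi2 of exp(-L(Phi, phi2))."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        phi2 = -half_width + (i + 0.5) * h
        total += math.exp(-(0.5 * a * Phi**2 + 0.5 * b * phi2**2 + c * Phi * phi2))
    return total * h

def effective_action_difference(Phi, a, b, c):
    """L_eff(Phi) - L_eff(0), read off from exp(-L_eff) = integral over phi2."""
    weight = integrate_out_phi2(Phi, a, b, c)
    reference = integrate_out_phi2(0.0, a, b, c)
    return -math.log(weight / reference)

a, b, c, Phi = 1.5, 2.0, 0.7, 1.3
numeric = effective_action_difference(Phi, a, b, c)
closed_form = 0.5 * (a - c**2 / b) * Phi**2
print(numeric, closed_form)
```

The shift of the quadratic coefficient from $a$ to $a - c^2/b$ is the toy analog of the modification of the effective couplings when modes are integrated out; correlation functions of Φ alone can then be computed from the effective weight without reference to $\phi_2$.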
The application of effective field theory methods has become a wide-spread industry
in modern high-energy physics, and it would certainly require an entire additional
volume to do justice to only the most widely used. We shall conclude our very brief
introduction with a few examples that illustrate the main types, based on the nature
of the smearing functions used to define the effective theory, and refer the reader to
more detailed treatments available in the many excellent reviews of this subject for
a more thorough discussion of the individual cases. Following the general philosophy
exemplified by (16.75), we can choose functional changes of variables which involve
1. a linear transformation of the modes of a single fundamental field of the theory,
2. a linear transformation involving several distinct fundamental fields, or
3. a non-linear change of field variables. In this latter case, one may be left with
an effective field theory involving completely different fields than the underlying
“microscopic” elementary fields which define the short-distance dynamics of the
theory.
We have already encountered an example of the first type in our discussion of the
renormalization group transformation of a scalar field theory with a momentum cutoff.
Here the smearing function amounts to a cutoff of the Fourier modes φ̃(k) of a single
scalar field. An extreme example of such a cutoff is the constraint effective potential
discussed in Section 14.3 (cf. (14.58)), where the effective field Φ is just the zero-
momentum mode φ̃(0) of the original field theory: all non-zero momentum modes are
integrated out. As we saw in Chapter 14, the remaining (highly truncated!) effective
theory is an important tool when examining the possibility of spontaneous symmetry-
breaking of the underlying theory. A somewhat less trivial example is provided by
non-relativistic effective field theory (NREFT),9 where we are interested in the Fourier
modes φ̃(k) of a massive field corresponding to non-relativistic quanta of the same:
i.e., with |k| ∼ κ << m, |k0 | ∼ m + O(κ2 /m). We can expose these modes of the field
by a simple linear transformation of the original field φ, which here we take to be a
real scalar field with φ4 interaction and short-distance (Minkowski) Lagrangian
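$$\mathcal{L} = \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4 \qquad (16.79)$$
Define a new field
$$\psi(x) \equiv \sqrt{2m}\;e^{imx^0}\int\theta(k_0)\,\tilde\phi(k)\,e^{-ik\cdot x}\frac{d^4k}{(2\pi)^4} \qquad (16.80)$$
in terms of which the original field may be written
$$\phi(x) = \frac{1}{\sqrt{2m}}\big(e^{-imx^0}\psi(x) + e^{imx^0}\psi^\dagger(x)\big) \qquad (16.81)$$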
We shall assume that “relativistic” modes of the new field ψ(x) have been integrated
out,10 and that the effective theory that remains contains only Fourier modes of ψ̃(k)
with k0 << m. Accordingly, when (16.81) is substituted back into the Lagrangian
(16.79), terms with unequal numbers of ψ and ψ† fields are accompanied by time-dependent
factors $e^{\pm 2inmx^0}$ with n a non-zero integer which must vanish when we
integrate the Lagrange density over time to form the action, as they cannot be
compensated by the remaining time-dependence of the ψ fields (which, given the
assumed momentum scales |k0 | ∼ m + O(κ2 /m) of the original φ field, mean that
the momentum modes relevant to ψ̃(k) have |k| ∼ κ << m, k0 ∼ O(κ2 /m) << m).
Substituting (16.81) in the Lagrangian (16.79), one finds for the Minkowski action of
the resultant effective theory the leading terms
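$$\mathcal{L} \to \int d^4x\,\Big\{i\dot\psi\psi^\dagger + \frac{1}{2m}\psi^\dagger\nabla^2\psi - \frac{\lambda}{16m^2}(\psi^\dagger\psi)^2\Big\} + \cdots \qquad (16.82)$$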
where the dots refer to terms with unequal numbers of ψ and ψ† (which will induce
pair creation and annihilation processes which are unimportant in the non-relativistic
regime), as well as terms like $\frac{1}{2m}\dot\psi^\dagger\dot\psi$ which scale like $\frac{k_0^2}{m}$ (subleading, as $k_0^2/m \ll k_0 \sim
k^2/m$, the scaling behavior of the free, quadratic part of the effective Lagrangian). The
effective Lagrangian (16.82) can be used to establish the existence of bound states in
d = 2, 3 dimensions (for negative λ) exactly as in Section 11.2, and the reader can verify
that for weak coupling the bound-state energy is correctly determined to leading order
9 The reader may find it convenient at this point to review our discussion of non-relativistic threshold
physics in Section 11.2.
10 As in the renormalization group transformations of Section 16.3, this will result in a modification of the
coefficients of the leading terms in the effective Lagrangian. In weakly coupled theories, as we imagine here,
these modifications will be small and can be computed in perturbation theory. Practically, the determination
of the coefficients in the effective Lagrangian for any given cutoff scheme is carried out by a “matching”
process which we shall describe briefly at the end of this section.
where Lψ is the free Lagrangian for the light fermion (although we may also allow
this field to have other interactions unconnected with the heavy scalar, e.g., gauge
interactions, without altering what follows in any essential way). The exponent in this
functional integral is at most quadratic in the scalar field, which we may therefore
integrate out completely, obtaining
$$Z = \int\mathcal{D}\psi\;e^{-\int d^4x\,\mathcal{L}_\psi - \frac{g^2}{2}\int d^4x\,d^4y\,S(x)\Delta_E(x-y)S(y)} \qquad (16.84)$$
example of this type is provided by the chiral Lagrangians describing the low-energy
behavior of hadronic amplitudes. The underlying QCD Lagrangian containing quark
and gluon fields in this case is replaced by an effective Lagrangian in terms of meson
(and possibly baryon) fields. We shall illustrate the basic idea by taking a highly
simplified version of the real world as our starting point: we assume there are only
two quarks (the “up” and “down” quarks), and neglect, at least initially, their masses
mu and md (which are known to be much smaller than all other mass scales in QCD),
so that our theory is described at the microscopic level (i.e., at distance scales much
smaller than a fermi) by the Lagrangian (cf. (15.112))
$$L_{QCD} = -\frac{1}{4}\mathrm{Tr}(F_{\mu\nu}F^{\mu\nu}) + \sum_{a=1}^{2}\bar{q}_a\, i\slashed{D}\, q_a \tag{16.90}$$
The local gauge group will as usual be taken to be SU(3), although we shall see that
specific details of the gauge group dynamics remain essentially hidden through the
process of generating the effective field theory. We use the notation qa (x) for the quark
fields, with q1 (x) = u(x) the up-quark field and q2 (x) the down-quark field. Defining
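left and right chiral parts of the quark fields in the usual way, $q_L(x) = \frac{1+\gamma_5}{2}q(x)$,
$q_R(x) = \frac{1-\gamma_5}{2}q(x)$, the quark kinetic term in (16.90) can be rewritten as
$$\sum_a \bar{q}_a\, i\slashed{D}\, q_a = \sum_a \big(\bar{q}_{La}\, i\slashed{D}\, q_{La} + \bar{q}_{Ra}\, i\slashed{D}\, q_{Ra}\big) \tag{16.91}$$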
as the $\gamma_0\gamma_\mu$ product separating $q^{\dagger}(x)$ from $q(x)$ in the quark kinetic bilinear commutes
with $\gamma_5$. Formally, therefore, our fundamental Lagrangian is invariant under the eight-dimensional
chiral group U(2)×U(2) (four generators from each U(2)) corresponding
to the global linear field transformations
where B is a constant of dimension mass$^3$ (from the dimensions of the quark fields) is
no longer a matter for any serious debate: its validity has been more than adequately
confirmed by extensive non-perturbative numerical computations using lattice QCD.
Under a general chiral transformation $(V_L, V_R)$,
$$\bar{q}_{La}\, q_{Rb} \to V^{\dagger}_{L\,a'a}\, V_{R\,bb'}\; \bar{q}_{La'}\, q_{Rb'} \tag{16.94}$$
from which we see that the VEV (16.93) leaves the diagonal isospin subgroup $V_L = V_R$
unbroken, but does indeed break the chiral SU(2) subgroup with $V_L = V_R^{\dagger}$. Of course,
we are at liberty to redefine the quark fields by a (dynamically exact) chiral symmetry
transformation $q_{La}(x) \to V^{*}_{ab}\, q_{Lb}(x)$, $q_{Ra}(x) \to V^{T}_{ab}\, q_{Rb}(x)$, for some $V \in$ SU(2),
whereupon the VEV becomes
where in the final equality we have chosen to parameterize the degenerate vacua of
the theory by the chiral transformation V connecting the particular vacuum to the
canonical one in (16.93) corresponding to V = 1. From our discussion of spontaneous
symmetry-breaking in Section 14.3, we recall that well-defined amplitudes in an
infinite-volume theory where an initially exact symmetry is spontaneously broken
can be obtained only by introducing a small symmetry-breaking perturbation which
“tickles” the system into a particular one of the infinitely many degenerate vacua,
before the infinite volume limit is taken. As we shall see below, in real life this
perturbation is provided by the quark masses which we have so far neglected.
In our proof of the Goldstone theorem in Section 14.2, we saw that the Noether
current of a spontaneously broken symmetry serves as an interpolating field for
the corresponding Goldstone boson (in other words, this operator possesses a non-
vanishing vacuum to single-particle matrix element for the corresponding Goldstone
boson, which by Haag–Ruelle theory means that it can be used to construct the exact
S-matrix scattering amplitudes of the Goldstone particle). In our case the (three)
relevant currents are the axial vector currents J5aμ ≡ q̄γμ γ5 τa q, a = 1, 2, 3 (where τ1,2,3
are the Pauli matrices: we avoid using the usual σ notation here as a field with this
name will shortly make its appearance). We may just as well use the pseudoscalar
operators q̄γ5 τa q, however: indeed, if a quark mass term is present (as it is, in the
real world), these operators are proportional to the divergence of the axial vector
currents J5aμ (cf. (15.199), but with no anomaly term and an SU(2) generator matrix
τa included both in the current on the left and the pseudoscalar divergence on the
right). Consequently, they must have a non-vanishing vacuum to single pion matrix
element if the J5aμ do.
We shall therefore content ourselves with writing a generating functional with sources
for the q̄γ5 τa q operators, as well as for the scalar density q̄q = q̄L qR + h.c. whose VEV
signals the spontaneous symmetry-breaking of the theory. Knowledge of this functional
is tantamount (in virtue of the Haag–Ruelle or LSZ scattering theories discussed in
Chapter 9) to knowledge of the full set of multi-pion scattering amplitudes in the
theory. We therefore define (in Minkowski space, and glossing over the usual fine
points vis-à-vis gauge fixing, DFP determinants, ghosts, etc., in our specification of
the functional integral)
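$$Z[s, \vec{p}\,] \equiv \int D\bar{q}\,Dq\,DA_\mu\; e^{\,i\int d^4x\,\left(L_{QCD} - \bar{q}(x)\left(s(x) - i\gamma_5\,\vec{\tau}\cdot\vec{p}(x)\right)q(x)\right)} \tag{16.96}$$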
where the Aμ are the gauge vector gluon fields implementing the underlying color
SU(3) local gauge symmetry, and s(x), p(x) are as usual c-number external sources cou-
pled to the operators of interest in the theory. The expected spontaneous symmetry-
breaking means that we must supplement this functional specification of the theory
by a choice of vacuum, inserted via an infinitesimal “magnetizing” field: in this case,
a small perturbing quark mass term $\epsilon\,\bar{q}(x)q(x)$ ($\epsilon$ small, real and positive), which can
be implemented by taking the source field s(x) to contain the spacetime constant
term $\epsilon$ (plus, as usual, fields vanishing at infinity, or outside some compact region of
support). The source term may be decomposed by chirally splitting the quark fields
in the usual way:
From the form (16.97) it follows immediately that the source term (together with the
Lagrangian LQCD , from our previous discussion) is invariant under the joint chiral
transformation (16.92) (with VL , VR SU(2) matrices), together with the source field
transformation
If we consider the effect of a functional change of field variable in the path integral
(16.96) consisting precisely of such a SU(2)×SU(2) chiral transformation (which, being
anomaly-free, has unit functional Jacobian), we see that this invariance transfers
Effective field theories: a compendium 603
directly to the functional Z[s, p] which we may just as well write as a functional
Z[χ] of the 2x2 matrix source field χ(x):
Next, let us define a new 2x2 matrix field Σ(x) = σ(x) + iτ · π (x) which incorporates
four fields—an isoscalar σ and an isovector π —in terms of which we shall write our
effective Lagrangian. The latter may be defined in terms of the functional Fourier
transform of Z[χ], as follows:
$$\int d^4x\, L_{\rm eff}[\Sigma] \equiv -i \ln\Big( \int D\chi\; e^{-\frac{i}{2}\int d^4x\,\mathrm{Tr}(\chi^{\dagger}(x)\Sigma(x))}\, Z[\chi] \Big) \tag{16.101}$$
where the factor of one-half arises as a consequence of the relation $\mathrm{Tr}(\chi^{\dagger}(x)\Sigma(x)) =
\mathrm{Tr}(\chi(x)\Sigma^{\dagger}(x)) = 2(s(x)\sigma(x) + \vec{p}(x)\cdot\vec{\pi}(x))$. The inverse Fourier transform relation then
becomes
$$Z[\chi] = \int D\Sigma\; e^{\,i\int d^4x\,\left(L_{\rm eff}[\Sigma] + \frac{1}{2}\mathrm{Tr}(\chi^{\dagger}(x)\Sigma(x))\right)} \tag{16.102}$$
Once again, the chiral SU(2)×SU(2) symmetry (16.100) transfers directly to our new
effective Lagrangian Leff [Σ] defined in (16.101),11
The effective Lagrangian Leff [Σ] is an exact transcription of the dynamics of QCD
relevant for the determination of the full n-point Green functions of the quark bilinear
fields q̄q and q̄γ5 τ q coupled to the sources χ: in particular, if we knew the exact
form of this functional, we would be able to calculate arbitrary multi-pion scattering
amplitudes exactly, and even determine the exact nucleon mass from the location of
the nucleon–antinucleon threshold in $\pi^0$-$\pi^0$ scattering, for example! Of course, all we
know about this effective Lagrangian is that it is invariant under the chiral symmetry
(16.103). However, by the same arguments of clustering and Lorentz-invariance which
led to the general form (16.6) in Section 16.2, any (appropriately regularized) effective
Lagrangian must be expandable in an infinite series of products of local operators
and their spacetime derivatives. The chiral symmetry which we must impose on
Leff [Σ] implies that only certain combinations of the matrix field Σ can appear in this
expansion: specifically, the Lagrangian must be given in terms of traces of products of
Σ and Σ† arranged to ensure the validity of (16.103). We thereby obtain the following
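expansion (rescaling the field Σ to fix the coefficient of the kinetic term at $\frac{1}{4}$, and
including an infinitesimal mass term $\epsilon\,\mathrm{Tr}(\Sigma)$ from the source s(x) as discussed above,
with its coefficient, rescaled along with Σ, again written as $\epsilon$)
$$L_{\rm lin}[\Sigma] = \frac{1}{4}\mathrm{Tr}(\partial_\mu\Sigma^{\dagger}\partial^{\mu}\Sigma) + \frac{\mu^2}{4}\mathrm{Tr}(\Sigma^{\dagger}\Sigma) - \frac{\lambda}{16}\big(\mathrm{Tr}(\Sigma^{\dagger}\Sigma)\big)^2 + \epsilon\,\mathrm{Tr}(\Sigma) + \ldots \tag{16.104}$$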
11 The reader may easily check that the SU(2)×SU(2) chiral symmetry in (16.103) is equivalent to an
O(4) rotation of the four fields σ, π. One frequently finds discussions of chiral symmetry phrased in terms
of this O(4) language.
where the dots refer to higher-dimension (and therefore, in the sense of Section 16.3,
irrelevant) operators such as (Tr(Σ† Σ))3 , Tr(∂μ Σ† ∂ν Σ)Tr(∂ μ Σ† ∂ ν Σ), etc. We shall
return to the role of higher-dimension operators in our theory below. Here, we note
that the dimension 4 (or less) terms indicated in (16.104) constitute the so-called
“linear σ model”. As our field $\sigma = \frac{1}{2}\mathrm{Tr}(\Sigma)$ reproduces the matrix elements of q̄q we
must ensure that it acquires a VEV in the vacuum, which implies the choice of sign
of the second term in (16.104). The linear model leads in the usual way at tree level
to a vacuum expectation value for the σ field at the unique minimum of the field
polynomial (for infinitesimal positive $\epsilon$)
$$P(\sigma, \vec{\pi}) = -\frac{\mu^2}{2}(\sigma^2 + \vec{\pi}^2) + \frac{\lambda}{4}(\sigma^2 + \vec{\pi}^2)^2 - 2\epsilon\,\sigma \tag{16.105}$$
which occurs at $\sigma \equiv v = \mu/\sqrt{\lambda}$. Displacing the field $\sigma(x) = v + \hat{\sigma}(x)$ in the usual
way, we find that the π fields lose their mass term and become massless Goldstone
bosons as expected. This result, of course, continues to hold at the exact minimum of
the non-derivative part of the (unknown!) full Llin [Σ], which by the chiral symmetry,
is necessarily a function of σ 2 + π 2 : thus, at the minimum, we have π = 0 and the flat
directions are just those of the π fields. In addition to the massless Goldstone π fields,
the model also contains the massive σ̂ degree of freedom, with mass of order μ.12
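The tree-level statements above can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (MU, LAM) and hypothetical helper names that do not appear in the text. It locates the minimum of the polynomial (16.105) along the σ axis and evaluates the curvature there by finite differences, using only the Python standard library.

```python
import math

MU, LAM = 1.0, 0.5  # illustrative values of mu and lambda (assumptions)

def potential(sigma, pi1, eps):
    """P(sigma, pi) of (16.105), restricted to one pion component pi1."""
    r2 = sigma**2 + pi1**2
    return -0.5 * MU**2 * r2 + 0.25 * LAM * r2**2 - 2.0 * eps * sigma

def vev(eps, lo=1e-6, hi=10.0):
    """Bisect dP/dsigma = lam*s^3 - mu^2*s - 2*eps = 0 for its positive root."""
    slope = lambda s: LAM * s**3 - MU**2 * s - 2.0 * eps
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def curvature(fn, x, h=1e-4):
    """Central finite-difference second derivative."""
    return (fn(x + h) - 2.0 * fn(x) + fn(x - h)) / h**2

def masses_squared(eps):
    """Return (v, m_sigma^2, m_pi^2) at the tree-level minimum."""
    v = vev(eps)
    m_sigma2 = curvature(lambda s: potential(s, 0.0, eps), v)
    m_pi2 = curvature(lambda p: potential(v, p, eps), 0.0)
    return v, m_sigma2, m_pi2

if __name__ == "__main__":
    for eps in (0.0, 0.01):
        print(eps, masses_squared(eps))
```

For ε = 0 the sketch should return v = μ/√λ, a σ̂ curvature of 2μ², and a vanishing pion curvature. For small positive ε the stationarity condition λv³ − μ²v = 2ε implies a pion curvature of 2ε/v, so the would-be Goldstone modes are lifted in proportion to the explicit symmetry-breaking term.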
As we explained in our general discussion at the beginning of this section, the
derivation of an effective field theory usually entails, in addition to a functional change
of variable, the partial elimination of degrees of freedom by integrating out field modes
which are not important at the energy scales of interest. We now proceed to this second
step, beginning with the (still, in principle exact) effective theory (16.104). We shall be
interested in processes occurring at momentum scales much lower than the mass scale
μ of the “heavy” degrees of freedom interpolated by the σ̂ field. Precisely as in our
discussion of heavy particle decoupling above, we can do this by integrating out the σ
degree of freedom, leaving an effective Lagrangian depending only on the Goldstone
fields π . The most convenient way to do this involves a further change of variable,
whereby we re-express the theory in terms of new fields S(x), Π via the non-linear
transformation
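$$\Sigma(x) = \sigma(x) + i\vec{\tau}\cdot\vec{\pi}(x) \equiv S(x)\,e^{i\vec{\tau}\cdot\vec{\Pi}/v}, \qquad S(x) \equiv \sqrt{\sigma^2 + \vec{\pi}^2} = v + \hat{S}(x) \tag{16.106}$$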
The reader may easily verify (Problem 8) that this change of variable, when inserted
in (16.104), leads, after expanding the exponential, to a Lagrangian with (canonically
normalized) massive Ŝ field and massless Π fields. The chiral symmetry transfers
directly to the unitary matrix field $U(x) \equiv e^{i\vec{\tau}\cdot\vec{\Pi}/v}$ (as $S(x) = \sqrt{\det(\Sigma)}$ is chirally
invariant): the theory must be invariant under
12 Note that while the theory contains physical (massless) pions, there is no stable particle associated
with the σ field: the mass scale μ is naturally of the order of the other important physical hadronic scales,
e.g., the rho resonance pole or nucleon mass, i.e., closer to 1 GeV, and quite a bit larger than the VEV v,
which turns out to be just the pion decay constant fπ ∼ 100 MeV (see Problem 7).
The result of integrating out the massive Ŝ field must therefore be an effective
Lagrangian Lnonlin [U ], subject to the exact global symmetry (16.107), which our new
effective theory inherits from the original symmetry (16.103) of the linear model.
Following a by now familiar pattern, we therefore set about constructing the most
general chirally invariant functional of the unitary matrix field U , as an expansion
in terms involving traces of products of U , U † and their spacetime-derivatives. As
factors of U and U † must appear adjacent in the traces to ensure invariance under
(16.107), they must have derivatives to avoid evaporating via the unitarity constraint
U † U = U U † = 1. The coefficient of the leading term in the expansion (two derivatives
only) can be chosen to yield the canonical normalization of the kinetic term for the Π
field, and we find the non-linear σ model
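$$\begin{aligned}
L_{\rm nonlin}[U] = {} & \frac{v^2}{4}\mathrm{Tr}(\partial_\mu U^{\dagger}\partial^{\mu}U) \\
& + L_1\big(\mathrm{Tr}(\partial_\mu U^{\dagger}\partial^{\mu}U)\big)^2 + L_2\,\mathrm{Tr}(\partial_\mu U^{\dagger}\partial_\nu U)\,\mathrm{Tr}(\partial^{\mu}U^{\dagger}\partial^{\nu}U) \\
& + L_3\,\mathrm{Tr}(\partial^{\mu}U^{\dagger}\partial_\mu U\,\partial^{\nu}U^{\dagger}\partial_\nu U) + \ldots
\end{aligned} \tag{16.108}$$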
where we have followed the notation of Gasser and Leutwyler (Gasser and Leutwyler,
1985) in denoting the higher-order coefficients $L_1$, $L_2$, etc. If we expand the exponential
$U = e^{i\vec{\tau}\cdot\vec{\Pi}/v}$, we find an effective Lagrangian with a massless kinetic term for our
Goldstone pion fields Π and interaction terms which all contain two (or more)
spacetime-derivatives. For example, the terms $\mathcal{L}_2$ with two derivatives only (from
the first term on the right-hand side of (16.108)) become (see Problem 9)
$$\mathcal{L}_2 = \frac{1}{2}\partial_\mu\vec{\Pi}\cdot\partial^{\mu}\vec{\Pi} + \frac{1}{6v^2}\Big((\vec{\Pi}\cdot\partial_\mu\vec{\Pi})(\vec{\Pi}\cdot\partial^{\mu}\vec{\Pi}) - \vec{\Pi}^2\,\partial_\mu\vec{\Pi}\cdot\partial^{\mu}\vec{\Pi}\Big) + \ldots \tag{16.109}$$
where the dots represent operators of dimension 8 or higher. The terms $\mathcal{L}_4$, $\mathcal{L}_6$, ... with
four, six, ... spacetime-derivatives may similarly be expanded in powers of the fields
Π which interpolate for the massless pions of our toy theory. At least at tree level,
we see that the interaction terms in this effective Lagrangian give rise to powers of
the external momenta of the process corresponding to the spacetime-derivatives, and
that at low momenta (much smaller than the scale v, say, which can be shown to
correspond exactly to the pion decay constant fπ —see Problem 7) the multi-pion
scattering amplitudes of our theory should be given to a good approximation by the
graphs generated by L2 , with progressively smaller contributions from L4 , L6 . . ., etc.
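The two-derivative expansion (16.109) can be tested numerically without any matrix algebra. The sketch below rests on two standard SU(2) facts that are assumptions of this example rather than statements from the text: exp(iτ·Π/v) = cos θ + i τ·n̂ sin θ with θ = |Π|/v, and Tr(∂U†∂U) = 2(ȧ² + ḃ·ḃ) for U = a + iτ·b. A single derivative direction is used, since the Lorentz index plays no role in the isospin algebra; the function names and the value of V are illustrative.

```python
import math

V = 1.0  # sigma-model VEV v in illustrative units (assumption)

def u_components(pi):
    """Return (a, b) with U = a + i tau.b = exp(i tau.Pi / v)."""
    norm = math.sqrt(sum(x * x for x in pi))
    if norm == 0.0:
        return 1.0, [0.0, 0.0, 0.0]
    theta = norm / V
    scale = math.sin(theta) / norm
    return math.cos(theta), [scale * x for x in pi]

def exact_kinetic(pi, dpi, h=1e-5):
    """(v^2/4) Tr(dU^dagger dU) = (v^2/2)(da^2 + db.db), by central differences."""
    a_plus, b_plus = u_components([p + h * d for p, d in zip(pi, dpi)])
    a_minus, b_minus = u_components([p - h * d for p, d in zip(pi, dpi)])
    da = (a_plus - a_minus) / (2.0 * h)
    db = [(x - y) / (2.0 * h) for x, y in zip(b_plus, b_minus)]
    return 0.5 * V**2 * (da**2 + sum(x * x for x in db))

def truncated_kinetic(pi, dpi):
    """The terms displayed explicitly in (16.109)."""
    pp = sum(x * x for x in pi)
    dd = sum(x * x for x in dpi)
    pd = sum(x * y for x, y in zip(pi, dpi))
    return 0.5 * dd + (pd**2 - pp * dd) / (6.0 * V**2)

if __name__ == "__main__":
    pi, dpi = [0.1, -0.05, 0.02], [0.3, 0.7, -0.2]
    print(exact_kinetic(pi, dpi), truncated_kinetic(pi, dpi))
```

For field values small compared with v, the difference between the two functions should be of sixth order in the fields (four powers of Π and two of its derivative), which is the size of the first omitted term.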
But what about loops, which must certainly be included if we wish to calculate in
a systematic way the complete amplitudes implied by our effective theory? And what
about the infinite set of terms represented by the dots in (16.109), containing higher
powers of the Π field, but still only two derivatives, which we may expect to contribute
comparably to the indicated ones, by this argument?
In fact, as the discussion in Section 16.5 made clear, the formal expansion of
an effective Lagrangian in local operators containing powers of the fields and their
The E external pion fields and $N_\pi$ fields at the interaction vertices clearly produce
graphs with the number of internal lines $I = \frac{N_\pi - E}{2}$ (as each internal line arises from the
contraction of two pion fields). Moreover, we saw in Section 10.4 that a connected graph
with L loops and $V = \sum_n N_n$ vertices has $I = L + V - 1$ internal lines. Eliminating
$N_\pi = 2L + 2V + E - 2$ from (16.110), we find
$$D = 2 + \sum_n (n - 2)N_n + 2L \tag{16.111}$$
As the second and third terms on the right are zero or positive, we see that the leading
behavior of an arbitrary multi-pion amplitude at low momentum (a) vanishes at least
quadratically as the external momenta go to zero, and (b) that the leading behavior
at low momentum (D = 2), is given completely by the lowest term L2 in our general
effective Lagrangian, and indeed, by only the tree graphs (L = 0) obtained therefrom.
Higher-order corrections at low momentum, of order p4 say, require using the higher-
order term L4 at tree level, or L2 to one-loop order; and so on. In any event, it is clear
that only a finite and well-defined set of parameters (which must be determined by
experimental fits, or calculated non-perturbatively, say by lattice QCD techniques) are
relevant up to any given order of the low-momentum expansion of the amplitudes of
the theory. We see again the quintessential advantage of an effective field theory: the
sensitivity of the amplitudes of the theory in a restricted momentum regime is found
to be restricted to a limited set of terms, allowing these amplitudes to be calculated in
a systematic, though approximate, fashion in perturbation theory, starting from the
leading terms in the effective Lagrangian for the theory.
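The bookkeeping implied by (16.111) is easy to mechanize. The sketch below uses hypothetical function names and assumes that vertices carry an even number n = 2, 4, 6, ... of derivatives. It evaluates D for a given graph and enumerates which combinations of loop number and higher-derivative vertices contribute at a chosen order of the low-momentum expansion. Vertices from the two-derivative term do not change D and are therefore left unconstrained.

```python
def chiral_dimension(vertex_counts, n_loops):
    """D of (16.111); vertex_counts maps derivative order n to N_n."""
    return 2 + sum((n - 2) * count for n, count in vertex_counts.items()) + 2 * n_loops

def contributions(order):
    """List (loops, {n: N_n}) with n >= 4 that give chiral dimension D = order."""
    if order < 2 or order % 2:
        return []
    budget = (order - 2) // 2  # L + sum_j j * N_{2j+2} must equal budget

    def partitions(total, max_part):
        # Integer partitions of `total`; a part j stands for one vertex with n = 2j + 2.
        if total == 0:
            yield {}
            return
        for part in range(min(total, max_part), 0, -1):
            for rest in partitions(total - part, part):
                combo = dict(rest)
                n = 2 * part + 2
                combo[n] = combo.get(n, 0) + 1
                yield combo

    found = []
    for n_loops in range(budget + 1):
        remaining = budget - n_loops
        for combo in partitions(remaining, remaining):
            found.append((n_loops, combo))
    return found

if __name__ == "__main__":
    for order in (2, 4, 6):
        print(order, contributions(order))
```

At D = 2 only tree graphs built from the two-derivative term appear. At D = 4 the list contains one four-derivative vertex at tree level and the one-loop graphs of the two-derivative term, in agreement with the statement above.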
The highly simplified toy model with which we have introduced the ideas of chiral
effective Lagrangians must, of course, undergo substantial elaboration to provide an
adequate description of low-energy hadronic physics in the real world. Our massless up
and down quarks must be given (small) masses, and the somewhat higher strange quark
mass also included if we wish to study the low-energy amplitudes of strange mesons.
The methods described above can be generalized to deal with the inclusion of quark
mass terms, treated perturbatively (so the amplitudes of the theory are developed in a
double expansion in the small external momenta of the pseudo-Goldstone mesons and
the light quark masses), and sources can be introduced for the vector and axial vector
currents of the theory. Even the axial U(1) anomaly (of importance in the calculation
of the neutral pion decay rate to two photons) can be included systematically in the
resultant effective Lagrangian. The interested reader is encouraged to pursue these
further developments, which are fully laid out in the seminal work of Gasser and
Leutwyler (Gasser and Leutwyler, 1985) (in turn based on the pioneering contributions
of Weinberg (Weinberg, 1968)).
Considerations of space require us to bring to a conclusion this all-too-brief survey
of effective field theory: admittedly, we have barely scratched the surface of this rich,
and, for modern elementary particle theory, profoundly important subject. A few
more comments are in order, however. As a practical matter, most applications of
effective field theory are carried out by a matching procedure, whereby the coefficients
of the putative effective Lagrangian (itself determined by appropriate choice of a
smearing of the underlying fields followed by integrating out unwanted modes, and fully
exploiting any available symmetries to restrict the set of allowed operators) relevant
to the physics regime of interest are determined by equating amplitudes computed
by the use of the effective Lagrangian with the same amplitudes computed either
perturbatively or non-perturbatively (e.g., by lattice methods) up to the appropriate
level of sensitivity in an expansion in some available small momentum ratio. The
technical details of this matching procedure will not be described further here: we
refer the reader to any of a number of excellent reviews on effective field theory, for
example, the review of Georgi (Georgi, 1993) and the TASI-2002 lectures of Rothstein
(Haber and Nelson-eds, 2004).
16.7 Problems
1. Verify that the graphs in Fig. 16.2 give rise to the amplitude displayed in (16.15).
Use the Feynman parameter identity (16.67) to combine the propagators in the
loop integral, which you may evaluate with a sharp momentum cutoff (i.e., a
factor of θ(Λ − |k|)).
2. Carry out the steps leading from (16.34) to (16.37).
3. Show that the cutoff dependence of the connected functional W0 arising from
the functional integral of the (exponentiated) free cutoff Lagrangian L0 (φ, Λ) in
(16.28) corresponds to the source-independent term in (16.43).
4. Verify that the loop integral appearing in (16.53) can be expanded as indicated
in (16.55).
5. Show that the expansion coefficients $f_n$ of the one-loop integral (16.62) in
powers of $q^2/m^2$ for n > 0 are identical up to powers of $m^2/\Lambda^2$ in dimensional
regularization and sharp momentum cutoff schemes. Note that the integral
subtracted at zero momentum (thereby leaving only the $f_n$, n > 0 terms) is
convergent as d → 4 in the dimensional scheme and has a finite limit (with
$O(m^2/\Lambda^2)$ corrections) as Λ → ∞ in the sharp momentum cutoff scheme.
6. Starting from the effective non-relativistic Lagrangian (16.82), show that the
bubble diagrams contributing to the 2-2 amplitude (corresponding to the Fourier
transform of the four-point function $\langle 0|T(\psi\psi\psi^{\dagger}\psi^{\dagger})|0\rangle$) produce in d = 2 or d = 3
spacetime dimensions (provided λ < 0) a pole corresponding to a non-relativistic
threshold bound state, of exactly the form found in the full relativistic theory in
Section 11.2.
7. By identifying the Noether currents J5aμ = q̄γμ γ5 τa q (to lowest order) for chiral
SU(2) transformations in the QCD Lagrangian with the corresponding current
in the effective non-linear Lagrangian (16.108), we find that the σ model VEV v
can be identified with the pion decay constant fπ . Here is the argument:
(a) Show that the Noether current implementing infinitesimal chiral transforma-
tions (i.e., VL = VR† = 1 + iωa τa , ωa infinitesimal) in the effective non-linear
model is (to lowest order in the chiral expansion)
$$J^{\rm eff}_{5a\mu} = i\,\frac{v^2}{4}\,\mathrm{Tr}\big(\tau_a (U^{\dagger}\partial_\mu U - U\partial_\mu U^{\dagger})\big) \tag{16.112}$$
(b) The pion decay constant is defined in terms of the vacuum to one-pion matrix
element of the axial current $J_{5a\mu}$ (which appears in the hadronic part of the
Fermi interaction Hamiltonian $\frac{G_F}{\sqrt{2}}\,\bar{u}\gamma^{\mu}(1 + \gamma_5)d\;\bar{\mu}\gamma_\mu(1 + \gamma_5)\nu_\mu$ responsible for
charged pion decay to $\mu + \bar{\nu}_\mu$), as follows:
the form of the nonlinear Lagrangian (as a function of S and U fields) prior to
integrating out the “heavy” S field.
In the previous chapter we emphasized the importance of the scale separation property
of local quantum field theories, which expresses our ability to predict, at least to some
reasonable level of accuracy, the features of particle interactions at long distances (or
equivalently, at low energies) despite the fact that we are inevitably ignorant of the
“ultimate” details of these interactions at arbitrarily small distance scales. Historically,
this property was first realized in the context of the perturbative treatment of a specific
quantum field theory, quantum electrodynamics, the local gauge theory describing
the interactions of photons with electrons (and other charged leptons). As we saw
in Chapter 2, the development of interacting quantum field theories in the two
decades from the early 1930s to around 1950 was severely hampered by the ubiquitous
infinities—more precisely, ultraviolet divergences in the integrals over the momenta
of particles appearing in intermediate states in scattering amplitudes—which plagued
all higher-order calculations in these theories.
In the late 1940s the development of graphical techniques for covariant perturbation
theory proved to be the critical ingredient needed to surmount this impasse. The
covariant graphical techniques introduced by Feynman revealed in a much more
transparent way the structure of the scattering amplitudes of the theory, and in
particular the “nested” character of the divergent contributions to these amplitudes,
features which were then exploited by Dyson in his classic development of perturbative
renormalization theory for quantum electrodynamics (Dyson, 1949). Dyson was able
to show, order by order in perturbation theory, and to all orders of perturbation
theory, that the distressing divergences disappeared provided the amplitudes of the
theory were re-expressed in terms of a finite number of “renormalized” parameters
corresponding to measurable (and therefore ipso facto finite) low-energy properties
of the theory. Otherwise stated, if an ultraviolet cutoff Λ is introduced to regularize
the theory (thereby making all loop integrals finite), the reparameterization of the
amplitudes of the theory in terms of renormalized quantities softened the dependence
on the cutoff Λ, yielding amplitudes which were finite in the limit Λ → ∞, with a
cutoff sensitivity at finite Λ corresponding to powers (typically quadratic) of the ratio
of the masses and momenta in the scattering amplitude to the UV cutoff Λ.
The remarkable quantitative agreement of the quantum electrodynamic amplitudes
(anomalous magnetic moment of the electron and muon, Lamb shift, etc.) computed
using this procedure with the measured experimental values remains among the most
impressive successes of physical science. Nevertheless, the overpowering impression
Scales II: Perturbatively renormalizable field theories 611
persisted among many physicists that the procedures of Dyson amounted to an intel-
lectually unsavory “fudge”—a mere sweeping under the rug of potential underlying
inconsistencies in the theory of which the ultraviolet divergences were apparently
the overt symptom. This unease began to dissipate in the early 1970s, with the
development by Wilson of the effective field theory point of view which we discussed
in Section 16.2. The inescapable presence of new physics (quantum gravity, string
theory, or what have you) at short distances implies, as emphasized in the preceding
chapter, that we must necessarily imagine a cutoff Λ of some kind at high energies:
the question of infinities in the (unphysical!) limit where this cutoff is taken to infinity
is then replaced by the issue of sensitivity of the measurable low-energy amplitudes
of the theory to the value (and type) of this cutoff. Our object in this chapter is to
review the techniques that have been developed to study this sensitivity in the context
of the formal weak coupling perturbation expansions introduced in Chapter 10, and
in particular to show that for a certain subset of field theories, the sensitivity is of the
weak kind (inverse powers of the cutoff) first described by Dyson. We shall also see
that these results fit naturally into the picture of the renormalization group flow of
effective Lagrangians discussed in the previous chapter.
Before diving into the technical details, it may be helpful to the reader to give a
simple example of the phenomenon outlined above, whereby the strong dependence of a
regularized scattering amplitude on an ultraviolet cutoff can be dramatically weakened
by a reparameterization of the theory in terms of “renormalized” parameter(s). Let
us suppose that we are told that the 2-2 scattering amplitude $_{\rm out}\langle k_3, k_4|k_1, k_2\rangle_{\rm in}$ for
some particle is described by the expression (given, as usual, up to terms involving
inverse powers of the cutoff Λ)
$$\Gamma^{(4)}(k_1, k_2, k_3, k_4) = \frac{ig^2}{1 + Cg^2\ln(t/\Lambda^2)} + \frac{ig^2}{1 + Cg^2\ln(u/\Lambda^2)} + O\Big(\frac{k_i^2}{\Lambda^2}\Big),$$
$$t \equiv -(k_3 - k_1)^2, \qquad u \equiv -(k_4 - k_1)^2 \tag{17.1}$$
where g is the coupling constant for some interaction in the theory cut off at
momentum Λ, and C is an uninteresting (real positive) numerical constant.1 If we
expand out the denominator factors in a perturbative series in powers of $g^2$, it is
apparent that at any given finite order of perturbation theory, the amplitude displays
logarithmic ultraviolet divergences, becoming infinite as a power of $\ln(\Lambda^2)$ as the
cutoff Λ is taken to infinity. If we instead parameterize the amplitude in terms of a
renormalized coupling $g_R$, defined simply in terms of the value of the 2-2 scattering
amplitude at some experimentally accessible value of the momentum transfer variables
$t = u = \mu^2 \ll \Lambda^2$, as follows:
1 The curious reader may be interested in the origin of this simple expression, although it is not relevant
to the present discussion. It represents the 2-2 amplitude for scattering in a theory of N massless Dirac
fermions in two spacetime dimensions, with quartic interaction Lagrangian $\frac{1}{2}g^2\big(\sum_{i=1}^{N}\bar{\psi}_i\psi_i\big)^2$ (the so-called
Gross–Neveu model (Gross and Neveu, 1974)), in the limit where N → ∞ with $g^2N$ fixed (the so-called
“1/N” expansion). In this case the constant C = N/(2π). It also gives, in a similar limit, the 2-2 scattering
amplitude for a theory of N massless complex scalar fields in four spacetime dimensions with interaction
Lagrangian $-\frac{1}{2}g^2\big(\sum_{i=1}^{N}\phi_i^{\dagger}\phi_i\big)^2$, but with the constant C now negative, $C = -N/(16\pi^2)$.
$$g_R^2 \equiv -\frac{i}{2}\Gamma^{(4)}(t = u = \mu^2) = \frac{g^2}{1 + Cg^2\ln(\mu^2/\Lambda^2)} \;\;\Rightarrow\;\; g^2 = \frac{g_R^2}{1 + Cg_R^2\ln(\Lambda^2/\mu^2)} \tag{17.2}$$
In the jargon of renormalization theory with which we must now begin to familiarize
ourselves, such a definition is referred to as a renormalization condition: a complete
set of such conditions (one for each free parameter needed to uniquely identify the
low-energy theory) specifies a renormalization scheme. If we now re-express the 2-2
amplitude (17.1) in terms of the renormalized coupling gR , we find
$$\Gamma^{(4)}(k_1, k_2, k_3, k_4) = \frac{ig_R^2}{1 + Cg_R^2\ln(t/\mu^2)} + \frac{ig_R^2}{1 + Cg_R^2\ln(u/\mu^2)} + O\Big(\frac{k_i^2}{\Lambda^2}\Big) \tag{17.3}$$
and we see that all dependence on the ultraviolet cutoff Λ has been removed to the
level of the (in practice, for quantum electrodynamics) tiny inverse power terms which
are, of course, harmless in the limit (Λ → ∞) in which the cutoff is removed entirely.
In particular, the expansion of our 2-2 amplitude in powers of gR gives a renormalized
perturbation expansion in which each term is separately insensitive (again, up to
harmless inverse power corrections) to the UV cutoff Λ. The essential components
of the renormalization procedure in field theory can be seen already here, in this
admittedly algebraically trivial example: first, the choice of a regularization method
(effectively, a mathematical model of our ignorance of the theory at short distances),
and second, the renormalization conditions identifying a specific reparameterization of
the amplitudes of the theory (allowing us to conveniently inject accessible low-energy
information about the theory).
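The reparameterization can be confirmed numerically. The following is a small sketch under stated assumptions: the value of C, the momentum-transfer values, and the function names are illustrative, t and u are taken as positive numbers, the overall factor of i is stripped, and the inverse-power terms in the cutoff are dropped. It evaluates the cutoff amplitude (17.1) with the bare coupling fixed by (17.2) and compares it with the renormalized form (17.3).

```python
import math

C = 0.1  # illustrative positive constant (assumption)

def bare_amplitude(g2, t, u, cutoff):
    """Gamma^(4)/i from (17.1), without the O(k^2/Lambda^2) terms."""
    c2 = cutoff**2
    return (g2 / (1.0 + C * g2 * math.log(t / c2))
            + g2 / (1.0 + C * g2 * math.log(u / c2)))

def bare_coupling(gr2, mu2, cutoff):
    """Bare g^2 that reproduces a given g_R^2 at t = u = mu^2, from (17.2)."""
    return gr2 / (1.0 + C * gr2 * math.log(cutoff**2 / mu2))

def renormalized_amplitude(gr2, t, u, mu2):
    """Gamma^(4)/i from (17.3): no reference to the cutoff."""
    return (gr2 / (1.0 + C * gr2 * math.log(t / mu2))
            + gr2 / (1.0 + C * gr2 * math.log(u / mu2)))

if __name__ == "__main__":
    gr2, mu2, t, u = 1.0, 1.0, 2.0, 3.0
    for cutoff in (1e2, 1e4, 1e8):
        g2 = bare_coupling(gr2, mu2, cutoff)
        print(cutoff, g2, bare_amplitude(g2, t, u, cutoff),
              renormalized_amplitude(gr2, t, u, mu2))
```

The bare coupling printed in the second column drifts with the cutoff, while the two amplitude columns should agree to rounding accuracy for every cutoff: the cutoff dependence has been absorbed entirely into the relation between g and the renormalized coupling.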
A third feature of the renormalization program which becomes apparent once we
evaluate amplitudes order by order in perturbation theory derives directly from the
reparameterization step: the appearance of subtracted amplitudes at each order of
perturbation theory. Let us illustrate this point at the first subleading order ($g_R^4$) in
the expansion of our 2-2 amplitude (17.1) (corresponding to one-loop contributions
in the underlying field theory). We may evidently define a coupling constant shift (or
counterterm) $\delta g_R^2$ by the trivial identity
$$g^2 \equiv g_R^2 + \delta g_R^2 \tag{17.4}$$
where, by expanding (17.2), we have through order $g_R^4$,
$$\delta g_R^2 = -Cg_R^4\ln(\Lambda^2/\mu^2) + O(g_R^6) \tag{17.5}$$
Inserting (17.5) into the perturbative expansion of the amplitude (17.1) in powers of
the bare coupling g, we find (neglecting terms suppressed by powers of the cutoff)
amplitude which precisely removes the logarithmic Λ-dependence of the latter in the
original cutoff amplitude.
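The cancellation can be followed explicitly through order $g_R^4$. The lines below are a sketch obtained only by expanding (17.1) in powers of $g^2$ and inserting (17.4) and (17.5), with inverse powers of the cutoff dropped throughout; the arrangement of terms is a reconstruction from those equations and not a quotation of any display in the text.

```latex
\begin{align*}
\Gamma^{(4)} &= 2ig^2 - iCg^4\left[\ln\frac{t}{\Lambda^2} + \ln\frac{u}{\Lambda^2}\right] + O(g^6) \\
 &= 2ig_R^2 + 2i\,\delta g_R^2 - iCg_R^4\left[\ln\frac{t}{\Lambda^2} + \ln\frac{u}{\Lambda^2}\right] + O(g_R^6) \\
 &= 2ig_R^2 - iCg_R^4\left[\ln\frac{t}{\mu^2} + \ln\frac{u}{\mu^2}\right] + O(g_R^6)
\end{align*}
```

In the last step the counterterm contribution $2i\,\delta g_R^2 = -2iCg_R^4\ln(\Lambda^2/\mu^2)$ supplies one factor of $\ln(\Lambda^2/\mu^2)$ to each logarithm, converting $\ln(t/\Lambda^2)$ into $\ln(t/\mu^2)$ and likewise for u. The result agrees with the expansion of (17.3) to the same order.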
Understanding the structure of such subtractions will be crucial in developing a
renormalization technology capable of exposing the cutoff sensitivity of field theory
amplitudes at all orders of perturbation theory. In particular, let us note that although
the reparameterization indicated in (17.2) has evidently succeeded in suppressing the
cutoff dependence of the elastic scattering 2-2 amplitude (up to ignorable inverse
power terms, as always), we have not demonstrated that the same reparameterization
suffices to remove the cutoff sensitivity of all the amplitudes of the theory: 2-4, 2-6,
etc., particle production amplitudes, for example. This much stronger requirement—
that a reparameterization of a finite number of couplings (and masses) in terms of
low-energy quantities can suppress the cutoff dependence of the entire set of S-matrix
amplitudes of the theory, order by order, and to all orders of perturbation theory—
is, amazingly, satisfied by a rich variety of local quantum field theories, which we
collectively refer to as “perturbatively renormalizable theories”: they are the subject
of our enquiry in this chapter. Later, after developing the appropriate technology for
perturbative renormalization theory, we shall see (in Section 17.4) that such theories
appear naturally as low-energy limits of the effective Wilsonian field theories described
in Chapter 16.
In the next section we shall examine in detail the structure of the cutoff dependence
of general multi-loop Feynman integrals appearing in the perturbative expansion of
amplitudes in a local quantum field theory. We shall see that the occurrence of
divergent integrals (and subintegrals) in such loop amplitudes is associated with cutoff-
dependent contributions which have a very simple (in fact, polynomial) momentum
dependence. This latter fact will then be exploited in the subsequent section to
demonstrate the equivalence of the set of subtractions needed to remove the leading
cutoff dependence of an arbitrary Feynman amplitude (reducing it to the inverse
power-dependence of the type seen above) to the result of a reparameterization of a
set of coupling and mass parameters appearing in the Lagrangian of the theory. The
intimate connection reparameterization ⇐⇒ subtractions, of which we have just seen
a particularly trivial example, is the essence of the proof of cutoff-insensitivity for
perturbatively renormalizable theories.
Here $k_1, k_2, \ldots, k_E$ are the external momenta associated with the $E$ external lines of the
graph, and $l_1, l_2, \ldots, l_L$ the $L$ independent loop momenta appearing within the graph.
The numerator factor $N(k_1, k_2, \ldots, k_E; l_1, l_2, \ldots, l_L)$ is a multinomial of finite order in
its four-momentum arguments, while the denominator factor is a product of the usual
(Euclidean) Feynman propagator factors $p^2 + m_i^2$, which we shall here assume are
all massive, to avoid concern over infrared divergences. The latter will be taken up
specifically in Chapter 19 and are, of course, not relevant in a discussion of the
short-distance sensitivity of the theory. At present, we leave the spacetime dimension $d$ free,
as Weinberg’s theorem applies for all (integral) dimensions.
A simple example is the one-loop graph of Fig. 17.1 (cf. also Fig. 16.2) contributing
to 2-2 scattering in $\phi^4$ theory, which we have already encountered on several occasions:
$$I(k_1, \ldots, k_4) = \int \frac{d^d l}{(2\pi)^d} \frac{1}{(l^2 + m^2)((k_1 + k_2 - l)^2 + m^2)} \qquad (17.8)$$
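[Fig. 17.1: one-loop graph contributing to 2-2 scattering, with external momenta $k_1, k_2, k_3, k_4$ and internal lines carrying momenta $l$ and $k_1 + k_2 - l$.]
For the purposes of the present discussion, we ignore overall numerical factors, powers
of the coupling constant, etc. For large values of the loop momentum, $l \equiv |l| \gg |k_i|, m$,
the integral scales like
$$\int^{\Lambda} \frac{l^{d-1}\,dl}{l^4} = \int^{\Lambda} l^{d-5}\,dl \qquad (17.9)$$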
where Λ is a UV cutoff. Evidently, if d < 4 the integral is convergent at the upper end,
and we may let Λ → ∞ obtaining a finite result. If d = 4, the integral is logarithmically
divergent, so the result at finite Λ contains a logarithmic sensitivity ∼ ln Λ to the
cutoff. For (integer) dimensions $d > 4$, the integral has a much larger, power growth
dependence $\Lambda^{d-4}$ on the cutoff. Evidently, the quantity $D \equiv d - 4$, which we shall term
the superficial degree of divergence of the loop integral, indicates the demarcation point
for ultraviolet convergence of the loop integral. Note that it is obtained very simply by
counting the difference in powers of loop momenta in the numerator and denominator
of the Feynman integrand, including a factor of $l^d$ from the measure $d^d l$. For $D < 0$, we
say that the integral is superficially convergent: in this case, the result at finite cutoff
(much larger than the external momenta and masses) differs from the infinite cutoff
limit by inverse powers of the cutoff (for d even, at least quadratically, perhaps with
logarithmic modifications which are dominated by the power falloff), and we shall refer
to such a loop integral as “UV-finite”. For D = 0, we have a logarithmically divergent
integral, with ln Λ sensitivity to the cutoff, while for D > 0 we have power-divergent
integrals (linearly for D = +1, quadratically for D = +2, etc.) with much stronger
dependence on the UV cutoff. Weinberg’s theorem generalizes the particular case of a
UV-finite integral to an arbitrary Euclidean Feynman integral of the form (17.7), as
follows.
Theorem 17.1 (Weinberg, 1960) The general Euclidean loop integral (17.7) is UV-
finite provided the superficial degree of divergence associated with scaling any subset
of loop momenta uniformly to infinity is negative. In other words, the overall integral
is UV-finite if the superficial degree of divergence associated with integrating over any
hyperplanar subspace of the full dL-dimensional Euclidean integration space is strictly
negative.
In the event that this condition is satisfied, we are assured that the sensitivity of the
corresponding amplitude to a high-momentum cutoff Λ corresponds to the mild, and
from the point of view of renormalization theory, ignorable inverse power-dependence
on the cutoff that we have seen now on numerous occasions. We shall not attempt
to reproduce the details of Weinberg’s proof here, which depends on a technically
sophisticated (but physically not particularly enlightening) application of real analysis,
especially as the result appears completely natural, particularly once rephrased in the
language of large momentum flows, as we shall see shortly.
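The power counting used above to define the superficial degree of divergence can be sketched in a few lines of Python. This is a minimal illustration only, assuming scalar propagators that each fall off as the inverse square of the loop momentum; the function names `superficial_degree` and `cutoff_behaviour` are ours, not part of the text. It counts powers for the scaling of all loop momenta together, whereas Weinberg’s theorem requires the same count to be negative for every subset of loop momenta.

```python
def superficial_degree(dim, n_loops, n_propagators, numerator_power=0):
    """Count powers of loop momentum when all loop momenta are scaled together.

    Each loop measure contributes `dim`, each scalar propagator contributes -2,
    and the numerator contributes its own momentum power.
    """
    return dim * n_loops - 2 * n_propagators + numerator_power


def cutoff_behaviour(degree):
    """Translate a superficial degree of divergence into the leading cutoff dependence."""
    if degree < 0:
        return "superficially convergent"
    if degree == 0:
        return "logarithmic (ln Lambda)"
    return f"power divergent (Lambda^{degree})"


# One-loop graph of (17.8): one loop and two propagators, in d = 3, 4 and 6.
for d in (3, 4, 6):
    D = superficial_degree(d, n_loops=1, n_propagators=2)
    print(d, D, cutoff_behaviour(D))
```

For the one-loop graph this reproduces $D = d - 4$, with the three regimes of convergence, logarithmic divergence and power divergence described before the theorem.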
Before going on to more general cases, let us just note that for the simple one-
loop integral (17.8) of Fig. 17.1 one easily sees that in d = 4 dimensions, the degree
of divergence associated with a hyperplane subspace of dimension n ≤ 4 is D = n − 4
so that the only divergence comes from the region corresponding to n = 4 in which
all components of the loop four-momentum become large. This will typically be the
case for more complicated multi-loop graphs: we will need only examine hyperplanes
corresponding to all components of some subset (or possibly all) of the loop momenta
becoming large. Physically this corresponds to regions of the integration in which large
momentum “irrigates” various loops of the graph, individually or in some specified
616 Scales II: Perturbatively renormalizable field theories
joint fashion. If this seems all a bit vague at the moment, we beg the reader’s patience
for a few moments longer: explicit examples will soon follow.
In our first foray into perturbative field theory in Chapter 10, we encountered the
logarithmic divergence of (17.8) for infinite UV cutoff in four spacetime dimensions
and recognized (cf. (10.35)) this infinity as the symptom in momentum space of the
singular result of trying to multiply individually well-defined distributions at the same
point in coordinate space (i.e., trying to obtain the squared coordinate space Feynman
propagator $\Delta_F(z)^2$). We saw there (cf. the discussion following (10.36)) that the
singularity amounts in coordinate space to an additive local δ-function term $\delta^4(z)$,
or to a constant in the Fourier-transformed momentum space: indeed, subtracting the
loop integral at any fixed value of the external momenta—for example, zero—gives a
perfectly UV-finite result (cf. (10.36)). Let us revisit this result briefly from the point
of view of Weinberg’s theorem, as it serves as a useful prelude to the discussion in
the more general multi-loop case. Taking d = 4 spacetime dimensions, and subtracting
from the amplitude (17.8) its value with all external momenta set to zero, we obtain
and we see that the subtracted Feynman integral now has superficial degree of
divergence $D = -1$ and is therefore convergent, by Weinberg’s theorem (there is only
a single loop here, so the only region of large momentum flow corresponds to $l$ large,
where the scaling of the subtracted integrand is as $l^{-1}$). Otherwise put, with a cutoff
$\Lambda$ in place, the subtracted integral depends on $\Lambda$ at the level of power-suppressed
terms of order $O(k_i^2/\Lambda^2,\, m^2/\Lambda^2)$: in the conventional language of renormalization theory, it is
UV-finite.2 The origin of this convergence is simply that the leading behavior of the
original (“unrenormalized”) integrand (17.8) and the subtraction term at large l are
identical, so the subtraction removes the leading asymptotic behavior of the integrand
responsible for the logarithmic divergence (= logarithmic dependence of the cutoff
integrals on Λ).
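The cutoff insensitivity of the zero-momentum-subtracted one-loop integral can be checked numerically. The sketch below is an illustration under stated assumptions, not the author’s procedure: it works in four Euclidean dimensions with a sharp cutoff on $|l|$, drops overall numerical factors as in the text, performs the angular integration in closed form, and uses a hand-written Simpson rule. The names `angular_average` and `radial_integral`, and the choice $|k_1 + k_2| = m = 1$, are ours.

```python
import math


def angular_average(l, p, m):
    """Average of 1/((P - l)^2 + m^2) over the directions of the 4-vector l, |P| = p.

    Closed form of the 4D angular integral with weight sin^2(theta).
    """
    a = l * l + p * p + m * m
    b = 2.0 * l * p
    return 2.0 / (a + math.sqrt(a * a - b * b))


def radial_integral(cutoff, p, m, subtract, steps=20000):
    """Simpson's rule in t = ln(l) for the loop integral cut off at |l| = cutoff.

    With subtract=True the integrand at zero external momentum is subtracted.
    """
    def integrand(t):
        l = math.exp(t)
        prop = 1.0 / (l * l + m * m)
        value = prop * angular_average(l, p, m)
        if subtract:
            value -= prop * prop      # same integrand with external momenta set to zero
        return value * l ** 4         # l^3 dl = l^4 dt

    t_min, t_max = math.log(1e-6), math.log(cutoff)
    h = (t_max - t_min) / steps
    total = integrand(t_min) + integrand(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(t_min + i * h)
    return total * h / 3.0


for cutoff in (10.0, 100.0, 1000.0):
    print(cutoff,
          radial_integral(cutoff, 1.0, 1.0, subtract=False),
          radial_integral(cutoff, 1.0, 1.0, subtract=True))
```

The unsubtracted column grows by roughly $\ln 10$ for each factor of ten in the cutoff, while the subtracted column settles to a cutoff-independent value.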
There is an alternative, and for our future purposes extremely important, way to
understand the efficacy of this subtraction procedure in reducing the cutoff dependence
of the loop amplitude. We shall introduce the notation
for the subtraction term in (17.10). The operation tΓ will be defined on all one-particle-
irreducible (1PI) graphs Γ with superficial degree of divergence D ≥ 0 as extracting
the terms up to order D in the Taylor expansion in the external momenta ki around
2 The quadratic dependence arises from the fact that we can further improve the convergence of the
integral by symmetric integration: i.e., taking the average of the integrand at l and −l before integration.
Weinberg’s power-counting theorem and the divergence structure of Feynman integrals 617
zero momentum of the loop integral(s) I representing Γ. In the present case the loop
integral has D = 0, and the operation tΓ therefore simply takes the leading term of
the Taylor expansion, i.e., the unsubtracted amplitude at zero momentum. Thus, our
subtracted amplitude (17.11) amounts to
On the other hand, the right-hand side of (17.13) can be viewed as the sum of all
the Taylor terms in the expansion of I(k1 , .., k4 ) around zero momentum, linear or
higher in the momenta. But as Γ is 1PI, all of its internal lines3 are parts of loops, and
therefore every differentiation of an internal propagator reduces the superficial degree
of divergence of the integrand by 1. In our simple one-loop case, for example,
$$\frac{\partial}{\partial k_1^\mu} \frac{1}{(k_1 + k_2 - l)^2 + m^2} = -2\,\frac{(k_1 + k_2 - l)_\mu}{((k_1 + k_2 - l)^2 + m^2)^2} \qquad (17.14)$$
so the differentiation has reduced the scaling of the propagator at large loop momentum
from $1/l^2$ to $1/l^3$. Accordingly, all the terms in the Taylor expansion of $I_R$ in
(17.13) have degree of divergence $D = -1$ or less, and are therefore UV-convergent.
This argument can clearly be generalized to 1PI graphs Γ with a superficial degree
of divergence D > 0, by defining the Taylor operator tΓ as the sum of the first D + 1
orders of the Taylor expansion in the external momenta (around zero): i.e., all terms
up to homogeneous order $k^D$ in the external momenta $k_i$ of the graph. The Taylor
operator is defined simply to be zero when applied to a superficially convergent graph
or subgraph. As what remains after tΓ I(ki ) is subtracted from I(ki ) are just the terms
in the Taylor expansion with at least D + 1 derivatives with respect to the ki , and each
such derivative lowers the superficial degree of divergence of the Feynman integral I
associated with the graph Γ (i.e., the scaling of the integrand when all loop momenta
get uniformly large) by 1, we see that the subtracted amplitude (1 − tΓ )I will again
have D < 0, just as in our simple one-loop example.
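The statement that each additional order of Taylor subtraction improves the large-momentum falloff by one power can be illustrated with a toy calculation. The sketch below is only a one-dimensional caricature (momenta are plain numbers rather than four-vectors, and a single propagator stands in for the integrand); the names `propagator`, `taylor_remainder` and `falloff_exponent` are ours.

```python
import math


def propagator(k, l, m=1.0):
    """One-dimensional toy propagator 1 / ((k - l)^2 + m^2)."""
    return 1.0 / ((k - l) ** 2 + m * m)


def taylor_remainder(k, l, order, m=1.0):
    """Propagator minus its Taylor expansion in k around k = 0, up to `order` (0 or 1)."""
    q = l * l + m * m
    remainder = propagator(k, l, m) - 1.0 / q      # subtract the zeroth-order term
    if order >= 1:
        remainder -= k * (2.0 * l / (q * q))        # subtract the first-order term
    return remainder


def falloff_exponent(k, l, order):
    """Estimate n in remainder ~ l^(-n) by comparing loop momenta l and 2l."""
    ratio = taylor_remainder(k, l, order) / taylor_remainder(k, 2.0 * l, order)
    return math.log2(ratio)


print(falloff_exponent(1.0, 1000.0, order=0))   # close to 3: falloff improved from 1/l^2 to 1/l^3
print(falloff_exponent(1.0, 1000.0, order=1))   # close to 4: one further power per extra order
```

Subtracting terms through order $D$ therefore lowers the scaling of the remainder by $D + 1$ powers, which is the mechanism behind the convergence of $(1 - t_\Gamma)I$.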
Another critical point, the full implications of which will emerge in the next section,
concerns the structure of the subtraction terms generated by the Taylor operator
tΓ : by definition, they are polynomial in the external momentum of the subtracted
graph. This implies that they are equivalent to the terms that would be generated in
perturbation theory by a local term in the Lagrangian with as many field operators as
external lines of the graph in question (in this case, four), and with a finite number
of spacetime-derivatives applied to the fields to generate the appropriate factors of
momentum entering the graph. As we shall see in the following sections, it is exactly
this property of the subtractions effected by the zero-momentum Taylor operators
introduced here that allows us to connect the subtractions needed to remove the
dominant cutoff dependence of the amplitudes of the theory with a precise set of
reparameterizations of these amplitudes in terms of low-energy constants defined by
appropriate renormalization conditions (as in the preceding section). It cannot be
emphasized too forcefully that the entire essence of perturbative renormalization
theory is implicit in the ideas introduced in this paragraph: the reader is strongly
3 Recall from Chapter 10 that external legs are truncated by definition in 1PI amplitudes.
In fact, for $\epsilon$ kept positive and non-zero, all of our Minkowski space Feynman integrands
are majorized by the corresponding Euclidean ones, for which the arguments we
have been presenting, using the power-counting prescriptions of the Weinberg theorem,
are rigorously valid. The absolute convergence (and UV-finiteness) of the subtracted
Euclidean amplitudes is therefore a fortiori correct for their Minkowski versions. Of
course, proper mathematical rigor requires that we establish that the $\epsilon \to 0$ limit can
indeed be carried out at the end, leading to well-defined covariant Minkowski space
amplitudes. The covariance property in particular is not immediately obvious, given
the presence of the non-covariant factor $\mathbf{k}^2 + m^2$ in the $\epsilon$ term in (17.15), but, once
again, Zimmermann (Zimmermann, 1968) has done the hard work for us, and shown
explicitly that the limit does indeed exist, with the resultant amplitudes well-behaved
covariant tempered distributions. For the following, we shall return to Euclidean space,
confident that the subtraction procedures devised there to remove the ultraviolet
sensitivity of the amplitudes will be equally efficacious in Minkowski space.
The true power of Weinberg’s theorem really emerges when we go to higher
orders of perturbation theory, when we encounter multi-loop graphs with multiple
independent regions of integration giving divergent contributions to the overall ampli-
tude. Consider the two-loop diagram illustrated in Fig. 17.2 contributing to the 2-2
amplitude in $\phi^4$ theory (again, in four spacetime dimensions). The corresponding loop
integral is (again, ignoring overall numerical factors, couplings, etc.)
$$I(k_i) = \int \frac{d^4 l_1}{(2\pi)^4} \frac{1}{(l_1^2 + m^2)((k_1 + k_2 - l_1)^2 + m^2)} \cdot \int \frac{d^4 l_2}{(2\pi)^4} \frac{1}{(l_2^2 + m^2)((l_1 - l_2 - k_3)^2 + m^2)} \qquad (17.17)$$
[Fig. 17.2: two-loop graph contributing to the 2-2 amplitude, with external momenta $k_1, \ldots, k_4$ and internal lines carrying momenta $l_1$, $k_1 + k_2 - l_1$, $l_2$ and $l_1 - l_2 - k_3$.]
$$\Gamma = [l_1,\; k_1 + k_2 - l_1,\; l_2,\; l_1 - l_2 - k_3], \qquad \gamma = [l_2,\; l_1 - l_2 - k_3] \qquad (17.18)$$
We have indicated here only the renormalization parts of the full graph Γ: namely,
those connected 1PI subgraphs with superficial degree of divergence greater than or
equal to zero (including possibly the entire graph, as here). Each possible subgraph
γ can be assigned a degree of divergence d(γ) corresponding to the power-counting
associated with large momentum flow through all lines of that subgraph. In the present
case, the only renormalization parts are the full graph Γ and the subgraph γ, with in
both cases d(Γ) = d(γ) = 0. Taylor subtraction operators tΓ and tγ can be similarly
associated with each renormalization part, in an obvious extension of the procedure
followed in our previous one-loop example. For the subgraph γ, the external momenta
now include, in addition to the ki , the loop momentum l1 associated with the lower
pair of internal lines in Fig. 17.2 (more generally, the external momenta of a given
renormalization part consists simply of those momenta which remain fixed when the
large momentum flow giving rise to the degree of divergence of that subgraph is
invoked). As in our present case both d(Γ) and d(γ) are zero, the Taylor operators
amount to either setting k1 = k2 = k3 = k4 = 0 (tΓ ) or k1 = k2 = k3 = k4 = l1 = 0
(tγ ). It seems entirely plausible, and the reader is encouraged to verify (Problem 2)
that the following subtracted integral, when examined in the context of Weinberg’s
theorem, is UV-finite:
Indeed, the subtraction effected by tγ ensures that the degree of divergence associated
with large momentum flow through the subgraph γ is reduced to –1 (just as in our
first one-loop example), while the overall subtraction effected by tΓ reduces the degree
of divergence of the graph as a whole (when both l1 and l2 become large) to –1. The
particular combination of four loop integrals implied by the subtractions of (17.19) is
therefore UV-finite by Weinberg’s theorem, or equivalently, cutoff insensitive at the
level of inverse powers of the cutoff. We have introduced a notation Ī to indicate a
subtracted amplitude containing Taylor subtraction operators for all proper subgraphs
of the graph in question, but not the overall (“top level”) subtraction needed if the
full graph is superficially divergent.
At this point it will be convenient to introduce some graphical notations which will
serve us in good stead as we attempt to generalize the insights gleaned from these first
simple examples to arbitrary graphs. First note that the inner subtraction effected by
the tγ operator in (17.19) actually corresponds to replacing the integral over l2 by a
pure number (as tγ sets l1 and k3 to zero), leaving a graph with only the lines [l1 , k1 +
k2 − l1 ], which amounts to a graph in which the entire subgraph represented by γ
has been shrunk to a single vertex (in this case, without any momentum dependence,
as the Taylor operator is of degree zero). We shall denote such a graph by Γ/γ,
and more generally, graphs Γ in which some set of disjoint renormalization parts
γ1 , . . . γn are shrunk to a point as a consequence of having been replaced by their Taylor
operator evaluations as Γ/{γ1 , .., γn }. Also, the integrands associated with graphs (or
subgraphs, or shrunk graphs) will be indicated by an obvious superscript notation.
We may then abbreviate the expression (17.19) as follows:
$$I_R^\Gamma(k_i) = (1 - t_\Gamma)\bar{I}^\Gamma, \qquad \bar{I}^\Gamma = I^\Gamma + I^{\Gamma/\gamma}(-t_\gamma)\bar{I}^\gamma \qquad (17.20)$$
Note that the integrand $\bar{I}^\gamma$ to which the inner subtraction $t_\gamma$ is applied is just the
original subgraph $I^\gamma$ consisting of the lines $[l_2, l_1 - l_2 - k_3]$: the overbar in this case
is superfluous, as this graph contains no proper superficially divergent subgraphs
which need be subtracted. The formula (17.20) is our first example of a general
recursion formula (originally due to Bogoliubov), the explicit solution of which (due
to Zimmermann) will describe in a compact way the exact divergent structure of
arbitrary Feynman integrals.
The two-loop example just discussed possesses a simplifying feature which may
already have occurred to the reader: the divergent regions of the graph are nested—
in other words, the inner subdivergence arising from the region of large momentum
flow through the subgraph γ corresponds to a set of lines which are strictly contained
in the set of lines that carry the large momentum leading to the overall divergence
of the full graph Γ. This makes it easy to guess the proper set of sequential subtractions
$(1 - t_\Gamma)\cdots(1 - t_\gamma)\cdots$ which render the graph UV-finite: one simply begins
by subtracting off the innermost divergence and then proceeds outwards, at each
stage performing the Taylor subtraction if a divergent subgraph is encountered.
The generalization of this procedure to a large graph containing non-intersecting
sets of nested divergences is also clear: the Taylor subtractions can be performed
independently on the separate non-intersecting sets, leading to a subtracted integrand
which possesses a finite Λ → ∞ limit by Weinberg’s theorem.
We shall use the terminology “non-overlapping” to describe the situations summa-
rized here: two renormalization parts γ1 , γ2 of a diagram (i.e., superficially divergent
1PI subgraphs) are said to be non-overlapping if either (a) one is entirely contained
in the other (the “nested” case), or (b) they are completely non-intersecting (i.e., no
lines in common). Our analysis of the divergence structure of Feynman graphs would
be essentially concluded if we had only the non-overlapping case to consider.
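The classification just given is easy to state as a small function. The sketch below is a hypothetical illustration in which a renormalization part is represented simply by the set of labels of its internal lines, following the parenthetical "no lines in common" above; the function name `relation` and the line labels "a", "b", "c", "d", "e" are ours.

```python
def relation(part_a, part_b):
    """Classify two renormalization parts, each given as a set of internal-line labels."""
    a, b = frozenset(part_a), frozenset(part_b)
    if a <= b or b <= a:
        return "nested"         # case (a): one part entirely contained in the other
    if a.isdisjoint(b):
        return "disjoint"       # case (b): no lines in common
    return "overlapping"        # partial intersection: neither (a) nor (b)


# The two renormalization parts of (17.18): gamma is nested inside Gamma.
Gamma = {"l1", "k1+k2-l1", "l2", "l1-l2-k3"}
gamma = {"l2", "l1-l2-k3"}
print(relation(Gamma, gamma))

# Hypothetical parts sharing the single line "c" without being nested.
print(relation({"a", "b", "c"}, {"c", "d", "e"}))
```

Only the first two outcomes count as non-overlapping in the terminology of the text.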
There are, however, cases of divergent subgraphs that intersect partially, giving
rise to the famous problem of overlapping divergences. Evidently, we must consider
at least a two-loop diagram in order to find two distinct—but not simply nested—
subdivergences. We shall illustrate the problem in a self-interacting scalar theory
which we have not heretofore studied, $\frac{\lambda}{3!}\phi^3$ theory in six spacetime dimensions, which
has the advantage of being topologically similar to gauge theory, inasmuch as the basic
interaction involves a trilinear coupling.4 To fourth order in $\lambda$, one encounters the
4 For a similar analysis of an overlapping divergence in $\phi^4$ theory, see Problem 4. Note that the $\phi^3$ theory
has a spectrum unbounded below (cf. Section 8.4): all finite-energy states are unstable. Nevertheless, the
[Fig. 17.3: two-loop self-energy graph in $\phi^3$ theory, with external momentum $p$ and internal lines carrying momenta $l_1$, $p - l_1$, $l_2$, $l_1 - l_2$ and $p - l_2$.]
self-energy contribution to the scalar propagator indicated by the graph in Fig. 17.3,
corresponding to the Feynman integral (the external legs are truncated):
$$I^\Gamma(p) = \int \frac{d^6 l_1}{(2\pi)^6} \frac{1}{(l_1^2 + m^2)((p - l_1)^2 + m^2)} \cdot \int \frac{d^6 l_2}{(2\pi)^6} \frac{1}{(l_2^2 + m^2)((l_1 - l_2)^2 + m^2)((p - l_2)^2 + m^2)} \qquad (17.21)$$
The overall divergence degree of this graph is quadratic, D = −2 − 2 + 6 − 2 −
2 − 2 + 6 = +2, while the subdivergences corresponding to l1 large (with l2 fixed)
or l2 large (with l1 fixed) have degree of divergence D = −2 − 2 − 2 + 6 = 0: i.e.,
logarithmic. Again, we introduce an abbreviation for the various renormalization parts
of the diagram,
$$\Gamma = [l_1,\; p - l_1,\; l_2,\; l_1 - l_2,\; p - l_2], \qquad \gamma_1 = [l_1,\; p - l_1,\; l_1 - l_2], \qquad \gamma_2 = [l_2,\; l_1 - l_2,\; p - l_2] \qquad (17.22)$$
with d(Γ) = 2, d(γ1 ) = 0, d(γ2 ) = 0. It is apparent that in this case we are faced with
two divergent subgraphs (γ1 and γ2 ) which have a non-trivial intersection (the line
carrying momentum l1 − l2 ) but are not simply nested as in our previous two-loop
example of Fig. 17.2. The correct subtraction procedure is hardly obvious in this case,
especially as the subtraction terms obtained by applying the Taylor operators tγ1 and
tγ2 depend on the order in which these operations are applied, as the reader may easily
verify. It turns out that the correct way to “slice” the integrand in order to extract
correctly the dominant contributions in all divergent subintegrations of the two-loop
integral (17.21) involves subtractions only on non-overlapping renormalization parts.
Specifically, we define the fully subtracted two-loop self-energy as
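$$I_R^\Gamma(p) = (1 - t_\Gamma)\bar{I}^\Gamma(p) \qquad (17.23)$$
$$\bar{I}^\Gamma(p) = I^\Gamma + I^{\Gamma/\gamma_1}(-t_{\gamma_1})\bar{I}^{\gamma_1} + I^{\Gamma/\gamma_2}(-t_{\gamma_2})\bar{I}^{\gamma_2} \qquad (17.24)$$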
renormalized perturbation theory of this model is perfectly sensible, to any finite order, and the graph
topology and divergence structure are similar to those appearing in gauge theories, making it a very useful
laboratory for illustrating important issues of renormalization, unclouded by complications introduced by
higher spin fields.
$$I_R^\Gamma(p) = I^\Gamma + I^{\Gamma/\gamma_1}(-t_{\gamma_1})\bar{I}^{\gamma_1} + I^{\Gamma/\gamma_2}(-t_{\gamma_2})\bar{I}^{\gamma_2} - t_\Gamma I^\Gamma - t_\Gamma\big(I^{\Gamma/\gamma_1}(-t_{\gamma_1})\bar{I}^{\gamma_1}\big) - t_\Gamma\big(I^{\Gamma/\gamma_2}(-t_{\gamma_2})\bar{I}^{\gamma_2}\big)$$
$$\equiv I_a(p) + I_b(p) + I_c(p) + I_d(p) + I_e(p) + I_f(p) \qquad (17.25)$$
[Fig. 17.4: graphical representation of the six terms (a) to (f) of (17.25).]
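$$I_R^\Gamma = (1 - t_\Gamma)\bar{I}^\Gamma \qquad (17.27)$$
$$\bar{I}^\Gamma = I^\Gamma + \sum_{\gamma_1,\gamma_2,\ldots,\gamma_n;\ \gamma_i \cap \gamma_j = \emptyset,\ i \neq j} I^{\Gamma/\{\gamma_1,\ldots,\gamma_n\}} \prod_{i=1}^{n} (-t_{\gamma_i})\bar{I}^{\gamma_i} \qquad (17.28)$$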
We shall show, by induction on the number of loops, that (17.29) solves (17.28).
In the event that U is the empty forest, the product of (negative) Taylor operators is
interpreted as unity: this term simply reproduces the original, unsubtracted integrand.
All other terms involve at least one renormalization part and, therefore, a subtraction
of the original integrand. The formula is evidently correct at one loop, as there are
no possible subdivergences, the normal forests of γ are empty, and Ī γ = I γ . We now
proceed by induction, and assume that the Zimmermann formula has been established
up to the (lower) number of loops contained in the subgraphs γ1 , .., γn in (17.28). We
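may therefore insert (17.29) in (17.28) for the subgraphs $\gamma_i$, $i = 1, 2, \ldots, n$, obtaining
$$\bar{I}^\Gamma = I^\Gamma + \sum_{\gamma_1,\gamma_2,\ldots,\gamma_n;\ \gamma_i \cap \gamma_j = \emptyset,\ i \neq j} I^{\Gamma/\{\gamma_1,\ldots,\gamma_n\}} \prod_{i=1}^{n} (-t_{\gamma_i}) \sum_{U_i \in \mathcal{N}(\gamma_i)} \prod_{\gamma_r \in U_i} (-t_{\gamma_r}) I^{\gamma_i} \qquad (17.30)$$
[Fig. 17.5: trees of nested renormalization parts, with $\gamma_1, \gamma_2, \ldots, \gamma_n$ at the base and $\gamma_{n+1}, \ldots, \gamma_N$ above them.]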
With a little thought, the reader will easily see that the sum in the second term on
the right-hand side of (17.30) simply assembles all the non-empty normal forests of
the graph Γ, sorted by their extremal elements γ1 , γ2 , ..γn , thereby reproducing the
forest formula at the next higher level of induction. In the event that the full graph
Γ is itself divergent, then the sum over all forests F(Γ) can be divided into pairs of
normal forests U and full forests [Γ, U ] with an additional subtraction −tΓ for each
full forest, yielding the Zimmermann forest formula for the fully subtracted amplitude
given recursively in (17.27):
$$I_R^\Gamma = \sum_{U \in \mathcal{F}(\Gamma)} \prod_{\gamma_r \in U} (-t_{\gamma_r}) I^\Gamma \qquad (17.31)$$
Note that the nested nature of the subtractions in the forest formula (due to the
absence of overlapping renormalization parts) allows us to write the complete Feynman
integrand $I^\Gamma$ (broken up into a product of reduced integrands $I^{\Gamma/\{\gamma_1,\ldots,\gamma_n\}}$ and
individual renormalization parts $I^{\gamma_i}$ in (17.30)) to the extreme right in (17.31): the
Taylor operators in each of the “trees” in Fig. 17.5 then act sequentially, starting at
the top (the smallest divergent subdiagram) and working downwards. The trickiest
part of a proper proof5 of the convergence of $I_R^\Gamma$ in (17.31) lies in the demonstration
that there exists an “admissible” routing of the loop momenta in the subdiagrams such
that the result of the Taylor operations is unambiguous (as we necessarily encounter
the situation in higher orders that the external momenta of one subgraph become the
internal momenta of another).
As usual, nothing serves better to clarify how this works than an explicit exam-
ple. In the case of the graph of Fig. 17.3, in addition to the empty forest, F(Γ)
evidently contains the forests [γ1 ], [γ2 ], [Γ], [Γ, γ1 ], [Γ, γ2 ]—but not a forest containing
both overlapping subdiagrams γ1 and γ2 —giving rise to exactly the six terms indicated
in (17.25) and displayed in Fig. 17.4, the convergence of which has already been
explained. The reader is strongly encouraged to verify the efficacy of the Zimmermann
forest formula with additional examples such as the graph of Fig. 17.2, where one
may also verify explicitly the equivalence of the Bogoliubov–Parasiuk recursion and
Zimmermann formulas.
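The counting of forests in the example above can be reproduced mechanically. The following sketch is an illustration under our own conventions, not an implementation of the full forest formula: each renormalization part is represented by the set of its internal lines as in (17.22), a forest is any set of parts that are pairwise nested or disjoint, and the names `non_overlapping`, `forests` and `fig_17_3` are ours.

```python
from itertools import combinations


def non_overlapping(a, b):
    """True if one part contains the other, or if they share no lines."""
    return a <= b or b <= a or a.isdisjoint(b)


def forests(parts):
    """All sets of pairwise non-overlapping renormalization parts, including the empty forest.

    `parts` maps a name to the frozenset of internal lines of that renormalization part.
    """
    names = sorted(parts)
    result = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            if all(non_overlapping(parts[x], parts[y])
                   for x, y in combinations(subset, 2)):
                result.append(subset)
    return result


# Renormalization parts of the two-loop self-energy graph, as listed in (17.22).
fig_17_3 = {
    "Gamma": frozenset({"l1", "p-l1", "l2", "l1-l2", "p-l2"}),
    "gamma1": frozenset({"l1", "p-l1", "l1-l2"}),
    "gamma2": frozenset({"l2", "l1-l2", "p-l2"}),
}
for forest in forests(fig_17_3):
    print(forest)
```

The six forests printed (the empty one included) correspond to the six terms of (17.25); the pair of overlapping subdiagrams never appears together.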
The zero-momentum subtraction method described so far in this section (referred
to commonly as the “BPHZ” or Bogoliubov–Parasiuk–Hepp–Zimmermann scheme) is
by no means the only way to arrive at a fully subtracted amplitude which is UV-finite
according to the requirements of Weinberg’s theorem. Clearly, the addition of finite
(cutoff independent) constant terms to the amounts prescribed by our Taylor operators
tγ for each renormalization part would produce an equally UV-finite fully subtracted
amplitude, differing from the BPHZ one by a finite amount. Indeed, soon after the
5 The technical niceties involved in a proper choice of momentum routing are dealt with in full detail in
(Zimmermann, 1969). Alternatively, one may employ an “α-parametric” representation, replacing Feynman
propagators $1/(p^2 + m^2) \to \int_0^\infty e^{-\alpha(p^2 + m^2)}\,d\alpha$, performing the loop momentum integrals explicitly, and
formulating the subtractions directly for the resultant integrands in the multi-dimensional “α” space, in
which case the ambiguities of momentum ordering can be avoided: see (Bergère and Lam, 1976).
In perturbation theory, we as usual assign the quadratic field kinetic ($\frac{1}{2}(\partial_\nu\phi)^2$) and
mass ($a_0\phi^2$) terms to the “free” part of the Lagrangian, and the remaining terms in
(17.32) are placed in the “interaction”, generating the vertices of the Feynman graphs
of the theory.6 In the present case, the interaction vertices are associated with the bare
coupling constants $a_1, a_2, a_3, \ldots, a'_1, a'_2, a'_3, \ldots$ etc. Let us now consider a 1PI graph with
$E$ external lines, with momentum-space amplitude $\Gamma^{(E)}(k_1, k_2, \ldots, k_E)$. We shall be
concerned only with dimensional analysis here—counting powers of mass dimension—
so overall numerical (dimensionless) factors may be ignored. Thus, using the symbol
∼ to indicate dimensional equivalence, we have
$$\delta^d\Big(\sum_i k_i\Big)\,\Gamma^{(E)}(k_1, k_2, \ldots, k_E) \sim \prod_{i=1}^{E} (k_i^2 + m^2) \int d^d x_1 \cdots d^d x_E\, \langle 0|\phi(x_1)\cdots\phi(x_E)|0\rangle \qquad (17.34)$$
The product of $k^2 + m^2$ factors on the right-hand side serves to truncate the external
propagators, and, of course, the 1PI character of $\Gamma^{(E)}$ means that only a subset of the
terms contributing to the E-point function in the integral are included, but this does
not alter the fact that dimensional consistency requires the left- and right-hand sides
of (17.34) to have the same engineering dimension. This implies
$$-d + \dim(\Gamma^{(E)}) = 2E - dE + E\Big(\frac{d}{2} - 1\Big) \;\Rightarrow\; \dim(\Gamma^{(E)}) = d + E\Big(1 - \frac{d}{2}\Big) \qquad (17.35)$$
Now let us consider a particular $L$-loop graph contributing to $\Gamma^{(E)}$ with $N_1, N_2, \ldots$
vertices of the interaction terms $O_1, O_2, \ldots$ etc., and similarly $N'_1, N'_2, \ldots$ vertices for the
primed operators $O'_1, O'_2, \ldots$ With $l$ representing a generic loop momentum, and with
all loop momenta large compared to masses and external momenta $k_i$, we also have,
dimensionally,
$$\Gamma^{(E)} \sim a_1^{N_1} a_2^{N_2} \cdots (a'_1)^{N'_1} (a'_2)^{N'_2} \cdots \int l^{D - dL}\, d^{dL} l \qquad (17.36)$$
Using our previous result (17.35) for the dimension of $\Gamma^{(E)}$, we therefore find for the
superficial degree of divergence of this graph
6 For the time being we shall ignore terms with more than two derivatives and quadratic in the field. Such
“Pauli–Villars” terms can be included in the free propagator, and result in a damping at high momentum
analogous to that employed in our treatment of the renormalization group flow of effective Lagrangians in
Section 16.4. In other words, they can be viewed as a modification of the cutoff scheme employed to define
individually divergent Feynman graphs.
Counterterms, subtractions, and perturbative renormalizability 631
$$D = d + E\Big(1 - \frac{d}{2}\Big) - \sum_{n=1} \Big( N_n \Big(2 + n\Big(1 - \frac{d}{2}\Big)\Big) + N'_n\, n\Big(1 - \frac{d}{2}\Big) + \ldots \Big) \qquad (17.38)$$
We note that the sum divides into (a) a single negative contribution ($n = 1$), corresponding
to the operator $\phi^3$, increasing insertions of which decrease the superficial
degree of divergence of the graph, (b) a term with vanishing coefficient of $N_2$, which
counts the number of appearances of the quartic vertex induced by the $\phi^4$ term in
the Lagrangian, increasing insertions of which therefore do not affect the superficial
degree of divergence of the graph, and (c) terms with positive contributions to $D$,
corresponding to operators $O_3 = \phi^5$, $O_4 = \phi^6, \ldots, O'_1 = (\partial_\nu\phi)^2\phi^2, \ldots$ with engineering
dimension greater than 4, and for which increasing numbers of vertices result in
increasing degree of divergence of the graph containing them. This division precisely
corresponds to the classification of operators referred to in Section 16.3 as (a) relevant,
(b) marginal, or (c) irrelevant, from the point of view of the renormalization group
behavior of effective field theories described there. In the present context it will be
more convenient to refer to the operators of type (a) as super-renormalizable, (b) as
renormalizable, and (c) as non-renormalizable, for reasons that will shortly become
clear. The essential point to be grasped at this juncture is that, while the number
of super-renormalizable and renormalizable operators/vertices is finite, (indeed, there
are only two possible interaction terms in 4-dimensions, corresponding to φ3 and φ4 ),
the number of non-renormalizable vertices/operators is always infinite. We note here
for future reference that in d = 6 spacetime dimensions, (17.39) becomes
$$ D = 6 - 2E + \sum_{n=1}\left( (2n-2)N_n + 2n N'_n + \ldots \right) \tag{17.40} $$
so the only renormalizable operator in this case is φ3 , with the quartic vertex from φ4
already corresponding to a non-renormalizable term. For spacetime dimensions greater
than 6, all interaction operators fall into the non-renormalizable category.
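The counting in (17.38) is easy to mechanize. The following Python sketch is only an illustration: the function name `superficial_degree` and the dictionary encoding of the vertex counts N_n, N'_n are assumptions introduced here, not notation from the text. It evaluates the formula as written, without checking that a graph with the given vertices exists, and confirms the statements made above for d = 4 and d = 6.

```python
from fractions import Fraction

def superficial_degree(d, E, N=None, N_prime=None):
    """Superficial degree of divergence D from (17.38).

    d        : spacetime dimension (integer)
    E        : number of external scalar lines
    N        : {n: N_n}, vertex counts of the non-derivative operators
               (n = 1 is phi^3, n = 2 is phi^4, ...)
    N_prime  : {n: N'_n}, vertex counts of the two-derivative operators
    Terms with more derivatives (the ".." in (17.38)) are not included.
    """
    N = N or {}
    N_prime = N_prime or {}
    half = 1 - Fraction(d, 2)          # the recurring factor (1 - d/2)
    D = d + E * half
    D -= sum(count * (2 + n * half) for n, count in N.items())
    D -= sum(count * n * half for n, count in N_prime.items())
    return D

# d = 4: phi^4 vertices (n = 2) do not change D, so D = 4 - E as in (17.41).
assert superficial_degree(4, 4, N={2: 1}) == superficial_degree(4, 4, N={2: 7}) == 0
# d = 4: each phi^3 vertex (n = 1) lowers D by one unit.
assert superficial_degree(4, 2, N={1: 2}) == 0
# d = 4: phi^6 vertices (n = 4) raise D, the non-renormalizable case.
assert superficial_degree(4, 6, N={4: 3}) > superficial_degree(4, 6, N={4: 1})
# d = 6: phi^3 is marginal, and phi^4 already raises D, as in (17.40).
assert superficial_degree(6, 3, N={1: 5}) == 0
assert superficial_degree(6, 4, N={2: 1}) == 0
```

Exact rational arithmetic (`Fraction`) is used so that the half-integer factor 1 − d/2 in odd dimensions would not introduce rounding.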
The result (17.39) allows us to identify immediately the renormalization parts
for a scalar field theory in four dimensions. First, consider the case where only
super-renormalizable and renormalizable terms are present in the Lagrangian. Thus,
N_n = 0, n > 2, N'_n = 0, n > 0, etc., and we have simply
$$ D = 4 - E \tag{17.41} $$
We may identify this theory by the useful notation φ^4_4, where the subscript indicates
the spacetime dimension, and the superscript indicates the highest dimension
(renormalizable) interaction operator. Let us also assume the discrete symmetry
φ → −φ which eliminates terms containing odd powers of the field. Then the only
renormalization parts of the resultant φ4 theory correspond to 1PI graphs with
E = 0, 2, or 4 external lines. We have already seen in Chapter 10 that the vacuum
graphs of the theory (E = 0) induce a physically irrelevant (in flat space) phase
632 Scales II: Perturbatively renormalizable field theories
shift of the vacuum state, which cancels in the appropriately normalized amplitudes.
Thus the renormalization parts of this theory correspond to quadratically divergent
1PI two-point subgraphs (termed “self-energy”, or sometimes “vacuum polarization”
graphs) and logarithmically divergent 1PI four-point subgraphs (which we shall call
“vertex renormalization parts” for reasons shortly to become apparent). All graphs
with more than four external lines are superficially convergent: they may, of course,
contain internal subdivergences, but these must correspond to self-energy or vertex
renormalization subgraphs. Similarly, in φ^3_6 theory, we conclude from (17.40) that
with only renormalizable or super-renormalizable terms present, the renormalization
parts correspond to 1PI graphs with 1, 2, or 3 external lines: termed “tadpole”, “self-
energy”, and “vertex” renormalization parts respectively. Graphs with more than three
external lines (e.g., 2-2 scattering graphs) are superficially convergent. Here we cannot
impose the reflection symmetry φ → −φ, as we wish to retain the one non-trivial
renormalizable term, which is cubic in the field.
On the other hand, if non-renormalizable terms are present, there is an infinite
number of renormalization parts in either case, as the superficial degree of divergence
D of graphs with any number of external lines eventually becomes positive if enough
vertices of non-renormalizable operators are inserted, due to the positive terms in the
sums in (17.39, 17.40). We are about to see that the subtractions introduced in the
preceding section to remove the dominant cutoff dependence of Feynman amplitudes
are exactly equivalent to a precise set of reparameterizations of the parameters in
the (cutoff) Lagrangian in terms of parameters defined at low energy. Evidently, the
presence of non-renormalizable operators in the basic Lagrangian will require the repa-
rameterization of an infinite number of Lagrangian parameters, if we wish to compute
cutoff-insensitive amplitudes to arbitrary orders of perturbation theory. In such a case,
we say that we are dealing with a “perturbatively non-renormalizable theory”.
What if we exclude ab initio non-renormalizable terms from the Lagrangian?
From the point of view of the Wilsonian effective Lagrangian theory discussed in the
preceding chapter, this would seem to be a physically unreasonable procedure: after all,
we saw there that the inevitable presence of new physics (such as quantum gravity) at
very short distances necessarily induces an infinite set of higher-dimension operators in
the effective Lagrangian for any theory cutoff at some high-momentum scale. We shall
return to precisely this question in Section 17.4, where the implications of restricting
ourselves to a Lagrangian with only super-renormalizable or renormalizable terms are
examined from a general renormalization group point of view. For the time being,
let us suppose (as we have frequently done in the book, making no excuses!) that we
can describe the interactions of a real scalar field in four dimensions by a Lagrangian
containing only three terms (and an implicit UV cutoff Λ),
$$ \mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{1}{4!}\lambda\phi^4 \tag{17.42} $$
where we have returned to more conventional notation for the mass and coupling
terms: a_0 = (1/2)m^2, a_2 = (1/4!)λ. This Lagrangian, of course, represents a class of physical
theories, with varying masses for the φ-particle, and for the strength of its self-
coupling. The relevant physical theory must be fixed by performing experiments to
measure the physical mass mph (related, but as we shall see, certainly not identical,
to the parameter m2 in the Lagrangian), and a physical coupling strength λph . One
might, for example, define the latter as the S-matrix element for elastic 2-2 scattering
in the zero (spatial) momentum limit p_0^μ = (m_ph, 0), sans uninteresting momentum-conservation
and phase-space factors (cf. (7.196)):
$$ S_{p_0 p_0,\,p_0 p_0} \equiv \frac{1}{(2\pi)^6 (2m_{ph})^2}\,\lambda_{ph}\cdot(2\pi)^4\delta^4() \tag{17.43} $$
Evidently, for any given fixed value of the UV cutoff Λ, these definitions fix the bare
parameters m2 , λ appearing in the Lagrangian as functions of measurable quantities
Note that we take k_i^μ = 0 for all four components of the external momenta, which
means that we are parameterizing the theory in terms of an off-shell value for a
Green function. The overall numerical factor in Γ(4) is chosen so that its perturbative
expansion in terms of the bare coupling λ begins with λ, with a coefficient of unity.
The first few graphs contributing to Γ(4) in terms of the bare parameters of (17.45)
are indicated in Fig. 17.6. Thus, we have a formal expansion:
The shift δλ between the bare and renormalized coupling is called a “coupling constant
counterterm”: the coefficients c2 , c3 , . . . appearing in the expansion of the counterterm
in λR must be determined order by order in perturbation theory to ensure that the
zero momentum value of Γ(4) remains pinned at λR , as it is defined to be. We shall
see shortly how this is accomplished in practice. We likewise define counterterms for
the other two (as yet) floating parameters in the Lagrangian:
$$ \hat{Z} \equiv 1 + \delta\hat{Z} = 1 + a_1\lambda_R + a_2\lambda_R^2 + \ldots \tag{17.48} $$
$$ \hat{Z}m^2 \equiv m_R^2 - \delta m^2 = m_R^2 - (b_1\lambda_R + b_2\lambda_R^2 + \ldots) \tag{17.49} $$
Fig. 17.6 Low-order 1PI contributions (through one loop) to Γ^(4)(k1, k2, k3, k4) in φ^4 theory.
The crossbars indicate that external legs are amputated.
The counterterms δ Ẑ, δm2R , will be used to adjust the behavior of the self-energy (i.e.,
the 1PI two-point function of the renormalized field φR ) at zero momentum, as follows.
Note that the connected two-point function of the theory (in momentum space, the
“full” Feynman propagator Δ̂F (p); cf. Section 10.4) can be graphically represented
as a series of free propagators ΔF (p), interspersed by 1PI self-energy corrections, as
indicated in Fig. 10.5. Denoting the sum of all 1PI two-point self-energy graphs as
Π(p) (the graphs contributing to this Green function in φ4 through two loops are
displayed in Fig. 17.7), we have algebraically
In other words, the inverse (Euclidean) full propagator consists of a tree contribution,
which is simply the inverse free propagator, together with (minus) the 1PI
loop graphs indicated in Fig. 17.7, as we would expect, given that it coincides
with the two-point function corresponding to the functional Γ(φ_R) generating the 1PI
graphs of the theory (cf. Equations (10.142, 10.143)). We are at liberty to apportion the
quadratic (mass term) part of the Lagrangian into a “free” and an “interacting”
part, and we shall define the coefficient of (1/2)φ_R^2 in the free Lagrangian as m_R^2: thus
Δ_F^{-1}(p) = p^2 + m_R^2 in (17.50). Moreover, we shall choose δẐ, δm_R^2 order by order in
perturbation theory to remove the first two terms in the Taylor expansion (in p2 ) of
Π(p) (which is, of course, a scalar function of p2 , by Lorentz-invariance), as follows:
$$ \Pi(0) = 0 \tag{17.51} $$
$$ \left.\frac{\partial^2 \Pi(p)}{\partial p^2}\right|_{p=0} = 0 \tag{17.52} $$
Equivalently, these latter two conditions may be phrased, using (17.50), in terms of
the full propagator,
$$ \hat{\Delta}_F^{-1}(p=0) = m_R^2 \tag{17.53} $$
$$ \frac{1}{2}\left.\frac{\partial^2 \hat{\Delta}_F^{-1}(p)}{\partial p^2}\right|_{p=0} = 1 \tag{17.54} $$
Fig. 17.7 Low-order 1PI contributions (through two loops) to Π(p) in φ^4 theory. The crossbars
indicate that external legs are amputated.
The three constraints (17.46, 17.51, 17.52) constitute the renormalization conditions
for our theory, as described in the first section of this chapter: the associated ampli-
tudes are said to be calculated in the BPHZ renormalization scheme.
Note that there is no reason to expect—and indeed it is not the case—that mR
corresponds to the physical mass mph of the particle: i.e., the location of the pole of the
full propagator Δ̂F (p). However, as is apparent from (17.53), it is a perfectly sensible
quantity (with dimensions of mass, of course) in terms of which to parameterize the
amplitudes of the theory, and in terms of which a formula can be obtained (order
by order in perturbation theory) for mph . The reparameterization procedure is most
easily carried out by rewriting the Lagrangian (17.45) as a sum of three terms:
To avoid notational overload, we have dropped the “R” subscript on the field, with
the understanding that here and henceforth we are concerned only with the Green
functions of the rescaled field. The “free” Lagrangian L0 , which determines the
propagators appearing in our graphs, is now written in terms of this rescaled (or
“renormalized”) field (φR of (17.45)), and the renormalized mass parameter mR (thus,
the Euclidean propagator denominators are simply k^2 + m_R^2). Inasmuch as all the
terms in the “basic vertex Lagrangian” Lbasic and the “counterterm Lagrangian” Lct
contain at least one power of λR , the full interaction Lagrangian now contains vertices
corresponding not just to the original four-point interaction vertex of (17.45), but
additional two- and four-point vertices corresponding to the terms in the counterterm
Lagrangian.
We now wish to establish the following remarkable property of the amplitudes
(i.e., n-point Euclidean Green functions) of this theory: once reparameterized in
terms of the renormalized quantities mR , λR as defined above, the amplitudes acquire
precisely the zero momentum subtractions corresponding to the Zimmermann formula
(17.31), thereby softening their UV cutoff dependence to the level of inverse powers,
order by order in the perturbative expansion of the amplitudes in λR . A theory of this
kind, in which the redefinition of a finite number of Lagrangian mass and coupling
parameters induces subtractions removing the UV cutoff dependence (up to inverse
powers) of all the amplitudes of the theory, to all orders of perturbation theory, is
called “perturbatively renormalizable”.
Our demonstration will be inductive in the number of loops of the diagrams
considered. Accordingly, our first task is to initiate the induction by demonstrating the
validity of the italicized assertion above in the lowest non-trivial order, namely, for the
one-loop amplitudes of the theory. In fact, we need only consider the 1PI amplitudes of
the theory for our proof of renormalizability. Any (connected) amplitude which is not
where the subscript indicates explicitly that our four-point function is being calculated
through one-loop order. The renormalization condition (17.46) implies that we must
choose the coefficient c2 in the vertex renormalization counterterm to precisely remove
the contribution of the second term on the right-hand side of (17.60) at zero momen-
tum, as the correct zero-momentum value for Γ(4) is already given by the lowest-order
tree contribution:
$$ c_2 = -\frac{1}{32\pi^2}\cdot 3I(0, m_R^2, \Lambda^2) \tag{17.62} $$
Inserting (17.62) in (17.60), we find, through order λ2R (i.e., to one loop),
$$ \Gamma^{(4)}_1(k_1,k_2,k_3,k_4) = \lambda_R - \frac{1}{32\pi^2}\lambda_R^2\left\{I_R(s,m_R^2) + I_R(t,m_R^2) + I_R(u,m_R^2)\right\} \tag{17.63} $$
with
$$ I_R(p^2,m_R^2) \equiv (1 - t^{\gamma})\,I(p^2,m_R^2,\Lambda^2) = \int_0^1 \ln\left(\frac{m_R^2}{x(1-x)p^2 + m_R^2}\right)dx \tag{17.64} $$
satisfying the required constraint IR (0, m2R ) = 0. We remind the reader that the Taylor
operator tγ (with γ referring to the one-loop bubbles of Fig. 17.6) when applied to
a one-loop integral with superficial degree of divergence zero, simply evaluates the
graph at zero external momentum. In other words, the counterterm proportional to
c2 λ2R is exactly equivalent to the Taylor operation defined in the preceding section, as
a consequence of the need to keep the four-point function pinned at the defined value
λR at higher orders. As expected, the result is the removal from the amplitude of
the logarithmic sensitivity to the cutoff Λ, leaving only terms of order m_R^2/Λ^2, k_i^2/Λ^2,
which we have neglected above.
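The subtracted integral in (17.64) is cutoff-independent and can be checked numerically. The sketch below is an illustration only: the function names are introduced here, the midpoint rule is an arbitrary choice of quadrature, and the closed form quoted for Euclidean p^2 > 0 is the standard result for this Feynman-parameter integral rather than a formula taken from the text.

```python
import math

def I_R_numeric(p2, m2, steps=20000):
    """Midpoint-rule estimate of the Feynman-parameter integral in (17.64)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.log(m2 / (x * (1.0 - x) * p2 + m2))
    return total * h

def I_R_closed(p2, m2):
    """Closed form of the same integral for Euclidean p2 >= 0."""
    if p2 == 0.0:
        return 0.0
    beta = math.sqrt(1.0 + 4.0 * m2 / p2)
    return 2.0 - beta * math.log((beta + 1.0) / (beta - 1.0))

# The renormalization condition: the subtracted integral vanishes at zero momentum.
assert abs(I_R_numeric(0.0, 1.0)) < 1e-12

# Quadrature and closed form agree at a few Euclidean momenta (units of m_R^2 = 1).
for p2 in (0.5, 4.0, 100.0):
    assert abs(I_R_numeric(p2, 1.0) - I_R_closed(p2, 1.0)) < 1e-6
```

For p^2 > 0 the integrand is negative everywhere, so I_R < 0 and the one-loop term in (17.63) increases the four-point function above λ_R as the Euclidean momenta grow.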
Next, we consider the one-loop subtractions induced by the field renormalization
(17.48) and mass (17.49) counterterms, which require the insertion in our graphs, to
first order, of interaction vertices generated by the associated one-loop counterterm
Lagrangian
$$ \mathcal{L}_{ct,\,1\ \mathrm{loop}} = \frac{1}{2}a_1\lambda_R(\partial_\mu\phi)^2 + \frac{1}{2}b_1\lambda_R\phi^2 \tag{17.65} $$
The perturbative expansion of the full propagator Δ̂F (p) through one loop now
consists of the three graphs indicated in Fig. 17.8: in addition to the one-loop self-
energy correction involving the basic vertex of Lbasic , there is a counterterm graph
(Fig. 17.8(c)) in which the operator (17.65) induces a two-point vertex with the
coefficient a1 λR p2 + b1 λR . Thus, we have, through order λR ,
Note that in this case the unsubtracted one-loop self-energy is given by a quadratically
divergent integral which is, however, independent of the external momentum p. We
must now choose the one-loop counterterm coefficients a_1, b_1 to impose the renormalization
conditions (17.51, 17.52) on the complete one-loop self-energy Π_{1,R}(p), including
counterterm contributions. As Π1 (p) is independent of p in this case, we have simply
$$ a_1 = 0 \tag{17.68} $$
$$ b_1 = \frac{1}{2}\int \frac{1}{l^2 + m_R^2}\,\frac{d^4l}{(2\pi)^4} \tag{17.69} $$
Fig. 17.8 Zeroth order (a), unsubtracted one-loop (b), and counterterm (c) contributions to
the propagator in φ^4 theory, through one-loop order.
which (in this admittedly very special case) results in the complete cancellation
of the self-energy to one-loop order: Π1,R (p) = 0! More generally, the effect of the
counterterm is clearly just to remove the first two terms in the Taylor expansion in
p2 of the unsubtracted self-energy (which for this special case degenerates to the first
constant term)—namely, the quadratically divergent γ in Fig. 17.8(b):
and the counterterm Lagrangian a2 λ2R (∂μ φ)2 + b2 λ2R φ2 induces a subtraction exactly
equivalent to the Taylor operator tγ , as we must choose, pursuant to (17.51, 17.52),
$$ a_2 = -\frac{1}{2}\left.\frac{\partial^2}{\partial p^2} I(p)\right|_{p=0} \tag{17.72} $$
$$ b_2 = -I(0) \tag{17.73} $$
whence
7 In φ^3_6-theory, as in gauge theories, each additional loop is accompanied by the square of the basic
coupling λ_R, as the reader may easily verify by examining a few simple graphs.
when computed up to one-loop order. What about all the other one-loop amplitudes
of the theory, with more than four external legs, which are, in virtue of (17.41),
superficially convergent? Such amplitudes may still contain ultraviolet divergences,
through the presence of divergent one-loop 1PI subgraphs. However, the latter are
exactly the two-point and four-point one-loop subgraphs whose UV divergence has
just been shown to be subtracted by the one-loop counterterms generated by the
reparameterization of the theory in terms of m_R, λ_R (and rescaling of the field). By Weinberg’s
theorem, a superficially convergent graph is UV-finite if all its subdivergences
(in this case, a single one-loop subdivergence) are subtracted, thereby appropriately
lowering the degree of divergence of the subgraph to a negative value. The first step
of our inductive argument is therefore complete.
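To make the quadratic divergence absorbed by b_1 in (17.69) explicit, one can evaluate the integral with a sharp cutoff |l| ≤ Λ on the Euclidean loop momentum. This choice of regulator is an assumption made here purely for illustration; the angular integration uses the area 2π^2 of the unit three-sphere.

```latex
b_1 = \frac{1}{2}\int_{|l|\le\Lambda}\frac{d^4l}{(2\pi)^4}\,\frac{1}{l^2+m_R^2}
    = \frac{1}{2}\cdot\frac{2\pi^2}{(2\pi)^4}\int_0^{\Lambda}\frac{l^3\,dl}{l^2+m_R^2}
    = \frac{1}{32\pi^2}\left[\Lambda^2 - m_R^2\ln\left(1+\frac{\Lambda^2}{m_R^2}\right)\right]
```

The leading Λ^2 term is the quadratic divergence; since the integrand carries no dependence on the external momentum p, the whole expression is a constant, which is why the single counterterm b_1 removes it completely at this order.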
The inductive step of the renormalization proof proceeds in a familiar fashion: we
assume that it has been established that the subtractions implicit in the Bogoliubov–
Parasiuk recursion formula (17.27, 17.28) (or its explicit solution (17.31)) are exactly
those generated by the counterterms required to implement the renormalization condi-
tions (17.46, 17.51, 17.52), up to L loops. The reader will recall (cf. Section 10.4) that
reinserting ℏ explicitly in Feynman amplitudes provides a convenient counting device
for loops: an amplitude with L loops is proportional to ℏ^L. Consider a contribution of
order ℏ^{L+1} to a 1PI amplitude Γ on the left-hand side of (17.24). The renormalization
parts γ1 , ..γn in (17.27) are proper subgraphs of Γ, therefore of order L at most, and
by the induction hypothesis their subtractions correspond to counterterms induced
by the BPHZ renormalization conditions up to loop order L. If Γ is superficially
convergent, we are done, as the tΓ operation in (17.24) is defined to be zero, and the
amplitude is already finite with no further subtractions needed. If Γ is a two- or four-point
function (in φ^4_4 theory), then we must define the as yet unfixed counterterms
a_{L+1}, b_{L+1}, c_{L+2} to enforce the renormalization conditions, by subtracting off the
constant and quadratic terms in the zero momentum expansion (in the case of the
two-point function, with d(Γ) = 2), or just the zero momentum value for the four-point
functions (with d(Γ) = 0). These counterterms, a_{L+1}, b_{L+1}, c_{L+2}, therefore correspond
exactly to the new order (L + 1)-loop subtraction tΓ Ī Γ in (17.24). There is, as usual,
no better way to convince oneself that nothing is being swept under the rug at this
point than by examining an explicit example: the two-loop self-energy of Fig. 17.3 with
overlapping divergences discussed earlier (see Problem 3). Although our argument has
considered 1PI graphs, we may now extend it to general connected graphs by recalling,
as discussed previously, that a general connected graph is simply the algebraic product
of its 1PI components with connecting propagator factors at fixed momentum, so its
UV-finiteness is assured once the component 1PI pieces are properly subtracted.
The UV-finiteness of amplitudes subtracted according to the recursive Bogoliubov–
Parasiuk procedure (or the explicit forest formula) therefore implies that the amplitudes
of φ^4_4, once reparameterized in terms of renormalized quantities, lose their
dependence on an ultraviolet cutoff Λ, up to the usual inverse power terms of
order O(m^2/Λ^2, k_i^2/Λ^2) (times possible powers of logarithms of m/Λ, k_i/Λ), which are considered
negligible from the point of view of renormalization theory. Whether they are indeed
so, from a quantitative point of view, depends, of course, on whether the range of
energy scales over which our low-energy field theory remains valid is sufficiently large.
Certainly, any perturbatively renormalizable local field theory which remains valid up
to the scale of Grand Unified Theories (10^15 GeV) or the Planck scale (10^19 GeV) will
contain contributions from the “new physics” at the high scale which are completely
negligible, from the point of view of computing amplitudes for processes allowed by
the low-energy theory, even at LHC energies of 10 TeV. Of course, these considerations
are not relevant if we are dealing with processes which are simply forbidden by the
low-energy theory (e.g., proton decay in Grand Unified Theories) and can only occur
by virtue of the new physics appearing at high energy scales.
Once renormalized amplitudes Γ_R^{(n)}(k1, k2, . . . kn; m_R, λ_R) are obtained in the above
BPHZ zero momentum renormalization scheme, up to the desired loop order in
perturbation theory, they may, of course, be re-expressed in terms of other low-energy
parameterizations, which may bear a more direct relation to physically measurable
quantities. For example, we may prefer to use an “on-shell” renormalization
scheme in which amplitudes are parameterized in terms of the actual physical mass
m_ph of our scalar particle (as given by the location of the pole of the full propagator;
cf. Section 9.5) and the on-shell 2-2 scattering amplitude λph at zero (or some other
standard) spatial momentum. In addition, the field rescaling constant Ẑ may be chosen
to adjust the residue of the pole of the full propagator to be unity. These changes
amount to non-linear reparameterizations of the BPHZ parameters mR , λR , Ẑ which,
of course, do not alter the UV-finiteness of the amplitudes: more exactly, this means
that we can re-expand the BPHZ parameters in power series in the new renormalized
coupling λph with coefficients which are finite (up to inverse power corrections, as
usual) as the UV-cutoff Λ is taken much larger than all other scales in the theory.
Alternatively, we may decide to compute our amplitudes ab initio in an on-shell
scheme,8 by an appropriate alteration of the renormalization conditions (17.46, 17.51,
17.52) used to determine the counterterms at each order of perturbation theory.
The zero-momentum renormalization scheme for φ^4_4-theory can be generalized in
a fairly obvious way by fixing the renormalized coupling λR as the value of the four-
point function Γ(4) (k1 , k2 , k3 , k4 ) (all momenta outgoing) at a non-zero Euclidean
momentum: the most convenient choice is the Euclidean symmetric point, defined by
where the dot-products are determined by symmetry plus momentum conservation
Σ_{i=1}^{4} k_i = 0. Similarly, renormalization conditions for the self-energy (17.51, 17.52)
are imposed at p^2 = μ^2 rather than at zero momentum. This scheme (really, a
one-parameter set of schemes, parameterized by the arbitrary scale μ) results in
renormalized Green functions with an explicit dependence on the renormalization scale μ
(exactly as we saw in the toy example at the beginning of this chapter). Nevertheless,
physical S-matrix elements are independent of μ, once they are reparameterized in
terms of measurable low-energy quantities (such as λph , mph of the on-shell scheme,
for example). This means that in order to keep the physics invariant, we must change
λR , mR as a function of the scale μ: in this sense, the renormalization procedure
induces a “running” coupling constant and mass.
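The emergence of a running coupling can be illustrated numerically from the one-loop result (17.63). The sketch below is a simplified illustration, not the scheme defined in the text: it evaluates the one-loop four-point function at a hypothetical Euclidean configuration with s = t = u = q^2, uses the standard closed form of the integral (17.64), and estimates the logarithmic slope by a finite difference. All function names and the sample values of λ_R and m_R are introduced here for illustration.

```python
import math

def I_R(p2, m2):
    """Closed form of the subtracted one-loop integral (17.64), Euclidean p2 >= 0."""
    if p2 == 0.0:
        return 0.0
    beta = math.sqrt(1.0 + 4.0 * m2 / p2)
    return 2.0 - beta * math.log((beta + 1.0) / (beta - 1.0))

def gamma4_one_loop(q2, lam_R, m2):
    """One-loop four-point function (17.63) at s = t = u = q2 (illustrative choice)."""
    return lam_R - lam_R**2 / (32.0 * math.pi**2) * 3.0 * I_R(q2, m2)

def log_slope(q, lam_R, m2, eps=1e-3):
    """Central finite difference of gamma4_one_loop with respect to ln q."""
    up = gamma4_one_loop((q * math.exp(eps))**2, lam_R, m2)
    down = gamma4_one_loop((q * math.exp(-eps))**2, lam_R, m2)
    return (up - down) / (2.0 * eps)

lam_R, m2 = 0.5, 1.0                       # sample values, units of m_R = 1
slope = log_slope(1.0e3, lam_R, m2)        # far above the mass scale
expected = 3.0 * lam_R**2 / (16.0 * math.pi**2)
print(slope, expected)
```

For q much larger than m_R the two printed numbers agree closely: the four-point function grows logarithmically with the scale at which it is evaluated, with the familiar one-loop coefficient 3λ_R^2/(16π^2). A coupling defined as the value of Γ^(4) at scale μ must therefore change with μ at this rate if the physics is to stay fixed.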
8 The reader should be warned that for massless theories there are difficulties, due to infrared divergences,
in adopting such an on-shell renormalization procedure, as we shall see in Chapter 19.
9 In Section 17.4 we shall see that this set corresponds to a finite-dimensional low-energy surface in the
infinite-dimensional coupling constant space of Wilsonian effective theory, onto which the renormalization
group flow necessarily contracts, at least in the neighborhood of zero coupling corresponding to formal
perturbation theory.
Fig. 17.10 Fermion loop contribution to π − π scattering amplitude in Yukawa isospin theory
(thick lines are Dirac propagators, thin lines are (amputated) scalar propagators).
with superficial degree of divergence zero, as in Fig. 17.10, where the four thick
internal lines refer to Dirac nucleon propagators, each of order 1/(loop momentum),
hence giving rise to a logarithmically divergent loop integral ∼ ∫ d^4l/l^4. The forest
formula implies that counterterms corresponding to the operator (π · π )2 must be
present to generate the corresponding Taylor subtraction operator needed to remove
the leading cutoff dependence of this diagram. Note that the single required operator
satisfies the global O(3) isospin symmetry of the basic Lagrangian (17.76), provided
the regularization procedure employed to define the individual diagrams also does. In
the case of a global symmetry, such as the isospin symmetry present here, this is not
difficult to manage: even a crude momentum cutoff will suffice, provided the mode
cutoffs on the different components of the scalar and Dirac fields are done identically.
As we shall see in the next section, cutoff procedures capable of maintaining a local
gauge symmetry are much harder to come by: in this case, the spatiotemporal aspect
of the symmetry requires a much more delicate treatment of the momentum modes of
the fields.
Returning to (17.76), we see that the theory described by this Lagrangian is
not perturbatively renormalizable as it stands, until we include in the interaction
Lagrangian all renormalizable operators (i.e., up to dimension 4) satisfying the global
symmetries of the theory. In the present case, this means that we must include a
quartic scalar term ab initio in our bare Lagrangian:
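$$ \mathcal{L} = \bar{N}(i\partial\!\!\!/ - M)N + \frac{1}{2}(\partial_\mu\vec{\pi}\cdot\partial^\mu\vec{\pi} - m_\pi^2\,\vec{\pi}\cdot\vec{\pi}) - ig\bar{N}\gamma_5\vec{\tau}N\cdot\vec{\phi} - \frac{\lambda}{4!}(\vec{\pi}\cdot\vec{\pi})^2 \tag{17.78} $$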
The reparameterization of the bare parameters and fields in the Lagrangian (17.78)
now suffices to generate all the counterterms needed to implement the Taylor sub-
tractions in the BPHZ renormalization scheme for this theory. In particular, the δλ
counterterm arising from the final interaction in (17.78) is needed to remove the
logarithmic divergence of Fig. 17.10. We say that the original Yukawa term, together
with the quartic scalar term, form a perturbatively renormalizable set of operators.
An even simpler example of this phenomenon occurs with the φ^3_6 theory introduced
earlier. Once non-renormalizable operators are excluded, we see from (17.40) that the
renormalization parts of the theory consist of the 1PI diagrams with E = 1, 2, or 3
Fig. 17.11 Tadpole graph appearing in φ^3_6-theory, and its accompanying counterterm.
external lines (there is, of course, no φ → −φ symmetry to exclude odd powers of the
field in this theory), so a naive choice of Lagrangian
$$ \mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{3!}\phi^3 \tag{17.79} $$
lacks the counterterm needed to remove UV divergences which appear in the “tadpole”
graphs of the theory: i.e., the 1PI diagrams with E = 1 external scalar line, as in
Fig. 17.11. Instead, we must write
$$ \mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \delta v\,\phi - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{3!}\phi^3 \tag{17.80} $$
and choose the counterterm δv order by order in perturbation theory to remove the
(momentum independent) tadpole terms as they appear at increasing loop order.10 In
this case, the generation of a complete perturbatively renormalizable set of operators
requires the inclusion of the super-renormalizable operator φ. In the next chapter we
shall see that the need for “completing” the set of operators included in the Lagrangian
is intimately related with the “operator mixing” which is characteristic of the behavior
of local operators in an interacting field theory.
The construction of a perturbatively renormalizable theory requires, as we have just
seen, that “enough” operators be included to allow for all necessary counterterms, but
it is also essential not to include even a single non-renormalizable operator in the basic
Lagrangian if we wish to have a theory in which a finite number of reparameterizations
suffice to remove the primary (divergent) cutoff dependence of the amplitudes. As we
mentioned previously, if even a single such operator is included in the Lagrangian,
the divergence counting formula (17.39) implies that renormalization parts appear
with arbitrarily many external lines, simply by considering graphs with arbitrarily
many insertions of the non-renormalizable vertex. Although the Zimmermann forest
formula continues to yield formally UV-finite subtracted amplitudes to all orders of
perturbation theory in such a case, the tγ subtraction operators appearing therein now
10 As the tadpole graphs lack momentum dependence, BPHZ subtraction removes them entirely, so we
may simply drop any graph with a tadpole insertion in this scheme.
Renormalization and symmetry 645
implicitly correspond to an infinite series of counterterms present in the reparameterized
Lagrangian: we are now forced to expand the Lagrangian to include essentially
all operators consistent with any global symmetries of the theory, and our amplitudes
will depend on an infinite set of parameters, severely restricting the predictive content
of the theory, at the very least. The restriction of the terms in the Lagrangian to
only that finite set compatible with the given field content, with the imposed global
and local symmetries, and with perturbative renormalizability, may have unintended
consequences: in particular, the appearance of additional accidental symmetries which
are only valid in virtue of the absence of non-renormalizable operators compatible
with the initial imposed symmetries of the theory but not with renormalizability. Such
accidental symmetries play an important role in the Standard Model (see Section 12.5,
(Weinberg, 1995a)).
It may have occurred to the reader at this point that the insertion of an ultraviolet
cutoff in a general clustering, Lorentz-invariant theory, as discussed in Chapter 16
effectively implies that our theory should indeed be described by an effective Wilsonian
Lagrangian in which all operators—super-renormalizable, renormalizable, and non-
renormalizable—appear. And the renormalization group discussion of the role of these
three classes of operator given there suggested that in fact the non-renormalizable
operators are in some sense the least important at low energy. The apparent incon-
sistency between these two points of view—the historically seminal restriction of
acceptable field theories to the (finite) class of perturbatively renormalizable ones,
and the much wider class of effective field theories encompassed in the Wilsonian
approach—will be discussed in Section 17.4, where we shall see that perturbatively
renormalizable theories emerge by evaluating the renormalization group flow of a
general Wilsonian effective Lagrangian with a particular choice of initial conditions
at the ultraviolet end, and utilizing the demonstrable insensitivity of the low-energy
properties (at the level of perturbation theory) to this special choice.
in the Lagrangian, in the example cited above leaving us with only two non-trivial
interaction terms. The situation is completely different once we include vector fields
interpolating for spin-1 particles. In this case, perturbative renormalizability requires
the presence of a local gauge symmetry, and, as we shall see now, the intertwining of
symmetry and renormalization properties becomes much more intricate than in the
global symmetry case.
The new features which appear in the treatment of renormalization for spin-1 fields
can be traced back to the large momentum behavior of the propagator for a canonical
(massive) field of spin-j, which contains a numerator factor of order k^{2j}. This leads
to a scalar (j = 0) propagator with a 1/k^2 falloff at large momentum k, a Dirac
fermion propagator falloff 1/k, but no falloff at all for a massive spin-1 (Euclidean)
propagator (g^{μν} − k^μ k^ν/m^2)/(k^2 + m^2) ∼ k^0, k ≫ m. The result is that, although
the engineering dimension of a massive Aμ vector field is the same as that of a scalar
field (namely, mass to the first power in four spacetime dimensions), internal vector
lines contribute an extra factor of +2, compared to scalar lines, to the overall degree
of divergence of any 1PI graph containing them. For a theory of fermions interacting
with vectors by a trilinear coupling g ψ̄γ μ ψAμ (with g dimensionless), this means that
the power-counting formula (17.77) for Yukawa theories must be replaced by
3
D = 4 − Eψ − EA + 2IA (17.81)
2
for a graph with Eψ external fermion lines, EA (resp. IA ) external (resp. internal)
vector lines. The positive term +2IA has the usual disastrous consequence: regardless
of the number of external lines possessed by a subgraph, at high enough order
of perturbation theory the subgraph will become divergent, and increasingly so as
we go to even higher orders in the trilinear coupling, by inserting more and more
internal gauge field lines, leading to a non-renormalizable theory with infinitely many
independent renormalization parts.
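The growth of the degree of divergence with the number of internal vector lines can be made concrete with a short numerical sketch. The function names below are illustrative choices, not notation from the text; the first function simply evaluates (17.81), and the second drops the $+2I_A$ term, as would be appropriate for a vector propagator with a $1/k^2$ falloff.

```python
from fractions import Fraction


def degree_massive_vector(e_psi: int, e_a: int, i_a: int) -> Fraction:
    """Superficial degree of divergence from (17.81):
    D = 4 - (3/2) E_psi - E_A + 2 I_A (vector propagator ~ k^0)."""
    return 4 - Fraction(3, 2) * e_psi - e_a + 2 * i_a


def degree_soft_vector(e_psi: int, e_a: int) -> Fraction:
    """Same counting without the +2 I_A term (vector propagator ~ 1/k^2)."""
    return 4 - Fraction(3, 2) * e_psi - e_a


# Fermion-fermion-vector vertex function (E_psi = 2, E_A = 1)
# with 1, 2 and 3 internal vector lines.
vertex_degrees = [degree_massive_vector(2, 1, i_a) for i_a in (1, 2, 3)]

# Four-fermion function (E_psi = 4, E_A = 0) with 1 and 2 internal vector lines.
four_fermion_degrees = [degree_massive_vector(4, 0, i_a) for i_a in (1, 2)]

for i_a, d in zip((1, 2, 3), vertex_degrees):
    print(f"vertex, I_A={i_a}: D={d}")
print("vertex with soft propagator: D =", degree_soft_vector(2, 1))
print("four-fermion, I_A=1,2:", four_fermion_degrees)
```

In this sketch the vertex function becomes more divergent with each added internal vector line, and the four-fermion function, which is superficially convergent without the $+2I_A$ term, already reaches $D = 0$ with a single internal vector line.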
Of course, we have already seen in Chapter 15 that in a theory with a massless
vector particle and an exact local gauge symmetry, there exist choices of gauge in
which the vector propagator has the same falloff, $1/k^2$, as a scalar propagator, thereby
removing the unfortunate $+2I_A$ contribution in (17.81). This is the case, for example,
in the covariant ξ-gauges discussed in Section 15.4, where the (Minkowski) gauge field
propagator takes the form (cf. (15.159)):
$$\langle 0|T\{A^{\alpha}_{\mu}(x)A^{\beta}_{\nu}(y)\}|0\rangle = -i\delta_{\alpha\beta}\int\frac{\left(g_{\mu\nu} - (1-\xi)\frac{p_\mu p_\nu}{p^2}\right)e^{-ip\cdot(x-y)}}{p^2 + i\epsilon}\,\frac{d^4p}{(2\pi)^4} \tag{17.82}$$
$$\mathcal{L}_E = \frac{1}{4}(F_{\mu\nu})^2 - \bar\psi(\not\partial - m)\psi - ie\bar\psi\not A\psi \tag{17.84}$$
We have not included source terms for the fermion fields, as we shall be consid-
ering only Green functions with external gauge field lines in the following. The
functional integral (17.83) is invariant under a change of variable of integration
Aμ (x) → Aμ (x) + ∂μ λ(x), with λ(x) infinitesimal; and as this also corresponds to a
local gauge transformation leaving LE invariant, we have, to first order in λ:
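$$0 = \int\mathcal{D}A_\mu\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\int\Big[\frac{1}{\xi}\partial_\mu A_\mu(x)\,\Box\lambda(x) - J_\mu\partial_\mu\lambda(x)\Big]d^4x\; e^{-\int(\mathcal{L}_E + \frac{1}{2\xi}(\partial_\mu A_\mu)^2 - J_\mu A_\mu)d^4x} \tag{17.85}$$
Defining $\omega(x) \equiv \Box\lambda(x)$, this becomes
$$0 = \int\mathcal{D}A_\mu\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\int\Big[\frac{1}{\xi}\partial_\mu A_\mu(x) + \frac{1}{\Box}\partial_\mu J_\mu(x)\Big]\omega(x)\,d^4x\; e^{-\int(\mathcal{L}_E + \frac{1}{2\xi}(\partial_\mu A_\mu)^2 - J_\mu A_\mu)d^4x} \tag{17.86}$$
whence, taking the functional derivative with respect to $\omega(x)$, we find
$$0 = \int\mathcal{D}A_\mu\,\mathcal{D}\psi\,\mathcal{D}\bar\psi\,\Big[\frac{1}{\xi}\partial_\mu A_\mu(x) + \frac{1}{\Box}\partial_\mu J_\mu(x)\Big]\, e^{-\int(\mathcal{L}_E + \frac{1}{2\xi}(\partial_\mu A_\mu)^2 - J_\mu A_\mu)d^4x} \tag{17.87}$$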
Factors of Aμ (x) within the functional integral may be replaced by functional deriva-
tives with respect to the source Jμ (x) acting on the generating functional Z[J], so our
result (17.87) can be rewritten as a functional differential equation for Z[J],
$$\frac{1}{\xi}\partial_\mu\frac{\delta Z[J]}{\delta J_\mu(x)} + \frac{1}{\Box}\partial_\mu J_\mu(x)Z[J] = 0 \tag{17.88}$$
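$$\frac{1}{\xi}\frac{\partial}{\partial x_\nu}\delta^4(x-y) = -\frac{1}{\Box_x}\frac{\partial}{\partial x_\mu}\frac{\delta^2\Gamma[A]}{\delta A_\mu(x)\delta A_\nu(y)} \tag{17.94}$$
Recalling that the second derivative of Γ yields the (full) inverse propagator of the
theory, we may write down immediately the Fourier transform of (17.94), with the
obvious translations $\partial_\mu \to ip_\mu$, $\Box \to -p^2$,
$$\frac{1}{\xi}p_\nu = \frac{1}{p^2}p_\mu\hat\Delta^{-1}_{F\mu\nu}(p) \tag{17.95}$$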
The full gauge field (i.e., photon) propagator can be obtained, just as in (17.50), as
an iteration of free propagators interspersed with self-energy corrections. The only
difference is that here the propagator and self-energy carry two vector indices, so the
products in the iteration are matrix products. Consequently,
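$$\hat\Delta^{-1}_{F\mu\nu}(p) = \Delta^{-1}_{F\mu\nu}(p) - \Pi_{\mu\nu}(p) = p^2\Big(\delta_{\mu\nu} + \Big(\frac{1}{\xi} - 1\Big)\frac{p_\mu p_\nu}{p^2}\Big) - \Pi_{\mu\nu}(p) \tag{17.96}$$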
where the first term on the right-hand side is the inverse of the free Euclidean
propagator (cf. (17.82)) in the covariant ξ-gauge, and the self-energy Πμν (p) is given
in the bare (i.e., pre-reparameterization) theory by graphs such as those indicated in
Fig. 17.12. In particular, the one-loop graph of Fig. 17.12 is given by
$$\Pi_{1\,\mu\nu}(p) = e^2\int\frac{\mathrm{tr}[\hat\gamma_\mu(i\not l + m)\hat\gamma_\nu(i(\not p + \not l) + m)]}{(l^2 + m^2)((p+l)^2 + m^2)}\,\frac{d^4l}{(2\pi)^4} \tag{17.97}$$
where the $\hat\gamma_\mu$ are the Euclidean γ-matrices defined in (15.160). Using the appropriate
Euclidean trace identities (see Problem 5), this loop integral becomes
$$\Pi_{1\,\mu\nu}(p) = 4e^2\int\frac{\delta_{\mu\nu}(m^2 + l\cdot(p+l)) - l_\mu(p+l)_\nu - l_\nu(p+l)_\mu}{(l^2 + m^2)((p+l)^2 + m^2)}\,\frac{d^4l}{(2\pi)^4} \tag{17.98}$$
Renormalization and symmetry 649
as the left-hand side of (17.95) is already given by just the free propagator part of
(17.96). In other words, the local gauge symmetry embodied by the Ward identities of
the theory requires the photon self-energy tensor to be transverse. This transversality
must, of course, hold at each loop order in perturbation theory, as we may formally
expand Πμν in powers of ħ and require that (17.99) hold at each order. Any
regularization which violates this property will ipso facto do violence to the local gauge
symmetry, and, as we shall now see, destroy the perturbative renormalizability of the
theory. Let us examine this issue explicitly for the one-loop integral (17.98). Inserting
a Feynman parameter in the usual way (cf. (16.67)), this becomes
$$\Pi_{1\,\mu\nu}(p) = 4e^2\int_0^1 dx\int\frac{\delta_{\mu\nu}m^2 - 2l_\mu l_\nu + 2x(1-x)p_\mu p_\nu + \delta_{\mu\nu}(l^2 - x(1-x)p^2)}{(l^2 + x(1-x)p^2 + m^2)^2}\,\frac{d^4l}{(2\pi)^4} \tag{17.100}$$
If we regularize the integral simply by imposing a momentum cutoff $|l| < \Lambda$, the
integral has a leading contribution for $\Lambda \gg p, m$ given by
$$\Pi_{1\,\mu\nu}(p) \sim \frac{e^2}{8\pi^2}\Lambda^2\delta_{\mu\nu} \tag{17.101}$$
which clearly does not satisfy (17.99). Such a divergence, were it truly present in the
theory, could only be removed by a counterterm corresponding to an explicit photon
mass term in the Lagrangian δm2 Aμ Aμ , which of course also violates the local gauge
invariance of the theory.
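The coefficient in (17.101) can be recovered from (17.100) in a few lines. The following is a sketch of the leading large-$l$ behavior only, assuming the usual symmetric-integration replacement $l_\mu l_\nu \to \frac{1}{4}\delta_{\mu\nu}l^2$ in four dimensions and the angular volume $2\pi^2$ of the unit three-sphere; subleading (logarithmic and finite) terms are dropped.

```latex
\begin{align*}
\Pi_{1\,\mu\nu}(p)
 &\simeq 4e^2\int_0^1 dx\int_{|l|<\Lambda}\frac{d^4l}{(2\pi)^4}\,
    \frac{\delta_{\mu\nu}l^2 - 2l_\mu l_\nu}{l^4}
    && (l^2 \gg p^2,\, m^2)\\
 &= 4e^2\,\delta_{\mu\nu}\Big(1 - \tfrac{2}{4}\Big)
    \int_{|l|<\Lambda}\frac{d^4l}{(2\pi)^4}\,\frac{1}{l^2}
    && (l_\mu l_\nu \to \tfrac{1}{4}\delta_{\mu\nu}l^2)\\
 &= 2e^2\,\delta_{\mu\nu}\,\frac{2\pi^2}{(2\pi)^4}\int_0^\Lambda l\,dl
  = \frac{e^2}{8\pi^2}\,\Lambda^2\,\delta_{\mu\nu},\\
p_\mu\Pi_{1\,\mu\nu}(p) &\simeq \frac{e^2}{8\pi^2}\,\Lambda^2\,p_\nu \neq 0 .
\end{align*}
```

The last line makes the failure of transversality explicit: a term proportional to $\delta_{\mu\nu}$ alone cannot be annihilated by contraction with $p_\mu$.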
On the other hand, dimensional regularization of the integral maintains the desired
transversality, as we might expect (or at least hope!), given that the formal structure
(and hence, the propagators and vertices) of a locally gauge-invariant Lagrangian does
not depend on the spacetime dimension in which it is formulated. Making the standard
replacement
$$\frac{d^4l}{(2\pi)^4} \to \mu^{4-d}\frac{d^dl}{(2\pi)^d} \tag{17.102}$$
650 Scales II: Perturbatively renormalizable field theories
Note that the dependence on the arbitrary renormalization scale μ (which plays the
role of the UV cutoff in a momentum cutoff scheme) is logarithmic, not quadratic. In
effect, the need to produce the transverse tensor pμ pν − δμν p2 outside the loop integral
has reduced the quadratic divergence of the loop integral (17.98) to a logarithmic one.
Inserting the renormalized self-energy in the expression (17.96) for the inverse full
photon propagator, and inverting once again, we find, for the renormalized Euclidean
photon propagator (to one loop),
$$\hat\Delta_{F\mu\nu}(p) = \frac{\delta_{\mu\nu} - \frac{p_\mu p_\nu}{p^2}}{p^2(1 - \Pi_{1,R}(p^2))} + \xi\frac{p_\mu p_\nu}{p^4} \tag{17.109}$$
The radiative corrections have induced a change only in the transverse (“physical”)
part of the propagator, but the propagator pole is still at p2 = 0 (note that Π1,R (0) is
finite and non-zero): our physical photon mass is still safely zero. The gauge-variant
part of the propagator retains exactly its lowest-order, tree-level value—a property
which can be shown to persist at higher orders of perturbation, thanks to the Ward
identity. Of course, the residue at the pole (the LSZ “Z” constant) is not unity: in
this renormalization scheme it is just 1/(1 − Π1,R (0)), and it must therefore be included
in the LSZ formula (9.179) when S-matrix elements are computed.
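The inversion leading to (17.109) can be checked numerically. The sketch below assumes that the renormalized self-energy has the transverse form $\Pi_{\mu\nu}(p) = (p^2\delta_{\mu\nu} - p_\mu p_\nu)\,\Pi(p^2)$; this overall sign is a convention chosen here so that the factor $1 - \Pi$ of (17.109) results, and the numerical values of $p$, $\xi$, and $\Pi$ are arbitrary test inputs.

```python
def delta(m: int, n: int) -> float:
    """Euclidean Kronecker delta."""
    return 1.0 if m == n else 0.0


def inverse_propagator(p, xi, pi_scalar):
    """Right-hand side of (17.96) with the assumed transverse self-energy
    Pi_{mu nu} = (p^2 delta_{mu nu} - p_mu p_nu) * pi_scalar."""
    p2 = sum(c * c for c in p)
    return [[p2 * (delta(m, n) + (1.0 / xi - 1.0) * p[m] * p[n] / p2)
             - (p2 * delta(m, n) - p[m] * p[n]) * pi_scalar
             for n in range(4)] for m in range(4)]


def propagator(p, xi, pi_scalar):
    """Claimed full propagator, eq. (17.109)."""
    p2 = sum(c * c for c in p)
    return [[(delta(m, n) - p[m] * p[n] / p2) / (p2 * (1.0 - pi_scalar))
             + xi * p[m] * p[n] / p2 ** 2
             for n in range(4)] for m in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[m][k] * b[k][n] for k in range(4)) for n in range(4)]
            for m in range(4)]


p = [0.3, -1.2, 0.7, 2.0]   # arbitrary Euclidean momentum
xi, pi_scalar = 0.5, 0.1    # arbitrary gauge parameter and self-energy value

product = matmul(inverse_propagator(p, xi, pi_scalar), propagator(p, xi, pi_scalar))
max_deviation = max(abs(product[m][n] - delta(m, n))
                    for m in range(4) for n in range(4))
print("max deviation from identity:", max_deviation)
```

The check works because the inverse propagator splits into a transverse part with coefficient $p^2(1 - \Pi)$ and a longitudinal part with coefficient $p^2/\xi$, and the two projectors are inverted separately; only the transverse coefficient is touched by the self-energy.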
Similar improvements in the degree of divergence relative to the naive power-
counting rule (17.81) occur in other dimensionally subtracted Green functions of the
theory, all with the net result of limiting the renormalization parts of the theory to
just those which correspond to the counterterms available by reparameterizing only
the couplings and fields present in our initially locally gauge-invariant Lagrangian
(17.84). For example, the analog of Fig. 17.10, the one-loop contribution to light by
light (elastic 2-2 photon) scattering arising from an internal electron loop, would seem
at first glance to be logarithmically divergent, as in the Yukawa case, and therefore
to require the presence of a quartic counterterm δλ(Aμ Aμ )2 . Such a term, of course,
destroys local gauge invariance, and once admitted, would then lead to the need for a
photon mass counterterm as well, massive photon propagators, and the re-emergence
of the fatal +2IA contribution in (17.81). In fact, the Ward identity obtained by
differentiating (17.94) a further three times with respect to the classical photon field
A amounts to the statement that the 1PI four-point photon Green function of the
theory is divergence-less on any of its four spacetime indices, or in momentum space:
$$p_{1\mu}\Gamma^{(4)}_{\mu\nu\rho\sigma}(p_1, p_2, p_3, p_4) = p_{2\nu}\Gamma^{(4)}_{\mu\nu\rho\sigma}(p_1, p_2, p_3, p_4) = \ldots = 0 \tag{17.110}$$
The transversality, just as in the case of the two-point function, is realized by the
appearance of four transverse tensors on each of the external legs of the diagram,
which then reduces the effective degree of divergence (in this case, by an astonishing
four powers, although in the present circumstance one would suffice!) to a negative
value. The diagram is therefore finite, and no counterterm is in fact needed.
A complete demonstration of perturbative renormalizability for a general non-
abelian gauge theory,11 quantized in a covariant ξ-gauge and defined by the functional
integral (15.161), requires the systematic application of the appropriate generalizations
of our simple Ward identity (17.94) for the general 1PI Green functions of the theory—
involving gauge field, and fermion and ghost fields. These “Slavnov–Taylor” identities
can then be shown to imply that the only renormalization parts arising to arbitrary
orders of perturbation theory indeed correspond to counterterms associated with
reparameterization of the original Lagrangian. The argument is somewhat lengthy
11 It should be confessed at this point that the addition of a photon mass term in the abelian case does
not in fact ruin perturbative renormalizability: the additional term induced in the Ward identity (basically,
one displaces the gauge parameter $1/\xi$ by $m_A^2/\Box$) does not alter the transversality of the multi-photon 1PI
amplitudes needed to exclude quartic non-gauge-invariant counterterms, although a mass counterterm is
now necessary. In the non-abelian case, however, the non-linear gauge couplings result in the generation
of an infinite number of non-gauge-invariant counterterms once an explicit mass term is included for the
gauge fields.
and will not be reproduced here, as its conceptual essence is already visible in the
abelian examples given above.12
We have seen that the explicit breaking of global symmetries of the Lagrangian does
not in general alter the renormalizability status of the theory: typically, one simply
has a larger number of renormalizable and super-renormalizable counterterms which
must be included in the Lagrangian to absorb divergences in multi-loop diagrams.
The same is true of spontaneously broken symmetries, which from a Lagrangian
point of view can be thought of as explicitly broken theories with special relations
between the Lagrangian parameters (cf. Section 8.4). Evidently, the renormalizability
of gauge theories (especially in the non-abelian case) is much more intimately con-
nected with the underlying local symmetry, and relies on the absence of any explicit
symmetry-breaking terms, even if their engineering dimension places them in the
category of (superficially) renormalizable or super-renormalizable operators.
Nevertheless, the perturbative renormalizability of a non-abelian gauge theory is
unaffected by spontaneous breaking of the remnant discrete gauge symmetry (after
gauge-fixing; cf. Section 15.6), which results in the appearance of physically massive
gauge vector particles via the Higgs phenomenon. We saw in Chapter 14 that the
spontaneous breakdown of a symmetry is quintessentially a long-distance (therefore,
low-energy) phenomenon, so it should not be surprising that the short-distance scaling
behavior of the theory is basically unaffected by the choice of vacuum state entailed by
the field shifts used to implement the effects of the spontaneous breakdown. Once a per-
turbatively renormalizable theory is reparameterized in terms of a (finite!) set of well-
defined low-energy parameters (and appropriately rescaled fields), the Lagrangian,
and hence the associated Hamiltonian defines a dynamical evolution in the state space
which is (order by order in perturbation theory, of course) insensitive at the power level
to the UV cutoff in the theory. This remains true whether the Hamiltonian is applied to
states obtained by applying field operators to the “false” non-ground-state “vacuum”
of the unshifted, symmetric theory, or to the physically relevant states built on the
true ground-state vacuum of the theory. Indeed, we have already seen in Section 15.6
that by exploiting the underlying exact local gauge symmetry of a spontaneously
broken gauge theory (abelian or non-abelian) one may derive covariant Feynman rules
in which the massive vector propagators have the soft 1/k 2 falloff essential for the
normal alignment of renormalizable (resp. non-renormalizable) operators with mass
dimension 4 (resp. > 4). The non-abelian Slavnov–Taylor identities do the remaining
job, just as in the unbroken case, of restricting the divergent counterterms to just
those associated with reparameterization of the parameters of the original Lagrangian,
which, we recall, is manifestly locally gauge-symmetric before this property is disguised
by the field shifts employed to display the spontaneous breaking.
12 The interested reader will find the complete proof in (for example) Section 12.4, (Itzykson and Zuber,
1980).
Renormalization group approach to renormalizability 653
We have slightly altered notation here, by labeling all operators (irrespective of the
number of spacetime-derivatives) by a single index n: thus, in addition to simple
powers of the field, the kinetic operator is included in the list, together with all other
operators with two, four, six, etc., spacetime-derivatives. The field φΛ (x) (cf. (16.4))
only contains Fourier modes of the field with Euclidean momentum |k| < Λ. Counting
powers of mass dimension, we find that if the operator On has mass dimension 4 − dn ,
the associated coefficient an has mass dimension dn , and we may define (cf. (16.8))
dimensionless couplings gn (Λ) by the simple scaling
The scale Λ is, as previously stated, arbitrary, and we may imagine fixing the
dimensionless couplings at some fixed very high ultraviolet scale ΛUV (much higher
than the physics we wish to explore, but safely smaller than the scale of quantum
gravity effects, say),
and then using the non-linear first-order evolution equations (16.27) describing the
renormalization group flow of the effective Lagrangian to determine the dimensionless
couplings at any lower scale μ < ΛUV , as a function of the dimensionless ratio μ/ΛUV
and the initial high-energy parameters ḡn :
$$\mu\frac{\partial}{\partial\mu}g_n(\mu) = \beta_n(g_n(\mu)) \;\Rightarrow\; g_n(\mu) = g_n(\bar g_n; \mu/\Lambda_{UV}) \tag{17.114}$$
In Section 16.3 we divided the set of local operators On into the relevant operators
corresponding to dn > 0, the marginal operators with dn = 0, and the irrelevant
operators with dn < 0 (corresponding, respectively, to the classification into super-
renormalizable, renormalizable, and non-renormalizable operators in the language of
perturbative renormalization). The relevant and marginal operators correspond to
operators (in spacetime dimension 4) with mass dimension less than or equal to 4,
and therefore constitute a finite set: let there be N of these (the actual number may
depend on the type of fields and interactions, and imposed symmetries, of course). We
shall distinguish the operators in this finite set by writing small Roman indices a, b,
etc., and indicate the irrelevant operators of dimension greater than 4 (of which there
are an infinite number) by Greek letters α, β, etc., reserving later Roman characters
m, n, r, etc., for the general set of operators. In Section 16.3 we also saw in some simple
examples that the dominant effect of integrating out non-renormalizable operators
between a high UV cutoff scale and a low-energy scale (up to small corrections
involving inverse powers of the large ultraviolet scale) was to produce modifications,
potentially of order unity, in the couplings associated with marginal and relevant
operators. The point of the following discussion is to reproduce this result in a much
more general context. Our derivation will follow the streamlined approach described
by Weinberg (Weinberg, 1995a), based on Polchinski’s original arguments (Polchinski,
1984).
We shall demonstrate that in the regime of weakly coupled perturbation theory,
the renormalization group flow implied by (17.114) maps an arbitrary initial surface
S̄ in the high-energy coupling constant space of the {ḡn } to an N-dimensional surface
S of the {gn (μ)}, a given point on which is uniquely determined by specifying N
low-energy parameters, up to corrections which fall as inverse whole powers of the
ratio of the UV cutoff ΛUV to the low-energy scale μ. The demonstration relies on a
linear stability analysis familiar in the treatment of non-linear dynamical systems. We
first consider the effect of a small (infinitesimal) change δgn in the parameters on the
renormalization flow generated by (17.114). Defining
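$$M_{nm}(g_n) \equiv \frac{\partial\beta_n}{\partial g_m} \tag{17.115}$$
we find
$$\mu\frac{\partial}{\partial\mu}\delta g_n(\mu) = M_{nm}\delta g_m(\mu) \tag{17.116}$$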
We may also define a matrix Gnm expressing the variation of the low-energy parame-
ters under variation of the initial parameters ḡn (see (17.114)):
$$G_{nm}(\mu) \equiv \frac{\partial g_n}{\partial\bar g_m} \tag{17.117}$$
Differentiating (17.114) with respect to the initial parameters, one finds also
$$\mu\frac{\partial}{\partial\mu}G_{nm} = M_{nr}G_{rm} \tag{17.118}$$
We shall assume that the finite N×N submatrix $G_{ab}$ with rows and columns restricted
to the marginal and relevant couplings is not singular, with well-defined inverse $G^{-1}_{ab}$,
which is presumably the case with the exception of perhaps an isolated set of measure
zero in the coupling constant space, which we assume our renormalization group flow
avoids. By usual matrix algebra, one has
$$\mu\frac{\partial}{\partial\mu}G^{-1}_{ab} = -G^{-1}_{ac}\Big(\mu\frac{\partial}{\partial\mu}G_{cd}\Big)G^{-1}_{db} = -G^{-1}_{ac}M_{cn}G_{nd}G^{-1}_{db} \tag{17.119}$$
The equivalence of the first and third lines follows from the fact that for values of the
index n in the marginal/relevant subset, we have $G_{na}G^{-1}_{ab} = \delta_{nb}$, and the two terms
on the first line then cancel: hence, the sum over n may be restricted to the
non-renormalizable set labeled by β in the third line. A similar argument establishes the
equivalence of the second and fourth lines, at which point (17.121) follows with some
straightforward shuffling of indices. In the free field limit (all couplings corresponding
to higher than quadratic operators set to zero) there are no loop integrals, and the
entire scale dependence of the dimensionless parameters is due to the rescaling by
engineering dimension in (17.112), and we therefore have
$$\delta\hat g_\alpha \sim \Big(\frac{\mu}{\Lambda_{UV}}\Big)^{-d_\alpha} = \Big(\frac{\mu}{\Lambda_{UV}}\Big)^{|d_\alpha|} \tag{17.124}$$
or, equivalently,
$$\delta g_\alpha(\mu) \sim G_{\alpha a}G^{-1}_{ab}\delta g_b(\mu) + O\Big(\Big(\frac{\mu}{\Lambda_{UV}}\Big)^{|d_\alpha|}\Big), \quad \mu \ll \Lambda_{UV} \tag{17.125}$$
The reason for the otherwise strange qualifier “irrelevant” applied to the non-
renormalizable couplings and operators in the theory (indexed by α) should be
apparent at this point: their effects at low energy may be entirely subsumed in
variations of the marginal and relevant couplings (indexed by b).
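The attraction described by (17.125) can be illustrated with a toy flow. The sketch below is not the model of the text: it assumes one marginal coupling `g` ($d = 0$) and one irrelevant coupling `h` ($d = -2$) with invented beta functions, integrates the flow from $\Lambda_{UV}$ down to $\mu = 10^{-2}\Lambda_{UV}$ with a Runge-Kutta step, and estimates the matrix $G_{nm}$ of (17.117) by finite differences. The quantity `residual` is the response of $\delta g_\alpha - G_{\alpha a}G^{-1}_{ab}\delta g_b$ to a unit change of the initial irrelevant coupling.

```python
import math


def beta(g, h):
    """Toy beta functions (an assumption for illustration only):
    g is marginal, h is irrelevant with engineering dimension d = -2."""
    return (0.2 * g * g + 0.3 * g * h, 2.0 * h + 0.5 * g * g)


def flow(g_bar, h_bar, ratio, steps=4000):
    """RK4 integration of mu d/dmu (g, h) = beta(g, h) from mu = Lambda_UV
    down to mu = ratio * Lambda_UV, in the variable t = ln(mu / Lambda_UV)."""
    dt = math.log(ratio) / steps  # negative: we flow toward lower scales
    g, h = g_bar, h_bar
    for _ in range(steps):
        k1 = beta(g, h)
        k2 = beta(g + 0.5 * dt * k1[0], h + 0.5 * dt * k1[1])
        k3 = beta(g + 0.5 * dt * k2[0], h + 0.5 * dt * k2[1])
        k4 = beta(g + dt * k3[0], h + dt * k3[1])
        g += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        h += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return g, h


def sensitivity(g_bar, h_bar, ratio, eps=1e-5):
    """Central-difference estimate of G_nm = d g_n(mu) / d gbar_m, cf. (17.117)."""
    gp, hp = flow(g_bar + eps, h_bar, ratio)
    gm, hm = flow(g_bar - eps, h_bar, ratio)
    G_gg, G_hg = (gp - gm) / (2 * eps), (hp - hm) / (2 * eps)
    gp, hp = flow(g_bar, h_bar + eps, ratio)
    gm, hm = flow(g_bar, h_bar - eps, ratio)
    G_gh, G_hh = (gp - gm) / (2 * eps), (hp - hm) / (2 * eps)
    return G_gg, G_gh, G_hg, G_hh


ratio = 1e-2  # mu / Lambda_UV
G_gg, G_gh, G_hg, G_hh = sensitivity(0.1, 0.1, ratio)

# Part of the low-energy change in h that is NOT accounted for by the
# induced change in the marginal coupling g, per unit change of h_bar.
residual = G_hh - G_hg * G_gh / G_gg
print("G_hh     =", G_hh)
print("residual =", residual, " vs (mu/Lambda)^2 =", ratio ** 2)
```

In this toy model the residual comes out of order $(\mu/\Lambda_{UV})^2$, matching the $|d_\alpha| = 2$ suppression in (17.125), while the raw sensitivity `G_hh` need not be that small: most of the low-energy effect of changing the irrelevant coupling at the cutoff is absorbed into a shift of the marginal one.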
The result (17.125) (see Fig. 17.13) expresses the desired result: arbitrary infinites-
imal displacements of the initial (high-energy) point {ḡn } (in other words, in the
tangent space of any surface containing the initial point) amount to displacements of
the low-energy parameters (at cutoff μ) in a finite, N-dimensional surface S, as the
displacements δgα (μ) are simply linear combinations of the N displacements δgb (μ) of
the marginal/relevant couplings of the theory. All low-energy amplitudes of the theory
(i.e., with external Euclidean momenta less than μ) are given by the specification of
the effective Lagrangian Lμ determined by the gn (μ), so fixing the low-energy physics
uniquely amounts to specifying a finite number—in fact, exactly N—of independent
low-energy amplitudes, which then locate a unique point on the attractive surface S.
This is exactly the procedure used in the preceding sections, where we have imposed
N renormalization conditions as a prelude to the reparameterization of the theory in
terms of the parameters defined by these conditions. It should be emphasized that
the gα (μ) coefficients of the infinitely many non-renormalizable operators in Lμ are
not zero: indeed, they are needed to incorporate the effects of the marginal/relevant
couplings once all the modes of the field between ΛUV and μ are integrated out.
In other words, using Lμ to actually calculate field theory amplitudes would require
including all the vertices for the operators On , but restricting the loop integrals to
internal propagator momenta |k| < μ—clearly an impractical procedure, and not the
way we actually proceed in renormalized perturbation theory. Instead, the process
of renormalization as outlined in the preceding sections of this chapter implicitly
amounts to computing field-theory amplitudes (order by order in perturbation theory)
successively starting the renormalization group flow at the initial point
$$g_\alpha(\Lambda_{UV}) = 0, \qquad g_a(\Lambda_{UV}) = \bar g_a(\Lambda_{UV}) \neq 0 \tag{17.126}$$
with the “bare” couplings $\bar g_a(\Lambda_{UV})$ chosen to fix the physical theory at low energy at
the desired end-point on the low-energy surface S (by imposing the renormalization
conditions), and then taking the limit $\Lambda_{UV} \to \infty$, confident that the end-point of the
renormalization group flow remains on the attractive surface S at the unique physical
point identified by fixing the particular N low-energy amplitudes that define the
renormalization scheme we have decided to employ. In the language of the renormalization
group, we can say that the huge variety of theories defined by the infinitely many
couplings $\bar g_n$ specified at the short-distance cutoff lie in a single universality class
of theories, namely those which collapse to a finite-(N)-dimensional surface at low
energies, as pictured in Fig. 17.13.
Fig. 17.13 Schematic illustration of an attractive renormalization group flow onto a finite-
dimensional (N=2) low-energy surface.
While the above argument asserts that the low-energy limit corresponds to some
set of marginal/relevant operators, and hence to a theory that is perturbatively
renormalizable by power-counting, it may be the case, depending on the field content
and the nature of the interactions, that the resultant low-energy theory is in fact
just a free theory. A classic example is the Fermi theory of weak interactions: if we
integrate out all fields down to a GeV (say), and consider only the weak interactions
of leptons and baryons (e.g., nuclear β decay), then we have only spin-1/2 fields to
consider, and there are no renormalizable interactions in four dimensions involving only
fermionic spin-1/2 fields: the four-fermion coupling term is dimension 6 and therefore
non-renormalizable.13 From the point of view of the argument just above, the
low-energy theory is free, and we simply ignore the weak interactions, as a remnant of
high-energy “new physics”, contributing at the “negligible” $O((\mu/\Lambda_{UV})^{|d_\alpha|})$ level of (17.125)
(where here μ ∼ MeVs, ΛUV ∼ the W mass, 80 GeV, and $|d_\alpha| = 2$). Of course, if we
integrate only down to 1 TeV, we are left with all the fields of the Standard Model,
including the W-, Z-, and (presumptive) Higgs bosons, which do form a perturbatively
renormalizable theory—exactly the point of the great electroweak unification of the
early 1970s.
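To put a number on the “negligible” level quoted above, one can insert representative values into the suppression factor of (17.125). The choice $\mu = 1$ MeV is an illustrative assumption for “MeVs”; the values of $\Lambda_{UV}$ and $|d_\alpha|$ are those given in the text.

```latex
\[
\left(\frac{\mu}{\Lambda_{UV}}\right)^{|d_\alpha|}
 = \left(\frac{1\ \mathrm{MeV}}{80\ \mathrm{GeV}}\right)^{2}
 = \left(1.25\times 10^{-5}\right)^{2}
 \approx 1.6\times 10^{-10}.
\]
```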
We also note here the obvious point that we are not always so fortunate to have
the couplings of the low-energy renormalizable theory—albeit a non-trivial one with
interesting interactions—sufficiently small to render perturbation theory quantita-
tively useful. In the case of QCD, we do indeed end up with a renormalizable theory
at low energy, but one with a gauge coupling constant of order unity, which means
that a complete quantitative evaluation of the low-energy amplitudes of the theory
necessarily requires explicitly non-perturbative methods, such as lattice gauge theory.
In the next chapter we shall see that QCD however possesses the remarkable property
that the running coupling decreases with increasing energy (the famous property of
“asymptotic freedom”), so that the renormalization group actually provides an escape
route allowing the perturbative calculation of certain high-energy amplitudes of the
theory. Of course, in the case of QED we are very fortunate in this regard: the low-
energy coupling of the electron to the photon provides an expansion coefficient of
order 1/137 (the fine-structure constant), so perturbation theory is (initially) rapidly
convergent, and the accuracy of the results obtained for the anomalous magnetic
moment of the electron can be used to establish the absence of new physics (in the
form of a dimension-5 operator $\bar\psi\sigma_{\mu\nu}\psi F^{\mu\nu}$) up to an energy scale of at least $10^7$ GeV
(Weinberg, 1995a).
It must once again be emphasized that the arguments given here, based as they are
on a linear stability analysis, make sense only in the context of perturbation theory,
where we are entitled to treat all couplings associated with interaction operators as
infinitesimal. Infinitesimal couplings remain so under the renormalization group flow
from ΛUV down to the low scale μ, even though the flow is non-linear. No statement
is made—nor can be made—concerning the actual existence of flows beginning at the
specified UV starting point (17.126), with finite rather than infinitesimal couplings,
and ending at some specified desired low-energy coupling strengths (say, for the
renormalized coupling $\lambda_R$ in $\phi^4$-theory). This is a global issue which requires
non-perturbative control over the theory, as one is in principle interested in situations
where the renormalization group flow is diverted into regions where weak coupling
perturbation theory is no longer valid.
13 The same problem occurs in attempts to describe quantum gravity in terms of a local field theory:
there are no renormalizable interactions of spin-2 gravitons. The leading low-energy residual of whatever
microscopic theory adequately describes quantum gravity effects, the Einstein–Hilbert Lagrangian, leads
to the appearance of infinitely many counterterms if we attempt a perturbative expansion. Indeed, the
only coupling in the theory, Newton’s constant G, has dimensions of mass$^{-2}$, the classic signature of a
non-renormalizable interaction.
In fact, it may well be the case that the limit outlined above, in which ΛUV is taken
to infinity while holding a set of N low-energy amplitudes fixed leads to a high-energy
theory which does not correspond to a physically sensible theory—for example, by
having negative terms in the effective action which render the functional integral at
the high-scale divergent. In such a case, the only scaling of the bare couplings at the
ultraviolet end which leads to a sensible low-energy theory corresponds to sending all
the interaction terms to zero—in other words, to the “trivial” result of a free field
theory. There is considerable circumstantial evidence (we shall consider some in the
next chapter) that both $\phi^4$ theory and quantum electrodynamics in four spacetime
dimensions, taken as self-standing field theories, are in fact trivial theories of this sort,
even though they are, as we have seen earlier, formally perturbatively renormalizable
to all orders of perturbation theory.14
There is no paradox here: recall (Section 11.1) that the formal perturbation
series is always only a divergent, asymptotic one. A local field theory in which a
well-defined continuum limit exists, where all ultraviolet cutoffs have been removed
in a way consistent with full Poincaré invariance while retaining the hermiticity of
the Lagrangian and the unitarity of the theory, is presumably one in which n-point
functions exist satisfying the full panoply of Wightman axioms discussed in Chapter 9.
However, it may well be the case that the perfectly well-defined all-orders expansions
of the amplitudes of a field theory do not correspond to the asymptotic expansions
of a set of Wightman functions satisfying the needed axioms. In this situation, the
perturbation expansion may still be of enormous phenomenological utility (as in the
case of QED): we must regard the relevant microscopic theory not as a continuum field
theory, but as an effective Wilsonian theory valid up to a high-energy scale beyond
which new physics comes into play, altering significantly the ultraviolet behavior. As
long as the interaction couplings at the high scale are reasonably small, we may expect
that the flow down to the low-energy scale where we are doing physical measurements
produces an attraction onto a finite-dimensional surface on which we work in the
setting of renormalization theory. In the case of the electroweak sector of the Standard
Model, the measurements are at an energy scale on the order of hundreds of GeVs, and
we are fortunate that the low-energy couplings are small here. In the case of QCD, the
low-energy couplings attract to a finite-dimensional surface where the gauge coupling
appropriate for hadronic phenomena in the sub-GeV regime is of order unity, and
renormalized perturbation theory, though formally perfectly sensible, is not useful,
and in fact, is qualitatively misleading with regard to the physics of the theory, as we
shall see in Chapter 19.
The renormalization group approach to perturbative renormalizability has some
quite striking advantages in comparison to the detailed analysis of divergence structure
presented earlier in this chapter. The decoupling of ultraviolet sensitivity is seen to
proceed by simple scaling arguments, with no reference to the complications of nested
or overlapping subdivergences in the Feynman graphs associated with the perturbative
amplitudes. There are, however, considerable disadvantages attached to this approach,
14 For a detailed description of the mathematical issues involved in establishing triviality of field theories,
see (Fernandez et al., 1992).
at least from the point of view of high-energy theory (though less so in the many
condensed-matter applications of the renormalization group). The main one arises
from the need to impose momentum cutoffs, which we have seen do violence to
the local gauge symmetry which plays a central role in all sectors of the Standard
Model. This results in enormous technical complications when repeating the proof
of renormalizability for gauge theories along the lines of Wilsonian effective theory,
as given above for scalar field theories, although the method has been successfully
applied (if painfully) even in this case (Kopper and Müller, 2009). Furthermore, the
renormalization techniques developed earlier in the chapter provide the germs for
further extensions which are indispensable in understanding the short-distance/high-
energy behavior of field theories (where here we are talking about energy scales
intermediate between the important dimensionful scales defining the theory at low
energy, and the high-energy scale at which the theory becomes invalid). Here the
notion of oversubtraction of amplitudes—a straightforward extension of the subtrac-
tion techniques of Section 17.2—becomes critical in understanding the factorization
properties of field theory amplitudes in this regime. These ideas are difficult, if not
impossible, to implement in the framework of renormalization group flow arguments
of the type given in this section, although, as we are about to see, renormalization
group ideas, appropriately reformulated for use within the framework of renormalized
perturbation theory, play an indispensable role in extracting useful information about
high-energy amplitudes once certain factorization properties of the latter have been
established.
17.5 Problems
1. Verify the inequality, valid for A, B ≥ 0 (implying (17.16), setting $A = k_0^2$, $B = \vec{k}^2 + m^2$):
$$\left|\frac{1}{A - B + i\epsilon B}\right| \le \sqrt{1 + \frac{4}{\epsilon^2}}\;\frac{1}{A + B} \tag{17.127}$$
2. Check that the result of differentiating the internally subtracted two loop graph
in Figure 17.3 three times with respect to the external momentum p is UV-finite
in dimensional regularization. One needs to show that the amplitudes resulting
from the differentiation of the basic graph, together with the inner subtractions
prescribed by the forest formula (i.e. (a), (b), and (c) in Figure 17.4), can be
rearranged into a sum of individually UV-finite terms. For example, the terms in
which the propagators carrying momentum p − l1 , p − l2 both receive derivatives
are manifestly UV-finite by Weinberg’s Theorem. Thus, one must show that the
terms in which all three derivatives are applied to a single propagator, together
with associated subtraction terms, give a UV-finite result. The UV convergence of
the third derivative implies that the pole part of the subtracted two loop diagram
must be a polynomial, at most quadratic, in the external momentum p.
3. Determine, in terms of zero-momentum amplitudes, the self-energy and vertex
counterterms responsible for renormalizing the two-loop self-energy of Fig. 17.3
in $\phi^3_6$-theory, and show that these counterterms give rise to exactly the set of
subtractions indicated in Fig. 17.4.
4. Imitating the arguments leading to (17.41), derive the formula (17.77) giving the
superficial degree of divergence of a graph with Eψ external fermion and Eφ external
scalar lines in a theory of a Dirac fermion field Yukawa-coupled to a scalar field:
$$D = 4 - \frac{3}{2}E_\psi - E_\phi \tag{17.128}$$
5. Show that the Euclidean γ matrices γ̂μ , μ = 1, 2, 3, 4 defined in (15.160) satisfy the
trace identities:
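$$\mathrm{Tr}(\hat\gamma_\mu\hat\gamma_\nu) = 4\delta_{\mu\nu} \tag{17.129}$$
$$\mathrm{Tr}(\hat\gamma_\mu\hat\gamma_\nu\hat\gamma_\rho\hat\gamma_\sigma) = 4(\delta_{\mu\nu}\delta_{\rho\sigma} - \delta_{\mu\rho}\delta_{\nu\sigma} + \delta_{\mu\sigma}\delta_{\nu\rho}) \tag{17.130}$$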
6. Evaluate, after dimensional continuation via (17.102), the photon one-loop self-
energy (17.100) integral, and show that one obtains the result displayed in (17.103).
7. Show that a counterterm of the form $\delta Z \times \frac{1}{4}(F_{\mu\nu})^2$, inserted to first order in the
momentum-space photon propagator, produces exactly the transverse tensor in
(17.104).
18
Scales III: Short-distance structure
of quantum field theory
In the preceding chapter we saw that for a certain subclass of local quantum field
theories, whose local dynamics is determined by a Lagrangian containing only a finite
number N of operators, the low-energy amplitudes of the theory, order by order in per-
turbation theory, lose their leading sensitivity to ultraviolet modifications of the theory
once reparameterized in terms of an equal number of independent low-energy quanti-
ties. We refer to such theories as “perturbatively renormalizable”, and the requirement
that the Lagrangian of such theories contain only operators of mass dimension less than
(relevant/super-renormalizable) or equal to (marginal/renormalizable) that of the
spacetime dimension is extremely restrictive, with the pleasant result that enormous
phenomenological predictivity obtains with a minimum of input. More exactly, we
have seen that in perturbatively renormalizable theories, a general Green function
M of the theory, depending on generic momenta p, masses m, and bare Lagrangian
couplings g, evaluated with ultraviolet cutoff Λ, becomes, once reparameterized in
terms of renormalized masses mR and couplings gR (which may depend on a choice
of renormalization scheme—for example, through a renormalization scale μ)
$$M(p, m, g, \Lambda) \to M_R(p, m_R, g_R, \mu) + O\!\left(\frac{p^2, m_R^2, \mu^2}{\Lambda^2}\right) \tag{18.1}$$
The preceding equation is to be interpreted as valid order by order in the formal
asymptotic expansion of both sides in powers of the subset of the gR corresponding
to interaction (higher than quadratic) vertices of the theory. For simplicity, we have
taken the couplings to be dimensionless. The power suppressed terms in (18.1) are
to be thought of as incorporating “new physics” which may be interesting in its own
right, but is not directly of interest in the calculation of the desired amplitude M. In
particular, we assume that these terms are quantitatively negligible: the masses and
momenta of particles involved in the given amplitude are much less than the energy
scale Λ at which new physics may emerge. The subtraction technology developed in
Section 17.2 was precisely fitted to the task of extracting just the parts of the full
amplitude which survive in this limit.
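The decoupling expressed by (18.1) can be illustrated with a toy integral. The following Python sketch is not taken from the text: it assumes a simplified, logarithmically divergent radial integral I(p², Λ²) = ∫ dx x/((x + m²)(x + m² + p²)), taken from 0 to Λ², which merely mimics the cutoff dependence of a one-loop graph, and all function names are our own. After a single zero-momentum subtraction the residual cutoff dependence is suppressed by p²/Λ², in the manner of the remainder term in (18.1).

```python
import math

def toy_integral(p2, m2, cutoff2):
    """Closed form of the assumed toy integral, with x running from 0 to cutoff2."""
    a, b = m2, m2 + p2
    if p2 == 0.0:
        # Degenerate case a == b, integrated separately.
        return math.log1p(cutoff2 / a) + a / (cutoff2 + a) - 1.0
    # Partial fractions: x/((x+a)(x+b)) = [b/(x+b) - a/(x+a)]/(b-a).
    return (b * math.log1p(cutoff2 / b) - a * math.log1p(cutoff2 / a)) / (b - a)

def subtracted(p2, m2, cutoff2):
    """Zero-momentum subtraction: the logarithmic cutoff dependence cancels."""
    return toy_integral(p2, m2, cutoff2) - toy_integral(0.0, m2, cutoff2)

def cutoff_free_limit(p2, m2):
    """Limit of the subtracted integral as the cutoff is removed."""
    a, b = m2, m2 + p2
    return math.log(a) + 1.0 - (b * math.log(b) - a * math.log(a)) / (b - a)

def scaled_remainder(p2, m2, cutoff2):
    """Remainder in units of p2/cutoff2; tends to 1 for large cutoff."""
    diff = subtracted(p2, m2, cutoff2) - cutoff_free_limit(p2, m2)
    return diff * cutoff2 / p2

if __name__ == "__main__":
    for cutoff2 in (1e2, 1e4, 1e6):
        print(cutoff2, scaled_remainder(2.0, 1.0, cutoff2))
```

The printed ratio approaches unity as the cutoff grows, so in this toy model the subtracted quantity differs from its cutoff-free limit by a term of order p²/Λ².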
Our subject in this chapter will be to show that the subtraction procedure used to
demonstrate (18.1) has a natural generalization to situations in which three distinct
energy scales are present: the “low” masses m and momenta of some of the particles,
a momentum scale Q ≫ p, m much greater than the remaining momenta and the
masses of the theory, and (as always) a high-energy frontier scale Λ reflecting our
Fig. 18.1 Energy scales involved in UV subtraction of amplitudes (leading behavior for large Λ).
ignorance of the ultimate microphysics underlying our field theory (see Fig. 18.1).
We shall see that it is possible in many cases to effectively repeat the procedures
leading to (18.1) to extract the leading behavior of the renormalized amplitude (with
Λ dependence now already discarded) and obtain a factorized amplitude
$$M_R(p, Q, m_R, g_R, \mu) \to \sum_i \hat{M}_{R,i}(p, m_R, g_R, \mu)\, C_i(Q, m_R, g_R, \mu) + O\!\left(\frac{p^2, m_R^2, \mu^2}{Q^2}\right) \tag{18.2}$$
where now the “small” terms (usually called “higher twist” contributions) are of the
order of inverse powers of the large momentum scale Q, which in some sense has taken
over the role previously played by the “ultimate” cutoff Λ. The result (18.2), effectively
decoupling the dependence of the full amplitude on the large and small momenta, is
the expression in momentum space of a coordinate space property of local operators
originally uncovered by Wilson (Wilson, 1969), and which has come to be known as
the “Wilson operator-product expansion”.
The proof of Wilson’s hypothetical expansion, first given by Zimmermann, simply
extends the subtraction procedure used to remove the leading Λ dependence of the
amplitudes further down, to the “large” (but not too large!) scale Q, as schematically
indicated in Fig. 18.2. The coefficient functions Ci are correspondingly termed “Wilson
coefficients”, while the set of amplitudes M̂R,i will turn out to involve insertions of
appropriately defined local composite operators. We have seen on many occasions (cf.
Section 16.5) that products of field operators contain ultraviolet divergences, and as
(18.2) no longer contains any reference to a cutoff scale Λ, it is clear that the composite
operators appearing here must come fully equipped with a prescription for subtracting
off any additional Λ-dependence which their insertion in a graph might occasion.
Our exploration of the factorization properties of amplitudes at high energy must
therefore begin, naturally enough, with a more detailed treatment of the definition
and properties of local composite operators. A full treatment of the factorization and
renormalization group properties of high-energy amplitudes would easily require a
separate (and sizeable!) book, so we must beg the reader’s indulgence in providing
merely an overview, with (in most cases) detailed proofs omitted.
Fig. 18.2 Energy scales involved in oversubtraction of amplitudes (leading behavior for large Q).
The superscript notation indicates that the Green function involves a composite
(squared) operator and two separate field operators, and we use the Γ notation
to make explicit the fact that the amplitude is taken to be 1PI, with the inverse
propagator prefactors removing the external legs carrying momentum k in and k′ out.
Accordingly, the perturbative expansion of the amplitude begins with the constant
unity, and the order λR contribution is given simply by the logarithmically divergent
one-loop graph of Fig. 16.4. Note that there is no opportunity as yet for the order $\lambda_R^2$
counterterm contained in δλ to appear to cancel the divergence, as we are working
only to order λR (associated with the vertex on the right): the special vertex on the
left associated with the insertion of the φ2 operator (henceforth labeled V ) does not, of
course, carry a factor of the coupling constant. The presence of the composite operator
φ2 (0) has evidently introduced additional UV divergences which are not taken care
of by the normal counterterm subtractions. Nevertheless, a renormalized version of
the amplitude (18.3) can be defined very simply, and to all loop orders, following
the techniques of the BPHZ subtraction scheme. In order to do this we shall make
a slight change in notation for the Taylor subtraction operator tγ associated with a
renormalization part γ of a graph (namely, a superficially divergent 1PI subgraph),
writing
$$t_\gamma \to t^{D(\gamma)} \tag{18.4}$$
where the degree function D(γ) indicates the number of terms in the Taylor expansion
around zero momentum to be included in the Taylor operator. For renormalization
graphs not containing the composite vertex V , the subtraction degree is computed as
usual
D(γ) = 4 − Eγ (18.5)
where Eγ is the number of external lines attached to γ, while for renormalization parts
containing V (such as the one-loop graph in Fig. 16.4), we define
D(γ) = δ − Eγ (18.6)
the superficial degree of divergence of all subgraphs, whether or not they contain the
special vertex V . In fact, we can think of the vertex V as a normal four-point vertex,
but with two of the external lines missing, whence the difference of 2 in the degree
functions (18.5) and (18.6) (taking δ = 2). However—and this freedom will become
a crucial ingredient in the techniques to be developed in this chapter—we may also
choose δ > 2, thereby subtracting additional finite terms from the already adequately
subtracted subintegrations associated with each γ, and obtaining an oversubtracted
but nevertheless completely UV-finite amplitude. The sum of all graphs, renormalized
according to the prescription (18.7), defines the insertion of a renormalized composite
φ2 operator of degree δ, henceforth denoted Nδ (φ2 ) (and frequently referred to as a
“Zimmermann normal product operator”)1 :
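$$\sum_\Gamma I_\Gamma^{R,\delta} \equiv \hat\Delta_F^{-1}(k)\,\hat\Delta_F^{-1}(k') \int d^4x\, d^4x'\, e^{ik\cdot x - ik'\cdot x'}\,\langle N_\delta(\phi^2(0))\,\phi(x)\,\phi(x')\rangle \tag{18.8}$$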
In the event that we choose δ = 2, the minimal value required to yield a UV-finite
amplitude, the associated composite operator, N2 (φ2 ) is called minimally subtracted.
Composite operators, such as N4 (φ2 ), containing more than the minimal number
of subtractions required to remove the singular UV-dependence, are called over-
subtracted.
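The difference between minimal subtraction (δ = 2) and oversubtraction (δ = 4) can be made concrete on a single renormalization part. The sketch below is an illustration under stated assumptions, not part of the BPHZ construction itself: it takes the momentum-dependent part of the one-loop bubble in Euclidean four-dimensional φ⁴ theory to be proportional to F(q²) = ∫ dx ln(1 + x(1 − x)q²/m²), with x running from 0 to 1 (overall sign and normalization dropped), where q is the momentum injected at the composite vertex. The function names are hypothetical. Removing only the constant Taylor term leaves a remainder of order q² at small q, whereas removing the q² term as well leaves a remainder of order q⁴.

```python
import math

def bubble_momentum_part(q2, m2, n=2000):
    """F(q^2) = int_0^1 dx ln(1 + x(1-x) q^2/m^2), by the composite Simpson rule.

    F vanishes at q^2 = 0, so it already has the constant Taylor term removed
    (the analogue of the minimal, delta = 2, subtraction).
    """
    h = 1.0 / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        x = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.log1p(x * (1.0 - x) * q2 / m2)
    return total * h / 3.0

def minimally_subtracted(q2, m2):
    """Only the constant term removed."""
    return bubble_momentum_part(q2, m2)

def oversubtracted(q2, m2):
    """Additionally remove the q^2 Taylor term; F'(0) = 1/(6 m^2)."""
    return bubble_momentum_part(q2, m2) - q2 / (6.0 * m2)

if __name__ == "__main__":
    for q2 in (1e-1, 1e-2, 1e-3):
        print(q2, minimally_subtracted(q2, 1.0) / q2, oversubtracted(q2, 1.0) / q2**2)
```

In this normalization the first ratio tends to 1/6 and the second to −1/60 as q² → 0, so the extra subtraction term is the finite constant b multiplying q², in line with the remark that oversubtraction removes only additional finite terms.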
All of the preceding may be carried out in a dimensional renormalization scheme
simply by reinterpreting the Taylor operators tD(γr ) in the forest formula as pole-part
extraction operators, as described in the previous chapter. The renormalized composite
operators (for example, Nμ (φ2 )) so defined implicitly depend on the renormalization
scale μ used to define the dimensionally continued integrals, but we lose the ability to
define oversubtracted operators in which additional momentum dependence is removed
from the renormalization parts, with inconvenient consequences for the proof of the
operator product expansion (for example). Nevertheless, as we shall see below, the
dimensionally renormalized operators can be explicitly related to linear combinations
of the more intuitive BPHZ normal product ones. As always, one is dealing with the
freedom available in choosing a particular “basis” of local operators from an
infinite set of independent ones.
The physical interpretation of the subtractions implemented in our new forest
formula (18.7) according to the degree function (18.5) is clear: these are just the
subtractions generated by the appearance of counterterms in the Lagrangian once the
theory is reparameterized in terms of a set of low-energy parameters identified through
renormalization conditions (in the present scheme, at zero momentum), as we saw
in the previous chapter. But the additional subtractions involving renormalization
parts containing the vertex V associated with the insertion of the composite φ2
operator, employing the degree function (18.6), clearly have nothing to do with
these counterterms, and we may well be concerned that they involve an unacceptable
mutilation of the composite operator, perhaps destroying important properties, such
as locality (space-like commutativity), etc.
1 The normal products defined here are to be distinguished, of course, from the “normal-ordered
products” introduced in our discussion of Wick’s theorem.
Local composite operators in field theory 667
In fact, it is not hard to see that the extra subtractions introduced in (18.7)
to render the amplitude $\Gamma^{(2,1,1)}$ UV-finite correspond simply to a multiplicative
renormalization of the composite φ2 operator, completely analogous to the previous
rescaling of the basic field operator $\phi \to \sqrt{Z}\,\phi$ in (17.45), needed to absorb singular
cutoff dependence in the two-point function of the theory. Rather than give a formal
demonstration of this statement with the forest formula, we shall illustrate the basic
point with an example.
In Fig. 18.3 we show the graphs contributing to $\Gamma^{(2,1,1)}$ in $\phi^4_4$-theory through two
loops. The corresponding subtractions induced by application of the forest formula
are indicated in Fig. 18.4, where we remind the reader that the appearance of “o”
symbols on the external legs of a subgraph indicates the application of the appropriate
Taylor zero-momentum operation to that subgraph. In the present case, this effectively
means just setting the momenta entering that subgraph to zero. The lowest-order
graph (a) in Fig. 18.4 is by definition just unity. Also, the reader will recall from the
previous chapter that the one-loop propagator correction in graph (e) is momentum-
independent in φ4 -theory, so graphs (e) and (m) in fact cancel identically. A brief
inspection shows that the indicated subtractions indeed suffice to remove divergent
UV contributions from all possible large momentum flows, as required by Weinberg’s
theorem. The graphs of Fig. 18.4 can be rearranged as indicated in Fig. 18.5. We see
that the fully subtracted amplitude factorizes into a dimensionless number, which
we shall call $Z_{\phi^2}$, independent of the external momenta k, k′, corresponding to
the contents of the parenthesis on the top line, times the contributions to $\Gamma^{(2,1,1)}$
corresponding to the amplitude obtained by inserting the bare φ2 operator into the 1-1
amplitude and carrying out all necessary counterterm subtractions (in this case, only
the one-loop vertex subtractions) arising from the reparameterization of the theory.
In other words, with a UV-cutoff Λ present to regularize the individual graphs in
Figs. 18.3–18.5, the minimally subtracted φ2 operator, giving UV-finite insertions into
1PI graphs (and hence, by LSZ, with finite matrix elements), is obtained by a cutoff-
dependent rescaling of the bare operator:
$$N_2(\phi^2(0)) = Z_{\phi^2}\!\left(\lambda_R, \frac{\Lambda^2}{m_R^2}\right)\phi^2(0) \tag{18.9}$$
This simple multiplicative relation between the renormalized normal product operator
N2 (φ2 (0)) and its bare counterpart φ2 (0) suggests that the renormalized operator
will possess, in addition to UV-finite matrix elements, the desired Lorentz scalar and
locality (space-like commutativity) properties, and indeed, these properties have been
rigorously established (see (Zimmermann, 1970), and references cited therein). We
should point out here that for certain particularly “nice” composite operators—the
most important examples being those operators corresponding to conserved Noether
currents associated with a Ward–Takahashi identity—the bare composite operator
may already have finite matrix elements, allowing us to simply take the corresponding
Z factor to be unity (see Problem 1).
More generally, defining renormalized composite operators may require a combina-
tion of bare operators, as a consequence of operator mixing. For example, in a theory
with two independent scalar fields φ, χ, with basic interaction Lagrangian
$$\mathcal{L}_{\rm int} = \lambda_1\phi^4 + \lambda_2\phi^2\chi^2 + \lambda_3\chi^4 \tag{18.10}$$
Fig. 18.5 Factorization of the fully subtracted graphs contributing to $\Gamma^{(2,1,1)}$ up to two loops.
Fig. 18.6 One-loop renormalization parts appearing in the renormalization of φ2 (0) in the theory defined by (18.10).
the minimally subtracted operators N2 (φ2 ), N2 (χ2 ) are linear combinations of the
bare φ2 and χ2 operators: thus, we have a 2×2 matrix of renormalization constants
(Zφφ , Zφχ , etc.) connecting the bare operators with the renormalized ones. The need
for including the χ2 operator in the renormalization of φ2 is apparent when one
considers the graphs of Fig. 18.6, where we see that renormalization parts arising
in the insertion of a φ2 operator induce subtractions corresponding to a local χ2
operator, as the zero momentum subtraction of the graph on the right amounts to a
lowest-order insertion of a χ2 operator.
We mentioned previously that the ability to define over-subtracted operators with
more than the minimum number of subtractions needed to ensure the UV-finiteness of
amplitudes containing these operators will be extremely important in understanding
the underlying physics of operator product expansions. In fact, such operators are
simply particular linear combinations of the minimally subtracted ones, as we shall
now see, albeit combinations with particularly useful properties.
Consider, for example, the oversubtracted operator N4 (φ2 ), where the Taylor
operator acting on renormalization parts containing the special vertex V where the
operator is inserted into the diagram contains two additional terms. Thus, the subtrac-
tion (f) of graph (b) in Fig. 18.4 contains, in addition to the constant term obtained by
evaluating the one-loop integral at zero external momentum, the linear and quadratic
terms in the Taylor expansion in external momenta of this graph. The graph evidently
is a scalar function $\Pi(q^2)$ of the momentum q = k − k′ inserted by the composite
operator, so the subtraction must take the form $a + bq^2 = a + b(k^2 + k'^2) - 2b\,k\cdot k'$,
with a (cutoff-dependent) and b (finite) constants. The extra terms contained in
the oversubtracted operator, proportional to b, are clearly just what we would get
from a lowest-order insertion of the composite operators $\phi\,\Box\,\phi$ (giving the momentum
dependence $-(k^2 + k'^2)$) and $\partial_\mu\phi\,\partial_\mu\phi$ (giving the $2k\cdot k'$ term). A little time spent
examining the effect of the oversubtraction at the next loop order shows that these
new operators containing derivatives appear only minimally subtracted when their
momentum dependence enters a renormalization part requiring subtraction. One also
finds in higher order that the oversubtractions generate a term corresponding to the
minimally subtracted N4 (φ4 ) operator (see Problem 2). The result is the famous
Zimmermann identity:
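$$N_4(\phi^2) = N_2(\phi^2) + r\,N_4(\partial_\mu\phi\,\partial_\mu\phi) + s\,N_4(\phi\,\Box\,\phi) + t\,N_4(\phi^4) \tag{18.11}$$
where the coefficients r, s, t are finite functions of the
renormalized parameters of the theory, as they must be, given that all operators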
appearing in the identity are fully renormalized. Alternatively, we may write
$$N_2(\phi^2) = N_4(\phi^2) - r\,N_4(\partial_\mu\phi\,\partial_\mu\phi) - s\,N_4(\phi\,\Box\,\phi) - t\,N_4(\phi^4) \tag{18.12}$$
The general rule is very simple: we may write a minimally subtracted operator of
degree D as a linear combination of all the independent operators of degree D + δ,
δ > 0 with which it may mix (given symmetry constraints) under renormalization.
A general proof of these Zimmermann identities involves straightforward, if lengthy,
algebraic reshuffling of the forest formula (18.7), which we shall not give here. The
interested reader is referred to the lectures of Lowenstein (Lowenstein, 1976) and
Zimmermann (Zimmermann, 1970), in which all the details are given with proper
mathematical rigor.
Higher than quadratic composite operators can be defined in a similar way to the
above: the minimally subtracted N4 (φ4 ) operator, for example, requires any renormal-
ization part γ containing the four-point vertex corresponding to the operator insertion
to be subtracted with the normal Taylor operator tD(γ) , i.e., with D(γ) = 4 − Eγ .
Moreover, these Zimmermann normal products satisfy some obvious (and convenient!)
properties with respect to spacetime-derivatives—namely:
1. Derivatives may be passed through the normal product by the simple expedient
of raising the degree of the subtractions by one for each derivative that enters
the product: e.g.,
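$$\Gamma_R^{(N)}(k_1, \ldots, k_N) = \sum_{r=0}^{\infty}\frac{(-1)^r\lambda_R^r}{r!\,(4!)^r}\int d^4z_1\, d^4z_2\cdots d^4z_r\,\langle N_4(\phi^4(z_1))\,N_4(\phi^4(z_2))\cdots N_4(\phi^4(z_r))\,\tilde\phi(k_1)\cdots\tilde\phi(k_N)\rangle_{\rm 1PI}$$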
where the graphs obtained by Wick expansion of the correlation function on the right
are to be subjected to the forest-formula subtractions corresponding to the
indicated multiple insertion of the quartic interaction operator. The latter is minimally
subtracted, and it is more or less obvious that this prescription precisely corresponds
to the BPHZ renormalization scheme described in detail in the preceding chapter.
The reader may be somewhat puzzled by the fact that the mass operator in (18.16)
appears in oversubtracted form, as a N4 (φ2 ). The reason is easily seen if we examine the
effect of a small change in the renormalized (squared) mass, or equivalently, compute
the first derivative of a renormalized 1PI amplitude with respect to m2R . The effect is
simply (with a change of sign) to double each internal propagator of the graph, as
$$\frac{\partial}{\partial m_R^2}\,\frac{1}{p^2 + m_R^2} = -\frac{1}{p^2 + m_R^2}\cdot\frac{1}{p^2 + m_R^2} \tag{18.19}$$
This doubling occurs, of course, not only in the basic unsubtracted graphs but also in
each of the subtraction terms which pop up whenever there is a divergent subgraph.
The point at which the propagator is doubled may be regarded as a new special vertex
associated with the insertion of the φ2 operator appearing in LZ,0 . The result is as
shown in Fig. 18.7 for a simple example: the 1PI four-point function at one loop,
where “X” marks the point of the φ2 insertion. Recall that only internal lines are
present and differentiated, as we are dealing with a 1PI, and therefore automatically
amputated, amplitude. It is clear that the mass derivative of this one-loop contribution
to $\Gamma_R^{(4)}$ corresponds to an insertion of the oversubtracted N4 (φ2 ) operator, as it is
subtracted at zero momentum even though the overall degree of divergence of the
2 We apologize once again to the reader for the lamentable overuse of the adjective “effective”, which
appears here now for the third time with a completely different connotation!
Fig. 18.7 Mass derivative of the four-point one-loop renormalized amplitude $\Gamma_R^{(4)}$ in $\phi^4_4$-theory.
one-loop graph with one of the propagators doubled is now –2 rather than zero, and a
minimally subtracted N2 (φ2 ) operator would by definition not require a subtraction of
an already superficially convergent subgraph containing its vertex. In general, all the
operators appearing in a Zimmermann effective Lagrangian of this type carry a degree
subscript equal to the spacetime dimension, independent of their actual engineering
dimension. This means that the operators corresponding to super-renormalizable
terms are necessarily oversubtracted.
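The counting behind this statement is easy to check numerically. The following Python sketch uses function names of our own choosing and assumes only the standard one-loop power counting D − 2n for a graph with n internal scalar propagators and non-derivative vertices in D spacetime dimensions. It verifies the propagator-doubling identity (18.19) by a central finite difference, and shows that doubling one propagator lowers the superficial degree of the one-loop four-point graph from 0 to −2.

```python
def propagator(p2, m2):
    """Euclidean scalar propagator 1/(p^2 + m^2)."""
    return 1.0 / (p2 + m2)

def mass_derivative(p2, m2, h=1e-6):
    """Central finite-difference estimate of d/dm^2 of the propagator."""
    return (propagator(p2, m2 + h) - propagator(p2, m2 - h)) / (2.0 * h)

def doubled_propagator(p2, m2):
    """Right-hand side of (18.19): minus the square of the propagator."""
    return -propagator(p2, m2) ** 2

def one_loop_degree(spacetime_dim, n_propagators):
    """Superficial degree of a one-loop graph with non-derivative vertices."""
    return spacetime_dim - 2 * n_propagators

if __name__ == "__main__":
    print(mass_derivative(3.0, 2.0), doubled_propagator(3.0, 2.0))
    # One-loop four-point graph in four dimensions: two internal propagators.
    print(one_loop_degree(4, 2), one_loop_degree(4, 3))
```

A minimally subtracted N2 (φ2 ) insertion would leave the degree −2 subgraph alone, whereas the mass derivative of the subtracted amplitude retains the zero-momentum subtraction, which is what marks the inserted operator as oversubtracted.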
We must now return to the basic theme of this chapter—the use of renormalization
techniques to study the short distance, or equivalently, large momentum behavior
of amplitudes in a renormalizable local field theory. We indicated earlier that the
concept of oversubtraction provides the key to unlocking this behavior. In particular,
we wish to consider the situation in which there is a distinct large momentum scale
Q present in the renormalized amplitudes being studied, with Q much larger than
all other dimensionful quantities (masses, super-renormalizable couplings if any, and
other momentum variables: but, of course, as the amplitudes have been renormalized,
no UV cutoff Λ).
The simplest case concerns an amplitude in which all external momenta are of
order Q. It was realized a long time ago by Symanzik (Symanzik, 1970) (and almost
simultaneously, by Callan (Callan, 1970)) that in this regime the leading contribution
to the amplitude at large Q (neglecting subdominant terms suppressed by inverse
powers of Q, cf. (18.2)) satisfies an homogeneous partial differential equation, which
in certain circumstances can be solved and used to extract the desired asymptotic
behavior. The equation in question is now referred to universally as the Callan–
Symanzik equation. We shall not follow the more involved methods used by either
Symanzik or Callan to derive this equation here, as it is an almost immediate
consequence of the Zimmermann identity discussed earlier, and the approach we use
will generalize more easily to the case of factorized amplitudes to be treated in the
following section. Also, we shall henceforth focus on the scalar $\phi^3_6$-theory introduced
in the previous chapter, for the same reasons indicated there: the topological structure
of the diagrams is essentially identical to that of a four-dimensional gauge theory, and
moreover, the structure and strength of the ultraviolet divergences are very similar
to the gauge-theory case. Thus, instead of (18.15, 18.16, 18.17), we shall be dealing
with an effective Zimmermann Euclidean Lagrangian (in six dimensions) given by the
following free and interaction parts:
$$\mathcal{L}_{Z,0} = N_6\!\left(\frac{1}{2}(\partial_\mu\phi)^2 + \frac{1}{2}m_R^2\phi^2\right) \tag{18.20}$$
$$\mathcal{L}_{Z,\rm int} = N_6\!\left(\frac{\lambda_R}{3!}\phi^3\right) \tag{18.21}$$
Note that the engineering dimension of the scalar field φ is 2 in six dimensions, so
the kinetic and interaction terms are minimally subtracted and the mass operator
oversubtracted, as usual. No linear term in the field is included, as the BPHZ zero-
momentum subtractions automatically remove all tadpoles—a process equivalent to
cancelling such graphs with an additive field shift order by order in perturbation theory
(see Fig. 17.11). The corresponding graphs in a gauge theory such as QED, in which
a photon line virtualizes into an electron–positron pair which subsequently disappears
into the vacuum, are, of course, necessarily zero by (for example) angular momentum
conservation.
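The bookkeeping of minimal versus oversubtracted operators can be summarized in a few lines. The Python sketch below is our own illustration: it assumes that the minimal subtraction degree of a monomial in the scalar field equals its engineering dimension, with the field carrying dimension (D − 2)/2 in D spacetime dimensions, and the function names are hypothetical.

```python
from fractions import Fraction

def field_dimension(spacetime_dim):
    """Engineering (mass) dimension of a scalar field in D dimensions."""
    return Fraction(spacetime_dim - 2, 2)

def operator_dimension(n_fields, n_derivatives, spacetime_dim):
    """Engineering dimension of a monomial with the given field and derivative content."""
    return n_fields * field_dimension(spacetime_dim) + n_derivatives

def classify_normal_product(degree, n_fields, n_derivatives, spacetime_dim):
    """Classify N_degree(operator) as minimally subtracted or oversubtracted."""
    dim = operator_dimension(n_fields, n_derivatives, spacetime_dim)
    if degree < dim:
        raise ValueError("degree below the engineering dimension of the operator")
    return "minimal" if degree == dim else "oversubtracted"

if __name__ == "__main__":
    # Terms of the six-dimensional Lagrangian (18.20), (18.21), all with degree 6.
    print("kinetic term:", classify_normal_product(6, 2, 2, 6))
    print("mass term:", classify_normal_product(6, 2, 0, 6))
    print("cubic interaction:", classify_normal_product(6, 3, 0, 6))
```

With these assumptions the kinetic and cubic terms come out minimally subtracted and the mass term oversubtracted, as stated above, while the same mass operator with degree 4 in six dimensions, or degree 2 in four dimensions, is minimal.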
The key to understanding the Callan–Symanzik equation lies in an important
difference in the behavior of minimally subtracted (such as the N2 (φ2 ) in $\phi^4_4$-theory
and N4 (φ2 ) in $\phi^3_6$-theory) and oversubtracted (e.g., N4 (φ2 ) in $\phi^4_4$-theory and N6 (φ2 )
in $\phi^3_6$-theory) operators when inserted into amplitudes at large (external) momentum
ki = Qk̂i , where Q is a large momentum scale and the k̂i are Euclidean momenta of
order unity. For example, the minimally subtracted φ2 operators, as we have seen,
simply introduce an additional internal propagator into the diagrams, without any
additional subtractions (see Fig. 18.8(a)). The result is to lower the superficial degree
Fig. 18.8 (a) A one-loop graph corresponding to an insertion of N4 (φ2 ) in $\Gamma_R^{(3)}$ in $\phi^3_6$-theory. (b) Result of insertion of the oversubtracted N6 (φ2 ) in the same graph.
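$$\Gamma_R^{(N)}(k_1, \ldots, k_N) = \langle\tilde\phi(k_1)\cdots\tilde\phi(k_N)\rangle_{\rm 1PI} \tag{18.22}$$
$$= \sum_{r=0}^{\infty}\frac{(-1)^r\lambda_R^r}{r!\,(3!)^r}\int d^6z_1\, d^6z_2\cdots d^6z_r\,\langle N_6(\phi^3(z_1))\,N_6(\phi^3(z_2))\cdots N_6(\phi^3(z_r))\,\tilde\phi(k_1)\cdots\tilde\phi(k_N)\rangle_{\rm 1PI} \tag{18.23}$$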
The fields on the first line are fully interacting (Heisenberg) fields (we omit the usual
“H” subscript here to avoid overburdening the notation), whereas the second line
corresponds to the interaction-picture expansion. Recall that the $\langle\cdots\rangle_{\rm 1PI}$ symbol in
the second line is to be interpreted by first Wick-expanding the operator products
inside the bracket to generate a set of bare (unsubtracted) 1PI graphs,
each of which is then subjected to the forest formula to generate the appropriate
subtractions. We now define a series of zero-momentum insertion operations on the
general renormalized N-point function $\Gamma_R^{(N)}$ as follows:
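$$\Delta_0\Gamma_R^{(N)} \equiv \int d^6z\,\langle N_4(\tfrac{1}{2}\phi^2(z))\,\tilde\phi(k_1)\cdots\tilde\phi(k_N)\rangle_{\rm 1PI} \tag{18.24}$$
$$\Delta_1\Gamma_R^{(N)} \equiv \int d^6z\,\langle N_6(\tfrac{1}{2}\phi^2(z))\,\tilde\phi(k_1)\cdots\tilde\phi(k_N)\rangle_{\rm 1PI} \tag{18.25}$$
$$\Delta_2\Gamma_R^{(N)} \equiv \int d^6z\,\langle N_6(\tfrac{1}{2}\partial_\mu\phi(z)\,\partial_\mu\phi(z))\,\tilde\phi(k_1)\cdots\tilde\phi(k_N)\rangle_{\rm 1PI} \tag{18.26}$$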
corresponding to insertions at zero-momentum (as a consequence of the d6 z integra-
tion) of the minimally subtracted mass operator, the oversubtracted mass operator,
and the minimally subtracted kinetic term operator, respectively. Finally, there is the
insertion operator for an additional minimally subtracted φ3 interaction vertex (at
zero momentum):
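$$\Delta_3\Gamma_R^{(N)} \equiv \sum_{r=0}^{\infty}\frac{(-1)^r\lambda_R^r}{r!\,(3!)^r}\int d^6z\, d^6z_1\cdots d^6z_r\,\langle N_6(\phi^3(z))\,N_6(\phi^3(z_1))\cdots N_6(\phi^3(z_r))\,\tilde\phi(k_1)\cdots\tilde\phi(k_N)\rangle_{\rm 1PI} \tag{18.27}$$
$$(m_R^2\Delta_1 + \Delta_2)\,\Gamma_R^{(N)}(k_1, \ldots, k_N) = -\frac{N}{2}\,\Gamma_R^{(N)}(k_1, \ldots, k_N) + \frac{3}{2}\,\lambda_R\frac{\partial}{\partial\lambda_R}\Gamma_R^{(N)}(k_1, \ldots, k_N) \tag{18.28}$$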
On the other hand, from (18.23), we find
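$$\frac{\partial}{\partial\lambda_R}\Gamma_R^{(N)}(k_1, \ldots, k_N) = -\frac{1}{3!}\,\Delta_3\Gamma_R^{(N)}(k_1, \ldots, k_N) \tag{18.29}$$
$$(m_R^2\Delta_1 + \Delta_2)\,\Gamma_R^{(N)}(k_1, \ldots, k_N) = -\frac{N}{2}\,\Gamma_R^{(N)}(k_1, \ldots, k_N) - \frac{\lambda_R}{4}\,\Delta_3\Gamma_R^{(N)}(k_1, \ldots, k_N) \tag{18.30}$$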
We pointed out earlier that an insertion of the oversubtracted φ2 operator is equivalent
to a mass derivative:
$$\Delta_1\Gamma_R^{(N)}(k_1, \ldots, k_N) = -\frac{\partial}{\partial m_R^2}\Gamma_R^{(N)}(k_1, \ldots, k_N) \tag{18.31}$$
Integrating over z, the pure derivative term proportional to s(λR , mR ) vanishes, and
we have, in terms of vertex insertion operators, the final relation
$$\Delta_0 = \Delta_1 + r(\lambda_R, m_R)\,\Delta_2 + \frac{1}{2}\,t(\lambda_R, m_R)\,\Delta_3 \tag{18.33}$$
which is essentially the Callan–Symanzik equation in disguised form, as we shall now
see. Note that the engineering dimension of the functions r(λR , mR ), t(λR , mR ) must
be –2 in powers of mass, so we have (as λR is dimensionless)
$$r(\lambda_R, m_R) = \frac{1}{m_R^2}\,f(\lambda_R), \qquad t(\lambda_R, m_R) = \frac{1}{m_R^2}\,g(\lambda_R) \tag{18.34}$$
where f (λR ) (resp. g(λR )) begin at order $\lambda_R^2$ (resp. $\lambda_R^3$) in perturbation theory. Combining
(18.29, 18.30, 18.31, 18.33), we find the promised Callan–Symanzik equation
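$$\left(m_R^2\frac{\partial}{\partial m_R^2} + \beta(\lambda_R)\frac{\partial}{\partial\lambda_R} - N\gamma(\lambda_R)\right)\Gamma_R^{(N)}(k_i; \lambda_R, m_R) = \frac{m_R^2}{f(\lambda_R) - 1}\,\Delta_0\Gamma_R^{(N)}(k_i; \lambda_R, m_R) \tag{18.35}$$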
where we have indicated explicitly the dependence of the N -point function on the
renormalized coupling and mass, and defined the functions
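$$\beta(\lambda_R) \equiv \frac{3}{2}\,\frac{\lambda_R f(\lambda_R) - 2g(\lambda_R)}{f(\lambda_R) - 1} \sim O(\lambda_R^3) \tag{18.36}$$
$$\gamma(\lambda_R) \equiv \frac{1}{2}\,\frac{f(\lambda_R)}{f(\lambda_R) - 1} \sim O(\lambda_R^2) \tag{18.37}$$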
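For the reader who wishes to follow the algebra, the combination of (18.29), (18.30), (18.31), (18.33) and (18.34) can be carried out as follows. This is our own worked sketch, and it assumes that the factor 1/2 in (18.33) multiplies the Δ3 term, which is the placement consistent with the coefficient functions (18.36) and (18.37). Arguments of Γ are suppressed.

```latex
\begin{align*}
% Solve (18.30) for the kinetic insertion:
\Delta_2\Gamma_R^{(N)} &= -m_R^2\,\Delta_1\Gamma_R^{(N)} - \frac{N}{2}\,\Gamma_R^{(N)}
   - \frac{\lambda_R}{4}\,\Delta_3\Gamma_R^{(N)} \\
% Insert into (18.33) with r = f/m_R^2, t = g/m_R^2:
\Delta_0\Gamma_R^{(N)} &= (1 - f)\,\Delta_1\Gamma_R^{(N)} - \frac{N f}{2 m_R^2}\,\Gamma_R^{(N)}
   + \frac{1}{2 m_R^2}\Big(g - \frac{\lambda_R f}{2}\Big)\Delta_3\Gamma_R^{(N)} \\
% Use (18.31), \Delta_1 = -\partial/\partial m_R^2, and (18.29), \Delta_3 = -3!\,\partial/\partial\lambda_R:
 &= (f - 1)\,\frac{\partial\Gamma_R^{(N)}}{\partial m_R^2} - \frac{N f}{2 m_R^2}\,\Gamma_R^{(N)}
   + \frac{3}{2 m_R^2}\,(\lambda_R f - 2g)\,\frac{\partial\Gamma_R^{(N)}}{\partial\lambda_R} \\
% Multiply by m_R^2/(f - 1):
\frac{m_R^2}{f - 1}\,\Delta_0\Gamma_R^{(N)} &= \Big(m_R^2\frac{\partial}{\partial m_R^2}
   + \frac{3}{2}\,\frac{\lambda_R f - 2g}{f - 1}\,\frac{\partial}{\partial\lambda_R}
   - \frac{N}{2}\,\frac{f}{f - 1}\Big)\Gamma_R^{(N)}
\end{align*}
```

Reading off the coefficients of the coupling derivative and of the term linear in N reproduces the functions β and γ, and since f and g begin at orders $\lambda_R^2$ and $\lambda_R^3$ respectively, β is of order $\lambda_R^3$ and γ of order $\lambda_R^2$.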
We have already seen that in the large-momentum (or short-distance) limit where
ki = Qk̂i with Q large and the k̂i non-exceptional and fixed, the insertion of the soft
mass operator effected by the vertex operation Δ0 on the right-hand side of (18.35)
suppresses the asymptotic behavior of our N -point function by two powers of Q, so
neglecting such contributions we find an homogeneous equation which must be obeyed
by the leading high-momentum contributions to the amplitude (up to inverse powers
of Q):
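$$D_{CZ}\,\Gamma_R^{(N)}(k_i; \lambda_R, m_R) \equiv \left(m_R^2\frac{\partial}{\partial m_R^2} + \beta(\lambda_R)\frac{\partial}{\partial\lambda_R} - N\gamma(\lambda_R)\right)\Gamma_R^{(N)}(k_i; \lambda_R, m_R) \approx 0 \tag{18.38}$$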
In other words, the particular combination of mass and coupling derivatives contained
in the Callan–Symanzik operator DCZ is exactly equivalent, to all orders of perturba-
tion theory, to an insertion of a soft mass operator, and must therefore suppress, by
powers, the asymptotic behavior of any N -point (Euclidean) 1PI amplitude provided
all external momenta are taken large. Eq. (18.38) as it stands is not obviously useful,
as we are hardly in a position to explore the response of physical amplitudes to a
change in the mass of the particles being scattered. However, we can translate the
dependence on mass into one on uniformly rescaled momenta for the process by simple
dimensional analysis. Let $d_N$ be the engineering dimension of $\Gamma_R^{(N)}$ in powers of mass
(thus, $d_N = 4 - N$ for $\phi^4_4$-theory, $6 - 2N$ for $\phi^3_6$-theory). The total powers of mass
and momentum in each term (the coupling λR is dimensionless) contributing to this
1PI amplitude must therefore be dN , which we may express with the usual Euler
derivative:
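\[ \left( \kappa \frac{\partial}{\partial \kappa} + m_R \frac{\partial}{\partial m_R} \right) \Gamma_R^{(N)}(\kappa k_i; \lambda_R, m_R) = d_N\, \Gamma_R^{(N)}(\kappa k_i; \lambda_R, m_R) \tag{18.39} \]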
We may therefore eliminate the mass derivative in (18.38) to obtain the asymptotic
equation
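\[ \left( -\frac{1}{2} \kappa \frac{\partial}{\partial \kappa} + \beta(\lambda_R) \frac{\partial}{\partial \lambda_R} - N\gamma(\lambda_R) + \frac{1}{2} d_N \right) \Gamma_R^{(N)}(\kappa k_i; \lambda_R, m_R) = 0 \tag{18.40} \]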
In Section 18.3 we shall return to (18.40), and show how to solve an homogeneous
partial differential equation of this type to constrain the large-momentum asymptotic
behavior of an arbitrary amplitude in a perturbatively renormalizable field theory. It
will also be seen there that a very similar equation can be derived in a completely
different way using renormalization group ideas, and the connection between the two
(involving the concept of mass singularities) will be explained.
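The step from (18.38) to (18.40) rests only on the Euler relation (18.39). As a hedged numerical illustration, the sketch below checks that relation by finite differences on a toy function of degree $d_N = 2$. The function `toy_amplitude` (a tree term plus a one-loop-like logarithm at a fixed Feynman parameter), the helper names, and the sample values are all assumptions made for this example, not quantities taken from the text.

```python
import math

def toy_amplitude(kappa, m, k2=1.0, x=0.3):
    """Toy two-point amplitude: every term carries two powers of mass (d_N = 2).

    kappa rescales the momentum, m is the mass; k2 and x are fixed sample values.
    """
    a = x * (1.0 - x) * kappa**2 * k2
    m2 = m * m
    return kappa**2 * k2 + m2 - ((a + m2) * math.log1p(a / m2) - a)

def log_derivative(f, point, index, rel_step=1e-5):
    """Central-difference estimate of point[index] * df/dpoint[index]."""
    up = list(point)
    down = list(point)
    up[index] *= 1.0 + rel_step
    down[index] *= 1.0 - rel_step
    return (f(*up) - f(*down)) / (2.0 * rel_step)

kappa, m, d_N = 2.5, 0.7, 2
value = toy_amplitude(kappa, m)
kappa_term = log_derivative(toy_amplitude, (kappa, m), 0)  # kappa d/dkappa
mass_term = log_derivative(toy_amplitude, (kappa, m), 1)   # m d/dm

# Euler relation (18.39): (kappa d/dkappa + m d/dm) F = d_N F
euler_residual = kappa_term + mass_term - d_N * value

# m^2 d/dm^2 = (1/2) m d/dm, so the mass derivative can be traded for kappa d/dkappa
mass_sq_derivative = 0.5 * (d_N * value - kappa_term)
```

Here `mass_sq_derivative` reproduces $\frac{1}{2} m\,\partial/\partial m$ acting on the toy function, which is the substitution that turns the mass derivative in (18.38) into the $-\frac{1}{2}\kappa\,\partial/\partial\kappa + \frac{1}{2}d_N$ terms of (18.40).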
The derivation of the Callan–Symanzik equation for fully amputated 1PI ampli-
tudes can be easily generalized to take care of the case when some or all of the
external propagators are present. For example, if our amplitude contains all external
legs, the number of lines in each basic graph is $\frac{1}{2}(3r + N)$ (instead of $\frac{1}{2}(3r - N)$
in the fully amputated case) for a graph containing r basic 3-vertices, and the end
result is a Callan–Symanzik operator with a change of sign in the N γ(λR ) term in
(18.38). Similarly, if the amplitude is “half-amputated”, with only half the external
legs amputated, the γ(λR ) term is absent.
The functions3 β(λR ) (the famous “β function” of renormalization group lore) and
γ(λR ) (which, for reasons to be seen later, is termed the “anomalous dimension”
of the scalar field) could be computed order by order in perturbation theory by
first determining the coefficient functions in the Zimmermann identity perturbatively
(paying very careful attention to the forest formula!), but it is more convenient to
extract them by simply applying the Callan–Symanzik equation to two independent
1PI N -point functions, in the asymptotic large momentum limit where the right-hand
side may be neglected. For example, to one-loop order, we may use the two-point 1PI
function (inverse propagator) and three-point function, by which a short calculation
(Problem 4) reveals the form:
\[ \Gamma_R^{(2)}(k) = k^2 + m_R^2 - \frac{\lambda_R^2}{128\pi^3} \int_0^1 \Big\{ \big(x(1-x)k^2 + m_R^2\big) \ln\!\big(1 + x(1-x)k^2/m_R^2\big) - x(1-x)k^2 \Big\}\, dx + O(\lambda_R^4) \tag{18.41} \]
3 Modulo a noisome factor of 1/2; see below.
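\[ \Gamma_R^{(3)}(k_1, k_2, k_3) = \lambda_R + \lambda_R^3 \Big\{ \int \frac{1}{l^2 + m_R^2}\, \frac{1}{(l + k_1)^2 + m_R^2}\, \frac{1}{(l + k_1 + k_2)^2 + m_R^2}\, \frac{d^6 l}{(2\pi)^6} - \int \frac{1}{(l^2 + m_R^2)^3}\, \frac{d^6 l}{(2\pi)^6} \Big\} + O(\lambda_R^5) \tag{18.42} \]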
Note that the self-energy term in (18.41) (given by the integral over the Feynman
parameter x) begins at order $k^4$ at small momentum, in keeping with the BPHZ
subtraction of a quadratically divergent integral. The three-point function (triangle
graph in Fig. 18.8(a), without the mass insertion) is logarithmically divergent and
therefore receives a single subtraction at zero momentum, as indicated in (18.42).
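The two properties of the self-energy term in (18.41) that matter here can be checked numerically. The sketch below assumes the subtracted Feynman-parameter integrand has the form $(a + m^2)\ln(1 + a/m^2) - a$ with $a = x(1-x)k^2$ (the $-a$ subtraction is an assumption consistent with the order-$k^4$ onset stated above), and uses a plain Simpson rule. It checks that the integral starts as $k^4/(60\,m^2)$ at small momentum, and that its $m^2\,\partial/\partial m^2$ derivative tends to $-k^2/6$ at large momentum. All function names are illustrative.

```python
import math

def subtracted_integrand(x, k2, m2):
    """(a + m^2) ln(1 + a/m^2) - a with a = x(1-x) k^2 (assumed BPHZ-subtracted form)."""
    a = x * (1.0 - x) * k2
    return (a + m2) * math.log1p(a / m2) - a

def mass_derivative_integrand(x, k2, m2):
    """m^2 d/dm^2 of the integrand above, done analytically: m^2 ln(1 + a/m^2) - a."""
    a = x * (1.0 - x) * k2
    return m2 * math.log1p(a / m2) - a

def simpson(f, n=2000):
    """Composite Simpson rule on [0, 1] with n (even) subintervals."""
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

m2 = 1.0

# Small momentum: the integral should approach k^4 / (60 m^2).
k2_small = 1e-2
small = simpson(lambda x: subtracted_integrand(x, k2_small, m2))
ratio_small = small / (k2_small**2 / (60.0 * m2))

# Large momentum: m^2 d/dm^2 of the integral should approach -k^2 / 6.
k2_large = 1e6
large = simpson(lambda x: mass_derivative_integrand(x, k2_large, m2))
ratio_large = large / k2_large
```

The factor $1/6$ is just $\int_0^1 x(1-x)\,dx$; it is the same $1/6$ that enters the large-momentum Callan–Symanzik balance for the two-point function.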
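The Callan–Symanzik operator applied to $\Gamma_R^{(2)}(k)$ produces a dominant contribution
proportional to $k^2$ at large k and order $\lambda_R^2$ with contributions from the mass derivative
and anomalous dimension term:
\[ m_R^2\, \frac{\lambda_R^2}{128\pi^3}\, \frac{1}{6}\, \frac{k^2}{m_R^2} - 2\gamma(\lambda_R) k^2 = 0 \;\Rightarrow\; \gamma(\lambda_R) = \frac{\lambda_R^2}{1536\pi^3} + O(\lambda_R^4) \tag{18.43} \]
while the same considerations applied to the three-point function $\Gamma_R^{(3)}(k_1, k_2, k_3)$ (in
this case the mass derivative of the first integral in (18.42) is asymptotically suppressed
and can be thrown away), using
\[ m_R^2 \frac{\partial}{\partial m_R^2} \int \frac{1}{(l^2 + m_R^2)^3}\, \frac{d^6 l}{(2\pi)^6} = -3 m_R^2 \int \frac{1}{(l^2 + m_R^2)^4}\, \frac{d^6 l}{(2\pi)^6} = -\frac{1}{128\pi^3} \tag{18.44} \]
imply
\[ \frac{1}{128\pi^3}\, \lambda_R^3 + \beta(\lambda_R) - 3\lambda_R \gamma(\lambda_R) = 0 \;\Rightarrow\; \beta(\lambda_R) = -\frac{3\lambda_R^3}{512\pi^3} + O(\lambda_R^5) \tag{18.45} \]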
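The arithmetic behind (18.43) and (18.45) is easy to get wrong by a factor of 2 or 3, so here is a minimal exact-fraction check. The coefficients below are the numbers multiplying $\lambda_R^2/\pi^3$ and $\lambda_R^3/\pi^3$; the variable names are illustrative.

```python
from fractions import Fraction

# Two-point function: mass-derivative term is (1/128) * (1/6) in units of lambda^2 k^2 / pi^3,
# and the balance  (1/128)(1/6) - 2 * gamma = 0  fixes the anomalous dimension coefficient.
mass_term_two_point = Fraction(1, 128) * Fraction(1, 6)
gamma_coeff = mass_term_two_point / 2

# Three-point function: (1/128) + beta - 3 * gamma = 0 in units of lambda^3 / pi^3.
mass_term_three_point = Fraction(1, 128)
beta_coeff = 3 * gamma_coeff - mass_term_three_point

print(gamma_coeff, beta_coeff)
```

The β coefficient is negative because the $-1/128$ from the vertex graph outweighs the $+3/1536 = 1/512$ contributed by the anomalous dimension term.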
The significance of the (at first sight innocent) negative sign appearing in the β
function can scarcely be overstated: it leads, as we shall see in Section 18.3, to the
critical property of asymptotic freedom, implying that the theory becomes effectively
weakly coupled at large momenta, restoring the quantitative usefulness of perturbation
theory even in theories which are (at low momenta) strongly coupled. The unphysical
$\phi^3_6$-theory serving as our toy example here shares this remarkable property with
non-abelian gauge theories generally, and with QCD in particular. The discovery and
proper interpretation in 1973 of asymptotic freedom was rewarded in 2004 by the
conferral of the Nobel Prize in Physics to Gross, Politzer, and Wilczek. But before
going on to describe the special features of asymptotically free theories in more detail,
we shall generalize our discussion of high-momentum behavior of amplitudes given
so far to situations in which, as described in the introduction to this chapter, only a
proper subset of the external momenta are large. The required generalization will lead
us directly to the Wilson operator product expansion.
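To make the role of the negative sign concrete, the following sketch integrates a running coupling with the one-loop β function of (18.45). It anticipates Section 18.3 and rests on an assumption not derived in this passage: that the characteristic equation read off from (18.40) is $d\bar\lambda/d\ln\kappa = 2\beta(\bar\lambda)$. The names `flow_rhs`, `run_coupling_rk4` and `run_coupling_exact`, and the starting value of the coupling, are purely illustrative.

```python
import math

# Assumed flow: d(lam)/d(ln kappa) = 2 * beta(lam) = -B * lam**3, with beta from (18.45).
B = 3.0 / (256.0 * math.pi**3)

def flow_rhs(lam):
    return -B * lam**3

def run_coupling_rk4(lam0, log_kappa, steps=1000):
    """Integrate the flow from ln(kappa) = 0 to log_kappa with classical RK4."""
    h = log_kappa / steps
    lam = lam0
    for _ in range(steps):
        k1 = flow_rhs(lam)
        k2 = flow_rhs(lam + 0.5 * h * k1)
        k3 = flow_rhs(lam + 0.5 * h * k2)
        k4 = flow_rhs(lam + h * k3)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam

def run_coupling_exact(lam0, log_kappa):
    """Closed form of the same flow: 1/lam^2 grows linearly in ln(kappa)."""
    return lam0 / math.sqrt(1.0 + 2.0 * B * lam0**2 * log_kappa)

lam0 = 10.0                   # illustrative starting coupling
log_kappa = math.log(1.0e6)   # momenta rescaled by a factor 10^6
lam_numeric = run_coupling_rk4(lam0, log_kappa)
lam_closed = run_coupling_exact(lam0, log_kappa)
```

Under this assumption the coupling falls only logarithmically with the momentum scale, which is the behavior described above as asymptotic freedom.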
Considerations of space prevent us from describing many of the quite beautiful
applications of the Zimmermann normal product formalism for local composite opera-
tors. Suffice it to say that the use of normal products allows one to write the Lagrangian
(Heisenberg) field equations of the theory as rigorous relations, correct to all orders
Factorizable structure of field theory amplitudes: the operator product expansion 679
4 We shall see below that the structure of the expansion is considerably altered if the local limit is
approached from the light-cone direction, with ξ 2 = 0.
We now know, from the rigorous work of Zimmermann, that the expansion is indeed
correct in renormalized perturbation theory, but that the short-distance behavior of
the coefficient functions is more complicated, involving, in general, logarithms as well
as powers of ξ. Nevertheless, up to logarithms, the leading power behavior of the
Wilson coefficient functions (in perturbation theory) is still associated in a simple
way, as we shall see below, with the engineering dimension of the associated composite
operator, in such a way that each additional power of mass dimension in the operator
corresponds to a softening of the short-distance behavior of the associated Wilson
coefficient by a power of ξ (modulo logarithms).
The connection of such an expansion—a sort of operator generalization of the
Taylor expansion (though, as typical in field theory, at best an asymptotic and not
a convergent one)—to the promised separation of large momentum behavior becomes
clear once we consider a definite matrix element of (18.46) (with x = 0) and Fourier
transform the ξ variable:
\[ T(q, k_i, k_i') \equiv \int d^4\xi\, e^{iq\cdot\xi}\, \langle k_i' | T\{A(\xi/2) A(-\xi/2)\} | k_i \rangle \tag{18.47} \]
\[ T(q, k_i, k_i') \to \sum_i \tilde{C}_i(q)\, \langle k_i' | O_i(0) | k_i \rangle, \qquad q^2 \to -\infty \tag{18.48} \]
Fig. 18.9 Momentum space amplitude $T(q, k_i, k_i')$ corresponding to the insertion of a bilocal
operator.
5 This is a Euclidean analog of the forward scattering amplitude T (k, q; k, q) discussed in Section 6.6:
we shall see later how to extend the use of the OPE to Minkowskian situations of this sort.
C̃i (q) for large Q. Thus, the leading asymptotic behavior of T (q, ki , ki ) for large Q is
determined by the leading term(s) in the expansion, in many cases, by a single operator
of minimal engineering dimension, provided, of course, that such an operator possesses
a non-vanishing matrix element between the initial and final states indicated in (18.48).
A glance at this formula also shows that the general dependence of our amplitude
on “large” q and “small” $k_i$, $k_i'$ momenta has been factorized in the expansion. This
factorization property of amplitudes is a deep consequence of the dynamics of local
field theories, and the Wilson OPE is the most direct expression thereof.
Our discussion of the Wilson operator product expansion (henceforth, OPE) will
take place entirely in the arena of momentum space: i.e., in the form given in (18.48).
There are several reasons for this. First, we are primarily interested in the behavior
of S-matrix amplitudes at high energy, which are naturally formulated directly as
momentum-space objects. But more importantly, the physical intuition underlying the
emergence of the factorization properties of amplitudes is far more easily acquired by
an examination of the behavior of large momentum flows in graphical amplitudes than
by direct consideration of the corresponding coordinate space amplitudes. Moreover,
there are generalizations of the OPE expansion (the cut vertex formalism of Mueller
(Mueller, 1981) is an example) in which non-local operators appear, and which do not
even have a natural expression in terms of the coordinate space asymptotic behavior
of amplitudes. For all these reasons, our discussion of the OPE, and more generally,
the factorization property, will be given in terms of momentum-space amplitudes.
The basic strategy underlying Zimmermann’s proof of the OPE can easily be
illustrated with a simple example. As usual, the $\phi^3_6$ theory provides a convenient
stage for displaying the central idea. We consider the case where the local field A
in (18.47) is just the canonical φ field, with a single incoming and outgoing particle
carrying momentum k. In Euclidean space, the corresponding amplitude is given by
the connected contributions to the correlation function
\[ T(q, k) = \int d^6\xi\, e^{iq\cdot\xi}\, \langle \phi(\xi/2)\, \phi(-\xi/2)\, \tilde\phi(k)\, \tilde\phi(-k) \rangle \tag{18.49} \]
with the lowest-order graph displayed in Fig. 18.10. The external propagators associated
with the fields carrying momentum ±k are assumed to be truncated (as indicated
by crossbars), but the propagators on the left, carrying momentum q, are not. The
local limit ξ → 0 of the operator product in (18.49) corresponds to integrating T(q, k)
over q, thereby obtaining a δ-function setting ξ to zero. This corresponds, of course,
to the one-loop graph indicated in Fig. 18.11(a), which is logarithmically divergent.
This simply indicates that, as we have seen above, the composite operator $\phi^2(0)$
is ultraviolet-divergent and requires renormalization. The divergence is, of course,
removed by the zero momentum subtraction indicated in Fig. 18.11(b), leading to a
finite result which we interpret as arising from the minimally subtracted composite
operator $N_4(\phi^2(0))$. This subtraction is effective in making the one-loop integral finite
(and this is the crucial point!) precisely because it removes the dominant dependence of
the graph in Fig. 18.11(a) at large momentum q. This suggests that we can introduce
an oversubtracted bilocal operator $N_4(\phi(\xi/2)\phi(-\xi/2))$, with Fourier transform given
to lowest order by the graphs indicated in Fig. 18.12, with a finite integral over q, and
therefore with the finite local limit $N_4(\phi(\xi/2)\phi(-\xi/2)) \to N_4(\phi^2(0))$, ξ → 0.
Fig. 18.11 (a) One-loop graph giving divergent contribution to $\phi^2(0)$. (b) Zero-momentum
subtraction renormalizing $\phi^2(0)$.
The term “oversubtraction” is appropriate here as the tree diagram giving the
leading order contribution to the amplitude containing the bilocal operator is already
finite and does not in that sense “need” a subtraction. However, the dominant
asymptotic behavior for large q, k fixed, of T (q, k) is closely related to exactly the extra
subtractions introduced to define this oversubtracted bilocal operator, which in the
forest language correspond to counting as renormalization parts those 1PI subgraphs
which would become divergent when the vertices ±ξ/2 associated with the bilocal
operator are pinched to a point (turning Fig. 18.12 into Fig. 18.11).
This insight, combined with clever use of forest formula techniques, allowed
Zimmermann to provide a rigorous, all-orders proof of the OPE in a very general
context. We shall sketch the proof here, but the basic steps will be translated from
the language of Zimmermann forests into explicit graphical expressions where the
structure of the subtractions, and their relation to the dominant large momentum
flows in the diagrams, will be more physically intuitive. Also, we shall restrict ourself
to the leading term in the expansion, as forest formula techniques are more or less
indispensable in handling the combinatorics of the subleading terms in the expansion.
We begin with the Euclidean case, before going on to the phenomenologically more
important light-cone expansion.
Fig. 18.12 Over-subtraction of the bilocal operator $\int d^6\xi\, e^{iq\cdot\xi} N_4(\phi(\xi/2)\phi(-\xi/2))$.
[Fig. 18.13: one-loop box graph; line momenta l, q − l, k + l.]
Our graphical demonstration of the OPE will depend on a crucial property of two-
particle-irreducible diagrams (or “kernels”: cf. Sections 10.4, 11.2). Consider first the
one-loop “box” diagram contribution to T (q, k) indicated in Fig. 18.13. Recalling that
external propagators are amputated on the right side only, this amplitude (ignoring
combinatoric and coupling factors) takes the form
\[ I_{1\,\mathrm{loop}}(q, k) = \frac{1}{(q^2 + m^2)^2} \int \frac{1}{(l^2 + m^2)^2}\, \frac{1}{(q - l)^2 + m^2}\, \frac{1}{(l + k)^2 + m^2}\, \frac{d^6 l}{(2\pi)^6} \tag{18.50} \]
For large $Q \equiv \sqrt{q \cdot q}$, the integral (ignoring the external propagator factors) is UV-finite
and of order $1/Q^2$, as we can see by examining the contributions of the possible
regions of large momentum (of order Q) flow through the diagram. In particular, we
have:
1. The region where the loop momentum is large, i.e., $l_\mu \sim Q$, with phase-space
volume $Q^6$ and integrand of order $1/Q^8$.
2. The region in which the large momentum q flows in and out of the diagram
entirely through the propagator carrying momentum q − l, corresponding to loop
momentum $l_\mu \sim k_\mu, m \ll Q$. This region also contributes asymptotic behavior
$1/(q - l)^2 \sim 1/Q^2$.
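The two regions just listed can be tallied mechanically. The sketch below is a naive exponent count for the box integrand of (18.50): each internal propagator that is far off-shell in a region costs two powers of Q per power of the propagator, and a hard loop momentum contributes $Q^6$ of phase space. The table `BOX_PROPAGATORS` and the function name are illustrative, and the external propagator factors are ignored, as in the text.

```python
LOOP_DIMENSION = 6  # phi^3 theory in six dimensions

# (label, power of the propagator, carries momentum of order Q even when l is soft?)
BOX_PROPAGATORS = [
    ("l^2 + m^2", 2, False),
    ("(q - l)^2 + m^2", 1, True),
    ("(l + k)^2 + m^2", 1, False),
]

def region_exponent(propagators, loop_hard):
    """Power of Q contributed by a region: phase-space volume minus off-shell propagators."""
    volume = LOOP_DIMENSION if loop_hard else 0
    suppression = sum(
        2 * power
        for _, power, hard_when_soft in propagators
        if loop_hard or hard_when_soft
    )
    return volume - suppression

hard_region = region_exponent(BOX_PROPAGATORS, loop_hard=True)   # region 1
soft_region = region_exponent(BOX_PROPAGATORS, loop_hard=False)  # region 2
print(hard_region, soft_region)
```

Both regions come out at the same power, which is why neither can be neglected in the box graph.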
The presence of the second region implies that a zero-momentum subtraction I(q, k) −
I(q, 0) does not reduce the asymptotic behavior. On the other hand, in the corresponding
over-subtracted graph, $I_{1\,\mathrm{loop}}(q, k) - I_{1\,\mathrm{loop}}(q, 0)$, it is easy to see that the
contribution of the first region, where large momentum permeates the entire graph,
is suppressed to at least order $1/Q^3$, as the effect of the small incoming momentum
k is subdominant once all the internal propagators of the loop are far off-shell (of
order $Q^2$). The reader may easily verify this assertion explicitly by constructing the
subtracted integrand and subjecting it to the simple power-counting analysis along the
lines just followed above. The presence of a dominant contribution in regions where a
subset of lines remain soft (low momentum) is clearly connected to the two-particle
reducibility of our box diagram: the large momentum Q is afforded a rapid exit route
from the diagram on the left, via the single left-most internal propagator. On the other
hand, a two-particle-irreducible (2PI) diagram such as the one indicated in Fig. 18.14,
while still of order $1/Q^2$ for large Q, receives its entire dominant contribution from the
region of large momentum flow through the entire diagram: i.e., $l_\mu \sim Q$. The Feynman
integral in this case is
\[ K_{1\,\mathrm{loop}}(q, k) = \frac{1}{(q^2 + m^2)^2} \int \frac{1}{l^2 + m^2}\, \frac{1}{(q - l)^2 + m^2} \cdot \frac{1}{(l + k)^2 + m^2}\, \frac{1}{(k + l - q)^2 + m^2}\, \frac{d^6 l}{(2\pi)^6} \tag{18.51} \]
and we see immediately that the contribution of the second region $l_\mu \sim k_\mu, m$ to the
integral is of order $1/Q^4$. The reason is simply that in the 2PI case the large momentum
q entering at the bottom left is forced to flow through at least two internal lines of
the graph in order to exit the graph on the upper left-hand side.6 This property
generalizes to an arbitrary multi-loop 2PI contribution to T (q, k), so defining the sum
of all such 2PI graphs as K(q, k), we conclude that the zero-momentum subtraction
K(q, k) − K(q, 0) softens the asymptotic behavior by at least a power of Q, along the
same lines as discussed above for the large-momentum region of the box diagram. This
property is all that we shall need below to show that the oversubtractions introduced
to define the bilocal operator $N_4(\phi(\xi/2)\phi(-\xi/2))$ are just those needed to obtain a
factorized expression for the leading asymptotic behavior.
[Fig. 18.14: one-loop 2PI graph; line momenta l, q − l, k + l, k + l − q.]
6 One of the corollaries of Weinberg’s power-counting theorem discussed in Section 17.1 provides a
rigorous estimate for the asymptotic behavior of any convergent Feynman integral in terms of exactly the
minimal routing argument given here, so the reader may be assured that the asserted behavior is on a very
solid footing.
[Fig. 18.15: ladder expansion T(q, k) = K + K K + K K K + ... in terms of 2PI kernels K.]
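The minimal routing argument can be phrased as a shortest-path problem: with the loop momentum soft, the large momentum q must travel from the vertex where it enters to the vertex where it leaves, and each internal line it crosses costs $1/Q^2$. The sketch below encodes the box and the 2PI crossed graph as small edge lists and finds the shortest route by breadth-first search. The vertex names and the edge lists are assumptions read off from the momentum labels in (18.50) and (18.51); this is an illustration of the counting, not a general power-counting theorem.

```python
from collections import deque

def min_lines_crossed(edges, source, target):
    """Breadth-first search for the fewest internal lines joining two vertices."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        if vertex == target:
            return distance[vertex]
        for neighbor in adjacency.get(vertex, []):
            if neighbor not in distance:
                distance[neighbor] = distance[vertex] + 1
                queue.append(neighbor)
    raise ValueError("no route between the two vertices")

# Box graph: lines q-l, l, l+k, l (assumed topology).
BOX_EDGES = [("q_in", "q_out"), ("q_in", "k_in"), ("k_in", "k_out"), ("k_out", "q_out")]

# Crossed (2PI) graph: lines l, q-l, k+l, k+l-q (assumed topology).
CROSSED_BOX_EDGES = [("q_in", "k_in"), ("q_in", "k_out"), ("k_in", "q_out"), ("k_out", "q_out")]

def soft_region_exponent(edges):
    """Power of Q from the soft-loop region: two powers per line the large momentum crosses."""
    return -2 * min_lines_crossed(edges, "q_in", "q_out")

print(soft_region_exponent(BOX_EDGES), soft_region_exponent(CROSSED_BOX_EDGES))
```

The box offers a one-line exit route, whereas in the 2PI graph the large momentum has to cross two lines, which is the source of the extra suppression of the soft region.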
The introduction of two-particle irreducible kernels simplifies the description of the
oversubtraction procedure needed to exhibit the emergence of an operator product
expansion, by simplifying the graphical structure of the 1PI contributions to the
amplitudes $T(q, k_i, k_i')$ of Fig. 18.9. In particular the 1PI contributions to these
amplitudes7 may be expressed as a sum of ladder graphs in which 2PI kernels are
iterated, as shown in Fig. 18.15 (for the 2-2 case T (q, k)). Each 2PI kernel in Fig.
18.15 is itself a sum of infinitely many graphs, of which a few of the lowest-order ones
are shown in Fig. 18.16.
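The ladder structure T = K + K K + K K K + ... is a Neumann (geometric) series in the kernel. As a toy sketch only, one may pretend that the loop momenta joining adjacent kernels have been discretized, so that K becomes a small matrix and iteration becomes matrix multiplication; the series then sums to $K(1 - K)^{-1}$ whenever it converges. The 2×2 matrix below is an arbitrary illustrative choice, not a kernel of any field theory, and the intermediate propagators are assumed to be absorbed into K.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matadd(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def ladder_partial_sum(kernel, n_rungs):
    """K + K K + ... with n_rungs kernels in the longest term."""
    term = kernel
    total = kernel
    for _ in range(n_rungs - 1):
        term = matmul(term, kernel)
        total = matadd(total, term)
    return total

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def ladder_closed_form(kernel):
    """K (1 - K)^(-1) for a 2x2 toy kernel."""
    one_minus_k = [[(1.0 if i == j else 0.0) - kernel[i][j] for j in range(2)] for i in range(2)]
    return matmul(kernel, inverse_2x2(one_minus_k))

toy_kernel = [[0.2, 0.1], [0.05, 0.3]]  # arbitrary small entries so the series converges
summed = ladder_partial_sum(toy_kernel, 80)
closed = ladder_closed_form(toy_kernel)
```

The point of the toy is only the bookkeeping: every graph of the amplitude appears exactly once, classified by the number of kernels it contains.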
It should be emphasized that Figs. 18.15 and 18.16 are skeleton graphs: each
propagator line actually represents the full renormalized scalar propagator, including
all possible self-energy corrections, with their associated BPHZ subtractions, and each
vertex where three propagator lines meet at a point a full 1PI three-point vertex
function including all BPHZ subtractions from any renormalization parts it may
contain. In this way, we ensure that the set of ladder graphs in Fig. 18.15 indeed
contains all the graphs making up the fully renormalized 1PI amplitude T (q, k).
Another critical point here is one which we encountered earlier in our treatment
of perturbative renormalization in Chapter 17: the subtractions induced by the
counterterms of the theory produce UV-finite amplitudes with a dependence on the
Fig. 18.16 Low-order skeleton graphs contributing to the 2PI kernel K(q, k).
7 We note here that in a $\phi^3_6$ theory there are also one-particle reducible graphs contributing to $T(q, k_i, k_i')$,
as the absence of a discrete φ → −φ symmetry allows the large momentum q to flow through a subgraph
connected only by a zero-momentum propagator to the part containing the small momenta $k_i$, $k_i'$. This
means that among the operators $O_i$ appearing on the right-hand side of the OPE (18.48) in this theory
is the scalar field φ itself. We shall ignore these graphs, which do not occur in the QCD/QED analogs of
$\phi^3_6$-theory, as an amplitude in which two photons carry large momentum q in and out of a graph cannot be
connected by a single gluon (or photon) to the rest of the diagram, by Lorentz-invariance. Thus, we shall
only consider the 1PI contributions to the $T(q, k_i, k_i')$ amplitudes in the following.
[Fig. 18.17: panels (a)–(f), skeleton graphs built from 2PI kernels K together with their subtraction terms; the subtracted sum is ∼ O(1/Q³).]
the external vertices x and y in Fig. 18.15 together to construct an insertion of the
composite minimally subtracted $N_4(\phi^2)$, which is finite precisely because the resultant
loop integral over q has asymptotic behavior $\int (1/q^4 \cdot 1/q^3)\, d^6 q$ and is therefore UV-finite.
The reader should verify this by explicitly constructing the forests appearing in
the renormalized amplitudes for the $N_4(\phi^2)$ operator (see Problem 5).
If we examine the subtraction terms appearing in Fig. 18.17 closely, which we now
realize incorporate exactly the leading asymptotic behavior of T (q, k), a remarkable
property emerges: they factorize algebraically into functions of the large momentum
q and the remaining “small” momentum k. Indeed, transferring the subtraction terms
to the right-hand side, we obtain the graphical equation indicated in Fig. 18.18, where
the dependence on q is isolated in the set of graphs in the first parenthesis (which
sum simply to T (q, 0)), while the second parenthesis contains exactly the graphs
contributing to the insertion of the minimally subtracted composite N4 (φ2 ) operator
in the 1-1 matrix element. In terms of Euclidean correlation functions, reinserting the
external legs carrying momentum q on the left, we have the asymptotic result
where the Wilson coefficient function $\tilde{C}_{\phi^2}(q)$ is given in this case simply by setting
the small momenta to zero in the full 1PI amplitude, and is clearly of order $1/Q^2$ for
large Q. This result generalizes straightforwardly to amplitudes with more than two
low-momentum fields,
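\[ \langle \tilde\phi(q) \tilde\phi(-q) \tilde\phi(k_1) \tilde\phi(k_2) \cdots \tilde\phi(k_n) \rangle_{1PI} \sim \tilde{C}_{\phi^2}(q)\, \langle N_4(\phi^2(0))\, \tilde\phi(k_1) \tilde\phi(k_2) \cdots \tilde\phi(k_n) \rangle_{1PI} + O(1/Q^3) \tag{18.53} \]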
Indeed, in this case, the ladder expansion of the amplitude T (q, k1 , k2 , ..., kn ) ter-
minates on the right with a 2PI kernel with two incoming lines on the left and
n > 2 (small momentum) lines on the right. The reader may easily verify that such a
kernel is automatically suppressed (to order $1/Q^4$) if large momentum flows through
it. The subtraction terms needed for the oversubtraction are therefore just the ones
discussed above for the 2PI 2-2 kernels to the left of this final 2-n kernel, and the
reader may easily verify (Problem 6) that the factorization obtained is precisely as
[Fig. 18.18: graphical equation expressing T(q, k), up to O(1/Q³), as the product of a q-dependent series of kernel graphs and a series of graphs containing the vertex V.]
Fig. 18.19 Factorized structure of the two kernel contribution to $D_{CZ} T(q, 0)$ at large Q.
to that amplitude. We note first that applying DCZ (with N = 4) to the first term in
the skeleton expansion (see Fig. 18.15) of T (q, 0) (consisting of a single 2PI kernel)
automatically produces an asymptotically suppressed amplitude: the insertion of a soft
$\phi^2$ vertex produces an extra internal propagator (with no additional subtractions) in
an amplitude which receives its dominant asymptotic contribution from the regime in
which all internal lines are far off-shell (of order $Q^2$). Moreover, in all higher terms
in the skeleton expansion, the mass insertion must avoid the left-most kernel, as the
large momentum cannot avoid flowing at least through this part of the graph.
Thus, in the term with two kernels, the application of DCZ leads to the factorization
indicated in Fig. 18.19. We use the “hat” (or “caret”) symbol to indicate the part of the
graph containing the mass insertion, with the specification that the two propagators
on the left of a kernel also receive mass insertions if the hat symbol is attached to
that kernel.8 Once a kernel receives the mass insertion, the momentum flowing in and
out on the left is forced to be small, as momenta of order Q would again lead to
an asymptotically suppressed contribution, by the previous argument. The standard
OPE derived earlier may therefore be applied to the part of the graph to the left of the
mass-inserted kernel (in Fig. 18.19, the 2PI kernel on the left), which thereupon loses
its dependence on the small momentum connecting it to the right side of the graph.
The result is that the mass-inserted kernel sees a momentum-independent amplitude
to its left, leading to the appearance of a local vertex V , as indicated in Fig. 18.19.
The same reasoning applied to the contribution with three 2PI kernels leads to the
factorization indicated in Fig. 18.20, as the reader will confirm with a little thought.
(Here, the OPE factorization of the two kernel subgraph on the left of the final graph
on the top line is allowed by the fact that the loop momentum l connecting it to the
mass-inserted kernel must be soft.)
Putting these results together, we arrive at the factorization indicated in Fig. 18.21
for the amplitude obtained by applying the Callan–Symanzik operator
$D_{CZ} \equiv m_R^2 \frac{\partial}{\partial m_R^2} + \beta(\lambda_R) \frac{\partial}{\partial \lambda_R} - 4\gamma(\lambda_R)$
to the coefficient function $\tilde{C}_{\phi^2}(q)$. As usual, the vertex
V indicates the point of insertion of a minimally subtracted $N_4(\phi^2)$ operator. We see
that, up to terms suppressed by $1/Q^2$, the coefficient function satisfies an homogeneous
Callan–Symanzik equation, with an additional term corresponding to $\tilde{C}_{\phi^2}(q)$ (the
graphs on the top line) multiplied by the momentum-independent series of graphs on
the second line. This term is (a) dimensionless and (b) only a function $\gamma_{\phi^2,CZ}$ of the
8 For these terms, one takes N = 0 in $D_{CZ}$, as the graph is only “half amputated”; see the discussion
following (18.38).
Fig. 18.20 Factorized structure of the three-kernel contribution to $D_{CZ} T(q, 0)$ at large Q.
Fig. 18.21 Callan–Symanzik equation for coefficient function $\tilde{C}_{\phi^2}(q)$, in graphical form.
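\[ D_{CZ}\, C_{\phi^2}(q; m_R, \lambda_R) = \left( m_R^2 \frac{\partial}{\partial m_R^2} + \beta(\lambda_R) \frac{\partial}{\partial \lambda_R} - 4\gamma(\lambda_R) \right) C_{\phi^2}(q; m_R, \lambda_R) \approx \gamma_{\phi^2,CZ}(\lambda_R)\, C_{\phi^2}(q; m_R, \lambda_R) \tag{18.54} \]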
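[Figure residue: electron scattering $e^-(p) \to e^-(p')$ with momentum transfer $q = p - p'$ onto a hadron of momentum k, producing final states n through the matrix element $\langle n|\tilde{J}_{em,had}(q)|k\rangle$.]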
summing over all possible final states, subject to energy-momentum conservation, and
is therefore proportional to the tensor9
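\[ \sum_n (2\pi)^4 \delta^4(P_n - k - q)\, \langle k | J^\mu_{em,had}(0) | n \rangle \langle n | J^\nu_{em,had}(0) | k \rangle = \mathrm{Im}\Big\{ i \int d^4 x\, e^{iq\cdot x}\, \langle k | T\{ J^\mu_{em,had}(x) J^\nu_{em,had}(0) \} | k \rangle \Big\} \tag{18.55} \]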
where the second line follows by standard manipulations along the lines used to
establish the Källén–Lehmann spectral representation in Section 9.5 (see Problem 8).
This result (basically the optical theorem of scattering theory, relating a total
cross-section to the imaginary part of a forward scattering amplitude) allows us to
concentrate our attention on the 2-2 amplitude on the second line, where a large
space-like momentum q is inserted and then extracted on the left of the diagram,
with the momentum k kept fixed (and eventually, sent on-mass-shell for the incoming
and outgoing hadron). We saw in Section 6.6 that a forward scattering amplitude
like (18.55) can also be written as the Fourier transform of a retarded commutator
$\theta(x_0)[J^\mu_{em,had}(x), J^\nu_{em,had}(0)]$ (cf. (6.126–6.129)), so by locality, the integral over x is
restricted to the forward light-cone. We shall now see that in a certain kinematic
limit, the coordinate displacement x of the two current operators can be forced onto
the light-cone, i.e., to the value x2 = 0 (in Minkowski space), and that the product
of operators can again be expanded, with a leading set of operators (with associated
coefficient functions) providing the dominant asymptotic contribution.
Let us consider the Bjorken limit, in which k · q and $q^2$ are large (i.e., $\gg k^2, m^2$)
and comparable, with the ratio fixed:
\[ \omega \equiv \frac{1}{x} \equiv -\frac{2 k \cdot q}{q^2} \quad \text{fixed} \tag{18.56} \]
We may automatically realize the Bjorken limit by choosing a convenient Lorentz
frame. For a general Minkowski four-momentum p,10 define light-cone coordinates
$p_\pm \equiv \frac{1}{\sqrt{2}}(p_0 \pm p_3)$, $\vec{p} = (p_1, p_2)$, in terms of which the invariant dot-product takes the
form
\[ p \cdot q = p_+ q_- + p_- q_+ - \vec{p} \cdot \vec{q} \tag{18.57} \]
Note that the vector symbol here applies only to the two (or, in six spacetime dimensions,
four) transverse dimensions orthogonal to the preferred spatial z-direction. We
shall work in a frame in which $\vec{q} = 0$, $q_+ \sim Q^2/m_R$, and $q_- < 0$ with $|q_-| \sim m_R$. For $k_\mu \sim m_R$
fixed, in the Bjorken limit,
\[ \omega = -\frac{2(k_- q_+ + k_+ q_-)}{2 q_- q_+} \sim -\frac{k_-}{q_-} \tag{18.58} \]
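The light-cone decomposition (18.57) and the limiting form (18.58) can be checked numerically in four dimensions. The sketch below verifies that the light-cone form of the dot product agrees with the usual Minkowski one, then builds a frame of the kind described ($\vec{q} = 0$, $q_+ = Q^2/m$, $q_- = -m/2$, so that $q^2 = -Q^2$) and compares ω with $-k_-/q_-$. The numerical values of the components of k, and the specific choice $q_- = -m/2$, are assumptions made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def light_cone(p):
    """(p0, p1, p2, p3) -> (p_plus, p_minus, transverse pair)."""
    p0, p1, p2, p3 = p
    return (p0 + p3) / SQRT2, (p0 - p3) / SQRT2, (p1, p2)

def from_light_cone(p_plus, p_minus, transverse=(0.0, 0.0)):
    """Inverse of light_cone."""
    return ((p_plus + p_minus) / SQRT2, transverse[0], transverse[1], (p_plus - p_minus) / SQRT2)

def dot_metric(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def dot_light_cone(p, q):
    """The same product written as in (18.57)."""
    pp, pm, pt = light_cone(p)
    qp, qm, qt = light_cone(q)
    return pp * qm + pm * qp - (pt[0] * qt[0] + pt[1] * qt[1])

def bjorken_omega(k, q):
    """omega = -2 k.q / q^2, as in (18.56)."""
    return -2.0 * dot_metric(k, q) / dot_metric(q, q)

# Consistency of the two forms of the dot product on generic vectors.
a = (1.3, 0.2, -0.7, 0.5)
b = (2.1, -0.4, 0.9, -1.1)
dot_mismatch = abs(dot_metric(a, b) - dot_light_cone(a, b))

# Bjorken frame (illustrative numbers, in units where m = 1).
m, Q = 1.0, 1.0e3
q = from_light_cone(Q**2 / m, -m / 2.0)                 # q^2 = 2 q_+ q_- = -Q^2
k = from_light_cone(0.7 * m, 0.4 * m, (0.3 * m, -0.2 * m))
k_minus, q_minus = light_cone(k)[1], light_cone(q)[1]
omega = bjorken_omega(k, q)
omega_limit = -k_minus / q_minus
```

The difference between `omega` and `omega_limit` is $-k_+/q_+$, of relative order $m^2/Q^2$, which is the sense of the "∼" in (18.58).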
9 For a fuller discussion, including the kinematic factors glossed over here, see Section 13.4, (Itzykson
and Zuber, 1980).
10 In six dimensions we likewise take $p_\pm \equiv \frac{1}{\sqrt{2}}(p_0 \pm p_5)$, $\vec{p} = (p_1, p_2, p_3, p_4)$.
Now, by taking $q_+$ large, we force the Fourier transform to extract the dominant
dependence of the retarded commutator of currents for $x_-$ small (as the exponent
is $q \cdot x = q_+ x_- + \ldots$). However, locality restricts us to the interior of the forward
light-cone, $2 x_+ x_- > \vec{x}^{\,2}$, so $x_- \to 0$ forces also $\vec{x} \to 0$, and the Bjorken limit naturally
probes the region $x^2 \to 0$: i.e., the light-cone singularities of the operator product.
Fortunately, as in the Euclidean case, where the structure of the amplitude simplifies
(via factorization) in the Euclidean limit x2 → 0 (⇒ x → 0), the forward amplitude in
(18.55) also displays a factorized structure in the Bjorken limit in Minkowski space. In
this case, however, the leading contribution involves an infinite “tower” of operators
and coefficient functions—not surprisingly, as the light-cone limit involves a surface,
rather than a single point.
To expose the essential ideas, while avoiding the (not inconsiderable!) complications
of spin and local gauge symmetry with which we would have to contend in QCD,
we shall sketch the factorization procedure in our old standby, $\phi^3_6$-theory. Thus,
instead of the second line of (18.55), we consider exactly our previous amplitude
T(q, k) (essentially, the Fourier transform of $\langle k|T\{\phi(x)\phi(0)\}|k\rangle$) of Fig. 18.15, but
now in Minkowski space, and in the Bjorken limit (18.56). The factorization will be
demonstrated for the full amplitude: the imaginary part can then be taken at the end,
to obtain the desired inclusive cross-section.
We begin, as before, with the lowest-order tree diagram contributing to T (q, k), as
indicated in Fig. 18.23(a). The large momentum q flows through a single propagator
and the (fully amputated) graph is therefore (suppressing the ubiquitous $i\epsilon$ terms)
proportional to
\[ \frac{1}{(q + k)^2 - m_R^2} = \frac{1}{q^2 + 2k_- q_+ + 2k_+ q_- + k^2 - m_R^2} \tag{18.59} \]
In the Bjorken limit, the terms $2k_+ q_-$ and $k^2$ are of order $m_R^2$, suppressed by two
powers of the large scale Q relative to the terms $q^2$ and $2k_- q_+$, which are both of order
$Q^2$. This means that the leading asymptotic dependence of graph (a) on Q is unaltered
if we set $k_+$ and $\vec{k}$ to zero (the latter meaning, in six spacetime dimensions, $k_1 =
k_2 = k_3 = k_4 = 0$). We shall indicate that the + and transverse vector components
of an external momentum entering a subgraph have been set to zero by once again
appending the “o” symbol to the corresponding leg, and also define, for a general
momentum $p^\mu = (p_+, p_-, \vec{p})$, the projected momentum $\hat{p}^\mu = (0, p_-, \vec{0})$. Accordingly,
the subtraction effected by graph Fig. 18.23(b), with propagator
\[ \frac{1}{(q + \hat{k})^2 - m_R^2} = \frac{1}{q^2 + 2k_- q_+ - m_R^2} \tag{18.60} \]
[Fig. 18.23: (a) tree graph with internal momentum k + q; (b) subtraction graph with k replaced by $\hat{k}$ (internal momentum $\hat{k} + q$).]
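\[ = \int \frac{1}{(l^2 - m_R^2 + i\epsilon)^2}\, \frac{(l^2 - 2q_- l_+)}{((q - l)^2 - m_R^2 + i\epsilon)((q - \hat{l})^2 - m_R^2 + i\epsilon)} \cdot \frac{(2k_+ l_- + k^2)}{((k + l)^2 - m_R^2 + i\epsilon)((\hat{k} + l)^2 - m_R^2 + i\epsilon)}\, \frac{d^6 l}{(2\pi)^6} \tag{18.61} \]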
Fig. 18.24 Oversubtraction of a one-loop box graph contribution to T(q, k) in the Bjorken
limit.
We now need to estimate the leading asymptotic behavior of this rather formidable
expression at large Q. We no longer have the Weinberg theorem, and its corollaries,
at our disposal, as we are in Minkowski space. In particular, denominators (such as
$(q - \hat{l})^2 - m_R^2$) with only linear dependence on loop momentum components appear,
a circumstance completely alien to the Euclidean space analysis. Nevertheless, the
indicated subtractions do indeed do their job, and end up reducing the asymptotic
dependence by two powers of Q, as desired. A “physicist’s” proof of this assertion is
easily obtained by a straightforward scaling analysis, and the general result is confirmed
by extensive computational experience in perturbative field theory, although a general
power-counting theorem in Minkowski space of the scope and power of Weinberg’s
theorem for the Euclidean case has, to the author’s knowledge, never been established.
Let us therefore proceed directly, by examining the contribution to (18.61) from a
region of phase-space corresponding to arbitrary power scalings of the loop momentum
components:
\[
l_+ \sim Q^{\alpha}, \quad l_- \sim Q^{\beta}, \quad \mathbf{l} \sim Q^{\gamma}, \quad \alpha, \beta, \gamma > 0 \tag{18.62}
\]
The volume of loop phase-space corresponding to this region evidently scales like
$Q^{\alpha+\beta+4\gamma}$. By examining the scaling under (18.62) of each of the numerator and
denominator terms in (18.61), we find that the subtracted amplitude receives a
contribution of order $Q^{P(\alpha,\beta,\gamma)}$, with the power given by
\[
P(\alpha, \beta, \gamma) \le -4 \tag{18.64}
\]
so we may reasonably conclude that the subtractions in graphs (b), (c), and (d) have
indeed succeeded in suppressing by two powers of Q the leading 1/Q2 dependence of
the box diagram Fig. 18.24(a). The subtractions are effective because, in analogy to
the Euclidean case, in the region where the loop momentum l is soft (lμ ∼ kμ , mR ),
corresponding to the large momentum flowing entirely through the left-most vertical
line, the leading asymptotic dependence on Q cancels separately between graphs (a)
and (c), and between (b) and (d); whereas, in the region of l “hard” (this means
$l_+ \sim q_+ \sim Q^2$, $\mathbf{l} \sim Q$, $l_-$ fixed) where all lines are far off-shell, the cancellation occurs
between graphs (a) and (b), and between (c) and (d), as the reader may easily check,
using power-scaling arguments along the lines of (18.62–18.64).
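The power counting described above lends itself to a quick numerical check. The following is a minimal sketch under stated assumptions, not a reproduction of the explicit formula (18.63): we assume q_+ ∼ Q^2, q^2 ∼ Q^2, and q_-, all components of k, and m_R of order Q^0, and we assign to each sum in (18.61) the power of its largest term, ignoring possible cancellations. The function names are hypothetical.

```python
from itertools import product


def power(alpha, beta, gamma):
    """Naive leading power of Q for the integrand of (18.61) times the volume.

    Assumed scalings: l_+ ~ Q^alpha, l_- ~ Q^beta, l_perp ~ Q^gamma (18.62),
    q_+ ~ Q^2, q^2 ~ Q^2, and q_-, k, m_R ~ Q^0.
    """
    l_sq = max(alpha + beta, 2 * gamma)      # l^2 = 2 l_+ l_- - l_perp^2
    volume = alpha + beta + 4 * gamma        # phase-space volume of the region
    numerators = (
        max(l_sq, alpha)                     # l^2 - 2 q_- l_+
        + max(beta, 0.0)                     # 2 k_+ l_- + k^2
    )
    denominators = (
        2 * max(l_sq, 0.0)                   # (l^2 - m^2)^2
        + max(2 + beta, alpha, l_sq)         # (q - l)^2 - m^2
        + (2 + beta)                         # (q - l_hat)^2 - m^2
        + max(alpha, beta, gamma, l_sq, 0.0) # (k + l)^2 - m^2
        + max(l_sq, alpha, 0.0)              # (k_hat + l)^2 - m^2
    )
    return volume + numerators - denominators


def worst_case(n_max=12, step=0.25):
    """Largest power found on a grid of exponents 0, step, ..., n_max * step."""
    grid = [i * step for i in range(n_max + 1)]
    return max(power(a, b, c) for a, b, c in product(grid, repeat=3))


if __name__ == "__main__":
    print("soft region :", power(0.0, 0.0, 0.0))
    print("hard region :", power(2.0, 0.0, 1.0))
    print("grid maximum:", worst_case())
```

Within this naive scheme both the soft region and the hard region (l_+ ∼ Q^2, transverse l ∼ Q, l_- fixed) saturate the bound, and no point of the grid exceeds it.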
The preceding examples suggest that we may proceed exactly as in the Euclidean
case to introduce an oversubtracted skeleton expansion for the Minkowski amplitude
T (q, k): the topological structure of these subtractions is exactly the same as in
the Euclidean case, the only difference being that the subtraction point is at a
projected light-like momentum, rather than at zero momentum. The result is that the
leading asymptotic behavior factorizes as indicated graphically in Fig. 18.25 (replacing
Fig. 18.18). This result may be written explicitly as
\[
T(q,k) \sim \int T(q,\hat{l})\,(\hat{\Delta}_F(l))^2\,T(l,k)\,\frac{d^6 l}{(2\pi)^6}, \quad Q \to \infty,\ \omega\ \text{fixed} \tag{18.65}
\]
[Fig. 18.25: graphical form of the factorization, T(q, k) ∼ T(q, l̂) · V · T(l, k) for Q → ∞ with ω fixed.]
One can show (see Problem 10) that $\Gamma(\omega, Q^2, k^2)$ is analytic in the cut plane of ω with
cuts running from −∞ to −1 and from 1 to +∞. In the light-like case ($\hat{l}^2 = 0$), we
can therefore expand
\[
\Gamma(\hat{\omega}, Q^2, 0) = \sum_{n=0}^{\infty} \hat{\omega}^n C_n(Q^2) \tag{18.67}
\]
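\[
\Gamma(\omega, Q^2, k^2) \sim \sum_n C_n(Q^2) \int T(l,k)\,(\hat{\Delta}_F(l))^2\,(\hat{\omega})^n\,\frac{d^6 l}{(2\pi)^6} \tag{18.68}
\]
\[
= \sum_n \omega^n C_n(Q^2) \int T(l,k)\,(\hat{\Delta}_F(l))^2 \left(\frac{l_-}{k_-}\right)^n \frac{d^6 l}{(2\pi)^6} \tag{18.69}
\]
\[
= \sum_n \omega^n C_n(Q^2)\,\langle k|N_4(\phi(0)(\frac{i\overleftrightarrow{\partial}_-}{2k_-})^n \phi(0))|k\rangle \equiv \sum_n \omega^n v_n C_n(Q^2) \tag{18.70}
\]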
with $v_n \equiv \langle k|N_4(\phi(0)(\frac{i\overleftrightarrow{\partial}_-}{2k_-})^n \phi(0))|k\rangle$ the matrix elements of a tower of renormalized
composite operators incorporating the low-energy physics of the process.11
Factors of loop momentum l− appearing at the vertex V in Fig. 18.25 (arising from
expanding the T (q, ˆl) amplitude on the left) have been converted to the corresponding
spatial derivatives appearing in minimally subtracted composite operators, of which
there are clearly an infinite number contributing at leading order. Note that these oper-
ators receive only a single subtraction to remove a logarithmic divergence, irrespective
of the power n of the loop component l− present in the graph (see Problem 11). The
result (18.70) is called the light-cone expansion of the amplitude T (q, k) ≡ Γ(ω, Q2 ).
It shows that the leading asymptotic behavior of the amplitude in the Bjorken limit
is given by an infinite set of factorized terms involving the product of coefficient
functions depending only on the large scale Q, and matrix elements of renormalized
composite operators. In the leading term, all operators of minimum twist (defined as
11 Note that if the initial and final states are taken as light-like elementary scalars, with k → k̂, the matrix
elements are unity, vn = 1. This is just the renormalization condition for the composite operators equating
the matrix element at the special subtraction point to its lowest-order value, as higher loop corrections
vanish if taken at the subtraction point, where they are subtracted.
the engineering dimension of the operator minus its spin, the latter given in this case
by the number of spacetime-derivatives ∂− ) appear. The subtraction degree of the
operator is determined in a light-cone expansion not by the engineering dimension, as
in the Euclidean case, but by the twist: thus, all the operators $N_4(\phi(0)(\frac{i\overleftrightarrow{\partial}_-}{2k_-})^n \phi(0))$
appearing in (18.70) have twist 4 (note: the factors of k− are not included in the
dimension) and appear at the same, leading twist, level in the OPE. Operators of
higher twist will contribute to the amplitude at levels suppressed by powers of Q.
In practice, one extracts individual terms in the infinite sum in (18.70) by taking
moments with respect to the variable $x \equiv 1/\omega$ of $\mathrm{Im}(\Gamma(\omega, Q^2))$, which is directly related
to the inclusive cross-section for the deep inelastic scattering, as discussed above (see
Problem 12).
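The extraction of individual coefficients by taking moments can be illustrated on a toy function. The sketch below is not a field-theory calculation: it uses a hypothetical stand-in, Γ(ω) = −ln(1 − ω^2), which is even in ω, has cuts along |ω| > 1, and has the known Taylor coefficients 2/n for even n. It then evaluates the moment formula quoted in Problem 12, (18.109), taking the imaginary part just above the cut (a sign convention assumed here).

```python
import cmath
import math


def toy_gamma(omega):
    """Hypothetical stand-in amplitude: even in omega, cuts on |omega| > 1.

    Its Taylor series is sum_{m>=1} omega^(2m) / m, so the coefficient of
    omega^n is 2/n for even n >= 2 and 0 for odd n.
    """
    return -cmath.log(1 - omega * omega)


def moment_coefficient(n, steps=2000, eps=1e-12):
    """(1 + (-1)^n)/pi * int_0^1 x^(n-1) Im Gamma(1/x) dx, by the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        omega = complex(1.0 / x, eps)   # sit just above the right-hand cut
        total += x ** (n - 1) * toy_gamma(omega).imag * h
    return (1 + (-1) ** n) / math.pi * total


if __name__ == "__main__":
    for n in (2, 3, 4, 6):
        print(n, moment_coefficient(n))
```

For this toy function the even moments agree with the Taylor coefficients 2/n to the accuracy of the quadrature, and the odd moments vanish through the prefactor 1 + (−1)^n.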
The discussion of factorization for deep inelastic amplitudes in QCD follows
completely analogous lines to the argument for $\phi^3_6$-theory given here. In this case
(see, for example, the review of Mueller (Mueller, 1981)), the leading contributions
are given by a tower of operators of twist 2: namely, the quark composite operators
\[
O_n \equiv N_2(\bar{\psi}\gamma_- (\frac{i\overleftrightarrow{D}_-}{2k_-})^n \psi) \tag{18.71}
\]
where Dμ is the gauge-covariant derivative (15.103) for the quark field ψ. In a general
covariant gauge, these composite operators involve both quark and gauge fields, but
the analysis simplifies, and becomes (modulo spin complications) extremely similar
to the $\phi^3_6$ case if we choose a light-cone gauge in which $A_-(x) = 0$, in which case the
gauge-covariant derivatives may be replaced by ordinary ones, D− → ∂− .
Just as in the Euclidean case, the asymptotic behavior of the coefficient functions
Cn (Q2 ) is determined by a Callan–Symanzik equation. The derivation of this equation
follows exactly the lines of the space-like factorization: one applies a soft mass insertion,
via the Callan–Symanzik operator DCZ , to the large momentum amplitude T (q, ˆl),
which is then refactorized. One then obtains, in analogy to (18.54), for the asymptotic
behavior of Cn (Q2 ) (as always, up to power suppressed terms),
where $\gamma_n$, in analogy to the Euclidean case, is given by a single (soft) mass insertion on
the 1-1 matrix element of $N_4(\phi(0)(\frac{i\overleftrightarrow{\partial}_-}{2k_-})^n \phi(0))$ (see Problem 11), and is a dimensionless
function of $\lambda_R$. In the case of asymptotically free theories such as QCD (or $\phi^3_6$), (18.72)
can be used to reduce the determination of the leading asymptotic behavior in Q to
perturbative information, as we shall explain in the next section. In particular, the
asymptotic behavior of individual moments of the inclusive amplitude can be explicitly
computed and compared with experiment.12
12 For more details on all of this, the reader is referred to the original literature: (Gross and Wilczek,
1974a), (Gross and Wilczek, 1974b). The general approach to factorization outlined here is covered in great
detail in the review of Mueller (Mueller, 1981); see also (Buras, 1980).
698 Scales III: Short-distance structure of quantum field theory
13 As in Section 17.4, we have chosen to label all the coefficients with a single symbol here, not
distinguishing between terms with different numbers of spacetime-derivatives.
Renormalization group equations for renormalized amplitudes 699
at some conventionally chosen low cutoff, and ignoring as usual variations in this
surface of order inverse powers of the much higher UV cutoff), then the renormalized
parameters $m_R$, $\lambda_R$ (in $\phi^4_4$-theory, say) must also vary with μ to keep the physics fixed.
We shall illustrate the derivation of the renormalization group equation for renor-
malized amplitudes taking self-coupled scalar field theory renormalized at Euclidean
scale μ as our starting point (the reader may imagine our theory to be $\phi^4_4$ although
an essentially identical argument holds for $\phi^3_6$ theory). The analysis of perturbative
renormalization in Section 17.2 makes it clear that the renormalized 1PI Euclidean
N-point function $\Gamma_R^{(N)}$ in such a theory is related to the corresponding 1PI function
$\Gamma^{(N)}$ computed by using the bare parameters and field in (17.42) with a UV cutoff
Λ by (a) rescaling the bare field $\phi \to \sqrt{\hat{Z}}\,\phi_R$, and (b) reparameterizing the resultant
amplitude in terms of renormalized mass and coupling parameters defined at the
Euclidean subtraction point. Thus, the upshot of our demonstration of perturbative
renormalizability is the relation14
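\[
\Gamma_R^{(N)}(k_i; \tilde{\lambda}_R, \tilde{m}_R, \tilde{\mu}) = \hat{Z}(\tilde{\lambda}_R, \tilde{m}_R, \tilde{\mu}, \Lambda)^{-N/2}\,\Gamma^{(N)}(k_i; \lambda, m, \Lambda) \tag{18.75}
\]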
In other words, the 1PI Green functions of our theory transform covariantly (by
multiplicative rescaling) under reparameterizations corresponding to an alteration of
the subtraction scale μ. The absence of a dependence on the UV cutoff Λ in the
14 The negative power of $\hat{Z}$ here arises as a consequence of the need to divide the basic N-point Green
function $G^{(N)}$ by N full propagators in order to arrive at the fully amputated $\Gamma^{(N)}$: each such propagator
gives a factor of $\hat{Z}$ on rescaling, converting the $\sqrt{\hat{Z}}$ associated with each of the N fields in $G^{(N)}$ to a $1/\sqrt{\hat{Z}}$.
renormalized amplitudes, of course, implies the same for the rescaling factor F in
(18.76) and (18.77). The finite transformation expressed in (18.77) can clearly be
viewed as an invertible element of a one-parameter continuous Lie group in which
successive transformations of renormalization scale satisfy an obvious composition
rule. This one parameter group is all that remains of the vastly more complicated
renormalization group flow embodied in the equations (18.73), which themselves lead
to the collapse to the low-energy attractor surface corresponding to perturbative
renormalizability, as we saw in Section 17.4. Nevertheless, this equation—or rather,
its infinitesimal Lie algebra version—once combined with information on the mass
singularities of the amplitudes, will lead us back to the same powerful constraints
on the Green functions of the theory derived previously in the form of the Callan–
Symanzik equation.
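The desired infinitesimal version of (18.77) is readily obtained: we simply keep
$\tilde{\mu}$, $\tilde{\lambda}_R$ and $\tilde{m}_R$ fixed while applying $\mu\frac{\partial}{\partial\mu}$. As a result, $\lambda_R$ and $m_R$ must also be allowed
to vary, and, with the understanding that everywhere
\[
\mu\frac{\partial}{\partial\mu} \equiv \left.\mu\frac{\partial}{\partial\mu}\right|_{\tilde{\lambda}_R, \tilde{m}_R, \tilde{\mu}} \tag{18.78}
\]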
we obtain
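\[
\left(\mu\frac{\partial}{\partial\mu}F^{-N/2}\right)\Gamma_R^{(N)} + F^{-N/2}\left(\mu\frac{\partial}{\partial\mu} + \mu\frac{\partial\lambda_R}{\partial\mu}\frac{\partial}{\partial\lambda_R} + \mu\frac{\partial m_R}{\partial\mu}\frac{\partial}{\partial m_R}\right)\Gamma_R^{(N)} = 0 \tag{18.79}
\]
\[
\left(\mu\frac{\partial}{\partial\mu} + \mu\frac{\partial\lambda_R}{\partial\mu}\frac{\partial}{\partial\lambda_R} + \mu\frac{\partial m_R}{\partial\mu}\frac{\partial}{\partial m_R}\right)\Gamma_R^{(N)} = \frac{1}{2}N\left(\mu\frac{\partial}{\partial\mu}\ln F\right)\Gamma_R^{(N)} \tag{18.80}
\]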
After taking the partial derivatives, we may set μ̃ = μ, m̃R = mR , λ̃R = λR and define
the dimensionless functions
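\[
\beta(\lambda_R, \frac{m_R}{\mu}) \equiv \mu\frac{\partial\lambda_R}{\partial\mu} \tag{18.81}
\]
\[
\gamma_m(\lambda_R, \frac{m_R}{\mu}) \equiv \frac{1}{m_R}\,\mu\frac{\partial m_R}{\partial\mu} \tag{18.82}
\]
\[
\gamma(\lambda_R, \frac{m_R}{\mu}) \equiv \mu\frac{\partial \ln F}{\partial\mu} \tag{18.83}
\]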
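\[
\left\{\mu\frac{\partial}{\partial\mu} + \beta(\lambda_R, \frac{m_R}{\mu})\frac{\partial}{\partial\lambda_R} + \gamma_m(\lambda_R, \frac{m_R}{\mu})\,m_R\frac{\partial}{\partial m_R} - N\gamma(\lambda_R, \frac{m_R}{\mu})\right\}
\Gamma_R^{(N)}(k_i; \lambda_R, m_R, \mu) = 0 \tag{18.84}
\]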
We emphasize that this equation is exact (having taken the UV cutoff Λ of the theory
to infinity, of course): we have so far not considered any simplifications arising in an
asymptotic regime. An equation of exactly the same form holds if we use dimensional
renormalization, where the parameter μ is introduced via (17.102), with the additional
simplification that the renormalization group functions $\beta$, $\gamma_m$, $\gamma$ lose their mass
dependence (on mR ) and are therefore only functions of the dimensionless coupling
λR . These functions are, moreover, to be regarded as quantum effects: the dependence
on the subtraction scale μ appears only in loop diagrams requiring subtractions—in
other words, in contributions to the amplitude containing non-zero powers of Planck’s
constant. In particular, the tree diagrams of the theory are independent of μ.
At this point, a superficial resemblance of (18.84) to the asymptotic version of the
Callan–Symanzik equation (18.38) should already be apparent. Recall that the latter
equation applies in the event that the external momentum set ki is non-exceptional
(no non-trivial subset of the Euclidean momenta ki summing to zero), and that this
is also the condition for the absence of mass singularities of the amplitude in the
zero mass limit. In particular, all Lorentz-invariant dot-products ki · kj are non-zero
(and uniformly large, say of order Q2 , with Q a large momentum scale, if we consider
the asymptotic regime as previously for the Callan–Symanzik case), and Weinberg’s
theorem then assures us that the graphs contributing to $\Gamma_R^{(N)}$ receive their dominant
contribution from regions in which all internal propagators are far off-shell, with
denominators of order Q2 , and therefore with a sensitivity to the mass of order m2R /Q2 .
Neglecting the mass sensitivity, and setting mR to zero, we arrive at the approximate
asymptotic equation, valid to inverse powers of the large momentum scale,
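\[
\left\{\mu\frac{\partial}{\partial\mu} + \beta(\lambda_R)\frac{\partial}{\partial\lambda_R} - N\gamma(\lambda_R)\right\}\Gamma_R^{(N)}(k_i; \lambda_R, \mu) = 0 \tag{18.85}
\]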
which is now formally identical to (18.38). (Note, however, that the renormalization
scale μ in this zero mass theory now plays the role of the BPHZ renormalized mass
mR in the Callan–Symanzik equation.) The dependence on the renormalized mass
(now set to zero) in $\Gamma_R^{(N)}(k_i; \lambda_R, \mu)$ and the renormalization group functions $\beta(\lambda_R)$
and γ(λR ) (which therefore also lose their dependence on μ) has been omitted, and
we have an equation which can be made useful, in precise analogy to the steps leading
from (18.38) to (18.40) by trading in the derivative with respect to renormalization
scale μ (which describes a physically inaccessible dependence of the amplitudes) for
one implementing a uniform rescaling of the external momenta, via the dimensional
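equation ($d_N$ is the engineering dimension of $\Gamma_R^{(N)}$ in powers of mass)
\[
\left(\kappa\frac{\partial}{\partial\kappa} + \mu\frac{\partial}{\partial\mu}\right)\Gamma_R^{(N)}(\kappa k_i; \lambda_R, \mu) = d_N\,\Gamma_R^{(N)}(\kappa k_i; \lambda_R, \mu) \tag{18.86}
\]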
Using (18.86) to eliminate the μ derivative in (18.85), we find
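\[
\left(-\kappa\frac{\partial}{\partial\kappa} + \beta(\lambda_R)\frac{\partial}{\partial\lambda_R} - N\gamma(\lambda_R) + d_N\right)\Gamma_R^{(N)}(\kappa k_i; \lambda_R, \mu) = 0 \tag{18.87}
\]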
This equation is identical in form (modulo a redefinition of the functions β(λR ), γ(λR )
by a factor of 2) with the Callan–Symanzik equation (18.40). At the tree level,
The extraction of the large momentum behavior of the amplitudes (i.e., for κ large)
is therefore transferred to a knowledge of the κ-dependence of the running coupling
λeff (κ), from which we may determine the scaling factor z(κ), provided that we are
able to determine the dependence of the full renormalized 1PI amplitude $\Gamma_R^{(N)}$ on the
coupling (now replaced by its running counterpart). In a general theory where the
coupling(s) may be large, rendering perturbation theory inapplicable, this solution
is not particularly helpful. But in an important subclass of cases, the asymptotic
behavior is determined by the weak-coupling regime of the theory, and we are able to
make rigorous statements about the large momentum properties of the amplitudes.
Suppose that for some renormalized coupling λ̄, and for all 0 < λR < λ̄, we have
β(λR ) < 0. Then it is apparent from (18.89) that with a physical renormalized coupling
λR < λ̄, the running coupling λeff (κ) is monotone decreasing for κ > 1, and indeed,
we have that λeff (κ) → 0, κ → ∞. Such a theory is said to be asymptotically free. The
leading term in the perturbative expansion of the β function is necessarily negative
in an asymptotically free theory, to enforce negativity of the β function at arbitrarily
small couplings. We have already encountered an example in the $\phi^3_6$-theory discussed
previously, with a β function given to lowest order (including the factor of two in
accordance with the new definition (18.81)) by
\[
\beta(\lambda_R) = -\beta_0\lambda_R^3 + O(\lambda_R^5), \quad \beta_0 = \frac{3}{256\pi^3} \tag{18.93}
\]
Once κ is sufficiently large, the effective coupling will become sufficiently small that
the β function is dominated by its leading term, so that the extreme large-momentum
(or short-distance) asymptotics of the theory is determined by solving the defining
equation (18.89), keeping only the leading term in (18.93). It is more convenient to
solve for the squared effective coupling $\lambda_{\rm eff}^2$, which satisfies, at this leading order
\[
\kappa\frac{\partial}{\partial\kappa}\lambda_{\rm eff}^2(\kappa) = -2\beta_0\lambda_{\rm eff}^4(\kappa) \tag{18.94}
\]
the solution to which is
\[
\lambda_{\rm eff}^2(\kappa) \sim \frac{\lambda_R^2}{1 + 2\beta_0\lambda_R^2\ln(\kappa)}, \quad \kappa \to \infty \tag{18.95}
\]
The running coupling in $\phi^3_6$-theory therefore falls off logarithmically with the momentum
rescaling variable: in effect, free field behavior is restored in the amplitude
$\Gamma_R^{(N)}(\kappa k_i; \lambda_R, \mu)$ when all momenta are taken large, though admittedly very slowly.
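The slow logarithmic decay of the running coupling can be confirmed by integrating the leading-order flow directly. The sketch below assumes that the defining equation for the effective coupling has the form κ ∂λ_eff/∂κ = β(λ_eff) with λ_eff(1) = λ_R, which is consistent with (18.94); it keeps only the leading term of (18.93) and compares a classical fourth-order Runge-Kutta integration with the closed form (18.95). The function names and the value λ_R = 1 are illustrative choices.

```python
import math

BETA0 = 3.0 / (256.0 * math.pi ** 3)   # lowest-order coefficient from (18.93)


def closed_form(lam_r, t):
    """lambda_eff at t = ln(kappa), from the square root of (18.95)."""
    return lam_r / math.sqrt(1.0 + 2.0 * BETA0 * lam_r ** 2 * t)


def rk4_flow(lam_r, t_max, steps=1000):
    """Integrate d(lambda)/dt = -BETA0 * lambda^3 from t = 0 to t_max (RK4)."""
    def beta(lam):
        return -BETA0 * lam ** 3

    h = t_max / steps
    lam = lam_r
    for _ in range(steps):
        k1 = beta(lam)
        k2 = beta(lam + 0.5 * h * k1)
        k3 = beta(lam + 0.5 * h * k2)
        k4 = beta(lam + h * k3)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam


if __name__ == "__main__":
    for t in (1.0, 10.0, 100.0):
        print(t, rk4_flow(1.0, t), closed_form(1.0, t))
```

Because the coefficient β_0 is numerically small, even t = ln κ = 100 reduces the coupling only modestly, which makes the phrase "admittedly very slowly" quantitative for this illustrative starting value.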
Exactly the same behavior obtains in QCD, with renormalized coupling gR and
lowest-order β function (for a SU(N ) theory with Nq quark fields in the fundamental
representation)
\[
\beta(g_R) = -\beta_0 g_R^3 + O(g_R^5), \quad \beta_0 = \frac{1}{16\pi^2}\left(\frac{11}{3}N - \frac{2}{3}N_q\right) \tag{18.96}
\]
The extraordinary progress made in the last three-and-a-half decades in bringing the
high-energy behavior of many strong interaction amplitudes under analytic control is
entirely dependent on this very fortunate property of the theory, first uncovered by
Gross, Politzer, and Wilczek, and for which they received the 2004 Nobel Prize in
Physics. Considerations of space preclude our delving further into this fascinating and
hugely important area of modern particle physics.15
15 Classic QCD applications of the renormalization group control of high-energy processes, such as the
inclusive electron–positron annihilation cross-section to hadrons, and deep-inelastic scattering can be found
in the reviews of Mueller and Buras cited earlier, as well as in any number of modern texts on Standard
Model field theory.
Asymptotically free theories in four spacetime dimensions are rather rare: indeed,
in the class of perturbatively renormalizable theories, the only field theories with this
property are gauge theories with a non-abelian gauge group, and with not too many
matter fields charged under the gauge group (e.g., in (18.96), we must have $N_q < \frac{11}{2}N$).
Our other field-theoretic workhorse, $\phi^4_4$-theory, variants of which are clearly present in
the Standard Model in the event that the Higgs particle turns out to be an elementary
scalar, is not asymptotically free. To lowest (one-loop) order, one finds (see Problem 14)
\[
\beta(\lambda_R) = +\beta_0\lambda_R^2 + O(\lambda_R^3), \quad \beta_0 = \frac{3}{16\pi^2} \tag{18.97}
\]
and a running coupling
\[
\lambda_{\rm eff}(\kappa) \sim \frac{\lambda_R}{1 - \beta_0\lambda_R\ln(\kappa)} \tag{18.98}
\]
which evidently runs into a singularity (the famous “Landau pole”16 ) at a finite value
of κ, beyond which the effective coupling changes sign, apparently destabilizing the
theory. Of course, we are no longer entitled to rely on the perturbative one-loop form
of the β function once the effective coupling becomes large, as it certainly does once
the singularity is approached. Non-perturbative studies have provided considerable
evidence for the hypothesis that this situation should nevertheless be interpreted
as signaling the “triviality” of φ4 theory (cf. the discussion of global aspects of the
renormalization group flow in Section 17.4): the removal of the UV cutoff of the
theory, corresponding to prescribing a well-defined Wilsonian effective Lagrangian at
arbitrarily high-energy scales, necessarily implies the vanishing of the renormalized
coupling λR defined at any fixed low-energy scale (see, for example, (Fernandez et al.,
1992), for a detailed treatment of the mathematical issues surrounding triviality).
Nevertheless, given, as emphasized on many previous occasions, the unavoidable
presence of a physical cutoff at sufficiently high energy, there is absolutely no reason
to reject $\phi^4_4$ theory as a perfectly adequate low-energy effective field theory whose
amplitudes are accurately computable (if we are lucky enough to have a sufficiently
small λR ) by the standard technology of perturbative renormalization theory.
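To get a feeling for where the one-loop singularity sits, one can evaluate (18.98) directly. The following sketch simply tabulates the value of ln κ at which the denominator of (18.98) vanishes, using the coefficient β_0 from (18.97); the function names and the sample couplings are illustrative, and the one-loop formula is of course not trustworthy near the pole itself.

```python
import math

BETA0 = 3.0 / (16.0 * math.pi ** 2)   # one-loop coefficient from (18.97)


def lam_eff(lam_r, t):
    """One-loop running coupling (18.98) at t = ln(kappa)."""
    return lam_r / (1.0 - BETA0 * lam_r * t)


def log_landau_pole(lam_r):
    """Value of ln(kappa) at which the denominator of (18.98) vanishes."""
    return 1.0 / (BETA0 * lam_r)


if __name__ == "__main__":
    for lam_r in (0.1, 0.5, 1.0):
        print(lam_r, log_landau_pole(lam_r))
```

For a small renormalized coupling the pole lies at an exponentially large rescaling factor, which is why the perturbative low-energy theory remains usable far below any physical cutoff.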
An alternative scenario to triviality for four-dimensional non-asymptotically-free
theories has been the subject of much interest: the presence of a non-trivial ultraviolet
fixed point at which the theory recovers an exact scale (even conformal) invariance.
Consider a four-dimensional scalar theory where the β function is positive at small
coupling, β(λ) > 0, 0 < λ < λ∗ , but with a zero at some positive coupling value
β(λ∗ ) = 0, as indicated in Fig. 18.26. We also assume that the particular physical
theory of interest has a renormalized coupling λR < λ∗ , which serves as the starting
point, λeff (κ = 1) = λR , for the renormalization group flow of the effective coupling
defined in (18.89). As the β function is positive, the effective coupling increases
monotonically with κ, with the rate of growth slowing as λeff (κ) approaches the
16 Quantum electrodynamics, and abelian gauge theories generally, exhibit a similar structure, with
a positive β function at small coupling: the associated singular behavior, in the context of the photon
propagator, was first pointed out in the 1950s by Landau.
value λ∗ , which therefore acts as a fixed point of the flow: λeff (κ) → λ∗ , κ → ∞. From
(18.90), the scaling factor z(κ) therefore behaves asymptotically as
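\[
z(\kappa) \sim \exp\left(A + \gamma(\lambda_*)\int^{\kappa}\frac{d\kappa}{\kappa}\right) \sim C\kappa^{\gamma(\lambda_*)}, \quad \kappa \to \infty \tag{18.99}
\]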
which means that the scaling behavior $\kappa^{4-N}$ of the tree amplitudes of the massless
theory (cf. (18.88)), which follows from the fact that the N scalar fields in the associated
Green function each have engineering dimension 1 (in four dimensions), together
with naive dimensional analysis, is altered by quantum loop effects to the scaling
$\kappa^{4-N(1+\gamma(\lambda_*))}$. We interpret this by saying that the interactions have induced an
anomalous scale dimension γ(λ∗ ) for the underlying scalar field, but that scaling by
fixed powers (rather than fractional powers of logarithms, as in the asymptotically free
case) is restored at the fixed point. In fact, we saw earlier in our discussion of the trace
anomaly (cf. (15.201)), the trace of the energy momentum tensor, which acts as the
divergence of the would-be conserved current of scale transformations (cf. (12.118)),
only vanishes in a massless interacting theory if the β function does so: i.e., at
precisely the fixed point indicated in Fig. 18.26. The situation described here does not
appear to arise in any of the field theories comprising the Standard Model of particle
interactions in four dimensions, but conformally invariant interacting field theories
in two dimensions have been the subject of intense scrutiny, partly as a consequence
of their close connection to aspects of string theory. Moreover, the renormalization
group treatment of critical phenomena in condensed-matter theory—specifically, the
scaling behavior of thermodynamic quantities at a second-order transition—is based
precisely on the existence of an infrared fixed point in scalar field theories which can
be shown to describe the long-distance behavior of spin models near such transitions.
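The approach to a non-trivial fixed point, and the resulting power-law behavior of the scaling factor, can be illustrated with a toy model. In the sketch below the renormalization group functions β(λ) = b λ^2 (1 − λ/λ_*) and γ(λ) = c λ^2 are hypothetical choices made only so that β is positive below a zero at λ_*. We further assume the flow equation κ ∂λ_eff/∂κ = β(λ_eff) with λ_eff(1) = λ_R, and ln z(κ) = ∫ γ(λ_eff) dκ/κ, consistent with (18.99).

```python
def flow_to_fixed_point(lam_r, t_max, steps, lam_star=2.0, b=1.0, c=0.05):
    """Integrate (lambda_eff, ln z) in t = ln(kappa) with classical RK4.

    Toy (hypothetical) renormalization group functions:
      beta(lam)  = b * lam^2 * (1 - lam / lam_star)   # positive below lam_star
      gamma(lam) = c * lam^2
    Returns the pair (lambda_eff(t_max), ln z(t_max)).
    """
    def rhs(state):
        lam = state[0]
        return (b * lam * lam * (1.0 - lam / lam_star), c * lam * lam)

    def shifted(state, slope, factor):
        return tuple(s + factor * k for s, k in zip(state, slope))

    h = t_max / steps
    state = (lam_r, 0.0)
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, 0.5 * h))
        k3 = rhs(shifted(state, k2, 0.5 * h))
        k4 = rhs(shifted(state, k3, h))
        state = tuple(
            state[i] + h * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
            for i in range(2)
        )
    return state


if __name__ == "__main__":
    lam_a, lnz_a = flow_to_fixed_point(0.5, 30.0, 3000)
    lam_b, lnz_b = flow_to_fixed_point(0.5, 40.0, 4000)
    print("coupling at large kappa:", lam_b)
    print("d ln z / d ln kappa    :", (lnz_b - lnz_a) / 10.0)
```

In this toy flow the effective coupling saturates at λ_* and the logarithmic slope of z(κ) settles at γ(λ_*), that is, z(κ) grows as a fixed power of κ, as in (18.99).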
A classic case is the β function of massless φ4 theory in three dimensions, which has
exactly the appearance of Fig. 18.26, but with an overall minus sign, indicating that
power scaling behavior is obtained for the correlation functions in the infrared limit
[Fig. 18.26: plot of β(λ) against λ, with the renormalized coupling λ_R and the zero λ_* marked on the λ axis.]
17 A comprehensive treatment of this important area can be found in the treatise of Zinn-Justin
(Zinn-Justin, 1989).
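\[
\left\{\mu\frac{\partial}{\partial\mu} + \beta\frac{\partial}{\partial\lambda_R} + \gamma_m m_R\frac{\partial}{\partial m_R} - (2\gamma + \gamma_O)\right\}\tilde{C}_O(q; \lambda_R, m_R, \mu) = 0 \tag{18.103}
\]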
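\[
\left\{\mu\frac{\partial}{\partial\mu} + \beta(\lambda_R)\frac{\partial}{\partial\lambda_R} - (2\gamma(\lambda_R) + \gamma_O(\lambda_R))\right\}\tilde{C}_O(q; \lambda_R, \mu) = 0 \tag{18.104}
\]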
18.4 Problems
1. The conserved currents Jαμ associated with a non-abelian symmetry satisfy the
Ward–Takahashi identity (12.142), where all the operators are bare, unrenormalized
ones (prior to rescaling). Rewriting the identity in terms of renormalized
fields (with N3(Jαμ) = ZJ Jαμ, $\phi_R = \hat{Z}^{-1/2}\phi$), show that the bare Jαμ must already
be ultraviolet finite, and that we may therefore simply set the associated renormalization
factor ZJ = 1.
2. By examining the graph of Fig. 18.27(a), verify that the oversubtracted $N_4(\phi^2)$
operator in $\phi^4_4$-theory contains a contribution from the (minimally subtracted)
$N_4(\phi^4)$ operator, as implied by the Zimmermann identity (18.11).
3. The one-loop hexagon graph in $\phi^3_6$ theory shown in Fig. 18.27(b) is taken at an
exceptional momentum point as various subsets of the incoming external momenta
combine to zero momentum. It is proportional to the (UV-finite) Feynman integral,
with superficial degree of divergence −6,
\[
I(q_i, m) = \int \frac{1}{(l-q_1)^2 + m^2}\,\frac{1}{(l-q_2)^2 + m^2}\,\frac{1}{(l-q_3)^2 + m^2}\,\frac{1}{(l^2 + m^2)^3}\,\frac{d^6 l}{(2\pi)^6} \tag{18.105}
\]
(a) By rescaling the loop integration variable by $Q \equiv \sqrt{q^2}$, show that
\[
I(q_i, m) = \frac{1}{Q^6}\,I(\hat{q}_i, \frac{m}{Q}), \quad \hat{q}_i = q_i/Q, \quad \hat{q}_i^2 = 1, \quad i = 1, 2, 3 \tag{18.106}
\]
Problems 709
Fig. 18.27 (a) A one-loop graph in $\phi^4_4$-theory needing an oversubtraction in the insertion of
$N_4(\phi^2)$ in the 2-2 amplitude. (b) A one-loop graph in $\phi^3_6$-theory at an exceptional external
momentum point.
Show that for large Q, the dependence of $I(\hat{q}_i, \frac{m}{Q})$ on the vanishing rescaled
mass $\frac{m}{Q}$ is logarithmic, due to a logarithmic infrared divergence when $l \to 0$
in the rescaled integral. This implies a logarithmic modification of the naive
power scaling $1/Q^6$ when all $q_i$ are taken large simultaneously (of order Q):
$I(q_i, m) \sim C\ln(Q/m)/Q^6$.
(b) Now assume that a mass insertion is made on one of the lines carrying
momentum l, thereby increasing the power of the propagator $\frac{1}{l^2 + m^2}$ to four.
8. Verify the equality of the left- and right-hand sides of (18.55) (see also Section 6.6).
9. Starting from (18.63), verify the inequality (18.64).
10. The analyticity of the amplitude Γ(ω, Q2 ) (suppressing the k 2 dependence) in
the complex ω-plane is limited by the presence of cuts due to physical thresholds
in the sum over states n in (18.55). For a physical state with forward time-like
four-momentum $P^\mu$, we must have $P_\pm > 0$ (as $P_0 > |\vec{P}|$). Show that this implies
that the imaginary part of the amplitude vanishes unless ω > 1. Moreover, by
Bose symmetry (amplitude T (q, k) even under q → −q), we must have Γ(ω, Q2 ) =
Γ(−ω, Q2 ). Consequently, Γ(ω, Q2 ) is analytic in the complex ω-plane except for
cuts running along the real axis from −∞ to −1 and +1 to +∞.
11. (a) Show that the one-loop term contributing to the right-most graph in Fig.
18.25, with the operator $N_4(\phi(0)(\frac{i\overleftrightarrow{\partial}_-}{2k_-})^n \phi(0))$ appearing at the vertex V, is logarithmically
divergent regardless of the power n. The relevant Feynman integral is
\[
\int \frac{1}{(l^2 - m^2 + i\epsilon)^2}\,\frac{1}{(k+l)^2 - m^2 + i\epsilon}\left(\frac{l_-}{k_-}\right)^n \frac{d^6 l}{(2\pi)^6} \tag{18.107}
\]
Show that this integral is rendered finite when once subtracted at the light-like
point $k = \hat{k}$ (i.e., setting $k_+$ and the transverse part of k to zero). Hint: consider the integral with $l_-^n$ replaced
by the general tensor $l_{\mu_1} l_{\mu_2} \cdots l_{\mu_n}$; after introducing Feynman parameters (cf.
(16.67), easily generalized to higher powers of either propagator by differentiation
with respect to A or B), and making the usual shift in integration variable, one
sees that the only surviving term when all Lorentz indices are set equal to the −
light-cone coordinate is proportional to $k_-^n$ (as $g_{--} = 0$), with a logarithmically
divergent coefficient.
(b) Repeat for the one-loop graph studied in part (a), but with a single mass
insertion (doubled propagator), which gives, up to uninteresting factors, the
lowest-order contribution to the anomalous dimension γn in (18.72).
12. Use the analyticity of $\Gamma(\omega, Q^2)$ demonstrated in Problem 10 to derive the following
Cauchy representation for the coefficients in (18.70):
\[
v_n C_n(Q^2) = \frac{1}{2\pi i}\oint \Gamma(\omega, Q^2)\,\frac{d\omega}{\omega^{n+1}} \tag{18.108}
\]
where the contour runs around a small circle enfolding the origin. By unfolding
the contour of integration onto the cuts, and changing variables to x ≡ 1/ω, show
that
\[
v_n C_n(Q^2) = \frac{1 + (-1)^n}{\pi}\int_0^1 x^{n-1}\,\mathrm{Im}(\Gamma(x, Q^2))\,dx \tag{18.109}
\]
In the final chapter of this book we wish to turn our attention to the behavior of field
theory amplitudes in the long-distance regime, where “distance” is to be interpreted
spatiotemporally: we are interested in those aspects of elementary processes described
by a local relativistic field theory which involve large, even macroscopic, regions of
spacetime. Evidently, this behavior is closely connected with the whole definition of the
stable, asymptotic particle states of the theory, which are after all what survive when
a long time has elapsed after an elementary interaction and the final-state particles
are allowed to separate and become free from each other’s influence.
In Section 9.3 we described in detail the construction of these asymptotic states
and their connection to the local (or almost local) Heisenberg fields of the theory,
along the lines of the Haag–Ruelle scattering theory. An essential input to the
Haag–Ruelle formalism is the existence of a mass gap: the single particle state(s)
of the theory correspond to isolated δ-function singularities $\delta(p^2 - m^2)$, $m \neq 0$, in the
spectral density of the squared-mass operator $P_\mu P^\mu$, or equivalently, simple poles in
the momentum-space full propagator defined by Fourier transforming the two-point
Feynman function of any suitable interpolating field for the particle in question. In
other words, the scattering theory unfolds in a straightforward and physically intuitive
way provided we adhere to field theories of purely massive particles. Once exactly
massless particles are present in the theory, the construction of a rigorous scattering
theory—in particular, the construction of a separable asymptotic Fock space based on
almost local operators, along the lines of Theorem 9.4—becomes considerably more
complicated. Nevertheless, the desired extension of the Haag–Ruelle theory has been
carried out by Buchholz for massless spin-0 bosons (Buchholz, 1977) and for massless
spin-1/2 fermions (Buchholz, 1975), leading to a well-defined S-matrix and asymptotic
in- and out-Fock spaces along more or less the usual lines, at least for theories in odd
spatial dimensions where the Huyghens principle applies.
For massless spin-1 gauge bosons on the other hand, where the dynamics is
subject to an exact local gauge symmetry, the situation is far more subtle. Strangely
enough, the conceptual (if not calculational) difficulties are greater in the case of
unbroken abelian theories such as QED than in non-abelian gauge theories such as
QCD, for the simple reason that the latter theories do have a mass gap, despite the
presence of massless fields in the Lagrangian, as a consequence of the non-perturbative
confinement of non-gauge-invariant states, to be discussed below. Accordingly, the
slippery issues which arise once massless gauge particles appear in the asymptotic
The infrared catastrophe in unbroken abelian gauge theory 713
spectrum are avoided in the case of unbroken non-abelian theories, as they also are
in the case of spontaneously broken gauge theories (abelian or non-abelian) where
the Higgs phenomenon results in the spin-1 gauge particles of the theory associated
with the broken generators becoming massive. For an unbroken abelian gauge theory
such as QED, on the other hand, with a massive charged particle (e.g., the electron)
coupled to exactly massless photons, we shall see that the definition of a conventional
Fock space and associated S-matrix fails in a fundamental way: strictly speaking,
the S-matrix vanishes identically in such a theory. Indeed, the single-particle pole(s)
in amplitudes associated with incoming or outgoing charged particle(s) are softened
to branch points, making the LSZ formalism useless for obtaining finite scattering
amplitudes.
with E(k) = |k| for massless photons. The polarization vectors $\vec{\epsilon}(\vec{k},\lambda)$ are transverse,
i.e., $\vec{k}\cdot\vec{\epsilon}(\vec{k},\lambda) = 0$, thereby ensuring the Coulomb gauge condition. It will be
convenient to re-express the photon field in a four-dimensional Fourier representation, by
reintroducing the energy integral, with the mass-shell condition inserted via the usual
δ-function:
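$$\vec{A}(x) = \sum_{\lambda=1}^{2}\int\frac{d^4k}{(2\pi)^4}\,(2\pi)^{5/2}\sqrt{2E(k)}\,\Bigl(\theta(k_0)\,\delta(k^2)\,\vec{\epsilon}(\vec{k},\lambda)\,a(\vec{k},\lambda)\,e^{-ik\cdot x} + \text{h.c.}\Bigr) \tag{19.3}$$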
The Heisenberg field equation for the theory is just the quantized version of
Maxwell’s equation,
$$\Box\,\vec{A}_H(x) = \vec{J}_{tr}(x),\qquad \vec{\nabla}\cdot\vec{J}_{tr}(x) = 0 \tag{19.4}$$
where Jtr (x) is the transverse part of the full external current Jμ (an explicit example
will be considered shortly), the rest of Jμ (or “longitudinal” part) being associated with
the Coulombic flux carried along with the incoming and outgoing charged particle(s)
of the classical current, but not involved in the generation or removal of real photons
present at outgoing or incoming null (i.e., light-like) infinity. The prescribed c-number
current Jtr (x) is real and transverse, and therefore has the Fourier expansion
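$$\vec{J}_{tr}(x) = \sum_{\lambda=1}^{2}\int\frac{d^4k}{(2\pi)^{5/2}}\,\vec{\epsilon}(\vec{k},\lambda)\,\Bigl(\tilde{J}(\vec{k},\lambda)\,e^{-ik\cdot x} + \tilde{J}^{*}(\vec{k},\lambda)\,e^{ik\cdot x}\Bigr) \tag{19.5}$$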
1 The field renormalization constant Z appearing in the asymptotic condition, non-trivial in theories
with fully quantized local field interactions, is simply unity in our case, where the photon field is coupled
to a c-number source. This follows from the fact that the Heisenberg field and the associated asymptotic
in- and out-fields differ by c-number terms (see (19.10, 19.11)), and therefore satisfy identical equal-time
commutators, which then forces Z = 1. We remind the reader that the limits indicated in (19.6) are to
be interpreted as weak limits—for matrix elements of the indicated, suitably smeared, operators (see
Section 9.3).
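Employing the familiar identity $\frac{1}{k^2 + i\epsilon k_0} = P\!\left(\frac{1}{k^2}\right) - i\pi\,\epsilon(k_0)\,\delta(k^2)$ (with $\epsilon(k_0)$ the sign
function, $\epsilon(k_0) = \theta(k_0) - \theta(-k_0)$), the Green function Δ(x) is found to have the
Fourier representation
$$\Delta(x) = i\int\frac{d^4k}{(2\pi)^3}\,e^{-ik\cdot x}\,\epsilon(k_0)\,\delta(k^2) \tag{19.13}$$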
Taking the Fourier transform of (19.13), and using the Fourier representations (19.3)
and (19.5) for the field and current, we find
$$a_{out}(\vec{k},\lambda) = a_{in}(\vec{k},\lambda) + \frac{i}{\sqrt{2E(k)}}\,\tilde{J}(\vec{k},\lambda) \tag{19.14}$$
This explicit relation between the out- and in-annihilation operators also determines
(up to an overall phase) the S-matrix for the theory (cf. (9.51)),
with a similar relation connecting the creation operators a†out , a†in . The desired S-
matrix operator is easily found if we recall the Baker–Campbell–Hausdorff formula
eB Ae−B = A + [B, A], valid if [B, A] is a c-number. If we take S as the formally
unitary operator (exponential of an anti-hermitian operator)
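$$S = \exp\Bigl(i\int\frac{d^3k}{\sqrt{2E(k)}}\sum_{\lambda=1}^{2}\bigl(\tilde{J}(\vec{k},\lambda)\,a^{\dagger}_{in}(\vec{k},\lambda) + \tilde{J}^{*}(\vec{k},\lambda)\,a_{in}(\vec{k},\lambda)\bigr)\Bigr) \tag{19.16}$$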
valid if the commutator [A, B] is a c-number. Choosing A and B as the first and second
terms in the exponent in (19.16), we find
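$$S = C\,\exp\Bigl(i\int\frac{d^3k}{\sqrt{2E(k)}}\sum_{\lambda=1}^{2}\tilde{J}(\vec{k},\lambda)\,a^{\dagger}_{in}(\vec{k},\lambda)\Bigr)\cdot\exp\Bigl(i\int\frac{d^3k}{\sqrt{2E(k)}}\sum_{\lambda=1}^{2}\tilde{J}^{*}(\vec{k},\lambda)\,a_{in}(\vec{k},\lambda)\Bigr) \tag{19.18}$$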
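The operator identities behind this factorization can be checked exactly in a small matrix model. The following Python sketch is only an illustration under stated assumptions: the matrices A and B are strictly upper-triangular 3×3 matrices (generators of the Heisenberg algebra), chosen because their commutator commutes with both of them, which is all that the identities require; they are not the photon operators appearing in (19.16). The second identity tested, e^{A+B} = e^A e^B e^{-[A,B]/2}, is the standard disentangling relation, assumed here to be the one that leads to (19.18).

```python
# Exact 3x3 check of the two operator identities used for the S-operator.
# Assumption: A and B are illustrative Heisenberg-algebra matrices,
# not the photon creation/annihilation integrals of the text.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def combine(X, Y, sx=1.0, sy=1.0):
    """Return the linear combination sx*X + sy*Y."""
    return [[sx * X[i][j] + sy * Y[i][j] for j in range(3)] for i in range(3)]

IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def exp_nilpotent(X):
    """exp(X) = 1 + X + X^2/2, exact when X^3 = 0 (strictly upper-triangular 3x3)."""
    return combine(combine(IDENTITY, X), matmul(X, X), 1.0, 0.5)

def commutator(X, Y):
    return combine(matmul(X, Y), matmul(Y, X), 1.0, -1.0)

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(3) for j in range(3))

a, b = 0.7, -1.3  # arbitrary illustrative coefficients
A = [[0.0, a, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
B = [[0.0, 0.0, 0.0], [0.0, 0.0, b], [0.0, 0.0, 0.0]]
minus_B = combine(B, B, -1.0, 0.0)

# Identity 1: e^B A e^{-B} = A + [B, A]
lhs_conjugation = matmul(matmul(exp_nilpotent(B), A), exp_nilpotent(minus_B))
rhs_conjugation = combine(A, commutator(B, A))

# Identity 2: e^{A+B} = e^A e^B e^{-[A,B]/2}
minus_half_comm = combine(commutator(A, B), commutator(A, B), -0.5, 0.0)
lhs_disentangle = exp_nilpotent(combine(A, B))
rhs_disentangle = matmul(matmul(exp_nilpotent(A), exp_nilpotent(B)),
                         exp_nilpotent(minus_half_comm))

print(max_abs_diff(lhs_conjugation, rhs_conjugation))
print(max_abs_diff(lhs_disentangle, rhs_disentangle))
```

In the field-theoretic case the commutator of the two terms in the exponent of (19.16) is a genuine c-number, and the extra exponential it generates is the prefactor C.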
Recalling our original definition of S in Section 9.1 (9.47), we see that the photon
state produced by our classical current, given a photon vacuum in the asymptotic
past, takes the explicit form
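$$S|0\rangle_{in} = C\,\exp\Bigl(i\int\frac{d^3k}{\sqrt{2E(k)}}\sum_{\lambda=1}^{2}\tilde{J}(\vec{k},\lambda)\,a^{\dagger}_{in}(\vec{k},\lambda)\Bigr)|0\rangle_{in} \tag{19.20}$$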
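$$\begin{aligned}
x^{\mu}(\tau) &= \frac{p^{\mu}}{m}\,\tau, \qquad \tau < \tau_{-}\\
x^{\mu}(\tau) &= \frac{p'^{\mu}}{m}\,\tau, \qquad \tau > \tau_{+}
\end{aligned} \tag{19.21}$$
$$J^{\mu}(x) = e\int d\tau\,\frac{dx^{\mu}}{d\tau}\,\delta^{4}(x - x(\tau)) \tag{19.22}$$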
Fig. 19.1 Spacetime trajectory of a classical charged particle undergoing a localized interaction. [The trajectory follows pτ/m for τ < τ− and p′τ/m for τ > τ+.]
We conclude that the momentum-space current density in this situation must take
the form
$$\tilde{J}^{\mu}(k) = \frac{ie}{(2\pi)^{3/2}}\left(\frac{p^{\mu}}{k\cdot p} - \frac{p'^{\mu}}{k\cdot p'}\right)M(k) \tag{19.26}$$
where the residual amplitude M(k) satisfies (a) M(k) → 1, k → 0, and (b) M(k)
vanishes faster than any power of k for large k (as a consequence of the smoothness
of the trajectory in the interaction zone). Note that the current conservation property
∂μ J μ = 0 ⇒ kμ J˜μ (k) = 0 is automatically satisfied by (19.26). The full momentum-
space current in (19.26) can be decomposed into a longitudinal and transverse part
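$$\tilde{J}^{\mu}(k) = k^{\mu}\tilde{J}_{l}(k) + \tilde{J}^{\mu}_{tr}(k) = k^{\mu}\tilde{J}_{l}(k) + \sum_{\lambda=1}^{2}\epsilon^{\mu}(\vec{k},\lambda)\,\tilde{J}(\vec{k},\lambda) \tag{19.27}$$
where the Coulomb gauge polarization vectors satisfy $\epsilon^{0}(\vec{k},\lambda) = 0$, $\vec{k}\cdot\vec{\epsilon}(\vec{k},\lambda) = 0$, and
therefore $k_{\mu}\epsilon^{\mu}(\vec{k},\lambda) = 0$. The functions $\tilde{J}(\vec{k},\lambda)$ are just (up to uninteresting overall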
constants) the objects introduced earlier in (19.5). The current conservation property
kμ J˜μ (k) = 0 then follows directly as a consequence of the photon mass-shell condition
k 2 = 0. We also have (again employing k 2 = 0, together with (19.26))
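$$\tilde{J}_{\mu}(k)\,\tilde{J}^{\mu *}(k) = -\sum_{\lambda=1}^{2}|\tilde{J}(\vec{k},\lambda)|^{2} = \frac{e^{2}}{(2\pi)^{3}}\left(\frac{p}{k\cdot p} - \frac{p'}{k\cdot p'}\right)^{2}|M(k)|^{2} \tag{19.28}$$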
Note that the four-vector square here is negative because only the space-like transverse
part survives. Returning to our expression for the S-operator, (19.18), we see that the
exponent in the prefactor C displayed in (19.19) is given, for massless photons with
E(k) = |k|, and with the current satisfying (19.28), by an infrared (logarithmically)
divergent integral, resulting in the vanishing of C, and thence, the S-operator itself
(remember the implicit minus sign in the four-vector square!),
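$$C = \exp\left(\frac{e^{2}}{2(2\pi)^{3}}\int\frac{d^3k}{2E(k)}\,\frac{1}{|\vec{k}|^{2}}\left(\frac{p}{E(p)-\hat{k}\cdot\vec{p}} - \frac{p'}{E(p')-\hat{k}\cdot\vec{p}\,'}\right)^{2}|M(k)|^{2}\right) \sim \exp(-\infty) = 0 \tag{19.29}$$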
where for the photon E(k) = |k|, while for the massive charged particle $E(p) = \sqrt{\vec{p}^{\,2} + m^{2}}$, etc. Note that the integral is cut off at the upper end by the rapid falloff of
the residual amplitude M(k): the problem is entirely at the infrared (low-momentum)
end. Indeed, if we regulate the infrared divergence by temporarily inserting a photon
mass, so that $E(k) = \sqrt{\vec{k}^{\,2} + m_{\gamma}^{2}}$, and cut the integral off at some momentum Λ
(below which the amplitude M(k) may be approximated by unity), the integral in
the exponent in (19.29) becomes
$$\left\langle\left(\frac{p}{E(p)-\hat{n}\cdot\vec{p}} - \frac{p'}{E(p')-\hat{n}\cdot\vec{p}\,'}\right)^{2}\right\rangle_{\hat{n}}\int^{|\vec{k}|=\Lambda}\frac{d^3k}{2|\vec{k}|^{2}\sqrt{\vec{k}^{\,2}+m_{\gamma}^{2}}} \tag{19.30}$$
where the quantity in angle brackets corresponds to an angular average over directions
of the unit vector n̂. The integral on the right is logarithmically divergent in the limit
of vanishing photon mass:
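$$\int^{|\vec{k}|=\Lambda}\frac{d^3k}{2|\vec{k}|^{2}\sqrt{\vec{k}^{\,2}+m_{\gamma}^{2}}} = 2\pi\int_{0}^{\Lambda}\frac{dk}{\sqrt{k^{2}+m_{\gamma}^{2}}} \sim 2\pi\log\frac{\Lambda}{m_{\gamma}},\qquad m_{\gamma}\to 0 \tag{19.31}$$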
Note that while the infrared divergence manifests itself as a zero in the S-operator once
we sum to all powers of the electric charge e, normal perturbation theory corresponds
to an expansion in powers of e, in which case the divergence appears as logarithms
of the photon mass—one for each power of the fine structure constant α ∝ e2 —and
therefore an independent logarithmic divergence at each order of perturbation theory
in any specific S-matrix element involving a definite number of incoming and outgoing
photons. The only way to avoid this “infrared catastrophe” is, as we see clearly in
(19.30), to take p = p′, and prevent our charged particle from receiving any transfer
of momentum, however small!
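A rough numerical feel for (19.29)–(19.31) can be obtained with a few lines of Python. The sketch below is an illustration under stated assumptions, not part of the derivation: it sets M(k) = 1 below the cutoff, uses the exact radial integral 2π asinh(Λ/mγ) (which differs from 2π log(Λ/mγ) only by a constant as mγ → 0), takes α = e²/4π ≈ 1/137, and evaluates the angular average by a simple midpoint rule. The function names and the sample momenta are hypothetical choices made for the example.

```python
import math

def angular_average(p, p_prime, m=1.0, n_theta=200, n_phi=64):
    """Average over directions n of the four-vector square (metric +,-,-,-) of
    p/(E(p) - n.p) - p'/(E(p') - n.p'), as in the bracket of (19.30)."""
    E = math.sqrt(m * m + sum(c * c for c in p))
    Ep = math.sqrt(m * m + sum(c * c for c in p_prime))
    total = 0.0
    for i in range(n_theta):
        cos_t = -1.0 + (i + 0.5) * 2.0 / n_theta   # midpoint rule in cos(theta)
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            n = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            d = E - sum(n[k] * p[k] for k in range(3))
            dp = Ep - sum(n[k] * p_prime[k] for k in range(3))
            v0 = E / d - Ep / dp
            v = [p[k] / d - p_prime[k] / dp for k in range(3)]
            total += v0 * v0 - sum(c * c for c in v)
    return total / (n_theta * n_phi)

def vacuum_persistence(avg, m_gamma, cutoff, alpha=1.0 / 137.036):
    """C from (19.29): e^2/(2 (2 pi)^3) * avg * 2 pi asinh(cutoff/m_gamma),
    with e^2 = 4 pi alpha, so the exponent is (alpha / 2 pi) * avg * asinh(...)."""
    return math.exp(alpha / (2.0 * math.pi) * avg * math.asinh(cutoff / m_gamma))

# Sample kinematics (assumed): particle initially at rest, final velocity 0.6.
avg = angular_average((0.0, 0.0, 0.0), (0.0, 0.0, 0.75))
for m_gamma in (1e-3, 1e-30, 1e-300):
    print(m_gamma, vacuum_persistence(avg, m_gamma, cutoff=1.0))
```

The average is negative whenever p ≠ p′ and vanishes identically for p = p′, in line with the remark above. With the physical value of α the exponent is small, so C approaches zero only as a small power of mγ/Λ.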
The physical interpretation of these results is actually extremely simple. Unlike the
situation for massless pions in a chirally symmetric theory (cf. Section 16.6), where
emission of low-momentum pions is suppressed by powers of the low momentum, there
is no penalty in quantum electrodynamics to the emission of low-momentum photons.
Moreover, the emission of a massless photon with arbitrarily low momentum incurs an
arbitrarily low-energy cost, and therefore we should hardly be surprised if the slightest
momentum shift of a charged particle induces the emission of a very large number of
extremely soft photons.
The result of this proliferation of emitted photons, as we have just seen, at least
for the simplified case of a classical charged particle acting as the source, is that the
coherent photon state thereby produced contains so many multi-photon states with
arbitrarily many soft photons that the exclusive amplitude for our charged particle to
emit any definite finite number of photons simply vanishes. In particular, the
vacuum-persistence amplitude $_{out}\langle 0|0\rangle_{in} = C$ itself vanishes. The situation is somewhat
analogous to that discussed previously in the context of Haag’s theorem (Section 10.5), in
that we have unitarily inequivalent spaces, but in this case, not between the physical
in- and out-asymptotic spaces and the computationally convenient (for perturbation
theory) but physically dispensable interaction-picture space, but between the in- and
out-spaces themselves, which by asymptotic completeness we have come to regard as
identical to each other and to the basic physical Hilbert space of the theory in any
“sensible” field theory.
The situation we are here encountering clearly suggests at the conceptual level a
more serious disease than any we have previously uncovered in our studies of local field
theory, and in fact it must be admitted that after much intense investigation there does
not appear to be any way to resuscitate the concept of a normal separable Fock space
with unitarily equivalent asymptotic spaces in an unbroken abelian gauge theory like
QED, with massless photons coupled to charged particles. Nevertheless, we hasten to
assure the depressed reader that a cure is at hand, even though it requires the abandon-
ment of scattering amplitudes like the S-matrix as the fundamental phenomenological
object of the theory, and the return to a direct evaluation of only carefully defined
and directly measurable quantities. This “Bloch–Nordsieck resolution” of the infrared
catastrophe of QED was already proposed in the 1930s, before the advent of modern
covariant quantum electrodynamics a decade later, and will be the subject of the
following section.
There may be some concern that the results we have obtained are subject to the
restriction that while our photons are described in a fully quantum mechanical context,
the charged source is classical and that perhaps this “hybrid” treatment is in some way
introducing inconsistencies into the theory, resulting in the evaporation of our beloved
S-matrix. In fact, essentially identical results are obtained in the fully quantized version
of the theory. We shall briefly explain how this works, referring the reader to the
beautiful article of Weinberg (Weinberg, 1965) for the combinatoric details needed to
obtain the final result. Consider a process in QED in which a single incoming electron
scatters off an arbitrary set of other particles, which for simplicity we take to be
themselves uncharged (an example might be Compton scattering, where the electron
scatters off a single hard photon). The basic process is indicated in Fig. 19.2, where
we consider the effect of emission of a single soft photon, carrying four-momentum
k (much smaller than all other momentum scales in the process), from either the
incoming (Fig. 19.2(a)) or outgoing (Fig. 19.2(b)) electron. In the limit k → 0, both
of these amplitudes are found to diverge linearly. In the former case, for example, the
amplitude is proportional to
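$$\begin{aligned}
&\bar{u}(p',\sigma')\,M(k)\,\frac{\slashed{p}-\slashed{k}+m}{(p-k)^{2}-m^{2}}\,ie\gamma^{\mu}\,u(p,\sigma)\\
&\quad = ie\,\bar{u}(p',\sigma')\,M(k)\,\frac{\gamma^{\mu}(-\slashed{p}+m)+2p^{\mu}-\slashed{k}\gamma^{\mu}}{-2p\cdot k+k^{2}}\,u(p,\sigma)\\
&\quad \sim \left(-ie\,\frac{p^{\mu}}{p\cdot k}\right)\bar{u}(p',\sigma')\,M(k)\,u(p,\sigma),\qquad k\to 0
\end{aligned} \tag{19.32}$$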
Here the amplitude M(k) simply represents the “core” of the diagram, where the
relevant momentum scales are much higher than k, so that for small k we may simply
assume that it approaches some (for our purposes) uninteresting constant. The vector
index μ of the emitted photon (associated with the vertex factor $ie\gamma^{\mu}$) is, of course, to
be contracted with a polarization vector $\epsilon_{\mu}$ in the event that the photon is a real one
appearing in the final state, or with the corresponding vector index of an absorbed
photon if it ends up being a virtual photon. The initial on-mass-shell spinor satisfies
the Dirac equation $(\slashed{p} - m)u(p,\sigma) = 0$, and we have employed the Dirac algebra $\slashed{p}\gamma^{\mu} = 2p^{\mu} - \gamma^{\mu}\slashed{p}$,
and neglected higher powers of k, in arriving at (19.32). The corresponding
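emission from an outgoing line in Fig. 19.2(b) produces similarly, in the small k limit,
a factorized amplitude with the dependence on the soft photon momentum isolated in
a linearly divergent prefactor,
$$\bar{u}(p',\sigma')\,ie\gamma^{\mu}\,\frac{\slashed{p}'+\slashed{k}+m}{(p'+k)^{2}-m^{2}}\,M(k)\,u(p,\sigma) \sim \left(+ie\,\frac{p'^{\mu}}{p'\cdot k}\right)\bar{u}(p',\sigma')\,M(k)\,u(p,\sigma) \tag{19.33}$$
[Fig. 19.2: (a) soft photon of momentum k emitted from the incoming electron line (internal momentum p − k); (b) soft photon of momentum k emitted from the outgoing electron line (internal momentum p′ + k). In both panels the blob is the core amplitude M(k).]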
Combining these results, we see that the emission of a single soft photon from
the charged electron traversing the process produces a divergent prefactor which is
identical (up to a sign) to our Fourier-transformed classical current (19.26):
$$M_{\text{emit one photon}}(k) \sim -ie\left(\frac{p^{\mu}}{k\cdot p} - \frac{p'^{\mu}}{k\cdot p'}\right)\bar{u}(p',\sigma')\,M(k)\,u(p,\sigma) \tag{19.34}$$
The absorption of a photon on either the incoming or outgoing electron line leads
to an exactly similar result, with a change of sign (as in this case we have k → −k).
A single virtual photon exchange requires both of these factors, together with the
virtual photon propagator $-ig_{\mu\nu}/(k^{2} + i\epsilon)$, and an integration over the photon
four-momentum. Putting in the appropriate coupling and combinatoric factors, one finds
that the low-momentum contribution arising from single virtual photon exchange
results in the modification of the uncorrected core amplitude M by a multiplicative
factor
$$J = i\,\frac{e^{2}}{2}\int\frac{d^4k}{(2\pi)^{4}}\,\frac{1}{k^{2}+i\epsilon}\left(\frac{p}{k\cdot p} - \frac{p'}{k\cdot p'}\right)^{2} \tag{19.35}$$
while multiple virtual photon exchanges simply exponentiate this result leading to an
overall amplitude in which soft photon effects appear as an exponential prefactor,
The real suppression factor (19.29) previously obtained in our semiclassical analysis
corresponds to the absolute magnitude |exp(J)| = exp(Re(J)).² One finds that
the real part arises effectively from the mass-shell δ-function in the virtual photon
propagator $1/(k^{2} + i\epsilon) = P(1/k^{2}) - i\pi\delta(k^{2})$, leaving us with the three-dimensional
$d^3k$ integral visible in (19.29) (recall that in the classical problem, M(k) = 1 in the
infrared region where the factorization of the soft-photon effects is valid). The reader
is referred to the previously mentioned article of Weinberg’s for a full discussion of
the combinatorics of the soft photon effects, and a careful analysis of the integral
appearing in (19.35).
The annoying “evaporation” of S-matrix amplitudes as a consequence of the expo-
nentiation of infrared divergences appearing in S-matrix amplitudes at finite orders of
perturbation theory is a symptom of deep structural problems in the formulation of
the physical state space of a theory like QED with massless gauge particles appearing
in the asymptotic states of the theory. In fact, the phenomenon is even present in
some low-dimensional non-gauge theories—the classic example being the model of a
2 There is also a divergent phase factor, arising from the imaginary part of J. This is associated with
divergences familiar from careful treatments of Coulomb scattering in non-relativistic quantum mechanics:
see (Weinberg, 1965), Section V.
3 For a recent review of the situation, see the discussion of Haag in (Haag, 1992), Sections VI.2, VI.3.
4 In our classical current model this flux has its origin in the longitudinal part of the current density
(19.27). In a fully quantized theory the asymptotic flux can be shown to commute with all local operators
and therefore to be a c-number: see (Buchholz, 1982).
of the theory seems at first a radical step. Nevertheless, the elimination of the infrared
catastrophe provided by this “Bloch–Nordsieck” resolution, to which we now turn,
provides the basis for the unambiguous calculation of the quantum electrodynamic
component of essentially all high-energy processes, the vast majority of which involve
charged particles in the initial or final state, in modern particle physics.
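$$P_{n}(\vec{q},\lambda;\Delta) = \frac{1}{n!}\sum_{\lambda_i}\int_{|\vec{k}_i|<\Delta} d^3k_1\cdots d^3k_n\,\bigl|{}_{out}\langle \vec{q}\lambda,\vec{k}_1\lambda_1,\ldots,\vec{k}_n\lambda_n|0\rangle_{in}\bigr|^{2} \tag{19.37}$$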
where the 1/n! factor takes into account multiple counting of identical soft photon
states (by Bose symmetry). We may convert the out-state appearing in the matrix
element to an in-state by introducing the scattering operator S, as in (9.47), where in
our case S is given explicitly by (19.18):
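$$\begin{aligned}
P_{n}(\vec{q},\lambda;\Delta) = {}& |C|^{2}\,\frac{1}{n!}\sum_{\lambda_i}\int_{|\vec{k}_i|<\Delta} d^3k_1\cdots d^3k_n\\
&\cdot\,\Bigl|{}_{in}\langle \vec{q}\lambda,\vec{k}_1\lambda_1,\ldots,\vec{k}_n\lambda_n|\,\frac{i^{n+1}}{(n+1)!}\Bigl\{\int\frac{d^3k}{\sqrt{2E(k)}}\sum_{\lambda}\tilde{J}(\vec{k},\lambda)\,a^{\dagger}_{in}(\vec{k},\lambda)\Bigr\}^{n+1}|0\rangle_{in}\Bigr|^{2}
\end{aligned} \tag{19.38}$$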
The destruction operators in the exponential on the right in (19.18) act on the in-
vacuum, and the exponential therefore reduces to unity, while the left exponential
can be expanded as shown in (19.38), with only the term involving n + 1 creation
operators surviving. C is the vacuum persistence amplitude (amplitude for emission
of no photons) given in (19.19). Any one of the n + 1 creation operators can be used
to the left to remove the single distinguished hard photon, giving an overall factor
of $(n+1)\tilde{J}(\vec{q},\lambda)/\sqrt{2E(q)}$ in the matrix element. Taking this outside the soft photon
integrals, we find
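$$\begin{aligned}
P_{n}(\vec{q},\lambda;\Delta) &= |C|^{2}\,\frac{|\tilde{J}(\vec{q},\lambda)|^{2}}{2E(q)}\,\frac{1}{n!}\sum_{\lambda_i}\int_{|\vec{k}_i|<\Delta} d^3k_1\cdots d^3k_n\,\Bigl|{}_{in}\langle \vec{k}_1\lambda_1,\ldots,\vec{k}_n\lambda_n|\,\frac{(A_J^{\dagger}(\Delta))^{n}}{n!}\,|0\rangle_{in}\Bigr|^{2}\\
A_J(\Delta) &\equiv \int_{|\vec{k}|<\Delta}\frac{d^3k}{\sqrt{2E(k)}}\sum_{\lambda}\tilde{J}^{*}(\vec{k},\lambda)\,a_{in}(\vec{k},\lambda)
\end{aligned} \tag{19.39}$$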
where we are allowed to restrict the integral over k in the n-particle creation operator
to the soft-momentum regime |k| < Δ, as the only photons present in the final state
are now the soft ones. The restriction to soft momenta in the momentum integrals
$d^3k_1\cdots d^3k_n$ can now be relaxed, as the matrix element for any particle with momentum
$|\vec{k}_i| > \Delta$ vanishes given the restriction to soft momenta in the creation operator $A_J^{\dagger}(\Delta)$.
Moreover, the sum over all n-particle states implied by these momentum integrals
can be expanded to a complete set (by including states with m ≠ n photons) as the
additional states manifestly have vanishing matrix elements to the vacuum of the
indicated n-particle creation operator. Recalling the completeness relation (5.22) for
a multi-particle bosonic Fock space, the sum over n-particle states can be augmented
to a complete set of in-states:
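$$P_{n}(\vec{q},\lambda;\Delta) = |C|^{2}\,\frac{|\tilde{J}(\vec{q},\lambda)|^{2}}{2E(q)}\sum_{\alpha}\Bigl|{}_{in}\langle\alpha|\,\frac{(A_J^{\dagger}(\Delta))^{n}}{n!}\,|0\rangle_{in}\Bigr|^{2} \tag{19.40}$$
$$\phantom{P_{n}(\vec{q},\lambda;\Delta)} = |C|^{2}\,\frac{|\tilde{J}(\vec{q},\lambda)|^{2}}{2E(q)}\,\frac{1}{(n!)^{2}}\;{}_{in}\langle 0|(A_J(\Delta))^{n}(A_J^{\dagger}(\Delta))^{n}|0\rangle_{in} \tag{19.41}$$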
The reader will recall (cf. (19.19)) that the vacuum persistence (i.e., no-photon-
emission) factor C is given by
$$C = \exp\Bigl(-\frac{1}{2}\int\frac{d^3k}{2E(k)}\sum_{\lambda}|\tilde{J}(\vec{k},\lambda)|^{2}\Bigr) \tag{19.43}$$
with the integral in the exponent logarithmically divergent in the infrared (in the
massless photon limit) for currents corresponding to a charged particle undergoing
a change of momentum, thereby resulting in the vanishing of C. For any finite n,
the probability Pn (q, λ; Δ) in (19.41) therefore also vanishes for massless photons,
as the matrix element multiplying |C|2 contains a finite power of this same infrared
divergence, as we see from (19.42). On the other hand, if we calculate, as previously
argued, the inclusive probability allowing for emission of arbitrarily many soft photons,
we find that the divergence in the integral at small k is exactly cancelled, giving a finite
probability for the detection of a single hard photon (momentum q and polarization
λ) by a detector of finite resolution Δ:
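$$P_{tot}(\vec{q},\lambda;\Delta) \equiv \sum_{n=0}^{\infty}P_{n}(\vec{q},\lambda;\Delta) = \frac{|\tilde{J}(\vec{q},\lambda)|^{2}}{2E(q)}\,\exp\Bigl(-\int_{|\vec{k}|>\Delta}\frac{d^3k}{2E(k)}\sum_{\lambda}|\tilde{J}(\vec{k},\lambda)|^{2}\Bigr) \tag{19.44}$$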
The infrared divergence in the integral in the exponent is now effectively cut off by
the detector resolution Δ: in fact, the exponent turns out to contain logarithms of
the form ln(q²/Δ²) with currents of the form (19.28) and q = p − p′. In other words,
transition probabilities, and more generally all types of measurable cross-sections, for
quantum electrodynamics processes can be expected to depend in an important, but
fortunately calculable, way on the sensitivity of the measurement apparatus to the
“haze” of low-energy photons which are inevitably present.
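The cancellation leading to (19.44) can be made concrete with a small numerical experiment. The Python sketch below rests on simplifying assumptions that are not in the text: the infrared regulator is a sharp lower cutoff mγ rather than a photon mass, the integral of Σ|J̃|² d³k/2E over a < |k| < b is modelled by its leading logarithm N(a, b) = A log(b/a) with an arbitrary illustrative coefficient A, and the vacuum matrix element in (19.41) is taken to be n! N(mγ, Δ)ⁿ, as follows from treating A_J(Δ) as a single oscillator mode with [A_J, A_J†] = N(mγ, Δ). The common prefactor |J̃(q, λ)|²/2E(q) is dropped.

```python
import math

def soft_photon_probabilities(A, m_gamma, delta, cutoff, n_max=400):
    """Return [P_0, P_1, ..., P_n_max] (common prefactor dropped) in a
    leading-log model: P_n = |C|^2 * N_soft^n / n!, a Poisson-like series."""
    n_total = A * math.log(cutoff / m_gamma)  # exponent of |C|^2, all momenta up to cutoff
    n_soft = A * math.log(delta / m_gamma)    # undetected region |k| < delta
    term = math.exp(-n_total)                 # P_0 = |C|^2
    probabilities = []
    for n in range(n_max + 1):
        probabilities.append(term)
        term *= n_soft / (n + 1)              # P_{n+1} = P_n * N_soft / (n + 1)
    return probabilities

# Illustrative (exaggerated) coefficient and scales; all values are assumptions.
A, delta, cutoff = 0.5, 0.1, 1.0
for m_gamma in (1e-3, 1e-6, 1e-12):
    probs = soft_photon_probabilities(A, m_gamma, delta, cutoff)
    print(m_gamma, probs[0], probs[1], sum(probs), (delta / cutoff) ** A)
```

Each exclusive probability tends to zero as the regulator is removed, while the inclusive sum stays fixed at (Δ/Λ)^A, the leading-log counterpart of the exponential in (19.44).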
We have been illustrating the essential nature of the infrared problem in quan-
tum electrodynamics with the aid of a semiclassical model in which the source
current is treated classically (but with a quantized Maxwell field), and taking full
advantage of a delicious property—complete analytic solvability—of this model. In
particular, we have not needed to resort to perturbative approximations, as our
results contain the exact emission probabilities to all orders in the particle charge
e (which is hidden in the current $\tilde{J}(\vec{k},\lambda)$; cf. (19.26)). In the fully quantized
version (QED) of quantum electrodynamics, in which the charged particle fields
are also treated quantum mechanically, we must of course resort to perturbation
theory.
Before describing the Bloch–Nordsieck resolution in QED proper, it is useful to
take a look at the cancellation of infrared divergences visible in (19.40–19.44), from
the point of view of a perturbative expansion in the squared charge, or fine-structure
constant α = e²/4π, as the cancellations occurring in the fully quantized theory arise in
a completely analogous way. We shall work to order α², or e⁴, recalling that $\tilde{J} \sim O(e)$,
and that the leading term in $P_{tot}$ is of order α, as we insist on the emission of a single
hard photon of momentum q. It is clear from (19.39) that the n-soft-photon emission
probability is of order $\alpha^{n+1}$, so to order α² the total transition probability $P_{tot}$, which
we already know to be infrared finite for finite detector resolution Δ, is given by just
the contributions from n = 0 and n = 1:
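$$\begin{aligned}
P_{tot} &= P_{0} + P_{1} + O(\alpha^{3})\\
&= \frac{|\tilde{J}(\vec{q},\lambda)|^{2}}{2E(q)}\Bigl\{1 - \int\frac{d^3k}{2E(k)}\sum_{\lambda}|\tilde{J}(\vec{k},\lambda)|^{2} + \int_{|\vec{k}|<\Delta}\frac{d^3k}{2E(k)}\sum_{\lambda}|\tilde{J}(\vec{k},\lambda)|^{2}\Bigr\} + O(\alpha^{3})
\end{aligned} \tag{19.45}$$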
The first two terms in the curly braces in (19.45) arise from the expansion (to order
α) of the no-photon-emission probability |C|2 : in particular, the infrared divergent
integral in the second term corresponds to the emission and reabsorption of a single
virtual photon (of arbitrary momentum) on the charged-particle line, accompanying,
of course, the hard photon emission described by the overall prefactor. The third
term, also an infrared divergent integral (cut off on the ultraviolet end by the
detector resolution Δ), corresponds to the total probability for the emission of a single
undetected soft photon. We see that the cancellation in the infrared divergence between
the two integrals appearing in (19.45) amounts to a cancellation in the total probability
Ptot between infrared divergences arising from virtual photon contributions and real
photon emission terms. If we introduce a photon mass mγ to separately regularize
each of the integrals in the infrared, the singular ln (mγ ) dependence in each integral
evidently cancels exactly between virtual and real photon emission terms, at each order
of the perturbative expansion in α, once the finite detector resolution is properly taken
into account.
Returning now to the fully quantized version of quantum electrodynamics, one finds
that precisely the same mechanism operates to produce well-defined transition prob-
abilities (or cross-sections) once photon detector resolutions are taken into account,
once again by cancellation between virtual and real photon contributions. The detailed
analysis, which we must here omit for considerations of space, can be found in the
seminal paper of Yennie, Frautschi, and Suura (Yennie et al., 1961). However, the
mechanism of the cancellation can be indicated with a simple example. Fig. 19.3
shows the low-order Feynman graphs contributing to the emission of a hard photon
of momentum q (indicated by the spiral line) from a charged particle which has had
its momentum altered⁵ by an interaction indicated by the dashed line (which for
simplicity we may take to be of non-electromagnetic character). To lowest order in the
particle charge e (order α ∼ e² for the cross-section, or the squared amplitude), only
the single hard photon need be taken into account, as in the left graph in Fig. 19.3(a),
but to the next order we must take into account the possibility of a virtual photon
exchange, as in the right graph in Fig. 19.3(a), or an additional emission of a real soft
photon (momentum k₁) below the detector threshold, as in Fig. 19.3(b).

Fig. 19.3 Contributions to hard photon emission from an electron in QED (through O(α²)).
Once again one finds that the infrared divergence in the virtual photon diagram
(actually, in the interference term obtained by squaring the amplitude indicated in
Fig. 19.3(a)) cancels exactly with an infrared divergence in the real photon emission
graphs of Fig. 19.3(b). The proof that this cancellation is effective for general
processes, to all orders of perturbation theory, can be found in the aforecited paper of
Yennie et al.
The essential point which we wish to emphasize here is that a careful specification
of the limitations of any measurement process in a situation involving exactly
massless abelian gauge particles automatically leads to well-defined finite transition
probabilities and cross-sections,
even if the intermediate quantities (S-matrix amplitudes) which we normally rely on
in field theory have a singular structure in the zero mass limit. The formal difficulties
(infrared divergences at finite orders of perturbation theory, vanishing of the S-matrix
when the amplitudes are summed to all orders) appear because of mathematically
convenient idealizations in the theoretical formulation which do not correspond to
physical reality: specifically, the propagation of particles in a Minkowski space of
infinite spatial volume, and the existence of detectors of infinitely precise resolution.
For example, in a finite spatial volume the momentum integrals become discretized
sums, with a natural infrared cutoff of order the inverse spatial size of the “box”.
Thus, unlike the situation discussed above, where the average number of photons
emitted (into infinite volume) by an accelerated charge is infinite, the average number
of photons per unit volume in blackbody radiation is perfectly finite, as the reader
may easily confirm by integrating ρ(ν, T )/hν, with the energy density ρ(ν, T ) given
by the Planck formula (1.22), over all photon frequencies ν. Of course, if we consider
an infinite volume box, the total number of photons is again infinite. Inasmuch as the
formulation of a scattering theory typically presupposes the asymptotic propagation of
incoming and outgoing particles through an infinite spatial volume, it is not surprising
that we encounter formal difficulties due to the concomitant appearance of infinitely
many very soft photons of arbitrarily long wavelength and correspondingly low energy.
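The finiteness of the blackbody photon density mentioned above is easy to confirm numerically. The following Python sketch assumes the standard form of the Planck formula, ρ(ν, T) = (8πhν³/c³)/(e^{hν/kT} − 1), for the expression referred to as (1.22); the function names, the Simpson's-rule quadrature and the truncation of the integral at x = hν/kT = 60 are choices made for this illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def bose_integral(x_max=60.0, steps=6000):
    """Simpson's rule for the integral of x^2 / (e^x - 1) from 0 to x_max.
    The integrand behaves like x near x = 0, so the infrared end is harmless."""
    def f(x):
        return 0.0 if x == 0.0 else x * x / math.expm1(x)
    h = x_max / steps                      # steps must be even for Simpson's rule
    total = f(0.0) + f(x_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

def photon_number_density(temperature):
    """Photons per cubic metre: integral of rho(nu, T) / (h nu) over all nu,
    which reduces to 8 pi (k T / h c)^3 times the dimensionless integral."""
    scale = K_B * temperature / (H * C)
    return 8.0 * math.pi * scale ** 3 * bose_integral()

print(bose_integral())
print(photon_number_density(300.0))
```

The dimensionless integral converges to 2ζ(3), so the density is finite and grows as T³. The extra factor of 1/ν relative to the energy density is compensated by the ν² phase-space factor, which is what tames the infrared end in a finite volume.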
5 The classical term for this sort of process is Bremsstrahlung—German for “braking radiation”.
Two special cases in this class are of particular importance for the present discussion,
corresponding to the fermionic field (which, in virtue of the similarities of the model
6 A review of the discussion of the general relation between particles and fields provided in Section 9.6
may be useful at this point.
to four-dimensional QCD, we shall dub the “quark” field) being either (a) extremely
massive, m >> e (note that in two spacetime dimensions the gauge field is dimen-
sionless and the charge coupling constant e has dimensions of mass, as required by a
dimensionless action), or (b) massless m = 0—the famous “Schwinger model”. First,
note that in one spatial dimension, gauge-fixing to axial gauge A1 = 0 (equivalent in this
case to the transverse or Coulomb gauge ∂i Ai = ∂1 A1 = 0) leaves only the auxiliary,
non-dynamical A0 field, responsible for the static Coulomb interaction, with the Green
function (in one space dimension)
$$-\nabla^{2}V(x) = \delta(x) \;\Rightarrow\; -\frac{\partial^{2}}{\partial x_{1}^{2}}V(x_{1}) = \delta(x_{1}) \;\Rightarrow\; V(x_{1}) = -\frac{1}{2}|x_{1}| \tag{19.47}$$
with the charge density given by ρ(x) = eψ̄(x)γ0 ψ(x). There are no transverse degrees
of freedom in one space dimension, so real “photons” are absent in this model: the
entire physics induced by the gauge field is incorporated in the Coulomb interaction
(in Coulomb gauge). Even classically, this theory confines charge, as we see that
the Coulomb potential grows linearly with the separation of charges: thus if we set
ρ(x₁) = +Qδ(x₁) − Qδ(x₁ − L), the static Coulomb energy of the oppositely charged
pair is ½Q²L, so an infinite amount of energy would be required to completely isolate
either charge from the other. The result is easily understood from Gauss’s Law: the
electric flux leaves the +Q charge with magnitude Q and energy density Q²/2 and
travels directly to the −Q charge along the only available spatial axis, giving a total
electrostatic energy Q²L/2.
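The two ways of obtaining this energy, from the Green function (19.47) and from the flux picture, can be compared directly. The Python sketch below is a minimal illustration with hypothetical function names: point charges are given as (charge, position) pairs, the pair energy is computed as ½ Σ qᵢqⱼV(xᵢ − xⱼ), and the field energy is computed by a midpoint-rule integral of E²/2 on a grid whose cell boundaries coincide with the charge positions.

```python
def green_1d(x):
    """Green function of -d^2/dx^2 in one space dimension, V(x) = -|x|/2."""
    return -0.5 * abs(x)

def energy_from_potential(charges):
    """(1/2) * sum over i, j of q_i q_j V(x_i - x_j); V(0) = 0, so no self-energy."""
    return 0.5 * sum(qi * qj * green_1d(xi - xj)
                     for qi, xi in charges for qj, xj in charges)

def electric_field(x, charges):
    """E = -dV/dx: each charge contributes +q/2 to its right and -q/2 to its left."""
    return sum(0.5 * q * (1.0 if x > xq else -1.0) for q, xq in charges)

def energy_from_field(charges, x_min, x_max, cells=3000):
    """Midpoint-rule integral of E^2 / 2 over [x_min, x_max]."""
    h = (x_max - x_min) / cells
    return sum(0.5 * electric_field(x_min + (i + 0.5) * h, charges) ** 2 * h
               for i in range(cells))

Q, L = 2.0, 3.0                      # illustrative values
pair = [(+Q, 0.0), (-Q, L)]
print(energy_from_potential(pair))
print(energy_from_field(pair, -L, 2.0 * L))
print(0.5 * Q * Q * L)
```

For a neutral pair the field vanishes outside the interval between the charges, so both computations reproduce Q²L/2, and doubling the separation doubles the energy.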
This is in contrast to the situation in three space dimensions, of course, where the
static electric flux originating on a charge spreads out throughout the ambient three-
dimensional volume, decreasing the energy density and giving rise to an electrostatic
interaction energy of separated charges falling inversely with their separation. In
the present case, the asymptotic spectrum cannot contain increasingly far-separated
“quarks” (with no intervening charged particles) without incurring an arbitrarily large
energy penalty.
On the other hand, we expect, in the limit of very heavy “quarks” (m >> e), to
find non-relativistic bound quark–antiquark states (analogous to the “onium” mesons
of QCD) of zero total charge. In fact, the only stable particles in the theory correspond
to neutral bosons, which can undergo non-trivial scatterings. As we decrease the mass
m relative to the coupling e, eventually reaching the “strong-coupling” regime of
e >> m, the stable bosons of the theory become, somewhat paradoxically, weakly
coupled (Coleman, 1976), and in the exactly massless limit for the quark (the original
7 As pointed out by Coleman, the physics of the massive model is enriched in an interesting way by
allowing for an external electric field, in which case additional terms appear in the Hamiltonian. Here we
set this field to zero. See (Coleman, 1976) for a detailed discussion of the general case.
“Schwinger model”) the spectrum of the theory collapses to that of a single free,
neutral, massive boson (with mass $e/\sqrt{\pi}$). In fact, the gauge-invariant operators of
the theory can all be re-expressed in terms of a scalar field φ, in terms of which the
Hamiltonian density reads $\mathcal{H} = \frac{1}{2}\left(\pi_{\phi}^{2} + (\partial_{1}\phi)^{2} + e^{2}\phi^{2}/\pi\right)$.
The linear form of the Coulomb potential in the Schwinger model is, of course,
a kinematical consequence of the single spatial dimension available for the spread of
electric flux. In two spatial dimensions the Coulomb potential (Green function of the
Laplacian) grows logarithmically with distance, still providing charge confinement in
the abelian case, although a “weaker” form than in one spatial dimension, while in
three space dimensions we have the usual 1/r falloff, allowing us to isolate charged
particles from one another, although not, as we have seen, from the ever present
“cloud” of infrared low-energy photons. In non-abelian models, on the other hand,
there are strong arguments to believe that linear confinement persists even in two
or three space dimensions, as a consequence of the very non-trivial self-interacting
dynamics of the gauge fields of the theory. In the remainder of this chapter we shall
see how the methods of lattice gauge theory can be used to provide both analytic and
numerical support for this hypothesis.
As in the case of spontaneous symmetry-breaking, the physics of confinement
primarily concerns the long-distance properties of the theory, and we may therefore
expect that the details of the theory at very short-distance scales are unimportant,
as long as we take care to regularize the theory in a way that does not do violence
to those features of the theory that are intimately connected with the long-distance
phenomenon of interest. In our case, the features in question are those related directly
to the local gauge symmetry of the Lagrangian, which we take to be unbroken in the
Lagrangian, and with the associated remaining global symmetry (after gauge-fixing)
preserved by the vacuum of the theory (in other words, we are not in a Higgs phase of
the theory where the gauge fields of the theory are screened by a vacuum condensate
of charged fields).
In addition, we shall work in Euclidean space, as it is easy to formulate a simple and
direct criterion for confinement (or non-confinement) of matter fields in any particular
representation of the gauge group in an imaginary-time formulation, as we shall soon
see. The regularization of the theory at short distances will be performed by working
on a four-dimensional hypercubic lattice with a large but finite number of points L
in each (Euclidean) spacetime dimension, with a lattice spacing a separating nearest
neighbors in each spacetime direction (see Fig. 19.4, where a small section of the lattice,
in the μ, ν plane, is shown). We shall assume for definiteness that we are dealing with
a single unbroken gauge group SU(N) (the abelian case U(1) can be treated in a
completely analogous fashion), with the dynamics specified by a Euclidean functional
integral, as in (15.161–15.164), for the continuum theory.
The matter fields of the theory will be identified with field variables localized on
the sites of the lattice, which we will label with bold-faced Roman letters n, m, etc.
Thus, a bosonic scalar (resp. Dirac) field in the fundamental representation will be
specified at location n on the lattice as φn (resp. ψn ), where the gauge group “color”
index (and, in the fermion case, Dirac) indices are suppressed. The continuum vector
gauge fields $A^\alpha_\mu(x)$ of the theory lying in the adjoint representation (thus, the index
$\alpha = 1, 2, \ldots, N^2 - 1$) are encoded in an N × N matrix field chosen to simplify the task
Fig. 19.4 A slice through a Euclidean hypercubic lattice supporting a lattice gauge field. (Labels in the figure: lattice spacing a, site n, link variable $U_{n,\mu}$, plaquette variable $U_{n;\mu\nu}$.)
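$$A_\mu \equiv t_\alpha A^\alpha_\mu \tag{19.49}$$
while the gauge field appearing in (19.49) transforms like (cf. (15.105))
$$A_\mu(x) \to A^\Lambda_\mu(x) = \Lambda(x)A_\mu(x)\Lambda^\dagger(x) + \frac{i}{g}(\partial_\mu\Lambda(x))\Lambda^\dagger(x) \tag{19.51}$$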
The symbol U , formerly used for the local gauge transformations now denoted by
Λ in (19.50), will instead be used to denote the parallel gauge transporter, which is
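defined, for the infinitesimal path (x + dx, x) corresponding to the straight segment
from x to x + dx, to be the transformation
$$U(x+dx, x) \equiv 1 - igA_\mu(x)\,dx_\mu \tag{19.52}$$
It follows from the calculation
$$\begin{aligned}
\Lambda(x+dx)\,U(x+dx,x)\,\Lambda^\dagger(x) &= \Lambda(x+dx)(1 - igA_\mu dx_\mu)\Lambda^\dagger(x) \\
&= (\Lambda(x) + \partial_\mu\Lambda(x)dx_\mu)(1 - igA_\mu dx_\mu)\Lambda^\dagger(x) \\
&= 1 + dx_\mu\big((\partial_\mu\Lambda(x))\Lambda^\dagger(x) - ig\Lambda(x)A_\mu\Lambda^\dagger(x)\big) \\
&= 1 - ig\Big(\Lambda(x)A_\mu\Lambda^\dagger(x) + \frac{i}{g}(\partial_\mu\Lambda(x))\Lambda^\dagger(x)\Big)dx_\mu \\
&= 1 - igA^\Lambda_\mu(x)\,dx_\mu \equiv U^\Lambda(x+dx,x)
\end{aligned} \tag{19.53}$$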
that the transporter U (x + dx, x) serves to “shift” a matter field localized at point x to
one transforming under the local gauge transformation appropriate for point x + dx:
The parallel transport property for the infinitesimal path (x + dx, x) generalizes in an
obvious way to finite paths specified by some continuous contour Ca→b from spacetime
point a to point b, as we may simply divide the path into infinitesimal segments
and form the path-ordered product8 of U (x + dx, x) transporters to obtain a finite
transporter
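$$U(C_{a\to b}) = P\exp\Big\{-ig\int_{C_{a\to b}} A_\mu\,dx_\mu\Big\} \tag{19.55}$$
transforming like
$$U(C_{a\to b}) \to \Lambda(b)\,U(C_{a\to b})\,\Lambda^\dagger(a) \tag{19.56}$$
For a closed path, we end up with a transporter transforming covariantly under the
adjoint representation of the gauge group,
$$U(C_{a\to a}) \to \Lambda(a)\,U(C_{a\to a})\,\Lambda^\dagger(a) \tag{19.57}$$
from which it follows (from $\Lambda^\dagger(a)\Lambda(a) = 1$) that the trace of any closed-path transporter is gauge-invariant:
$$\mathrm{Tr}\,U(C_{a\to a}) \to \mathrm{Tr}\big(\Lambda(a)\,U(C_{a\to a})\,\Lambda^\dagger(a)\big) = \mathrm{Tr}\,U(C_{a\to a}) \tag{19.58}$$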
For the special case of straight line contours, the parallel transporter satisfies a familiar
type of first-order equation, analogous to the equation (4.20) satisfied by the time-
ordered interaction-picture evolution operator (4.28). Suppose the contour is just a
straight line path in the Euclidean “time” (fourth) direction, from the spacetime point
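$y \equiv (\mathbf{x}, y_4)$ to the point $x \equiv (\mathbf{x}, x_4)$, where $x_4 > y_4$. Then we have
$$\frac{\partial}{\partial x_4}U(C_{y\to x}) = \lim_{\Delta x_4\to 0}\frac{1}{\Delta x_4}\big(e^{-igA_4(\mathbf{x},x_4)\Delta x_4} - 1\big)U(C_{y\to x}) = -igA_4(\mathbf{x},x_4)\,U(C_{y\to x}) \tag{19.59}$$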
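The transformation behavior of finite transporters can be checked numerically. The following is a minimal sketch, assuming a path discretized into six links carrying random SU(2) matrices; the helper names (`random_su2`, `transporter`, `gauge_transform`) are illustrative and not taken from the text. It verifies that an open-path transporter picks up gauge factors only at its endpoints, and that the trace of a closed-path transporter is unchanged.

```python
import math
import random


def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    a = complex(q[0], q[1]) / norm
    b = complex(q[2], q[3]) / norm
    return [[a, b], [-b.conjugate(), a.conjugate()]]


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(x):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]


def trace(x):
    return x[0][0] + x[1][1]


def transporter(links):
    """Path-ordered product of link matrices: later links stand to the left."""
    u = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for link in links:
        u = matmul(link, u)
    return u


def gauge_transform(links, lam):
    """links[k] runs from point k to point k+1; lam[k] acts at point k."""
    return [matmul(matmul(lam[k + 1], links[k]), dagger(lam[k]))
            for k in range(len(links))]


rng = random.Random(0)
links = [random_su2(rng) for _ in range(6)]
lam_open = [random_su2(rng) for _ in range(7)]
lam_closed = lam_open[:6] + [lam_open[0]]  # closed loop: last point = first point

u = transporter(links)
u_open = transporter(gauge_transform(links, lam_open))
u_closed = transporter(gauge_transform(links, lam_closed))
# Expected open-path result: Lambda(b) U Lambda^dagger(a)
expected_open = matmul(matmul(lam_open[6], u), dagger(lam_open[0]))
```

Comparing `u_open` with `expected_open` entry by entry, and `trace(u_closed)` with `trace(u)`, should give agreement to rounding error; the intermediate gauge factors cancel pairwise inside the product.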
Of particular interest in the discretized lattice version of the theory are the parallel
transporters corresponding to links connecting nearest neighbor sites on the lattice.
Thus, we define (see Fig. 19.4) Un,μ as the parallel transporter for the path (n + aμ̂, n)
8 The finite path-ordered product is defined, analogously to the time-ordered product of (4.27), by
expanding the exponential and ordering the gauge-field factors in each term so that “later” fields along the
path are placed to the left.
extending from site n in the positive μ direction by one lattice spacing. As a consequence of (19.56), this object transforms under local SU(N) gauge transformations,
specified by assigning an SU(N) element $\Lambda_n$ to each lattice site n, as follows:
$$U_{n,\mu} \to \Lambda_{n+\hat\mu}\,U_{n,\mu}\,\Lambda_n^\dagger \tag{19.60}$$
For the time being we shall assume that we are dealing with smooth classical fields, so
that aAμ (n + aν̂) may be approximated in the exponent by a(Aμ (n) + a∂ν Aμ (n)),
neglecting terms of order $a^3$. The exponentials can be combined using a Baker–Campbell–Hausdorff formula
$$e^{aX}e^{aY} = e^{aX + aY + \frac{1}{2}a^2[X,Y] + O(a^3)} \tag{19.62}$$
with Fμν the N × N hermitian matrix field strength tensor (cf. (15.108))
As expected, the closed-path plaquette variable Un;μν inherits the covariant adjoint
transformation property (19.57) from the field-strength tensor which has the same
transformation behavior. The trace of this quantity is easily seen to be exactly
invariant under the full set of local lattice gauge transformations specified by (19.60).
Moreover, the real part of the trace of the plaquette transporter can be expanded for
small lattice spacing, giving
$$\mathrm{Re\,Tr}(U_{n;\mu\nu}) = \frac{1}{2}\mathrm{Tr}(U_{n;\mu\nu} + U^\dagger_{n;\mu\nu}) = \mathrm{Tr}(1) - \frac{1}{2}g^2a^4\,\mathrm{Tr}(F_{\mu\nu}(n)F_{\mu\nu}(n)) + O(a^5) \tag{19.65}$$
Note that the trace of the term linear in the exponent vanishes, as the exponent must
be anti-hermitian (Un;μν is unitary). The second term on the right is nothing but
the usual pure gauge Lagrangian density (15.111) (in Euclidean space, so there are
no raised indices). The higher terms in (19.65), of order $a^5$ or higher, correspond by
dimensional analysis to operators of dimension 5 or higher—exactly the ones which
in a cutoff effective Lagrangian correspond to “irrelevant” operators, to which the
low-energy physics should be insensitive, as we saw in Chapter 16. The continuum
gauge action corresponding to (15.111) (in Euclidean space) becomes, after a naive
discretization on a hypercubic lattice,
$$S_{\rm gauge} = \frac{1}{4}\int \mathrm{Tr}(F_{\mu\nu}F_{\mu\nu})\,d^4x \;\to\; \frac{1}{4}a^4\sum_n \mathrm{Tr}(F_{\mu\nu}(n)F_{\mu\nu}(n)) \tag{19.66}$$
Comparing this with (19.65), we see that the usual continuum action, once regulated
on a lattice, corresponds up to irrelevant (dimension 5 and higher) operators with the
Wilson lattice action (Wilson, 1974)
$$S_{\rm Wils,latt} = \beta\sum_{n,\mu<\nu}\Big(1 - \frac{1}{N}\mathrm{Re\,Tr}(U_{n;\mu\nu})\Big), \qquad \beta \equiv \frac{N}{g(a)^2} \tag{19.67}$$
$$S_{\rm Wils,latt} = \beta\sum_{n,\mu<\nu}\big(1 - \cos(\theta_{n;\mu\nu})\big) \tag{19.68}$$
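The small-lattice-spacing behavior underlying (19.65) and (19.68) can be illustrated in the abelian case. The sketch below assumes a two-dimensional slice with hypothetical smooth field components `a1` and `a2`, and assumes each link angle is g·a·A evaluated at the base site of the link; none of these choices come from the text. It checks that the plaquette angle approaches $g a^2 F_{12}$ and that $1 - \cos\theta$ approaches $\frac{1}{2}g^2a^4F_{12}^2$.

```python
import math


def a1(x, y):
    return math.sin(y)          # hypothetical smooth field component A_1


def a2(x, y):
    return math.cos(2.0 * x)    # hypothetical smooth field component A_2


def field_strength(x, y):
    """F_12 = d_1 A_2 - d_2 A_1 for the two fields above."""
    return -2.0 * math.sin(2.0 * x) - math.cos(y)


def plaquette_angle(x, y, a, g=1.0):
    """Oriented sum of link angles g*a*A_mu around the plaquette based at (x, y)."""
    return g * a * (a1(x, y) + a2(x + a, y) - a1(x, y + a) - a2(x, y))


def wilson_term(theta):
    """1 - cos(theta), written to avoid cancellation at small theta."""
    return 2.0 * math.sin(0.5 * theta) ** 2


x0, y0 = 0.3, 0.7
f12 = field_strength(x0, y0)
ratio_coarse = plaquette_angle(x0, y0, 1e-2) / (1e-2 ** 2 * f12)
ratio_fine = plaquette_angle(x0, y0, 1e-3) / (1e-3 ** 2 * f12)
action_ratio = wilson_term(plaquette_angle(x0, y0, 1e-2)) / (0.5 * 1e-2 ** 4 * f12 ** 2)
```

Both ratios tend to one as the spacing shrinks, with a deviation linear in a, which is the abelian counterpart of the higher-dimension corrections discussed above.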
The continuum limit again corresponds to taking the coupling β large, which forces
the path integral to concentrate in the region of small θn;μν . The specific choice of
periodic function used here is to a large extent a matter of convenience: different
functions with the same Gaussian behavior for small plaquette angles correspond to
effective Lagrangians at the UV cutoff scale differing by higher-dimension operators
(i.e., higher powers of θn;μν ∼ Fμν ) which we expect to be in the same universality
class (cf. Section 17.4) as the theory defined by the Wilson action, say. For example,
a very useful choice for analytic computations is the Villain U(1) action:
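$$S_{\rm Vill,latt} = \sum_{n,\mu<\nu} S_{\rm Vill}(\theta_{n;\mu\nu}) \tag{19.69}$$
$$S_{\rm Vill}(\theta) \equiv -\log\sum_{m=-\infty}^{+\infty}\exp\Big\{-\frac{\beta}{2}(\theta - 2m\pi)^2\Big\} \tag{19.70}$$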
The Wilson and Villain actions (for β = 5) are displayed in Fig. 19.5: it is apparent
that the small θ behavior is identical; we are free to use either as our regularized version
of the abelian gauge theory. Similarly, a non-abelian Villain lattice action (say, for the
gauge group SU(2)) can be defined by choosing the single plaquette action:
$$S_{\rm Vill}(U_{n;\mu\nu}) = -\log\sum_{m=-\infty}^{+\infty}\exp\Big\{-\frac{\beta}{2}\Big(\arccos\big(\tfrac{1}{2}\mathrm{Tr}(U_{n;\mu\nu})\big) - 2m\pi\Big)^2\Big\} \tag{19.71}$$
Fig. 19.5 Comparison of Wilson and Villain actions for a U(1) lattice gauge theory. (Plot of S(θ) against θ from −π to +π, with one curve for each action.)
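The comparison of the two single-plaquette actions can be reproduced with a few lines of code. This is a minimal sketch of (19.68) and (19.70) for one plaquette; the truncation `m_max` of the infinite sum is an assumption made for the illustration (the neglected terms are far below double precision for the values used).

```python
import math


def wilson_action(theta, beta):
    """Single-plaquette Wilson U(1) action, beta * (1 - cos(theta))."""
    return beta * (1.0 - math.cos(theta))


def villain_action(theta, beta, m_max=20):
    """Single-plaquette Villain action, -log sum_m exp(-(beta/2)(theta - 2 pi m)^2)."""
    total = sum(math.exp(-0.5 * beta * (theta - 2.0 * math.pi * m) ** 2)
                for m in range(-m_max, m_max + 1))
    return -math.log(total)


beta = 5.0
small_angle_gap = abs(villain_action(0.1, beta) - wilson_action(0.1, beta))
periodicity_gap = abs(villain_action(0.3 + 2.0 * math.pi, beta) - villain_action(0.3, beta))
edge_values = (wilson_action(math.pi, beta), villain_action(math.pi, beta))
```

At small angles the two actions agree closely, both are 2π-periodic, and they differ substantially only near θ = ±π, where the Boltzmann weight is already strongly suppressed for large β.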
where the integrals over the link variables Un,μ ∈ SU(N ) are the usual Hurwitz measure
ones (cf. (15.140)). Note that the local gauge group is now compact, as it is simply the
direct product of a finite number of independent SU(N ) groups acting on each lattice
site. The problem of an infinite gauge group volume which plagued the continuum
formulation of the theory, and required the insertion of a gauge-fixing prescription
to provide unambiguous finite results for correlation functions computed from the
functional integral, has simply disappeared. We may therefore perform an unrestricted
integration over the link variables Un,μ , provided the observables O[φn , Un,μ ] being
averaged in the functional integral are themselves gauge-invariant:
$$\langle O\rangle = \int\prod_n d\phi_n\prod_{n\mu}dU_{n,\mu}\;O[\phi_n,U_{n,\mu}]\,e^{-(S_{\rm Wils,latt}+S_{\rm matter,latt})}/Z_{\rm latt} \tag{19.73}$$
while the removal is accomplished by $\bar\psi(\mathbf{y}, T)\,U(C_{(\mathbf{x},T)\to(\mathbf{y},T)})\,\psi(\mathbf{x}, T)$ (see Fig. 19.6),
where the contours are chosen for simplicity to be straight line spatial paths connecting
the locations x and y at fixed time 0 or T .
We also assume, without explicitly indicating this in the notation, that the quark
and antiquark (although of equal mass M ) are of different “flavors”—i.e., one is not
the antiquark of the other—to eliminate the possibility of mutual annihilation (into
pure gauge energy). Instead, the two heavy objects are forced to propagate over a
large Euclidean time T , after which they are removed from the system. The Euclidean
amplitude for this process, written for the time being in the continuum theory, is
represented schematically (ignoring overall normalization, gauge-fixing issues, etc.) by
the functional integral
where a, b, c, d are the spacetime points (x, 0), (x, T ), (y , T ), (y , 0), as indicated (super-
imposed on a spacetime lattice) in Fig. 19.6. In the final line the integral over the
fermionic quark fields has been performed for each fixed gauge field in the remaining
DAαμ functional integral. Thus, the function SE (b, a; A), for example, is the Euclidean
Dirac propagator for the massive quark propagating in the background classical gauge
field Aαμ . The trace appearing in (19.75) is over gauge group (i.e., fundamental
representation) indices, and we have ignored an irrelevant overall minus sign arising
from permuting the Grassmann quark fields.
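Fig. 19.6 (Figure: rectangular contour superimposed on the spacetime lattice, with corners a and d at t = 0, corners b and c at t = T, and spatial extent R between x and y.)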
We have already seen in Chapter 11 (see the discussion following (11.50)) that the
exchange of transverse gauge particles between charged Dirac fermions is suppressed in
the limit where the fermion mass(es) are taken large, and that moreover, in the extreme
static limit (with the mass taken to infinity), the spatial momentum dependence of
the propagator and the coupling to spatial gauge fields disappears entirely, plausibly
enough, as an infinitely massive particle is insensitive to the transfer of any finite
amount of momentum, and, in the absence of acceleration, cannot radiate or absorb
real gauge quanta.
In the static limit therefore, the Euclidean propagator for our quarks is just the
Green function for the Euclidean Dirac operator appearing in (15.162), but with the
spatial derivatives and spatial components of the gauge field set to zero, leaving only
the fourth (Euclidean “time”) components
where we have reverted in the final expression to the use of the matrix gauge field
A4 = tα Aα4 , and removed a factor of the coupling constant from the field to maintain
consistency with our notation throughout this section. Thus, our static propagator in
a background continuum field satisfies
Using (19.59), we may write down the solution to (19.77) (see Problem 6)
$$S_E(x,y;A) = e^{-M|x_4-y_4|}\,\delta^3(\mathbf{x}-\mathbf{y})\,\big(P_+\theta(y_4-x_4) + P_-\theta(x_4-y_4)\big)\,U(C_{(\mathbf{y},y_4)\to(\mathbf{x},x_4)}) \tag{19.78}$$
with $P_\pm \equiv (1 \pm \hat\gamma_4)/2$ the projection operators appropriate for quark and antiquark
propagation (in the static limit). Inserting (19.78) into (19.75) we see that the
Euclidean propagation amplitude for our quark–antiquark pair is proportional (using
cyclicity of the trace) to the Wilson loop variable $W(R,T) = \mathrm{Tr}(U_{C_{a\to b\to c\to d\to a}}) \equiv \mathrm{Tr}(U_{C(R,T)})$ corresponding to the gauge-invariant trace of the closed rectangular
contour displayed in Fig. 19.6, averaged in the Euclidean functional integral over
gauge fields distributed according to a Boltzmann weight $e^{-S_{\rm gauge}}$ determined by
the pure gauge action. If our contour is very long in the Euclidean time direction
($T \gg R = |\mathbf{x} - \mathbf{y}|$ in Fig. 19.6), the Euclidean propagation amplitude must acquire,
by reasoning familiar from Section 4.2, a factor $e^{-V(\mathbf{x}-\mathbf{y})T}$, where the static potential
energy $V(\mathbf{x}-\mathbf{y})$ is defined as that of the minimum-energy state into which the quark–
antiquark pair introduced at t = 0 (plus the gauge gluons with which they interact)
can rearrange itself, or alternatively, the minimum-energy state with a non-vanishing
matrix element of the bilocal operator $\bar\psi(\mathbf{x},0)\,U(C_{(\mathbf{y},0)\to(\mathbf{x},0)})\,\psi(\mathbf{y},0)$ to the vacuum.
For the lattice-regularized theory, this quantity, defined mathematically as
$$V(R) \equiv \lim_{T\to\infty}\Big\{-\frac{1}{T}\log\langle W(R,T)\rangle\Big\} \tag{19.79}$$
can be numerically estimated by generating a large ensemble of statistically indepen-
dent gauge field (i.e., link) configurations according to the Boltzmann weight arising
from the Wilson action using Monte Carlo techniques, and then averaging the Wilson
loop variable (for various choices of R and T ) over this ensemble. This program,
initiated in the late 1970s by Creutz (Creutz, 1980), has been pushed to quite large
lattices and a high level of statistical precision, and there is by now overwhelming
numerical evidence that V(R), in addition to the expected Coulombic behavior at
short distances (where we expect perturbative behavior due to the asymptotic freedom
property described in the preceding chapter), rises linearly with R at large
separations (see Fig. 11.13).
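The extraction of V(R) from measured loop averages, following (19.79), can be sketched as follows. Since no Monte Carlo data are available here, the loop values are synthetic: they are generated from an assumed potential of the form K R − α/R + c (with illustrative parameters) plus a hypothetical excited-state contamination. The estimator itself, the logarithm of the ratio of loops at successive T, is a common practical stand-in for the T → ∞ limit and is not specified in the text.

```python
import math


def effective_potential(loops, r, t):
    """V_eff(R, T) = log(W(R, T) / W(R, T + 1)); tends to V(R) as T grows."""
    return math.log(loops[(r, t)] / loops[(r, t + 1)])


def synthetic_loops(potential, r_max, t_max, excited_gap=0.8, excited_weight=0.5):
    """Hypothetical stand-in for Monte Carlo averages <W(R, T)>."""
    loops = {}
    for r in range(1, r_max + 1):
        for t in range(1, t_max + 1):
            ground = math.exp(-potential(r) * t)
            loops[(r, t)] = ground * (1.0 + excited_weight * math.exp(-excited_gap * t))
    return loops


def model_potential(r, k=0.05, alpha=0.3, c=0.2):
    """Assumed form V(R) = K R - alpha / R + c in lattice units (illustrative)."""
    return k * r - alpha / r + c


loops = synthetic_loops(model_potential, r_max=8, t_max=21)
v_eff = {r: effective_potential(loops, r, 20) for r in range(1, 9)}
early_estimate = effective_potential(loops, 4, 1)   # contaminated by excited states
slope = v_eff[8] - v_eff[7]                         # approaches K at large R
```

With these inputs the large-T estimate reproduces the input potential, the small-T estimate lies above it, and the discrete slope at large R approaches the input string tension up to the residual Coulomb contribution α/(R(R+1)).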
The appearance of a linearly rising static potential was already demonstrated by
Wilson in the strong coupling limit where $\beta = N/g^2$ is taken small, in his seminal
paper on lattice gauge theory (Wilson, 1974). We shall briefly explain the reason for
this result here. For compact Lie groups G (such as the SU(N ) groups considered here),
any invariant function on the group $f(U)$, with $f(V^{-1}UV) = f(U)\ \forall\, U, V \in G$, has a Fourier
expansion in terms of the character functions χr (U ) associated with a complete set of
unitary representations of the group (labeled by the index r). The character function
χr (U ) is simply the trace of the unitary matrix representing the group element U in
the rth representation, of dimension dr ,
and any invariant function, and in particular the exponential function $e^{\beta(\frac{1}{N}\mathrm{Re\,Tr}(U_P) - 1)}$
appearing in the lattice functional integral from the Wilson gauge lattice action (where
P denotes a particular plaquette) can be expanded
$$e^{\beta(\frac{1}{N}\mathrm{Re\,Tr}(U_P) - 1)} = \sum_r c_r(\beta)\chi_r(U_P) = c_0(\beta)\Big(1 + \sum_{r\neq 0}\frac{c_r(\beta)}{c_0(\beta)}\chi_r(U_P)\Big) \tag{19.81}$$
where we have separated out explicitly the contribution of the trivial representation
(r = 0) with χ0 (U ) = 1. For concreteness, let us take the case of gauge group SU(2).
The representations are labeled by the index j, which can be integer or half-integer,
with the fundamental (spinor) representation corresponding to j = 1/2. In this case the
coefficient ratios appearing in the sum are
Note that for small β, the leading contributions arise from the use of representations
with minimum dimensionality (which give a non-zero contribution to the desired
amplitude). For SU(N), the lowest-dimensional non-trivial representation is the fundamental, which we shall denote with subscript “F” (thus for SU(2), $c_F = c_{j=1/2}$).
The Schur orthogonality theorem for the finite-dimensional irreducible unitary
matrix representations of SU(N) (where the superscripts r, s identify the representation)
$$\int dU\,(U^{(r)}_{ij})^*\,U^{(s)}_{mn} = \frac{1}{d_r}\delta_{rs}\delta_{im}\delta_{jn} \tag{19.83}$$
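For SU(2) the orthogonality of characters and the coefficients of the character expansion can be checked by direct quadrature. The sketch below relies on two standard facts that are assumptions relative to the text: a class function of U depends only on the angle θ of its eigenvalues $e^{\pm i\theta}$, with invariant measure $(2/\pi)\sin^2\theta\,d\theta$ on [0, π], and the spin-j character is $\sin((2j+1)\theta)/\sin\theta$. The function names are illustrative.

```python
import math


def character(two_j, theta):
    """SU(2) character chi_j at class angle theta; two_j = 2j is an integer."""
    return math.sin((two_j + 1) * theta) / math.sin(theta)


def group_average(f, n=4000):
    """(2/pi) * integral over [0, pi] of sin^2(theta) f(theta), midpoint rule."""
    h = math.pi / n
    return sum((2.0 / math.pi) * math.sin(t) ** 2 * f(t) * h
               for t in ((i + 0.5) * h for i in range(n)))


def overlap(two_j, two_k):
    """Group integral of chi_j * chi_k (characters of SU(2) are real)."""
    return group_average(lambda t: character(two_j, t) * character(two_k, t))


def coefficient(two_j, beta):
    """c_j(beta): projection of exp(beta * (Tr(U)/2 - 1)) onto chi_j."""
    return group_average(
        lambda t: character(two_j, t) * math.exp(beta * (math.cos(t) - 1.0)))


beta = 0.2
# Ratio c_F / (d_F * c_0) for the fundamental representation (d_F = 2)
fundamental_ratio = coefficient(1, beta) / (2.0 * coefficient(0, beta))
```

The overlaps reproduce $\delta_{jk}$, and at small β the ratio `fundamental_ratio` is close to β/4, which is the factor attached to each plaquette in the leading strong-coupling contribution.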
In the limit β → 0, the leading contribution to W (R, T ) comes from picking the
minimum set of non-trivial representations for each plaquette appearing in the product
over plaquettes in (19.87), compatible with obtaining a non-zero result for the integral
over links. At a minimum, we must include a full set of RT plaquettes “tiling”
the interior of the rectangular R × T contour given by the Wilson loop. Otherwise,
there will be unmatched link variables (appearing only once) which integrate to zero.
Moreover, this minimum set of plaquettes must all be associated with the fundamental
representation in order to obtain a non-zero result, as the boundary links appearing
in χF (UC(R,T ) ) are in this representation, and integrals over products of characters
in different representations vanish by Schur orthogonality, (19.84). The upshot of this
reasoning (see (Montvay and Münster, 1994), Section 3.4, for the gory combinatoric
details) is that the Wilson loop expectation value in the strong coupling limit satisfies9
$$\langle W(R,T)\rangle \propto \Big(\frac{c_F(\beta)}{d_F\,c_0(\beta)}\Big)^{RT} \quad (\text{small } \beta) \tag{19.88}$$
For SU(2), for example, this becomes
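$$\langle W(R,T)\rangle \propto u(\beta)^{RT}, \qquad u(\beta) = \frac{I_2(\beta)}{I_1(\beta)} \sim \frac{\beta}{4} + O(\beta^3) \tag{19.89}$$
$$\text{i.e.}\quad \langle W(R,T)\rangle \sim e^{-KRT} \equiv e^{-TV(R)}, \qquad V(R) = KR, \qquad K \equiv -\log(u(\beta)) > 0 \tag{19.90}$$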
9 The factors appearing here originate as follows. Each plaquette variable integration provides a $1/d_F$
factor, pursuant to (19.84), and a character coefficient $c_F(\beta)/c_0(\beta)$, as in (19.81), with the overall factors of $c_0(\beta)$
cancelling between the numerator and denominator (Z) in (19.85).
The falloff of the Wilson loop as a (negative) exponential of the area RT of the loop
(the famous “area law”) clearly indicates a linear rise in the static potential V (R),
as defined in (19.79). The coefficient K appearing in (19.90) is commonly referred to
as the “string tension”. It has dimensions of force, and turns out in QCD to take the
interesting phenomenological value of approximately 15 metric tons: the gluon flux
“string” extending from a single isolated quark could support a rather large truck
(carrying the quantum numbers of a single anti-quark)!
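The numbers in this discussion are easy to reproduce. The sketch below evaluates u(β) and K for SU(2) from the power series of the modified Bessel functions, and converts a string tension to an equivalent weight. The conversion is an assumption-laden illustration: the input $\sqrt{K} \approx 0.44$ GeV is a commonly quoted phenomenological value that does not appear in the text, and the function names are hypothetical.

```python
import math


def bessel_i(n, x, terms=30):
    """Modified Bessel function I_n(x), integer n >= 0, from its power series."""
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))


def u_su2(beta):
    """Strong-coupling factor per plaquette for SU(2), u = I_2(beta) / I_1(beta)."""
    return bessel_i(2, beta) / bessel_i(1, beta)


def string_tension_lattice(beta):
    """K = -log(u(beta)), in units of the inverse lattice spacing squared."""
    return -math.log(u_su2(beta))


def tension_in_tonnes(sqrt_k_gev):
    """Weight (metric tons under standard gravity) equivalent to a tension K."""
    hbar_c = 0.1973269804            # GeV fm
    joule_per_gev = 1.602176634e-10  # J per GeV
    force_newton = sqrt_k_gev ** 2 / hbar_c * joule_per_gev / 1e-15  # GeV/fm -> N
    return force_newton / 9.80665 / 1000.0


u_small = u_su2(0.1)
k_values = (string_tension_lattice(0.1), string_tension_lattice(0.5))
weight = tension_in_tonnes(0.44)  # assumed input, not taken from the text
```

For small β the factor `u_small` is close to β/4, so K grows logarithmically as β → 0; with the assumed input the equivalent weight comes out in the neighborhood of the figure quoted above.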
The physical interpretation of the area-law dependence of the Wilson loop observ-
able in the strong coupling limit is not hard to uncover, given our earlier discussion
of confinement in the massive Schwinger model. From (19.65) we see that the lattice
plaquette variable Re Tr(Un;μν ) corresponds to the square of the μ, ν component of
the color field strength tensor (summed over colors to yield a gauge-invariant object).
If we take our Wilson loop in Fig. 19.6 to be oriented in the Euclidean “x-t” plane,
the plaquettes tiling the interior of the loop correspond to local values of the square
of the color electric field F14 (Euclidean version of F10 in Minkowski space) in the
x-direction. The necessity for including all these plaquettes in the expansion of the
action (in order to obtain a non-vanishing contribution to the functional integral)
amounts to the imposition of a non-abelian Gauss’s Law whereby color flux originating
on the quark must make its way to the anti-quark.
However, if we probe for the presence of color electric flux elsewhere in the volume
of our lattice, by inserting, say, additional plaquette variables somewhere off the plane
of the loop in the observable being measured in the path integral, we find that our
result in (19.88) is suppressed by
additional factors of β, which is, of course, small in the strong coupling limit under
consideration. Similarly, subdominant contributions to the Wilson loop expectation are
obtained from tilings in which the set of plaquettes bordered by the Wilson loop “bulges
away” from the plane of the loop, corresponding to color electric flux “straying” away
from the straight line connecting quark to anti-quark. In fact, the appearance of exactly
linear confinement in 1+1-dimensional gauge theory is now seen to follow simply
from the kinematic impossibility of such straying when only one spatial dimension
is present: indeed, one can easily show that the result (19.88) is exact (for all β) in
1+1-dimensional gauge theory (see Problem 7).
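The exactness of the area law in two dimensions can be illustrated in the simplest abelian setting. The sketch below is an assumption-based example, not the solution to the problem cited above: it takes a U(1) theory with Wilson action, where the single-plaquette factor is $I_1(\beta)/I_0(\beta)$, and a loop enclosing two adjacent plaquettes. The outer links are absorbed into two angles `p1` and `p2`, and the shared link `v` is integrated explicitly.

```python
import math


def bessel_i(n, x, terms=30):
    """Modified Bessel function I_n(x), integer n >= 0, from its power series."""
    return sum((x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))


def single_plaquette_average(beta, n=64):
    """<cos(theta)> with weight exp(beta*cos(theta)), periodic trapezoid rule."""
    thetas = [2.0 * math.pi * i / n for i in range(n)]
    num = sum(math.cos(t) * math.exp(beta * math.cos(t)) for t in thetas)
    den = sum(math.exp(beta * math.cos(t)) for t in thetas)
    return num / den


def two_plaquette_loop(beta, n=24):
    """Loop around two adjacent plaquettes with angles p1 + v and p2 - v."""
    grid = [2.0 * math.pi * i / n for i in range(n)]
    num = 0.0
    den = 0.0
    for p1 in grid:
        for p2 in grid:
            for v in grid:
                weight = math.exp(beta * (math.cos(p1 + v) + math.cos(p2 - v)))
                num += math.cos(p1 + p2) * weight  # loop angle is p1 + p2
                den += weight
    return num / den


beta = 1.0
one = single_plaquette_average(beta)
two = two_plaquette_loop(beta)
exact_factor = bessel_i(1, beta) / bessel_i(0, beta)
```

The two-plaquette loop equals the square of the single-plaquette factor at β = 1, well outside the small-β regime, reflecting the fact that with one spatial dimension there is no room for the flux to stray from the minimal surface.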
A number of rigorous results have been established for the behavior of the Wilson
loop observable in lattice gauge theories. Two results of particular interest can be
stated here:
1. The strong coupling expansion (unlike the weak coupling expansions of per-
turbation theory, which as we saw in Chapter 11 are at best only asymptotic
expansions) is a Taylor expansion: in other words, the lattice observables are
analytic in β at β = 0. Moreover, for at least some finite range of β, the string
tension K(β) is non-vanishing (Osterwalder and Seiler, 1978).
2. The static potential V(R) defined by the limit (19.79) cannot rise more rapidly
than linearly with R at large R (Seiler, 1978): the area law is in this sense maximal. As we must expect at least constant terms in V(R), Euclidean symmetry
implies that a perimeter law $W(R,T) < e^{-C(T+R)}$, for some constant C, is
minimal.
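$$S_V = -\sum_P\log\sum_{m=-\infty}^{+\infty}\exp\Big\{-\frac{\beta_g}{2}\Big(\arccos\big(\tfrac{1}{2}\mathrm{Tr}(U_P)\big) - 2m\pi\Big)^2\Big\} + \frac{\beta_h}{2}\sum_{n\mu}\mathrm{Tr}\big(\phi_n\cdot\sigma\,U^\dagger_{n,\mu}\,\phi_{n+\hat\mu}\cdot\sigma\,U_{n,\mu}\big) \tag{19.91}$$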
where we use the simplified index notation P for plaquettes (thus, P runs over {n, μν}
with μ < ν).10 In the limit of vanishing βh , we are left with a pure non-abelian
gauge theory (in this case, with gauge group SU(2)). On the other hand, when βh
is taken to infinity, the theory becomes a pure abelian U(1) gauge theory. We can
see this by using the local SU(2) gauge symmetry on each lattice site to rotate the
φ field to the 3-direction, so that the second term in (19.91) becomes a sum over
links of $\beta_h\,\mathrm{Tr}(\sigma^3 U^\dagger_{n,\mu}\sigma^3 U_{n,\mu})$, which for $\beta_h \to \infty$ forces the non-abelian link variables
to collapse onto the U(1) subgroup given by $U_{n,\mu} = \exp(i\sigma^3\theta_{n,\mu})$, at which point
we have precisely the abelian Villain model specified earlier in (19.69, 19.70). The
model therefore interpolates smoothly between lattice-regularized unbroken (compact)
10 This is a lattice-regularized version of the so-called Georgi–Glashow model, with an adjoint scalar field
coupled to non-abelian gauge vector fields, in the regime where the isovector “Higgs” field develops a non-vanishing vacuum-expectation value. The “frozen” magnitude of the scalar field can be viewed as arising
from a scalar potential $P(\phi) = \lambda(\phi^2 - 1)^2$ in the limit of large quartic coupling, λ → ∞.
abelian and non-abelian gauge theories. From the discussion in the preceding section
we know that linear confinement (an area law for the Wilson loop observable)
obtains in both cases in the strong coupling expansion, and indeed, it is easy to
verify, for all finite values of βh , that we obtain a non-vanishing string tension for
small βg .
In this interpolating model there is substantial analytic and numerical evidence
that linear confinement persists for all values of the lattice gauge parameter βg . For
example, one can determine numerically an “isotonic” line of constant string tension in
the (βg , βh ) plane (see Fig. 19.7) with just the qualitative features expected from the
semiclassical model of monopole confinement which we shall shortly discuss (Duncan
and Mawhinney, 1990). The essential point is that the physics of confinement evolves
smoothly from the purely abelian case, where we have a mathematically rigorous
and fairly complete physical picture of the confining mechanism, to the much more
complicated non-abelian case, where a proper analytic treatment does not exist.
We begin by examining the physical mechanism responsible for linear confinement
at the abelian end (βh → ∞), where the theory reduces to the Villain model. The path
integral giving the desired Wilson loop observable is just (see Problem 8)
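$$\langle W\rangle = \prod_{n\mu}\int\frac{d\theta_{n,\mu}}{2\pi}\sum_{l_P}\exp\Big(\sum_P\Big(-\frac{1}{2\beta_g}l_P^2 + il_P\theta_P\Big) + i\sum_{n\mu}J_{n,\mu}\theta_{n,\mu}\Big) \tag{19.92}$$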
where the link angle variables θn,μ are continuous, and the plaquette variables lP are
integer valued. The abelian Wilson loop is obtained by setting the lattice vector field
Jn,μ equal to unity on all the ordered links comprising the perimeter of the Wilson
loop (see Fig. 19.6), and zero on all other links. Also, we have used a shorthand index
notation for plaquettes, where P indicates the plaquette in the μ, ν plane with the site
Fig. 19.7 (Figure: plot in the ($\beta_g$, $\beta_h$) plane; caption not recovered.)
Here, right (resp. left) discrete difference operators Δμ (resp. Δ̄μ ), which interconvert
under a discrete integration by parts, are defined as follows
$$\frac{1}{\zeta\cdot\bar\Delta}\,s_n \equiv \sum_{m_x=0}^{n_x}s_{m_xn_yn_z} \tag{19.96}$$
The right-hand side represents a well-defined and periodic integer-valued site field
provided the site field sn sums to zero when accumulated across the lattice in the ζ
(i.e., x) direction. One then finds that a general solution of Δ̄μ ln;μν = −Jn,ν may be
written, using current conservation Δ̄μ Jn,μ = 0,
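$$l_{n;\mu\nu} = \zeta_\nu\frac{1}{\zeta\cdot\bar\Delta}J_{n,\mu} - \zeta_\mu\frac{1}{\zeta\cdot\bar\Delta}J_{n,\nu} + \epsilon_{\mu\nu\lambda}\bar\Delta_\lambda\phi_n \tag{19.97}$$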
where the first two terms on the right-hand side are the desired particular solution.
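The particular solution can be made concrete on a small lattice. The sketch below restricts attention to the plane of the loop (so the $\epsilon_{\mu\nu\lambda}$ term is dropped), takes ζ along x, and assumes the left difference acts as $(\bar\Delta_\mu f)_n = f_n - f_{n-\hat\mu}$ with periodic wrap-around; the lattice size, loop corners, and function names are illustrative choices, not taken from the text.

```python
def left_diff_x(f):
    """(Delta-bar_x f)[x][y] = f[x][y] - f[x-1][y], periodic in x."""
    return [[f[x][y] - f[x - 1][y] for y in range(len(f[0]))] for x in range(len(f))]


def left_diff_y(f):
    """(Delta-bar_y f)[x][y] = f[x][y] - f[x][y-1], periodic in y."""
    return [[f[x][y] - f[x][y - 1] for y in range(len(f[0]))] for x in range(len(f))]


def inverse_left_diff_x(s):
    """Running sum over m_x = 0..n_x at fixed y, as in (19.96) with zeta along x."""
    out = [[0] * len(s[0]) for _ in s]
    for y in range(len(s[0])):
        running = 0
        for x in range(len(s)):
            running += s[x][y]
            out[x][y] = running
    return out


def negate(f):
    return [[-v for v in row] for row in f]


def rectangular_loop(size, x0, x1, y0, y1):
    """Unit current around a counter-clockwise rectangle; jx, jy live on links."""
    jx = [[0] * size for _ in range(size)]
    jy = [[0] * size for _ in range(size)]
    for x in range(x0, x1):
        jx[x][y0] += 1   # bottom edge, flowing in +x
        jx[x][y1] -= 1   # top edge, flowing in -x
    for y in range(y0, y1):
        jy[x1][y] += 1   # right edge, flowing in +y
        jy[x0][y] -= 1   # left edge, flowing in -y
    return jx, jy


jx, jy = rectangular_loop(8, 2, 5, 1, 5)
# First two terms of (19.97) for (mu, nu) = (x, y) with zeta = x-hat:
l_xy = negate(inverse_left_diff_x(jy))
l_yx = negate(l_xy)  # antisymmetry in the plaquette indices
area = sum(map(sum, l_xy))
```

The resulting integer field equals one on exactly the plaquettes enclosed by the loop and zero elsewhere, so `area` is the number of tiles, and both components of $\bar\Delta_\mu l_{n;\mu\nu} = -J_{n,\nu}$ hold site by site.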
The inverse operators in (19.97) are well-defined, as only the current in the plane of
the Wilson loop orthogonal to ζ appears (say, the y-direction), by antisymmetry, and
this component of the current clearly sums to zero across the lattice in the ζ direction
(by combining current contributions from opposite sides of the loop). Using the δ
constraint (19.95) in the Wilson loop average (19.92), and converting the sums over
the integer valued site field φn into integrals over a real-valued site field χn via the
Poisson identity,
$$\sum_{\phi=-\infty}^{+\infty}f(\phi) = \sum_{\rho=-\infty}^{+\infty}\int d\chi\,f(\chi)\,e^{2\pi i\rho\chi} \tag{19.98}$$
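The Poisson identity can be verified numerically for a simple test function. The sketch below assumes a Gaussian $f(\chi) = e^{-a\chi^2}$, for which the integral on the right-hand side is known in closed form, $\int d\chi\, e^{-a\chi^2}e^{2\pi i\rho\chi} = \sqrt{\pi/a}\,e^{-\pi^2\rho^2/a}$; the cutoff `n_max` on both sums is an illustrative truncation.

```python
import math


def integer_sum(a, n_max=50):
    """Left-hand side: sum over integers phi of exp(-a * phi^2)."""
    return sum(math.exp(-a * phi * phi) for phi in range(-n_max, n_max + 1))


def dual_sum(a, n_max=50):
    """Right-hand side: sum over rho of sqrt(pi/a) * exp(-pi^2 rho^2 / a)."""
    return sum(math.sqrt(math.pi / a) * math.exp(-(math.pi * rho) ** 2 / a)
               for rho in range(-n_max, n_max + 1))


gaps = [abs(integer_sum(a) - dual_sum(a)) for a in (0.5, 3.0)]
```

The two sides agree to rounding error for both a broad and a narrow Gaussian. The practical point is that a slowly converging sum on one side becomes a rapidly converging sum on the other, which is what makes the integer-valued field tractable after the transformation.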
$$\Delta = \bar\Delta_\mu\Delta_\mu \tag{19.102}$$
The site field ρn in fact describes a gas of magnetic monopoles interacting with the
electric current loop carried by Jn,μ . To see this, observe that an electric current Jμ
produces a magnetic B-field via the curl operation
Setting
$$B_\lambda = \epsilon_{\lambda\nu\mu}\,\zeta_\mu\frac{1}{\zeta\cdot\bar\Delta}J_\nu \tag{19.105}$$
one easily finds that (19.104) is satisfied. We now see that the term coupling the
ρn field to the Wilson loop current Jν (the second term in the first exponential in
(19.100)) is proportional to
$$\sum_n\rho_n\Delta^{-1}\sigma_n = \sum_{nm}\rho_n\,v^{\rm coul}_{nm}\,(\Delta_\lambda B_\lambda)_m \tag{19.106}$$
In other words, the objects whose density is represented by the ρn field couple
Coulombically to the divergence of the magnetic field generated by the current loop,
and may therefore be properly regarded as magnetic monopoles. The first term in
the exponent (quadratic in ρn ) is just the Coulombic interaction energy of this gas of
monopoles.
To summarize, we have shown that the Wilson loop average in (19.92) factorizes
exactly into the product of two terms,
$$\langle W\rangle = \langle W\rangle_{\rm mon}\,\langle W\rangle_{\rm SW} \tag{19.107}$$
where the first term describes the interaction of a magnetic monopole gas (with
monopole density ρn ) with an electric current Jn,μ running around the Wilson loop
(which generates the field σn ),
$$\langle W\rangle_{\rm mon} = \sum_{\{\rho_n\}}\exp\Big(-2\pi^2\beta_g\sum_{nm}\rho_n(-\Delta^{-1})_{nm}\rho_m + 2\pi i\sum_n\rho_n\Delta^{-1}\sigma_n\Big) \tag{19.108}$$
and a second “spin-wave” term which is an explicit functional of the current Jn,μ :
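$$\langle W\rangle_{\rm SW} = \exp\Big(-\frac{1}{4\beta_g}\sum_{n\mu\nu}\Big(\zeta_\nu\frac{1}{\zeta\cdot\bar\Delta}J_{n,\mu} - \zeta_\mu\frac{1}{\zeta\cdot\bar\Delta}J_{n,\nu}\Big)^2 - \frac{1}{2\beta_g}\sum_n\sigma_n\Delta^{-1}\sigma_n\Big) \tag{19.109}$$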
However, as the difference operators all commute, one may easily check that
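$$-\Delta_1^{-1}\bar\Delta_1^{-1} + \Delta_1^{-1}\bar\Delta_3\,\Delta^{-1}\Delta_3\bar\Delta_1^{-1} = -\Delta^{-1} - \Delta_1^{-1}\Delta_2\,\Delta^{-1}\bar\Delta_2\bar\Delta_1^{-1} \tag{19.112}$$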
$$\bar\Delta_2\bar\Delta_1^{-1}J_2 = -J_1 \tag{19.113}$$
and we obtain
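$$\langle W\rangle_{\rm SW} = \exp\Big(\frac{1}{2\beta_g}\sum_{nm}J_{n,\mu}\Delta^{-1}_{nm}J_{m,\mu}\Big) = \exp\Big(-\frac{1}{2\beta_g}\sum_{nm}J_{n,\mu}\,v^{\rm coul}_{nm}\,J_{m,\mu}\Big) \tag{19.114}$$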
This contribution to the Wilson loop is precisely (see Problem 9) the term that one
expects to survive in the continuum limit of the pure abelian theory, and cannot
therefore be responsible for an area-law behavior of the loop average in the limit
of large loops (recalling that the Coulomb potential in two space dimensions only
rises logarithmically, not linearly, at large distance). Returning to the monopole
contribution (19.108), note that with the x-y orientation of the Wilson loop chosen
above, the σn field (cf. (19.101)) can be written as the gradient in the z-direction
orthogonal to the loop of a uniform density localized on the plane interior of the loop:
$$\sigma_m = \Delta_3\bar\Delta_1^{-1}J_{m,2} = \Delta_3\sum_{n_x=0}^{m_x}J_{n_xm_ym_z,2} \tag{19.115}$$
curves (see Fig. 19.7) in the βg − βh plane, as we indicated earlier, and the shape of
the curves can even be understood qualitatively from known properties of ’t Hooft–
Polyakov monopoles (see (Duncan and Mawhinney, 1990)). The essential difference
between the behavior of the theory at the abelian (βh → ∞) and non-abelian (βh → 0)
ends lies simply in the fact that in the latter case the monopole action no longer
diverges for large βg : instead, the monopole density remains finite in the continuum
limit, corresponding to the persistence of the area law, and linear confinement on
distance scales much larger than the lattice cutoff. However, in the small βh regime
the monopole cores grow to the point where they overlap, and we no longer have a
beautiful and analytically tractable transcription of the theory in terms of a dilute
monopole gas as in the abelian case.
In four spacetime dimensions, an analogous treatment of the Villain version of
compact U(1) gauge theory (see (Banks et al., 1977)) shows that in the strong-coupling
regime (i.e., for βg smaller than the critical coupling shown by Guth (Guth, 1980) to
mark the transition point to a non-confining phase) an area law is obtained as a con-
sequence of the appearance of magnetic vortices—i.e., closed loops carrying magnetic
current—which interlace with the electric current loop corresponding to the Wilson
loop observable. These Euclidean configurations correspond in Minkowski space to vir-
tual events in which magnetic monopole/anti-monopole pairs appear and subsequently
annihilate. The concomitant large fluctuations in the local magnetic charge density
lead to a suppression of the electric field in the bulk, and a “focussing” of the (by
Gauss’s Law, necessarily conserved) electric flux travelling between opposite electric
charges onto a “flux tube” connecting the charges, with an energy cost proportional to
the length of the tube. The situation is precisely the “dual” (in the sense of interchange
of electric and magnetic fields) of the Meissner effect in superconductivity. There, large
fluctuations in the local electric charge density in the superconducting state lead to
a suppression of magnetic field in the bulk of the superconductor (recall from Section
8.1 that electric and magnetic fields are complementary quantities, with corresponding
mutual uncertainty constraints). Indeed, if we had actual magnetic monopoles at our
disposal, then inserting an oppositely (magnetically) charged pair into the bulk volume
of a superconductor would lead precisely to the formation of a string of magnetic
flux connecting the two, with an energy rising linearly with their separation. This
is the “dual Meissner” interpretation of quark confinement in QCD, which remains,
nearly forty years after its introduction, a perfectly reasonable qualitative picture of
the underlying mechanism leading to a linearly rising potential between static quarks.
A glance at the recent literature surrounding confinement in four-dimensional
Yang–Mills theories reveals a complicated nexus of competing, and at first sight
incompatible, hypotheses advanced by theorists interested in the detailed physical
mechanism leading to quark and/or color confinement (for an extensive review, see
(Greensite, 2003)). The origins of this complexity lie in (at least) two directions.
Firstly, the intrinsic fluidity of gauge theories, which allows physically equivalent,
but sometimes superficially completely different, descriptions of the same
physical phenomenon simply by changing the gauge, tends to induce a proliferation of
hypothetical mechanisms even where the underlying relevant physics is the same.
Secondly, the asymptotic freedom of the theory, so helpful in allowing the extraction
of quantitative results from perturbation theory at high energies, proves a
double-edged sword at low energy or long distances. Precisely in this regime, the field
configurations responsible for the confinement phenomenon (as well as the dynamical
chiral symmetry-breaking discussed in Section 16.5) necessarily correspond to strongly
coupled modes of the theory. Specifically, as indicated previously,
semiclassical methods, which depend on saddle-point expansions rendered sensible
by the existence of some type of small expansion parameter, no longer prove useful
except in the most crudely qualitative way. We will therefore probably never
be able to arrive at an analytically tractable, as well as quantitatively accurate,
description of four-dimensional non-abelian confinement along the lines discussed
above for three-dimensional compact abelian gauge theory. It is indeed fortunate that
the lattice formulation of four-dimensional Yang–Mills theory has at least given us the
option of direct numerical evaluation, using Monte Carlo methods, of the (Euclidean)
amplitudes of the theory, with results which leave us with no possible doubt that
the overall picture of confined elementary quark and gluon constituents is indeed the
correct framework for hadronic physics.
19.5 Problems
1. In the model used by Schroer to introduce the concept of infraparticles, a massless
boson field is coupled to a massive fermion in 1+1 spacetime dimensions, via the
Lagrangian
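\[ \mathcal{L} = \frac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi + \bar\psi\,(i\not{\partial} - M)\,\psi + ig\,\bar\psi\gamma^\mu\psi\,\partial_\mu\phi \tag{19.116} \]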
Show that the one-loop contributions to the ψ̄ψφ vertex (with the external
fermions on-mass-shell) contain logarithmic divergences from the infrared part of
the loop integral.
2. Verify the result (19.42) for the matrix element appearing in (19.41).
3. Show that the action (19.46) for QED in 1+1 spacetime dimensions leads, after
going to axial (=Coulomb) gauge A1 = 0, to a Hamiltonian consisting of the usual
massive free fermion piece, plus the Coulomb interaction term (19.48), after the
dependent A0 field is eliminated.
4. Using the identity (19.62), verify that the plaquette variable Un;μν defined in
(19.61) reduces to (19.63).
5. Show that by choosing C(a), D(a) as suitable functions of the lattice spacing a,
the combination of lattice fields
\[ C(a)\sum_{n,\mu}\left(\phi^*_{n+\hat\mu}\,U_{n,\mu}\,\phi_n + \text{c.c.}\right) + D(a)\sum_{n} P(\phi^*_n\phi_n) \tag{19.117} \]
reproduces for a → 0 the classical continuum scalar action for the most general
renormalizable gauge-invariant theory with fundamental representation scalar
fields coupled to the gauge vector fields. Here, P is a polynomial up to
degree 2.
6. Show that the Green function (19.78) satisfies its defining equation (19.77).
appropriately defined norm, (Friedlander, 1982)). The definition (A.3) is the obvious
expression of our desire to express the first variation of the functional in a form
analogous to that familiar from ordinary multi-variable calculus:
\[ \delta Z[j] \equiv Z[j+\delta j] - Z[j] = \int \frac{\delta Z[j]}{\delta j(x)}\,\delta j(x)\,d^4x + O((\delta j)^2) \tag{A.4} \]
The reader may easily verify that the definition (A.3) applied to (A.1) gives, for the
first functional derivative,
\[ \frac{\delta Z[j]}{\delta j(y)} = \sum_n \frac{1}{n!}\int G^{(n+1)}(y, x_1, x_2, \ldots, x_n)\,j(x_1)j(x_2)\cdots j(x_n)\,d^4x_1\,d^4x_2\cdots d^4x_n \tag{A.5} \]
and that the correlation functions G(n) are recoverable from Z[j] by taking the nth
functional derivative and then setting the “source” functions j to zero:
\[ G^{(n)}(y_1, y_2, \ldots, y_n) = \left.\frac{\delta^n Z[j]}{\delta j(y_1)\,\delta j(y_2)\cdots\delta j(y_n)}\right|_{j=0} \tag{A.6} \]
which is the functional analog of the usual formula for the Taylor coefficients of
a Taylor-expandable function. For action functionals such as (A.2), the functional
derivative gives the total Euler derivative of the Lagrange density
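\[
\begin{aligned}
I[\phi+\delta\phi] - I[\phi] &= \int \left(\mathcal{L}(\phi+\delta\phi,\ \partial_\mu\phi+\partial_\mu\delta\phi) - \mathcal{L}(\phi,\ \partial_\mu\phi)\right)d^4x\\
&= \int \left(\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)\delta\phi(x)\,d^4x
\end{aligned}\tag{A.7}
\]
\[ \Rightarrow\quad \frac{\delta I[\phi]}{\delta\phi(x)} = \frac{\partial\mathcal{L}}{\partial\phi(x)} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi(x))} \tag{A.8} \]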
where the integration by parts maneuvers required to reach the second line are
validated by the smoothness and compact support of the test functions δφ(x). The
actual mechanics of functional differentiation can be simplified by noting that
\[ j(y) = \int \delta^4(y-x)\,j(x)\,d^4x \quad\Rightarrow\quad \frac{\delta j(y)}{\delta j(x)} = \delta^4(y-x) \tag{A.9} \]
and applying the obvious generalization of the Leibniz rule to arbitrary products of
the source function.
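The mechanics of (A.4), (A.5), and (A.9) can be checked numerically on a grid. The following is only a minimal sketch under stated assumptions: the four-dimensional integrals are replaced by one-dimensional Riemann sums, and the kernel `G2`, the source `j`, and the helper names `Z` and `functional_derivative` are hypothetical choices made for illustration, not quantities taken from the text. On a grid, a bump of height ε on a single cell of width Δx plays the role of δj(x) = ε Δx δ(x − x_k), so the ordinary partial derivative with respect to the grid value must be divided by Δx to obtain the functional derivative.

```python
import numpy as np

# Hypothetical 1D stand-in for the 4D integrals of the text.
N = 200
x = np.linspace(-5.0, 5.0, N)
dx = x[1] - x[0]

# Assumed symmetric two-point kernel G2(x1, x2) and smooth source j(x).
G2 = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
j = np.exp(-x ** 2)


def Z(src):
    """Z[j] = (1/2!) * double integral of G2(x1, x2) j(x1) j(x2), as a Riemann sum."""
    return 0.5 * (src @ G2 @ src) * dx * dx


def functional_derivative(functional, src, k, eps=1e-6):
    """Central-difference estimate of dZ/dj(x_k).

    The bump on one grid cell represents eps * dx * delta(x - x_k),
    hence the extra division by dx.
    """
    bump = np.zeros_like(src)
    bump[k] = eps
    return (functional(src + bump) - functional(src - bump)) / (2.0 * eps * dx)


k = N // 3
numeric = functional_derivative(Z, j, k)
# Analytic result from the Leibniz rule: integral of G2(x_k, x) j(x) dx.
analytic = (G2[k] @ j) * dx

print(numeric, analytic)
```

The two numbers agree to within rounding error, because this particular Z[j] is quadratic in the source and the central difference is then exact. For a functional containing higher powers of j the agreement would hold only up to terms of order ε².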
In Section 10.3 the concept of functional differentiation is extended to functionals
of Grassmann functions, where both the argument functions and the functionals
themselves take values in an anticommuting number field. The reader is referred to
that section for an explanation of the basic properties of such functionals.
Appendix B
Rates and cross-sections
where $g(\alpha')$ is sharply peaked around some state $\alpha$ with well-defined energy ($E_\alpha$)
and momentum. Recall that $\alpha$ is a shorthand notation for energy, momentum, and
internal quantum numbers (if any) of all the incoming particles. The state (B.1) is
unit normalized, $\langle t|t\rangle = 1$, provided
\[ \int d\alpha'\,|g(\alpha')|^2 = 1 \tag{B.2} \]
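\[ \cdot\left(\frac{1}{E_{\alpha''} - E_\beta - i\epsilon} - \frac{1}{E_{\alpha'} - E_\beta + i\epsilon}\right)e^{-i(E_{\alpha'} - E_{\alpha''})t}\,d\beta \tag{B.5} \]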
Strictly speaking, this event rate vanishes as t → ∞, as a result of the rapid oscillations
of the exponential factor. A finite number of particles localized in wave-packets
eventually separate, and we should not be surprised that the interaction rate then goes
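we can replace $f^*(y)$ by $f^*(x)$ in the integral on the right and obtain
\[
\begin{aligned}
\left|\int d\alpha'\,g(\alpha')\,\delta^3(P_{\alpha'} - P_\beta)\right|^2 &= (2\pi)^{3N-6}\int d^3x\,|f(x)|^2\int d^3y\,e^{i(P_\beta - P_\alpha)\cdot y}\\
&= (2\pi)^{3N-3}\int d^3x\,|f(x)|^2\,\delta^3(P_\beta - P_\alpha)\\
&= (2\pi)^{3N-3}\,\delta^3(P_\beta - P_\alpha)\int d^3x\,|\psi(x, x, \ldots, x)|^2
\end{aligned}
\]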
where we have combined the energy-conservation δ-function of (B.8) with the three-
momentum δ-function of (B.13) to give a single four-dimensional δ-function of energy-
momentum conservation.
There are two particularly important special cases of (B.15) which deserve special
attention: N=1 (particle decay), and N=2 (two-particle scattering).
1. If there is only one particle in the initial state, we have immediately
\[ \rho_{\rm rel} = \int d^3x\,|\psi(x)|^2 = 1 \tag{B.16} \]
so the differential decay rate (into final-state phase-space between β and β + dβ)
is
\[ d\Gamma(\beta) = \rho_1 V\,d\sigma\cdot\rho_2 v_2 \tag{B.18} \]
so
\[ d\sigma = \frac{1}{\rho_1\rho_2 v_2}\cdot\frac{d\Gamma(\beta)}{V} \tag{B.19} \]
The projectile beam wavefunction usually varies slowly (in amplitude) over the
interaction range with the target particle, so choosing V much larger than the
interaction range but much smaller than the scale of variation of the projectile
wave-packet envelope
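\[
\begin{aligned}
\int d^3x_1\int_{V_1} d^3x_2\,|\psi(x_1, x_2)|^2 &\simeq \int d^3x_1\int_{V_1} d^3x_2\,|\psi(x_1, x_1)|^2\\
&= V\int d^3x_1\,|\psi(x_1, x_1)|^2\\
&= V\rho_{\rm rel}
\end{aligned}\tag{B.22}
\]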
Inserting this result in (B.21), and using (B.15) for N=2, we obtain
\[ d\sigma(\alpha\to\beta) = \frac{(2\pi)^4}{v_2}\,\delta^4(P_\beta - P_\alpha)\,|T_{\beta\alpha}|^2\,d\beta \tag{B.23} \]
The above result was obtained in the frame in which one of the particles was
at rest. Note that from (B.19) the differential cross-section dσ was given as the
quantity dΓ(β)/V , which is a Lorentz-invariant (# of events per unit time per
unit spatial volume—i.e., # per unit spacetime volume), divided by the quantity
ρ1 ρ2 v2 . It is customary to define the cross-section in a general frame to be exactly
the same number as in (B.23), by generalizing the latter quantity in a Lorentz-
invariant way. Note that in the rest frame of the target (particle 1) the densities
of both particles are given by (c=1 everywhere!)
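\[ \rho_1 = \rho_1^{(0)},\qquad \rho_2 = \frac{\rho_2^{(0)}}{\sqrt{1 - v_2^2}} \tag{B.24} \]
where $\rho_1^{(0)}$, $\rho_2^{(0)}$ are the particle densities in the rest frames of the particles
themselves. Thus
\[ \rho_1\rho_2 v_2 = \rho_1^{(0)}\rho_2^{(0)}\,\frac{v_2}{\sqrt{1 - v_2^2}} = \rho_1^{(0)}\rho_2^{(0)}\,\frac{|\vec{p}_2|}{m_2} \tag{B.25} \]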
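The invariant quantity $\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}$ ($p_1$, $p_2$ four-vectors) becomes, in the
frame where particle 1 is at rest, just $m_1|\vec{p}_2|$, so the quantity equal to $\rho_1\rho_2 v_2$
in the target rest frame may be written in any frame as
\[
\begin{aligned}
\rho_1^{(0)}\rho_2^{(0)}\,\frac{\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}}{m_1 m_2} &= \rho_1\sqrt{1 - v_1^2}\cdot\rho_2\sqrt{1 - v_2^2}\cdot\frac{\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}}{m_1 m_2}\\
&= \rho_1\rho_2\,\frac{m_1 m_2}{E_1 E_2}\cdot\frac{\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}}{m_1 m_2}\\
&= \rho_1\rho_2 v_\alpha
\end{aligned}\tag{B.26}
\]
with the relative velocity $v_\alpha \equiv \sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}\,/\,(E_1 E_2)$ providing the appropriate
generalization of $v_2$ in (B.23). Our final formula is thus
\[ d\sigma = \frac{(2\pi)^4}{v_\alpha}\,\delta^4(P_\beta - P_\alpha)\,|T_{\beta\alpha}|^2\,d\beta \tag{B.27} \]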
Although the above formulas were derived under the assumption of distinguishable
particles in the initial state (in particular, we did not worry about niceties of
symmetrizing or antisymmetrizing the initial-state wavefunction), it turns out that the
final result is still valid in such cases. The derivation can be found in any of the
standard texts on scattering theory (see (Newton, 1966), for example).
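The two properties claimed for the relative velocity in (B.26) can be checked with a few lines of arithmetic: it reduces to v2 when particle 1 is at rest, and the product E1 E2 vα is unchanged by a boost. The sketch below is illustrative only. The masses and momenta are arbitrary numbers, the metric signature (+, −, −, −) with c = 1 is assumed, and the helper names (`minkowski_dot`, `four_momentum`, `relative_velocity`, `boost_z`) are hypothetical.

```python
import numpy as np


def minkowski_dot(p, q):
    """Four-vector product with metric (+, -, -, -)."""
    return p[0] * q[0] - np.dot(p[1:], q[1:])


def four_momentum(mass, three_momentum):
    """On-shell four-momentum (E, px, py, pz) with c = 1."""
    three_momentum = np.asarray(three_momentum, dtype=float)
    energy = np.sqrt(mass ** 2 + three_momentum @ three_momentum)
    return np.concatenate(([energy], three_momentum))


def relative_velocity(p1, p2, m1, m2):
    """v_alpha = sqrt((p1.p2)^2 - m1^2 m2^2) / (E1 E2)."""
    invariant = minkowski_dot(p1, p2) ** 2 - (m1 * m2) ** 2
    return np.sqrt(invariant) / (p1[0] * p2[0])


def boost_z(p, beta):
    """Boost a four-vector along z with velocity beta."""
    gamma = 1.0 / np.sqrt(1.0 - beta ** 2)
    energy, px, py, pz = p
    return np.array([gamma * (energy - beta * pz), px, py,
                     gamma * (pz - beta * energy)])


m1, m2 = 0.938, 0.140                      # illustrative masses
p1 = four_momentum(m1, [0.0, 0.0, 0.0])    # target (particle 1) at rest
p2 = four_momentum(m2, [0.3, -0.2, 1.5])   # projectile

# In the target rest frame v_alpha should equal v2 = |p2| / E2.
v_alpha_rest = relative_velocity(p1, p2, m1, m2)
v2 = np.linalg.norm(p2[1:]) / p2[0]

# E1 * E2 * v_alpha is built from invariants only, so a boost leaves it unchanged.
q1, q2 = boost_z(p1, 0.6), boost_z(p2, 0.6)
flux_rest = p1[0] * p2[0] * v_alpha_rest
flux_boosted = q1[0] * q2[0] * relative_velocity(q1, q2, m1, m2)

print(v_alpha_rest, v2)
print(flux_rest, flux_boosted)
```

Note that vα itself is frame dependent; only the combination E1 E2 vα is invariant, which is the combination that matters once the densities are expressed through their rest-frame values.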
Appendix C
Majorana spinor algebra
Recall from the discussion in Section 7.4.3 that we may continue to employ the
very convenient (and familiar) Dirac 4-spinor language even when describing the
two-component Majorana fields that naturally interpolate for self-conjugate spin-1/2
particles. Here, we shall follow the conventions of that section and define a Majorana
spinor as a Dirac spinor with the following relation between the upper and lower
2-spinors:
\[ \phi^{(\frac{1}{2},0)} = C_s\,\phi^{(0,\frac{1}{2})*} \tag{C.1} \]
where $C_s = i\sigma_2$ is the 2-spinor conjugation matrix introduced in Section 7.2. Thus,
if $Q_a$ is a $(0,\frac{1}{2})$ spinor, in the fundamental representation of SL(2,C), we can form a
Majorana spinor as follows:
\[ \begin{pmatrix} C_s Q^* \\ Q \end{pmatrix} \tag{C.2} \]
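Any Dirac spinor $\psi$ can be decomposed $\psi = \frac{1}{\sqrt{2}}(\chi_1 + i\chi_2)$ where $\chi_{1,2}$ are Majorana:
\[ \chi_1 \equiv \frac{1}{\sqrt{2}}\,(\psi - C\gamma_0\psi^*) \tag{C.3} \]
\[ \chi_2 \equiv \frac{-i}{\sqrt{2}}\,(\psi + C\gamma_0\psi^*) \tag{C.4} \]
where
\[ C \equiv i\gamma_2\gamma_0 = \begin{pmatrix} -C_s & 0 \\ 0 & C_s \end{pmatrix} \]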
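\[ \begin{pmatrix} C_s\chi^* \\ \chi \end{pmatrix} \]
\[ s = \begin{pmatrix} C_s\chi^* \\ \chi \end{pmatrix} \]
where
\[ \chi = \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix} \]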
and χ1 , χ2 , χ∗1 , χ∗2 are independent Grassmann quantities (i.e., they square to zero and
anticommute with each other).
The following 4x4 matrices will be very useful:
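\[ \equiv \begin{pmatrix} C_s & 0 \\ 0 & C_s \end{pmatrix} \]
\[ \gamma_5 \equiv \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
\[ \beta = \gamma_0 \equiv \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]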
Using these, one finds that the adjoint of a Majorana spinor, s̄ ≡ s∗T β can be
written
s̄ = sT γ5 (C.5)
Let M be a general 4x4 numerical matrix (containing normal complex numbers), and
s1 , s2 two Grassmann Majorana spinors:
\[ C^{-1}M^T C = M,\qquad M = 1,\ \gamma_5,\ \gamma_5\gamma_\mu \tag{C.10} \]
\[ C^{-1}M^T C = -M,\qquad M = \gamma_\mu,\ [\gamma_\mu, \gamma_\nu] \tag{C.11} \]
For the special case where s1 , s2 are the same spinor, we find
It follows that the only non-vanishing bilinears that can be built from a single
Grassmann Majorana s are s̄s, s̄γ5 s, and s̄γ5 γμ s.
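The sign rules (C.10) and (C.11) are easy to confirm with explicit matrices. The sketch below is hedged in one important respect: the text above fixes only γ5, γ0, and the block form of C, so the spatial matrices used here are an assumption, chosen as the chiral-representation form for which iγ2γ0 reproduces the quoted block form diag(−Cs, Cs). The names `gammas`, `conj_transform`, `even`, and `odd` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
Cs = 1j * sigma[1]  # 2-spinor conjugation matrix C_s = i sigma_2

gamma0 = np.block([[Z2, I2], [I2, Z2]])
gamma5 = np.block([[I2, Z2], [Z2, -I2]])
# ASSUMPTION: spatial gamma matrices in a chiral representation,
# with the sign fixed so that C below has the block form quoted in the text.
gammas = [gamma0] + [np.block([[Z2, -s], [s, Z2]]) for s in sigma]

C = 1j * gammas[2] @ gamma0
C_expected = np.block([[-Cs, Z2], [Z2, Cs]])


def conj_transform(M):
    """Return C^{-1} M^T C."""
    return np.linalg.inv(C) @ M.T @ C


# Matrices expected to satisfy C^{-1} M^T C = +M  (C.10)
even = [np.eye(4, dtype=complex), gamma5] + [gamma5 @ g for g in gammas]
# Matrices expected to satisfy C^{-1} M^T C = -M  (C.11)
odd = gammas + [gammas[m] @ gammas[n] - gammas[n] @ gammas[m]
                for m in range(4) for n in range(m + 1, 4)]

block_form_ok = np.allclose(C, C_expected)
even_ok = all(np.allclose(conj_transform(M), M) for M in even)
odd_ok = all(np.allclose(conj_transform(M), -M) for M in odd)

print(block_form_ok, even_ok, odd_ok)
```

Lowering the index on γμ only changes the overall sign of the spatial matrices, which affects neither relation, so the check does not depend on whether upper or lower indices are intended.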
Inserting the explicit expression for s in terms of its Grassmann components, one
finds
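There are only four independent objects cubic in $\chi_1, \chi_2, \chi_1^*, \chi_2^*$, so products of three
s's can always be reduced to
\[ \bar{s}\gamma_5 s\cdot s = \begin{pmatrix} 2\chi_1\chi_2\chi_2^* \\ -2\chi_1\chi_2\chi_1^* \\ 2\chi_1^*\chi_2^*\chi_1 \\ 2\chi_1^*\chi_2^*\chi_2 \end{pmatrix} \]
\[ \chi_1^*\chi_2^*\chi_1\chi_2 = \frac{1}{8}\,(\bar{s}\gamma_5 s)^2 \tag{C.17} \]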
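The cubic reduction and the quartic identity (C.17) can be verified mechanically with a very small Grassmann algebra. This is a sketch under stated assumptions: the class `G` below is a hypothetical helper, not a library, the four generators are labelled 0 to 3 for χ1, χ2, χ1*, χ2*, and complex conjugation of a spinor component is taken to mean swapping each generator with its starred partner. The spinor s and its adjoint s̄ ≡ s*ᵀβ are built from the definitions given in this appendix, with Cs = iσ2.

```python
from itertools import product


class G:
    """Grassmann algebra element stored as {sorted generator tuple: coefficient}."""

    def __init__(self, terms=None):
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    def __add__(self, other):
        out = dict(self.terms)
        for key, val in other.terms.items():
            out[key] = out.get(key, 0) + val
        return G(out)

    def scale(self, c):
        return G({k: c * v for k, v in self.terms.items()})

    def __mul__(self, other):
        out = {}
        for (a, ca), (b, cb) in product(self.terms.items(), other.terms.items()):
            if set(a) & set(b):
                continue  # a repeated generator squares to zero
            word, sign = list(a + b), 1
            # Bubble sort; every transposition of generators flips the sign.
            for i in range(len(word)):
                for k in range(len(word) - 1 - i):
                    if word[k] > word[k + 1]:
                        word[k], word[k + 1] = word[k + 1], word[k]
                        sign = -sign
            key = tuple(word)
            out[key] = out.get(key, 0) + sign * ca * cb
        return G(out)

    def __eq__(self, other):
        return self.terms == other.terms


# Generators: chi_1, chi_2, chi_1^*, chi_2^*
x1, x2, x1c, x2c = (G({(i,): 1}) for i in range(4))

# s = (C_s chi^*, chi) with C_s = [[0, 1], [-1, 0]]
s = [x2c, x1c.scale(-1), x1, x2]
# sbar = s^{*T} beta: conjugate each entry, then swap upper and lower 2-spinors
sbar = [x1c, x2c, x2, x1.scale(-1)]
gamma5_diag = [1, 1, -1, -1]

bilinear = G()
for a in range(4):
    bilinear = bilinear + (sbar[a] * s[a]).scale(gamma5_diag[a])

cubic = [bilinear * component for component in s]
cubic_expected = [
    (x1 * x2 * x2c).scale(2),
    (x1 * x2 * x1c).scale(-2),
    (x1c * x2c * x1).scale(2),
    (x1c * x2c * x2).scale(2),
]

quartic = x1c * x2c * x1 * x2
square = bilinear * bilinear

print(bilinear.terms)
print(cubic == cubic_expected, square == quartic.scale(8))
```

The intermediate result is s̄γ5s = 2χ1χ2 + 2χ1*χ2*, from which both the cubic column and the factor 1/8 in (C.17) follow directly.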
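\[ \theta_\alpha\bar\theta_\beta = A\,\delta_{\alpha\beta}\,\bar\theta\theta + B\,(\gamma_5\gamma_\mu)_{\alpha\beta}\,\bar\theta\gamma_5\gamma^\mu\theta + C\,(\gamma_5)_{\alpha\beta}\,\bar\theta\gamma_5\theta \tag{C.22} \]
\[ {\rm Tr}(1\cdot\theta\bar\theta) = -\bar\theta\theta = 4A\,\bar\theta\theta \quad\Rightarrow\quad A = -\frac{1}{4} \tag{C.23} \]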
Likewise
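\[ {\rm Tr}(\gamma_5\gamma_\nu\,\theta\bar\theta) = -\bar\theta\gamma_5\gamma_\nu\theta = -4B\,\bar\theta\gamma_5\gamma_\nu\theta \quad\Rightarrow\quad B = \frac{1}{4} \tag{C.24} \]
\[ {\rm Tr}(\gamma_5\,\theta\bar\theta) = -\bar\theta\gamma_5\theta = 4C\,\bar\theta\gamma_5\theta \quad\Rightarrow\quad C = -\frac{1}{4} \tag{C.25} \]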
To summarize, we have the following identity:
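\[ \theta_\alpha\bar\theta_\beta = -\frac{1}{4}\,\delta_{\alpha\beta}\,\bar\theta\theta + \frac{1}{4}\,(\gamma_5\gamma_\mu)_{\alpha\beta}\,\bar\theta\gamma_5\gamma^\mu\theta - \frac{1}{4}\,(\gamma_5)_{\alpha\beta}\,\bar\theta\gamma_5\theta \tag{C.26} \]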
References
Aarts, G., Seiler, E., and Stamatescu, I. (2010). Complex Langevin method: When
can it be trusted? Physical Review D, 81, 054508.
Abers, E. S. and Lee, B. W. (1973). Gauge theories. Physics Reports, 9, 1–141.
Adler, S. L. (1969). Axial vector vertex in spinor electrodynamics. Physical
Review , 177, 2426–2438.
Adler, S. L. and Bardeen, W. A. (1969). Absence of higher-order corrections in the
anomalous axial-vector divergence equation. Physical Review , 182, 1517–1536.
Adler, S. L., Collins, J. C., and Duncan, A. (1977). Energy-momentum-tensor trace
anomaly in spin-1/2 quantum electrodynamics. Physical Review D, 15, 1712–1721.
Alvarez-Gaumé, L. and Witten, E. (1983). Gravitational anomalies. Nuclear Physics
B , 234, 269–330.
Anderson, A. (1994). Canonical transformations in quantum mechanics. Annals of
Physics, 232, 292–331.
Appelquist, T. and Carrazzone, J. (1975). Infrared singularities and massive fields.
Physical Review D, 11, 2856–2861.
Banks, T., Myerson, R., and Kogut, J. (1977). Phase transitions in abelian lattice
gauge theories. Nuclear Physics B , 129, 493–510.
Barton, G. (1963). Introduction to Advanced Field Theory (1st edn). Interscience
Publishers (John Wiley and Sons), New York.
Barton, G. (1965). Introduction to Dispersion Techniques in Field Theory (1st edn).
W. A. Benjamin, New York.
Baym, G. (1990). Lectures on Quantum Mechanics (3rd edn). Westview Press,
New York.
Becchi, C., Rouet, A., and Stora, R. (1976). Renormalization of gauge theories. Annals
of Physics, 98, 287–321.
Bell, J. S. and Jackiw, R. (1969). A PCAC puzzle: π0 → γγ in the σ-model. Nuovo
Cimento A, 51, 47–61.
Bergère, M. and Lam, Y. M. P. (1976). Bogoliubov-Parasiuk theorem in the
α-parametric representation. Journal of Mathematical Physics, 17, 1546–1557.
Bernard, C. and Duncan, A. (1975). Lorentz covariance and Matthew’s theorem for
derivative-coupled field theories. Physical Review D, 11, 848–859.
Bjorken, J. D. and Drell, S. D. (1965). Relativistic Quantum Fields (1st edn). McGraw-
Hill Book Company, New York.
Bloch, F. and Nordsieck, A. (1937). Note on the radiation field of the electron. Physical
Review , 52, 54–59.
Bloch, P. (2006). CPT invariance tests in neutral kaon physics. Journal of Physics
G: Nuclear and Particle Physics, 33, 666–667.
Bodwin, G., Braaten, E., and Lepage, G. P. (1995). Rigorous QCD analysis of
inclusive annihilation and production of heavy quarkonium. Physical Review D, 51,
1125–1171.
Creutz, M. (1980). Monte Carlo study of quantized SU(2) gauge theory. Physical Review
D, 21, 2308–2315.
Daboul, J. and Nieto, M. M. (1994). Quantum bound states with zero binding energy.
Physics Letters A, 190, 357–362.
Dirac, P. A. M. (1927a). The physical interpretation of the quantum dynamics.
Proceedings of the Royal Society of London, Series A, 113, 621–641.
Dirac, P. A. M. (1927b). The quantum theory of the emission and absorption of
radiation. Proceedings of the Royal Society of London, Series A, 114, 243–265.
Dirac, P. A. M. (1928). The quantum theory of the electron, I. Proceedings of the
Royal Society (London) A, 117, 610–624.
Dirac, P. A. M. (1933). Théorie du positron. In Septième Conseil de Physique
Solvay: Structure et propriétés des noyaux atomiques, pp. 203–221. Gauthier-Villars
(Paris).
Dirac, P. A. M. (1945). On the analogy between classical and quantum mechanics.
Reviews of Modern Physics, 17, 195–199.
Dirac, P. A. M. (1964). Lectures on Quantum mechanics. Yeshiva University, New
York.
Donoghue, J. F., Golowich, E., and Holstein, B. R. (1992). Dynamics of the Standard
Model (1st edn). Cambridge University Press, Cambridge, UK.
Duncan, A. (1976). Fine structure in non-abelian gauge theories. Physical Review
D, 13, 2866–2880.
Duncan, A., Eichten, E., Flynn, J., Hill, B., Hockney, G., and Thacker, H. (1995).
Properties of B mesons in lattice QCD. Physical Review D, 51, 5101–5129.
Duncan, A. and Janssen, M. (2007a). Van Vleck and the correspondence principle
(part one). Archive for History of the Exact Sciences, 61, 553–624.
Duncan, A. and Janssen, M. (2007b). Van Vleck and the correspondence principle
(part two). Archive for History of the Exact Sciences, 61, 625–671.
Duncan, A. and Janssen, M. (2008). Pascual Jordan’s resolution of the conundrum
of the wave-particle duality of light. Studies in History and Philosophy of Modern
Physics, 39, 634–666.
Duncan, A. and Janssen, M. (2009). From canonical transformations to transformation
theory, 1926–1927: The road to Jordan’s Neue Begründung. Studies in the History
and Philosophy of Modern Physics, 40, 352–362.
Duncan, A. and Jones, H. F. (1993). Convergence proof for optimized δ expansion:
Anharmonic oscillator. Physical Review D, 47, 2560–2572.
Duncan, A. and Mawhinney, R. (1990). Semiclassical approach to confinement in three-
dimensional gauge theories. Physical Review D, 43, 554–565.
Dyson, F. J. (1949). The S-matrix in quantum electrodynamics. Physical Review , 75,
1736–1755.
Dyson, F. J. (1952). Divergence of perturbation theory in quantum electrodynamics.
Physical Review , 85, 631–632.
Ehrenfest, P. (1911). Welche Züge der Lichtquantenhypothese spielen in der Theorie
der Wärmestrahlung eine wesentliche Rolle? Annalen der Physik , 36, 91–118.
Ehrenfest, P. (1925). Energieschwankungen im Strahlungsfeld oder Kristallgitter
bei Superposition quantisierter Eigenschwingungen. Zeitschrift für Physik , 34,
362–373.
Einstein, A. (1905a). Über einen die Erzeugung und Verwandlung des Lichtes
betreffenden heuristischen Gesichtspunkt. Annalen der Physik, 17, 132–148.
Einstein, A. (1905b). Zur Elektrodynamik bewegter Körper. Annalen der Physik , 17,
891–921.
Einstein, A. (1909a). Über die Entwicklung unserer Anschauungen über das Wesen
und die Konstitution der Strahlung. Physikalische Zeitschrift, 10, 817–825.
Einstein, A. (1909b). Zum gegenwärtigen Stand des Strahlungsproblems. Physikalische
Zeitschrift, 10, 185–193.
Einstein, A. (1916). Zur Quantentheorie der Strahlung. Mitteilungen der
Physikalischen Gesellschaft, Zürich, 18, 47–62.
Einstein, A. (1917). Quantentheorie der Strahlung. Physikalische Zeitschrift, 18,
121–128.
Elitzur, S. (1975). Impossibility of spontaneously breaking local symmetries. Physical
Review D, 12, 3978–3982.
Evans, T. S., Kibble, T. W. B., and Steer, D. A. (1998). Wick’s theorem for non-
symmetric normal ordered products and contractions. Journal of Mathematical
Physics, 39, 5726–5738.
Faddeev, L. D. (1969). The Feynman integral for singular Lagrangians. Theoretical
and Mathematical Physics, 1, 1–13.
Fermi, E. (1929). Sopra l’elettrodinamica quantistica I. Rendiconti d. R. Acc. dei
Lincei , 9, 881–887.
Fermi, E. (1930). Sopra l’elettrodinamica quantistica II. Rendiconti d. R. Acc. dei
Lincei , 12, 431–435.
Fernandez, R., Fröhlich, J., and Sokal, A. D. (1992). Random Walks, Critical
Phenomena, and Triviality in Quantum Field Theory (1st edn). Springer Press,
New York.
Feynman, R. P. (1948). Space-time approach to non-relativistic quantum mechanics.
Reviews of Modern Physics, 20, 367–387.
Feynman, R. P. (1949a). Space-time approach to quantum electrodynamics. Physical
Review, 76, 769–789.
Feynman, R. P. (1949b). The theory of positrons. Physical Review , 76, 749–759.
Fock, V. A. (1933). Zur Theorie des Positrons. Doklady Akademii Nauk USSR, 6,
265–272.
Freedman, D. Z., Muzinich, I. J., and Weinberg, E. J. (1974). On the energy-
momentum tensor in gauge field theories. Annals of Physics, 87, 95–125.
Freedman, D. Z. and Weinberg, E. J. (1974). The energy-momentum tensor in scalar
and gauge field theories. Annals of Physics, 87, 354–374.
Friedlander, F. G. (1982). Introduction to the theory of distributions (1st edn).
Cambridge University Press, Cambridge, UK.
Fröhlich, J., Morchio, G., and Strocchi, F. (1979). Charged sectors and scattering
states in quantum electrodynamics. Annals of Physics, 119, 241–284.
Fujikawa, K. (1980). Path integral for gauge theories with fermions. Physical Review
D, 21, 2848–2858.
Fujikawa, K. (1981). Energy-momentum tensor in quantum field theory. Physical
Review D, 23, 2262–2275.
Pauli, W. (1940). The connection between spin and statistics. Physical Review , 58,
716–722.
Pauli, W. and Weisskopf, V. (1934). Über die Quantisierung der skalaren
relativistischen Wellengleichung. Helvetica Physica Acta, 7, 709–731.
Peccei, R. D. (1988). Discrete and global symmetries in particle physics. In Broken
Symmetries: Proceedings of the 37 International Universitäts Wochen für Kern- und
Teilchenphysik, pp. 1–50.
Peierls, R. E. (1934). The vacuum in Dirac’s theory of the positive electron. Proceedings
of the Royal Society A, London, 146, 420–441.
Planck, M. (1899). Über irreversible Strahlungsvorgänge: Fünfte Mitteilung (Schluss).
Berliner Berichte, 440–480.
Planck, M. (1900a). Über irreversible Strahlungsvorgänge. Annalen der Physik , 1,
69–122.
Planck, M. (1900b). Zur Theorie des Gesetzes der Energieverteilung im
Normalspektrum. Verhandlungen der Deutschen Physikalischen Gesellschaft, 2, 237–245.
Polchinski, J. (1984). Renormalization and effective Lagrangians. Nuclear Physics
B , 231, 269–295.
Polyakov, A. M. (1974). Particle spectrum in quantum field theory. JETP Letters, 20,
194–195.
Polyakov, A. M. (1977). Quark confinement and topology of gauge theories. Nuclear
Physics B , 120, 429–458.
Proca, A. (1936). Sur la théorie ondulatoire des électrons positifs et négatifs. Journal
de Physique et le Radium, 7, 347–353.
Rayleigh, Lord (1900). Remarks upon the law of complete radiation. Philosophical
Magazine, 49, 539–540.
Reeh, H. and Schlieder, S. (1961). Bemerkungen zur Unitäräquivalenz von
Lorentzinvarianten Feldern. Nuovo Cimento, 22, 1051–1068.
Rey, S-J. (1989). Axion dynamics in wormhole background. Physical Review D, 39,
3185–3189.
Rudin, W. (1966). Real and Complex Analysis (1st edn). McGraw-Hill, Inc, New York.
Ruelle, D. (1962). On the asymptotic condition in quantum field theory. Helvetica
Physica Acta, 35, 147–163.
Sakurai, J. J. (1964). Invariance Principles and Elementary Particles (1st edn).
Princeton University Press, Princeton, New Jersey.
Salpeter, E. E. and Bethe, H. A. (1951). A relativistic equation for bound-state
problems. Physical Review , 84, 1232–1242.
Schroer, B. (1963). Infrateilchen in der Quantenfeldtheorie. Fortschritte der
Physik , 11, 1–32.
Schweber, Silvan S. (1994). QED and the men who made it: Dyson, Feynman,
Schwinger, and Tomonaga (1st edn). Princeton University Press, Princeton,
New Jersey.
Schwinger, J. (1948a). On quantum electrodynamics and the magnetic moment of the
electron. Physical Review , 73, 416–417.
Schwinger, J. (1948b). Quantum electrodynamics. I. a covariant formulation. Physical
Review , 74, 1439–1461.
Zimmermann, Wolfhart (1968). The power counting theorem for Minkowski metric.
Communications in Mathematical Physics, 11, 1–8.
Zimmermann, W. (1969). Convergence of Bogoliubov’s method of renormalization in
momentum space. Communications in Mathematical Physics, 15, 208–234.
Zimmermann, W. (1970). Local operator products and renormalization. In
Brandeis Lectures on Elementary Particles and Quantum Field Theory, Volume 1,
pp. 395–582. MIT Press.
Zinn–Justin, J. (1989). Quantum Field Theory and Critical Phenomena (1st edn).
Oxford University Press, Oxford.
Index