July 7, 2022
Lectures on Introduction to Quantum Field Theory
Ghanashyam Date¹,∗
¹ Chennai Mathematical Institute,
H1, SIPCOT IT Park, Siruseri, Kelambakkam 603103, INDIA.
Abstract
These are lecture notes of the QFT-I course I gave in an online mode at Chennai Mathematical Institute. The course focussed on the free relativistic quantum fields, their interactions in the perturbative scattering framework, standard computations of QED processes, radiative corrections at 1-loop with renormalization and an introduction to the toolbox of path integrals.
arXiv:2207.02243v1 [hep-th] 5 Jul 2022
∗ Electronic address: [email protected], [email protected]
Contents
1. What is Quantum Field Theory?
  A. Relativistic Covariance
  B. Compatibility with quantum conditions
  C. Manifest Covariance
2. The Poincare group and its representation: mass and spin/helicity
  A. Poincare group, Lie algebra and Casimir invariants
  B. Representations of the proper, orthochronous Poincare group
  C. Little groups
  D. The Discrete Subgroups and their actions
3. Representations suitable for manifest covariance: “field representations”
  A. Induced representation of Poincare from inducing representation of Lorentz
  B. Irreducibility and field equations
  C. Clifford Algebras and Spinor representations
4. Unitary representations on the space of solutions: anti-particles
5. Parity, time reversal and charge conjugation
  A. Representations of the Clifford algebra and relations among them
  B. Dirac-Majorana-Charge Conjugates
6. Relativistic Actions: classical fields
  A. Variational Principle, Symmetries of the Action and Noether’s theorem
  B. Conserved Poincare charges for scalar field solutions
7. Fourier decompositions of fields: collection of harmonic oscillators
  A. Maxwell Field
  B. Dirac Field
  C. Interaction with source: Green’s functions
8. Covariance of quantum fields and relativistic causality
  A. Poincare Covariance of Quantum fields
  B. Space Inversion, Time Reversal, Charge Conjugation of Dirac Field
  C. CPT theorem for Dirac Field
  D. Relativistic Causality and the Spin-Statistics Theorem
9. States of Free Quantum Fields: Particles, coherence and coherent states
  A. Particle/anti-particle wave packets
  B. Correlation Functions and Coherence of states
  C. Coherent States
    1. An aside: Harmonic Oscillator States
    2. Correlation functions
  D. Evolution into a coherent state
10. Quantum fields in Scattering Phenomena
  A. General scattering framework
  B. Scattering with quantum fields: Heisenberg Picture
  C. Kallen-Lehmann Representation
  D. The S-matrix and its properties
  E. Lehmann-Symanzik-Zimmermann Reduction of S-matrix
    1. LSZ reduction for Klein-Gordon field
    2. LSZ reduction for Dirac field
    3. LSZ reduction for Maxwell field
11. Covariant Perturbation Theory
  A. Normal ordering and Wick’s theorem
  B. Differential Cross-section for 2 → n process
    1. The Special case of 2 → 2 processes
12. Diagrammatic recipe for S-matrix elements
13. Elementary processes in Yukawa and QED: NR limit
14. Basic QED processes
  A. Electron-muon processes
  B. Compton Scattering
  C. Electron-Positron Annihilation
15. Numerical Estimates of Cross-sections and Applications
16. Radiative corrections in QED
  A. The fermion propagator: self-energy
  B. The photon propagator: photon self-energy
    1. The Ward identity claim: $q^\mu \Pi_{\mu\nu}(q) = 0$
  C. The Vertex function: Form factors
  D. Electric Charge and Anomalous Magnetic Moment
17. Radiative corrections at 1-loop: Divergences
  A. Isolation of Divergence: Fermion Self Energy
  B. Isolation of Divergence: Photon Self-Energy
  C. Isolation of Divergence: Vertex function
  D. Bremsstrahlung Cross-section to o(α)
18. Treatment of Divergences
  A. Treatment of the IR divergences
  B. Treatment of the UV divergences
    1. 1-Loop Renormalization: Charge Screening and Lamb shift
  C. The Method of Counter terms
19. Renormalized perturbation series
  A. Necessary Conditions for UV divergence: Power Counting
  B. An Example: The $(\Phi^3)_6$ theory
    1. At 2-Loops
  C. Renormalization with massless particles
20. Path Integrals in Quantum Mechanics
  A. The Ab Initio Path Integral
  B. Derivation From Transition Amplitude
  C. Functional Derivative
  D. Ground State-to-ground state Amplitude: Z[J]
  E. Explicit evaluation of a path integral
  F. Alternative Expression for Z[J]
21. Path Integrals in Quantum Field Theory
  A. The 1-point function: Renormalization ↔ normal ordering
  B. Path Integrals and Statistical Mechanics
22. Path Integrals as Generating Functionals
  A. The Generating Functionals: Z[J], W[J], Γ[Φ]
  B. The Renormalization Group Equation
  C. The Background Field Method
23. Closing Remarks
References
1. WHAT IS QUANTUM FIELD THEORY?
There is of course no short answer to this question. There are different facets of quantum fields. An aspect that is used extensively in condensed matter physics is ‘QFT as a framework for computing processes in interacting many-body systems’. Here, each of the many bodies typically has its own internal energy states, e.g. atoms/molecules at various lattice sites, localized states in periodic potentials etc. The interactions take place through transitions among these internal states. Since the internal states are discrete, the interactions also involve discrete exchanges of energy/momentum/spin etc., which are described by a quantum field, e.g. $[\Psi(\vec r), \Psi^\dagger(\vec r\,')] \sim \delta^3(\vec r - \vec r\,')$. With a large number of bodies involved, a quantum field capable of an indefinite number of transitions is a well-suited framework.
There is another aspect which makes quantized fields essential when we want to describe interactions of even a few bodies in a relativistically covariant manner. Special relativity imposes two crucial modifications: (a) the notion of causality is modified by the finite upper speed limit and (b) the equivalence of mass and energy demands that the rest masses also be included in energy conservation. The latter allows annihilation and creation of ‘particles’ at the expense of energy. The framework of non-relativistic quantum mechanics is not capable of accommodating these possibilities. Why is it so?
For this, we need to be sharper about what is meant by “relativistic covariance”.
A. Relativistic Covariance
Special relativity posits that the space-time that is the appropriate arena for describing physical processes is the Minkowski space-time: $\mathbb{R}^4$ with the flat metric. This allows the space-time to be described in terms of coordinates $x^\mu \leftrightarrow (t, x^i)$ and metric $\eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1) = \eta_{\mu\nu}$. All inertial observers use such a space-time and relate their descriptions of the experiments by using the coordinate transformations: $x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$. The $\Lambda^\mu{}_\nu$ denote the Lorentz transformations. These transformations, termed Poincare transformations, preserve the space-time metric. Under composition of transformations, these form the Poincare group.
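As a quick numerical illustration of the statement that Poincare transformations preserve the metric, the following Python sketch builds a sample boost and checks both the matrix condition and the invariance of the interval between two events. The helper names (`boost_x`, `poincare`, `interval`) and the sample numbers are illustrative assumptions, not part of the notes; units with c = 1 are assumed.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # eta = diag(-1, 1, 1, 1), as in the text

def boost_x(v):
    """Lorentz boost along x with speed v (units c = 1); illustrative helper."""
    gamma = 1.0 / np.sqrt(1.0 - v * v)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = gamma
    lam[0, 1] = lam[1, 0] = -gamma * v
    return lam

def poincare(lam, a, x):
    """x'^mu = Lambda^mu_nu x^nu + a^mu."""
    return lam @ x + a

def interval(x, y):
    """Squared interval (x - y)^mu eta_{mu nu} (x - y)^nu."""
    d = x - y
    return d @ ETA @ d

lam = boost_x(0.6)
a = np.array([1.0, -2.0, 0.5, 3.0])
x = np.array([0.3, 1.0, 2.0, -1.0])
y = np.array([1.2, 0.0, -0.5, 0.7])

# Metric preservation: Lambda eta Lambda^T = eta
metric_ok = np.allclose(lam @ ETA @ lam.T, ETA)
# The interval between two events is the same for both observers
interval_ok = np.isclose(interval(x, y),
                         interval(poincare(lam, a, x), poincare(lam, a, y)))
print(metric_ok, interval_ok)
```

The translation $a^\mu$ drops out of the difference of two events, so only the condition on $\Lambda$ is tested by the interval check.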
In general, covariance of a ‘structure’ with respect to some group means that there is a homomorphism of that group into the invariance group of the structure. Here, the invariance group of the structure means the group of transformations of the structure which preserve the structure. For instance, if the ‘structure’ is a vector space, then its invariance group must be the group of invertible linear transformations. For the quantum framework, the structure is a projective Hilbert space (Hilbert space modulo non-zero scaling) and the basic observable quantities are the transition probabilities: $|\langle\psi|\phi\rangle|^2/(\|\psi\|^2\|\phi\|^2)$. It is a theorem due to Wigner that the transformations of physical states (elements of the projective Hilbert space) preserving the transition probabilities can be represented by linear or anti-linear transformations of the Hilbert space, $\psi' = A\psi$, satisfying $\langle\psi'|\phi'\rangle = \langle\psi|\phi\rangle$ or $\langle\psi'|\phi'\rangle = \langle\phi|\psi\rangle$. The former defines $A$ to be a unitary operator while the latter defines an anti-unitary operator. The anti-unitary operator is necessarily anti-linear as well. The only anti-unitary operator one encounters is the time reversal operator (more on this later). Thus covariance requires the Hilbert space of the quantum system to carry a unitary representation of the Poincare group.
If $g \in G \to R(g)$ denotes a group action which satisfies the composition rule $R(g_1 \cdot g_2) = R(g_1)\cdot R(g_2)$, then $R(g)$ constitutes a representation of the group $G$. On a Hilbert space, we denote a group action as $|\psi'\rangle := g\cdot|\psi\rangle := U(g)|\psi\rangle$. It follows that $g_1\cdot(g_2\cdot|\psi\rangle) = U(g_1)(U(g_2)|\psi\rangle) = U(g_1\cdot g_2)|\psi\rangle = (g_1\cdot g_2)\cdot|\psi\rangle$. This action also induces a natural action on the operators via $\langle\psi|g\cdot A|\psi\rangle := \langle g^{-1}\cdot\psi|A|g^{-1}\cdot\psi\rangle \leftrightarrow A' := g\cdot A := U(g)AU(g)^\dagger$. Check that (a) this is indeed a homomorphism: $(g_1\cdot g_2)\cdot A = g_1\cdot(g_2\cdot A)$ and (b) $\langle\psi'|A'|\phi'\rangle = \langle\psi|A|\phi\rangle$. All algebraic relations among the operators are preserved under this group action. Incidentally, this is exactly how one defines the action of diffeomorphisms on functions on a manifold: $[g\cdot f](p) := f(g^{-1}\cdot p)$ (this is not the pull back definition). All this essentially says that covariance is implemented through a representation of the group. Which representation?
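The two checks (a) and (b) suggested above can be tried out on a finite dimensional toy model. The sketch below is only an illustration under the assumption that the Hilbert space is $\mathbb{C}^4$ and that random unitary matrices stand in for $U(g)$; the helper names `random_unitary` and `act` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    """Unitary matrix from the QR decomposition of a random complex matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def act(u, op):
    """Induced action on operators: A -> U A U^dagger."""
    return u @ op @ u.conj().T

n = 4
u1, u2 = random_unitary(n), random_unitary(n)
a_op = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
b_op = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
phi = rng.normal(size=n) + 1j * rng.normal(size=n)

# (a) homomorphism: (g1.g2).A = g1.(g2.A)
homomorphism_ok = np.allclose(act(u1 @ u2, a_op), act(u1, act(u2, a_op)))
# (b) matrix elements are preserved: <psi'|A'|phi'> = <psi|A|phi>
lhs = np.vdot(u1 @ psi, act(u1, a_op) @ (u1 @ phi))
rhs = np.vdot(psi, a_op @ phi)
elements_ok = np.isclose(lhs, rhs)
# algebraic relations are preserved: (AB)' = A'B'
product_ok = np.allclose(act(u1, a_op @ b_op), act(u1, a_op) @ act(u1, b_op))
print(homomorphism_ok, elements_ok, product_ok)
```

Note that `np.vdot` conjugates its first argument, which is what the bra requires.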
B. Compatibility with quantum conditions
Any particular quantum system is however distinguished by what Dirac called quantum conditions, e.g. $[q^i, p_j] = i\hbar\,\delta^i{}_j$. It is on the Hilbert space carrying a representation of the quantum conditions that the covariance group must be represented. To appreciate what this means, let us take the specific examples of the Galilean group and the Poincare group and try to seek unitary representations on the Hilbert space of a single particle with the above quantum conditions with $i, j = 1, 2, 3$. Taking the generators of the infinitesimal transformations to be represented by self-adjoint operators, the Lie algebras of the two groups have the following commutation relations:
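Galilean transformations: $x'^i = R^i{}_j x^j + v^i t + a^i$, $t' = t + b$, $R^i{}_m R^j{}_n \delta^{mn} = \delta^{ij}$;
Generators: $P_0, P_i, C_i, M_{ij}$; with non-vanishing commutators:
$[M_{ij}, M_{kl}] = i\left[\delta_{ik} M_{jl} - (i \leftrightarrow j) - (k \leftrightarrow l) + (i, k \leftrightarrow j, l)\right]$;
$[M_{ij}, P_k] = i\left[\delta_{ik} P_j - \delta_{jk} P_i\right]$; $[M_{ij}, C_k] = i\left[\delta_{ik} C_j - \delta_{jk} C_i\right]$;
$[C_i, P_j] = iM\delta_{ij}$, $[C_i, P_0] = iP_i$.
Poincare transformations: $x'^\mu := \Lambda^\mu{}_\nu x^\nu + a^\mu$, $\Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta \eta^{\alpha\beta} = \eta^{\mu\nu}$, $\eta = \mathrm{diag}(-1, 1, 1, 1)$;
Generators: $P_0, P_i, K_i, M_{ij}$; with non-vanishing commutators:
$[M_{ij}, M_{kl}] = i\left[\delta_{ik} M_{jl} - (i \leftrightarrow j) - (k \leftrightarrow l) + (i, k \leftrightarrow j, l)\right]$;
$[M_{ij}, P_k] = i\left[\delta_{ik} P_j - \delta_{jk} P_i\right]$; $[M_{ij}, K_k] = i\left[\delta_{ik} K_j - \delta_{jk} K_i\right]$;
$[K_i, P_j] = i\delta_{ij} P_0$, $[K_i, P_0] = iP_i$ and $[K_i, K_j] = iM_{ij}$.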
Both have the same number of generators, 10, and almost the same Lie algebra except that
the Galilean algebra has a ‘central extension’ denoted by the (mass) parameter, M and
the ‘Galilean boost’ generators Ci commute among themselves unlike the ‘Lorentz boost’
generators Ki .
These can be expressed in terms of the basic operators $q^i, p_j$ as follows.¹
Galilean: $M_{ij} := q_i p_j - q_j p_i$, $P_i := p_i$, $C_i := M q_i$, $P_0 := \frac{\vec p\cdot\vec p}{2M}$;
Poincare: $M_{ij} := q_i p_j - q_j p_i$, $P_i := p_i$, $K_i := q_i P_0$, $P_0 := \sqrt{\vec p\cdot\vec p + M^2}$.
The parameter M is naturally identified with the mass of our particle system. Notice that
only the boost generators and the P0 generators are different in the two groups.
Ex: Verify that the above definitions of the generators satisfy the respective algebras, up to factors of $\hbar$.
¹ Thomas Jordan, 0810.4637 and the 1st reference in it. That reference shows that if $q^i$ is taken to represent position and to transform under the Galilean group accordingly, then the Galilean brackets suffice to fix the remaining generators, including P up to a constant. This also shows that there are no functions on the phase space which are invariant under the Galilean group apart from constant functions.
Ex: Under an infinitesimal group action, the infinitesimal change in any operator is given by $\delta_{\vec\epsilon} A := i[A, \vec\epsilon\cdot\vec T]$, where $\vec\epsilon$ denotes the infinitesimal parameters and $\vec T$ denotes the generators. Show that the infinitesimal changes $\delta_{\vec\epsilon}\, q^i$, $\delta_{\vec\epsilon}\, p_i$ are at most linear in $q^i, p_i$ for the Galilean case but not so in the Poincare case for the $K_i$, $P_0$.
In particular, the Poincare group action is non-linear on the basic operators. This is
problematic for the following reason. At the classical level, the observables are sufficiently
smooth functions on the phase space and the set of all observables can be given the structure
of an algebra: i.e. a vector space with a multiplication (product of two functions) defined. A
similar feature holds in the quantum case. Observables are self-adjoint operators. The set of
observables too can be given the structure of an algebra (product of two operators). In both
cases, the algebras can be thought of as being generated by the basic observables. The action
of a group on the basic variables induces a corresponding action on all operators. There
are domain issues in the quantum case, but much more importantly, the multiplication is
non-commutative which leads to the ordering issues.
C. Manifest Covariance
When the basic observables transform non-linearly, it is much harder to deduce the transformation properties of the other operators. This is compounded further by the non-commutativity in the quantum case. When the transformations of basic variables are linear, the non-commutativity issue exists but is usually easier to manage. With this in mind we now put the additional requirement that the basic observables transform linearly, i.e. in effect tensorially (modulo inhomogeneous action of translations). This is sometimes called the requirement of manifest covariance.
Invoking manifest covariance discards the Poincare group action on the phase space of a particle. The action of the Galilean group is however permissible.
This could of course be generalized to $N$ particles, $i, j = 1, \ldots, 3N$. We see that the non-relativistic quantum theory of $N$ particles is not capable of implementing manifest Poincare covariance, and we can expect potential hazards in incorporating relativistic causality and the creation/annihilation processes.
Could we not extend the quantum conditions to a relativistically covariant form?
For instance, define a quantum system by postulating quantum conditions as $[q^\mu, p_\nu] = i\hbar\,\delta^\mu{}_\nu$. Then the Poincare generators can be taken to be $M_{\mu\nu} := q_\mu p_\nu - q_\nu p_\mu$ and $P_\mu := p_\mu$. It is trivial to check the Poincare Lie algebra. These also show that the basic observables are tensors of ranks as indicated by their index positions. This could be a model for a relativistic particle. However, now there is another problem in quantum theory.
From the basic commutators, we can define the creation/annihilation operators: $a^\mu := \frac{1}{\sqrt{2\hbar}}(q^\mu + i\eta^{\mu\nu} p_\nu)$ and its adjoint $(a^\mu)^\dagger$, satisfying the commutation relations $[a^\mu, (a^\nu)^\dagger] = \eta^{\mu\nu}$. As usual, the states are labeled by the eigenvalues of the number operators and the vacuum state is defined by $a^\mu|0\rangle = 0\ \forall\ \mu$. This is a Lorentz covariant form of defining the vacuum state. It follows immediately that $\langle 0|[a^0, (a^0)^\dagger]|0\rangle = \|(a^0)^\dagger|0\rangle\|^2 = \eta^{00} = -1$ for a normalized vacuum, i.e. the state $(a^0)^\dagger|0\rangle$ would have a negative norm squared, which is impossible for a positive definite inner product.
Additionally, if we were to interpret $x^0$ as a “time” coordinate so that $p_0$ is interpreted as energy, then the energy eigenvalues must necessarily be unbounded below. The Stone-von Neumann theorem implies that if the canonical commutation relations arise from the infinitesimal form of the Weyl relations, then the spectra of both the conjugate operators consist of all real numbers.
Thus, the manifestly Lorentz covariant quantum conditions cannot be realized on a Hilbert space!
Let us turn attention to systems with infinitely many degrees of freedom. The best known example is classical electrodynamics. This is also relativistically covariant (in fact, it led to special relativity!). Its consequence, the electromagnetic waves, surprisingly show particulate behavior. While classically an accelerated charge radiates waves at the frequency of its mechanical motion, in atomic systems the electrons only radiate at frequencies determined by energy differences between levels. When photographic films are exposed to low-intensity light, the ‘light marks’ are localized. In interference experiments, the interference pattern builds up grain by grain. The photo-electric effect also shows a threshold behavior. All these are very particulate manifestations. The observation of intensity correlations, the $(g-2)$ measurement and the Lamb shift all point to going beyond classical electrodynamics. The validation by these high-precision measurements supports the quantum electrodynamical theory - a quantum field theory. The most recent example is of course the prediction of the Higgs particle and its discovery. These are positive arguments in favor of the quantum field theoretic framework being well suited for a quantum theory in relativistic regimes.
What is a QFT framework? Let us keep in mind the familiar classical electromagnetic field. As an example, consider a reflective cavity in which some classical electromagnetic field exists. This field satisfies the source free, first order in time, Maxwell equations which imply that the electric and magnetic fields satisfy a wave equation. Let us put some boundary conditions (the cavity is closed, say). We can write the general solution of the wave equation as a linear combination of an infinite set of mode functions, which are by themselves convenient solutions satisfying the boundary condition. Typically, this selects a set of frequencies/wavelengths. The expansion coefficients encode the relative weights of the mode functions. One can express the energy and the momentum contained in the cavity
field, in terms of the expansion coefficients with the disappearance of the mode functions
- their memory remains only in the frequencies/wavelengths. Classically, the expansion
coefficients are complex numbers without any particular restriction (except that the total
energy/momentum should be finite!). Schematically, we quantize the electromagnetic field
by putting hats on the fields or equivalently on the expansion coefficients. The quantization
procedure requires imposition of quantum conditions which make the expansion coefficients
similar to creation/annihilation operators - and we have the quanta of the cavity field. Any
exchange of energy/momentum within the cavity or with any outside environment is done
in terms of these quanta, typically referred to as photons. The many body facet mentioned
at the beginning does essentially this process in reverse.
It is useful to recall two equivalent ways of thinking about the quantum harmonic oscillator.
We may think of it as a single particle performing motion which is somehow “discretized”
or as a collection of “quanta”, each carrying an energy ~ω and the ‘particle motion’ being
understood as consequences of emission/absorption of these quanta. It is the latter view
which is more convenient in the context of a field.
In a nutshell then, a quantum field is a means to hold or supply an arbitrary collection of quanta and all interactions are transactions of energy/momentum/charges etc. in terms of these quanta.
This view is certainly borne out in perturbative analysis but is by no means the most general one.
2. THE POINCARE GROUP AND ITS REPRESENTATION: MASS AND SPIN/HELICITY
A. Poincare group, Lie algebra and Casimir invariants
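The Poincare group: The Poincare group or inhomogeneous Lorentz group is defined by the transformations of space-time coordinates,
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu, \quad \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta \eta^{\alpha\beta} = \eta^{\mu\nu}, \quad \eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1) = \eta_{\mu\nu}, \quad a^\mu \in \mathbb{R}^4. \tag{2.1}$$
The defining conditions on $\Lambda^\mu{}_\nu$ imply that,
$$\eta_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta = \eta_{\alpha\beta} = \Lambda_{\nu\alpha} \Lambda^\nu{}_\beta \ \Rightarrow\ \delta^\alpha{}_\beta = \Lambda_\nu{}^\alpha \Lambda^\nu{}_\beta \tag{2.2}$$
$$\therefore\ \Lambda_\nu{}^\alpha = (\Lambda^{-1})^\alpha{}_\nu \ \leftrightarrow\ (\Lambda^{-1})^\nu{}_\alpha = \Lambda_\alpha{}^\nu \qquad \text{(note the index positions)} \tag{2.3}$$
It is convenient to view $\Lambda^\mu{}_\nu$ as matrices. It is then important to adopt a consistent convention regarding the index positions. Viewed as matrices, in the last equation, $\Lambda_\nu{}^\alpha$ is the left inverse of $\Lambda$ while $\Lambda_\alpha{}^\nu$ is the right inverse of $\Lambda$.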
The infinitesimal transformations are defined by taking $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu$ and $a^\mu = \epsilon^\mu$. The defining equation for Lorentz transformations then implies $\omega^{\mu\nu} = -\omega^{\nu\mu}$, $\omega^{\mu\nu} := \omega^\mu{}_\alpha \eta^{\alpha\nu}$. This also explains the parameter counting of 6 for the Lorentz transformations and 4 for the translations.
Thus, the Poincare group is a 10 parameter continuous group (actually a Lie group) with the 6 parameters in $\Lambda$ constituting the Lorentz subgroup and the 4 parameters $a^\mu$ constituting the translations subgroup. We denote a generic element of the group as $(\Lambda, a)$.
Clearly $\det\Lambda = \pm 1$. The transformations with positive determinant are called proper Lorentz transformations while those with negative determinant are called improper. Furthermore, $-(\Lambda^0{}_0)^2 + \sum_{i=1}^3 (\Lambda^0{}_i)^2 = -1 \Rightarrow \Lambda^0{}_0 = \pm\sqrt{1 + \sum_i (\Lambda^0{}_i)^2}$. Those with positive $\Lambda^0{}_0$ are called orthochronous Lorentz transformations. Since the identity transformation is both proper and orthochronous, the subgroup of proper, orthochronous transformations is the continuous subgroup connected to the identity. The two improper transformations with $\Lambda = \mathrm{diag}(1, -1, -1, -1)$ and $\Lambda = \mathrm{diag}(-1, 1, 1, 1)$ are called the space inversion and time reversal transformations respectively. It can be shown that any Lorentz transformation can be obtained from a proper, orthochronous transformation followed by space inversion and/or time reversal.
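The classification into proper/improper and orthochronous/non-orthochronous transformations can be phrased as a small routine. The following is a minimal sketch; the function name `classify` and its string labels are illustrative choices, not notation from these notes.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])

def classify(lam, tol=1e-9):
    """Return (proper?, orthochronous?) labels for a Lorentz matrix Lambda^mu_nu."""
    if not np.allclose(lam @ ETA @ lam.T, ETA, atol=tol):
        raise ValueError("not a Lorentz transformation")
    proper = "proper" if np.linalg.det(lam) > 0 else "improper"
    ortho = "orthochronous" if lam[0, 0] > 0 else "non-orthochronous"
    return proper, ortho

identity = np.eye(4)
space_inversion = np.diag([1.0, -1.0, -1.0, -1.0])
time_reversal = np.diag([-1.0, 1.0, 1.0, 1.0])

for name, lam in [("identity", identity),
                  ("space inversion", space_inversion),
                  ("time reversal", time_reversal),
                  ("space inversion and time reversal", space_inversion @ time_reversal)]:
    print(name, classify(lam))
```

The combined space inversion and time reversal has determinant +1 but negative $\Lambda^0{}_0$, so it is proper yet not connected to the identity.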
Group composition: The composition of two Poincare transformations is defined as: $(\Lambda_2, a_2)\cdot(\Lambda_1, a_1) = (\Lambda_2\Lambda_1, \Lambda_2 a_1 + a_2)$. It follows that $(1, 0)$ is the identity and $(\Lambda, a)^{-1} = (\Lambda^{-1}, -\Lambda^{-1} a)$.
Check: $(\Lambda, a)\cdot(1, b)\cdot(\Lambda, a)^{-1} = (1, \Lambda b)$, i.e. $gHg^{-1} \in H\ \forall\ g$, where $H$ is the subgroup of translations. Such a subgroup $H$ is called an invariant or normal subgroup. Thus the translations subgroup is a normal subgroup of the Poincare group and the Poincare group is said to be a semi-direct product of the Lorentz and the translations groups.
Check: Any group element can be uniquely written as: $(\Lambda, a) = (\Lambda, 0)\cdot(1, \Lambda^{-1} a) = (1, a)\cdot(\Lambda, 0)$.
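The composition rule and the two checks above can be verified numerically by representing a group element as a pair (matrix, vector). This is a minimal sketch with hypothetical helper names (`compose`, `inverse`, `same`) and an arbitrarily chosen sample element.

```python
import numpy as np

def compose(g2, g1):
    """(L2, a2).(L1, a1) = (L2 L1, L2 a1 + a2)."""
    (l2, a2), (l1, a1) = g2, g1
    return l2 @ l1, l2 @ a1 + a2

def inverse(g):
    """(L, a)^{-1} = (L^{-1}, -L^{-1} a)."""
    lam, a = g
    lam_inv = np.linalg.inv(lam)
    return lam_inv, -lam_inv @ a

def same(g, h):
    """Compare two group elements component by component."""
    return bool(np.allclose(g[0], h[0]) and np.allclose(g[1], h[1]))

# Sample element: rotation about z by 0.4 rad together with a translation
c, s = np.cos(0.4), np.sin(0.4)
lam = np.eye(4)
lam[1:3, 1:3] = [[c, -s], [s, c]]
a = np.array([0.5, 1.0, -2.0, 3.0])
b = np.array([2.0, 0.0, 1.0, -1.0])
g = (lam, a)
e = (np.eye(4), np.zeros(4))

inverse_ok = same(compose(g, inverse(g)), e) and same(compose(inverse(g), g), e)
# Conjugating a pure translation (1, b) gives the pure translation (1, L b)
conj = compose(compose(g, (np.eye(4), b)), inverse(g))
normal_ok = same(conj, (np.eye(4), lam @ b))
# Factorizations: (L, a) = (L, 0).(1, L^{-1} a) = (1, a).(L, 0)
split_ok = (same(compose((lam, np.zeros(4)), (np.eye(4), np.linalg.inv(lam) @ a)), g)
            and same(compose((np.eye(4), a), (lam, np.zeros(4))), g))
print(inverse_ok, normal_ok, split_ok)
```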
While it is possible to proceed with the abstract identification of the Lie algebra via the vector fields generating the infinitesimal transformations, we can simplify by directly considering a unitary representation of the Poincare group. That is, a set of unitary operators $U(g) : \mathcal{H} \to \mathcal{H}$ such that (i) $U^\dagger(g)U(g) = 1 = U(g)U(g)^\dagger$ and (ii) $U(g_2 g_1) = U(g_2)U(g_1)$, $\forall\ g$'s $\in G$. To allow for the possibility of infinite dimensional representations, we use operators on a Hilbert space. We also restrict to irreducible representations, i.e. the representation space has no proper subspace which is invariant under the $U(g)$ operators. It is a result (from Schur’s lemma) that if $U(g)$ is a unitary, irreducible
representation, then the only bounded operator that commutes with all U (g) operators is
a multiple of the identity operator. Its converse also holds, namely, if the only operator
that commutes with all the U (g)’s is a multiple of the identity operator, then the unitary
representation is irreducible. A simple application of this result is that unitary, irreducible
representations of abelian groups are one dimensional. Hence the translation subgroup has
only 1-dimensional irreducible representations.
We define the infinitesimal generators as,
$$U(\Lambda, a) = U(\delta^\mu{}_\alpha + \omega^\mu{}_\alpha, \epsilon^\mu) := 1 + \frac{i}{2}\omega_{\mu\nu} M^{\mu\nu} - i\epsilon_\mu P^\mu + o(\omega^2, \epsilon^2), \quad M, P \text{ are self-adjoint}. \tag{2.4}$$
The commutation relations among the generators are obtained by using the homomorphism property, $U(\Lambda, a)^{-1} U(\lambda, b) U(\Lambda, a) = U[(\Lambda, a)^{-1}\cdot(\lambda, b)\cdot(\Lambda, a)]$, and taking $\lambda = 1 + \omega$, $b = \epsilon$. Using the definition of the generators, it follows,
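$$\omega_{\mu\nu}\, U(\Lambda, a)^{-1} M^{\mu\nu} U(\Lambda, a) = (\Lambda^{-1}\omega\Lambda)_{\alpha\beta} M^{\alpha\beta} - 2(\Lambda^{-1}\omega a)_\alpha P^\alpha \tag{2.5}$$
$$\epsilon_\mu\, U(\Lambda, a)^{-1} P^\mu U(\Lambda, a) = (\Lambda^{-1}\epsilon)_\alpha P^\alpha.$$
Using
$$(\Lambda^{-1}\epsilon)_\alpha = \eta_{\alpha\sigma} (\Lambda^{-1})^\sigma{}_\mu \epsilon^\mu = (\Lambda^{-1})_{\alpha\mu} \epsilon^\mu = \Lambda^\mu{}_\alpha \epsilon_\mu, \tag{2.6}$$
$(\Lambda^{-1}\omega\Lambda)_{\alpha\beta} = \omega_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta$ and $(\Lambda^{-1}\omega a)_\alpha = \omega_{\mu\nu} \Lambda^\mu{}_\alpha a^\nu$, we get
$$U(\Lambda, a)^{-1} M^{\mu\nu} U(\Lambda, a) = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta M^{\alpha\beta} - (\Lambda^\mu{}_\alpha a^\nu - \Lambda^\nu{}_\alpha a^\mu) P^\alpha, \tag{2.7}$$
$$U(\Lambda, a)^{-1} P^\mu U(\Lambda, a) = \Lambda^\mu{}_\nu P^\nu.$$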
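The last two equations show that under the ‘adjoint’ action of the group, the generators transform as Lorentz tensors. To deduce the commutation relations, take $\Lambda, a$ to be infinitesimal, i.e. $U(\Lambda, a)^{\pm 1} = 1 \pm \frac{i}{2}\omega_{\alpha\beta} M^{\alpha\beta} \mp i\epsilon_\alpha P^\alpha$. Substitution and reading off coefficients gives,
$$[M^{\mu\nu}, M^{\alpha\beta}] = i\left[\eta^{\mu\alpha} M^{\nu\beta} - (\mu \leftrightarrow \nu) - (\alpha \leftrightarrow \beta) + (\mu, \alpha \leftrightarrow \nu, \beta)\right] \tag{2.8}$$
$$[M^{\mu\nu}, P^\alpha] = i\left[\eta^{\mu\alpha} P^\nu - \eta^{\nu\alpha} P^\mu\right] \tag{2.9}$$
$$[P^\mu, P^\nu] = 0. \tag{2.10}$$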
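The relations (2.8)-(2.10) can be checked in a concrete, non-unitary but faithful, matrix representation. The sketch below assumes the 5×5 affine representation in which $(\Lambda, a)$ acts on the column $(x^\mu, 1)$; the generator matrices are fixed by demanding that $1 + \frac{i}{2}\omega_{\mu\nu}M^{\mu\nu} - i\epsilon_\mu P^\mu$ reproduces $x \to x + \omega x + \epsilon$. The names `gen_m`, `gen_p` and `comm` are illustrative.

```python
import numpy as np
from itertools import product

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])

def gen_m(mu, nu):
    """M^{mu nu} on (x, 1): entries -i (eta^{mu rho} delta^nu_sigma - eta^{nu rho} delta^mu_sigma)."""
    m = np.zeros((5, 5), dtype=complex)
    for rho in range(4):
        for sigma in range(4):
            m[rho, sigma] = -1j * (ETA[mu, rho] * (nu == sigma)
                                   - ETA[nu, rho] * (mu == sigma))
    return m

def gen_p(mu):
    """P^mu on (x, 1): 1 - i eps_mu P^mu shifts x^rho by eps^rho."""
    p = np.zeros((5, 5), dtype=complex)
    p[:4, 4] = 1j * ETA[mu, :]
    return p

def comm(x, y):
    return x @ y - y @ x

M = {(m, n): gen_m(m, n) for m, n in product(range(4), repeat=2)}
P = [gen_p(m) for m in range(4)]

# Normalization check against (2.4): (i/2) w_{mu nu} M^{mu nu} acts as w^rho_sigma
w_lower = np.array([[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]], float)
block = sum(0.5j * w_lower[m, n] * M[m, n] for m, n in M)[:4, :4]
ok_24 = np.allclose(block, ETA @ w_lower)

ok_28 = all(
    np.allclose(comm(M[m, n], M[a, b]),
                1j * (ETA[m, a] * M[n, b] - ETA[n, a] * M[m, b]
                      - ETA[m, b] * M[n, a] + ETA[n, b] * M[m, a]))
    for m, n, a, b in product(range(4), repeat=4))
ok_29 = all(
    np.allclose(comm(M[m, n], P[a]), 1j * (ETA[m, a] * P[n] - ETA[n, a] * P[m]))
    for m, n, a in product(range(4), repeat=3))
ok_210 = all(np.allclose(comm(P[m], P[n]), 0) for m, n in product(range(4), repeat=2))
print(ok_24, ok_28, ok_29, ok_210)
```

Since the commutation relations are a property of the group, any faithful representation with the same normalization of generators should reproduce them; the affine representation is simply the smallest convenient one.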
It is convenient to introduce the notation $K^i := M^{i0}$ and $J_i := \frac{1}{2}\epsilon_{ijk} M^{jk} \leftrightarrow M^{ij} = \epsilon^{ijk} J_k$, and also $H := P^0$. Then the Poincare commutators take the form (non-zero commutators only),
$$[J_i, J_j] = i\epsilon_{ij}{}^k J_k, \quad [J_i, K_j] = i\epsilon_{ij}{}^l K_l \tag{2.11}$$
$$[K_i, K_j] = -i\epsilon_{ij}{}^l J_l, \quad [J_i, P_j] = i\epsilon_{ij}{}^k P_k \tag{2.12}$$
$$[K_i, P_j] = iH\delta_{ij}, \quad [K_i, H] = iP_i \tag{2.13}$$
Define the Pauli-Lubanski vector, $W_\mu := \frac{1}{2}\epsilon_{\mu\nu\alpha\beta} M^{\nu\alpha} P^\beta$, $\epsilon_{0123} := 1$. It follows that $[P^\mu, W^\nu] = 0$, $[M^{\mu\nu}, W^\lambda] = i\left[\eta^{\mu\lambda} W^\nu - \eta^{\nu\lambda} W^\mu\right]$ and $[W_\mu, W_\nu] = -i\epsilon_{\mu\nu\alpha\beta} W^\alpha P^\beta$. These relations imply that $P^2 := \eta_{\mu\nu} P^\mu P^\nu$, $W^2 := \eta^{\mu\nu} W_\mu W_\nu$ commute with all the generators of the Poincare group and hence also with all the group elements of the proper, orthochronous Poincare group. These are the two independent Casimir invariants of the Poincare group and must be multiples of the identity operator in irreducible representations. These multiples serve to label the unitary, irreducible representations.
B. Representations of the proper, orthochronous Poincare group:
Each of its representations induces a representation of its Lie algebra (2.11). Since the $P^\mu$'s commute, their simultaneous eigenvalues can be taken as a set of labels for a basis.
$$P^\mu |k^\mu, \xi\rangle = k^\mu |k^\mu, \xi\rangle \ \Rightarrow\ P^2 |k^\mu, \xi\rangle = k^2 |k^\mu, \xi\rangle, \quad k^2 = \eta_{\mu\nu} k^\mu k^\nu. \tag{2.14}$$
Here, $\xi$ collectively labels the degenerate eigenvectors and $k^2$ is the value of the Casimir $P^2$. Thus, the first level of classification is done by the value of the Lorentz invariant $k^2$. There are four classes that arise naturally:
(i) $m^2 := -k^2 > 0$ (massive), (ii) $k^2 = 0$, $k^\mu \neq 0$ (massless), (iii) $k^2 > 0$ (tachyonic), and (iv) $k^\mu = 0$ (vacuum). The vacuum representation is just a single vector. The tachyonic representations have not shown up in experiments yet.
• Since $P^\mu$ transforms as a Lorentz vector, so does $k^\mu$, and for $k^2 \le 0$ the $\mathrm{sgn}(k^0)$ distinguishes two further subclasses of the massive and the massless representations.
We restrict to the massive and the massless representations with $k^0 > 0$ as being physically relevant.
• Under a Lorentz transformation $(\Lambda, 0)$, $k^\mu$ goes to another point on the hyperboloid $k^2 = $ constant. How does the collective label $\xi$ transform? We would like to deduce the action of $U(\Lambda, a)$ on the vectors $|k^\mu, \xi\rangle$. Since $U(1, a) = \exp(-ia^\mu P_\mu)$, we already know that $U(1, a)|k^\mu, \xi\rangle = \exp(-ia^\mu k_\mu)|k^\mu, \xi\rangle$ and we focus on the Lorentz transformations $U(\Lambda, 0) =: U(\Lambda)$.
The Lorentz transformation relation (2.7) implies,
$$P^\mu U(\Lambda)|k, \xi\rangle = \Lambda^\mu{}_\nu U(\Lambda) P^\nu |k, \xi\rangle = \Lambda^\mu{}_\nu k^\nu\, U(\Lambda)|k, \xi\rangle \ \Rightarrow\ U(\Lambda)|k, \xi\rangle = \sum_{\xi'} C_{\xi,\xi'} |\Lambda k, \xi'\rangle. \tag{2.15}$$
We want to find the coefficients $C_{\xi,\xi'}$ corresponding to irreducible representations of the Lorentz group. In principle, these coefficients could (and do) depend on the $k$ label; however, their number (the degree of degeneracy) must be the same over a given hyperboloid. This is essentially a continuity argument: the label $k$ changes continuously while the degree of degeneracy is integer valued.
Let $\hat k$ be some convenient, fixed vector on the $k^2 \le 0$ hyperboloid and let $L(k)$ be some arbitrarily chosen Lorentz transformation such that $k^\mu = L(k)^\mu{}_\nu \hat k^\nu$. The Lorentz transformation $L$ depends explicitly on $k$ (and also implicitly on $\hat k$) and is not unique. For instance, for $\hat k = (m, \vec 0)$, any $k$ can be obtained by a combination of a rotation and a boost transformation. Having made a choice of $L(k)$, define,
$$|k, \xi\rangle := N(k)\, U(L(k))|\hat k, \xi\rangle \tag{2.16}$$
Since this is a definition, the same label $\xi$ appears on both sides. It follows,
$$U(\Lambda)|k, \xi\rangle = N(k)\, U(\Lambda L(k))|\hat k, \xi\rangle = N(k)\, U(L(\Lambda k)) \cdot U((L(\Lambda k))^{-1})\, U(\Lambda) \cdot U(L(k))|\hat k, \xi\rangle.$$
The last three factors of $U$ combine to give $U(L^{-1}(\Lambda k)\cdot\Lambda\cdot L(k))$, whose argument, acting on $\hat k$, takes $\hat k \to k \to \Lambda k \to \hat k$ since $(\Lambda k)^\mu = [L(\Lambda k)]^\mu{}_\nu \hat k^\nu$. Denote: $W(\Lambda, k) := L^{-1}(\Lambda k)\cdot\Lambda\cdot L(k)$. It follows that $W(\Lambda, k)\hat k = \hat k\ \forall\ \Lambda$.
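For the massive case, the statement that $W(\Lambda, k)$ leaves $\hat k$ fixed can be illustrated numerically. The sketch below assumes one particular (conventional, but not unique) choice of $L(k)$, namely the pure boost taking $(m, \vec 0)$ to $k$; the names `on_shell`, `standard_boost` and `wigner` and the sample $\Lambda$ are illustrative.

```python
import numpy as np

def on_shell(kvec, m):
    """Four-momentum (omega, kvec) with omega = sqrt(kvec^2 + m^2)."""
    kvec = np.asarray(kvec, dtype=float)
    return np.concatenate(([np.sqrt(kvec @ kvec + m * m)], kvec))

def standard_boost(k, m):
    """One possible choice of L(k): the pure boost taking (m, 0, 0, 0) to k."""
    omega, kvec = k[0], k[1:]
    lam = np.eye(4)
    lam[0, 0] = omega / m
    lam[0, 1:] = kvec / m
    lam[1:, 0] = kvec / m
    lam[1:, 1:] += np.outer(kvec, kvec) / (m * (omega + m))
    return lam

def wigner(lam, k, m):
    """W(Lambda, k) = L(Lambda k)^{-1} Lambda L(k)."""
    return np.linalg.inv(standard_boost(lam @ k, m)) @ lam @ standard_boost(k, m)

m = 1.0
k_hat = np.array([m, 0.0, 0.0, 0.0])
k = on_shell([0.3, -1.2, 0.8], m)

# Sample Lambda: a boost along x (speed 0.5) applied after a rotation about z
gamma, v = 1.0 / np.sqrt(1.0 - 0.25), 0.5
boost = np.eye(4)
boost[0, 0] = boost[1, 1] = gamma
boost[0, 1] = boost[1, 0] = gamma * v
c, s = np.cos(0.7), np.sin(0.7)
rot = np.eye(4)
rot[1:3, 1:3] = [[c, -s], [s, c]]
lam = boost @ rot

w = wigner(lam, k, m)
reaches_k = np.allclose(standard_boost(k, m) @ k_hat, k)
fixes_k_hat = np.allclose(w @ k_hat, k_hat)
# The spatial block of W should then be a rotation matrix
block = w[1:, 1:]
is_rotation = bool(np.allclose(block @ block.T, np.eye(3))
                   and np.isclose(np.linalg.det(block), 1.0))
print(reaches_k, fixes_k_hat, is_rotation)
```

A different choice of $L(k)$ would change $W(\Lambda, k)$ by rotations, but not the fact that it belongs to the stability subgroup of $\hat k$.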
The Lorentz transformations $W(\Lambda, k)$, leaving $\hat k$ invariant, form a subgroup of the Lorentz transformations. It is known as the little group of $\hat k$ or stability subgroup of $\hat k$. Defining a representation $D(W)$ by $U(W(\Lambda, k))|\hat k, \xi\rangle := \sum_{\xi'} D_{\xi'\xi}(W)|\hat k, \xi'\rangle$, we get,
$$U(\Lambda)|k, \xi\rangle = N(k)\, U(L(\Lambda k)) \sum_{\xi'} D_{\xi'\xi}(W)|\hat k, \xi'\rangle = N(k) \sum_{\xi'} D_{\xi'\xi}(W) \left[U(L(\Lambda k))|\hat k, \xi'\rangle\right]$$
$$\therefore\ U(\Lambda)|k, \xi\rangle = \frac{N(k)}{N(\Lambda k)} \sum_{\xi'} D_{\xi'\xi}(W)|\Lambda k, \xi'\rangle \tag{2.17}$$
In the last equation, we have used the definition (2.16). The irreducible representations of the little group can thus be used to characterize the irreducible representations of the Lorentz group. We are not done yet; the normalization factors $N(k)$ need to be determined.
Determination of $N(k)$:
For different $\hat k, \hat k'$, the corresponding little groups are different. The unitarity of the Poincare (and hence of the Lorentz group) representation implies that the representations of the little groups are also unitary. We can choose $\langle\hat k', \xi'|\hat k, \xi\rangle = \delta_{\xi,\xi'}\, \delta^3(\hat k - \hat k')$ and we would like this to be preserved under Lorentz transformations, i.e. with the hatted $k$'s being replaced by their Lorentz transforms. This is possible only for a specific choice of $N(k)$ which we determine now.
Note that this requirement associates the ortho-normalization with the hyperboloids determined by $\hat k^2, (\hat k')^2$. Since a delta function is defined in the context of an integration, we are implicitly envisaging an integration over the hyperboloids and not over $\mathbb{R}^4$. The integration is to be Lorentz invariant. This can be inferred as follows.
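$$\int_{\text{mass shell}} \theta(k^0) f(k) := \int_{\mathbb{R}^4} d^4k\, \theta(k^0)\, \delta(k^2 + m^2) f(k) = \int d^3k\, dk^0\, \theta(k^0)\, \delta(-(k^0)^2 + \vec k\cdot\vec k + m^2) f(k)$$
$$= \int d^3k\, \frac{f(\sqrt{\vec k^2 + m^2}, \vec k)}{2\sqrt{\vec k^2 + m^2}} \qquad \because\ \delta(f(x)) = \sum_i \frac{\delta(x - x_i)}{|f'(x_i)|}. \tag{2.18}$$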
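The measure $d^3k / (2\sqrt{\vec k^2 + m^2})$ appearing on the right of (2.18) is Lorentz invariant, which can be tested numerically: under a boost $\vec k \to \vec k'$ on the mass shell, $d^3k' = |\det J|\, d^3k$, so invariance requires $\det J = \omega'/\omega$. The sketch below estimates the Jacobian by central differences; the function names, the step size `h` and the sample values are assumptions made for illustration.

```python
import numpy as np

def omega(kvec, m):
    """On-shell energy sqrt(kvec^2 + m^2)."""
    return np.sqrt(kvec @ kvec + m * m)

def boosted_momentum(kvec, v, m):
    """Spatial part of the boosted on-shell momentum (boost along x, speed v, c = 1)."""
    gamma = 1.0 / np.sqrt(1.0 - v * v)
    w = omega(kvec, m)
    return np.array([gamma * (kvec[0] - v * w), kvec[1], kvec[2]])

def jacobian(kvec, v, m, h=1e-6):
    """Central-difference Jacobian d k'_i / d k_j on the mass shell."""
    jac = np.zeros((3, 3))
    for j in range(3):
        dk = np.zeros(3)
        dk[j] = h
        jac[:, j] = (boosted_momentum(kvec + dk, v, m)
                     - boosted_momentum(kvec - dk, v, m)) / (2 * h)
    return jac

m, v = 1.0, 0.6
kvec = np.array([0.4, -0.9, 1.3])
kprime = boosted_momentum(kvec, v, m)

# d^3k'/(2 omega') = d^3k/(2 omega) requires det J = omega'/omega
det_j = np.linalg.det(jacobian(kvec, v, m))
ratio = omega(kprime, m) / omega(kvec, m)
measure_invariant = bool(np.isclose(det_j, ratio, rtol=1e-6))
print(det_j, ratio, measure_invariant)
```

Rotations leave both $d^3k$ and $\omega$ unchanged, so checking a boost along one axis is enough for the proper, orthochronous transformations.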
Although there are two roots from the delta function of the mass shell, the $\theta(k^0)$ picks out only one term with $\omega_k := k^0 := +\sqrt{\vec k\cdot\vec k + m^2}$. The $\frac{d^3k}{2\omega_k}$ is the Lorentz invariant volume element (measure) on the mass shell. Since we have,
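$$f(\vec k) = \int_{\mathbb{R}^3} d^3k'\, f(\vec k')\, \delta^3(\vec k - \vec k') = \int_{\mathbb{R}^3} \frac{d^3k'}{2\omega_{k'}}\, f(\vec k') \left[2\omega_{k'}\, \delta^3(\vec k - \vec k')\right], \tag{2.19}$$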
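we identify the invariant delta function on the mass shell as:
$$\delta^3_{\mathrm{inv}}(\vec k - \vec k') := 2\omega_k\, \delta^3(\vec k - \vec k') = 2\sqrt{\vec k^2 + m^2}\; \delta^3(\vec k - \vec k'). \tag{2.20}$$
Let $k = L(k)\hat k$ and define $k' := L(k)\hat k'$ with the same $L(k)$. Then
$$\langle k', \xi'|k, \xi\rangle = N^*(k') N(k) \langle\hat k', \xi'|U^\dagger(L(k))\, U(L(k))|\hat k, \xi\rangle = |N(k)|^2\, \delta_{\xi',\xi}\, \delta^3(\hat k' - \hat k).$$
We have used unitarity of $U(L(k))$. But we also know that $\delta^3(\vec k' - \vec k)$ and $\delta^3(\hat k' - \hat k)$ are related by the same Lorentz transformation. Therefore using the invariant delta function we have,
$$\omega_k\, \delta^3(\vec k' - \vec k) = \omega_{\hat k}\, \delta^3(\hat k' - \hat k) \tag{2.21}$$
$$\therefore\ \langle k', \xi'|k, \xi\rangle = |N(k)|^2\, \delta_{\xi',\xi}\, \frac{k^0}{\hat k^0}\, \delta^3(\vec k' - \vec k) \quad \text{and the choice,} \tag{2.22}$$
$$|N(k)| := \sqrt{\frac{\hat k^0}{k^0}} \ \Rightarrow\ \langle k', \xi'|k, \xi\rangle = \delta_{\xi',\xi}\, \delta^3(\vec k - \vec k') \quad \text{and,} \tag{2.23}$$
$$U(\Lambda)|k, \xi\rangle = \sqrt{\frac{(\Lambda k)^0}{k^0}} \sum_{\xi'} D_{\xi'\xi}(W(\Lambda, k))|\Lambda k, \xi'\rangle \tag{2.24}$$
C. Little groups
It remains to determine the little groups for the various cases.
The vacuum representation ($k^\mu = 0$):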
The little group is all of the Lorentz group. However the representation is the trivial one:
U (Λ) = 1, ∀ Λ.
Massive representation ($k^2 = -m^2$): Choose $\hat k^\mu = (m, \vec 0)$. The little group is defined by $\Lambda^\mu{}_\nu \hat k^\nu = \hat k^\mu \Rightarrow \Lambda^0{}_0 = 1$, $\Lambda^i{}_0 = 0$. The defining condition on $\Lambda$, $\eta^{00} = \Lambda^0{}_0 \Lambda^0{}_0 \eta^{00} + \sum_i (\Lambda^0{}_i)^2$, gives $\Lambda^0{}_i = 0$. Hence $\Lambda$ is block diagonal, with $\delta^{ij} = \Lambda^i{}_k \Lambda^j{}_l \delta^{kl}$, and the little group is $SO(3)$, the group of rotations. Its unitary irreducible representations are all finite dimensional and labeled by its Casimir, $J^2 = s(s+1)$, $s \in \mathbb{N}/2$. The half integer ones come from the double cover $SU(2)$. The $D_{\xi'\xi}$ are the usual $(2s + 1)$ dimensional representations.
The representations of this class are thus labeled by ‘mass’ $m$ and ‘spin’ $s$.
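Massless representation ($k^2 = 0$):
Choose $\hat k^\mu = (k, 0, 0, k)$, $k > 0$. It is more convenient to determine the generators of the little group. These are some linear combinations of the Lorentz generators $M^{\alpha\beta}$. Consider the commutator,
$$[\epsilon_{\alpha\beta} M^{\alpha\beta}, P^\lambda] = i\left[\epsilon^\lambda{}_\beta P^\beta - \epsilon_\alpha{}^\lambda P^\alpha\right]$$
$$\Rightarrow\ [\epsilon_{\alpha\beta} M^{\alpha\beta}, P^\lambda]|\hat k, \xi\rangle = i\left[\epsilon^\lambda{}_\beta \hat k^\beta - \epsilon_\alpha{}^\lambda \hat k^\alpha\right]|\hat k, \xi\rangle = 2i\,\epsilon^\lambda{}_\alpha \hat k^\alpha |\hat k, \xi\rangle.$$
The l.h.s. of the above equation vanishes because the generators of the little group, $\epsilon\cdot M$, leave the eigenstate of $\hat k$ invariant. The r.h.s. then gives the conditions on the $\epsilon$'s. Explicitly, $0 = \epsilon^\lambda{}_\alpha \hat k^\alpha = (\epsilon^\lambda{}_0 + \epsilon^\lambda{}_3)k \Rightarrow \epsilon_{\alpha 0} + \epsilon_{\alpha 3} = 0$. This in turn gives $\epsilon_{03} = 0$, $\epsilon_{01} = \epsilon_{13}$, $\epsilon_{02} = \epsilon_{23}$. Hence,
$$\tfrac{1}{2}\epsilon_{\alpha\beta} M^{\alpha\beta}\big|_{\text{little}} = \epsilon_{12} M^{12} + \epsilon_{13}(M^{13} + M^{01}) + \epsilon_{23}(M^{23} + M^{02}) := \epsilon_1 (J_1 - K_2) + \epsilon_2 (-J_2 - K_1) + \epsilon_3 J_3 \tag{2.25}$$
It is conventional to denote: $A := J_2 + K_1$, $B := -J_1 + K_2$ and write the little algebra of $\hat k = k(1, 0, 0, 1)$ as,
$$[A, B] = 0, \quad [J_3, A] = iB, \quad [J_3, B] = -iA. \tag{2.26}$$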
This is the Lie algebra of the Euclidean group in 2 dimensions: two translations and one rotation. Its representations are not as familiar. To study these, define $R(\theta) := \exp(i\theta J_3)$. It is easy to check
that $R(\theta)\,A\,R^{-1}(\theta) = A\cos\theta - B\sin\theta$ and $R(\theta)\,B\,R^{-1}(\theta) = A\sin\theta + B\cos\theta$.
Consider a representation $D$ in which $A|\hat k,a,b\rangle = a|\hat k,a,b\rangle$ and $B|\hat k,a,b\rangle = b|\hat k,a,b\rangle$.
Define $|\hat k,a,b,\theta\rangle := R^{-1}(\theta)|\hat k,a,b\rangle$. It follows that,
$$A|\hat k,a,b,\theta\rangle = (a\cos\theta - b\sin\theta)|\hat k,a,b,\theta\rangle\ ,\quad B|\hat k,a,b,\theta\rangle = (a\sin\theta + b\cos\theta)|\hat k,a,b,\theta\rangle\ .$$
Hence, if $a, b$ are non-zero, then for each $\theta$, the states $|\hat k,a,b,\theta\rangle$ are simultaneous eigenstates
of $A$ and $B$. Such a representation has the continuous parameter $\theta$ and no such additional parameter is seen in observations. To avoid such a parameter, we restrict the representations of
the little group to those for which $a = b = 0$, i.e. $A, B$ vanish in the representation. The little
group then effectively reduces to just $U(1)$ generated by $J_3$. Its irreducible representations
are labeled by integers and half-integers. We denote these representations as $J_3|\hat k,\sigma\rangle = \sigma|\hat k,\sigma\rangle$, $\sigma \in \tfrac12\mathbb{Z}$.
The label $\sigma$ is called helicity. The one dimensional representation is denoted as $D_{\sigma'\sigma}$. The
representations with $a, b$ nonzero are called "continuous spin representations".
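The little algebra (2.26) and the fact that $A$, $B$, $J_3$ leave the reference momentum fixed can be verified with explicit $4\times 4$ matrices. The sketch below is hedged: it uses the vector-representation generators $(T^{\alpha\beta})^\mu{}_\nu = -i(\eta^{\alpha\mu}\delta^\beta_\nu - \eta^{\beta\mu}\delta^\alpha_\nu)$ and assumes the identifications $J_i = \tfrac12\epsilon_{ijk}T^{jk}$ and $K_i = T^{i0}$, a sign convention chosen here so that the matrices annihilate $\hat k = (1,0,0,1)$; helper names are illustrative.

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
ID4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def T(a, b):
    """(T^{ab})^mu_nu = -i (eta^{a mu} delta^b_nu - eta^{b mu} delta^a_nu)."""
    return [[-1j * (ETA[a][mu] * (b == nu) - ETA[b][mu] * (a == nu))
             for nu in range(4)] for mu in range(4)]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(c1, X, c2, Y):
    """Linear combination c1*X + c2*Y."""
    return [[c1 * X[i][j] + c2 * Y[i][j] for j in range(4)] for i in range(4)]

def comm(X, Y):
    return lin(1, mul(X, Y), -1, mul(Y, X))

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(4) for j in range(4))

def act(X, v):
    return [sum(X[i][j] * v[j] for j in range(4)) for i in range(4)]

J1, J2, J3 = T(2, 3), T(3, 1), T(1, 2)
K1, K2 = T(1, 0), T(2, 0)          # assumed sign convention for the boosts
A = lin(1, J2, 1, K1)              # A := J2 + K1
B = lin(-1, J1, 1, K2)             # B := -J1 + K2
ZERO = lin(0, ID4, 0, ID4)
khat = [1, 0, 0, 1]

algebra_ok = (close(comm(A, B), ZERO)
              and close(comm(J3, A), lin(1j, B, 0, B))
              and close(comm(J3, B), lin(-1j, A, 0, A)))
fixes_khat = all(abs(c) < 1e-12
                 for X in (A, B, J3) for c in act(X, khat))

def rot(theta):
    """exp(i theta J3), using J3^3 = J3 for this representation."""
    J3sq = mul(J3, J3)
    return lin(1, lin(1, ID4, 1j * math.sin(theta), J3),
               math.cos(theta) - 1, J3sq)

theta = 0.37
conj_A = mul(mul(rot(theta), A), rot(-theta))
rotation_ok = close(conj_A, lin(math.cos(theta), A, -math.sin(theta), B))
```

The last check reproduces $R(\theta)AR^{-1}(\theta) = A\cos\theta - B\sin\theta$, i.e. $A$ and $B$ rotate into each other like the two translations of the Euclidean plane.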
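In summary:
$$|k,\sigma\rangle := \sqrt{\frac{\hat k^0}{k^0}}\;U(L(k))\,|\hat k,\sigma\rangle \tag{2.27}$$
For $m > 0$:
$$U(\Lambda)|k,\sigma\rangle = \sqrt{\frac{(\Lambda k)^0}{k^0}}\,\sum_{\sigma'} D^j_{\sigma'\sigma}[W(\Lambda,k)]\,|\Lambda k,\sigma'\rangle \quad\text{where,}$$
$$W(\Lambda,k) := L^{-1}(\Lambda k)\cdot\Lambda\cdot L(k)\ ,\quad L(k)\hat k := k\ ,\quad \sigma \text{ is the } j_3 \text{ eigenvalue} \tag{2.28}$$
For $m = 0$:
$$U(\Lambda)|k,\sigma\rangle = \sqrt{\frac{(\Lambda k)^0}{k^0}}\,\exp\{i\sigma\theta(\Lambda,k)\}\,|\Lambda k,\sigma\rangle \quad\text{where } \sigma \text{ is the helicity label,}$$
$$W(\Lambda,k) := L^{-1}(\Lambda k)\cdot\Lambda\cdot L(k) := S(\alpha(\Lambda,k),\beta(\Lambda,k))\,R(\theta(\Lambda,k)) \tag{2.29}$$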
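Let us evaluate the second Casimir invariant of the Poincare group, $W^2$. For this we
evaluate $W_\mu|\hat k,\sigma\rangle$.
(i) $-\hat k^2 = m^2 \neq 0$: For $\hat k = (m, 0, 0, 0)$,
$W_\mu|\hat k,\sigma\rangle = \tfrac12\epsilon_{\mu\nu\alpha 0}M^{\nu\alpha}\,m\,|\hat k,\sigma\rangle$. This vanishes for $\mu = 0$ and for $\mu = i$, we get $W_i|\hat k,\sigma\rangle =
-\tfrac12\epsilon_{0ijk}M^{jk}\,m\,|\hat k,\sigma\rangle = -mJ_i|\hat k,\sigma\rangle$. Therefore,
$$W^2|\hat k,\sigma\rangle = m^2 j(j+1)\,|\hat k,\sigma\rangle .$$
(ii) $-\hat k^2 = 0$: For $\hat k = k(1, 0, 0, 1)$,
$W_\mu|\hat k,\sigma\rangle = \tfrac12(\epsilon_{\mu\nu\alpha 0} + \epsilon_{\mu\nu\alpha 3})M^{\nu\alpha}\,k\,|\hat k,\sigma\rangle$. This implies,
$$W_0|\hat k,\sigma\rangle = k\sigma|\hat k,\sigma\rangle\ ,\quad W_1|\hat k,\sigma\rangle = k(-J_1 + K_2)|\hat k,\sigma\rangle = 0 \tag{2.30}$$
$$W_3|\hat k,\sigma\rangle = -k\sigma|\hat k,\sigma\rangle\ ,\quad W_2|\hat k,\sigma\rangle = k(-J_2 - K_1)|\hat k,\sigma\rangle = 0. \tag{2.31}$$
Hence $W^2|\hat k,\sigma\rangle = [-(k\sigma)^2 + (-k\sigma)^2]\,|\hat k,\sigma\rangle = 0$, for our choice of representations.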
Note that the action of the translation subgroup on the Poincare representation is given
by $U(1, a)|k,\sigma\rangle = \exp(-ia^\mu k_\mu)\,|k,\sigma\rangle$ while that of the Lorentz group is given in the summary above.
Since any element of the Poincare group can be uniquely written as a product of a Lorentz
transformation times a translation, we have specified the full action. Since this action is
determined by the choice of the unitary, irreducible representation of the little group, the
Poincare representation is said to be induced by a representation of the little group.
These representations are taken to identify elementary quanta.
D. The Discrete Subgroups and their actions

Recall that we have two improper Lorentz transformations, the space inversion $P$: $\Lambda^\mu{}_\nu = {\rm diag}(1, -1, -1, -1)$ and the time reversal $T$: $\Lambda^\mu{}_\nu = {\rm diag}(-1, 1, 1, 1)$. All Lorentz transformations can be generated from the proper, orthochronous transformations combined with
either or both of these. Clearly these transformations are of order 2, i.e. $P^{-1} = P$ and $T^{-1} = T$.
Let us assume that these have an action on the physical state space preserving probabilities.
Then by Wigner’s theorem, these are represented either as linear and unitary or anti-linear
and anti-unitary operators on the Hilbert space. Let us use the same symbols to denote the
corresponding operators which should be clear from the context.
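The homomorphism property,
$$U(\Lambda,a)^{-1}\,U(\lambda,b)\,U(\Lambda,a) = U\big((\Lambda,a)^{-1}\cdot(\lambda,b)\cdot(\Lambda,a)\big) \tag{2.32}$$
gives us the action of space inversion and time reversal on the proper, orthochronous transformations. Setting $a = 0$ and $\Lambda = P$ or $T$ leads to:
$$P^{-1}U(\lambda,b)P = U(P^{-1}\lambda P,\ P^{-1}b)\ ,\quad T^{-1}U(\lambda,b)T = U(T^{-1}\lambda T,\ T^{-1}b). \tag{2.33}$$
The same homomorphism (2.32), together with the infinitesimal form $U(1+\omega,\epsilon) =
1 + \tfrac{i}{2}\omega_{\alpha\beta}M^{\alpha\beta} - i\epsilon_\alpha P^\alpha$ for $U(\lambda,b)$, gives us the relations:
$$U^{-1}(\Lambda)\,M^{\mu\nu}\,U(\Lambda) = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\,M^{\alpha\beta}\ ,\quad U^{-1}(\Lambda)\,P^\mu\,U(\Lambda) = \Lambda^\mu{}_\nu P^\nu .$$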
Following the same steps as before but not canceling the factors of $i$, to allow for anti-linearity,
we get,
$$P^{-1}(iP^0)P = iP^0\ ,\quad P^{-1}(iP^i)P = -iP^i \tag{2.34}$$
$$T^{-1}(iP^0)T = -iP^0\ ,\quad T^{-1}(iP^i)T = iP^i \tag{2.35}$$
Suppose, if possible, that $P$ is anti-linear. Then it anti-commutes with $P^0$, which represents the
energy. This in turn means that for every positive energy state, there is a negative energy
state. This conflicts with the energy being bounded below. The identical implication follows if
$T$ is linear! Hence, the space inversion operator must be linear and unitary while the time
reversal operator must be anti-linear and anti-unitary.
With this understood, the action of these discrete operators on the Poincare generators
is given by,
$$P^{-1}P^0P = P^0\ ,\quad P^{-1}P^iP = -P^i\ ,\quad P^{-1}J_iP = J_i\ ,\quad P^{-1}K_iP = -K_i \tag{2.36}$$
$$T^{-1}P^0T = P^0\ ,\quad T^{-1}P^iT = -P^i\ ,\quad T^{-1}J_iT = -J_i\ ,\quad T^{-1}K_iT = K_i \tag{2.37}$$
From these defining actions, we can determine how these operators act on the unitary,
irreducible representations of the Poincare group.
It is easy to see that under space inversion and time reversal, the two Casimir invariants
are invariant: $P^{-1}P^2P = P^2$ and $P^{-1}W^2P = W^2$, and similarly for $T$. This is obvious from the action of these
operators on the generators, which transform as their tensor indices indicate. The Casimir
invariants are Lorentz scalars (not pseudo-scalars) and hence invariant under inversions
and time reversal. Thus neither parity nor time reversal mixes different irreducible
representations. To evaluate their actions on individual vectors, we note that the irreducible
representations are characterized in terms of the $\hat k$ vector and the eigenvalue(s) of the $J_3$ generator.
Action of $P$:
$m > 0$:
Here, $\hat k = (m, 0, 0, 0)$ and $\sigma = -s, -s+1, \ldots, s-1, s$, with $s$ being the spin. Since $P$ commutes
with $P^0$, anti-commutes with $P^i$ and commutes with $J_3$, we see that $P|\hat k,\sigma\rangle$ also has the
same eigenvalues of $P^\mu$ and $J_3$. Hence, $P|\hat k,\sigma\rangle = \eta_\sigma|\hat k,\sigma\rangle$, with $\eta_\sigma$ being a phase (since the
vectors are non-degenerate). Next, from the usual angular momentum algebra, we know
$$(J_1 \pm iJ_2)|\hat k,\sigma\rangle = \sqrt{(s\mp\sigma)(s\pm\sigma+1)}\;|\hat k,\sigma\pm1\rangle \ \Rightarrow$$
$$(J_1 \pm iJ_2)(P|\hat k,\sigma\rangle) = \sqrt{(s\mp\sigma)(s\pm\sigma+1)}\;(P|\hat k,\sigma\pm1\rangle) \ \Rightarrow\ \eta_\sigma = \eta_{\sigma\pm1} .$$
Thus $\eta_\sigma$ is independent of $\sigma$ and equals $\pm1$ since $P^2 = 1$. The phase $\eta$ is called the intrinsic
parity of the representation. What about the action on the states $|k,\sigma\rangle$? Recall that these
states are obtained as in (2.27),
$$|k,\sigma\rangle = \sqrt{\frac{\hat k^0}{k^0}}\;U(L(k))\,|\hat k,\sigma\rangle\ ,\quad k := L(k)\hat k$$
for a chosen $L(k)$.
Acting by the matrix $P$ on the defining equation for the Lorentz transformation $L(k)$ gives
$(Pk) = (P L(k) P^{-1})(P\hat k)$, which implies that $(P L(k) P^{-1}) = L(Pk)$. The homomorphism
property, (2.32), then gives $P^{-1}U(L(k))P = U(L(Pk))$. Acting on $|k,\sigma\rangle$ gives,
$$P|k,\sigma\rangle = \sqrt{\frac{\hat k^0}{k^0}}\;U(L(Pk))\,\eta\,|\hat k,\sigma\rangle = \eta\,|Pk,\sigma\rangle . \tag{2.38}$$
Thus $P$ acts on all vectors of the representation with the same $\eta$ and thus, $\eta$ is a property
of the irreducible representation.
$m = 0$:
Now $\hat k = (k, 0, 0, k)$ and the helicity $\sigma$ is the eigenvalue of $J_3$, the component of the angular momentum
along the direction of the momentum. Since $P$ changes the sign of the momentum but
not of the angular momentum, $P|\hat k,\sigma\rangle \propto |P\hat k,-\sigma\rangle$ with $P\hat k = (k, 0, 0, -k)$. Unlike the massive case, the
direction of the reference momentum $\hat k$ is changed and this makes it inconvenient to infer
the action of parity on a general state. For this reason, it is useful to choose a rotation $R_2$
about, say, the 2-axis to bring the reversed reference momentum back to $\hat k$. Hence consider a new operator $P' := U_2P$,
with $U_2 := e^{-i\pi J_2}$. But $U_2$ also rotates $J_3$ (just as $\hat k$) and hence the helicity is unchanged by
the $U_2$ operation. Hence, $P'|\hat k,\sigma\rangle = \eta_\sigma|\hat k,-\sigma\rangle$. Here $\eta_\sigma$ is some other phase factor. What
about the action of $P$ on a general state $|p,\sigma\rangle$ defined in equation (2.27)? Note that we
are using $p := (|\vec p|, \vec p)$ instead of $k$ to denote the general vector.
Let $R(p,\hat k)$ be the rotation that rotates the reference $\hat k$ into the direction of the general
vector $p = (|\vec p|, \vec p)$. Let $B(p,\hat k)$ denote the boost along the 3-direction which changes the
reference magnitude $k$ to $|\vec p|$. We can thus go from $\hat k$ to $p$ by first using the boost $B$ followed
by the rotation $R(p,\hat k)$: $|p,\sigma\rangle \sim U\big(R(p,\hat k)B(p,\hat k)\big)|\hat k,\sigma\rangle$. Noting that $P'$ commutes with
boosts along the 3-axis, we have,
$$U(B(p,\hat k)) = P'\,U(B(p,\hat k))\,(P')^{-1} = U_2\,P\,U(B)\,P^{-1}\,U_2^{-1} \ \Rightarrow\ P\,U(B(p,\hat k)) = U_2^{-1}\,U(B)\,U_2\,P$$
$$\therefore\ P|p,\sigma\rangle \sim U\big(R(p,\hat k)\big)\,P\,U\big(B(p,\hat k)\big)|\hat k,\sigma\rangle = U\big(R(p,\hat k)R_2^{-1}B\big)\,\eta_\sigma\,|\hat k,-\sigma\rangle$$
Now, although $R(p,\hat k)R_2^{-1}$ is a rotation that takes the 3-axis along $-\vec p$, its unitary representative is not quite $U(R(-\vec p,\hat k))$. It introduces additional phases. I refer you to the equation
(2.6.21) in [1]. The net result is that $P|p,\sigma\rangle = \eta_\sigma\exp\{\mp i\pi\sigma\}\,|P(p),-\sigma\rangle$. The additional
phase $\mp\pi\sigma$ correlates with the sign of the 2-component of the momentum $\vec p$.
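Action of $T$:
$m > 0$:
Now $T$ commutes with $P^0$, and anti-commutes with $P^i$ and $J_i$.
$$\therefore\ P^i\,T|\hat k,\sigma\rangle = 0\ ,\quad P^0\,T|\hat k,\sigma\rangle = m\,T|\hat k,\sigma\rangle\ ,\quad J_3\,T|\hat k,\sigma\rangle = -\sigma\,T|\hat k,\sigma\rangle .$$
The last one implies that $T|\hat k,\sigma\rangle = \zeta_\sigma|\hat k,-\sigma\rangle$, where $\zeta_\sigma$ is a phase factor. From the angular
momentum algebra we get,
$$(J_1 \pm iJ_2)|\hat k,\sigma\rangle = \sqrt{(s\mp\sigma)(s\pm\sigma+1)}\;|\hat k,\sigma\pm1\rangle \ \Rightarrow$$
$$(-J_1 \pm iJ_2)(T|\hat k,\sigma\rangle) = \sqrt{(s\mp\sigma)(s\pm\sigma+1)}\;(T|\hat k,\sigma\pm1\rangle) = \sqrt{(s\mp\sigma)(s\pm\sigma+1)}\;\zeta_{\sigma\pm1}\,|\hat k,-\sigma\mp1\rangle \ \Rightarrow\ -\zeta_\sigma = \zeta_{\sigma\pm1} .$$
In particular, $\zeta_{\sigma+1} = \zeta_{\sigma-1}$. Choosing the phase, say $\zeta := \zeta_{\sigma=j}$, the phases for $j-1, j-2, \ldots$ alternate
between $\pm\zeta$. Thus we can write the phase as $\zeta_\sigma = \zeta(-1)^{j-\sigma}$, where $\zeta$ is an arbitrarily chosen phase.
The phase $\zeta$ can actually be absorbed away by putting $|\hat k,\sigma\rangle' := \sqrt{\zeta}\,|\hat k,\sigma\rangle$. The action of $T$
on the new vector gives,
$$T|\hat k,\sigma\rangle' = \sqrt{\zeta^*}\;T|\hat k,\sigma\rangle = |\zeta|\,(-1)^{j-\sigma}\,|\hat k,-\sigma\rangle' .$$
Note: A similar manipulation with the $\eta$ phase will retain the phase in the redefined vectors
and thus the intrinsic parity cannot be absorbed away. The anti-linearity of $T$ is responsible
for this feature. We may retain the irrelevant phase $\zeta$.
The action of the time reversal operator on the general vectors $|k,\sigma\rangle$ proceeds in exactly the
same manner as above, leading to $T|k,\sigma\rangle = \zeta(-1)^{j-\sigma}|Pk,-\sigma\rangle$, where $Pk = (k^0, -\vec k)$ is the momentum with reversed spatial part. It follows immediately that
$T^2|k,\sigma\rangle = (-1)^{2j}|k,\sigma\rangle$, which distinguishes integer from half-integer spins.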
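The result $T^2 = (-1)^{2j}$ is easy to confirm on the rest-frame multiplet. The sketch below is illustrative and assumes the phase $\zeta = 1$; it writes $T = U K$, with $K$ complex conjugation and $U_{\sigma'\sigma} = (-1)^{j-\sigma}\delta_{\sigma',-\sigma}$, so that $T^2 = U U^*$. The function names are hypothetical.

```python
from fractions import Fraction

def time_reversal_matrix(j):
    """U in T = U K, with U[sigma'][sigma] = (-1)^(j - sigma) delta_{sigma', -sigma}.

    The basis is ordered as sigma = j, j-1, ..., -j and zeta = 1 is assumed.
    """
    dim = int(2 * j) + 1
    sigmas = [j - n for n in range(dim)]
    U = [[0] * dim for _ in range(dim)]
    for col, s in enumerate(sigmas):
        row = sigmas.index(-s)
        U[row][col] = (-1) ** int(j - s)
    return U

def t_squared(j):
    """T^2 = U K U K = U U^*; U is real here, so this is the matrix product U U."""
    U = time_reversal_matrix(j)
    dim = len(U)
    return [[sum(U[i][k] * U[k][l] for k in range(dim)) for l in range(dim)]
            for i in range(dim)]

def expected(j):
    """(-1)^(2j) times the identity matrix."""
    dim = int(2 * j) + 1
    sign = (-1) ** int(2 * j)
    return [[sign if i == l else 0 for l in range(dim)] for i in range(dim)]

spins = [Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(2)]
results = {j: t_squared(j) == expected(j) for j in spins}
```

Because $T$ is anti-linear, the overall phase $\zeta$ drops out of $T^2$, which is why setting $\zeta = 1$ loses no generality for this check.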
$m = 0$:
The analysis here proceeds in exactly the same manner as that for parity and we
just note the final result: $T|p,\sigma\rangle = \xi_\sigma\exp\{\mp i\pi\sigma\}\,|T(p),\sigma\rangle$, with $\xi_\sigma$ a phase. The additional phase $\mp\pi\sigma$
correlates with the sign of the 2-component of the momentum $\vec p$.
This completes the basic definitions related to the Poincare group and its unitary, irreducible representations.
These representations specify the attributes that we may assign to elementary quanta.
However, these are not suitable for manifest covariance. Firstly, the label σ transforms
covariantly only under the little group and not the full Lorentz group. Secondly, the label
k is restricted to the positive hyperboloid and not the full R4 . This makes it inconvenient
to take Fourier transform to link them with space-time coordinates. For this we need to
construct representations on vector valued functions on the space-time. This is done next.
3. REPRESENTATIONS SUITABLE FOR MANIFEST COVARIANCE: “FIELD REPRESENTATIONS”
As noted earlier, the abstractly classified unitary, irreducible representations of the
Poincare group are not suitable for manifest Lorentz covariance. For this, we begin by
defining a vector space of suitably chosen functions and define an action (homomorphism) of
the Poincare group on it. These representations will be reducible in general and irreducibility will be imposed
by partial differential equations (“field equations”). Making the action unitary requires an
inner product to be defined. This will further show a ‘doubling of representations’ leading
to the particle/anti-particle interpretation.
Let $\Psi^A(x)$ be a finite dimensional column vector of a suitable class of complex/real valued
functions of the space-time coordinates $x^\mu$. The defining action of the Poincare group is:
$x \to gx := x_{(\Lambda,a)} = \Lambda x + a$. It induces a corresponding action on the functions which we
denote as: $\Psi \to (g\Psi) := \Psi_g$,
$$\Psi_g(x) = D(h(g))\,\Psi(g^{-1}x) \ \leftrightarrow\ \Psi'^A_{(\Lambda,a)}(x) = D^A{}_B(h(\Lambda,a))\,\Psi^B((\Lambda,a)^{-1}x)$$
Here, $D$ is a finite dimensional representation of the Lorentz group and $h(\Lambda,a)$ is a map from the
Poincare group to the Lorentz group. The above action is required to be a homomorphism,
i.e. $(g\Psi)(x) = D(h(g))\,\Psi(g^{-1}x)$ such that
$$(g'(g\Psi))(x) = (g'g\,\Psi)(x) \quad \forall\ g, g' \in \text{Poincare and } \forall\ x. \tag{3.1}$$
$$\text{l.h.s.} = D(h(g'))\,(g\Psi)(g'^{-1}x) = D(h(g'))\,D(h(g))\,\Psi(g^{-1}g'^{-1}x) = D(h(g')h(g))\,\Psi((g'g)^{-1}x). \tag{3.2}$$
$$\text{r.h.s.} = D(h(g'g))\,\Psi((g'g)^{-1}x) \quad \therefore\ \text{the } h(g)\text{'s must satisfy,} \tag{3.3}$$
$$D(h(g'))\,D(h(g)) = D(h(g'g)) \quad \forall\ g, g' \in \text{Poincare. Equivalently,} \tag{3.4}$$
$$D(h(g')h(g)) = D(h(g'g)) \quad\text{which implies,}$$
$$h(g')\,h(g) = h(g'g)\ ,\quad\text{or } h(g) \text{ must be a homomorphism.} \tag{3.5}$$
Since our primary focus is on the Lorentz group, and $h(g) = h((1,a)\cdot(\Lambda,0)) =
h((1,a))\,h((\Lambda,0))$, we may choose $h((1,a)) = 1$. This reduces the homomorphism from
Poincare into Lorentz to a homomorphism from Lorentz to Lorentz. We can and do take
this homomorphism to be the identity homomorphism. With this we write,
$$\Psi_{(\Lambda,a)}(x) := D(\Lambda)\,\Psi(\Lambda^{-1}(x - a)). \tag{3.6}$$
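That (3.6) really is a homomorphism can be spot-checked numerically. The sketch below is a simplified illustration, not the construction of the notes: it assumes a 1+1 dimensional space-time, takes $D(\Lambda) = \Lambda$ (a vector field), uses the composition law $(\Lambda',a')\cdot(\Lambda,a) = (\Lambda'\Lambda, \Lambda' a + a')$, and picks an arbitrary sample field `psi`. All names are illustrative.

```python
import math

def boost(phi):
    """1+1 dimensional Lorentz boost with rapidity phi."""
    return [[math.cosh(phi), math.sinh(phi)], [math.sinh(phi), math.cosh(phi)]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

def act(g, psi):
    """(g Psi)(x) = D(Lam) Psi(Lam^{-1}(x - a)), with D(Lam) = Lam assumed."""
    lam, a = g
    lam_inv = inverse(lam)
    def new_psi(x):
        y = matvec(lam_inv, [x[0] - a[0], x[1] - a[1]])
        return matvec(lam, psi(y))
    return new_psi

def compose(g2, g1):
    """(Lam2, a2).(Lam1, a1) = (Lam2 Lam1, Lam2 a1 + a2)."""
    lam2, a2 = g2
    lam1, a1 = g1
    shifted = matvec(lam2, a1)
    return (matmul(lam2, lam1), [shifted[0] + a2[0], shifted[1] + a2[1]])

def psi(x):
    """Arbitrary sample vector field (assumed, for illustration only)."""
    return [math.sin(x[0] + 2.0 * x[1]), x[0] * x[1]]

g1 = (boost(0.3), [0.5, -0.2])
g2 = (boost(-0.7), [1.0, 0.4])
x = [0.4, -1.1]

lhs = act(g2, act(g1, psi))(x)      # (g2 (g1 Psi))(x)
rhs = act(compose(g2, g1), psi)(x)  # ((g2 g1) Psi)(x)
```

The agreement of `lhs` and `rhs` relies on the argument being transformed by the inverse group element, which is what makes the action on functions compose in the same order as the group elements.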
A. Induced representation of Poincare from inducing representation of Lorentz

We have thus got an induced representation of the Poincare group from an inducing representation $D(\Lambda)$ of the Lorentz group and we take $D$ to be an irreducible representation.
There is no requirement of unitarity as $\Psi^A(x)$ is not a quantum mechanical wave function.
Additionally, it is a fact that the Lorentz group has no non-trivial finite dimensional, unitary representations. Our task now reduces to finding all possible irreducible, finite
dimensional representations of the Lorentz group.
To appreciate irreducibility, consider the infinitesimal action: $(\Lambda, a) = (1+\omega, \epsilon)$, $D(\Lambda) =
D(1+\omega) = 1 + \tfrac{i}{2}\omega_{\alpha\beta}T^{\alpha\beta}$ where the $T^{\alpha\beta}$ represent the Lorentz generators in the $D$ representation. Then,
$$\Psi_{1+\omega,\epsilon}(x) = \left(1 + \tfrac{i}{2}\,\omega\cdot T\right)\Psi(x - \omega x - \epsilon) = \Psi(x) + \tfrac{i}{2}\,\omega\cdot T\,\Psi(x) - (\omega^\alpha{}_\beta x^\beta + \epsilon^\alpha)\,\partial_\alpha\Psi\big|_x \tag{3.7}$$
This defines the Poincare generators acting on $\Psi(x)$. Explicitly,
$$(M^{\alpha\beta}\Psi)^A := (T^{\alpha\beta})^A{}_B\,\Psi^B(x) + i(\eta^{\gamma\alpha}x^\beta - \eta^{\gamma\beta}x^\alpha)\,\partial_\gamma\Psi^A(x) \tag{3.8}$$
$$(P_\mu\Psi)^A := -i\,\partial_\mu\Psi^A(x) \tag{3.9}$$
The two Casimir invariants, $P^2$ and $W^2$, acting on $\Psi$ should evaluate to some constants to
satisfy the necessary condition for irreducibility of the Poincare representation. The $P^2$
Casimir is easy to evaluate and gives,
$$(P^2\Psi)^A = \eta^{\mu\nu}(-i)^2\,\partial_\mu\partial_\nu\Psi^A = -\Box\Psi^A\ ,\quad \Box := -\partial_0^2 + \nabla^2 . \tag{3.10}$$
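For the second Casimir we have,
$$(W_\mu\Psi)^A = \tfrac12\,\epsilon_{\mu\nu\alpha\beta}\,(M^{\nu\alpha}P^\beta\Psi)^A$$
$$= \tfrac12\,\epsilon_{\mu\nu\alpha\beta}\left[(T^{\nu\alpha})^A{}_B + i\,\delta^A{}_B\,(\eta^{\gamma\nu}x^\alpha - \eta^{\gamma\alpha}x^\nu)\,\partial_\gamma\right]\left[-i\,\delta^B{}_C\,\partial^\beta\right]\Psi^C$$
$$= -\tfrac{i}{2}\,\epsilon_{\mu\nu\alpha\beta}\,(T^{\nu\alpha})^A{}_B\,\partial^\beta\Psi^B + \epsilon_{\mu\nu\alpha\beta}\,x^\alpha\,\partial^\nu\partial^\beta\Psi^A = -\tfrac{i}{2}\,\epsilon_{\mu\nu\alpha\beta}\,(T^{\nu\alpha})^A{}_B\,\partial^\beta\Psi^B + 0 \tag{3.11}$$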
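Note that the 2-derivative term has canceled, since $\epsilon_{\mu\nu\alpha\beta}$ is antisymmetric in $\nu, \beta$ while $\partial^\nu\partial^\beta$ is symmetric. Then,
$$(W^2\Psi)^A = \tfrac12\,\epsilon^{\mu\nu'\alpha'\beta'}\,(M_{\nu'\alpha'}P_{\beta'})^A{}_B\,(W_\mu\Psi)^B$$
$$= \tfrac12\,\epsilon^{\mu}{}_{\nu'\alpha'\beta'}\left[(T^{\nu'\alpha'})^A{}_B + i\,\delta^A{}_B\,(\eta^{\gamma\nu'}x^{\alpha'} - \eta^{\gamma\alpha'}x^{\nu'})\,\partial_\gamma\right]\left[-i\,\delta^B{}_C\,\partial^{\beta'}\right](W_\mu\Psi)^C$$
$$= -\tfrac{i}{2}\,\epsilon^{\mu}{}_{\nu'\alpha'\beta'}\,(T^{\nu'\alpha'})^A{}_B\,\partial^{\beta'}(W_\mu\Psi)^B + \epsilon^{\mu}{}_{\nu'\alpha'\beta'}\,x^{\alpha'}\,\partial^{\nu'}\partial^{\beta'}(W_\mu\Psi)^A$$
$$= -\tfrac14\,\epsilon^{\mu}{}_{\nu'\alpha'\beta'}\,\epsilon_{\mu\nu\alpha\beta}\,(T^{\nu'\alpha'}T^{\nu\alpha})^A{}_B\,\partial^{\beta'}\partial^\beta\Psi^B \tag{3.12}$$
Notice that $W^2\Psi$ can be non-zero only if the $D$ representation is not one dimensional, since $T^{\alpha\beta} = 0$ in a one dimensional representation.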
As illustrations consider two examples: (i) $D$ is the trivial representation of the Lorentz
group and (ii) $D$ is the defining representation of the Lorentz group.
$D(\Lambda) = 1$: Now $W^2\Psi = 0$ while $P^2\Psi = -\Box\Psi = -m^2\Psi$. Thus, irreducibility requires
that $\Psi$ must satisfy $(\Box - m^2)\Psi(x) = 0$ and we recognize this as the Klein-Gordon equation.
The $\Psi$ transforms as: $\Psi_{(\Lambda,a)}(x) = \Psi(\Lambda^{-1}(x - a))$.
$D(\Lambda)$ is the defining representation: This means $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu := \delta^\mu{}_\nu + \tfrac{i}{2}(\omega_{\alpha\beta}T^{\alpha\beta})^\mu{}_\nu$
and hence
$$(T^{\alpha\beta})^\mu{}_\nu := -i\,(\eta^{\alpha\mu}\delta^\beta{}_\nu - \eta^{\beta\mu}\delta^\alpha{}_\nu)$$
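This gives,
$$(T^{\nu'\alpha'}T^{\nu\alpha})^\rho{}_\sigma = -\left[\eta^{\nu'\rho}\eta^{\nu\alpha'}\delta^\alpha{}_\sigma - \eta^{\alpha'\rho}\eta^{\nu\nu'}\delta^\alpha{}_\sigma - \eta^{\nu'\rho}\eta^{\alpha\alpha'}\delta^\nu{}_\sigma + \eta^{\alpha'\rho}\eta^{\alpha\nu'}\delta^\nu{}_\sigma\right]$$
This is manifestly antisymmetric in $(\nu'\alpha')$ and $(\nu\alpha)$. Contraction with the epsilons gives
equal contributions which cancel the factor of 4, leading to $(W^2\Psi)^\rho = \epsilon^{\mu\rho\nu}{}_{\beta'}\,\epsilon_{\mu\nu\sigma\beta}\,\partial^{\beta'}\partial^\beta\Psi^\sigma$.
Recall the general identity:
$$\epsilon^{a_1\cdots a_j a_{j+1}\cdots a_n}\,\epsilon_{a_1\cdots a_j b_{j+1}\cdots b_n} = (-1)^s\,(n-j)!\;j!\;\delta^{[a_{j+1}}{}_{b_{j+1}}\cdots\delta^{a_n]}{}_{b_n}\ , \tag{3.13}$$
where $s$ is the index of the metric, i.e. the number of negative eigenvalues of the metric. In our
case the metric is the Minkowski metric and $s = 1$. This gives,
$$\epsilon^{\mu\rho\nu}{}_{\beta'}\,\epsilon_{\mu\nu\sigma\beta} = -\epsilon^{\mu\nu\rho}{}_{\beta'}\,\epsilon_{\mu\nu\sigma\beta} = -(-1)\,2!\,2!\;\tfrac12\left(\delta^\rho{}_\sigma\,\eta_{\beta'\beta} - \eta_{\beta'\sigma}\,\delta^\rho{}_\beta\right),$$
leading to,
$$(W^2\Psi)^\rho = 2\left(\delta^\rho{}_\sigma\,\eta^{\beta'\beta} - \eta^{\beta\rho}\,\delta^{\beta'}{}_\sigma\right)\partial^2_{\beta'\beta}\Psi^\sigma = 2\left[\Box\Psi^\rho - \partial^\rho(\partial_\sigma\Psi^\sigma)\right], \quad\text{and} \tag{3.14}$$
$$(P^2\Psi)^\rho = -\Box\Psi^\rho \tag{3.15}$$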
Note: We have specified the action of the Poincare group on $\Psi^A$. How does $\partial_\mu\Psi^A$ transform?
We can think of the derivative as a new quantity with two indices. The index $A$ will transform
by $(T^{\alpha\beta})^A{}_B$ as before while the index $\mu$ will transform by the defining representation of the
Lorentz group, $(T^{\alpha\beta})_\mu{}^\nu = +i(\eta^{\alpha\nu}\delta^\beta{}_\mu - \eta^{\beta\nu}\delta^\alpha{}_\mu)$. In the above calculation of the second $W^\mu$,
this was missed. However it may be checked that this extra contribution vanishes in the $W^2$
calculation.
B. Irreducibility and field equations

The Poincare representation will be irreducible provided the Casimir invariants have fixed
values. Let $P^2\Psi^\rho = -m^2\Psi^\rho$ and $W^2\Psi^\rho = C\,\Psi^\rho$. For non-zero mass, with spin $j$, we know
that $W^2$ equals $m^2j(j+1)$ and hence $C = m^2j(j+1)$. Thus $\Psi^\rho$ satisfies,
$$\Box\Psi^\rho = m^2\Psi^\rho\ ,\quad 2\,(m^2\Psi^\rho - \partial^\rho\,\partial\cdot\Psi) = m^2j(j+1)\,\Psi^\rho$$
We have two cases to consider. The defining vector representation of the Lorentz group
splits as $1 \oplus 0$ under the rotation subgroup. Hence $j = 1$ or $0$. For $j = 1$, taking the
divergence of the second equation implies $\partial\cdot\Psi = 0$. For $j = 0$, this argument gives an
identity. However, vanishing of $W^2$ gives $\partial_\rho\,\partial\cdot\Psi = m^2\Psi_\rho$. Differentiating by $\partial_\sigma$ and antisymmetrising in the indices gives $\partial_\rho\Psi_\sigma = \partial_\sigma\Psi_\rho \Rightarrow \Psi_\rho = \partial_\rho\Phi$ for some scalar $\Phi$ which is
defined up to a constant. This scalar satisfies $(\Box - m^2)\Phi = $ constant, which can be taken to
be zero. Thus,
if the defining representation is to give the massive, spin one irreducible representation,
$\Psi^\rho$ must satisfy: $(\Box - m^2)\Psi^\rho = 0 = \partial\cdot\Psi$. If it is to give the massive, spin zero irreducible
representation, then $\Psi_\rho = \partial_\rho\Phi$ with $(\Box - m^2)\Phi = 0$.
For $m = 0$ and our restriction on the representations of the little group, the Casimir
conditions give $\Box\Psi_\mu = 0$ and $\partial_\mu\partial_\nu\Psi^\nu = 0$. Introduce a new field, $A_\mu := \Psi_\mu + \partial_\mu\Lambda$. It
follows that,
$$\Box A_\mu = 0 + \partial_\mu\Box\Lambda\ ,\quad \partial_\mu\partial^\nu A_\nu = 0 + \partial_\mu\Box\Lambda$$
$$\Rightarrow\ \Box A_\mu - \partial_\mu\partial^\nu A_\nu = 0 = \partial^\nu\left(\partial_\nu A_\mu - \partial_\mu A_\nu\right)$$
We recognize the last equality as the source free Maxwell equation, including its gauge
invariance: $A_\mu \to A_\mu + \partial_\mu\Lambda$!
These examples show that the induced representation $\Psi^A$ of the Poincare group, defined
through the inducing representation of the Lorentz subgroup, is in general reducible. Requiring that the Poincare Casimir invariants, evaluated on $\Psi$, give the values corresponding to
a unitary, irreducible representation of the Poincare group imposes differential conditions
on $\Psi$, e.g. the Klein-Gordon equation and the subsidiary conditions. Usually, these equations
are proposed as the field equations. Here we see them arising as irreducibility conditions.
Let us return to the classification of the finite dimensional representations of the Lorentz
group.
We already know that the defining representation acts as: $v'^\mu = \Lambda^\mu{}_\nu v^\nu \leftrightarrow v'_\mu = \Lambda_\mu{}^\nu v_\nu =
(\Lambda^{-1})^\nu{}_\mu v_\nu$. From these we can trivially construct other irreducible representations by taking
tensor products and subtracting suitable ‘traces’. For instance, a rank-2 tensor $V^{\mu\nu}$ transforms as $V'^{\mu\nu} = \Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta V^{\alpha\beta}$. This is a reducible representation though. Why? Because its
symmetrised and anti-symmetrised parts transform among themselves and thus form invariant subspaces. Furthermore, tensors of the form $\eta^{\mu\nu}C$ also transform into the same form.
Hence the symmetric combination splits further into traceless and trace parts. Thus
the space $V$ of the second rank tensors decomposes as:
$$V = V_{\rm anti\text{-}sym} \oplus V_{\rm sym,\,traceless} \oplus V_{\rm trace}\ ,\quad\text{each being an irreducible representation.}$$
An analogous construction can be carried out for higher rank tensors. These however do not
exhaust all the finite dimensional irreducible representations. There are the spinor representations, which are missed. It is a result that all finite dimensional representations of
pseudo-orthogonal groups, $SO(p,q)$, $p+q \geq 2$, can be constructed from the irreducible representations of the corresponding Clifford algebra.
C. Clifford Algebras and Spinor representations

Clifford algebras are algebras generated by elements $\{1, \gamma^\mu,\ \mu = 0, \ldots, (p+q-1)\}$
satisfying $\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\bar\eta^{\mu\nu}\,1$, $\bar\eta^{\mu\nu} = {\rm diag}(-1, \ldots, -1, +1, \cdots, +1)$. There are $p$ negative
and $q$ positive signs. We have put a bar on the metric since we will relate it to the $\eta$
defining the Lorentz group, for which $p = 1$ and $q = 3$. Note that for $\mu \neq \nu$ the gamma's
anti-commute, $(\gamma^0)^2 = \bar\eta^{00}$ and $(\gamma^i)^2 = \bar\eta^{ii}$. Since the gamma's anti-commute, their bilinears
satisfy commutation relations.
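Let $\Sigma^{\mu\nu} := a\,(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu)$. We will choose the proportionality constant $a$ suitably.
Consider,
$$\left[\Sigma^{\mu\nu}, \Sigma^{\alpha\beta}\right] = a^2\left\{\left[\gamma^\mu\gamma^\nu, \gamma^\alpha\gamma^\beta\right] - (\mu\leftrightarrow\nu) - (\alpha\leftrightarrow\beta) + (\mu,\alpha\leftrightarrow\nu,\beta)\right\} \tag{3.16}$$
$$\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta = 2\bar\eta^{\nu\alpha}\gamma^\mu\gamma^\beta - 2\bar\eta^{\mu\alpha}\gamma^\nu\gamma^\beta + 2\bar\eta^{\nu\beta}\gamma^\alpha\gamma^\mu - 2\bar\eta^{\mu\beta}\gamma^\alpha\gamma^\nu + \gamma^\alpha\gamma^\beta\gamma^\mu\gamma^\nu$$
$$\therefore\ \tfrac12\left[\gamma^\mu\gamma^\nu, \gamma^\alpha\gamma^\beta\right] = -\bar\eta^{\mu\alpha}\gamma^\nu\gamma^\beta + \bar\eta^{\nu\alpha}\gamma^\mu\gamma^\beta - \bar\eta^{\mu\beta}\gamma^\alpha\gamma^\nu + \bar\eta^{\nu\beta}\gamma^\alpha\gamma^\mu$$
$$\therefore\ \left[\Sigma^{\mu\nu}, \Sigma^{\alpha\beta}\right] = (4a)\left[-\bar\eta^{\mu\alpha}\Sigma^{\nu\beta} + (\mu\leftrightarrow\nu) + (\alpha\leftrightarrow\beta) - (\mu,\alpha\leftrightarrow\nu,\beta)\right] \tag{3.17}$$
This has the same form as the algebra satisfied by the Lorentz generators: $\left[M^{\mu\nu}, M^{\alpha\beta}\right] =
i\left(\eta^{\mu\alpha}M^{\nu\beta} - (\mu\leftrightarrow\nu) - (\alpha\leftrightarrow\beta) + (\mu,\alpha\leftrightarrow\nu,\beta)\right)$, where the $M$'s were defined through $U(\Lambda = 1+\omega) = 1 + \tfrac{i}{2}\omega_{\mu\nu}M^{\mu\nu}$. Thus we
need $-4a\,\bar\eta^{\mu\nu} = i\,\eta^{\mu\nu}$. There are two ways to satisfy this and we choose our conventions as:
$\bar\eta = -\eta$ and $a = \tfrac{i}{4}$. Thus,
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = -2\eta^{\mu\nu}\ ,\quad \Sigma^{\mu\nu} := \tfrac{i}{4}\,(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu) = \tfrac{i}{4}\,[\gamma^\mu, \gamma^\nu] . \tag{3.18}$$
The $\Sigma$'s satisfy the Lorentz Lie algebra.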
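Furthermore, the definitions imply that
$[\Sigma^{\mu\nu}, \gamma^\lambda] = i\,(\eta^{\mu\lambda}\gamma^\nu - \eta^{\nu\lambda}\gamma^\mu)$, i.e. the $\gamma$'s transform as Lorentz vectors.
It is useful to define $\gamma_5 := +i\gamma^0\gamma^1\gamma^2\gamma^3$. Just from the Clifford algebra and definitions it
follows that $\gamma_5^2 = +1$, $\gamma_5\gamma^\mu = -\gamma^\mu\gamma_5$, and $[\gamma_5, \Sigma^{\mu\nu}] = 0$.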
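These algebraic statements can be checked with an explicit set of matrices. The sketch below is an illustration under an assumed choice of representation (the standard Dirac form, $\gamma^0 = {\rm diag}(1,1,-1,-1)$ and $\gamma^i$ built from the Pauli matrices), which satisfies $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$ for $\eta = {\rm diag}(-1,1,1,1)$; the helper names are not from the notes.

```python
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
ID4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(c1, X, c2, Y):
    """Linear combination c1*X + c2*Y."""
    return [[c1 * X[i][j] + c2 * Y[i][j] for j in range(4)] for i in range(4)]

def comm(X, Y):
    return lin(1, mul(X, Y), -1, mul(Y, X))

def anti(X, Y):
    return lin(1, mul(X, Y), 1, mul(Y, X))

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(4) for j in range(4))

def block(s):
    """gamma^i = [[0, s], [-s, 0]] for a 2x2 Pauli matrix s (assumed representation)."""
    return [[0, 0, s[0][0], s[0][1]],
            [0, 0, s[1][0], s[1][1]],
            [-s[0][0], -s[0][1], 0, 0],
            [-s[1][0], -s[1][1], 0, 0]]

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
GAMMA += [block(s) for s in PAULI]
ZERO = lin(0, ID4, 0, ID4)
R4 = range(4)

# Sigma^{mu nu} = (i/4) [gamma^mu, gamma^nu]
SIGMA = [[lin(0.25j, comm(GAMMA[m], GAMMA[n]), 0, ID4) for n in R4] for m in R4]
g5 = lin(1j, mul(mul(GAMMA[0], GAMMA[1]), mul(GAMMA[2], GAMMA[3])), 0, ID4)

# {gamma^mu, gamma^nu} = -2 eta^{mu nu}
clifford_ok = all(close(anti(GAMMA[m], GAMMA[n]), lin(-2 * ETA[m][n], ID4, 0, ID4))
                  for m in R4 for n in R4)

# [Sigma^{mu nu}, gamma^lam] = i (eta^{mu lam} gamma^nu - eta^{nu lam} gamma^mu)
vector_ok = all(close(comm(SIGMA[m][n], GAMMA[l]),
                      lin(1j * ETA[m][l], GAMMA[n], -1j * ETA[n][l], GAMMA[m]))
                for m in R4 for n in R4 for l in R4)

# gamma_5^2 = 1, {gamma_5, gamma^mu} = 0, [gamma_5, Sigma^{mu nu}] = 0
gamma5_ok = (close(mul(g5, g5), ID4)
             and all(close(anti(g5, GAMMA[m]), ZERO) for m in R4)
             and all(close(comm(g5, SIGMA[m][n]), ZERO) for m in R4 for n in R4))
```

Any other representation related to this one by a similarity transformation passes the same checks, since only the Clifford relations are used.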
Claim: It is always possible to choose the $\gamma$'s to be finite dimensional unitary matrices.
This is based on the following facts:
(a) The set of elements $G := \{\pm1, \pm\gamma^\mu, \pm\gamma^\mu\gamma^\nu, \pm\gamma^\mu\gamma^\nu\gamma^\lambda, \ldots, \pm\gamma_5\}$, with all indices distinct in the products, forms a finite
group of 32 elements (in our 4 dimensional case); (b) All representations of finite groups
can be made unitary; (c) representation theory of finite groups applied to the group $G$ gives
that there is exactly one non-trivial, unitary, irreducible representation of $G$ and hence of
the Clifford algebra, and that its dimension is $2^{[4/2]} = 4$. The results also extend to other
dimensions.
Unitarity of the $\gamma$'s and their squares being $\pm1$ imply further that $(\gamma^0)^\dagger = \gamma^0$ and
$(\gamma^i)^\dagger = -\gamma^i$. This in turn gives $(\Sigma^{0i})^\dagger = -\Sigma^{0i}$, $(\Sigma^{ij})^\dagger = \Sigma^{ij}$ and $\gamma_5^\dagger = \gamma_5$.
Hence, the representation $D$ of the Lorentz group provided by the $\Sigma$'s is (i) non-unitary and
(ii) reducible, since $\gamma_5$ is Hermitian and commutes with the generators.
Since the $\gamma$'s are finite dimensional, their traces are defined and all of them, including $\gamma_5$, are
traceless. In particular, this implies that $\gamma_5$ has two eigenvalues equal to $+1$ and the other two
equal to $-1$. Hence $\frac{1\pm\gamma_5}{2}$ are projection matrices and reduce the 4 dimensional representation
of the Lorentz group into two irreducible representations of dimension 2.
By convention, $\Psi_L := \frac{1-\gamma_5}{2}\Psi \leftrightarrow \gamma_5\Psi_L = -\Psi_L$ is called a left handed Weyl spinor while
$\Psi_R := \frac{1+\gamma_5}{2}\Psi \leftrightarrow \gamma_5\Psi_R = +\Psi_R$ is called a right handed Weyl spinor.
The 4-component $\Psi^A$'s are called Dirac spinors while the 2-component projections, $\Psi_\pm := \frac{1\pm\gamma_5}{2}\Psi$,
are called Weyl spinors. Choosing $D$ to be generated by the $\Sigma_\pm := \frac{1\pm\gamma_5}{2}\Sigma$ gives
irreducible representations of the Lorentz group and in turn an irreducible representation of
the Poincare group.
Returning to the Poincare representation, consider the combination $\gamma^\mu P_\mu$, apparently a
Lorentz scalar. Indeed, recalling that $[M^{\mu\nu}, P_\lambda] = +i(\delta^\mu{}_\lambda P^\nu - \delta^\nu{}_\lambda P^\mu)$, it follows that,
$$\left[T^{\mu\nu}, \gamma^\lambda P_\lambda\right] = \left[\Sigma^{\mu\nu}, \gamma^\lambda\right]P_\lambda + \gamma^\lambda\left[M^{\mu\nu}, P_\lambda\right] = 0 .$$
We have used $T$ to denote a general representation, $M$ to denote the defining representation
and $\Sigma$ to denote the reducible spinor representation of the Lorentz group. We use the
conventional abbreviation $\slashed{a} := \gamma^\mu a_\mu$ for all 4-vectors $a_\mu$.
Consider the Poincare Casimir $P^2$. Irreducibility forces $P^2\Psi = -m^2\Psi$. Let $m \neq 0$. The
combinations $m1 \pm \slashed{p}$ commute with the Lorentz generators and also satisfy $(m \pm \slashed{p})^2 =
m^2 + (\slashed{p})^2 \pm 2m\slashed{p} = 2m\,(m \pm \slashed{p})$, where we used $(\slashed{p})^2 = -p^2 = m^2$. Therefore, $\pi_\pm := \frac{m \pm \slashed{p}}{2m}$ is a projection operator (since
$P$ is an operator) that also commutes with the Lorentz generators. It trivially commutes
with the translation generators as well. Hence the Poincare representation induced by the
Dirac spinor is reducible in yet another way, the irreducible subspaces being provided by
$(\pm m + \slashed{p})\Psi = 0$, i.e. $(-i\gamma^\mu\partial_\mu \pm m)\Psi = 0$. This is just the Dirac equation! Note that the
projection property holds only for non-zero mass.
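The projector property of $\pi_\pm$ can be confirmed numerically for an on-shell momentum. This is a hedged sketch: it assumes the standard Dirac form of the $\gamma$ matrices (which obeys $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$ with $\eta = {\rm diag}(-1,1,1,1)$), an arbitrary sample momentum, and lowers the index as $p_\mu = (-E, \vec p)$. Names are illustrative.

```python
import math

ID4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(c1, X, c2, Y):
    """Linear combination c1*X + c2*Y."""
    return [[c1 * X[i][j] + c2 * Y[i][j] for j in range(4)] for i in range(4)]

def close(X, Y, tol=1e-10):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(4) for j in range(4))

def block(s):
    """gamma^i = [[0, s], [-s, 0]] for a 2x2 Pauli matrix s (assumed representation)."""
    return [[0, 0, s[0][0], s[0][1]],
            [0, 0, s[1][0], s[1][1]],
            [-s[0][0], -s[0][1], 0, 0],
            [-s[1][0], -s[1][1], 0, 0]]

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
GAMMA += [block(s) for s in PAULI]
ZERO = lin(0, ID4, 0, ID4)

# Sample on-shell momentum (assumed values): p^mu = (E, p_vec), p_mu = (-E, p_vec)
m = 1.3
p_vec = (0.4, -0.7, 1.1)
E = math.sqrt(sum(c * c for c in p_vec) + m * m)
p_lower = (-E,) + p_vec

# pslash = gamma^mu p_mu
pslash = ZERO
for mu in range(4):
    pslash = lin(1, pslash, p_lower[mu], GAMMA[mu])

pi_plus = lin(0.5, ID4, 0.5 / m, pslash)    # (m + pslash) / (2m)
pi_minus = lin(0.5, ID4, -0.5 / m, pslash)  # (m - pslash) / (2m)

mass_shell_ok = close(mul(pslash, pslash), lin(m * m, ID4, 0, ID4))
projector_ok = (close(mul(pi_plus, pi_plus), pi_plus)
                and close(mul(pi_minus, pi_minus), pi_minus)
                and close(mul(pi_plus, pi_minus), ZERO)
                and close(lin(1, pi_plus, 1, pi_minus), ID4))
```

The division by $2m$ in `pi_plus` and `pi_minus` makes explicit why the construction is unavailable at $m = 0$.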
Note: The projectors $\frac{1\pm\gamma_5}{2}$ reduce the $D$ representation of the Lorentz group and each
of these induces an irreducible representation of the Poincare group. By contrast, the $\frac{m \pm \slashed{p}}{2m}$ do
not reduce the $D$ representation, but nevertheless reduce the Poincare representations. Are
either of these Poincare representations further reduced by the other projector? The answer
is ‘No’ because $\gamma_5$ anti-commutes with $\slashed{p}$ and thus exchanges the two projectors.
Note: Consider the action of $\slashed{p}$ on the left (right) handed Weyl spinors. It follows immediately that $\gamma_5(\slashed{p}\,\Psi_{L/R}) = \pm\slashed{p}\,\Psi_{L/R}$, i.e. $\slashed{p}$ maps a left (right) handed spinor into a right (left) handed one. Therefore, if we restrict to any one irreducible Weyl
representation, then $\slashed{p}$ must annihilate it. That is, Weyl spinors satisfy the massless Dirac
equation, also known as the Weyl equation. Conversely, if we have a Dirac spinor $\Psi$ that
satisfies the Weyl equation, then the Weyl spinors in its decomposition also satisfy the
Weyl equation individually: $\slashed{p}(\Psi_L + \Psi_R) = 0 \Rightarrow \slashed{p}\,\Psi_{L/R} = 0$.
Exercises:
1. Form the anti-symmetrized products of the $\gamma$-matrices, satisfying the Clifford algebra
$\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$,
$\Gamma^\mu := \gamma^\mu$
$\Gamma^{\mu\nu} := \gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu$
$\Gamma^{\mu\nu\lambda} := \gamma^\mu\gamma^\nu\gamma^\lambda + \gamma^\nu\gamma^\lambda\gamma^\mu + \gamma^\lambda\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu\gamma^\lambda - \gamma^\mu\gamma^\lambda\gamma^\nu - \gamma^\lambda\gamma^\nu\gamma^\mu$
$\Gamma^{\mu\nu\alpha\beta} := \gamma^{[\mu}\gamma^\nu\gamma^\alpha\gamma^{\beta]}$ with no numerical overall factors. Since the indices take 4 values
only, we cannot have any more antisymmetric products. Together with $1$, this set
comprises 16 matrices.
Show that products of any number of $\gamma$'s can be expressed in terms of these 16
matrices together with products of $\eta$'s. Conclude that these 16 matrices constitute a
basis for arbitrary products of $\gamma$'s. Since they have different Lorentz transformation
properties, they are all independent. The vector space of $k \times k$ matrices has dimension
$k^2$, hence the minimum matrix order for the $\gamma$'s is 4 and they are necessarily an irreducible
representation of the Clifford algebra. Hence our $\gamma$'s are $4 \times 4$. Analogous arguments
hold for the Clifford algebra of $n$ dimensions (indices taking $n$ values).
2. For spinors satisfying the Dirac equation with $m \neq 0$, evaluate the Pauli-Lubanski
scalar, $W_\mu W^\mu\Psi$.
3. The Lorentz $J_i, K_i$ satisfy the algebra $[J_i, J_j] = i\epsilon_{ijl}J_l$; $[K_i, K_j] = -i\epsilon_{ijl}J_l$; $[J_i, K_j] =
i\epsilon_{ijl}K_l$. Define: $A_i := \tfrac12(J_i + iK_i)$, $B_i := \tfrac12(J_i - iK_i)$. Check that $[A_i, A_j] =
i\epsilon_{ijl}A_l$, $[B_i, B_j] = i\epsilon_{ijl}B_l$, $[A_i, B_j] = 0$. Thus the $M^{\mu\nu}$ Lie algebra is equivalent to
the direct sum of two mutually commuting $SU(2)$ algebras. They have the Casimir
invariants $\vec A^2$, $\vec B^2$. Evaluate these for the spinor representation provided by the $\Sigma^{\mu\nu}$'s.
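In summary:
$$\Psi_{\Lambda,a}(x) := D(\Lambda)\,\Psi(\Lambda^{-1}x - \Lambda^{-1}a) \quad\text{(Reducible) Poincare action} \tag{3.19}$$
Require the Poincare Casimir invariants to take constant values:
$$(\Box - m^2)\Phi = 0:\quad\text{Klein-Gordon equation (scalar)} \tag{3.20}$$
$$(\Box - m^2)v^\mu = 0,\ \partial\cdot v = 0:\quad\text{Proca equation (massive vector)} \tag{3.21}$$
$$\Box A_\mu = 0,\ \partial\cdot A = 0 \ \leftrightarrow\ \partial_\mu F^{\mu\nu} = 0\ ,\ F_{\mu\nu} := \partial_\mu A_\nu - \partial_\nu A_\mu\quad\text{(Maxwell equation)} \tag{3.22}$$
$$(-i\slashed{\partial} + m)\Psi = 0:\quad\text{Dirac equation; } m = 0 \text{ is the Weyl equation} \tag{3.23}$$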
4. UNITARY REPRESENTATIONS ON THE SPACE OF SOLUTIONS: ANTI-PARTICLES

At this stage, we have two sets of irreducible representations of the Poincare group: the “particle” representations $\{|k,\sigma\rangle\}$, and the “field” representations $\Psi^A(x)$ satisfying the
appropriate field equations (or irreducibility conditions). The former are unitary while the
latter have no such notion defined for them. We fill this gap now.
The space of vector valued functions, $\Psi^A$, can be made a complex vector space easily
enough, but we need to define an inner product to define the notion of unitarity. Since the
irreducible representations are solutions of the field equations and the equations are linear,
we consider the complex vector space of the solutions of the field equations and look for an
inner product.
Consider the massive scalar field first: $(\partial^\mu\partial_\mu - m^2)\Phi(x) = 0$, $\Phi_{\Lambda,a}(x) = \Phi(\Lambda^{-1}(x - a))$.
Let $u, v$ be two solutions of the Klein-Gordon equation. Then,
$$u^*(\Box - m^2)v - v(\Box - m^2)u^* = \partial_\mu\left(u^*\partial^\mu v - v\,\partial^\mu u^*\right) = 0.$$
Hence, $J^\mu := \lambda\,(u^*\partial^\mu v - v\,\partial^\mu u^*)$ is a conserved current. This has an immediate implication.
Consider a 4 dimensional region bounded by two hypersurfaces of constant value of the
time coordinate. More generally, these are two Cauchy surfaces. Restricting to solutions
which vanish sufficiently rapidly as one approaches asymptotic infinity along the space-like
directions (e.g. $|\vec x| \to \infty$),
$$\int_{\rm region} d^4x\;\partial_\mu J^\mu = 0 = \int_{\Sigma_2\cup\Sigma_1} d^3x\;J^0 \ \Rightarrow\ \int_{\Sigma_2} d^3x\;J^0 = \int_{\Sigma_1} d^3x\;J^0 .$$
Here we have used that both the Cauchy surfaces have their normals directed ‘outward’
(future pointing for the later one and past pointing for the earlier one). This suggests that
we define a candidate inner product between two solutions as,
$$(v, u) := \lambda\int_\Sigma d^3x\;J^0(v,u)\ ,\quad J^0(v,u) := v^*\partial^0u - u\,\partial^0v^*\ ,\quad \Sigma \text{ a Cauchy surface.} \tag{4.1}$$
The conserved current implies that the inner product is independent of the Cauchy surface.
Note: On a Cauchy surface, the solution and its time derivative form an initial data set and
the inner product is really defined on these data. However the $\Sigma$-independence of the inner
product allows us to think of this as an inner product on the space of solutions.
Is $(v, u)$ really an inner product? Is it (i) linear in $u$ and anti-linear in $v$; (ii) $(v,u)^* = (u,v)$,
provided $\lambda^* = -\lambda$; and (iii) $(u,u) \geq 0$ with equality only for $u = 0$?
To check the third property, let us write a solution of the field equation in the form
$u(t,\vec x) = e^{\pm i\omega t}u_\omega(\vec x)$, $\omega > 0$. Let us conventionally call $e^{-i\omega t}$ a positive frequency solution.
Then, $(u,u) = \lambda(-2i\omega)\int_\Sigma d^3x\,|u_\omega(\vec x)|^2$. The inner product then satisfies the crucial third
property provided we choose $\lambda = +i|\lambda| := i$. The absolute value of $\lambda$ has been taken to be
one as it only affects the normalization of the solutions.
Thus, with the convention adopted, $(v,u)$ with $\lambda = i$ is indeed an inner product on
the subspace of positive frequency solutions. Equally well, the choice $\lambda = -i$ defines an inner
product on the subspace of negative frequency solutions.
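The three properties, together with the independence of the time slice, can be illustrated with box normalization. The sketch below is a simplified model and not the computation of the notes: it assumes one space dimension with a periodic box of length `L_BOX`, takes $\lambda = i$, reads $\partial^0$ as $\partial/\partial t$ as in the plane-wave computation, and uses arbitrary sample coefficients. All names are illustrative.

```python
import cmath
import math

L_BOX, N_GRID, MASS = 2.0, 64, 1.0   # assumed box length, grid size, mass

def omega(n):
    """Frequency of the mode with wave number k_n = 2 pi n / L_BOX."""
    k = 2 * math.pi * n / L_BOX
    return math.sqrt(k * k + MASS * MASS)

def field_and_dt(coeffs, t, x):
    """Positive-frequency solution u = sum_n c_n exp(i(k_n x - w_n t)) and its t-derivative."""
    u = du = 0j
    for n, c in coeffs.items():
        k = 2 * math.pi * n / L_BOX
        term = c * cmath.exp(1j * (k * x - omega(n) * t))
        u += term
        du += -1j * omega(n) * term
    return u, du

def inner(v, u, t):
    """(v, u) = i * integral dx (v^* d_t u - u d_t v^*) on the slice at time t."""
    total = 0j
    for j in range(N_GRID):
        x = j * L_BOX / N_GRID
        uu, dtu = field_and_dt(u, t, x)
        vv, dtv = field_and_dt(v, t, x)
        total += vv.conjugate() * dtu - uu * dtv.conjugate()
    return 1j * total * L_BOX / N_GRID

# Sample mode coefficients (assumed values)
u = {1: 0.6 + 0.2j, -2: 0.3j}
v = {1: 1.0 + 0j, 3: 0.5 - 0.1j}

vu_early = inner(v, u, 0.0)
vu_late = inner(v, u, 1.7)
uu = inner(u, u, 0.0)
uu_exact = L_BOX * sum(2 * omega(n) * abs(c) ** 2 for n, c in u.items())
uv = inner(u, v, 0.0)
```

On the grid the plane waves with distinct mode numbers are exactly orthogonal, so the sum reproduces $L\sum_n 2\omega_n|c_n|^2$ for $(u,u)$, the discrete analogue of the factor $2k^0$ in the invariant delta function.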
Consider a family of solution, the plane wave solution, u~k (x) := A~k eik·x , k · x := −k 0 t +
p
~k · ~x, k 0 := ~k 2 + m2 =: ω~ . Substitution gives,
k
iA~∗k0 A~k
Z
0
0
d3 x(−ik 0 − ik 0 )ei(k−k )x
Σt
00
0
0
= +A~∗k0 A~k (k 0 + k 0 )e−i(k −k )t (2π)3 δ 3 (~k − ~k 0 )
i
h 0 3
3
2
0
~
~
= (2π) |A~k | (2k )δ (k − k )
(u~k0 , u~k ) =
(4.2)
(4.3)
(4.4)
Since $k^0$ depends only on the magnitude of $\vec{k}$, the delta function forces the frequencies to be equal. We recognize the second square bracket as the Lorentz invariant delta function and choose the normalization constant to be $A_{\vec{k}} := (2\pi)^{-3/2}$ so that,
\begin{align}
u_{\vec{k}}(x) &:= \frac{1}{(2\pi)^{3/2}}\, e^{ik\cdot x}, \qquad k\cdot x := -\sqrt{\vec{k}^2 + m^2}\; t + \vec{k}\cdot\vec{x}, \qquad \vec{k} \in \mathbb{R}^3 \tag{4.5} \\
(u_{\vec{k}'}, u_{\vec{k}}) &= \delta^3_{\rm inv}(\vec{k} - \vec{k}') = (2k^0)\, \delta^3(\vec{k} - \vec{k}'). \tag{4.6}
\end{align}
The above family of solutions formally form an orthonormal set in the space of positive
frequency solutions. Technically, these do not belong to the space of solutions which have
to die off suitably at spatial infinity. This is understood as usual by either using ‘box
normalization’ or forming wave-packets. Manipulations done using the above do not lead to
any inconsistency.
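The Lorentz invariance of $(2k^0)\,\delta^3(\vec{k} - \vec{k}')$ can be checked numerically. The following is a minimal sketch, assuming a boost along the z-axis and using an illustrative momentum, mass and rapidity (all hypothetical values chosen only for the check). It verifies that the Jacobian of the induced map on spatial momenta equals $(\Lambda k)^0 / k^0$, which is exactly the factor that makes the combination invariant.

```python
import numpy as np

def boost_z(k_vec, m, rapidity):
    """Boost the on-shell momentum (k0, k_vec) along z; return new spatial part and energy."""
    k0 = np.sqrt(k_vec @ k_vec + m**2)
    kz_new = np.cosh(rapidity) * k_vec[2] + np.sinh(rapidity) * k0
    k0_new = np.cosh(rapidity) * k0 + np.sinh(rapidity) * k_vec[2]
    return np.array([k_vec[0], k_vec[1], kz_new]), k0_new

def jacobian(k_vec, m, rapidity, h=1e-6):
    """Central-difference Jacobian of the map k_vec -> (Lambda k)_vec."""
    jac = np.zeros((3, 3))
    for j in range(3):
        dk = np.zeros(3)
        dk[j] = h
        plus, _ = boost_z(k_vec + dk, m, rapidity)
        minus, _ = boost_z(k_vec - dk, m, rapidity)
        jac[:, j] = (plus - minus) / (2 * h)
    return jac

# Illustrative values (assumptions, not taken from the notes)
k = np.array([0.3, -1.2, 0.7])
m, rapidity = 1.5, 0.8

k0 = np.sqrt(k @ k + m**2)
_, k0_new = boost_z(k, m, rapidity)
det_jac = np.linalg.det(jacobian(k, m, rapidity))

# delta^3(Lambda k - Lambda k') = delta^3(k - k') / det_jac, so
# 2 (Lambda k)^0 delta^3(Lambda k - Lambda k') = 2 k^0 delta^3(k - k')
print(det_jac, k0_new / k0)
```

The two printed numbers should agree up to finite-difference error.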
Now we are ready to check if the Poincare action on the above inner product space is
unitary. We will check this by showing that under the Poincare action, the orthonormality
is preserved.
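Consider,
\begin{align}
[u_{\vec{k}}(x)]_{\Lambda,a} = u_{\vec{k}}(\Lambda^{-1}(x - a)) &= \frac{1}{(2\pi)^{3/2}}\, e^{ik\cdot\Lambda^{-1}(x-a)} \tag{4.7} \\
&= \frac{1}{(2\pi)^{3/2}}\, e^{i(\Lambda k)\cdot(x-a)} \qquad \because\ k_\mu (\Lambda^{-1})^\mu{}_\nu x^\nu = (\Lambda_\nu{}^\mu k_\mu) x^\nu = (\Lambda k)\cdot x \tag{4.8}
\end{align}
$\therefore\ [u_{\vec{k}}(x)]_{\Lambda,a} = e^{-i(\Lambda k)\cdot a}\, u_{\vec{\Lambda k}}(x)$ and,
\begin{align}
\left([u_{\vec{k}'}]_{\Lambda,a}, [u_{\vec{k}}]_{\Lambda,a}\right) &= e^{i(\Lambda(k'-k))\cdot a} \left([u_{\vec{\Lambda k}'}], [u_{\vec{\Lambda k}}]\right) = e^{i(\Lambda(k'-k))\cdot a}\, \delta^3_{\rm inv}(\Lambda(\vec{k} - \vec{k}')) \nonumber \\
&= \left([u_{\vec{k}'}], [u_{\vec{k}}]\right) \tag{4.9}
\end{align}
Hence, unitarity!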
Note: The orthochronous Lorentz group preserves the sign of the frequency, k 0 > 0 =>
(Λk)0 > 0, and hence maps positive (negative) frequency solutions into positive (negative)
frequency solutions. Thus we see that the space of solutions of the irreducibility condition
(field equations) itself decomposes into two Lorentz invariant subspaces of positive/negative
frequency solutions. There is no contradiction with the Casimir being constant - we just
happen to have two unitary representations which have the same values of the Poincare
(orthochronous, proper) Casimir invariants. These in fact represent ‘particle’ and ‘antiparticle’ representations respectively.
The plane wave orthonormal basis, $\{u_{\vec{k}}(x)\}$, is in one-to-one and onto correspondence with the ‘particle basis’ $\{|k\rangle\}$.
The same features are exhibited by the other solution spaces; it remains to identify a conserved current on the space of solutions, define an inner product, obtain an orthonormal set and show unitarity. Since all field equations imply that the fields always satisfy the Klein-Gordon equation, we will always have the positive/negative frequency subspaces and the particle/anti-particle identification.
Consider the Proca equation with divergence condition: $(\Box - m^2) v^\mu = 0 = \partial\cdot v$.
Let V denote the space of solutions of these equations. As before, let $v^\mu$ and $u^\mu$ be two solutions. Then it follows that $J^\mu := \lambda\,(v^{*\nu} \partial^\mu u_\nu - u^\nu \partial^\mu v^*_\nu)$ is conserved exactly as before. The divergence condition just selects a subspace of the space of solutions of the Klein-Gordon equation. The inner product is defined through the time component of this current as (with the same convention of positive frequency solutions),
\[ (v, u) := i \int_{\Sigma_t} d^3x\, \left[ v^{*\nu} \partial^0 u_\nu - u^\nu \partial^0 v^*_\nu \right] \]
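Consider the family of solutions,
\begin{align}
u^\mu_{\vec{k}}(x) &:= \frac{1}{(2\pi)^{3/2}}\, \varepsilon^\mu(k)\, e^{ik\cdot x}, \qquad k_\mu \varepsilon^\mu(k) = 0, \qquad k\cdot x := -k^0 t + \vec{k}\cdot\vec{x} \tag{4.10} \\
\left(u_{\vec{k}'}, u_{\vec{k}}\right) &= \frac{1}{(2\pi)^3} \left[\varepsilon^{*\mu}(k')\, \varepsilon_\mu(k)\right] (k^0 + k'^0)\, e^{-i(k^0 - k'^0)t}\, (2\pi)^3\, \delta^3(\vec{k} - \vec{k}') \nonumber \\
&= \left[\varepsilon^{*\mu}(k')\, \varepsilon_\mu(k)\right] \delta^3_{\rm inv}(\vec{k}' - \vec{k}), \qquad \text{here } k^0 := \sqrt{\vec{k}^2 + m^2}. \tag{4.11}
\end{align}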
The polarizations ε(k) satisfying the transversality condition, $k\cdot\varepsilon(k) = 0$, select 3 independent vectors for $m \neq 0$ and 2 independent vectors for m = 0 thanks to the equivalence of $A_\mu$ and $A_\mu + \partial_\mu\Lambda$. To distinguish the different polarization vectors, we introduce an additional label, a, taking 3 and 2 values respectively for the massive and massless cases. We choose them to satisfy the orthonormality relations, $\varepsilon^{*\mu}(k, a)\, \varepsilon_\mu(k, b) = \delta_{ab}$ and of course $\varepsilon(k, a)\cdot k = 0\ \forall\ a = 1, 2, 3$. For the massless case, a is usually denoted by λ and takes only two values. The plane wave family of solutions so defined constitutes an orthonormal set.
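The counting of 3 independent polarizations in the massive case can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the notes: it introduces the transverse projector $P^\mu{}_\nu = \delta^\mu{}_\nu + k^\mu k_\nu / m^2$ (a standard construction, here with the mostly-plus metric so that $k\cdot k = -m^2$) and checks that it annihilates $k_\mu$, is idempotent and has rank 3. The sample momentum and mass are hypothetical.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # metric with k.x = -k0 t + k_vec.x_vec

def transverse_projector(k_vec, m):
    """Return P^mu_nu = delta^mu_nu + k^mu k_nu / m^2 and the covector k_mu."""
    k0 = np.sqrt(k_vec @ k_vec + m**2)
    k_up = np.array([k0, *k_vec])   # k^mu, on shell
    k_dn = eta @ k_up               # k_mu
    return np.eye(4) + np.outer(k_up, k_dn) / m**2, k_dn

# Illustrative momentum and mass (assumptions)
P, k_dn = transverse_projector(np.array([0.4, -0.1, 2.0]), m=1.0)

# Vectors in the image of P satisfy k_mu eps^mu = 0; the image is 3-dimensional
rank = np.linalg.matrix_rank(P, tol=1e-9)
print(rank, np.allclose(k_dn @ P, 0), np.allclose(P @ P, P))
```

For m = 0 this projector is not defined, which reflects the fact that the massless case needs the separate gauge-equivalence argument given above.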
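Under the Poincare action,
\begin{align}
\left[u^\mu_{\vec{k},a}(x)\right]_{\Lambda,b} &= \Lambda^\mu{}_\nu\, u^\nu_{\vec{k},a}(\Lambda^{-1}(x - b)) = \frac{1}{(2\pi)^{3/2}}\, \Lambda^\mu{}_\nu\, \varepsilon^\nu(\vec{k}, a)\, e^{ik\cdot\Lambda^{-1}(x-b)} \nonumber \\
&= \frac{e^{-i(\Lambda k)\cdot b}}{(2\pi)^{3/2}}\, \left(\Lambda^\mu{}_\nu\, \varepsilon^\nu(k, a)\right) e^{i(\Lambda k)\cdot x}. \qquad \text{But,} \nonumber \\
(\Lambda k)_\mu \left(\Lambda^\mu{}_\nu\, \varepsilon^\nu(k, a)\right) &= k_\mu \varepsilon^\mu(k, a) = 0 \quad \therefore\ \text{r.h.s.} = \frac{e^{-i(\Lambda k)\cdot b}}{(2\pi)^{3/2}}\, \varepsilon^\mu(\Lambda k, a)\, e^{i(\Lambda k)\cdot x} \nonumber \\
\left[u^\mu_{\vec{k},a}(x)\right]_{\Lambda,b} &= e^{-i(\Lambda k)\cdot b}\, u^\mu_{\vec{\Lambda k},a}(x). \tag{4.12}
\end{align}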
Noting further that (Λε(k 0 , a))∗ · (Λε(k, b)) = ε∗ (Λk 0 , a) · ε(Λk, b), unitarity of the Poincare
action now follows in exactly the same manner as for the scalar. As in the case of the scalar,
here too we have two unitary, irreducible representations of the (orthochronous, proper)
Poincare group corresponding to the ‘particle’ and ‘anti-particle’ tags. For the zero mass
case, the only difference is that the polarizations are restricted to the two spatially transverse
directions.
The case of the spinors is more interesting. We follow the same strategy of looking for
a conserved current and a corresponding inner product, but we will do this for the Dirac
equation which is a first order differential equation. We have:
\[ \Psi^A_{\Lambda,a}(x) = D^A{}_B(\Lambda)\, \Psi^B(\Lambda^{-1}(x - a)), \qquad (-i\gamma^\mu \partial_\mu \pm m)\Psi = 0. \]
Note that there are two equations.
To look for a conserved current, we need to consider the (matrix) adjoint of the Dirac
equation. The adjoint will involve γ † which are inconvenient for manipulations. However, it
follows readily that (γ µ )† = γ 0 γ µ γ 0 . This suggests that we use the Dirac conjugate defined
as: Ψ̄ := Ψ† γ 0 . It follows that
(−iγ µ ∂µ ± m)Ψ = 0 ↔ i∂µ Ψ̄γ µ ± mΨ̄ = 0 .
Using these equations for two solutions, Ψ1 , Ψ2 and their Dirac conjugates, it follows that
J µ (Ψ2 , Ψ1 ) := λ(Ψ̄2 γ µ Ψ1 ) is the conserved current for both ± m equations. With the choice
λ = +1, the inner product on the solutions of Dirac equations is defined as:
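\[ (\Psi_2, \Psi_1) := \int_{\Sigma_t} d^3x\, \bar{\Psi}_2 \gamma^0 \Psi_1 \quad \leftrightarrow \quad (\Psi, \Psi) = \int_{\Sigma_t} d^3x\, \Psi^\dagger \Psi \ge 0. \]
Caution: With $\gamma_0$ (lower index) in place of $\gamma^0$ in the above equation, $(\Psi, \Psi) \le 0$.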
To construct a family of orthonormal solutions, consider the ‘plane wave ansatz’, $\Phi^A(\vec{k}, \sigma, x) := \frac{1}{(2\pi)^{3/2}}\, \varphi^A(k, \sigma)\, e^{ik\cdot x}$. The Dirac equations then require $(\not{k} \pm m)\varphi(k, \sigma) = 0$ or $\not{k}\varphi = \mp m\varphi$. Multiplying by $\not{k}$ and using $\not{k}\not{k} = -k^2$ leads to $k^2 + m^2 = 0 \leftrightarrow k^0 := \pm\sqrt{\vec{k}^2 + m^2}$.
We fix $k^0$ to be positive and call a solution with $e^{ik\cdot x}$ as having positive frequency and a solution with $e^{-ik\cdot x}$ as having negative frequency. The two Dirac equations can now be taken as a single Dirac equation with +m, admitting positive and negative frequency solutions. Explicitly, we refine the ansatz as $\Phi^A_+(\vec{k}, \sigma, x) := \frac{1}{(2\pi)^{3/2}}\, u^A(k, \sigma)\, e^{ik\cdot x}$ and $\Phi^A_-(\vec{k}, \sigma, x) := \frac{1}{(2\pi)^{3/2}}\, v^A(k, \sigma)\, e^{-ik\cdot x}$, both satisfying $(-i\not{\partial} + m)\Phi_\pm = 0$. This implies that $(\not{k} + m)\, u(k, \sigma) = 0 = (\not{k} - m)\, v(k, \sigma)$. The u, v spinors get their k-dependence through these defining equations. These spinors are eigen-spinors of $\not{k}$ with eigenvalues ±m. Since the trace of the γ’s vanishes, $\not{k}$ is traceless too and hence each eigenvalue is doubly degenerate i.e. the ‘σ’ label in the eigen-spinors takes two values. These will be linked to the helicities later on.
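The double degeneracy of the eigenvalues ±m of $\not{k}$ can be seen explicitly. The following is a small numerical sketch, assuming the Dirac-Pauli matrices (which satisfy $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$ for the mostly-plus metric used in these notes) and an illustrative on-shell momentum; the numbers are hypothetical.

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

# Dirac-Pauli representation: (gamma^0)^2 = +1, (gamma^i)^2 = -1
gammas = [np.block([[one, zero], [zero, -one]])]
gammas += [np.block([[zero, s], [-s, zero]]) for s in pauli]

def slash(k_vec, m):
    """k-slash = k_mu gamma^mu for an on-shell momentum, with k_0 = -k^0."""
    k0 = np.sqrt(k_vec @ k_vec + m**2)
    k_dn = np.array([-k0, *k_vec])
    return sum(k_dn[mu] * gammas[mu] for mu in range(4))

m = 0.7                                      # illustrative mass
k_slash = slash(np.array([0.2, 0.5, -1.1]), m)

eigs = np.sort(np.linalg.eigvals(k_slash).real)
print(eigs)                                  # two eigenvalues near -m, two near +m
print(np.allclose(k_slash @ k_slash, m**2 * np.eye(4)))  # k-slash squared = -k^2 = m^2
```

The eigenvalue −m belongs to the u spinors and +m to the v spinors, in line with $(\not{k} + m)u = 0 = (\not{k} - m)v$.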
Substitution of these solutions in the inner product leads to,
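\begin{align}
(\Phi(k', \sigma'), \Phi(k, \sigma)) &= \frac{1}{(2\pi)^3} \int_{\Sigma_t} d^3x\; u^\dagger(k', \sigma')\, u(k, \sigma)\, e^{-i(k'-k)\cdot x} \nonumber \\
&= u^\dagger(k', \sigma')\, u(k, \sigma)\, e^{-i(k^0 - k'^0)t}\, \delta^3(\vec{k} - \vec{k}') \tag{4.13} \\
\therefore\ (\Phi(k', \sigma'), \Phi(k, \sigma)) &= \frac{u^\dagger(k', \sigma')\, u(k, \sigma)}{2k^0}\, \delta^3_{\rm inv}(\vec{k} - \vec{k}') := \delta_{\sigma',\sigma}\, \delta^3_{\rm inv}(\vec{k} - \vec{k}'). \tag{4.14}
\end{align}
We have chosen a normalization of the u-spinors. Identical orthonormality relations follow for the v-spinors.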
Under Lorentz transformations, the solutions transform as,
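\[ [\Phi(k, \sigma)]_\Lambda(x) = [D(\Lambda)\, u(k, \sigma)]\, \frac{e^{i(\Lambda k)\cdot x}}{(2\pi)^{3/2}} \overset{?}{=} \Phi(\Lambda k, \sigma)(x) \]
The last equality will hold if $D(\Lambda)\, u(k, \sigma) = u(\Lambda k, \sigma)$ i.e. if $\left((\Lambda k)_\mu \gamma^\mu + m\right) [D(\Lambda)\, u(k, \sigma)] = 0$.
We already have $[\Sigma^{\mu\nu}, \gamma^\lambda] = i(\eta^{\mu\lambda}\gamma^\nu - \eta^{\nu\lambda}\gamma^\mu)$. This implies (may be checked by taking the infinitesimal form of D(Λ)),
\[ D^{-1}(\Lambda)\, \gamma^\mu\, D(\Lambda) = \Lambda^\mu{}_\nu \gamma^\nu \ \Rightarrow\ \gamma^\mu D(\Lambda) = \Lambda^\mu{}_\nu\, D(\Lambda)\, \gamma^\nu. \]
It follows,
\begin{align}
\left[(\Lambda k)_\mu \gamma^\mu + m\right] (D(\Lambda)\, u(k, \sigma)) &= \left[(\Lambda k)_\mu \gamma^\mu D(\Lambda) + D(\Lambda)\, m\right] u(k, \sigma) \tag{4.15} \\
&= \left[(\Lambda k)_\mu \Lambda^\mu{}_\nu\, D(\Lambda)\, \gamma^\nu + D(\Lambda)\, m\right] u(k, \sigma) \nonumber \\
&= D(\Lambda) \left[\Lambda_\mu{}^\alpha k_\alpha\, \Lambda^\mu{}_\nu\, \gamma^\nu + m\right] u(k, \sigma) \qquad \text{but } \Lambda_\mu{}^\alpha \Lambda^\mu{}_\nu = \delta^\alpha{}_\nu \nonumber \\
\therefore\ \left[(\Lambda k)_\mu \gamma^\mu + m\right] (D(\Lambda)\, u(k, \sigma)) &= D(\Lambda) \left[k_\nu \gamma^\nu + m\right] u(k, \sigma) = 0. \qquad \text{(Covariance Result)} \tag{4.16}
\end{align}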
Thus, indeed we have D(Λ)u(k, σ) = u(Λk, σ) and hence,
[Φ(k, σ)]Λ,a (x) = e−i(Λk)·a Φ(Λk, σ)(x) .
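The commutator $[\Sigma^{\mu\nu}, \gamma^\lambda] = i(\eta^{\mu\lambda}\gamma^\nu - \eta^{\nu\lambda}\gamma^\mu)$ that underlies the covariance result can be verified by brute force over all index combinations. The sketch below assumes the Dirac-Pauli matrices and the convention $\Sigma^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu]$ with the mostly-plus metric; the helper names are illustrative.

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

# Dirac-Pauli matrices with {gamma^mu, gamma^nu} = -2 eta^{mu nu}
gamma = [np.block([[one, zero], [zero, -one]])]
gamma += [np.block([[zero, s], [-s, zero]]) for s in pauli]
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def comm(a, b):
    return a @ b - b @ a

def sigma(mu, nu):
    """Lorentz generator Sigma^{mu nu} = (i/4) [gamma^mu, gamma^nu]."""
    return 0.25j * comm(gamma[mu], gamma[nu])

# Check [Sigma^{mu nu}, gamma^lam] = i (eta^{mu lam} gamma^nu - eta^{nu lam} gamma^mu)
commutator_ok = all(
    np.allclose(comm(sigma(mu, nu), gamma[lam]),
                1j * (eta[mu, lam] * gamma[nu] - eta[nu, lam] * gamma[mu]))
    for mu in range(4) for nu in range(4) for lam in range(4))
print(commutator_ok)
```

Since the identity is purely algebraic, it holds in any representation related to this one by a similarity transformation.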
Under Lorentz transformations then the orthonormality relation transforms as,
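\begin{align}
(\Phi(k', \sigma')_{\Lambda,a}, \Phi(k, \sigma)_{\Lambda,a}) &= e^{i\Lambda(k'-k)\cdot a}\, (\Phi(\Lambda k', \sigma'), \Phi(\Lambda k, \sigma)) \nonumber \\
&= \frac{u^\dagger(\Lambda k', \sigma')\, u(\Lambda k, \sigma)}{2(\Lambda k)^0}\, \delta^3_{\rm inv}(\vec{k}' - \vec{k}) = \delta_{\sigma',\sigma}\, \delta^3_{\rm inv}(\vec{k}' - \vec{k}). \tag{4.17}
\end{align}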
In the last equality we have used the normalization of the u spinors. This proves the unitarity of the Poincare representation on the positive frequency solutions. As before, Lorentz
transformations do not mix the subspaces of the positive/negative frequency solutions.
The normalization definition does not look Lorentz invariant, but it is. Both u† u = ūγ 0 u
and k 0 transform the same way. A more convenient form will be displayed later on.
Note: The covariance result allows us to define the u, v spinors from their definition in the rest frame (for the massive case), $\hat{k} = (m, \vec{0})$. In this frame, the spinors satisfy the equations: $-m\gamma^0 u(\hat{k}, \sigma) = -m\, u(\hat{k}, \sigma)$ and $-m\gamma^0 v(\hat{k}, \sigma) = +m\, v(\hat{k}, \sigma)$. Thus these spinors at $\hat{k}$ are eigen-spinors of $\gamma^0$ with eigenvalues ±1 respectively. $\gamma^0$ being Hermitian, these spinors are orthogonal: $u^\dagger(\hat{k}, \sigma)\, v(\hat{k}, \sigma) = 0$. As noted before, these eigenvalues are doubly degenerate and σ labels these. For the massless case, $\hat{k} = (k, 0, 0, k)$ and u, v are eigen-spinors of the $J_3$ generator of the Little group.
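Note: Incidentally, the covariance result and the identification of $u(\hat{k}, \sigma)$, $v(\hat{k}, \sigma)$ as eigen-spinors of $\gamma^0$ also lead to a completeness relation as follows. Given that $\gamma^0 u(\hat{k}, \sigma) = u(\hat{k}, \sigma)$, $\gamma^0 v(\hat{k}, \sigma) = -v(\hat{k}, \sigma)$ and each eigen-space being two dimensional, the completeness relations (‘spectral representation’, $A = \sum_n \lambda_n |n\rangle\langle n|$) take the form
\begin{align}
\sum_\sigma u(\hat{k}, \sigma)\, u^\dagger(\hat{k}, \sigma) &= \frac{1 + \gamma^0}{2}, & \sum_\sigma v(\hat{k}, \sigma)\, v^\dagger(\hat{k}, \sigma) &= \frac{1 - \gamma^0}{2} \\
\sum_\sigma u(\hat{k}, \sigma)\, \bar{u}(\hat{k}, \sigma) &= \frac{\gamma^0 + 1}{2} = \frac{-\not{\hat{k}} + m}{2m}, & \sum_\sigma v(\hat{k}, \sigma)\, \bar{v}(\hat{k}, \sigma) &= \frac{\gamma^0 - 1}{2} = -\frac{\not{\hat{k}} + m}{2m} \\
\sum_\sigma u(\vec{k}, \sigma)\, \bar{u}(\vec{k}, \sigma) &= \frac{-\not{k} + m}{2m}, & \sum_\sigma v(\vec{k}, \sigma)\, \bar{v}(\vec{k}, \sigma) &= -\frac{\not{k} + m}{2m}
\end{align}
In the second equation, we used $\gamma^0 = -\frac{\not{\hat{k}}}{m}$ and in the last equation we multiplied by D(L) on the left and $D^{-1}(L)$ on the right and used the covariance result. Here L is a boost which takes $\hat{k} \to \vec{k}$. Notice that multiplying by $u(\vec{k}, \sigma')$ and $v(\vec{k}, \sigma')$ respectively, and using the equations satisfied by the spinors, gives $\bar{u}(\vec{k}, \sigma)\, u(\vec{k}, \sigma') = \delta_{\sigma,\sigma'} = -\bar{v}(\vec{k}, \sigma)\, v(\vec{k}, \sigma')$. These are consistent with the previously chosen normalizations: $u^\dagger(\vec{k}, \sigma)\, u(\vec{k}, \sigma') = 2k^0\, \delta_{\sigma,\sigma'} = v^\dagger(\vec{k}, \sigma)\, v(\vec{k}, \sigma')$.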
Thus, for all the field equations we have found unitary, irreducible representations of the
Poincare group. We have also discovered that the field representations automatically also
include ‘anti-particles’. This is really a consequence of the requirement of manifest Poincare
covariance which necessitates the field representations.
Note: From the expressions above, it should be clear that the ket vectors $|k, \sigma\rangle$ are in one-to-one correspondence with the “plane wave solutions” with positive and negative frequencies, displayed above.
We need to study the behavior of these representations under the space inversion, time reversal and the new possibility of ‘charge conjugation’ (particle-antiparticle exchange).
5. PARITY, TIME REVERSAL AND CHARGE CONJUGATION
For the field representations above, we focussed on the subgroup of the Poincare group,
connected to the identity. The remaining elements of the Poincare group - improper and
non-orthochronous are generated by the two transformations of space inversion and time
reversal. Their actions on the generators is given in eq.(2.36). These relations continue to
hold in any representation, in particular also on the field representations. Looking at the general form of the Poincare action on the $\Psi^A(x)$’s, $\Psi^A_{\Lambda,a}(x) = D^A{}_B(\Lambda)\, \Psi^B(\Lambda^{-1}(x - a))$, we
see that the action on the space-time point is common to all D(Λ) representations and we
may focus on the D(Λ) part of it. Recalling that the time reversal operation is anti-linear
and anti-unitary, we have to be careful about taking complex conjugates. Thus we define
the actions as,
\begin{align}
\Psi^A_P(x) &:= D^A{}_B(P)\, \Psi^B(t, -\vec{x}), & D(P) &=: \Pi \tag{5.1} \\
\Psi^A_T(x) &:= D^A{}_B(T)\, (\Psi^B(-t, \vec{x}))^*, & D(T) &=: \tau \tag{5.2}
\end{align}
It is customary to denote the anti-unitary, anti-linear operator T as τ K where τ is a unitary
operator and K takes complex conjugate of numbers on its right. T 2 = 1 implies τ τ ∗ = 1.
Consider the translation generators. These act as differential operators on the Ψ’s:
Pµ ΨA (x) = −i∂µ ΨA . Consider the P and T actions on Pµ as given in eqn. (2.36). For
instance,
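\begin{align}
(T^{-1} P_0 T)\, \Psi^A(t, \vec{x}) &= T^{-1}\{-i\partial_0 (\tau^A{}_B\, \Psi^B(-t, \vec{x})^*)\} = T^{-1}\{+i\partial_{-t} (\tau^A{}_B\, \Psi^B(-t, \vec{x})^*)\} \nonumber \\
&= -i\partial_t\, (\tau\tau^*)^A{}_B\, \Psi^B(t, \vec{x}) = P_0 \Psi^A(t, \vec{x}). \tag{5.3}
\end{align}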
The time has been reversed twice and the complex conjugation has been effected twice. For
the spatial components, there is no ‘t’ reversal and the T introduces a sign. Space inversion
is likewise straight forward. We now focus on the Lorentz generators only.
The scalar case has no non-trivial matrices and the vector case is simpler and is left as
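an exercise. The spinor case is non-trivial. We have the Lorentz generators:
\[ \Sigma^{\mu\nu} = \frac{i}{4}[\gamma^\mu, \gamma^\nu] \quad \leftrightarrow \quad J_i = \frac{\epsilon_{ijk}}{2}\, \Sigma^{jk}, \quad K^i = \Sigma^{i0}. \]
The Π and τ matrices are also 4 × 4 matrices. We note a result.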
Result: Any 4 × 4 matrix can be written as a linear combination of the 16 Γ matrices,
{Γ} = {1, γ µ , γ [µ γ ν] , γ [µ γ ν γ λ] , γ5 }, the square brackets denoting anti-symmetrization (16 =
1 + 4 + 6 + 4 + 1).
The result follows by showing that the 16 matrices are linearly independent.
We now deduce the Π and τ matrices from the commutation relations (see eq.(2.36)):
Σjk Π = ΠΣjk , Σ0i Π = −ΠΣ0i , Σjk τ = −τ (Σjk )∗ , Σ0i τ = τ (Σ0i )∗ .
The determination of Π is straight forward. Commutation with Σij implies that Π is
a linear combination of 1, γ5, γ0. Anti-commutation with $\Sigma^{0i} = \frac{i}{2}\gamma^0\gamma^i$ implies that only γ0 survives and $\Pi = c\gamma^0$. Now $\Pi^2 = 1$ implies c = ±1. Hence, $(\Psi_P)^A(x) = \pm(\gamma^0)^A{}_B\, \Psi^B(t, -\vec{x})$.
It follows immediately that left(right) handed Weyl spinor becomes right(left) handed Weyl
spinor under space inversion.
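The properties that single out $\Pi = \pm\gamma^0$ can be confirmed numerically. The following is a minimal sketch, assuming the Dirac-Pauli matrices, the sign choice c = +1 and $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; it checks that Π commutes with the rotation generators, anti-commutes with the boost generators, squares to one and anti-commutes with $\gamma_5$ (which is why chirality is flipped).

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

gamma = [np.block([[one, zero], [zero, -one]])]
gamma += [np.block([[zero, s], [-s, zero]]) for s in pauli]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

def sigma(mu, nu):
    """Sigma^{mu nu} = (i/4) [gamma^mu, gamma^nu]."""
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

Pi = gamma[0]  # sign choice c = +1 (the overall sign is conventional)

rotations_commute = all(np.allclose(sigma(j, k) @ Pi, Pi @ sigma(j, k))
                        for j in (1, 2, 3) for k in (1, 2, 3))
boosts_anticommute = all(np.allclose(sigma(0, i) @ Pi, -Pi @ sigma(0, i))
                         for i in (1, 2, 3))
squares_to_one = np.allclose(Pi @ Pi, np.eye(4))
flips_chirality = np.allclose(Pi @ gamma5, -gamma5 @ Pi)
print(rotations_commute, boosts_anticommute, squares_to_one, flips_chirality)
```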
The determination of τ involves complex conjugation. From the unitarity of the γ
matrices and the defining Clifford relations, the hermiticity properties are determined:
γ0† = γ0 , γi† = −γi . The transpose/complex conjugation properties however depend on
the explicit choice of the γ’s. The τ matrix thus depends on the explicit representation of
the γ’s. There are three commonly employed representations:
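\begin{align}
\text{Dirac-Pauli}:\quad & \gamma^0 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \gamma^i := \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \quad \gamma_5 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \tag{5.4} \\
\text{Weyl}:\quad & \gamma^0 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \gamma^i := \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}, \quad \gamma_5 := \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}; \tag{5.5} \\
\text{Majorana}:\quad & \gamma^0 := \begin{pmatrix} 0 & \sigma_2 \\ \sigma_2 & 0 \end{pmatrix}, \quad \gamma^1 := \begin{pmatrix} i\sigma_3 & 0 \\ 0 & i\sigma_3 \end{pmatrix}, \quad \gamma^2 := \begin{pmatrix} 0 & -\sigma_2 \\ \sigma_2 & 0 \end{pmatrix}; \tag{5.6} \\
& \gamma^3 := \begin{pmatrix} -i\sigma_1 & 0 \\ 0 & -i\sigma_1 \end{pmatrix}, \quad \gamma_5 := \begin{pmatrix} \sigma_2 & 0 \\ 0 & -\sigma_2 \end{pmatrix}. \quad \text{All purely imaginary.} \tag{5.7}
\end{align}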
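The three representations can be checked against the defining relations. The sketch below assumes the block forms as read off from (5.4)-(5.7), the Clifford convention $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$ with η = diag(−1, 1, 1, 1) (which reproduces $\gamma_0^\dagger = \gamma_0$, $\gamma_i^\dagger = -\gamma_i$) and $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; the helper names are illustrative.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

def blk(a, b, c, d):
    """4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return np.block([[a, b], [c, d]])

spatial = [blk(zero, s, -s, zero) for s in (s1, s2, s3)]
# name -> ([gamma^0, gamma^1, gamma^2, gamma^3], listed gamma_5)
reps = {
    "Dirac-Pauli": ([blk(one, zero, zero, -one)] + spatial, blk(zero, one, one, zero)),
    "Weyl": ([blk(zero, one, one, zero)] + spatial, blk(-one, zero, zero, one)),
    "Majorana": ([blk(zero, s2, s2, zero), blk(1j * s3, zero, zero, 1j * s3),
                  blk(zero, -s2, s2, zero), blk(-1j * s1, zero, zero, -1j * s1)],
                 blk(s2, zero, zero, -s2)),
}
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def check(gammas, gamma5):
    """Clifford algebra and gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3."""
    clifford = all(
        np.allclose(gammas[a] @ gammas[b] + gammas[b] @ gammas[a],
                    -2 * eta[a, b] * np.eye(4))
        for a in range(4) for b in range(4))
    product = 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]
    return bool(clifford and np.allclose(product, gamma5))

results = {name: check(gs, g5) for name, (gs, g5) in reps.items()}
# For each representation, which gamma^mu are purely imaginary?
imaginary = {name: [bool(np.allclose(g.real, 0)) for g in gs]
             for name, (gs, _) in reps.items()}
print(results)
print(imaginary)
```

The second dictionary makes the reality properties explicit: only $\gamma^2$ is imaginary in the first two representations, while all four are imaginary in the Majorana representation.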
In the Dirac-Pauli and Weyl representations, only γ2 is imaginary. The τ matrix thus
satisfies:
Σ12 τ = −τ Σ12 , Σ23 τ = −τ Σ23 , Σ31 τ = τ Σ31 ,
Σ01 τ = −τ Σ01 , Σ02 τ = τ Σ02 , Σ03 τ = −τ Σ03 ,
The last relation in the first line suggests τ = λγ 0 γ 2 . It also checks with all other relations.
The λ is determined as follows.
The operator D(T ) is anti-unitary. Hence,
(D(T )D(T )Ψ1 , D(T )D(T )Ψ2 ) = (D(T )Ψ2 , D(T )Ψ1 ) = (Ψ1 , Ψ2 ) ⇒ D2 (T ) is unitary.
Next, D2 (T ) = (τ K)(τ K) = τ τ ∗ = |λ|2 (γ 0 γ 2 γ 0 (−γ 2 )) = −|λ|2 1. The unitarity then implies
that |λ| = 1 i.e. λ is a phase and D2 (T ) = −1 in the spinor representation.
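The choice $\tau = \lambda\gamma^0\gamma^2$ can be tested directly against the defining relations $\Sigma^{jk}\tau = -\tau(\Sigma^{jk})^*$ and $\Sigma^{0i}\tau = \tau(\Sigma^{0i})^*$. The following sketch assumes the Dirac-Pauli matrices and picks the phase λ = 1 purely for illustration.

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

gamma = [np.block([[one, zero], [zero, -one]])]
gamma += [np.block([[zero, s], [-s, zero]]) for s in pauli]

def sigma(mu, nu):
    """Sigma^{mu nu} = (i/4) [gamma^mu, gamma^nu]."""
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

tau = gamma[0] @ gamma[2]  # phase lambda = 1 chosen for illustration

# Sigma^{jk} tau = -tau (Sigma^{jk})^*  and  Sigma^{0i} tau = tau (Sigma^{0i})^*
rotations_ok = all(np.allclose(sigma(j, k) @ tau, -tau @ sigma(j, k).conj())
                   for j, k in [(1, 2), (2, 3), (3, 1)])
boosts_ok = all(np.allclose(sigma(0, i) @ tau, tau @ sigma(0, i).conj())
                for i in (1, 2, 3))

# D(T)^2 = (tau K)(tau K) = tau tau^*
t_squared = tau @ tau.conj()
print(rotations_ok, boosts_ok, np.allclose(t_squared, -np.eye(4)))
```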
In the Majorana representation, all γ’s are imaginary. Hence, $(\Sigma^{\mu\nu})^* = -\Sigma^{\mu\nu}$. Hence τ commutes with $\Sigma^{jk}$ and anti-commutes with $\Sigma^{0i}$. This is exactly as for the space inversion and we deduce that $D(T) = \lambda\gamma^0 K$. Once again $D^2(T)$ is unitary and equal to −1 and λ is a phase.
For vector representation, the ‘A’ index will be a tensorial index and it is left as an
exercise to work out the D matrices for space inversion and time reversal.
The existence of the anti-particle representations suggests one more discrete transformation of order 2, called Charge Conjugation.
Recall that the anti-particle representation is the subspace of negative frequency solutions.
These subspaces are spanned by plane wave solutions and involve the operation of complex
conjugation which takes eik·x → e−ik·x . For the non-trivial D(Λ) representations, complex
conjugation also takes the complex conjugate of the D(Λ) matrices. However, a complex
conjugation of a solution need not be a solution again (especially for spinor as we will see)
and hence the charge conjugation must also involve additional transformations over and
above complex conjugation. For the Klein-Gordon, Proca and Maxwell equations, complex
conjugate of a solution is also a solution as the differential operators are real. For the Dirac
equation we need to do further work.
For instance, let Ψ be a positive frequency solution of the Dirac equation, (−iγ µ ∂µ +
m)Ψ = 0. Taking complex conjugate of the equation, we get (+i(γ µ )∗ ∂µ + m)Ψ∗ = 0. If we
could find an invertible matrix B such that (γ µ )∗ = −Bγ µ B −1 , then Ψc := (B −1 Ψ∗ ) satisfies
the same Dirac equation and of course is a negative frequency solution. Ψc is called the
charge conjugate2 of the Ψ. Apparently, B depends on the choice of explicit γ matrices.
2
The terminology comes when coupling to external electromagnetic field is considered by the minimal
substitution ∂µ → ∂µ − ieAµ . Under complex conjugation, e → −e. So if Ψ is thought of as charge ‘e’
solution then Ψc is a charge ‘-e’ solution.
As a preparation for subsequent development, we have a subsection on properties of γ
matrices.
A. Representations of the Clifford algebra and relations among them
Quite generally, for any group, given a (matrix) representation R(G), we have three
other representations, namely, R∗ (G), (RT )−1 (G) and (R† )−1 (G). From the basic relation
R(g1 )R(g2 ) = R(g1 .g2 ), taking complex conjugate, transpose inverse and adjoint inverse
immediately verifies the assertion. If in addition, the representation R(G) is unitary (always
true for finite groups and compact Lie groups), then (RT )−1 (G) = R∗ (G) and (R† )−1 (G) =
R(G). R(G) and R∗ (G) are then the only independent representations. These are either
equivalent, R∗ (g) = SR(g)S −1 or inequivalent. It turns out that for unitary, irreducible
representations, if R and R∗ are equivalent, then S is either symmetric or anti-symmetric.
The following terminology ensues.
For unitary, irreducible representations:
R(G) is complex if $R^*(G) \not\simeq R(G)$ (inequivalent)
R(G) is pseudo-real if $R^*(G) = S R(G) S^{-1}$, $S^T = -S$
R(G) is real if $R^*(G) = S R(G) S^{-1}$, $S^T = S$
This is relevant for representations of internal symmetry groups as well as the Clifford
group of the 32 elements. The D(Λ) representations are finite dimensional but not unitary
since Lorentz group does not have unitary finite dimensional representations.
For the Clifford group in 4 dimensions (and in even number of dimensions) there is only one (up to unitary equivalence) non-trivial representation which is 4 dimensional (or $2^{N/2}$ dimensional for an even number N of dimensions). Since $\pm\gamma^*$, $\pm\gamma^T$ and $\pm\gamma^\dagger$ all satisfy the same Clifford algebra and the representation is unique, ∃ matrices B, C, D such that,
\[ (\gamma^\mu)^* = -B\gamma^\mu B^{-1}, \qquad (\gamma^\mu)^T = -C\gamma^\mu C^{-1}, \qquad (\gamma^\mu)^\dagger = +D\gamma^\mu D^{-1}. \]
The choice of signs above is conventional. Notice that replacing any of the B, C, D matrices
by multiplying by γ5 on the right, reverses the signs.
It follows immediately that,
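\[ -(\Sigma^{\mu\nu})^* = B\Sigma^{\mu\nu}B^{-1}, \qquad -(\Sigma^{\mu\nu})^T = C\Sigma^{\mu\nu}C^{-1}, \qquad (\Sigma^{\mu\nu})^\dagger = D\Sigma^{\mu\nu}D^{-1}. \]
Given a representation D(Λ) we have the $(D^\dagger)^{-1}(\Lambda)$, $(D^T)^{-1}(\Lambda)$ and $D^*(\Lambda)$ representations. The infinitesimal forms are $(1 + \frac{i}{2}\omega\cdot\Sigma)$, $(1 - \frac{i}{2}\omega\cdot\Sigma)^\dagger$, $(1 - \frac{i}{2}\omega\cdot\Sigma^T)$, $(1 - \frac{i}{2}\omega\cdot\Sigma^*)$ which imply that the corresponding generators are $\Sigma^{\mu\nu}$, $(\Sigma^{\mu\nu})^\dagger$, $-(\Sigma^{\mu\nu})^T$, $-(\Sigma^{\mu\nu})^*$. The B, C, D matrices precisely relate these generators.
These relations can be exponentiated to get corresponding relations among the $D(\Lambda)$, $D^\dagger(\Lambda)$, $D^T(\Lambda)$, $D^*(\Lambda)$, namely,
\[ D^\dagger(\Lambda) = D\, D^{-1}(\Lambda)\, D^{-1}, \qquad D^T(\Lambda) = C\, D^{-1}(\Lambda)\, C^{-1}, \qquad D^*(\Lambda) = B\, D(\Lambda)\, B^{-1}. \]
These can be checked easily by using the series form of the exponentials derived from $D(\Lambda) = \sum_{k=0}^{\infty} \frac{(i\omega\cdot\Sigma)^k}{k!}$.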
B. Dirac-Majorana-Charge Conjugates
This allows us to construct Lorentz invariant bi-linears from the spinors. For instance,
define the Majorana conjugate Ψ̃ := ΨT C. Then,
(Ψ̃)Λ := (ΨTΛ )C = ΨT (DT (Λ))C = ΨT CD−1 (Λ) = Ψ̃D−1 (Λ) .
Thus the Majorana conjugate, like the Dirac conjugate ψ̄ := Ψ† D, transforms by D−1 (Λ).
Recall that the charge conjugate, Ψc transforms by D(Λ). Consequently, Ψ̄Ψ, Ψ̄Ψc , Ψ̃Ψ, Ψ̃Ψc
are all Lorentz scalars.
Recalling that D−1 (Λ)γ µ D(Λ) = Λµν γ ν , it is easy to see that,
Ψ̄Ψ is a Scalar
Ψ̄γ µ Ψ is a Vector
Ψ̄γ µ γ ν Ψ is a Tensor of rank 2
Ψ̄γ µ γ5 Ψ is an Axial (or pseudo) vector
Ψ̄γ5 Ψ is a Pseudo-scalar
Combine these with the definitions,
Dirac Spinor : Ψ = Ψ , (Ψ)Λ = D(Λ)Ψ;
Dirac Conjugate : Ψ̄ = Ψ† D , (Ψ̄)Λ = Ψ̄D−1 (Λ);
Charge Conjugate : Ψc = B −1 Ψ∗ , (Ψc )Λ = D(Λ)Ψc ;
Majorana Conjugate : Ψ̃ = ΨT C , (Ψ̃)Λ = Ψ̃D−1 (Λ).
Thus, the same Lorentz transformation properties hold with Ψ → Ψc and/or Ψ̄ → Ψ̃. The
last two need a little explanation.
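We defined $\gamma_5 := i\gamma^0\gamma^1\gamma^2\gamma^3 = \frac{i}{4!}\, \varepsilon_{\mu\nu\alpha\beta}\, \gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta$, where ε is the completely anti-symmetric symbol, with $\varepsilon_{0123} := 1$. This symbol is also used to define the determinant of a matrix, e.g., $\varepsilon_{\mu'\nu'\alpha'\beta'}\, \Lambda^{\mu'}{}_\mu \Lambda^{\nu'}{}_\nu \Lambda^{\alpha'}{}_\alpha \Lambda^{\beta'}{}_\beta = \det(\Lambda)\, \varepsilon_{\mu\nu\alpha\beta}$. It follows immediately that $D^{-1}(\Lambda)\, \gamma_5\, D(\Lambda) = \det(\Lambda)\, \gamma_5$.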
Consider now an axial vector combination,
[Ψ̄γ µ γ5 Ψ]Λ = Ψ̄D−1 (Λ)γ µ γ5 D(Λ)Ψ = Λµν Ψ̄γ ν D−1 γ5 D(Λ)Ψ = det(Λ)Λµν Ψ̄γ ν γ5 Ψ .
Note that for proper Lorentz transformations, the axial vector and the pseudo-scalar transform as vector and scalar respectively.
Quantities that transform as tensors but with an extra factor of determinant of the
transformation are called pseudo-tensors. There are also tensor looking quantities that
transform with additional factor of |det|w and these are called tensor densities of weight w.
Let us return to determine the matrices B, C, D.
Since the γ’s are unitary, we can always choose any of the B, C, D matrices to be unitary. Quite generally, if $R' = SRS^{-1}$ for two irreducible, equivalent unitary representations, then $R'^\dagger R' = 1 \Rightarrow R^\dagger S^\dagger S R = S^\dagger S \Rightarrow S^\dagger S = \alpha 1$. Positivity of $S^\dagger S$ gives α > 0. Hence $S' := S/\sqrt{\alpha}$ is a unitary matrix. This proves the claim.
Next, $\gamma^\mu = (\gamma^{\mu *})^* = -B^* \gamma^{\mu *} (B^{-1})^* = +B^*B\, \gamma^\mu\, (B^*B)^{-1} \Rightarrow B^*B = \lambda 1$ with λ being real, since $B^*B = \lambda 1$ implies $BB^* = \lambda 1$ while complex conjugation gives $BB^* = \lambda^* 1$. The determinant of B being a phase (unitarity of B) fixes $\lambda := \epsilon_B := \pm 1$ and $B^*B = \epsilon_B 1$. Similar manipulation with $\gamma^\mu = (\gamma^{\mu T})^T$ leads to $C^{-1}C^T = \lambda 1$. Unitarity gives $C^{-1}C^T = C^\dagger C^T = (C^T)^* C^T$, hence λ is real and the determinant gives $\lambda^4 = 1$. Hence, $C^T = \epsilon_C C$, $\epsilon_C = \pm 1$. For $\gamma = (\gamma^\dagger)^\dagger$ we get $D^\dagger = \lambda D$ for some phase λ. Since D is determined to within a phase, we can define $D' := e^{i\alpha}D$ and choose α so that $\lambda = e^{2i\alpha}$ to get $(D')^\dagger = D'$. Thus, without loss of generality, we can always choose $D^\dagger = D$. Now, knowing the hermiticity properties of the γ’s, we get $D = \gamma^0$, independent of any choice of explicit γ matrices.
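The $\epsilon_B$, $\epsilon_C$ are correlated since $\gamma^T = (\gamma^*)^\dagger = (\gamma^\dagger)^*$. $\gamma^T = (\gamma^*)^\dagger$ leads to,
\[ -C\gamma C^{-1} = -(B^\dagger)^{-1} D\gamma D^{-1} B^\dagger \ \Rightarrow\ C = \lambda BD. \]
$\gamma^T = (\gamma^\dagger)^*$ leads to,
\[ -C\gamma C^{-1} = -(D^*) B\gamma B^{-1} (D^{-1})^* \ \Rightarrow\ C = \lambda' D^*B. \]
Eliminating C gives $BD = \frac{\lambda'}{\lambda} D^*B$ or $D^* = \frac{\lambda}{\lambda'} BDB^{-1}$. However, $D = \gamma^0$ can always be achieved as shown above. Therefore, $\lambda' = -\lambda$ and $C = \lambda BD = -\lambda D^*B$. Next,
\[ C^T = \lambda D^T B^T = \lambda (D^\dagger)^* (B^\dagger)^* = \lambda D^* (B^*)^{-1} = \lambda D^* (\epsilon_B B) = -\epsilon_B C \ \Rightarrow\ \epsilon_C = -\epsilon_B. \]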
Note that this is independent of explicit γ matrices and also independent of the phase λ!
We still need to determine $\epsilon_B$, say.
Claim: The $\epsilon_B$ does not depend on the choice of γ matrices.
This is easily proved. Let γ' and γ be two distinct choices of explicit γ-matrices. Both being unitary are related by some unitary matrix, S, as $\gamma = S\gamma' S^{-1}$. Then substitution in $\gamma^* = -B\gamma B^{-1}$ gives, $(\gamma')^* = -B'\gamma'(B')^{-1}$ with $B' := (S^*)^{-1}BS$. It follows that $(B')^*B' = (S^{-1}B^*S^*)((S^*)^{-1}BS) = \epsilon_B 1$, proving the claim. It therefore suffices to choose an explicit representation and evaluate $\epsilon_B$.
Referring to say the Dirac-Pauli representation, see (5.4), only $\gamma^2$ is pure imaginary. Hence, $\gamma^{0,1,3}B = -B\gamma^{0,1,3}$ and $\gamma^2 B = +B\gamma^2$. Furthermore, $\gamma_5^* = \gamma_5 \Rightarrow \gamma_5 B = -B\gamma_5$ as well. Therefore B must be made up of an odd number of γ’s. By inspection, $B = \alpha\gamma^2$ satisfies the conditions. Unitarity of B restricts α to be a phase and $B^*B = |\alpha|^2(-\gamma^2)(\gamma^2) = 1 \Rightarrow \epsilon_B = +1$ and $\epsilon_C = -1$. Also $C = \lambda BD = \lambda\alpha\gamma^2\gamma^0 = (-\lambda\alpha)\gamma^0\gamma^2$. The phases λ, α are arbitrary and convention dependent. We choose λ = 1 and α = i so that $B = i\gamma^2$, $C = i\gamma^2\gamma^0$.
This also verifies for the Weyl representation. For the Majorana representation, all $\gamma^\mu$’s are imaginary and thus commute with B. Hence B must be a phase multiple of 1. This is also consistent with commutation with $\gamma_5$. Clearly $\epsilon_B = +1$ and $C = (\lambda\alpha)\gamma^0$.
This completes the discussion of representations and relations among them.
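The choices $B = i\gamma^2$, $C = i\gamma^2\gamma^0$ and $D = \gamma^0$ can be confirmed against the defining relations. The sketch below assumes the Dirac-Pauli matrices and the phase choices λ = 1, α = i made above; the variable names such as `eps_B_is_plus` are illustrative.

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

gamma = [np.block([[one, zero], [zero, -one]])]
gamma += [np.block([[zero, s], [-s, zero]]) for s in pauli]

B = 1j * gamma[2]
C = 1j * gamma[2] @ gamma[0]
D = gamma[0]
inv = np.linalg.inv

# (gamma)^* = -B gamma B^-1, (gamma)^T = -C gamma C^-1, (gamma)^dagger = +D gamma D^-1
conj_ok = all(np.allclose(g.conj(), -B @ g @ inv(B)) for g in gamma)
transpose_ok = all(np.allclose(g.T, -C @ g @ inv(C)) for g in gamma)
dagger_ok = all(np.allclose(g.conj().T, D @ g @ inv(D)) for g in gamma)

# Signs: B^* B = +1 and C^T = -C
eps_B_is_plus = np.allclose(B.conj() @ B, np.eye(4))
eps_C_is_minus = np.allclose(C.T, -C)
unitary_ok = all(np.allclose(M.conj().T @ M, np.eye(4)) for M in (B, C, D))
print(conj_ok, transpose_ok, dagger_ok, eps_B_is_plus, eps_C_is_minus, unitary_ok)
```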
It is customary to introduce a charge conjugation operator, $\mathcal{C} := B^{-1}K$, where K instructs to take the complex conjugate of the numbers on its right. This operator is anti-linear and since B is unitary, it is also an anti-unitary operator. It follows,
\[ \mathcal{C}\, \Sigma^{\mu\nu} = B^{-1}(\Sigma^{\mu\nu})^* K = B^{-1}(-B\Sigma^{\mu\nu}B^{-1})K = -\Sigma^{\mu\nu}\, \mathcal{C} \ \Rightarrow\ \mathcal{C}\, D(\Lambda) = D(\Lambda)\, \mathcal{C}. \]
Thus charge conjugation acts invariantly on Lorentz representations. Provided $\mathcal{C}^2 = 1$, we have $\left(\frac{1 \pm \mathcal{C}}{2}\right)^2 = \frac{1 \pm \mathcal{C}}{2}$ and the Lorentz representation can be reduced further. The corresponding subspaces satisfy $\mathcal{C}\Psi = \pm\Psi$ and these spinors are called Majorana spinors. Thus, Majorana spinors are Dirac spinors which satisfy $\mathcal{C}\Psi = \pm\Psi \leftrightarrow B^{-1}\Psi^* = \pm\Psi$. So is $\mathcal{C}^2 = 1$? We have $\mathcal{C}^2 = B^{-1}KB^{-1}K = (B^*B)^{-1}K^2 = \epsilon_B 1 = 1$. Therefore, we do have Majorana spinors (in 4 dimensions).
Do we have Weyl-Majorana spinors? Well, $\mathcal{C}\gamma_5 = B^{-1}\gamma_5^* K = -\gamma_5 B^{-1}K = -\gamma_5\, \mathcal{C}$ and we cannot have Weyl-Majorana spinors in 4 dimensions.
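The anti-linear map $\Psi \to B^{-1}\Psi^*$ and the Majorana condition can be illustrated on a random spinor. The following is a minimal sketch, assuming the Dirac-Pauli matrices with $B = i\gamma^2$; the random spinor and the helper `charge_conj` are hypothetical illustrations.

```python
import numpy as np

pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
one, zero = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

gamma = [np.block([[one, zero], [zero, -one]])]
gamma += [np.block([[zero, s], [-s, zero]]) for s in pauli]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
B_inv = np.linalg.inv(1j * gamma[2])

def charge_conj(spinor):
    """Anti-linear map Psi -> B^{-1} Psi^*."""
    return B_inv @ spinor.conj()

rng = np.random.default_rng(0)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)  # arbitrary Dirac spinor

# Charge conjugation squares to the identity
involution_ok = np.allclose(charge_conj(charge_conj(psi)), psi)

# (1 + C)/2 projects onto Majorana spinors with C Psi = +Psi
majorana = 0.5 * (psi + charge_conj(psi))
is_majorana = np.allclose(charge_conj(majorana), majorana)

# C anti-commutes with gamma_5 and with the generator Sigma^{12}
anticommutes_g5 = np.allclose(charge_conj(gamma5 @ psi), -gamma5 @ charge_conj(psi))
sigma12 = 0.25j * (gamma[1] @ gamma[2] - gamma[2] @ gamma[1])
anticommutes_sigma = np.allclose(charge_conj(sigma12 @ psi), -sigma12 @ charge_conj(psi))
print(involution_ok, is_majorana, anticommutes_g5, anticommutes_sigma)
```

Because the map anti-commutes with $\gamma_5$, applying a chirality projector to `majorana` does not give another Majorana spinor, which is the numerical counterpart of the statement about Weyl-Majorana spinors.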
A similar analysis can be carried out for spinors in any dimensions and with any metric
signature. This may be seen for instance in [2].
6. RELATIVISTIC ACTIONS: CLASSICAL FIELDS
So far we focused on the representation theory of the Poincare group. The abstract,
algebraic approach revealed the attributes of these representations, namely mass and
spin/helicity. In order to have a framework which is manifestly Poincare covariant, we
constructed and analyzed representations on vector valued (complex in general) function,
ΨA - henceforth these will be generically referred to as fields. The irreducibility condition
emerged as “field equations”. These equations are all homogeneous, linear and with at the
most two derivatives. They all admit plane waves and their linear combinations as solutions, but no other ‘phenomenon’ involving (say) different types of waves modifying their
propagation etc. Intuitively, there are no interactions.
We would like to have a framework which is not only Poincare covariant but also involves
“interaction” or non-trivial “dynamics”. The well tested and successful strategy is to have an
action formulation. Let us quickly note several advantages of having an action formulation:
1. The equations of motion are retrieved invoking a variational principle as the Euler-Lagrange equations of motion;
2. “Interactions” can be understood naturally as leading to non-linear and/or coupled
equations which can be easily introduced as more than quadratic order terms in the
action;
3. Covariance of equations of motion (or dynamics) can be easily incorporated by requiring appropriate invariance of the action;
4. The Noether’s theorem gives a recipe for obtaining quantities conserved by equations
of motion (and hence during interactions as well);
5. It leads to a canonical framework of symplectic structure (“Poisson brackets”) and a
Hamiltonian evolution. This provides a systematic method of identifying “degrees of
freedom” (eg Dirac’s theory of constrained systems).
A canonical structure is already inherent in the quantum framework: the imaginary
part of the inner product provides the symplectic structure while the Schrodinger
equation gives a Hamiltonian evolution;
6. Path integral quantization - very well suited for gauge theories especially the non-abelian ones - has action as the central quantity.
Without further ado, let us proceed with an action formulation. Our first aim is to obtain
the Klein-Gordon, the Dirac and the Proca/Maxwell equations. Because the equations are
local, partial differential equations, the action must be an integral over the Minkowski space-time, of a Lagrangian density built out of the Poincare covariant fields and their derivatives i.e. it must be of the form $S = \int d^4x\, \sqrt{|\det(\eta)|}\, \mathcal{L}$ where $\mathcal{L}$ is a Lorentz scalar built out of the fields. It is called the Lagrangian density. Here, η denotes the Minkowski metric and is necessary to absorb the Jacobian of Lorentz transforms of the space-time coordinates. The absolute value of the determinant happens to be 1 and is suppressed throughout. A variational principle has the form,
\[ \delta S[\Phi] := S[\Phi + \delta\Phi] - S[\Phi] = \int d^4x\, \delta\mathcal{L} := \int d^4x\, \delta\Phi\ \text{“}\frac{\delta\mathcal{L}}{\delta\Phi}\text{”}. \]
Extremization of the action under arbitrary variation δΦ leads to the Euler-Lagrange equations: $\frac{\delta\mathcal{L}}{\delta\Phi} = 0$. Since the equations we want to derive are linear, it suffices to have $\mathcal{L}$ quadratic in the fields. Furthermore, the equations have no more than two derivatives which can be obtained by restricting to first derivatives in the Lagrangian.
Let us begin with the scalar field,
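\begin{align}
\phi_{\Lambda,a}(x) &= \phi(\Lambda^{-1}(x - a)), \qquad \partial_\mu \phi_{\Lambda,a}(x) = \Lambda_\mu{}^\alpha\, (\partial_\alpha\phi)(\Lambda^{-1}(x - a)) \ \Rightarrow \nonumber \\
[\partial^\mu\phi\, \partial_\mu\phi]_{\Lambda,a} &= \eta^{\mu\nu} [\partial_\mu\phi\, \partial_\nu\phi]_{\Lambda,a} = \eta^{\mu\nu} \Lambda_\mu{}^\alpha \Lambda_\nu{}^\beta\, [\partial_\alpha\phi\, \partial_\beta\phi](\Lambda^{-1}(x - a)) \nonumber \\
&= \eta^{\alpha\beta}\, \partial_\alpha\phi\, \partial_\beta\phi\, (\Lambda^{-1}(x - a)) = [\partial^\alpha\phi\, \partial_\alpha\phi](\Lambda^{-1}(x - a)). \nonumber
\end{align}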
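When integrated over the space-time, we can change the dummy variable x → Λx + a which restores the argument of the φ’s without changing the integration measure. Hence $\int d^4x\, \partial^\mu\phi\, \partial_\mu\phi$ is clearly Poincare invariant. Since all fields have the same shift of the space-time argument, we will suppress the translation part and focus on the Lorentz part. This works as long as there are no externally prescribed fields/functions, which break translation invariance. Thus, let
\begin{align}
S[\phi] &:= \int_M d^4x \left( -\frac{1}{2}\partial^\mu\phi\, \partial_\mu\phi - \frac{1}{2}m^2\phi^2 \right) =: \int_M d^4x\, \mathcal{L}(\phi, \partial\phi) \tag{6.1} \\
\delta S[\phi] &= \int_M d^4x \left( -\partial_\mu\phi\, \partial^\mu\delta\phi - m^2\phi\, \delta\phi \right) \nonumber \\
&= \int_M d^4x \left( -\partial_\mu(\partial^\mu\phi\, \delta\phi) + (\Box\phi - m^2\phi)\, \delta\phi \right) \nonumber \\
&= \int_M d^4x\; \delta\phi\, (\Box - m^2)\phi - \int_{\partial M} d^3x\; n_\mu (\partial^\mu\phi\, \delta\phi) \tag{6.2}
\end{align}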
The integration is over all of Minkowski space-time which has no boundary and hence the
second term should be zero. In practice, such space-times extending to infinite coordinates
are handled/defined putting the system in a large box and taking a limit. The fields must
satisfy suitable boundary conditions so that the boundary contribution again vanishes. Alternatively requiring the field to vanish fast enough in the asymptotic regions, makes the
boundary term vanish. We can always choose the variational principle to set δφ = 0 at
the boundary/asymptotic regions. While the boundary contribution are important in some
context, we will restrict ourselves to boundary contribution being zero.
Demanding that the action be stationary for arbitrary variations of the field, we get the
equation of motion $(\Box - m^2)\phi = 0$. Incidentally, we will get the same equation of motion from
another Lagrangian L 0 = L + ∂µ Λµ . Thus, several Lagrangians can give the same equations
of motion (although the Hamiltonian formulation will vary with the Lagrangians.)
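As a quick numerical illustration of the equation of motion $(\Box - m^2)\phi = 0$, one can check that a plane wave solves it only on the mass shell. The sketch below makes several simplifying assumptions not in the notes: it works in 1+1 dimensions with $\Box = -\partial_t^2 + \partial_x^2$, uses a real plane wave $\cos(kx - \omega t)$, approximates derivatives by central differences, and picks arbitrary sample values.

```python
import numpy as np

def phi(t, x, k, omega):
    """Real plane wave cos(k x - omega t) in 1+1 dimensions."""
    return np.cos(k * x - omega * t)

def kg_residual(t, x, k, omega, m, h=1e-3):
    """Finite-difference estimate of (Box - m^2) phi with Box = -d_t^2 + d_x^2."""
    centre = phi(t, x, k, omega)
    d2t = (phi(t + h, x, k, omega) - 2 * centre + phi(t - h, x, k, omega)) / h**2
    d2x = (phi(t, x + h, k, omega) - 2 * centre + phi(t, x - h, k, omega)) / h**2
    return -d2t + d2x - m**2 * centre

# Illustrative values (assumptions)
k, m = 2.0, 1.5
t0, x0 = 0.37, -1.2

on_shell = kg_residual(t0, x0, k, np.sqrt(k**2 + m**2), m)   # omega^2 = k^2 + m^2
off_shell = kg_residual(t0, x0, k, 1.0, m)                    # wrong frequency
print(on_shell, off_shell)
```

The on-shell residual is zero up to discretization error, while the off-shell one is of order $(\omega^2 - k^2 - m^2)\phi$.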
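Consider now a vector field. For the massive case, the equations we want are $(\Box - m^2)v^\mu = 0 = \partial_\mu v^\mu$.
Consider $\mathcal{L} = aF^{\mu\nu}F_{\mu\nu} + bm^2 v^\mu v_\mu$, where $F_{\mu\nu} := \partial_\mu v_\nu - \partial_\nu v_\mu$ and a, b are non-zero constants to be determined.
\begin{align}
\delta\mathcal{L} &= 2aF^{\mu\nu}(\partial_\mu\delta v_\nu - \partial_\nu\delta v_\mu) + 2bm^2 v^\mu\delta v_\mu = 4aF^{\mu\nu}\partial_\mu\delta v_\nu + 2bm^2 v^\nu\delta v_\nu \nonumber \\
&= \partial_\mu(4aF^{\mu\nu}\delta v_\nu) - 4a(\partial_\mu F^{\mu\nu})\delta v_\nu + 2bm^2 v^\nu\delta v_\nu \ \Rightarrow\ -4a\,\partial_\mu F^{\mu\nu} + 2bm^2 v^\nu = 0. \nonumber \\
\therefore\ 0 &= \Box v^\nu - \frac{b}{2a}m^2 v^\nu - \partial^\nu\, \partial\cdot v \qquad \text{taking divergence and choosing } b = 2a \text{ gives} \nonumber \\
0 &= (\Box - m^2)v^\nu \quad \text{and} \quad \partial\cdot v = 0. \tag{6.3}
\end{align}
We got b = 2a and we choose $b = -\frac{1}{2}$ similar to the scalar field case. The reason for this choice will be clear a little later.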
Note: For m = 0, we denote the vector field by $A_\mu$. Now we cannot conclude $\partial\cdot A = 0$, and the equation we get is: $\partial_\mu F^{\mu\nu} = 0$ which is just the Maxwell equation. It has the well known gauge invariance: $A_\mu \to A_\mu + \partial_\mu\Lambda$ which allows us to take one of the components of $A_\mu$ to be zero.
Exercise: Another natural choice of the Lagrangian is: $\mathcal{L}' = a\,\partial^\mu v^\nu\partial_\mu v_\nu + bm^2v_\mu v^\mu + c(\partial_\mu v^\mu)^2$.
Repeat the steps and deduce the choices for the constants so as to get the Proca
equations. Check that $\mathcal{L}$, $\mathcal{L}'$ differ by a divergence.
Lastly, let us consider the action for getting the Dirac equation. We have already noted
the Lorentz transformations of $\Psi, \bar\Psi, \gamma^\mu, \partial_\mu$. From these it follows that we can have the
following Lorentz invariant terms, quadratic in the fields and with a single derivative:
$\bar\Psi\gamma^\alpha\partial_\alpha\Psi$, $\partial^\alpha\bar\Psi\partial_\alpha\Psi$, $\bar\Psi\Psi$ and the same terms with a $\gamma_5$ inserted between the spinors. The
terms with the $\gamma_5$ are all pseudo-scalars and may be dropped if parity (space-inversion plus
a rotation) is required to be a symmetry. We assume so for the present. The candidate
Lagrangian is then,
$$\mathcal{L} := a\bar\Psi\gamma^\alpha\partial_\alpha\Psi + b\,\partial^\alpha\bar\Psi\partial_\alpha\Psi + cm\bar\Psi\Psi \quad\text{and}\quad \delta_{\bar\Psi}\mathcal{L} = \delta\bar\Psi\,[a\,\partial\!\!\!/\,\Psi - b\,\Box\Psi + cm\Psi]. \tag{6.4}$$
For the choice a = −i, b = 0 and c = 1, we get the Dirac equation. If we varied $\Psi$, then for
the same choice, we will get the conjugate Dirac equation: $i\partial_\alpha\bar\Psi\gamma^\alpha + m\bar\Psi = 0$.
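In summary, we have $S = \int_M d^4x\,\sqrt{|\det\eta|}\,\mathcal{L}$ with,
$$\mathcal{L}_{scalar} = -\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{1}{2}m^2\phi^2\,; \tag{6.5}$$
$$\mathcal{L}_{vector} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \frac{1}{2}m^2v^\mu v_\mu\,; \tag{6.6}$$
$$\mathcal{L}_{spinor} = -i\bar\Psi\gamma^\mu\partial_\mu\Psi + m\bar\Psi\Psi\,. \tag{6.7}$$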
δS = 0 leads to the field equations implementing the irreducibility condition. For massless
vector, the action is invariant under the gauge transformation: Aµ → Aµ + ∂µ Λ. The actions
are also invariant under parity.
Note: We have taken the scalar and vector fields to be real. We could have a complex
scalar field (say) with two field equations: $(\Box - m^2)\phi = 0 = (\Box - m^2)\phi^*$. The complex
field may be considered as two real fields, $\phi = \phi_1 + i\phi_2$, and two terms may be included
in the action. Alternatively, the complex scalar field equations may be derived from
$\mathcal{L} = -\partial^\mu\phi^*\partial_\mu\phi - m^2\phi^*\phi$.
Notice that all actions are real (for spinors it is convenient to take the matrix hermitian
conjugate). We will always take the actions to be real so that $e^{\frac{i}{\hbar}S}$ will be a phase. Only
when dissipation of energy/momentum/angular momentum is to be incorporated do we need
to take the action to be complex. In this course, we will not do so.
A. Variational Principle, Symmetries of the Action and Noether’s theorem
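Let us denote a generic field by X with all indices suppressed. Let an action be expressed
as $S[X] := \int_M d^4x\,\mathcal{L}(X, \partial X)$. Let $\delta X := \chi(X(x))$ be an arbitrary, infinitesimal variation
of the field X(x). Then,
$$\begin{aligned}
\delta S &= \int_M d^4x \left[\frac{\delta\mathcal{L}}{\delta X}\delta X + \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X_{,\mu}\right] = \int_M d^4x \left[\frac{\delta\mathcal{L}}{\delta X}\delta X + \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\partial_\mu\delta X\right] \\
&= \int_M d^4x \left[\frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}}\right]\delta X + \int_M d^4x\,\partial_\mu\left[\frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X\right]
\end{aligned}$$
Here, $X_{,\mu} = \partial_\mu X$. The $\frac{\delta\mathcal{L}}{\delta X}$ is really like a partial derivative, but since both $\mathcal{L}$ and X depend
on the coordinate, these should be mentioned too. We have kept these implicit and used the
δ as a reminder. We follow this customary practice. The last term is a divergence and can
be expressed as a boundary integral: $\int_{\partial M} d^3y\, n_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X$. This is typically dropped/vanishes
for various reasons.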
Note: For an action principle to be well defined (this includes the specification of the
class of variations), it is necessary that the boundary contribution must vanish. If in some
cases it does not vanish, additional ‘surface action’ needs to be added to cancel the total
boundary contribution. This is typically encountered in gravitational actions on manifolds
with boundaries.
To summarize: a variational principle asserts that δS[X] vanishes for arbitrary variation
δX around X iff $\frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}} = 0$, i.e. iff X(x) is a solution of the equation of motion.
We now consider a different situation. We consider special, restricted variations such
that δS[X] = 0 at all fields X(x). Such variations are called infinitesimal symmetries of the
action. This is translated in terms of the Lagrangian density as,
$$\delta S[X] = \int_M d^4x\,\delta\mathcal{L} = 0 \quad\text{if either}\quad \delta\mathcal{L} = 0 \quad\text{or}\quad \delta\mathcal{L} = \partial_\mu\delta\Lambda^\mu,$$
where $\delta\Lambda^\mu$ is some 4-vector which vanishes/falls off on the boundary. Now we have
Noether’s theorem:
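Noether’s Theorem: For every infinitesimal symmetry of the action, ∃ a conserved current,
conserved on every solution of the equation of motion.
Proof: We already have the infinitesimal variation of $\mathcal{L}$, which must equal a divergence,
i.e. for a symmetry variation,
$$\begin{aligned}
\left[\frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}}\right]\delta X + \partial_\mu\left[\frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X\right] &= \delta\mathcal{L} = \partial_\mu\delta\Lambda^\mu \;\Rightarrow \\
\left[\frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}}\right]\delta X + \partial_\mu\left[\frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X - \delta\Lambda^\mu\right] &= 0.
\end{aligned}$$
Thus, if X satisfies the equation of motion, the first term vanishes and we have a conserved
current,
$$\delta J^\mu := \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X - \delta\Lambda^\mu, \qquad \partial_\mu\delta J^\mu = 0.$$
This is a neat prescription to discover conserved currents and their corresponding conserved
charges: $\delta Q := \int_{\partial M} d^3\sigma\, n_\mu\,\delta J^\mu$. Discovering conserved currents by inspection of the equation
of motion is easy only in the simplest of cases.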
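Let us see an example, particularly a symmetry variation induced by infinitesimal
Poincare transformations for which δS = 0 by construction. Consider a real scalar field
action for simplicity.
Under an infinitesimal Poincare transformation,
$$\begin{aligned}
\delta\phi_{\Lambda,a}(x) &:= \phi(\Lambda^{-1}(x - a)) - \phi(x) = \phi\big((\delta^\mu{}_\nu - \omega^\mu{}_\nu)(x - \epsilon)^\nu\big) - \phi(x) \\
&= (-\omega^\mu{}_\nu x^\nu - \epsilon^\mu)\,\partial_\mu\phi := -\xi^\mu\partial_\mu\phi(x), \qquad \xi^\mu := \omega^\mu{}_\nu x^\nu + \epsilon^\mu.
\end{aligned}$$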
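This leads to
$$\begin{aligned}
\delta\mathcal{L} &= -\partial^\mu\phi\,\partial_\mu(-\xi\cdot\partial\phi) - m^2\phi(-\xi\cdot\partial\phi) = \partial^\mu\phi\,\partial_\mu(\xi^\nu\partial_\nu\phi) + m^2\phi\,\xi^\nu\partial_\nu\phi \\
&= (\partial_\mu\xi^\nu)\,\partial^\mu\phi\,\partial_\nu\phi + \xi^\nu\left[\partial^\mu\phi\,\partial^2_{\nu\mu}\phi + m^2\phi\,\partial_\nu\phi\right] \\
&= \frac{1}{2}(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu)\,\partial^\mu\phi\,\partial^\nu\phi - \xi^\nu\left[\frac{\delta\mathcal{L}}{\delta\phi_{,\mu}}\partial_\nu\phi_{,\mu} + \frac{\delta\mathcal{L}}{\delta\phi}\partial_\nu\phi\right]
\end{aligned}$$
where we used $\delta\mathcal{L}/\delta\phi_{,\mu} = -\partial^\mu\phi$ and $\delta\mathcal{L}/\delta\phi = -m^2\phi$. Explicitly, $\partial_\mu\xi_\nu = \partial_\mu(\omega_{\nu\alpha}x^\alpha + \epsilon_\nu) = \omega_{\nu\mu} + 0$, which is antisymmetric in µ ↔ ν and hence the
first term vanishes. The second term is just $-\xi^\nu\partial_\nu\mathcal{L}$.
$$\therefore\; \delta\mathcal{L} = -\xi^\mu\partial_\mu\mathcal{L} = -\partial_\mu(\xi^\mu\mathcal{L}) + (\partial\cdot\xi)\mathcal{L} = \partial_\mu\delta\Lambda^\mu + 0, \quad\text{or}\quad \delta\mathcal{L} = \partial_\mu\delta\Lambda^\mu,\; \delta\Lambda^\mu = -\xi^\mu\mathcal{L}.$$
Note: We ensured that $\mathcal{L}$ is a Lorentz scalar, but it is not a translation scalar. The action
gets its Poincare invariance thanks to the $\int d^4x$. Since all fields under translations shift their
coordinates by a, in all cases we will have $\delta\mathcal{L} = -\xi\cdot\partial\mathcal{L}$ provided $\xi_\mu$ satisfies $\partial_\mu\xi_\nu + \partial_\nu\xi_\mu = 0$,
i.e. $\xi^\mu$ is a “Killing vector” of the Minkowski metric.
By Noether’s theorem, we get
$$\delta J^\mu = \frac{\delta\mathcal{L}}{\delta\phi_{,\mu}}\delta\phi - \delta\Lambda^\mu = (-\partial^\mu\phi)(-\xi^\nu\partial_\nu\phi) + \xi^\mu\mathcal{L} = \xi_\nu\left(\partial^\mu\phi\,\partial^\nu\phi + \eta^{\mu\nu}\mathcal{L}\right) =: \xi_\nu T^{\mu\nu}.$$
Conservation of the Noether current gives $(\partial_\mu\xi_\nu)T^{\mu\nu} + \xi_\nu\partial_\mu T^{\mu\nu} = 0$. Provided $T^{\mu\nu}$ is symmetric (as it is here), the first term vanishes and, by independence of the $\xi^\mu$, we get $\partial_\mu T^{\mu\nu} = 0$. The
tensor $T^{\mu\nu} := \partial^\mu\phi\,\partial^\nu\phi + \eta^{\mu\nu}\mathcal{L}$ is called the canonical stress tensor for the scalar field.
Thus, for each Killing vector $\xi^\mu$, we have $J^\mu := T^{\mu\nu}\xi_\nu$ which is conserved on the solutions
of the field equation. We define the corresponding charges as $Q(\xi) := \int_{\Sigma_t} d^3x\, T^{0\nu}\xi_\nu$, which
are independent of the hypersurfaces $\Sigma_t$.
Here are practice exercises:
(1) For the Proca Lagrangian, show that $\delta\mathcal{L} = -\xi^\mu\partial_\mu\mathcal{L}$ for $\xi^\mu = \omega^\mu{}_\nu x^\nu + \epsilon^\mu$, with ω, ε being
infinitesimal, and obtain the canonical stress tensor.
(2) For m = 0, show that this stress tensor is not gauge invariant and one needs to
“improve” it to get a symmetric, gauge invariant and conserved stress tensor. Guess it.
(3) For the scalar and the Maxwell field, obtain the conserved charges corresponding to
the 10 Killing vectors. The charges corresponding to the translations are the energy and
momentum while those corresponding to the $\omega_{ij}$ are the angular momentum components.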
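The conservation law $\partial_\mu T^{\mu\nu} = 0$ for the scalar field can be checked numerically. The sketch below works in 1+1 dimensions with $\eta = \mathrm{diag}(-1, +1)$, builds a real solution out of two on-shell cosine waves, and estimates the divergence of $T^{\mu\nu} = \partial^\mu\phi\,\partial^\nu\phi + \eta^{\mu\nu}\mathcal{L}$ by central differences. The wave parameters, the helper names and the reduction to one space dimension are assumptions chosen for the illustration.

```python
import math

MASS = 0.8
# Hypothetical superposition of two on-shell real waves: (amplitude, wave number, phase).
WAVES = [(1.0, 0.6, 0.0), (0.5, -1.3, 0.4)]

def omega(k):
    return math.sqrt(k * k + MASS * MASS)

def derivs(t, x):
    """phi and its first derivatives for phi = sum A cos(k x - omega t + theta)."""
    phi = phi_t = phi_x = 0.0
    for amp, k, theta in WAVES:
        arg = k * x - omega(k) * t + theta
        phi += amp * math.cos(arg)
        phi_t += amp * omega(k) * math.sin(arg)
        phi_x += -amp * k * math.sin(arg)
    return phi, phi_t, phi_x

def stress_tensor(t, x):
    """T^{mu nu} = d^mu phi d^nu phi + eta^{mu nu} L with eta = diag(-1, +1)."""
    phi, phi_t, phi_x = derivs(t, x)
    lagr = 0.5 * phi_t**2 - 0.5 * phi_x**2 - 0.5 * MASS**2 * phi**2
    up = (-phi_t, phi_x)  # d^0 phi = -d_t phi, d^1 phi = d_x phi
    eta = ((-1.0, 0.0), (0.0, 1.0))
    return [[up[m] * up[n] + eta[m][n] * lagr for n in range(2)] for m in range(2)]

def divergence(t, x, nu, h=1e-4):
    """Central-difference estimate of d_mu T^{mu nu}."""
    d_t = (stress_tensor(t + h, x)[0][nu] - stress_tensor(t - h, x)[0][nu]) / (2 * h)
    d_x = (stress_tensor(t, x + h)[1][nu] - stress_tensor(t, x - h)[1][nu]) / (2 * h)
    return d_t + d_x

div0 = divergence(0.3, 1.1, 0)
div1 = divergence(0.3, 1.1, 1)
energy_density = stress_tensor(0.3, 1.1)[0][0]
print(div0, div1, energy_density)
```

Both divergences vanish up to discretization error, and $T^{00}$ comes out non-negative, as expected for an energy density.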
B. Conserved Poincare charges for scalar field solutions
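Let us see an explicit example of Noether charges for a general solution of the Klein-Gordon equation. We noted above that the canonical stress tensor for the scalar field is
given by,
$$T_{\mu\nu} = \partial_\mu\phi\,\partial_\nu\phi + \eta_{\mu\nu}\mathcal{L}, \qquad \mathcal{L} := -\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{1}{2}m^2\phi^2,$$
$$\partial_\mu T^{\mu\nu} = 0 \quad \forall\; (\Box - m^2)\phi = 0.$$
$$\text{Define: } M^{\mu\nu\lambda} := T^{\mu\nu}x^\lambda - T^{\mu\lambda}x^\nu \;\Rightarrow\; \partial_\mu M^{\mu\nu\lambda} = T^{\lambda\nu} - T^{\nu\lambda}$$
$$\therefore\; \partial_\mu M^{\mu\nu\lambda} = 0 \text{ on solutions iff } \partial_\mu T^{\mu\nu} = 0 \text{ and } T^{\mu\nu} = T^{\nu\mu}.$$
These 6 conserved currents ($M^{\mu\nu\lambda}$ are antisymmetric in the last two indices) together with
the conserved stress tensor give the 10 Poincare charges:
$$P^\mu := \int_{\Sigma_t} d^3x\, T^{0\mu}, \qquad M^{\mu\nu} := \int_{\Sigma_t} d^3x\, M^{0\mu\nu}.$$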
We had noted the plane wave solutions of the Klein-Gordon equation, labeled by $\vec k \in \mathbb{R}^3$,
namely,
$$u_{\vec k}(x) = \frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}}, \qquad k\cdot x := -\omega_{\vec k}t + \vec k\cdot\vec x, \qquad \omega_{\vec k} := \sqrt{\vec k^2 + m^2}$$
and their complex conjugates. The general solution is thus expressed as
$$\phi(x) = \int d^3k\left[a(\vec k)u_{\vec k}(x) + a^*(\vec k)u^*_{\vec k}(x)\right]$$
with $a(\vec k)$ being complex numbers with a suitable
dependence on $\vec k$ so that the field has the appropriate boundary behavior. The general
solution is manifestly real.
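To compute the energy-momentum and angular momentum charges, we have integration
of expressions quadratic in the solutions. So we note: $\partial_\mu u_{\vec k}(x) = ik_\mu u_{\vec k}(x)$, $\partial_\mu u^*_{\vec k}(x) = -ik_\mu u^*_{\vec k}(x)$, $k^0 = \omega_{\vec k}$ and the orthogonality relations:
$$\int d^3x\, u^*_{\vec k}(x)u_{\vec k'}(x) = \frac{e^{i(\omega_{\vec k} - \omega_{\vec k'})t}}{\sqrt{2\omega_{\vec k}\,2\omega_{\vec k'}}}\,\delta^3(\vec k - \vec k') = \frac{\delta^3(\vec k - \vec k')}{2\omega_{\vec k}} =: \delta^3_{inv}(\vec k - \vec k')$$
$$\int d^3x\, u_{\vec k}(x)u_{\vec k'}(x) = e^{-2i\omega_{\vec k}t}\,\delta^3_{inv}(\vec k + \vec k'), \qquad \int d^3x\, u^*_{\vec k}(x)u^*_{\vec k'}(x) = e^{2i\omega_{\vec k}t}\,\delta^3_{inv}(\vec k + \vec k').$$
We need,
$$P^\mu = \int d^3x\left[\partial^0\phi\,\partial^\mu\phi + \eta^{0\mu}\left(-\frac{1}{2}\partial^\alpha\phi\,\partial_\alpha\phi - \frac{1}{2}m^2\phi^2\right)\right] \tag{6.8}$$
$$M^{\mu\nu} = \int d^3x\left[\left(\partial^0\phi\,\partial^\mu\phi + \eta^{0\mu}\mathcal{L}\right)x^\nu - \left(\partial^0\phi\,\partial^\nu\phi + \eta^{0\nu}\mathcal{L}\right)x^\mu\right] \tag{6.9}$$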
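And we have,
$$\begin{aligned}
\mathcal{L} = -\frac{1}{2}\int d^3k\int d^3k'\,\Big\{ &\left[a(\vec k)a(\vec k')u_{\vec k}u_{\vec k'} + a^*(\vec k)a^*(\vec k')u^*_{\vec k}u^*_{\vec k'}\right]\left(-k\cdot k' + m^2\right) \\
+ &\left[a(\vec k)a^*(\vec k')u_{\vec k}u^*_{\vec k'} + a^*(\vec k)a(\vec k')u^*_{\vec k}u_{\vec k'}\right]\left(+k\cdot k' + m^2\right)\Big\}
\end{aligned} \tag{6.10}$$
$$\begin{aligned}
\partial^0\phi\,\partial^\mu\phi = \int d^3k\int d^3k'\,\Big\{ &a(\vec k)a(\vec k')u_{\vec k}u_{\vec k'} + a^*(\vec k)a^*(\vec k')u^*_{\vec k}u^*_{\vec k'} \\
- &a(\vec k)a^*(\vec k')u_{\vec k}u^*_{\vec k'} - a^*(\vec k)a(\vec k')u^*_{\vec k}u_{\vec k'}\Big\}\left(-k^0k'^\mu\right)
\end{aligned} \tag{6.11}$$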
Substituting and carrying out the $\int d^3x$ using the orthogonality relations gives,
$$\begin{aligned}
P^i &= \int d^3k\int d^3k'\,\bigg[\frac{k^0k^i}{2\omega_{\vec k}}\left\{a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t}\right\}\delta^3(\vec k + \vec k') \\
&\qquad\qquad\qquad + \frac{k^0k^i}{2\omega_{\vec k}}\left\{a(\vec k)a^*(\vec k) + a^*(\vec k)a(\vec k)\right\}\delta^3(\vec k - \vec k')\bigg] \\
&= \frac{1}{2}\int d^3k\, k^i\left\{a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t} + a(\vec k)a^*(\vec k) + a^*(\vec k)a(\vec k)\right\}
\end{aligned} \tag{6.12}$$
The first two, time dependent, terms in the braces are symmetric under $\vec k \leftrightarrow -\vec k$ while the
last two, time independent, terms just change their argument. Hence, under symmetric
integration against the odd factor $k^i$, the time dependent terms vanish and we are left with,
$$P^i = \int d^3k\, k^i\,\frac{a(\vec k)a^*(\vec k) - a(-\vec k)a^*(-\vec k)}{2} \tag{6.13}$$
Since the a’s are ordinary complex numbers the factor of 2 cancels and a new factor arises
from explicit anti-symmetrization w.r.t. $\vec k$. The claimed conserved quantity is manifestly
independent of t.
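The calculation of $P^0$ proceeds much the same way. We have the t−dependent term
proportional to $\delta^3(\vec k + \vec k')$ while the t−independent term is proportional to $\delta^3(\vec k - \vec k')$. We
get,
$$\begin{aligned}
P^0 = \int\frac{d^3k}{2\omega_{\vec k}}\bigg[ &-\omega_{\vec k}^2\left\{a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t}\right\} + \omega_{\vec k}^2\left\{2a(\vec k)a^*(\vec k)\right\} \\
&+ \frac{1}{2}\left\{a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t}\right\}\left(\omega_{\vec k}^2 + \vec k^2 + m^2\right) \\
&+ \frac{1}{2}\left\{2a(\vec k)a^*(\vec k)\right\}\left(-\omega_{\vec k}^2 + \vec k^2 + m^2\right)\bigg]
\end{aligned} \tag{6.14}$$
The t−dependent terms cancel while the t−independent terms give,
$$P^0 = \int d^3k\,\omega_{\vec k}\,a(\vec k)a^*(\vec k) \tag{6.15}$$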
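The time independence of the energy and momentum, and their expression as sums over modes, can be illustrated with a discrete analogue. The sketch below assumes a periodic box of length $2\pi$ in one space dimension, so that the wave numbers are integers and the mode functions are normalized as $u_k = e^{i(kx - \omega t)}/\sqrt{2\omega L}$; the integrals in (6.13) and (6.15) then become the sums $\sum_k k\,|a_k|^2$ and $\sum_k \omega_k|a_k|^2$. The box, the chosen amplitudes and all names are assumptions made for the illustration.

```python
import cmath
import math

L_BOX = 2.0 * math.pi   # box length (assumption: periodic box, so k takes integer values)
MASS = 1.0
N_GRID = 64             # enough points to integrate the trigonometric sums exactly

# Hypothetical mode amplitudes a_k for a few integer wave numbers k.
AMPS = {1: 0.5 + 0.2j, -1: -0.3 + 0.1j, 3: 0.1 - 0.4j}

def omega(k):
    return math.sqrt(k * k + MASS * MASS)

def field_derivs(t, x):
    """Return (phi_t, phi_x, phi) for phi = sum_k [a_k u_k + c.c.]."""
    phi = phi_t = phi_x = 0.0
    for k, a in AMPS.items():
        w = omega(k)
        u = cmath.exp(1j * (k * x - w * t)) / math.sqrt(2.0 * w * L_BOX)
        phi += 2.0 * (a * u).real
        phi_t += 2.0 * (-1j * w * a * u).real
        phi_x += 2.0 * (1j * k * a * u).real
    return phi_t, phi_x, phi

def energy_momentum(t):
    """Riemann sums for P^0 = int T^00 and P^1 = int T^01 = -int phi_t phi_x."""
    dx = L_BOX / N_GRID
    energy = momentum = 0.0
    for n in range(N_GRID):
        phi_t, phi_x, phi = field_derivs(t, n * dx)
        energy += 0.5 * (phi_t**2 + phi_x**2 + MASS**2 * phi**2) * dx
        momentum += -phi_t * phi_x * dx
    return energy, momentum

expected_energy = sum(omega(k) * abs(a) ** 2 for k, a in AMPS.items())
expected_momentum = sum(k * abs(a) ** 2 for k, a in AMPS.items())
e0, p0 = energy_momentum(0.0)
e1, p1 = energy_momentum(1.7)
print(e0, e1, expected_energy)
print(p0, p1, expected_momentum)
```

The pair $k = \pm 1$ is included on purpose: it produces the $e^{\pm 2i\omega t}$ cross terms, which drop out of both charges.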
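The calculation of $M^{\mu\nu}$ is similar. The new feature is the explicit $x^\lambda$ in the integral. In
the $M^{ij}$ we have $\int (T^{0i}x^j - T^{0j}x^i)$. We trade $x^j$ for $\partial_{k_j}$ by noting that
$$\partial_{k_j}u_{\vec k}(x) = u_{\vec k}(x)\left[-it\frac{k^j}{\omega_{\vec k}} + ix^j\right] \quad\text{or}\quad x^ju_{\vec k} = -i\partial_{k_j}u_{\vec k} + \frac{t\,k^j}{\omega_{\vec k}}u_{\vec k}.$$
As before the spatial integration will produce δ functions and also derivatives of δ functions
from the $\partial_{k_i}$. While doing the $k'$−integration, we need to flip the derivative as per the rules
of δ function. As before, the $e^{\pm 2i\omega_{\vec k}t}$ terms will turn out to be symmetric under $\vec k \leftrightarrow -\vec k$ and will
not contribute. Filling in the algebra leads to the final results,
$$M^{ij} = i\int d^3k\left[a(\vec k)\left(k^i\partial_{k_j} - k^j\partial_{k_i}\right)a^*_{\vec k} - a^*(\vec k)\left(k^i\partial_{k_j} - k^j\partial_{k_i}\right)a_{\vec k}\right] \tag{6.16}$$
$$M^{0i} = -\frac{i}{2}\int d^3k\,\omega_{\vec k}\left[a_{\vec k}\,\partial_{k_i}a^*_{\vec k} - a^*_{\vec k}\,\partial_{k_i}a_{\vec k}\right]. \tag{6.17}$$
All are manifestly independent of time.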
Apart from verifying that the conserved quantities are indeed time independent, the
expressions for energy and momentum show that these quantities are a sum (integral) of
energy-momentum of individual solutions. Thus each of these plane waves can, at least heuristically, be thought of as carrying an energy $\omega_{\vec k}$ and momentum $\vec k$. Note that there is no $\hbar$ yet,
so these are mathematically defined conserved quantities much like the energy-momentum
fluxes of electromagnetic fields identified via the Poynting theorem. This is helpful for interpreting the solutions of the field equations as physical entities carrying energy-momentum.
To strengthen this further, we would like to see if the fields and the action can be cast in
the form of a “dynamical system”.
7. FOURIER DECOMPOSITIONS OF FIELDS: COLLECTION OF HARMONIC OSCILLATORS
In the previous section, we wrote the general solution of the Klein-Gordon equation. Now
we want to see if the action takes the ‘form of a dynamical system’. The meaning will be
clear shortly. Consider again the scalar field. Let $\Sigma_t$ be a constant t surface. On $\Sigma_t$, the
set of functions $\{\varphi_{\vec k} = \frac{1}{(2\pi)^{3/2}}e^{i\vec k\cdot\vec x},\ \vec k \in \mathbb{R}^3\}$ forms a complete, orthonormal set of functions,
$\int_{\Sigma_t} d^3x\,\varphi^*_{\vec k}(\vec x)\varphi_{\vec k'}(\vec x) = \delta^3(\vec k - \vec k')$. Let us Fourier decompose the field as,
$$\phi(t, \vec x) := \int_{\mathbb{R}^3} d^3k\, f_{\vec k}(t)\varphi_{\vec k}(\vec x). \tag{7.1}$$
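The reality of the field, $\phi^*(x) = \phi(x) \Rightarrow f^*_{\vec k}(t) = f_{-\vec k}(t)$. The expansion coefficient
functions are determined by the equations of motion. It is trivial to check that
$(\Box - m^2)\phi(x) = 0 \Rightarrow \int d^3k\,(-\ddot f_{\vec k} - \vec k^2f_{\vec k} - m^2f_{\vec k})\varphi_{\vec k}(\vec x) = 0$. By independence of the basis functions,
we get $\ddot f_{\vec k} + \omega_{\vec k}^2f_{\vec k} = 0\ \forall\ \vec k \in \mathbb{R}^3$, $\omega_{\vec k}^2 := \vec k^2 + m^2$. Its general solution is $f_{\vec k}(t) = a_{\vec k}e^{-i\omega_{\vec k}t} + b_{\vec k}e^{i\omega_{\vec k}t}$.
The reality condition $f^*_{\vec k} = f_{-\vec k}$ gives $a^*_{\vec k} = b_{-\vec k}$ or $b_{\vec k} = a^*_{-\vec k}$. Substitution gives,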
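The reality condition and the oscillator equation for the coefficients can be seen at work in a discrete stand-in for the Fourier integral. The sketch below uses a small symmetric set of one-dimensional wave numbers, sets $b_k = a^*_{-k}$, and checks that $f^*_k = f_{-k}$, that the resulting field is real, and that $\ddot f_k + \omega_k^2 f_k = 0$. The amplitudes, the discrete sum and the function names are assumptions for the illustration; the normalization factors of the text are omitted.

```python
import cmath
import math

MASS = 1.0
# Hypothetical complex amplitudes a_k on a small symmetric set of 1D wave numbers.
A = {-2: 0.3 - 0.1j, -1: -0.2 + 0.5j, 1: 0.7 + 0.2j, 2: 0.1 + 0.4j}

def omega(k):
    return math.sqrt(k * k + MASS * MASS)

def f(k, t):
    """f_k(t) = a_k e^{-i w t} + b_k e^{+i w t} with b_k = conj(a_{-k})."""
    w = omega(k)
    return A[k] * cmath.exp(-1j * w * t) + A[-k].conjugate() * cmath.exp(1j * w * t)

def phi(t, x):
    """Discrete stand-in for the Fourier integral: sum_k f_k(t) e^{i k x}."""
    return sum(f(k, t) * cmath.exp(1j * k * x) for k in A)

def oscillator_residual(k, t, h=1e-3):
    """Finite-difference estimate of f'' + w^2 f."""
    second = (f(k, t + h) - 2.0 * f(k, t) + f(k, t - h)) / h**2
    return second + omega(k) ** 2 * f(k, t)

value = phi(0.4, 1.3)
reality_gap = abs(f(1, 0.4).conjugate() - f(-1, 0.4))
residual = abs(oscillator_residual(2, 0.4))
print(value, reality_gap, residual)
```

The imaginary part of `value` vanishes to rounding accuracy, which is the discrete counterpart of the field being manifestly real.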
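$$\begin{aligned}
\phi_{soln}(x) &= \int_{\mathbb{R}^3} d^3k\left(a_{\vec k}e^{-i\omega_{\vec k}t} + b_{\vec k}e^{i\omega_{\vec k}t}\right)\frac{e^{i\vec k\cdot\vec x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \\
&= \int\frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\left(a_{\vec k}e^{-i\omega_{\vec k}t + i\vec k\cdot\vec x} + b_{-\vec k}e^{i\omega_{\vec k}t - i\vec k\cdot\vec x}\right) \\
\therefore\; \phi_{soln}(x) &= \int d^3k\left(a_{\vec k}u_{\vec k}(x) + a^*_{\vec k}u^*_{\vec k}(x)\right), \qquad u_{\vec k} := \frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}}
\end{aligned} \tag{7.2}$$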
This is exactly the same form we had for the general solution. We have adjusted the
normalization factor for future convenience. The last equation will be referred to as a mode
decomposition, with u~k (x)’s denoting the mode functions.
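Let us rewrite the Fourier decomposition in a manifestly real form:
$$\phi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k\left[f_{\vec k}(t)\varphi_{\vec k}(\vec x) + f^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x)\right]$$
We have separated the $-\vec k$ tagged terms and used the reality condition. The integration is
over the positive half of $\mathbb{R}^3$. Substituting the Fourier decomposition into the Lagrangian, we get,
$$\begin{aligned}
L &= \int_{\Sigma_t} d^3x\,\mathcal{L} = \int_{\Sigma_t} d^3x\left[-\frac{1}{2}\left(-\dot\phi^2 + (\nabla\phi)^2\right) - \frac{m^2}{2}\phi^2\right] \\
\dot\phi(t, \vec x) &= \int_{\mathbb{R}^3/2} d^3k\left[\dot f_{\vec k}(t)\varphi_{\vec k}(\vec x) + \dot f^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x)\right] \\
\partial_j\phi(t, \vec x) &= \int_{\mathbb{R}^3/2} d^3k\,(ik_j)\left[f_{\vec k}(t)\varphi_{\vec k}(\vec x) - f^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x)\right] \quad\text{using ortho-normality, we get} \\
L &= \int_{\mathbb{R}^3/2} d^3k\left[\dot f_{\vec k}\dot f^*_{\vec k} - \omega_{\vec k}^2f_{\vec k}f^*_{\vec k}\right], \qquad \omega_{\vec k}^2 = \vec k^2 + m^2.
\end{aligned} \tag{7.3}$$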
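Let $f_{\vec k}(t) := \frac{1}{\sqrt 2}\left(q_{\vec k}(t) + iq'_{\vec k}(t)\right)$, where $q, q'$ are real and $\vec k$ are in the positive half of $\mathbb{R}^3$.
Substitution in the Lagrangian gives,
$$L = \int_{\mathbb{R}^3/2} d^3k\,\frac{1}{2}\left[\dot q_{\vec k}^2 - \omega_{\vec k}^2q_{\vec k}^2 + \dot q'^{\,2}_{\vec k} - \omega_{\vec k}^2q'^{\,2}_{\vec k}\right].$$
Denoting $q'_{\vec k} =: q_{-\vec k}$, we can express the Lagrangian as,
$$L = \int_{\mathbb{R}^3} d^3k\,\frac{1}{2}\left(\dot q_{\vec k}^2 - \omega_{\vec k}^2q_{\vec k}^2\right). \qquad \vec k \text{ now ranges over full } \mathbb{R}^3. \tag{7.4}$$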
The Lagrangian is manifestly expressed as a sum (integral) of Lagrangians for harmonic
oscillators $q_{\vec k}$, each with frequency $\omega_{\vec k}$ for $\vec k \in \mathbb{R}^3$. One can easily pass to the Hamiltonian
form by defining $p_{\vec k} := \frac{\partial L}{\partial\dot q_{\vec k}} = \dot q_{\vec k}$. We get,
$$H = \int_{\mathbb{R}^3} d^3k\,\frac{1}{2}\left(p_{\vec k}^2 + \omega_{\vec k}^2q_{\vec k}^2\right).$$
To complete the canonical form, introduce the usual basic Poisson brackets: $\{q_{\vec k}, q_{\vec k'}\} = 0 = \{p_{\vec k}, p_{\vec k'}\}$ and $\{q_{\vec k}, p_{\vec k'}\} = \delta^3(\vec k - \vec k')$, $\vec k \in \mathbb{R}^3$.
Introduce the field $\pi(t, \vec x) := \dot\phi(t, \vec x)$ and define Poisson brackets among the fields φ, π
from the Poisson brackets among the $q_{\vec k}$ and $p_{\vec k}$ using the Fourier decompositions.
Note that $f_{\vec k} = \frac{1}{\sqrt 2}(q_{\vec k} + iq_{-\vec k})$ and $\dot f_{\vec k} = \frac{1}{\sqrt 2}(p_{\vec k} + ip_{-\vec k})$, $\vec k \in \mathbb{R}^3/2$. It follows,
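$$\begin{aligned}
\{\phi(t, \vec x), \pi(t, \vec y)\} &= \int_{\mathbb{R}^3/2} d^3k\int_{\mathbb{R}^3/2} d^3k'\,\left\{f_{\vec k}\varphi_{\vec k}(\vec x) + f^*_{\vec k}\varphi^*_{\vec k}(\vec x),\ \dot f_{\vec k'}\varphi_{\vec k'}(\vec y) + \dot f^*_{\vec k'}\varphi^*_{\vec k'}(\vec y)\right\} \\
&= \int_{\mathbb{R}^3/2} d^3k\int_{\mathbb{R}^3/2} d^3k'\,\Big[\varphi_{\vec k}(\vec x)\varphi_{\vec k'}(\vec y)\{f_{\vec k}, \dot f_{\vec k'}\} + \varphi_{\vec k}(\vec x)\varphi^*_{\vec k'}(\vec y)\{f_{\vec k}, \dot f^*_{\vec k'}\} \\
&\qquad\qquad + \varphi^*_{\vec k}(\vec x)\varphi_{\vec k'}(\vec y)\{f^*_{\vec k}, \dot f_{\vec k'}\} + \varphi^*_{\vec k}(\vec x)\varphi^*_{\vec k'}(\vec y)\{f^*_{\vec k}, \dot f^*_{\vec k'}\}\Big] \\
&= \int_{\mathbb{R}^3/2} d^3k\left[\varphi_{\vec k}(\vec x)\varphi^*_{\vec k}(\vec y) + \varphi^*_{\vec k}(\vec x)\varphi_{\vec k}(\vec y)\right] = \int_{\mathbb{R}^3} d^3k\,\varphi_{\vec k}(\vec x)\varphi^*_{\vec k}(\vec y)
\end{aligned} \tag{7.5}$$
$$\therefore\; \{\phi(t, \vec x), \pi(t, \vec y)\} = \delta^3(\vec x - \vec y) \tag{7.6}$$
The remaining Poisson brackets {φ, φ}, {π, π}, defined similarly, are zero.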
Note: The same t is taken for the fields since the q’s and p’s are also at the same time.
Starting from the Lagrangian, using the Fourier decomposition, we saw explicitly that the
Lagrangian can be expressed as a sum of Lagrangians of harmonic oscillators. Furthermore,
we could define Poisson brackets among the fields from the same oscillator system. It is
apparent now that a field satisfying the Klein-Gordon equation as irreducibility condition,
can be given a canonical formulation wherein it appears as a system of infinitely many,
uncoupled harmonic oscillators.
Note: The integration domains appearing above can be confusing. For real valued fields,
using a manifestly real form of a Fourier decomposition, the $\vec k \in \mathbb{R}^3/2$. Once the solutions
for the time dependent coefficients of the Fourier decomposition are substituted back in the
Fourier decomposition, we get the mode decomposition of the field, with $\vec k \in \mathbb{R}^3$. For complex
valued fields, in both Fourier and mode decompositions, $\vec k \in \mathbb{R}^3$.
We now repeat the exercise for the Maxwell field and the Dirac field. We will use the
same orthonormal set of ϕ~k (~x) in developing the Fourier decomposition. The steps being
very similar, we will be brief and suppress the integration range of the ~k.
A. Maxwell Field
Knowing the plane wave solutions having two polarizations, we write the Fourier decomposition of the vector field $A_\mu(t, \vec x)$ as,
$$A_\mu(t, \vec x) = \int d^3k\,\varepsilon_\mu(\vec k, a)f_{\vec k,a}(t)\varphi_{\vec k}(\vec x), \qquad \varphi_{\vec k}(\vec x) = \frac{e^{i\vec k\cdot\vec x}}{(2\pi)^{3/2}}.$$
The $\varepsilon_\mu$ is a polarization vector which depends on $\vec k$ while the label a enumerates the
polarization vectors. A priori, there are four independent polarization vectors (a tetrad)
and a takes 4 values. One of the Maxwell equations fixes the 0th component of all polarization
vectors in terms of the spatial components. One of the second set of equations is trivially
true for the longitudinal polarization, $\vec\varepsilon(\vec k) \propto \vec k$, and we are left with only two independent
equations for the $f_{\vec k,a}$ corresponding to the transverse polarizations. We could have kept
the f(t, a) inside the polarization vector, but this is more convenient. We will suppress the
sum-over-a till the final expression.
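This leads to,
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \int d^3k\left[\varepsilon_\nu\partial_\mu(f\varphi) - \varepsilon_\mu\partial_\nu(f\varphi)\right].$$
$$\therefore\; F_{0i} = \int d^3k\,\varphi_{\vec k}(\vec x)\left\{\varepsilon_i\dot f_{\vec k} - ik_i\varepsilon_0f_{\vec k}\right\}, \qquad F_{ij} = \int d^3k\, f_{\vec k}\varphi_{\vec k}(\vec x)\left\{i(k_i\varepsilon_j - k_j\varepsilon_i)\right\}$$
The equations of motion, using independence of $\varphi_{\vec k}(\vec x)$, give,
$$\begin{aligned}
\partial^iF_{i0} = 0 &: \quad \varepsilon_0(\vec k, a)f_{\vec k}(t) = -\frac{i\vec k\cdot\vec\varepsilon}{\vec k^2}\dot f_{\vec k}(t) \;\Rightarrow\; \varepsilon_0\dot f_{\vec k} = -\frac{i\vec k\cdot\vec\varepsilon}{\vec k^2}\ddot f_{\vec k}(t) \\
\partial^0F_{0i} + \partial^jF_{ji} = 0 &: \quad \varepsilon_i\left(-\ddot f_{\vec k} - \vec k^2f_{\vec k}\right) + ik_i(\varepsilon_0\dot f_{\vec k}) + (\vec k\cdot\vec\varepsilon)k_if_{\vec k} = 0 \\
\therefore\; 0 &= \ddot f_{\vec k}\left(-\varepsilon_i + \frac{k_ik^j\varepsilon_j}{\vec k^2}\right) + \vec k^2f_{\vec k}\left(-\varepsilon_i + \frac{k_ik^j\varepsilon_j}{\vec k^2}\right) \\
\text{or,}\quad 0 &= \left[-\delta_i{}^j + \frac{k_ik^j}{\vec k^2}\right]\varepsilon_j\left[\ddot f_{\vec k} + \vec k^2f_{\vec k}\right].
\end{aligned}$$
The prefactor is a projector, projecting a 3-vector onto the plane perpendicular to $\vec k$. Thus,
$\varepsilon_i(\vec k, a) := \left(\delta_i{}^j - \frac{k_ik^j}{\vec k^2}\right)\varepsilon_j(\vec k, a)$ define the transverse polarizations and there are two independent ones. Since the prefactor is non-zero, the Maxwell equations imply that $f_{\vec k}$ satisfy
the same harmonic oscillator equation as before, with $\omega_{\vec k}^2 = \vec k^2$ and the mass is zero. It is
important to remember that the transverse polarization, $\varepsilon_i(\vec k, a)$, is defined by $\vec k$ and hence acquires the $\vec k$ dependence as well as the label a. Hence the $f_{\vec k,a}$ satisfy the oscillator equations
only for the transverse polarization. We may not display it always to avoid clutter.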
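The properties of the transverse projector used above are easy to confirm numerically. The following minimal sketch builds $P_{ij} = \delta_{ij} - k_ik_j/\vec k^2$ for a sample wave vector and checks that the projected polarization is orthogonal to $\vec k$, that projecting twice changes nothing, and that the trace equals 2, the number of independent transverse directions. The sample vectors and helper names are assumptions for the illustration.

```python
def transverse_projector(k):
    """P_ij = delta_ij - k_i k_j / |k|^2 for a non-zero 3-vector k."""
    k2 = sum(c * c for c in k)
    return [[(1.0 if i == j else 0.0) - k[i] * k[j] / k2 for j in range(3)]
            for i in range(3)]

def apply(matrix, vec):
    """Matrix-vector product for 3x3 matrices stored as nested lists."""
    return [sum(matrix[i][j] * vec[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

k = [1.0, -2.0, 0.5]      # hypothetical wave vector
eps = [0.3, 0.4, -1.2]    # arbitrary polarization, not yet transverse
P = transverse_projector(k)

eps_t = apply(P, eps)                     # transverse part of eps
twice = apply(P, eps_t)                   # projecting again changes nothing
longitudinal = apply(P, k)                # the longitudinal direction is annihilated
trace = sum(P[i][i] for i in range(3))    # counts the independent transverse directions
print(dot(k, eps_t), trace)
```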
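The next task is to compute the Lagrangian, $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} = \frac{1}{2}(F_{0i})^2 - \frac{1}{4}(F_{ij})^2$.
$$(F_{0i})^2 = \int d^3k\int d^3k'\left\{\varepsilon_i\dot f_{\vec k} - ik_i\varepsilon_0f_{\vec k}\right\}\left\{\varepsilon'_i\dot f_{\vec k'} - ik'_i\varepsilon'_0f_{\vec k'}\right\}\varphi_{\vec k}(\vec x)\varphi_{\vec k'}(\vec x)$$
Eliminating $\varepsilon_0f_{\vec k} = -i\frac{\vec k\cdot\vec\varepsilon}{\vec k^2}\dot f_{\vec k}$ and likewise $\varepsilon'_0f_{\vec k'}$ (footnote 3), simplifies the product of the {. . . } to,
$$\{\ldots\}\{\ldots\} = \dot f_{\vec k}\dot f_{\vec k'}\left[\vec\varepsilon\cdot\vec\varepsilon' + \vec k\cdot\vec k'\,\frac{(\vec k\cdot\vec\varepsilon)(\vec k'\cdot\vec\varepsilon')}{\vec k^2\,\vec k'^2} - (\vec\varepsilon\cdot\vec k')\frac{\vec\varepsilon'\cdot\vec k'}{(\vec k')^2} - (\vec\varepsilon'\cdot\vec k)\frac{\vec\varepsilon\cdot\vec k}{(\vec k)^2}\right]$$
3 This is equivalent to using one of the Maxwell equations, the Gauss law equation, which is a constraint.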
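Integration over $\vec x$ gives $\delta^3(\vec k + \vec k')$, which cancels two terms, and we are left with,
$$\frac{1}{2}\int_{\Sigma_t} d^3x\,(F_{0i})^2 = \frac{1}{2}\int d^3k\left[\vec\varepsilon(\vec k)\cdot\vec\varepsilon'(-\vec k) - \frac{(\vec\varepsilon(\vec k)\cdot\vec k)(\vec\varepsilon'(-\vec k)\cdot\vec k)}{\vec k^2}\right]\dot f_{\vec k}\dot f_{-\vec k}$$
$$\text{But } [\ldots] = \varepsilon_i(\vec k)\left(\delta^{ij} - \frac{k^ik^j}{\vec k^2}\right)\varepsilon_j(-\vec k) = \vec\varepsilon(\vec k)\cdot\vec\varepsilon(-\vec k)$$
$$\therefore\; \frac{1}{2}\int_{\Sigma_t} d^3x\,(F_{0i})^2 = \frac{1}{2}\int d^3k\,\vec\varepsilon(\vec k)\cdot\vec\varepsilon(-\vec k)\,\dot f_{\vec k}\dot f_{-\vec k} \quad\text{and similarly,}$$
$$-\frac{1}{4}\int d^3x\,(F_{ij})^2 = \int d^3k\left(-\frac{\vec k^2}{2}\right)\vec\varepsilon(\vec k)\cdot\vec\varepsilon(-\vec k)\,f_{\vec k}f_{-\vec k}$$
The dot product of the transverse polarization includes the sum over the polarizations. The
longitudinal polarization has dropped out explicitly. Making it explicit, we have,
$$L_{Maxwell} = \int_{\mathbb{R}^3/2} d^3k\sum_{a,b=1}^{2}\left[\vec\varepsilon(\vec k, a)\cdot\vec\varepsilon(-\vec k, b)\right]\left[\dot f_{\vec k,a}\dot f_{-\vec k,b} - \omega_{\vec k}^2f_{\vec k,a}f_{-\vec k,b}\right] \tag{7.7}$$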
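We may now choose the transverse polarizations such that $\vec\varepsilon(\vec k, a)\cdot\vec\varepsilon(-\vec k, b) = \delta_{a,b}$ (note the
$\pm\vec k$ for both polarizations). Using this, the Maxwell action becomes,
$$S[A] = \int dt\int_{\mathbb{R}^3/2} d^3k\sum_{a=1,2}\left[\dot f_{\vec k,a}\dot f_{-\vec k,a} - \omega_{\vec k}^2f_{\vec k,a}f_{-\vec k,a}\right] \tag{7.8}$$
$$= \int dt\int_{\mathbb{R}^3} d^3k\,\frac{1}{2}\sum_{a=1,2}\left[\dot q_{\vec k,a}^2 - \omega_{\vec k}^2q_{\vec k,a}^2\right] \quad\text{for,} \tag{7.9}$$
$$A_\mu(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k\sum_a\left[\varepsilon_\mu(\vec k, a)f_{\vec k,a}(t)\varphi_{\vec k}(\vec x) + \varepsilon^*_\mu(\vec k, a)f^*_{\vec k,a}(t)\varphi^*_{\vec k}(\vec x)\right] \quad\text{with,} \tag{7.10}$$
$$\varepsilon_0(\vec k, a)f_{\vec k,a}(t) = -i\frac{\vec k\cdot\vec\varepsilon(\vec k, a)}{\vec k^2}\dot f_{\vec k,a}, \qquad \varepsilon_i(\vec k, a) := \left(\delta_i{}^j - \frac{k_ik^j}{\vec k^2}\right)\varepsilon_j(\vec k, a) \tag{7.11}$$
$$\delta_{a,b} = \vec\varepsilon(\vec k, a)\cdot\vec\varepsilon^*(\vec k, b) \quad\text{(Normalization).} \tag{7.12}$$
As in the case of the scalar, we have introduced the real $q_{\vec k,a}$ degrees of
freedom and expressed the Maxwell action too as a sum of harmonic oscillators, twice as many as for the scalar.
The canonical form goes through as well.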
Note: If we were to begin with the Maxwell action and attempt to get its Hamiltonian
formulation, we would obtain the Gauss Law equation of motion as a constraint equation. By
using the Fourier decomposition, we have explicitly solved this constraint and eliminated
the ε0 polarization. Once this is done, we get the transverse projector which eliminates
the longitudinal polarization. This gives the action involving only the physical degrees of
freedom.
B. Dirac Field
Lastly, let us consider the Dirac action. Here the Lagrangian is first order in the derivatives and we do not expect a harmonic oscillator form. We will also find it clearer to write the
Fourier decomposition using the positive half in $\vec k$ space. We take the Fourier decomposition
of the spinor as,
$$\psi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k\left\{\left[\sum_\sigma u(\vec k, \sigma)f_{\vec k,\sigma}(t)\right]\varphi_{\vec k}(\vec x) + \left[\sum_\sigma v(\vec k, \sigma)g_{\vec k,\sigma}(t)\right]\varphi^*_{\vec k}(\vec x)\right\}$$
Substitution in the Dirac equation, keeping summation over σ implicit, gives,
$$0 = -i\gamma^\mu\partial_\mu\psi + m\psi = \int d^3k\left[\left\{-i\gamma^0u_{\vec k}\dot f_{\vec k} + (\vec k\cdot\vec\gamma + m)u_{\vec k}f_{\vec k}\right\}\varphi_{\vec k} + \left\{-i\gamma^0v_{\vec k}\dot g_{\vec k} + (-\vec k\cdot\vec\gamma + m)v_{\vec k}g_{\vec k}\right\}\varphi^*_{\vec k}\right]$$
Linear independence of the basis functions gives two equations:
$$-i\gamma^0u_{\vec k}\dot f_{\vec k} + (\vec k\cdot\vec\gamma + m)u_{\vec k}f_{\vec k} = 0 = -i\gamma^0v_{\vec k}\dot g_{\vec k} + (-\vec k\cdot\vec\gamma + m)v_{\vec k}g_{\vec k}.$$
Differentiating w.r.t. time once more, multiplying by $-i\gamma^0$ on the left and using the above
equations again leads to,
$$-\sum_\sigma u_{\vec k,\sigma}\left\{\ddot f_{\vec k,\sigma} + \omega_{\vec k}^2f_{\vec k,\sigma}\right\} = 0 = -\sum_\sigma v_{\vec k,\sigma}\left\{\ddot g_{\vec k,\sigma} + \omega_{\vec k}^2g_{\vec k,\sigma}\right\}, \qquad \omega_{\vec k}^2 := \vec k^2 + m^2.$$
Since $u_{\vec k,\sigma}, v_{\vec k,\sigma}$ are presumed linearly independent, each of the f and the g functions satisfy
the same, familiar harmonic oscillator equation, with solutions $e^{\pm i\omega_{\vec k}t}$. Of course one of the
solutions for each of f, g is spurious since the original equations are first order. Substituting
$f_{\vec k,\sigma} \sim e^{-i\omega_{\vec k}t}$ and $g_{\vec k,\sigma} \sim e^{+i\omega_{\vec k}t}$ in the first order equations requires the u, v spinors to satisfy:
$(k\!\!\!/ + m)u_{\vec k,\sigma} = 0 = (k\!\!\!/ - m)v_{\vec k,\sigma}$. Choice of the other signs does not give a Lorentz covariant
equation for the spinors. Hence we restrict to: $\dot f_{\vec k,\sigma} = -i\omega_{\vec k}f_{\vec k,\sigma}$, $\dot g_{\vec k,\sigma} = +i\omega_{\vec k}g_{\vec k,\sigma}$. These
equations satisfied by the spinors are analogous to the transversality conditions we got on the
polarization tensors. These are consequences of the Dirac equation i.e. hold for solutions.
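The step from the first order equations to the oscillator equation can be checked with explicit matrices. The first order equation for f reads $u\dot f = -iMuf$ with $M := \gamma^0(\vec k\cdot\vec\gamma + m)$, and the oscillator form requires $M^2 = \omega_{\vec k}^2$. The sketch below assumes a Dirac-type representation with $(\gamma^0)^2 = 1$ and $(\gamma^i)^2 = -1$, which is an assumption chosen to be consistent with $(k\!\!\!/ + m)u = 0$ and $k^2 = -m^2$ in the mostly-plus metric; the representation used elsewhere in these notes may differ. It verifies $M^2 = \omega^2$ and builds one spinor with $Mu = \omega u$.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def block(sigma, sign):
    """4x4 matrix [[0, sigma], [sign * sigma, 0]] built from a 2x2 matrix."""
    z = [[0, 0], [0, 0]]
    top = [z[i] + sigma[i] for i in range(2)]
    bottom = [[sign * c for c in sigma[i]] + z[i] for i in range(2)]
    return top + bottom

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMAS = [block(s, -1) for s in SIGMA]   # assumed representation, (gamma^i)^2 = -1

def k_dot_gamma_plus_m(k, m):
    out = [[m if i == j else 0 for j in range(4)] for i in range(4)]
    for comp, g in zip(k, GAMMAS):
        out = [[out[i][j] + comp * g[i][j] for j in range(4)] for i in range(4)]
    return out

k, m = [0.3, -1.1, 0.6], 0.9             # hypothetical momentum and mass
omega = math.sqrt(sum(c * c for c in k) + m * m)
M = matmul(GAMMA0, k_dot_gamma_plus_m(k, m))   # first order equation: u f' = -i M u f
M2 = matmul(M, M)

# Project a basis spinor onto the +omega eigenspace: u = (M + omega) e_1.
u = [M[i][0] + (omega if i == 0 else 0) for i in range(4)]
Mu = matvec(M, u)
deviation = max(abs(M2[i][j] - (omega**2 if i == j else 0))
                for i in range(4) for j in range(4))
eigen_gap = max(abs(Mu[i] - omega * u[i]) for i in range(4))
print(deviation, eigen_gap)
```

With $Mu = \omega u$ the first order equation gives $\dot f = -i\omega f$, which is the restriction quoted above.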
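To compute the Lagrangian, we need the decomposition of the Dirac conjugate,
$$\bar\psi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k\left[\left\{\sum_\sigma\bar u(\vec k, \sigma)f^*_{\vec k,\sigma}(t)\right\}\varphi^*_{\vec k}(\vec x) + \left\{\sum_\sigma\bar v(\vec k, \sigma)g^*_{\vec k,\sigma}(t)\right\}\varphi_{\vec k}(\vec x)\right]$$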
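We will now choose the u, v spinors to be solutions of the equations: $(k\!\!\!/ + m)u(\vec k, \sigma) = 0 = (k\!\!\!/ - m)v(\vec k, \sigma)$ and express the Dirac action in terms of the f, g functions alone.
$$\begin{aligned}
-i\gamma^\mu\partial_\mu\psi &= \int_{\mathbb{R}^3/2} d^3k\left[\left\{-i\gamma^0u(\vec k, \sigma)\dot f_{\vec k,\sigma}\varphi_{\vec k}(\vec x) - i\gamma^0v(\vec k, \sigma)\dot g_{\vec k,\sigma}\varphi^*_{\vec k}(\vec x)\right\} + \left\{k_i\gamma^iu(\vec k, \sigma)f_{\vec k,\sigma}\varphi_{\vec k}(\vec x) - k_i\gamma^iv(\vec k, \sigma)g_{\vec k,\sigma}\varphi^*_{\vec k}(\vec x)\right\}\right] \\
\therefore\; (-i\gamma^\mu\partial_\mu + m)\psi &= \int d^3k\left[\left\{-i\gamma^0u(\vec k, \sigma)\dot f_{\vec k,\sigma} + k_i\gamma^iu(\vec k, \sigma)f_{\vec k,\sigma} + mu(\vec k, \sigma)f_{\vec k,\sigma}\right\}\varphi_{\vec k}(\vec x)\right. \\
&\qquad\left. + \left\{-i\gamma^0v(\vec k, \sigma)\dot g_{\vec k,\sigma} - k_i\gamma^iv(\vec k, \sigma)g_{\vec k,\sigma} + mv(\vec k, \sigma)g_{\vec k,\sigma}\right\}\varphi^*_{\vec k}(\vec x)\right] \\
\therefore\; L = \int_{\Sigma_t} d^3x\,\bar\psi(-i\gamma^\mu\partial_\mu + m)\psi &= \int_{\mathbb{R}^3/2} d^3k\int_{\mathbb{R}^3/2} d^3k'\int_{\Sigma_t} d^3x\left[\left\{\bar u(\vec k', \sigma')f^*_{\vec k',\sigma'}(t)\right\}\varphi^*_{\vec k'}(\vec x) + \left\{\bar v(\vec k', \sigma')g^*_{\vec k',\sigma'}(t)\right\}\varphi_{\vec k'}(\vec x)\right] \times \\
&\qquad\left[\left\{-i\gamma^0u(\vec k, \sigma)\dot f_{\vec k,\sigma} + k_i\gamma^iu(\vec k, \sigma)f_{\vec k,\sigma} + mu(\vec k, \sigma)f_{\vec k,\sigma}\right\}\varphi_{\vec k}(\vec x)\right. \\
&\qquad\left. + \left\{-i\gamma^0v(\vec k, \sigma)\dot g_{\vec k,\sigma} - k_i\gamma^iv(\vec k, \sigma)g_{\vec k,\sigma} + mv(\vec k, \sigma)g_{\vec k,\sigma}\right\}\varphi^*_{\vec k}(\vec x)\right]
\end{aligned}$$
Thanks to the integration domain of the $\vec k, \vec k'$ integrations, only the $\varphi^*_{\vec k'}\varphi_{\vec k}$ and $\varphi_{\vec k'}\varphi^*_{\vec k}$ terms contribute,
each giving $\delta^3(\vec k - \vec k')$. Hence only $\bar u$ pairs with u and $\bar v$ pairs with v. Using the equations satisfied by
the spinors, we eliminate $(k_i\gamma^i + m)u = +\omega_{\vec k}\gamma^0u$ and $(-k_i\gamma^i + m)v = -\omega_{\vec k}\gamma^0v$. This leads
to,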
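$$\begin{aligned}
L_{Dirac} = \int_{\mathbb{R}^3/2} d^3k\,\Bigg\{ &\sum_{\sigma,\sigma'}\left[\bar u(\vec k, \sigma')\gamma^0u(\vec k, \sigma)\right]\left[f^*_{\vec k,\sigma'}\left(-i\dot f_{\vec k,\sigma} + \omega_{\vec k}f_{\vec k,\sigma}\right)\right] \\
+ &\sum_{\sigma,\sigma'}\left[\bar v(\vec k, \sigma')\gamma^0v(\vec k, \sigma)\right]\left[g^*_{\vec k,\sigma'}\left(-i\dot g_{\vec k,\sigma} - \omega_{\vec k}g_{\vec k,\sigma}\right)\right]\Bigg\}
\end{aligned} \tag{7.13}$$
Choosing the normalization of the spinors so that $\bar u(\vec k, \sigma')\gamma^0u(\vec k, \sigma) = \delta_{\sigma,\sigma'} = \pm\bar v(\vec k, \sigma')\gamma^0v(\vec k, \sigma)$ gives the Dirac action as,
$$S[\bar\psi, \psi] = \int dt\int_{\mathbb{R}^3/2} d^3k\,(-i)\sum_\sigma\left[f^*_{\vec k,\sigma}\left(\dot f_{\vec k,\sigma} + i\omega_{\vec k}f_{\vec k,\sigma}\right) \pm g^*_{\vec k,\sigma}\left(\dot g_{\vec k,\sigma} - i\omega_{\vec k}g_{\vec k,\sigma}\right)\right] \tag{7.14}$$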
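We can group the f, g’s together to get a uniform Lagrangian with $\vec k \in \mathbb{R}^3$. Denote $f_{-\vec k,\sigma} := g^*_{\vec k,\sigma}\ \forall\ \vec k \in \mathbb{R}^3/2$. Do a partial integration of the $g^*_{\vec k,\sigma}\dot g_{\vec k,\sigma} = f_{-\vec k,\sigma}\dot f^*_{-\vec k,\sigma}$. This makes the relative
signs the same in both the terms. Choosing the ‘-’ sign for the normalization condition for
the v-spinors allows us to combine the two terms and extend the integration domain to full
$\mathbb{R}^3$. Thus, we get,
$$S_{Dirac}[\bar\psi, \psi] = \int dt\int_{\mathbb{R}^3} d^3k\sum_\sigma\left[f^*_{\vec k,\sigma}\left(-i\dot f_{\vec k,\sigma} + \omega_{\vec k}f_{\vec k,\sigma}\right)\right] \quad\text{for} \tag{7.15}$$
$$\psi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k\sum_\sigma\left[u(\vec k, \sigma)f_{\vec k,\sigma}(t)\varphi_{\vec k}(\vec x) + v(\vec k, \sigma)f^*_{-\vec k,\sigma}(t)\varphi^*_{\vec k}(\vec x)\right] \tag{7.16}$$
$$0 = (k\!\!\!/ + m)u(\vec k, \sigma) = (-k\!\!\!/ + m)v(\vec k, \sigma) \quad\text{normalized as,} \tag{7.17}$$
$$\delta_{\sigma',\sigma} = \bar u(\vec k, \sigma')\gamma^0u(\vec k, \sigma) = -\bar v(\vec k, \sigma')\gamma^0v(\vec k, \sigma). \tag{7.18}$$
Note that the action is real again.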
An aside: It is an interesting exercise to express the action in terms of the real q type
variables as we did for the scalar. Since the action is manifestly a sum over variables with
uncoupled labels $\sigma, \vec k$, we focus on just one of the terms. Thus consider the manifestly
real Lagrangian, $L(f, f^*) := \frac{1}{2}\{-if^*\dot f + i\dot f^*f\} + \omega f^*f$. Introduce the real variables x, y as:
$f := x + iy$, $\dot f = \dot x + i\dot y$. The Lagrangian then takes the form,
$$L(x, y, \dot x, \dot y) = x\dot y - y\dot x + \omega(x^2 + y^2) \;\Rightarrow\; p_x = -y,\quad p_y = +x$$
Notice that the equations defining the conjugate momenta cannot be inverted to solve for
the velocities - we have a constrained system [3]. This is not the place to discuss it in detail,
the reference gives the necessary background. I will just list the steps.
$$\begin{aligned}
&\phi := p_x + y, \quad \chi := p_y - x \quad\text{(primary constraints);} \\
&H_{canonical} := p_x\dot x + p_y\dot y - L = -\omega(x^2 + y^2); \\
&H_{total} = -\omega(x^2 + y^2) + \lambda\phi + \mu\chi. \quad\text{Preservation of primary constraints} \Rightarrow \\
&H_{total} = -\omega(xp_y - yp_x) \;\because\; \lambda = \omega y,\ \mu = -\omega x; \\
&\text{Dirac brackets: } \{x, p_x\}^* = \{y, p_y\}^* = \frac{1}{2}, \quad \{x, y\}^* = \{p_x, p_y\}^* = -\frac{1}{2}. \quad\text{Use:} \\
&H_{total} = -\omega(x^2 + y^2) \text{ and Dirac brackets for equations of motion:} \\
&\dot x = \omega y, \quad \dot y = -\omega x; \quad\text{finally drop } p_x, p_y \text{ and set } x := \frac{p}{\sqrt 2},\ y := \frac{q}{\sqrt 2}, \\
&H_{total} = -\frac{1}{2}\omega(p^2 + q^2), \quad \{q, p\} = 1, \quad \dot p = \omega q, \quad \dot q = -\omega p.
\end{aligned}$$
Re-introducing the $(\vec k, \sigma)$ labels shows that the spinor field Hamiltonian too is a sum of
harmonic oscillators for each of the $\vec k \in \mathbb{R}^3$ and σ = ± labels.
Puzzle: Why is the Hamiltonian of a wrong sign? Is it merely a choice of the overall sign
in the Lagrangian density (will not change the equation of motion)? What is the rationale
for a choice of sign? We will not pursue this classical formulation further here.
Remarks: The scalar and the vector fields that we discussed have been real. This was
reflected in the Fourier decomposition with complex conjugate coefficients. What if we have
a complex field? Well, we can always write the Fourier decomposition as (see the spinorial
case),
$$\phi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k\left[f_{\vec k}(t)\varphi_{\vec k}(\vec x) + g^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x)\right]$$
For real scalar field, the reality condition simply identifies $g_{\vec k}(t) = f_{-\vec k}(t)$. For a complex
field, there is no such identification. We will then have two harmonic oscillators for each
$\vec k \in \mathbb{R}^3$. For the spinorial case, we had a constrained system and hence half the number of
oscillators.
Remark: Consider the time-Fourier transform of a scalar field: $\phi(\omega, \vec x) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dt\, e^{-i\omega t}\phi(t, \vec x)$. The Fourier transform of its complex conjugate is given by:
$\phi_c(\omega) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dt\, e^{-i\omega t}\phi^*(t, \vec x)$. Clearly, $\phi_c(\omega) = \phi^*(-\omega)$. Therefore, a positive frequency
$\phi(t, \vec x)$ has $\phi(\omega, \vec x) = 0$ for ω < 0. This immediately also gives that $\phi^*(t, \vec x)$ is a negative
frequency field, i.e. $\phi_c(\omega, \vec x) = 0$ for ω > 0. This gets reflected in the mode decomposition.
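While the Fourier decomposition displays the “degrees of freedom” of the field, the mode
decomposition will turn out to be convenient for passage to quantum fields. To appreciate
this, consider a single oscillator (a single $\vec k$). The degrees of freedom view gives: $\ddot q + \omega^2q = 0$, $p = \dot q$, $H = \frac{1}{2}(p^2 + \omega^2q^2)$, $\{q, p\} = 1$. The mode decomposition gives ($a, a^*$ are constants),
$$\begin{aligned}
q(t) &= ae^{-i\omega t} + a^*e^{i\omega t} \;\Rightarrow\; p(t) = -i\omega\left(ae^{-i\omega t} - a^*e^{i\omega t}\right) \quad\text{inverting gives,} \\
a &= \frac{e^{i\omega t}}{2}\left(q + i\frac{p}{\omega}\right), \qquad a^* = \frac{e^{-i\omega t}}{2}\left(q - i\frac{p}{\omega}\right) \qquad\therefore\; \{a, a^*\} = -\frac{i}{2\omega} \\
\text{Define: } \mathbf{a} &:= \sqrt{2\omega}\,a, \quad \mathbf{a}^* := \sqrt{2\omega}\,a^* \;\Rightarrow\; \{\mathbf{a}, \mathbf{a}^*\} = -i, \qquad q(t) = \frac{e^{-i\omega t}}{\sqrt{2\omega}}\mathbf{a} + \frac{e^{i\omega t}}{\sqrt{2\omega}}\mathbf{a}^*
\end{aligned}$$
Notice the normalization factor of $\sqrt{2\omega}$. This is correlated with the normalization of the
Poisson bracket of $\mathbf{a}, \mathbf{a}^*$ (and of course has nothing to do with the Lorentz covariance!). As
per the usual canonical quantization procedure, $\{\,,\} \to -i/(\hbar = 1) \times [\,,]$. This would lead to
$[\mathbf{a}, \mathbf{a}^\dagger] = 1$.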
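The single-oscillator statements above can be verified numerically. The sketch below generates q(t), p(t) from a chosen amplitude, inverts them with $a = \frac{e^{i\omega t}}{2}(q + ip/\omega)$ to confirm that a is a constant of the motion, and evaluates the Poisson bracket $\{a, a^*\}$ by central differences in (q, p). The frequency, the amplitude and the function names are assumptions for the illustration.

```python
import cmath

OMEGA = 1.7  # hypothetical oscillator frequency

def trajectory(a, t):
    """q(t), p(t) from the mode decomposition q = a e^{-iwt} + a* e^{iwt}."""
    q = 2.0 * (a * cmath.exp(-1j * OMEGA * t)).real
    p = 2.0 * (-1j * OMEGA * a * cmath.exp(-1j * OMEGA * t)).real
    return q, p

def mode_amplitude(q, p, t):
    """Invert: a = (e^{iwt} / 2) (q + i p / w)."""
    return 0.5 * cmath.exp(1j * OMEGA * t) * (q + 1j * p / OMEGA)

def poisson_bracket(f, g, q, p, h=1e-6):
    """{f, g} = df/dq dg/dp - df/dp dg/dq via central differences."""
    dfq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    dfp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dgq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dgp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return dfq * dgp - dfp * dgq

a_in = 0.4 - 0.3j  # hypothetical initial amplitude
recovered = [mode_amplitude(*trajectory(a_in, t), t) for t in (0.0, 0.8, 2.5)]

t0 = 0.8
a_fn = lambda q, p: mode_amplitude(q, p, t0)
a_star_fn = lambda q, p: mode_amplitude(q, p, t0).conjugate()
bracket = poisson_bracket(a_fn, a_star_fn, 0.2, -0.5)
rescaled = 2.0 * OMEGA * bracket  # bracket of sqrt(2w) a with sqrt(2w) a*
print(recovered, bracket, rescaled)
```

The bracket comes out as $-i/(2\omega)$, and after the $\sqrt{2\omega}$ rescaling it becomes $-i$, in line with the normalization discussed above.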
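Keeping this in mind, we summarize the mode decompositions for the fields:
$$\phi(x) = \int_{\mathbb{R}^3}\frac{d^3k}{\sqrt{2\omega(2\pi)^3}}\left[a_{\vec k}e^{ik\cdot x} + b^*_{\vec k}e^{-ik\cdot x}\right], \qquad k\cdot x := -\omega_{\vec k}t + \vec k\cdot\vec x \tag{7.19}$$
If $\phi^* = \phi$ then $b_{\vec k} = a_{\vec k}$.
$$A_\mu(x) = \int_{\mathbb{R}^3}\frac{d^3k}{\sqrt{2\omega(2\pi)^3}}\sum_{\lambda=1,2}\left[a_{\vec k,\lambda}\varepsilon_\mu(\vec k, \lambda)e^{ik\cdot x} + a^*_{\vec k,\lambda}\varepsilon^*_\mu(\vec k, \lambda)e^{-ik\cdot x}\right] \tag{7.20}$$
$$\varepsilon_0(\vec k, \lambda) := -\frac{\vec k\cdot\vec\varepsilon(\vec k, \lambda)}{\omega_{\vec k}}, \qquad \omega_{\vec k} = |\vec k|$$
$$\varepsilon_i(\vec k, \lambda) := \left(\delta_i{}^j - \frac{k_ik^j}{\vec k^2}\right)\varepsilon_j(\vec k, \lambda), \qquad \vec\varepsilon(\vec k, \lambda)\cdot\vec\varepsilon^*(\vec k, \lambda') = \delta_{\lambda,\lambda'}$$
$$\psi(x) = \int_{\mathbb{R}^3}\frac{d^3k}{\sqrt{2\omega(2\pi)^3}}\sum_{\sigma=\pm}\left[b_{\vec k,\sigma}u(\vec k, \sigma)e^{ik\cdot x} + d^*_{\vec k,\sigma}v(\vec k, \sigma)e^{-ik\cdot x}\right] \tag{7.21}$$
$$(k\!\!\!/ + m)u(\vec k, \sigma) = 0 = (-k\!\!\!/ + m)v(\vec k, \sigma)$$
$$\bar u(\vec k, \sigma)u(\vec k, \sigma') = \delta_{\sigma,\sigma'} = -\bar v(\vec k, \sigma)v(\vec k, \sigma')$$
$$\sum_\sigma u(\vec k, \sigma)\bar u(\vec k, \sigma) = \frac{-k\!\!\!/ + m}{2m}, \qquad \sum_\sigma v(\vec k, \sigma)\bar v(\vec k, \sigma) = -\frac{k\!\!\!/ + m}{2m}$$
For a real scalar field, we note,
$$\{a_{\vec k}, a^*_{\vec k'}\} = -i\delta^3(\vec k - \vec k') \quad\leftrightarrow\quad \{\phi(t, \vec x), \pi(t, \vec x')\} = \delta^3(\vec x - \vec x'). \tag{7.22}$$
NOTE: So far whatever we have done has no quantum in it. We have obtained a
classical theory of non-interacting fields.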
C. Interaction with source: Green’s functions
The simplest kind of ‘interaction’ we are familiar with, e.g. from classical electrodynamics,
is interaction with a “source”. So let us consider a real scalar field interacting with a source
J(x) which is a real function. The interaction is described by the equation (footnote 4):
$$(\Box - m^2)\phi(x) = -J(x) \quad\leftrightarrow\quad (\Box_x - m^2)G(x, x') = -\delta^4(x - x') \quad\text{(Green’s function)}$$
$$\Rightarrow\; \phi(x) = +\int d^4x'\,G(x, x')J(x') \quad\text{(}J\text{ is the source)}$$
4 Follows from the Lagrangian $\mathcal{L} = \mathcal{L}_{scalar} + J(x)\phi(x)$ with $\mathcal{L}_{scalar}$ given in (6.5).
Poincare invariance of the defining equation for Green’s function implies that it is a Lorentz
invariant function of (x − x′). It can be represented as,
$$G(x - x') = \int_{\mathbb{R}^4}\frac{d^4k}{(2\pi)^4}\frac{e^{ik\cdot(x - x')}}{k^2 + m^2}, \qquad k^2 = -(k^0)^2 + \vec k^2. \tag{7.23}$$
The denominator vanishes for $k^2 = -m^2$ and thus we need a definition of the integral and
hence of the Green’s function.
The usual method is to define $\int d^4k = \int d^3k\int dk^0$ and interpret the $k^0$ integral as a
contour integral along a suitably chosen contour. Each choice is supposed to reflect a
“boundary condition” for the wave operator. In the complex $k^0$ plane, there are poles
at $k^0 = \pm\omega_{\vec k} := \pm\sqrt{\vec k^2 + m^2}$. The integral is sought to be defined by shifting the poles off the
real axis. This can be done in 4 ways: (i) shift both poles in the Lower Half Plane (LHP),
$k^0 \to k^0 + i\epsilon$, (ii) shift both poles in the UHP, $k^0 \to k^0 - i\epsilon$, (iii) shift the pole at $+\omega_{\vec k}$ in
the LHP and shift the pole at $-\omega_{\vec k}$ in the UHP, $\omega_{\vec k}^2 \to \omega_{\vec k}^2 - i\epsilon$, and (iv) reverse of the (iii).
To see the consequences of any of these choices, let us consider the special case of a source
localized at the space-time origin: $J(t, \vec x) = \delta^4(x)$. The solution takes the form,
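\[ \phi(x) = \int dt' \int d^3x' \int \frac{d^4k}{(2\pi)^4}\,\frac{e^{ik\cdot x}}{k^2+m^2}\; e^{-i(-k^0t' + \vec k\cdot\vec x')}\,\delta(t')\,\delta^3(\vec x') \]
\[ \phantom{\phi(x)} = -\int \frac{d^3k}{(2\pi)^3}\; e^{i\vec k\cdot\vec x}\left[\int \frac{dk^0}{2\pi}\,\frac{e^{-ik^0t}}{(k^0)^2 - \omega_{\vec k}^2}\right] \qquad \text{using } k^2 + m^2 = \omega_{\vec k}^2 - (k^0)^2, \text{ and,} \]
\[ [\ldots] = \frac{1}{2\omega_{\vec k}} \int \frac{dk^0}{2\pi}\left(\frac{1}{k^0 - \omega_{\vec k}} - \frac{1}{k^0 + \omega_{\vec k}}\right) e^{-ik^0t} \tag{7.24} \]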
Without the $i\epsilon$'s, both sides of the above equations are real. Now we introduce the $i\epsilon$'s. The contour integrals will have the segment of a semicircle at infinity whose contribution should vanish. On the semicircle at large $R$, $k^0 = Re^{i\theta}$, the integrand behaves as $R^{-2}e^{R\sin(\theta)t}$ and the measure provides a factor of $R$. This suffices for convergence for $t = 0$. However, for a non-zero $t$, we can get an exponential divergence unless $\theta$ is restricted appropriately. This dictates whether the contour should be closed in the UHP or the LHP. Since $\phi(x)$ is a function of the Lorentz invariant $x^2 = -t^2 + \vec x^2$, we can consider the two cases as: (a) $x^2 \le 0$ (the sign of $t$ is invariant) and we may evaluate the integrals for $\vec x = 0$; (b) $x^2 > 0$ and we can evaluate the integrals for $t = 0$.
(i) Both poles to LHP: To pick up the residues, we should close the contour in the LHP. For space-like interval, taking $t = 0$, we see that the residues at the two poles cancel each other and the solution vanishes. For time-like intervals, $t \neq 0$ but we can take $\vec x = 0$. To have a vanishing contribution from the semicircle at infinity, we have to restrict to $t > 0$. The evaluation of the integral is trivial and gives,
\[ [\ldots]_{Retarded} = -\frac{1}{\omega_{\vec k}}\,\theta(t)\,\sin(\omega_{\vec k}t) . \]
(ii) Both poles to UHP: To pick up the residues, the contour should be closed in the UHP and for time-like interval, we need to restrict to $t < 0$. The evaluation gives,
\[ [\ldots]_{Advanced} = \frac{1}{\omega_{\vec k}}\,\theta(-t)\,\sin(\omega_{\vec k}t) . \]
As before, for space-like interval, the solution vanishes.
(iii) Positive pole to LHP and negative to UHP: Now closing the contour in either of UHP/LHP will always pick up a contribution. For time-like interval, for LHP closure, we have to take $t > 0$ and we will pick up the contribution from the first term. Likewise, for the UHP closure, we have to take $t < 0$ and we will pick up the contribution from the second term. Evaluation gives, for time-like or light-like separation,
\[ [\ldots]_{Feynman} = -\frac{i}{2\omega_{\vec k}}\Big[\theta(t)\,e^{-i\omega_{\vec k}t} + \theta(-t)\,e^{+i\omega_{\vec k}t}\Big] = -\frac{i}{2\omega_{\vec k}}\,e^{-i\omega_{\vec k}|t|}\,[\theta(t) + \theta(-t)] . \]
For space-like interval, taking $t = 0$, we get the non-zero answer, $[\ldots]_{Feynman} = -\frac{i}{2\omega_{\vec k}}$, for closure in either LHP or UHP.
The case (iv) is similar to (iii) and is obtained in an obvious manner. The solution $\phi(x)$ is obtained by the spatial Fourier transform as given above (7.24).
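The three pole prescriptions can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the lectures: it keeps a finite shift `eps` (so the real-axis integrand is smooth), integrates over a truncated `k0` range by the trapezoidal rule, and compares with the residue results at the same finite `eps`. The function names, the cutoff and the step size are hypothetical choices; as `eps -> 0` the closed forms reduce to the bracket expressions quoted above.

```python
import cmath
import math


def k0_integral(t, omega, shift, eps=0.2, cutoff=400.0, step=0.01):
    """Trapezoidal estimate of (1/2pi) * integral dk0 exp(-i k0 t) / D(k0).

    shift="retarded": D = (k0 + i eps)^2 - omega^2  (both poles in the LHP)
    shift="advanced": D = (k0 - i eps)^2 - omega^2  (both poles in the UHP)
    shift="feynman":  D = k0^2 - (omega^2 - i eps)  (+omega to LHP, -omega to UHP)
    """
    def denom(k0):
        if shift == "retarded":
            return (k0 + 1j * eps) ** 2 - omega ** 2
        if shift == "advanced":
            return (k0 - 1j * eps) ** 2 - omega ** 2
        return k0 ** 2 - (omega ** 2 - 1j * eps)

    n = int(round(2 * cutoff / step))
    total = 0j
    for j in range(n + 1):
        k0 = -cutoff + j * step
        weight = 0.5 if j in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-1j * k0 * t) / denom(k0)
    return total * step / (2 * math.pi)


def closed_form(t, omega, shift, eps=0.2):
    """Residue results at finite eps; they reduce to the text's [...] as eps -> 0."""
    if shift == "retarded":
        return -math.exp(-eps * t) * math.sin(omega * t) / omega if t > 0 else 0.0
    if shift == "advanced":
        return math.exp(eps * t) * math.sin(omega * t) / omega if t < 0 else 0.0
    w = cmath.sqrt(omega ** 2 - 1j * eps)  # shifted pole position, Im(w) < 0
    return -1j * cmath.exp(-1j * w * abs(t)) / (2 * w)


if __name__ == "__main__":
    for shift in ("retarded", "advanced", "feynman"):
        for t in (1.0, -1.0):
            print(shift, t, k0_integral(t, 1.0, shift), closed_form(t, 1.0, shift))
```

The retarded and advanced brackets come out real and vanish on the "wrong" side of t = 0, while the Feynman bracket is complex for both signs of t, which is the distinction the Remarks below turn on.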
Remarks: There are two properties of the Retarded and the Advanced choices that stand
out. In both cases, the solution φ(x) (a) is real for a real source function and (b) it reflects
the expected causal behavior - vanishing outside the future/past light cones. By contrast,
the Feynman choice gives a non-real φ(x) even for a real source function and the solution
is non-vanishing outside the light cones! The same holds for the choice (iv). What do we
make of this?
The retarded and advanced choices support the interpretation that the field can be regarded as an observable responding to a source in a causally consistent manner. This is
what we would have in a classical field theory. The Feynman choice however disallows such
an interpretation - the particular solution φ(x) cannot be interpreted as causally consistent
response to a source.
Both the causal and the Feynman Green’s functions appear naturally when φ(x) is promoted to be an operator field i.e. in Quantum Field Theory. In the scattering theory,
section 10, the causal Green's functions are used in articulating the asymptotic conditions
while the Feynman Green’s function (or Feynman propagator) appears naturally in the
Lehmann-Symanzik-Zimmermann (LSZ) reduction of the scattering matrix.
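For completeness, we note the full Green's functions (footnote 5):
\[ G_{ret}(x) = +\frac{\theta(t)}{2\pi}\,\delta(x^2) - \theta(t)\,\theta(-x^2)\,\frac{m\,J_1(m\sqrt{-x^2})}{4\pi\sqrt{-x^2}} \tag{7.25} \]
\[ G_{adv}(x) = +\frac{\theta(-t)}{2\pi}\,\delta(x^2) - \theta(-t)\,\theta(-x^2)\,\frac{m\,J_1(m\sqrt{-x^2})}{4\pi\sqrt{-x^2}} \tag{7.26} \]
\[ G_{Feynman}(x^2 < 0) = +\frac{\delta(x^2)}{4\pi} - \frac{m}{8\pi\sqrt{-x^2}}\,H_1^{(2)}(m\sqrt{-x^2}), \qquad G_{Feynman}(x^2 > 0) = +\frac{i\,m}{4\pi^2\sqrt{x^2}}\,K_1(m\sqrt{x^2}) \tag{7.27} \]
Footnote 5: The overall signs are convention dependent. It is easiest to match them for the massless case. For details see [4, 5].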
8. COVARIANCE OF QUANTUM FIELDS AND RELATIVISTIC CAUSALITY
So far we have studied representations of the Poincare group and seen emergence of a
“field dynamical system” - an infinite collection of harmonic oscillators. In our attempt to
incorporate interactions with classical sources, we encountered the Green’s functions. We
met the expected and familiar retarded and advanced Green’s functions, but also encountered
the mathematical possibility of the Feynman alternative which does not gel with a classical
field theory interpretation. Taking it as a hint of an opportunity, we attempt a ‘quantum’
interpretation. Here, the field as a collection of harmonic oscillators gives a strong clue: use
the mode decomposition and promote the expansion coefficients to operators. We will then
have a collection of creation and annihilation operators and can tag the states of the system
by the number of quanta. We already saw (7.22) in the simpler case of a real scalar field
that postulating Poisson brackets for the coefficients and using the completeness of mode
functions , we can deduce the Poisson brackets for the fields themselves. The same method
works for operators as well. That is, not only do we promote the fields to operators but we
also postulate the necessary commutation relations.
A. Poincare Covariance of Quantum fields
The first feature, fields as operators, immediately modifies the implementation of covariance: a covariant quantum field requires specific transformations for the coefficient functions
and conversely. Transformation laws of the mode functions do not suffice.
To see this, consider a linear combination of classical fields, $\psi^A(x) := \sum_n c_n\,\psi_n^A(x)$. A group action on $\psi^A$ is deduced from that on the individual $\psi_n^A$: $g\cdot\psi_n^A(x) = D^A_{\ B}(g)\,\psi_n^B(g^{-1}x)$,
\[ g\cdot\psi^A(x) := \sum_n c_n\; g\cdot\psi_n^A(x) = \sum_n c_n\, D^A_{\ B}(g)\,\psi_n^B(g^{-1}x) = D^A_{\ B}(g)\,\psi^B(g^{-1}x) . \]
This of course presumes quite naturally that the coefficients are invariant under the group
action. When a similar linear combination is promoted to an operator by making the $c_n$ operators, the above procedure breaks down:
\[ U(g)\,\hat\psi^A(x)\,U^\dagger(g) := \sum_n U(g)\,\hat c_n\,U(g)^\dagger\;\psi_n^A(x) \overset{?}{=} D^A_{\ B}(g^{-1})\,\hat\psi^B(gx) . \]
Evidently, if we assume the $\hat c_n$ to be invariant, then the field cannot be covariant. So the $\hat c_n$ should transform in a manner that shifts the group action on to the $\psi_n^A(x)$, and then the $\sum_n$ has to work out correctly so as to produce the expected right hand side (footnote 6). Taking now $c_n \to \hat a, \hat a^\dagger$, $\sum_n \to \int d^3k \ldots$ and using the mode expansion form, we observe that if we postulate the transformation laws for the $\hat a, \hat a^\dagger$ as,
\[ U(g)\,\hat a(\vec k,\sigma)\,U^\dagger(g) := \hat a(\vec{\Lambda k}, \text{``}D(\Lambda)\text{''}\sigma), \qquad U(g)\,\hat a^\dagger(\vec k,\sigma)\,U^\dagger(g) := \hat a^\dagger(\vec{\Lambda k}, \text{``}D(\Lambda)\text{''}\sigma), \]
then we can shift the changed labels from the operators to the mode function labels using a change of integration (and summation) variables. Provided the integration measure is invariant, the right hand side takes the expected form and we have the covariance of the quantum field. Note that this essentially fixes the form of the mode decomposition! In equations (for a Dirac field say),
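\[ \big[U(g)\,\hat\psi\,U^\dagger(g)\big](x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \Big[ (U\hat b_{\vec k,\sigma}U^\dagger)\,u(\vec k,\sigma)\,e^{ik\cdot x} + (U\hat d^\dagger_{\vec k,\sigma}U^\dagger)\,v(\vec k,\sigma)\,e^{-ik\cdot x} \Big] \]
\[ = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \Big[ \hat b_{\vec{\Lambda k},\sigma_\Lambda}\,u(\vec k,\sigma)\,e^{ik\cdot x} + \hat d^\dagger_{\vec{\Lambda k},\sigma_\Lambda}\,v(\vec k,\sigma)\,e^{-ik\cdot x} \Big] \]
\[ = \int \frac{d^3(\Lambda^{-1}k)}{\sqrt{2\omega_{\vec{\Lambda^{-1}k}}(2\pi)^3}} \sum_{\sigma_{\Lambda^{-1}}} \Big[ \hat b_{\vec k,\sigma}\,u(\vec{\Lambda^{-1}k},\sigma_{\Lambda^{-1}})\,e^{i(\Lambda^{-1}k)\cdot x} + \hat d^\dagger_{\vec k,\sigma}\,v(\vec{\Lambda^{-1}k},\sigma_{\Lambda^{-1}})\,e^{-i(\Lambda^{-1}k)\cdot x} \Big] \]
\[ = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \Big[ \hat b_{\vec k,\sigma}\,\big\{D(\Lambda^{-1})u(\vec k,\sigma)\big\}\,e^{ik\cdot(\Lambda x)} + \hat d^\dagger_{\vec k,\sigma}\,\big\{D(\Lambda^{-1})v(\vec k,\sigma)\big\}\,e^{-ik\cdot(\Lambda x)} \Big] \]
\[ = D(\Lambda^{-1})\,\hat\psi(\Lambda x) \]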
Note: If we postulate a Poincare invariant state $|0\rangle$ and define $|\vec k,\sigma\rangle := b^\dagger(\vec k,\sigma)|0\rangle$, then these states transform precisely as per the particle representation. Thus the postulated group action on the $b, d$ operators is precisely what is needed for building up the particle representations and we will use it shortly.
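The proviso "the integration measure is invariant" can be made concrete with the standard Lorentz-invariant combination $d^3k/\omega_{\vec k}$. The sketch below is an illustrative check, not taken from the lectures: it boosts an on-shell momentum along z and estimates the Jacobian by a finite difference. The function names, the choice of a z-boost and the sample numbers are assumptions made for the example.

```python
import math


def energy(k, m):
    """On-shell energy omega = sqrt(k^2 + m^2)."""
    return math.sqrt(m * m + sum(c * c for c in k))


def boost_z(k, m, rapidity):
    """Spatial momentum after a boost along z (kx and ky are unchanged)."""
    kx, ky, kz = k
    return (kx, ky, math.cosh(rapidity) * kz + math.sinh(rapidity) * energy(k, m))


def measure_ratio(k, m, rapidity, h=1e-6):
    """Return (d^3k'/omega') / (d^3k/omega), estimated by a central difference.

    kx' = kx and ky' = ky, so the Jacobian matrix is triangular and its
    determinant reduces to d(kz')/d(kz).
    """
    kx, ky, kz = k
    up = boost_z((kx, ky, kz + h), m, rapidity)[2]
    down = boost_z((kx, ky, kz - h), m, rapidity)[2]
    jacobian = (up - down) / (2 * h)
    return jacobian * energy(k, m) / energy(boost_z(k, m, rapidity), m)


if __name__ == "__main__":
    print(measure_ratio((0.3, -0.4, 1.2), 1.0, 0.7))
```

A ratio of one (up to finite-difference error) is what allows the change of variables $k \to \Lambda^{-1}k$ in the third line of the derivation above without an extra Jacobian factor.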
B. Space Inversion, Time Reversal, Charge Conjugation of Dirac Field
Let us quickly see how the discrete symmetries act on quantum fields. Again, we will consider a Dirac field for illustration and now suppress the 'hats' on the operators.
Footnote 6: The homomorphism dictates the expected form on the r.h.s.
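Space Inversion:
We want $\Psi_P(t,\vec x) := \mathcal{P}\Psi\mathcal{P}^\dagger = D(P)\Psi(t,-\vec x)$. From the defining relations for $\mathcal{P}$, namely, $\mathcal{P}P_i\mathcal{P}^\dagger = -P_i$ and $\mathcal{P}J_i\mathcal{P}^\dagger = J_i$, we postulate,
\[ \mathcal{P}\,b^\dagger(\vec k,\sigma)\,\mathcal{P}^\dagger := \eta\, b^\dagger(-\vec k,\sigma), \qquad \mathcal{P}\,d^\dagger(\vec k,\sigma)\,\mathcal{P}^\dagger := \eta\, d^\dagger(-\vec k,\sigma), \qquad \eta^2 = \pm 1 . \]
As noted while discussing the discrete symmetry actions on the particle representations, the phase is $\sigma$ independent.
Substituting the mode expansion and using these postulated actions gives,
\[ [\mathcal{P}\Psi\mathcal{P}^\dagger](x) = \int [d^3k]_{inv} \sum_\sigma \Big\{ \eta^*\, b(\vec k,\sigma)\,u(-\vec k,\sigma)\,e^{ik\cdot(Px)} + \eta\, d^\dagger(\vec k,\sigma)\,v(-\vec k,\sigma)\,e^{-ik\cdot(Px)} \Big\} \]
Claim: The $u, v$ spinors satisfy $u(-\vec k,\sigma) = \gamma^0 u(\vec k,\sigma)$ and $v(-\vec k,\sigma) = -\gamma^0 v(\vec k,\sigma)$.
This follows by noting that
\[ 0 = (\slashed{k} + m)\,u(\vec k,\sigma) = (-\omega\gamma^0 + \vec k\cdot\vec\gamma + m)\,u(\vec k,\sigma), \qquad u(\vec k,\sigma) = (\gamma^0)^2\,u(\vec k,\sigma) \]
\[ \therefore\ 0 = (-\omega\gamma^0\gamma^0 + \vec k\cdot\vec\gamma\,\gamma^0 + m\gamma^0)\,(\gamma^0 u(\vec k,\sigma)) = \gamma^0\,(-\omega\gamma^0 + (-\vec k)\cdot\vec\gamma + m)\,(\gamma^0 u(\vec k,\sigma)) \ \Rightarrow\ \gamma^0 u(\vec k,\sigma) \propto u(-\vec k,\sigma) \]
Similarly, $\gamma^0 v(\vec k,\sigma) \propto v(-\vec k,\sigma)$. The proportionality factors are fixed by noting that for $\vec k = \vec 0$, the $u, v$ spinors are eigen-spinors of $\gamma^0$ with eigenvalues $\pm 1$. This implies,
\[ [\mathcal{P}\Psi\mathcal{P}^\dagger](x) = \gamma^0 \int [d^3k]_{inv} \sum_\sigma \Big\{ \eta^*\, b(\vec k,\sigma)\,u(\vec k,\sigma)\,e^{ik\cdot(Px)} - \eta\, d^\dagger(\vec k,\sigma)\,v(\vec k,\sigma)\,e^{-ik\cdot(Px)} \Big\} \]
Clearly $\eta$ should be imaginary for the right hand side to have the field $\Psi(Px)$. We choose $\eta = -i$ and deduce that $\mathcal{P}\Psi(x)\mathcal{P}^\dagger = i\gamma^0\Psi(Px)$, i.e., $D(P) = i\gamma^0$ and $\eta^2 = -1$.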
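The claim about the spinors can be checked with explicit matrices. The sketch below assumes the Dirac representation with $(\gamma^0)^2 = 1$ and $(\gamma^i)^2 = -1$, which is consistent with $\slashed{k} = -\omega\gamma^0 + \vec k\cdot\vec\gamma$ and $\slashed{k}^2 = m^2$ as used above. The spinors are built as $(m \mp \slashed{k})\chi$ from rest-frame eigen-spinors $\chi$ of $\gamma^0$; this construction, the helper names and the missing normalization are choices made for the example, not the lectures' conventions.

```python
import math

# Dirac representation (assumed): gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, s_i], [-s_i, 0]]
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]


def k_slash(k, m):
    """k-slash = -omega*gamma^0 + k.gamma for an on-shell momentum k."""
    omega = math.sqrt(m * m + sum(c * c for c in k))
    coeffs = (-omega, k[0], k[1], k[2])
    return [[sum(c * g[i][j] for c, g in zip(coeffs, (G0, G1, G2, G3)))
             for j in range(4)] for i in range(4)]


def apply(matrix, spinor):
    """Matrix times column spinor."""
    return [sum(matrix[i][j] * spinor[j] for j in range(4)) for i in range(4)]


def u_spinor(k, m, chi):
    """(m - k_slash) chi solves (k_slash + m) u = 0 because k_slash^2 = m^2."""
    return [m * c - x for c, x in zip(chi, apply(k_slash(k, m), chi))]


def v_spinor(k, m, chi):
    """(m + k_slash) chi solves (-k_slash + m) v = 0."""
    return [m * c + x for c, x in zip(chi, apply(k_slash(k, m), chi))]


def max_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    k, m = (0.3, -0.4, 1.2), 1.0
    minus_k = tuple(-c for c in k)
    u = u_spinor(k, m, [1, 0, 0, 0])   # chi with gamma^0 eigenvalue +1
    v = v_spinor(k, m, [0, 0, 1, 0])   # chi with gamma^0 eigenvalue -1
    print(max_diff(apply(G0, u), u_spinor(minus_k, m, [1, 0, 0, 0])))
    print(max_diff([-x for x in apply(G0, v)], v_spinor(minus_k, m, [0, 0, 1, 0])))
```

Both printed differences should be at rounding level, reflecting $\gamma^0 u(\vec k) = u(-\vec k)$ and $-\gamma^0 v(\vec k) = v(-\vec k)$ for this construction.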
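Time reversal:
We want $\Psi_T(t,\vec x) := \mathcal{T}\Psi\mathcal{T}^\dagger = D(T)\Psi(-t,\vec x)$. From the defining relations, where both $P_i$ and $J_i$ change sign, we postulate,
\[ \mathcal{T}\,b^\dagger(\vec k,\sigma)\,\mathcal{T}^\dagger = \xi_\sigma\, b^\dagger(-\vec k,-\sigma), \qquad \mathcal{T}\,d^\dagger(\vec k,\sigma)\,\mathcal{T}^\dagger = \xi_\sigma\, d^\dagger(-\vec k,-\sigma) . \]
As noted while discussing the discrete symmetry actions on the particle representations, the phase is $\sigma$ dependent.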
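Substituting the mode decomposition and noting that the $\mathcal{T}$ operator is anti-unitary, we get,
\[ [\mathcal{T}\Psi\mathcal{T}^\dagger](x) = \int [d^3k]_{inv} \sum_\sigma \Big\{ \xi^*_\sigma\, b(-\vec k,-\sigma)\,u^*(\vec k,\sigma)\,e^{-ik\cdot x} + \xi_\sigma\, d^\dagger(-\vec k,-\sigma)\,v^*(\vec k,\sigma)\,e^{ik\cdot x} \Big\} \]
\[ = \int [d^3k]_{inv} \sum_\sigma \Big\{ \xi^*_{-\sigma}\, b(\vec k,\sigma)\,u^*(-\vec k,-\sigma)\,e^{ik\cdot(Tx)} + \xi_{-\sigma}\, d^\dagger(\vec k,\sigma)\,v^*(-\vec k,-\sigma)\,e^{-ik\cdot(Tx)} \Big\} \]
where the second line follows by relabelling $\vec k \to -\vec k$, $\sigma \to -\sigma$ and using $Tx = (-t, \vec x)$.
Now we need to use the relations,
\[ u^*(-\vec k,-\sigma) = -\sigma\,C\gamma_5\,u(\vec k,\sigma), \qquad v^*(-\vec k,-\sigma) = -\sigma\,C\gamma_5\,v(\vec k,\sigma) . \]
Once again these are proved from the defining equations for the spinors.
Clearly, if we choose $\xi_\sigma = \sigma\ (= \pm 1)$, then we get $\Psi(Tx)$ on the right hand side and we get,
\[ \mathcal{T}\Psi(x)\mathcal{T}^\dagger = C\gamma_5\,\Psi(Tx), \quad \text{i.e. } D(T) = C\gamma_5, \qquad C\gamma^\mu C^\dagger = -(\gamma^\mu)^T . \]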
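Charge conjugation:
Lastly, consider the charge conjugation action. This is defined through the particle-antiparticle exchange and we postulate,
\[ \mathcal{C}\,b^\dagger_{\vec k,\sigma}\,\mathcal{C}^\dagger = \xi\, d^\dagger_{\vec k,\sigma}, \qquad \mathcal{C}\,d^\dagger_{\vec k,\sigma}\,\mathcal{C}^\dagger = \xi\, b^\dagger_{\vec k,\sigma} . \tag{8.1} \]
Substitution of the mode expansion gives,
\[ [\mathcal{C}\Psi\mathcal{C}^\dagger](x) = \int [d^3k]_{inv} \sum_\sigma \Big\{ \xi^*\, d_{\vec k,\sigma}\,u(\vec k,\sigma)\,e^{ik\cdot x} + \xi\, b^\dagger_{\vec k,\sigma}\,v(\vec k,\sigma)\,e^{-ik\cdot x} \Big\} \]
Now we need to use $v(\vec k,\sigma) = C^\dagger\,\bar u^T(\vec k,\sigma)$ and recall that the $C$ matrix is given by $C = i\gamma^2\gamma^0$, $C^T = -C$, $C^\dagger C = 1$, so that $v = -C\bar u^T$ and $u = -C\bar v^T$. This gives,
\[ [\mathcal{C}\Psi\mathcal{C}^\dagger](x) = \int [d^3k]_{inv} \sum_\sigma \Big\{ \xi^*\, d_{\vec k,\sigma}\,(-C\bar v^T(\vec k,\sigma))\,e^{ik\cdot x} + \xi\, b^\dagger_{\vec k,\sigma}\,(-C\bar u^T(\vec k,\sigma))\,e^{-ik\cdot x} \Big\} \quad \text{and,} \]
\[ \bar\Psi^T(x) = \int [d^3k]_{inv} \sum_\sigma \Big\{ b^\dagger_{\vec k,\sigma}\,\bar u^T(\vec k,\sigma)\,e^{-ik\cdot x} + d_{\vec k,\sigma}\,\bar v^T(\vec k,\sigma)\,e^{ik\cdot x} \Big\} \]
Hence, if we choose $\xi^* = \xi = -1$, then we get $\mathcal{C}\Psi(x)\mathcal{C}^\dagger = +C\,(\bar\Psi)^T(x)$. Note that the $\mathcal{C}$ operator is unitary!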
From these it is easy to get the transformations of Dirac conjugates. Here is a summary:
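\[ \mathcal{P}\Psi\mathcal{P}^\dagger(x) = i\gamma^0\,\Psi(Px), \qquad \mathcal{P}\bar\Psi\mathcal{P}^\dagger(x) = -i\,\bar\Psi(Px)\,\gamma^0 ; \tag{8.2} \]
\[ \mathcal{T}\Psi\mathcal{T}^\dagger(x) = C\gamma_5\,\Psi(Tx), \qquad \mathcal{T}\bar\Psi\mathcal{T}^\dagger(x) = \bar\Psi(Tx)\,\gamma_5 C^\dagger ; \tag{8.3} \]
\[ \mathcal{C}\Psi\mathcal{C}^\dagger(x) = C\,\bar\Psi^T(x), \qquad \mathcal{C}\bar\Psi\mathcal{C}^\dagger(x) = \Psi^T(x)\,C ; \tag{8.4} \]
\[ C\gamma^\mu C^\dagger = -(\gamma^\mu)^T, \quad C^\dagger C = 1, \qquad C^T = -C, \quad C := i\gamma^2\gamma^0 \tag{8.5} \]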
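The matrix identities in (8.5) are representation dependent in form, so it can help to verify them once in a concrete basis. The sketch below assumes the Dirac representation with $(\gamma^0)^2 = 1$ and anti-Hermitian $\gamma^i$; the helper names are made up for the example and nothing here is specific to the lectures beyond $C := i\gamma^2\gamma^0$.

```python
# Dirac representation (assumed): gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, s_i], [-s_i, 0]]
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def dagger(a):
    """Hermitian conjugate."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


# Charge conjugation matrix of (8.5)
C = scale(1j, matmul(G2, G0))

if __name__ == "__main__":
    for mu, g in enumerate((G0, G1, G2, G3)):
        lhs = matmul(matmul(C, g), dagger(C))
        print(mu, close(lhs, scale(-1, transpose(g))))
    print(close(matmul(dagger(C), C), IDENTITY), close(transpose(C), scale(-1, C)))
```

In this basis $C$ is real and antisymmetric, which is why $C^\dagger = -C$ can be used freely in the CPT manipulations that follow.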
We are now ready to verify one of the important general theorems of relativistic quantum field theory, the CPT theorem. We verify it for the spinorial quantum field.
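C. CPT theorem for Dirac Field
We want to consider the combined action of the discrete transformations. Let us choose one ordering, say, $\mathcal{CPT}$. Then,
\[ (\mathcal{CPT})\,\Psi(x)\,(\mathcal{CPT})^\dagger = \mathcal{CP}\,\big(C\gamma_5\Psi(Tx)\big)\,\mathcal{P}^\dagger\mathcal{C}^\dagger = C\gamma_5\;\mathcal{C}\,\big(i\gamma^0\Psi(PTx)\big)\,\mathcal{C}^\dagger = i\,C\gamma_5\gamma^0 C\;\bar\Psi^T(-x) \qquad \because\ PTx = -x \]
\[ (\mathcal{CPT})\,\bar\Psi(x)\,(\mathcal{CPT})^\dagger = \mathcal{CP}\,\big(\bar\Psi(Tx)\gamma_5 C^\dagger\big)\,\mathcal{P}^\dagger\mathcal{C}^\dagger = \mathcal{C}\,\big(\bar\Psi(PTx)(-i\gamma^0)\gamma_5 C^\dagger\big)\,\mathcal{C}^\dagger = i\,\Psi^T(-x)\,C\gamma^0\gamma_5 C \qquad \because\ C^\dagger = -C \]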
Next, consider the CPT transform of a general bilinear of the form Ψ̄(x)AΨ(x). Using
(CPT )† (CPT ) = 1, the CPT transform will be the CPT transform of the Dirac conjugate,
that of the matrix A and that of Ψ. The CPT transform of A is just A∗ since only the
complex conjugate acts on A.
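\[ (\mathcal{CPT})\bar\Psi(x)(\mathcal{CPT})^\dagger\;(\mathcal{CPT})A(\mathcal{CPT})^\dagger\;(\mathcal{CPT})\Psi(x)(\mathcal{CPT})^\dagger \]
\[ = i\,\Psi^T(-x)\,C\gamma^0\gamma_5 C\;[A^*]\;i\,C\gamma_5\gamma^0 C\;\bar\Psi^T(-x) \]
\[ = i^2\;\bar\Psi(-x)\,(C\gamma_5\gamma^0 C)^T\,A^\dagger\,(C\gamma^0\gamma_5 C)^T\,\Psi(-x) \times (-1) \]
\[ = +\bar\Psi(-x)\,(C\gamma_5\gamma^0 C)^T\,A^\dagger\,(C\gamma^0\gamma_5 C)^T\,\Psi(-x) \tag{8.6} \]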
The explicit (−1) in the third line is due to the fermionic nature of the spinors which we
will see shortly in the discussion of the spin-statistics theorem.
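The factors of the gamma matrices simplify as,
$C\gamma_5\gamma^0 C = +\gamma_5 C\gamma^0 C = -\gamma_5 C\gamma^0 C^\dagger = +\gamma_5(\gamma^0)^T = \gamma_5\gamma^0$ and $C\gamma^0\gamma_5 C = \gamma^0\gamma_5$.
\[ (\mathcal{CPT})\,\big[\bar\Psi(x)A\Psi(x)\big]\,(\mathcal{CPT})^\dagger = \bar\Psi(-x)\,\big[(\gamma^0\gamma_5)\,A^\dagger\,(\gamma_5\gamma^0)\big]\,\Psi(-x) \tag{8.7} \]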
It remains to evaluate the middle braces for A = 1, γ µ , γ µ γ ν , γ µ γ5 and γ5 . These are obtained
easily as,
γ 0 γ5 1γ5 γ 0 = 1
γ 0 γ5 γ µ γ5 γ 0 = −γ µ
γ 0 γ5 Σµ,ν γ5 γ 0 = +Σµν
γ 0 γ5 γ µ γ5 γ5 γ 0 = −γ µ
γ 0 γ5 (iγ5 )γ5 γ 0 = γ5
The Lagrangian density is made up of the bi-linears and also involves derivatives. For
a ∂µ Ψ(x), there is an extra minus sign due to PT x = −x. Thus we see that every tensor
index, on γ or on ∂, gives a minus sign under CPT transform. For a Hermitian Lagrangian
(coefficients are appropriately real or pure imaginary) which is a Lorentz scalar (so even
rank tensor), CPT contributes no extra sign and only changes the space-time argument i.e.
$(\mathcal{CPT})\,L(x)\,(\mathcal{CPT})^\dagger = L(-x)$. Thus,
a Hermitian action, invariant under proper and orthochronous Lorentz transformations,
is invariant under the discrete CPT transformations.
We have explicitly verified it for the quantum, Dirac field. Verification for scalar and
Maxwell field is left as an exercise.
Note: Had we not used the extra minus sign attributed to the fermionic nature of the
fields, we would have got the CPT transform of the Lagrangian to be minus of L (−x)
and got CPT non-invariance of the action! This minus sign is tightly connected with the
spin-statistics theorem which we discuss next.
D. Relativistic Causality and the Spin-Statistics Theorem
Now we come to the second feature of quantum fields - the commutation relations. Here
we get another surprise. The Pauli exclusion principle follows from the requirement of
relativistic causality!
The Fourier decomposition suggested viewing the field as a collection of harmonic oscillators. A natural quantization procedure is to postulate $[a_{\vec k}, a^\dagger_{\vec k'}] = \delta^3(\vec k - \vec k')$. This has a potential problem for $\vec k = \vec k'$. The usual way this is interpreted is to imagine the field being confined to a large box satisfying periodic or Dirichlet boundary conditions. This discretizes the momentum labels, $\vec k \to \vec k_n \sim 2\pi\vec n/L$, and also replaces the delta function by a Kronecker delta. Another way is to choose a suitable, countable orthonormal set of functions $\{\varphi_n(\vec x)\}$ such that $\nabla^2\varphi_n = -\omega_n^2\varphi_n$. Completeness of the $\varphi_n$'s allows translating $[a_m, a_n^\dagger]$ to $[\Phi(t,\vec x), \Pi(t,\vec y)] = \delta^3(\vec x - \vec y)$. We will implicitly assume such a procedure and proceed formally using the delta function.
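From the postulated commutation relations we have, for each $\vec k$,
\[ N_{\vec k} := a^\dagger_{\vec k}a_{\vec k}, \qquad [N_{\vec k}, a_{\vec k}] = -a_{\vec k}, \qquad [N_{\vec k}, a^\dagger_{\vec k}] = a^\dagger_{\vec k}, \]
\[ N_{\vec k}|n_{\vec k}\rangle = n_{\vec k}|n_{\vec k}\rangle, \quad n_{\vec k} = 0, 1, \ldots ; \qquad \langle m_{\vec k}|n_{\vec k}\rangle = \delta_{m_{\vec k},n_{\vec k}} \]
\[ |n_{\vec k}\rangle := \frac{(a^\dagger_{\vec k})^{n_{\vec k}}}{\sqrt{(n_{\vec k})!}}\,|0_{\vec k}\rangle, \qquad a_{\vec k}|n_{\vec k}\rangle = \sqrt{n_{\vec k}}\,|n_{\vec k} - 1\rangle, \qquad a^\dagger_{\vec k}|n_{\vec k}\rangle = \sqrt{n_{\vec k} + 1}\,|n_{\vec k} + 1\rangle \]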
The state label $n_{\vec k}$ is interpreted as denoting the number of quanta, created by $a^\dagger_{\vec k}$ and destroyed by $a_{\vec k}$.
Note: In this case of a scalar field, all quanta have the same mass parameter, but different $\vec k$ labels. For a vector field, we have the polarization label $\lambda$ and for the spinor field we have the spin projection label $\sigma$. All quanta of a given mass $m$ and spin/helicity are identical.
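For a single mode, the ladder relations above can be checked with finite matrices. The sketch below is only an illustration: it represents $a$ in a number basis truncated at a hypothetical dimension `dim`, so relations that leave the truncated space (such as $[a, a^\dagger] = 1$ in the last row) are not expected to hold exactly, while the commutators with $N$ are exact.

```python
import math


def annihilation(dim):
    """Matrix of a in the number basis |0>, ..., |dim-1>:  <m|a|n> = sqrt(n) if m == n-1."""
    return [[math.sqrt(col) if row == col - 1 else 0.0 for col in range(dim)]
            for row in range(dim)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def number_state(n, dim):
    """Column vector for |n> in the truncated basis."""
    return [1.0 if i == n else 0.0 for i in range(dim)]


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


if __name__ == "__main__":
    dim = 6
    a = annihilation(dim)
    a_dag = transpose(a)          # a is real, so the adjoint is the transpose
    number = matmul(a_dag, a)     # N = a^dagger a = diag(0, 1, ..., dim-1)
    print([number[i][i] for i in range(dim)])
    print(apply(a_dag, number_state(2, dim)))   # proportional to |3>
```

The diagonal of `number` lists the occupation numbers, and applying `a_dag` to `|2>` gives a vector along `|3>` with coefficient $\sqrt{3}$.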
Now we add a postulate of relativistic causality [1] also referred to as micro-causality,
namely, all pairs of fields and their adjoints, commute or anti-commute for space-like separation.
This requirement comes about as follows. In a Lorentz invariant theory, all observables would be Lorentz tensors of various ranks. Spinors, being double valued quantities, are themselves not observables. Spinorial fields must come in even powers (single valued) in an observable. The general principle of quantum theory that simultaneously measurable observables must commute, together with the notion of relativistic causality that simultaneously measurable observables must be space-like separated, suggests that for space-time dependent operators to be observables, they must commute for space-like separation. Since operators built out of spinorial fields are at least quadratic in the spinorial fields, anti-commutation rules are also possible for spinor fields (footnote 7).
Footnote 7: Weinberg bases his argument in favor of relativistic causality on the Lorentz invariance of the scattering matrix.
All our mode decompositions have annihilation operators as coefficients of the positive frequency modes and the creation operators as coefficients of the negative frequency modes. Let us denote these parts as $\Psi_+(x)$ and $\Psi_-(x)$ respectively. Consider a general linear combination of these fields: $\Psi(x) := \alpha\Psi_+(x) + \beta\Psi_-(x)$ and let $\Psi^\dagger(x)$ denote its adjoint. Let all observables be built out of $\Psi(x)$ and $\Psi^\dagger(x)$. Let $[A, B]_\pm := AB \pm BA$, the anti-commutator and commutator respectively. The requirement of micro-causality is then stated as: $[\Psi(x), \Psi(y)]_\pm = 0 = [\Psi(x), \Psi^\dagger(y)]_\pm$ for $(x - y)^2 > 0$. This requirement is quite restrictive, as seen below.
Let us begin with a real scalar field. We have,
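\[ \phi_+(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\; a_{\vec k}\,e^{ik\cdot x}, \qquad \phi_-(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\; a^\dagger_{\vec k}\,e^{-ik\cdot x} = \phi_+^\dagger(x) . \]
\[ \therefore\ [\phi_+(x), \phi_-(y)]_\pm = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int \frac{d^3k'}{\sqrt{2\omega_{\vec k'}(2\pi)^3}}\; e^{ik\cdot x - ik'\cdot y}\,[a_{\vec k}, a^\dagger_{\vec k'}]_\pm \]
For the scalar field, we use the commutator given above. Let us assume the same form for the anti-commutator, i.e. we assume $[a_{\vec k}, a^\dagger_{\vec k'}]_\pm = \delta^3(\vec k - \vec k')$. This gives,
\[ [\phi_+(x), \phi_-(y)]_\pm = \int \frac{d^3k}{2\omega_{\vec k}(2\pi)^3}\; e^{ik\cdot(x-y)} =: \Delta_+(x-y), \]
which is Poincare invariant. For space-like separation, choose $(x-y)^2 = (\vec x - \vec y)^2 =: r^2$. Thus
\[ \Delta_+(x^2) = \frac{1}{(2\pi)^3}\int_0^\infty dk\,\frac{k^2}{2\sqrt{k^2+m^2}} \int_{-1}^{1} d\cos(\theta) \int_0^{2\pi} d\varphi\; e^{ikr\cos\theta} \]
\[ = \frac{1}{4\pi^2}\int_0^\infty dk\,\frac{k^2}{\sqrt{k^2+m^2}}\,\frac{\sin(kr)}{kr} \qquad \text{put } k = m\alpha, \]
\[ \Delta_+(x^2) = \frac{1}{4\pi^2}\,\frac{m}{r}\int_0^\infty d\alpha\,\frac{\alpha}{\sqrt{\alpha^2+1}}\,\sin(mr\alpha) \]
\[ = -\frac{1}{4\pi^2}\,\frac{1}{r}\,\frac{\partial}{\partial r}\int_0^\infty d\alpha\,\frac{\cos(mr\alpha)}{\sqrt{\alpha^2+1}} \]
The last integral is $K_0(mr)$ and $-\partial_r K_0(mr) = m\,K_1(mr)$. Thus, $\Delta_+(x^2 > 0) = \frac{m}{4\pi^2\sqrt{x^2}}\,K_1(m\sqrt{x^2})$ ($K_1$ is the modified Bessel function of the second kind), and it is non-zero for space-like separation.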
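To get a feel for the size of this space-like tail, the closed form can be evaluated directly. The sketch below is a hedged illustration: it computes $K_1$ from the standard integral representation $K_1(z) = \int_0^\infty e^{-z\cosh t}\cosh t\,dt$ by the trapezoidal rule and uses the prefactor $m/(4\pi^2 r)$, which is the normalization that matches the $K_1$ term in (7.27). The function names, the cutoff `t_max` and the step are choices made for the example.

```python
import math


def bessel_k1(z, t_max=20.0, step=1e-3):
    """K_1(z) for z > 0 from K_1(z) = integral_0^inf exp(-z cosh t) cosh t dt."""
    n = int(round(t_max / step))
    total = 0.0
    for j in range(n + 1):
        t = j * step
        weight = 0.5 if j in (0, n) else 1.0  # trapezoid end-point weights
        c = math.cosh(t)
        total += weight * math.exp(-z * c) * c
    return total * step


def delta_plus(m, r):
    """Delta_+ at space-like separation r > 0: m K_1(m r) / (4 pi^2 r)."""
    return m * bessel_k1(m * r) / (4 * math.pi ** 2 * r)


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0, 5.0):
        print(r, delta_plus(1.0, r))
```

For $mr \ll 1$ the values approach the massless result $1/(4\pi^2 r^2)$, and for $mr \gg 1$ they fall off roughly like $e^{-mr}$, so the function is small but never zero outside the light cone.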
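Define $\phi(x) := \alpha\phi_+(x) + \beta\phi_-(x)$, $\phi^\dagger(y) = \alpha^*\phi_-(y) + \beta^*\phi_+(y)$. It follows,
\[ [\phi(x), \phi(y)]_\pm = \alpha\beta\,[\phi_+(x), \phi_-(y)]_\pm + \beta\alpha\,[\phi_-(x), \phi_+(y)]_\pm \]
\[ = \alpha\beta\,\big(\Delta_+(x-y) \pm \Delta_+(y-x)\big) \]
\[ = \alpha\beta\,(1 \pm 1)\,\Delta_+(x-y) \qquad \text{since } \Delta_+ \text{ is symmetric in } x, y. \]
\[ [\phi(x), \phi^\dagger(y)]_\pm = |\alpha|^2\,[\phi_+(x), \phi_-(y)]_\pm + |\beta|^2\,[\phi_-(x), \phi_+(y)]_\pm \]
\[ = (|\alpha|^2 \pm |\beta|^2)\,\Delta_+(x-y) \]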
To satisfy the requirement of causality, both the brackets must vanish i.e. in both equations we must choose the minus sign i.e. a commutator as well as the condition |α| = |β|.
This is quite a strong restriction on both the sign as well as the coefficients α, β.
Let us consider the spinor field now. Let,
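\[ \psi_+(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \big(u(\vec k,\sigma)\,e^{ik\cdot x}\big)\,b_{\vec k,\sigma} \quad\longleftrightarrow\quad (\psi_+)^\dagger(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \big(u^*(\vec k,\sigma)\,e^{-ik\cdot x}\big)\,b^\dagger_{\vec k,\sigma} ; \]
\[ \psi_-(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \big(v(\vec k,\sigma)\,e^{-ik\cdot x}\big)\,d^\dagger_{\vec k,\sigma} \quad\longleftrightarrow\quad (\psi_-)^\dagger(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \big(v^*(\vec k,\sigma)\,e^{+ik\cdot x}\big)\,d_{\vec k,\sigma} . \]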
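The Dirac index is suppressed and the dagger refers to the adjoint of the operators and not of the spinors.
Let $\psi(x) := \mu\,\psi_+(x) + \nu\,\psi_-(x)$ and $\psi^\dagger(y) := \mu^*\,\psi_+^\dagger(y) + \nu^*\,\psi_-^\dagger(y)$. Then,
\[ \big[\psi_\alpha(x), \psi_\beta^\dagger(y)\big]_\pm = \int \frac{d^3k}{2\omega_{\vec k}(2\pi)^3} \sum_\sigma \Big\{ |\mu|^2\,u_\alpha(\vec k,\sigma)\,u^*_\beta(\vec k,\sigma)\,e^{ik\cdot(x-y)} \pm |\nu|^2\,v_\alpha(\vec k,\sigma)\,v^*_\beta(\vec k,\sigma)\,e^{-ik\cdot(x-y)} \Big\} \]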
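The sums over $\sigma$ are given by,
\[ \sum_\sigma u_\alpha(\vec k,\sigma)\,u^*_\beta(\vec k,\sigma) = \sum_\sigma \big[u(\vec k,\sigma)\,\bar u(\vec k,\sigma)\,\gamma^0\big]_{\alpha\beta} = \Big[\frac{-\slashed{k} + m}{2m}\,\gamma^0\Big]_{\alpha\beta} \ \Rightarrow\ \sum_\sigma u_\alpha(\vec k,\sigma)\,u^*_\beta(\vec k,\sigma)\,e^{ik\cdot(x-y)} = (2m)^{-1}\big[(i\slashed{\partial} + m)\gamma^0\big]_{\alpha\beta}\,e^{ik\cdot(x-y)} \]
and likewise,
\[ \sum_\sigma v_\alpha(\vec k,\sigma)\,v^*_\beta(\vec k,\sigma) = \sum_\sigma \big[v(\vec k,\sigma)\,\bar v(\vec k,\sigma)\,\gamma^0\big]_{\alpha\beta} = \Big[\frac{-\slashed{k} - m}{2m}\,\gamma^0\Big]_{\alpha\beta} \ \Rightarrow\ \sum_\sigma v_\alpha(\vec k,\sigma)\,v^*_\beta(\vec k,\sigma)\,e^{-ik\cdot(x-y)} = -(2m)^{-1}\big[(i\slashed{\partial} + m)\gamma^0\big]_{\alpha\beta}\,e^{-ik\cdot(x-y)} ; \]
and the integration over $\vec k$ just gives the $\Delta_+((x-y)^2)$ function computed above. Hence,
\[ \big[\psi_\alpha(x), \psi_\beta^\dagger(y)\big]_\pm = (|\mu|^2 \mp |\nu|^2)\,\Big[\frac{i\slashed{\partial} + m}{2m}\,\gamma^0\Big]_{\alpha\beta}\,\Delta_+((x-y)^2) \]
Once again, the right hand side vanishes for space-like separation provided the upper signs are chosen and $|\mu| = |\nu|$. Thus, for spinors we have to choose anti-commutation relations and the weights of the positive and negative frequency solutions must be the same up to a phase factor. We may just take $\mu = \nu = 1$ which gives back our previous mode decomposition.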
Note: While writing the mode decomposition, we did not bother about the normalizations of the coefficients, $b, d^*$ etc. Now, in choosing the brackets, $[b_{\vec k,\sigma}, b^\dagger_{\vec k',\sigma'}]_+ = \delta_{\sigma,\sigma'}\,\delta^3(\vec k - \vec k')$ and ditto for the $d$'s, we have fixed these normalizations. We could still have an arbitrary multiple in each term. The requirement of causality restricts this freedom to the above.
The two examples considered above, generalize to other representations as well and what
we have is the celebrated spin-statistics theorem:
A relativistic quantum field theory satisfies the requirement of causality provided integer spin/helicity field quantum conditions use commutators (bosons) while the half integer
spin/helicity field quantum conditions use anti-commutators (fermions).
Note: While discussing quantum statistics in statistical mechanics, we invoke the additional attribute of indistinguishability among identical particles and require the Bose-Einstein/Fermi-Dirac statistics. Identical particles are defined by having identical intrinsic
attributes such as mass, spin, color, flavor etc. The (in)distinguishability arises from a spatial localization and subsequent ability to tag them through, say, a scattering process. This
is lost when their wave functions overlap, the ‘quantum’ becomes operative and quantum
statistics becomes essential. In the above discussion, no such additional property emerged.
We already have the ‘quantum’ operative so indistinguishability is also operative. It is the
demand of relativistic causality that forced the spin-statistics correlation.
With the anti-commutators around, $[b_{\vec k,\sigma}, b^\dagger_{\vec k',\sigma'}]_+ = \delta_{\sigma,\sigma'}\,\delta^3(\vec k - \vec k')$, $[b_{\vec k,\sigma}, b_{\vec k',\sigma'}]_+ = 0$, we define the number operators for each $(\vec k, \sigma)$ as before: $N := b^\dagger b$. It follows that $[N, b]_- = -b$, $[N, b^\dagger]_- = +b^\dagger$. Eigenvalues of $N$ continue to be integers but thanks to $(b^\dagger)^2 = 0$, they take only two values, 0 and 1. The corresponding eigenstates are $|0\rangle$ and $|1\rangle$. This is the incorporation of the Pauli exclusion principle.
Consider states of two modes (scalar for simplicity), say $\vec k_1, \vec k_2$. A general basis state would be $|\vec k_1, \ldots, \vec k_1, \vec k_2, \ldots, \vec k_2\rangle \sim (a^\dagger_{\vec k_1})^m (a^\dagger_{\vec k_2})^n |0, 0\rangle$.
If the $a$'s anti-commute, then $m, n \le 1$ and we have 4 possible states: $|0, 0\rangle$, $|\vec k_1, 0\rangle$, $|0, \vec k_2\rangle$, $|\vec k_1, \vec k_2\rangle$. Thanks to anti-commutation, $|\vec k_2, \vec k_1\rangle = -|\vec k_1, \vec k_2\rangle$. For bosonic operators, there is no minus sign under exchange. Generalizing from this, it is clear that (anti-)commutation relations among the creation operators automatically ensure complete (anti-)symmetry under permutations of the labels.
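The two-mode fermionic case is small enough to write out as 4x4 matrices. The sketch below is one possible realization, not the lectures' construction: it uses a Jordan-Wigner style sign matrix `Z` on the first mode so that operators of different modes anti-commute, orders the basis as $|n_1, n_2\rangle$ with index $2n_1 + n_2$, and then reads off the sign under exchange. All names are chosen for the example.

```python
B = [[0, 1], [0, 0]]      # single-mode annihilator: B|1> = |0>
Z = [[1, 0], [0, -1]]     # sign (-1)^n that makes different modes anti-commute
ONE = [[1, 0], [0, 1]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    n = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Adjoint; the matrices here are real so this is the transpose."""
    return [list(col) for col in zip(*a)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


b1 = kron(B, ONE)         # annihilates a quantum in mode k1
b2 = kron(Z, B)           # annihilates a quantum in mode k2
vacuum = [1, 0, 0, 0]     # |0, 0>

# One ordering of the creation operators and its exchange
state_12 = apply(dagger(b1), apply(dagger(b2), vacuum))
state_21 = apply(dagger(b2), apply(dagger(b1), vacuum))

if __name__ == "__main__":
    print(state_12, state_21)
    print(matmul(dagger(b1), dagger(b1)))   # (b1^dagger)^2 is the zero matrix
```

The two orderings give the same basis vector with opposite signs, and the square of a creation operator vanishes, which is the Pauli exclusion statement in matrix form.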
To understand the extra minus sign in the third line of eq. (8.6), momentarily write the
Ψ̄ as Ψ̄1 and Ψ as Ψ2 . Then in the third line of the equation, we see an explicit exchange of
Ψ1 and Ψ2 . The fermionic commutation relation then gives that extra minus sign. As noted
before, this is also important for the CPT theorem.
To summarize: Quantum fields are operators tagged by space-time points and obeying certain commutation relations. We may think of them as a collection of creation-annihilation
operators for each mode (solution of the equations of motion). A general state of a quantum
field is a linear combination of basis states which are multi-quanta states. Each quantum
with momentum label ~k and a spin/helicity label σ carries energy-momentum-angular momentum as given by the Poincare charges. As per quantum field theory, all processes involve
exchanges of quanta among different quantum fields.
9. STATES OF FREE QUANTUM FIELDS: PARTICLES, COHERENCE AND COHERENT STATES
While we argued that particle dynamics with position-momentum is not compatible with relativity, we observe particles in a variety of experiments, e.g. as tracks in bubble chambers, photographic emulsions, stacks of particle counters etc., and also use the particle view in designing accelerator beams. So the more sophisticated relativistic framework should also have a suitable description of "particles". Since quantum fields are operators, the description must be in terms of states of a quantum field. In section 7, we saw that from a dynamical view point, a (free) field is an infinite collection of non-interacting harmonic oscillators. Furthermore, its quantization follows from the quantization of oscillators and a quantum field emerges as a linear combination of creation-annihilation operators of its modes. Its states are built by the action of the creation operators on the vacuum state of each mode. We have to identify "particles" in this huge space of states of a quantum field.
A. Particle/anti-particle wave packets
Let $|0\rangle$ denote the unique vacuum state annihilated by all annihilation operators $a_{\vec k}$. We can generate "1-particle states" by taking a linear combination of the form $\sum_{\vec k} \alpha(\vec k)\,a^\dagger_{\vec k}|0\rangle$, "2-particle states" by taking a linear combination of the form $\sum_{\vec k,\vec l} \alpha(\vec k, \vec l)\,a^\dagger_{\vec k}a^\dagger_{\vec l}|0\rangle$ and so on. The totality of all such states forms the state space of a free quantum field. For the n-particle states, the (anti-)commutators of the creation operators automatically take care of (anti-)symmetrization of the states. Notice that a 1-particle state does not mean a single $a^\dagger_{\vec k}|0\rangle$, but a linear combination of several creation operators. These allow us to form wave packets.
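As an explicit example, let us construct a wave packet of an anti-fermion with some momentum and spin distribution. We have the Dirac field,
\[ \Psi(x) = \int [d^3k] \sum_\sigma \Big[ b_{\vec k,\sigma}\,u(\vec k,\sigma)\,e^{ik\cdot x} + d^\dagger_{\vec k,\sigma}\,v(\vec k,\sigma)\,e^{-ik\cdot x} \Big], \quad \text{where,} \quad [d^3k] := \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}}, \quad \text{and } (\slashed{k} + m)u = 0 = (-\slashed{k} + m)v . \]
Acting on the vacuum, it will create a 1-anti-fermion state, the linear combination from the $d^\dagger_{\vec k,\sigma}$ creation operators. Let $f(\vec k', \sigma')$ be a suitable complex valued function. Define
\[ \langle f| := \int d^3k' \sum_{\sigma'} f(\vec k',\sigma')\,\langle 0|\,d_{\vec k',\sigma'}, \qquad \int d^3k \sum_\sigma |f(\vec k,\sigma)|^2 = 1 . \tag{9.1} \]
\[ (\chi)_c(x) := \langle f|\Psi(x)|0\rangle = \int [d^3k] \sum_\sigma f(\vec k,\sigma)\,v(\vec k,\sigma)\,e^{-ik\cdot x} \qquad \because\ \langle 0|\,d_{\vec k',\sigma'}\,d^\dagger_{\vec k,\sigma}|0\rangle = \delta_{\sigma,\sigma'}\,\delta^3(\vec k' - \vec k) \tag{9.2} \]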
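The $(\chi)_c(x)$ so constructed is a spinor-valued function of $x$ which satisfies the Dirac equation. The $v(\vec k,\sigma)$ spinor signifies that it is an anti-fermion wave packet and the suffix $c$ on $\chi$ is a reminder. How do we get a fermion wave packet? Well, in place of the $\Psi(x)$ operator, we use its charge conjugate, $C\bar\Psi^T(x)$ (see 8.4), and in place of the $d_{\vec k,\sigma}$ in $\langle f|$ we use $b_{\vec k,\sigma}$. The corresponding fermion wave packet takes the form,
\[ \langle g| := \int d^3k' \sum_{\sigma'} g(\vec k',\sigma')\,\langle 0|\,b_{\vec k',\sigma'}, \qquad \int d^3k \sum_\sigma |g(\vec k,\sigma)|^2 = 1 . \tag{9.3} \]
\[ \chi(x) := \langle g|\,C\bar\Psi^T(x)\,|0\rangle \ \text{ and } \ \langle 0|\,b_{\vec k',\sigma'}\,b^\dagger_{\vec k,\sigma}|0\rangle = \delta_{\sigma,\sigma'}\,\delta^3(\vec k' - \vec k) \ \Rightarrow\ \chi(x) = \int [d^3k] \sum_\sigma g(\vec k,\sigma)\,u_c(\vec k,\sigma)\,e^{-ik\cdot x} \tag{9.4} \]
where $u_c(\vec k,\sigma) = (i\gamma^2 u^*(\vec k,\sigma))$ is the charge conjugate of the $u(\vec k,\sigma)$ spinor.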
The generalization for wave packets of other fields should be obvious. All these wave
packets are positive or negative frequency solutions of the corresponding field equations.
We already have discussed the inner products on these spaces and we may view these as the
usual probability amplitudes. Of course we do not have the analogue of position operator
unless we restrict to a non-relativistic approximation/limit. Suffice it to say that one can
use Dirac equation to analyze beams of relativistic fermions, as is done for instance in [8].
B.
Correlation Functions and Coherence of states
There are of course more general states which are not labeled by the number of quanta. Even for a single oscillator, states with a given occupation number are only one class of states. We can have finite or infinite linear combinations of these number states, squeezed states, coherent states etc. They are distinguished by their properties and utility. A somewhat more general characterization of quantum states is in terms of correlation functions. Let us see an example of this.
Consider a real scalar quantum field. We have its mode decomposition and had also defined the $\phi_\pm(x)$ fields consisting of the positive/negative frequency parts. Consider the measurement of the operator $\phi_-(x)\phi_+(x')$. Its average value in some state, which is kept implicit, is denoted as
\[ G(x, x') := \langle \phi_-(x)\,\phi_+(x') \rangle \qquad \text{(a correlation function)} \]
\[ = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int \frac{d^3k'}{\sqrt{2\omega_{\vec k'}(2\pi)^3}}\; e^{-ik\cdot x + ik'\cdot x'}\,\langle a^\dagger_{\vec k}\,a_{\vec k'} \rangle \]
where $\langle a^\dagger_{\vec k}a_{\vec k'} \rangle := Tr(\rho\,a^\dagger_{\vec k}a_{\vec k'})$ and thus the measured average value depends on the density matrix or "state", $\rho$. The usual Schrodinger equation for the ket vectors becomes the equation $i\,d_t\rho = [H, \rho]$ whose solution is $\rho(t) = e^{-itH}\rho_0\,e^{itH}$ for a time independent Hamiltonian. These are the only Hamiltonians considered below. For example, the Hamiltonian of our scalar field is given by the quantum version of (6.15).
A time-independent density matrix $\rho$ is said to represent a stationary state. Thus for a stationary state, $[\rho, H] = 0$ which implies $\rho = e^{-itH}\rho\,e^{itH}\ \forall\, t$.
\[ Tr\big[\rho\,a^\dagger_{\vec k}a_{\vec k'}\big] = Tr\big[e^{-itH}\rho\,e^{itH}\,a^\dagger_{\vec k}a_{\vec k'}\big] = Tr\big[\rho\,e^{itH}a^\dagger_{\vec k}a_{\vec k'}e^{-itH}\big] \]
\[ = Tr\big[\rho\,(e^{itH}a^\dagger_{\vec k}e^{-itH})\,(e^{itH}a_{\vec k'}e^{-itH})\big] \]
\[ = Tr\big[\rho\,a^\dagger_{\vec k}a_{\vec k'}\big]\,e^{it(\omega_{\vec k} - \omega_{\vec k'})} \qquad \because\ H \sim \int [d^3k'']\,\omega_{\vec k''}\,N_{\vec k''} \]
The left hand side is time independent, so $Tr[\rho\,a^\dagger_{\vec k}a_{\vec k'}] = 0$ unless $\omega_{\vec k} = \omega_{\vec k'}$, i.e. unless $|\vec k| = |\vec k'|$. For the same frequency, the trace need not vanish and we have a matrix in the labels $\vec k, \vec k'$. This matrix is hermitian ($\rho$ is hermitian) and can be diagonalised. Hence, without loss of generality, for a stationary state, $Tr[\rho\,a^\dagger_{\vec k}a_{\vec k'}] =: n_{\vec k}\,\delta_{\vec k,\vec k'}$, where $n_{\vec k}$ denotes the average number of quanta in the $\vec k$ mode. This leads to,
\[ G(x, x') = \int \frac{d^3k}{2\omega_{\vec k}(2\pi)^3}\; n_{\vec k}\; e^{i\omega_{\vec k}(t-t') - i\vec k\cdot(\vec x - \vec x')} \qquad \text{(in a stationary state).} \]
Clearly, a stationary state leads to translational invariance of the above correlation function and this correlation function contains the information of the average number of quanta in each mode.
Assume now that the average number of quanta is vanishingly small outside a small neighbourhood of some direction $\hat k_0$ (collimation) and also outside a small neighbourhood of $\omega_{\vec k_0}$ (monochromaticity). We may then approximate the integral as,
\[ G(x, x') \approx \frac{N_0}{2\omega_{\vec k_0}(2\pi)^3}\; e^{i\omega_{\vec k_0}(t-t') - i\vec k_0\cdot(\vec x - \vec x')}, \qquad N_0 \approx \int_{ngbd(\vec k_0)} \frac{d^3k}{2\omega_{\vec k}(2\pi)^3}\; n_{\vec k} \]
It is apparent that the correlation function has a factorised form: $G(x, x') \approx \varphi^*(x)\,\varphi(x')$, $\varphi(x) := \sqrt{\frac{N_0}{2\omega_{\vec k_0}(2\pi)^3}}\; e^{ik_0\cdot x}$, which is a plane wave. A state $\rho$ with such a form of the correlation function is said to have first order coherence (footnote 8).
Notice that the factorised form results when the integral can be approximated. This in turn is valid for $|\vec x - \vec x'| \approx (2\pi)/|\vec k_0|$ and $|t - t'| \approx (2\pi)/\omega_{\vec k_0}$. Furthermore, the imperfections in the monochromaticity and collimation affect the approximation and also limit the validity of the first order coherence. This introduces the ideas of coherence time and coherence length as time and space intervals over which the factorization property or coherence property holds. Thus a stationary state with first order coherence, with finite coherence length and time, may be interpreted as a "beam" along the direction $\hat k_0$ with intensity proportional to $N_0\,\omega_{\vec k_0}$.
Let us continue further and assume perfect collimation along say the x-axis. The beam
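may have a distribution of frequencies though. The phase$^9$ then becomes $i\omega_{\vec k}(t - t' - x + x')$ and $G(x, x') =: G(\tau)$, $\tau := (t - t' - x + x')$. Since we have a single direction, we can denote the dependence on $\vec k$ by the frequency $\omega$. Suppose further that $n_{\vec k} =: n_\omega$ has a Gaussian dependence on $\omega$: $n_\omega = A\exp\big[-\frac{(\omega-\omega_0)^2}{2\sigma^2}\big]$. For a narrow line, $\sigma \ll \omega_0$, the slowly varying factor $1/\omega$ may be replaced by $1/\omega_0$ and the lower limit extended to $-\infty$. The integral can then be done and we get,
\[
G(\tau) = A\int_0^\infty \frac{d\omega}{\omega(2\pi)}\; e^{-\frac{(\omega-\omega_0)^2}{2\sigma^2}}\, e^{i\omega\tau} \approx G(0)\,e^{i\omega_0\tau}\,e^{-\frac{\sigma^2\tau^2}{2}} .
\]
Thus, a highly collimated beam with a frequency range of $\sigma$ around $\omega_0$, has the correlation function $G(x, x')$ decaying over $\tau = (t - t' - x + x') \sim \sigma^{-1}$. For example, if we have say a laser beam with $\omega_0 \sim 10^{14}$ Hz and $\sigma \sim 10^{6}$ Hz, then $\tau \sim 10^{-6}$ seconds or about $10^{2}$ meters.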
Since the beam is presumed to be a stationary state, the τ does not vary with the time, t (or
t0 ). However, beyond the coherence length, the beam will show frequency dispersion. Note
that first order coherence suffices for the notions of coherence length and time.
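The order-of-magnitude estimate in the laser example can be reproduced in a few lines of Python. This is only an illustrative sketch: the function name `coherence_scales` is ours, and the coherence time is simply taken to be $1/\sigma$, ignoring factors of order one.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s

def coherence_scales(sigma_hz):
    """Return (coherence time in s, coherence length in m) for a frequency spread sigma_hz."""
    tau = 1.0 / sigma_hz          # G(tau) ~ exp(-sigma^2 tau^2 / 2) decays over tau ~ 1/sigma
    return tau, C_LIGHT * tau     # distance travelled by light in that time

tau, length = coherence_scales(1.0e6)   # sigma ~ 10^6 Hz, as in the laser example
print(f"coherence time ~ {tau:.1e} s, coherence length ~ {length:.0f} m")
```

A spread of $10^6$ Hz thus corresponds to a coherence length of a few hundred meters, i.e. of order $10^2$ meters.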
$^8$ $k$-th order coherence is defined in terms of the factorization property of $G_n(x_1, \ldots, x_n, y_1, \ldots, y_n) := \langle\phi_-(x_1)\ldots\phi_-(x_n)\,\phi_+(y_1)\ldots\phi_+(y_n)\rangle = E^*(x_1)\ldots E^*(x_n)\,E(y_1)\ldots E(y_n)$, where $E(x)$ is independent of $n$ for all $1 \le n \le k$ [7]. The coherent states, eigenstates of the annihilation operators, familiar from the harmonic oscillator have infinite order coherence.
$^9$ The formula is given for a massless field. More generally, we will have $\tau(\omega) = t - t' - \sqrt{1 - m^2/\omega^2}\,(x - x')$. For very high frequency/energy, the mass can be neglected.
Remark: In this example, we have stipulated the properties of stationarity and 1st order
coherence to restrict the implicit state of the quantum field. These stipulations do not single
out a particular state, but permit a subset of states ρ. All such states suffice to describe a
beam with stable properties over the coherence interval.
C. Coherent States
There is a very important class of states of quantum fields, especially for electromagnetic
field, namely the class of coherent states. These are familiar from the quantum harmonic
oscillator. Let us recall their construction briefly.
1. An aside: Harmonic Oscillator States
With the usual creation-annihilation operators, $[a, a^\dagger] = 1$, we take the oscillator variables as:
\[
q(t) := a\,e^{-i\omega t} + a^\dagger e^{i\omega t} =: q_+(t) + q_-(t) \ , \qquad
p(t) := -\frac{i}{2}\,a\,e^{-i\omega t} + \frac{i}{2}\,a^\dagger e^{i\omega t} \ \leftrightarrow\ [q(t), p(t)] = i .
\]
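Define:
(i) Displacement operators: For each $\alpha \in \mathbb{C}$,
\[
D(\alpha) := e^{\alpha a^\dagger - \alpha^* a} = e^{-\frac{|\alpha|^2}{2}}\,e^{\alpha a^\dagger}e^{-\alpha^* a} \qquad \therefore\quad D^\dagger(\alpha) = e^{-\frac{|\alpha|^2}{2}}\,e^{-\alpha a^\dagger}e^{\alpha^* a} = D(-\alpha) .
\]
Using the BCH formula,
\[
D^\dagger(\alpha)D(\alpha) = 1 \ , \quad D^\dagger(\alpha)\,a\,D(\alpha) = a + \alpha 1 \ , \quad D^\dagger(\alpha)\,a^\dagger D(\alpha) = a^\dagger + \alpha^* 1 \ , \quad\text{and}\quad
D(\alpha + \beta) = D(\alpha)\,D(\beta)\,e^{-\frac{1}{2}(\alpha\beta^* - \alpha^*\beta)} .
\]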
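(ii) Squeezing operators: for each complex number $\epsilon := re^{i\phi}$,
\[
S(\epsilon) := e^{\frac{1}{2}(\epsilon^* a^2 - \epsilon (a^\dagger)^2)} \ , \quad S^\dagger(\epsilon) = S(-\epsilon) \ , \quad S^\dagger(\epsilon)S(\epsilon) = 1 ,
\]
\[
S^\dagger(\epsilon)\,a\,S(\epsilon) = a\,\mathrm{ch}(r) - a^\dagger e^{-2i\phi}\,\mathrm{sh}(r) \ , \quad S^\dagger(\epsilon)\,a^\dagger S(\epsilon) = a^\dagger\mathrm{ch}(r) - a\,e^{2i\phi}\,\mathrm{sh}(r) .
\]
These operators define a two parameter family of states, squeezed coherent states, as
$|\alpha, \epsilon\rangle := D(\alpha)S(\epsilon)|0\rangle$ ; $\epsilon = 0 \leftrightarrow$ (coherent states), $\alpha = 0 \leftrightarrow$ (squeezed states).
For this family of states, it is straightforward to see,
\[
\begin{aligned}
\langle\alpha,\epsilon|a^2|\alpha,\epsilon\rangle &= -e^{-2i\phi}\,\mathrm{ch}(r)\,\mathrm{sh}(r) + \alpha^2 \ , & \langle\alpha,\epsilon|a|\alpha,\epsilon\rangle &= \alpha \ ,\ \forall\ \epsilon \\
\langle\alpha,\epsilon|(a^\dagger)^2|\alpha,\epsilon\rangle &= -e^{2i\phi}\,\mathrm{ch}(r)\,\mathrm{sh}(r) + (\alpha^*)^2 \ , & \langle\alpha,\epsilon|a^\dagger|\alpha,\epsilon\rangle &= \alpha^* \\
\langle\alpha,\epsilon|N|\alpha,\epsilon\rangle &= |\alpha|^2 + \mathrm{sh}^2(r) \\
a|\alpha,\epsilon\rangle &= -\mathrm{sh}(r)\,e^{-2i\phi}\,D(\alpha)S(\epsilon)\,a^\dagger|0\rangle + \alpha|\alpha,\epsilon\rangle .
\end{aligned}
\]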
The last equation shows that only for the squeezing parameter, r = 0, |α, 0i is an eigenstate
of the annihilation operator which is the usual definition of coherent states.
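These expectation values can be checked numerically in a truncated Fock space. The sketch below is illustrative only: the helper names, the cutoff `dim` and the parameter values are our own choices, and the operators $D(\alpha)$, $S(\epsilon)$ are built by exponentiating their anti-Hermitian generators. It checks $\langle a\rangle = \alpha$, $\langle N\rangle = |\alpha|^2 + \mathrm{sh}^2(r)$, and that the eigenvalue equation for $a$ fails by a vector of norm $\mathrm{sh}(r)$, so it holds only for $r = 0$.

```python
import numpy as np

def exp_antihermitian(m):
    """exp(m) for an anti-Hermitian matrix m, via the spectral decomposition of H = -i m."""
    evals, vecs = np.linalg.eigh(-1j * m)
    return (vecs * np.exp(1j * evals)) @ vecs.conj().T

def squeezed_coherent_state(alpha, eps, dim=120):
    """Return (D(alpha) S(eps)|0>, a) in a Fock space truncated to `dim` levels."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)  # a|n> = sqrt(n)|n-1>
    adag = a.conj().T
    D = exp_antihermitian(alpha * adag - np.conj(alpha) * a)
    S = exp_antihermitian(0.5 * (np.conj(eps) * (a @ a) - eps * (adag @ adag)))
    vacuum = np.zeros(dim, dtype=complex)
    vacuum[0] = 1.0
    return D @ (S @ vacuum), a

alpha, r = 0.7 + 0.3j, 0.5
psi, a = squeezed_coherent_state(alpha, r * np.exp(0.4j))
mean_a = np.vdot(psi, a @ psi)                       # should equal alpha
mean_n = np.vdot(psi, a.conj().T @ (a @ psi)).real   # should equal |alpha|^2 + sh^2(r)
residual = np.linalg.norm(a @ psi - alpha * psi)     # should equal sh(r)
print(mean_a, mean_n, abs(alpha) ** 2 + np.sinh(r) ** 2, residual, np.sinh(r))
```

The truncation is harmless here because the state has negligible support near the cutoff for these parameter values.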
The Heisenberg uncertainty relation takes the form,
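\[
\begin{aligned}
A &:= 1 + 2\langle N\rangle - 2|\langle a\rangle|^2 = 1 + 2\,\mathrm{sh}^2(r) \\
B &:= (\Delta a)^2 e^{-2i\omega t} + (\Delta a^\dagger)^2 e^{2i\omega t} = -2\,\mathrm{ch}(r)\,\mathrm{sh}(r)\cos\big(2(\omega t + \phi)\big) \\
(\Delta q)^2(t) &= A + B \ , \qquad 4(\Delta p)^2(t) = A - B
\end{aligned}
\]
Clearly, for the coherent states $r = 0 \Rightarrow A = 1, B = 0$ and $(\Delta q)^2(\Delta p)^2 = \frac{1}{4}$, which saturates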
the uncertainty bound. The uncertainties are also time independent. Some of the basic
properties of the coherent states are:
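\[
|\alpha\rangle = |\alpha, 0\rangle = D(\alpha)|0\rangle \ \Rightarrow\ \langle\alpha|\alpha\rangle = 1 \tag{9.5}
\]
\[
|\alpha\rangle = e^{-\frac{|\alpha|^2}{2}}\sum_{n=0}^{\infty}\frac{\alpha^n}{\sqrt{n!}}\,|n\rangle \tag{9.6}
\]
\[
\langle\beta|\alpha\rangle = e^{-\frac{|\alpha|^2 + |\beta|^2}{2} + \alpha\beta^*} \tag{9.7}
\]
\[
1 = \frac{1}{\pi}\int d^2\alpha\; |\alpha\rangle\langle\alpha| = \sum_{n\ge 0}|n\rangle\langle n| . \tag{9.8}
\]
2. Correlation functions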
This subsection is based on [6, 7].
We begin with the definition:
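\[
G_n(t_1, \ldots, t_n; t_{n+1}, \ldots, t_{2n}) := \langle q_-(t_1)\ldots q_-(t_n)\,q_+(t_{n+1})\ldots q_+(t_{2n})\rangle \tag{9.9}
\]
\[
= e^{i\omega(t_1 + \ldots + t_n - t_{n+1} - \cdots - t_{2n})}\;\mathrm{Tr}\big[\rho\,(a^\dagger)^n a^n\big] \tag{9.10}
\]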
Since the q̂− and q̂+ commute among themselves, the ordering of (t1 , . . . tn ) and (tn+1 . . . t2n )
is unimportant. Note that for (and only for) ρ = |0ih0|, the correlation functions vanish
identically. In the following, this density matrix is excluded.
We invoke the general result: $\forall\ \hat A$, and $\forall\ \rho$, $\mathrm{Tr}[\rho A^\dagger A] \ge 0$.
Taking $A := \sum_i \alpha_i\,\hat q_+(t_i)$,
\[
\langle A^\dagger A\rangle = \sum_{i,j}\alpha_i^*\,G_1(t_i, t_j)\,\alpha_j \ge 0 \quad \forall\ \alpha_i\text{'s} .
\]
Hence, the matrix $G_1(t_i, t_j)$ is non-negative. So its determinant is non-negative. Thus,
\[
G_1(t_i, t_i)\,G_1(t_j, t_j) - G_1(t_i, t_j)\,G_1(t_j, t_i) \ge 0 .
\]
From the definition of $G_1$, it follows that $G_1(t_j, t_i) = G_1(t_i, t_j)^*$. Therefore $G_1(t, t)$ is real and in fact non-negative. The non-negative determinant condition becomes $|G_1(t_i, t_j)|^2 \le G_1(t_i, t_i)\,G_1(t_j, t_j)$. This leads to the definition of the normalized correlation functions,
\[
g^n(t_1, \ldots, t_{2n}) := \frac{G_n(t_1, \ldots, t_{2n})}{\sqrt{G_1(t_1, t_1)\ldots G_1(t_{2n}, t_{2n})}} \ , \tag{9.11}
\]
The determinant condition may be expressed as, $|g^1(t_i, t_j)| \le 1$. Using the explicit definition,
\[
g^1(t_i, t_j) = \frac{e^{i\omega(t_i - t_j)}\,\mathrm{Tr}(\rho a^\dagger a)}{\sqrt{\mathrm{Tr}(\rho a^\dagger a)\,\mathrm{Tr}(\rho a^\dagger a)}} = e^{i\omega(t_i - t_j)} \qquad \therefore\ |g^1(t_i, t_j)| = 1 .
\]
kth order Coherence :
A state ρ is said to have k th -order coherence, if |g n (t1 , . . . , t2n )| = 1 ∀ 1 ≤ n ≤ k .
Remark: This is a property of a state, defined with respect to a specific kind of correlation
function. Thus we could have different notions of coherence, if we adopt different correlation
functions. The choice of this specific class of correlation functions has its roots in the
measurements in optics which involves electromagnetic field. The correlation functions are
also defined for classical fields (with the averages defined as ensemble averages over repeated
observations) and have very similar properties as the quantum counterparts. The distinctions
begin to show up when the intensities of the light beams is decreased to almost “single
photon” level.
Remark: In light of the result $|g^1(t_i, t_j)| = 1$ obtained above, every state has 1st order coherence.
It follows immediately that for $k$-th order coherence, $|G_n(t_1, \ldots, t_{2n})|^2 = \prod_i G_1(t_i, t_i)$.
Result: A state has a k th order coherence iff there exist a function E(t), independent of
n, such that
Gn (t1 , . . . , t2n ) = E ∗ (t1 ) . . . E ∗ (tn )E(tn+1 ) . . . E(t2n ) ∀ 1 ≤ n ≤ k.
The proof is easy. First prove it for G1 (t1 , t2 ). From this the result for Gn follows. This
factorization property is another characterization of k th order coherence.
Among the pure states, we have states which are finite linear combinations of number
states and those with infinite linear combinations. Let |ψi be a state with finite linear combinations of the number eigenstates. Let K be the maximum eigenvalue of N appearing in
the linear combination. Then $a^n|\psi\rangle = 0\ \forall\ n > K$. Since, for pure states, $|g^n| = \frac{\langle\psi|(a^\dagger)^n a^n|\psi\rangle}{(\langle\psi|a^\dagger a|\psi\rangle)^n}$, it follows that $|g^n| = 0\ \forall\ n > K$. Hence, finite linear combinations are necessarily at most partially coherent i.e. their order of coherence can be at most $K$. A fully coherent state must have infinite linear combinations.
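The statement about finite superpositions is easy to check numerically, since $(a^\dagger)^n a^n = N(N-1)\cdots(N-n+1)$ is diagonal in the number basis and $|g^n|$ depends only on the probabilities $p_m = |c_m|^2$. The sketch below is an illustration with names of our own choosing (`g_n`, `poisson`); it compares the state $(|1\rangle + |3\rangle)/\sqrt{2}$, for which $K = 3$, with a coherent state of mean occupation 2.

```python
from math import exp, factorial

def g_n(populations, n):
    """|g^n| = <(a†)^n a^n> / <a† a>^n from number-state probabilities p_m = |c_m|^2."""
    def falling(m, k):              # m (m-1) ... (m-k+1); vanishes for m < k
        out = 1
        for j in range(k):
            out *= (m - j)
        return out
    numerator = sum(p * falling(m, n) for m, p in enumerate(populations))
    mean_number = sum(p * m for m, p in enumerate(populations))
    return numerator / mean_number ** n

def poisson(mean, terms=80):
    """Number distribution of a coherent state with |alpha|^2 = mean (truncated sum)."""
    return [exp(-mean) * mean ** m / factorial(m) for m in range(terms)]

finite_state = [0.0, 0.5, 0.0, 0.5]          # (|1> + |3>)/sqrt(2), K = 3
for n in range(1, 5):
    print(n, g_n(finite_state, n), g_n(poisson(2.0), n))
```

For the finite superposition the values are 1, 0.75, 0.375 and 0 for $n = 1, \ldots, 4$, so already $|g^2| \neq 1$ and $|g^n|$ vanishes beyond $K$, while the coherent state gives 1 for every $n$ (up to rounding).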
But not all infinite linear combinations have full coherence! Here are two examples.
Coherent states $|\alpha, 0\rangle$: We have $|g^n| = \frac{(\alpha^*)^n\alpha^n}{(|\alpha|^2)^n} = 1\ \forall\ n \ge 1$ and coherent states are fully coherent. These are the only non-vacuum states which have infinite order coherence. This may be checked by considering an arbitrary, infinitesimal perturbation of a coherent state, $|\Psi\rangle := (1 - \epsilon/2)|\alpha\rangle + \epsilon/2\,|\phi\rangle$, $|\phi\rangle \neq |0\rangle$ and computing $\langle\Psi|(a^\dagger)^n a^n|\Psi\rangle - (\langle\Psi|a^\dagger a|\Psi\rangle)^n$ to first order in $\epsilon$.
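Squeezed vacuum $|0, \epsilon\rangle$: Now,
\[
|g^2| = \frac{\langle 0|S^\dagger(\epsilon)(a^\dagger)^2 a^2 S(\epsilon)|0\rangle}{(\langle 0|S^\dagger(\epsilon)\,a^\dagger a\,S(\epsilon)|0\rangle)^2}
\]
This evaluates to,
Numerator:
\[
\begin{aligned}
&= \langle 0|S^\dagger a^\dagger S\; S^\dagger a^\dagger S\; S^\dagger a S\; S^\dagger a S|0\rangle \\
&= \langle 0|(\mathrm{ch}(r)a^\dagger - \mathrm{sh}(r)e^{2i\phi}a)(\mathrm{ch}(r)a^\dagger - \mathrm{sh}(r)e^{2i\phi}a)(\mathrm{ch}(r)a - \mathrm{sh}(r)e^{-2i\phi}a^\dagger)(\mathrm{ch}(r)a - \mathrm{sh}(r)e^{-2i\phi}a^\dagger)|0\rangle \\
&= (\mathrm{sh}(r))^2\,\langle 0|a\,(\mathrm{ch}(r)a^\dagger - \mathrm{sh}(r)e^{2i\phi}a)(\mathrm{ch}(r)a - \mathrm{sh}(r)e^{-2i\phi}a^\dagger)\,a^\dagger|0\rangle \\
&= (\mathrm{sh}(r))^2\,\big[(\mathrm{ch}(r))^2 + 2(\mathrm{sh}(r))^2\big]
\end{aligned}
\]
Denominator: $(\langle 0|S^\dagger N S|0\rangle)^2 = (\mathrm{sh}(r))^4 \ \Longrightarrow$
\[
|g^2| = \frac{1 + 3(\mathrm{sh}(r))^2}{(\mathrm{sh}(r))^2} = 3 + \frac{1}{(\mathrm{sh}(r))^2} > 1 \quad \forall\ |0, \epsilon\rangle ,\ r \neq 0 .
\]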
Thus while full coherence requires infinite linear combinations, they do not guarantee
even a second order coherence!
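The squeezed vacuum ratio can also be evaluated directly from its number distribution. The sketch below assumes the standard closed form for the squeezed vacuum, in which only even levels are occupied with $p_{2n} = \frac{(\tanh r)^{2n}(2n)!}{4^n (n!)^2\,\mathrm{ch}(r)}$; the function name and the number of terms are our own choices. With numerator $\mathrm{sh}^2(r)\,[\mathrm{ch}^2(r) + 2\,\mathrm{sh}^2(r)]$ and $\langle N\rangle = \mathrm{sh}^2(r)$, the ratio $\langle (a^\dagger)^2 a^2\rangle/\langle a^\dagger a\rangle^2$ should come out as $3 + 1/\mathrm{sh}^2(r)$.

```python
from math import cosh, sinh, tanh

def squeezed_vacuum_g2(r, terms=4000):
    """<(a†)^2 a^2> / <a† a>^2 for the squeezed vacuum, summed over its number distribution."""
    p = 1.0 / cosh(r)          # p_0
    t2 = tanh(r) ** 2
    mean_n = 0.0               # <N>
    mean_nn1 = 0.0             # <N(N-1)> = <(a†)^2 a^2>
    for n in range(terms):
        m = 2 * n              # only even levels are populated
        mean_n += p * m
        mean_nn1 += p * m * (m - 1)
        p *= t2 * (2 * n + 1) / (2 * (n + 1))   # ratio p_{2n+2} / p_{2n}
    return mean_nn1 / mean_n ** 2

for r in (0.3, 1.0):
    print(r, squeezed_vacuum_g2(r), 3.0 + 1.0 / sinh(r) ** 2)
```

The value is always larger than 3, so the squeezed vacuum fails second order coherence for every $r \neq 0$.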
Points to note as a summary:
• The quantum framework has vastly more states than the classical framework. The observed measurements are always averages with uncertainties. There are many states which have the same average of a given set of observables. While uncertainties serve to distinguish among these, alternative attributes are provided by the correlation functions and the notion of coherence. There is at least one class of states, the coherent states, which have full coherence, have time independent uncertainties (for oscillator dynamics) and have the smallest uncertainties compatible with the Heisenberg relation. This class of states is large enough to provide an (over-complete) basis for the Hilbert space.
• We used a specific kind of correlation function and defined coherence of states w.r.t.
these. These correlators have equal number of creation and annihilation operators
which are ordered with annihilation operators to the right (the so-called normal order).
This is premised on typical detectors working by absorbing quanta. For opposite types
of detectors, anti-normal order would be appropriate. The correlators with unequal
number of a, a† operators, are used in measurement of phase information.
All these correlation functions are premised on processes being understood as emission
and absorption of quanta.
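• Generalizing, the coherent states of fields are essentially tensor products of single mode coherent states. We may thus denote a coherent state of a scalar field as $|\{\alpha_{\vec k}\}\rangle$ with $\alpha_{\vec k} := \langle\alpha_{\vec k}|a_{\vec k}|\alpha_{\vec k}\rangle$. Then,
\[
\varphi(x) := \langle\{\alpha_{\vec k}\}|\hat\phi(x)|\{\alpha_{\vec k}\}\rangle = \int d^3k\,\big(\alpha_{\vec k}\,e^{ik\cdot x} + \alpha^*_{\vec k}\,e^{-ik\cdot x}\big) .
\]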
Thus every coherent state of the quantum field gives a classical solution determined
by the complex parameters α~k ’s. Conversely, we may interpret a classical (free) field
as the expectation value of the quantum field in the corresponding coherent state. It
will faithfully mimic the usual classical description eg descriptions of wave solutions
etc. This is how one understands the usual classical electromagnetic waves.
• There is also a generalization of coherent states for fermionic fields. This uses Grassmann variables, their algebra and calculus. The constructions are very analogous. I
refer you to [7].
D. Evolution into a coherent state
Since we see a large set of classical field solutions, a question arises: why does a quantum system go into a coherent state? Under what conditions are these states manifested?
Under a free evolution (linear equation), a state which is, say, a finite linear combination of the number states will continue to be a finite linear combination and will not evolve into a coherent state; interactions are essential. It suffices to have an interaction with a classical source i.e. a c-number source. This is demonstrated most easily by using the interaction picture for a single harmonic oscillator.
Recall:
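Let $|\psi\rangle_S$ denote a state vector in the Schrodinger picture and $A_S$ denote a generic operator without explicit time dependence and let the Hamiltonian be also time independent. The Schrodinger picture is defined by: $i d_t|\psi(t)\rangle_S = H_S|\psi(t)\rangle_S$, $d_t A_S = 0$. Define a new ‘picture I’ with an arbitrary unitary operator $V(t)$, $V(0) = 1$ as,
\[
\begin{aligned}
|\psi(t)\rangle_I &:= V^\dagger(t)|\psi(t)\rangle_S \ , \quad A_I(t) := V^\dagger(t)A_S V(t) \ \Rightarrow \\
i d_t|\psi(t)\rangle_I &= \big(i d_t V^\dagger(t)\big)|\psi(t)\rangle_S + V^\dagger(t)H_S|\psi(t)\rangle_S = V^\dagger(t)\big[-i d_t + H_S\big]V(t)\,|\psi(t)\rangle_I \\
\therefore\ i d_t|\psi(t)\rangle_I &= \big(H_I - V^\dagger(t)\,i d_t V(t)\big)|\psi(t)\rangle_I \ , \quad H_I(t) := V^\dagger(t)H_S V(t) . \ \text{Similarly,} \\
i d_t A_I(t) &= \big[A_I(t), V^\dagger(t)\,i d_t V(t)\big] .
\end{aligned}
\]
Here the derivative inside the square bracket acts on $V(t)$ only, and we used $(d_t V^\dagger)V = -V^\dagger d_t V$ which follows from $V^\dagger V = 1$.
For the choice $d_t V(t) = 0$, we get back the Schrodinger picture. For the choice $i d_t V(t) = V(t)H_I(t) \leftrightarrow i d_t V(t) = H_S V(t)$, we get $d_t|\psi(t)\rangle_I = 0$, $i d_t A_I(t) = [A_I(t), H_I(t)]$ and we have the Heisenberg picture.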
Let us split the Schrodinger Hamiltonian as $H_S := H_0 + H'$ where $H_0$ is the “free” Hamiltonian and $H'$ is the “interaction” Hamiltonian. Now we choose, $i d_t V_I(t) = H_0 V_I(t)$. This results in the equations,
\[
i d_t|\psi(t)\rangle_I = (H_I - (H_0)_I)|\psi(t)\rangle_I =: H'_I|\psi(t)\rangle_I \ , \quad i d_t A_I(t) = [A_I, (H_0)_I] \tag{9.12}
\]
This is the “Dirac” or the “interaction” picture wherein the states evolve by interaction Hamiltonian while the operators evolve by the free Hamiltonian. Notice that $(H_0)_I(t) = H_0$.
Consider now a quantum harmonic oscillator interacting with a “classical source” i.e. $H = (\frac{1}{2}\hat p^2 + \frac{1}{2}\omega^2\hat q^2) + J(t)\hat q(t)$, where $J(t)$ is an externally prescribed real function of time.
We express the Hamiltonian as,
\[
H = \omega\big(a^\dagger a + \tfrac{1}{2}\big) + J(t)(a + a^\dagger) =: H_0 + H' .
\]
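In the interaction picture,
\[
\begin{aligned}
i d_t a_I(t) &= [a_I(t), H_0] = \omega\big[a_I, a_I^\dagger a_I\big] = \omega a_I \\
\therefore\ a_I(t) &= e^{-i\omega t}a_I(0) \ , \quad a_I^\dagger(t) = e^{+i\omega t}a_I^\dagger(0) \\
\therefore\ H'_I &= J(t)e^{-i\omega t}a_I(0) + J(t)e^{i\omega t}a_I^\dagger(0) .
\end{aligned}
\]
The state vector evolves by $H'_I(t)$. Since the commutator $[H'_I(t), H'_I(t')]$ is a c-number, time ordering contributes only a phase factor, and up to this phase, $|\psi(t)\rangle_I = \exp\big[-i\int_0^t dt'\,H'_I(t')\big]|\psi(0)\rangle_I$. The exponential simplifies as,
\[
e^{-i\int_0^t dt'\,H'_I(t')} = e^{-i\int_0^t dt'\,J(t')\left[e^{-i\omega t'}a + e^{i\omega t'}a^\dagger\right]} = e^{\alpha(t)a^\dagger - \alpha^*(t)a} \ , \quad a := a_I(0) \ ,\ a^\dagger := a_I^\dagger(0) \ , \quad \alpha(t) := -i\int_0^t dt'\,J(t')\,e^{i\omega t'} .
\]
Evidently,
\[
|\psi(t)\rangle_I = D(\alpha(t))|\psi(0)\rangle_I .
\]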
We see immediately that if |ψ(0)iI = |0i, then at each t, |ψ(t)iI is the coherent state |α(t)i.
Thus a quantum oscillator interacting with a classical source (linearly coupled) will evolve
into a coherent state from its ground state.
The generalization to a quantum field is immediate. A quantum field in its ground state
(vacuum state), upon interacting (linear coupling) with a classical source, will evolve into a
coherent state.
Remark: You may try out a different coupling to a classical source, e.g. $J(t)q^2$, to see what
happens. This is indeed what happens with a quantum field in an expanding FRW universe.
The result is squeezed states of the field modes, in exactly the same manner. If the source is
also an operator, then in general we do not expect the state to evolve into a coherent state.
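The evolution of the ground state into a coherent state can be seen numerically as well. The sketch below is illustrative and rests on our own choices: a constant source $J(t) = J_0$, a Fock space truncated to `dim` levels, and a plain fourth order Runge-Kutta integration of $i d_t|\psi\rangle_I = H'_I(t)|\psi\rangle_I$ with $a_I(t) = e^{-i\omega t}a$. For a coherent state the number distribution must be Poissonian with mean $|\alpha(t)|^2$, which for a constant source is $4J_0^2\sin^2(\omega t/2)/\omega^2$ and does not depend on any overall phase.

```python
import numpy as np
from math import factorial

def driven_oscillator_populations(J0=0.5, omega=1.0, t_final=2.0, dim=40, steps=4000):
    """Integrate the interaction-picture equation from |0> and return |<n|psi(t_final)>|^2."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1).astype(complex)  # a|n> = sqrt(n)|n-1>
    adag = a.conj().T

    def rhs(t, psi):
        # H'_I(t) = J0 (e^{-i omega t} a + e^{+i omega t} a†), constant source J(t) = J0
        h_int = J0 * (np.exp(-1j * omega * t) * a + np.exp(1j * omega * t) * adag)
        return -1j * (h_int @ psi)

    psi = np.zeros(dim, dtype=complex)
    psi[0] = 1.0                      # start in the ground state
    dt = t_final / steps
    t = 0.0
    for _ in range(steps):            # classical RK4 step
        k1 = rhs(t, psi)
        k2 = rhs(t + dt / 2, psi + dt / 2 * k1)
        k3 = rhs(t + dt / 2, psi + dt / 2 * k2)
        k4 = rhs(t + dt, psi + dt * k3)
        psi = psi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return np.abs(psi) ** 2

populations = driven_oscillator_populations()
mean = 4 * 0.5 ** 2 * np.sin(1.0) ** 2          # |alpha(t)|^2 for J0 = 0.5, omega = 1, t = 2
poisson = np.array([np.exp(-mean) * mean ** n / factorial(n) for n in range(10)])
print(np.max(np.abs(populations[:10] - poisson)))
```

The printed deviation from the Poisson distribution is at the level of the integration error, consistent with $|\psi(t)\rangle_I$ being a coherent state at each $t$.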
10. QUANTUM FIELDS IN SCATTERING PHENOMENA
So far we have introduced and studied quantum fields in the “non-interacting” context in the sense, the equations of motion that they satisfied were linear. We alluded to
interactions of the fields with sources/detectors, involving emission/absorption of quanta.
A more detailed version of this is typically treated in interactions of quantum fields with
atoms/molecules which are non-relativistic systems making transitions among their bound
states. There is another class of interactions which are manifested in scattering phenomena. These can be studied in high energy collisions and are a direct tool to probe the basic
microscopic interactions.
We first discuss a typical scattering arrangement, identify the relevant measurable physical quantities, set up a theoretical framework and see how quantum fields fit into the
framework.
A typical scattering experiment involves shooting particles at some target to be studied
and observing the deflected or emerging set of particles. Instead of fixed target we could
also have colliding beams. It is usually the case that,
(i) we understand the projectile particles reasonably well and are able to control their properties such as energy, momentum, spin, charge, shooting angles etc fairly accurately. It
is presumed (a necessary condition) that the presence of the target and/or other particles does
not affect these properties significantly at least as long as they are well separated (spatially
and/or temporally);
(ii) after a large but finite time, we can observe the emergent particles whose properties
can be similarly ascertained and they too are presumed well understood;
(iii) the correlations among the incoming and outgoing particles, though determined by
the intervening target region, can be measured independent of the target.
If we wish to view the combined system of projectile, target and emergent particles as
a single dynamical system, then the dynamical system must have three separate evolutions
at least approximately. These are: (a) evolution in the distant past (t → −∞); (b) evolution in the distant future (t → +∞) and (c) evolution during the intermediate time of
interaction. For instance, in a classical scattering, we will have a trajectory beginning and
ending in a region of the phase space, marked as “asymptotic region”. In these regions, the
exact trajectory may be well approximated by a different, usually “free” Hamiltonian. In a
quantum setting, we may replace a trajectory in phase space by one in the state space. It
should be noted right away that not all trajectories are of the scattering type - classically or
quantum mechanically.
What are the typically observed quantities?
In practice, it is not a single particle that is sent in but rather a well collimated beam
directed at the target region. After individual particles emerge, there is a spread in their
direction as the initial conditions are not identical. Typically, these are detected at different
locations by detectors which have finite aperture and therefore the state of the detected
particles is also known approximately. Given the aperture of a detector and its distance
to the target, we know the solid angle subtended at the scattering region. Thus, the basic
measurements are: (i) I(n̂)dΩ := number of particles or events detected per second along
the direction n̂. This may be measured for each species of particle separately or just the
total; (ii) the prepared beam has a flux J := number of incident particles per second per unit
transverse area of the beam. Both these numbers are known directly. They are obviously
proportional and define: $I(\hat n)\,d\Omega =: d\sigma(\hat n, \ldots)\,J$. The $d\sigma(\hat n, \ldots)$ is called the differential
cross-section and . . . refer to its dependence on other properties such as energy etc. These
reflect the averaged attributes of the beam - average energy, momentum, polarization etc.
The detection can be made finer with particle detection in coincidence, select specific polarization, charge etc. Depending upon the attributes tagged and included in I(n̂)dΩ we get
exclusive/inclusive cross-sections. If one does not care even about the angle, we have the
total cross-section. These are typically used in determining decay rates and in applications
in statistical mechanics as input from the microscopic dynamics.
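As a small worked illustration of the defining relation $I(\hat n)\,d\Omega = d\sigma\,J$, the sketch below converts a detector count rate and a beam flux into a cross-section. The numbers and the function name are hypothetical, chosen only to show the units: a rate in events per second divided by a flux in particles per second per cm$^2$ gives an area in cm$^2$, which is then expressed in barns ($1\ \mathrm{barn} = 10^{-24}\ \mathrm{cm}^2$).

```python
BARN_IN_CM2 = 1.0e-24   # 1 barn = 10^-24 cm^2

def cross_section_cm2(rate_per_s, flux_per_cm2_per_s):
    """d(sigma) = I dOmega / J: detected rate divided by incident flux, in cm^2."""
    return rate_per_s / flux_per_cm2_per_s

# Hypothetical numbers: 120 events per second into the detector aperture,
# for a beam flux of 10^28 particles per second per cm^2.
sigma = cross_section_cm2(120.0, 1.0e28)
print(f"d(sigma) = {sigma:.2e} cm^2 = {sigma / BARN_IN_CM2 * 1e3:.1f} millibarn")
```

Only the flux of the beam enters; all finer information resides in how the detected rate is tagged (angle, species, polarization and so on).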
In summary, the only information that is used about the beam is the flux while finer
details of the final state are used in defining various differential cross-sections. Thus any
cross-section is defined as the ratio above and has dimensions of area. Experimentalists give
these measured numbers while a theorist has to choose a favorite model and compute these
numbers. For this, we turn to the theoretical formulation of scattering processes in the
quantum framework. Here are some basic observations.
A. General scattering framework
Let $\mathcal{H}$ be the Hilbert space of the combined system (projectile-target-scattered/emerged).
Let $H_0$, $H_0'$ and $H$ be the three Hamiltonian operators governing the time evolution in the
remote past, distant future and all through. For simplicity, we take $H_0' = H_0$, though they
could be different. The evolution generated by $H_0$ is said to be “free”. All are assumed
to have no explicit time dependence. All are assumed to be self-adjoint.
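Let their corresponding eigenvalue equations be, $H_0\phi_n = \epsilon_n\phi_n$ and $H\psi_\alpha = \varepsilon_\alpha\psi_\alpha$. We have the most general solutions expressed as,
\[
\begin{aligned}
\psi(t) &= \sum_\alpha c_\alpha e^{-i\varepsilon_\alpha t}\psi_\alpha(0) \ , \quad \phi(t) = \sum_n a_n e^{-i\epsilon_n t}\phi_n(0) \\
\psi_\alpha(0) &= \sum_n b_{\alpha,n}\phi_n(0) \qquad \because\ \{\psi_\alpha(0)\} , \{\phi_n(0)\} \text{ are complete.} \\
\therefore\ \psi(t) &= \sum_{\alpha,n} c_\alpha b_{\alpha,n}e^{-i\varepsilon_\alpha t}\phi_n(0) \ , \quad \text{all coefficients are time independent.}
\end{aligned}
\]
We are interested in those solutions $\psi(t)$ which resemble some free solution as $t \to \pm\infty$. Thus, as $t \to -\infty$,
\[
\begin{aligned}
\psi(t) - \phi(t) \to 0 \ &\Rightarrow\ \sum_n\Big\{\sum_\alpha c_\alpha b_{\alpha,n}e^{-i\varepsilon_\alpha t + i\epsilon_n t} - a_n\Big\}\,e^{-i\epsilon_n t}\phi_n(0) \to 0 . \\
\therefore\ a_n &\to \sum_\alpha c_\alpha b_{\alpha,n}e^{i(\epsilon_n - \varepsilon_\alpha)t} \quad \forall\ n . \quad t\text{-independence of coefficients implies,} \\
c_\alpha b_{\alpha,n}\ &\text{(no sum)} = 0 \ \text{ whenever } \varepsilon_\alpha \neq \epsilon_n .
\end{aligned}
\]
So, if $H_0$, $H$ do not have any eigenvalue in common, then for every $c_{\alpha_*} \neq 0$, $b_{\alpha_*,n} = 0\ \forall\ n$. Hence, no solution $\psi_\alpha(0)$ is approximable by a free evolution. For a sufficiently general solution $\phi(t)$ to approximate some exact solution $\psi(t)$, the spectrum of $H_0$ should be a subset of the spectrum of $H$. Conversely, every solution with $c_\alpha = 0$ for every $\varepsilon_\alpha \notin \mathrm{Spec}(H_0)$, can be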
asymptotically approximated, by some free evolution. Such solutions are called scattering
solutions.
Note: Usually, the spectrum of H0 is continuous and bounded below. Its eigenstates span
the Hilbert space. The scattering states of H however span only a subspace of H . To span
the full Hilbert space we need to include the bound states of H. We have been somewhat
informal in the above argument, here are the sharper definitions.
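Definition: A solution $|\psi(t)\rangle$ is said to be incoming (outgoing) if $\exists$ a solution $|\phi_{in}(t)\rangle$ ($|\phi_{out}(t)\rangle$) such that
\[
\lim_{t\to-\infty}\|\psi(t) - \phi_{in}(t)\| = 0 \quad \Big(\lim_{t\to+\infty}\|\psi(t) - \phi_{out}(t)\| = 0\Big).
\]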
Note: If the prepared beam or the detected scattered particles are to be associated with a
state of the full system, then it is necessary that at the most only one |ψ(t)i can be asymptotic
to a given |φin (t)i or a given |φout (t)i. Establishing such a property for a model of scattering
is called the basic existence and uniqueness problem.
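The solutions can be tagged by states at any particular instant, for example, “$|\psi(t = 0)\rangle$”. We identify $\mathcal{H}_{in} := \{\psi \in \mathcal{H}\ /\ \psi(t) \text{ is incoming}\}$ and $\mathcal{H}_{out} := \{\psi \in \mathcal{H}\ /\ \psi(t) \text{ is outgoing}\}$. Potentially it is possible to have (i) $\psi \notin \mathcal{H}_{in}$, $\psi \notin \mathcal{H}_{out}$ (no scattering); (ii) $\psi \in \mathcal{H}_{in}$, but $\psi \notin \mathcal{H}_{out}$ (’capture’ process); (iii) $\psi \notin \mathcal{H}_{in}$, but $\psi \in \mathcal{H}_{out}$ (’decay’ process) and (iv) $\psi \in \mathcal{H}_{in} \cap \mathcal{H}_{out}$ (scattering).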
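Definition: A scattering system is weakly asymptotically complete if $\mathcal{H}_{in} = \mathcal{H}_{out}$. It may so happen that, in addition, the subspace of bound states, $\mathcal{H}_{bound}$, is such that $\mathcal{H} = \mathcal{H}_{bound} \oplus \mathcal{H}_{in}$ (with $\mathcal{H}_{in} = \mathcal{H}_{out}$). Then the system is said to be asymptotically complete.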
These distinctions have a bearing on the definition of S−matrix and its unitarity.
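Let $U(t)$, $U_0(t)$ denote the unitary evolution operators corresponding to $H$, $H_0$ respectively. Let $|\psi(t)\rangle = U(t - t_0)|\psi(t_0)\rangle$ be an incoming solution. Then $\exists\ |\phi_{in}(t)\rangle = U_0(t - t_0)|\phi_{in}(t_0)\rangle$ ($t_0$ arbitrarily chosen instant) such that,
\[
\begin{aligned}
\text{As } t \to -\infty \ ,\quad &\|U(t - t_0)\psi(t_0) - U_0(t - t_0)\phi_{in}(t_0)\| \to 0 \\
\therefore\quad &\|\psi(t_0) - U^\dagger(t - t_0)U_0(t - t_0)\phi_{in}(t_0)\| \to 0 \\
\therefore\quad &\psi(t_0) = \lim_{t\to-\infty}U^\dagger(t - t_0)U_0(t - t_0)\phi_{in}(t_0) .
\end{aligned}
\]
Define: $\Omega_+ := \lim_{t\to-\infty}U^\dagger(t - t_0)U_0(t - t_0) = \lim_{t\to-\infty}U^\dagger(t)U_0(t)$. Using $U^\dagger(t) = U(-t)$, we have,
\[
\Omega_+ = \lim_{t\to-\infty}U(-t)U_0(t) \quad\text{and}\quad |\psi(t_0)\rangle = \Omega_+|\phi_{in}(t_0)\rangle \ \ \forall\ t_0 .
\]
Likewise, for an asymptotically outgoing solution, we have,
\[
\Omega_- = \lim_{t\to+\infty}U(-t)U_0(t) \quad\text{and}\quad |\psi(t_0)\rangle = \Omega_-|\phi_{out}(t_0)\rangle \ \ \forall\ t_0 .
\]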
The Ω± are called the Moller operators. They map free solutions to asymptotic solutions.
The assumption of existence and uniqueness of scattering solutions, guarantees the existence
of these operators. Their adjoints are defined as, $\Omega_+^\dagger := \lim_{t\to-\infty}U_0(-t)U(t)$, $\Omega_-^\dagger := \lim_{t\to+\infty}U_0(-t)U(t)$. The adjoints give $\phi_{in}(t_0) = \Omega_+^\dagger\psi(t_0)$, $\phi_{out}(t_0) = \Omega_-^\dagger\psi(t_0)\ \forall\ t_0$.
It follows that $\psi(t_0) = \Omega_\pm\Omega_\pm^\dagger\psi(t_0)$, $\Omega_+^\dagger\Omega_+\phi_{in}(t_0) = \phi_{in}(t_0)$ and $\Omega_-^\dagger\Omega_-\phi_{out}(t_0) = \phi_{out}(t_0)$. Since the $\phi_{in/out}$ span the full Hilbert space, the last two equalities imply that $\Omega_\pm^\dagger\Omega_\pm = 1$. The same is not true for the $\psi(t)$’s and hence $\Omega_\pm\Omega_\pm^\dagger \neq 1$. Thus the Moller operators are isometry operators but not unitary operators. For the special case of asymptotic completeness, we can write $\Omega_\pm\Omega_\pm^\dagger = 1 - P_{bound}$ where $P_{bound}$ is the projection operator on $\mathcal{H}_{bound}$. From the Schrodinger equations satisfied by the $\psi(t)$ and $\phi(t)$, it is easy to see that
\[
\boxed{\ H\Omega_\pm = \Omega_\pm H_0 \quad\text{and}\quad H_0\Omega_\pm^\dagger = \Omega_\pm^\dagger H\ }
\]
Consider now a solution which is both asymptotically incoming and outgoing. Then we
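have, $\psi(t) = \Omega_+\phi_{in}(t)$ and $\phi_{out}(t) = \Omega_-^\dagger\psi(t)$. Therefore,
\[
\phi_{out}(t) = \Omega_-^\dagger\Omega_+\phi_{in}(t) \ , \quad \phi_{in}(t) = \Omega_+^\dagger\Omega_-\phi_{out}(t) \quad\text{and we define,} \tag{10.1}
\]
\[
S := \Omega_-^\dagger\Omega_+ \quad\text{on } [\mathrm{Range}(\Omega_+)] \cap [\mathrm{Domain}(\Omega_-^\dagger)] \neq \emptyset . \tag{10.2}
\]
Note that the range of $\Omega_+$ are states in $\mathcal{H}_{in}$ while the domain of $\Omega_-^\dagger$ are states in $\mathcal{H}_{out}$. The scattering operator, $S$, is thus well defined when weak asymptotic completeness holds. Similarly, we define $S^\dagger := \Omega_+^\dagger\Omega_-$ which is also well defined when weak asymptotic completeness holds. If in addition asymptotic completeness holds, then
\[
S^\dagger S = \Omega_+^\dagger\Omega_-\Omega_-^\dagger\Omega_+ = \Omega_+^\dagger(1 - P_{bound})\Omega_+ = 1 - 0 \quad\text{likewise,} \tag{10.3}
\]
\[
SS^\dagger = \Omega_-^\dagger\Omega_+\Omega_+^\dagger\Omega_- = \Omega_-^\dagger(1 - P_{bound})\Omega_- = 1 - 0 . \tag{10.4}
\]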
Thus the scattering operator is unitary when asymptotic completeness holds. This is regardless of the Moller operators being non-unitary. The scattering operator S maps from free
solutions $\phi_{in}(t)$ to free solutions $\phi_{out}(t)$. Its matrix elements between the basis of free solutions define the S-matrix. Explicitly, let $\phi^a_{in}(t)$ and $\phi^b_{out}(t)$ denote incoming and outgoing free solutions. Then,
\[
S_{ba} := \langle\phi^b_{out}(t)|S|\phi^a_{in}(t)\rangle =: \langle\phi^b_{out}(t)|\tilde\phi^a_{out}(t)\rangle . \tag{10.5}
\]
The $\tilde\phi^a_{out}(t)$ state is a state that evolved from a specific incoming state while $\phi^b_{out}(t)$ is an arbitrary free state and $S_{ba}$ is the inner product between the two.
Hence, $S_{ba}$ gives the probability amplitude for a free state $|\phi^a_{in}(t)\rangle$ to evolve into an outgoing state $|\phi^b_{out}(t)\rangle$ - in short a transition amplitude.
The $S$ operator defined in terms of the Moller operators is manifestly time independent. Its matrix elements however seem to have a time dependence through those of the $\phi_{in/out}(t)$. This apparent time dependence is actually absent and can be seen as follows.
Recall the boxed relations among Ω± , Ω†± , H and H0 . From these we get,
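\[
\begin{aligned}
SH_0 &= \Omega_-^\dagger\Omega_+H_0 = \Omega_-^\dagger H\Omega_+ = H_0\Omega_-^\dagger\Omega_+ = H_0S . \\
i d_t S_{ba} &= i d_t\big(\langle\phi^b_{out}(t)|S|\phi^a_{in}(t)\rangle\big) \\
&= -\langle\phi^b_{out}(t)|H_0S|\phi^a_{in}(t)\rangle + 0 + \langle\phi^b_{out}(t)|SH_0|\phi^a_{in}(t)\rangle \\
&= \langle\phi^b_{out}(t)|[S, H_0]|\phi^a_{in}(t)\rangle = 0 .
\end{aligned}
\]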
In summary:
• A generic scattering system has H and the three Hamiltonians H, H0 , H00 specified;
• There exist asymptotic subspaces of H , Hin , Hout elements of which give full solutions
approaching some free solutions as t → ±∞;
• Provided the system admits scattering states, there exist the Moller operators, Ω± on
H with ranges in Hin/out respectively;
• If weak asymptotic completeness holds, ∃ the scattering operator S : Hin → Hout and
its adjoint S † . The scattering operator S is unitary provided asymptotic completeness
holds.
• Once Sba are known, the cross-section can be computed, since the number of scattered
particles is proportional to |Sba |2 , while the flux of the incoming particles is given by
the probability current corresponding to φin (t).
• Much of the rigorous analysis goes in addressing the existence and uniqueness properties of the proposed scattering system. Further details may be seen in [9, 10].
In practice, a scattering system is rarely specified in sufficient detail to address these
issues, but they are at the very basic definitional level.
Returning to interacting quantum fields, we have the candidate free states - the states of
free quantum fields with known free Hamiltonian H0 . We may specify the interacting system
by additional Lorentz invariant terms in the Lagrangian or specifying the Hamiltonian H.
However, we do not know what the Hilbert space of interacting quantum fields is let alone
if it admits scattering states. But we do know what properties we would like to have for
a scattering interpretation. The task now is to make suitable assumptions and build a
computational recipe to define and compute the S−matrix elements.
B. Scattering with quantum fields: Heisenberg Picture
The above discussion has been in terms of the states evolving in time i.e. in Schrodinger
picture. For interacting quantum fields we do not know the state space (except for the free
fields). We do have Poincare covariance as a guide and this puts the field operators and their
Poincare transforms at the center stage. We are thus led to use the Heisenberg picture and
formulate the recipe in terms of the evolving operators. As states do not evolve in Heisenberg
picture, we postulate certain properties for the states, postulate Poincare covariance on the
operator evolution and make specific additional assumption of “asymptotic condition” to
incorporate the scattering feature. A reference for this material is [11].
1. We choose the basis states of the interacting quantum fields to be labeled by mutually
commuting, conserved Noether charges whose existence is guaranteed by the presumed symmetries. Among these symmetries is included the Poincare group (proper,
orthochronous with/without discrete symmetries). The Poincare symmetry immediately gives eigenvalues of $P^\mu$ as a set of labels. Our first assumption is regarding the spectrum of $P^\mu$.
2. • There exists a unique vacuum, eigenstate $|0\rangle$ with $P^\mu|0\rangle = 0$;
• Eigenvalues, $p^\mu$, of $P^\mu$ lie in the forward light cone i.e. $p^2 \le 0$, $p^0 > 0$;
• $\exists$ stable, single particle states with masses $m_i$: $P^\mu|p_i\rangle = p_i^\mu|p_i\rangle$, $p_i^2 = -m_i^2$.
• The eigenvalues $0$ and $-m_i^2$ of $P^2$ are discrete in the sense of being isolated values. The “multi-particle” states have their total momentum time-like but the corresponding $-p^2$ can take any value above their total rest mass.
[Figure: $P^0$ plotted against $\vec p$, showing the hyperboloids $p^2 = -m^2$ and $p^2 = -4m^2$ and the continuum above the latter.]
FIG. 1: Spectrum for a single, massive scalar field. The two particle states form a continuum above and including the $-(p_1 + p_2)^2 = 4m^2$ hyperboloid.
Note: If there are “internal symmetries” (i.e. non-space-time symmetries), then it is conceivable that the zero eigenvalue of $P^\mu$ is degenerate, and carries some representation of the internal symmetry group (usually compact). Every one of these states must still be singlets of the Poincare group. But now, these vacua themselves carry some quantum numbers and quantum fields acting on these will create single particle states labeled by tensor product of representations of the internal symmetry group. This is not done and one stipulates the vacuum to be a singlet under all symmetries.
Note: We have taken the stable particles to have a non-zero mass. This is a statement
about the spectrum of P 2 in the full, exact interacting quantum field theory and not
at some approximation of it. This would seem to exclude the massless representations
of the Poincare group for interacting quantum fields. In practice i.e. in perturbation
theory we will use massless photons and see the accompanying infrared problems.
3.
• An interacting field operator, denoted schematically by Φ(x) satisfies an operator
equation of the form, ( − m20 )Φ(x) = J(x) where J(x) is an operator built
out of Φ and possibly other field operators. We re-write the field equation as
˜
( − m2 )Φ(x) = J(x) − (m2 − m2 )Φ(x) =: J(x)
0
• The field obeys the equal time (anti-)commutation relations:
[Φ(t, ~x), Φ(t, ~y )] = 0 = [Π(t, ~x), Π(t, ~y )] , [Φ(t, ~x), Π(t, ~y )] = iδ 3 (~x − ~y ) .
If J(x) has no derivatives of fields, the momentum field is Π(t, ~x) = Φ̇(t, ~x). This
stipulates the quantized nature of the field while the equation of motion stipulates
the nature of interaction.
• There exists another set of quantized fields, $\Phi_{in}(x)$, $\Phi_{out}(x)$, built out of the interacting field
$\Phi(x)$, satisfying the same set of equal time (anti-)commutation relations but
obeying free field equations: $(\Box - m^2)\Phi_{in/out} = 0$ where $m$ is the “physical” mass.
The in and out fields thus have the same mode decomposition introducing the
creation/annihilation operators which create/destroy stable particles with physical mass m. These fields also transform the same way as the field Φ(x) under
the symmetries, in particular, Φin (x + a) = eia·P Φin (x)e−ia·P .
4. Asymptotic conditions: The exact interacting quantum field $\Phi(x)$ is linked to the in
and out fields by the asymptotic conditions,
\[ \Phi(x)\big|_{t\to-\infty} \longrightarrow \sqrt{Z}\,\Phi_{in}(x) , \qquad \Phi(x)\big|_{t\to\infty} \longrightarrow \sqrt{Z}\,\Phi_{out}(x) . \]
Here $Z$ is a possible normalization constant which we will show to be necessarily
different from 1.
Note: The asymptotic conditions are not operator equations but are weak equations
i.e. hold for arbitrary matrix elements in the basis of normalized states.
Below we elaborate the conditions and note a number of properties.
Asymptotic conditions are consistent with Poincare covariance: A formal solution of the
equation satisfied by $\Phi(x)$ is given by,
\[ \Phi(x) = \int d^4y\, G_{ret}(x - y, m)\,\tilde J(y) + \sqrt{Z}\,\Phi_{in}(x) . \tag{10.6} \]
As $t \to -\infty$, the first term vanishes (property of the retarded Green function), recovering the
asymptotic condition. This is a heuristic argument since $\tilde J(y)$ is built out of $\Phi$ and some
iterative procedure is implicit. The consistency with covariance is seen as,
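\[
\begin{aligned}
\sqrt{Z}\,\Phi_{in}(x - a) &= \Phi(x - a) - \int d^4y\, G_{ret}(x - a - y, m)\,\tilde J(y) \\
&= e^{-ia\cdot P}\,\Phi(x)\,e^{ia\cdot P} - \int d^4y\, G_{ret}(x - y, m)\,\tilde J(y - a) \\
&= e^{-ia\cdot P}\left[ \Phi(x) - \int d^4y\, G_{ret}(x - y, m)\,\tilde J(y) \right] e^{ia\cdot P} = \sqrt{Z}\, e^{-ia\cdot P}\,\Phi_{in}(x)\, e^{ia\cdot P} .
\end{aligned}
\]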
Note: Only $G_{ret}$ is used in the formal solution and not $G_{Feynman}$.
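$\Phi_{in}(x)$ creates states with physical mass $m$: The infinitesimal form of the transformation
law for $\Phi_{in}(x)$ implies $[P^\mu, \Phi_{in}(x)] = -i\partial^\mu \Phi_{in}(x)$, and we also have the equation of motion
$(\Box - m^2)\Phi_{in} = 0$.
For any physical state $|k,\alpha\rangle$, $P^\mu|k,\alpha\rangle = k^\mu|k,\alpha\rangle$, while $P^\mu|0\rangle = 0$. Therefore,
\[
\begin{aligned}
-i\partial^\mu \langle k,\alpha|\Phi_{in}(x)|0\rangle &= \langle k,\alpha|[P^\mu, \Phi_{in}(x)]|0\rangle = k^\mu \langle k,\alpha|\Phi_{in}(x)|0\rangle \\
\therefore\ -\Box \langle k,\alpha|\Phi_{in}(x)|0\rangle &= k^2 \langle k,\alpha|\Phi_{in}(x)|0\rangle \\
\therefore\ (-\Box + m^2)\langle k,\alpha|\Phi_{in}(x)|0\rangle &= 0 = (k^2 + m^2)\langle k,\alpha|\Phi_{in}(x)|0\rangle \ \Rightarrow\ k^2 = -m^2 .
\end{aligned}
\]
Thus, $\Phi_{in}(x)$ behaves exactly like a free field producing physical states from the vacuum and
it can be taken to have the usual mode decomposition,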
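\[
\begin{aligned}
\Phi_{in}(x) &= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \left[ a_{in}(\vec k)\,e^{ik\cdot x} + a^\dagger_{in}(\vec k)\,e^{-ik\cdot x} \right] \\
\Rightarrow\ \partial_0\Phi_{in}(x) &= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \left[ -i\omega_{\vec k}\, a_{in}(\vec k)\,e^{ik\cdot x} + i\omega_{\vec k}\, a^\dagger_{in}(\vec k)\,e^{-ik\cdot x} \right]
\end{aligned}
\]
\[ a^\dagger_{in}(\vec k) = \frac{-i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; e^{ik\cdot x}\,\overleftrightarrow{\partial_0}\,\Phi_{in}(x) , \qquad f\,\overleftrightarrow{\partial}\,g := f\,\partial g - \partial(f)\,g . \tag{10.7} \]
\[ \text{And,}\quad a_{in}(\vec k) = \frac{i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; e^{-ik\cdot x}\,\overleftrightarrow{\partial_0}\,\Phi_{in}(x) . \]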
Note: In writing the inversion formula, we have done the spatial integration over a
t =constant hypersurface Σt . The choice of this hypersurface does not matter (a†in (~k) is
independent of t). This is kept implicit in all the inversion formulae below.
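It follows immediately that $\langle 0|\Phi_{in}(x)|0\rangle = 0$ and,
\[ \langle k,\alpha|\Phi_{in}(x)|0\rangle = \langle k,\alpha|e^{-ix\cdot P}\,\Phi_{in}(0)\,e^{ix\cdot P}|0\rangle = e^{-ik\cdot x}\,\langle k,\alpha|\Phi_{in}(0)|0\rangle , \]
\[ \langle k,\alpha|\Phi_{in}(0)|0\rangle = \frac{1}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \quad\Rightarrow\quad \langle k,\alpha|\Phi_{in}(x)|0\rangle = \frac{e^{-ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}} . \]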
Suffice it to say that corresponding expressions exist for $\Phi_{out}$. In particular the heuristic
expression takes the form,
\[ \Phi(x) = \sqrt{Z}\,\Phi_{out}(x) + \int d^4y\, G_{adv}(x - y, m)\,\tilde J(y) . \]
How far can we get with Lorentz covariance and the assumptions on the spectrum?
C. Kallen-Lehmann Representation
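Consider the vacuum expectation value of the commutator of the interacting field,
$i\Delta'(x, x') := \langle 0|[\Phi(x), \Phi(x')]|0\rangle$. We had evaluated this for the free fields while considering the spin-statistics, $\sim \Delta_+(x - x')$. We evaluate it as,
\[
\begin{aligned}
i\Delta'(x, x') &= \sum_n \langle 0|\Phi(x)|n\rangle\langle n|\Phi(x')|0\rangle - (x \leftrightarrow x') , && (10.8) \\
&= \sum_n \langle 0|\Phi(0)\,e^{ix\cdot P}|n\rangle\langle n|e^{-ix'\cdot P}\,\Phi(0)|0\rangle - (x \leftrightarrow x') , && (10.9) \\
&= \sum_n |\langle 0|\Phi(0)|n\rangle|^2 \left[ e^{i(x-x')\cdot k_n} - e^{-i(x-x')\cdot k_n} \right] , \quad \text{a function of } (x - x') . && (10.10)
\end{aligned}
\]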
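The states $|n\rangle$ contain the vacuum, the single particle states on the isolated hyperboloid
$k_n^2 = -m^2$ as well as the multi-particle continuum. To collect terms with the same total
$k_n^\mu$, insert $1 = \int d^4q\,\delta^4(q - k_n)$ and write,
\[
\begin{aligned}
\Delta'(x - x') &= -\frac{i}{(2\pi)^3} \int d^4q\; (2\pi)^3 \sum_n \delta^4(q - k_n)\,|\langle 0|\Phi(0)|n\rangle|^2 \left[ e^{i(x-x')\cdot k_n} - e^{-i(x-x')\cdot k_n} \right] \\
&:= -\frac{i}{(2\pi)^3} \int d^4q \left[ e^{i(x-x')\cdot q} - e^{-i(x-x')\cdot q} \right] \rho(q) \quad \text{where,} \\
\rho(q) &:= (2\pi)^3 \sum_n \delta^4(q - k_n)\,|\langle 0|\Phi(0)|n\rangle|^2 \qquad \text{(the spectral density).}
\end{aligned}
\]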
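Claim: The spectral density is a scalar function of $q^2$.
Proof: Under a Lorentz transformation, $U(\Lambda)$, we have $U(\Lambda)|0\rangle = |0\rangle$ and
$U(\Lambda)\Phi(0)U^{-1}(\Lambda) = \Phi(\Lambda 0) = \Phi(0)$. Hence,
\[ \rho(q) = (2\pi)^3 \sum_n \delta^4(q - k_n)\,|\langle 0|\Phi(0)U(\Lambda)|n\rangle|^2 . \]
Next,
\[ \delta^4(k) \sim \int d^4x\, e^{ik\cdot x} = \int d^4(\Lambda x)\, e^{ik\cdot(\Lambda x)} = \int d^4x\, e^{i(\Lambda^{-1}k)\cdot x} \sim \delta^4(\Lambda^{-1}k) . \]
\[ \therefore\ \rho(q) = (2\pi)^3 \sum_n \delta^4(\Lambda^{-1}(q - k_n))\,|\langle 0|\Phi(0)U(\Lambda)|n\rangle|^2 . \]
Let $U(\Lambda)|n\rangle = |m\rangle$. Then $U^{-1}(\Lambda)P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu$. Acting on $|n\rangle$ implies $P^\mu|m\rangle = k_m^\mu|m\rangle$, and we already had $P^\mu|n\rangle = k_n^\mu|n\rangle$. This in turn implies $k_m^\mu\, U^{-1}(\Lambda)|m\rangle = \Lambda^\mu{}_\nu k_n^\nu|n\rangle$
or, $k_m^\mu = \Lambda^\mu{}_\nu k_n^\nu$. Lowering the Lorentz index gives $(k_m)_\mu = (k_n)_\nu (\Lambda^{-1})^\nu{}_\mu$. Replacing the
sum over $n$ by that over $m$ gives,
\[ \rho(q) = (2\pi)^3 \sum_m \delta^4(k_m - \Lambda^{-1}q)\,|\langle 0|\Phi(0)|m\rangle|^2 = \rho(\Lambda^{-1}q) . \]
Thus, $\rho(q)$ can only depend on $q$ through $q^2$.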
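Since the physical spectrum has $q^2 < 0$ with $q^0 > 0$, we write $\rho(q) := \theta(q^0)\rho(q^2)$, $\rho(q^2) = 0$ for $q^2 > 0$. Substituting in $\Delta'$ gives,
\[
\begin{aligned}
\Delta'(x - x') &= \frac{-i}{(2\pi)^3} \int d^4q\; \rho(q^2)\,\theta(q^0) \left[ e^{iq\cdot(x-x')} - e^{-iq\cdot(x-x')} \right] \\
&= \int_0^\infty d\sigma^2\, \rho(\sigma^2) \left[ \frac{-i}{(2\pi)^3} \int d^4q\; \delta(q^2 - \sigma^2)\,\epsilon(q^0)\, e^{iq\cdot(x-x')} \right] . \\
\therefore\ \Delta'(x - x') &= \int_0^\infty d\sigma^2\, \rho(\sigma^2)\,\Delta(x - x', \sigma^2) . \qquad (10.11)
\end{aligned}
\]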
Here we have introduced the function $\epsilon(x) = \theta(x) - \theta(-x)$ to combine the two exponentials
and also recognized that the square bracket is just $-i[\Phi_{in}(x), \Phi_{in}(x')]$ for $m^2 = \sigma^2$. The
last equation is known as the Kallen-Lehmann representation or the spectral representation
for the commutator function $\Delta'(x - x')$ of the exact interacting quantum fields. This has
followed from Poincare covariance (separation of the $x$-dependence), the assumptions regarding the spectrum of the interacting system ($\rho(q^2 > 0) = 0$) and the normalizations used for
$\Phi_{in}$ in identifying $\Delta(x - x', \sigma^2)$.
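We can separate the contribution of the 1-particle states at $\sigma^2 = m^2$. We obtain
$\langle 0|\Phi(x)|k\rangle$ using the asymptotic condition,
\[ \langle 0|\Phi(x)|k\rangle = \sqrt{Z}\,\langle 0|\Phi_{in}(x)|k\rangle + \int d^4y\, G_{ret}(x - y, m)\,\langle 0|\tilde J(y)|k\rangle , \]
but
\[ \langle 0|\tilde J(y)|k\rangle = \langle 0|(\Box_y - m^2)\Phi(y)|k\rangle = (\Box_y - m^2)\,\langle 0|\Phi(0)|k\rangle\, e^{ik\cdot y} = (-k^2 - m^2)\,\langle 0|\Phi(0)|k\rangle\, e^{ik\cdot y} = 0 . \]
The first term then gives,
\[ \langle 0|\Phi(x)|k\rangle = \sqrt{Z}\,\langle 0|\Phi_{in}(x)|k\rangle = \sqrt{Z}\,\frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \quad\Rightarrow\quad |\langle 0|\Phi(x)|k\rangle|^2 = \frac{Z}{2\omega_{\vec k}(2\pi)^3} . \]
\[ \therefore\ \rho(q^2)\big|_{1\text{-particle}} = (2\pi)^3 \int d^3k\; \delta^4(k - q)\,\frac{Z}{2\omega_{\vec k}(2\pi)^3} = Z\,\delta(q^2 - m^2)\,\theta(q^0) \tag{10.12} \]
\[ \therefore\ \Delta'(x - x') = Z\,\Delta(x - x', m^2) + \int_{m_1^2}^\infty d\sigma^2\, \rho(\sigma^2)\,\Delta(x - x', \sigma^2) . \tag{10.13} \]
The lower limit on the integral is the smallest invariant mass of the multi-particle continuum
and $m_1^2 > m^2$ is assumed.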
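We can now derive a bound on $Z$. Observe that
\[ \lim_{t' \to t} \left( i\partial_t \Delta'(x - x') \right) = \langle 0|[\dot\Phi(x), \Phi(x')]|0\rangle = -i\delta^3(\vec x - \vec x') = \lim_{t' \to t} \left( i\partial_t \Delta(x - x', m^2) \right) . \]
We are using the assumption that there are no derivative terms in $J(x)$ or equivalently,
$\Pi(x) = \dot\Phi(x)$, and noting that the in/out fields satisfy the same commutation relations.
Applying $i\partial_t$ to eq.(10.13) and taking the limit $t' \to t$, we deduce
\[ -i\delta^3(\vec x - \vec x') = Z\left( -i\delta^3(\vec x - \vec x') \right) + \int_{m_1^2}^\infty d\sigma^2\, \rho(\sigma^2) \left( -i\delta^3(\vec x - \vec x') \right) \]
\[ \text{Or,}\quad 1 = Z + \int_{m_1^2}^\infty d\sigma^2\, \rho(\sigma^2) \quad\Longrightarrow\quad 0 \le Z < 1 . \tag{10.14} \]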
The inequalities on $Z$ can be understood as follows. Since the asymptotic condition is needed
to link $\Phi(x)$ with $\Phi_{in/out}(x)$, $Z \neq 0$. To understand the upper limit, notice that the free
fields can only produce 1-particle states since they are linear in creation/annihilation operators.
The interacting field has no such restriction and can produce multi-particle states as well.
Hence the probability for it to produce 1-particle states is less than that for the free fields
i.e. $|\langle 0|\Phi(x)|k\rangle|^2 < |\langle 0|\Phi_{in/out}(x)|k\rangle|^2$. This results in $Z < 1$.
In principle the Z factors for the in and out fields could have been different in the
asymptotic conditions. However, following identical steps as above would lead us to the
same relation (10.13) implying equality of the Z’s.
An important point to note is that Z < 1 has a physical reason, quite independent of
any infinities and their renormalization that we will encounter later on.
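The sum rule (10.14) has a simple quantum mechanical analogue that can be checked numerically. The sketch below is not from these notes: it assumes an anharmonic oscillator $H = p^2/2 + x^2/2 + g\,x^4$ (with $\hbar = 1$ and unit mass) in a truncated oscillator basis, where $[x, p] = i$ implies $\sum_n 2(E_n - E_0)|\langle n|x|0\rangle|^2 = 1$. The weight of the first excited state plays the role of $Z$, and the remaining states play the role of the continuum. The function name `sum_rule_weights` and the parameter values are illustrative choices.

```python
import numpy as np

def sum_rule_weights(g=0.1, dim=60):
    """Return (Z, total) for H = p^2/2 + x^2/2 + g x^4 in a truncated oscillator basis.

    Z     : weight 2 (E_1 - E_0) |<1|x|0>|^2 of the first excited state
    total : sum of the weights over all states (close to 1 if the basis is large enough)
    """
    n = np.arange(1, dim)
    a = np.diag(np.sqrt(n), k=1)              # annihilation operator in the number basis
    x = (a + a.T) / np.sqrt(2.0)
    p = (a - a.T) / (1j * np.sqrt(2.0))
    h = (p @ p).real / 2 + x @ x / 2 + g * np.linalg.matrix_power(x, 4)
    energies, states = np.linalg.eigh(h)      # columns of `states` are eigenvectors
    x_n0 = states.T @ x @ states[:, 0]        # matrix elements <n|x|0>
    weights = 2.0 * (energies - energies[0]) * x_n0**2
    return weights[1], weights.sum()

Z, total = sum_rule_weights()
print(f"Z = {Z:.6f}, total = {total:.6f}")
```

For $g = 0$ the position operator connects the ground state only to the first excited state and the analogue of $Z$ equals 1; switching on the interaction moves part of the weight to higher states, so that $Z < 1$ while the total stays fixed by the commutation relation.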
Remark: In the Kallen-Lehmann spectral representation we considered the vacuum expectation value of the commutator of the fully interacting quantum field presumed to satisfy
the usual canonical commutation relations. Because of this, the interacting quantum field
gets identified with the so called “bare” quantum field in the renormalized perturbation
theory. We could instead use a “renormalized” quantum field, $\Phi_R(x) := \Phi(x)/\sqrt{Z}$, and correspondingly define $\Delta'_R(x - x') := Z^{-1}\Delta'(x - x')$. It follows that $\rho_R(q^2) = Z^{-1}\rho(q^2)$ and
$[\Phi_R(x), \dot\Phi_R(x')] = iZ^{-1}\delta^3(\vec x - \vec x')$. In terms of the renormalized quantities, there is no factor
of $Z$ in equation (10.13). The manipulations leading to the bound on $Z$ will now give,
\[ \lim_{t' \to t} \left( i\partial_t \Delta'_R(x - x') \right) = \langle 0|[\dot\Phi_R(x), \Phi_R(x')]|0\rangle = -iZ^{-1}\delta^3(\vec x - \vec x') = Z^{-1} \lim_{t' \to t}\left( i\partial_t \Delta(x - x', m^2) \right) , \]
and the eq.(10.13) will lead to $1 = Z\left( 1 + \int_0^\infty d\sigma^2\, \rho_R(\sigma^2) \right)$, implying again that $Z < 1$. These
bounds are true provided the spectral integral is finite. In perturbation theory, the two point
functions (commutator or the Feynman propagator) of the unrenormalized fields are ultraviolet divergent and the bound cannot be inferred (see section 10.7 of [1]). The Kallen-Lehmann representation eq.(10.11) itself does not depend on the canonical commutation
relations for the interacting field and can indeed be derived for composite fields as well.
D. The S-matrix and its properties
Having the $\Phi_{in/out}(x)$ and their corresponding mode expansions at our disposal, we can
now define the in/out states as $|k\rangle_{in} := a^\dagger_{in}(\vec k)|0\rangle$ and $|k\rangle_{out} := a^\dagger_{out}(\vec k)|0\rangle$. Similarly, multi-particle states can be defined. We can form wave packets for normalizable states and take
the limit of infinite sharpness at the end. We bypass these steps and work with the momentum
eigenstates. Note that these states are time independent. While we can certainly define the
general in/out states and take their spans, we need to (and do) make a further assumption
that these spans generate the full Hilbert space of the interacting quantum fields (footnote 10). The
sets of in and out states just constitute two orthonormal bases, conveniently transforming
by representations of the Poincare group. This guarantees existence of a unitary operator
connecting the two orthonormal bases. This is our scattering operator in this Heisenberg
picture formulation. Taking the basis elements to be generated by monomials of $a^\dagger_{in}(\vec k, \sigma, \dots)$
and $a^\dagger_{out}(\vec k, \sigma, \dots)$, the S-matrix is defined as,
\[ S_{\beta\alpha} := \langle \beta\ \mathrm{out}|\alpha\ \mathrm{in}\rangle =: \langle \beta\ \mathrm{in}|S|\alpha\ \mathrm{in}\rangle \tag{10.15} \]
Note that $\alpha, \beta$ label basis states which are created by monomials in $a^\dagger_{in}$, $a^\dagger_{out}$. The definition
of the $S$ operator shows that it preserves the labels: $\langle \beta\ \mathrm{out}| =: \langle \beta\ \mathrm{in}|S$. We stipulate some
conditions on the S-matrix.
Stability of the vacuum and 1-particle states: We expect the vacuum state to suffer no scattering spontaneously except maybe acquiring a phase (uniqueness of the vacuum allows a phase
since rays are defined up to phases). We choose the phase to be 1, and stipulate $S_{00} = 1$.
Likewise the presumed stability of single particle states also disallows any spontaneous change
(nothing to scatter against). Thus we stipulate that $S_{k',k} = \delta^3(\vec k' - \vec k)$. Scattering takes
place with multi-particle states.
Footnote 10: This automatically precludes bound states in the interacting system and amounts to postulating the asymptotic completeness property mentioned earlier.
Claim: Φout (x) = S −1 Φin (x)S .
Proof: Consider, hβ out|Φout (x)|α ini and evaluate in two ways.
hβ out|Φout (x)|α ini = (hβ in|S)Φout (x)|α ini.
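Next, $\langle \beta'\ \mathrm{out}| := \langle \beta\ \mathrm{out}|\Phi_{out}(x)$ is a linear combination of the out-basis elements. Each of
these can be expressed as the corresponding in-basis element times $S$. And the in-basis element
is also obtained by the action of $\Phi_{in}(x)$. In equation,
\[ \langle \beta'\ \mathrm{out}| = \sum_\gamma C_\gamma \langle \gamma\ \mathrm{out}| = \sum_\gamma C_\gamma \left( \langle \gamma\ \mathrm{in}|S \right) = \Big( \sum_\gamma C_\gamma \langle \gamma\ \mathrm{in}| \Big) S =: \langle \beta'\ \mathrm{in}|S = \left( \langle \beta\ \mathrm{in}|\Phi_{in}(x) \right) S . \]
Taking the inner product of the two expressions with $|\alpha\ \mathrm{in}\rangle$ gives,
\[ \langle \beta\ \mathrm{out}|\Phi_{out}(x)|\alpha\ \mathrm{in}\rangle = \langle \beta\ \mathrm{in}|S\,\Phi_{out}(x)|\alpha\ \mathrm{in}\rangle = \langle \beta\ \mathrm{in}|\Phi_{in}(x)\,S|\alpha\ \mathrm{in}\rangle \quad \forall\ \alpha, \beta . \]
Completeness of the in-basis implies $\Phi_{out}(x) = S^{-1}\Phi_{in}(x)S$, proving the claim.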
Claim: Covariance of the in/out fields gives invariance of the scattering operator.
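Proof: We have,
\[ U(\Lambda, a)\Phi_{in}(x)U^{-1}(\Lambda, a) = \Phi_{in}(\Lambda x + a) , \quad U(\Lambda, a)\Phi_{out}(x)U^{-1}(\Lambda, a) = \Phi_{out}(\Lambda x + a) \quad \text{and} \quad \Phi_{out}(x) = S^{-1}\Phi_{in}(x)S \]
\[ \therefore\ U\left( S^{-1}\Phi_{in}(x)S \right)U^{-1} = S^{-1}\Phi_{in}(\Lambda x + a)S = S^{-1}U\Phi_{in}(x)U^{-1}S \tag{10.16} \]
\[ \therefore\ U^{-1}SUS^{-1}\,\Phi_{in}(x) = \Phi_{in}(x)\,U^{-1}SUS^{-1} \ \Rightarrow\ U^{-1}SUS^{-1} = 1 \ \text{ or } \ S\,U(\Lambda, a) = U(\Lambda, a)\,S . \tag{10.17} \]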
Question: Are the Schrodinger picture definitions of the scattering matrix element (10.5)
and the Heisenberg picture scattering matrix element defined using Φout = S −1 Φin S, related?
If so, how?
We now relate the S-matrix elements to vacuum expectation values of time ordered products of the interacting fields - the Lehmann-Symanzik-Zimmermann (LSZ) reduction formulae. This is followed by covariant perturbation series leading to Feynman rules. We will first
detail these steps for the notationally simpler case of a scalar field and then summarize the
corresponding steps for the Dirac and the Maxwell fields.
E. Lehmann-Symanzik-Zimmermann Reduction of S-matrix
We have defined the S-matrix elements using the in and out states which form the basis
elements created by monomials in $a^\dagger_{in}(\vec k, \dots)$ and $a^\dagger_{out}(\vec k, \dots)$ acting on the unique vacuum.
The in and out fields are not independent or uncorrelated, they are linked through the
interacting field Φ(x) via the asymptotic condition. The LSZ reduction of the S−matrix
elements expresses them in terms of the interacting field explicitly. For simplicity, let us
first consider the case of an interacting Hermitian scalar field.
1. LSZ reduction for Klein-Gordon field
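Consider a matrix element of the form $\langle \beta\ \mathrm{out}|\alpha, k\ \mathrm{in}\rangle$ where we have separated the $k$
label in the in-state. The definition gives,
\[ \langle \beta\ \mathrm{out}|\alpha, k\ \mathrm{in}\rangle = \langle \beta\ \mathrm{out}|a^\dagger_{in}(\vec k)|\alpha\ \mathrm{in}\rangle = \langle \beta\ \mathrm{out}| \left( a^\dagger_{in}(\vec k) - a^\dagger_{out}(\vec k) + a^\dagger_{out}(\vec k) \right) |\alpha\ \mathrm{in}\rangle . \]
The $a^\dagger_{out}(\vec k)$ acting on the out-state will remove a particle with label $\vec k$ if it is present in the
label set $\beta$, or will annihilate the out-state. For simplicity, let us assume that there is no
such state in the $\beta$ label. Then we have just added and subtracted a 0. For the first two
terms in the brackets, use the inversion formula (10.7):
\[ \langle \beta\ \mathrm{out}|\alpha, k\ \mathrm{in}\rangle = \langle \beta\ \mathrm{out}| \left[ \frac{-i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; e^{ik\cdot x}\,\overleftrightarrow{\partial_0}\,\Phi_{in}(x) + \frac{+i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; e^{ik\cdot x}\,\overleftrightarrow{\partial_0}\,\Phi_{out}(x) \right] |\alpha\ \mathrm{in}\rangle . \]
Using the asymptotic conditions,
\[ [\dots] = \left\{ \lim_{t\to+\infty} - \lim_{t\to-\infty} \right\} \frac{+i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; e^{ik\cdot x}\,\overleftrightarrow{\partial_0}\,\frac{\Phi(x)}{\sqrt{Z}} \]
\[
\begin{aligned}
\therefore\ \sqrt{Z}\,[\dots] &= \frac{i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^4x\; \partial_0 \left\{ e^{ik\cdot x}\,\overleftrightarrow{\partial_0}\,\Phi(x) \right\} \\
&= \frac{i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^4x \left\{ e^{ik\cdot x}\,\partial_0^2\Phi(x) - \partial_0^2(e^{ik\cdot x})\,\Phi(x) \right\} , \qquad \left( -\partial_0^2 e^{ik\cdot x} = (-\nabla^2 + m^2)e^{ik\cdot x} \right) \\
&= \frac{i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^4x\; e^{ik\cdot x} \left\{ \partial_0^2\Phi(x) + (-\nabla^2 + m^2)\Phi(x) \right\} , \qquad (\nabla^2 \text{ flipped onto } \Phi(x)).
\end{aligned}
\]
\[ \therefore\ \langle \beta\ \mathrm{out}|\alpha, k\ \mathrm{in}\rangle = \frac{-i}{\sqrt{Z}} \int d^4x\; \frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\,(\Box - m^2)\,\langle \beta\ \mathrm{out}|\Phi(x)|\alpha\ \mathrm{in}\rangle . \]
Now let us separate a particle with label $k'$ from the out-state and write $\langle \beta\ \mathrm{out}| = \langle \gamma, k'\ \mathrm{out}|$ in the inner product on the right hand side of the above equation. Consider,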
hγ, k 0 out|Φ(x)|α ini = hγ out|aout (~k 0 )Φ(x)|α ini
= hγ out|(aout (~k 0 )Φ(x) − Φ(x)ain (~k 0 ) + Φ(x)ain (~k 0 ))|α ini
As before, for simplicity, assume α-label does not contain k 0 and thus drop the third term
in the brackets. Notice that we have added and subtracted the zero term with ain (~k 0 ) to the
right of Φ(x) while aout (~k 0 ) is naturally to the left of Φ(x). We need to maintain this order
as we do not have the commutation relations between the in/out field and the interacting
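field. As before, using the inversion formula (10.7) and the asymptotic condition, we
get
\[ \langle \gamma, k'\ \mathrm{out}|\Phi(x)|\alpha\ \mathrm{in}\rangle = \langle \gamma\ \mathrm{out}| \left[ \frac{i}{\sqrt{2\omega_{\vec k'}(2\pi)^3}} \int d^3x'\; e^{-ik'\cdot x'} \left\{ \overleftrightarrow{\partial'_0}\,\Phi_{out}(x')\,\Phi(x) - \Phi(x)\,\overleftrightarrow{\partial'_0}\,\Phi_{in}(x') \right\} \right] |\alpha\ \mathrm{in}\rangle \]
\[ \therefore\ \sqrt{Z}\,[\dots] = \frac{i}{\sqrt{2\omega_{\vec k'}(2\pi)^3}} \int d^3x' \left\{ \lim_{t'\to\infty} \left( e^{-ik'\cdot x'}\,\overleftrightarrow{\partial'_0}\,\Phi(x') \right)\Phi(x) - \lim_{t'\to-\infty} \Phi(x) \left( e^{-ik'\cdot x'}\,\overleftrightarrow{\partial'_0}\,\Phi(x') \right) \right\} \]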
It is convenient to define a time ordering instruction. Define a time ordered product as:
T {A(t1 , ~x)B(t2 , ~y )} := θ(t1 − t2 )A(t1 , ~x)B(t2 , ~y ) + θ(t2 − t1 )B(t2 , ~y )A(t1 , ~x).
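Notice that the two limits already have the time ordering incorporated. Hence we can
combine the terms and write,
\[
\begin{aligned}
\sqrt{Z}\,[\dots] &= \frac{i}{\sqrt{2\omega_{\vec k'}(2\pi)^3}} \int d^3x' \left\{ \lim_{t'\to\infty} - \lim_{t'\to-\infty} \right\} e^{-ik'\cdot x'}\,\overleftrightarrow{\partial'_0}\; T\{\Phi(x')\Phi(x)\} \\
&= \frac{i}{\sqrt{2\omega_{\vec k'}(2\pi)^3}} \int d^4x'\; \partial'_0 \left\{ e^{-ik'\cdot x'}\,\overleftrightarrow{\partial'_0}\; T\{\Phi(x')\Phi(x)\} \right\}
\end{aligned}
\]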
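Proceeding exactly as before we get,
\[ \langle \gamma, k'\ \mathrm{out}|\alpha, k\ \mathrm{in}\rangle = \left( \frac{-i}{\sqrt{Z}} \right)^2 \int d^4x'\, \frac{e^{-ik'\cdot x'}}{\sqrt{2\omega_{\vec k'}(2\pi)^3}} \int d^4x\, \frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\; (\Box_{x'} - m^2)(\Box_x - m^2)\; \langle \gamma\ \mathrm{out}|T\{\Phi(x')\Phi(x)\}|\alpha\ \mathrm{in}\rangle \tag{10.18} \]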
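The generalization is immediate and we state the master formula for the S-matrix element
with $m$ incoming and $n$ outgoing particles as,
\[
\begin{aligned}
\langle k'_1, \dots, k'_n\ \mathrm{out}|k_1, \dots, k_m\ \mathrm{in}\rangle = {} & \left( \frac{-i}{\sqrt{Z}} \right)^{m+n} \int d^4x'_1 \dots d^4x'_n\; d^4x_1 \dots d^4x_m\; \prod_{j=1}^{n} \frac{e^{-ik'_j\cdot x'_j}}{\sqrt{2\omega_{\vec k'_j}(2\pi)^3}} \prod_{j=1}^{m} \frac{e^{+ik_j\cdot x_j}}{\sqrt{2\omega_{\vec k_j}(2\pi)^3}} \\
& (\Box_{x'_1} - m^2) \dots (\Box_{x'_n} - m^2)(\Box_{x_1} - m^2) \dots (\Box_{x_m} - m^2) \\
& \langle 0|T\{\Phi(x'_1) \dots \Phi(x'_n)\Phi(x_1) \dots \Phi(x_m)\}|0\rangle
\end{aligned}
\tag{10.19}
\]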
The vacuum expectation value of the time ordered product of n quantum fields (interacting
or free) is called an n−point function/Green function/n-point correlation function.
To get the S-matrix elements, the n-point function is acted upon by the free field equation
operator, multiplied by the in/out mode functions and integrated over all the space-time
points. Note that the time ordering arises because the definition of the S-matrix requires all the
creation operators, $a^\dagger_{in}(\vec k)$, to be on the right while the annihilation operators, $a_{out}(\vec k)$,
have to be on the left. This will also naturally lead to the Feynman Green's function (free,
2-point function).
Let us note the steps that were followed to get the S−matrix elements in terms of the
n-point functions.
• Mode decomposition of the free field and the inversion formulae;
• Equation of motion for the interacting field with a ‘source’ term on the right hand
side;
• Equal time(anti-)commutation relations for both the interacting and the in/out fields;
• Asymptotic conditions;
• Reduction process;
2. LSZ reduction for Dirac field
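We begin by recalling the mode decomposition.
\[
\begin{aligned}
\Psi(x) &= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left[ b(\vec k, \sigma)\,u(\vec k, \sigma)\,e^{ik\cdot x} + d^\dagger(\vec k, \sigma)\,v(\vec k, \sigma)\,e^{-ik\cdot x} \right] \\
\Psi^\dagger(x) &= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left[ b^\dagger(\vec k, \sigma)\,u^\dagger(\vec k, \sigma)\,e^{-ik\cdot x} + d(\vec k, \sigma)\,v^\dagger(\vec k, \sigma)\,e^{ik\cdot x} \right]
\end{aligned}
\]
We have used the $\Psi^\dagger$ field instead of $\bar\Psi$ because the inversion formulae take a more convenient
form.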
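We had defined the $u, v$ spinors as,
\[ (\slashed{k} + m)\,u(\vec k, \sigma) = 0 , \quad \bar u(\vec k, \sigma)\,u(\vec k, \sigma') = \delta_{\sigma,\sigma'} = -\bar v(\vec k, \sigma)\,v(\vec k, \sigma') , \quad (-\slashed{k} + m)\,v(\vec k, \sigma) = 0 \]
\[ \sum_\sigma u(\vec k, \sigma)\,\bar u(\vec k, \sigma) = \frac{-\slashed{k} + m}{2m} , \qquad \sum_\sigma v(\vec k, \sigma)\,\bar v(\vec k, \sigma) = -\frac{\slashed{k} + m}{2m} \]
We need to express these in terms of the adjoints rather than in terms of the Dirac adjoints.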
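This alternative form can be derived as follows. We have,
\[ \bar u(\vec k, \sigma)\gamma^\mu u(\vec k, \sigma') = \Lambda^\mu{}_\nu\, \bar u(\hat k, \sigma)\gamma^\nu u(\hat k, \sigma') , \quad k^\mu = \Lambda^\mu{}_\nu \hat k^\nu , \quad \hat k^\mu = (m, \vec 0) . \]
For $\mu = 0$,
\[
\begin{aligned}
u^\dagger(\vec k, \sigma)\,u(\vec k, \sigma') &= \Lambda^0{}_\nu\, u^\dagger(\hat k, \sigma)\gamma^0\gamma^\nu u(\hat k, \sigma') = \Lambda^0{}_\nu\, u^\dagger(\hat k, \sigma)\gamma^\nu u(\hat k, \sigma') \qquad \because\ \gamma^0 u(\hat k, \sigma') = u(\hat k, \sigma') \\
&= \Lambda^0{}_0\, u^\dagger(\hat k, \sigma)\gamma^0 u(\hat k, \sigma') + \Lambda^0{}_i\, u^\dagger(\hat k, \sigma)\gamma^i u(\hat k, \sigma') \\
\therefore\ u^\dagger(\vec k, \sigma)\,u(\vec k, \sigma') &= \Lambda^0{}_0\, \bar u(\hat k, \sigma)\,u(\hat k, \sigma') + 0 = \Lambda^0{}_0\, \delta_{\sigma,\sigma'} + 0
\end{aligned}
\]
The second term in the last equation is zero because, for the $\hat k$ spinors, $\gamma^0 u(\hat k, \sigma) = u(\hat k, \sigma)$. This
implies,
\[ u^\dagger\gamma^i u = u^\dagger\gamma^0\gamma^i u = -u^\dagger\gamma^i\gamma^0 u = -u^\dagger\gamma^i u . \]
Similarly, $v^\dagger(\vec k, \sigma)\,v(\vec k, \sigma') = -\Lambda^0{}_0\, \bar v(\hat k, \sigma)\,v(\hat k, \sigma') = +\Lambda^0{}_0\, \delta_{\sigma,\sigma'}$. But $\Lambda^0{}_0$ is defined through
$k^\mu = \Lambda^\mu{}_\nu \hat k^\nu$. For $\mu = 0$ this gives $\Lambda^0{}_0 = \omega_{\vec k}/m$. Our orthonormality relations then take the
form,
\[ u^\dagger(\vec k, \sigma)\,u(\vec k, \sigma') = \frac{\omega_{\vec k}}{m}\,\delta_{\sigma,\sigma'} = v^\dagger(\vec k, \sigma)\,v(\vec k, \sigma') , \qquad u^\dagger v = 0 = v^\dagger u . \]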
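The normalization $u^\dagger u = \omega_{\vec k}/m$ can be checked numerically. The following is a minimal sketch under assumptions that are not fixed by the text above: it uses the Dirac representation of the gamma matrices with $(\gamma^0)^2 = 1$, $(\gamma^i)^2 = -1$, the mostly-plus metric so that $\slashed{k} = -\omega\gamma^0 + \vec k\cdot\vec\gamma$, and the standard boosted rest-frame spinor. The names `u_spinor` and `kslash` are illustrative.

```python
import numpy as np

# Pauli matrices and Dirac-representation gamma matrices (an assumed representation)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
GAMMA0 = np.block([[I2, Z2], [Z2, -I2]])
GAMMAS = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]

def kslash(k, m):
    """gamma^mu k_mu with metric (-,+,+,+): k_0 = -omega, k_i = k^i."""
    omega = np.sqrt(m**2 + np.dot(k, k))
    return -omega * GAMMA0 + sum(ki * gi for ki, gi in zip(k, GAMMAS))

def u_spinor(k, m, chi):
    """Positive-energy spinor normalized so that ubar u = 1 for a unit two-spinor chi."""
    omega = np.sqrt(m**2 + np.dot(k, k))
    sigma_k = sum(ki * si for ki, si in zip(k, SIGMA))
    norm = np.sqrt((omega + m) / (2 * m))
    return norm * np.concatenate([chi, sigma_k @ chi / (omega + m)])

k = np.array([0.3, -1.2, 0.7])
m = 1.5
omega = np.sqrt(m**2 + np.dot(k, k))
u = u_spinor(k, m, np.array([1, 0], dtype=complex))

residual = (kslash(k, m) + m * np.eye(4)) @ u   # (kslash + m) u, should vanish
ubar_u = np.vdot(u, GAMMA0 @ u).real            # Dirac-adjoint norm
udag_u = np.vdot(u, u).real                     # ordinary adjoint norm
print(ubar_u, udag_u, omega / m)
```

The last two printed numbers should agree, illustrating that the adjoint norm picks up the boost factor $\Lambda^0{}_0 = \omega_{\vec k}/m$ while the Dirac-adjoint norm stays at 1.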
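We can use these relations to derive the inversion formulae. For instance, multiplying the
mode decomposition for $\Psi(x)$ by $u^\dagger(\vec k, \sigma)e^{-ik\cdot x}$ and integrating over $d^3x$ yields,
\[ b(\vec k, \sigma) = \frac{2m}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; u^\dagger(\vec k, \sigma)\,e^{-ik\cdot x}\,\Psi(x) \]
It is convenient to define
\[ U(\vec k, \sigma) := 2m\,\frac{u(\vec k, \sigma)}{\sqrt{2\omega_{\vec k}(2\pi)^3}} , \qquad V(\vec k, \sigma) := 2m\,\frac{v(\vec k, \sigma)}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \]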
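The inversion formulae are,
\[
\begin{aligned}
b(\vec k, \sigma) &= \int d^3x\; e^{-ik\cdot x}\, U^\dagger(\vec k, \sigma)\,\Psi(x) && (10.20) \\
d^\dagger(\vec k, \sigma) &= \int d^3x\; e^{+ik\cdot x}\, V^\dagger(\vec k, \sigma)\,\Psi(x) && (10.21) \\
b^\dagger(\vec k, \sigma) &= \int d^3x\; e^{+ik\cdot x}\, \Psi^\dagger(x)\,U(\vec k, \sigma) && (10.22) \\
d(\vec k, \sigma) &= \int d^3x\; e^{-ik\cdot x}\, \Psi^\dagger(x)\,V(\vec k, \sigma) && (10.23)
\end{aligned}
\]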
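In the reduction process, we will have for instance,
\[ b^\dagger_{in}(\vec k, \sigma) - b^\dagger_{out}(\vec k, \sigma) = \int d^3x\; e^{ik\cdot x} \left( \Psi^\dagger_{in}(x) - \Psi^\dagger_{out}(x) \right) U(\vec k, \sigma) \]
\[ \text{and using the asymptotic condition} \ \longrightarrow\ \frac{1}{\sqrt{Z}} \int d^4x\; \partial_0 \left( e^{ik\cdot x}\,\Psi^\dagger(x) \right) U(\vec k, \sigma) \]
Next,
\[
\begin{aligned}
\partial_0 \left( e^{ik\cdot x}\,\Psi^\dagger(x) \right) U(\vec k, \sigma) &= e^{ik\cdot x} \left( \partial_0\Psi^\dagger + ik_0\Psi^\dagger \right) U \ ; \\
ik_0\Psi^\dagger U = i\bar\Psi(k_0\gamma^0)U &= i\bar\Psi(\slashed{k} - k_i\gamma^i)U = i\bar\Psi(-mU) + \bar\Psi\gamma^i(-ik_iU) \\
\therefore\ e^{ik\cdot x}\, ik_0\Psi^\dagger U &= e^{ik\cdot x}(-i\bar\Psi mU) + \bar\Psi\gamma^i(-\partial_i e^{ik\cdot x})U = e^{ik\cdot x} \left( -im\bar\Psi + \partial_i\bar\Psi\gamma^i \right) U \\
\therefore\ \partial_0 \left( e^{ik\cdot x}\,\Psi^\dagger \right) U(\vec k, \sigma) &= e^{ik\cdot x} \left\{ \partial_0\bar\Psi\gamma^0 + \partial_i\bar\Psi\gamma^i - im\bar\Psi \right\} U = (-ie^{ik\cdot x}) \left\{ i\bar\Psi\overleftarrow{\slashed{\partial}} + m\bar\Psi \right\} U
\end{aligned}
\]
(the spatial derivative has been moved onto $\bar\Psi$ by an integration by parts under the integral).
Hence, instead of $(-\Box_x + m^2)$ acting on $\Phi(x)$, we will have $(-2mi)(i\overleftarrow{\slashed{\partial}} + m)$ acting on $\bar\Psi(x)$
and $(-2mi)(-i\overrightarrow{\slashed{\partial}} + m)$ acting on $\Psi(x)$. The wavefunction factors will be:
\[ \text{Particle in in-state: } U(\vec k, \sigma)\,e^{ik\cdot x} , \qquad \text{Particle in out-state: } \bar U(\vec k, \sigma)\,e^{-ik\cdot x} \tag{10.24} \]
\[ \text{anti-Particle in in-state: } \bar V(\vec k, \sigma)\,e^{ik\cdot x} , \qquad \text{anti-Particle in out-state: } V(\vec k, \sigma)\,e^{-ik\cdot x} \tag{10.25} \]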
For the multi-particle case we will again have a time ordering instruction. For fermions, it is
defined with a relative minus sign,
\[ T\{\Psi_\alpha(x)\Psi_\beta(y)\} := \theta(x^0 - y^0)\,\Psi_\alpha(x)\Psi_\beta(y) - \theta(y^0 - x^0)\,\Psi_\beta(y)\Psi_\alpha(x) , \]
and $T\{\Psi_\alpha(x)\Psi_\beta(y)\} = -T\{\Psi_\beta(y)\Psi_\alpha(x)\}$. Thus for fermions, changing the ordering inside
the time ordering generates a minus sign for each exchange. We summarize the general
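matrix element for $m$ fermions going into $n$ fermions as,
\[
\begin{aligned}
{}_{out}\langle ..\vec k'_{j'}\sigma'_{j'}..|..\vec k_j\sigma_j..\rangle_{in} = {} & \left( \frac{-i}{\sqrt{Z_\Psi}} \right)^{m+n} \prod_{j,j'} \int d^4x'_{j'}\, d^4x_j\; U(\vec k_j, \sigma_j)\,\bar U(\vec k'_{j'}, \sigma'_{j'}) \\
& e^{-i\sum_{j'} k'_{j'}\cdot x'_{j'}}\; (-i\slashed{\partial}'_1 + m) \dots (-i\slashed{\partial}'_n + m) \\
& \langle 0|T\{\Psi(x'_1)..\Psi(x'_n)\bar\Psi(x_1)..\bar\Psi(x_m)\}|0\rangle \\
& (-i\overleftarrow{\slashed{\partial}}_1 + m) \dots (-i\overleftarrow{\slashed{\partial}}_m + m)\; e^{+i\sum_j k_j\cdot x_j}
\end{aligned}
\tag{10.26}
\]
For anti-fermions in the in/out states, the spinors $U, \bar U$ are changed to $V, \bar V$ as appropriate.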
3. LSZ reduction for Maxwell field
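This is a bosonic field so there is no additional minus sign. Like the Dirac field, it has
two polarizations and thanks to the zero mass, these are transverse polarizations. We begin
by recalling the mode decomposition.
\[ A_\mu(x) = \int \frac{d^3k}{\sqrt{2\omega(2\pi)^3}} \sum_{\lambda=1,2} \left[ a_{\vec k,\lambda}\,\varepsilon_\mu(\vec k, \lambda)\,e^{ik\cdot x} + a^\dagger_{\vec k,\lambda}\,\varepsilon^*_\mu(\vec k, \lambda)\,e^{-ik\cdot x} \right] \]
\[ \varepsilon_0(\vec k, \lambda) := -\frac{\vec k\cdot\vec\varepsilon(\vec k, \lambda)}{\omega_{\vec k}} , \qquad \varepsilon_i(\vec k, \lambda) := \left( \delta_i{}^j - \frac{k_i k^j}{\vec k^2} \right) \varepsilon_j(\vec k, \lambda) , \qquad \omega_{\vec k} = |\vec k| \]
\[ \vec\varepsilon(\vec k, \lambda)\cdot\vec\varepsilon^{\,*}(\vec k, \lambda') = \delta_{\lambda,\lambda'} , \qquad \sum_\lambda \varepsilon_i(\vec k, \lambda)\,\varepsilon^{*j}(\vec k, \lambda) = \delta_i{}^j - \frac{k_i k^j}{\vec k^2} \]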
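This gives the inversion formulae as,
\[ a(\vec k, \lambda) = i\varepsilon_i(\vec k, \lambda) \int d^3x\; e^{-ik\cdot x}\,\overleftrightarrow{\partial_0}\,A^i(x) , \qquad a^\dagger(\vec k, \lambda) = -i\varepsilon^*_i(\vec k, \lambda) \int d^3x\; e^{+ik\cdot x}\,\overleftrightarrow{\partial_0}\,A^i(x) \]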
In the transverse/radiation/Coulomb gauge, εµ (~k, λ) = εµ (~k, λ) , ~k · ~ε(~k, λ) = 0 and the
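equations of motion are $\Box A_i(x) = 0$. The commutation relations are:
\[ [A_i(t, \vec x), \pi_j(t, \vec y)] = i \left( \delta_{ij} - \frac{\partial_i\partial_j}{\nabla^2} \right) \delta^3(\vec x - \vec y) := i \int \frac{d^3k}{(2\pi)^3}\; e^{i\vec k\cdot(\vec x - \vec y)} \left( \delta_{ij} - \frac{k_i k_j}{\vec k^2} \right) \]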
The reduction formula goes similar to the scalar. We will have $(-\Box_x + m^2) \to -\Box_x$
and the wave function factors would be $E_\mu(\vec k, \lambda) := \frac{\varepsilon_\mu(\vec k, \lambda)}{\sqrt{2\omega_{\vec k}(2\pi)^3}}$. Then the general S-matrix
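element for $m \to n$ particles takes the form,
\[
\begin{aligned}
{}_{out}\langle ..\vec k'_{j'}\lambda'_{j'}..|..\vec k_j\lambda_j..\rangle_{in} = {} & \left( \frac{-i}{\sqrt{Z_A}} \right)^{m+n} \prod_{j,j'} \int d^4x'_{j'}\, d^4x_j\; E(\vec k_j, \lambda_j)\,E^*(\vec k'_{j'}, \lambda'_{j'}) \\
& e^{-i\sum_{j'} k'_{j'}\cdot x'_{j'}}\; (-\Box'_1) \dots (-\Box'_n) \\
& \langle 0|T\{A_{\mu'_1}(x'_1)..A_{\mu'_n}(x'_n)A_{\mu_1}(x_1)..A_{\mu_m}(x_m)\}|0\rangle \\
& (-\overleftarrow{\Box}_1) \dots (-\overleftarrow{\Box}_m)\; e^{+i\sum_j k_j\cdot x_j}
\end{aligned}
\tag{10.27}
\]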
Having related the S-matrix elements to the n-point functions of the interacting fields,
our next task is to see if these can be expressed in terms of the free fields in some
systematic, well defined way. This is achieved in the so-called covariant perturbation series.
11. COVARIANT PERTURBATION THEORY
In formulating the scattering theory for interacting fields, we postulated the in/out fields
satisfying the free field equations. More importantly, we also postulated them to satisfy the
same equal time (anti-)commutation relations. With some assumptions of existence, these
suffice to develop a perturbation scheme to compute the n−point functions.
Since the interacting and the free (in/out) fields satisfy the same basic quantum conditions, it is plausible that they are related by some unitary transformation. For systems with
finitely many degrees of freedom, there are theorems that guarantee the existence of canonical transformations at the classical level and unitary transformations at the quantum level (thanks
to the Stone-von-Neumann theorem). For field theories, there is no such guarantee and we
need to postulate the required existence. Their utility gives a post-facto justification for the
assumptions. Let,
\[ \Phi(t, \vec x) = U^{-1}(t)\,\Phi_{in}(t, \vec x)\,U(t) , \qquad \Pi(t, \vec x) = U^{-1}(t)\,\Pi_{in}(t, \vec x)\,U(t) . \]
The fields satisfy the equations of motion in the form,
∂t Φin (x) = i[Hin (Φin , Πin ), Φin (x)] , ∂t Πin (x) = i[Hin (Φin , Πin ), Πin (x)]
(11.1)
∂t Φ(x) = i[H(Φ, Π), Φ(x)] , ∂t Π(x) = i[H(Φ, Π), Π(x)]
(11.2)
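These enable us to derive an equation for $U(t)$.
\[
\begin{aligned}
\partial_t\Phi_{in} &= \partial_t \left[ U(t)\,\Phi\,U^{-1}(t) \right] = \partial_t U\cdot U^{-1}\,\Phi_{in} + U(\partial_t\Phi)U^{-1} + \Phi_{in}\cdot U\partial_t U^{-1} \\
&= [\partial_t U\cdot U^{-1}, \Phi_{in}] + iU[H(\Phi, \Pi), \Phi]U^{-1} . \quad \text{But,} \\
U[H(\Phi, \Pi), \Phi]U^{-1} &= [UH(\Phi, \Pi)U^{-1}, U\Phi U^{-1}] = [H(\Phi_{in}, \Pi_{in}), \Phi_{in}] \\
\therefore\ \partial_t\Phi_{in} &= \left[ \partial_t U\cdot U^{-1} + i\left\{ H(\Phi_{in}, \Pi_{in}) - H_{in}(\Phi_{in}, \Pi_{in}) + H_{in}(\Phi_{in}, \Pi_{in}) \right\} , \Phi_{in} \right] \\
\partial_t\Phi_{in} &= \partial_t\Phi_{in} + [\partial_t U\cdot U^{-1} + iH_I(\Phi_{in}, \Pi_{in}), \Phi_{in}] \\
\text{Similarly,}\quad \partial_t\Pi_{in} &= \partial_t\Pi_{in} + [\partial_t U\cdot U^{-1} + iH_I(\Phi_{in}, \Pi_{in}), \Pi_{in}] .
\end{aligned}
\]
Here $H_I(\Phi_{in}, \Pi_{in}) := H(\Phi_{in}, \Pi_{in}) - H_{in}(\Phi_{in}, \Pi_{in})$.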
Since $\partial_t U\cdot U^{-1} + iH_I(\Phi_{in}, \Pi_{in})$ commutes with both $\Phi_{in}, \Pi_{in}$, it must be a multiple of
identity, say $E_0 \mathbb{1}$. This gives the equation determining the $U(t)$ as,
\[ i\partial_t U(t) = H'_I(\Phi_{in}, \Pi_{in})\,U(t) , \qquad H'_I(\Phi_{in}, \Pi_{in}) := H(\Phi_{in}, \Pi_{in}) - H_{in}(\Phi_{in}, \Pi_{in}) + E_0(t) \tag{11.3} \]
To solve the equation, it is convenient to define the combination $U(t, t') := U(t)U^{-1}(t')$.
It follows that this combination too satisfies the same first order equation with the initial
condition, $U(t, t) = 1$. The integral form of this equation is:
\[ U(t, t') = 1 - i \int_{t'}^{t} dt''\, H'_I(t'')\,U(t'', t') . \]
This is solved by iteration.
Define: $U_0(t, t') = 1$, and for $n \ge 1$, $U_n(t, t') := 1 - i \int_{t'}^{t} dt_n\, H'(t_n)\,U_{n-1}(t_n, t')$. Then
$U(t, t') = \lim_{n\to\infty} U_n(t, t')$ is the formal solution of the integral equation.
It is trivial to verify the formal solution. It is formal because no conditions are imposed
for convergence of the series. The first few terms are,
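\[
\begin{aligned}
U_0(t, t') &= 1 \\
U_1(t, t') &= 1 - i \int_{t'}^{t} dt_1\, H'(t_1)\cdot 1 \\
U_2(t, t') &= 1 - i \int_{t'}^{t} dt_2\, H'(t_2)\,U_1(t_2, t') \\
&= 1 + (-i) \int_{t'}^{t} dt_2\, H'(t_2) + (-i)^2 \int_{t'}^{t} dt_2 \int_{t'}^{t_2} dt_1\, H'(t_2)H'(t_1) \\
U_3(t, t') &= 1 + (-i) \int_{t'}^{t} dt_3\, H'(t_3) \left[ 1 + (-i) \int_{t'}^{t_3} dt_2\, H'(t_2) + (-i)^2 \int_{t'}^{t_3} dt_2 \int_{t'}^{t_2} dt_1\, H'(t_2)H'(t_1) \right] \\
&= 1 + (-i) \int_{t'}^{t} dt_3\, H'(t_3) + (-i)^2 \int_{t'}^{t} dt_3 \int_{t'}^{t_3} dt_2\, H'(t_3)H'(t_2) \\
&\quad + (-i)^3 \int_{t'}^{t} dt_3 \int_{t'}^{t_3} dt_2 \int_{t'}^{t_2} dt_1\, H'(t_3)H'(t_2)H'(t_1) \\
\therefore\ U_n(t, t') &= 1 + \sum_{k=1}^{n} (-i)^k \int_{t'}^{t} dt_k \int_{t'}^{t_k} dt_{k-1} \cdots \int_{t'}^{t_2} dt_1\, H'(t_k)H'(t_{k-1}) \dots H'(t_1)
\end{aligned}
\tag{11.4}
\]
Consider the term $\int_{t'}^{t} dt_n \int_{t'}^{t_n} dt_{n-1}$. By interchanging the order of integration we can write
it as ($H'_n := H'(t_n)$),
\[
\begin{aligned}
\int_{t'}^{t} dt_n \int_{t'}^{t_n} dt_{n-1}\, H'_n H'_{n-1} &= \int_{t'}^{t} dt_{n-1} \int_{t_{n-1}}^{t} dt_n\, H'_n H'_{n-1} = \int_{t'}^{t} dt_n \int_{t_n}^{t} dt_{n-1}\, H'_{n-1} H'_n \\
\therefore\ \text{LHS} = \frac{1}{2}(\text{LHS} + \text{RHS}) &= \frac{1}{2} \int_{t'}^{t} dt_n \int_{t'}^{t_n} dt_{n-1}\, H'_n H'_{n-1} + \frac{1}{2} \int_{t'}^{t} dt_n \int_{t_n}^{t} dt_{n-1}\, H'_{n-1} H'_n \\
&= \frac{1}{2} \left[ \int_{t'}^{t} dt_n \int_{t'}^{t_n} dt_{n-1}\, T\{H'_n H'_{n-1}\} + \int_{t'}^{t} dt_n \int_{t_n}^{t} dt_{n-1}\, T\{H'_n H'_{n-1}\} \right] \\
&= \frac{1}{2} \int_{t'}^{t} dt_n \int_{t'}^{t} dt_{n-1}\, T\{H'_n H'_{n-1}\}
\end{aligned}
\]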
Similarly, we can symmetrize the higher order terms to replace each product by their time
ordered products and then combine the integrals to have the same limits of integrations. The
nth order term then takes the form: $\frac{1}{n!} \int_{t'}^{t} dt_n \int_{t'}^{t} dt_{n-1} \cdots \int_{t'}^{t} dt_1\, T\{H'(t_n)H'(t_{n-1}) \dots H'(t_1)\}$.
We write the iterated formal solution of (11.3) as,
\[ U(t, t') := T \exp\left[ -i \int_{t'}^{t} dt''\, H'_I(\Phi_{in}(t''), \Pi_{in}(t'')) \right] \tag{11.5} \]
\[ = 1 + \sum_{n=1}^{\infty} \frac{(-i)^n}{n!} \int_{t'}^{t} dt_1 \cdots \int_{t'}^{t} dt_n\; T\{H'_I(t_1) \dots H'_I(t_n)\} \tag{11.6} \]
The solution is thus obtained entirely in terms of the ‘in’ fields. Note that the solution
gives $U(t, t')$ and not $U(t)$. While the equations satisfied by $U(t)$ and $U(t, t')$ are the same,
there is no “initial” condition provided for $U(t)$ and in a sense it is ill-defined. There is no
canonical way to deduce/define $U(t)$ from $U(t, t')$. Fortunately, $U(t, t')$ suffices.
Note: From the definition it follows the useful composition property that U (t, t0 ) =
U (t, t00 )U (t00 , t0 ), regardless of any inequalities for t00 . This may also be verified directly.
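The time ordered exponential and its composition property can be illustrated numerically. The sketch below is only a toy model and not part of the field theory construction: it assumes a $2\times 2$ Hermitian "interaction Hamiltonian" `h_int(t)` that does not commute with itself at different times, and approximates $U(t, t')$ by an ordered product of short-time factors with later times to the left. All function names are illustrative.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def h_int(t):
    """Toy Hermitian Hamiltonian (an assumption) with [H(t1), H(t2)] != 0."""
    return SZ + np.cos(t) * SX

def step(h, dt):
    """exp(-i h dt) for a Hermitian matrix h, via its eigen-decomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def evolve(t, t0, steps):
    """Ordered product approximating T exp(-i int_{t0}^{t} H dt'); also works for t < t0."""
    dt = (t - t0) / steps
    u = np.eye(2, dtype=complex)
    for k in range(steps):
        # each new factor multiplies from the left, so later times stand to the left
        u = step(h_int(t0 + (k + 0.5) * dt), dt) @ u
    return u

def naive(t, t0, steps):
    """exp(-i int H dt') without time ordering, for comparison."""
    dt = (t - t0) / steps
    integral = sum(h_int(t0 + (k + 0.5) * dt) for k in range(steps)) * dt
    return step(integral, 1.0)

u_20 = evolve(2.0, 0.0, 2000)
# composition through an intermediate time inside and outside the interval [0, 2]
u_via_1 = evolve(2.0, 1.0, 1000) @ evolve(1.0, 0.0, 1000)
u_via_3 = evolve(2.0, 3.0, 1000) @ evolve(3.0, 0.0, 3000)
print(np.abs(u_20 - u_via_1).max(), np.abs(u_20 - u_via_3).max())
print(np.abs(u_20 - naive(2.0, 0.0, 2000)).max())
```

The first two printed numbers should be at the level of rounding errors, while the last one is of order one: dropping the time ordering changes the result whenever the Hamiltonians at different times fail to commute.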
Let us see an implication of the existence of $U(t)$. Consider an n-point function for a
scalar, $G(x_1, \dots, x_n) = \langle 0|T\{\Phi(x_1) \dots \Phi(x_n)\}|0\rangle$. Insert in this $\Phi(x) = U^{-1}(t)\Phi_{in}(x)U(t)$
which gives,
\[
\begin{aligned}
\Phi(x_1)\Phi(x_2) \dots \Phi(x_n) &= U^{-1}(t_1)\Phi_{in}(x_1)U(t_1)\cdot U^{-1}(t_2)\Phi_{in}(x_2)U(t_2) \dots U^{-1}(t_n)\Phi_{in}(x_n)U(t_n) \\
&= U^{-1}(t) \left[ U(t, t_1)\Phi_{in}(x_1)U(t_1, t_2)\Phi_{in}(x_2) \dots U(t_{n-1}, t_n)\Phi_{in}(x_n)U(t_n, -t) \right] U(-t) , \qquad U(t, t') := U(t)U^{-1}(t')
\end{aligned}
\]
Observe that in the limit $t \to \infty$, the factor $U^{-1}(t)$ goes to the extreme left and $U(-t)$ to
the extreme right. All other space-time arguments being finite, we can take these factors
outside the time ordering symbol. The product of the fields $\Phi(x)$'s is already under time
ordering which then allows us to arrange all the factors within $T\{\dots\}$ to be conveniently
grouped. In particular, all the $U(t_i, t_j)$ can be combined using their composition property
noted above.
With the limit of $t \to \infty$ implicit, we can write the n-point function as,
\[ G(x_1, \dots, x_n) = \langle 0|U^{-1}(t)\; T\left\{ \Phi_{in}(x_1) \dots \Phi_{in}(x_n) \exp\left[ -i \int_{-t}^{t} dt'\, H'_I(t') \right] \right\} U(-t)|0\rangle . \tag{11.7} \]
Next, we evaluate the U (−t)|0i in the limit t → ∞ and its adjoint.
Claim: U (−t)|0i → λ− |0i as t → ∞.
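Proof: Consider an in-state containing at least one particle of momentum $k$, $|\alpha, k\rangle_{in}$.
Then,
\[
\begin{aligned}
{}_{in}\langle \alpha, k|U(-t)|0\rangle &= {}_{in}\langle \alpha|a_{in}(\vec k)\,U(-t)|0\rangle \\
&= i \int_{\Sigma_{t'}} \frac{d^3x}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\; e^{-ik\cdot x}\,\overleftrightarrow{\partial'_0}\; {}_{in}\langle \alpha|\Phi_{in}(-t', \vec x)\,U(-t)|0\rangle
\end{aligned}
\]
We have used the inversion formula for $a_{in}$ by choosing an arbitrary $\Sigma_{t'}$ hyper-surface and
$a_{in}$ is of course independent of this.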
In the integrand above, substitute $\Phi_{in}(t', \vec x) = U(t')\Phi(t', \vec x)U^{-1}(t')$ and evaluate the time
derivative.
\[
\begin{aligned}
\text{Integrand} = e^{-ik\cdot x}\,\langle \alpha| \Big[ & \dot U(-t')U^{-1}(-t')\Phi_{in}(-t', \vec x)U(-t) + \Phi_{in}(-t', \vec x)U(-t')\,\partial_{t'}U^{-1}(-t')\,U(-t) \\
& + U(-t')\,\partial'_0\Phi(-t')\,U^{-1}(-t')U(-t) - U(-t')\Phi(-t')U^{-1}(-t')U(-t)\,(\partial'_0(-ik\cdot x)) \Big] |0\rangle
\end{aligned}
\]
Now choose $t' = t$. Then, the first two terms on the r.h.s. combine as,
\[ [\dot U(-t)U^{-1}(-t), \Phi_{in}(-t, \vec x)]\,U(-t) = -i[H'_I(\Phi_{in}, \Pi_{in}), \Phi_{in}(-t, \vec x)]\,U(-t) = 0 . \]
In the usual theories, the interaction Hamiltonian has no Π dependence and hence at equal
time, commutes with the field giving 0.
In the last two terms, the $U^{-1}(-t')U(-t)$ factor becomes 1 for $t' = t$. These terms
combine to give ${}_{in}\langle \alpha|U(-t) \left( e^{-ik\cdot x}\,\overleftrightarrow{\partial_0}\,\Phi(-t) \right) |0\rangle$. Taking the limit $t \to \infty$ and invoking the
asymptotic condition converts $\Phi(x)$ into $\sqrt{Z}\,\Phi_{in}(x)$. The spatial integration converts the
expression into $\sqrt{Z}\; {}_{in}\langle \alpha|U(-t)\,a_{in}(\vec k)|0\rangle = 0$.
Thus we conclude that ${}_{in}\langle \alpha, k|U(-t)|0\rangle \to 0$ as $t \to \infty$ for all states containing at least
one particle. $U(-t)|0\rangle$ must therefore be proportional to $|0\rangle$ itself, thereby proving the claim.
By similar reasoning, we obtain U (t)|0i → λ+ |0i as t → ∞.
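Corollary:
\[ \lambda^*_+\lambda_- = \left( \langle 0|T \exp\left[ -i \int_{-\infty}^{\infty} dt'\, H'_I(t') \right] |0\rangle \right)^{-1} \tag{11.8} \]
The proof is simple,
\[
\begin{aligned}
\lambda^*_+\lambda_- &= \lim_{t\to\infty} \langle 0|U^{-1}(t)|0\rangle\langle 0|U(-t)|0\rangle \\
&\approx \langle 0|U^{-1}(t)\,U(-t)|0\rangle \qquad \text{since intermediate states do not contribute by the above result} \\
&\approx \langle 0|U(-t)\,U^{-1}(t)|0\rangle = \langle 0|U(-t, t)|0\rangle \\
&\approx \langle 0|(U(t, -t))^{-1}|0\rangle = \left( \langle 0|U(t, -t)|0\rangle \right)^{-1} .
\end{aligned}
\]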
The final expression follows by substituting the formal solution in the limit t → ∞.
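Using the equations (11.7, 11.8), we express the n-point function as,
\[ G(x_1, \dots, x_n) = \frac{\langle 0|T\left\{ \Phi_{in}(x_1) \dots \Phi_{in}(x_n) \exp\left[ -i \int_{-\infty}^{\infty} dt'\, H_I(t') \right] \right\}|0\rangle}{\langle 0|T\left\{ \exp\left[ -i \int_{-\infty}^{\infty} dt'\, H_I(t') \right] \right\}|0\rangle} \tag{11.9} \]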
Notice that the multiple of identity E0 (t)1 has been canceled between the numerator and
the denominator before taking the limit t → ∞.
Suffice it to say that analogous expressions exist for other fields as well. However we will
not write them explicitly.
The reduction formulae give S−matrix elements in terms of the n−point functions and
the n−point functions have a perturbative expansion in terms of the free fields alone. These
two together are the master formulae for computations.
A further simplification is provided by the Wick’s Theorem.
A. Normal ordering and Wick's theorem
Recall that the mode expansion of the free quantum fields can be grouped as Φ(x) =
Φ+ (x) + Φ− (x) with Φ± (x) denoting the sum over positive/negative frequency modes. The
positive frequency part is made up of annihilation operators while the negative frequency
part is made up of creation operators alone. For a Hermitian quantum field, the negative
frequency part is the Hermitian adjoint of the positive frequency part. A product of fields
at different points will mix these parts and the idea is to bring all annihilation operators to
the right and the creation operators to the left. The vacuum expectation value of so ordered
groups will of course be zero. In this process, several commutators of fields at different
space-time points are generated, but they all are c-numbers. This leads to a simplification.
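We define normal ordered products of two free fields as:
For bosons
\[
:\Phi^-(x)\Phi^+(y): \;=\; \Phi^-(x)\Phi^+(y), \qquad :a^\dagger a: \;:=\; a^\dagger a \tag{11.10}
\]
\[
:\Phi^+(x)\Phi^-(y): \;=\; \Phi^-(y)\Phi^+(x), \qquad :a a^\dagger: \;:=\; a^\dagger a \tag{11.11}
\]
For fermions
\[
:\Psi^-(x)\Psi^+(y): \;=\; \Psi^-(x)\Psi^+(y), \qquad :b^\dagger b: \;:=\; b^\dagger b \tag{11.12}
\]
\[
:\Psi^+(x)\Psi^-(y): \;=\; -\Psi^-(y)\Psi^+(x), \qquad :b b^\dagger: \;:=\; -b^\dagger b \tag{11.13}
\]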
For a product of two or more positive (or negative) frequency fields, the normal ordering does not change the order. It is immediately obvious that $\langle 0|:\text{Operator}:|0\rangle = 0$.
Note that normal ordering is meaningful only for products of free fields (in/out). Hence the in/out suffixes are suppressed for these.
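The definitions (11.10)-(11.13) can be mimicked on strings of ladder operators. The following is a minimal sketch, assuming each operator is represented by a hypothetical `(kind, label)` tuple with `kind` equal to `"create"` or `"annihilate"`; the function name and data layout are illustrative choices and not part of the notes. It moves creation operators to the left, keeps the relative order within each group, drops all (anti-)commutator terms, and returns the sign $\sigma_p$.

```python
def normal_order(ops, fermionic=False):
    """Normal order a product of ladder operators.

    ops: list of (kind, label) tuples, kind in {"create", "annihilate"}.
    Returns (sign, reordered_ops). Commutator terms are dropped, as in
    the definition of the normal ordered product.
    """
    for kind, _ in ops:
        if kind not in ("create", "annihilate"):
            raise ValueError(f"unknown operator kind: {kind!r}")

    creators = [op for op in ops if op[0] == "create"]
    annihilators = [op for op in ops if op[0] == "annihilate"]

    # Count how many times a creation operator has to jump over an
    # annihilation operator standing to its left.
    swaps = 0
    annihilators_seen = 0
    for kind, _ in ops:
        if kind == "annihilate":
            annihilators_seen += 1
        else:
            swaps += annihilators_seen

    # Bosons never pick up a sign; fermions pick up (-1) per swap.
    sign = -1 if (fermionic and swaps % 2 == 1) else 1
    return sign, creators + annihilators


# :a a^dagger: = a^dagger a   and   :b b^dagger: = - b^dagger b
boson = normal_order([("annihilate", "k"), ("create", "k'")])
fermion = normal_order([("annihilate", "k"), ("create", "k'")], fermionic=True)
```

Here `boson` carries sign `+1` and `fermion` carries sign `-1`, with the creation operator moved to the left in both cases.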
To get a general form for several bosonic as well as fermionic free fields, notice that for
a quantum field Q(x) = Q+ (x) + Q− (x) a product of n fields generates terms with the k
number of Q+ ’s interspersed with (n − k) number of Q− ’s, with the order of the space-time
points maintained, with k = 0, . . . , n. Under normal ordering each of these terms will shift
the Q+ ’s to the right and Q− ’s to the left, generating a permutation of the space-time points,
say p. If Q is a fermionic field, the term with permutation p will get a factor of σp = sgn(p)
while a bosonic field will have σp = 1. Note that for any given k, there will be several terms
eg Q1 Q2 , Q3 Q7 , . . . etc. Each will generate its own permutation under normal ordering.
With this understood, we can write the general expression for normal ordered product of
n−fields as,
\[
:Q(x_1)\dots Q(x_n): \;:=\; \sum_{A,B}\sigma_p\prod_{i\in A}Q^-(x_{p(i)})\prod_{j\in B}Q^+(x_{p(j)}) .
\]
The A, B are two groups of space-time points corresponding to the k, (n − k) mentioned
above and the sum refers to various possible groupings of the space-time points within each
class.
Exercise: Verify for n = 4.
Consider now the relation between time ordered products and normal ordered products.
For a single field, both orderings are trivial and their vacuum expectation value vanishes: $\langle 0|T\{\phi_{in}(x)\}|0\rangle = 0 = \langle 0|:\phi(x):|0\rangle$.
For a time ordered product of two fields, for any particular instants of time they appear in a particular order which can be explicitly put in the normal ordered form, e.g. $a(\vec{k})a^\dagger(\vec{k}') = \text{``1''} + a^\dagger(\vec{k}')a(\vec{k}) = \; :a(\vec{k})a^\dagger(\vec{k}'): + \text{``1''}$. The last term is a c-number (multiple of the identity operator). Similarly, $\Phi(x)\Phi(y) = \; :\Phi(x)\Phi(y): +\; c\text{-number}$. And the c-number is trivial to evaluate by just taking the vev (vacuum expectation value): $c\text{-number} = \langle 0|\Phi(x)\Phi(y)|0\rangle$. Noting that $T\{\Phi(x)\Phi(y)\} = \theta(x^0-y^0)\Phi(x)\Phi(y) + \theta(y^0-x^0)\Phi(y)\Phi(x)$, we get $T\{\Phi(x)\Phi(y)\} = \; :T\{\Phi(x)\Phi(y)\}: + \langle 0|T\{\Phi(x)\Phi(y)\}|0\rangle$. Notice that normal ordering and time ordering commute: $:T\{A(x)B(y)\}: \; = T\{:A(x)B(y):\}$, for all non-trivial $A(x), B(y)$ operators, i.e. for operators which are polynomials in the basic field operators of degree greater than zero. The generalization of this to a product of an arbitrary number of fields is the Wick's theorem. We state it explicitly for a scalar field.
Wick’s Theorem:
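\begin{align*}
T\{\Phi(x_1)\dots\Phi(x_n)\} \;=\;& :T\{\Phi(x_1)\dots\Phi(x_n)\}: \\
&+ \big[\langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle\, :T\{\Phi(x_3)\dots\Phi(x_n)\}: \;+\; \text{permutations}\big] \\
&+ \big[\langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle\langle 0|T\{\Phi(x_3)\Phi(x_4)\}|0\rangle\, :T\{\Phi(x_5)\dots\Phi(x_n)\}: \;+\; \text{permutations}\big] \\
&\;\;\vdots \\
&+ \begin{cases}
\big[\langle 0|T\{\Phi_1\Phi_2\}|0\rangle\dots\langle 0|T\{\Phi_{n-1}\Phi_n\}|0\rangle + \text{permutations}\big] & (n \text{ even}) \\
\big[\langle 0|T\{\Phi_1\Phi_2\}|0\rangle\dots\langle 0|T\{\Phi_{n-2}\Phi_{n-1}\}|0\rangle\,\Phi_n + \text{permutations}\big] & (n \text{ odd})
\end{cases}
\end{align*}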
In the last line we have used the abbreviation Φi := Φ(xi ) and we will use the same when
convenient.
Proof:
The proof is by induction and we have already seen the validity for $n = 1, 2$. Assume the theorem is true for some $n$. Include an extra field $\Phi(x_{n+1})$ and choose $t_{n+1}$ to be the earliest instant so that it stays on the extreme right. Then,
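\[
T\{\Phi_1\dots\Phi_{n+1}\} = T\{\Phi_1\dots\Phi_n\}\,\Phi_{n+1}
= \Big[\, :T\{\Phi_1\dots\Phi_n\}: + \sum_{perm}\langle 0|T\{\Phi_1\Phi_2\}|0\rangle\, :T\{\dots\}: + \dots\Big]\Phi_{n+1},
\]
\[
:\Phi_1\dots\Phi_n:\,\Phi_{n+1} = \Big\{\sum_{A,B}\sigma_p\prod_{i\in A}\Phi^-_i\prod_{j\in B}\Phi^+_j\Big\}\big(\Phi^+_{n+1} + \Phi^-_{n+1}\big)
\]
\[
= \sum_{A,B}\sigma_p\prod_{i\in A}\Phi^-_i\prod_{j\in B}\Phi^+_j\,\Phi^+_{n+1}
+ \Big\{\sum_{A,B}\sigma_p\prod_{i\in A}\Phi^-_i\Big(\Phi^-_{n+1}\prod_{j\in B}\Phi^+_j + \Big[\prod_{j\in B}\Phi^+_j\,,\,\Phi^-_{n+1}\Big]\Big)\Big\}
\]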
The first two terms in the last equation have the requisite normal ordered form for $n+1$ fields. The third term has several commutators, $[\Phi^+_j, \Phi^-_{n+1}]$, which again are c-numbers and can be evaluated by taking vev. That is,
\[
[\Phi^+_j, \Phi^-_{n+1}] = \langle 0|(\Phi^+_j\Phi^-_{n+1} - \Phi^-_{n+1}\Phi^+_j)|0\rangle = \langle 0|\Phi^+_j\Phi^-_{n+1}|0\rangle + 0
= \langle 0|\Phi_j\Phi_{n+1}|0\rangle = \langle 0|T\{\Phi_j\Phi_{n+1}\}|0\rangle, \quad\text{since } t_{n+1} \text{ is the earliest.}
\]
\[
\therefore\;\; :\Phi_1\dots\Phi_n:\,\Phi_{n+1} \;=\; :\Phi_1\dots\Phi_n\Phi_{n+1}: + \sum_j\sigma_p\, :\Phi_1\dots\Phi_{j-1}\Phi_{j+1}\dots\Phi_n:\,\langle 0|T\{\Phi_j\Phi_{n+1}\}|0\rangle
\]
Thus, inserting the time ordering, the first term, : T {Φ1 . . . Φn } : Φn+1 gets expressed as
: T {Φ1 . . . Φn Φn+1 } : + {terms which have a structure similar to the second term} and so
on. Hence the induction hypothesis allows the expression to hold for n + 1 and the theorem
is proved by induction.
Exercise: Verify the theorem for n = 3, 4.
Observe that $G(x_1,\dots,x_n) = \langle 0|T\{\Phi(x_1)\dots\Phi(x_n)\}|0\rangle$ vanishes for odd $n$ and equals $\sum_{perm}\sigma_p\,\langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle\dots\langle 0|T\{\Phi(x_{n-1})\Phi(x_n)\}|0\rangle$ for even $n$.
Thus, the $2n$-point function of free fields is the sum of products of $n$ 2-point functions of free fields. We denote: $\langle 0|T\{\Phi(x)\Phi(y)\}|0\rangle =: i\Delta_F(x-y) =:$ (Feynman propagator).
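The "permutations" in Wick's theorem for a free scalar field are just the distinct ways of splitting the points into unordered pairs. The sketch below enumerates them for point labels of your choice; the function names are hypothetical and only meant to illustrate the counting, namely that a $2n$-point function has $(2n-1)!!$ terms and an odd-point function has none.

```python
def pairings(points):
    """Yield every way of splitting `points` into unordered pairs.

    For an odd number of points nothing is yielded, mirroring the
    vanishing of odd n-point functions of a free field.
    """
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def odd_double_factorial(n):
    """Return (2n-1)!! = 1 * 3 * ... * (2n-1)."""
    result = 1
    for m in range(1, 2 * n, 2):
        result *= m
    return result


# The three terms of the free 4-point function:
# (x1x2)(x3x4), (x1x3)(x2x4), (x1x4)(x2x3)
four_point_terms = list(pairings(["x1", "x2", "x3", "x4"]))
```

Each element of `four_point_terms` stands for one product of Feynman propagators $i\Delta_F(x_i - x_j)$.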
We have the S−matrix elements in terms of the n−point functions of interacting quantum
fields, expressed in terms of vev of time ordered products of free fields and thanks to Wick’s
theorem these are expressed in terms of 2−point functions of free fields. In short, the
S−matrix elements are obtained from a bunch of 2−point functions of free fields, together
with some integrations and operations of $(\Box - m^2)$. It remains to have a convenient algorithm
for computations.
A few remarks are in order. The normal ordering is introduced here as a simplification
step. It works because the postulated unique Poincare invariant vacuum state is annihilated
by all (free field) annihilation operators. This also requires the Poincare generators of the free
fields to be normal ordered11 . The Poincare generators of the interacting fields also need
to annihilate the vacuum. Thanks to the assumed unitary operator linking the free and
the interacting fields, these generators too are expressed in terms of the free field creation-annihilation operators. These expressions, in particular the Hamiltonian, must also be normal
ordered for Poincare invariance. We will make this explicit for the interaction Hamiltonian
HI (Φin , Πin ).
In the quantum optics context, the normal ordering was used since the detectors worked
by absorbing a quantum from the field and in principle, for a detector working by emission
of a quantum would require anti-normal ordering. Such an option is not available with
manifest Poincare invariance with a unique invariant vacuum state.
An explicit example of evaluation of an n−point function will be useful.
Consider
G2 (x1 , x2 ) computed to second order in HI . Consider the numerator first. All fields are
‘in’ fields and the ‘in’ subscript is suppressed below.
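\[
G_2(x_1,x_2)|_{Nr} = \langle 0|T\Big\{\Phi(x_1)\Phi(x_2)\Big[1 - i\int_{-\infty}^{\infty}dt'\,H_I(t') + \frac{(-i)^2}{2!}\int_{-\infty}^{\infty}dt_1\,dt_2\,T\{H_I(t_1)H_I(t_2)\}\Big]\Big\}|0\rangle
\]
\[
= \langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle - i\int_{-\infty}^{\infty}dt_1\,\langle 0|T\{\Phi(x_1)\Phi(x_2)H_I(t_1)\}|0\rangle
+ \frac{(-i)^2}{2}\int_{-\infty}^{\infty}dt_1\,dt_2\,\langle 0|T\{\Phi(x_1)\Phi(x_2)H_I(t_1)H_I(t_2)\}|0\rangle
\]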
[Footnote 11: Note that while abelian symmetries could allow a non-zero multiple of identity for invariance (i.e. $P^\mu|0\rangle = \alpha^\mu|0\rangle$), non-abelian symmetry generators have to annihilate the vacuum. Hence the Lorentz generators must annihilate the vacuum and then the commutator of $M^{\mu\nu}$ with $P^\lambda$ shows that $\alpha^\mu = 0$ must also hold. The Poincare group, being non-abelian, must have its generators normal ordered for the vacuum to be invariant.]

Now let us take a specific interaction Hamiltonian, the simplest non-trivial one being $H_I(t) := g\int d^3y\, :\Phi(t,\vec{y})^3:$. As noted above, for the case of Poincare invariant vacuum, the Poincare generators and hence in particular the Hamiltonian must be normal ordered. The time integrations combine with these spatial integrations to give a space-time integral. Thus, we get,
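\[
G_2(x_1,x_2)|_{Nr} = \langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle - ig\int d^4y\,\langle 0|T\{\Phi(x_1)\Phi(x_2):\Phi^3(y):\}|0\rangle
+ \frac{(-i)^2}{2}g^2\int d^4y_1\int d^4y_2\,\langle 0|T\{\Phi(x_1)\Phi(x_2):\Phi^3(y_1)::\Phi^3(y_2):\}|0\rangle
\]
\[
= i\Delta_F(x_1-x_2) - \frac{g^2}{2}\int d^4y_1\int d^4y_2\,\langle 0|T\{\Phi(x_1)\Phi(x_2):\Phi^3(y_1)::\Phi^3(y_2):\}|0\rangle
\]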
The first term is just the Feynman propagator and the $o(g)$ term vanishes since it is a 5-point function. The non-trivial term is the $o(g^2)$ term. This is not in the form of the Wick's theorem, in that there are normal ordered factors. We can replace the normal ordered product by the unordered product minus the products of two point functions, i.e. use the Wick's theorem in reverse form. For instance, consider (for three distinct points)
\[
:\Phi(y_1)\Phi(y_1')\Phi(y_1''): \;=\; \Phi(y_1)\Phi(y_1')\Phi(y_1'') - \langle 0|\Phi(y_1)\Phi(y_1')|0\rangle\,\Phi(y_1'')
- \langle 0|\Phi(y_1')\Phi(y_1'')|0\rangle\,\Phi(y_1) - \langle 0|\Phi(y_1)\Phi(y_1'')|0\rangle\,\Phi(y_1')
\]
And likewise for the second factor of $:\Phi(y_2)^3:$. If we now substitute these in the $o(g^2)$ term and eventually take the coincidence limit of the primed arguments, we see that the two point functions of coincident points cancel out. The net result is that in writing the
permutation terms, we omit the Feynman propagators of coincident points and this holds
because we use the normal ordered interaction Hamiltonian. If we did not normal order the
interaction Hamiltonian then these terms, which are formally divergent, will remain and will
have to be handled differently. This is what will happen in the functional integral method
which will be discussed later. Keeping this in mind, the o(g 2 ) term under vev becomes,
\[
\langle 0|T\{\Phi(x_1)\Phi(x_2):\Phi^3(y_1)::\Phi^3(y_2):\}|0\rangle
= i\Delta_F(x_1-x_2)\cdot i\Delta_F(y_1-y_2)\cdot i\Delta_F(y_1-y_2)\cdot i\Delta_F(y_1-y_2) + \text{permutations}
\]
\[
= (i^4)\Big[\Delta_F(x_1-x_2)\Delta_F(y_1-y_2)^3 + \Delta_F(x_1-y_1)\Delta_F(x_2-y_2)\Delta_F(y_1-y_2)^2
+ \Delta_F(x_1-y_2)\Delta_F(x_2-y_1)\Delta_F(y_1-y_2)^2\Big]
\]
There are no other ‘pairings’ of space-time points as they would involve Feynman propagators
with coincident arguments.
The denominator of the $n$-point function has the same structure as the numerator except for the $\Phi(x_1)\Phi(x_2)$ fields i.e.
\[
G_2(x_1,x_2)|_{Dr} = \langle 0|0\rangle - ig\int d^4y\,\langle 0|T\{:\Phi^3(y):\}|0\rangle
+ \frac{(-i)^2}{2}g^2\int d^4y_1\int d^4y_2\,\langle 0|T\{:\Phi^3(y_1)::\Phi^3(y_2):\}|0\rangle
\]
\[
= 1 - \frac{g^2}{2}\int d^4y_1\int d^4y_2\,\langle 0|T\{:\Phi^3(y_1)::\Phi^3(y_2):\}|0\rangle
\]
\[
= 1 + \frac{(-ig)^2}{2}\,(CF)\int d^4y_1\int d^4y_2\,(i\Delta_F(y_1-y_2))^3
\]
Here, CF is the combinatorial factor - number of distinct ways of effecting the same pairings
- and equals 3! in the example above.
The generalization is fairly obvious. For definiteness consider the interaction Hamiltonian density to be $\mathcal{H}_I(\Phi(y)) := \frac{g}{k!}:\Phi^k(y):$. Then, it is apparent that the numerator of a general $n$-point function is a sum of the $m$th order contributions, $m = 0, 1, \dots$. The $m$th order contribution has the form:
• $\dfrac{((-ig)/k!)^m}{m!}\displaystyle\int d^4y_1\dots d^4y_m\,\langle 0|T\{\Phi(x_1)\dots\Phi(x_n):\Phi^k(y_1):\cdots:\Phi^k(y_m):\}|0\rangle$ ;
• The Wick’s theorem gives a sum of terms, each of which is a product of a total of
(n+km)/2 Feynman propagators, i∆F (xj −yk ), whose arguments give the ‘pairing’ of spacetime points from the set: x1 , . . . , xn , k−copies of y1 , . . . , k−copies of ym . The normal ordered
form of interaction Hamiltonian ensures that the paired points are necessarily distinct. The
x1 , . . . , xn points are assumed to be distinct. A ‘pairing’ is also called a ‘Wick contraction’.
The sum of terms is generated from any particular one by permutations. If n + km is odd,
the contribution vanishes;
• There would also be combinatorial factor counting distinct ways of getting the same
set of pairing (or Feynman propagators). The denominator has similar contributions except
the x1 , . . . , xn points.
There is a convenient diagrammatic representation to keep track of the generated terms as
well as associating specific factors and integrals with them. The diagrams are the Feynman
Diagrams while the prescription to construct associated integral is given by the Feynman
rules.
Procedure to generate Feynman diagrams for Green’s function:
• For each distinct x1 , . . . , xn , draw a vertex with a single edge. Call these vertices as
external vertices;
• For normal ordered monomial of order k, draw a vertex yj with k edges sticking out.
Call these as the internal vertices;
• Wick contract or pair, by joining two edges from two distinct vertices, internal or
external. If the pair has one or both vertices to be external vertices, the line is called an
external line while a pairing with both vertices being internal, is called an internal line. This
constructs a Feynman diagram with vertices and internal/external lines.
Procedure to associate integral with a Feynman diagram:
• With each contraction, associate a Feynman propagator, i∆F (z 0 − z 00 ).
• With each internal vertex, associate a factor of $(-i)\times$ coefficient of the monomial of the fields in $\mathcal{H}_I(\Phi(y))$ and an integral over its space-time location, $\int d^4y$.
• Associate a numerical factor as a ratio. In the numerator, count the number of distinct
ways of generating the same diagram (or the same contractions). In the denominator, put
m! if m is the total number of internal vertices.
For the above example of G2 (x1 , x2 ) to second order for HI = g : Φ(y)3 :, here is a
summary:
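$o(g)$: Has one loose end and vev is zero.
$o(g^2)$:
[Diagrams omitted: (a) $x_1$ joined to $y_1$ and $x_2$ joined to $y_2$, with two lines joining $y_1$ and $y_2$; (b) $x_1$ joined to $y_2$ and $x_2$ joined to $y_1$, with two lines joining $y_1$ and $y_2$; (c) $x_1$ joined to $x_2$, with three lines joining $y_1$ and $y_2$.]
(a) $\dfrac{(-ig)^2}{2!}\,(3\cdot 3\cdot 2\cdot 1)\displaystyle\int d^4y_1\,d^4y_2\,(i\Delta_F(x_1-y_1))\,(i\Delta_F(x_2-y_2))\,(i\Delta_F(y_1-y_2))^2$
(b) $\dfrac{(-ig)^2}{2!}\,(3\cdot 3\cdot 2\cdot 1)\displaystyle\int d^4y_1\,d^4y_2\,(i\Delta_F(x_1-y_2))\,(i\Delta_F(x_2-y_1))\,(i\Delta_F(y_1-y_2))^2$
(c) $\dfrac{(-ig)^2}{2!}\,(1\cdot 3\cdot 2\cdot 1)\displaystyle\int d^4y_1\,d^4y_2\,(i\Delta_F(x_1-x_2))\,(i\Delta_F(y_1-y_2))^3$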
The diagrams (a) and (b) are examples of connected diagrams while (c) is an example of a
topologically disconnected diagram. Here (dis)connection refers to (dis)connection with external vertices. A diagram with no external vertices is called a vacuum bubble. The diagram
in (c) has a disconnected piece which is a vacuum bubble. The denominator diagrams are
all vacuum bubbles.
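The combinatorial factors quoted for $G_2$ at second order with $H_I = g:\Phi^3:$ can be cross-checked by brute force. The sketch below gives every vertex the appropriate number of "legs", enumerates all pairings of legs, and discards those that pair two legs of the same vertex (the coincident-point propagators removed by normal ordering). The helper names and the string labels for the vertices are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def pairings(legs):
    """Yield every way of splitting `legs` into unordered pairs."""
    if not legs:
        yield []
        return
    first, rest = legs[0], legs[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def contraction_classes(external, internal, k):
    """Count Wick contractions, grouped by which vertices are joined.

    external: labels of external points (one leg each).
    internal: labels of internal vertices (k legs each, normal ordered).
    Returns a Counter mapping a sorted tuple of vertex pairs to the
    number of leg pairings producing it (the combinatorial factor).
    """
    legs = [(v, 0) for v in external]
    legs += [(v, i) for v in internal for i in range(k)]
    classes = Counter()
    for match in pairings(legs):
        # Normal ordering forbids contractions within a single vertex.
        if any(a[0] == b[0] for a, b in match):
            continue
        key = tuple(sorted(tuple(sorted((a[0], b[0]))) for a, b in match))
        classes[key] += 1
    return classes


numerator = contraction_classes(["x1", "x2"], ["y1", "y2"], k=3)
denominator = contraction_classes([], ["y1", "y2"], k=3)
```

Under these assumptions `numerator` contains exactly three classes, with multiplicities 18, 18 and 6 for the diagrams (a), (b) and (c), and `denominator` contains the single vacuum bubble with multiplicity $3! = 6$.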
It is actually possible to separate the vacuum bubble components of the diagrams in the
numerator and cancel them against those in the denominator. The proof goes as follows.
In an $n$-point function, collecting the $k$th order terms, the numerator has the form,
\[
Nr = \sum_{k=0}^{\infty}\frac{(-i)^k}{k!}\int d^4y_1\dots d^4y_k\,\langle 0|T\{\Phi(x_1)\dots\Phi(x_n):\mathcal{H}_I(y_1):\cdots:\mathcal{H}_I(y_k):\}|0\rangle
\]
Group the contractions into two groups: (i) those that have $l$ of the internal vertices with a contraction with at least one of the external vertices and (ii) the remaining $(k-l)$ internal vertices which have no segment connecting an external vertex. The above vev then splits as,
\[
\langle 0|T\{\Phi(x_1)\dots\Phi(x_n):\mathcal{H}_I(y_1):\cdots:\mathcal{H}_I(y_k):\}|0\rangle =
\langle 0|T\{\Phi(x_1)\dots\Phi(x_n):\mathcal{H}_I(y_1):\cdots:\mathcal{H}_I(y_l):\}|0\rangle \times
\langle 0|T\{:\mathcal{H}_I(y_1):\cdots:\mathcal{H}_I(y_{k-l}):\}|0\rangle
\]
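And this split can happen in ${}^kC_l$ ways. Hence,
\[
Nr = \sum_{k=0}^{\infty}\frac{(-i)^k}{k!}\sum_{l=0}^{k}\frac{k!}{l!\,(k-l)!}\int d^4y_1\dots d^4y_l\,\langle 0|T\{\Phi(x_1)\dots\Phi(x_n):\mathcal{H}_I(y_1):\cdots:\mathcal{H}_I(y_l):\}|0\rangle
\times\int d^4y_{l+1}\dots d^4y_k\,\langle 0|T\{:\mathcal{H}_I(y_{l+1}):\cdots:\mathcal{H}_I(y_k):\}|0\rangle
\]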
Putting $(-i)^k = (-i)^l(-i)^{(k-l)}$ and interchanging the order of summation we can write,
\[
Nr = \left[\sum_{l=0}^{\infty}\frac{(-i)^l}{l!}\int d^4y_1\dots d^4y_l\,\langle 0|T\{\Phi(x_1)\dots\Phi(x_n):\mathcal{H}_I(y_1):\cdots:\mathcal{H}_I(y_l):\}|0\rangle\right]
\times\left[\sum_{k=l}^{\infty}\frac{(-i)^{(k-l)}}{(k-l)!}\int d^4y_{l+1}\dots d^4y_k\,\langle 0|T\{:\mathcal{H}_I(y_{l+1}):\cdots:\mathcal{H}_I(y_k):\}|0\rangle\right]
\]
Shifting $k\to k+l$ in the second square bracket shows it as $\langle 0|T\{\exp -i\int_{-\infty}^{\infty}dt\,H_I(t)\}|0\rangle$, which is just the denominator!
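The resummation used here is the same rearrangement that gives $e^{x+y} = e^x e^y$. A minimal sketch of it, assuming hypothetical integer sequences `a[l]` and `b[m]` standing in for the $l$th order "connected to external vertices" factor and the $m$th order vacuum factor (with the powers of $-i$ and the integrals absorbed into them): truncated at a common total order, the double sum with the binomial weights equals the product of the two separate exponential-type sums, term by term.

```python
from fractions import Fraction
from math import comb, factorial


def double_sum(a, b, order):
    """sum_{k<=order} (1/k!) sum_{l<=k} C(k,l) a[l] b[k-l]."""
    total = Fraction(0)
    for k in range(order + 1):
        for l in range(k + 1):
            total += Fraction(comb(k, l), factorial(k)) * a[l] * b[k - l]
    return total


def factorized_sum(a, b, order):
    """sum over l+m<=order of (a[l]/l!) (b[m]/m!)."""
    total = Fraction(0)
    for l in range(order + 1):
        for m in range(order + 1 - l):
            total += Fraction(a[l], factorial(l)) * Fraction(b[m], factorial(m))
    return total


# Arbitrary stand-in coefficients (assumptions, not computed amplitudes).
a = [1, 2, 3, 5, 7]
b = [1, -1, 4, 0, 2]
lhs = double_sum(a, b, order=4)
rhs = factorized_sum(a, b, order=4)
```

Exact rational arithmetic is used so that `lhs` and `rhs` can be compared without rounding issues.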
The first factor is called the connected Green’s function and denoted as Gcn . It consists
of only the diagrams connected to external vertices. The diagrams may be topologically
disconnected. From now on we focus on the connected Green’s functions and drop the denominator.
Having gotten Feynman diagrams and Feynman integrals, we further simplify the expression for the S−matrix elements.
To go from Green's functions to S-matrix elements (a) we operate on the Green's function by the equation of motion differential operator for each external vertex; (b) insert the wave function factors for each external vertex; (c) divide by $\sqrt{Z}$ for each external vertex and (d) integrate over the external vertices.
We have only connected diagrams. Among these are topologically disconnected diagrams
too. There is a special subclass of topologically disconnected diagrams in which two external
vertices are Wick contracted. These represent an incoming particle which goes out without
any scattering. We can separate such un-scattered processes from the S−matrix (which
we will do little later) and focus on processes wherein every incoming particle necessarily
scatters i.e. every external vertex is necessarily connected to an internal vertex.
Consider an external vertex $x$ connected with an internal vertex $y$. Associated with the external vertex is a wavefunction factor $\dfrac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}$ (or $\dfrac{e^{-ik'\cdot x'}}{\sqrt{2\omega_{\vec{k}'}(2\pi)^3}}$), integration over $x$ (or $x'$), Feynman propagator $\Delta_F(x-y)$ and $(\Box_x - m^2)$ acting on the propagator. Since the propagator is a Green's function, $(\Box_x - m^2)\Delta_F(x-y) = -\delta^4(x-y)$. Integration over $x$ then removes the delta function and $e^{ik\cdot x}\to e^{ik\cdot y}$. For the external vertex of an out-going particle, $e^{-ik'\cdot x'}\to e^{-ik'\cdot y}$. Thus the integration over external vertices is trivially carried out and the propagator involving an external vertex is removed - an "external leg is said to be amputated".
Now use the Fourier representation of the remaining propagators, $\Delta_F(y_i - y_j) = \displaystyle\int\frac{d^4l}{(2\pi)^4}\frac{e^{il\cdot(y_i-y_j)}}{l^2+m^2-i\epsilon}$. Each internal vertex $y_i$ is connected to possibly several other internal vertices as well as possibly several external vertices. For each connected internal vertex, the Fourier transform supplies a factor of $e^{il\cdot y_i}$ and each external vertex provides a factor of $e^{ik\cdot y_i}$. All of these combine and the $\int d^4y_i$ then gives a $(2\pi)^4\delta^4(\sum k - \sum k' + \sum l)$. Thus, the integration over internal vertices results in a momentum conserving delta function. The space-time integrations are done but we are left with integration over the Fourier momenta which are associated with internal lines. The delta functions trivialize many momentum integrations leaving an overall delta function involving only the external momenta and enforcing the momentum conservation. We are also left with some unconstrained loop momenta.
To be explicit, let $n_E, n_I$ denote the number of external and internal lines and let $n_v$ denote the number of internal vertices. We have then $n_E + n_I$ number of momenta and $n_v$ number of conservation equations. Thus $n_E + n_I - n_v + 1$ momenta are left undetermined by the conservation equations; of these, the $n_E$ external momenta are specified by the asymptotic states, leaving $n_I - n_v + 1$ loop momenta to be integrated over. The $+1$ in the counting signifies the loss of one conservation equation due to the left over delta function enforcing conservation of external momenta.
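As a quick sanity check of this counting, here is a small sketch (hypothetical helper names, and assuming a diagram in which all internal vertices are connected to each other so that only one overall conservation condition is lost). The second-order diagram (a) of the $\Phi^3$ example, with two internal lines joining two vertices, then has one loop momentum, while a tree-level exchange diagram with a single internal line has none.

```python
def loop_momenta(n_internal_lines, n_internal_vertices):
    """Unconstrained loop momenta: n_I - n_v + 1 (connected diagram assumed)."""
    if n_internal_vertices < 1:
        raise ValueError("need at least one internal vertex")
    return n_internal_lines - n_internal_vertices + 1


def undetermined_momenta(n_external_lines, n_internal_lines, n_internal_vertices):
    """Momenta not fixed by conservation alone: n_E + n_I - n_v + 1."""
    return n_external_lines + loop_momenta(n_internal_lines, n_internal_vertices)


one_loop_self_energy = loop_momenta(n_internal_lines=2, n_internal_vertices=2)
tree_exchange = loop_momenta(n_internal_lines=1, n_internal_vertices=2)
```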
There is no dependence on space-time points and no space-time integrations are left. The scattering matrix element is given by integration over a bunch of loop momenta, with a product of momentum space propagators forming the integrand, and of course the various factors of $i$, $\pi$ and numerical combinatorial factors.
\[
S_{fi} \sim \delta^4\Big(\sum k - \sum k'\Big)\prod_j\int d^4l_j\;\prod^{n_I}\frac{1}{p_i^2(k,l) + m^2 - i\epsilon}
\]
We will elaborate the numerical factors in explicit examples. The structure of the S−matrix
elements (non-trivial scattering) should be clear from the above discussion.
To separate out the trivial scattering it is customary to introduce the so-called $T$-matrix as,
\[
S := 1 + iT, \qquad \langle k'_{j'}|T|k_j\rangle := (2\pi)^4\delta^4(\Sigma_{j'}k'_{j'} - \Sigma_j k_j)\;\mathcal{M}(k_j\to k'_{j'})
\]
Here the $k, k'$ denote the on-shell momenta of the incoming and outgoing particles. The delta function enforcing momentum conservation is a consequence of translation invariance and $\mathcal{M}$ is called the invariant matrix element. This is what is computed in practice. The $S_{fi}$ given above is really the $iT_{fi}$.
In practice, computation of the T −matrix elements is only part of what is needed to
compare with experiments which measure cross-sections. To compute cross-sections, recall
that the S-matrix elements are the transition probability amplitudes whose non-trivial contributions are identified as $\langle f|i\rangle = (2\pi)^4\delta^4(k'_{total} - k_{total})\,(i\mathcal{M}_{i\to f})$. The cross-section is
the ratio of the number of outgoing particles per second to the incident flux. The numerator
is proportional to the transition probability rate while the denominator is determined by
the initial state.
We have given the S−matrix elements in the plane wave basis which is strictly incorrect. One should use wave packets for representing the asymptotic states. Alternatively, a
commonly used practice is to put the system in a finite space-time box and take the limit
of infinite box at the end. This is simpler to implement in practice and suffices for most
purposes. We will use this [12] and refer to [13] for the wave packet treatment.
B. Differential Cross-section for 2 → n process
Imagine the scattering experiment to be enclosed in a large spatial box of volume V = L3 ,
with periodic boundary conditions imposed on the mode functions. Let the duration of the
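experiment be a large time interval $T$. The probability of transition from an initial state $|i\rangle$ to a final state $|f\rangle$ is given by,
\[
\mathrm{Prob}_{i\to f} = \frac{|\langle f|i\rangle|^2}{\langle i|i\rangle\langle f|f\rangle}
= \frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - \Sigma k_j)\,[(2\pi)^4\delta^4(0)]\,|i\mathcal{M}_{i\to f}|^2}{\langle i|i\rangle\langle f|f\rangle}
\]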
The square of the momentum conservation delta function is written with one factor as $\delta(0)$ which is to be understood as: $(2\pi)^4\delta^4(0) = \int d^4x\,e^{i0\cdot x} = \int d^4x := VT$. The initial and final states are normalized using $\langle k|k\rangle := \lim_{k'\to k}\delta^3(\vec{k}' - \vec{k}) = \delta^3(0) := \dfrac{V}{(2\pi)^3}$. Typically, the initial state consists of two particles. So let us specialize to this case of $2\to n$ processes. Then, $\langle i|i\rangle\langle f|f\rangle = \Big[\dfrac{V}{(2\pi)^3}\Big]^{(n+2)}$. Therefore,
\[
\frac{\mathrm{Prob}_{i\to f}}{T} = \frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - \Sigma k_j)\,|i\mathcal{M}_{i\to f}|^2\,V}{[V/(2\pi)^3]^{n+2}}
\]
Real detectors have finite aperture and hence the detected particle's momentum is anywhere within a small window around the central value. The corresponding probabilities are thus to be added. An estimate for such a window follows from the box normalization we have taken. The momenta are given by $\vec{k}_j = \frac{2\pi}{L}\vec{n}_j$. Summing over the momenta within a window is the same as summing over the integers within a window and for large volume, $\Sigma_{\vec{n}_j}\approx\frac{V}{(2\pi)^3}\int d^3k$, for each final state particle. This leads to the total probability per unit time for the transition,
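\[
\frac{\mathrm{Prob}_{2\to n}}{T} = \frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - \Sigma k_j)\,|i\mathcal{M}_{2\to n}|^2\,V}{[V/(2\pi)^3]^{n+2}}\;\prod_{j=1}^{n}\frac{V\,d^3k_j}{(2\pi)^3}
\]
\[
= \frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - k_1 - k_2)\;\Big|\,i\mathcal{M}_{2\to n}\sqrt{\prod_{j=1}^{n+2}\big[(2\pi)^3\,2\omega_{\vec{k}_j}\big]}\,\Big|^2}{V\,(2\omega_{\vec{k}_1})(2\omega_{\vec{k}_2})}\;\prod_{j=1}^{n}\frac{d^3k_j}{2\omega_{\vec{k}}\,(2\pi)^3}
\]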
In going from the first to the second line, we have divided and multiplied by the product
of (2ω~k ) for all the (n + 2) particles. The last product is the dΓn which is the Lorentz
invariant phase space volume. The square root factor will get absorbed in the invariant
matrix element and will get rid of similar factors coming from the wave functions of the
incoming and outgoing particles.
We also need the incident flux. For the initial state of two particles, consider the laboratory frame where the particle '2' is at rest. The particle '1' has a speed $|\vec{k}_1|/E_1$ and gives the number density of one particle per unit volume, $1/V$. Hence, the incident number flux is $|\vec{k}_1|/(E_1V)$. Dividing by the flux, the exclusive, differential cross-section for $2\to n$ process is given by,
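\[
d\sigma_{lab} = \frac{(2\pi)^4\delta^4(k_1 + k_2 - \Sigma_j k_j)\,|i\tilde{\mathcal{M}}_{2\to n}|^2}{4m_2\,|(\vec{k}_1)_{lab}|}\;d\Gamma_n
\]
where,
\[
\tilde{\mathcal{M}}_{2\to n} := \mathcal{M}_{2\to n}\sqrt{\prod_{j=1}^{n+2}\big[2\omega_{\vec{k}_j}(2\pi)^3\big]}
\qquad\text{and,}\qquad
d\Gamma_n := \prod_{j=1}^{n}\frac{d^3k_j}{2\omega_{\vec{k}}\,(2\pi)^3}
\]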
We have used $\omega_{\vec{k}_2} = m_2$ and $\omega_{\vec{k}_1} = E_1$. Notice that all factors of the volume and the duration
T have disappeared. The explicit square root factors will also cancel out.
Many basic calculations involve scattering processes with two particles going into two
particles. We have already used the initial state of two particles to obtain the incident
flux. We will now also specialize to n = 2 for out going particles and write the phase space
integrals more explicitly.
1. The Special case of 2 → 2 processes
Let the initial momenta be denoted by $k_1, k_2$ and the final momenta be denoted by $k_1', k_2'$. Lorentz invariance implies that the scattering amplitude will have Lorentz indices (tensorial or spinorial) carried by the wavefunctions and the invariant amplitude will be a function of Lorentz invariants. We have three independent momenta thanks to the overall conservation due to translation invariance and we can form 6 Lorentz scalars of the form $p_i\cdot p_j$. Of these three are masses and the remaining three are conveniently defined as "center of mass energy" and two types of "squared momentum transferred". These are known as the Mandelstam variables and are defined as ($k_1 + k_2 = k_1' + k_2'$),
\[
s := -(k_1 + k_2)^2 = -(k_1' + k_2')^2 \quad\text{(Centre of Mass Energy)} \tag{11.14}
\]
\[
t := -(k_1' - k_1)^2 = -(k_2' - k_2)^2 \quad\text{(Squared Momentum transfer)} \tag{11.15}
\]
\[
u := -(k_2' - k_1)^2 = -(k_2 - k_1')^2 \quad\text{(Squared Momentum transfer)} \tag{11.16}
\]
The definitions imply the Mandelstam identity: $s + u + t = m_{k_1}^2 + m_{k_2}^2 + m_{k_1'}^2 + m_{k_2'}^2$.
Laboratory and Centre of Mass Frames:
The lab frame is defined by regarding particle 2, say, at rest while particle 1 is incident on it. Thus,
\[
k_1^{lab} = \Big(\sqrt{\vec{k}_1^2 + m_1^2}\,,\;\vec{k}_1^{lab}\Big)\,,\qquad k_2^{lab} = (m_2, \vec{0})
\]
The lab frame is convenient for expressing the incident flux.
The center of mass frame is defined by $\vec{k}_{total} := \vec{k}_1 + \vec{k}_2 = 0 = \vec{k}_1' + \vec{k}_2'$. Thus,
\[
k_1^{cm} = \Big(\sqrt{\vec{k}_{cm}^2 + m_1^2}\,,\;\vec{k}_{cm}\Big)\,,\qquad k_2^{cm} = \Big(\sqrt{\vec{k}_{cm}^2 + m_2^2}\,,\;-\vec{k}_{cm}\Big)
\]
This frame is more convenient for defining the scattering angle. The common momentum direction singles out, say, the z-axis. Orienting the frame accordingly, we take
\[
k_1 = (E_1, k_{cm}\hat{z})\,,\quad k_2 = (E_2, -k_{cm}\hat{z})\,,\quad
k_1' = (E_1', \vec{k}')\,,\quad k_2' = (E_2', -\vec{k}')\,,\quad \hat{k}'\cdot\hat{z} =: \cos(\Theta_{cm}). \tag{11.17}
\]
The Mandelstam invariant $s$ can be used to relate the lab frame momentum $|k_1|_{lab}$ and the center of mass momentum $|k|_{cm}$. The invariant $s := -(k_1 + k_2)^2 = -(-m_1^2 - m_2^2 - 2E_1E_2 + 2\vec{k}_1\cdot\vec{k}_2)$ has two equivalent expressions:
\[
s|_{lab} = m_1^2 + m_2^2 + 2m_2\sqrt{m_1^2 + \vec{k}_1^2}
\]
\[
s|_{cm} = m_1^2 + m_2^2 + 2\sqrt{m_1^2 + \vec{k}_{cm}^2}\,\sqrt{m_2^2 + \vec{k}_{cm}^2} + 2\vec{k}_{cm}^2 .
\]
These can be solved for the momenta and give manifestly Lorentz invariant expressions,
\[
|k_1|_{lab} = \frac{1}{2m_2}\sqrt{s^2 - 2s(m_1^2 + m_2^2) + (m_1^2 - m_2^2)^2} \tag{11.18}
\]
\[
|k|_{cm} = \frac{1}{2\sqrt{s}}\sqrt{s^2 - 2s(m_1^2 + m_2^2) + (m_1^2 - m_2^2)^2} \tag{11.19}
\]
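These kinematic relations are easy to check numerically. The sketch below, with hypothetical function names and arbitrarily chosen masses, builds center of mass momenta from (11.19), evaluates $s, t, u$ with the mostly-plus metric used in these notes, and checks the Mandelstam identity together with the relation between (11.18) and (11.19).

```python
import math


def minkowski_dot(p, q):
    """p.q with signature (-,+,+,+); momenta are (E, kx, ky, kz)."""
    return -p[0] * q[0] + sum(a * b for a, b in zip(p[1:], q[1:]))


def root_lambda(s, ma, mb):
    """sqrt(s^2 - 2 s (ma^2 + mb^2) + (ma^2 - mb^2)^2)."""
    return math.sqrt(s * s - 2 * s * (ma**2 + mb**2) + (ma**2 - mb**2) ** 2)


def k_lab(s, m1, m2):
    """Eq. (11.18): momentum of particle 1 when particle 2 is at rest."""
    return root_lambda(s, m1, m2) / (2 * m2)


def k_cm(s, ma, mb):
    """Eq. (11.19): common momentum magnitude in the CM frame."""
    return root_lambda(s, ma, mb) / (2 * math.sqrt(s))


def mandelstam(k1, k2, k1p, k2p):
    """Return (s, t, u) as defined in (11.14)-(11.16)."""
    def minus_square(p):
        return -minkowski_dot(p, p)
    total = tuple(a + b for a, b in zip(k1, k2))
    t_vec = tuple(a - b for a, b in zip(k1p, k1))
    u_vec = tuple(a - b for a, b in zip(k2p, k1))
    return minus_square(total), minus_square(t_vec), minus_square(u_vec)


# Arbitrary illustrative values (assumptions): masses, s and angle.
m1, m2, m1p, m2p, s_in, theta = 1.0, 0.5, 0.8, 0.3, 10.0, 0.7
k, kp = k_cm(s_in, m1, m2), k_cm(s_in, m1p, m2p)
k1 = (math.hypot(k, m1), 0.0, 0.0, k)
k2 = (math.hypot(k, m2), 0.0, 0.0, -k)
k1p = (math.hypot(kp, m1p), kp * math.sin(theta), 0.0, kp * math.cos(theta))
k2p = (math.hypot(kp, m2p), -kp * math.sin(theta), 0.0, -kp * math.cos(theta))
s, t, u = mandelstam(k1, k2, k1p, k2p)
```

With these inputs `s` reproduces `s_in`, and `s + t + u` equals the sum of the four squared masses up to rounding.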
Phase space volume in center of mass frame: We have the Lorentz invariant definition,
\[
d\Gamma_2 = (2\pi)^4\delta^4(k_1' + k_2' - k_1 - k_2)\;\frac{d^3k_1'}{2\omega_{k_1'}(2\pi)^3}\;\frac{d^3k_2'}{2\omega_{k_2'}(2\pi)^3} .
\]
In the CM frame we simplify it using: $\delta^4\to\delta(E_1' + E_2' - \sqrt{s})\,\delta^3(\vec{k}_1' + \vec{k}_2')$ since $\vec{k}_1 + \vec{k}_2 = 0$, $s = (E_1 + E_2)^2$. We can remove the momentum delta function by integrating over $\vec{k}_2'$. Denoting $d^3k_1' := d^3k'_{cm} = dk'_{cm}\,(k')^2_{cm}\,d\Theta_{cm}\sin(\Theta_{cm})\,d\phi$, we write,
\[
\int d^3k_2'\;d\Gamma_2 = \frac{1}{(2\pi)^2}\,\delta(E_1' + E_2' - \sqrt{s})\,\frac{1}{4E_1'E_2'}\;dk'_{cm}\,(k')^2_{cm}\,d\Theta_{cm}\sin(\Theta_{cm})\,d\phi
\]
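Since $E_1' + E_2' = \sqrt{(m')_1^2 + (k')^2_{cm}} + \sqrt{(m')_2^2 + (k')^2_{cm}}$ is a function of $k'_{cm}$, we simplify the delta function using $\delta(f(x)) = \sum_i\dfrac{\delta(x - x_i)}{|f'(x_i)|}$, $f(x_i) = 0$. With $f(k'_{cm}) := E_1' + E_2' - \sqrt{s}$ we have
\[
f'(k'_{cm}) = \frac{k'_{cm}}{E_1'} + \frac{k'_{cm}}{E_2'} = k'_{cm}\,\frac{E_1' + E_2'}{E_1'E_2'} = \frac{k'_{cm}\sqrt{s}}{E_1'E_2'}\,,
\]
\[
f(k'_{cm}) = 0 \;\Rightarrow\; k'_{cm} = \frac{\sqrt{s^2 - 2s\big((m')_1^2 + (m')_2^2\big) + \big((m')_1^2 - (m')_2^2\big)^2}}{2\sqrt{s}}
\]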
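Doing the $dk'_{cm}$ integration using the delta function gives,
\[
\int dk'_{cm}\,(k')^2_{cm}\,\delta(E_1' + E_2' - \sqrt{s}) = \frac{(k')^2_{cm}\,E_1'E_2'}{k'_{cm}\sqrt{s}}
\quad\text{which gives}\quad
d\Gamma_2 = \frac{1}{16\pi^2}\,\frac{k'_{cm}}{\sqrt{s}}\;d\Omega_{cm} . \tag{11.20}
\]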
Thus, for the special case of a process with 2 particles going to 2 particles, one gets the differential cross-section as,
\[
\frac{d\sigma}{d\Omega_{cm}} = \frac{1}{64\pi^2}\,\frac{1}{s}\,\frac{k'_{cm}}{k_{cm}}\,|\mathcal{M}|^2\,, \tag{11.21}
\]
where we have used $2m_2|k_1|_{lab} = 2\sqrt{s}\,|k|_{cm}$, and
\[
\frac{k'_{cm}}{k_{cm}} = \sqrt{\frac{s^2 - 2s\big((m')_1^2 + (m')_2^2\big) + \big((m')_1^2 - (m')_2^2\big)^2}{s^2 - 2s(m_1^2 + m_2^2) + (m_1^2 - m_2^2)^2}} . \tag{11.22}
\]
The invariant amplitude has dependence on the scattering angle $\Theta_{cm}$ and the total cross-section is obtained by integrating over the center of mass solid angle.
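As an illustration of (11.21) and (11.22), the following sketch integrates the differential cross-section over the solid angle with a simple midpoint rule. The function names, the choice of integration rule and the assumption that $|\mathcal{M}|^2$ depends only on $\Theta_{cm}$ (so that the $\phi$ integral gives $2\pi$) are illustrative assumptions. For a constant $|\mathcal{M}|^2$ and equal initial and final masses, the result should approach $|\mathcal{M}|^2/(16\pi s)$.

```python
import math


def k_cm(s, ma, mb):
    """Eq. (11.19) for a pair of masses (ma, mb)."""
    lam = s * s - 2 * s * (ma**2 + mb**2) + (ma**2 - mb**2) ** 2
    return math.sqrt(lam) / (2 * math.sqrt(s))


def dsigma_domega_cm(amp_sq, s, masses_in, masses_out):
    """Eq. (11.21): (1/64 pi^2 s) (k'_cm / k_cm) |M|^2."""
    k_in = k_cm(s, *masses_in)
    k_out = k_cm(s, *masses_out)
    return amp_sq * k_out / (64 * math.pi**2 * s * k_in)


def total_cross_section(amp_sq_of_theta, s, masses_in, masses_out, n_theta=2000):
    """Integrate over the CM solid angle, assuming no phi dependence."""
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta  # midpoint of the i-th theta bin
        rate = dsigma_domega_cm(amp_sq_of_theta(theta), s, masses_in, masses_out)
        total += rate * math.sin(theta) * d_theta
    return 2 * math.pi * total


# Constant amplitude, all masses equal (illustrative values only).
sigma = total_cross_section(lambda theta: 1.0, 9.0, (1.0, 1.0), (1.0, 1.0))
```

Any statistical factor for identical particles in the final state is not included in this sketch.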
12. DIAGRAMMATIC RECIPE FOR S-MATRIX ELEMENTS
We now specify explicit interacting fields by giving an interaction Lagrangian and state
the Feynman rules to complete the diagrammatic recipe for the T − matrix elements. We
will state the rules for the Φ4 theory, the Yukawa theory and the Quantum electrodynamics
(QED). In the next section we will compute specific processes. There is a good deal of
conventions of normalization etc and they have to be kept track of carefully. The free
(quadratic) part of the Lagrangian density sets the normalization of the propagators while
the interaction terms (beyond quadratic in fields) contribute to the numerical factors. Here
are the terms in the Lagrangian densities.
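Free Scalar : $-\frac{1}{2}\partial^\mu\Phi\,\partial_\mu\Phi - \frac{1}{2}m^2\Phi^2(x)$ ; (12.1)
Free Spinor : $-i\bar{\Psi}\,\not\!\partial\,\Psi + m\bar{\Psi}\Psi$ ; (12.2)
Massless Vector : $-\frac{1}{4}F^{\mu\nu}F_{\mu\nu}$ , $F_{\mu\nu} := \partial_\mu A_\nu - \partial_\nu A_\mu$ ; (12.3)
Scalar self coupling : $-\frac{g}{3!}\Phi^3(x) - \frac{\lambda}{4!}\Phi(x)^4$ ; (12.4)
Yukawa coupling : $-g\,\Phi(x)\bar{\Psi}(x)\Psi(x)$ ; (12.5)
QED coupling : $-ie\,A_\mu(x)\bar{\Psi}(x)\gamma^\mu\Psi(x)$ . (12.6)
In the interaction Hamiltonian, all coupling terms will change signs. (12.7)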
The propagators are the Feynman Green’s functions for the free equations of motion.
Consider the massless vector field as this has a new feature.
The free Lagrangian (Maxwell) can be expressed as $\frac{1}{2}A_\mu(\eta^{\mu\nu}\Box - \partial^\mu\partial^\nu)A_\nu$ + divergence terms. The equation of motion is $(\eta^{\mu\nu}\Box - \partial^\mu\partial^\nu)A_\nu(x) = 0$. Equivalently, in Fourier space the equation takes the form $-(\eta^{\mu\nu}k^2 - k^\mu k^\nu)\tilde{A}_\nu(k) = 0$. However, the differential operator is not invertible and hence does not admit a Green's function! The non-invertibility follows because every $A_\nu$ of the form $\partial_\nu f(x)$ solves the equation. That is, $\partial_\nu f$ is a non-trivial eigenvector of the differential operator, with zero eigenvalue. For perturbation theory though we need a propagator, for which extra terms are added to the action to break its gauge invariance - invariance under $\delta A_\mu(x) = \partial_\mu\Lambda(x)$. This can be done in several ways and each choice corresponds to a gauge.
A common and convenient choice is to add a $-\frac{1}{2}(\partial_\mu A^\mu)^2$ term to the action. Up to a divergence term, this is just $+\frac{1}{2}A_\mu\partial^\mu\partial^\nu A_\nu$ and precisely cancels the term in the equation of motion operator, making it $\eta^{\mu\nu}\Box$ which is invertible. The added term is manifestly Lorentz invariant and the corresponding gauge is called the Lorentz gauge. For our purposes, this gauge will suffice. The propagator is an inverse of the differential operator since $(\eta^{\mu\nu}\Box_x)(D_F)_{\nu\lambda}(x-y) = \delta^\mu_{\ \lambda}\,\delta^4(x-y)$.
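The non-invertibility and its cure can be seen directly in Fourier space with a few lines of arithmetic. The sketch below uses hypothetical helper names and an arbitrarily chosen off-shell momentum: it builds the matrix $-(\eta^{\mu\nu}k^2 - k^\mu k^\nu)$, checks that it annihilates the pure gauge direction $A_\nu\propto k_\nu$, and checks that the gauge fixed operator $-\eta^{\mu\nu}k^2$ is inverted by $-\eta_{\nu\lambda}/k^2$. The $i\epsilon$ prescription is ignored here.

```python
from fractions import Fraction

# Mostly-plus metric; the same entries serve for upper and lower indices.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def lower(k):
    """Lower the index of a contravariant vector k^mu."""
    return [sum(ETA[m][n] * k[n] for n in range(4)) for m in range(4)]


def k_squared(k):
    """k^mu k_mu with signature (-,+,+,+)."""
    return sum(a * b for a, b in zip(k, lower(k)))


def maxwell_operator(k):
    """Fourier space Maxwell operator -(eta^{mu nu} k^2 - k^mu k^nu)."""
    k2 = k_squared(k)
    return [[-(ETA[m][n] * k2 - k[m] * k[n]) for n in range(4)] for m in range(4)]


def gauge_fixed_operator(k):
    """Operator after adding -(1/2)(d.A)^2: just -eta^{mu nu} k^2."""
    k2 = k_squared(k)
    return [[-ETA[m][n] * k2 for n in range(4)] for m in range(4)]


def gauge_fixed_inverse(k):
    """Candidate inverse -eta_{nu lambda} / k^2 (requires k^2 != 0)."""
    k2 = Fraction(k_squared(k))
    return [[-ETA[n][l] / k2 for l in range(4)] for n in range(4)]


def apply(op, a_lower):
    """Contract op^{mu nu} with a covariant vector a_nu."""
    return [sum(op[m][n] * a_lower[n] for n in range(4)) for m in range(4)]


def matmul(a, b):
    return [[sum(a[m][n] * b[n][l] for n in range(4)) for l in range(4)] for m in range(4)]


k = [1, 2, 0, 1]                      # arbitrary off-shell momentum, k^2 = 4
zero_mode = apply(maxwell_operator(k), lower(k))
product = matmul(gauge_fixed_operator(k), gauge_fixed_inverse(k))
```

Here `zero_mode` is the zero vector, exhibiting the zero eigenvalue, while `product` is the $4\times 4$ identity.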
The QED coupling arises from the minimal substitution rule: $\vec{P}\to\vec{P} + e\vec{A}\;\leftrightarrow\;-i\partial_\mu\to -i\partial_\mu + eA_\mu\;\leftrightarrow\;\partial_\mu\to\partial_\mu + ieA_\mu$. Substitution in the free Dirac action gives a $-ieA_\mu$ as the QED coupling.
Let us gather the various factors for the scalar field in the T −matrix elements.
• Each in/out particle gives a wave function factor of $[2\omega_{\vec{k}}(2\pi)^3]^{-1/2}e^{\pm ik\cdot x}$, an $\int d^4x$ and $(\Box_x - m^2)$ acting on the $n$-point function. For a spinor field, the amputation operator changes to $(-i\!\not\!\partial + m)$ and we have additionally the $u, v, \bar{u}, \bar{v}$ spinors. For a vector field, the amputation operator changes and we have the polarization vectors in addition. Other factors remain the same.
• Each order in $H_I$ gives $(-i)\int d^4y$ from the $T$-ordered exponential and a $(m!)^{-1}$ for the $m$th order, from the exponential;
• Each Wick contraction gives $(i\Delta_F(z - z')) := \displaystyle\int\frac{d^4l}{(2\pi)^4}\frac{i\,e^{il\cdot(z-z')}}{l^2+m^2-i\epsilon}$ ;
• Each amputation, i.e. the action of $(\Box - m^2)$, gives a $(-\delta^4(x-y))$ while each integration over $x$ or $y$ gives $(2\pi)^4$ times a momentum conserving $\delta^4$. The momentum integrations do not produce any factors.
• Let E, I, V denote the number of external lines (= number of external vertices), number of internal lines and number of internal vertices (= order of HI ) respectively.
Then, the factors of [2ω~k (2π)3 ]−1/2 , precisely cancel the explicit factor we found in the
amplitude M̃2→n . This is due to the normalization choices which cancel out in
|hf |ii|2
hi|iihf |f i
and can now be dropped from both M̃ and from the T −matrix element.
Factors of i: (−i)V (i)I = (−1)V iV +I ;
Factors of 2π: (2π)−4I+4E+4V −4 . The last −4 is because it has been taken out in the
definition of the M due to the overall momentum conservation.
Vertex factors, including the i in the QED vertex are to be taken case-by-case. There
is the (m!)−1 and a combinatorial factor that will come for each diagram. These too
135
are taken case-by-case.
With these, we now state the Feynman rules for Feynman diagrams.
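External Line (momentum p):
                    incoming                          outgoing
  scalar            $1$                               $1$
  fermion           $u(p, \sigma)$                    $\bar u(p, \sigma)$
  anti-fermion      $\bar v(p, \sigma)$               $v(p, \sigma)$
  photon            $\varepsilon^*_\mu(p, \lambda)$   $\varepsilon_\mu(p, \lambda)$

Internal Line (momentum k):
  scalar                    $\frac{-i}{k^2 + m^2 - i\epsilon}$
  fermion                   $\frac{(-i)(-\slashed{k} + m)}{k^2 + m^2 - i\epsilon}$
  photon (Lorentz gauge)    $\frac{-i\eta^{\mu\nu}}{k^2 - i\epsilon}$   (indices µ, ν at the two ends)

Vertex:
  $\Phi^3$     $i\,3!\,g$
  $\Phi^4$     $i\,4!\,\lambda$
  Yukawa       $ig$
  QED          $ie\gamma^\mu$   (photon index µ)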
13. ELEMENTARY PROCESSES IN YUKAWA AND QED: NR LIMIT
We begin with scattering of a fermion off another fermion, interacting via the Yukawa
coupling, and compute the T-matrix element to the leading order. We will take the non-relativistic limit and identify an equivalent potential. We will consider anti-fermion scattering and fermion-anti-fermion scattering as well. We will then compare with the QED coupling
and, briefly, the gravitational coupling. This will lead us to appreciate the dependence of the
attractive/repulsive nature of the interaction on the spin of the exchanged particle.
Consider a general process depicted below together with its ‘expansion’.
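[Diagram (13.1): the full two-fermion scattering process with incoming momenta p, k and outgoing momenta p′, k′, expanded as the ‘no scattering’ term, plus the two scalar-exchange diagrams with exchanged momenta p′ − p and k′ − p, plus higher-order terms.]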
The first term ($O(g^0)$) denotes ‘no scattering’. There are two contributions at $O(g^2)$, corresponding to the exchange of the two out-going fermions. The overall momentum conservation delta
function is the same, enforcing $p + k = p' + k'$. The Feynman rules give the expression as,
\[ i\mathcal{M} = (ig)^2\left[\bar u(p')u(p)\,\frac{-i}{(p'-p)^2 + m_\varphi^2 - i\epsilon}\,\bar u(k')u(k) \;-\; \bar u(k')u(p)\,\frac{-i}{(k'-p)^2 + m_\varphi^2 - i\epsilon}\,\bar u(p')u(k)\right] \tag{13.2} \]
The relative minus sign between the two terms is due to the T-ordering definition for the
fermions. The overall sign of the amplitude is determined by the convention adopted for
ordering the initial/final state labels for the fermions. See [12, 13]. Notice how the fermion
arrows are followed.
Note: If we had anti-fermion scattering, then all the fermion arrows will be reversed and
their momenta will be denoted as minus the previous momenta. (Thus, we may adopt a convention that the diagram displays the fermion arrow and the momentum is also
in the same direction. For a fermion the momentum is p and for an anti-fermion it is −p.) Apart from the reversal
of fermion arrows, the $u(p), \bar u(p')$ spinors go to $v(p'), \bar v(p)$ spinors.
If it is a fermion-anti-fermion scattering, then the second exchange diagram will be absent.
A specific scattering arrangement may permit the initial and the final state fermions to
be distinguishable. Then only one of the two scattering diagrams will contribute.
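Note: The reduction formula for fermions gave a factor of 2m for the wavefunctions since
we had normalized the spinors as $\bar u(p,\sigma)u(p,\sigma') = \delta_{\sigma,\sigma'}$. With the Feynman rules we have
adopted in the table the wavefunctions have only the spinors. This is equivalent to using
the normalization: $\bar u(p,\sigma)u(p,\sigma') = 2m\,\delta_{\sigma,\sigma'} = -\bar v(p,\sigma)v(p,\sigma')$, where m is of course the fermion
mass.
With this noted, and for small momentum transfer ($p' \approx p$, $k' \approx k$), the amplitude becomes,
\[ i\mathcal{M} = (ig)^2(-i)(2m)^2\left[\frac{\delta_{\sigma_p,\sigma_{p'}}\,\delta_{\sigma_k,\sigma_{k'}}}{(p'-p)^2 + m_\varphi^2 - i\epsilon} - \frac{\delta_{\sigma_p,\sigma_{k'}}\,\delta_{\sigma_{p'},\sigma_k}}{(k'-p)^2 + m_\varphi^2 - i\epsilon}\right] \tag{13.3} \]
Noting that the scalar field momentum is just the momentum transfer in both the diagrams,
we denote it by q. And $q^2 = -(q^0)^2 + |\vec q|^2 = |\vec q|^2$ in both cases, the energy transfer $q^0$ being negligible in this limit. We thus write the amplitude
more conveniently as,
\[ i\mathcal{M} = \frac{+ig^2\,4m^2}{|\vec q|^2 + m_\varphi^2 - i\epsilon}\left\{\delta_{\sigma_p,\sigma_{p'}}\,\delta_{\sigma_k,\sigma_{k'}} - \delta_{\sigma_p,\sigma_{k'}}\,\delta_{\sigma_{p'},\sigma_k}\right\}. \]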
As noted above, only one of the two terms will contribute if the fermions are distinguishable.
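In the non-relativistic scattering theory, we have $\langle\vec k'|S|\vec k\rangle := \langle\vec k'|\vec k\rangle - 2\pi i\,\delta(E_{k'} - E_k)\,\langle\vec k'|T|\vec k\rangle$ with,
\[ \langle\vec k'|T|\vec k\rangle = \langle\vec k'|H'|\vec k\rangle = \langle\vec k'|V(\vec q)|\vec k\rangle = V(|\vec q|), \qquad \vec q = \vec k' - \vec k . \]
We can read-off the potential as,
\[ V(|\vec q|) = -\frac{4m^2g^2\,(\delta\delta - \delta\delta)}{|\vec q|^2 + m_\varphi^2} := -\frac{g_{eff}^2}{\vec q^{\,2} + m_\varphi^2}. \tag{13.4} \]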
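The inverse Fourier transform gives,
\[ V(\vec x) = -g_{eff}^2\int\frac{d^3q}{(2\pi)^3}\,\frac{e^{i\vec q\cdot\vec x}}{\vec q^{\,2} + m_\varphi^2} = -\frac{g_{eff}^2}{(2\pi)^2}\int_0^\infty dq\,\frac{q^2}{q^2 + m_\varphi^2}\int_{-1}^{1} d(\cos\theta)\,e^{iqr\cos\theta} \]
\[ = -\frac{g_{eff}^2}{4\pi^2(ir)}\int_{-\infty}^{\infty} dq\,\frac{q\,e^{iqr}}{q^2 + m_\varphi^2} \qquad \text{(contour integrate to pick up } q = im_\varphi\text{)}, \]
\[ \therefore\; V(r) = -\frac{g_{eff}^2}{4\pi}\,\frac{e^{-m_\varphi r}}{r} \qquad \text{(Yukawa Potential)} \tag{13.5} \]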
This is a simple illustration how an effective potential can be inferred from the underlying
relativistically specified interaction. Several remarks are in order.
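The Fourier pair behind (13.4) and (13.5) can be checked in the reverse direction, where the radial integral converges exponentially. The sketch below is a numerical illustration only: it evaluates $\int d^3x\, e^{-i\vec q\cdot\vec x}\, e^{-m r}/(4\pi r) = \frac{1}{q}\int_0^\infty dr\,\sin(qr)\,e^{-mr}$ with Simpson's rule and compares with $1/(q^2+m^2)$. The function name, cutoff `r_max` and step count `n` are assumptions made for the example.

```python
import math

def yukawa_transform(q, m, r_max=60.0, n=20000):
    """Simpson estimate of (1/q) * integral_0^r_max dr sin(q r) exp(-m r).

    This is the 3D Fourier transform of exp(-m r)/(4 pi r) after the angular
    integration; the cutoff r_max and step count n are numerical choices.
    """
    h = r_max / n                      # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        r = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.sin(q * r) * math.exp(-m * r)
    return total * h / (3.0 * q)

m_phi = 1.0                            # scalar mass in arbitrary units (illustrative)
for q in (0.5, 1.0, 2.0):
    numeric = yukawa_transform(q, m_phi)
    exact = 1.0 / (q * q + m_phi * m_phi)
    print(q, numeric, exact)
```

Agreement of the two columns confirms that $e^{-m_\varphi r}/(4\pi r)$ and $1/(\vec q^{\,2} + m_\varphi^2)$ are a Fourier pair, which is the content of the contour integration above.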
Remark: The most important point is the sign of the potential which renders it attractive.
But is the sign unambiguous?
We have already noted that the overall sign of the amplitude is determined by the convention adopted for ordering of the fermion labels in the in/out states while the relative
sign is due to the Pauli principle. When the fermions are distinguishable, either of the two
diagrams may be chosen, there is no preference. But the diagrams contribute opposite signs!
The convention arises as follows [13]. We have taken $|p,k\rangle \sim b_p^\dagger b_k^\dagger|0\rangle$ (sequence of creation
operators follows label order in the in-state) and $\langle p',k'| \sim \langle 0|b_{k'}b_{p'}$ which is opposite to that
of the in-state, but consistent with the Hermitian conjugation. Hence,
\[ \langle p',k'|p,k\rangle \sim \langle 0|b_{k'}b_{p'}b_p^\dagger b_k^\dagger|0\rangle = \langle 0|b_{k'}b_k^\dagger|0\rangle\,\delta^3(p'-p) - \langle 0|b_{k'}b_p^\dagger b_{p'}b_k^\dagger|0\rangle \approx \delta^3(k'-k)\,\delta^3(p'-p) - \delta^3(p'-k)\,\delta^3(k'-p) \]
Actually there are $\bar\Psi\Psi$ fields in between, but being bilinears they do not change the relative
sign. This explains the overall sign.
In a non-relativistic comparison, $|q| \ll m_\varphi$, the small momentum transfer means large
spatial separation and hence fermions are distinguishable. If so, there is no Pauli principle
or anti-commutation and no relative sign either! So either of the two diagrams will give the
same answer.
Ideally, we should consider one of the fermions to be very massive to mimic a source of
potential (this means the $\vec q \to \vec 0$ limit) and match all the factors for a precise comparison.
The effective coupling contains the Kronecker deltas enforcing conservation of the spin
projections. Hence, the Yukawa potential preserves spin projection. For matching with non-relativistic normalization we have to take $2m_{fermion}\,\delta_{\sigma_p,\sigma_{p'}} \to 1$. This is needed for inferring
the strength of the Yukawa potential.
Remark: Suppose we consider fermion-anti-fermion scattering. Then we will have one of
the Kronecker deltas to have a minus sign. However, in this case, $b^\dagger \to d^\dagger$ and doing the
contractions with the Ψ's from the Yukawa coupling, we pick up another minus sign. Hence the
overall sign does not change. Hence, the Yukawa potential between fermion-anti-fermion is
also attractive. The same holds for anti-fermion scattering.
Thus, the inferred Yukawa potential is always attractive. Its underlying QFT description
is an exchange of a (virtual) massive scalar particle of mass $m_\varphi$. Yukawa proposed this
potential as a model for binding of nucleons and, working with the properties of nucleons,
estimated the scalar mass to be about 200 times the electron mass (roughly 100 MeV), which is close to the mass of the pion. Of
course pions come in three varieties, $\pi^\pm, \pi^0$, while we have taken only a single scalar.
Consider fermion scattering in QED, by replacing the Yukawa coupling by the QED
coupling. The relevant diagrams are:
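[Diagrams: the two photon-exchange diagrams for incoming momenta p, k and outgoing momenta p′, k′, with exchanged photon momenta p′ − p and k′ − p respectively.]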
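The corresponding expression is,
\[ i\mathcal{M} = (ie)^2\left[\bar u(p')\gamma^\mu u(p)\,\frac{-i\eta_{\mu\nu}}{(p'-p)^2 - i\epsilon}\,\bar u(k')\gamma^\nu u(k) \;-\; \bar u(k')\gamma^\mu u(p)\,\frac{-i\eta_{\mu\nu}}{(k'-p)^2 - i\epsilon}\,\bar u(p')\gamma^\nu u(k)\right], \tag{13.6} \]
with the same relative minus sign.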
As before, consider the case of the non-relativistic limit with distinguishable fermions and
consider only the first diagram. The denominator simplifies the same way: $(p'-p)^2 - i\epsilon \to |\vec q|^2 - i\epsilon$. In the numerator we have $[\bar u(p',\sigma')\gamma^\mu u(p,\sigma)][\bar u(k',\rho')\gamma_\mu u(k,\rho)]$.
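Claim: In the NR limit, $\bar u(p',\sigma')\gamma^\mu u(p,\sigma) \to \bar u\gamma^0 u$, i.e. only the µ = 0 component survives at leading order.
Proof: The spinor u satisfies $(\slashed{p} + m)u = 0$. Consider, say, the Dirac representation of the
gamma matrices. Then in terms of the two component notation we have,
\[ \begin{pmatrix} -\omega_{\vec p} + m & \vec p\cdot\vec\sigma \\ -\vec p\cdot\vec\sigma & \omega_{\vec p} + m \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = 0 \;\Rightarrow\; \vec p\cdot\vec\sigma\,u_1 = (\omega_{\vec p} + m)\,u_2\,,\quad \vec p\cdot\vec\sigma\,u_2 = (\omega_{\vec p} - m)\,u_1 . \]
The NR limit $\omega_{\vec p} \approx m + \frac{\vec p^{\,2}}{2m}$ then implies $u_2 \sim O(|p|/m)\,u_1$. Also, in the two component
notation, we have
\[ \bar u'\gamma^i u = \big((u_1')^\dagger, (u_2')^\dagger\big)\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = (u_1')^\dagger\sigma^i u_2 + (u_2')^\dagger\sigma^i u_1 \sim O(p/m)\,; \]
\[ \bar u'\gamma^0 u = \big((u_1')^\dagger, (u_2')^\dagger\big)\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = (u_1')^\dagger u_1 + (u_2')^\dagger u_2 \sim O(1) + O(|p|/m). \]
This proves the claim.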
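Hence the numerator approximates to $[\bar u(\vec p',\sigma')\gamma^0 u(\vec p,\sigma)][\bar u(\vec k',\rho')\gamma^0 u(\vec k,\rho)]$ and we get,
\[ i\mathcal{M} \approx \frac{ie^2\,\eta_{00}\,[u^\dagger(\vec p',\sigma')u(\vec p,\sigma)]\,[u^\dagger(\vec k',\rho')u(\vec k,\rho)]}{\vec q^{\,2} - i\epsilon}. \tag{13.7} \]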
Since the square brackets are positive ($p' \approx p$, $k' \approx k$ for small momentum transfer) and
$\eta_{00} < 0$, relative to the Yukawa matrix element we have an opposite sign and hence
the effective potential from the QED coupling will be repulsive for fermion scattering. The
form of the potential itself can be obtained from the Yukawa one by taking $m_\varphi \to 0$ and,
as expected, we get the Coulomb potential, $V(r) = \frac{(e')^2}{4\pi r} =: \frac{\alpha}{r}$. We have absorbed the
normalization factors in e′. By redefining the original e, we can take e′ → e, the measured
electric charge (in the natural units). The constant $\alpha := \frac{e^2}{4\pi} \approx \frac{1}{137}$ is called the fine structure
constant.
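The suppression of the spatial components of the current in the NR limit can be illustrated numerically. This is a minimal sketch, assuming the two-component formulas of the proof above, with $u_2$ fixed in terms of $u_1$ by the second row of the matrix equation (an overall sign of $u_2$ would not change the magnitudes). The spinors are left unnormalized, and the momenta, spin choice and helper names are illustrative.

```python
import math

def sigma_dot(vec, chi):
    """Apply (vec . sigma) to a two-component spinor chi."""
    x, y, z = vec
    a, b = chi
    return (z * a + (x - 1j * y) * b, (x + 1j * y) * a - z * b)

def inner(chi1, chi2):
    """chi1^dagger chi2 for two-component spinors."""
    return chi1[0].conjugate() * chi2[0] + chi1[1].conjugate() * chi2[1]

def spinor(p_vec, m, chi):
    """Components (u1, u2) with u2 = (p . sigma) u1 / (omega + m), unnormalized."""
    omega = math.sqrt(sum(c * c for c in p_vec) + m * m)
    lower = sigma_dot(p_vec, chi)
    return chi, (lower[0] / (omega + m), lower[1] / (omega + m))

def current(u_out, u_in):
    """Return (ubar' gamma^0 u, [ubar' gamma^i u]) from the two-component formulas."""
    (a1, a2), (b1, b2) = u_out, u_in
    time = inner(a1, b1) + inner(a2, b2)
    unit = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    space = [inner(a1, sigma_dot(e, b2)) + inner(a2, sigma_dot(e, b1)) for e in unit]
    return time, space

m = 1.0
p_small = 0.01                       # |p|/m = 0.01, an illustrative NR value
chi = (1 + 0j, 0j)                   # spin-up along z for both spinors
u_in = spinor((0.0, 0.0, p_small), m, chi)
u_out = spinor((p_small, 0.0, 0.0), m, chi)
time, space = current(u_out, u_in)
ratio = max(abs(s) for s in space) / abs(time)
print(time, space, ratio)
```

For these values the ratio comes out of order $|p|/2m$, consistent with the claim that only the $\gamma^0$ component survives at leading order.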
If we consider fermion-anti-fermion scattering via the QED coupling, the u → v does not
introduce a negative sign since it is u† u and not ūu. The b† → d† introduces a minus sign
as before and hence the overall sign changes. Hence, for fermion-anti-fermion, the potential
is attractive. For anti-fermion scattering of course the potential is repulsive.
Note: It seems the sign depends on the η00 which is convention dependent. However, the
signs in the propagator will also change accordingly and the sign of the potential will be
metric signature independent.
Note: For a tensorial interaction like gravity, helicity 2, we will have two factors of η00
and the overall sign will not change. The Newtonian potential will be attractive and will
be so for fermion-anti-fermion scattering as well. This argument is to be taken as heuristic
since we need to be explicit about the gravitational coupling to infer the numerator factors
of the spinors which are sensitive to u → v changes.
These examples indicate that exchange of quanta can be interpreted as giving an equivalent potential (and hence force) in the non-relativistic limit (where the concept of potential
is meaningful). The qualitative properties of attractive/repulsive are already encoded in the
Feynman rules. The comparison with effective potential also allows the coupling parameters
to be determined experimentally.
14. BASIC QED PROCESSES
Quantum electrodynamics has, as its basic processes, the scattering of two charged particles, the scattering of light by a charged particle, particle-anti-particle annihilation and pair creation.
These are analyzed as: (i) electron-muon scattering, (ii) electron-positron scattering as well
as production of muon-anti-muon, (iii) electron-photon scattering (Compton scattering) and
(iv) electron-positron annihilation into two photons. We will evaluate these in the leading
$e^2$ approximation. We will also compute the cross-sections for comparison with experiments. These cross-sections will be for un-polarized particles in the initial state, for
which we will average the cross-section over the initial spins/polarizations. We will also not
detect the spins/polarizations of the final particles and hence sum the cross-section over the final
spins/polarizations.
Here is the list of the processes with their diagrams and the corresponding invariant
amplitudes.
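$e^-(p)\,\mu^-(k) \to e^-(p')\,\mu^-(k')$   (for µ replaced by e it is Moller Scattering)
[Diagram: photon of momentum q exchanged between the electron line p → p′ and the muon line k → k′.]
\[ i\mathcal{M} = [\bar u(p')(ie\gamma^\mu)u(p)]\left[\frac{-i\eta_{\mu\nu}}{q^2 - i\epsilon}\right][\bar u(k')(ie\gamma^\nu)u(k)] \]
$e^+(p')\,e^-(p) \to \mu^+(k')\,\mu^-(k)$   (for µ replaced by e it is Bhabha Scattering)
[Diagram: $e^-(p)$ and $e^+(p')$ annihilate into a photon of momentum q, which produces $\mu^-(k)$ and $\mu^+(k')$.]
\[ i\mathcal{M} = [\bar v(p')(ie\gamma^\mu)u(p)]\left[\frac{-i\eta_{\mu\nu}}{q^2 - i\epsilon}\right][\bar u(k)(ie\gamma^\nu)v(k')] \]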
In the next two processes, there are two diagrams contributing, obtained by exchanging
the initial and final state photons in the Compton scattering and the two final state photons
in the annihilation process. Although the diagrams may not look ‘crossed’, they are!
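$e^-(p)\,\gamma(k) \to e^-(p')\,\gamma(k')$   (Compton Scattering)
[Diagrams: (i) the photon γ(k) is absorbed first, internal electron momentum p + k; (ii) the photon γ(k′) is emitted first, internal electron momentum p′ − k.]
\[ i\mathcal{M} = \varepsilon_\nu(\vec k',\lambda')\,\bar u(p')(ie\gamma^\nu)\,\frac{-i(-(\slashed{p}+\slashed{k}) + m)}{(p+k)^2 + m_e^2 - i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon^*_\mu(\vec k,\lambda) \]
\[ i\mathcal{M} = \varepsilon^*_\nu(\vec k,\lambda)\,\bar u(p')(ie\gamma^\nu)\,\frac{-i(-(\slashed{p}'-\slashed{k}) + m)}{(p-k')^2 + m_e^2 - i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k',\lambda') \]
$e^+(p')\,e^-(p) \to \gamma(k)\,\gamma(k')$   (Annihilation process)
[Diagrams: (i) internal electron momentum p − k; (ii) the same with the two photons exchanged, internal electron momentum p − k′ = k − p′.]
\[ i\mathcal{M} = \varepsilon_\nu(\vec k',\lambda')\,\bar v(p')(ie\gamma^\nu)\,\frac{-i(-(\slashed{p}-\slashed{k}) + m)}{(p-k)^2 + m_e^2 - i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k,\lambda) \]
\[ i\mathcal{M} = \varepsilon_\nu(\vec k,\lambda)\,\bar v(p')(ie\gamma^\nu)\,\frac{-i(-(\slashed{k}-\slashed{p}') + m)}{(p-k')^2 + m_e^2 - i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k',\lambda') \]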
Since we will be computing un-polarized (average over initial spin/polarizations), inclusive
(sum over final spin/polarizations) cross-sections, we have to evaluate $\frac{1}{2}\frac{1}{2}\sum_{\sigma,\sigma'}|\mathcal{M}|^2$ for the
fermion scattering and likewise for the other cases. For the Compton scattering and pair
annihilation where two diagrams contribute, we need to add them before taking the mod-square.
A. Electron-muon processes:
Consider the electron-muon scattering. We have,
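\[ (i\mathcal{M})(-i\mathcal{M}^*) = \frac{e^4}{(q^2 - i\epsilon)(q^2 + i\epsilon)}\,\{(\bar u(p')\gamma^\mu u(p))(u^\dagger(p)(\gamma^\nu)^\dagger(\gamma^0)^\dagger u(p'))\} \times \{(\bar u(k')\gamma_\mu u(k))(u^\dagger(k)(\gamma_\nu)^\dagger(\gamma^0)^\dagger u(k'))\} \]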
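The spins $\sigma, \sigma', \rho, \rho'$ are implicit in the above expression. Summing over all of these replaces
$u\bar u$ using the completeness relation. Each of the braces can be expressed as traces over the
gamma matrices. With the new normalizations used, the completeness relations take the
form,
\[ \sum_\sigma u(p,\sigma)\bar u(p,\sigma) = -\slashed{p} + m\,, \qquad \sum_\sigma v(p,\sigma)\bar v(p,\sigma) = -\slashed{p} - m\,. \tag{14.1} \]
Thus the electron and muon spin sums become,
\[ \{\dots\}_{p,p'} = \mathrm{Tr}[\gamma^\mu(-\slashed{p} + m_e)\gamma^\nu(-\slashed{p}' + m_e)]\,, \qquad \{\dots\}_{k,k'} = \mathrm{Tr}[\gamma_\mu(-\slashed{k} + m_\mu)\gamma_\nu(-\slashed{k}' + m_\mu)]\,. \]
This leads to,
\[ \frac{1}{4}\sum_{\sigma,\sigma',\rho,\rho'}|\mathcal{M}|^2_{e\mu\to e\mu} = \frac{1}{4}\,\frac{e^4}{(q^2 - i\epsilon)(q^2 + i\epsilon)}\,\mathrm{Tr}[\gamma^\mu(-\slashed{p} + m_e)\gamma^\nu(-\slashed{p}' + m_e)] \times \mathrm{Tr}[\gamma_\mu(-\slashed{k} + m_\mu)\gamma_\nu(-\slashed{k}' + m_\mu)] \tag{14.2} \]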
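For the $e^+e^- \to \mu^+\mu^-$ we will have,
\[ (i\mathcal{M})(-i\mathcal{M}^*) = \frac{e^4}{(q^2 - i\epsilon)(q^2 + i\epsilon)}\,\{(\bar v(p')\gamma^\mu u(p))(u^\dagger(p)(\gamma^\nu)^\dagger(\gamma^0)^\dagger v(p'))\} \times \{(\bar u(k)\gamma_\mu v(k'))(v^\dagger(k')(\gamma_\nu)^\dagger(\gamma^0)^\dagger u(k))\} \]
This is the same as before except the p′, k′ spinors are v spinors. Using the corresponding
completeness relation we get,
\[ \frac{1}{4}\sum_{\sigma,\sigma',\rho,\rho'}|\mathcal{M}|^2_{e^+e^-\to\mu^+\mu^-} = \frac{1}{4}\,\frac{e^4}{(q^2 - i\epsilon)(q^2 + i\epsilon)}\,\mathrm{Tr}[\gamma^\mu(-\slashed{p} + m_e)\gamma^\nu(-\slashed{p}' - m_e)] \times \mathrm{Tr}[\gamma_\mu(-\slashed{k} + m_\mu)\gamma_\nu(-\slashed{k}' - m_\mu)] \tag{14.3} \]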
The Compton scattering and annihilation processes have two diagrams each and each of
these has a single Dirac trace. But each diagram also has photon polarization completeness
relations. Incidentally, for Moller and Bhabha processes, there are ‘crossed diagrams’ too.
We will do these later.
To proceed with the two processes above, we need the traces of 2 and 4 gamma matrices.
Here are the relevant formulae.
\[ \mathrm{Tr}\,\mathbb{1} = 4; \tag{14.4} \]
\[ \mathrm{Tr}\,\gamma^\mu\gamma^\nu = -4\eta^{\mu\nu}; \tag{14.5} \]
\[ \mathrm{Tr}\,\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta = 4\left(\eta^{\mu\nu}\eta^{\alpha\beta} - \eta^{\mu\alpha}\eta^{\nu\beta} + \eta^{\mu\beta}\eta^{\nu\alpha}\right) \tag{14.6} \]
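The trace formulae (14.5) and (14.6) can be verified by brute force with an explicit representation. The sketch below assumes the Dirac representation $\gamma^0 = \mathrm{diag}(1,1,-1,-1)$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0\end{pmatrix}$, which satisfies $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$ for $\eta = \mathrm{diag}(-1,1,1,1)$. The helper names are illustrative.

```python
import itertools

ETA = [-1, 1, 1, 1]                     # diagonal of eta^{mu nu}, mostly-plus

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(a):
    return [[-x for x in row] for row in a]

ONE, ZERO = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [blocks(ONE, ZERO, ZERO, neg(ONE))] + [blocks(ZERO, s, neg(s), ZERO) for s in PAULI]

def trace(indices):
    """Trace of the product gamma^{mu_1} gamma^{mu_2} ... for the given index list."""
    prod = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    for mu in indices:
        prod = matmul(prod, GAMMA[mu])
    return sum(prod[i][i] for i in range(4))

def eta(mu, nu):
    return ETA[mu] if mu == nu else 0

# (14.5): Tr gamma^mu gamma^nu = -4 eta^{mu nu}
ok2 = all(abs(trace([m, n]) + 4 * eta(m, n)) < 1e-12
          for m, n in itertools.product(range(4), repeat=2))

# (14.6): Tr of four gamma matrices
ok4 = all(abs(trace([m, n, a, b])
              - 4 * (eta(m, n) * eta(a, b) - eta(m, a) * eta(n, b) + eta(m, b) * eta(n, a))) < 1e-12
          for m, n, a, b in itertools.product(range(4), repeat=4))

print(ok2, ok4)
```

Since the identities follow from the Clifford algebra alone, any other representation obeying the same anticommutation relations would pass the same check.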
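Using these, we get
\[ \mathrm{Tr}[\gamma^\mu(-\slashed{p} + m_e)\gamma^\nu(-\slashed{p}' + m_e)] = p_\alpha p'_\beta\,\mathrm{Tr}[\gamma^\mu\gamma^\alpha\gamma^\nu\gamma^\beta] + m_e^2\,\mathrm{Tr}[\gamma^\mu\gamma^\nu] = 4(p^\mu p'^\nu - \eta^{\mu\nu}\,p\cdot p' + p^\nu p'^\mu) - 4m_e^2\,\eta^{\mu\nu} \]
and,
\[ \mathrm{Tr}[\gamma_\mu(-\slashed{k} + m_\mu)\gamma_\nu(-\slashed{k}' + m_\mu)] = k_\alpha k'_\beta\,\mathrm{Tr}[\gamma_\mu\gamma^\alpha\gamma_\nu\gamma^\beta] + m_\mu^2\,\mathrm{Tr}[\gamma_\mu\gamma_\nu] = 4(k_\mu k'_\nu - \eta_{\mu\nu}\,k\cdot k' + k_\nu k'_\mu) - 4m_\mu^2\,\eta_{\mu\nu}\,. \]
Dotting the traces gives,
\[ \mathrm{Tr}[\dots]\,\mathrm{Tr}[\dots] = 32\,[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) + m_\mu^2\,p\cdot p' + m_e^2\,k\cdot k' + 2m_e^2m_\mu^2] \]
\[ \therefore\; \frac{1}{4}\sum_{spins}|\mathcal{M}|^2_{e\mu\to e\mu} = \frac{8e^4}{q^4}\,[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) + m_\mu^2\,p\cdot p' + m_e^2\,k\cdot k' + 2m_e^2m_\mu^2] \tag{14.7} \]
where $q = p' - p$.
For the $e^+e^- \to \mu^+\mu^-$ process, the only difference is $m_e^2, m_\mu^2$ changing signs. Thus we
have,
\[ \frac{1}{4}\sum_{spins}|\mathcal{M}|^2_{e^+e^-\to\mu^+\mu^-} = \frac{8e^4}{q^4}\,[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) - m_\mu^2\,p\cdot p' - m_e^2\,k\cdot k' + 2m_e^2m_\mu^2] \tag{14.8} \]
where $q = p + p'$.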
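The contraction of the two traces is pure tensor algebra and can be checked numerically for arbitrary four-vectors. A minimal sketch follows, assuming $\eta = \mathrm{diag}(-1,1,1,1)$; the vectors, masses and function names are illustrative and the vectors need not be on shell for the identity to hold.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]

def dot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def lepton_tensor(a, b, mass_term):
    """4(a^mu b^nu + a^nu b^mu - eta^{mu nu} a.b) - 4*mass_term*eta^{mu nu}, upper indices."""
    ab = dot(a, b)
    return [[4 * (a[m] * b[n] + a[n] * b[m]) - 4 * (ab + mass_term) * (ETA[m] if m == n else 0.0)
             for n in range(4)] for m in range(4)]

def contract(t1, t2):
    """T1^{mu nu} T2_{mu nu}; both stored with upper indices, lowered with the diagonal metric."""
    return sum(t1[m][n] * ETA[m] * ETA[n] * t2[m][n] for m in range(4) for n in range(4))

def bracket(p, pp, k, kp, me2, mmu2):
    """The square bracket of (14.7); flipping the signs of me2, mmu2 gives (14.8)."""
    return (dot(p, k) * dot(pp, kp) + dot(p, kp) * dot(pp, k)
            + mmu2 * dot(p, pp) + me2 * dot(k, kp) + 2 * me2 * mmu2)

# Arbitrary illustrative four-vectors (upper components) and squared masses.
p, pp = [2.0, 0.3, -0.4, 1.1], [1.7, -0.2, 0.9, 0.5]
k, kp = [3.1, 1.0, 0.2, -0.6], [2.4, 0.7, -1.3, 0.8]
me2, mmu2 = 0.25, 1.44

scattering_lhs = contract(lepton_tensor(p, pp, me2), lepton_tensor(k, kp, mmu2))
scattering_rhs = 32 * bracket(p, pp, k, kp, me2, mmu2)

production_lhs = contract(lepton_tensor(p, pp, -me2), lepton_tensor(k, kp, -mmu2))
production_rhs = 32 * bracket(p, pp, k, kp, -me2, -mmu2)

print(scattering_lhs, scattering_rhs, production_lhs, production_rhs)
```

The second pair illustrates the statement that the production process differs from the scattering one only through the signs of the mass-squared terms.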
To relate to cross-section, we can use the center of mass frame. For the electron-muon
scattering, we may choose
\[ p^\mu = (E_e, p\hat z)\,,\; k^\mu = (E_\mu, -p\hat z)\,,\; p'^\mu = (E_e', \vec p')\,,\; k'^\mu = (E_\mu', -\vec p')\,,\; \cos(\Theta_{cm}) := \hat z\cdot\hat p'\,. \]
Notice that $E_e + E_\mu = \sqrt{p^2 + m_e^2} + \sqrt{p^2 + m_\mu^2} = E_e' + E_\mu' = \sqrt{(p')^2 + m_e^2} + \sqrt{(p')^2 + m_\mu^2}$
implies that $|p'| = |p|$. The magnitude of the center of mass momentum and the scattering
angle are the only two independent parameters given the electron and muon masses.
For $e^+e^- \to \mu^+\mu^-$, a good deal of convenience ensues. The equality of magnitudes of
momenta, together with the masses being identical within the initial state and within the final state, implies that the energies
of the individual particles in the initial state are equal, as are those in the final state, and total energy conservation implies
these energies are equal too. The initial and final momenta magnitudes are then simply
related by the masses. Thus we may choose,
\[ p^\mu = (E, p\hat z)\,,\; (p')^\mu = (E, -p\hat z)\,,\; k^\mu = (E, \vec k)\,,\; (k')^\mu = (E, -\vec k)\,,\; \cos(\Theta_{cm}) := \hat z\cdot\hat k\,. \]
Again, the center of mass energy and the scattering angle are the only independent
parameters given the masses.
It remains to simplify the expression for the cross-sections. We will give it for the muon
production process and leave the eµ scattering process as an exercise. We will also neglect
the electron mass compared to the muon mass.
The dot products of momenta, in the center of mass, take the form $q^2 = (p + p')^2 = -(2E)^2 = -4(p^2 + m_e^2) \approx -4p^2$. Next, $p\cdot p' = -E^2 - p^2 \approx -2E^2$ and
\[ p\cdot k = p'\cdot k' = -E^2 + Ek\cos(\Theta_{cm})\,, \qquad p\cdot k' = p'\cdot k = -E^2 - Ek\cos(\Theta_{cm})\,, \]
where p and k in these expressions denote the magnitudes of the spatial momenta.
Substituting in the invariant amplitude gives ($m_e = 0$ set),
\[ \frac{1}{4}\sum_{spins}|\mathcal{M}|^2_{e^+e^-\to\mu^+\mu^-} = \frac{8e^4}{(-4E^2)^2}\left[(E^2 + Ek\cos(\Theta_{cm}))^2 + (E^2 - Ek\cos(\Theta_{cm}))^2 - m_\mu^2(-2E^2)\right] \]
\[ = e^4\left[1 + \frac{k^2\cos^2(\Theta_{cm})}{E^2} + \frac{m_\mu^2}{E^2}\right] = e^4\left[\left(1 + \frac{m_\mu^2}{E^2}\right) + \left(1 - \frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] \tag{14.9} \]
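The reduction to the centre of mass form can be checked by evaluating the invariant expression (14.8) on explicit four-vectors. The sketch below assumes $m_e = 0$ and $e = 1$, with illustrative values of E, $m_\mu$ and the scattering angle in arbitrary units; the function names are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]

def dot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def averaged_amplitude(p, pp, k, kp, m_mu, e=1.0):
    """(1/4) sum |M|^2 from (14.8) with the electron mass set to zero."""
    q = [p[i] + pp[i] for i in range(4)]
    q4 = dot(q, q) ** 2
    bracket = dot(p, k) * dot(pp, kp) + dot(p, kp) * dot(pp, k) - m_mu ** 2 * dot(p, pp)
    return 8 * e ** 4 / q4 * bracket

def closed_form(E, theta, m_mu, e=1.0):
    """Right-hand side of (14.9)."""
    r = (m_mu / E) ** 2
    return e ** 4 * ((1 + r) + (1 - r) * math.cos(theta) ** 2)

E, m_mu, theta = 1.5, 1.0, 0.8             # illustrative values
kmag = math.sqrt(E * E - m_mu * m_mu)      # muon momentum magnitude
p = [E, 0.0, 0.0, E]                       # massless electron along +z
pp = [E, 0.0, 0.0, -E]                     # massless positron along -z
k = [E, kmag * math.sin(theta), 0.0, kmag * math.cos(theta)]
kp = [E, -k[1], -k[2], -k[3]]

from_invariants = averaged_amplitude(p, pp, k, kp, m_mu)
from_closed_form = closed_form(E, theta, m_mu)
print(from_invariants, from_closed_form)
```

The two numbers agree to rounding error, which checks both the dot products listed above and the final rearrangement into the $(1 + \cos^2\Theta)$-type form.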
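The differential cross-section is given by (11.21),
\[ \frac{d\sigma}{d\Omega_{cm}} = \frac{1}{64\pi^2 s}\,\frac{k'_{cm}}{k_{cm}}\left(\frac{1}{4}\sum_{spins}|\mathcal{M}|^2\right) \qquad \text{where } s = 4E^2,\; m_e = 0,\; m_1' = m_2' = m_\mu \;\Rightarrow \]
\[ = \frac{1}{64\pi^2 s}\,\frac{\sqrt{s^2 - 4s\,m_\mu^2}}{s}\; e^4\left[\left(1 + \frac{m_\mu^2}{E^2}\right) + \left(1 - \frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] \]
\[ = \left(\frac{e^2}{4\pi}\right)^2\frac{1}{4\cdot 4E^2}\,\sqrt{1 - \frac{m_\mu^2}{E^2}}\,\left[\left(1 + \frac{m_\mu^2}{E^2}\right) + \left(1 - \frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] \tag{14.10} \]
Putting $\alpha := \frac{e^2}{4\pi}$, the fine structure constant, and $E_{cm} = 2E$, we write the unpolarized
$e^+e^- \to \mu^+\mu^-$ cross-section as,
\[ \frac{d\sigma}{d\Omega_{cm}}(E, \Theta_{cm}) = \frac{\alpha^2}{4E_{cm}^2}\,\sqrt{1 - \frac{m_\mu^2}{E^2}}\,\left[\left(1 + \frac{m_\mu^2}{E^2}\right) + \left(1 - \frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] \tag{14.11} \]
\[ \sigma_{total} := \int d\Omega\,\frac{d\sigma}{d\Omega} = \frac{4\pi\alpha^2}{3E_{cm}^2}\,\sqrt{1 - \frac{m_\mu^2}{E^2}}\,\left(1 + \frac{m_\mu^2}{2E^2}\right) \tag{14.12} \]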
Remarks: The square root factor shows that $E \ge m_\mu$ must hold for the muon production
to proceed. In the ultra-relativistic limit, $E \gg m_\mu$, the differential cross-section has the
characteristic $(1 + \cos^2(\Theta))$ angular dependence.
Since the cross-section is a measured quantity and the center of mass energy is under
our control, by comparing the pair production cross-sections for two different final states,
eg µ, τ , and taking the ratio, we can obtain bounds on the masses of the heavier leptons.
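The step from the differential cross-section to the total one is a single angular integral, which can be checked numerically. A minimal sketch follows, assuming the closed form $\sigma_{total} = \frac{4\pi\alpha^2}{3E_{cm}^2}\sqrt{1 - m_\mu^2/E^2}\,(1 + m_\mu^2/2E^2)$ obtained by integrating (14.11) over the sphere. The beam energy is an illustrative choice, energies are in GeV so the cross-sections come out in GeV⁻², and the function names are hypothetical.

```python
import math

def dsigma_domega(E, theta, m_mu, alpha):
    """Equation (14.11): unpolarized e+ e- -> mu+ mu- in the CM frame, m_e = 0."""
    r = (m_mu / E) ** 2
    prefactor = alpha ** 2 / (4 * (2 * E) ** 2) * math.sqrt(1 - r)
    return prefactor * ((1 + r) + (1 - r) * math.cos(theta) ** 2)

def sigma_total_numeric(E, m_mu, alpha, n=2000):
    """2*pi * integral_0^pi sin(theta) dtheta of (14.11), midpoint rule with n steps."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.sin(theta) * dsigma_domega(E, theta, m_mu, alpha)
    return 2 * math.pi * h * total

def sigma_total_closed(E, m_mu, alpha):
    """Closed form of the angular integral of (14.11)."""
    r = (m_mu / E) ** 2
    return 4 * math.pi * alpha ** 2 / (3 * (2 * E) ** 2) * math.sqrt(1 - r) * (1 + r / 2)

alpha = 1 / 137.0
m_mu = 0.10566                       # muon mass in GeV
E = 0.2                              # beam energy in GeV (illustrative)
numeric = sigma_total_numeric(E, m_mu, alpha)
closed = sigma_total_closed(E, m_mu, alpha)
print(numeric, closed)
```

Repeating the evaluation with a different final-state mass (for instance that of the τ) gives the ratio of pair production cross-sections mentioned in the remark above.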
Exercises: Obtain the differential and the total cross-section for eµ → eµ process. Also
for the Moller and the Bhabha processes.
B. Compton Scattering:
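Recall the total amplitude from the two diagrams (we take the photon polarizations to
be real),
\[ i\mathcal{M} = \varepsilon_\nu(\vec k',\lambda')\,\bar u(p')(ie\gamma^\nu)\,\frac{-i(-(\slashed{p}+\slashed{k}) + m)}{(p+k)^2 + m_e^2 - i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k,\lambda) + \varepsilon_\nu(\vec k,\lambda)\,\bar u(p')(ie\gamma^\nu)\,\frac{-i(-(\slashed{p}'-\slashed{k}) + m)}{(p'-k)^2 + m_e^2 - i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k',\lambda') \tag{14.13} \]
\[ \mathcal{M} = e^2\,\varepsilon_\mu(\vec k,\lambda)\,\varepsilon_\nu(\vec k',\lambda')\,\bar u(p')\left[\frac{\gamma^\nu(-(\slashed{p}+\slashed{k}) + m)\gamma^\mu}{(p+k)^2 + m_e^2 - i\epsilon} + \frac{\gamma^\mu(-(\slashed{p}'-\slashed{k}) + m)\gamma^\nu}{(p'-k)^2 + m_e^2 - i\epsilon}\right]u(p) \tag{14.14} \]
In the second (crossed diagram) terms, we have used $p - k' = p' - k$ which is more convenient.
Simplification: Since the momenta are on shell, $p^2 = -m^2$, $k^2 = 0 = (k')^2$, we get
\[ (p+k)^2 + m^2 = p^2 + m^2 + k^2 + 2p\cdot k = 2p\cdot k\,, \qquad (p'-k)^2 + m^2 = -2p'\cdot k\,; \]
\[ (-\slashed{p} + m)\gamma^\mu u(p) = (2p^\mu + \gamma^\mu(\slashed{p} + m))u(p) = 2p^\mu u(p)\,, \]
\[ \bar u(p')\gamma^\mu(-\slashed{p}' + m) = \bar u(p')(2(p')^\mu + (\slashed{p}' + m)\gamma^\mu) = 2(p')^\mu\,\bar u(p')\,. \]
\[ \mathcal{M} = e^2\,\varepsilon_\mu(\vec k,\lambda)\,\varepsilon_\nu(\vec k',\lambda')\,\bar u(p')\left[\frac{\gamma^\nu(2p^\mu - \slashed{k}\gamma^\mu)}{2p\cdot k - i\epsilon} + \frac{(2(p')^\mu + \gamma^\mu\slashed{k})\gamma^\nu}{-2p'\cdot k - i\epsilon}\right]u(p) \tag{14.15} \]
\[ \mathcal{M}^\dagger = e^2\,\varepsilon_\alpha(\vec k,\lambda)\,\varepsilon_\beta(\vec k',\lambda')\,\bar u(p)\left[\frac{(2p^\alpha - \gamma^\alpha\slashed{k})\gamma^\beta}{2p\cdot k + i\epsilon} + \frac{\gamma^\beta(2(p')^\alpha + \slashed{k}\gamma^\alpha)}{-2p'\cdot k + i\epsilon}\right]u(p') \tag{14.16} \]
In the second equation, we have used
\[ [\bar u(p')\,\Gamma\,u(p)]^* = \bar u(p)\,\gamma^0\Gamma^\dagger\gamma^0\,u(p') = \bar u(p)\,\Gamma_{reversed}\,u(p'). \]
Here $\Gamma_{reversed}$ has the order of the gamma matrices reversed due to the † operation.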
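Summing and averaging $|\mathcal{M}|^2$ over the spins and polarizations, we get,
\[ \frac{1}{4}\sum_{\sigma,\sigma',\lambda,\lambda'}|\mathcal{M}|^2 = \frac{e^4}{4}\,\mathrm{Tr}\Bigg[\left\{\frac{\gamma^\nu(2p^\mu - \slashed{k}\gamma^\mu)}{2p\cdot k - i\epsilon} + \frac{(2(p')^\mu + \gamma^\mu\slashed{k})\gamma^\nu}{-2p'\cdot k - i\epsilon}\right\}(-\slashed{p} + m)\left\{\frac{(2p^\alpha - \gamma^\alpha\slashed{k})\gamma^\beta}{2p\cdot k + i\epsilon} + \frac{\gamma^\beta(2(p')^\alpha + \slashed{k}\gamma^\alpha)}{-2p'\cdot k + i\epsilon}\right\}(-\slashed{p}' + m)\Bigg] \times \left\{\sum_\lambda\varepsilon_\mu(\vec k,\lambda)\varepsilon_\alpha(\vec k,\lambda)\right\}\left\{\sum_{\lambda'}\varepsilon_\nu(\vec k',\lambda')\varepsilon_\beta(\vec k',\lambda')\right\} \tag{14.17} \]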
We have used the completeness relation (14.1) to convert the spin sum. We have a trace
over strings of γ matrices, but now we also have sums over the photon polarizations. An
explicit expression can be obtained by introducing an explicit set of 4 orthonormal vectors.
Consider the polarization sum. Given a k with $k^2 = 0$, we can write $k^\mu = (|\vec k|, \vec k)$. Introduce
another vector $\tilde k := C(|\vec k|, -\vec k)$ so that $\tilde k^2 = 0$. Fix C by demanding $k\cdot\tilde k = -2 \leftrightarrow C = 1/|\vec k|^2$.
We have two transverse directions and we take polarizations $\varepsilon(\vec k,\lambda)$ along these two directions
and mutually orthonormalized. These are space-like vectors and have no time component.
These 4 vectors, $k, \tilde k, \varepsilon(\vec k,1), \varepsilon(\vec k,2)$, are independent and the completeness relation takes the
form,
\[ -\frac{k^\mu\tilde k^\nu + k^\nu\tilde k^\mu}{2} + \varepsilon^\mu(\vec k,1)\varepsilon^\nu(\vec k,1) + \varepsilon^\mu(\vec k,2)\varepsilon^\nu(\vec k,2) = \eta^{\mu\nu} \]
Thus we get the sum over photon polarizations as,
\[ \sum_{\lambda=1,2}\varepsilon_\mu(\vec k,\lambda)\varepsilon_\nu(\vec k,\lambda) = \eta_{\mu\nu} + \frac{k_\mu\tilde k_\nu + k_\nu\tilde k_\mu}{2} \tag{14.18} \]
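The completeness relation can be verified for an explicit photon momentum. The sketch below builds $k$, $\tilde k$ and two transverse unit vectors and checks that $-\frac{1}{2}(k^\mu\tilde k^\nu + k^\nu\tilde k^\mu) + \sum_\lambda \varepsilon^\mu\varepsilon^\nu$ reproduces $\eta^{\mu\nu} = \mathrm{diag}(-1,1,1,1)$. The momentum and the cross-product construction of the transverse vectors are illustrative choices.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]

def dot4(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

kvec = (0.3, -1.2, 0.5)                       # illustrative photon 3-momentum (not along z)
kmag = math.sqrt(sum(c * c for c in kvec))
C = 1.0 / kmag ** 2
k = [kmag, *kvec]                             # k^mu = (|k|, k)
k_tilde = [C * kmag, *(-C * c for c in kvec)] # k-tilde^mu = C (|k|, -k)

# Two orthonormal spatial vectors transverse to kvec, with no time component.
e1 = normalize(cross(kvec, (0.0, 0.0, 1.0)))
e2 = normalize(cross(kvec, e1))
eps1, eps2 = [0.0, *e1], [0.0, *e2]

lhs = [[-(k[m] * k_tilde[n] + k[n] * k_tilde[m]) / 2
        + eps1[m] * eps1[n] + eps2[m] * eps2[n]
        for n in range(4)] for m in range(4)]

max_deviation = max(abs(lhs[m][n] - (ETA[m] if m == n else 0.0))
                    for m in range(4) for n in range(4))
print(dot4(k, k_tilde), max_deviation)
```

The printed dot product confirms the normalization $k\cdot\tilde k = -2$ used to fix C.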
Claim: The terms containing $k, \tilde k$ in the polarization sum do not contribute to the
$\sum|\mathcal{M}|^2$.
Proof: Observe that we can always express $\mathcal{M} = \varepsilon_\mu(\vec k,\lambda)M^\mu$, by just taking out the $\varepsilon_\mu$.
Then $\sum_\lambda|\mathcal{M}|^2 = M^\mu(M^*)^\nu\{\eta_{\mu\nu} + \frac{1}{2}(k_\mu\tilde k_\nu + k_\nu\tilde k_\mu)\}$. The k terms contracting with M can be
seen by replacing $\varepsilon_\mu \to k_\mu$. It is more convenient to write the fermion propagators using
$\frac{-\slashed{q} + m}{q^2 + m^2 - i\epsilon} = \frac{1}{\slashed{q} + m}$, since $(-\slashed{q} + m)(\slashed{q} + m) = q^2 + m^2$. Then,
\[ \mathcal{M}|_{\varepsilon_\mu = k_\mu} = e^2\,\bar u(p')\left[\slashed{\varepsilon}(k')\,\frac{1}{\slashed{p} + \slashed{k} + m}\,\{\slashed{k} + \slashed{p} + m - \slashed{p} - m\} + \{\slashed{k} - \slashed{p}' - m + \slashed{p}' + m\}\,\frac{1}{\slashed{p}' - \slashed{k} + m}\,\slashed{\varepsilon}(k')\right]u(p) \]
\[ = e^2\left[\bar u(p')\slashed{\varepsilon}(k')u(p) - \bar u(p')\slashed{\varepsilon}(k')\,\frac{1}{\slashed{p} + \slashed{k} + m}\,(\slashed{p} + m)u(p) + \bar u(p')(-\slashed{\varepsilon}(k'))u(p) + \bar u(p')(\slashed{p}' + m)\,\frac{1}{\slashed{p}' - \slashed{k} + m}\,\slashed{\varepsilon}(k')u(p)\right] = 0\,! \]
We have used $p - k' = p' - k$ in the second term in the first equation. In the last equation,
the first and the third terms cancel while the second and the fourth terms vanish by the equation
of motion. This proves the claim. Hence, effectively, each of the polarization sums gives only
the $\eta_{\mu\nu}$.
Note: That the total scattering amplitude vanishes when the photon transverse polarization
is replaced by a longitudinal one is a general result known as a ‘Ward identity’. This is a
consequence of gauge invariance. While we do not discuss the general proof here, it suffices
to note that (a) we used the Dirac equation: $(\slashed{p} + m)u(p) = 0 = \bar u(p')(\slashed{p}' + m)$; (b) the QED
coupling has the form $\sim A_\mu J^\mu$ with $J^\mu \sim \bar\Psi\gamma^\mu\Psi \sim \bar u(p')\gamma^\mu u(p)$; (c) $\delta A_\mu = \partial_\mu\Lambda$ is a gauge
transformation ($\leftrightarrow \varepsilon_\mu(k) \to \varepsilon_\mu(k) + k_\mu\lambda(k)$) and gauge invariance of the interaction implies
$\partial_\mu J^\mu = 0$. Thus, the vanishing of the amplitude is related to gauge invariance.
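The $|\mathcal{M}|^2$ in equation (14.17), summed over spins and polarizations, becomes,
\[ \sum_{\sigma,\sigma',\lambda,\lambda'}|\mathcal{M}|^2 = e^4\,\mathrm{Tr}\Bigg[\left\{\frac{\gamma^\nu(2p^\mu - \slashed{k}\gamma^\mu)}{2p\cdot k - i\epsilon} + \frac{(2(p')^\mu + \gamma^\mu\slashed{k})\gamma^\nu}{-2p'\cdot k - i\epsilon}\right\}(-\slashed{p} + m)\left\{\frac{(2p_\mu - \gamma_\mu\slashed{k})\gamma_\nu}{2p\cdot k + i\epsilon} + \frac{\gamma_\nu(2(p')_\mu + \slashed{k}\gamma_\mu)}{-2p'\cdot k + i\epsilon}\right\}(-\slashed{p}' + m)\Bigg] \tag{14.19} \]
\[ = e^4\Bigg[\underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu - \gamma^\nu\slashed{k}\gamma^\mu)\,\slashed{p}\,(2p_\mu\gamma_\nu - \gamma_\mu\slashed{k}\gamma_\nu)\,\slashed{p}'\}}{(2p\cdot k)^2}}_{(1)} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu + \gamma^\mu\slashed{k}\gamma^\nu)\,\slashed{p}\,(2p'_\mu\gamma_\nu + \gamma_\nu\slashed{k}\gamma_\mu)\,\slashed{p}'\}}{(-2p'\cdot k)^2}}_{(2)} \]
\[ \qquad + \underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu - \gamma^\nu\slashed{k}\gamma^\mu)\,\slashed{p}\,(2p'_\mu\gamma_\nu + \gamma_\nu\slashed{k}\gamma_\mu)\,\slashed{p}'\}}{-4\,p\cdot k\;p'\cdot k}}_{(3)} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu + \gamma^\mu\slashed{k}\gamma^\nu)\,\slashed{p}\,(2p_\mu\gamma_\nu - \gamma_\mu\slashed{k}\gamma_\nu)\,\slashed{p}'\}}{-4\,p\cdot k\;p'\cdot k}}_{(4)}\Bigg] \]
\[ \quad + m^2e^4\Bigg[\underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu - \gamma^\nu\slashed{k}\gamma^\mu)(2p_\mu\gamma_\nu - \gamma_\mu\slashed{k}\gamma_\nu)\}}{(2p\cdot k)^2}}_{(5)} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu + \gamma^\mu\slashed{k}\gamma^\nu)(2p'_\mu\gamma_\nu + \gamma_\nu\slashed{k}\gamma_\mu)\}}{(-2p'\cdot k)^2}}_{(6)} \]
\[ \qquad + \underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu - \gamma^\nu\slashed{k}\gamma^\mu)(2p'_\mu\gamma_\nu + \gamma_\nu\slashed{k}\gamma_\mu)\}}{-4\,p\cdot k\;p'\cdot k}}_{(7)} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu + \gamma^\mu\slashed{k}\gamma^\nu)(2p_\mu\gamma_\nu - \gamma_\mu\slashed{k}\gamma_\nu)\}}{-4\,p\cdot k\;p'\cdot k}}_{(8)}\Bigg] \tag{14.20} \]
The terms linear in m have an odd number of γ’s and hence vanish under trace.
It may be checked easily that (2) is obtained from (1) by $p \leftrightarrow -p'$ and likewise for
(5) ↔ (6). Noting the identity $\mathrm{Tr}(\gamma_1\gamma_2\cdots\gamma_n) = \mathrm{Tr}(\gamma_n\gamma_{n-1}\cdots\gamma_2\gamma_1)$, it follows that (3) = (4)
and (7) = (8). The first group of 4 terms involves trace of a maximum of 8 gamma matrices
while the last 4 terms involve a maximum of 6 gamma’s. These are simplified using various
identities among the gamma matrices.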
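Consider (1). We have the trace,
\[ \mathrm{Tr}\left[4p^2\,\gamma^\nu\slashed{p}\gamma_\nu\slashed{p}' - 2\gamma^\nu\slashed{p}\slashed{p}\slashed{k}\gamma_\nu\slashed{p}' - 2\gamma^\nu\slashed{k}\slashed{p}\slashed{p}\gamma_\nu\slashed{p}' + \gamma^\nu\slashed{k}\gamma^\mu\slashed{p}\gamma_\mu\slashed{k}\gamma_\nu\slashed{p}'\right] \]
Use: $\gamma^\nu\slashed{p}\gamma_\nu = +2\slashed{p}$, $\slashed{p}\slashed{p} = -p^2$, $\slashed{k}\slashed{p} + \slashed{p}\slashed{k} = -2p\cdot k$, $\mathrm{Tr}\,\slashed{p}\slashed{p}' = -4p\cdot p'$ and the cyclic property
of the trace to get the above trace as,
\[ \mathrm{Tr}\{\cdots\}_{(1)} = (4p^2)(2)(-4p\cdot p') + 2p^2(2)(-4k\cdot p')\times 2 + (2)(2)\,\mathrm{Tr}(\slashed{k}\slashed{p}\slashed{k}\slashed{p}') = 32m^2(p\cdot p' + k\cdot p') + 32(k\cdot p)(k\cdot p') \tag{14.21} \]
We have used $p^2 = -m^2$ and $k^2 = 0$.
Similarly we get the other traces (without the denominator factors) as,
\[ \mathrm{Tr}\{\cdots\}_{(3)} = 4p\cdot p'\,\mathrm{Tr}(\gamma^\nu\slashed{p}\gamma_\nu\slashed{p}') - 2\,\mathrm{Tr}(\slashed{k}\slashed{p}'\slashed{p}\gamma_\nu\slashed{p}'\gamma^\nu) + 2\,\mathrm{Tr}(\gamma^\nu\slashed{p}\gamma_\nu\slashed{k}\slashed{p}\slashed{p}') - \mathrm{Tr}(\gamma^\nu\slashed{k}\gamma^\mu\slashed{p}\gamma_\nu\slashed{k}\gamma_\mu\slashed{p}') \tag{14.22} \]
\[ = -32(p\cdot p')^2 + (k\cdot p - k\cdot p')(32\,p\cdot p' - 16m^2) + 0 \tag{14.23} \]
\[ \mathrm{Tr}\{\cdots\}_{(5)} = 4p^2\,\mathrm{Tr}(\gamma^\nu\gamma_\nu) - 2\,\mathrm{Tr}(\slashed{k}\slashed{p}(-4\cdot\mathbb{1})) + 8\,\mathrm{Tr}(\slashed{p}\slashed{k}) + 16\,\mathrm{Tr}(\slashed{k}\slashed{k}) = 64m^2 - 64\,k\cdot p \tag{14.24} \]
\[ \mathrm{Tr}\{\cdots\}_{(7)} = -64\,p\cdot p' + 32(k\cdot p - k\cdot p') - \mathrm{Tr}(\gamma^\nu\slashed{k}\gamma^\mu\gamma_\nu\slashed{k}\gamma_\mu) = -64\,p\cdot p' + 32(k\cdot p - k\cdot p') + 0 \tag{14.25} \]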
To simplify the expression further, we need to use: $p\cdot p' = -m^2 + k\cdot p - k\cdot p'$ which
follows from $0 = (k')^2 = (p + k - p')^2$. For comparison with [13], it is useful to note:
$k\cdot p = k'\cdot p'$, $p\cdot k' = p'\cdot k$.
The final expression is,
\[ \frac{1}{4}\sum_{spin/Pol}|\mathcal{M}|^2 = 2e^4\left[\frac{p\cdot k}{p\cdot k'} + \frac{p\cdot k'}{p\cdot k} + 2m^2\left(\frac{1}{p\cdot k'} - \frac{1}{p\cdot k}\right) + m^4\left(\frac{1}{p\cdot k} - \frac{1}{p\cdot k'}\right)^2\right] \tag{14.26} \]
Exercise: Check the algebra!
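One way to check the algebra without redoing the traces by hand is to evaluate the trace of (14.19) numerically, with explicit gamma matrices, and compare it with four times the right-hand side of (14.26). The following is a sketch under stated assumptions: the Dirac representation with $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$, $e = 1$, the $i\epsilon$ terms dropped, and illustrative lab-frame kinematics. All helper names are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]          # eta = diag(-1, 1, 1, 1)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def comb(a, b, ca, cb):
    """Linear combination ca*a + cb*b of two 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def trace(a):
    return sum(a[i][i] for i in range(4))

def blocks(a, b, c, d):
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(a):
    return [[-x for x in row] for row in a]

ONE2, ZERO2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [blocks(ONE2, ZERO2, ZERO2, neg(ONE2))] + [blocks(ZERO2, s, neg(s), ZERO2) for s in PAULI]
IDENT = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def dot(a, b):
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def slash(v):
    """gamma^mu v_mu for a four-vector given by its upper components."""
    out = [[0.0] * 4 for _ in range(4)]
    for mu in range(4):
        out = comb(out, GAMMA[mu], 1.0, ETA[mu] * v[mu])
    return out

def spin_pol_sum(p, k, pp, m):
    """Trace of (14.19) divided by e^4, polarization sums replaced by eta."""
    a, b = dot(p, k), dot(pp, k)
    ks = slash(k)
    x_in = comb(slash(p), IDENT, -1.0, m)       # (-pslash + m)
    x_out = comb(slash(pp), IDENT, -1.0, m)     # (-p'slash + m)
    total = 0.0
    for mu in range(4):
        for nu in range(4):
            gm, gn = GAMMA[mu], GAMMA[nu]
            s_dir = comb(IDENT, matmul(ks, gm), 2 * p[mu], -1.0)    # 2p^mu - kslash gamma^mu
            s_crs = comb(IDENT, matmul(gm, ks), 2 * pp[mu], 1.0)    # 2p'^mu + gamma^mu kslash
            t_dir = comb(IDENT, matmul(gm, ks), 2 * p[mu], -1.0)    # 2p^mu - gamma^mu kslash
            t_crs = comb(IDENT, matmul(ks, gm), 2 * pp[mu], 1.0)    # 2p'^mu + kslash gamma^mu
            left = comb(matmul(gn, s_dir), matmul(s_crs, gn), 1 / (2 * a), 1 / (-2 * b))
            right = comb(matmul(t_dir, gn), matmul(gn, t_crs), 1 / (2 * a), 1 / (-2 * b))
            term = trace(matmul(matmul(left, x_in), matmul(right, x_out)))
            total += ETA[mu] * ETA[nu] * term   # lowers the indices of the second factor
    return total

m, omega, theta = 1.0, 0.7, 1.1                  # illustrative lab-frame values
omega_p = omega / (1 + (omega / m) * (1 - math.cos(theta)))
p = [m, 0.0, 0.0, 0.0]
k = [omega, 0.0, 0.0, omega]
kp = [omega_p, omega_p * math.sin(theta), 0.0, omega_p * math.cos(theta)]
pp = [p[i] + k[i] - kp[i] for i in range(4)]

a, b = dot(p, k), dot(p, kp)
closed_form = 8 * (a / b + b / a + 2 * m**2 * (1 / b - 1 / a) + m**4 * (1 / a - 1 / b) ** 2)
numeric = spin_pol_sum(p, k, pp, m)
print(numeric, closed_form)
```

The factor 8 rather than 2 appears because the trace is the full sum over spins and polarizations, without the averaging factor of 1/4.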
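The Compton cross-section is usually presented in the lab frame with the electron initially
at rest, i.e. $p^\mu = (m, \vec 0)$, $k^\mu = (\omega, \omega\hat z)$, $(p')^\mu = (E', \vec p')$, $(k')^\mu = (\omega', \omega'\sin\theta, 0, \omega'\cos\theta)$. The
final electron and photon momenta define a plane which is taken to be the z−x plane with the
final photon making an angle θ to the z-axis. These choices give: $p\cdot k' = -m\omega'$, $p\cdot k = -m\omega$
and,
\[ \frac{1}{4}\sum_{spin/Pol}|\mathcal{M}|^2 = 2e^4\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 2m\left(-\frac{1}{\omega'} + \frac{1}{\omega}\right) + m^2\left(-\frac{1}{\omega} + \frac{1}{\omega'}\right)^2\right] \tag{14.27} \]
Next, $(p')^2 = (p + k - k')^2$ implies,
\[ -m^2 = -m^2 + 2p\cdot(k - k') + (k - k')^2 \;\Rightarrow\; 0 = m(\omega - \omega') - \omega\omega'(1 - \cos\theta), \]
leading to the Compton formula for the shift in the photon wavelength with the scattering angle:
\[ \text{Compton Formula:} \qquad \frac{1}{\omega'} - \frac{1}{\omega} = \frac{1 - \cos\theta}{m} \;\leftrightarrow\; \omega' = \frac{\omega}{1 + \frac{\omega}{m}(1 - \cos\theta)}\,. \tag{14.28} \]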
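The Compton formula is just energy-momentum conservation with all particles on shell, which is easy to confirm numerically. In the sketch below the electron mass is taken as 0.511 MeV; the photon energy and angle are illustrative, and the function name is hypothetical.

```python
import math

def scattered_frequency(omega, theta, m):
    """Compton formula (14.28): omega' = omega / (1 + (omega/m)(1 - cos theta))."""
    return omega / (1.0 + (omega / m) * (1.0 - math.cos(theta)))

m, omega, theta = 0.511, 1.0, 1.2          # MeV, MeV, radians (omega and theta illustrative)
omega_p = scattered_frequency(omega, theta, m)

# Recoil electron from momentum conservation, p' = p + k - k', with p = (m, 0, 0, 0).
px = -omega_p * math.sin(theta)
pz = omega - omega_p * math.cos(theta)
E_prime = m + omega - omega_p

# On-shell check: E'^2 - |p'|^2 should equal m^2.
mass_shell = E_prime ** 2 - px ** 2 - pz ** 2
print(omega_p, mass_shell, m ** 2)
```

At θ = 0 the function returns ω itself, i.e. there is no frequency shift in the forward direction.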
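To get the differential cross-section, we simplify the phase space integral as,
\[ \int\frac{d^3k'}{2\omega'(2\pi)^3}\,\frac{d^3p'}{2E'(2\pi)^3}\,(2\pi)^4\delta^4(p' + k' - p - k) = \int\frac{d\omega'\,(\omega')^2\,d\Omega}{2\omega'(2\pi)^3}\,\frac{1}{2E'(2\pi)^3}\,(2\pi)^4\,\delta(\omega' + E' - m - \omega) \]
We have used up a $\delta^3$ to get $\vec p' = \vec p + \vec k - \vec k' = (-\omega'\sin\theta, 0, \omega - \omega'\cos\theta)$ which gives $(E')^2 = (\omega')^2 + \omega^2 - 2\omega'\omega\cos\theta + m^2$. Hence
\[ \int d\omega'\,\frac{\omega'}{4E'(\omega')}\,\delta(\omega' + E'(\omega') - m - \omega) = \frac{\omega'_*}{4E'(\omega'_*)}\,\frac{1}{\left|1 + \frac{dE'}{d\omega'}\right|_*}\,, \qquad \frac{dE'}{d\omega'} = \frac{\omega' - \omega\cos\theta}{E'}\,. \]
Inserting in the phase space integral,
\[ \int_{\Gamma_2} = \frac{1}{2\pi}\int d\cos\theta\;\frac{\omega'_*}{4E'_*\left|1 + \frac{\omega'_* - \omega\cos\theta}{E'_*}\right|}\,, \qquad \omega'_* + E'(\omega'_*) = m + \omega \]
\[ = \frac{1}{8\pi}\int d\cos\theta\;\frac{\omega'^2}{m\omega}\,. \tag{14.29} \]
We have,
\[ \frac{d\sigma}{d\cos\theta} = \frac{1}{4m\omega}\,\frac{1}{8\pi}\,\frac{(\omega')^2}{m\omega}\left[\frac{1}{4}\sum|\mathcal{M}|^2\right] = \frac{e^4}{16\pi}\,\frac{1}{m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 2m\,\frac{-1 + \cos\theta}{m} + m^2\,\frac{(1 - \cos\theta)^2}{m^2}\right] \]
This gives the differential cross-section for the Compton scattering, known as the Klein-Nishina
formula,
\[ \frac{d\sigma}{d\cos\theta_{lab}} = \frac{\pi\alpha^2}{m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta_{lab}\right]\,, \quad \text{with} \quad \frac{\omega'}{\omega} = \frac{1}{1 + \frac{\omega}{m}(1 - \cos\theta)}\,. \tag{14.30} \]
Remarks: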
(a) There has been no approximation at the $\alpha^2$ level computation. It can now be used
for various limits. At θ = 0, forward scattering, there is no frequency change and the cross-section, equaling
$\frac{2\pi\alpha^2}{m^2}$, is independent of the photon energy. For a massive charged particle,
one defines its Compton length to be $\lambda_c := h/(mc) = 2\pi/m$ in the natural units, ħ = 1 = c.
The forward scattering cross-section is then $2\alpha^2(\pi\bar\lambda^2)$, where $\bar\lambda := \lambda_c/(2\pi)$.
(b) There are two scales in the problem: m, ω. This gives two natural limits: (i) $\omega \to 0 \leftrightarrow \omega/m \ll 1$ and (ii) $\omega \to \infty \leftrightarrow \omega/m \gg 1$.
For a generic θ, consider ω → 0. Then $\omega/\omega' = 1 + (\omega/m)(1 - \cos\theta) \to 1$ and
\[ \left.\frac{d\sigma}{d\cos\theta}\right|_{\omega/m \ll 1} = \frac{\pi\alpha^2}{m^2}\,(1 + \cos^2\theta)\left[1 - 4\,\frac{\omega}{m}\sin^2(\theta/2) + \dots\right]\,, \qquad \sigma_{tot} = \frac{8\pi\alpha^2}{3m^2} + \dots \tag{14.31} \]
The leading term is the Thomson cross-section for classical radiation scattering off free
electrons.
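The low-energy limit can be illustrated by integrating the Klein-Nishina formula (14.30) numerically and comparing with the Thomson value $8\pi\alpha^2/3m^2$. This is a sketch in natural units with m = 1; the photon energies, the midpoint rule and the function names are illustrative choices.

```python
import math

def klein_nishina(cos_theta, omega, m, alpha):
    """d sigma / d cos(theta) from (14.30), lab frame."""
    ratio = 1.0 / (1.0 + (omega / m) * (1.0 - cos_theta))      # omega'/omega
    sin2 = 1.0 - cos_theta ** 2
    return math.pi * alpha ** 2 / m ** 2 * ratio ** 2 * (ratio + 1.0 / ratio - sin2)

def total_cross_section(omega, m, alpha, n=4000):
    """Midpoint-rule integral of (14.30) over cos(theta) in [-1, 1]."""
    h = 2.0 / n
    return h * sum(klein_nishina(-1.0 + (i + 0.5) * h, omega, m, alpha) for i in range(n))

alpha, m = 1 / 137.0, 1.0
thomson = 8 * math.pi * alpha ** 2 / (3 * m ** 2)

low = total_cross_section(1e-4 * m, m, alpha)    # omega/m << 1
high = total_cross_section(10.0 * m, m, alpha)   # omega/m >> 1
forward = klein_nishina(1.0, 10.0 * m, m, alpha) # theta = 0, independent of omega

print(low / thomson, high / thomson, forward * m ** 2 / (math.pi * alpha ** 2))
```

The first ratio approaches 1 as ω/m → 0, the second shows the suppression of the total cross-section at high photon energy, and the last number is the forward value of (14.30) in units of $\pi\alpha^2/m^2$.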
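In the opposite limit, $m/\omega \ll 1$, we have
\[ \omega'/\omega = \frac{m}{\omega}\,\frac{1}{1 - \cos\theta}\left[1 - \frac{m}{\omega}\,\frac{1}{1 - \cos\theta} + \dots\right] := \epsilon\,(1 - \dots)\,, \qquad \epsilon(\theta) := \frac{m}{\omega}\,\frac{1}{1 - \cos\theta}\,. \]
In this limit,
\[ \frac{d\sigma}{d\cos\theta} \approx \frac{\pi\alpha^2}{m^2}\,\epsilon(\theta) = \frac{\pi\alpha^2}{m^2}\,\frac{m}{\omega(1 - \cos\theta)}\,, \tag{14.32} \]
\[ \sigma_{tot}(\delta) := \int_{-1}^{1-\delta} d\cos\theta\,\frac{d\sigma}{d\cos\theta} \approx \frac{\pi\alpha^2}{m\omega}\,[-\ln(\delta/2)] \to \infty\,, \text{ as } \delta \to 0 \text{ (i.e. } \theta \to 0\text{)}. \tag{14.33} \]
For the $\epsilon(\theta) \ll 1$ condition to hold so that the expansion is meaningful, $\theta \gg \sqrt{2m/\omega}$ must
hold. At θ = 0 we know the exact answer which is finite.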
C. Electron-Positron Annihilation:
Having evaluated the Compton scattering amplitude, this is actually simple to evaluate. From the table giving the amplitudes (notice the use of p′ − k and k − p′ momenta in the crossed diagrams), we can see that the expressions for the annihilation diagrams
are obtained by making the substitutions: $p \to p$, $k \to -k$, $p' \to -p'$, $k' \to k'$ and $\bar u \to \bar v$.
These substitutions preserve the momentum conservations appropriate for the processes:
$p + k = p' + k' \to p - k = -p' + k'$. The photon polarizations are insensitive to the ‘sign’
of momentum and will give $\eta_{\mu\nu}$ as before. One of the spin sums however changes from
$[-\slashed{p}' + m] \to [-(-\slashed{p}' + m)] = [\slashed{p}' - m]$ which generates an overall minus sign in the summed,
squared amplitude. This is actually an example of what is known as the crossing symmetry
of the S-matrix: A scattering amplitude for a particle with momentum k in the initial state
is the same as the amplitude with the initial particle moved to the final state as an anti-particle
with momentum −k.
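Without any explicit calculation we can write down the summed squared amplitude for
the annihilation process as,
\[ \frac{1}{4}\sum_{spin/Pol}|\mathcal{M}|^2 = (-)2e^4\left[-\left(\frac{p\cdot k}{p\cdot k'} + \frac{p\cdot k'}{p\cdot k}\right) + 2m^2\left(\frac{1}{p\cdot k} + \frac{1}{p\cdot k'}\right) + m^4\left(\frac{1}{p\cdot k} + \frac{1}{p\cdot k'}\right)^2\right] \tag{14.34} \]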
It is more convenient to express the cross-section in the center of mass frame:
\[ p^\mu = (E, p\hat z)\,,\; (p')^\mu = (E, -p\hat z)\,,\; E^2 = p^2 + m^2\,,\; s = -(p + p')^2 = 4E^2 \]
\[ k^\mu = (E, E\sin\Theta, 0, E\cos\Theta)\,,\; (k')^\mu = (E, -E\sin\Theta, 0, -E\cos\Theta) \]
This gives,
\[ p\cdot k = -E^2 + pE\cos\Theta = -E(E - p\cos\Theta)\,, \qquad p\cdot k' = -E^2 - pE\cos\Theta = -E(E + p\cos\Theta)\,, \]
\[ \frac{|k'|_{cm}}{|k|_{cm}} = \frac{\sqrt{s^2 - 0 + 0}}{\sqrt{s^2 - 2s(2m^2) + 0}} = \frac{1}{\sqrt{1 - m^2/E^2}} = \frac{E}{\sqrt{E^2 - m^2}} = \frac{E}{p} \]
The differential cross-section takes the form,
X
dσ
1
1 E
=
|M|2
with
2
2
dΩcm
64π 4E |p|
spin/P ol
X
E + pcosΘ E − pcosΘ
2
4
|M| = (−2e ) −
+
(14.35)
E − pcosΘ E + pcosΘ
spin/P ol
2m2
1
1
+
+
(14.36)
−E E + pcosΘ E − pcosΘ
2 #
m4
1
1
+ 2
+
E
E + pcosΘ E − pcosΘ
2
2
2
2m2
2m4
4 E + p cos Θ
+
−
(14.37)
= 4e
E 2 − p2 cos2 Θ E 2 − p2 cos2 Θ (E 2 − p2 cos2 Θ)2
153
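The differential cross-section for the unpolarized $e^+e^-$ annihilation process is then given by,
$$ \frac{d\sigma}{d\Omega_{cm}} = \frac{\alpha^2}{s}\,\frac{E}{|p|}\left[ \frac{E^2 + p^2\cos^2\Theta + 2m^2}{m^2 + p^2\sin^2\Theta} - \frac{2m^4}{(m^2 + p^2\sin^2\Theta)^2} \right] \ , \quad s = 4E^2 . \tag{14.38} $$
In the high energy limit, $E \gg m$, this reduces to,
$$ \frac{d\sigma}{d\Omega_{cm}} \to \frac{\alpha^2}{s}\,\frac{1 + \cos^2\Theta}{\sin^2\Theta} \tag{14.39} $$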
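As a quick numerical cross-check of the two formulas above, here is a minimal Python sketch that evaluates the exact unpolarized cross-section and its high energy limit. The function names, the sample energy and the numerical value of $\alpha$ are illustrative assumptions and not part of the notes; natural units ($\hbar = c = 1$) are assumed, so energies and masses carry the same unit.

```python
import math

ALPHA = 1 / 137.036  # fine structure constant (approximate value, assumed here)

def dsigma_domega(E, m, cos_theta, alpha=ALPHA):
    """Exact unpolarized e+ e- -> 2 photon cross-section in the CM frame."""
    p2 = E * E - m * m                      # |p|^2 of the incoming electron
    s = 4 * E * E
    D = m * m + p2 * (1 - cos_theta ** 2)   # m^2 + p^2 sin^2(Theta)
    bracket = (E * E + p2 * cos_theta ** 2 + 2 * m * m) / D - 2 * m ** 4 / D ** 2
    return alpha ** 2 / s * (E / math.sqrt(p2)) * bracket

def dsigma_domega_massless(E, cos_theta, alpha=ALPHA):
    """High energy limit E >> m of the same cross-section."""
    s = 4 * E * E
    return alpha ** 2 / s * (1 + cos_theta ** 2) / (1 - cos_theta ** 2)

if __name__ == "__main__":
    E, m = 1000.0, 0.511   # MeV; illustrative numbers
    for c in (0.0, 0.3, 0.9):
        print(c, dsigma_domega(E, m, c) / dsigma_domega_massless(E, c))
```

Away from the forward direction the ratio of the two expressions should be very close to 1 once $E \gg m$; close to $\cos\Theta = 1$ the massless form blows up while the exact one stays finite.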
To obtain the total cross-section, we need to integrate over the final state. Since the two photons are identical, $\Theta$ needs to be integrated between $0$ and $\pi/2$. The Bose factor of 2 (which we did not include in the amplitude) cancels the factor of 2 from the angular integration. With $x := \cos\Theta$ we get,
$$ \sigma_{tot} = \frac{\alpha^2}{s}\,(2\pi)\int_0^1 dx\, \frac{1 + x^2}{1 - x^2} \tag{14.40} $$
The integral is clearly divergent from the 'forward' direction ($\Theta = 0$). This is artificial since we set $m = 0$ in the integrand. Split the integration as $\int_0^{1-\Delta} + \int_{1-\Delta}^{1}$. The first term is a finite number and this contribution falls off as $s^{-1}$. For the second term, we go back to the exact differential cross-section and approximate the integrand for $1 - \Delta \le \cos\Theta \le 1$. Then, the factors of $E$ all cancel and the second term $\sim \alpha^2(1 - \Delta)m^{-2}$ where $\Delta \ll m^2/p^2$, so that the denominator can be approximated. Thus, the total cross-section does not diverge but is bounded by $m^{-2}$. This is an example of bounds satisfied by total cross-sections at very high energies.
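The finiteness of the exact expression can be illustrated numerically. The sketch below integrates the angular bracket of the exact differential cross-section over $0 \le \cos\Theta \le 1$ with a simple midpoint rule and compares it with the closed form obtained from elementary integrals. The helper names and the choice of the midpoint rule are our own assumptions for illustration; the bracket is written using $m^2 + p^2\sin^2\Theta = E^2 - p^2\cos^2\Theta$.

```python
import math

def angular_factor(E, m, c):
    """Bracket of the exact d(sigma)/d(Omega) as a function of c = cos(Theta)."""
    p2 = E * E - m * m
    D = E * E - p2 * c * c          # equals m^2 + p^2 sin^2(Theta)
    return (E * E + p2 * c * c + 2 * m * m) / D - 2 * m ** 4 / D ** 2

def integrate_half_range(E, m, n=20000):
    """Midpoint-rule integral of the bracket over 0 <= c <= 1."""
    h = 1.0 / n
    return h * sum(angular_factor(E, m, (i + 0.5) * h) for i in range(n))

def closed_form(E, m):
    """Same integral done analytically; L = artanh(p/E)."""
    p = math.sqrt(E * E - m * m)
    L = math.atanh(p / E)
    return (-1 - m * m / E ** 2
            + L * ((2 * E * E + 2 * m * m) / (E * p) - m ** 4 / (E ** 3 * p)))

if __name__ == "__main__":
    for ratio in (2.0, 5.0, 10.0):  # E/m, illustrative values
        print(ratio, integrate_half_range(ratio, 1.0), closed_form(ratio, 1.0))
```

For any $m \neq 0$ the integral is finite; it grows only logarithmically with $E/m$, which is the regulated version of the forward divergence seen in (14.40).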
15. NUMERICAL ESTIMATES OF CROSS-SECTIONS AND APPLICATIONS
We have obtained examples of cross-sections for some of the basic processes in quantum electrodynamics. We also saw how they relate to non-relativistic versions of the processes, e.g. potential scattering, inferred the modifications due to relativistic effects and estimated the 'strengths' of interactions. Before we study the higher order corrections, it is useful to have a sense of the orders of magnitude of cross-sections and of how these numbers are used in applications. We will focus on the QED processes.
What is the value of the fine structure constant?
This is related to the electric charge, the Planck constant due to quantum framework
and the speed of light due to special relativity. We first obtain the expression in terms of
“engineering units” and then use the measured values to compute the fine structure constant.
Observe that the Feynman rules are derived from the Lagrangian density which has properly normalized kinetic (quadratic) terms, $-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + i\bar{\Psi}\gamma^\mu\partial_\mu\Psi - m\bar{\Psi}\Psi$, and the interaction term introduces '$e$' as $e\bar{\Psi}\gamma^\mu\Psi A_\mu$; thus $e$ is dimensionless. This definition gives the equation of motion, $\partial_\mu F^{\mu\nu} = -e\bar{\Psi}\gamma^\nu\Psi \leftrightarrow \partial_i E^i = e\Psi^\dagger\Psi$, using the identifications: $F^{0i} := E^i$, $\partial_\mu F^{\mu\nu} = -J^\nu$, $J^\mu = (\rho, J^i)$. Furthermore, the Hamiltonian and hence the energy density is $\vec{E}^2/2$.
The MKSA (SI) units have the Gauss law in the form: $\vec{\nabla}\cdot\vec{E}_{SI} = \rho_{SI}/\epsilon_0$. The field energy density (as inferred from the Poynting theorem) is $\epsilon_0\vec{E}_{SI}^2/2$. Comparing the energy densities, we introduce the identification, $E := \sqrt{\epsilon_0}\,E_{SI}$. Then $\rho = \vec{\nabla}\cdot\vec{E} = \sqrt{\epsilon_0}\,\vec{\nabla}\cdot\vec{E}_{SI} := \frac{\sqrt{\epsilon_0}}{\epsilon_0}\rho_{SI} = \rho_{SI}/\sqrt{\epsilon_0}$. This leads to the identification, $e := e_{SI}/\sqrt{\epsilon_0}$. Thus, we identify the variables used in the action with those used in engineering units by comparing the equations of motion and the energy densities.
In the natural units, $\hbar = 1 = c$, used in writing the action, the fine structure constant was defined as $\alpha := e^2/(4\pi)$ which is dimensionless as noted above. Substituting $e$ in terms of $e_{SI}$ and introducing $\hbar, c$, we write $\alpha := \frac{e_{SI}^2}{4\pi\epsilon_0}\hbar^a c^b$ and determine the powers $a, b$ by requiring $\alpha$ to be dimensionless. For this, we need the dimensions of $e_{SI}$ and $\epsilon_0$! Now we also use the SI expression of the Lorentz force, $F = e_{SI}E_{SI}$. Hence, dimensionally (Force)$^2$/(energy density) $\sim e_{SI}^2/\epsilon_0 \sim M^1L^3T^{-2}$. This and $\alpha$ being dimensionless gives $a = b = -1$ and
$$ \alpha = \frac{e_{SI}^2}{4\pi\epsilon_0\hbar c} \simeq 1/137 , $$
using the values: $(4\pi\epsilon_0)^{-1} = c^2\,10^{-7}$, $e_{SI} = 1.6\times 10^{-19}$ Coulomb, $\hbar \simeq 1.05\times 10^{-34}$ Joule.sec and $c \simeq 3\times 10^{8}$ m/sec.
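The arithmetic behind $\alpha \simeq 1/137$ can be sketched in a few lines of Python. The variable names are our own, and the constants are rounded values of the same quantities quoted above (slightly more digits are kept so that the result lands near 137).

```python
# Rounded SI values (assumed inputs for this illustration)
c = 2.998e8          # speed of light, m/s
hbar = 1.055e-34     # reduced Planck constant, J s
e_SI = 1.602e-19     # electron charge, Coulomb

# (4 pi epsilon_0)^(-1) = c^2 * 10^(-7) in SI units
coulomb_constant = c ** 2 * 1e-7

# alpha = e_SI^2 / (4 pi epsilon_0 hbar c): a = b = -1 from dimensional analysis
alpha = e_SI ** 2 * coulomb_constant / (hbar * c)

if __name__ == "__main__":
    print("alpha   =", alpha)
    print("1/alpha =", 1 / alpha)
```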
An Aside: Another set of units is commonly used in gravitational physics, the so called geometrized units. These are defined by setting $G = c = 1$. The choice of $c = 1$ gives: 1 sec $= 3\times 10^8$ meters. Newton's constant in SI units is $G \approx 6.67\times 10^{-11}$ kg$^{-1}$ meter$^3$ sec$^{-2}$. Hence setting $G = 1$ along with $c = 1$ gives 1 kg $= 7.4\times 10^{-28}$ meters.
It is conventional in gravitational physics to use the Gauss law in the form $\vec{\nabla}\cdot\vec{E} = 4\pi\rho$ and the electric field energy density as $E^2/(8\pi)$. This gives the identifications: $E_{geom} = \sqrt{4\pi\epsilon_0}\,E_{SI}$, $q_{geom} = q_{SI}/\sqrt{4\pi\epsilon_0}$. As before, the Lorentz force equation gives the dimensions as: $[q_{SI}^2/\epsilon_0] = [4\pi q_{geom}^2] = M^1L^3T^{-2} = L^2$.
Setting $q_{geom} = \frac{q_{SI}}{\sqrt{4\pi\epsilon_0}}\,G^\alpha c^\beta$, which must have dimensions of $L$, we infer $\alpha = 1/2$ and $\beta = -2$ and
$$ q_{geom} = \frac{q_{SI}}{\sqrt{4\pi\epsilon_0}}\,G^{1/2}c^{-2} = q_{SI}\sqrt{c^2\times 10^{-7}}\;G^{1/2}c^{-2} \approx (8.6\times 10^{-18})\,q_{SI} \ \text{meters}. $$
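The two conversion factors of the aside can be checked with a short Python sketch. The constant values are rounded SI numbers and the variable names are our own; nothing here goes beyond the formulas stated above.

```python
import math

G = 6.674e-11   # Newton's constant, m^3 kg^-1 s^-2 (rounded)
c = 2.998e8     # speed of light, m/s (rounded)

# With G = c = 1, a mass M corresponds to the length G M / c^2.
kg_in_meters = G / c ** 2

# q_geom = q_SI * sqrt(c^2 * 1e-7) * G^(1/2) * c^(-2) = q_SI * sqrt(1e-7 * G) / c
coulomb_in_meters = math.sqrt(1e-7 * G) / c

if __name__ == "__main__":
    print("1 kg      =", kg_in_meters, "m")
    print("1 Coulomb =", coulomb_in_meters, "m")
```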
Looking at the expressions for the cross-sections and noting that in the natural units energy $\sim$ length$^{-1}$, we see that the cross-sections are of the form $\alpha^2/E^2$. This gets more and more accurate at ultra relativistic energies where masses can be neglected. Since the cross-section has dimensions of area while energies are typically given in MeV/GeV/TeV etc., it is useful to have a conversion between energy units and length units. As already noted, $c = 1 \Rightarrow$ 1 second $= 3\times 10^8$ meters. $\hbar = 1 \Rightarrow$ 1 Joule $= 10^{34}$ sec$^{-1} \simeq 3.3\times 10^{25}$ meter$^{-1}$. Next, 100 MeV $= 1.6\times 10^{-19+8}$ J $\simeq 5\times 10^{14}$ meter$^{-1}$. (This is also expressed as 200 MeV.fermi $= 1$.) Clearly, a cross-section at 100 MeV (electron mass being 0.5 MeV) is approximately equal to $\sigma_{100\,MeV} \simeq \alpha^2/E^2 \simeq 2\times 10^{-34}$ m$^2$. The conventional unit for these cross-sections is 1 barn := 100 fermi$^2$ $= 10^{-28}$ m$^2$. Typical numbers encountered in high energy processes (around 1 GeV) are: strong interactions $\sim 10^{-30}$ m$^2$, electromagnetic interactions $\sim 10^{-36}$ m$^2$ and weak interactions $\sim 10^{-42}$ m$^2$.
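The unit conversions in this paragraph are easy to get wrong by a power of ten, so here is a small Python sketch of the estimate $\sigma \simeq \alpha^2/E^2$ at 100 MeV. The helper name `energy_to_inverse_meters` and the rounded constants are our own assumptions. With these inputs the estimate evaluates to roughly $2\times 10^{-34}$ m$^2$, i.e. about two microbarn.

```python
c = 2.998e8        # m/s
hbar = 1.055e-34   # J s
eV = 1.602e-19     # J
alpha = 1 / 137.0

def energy_to_inverse_meters(E_MeV):
    """Convert an energy in MeV to an inverse length in 1/m using hbar = c = 1."""
    E_joule = E_MeV * 1e6 * eV
    return E_joule / (hbar * c)

hbar_c_MeV_fm = hbar * c / (1e6 * eV) / 1e-15   # close to 197 MeV fm

E = energy_to_inverse_meters(100.0)       # about 5e14 per meter
sigma_m2 = alpha ** 2 / E ** 2            # cross-section estimate in m^2
sigma_barn = sigma_m2 / 1e-28             # 1 barn = 1e-28 m^2

if __name__ == "__main__":
    print("hbar c     =", hbar_c_MeV_fm, "MeV fm")
    print("E(100 MeV) =", E, "1/m")
    print("sigma      =", sigma_m2, "m^2 =", sigma_barn, "barn")
```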
So, scattering experiments will provide us with such numbers. How are they useful?
In many applications, say passage of a collimated set of particles through some medium,
we have the estimate of the cross-section of scattering of individual particles comprising the
‘beam’ and the medium. Intuitively, a cross-section may be viewed as a target disk of area
σ which is bombarded by ‘marbles’ which get reflected from the disk. They are reflected if
they hit the disk or pass by if not. Thus, σ gives the likelihood of a scattering interaction.
Suppose we have multiple targets (medium) and multiple beam particles. The probability of interaction is clearly proportional to the ratio of the effective area of the target particles and the area exposed to the beam. Thus, probability of scattering = (number of target particles exposed to the beam) $\times\ \sigma$/(total area presented by the medium).
[Figure: a beam incident on a medium of a given total area containing target particles.]
Let $n$ be the number of target particles exposed per unit area perpendicular to the beam. Then the number of particles exposed $= n \times$ total area and the probability of scattering $= n\cdot\sigma$.
Let $\rho$ be the number density of the target particles i.e. number per unit volume. Let $d$ be the thickness of the medium. Then the areal density is $n = \rho\cdot d$. The probability of scattering is $1 = n\sigma = \rho\cdot d\cdot\sigma$ for $d = \hat{d} := (\rho\cdot\sigma)^{-1}$. Let $\langle v\rangle$ be the average speed of the beam particles. Then the average reaction time is $\tau_r := \hat{d}/\langle v\rangle = 1/(\rho\langle v\rangle\sigma)$.
Thus, if we have a confined plasma of say electrons and we bombard it with a beam of positrons and look for production of $\mu^\pm$ pairs, then we should expect to wait for a time $\tau = 1/(\rho\langle v\rangle\sigma_{e^+e^-\to\mu^+\mu^-})$. This type of use of microscopic cross-sections is required in discussing bulk processes, e.g. in nuclear physics or astrophysics, and is useful in estimating thermalization times.
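A minimal Python sketch of these two estimates follows. The function names and all the sample numbers (density, cross-section, speed) are hypothetical values chosen only to show the arithmetic; they are not taken from the notes or from any experiment.

```python
def mean_free_path(rho, sigma):
    """Thickness d_hat at which rho * d * sigma reaches 1 (SI units assumed)."""
    return 1.0 / (rho * sigma)

def reaction_time(rho, sigma, v_avg):
    """Average reaction time tau_r = d_hat / <v> = 1 / (rho <v> sigma)."""
    return mean_free_path(rho, sigma) / v_avg

# Hypothetical inputs for illustration only
rho = 1.0e20        # target particles per m^3
sigma = 1.0e-34     # cross-section in m^2
v_avg = 3.0e8       # average beam speed in m/s

d_hat = mean_free_path(rho, sigma)
tau = reaction_time(rho, sigma, v_avg)

if __name__ == "__main__":
    print("mean free path:", d_hat, "m")
    print("reaction time :", tau, "s")
```

The only physics input is $n\sigma = \rho d\sigma$; halving either the density or the cross-section doubles both the mean free path and the waiting time.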
Sometimes, we are not interested in tracking individual processes; any interaction may suffice. In such cases we need to sum over all possible final states, $\sigma_{tot} = \sum_f \sigma_{i\to f}$. The $\sigma_{i\to f} \sim |\mathcal{M}_{i\to f}|^2$. We already have expressions using phase space integration for a fixed number of final state particles. Now we have to sum over the number of particles and the labels,
$$ \sigma_{tot} = \frac{1}{4E_kE_p\,v_{lab}}\sum_{f,\alpha}\int\prod_{j=1}^{N_f}\frac{d^3k'_j}{(2\pi)^3\,2\omega_{k'_j}}\,(2\pi)^4\delta^4(\Sigma k' - P_{in})\,|\mathcal{M}|^2 . $$
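This can be related to the forward scattering amplitude via the "Optical Theorem".
We have $S^\dagger S = 1 \Rightarrow \sum_f \langle i|S^\dagger|f\rangle\langle f|S|i\rangle = 1 = \sum_f |S_{fi}|^2$. Substituting for $S_{fi}$ we get,
$$ |S_{fi}|^2 = \delta_{fi}\delta_{fi} + (2\pi)^4\delta^4(P_f - P_i)\,\delta_{fi}\,[i(\mathcal{M}_{fi} - \mathcal{M}^*_{fi})] + (2\pi)^8[\delta^4(P_f - P_i)]^2\,|\mathcal{M}_{fi}|^2 . $$
Summing over $f$ and using unitarity gives,
$$ (2\pi)^4\sum_f \delta^4(P_f - P_i)\,\delta_{fi}\,[i(\mathcal{M}_{fi} - \mathcal{M}^*_{fi})] = -(2\pi)^8\sum_f \delta^4(P_f - P_i)\,\delta^4(P_f - P_i)\,|\mathcal{M}_{fi}|^2 $$
Or,
$$ 2\,\mathrm{Im}(\mathcal{M}_{ii}) = (2\pi)^4\sum_f \delta^4(P_f - P_i)\,|\mathcal{M}_{fi}|^2 . \tag{15.1} $$
Substituting the rhs in the expression for $\sigma_{tot}$ gives,
$$ \sigma_{tot} = \frac{2\,\mathrm{Im}(\mathcal{M}_{ii})}{4k\sqrt{s}} \ , \quad \text{where } k = k_{CM} \text{ and } s = -\Big(\sum P_i\Big)^2 . $$
The $\mathcal{M}_{ii}$ is the forward scattering, $i \to i$, amplitude and $\sigma_{tot}$ is the inclusive cross-section for $i \to$ any $f$ processes. This could also be used to estimate the total cross-section.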
Remark: Let us note that we defined the scattering matrix in the basis of number states or Fock basis. Further, we chose the idealization of plane waves for both in and out states. This is convenient but loses the information about spatial locations of the scattering particles. This also loses any notion of "impact parameter" - how far the projectile particle is from the target particle or equivalently the 'orbital angular momentum' of the projectile about the target particle. The information about the transverse position is of course contained in the parameters of the outgoing particles. This should be kept in mind while interpreting some of the divergences of cross-sections.
16. RADIATIVE CORRECTIONS IN QED
Consider now corrections due to higher powers of the interaction Hamiltonian, also called radiative corrections. Having seen the correspondence between the correction terms and Feynman diagrams, we can list the corrections directly in terms of the diagrams. A Feynman diagram is made up of lines (free propagators) and vertices. In QED, there are two types of propagators and one type of vertex. So consider corrections to the two point functions and the three point function.
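A. The fermion propagator: self-energy
[Diagrams (16.1), (16.2): the exact fermion two point function ("All") drawn as the free propagator plus all higher order diagrams (16.1), and regrouped as a series of 1PI blobs connected by free propagators (16.2).]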
Here is a graphical expansion of the exact two point function of the fermions. In the second line, the diagrams have been grouped into a series of terms involving 1-particle irreducible (1PI) blobs, connected by free propagators. The 1PI blob represents the sum of all diagrams with two external legs which cannot be disconnected by removing a single fermion line. The 2nd and the 3rd diagrams in the first line are 1PI (their extreme propagators being the external lines) while the last diagram is 1-particle reducible. The second line follows from the first one by inspection.
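Let $-iS'_F(p)$ denote the lhs - the full 2-point function or full propagator - and $-iS_F(p)$ denote the free propagator. Let the 1PI blob be denoted by $i\Sigma(p)$. Then the series is:
\begin{align}
-iS'_F(p) &= -iS_F(p) + (-iS_F)\cdot i\Sigma(p)\cdot(-iS_F) + (-iS_F)\cdot i\Sigma\cdot(-iS_F)\cdot i\Sigma\cdot(-iS_F) + \dots \nonumber \\
&= -iS_F + (-iS_F)\cdot i\Sigma\,\big(-iS_F + (-iS_F)\cdot i\Sigma\cdot(-iS_F) + \dots\big) \nonumber \\
\therefore\ -iS'_F &= -iS_F + (-iS_F)\cdot i\Sigma\cdot(-iS'_F) \nonumber \\
\therefore\ S_F &= (1 - S_F\Sigma)\,S'_F \ \leftrightarrow\ 1 = (S_F^{-1} - \Sigma)\,S'_F \nonumber \\
\therefore\ S'_F(p) &= \frac{1}{\slashed{p} + m - \Sigma(p)} \ , \quad \text{and } S_F^{-1} = \slashed{p} + m . \nonumber
\end{align}
$\Sigma(p)$ is called the fermion self energy.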
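The resummation of the 1PI series is a geometric series, which can be illustrated with ordinary numbers standing in for the Dirac matrices. In the sketch below `x` plays the role of an eigenvalue of $\slashed{p}$, and `Sigma` is a small constant standing in for the self energy; both values and the function name are hypothetical choices for illustration.

```python
def full_propagator_partial_sum(S, Sigma, n_terms):
    """Sum the first n_terms of S + S*Sigma*S + S*Sigma*S*Sigma*S + ..."""
    total, term = 0.0, S
    for _ in range(n_terms):
        total += term
        term *= Sigma * S     # each extra 1PI blob brings one factor Sigma*S
    return total

x, m, Sigma = 0.7, 1.0, 0.05     # scalar stand-ins (assumed values)
S = 1.0 / (x + m)                # free propagator, S^(-1) = x + m
resummed = 1.0 / (x + m - Sigma) # closed form S' = 1 / (S^(-1) - Sigma)

if __name__ == "__main__":
    for n in (1, 2, 5, 20):
        print(n, full_propagator_partial_sum(S, Sigma, n), resummed)
```

The partial sums approach the closed form as long as $|S\Sigma| < 1$, which is the perturbative regime assumed throughout.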
The $S'_F(p)$, $\Sigma(p)$ are matrices with the Dirac spinor indices though we may not always be explicit about it. What can we say about the singularities of $S'_F(p)$?
We can appeal to Lorentz covariance and decompose $\Sigma(p) = a(p^2)\slashed{p} + b(p^2)\mathbb{1} + c(p^2)\gamma_5$. With parity conserving interactions such as QED, we can take $c(p^2) = 0$. Furthermore, to the leading order, $a(p^2) = 0 = b(p^2)$. Thus the denominator of the exact propagator is,
$$ [S'_F(p)]^{-1} = (1 - a(p^2))\slashed{p} + (m - b(p^2)) = (1 - a(p^2))\left[\slashed{p} + \frac{m - b(p^2)}{1 - a(p^2)}\right] \ \Rightarrow $$
$$ S'_F(p) = \frac{1}{1 - a(p^2)}\ \frac{-\slashed{p} + \frac{m - b(p^2)}{1 - a(p^2)}}{p^2 + \left(\frac{m - b(p^2)}{1 - a(p^2)}\right)^2} $$
Clearly, the denominator vanishes for $p^2 = -m_{ph}^2$, $m_{ph} := \frac{m - b(-m_{ph}^2)}{1 - a(-m_{ph}^2)}$. Since the numerator has the form $-\slashed{p} + m_{ph}$, we can regard the pole to be defined by $\slashed{p} = -m_{ph}$. Since $a, b$ are functions of $p^2 = -\slashed{p}\slashed{p}$, we may regard the self energy as a function of $\slashed{p}$ and define the pole by the condition: $\slashed{p} + m - \Sigma(\slashed{p})|_{\slashed{p} = -m_{ph}} = 0$. Clearly, $m_{ph} \neq m$.
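The exact propagator can be expanded about its pole as,
$$ \slashed{p} + m - \Sigma(\slashed{p}) = 0 + (\slashed{p} + m_{ph})\left[1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\Big|_{\slashed{p} = -m_{ph}}\right] + o\big((\slashed{p} + m_{ph})^2\big) $$
$$ \therefore\ S'_F(p) \simeq \frac{1}{\slashed{p} + m_{ph}}\left[\frac{1}{1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\big|_{\slashed{p} = -m_{ph}}}\right] := \frac{Z_2}{\slashed{p} + m_{ph}} $$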
The shift of the position of the pole in the exact propagator from that of the free propagator, $\delta m := m_{ph} - m = -\Sigma(\slashed{p} = -m_{ph})$, is important in an $S$-matrix element which has the spinors on the external lines. These spinors satisfy the equation of motion with the physical mass ($= m_{ph}$) and not the mass parameter in the free propagator inferred from the Lagrangian. In particular this means that in the LSZ formula for an $S$-matrix element, the amputation of external fermion legs is to be done with $(-i\slashed{\partial} + m_{ph})$. This in turn means that in the Green's functions with external fermion legs, the external lines must include the self-energy corrections i.e. should have the $S'_F$ propagator instead of the free propagator $S_F$ (and of course no more self energy corrections on the external lines). We will return to this point again later while discussing the renormalized perturbation series.
We also see that the residue at the pole in the exact propagator is not 1 and we identify it with the $Z_2$ which we know from the Källén-Lehmann representation to be less than 1.
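B. The photon propagator: photon self-energy
[Diagrams (16.3), (16.4): the exact photon two point function ("All") drawn as the free propagator plus higher order diagrams (16.3), and regrouped as a series of 1PI blobs connected by free photon propagators (16.4).]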
Here is a graphical representation of the exact photon propagator. As before, we group together the 1PI diagrams connected by free propagators. Let the exact propagator be denoted by $-iD'_{\mu\nu}(q)$ and the free propagator by $-iD_{\mu\nu}(q)$. Let the 1PI blob be denoted by $i\Pi^{\mu\nu}(q)$. In the Lorentz gauge, the free propagator is: $D_{\mu\nu}(q) = \eta_{\mu\nu}/(q^2 - i\epsilon)$. While the series can be formally summed as before, it is more convenient to separate the tensor indices in the $\Pi^{\mu\nu}(q)$ tensor.
Lorentz covariance implies, $\Pi^{\mu\nu}(q) = \eta^{\mu\nu}A(q^2) + q^\mu q^\nu B(q^2)$. The $\Pi^{\mu\nu}$ tensor may be thought of as a $\gamma \to \gamma$ process, although $q^2 \neq 0$. We now appeal to a Ward identity: $q_\mu\Pi^{\mu\nu}(q) = 0$. This is established separately below. Presently, it implies $A(q^2) = -q^2B(q^2)$ and we define: $\Pi^{\mu\nu}(q) := [\eta^{\mu\nu}q^2 - q^\mu q^\nu]\,\Pi(q^2)$. With this notation, the exact 2-point function
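takes the form,
\begin{align}
-iD'_{\mu\nu}(q) &= -i\frac{\eta_{\mu\nu}}{q^2 - i\epsilon} + \left(-i\frac{\eta_{\mu\alpha}}{q^2 - i\epsilon}\right)\cdot\big[i(q^2\eta^{\alpha\beta} - q^\alpha q^\beta)\Pi(q^2)\big]\cdot\left(-i\frac{\eta_{\beta\nu}}{q^2 - i\epsilon}\right) \nonumber \\
&\quad + (-iD_{\mu\alpha})\cdot[i(\dots)^{\alpha\beta}]\cdot(-iD_{\beta\rho})\cdot[i(\dots)^{\rho\sigma}]\cdot(-iD_{\sigma\nu}) + \dots \nonumber
\end{align}
Define,
$$ i(q^2\eta^{\alpha\beta} - q^\alpha q^\beta)(-i\eta_{\beta\nu}\,q^{-2}) = \delta^\alpha{}_\nu - \frac{q^\alpha q_\nu}{q^2} =: \Delta^\alpha{}_\nu \ \Rightarrow\ \Delta^\alpha{}_\nu\Delta^\nu{}_\beta = \Delta^\alpha{}_\beta . $$
Then the exact propagator series takes the form,
\begin{align}
-iD'_{\mu\nu}(q) &= -i\frac{\eta_{\mu\nu}}{q^2 - i\epsilon} + \left(-i\frac{\eta_{\mu\alpha}}{q^2 - i\epsilon}\right)\Delta^\alpha{}_\nu\,\Pi(q^2) + \left(-i\frac{\eta_{\mu\alpha}}{q^2 - i\epsilon}\right)\Delta^\alpha{}_\beta\Delta^\beta{}_\nu\,\Pi^2(q^2) + \dots \nonumber \\
&= -i\frac{\eta_{\mu\nu}}{q^2 - i\epsilon} + \left(-i\frac{\eta_{\mu\alpha}}{q^2 - i\epsilon}\right)\Delta^\alpha{}_\nu\,(\Pi + \Pi^2 + \Pi^3 + \dots) \nonumber \\
&= \frac{-i}{q^2 - i\epsilon}\left[\left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2} + \frac{q_\mu q_\nu}{q^2}\right) + \left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)(\Pi + \Pi^2 + \Pi^3 + \dots)\right] \nonumber \\
\therefore\ D'_{\mu\nu}(q) &= \frac{1}{q^2(1 - \Pi(q^2))}\left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right) + \frac{1}{q^2}\,\frac{q_\mu q_\nu}{q^2} . \nonumber
\end{align}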
In any $S$-matrix calculation, the $D'$ propagator will land on a fermion line and thanks to the Ward identity, the terms proportional to $q_\mu, q_\nu$ will vanish. Hence, for $S$-matrix element computations, we can take $D'_{\mu\nu}(q) = \frac{\eta_{\mu\nu}}{(q^2 - i\epsilon)(1 - \Pi(q^2))}$. The $\Pi(q^2)$ is called the photon self energy. To the leading order, the self energy vanishes and hence in perturbation theory, it can never equal 1 and cause another pole at some other $q^2$. As long as it is regular at $q^2 = 0$, the exact propagator continues to have a simple pole at $q^2 = 0$, just like the free propagator. Hence, the photon wavefunction, $\epsilon_\mu(\vec{k})$, continues to be transverse and longitudinal polarizations will decouple.
Unlike the fermion self energy which shifts the mass, the photon self energy does not shift the photon mass from zero thanks to the Ward identity. However, like the fermion self energy, the residue at the pole does shift away from 1, $(Z_3)^{-1} := 1 - \Pi(q^2 = 0)$.
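The two algebraic facts used in summing the photon series, the idempotency of $\Delta^\alpha{}_\nu = \delta^\alpha{}_\nu - q^\alpha q_\nu/q^2$ and its transversality, can be checked numerically. The sketch below uses plain Python lists with the mostly-plus metric $\eta = \mathrm{diag}(-1,1,1,1)$; the function names and the sample momentum are our own illustrative choices.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of the mostly-plus metric

def transverse_projector(q_up):
    """Delta^a_n = delta^a_n - q^a q_n / q^2 for an off-shell momentum q^a."""
    q_dn = [ETA[i] * q_up[i] for i in range(4)]
    q2 = sum(q_up[i] * q_dn[i] for i in range(4))
    return [[(1.0 if a == n else 0.0) - q_up[a] * q_dn[n] / q2
             for n in range(4)] for a in range(4)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

q = [1.0, 2.0, 3.0, 4.0]                 # sample momentum with q^2 = 28
Delta = transverse_projector(q)
q_dn = [ETA[i] * q[i] for i in range(4)]

# q_a Delta^a_n should vanish for every n
contracted = [sum(q_dn[a] * Delta[a][n] for a in range(4)) for n in range(4)]

if __name__ == "__main__":
    print("idempotency defect :", max_abs_diff(matmul(Delta, Delta), Delta))
    print("q_a Delta^a_n      :", contracted)
```

Because $\Delta$ is a projector, every term of the series carries the same tensor structure and only the scalar geometric series $\Pi + \Pi^2 + \dots$ remains to be summed.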
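1. The Ward identity claim: $q_\mu\Pi^{\mu\nu}(q) = 0$:
[Figure: a fermion loop with photon vertices $\gamma_1, \gamma_2, \dots, \gamma_i, \gamma_{i+1}, \dots, \gamma_n$ and an extra photon $\gamma_j$ of momentum $q$ inserted between the $i$th and $(i+1)$th vertices; the fermion momenta are $p_1, \dots, p_i$ before the insertion and $p_i + q, \dots, p_n + q$ after it.]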
Proof: Recall that $\Pi^{\mu\nu}$ is defined without external propagators i.e. it is an amputated 2-point function. The momentum $q$ however, need not be on-shell and hence it does not correspond to an $S$-matrix element (even if we disregard the polarization factors). Consider diagrams contributing to the photon self energy with some fixed number of vertices (or order in the coupling). The set of these diagrams will be generated by inserting a vertex injecting momentum $q^\mu$ at various points on the available fermion lines. The fermion lines will necessarily be loops as there are no external fermions contributing to the photon self energy. Consider the subset of diagrams wherein the $q^\mu$ vertex is on some particular loop which contributes a factor of the form (the fermion arrow points from $i = 1$ to $i = n$ and so do the fermion momenta):
$$ Tr\left[S_F(p_n + q)\gamma^{\mu_n}\dots S_F(p_{i+1} + q)\gamma^{\mu_{i+1}}\{S_F(p_i + q)\,\slashed{q}\,S_F(p_i)\}\gamma^{\mu_i}\dots S_F(p_1)\gamma^{\mu_1}\right] $$
The $q^\mu$ vertex is inserted between the $i$th and $(i+1)$th vertices and adds the propagator $S_F(p_i + q)$. All momenta from $p_{i+1}$ to $p_n$ are shifted by $q$. We have also contracted with $q_\mu$.
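The expression between the braces simplifies as (see footnote 13),
$$ \frac{1}{\slashed{p}_i + \slashed{q} + m}\,\slashed{q}\,\frac{1}{\slashed{p}_i + m} = \frac{1}{\slashed{p}_i + m} - \frac{1}{\slashed{p}_i + \slashed{q} + m} = S_F(p_i) - S_F(p_i + q) . $$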
The trace thus becomes,
\begin{align}
& Tr\left[S_F(p_n + q)\gamma^{\mu_n}\dots S_F(p_{i+1} + q)\gamma^{\mu_{i+1}}\{S_F(p_i)\}\gamma^{\mu_i}\dots S_F(p_1)\gamma^{\mu_1}\right] \nonumber \\
& - Tr\left[S_F(p_n + q)\gamma^{\mu_n}\dots S_F(p_{i+1} + q)\gamma^{\mu_{i+1}}\{S_F(p_i + q)\}\gamma^{\mu_i}\dots S_F(p_1)\gamma^{\mu_1}\right] \nonumber
\end{align}
Note that the momenta after $i$ (and $i - 1$) have $q$ added to them. Summing over the diagrams where the $q^\mu$ vertex is on this fermion loop i.e. from $i = 1, \dots, n$, there is a pairwise cancellation. The first term at $i$ cancels the second term at $i + 1$ and so on. We are left with the second term at $i = 1$ and the first term at $i = n$, i.e.
$$ \sum_{i=1}^{n} Tr[\dots] = Tr\left[S_F(p_n)\gamma^{\mu_n}\dots S_F(p_1)\gamma^{\mu_1}\right] - Tr\left[S_F(p_n + q)\gamma^{\mu_n}\dots S_F(p_1 + q)\gamma^{\mu_1}\right] $$
Because we have a fermion loop, there is an integration over say p1 . Provided the integral is
finite, we can shift the integration variable and absorb away the q. The integrals of the two
traces, then cancel out and the claim is proved.
Remark: We have not yet introduced loop diagrams. The above fermion loop, before the
q insertion, has n vertices and n momenta entering at those vertices. The conservation of
momenta, taking out the overall conservation, thus leaves one momentum undetermined and
this is to be integrated.
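Remark: As an extension of the result, consider an open fermion line with incoming momentum $p_0$, outgoing momentum $p_n$ and insertions on it.
[Figure: an open fermion line with photon vertices $\gamma_1, \gamma_2, \dots, \gamma_i, \gamma_{i+1}, \dots, \gamma_{n-1}, \gamma_n$ and an extra photon $\gamma_j$ inserted between the $i$th and $(i+1)$th vertices; the fermion momenta are $p_0, p_1, \dots, p_i$ before the insertion and $p_i + q, \dots, p_{n-1} + q, p_n + q$ after it.]
[Footnote 13: This identity holds as long as both the fermion propagators have the same mass, $m_{ph}$ or $m$. This is relevant while dealing with external fermion lines.]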
For the insertion between the $i$, $i + 1$ vertices, we will have a string of factors as,
$$ S_F(p_n + q)\gamma^{\mu_n}S_F(p_{n-1} + q)\dots\gamma^{\mu_{i+1}}\{S_F(p_i + q)\,\slashed{q}\,S_F(p_i)\}\gamma^{\mu_i}\dots S_F(p_1)\gamma^{\mu_1}S_F(p_0) $$
Replacing $S_F(p_i + q)\,\slashed{q}\,S_F(p_i) = S_F(p_i) - S_F(p_i + q)$ and summing over the insertions from $i = 0, \dots, n$, we get
\begin{align}
\sum_{i=0}^{n}[\dots] &= S_F(p_n)\gamma^{\mu_n}S_F(p_{n-1})\dots S_F(p_1)\gamma^{\mu_1}S_F(p_0) \nonumber \\
&\quad - S_F(p_n + q)\gamma^{\mu_n}S_F(p_{n-1} + q)\dots S_F(p_1 + q)\gamma^{\mu_1}S_F(p_0 + q) \nonumber
\end{align}
To get an $S$-matrix contribution, we have to multiply by $\bar{u}(p_n + q)[S_F(p_n + q)]^{-1}$ on the left and by $[S_F(p_0)]^{-1}u(p_0)$ on the right and take the limits $p_0^2 \to -m^2$, $(p_n + q)^2 \to -m^2$. Now we see that each of the two terms has only one of the poles which can cancel the inverse propagator. In the on-shell limit then, each of the terms vanishes, thereby proving that for $S$-matrix elements with external fermion lines, the sum over insertions of a photon line followed by contraction with the photon momentum gives zero.
Note: There is one subtlety here. For external lines, we should be using the exact propagator as per the LSZ rules. The $S\slashed{q}S$ identity requires both propagators to have the same mass, the physical mass. This would require all internal fermion lines to use the exact propagator (and then one should not consider the fermion self energy correction again). For a fermion loop, this issue does not arise. Notice however that $S'_F(p)^{-1} = S_F(p)^{-1} - \Sigma(p)$ and $\Sigma(p)$ is always of order $\alpha$ and higher. Thus, in using the free inverse propagator instead of the exact inverse propagator, we are dropping a higher order contribution which we can do in a perturbation theory.
The conclusion is then that $q_\mu\mathcal{M}^\mu(q, \dots) = 0$ which is a statement of the Ward identity.
C. The Vertex function: Form factors
The expectation is that the electron-photon coupling will undergo a change due to higher order corrections. Unlike the 2-point functions above which were considered for arbitrary momentum, we consider the 3-point function with the fermion momenta "on-shell" (and this means that $q^2$ is necessarily space-like (prove this) and hence the photon is an internal line). When viewed as part of an $S$-matrix element, the external fermion legs are amputated with the physical mass inverse propagator and replaced by the $\bar{u}(p')$, $u(p)$ spinors respectively. Here is a graphical view of the possible corrections.
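[Diagrams (16.5), (16.6): the exact vertex ("All") with incoming fermion momentum $p$, outgoing fermion momentum $p'$ and photon momentum $q$ (index $\mu$), drawn as the tree level vertex plus higher order diagrams (16.5); and $ie\Gamma^\mu(p, p') = ie\gamma^\mu + \dots$ written as a sum of diagrams built from 1PI blobs (16.6).]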
Notice that some corrections on the fermion lines and the photon line are just self-energy
corrections. The remaining corrections have photon lines necessarily connecting the two
fermion lines and these diagrams are already 1PI. The last diagram in the second equation
is 1PR.
We can simplify the form of the exact vertex function, $\Gamma^\mu(p, p')$, by appealing to Lorentz covariance.
$$ \Gamma^\mu(p, p') = \gamma^\mu A(q^2) + (p^\mu + p'^\mu)B(q^2) + (p'^\mu - p^\mu)C(q^2) . $$
With two independent momenta, we have 3 Lorentz scalars, $p^2$, $(p')^2$, $p\cdot p'$. The first two equal $-m_{ph}^2$ while the third is traded for $q^2$ for convenience.
Claim: $\bar{u}(p')\,q_\mu\Gamma^\mu\,u(p) = 0$.
Proof: This is the Ward identity argument given above.
The above form for $\Gamma^\mu$ then implies,
\begin{align}
0 &= \bar{u}(p')\slashed{q}u(p)A(q^2) + (-m_{ph}^2 + m_{ph}^2)\,\bar{u}(p')B(q^2)u(p) + q^2\,\bar{u}(p')C(q^2)u(p) \nonumber \\
&= 0 + 0 + q^2\,\bar{u}(p')C(q^2)u(p) \ \Rightarrow\ \bar{u}(p')C(q^2)u(p) = 0 . \nonumber
\end{align}
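The $A$, $B$ terms can be rewritten using the "Gordon identity",
$$ \bar{u}(p')\gamma^\mu u(p) = \frac{1}{2m_{ph}}\,\bar{u}(p')\left[(p + p')^\mu - i\sigma^{\mu\nu}q_\nu\right]u(p) $$
This can be established as follows ($m_{ph} \to m$ for convenience):
We have, $u(p) = -\frac{1}{m}\slashed{p}u(p)$, $\bar{u}(p') = -\frac{1}{m}\bar{u}(p')\slashed{p}'$. Therefore
\begin{align}
\bar{u}(p')\gamma^\mu u(p) &= -\frac{1}{2m}\,\bar{u}(p')\left(\gamma^\mu\slashed{p} + \slashed{p}'\gamma^\mu\right)u(p) . \quad \text{But, } \gamma^\mu\slashed{p} = p_\nu\gamma^\mu\gamma^\nu = p_\nu\left(-\eta^{\mu\nu} + \frac{1}{2}[\gamma^\mu, \gamma^\nu]\right) \nonumber \\
\therefore\ \gamma^\mu\slashed{p} &= -p^\mu - 2i\Sigma^{\mu\nu}p_\nu \ , \quad \slashed{p}'\gamma^\mu = -(p')^\mu - 2i\Sigma^{\nu\mu}p'_\nu \ , \quad \Sigma^{\mu\nu} := \frac{i}{4}[\gamma^\mu, \gamma^\nu] \ ; \nonumber \\
\therefore\ \bar{u}(p')\gamma^\mu u(p) &= -\frac{1}{2m}\,\bar{u}(p')\left[-p^\mu - (p')^\mu - 2i\Sigma^{\mu\nu}(p_\nu - p'_\nu)\right]u(p) \nonumber \\
&= \frac{1}{2m}\,\bar{u}(p')\left[(p + p')^\mu - i\sigma^{\mu\nu}q_\nu\right]u(p) \ , \quad \sigma^{\mu\nu} := 2\Sigma^{\mu\nu} \ , \ q := p' - p . \nonumber
\end{align}
Here the antisymmetry $\Sigma^{\nu\mu} = -\Sigma^{\mu\nu}$ has been used to combine the two commutator terms. Eliminating the $(p + p')^\mu$ from the $B(q^2)$ term, and identifying $A + 2m_{ph}B =: F_1(q^2)$ and $2\Sigma^{\mu\nu}B(q^2) =: \frac{\sigma^{\mu\nu}}{2m_{ph}}F_2(q^2)$, we take the general form of $\Gamma^\mu(p, p')$ as,
$$ \Gamma^\mu(p, p') = \gamma^\mu F_1(q^2) + i\,\frac{\sigma^{\mu\nu}q_\nu}{2m_{ph}}\,F_2(q^2) \ , \quad q := p' - p . $$
The $F_i(q^2)$ are called form factors. To the lowest order, $F_1(q^2) = 1$, $F_2(q^2) = 0$. These get corrected at higher orders.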
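Since the numerical coefficient of the $\sigma^{\mu\nu}q_\nu$ term in the Gordon identity is easy to get wrong, here is a self-contained numerical check in plain Python. It tests the form $\bar{u}(p')\gamma^\mu u(p) = \frac{1}{2m}\bar{u}(p')[(p + p')^\mu - i\sigma^{\mu\nu}q_\nu]u(p)$ with $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$. The assumptions are ours: the Dirac representation with $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$ and $\eta = \mathrm{diag}(-1,1,1,1)$, the adjoint $\bar{u} = u^\dagger\gamma^0$, and spinors built as $u(p) = (-\slashed{p} + m)\chi$, which solve $(\slashed{p} + m)u = 0$ on shell. All helper names are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]            # mostly-plus metric (assumed convention)
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def block(a, b, c, d):
    """4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

ID4 = block(I2, Z2, Z2, I2)
# gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, s_i], [-s_i, 0]]
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + \
        [block(Z2, s, scale(-1, s), Z2) for s in PAULI]

def slash(p_up):
    """p-slash = gamma^mu p_mu with the index lowered by ETA."""
    out = scale(0, ID4)
    for mu in range(4):
        out = add(out, scale(ETA[mu] * p_up[mu], GAMMA[mu]))
    return out

def sigma(mu, nu):
    comm = add(mul(GAMMA[mu], GAMMA[nu]), scale(-1, mul(GAMMA[nu], GAMMA[mu])))
    return scale(0.5j, comm)

def on_shell(m, px, py, pz):
    return [math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz]

def spinor(p_up, m, chi):
    """u(p) = (-pslash + m) chi, a solution of (pslash + m) u = 0."""
    return apply(add(scale(-1, slash(p_up)), scale(m, ID4)), chi)

def sandwich(u_left, M, u_right):
    """ubar_left M u_right with ubar = u^dagger gamma^0."""
    v = apply(mul(GAMMA[0], M), u_right)
    return sum(complex(a).conjugate() * b for a, b in zip(u_left, v))

def gordon_sides(p, pp, m, chi, chi_p):
    """Return (lhs, rhs) of the Gordon identity for mu = 0..3."""
    u, u_p = spinor(p, m, chi), spinor(pp, m, chi_p)
    q_dn = [ETA[n] * (pp[n] - p[n]) for n in range(4)]
    pairs = []
    for mu in range(4):
        M = scale(p[mu] + pp[mu], ID4)
        for nu in range(4):
            M = add(M, scale(-1j * q_dn[nu], sigma(mu, nu)))
        pairs.append((sandwich(u_p, GAMMA[mu], u),
                      sandwich(u_p, M, u) / (2 * m)))
    return pairs

if __name__ == "__main__":
    p, pp = on_shell(1.0, 0.3, -0.2, 0.5), on_shell(1.0, -0.4, 0.1, 0.7)
    for lhs, rhs in gordon_sides(p, pp, 1.0, [1, 0.5j, -0.3, 0.2], [0.7, -0.1, 0.4j, 1]):
        print(lhs, rhs)
```

The check is insensitive to the normalization of the spinors, since both sides are linear in $u(p)$ and in $\bar{u}(p')$.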
D. Electric Charge and Anomalous Magnetic Moment
The form factor decomposition helps us identify the electric charge and the anomalous
magnetic moment of electron (and other charged fermions). This is seen as follows.
Consider the scattering of an electron off a heavy charged particle. We imagine the scattering to be effected by exchanging a single photon between the heavy charge and the electron. Schematically the amplitude can be written as,
$$ i\mathcal{M} = ie^2\left[\bar{u}(p')\Gamma^\mu(p', p)u(p)\right]\frac{1}{q^2}\left[\bar{u}(k')\gamma_\mu u(k)\right] $$
The first factor is from the electron, the second is the single photon propagator and the third factor is from the heavy charge. The $\Gamma$ vertex has only 1PI diagrams and excludes the photon self energy on the propagator. This represents the modification of the electron response due to the 1PI corrections. The photon self energy would indicate a modification in the electromagnetic field felt by the electron and hence is also referred to as vacuum polarization. This is ignored in this calculation.
In the limit of an infinitely heavy fermion, the second and third factors are replaced by an external classical potential $\tilde{A}^{cl}_\mu(q)$ and the amplitude is defined as,
$$ i\mathcal{M}\,(2\pi)\,\delta((p' - p)^0) = -ie\,\bar{u}(p')\Gamma^\mu u(p)\,\tilde{A}^{cl}_\mu(p' - p) $$
We consider two cases - Coulomb potential and magnetic field - and consider the non-relativistic limit of the scattering.
Coulomb potential: $A^{cl}_\mu(x) = (\phi(\vec{x}), \vec{0})$. The Coulomb field is static and we also assume it to be slowly varying spatially. This means that we take $\tilde{A}^{cl}_\mu(q) = 2\pi\delta(q^0)(\tilde{\phi}(\vec{q}), \vec{0})$ with the $\tilde{\phi}(\vec{q})$ having support near $\vec{q} \simeq 0$. We can thus take the limit $q \to 0$ of the first factor. This in turn means that only the $F_1$ term contributes. In the non-relativistic limit, $\bar{u}(p')\gamma^\mu u(p) \to \bar{u}(p')\gamma^0u(p) = 2m\,u^\dagger(p)u(p)$ and the amplitude takes the form, $i\mathcal{M} \to -ieF_1(0)\tilde{\phi}(\vec{q})(2m\,u^\dagger(p)u(p))$. Comparing with the Born approximation for scattering off a potential leads to the identification, $V(x) = eF_1(0)\phi(\vec{x})$. Thus, electric charge $= eF_1(0)$. Since $F_1(0) = 1$ at the leading order, the radiative corrections to $F_1(q^2)$ should vanish as $q^2 \to 0$.
Magnetic field: $A^{cl}_\mu = (0, A^{cl}_i(\vec{x}))$. Again this is taken to be time independent and spatially slowly varying. Thus we consider the limit $q \to 0$. Earlier, we dropped the $F_2$ term, because the $F_1$ term was non-zero. Now however $\bar{u}(p')\gamma^iu(p) \simeq o(p/m)$ in the non-relativistic limit which is comparable to the $F_2$ term.
Recall from the NR limit of the Dirac spinors, $u^T := (u_1^T, u_2^T)$, $(\slashed{p} + m)u(p) = 0 \Rightarrow u_2(p) \simeq -\frac{\vec{p}\cdot\vec{\sigma}}{2m}u_1$ and we have the normalization $u^\dagger u(p) = 2m$. We also had,
$$ \bar{u}(p')\gamma^iu(p) \simeq u_1^\dagger(p')\sigma^iu_2(p) + u_2^\dagger(p')\sigma^iu_1(p) \simeq -\frac{1}{2m}\,u_1^\dagger(p')\{\sigma^i\,\vec{p}\cdot\vec{\sigma} + \vec{p}\,'\cdot\vec{\sigma}\,\sigma^i\}u_1(p). $$
Using $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$, we get,
$$ \bar{u}(p')\,\vec{\gamma}\cdot\vec{A}^{cl}\,u(p) = -\frac{1}{2m}\,u_1^\dagger(p')\left[\vec{p}\cdot\vec{A}^{cl} + \vec{p}\,'\cdot\vec{A}^{cl} - iA^{cl}_i\,\epsilon^{ij}{}_k\,q_j\sigma^k\right]u_1(p) $$
Noting that $q^0 = 0$, the $F_2$ term has $\sigma^{ij}q_j$. Going over to the two component spinors, we get
$$ \frac{i}{2m}\,\bar{u}(p')\sigma^{ij}u(p)\,q_j \approx \frac{i}{2m}\,\epsilon^{ij}{}_k\,q_j\,[u_1^\dagger(p')\sigma^ku_1(p)] $$
The contribution of the $u_2$ spinors is negligible in the NR limit.
Combining the terms we get in the NR limit,
$$ \bar{u}(p')\Gamma^iu(p)\tilde{A}^{cl}_i(q) \approx u_1^\dagger(p')\left[\frac{(\vec{p} + \vec{p}\,')\cdot\vec{A}^{cl}}{2m}\,F_1(0) - \frac{i}{2m}\,\epsilon^{ij}{}_k\,q_jA^{cl}_i\sigma^k\,(F_1(0) + F_2(0))\right]u_1(p) $$
In the non-relativistic Hamiltonian, $(p - eA)^2/(2m)$, we have the $e(p\cdot A + A\cdot p)$ terms which are recovered. The remaining terms however are new. Focusing only on those, we write the amplitude as,
$$ i\mathcal{M} = -ie\,u_1^\dagger(p')\left[-\frac{1}{2m}(F_1(0) + F_2(0))\,\sigma^kB_k\right]u_1(p) \ , \quad B_k := -i\epsilon_{kij}\,q_i\tilde{A}^{cl}_j(q) . $$
This amplitude is interpreted as the Born approximation to a potential scattering with potential, $V(x) = -\langle\vec{\mu}\rangle\cdot\vec{B}(\vec{x})$, with the effective magnetic moment,
$$ \langle\vec{\mu}\rangle = \frac{e}{m_{ph}}[F_1(0) + F_2(0)]\,u_1^\dagger(p)\frac{\vec{\sigma}}{2}u_1(p) := g\,\frac{e}{2m_{ph}}\,\vec{S} \ , \quad g := 2[F_1(0) + F_2(0)] = 2[1 + F_2(0)]. $$
$g$ is called Landé's g-factor. Its value being 2 is a prediction of the Dirac equation while $F_2(0) \neq 0$ is a prediction of QED. The latter is called the anomalous magnetic moment.
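To get a feeling for the size of the effect, the following Python sketch evaluates $g = 2[1 + F_2(0)]$. The value $F_2(0) = \alpha/(2\pi)$ used here is the well known leading (one-loop) QED result due to Schwinger; it is quoted as an external input and is not derived in this section. The function name is our own.

```python
import math

ALPHA = 1 / 137.036   # fine structure constant (approximate)

def g_factor(F2_at_zero, F1_at_zero=1.0):
    """Lande g-factor from the form factors at q^2 = 0: g = 2 [F1(0) + F2(0)]."""
    return 2.0 * (F1_at_zero + F2_at_zero)

F2_one_loop = ALPHA / (2 * math.pi)   # Schwinger's one-loop value (external input)
g_dirac = g_factor(0.0)               # tree level: Dirac value
g_one_loop = g_factor(F2_one_loop)

if __name__ == "__main__":
    print("g (Dirac)    =", g_dirac)
    print("g (one loop) =", g_one_loop)
    print("(g - 2)/2    =", (g_one_loop - 2) / 2)
```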
Without actually calculating the higher order corrections, we noted what they mean for 2-point functions (self-energies). We used Lorentz covariance to get a form factor decomposition and interpreted the form factors by appealing to the NR limit. One of the immediate predictions was the anomalous magnetic moment.
We also note that to the leading order (see diagrams), we have $Z_2^{-1} = 1$, $Z_3^{-1} = 1$, $F_1(q^2) = 1$, $F_2(q^2) = 0$. This is the reason that the tree level calculations that we did, did not have any factors of $Z$'s. This will change when we do explicit evaluations of the leading corrections in the next section.
17. RADIATIVE CORRECTIONS AT 1-LOOP: DIVERGENCES
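We now calculate the leading radiative corrections to the self energies and the QED vertex function i.e. the 1-loop contributions to $\Sigma(\slashed{p})$, $\Pi(q^2)$, $F_1(q^2)$, $F_2(q^2)$. Here are the Feynman diagrams and the corresponding invariant amplitudes.
Fermion self energy
[Diagram: fermion line with external momentum $p$; in the loop the fermion carries momentum $k$ and the photon carries $p - k$.]
$$ i\Sigma(p) = (ie)^2\int\frac{d^4k}{(2\pi)^4}\ \gamma^\mu\,\frac{-i(-\slashed{k} + m)}{k^2 + m^2 - i\epsilon}\,\gamma^\nu\ \frac{-i\eta_{\mu\nu}}{(p - k)^2 + \mu^2 - i\epsilon} $$
Photon self energy (vacuum polarization)
[Diagram: photon line with external momentum $q$; the fermion loop carries momenta $k$ and $k + q$.]
$$ i\Pi^{\mu\nu}(q) = (ie)^2(-1)\int\frac{d^4k}{(2\pi)^4}\ Tr\left[\gamma^\mu\,\frac{-i}{\slashed{k} + m}\,\gamma^\nu\,\frac{-i}{\slashed{k} + \slashed{q} + m}\right] $$
QED Vertex function, $\delta\Gamma^\mu := \Gamma^\mu - \gamma^\mu$
[Diagram: incoming fermion $p$, outgoing fermion $p'$, photon $q$ with index $\mu$; the internal fermion lines carry $p - k$ and $p' - k$ and the internal photon carries $k$.]
$$ \delta\Gamma^\mu(p, p') = (ie)^2\int\frac{d^4k}{(2\pi)^4}\ \frac{-i\eta_{\alpha\beta}}{k^2 + \mu^2 - i\epsilon}\ \times\ \gamma^\alpha\,\frac{-i(-(\slashed{p}' - \slashed{k}) + m)}{(p' - k)^2 + m^2 - i\epsilon}\,\gamma^\mu\,\frac{-i(-(\slashed{p} - \slashed{k}) + m)}{(p - k)^2 + m^2 - i\epsilon}\,\gamma^\beta $$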
In the photon self energy expression, the $(-1)$ is due to the fermion loop. The photon propagator has been given a small mass $\mu$ in anticipation. In the vertex correction, the expression is understood to be sandwiched between $\bar{u}(p')$ and $u(p)$.
There are some new points to be noted.
• There is a momentum integration over $k$, unrelated to the external momenta. This $k$ can be space-like, time-like or light-like. In any frame, we may consider the region where $k^\mu \to \infty$, then: (i) the electron self-energy $\sim \int d^4k\,k^\alpha/k^4 \sim \int dk$ which is naively linearly divergent; (ii) the photon self-energy $\sim \int d^4k/k^2 \sim \int k\,dk$ which is naively quadratically divergent and (iii) the vertex correction $\sim \int d^4k\,k^2/k^6 \sim \int dk/k$ which is naively logarithmically divergent.
• There are also other regions of integration space which give divergent contributions, but these will be visible after the integrand is put in a convenient form.
• The $i\epsilon$ in the denominator is quite significant now. It implies that the 4-dimensional integration must be defined with the $k^0$ integration being done first and the spatial integrations done subsequently.
• The naively divergent integrals need to be regularised i.e. a prescription for the integration must be supplied which will manifest the divergence in an explicit form. The divergent contribution so obtained, must be subtracted to obtain finite answers. This is the process of renormalization.
• The integrand has a numerator which has tensor/spinor indices while the denominator is a product of scalar factors. One combines the denominator factors into a single factor using the so called Feynman parameters. This is followed by the momentum integration with a regularization and a finite part is identified. The integration over the Feynman parameters is done last to get the answer. We explain these steps now.
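The Feynman/Schwinger trick:
Claim:
$$ \frac{1}{A_1\dots A_n} = (n - 1)!\int_0^\infty dx_1\dots dx_n\ \frac{\delta(\Sigma_ix_i - 1)}{(\Sigma_ix_iA_i)^n} . $$
Proof: Observe that $A^{-1} = \int_0^\infty d\alpha\,e^{-\alpha A}$, $Re(A) > 0$. Therefore,
\begin{align}
\frac{1}{A_1\dots A_n} &= \int_0^\infty d\alpha_1\dots d\alpha_n\ e^{-\Sigma_i\alpha_iA_i}\cdot 1 \ , \quad 1 = \int_0^\infty dt\,\delta(t - \Sigma_i\alpha_i) \ \text{ and } \ \alpha_i \to tx_i , \nonumber \\
&= \int_0^\infty dt\int_0^\infty dx_1\dots dx_n\ t^n\,e^{-t\Sigma_ix_iA_i}\ \frac{1}{t}\,\delta(\Sigma_ix_i - 1) \nonumber \\
&= \int_0^1 dx_1\dots dx_n\ \delta(\Sigma_ix_i - 1)\int_0^\infty dt\ t^{n-1}e^{-t(\Sigma_ix_iA_i)} \nonumber \\
&= (n - 1)!\int_0^\infty dx_1\dots dx_n\ \frac{\delta(\Sigma_ix_i - 1)}{(\Sigma_ix_iA_i)^n} \qquad \because\ \int_0^\infty dt\ t^{n-1}e^{-t\Lambda} = \Lambda^{-n}\,\Gamma(n) . \nonumber
\end{align}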
Note: Since each of the denominator factors is of the form $A_i = (p_i - k)^2 + m_i^2 - i\epsilon$, we see that
\begin{align}
\Sigma_ix_iA_i &= k^2 - 2k\cdot(\Sigma_ix_ip_i) + \Sigma_ix_i(p_i^2 + m_i^2) - i\epsilon \nonumber \\
&= (k - \Sigma_ix_ip_i)^2 + M^2 - i\epsilon \ , \quad M^2(x_i, p_i, m_i) := \Sigma_ix_i(p_i^2 + m_i^2) - (\Sigma_ix_ip_i)^2 . \nonumber
\end{align}
The obvious step is to shift the integration variable, $k \to k + \Sigma_ix_ip_i$, which simplifies the denominator - it has the same form as a single propagator with the same $-i\epsilon$.
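For two factors the claim reduces to $\frac{1}{AB} = \int_0^1 dx\,[xA + (1 - x)B]^{-2}$, which can be checked numerically together with the completing-the-square step. The Python sketch below uses ordinary real numbers in place of the Minkowski products (a one-dimensional stand-in, which is enough for these algebraic identities). All names and sample values are our own illustrative assumptions; the combined denominator corresponds to weighting $(p - k)^2 + \mu^2$ by $x$ and $k^2 + m^2$ by $1 - x$.

```python
def feynman_two_factor(A, B, n=20000):
    """Midpoint-rule value of the integral of [x A + (1 - x) B]^(-2) over 0 <= x <= 1."""
    h = 1.0 / n
    return h * sum(1.0 / ((i + 0.5) * h * A + (1 - (i + 0.5) * h) * B) ** 2
                   for i in range(n))

def combined_denominator(k, p, m, mu, x):
    """x [(p - k)^2 + mu^2] + (1 - x) [k^2 + m^2] with real-number stand-ins."""
    return x * ((p - k) ** 2 + mu ** 2) + (1 - x) * (k ** 2 + m ** 2)

def shifted_form(k, p, m, mu, x):
    """(k - x p)^2 + M^2 with M^2 = x (1 - x) p^2 + (1 - x) m^2 + x mu^2."""
    M2 = x * (1 - x) * p ** 2 + (1 - x) * m ** 2 + x * mu ** 2
    return (k - x * p) ** 2 + M2

if __name__ == "__main__":
    print(feynman_two_factor(2.0, 5.0), 1.0 / (2.0 * 5.0))
    print(combined_denominator(0.4, 1.3, 1.0, 0.1, 0.35),
          shifted_form(0.4, 1.3, 1.0, 0.1, 0.35))
```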
Let us begin with the fermion self energy.
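A. Isolation of Divergence: Fermion Self Energy
$$ i\Sigma(p) = (ie)^2\int\frac{d^4k}{(2\pi)^4}\ \gamma^\mu\,\frac{-i(-\slashed{k} + m)}{k^2 + m^2 - i\epsilon}\,\gamma^\nu\ \frac{-i\eta_{\mu\nu}}{(p - k)^2 + \mu^2 - i\epsilon} $$
The Feynman trick gives,
$$ \frac{1}{(k^2 + m^2 - i\epsilon)((p - k)^2 + \mu^2 - i\epsilon)} = \int_0^1 dx\ \frac{1}{[(k - xp)^2 + M^2 - i\epsilon]^2} , $$
with, $M^2 := x(1 - x)p^2 + (1 - x)m^2 + x\mu^2$. Shifting $k \to k + xp$, and using $\gamma^\mu\gamma_\mu = -4\cdot\mathbb{1}$, $\gamma^\mu\slashed{k}\gamma_\mu = 2\slashed{k}$, gives
$$ i\Sigma = e^2\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\ \frac{-4m - 2(\slashed{k} + x\slashed{p})}{[k^2 + M^2(x, p) - i\epsilon]^2} $$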
Recalling that the $k^0$ integration is to be done first and that the poles are at $k^0 = \pm\sqrt{\vec{k}^2 + M^2} \mp i\epsilon$, we rotate the contour anti-clockwise without crossing any singularity. This is equivalent to putting $k^0 = ik^0_E$ and $k^2 \to (k^0_E)^2 + \vec{k}^2 =$ Euclidean vector norm. This sends $\int d^4k \to i\int d^4k$. For Euclidean integrals $\int d^4k\ \frac{k^\alpha\gamma_\alpha}{(k^2 + M^2)^2} = 0$, since the integrand is odd under $k \to -k$. We are left with,
$$ i\Sigma = -ie^2\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\ \frac{4m + 2x\slashed{p}}{(k^2 + M^2)^2} \ , \quad M^2 := x(1 - x)p^2 + (1 - x)m^2 + x\mu^2 $$
The $k$-integral is logarithmically divergent and needs to be regulated. There are several ways of doing this. One is the Pauli-Villars regularization which subtracts from the photon propagator another identical piece with arbitrarily large mass squared, $\Lambda^2$, i.e.
$$ \frac{1}{(p - k)^2 + \mu^2} \to \frac{1}{(p - k)^2 + \mu^2} - \frac{1}{(p - k)^2 + \Lambda^2} . $$
For large $k$, both terms go as $k^{-2}$ and cancel each other leaving a finite answer. Thus $i\Sigma \to i\Sigma_{reg} := i\Sigma(\Lambda = \infty) - i\Sigma(\Lambda)$. The same combining of denominators will produce identical terms with $M^2(\mu^2) \to M^2(\Lambda^2)$. The momentum integration becomes,
$$ \int\frac{d^4k}{(2\pi)^4}\left[\frac{1}{(k^2 + M_\mu^2)^2} - \frac{1}{(k^2 + M_\Lambda^2)^2}\right] = \int\frac{d\Omega_3}{(2\pi)^4}\int_0^\infty dk\,|k|^3\,[\dots] $$
n/2
2π
= 2π 2 for n = 4 and the integral becomes,
The angular integration gives: Ωn−1 = Γ(n/2)
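\[
\begin{aligned}
\int \frac{d^4k}{(2\pi)^4}\,[\cdots - \dots] &= \frac{1}{8\pi^2}\int_0^\infty \frac{dy}{2}\left[\frac{y}{(y + M_\mu^2)^2} - \frac{y}{(y + M_\Lambda^2)^2}\right] , \qquad k^2 =: y \text{ substituted.} \\
&= \frac{1}{16\pi^2}\int_0^\infty dy \left[\frac{1}{y + M_\mu^2} - \frac{1}{y + M_\Lambda^2} - \frac{M_\mu^2}{(y + M_\mu^2)^2} + \frac{M_\Lambda^2}{(y + M_\Lambda^2)^2}\right] \\
&= \frac{1}{16\pi^2}\left[\, \ln\frac{y + M_\mu^2}{y + M_\Lambda^2}\,\Big|_0^\infty + \left(\frac{M_\mu^2}{y + M_\mu^2} - \frac{M_\Lambda^2}{y + M_\Lambda^2}\right)\Big|_0^\infty \right] \\
\therefore\ \int \frac{d^4k}{(2\pi)^4}\,[\cdots - \dots] &= \frac{1}{16\pi^2}\, \ln\frac{M_\Lambda^2}{M_\mu^2}\ .
\end{aligned}
\]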
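As a cross-check of the Pauli-Villars regulated radial integral, the $y$-integration can be done numerically. This is a sketch under stated assumptions: the half-line is mapped to $(0,1)$ by the substitution $y = t/(1-t)$, which is a choice made here and not a step of the notes, and the function names are hypothetical.

```python
import math


def pv_radial_integral(M_mu2: float, M_lam2: float, steps: int = 20000) -> float:
    """Midpoint estimate of (1/2) * int_0^inf dy [ y/(y+M_mu2)^2 - y/(y+M_lam2)^2 ].

    The substitution y = t/(1-t), dy = dt/(1-t)^2 maps (0, inf) to (0, 1).
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        y = t / (1.0 - t)
        g = y / (y + M_mu2) ** 2 - y / (y + M_lam2) ** 2
        total += g / (1.0 - t) ** 2
    return 0.5 * total * h


def pv_closed_form(M_mu2: float, M_lam2: float) -> float:
    """(1/2) ln(M_Lambda^2 / M_mu^2); multiply by 1/(8 pi^2) to get the result quoted above."""
    return 0.5 * math.log(M_lam2 / M_mu2)


if __name__ == "__main__":
    print(pv_radial_integral(1.0, 50.0), pv_closed_form(1.0, 50.0))
```

Each term alone diverges logarithmically at large $y$; only the difference is finite, which is the point of the subtraction.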
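In the limit $\Lambda \to \infty$, $M_\Lambda^2 \to x\Lambda^2$ and we get,
\[
i\Sigma(p) = -i\,\frac{e^2}{8\pi^2}\int_0^1 dx\, (2m + x\slashed{p})\, \ln\left[\frac{x\Lambda^2}{x(1-x)p^2 + (1-x)m^2 + x\mu^2}\right]
\tag{17.1}
\]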
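Recall that the mass shift as well as the $Z_2$ are obtained from the self energy as, $\delta m := m_{ph} - m = -\Sigma(\slashed{p} = -m_{ph})$, $Z_2^{-1} = 1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\Big|_{\slashed{p} = -m_{ph}}$. Since the self energy already has an explicit factor of $e^2$, in the integrand we can use $\slashed{p} = -m_{ph} \approx -m$. The derivative of $\Sigma$ is obtained as,
\[
\frac{d\Sigma}{d\slashed{p}}\Big|_{\slashed{p} = -m_{ph}} = -\frac{\alpha}{2\pi}\int_0^1 dx \left[ x\ln\left(\frac{x\Lambda^2}{-x(1-x)m^2 + (1-x)m^2 + x\mu^2}\right) + (2m - x\cdot m)\, \frac{-1}{-x(1-x)m^2 + (1-x)m^2 + x\mu^2}\, \big(-2x(1-x)(-m)\big) \right]
\]
Simplification leads to,
\[
\frac{d\Sigma}{d\slashed{p}}\Big|_{\slashed{p} = -m_{ph}} = \frac{\alpha}{2\pi}\int_0^1 dx \left[ -x\ln\left(\frac{x\Lambda^2}{(1-x)^2m^2 + x\mu^2}\right) + 2(2-x)\,\frac{x(1-x)m^2}{(1-x)^2m^2 + x\mu^2} \right]
\tag{17.2}
\]
\[
\delta m = -\Sigma(\slashed{p} = -m_{ph}) = \frac{\alpha}{2\pi}\int_0^1 dx\, (2-x)\ln\left[\frac{x\Lambda^2}{(1-x)^2m^2 + x\mu^2}\right]
\tag{17.3}
\]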
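The integrands are well behaved at both end points so the integrals are finite and the leading contribution is given by the $\Lambda^2$ dependent part alone and for dimensional reasons we divide by the fermion mass. Thus, the divergent contributions as $\Lambda \to \infty$ take the form,
\[
\begin{aligned}
\delta m &\approx \frac{\alpha}{2\pi}\int_0^1 dx\,(2-x)\ln(\Lambda^2/m^2) = \frac{3\alpha}{4\pi}\ln(\Lambda^2/m^2) \\
Z_2^{-1} &\approx 1 - \frac{\alpha}{2\pi}\int_0^1 dx\,\big(-x\ln(\Lambda^2/m^2)\big) = 1 + \frac{\alpha}{4\pi}\ln(\Lambda^2/m^2)\ \Rightarrow\ Z_2 \approx 1 - \frac{\alpha}{4\pi}\ln(\Lambda^2/m^2)
\end{aligned}
\]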
The logarithmic divergence is thus manifested in terms of ln(Λ2 /m2 ). There are of course
finite pieces, but until we take care of the divergent parts the finite parts are irrelevant.
B. Isolation of Divergence: Photon Self-Energy
\[
i\Pi^{\mu\nu}(q) = -e^2\int\frac{d^4k}{(2\pi)^4}\, \frac{Tr[\gamma^\mu(-\slashed{k} + m)\gamma^\nu(-\slashed{k} - \slashed{q} + m)]}{(k^2 + m^2 - i\epsilon)((k+q)^2 + m^2 - i\epsilon)}
\]
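Using the trace formulae for the Dirac matrices (14.4), we get
\[
\begin{aligned}
\text{Numerator:} &= Tr[\gamma^\mu\slashed{k}\gamma^\nu(\slashed{k} + \slashed{q}) + m^2\gamma^\mu\gamma^\nu] \\
&= 4[k^\mu(k+q)^\nu - \eta^{\mu\nu}\, k\cdot(k+q) + (k+q)^\mu k^\nu - m^2\eta^{\mu\nu}] \\
&= 4[2k^\mu k^\nu - \eta^{\mu\nu}(k^2 + m^2) + (k^\mu q^\nu + k^\nu q^\mu - \eta^{\mu\nu}\, k\cdot q)] \\
\text{Denominator:} &= \frac{1}{k^2 + m^2 - i\epsilon}\, \frac{1}{(k+q)^2 + m^2 - i\epsilon} \\
&= \int_0^1 dx\, \frac{1}{[(k + xq)^2 + x(1-x)q^2 + m^2]^2}\ , \qquad \text{shift } k \to k - xq,
\end{aligned}
\]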
The shift generates additional terms linear in $k$ in the numerator. Under integration, these terms drop out and we are left with,
\[
i\Pi^{\mu\nu}(q) = (-4e^2)\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\, \frac{\{2k^\mu k^\nu + 2x^2q^\mu q^\nu - \eta^{\mu\nu}(k^2 + x^2q^2 + m^2) - x(2q^\mu q^\nu - \eta^{\mu\nu}q^2)\}}{(k^2 + M^2 - i\epsilon)^2}
\]
where, $M^2 = x(1-x)q^2 + m^2$. Doing Wick rotation as before,
\[
i\Pi^{\mu\nu}(q) = (-4e^2)\, i\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\, \frac{2k^\mu k^\nu - 2x(1-x)q^\mu q^\nu + x(1-x)\eta^{\mu\nu}q^2 - \eta^{\mu\nu}(k^2 + m^2)}{(k^2 + M^2 - i\epsilon)^2}
\]
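The tensorial integral,
\[
\int\frac{d^4k}{(2\pi)^4}\, \frac{k^\mu k^\nu}{(k^2 + M^2)^2} = A\,\delta^{\mu\nu}\ , \qquad \text{with}\quad A = \frac{1}{4}\int\frac{d^4k}{(2\pi)^4}\, \frac{k^2}{(k^2 + M^2)^2}
\]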
There are several problems to be faced now. First, the integral is quadratically divergent and the shift in the momentum variable is unjustified. Second, we now have both $\delta^{\mu\nu}$ and $\eta^{\mu\nu}$ in the numerator, confusing Lorentz covariance. Third, naively we see a quadratically divergent piece with coefficient $\eta^{\mu\nu}$ but only a logarithmically divergent one with coefficient $q^\mu q^\nu$, casting doubts on the possibility of gauge invariance. We could try the Pauli-Villars regularization, but we will introduce a different one, the dimensional regularization. The Pauli-Villars may be seen in [13]. The dimensional regularization has the main virtue of keeping the Ward identity satisfied, even for non-abelian gauge theories.
The idea is to think of the perturbation theory to be a member of a class of similar theories formulated in general $n$ dimensions. One chooses a value of $n$ where the integrals are well defined ($n < 4$) and analytically continues $n \to 4$. This isolates the original divergence in a particular form (poles) providing the needed regularization. Since the mass/length dimensions also depend on space-time dimensions we need the coupling constants to be dimensionful. To maintain space-time covariance, we need to regard the external momenta and the $\gamma$ matrices etc also to be $n$-dimensional. We set $\epsilon := 4 - n$ and consider the limit $\epsilon \to 0$. Since the integrals are over Euclidean momenta due to the Wick rotation, the external momenta and the metric tensor need to be continued back to Minkowski signature at the end of the calculations.
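In this regularization scheme, the following rules are adopted.
\[
\int\frac{d^nk}{(2\pi)^n}\, f(|k|^2) = \int\frac{d\Omega_{n-1}}{(2\pi)^n}\int dk\, |k|^{n-1} f(|k|^2)\ , \qquad \int d\Omega_{n-1} = \frac{2\pi^{n/2}}{\Gamma(n/2)}
\tag{17.4}
\]
\[
\therefore\ \int\frac{d^nk}{(2\pi)^n}\, f(|k|^2) = \frac{2\pi^{n/2}}{\Gamma(n/2)}\, \frac{1}{(2\pi)^n}\int dk\, k^{n-1} f(k^2)
\tag{17.5}
\]
\[
\begin{aligned}
\int_0^\infty dk\, \frac{k^{n-1}}{(k^2 + M^2)^\alpha} &= \frac{1}{2}\int_0^\infty dx\, \frac{x^{\frac{n}{2}-1}}{(x + M^2)^\alpha}\ , \qquad \text{put } y = \frac{M^2}{x + M^2} \\
&= \frac{M^{n-2\alpha}}{2}\int_0^1 dy\, y^{\alpha - 1 - \frac{n}{2}}(1-y)^{\frac{n}{2}-1} = \frac{M^{n-2\alpha}}{2}\, \beta(\alpha - n/2,\, n/2)
\end{aligned}
\]
\[
\therefore\ \int_0^\infty dk\, \frac{k^{n-1}}{(k^2 + M^2)^\alpha} = \frac{M^{n-2\alpha}}{2}\, \frac{\Gamma(\alpha - n/2)\Gamma(n/2)}{\Gamma(\alpha)}
\tag{17.6}
\]
and
\[
\int\frac{d^nk}{(2\pi)^n}\, \frac{1}{(k^2 + M^2)^\alpha} = \frac{1}{(4\pi)^{n/2}}\, \frac{\Gamma(\alpha - n/2)}{\Gamma(\alpha)}\, (M^2)^{n/2 - \alpha}
\tag{17.7}
\]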
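Formulas (17.6) and (17.7) can be spot-checked numerically in dimensions where the integrals converge. The sketch below is an illustration only: the substitution $k = M\tan\theta$ and all function names are choices made here, and the numerical routine is only meant for cases with $2\alpha - n - 1 \ge 0$ so that the integrand stays bounded.

```python
import math


def radial_integral(n: float, alpha: float, M: float, steps: int = 20000) -> float:
    """int_0^inf dk k^(n-1)/(k^2+M^2)^alpha via k = M*tan(theta), theta in (0, pi/2).

    After the substitution the integrand is M^(n-2*alpha) * sin^(n-1) * cos^(2*alpha-n-1).
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        th = (i + 0.5) * h
        total += math.sin(th) ** (n - 1) * math.cos(th) ** (2 * alpha - n - 1)
    return M ** (n - 2 * alpha) * total * h


def radial_closed_form(n: float, alpha: float, M: float) -> float:
    """Right-hand side of eq. (17.6)."""
    return 0.5 * M ** (n - 2 * alpha) * math.gamma(alpha - n / 2) * math.gamma(n / 2) / math.gamma(alpha)


def momentum_integral_numeric(n: float, alpha: float, M: float) -> float:
    """Angular factor of eq. (17.5) times the numerically evaluated radial integral."""
    angular = 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)
    return angular / (2.0 * math.pi) ** n * radial_integral(n, alpha, M)


def momentum_integral_closed_form(n: float, alpha: float, M: float) -> float:
    """Right-hand side of eq. (17.7)."""
    return math.gamma(alpha - n / 2) / math.gamma(alpha) * (M * M) ** (n / 2 - alpha) / (4.0 * math.pi) ** (n / 2)


if __name__ == "__main__":
    print(momentum_integral_numeric(3, 2, 1.5), momentum_integral_closed_form(3, 2, 1.5))
```

For $n = 3$, $\alpha = 2$ both sides reduce to $1/(8\pi M)$, which gives a simple closed-form target for the comparison.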
The $\Gamma$ function has isolated poles at $\alpha - n/2 = 0, -1, -2, \dots$ i.e. at $n = 2(\alpha + m)$, $m = 0, 1, \dots$. The Gamma function has the expansion $\Gamma(\epsilon/2) = 2/\epsilon - \gamma + o(\epsilon)$ where, $\gamma = 0.57721\dots$ is the Euler-Mascheroni Constant.
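The pole expansion of the Gamma function quoted above is easy to test against the standard library. A minimal sketch, with hypothetical function names and a hard-coded value of the Euler-Mascheroni constant:

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gamma_exact(eps: float) -> float:
    """Gamma(eps/2) evaluated with the standard library."""
    return math.gamma(eps / 2.0)


def gamma_pole_approx(eps: float) -> float:
    """Leading terms 2/eps - gamma of the expansion around eps = 0."""
    return 2.0 / eps - EULER_GAMMA


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, gamma_exact(eps), gamma_pole_approx(eps))
```

The difference between the two columns shrinks linearly with $\epsilon$, as expected for the neglected $o(\epsilon)$ terms.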
Since finite parts can be affected we need to keep $\epsilon$ in various places too, e.g. $\{\gamma^\mu, \gamma^\nu\} = -2\delta^{\mu\nu}\mathbb{1}$, $Tr(\mathbb{1}) = n = 4 - \epsilon$, $\gamma^\mu\gamma^\nu\gamma_\mu = (2 - \epsilon)\gamma^\nu$. These terms will give contributions from the $1/\epsilon$ terms. The coupling constant dimensions work as follows. $[\Psi] = (n-1)/2$, $[A_\mu] = (n-2)/2 \Rightarrow [e] = n - (n-1) - (n/2 - 1) = \epsilon/2$. Therefore we write $e \to e\,\mu_0^{\epsilon/2}$ where $\mu_0$ is an arbitrary mass scale which should disappear from physical quantities. The dimensions are in mass units. The tensorial integral will now follow from $k^\mu k^\nu \to \frac{1}{n}\delta^{\mu\nu}k^2$. We use these in evaluating the $\Pi^{\mu\nu}(q)$.
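\[
\begin{aligned}
\Pi^{\mu\nu}(q) &= (-ne^2\mu_0^\epsilon)\int_0^1 dx\, \Big[ \big\{-2x(1-x)q^\mu q^\nu + \delta^{\mu\nu}(x(1-x)q^2 - m^2)\big\} \times \int\frac{d^nk}{(2\pi)^n}\frac{1}{(k^2+M^2)^2} \\
&\qquad\qquad + \delta^{\mu\nu}(2/n - 1)\int\frac{d^nk}{(2\pi)^n}\frac{k^2 + M^2 - M^2}{(k^2+M^2)^2} \Big]\ , \qquad \text{where, } M^2 = x(1-x)q^2 + m^2 \\
&= (-ne^2\mu_0^\epsilon)\int_0^1 dx\, \Big[ \delta^{\mu\nu}\Big(\frac{2}{n} - 1\Big)\underbrace{\int\frac{d^nk}{(2\pi)^n}\frac{1}{k^2+M^2}}_{(1)} \\
&\qquad\qquad + \underbrace{\int\frac{d^nk}{(2\pi)^n}\frac{1}{(k^2+M^2)^2}}_{(2)} \times \Big\{-2x(1-x)q^\mu q^\nu + \delta^{\mu\nu}\Big(-m^2 + x(1-x)q^2 - \Big(\frac{2}{n} - 1\Big)M^2\Big)\Big\} \Big]
\end{aligned}
\]
\[
\begin{aligned}
(1) &= \frac{1}{(4\pi)^{n/2}}\frac{\Gamma(1 - n/2)}{\Gamma(1)}(M^2)^{n/2-1} = \frac{M^2}{(4\pi)^2}\left(\frac{M^2}{4\pi}\right)^{-\epsilon/2}\frac{\Gamma(\epsilon/2)}{(\epsilon/2 - 1)} \\
(2) &= \frac{1}{(4\pi)^{n/2}}\frac{\Gamma(2 - n/2)}{\Gamma(2)}(M^2)^{n/2-2} = \frac{1}{(4\pi)^2}\left(\frac{M^2}{4\pi}\right)^{-\epsilon/2}\Gamma(\epsilon/2) \\
\therefore\ (1) &= \frac{M^2}{\epsilon/2 - 1}\,(2)\ , \qquad \frac{2}{n} - 1 = \Big(1 - \frac{n}{2}\Big)\frac{2}{n} = \Big(\frac{\epsilon}{2} - 1\Big)\frac{2}{n}
\end{aligned}
\]
\[
\Pi^{\mu\nu}(q) = (-ne^2\mu_0^\epsilon)\int_0^1 dx\, \Big[ \delta^{\mu\nu}\Big(\frac{2}{n} - 1\Big)\frac{M^2}{(\frac{\epsilon}{2} - 1)} + \delta^{\mu\nu}\Big(-m^2 - \Big(\frac{2}{n} - 1\Big)M^2 + x(1-x)q^2\Big) - 2x(1-x)q^\mu q^\nu \Big] \times (2)
\]
But,
\[
[\dots] = \delta^{\mu\nu}\underbrace{\Big(\frac{2}{n}M^2 - m^2 - \frac{2}{n}M^2 + M^2\cdot 1 + x(1-x)q^2\Big)}_{2x(1-x)q^2} - 2x(1-x)q^\mu q^\nu = 2x(1-x)(q^2\delta^{\mu\nu} - q^\mu q^\nu)
\]
$\Rightarrow \Pi^{\mu\nu}(q) = (q^2\delta^{\mu\nu} - q^\mu q^\nu)\,\Pi(q^2)$ with,
\[
\begin{aligned}
\Pi(q^2) &:= -ne^2\mu_0^\epsilon\int_0^1 dx\, \frac{1}{(4\pi)^2}\left(\frac{M^2}{4\pi}\right)^{-\epsilon/2}\Gamma(\epsilon/2)\,(2x)(1-x) \\
&= -\frac{2(4-\epsilon)e^2}{(4\pi)^2}\int_0^1 dx\, x(1-x)\Big(\frac{2}{\epsilon} - \gamma\Big)\Big(\frac{\epsilon}{2}\ln\frac{4\pi\mu_0^2}{M^2} + 1\Big)
\end{aligned}
\]
Using $a^{\epsilon/2} = 1 + \frac{\epsilon}{2}\ln(a) + o(\epsilon^2)$, we write,
\[
\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\Big(\frac{2}{\epsilon} - \gamma\Big)\Big(1 + \frac{\epsilon}{2}\ln\frac{4\pi\mu_0^2}{M^2}\Big)\Big(1 - \frac{\epsilon}{4}\Big)
\tag{17.8}
\]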
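Notice that for $q^2 = 0$, $M^2 = m^2$ and
\[
\Pi(0) = -\frac{2\alpha}{\pi}\,\frac{2}{\epsilon}\int_0^1 dx\,(x - x^2) + \text{constant} = -\frac{2\alpha}{3\pi}\frac{1}{\epsilon} + \text{constant}\ , \quad\Rightarrow\quad Z_3 = 1 - \frac{2\alpha}{3\pi}\frac{1}{\epsilon}
\tag{17.9}
\]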
We see the divergence in Π(0), Z3 and there is of course no mass shift. We return to the
interpretation of Π(q 2 ) later.
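The step that turns the coefficient of $\delta^{\mu\nu}$ into $2x(1-x)q^2$, and so makes $\Pi^{\mu\nu}$ transverse, is pure algebra and can be verified with exact rational arithmetic for any $n \neq 2$. The sketch below is illustrative; the function names are hypothetical and the common factor (2) of the derivation is divided out.

```python
from fractions import Fraction


def delta_coefficient(n, x, q2, m2):
    """Coefficient of delta^{mu nu} multiplying integral (2), for general dimension n.

    First piece: (2/n - 1) * M^2 / (eps/2 - 1), coming from integral (1) = M^2/(eps/2 - 1) * (2).
    Second piece: -m^2 - (2/n - 1) * M^2 + x(1-x) q^2.
    """
    n = Fraction(n)
    eps = 4 - n
    M2 = x * (1 - x) * q2 + m2
    first = (Fraction(2) / n - 1) * M2 / (eps / 2 - 1)
    second = -m2 - (Fraction(2) / n - 1) * M2 + x * (1 - x) * q2
    return first + second


def transverse_coefficient(x, q2):
    """Expected result 2 x (1 - x) q^2, matching the coefficient of -q^mu q^nu."""
    return 2 * x * (1 - x) * q2


if __name__ == "__main__":
    args = (Fraction(7, 2), Fraction(1, 3), Fraction(5), Fraction(2))
    print(delta_coefficient(*args), transverse_coefficient(args[1], args[2]))
```

The mass $m^2$ and the dimension $n$ drop out of the result, which is why the structure $q^2\delta^{\mu\nu} - q^\mu q^\nu$ factors out before any expansion in $\epsilon$.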
Remark: The $\epsilon^{-1}$ pole in dimensional regularization corresponds to logarithmic divergence in Pauli-Villars. This is indicated by comparing the regulated integral $\int[(k^2 + M^2)^{-2} - (k^2 + \Lambda^2)^{-2}]$ in Pauli-Villars, which goes over to $\frac{1}{(4\pi)^2}\ln(x\Lambda^2/M^2)$. In the dimensional regularization, our integral (2) goes as $\frac{1}{(4\pi)^2}\frac{2}{\epsilon}$. Hence, $\epsilon^{-1} \leftrightarrow \ln(\Lambda/M)$. Thanks to the dimensional regularization maintaining the Ward identity, the $q^2\delta^{\mu\nu} - q^\mu q^\nu$ got pulled out and the superficially quadratically divergent integral got converted to a logarithmically divergent one.
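C. Isolation of Divergence: Vertex function
We compute $\delta\Gamma^\mu = \Gamma^\mu - \gamma^\mu$, with $p^2 = (p')^2 = -m^2$ and sandwiching by $\bar{u}(p')$, $u(p)$ is implicit.
\[
\delta\Gamma^\mu(p, p') = -ie^2\int\frac{d^4k}{(2\pi)^4}\, \frac{\eta_{\alpha\beta}}{k^2 + \mu^2 - i\epsilon}\, \gamma^\alpha\, \frac{(-(\slashed{p}' - \slashed{k}) + m)}{(p' - k)^2 + m^2 - i\epsilon}\, \gamma^\mu\, \frac{(-(\slashed{p} - \slashed{k}) + m)}{(p - k)^2 + m^2 - i\epsilon}\, \gamma^\beta
\]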
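We have three denominators and the Feynman trick leads to,
\[
\frac{1}{k^2 + \mu^2 - i\epsilon}\, \frac{1}{(p' - k)^2 + m^2 - i\epsilon}\, \frac{1}{(p - k)^2 + m^2 - i\epsilon} = \int_0^1 dx\int_0^1 dy\int_0^1 dz\, \delta(1 - x - y - z)\, \frac{2}{[(k - xp - yp')^2 + M^2 - i\epsilon]^3}
\]
where,
\[
\begin{aligned}
M^2 &:= x(p^2 + m^2) + y((p')^2 + m^2) + z\mu^2 - (xp + yp')^2 \\
\therefore\ M^2 &= 0 + 0 + z\mu^2 + m^2(x+y)^2 + q^2xy
\end{aligned}
\]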
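Since $q^2 > 0$ (space-like) for this diagram, $M^2 = (x+y)^2m^2 + xyq^2 + z\mu^2$ is manifestly positive. As usual shifting the momentum $k \to k + xp + yp'$, doing the Wick rotation and dropping the terms linear in $k$ in the numerator, we get,
\[
\delta\Gamma^\mu(p, p') = e^2\int_0^1 dx\int_0^1 dy\int_0^1 dz\, \delta(1 - x - y - z)\int\frac{d^4k}{(2\pi)^4}\, \frac{2 \times Nr}{[k^2 + M^2]^3}
\]
\[
\begin{aligned}
Nr &= \gamma^\alpha\{(\slashed{p}' - \slashed{k} - x\slashed{p} - y\slashed{p}')\gamma^\mu(\slashed{p} - \slashed{k} - x\slashed{p} - y\slashed{p}')\}\gamma_\alpha + m^2\gamma^\alpha\gamma^\mu\gamma_\alpha \\
&\quad - m\gamma^\alpha\{\gamma^\mu(\slashed{p} - x\slashed{p} - y\slashed{p}') + (\slashed{p}' - x\slashed{p} - y\slashed{p}')\gamma^\mu\}\gamma_\alpha + \text{linear in } k
\end{aligned}
\]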
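Now we need to use the identities,
\[
\gamma^\alpha\gamma^\mu\gamma_\alpha = -2\eta^{\alpha\mu}\gamma_\alpha - \gamma^\mu(-4) = 2\gamma^\mu
\tag{17.10}
\]
\[
\gamma^\alpha\gamma^\rho\gamma^\sigma\gamma_\alpha = 4\eta^{\rho\sigma}
\tag{17.11}
\]
\[
\gamma^\alpha\gamma^\rho\gamma^\mu\gamma^\sigma\gamma_\alpha = 2\gamma^\sigma\gamma^\mu\gamma^\rho
\tag{17.12}
\]
\[
\therefore\ \gamma^\alpha\slashed{k}\gamma^\mu\slashed{k}\gamma_\alpha = 2\slashed{k}\gamma^\mu\slashed{k} = -4k^\mu\slashed{k} + 2\gamma^\mu k^2 \qquad \because\ \slashed{k}\slashed{k} = -k^2
\tag{17.13}
\]
\[
\gamma^\alpha\{\dots\}\gamma^\mu\{\dots\}\gamma_\alpha = 2\{(1-x)\slashed{p} - y\slashed{p}'\}\gamma^\mu\{(1-y)\slashed{p}' - x\slashed{p}\}
\tag{17.14}
\]
\[
\gamma^\alpha\{\gamma^\mu(\dots) + (\dots)\gamma^\mu\}\gamma_\alpha = 4\{(1-x)p^\mu - y(p')^\mu + (1-y)(p')^\mu - xp^\mu\}
\tag{17.15}
\]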
Using these, the numerator takes the form,
\[
\bar{u}(p')(Nr)u(p) = \bar{u}(p')\Big[ \underbrace{[-4k^\mu\slashed{k} + 2k^2\gamma^\mu]}_{(1)} + \underbrace{2[\{(1-x)\slashed{p} - y\slashed{p}'\}\gamma^\mu\{(1-y)\slashed{p}' - x\slashed{p}\}]}_{(2)} + \underbrace{2[m^2\gamma^\mu]}_{(3)} - \underbrace{4m[(1-2x)p^\mu + (1-2y)(p')^\mu]}_{(4)} \Big]u(p)
\]
Observe that $\slashed{p}u(p) = -mu(p)$, $\bar{u}(p')\slashed{p}' = -m\bar{u}(p')$. In (1), $\int d^4k\, k^\mu\slashed{k} \propto \gamma^\mu$ (after Pauli-Villars). We simplify (2) as,
\[
\frac{1}{2}\,\bar{u}(p')\,(2)\,u(p) = \bar{u}(p')[(1-x)(1-y)\slashed{p}\gamma^\mu\slashed{p}' + xym^2\gamma^\mu + x(1-x)m\slashed{p}\gamma^\mu + y(1-y)m\gamma^\mu\slashed{p}']u(p)
\]
$\slashed{p}\gamma^\mu \to -2p^\mu - (-m)\gamma^\mu$, $\gamma^\mu\slashed{p}' \to -2(p')^\mu + m\gamma^\mu$ between the spinors.
Thus we see that, between the spinors, all the terms in the numerator can be put in the form of $(p + p')^\mu[\dots] + \gamma^\mu[\dots]$ and therefore the integral can be arranged as contributing to the form factors $F_1$, $F_2$. The intermediate steps, including the Pauli-Villars regularization, are left as an exercise. The result is (from $\delta\Gamma^\mu$):
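\[
F_1(q^2) - 1 = \frac{\alpha}{2\pi}\int_0^1 dx\, dy\, dz\, \delta(1 - x - y - z)\left[\ln\left(\frac{z\Lambda^2}{M^2}\right) + \frac{1}{M^2}\big(-(1-x)(1-y)q^2 + (1 - 4z + z^2)m^2\big)\right]
\tag{17.16}
\]
\[
F_2(q^2) = \frac{\alpha}{2\pi}\int_0^1 dx\, dy\, dz\, \delta(1 - x - y - z)\, \frac{2m^2z(1-z)}{M^2}\ , \qquad \text{with}\quad M^2 := q^2xy + m^2(1-z)^2 + z\mu^2
\tag{17.17}
\]
Note: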
(i) In the δF1 , there is a UV divergence (Λ → ∞) coming from the large k region, even
for q 2 = 0. This violates the δF1 (0) = 0 condition and thus changes the electric charge. The
condition can be naturally restored by defining δF1,ren (q 2 ) := δF1 (q 2 ) − δF1 (0). This is a
renormalization prescription.
(ii) The δF1 also has an “infrared divergence” (µ2 → 0) at q 2 = 0 coming from the
M −2 (0) = m−2 (1 − z)−2 from the z → 1 region.
(iii) The F2 (q 2 ) has no divergences. Its value at q 2 = 0 gives the anomalous magnetic
moment and can be evaluated explicitly as,
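\[
F_2(0) = \frac{\alpha}{2\pi}\int_0^1 dx\, dy\, dz\, \delta(1 - x - y - z)\, \frac{2z}{1-z} = \frac{\alpha}{\pi}\int_0^1 dz\int_0^{1-z} dy\, \frac{z}{1-z} = \frac{\alpha}{2\pi} \quad\Rightarrow\quad \frac{g-2}{2} = \frac{\alpha}{2\pi} \qquad \text{One of the triumphs of QED!}
\tag{17.18}
\]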
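The Feynman-parameter integral for $F_2(0)$ can be evaluated directly. The sketch below is an illustration, not the notes' method: it does the $x$ integral with the delta function and the remaining $y$, $z$ integrals with a midpoint rule, and it assumes the value $\alpha \approx 1/137.036$, which is not quoted in the text.

```python
import math

ALPHA = 1.0 / 137.035999  # fine-structure constant (assumed numerical value)


def f2_at_zero(alpha: float = ALPHA, steps: int = 400) -> float:
    """(alpha / 2 pi) * int_0^1 dz int_0^{1-z} dy 2z/(1-z), midpoint rule in both variables."""
    hz = 1.0 / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * hz
        hy = (1.0 - z) / steps
        inner = 0.0
        for _ in range(steps):
            inner += 2.0 * z / (1.0 - z) * hy  # integrand does not depend on y
        total += inner * hz
    return alpha / (2.0 * math.pi) * total


if __name__ == "__main__":
    print(f2_at_zero(), ALPHA / (2.0 * math.pi))
```

The $y$ integration just supplies a factor $(1 - z)$ that cancels the apparent singularity at $z \to 1$, leaving $\int_0^1 dz\, 2z = 1$.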
The renormalized $F_1(q^2)$ is obtained as,
\[
F_{1,ren}(q^2) = 1 + \frac{\alpha}{2\pi}\int_0^1 dx\, dy\, dz\, \delta(1 - x - y - z)\left[\ln\left(\frac{m^2(1-z)^2}{m^2(1-z)^2 + q^2xy}\right) + \frac{m^2(1 - 4z + z^2) - q^2(1-x)(1-y)}{m^2(1-z)^2 + q^2xy + \mu^2z} - \frac{m^2(1 - 4z + z^2)}{m^2(1-z)^2 + \mu^2z}\right]
\tag{17.19}
\]
We have to face the divergences now. All three radiative corrections have UV divergence while the vertex function has IR divergence as well. For dealing with the IR divergence, we need to look at another process called “Bremsstrahlung” (braking radiation in German).
D. Bremsstrahlung Cross-section to o(α)
We know from the classical electrodynamics that accelerated charge radiates. A quantum
mechanical view of such a radiation is depicted in the diagrams below.
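[Figure: bremsstrahlung diagrams, with a photon of momentum k emitted from the electron line before and after the scattering, + emission of any number of photons.]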
The diagrams depict a scattering process which causes the top electron to ‘accelerate’ and
‘emit’ a (on-shell) photon. This is the QFT view of ‘radiation from an accelerated charge’.
The bottom charged line represents some heavy fermion/boson/source, while the connecting
wavy line denotes mediation of the interaction. The first diagram suggests photon emission
before the kick while the second diagram suggests photon emission after the kick. As per the rules of the relativistic QFT, both contributions are to be added and mod-squared to get to the cross-section. To the leading order - single photon emission - the cross-section is order α. This cross-section also has an IR divergence as k → 0. An on-shell photon with arbitrarily small momentum is called a ‘soft photon’. To isolate the divergence, focus on the following diagrams:
[Figure: two diagrams for single soft-photon emission. The photon k is emitted either from the incoming electron line, which then enters the blob with momentum p − k, or from the outgoing line, which leaves the blob with momentum p′ + k.]
We are computing the amplitude for the process $e \to e + \gamma$. Let $\mathcal{M}_0(k', l')$ denote the part of the amplitude of electron interacting with the external source; it is depicted as the blob. Here $k'$, $l'$ denote the out-going and in-coming momenta respectively. The full amplitude is then given by (polarizations $\varepsilon(\vec{k}, \lambda)$ taken to be real),
\[
i\mathcal{M}(p', p, k) = (ie)\,\bar{u}(p')\left[\mathcal{M}_0(p', p - k)\, \frac{-i(-(\slashed{p} - \slashed{k}) + m)}{(p - k)^2 + m^2 - i\epsilon}\, \gamma^\mu\varepsilon_\mu + \varepsilon_\mu\gamma^\mu\, \frac{-i(-(\slashed{p}' + \slashed{k}) + m)}{(p' + k)^2 + m^2 - i\epsilon}\, \mathcal{M}_0(p' + k, p)\right]u(p)
\]
To compare with the classical radiation formula, consider the limit wherein the external source interaction approximates as, $\mathcal{M}_0(p', p - k) \approx \mathcal{M}_0(p' + k, p) \approx \mathcal{M}_0(p', p)$. Next, since the photon is soft, we neglect the $\slashed{k}$ in the fermion numerators. The denominators simplify as, $(p - k)^2 + m^2 = -m^2 - 2p\cdot k + 0 + m^2 = -2p\cdot k$ and likewise, $(p' + k)^2 + m^2 = +2p'\cdot k$. Furthermore, $(-\slashed{p} + m)\gamma^\mu u(p) \to 2p^\mu u(p)$, and $\bar{u}(p')\gamma^\mu(-\slashed{p}' + m) \to \bar{u}(p')\, 2(p')^\mu$. Thus the amplitude takes the form,
\[
i\mathcal{M}(p', p, k) = e\, \underbrace{[\bar{u}(p')\mathcal{M}_0(p', p)u(p)]}_{\text{elastic scattering amplitude}} \times \underbrace{\left[-\frac{\varepsilon\cdot p}{p\cdot k} + \frac{\varepsilon\cdot p'}{p'\cdot k}\right]}_{\text{extra } k \text{ dependent factor}}
\]
The total, unpolarized cross-section is obtained as,
\[
d\sigma(p \to p' + k) = d\sigma(p \to p')\int\frac{d^3k}{2|k|(2\pi)^3}\sum_{\lambda=1,2} e^2\left|\frac{\varepsilon\cdot p'}{p'\cdot k} - \frac{\varepsilon\cdot p}{p\cdot k}\right|^2
\tag{17.20}
\]
The integration is over soft photons i.e. $|\vec{k}| < k_{max}$. Notice that the integral is dimensionless. The integrand is the probability density of radiating a photon of momentum $k$ within $d^3k$ and any transverse polarization, accompanying the electron scattering with $p \to p'$ (“acceleration kick”, $p' \neq p + k$) while the integral is the total probability of emission of a soft photon with energy up to $k_{max}$.
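Multiplying the integrand by the photon energy, $|\vec{k}|$, gives the expectation value of the energy carried away by soft photons up to energy $k_{max}$,
\[
E_{soft}(k_{max}) := \langle E\rangle = \int\frac{d^3k}{(2\pi)^3}\sum_\lambda\frac{e^2}{2}\left|\vec{\varepsilon}\cdot\left(\frac{\vec{p}\,'}{p'\cdot k} - \frac{\vec{p}}{p\cdot k}\right)\right|^2
\]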
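The polarization sum in the cross-section can be done as follows. Noting that $k\cdot\left(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\right) = 0$, we can drop the $k\tilde{k}$ terms in the completeness relation for the polarizations, effectively taking $\Sigma_\lambda\varepsilon_\mu\varepsilon_\nu = \eta_{\mu\nu}$. Hence,
\[
\sum_\lambda\left(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\right)^\mu\varepsilon_\mu\left(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\right)^\nu\varepsilon_\nu = \left(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\right)\cdot\left(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\right) ,
\]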
and the expectation value of energy carried by soft photons is given by,
\[
E_{soft}(k_{max}) = \int\frac{d^3k}{(2\pi)^3}\,\frac{e^2}{2}\left[-\frac{m^2}{(p'\cdot k)^2} - \frac{m^2}{(p\cdot k)^2} - 2\,\frac{p\cdot p'}{(p\cdot k)(p'\cdot k)}\right]
\tag{17.21}
\]
Note that the integral represents average radiated energy inferred from the quantum mechanical probability distribution for emission of a single soft photon with $|\vec{k}| < k_{max}$.
An Aside: Incidentally, the integrand in the r.h.s. is also the classical formula for energy carried away by the $k^{th}$ Fourier mode of the electromagnetic field. To see this, consider a classical trajectory $z^\mu(\tau)$ whose velocity equals $p^\mu/m$ for $\tau < 0$ and $(p')^\mu/m$ for $\tau > 0$. The corresponding current is,
\[
\begin{aligned}
J^\mu(x) &= \int_{-\infty}^0 d\tau\, \frac{p^\mu}{m}\,\delta^4(x - z(\tau)) + \int_0^\infty d\tau\, \frac{(p')^\mu}{m}\,\delta^4(x - z(\tau)) \quad\Rightarrow \\
A^\mu(\vec{k}) &= -\frac{e}{|\vec{k}|}\left(\frac{(p')^\mu}{p'\cdot k} - \frac{p^\mu}{p\cdot k}\right) \qquad \text{(using the Retarded Green function.)}
\end{aligned}
\]
The energy of the above radiation field is precisely as given by the integrand in the r.h.s. of eq.(17.21) [13]. The integral now represents the total energy carried by Fourier modes with momenta $|\vec{k}| < k_{max}$.
The integral, $E_{soft}(k_{max})$, thus has two different interpretations.
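Returning to the evaluation of the integral in eq.(17.21), let us choose a frame in which the initial and final energies of the electron are the same: $E' = E \Rightarrow (p')^0 = p^0$ and we write $p = E(1, \vec{v})$, $p' = E(1, \vec{v}\,')$, $k = (k := |\vec{k}|, \vec{k})$. The mass-shell condition $p^2 = -m^2$ is then equivalent to $m^2/E^2 = 1 - \vec{v}^2$. With this choice, $p\cdot k = Ek(-1 + \hat{k}\cdot\vec{v})$, $p'\cdot k = Ek(-1 + \hat{k}\cdot\vec{v}\,')$ and $p\cdot p' = E^2(-1 + \vec{v}\cdot\vec{v}\,')$. The $k^2$ from the $d^3k$ cancels and the integrated emitted energy takes the form,
\[
\begin{aligned}
E_{soft}(k_{max}) &= -\frac{e^2}{2}\int dk\int\frac{d\Omega}{8\pi^3}\left[\frac{m^2}{E^2}\left(\frac{1}{(1 - \hat{k}\cdot\vec{v})^2} + \frac{1}{(1 - \hat{k}\cdot\vec{v}\,')^2}\right) - \frac{2(1 - \vec{v}\cdot\vec{v}\,')}{(1 - \hat{k}\cdot\vec{v})(1 - \hat{k}\cdot\vec{v}\,')}\right] \\
I(\vec{v}, \vec{v}\,') &:= \int\frac{d\Omega_{\hat{k}}}{4\pi}\left[\frac{2(1 - \vec{v}\cdot\vec{v}\,')}{(1 - \hat{k}\cdot\vec{v})(1 - \hat{k}\cdot\vec{v}\,')} - \frac{m^2}{E^2}\left(\frac{1}{(1 - \hat{k}\cdot\vec{v})^2} + \frac{1}{(1 - \hat{k}\cdot\vec{v}\,')^2}\right)\right] \quad\Rightarrow \\
E_{soft}(k_{max}) &= \frac{e^2}{4\pi^2}\int dk\, I(\vec{v}, \vec{v}\,') \approx \frac{\alpha}{\pi}\, k_{max}\, I(\vec{v}, \vec{v}\,')
\end{aligned}
\]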
The integral is actually divergent at the upper end. It has been cut off at $k_{max}$ which is provided by the inverse of the duration over which the electron receives the kick. This is given by $k_{max} \sim |\vec{p} - \vec{p}\,'| := |\vec{q}|$. This is a physical cutoff.
The angular integration receives the maximum contribution from when $\hat{k}$ is parallel to $\vec{v}$ or $\vec{v}\,'$ depending upon if it is the initial or the final state bremsstrahlung. Given the directions $\hat{v}$, $\hat{v}'$, the $\hat{k}$ varies between these two directions (smaller angle). Thus, to pick up the contribution when $\hat{k}$ is parallel to $\vec{v}$, we define $\cos\theta$ by $\hat{k}\cdot\vec{v} = |\vec{v}|\cos\theta$ which varies between $|\vec{v}|\cos\theta = \vec{v}\cdot\vec{v}\,'$ and $\cos\theta = 1$. Similarly for the contribution from nearly parallel to $\vec{v}\,'$, we define $\cos\theta$ by $\hat{k}\cdot\vec{v}\,' = |\vec{v}\,'|\cos\theta$ which varies from $|\vec{v}\,'|\cos\theta = \vec{v}\cdot\vec{v}\,'$ to $\cos\theta = 1$.
Additionally, in the extreme relativistic limit where we can neglect the $m^2/E^2$ terms, we can approximate $(1 - \hat{k}\cdot\vec{v})(1 - \hat{k}\cdot\vec{v}\,') \approx (1 - |\vec{v}|\cos\theta)(1 - \vec{v}\cdot\vec{v}\,')$ or $\approx (1 - |\vec{v}\,'|\cos\theta)(1 - \vec{v}\cdot\vec{v}\,')$. The angular integral then approximates as,
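\[
\begin{aligned}
I(\vec{v}, \vec{v}\,') &\approx \int_{|\vec{v}|\cos\theta = \vec{v}\cdot\vec{v}\,'}^{1} d\cos\theta\, \frac{1 - \vec{v}\cdot\vec{v}\,'}{(1 - v\cos\theta)(1 - \vec{v}\cdot\vec{v}\,')} + \int_{|\vec{v}\,'|\cos\theta = \vec{v}\cdot\vec{v}\,'}^{1} d\cos\theta\, \frac{1 - \vec{v}\cdot\vec{v}\,'}{(1 - v'\cos\theta)(1 - \vec{v}\cdot\vec{v}\,')} \\
&\simeq \ln\left(\frac{1 - \vec{v}\cdot\vec{v}\,'}{1 - |\vec{v}|}\right) + \ln\left(\frac{1 - \vec{v}\cdot\vec{v}\,'}{1 - |\vec{v}\,'|}\right) \approx \ln\left(\frac{(p\cdot p')^2}{E^2(E - |\vec{p}|)(E - |\vec{p}\,'|)}\right) \\
&\simeq 2\ln\left(\frac{q^2}{m^2}\right) \qquad \text{where } q := p' - p
\end{aligned}
\]
We have used: $(1 - \vec{v}\cdot\vec{v}\,') = -p\cdot p'/E^2$, $(1 - |\vec{v}|) = (E - |\vec{p}|)/E$, $(E - |\vec{p}|)(E - |\vec{p}\,'|) \approx (E - |\vec{p}|)^2$ and $E(E - |\vec{p}|) \approx E^2 - \vec{p}^2 = m^2$ in getting to the last equation.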
Thus the radiated energy in the soft modes is given by,
\[
E_{soft}(k_{max}) = \frac{\alpha}{\pi}\int_0^{k_{max}} dk\, I(\vec{v}, \vec{v}\,') \xrightarrow{\ E \gg m\ } \frac{2\alpha}{\pi}\int_0^{k_{max}} dk\, \ln(q^2/m^2)\ .
\tag{17.22}
\]
Interpreting the above energy classically, what would be the total number of photons emitted, $N_\gamma$? Notice that $I(\vec{v}, \vec{v}\,')dk$ is the contribution to the radiated energy from the Fourier modes with energy $|\vec{k}|$. The equivalent number of photons would be $\frac{I(\vec{v}, \vec{v}\,')dk}{|\vec{k}|}$ and the total number $N_\gamma$ is obtained by integrating over the energies:
\[
N_\gamma = \frac{\alpha}{\pi}\int_0^{k_{max}}\frac{dk}{k}\, I(\vec{v}, \vec{v}\,') \approx \frac{2\alpha}{\pi}\int_0^{k_{max}}\frac{dk}{k}\, \ln(q^2/m^2)
\]
This is clearly divergent from the lower limit.
But this is the same expression for the total quantum mechanical probability for emission
of a soft photon up to energy kmax and thus is divergent.
This is the IR divergence of the QED cross-section for the bremsstrahlung process.
If the photon is given a small mass $\mu$, then the $\int dk/k$ will give $\ln((|q| \simeq k_{max})/\mu)$ while $I(\vec{v}, \vec{v}\,')$ gives the $\ln(q^2/m^2)$. Hence,
\[
d\sigma(p \to p' + k) = d\sigma(p \to p')\cdot\frac{\alpha}{\pi}\underbrace{\ln(q^2/\mu^2)\cdot\ln(q^2/m^2)}_{\text{Sudakov double log}}
\tag{17.23}
\]
18. TREATMENT OF DIVERGENCES:
At the first non-trivial attempt at computing radiative corrections, we encounter divergences of the UV type (from large loop momentum) and of IR type (from small loop
momentum in massless propagators). How do we understand the physical origin of these?
How do we adjust the computational procedure so as to make unambiguous predictions to
be confronted with observations? Let us recapitulate what we have got.
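• Quite generally, using only Poincare covariance and assumption about the possible spectrum of a theory with mass gap, the Kallen-Lehmann representation gave us,
\[
\begin{aligned}
\Delta'_F(p) &= Z\,\Delta_0(p, m_{ph}) + \int_{m_{th}^2}^\infty d\sigma^2\, \rho(\sigma^2)\,\Delta_0(p, \sigma)\ , \qquad 0 < Z < 1 \\
\Delta_0(p, \sigma) &:= \frac{-i}{p^2 + \sigma^2 - i\epsilon} \qquad \text{(Free propagator of mass } \sigma) \\
S'_F(p) &= Z_2\, S_0(p, m_{ph}) + \int_{m_{th}^2}^\infty d\sigma^2\, \frac{-i\{\rho_1(\sigma^2)\slashed{p} + \rho_2(\sigma^2)\}}{p^2 + \sigma^2 - i\epsilon} \\
S_0(p, \sigma) &:= -i\,\frac{(-\slashed{p} + \sigma)}{p^2 + \sigma^2 - i\epsilon} \qquad \text{(Free propagator of mass } \sigma) \\
(D'_F)_{\mu\nu}(q) &= \left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)\left[\frac{Z_3}{q^2 - i\epsilon} + \int_0^\infty d\sigma^2\, \frac{\Pi(\sigma^2)}{q^2 + \sigma^2 - i\epsilon}\right]
\end{aligned}
\]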
The Z’s are the field (or wavefunction) renormalization constants. They are determined as
the residues at the (isolated) pole at the physical mass, mph . This is without any perturbation
series.
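In perturbation series though we obtained (for fermion and photon),
\[
S'_F(p) = \frac{1}{\slashed{p} + m - \Sigma(\slashed{p})} \qquad \& \qquad D'_F(q) = \left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)\frac{1}{q^2}\,\frac{1}{1 - \Pi(q^2)}
\]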
This is consistent with the general expectation that the physical masses are determined by the poles in the exact propagator while the corresponding residues determine the Z’s. Thus,
\[
\delta m := m_{ph} - m = -\Sigma(\slashed{p} = -m_{ph})\ , \qquad Z_2^{-1} = 1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\Big|_{\slashed{p} = -m_{ph}}\ , \qquad Z_3^{-1} = 1 - \Pi(0) ,
\]
and of course the photon physical mass is zero.
• The vertex function for on shell fermion masses takes the form,
\[
\Gamma^\mu(p, p') := \gamma^\mu F_1(q^2) + i\sigma^{\mu\nu}q_\nu F_2(q^2)\ , \qquad \sigma_{\mu\nu} := \frac{i}{2}[\gamma_\mu, \gamma_\nu]\ , \qquad \slashed{p} = -m_{ph} = \slashed{p}'\ .
\]
This decomposition is understood to be sandwiched between $\bar{u}(p')$ and $u(p)$. Since $m_{ph} = m + \delta m = m + o(\alpha)$, for the first order corrections, we can take $m_{ph} = m$. Here are the expressions we obtained:
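(I)
\[
\Sigma(p) = -\frac{\alpha}{2\pi}\int_0^1 dx\, (2m + x\slashed{p})\ln\left[\frac{x\Lambda^2}{x(1-x)p^2 + (1-x)m^2 + x\mu^2 - i\epsilon}\right]
\tag{18.1}
\]
\[
\delta m = \frac{\alpha}{2\pi}\int_0^1 dx\, (2 - x)\ln\left[\frac{x\Lambda^2}{(1-x)^2m^2 + x\mu^2}\right] ,
\tag{18.2}
\]
\[
Z_2 = 1 - \frac{\alpha}{4\pi}\ln(\Lambda^2/m^2)
\tag{18.3}
\]
(II)
\[
\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\left[\frac{2}{\epsilon} - \gamma - \ln\left(\frac{m^2 + q^2x(1-x)}{\mu_0^2}\right)\right]
\tag{18.4}
\]
\[
\Pi(0) = -\frac{2\alpha}{3\pi}\frac{1}{\epsilon} + o(\epsilon^0) \quad\Rightarrow\quad Z_3 = 1 - \frac{2\alpha}{3\pi}\frac{1}{\epsilon}
\tag{18.5}
\]
(III)
\[
F_1(q^2) = 1 + \frac{\alpha}{2\pi}\int dx\, dy\, dz\, \delta(1 - x - y - z)\left[\ln\left(\frac{z\Lambda^2}{M^2}\right) + \frac{-(1-x)(1-y)q^2 + (1 - 4z + z^2)m^2}{M^2}\right]
\tag{18.6}
\]
\[
F_2(q^2) = \frac{\alpha}{2\pi}\int dx\, dy\, dz\, \delta(1 - x - y - z)\, \frac{2m^2z(1-z)}{M^2} ,
\tag{18.7}
\]
\[
M^2 := q^2xy + m^2(1-z)^2 + \mu^2z
\tag{18.8}
\]
\[
\therefore\ F_1(0) = 1 + \frac{\alpha}{2\pi}\int dx\, dy\, dz\, \delta(1 - x - y - z)\left[\ln\left(\frac{z\Lambda^2}{(1-z)^2m^2}\right) + \frac{1 - 4z + z^2}{(1-z)^2}\right]
\tag{18.9}
\]
\[
F_2(0) = \frac{\alpha}{2\pi}\int dx\, dy\, dz\, \delta(1 - x - y - z)\, \frac{2z}{1-z} = \frac{\alpha}{2\pi} =: \frac{g-2}{2}
\tag{18.10}
\]
(IV)
\[
d\sigma_{p\to p'+k} = d\sigma_{p\to p'}\int\frac{d^3k}{2k(2\pi)^3}\sum_\lambda e^2\left|\frac{\varepsilon\cdot p'}{p'\cdot k} - \frac{\varepsilon\cdot p}{p\cdot k}\right|^2
\tag{18.11}
\]
\[
= d\sigma_{p\to p'}\,\frac{e^2}{4\pi^2}\int\frac{dk}{k}\, I(\vec{v}, \vec{v}\,') \qquad \text{where,}
\tag{18.12}
\]
\[
I(\vec{v}, \vec{v}\,') = \int\frac{d\Omega}{4\pi}\left\{\frac{2(1 - \vec{v}\cdot\vec{v}\,')}{(1 - \hat{k}\cdot\vec{v})(1 - \hat{k}\cdot\vec{v}\,')} - \frac{m^2/E^2}{(1 - \hat{k}\cdot\vec{v})^2} - \frac{m^2/E^2}{(1 - \hat{k}\cdot\vec{v}\,')^2}\right\}
\tag{18.13}
\]
where $p^\mu = (E, E\vec{v})$, $(p')^\mu = (E, E\vec{v}\,')$, $k^\mu = (|\vec{k}|, \vec{k})$ is used.
\[
\text{For } |\vec{v}| \approx |\vec{v}\,'| \approx 1\ , \qquad I(\vec{v}, \vec{v}\,') \approx 2\ln(\vec{q}^{\,2}/m^2)\ , \quad q := p' - p \quad\Rightarrow
\tag{18.14}
\]
\[
d\sigma_{p\to p'+k} = d\sigma_{p\to p'}\left[\frac{\alpha}{\pi}\ln(q^2/\mu^2)\ln(q^2/m^2)\right]
\tag{18.15}
\]
A. Treatment of the IR divergences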
Let us consider the IR problem first. Since $F_1$ has both the UV and the IR divergence, we will separate these by using the UV renormalized $F_{1,ren}$ which was defined through, $\delta F_{1,ren}(q^2) := \delta F_1(q^2) - \delta F_1(0)$. This guarantees that the renormalized $F_1$ satisfies the condition $F_1(0) = 1$. Using the expressions above, we see that,
\[
\delta F_{1,ren}(q^2) = \frac{\alpha}{2\pi}\int dx\, dy\, dz\, \delta(1 - x - y - z)\left[\frac{-(1-x)(1-y)q^2 + (1 - 4z + z^2)m^2}{M^2(q^2)} - \frac{(1 - 4z + z^2)m^2}{(1-z)^2m^2 + z\mu^2}\right]
\]
The IR divergence comes from $z \to 1 \leftrightarrow x \sim y \sim 0 \leftrightarrow M(q^2, x, y)$ would vanish but for the photon mass $\mu^2$. To isolate the divergence, it suffices to take $x = y = 0$, $z = 1$ in the numerator. Doing the $x$ integration using the delta function gives $x = 1 - y - z > 0 \Rightarrow y < 1 - z$ and leads to,
\[
\delta F_{1,ren} \approx \frac{\alpha}{2\pi}\int_0^1 dz\int_0^{1-z} dy\left[\frac{-2m^2 - q^2}{(1-z)^2m^2 + y(1 - y - z)q^2 + \mu^2} + \frac{2m^2}{(1-z)^2m^2 + \mu^2}\right]
\]
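Substituting $y := (1-z)u$, $v := 1 - z$, the leading contribution in the limit $\mu \to 0$ is expressed as [13],
\[
F_{1,ren}(q^2) \approx 1 - \frac{\alpha}{2\pi}\, f_{IR}(q^2)\, \ln\left(\frac{m^2 \text{ or } q^2}{\mu^2}\right) \qquad \text{where,}
\tag{18.16}
\]
\[
f_{IR}(q^2) = \int_0^1 du\left(\frac{m^2 + q^2/2}{m^2 + u(1-u)q^2}\right) - 1
\tag{18.17}
\]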
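The behaviour of $f_{IR}(q^2)$ can be explored numerically. In the sketch below the closed form is obtained here by partial fractions and is not taken from the notes, so it is compared against a direct midpoint evaluation of (18.17); the function names are hypothetical.

```python
import math


def f_ir_numeric(q2: float, m2: float, steps: int = 20000) -> float:
    """Midpoint-rule evaluation of eq. (18.17)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += (m2 + q2 / 2.0) / (m2 + u * (1.0 - u) * q2)
    return total * h - 1.0


def f_ir_closed(q2: float, m2: float) -> float:
    """Closed form of eq. (18.17) for q2 > 0 (derived here by partial fractions, with a = q2/m2)."""
    a = q2 / m2
    root = math.sqrt(a * (a + 4.0))
    log = math.log((math.sqrt(a + 4.0) + math.sqrt(a)) / (math.sqrt(a + 4.0) - math.sqrt(a)))
    return (1.0 + a / 2.0) * 2.0 * log / root - 1.0


if __name__ == "__main__":
    for a in (1e-3, 1.0, 10.0, 1e4, 1e8):
        print(a, f_ir_closed(a, 1.0), math.log(a))
```

For small $q^2$ the function vanishes, consistent with $F_{1,ren}(0) = 1$, while for $q^2 \gg m^2$ it approaches $\ln(q^2/m^2)$ up to a constant, which is the leading-log behaviour used for the Sudakov double logarithm.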
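Since $F_1$ is the coefficient of the $\gamma^\mu$ term, we can replace $e \to eF_{1,ren}$ in the electron scattering off a classical potential. The cross-section is then given by,
\[
\frac{d\sigma}{d\Omega} \simeq \left(\frac{d\sigma}{d\Omega}\right)_{tree}\left[1 - \frac{\alpha}{\pi}\, f_{IR}(q^2)\ln\left(\frac{m^2 \text{ or } q^2}{\mu^2}\right)\right] .
\]
Notice that $e \to eF_1 \Rightarrow e^2 \to e^2F_1^2 \Rightarrow \alpha/(2\pi) \to \alpha/\pi$.
Not only is this divergent as $\mu \to 0$, for non-zero $\mu$ it is actually negative implying negative cross-section! In the limit of $q^2 \to \infty$ ($q^2$ is space-like and hence positive), $f_{IR}(q^2) \to \ln(q^2/m^2) \Rightarrow F_{1,ren}(q^2) \simeq 1 - \frac{\alpha}{2\pi}\ln(q^2/m^2)\ln(q^2/\mu^2)$ and hence,
\[
d\sigma(p \to p') \simeq d\sigma_{tree}(p \to p')\left[1 - \frac{\alpha}{\pi}\ln(q^2/m^2)\ln(q^2/\mu^2)\right] , \qquad q^2 \to \infty\ ,\ \mu^2 \to 0.
\tag{18.18}
\]
The Bremsstrahlung cross-section on the other hand is,
\[
d\sigma(p \to p' + k) = d\sigma(p \to p')\,\frac{\alpha}{\pi}\ln\frac{q^2}{m^2}\ln\frac{q^2}{\mu^2} = d\sigma(p \to p')_{tree}\,\frac{\alpha}{\pi}\ln\frac{q^2}{m^2}\ln\frac{q^2}{\mu^2}\ .
\tag{18.19}
\]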
Both the cross-sections above are IR divergent and both suffer from ambiguity from contamination of soft photons.
We already noted that detectors are unable to distinguish a charged particle accompanied
by soft photons below detector sensitivity. What is the appropriate theoretically computed
quantity which reflects this limitation of the detection process?
When an experimenter reports a detection of a scattered electron, he/she is actually giving an estimate of the probability that an electron, $e(p')$, is detected and a photon is not detected. This probability is the probability for a process with no emitted photon plus the probability that there are accompanying soft photons with energies below the detection threshold, $\varepsilon_{th}$. In equation,
\[
(d\sigma)_{measured} = d\sigma(p \to p') + d\sigma(p \to p' + k\ ;\ |\vec{k}| < \varepsilon_{th})\ .
\tag{18.20}
\]
But this is precisely the sum of the two cross-sections given in eqns.(18.18,18.19). We see
that the leading contribution as q 2 → ∞, µ2 → 0 is exactly canceled out in the observed
cross-section!
Note: For a general $q^2$, the sum of the two cross-sections is given by [13],
\[
(d\sigma)_{measured} = d\sigma_{tree}(p \to p')\left[1 - \frac{\alpha}{\pi}\, f_{IR}(q^2)\ln\left(\frac{q^2 \text{ or } m^2}{\mu^2}\right) + \frac{\alpha}{2\pi}\, I(\vec{v}, \vec{v}\,')\ln(\varepsilon_{th}^2/\mu^2)\right]
\]
It turns out that without the limit $q^2 \to \infty$, it still holds that $I(\vec{v}, \vec{v}\,') \to 2f_{IR}$ [13]. For general $q^2$, the coefficient of $f_{IR}$ is $\ln(q^2)$ or $\ln(m^2)$. For large $q^2$ we can of course drop $m^2$. Furthermore, experimentally it is easier to track the behavior of the cross-section as a function of $q^2$, so taking $q^2 \gg m^2$, we write the unambiguous and measurable prediction as,
\[
\left(\frac{d\sigma}{d\Omega}\right)_{measured} = \left(\frac{d\sigma}{d\Omega}\right)_{tree}(p \to p')\left[1 - \frac{\alpha}{\pi}\ln\frac{q^2}{m^2}\ln\frac{q^2}{\varepsilon_{th}^2} + o(\alpha^2)\right] , \qquad q^2 \gg m^2
\tag{18.21}
\]
Appreciate that we had parametrised the IR divergence in terms of the photon mass $\mu^2$; we used the renormalized $F_{1,ren}(q^2)$ to separate the IR from the UV and finally, identified the quantity which is actually reported by experiments, which is the sum of the two cross-sections.
This has been a demonstration at 1-loop. It is a non-trivial result of a great deal of work that the basic mechanism of cancellation works to all orders in α. There are other types of divergences analogous to the IR divergences, the so called “mass singularities”, which do not cancel but can be factorised in a convenient form and then eliminated from the observed cross-sections. See the book by G. Sterman [14].
UV divergences are a different ball game and need a different procedure.
B. Treatment of the UV divergences
All the three radiative corrections in $\Sigma$, $\Pi$ and $F_1$ have the UV divergences parametrised in terms of the Pauli-Villars cut-off $\Lambda$ or the dimensional regularization $\epsilon^{-1}$ pole. Note that the bremsstrahlung process does not have a UV divergence as there is a physical cut-off for the energy of the “soft photons”. We need to pay attention to the Z factors now.
Recall that the S-matrix element definition, via the LSZ reduction procedure, had a factor of $\frac{1}{\sqrt{Z}}$ for each external line. There was also an amputation of the external line, effected by the equation of motion operator, e.g. $\Box - m^2$, acting on the Feynman propagator $\Delta_F$ for the external line. The $\sqrt{Z}$ entered from the asymptotic condition relating the interacting field to the in/out fields. The in/out fields satisfy their respective field equations with physical masses and also have the normalization factors with $\omega_k$ which also contain the physical masses. In the momentum space, these factors associated with external lines are of the form $\frac{1}{\sqrt{Z}}(p^2 + m^2)\Delta'_F(p)$ where the $\Delta'_F$ contains the self-energy giving it the form $\Delta'_F(p) = (p^2 + m^2 - \Pi(p^2))^{-1}$ (for scalars).
In perturbative computation of the self-energy however, the $m$ is the mass parameter in the $\mathcal{L}$ which is not the physical mass. In fact from the Kallen-Lehmann representation, we know that the physical mass is determined from $p^2 + m^2 - \Pi(p^2)|_{p^2 = -m_{ph}^2} = 0 \leftrightarrow -m_{ph}^2 + m^2 = \Pi(-m_{ph}^2)$. Expanding $\Pi(p^2)$ about $-m_{ph}^2$ gives us,
\[
\begin{aligned}
\Pi(p^2) &= \Pi(-m_{ph}^2) + (p^2 + m_{ph}^2)\,\Pi'(-m_{ph}^2) + \dots \quad\Rightarrow \\
(\Delta'_F)^{-1}(p) &= p^2 + m^2 - \Pi(-m_{ph}^2) - (p^2 + m_{ph}^2)\,\Pi'(-m_{ph}^2) + \dots \\
&= (p^2 + m_{ph}^2)\left[1 - \Pi'(-m_{ph}^2)\right] + \dots \\
&= (p^2 + m_{ph}^2)\, Z_\Phi^{-1} + \dots \\
\therefore\ \frac{1}{\sqrt{Z_\Phi}}(p^2 + m_{ph}^2)\,\Delta'_F(p) &= \sqrt{Z_\Phi}\left[1 + o\big((p^2 + m_{ph}^2)^2\big)\right]
\end{aligned}
\]
Since we evaluate the S-matrix elements on shell, all the higher order terms vanish. The net result is that:
In an S-matrix element, for each external line introduce a factor of $\sqrt{Z_{field}}\ \times$ appropriately normalized wavefunction and now do not include self-energy corrections on the external lines.
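Consider the electron scattering off external potential. The corresponding invariant amplitude is given by $e(\sqrt{Z_2})^2\,\bar{u}(p')\Gamma^\mu(p', p)u(p)$ with $\Gamma^\mu = \gamma^\mu F_1 + (\dots)F_2$. The $F_2$ has no divergences while $Z_2$, $F_1$ both are divergent. Thus we write, $Z_2 = 1 + \delta Z_2$, $F_1 = 1 + \delta F_1$ with the $\delta$’s representing the $o(\alpha)$ divergent corrections. Thus we write,
\[
\begin{aligned}
\mathcal{M}_{e\to e}(p', p) &= e(1 + \delta Z_2)\left[\bar{u}(p')\gamma^\mu u(p)\,(1 + \delta F_1(q^2)) + \bar{u}(p')\frac{i\sigma^{\mu\nu}q_\nu}{2m_{ph}}u(p)\,\delta F_2(q^2)\right] \\
&= e\left[\bar{u}(p')\gamma^\mu u(p)\right]\{1 + \delta F_1 + \delta Z_2\} + e\,\bar{u}(p')\frac{i\sigma^{\mu\nu}q_\nu}{2m_{ph}}u(p)\,\delta F_2(q^2) \qquad \text{But,} \\
\delta Z_2 &= \frac{\alpha}{2\pi}\int_0^1 dz\left[-z\ln\frac{z\Lambda^2}{M^2} + 2(2 - z)\frac{z(1-z)m^2}{M^2}\right] , \qquad M^2 = (1-z)^2m^2 + z\mu^2 \\
\delta F_1(0) &= \frac{\alpha}{2\pi}\int_0^1 dz\,(1-z)\left[\ln\frac{z\Lambda^2}{M^2} + (1 + z^2 - 4z)\frac{m^2}{M^2}\right] \\
\therefore\ \delta F_1(0) + \delta Z_2 &= \frac{\alpha}{2\pi}\int_0^1 dz\left[(1 - 2z)\ln\frac{z\Lambda^2}{M^2} + \frac{m^2}{M^2}\times\big\{(4 - 2z)(1-z)z + (1-z)(1 + z^2 - 4z)\big\}\right]
\end{aligned}
\]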
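Integrate the first term by parts,
\[
\begin{aligned}
\int_0^1 dz\,(1 - 2z)\ln\frac{z\Lambda^2}{M^2} &= (z - z^2)\ln\frac{z\Lambda^2}{M^2}\Big|_0^1 - \int_0^1 dz\,(z - z^2)\left[\frac{1}{z} - \frac{2(1-z)(-1)m^2 + \mu^2}{M^2}\right] \\
&= -\int_0^1 dz\,\frac{1-z}{M^2}\,\{M^2 + 2z(1-z)m^2 - z\mu^2\} \\
&= -\int_0^1 dz\,\frac{1-z}{M^2}\,\{m^2(1-z)(1+z)\} = -\int_0^1 dz\,\frac{m^2(1-z)^2(1+z)}{m^2(1-z)^2 + z\mu^2} \\
\therefore\ \delta F_1(0) + \delta Z_2 &= \frac{\alpha}{2\pi}\int_0^1 dz\left[-\frac{m^2(1-z)(1 - z^2)}{M^2} + \frac{m^2}{M^2}(1-z)\{4z - 2z^2 + 1 + z^2 - 4z\}\right] \\
&= \frac{\alpha}{2\pi}\int_0^1 dz\,\frac{m^2}{M^2}(1-z)(1 - z^2)\{-1 + 1\} = 0 \quad (!)
\end{aligned}
\]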
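The cancellation $\delta F_1(0) + \delta Z_2 = 0$ can also be seen numerically at finite cut-off and finite photon mass. The sketch below is illustrative only: it works in units of $\alpha/2\pi$, uses a plain midpoint rule, and the parameter values in the example are arbitrary choices.

```python
import math


def _m2_eff(z: float, m2: float, mu2: float) -> float:
    """M^2 = (1-z)^2 m^2 + z mu^2."""
    return (1.0 - z) ** 2 * m2 + z * mu2


def _midpoint(f, steps: int) -> float:
    """Midpoint rule on (0, 1); the log singularity at z = 0 is integrable."""
    h = 1.0 / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h


def delta_z2(m2: float, mu2: float, lam2: float, steps: int = 100000) -> float:
    """delta Z_2 in units of alpha/(2 pi)."""
    def integrand(z):
        M2 = _m2_eff(z, m2, mu2)
        return -z * math.log(z * lam2 / M2) + 2.0 * (2.0 - z) * z * (1.0 - z) * m2 / M2
    return _midpoint(integrand, steps)


def delta_f1_at_zero(m2: float, mu2: float, lam2: float, steps: int = 100000) -> float:
    """delta F_1(0) in units of alpha/(2 pi)."""
    def integrand(z):
        M2 = _m2_eff(z, m2, mu2)
        return (1.0 - z) * (math.log(z * lam2 / M2) + (1.0 + z * z - 4.0 * z) * m2 / M2)
    return _midpoint(integrand, steps)


if __name__ == "__main__":
    dz = delta_z2(1.0, 0.05, 1e4)
    df = delta_f1_at_zero(1.0, 0.05, 1e4)
    print(dz, df, dz + df)
```

Each piece separately grows like $\mp\frac{1}{2}\ln\Lambda^2$ when the cut-off is raised, while the sum stays at zero within the accuracy of the quadrature.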
Thus not only has the $\ln\Lambda^2$ divergence canceled, we have got $\delta Z_2 = -\delta F_1(0)$ and thus the entire correction takes the form,
\[
\mathcal{M}_{e\to e}(p', p) = e\,\bar{u}(p')\gamma^\mu u(p)\Big[1 + \underbrace{\delta F_1(q^2) - \delta F_1(0)}_{\delta F_{1,ren}}\Big] + e\,\bar{u}(p')\frac{i\sigma^{\mu\nu}q_\nu}{2m_{ph}}u(p)\,\delta F_2(q^2)
\]
We recover the ad hoc prescription of using $\delta F_{1,ren}$ introduced earlier while discussing the IR divergences. Note that the subtracted term was crucial for the IR divergence cancellation. It was of course physically expected since $\delta F_1(q^2 = 0) = 0$ should actually hold to all orders. Note that the UV divergence in the vertex function, $\delta F_1(q^2)$, is canceled against the divergence in $\delta Z_2$ coming from electron self-energy.
Now we are left with the divergence in $Z_3$ from the $\Pi(q^2 = 0)$ for the photon self-energy. We still have the electron mass-shift $\delta m$ which is divergent.
1. 1-Loop Renormalization: Charge Screening and Lamb shift
Consider charged particle scattering by exchanging a photon, necessarily off-shell (space-like in fact). Replace the free propagator $(D_F)_{\mu\nu}(q^2)$ by the exact propagator which includes the photon self-energy, $(D'_F)_{\mu\nu}(q^2)$. The exact propagator has additional terms proportional to $q_\mu q_\nu$. But thanks to the Ward identity, these do not contribute to S-matrix elements and effectively, $(D'_F)_{\mu\nu}(q^2) = -i\eta_{\mu\nu}[q^2(1-\Pi(q^2))]^{-1}$. This has a pole at $q^2 = 0$ with residue $Z_3 = [1-\Pi(0)]^{-1}$. The replacement thus gives a factor of $Z_3$. The photon internal line connects two vertices contributing $e^2$. Thus, if we identify $e^2 Z_3 =: e^2_{ph}$, then all divergences coming from the photon self-energy can be neatly absorbed in the physical, measured charge which is finite. For the mass parameter too, we identified $m^2_{ph} = m^2 + \delta m$, which absorbs the divergence in $\delta m$ into the Lagrangian parameter $m$. The identification $e_{ph} := e\sqrt{Z_3}$ is called charge renormalization while $m^2_{ph} := m^2 + \delta m$ is called mass renormalization. For an external photon line, we will just have a $\sqrt{Z_3}$ factor and no self-energy corrections and of course no mass shift.
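Thus the exact photon propagator may be replaced by the free propagator by simultaneously replacing α by $\alpha_{eff}$ defined below.
\[
\alpha\,(D'_F)_{\mu\nu}(q^2) = (D_F)_{\mu\nu}(q^2)\,\frac{\alpha}{1-\Pi(q^2)} = \frac{\eta_{\mu\nu}}{q^2 - i\epsilon}\,\frac{\alpha_{ph}(1-\Pi(0))}{1-\Pi(q^2)} := \alpha_{eff}(q^2)\,\frac{\eta_{\mu\nu}}{q^2 - i\epsilon}
\]
with,
\[
\alpha_{eff}(q^2) = \frac{\alpha_{ph}}{1 - (\Pi(q^2) - \Pi(0))}
\]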
To appreciate the implication of the above procedure, recall the discussion of the effective
potential inferred from the tree level scattering for both the Yukawa and the QED coupling.
Let us use the same formula (for QED), but include the self-energy correction for the photon
propagator. The inferred potential is,
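\[
V(\vec x) = \int \frac{d^3q}{(2\pi)^3}\, e^{i\vec q\cdot\vec x}\, \frac{-e_{ph}^2}{\vec q^{\,2}\,(1-\Pi(q^2)+\Pi(0))} \tag{18.22}
\]
\[
\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\left[\frac{2}{\epsilon} - \gamma - \ln\left(\frac{m^2+q^2x(1-x)}{\mu_0^2}\right)\right] \tag{18.23}
\]
\[
\Pi(q^2)-\Pi(0) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\,\ln\left(\frac{m^2}{m^2+q^2x(1-x)}\right) \tag{18.24}
\]
\[
\xrightarrow{\ \vec q^{\,2}\,\ll\, m^2\ } \frac{2\alpha}{\pi}\int_0^1 dx\, x^2(1-x)^2\,\frac{q^2}{m^2} = \frac{\alpha}{15\pi}\frac{q^2}{m^2} \quad \text{(NR limit)} \tag{18.25}
\]
\[
\therefore\ V(\vec x) \approx \int \frac{d^3q}{(2\pi)^3}\, e^{i\vec q\cdot\vec x}\, \frac{-e_{ph}^2}{\vec q^{\,2}\left(1-\frac{\alpha}{15\pi}\frac{q^2}{m^2}\right)} \tag{18.26}
\]
\[
\approx -\frac{\alpha_{ph}}{r} - \underbrace{\frac{4\alpha_{ph}^2}{15m^2}\,\delta^3(\vec x)}_{\text{perturbation}} \tag{18.27}
\]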
The perturbed Coulomb potential induces a shift in the energy levels of the Coulomb potential, say in Hydrogen atom. The first order perturbation theory gives the shift as,
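\[
\Delta E = -\frac{4\alpha_{ph}^2}{15m^2}\,\frac{\alpha_{ph}^3 m^3}{8\pi} = -\frac{\alpha_{ph}^5\, m}{30\pi} \simeq -1.123\times 10^{-7}\ \text{eV} \quad \text{(Lamb Shift)} \tag{18.28}
\]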
This is the contribution of the photon self-energy to the famous Lamb shift. The photon
self-energy ↔ αef f is thus an observable and observed effect. More exact calculation may
be seen in [13].
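The numbers in (18.25) and (18.28) can be checked with a few lines of Python. This is only a sketch under stated assumptions: the values of α and of the electron mass are standard reference values that are not quoted in these notes, and the factor α³m³/(8π) in (18.28) is read here as the density |ψ(0)|² of the n = 2 s-state of Hydrogen. The function names are hypothetical.

```python
import math

# Assumed inputs (standard reference values, not taken from the notes)
ALPHA = 1 / 137.035999      # fine structure constant
M_E_EV = 510998.95          # electron mass in eV (natural units)


def nr_coefficient(n_steps: int = 10_000) -> float:
    """Midpoint-rule estimate of int_0^1 x^2 (1-x)^2 dx, expected to be 1/30."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        total += x * x * (1.0 - x) ** 2
    return total * h


def vacuum_polarization_shift(alpha: float = ALPHA, m: float = M_E_EV) -> float:
    """Energy shift of (18.28) in eV: -(4 alpha^2 / 15 m^2) * (alpha m)^3 / (8 pi)."""
    density_at_origin = (alpha * m) ** 3 / (8.0 * math.pi)  # assumed |psi_2s(0)|^2
    return -(4.0 * alpha ** 2 / (15.0 * m ** 2)) * density_at_origin


if __name__ == "__main__":
    print("int x^2(1-x)^2 dx =", nr_coefficient(), "vs 1/30 =", 1 / 30)
    print("Delta E [eV]      =", vacuum_polarization_shift())
    print("-alpha^5 m/(30pi) =", -ALPHA ** 5 * M_E_EV / (30 * math.pi))
```

The first check reproduces the factor 1/30 that turns (2α/π)∫x²(1−x)² into α/(15π); the second shows that the two forms of ∆E in (18.28) agree and are of order 10⁻⁷ eV.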
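We can also consider the ultra-relativistic regime $q^2 \gg m^2$. Now
\[
\begin{aligned}
\Pi(q^2)-\Pi(0) &= -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\left[\ln\left(\frac{m^2}{q^2}\right) - \ln\left(x(1-x)+\frac{m^2}{q^2}\right)\right] \\
&\approx -\frac{2\alpha}{\pi}\left[\frac{1}{6}\ln(m^2/q^2) - \int_0^1 dx\, x(1-x)\ln\{x(1-x)+o(m^2/q^2)\}\right] \\
&\simeq \frac{\alpha}{3\pi}\left[\ln(q^2/m^2) - 5/3 + o(m^2/q^2)\right]
\end{aligned}
\]
\[
\therefore\ \alpha_{eff} = \frac{\alpha_{ph}}{1-\frac{\alpha}{3\pi}\ln\left(\frac{q^2}{m^2e^{5/3}}\right)} \ \xrightarrow{\ q^2\,\gg\, m^2\ }\ -\frac{3\pi}{\ln(q^2e^{-5/3}/m^2)} \tag{18.29}
\]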
Note that the denominator is less than 1 and hence effective coupling is larger than the
physical coupling. Thus, the effective coupling gets stronger at shorter distances. This is
interpreted as saying that the photon self-energy correction polarizes the space between
say the nucleus and the electron, shielding the nuclear charge. As the shielding cloud is
penetrated, higher nuclear charge is seen and hence αef f is larger. For this reason, the
photon self-energy Π(q 2 ) is also called the vacuum polarization.
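The growth of the effective coupling in (18.29) can be made concrete with a short numerical sketch. The assumptions are loud ones: only the electron loop is included (as in the text), the numerical values of α and m are standard reference values not quoted in these notes, and the function names are hypothetical. The first function checks the integral ∫x(1−x) ln(x(1−x)) dx = −5/18 that produces the 5/3 above.

```python
import math

# Assumed inputs (standard reference values, not taken from the notes)
ALPHA_PH = 1 / 137.035999   # physical (low energy) coupling
M_E_GEV = 0.51099895e-3     # electron mass in GeV


def log_integral(n_steps: int = 100_000) -> float:
    """Midpoint estimate of int_0^1 x(1-x) ln(x(1-x)) dx, expected to be -5/18."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        w = x * (1.0 - x)
        total += w * math.log(w)
    return total * h


def alpha_eff(q2: float, alpha: float = ALPHA_PH, m: float = M_E_GEV) -> float:
    """Effective coupling of (18.29), meaningful only for q2 >> m^2 (q2 in GeV^2)."""
    log_term = math.log(q2 / (m * m * math.exp(5.0 / 3.0)))
    return alpha / (1.0 - alpha / (3.0 * math.pi) * log_term)


if __name__ == "__main__":
    print("integral:", log_integral(), "vs -5/18 =", -5 / 18)
    for q in (1.0, 10.0, 91.19):   # momentum transfer in GeV
        print(f"q = {q:6.2f} GeV  ->  1/alpha_eff = {1 / alpha_eff(q * q):.3f}")
```

With a single electron loop the inverse coupling drops only by a few units below 137 even at q of order 100 GeV; the logarithm makes the running slow.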
To summarize:
1. The UV divergences in the self energies and the vertex correction are absorbed away
by introducing the physical mass for electron, physical charge for the electron and
electron wavefunction renormalization constant to cancel the divergence in F1 (q 2 );
2. The IR divergences in the amplitudes imply IR divergences in the cross-section. For the e − e scattering, the IR divergence is canceled against the bremsstrahlung process, once the measured quantity is correctly identified taking into account the finite detector resolution;
3. The hiding of divergences in αph and m²ph implies that αeff, δm² acquire q² dependence: they “run” with q².
This seems satisfactory within the context of the o(α) corrections and the magic of the
Ward identity. But it raises more questions.
• Are the divergences generic? What are the mathematical and physical reasons for their
existence?
• Can we sometimes/always take care of the divergences and make unambiguous predictions?
• What kind of predictions can be made?
• What are the effective/running parameters?
• Are the divergences an artifact of perturbation theory? Etc, etc . . . .
Addressing these questions is the genesis of the renormalization theory.
As a matter of strategy, we will stay within perturbative framework and try to make sense
of the divergences. After all, despite divergences, we could obtain predictions from QED at
1-loop which have been well tested!
We begin with a comment that the UV divergences persist at higher loops as well and
have to be faced. Can we always absorb them away in physical masses, couplings and the
field renormalizations in S−matrix elements/cross-sections automatically and make finite,
unambiguous predictions? Indeed it is so for QED and for the class of (super-)renormalizable
theories. The proof needs a somewhat modified procedure, introducing additional diagrams,
representing the so called ‘counter terms’ and adjusting its coefficients/Feynman rules to
systematically absorb the divergences. We discuss this procedure in the context of a scalar
field theory, specifically the Φ4 theory in 4 dimensions and Φ3 theory in 6 dimensions.
C. The Method of Counter terms
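Consider the Φ⁴ theory given by,
\[
\mathcal{L} = -\frac{1}{2}\partial_\mu\Phi_0\partial^\mu\Phi_0 - \frac{1}{2}m_0^2\Phi_0^2 - \frac{\lambda_0}{4!}\Phi_0^4
\]
The corresponding Feynman rules would be,
\[
\text{propagator (momentum } p\text{)}:\ \frac{-i}{p^2+m_0^2-i\epsilon}\ , \qquad \text{vertex}:\ -i\,\frac{\lambda_0}{4!}
\]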
As seen in QED, we will have divergences in the radiative corrections,
[Figure: diagrammatic expansion of the exact propagator (momentum p), including all radiative corrections.]
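The suffix 0 quantities are called “bare quantities”. Define: $\Phi_0 := \sqrt{Z}\,\Phi$, an arbitrary scaling of the field. This expresses the Lagrangian density as,
\[
\begin{aligned}
\mathcal{L} &= -\frac{Z}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{Z}{2}m_0^2\Phi^2 - \frac{\lambda_0 Z^2}{4!}\Phi^4 \\
&= -\frac{1}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{1}{2}m^2\Phi^2 - \frac{\lambda}{4!}\Phi^4 \\
&\quad - \frac{Z-1}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{Zm_0^2-m^2}{2}\Phi^2 - \frac{\lambda_0Z^2-\lambda}{4!}\Phi^4
\end{aligned}
\]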
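The last line constitutes the “counter terms”; the field Φ, the mass m and the coupling λ are called renormalized quantities. The corresponding Feynman rules would be:
\[
\frac{-i}{p^2+m^2-i\epsilon}\ , \qquad -i\Big\{\underbrace{(Z-1)}_{\delta_Z}\,p^2 + \underbrace{(Zm_0^2-m^2)}_{\delta_m}\Big\}\ , \qquad -i\,\frac{\lambda}{4!}\ , \qquad -\frac{i}{4!}\,\underbrace{(Z^2\lambda_0-\lambda)}_{\delta_\lambda}
\]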
We have used the renormalized quantities and have additional vertices with adjustable coefficients δZ , δm , δλ . The new vertices are generated by the counter terms.
Note that we have only rewritten the bare Lagrangian density split into two sets of terms.
Since we have not stipulated any conditions, the split is completely arbitrary. Two sets of
conditions, called renormalization conditions, are now provided:
(i) The exact propagator is given by $\frac{-i}{p^2+m^2-i\epsilon}$ + terms regular at $p^2 = -m^2$. That is, the renormalized mass should be identified with the physical mass and the residue should be 1;
(ii) The exact 4-point, 1PI function should equal −iλ at s = 4m², t = u = 0, where s, t, u are the Mandelstam variables.
The definition of λ could be changed, but essentially it is the value of the exact 4-point
scattering amplitude which is measurable.
Diagrams generated by these vertices will again have divergences which must be regulated. The counter terms coefficients, δZ , δm , δλ are to be so chosen as to ensure that the
renormalization conditions are satisfied, order-by-order. Since the conditions are finite and
cut-off independent (regularization independent), all divergences are spent in defining the
counter term coefficients. By construction, we have generated finite quantities to all orders!
Note: There are different ways of absorbing the divergences in the counter term coefficients, especially when massless particles are involved. Each such method of absorbing
the UV divergences is generically referred to as a subtraction scheme. This will be illustrated
below.
The perturbation series generated using the bare form of the Lagrangian is called “bare
perturbation series” while that generated using the renormalized quantities together with
the renormalization conditions, is called “renormalized perturbation series”.
Does such a simple splitting procedure always generate UV finite (cut-off independent)
quantities in terms of finitely many renormalization conditions? The answer is YES for a
class of theories called “renormalizable theories”. There are non-renormalizable theories (=
Lagrangians) for which this procedure fails.
19. RENORMALIZED PERTURBATION SERIES
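Let a theory be specified by a (bare) Lagrangian density, L as,
\[
\mathcal{L}(\Phi_0, m_0, g_{0,k}) = -\frac{1}{2}\partial_\mu\Phi_0\partial^\mu\Phi_0 - \frac{1}{2}m_0^2\Phi_0^2 - \sum_{k\ge 3}\frac{g_{0,k}}{k!}\Phi_0^k .
\]
Introduce renormalized quantities, Φ, m², g_k and scaling parameters, $Z_\phi$, $Z_m$, $Z_{g_k}$ through the definitions: $\Phi_0 =: \sqrt{Z_\phi}\,\Phi$, $m_0^2 Z_\phi = Z_m m^2$, $g_{0,k}Z_\phi^{k/2} =: Z_{g_k}g_k$. Writing the Z’s in 1 + δ form, we recast the Lagrangian density as,
\[
\begin{aligned}
\mathcal{L}(\Phi, m, g_k; \delta_\Phi, \delta_m, \delta_k) = &-\frac{1}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{1}{2}m^2\Phi^2 - \sum_{k\ge 3}\frac{g_k}{k!}\Phi^k \\
&-\frac{1}{2}\delta_\Phi\,\partial_\mu\Phi\partial^\mu\Phi - \frac{1}{2}\delta_m m^2\Phi^2 - \sum_{k\ge 3}\frac{g_k}{k!}\delta_k\Phi^k \qquad \longleftarrow \text{(Counter terms)}
\end{aligned}
\]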
We choose the renormalization conditions to be:
(i) Exact propagator = $\frac{-i}{p^2+m^2-i\epsilon}$, which is equivalent to the two conditions: (a) $\Pi(p^2 = -m^2) = 0$, and (b) $\Pi'(p^2 = -m^2) = 0$;
(ii) gk = k-point amputated function, also called k-point vertex function, at some chosen
values of its momentum arguments.
Note: Due to the conditions (i), this scheme is called on-shell renormalization.
The calculated vertex functions will be functions of the external momenta,
m, gk, δΦ, δm, δk and the regularization parameter: Λ for a momentum cut-off, ε = 4 − n
for dimensional regularization. The renormalization conditions will serve to define the
counter term coefficients in terms of the regularization parameter and eliminate them from
the vertex functions. We will be left with vertex function having dependence on the momenta, m and the couplings gk . This is what we seek to understand.
A. Necessary Conditions for UV divergence: Power Counting
We begin by finding necessary conditions for occurrence of UV divergence. These are
obtained by estimating the Feynman integrals in the region where all loop momenta become
large. Feynman integrands are rational functions of momenta and in the regime of large
momenta, simply give a power of the large momenta.
Consider an arbitrary, connected diagram made up of E external lines, I internal lines, n_k vertices of k-th order and L loops. The loops arise because we have I internal momenta with V := Σk≥3 n_k vertices each enforcing momentum conservation (less one, since one combination is the overall momentum conservation), which leaves some momenta undetermined. All internal momenta are linear functions of some external momenta, p_i, and some loop momenta, k_l. There are several integration regions in the d × L dimensional space (d is the space-time dimension). These regions correspond to various subsets of internal propagators vanishing as q(p_i, k_l)⁻². It suffices to consider the region wherein all loop momenta diverge, which means that all internal propagators vanish. If we take the loop momenta to diverge as k_l = Λk̂_l, Λ → ∞, then we get the superficial degree of divergence, D = dL − 2I.
For vertex functions, the external lines carry no propagators and 2I = Σk≥3 k n_k − E. The Green’s functions will have +E.
As noted before, the number of loop momenta is given by L = I − V + 1. Using these, the superficial degree of divergence is given by,
\[
D = d - \frac{d-2}{2}\,E + \sum_{k\ge 3} n_k\left(k\,\frac{d-2}{2} - d\right)
\]
A necessary condition for divergence is that D ≥ 0.
For D < 0, there is no UV divergence, D = 0 is (possibly) logarithmically divergent,
D = 1 is (possibly) linearly divergent, D = 2 is (possibly) quadratically divergent. We have
taken k ≥ 3 to have some interaction, we take at least one n_k ≠ 0 for a non-trivial diagram
and of course d ≥ 2, E ≥ 1.
As an example, consider d = 4. Then D = 4 − E + Σk≥3 nk (k − 4) ≥ 0 for divergence.
That is 4 ≥ E + n3 − Σk≥5 nk (k − 4). If we have nk≥5 = 0, i.e. only Φ3 , Φ4 terms, then the
condition for divergence is independent of the number of Φ4 vertices while with increasing
Φ3 vertices, the E must decrease correspondingly.
• For n3 = 0, we must have E ≤ 4 i.e. the 2−point and the 4−point vertex functions are
superficially divergent, quadratically and logarithmically respectively. For pure Φ4 theory,
the vertex functions with odd number of external lines vanish identically. Thus, in 4 dimensional Φ4 theory, the self-energy and the 4−point vertex function are divergent to all orders.
Since we do have δΦ , δm , δ4 coefficients, we can satisfy the renormalization conditions to all
orders. This theory is renormalizable.
• For pure Φ³ theory, n_k = 0 for k ≥ 4 and D = 4 − E − n3. For n3 = 1, we can have only the tree level, 3−point function. This gives D = 0. But of course there is no loop integration and hence no divergence (hence superficial degree of divergence is not a sufficient condition). Let us also require L ≥ 1.
For n3 = 2, we can have 2−point function diagrams [diagram] with E = 2, L = 1, D = 0 which are logarithmically divergent and also [diagram] with E = 4, D = −2, L = 0 which are (trivially) “convergent”. For n3 = 3, E = 2 we can have the diagram [diagram] which is divergent even though D < 0. But this is because of the 1PR nature
of the diagram. To exclude this triviality, we stipulate that contributing diagrams should
all be 1PI. Then, for all n3 ≥ 4, all E−point vertex functions are convergent. This theory
thus has divergences (eg 2-point function), but only up to a finite order. Beyond that, there
are no divergences. Such a theory is called super-renormalizable.
• For a pure Φk , k ≥ 5, D = 4 − E + (k − 4)nk . With increasing nk , the number of loops
also increase and D keeps increasing. Equivalently, more and more E−point functions turn
divergent and we will need infinitely many counter terms to absorb the divergences, in the
same E−point function. Such a theory is non-renormalizable.
• For sake of variety, take d = 3, so that D = 3 − E/2 + nk (k/2 − 3). For k = 6, D is
independent of nk . The 2, 4, 6 point vertex functions need counter terms to all orders and
Φ6 theory is renormalizable.
• Take d = 6, ⇒ D = 6 − 2E + nk (2k − 6) = 2[(3 − E) + nk (k − 3)]. For k = 3, D is
independent of n3. The 1, 2, 3 point functions are divergent to all orders and the theory is
renormalizable.
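The power counting used in the examples above is easy to mechanise. The following is a small sketch, with hypothetical function names, that evaluates the formula for D and cross-checks it against D = dL − 2I using L = I − V + 1 and 2I = Σ k n_k − E. A vertex content is passed as a dictionary mapping k to n_k.

```python
from fractions import Fraction


def loops_and_internal_lines(E, vertices):
    """Return (L, I) from 2I = sum_k k*n_k - E and L = I - V + 1."""
    total_legs = sum(k * n for k, n in vertices.items())
    if total_legs < E or (total_legs - E) % 2:
        raise ValueError("no diagram with this vertex content and E external lines")
    internal = (total_legs - E) // 2
    n_vertices = sum(vertices.values())
    return internal - n_vertices + 1, internal


def superficial_degree(d, E, vertices):
    """D = d - (d-2)/2 * E + sum_k n_k * (k*(d-2)/2 - d)."""
    half = Fraction(d - 2, 2)
    return d - half * E + sum(n * (k * half - d) for k, n in vertices.items())


if __name__ == "__main__":
    examples = [
        (4, 2, {4: 1}),   # Phi^4 in d=4, self-energy
        (4, 4, {4: 2}),   # Phi^4 in d=4, 4-point function at 1-loop
        (4, 2, {3: 2}),   # Phi^3 in d=4, self-energy at 1-loop
        (6, 2, {3: 2}),   # Phi^3 in d=6, self-energy at 1-loop
        (6, 3, {3: 3}),   # Phi^3 in d=6, vertex at 1-loop
        (3, 6, {6: 2}),   # Phi^6 in d=3, 6-point function at 1-loop
    ]
    for d, E, vertices in examples:
        L, I = loops_and_internal_lines(E, vertices)
        D = superficial_degree(d, E, vertices)
        assert D == d * L - 2 * I   # the two expressions for D must agree
        print(f"d={d} E={E} vertices={vertices}: L={L}, I={I}, D={D}")
```

Using `Fraction` keeps the half-integer factors (d − 2)/2 exact in odd dimensions such as d = 3.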
Remarks:
• The formula for the superficial degree of divergence can be generalised to include
fermions and photons. The fermion internal line contributes −1 (instead of −2) to the
power counting. The fermion number conservation must also be paid attention to by restricting the types of vertices. If there are derivative couplings, the numerator contributes positive
powers to D.
• As noted above, D ≥ 0 indicates the possibility of a UV divergence. The actual diagram
may be less divergent due to symmetries eg Π(q 2 ) in QED is only log divergent though D = 2
suggests quadratic divergence. It is also possible the coefficient of the indicated divergence is
actually zero! An example of this is the photon 3−point vertex function in QED, at 1-loop.
[Figure: 1-loop photon 3−point vertex diagram with external photon indices λ, µ, ν.]
Here D = 4 − 3 = 1. However due to Furry’s theorem, in QED, any photon amplitude with an
odd number of external lines is zero. Hence the diagram is identically zero. What is true, however,
is that if D < 0 for a diagram and each of its sub-diagrams, then the diagram has no
UV divergence (Dyson-Weinberg theorem).
The superficial degree of divergence, with the caveats mentioned above, suggests a classification of theories as:
(i) Super-renormalizable: Only finitely many diagrams have D ≥ 0. The D has a
dependence on the number of vertices such that it decreases with increase in nk;
(ii) Renormalizable: Only a finite number of E−point vertex functions have D ≥ 0. For
these functions though, D is non-negative at all orders;
(iii) Non-renormalizable: All vertex functions are superficially divergent at sufficiently
high orders. The D increases with increase in nk.
A given theory may fall in different classes in different dimensions.
Dimensional analysis gives another convenient criterion for renormalizability. This goes
as follows.
Dimensional Analysis:
In d dimensions $[\Phi] = \frac{d-2}{2}$, $[\Phi^k] = k\frac{d-2}{2}$ ∴ $[g_k] = d - k\frac{d-2}{2} = d(1-\frac{k}{2}) + k$. An E−point vertex function can come from a term in the Lagrangian as $g_E\Phi^E$ with $[g_E] = d(1-\frac{E}{2}) + E$. If a diagram contributing to such a vertex function has a momentum cut-off Λ and the number of vertices of order k is n_k, then the divergent part is proportional to $g_k^{n_k}\Lambda^D$. For k = E and n_E = 1, there is no loop integration and the dimension comes only from the coupling g_E. Hence $[g_E] = [g_k^{n_k}\Lambda^D]$, i.e.
\[
d(1-E/2) + E = n_k\left(d - k\,\frac{d-2}{2}\right) + D \quad\Rightarrow
\]
\[
\text{or,}\quad D = d - \frac{d-2}{2}\,E - \underbrace{\left\{d - \frac{d-2}{2}\,k\right\}}_{[g_k]}\,n_k
\]
Thus, D is independent of nk if [gk] = 0 and we have finitely many vertex functions potentially divergent. If [gk] > 0, then only finitely many diagrams can be divergent. If [gk] < 0,
then every vertex function has divergence at some order. Thus, the dependence on the
space-time dimension, d can be hidden in the dimensions of the couplings and
gk Φ^k is renormalizable if [gk] = 0;
gk Φ^k is super-renormalizable if [gk] > 0;
gk Φ^k is non-renormalizable if [gk] < 0.
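The criterion in terms of the mass dimension of the coupling can be tabulated directly. The sketch below (function names are hypothetical) computes [g_k] = d − k(d − 2)/2 and returns the corresponding class for a single g_k Φ^k interaction in d dimensions.

```python
from fractions import Fraction


def coupling_dimension(d, k):
    """Mass dimension [g_k] = d - k*(d-2)/2 of the coupling of g_k * Phi^k."""
    return d - Fraction(k * (d - 2), 2)


def classify(d, k):
    """Classify g_k * Phi^k in d dimensions by the sign of [g_k]."""
    dim = coupling_dimension(d, k)
    if dim > 0:
        return "super-renormalizable"
    if dim == 0:
        return "renormalizable"
    return "non-renormalizable"


if __name__ == "__main__":
    for d, k in [(4, 3), (4, 4), (4, 5), (6, 3), (6, 4), (3, 6)]:
        print(f"d={d}, Phi^{k}: [g] = {coupling_dimension(d, k)}, {classify(d, k)}")
```

The output reproduces the cases discussed with power counting: Φ⁴ in d = 4, Φ³ in d = 6 and Φ⁶ in d = 3 have dimensionless couplings, while Φ³ in d = 4 has [g] = 1.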
Having identified simple criteria for presence of UV divergences, we see now how the
counter terms are used for renormalization.
B. An Example: The (Φ³)₆ theory
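As a specific example, we will consider the (Φ³)₆ theory. Here are the Feynman rules for the
renormalized perturbation series.
\[
\text{propagator}:\ \frac{-i}{p^2+m^2-i\epsilon}\ , \qquad \text{vertex}:\ +ig\ , \qquad \text{counter vertices}:\ -i\left(\delta_\Phi p^2 + \delta_m m^2\right)\ , \quad +ig\,\delta_3 .
\]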
D = 6−2E ≥ 0 ⇒ E = 2 (quadratically divergent) and E = 3 (logarithmically divergent)
are the only vertex functions with UV divergence. Consider the 1-loop divergences first.
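[Figure: 1-loop self-energy diagram with external momentum p and internal momenta k, p + k, followed by the counter-term vertex and higher orders.]
\[
i\Pi(p^2) = \underbrace{\frac{1}{2}(ig)^2\int\frac{d^dk}{(2\pi)^d}\ \frac{-i}{(p+k)^2+m^2-i\epsilon}\ \frac{-i}{k^2+m^2-i\epsilon}}_{(1)} + \underbrace{1\cdot(-i)\left(\delta_\Phi p^2+\delta_m m^2\right)}_{(2)} + \dots
\]
The 1/2 comes from: $\frac{6\times 2\times 3}{2!\,(3!)^2} = \frac{1}{2}$ and the 1 comes from: $\frac{2\times 1}{1!\,2!} = 1$.
δΦ, δm are to be determined from Π(−m²) = 0 = Π′(−m²).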
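The first term, after using the Feynman parameters and Wick rotation and equations (17.4,
17.6, 17.7) gives,
\[
(1) = \frac{g^2}{2}\,I(p^2)\ , \qquad I(p^2) := i\int_0^1 dx\int\frac{d^dk}{(2\pi)^d}\,\frac{1}{(k^2+M^2)^2}\ , \qquad M^2 := x(1-x)p^2 + m^2
\]
\[
I(p^2) = i\,\frac{\Gamma(-1+\epsilon/2)}{(4\pi)^3}\int_0^1 dx\, M^2\,(4\pi/M^2)^{\epsilon/2}\ , \qquad \epsilon := 6-d\ , \quad \text{put } g \to g\,\bar\mu^{\epsilon/2}\ , \quad \alpha := \frac{g^2}{(4\pi)^3}
\]
\[
(2) = -i\left(\delta_\Phi p^2 + \delta_m m^2\right)
\]
\[
\therefore\ \Pi(p^2) = -\frac{\alpha}{2}\left[\left(\frac{2}{\epsilon}+1\right)\left(\frac{p^2}{6}+m^2\right) + \int_0^1 dx\, M^2\ln\left(\frac{4\pi\bar\mu^2}{e^{\gamma}M^2}\right)\right] - \delta_\Phi p^2 - \delta_m m^2 + o(\alpha^2).
\]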
Putting $\mu := \sqrt{4\pi e^{-\gamma}}\,\bar\mu$, the logarithm becomes $\ln(\mu^2/M^2)$. We continue the Euclidean
momenta back to Minkowski momenta, p2 → p2 . Club the first group of terms with the
counter terms. This gives the self energy as, [12]
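\[
\Pi(p^2) = \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/\mu^2) - \left\{\frac{\alpha}{6}\left(\frac{1}{\epsilon}+\frac{1}{2}\right)+\delta_\Phi\right\}p^2 - \left\{\alpha\left(\frac{1}{\epsilon}+\frac{1}{2}\right)+\delta_m\right\}m^2 + o(\alpha^2) \tag{19.1}
\]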
Now, in the first term, use ln(M 2 /µ2 ) = ln(M 2 /m2 )+ln(m2 /µ2 ) and absorb the contribution
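of the second term into the counter term coefficients along with the UV divergences (the ε⁻¹
pole) by choosing,
\[
\delta_\Phi := -\frac{\alpha}{6}\left[\frac{1}{\epsilon}+\frac{1}{2}+C_\Phi\right] + o(\alpha^2)\ , \qquad \delta_m := -\alpha\left[\frac{1}{\epsilon}+\frac{1}{2}+C_m\right] + o(\alpha^2) . \tag{19.2}
\]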
Note that we have only absorbed the pole together with some, µ−dependent finite parts (the
ln(m2 /µ2 ) terms) and subsumed these in the undetermined, finite constants CΦ , Cm . These
constants are determined by the renormalization conditions. The self-energy then takes the
form,
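\[
\Pi(p^2) = \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/m^2) + \frac{\alpha}{6}\,C_\Phi\,p^2 + \alpha\,C_m\,m^2 + o(\alpha^2). \tag{19.3}
\]
The C’s are now determined by imposing Π(−m²) = 0 = Π′(−m²). With $M_0^2 := M^2(p^2=-m^2)$, this leads to,
\[
C_m - \frac{C_\Phi}{6} = -\frac{1}{2}\int_0^1 dx\,\frac{M_0^2}{m^2}\ln(M_0^2/m^2)\ , \qquad \frac{M_0^2}{m^2} = 1 - x + x^2 , \tag{19.4}
\]
\[
-\frac{C_\Phi}{6} = \frac{1}{2}\int_0^1 dx\, x(1-x)\left[\ln(M_0^2/m^2) + 1\right] . \tag{19.5}
\]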
Notice that the C’s so determined, are independent of the arbitrary parameter µ that entered
in the dimensional regularization scheme. We have expressed the 2-point vertex function
explicitly free of UV divergence and the arbitrary µ parameter. In Π(p²) we had two terms
with the ε⁻¹ pole, one with coefficient p² and one with coefficient m². The two counter terms
δΦ, δm sufficed to absorb these.
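Consider now the 3-point vertex function, Γ(p1, p2, p3):
[Figure: iΓ3 as the sum of the tree-level vertex, the 1-loop triangle diagram with loop momentum k, the counter vertex, and higher orders; external momenta p1, p2, p3.]
At 1-loop we have,
[Figure: iδΓ3 as the sum of the counter vertex and the triangle diagram with internal momenta k, k − p1, p2 + k.]
\[
i\,\delta\Gamma_3 = \underbrace{[\,ig\,\delta_3\,]}_{(1)} + \underbrace{(ig)^3(-i)^3\int\frac{d^dk}{(2\pi)^d}\ \frac{1}{k^2+m^2-i\epsilon}\ \frac{1}{(p_2+k)^2+m^2-i\epsilon}\ \frac{1}{(k-p_1)^2+m^2-i\epsilon}}_{(2)}
\]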
In the second term, Feynman parameterization, shifting momentum and Wick rotation gives,
\[
[\dots] = 2\int dx\int dy\int dz\ \frac{\delta(1-x-y-z)}{(k^2+M^2)^3}\ , \qquad M^2 := zx\,p_1^2 + zy\,p_2^2 + xy\,p_3^2 + m^2 .
\]
As before, replacing $g \to g\,\bar\mu^{\epsilon/2}$, $\epsilon = 6-d$ and using (17.7) gives,
\[
\int\frac{d^dk}{(2\pi)^d}\,\frac{1}{(k^2+M^2)^3} = \frac{\Gamma(3-d/2)}{2\,(4\pi)^{d/2}}\,(M^2)^{-(3-d/2)} .
\]
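Hence, putting α := g²/(4π)³, µ² := 4πe^{−γ}µ̄²,
\[
\frac{\delta\Gamma(p_1,p_2,p_3)}{g} = \delta_g + \frac{\alpha}{2}\left[\frac{2}{\epsilon} + 2\int dx\,dy\,dz\ \delta(1-x-y-z)\left(\frac{4\pi\bar\mu^2}{M^2}\right)^{\epsilon/2}\right] ;
\]
\[
= \left\{\frac{\alpha}{\epsilon} + \delta_g\right\} - \alpha\int dx\,dy\,dz\ \delta(1-x-y-z)\,\ln(M^2/m^2) + o(\alpha^2) \tag{19.6}
\]
Choosing
\[
\delta_g := -\frac{\alpha}{\epsilon} - \alpha\,C_g + o(\alpha^2) \quad\Rightarrow \tag{19.7}
\]
\[
\Gamma(p_1,p_2,p_3) = g\left[1 - \alpha\int dx\,dy\,dz\ \delta(1-x-y-z)\,\ln(M^2/m^2) - \alpha\,C_g + o(\alpha^2)\right] \tag{19.8}
\]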
As before, we have obtained a completely finite and µ−independent vertex function with
one undetermined constant Cg. This is to be fixed by a suitable renormalization condition.
What condition do we choose?
In QED we had a natural choice Γµ (q 2 → 0) = eph . Here however, there is no natural
choice. Any cross-section will involve both g and Cg and the value of a cross-section gives
one condition. A convenient choice is Γ3 (0, 0, 0) = g ↔ Cg = 0.
Thus, at the 1-loop level, we see how the counter terms absorb the divergence and how
the renormalization conditions determine the constants C’s. Note that we could have added
any finite constant to the pole in defining the counter terms. The renormalization conditions
then would give different expressions for the constants.
1. At 2-Loops
A natural question is: How does this work at higher orders? In our (Φ3 )6 theory, only the
2-point and 3-point vertex functions are divergent. At 2-loops the contributing diagrams
are:
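[Figure: 2-loop diagrams contributing to the 2−point and 3−point vertex functions, arranged in groups I, II, III and IV. The groups include diagrams with the 1-loop counter vertices (δφ)1, (δm)1, (δ3)1 and the 2-loop counter vertices (δφ)2, (δm)2.]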
The counter terms have the same form as the original bare terms. When they are clubbed together into the ‘interaction’ Lagrangian or Hamiltonian, one would expect that from the $H_{int}^k/k!$ terms, we would get the counter vertices also appearing k times. For example, at o(g²), we would have [diagram] and [diagram] vertices. In vertex functions, the 1PR diagrams from the self-energy counter terms are omitted. Secondly,
the coefficients of the counter terms are explicitly instructed to be adjusted to enforce the
renormalization conditions at any given order in g. Hence, the counter vertices are not on
par with the elementary vertices. The order of perturbation is determined by the number
of elementary vertices and not by counter vertices.
In the first of the groups of diagrams, we have a subdiagram that is divergent (only
|k1| → ∞, k2 remains finite). This divergence is absorbed by the 1-loop counter vertices:
the second and the third diagrams of group I. To absorb the divergence when both k1, k2
loop momenta become large, we need the new 2-loop counter term as shown in III.
The group II exhibits the so called “overlapping divergences”: the vertical line’s momentum diverges when either of the loop momenta diverges. There are two subdiagrams which
are divergent and to absorb these, we need the δ3 1-loop counter term. The group III counter
terms are needed to absorb the divergence coming from both momenta becoming large.
At higher loops, the procedure is thus recursive. The counter vertices themselves have
an expansion in g² with coefficients absorbing divergences from the subdiagrams (lower loop
orders). The renormalization conditions determine the highest loop order coefficients.
This explains how the divergences arise and how they are absorbed systematically via
the counter terms and the renormalization conditions.
We have taken for illustration the on-shell renormalization condition. This method fails
when the physical masses vanish and the procedure needs to be adapted suitably. This
impacts what exactly is subtracted: (divergent part) or (divergent part + a finite piece)?
This leads to different schemes of renormalization. Let us see how the problem arises and
how different renormalization conditions can be chosen. We continue with the (Φ³)₆ theory.
C. Renormalization with massless particles
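We have the self energy in equation (19.1). Taking derivative w.r.t. p² gives,
\[
\Pi'(p^2) = -\left\{\delta_\Phi + \frac{\alpha}{6}\left(\frac{1}{\epsilon}+\frac{1}{2}\right)\right\} + \frac{\alpha}{2}\int_0^1 dx\, x(1-x)\left\{\ln(M^2/\mu^2)+1\right\} + o(\alpha^2) .
\]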
As m → 0, M² → x(1 − x)p². Then Π(0) = 0 identically while Π′(0) is ill-defined (depends
on how p², m² → 0 is taken). Thus the previous renormalization conditions cannot be used
and the counter term coefficients would remain undetermined! We noted while discussing
the Kallen-Lehmann representation, that the physical spectrum was assumed to have a mass
gap i.e. single particle pole being separated from the multi-particle branch point. Whenever
this is violated, the above on-shell renormalization procedure fails.
All is not lost though. We can try different renormalization conditions.
There are two commonly used subtraction schemes within dimensional regularization: the so called Minimal Subtraction scheme (MS) and the Modified Minimal Subtraction scheme ($\overline{MS}$).
They are defined as:
MS ↔ define the counter term constants by absorbing only the pole(s);
while the $\overline{MS}$ scheme uses µ² := 4πe^{−γ}µ̄² in the MS scheme.
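With these definitions, we give the finite self-energy in the three schemes:
\[
\Pi_{on-shell}(p^2) = -\frac{\alpha}{12}\,(p^2+m^2) + \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/M_0^2) + o(\alpha^2), \tag{19.9}
\]
where, $M^2 = x(1-x)p^2 + m^2$ and $M_0^2 := M^2(p^2=-m^2)$,
\[
\Pi_{MS}(p^2) = -\frac{\alpha}{12}\,(p^2+6m^2) + \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/\bar\mu^2) + o(\alpha^2) \tag{19.10}
\]
\[
\Pi_{\overline{MS}}(p^2) = -\frac{\alpha}{12}\,(p^2+6m^2) + \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/\mu^2) + o(\alpha^2) \tag{19.11}
\]
The combination p² + 6m² in the last two follows from (19.1) when only the poles are absorbed, δΦ = −α/(6ε), δm = −α/ε.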
The Πon−shell is ill-defined as m → 0 as noted above. For non-zero m though, it is
unambiguous, free of the arbitrary scale µ and m is also the physical mass.
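The on-shell form (19.9) can be tested numerically against the renormalization conditions Π(−m²) = 0 = Π′(−m²). The sketch below is a check under stated assumptions, not part of the derivation: the Feynman parameter integral is done with a simple midpoint rule, the derivative with a central difference, and the values m = 1, α = 0.1 are arbitrary illustrative choices. Function names are hypothetical.

```python
import math


def integrate(f, n_steps=20_000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n_steps
    return h * sum(f((i + 0.5) * h) for i in range(n_steps))


def pi_on_shell(p2, m=1.0, alpha=0.1):
    """Pi_on-shell(p^2) of (19.9), valid for p^2 > -4 m^2 where M^2 stays positive."""
    def integrand(x):
        big_m2 = x * (1.0 - x) * p2 + m * m        # M^2
        big_m02 = m * m * (1.0 - x + x * x)        # M_0^2 = M^2 at p^2 = -m^2
        return big_m2 * math.log(big_m2 / big_m02)
    return -alpha / 12.0 * (p2 + m * m) + alpha / 2.0 * integrate(integrand)


def derivative(f, x, h=1e-4):
    """Central finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    print("Pi(-m^2)  =", pi_on_shell(-1.0))
    print("Pi'(-m^2) =", derivative(pi_on_shell, -1.0))
```

Both numbers come out zero within the accuracy of the quadrature: at p² = −m² the logarithm vanishes, and the derivative of the integral, (α/2)∫x(1−x) = α/12, cancels the −α/12 from the first term.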
By contrast, $\Pi_{\overline{MS}}$ is well defined as m → 0, but it depends on the mathematical artifact
parameter µ! Now the m parameter is no longer the location of the pole of the propagator and hence is not the
physical mass. The physical mass is determined from $[p^2+m^2-\Pi(p^2)]_{p^2=-m_{ph}^2} = 0 \leftrightarrow m_{ph}^2 = m^2 - \Pi_{\overline{MS}}(-m_{ph}^2)$. Since Π is order α, we can write, to order α, $m_{ph}^2 = m^2 - \Pi_{\overline{MS}}(-m^2)$.
Substitution gives,
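\[
m_{ph}^2 = m^2 + \frac{\alpha}{12}\,m^2\,(6-1) - \frac{\alpha}{2}\int_0^1 dx\, m^2(1-x+x^2)\,\ln\left((1-x+x^2)\,m^2/\mu^2\right)
\]
or
\[
m_{ph}^2 = m^2\left[1 + \frac{5}{12}\,\alpha\left(\ln\frac{\mu^2}{m^2} + C'\right) + o(\alpha^2)\right] , \qquad C' = \frac{34-3\pi\sqrt{3}}{15} \approx 1.18 \tag{19.12}
\]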
Observe that the physical mass explicitly depends on µ which it should not.
Hence $\frac{d}{d\mu^2}\,m_{ph}^2(m^2,\alpha,\mu) = 0$ must hold. Clearly this would be possible if α and/or m also have µ
dependence. The independence of the physical mass then relates the µ dependences of m, α.
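The residue, R, at the physical pole is as before, $R^{-1} = \frac{d(\Delta'_F)^{-1}}{dp^2}\Big|_{p^2=-m_{ph}^2} = 1 - \Pi'_{\overline{MS}}(-m_{ph}^2) = 1 - \Pi'_{\overline{MS}}(-m^2) + o(\alpha^2)$. This evaluates to,
\[
R^{-1} = 1 + \frac{\alpha}{12}\left[\ln(\mu^2/m^2) + C''\right] , \qquad C'' = \frac{17-3\pi\sqrt{3}}{3} \approx 0.23 . \tag{19.13}
\]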
The residue gives the wave function renormalization constant ZΦ = R.
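The constants C′ and C″ in (19.12) and (19.13) can be reproduced numerically. The sketch below assumes the pole-subtracted self-energy Π(p²) = −(α/12)(p² + 6m²) + (α/2)∫dx M² ln(M²/µ²), which is what (19.1) gives when only the ε⁻¹ poles are absorbed. With f(x) := 1 − x + x², evaluating m² − Π(−m²) and 1 − Π′(−m²) then gives C′ = 1 − (6/5)∫f ln f and C″ = −6∫x(1−x) ln f. Function names are hypothetical and the integrals use a plain midpoint rule.

```python
import math


def integrate(f, n_steps=20_000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n_steps
    return h * sum(f((i + 0.5) * h) for i in range(n_steps))


def f(x):
    """M_0^2 / m^2 = 1 - x + x^2."""
    return 1.0 - x + x * x


def c_prime():
    """C' = 1 - (6/5) * int_0^1 f ln f dx, from m_ph^2 = m^2 - Pi(-m^2)."""
    return 1.0 - 6.0 / 5.0 * integrate(lambda x: f(x) * math.log(f(x)))


def c_double_prime():
    """C'' = -6 * int_0^1 x(1-x) ln f dx, from R^{-1} = 1 - Pi'(-m^2)."""
    return -6.0 * integrate(lambda x: x * (1.0 - x) * math.log(f(x)))


if __name__ == "__main__":
    root3pi = 3.0 * math.pi * math.sqrt(3.0)
    print("C'  numeric:", c_prime(), " closed form:", (34.0 - root3pi) / 15.0)
    print("C'' numeric:", c_double_prime(), " closed form:", (17.0 - root3pi) / 3.0)
```

The numerical values agree with the closed forms quoted in (19.12) and (19.13).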
Finally, in the vertex function, we choose δg = −α/ε in the $\overline{MS}$ scheme. This leads to,
\[
\Gamma_{3,\overline{MS}}(p_1,p_2,p_3) = g\left[1 - \alpha\int dx\,dy\,dz\ \delta(1-x-y-z)\,\ln(M^2/\mu^2) + o(\alpha^2)\right] , \tag{19.14}
\]
where, $M^2 = xy\,p_1^2 + yz\,p_2^2 + zx\,p_3^2 + m^2$.
To see a possible µ−dependence of the coupling α, we need to identify a physical quantity
which has explicit µ dependence. As discussed by Srednicki [12], we consider the two particle
going to two particle process which is related to the 4−point vertex function which is UV
finite. In the limit m → 0, this process too suffers from IR divergence which is handled as in
QED: carefully considering what is observed and including the soft particles contribution.
From [12], we borrow the formulae below.
The amplitude for the process, p1 + p2 → p′1 + p′2, is depicted below to order α.
[Figure: (a) the 2 → 2 process with incoming momenta p1, p2 and outgoing momenta p′1, p′2; (b) the same process with additional collinear particles, shown as dotted lines, on the external lines.]
The contribution of (a) in the high energy limit, is given by,
\[
\mathcal{M} = \mathcal{M}_0\left[1 - \frac{11}{12}\,\alpha\left(\ln(s/m^2) + o(m^0)\right) + o(\alpha^2)\right] \quad \text{with,}
\]
\[
\mathcal{M}_0 = -g^2\left(\frac{1}{s}+\frac{1}{t}+\frac{1}{u}\right) , \qquad s := -(p_1+p_2)^2 ,\quad t := -(p'_1-p_1)^2 ,\quad u := -(p'_2-p_1)^2 .
\]
The o(m⁰) term is free of the ln(m) singularity as m → 0. The amplitude M is divergent as m → 0
(see equation (26.1) of Srednicki [12]).
In the limit m → 0, the above process is experimentally indistinguishable from the one in
which there are on-shell, collinear particles associated with any/all external lines [footnote 14]. This is
shown in (b) above, with the dotted lines denoting the collinear particles. We must integrate
the squared amplitude over momenta collinear within detector’s angular resolution.
[Footnote 14: For massless particles, $(\sum_i p_i)^2 = \sum_{i\neq j} p_i\cdot p_j = \sum_{ij}|p_i||p_j|(-1+\cos\theta_{ij})$ which vanishes if all particles are collinear, θij = 0.]
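For observable cross-section with collinear emission included (from all 4 external lines),
we have
\[
\begin{aligned}
|\mathcal{M}|^2_{obs} &= |\mathcal{M}|^2\left[1 + \frac{4\alpha}{12}\left(\ln\left(\frac{\Delta^2\vec k^{\,2}}{m^2}\right) + C\right) + \dots\right] , \qquad C := 4 - 3\sqrt{3}\,\pi ,\quad \vec k^{\,2} = s/4 , \\
&= |\mathcal{M}_0|^2\left[1 - \frac{11}{6}\,\alpha\ln(s/m^2)\right]\times\left[1 + \frac{\alpha}{3}\left(\ln(\Delta^2 s/(4m^2)) + C\right) + \dots\right] \\
&= |\mathcal{M}_0|^2\left[1 - \alpha\left(\frac{3}{2}\ln(s/m^2) + \frac{1}{3}\ln(1/\Delta^2) + o(m^0)\right) + o(\alpha^2)\right] . \qquad \text{(26.15 of [12])}
\end{aligned}
\]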
The ∆ is related to the detector angular resolution. In going to the second equation, we
have included collinear emission from all 4 lines. There are no UV divergences here. The
above formulae implicitly assume on-shell renormalization scheme where we cannot take the
m → 0 limit.
The same computation of the amplitude can be done in the $\overline{MS}$ scheme. This changes
the amplitude M by changing ln(s/m²) → ln(s/µ²). The correction due to inclusion of
collinear emission continues to have ln(∆²s/(4m²)). Additionally, the residue at the physical
pole is not 1 but R and hence the amplitude is multiplied by R². Including these changes,
the observed squared amplitude is given by,
\[
\begin{aligned}
|\mathcal{M}|^2_{obs} &= |\mathcal{M}_0|^2\left[1 - \frac{11}{6}\,\alpha\ln(s/\mu^2)\right]\left[1 + \frac{\alpha}{3}\ln(\Delta^2 s/m^2)\right]\left[1 - \frac{\alpha}{3}\ln(\mu^2/m^2)\right] \\
&= |\mathcal{M}_0|^2\left[1 - \alpha\left(\frac{11}{6}\ln(s/\mu^2) - \frac{1}{3}\ln(\Delta^2 s/m^2) + \frac{1}{3}\ln(\mu^2/m^2)\right) + \dots\right]
\end{aligned}
\]
\[
\therefore\ |\mathcal{M}|^2_{obs} = |\mathcal{M}_0|^2\left[1 - \alpha\left(\frac{3}{2}\ln(s/\mu^2) + \frac{1}{3}\ln(1/\Delta^2) + o(m^0)\right) + o(\alpha^2)\right] \tag{19.15}
\]
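There is no dependence on m now and the limit m → 0 can be taken. However, there
is explicit µ dependence and the observed cross-section cannot depend on the arbitrary µ.
Hence $\frac{d}{d\mu}|\mathcal{M}|^2_{obs} = 0$ must hold. Writing
\[
\ln|\mathcal{M}|^2_{obs} := \ln|\mathcal{M}_0|^2 + \left[\frac{3}{2}\,\alpha\ln(\mu^2) + 3\alpha\,C_2\right] := C_1 + 2\ln\alpha + 3\alpha\ln(\mu) + 3\alpha\,C_2
\]
where C1, C2 are functions of s, t, u but independent of α and µ, the condition of
µ−independence gives,
\[
0 = \frac{2}{\alpha}\frac{d\alpha}{d\ln(\mu)} + 3\,(C_2+\ln(\mu))\,\frac{d\alpha}{d\ln(\mu)} + 3\alpha \quad\Rightarrow \tag{19.16}
\]
\[
\beta(\alpha) := \frac{d\alpha}{d\ln(\mu)} = -\frac{3}{2}\,\alpha^2 + o(\alpha^3) . \tag{19.17}
\]
We also recall,
\[
0 = \frac{dm_{ph}^2}{d\ln(\mu)} \quad\Rightarrow \tag{19.18}
\]
\[
\gamma_m(\alpha) := \frac{d\ln(m)}{d\ln(\mu)} = -\frac{5\alpha}{12} + o(\alpha^2) \tag{19.19}
\]
Here the coefficient 5/12 is the one multiplying α ln(µ²/m²) in (19.12): µ−independence of m²ph at order α requires 2 d ln(m)/d ln(µ) + (5/6)α = 0.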
The γm is called the anomalous mass dimension of the mass parameter and β(α) is the
famous “beta” function of the theory. We have just obtained these functions at 1-loop,
for the (Φ3 )6 theory and these govern how the renormalized parameter must change with
the “renormalization scale”, µ, in order that the physical quantities are independent of the
renormalization scale. If we identify µ = √s, then we have the running of the renormalized
parameters with the center-of-mass energy scale.
The differential equations can be trivially solved giving the running as, with µ = µ̂e^t,
\[
\alpha(t) = \frac{\hat\alpha}{1 + \frac{3}{2}\,\hat\alpha\,\ln(\mu/\hat\mu)} , \qquad m(t) = \hat m\left(1 + \frac{3}{2}\,\hat\alpha\, t\right)^{-5/18} . \tag{19.20}
\]
As µ ∼ √s ≫ µ̂, the coupling decreases. Such a theory is said to be asymptotically free.
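The closed-form running in (19.20) can be compared with a direct numerical integration of the 1-loop equations dα/dt = −(3/2)α² and d ln(m)/dt = −(5/12)α, with t = ln(µ/µ̂). The sketch below uses a hand-written fourth-order Runge-Kutta step; the starting values α̂ = 0.1, m̂ = 1 and the function names are illustrative assumptions. The exponent −5/18 is the ratio γm/β = (5/12)/(3/2) appearing when m is expressed through α.

```python
import math


def beta(alpha):
    """1-loop beta function of (Phi^3)_6: d alpha / d ln(mu)."""
    return -1.5 * alpha ** 2


def gamma_m(alpha):
    """1-loop anomalous mass dimension: d ln(m) / d ln(mu)."""
    return -5.0 / 12.0 * alpha


def run(alpha0, m0, t_final, n_steps=10_000):
    """Integrate the coupled flow from t = 0 to t_final with RK4."""
    h = t_final / n_steps
    alpha, log_m = alpha0, math.log(m0)
    for _ in range(n_steps):
        a1 = alpha
        a2 = alpha + 0.5 * h * beta(a1)
        a3 = alpha + 0.5 * h * beta(a2)
        a4 = alpha + h * beta(a3)
        alpha += h * (beta(a1) + 2 * beta(a2) + 2 * beta(a3) + beta(a4)) / 6.0
        log_m += h * (gamma_m(a1) + 2 * gamma_m(a2) + 2 * gamma_m(a3) + gamma_m(a4)) / 6.0
    return alpha, math.exp(log_m)


def closed_form(alpha0, m0, t):
    """Equation (19.20)."""
    denom = 1.0 + 1.5 * alpha0 * t
    return alpha0 / denom, m0 * denom ** (-5.0 / 18.0)


if __name__ == "__main__":
    for t in (1.0, 5.0, 20.0):
        print(t, run(0.1, 1.0, t), closed_form(0.1, 1.0, t))
```

Both α and m decrease monotonically with t, which is the asymptotic freedom noted above.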
To summarize: We have seen how the introduction of counter terms makes it possible to
absorb the UV divergences into unobservable coefficients such as δΦ , δm , δg . This process of
absorption or ‘subtraction’ has an inherent ambiguity: exactly what is subtracted. This is
implicitly determined by imposition of a set of renormalization conditions (eg the on-shell
renormalization) or explicitly (eg the MS/$\overline{MS}$ scheme). When the pole (in p²) in the self-energy is not a simple pole, which happens with massless particles (mph = 0), alternative
subtraction schemes are needed. Such schemes typically imply renormalization scale dependence in the renormalized parameters such as masses and couplings. This is governed by
the beta function(s) and the anomalous dimension(s). For the sub-class of asymptotically
free theories, this helps in improving perturbative predictions at high energies.
A more general discussion of the renormalization group and its application is given in
subsection (22 B).
20. PATH INTEGRALS IN QUANTUM MECHANICS
We consider an alternative strategy to view quantum dynamics.
Recall that in a quantum framework, we have a (projective) Hilbert space (or more generally a density operator on a Hilbert space) that encodes the kinematics while a family of
unitary operators encodes the dynamics. Observable quantities are provided by operators
whose expectation values and uncertainties in a given state (or density operator), provide
numbers to be matched against experiments. To track time evolution of observable quantities, the central quantity of interest is the transition probability amplitude, for a state $|\Psi\rangle$ at time $t$ to make a transition at $t' > t$ to another state $|\Psi'\rangle$. The state $|\Psi\rangle$ evolves to $e^{-i(t'-t)H}|\Psi\rangle$ by the Schrodinger equation ($H$ is taken time independent for simplicity). Its inner product with $|\Psi'\rangle$ gives the probability amplitude. This is denoted as:
\[
\text{Transition probability amplitude} := \langle\Psi'|\left(e^{-i(t'-t)H}|\Psi\rangle\right) = \langle\Psi'|e^{-i(t'-t)H}|\Psi\rangle
\]
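Assuming the quantum system to be describing a particle with a configuration space $\{\vec q\}$, we can use the position representation completeness relation and express the amplitude as,
\begin{align*}
\langle\Psi'|e^{-i(t'-t)H}|\Psi\rangle &= \int d\vec q\,' \int d\vec q\; \langle\Psi'|\vec q\,'\rangle \langle\vec q\,'|e^{-i(t'-t)H}|\vec q\rangle \langle\vec q|\Psi\rangle \\
&:= \int d\vec q\,' \int d\vec q\; \Psi'^{*}(\vec q\,')\, K(\vec q\,', t'; \vec q, t)\, \Psi(\vec q)
\end{align*}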
The idea is to focus on the kernel K and get a convenient representation for it.
Note: We have already presumed the usual framework of quantum theory. It is possible
to develop the ideas ab initio taking the Kernel as a central quantity with a proposed form
and arrive at the usual quantum framework. The book of Feynman and Hibbs follows this
line. We will first sketch the heuristic approach and then relate it to the usual quantum
framework.
A. The Ab Initio Path Integral
Consider a classical system with a configuration space Q which, for convenience of notation we take to be R. Let q1 , q2 be two points of it. The particle is supposed to make a
transition from q1 at t1 to q2 at t2 . Such a transition takes place classically as well and we
use it in the Lagrangian framework and deduce that the transition takes place along a curve
q(t) which extremises the action. In a departure from the classical dynamics, it is proposed
that there is certain probability for the transition (q1 , t1 ) → (q2 , t2 ). This is to be computed
by taking the modulus squared of the total probability amplitude obtained by summing over the probability amplitudes for every path connecting $(q_1, t_1)$ and $(q_2, t_2)$. The amplitude for each path is given as a certain phase. Thus, heuristically, $K(q_2, t_2; q_1, t_1) \sim \sum_{\text{paths}} e^{i\varphi[q(t)]}$.
Questions: What do we choose for the phase? How do we restrict the paths? How is the
‘sum’ to be performed?
• We expect to recover classical dynamics in the limit $\hbar \to 0$. So we expect the phase must be such that the amplitude is dominated by a single path corresponding to the classical transition. The obvious choice is $\varphi[q(t)] \sim S[q(t)]/\hbar$!
• First guess about the paths would be smooth paths, but this turns out to be not enough.
Since the action involves derivatives of q(t), we expect at least piecewise differentiability.
But even this may not suffice - we can define derivatives as differences. It seems, continuity
suffices. Without worrying about this too much, let us proceed by discretizing the time
interval. This will also lead us to answer the third question.
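Let $T := t_2 - t_1 := N\epsilon$ where $N$ is a large number eventually to be taken to infinity. For definiteness, let the action be
\[
S = \int_0^T dt \left[\frac{m}{2}\dot q^2 - V(q)\right] \;\to\; \sum_{k=0}^{N-1} \epsilon \left[\frac{m}{2}\frac{(q_{k+1}-q_k)^2}{\epsilon^2} - V\!\left(\frac{q_{k+1}+q_k}{2}\right)\right], \qquad q_k := q(t_k).
\]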
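For uniformity of notation, denote $q_0 =: q_{initial}$, $q_N =: q_{final}$. We have used the natural discretization of the action. The space of paths is now described by the $N-1$ $q_k$'s varying independently over $\mathbb{R}$. The integration measure may be taken as the product measure $\prod_{k=1}^{N-1} \frac{dq_k}{C(\epsilon)}$, where the constant $C(\epsilon)$ is left free and will be chosen shortly. Since $N$ is arbitrary, we define a path integral as:
\[
K(q_{in}, q_f; T) := \lim_{\epsilon\to 0} \frac{1}{C(\epsilon)} \prod_{k=1}^{N-1} \int_{-\infty}^{\infty} \frac{dq_k}{C(\epsilon)} \exp\left\{\frac{i}{\hbar} \sum_{k=0}^{N-1} \epsilon \left[\frac{m}{2}\frac{(q_{k+1}-q_k)^2}{\epsilon^2} - V\!\left(\frac{q_{k+1}+q_k}{2}\right)\right]\right\} \tag{20.1}
\]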
We can rewrite it by separating the $k = (N-1)$th integral as,
\[
K(q_{in}, q_f; T) = \int_{-\infty}^{\infty} \frac{dq'}{C(\epsilon)} \exp\left\{\frac{i\epsilon}{\hbar}\left[\frac{m}{2}\frac{(q_f - q')^2}{\epsilon^2} - V\!\left(\frac{q_f+q'}{2}\right)\right]\right\} \cdot K(q_{in}, q'; T-\epsilon)
\]
As $\epsilon \to 0$, the rapid oscillations of the $\frac{1}{\epsilon}$ term imply that the dominant contribution to the integral comes from $q'$ very close to $q_f$. Therefore Taylor expanding the potential and the $K(q_{in}, q'; T-\epsilon)$ about $q_f$, we get
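\begin{align*}
K(q_{in}, q_f; T) \approx \int_{-\infty}^{\infty} \frac{dq'}{C(\epsilon)}\, e^{\frac{i}{\hbar}\frac{m}{2\epsilon}(q_f-q')^2} &\cdot \left\{1 + \left(\frac{-i\epsilon}{\hbar}\right) V(q_f) + \frac{1}{2}\left(\frac{-i\epsilon}{\hbar}\right)^2 V_f^2 + \dots\right\} \\
&\times \left\{1 + (q'-q_f)\,\partial_{q'} + \frac{(q'-q_f)^2}{2!}\,\partial^2_{q'} + \dots\right\} K(q_{in}, q'; T-\epsilon)\Big|_{q'=q_f}
\end{align*}
Notice that the first braces are independent of $q'$ ($V(q)$ is assumed to be reasonably well behaved). The second braces is a power series in $(q'-q_f)$. Writing $x := q' - q_f$,
\[
K(q_{in}, q_f; T) \approx e^{-\frac{i\epsilon}{\hbar}V(q_f)} \int_{-\infty}^{\infty} \frac{dx}{C(\epsilon)}\, e^{\frac{i}{\hbar}\frac{m}{2\epsilon}x^2} \cdot \left\{1 + x\,\partial_{q_f} + \frac{x^2}{2!}\,\partial^2_{q_f} + \dots\right\} K(q_{in}, q_f; T-\epsilon)
\]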
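In effect, we have a series of Gaussian integrals of the form $\int_{-\infty}^{\infty} dx\, x^{n}\, e^{\frac{i}{\hbar}\frac{m x^2}{2\epsilon}}$. The Gaussian integrals are all well known,
\[
\int_{-\infty}^{\infty} dx\, e^{-ax^2} = \sqrt{\frac{\pi}{a}}\,, \qquad
\int_{-\infty}^{\infty} dx\, x^{2k} e^{-ax^2} = \frac{1}{a^{k+1/2}}\,\Gamma(k+1/2)\,, \qquad
\int_{-\infty}^{\infty} dx\, x^{2k+1} e^{-ax^2} = 0\,.
\]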
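As a quick sanity check of the three Gaussian formulas for real $a > 0$ (before any analytic continuation), one can compare them against a direct numerical quadrature. This is only a sketch: the value of $a$, the integration window and the step count are arbitrary choices made for the illustration.

```python
import math

def gaussian_moment_numeric(n, a, half_width=12.0, steps=20_000):
    """Midpoint-rule estimate of the integral of x**n * exp(-a x^2) over the real line.
    The tails beyond +-half_width are neglected (they are exponentially small)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += x ** n * math.exp(-a * x * x)
    return total * h

def gaussian_moment_exact(n, a):
    """Closed forms quoted in the text: zero for odd n, Gamma(k+1/2)/a^(k+1/2) for n = 2k."""
    if n % 2 == 1:
        return 0.0
    k = n // 2
    return math.gamma(k + 0.5) / a ** (k + 0.5)

a = 1.3  # arbitrary positive value (assumption)
for n in range(5):
    print(n, gaussian_moment_numeric(n, a), gaussian_moment_exact(n, a))
```

For $n = 0$ the closed form reduces to $\sqrt{\pi/a}$ since $\Gamma(1/2) = \sqrt{\pi}$.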
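For us, $a = \frac{-im}{2\hbar\epsilon}$. For this to be defined, we have to define the usual Gaussian integral for $a > 0$ and continue analytically in the complex $a$-plane with real part of $a$ being positive. Hence, we must put $m \to m + i\varepsilon$ to provide the convergence factor. Thus,
\[
K(q_{in}, q_f; T) \approx \sqrt{\frac{\pi}{-im/(2\hbar\epsilon)}}\; \frac{1}{C(\epsilon)} \times \left\{1 - i\frac{\epsilon}{\hbar} V(q_f) + \frac{i\hbar\epsilon}{2m}\,\partial^2_{q_f} + \dots\right\} K(q_{in}, q_f; T-\epsilon)\,.
\]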
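The terms in the braces are regular as $\epsilon \to 0$ and hence the right hand side can be arranged to have a regular limit as $\epsilon \to 0$ by choosing $C(\epsilon) = \sqrt{\frac{2\pi\hbar\epsilon}{-im}}$. Additionally, we also get,
\[
K(q_{in}, q_f; T) - K(q_{in}, q_f; T-\epsilon) = \epsilon\left\{\frac{i\hbar}{2m}\,\partial^2_{q_f} - \frac{i}{\hbar}V(q_f)\right\} K(q_{in}, q_f; T) \tag{20.2}
\]
\[
\text{or} \qquad i\hbar\,\partial_T K(q_{in}, q_f; T) = \left\{-\frac{\hbar^2}{2m}\,\partial^2_{q_f} + V\right\} K(q_{in}, q_f; T) \tag{20.3}
\]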
Thus, the $K$ defined above satisfies the time dependent Schrodinger equation!
What about the initial condition? To see the limit $T \to 0$, take $N = 1$ ($T = \epsilon$). Then there is no $\int dq$. Only the $k = 0$ term in the exponent survives and we get,
\[
K(q_{in}, q_f; \epsilon \to 0) = \lim_{\epsilon\to0} \frac{1}{C(\epsilon)} \exp\left\{\frac{i\epsilon}{\hbar}\left[\frac{m}{2}\frac{(q_f-q_{in})^2}{\epsilon^2} - V\!\left(\frac{q_f+q_{in}}{2}\right)\right]\right\}, \qquad C(\epsilon) = \sqrt{\frac{2\pi\hbar\epsilon}{-im}}\,.
\]
The right hand side is just the representation of $\delta(q_f - q_{in})$. The usual quantum framework definition of the transition amplitude satisfies the time dependent Schrodinger equation with the same initial condition! So we have a strong hint that the $K(q_{in}, q_f; T)$ defined may indeed be identified with $\langle q_f|e^{-iTH/\hbar}|q_{in}\rangle$. We can see this directly as well.
B. Derivation From Transition Amplitude
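Divide the interval $[0, T]$ into $N$ intervals of size $\epsilon$ each, $N\epsilon = T$. Since the Hamiltonian is assumed to be time independent, we can write,
\[
e^{-\frac{i}{\hbar}TH} = e^{-\frac{i}{\hbar}\sum_{k=0}^{N-1}H(t_{k+1}-t_k)} = \prod_{k=0}^{N-1} e^{-\frac{i}{\hbar}(t_{k+1}-t_k)H} \approx \prod_{k=0}^{N-1}\left(1 - \frac{i\epsilon}{\hbar}H + \dots\right)_k .
\]
Insert the completeness relation $1 = \int d\vec q_k\, |\vec q_k\rangle\langle\vec q_k|$ between each factor so that
\[
\langle\vec q_f|e^{-\frac{iT}{\hbar}H}|\vec q_{in}\rangle = \int d\vec q_1 \dots d\vec q_{N-1} \prod_{k=0}^{N-1} \langle\vec q_{k+1}|e^{-\frac{i\epsilon H}{\hbar}}|\vec q_k\rangle\,, \qquad \vec q_0 =: \vec q_{in},\ \vec q_N =: \vec q_f\,,
\]
\[
e^{-\frac{i\epsilon H}{\hbar}} \approx 1 - \frac{i\epsilon}{\hbar}H + \dots \quad \text{for small } \epsilon.
\]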
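To simplify further, we consider types of Hamiltonian operators (we suppress the vector arrows).
(i) $H(\hat q, \hat p) = f(\hat p) + g(\hat q)$. Then, using $\langle p_k|q_k\rangle = \frac{e^{-\frac{i}{\hbar}\vec q_k\cdot\vec p_k}}{\sqrt{2\pi\hbar}}$,
\begin{align*}
\langle q_{k+1}|f(\hat p)|q_k\rangle &= \int dp_k\, \langle q_{k+1}|f(\hat p)|p_k\rangle\langle p_k|q_k\rangle = \int dp_k\, f(p_k)\, \langle q_{k+1}|p_k\rangle\langle p_k|q_k\rangle \\
\therefore\ \langle q_{k+1}|f(\hat p)|q_k\rangle &= \int dp_k\, f(p_k)\, \frac{e^{\frac{i}{\hbar}\vec p_k\cdot(\vec q_{k+1}-\vec q_k)}}{2\pi\hbar}
\end{align*}
Likewise,
\[
\langle q_{k+1}|g(\hat q)|q_k\rangle = g(\vec q_k)\,\delta(\vec q_{k+1}-\vec q_k) = g(\vec q_k)\,\frac{1}{2\pi\hbar}\int dp_k\, e^{\frac{i}{\hbar}\vec p_k\cdot(\vec q_{k+1}-\vec q_k)}
\]
\[
\therefore\ \langle q_{k+1}|\hat H(\hat q, \hat p)|q_k\rangle = \int \frac{d\vec p_k}{2\pi\hbar}\, e^{\frac{i}{\hbar}\vec p_k\cdot(\vec q_{k+1}-\vec q_k)}\, H_{cl}\!\left(\frac{\vec q_{k+1}+\vec q_k}{2},\, \vec p_k\right)
\]
We have used $g(q_k) \to g((q_{k+1}+q_k)/2)$ when multiplied by the $\delta(q_{k+1}-q_k)$.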
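(ii) $\hat H$ contains the $\hat q, \hat p$ in various orderings such as $\hat q\hat p,\ \hat p\hat q,\ \hat q^2\hat p,\ \hat q\hat p\hat q,\ \hat p\hat q^2, \dots$ etc. As an example, consider a term $\hat q^a\hat p^b$ in the quantum Hamiltonian. Then,
\begin{align*}
\langle q_{k+1}|\hat q^a\hat p^b|q_k\rangle &= \int d\vec p_k\, \langle q_{k+1}|\hat q^a|p_k\rangle\langle p_k|\hat p^b|q_k\rangle = q_{k+1}^a \int \frac{d\vec p_k}{2\pi\hbar}\, (p_k)^b\, e^{\frac{i}{\hbar}\vec p_k\cdot(\vec q_{k+1}-\vec q_k)} \\
\langle q_{k+1}|\hat p^b\hat q^a|q_k\rangle &= \int d\vec p_k\, \langle q_{k+1}|\hat p^b|p_k\rangle\langle p_k|\hat q^a|q_k\rangle = q_{k}^a \int \frac{d\vec p_k}{2\pi\hbar}\, (p_k)^b\, e^{\frac{i}{\hbar}\vec p_k\cdot(\vec q_{k+1}-\vec q_k)} \\
\therefore\ \langle q_{k+1}|\frac{\hat q^a\hat p^b + \hat p^b\hat q^a}{2}|q_k\rangle &= \frac{q_{k+1}^a + q_k^a}{2} \int \frac{d\vec p_k}{2\pi\hbar}\, (p_k)^b\, e^{\frac{i}{\hbar}\vec p_k\cdot(\vec q_{k+1}-\vec q_k)}
\end{align*}
Thus for the natural ordering, the classical Hamiltonian has its $\vec q$ dependence in the averaged form. Here is another example involving $\hat q^2, \hat p^2$, taken in a particular order, called Weyl order: $Weyl(\hat q^2, \hat p^2) := \frac{1}{4}(\hat q^2\hat p^2 + 2\hat q\hat p^2\hat q + \hat p^2\hat q^2)$. It is easily seen that,
\begin{align*}
\langle q_{k+1}|Weyl(\hat q^2, \hat p^2)|q_k\rangle &= \frac{1}{4}\left[q_{k+1}^2\langle q_{k+1}|\hat p^2|q_k\rangle + 2q_{k+1}q_k\langle q_{k+1}|\hat p^2|q_k\rangle + q_k^2\langle q_{k+1}|\hat p^2|q_k\rangle\right] \\
&= \left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)^2 \langle q_{k+1}|\hat p^2|q_k\rangle
\end{align*}
More generally, for monomials in $\hat q, \hat p$, the Weyl order $Weyl(\hat p^m, \hat q^n)$ equals the fully symmetrized and averaged product. It is a result that
\[
\langle q_{k+1}|Weyl(\hat q^n, \hat p^m)|q_k\rangle = \left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)^n \langle q_{k+1}|\hat p^m|q_k\rangle
\]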
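Hence, modulo Weyl ordering, we do get (labelling the momentum attached to the interval $[t_k, t_{k+1}]$ as $\vec p_{k+1}$),
\[
\langle q_{k+1}|e^{-\frac{i\epsilon}{\hbar}H}|q_k\rangle = \int \frac{d\vec p_{k+1}}{2\pi\hbar}\, e^{-\frac{i\epsilon}{\hbar}H\left(\vec p_{k+1},\, \frac{\vec q_{k+1}+\vec q_k}{2}\right)}\, e^{\frac{i}{\hbar}\vec p_{k+1}\cdot(\vec q_{k+1}-\vec q_k)}
\]
\[
\langle q_f|e^{-\frac{iT}{\hbar}H}|q_{in}\rangle = \int d\vec q_1 \dots d\vec q_{N-1} \int \frac{d\vec p_1}{2\pi\hbar} \dots \frac{d\vec p_N}{2\pi\hbar}\, \exp\left\{\frac{i\epsilon}{\hbar}\sum_{k=0}^{N-1}\left[\vec p_{k+1}\cdot\frac{(\vec q_{k+1}-\vec q_k)}{\epsilon} - H\!\left(\vec p_{k+1},\, \frac{\vec q_{k+1}+\vec q_k}{2}\right)\right]\right\} \tag{20.4}
\]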
Notice that we have $N$ $p_i$ integrations, one for each interval, and $N-1$ $q_i$ integrations, two less than the number of points since $q_0, q_N$ are fixed. The exponent is clearly a discretized form of $\int_0^T dt\,\left[\vec p(t)\cdot\dot{\vec q}(t) - H(\vec p(t), \vec q(t))\right] = \int_0^T dt\, L(\vec q(t), \dot{\vec q}(t))$. The $\frac{d\vec q\, d\vec p}{(2\pi\hbar)}$ measure suggests that we have paths in the “phase space”. What type of paths?
The mis-match in the number of integration variables makes it harder to view the multiple integrals as a measure on the space of paths in phase space. Introduce an arbitrary momentum $\vec p_0$ so that $(\vec q_0, \vec p_0)$ denotes a point in the phase space just as $(\vec q_k, \vec p_k)$, $k = 1, \dots, N$ do. The $\int d\vec p_N$ shows that the momentum at the final point is integrated over. Thus, the phase space path may be specified by an initial point $(\vec q_0, \vec p_0)$. The arbitrary momentum does not enter anywhere and just serves to anchor the initial point. The final point however is the set of all points $(\vec q_N, \vec p_N)$ with $\vec p_N$ integrated over. Thus, the space of paths is ($\Gamma$ is a phase space of dimension $2n$):
\[
\text{Space of paths} =: \mathcal{P}_\Gamma := \left\{\gamma(t) \in \Gamma \ /\ \gamma(0) = (\vec q_{in}, \vec p_0)\,,\ \gamma(T) = (\vec q_f, \vec p),\ \vec p \in \mathbb{R}^n\right\}.
\]
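There is an arbitrary constant here: $\vec p_0$ is an arbitrary constant vector. Since the $\vec p_k, \vec q_k$ are all independent, the paths are continuous but nowhere differentiable. Thus we denote:
\begin{align*}
\langle q_f|e^{-\frac{iT}{\hbar}\hat H}|q_{in}\rangle &:= \lim_{\epsilon\to0} \int d\vec q_1 \dots d\vec q_{N-1} \int \frac{d\vec p_1}{2\pi\hbar} \dots \frac{d\vec p_N}{2\pi\hbar}\, \exp\left[\frac{i\epsilon}{\hbar}\sum_{k=0}^{N-1}\left(\vec p_{k+1}\cdot\frac{(\vec q_{k+1}-\vec q_k)}{\epsilon} - H\!\left(\vec p_{k+1},\, \frac{\vec q_{k+1}+\vec q_k}{2}\right)\right)\right] \tag{20.5} \\
&= \int_{\mathcal{P}_\Gamma} \mathcal{D}q(t)\,\mathcal{D}p(t)\, \exp\left[\frac{i}{\hbar}\int_0^T dt\,\left(\vec p(t)\cdot\dot{\vec q}(t) - H(\vec p(t), \vec q(t))\right)\right], \tag{20.6} \\
\int_{\mathcal{P}_\Gamma} \mathcal{D}q(t)\,\mathcal{D}p(t) &:= \lim_{N\to\infty} \prod_{i=1}^{N-1}\int d\vec q_i\ \prod_{j=1}^{N}\int \frac{d\vec p_j}{2\pi\hbar} \tag{20.7}
\end{align*}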
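For the special case of $H(\vec p, \vec q) = \frac{\vec p^{\,2}}{2m} + V(\vec q)$, the momentum dependence is quadratic and the momentum integrals can be done trivially. For instance,
\begin{align*}
\int \frac{d\vec p_{k+1}}{(2\pi\hbar)^d}\, e^{\frac{i\epsilon}{\hbar}\left[\vec p_{k+1}\cdot\frac{(\vec q_{k+1}-\vec q_k)}{\epsilon} - \frac{\vec p_{k+1}^{\,2}}{2m} - V\left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)\right]}
&= \frac{1}{(2\pi\hbar)^d}\, e^{\frac{i}{\hbar}\frac{m}{2\epsilon}(\vec q_{k+1}-\vec q_k)^2}\, e^{-\frac{i\epsilon}{\hbar}V} \times \int d^dp\ e^{-\frac{i\epsilon}{2m\hbar}\left(\vec p - \frac{m(\vec q_{k+1}-\vec q_k)}{\epsilon}\right)^2} \\
&= \left(\frac{m}{2\pi\hbar(i\epsilon)}\right)^{d/2} e^{\frac{i\epsilon}{\hbar}\left[\frac{m}{2}\frac{(\vec q_{k+1}-\vec q_k)^2}{\epsilon^2} - V\left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)\right]}
\end{align*}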
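The square root prefactor is just $C(\epsilon)^{-1}$ (for $d = 1$). Doing all the momentum integrals gives,
\[
\langle q_f|e^{-\frac{i}{\hbar}T\hat H}|q_{in}\rangle = \lim_{\epsilon\to0} \int \frac{d\vec q_1}{(2\pi\hbar(i\epsilon/m))^{d/2}} \cdots \int \frac{d\vec q_{N-1}}{(2\pi\hbar(i\epsilon/m))^{d/2}}\ \frac{1}{(2\pi\hbar(i\epsilon/m))^{d/2}}\ \exp\left\{\frac{i\epsilon}{\hbar}\sum_{k=0}^{N-1}\left[\frac{m}{2}\frac{(\vec q_{k+1}-\vec q_k)^2}{\epsilon^2} - V\!\left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)\right]\right\} \tag{20.8}
\]
Each $d\vec q$ integral gets a factor of $C(\epsilon)^{-d}$ and one extra such factor is left over from the extra momentum integration. This was the factor introduced and determined by a regular behavior as $\epsilon \to 0$.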
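The role of the normalization $C(\epsilon)$ can be made concrete numerically. The oscillatory integrals above are awkward to evaluate directly, so the sketch below makes an assumption that is not part of the text at this point: it uses the imaginary-time analogue $\epsilon \to -i\tau$ with $\hbar = m = 1$ and $V = 0$, where the short-time kernel becomes the real Gaussian $e^{-x^2/2\tau}/\sqrt{2\pi\tau}$. With this normalization, integrating two short-time kernels over the intermediate point reproduces the kernel for the doubled time step.

```python
import math

def k_euclid(x, tau):
    """Free short-time kernel in imaginary time, hbar = m = 1 (assumption):
    exp(-x^2 / (2 tau)) / C(tau), with C(tau) = sqrt(2 pi tau)."""
    return math.exp(-x * x / (2.0 * tau)) / math.sqrt(2.0 * math.pi * tau)

def compose(q_f, q_i, tau, half_width=10.0, steps=4000):
    """Integrate K(q_f - q', tau) * K(q' - q_i, tau) over the intermediate point q'
    using the midpoint rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        q = -half_width + (i + 0.5) * h
        total += k_euclid(q_f - q, tau) * k_euclid(q - q_i, tau)
    return total * h

q_f, q_i, tau = 0.4, -0.2, 0.1   # arbitrary illustrative values
two_steps = compose(q_f, q_i, tau)
one_step = k_euclid(q_f - q_i, 2.0 * tau)
print(two_steps, one_step)
```

Any other choice of the prefactor would leave a mismatch that compounds with every added time slice, which is the discrete counterpart of demanding a regular limit as $\epsilon \to 0$.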
Thus, the upshot is that we can either take the heuristically motivated definition of K and
show that it matches the quantum mechanical definition of transition amplitude or directly
deduce it from the quantum mechanical definition. The K of the transition amplitude is the
central quantity in the path integral approach.
A note: We defined $K(q_{in}, q_f; T)$ and identified it with $\langle q_f|e^{-(i/\hbar)TH}|q_{in}\rangle$ and called it the probability amplitude for a particle at $q_{in}$ at $t_0$ to transit to $q_f$ at time $t_0 + T$. This is sometimes also denoted as $\langle q_f, T|q_{in}, 0\rangle$ or $\langle q'', t''|q', t'\rangle$ with $t'' - t' = T$. This notation can be confusing since $|q', t'\rangle$ does not denote a solution of the Schrodinger equation. Rather, it denotes the instantaneous eigenvector of the Heisenberg picture operator $\hat Q(t) := e^{itH/\hbar}\, Q_{Sch}(0)\, e^{-itH/\hbar}$. Thus,
\[
Q(t)|q, t\rangle = q|q, t\rangle \ \leftrightarrow\ Q_{Sch}\, e^{-itH/\hbar}|q, t\rangle = q\, e^{-itH/\hbar}|q, t\rangle \ \Rightarrow\ e^{-itH/\hbar}|q, t\rangle = |q\rangle
\]
Or, $|q, t\rangle = e^{itH/\hbar}|q\rangle$. If $|q, t\rangle$ were the time evolution of $|q\rangle$, we should have had $e^{-itH/\hbar}|q\rangle$ instead. Thus we have the consistent notation:
\[
\langle q_f|e^{-\frac{iTH}{\hbar}}|q_{in}\rangle = \langle q_f|e^{-\frac{i(t''-t')H}{\hbar}}|q_{in}\rangle = \langle q_f|e^{-\frac{it''H}{\hbar}}\cdot e^{\frac{it'H}{\hbar}}|q_{in}\rangle = \langle q_f, t''|q_{in}, t'\rangle\,.
\]
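Consider now $\langle q_f, T|Q_H(t)|q_{in}, -T\rangle$, for $t \in [-T, T]$.
\begin{align*}
\langle q_f, T|Q_H(t)|q_{in}, -T\rangle &= \langle q_f|e^{-iTH/\hbar}\, e^{itH/\hbar}\, Q_S\, e^{-itH/\hbar}\, e^{-iTH/\hbar}|q_{in}\rangle \\
&= \int dq'\, dq''\ \langle q_f|e^{i(t-T)H/\hbar}|q'\rangle\langle q'|Q_S|q''\rangle\langle q''|e^{-i(t+T)H/\hbar}|q_{in}\rangle \\
&= \int dq'\, dq''\ \langle q_f, T|q', t\rangle\langle q'|Q_S|q''\rangle\langle q'', t|q_{in}, -T\rangle \\
&= \int dq'\ \langle q_f, T|q', t\rangle\, q'(t)\, \langle q', t|q_{in}, -T\rangle
\end{align*}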
In the last equation, we have used $\langle q'|Q_S|q''\rangle = q'\,\delta(q' - q'')$, the property of the Schrodinger picture operator, carried out the $dq''$ integration and as a reminder inserted the argument $t$ in $q'(t)$. Using the path integral notation, we write the last equation as,
\[
\langle q_f, T|Q_H(t)|q_{in}, -T\rangle = \int \mathcal{D}q\ q(t)\, e^{\frac{i}{\hbar}S[-T,T]} = \int dq \int \mathcal{D}q\, e^{\frac{i}{\hbar}S[t,T]}\ q(t) \int \mathcal{D}q\, e^{\frac{i}{\hbar}S[-T,t]} \tag{20.9}
\]
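Next, consider the quantity,
\begin{align*}
\int \mathcal{D}q\ q(t_1)q(t_2)\, e^{\frac{i}{\hbar}S[-T,T]} := \int dq_1\, dq_2 &\int \mathcal{D}q\, e^{\frac{i}{\hbar}S[t_1,T]}\ q_1(t_1)\ \times \\
&\int \mathcal{D}q\, e^{\frac{i}{\hbar}S[t_2,t_1]}\ q_2(t_2) \int \mathcal{D}q\, e^{\frac{i}{\hbar}S[-T,t_2]}\,, \qquad -T \le t_2 \le t_1 \le T\,;
\end{align*}
For $t_2 \ge t_1$ the factors will switch accordingly. We summarize the formulae as,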
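\begin{align*}
\langle q_f, t''|q_{in}, t'\rangle &= \int \mathcal{D}q\, e^{\frac{i}{\hbar}S[t',t'']} =: \langle q_f|e^{-\frac{i}{\hbar}(t''-t')\hat H}|q_{in}\rangle \tag{20.10} \\
\langle q_f, T|Q_H(t)|q_{in}, -T\rangle &= \int \mathcal{D}q\ q(t)\, e^{\frac{i}{\hbar}S[-T,T]} := \int dq_t\ \langle q_f, T|q_t, t\rangle\, q_t\, \langle q_t, t|q_{in}, -T\rangle \tag{20.11} \\
\langle q_f, T|T\{Q_H(t_1)Q_H(t_2)\}|q_{in}, -T\rangle &= \int \mathcal{D}q\ q(t_1)q(t_2)\, e^{\frac{i}{\hbar}S[-T,T]} \\
&:= \int dq_1\, dq_2
\begin{cases}
\langle q_f, T|q_1, t_1\rangle\, q_1(t_1)\, \langle q_1, t_1|q_2, t_2\rangle\, q_2(t_2)\, \langle q_2, t_2|q_{in}, -T\rangle\,, & t_2 \le t_1 \\
\langle q_f, T|q_2, t_2\rangle\, q_2(t_2)\, \langle q_2, t_2|q_1, t_1\rangle\, q_1(t_1)\, \langle q_1, t_1|q_{in}, -T\rangle\,, & t_1 \le t_2
\end{cases} \tag{20.12}
\end{align*}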
C. Functional Derivative
It will be convenient to have the notation of functional derivative. Recall from the variational principle of mechanics, the action functional, $S[q(t)] := \int_{t_1}^{t_2} dt\, L(q(t), \dot q(t))$. For any given function $q : [t_1, t_2] \to \mathbb{R}$, $t \mapsto q(t)$, the right hand side computes a number and that number is the action. It is a function on the space of paths/curves/functions on $[t_1, t_2]$, and in short is regarded as a functional of any given $q(t)$. This is not a functional in the sense of being in the dual of a vector space: it is not linear in the $q(t)$'s. Under a variation of a curve, $q(t) \to q(t) + \delta q(t)$, we compute $\delta S := S[q + \delta q] - S[q] = \int dt\ \delta q \left\{\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}\right\}$ + end-point contributions. Comparing this with $df(x^1, \dots, x^n) = \sum_{i=1}^n dx^i\, \partial_i f$, we can identify and denote the functional derivative of the action with respect to the curve as: $\frac{\delta S}{\delta q(t)} := \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q}$. Note that $S$ has no explicit $t$ dependence while its functional derivative does. Thus the idea of a functional derivative is to consider the first order variation of a functional and read off the coefficient of the $\delta q$.
We can also notice that for any function $f(t)$ we can write, $\delta f(t) = \int dt'\, \delta(t - t')\, \delta f(t') \ \leftrightarrow\ \frac{\delta f(t)}{\delta f(t')} = \delta(t - t')$. Thus we define the derivative with respect to a function through the rules:
(i) $\frac{\delta}{\delta f(t)}\left(\alpha F[f] + \beta G[f]\right) = \alpha\frac{\delta F}{\delta f(t)} + \beta\frac{\delta G}{\delta f(t)}$ (linearity)
(ii) $\frac{\delta}{\delta f(t)}\left(F[f]\cdot G[f]\right) = \frac{\delta F}{\delta f(t)}\cdot G[f] + F[f]\cdot\frac{\delta G}{\delta f(t)}$ (Leibniz rule)
(iii) $\frac{\delta}{\delta f(t)}F(G[f]) = \frac{\partial F}{\partial G}\frac{\delta G}{\delta f(t)}$ (Chain rule)
(iv) $\frac{\delta f(t')}{\delta f(t)} := \delta(t' - t)$
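On a time grid the functional derivative becomes an ordinary partial derivative divided by the grid spacing, $\frac{\delta F}{\delta f(t_i)} \approx \frac{1}{\Delta t}\frac{\partial F}{\partial f_i}$. A minimal sketch of this correspondence follows, using the test functional $F[f] = \int_0^1 dt\, f(t)^2$, for which the rules above give $\delta F/\delta f(t) = 2f(t)$. The functional, the grid and the sample function are choices made only for this illustration.

```python
import math

def functional(f_vals, dt):
    """F[f] = integral of f(t)^2 dt, approximated by a Riemann sum on the grid."""
    return sum(v * v for v in f_vals) * dt

def functional_derivative(F, f_vals, dt, i, h=1e-4):
    """Grid version of deltaF/deltaf(t_i): (1/dt) * dF/df_i by central difference."""
    up = list(f_vals)
    up[i] += h
    dn = list(f_vals)
    dn[i] -= h
    return (F(up, dt) - F(dn, dt)) / (2.0 * h * dt)

n = 200
dt = 1.0 / n
grid = [(j + 0.5) * dt for j in range(n)]
f_vals = [math.sin(2.0 * math.pi * t) for t in grid]  # sample function (assumption)

i = 37
numeric = functional_derivative(functional, f_vals, dt, i)
exact = 2.0 * f_vals[i]
print(numeric, exact)
```

The division by `dt` is the grid counterpart of rule (iv): $\partial f_j/\partial f_i = \delta_{ij}$, and $\delta_{ij}/\Delta t$ plays the role of $\delta(t' - t)$.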
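For an arbitrary “source function”, $J(t)$, define the functional,
\[
\langle q'', t''|q', t'\rangle[J] := \int \mathcal{D}q\ e^{\frac{i}{\hbar}\int_{t'}^{t''} dt\,\{L(q, \dot q) + J(t)q(t)\}}\,. \tag{20.13}
\]
Then it follows that,
\begin{align*}
\frac{\delta}{\delta J(t)}\langle q'', t''|q', t'\rangle[J] &= \int \mathcal{D}q\ e^{\frac{i}{\hbar}\int_{t'}^{t''} dt\,\{L(q, \dot q) + J(t)q(t)\}} \cdot \frac{i}{\hbar}\,q(t) \\
\text{Or,}\quad -i\hbar\frac{\delta}{\delta J(t)}\langle q'', t''|q', t'\rangle[J] &= \int \mathcal{D}q\ q(t)\, e^{\frac{i}{\hbar}\int_{t'}^{t''} dt\,\{L(q, \dot q) + J(t)q(t)\}} = \langle q'', t''|Q(t)|q', t'\rangle_J\,. \\
\left(-i\hbar\frac{\delta}{\delta J(t_1)}\right)\left(-i\hbar\frac{\delta}{\delta J(t_2)}\right)\langle q'', t''|q', t'\rangle[J] &= \int \mathcal{D}q\ q(t_1)q(t_2)\, e^{\frac{i}{\hbar}\int_{t'}^{t''} dt\,\{L(q, \dot q) + J(t)q(t)\}} = \langle q'', t''|T\{Q(t_1)Q(t_2)\}|q', t'\rangle_J\,.
\end{align*}
The generalization is obvious. Evaluating the functional derivatives at $J(t) = 0$ gives us,
\[
\langle q'', t''|T\{Q(t_1) \dots Q(t_n)\}|q', t'\rangle = (-i\hbar)^n \left.\frac{\delta^n \langle q'', t''|q', t'\rangle[J]}{\delta J(t_1) \dots \delta J(t_n)}\right|_{J=0} \tag{20.14}
\]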
Thus, the “correlation functions” (left hand side) are given by the functional derivatives
of the transition amplitude in presence of a source function, evaluated at vanishing source
function.
To relate it to ground state/vacuum expectation values of time ordered products of
Heisenberg picture operators, we take a closer look at the transition amplitude with non-zero
source function.
Choose the source function to have a compact support, $J(t) = 0$ for $t < t'$, $t > t''$. Choose $T' < t'$ and $T'' > t''$. Then,
\[
\langle Q'', T''|Q', T'\rangle_J = \int dq''\, dq'\ \langle Q'', T''|q'', t''\rangle_{J=0}\ \langle q'', t''|q', t'\rangle_J\ \langle q', t'|Q', T'\rangle_{J=0}\,.
\]
The $J = 0$ amplitudes can be written in the energy representation (energy eigenvalues assumed to be discrete for convenience),
\begin{align*}
\langle q', t'|Q', T'\rangle &= \langle q'|e^{-i(t'-T')H/\hbar}|Q'\rangle = \sum_n \langle q'|\varphi_n\rangle\langle\varphi_n|Q'\rangle\, e^{-i(t'-T')E_n/\hbar} = \sum_n \varphi_n(q')\varphi_n^*(Q')\, e^{-i(t'-T')E_n/\hbar} \\
\therefore\ e^{-iT'E_0/\hbar}\,\langle q', t'|Q', T'\rangle &= \sum_n \varphi_n(q')\varphi_n^*(Q')\, e^{-it'E_n/\hbar}\, e^{iT'(E_n-E_0)/\hbar} \\
\therefore\ \lim_{T'\to i\infty} e^{-iT'E_0/\hbar}\,\langle q', t'|Q', T'\rangle &= \varphi_0(q')\varphi_0^*(Q')\, e^{-it'E_0/\hbar} \qquad \because\ E_n \ne E_0 \text{ terms drop out; likewise} \\
\lim_{T''\to -i\infty} e^{iT''E_0/\hbar}\,\langle Q'', T''|q'', t''\rangle &= \varphi_0^*(q'')\varphi_0(Q'')\, e^{it''E_0/\hbar}
\end{align*}
We have assumed that the lowest energy eigenvalue is non-degenerate, otherwise the $\sum_n$ will reduce to the sum over the degenerate states. Thus we get, with $\varphi_0(q, t) := \varphi_0(q)\, e^{-itE_0/\hbar}$,
\[
\lim_{\substack{T'\to i\infty \\ T''\to -i\infty}} \frac{\langle Q'', T''|Q', T'\rangle_J}{e^{-iE_0(T''-T')/\hbar}\ \varphi_0(Q'')\varphi_0^*(Q')} = \int dq'\, dq''\ \varphi_0^*(q'', t'')\,\varphi_0(q', t')\,\langle q'', t''|q', t'\rangle_J \tag{20.15}
\]
D. Ground State-to-ground state Amplitude: Z[J]
The right hand side of eq. (20.15) is the ground state-to-ground state transition amplitude, the one that we were looking for. The left hand side tells us how to compute it from $\langle Q'', T''|Q', T'\rangle_J$ which is similar to $\langle q'', t''|q', t'\rangle_J$. The factor in the denominator is independent of $J$ and will not matter. We introduce the definition,
\begin{align*}
Z[J] &:= \int dq'\, dq''\ \varphi_0^*(q'', t'')\,\langle q'', t''|q', t'\rangle_J\,\varphi_0(q', t') \tag{20.16} \\
\left.\frac{\delta^n Z[J]}{\delta J(t_1) \dots \delta J(t_n)}\right|_{J=0} &= \left(\frac{i}{\hbar}\right)^n \int dq'\, dq''\ \varphi_0^*(q'', t'') \times \langle q'', t''|T\{Q(t_1) \dots Q(t_n)\}|q', t'\rangle\ \varphi_0(q', t')\,. \tag{20.17}
\end{align*}
The previous result tells us that $Z[J]$ may be computed as,
\[
Z[J] \sim \lim_{\substack{T'\to i\infty \\ T''\to -i\infty}} \langle Q'', T''|Q', T'\rangle_J = \lim_{\substack{T'\to i\infty \\ T''\to -i\infty}} \int \mathcal{D}q\ \exp\left[\frac{i}{\hbar}\int_{T'}^{T''} dt\,\{L(q, \dot q) + J(t)q(t)\}\right]
\]
We have dropped the unimportant denominator on the left hand side.
We have continued $T' \to i\infty$, $T'' \to -i\infty$, but not the intermediate times, $t_i$, appearing
as arguments of the Heisenberg operators. We will do so now and get to the Euclidean
formulation.
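We begin with,
\begin{align*}
\langle Q'', T''|T\{Q(t_1) \dots Q(t_n)\}|Q', T'\rangle_{J=0} \sim \lim_{\substack{T'\to i\infty \\ T''\to -i\infty}} \int dq_1 \dots dq_n\ &\langle Q'', T''|q_1, t_1\rangle\, q_1\, \langle q_1, t_1|q_2, t_2\rangle \\
&\dots\ q_n\, \langle q_n, t_n|Q', T'\rangle\,, \quad \text{where,}
\end{align*}
\begin{align*}
\langle q_i, t_i|q_{i+1}, t_{i+1}\rangle &\sim \int \prod_{j=1}^{N-1} \frac{dq_j}{\sqrt{2\pi i\epsilon}}\ \exp\left[\frac{i\epsilon}{\hbar}\sum_j L\!\left(\frac{q_j + q_{j+1}}{2},\ \frac{q_{j+1} - q_j}{\epsilon}\right)\right], & N\epsilon &= t_{i+1} - t_i \\
\langle q_i, -i\tau_i|q_{i+1}, -i\tau_{i+1}\rangle &\sim \int \prod_{j=1}^{N-1} \frac{dq_j}{\sqrt{2\pi\epsilon'}}\ \exp\left[\frac{\epsilon'}{\hbar}\sum_j L\!\left(\frac{q_j + q_{j+1}}{2},\ \frac{q_{j+1} - q_j}{-i\epsilon'}\right)\right], & N\epsilon' &:= \tau_{i+1} - \tau_i
\end{align*}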
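The analytic continuation of the $\langle T\{\dots\}\rangle$ is defined through the $\langle ..|..\rangle$. Thus,
\[
\langle Q'', T''|T\{Q(t_1) \dots Q(t_n)\}|Q', T'\rangle_{J=0}\Big|_{t_i = -i\tau_i} \sim \lim_{\substack{\tau_{in}\to -\infty \\ \tau_f\to \infty}} \int \mathcal{D}q\ q(\tau_1) \dots q(\tau_n) \times \exp\left[\int_{\tau_{in}}^{\tau_f} d\tau\ L\!\left(q,\ i\frac{dq}{d\tau}\right)\right]
\]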
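This suggests going over to a Euclidean formulation,
\[
Z_E[J] := \int \mathcal{D}q\ \exp\left[\int_{-\infty}^{\infty} d\tau\ \left\{L\!\left(q,\ i\frac{dq}{d\tau}\right) + J(\tau)q(\tau)\right\}\right] \tag{20.18}
\]
and the paths are between some $Q' = \lim_{\tau\to -\infty} q(\tau)$ and $Q'' = \lim_{\tau\to\infty} q(\tau)$. The Euclidean $Z_E$ and the Minkowskian $Z$ are related through,
\[
\left.\frac{1}{Z[J]}\frac{\delta^n Z[J]}{\delta J(t_1) \dots \delta J(t_n)}\right|_{J=0} = i^n \left.\frac{1}{Z_E[J]}\frac{\delta^n Z_E[J]}{\delta J(\tau_1) \dots \delta J(\tau_n)}\right|_{J=0}
\]
This will be used in the field theory Green's functions.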
E. Explicit evaluation of a path integral
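We would like to see the various expressions above explicitly for a 1-dimensional harmonic oscillator with a source function. We have $L = \frac{1}{2}(\dot q^2 - \omega^2 q^2) + J(t)q(t)$ and we want to compute $\langle q'', t''|q', t'\rangle_J$. Discretizing the corresponding action, the definition gives,
\begin{align*}
\langle q'', t''|q', t'\rangle_J = \lim_{\epsilon\to0} \frac{1}{C(\epsilon)} \int_{-\infty}^{\infty} \prod_{k=1}^{N-1} \frac{dq_k}{C(\epsilon)}\ \exp\Bigg[\frac{i\epsilon}{\hbar}\sum_{k=0}^{N-1}\Bigg\{&\frac{1}{2}\frac{(q_{k+1} - q_k)^2}{\epsilon^2} - \frac{\omega^2}{2}\left(\frac{q_k + q_{k+1}}{2}\right)^2 \\
&+ J_k\,\frac{q_k + q_{k+1}}{2}\Bigg\}\Bigg]\,, \qquad q_0 := q(t'),\ q_N := q(t''),\ J_k := J(t_k)
\end{align*}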
All are Gaussian integrals with coupled variables. It is more convenient to discretize a different action.
The equations of motion are: $\ddot q + \omega^2 q = J(t)$, $q(t') = q'$, $q(t'') = q''$. Let $q_{cl}(t)$ be a classical solution of this equation satisfying the end point conditions. Let us assume that there is just one such solution. Introduce $\eta(t)$ via the definition: $q(t) := q_{cl}(t) + \eta(t)$. Then $\eta(t') = 0 = \eta(t'')$. The action becomes,
\begin{align*}
S(t', t'') &= \int_{t'}^{t''} dt \left[\frac{1}{2}(\dot q_{cl} + \dot\eta)^2 - \frac{\omega^2}{2}(q_{cl} + \eta)^2 + J(t)(q_{cl} + \eta)\right] \\
&= \int_{t'}^{t''} dt \left[\frac{1}{2}(\dot q_{cl}^2 - \omega^2 q_{cl}^2) + J(t)q_{cl} + \frac{1}{2}(\dot\eta^2 - \omega^2\eta^2) + \dot\eta\,\dot q_{cl} - \omega^2\eta\, q_{cl} + J\eta\right]
\end{align*}
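Up to a total derivative, the last three terms combine to $-\eta(\ddot q_{cl} + \omega^2 q_{cl} - J) = 0$. The total derivative term gives $\eta\,\dot q_{cl}\big|_{t'}^{t''} = 0$ thanks to the end point condition satisfied by $\eta(t)$. Thus, $S(t', t'') = S_{cl}(t', t'') + \int_{t'}^{t''} dt\ \frac{1}{2}(\dot\eta^2 - \omega^2\eta^2)$, where $S_{cl}$ is the action evaluated at the presumed solution $q_{cl}$. The terms linear in $\eta$ have vanished thanks to $q_{cl}$ being a solution. We now “quantize” the $\eta$ variable, i.e. we define
\[
\langle q'', t''|q', t'\rangle_J := \left[e^{\frac{i}{\hbar}S^J_{cl}(t', t'')}\right] \lim_{\epsilon\to0} \frac{1}{C(\epsilon)} \int_{-\infty}^{\infty} \prod_{k=1}^{N-1} \frac{d\eta_k}{C(\epsilon)}\ \exp\left[\frac{i\epsilon}{\hbar}\sum_{k=0}^{N-1}\left\{\frac{1}{2}\frac{(\eta_{k+1} - \eta_k)^2}{\epsilon^2} - \frac{\omega^2}{2}\left(\frac{\eta_k + \eta_{k+1}}{2}\right)^2\right\}\right] \tag{20.19}
\]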
The paths $\eta(t)$ begin and end at $\eta = 0$. The second factor is independent of both $J$ and $t', t''$. It depends on $T = t'' - t'$. Our task is to evaluate the first factor and carry out the coupled Gaussian integral.
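The exponent in the second factor is of the form,
\[
\frac{i\epsilon}{\hbar}\sum_{k=0}^{N-1}\{\dots\} = \frac{i}{2\hbar\epsilon}\sum_{k=0}^{N-1}\left[(\eta_{k+1} - \eta_k)^2 - \omega^2\epsilon^2\left(\frac{\eta_{k+1} + \eta_k}{2}\right)^2\right] := \frac{i}{2\hbar}\sum_{k,l=1}^{N-1} \eta_k A_{kl}\,\eta_l\,, \quad \text{where,}
\]
\[
A_{lk} := \underbrace{2\left(\frac{1}{\epsilon} - \frac{\omega^2\epsilon}{4}\right)}_{a}\,\delta_{lk} - \underbrace{\left(\frac{1}{\epsilon} + \frac{\omega^2\epsilon}{4}\right)}_{b}\,\{\delta_{l,k+1} + \delta_{l,k-1}\}\,, \qquad l, k = 1, \dots, N-1.
\]
\[
A = \begin{pmatrix}
a & -b & 0 & \dots & 0 \\
-b & a & -b & \dots & 0 \\
0 & -b & a & \ddots & \vdots \\
\vdots & & \ddots & \ddots & -b \\
0 & \dots & 0 & -b & a
\end{pmatrix} \qquad \text{A tri-diagonal, symmetric matrix.}
\]
\[
\therefore\ \langle q'', t''|q', t'\rangle_J := \left[e^{\frac{i}{\hbar}S^J_{cl}(t', t'')}\right] \lim_{\epsilon\to0} \frac{1}{C(\epsilon)} \int_{-\infty}^{\infty} \frac{d^{N-1}\eta}{C(\epsilon)^{N-1}}\ \exp\left\{\frac{i}{2\hbar}\sum_{i,j=1}^{N-1} \eta_i A_{ij}\,\eta_j\right\}
\]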
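For a single variable we have $\int_{-\infty}^{\infty} dx\, e^{-ax^2} = \sqrt{\pi/a} \ \Rightarrow\ \int_{-\infty}^{\infty} dx\, e^{iax^2} = \sqrt{\pi/(-ia)}$. Its multi-dimensional generalization is
\[
\int_{-\infty}^{\infty} d^nx\ e^{i\sum_{ij} x_i A_{ij} x_j} = \frac{(i\pi)^{n/2}}{\sqrt{\det A}}
\]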
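Using this with $A \to A/(2\hbar)$, our transition amplitude takes the form,
\[
\langle q'', t''|q', t'\rangle_J = \left[e^{\frac{i}{\hbar}S^J_{cl}(t', t'')}\right] \lim_{\epsilon\to0} \frac{1}{C(\epsilon)}\,\frac{1}{C(\epsilon)^{N-1}}\,\frac{(2\pi i\hbar)^{(N-1)/2}}{\sqrt{\det A}}
\]
Using $C(\epsilon) = \sqrt{2\pi i\hbar\epsilon}$ since we have taken unit mass, $m = 1$, we have,
\[
\langle q'', t''|q', t'\rangle_J = \left[e^{\frac{i}{\hbar}S^J_{cl}(t', t'')}\right] \lim_{\epsilon\to0} \left[\frac{1}{\sqrt{2\pi i\hbar}}\ \frac{1}{(\epsilon)^{N/2}\sqrt{\det A(\epsilon)}}\right] \tag{20.20}
\]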
Now we need to compute the determinant of the tridiagonal, symmetric matrix. This is usually solved by using a recursion relation. Denote $D_n := \det A_{n\times n}$ where $A$ has the form given above. Clearly, $D_1 = a$ and $D_2 = a^2 - b^2$. By checking for $4\times4$, $5\times5$ matrices, it is easy to see that $D_n$ satisfies the recursion relation,
\[
D_n = aD_{n-1} - b^2 D_{n-2}\,, \quad D_0 := 1,\ D_1 = a\,; \qquad a = 2\left(\frac{1}{\epsilon} - \frac{\omega^2\epsilon}{4}\right), \quad b = \frac{1}{\epsilon} + \frac{\omega^2\epsilon}{4}\,.
\]
We can pull out a factor of $\epsilon^{-1}$ from $A$ and since our $A$ is of order $(N-1)$, we pull out a factor of $(\epsilon)^{-(N-1)/2}$ from $\sqrt{\det A}$. This replaces the second $[\dots]$ by $\lim_{\epsilon\to0}\left(\sqrt{2\pi i\hbar\,\epsilon\,\det A'}\right)^{-1}$. The $A'$ matrix has elements $a' := 2 - \epsilon^2\omega^2/2$, $b' := 1 + \epsilon^2\omega^2/4$.
Note: For a given $N$, $\epsilon = T/N$, hence (suppressing the primes) $a, b$ have an $N$ dependence.
The matrix itself is also (N − 1) × (N − 1) and our notation Dn as the determinant of An×n
is valid for n ≤ N − 1. For a given N , the Dn≥N is not defined. Hence, the recursion relation
is a difference equation with constant coefficients [15]. It is important to keep the distinction
between fixed N and a variable n. The a, b are functions of N but independent of n.
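Such difference equations are solved by the ansatz, $D_n = \lambda^n$. Shifting $n \to n + 2$, we write the difference equation as,
\[
\forall\ 0 \le n \le N - 3:\quad D_{n+2} - aD_{n+1} + b^2 D_n = 0\,, \quad D_0 = 1,\ D_1 = a.
\]
Substitution gives the characteristic equation $\lambda^2 - a\lambda + b^2 = 0$. Its solutions, for $\epsilon \ll 1$, are
\[
\lambda_\pm \approx \left(1 - \frac{\epsilon^2\omega^2}{4}\right) \pm \sqrt{\left(1 - \frac{\epsilon^2\omega^2}{2}\right) - \left(1 + \frac{\epsilon^2\omega^2}{2}\right)} \approx 1 - \frac{\epsilon^2\omega^2}{4} \pm i\epsilon\omega
\]
\[
\therefore\ \lambda_\pm \simeq 1 \pm i\epsilon\omega\,. \ \Rightarrow\ D_n = \alpha\lambda_+^n + \beta\lambda_-^n \quad \forall\ n \in [0, N-1].
\]
\[
\text{Initial conditions give,}\quad \alpha = \frac{1}{2}\left(1 - \frac{i}{\epsilon\omega}\right), \quad \beta = \frac{1}{2}\left(1 + \frac{i}{\epsilon\omega}\right).
\]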
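The desired determinant is then given by,
\begin{align*}
D_{N-1} &= \frac{1}{2}\left(1 - \frac{i}{\epsilon\omega}\right)(1 + i\epsilon\omega)^{N-1} + \frac{1}{2}\left(1 + \frac{i}{\epsilon\omega}\right)(1 - i\epsilon\omega)^{N-1}\,, \qquad \epsilon = T/N \\
&\xrightarrow[N\to\infty]{} \frac{1}{2}\left(1 - \frac{iN}{\omega T}\right)(1 + i\omega T/N)^N + \frac{1}{2}\left(1 + \frac{iN}{\omega T}\right)(1 - i\omega T/N)^N \\
&\to -\frac{iN}{2\omega T}\left(e^{i\omega T} - e^{-i\omega T}\right) = \frac{N}{\omega T}\sin(\omega T) \\
\therefore\ \epsilon D_{N-1} &\to \frac{\sin(\omega T)}{\omega}\,.
\end{align*}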
We finally get,
\[
\langle q'', t''|q', t'\rangle_J = \left[e^{\frac{i}{\hbar}S^J_{cl}(t', t'')}\right] \frac{1}{\sqrt{2\pi i\hbar}}\sqrt{\frac{\omega}{\sin(\omega T)}}\,. \tag{20.21}
\]
Note: All the J dependence is in the first factor only while the “quantum correction” are
in the second factor and independent of J.
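The limit of the determinant can be checked by iterating the recursion numerically. The sketch below uses the rescaled entries $a' = 2 - \epsilon^2\omega^2/2$ and $b' = 1 + \epsilon^2\omega^2/4$ quoted above, and compares $\epsilon D_{N-1}$ with $\sin(\omega T)/\omega$ for increasing $N$. The values of $\omega$ and $T$ are arbitrary choices for the illustration.

```python
import math

def scaled_determinant(omega, T, N):
    """Return eps * D_{N-1}, where D_n = a D_{n-1} - b^2 D_{n-2}, D_0 = 1, D_1 = a,
    with the rescaled entries a = 2 - (eps*omega)^2 / 2 and b = 1 + (eps*omega)^2 / 4."""
    eps = T / N
    a = 2.0 - (eps * omega) ** 2 / 2.0
    b = 1.0 + (eps * omega) ** 2 / 4.0
    d_prev, d_curr = 1.0, a            # D_0 and D_1
    for _ in range(2, N):              # builds D_2, ..., D_{N-1}
        d_prev, d_curr = d_curr, a * d_curr - b * b * d_prev
    return eps * d_curr

omega, T = 1.3, 2.0                    # arbitrary illustrative values
target = math.sin(omega * T) / omega
for N in (400, 4000, 40000):
    print(N, scaled_determinant(omega, T, N), target)
```

The discrepancy shrinks roughly like $1/N$, consistent with the limit being reached only as $\epsilon \to 0$.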
Note: This method of evaluating the amplitude near a classical solution can also be
adopted for more general (non-linear) equations of motion. The Scl always comes out, the
term linear in η always vanishes while the O(η²) terms always give the (determinant)^{−1/2}.
In the general context, it is termed as a semi-classical approximation which is exact for the
oscillator.
The calculation of the first factor requires solving the equation of motion with the source
function and then evaluating the action for $q_{cl}(t)$. This is a little involved as the equation
is inhomogeneous and requires use of Green’s function. The Green’s function can be obtained by directly solving the differential equation with δ- function source and matching
the discontinuity due to the delta function. We just note the final result [Problem 3.11 of
Feynman-Hibbs, Abers-Lee].
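\begin{align*}
S^J_{cl}(t', t'') =\ &\frac{\omega}{2\sin(\omega T)}\left[\left((q')^2 + (q'')^2\right)\cos(\omega T) - 2q'q''\right] \\
&+ \frac{q''}{\sin(\omega T)}\int_{t'}^{t''} dt\, J(t)\sin(\omega(t - t')) + \frac{q'}{\sin(\omega T)}\int_{t'}^{t''} dt\, J(t)\sin(\omega(t'' - t)) \\
&- \frac{1}{\omega\sin(\omega T)}\int_{t'}^{t''} d\sigma \int_{t'}^{\sigma} d\tau\ J(\sigma)J(\tau)\sin(\omega(t'' - \sigma))\,\sin(\omega(\tau - t')) \tag{20.22}
\end{align*}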
This exercise was done to illustrate the schematics of evaluating the path integral directly
using the definition.
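The source-free part of eq. (20.22) is easy to verify numerically. The sketch below sets $J = 0$, takes $t' = 0$, $t'' = T$ and unit mass, and integrates the Lagrangian along the classical path $q_{cl}(t) = \left[q'\sin(\omega(T - t)) + q''\sin(\omega t)\right]/\sin(\omega T)$, which satisfies $\ddot q + \omega^2 q = 0$ with the required end points. The numerical values are arbitrary; the terms involving $J$ are not checked here.

```python
import math

def s_cl_formula(q1, q2, omega, T):
    """First line of eq. (20.22): classical action for J = 0 and unit mass."""
    s, c = math.sin(omega * T), math.cos(omega * T)
    return omega / (2.0 * s) * ((q1 * q1 + q2 * q2) * c - 2.0 * q1 * q2)

def s_cl_numeric(q1, q2, omega, T, steps=20000):
    """Midpoint-rule integral of L = (qdot^2 - omega^2 q^2)/2 along the classical path
    from q(0) = q1 to q(T) = q2."""
    s = math.sin(omega * T)
    h = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        q = (q1 * math.sin(omega * (T - t)) + q2 * math.sin(omega * t)) / s
        qdot = omega * (-q1 * math.cos(omega * (T - t)) + q2 * math.cos(omega * t)) / s
        total += 0.5 * (qdot * qdot - omega * omega * q * q)
    return total * h

q1, q2, omega, T = 0.7, -0.4, 1.3, 1.1   # arbitrary illustrative values
print(s_cl_formula(q1, q2, omega, T), s_cl_numeric(q1, q2, omega, T))
```

The check requires $\sin(\omega T) \neq 0$, the same condition under which the classical solution with fixed end points is unique.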
Note: If we have a system of two degrees of freedom, the $q(t)$ and the $J(t)$ which are coupled by a linear coupling, $\int_{t'}^{t''} dt\, J(t)q(t)$, then the $S^J_{cl}(t', t'')$ can be viewed as an effective action contribution after integrating out the $q(t)$ degree of freedom. This has the form $\sim \int dt\, J(t)\alpha(t) - \int dt\, J(t)\int^{t} d\sigma\, \beta(t, \sigma)J(\sigma)$. This last term is a non-local term.
F. Alternative Expression for Z[J]
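We now consider another method which is closer to what is done in field theory. We begin by modifying the Hamiltonian operator as: $\hat H \to (1 - i\epsilon)\hat H$ [12], where $\epsilon$ is now a small positive number and not the time step. Then,
\[
|q', t'\rangle = e^{\frac{i}{\hbar}t'\hat H}|q'\rangle = \sum_n \varphi_n^*(q')\, e^{\frac{i}{\hbar}t'E_n}|n\rangle \ \to\ \sum_n \varphi_n^*(q')\, e^{\frac{i}{\hbar}t'(1 - i\epsilon)E_n}|n\rangle
\]
Assuming $E_0 = 0$ for convenience and taking the limits, we get
\[
|q', t'\rangle \xrightarrow[t'\to -\infty]{} \varphi_0^*(q')|0\rangle \qquad \text{and} \qquad \langle q'', t''| \xrightarrow[t''\to\infty]{} \langle 0|\varphi_0(q'')\,. \tag{20.23}
\]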
Thus, $|q', t'\rangle$, $\langle q'', t''|$ both go to the (presumed non-degenerate) ground state as $t' \to -\infty$ and $t'' \to \infty$, provided we make the substitution: $\hat H \to (1 - i\epsilon)\hat H$. We will work with the limits.
Consider, for our 1-dimensional oscillator, $H = p^2/2 + \omega^2 q^2/2$,
\[
\langle 0|0\rangle_J = \int \mathcal{D}p\,\mathcal{D}q\ \exp\left[\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\ \{p\dot q - (1 - i\epsilon)H + J(t)q\}\right]
\]
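We do the momentum integration by using $(1 - i\epsilon)p^2/2 \approx \frac{p^2}{2(1 + i\epsilon)}$. This gives the term $\frac{1}{2}(1 + i\epsilon)\dot q^2$ and we get,
\[
\langle 0|0\rangle_J = \int \mathcal{D}q\ \exp\left[\frac{i}{\hbar}\int_{-\infty}^{\infty} dt\ \left\{\frac{1}{2}(1 + i\epsilon)\dot q^2 - \frac{1}{2}\omega^2(1 - i\epsilon)q^2 + J(t)q\right\}\right]. \tag{20.24}
\]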
Define the Fourier transform,
\[
\tilde q(\nu) := \int_{-\infty}^{\infty} dt\ e^{i\nu t}\, q(t) \ \leftrightarrow\ q(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\nu\ e^{-i\nu t}\, \tilde q(\nu)
\]
Z
Z
1 ∞ dν ∞ dν 0 −i(ν+ν 0 )t h
L =
e
− (1 + i)νν 0 − (1 − i)ω 2 q̃(ν)q̃(ν 0 )
2 −∞ 2π −∞ 2π
0
˜
˜ 0 )q̃(ν) i
J(ν)q̃(ν
) + J(ν
+
2
The expression in the braces simplifies to {. . . } = ν 2 − ω 2 + i(ν) , (ν) := (ν 2 + ω 2 ). Define
x̃(ν) := q̃(ν) +
˜
J(ν)
.
ν 2 −ω 2 +i(ν)
S=
1
2
Z
∞
−∞
The the action becomes,
"
˜ J(−ν)
˜
dν
J(ν)
x̃(ν) ν 2 − ω 2 + i(ν) x̃(−ν) − 2
2π
ν − ω 2 + i(ν)
#
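The shift $\tilde q(\nu) \to \tilde x(\nu)$ is a constant shift, i.e. the shift is independent of $\tilde q$. Hence the Jacobian of the transformation will be 1 and $\mathcal{D}\tilde q(\nu) = \mathcal{D}\tilde x(\nu)$. Taking the Fourier transform, the shift takes the form $q(t) \to x(t) + f(t)$ with $f(t)$ independent of $q(t)$. This is a constant shift in the time domain too, and we expect $\mathcal{D}q(t) = \mathcal{D}x(t)$.
\begin{align*}
\therefore\ \langle 0|0\rangle_J = \exp&\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\left(\frac{\tilde J(\nu)\tilde J(-\nu)}{-\nu^2 + \omega^2 - i\epsilon(\nu)}\right)\right] \times \tag{20.25} \\
&\int \mathcal{D}\tilde x(\nu)\ \exp\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\ \tilde x(\nu)\left(\nu^2 - \omega^2 + i\epsilon(\nu)\right)\tilde x(-\nu)\right] \tag{20.26}
\end{align*}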
The second factor is a path integral independent of $J$ and hence equals $\langle 0|0\rangle_{J=0}$. However, without any source interaction, the ground state remains a ground state and hence $\langle 0|0\rangle_{J=0} = 1$! This gives finally,
\[
\langle 0|0\rangle_J = \exp\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\left(\frac{\tilde J(\nu)\tilde J(-\nu)}{-\nu^2 + \omega^2 - i\epsilon(\nu)}\right)\right] \tag{20.27}
\]
We never needed to evaluate the path integral!
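The above expression can also be expressed in the time domain as,
\begin{align*}
\langle 0|0\rangle_J &= \exp\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty} dt'\, dt''\ J(t'')\,G(t'' - t')\,J(t')\right] \quad \text{with,} \tag{20.28} \\
G(t'' - t') &= \int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\ \frac{e^{-i\nu(t'' - t')}}{-\nu^2 + \omega^2 - i\epsilon(\nu)} = \frac{i}{2\omega}\,e^{-i\omega|t'' - t'|} \quad \text{(by contour integration)} \tag{20.29}
\end{align*}
Note: The $\nu$ dependence in the $\epsilon(\nu)$ does not affect the contour integration.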
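A simple consistency check on eq. (20.29): in the limit $\epsilon \to 0$ the frequency-space factor $1/(\omega^2 - \nu^2)$ corresponds to $\left(\frac{d^2}{dt^2} + \omega^2\right)G(t) = \delta(t)$, so the closed form should solve the homogeneous oscillator equation for $t \neq 0$ and its first derivative should jump by 1 across $t = 0$. The sketch below tests both statements by finite differences; the value of $\omega$ and the step $h$ are arbitrary choices.

```python
import cmath

def green(t, omega):
    """Closed form of eq. (20.29): G(t) = (i / 2 omega) * exp(-i omega |t|)."""
    return 1j / (2.0 * omega) * cmath.exp(-1j * omega * abs(t))

omega, h = 1.7, 1e-4   # arbitrary illustrative values

# Away from t = 0: G'' + omega^2 G should vanish (second central difference)
t = 0.8
second = (green(t + h, omega) - 2.0 * green(t, omega) + green(t - h, omega)) / h ** 2
residual = second + omega ** 2 * green(t, omega)

# Across t = 0: the slope jumps by 1, the signature of the delta-function source
slope_right = (green(2.0 * h, omega) - green(h, omega)) / h
slope_left = (green(-h, omega) - green(-2.0 * h, omega)) / h
jump = slope_right - slope_left

print(abs(residual), jump)
```

The one-sided slopes are evaluated a distance of order $h$ away from the origin, so the jump equals 1 only up to corrections of order $\omega h$.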
We have discussed two different methods of computing the transition amplitude in the limit of infinite time separation. Notice that the $\langle 0|0\rangle_J$ above is the same as the $Z[J]$ defined earlier in eq. (20.16) since $\int dq'\ |q', t'\rangle\,\varphi_0(q', t') = |0, t'\rangle$ and likewise for $\langle 0, t''|$.
The above frequency domain form is very convenient to obtain the correlation functions
as seen below.
Our general formula (20.14) tells us that h0|T {Q(t1 ) . . . Q(tn )}|0i is given by the functional derivatives of Z[J] evaluated at J = 0. Choosing the above form of Z[J] we see
that,
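\begin{align*}
\langle 0|T\{Q(t_1)Q(t_2)\}|0\rangle &= (-i\hbar)^2 \left.\frac{\delta^2\langle 0|0\rangle_J}{\delta J(t_1)\,\delta J(t_2)}\right|_{J=0} \tag{20.30} \\
&= (-i\hbar)^2 \left.\frac{\delta}{\delta J(t_1)}\left[\frac{i}{2\hbar}\cdot 2\int_{-\infty}^{\infty} dt'\ G(t_2 - t')J(t') \times \exp\left\{\frac{i}{2\hbar}\int\!\!\int JGJ\right\}\right]\right|_{J=0} \\
&= (-i\hbar)^2\,(i/\hbar)\,G(t_2 - t_1) = -i\hbar\, G(t_2 - t_1). \tag{20.31}
\end{align*}
The derivative of the $e^{\frac{i}{2\hbar}\int\!\!\int JGJ}$ does not contribute in the limit of $J = 0$. Taking one more derivative to get the three point function, we have (writing $\frac{\delta}{\delta J_i} := \frac{\delta}{\delta J(t_i)}$)
\begin{align*}
\frac{\delta}{\delta J_1}\frac{\delta}{\delta J_2}\left[\frac{i}{\hbar}\int dt\ G(t_3 - t)J(t)\cdot\langle 0|0\rangle_J\right] &= \frac{\delta}{\delta J_1}\left[\frac{i}{\hbar}G(t_3 - t_2)\cdot\langle 0|0\rangle_J + \frac{i}{\hbar}\int dt\ G(t_3 - t)J(t)\cdot\frac{i}{\hbar}\int dt'\ G(t_2 - t')J(t')\cdot\langle 0|0\rangle_J\right] \\
&= 0 + 0 = 0
\end{align*}
$\because$ there is always a factor of $J$ which kills the term at $J = 0$.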
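For 4 derivatives, we will have,
\begin{align*}
\frac{\delta}{\delta J_0}\Bigg[\ &\frac{i}{\hbar}G(t_3 - t_2)\cdot\frac{i}{\hbar}\int dt\ G(t_1 - t)J(t)\cdot\langle 0|0\rangle_J\ + \\
&\frac{i}{\hbar}G(t_3 - t_1)\cdot\frac{i}{\hbar}\int dt\ G(t_2 - t)J(t)\cdot\langle 0|0\rangle_J\ + \\
&\frac{i}{\hbar}\int dt\ G(t_3 - t)J(t)\cdot\frac{i}{\hbar}G(t_2 - t_1)\cdot\langle 0|0\rangle_J\ + \\
&\left(\frac{i}{\hbar}\right)^3\int dt\ G(t_3 - t)J(t)\cdot\int dt'\ G(t_2 - t')J(t')\cdot\int dt''\ G(t_1 - t'')J(t'')\cdot\langle 0|0\rangle_J\ \Bigg]
\end{align*}
Carrying out the $J_0$ differentiation and putting $J = 0$, only the first 3 terms contribute since they have a single factor of $J(t)$. For a 4-point function, this is multiplied by $(-i\hbar)^4$ and we get,
\[
\langle 0|T\{Q(t_0)Q(t_1)Q(t_2)Q(t_3)\}|0\rangle = (-i\hbar)^2\left\{G_{01}G_{23} + G_{02}G_{31} + G_{03}G_{12}\right\}, \qquad G_{ij} \leftrightarrow G(t_i - t_j)\,. \tag{20.32}
\]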
We recognize this as the same pattern seen in the vacuum expectation value calculation
using Wick’s theorem. Indeed, the quantum field theory Wick’s theorem shows up here
just as a result of the functional differentiation.
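The pairing pattern in eq. (20.32) can be generated mechanically. The sketch below enumerates all ways of splitting $n$ time labels into unordered pairs and sums the corresponding products of $G$, with the prefactor $(-i\hbar)^{n/2}$ that follows from repeating the differentiation above ($n = 2$ gives eq. (20.31), $n = 4$ gives eq. (20.32)). The function names are hypothetical and the Green's function is the closed form of eq. (20.29).

```python
import cmath

def pairings(labels):
    """Yield every way of splitting the list `labels` into unordered pairs."""
    if not labels:
        yield []
        return
    first, rest = labels[0], labels[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def green(t, omega):
    """G(t) = (i / 2 omega) * exp(-i omega |t|), eq. (20.29)."""
    return 1j / (2.0 * omega) * cmath.exp(-1j * omega * abs(t))

def n_point(times, omega, hbar=1.0):
    """Time-ordered n-point function as a sum over pairings; zero for odd n."""
    n = len(times)
    if n % 2 == 1:
        return 0j
    total = 0j
    for pairing in pairings(list(range(n))):
        term = 1.0 + 0j
        for i, j in pairing:
            term *= green(times[i] - times[j], omega)
        total += term
    return (-1j * hbar) ** (n // 2) * total

four_point = list(pairings([0, 1, 2, 3]))
print(four_point)   # the three pairings of eq. (20.32)
print(n_point([0.0, 0.3, 0.7, 1.2], omega=1.0))
```

The number of pairings grows as $(n-1)!! = 1, 3, 15, \dots$ for $n = 2, 4, 6, \dots$, so this direct enumeration is only practical for small $n$.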
We now have all the definitions, notations and pattern of computation needed in the field
theory generalization.
21. PATH INTEGRALS IN QUANTUM FIELD THEORY
We consider the path integral formulation of a field theory, specifically a scalar field
theory. While discussing the field as a dynamical system, we had already noted that the
notation φ(t, ~x) refers to infinitely many degrees of freedom labeled by ~x ∈ R3 (for us). Thus
we could view φ(t, ~x) as φ~x (t) and draw analogy with qi (t). Each of these degrees of freedom
can be quantized a la path integral exactly as the single degree of freedom discussed before.
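In place of $\int \mathcal{D}q(t) \xrightarrow[T = N\epsilon]{} \int \prod_{k=0}^{N-1} \frac{dq_k}{C(\epsilon)}$, we now have $\int \mathcal{D}\phi(x) \xrightarrow[T = N\epsilon]{} \int \prod_{\vec x \in \Sigma}\ \prod_{k=0}^{N-1} d\phi_{\vec x, k}$.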
We have dropped the C() which will be subsumed in the overall normalization constant.
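The action for the field theory is the usual one. Thus, the vacuum-to-vacuum transition amplitude in presence of a source is denoted as,
\[
\langle 0|0\rangle_J := \int \mathcal{D}\phi\ \exp\left\{\frac{i}{\hbar}\int d^4x\ \left(\mathcal{L}(\phi, \partial_\mu\phi) + J(x)\phi(x)\right)\right\} =: Z[J] \tag{21.1}
\]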
The ‘paths’ implicit in the measure D φ are of course paths in the configuration space of the
field i.e. the space of all φ(~x) at any fixed t. A path itself connects φ1 (~x, t1 ) to φ2 (~x, t2 ), ∀ ~x.
The paths are not paths in the space-time.
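For the Lagrangian density $\mathcal{L}$, we write $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$, with $\mathcal{L}_0 := -\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{1}{2}m^2\phi^2$.
The $m^2$ has $-i\epsilon$ implicit, in anticipation. The $\mathcal{L}_1$ will typically involve interaction terms,
polynomials in $\phi$ such as $g_3\phi^3(x) + g_4\phi^4(x) \dots$ etc. Consider
\[
Z_0[J] := \int \mathcal{D}\phi\, \exp\Big\{ \frac{i}{\hbar}\int d^4x \big( \mathcal{L}_0 + J(x)\phi(x) \big) \Big\} .
\]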
We go to the momentum space as in the case of the single particle.
Let $\tilde{\phi}(k) := \int d^4x\, e^{-ik\cdot x}\phi(x) \leftrightarrow \phi(x) = \int \frac{d^4k}{(2\pi)^4}\, e^{ik\cdot x}\tilde{\phi}(k)$, with $k\cdot x := -k^0x^0 + \vec{k}\cdot\vec{x}$, and $k^0$
is the integration variable in $d^4k$ and not $k_0$. Substituting in the action and doing the $d^4x$
integration gives,
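\[
S_0 = \frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\Big[ -\tilde{\phi}(k)(k^2 + m^2)\tilde{\phi}(-k) + \tilde{J}(k)\tilde{\phi}(-k) + \tilde{J}(-k)\tilde{\phi}(k) \Big]
\]
Note that $S_0$ is not $\int \mathcal{L}_0$, but includes the source.
Define $\tilde{\phi}(k) := \tilde{\chi}(k) + \frac{\tilde{J}(k)}{k^2 + m^2}$. Then $\mathcal{D}\phi = \mathcal{D}\chi$ since we have a "constant shift transformation".
This also holds in the $\phi(x)$ space. Then, as in the case of the single oscillator, we
get
\[
S_0 = \frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\Big[ \frac{\tilde{J}(k)\tilde{J}(-k)}{k^2 + m^2} - \tilde{\chi}(k)(k^2 + m^2)\tilde{\chi}(-k) \Big] .
\]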
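The path integral $\int \mathcal{D}\chi$ just gives a factor independent of $\tilde{J}$, namely $\langle 0|0\rangle_{J=0} = 1$ and we
write, with $\hbar = 1$ from now on,
\[
\begin{aligned}
\langle 0|0\rangle_J = Z_0[J] &= \exp\Big[ i\int \frac{d^4k}{(2\pi)^4}\Big\{ \frac{1}{2}\frac{\tilde{J}(k)\tilde{J}(-k)}{k^2 + m^2 - i\epsilon} \Big\} \Big] \\
&= \exp\Big[ i\int d^4x \int d^4x'\, \frac{1}{2} J(x)\Delta(x - x')J(x') \Big] \quad \text{where,} \\
\Delta(x - x') &= \int \frac{d^4k}{(2\pi)^4}\frac{e^{ik\cdot(x - x')}}{k^2 + m^2 - i\epsilon} , \quad \text{(Feynman propagator)}
\end{aligned}
\]
Note that the Feynman propagator arose because of the $-i\epsilon$ included in the mass term in
the $\mathcal{L}_0$.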
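As discussed for the oscillator, we get
\[
\begin{aligned}
\langle 0|T\{\phi(x_1)\phi(x_2)\}|0\rangle &= (-i)\frac{\delta}{\delta J(x_1)}\cdot(-i)\frac{\delta}{\delta J(x_2)}\cdot Z_0[J]\Big|_{J=0} \\
&= (-i)\frac{\delta}{\delta J(x_1)}\Big[ Z_0[J]\cdot\int \Delta(x' - x_2)J(x')\,d^4x'\,(-i)(i) \Big]\Big|_{J=0} \\
\Rightarrow\ \langle 0|T\{\phi(x_1)\phi(x_2)\}|0\rangle &= -i\Delta(x_1 - x_2) .
\end{aligned}
\]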
Exactly as for the oscillator, we get
\[
\langle 0|T\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\}|0\rangle = (-i)^2\Big[ \Delta(x_1 - x_2)\Delta(x_3 - x_4) + \Delta(x_1 - x_3)\Delta(x_2 - x_4) + \Delta(x_1 - x_4)\Delta(x_2 - x_3) \Big]
\]
which easily generalizes to,
\[
\langle 0|T\{\phi(x_1)\dots\phi(x_{2n})\}|0\rangle = (-i)^n\Big[ \sum_{\text{pairings}} \Delta(x_1 - x_2)\Delta(x_3 - x_4)\dots\Delta(x_{2n-1} - x_{2n}) + \text{permutations} \Big]
\]
This is what we had obtained as the Wick’s theorem.
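The sum over pairings can be made concrete with a small enumeration. The following is only an illustrative sketch: the function names `pairings` and `double_factorial_odd` are ours and not part of the notes, and `D(xa-xb)` is just a printed stand-in for $\Delta(x_a - x_b)$. It lists the complete pairings of $2n$ points and checks that their number is $(2n-1)!!$.

```python
def pairings(points):
    """Yield every way of splitting `points` into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def double_factorial_odd(n):
    """Return (2n-1)!! = 1*3*5*...*(2n-1), the number of complete pairings of 2n points."""
    result = 1
    for k in range(1, 2 * n, 2):
        result *= k
    return result


# The three terms of the free 4-point function, written as products of propagators.
four_point = list(pairings([1, 2, 3, 4]))
for term in four_point:
    print(" ".join(f"D(x{a}-x{b})" for a, b in term))
```

For the 4-point function this reproduces the three products of propagators written above; for 6 points one gets 15 terms.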
Note: Wick’s theorem expressed the time ordered product of quantum fields in terms of
the normal ordered product plus contractions. The normal ordering was for all Poincare
generators to ensure invariance of the vacuum. The theorem was also proved for free fields
for which we do have Fourier decomposition.
Here too the “Wick’s theorem” is again seen for free fields. Here it is no more than the
chain rule of differentiation.
Consider now interacting fields, by which we mean $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$. Recall that $\langle 0|Q(t)|0\rangle = \int \mathcal{D}q\; q(t)\,e^{iS}$
and we can get the $q(t)$ in the integrand by $\frac{\delta}{\delta J(t)} S_J\big|_{J=0}$ (remember $S = S_J|_{J=0}$).
Thus, insertion of $q(t)$'s, or of $\phi(x)$'s in field theory, can be effected by $\frac{\delta}{\delta J(x)} S_J\big|_{J=0}$. If we expand
$e^{i\int \mathcal{L}_1(\phi)} = \sum_{n=0}^{\infty} \big[ i\int \mathcal{L}_1 \big]^n / n!$, then we have integrals of polynomials of the fields which can be
expressed as $\frac{\delta}{\delta J} S_J$. The $\delta_J$ can be taken outside of the path integration. Thus we write,
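\[
\begin{aligned}
Z[J] := \langle 0|0\rangle_J^{\rm int} &:= \int \mathcal{D}\phi\, \exp\Big\{ i\int d^4x \big[ \mathcal{L}_0(\phi) + \mathcal{L}_1(\phi) + J(x)\phi(x) \big] \Big\} \\
&\propto e^{\, i\int d^4x\, \mathcal{L}_1(-i\delta_{J(x)})}\cdot\int \mathcal{D}\phi\; e^{\, i\int d^4x\,\{\mathcal{L}_0 + J(x)\phi(x)\}}
\end{aligned}
\]
or,
\[
Z[J] \propto \exp\Big[ i\int d^4x\, \mathcal{L}_1\Big( -i\frac{\delta}{\delta J(x)} \Big) \Big]\cdot Z_0[J] , \quad \text{with} \tag{21.2}
\]
\[
Z_0[J] := \exp\Big[ \frac{i}{2}\int d^4x \int d^4y\, J(x)\Delta(x - y)J(y) \Big] \tag{21.3}
\]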
In the oscillator case, for $Z_0[J]$ we had the $\exp\{\frac{i}{2}\int\!\!\int J\Delta J\}$ term and a Gaussian path
integral which turned out to be equal to $Z_0[J = 0]$. By taking the asymptotic time limits,
we could assert this to be equal to 1 since it was vacuum-to-vacuum amplitude without any
source. In presence of interactions, even with $J = 0$, the vacuum may not remain unaffected
and we cannot justify $\langle 0|0\rangle^{\rm int}_{J=0} = 1$. Instead, the proportionality constant is determined by
demanding $Z[J = 0] = 1$.
The prescription to compute $Z[J]$ via $e^{i\int \mathcal{L}_1(-i\delta_J)} Z_0[J]$ is in effect the perturbative
prescription. Let us see how this works in an example, $\mathcal{L}_1(\phi) = g\phi^3(x)/3!$. Then
$Z[J] \propto \exp\big\{ \frac{ig}{3!}\int d^4x\, (-i\delta_{J(x)})^3 \big\}\cdot Z_0[J] = \exp\big\{ -\frac{g}{6}\int d^4x\, \delta^3_{J(x)} \big\}\cdot Z_0[J]$. Expand the exponentials,
\[
Z[J] = N \sum_{v=0}^{\infty} \frac{1}{v!}\Big[ \frac{ig}{3!}\int d^4x \Big( -i\frac{\delta}{\delta J(x)} \Big)^3 \Big]^v \times \sum_{p=0}^{\infty} \frac{1}{p!}\Big[ \frac{i}{2}\int d^4y \int d^4z\, J(y)\Delta(y - z)J(z) \Big]^p .
\]
This too is a power series in J as is Z0 [J], but with different coefficients. Consider a particular
term in this double sum with a fixed v and p.
• Then we have $3v$ derivatives acting on $2p$ $J$'s, leaving $E = (2p - 3v)$ $J$'s. Clearly, $E \geq 0$
must hold and there are several such terms;
• The overall numerical factor associated with such a group of terms is:
\[
\frac{1}{v!}\Big( \frac{ig}{3!} \Big)^v (-i)^{3v}\, \frac{(i/2)^p}{p!} = \frac{i^{\,v - 3v + p}}{(3!)^v\, v!\; 2^p\, p!}\; g^v = \Big( \frac{g}{3!} \Big)^v \frac{1}{v!\; 2^p\, p!}\; (i)^{E + v - p}
\]
• The combinatorial factor resulting from the number of ways the derivatives act is: $3v$
derivatives on $2p$ $J$'s give $\frac{(2p)!}{(2p - 3v)!}$. (This is because the first derivative can act on $2p$ $J$'s, the second
on $(2p - 1)$, $\dots$, the $(3v)$th on $(2p - 3v + 1)$ $J$'s.)
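The bookkeeping in the bullets above can be tabulated for any $(v, p)$. The sketch below is only illustrative; the helper name `term_bookkeeping` is ours. It keeps the factor $1/2^p$ coming from $(i/2)^p$ explicitly, and returns the power of $i$ modulo 4 rather than a complex number.

```python
from fractions import Fraction
from math import factorial


def term_bookkeeping(v, p):
    """Bookkeeping for the (v, p) term of the double expansion in phi^3 theory.

    Returns (E, phase_power, magnitude, placements) where
      E           = 2p - 3v sources left over,
      phase_power = exponent of i (mod 4) in i^(v - 3v + p),
      magnitude   = coefficient of g^v apart from the phase: 1 / ((3!)^v v! 2^p p!),
      placements  = (2p)! / (2p - 3v)! ways for 3v derivatives to hit 2p J's.
    """
    E = 2 * p - 3 * v
    if E < 0:
        raise ValueError("more derivatives than sources: the term vanishes")
    phase_power = (p - 2 * v) % 4
    magnitude = Fraction(1, factorial(3) ** v * factorial(v) * 2 ** p * factorial(p))
    placements = factorial(2 * p) // factorial(E)
    return E, phase_power, magnitude, placements


# Vacuum terms at order g^2: v = 2 vertices, p = 3 propagators.
E, phase_power, magnitude, placements = term_bookkeeping(2, 3)
total_weight = magnitude * placements
print(E, phase_power, total_weight)
```

For $v = 2$, $p = 3$ there are no sources left and the total weight is $720/3456 = 5/24 = 1/8 + 1/12$, which is the sum of the inverse symmetry factors 8 and 12 of the two vacuum diagrams of this order.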
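We can generate and keep track of the various derivatives by denoting
$\frac{ig}{3!}\frac{\delta^3}{\delta J^3(x)}$ ↔ [Figure: vertex with three free ends], which carries the factor $\frac{ig}{3!}$ and has $\int d^4x$,
$\frac{i}{2} J(y)\Delta(y - z)J(z)$ ↔ [Figure: propagator line with two free ends], which carries the factor $\frac{i}{2}$ and has $\int d^4y \int d^4z$,
and the operation of evaluating the derivative by joining the free ends from the 'vertices' to
the free ends of the 'propagator lines' - exactly as we represented the Wick contractions.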
The number of terms with a given $v, p$ is the number of ways of joining the free ends
and generating a diagram. Some of the diagrams may have identical factors associated
with them. For example, the 3 $\delta_J$'s at a given vertex can be permuted in joining up with the
propagator lines. This clearly gives a factor of $3!$. Likewise, exchanging the two ends of a propagator line gives a
factor of $2!$. The diagram will look the same if the $v$ vertices are themselves permuted (the
variables in $\int d^4x$ are dummy variables) and this gives $v!$. Similarly the propagator lines give $p!$. Thus,
all the numerical factors, except $i, g$, cancel out.
Compared to the previous counting based on Wick's theorem, we have a double expansion
and lines (pairs of $J$'s) to be contracted with the edges of the vertices. Secondly, in the
Wick's theorem, thanks to the normal ordering of the interaction terms, we don't have self
contractions at the vertices ($\Delta(x, x)$ propagators). In the present approach, there is no
"normal ordering" and we do get such terms. E.g.
[Figure: tadpole diagram; labels: x, y, z, δ(x − y), ∆(x − z), δ(x − z)]
Such diagrams are called 'tadpoles'.
Thus we have more diagrams and different ways of working out the total combinatorial
factor. The above bulleted points have gotten rid of all explicit factors, but could involve
an over counting. This is compensated by dividing by a factor called the symmetry factor.
This is a geometrical problem of identifying the groups of diagrams. Here are a couple of
examples [12].
[Figure: vacuum diagram with two loops] : We can exchange the two loops in 2 ways. We can
exchange the ends of each loop in 2 × 2 ways. Therefore the total symmetry factor is 2 · 2 · 2 = 8.
[Figure: vacuum diagram with 2 vertices joined by 3 propagators] : The 3 propagators can be permuted in 3 · 2 ways. The 2
vertices can be exchanged in 2 ways. So the symmetry factor is: 3! 2! = 12.
A general statement and a derivation may be seen in the appendix of Sterman’s book [14].
In the double expansion, we have terms with no J’s, a single J, 2J’s . . . etc. When J = 0
is put, only the terms with no J survive. These are termed as the “vacuum bubbles” and
are the contributions to the Z[J = 0].
A general contribution to a given number of left over $J$'s would consist of a product of
topologically connected diagrams. Let $C_\gamma$ denote the contribution of a particular topologically connected diagram, $\gamma$. In a given diagram, $\Gamma$, let $\gamma$ occur $n_\gamma$ times. The symmetry
factors resulting from permutations within each $\gamma$ are included in the contribution $C_\gamma$. But
now we can also have permutations across different, topologically connected diagrams. Such
permutations can leave the product diagram invariant if the different copies of a given $\gamma$ are
permuted as a whole. Hence, we have an additional symmetry factor of $n_\gamma!$ by which we
have to divide. A product diagram with different $\gamma$'s thus has a symmetry factor of $\prod_\gamma n_\gamma!$.
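We now return to the full $Z[J]$, which we can represent as a contribution from a sum
of diagrams $\Gamma$, each of which can have several topologically connected $\gamma$'s with $n_\gamma$ copies.
Hence,
\[
Z[J] \propto \sum_\Gamma C_\Gamma \sim \sum_{\{n_\gamma\}}\prod_\gamma \frac{(C_\gamma)^{n_\gamma}}{n_\gamma!} := N \prod_\gamma \sum_{n_\gamma = 0}^{\infty} \frac{(C_\gamma)^{n_\gamma}}{n_\gamma!} = N \prod_\gamma \exp[C_\gamma] = N \exp\Big[ \sum_\gamma C_\gamma \Big] .
\]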
Thus, $Z[J]$ is proportional to the exponential of the sum of contributions of topologically
connected diagrams. This sum also contains the contributions of the topologically connected vacuum diagrams, i.e. those which make up $Z[J = 0]$. Imposing the normalization condition, $Z[J = 0] = 1$, gives
$N \times \exp(\text{topologically connected vacuum bubbles}) = 1$. Hence, the normalization is trivially incorporated by simply dropping the contributions of the vacuum diagrams and we are
left with,
\[
Z[J] = \exp\big[ iW[J] \big] , \quad iW[J] = \sum \text{topologically connected diagrams.} \tag{21.4}
\]
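The exponentiation step, that summing $\prod_\gamma (C_\gamma)^{n_\gamma}/n_\gamma!$ over all multiplicities gives $\exp[\sum_\gamma C_\gamma]$, can be checked numerically. This is a sketch only: the three values in `C` are made-up numbers standing in for connected-diagram contributions, and the sum over each $n_\gamma$ is truncated at `n_max`.

```python
import itertools
import math


def sum_over_multiplicities(contribs, n_max):
    """Sum prod_gamma C_gamma^n_gamma / n_gamma! over all multiplicities 0..n_max."""
    total = 0.0
    for ns in itertools.product(range(n_max + 1), repeat=len(contribs)):
        term = 1.0
        for c, n in zip(contribs, ns):
            term *= c ** n / math.factorial(n)
        total += term
    return total


# Hypothetical values for three connected-diagram contributions C_gamma.
C = [0.3, -0.7, 1.1]
lhs = sum_over_multiplicities(C, n_max=25)
rhs = math.exp(sum(C))
print(lhs, rhs)
```

The two numbers agree to rounding accuracy, which is just the statement that the product of the individual exponential series is the exponential of the sum.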
The Green's functions that we get from $Z[J]$ now get expressed as,
\[
\langle 0|\phi(x)|0\rangle = -i\frac{\delta Z[J]}{\delta J(x)}\Big|_{J=0} = \frac{\delta W[J]}{\delta J(x)}\Big|_{J=0} , \quad \because\ Z[J = 0] = 1 .
\]
This gives the contribution of diagrams with a single J(x) and gets removed when J = 0 is
put.
Similarly other Green’s functions can be expressed in terms of the derivatives of the W [J].
As an illustration of the organization of the topologically connected diagrams in W [J],
consider the two point function.
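\[
\begin{aligned}
G(x_1, x_2) := (-i)^2\frac{\delta^2 Z[J]}{\delta J_1\,\delta J_2}\Big|_{J=0} &= -i\,\delta_{J_1}\big[ e^{iW}\delta_{J_2}W \big]\Big|_{J=0} \\
&= e^{iW}\big[ \delta_{J_1}W\cdot\delta_{J_2}W - i\,\delta^2_{J_1J_2}W \big]\Big|_{J=0} \\
\therefore\ G(x_1, x_2) &= G_c(x_1)G_c(x_2) + G_c(x_1, x_2) , \quad \text{where,} \\
G_c(x) := \delta_{J(x)}W[J]\big|_{J=0} &, \quad G_c(x, y) := -i\,\delta^2_{J(x)J(y)}W[J]\big|_{J=0} .
\end{aligned}
\]
[Figure: $G(x_1, x_2)$ = ($G_c(x_1)\,G_c(x_2)$: topologically disconnected) + ($G_c(x_1, x_2)$: topologically connected)]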
The Gc denote, contributions from connected (to external points) and topologically connected diagrams. This continues to the other n−point functions, as proved by the argument
leading to eqn. (21.4). To conclude,
$G(x_1, \dots, x_n)$ : Σ of all connected diagrams : $(-i)^n \frac{\delta^n Z[J]}{\delta J(x_1)\dots\delta J(x_n)}\Big|_{J=0}$
$G_c(x_1, \dots, x_n)$ : Σ of topologically connected diagrams : $(-i)^{n-1} \frac{\delta^n W[J]}{\delta J(x_1)\dots\delta J(x_n)}\Big|_{J=0}$
$n = 0$ : Σ of all vacuum diagrams : $Z[J = 0] = 1$
$n = 0$ : Σ of all topologically connected vacuum diagrams : $W[J = 0] = 0$
The computation of the diagrams uses the same Feynman rules that we discussed earlier.
The $Z$ and $W$ provide a convenient way of dealing with all the diagrams together and serve as
generating functions for the Green's functions.
A. The 1-point function: Renormalization ↔ normal ordering
The 1-point function of a scalar field has a distinction. Since this represents the
$\langle 0|\Phi(x)|0\rangle$, the Poincare invariance allows this to be a non-zero constant, say $v$. No other
Lorentz covariant field can have its vacuum expectation value to be non-zero since there are
no Lorentz invariant spinors or vectors [footnote 15]. However, if we want the quanta exchanged during
interactions to have the appropriate quantum numbers, the vacuum should not have any
quantum numbers. As noted in the discussion of uniqueness of vacuum, a non-unique vacuum would have to carry non-trivial representation labels of the symmetry group which will
add to the labels of the exchanged quanta. Consequently, we must have a unique vacuum
and then, by the mode expansion of the quantum field, its expectation value must vanish.
Does our path integral definition of n−point function satisfy this property?
Consider computation of the 1-point function in the $\phi^3$ theory in $n$ dimensions. Notice that topologically
disconnected diagrams contributing to the 1-point function can only be the vacuum bubbles
giving a multiplicative contribution of 1. So we can limit to only topologically connected
diagrams. Furthermore, if we have 1PR diagrams, their contribution is again of a product
form: $(1PI) \times \Delta(x - y) \times (1PI) \times \Delta(z - w) \times (1PI) \cdots$. Hence it suffices to consider only
the 1PI diagrams. Thus we have,
[Figure: $G_c(x)$ = sum of 1PI diagrams with a single external line + ···]
At 1-loop (and higher), the diagrams are non-zero and in fact divergent. We invoke the
counter term method to absorb away these divergences. Thus, we add a term in $\mathcal{L}_1$, of the
form $Y\phi(x) = -iY\frac{\delta}{\delta J(x)}$. This is represented by the diagram [Figure: counter term vertex Y]. With its inclusion, the
contributing diagrams are:
[Figure: $G_c(x)$ = (a) 1-loop diagram with the loop at y joined to x + (b) counter term vertex Y joined to x + (2-loop) diagrams with their Y counter terms + ···]
Footnote 15: A second rank, symmetric tensor field can have $\langle 0|\hat{h}_{\mu\nu}(x)|0\rangle = \Lambda\eta_{\mu\nu}$. But this typically arises only when
gravity is included.
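These give, to 1-loop,
\[
\langle 0|\phi(x)|0\rangle = \Big[ \underbrace{iY}_{(b)} + \underbrace{\frac{ig}{2}\cdot\big( -i\Delta(0) \big)}_{(a)} \Big]\cdot\int d^ny\, \big( i\Delta(x - y) \big) + o(g^3)
\]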
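For the left hand side to be zero to $o(g)$, we must set $Y = i\frac{g}{2}\Delta(0) + o(g^3)$ with,
\[
\begin{aligned}
\Delta(0) &= \int \frac{d^nk}{(2\pi)^n}\frac{1}{k^2 + m^2 - i\epsilon} \quad \text{(divergent for } n \geq 2\text{.)} \\
&= i\int \frac{d^nk_E}{(2\pi)^n}\frac{1}{k_E^2 + m^2} = i\,\frac{1}{(4\pi)^{n/2}}\frac{\Gamma(1 - n/2)}{\Gamma(1)}\,(m^2)^{n/2 - 1} \quad \text{for } n = 4 - \epsilon \text{ (say), } k_E \text{ Euclidean} \\
&= i\,\frac{1}{16\pi^2}\,(-1)\frac{2}{\epsilon}\,(m^2)^{1 - \epsilon/2} . \\
\therefore\ Y &= \frac{g}{\epsilon}\frac{m^2}{16\pi^2}\Big[ 1 - \frac{\epsilon}{2}\ln(m^2) + \cdots \Big] \approx \frac{g}{\epsilon}\frac{m^2}{16\pi^2} + o(\epsilon^0) + \cdots .
\end{aligned}
\]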
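As a numerical cross-check of the pole term, one can evaluate $Y = i\frac{g}{2}\Delta(0)$ with the full Gamma function at small $\epsilon$ and compare it with $g m^2/(16\pi^2\epsilon)$. The sketch below is illustrative only: the function names and the sample values of $g$, $m^2$ and $\epsilon$ are our own choices, and $m^2$ is treated as a pure number as in the expression above.

```python
import math


def tadpole_counterterm(g, m2, eps):
    """Y = i (g/2) Delta(0), with Delta(0) = i Gamma(1 - n/2) (m^2)^(n/2 - 1) / (4 pi)^(n/2), n = 4 - eps."""
    n = 4.0 - eps
    delta0_over_i = math.gamma(1.0 - n / 2.0) * m2 ** (n / 2.0 - 1.0) / (4.0 * math.pi) ** (n / 2.0)
    return -0.5 * g * delta0_over_i  # the two factors of i combine to -1


def leading_pole(g, m2, eps):
    """Leading 1/eps term of Y: g m^2 / (16 pi^2 eps)."""
    return g * m2 / (16.0 * math.pi ** 2 * eps)


# Hypothetical sample values.
g, m2, eps = 0.5, 1.0, 1e-4
ratio = tadpole_counterterm(g, m2, eps) / leading_pole(g, m2, eps)
print(ratio)
```

The ratio tends to 1 as $\epsilon \to 0$; the deviation at finite $\epsilon$ is the $o(\epsilon^0)$ part of $Y$, which depends on the scheme.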
Thus, we fix the counter term Y by demanding that the 1-point function be zero, as the
renormalization condition and this can be continued at higher orders. The upshot is that
we can ensure h0|φ(x)|0i = 0 by means of a counter term. Hence, the sum of all connected
diagrams with a single external line vanishes.
Clearly, a tadpole can attach to any other diagram only in a 1PR manner (eg replacing
the source end of the tadpole by the other diagram). This is a multiplicative contribution
and immediately renders all such diagrams to be zero. Hence all diagrams with a tadpole
attached, when summed to a given order, vanish. We may thus simply remove the tadpole
diagrams and obtain results equivalent to those obtained by normal ordering prescription
employed before.
Note: Had the 1−point function been finite, we would still need to use the counter term
method to set it to zero to ensure uniqueness of the vacuum. In the dimensional regularization, massless tadpole diagrams are zero.
Qn: Normal ordering eliminates all self contractions - 1−point function in Φ3 as well as
2−point function in Φ4 and likewise if higher order vertices are included. For instance, the
1-loop diagram contributing to the 2−point function in Φ4 . Now there will be a 1-loop
counter term for the 2-point function (absent in the normal ordered version). Does it affect
the finite part?
Ans: A self contraction at any vertex, in any diagram, produces a 1-loop contribution
as a multiplicative factor, $\Delta_F(0)$, and this multiplicative factor is independent of any momenta entering/exiting the vertex. Consider such a contribution to the self-energy, $\Pi(p^2)$,
in the $\Phi^4$ theory. Unless a counter vertex is introduced, it is not possible to satisfy the
renormalization condition, $\Pi(-m^2) = 0$, at the order $\lambda^1$ (here we consider on-shell renormalization for simplicity, so that $m^2 = m^2_{ph}$). So introduce an order $\lambda$ counter vertex. Then
at 1-loop, the self energy equals $\Delta_F(0) + \delta_1$. Both are independent of $p^2$ and the renormalization condition implies that the two must add to zero. The self energy thus receives no
contribution at 1-loop, exactly as if normal ordering had been invoked. In any other diagram
(1PI), if a self contraction appears, then the same counter vertex, $\delta_1$, will again automatically
cancel the contribution, effectively enforcing normal ordering. Again this is independent of
UV divergence and is a consequence of the requirement of the renormalization condition on
self-energy.
B. Path Integrals and Statistical Mechanics
We discuss briefly an interesting connection of the path integral representation of the
transition amplitude and the (grand) partition function of statistical mechanics [16].
The basic observation is that the transition amplitude, schematically $\langle Q_2, t_2|Q_1, t_1\rangle = \int \mathcal{D}q\, \exp\{ \frac{i}{\hbar}\int_{t_1}^{t_2} dt\, L(q(t), \dot{q}(t)) \}$,
can be taken as a postulate rather than a derivation from
matrix element of the quantum unitary evolution operator. We can also separately postulate
that the transition amplitude is the matrix element $\langle Q_2|e^{-\frac{i}{\hbar}(t_2 - t_1)H}|Q_1\rangle$. Mathematically,
set $t_2 - t_1 =: -i\beta$ (in units with $\hbar = 1$), set $Q_2 = Q_1 =: Q$ and integrate over $Q$. On the left hand side,
we get $\mathrm{Tr}\, e^{-\beta H}$ which is the usual canonical partition function. On the right hand side,
the path integral turns into a path integral over all closed paths of an integrand which is the
exponential of a Euclidean continuation ($t \to -i\tau$) of the action. We thus obtain a path
integral representation of the statistical mechanical canonical partition function, in terms
of the Euclidean continuation of a classical action. This is used in the context of the black hole
entropy which may be seen in [16].
22. PATH INTEGRALS AS GENERATING FUNCTIONALS
We have noted the two basic quantities, $Z[J]$ and $W[J] = -i\ln Z[J]$. Their functional
derivatives with respect to $J(x)$ give the contributions to the $n$-point ($n \geq 1$)
functions: $(-i)^n\delta^nZ[J]|_{J=0}$ contains the contributions of all the connected diagrams while
$(-i)^{n-1}\delta^nW[J]|_{J=0}$ contains the contributions of all the connected and topologically connected
diagrams. The normalization condition, $Z[J = 0] = 1 \leftrightarrow W[J = 0] = 0$, omits the contributions of the vacuum bubbles. We had encountered the 1PI diagrams while discussing
the self-energies. Their contributions too can be obtained by functional differentiation of
another quantity which we obtain below. Note that the $n$-point functions being obtained
by functional derivatives also means that $Z[J], W[J]$ can be viewed as generating functionals
for Green's functions.
A. The Generating Functionals: Z[J], W[J], Γ[Φ]
Consider a connected and topologically connected diagram contributing to $\delta^nW[J]|_{J=0}$.
It is said to be 1-particle reducible (1PR), if it can be disconnected by cutting 1 internal line. A
diagram which is not 1PR is 1-particle irreducible (1PI). Any given diagram can be viewed as
a collection of 1PI sub-diagrams connected by single internal lines. As noted in the discussion
of self energy diagrams, all 1PI diagrams with two external lines when strung together by an
internal line (free propagator) lead to the exact propagator. Thus, the sum of all connected
and topologically connected diagrams contributing to an $n$-point function can be viewed
as new connected and topologically connected diagrams whose "vertices", $\gamma_k$, are all 1PI
diagrams with $k > 2$ edges, connected by lines representing the exact propagator. These
new diagrams all have a "tree topology" - there are no loops in such a reorganization of
diagrams since any loop would make a 1PI sub-diagram and will already be included in one
of the $\gamma_k$'s. These vertices can be of any order, unlike elementary vertices dictated by the
Lagrangian defining a theory. The set of these new diagrams can be generated from another
generating functional,
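\[
\begin{aligned}
Z_\gamma[J] &:= \int \mathcal{D}\varphi\, \exp\Big[ i\gamma[\varphi] + i\int d^4x\, J(x)\varphi(x) \Big] := e^{iW_\gamma[J]} , \\
\gamma[\varphi] &= \sum_{n \geq 2}\int dx_1\cdots dx_n\, \frac{\gamma_n(x_1, \cdots, x_n)}{n!}\,\varphi(x_1)\cdots\varphi(x_n) .
\end{aligned}
\]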
Note: The coefficients of the $\varphi^n$ terms are not simple numbers and this action is non-local.
The coefficient of the $n = 2$ term, $\gamma_2(x_1, x_2)$, is the Fourier transform of the inverse of the
exact propagator. Note that this establishes that $W_S[J] = W_\gamma[J]\big|_{\rm tree}$.
However, such a $Z_\gamma[J], W_\gamma[J]$ will generate diagrams with loops as well and we are interested in generating only the tree diagrams. To pick out these alone, introduce a fictitious
dimensionless parameter, $\lambda$, and define
\[
Z_{\gamma,\lambda}[J] := \int \mathcal{D}\varphi\, \exp\Big[ \frac{i}{\lambda}\Big\{ \gamma[\varphi] + \int d^4x\, J(x)\varphi(x) \Big\} \Big] := \exp\big[ iW_{\gamma,\lambda}[J] \big] .
\]
Let us count the powers of $\lambda$ in any connected and topologically connected diagram. Noting
that the $\gamma_2$ term in $\gamma[\varphi]$ is the (Fourier transform of the) inverse of the exact propagator,
the scaling by $\lambda^{-1}$ means that each propagator gives a factor of $\lambda$ while each vertex
and every $J$ gives a factor of $\lambda^{-1}$. Thus, for any diagram contributing to $W_{\gamma,\lambda}[J]$ we get
the factor $(\lambda)^{P - V - E}$, where $P, V, E$ are the number of propagator lines, the number of vertices
and the number of "external" lines (lines connected to a source) respectively. The number of
"internal" lines is $P - E$. In the diagram below, we have $P = 6$, $V = 2$, $E = 4$ giving $\lambda^0$.
[Figure: diagram with 2 vertices and 4 sources, each carrying $\lambda^{-1}$, and 6 propagator lines, each carrying $\lambda$]
For topologically connected diagrams we also have $L = I - V + 1 = P - E - V + 1$. Hence
the factor of $\lambda$ is $(\lambda)^{L-1}$. We can thus write, $W_{\gamma,\lambda}[J] = \sum_{L=0}^{\infty}\lambda^{L-1}W_{\gamma,L}[J]$ and organize the
diagrams by the number of loops. In the formal limit, $\lambda \to 0$, $W_{\gamma,\lambda\to 0}[J] \to \frac{1}{\lambda}W_{\gamma,L=0}$ which
is the contribution of the tree diagrams (with exact propagators and exact vertices alone).
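The counting of powers of $\lambda$ is easy to mechanize. The two helper functions below are an illustrative sketch (their names are ours); they encode only the rules stated above, namely $+1$ per propagator, $-1$ per vertex and per source, and $L = (P - E) - V + 1$ for a topologically connected diagram.

```python
def lambda_power(P, V, E):
    """Power of lambda carried by a diagram: +1 per propagator, -1 per vertex and per source."""
    return P - V - E


def loop_number(P, V, E):
    """Loops of a topologically connected diagram: L = I - V + 1 with I = P - E internal lines."""
    return (P - E) - V + 1


# The example quoted in the text: P = 6, V = 2, E = 4.
example = (6, 2, 4)
print(lambda_power(*example), loop_number(*example))

# A tree: one cubic vertex attached to three sources, P = 3, V = 1, E = 3.
tree = (3, 1, 3)
print(lambda_power(*tree), loop_number(*tree))
```

The first diagram has one loop and carries $\lambda^0$; the tree carries $\lambda^{-1}$. In both cases the power equals $L - 1$.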
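On the other hand, we may evaluate $Z_{\gamma,\lambda}[J]$ in the limit $\lambda \to 0$ by the stationary phase
approximation. Clearly, the path integral is dominated by the fields $\varphi(x)$ which satisfy
$\frac{\delta\gamma}{\delta\varphi} + J = 0$. Denoting $\varphi_{cl}(x)$ as its solution, we get,
\[
\begin{aligned}
Z_{\gamma,\lambda\to 0}[J] \simeq \exp\Big[ \frac{i}{\lambda}\Big\{ \gamma[\varphi_{cl}] + \int d^4x\, J(x)\varphi_{cl}(x) \Big\} \Big] &= e^{iW_{\gamma,\lambda\to 0}[J]} \simeq \exp\Big[ \frac{i}{\lambda}W_{\gamma,L=0}[J] \Big] . \\
\therefore\ \gamma[\varphi_{cl}] + \int d^4x\, J(x)\varphi_{cl}(x) &= W_\gamma[J]\Big|_{\rm tree} = W_S[J] .
\end{aligned}
\]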
The last equation shows that γ[ϕ] is just the Legendre transform of W [J] and by construction,
it is the generating function of connected, topologically connected, 1PI diagrams with external legs amputated! The last property follows because the vertex functions γn (x1 , · · · , xn )
explicitly do not have external propagator lines included in them.
Note: We have followed a route of organizing the diagrams as a tree of 1PI diagrams
connected by exact propagators and inferred the corresponding generating function γ[ϕ] as
a Legendre transform of the W [J]. The usual approach is to begin with a Legendre transform
definition and arrive at its interpretation. This conventional approach goes as follows.
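Define $\Phi(x)[J] := \frac{\delta W[J]}{\delta J(x)} = \langle 0|\hat{\Phi}(x)|0\rangle_J$. Here $J$ is not set to zero and hence this is not the
1-point function. We have the path integral representation as,
\[
\Phi(x)[J] = \frac{-i\,\delta\ln(Z[J])}{\delta J(x)} = \frac{-i}{Z[J]}\frac{\delta Z[J]}{\delta J(x)} = \frac{\int \mathcal{D}\varphi\; \varphi(x)\,e^{iS[J]}}{\int \mathcal{D}\varphi\; e^{iS[J]}} , \quad v := \Phi(x)[J = 0] .
\]
Define $\bar{\Phi}(x) := \Phi(x) - v$. Then $\langle 0|\hat{\bar{\Phi}}(x)|0\rangle = 0$. This generalizes to,
Claim:
\[
\frac{\delta^n}{\delta J(x_1)\dots\delta J(x_n)}W[J]\Big|_{J=0} = i^{n-1}\langle 0|T\{\bar{\Phi}(x_1)\dots\bar{\Phi}(x_n)\}|0\rangle , \quad \forall\ n \geq 2 .
\]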
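The proof is by induction starting at $n = 2$. For $n = 2$ we have,
\[
\begin{aligned}
\frac{\delta^2}{\delta J_1\,\delta J_2}W[J]\Big|_{J=0} &= \frac{\delta}{\delta J_2}\Big[ \frac{-i}{Z[J]}\frac{\delta Z[J]}{\delta J_1} \Big]\Big|_{J=0} = \Big[ \frac{i}{Z^2}\frac{\delta Z[J]}{\delta J_2}\frac{\delta Z[J]}{\delta J_1} - \frac{i}{Z}\frac{\delta^2Z[J]}{\delta J_2\,\delta J_1} \Big]\Big|_{J=0} \\
&= \frac{i}{1}(iv)(iv) - i(i^2)\langle 0|\Phi(x_1)\Phi(x_2)|0\rangle \\
&= i\big[ \langle 0|\Phi(x_1)\Phi(x_2)|0\rangle - v^2 \big] = i\,\langle 0|\bar{\Phi}(x_1)\bar{\Phi}(x_2)|0\rangle .
\end{aligned}
\]
This verifies the claim for $n = 2$. The pattern repeats and the proof follows.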
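As a corollary, we can write
\[
W[J] = \sum_{n \geq 2}\frac{1}{n!}\int dx_1\dots dx_n\, J(x_1)\dots J(x_n)\,\frac{\delta^nW}{\delta J_1\dots\delta J_n}\Big|_{J=0} \tag{22.1}
\]
\[
\phantom{W[J]} = \sum_{n \geq 2}\frac{(i)^{n-1}}{n!}\int dx_1\dots dx_n\, \langle 0|T\{\bar{\Phi}(x_1)\dots\bar{\Phi}(x_n)\}|0\rangle\, J(x_1)\dots J(x_n) \tag{22.2}
\]
Using $\Phi(x)[J] = \delta_{J(x)}W[J]$, define the Legendre transform of $W[J]$,
\[
\Gamma[\Phi] := W[J] - \int d^4x\, J(x)\Phi(x) , \quad \text{(compare with: } -H(p) = L(\dot{q}) - p\dot{q}\text{ )} \tag{22.3}
\]
Notice that this is exactly the same as the $\gamma[\varphi]$ defined above.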
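The $\Gamma[\Phi]$ is defined by expressing the right hand side as a function of $\Phi$; in particular
$J = J[\Phi]$ is understood, which is obtained by inverting the relation $\Phi[J] = \delta_JW[J]$. It follows,
as usual for a Legendre transform,
\[
\frac{\delta\Gamma[\Phi]}{\delta\Phi(x)} = \int d^4y\, \frac{\delta W}{\delta J(y)}\frac{\delta J(y)}{\delta\Phi(x)} - \int d^4y\, \frac{\delta J(y)}{\delta\Phi(x)}\Phi(y) - \int d^4y\, J(y)\frac{\delta\Phi(y)}{\delta\Phi(x)} = -J(x)
\]
In the last equality we have used $\frac{\delta\Phi(y)}{\delta\Phi(x)} = \delta^4(x - y)$; the first two terms cancel since $\frac{\delta W}{\delta J(y)} = \Phi(y)$. Since $\Phi(x)[J = 0] = v$, we also have
$\frac{\delta\Gamma[\Phi]}{\delta\Phi(x)}\Big|_{\Phi(x)=v} = 0$. Thus, $v$ is that value which extremises $\Gamma[\Phi(x)]$ and this hints at $\Gamma[\Phi]$ being
some sort of action whose extremization leads to a solution. This is supported further as
follows.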
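We have $Z[J] = e^{iW[J]} = \int \mathcal{D}\varphi\, e^{iS[J,\varphi]}$. Let $\varphi_{cl}(x)$ be a classical solution i.e. $\delta S[J, \varphi_{cl}] = 0$.
Expand the action around such a solution by setting $\varphi(x) = \varphi_{cl}(x) + \eta(x)$. We get
(schematically to avoid clutter),
\[
\begin{aligned}
S[\varphi] = S[\varphi_{cl} + \eta] &= S[\varphi_{cl}] + \underbrace{\frac{\delta S}{\delta\varphi_{cl}}}_{0}\cdot\eta + \frac{1}{2}\frac{\delta^2S}{\delta\varphi^2}\eta^2 + \dots \\
&= S[\varphi_{cl}] + \frac{1}{2}\int d^4x \int d^4y\, \frac{\delta^2S[J, \varphi]}{\delta\varphi_{cl}(x)\,\delta\varphi_{cl}(y)}\,\eta(x)\eta(y) + \dots \\
\therefore\ e^{iW[J]} &= e^{iS[J,\varphi_{cl}]}\int \mathcal{D}\eta\, \exp\Big\{ \frac{i}{2}\int d^4x \int d^4y\, \eta(x)\frac{\delta^2S}{\delta\varphi_{cl}(x)\,\delta\varphi_{cl}(y)}\eta(y) + \dots \Big\}
\end{aligned}
\]
The path integral does have an implicit dependence on $J(x)$ through the $\varphi_{cl}(x)$, unless the
action itself is quadratic as was the case of the oscillator.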
Momentarily let us just ignore the path integral altogether. Then, $W[J] \approx W_0[J] := S[J, \varphi_{cl}] = \int d^4x\,[\mathcal{L}(\varphi_{cl}) + J(x)\varphi_{cl}(x)]$
where the classical solution is first obtained by
solving $\delta S[J, \varphi] = 0$ and then substituted back in the action.
Applying our definition, $\Phi(x) = \delta_{J(x)}W[J] \approx \delta_{J(x)}W_0[J] = \delta_{J(x)}S_{cl}[J]$, $S_{cl} := S[J, \varphi_{cl}(x)[J]]$.
Straightforward evaluation gives,
\[
\Phi(x) = \int d^4y\, \underbrace{\frac{\delta S_{cl}}{\delta\varphi_{cl}(y)}}_{=0}\frac{\delta\varphi_{cl}(y)}{\delta J(x)} + \varphi_{cl}(x)
\]
the last term coming from $\delta_J\int J\varphi$. Hence, in this approximation, $\Phi(x) = \varphi_{cl}(x)$.
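Proceeding with the Legendre transform, we get
\[
\begin{aligned}
\Gamma_0[\Phi] &= W_0[J] - \int d^4x\, J(x)\Phi(x) \\
&= \underbrace{S[\varphi_{cl}]}_{\text{no explicit } J} + \int d^4x\, J(x)\varphi_{cl}(x) - \int d^4x\, J(x)\Phi(x) = \underbrace{S[\varphi_{cl}]}_{\text{no explicit } J} .
\end{aligned} \tag{22.4}
\]
Thus, within the approximation, $\Gamma[\Phi]$ is just the classical action ($\varphi_{cl}(x) \to \Phi(x)$).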
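The tree-level statements $\Phi = \varphi_{cl}$ and $\Gamma_0[\Phi] = S[\varphi_{cl}]$ can be checked in a zero-dimensional toy model, where the field is a single number and functional derivatives become ordinary ones. Everything specific in the sketch below is an assumption made for illustration: the toy action $S(\varphi) = -\frac{1}{2}a\varphi^2 - \frac{g}{6}\varphi^3$, the parameter values `A`, `G`, and the use of a central finite difference for $dW_0/dJ$.

```python
import math

A, G = 1.0, 0.2  # hypothetical toy parameters a and g


def action(phi):
    """Toy source-free action S(phi) = -a phi^2 / 2 - g phi^3 / 6."""
    return -0.5 * A * phi ** 2 - G * phi ** 3 / 6.0


def phi_classical(J):
    """Solve dS/dphi + J = 0, i.e. (g/2) phi^2 + a phi - J = 0, on the branch with phi -> J/a as g -> 0."""
    return (-A + math.sqrt(A * A + 2.0 * G * J)) / G


def W0(J):
    """Tree-level W: the action plus source term evaluated on the classical solution."""
    phi = phi_classical(J)
    return action(phi) + J * phi


def Phi(J, h=1e-5):
    """Phi = dW0/dJ, estimated with a central finite difference."""
    return (W0(J + h) - W0(J - h)) / (2.0 * h)


def Gamma0(J):
    """Legendre transform W0 - J Phi, evaluated at the Phi that corresponds to J."""
    return W0(J) - J * Phi(J)


J = 0.5
print(Phi(J), phi_classical(J))
print(Gamma0(J), action(phi_classical(J)))
```

Both printed pairs agree up to the finite-difference error, which illustrates that the term proportional to $\delta S_{cl}/\delta\varphi_{cl}$ drops out on the classical solution.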
But this also shows that when the path integral is included, $W[J]$ will be very different and
so also the corresponding $\Gamma[\Phi]$. Thus, $\Phi(x)$ is identified as the "quantum corrected solution"
while the $\Gamma[\Phi]$ is called a "quantum action" or an "effective action" incorporating quantum
corrections.
Consider Taylor expanding the effective action about $\Phi(x) = v$. We have $\Gamma[v] = W[J = 0] - \int 0\cdot\Phi = 0$
since $W[0] = 0$ by our normalization, $Z[0] = 1$. We have also seen that $\delta_v\Gamma = 0$. The
remaining terms are a power series in $(\Phi(x) - v) =: \bar{\Phi}(x)$ with the coefficients evaluated at
$v$. This is the same as regarding $\Gamma$ as a function of $\bar{\Phi}(x)$ and expanding it about $\bar{\Phi}(x) = 0$
i.e.
\[
\Gamma[\bar{\Phi}] = \sum_{n \geq 2}\int dx_1\cdots dx_n\, \frac{\Gamma(x_1, \cdots, x_n)}{n!}\,\bar{\Phi}(x_1)\cdots\bar{\Phi}(x_n) .
\]
What do these coefficients represent? We already have the answer. We just need to recognize that Γ[Φ̄] = γ[ϕ]! The coefficients are the contribution of all connected, topologically
connected, 1PI diagrams without external legs. An explicit demonstration of amputation of
the external legs may be seen in [17]. The generating functions are summarized in the table
below.
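In summary
\[ Z[J] := \int \mathcal{D}\varphi\, e^{iS[J,\varphi]} , \quad Z[J = 0] = 1 , \ \text{(normalization)} \tag{22.5} \]
\[ G(x_1, \dots, x_n) := (-i)^n\frac{\delta^nZ[J]}{\delta J(x_1)\dots\delta J(x_n)}\Big|_{J=0} = \langle 0|T\{\Phi(x_1)\dots\Phi(x_n)\}|0\rangle \tag{22.6} \]
\[ W[J] := -i\ln(Z[J]) , \quad W[J = 0] = 0 \tag{22.7} \]
\[ G_c(x_1, \dots, x_n) := (-i)^{n-1}\frac{\delta^nW[J]}{\delta J(x_1)\dots\delta J(x_n)}\Big|_{J=0} = \langle 0|T\{\Phi(x_1)\dots\Phi(x_n)\}|0\rangle_c \tag{22.8} \]
\[ \Phi(x)[J] := \frac{\delta W[J]}{\delta J(x)} , \quad v := \Phi(x)[J = 0] , \quad \bar{\Phi}(x) := \Phi(x) - v \tag{22.9} \]
\[ \Gamma[\Phi] := W[J] - \int d^4x\, J(x)\Phi(x) \tag{22.10} \]
\[ J(x) = -\frac{\delta\Gamma[\Phi]}{\delta\Phi(x)} , \quad \frac{\delta\Gamma[\Phi]}{\delta\Phi(x)}\Big|_{\Phi(x)=v} = 0 \tag{22.11} \]
\[ Z[J] = 1 + \sum_{n \geq 1}\frac{i^n}{n!}\int d^4x_1\dots d^4x_n\, G(x_1, \dots, x_n)J(x_1)\dots J(x_n) \tag{22.12} \]
\[ W[J] = \sum_{n \geq 1}\frac{i^{n-1}}{n!}\int d^4x_1\dots d^4x_n\, G_c(x_1, \dots, x_n)J(x_1)\dots J(x_n) \tag{22.13} \]
\[ \Gamma[\bar{\Phi}] = \sum_{n \geq 2}\frac{1}{n!}\int d^4x_1\dots d^4x_n\, \Gamma_n(x_1, \dots, x_n)\bar{\Phi}(x_1)\dots\bar{\Phi}(x_n) \tag{22.14} \]
\[ G(x_1, \dots, x_n) \leftrightarrow \text{all connected diagrams} \tag{22.15} \]
\[ G_c(x_1, \dots, x_n) \leftrightarrow \text{all connected and topologically connected diagrams} \tag{22.16} \]
\[ \Gamma(x_1, \dots, x_n) \leftrightarrow \text{all connected, topologically connected, 1PI diagrams with external legs amputated} \tag{22.17} \]
B. The Renormalization Group Equation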
Recall that while discussing the renormalized perturbation series, we introduced $\phi_0, m_0, g_0$
as the 'bare' quantities with $\mathcal{L}(\phi_0, m_0, g_0)$ generating the diagrams containing the UV divergences. We then introduced the renormalized variables $\phi, m, g_k$ defined through the scaling:
$\phi_0 := \sqrt{Z_\phi}\,\phi$, $m_0^2Z_\phi := m^2Z_m$, $g_{0k}Z_\phi^{k/2} := Z_{g_k}g_k$, which were finite by definition. We also
introduced counter terms to take care of the UV divergences. With the $Z, G, \Gamma$ we have the
formal representation of the totality of all diagrams. Going over to the momentum space
labels we have (we take the 1-point function to be zero and a single coupling constant for
convenience),
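\[
\begin{aligned}
\Gamma_{0,n}(p_1, \dots, p_n) := \frac{\delta^n\Gamma_0[\phi_0]}{\delta\phi_0(p_1)\dots\delta\phi_0(p_n)}\Big|_{\phi_0=0} , &\quad \Gamma_n(p_1, \dots, p_n) := \frac{\delta^n\Gamma_0[\sqrt{Z_\phi}\,\phi]}{\delta\phi(p_1)\dots\delta\phi(p_n)}\Big|_{\phi=0} \\
\Rightarrow\ \Gamma_{0,n}(p_i, m_0, g_0) &= Z_\phi^{-n/2}\,\Gamma_n(p_i, m, g)
\end{aligned} \tag{22.18}
\]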
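The renormalized parameters are defined by a set of renormalization conditions, which
introduce a scale, say, $\mu$ (e.g. in the MS or $\overline{\rm MS}$ scheme). The bare quantities know nothing about
this scale and therefore $\mu\frac{d\Gamma_{0,n}(p_i, m_0, g_0)}{d\mu} = 0$. Substitution gives,
\[
\begin{aligned}
0 &= \Big( \mu\frac{d}{d\mu}Z_\phi^{-n/2} \Big)\Gamma_n + Z_\phi^{-n/2}\,\mu\frac{d}{d\mu}\Gamma_n \quad \text{and,} \\
\mu\frac{d}{d\mu}\Gamma_n &= \Big[ \mu\frac{\partial}{\partial\mu} + \mu\frac{dm}{d\mu}\frac{\partial}{\partial m} + \mu\frac{dg}{d\mu}\frac{\partial}{\partial g} \Big]\Gamma_n(p_i, m, g)
\end{aligned}
\]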
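Thus we have the renormalization group (RG) equation for the renormalized vertex function
as,
\[
\mu\frac{\partial}{\partial\mu}\Gamma_n(p_i, m(\mu), g(\mu))\Big|_{m,g} + \beta(g)\frac{\partial}{\partial g}\Gamma_n\Big|_{\mu,m} - \gamma_m(g)\,m\frac{\partial}{\partial m}\Gamma_n\Big|_{\mu,g} - n\gamma_\phi\Gamma_n = 0 , \tag{22.19}
\]
where,
\[
\beta(g, m) := \mu\frac{dg}{d\mu} , \quad \gamma_m(g, m) := -\mu\frac{d\ln(m)}{d\mu} , \quad \gamma_\phi(g, m) := \frac{1}{2}\mu\frac{d\ln(Z_\phi)}{d\mu} . \tag{22.20}
\]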
The first is the beta function, the second is called the mass anomalous dimension while the
last is called the anomalous dimension. These are computed typically in a perturbation
theory as a power series in the renormalized coupling (see below). Different schemes give
different simplifications and there is some scheme dependence in the coefficients of these
functions.
As mentioned above, we describe a method of computation of the renormalization constants and their dependence on the couplings in the MS scheme [18].
We have the defining equations (22.19, 22.20). These were obtained by noting that the
bare vertex functions as a function of the bare parameters and the regulator are independent
of $\mu$, for fixed $g_0$, $m_0$ and $\epsilon$. When the renormalized parameters are not defined in terms of
physically measured quantities, how do we compute the $\mu$ dependence of these parameters?
For this, we go back to the basic definitions and recall the relation between the bare and the
renormalized parameters, in particular, $g_0 := \mu^\epsilon Z_g\,g$. Note that we have absorbed the $Z_\phi$
factor into $Z_g$ and have also introduced the $\mu$ dependence by taking $g$ to be dimensionless.
The $\mu\partial_\mu$ is now evaluated with the bare parameters and the regulator held fixed i.e.
\[
\beta(g) = \mu\partial_\mu\, g(g_0\mu^{-\epsilon}, \epsilon)\Big|_{g_0,\epsilon} = \mu\partial_\mu\big( g_0\mu^{-\epsilon}Z_g^{-1} \big) = -\epsilon g - g\,\mu\partial_\mu\ln(Z_g) .
\]
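Noting that $Z_g$ is determined as a function of $g$, $Z_g(g, \epsilon) = 1 + \sum_{k \geq 1}\epsilon^{-k}Z_g^k$, we can write
the defining equation for the $\beta$ function in the form,
\[
\beta(g, \epsilon) = -\epsilon g - g\,\beta(g, \epsilon)\,\partial_g\ln(Z_g(g, \epsilon)) \ \leftrightarrow\ \beta(g, \epsilon)\,\partial_g\big( gZ_g(g, \epsilon) \big) + \epsilon\,gZ_g(g, \epsilon) = 0 .
\]
The $\beta(g, \epsilon)$ must have a smooth limit as $\epsilon \to 0$ and we may consider $\beta \sim \beta_0 + \beta_1\epsilon + \beta_2\epsilon^2 + \cdots$.
The $\partial_g(gZ_g(g, \epsilon)) \sim 1 + \epsilon^{-1} + \cdots$ while $\epsilon\,gZ_g(g, \epsilon) \sim \epsilon + \epsilon^0 + \epsilon^{-1} + \cdots$. Clearly, the coefficients
of the positive powers of $\epsilon$ greater than 1 must vanish. Hence we take, $\beta(g, \epsilon) := \beta_0(g) + \epsilon\beta_1$.
It follows that $\beta_1 = -g$ and the equation reduces to,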
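\[ 0 = (\beta_0 - \epsilon g)(Z_g + g\,\partial_gZ_g) + \epsilon\,gZ_g \tag{22.21} \]
\[ 0 = \beta_0\Big[ 1 + \sum_{k \geq 1}\frac{1}{\epsilon^k}\partial_g(gZ_g^k) \Big] - \Big[ g^2\partial_gZ_g^1 + \sum_{k \geq 1}\frac{g^2\partial_gZ_g^{k+1}}{\epsilon^k} \Big] \tag{22.22} \]
\[ \therefore\quad \beta_0(g) = g^2\partial_gZ_g^1 , \quad \beta_0\,\partial_g(gZ_g^k) = g^2\partial_gZ_g^{k+1} . \tag{22.23} \]
The $\beta_0(g)$ is the usual beta function and is obtained from the coefficient $Z_g^1$ of the renormalization constant $Z_g$.
The $Z_g^{k \geq 2}$ coefficients are also determined recursively.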
Note: The above expressions are in the context of the $\Phi^4$ coupling. For Yukawa or
the Yang-Mills cubic coupling, the coupling has $g_{\rm Yukawa} = \mu^{\epsilon/2}g_{\rm dimensionless}$ and the above
equations will change accordingly.
As a practical application of the renormalization group equation, consider a theory which
is massless and remains so perturbatively. Then the anomalous mass dimension term is
absent. Putting $t := \ln(\mu/\mu_0)$ for some arbitrary scale $\mu_0$ at which the theory is defined
(renormalization conditions are imposed), we write $\mu\frac{\partial}{\partial\mu} = \frac{\partial}{\partial t}$ and the equation takes the
form,
\[
\Big[ \frac{\partial}{\partial t} + \beta(g)\frac{\partial}{\partial g} - n\gamma(g) \Big]\Gamma_n(g, t, p_i) = 0 \quad \text{with} \quad \beta(g) = \frac{dg(t)}{dt}
\]
This immediately gives $\frac{d\Gamma_n(g(t), t, p_i)}{dt} = n\gamma(g(t))\,\Gamma_n$. Its solution is obtained as,
\[
\Gamma_n(t, g_0, p_i) = \Gamma_{n,0}(g(t, g_0), p_i)\,\exp\Big[ n\int_0^t dt'\, \gamma(g(t', g_0)) \Big] \tag{22.24}
\]
where $g(t, g_0)$ is the solution of:
\[
\frac{dg(t, g_0)}{dt} = \beta(g(t, g_0)) , \quad g(0, g_0) = g_0 .
\]
Once $g(t)$ is known,
the $t$ dependence of all vertex functions is determined. A good deal of qualitative behavior
can be gleaned from studying the beta function - especially near its zeros. We know it has
a zero at $g = 0$ (perturbative calculation), but it may have other zeros. The derivative of
$\beta(g)$ near its fixed points controls how $g(t)$ evolves with $t$.
Although typically the beta function is known only as a power series in the coupling, let us assume it to be some given function of the form shown in the figure.
[Figure: plot of $\beta(g)$ against $g$ with zeros at $g_0, g_1, g_2, g_3$; the intervals between successive zeros are labelled I, II, III]
It has multiple zeros, say $g_0 = 0 < g_1 < g_2 < g_3 \ldots$. Let I, II, III, ... denote the
intervals on the $g$-axis bounded by consecutive zeros. Clearly, if the initial coupling lies in one of these intervals,
$g(t)$ will remain in that interval for all $t$, since it cannot cross a zero of $\beta$. If $\beta(g) > 0$ in an interval, then $g(t)$ will evolve to its upper
bound, and to its lower bound if $\beta(g) < 0$. A coupling will always flow to a UV-stable (we are
considering increasing $t$) fixed point. In the figure, $g_1, g_3$ are the stable fixed points while
$g_0, g_2$ are unstable ones.
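The flow pattern described above can be checked numerically. The following is a minimal sketch, assuming a purely illustrative quartic beta function with zeros at 0, 1, 2, 3 and the sign pattern of the figure; the function names `beta` and `flow`, the step size and the starting values are all assumptions made for this example and do not come from any specific theory.

```python
# Toy beta function (an assumption for illustration only):
# zeros at g = 0, 1, 2, 3 with beta > 0 on (0,1), beta < 0 on (1,2), beta > 0 on (2,3).
def beta(g):
    return -g * (g - 1.0) * (g - 2.0) * (g - 3.0)


def flow(g_start, t_max=20.0, dt=0.01):
    """Integrate dg/dt = beta(g) from t = 0 to t = t_max with classical RK4."""
    g = g_start
    for _ in range(int(round(t_max / dt))):
        k1 = beta(g)
        k2 = beta(g + 0.5 * dt * k1)
        k3 = beta(g + 0.5 * dt * k2)
        k4 = beta(g + dt * k3)
        g += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return g


if __name__ == "__main__":
    # Starting points in intervals I, II and III respectively.
    for g_start in (0.2, 1.7, 2.3):
        print(g_start, "->", flow(g_start))
```

With this toy choice, couplings starting in intervals I and II are driven to the zero at 1 and those starting in interval III to the zero at 3, while a coupling placed exactly on a zero stays there.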
Let g0 be in region I so that as t → ∞, g(t, g0 ) → g1 . The asymptotic behavior of Γn is
controlled by the integral of the anomalous dimension. Let us write,
\[
\int_0^t dt'\,\gamma(g(t', g_0)) = \int_0^t dt'\,\big[\gamma(g(t', g_0)) - \gamma(g_1)\big] + \int_0^t dt'\,\gamma(g_1) .
\]
In the first term, we may take the upper limit $t \to \infty$ since the integrand provides exponential
suppression. This integral is then some constant, $C$. In the second term, $\gamma(g_1)$ is independent
of $t'$ and thus the integral evaluates to $\gamma(g_1) \cdot t$. Hence,
\[
\Gamma_n(t, g_0, p_i) \xrightarrow[t\to\infty]{} \big[\Gamma_{n,0}(g_1, p_i) \cdot C\big] \cdot \exp\{\gamma(g_1)\,t\} , \qquad t = \ln(\mu/\mu_0) .
\]
Here $\mu$ is some large scale, e.g. some of the invariants $p_i \cdot p_j$. Thus, we get $\Gamma_n(\ln(\mu/\mu_0), g_0, p_i) \xrightarrow[\mu\to\infty]{} (\mu/\mu_0)^{\gamma(g_1)}$ $\forall\, n$. The exponent is the same for all $n$-point vertex functions and is governed by
a stable fixed point in the vicinity of $g_0$. All these vertex functions have the same asymptotic
behavior. For further applications, I refer you to [19].
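The splitting of the integral of the anomalous dimension into a constant plus a term linear in $t$ can be illustrated numerically. The sketch below is only a plausibility check under stated assumptions: the quartic `beta` with a stable zero at `G1 = 1`, the quadratic `gamma`, and the helper names `integrated_gamma` and `offset` are all hypothetical choices made for this example.

```python
def beta(g):
    # Toy beta function (assumption): stable zero at g1 = 1 for increasing t.
    return -g * (g - 1.0) * (g - 2.0) * (g - 3.0)


def gamma(g):
    # Toy anomalous dimension (assumption).
    return 0.1 * g * g


G1 = 1.0  # the fixed point reached from interval I in this toy model


def integrated_gamma(g_start, t_max, dt=0.01):
    """Integral of gamma(g(t')) over 0 <= t' <= t_max, with dg/dt = beta(g) (RK4)."""
    g, integral = g_start, 0.0
    for _ in range(int(round(t_max / dt))):
        k1 = beta(g)
        k2 = beta(g + 0.5 * dt * k1)
        k3 = beta(g + 0.5 * dt * k2)
        k4 = beta(g + dt * k3)
        # Same RK4 stages applied to the quadrature dI/dt = gamma(g).
        integral += dt * (gamma(g)
                          + 2.0 * gamma(g + 0.5 * dt * k1)
                          + 2.0 * gamma(g + 0.5 * dt * k2)
                          + gamma(g + dt * k3)) / 6.0
        g += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return integral


def offset(t_max, g_start=0.2):
    """Integral minus gamma(G1) * t_max; should settle to a constant for large t_max."""
    return integrated_gamma(g_start, t_max) - gamma(G1) * t_max


if __name__ == "__main__":
    print(offset(10.0), offset(20.0))
```

In this toy model the two printed numbers agree to high accuracy, reflecting that the difference $\gamma(g(t')) - \gamma(g_1)$ dies off exponentially once $g(t')$ is close to the fixed point.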
C. The Background Field Method
As an illustration of the formal advantages of the functional methods, we
discuss the so-called background field method, which is extremely useful for non-abelian gauge theories
[20]. For illustration, we consider the simplest case of a scalar field ϕ.
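We have the basic definitions
\[
e^{iW[J]} = Z[J] := \int \mathcal{D}\varphi\; e^{iS[\varphi] + i\int J(x)\varphi(x)} , \qquad \bar\varphi(x) := \frac{\delta W[J]}{\delta J(x)} , \qquad \Gamma[\bar\varphi] := W[J(\bar\varphi)] - \int J(x)\bar\varphi(x) .
\]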
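Introduce a new, arbitrary background field, $\chi(x)$, and define,
\[
e^{i\tilde W[J,\chi]} = \tilde Z[J, \chi] := \int \mathcal{D}\varphi\; e^{iS[\varphi+\chi] + i\int J(x)\varphi(x)} ,
\]
\[
\tilde\varphi(x) := \frac{\delta \tilde W[J, \chi]}{\delta J(x)} , \qquad \tilde\Gamma[\tilde\varphi, \chi] := \tilde W[J(\tilde\varphi, \chi), \chi] - \int J(x)\tilde\varphi(x) .
\]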
In the new generating functionals defined, shift the integration variable as $\varphi \to \varphi - \chi$.
This is a simple translation and gives the Jacobian to be 1. It is immediate that $\tilde Z[J, \chi] = Z[J]\,e^{-i\int J(x)\chi(x)}$ and $\tilde W[J, \chi] = W[J] - \int J(x)\chi(x)$. Clearly, $\tilde\varphi(x) = \bar\varphi(x) - \chi(x)$ and
\[
\tilde\Gamma[\tilde\varphi, \chi] = W[J] - \int J(x)\chi(x) - \int J(x)\big(\bar\varphi(x) - \chi(x)\big) = \Gamma[\bar\varphi] = \Gamma[\tilde\varphi + \chi] .
\]
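The effect of the shift can be illustrated in a zero-dimensional, Euclidean toy version of the generating functional, where the "path integral" is an ordinary integral over one real variable and the oscillatory weight is replaced by a damped one. This is only a sketch under those assumptions; the quartic `action`, the coupling value, the integration range and the helper names `integrate`, `Z` and `Z_shifted` are hypothetical choices for this example. In this setting the analogue of the relation above reads Z_shifted(J, chi) = exp(-J*chi) * Z(J).

```python
import math


def action(phi, lam=0.5):
    # Toy Euclidean "action" of a single variable (assumption for illustration).
    return 0.5 * phi ** 2 + lam * phi ** 4 / 24.0


def integrate(f, lo=-12.0, hi=12.0, n=4800):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def Z(J):
    """Zero-dimensional analogue of Z[J]: integral of exp(-S(phi) + J*phi)."""
    return integrate(lambda phi: math.exp(-action(phi) + J * phi))


def Z_shifted(J, chi):
    """Analogue of the background-field functional: the action is evaluated at phi + chi."""
    return integrate(lambda phi: math.exp(-action(phi + chi) + J * phi))


if __name__ == "__main__":
    J, chi = 0.3, 0.7
    print(Z_shifted(J, chi), math.exp(-J * chi) * Z(J))
```

The two printed numbers agree up to quadrature error, which is the finite-dimensional counterpart of the translation invariance of the measure used in the text.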
If we had arranged that $\tilde\varphi = 0$, then $\Gamma[\chi] = \tilde\Gamma[0, \chi]$. The background field χ being arbitrary,
we get an alternate method of computing the effective action Γ[χ] using the shifted fields.
The shifted action has the form:
\[
S[\varphi + \chi] = S[\chi] + \int L_1(\chi)\,\varphi(x) + \int L_2(\chi)\,\varphi^2(x) + \cdots .
\]
Since there is no functional integration over χ, the first term is just the classical action.
Since χ is arbitrary, the second term does not vanish (it vanishes if χ is an exact solution of
the classical equations of motion). The third term gives the propagator for the ϕ field and
the higher order terms give interactions among the ϕ, χ fields.
Now, the shifted effective action, Γ̃ is a generator of 1PI diagrams in presence of ϕ̃ and
χ. That is, its derivatives with respect to ϕ̃ will give the 1PI diagrams in presence of χ.
The ϕ propagator, which comes from the terms in the action which are quadratic in ϕ, will
in general be χ−dependent and the vertices will have factors of χ(x). When ϕ̃ = 0, only
the diagrams with no external ϕ̃ lines i.e. only the vacuum diagrams will contribute. Thus
the desired effective action can be computed using only vacuum diagrams, albeit with vertex
factors having the χ field.
There are two ways to approach the computation. If we treat the background field exactly,
then in obtaining the Feynman rules the propagator of the ϕ field will have the χ field as
well (the terms quadratic in ϕ in S[ϕ + χ]). Except when χ is space-time independent, this
propagator is complicated. And for a general background field, this method is not useful.
An alternative is to treat χ also perturbatively, i.e. the ϕ propagator remains exactly the same
as before and the additional χ−dependent terms are treated as additional couplings (the ϕ
factors are replaced by $\delta/\delta J$). The Feynman rules have the same ϕ−propagator, there is no
propagator for the χ field and the vertices have extra edges denoting the background field.
Since there is no χ propagator, these vertices generate only external χ lines.
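As an explicit example, consider the $(\varphi^3)_6$ theory. The shift by a background field gives,
\[
S[\varphi + \chi] = S[\chi] + S[\varphi] + \int d^6x \left[ -\partial_\mu\varphi\,\partial^\mu\chi - m^2\varphi(x)\chi(x) - \frac{g}{2!}\Big(\varphi(x)\chi^2(x) + \varphi^2(x)\chi(x)\Big) \right]
\]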
The S[χ] comes out of the path integral, the S[ϕ] gives the usual ϕ−propagator and the ϕ3
vertex while the last four terms give the additional interaction vertices.
Since in the 1PI diagrams we are interested in, internal lines are only ϕ−lines and the
external lines are only χ lines, the first three of the four additional vertices are irrelevant and
only the ϕ3 and ϕ2 χ vertices remain. It is a simple exercise in power counting to determine
the superficially divergent 1PI diagrams with no internal χ−lines and no external ϕ−lines.
Consider a 1PI diagram with $n_{0,3}$ vertices of the $\varphi^3$ type and $n_{1,2}$ vertices of the $\chi\varphi^2$ type.
Then the number of internal ϕ−lines, $I_\varphi$, is given by $2I_\varphi = 3n_{0,3} + 2n_{1,2}$ and the number of external χ−lines is given
by $E_\chi = n_{1,2}$. With $L = I_\varphi - (n_{0,3} + n_{1,2}) + 1$ loops in six dimensions, the superficial degree of divergence is $D = 6L - 2I_\varphi$, which gives
$D = 6 - 2E_\chi$. Notice that $n_{0,3}$ drops out of the degree of divergence. The renormalizability
of the theory is thus manifest and only the 2−point and the 3−point vertex functions are
divergent.
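The power counting can be verified mechanically. The sketch below assumes a connected diagram in six dimensions in which every internal line is a ϕ line contributing two inverse powers of momentum, and uses the standard loop count L = I - V + 1; the function name `degree_of_divergence` and its arguments are hypothetical names chosen for this example.

```python
def degree_of_divergence(n03, n12, dim=6):
    """Superficial degree of divergence of a connected diagram built from
    n03 phi^3 vertices and n12 chi*phi^2 vertices, with no external phi lines.

    Assumes each internal phi propagator behaves as 1/k^2 at large momentum.
    """
    phi_legs = 3 * n03 + 2 * n12
    if phi_legs % 2:
        raise ValueError("phi legs cannot all be paired into internal lines")
    internal = phi_legs // 2                # I_phi
    loops = internal - (n03 + n12) + 1      # L = I - V + 1
    return dim * loops - 2 * internal       # D = 6 L - 2 I_phi


if __name__ == "__main__":
    for n03 in (0, 2, 4):
        for n12 in (1, 2, 3, 4):
            print(n03, n12, degree_of_divergence(n03, n12), 6 - 2 * n12)
```

For every admissible pair of vertex counts, the last two printed columns coincide, so the result depends only on the number of external χ lines.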
How does renormalization work in the background field method? Re-scale the fields as $\varphi \to \sqrt{Z_\varphi}\,\varphi$, $\chi \to \sqrt{Z_\chi}\,\chi$. Then the ϕ−propagator will get a factor of $Z_\varphi^{-1}$. The $n_{0,3}$ vertices
give a factor of $(Z_\varphi^{3/2})^{n_{0,3}}$ and the $n_{1,2}$ vertices give a factor of $Z_\varphi^{n_{1,2}} \cdot (Z_\chi^{1/2})^{n_{1,2}}$. The $I_\varphi$ lines
give a factor of $Z_\varphi^{-I_\varphi}$. Clearly, the factors of $Z_\varphi$ cancel out since $-I_\varphi + \frac{3}{2} n_{0,3} + \frac{2}{2} n_{1,2} = 0$.
We may thus choose not to renormalise the ϕ fields. The left over factor is $Z_\chi^{n_{1,2}/2} = Z_\chi^{E_\chi/2}$,
as expected.
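The bookkeeping of the rescaling factors can be spelled out as exact rational arithmetic. This is a small sketch of the counting just described, assuming diagrams with no external ϕ lines; the function name `z_exponents` is a hypothetical name for this example.

```python
from fractions import Fraction


def z_exponents(n03, n12):
    """Return (power of Z_phi, power of Z_chi) carried by a diagram with
    n03 phi^3 vertices, n12 chi*phi^2 vertices and no external phi lines."""
    internal = Fraction(3 * n03 + 2 * n12, 2)               # I_phi
    z_phi = -internal + Fraction(3, 2) * n03 + Fraction(2, 2) * n12
    z_chi = Fraction(n12, 2)                                 # equals E_chi / 2
    return z_phi, z_chi


if __name__ == "__main__":
    print(z_exponents(2, 3))
```

The first entry is zero for any vertex counts, and the second is half the number of external χ lines.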
Notice that the diagrams that need to be computed for any vertex function, are exactly the
same as without background field except for the vertex factors from the vertices connecting
the external lines. The renormalization procedure then proceeds as usual, e.g. as discussed
in the subsection 19 B.
For scalar fields used in the illustration, there is no particular advantage. But with gauge
theories the method allows enormous simplification [20]. This is beyond the scope of this
course though.
Returning to our generating functionals Z, W, Γ, we have seen how the perturbative diagrams for the Green’s functions and the S−matrix elements (through the vertex functions)
can be subsumed by formal manipulations with the basic path integral. The utility of the
formalism goes beyond perturbation theory and computations of cross-sections. When it
comes to gauge theories, especially the non-abelian ones with/without spontaneous symmetry breaking, the generating functionals provide a convenient tool to establish renormalizability and unitarity. The proof of the Ward identities - constraints or relations enforced
by gauge invariance on different n−point functions - is most economical in this framework.
The existence of anomalies - violations of classical symmetries at the quantum level - and their
treatment are also much more transparent in these functional methods.
The basic path integral is defined without any pre-supposition of perturbation theory. If
the basic definition can be implemented, e.g. via a lattice discretization, we can potentially
have a non-perturbative handle through the computation of the vertex functions.
23. CLOSING REMARKS
The lectures were organized around four strands. The main references used for each
strand are also cited.
1. Poincare symmetry realization and its consequences [1, 2]:
This involved the particle representations, which identified the attributes of the permissible quanta;
Manifest covariance needs the field representations, and the irreducibility condition imposed the linear field equations; The unitarity analysis revealed anti-particles;
The action formulation showed the free classical fields as a dynamical system of infinitely
many independent oscillators; Introduction of interaction with a non-dynamical source
brought in the various Green’s functions, with the Feynman propagator conflicting with a
causally consistent classical interpretation;
Quantization of free fields required their covariance to be formulated differently and
led to the CPT theorem; the requirement of causality led to the spin-statistics theorem;
States of free fields naturally contain particles as the relativistic wave packets and
pave the way for recovering the usual non-relativistic limit.
2. Interacting quantum fields [9–11]:
This was discussed within the context of scattering experiments; the corresponding scattering theory postulates were discussed, leading to the Kallen-Lehmann spectral representation; vanishing of the 1-point function is required for the stability of the (unique)
vacuum;
LSZ reduction was discussed to relate the S-matrix elements to vacuum expectation
values of time ordered fields; covariant perturbation theory was premised on the interacting fields having the same canonical commutation relations as the in/out fields, which
are deemed physical - masses and residues at poles; S-matrix elements are given entirely in terms of free quantum fields with arbitrarily specified but Lorentz invariant
interactions and a diagrammatic recipe follows;
This characterizes interacting quantum fields as facilitating discrete transactions of
energy/momentum/spin/charge etc. via virtual quanta. The off-shell quanta do not
satisfy the equations of motion and hence represent a model of quantum fluctuations.
3. Application to the Yukawa interaction and QED up to 1-loop [12, 13]:
The early successes at tree level were discussed, followed by the radiative corrections;
The IR divergences highlighted the care needed in applying the formalism to what is
actually measured;
The UV divergences highlighted the unacceptable, unbounded dominance of the quantum fluctuations and questioned the purely theoretical parameters in the Lagrangian
(‘bare’); Some renormalization procedure needs to be adopted to identify the physically
measured parameters;
The generic problem is to be handled recursively using counter terms, which were illustrated at the 2-loop level; The renormalization process also led to the renormalization
group equation.
4. The path integral formulation [12, 17, 18]:
The classical action has a direct and central role in computing quantum transition
amplitudes; The path integral provides generating functionals for various n-point functions and is a powerful tool both for formal studies and for richer non-abelian
gauge theories with/without spontaneous symmetry breaking (SSB);
When extended to the more general theories - non-abelian gauge theories with/without
SSB - it follows that one needs to identify the ‘correct fields’ and their interactions
before computing the various n-point functions; Qualitative physics guesses are
invoked to propose the choice of the unique vacuum. In the gauge theory context, for
instance, gauge invariance under large gauge transformations (non-trivial at infinity)
reveals the θ−vacua and some choice is made by nature.
[1] Steven Weinberg, The Quantum Theory of Fields volume I, Cambridge University Press, 1995.
[2] C. Wetterich, Massless Spinors in More Than Four Dimensions, Nucl. Phys. B 211, 177
(1983).
[3] G Date, Lecture on Constrained Systems, arXiv:1010.2062.
[4] Hong-Hao Zhang, Kai-Xi Feng, Si-Wei Qiu, An Zhao, Xue-Song Li, A Note on analytic formulas of Feynman propagators in position space, Chin.Phys. C34, 1576 (2010), arXiv:0811.1261.
[5] W. Greiner and J. Reinhardt, Quantum Electrodynamics, 4th edition, Springer, 2009, page
68.
[6] D F Walls and G J Milburn, Quantum Optics, second edition, Springer-Verlag, 2008.
[7] R J Glauber, Quantum Theory of Optical Coherence: Selected Papers and Lectures, Wiley-VCH Verlag GmbH & Co. KGaA, 2007.
[8] R. Jagannathan and S. A. Khan, Quantum Mechanics of Charged Particle Beam Optics, CRC
Press, 2019.
[9] M Reed and B Simon, Methods of Modern Mathematical Physics, Scattering Theory, Vol III,
Academic Press, 1979.
[10] R G Newton, Scattering Theory of Waves and Particles, chapters 6 and 7, Springer-Verlag,
NY, 1982.
[11] J D Bjorken and S D Drell, Relativistic Quantum Fields, chapters 16 and 17, McGraw Hill,
1965.
[12] M Srednicki, Quantum Field Theory, Cambridge University Press, 2007.
[13] M Peskin and D V Schroeder, Introduction to Quantum Field Theory, Addison-Wesley, 1995.
[14] George Sterman, An Introduction to Quantum Field Theory, Cambridge University Press,
1993.
[15] S. Elaydi, An Introduction to Difference Equations, Springer, 2005.
[16] G W Gibbons and S W Hawking, Action integrals and partition functions in quantum gravity,
Phys. Rev. D, 15, 2752 (1977).
[17] E S Abers and B W Lee, Gauge Theories, Physics Reports, 9, 1, (1973).
[18] D Gross, Method in Field Theory, Les Houches, Ed. R Balian and J Zinn-Justin, 1975.
[19] S Coleman, Quantum Field Theory: Lectures of Sidney Coleman, Chapter 50, World Scientific, 2019.
[20] L F Abbott, Introduction to the background field method, Acta Physica Polonica, B13, 33
(1982).