AQT Lectures


Advanced Quantum Theory Lecture Notes: Michaelmas 2023

Nabil Iqbal

December 11, 2023

Contents

1 Introduction
1.1 Why quantum field theory?
1.2 Conventions

2 Special relativity and Lorentz invariance
2.1 2d rotational invariance
2.2 Basic kinematics of Lorentz invariance
2.3 Group theory of Lorentz group
2.4 Translations and the Poincaré group
2.5 The twin paradox
2.6 Realizing transformations on fields
2.6.1 Scalar fields
2.6.2 Vector fields

3 Lagrangian methods and classical field theory
3.1 Lagrangians for classical mechanics
3.2 Lagrangian methods for classical field theory
3.2.1 Action of a real scalar field
3.3 Noether’s theorem
3.3.1 Example: a complex scalar field
3.3.2 Proof of Noether’s theorem
3.3.3 Alternative way to find the current
3.4 Hamiltonian formalism in classical mechanics

4 Quantum field theory in canonical formalism
4.1 Quantizing the simple harmonic oscillator
4.2 Quantizing the free complex scalar field
4.2.1 Operator-valued Fourier expansions
4.2.2 Algebra of creation/annihilation operators
4.2.3 Hamiltonian and the energy of the vacuum
4.2.4 Single particle states
4.2.5 The Fock space
4.3 Propagators and causality
4.3.1 Lorentz-invariant integration measures
4.3.2 Commutators and the light-cone
4.3.3 Particle propagation
4.4 Feynman propagator

5 Interacting quantum field theories
5.1 The interaction picture
5.2 Wick’s theorem
5.3 Feynman diagrams and Feynman rules

6 Scattering
6.1 LSZ reduction formula and the S-matrix
6.2 Scattering: $\lambda\phi^4$
6.3 Scattering: $g\phi^2\sigma$ theory
6.4 Wrapping up
6.5 Loops and infinity and renormalization and all of that

These are lecture notes for Advanced Quantum Theory (i.e. introductory quantum field theory) in Michaelmas
2023 at Durham University. I inherited this course from Charlotte Sleight, and these lectures largely follow
the structure of her course. They also owe a lot to previous versions of the course by Marija Zamaklar and
Kasper Peeters, the lecture notes by David Tong and the standard expositions in the textbooks by Peskin
and Schroeder and Srednicki. Please send any errors to [email protected].


Figure 1.1: A square showing different kinematic regimes of nature

1 Introduction

1.1 Why quantum field theory?

This is a course on quantum field theory. Quantum field theory is an amazing and wonderful subject. It
is the mathematical framework behind our most precise understanding of the natural world, and it is really
the language in which essentially all modern research in quantum physics is done. I am happy to say that
by the end of this year you will not only understand the principles behind it, but you will also be able to use
it to calculate observable things.
I should also say at the outset that quantum field theory can be hard. It is a bit of a conceptual leap, and
the calculations that are involved can be difficult. There are also many different ways to say the same thing,
as you will see. This is part of the fun, but as always the things that are the most fun invariably involve a
bit of a challenge.
So before the inevitable blizzard of indices and infinite-dimensional integrals that will follow, let’s first
understand: why do we need quantum field theory?
Let’s start by drawing a map of physical theories, as in Figure 1.1.
The goal is to understand things that are both small, and that move fast. Some examples are:

1. ...light. You may have learned in an earlier quantum mechanics class that there are “light particles”
called photons. (If you haven’t learned this, don’t worry; it’s hard to explain it properly without
quantum field theory). You might expect that light particles are “small”, and they move at the speed
of light. Thus a careful quantum mechanical treatment of light will invariably require quantum field
theory.
2. Particle colliders, e.g. the LHC at CERN – these are devices that make individual particles go very,
very fast and then smash them together. To figure out what happens in this process requires quantum
field theory. (This is maybe a good time to stress that in reality this “smashing” operation is a kind
of microscope – we do this so that we can see what is happening at small scales. We will learn more
about this later).

3. Very hot things. If you heat something up, the particles in it move around more quickly. If you
heat it up enough, you might worry that they move around near the speed of light. You then need
QFT to understand this. This sort of thing does not happen very often, but it did happen in the early
universe – the “Big Bang” was basically an explosion of arbitrarily high temperature, and to describe
what happens afterwards requires the formalism of quantum field theory.

So the point is that quantum field theory is basically what you get if you combine quantum mechanics and
special relativity. It may surprise you that this needs a whole course; why can’t we just make quantum
mechanics relativistically invariant and move on with our life? (In other words, why isn’t this subject called
“relativistic quantum mechanics” instead of “quantum field theory”?).
It turns out a great deal of new physics enters. Let’s heuristically try and understand why. This is really for
inspiration; later on we will do all of this precisely. Consider a free particle with mass $m$. In non-relativistic
classical mechanics, the energy for the particle would be

\[ E_{\text{non-rel}} = \frac{\vec{p}^{\,2}}{2m} \tag{1.1} \]
where $\vec{p}$ is the 3-vector momentum of the particle. This is not the right equation for a relativistic particle,
however. The correct equation there turns out to be:
\[ E_{\text{rel}} = \sqrt{\vec{p}^{\,2} c^2 + m^2 c^4} \approx mc^2 + \frac{\vec{p}^{\,2}}{2m} + O(p^4) \tag{1.2} \]
You may recall this equation from an earlier class on relativity; if not, we will recap quite a lot of it soon so
don’t worry.
In the last step I performed a Taylor expansion in powers of $p^2$. Note that if we set $p = 0$ we get the world’s
most famous equation $E = mc^2$: this says that in relativity a single particle has an inextricable amount of
energy $mc^2$ called the rest energy (the thing we normally call “kinetic energy” is basically a correction on top
of that). The reason why this formula is so profound is that it tells you that particles always have energy,
even when they are not moving$^1$. However, reversing the logic you might imagine that if we supply enough
energy, we might be able to create a particle! In particular if we supply energy $E = 2mc^2$ then we can create
a particle/anti-particle pair.
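To make the Taylor expansion in (1.2) explicit, here is a short worked version. It is only a sketch: it assumes $|\vec{p}| \ll mc$ and uses the binomial series $\sqrt{1+\epsilon} = 1 + \epsilon/2 - \epsilon^2/8 + \dots$. The explicit coefficient of the $p^4$ term is not quoted in the text above and is included purely for illustration.

```latex
\begin{align*}
E_{\text{rel}}
  &= \sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}
   = mc^2 \sqrt{1 + \frac{\vec{p}^{\,2}}{m^2 c^2}} \\
  &= mc^2 \left( 1 + \frac{\vec{p}^{\,2}}{2 m^2 c^2}
       - \frac{(\vec{p}^{\,2})^2}{8 m^4 c^4} + \dots \right) \\
  &= mc^2 + \frac{\vec{p}^{\,2}}{2m}
       - \frac{(\vec{p}^{\,2})^2}{8 m^3 c^2} + \dots
\end{align*}
```

The first term is the rest energy, the second is the familiar non-relativistic kinetic energy of (1.1), and the remaining terms are suppressed by powers of $\vec{p}^{\,2}/(m^2c^2)$.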
So far I have only discussed relativity; now let us include quantum mechanics. In quantum mechanics, we
know that energy fluctuates. More precisely, what this means is that we will often consider a state that is
not an energy eigenstate, which means that it will have some spread of energies ∆E. If we ever end up with

\[ \Delta E \approx 2mc^2 \tag{1.3} \]

then it suggests that the fluctuations of energy will be enough to create a particle/anti-particle pair. In
other words, there exist states where the particle number will change. But this is not allowed in traditional
quantum mechanics – you always write down the Schrödinger equation for 1 particle, or 2 particles, or some
fixed number $N$, which you fix to start with. Relativity tells us this is not enough. We need a new formalism
entirely, one where the state space allows the particle number to change. That is the formalism of quantum
field theory.
1 Of course the liberation of this energy is how nuclear power works, and the conversion factor of $c^2$ is one of the reasons it is so powerful.

Let us push on this a bit more and figure out when we expect this new formalism to be experimentally
relevant. Imagine putting the particle in a box of size L. Recall the Heisenberg uncertainty relation
\[ \Delta q \, \Delta p \geq \frac{\hbar}{2} \tag{1.4} \]
What does this mean for our particle in the box? We see that because the position is known to within an
accuracy of at least L, there is an uncertainty in the momentum of at least
\[ \Delta p \geq \frac{\hbar}{2L} \tag{1.5} \]
Now I want to use this to figure out the typical uncertainty in $\Delta E$.
\[ E_{\text{rel}} = \sqrt{\vec{p}^{\,2} c^2 + m^2 c^4} \approx |\vec{p}| c \tag{1.6} \]

where the last equality is true in the limit $|\vec{p}| \to \infty$. If we use the last expression to figure out $\Delta E$ then we
have
\[ \Delta E \sim c \, \Delta p \sim \frac{c\hbar}{2L} \tag{1.7} \]
(There is some mild sleaziness in what I just did. Self-consistency means that this is actually the correct
expression only as we take L very small, so let’s do that from now on).
Now from above we know that something interesting will happen when we have $\Delta E \sim 2mc^2$, which means
that
\[ \Delta E \sim 2mc^2 \quad \to \quad L \sim \frac{\hbar}{mc} \tag{1.8} \]
This is a very interesting formula. Notice that it combines $\hbar$ (for quantum) and $c$ (for relativity) and it
says that if you try to confine a particle to a box of size smaller than $\hbar/(mc)$ then the quantum fluctuations
of energy will threaten to create particles from the vacuum. For a given particle this number is called the
(reduced) Compton wavelength:
\[ L_{\text{Compton}} \equiv \frac{\hbar}{mc} \tag{1.9} \]
For an electron this wavelength is $4 \times 10^{-13}$ m or so.
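As a quick sanity check of (1.9), the following sketch evaluates $\hbar/(mc)$ for the electron. The numerical constants are the CODATA 2018 values in SI units, and the function name `reduced_compton_wavelength` is just an illustrative choice; none of this appears in the text above.

```python
# CODATA 2018 values in SI units (assumed inputs, not taken from the notes).
HBAR = 1.054571817e-34          # reduced Planck constant, J s
C = 299792458.0                 # speed of light, m / s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg


def reduced_compton_wavelength(mass_kg):
    """Return hbar / (m c) in metres, as in equation (1.9)."""
    return HBAR / (mass_kg * C)


electron_length = reduced_compton_wavelength(M_ELECTRON)
print(f"Reduced Compton wavelength of the electron: {electron_length:.3e} m")
```

The result is a few times $10^{-13}$ m, far smaller than the size of an atom, which is roughly why ordinary atomic physics can get away with a fixed particle number.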
Now I haven’t proven to you that relativistic quantum mechanics doesn’t work; it can be an instructive
exercise to just ignore what I told you above and try and build a relativistic version of quantum mechanics
and check that things go wrong. You get confusing things like negative probabilities, negative energies, etc.
I may discuss some of these things as we go on; it requires a bit of machinery of Lorentz-invariance.
But first let me tell you where we are going. In traditional classical mechanics the degrees of freedom are the
numbers $q_a(t)$, $p_a(t)$. Here $a$ runs over the degrees of freedom; for example for a particle in three dimensions
we would have $a \in \{1, 2, 3\}$.
In quantum mechanics we learned how to canonically quantize these. The process of canonical quantization
is (for us) an algorithm: we stick in a classical theory, we do some manipulations, and then we end up with
a quantum theory, where we obtain operators $\hat{q}_a$, $\hat{p}_a$ that obey a Heisenberg uncertainty relation like (1.4).
Now let’s think about classical field theory. In classical field theory, the basic degree of freedom – i.e. the
analogue of $q(t)$ – is a field $\phi(\vec{x}, t)$. In this example, I am talking about a real scalar field, which is a function
from space and time to the real numbers:
\[ \phi : \mathbb{R}^{3,1} \to \mathbb{R} \tag{1.10} \]
We can have other kinds of fields; for example a complex scalar field, or a vector field such as $\vec{E}(\vec{x}, t)$, the
electric field from elementary electrodynamics. Note that this object $\phi(\vec{x}, t)$ takes on a different value at each
point in space, so it is a lot of degrees of freedom – infinitely many, in fact.

In classical field theory $\phi(\vec{x}, t)$ will have a conjugate momentum which I will call $\pi(\vec{x}, t)$. We will discuss
the equations that these two things obey. Now to get quantum field theory from this, we again canonically
quantize this theory, and we end up with a quantum system with operators $\hat{\phi}(\vec{x})$ and $\hat{\pi}(\vec{x})$. This process – i.e.
quantizing fields rather than simple degrees of freedom like $q_a(t)$ – is usually called second quantization. In
my opinion (...not shared by everyone) calling this “second” quantization is somewhat confusing terminology
and is largely for historical reasons, so I won’t explain why it is called that till later.$^2$
Note that we have infinitely many operators – one quantum operator at each point in space, and thinking
about how to organize this information is one of the main goals of this course.
Now let me tell you what will happen once we understand the physics above. We will understand the following
facts about the universe:

1. There are different, yet totally indistinguishable copies of elementary particles like the electron.
2. There is a relationship between the statistics of particles (i.e. the behavior of their exchange) and their
spin (i.e. what happens if you rotate them).

3. Anti-particles exist.
4. Particles can be created and destroyed. (We already talked about this).
5. Finally, and perhaps most profoundly: the things that we call forces can be imagined as being caused
by the exchange of particles.

1.2 Conventions

We will use the following conventions: from now on we will mostly set $\hbar = c = 1$. Our metric signature will
be $(-, +, +, +)$; note that many (most?) quantum field theory textbooks use a different convention, so stay
vigilant for sign differences.

2 Special relativity and Lorentz invariance

We will now review Lorentz invariance. I should note that you are aware of most of this already from your
Geometry of Mathematical Physics course last term, so this will have something of the character of a review,
but hopefully one that is helpful. However, before doing that, let’s discuss something much simpler: we will
warm up very quickly with rotational invariance.

2.1 2d rotational invariance

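Consider the plane $\mathbb{R}^2$ with coordinates $(x, y)$. We can imagine a different coordinate system $(x', y')$ which
is related to the old one by the following formula
\[ \vec{x}' = \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \equiv R(\theta)\vec{x} \tag{2.1} \]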

Here the matrix R(θ) is an element of the two dimensional rotation group SO(2). Let me now say some
extremely obvious things:
2 Actually I never explained why it is called this. Ah well.

1. The individual components $x'$ and $y'$ are not independent under the rotation.
2. The length of the vector however is invariant under the rotation: $(\vec{x}')^2 \equiv \vec{x}' \cdot \vec{x}' = \vec{x} \cdot \vec{x}$. This is
geometrically obvious, and we can also calculate it immediately from the form of the transformation.
Let me just do this once:

\[ x'^2 + y'^2 = (x \cos\theta + y \sin\theta)^2 + (y \cos\theta - x \sin\theta)^2 = x^2 + y^2 \tag{2.2} \]

3. More generally, the dot product of any two vectors is invariant. Consider two vectors $\vec{v} = (v^x, v^y)$ and
$\vec{w} = (w^x, w^y)$; the dot product of the two vectors, $\vec{v} \cdot \vec{w} = \vec{v}' \cdot \vec{w}'$, is also invariant. Let me note that I
can write this dot product as
\[ \vec{v} \cdot \vec{w} = \vec{v}^{\,T} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \vec{w} . \tag{2.3} \]
Now the condition that the dot product is invariant can be written as
\[ \vec{v}'^{\,T} \vec{w}' = \vec{v}^{\,T} R^T R \, \vec{w} \tag{2.4} \]

In other words $R^T R = 1$. You can easily check this is true for rotation matrices of the form (2.1), and
in a more general number of spatial dimensions this is the definition of the orthogonal group.
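The three statements in this list are easy to check numerically. The sketch below applies the rotation (2.1) to two arbitrarily chosen vectors and compares the dot product before and after; the helper names `rotate` and `dot` and the sample numbers are illustrative choices, not anything fixed by the text.

```python
import math


def rotate(theta, vec):
    """Apply the matrix R(theta) of equation (2.1) to a 2-vector (x, y)."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * y, -s * x + c * y)


def dot(v, w):
    """Ordinary Euclidean dot product in the plane."""
    return v[0] * w[0] + v[1] * w[1]


theta = 0.7                      # arbitrary rotation angle in radians
v, w = (1.0, 2.0), (-3.0, 0.5)   # arbitrary sample vectors

v_rot, w_rot = rotate(theta, v), rotate(theta, w)
dot_before = dot(v, w)
dot_after = dot(v_rot, w_rot)

print("components change:", v, "->", v_rot)
print("dot product before and after:", dot_before, dot_after)
```

The individual components of `v_rot` differ from those of `v`, while the two dot products agree up to floating-point rounding.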

This is all super obvious. Now let me introduce for you the distance paradox! The distance paradox is the
fact that if you have a triangle ABC, the distance AB is not equal to the distance BC plus CA. You might
say that this is not a paradox, and that would be a perfectly fine viewpoint that you should keep in mind.
We are also used to the fact that the laws of physics are rotationally invariant; what this means is that if
you use $(x', y')$ to describe your system, you should get all the same answers as if you used $(x, y)$.$^3$

2.2 Basic kinematics of Lorentz invariance

We now move to special relativity. Einstein defined it by the following two postulates:

1. The principle of relativity: the laws of nature are the same in all inertial frames.

2. The speed of light is the same in all inertial frames.

I haven’t defined an inertial frame yet. Informally, two inertial frames are related in the following way: if I
am at rest in an inertial frame, and you are moving past me at a constant speed, then you are also in an
inertial frame. I’ll give a more formal definition slightly below.
Now let us put some mathematics on these ideas. We have space and time, which I will denote as $\mathbb{R}^{1,3}$; the
“1” here separates the time from the space.
We label a point in spacetime by $x^\mu = (t, x^1, x^2, x^3) = (x^0, x^1, x^2, x^3)$ (I will sometimes call the time
component $x^0$). Note that I have put the $\mu$ index “up”; this is called the contravariant 4-vector$^4$ and this notation
carries some information and is important in what follows.
3 The laws of physics are obviously rotationally invariant. Yet nevertheless things always fall down, and not e.g. up. How
would you convince a caveman (or, apparently, a member of the Flat Earth Society, which to my continual amazement seems to
be a real thing) that the laws of physics are rotationally invariant in the face of such obvious experimental evidence otherwise?
Can you think of other areas of life that might appear to be less symmetric than they actually are?
4 I will never use that name again and just call it “up”.

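Now consider the following transformation to a new coordinate system $x'^\mu$:
\[ x'^\mu = \begin{pmatrix} t' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma\left(t - \frac{v}{c^2} x^1\right) \\ \gamma(x^1 - vt) \\ x^2 \\ x^3 \end{pmatrix}, \qquad \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{2.5} \]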

This is called a Lorentz boost. It mixes together space and time, and you should think about it as a fancy
version of the rotations we discussed previously.
Let’s think about what it means for a second. Consider a particle sitting at rest at the origin of the unprimed
coordinate system, which means that it extends along the following line:
\[ \begin{pmatrix} t \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}_{\text{particle}} = \begin{pmatrix} \tau \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{2.6} \]

Here $\tau$ is a parameter which runs along the worldline of the particle. What is happening in the primed
coordinate system? There we have
\[ \begin{pmatrix} t' \\ x'^1 \end{pmatrix}_{\text{particle}} = \begin{pmatrix} \gamma\tau \\ -v\gamma\tau \end{pmatrix} \tag{2.7} \]

(I am ignoring $x^{2,3}$ as they don’t do anything). In other words, in the primed coordinate system the particle
is moving; it does not have a constant $x'^1$. Thus the primed coordinates correspond to a frame that is moving
with respect to the original frame. We will see what they are for shortly.
Note that if we take the limit $c \to \infty$ – this is the non-relativistic limit – this transformation becomes:
\[ x'^\mu = \begin{pmatrix} t' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} t \\ x^1 - vt \\ x^2 \\ x^3 \end{pmatrix} \tag{2.8} \]

which is called a “Galilean boost”, familiar from Newtonian mechanics. From now on I am going to set c = 1,
and it will never appear again in any of our formulas. Exercise: how then do we recover the non-relativistic
limit?
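The two observations above, namely the moving worldline (2.7) and the Galilean limit (2.8), can be illustrated with a few lines of code. This is only a sketch: the function `lorentz_boost` keeps the explicit factors of $c$ from (2.5), ignores the $x^{2,3}$ directions, and all the sample numbers are arbitrary.

```python
import math


def lorentz_boost(t, x, v, c=1.0):
    """Return (t', x') from equation (2.5), keeping explicit factors of c."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return gamma * (t - v * x / c**2), gamma * (x - v * t)


# A particle at rest at the origin of the unprimed frame: (t, x) = (tau, 0).
v = 0.6
tau = 2.0
t_prime, x_prime = lorentz_boost(tau, 0.0, v)
velocity_in_primed_frame = x_prime / t_prime   # slope of the worldline (2.7)

# The same formula with c made very large reduces to the Galilean boost (2.8).
t_gal, x_gal = lorentz_boost(1.0, 5.0, v=3.0, c=1.0e9)

print("velocity seen in the primed frame:", velocity_in_primed_frame)
print("large-c boost of (t, x) = (1, 5) with v = 3:", t_gal, x_gal)
```

In the primed frame the particle moves with velocity $-v$, and for very large $c$ the boosted point is indistinguishable from $(t, x - vt)$.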
Now, the point is the following: recall that in the section on rotational invariance, we realized that some
things (lengths of vectors, dot products, etc.) were invariant under rotations, and some things weren’t. We
now want to build the same formalism for Lorentz boosts: in other words, what is invariant under Lorentz
boosts?
I will first define the spacetime Minkowski metric $\eta_{\mu\nu}$ with “down” indices, which is defined to be
\[ \eta_{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.9} \]

The point of the spacetime metric is to allow us to “lower” indices in the following way:
\[ x_\mu \equiv (x_0, x_1, x_2, x_3) = \sum_\nu \eta_{\mu\nu} x^\nu = (-x^0, x^1, x^2, x^3) \tag{2.10} \]

This is of course just matrix multiplication with the $\eta$ matrix. This “lower” vector is called the covariant
vector. From now on, we will use the Einstein summation convention and not write down the sums explicitly;
whenever you see a repeated index you should imagine a sum, i.e.
\[ x_\mu = \sum_\nu \eta_{\mu\nu} x^\nu = \eta_{\mu\nu} x^\nu \tag{2.11} \]
Now, let me define for you another object: $\eta^{\mu\nu}$ with only up indices. This is defined as the matrix inverse of
the metric tensor with only down indices. Component-wise you can just take the inverse and find
\[ \eta^{\mu\nu} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.12} \]

(i.e. it’s numerically the same). But because it is the inverse, we can now invert (2.11) to find:

\[ x^\mu = \eta^{\mu\nu} x_\nu \tag{2.13} \]

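So $\eta^{\mu\nu}$ raises indices. By the way, note that the statement that $\eta^{\mu\nu}$ is the inverse of $\eta_{\mu\nu}$ can be written as

\[ \eta^{\mu\nu} \eta_{\nu\rho} = \delta^\mu_\rho \tag{2.14} \]

where $\delta^\mu_\rho$ is the identity matrix:

\[ \delta^\mu_\rho = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.15} \]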
Now the point of all of this fancy machinery is to allow us to construct an inner product between two vectors
$x^\mu$ and $y^\mu$ as follows: the dot product is defined to be
\[ x \cdot y = x^\mu y^\nu \eta_{\mu\nu} = x^T \eta y = (x^0, \vec{x}) \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} y^0 \\ \vec{y} \end{pmatrix} = -x^0 y^0 + x^1 y^1 + x^2 y^2 + x^3 y^3 \tag{2.16} \]

This is very similar to the familiar dot product that you know and love from ordinary $\mathbb{R}^3$, except for the
extra minus sign in front of the time component of the vector. This minus sign can be thought of as the
ultimate origin of the (...observationally quite obvious...) fact that time is different from space. You should
now convince yourselves that all of the following ways of computing this dot product give you the same
answer:
\[ x \cdot y = x^\mu y_\mu = x_\mu y^\mu = x_\mu y_\nu \eta^{\mu\nu} \tag{2.17} \]
In other words, you can always contract an upper and a lower index in this way. This is the nice thing about
this notation; it basically makes it impossible to mess up the dot product, unless you write something like
this:
\[ x^\mu y^\mu \tag{2.18} \]
with both indices up. You should not do this (and indeed even writing this leaves me with a terrible feeling
of discomfort).
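If it helps to see the index gymnastics in concrete numbers, here is a small sketch that lowers an index as in (2.10) and evaluates the dot product in the equivalent ways listed in (2.17). The sample vectors and the helper names `lower` and `minkowski_dot` are arbitrary illustrative choices.

```python
# Minkowski metric with signature (-, +, +, +), as in equation (2.9).
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def lower(x_up):
    """Lower an index: x_mu = eta_{mu nu} x^nu, equation (2.10)."""
    return [sum(ETA[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]


def minkowski_dot(x_up, y_up):
    """x . y = x^mu y^nu eta_{mu nu}, equation (2.16)."""
    return sum(
        x_up[mu] * ETA[mu][nu] * y_up[nu] for mu in range(4) for nu in range(4)
    )


x_up = [2, 1, 0, 3]     # arbitrary sample components x^mu
y_up = [1, 4, -2, 1]    # arbitrary sample components y^mu
x_down, y_down = lower(x_up), lower(y_up)

with_metric = minkowski_dot(x_up, y_up)                     # x^mu y^nu eta_{mu nu}
up_down = sum(x_up[mu] * y_down[mu] for mu in range(4))     # x^mu y_mu
down_up = sum(x_down[mu] * y_up[mu] for mu in range(4))     # x_mu y^mu
both_up = sum(x_up[mu] * y_up[mu] for mu in range(4))       # the forbidden (2.18)

print(with_metric, up_down, down_up, both_up)
```

The first three numbers agree, while contracting two upper indices as in (2.18) gives a different answer that no Lorentz transformation will respect.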
Now let us ask the following question, which should now seem quite natural given our discussion of two
dimensional rotational invariance: what is the most general transformation of the coordinates that leaves
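this dot product between two vectors invariant? In other words, what is the most general $4 \times 4$ matrix
$\Lambda = \Lambda^\mu{}_\nu$ such that I can write:

\[ x' \cdot y' = (x')^\mu (y')_\mu = x^\mu y_\mu = x \cdot y, \qquad (x')^\mu = \Lambda^\mu{}_\nu x^\nu, \qquad y'^\mu = \Lambda^\mu{}_\nu y^\nu \tag{2.19} \]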

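This is the analogue of (2.4). We see that the equation to be satisfied is:

\[ x^T \Lambda^T \eta \Lambda y = x^T \eta y \quad \to \quad \Lambda^T \eta \Lambda = \eta \tag{2.20} \]

or, in indices, the same equation becomes:

\[ \Lambda^\mu{}_\nu \, \eta_{\mu\rho} \, \Lambda^\rho{}_\sigma = \eta_{\nu\sigma} \tag{2.21} \]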

A matrix satisfying this property is an element of a group called O(1, 3), or the Lorentz group. You should
imagine that this matrix “preserves the Minkowski metric”.
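It is helpful to convince ourselves that the boost introduced back in (2.5) does satisfy this relation. Suppressing
the $x^{2,3}$ directions, from there we have
\[ \Lambda^\mu{}_\nu = \gamma \begin{pmatrix} 1 & -v \\ -v & 1 \end{pmatrix} \tag{2.22} \]
and now just explicitly calculate that
\[ \gamma^2 \begin{pmatrix} 1 & -v \\ -v & 1 \end{pmatrix}^T \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & -v \\ -v & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \tag{2.23} \]
as claimed.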
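The calculation (2.23) can also be checked numerically for a specific velocity. The sketch below builds the $2 \times 2$ boost (2.22) with $c = 1$ and evaluates $\Lambda^T \eta \Lambda$ using small hand-written matrix helpers; the helper names and the choice $v = 0.8$ are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    """Transpose a matrix stored as a nested list."""
    return [list(row) for row in zip(*a)]


def boost_matrix(v):
    """The 2x2 boost of equation (2.22), with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v], [-gamma * v, gamma]]


ETA_2D = [[-1.0, 0.0], [0.0, 1.0]]   # metric restricted to the (t, x^1) plane

boost = boost_matrix(0.8)
preserved = matmul(matmul(transpose(boost), ETA_2D), boost)

print(preserved)
```

Up to rounding, `preserved` reproduces `ETA_2D`, which is the statement $\Lambda^T \eta \Lambda = \eta$ of (2.20).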

2.3 Group theory of Lorentz group

That was an example. Let us now understand the full set of Λ for which this is true. First, let’s take the
determinant of the equation (2.20). We then find that
\[ (\det\Lambda)^2 \det\eta = \det\eta \quad \to \quad \det\Lambda = \pm 1 \tag{2.24} \]
Thus the set of $\Lambda$ fall into two disjoint sets, depending on the sign of their determinant. Let’s first consider
the set with positive determinant. These are the $\Lambda$’s that are continuously connected to the identity. As you
might remember from GMP, it is then a useful thing to consider the Lie algebra: to remind you how this
works, consider a Lorentz transformation of the following form:
\[ \Lambda = \exp(\mathcal{M}) \tag{2.25} \]
where $\mathcal{M}$ is a $4 \times 4$ matrix which we can write in indices as $\mathcal{M}^\mu{}_\nu$. We now want to understand the space of
possible $\mathcal{M}$’s, which are the generators of the group. To understand this, let us start by assuming that $\mathcal{M}$ is
small; in that case we can Taylor expand in powers of it, to find:
\[ \Lambda = 1 + \mathcal{M} + O(\mathcal{M}^2) \tag{2.26} \]
Now let us plug this into (2.21). We find that
\[ (\delta^\mu_\nu + \mathcal{M}^\mu{}_\nu) \, \eta_{\mu\rho} \, (\delta^\rho_\sigma + \mathcal{M}^\rho{}_\sigma) = \eta_{\nu\sigma} \tag{2.27} \]
which simplifies, to first order in $\mathcal{M}$, to
\[ \eta_{\nu\sigma} + \mathcal{M}^\mu{}_\nu \eta_{\mu\sigma} + \eta_{\nu\rho} \mathcal{M}^\rho{}_\sigma = \eta_{\nu\sigma} \tag{2.28} \]
and now finally we can write this as
\[ \mathcal{M}_{\sigma\nu} + \mathcal{M}_{\nu\sigma} = 0 \tag{2.29} \]
in other words, $\mathcal{M}$ (with both indices down) is antisymmetric. This turns out to be the only constraint on
the $\mathcal{M}$’s. There are 6 linearly independent $4 \times 4$ antisymmetric matrices. Let’s call a basis for these matrices
$M^{(a)}$ where $(a)$ runs from 1 to 6. Now consider the following objects
\[ (M^{\rho\sigma})^\mu{}_\nu = \eta^{\sigma\mu} \delta^\rho_\nu - \eta^{\rho\mu} \delta^\sigma_\nu \tag{2.30} \]

Now the notation is actually doing a lot of work here, so let me unpack it very carefully.
Here $\mu$ and $\nu$ are spacetime (or matrix) indices; thus for each value of $\rho$ and $\sigma$ we have a $4 \times 4$ matrix.
\[ M^{\rho\sigma} = \begin{pmatrix} a & b & c & d \\ & \cdots & & \end{pmatrix} \tag{2.31} \]
However the indices $\rho, \sigma$ each run from 0 to 3, and from the definition above you can see explicitly that
$M^{\rho\sigma} = -M^{\sigma\rho}$. Thus the set of antisymmetric combinations of $\rho$ and $\sigma$ label the 6 possible matrices that we
can have.
Now to make this seem a little less crazy let me write some examples out carefully. First let’s do some $M$’s
which have a single 0 index. Expanding out the definition (2.30), we have
\[ (M^{01})^\mu{}_\nu = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{2.32} \]
Note that the notation $M^{01}$ makes it clear which two dimensions are being mixed up. It won’t surprise you that
we have
\[ (M^{02})^\mu{}_\nu = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{2.33} \]
and so on. You can do the same for the other components (there are some interesting signs here and there).
We are now in a position to write down the most general Lorentz transformation, which is the exponential
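of an arbitrary sum over all of these possible matrices. This takes the form
\[ \mathcal{M} = \omega_{\rho\sigma} M^{\rho\sigma}, \qquad \Lambda(\omega_{\rho\sigma}) = \exp(\omega_{\rho\sigma} M^{\rho\sigma}) \tag{2.34} \]
where $\omega_{\rho\sigma}$ is an antisymmetric tensor $\omega_{\rho\sigma} = -\omega_{\sigma\rho}$ that allows us to choose an arbitrary linear combination of
the $M$’s.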
Now the Lorentz transformations form a group, called the Lorentz group. As you recall from GMP, the
commutators of the generators play an important role. You can explicitly check from the definition (2.30)
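that the commutators of the $M$’s are:
\[ [M^{\mu\nu}, M^{\rho\sigma}] = -\eta^{\nu\rho} M^{\mu\sigma} + \eta^{\mu\rho} M^{\nu\sigma} + \eta^{\nu\sigma} M^{\mu\rho} - \eta^{\mu\sigma} M^{\nu\rho} \tag{2.35} \]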
This defines the Lorentz algebra.
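Checking (2.35) by hand is a good exercise in index manipulation, but it is also easy to brute-force. The following sketch builds the matrices $(M^{\rho\sigma})^\mu{}_\nu$ directly from the definition (2.30) and compares both sides of (2.35) for every choice of indices. The helper names (`generator`, `commutator`, `algebra_rhs`) are illustrative; rows are labelled by $\mu$ and columns by $\nu$.

```python
# eta^{mu nu}; numerically identical to eta_{mu nu}, see (2.12).
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
IDX = range(4)


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def generator(rho, sigma):
    """(M^{rho sigma})^mu_nu from (2.30); row index mu, column index nu."""
    return [
        [ETA[sigma][mu] * delta(rho, nu) - ETA[rho][mu] * delta(sigma, nu) for nu in IDX]
        for mu in IDX
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in IDX) for j in IDX] for i in IDX]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in IDX] for i in IDX]


def algebra_rhs(mu, nu, rho, sigma):
    """Right-hand side of (2.35) as a 4x4 matrix."""
    terms = [
        (-ETA[nu][rho], generator(mu, sigma)),
        (ETA[mu][rho], generator(nu, sigma)),
        (ETA[nu][sigma], generator(mu, rho)),
        (-ETA[mu][sigma], generator(nu, rho)),
    ]
    return [[sum(coef * mat[i][j] for coef, mat in terms) for j in IDX] for i in IDX]


algebra_holds = all(
    commutator(generator(m, n), generator(r, s)) == algebra_rhs(m, n, r, s)
    for m in IDX for n in IDX for r in IDX for s in IDX
)
print("Lorentz algebra (2.35) satisfied for all index choices:", algebra_holds)
```

Because all entries are integers the comparison is exact, so no numerical tolerance is needed.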
Now let’s think physically about all of this. The transformations $M^{0i}$ correspond to boosts in the $x^i$ direction:
for example you can work out
\[ \Lambda(\omega_{01}) = \exp(\omega_{01} M^{01}) = \begin{pmatrix} \cosh(\omega_{01}) & \sinh(\omega_{01}) & 0 & 0 \\ \sinh(\omega_{01}) & \cosh(\omega_{01}) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.36} \]
This is in fact exactly the same as (2.22) if we map the boost velocity $v$ to $\omega_{01}$ by $\sinh\omega_{01} = -\frac{v}{\sqrt{1 - v^2}}$. $\omega_{01}$
is often called the rapidity.
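One way to convince yourself of (2.36) without doing the exponential analytically is to sum the power series numerically. The sketch below does this for the $(t, x^1)$ block of $\omega_{01} M^{01}$ and compares the result with both the hyperbolic form (2.36) and the velocity form (2.22). The truncated-series function `matrix_exp` is a simple illustrative implementation, not a robust general-purpose matrix exponential, and $\omega_{01} = 0.5$ is an arbitrary choice.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matrix_exp(m, terms=40):
    """Sum the series exp(m) = 1 + m + m^2/2! + ... up to a fixed number of terms."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds m^k / k!
        term = [[entry / k for entry in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


omega = 0.5
series_boost = matrix_exp([[0.0, omega], [omega, 0.0]])   # (t, x^1) block of omega * M^{01}

closed_form = [
    [math.cosh(omega), math.sinh(omega)],
    [math.sinh(omega), math.cosh(omega)],
]

# Velocity matching sinh(omega) = -v / sqrt(1 - v^2), i.e. v = -tanh(omega).
v = -math.tanh(omega)
gamma = 1.0 / math.sqrt(1.0 - v * v)
velocity_form = [[gamma, -gamma * v], [-gamma * v, gamma]]

diff_closed = max(abs(series_boost[i][j] - closed_form[i][j]) for i in range(2) for j in range(2))
diff_velocity = max(abs(series_boost[i][j] - velocity_form[i][j]) for i in range(2) for j in range(2))
print(diff_closed, diff_velocity)
```

Both differences are at the level of floating-point rounding, so the series, the hyperbolic form and the $\gamma$, $v$ form all describe the same matrix.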
Now let’s consider those $M$’s with both spatial indices. We find:
\[ (M^{12})^\mu{}_\nu = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{2.37} \]
Exponentiating this we get
\[ \Lambda(\omega_{12}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(\omega_{12}) & -\sin(\omega_{12}) & 0 \\ 0 & \sin(\omega_{12}) & \cos(\omega_{12}) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.38} \]
which is clearly a rotation that mixes together the $x^1$ and the $x^2$ directions and so is a rotation about the $x^3$
axis. We see that the group of spatial rotations $SO(3)$ is a subgroup of $SO(1, 3)$.
Details on how to do these matrix exponentials will be given in the homework.
Finally let’s discuss those transformations which do not have $\det\Lambda = 1$; from earlier, this means that they will
have $\det\Lambda = -1$, and some examples are time reversal, which acts as $T : (x^0, x^1, x^2, x^3) \to (-x^0, x^1, x^2, x^3)$
– so in matrix notation it is
\[ \Lambda_T = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.39} \]
Another important one is parity $P$, which changes the direction of all of the spatial directions: $P :
(x^0, x^1, x^2, x^3) \to (x^0, -x^1, -x^2, -x^3)$. We then have
\[ \Lambda_P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{2.40} \]
These clearly both have determinant $-1$. One can get the whole Lorentz group $O(1, 3)$ by composing these
transformations with the infinitesimal ones studied above. Exercise: what is the form of parity if we had
only two spatial (and one time) dimensions? Why is it different?

2.4 Translations and the Poincaré group

We have just understood Lorentz transformations, and I argued that these are the relevant generalization of
the idea of “rotation” to a relativistic spacetime. We should also now think about space-time translations,
i.e. transformations of the form:
\[ x^\mu \to x'^\mu = x^\mu - a^\mu \tag{2.41} \]
where $a^\mu$ is a constant 4-vector. These are symmetries of (sufficiently small) regions of empty space in our
universe. Regarding $a^\mu$: its 0-th component corresponds to a translation of the origin of time (saying that
the laws of physics are the same now and in the future) and its spatial components correspond to a translation
of the origin of space (saying that the laws of physics are the same in this classroom and in the next).
The combination of these together with the Lorentz transformations results in the following 10-parameter
symmetry group, called the Poincaré group:
\[ x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu - a^\mu \tag{2.42} \]
It has ten generators, the six $M^{\rho\sigma}$ and four translations $P_\mu$. In the next section I will write down more
explicitly how $P_\mu$ acts on fields defined on spacetime.
Finally, it’s worth noting that the translations don’t commute with the Lorentz transformations. The full
structure of commutators is:
\[ [M^{\mu\nu}, M^{\rho\sigma}] = -\eta^{\nu\rho} M^{\mu\sigma} + \eta^{\mu\rho} M^{\nu\sigma} + \eta^{\nu\sigma} M^{\mu\rho} - \eta^{\mu\sigma} M^{\nu\rho} \tag{2.43} \]
\[ [P^\mu, M^{\nu\rho}] = -\eta^{\mu\nu} P^\rho + \eta^{\mu\rho} P^\nu \tag{2.44} \]
\[ [P^\mu, P^\nu] = 0 \tag{2.45} \]

This is called the Poincaré algebra. I have not quite proved the second of these relations for you yet. We
will see how to do so shortly.

2.5 The twin paradox

So just to recap everything that we have just studied – we looked at the symmetries of flat space with the
Minkowski metric $\mathbb{R}^{1,3}$ and learned that they have a pretty simple group theoretical structure involving some
antisymmetric matrices.
The thing I want you to keep in mind is that this really isn’t that different from ordinary rotations; there
are just some signs here and there. The signs are extremely important: they tell us that time is different
from space, which is very very true – but calculationally and conceptually Lorentz transformations should
not be much worse than normal rotations. As an example, let’s study the twin paradox. [Figure: picture of twin
paradox.] We have one twin who stays at home, and thus lives on the spacetime trajectory AB – whereas
we have another, who travels along AC and then CB. When the two twins meet up again, one of them is older!
(omg).
This is exactly like the distance paradox that I told you about, except that now the invariant notion of
time elapsed along each trajectory is measured with the Minkowski metric and not the regular metric. So
we have twin AB measuring time $\Delta t$, and the other twin measuring $2\sqrt{\left(\frac{\Delta t}{2}\right)^2 - \Delta x^2}$. The twin paradox is really
not very different from the fact that the different sides of a triangle have different lengths, except that we
measure “length” with the Minkowski metric and not the familiar Euclidean one.
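To put numbers on this, here is a minimal sketch that evaluates the two elapsed times for a sample trip. It assumes $c = 1$, straight-line segments AC and CB of equal duration, and arbitrarily chosen values $\Delta t = 10$ and $\Delta x = 3$; the function name `proper_time` is an illustrative choice.

```python
import math


def proper_time(dt, dx):
    """Minkowski 'length' of a straight timelike segment, with c = 1."""
    return math.sqrt(dt**2 - dx**2)


total_time = 10.0            # Delta t: coordinate time between A and B
turnaround_distance = 3.0    # Delta x: how far away the point C is

# Twin who stays at home: a single segment AB with no spatial displacement.
stay_at_home = proper_time(total_time, 0.0)

# Travelling twin: two equal segments AC and CB, each covering Delta x in Delta t / 2.
traveller = 2.0 * proper_time(total_time / 2.0, turnaround_distance)

print("stay-at-home twin ages:", stay_at_home)
print("travelling twin ages:  ", traveller)
```

The travelling twin accumulates less proper time, so the stay-at-home twin is the older one when they meet again. Note the contrast with Euclidean geometry, where the detour via C would be the longer path.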

2.6 Realizing transformations on fields

We have seen how coordinate transformations act on the coordinates x, i.e. in an index-free notation as:

    x' = \Lambda x - a    (2.46)

Now let us come to the basic ingredient in this class: consider a scalar field φ(x), which is a map $\phi : \mathbb{R}^{1,3} \to \mathbb{R}$. φ(x) will also transform under coordinate transformations, so we can say that φ(x) transforms in
a representation of the Poincaré group. We will now understand what this means.

2.6.1 Scalar fields

The basic rule is the following: when you go from an unprimed to a primed coordinate system, everything
changes, including the scalar field φ(x). Let us call the new scalar field φ′(x′): the primed field at the primed
point is determined by the unprimed field at the unprimed point, i.e.

    \phi'(x') = \phi(x)    (2.47)

Let us work out explicitly how this works for translations, i.e. consider the transformation

    x'^\mu = x^\mu - a^\mu    (2.48)

So we have that

    \phi'(x') = \phi'(x - a) = \phi(x)    (2.49)

where the last equality came from (2.47). We now have

    \phi'(x) = \phi(x + a)    (2.50)

Let’s start with an infinitesimal translation; expanding the right hand side in powers of a we have

    \phi'(x) = \phi(x) + a^\mu \frac{\partial}{\partial x^\mu}\phi(x) + O(a^2)    (2.51)

Thus we see that under an infinitesimal translation, the transformation of the scalar field is

    \phi'(x) = \phi(x) + \delta\phi(x), \qquad \delta\phi(x) = a^\mu \partial_\mu \phi(x)    (2.52)

Now let us consider finite transformations. For any transformation continuously connected to the identity,
we can obtain the finite transformation by exponentiating some algebra element, i.e. there exists a $P_\mu$ such
that

    \phi'(x) = \phi(x + a) = \exp\left(a^\mu P_\mu\right)\phi(x)    (2.53)

In this case, by expanding both sides in powers of a we see that

    P_\mu = \frac{\partial}{\partial x^\mu}    (2.54)

i.e. translations act on the scalar field through derivatives. This should seem philosophically
reasonable.
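To see the exponentiation at work in the simplest possible setting, here is a small worked check in one dimension. The choice φ(x) = x² is my own illustrative example (not from the notes); it is convenient because the Taylor series terminates after the second derivative.

```latex
\begin{align}
\phi(x) &= x^2, \qquad P = \frac{d}{dx}, \\
\exp(aP)\,\phi(x)
  &= \left(1 + a\frac{d}{dx} + \frac{a^2}{2}\frac{d^2}{dx^2} + \dots\right)x^2 \\
  &= x^2 + 2ax + a^2 = (x+a)^2 = \phi(x+a).
\end{align}
```

All higher terms vanish because the third and higher derivatives of x² are zero, so in this example the exponential of the derivative reproduces the finite shift exactly.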
Let us now come to the slightly more complicated problem of Lorentz transformations. This is similar except
that the transformation of x is given by (2.34), i.e.

    x'^\mu = \Lambda^\mu{}_\nu x^\nu = \left[\exp\left(\omega_{\rho\sigma} M^{\rho\sigma}\right)\right]^\mu{}_\nu x^\nu    (2.55)

Let us now expand this out for infinitesimal ω; we have then:

    x'^\mu = x^\mu + \delta x^\mu = x^\mu + \omega_{\rho\sigma} (M^{\rho\sigma})^\mu{}_\nu x^\nu + O(\omega^2)    (2.56)

Now we play the same game as before: we have

    \phi'(x') = \phi(x)    (2.57)

Formally speaking, we would like to find a set of generators $L^{\rho\sigma}$ so that the following expression is true:

    \phi'(x) = \exp\left(\omega_{\rho\sigma} L^{\rho\sigma}\right)\phi(x) = \phi(x) + \omega_{\rho\sigma} L^{\rho\sigma}\phi(x) + O(\omega^2)    (2.58)

It’s perhaps a good time to understand the difference between $L^{\rho\sigma}$ and $M^{\rho\sigma}$: here $M^{\rho\sigma}$ is a set of 4 × 4
matrices, i.e. linear operators which realize the Lorentz algebra on the coordinates $x^\mu$. However $L^{\rho\sigma}$ is
going to be a set of differential operators, i.e. linear operators which realize the Lorentz algebra on the
field φ(x). In the language of GMP, the field φ(x) and the coordinate $x^\mu$ form different representations of
the Lorentz group.
Our task now is to determine the form of $L^{\rho\sigma}$.
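To do this, we can proceed algorithmically just as before. We have

    \phi'(x + \delta x) = \phi(x)    (2.59)

Now we work for small δx to find

    \phi'(x) = \phi(x) - \delta x^\mu \partial_\mu \phi(x) = \phi(x) - \left(\omega_{\rho\sigma} (M^{\rho\sigma})^\mu{}_\nu x^\nu\right)\partial_\mu\phi(x) + O(\omega^2)    (2.60)

From here we can basically read off the form of the $L^{\rho\sigma}$. Plugging in the form of $(M^{\rho\sigma})^\mu{}_\nu = \eta^{\sigma\mu}\delta^\rho_\nu - \eta^{\rho\mu}\delta^\sigma_\nu$ from (2.30) and comparing with (2.58) we find

    L^{\rho\sigma} = x^\sigma \frac{\partial}{\partial x_\rho} - x^\rho \frac{\partial}{\partial x_\sigma}    (2.61)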

This is the desired expression. Note what it says: if you perform a Lorentz transformation, the field changes
– the amount that it changes depends on how far you are from the origin of the transformation, which makes
complete sense.
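For readers who want the intermediate step spelled out, here is a sketch of how the explicit matrices turn the infinitesimal shift into the differential operator of (2.61). It uses only the form of the matrices quoted from (2.30), together with the shorthand $\partial^\rho \equiv \eta^{\rho\mu}\partial_\mu = \partial/\partial x_\rho$, which is my notation for this sketch.

```latex
\begin{align}
\delta x^\mu\,\partial_\mu\phi
  &= \omega_{\rho\sigma}\left(\eta^{\sigma\mu}\delta^\rho_\nu - \eta^{\rho\mu}\delta^\sigma_\nu\right)x^\nu\,\partial_\mu\phi \\
  &= \omega_{\rho\sigma}\left(x^\rho\,\partial^\sigma - x^\sigma\,\partial^\rho\right)\phi , \\
\phi'(x) &= \phi(x) - \delta x^\mu\,\partial_\mu\phi
  = \phi(x) + \omega_{\rho\sigma}\left(x^\sigma\,\partial^\rho - x^\rho\,\partial^\sigma\right)\phi + O(\omega^2).
\end{align}
```

Comparing the last line with the expansion $\phi' = \phi + \omega_{\rho\sigma}L^{\rho\sigma}\phi$ gives $L^{\rho\sigma} = x^\sigma\partial^\rho - x^\rho\partial^\sigma$, which is the same operator written with raised derivative indices.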
Now the L’s and the P’s form a representation of the Poincaré group, and thus satisfy the following algebra:

    [L^{\mu\nu}, L^{\rho\sigma}] = -\eta^{\nu\rho} L^{\mu\sigma} + \eta^{\mu\rho} L^{\nu\sigma} + \eta^{\nu\sigma} L^{\mu\rho} - \eta^{\mu\sigma} L^{\nu\rho}    (2.62)
    [P^\mu, L^{\nu\rho}] = -\eta^{\mu\nu} P^\rho + \eta^{\mu\rho} P^\nu    (2.63)
    [P^\mu, P^\nu] = 0,    (2.64)

which you are welcome to check explicitly.

2.6.2 Vector fields

There are other kinds of fields than scalar fields. Let us think for a second about a vector field, e.g. $A^\mu(x)$.
This will be important when we discuss gauge theories later on in the course. This is different from a scalar,
because it has an extra index which must also be transformed. In other words, the transformation law is

    A'^\mu(x') = \Lambda^\mu{}_\nu A^\nu(x)    (2.65)

We can now go through the same analysis as before to determine the form of the Lorentz generator. I’ll leave
you to go through this as an exercise: it becomes

    L^{\rho\sigma} = M^{\rho\sigma} + x^\sigma \frac{\partial}{\partial x_\rho} - x^\rho \frac{\partial}{\partial x_\sigma}    (2.66)

The extra M here acts on the space of $A^\mu(x)$ at a single point and rotates it just as though it were a vector;
the other parts act on the coordinates x.
Finally there is another type of field which will be important, called a spinor field, which again you have
studied in GMP. I will discuss the Lorentz transformation properties of these in due course, but for now I
feel we have done enough group theory and I would like to get to some dynamics.

3 Lagrangian methods and classical field theory

We now move towards field theory. First, Lagrangian methods for classical mechanics.

3.1 Lagrangians for classical mechanics

In classical mechanics imagine we have a system with N coordinates $q_a$, where a ∈ {1, · · · , N}. Then we can
organize our understanding with a Lagrangian $L(q_1, \dots, q_N, \dot{q}_1, \dots, \dot{q}_N)$, where $\dot{q}_a = dq_a/dt$. A typical form of
the Lagrangian is

    L(q_1, \dots, q_N, \dot{q}_1, \dots, \dot{q}_N) = \sum_{a=1}^{N} \frac{1}{2} m \dot{q}_a^2 - V(q_1, \dots, q_N)    (3.1)

i.e. of the form “kinetic energy” minus “potential energy”. The point of the Lagrangian is that we use it to
determine the action:

    S[q_a] = \int_{t_i}^{t_f} dt\, L(q_a, \dot{q}_a)    (3.2)
The action is a map from the space of particle trajectories to ℝ, and the principle of least action tells us
that the solution to the classical equation of motion is the one that extremizes the action, i.e. if we consider
a variation of the path $q_a(t) \to q_a(t) + \delta q_a(t)$, then the action should be stationary. In other
words we demand:

    0 = \delta S[q_a] = \int_{t_i}^{t_f} dt\, \delta L(q_a, \dot{q}_a)    (3.3)
      = \int_{t_i}^{t_f} dt \sum_{a=1}^{N} \left( \frac{\partial L}{\partial q_a}\,\delta q_a + \frac{\partial L}{\partial \dot{q}_a}\,\delta\dot{q}_a \right)    (3.4)
      = \int_{t_i}^{t_f} dt \sum_{a=1}^{N} \left( \frac{\partial L}{\partial q_a} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_a} \right)\delta q_a(t)    (3.5)

where in the last equality we have integrated by parts and neglected a boundary term (formally, we
demand that $\delta q_a(t_i) = \delta q_a(t_f) = 0$).
Demanding that this holds for the space of all variations $\delta q_a(t)$, we find that the quantity multiplying $\delta q_a(t)$
must vanish, which gives us the Euler-Lagrange equations of motion, which are the following set of ODEs:

    \frac{\partial L}{\partial q_a} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_a}\right) = 0    (3.6)

This should be familiar. Note that we have obtained N equations of motion from a single scalar quantity L;
this is a computationally very useful thing and is one of the reasons why we formulate classical mechanics in
terms of Lagrangians.
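As a short worked check, one can feed the typical “kinetic minus potential” Lagrangian, $L = \sum_a \frac12 m\dot{q}_a^2 - V(q_1,\dots,q_N)$, into the Euler-Lagrange equations $\partial L/\partial q_a - \frac{d}{dt}(\partial L/\partial\dot{q}_a) = 0$. This is only a sketch of the standard computation:

```latex
\begin{align}
\frac{\partial L}{\partial q_a} &= -\frac{\partial V}{\partial q_a},
\qquad
\frac{\partial L}{\partial \dot{q}_a} = m\,\dot{q}_a , \\
0 &= \frac{\partial L}{\partial q_a} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_a}
  = -\frac{\partial V}{\partial q_a} - m\,\ddot{q}_a
\quad\Longrightarrow\quad
m\,\ddot{q}_a = -\frac{\partial V}{\partial q_a}.
\end{align}
```

So the N Euler-Lagrange equations are just Newton’s second law for each coordinate, with the force given by minus the gradient of the potential.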

3.2 Lagrangian methods for classical field theory

We now want to write down a Lagrangian for a classical scalar field $\phi(x) = \phi(t, \vec{x})$. Let me begin by making an
analogy with normal classical mechanics; there we have $q_a(t)$. Now consider having a run over the discrete
N sites of a cubic lattice, and imagine making N bigger and bigger until the lattice sites approximate a
continuum $\mathbb{R}^3$. Then we see that $q_a(t) \to \phi(t, \vec{x})$, and $\sum_{a=1}^{N} \to \int d^3x$. Basically the spatial coordinate $\vec{x}$
plays the role of the index a labelling the degrees of freedom.


We are thus led to consider a Lagrangian which takes the following form:

    L = \int d^3x\, \mathcal{L}(\phi(x), \partial_\mu\phi(x))    (3.7)

Here $\mathcal{L}$ is called the Lagrangian density. Let us insert this into the action to find:

    S[\phi] = \int dt\, d^3x\, \mathcal{L}(\phi(x), \partial_\mu\phi(x))    (3.8)

This is the general form of the field theories that we will consider. Let me note a few nice things about it:

1. Note that in the integral we have $d^4x = dx^0\,dx^1\,dx^2\,dx^3$; in other words space and time look like
they’re treated on a similar footing, which is crucial for relativistic invariance. We will explicitly check
relativistic invariance soon.
2. This has locality built in: fields only couple to each other at the same space-time point, i.e. given this
form you will never see a term in the action that looks like $\phi(t, \vec{x})\phi(t, \vec{y})$. This appears to be a basic
requirement for a field theory to be well-behaved.

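Now let us derive the analogue of the Euler-Lagrange equations. We again demand that the action is
stationary under a variation φ(x) → φ(x) + δφ(x), so we have

    0 = \delta_\phi S = \int d^4x\, \delta\mathcal{L} = \int d^4x \left( \frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\partial_\mu\delta\phi \right)    (3.9)
      = \int d^4x \left( \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) \right)\delta\phi(x)    (3.10)

Now this gives us the classical Euler-Lagrange equations for a field theory, i.e.

    \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) = 0    (3.11)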

Note that everything here was a function of space-time, so the action formalism gives us a way to obtain a
PDE from a single scalar density $\mathcal{L}(\phi, \partial_\mu\phi)$.
Some important notation: when a field configuration satisfies the Euler-Lagrange equations, we say that it
is on-shell. Whenever we are looking at an on-shell field configuration the action is stationary under small
variations about this field configuration.

3.2.1 Action of a real scalar field

Let us now describe some conditions that we want for our action S (equivalently, our Lagrangian density $\mathcal{L}$):

1. The action S should be invariant under Poincaré transformations. Note we have

    S = \int d^4x\, \mathcal{L}(\phi, \partial_\mu\phi)    (3.12)

Let’s quickly check how the measure transforms under Poincaré transformations: under x′ = x − a we
have $d^4x' = d^4x$. Under x′ = Λx we have $d^4x' = |\det\Lambda|\, d^4x = d^4x$. So we see that the integral doesn’t
bring in any new features, and invariance of the action really only requires that the Lagrangian density
$\mathcal{L}$ be invariant under Poincaré.
2. The action S should result in non-trivial dynamics, i.e. we should get an interesting PDE out of (3.11).
In practice, this means that $\mathcal{L}$ must depend on $\partial_\mu\phi$.

The simplest action that satisfies these two properties is the following action of a real scalar field:

    \mathcal{L}(\phi, \partial_\mu\phi) = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - V(\phi)    (3.13)

Here V(φ) is a function of one variable; it is called the potential of the scalar field. We will spend a
great deal of time thinking about implications of this action. Let’s expand it out a little bit:

    \mathcal{L} = -\frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - V(\phi)    (3.14)
                = \frac{1}{2}(\partial_t\phi)^2 - \frac{1}{2}\vec{\nabla}\phi\cdot\vec{\nabla}\phi - V(\phi)    (3.15)

This is relativistically invariant because we have contracted all of the indices using the Minkowski metric
tensor; this introduced an interesting minus sign between the derivatives. Note that by analogy with the case
for a particle, we can consider the first term a kind of “kinetic energy” and the second two terms a sort of
“potential energy”.

The simplest choice for V(φ) is

    V(\phi) = \frac{1}{2} m^2 \phi^2    (3.16)
What is the meaning of the parameter m? At this point I have to tell you what this theory describes. With
this choice of potential it turns out that the quantum theory described by this action describes a system of
non-interacting particles with mass m. I have not justified this radical statement at all yet, so you should be
very skeptical of this, but I’m going to show you this is true.
Let us now work out the Euler-Lagrange equations for this system. I will do this somewhat carefully: we
start with

    \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)    (3.17)

Now let’s work out both of the terms. For the first we simply have

    \frac{\partial\mathcal{L}}{\partial\phi} = -\frac{\partial V}{\partial\phi} = -m^2\phi    (3.18)

For the second we need to work out

    \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \frac{\partial}{\partial(\partial_\mu\phi)}\left(-\frac{1}{2}\eta^{\rho\sigma}\partial_\rho\phi\,\partial_\sigma\phi\right)    (3.19)

On the right-hand side I have taken the derivative term from (3.14) and renamed some indices to avoid confusion. Now
in constructing this derivative let us start by making sure that we understand the following expression:

    \frac{\partial(\partial_\rho\phi)}{\partial(\partial_\sigma\phi)} = \delta^\sigma_\rho    (3.20)

The fact that the left-hand side vanishes if σ ≠ ρ and is 1 if σ = ρ actually means that the right-hand side
must be equal to the Kronecker delta, which means the index placement must be as shown. The fact that
the ρ index is down should not be too surprising, but the fact that the σ index is up might be. You can verify
that the Lorentz transformation properties of this expression do in fact make sense.
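Using this expression we find

    \frac{\partial}{\partial(\partial_\mu\phi)}\left(-\frac{1}{2}\eta^{\rho\sigma}\partial_\rho\phi\,\partial_\sigma\phi\right) = -\frac{1}{2}\eta^{\rho\sigma}\left(\delta^\mu_\rho\,\partial_\sigma\phi + \partial_\rho\phi\,\delta^\mu_\sigma\right) = -\eta^{\mu\sigma}\partial_\sigma\phi = -\partial^\mu\phi    (3.21)

Putting both of these pieces in we find:

    \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) = -m^2\phi - \partial_\mu\left(-\partial^\mu\phi\right)    (3.22)

or

    \partial^\mu\partial_\mu\phi - m^2\phi = 0    (3.23)

This is the Klein-Gordon equation. It is a linear PDE that describes a fully relativistically invariant wave
equation.
Let us expand out the structure $\partial^\mu\partial_\mu = \eta^{\mu\nu}\partial_\mu\partial_\nu = -\partial_t^2 + \vec{\nabla}\cdot\vec{\nabla}$; this structure should be familiar to you
from studying the wave equation in other courses. Note the speed of wave propagation, which is given by
the ratio of the spatial and time derivatives, is 1. This is because of my choice of units; if I had used more human
units it would have been the speed of light c.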
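To make the wave-equation structure concrete, here is a minimal numerical sketch in one spatial dimension. It checks by finite differences that a plane wave cos(kx − ωt) satisfies −∂_t²φ + ∂_x²φ − m²φ = 0 when ω² = k² + m², and fails to when ω is chosen differently. The function names, the step size h, and the sample values of k, m, t, x are my own illustrative assumptions.

```python
import math

def dispersion(k, m):
    """Frequency of a plane wave with wavenumber k and mass m (units with c = 1)."""
    return math.sqrt(k * k + m * m)

def klein_gordon_residual(k, m, omega, t, x, h=1e-3):
    """Finite-difference estimate of (-d_t^2 + d_x^2 - m^2) phi for phi = cos(k x - omega t)."""
    def phi(t_, x_):
        return math.cos(k * x_ - omega * t_)

    # Central second differences in time and in space.
    d2t = (phi(t + h, x) - 2.0 * phi(t, x) + phi(t - h, x)) / h**2
    d2x = (phi(t, x + h) - 2.0 * phi(t, x) + phi(t, x - h)) / h**2
    return -d2t + d2x - m * m * phi(t, x)

# Hypothetical sample point and parameters.
k, m = 1.3, 0.7
on_shell = klein_gordon_residual(k, m, dispersion(k, m), t=0.4, x=0.9)
off_shell = klein_gordon_residual(k, m, k, t=0.4, x=0.9)  # massless dispersion, wrong for m != 0
print(on_shell, off_shell)
```

The first residual is zero up to discretisation error, while the second is not; setting m = 0 would make ω = |k|, which is the statement that the waves then travel at speed 1.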

3.3 Noether’s theorem

We will now discuss Noether’s theorem, probably the deepest and most important result in mathematical
physics. It is basically the following statement: a continuous symmetry results in a conserved current.
Let me first explain what I mean by a continuous symmetry.

3.3.1 Example: a complex scalar field

I will start with the simplest example, also as an excuse to consider a slightly different field theory. Let us
imagine not a real scalar field but a complex scalar field Φ. The action for the complex scalar field is

    S[\Phi, \Phi^*] = \int d^4x \left[ -(\partial_\mu\Phi^*)(\partial^\mu\Phi) - m^2\Phi^*\Phi \right]    (3.24)

Some notational points: this field has both a real and imaginary part, and it is customary to encode this in
the field Φ and its conjugate Φ∗. Clearly the action is invariant under the following U(1) symmetry, a phase
rotation of Φ:

    \Phi \to e^{i\alpha}\Phi, \qquad \Phi^* \to e^{-i\alpha}\Phi^*    (3.25)

Let’s just verify this very carefully:

    S[\Phi e^{i\alpha}, \Phi^* e^{-i\alpha}] = \int d^4x \left[ -(\partial_\mu\Phi^* e^{-i\alpha})(\partial^\mu\Phi e^{+i\alpha}) - m^2\Phi^* e^{-i\alpha}\Phi e^{+i\alpha} \right]    (3.26)
        = \int d^4x \left[ -(\partial_\mu\Phi^*)(\partial^\mu\Phi) - m^2\Phi^*\Phi \right] = S[\Phi, \Phi^*]    (3.27)

Here α is a constant. This is called a global continuous symmetry; it is global because it acts everywhere
in space and time in the same way since α is a constant. There are also symmetries which are not global
symmetries: you will learn about those in due course in the part of the course on gauge theory.
It is continuous because it is a symmetry for all values of α, $e^{i\alpha} \in U(1)$. Importantly, we can consider its
infinitesimal version:

    \Phi \to (1 + i\alpha)\Phi + O(\alpha^2), \qquad \Phi^* \to (1 - i\alpha)\Phi^* + O(\alpha^2)    (3.28)

We can also have discrete symmetries, which are not continuous. In fact the action above has one:

    \Phi \to \Phi^*, \qquad \Phi^* \to \Phi    (3.29)

This symmetry is often called charge conjugation, or C for short. (It will turn out that in the quantum
theory this symmetry switches particles and antiparticles; again for now you just have to take my word for
it). Note it does not have an infinitesimal version, and there isn’t an immediate version of Noether’s theorem
for it.

3.3.2 Proof of Noether’s theorem

We will now prove Noether’s theorem. Consider a field theory described by a set of scalar fields $\phi_a$, with an
action $S[\phi_a]$. Now let us imagine that the theory is invariant under a continuous symmetry. This means that
there exists an infinitesimal variation of the fields

    \phi_a(x) \to \phi_a(x) + \epsilon\,\Delta\phi_a(x)    (3.30)

under which the action is invariant, i.e.

    S[\phi_a + \epsilon\,\Delta\phi_a] - S[\phi_a] = 0 \;\to\; \delta_\epsilon S[\phi_a] = 0    (3.31)

Here ε is an infinitesimal symmetry parameter; it is analogous to the α that I used above in the U(1) example.
Now, let us perform a trick: let’s consider a variation where we allow ε(x) to be an arbitrary function of
space and time, i.e.

    \phi_a(x) \to \phi_a(x) + \epsilon(x)\,\Delta\phi_a(x)    (3.32)

The action now is not invariant (why would it be?). However its variation must vanish when ε(x) is a constant (in which
case it reduces to the case above), so the variation takes the form

    \delta_\epsilon S[\phi_a] = \int d^4x\, j^\mu\, \partial_\mu\epsilon(x)    (3.33)

(Here I have secretly also assumed that the field theory is local). Here $j^\mu$ is some function of the fields which
depends on the theory in question. Now let’s integrate by parts and neglect boundary terms to find

    \delta_\epsilon S[\phi_a] = -\int d^4x\, (\partial_\mu j^\mu)\,\epsilon(x)    (3.34)

This relation holds for any choice of field configuration $\phi_a$. But now let us consider field configurations $\phi_a$
that satisfy the Euler-Lagrange equations of motion, i.e. they are on-shell. In that case the variation of the
action is zero for any variation of the fields, including (3.32). So on-shell we find

    0 = \delta_\epsilon S[\phi_a] = -\int d^4x\, (\partial_\mu j^\mu)\,\epsilon(x)    (3.35)

This holds for any choice of ε(x); thus whenever the fields satisfy the equations of motion we have

    \partial_\mu j^\mu = 0    (3.36)

i.e. there exists a divergenceless vector field $j^\mu$, which is usually called a “conserved current”. Thus we have
proven Noether’s theorem. We note that this approach also gives us an algorithm to construct the conserved
current.
Now let us discuss for a second what it means to have a divergenceless current. Basically it means that there
exists a conserved charge. Let’s explain this carefully: the conservation equation (3.36) reads:

    \partial_t j^0(t, x^i) + \partial_i j^i(t, x^i) = 0    (3.37)

Now let us integrate this equation over all space:

    \partial_t \int d^3x\, j^0 + \int d^3x\, \partial_i j^i(t, x^i) = 0    (3.38)

The last term is a total derivative, which means we can neglect it if all fields die off quickly enough at infinity.
In that case we find the following expression

    \frac{d}{dt}Q(t) = 0, \qquad Q \equiv \int d^3x\, j^0(t, x^i)    (3.39)

In other words, there is a conserved charge Q, which is independent of time.
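Let us now work this out explicitly in the case of the complex scalar field. We consider the infinitesimal
variation from (3.28)

    \Delta\Phi(x) = i\alpha\,\Phi(x), \qquad \Delta\Phi^*(x) = -i\alpha\,\Phi^*(x)    (3.40)

Now as suggested above we make the symmetry parameter α depend on space and time and consider the infinitesimal
variation:

    \Delta\Phi(x) = i\alpha(x)\,\Phi(x), \qquad \Delta\Phi^*(x) = -i\alpha(x)\,\Phi^*(x)    (3.41)

And now let’s compute the variation of the action. We treat Φ and Φ∗ as independent variables to find from
(3.24):

    \delta_\alpha S[\Phi, \Phi^*] = \int d^4x \left[ -(\partial_\mu\Delta\Phi^*)(\partial^\mu\Phi) - (\partial_\mu\Phi^*)(\partial^\mu\Delta\Phi) - m^2\Delta\Phi^*\,\Phi - m^2\Phi^*\,\Delta\Phi \right]    (3.42)
        = \int d^4x \left[ -(\partial_\mu(-i\alpha\Phi^*))(\partial^\mu\Phi) - (\partial_\mu\Phi^*)(\partial^\mu(i\alpha\Phi)) - m^2(-i\alpha\Phi^*)\Phi - m^2\Phi^*(+i\alpha\Phi) \right]    (3.43)
        = -\int d^4x\, \partial_\mu\alpha \left[ i(\partial^\mu\Phi^*)\Phi - i\Phi^*\,\partial^\mu\Phi \right]    (3.44)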

Indeed, as promised, this looks exactly the same as (3.33), which lets us immediately read off the explicit
expression of the conserved current:

    j^\mu = -i\left[ (\partial^\mu\Phi^*)\Phi - \Phi^*(\partial^\mu\Phi) \right]    (3.45)

We can also now construct the conserved charge Q:

    Q = \int d^3x\, j^0 = i\int d^3x \left[ (\partial_t\Phi^*)\Phi - \Phi^*(\partial_t\Phi) \right]    (3.46)

It can be instructive to show that the current is conserved using the ordinary form of the Euler-Lagrange
equations for the complex scalar field.
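Here is a sketch of that check. It assumes the equations of motion that follow from the complex scalar action when Φ and Φ∗ are varied independently, namely $\partial_\mu\partial^\mu\Phi = m^2\Phi$ and $\partial_\mu\partial^\mu\Phi^* = m^2\Phi^*$, and uses the current $j^\mu = -i[(\partial^\mu\Phi^*)\Phi - \Phi^*\partial^\mu\Phi]$.

```latex
\begin{align}
\partial_\mu j^\mu
  &= -i\left[(\partial_\mu\partial^\mu\Phi^*)\Phi + (\partial^\mu\Phi^*)(\partial_\mu\Phi)
      - (\partial_\mu\Phi^*)(\partial^\mu\Phi) - \Phi^*(\partial_\mu\partial^\mu\Phi)\right] \\
  &= -i\left[(\partial_\mu\partial^\mu\Phi^*)\Phi - \Phi^*(\partial_\mu\partial^\mu\Phi)\right] \\
  &= -i\left[m^2\Phi^*\Phi - m^2\Phi^*\Phi\right] = 0 .
\end{align}
```

The cross terms cancel identically; the equations of motion are only needed in the last step, which is why the current is conserved on-shell but not for arbitrary field configurations.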

3.3.3 Alternative way to find the current

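Recall that the action is written in terms of the Lagrangian density $\mathcal{L}(\phi_a, \partial_\mu\phi_a)$. In many situations we find
that it is not only the integrated action which is invariant under the symmetry, but also the Lagrangian density itself. In
that case there’s a shortcut to finding the current.
Consider the symmetry variation of the fields (note there is an implicit sum over a here):

    \delta_\epsilon S = \int d^4x\, \delta_\epsilon\mathcal{L} = \int d^4x\, \epsilon\left( \frac{\partial\mathcal{L}}{\partial\phi_a}\,\Delta\phi_a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\,\partial_\mu(\Delta\phi_a) \right) = 0    (3.47)

If the Lagrangian density itself is invariant under the symmetry, then for constant ε we have:

    \delta_\epsilon\mathcal{L} = \epsilon\left( \frac{\partial\mathcal{L}}{\partial\phi_a}\,\Delta\phi_a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\,\partial_\mu(\Delta\phi_a) \right) = 0    (3.48)

Now recall that the Euler-Lagrange equations of motion tell us that on-shell we have:

    \frac{\partial\mathcal{L}}{\partial\phi_a} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\right) = 0    (3.49)

Let’s use this formula to eliminate $\partial\mathcal{L}/\partial\phi_a$. We see from (3.48) that

    \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\right)\Delta\phi_a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\,\partial_\mu(\Delta\phi_a) = \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\,\Delta\phi_a\right) = 0    (3.50)

Thus we have identified a divergenceless current

    j^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi_a)}\,\Delta\phi_a    (3.51)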

This is often a quick way to find the symmetry current. It’s instructive to check that for the complex scalar
field this gives the same answer as (3.45). I stress that the formula above only works if the Lagrangian
itself is invariant, which is often the case but is strictly a stronger condition than the integrated action being
invariant. There is a way to improve this equation to suit the general case (see e.g. p18 of Peskin & Schroeder)
but under those circumstances I usually just use the algorithm of the previous part, which always works.
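As a sketch of that check for the complex scalar field, treat Φ and Φ∗ as the two independent fields, with Lagrangian density $-(\partial_\mu\Phi^*)(\partial^\mu\Phi) - m^2\Phi^*\Phi$ and symmetry variations $\Delta\Phi = i\Phi$, $\Delta\Phi^* = -i\Phi^*$ (the infinitesimal parameter stripped off, which is my normalisation for this sketch).

```latex
\begin{align}
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\Phi)} &= -\partial^\mu\Phi^* ,
\qquad
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\Phi^*)} = -\partial^\mu\Phi , \\
j^\mu &= \frac{\partial\mathcal{L}}{\partial(\partial_\mu\Phi)}\,\Delta\Phi
       + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\Phi^*)}\,\Delta\Phi^* \\
      &= (-\partial^\mu\Phi^*)(i\Phi) + (-\partial^\mu\Phi)(-i\Phi^*)
       = -i\left[(\partial^\mu\Phi^*)\Phi - \Phi^*(\partial^\mu\Phi)\right].
\end{align}
```

This agrees with the current obtained from the space-time dependent parameter trick, including the overall sign.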

3.4 Hamiltonian formalism in classical mechanics

Finally, let us briefly remind ourselves how the Hamiltonian works in classical mechanics. Given a classical
mechanical degree of freedom $q_a(t)$, we can construct its canonical momentum

    p_a \equiv \frac{\partial L}{\partial \dot{q}_a}    (3.52)

We then construct the Hamiltonian by the following algorithm:

    H(q_a, p_a) \equiv \sum_a p_a \dot{q}_a - L(q_a, \dot{q}_a)    (3.53)
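For a concrete instance of this algorithm, here is a sketch using the typical “kinetic minus potential” Lagrangian $L = \sum_a \frac12 m\dot{q}_a^2 - V(q)$ from earlier, where the velocities are eliminated in favour of the momenta:

```latex
\begin{align}
p_a &= \frac{\partial L}{\partial \dot{q}_a} = m\,\dot{q}_a
\quad\Longrightarrow\quad \dot{q}_a = \frac{p_a}{m}, \\
H &= \sum_a p_a\dot{q}_a - L
   = \sum_a \frac{p_a^2}{m} - \sum_a \frac{p_a^2}{2m} + V(q)
   = \sum_a \frac{p_a^2}{2m} + V(q).
\end{align}
```

So the Hamiltonian is “kinetic plus potential”, i.e. the total energy, expressed in terms of coordinates and momenta rather than coordinates and velocities.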

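Now let’s do the same thing for field theory. Our basic degree of freedom is the field $\phi(t, \vec{x})$. We now construct
its canonical momentum $\pi(t, \vec{x})$ from the Lagrangian density via:

    \pi(t, \vec{x}) \equiv \frac{\partial\mathcal{L}}{\partial\dot{\phi}(t, \vec{x})}    (3.54)

Let’s work this out explicitly with the example of the real scalar field

    \mathcal{L} = -\frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - \frac{1}{2}m^2\phi^2 = \frac{1}{2}\dot{\phi}^2 - \frac{1}{2}\vec{\nabla}\phi\cdot\vec{\nabla}\phi - \frac{1}{2}m^2\phi^2    (3.55)

So we find

    \pi(t, \vec{x}) = \frac{\partial\mathcal{L}}{\partial\dot{\phi}} = \dot{\phi}    (3.56)

And we can now find the Hamiltonian density $\mathcal{H}$ by the same sort of formula, where as usual we eliminate $\dot{\phi}$
in terms of π.

    \mathcal{H} = \pi(t, \vec{x})\,\dot{\phi}(t, \vec{x}) - \mathcal{L}    (3.57)
                = \pi^2 - \left( \frac{1}{2}\pi^2 - \frac{1}{2}\vec{\nabla}\phi\cdot\vec{\nabla}\phi - \frac{1}{2}m^2\phi^2 \right)    (3.58)
                = \frac{1}{2}\pi^2 + \frac{1}{2}\vec{\nabla}\phi\cdot\vec{\nabla}\phi + \frac{1}{2}m^2\phi^2    (3.59)

Note that the Hamiltonian density is defined pointwise in space. The full Hamiltonian H is the integral of
the Hamiltonian density over all of space

    H = \int d^3x\, \mathcal{H} = \int d^3x \left( \frac{1}{2}\pi^2 + \frac{1}{2}\vec{\nabla}\phi\cdot\vec{\nabla}\phi + \frac{1}{2}m^2\phi^2 \right)    (3.60)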

As usual in classical mechanics, this can be interpreted as the energy of the system. It is an integral of a
positive-definite quantity, so the energy is positive, which makes sense. One can also show that this energy
is conserved (it is in fact the Noether charge associated with time translational symmetry).
Note that even though the underlying system was Lorentz invariant, we brutally destroyed this Lorentz
invariance by doing various non-covariant things using a particular choice of time to define the dot, $\dot{\phantom{x}} \equiv \partial_t$. This is
always an issue with Hamiltonian methods. The final answers will actually always turn out to be Lorentz
invariant though intermediate steps generally will not be.

4 Quantum field theory in canonical formalism

We are now ready to start thinking about quantum theories. We will begin in what is probably the more
standard and familiar fashion, using a Hamiltonian; this is called the canonical formalism (to be distinguished
from the path integral formalism).
Again we remind ourselves how this works in quantum mechanics. We start with a classical system with
coordinates qa and conjugate momenta pa . There is also a Hamiltonian H(qa , pa ), constructed (perhaps)
from the Lagrangian L(qa , q̇a ) as described previously. These degrees of freedom have Poisson brackets
which are
{qa , qb } = 0 {pa , pb } = 0 {qa , pb } = δab (4.1)
Now there is an algorithm (usually called “quantization”) that takes this structure and makes it into a
quantum theory. It works by promoting the classical variables into Hermitian operators q̂a , p̂a which now act
on some appropriate Hilbert space. The structure of the Poisson brackets is now realized in commutators.
There are two ways to think about quantum mechanics. In the so-called Schrödinger picture the states
|ψ⟩ evolve in time and the operators are time-independent, in which case the algebra of operators is

    [\hat{q}_a, \hat{q}_b] = 0, \qquad [\hat{p}_a, \hat{p}_b] = 0, \qquad [\hat{q}_a, \hat{p}_b] = i\hbar\,\delta_{ab}    (4.2)

One can also think about the Heisenberg picture: in this case the operators evolve in time: we have $\hat{q}_a(t)$,
$\hat{p}_b(t)$, and it should be understood that the commutator algebra above holds only at equal time, i.e:

    [\hat{q}_a(t), \hat{q}_b(t)] = 0, \qquad [\hat{p}_a(t), \hat{p}_b(t)] = 0, \qquad [\hat{q}_a(t), \hat{p}_b(t)] = i\hbar\,\delta_{ab}    (4.3)

(If we considered commutators at unequal times we would find different and generically more complicated
commutation relations which depend explicitly on the time delay).
From now on I will set ħ → 1 again.
We now want to perform the same process for field theory, i.e. we want to quantize the classical theory of
the free scalar field that we introduced previously. Recall that for the real scalar field we have the classical
variables $\phi(t, \vec{x})$ and their conjugate momenta $\pi(t, \vec{x})$. We now want to construct quantum operators $\hat{\phi}(t, \vec{x})$
and $\hat{\pi}(t, \vec{x})$ which obey the following canonical commutation relations:

    [\hat{\phi}(t, \vec{x}), \hat{\phi}(t, \vec{y})] = 0, \qquad [\hat{\pi}(t, \vec{x}), \hat{\pi}(t, \vec{y})] = 0, \qquad [\hat{\phi}(t, \vec{x}), \hat{\pi}(t, \vec{y})] = i\,\delta^{(3)}(\vec{x} - \vec{y})    (4.4)
Here the last commutator is the interesting one; note that the delta function in space is the continuum field
theoretical analogue of the δab in (4.3). Our task now is to build the quantum theory that satisfies this:
understand the quantum version of the Hamiltonian, understand the state space, figure out its eigenvalues,
etc.
This is in general a difficult task; we will do it for the free theory (i.e. that where the potential V(φ) is
quadratic), where the problem turns out to be tractable. To understand why this is, let’s remind ourselves
of the Euler-Lagrange equation for the free real scalar field, which we derived a few lectures ago:

    \left(\partial_\mu\partial^\mu - m^2\right)\phi(t, \vec{x}) = \left(-\partial_t^2 + \vec{\nabla}^2 - m^2\right)\phi(t, \vec{x}) = 0    (4.5)

This is a linear partial differential equation in a system with translational invariance. Whenever you have
translational invariance, it makes sense to go to momentum space, i.e. to write down the following expansion:

    \phi(t, \vec{x}) = \int \frac{d^3k}{(2\pi)^3}\, e^{i\vec{k}\cdot\vec{x}}\, \tilde{\phi}(t, \vec{k})    (4.6)

Note in passing that this can be inverted as follows:

    \tilde{\phi}(t, \vec{k}) = \int d^3x\, e^{-i\vec{k}\cdot\vec{x}}\, \phi(t, \vec{x})    (4.7)

by using the usual Fourier transformation formulas. Now consider acting on this with the Euler-Lagrange
operator; each $\vec{\nabla}^2$ brings down a $-\vec{k}^2$ and we find:

    \left(\partial_t^2 + \vec{k}^2 + m^2\right)\tilde{\phi}(t, \vec{k}) = 0    (4.8)

(Strictly speaking we find the integral of this equation multiplied by $e^{i\vec{k}\cdot\vec{x}}$, but that operator is invertible).
This is exactly the equation for a simple harmonic oscillator with frequency $\omega_{\vec{k}} = \sqrt{\vec{k}^2 + m^2}$:

    \left(\partial_t^2 + \omega_{\vec{k}}^2\right)\tilde{\phi}(t, \vec{k}) = 0    (4.9)
Thus, at least classically, each Fourier mode of the free scalar field is a harmonic oscillator. We do
know how to solve the quantum theory of the harmonic oscillator: so let’s review that and then return to
the field-theoretical problem.
I note that if the potential had not been quadratic in φ this wouldn’t have worked and we would immediately
have been stuck. Indeed solving such “interacting” quantum field theories is a huge and hard problem in
physics.

4.1 Quantizing the simple harmonic oscillator

First, we study the classical harmonic oscillator. The simple harmonic oscillator with frequency ω in one
dimension has a single coordinate q(t) and the following Lagrangian:

    L(q, \dot{q}) = \frac{1}{2}\dot{q}^2 - \frac{1}{2}\omega^2 q^2    (4.10)

We can easily verify that the Euler-Lagrange equations are

    \ddot{q}(t) = -\omega^2 q(t)    (4.11)

The general solution is

    q(t) = \frac{1}{\sqrt{2\omega}}\left( a\,e^{-i\omega t} + b\,e^{+i\omega t} \right)    (4.12)

where a and b are integration constants and the factor of $1/\sqrt{2\omega}$ is for later convenience.
Now: let us recall that q(t) is a real number, so q(t) = q∗(t). Thus we need b = a∗:

    q(t) = \frac{1}{\sqrt{2\omega}}\left( a\,e^{-i\omega t} + a^*\,e^{+i\omega t} \right)    (4.13)

Next, let’s go to the Hamiltonian formalism. From our familiar expressions for the Hamiltonian we have

    p = \frac{\partial L}{\partial\dot{q}} = \dot{q}, \qquad H = p\dot{q} - L = \frac{1}{2}\left(p^2 + \omega^2 q^2\right)    (4.14)

Now, let’s quantize! We promote everything to operators acting on a Hilbert space, so $p \to \hat{p}$ and $q \to \hat{q}$. We
will work in the Heisenberg picture, in which the operators depend on time. The Hamiltonian becomes

    \hat{H} = \frac{1}{2}\left(\hat{p}^2 + \omega^2\hat{q}^2\right)    (4.15)

Now what is the operator analogue of the expansion (4.13)? We write

    \hat{q}(t) = \frac{1}{\sqrt{2\omega}}\left( \hat{a}\,e^{-i\omega t} + \hat{a}^\dagger e^{+i\omega t} \right)    (4.16)

where now $\hat{a}$ and $\hat{a}^\dagger$ are quantum operators that we take to be time-independent. In the context of the
harmonic oscillator they are called ladder operators. In that case we can compute $\hat{p}$ as:

    \hat{p}(t) = \dot{\hat{q}}(t) = -i\sqrt{\frac{\omega}{2}}\left( \hat{a}\,e^{-i\omega t} - \hat{a}^\dagger e^{+i\omega t} \right)    (4.17)

In the first equality I assumed that the classical relation still holds in the quantum theory: technically you
should check this but it is true. (In elementary QM we often work in the Schrödinger picture, so there is
no time dependence; also you can alternatively take this as the definition of $\hat{a}$ and $\hat{a}^\dagger$, which are indeed
time-independent even in the Schrödinger picture.)
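Now the fundamental canonical commutation relations that we need are the following:

    [\hat{q}(t), \hat{p}(t)] = i, \qquad [\hat{q}(t), \hat{q}(t)] = [\hat{p}(t), \hat{p}(t)] = 0    (4.18)

What does this imply for the commutation relations of $\hat{a}$ and $\hat{a}^\dagger$? We can solve the two equations (4.16) and (4.17) for
$\hat{a}$ and its conjugate:

    \hat{a} = \left( \sqrt{\frac{\omega}{2}}\,\hat{q}(t) + \frac{i}{\sqrt{2\omega}}\,\hat{p}(t) \right) e^{+i\omega t}, \qquad \hat{a}^\dagger = \left( \sqrt{\frac{\omega}{2}}\,\hat{q}(t) - \frac{i}{\sqrt{2\omega}}\,\hat{p}(t) \right) e^{-i\omega t}    (4.19)

Working out the commutators by using the equal-time commutation relations for $\hat{p}$ and $\hat{q}$ we find them to
be:

    [\hat{a}, \hat{a}^\dagger] = 1, \qquad [\hat{a}, \hat{a}] = [\hat{a}^\dagger, \hat{a}^\dagger] = 0    (4.20)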
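Next, let’s construct the Hamiltonian. Expressing $\hat{p}$ and $\hat{q}$ in terms of the a’s, we get:

    \hat{H} = \frac{\omega}{2}\left( \hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger \right) = \omega\left( \hat{a}^\dagger\hat{a} + \frac{1}{2} \right)    (4.21)

where in the last equality we used the known commutator to swap the order of $\hat{a}$ and $\hat{a}^\dagger$.
Now we come to the key point of the ladder operators: they diagonalize the Hamiltonian! To see this, let’s
compute the commutator:

    [\hat{H}, \hat{a}^\dagger] = \left[ \omega\left( \hat{a}^\dagger\hat{a} + \frac{1}{2} \right), \hat{a}^\dagger \right] = \omega\,\hat{a}^\dagger[\hat{a}, \hat{a}^\dagger] = \omega\,\hat{a}^\dagger    (4.22)

Similarly, we can compute

    [\hat{H}, \hat{a}] = -\omega\,\hat{a}    (4.23)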
This means that these two operators raise and lower the energy! To understand this, consider an energy
eigenstate |E⟩ that has energy E, i.e. it satisfies the following equation:

    \hat{H}|E\rangle = E|E\rangle    (4.24)

(I stress that in the above expression E is a number, not an operator; this is an eigenvalue equation). Now
let’s consider the state $\hat{a}^\dagger|E\rangle$. What happens if we act on it with the Hamiltonian?

    \hat{H}\hat{a}^\dagger|E\rangle = \left( \hat{a}^\dagger\hat{H} + [\hat{H}, \hat{a}^\dagger] \right)|E\rangle = \left( \hat{a}^\dagger\hat{H} + \omega\hat{a}^\dagger \right)|E\rangle = (E + \omega)\,\hat{a}^\dagger|E\rangle    (4.25)

In other words the state $\hat{a}^\dagger|E\rangle$ is also an eigenstate of the Hamiltonian, with energy E + ω: $\hat{a}^\dagger$ has raised the
energy by the fixed amount ω. Similarly, we can consider the state $\hat{a}|E\rangle$, i.e.

    \hat{H}\hat{a}|E\rangle = \left( \hat{a}\hat{H} + [\hat{H}, \hat{a}] \right)|E\rangle = (E - \omega)\,\hat{a}|E\rangle    (4.26)

and we see that $\hat{a}$ has lowered the energy by an amount ω.

These two operators now allow us to construct the full set of energy eigenstates. We define a state |0⟩ called
the ground state by the condition that $\hat{a}|0\rangle = 0$. Note that the energy of the ground state is:

    \hat{H}|0\rangle = \omega\left( \hat{a}^\dagger\hat{a} + \frac{1}{2} \right)|0\rangle = \frac{\omega}{2}|0\rangle    (4.27)

Now we can construct a tower of excited states by acting repeatedly with the raising operators:

    |n\rangle = (\hat{a}^\dagger)^n|0\rangle, \qquad \hat{H}|n\rangle = \omega\left( n + \frac{1}{2} \right)|n\rangle    (4.28)

This solves the problem of the single simple harmonic oscillator. Now on to quantum field theory, which is
really many (artfully arranged) simple harmonic oscillators.
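The ladder-operator algebra can also be checked numerically. Below is a minimal Python sketch that represents â and â† as matrices on the lowest few levels and verifies that Ĥ = ω(â†â + 1/2) is diagonal with entries ω(n + 1/2) and that [Ĥ, â†] = ωâ†. Two assumptions of mine, not from the notes: the basis states are normalized (so ⟨n−1|â|n⟩ = √n), and the infinite tower is truncated to `n_levels` states, which leaves these particular checks exact but would spoil [â, â†] = 1 in the last row. All function names are hypothetical.

```python
import math

def ladder_matrices(n_levels):
    """Return (a, a_dag) as nested lists in a truncated, normalized number basis."""
    a = [[0.0] * n_levels for _ in range(n_levels)]
    for n in range(1, n_levels):
        a[n - 1][n] = math.sqrt(n)  # matrix element <n-1| a |n> = sqrt(n)
    # The raising operator is the transpose (entries are real).
    a_dag = [[a[j][i] for j in range(n_levels)] for i in range(n_levels)]
    return a, a_dag

def matmul(x, y):
    """Plain matrix product of two square nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def commutator(x, y):
    """Return [x, y] = x y - y x."""
    xy, yx = matmul(x, y), matmul(y, x)
    size = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(size)] for i in range(size)]

def hamiltonian(omega, n_levels):
    """H = omega * (a_dag a + 1/2) on the truncated space."""
    a, a_dag = ladder_matrices(n_levels)
    number = matmul(a_dag, a)
    return [[omega * (number[i][j] + (0.5 if i == j else 0.0))
             for j in range(n_levels)] for i in range(n_levels)]

# Hypothetical sample: frequency 2.0, five levels.
H = hamiltonian(2.0, 5)
print([H[n][n] for n in range(5)])  # energies omega * (n + 1/2), up to rounding
```

The evenly spaced diagonal is the “ladder” of the text: each application of the raising operator moves one rung up, by a fixed amount ω.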

4.2 Quantizing the free complex scalar field

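Now we do something very similar for the free complex scalar field, which has the following classical action:

    S[\phi, \phi^*] = \int d^4x \left[ -(\partial_\mu\phi^*)(\partial^\mu\phi) - m^2\phi^*\phi \right]    (4.29)

First let’s solve the classical system. Recall from earlier that the equation of motion in position and in
momentum space is:

    \left(-\partial_t^2 + \vec{\nabla}^2 - m^2\right)\phi(t, \vec{x}) = 0, \qquad \left(\partial_t^2 + \omega_{\vec{k}}^2\right)\tilde{\phi}(t, \vec{k}) = 0, \qquad \omega_{\vec{k}} = \sqrt{\vec{k}^2 + m^2}    (4.30)

As we described earlier, the momentum space equation of motion is just that of a harmonic oscillator, so
following what we did above, we can now write down the most general solution to the momentum space
equation of motion:

    \tilde{\phi}(\vec{k}, t) = \frac{1}{\sqrt{2\omega_{\vec{k}}}}\left( a_{\vec{k}}\,e^{-i\omega_{\vec{k}} t} + b^*_{-\vec{k}}\,e^{i\omega_{\vec{k}} t} \right)    (4.31)

This is directly analogous to (4.12); here $a_{\vec{k}}$, $b_{\vec{k}}$ are momentum-dependent Fourier coefficients. Now we go to
position space by:

    \phi(\vec{x}, t) = \int \frac{d^3k}{(2\pi)^3}\frac{1}{\sqrt{2\omega_{\vec{k}}}}\left( a_{\vec{k}}\,e^{-i\omega_{\vec{k}} t} + b^*_{-\vec{k}}\,e^{i\omega_{\vec{k}} t} \right) e^{+i\vec{k}\cdot\vec{x}}    (4.32)

Now it is conventional to make the following change of variables in the second term: $\vec{k} \to -\vec{k}$. In that case
this flips all the k’s on that term and we find the following slightly nicer expression:

    \phi(\vec{x}, t) = \int \frac{d^3k}{(2\pi)^3}\frac{1}{\sqrt{2\omega_{\vec{k}}}}\left( a_{\vec{k}}\,e^{-i\omega_{\vec{k}} t + i\vec{k}\cdot\vec{x}} + b^*_{\vec{k}}\,e^{i\omega_{\vec{k}} t - i\vec{k}\cdot\vec{x}} \right)    (4.33)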

This is nicer as both expressions in the exponent are now of a Lorentz-invariant form.
Note that for the harmonic oscillator above q was real, and so there was a relation between a and b. Similarly,
if we were doing the real scalar field here we would find such a relation, but for the complex scalar field they
are independent.
Now we are getting ready to quantize! First, let’s work out the Hamiltonian. If we expand out the Lagrangian
in a particular rest frame we find

    \mathcal{L} = \dot{\phi}\dot{\phi}^* - (\vec{\nabla}\phi\cdot\vec{\nabla}\phi^*) - m^2\phi^*\phi    (4.34)

Now we need to determine the canonical momenta. Recall that φ and φ∗ are independent variables: so the
momenta conjugate to φ and φ∗ respectively are:

    \pi_\phi = \frac{\partial\mathcal{L}}{\partial\dot{\phi}} = \dot{\phi}^*, \qquad \pi_{\phi^*} = \frac{\partial\mathcal{L}}{\partial\dot{\phi}^*} = \dot{\phi}    (4.35)

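For notational convenience from now on I will refer to these as

    \pi = \pi_\phi, \qquad \pi^* = \pi_{\phi^*}    (4.36)

and now the Hamiltonian density is

    \mathcal{H} = \pi\dot{\phi} + \pi^*\dot{\phi}^* - \mathcal{L} = \dot{\phi}\dot{\phi}^* + (\vec{\nabla}\phi\cdot\vec{\nabla}\phi^*) + m^2\phi^*\phi    (4.37)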

4.2.1 Operator-valued Fourier expansions

It is now time to quantize! The idea is the fields φ(t, ~x) and π(t, ~x) acquire hats and become operators in a
Hilbert space!
φ(t, ~x) → φ̂(t, ~x) (4.38)
Similarly for their conjugates (which we now write with a dagger to denote the fact that it is now a Hermitian
conjugate of an operator, and not just the complex conjugate of a complex number):

φ∗ (t, ~x) → φ̂† (t, ~x) (4.39)

So, in the Heisenberg picture, we have the following expansion of the operator:

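$$ \hat\phi(\vec x, t) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\left( \hat a_{\vec k}\, e^{-i\omega_{\vec k} t + i\vec k\cdot\vec x} + \hat b^\dagger_{\vec k}\, e^{i\omega_{\vec k} t - i\vec k\cdot\vec x} \right) \tag{4.40} $$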

where the a’s and b’s are now operators. Similarly, we have:

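$$ \hat\phi^\dagger(\vec x, t) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\left( \hat a^\dagger_{\vec k}\, e^{+i\omega_{\vec k} t - i\vec k\cdot\vec x} + \hat b_{\vec k}\, e^{-i\omega_{\vec k} t + i\vec k\cdot\vec x} \right) \tag{4.41} $$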

This is the Hermitian conjugate of the above expression.


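Let us also work out the Fourier transform of the canonical momenta. We do this the same way as for the harmonic oscillator: we begin with the classical definition:
$$ \hat\pi(\vec x, t) = \dot{\hat\phi}^\dagger(\vec x, t) = i\int \frac{d^3k}{(2\pi)^3} \sqrt{\frac{\omega_{\vec k}}{2}} \left( \hat a^\dagger_{\vec k}\, e^{i\omega_{\vec k} t - i\vec k\cdot\vec x} - \hat b_{\vec k}\, e^{-i\omega_{\vec k} t + i\vec k\cdot\vec x} \right) \tag{4.42} $$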

and finally, same for the conjugate:


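$$ \hat\pi^\dagger(\vec x, t) = \dot{\hat\phi}(\vec x, t) = i\int \frac{d^3k}{(2\pi)^3} \sqrt{\frac{\omega_{\vec k}}{2}} \left( -\hat a_{\vec k}\, e^{-i\omega_{\vec k} t + i\vec k\cdot\vec x} + \hat b^\dagger_{\vec k}\, e^{+i\omega_{\vec k} t - i\vec k\cdot\vec x} \right) \tag{4.43} $$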

Now the point of all of these expansions is that we see that everything is determined by the operators â~k and
b̂~k . It turns out that these operators can again be used to diagonalize the Hamiltonian of the free quantum
field theory and determine its spectrum: they will turn out to have a simple physical interpretation: they
create and annihilate particles!
To understand this, let’s first recall the canonical commutation relations between the scalar field and its
canonical conjugate:
[φ̂(~x, t), π̂(~y , t)] = [φ̂† (~x, t), π̂ † (~y , t)] = iδ (3) (~x − ~y ) (4.44)

All of the rest of the (equal-time) commutation relations are zero:

[φ̂, φ̂] = [φ̂, π̂ † ] = · · · = 0 (4.45)

I would now like to ask the question: what does this commutation relation imply for the commutators of the
a’s and b’s? It turns out that it implies the following relation:

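$$ [\hat a_{\vec k}, \hat a^\dagger_{\vec k'}] = (2\pi)^3 \delta^{(3)}(\vec k - \vec k') \tag{4.46} $$
$$ [\hat b_{\vec k}, \hat b^\dagger_{\vec k'}] = (2\pi)^3 \delta^{(3)}(\vec k - \vec k') \tag{4.47} $$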

with all other commutators zero.


Before proving this, let’s think for a second about what this means: it means that we can imagine each Fourier mode as being two independent harmonic oscillators, one coming from a and one coming from b. The delta function means that as far as this algebra is concerned the Fourier modes don’t talk to each other.

4.2.2 Algebra of creation/annihilation operators

Now let’s prove this. This is the sort of calculation that comes up often in field theory, involving a lot of
manipulation of integrals. Again as this is the first one, we will do it in quite a lot of detail.
We will first invert (4.40) to (4.43) for a, b as a function of φ, π. Before getting into it let’s note the familiar Fourier identity:
$$ \int d^3x\, e^{i\vec p\cdot\vec x} = (2\pi)^3 \delta^{(3)}(\vec p) \tag{4.48} $$
Now we multiply (4.40) with $e^{i\vec k'\cdot\vec x}$ and integrate over all $\vec x$:

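$$ \int d^3x\, e^{+i\vec k'\cdot\vec x}\, \hat\phi(\vec x, t) = \int d^3x \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\left( \hat a_{\vec k}\, e^{-i\omega_{\vec k} t + i(\vec k + \vec k')\cdot\vec x} + \hat b^\dagger_{\vec k}\, e^{i\omega_{\vec k} t - i(\vec k - \vec k')\cdot\vec x} \right) \tag{4.49} $$
$$ = \int d^3k\, \frac{1}{\sqrt{2\omega_{\vec k}}}\left( \hat a_{\vec k}\, e^{-i\omega_{\vec k} t}\, \delta^{(3)}(\vec k + \vec k') + \hat b^\dagger_{\vec k}\, e^{i\omega_{\vec k} t}\, \delta^{(3)}(\vec k - \vec k') \right) \tag{4.50} $$
$$ = \frac{1}{\sqrt{2\omega_{\vec k'}}}\left( \hat a_{-\vec k'}\, e^{-i\omega_{\vec k'} t} + \hat b^\dagger_{\vec k'}\, e^{i\omega_{\vec k'} t} \right) \tag{4.51} $$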

where it can be helpful to remember that ω−~k = ω~k . We have expressed a sum of a and b† in terms of φ. To
extract a or b by itself we need another linearly independent combination of them. We can find this by doing
some very similar manipulations on π̂ † (~x, t) in (4.43) to obtain
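$$ \int d^3x\, e^{+i\vec k\cdot\vec x}\, \hat\pi^\dagger(\vec x, t) = i\sqrt{\frac{\omega_{\vec k}}{2}}\left( -\hat a_{-\vec k}\, e^{-i\omega_{\vec k} t} + \hat b^\dagger_{\vec k}\, e^{+i\omega_{\vec k} t} \right) \tag{4.52} $$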

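Now we can multiply the first equation by $\sqrt{\omega_{\vec k}}$ and divide the second by $\sqrt{\omega_{\vec k}}$ to extract $\hat a_{\vec k}$ by itself:
$$ \hat a_{\vec k} = \int d^3x\, e^{-i\vec k\cdot\vec x + i\omega_{\vec k} t}\left( \sqrt{\frac{\omega_{\vec k}}{2}}\, \hat\phi(\vec x, t) + i\hat\pi^\dagger(\vec x, t)\, \frac{1}{\sqrt{2\omega_{\vec k}}} \right) \tag{4.53} $$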

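(note I took $\vec k \to -\vec k$ in order to get $\hat a_{\vec k}$ rather than $\hat a_{-\vec k}$). Similarly we can extract $\hat b^\dagger_{\vec k}$ by taking the opposite linear combination:
$$ \hat b^\dagger_{\vec k} = \int d^3x\, e^{+i\vec k\cdot\vec x - i\omega_{\vec k} t}\left( \sqrt{\frac{\omega_{\vec k}}{2}}\, \hat\phi(\vec x, t) - i\hat\pi^\dagger(\vec x, t)\, \frac{1}{\sqrt{2\omega_{\vec k}}} \right) \tag{4.54} $$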
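And as expected we can obtain the corresponding expressions for the daggered operators by taking the Hermitian conjugate, i.e.
$$ \hat a^\dagger_{\vec k} = \int d^3x\, e^{+i\vec k\cdot\vec x - i\omega_{\vec k} t}\left( \sqrt{\frac{\omega_{\vec k}}{2}}\, \hat\phi^\dagger(\vec x, t) - i\hat\pi(\vec x, t)\, \frac{1}{\sqrt{2\omega_{\vec k}}} \right) \tag{4.55} $$
$$ \hat b_{\vec k} = \int d^3x\, e^{-i\vec k\cdot\vec x + i\omega_{\vec k} t}\left( \sqrt{\frac{\omega_{\vec k}}{2}}\, \hat\phi^\dagger(\vec x, t) + i\hat\pi(\vec x, t)\, \frac{1}{\sqrt{2\omega_{\vec k}}} \right) \tag{4.56} $$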

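Okay, now let us consider the equal-time commutator of $\hat a_{\vec k}$ with $\hat a^\dagger_{\vec k'}$. We have:
$$ [\hat a_{\vec k}, \hat a^\dagger_{\vec k'}] = \frac{1}{2}\int d^3x\, d^3x'\, e^{it(\omega_{\vec k} - \omega_{\vec k'}) - i\vec k\cdot\vec x + i\vec k'\cdot\vec x'} \times \Big( \sqrt{\omega_{\vec k}\omega_{\vec k'}}\, [\hat\phi(\vec x, t), \hat\phi^\dagger(\vec x', t)] - i\sqrt{\frac{\omega_{\vec k}}{\omega_{\vec k'}}}\, [\hat\phi(\vec x, t), \hat\pi(\vec x', t)] + i\sqrt{\frac{\omega_{\vec k'}}{\omega_{\vec k}}}\, [\hat\pi^\dagger(\vec x, t), \hat\phi^\dagger(\vec x', t)] + \frac{1}{\sqrt{\omega_{\vec k}\omega_{\vec k'}}}\, [\hat\pi^\dagger(\vec x, t), \hat\pi(\vec x', t)] \Big) \tag{4.57} $$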

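The commutators of φ with φ† and π with π† automatically vanish. The other two give two delta functions: from (4.44), $[\hat\phi, \hat\pi] = i\delta^{(3)}$ and $[\hat\pi^\dagger, \hat\phi^\dagger] = -i\delta^{(3)}$, so both surviving terms come with a plus sign, and we find:
$$ [\hat a_{\vec k}, \hat a^\dagger_{\vec k'}] = \frac{1}{2}\int d^3x\, d^3x'\, e^{it(\omega_{\vec k} - \omega_{\vec k'}) - i\vec k\cdot\vec x + i\vec k'\cdot\vec x'} \left( \sqrt{\frac{\omega_{\vec k}}{\omega_{\vec k'}}}\, \delta^{(3)}(\vec x - \vec x') + \sqrt{\frac{\omega_{\vec k'}}{\omega_{\vec k}}}\, \delta^{(3)}(\vec x - \vec x') \right) \tag{4.58} $$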

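Now we can use the delta function to do the $\vec x'$ integral, setting $\vec x'$ to $\vec x$ everywhere:
$$ [\hat a_{\vec k}, \hat a^\dagger_{\vec k'}] = \frac{1}{2}\int d^3x\, e^{it(\omega_{\vec k} - \omega_{\vec k'}) - i\vec k\cdot\vec x + i\vec k'\cdot\vec x} \left( \sqrt{\frac{\omega_{\vec k}}{\omega_{\vec k'}}} + \sqrt{\frac{\omega_{\vec k'}}{\omega_{\vec k}}} \right) \tag{4.59} $$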

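and now the remaining $\vec x$ integral gives us a delta function, i.e.
$$ [\hat a_{\vec k}, \hat a^\dagger_{\vec k'}] = \frac{1}{2}(2\pi)^3 \delta^{(3)}(\vec k - \vec k')\, e^{it(\omega_{\vec k} - \omega_{\vec k'})} \left( \sqrt{\frac{\omega_{\vec k}}{\omega_{\vec k'}}} + \sqrt{\frac{\omega_{\vec k'}}{\omega_{\vec k}}} \right) \tag{4.60} $$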

Finally, because everything is multiplying a delta function which only has support when $\vec k = \vec k'$, we can freely set them to be equal to each other. The two square-rooted expressions each become 1 and together cancel the 2; thus we find eventually the desired result:
$$ [\hat a_{\vec k}, \hat a^\dagger_{\vec k'}] = (2\pi)^3 \delta^{(3)}(\vec k - \vec k') \tag{4.61} $$

I will leave it to you to check the only other nonzero commutator is
$$ [\hat b_{\vec k}, \hat b^\dagger_{\vec k'}] = (2\pi)^3 \delta^{(3)}(\vec k - \vec k') \tag{4.62} $$

with all else vanishing.

4.2.3 Hamiltonian and the energy of the vacuum

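We now want to write the Hamiltonian of our quantum theory in terms of these creation and annihilation operators. Integrating the Hamiltonian density (4.37) over space, and using $\pi = \dot\phi^*$, we have that the classical Hamiltonian is:
$$ H = \int d^3x \left( \pi\pi^* + \vec\nabla\phi\cdot\vec\nabla\phi^* + m^2\phi^*\phi \right) \tag{4.63} $$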

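We now want to figure out the quantum Hamiltonian, and we do this by simply making each of these classical variables into operators:
$$ \hat H = \int d^3x \left( \hat\pi\hat\pi^\dagger + \vec\nabla\hat\phi\cdot\vec\nabla\hat\phi^\dagger + m^2\hat\phi^\dagger\hat\phi \right) \tag{4.64} $$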

Now writing this in terms of the creation and annihilation operators is one of those “straightforward but
tedious” exercises that you will do for your homework. After the dust settles you find the following answer

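$$ \hat H = \int \frac{d^3k}{(2\pi)^3}\, \omega_{\vec k} \left( \hat a^\dagger_{\vec k}\hat a_{\vec k} + \hat b_{\vec k}\hat b^\dagger_{\vec k} \right) \tag{4.65} $$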

which I now rewrite using the commutator of b and b† to be:

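$$ \hat H = \int \frac{d^3k}{(2\pi)^3}\, \omega_{\vec k} \left( \hat a^\dagger_{\vec k}\hat a_{\vec k} + \hat b^\dagger_{\vec k}\hat b_{\vec k} + (2\pi)^3\delta^{(3)}(0) \right) \tag{4.66} $$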

We see that we are instructed to evaluate the delta function at zero, where it’s infinite. This is really rather
confusing (and is maybe the first of many infinities in this subject).
For now, let us build the Hilbert space. We begin by constructing the vacuum state $|0\rangle$ in precisely the same way as we did for the simple harmonic oscillator: the vacuum is defined to be the state that satisfies:

$$ \hat a_{\vec k}|0\rangle = 0 \qquad \hat b_{\vec k}|0\rangle = 0 \qquad \text{for all } \vec k \tag{4.67} $$

In other words, it is the state that is empty of particles. This is the state that is meant to be empty space.
Now let’s answer the first question that we can do with this formalism: what is the energy of empty space,
i.e. what happens if we act with Ĥ on this state? We have

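$$ \hat H|0\rangle = \int \frac{d^3k}{(2\pi)^3}\, \omega_{\vec k} \left( \hat a^\dagger_{\vec k}\hat a_{\vec k} + \hat b^\dagger_{\vec k}\hat b_{\vec k} + (2\pi)^3\delta^{(3)}(0) \right)|0\rangle = \int \frac{d^3k}{(2\pi)^3}\, \omega_{\vec k}\, (2\pi)^3\delta^{(3)}(0)\, |0\rangle \tag{4.68} $$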

In other words, the vacuum is indeed an energy eigenstate with an eigenvalue E0 but this eigenvalue is

$$ E_0 = \int \frac{d^3k}{(2\pi)^3}\, \omega_{\vec k}\, (2\pi)^3\delta^{(3)}(0) \tag{4.69} $$

and thus the energy of the vacuum appears to be infinite.


This may appear perplexing. Let’s disentangle a few infinities. First, the $\delta^{(3)}(0)$: this actually turns out to be related to the infinite volume of space. More precisely, consider doing the following regulated integral:
$$ (2\pi)^3\delta^{(3)}(\vec p) = \lim_{V\to\infty} \int_V d^3x\, e^{i\vec p\cdot\vec x} \tag{4.70} $$
This means that in a finite box of volume $V$, $\delta^{(3)}(0)$ should be interpreted as
$$ (2\pi)^3\delta^{(3)}(0) = \lim_{V\to\infty} \int_V d^3x = \lim_{V\to\infty} V \tag{4.71} $$

So one source of infinity comes from the fact that space is infinite; let’s divide out by this to compute a vacuum energy density:
$$ \epsilon_0 = \frac{E_0}{V} = \int \frac{d^3k}{(2\pi)^3}\, \omega_{\vec k} \left( \frac{1}{2} + \frac{1}{2} \right) \tag{4.72} $$
Recall the zero point energy (4.27) for the single harmonic oscillator. This is exactly the same, except that it
receives a contribution from each momentum mode for both the a’s and the b’s and thus is extremely infinite.
There is no obvious way to get rid of this – what should we do?
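To get a feeling for how badly (4.72) diverges, here is a minimal numerical sketch. It assumes we regulate the integral by hand with a sharp momentum cutoff $|\vec k| < \Lambda$; the cutoff, the function names and the parameter values are illustrative choices and are not part of the notes. With the cutoff in place the integral is finite and grows like $\Lambda^4$.

```python
import math

def vacuum_energy_density(cutoff, mass, steps=20000):
    """Approximate the integral of d^3k/(2 pi)^3 * omega_k over |k| < cutoff.

    Uses spherical symmetry (d^3k -> 4 pi k^2 dk) and Simpson's rule.
    The sharp cutoff is an assumption made only to regulate the integral.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = cutoff / steps

    def integrand(k):
        return k * k * math.sqrt(k * k + mass * mass)

    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    radial = total * h / 3.0
    return radial / (2.0 * math.pi ** 2)  # 4 pi / (2 pi)^3 = 1 / (2 pi^2)

def vacuum_energy_density_exact(cutoff, mass):
    """Closed form of the same cutoff integral, used as a cross-check."""
    radial = (cutoff * (2 * cutoff ** 2 + mass ** 2) * math.sqrt(cutoff ** 2 + mass ** 2)
              - mass ** 4 * math.asinh(cutoff / mass)) / 8.0
    return radial / (2.0 * math.pi ** 2)

# Doubling the cutoff multiplies the answer by roughly 2^4 = 16.
ratio = vacuum_energy_density(200.0, 1.0) / vacuum_energy_density(100.0, 1.0)
print(ratio)
```

The ratio approaching 16 is the quartic growth: removing the cutoff sends the energy density to infinity no matter how small the mass is.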
One thing we can do is the following: we can simply ignore it: after all, there is a sense in which “only
energy differences are measurable”. This is thus a good time to introduce the idea of normal ordering.
Given an operator O which is made up out of a product of field operators φ(~x1 )φ(~x2 ) · · · etc. we define its
normal ordered version : O : to be as follows: it is the regular product except that we expand it out and
place all of the annihilation operators (i.e. a, b) to the right of the creation ones (i.e. a† , b† ).
Thus the normal ordered version of the Hamiltonian above would be:
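$$ :\hat H: \; = \int \frac{d^3k}{(2\pi)^3}\, \omega_{\vec k} \left( \hat a^\dagger_{\vec k}\hat a_{\vec k} + \hat b^\dagger_{\vec k}\hat b_{\vec k} \right) \tag{4.73} $$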

and thus the normal-ordered version of the Hamiltonian indeed annihilates the ground state $|0\rangle$:

$$ :\hat H: |0\rangle = 0 \tag{4.74} $$

and we have lost this pesky vacuum energy.


This may feel ad-hoc, and I suppose there is a sense where it is. It may make you feel better to realize that such shifts in the Hamiltonian can also arise in the mapping from classical to quantum Hamiltonians: for example, classically the Hamiltonian $\frac{1}{2}(\omega q - ip)(\omega q + ip)$ is exactly the same as $\frac{1}{2}(\omega^2 q^2 + p^2)$, but quantum mechanically the first one is $\omega a^\dagger a$ and the second one is $\omega\left(a^\dagger a + \frac{1}{2}\right)$. So the issue that we are stressing about arises from a sort of ambiguity in passing from the classical to the quantum theory.
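The ordering ambiguity for the single oscillator can be checked with small matrices. The sketch below assumes the usual conventions $q = (a + a^\dagger)/\sqrt{2\omega}$ and $p = -i\sqrt{\omega/2}\,(a - a^\dagger)$, and truncates the ladder operators to a finite number basis; the truncation size and the value of ω are arbitrary illustrative choices.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(A, B, ca, cb):
    """Return ca*A + cb*B entry by entry."""
    n = len(A)
    return [[ca * A[i][j] + cb * B[i][j] for j in range(n)] for i in range(n)]

def oscillator_hamiltonians(omega, size):
    # Truncated ladder operators in the number basis: a|n> = sqrt(n)|n-1>.
    a = [[math.sqrt(j) if i == j - 1 else 0.0 for j in range(size)]
         for i in range(size)]
    adag = [[a[j][i] for j in range(size)] for i in range(size)]
    q = combine(a, adag, 1 / math.sqrt(2 * omega), 1 / math.sqrt(2 * omega))
    p = combine(a, adag, -1j * math.sqrt(omega / 2), 1j * math.sqrt(omega / 2))
    # Ordering 1: (1/2)(omega q - i p)(omega q + i p)
    left = combine(q, p, omega, -1j)
    right = combine(q, p, omega, 1j)
    h_factored = [[0.5 * x for x in row] for row in matmul(left, right)]
    # Ordering 2: (1/2)(p^2 + omega^2 q^2)
    h_standard = combine(matmul(p, p), matmul(q, q), 0.5, 0.5 * omega ** 2)
    return h_factored, h_standard

h1, h2 = oscillator_hamiltonians(omega=1.3, size=8)
# Skip the last basis state, whose matrix elements are distorted by the truncation.
shift = [(h2[n][n] - h1[n][n]).real for n in range(7)]
print(shift)  # every entry is close to omega / 2 = 0.65
```

The two classically identical expressions differ by the constant ω/2 on every state, which is exactly the zero-point energy that normal ordering throws away.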
Due to lack of time I won’t discuss this further, but there is a lot of beautiful physics in this vacuum energy
– for example, Google Casimir effect to understand an experimentally measurable manifestation of this.

4.2.4 Single particle states

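Let’s move on past the vacuum to consider single-particle states. First, we will compute the commutator:

$$ [\hat H, \hat a^\dagger_{\vec k}] = \int \frac{d^3k'}{(2\pi)^3}\, \omega_{\vec k'} \left[ \hat a^\dagger_{\vec k'}\hat a_{\vec k'}, \hat a^\dagger_{\vec k} \right] \tag{4.75} $$
$$ = \int \frac{d^3k'}{(2\pi)^3}\, \omega_{\vec k'}\, \hat a^\dagger_{\vec k'} [\hat a_{\vec k'}, \hat a^\dagger_{\vec k}] \tag{4.76} $$
$$ = \int \frac{d^3k'}{(2\pi)^3}\, (2\pi)^3\, \omega_{\vec k'}\, \hat a^\dagger_{\vec k'}\, \delta^{(3)}(\vec k' - \vec k) \tag{4.77} $$
$$ = \omega_{\vec k}\, \hat a^\dagger_{\vec k} \tag{4.78} $$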

which is familiar from the SHO discussion. Similarly, we have

[Ĥ, â~k ] = −ω~k â~k (4.79)

and exactly the same expressions for the b’s:

$$ [\hat H, \hat b^\dagger_{\vec k}] = \omega_{\vec k}\, \hat b^\dagger_{\vec k} \qquad [\hat H, \hat b_{\vec k}] = -\omega_{\vec k}\, \hat b_{\vec k} \tag{4.80} $$

(Note the normal ordered Hamiltonian differs from this by an infinite constant, but that does not affect this commutation relation).

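Now, let’s consider an excited state $\hat a^\dagger_{\vec k}|0\rangle$. We compute its energy to be:
$$ :\hat H: \hat a^\dagger_{\vec k}|0\rangle = \left( \omega_{\vec k}\, \hat a^\dagger_{\vec k} + \hat a^\dagger_{\vec k} :\hat H: \right)|0\rangle = \omega_{\vec k}\, \hat a^\dagger_{\vec k}|0\rangle \tag{4.81} $$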

In other words, the energy of this state is
$$ \omega_{\vec k} = \sqrt{\vec k^2 + m^2} \tag{4.82} $$
which is precisely the energy of a single relativistic particle with momentum $\vec k$. We have created a particle!
Similarly, we can show by an identical calculation that the energy of the state $\hat b^\dagger_{\vec k}|0\rangle$ is also $\omega_{\vec k}$.

It is now good to make precise our interpretation of what the a’s do versus the b’s. Recall from a few lectures ago that the complex scalar field has a U(1) symmetry which is $\phi \to e^{i\alpha}\phi$. We constructed a charge Q from the Noether procedure for that symmetry. Classically that charge was (where here I have adjusted the overall sign for convenience):
$$ Q = -i\int d^3x \left[ (\partial_t\phi^*)\phi - \phi^*(\partial_t\phi) \right] \tag{4.83} $$

which we can now promote to the quantum theory as:


$$ \hat Q = -i\int d^3x \left[ \hat\pi\hat\phi - \hat\phi^\dagger\hat\pi^\dagger \right] \tag{4.84} $$

Plugging in the Fourier expansions of the fields we ultimately find – after doing a computation which may
or may not be homework – that in normal ordered form this is
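$$ :\hat Q: \; = \int \frac{d^3k}{(2\pi)^3} \left( \hat a^\dagger_{\vec k}\hat a_{\vec k} - \hat b^\dagger_{\vec k}\hat b_{\vec k} \right) \tag{4.85} $$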

This is the U (1) charge. Note the crucial sign difference as compared to the Hamiltonian. The “b”-type
particles have opposite charge to the “a”-type particles.
By Noether’s theorem, this charge is classically time-independent. In the quantum theory, this means that

[Q̂, Ĥ] = 0 (4.86)

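which means that it is possible to diagonalize the Hamiltonian and the charge simultaneously. Indeed if we act with $\hat Q$ on the vacuum and on the two single-particle states above we find:

$$ :\hat Q: |0\rangle = 0 \qquad :\hat Q: \hat a^\dagger_{\vec k}|0\rangle = (+1)\, \hat a^\dagger_{\vec k}|0\rangle \qquad :\hat Q: \hat b^\dagger_{\vec k}|0\rangle = (-1)\, \hat b^\dagger_{\vec k}|0\rangle \tag{4.87} $$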

In other words, these are all charge eigenstates. The a-type state has positive charge +1, but the b-type state has negative charge −1. Thus a† creates particles, and b† creates anti-particles.
Finally, if you recall, in the last Problems Class we constructed the (classical) operator that measures momentum $P^i$, by finding the Noether charge for translational invariance. We can do this for the quantum theory as well and show that we have
$$ \hat P^i\, \hat a^\dagger_{\vec k}|0\rangle = k^i\, \hat a^\dagger_{\vec k}|0\rangle \tag{4.88} $$

Let me summarize what we have learned:

1. The vacuum $|0\rangle$ is empty; provided we measure energy and charge by the normal-ordered versions, it has no energy and no charge.

2. The state $\hat a^\dagger_{\vec k}|0\rangle$ is a single-particle state. It has energy $\omega_{\vec k}$, mass $m$, momentum $k^i$, charge +1.

3. The state $\hat b^\dagger_{\vec k}|0\rangle$ is a single-anti-particle state. It has energy $\omega_{\vec k}$, mass $m$, momentum $k^i$, charge −1.

Note that these particles are momentum eigenstates, which means, via the regular rules of quantum mechanics, that they are completely delocalized in space.

4.2.5 The Fock space

We have just discussed a few very special excited states. Let’s now consider the full space of multi-particle states, which we can create by acting multiple times with the creation operators, e.g. imagine the following state with $N_a$ particles and $N_b$ antiparticles:

$$ |\{\vec p_1, \vec p_2, \cdots, \vec p_{N_a}\}; \{\vec q_1, \vec q_2, \cdots, \vec q_{N_b}\}\rangle = \hat a^\dagger_{\vec p_1}\hat a^\dagger_{\vec p_2}\cdots\hat a^\dagger_{\vec p_{N_a}}\, \hat b^\dagger_{\vec q_1}\hat b^\dagger_{\vec q_2}\cdots\hat b^\dagger_{\vec q_{N_b}}|0\rangle \tag{4.89} $$

In this context we can see that the combination $\hat a^\dagger_{\vec k}\hat a_{\vec k}$ appearing in the expression for the Hamiltonian and charge simply counts “the number of particles with momentum $\vec k$”. To compute the charge we simply add this up over all momenta (and subtract the corresponding quantity for anti-particles). To compute the energy we add this up, weighted by a factor $\omega_{\vec k}$, for all particles and anti-particles.
Indeed this is an energy and charge eigenstate with the following eigenvalues:
$$ Q = N_a - N_b \qquad H = \sum_{i=1}^{N_a} \omega_{\vec p_i} + \sum_{j=1}^{N_b} \omega_{\vec q_j} \tag{4.90} $$

As promised, we have constructed a space of states that allows particle number to change. The state space
created by all possible ways of acting with creation operators on the vacuum is called the Fock space. We see
that roughly speaking a basis for the Fock space is labeled by a set of integers n~ki , two for each momentum.
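The bookkeeping in (4.89) and (4.90) can be sketched in a few lines. This is only an illustration under stated assumptions: momenta are represented as tuples of numbers, the mass value is arbitrary, and the function names are made up for this example. A basis state is stored as two sets of occupation numbers, one for particles and one for antiparticles.

```python
import math
from collections import Counter

MASS = 1.0  # particle mass in natural units (illustrative value)

def omega(p):
    """Relativistic energy of a single quantum with 3-momentum p."""
    return math.sqrt(sum(c * c for c in p) + MASS ** 2)

def fock_state(particles, antiparticles):
    """Store a basis state as occupation numbers for each momentum.

    A Counter does not remember the order in which the momenta were listed,
    which matches the fact that the creation operators commute.
    """
    return Counter(particles), Counter(antiparticles)

def charge(state):
    n_a, n_b = state
    return sum(n_a.values()) - sum(n_b.values())

def energy(state):
    n_a, n_b = state
    return (sum(n * omega(p) for p, n in n_a.items())
            + sum(n * omega(q) for q, n in n_b.items()))

# Three particles (two of them in the same mode) and one antiparticle.
state = fock_state([(0, 0, 1), (0, 0, 1), (1, 0, 0)], [(0, 2, 0)])
print(charge(state), energy(state))  # charge is 3 - 1 = 2
```

The occupation numbers are the integers that label the Fock basis: one for each momentum and each species.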
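Finally, let’s consider the two-particle state, labeled by two momenta:

$$ |\vec p_1, \vec p_2\rangle = \hat a^\dagger_{\vec p_1}\hat a^\dagger_{\vec p_2}|0\rangle \tag{4.91} $$

What happens if we interchange the two particles? Because the two a†’s commute, we have:

$$ |\vec p_2, \vec p_1\rangle = \hat a^\dagger_{\vec p_2}\hat a^\dagger_{\vec p_1}|0\rangle = \hat a^\dagger_{\vec p_1}\hat a^\dagger_{\vec p_2}|0\rangle = |\vec p_1, \vec p_2\rangle \tag{4.92} $$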

In other words, because the a† ’s commute with one another, the state is exactly unchanged if we interchange
the two particles – there is no meaning to which one we write “first”. Note that there is no way whatsoever
to distinguish the two particles; this is quite different from classical physics. We say that the particles are
bosons, after Bengali physicist Satyendranath Bose5 .
5 I will now tell you about this physicist, who grew up in Kolkata in India, later moving to Dhaka University, which is now in Bangladesh. (Dhaka is the city where I grew up.) Bose studied the problem of black-body radiation in quantum systems and found that the quantum statistics that we describe by the formula above resulted in the correct prefactor for Planck’s law. He sent his results to Einstein, who realized their importance. He translated them to German and had them published under Bose’s name in Zeitschrift für Physik; this is why in quantum statistical mechanics the corresponding finite-temperature distribution is referred to as Bose-Einstein statistics. It results in many striking phenomena (you may have heard of Bose-Einstein condensation?), and is one of exactly two classes of quantum statistics realized in nature – we will get to fermions in a few weeks.

Finally, we have discussed creating particles that have definite momentum. We may also be interested in creating particles that have a definite position. As usual in quantum mechanics, we do this by adding up momentum eigenstates in the correct fashion: in our case, this is already done with our original field $\hat\phi(\vec x, t)$:
$$ \hat\phi(\vec x, t) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\left( \hat a_{\vec k}\, e^{-i\omega_{\vec k} t + i\vec k\cdot\vec x} + \hat b^\dagger_{\vec k}\, e^{i\omega_{\vec k} t - i\vec k\cdot\vec x} \right) \tag{4.93} $$

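Imagine acting with this field on the vacuum $|0\rangle$, to obtain the state:
$$ \hat\phi(\vec x, t)|0\rangle = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\, \hat b^\dagger_{\vec k}\, e^{i\omega_{\vec k} t - i\vec k\cdot\vec x}\, |0\rangle \tag{4.94} $$
We see that the $\hat a$ term annihilates the vacuum, but the term in $\hat b^\dagger_{\vec k}$ will create a superposition of anti-particles, summing over all momenta with wavefunction $e^{-i\vec k\cdot\vec x}$. Thus $\hat\phi(\vec x, t)$ creates an anti-particle at position $\vec x$, $t$.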
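Similarly, if we act with $\hat\phi^\dagger(\vec x, t)$:
$$ \hat\phi^\dagger(\vec x, t) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\left( \hat a^\dagger_{\vec k}\, e^{+i\omega_{\vec k} t - i\vec k\cdot\vec x} + \hat b_{\vec k}\, e^{-i\omega_{\vec k} t + i\vec k\cdot\vec x} \right) \tag{4.95} $$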

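Now the $\hat b$ will annihilate itself against the vacuum, and we end up with a sum over $\hat a^\dagger_{\vec k}$’s:

$$ \hat\phi^\dagger(\vec x, t)|0\rangle = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\, \hat a^\dagger_{\vec k}\, e^{+i\omega_{\vec k} t - i\vec k\cdot\vec x}\, |0\rangle \tag{4.96} $$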

This operator creates a particle at position ~x, t.

4.3 Propagators and causality

We are now ready to ask some questions about how particles propagate from one point to another. As it
turns out, there are multiple notions of propagator. Before getting into this, however, I want to discuss a
brief mathematical interlude: how do we define a Lorentz-invariant measure on integrating over 3-momenta?
From now on a slight notational change: I am going to stop putting the hats on the operators, as we basically
only have quantum operators from now on. It should be clear from context whether something is an operator
or a number.

4.3.1 Lorentz-invariant integration measures

As it turns out, the integral $\int d^3k$ is not Lorentz invariant; if I imagine that the 3-momenta $\vec k$ are part of a 4-vector $k_\mu$ then the integration measure above transforms by factors of γ under a boost. Let us improve on this by constructing an invariant measure. To do this, consider the following manifestly Lorentz invariant integral:
$$ \int d^4k\, \delta(-k_\mu k^\mu - m^2) \tag{4.97} $$

Now this is an integral over 4-momenta with a constraint that the magnitude of the 4-momenta be equal to (timelike) $m^2$. Note that this defines a 3-dimensional hyperboloid in momentum space.
Let us do the integral over $k_0$ first:
$$ \int d^3k\, dk_0\, \delta(k_0^2 - \omega_{\vec k}^2) \tag{4.98} $$

We can solve the delta function by picking one or the other of:
$$ k_0 = \pm\sqrt{\vec k^2 + m^2} = \pm\omega_{\vec k} \tag{4.99} $$

Consider taking the positive sign for simplicity. Now using the basic rules for delta functions we find
$$ \int d^3k\, dk_0\, \delta(k_0 - \omega_{\vec k}) \left( \frac{dk_0^2}{dk_0} \right)^{-1} = \int \frac{d^3k}{2\omega_{\vec k}} \tag{4.100} $$

Thus the point is that the measure
$$ \int \frac{d^3k}{2\omega_{\vec k}} \tag{4.101} $$
is a Lorentz-invariant measure. It is the integral over the hyperboloid in momentum space discussed earlier, and it will often appear in calculations.
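Here is a quick numerical sanity check of this claim, as a minimal sketch. It assumes a boost along the x-direction with velocity v (in units with c = 1); the momentum, mass and velocity values are arbitrary. Since the boost leaves $k_y$ and $k_z$ alone, the Jacobian of $d^3k$ is just $\partial k_x'/\partial k_x$, which we estimate by a finite difference and compare with $\omega'/\omega$.

```python
import math

def energy(kx, ky, kz, mass):
    return math.sqrt(kx * kx + ky * ky + kz * kz + mass * mass)

def boost_x(kx, ky, kz, mass, v):
    """Boost an on-shell momentum along x with velocity v (|v| < 1, c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    w = energy(kx, ky, kz, mass)
    return gamma * (kx + v * w), ky, kz

def jacobian(kx, ky, kz, mass, v, h=1e-6):
    """d(kx')/d(kx) at fixed ky, kz, by a central finite difference.

    ky and kz are untouched by the boost, so this is the full Jacobian of d^3k.
    """
    plus = boost_x(kx + h, ky, kz, mass, v)[0]
    minus = boost_x(kx - h, ky, kz, mass, v)[0]
    return (plus - minus) / (2 * h)

k = (0.7, -0.2, 1.1)
m, v = 1.0, 0.6
k_boosted = boost_x(*k, m, v)
ratio_measure = jacobian(*k, m, v)                    # d^3k' / d^3k
ratio_energy = energy(*k_boosted, m) / energy(*k, m)  # omega' / omega
print(ratio_measure, ratio_energy)
```

The two printed numbers agree (and are not equal to 1), so $d^3k$ by itself changes under the boost while the combination $d^3k/\omega_{\vec k}$ does not.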

4.3.2 Commutators and the light-cone

Let us now note an uncomfortable fact – though I made a big deal about Lorentz invariance in the first part
of this course, when we passed to the quantum theory we made heavy use of the Hamiltonian formalism, and
we quantized by picking a rest frame. This was actually quite important, as the basic starting point was the
commutation relation
$$ [\phi(\vec x, t), \pi(\vec y, t)] = i\delta^{(3)}(\vec x - \vec y), \tag{4.102} $$
which took place at equal times in that frame.
It is not at all clear that the resulting quantum theory is really Lorentz invariant; perhaps our quantization
procedure destroyed this.6
Let us consider a diagnostic. First, let me remind you of the idea of the light-cone: if an event happens at x, then the set of events that it can influence happen on spacetime points y that are timelike separated from it, i.e. where $(x - y)^2 < 0$. This forms the interior of a cone in $\mathbb{R}^{1,3}$.
If our theory is Lorentz-invariant, then no information should propagate faster than light. This means that
any two measurements in the quantum theory that happen outside the light-cone should not affect each other:
in other words, any operator O1 should commute with another O2 provided it is spacelike separated from
O2 , i.e.
$$ [O_1(x), O_2(y)] = 0 \quad \text{if } (x - y)^2 > 0 \tag{4.103} $$
Note that the two operators above are defined at different spacetime points, and not at equal times. We
are now in a position to explicitly check this for the complex scalar field. Let’s compute from the Fourier
expansion (4.40) and (4.41):

$$ [\phi(x), \phi^\dagger(y)] = \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec k}} \left( e^{ik\cdot(x-y)} - e^{-ik\cdot(x-y)} \right) \tag{4.104} $$

where I have introduced the 4-momentum k-vector, whose components are

kµ = (−ω~k , ~k) (4.105)

Here the first term comes from the commutator of a and a†, and the second from b and b†. This is not obviously zero. Note that it’s interesting that it is just a number, not an operator like the objects on the left-hand side were7. Note also that it is made up of Lorentz-invariant objects, i.e. the Lorentz-invariant integration measure (4.101) and the 4-vector inner product k · (x − y).

6 This is not actually as academic of a worry as it seems: it is quite possible for a quantum theory to not be invariant under the symmetries of its classical counterpart: this is called a quantum mechanical anomaly, and there is really a sense in which most of the mass of the universe – that of protons and neutrons – arises from such an effect.
7 This particular fact is actually only true for free quantum field theories.

Let us now consider two cases. In the first, let’s imagine that the separation $(x - y)^\mu$ is a timelike vector. In that case we can use Lorentz-invariance to rotate it to point purely in the time direction, i.e. there exists a Lorentz boost Λ so that

$$ \Lambda^\mu{}_\nu (x - y)^\nu = (x'^0 - y'^0, 0, 0, 0)^\mu \tag{4.106} $$

(If the existence of this Lorentz transformation is not obvious, it is instructive to construct the boost that does this yourself!). We may now use Lorentz-invariance to evaluate the right-hand side in this coordinate system to find
$$ \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec k}} \left( e^{-i\omega_{\vec k}(x'^0 - y'^0)} - e^{i\omega_{\vec k}(x'^0 - y'^0)} \right) \tag{4.107} $$
which is some oscillatory function of $x'^0 - y'^0$ that is not zero. (You can show it goes like a sum over $e^{\pm im(x'^0 - y'^0)}$ for large separation).
Now, let’s turn to the interesting case where $(x - y)^\mu$ is spacelike; in that case we can transform it to point purely in the spatial direction, i.e. there exists a (different) Lorentz boost Λ so that

$$ \Lambda^\mu{}_\nu (x - y)^\nu = (0, \vec x' - \vec y')^\mu \tag{4.108} $$

We now perform the computation in this coordinate system to find:

$$ \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\sqrt{\vec k^2 + m^2}} \left( e^{+i\vec k\cdot(\vec x' - \vec y')} - e^{-i\vec k\cdot(\vec x' - \vec y')} \right) \tag{4.109} $$

Now note that we can change the integration variable ~k → −~k in the last term to see that

$$ \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\sqrt{\vec k^2 + m^2}} \left( e^{+i\vec k\cdot(\vec x' - \vec y')} - e^{+i\vec k\cdot(\vec x' - \vec y')} \right) = 0 \tag{4.110} $$

This holds for any spacelike separation where $(x - y)^2 > 0$.8 Thus we conclude that

[φ(x), φ† (y)] = 0 if (x − y) spacelike (4.111)

as required. The same is true for any commutator of any local operators, and is actually true in any Lorentz-
invariant theory, not just the free quantum field theory that we just explicitly solved: quantum field theory
does not let one send information outside the light-cone.

4.3.3 Particle propagation

Let us now ask a slightly different question. Somewhat imprecisely: if we create an anti-particle at one
space-time point xµ = (x0 , ~x), what is the amplitude of finding it at a different spacetime point y µ = (y 0 , ~y )?
As it turns out there are actually several different answers to this question. The most obvious putative answer
to this is given by the following overlap:
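$$ \langle 0|\phi^\dagger(y)\phi(x)|0\rangle \tag{4.112} $$
where from the expressions (4.94) and its Hermitian conjugate we have:

$$ \phi(\vec x, t)|0\rangle = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\, b^\dagger_{\vec k}\, e^{i\omega_{\vec k} t - i\vec k\cdot\vec x}\, |0\rangle \qquad \langle 0|\phi^\dagger(\vec x, t) = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k}}}\, \langle 0|\, b_{\vec k}\, e^{-i\omega_{\vec k} t + i\vec k\cdot\vec x} \tag{4.113} $$

8 It’s helpful to compare this to the above computation and ask what the exact difference is: there is no change of variables that takes $\omega_{\vec k}$ to $-\omega_{\vec k}$.
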
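Putting these in we find:

$$ \langle\phi^\dagger(y)\phi(x)\rangle = \int \frac{d^3k_1}{(2\pi)^3} \frac{d^3k_2}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec k_1}\, 2\omega_{\vec k_2}}}\, e^{-ik_1\cdot x + ik_2\cdot y}\, \langle 0|b_{\vec k_2} b^\dagger_{\vec k_1}|0\rangle, \tag{4.114} $$

where for the 4-vector $k_1$ and $k_2$ we have:

$$ k_{\mu,i} = (-\omega_{\vec k_i}, \vec k_i) \tag{4.115} $$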

Note that we have omitted the choice of the vacuum state in the expectation value; in general whenever you see this notation you should assume that the expectation value is taken in the vacuum, i.e. $\langle 0|X|0\rangle = \langle X\rangle$.
Now commuting the b past its dagger and annihilating it on the vacuum we find:

$$ \langle\phi^\dagger(y)\phi(x)\rangle = \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec k}}\, e^{-ik\cdot(x-y)} \tag{4.116} $$

This expression is sometimes called the propagator or correlator (as it is a correlation function of two
fields). Note again that it’s completely Lorentz-invariant, just like the expression for the commutator I wrote
down above.
It’s quite possible to evaluate it explicitly, though as usual I will not do so here. Let’s just ask a more basic
question: note that unlike the commutator, it does not vanish outside the light-cone. In fact the required
integral can be done via Bessel functions9 and we find that for large spacelike |~x − ~y | → ∞ it behaves as

$$ \langle\phi^\dagger(y)\phi(x)\rangle \sim \exp\left( -m|\vec x - \vec y| \right) \tag{4.117} $$
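If you would like to see the exponential fall-off without doing the integral yourself, here is a rough numerical sketch. It takes as an assumption the standard closed form for the equal-time correlator at spatial separation r, namely $\frac{m}{4\pi^2 r} K_1(mr)$, which is quoted here rather than derived in these notes. The modified Bessel function $K_1$ is evaluated from its integral representation $K_1(z) = \int_0^\infty e^{-z\cosh t}\cosh t\, dt$; the integration range and step count are ad hoc choices.

```python
import math

def bessel_k1(z, t_max=12.0, steps=24000):
    """K_1(z) for z > 0 via the trapezoid rule on its integral representation."""
    h = t_max / steps
    total = 0.5 * (math.exp(-z)
                   + math.exp(-z * math.cosh(t_max)) * math.cosh(t_max))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-z * math.cosh(t)) * math.cosh(t)
    return total * h

def correlator(r, mass):
    """Assumed closed form of the equal-time correlator at separation r."""
    return mass * bessel_k1(mass * r) / (4.0 * math.pi ** 2 * r)

mass = 1.0
# Rate of exponential decay between r = 20 and r = 21, in units of 1/length.
decay_rate = math.log(correlator(20.0, mass)) - math.log(correlator(21.0, mass))
print(decay_rate)  # slightly above mass, because of a power-law prefactor
```

The decay rate approaches m at large separation, with the small excess coming from the power-law prefactor that (4.117) suppresses.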

What does this mean, exactly? The fact that this does not vanish means that there is entanglement in the vacuum of a quantum field theory. One can get some intuition for this by considering a far simpler system, the famous EPR (Einstein-Podolsky-Rosen) state for two entangled spins, which you have probably seen:
$$ |\text{EPR}\rangle = \frac{1}{\sqrt{2}}\left( |\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle \right) \tag{4.118} $$
If you have this state and you compute e.g. $\langle\text{EPR}|\sigma^z_1\sigma^z_2|\text{EPR}\rangle$ you will obtain a non-zero correlation because the spins are correlated by their entanglement (i.e. if you know that spin 1 is up, then you know what spin 2 is doing). Similarly, one can interpret the correlation above as stating that in the vacuum, there is quantum entanglement between different points in space.
Now can we use this entanglement to send information faster than light? We have already seen above through
a direct computation that the answer is no. Let’s try to interpret the answer we found above in this language;
note that staring at the expression for the commutators we have

$$ [\phi(x), \phi^\dagger(y)] = \langle[\phi(x), \phi^\dagger(y)]\rangle = \langle\phi(x)\phi^\dagger(y)\rangle - \langle\phi^\dagger(y)\phi(x)\rangle \tag{4.119} $$

The first equality follows from the fact that the commutator is a number, and so we can evaluate it in any state; the second follows from expanding out the commutator. Now let us wrap some words around the expressions on the right: the first is the amplitude for a particle to propagate from y to x, and the second is the amplitude for an anti-particle to propagate from x to y. These exactly cancel outside the light-cone.
More poetically: the reason we cannot use the particle entanglement to send information faster than light is because pesky anti-particles appear to cancel them. This highlights an uncomfortably deep relationship between the existence of anti-particles and the causal structure of relativity.
9 Fun exercise! Perhaps depends somewhat on your definition of “fun”.

4.4 Feynman propagator

As it turns out, the most useful propagator in quantum field theory is actually not the object described above, but a slightly different object called the Feynman propagator. To introduce it, I need to first explain the notion of time-ordering. A time-ordered product of some operators $O_1(x_1), O_2(x_2), \cdots$ is written as

$$ T(O_1(x_1)O_2(x_2)\cdots) \tag{4.120} $$

and is defined as the operator product that you get when you order all of the operators by time and then place the earliest ones to the right. So after the ordering you will find something like

Oa (xa )Ob (xb ) · · · (4.121)

where a, b, · · · is a permutation of the original operator labels 1, 2 · · · where we have

ta > tb > tc > · · · tn (4.122)
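As a toy illustration of the ordering rule, here is a minimal sketch in which each operator is represented only by a label and its time; this representation is hypothetical and tracks nothing beyond the ordering itself (in particular no sign factors, which is fine for the bosonic fields discussed here).

```python
def time_order(operators):
    """Sort (label, time) pairs so that the latest time is leftmost."""
    return sorted(operators, key=lambda op: op[1], reverse=True)

ops = [("O1", 0.3), ("O2", 1.7), ("O3", -0.5)]
ordered_labels = [label for label, _ in time_order(ops)]
print(ordered_labels)  # ['O2', 'O1', 'O3']
```

The earliest operator, here O3, ends up furthest to the right, so it is the first to act on a state.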

So for example, the time ordered correlator of φ and φ† is

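$$ T(\phi(x')\phi^\dagger(x)) = \theta(t' - t)\, \phi(x')\phi^\dagger(x) + \theta(t - t')\, \phi^\dagger(x)\phi(x'), \tag{4.123} $$

where we have introduced the Heaviside step function θ(x), as usual defined as
$$ \theta(x) = \begin{cases} 1 & x > 0 \\ 0 & x < 0 \end{cases} \tag{4.124} $$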

We now define the Feynman propagator $G(x, x')$ to be

$$ G(x, x') = \langle 0|T(\phi(x')\phi^\dagger(x))|0\rangle, \tag{4.125} $$

i.e. the expectation value of the time-ordered product of φ and φ†.


It is at this point reasonable to ask why. I will basically dodge this question for now: it turns out that this is a quantity that is in many ways the “correct” notion of particle propagation; we will see when we start doing interacting quantum fields and scattering particles off of each other that this is a natural building block.
Now let’s calculate it. First the two pieces: we have already calculated $\langle\phi^\dagger(x)\phi(x')\rangle$ in (4.116), and a very similar calculation gives us the remaining piece $\langle\phi(x')\phi^\dagger(x)\rangle$. Putting them together we find

$$ G(x, x') = \theta(t' - t) \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec k}}\, e^{ik\cdot(x' - x)} + \theta(t - t') \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec k}}\, e^{-ik\cdot(x' - x)} \tag{4.126} $$

In both cases we have $k_\mu = (-\omega_{\vec k}, \vec k)$ as usual. It’s worth expanding out the exponents to understand precisely what the difference is between the two branches:
$$ G(x, x') = \theta(t' - t) \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec k}}\, e^{-i\omega_{\vec k}(t' - t) + i\vec k\cdot(\vec x' - \vec x)} + \theta(t - t') \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec k}}\, e^{+i\omega_{\vec k}(t' - t) - i\vec k\cdot(\vec x' - \vec x)} \tag{4.127} $$
So the point is that if $t' > t$, then the behavior in time is like $e^{-i\omega_{\vec k}(t' - t)}$, whereas if $t' < t$, then the behavior in time is the other way, i.e. $e^{+i\omega_{\vec k}(t' - t)}$. This is the key defining property of the Feynman propagator. (The spatial dependence also looks like it oscillates the other way on the two branches, but recall from the previous part that this is an illusion: we can always take $\vec k \to -\vec k$ in the integral, but no similar manipulation can be done for the time dependence.)

Now I will show you an elegant way to write this answer. As it turns out, the Feynman propagator can be written as
$$ G(x, x') = -i\int \frac{d^4k}{(2\pi)^4} \frac{e^{-ik\cdot(x' - x)}}{k_\nu k^\nu + m^2} \tag{4.128} $$
Note that here we are doing something different: we are actually integrating over all four components of the 4-vector $k^\mu = (k^0, \vec k)$, which is not something that we’ve done before; it should thus not be obvious at all that this is similar to the expressions above. To show that it is, we will explicitly do the integral over $k^0$ and show that it agrees with (4.127).
It also turns out that this integral is not yet well-defined, as I will explain in a second.
We will do these integrals by contour integration in the complex plane, which I believe everyone already knows. To remind you: if we have an analytic function f(z) of a complex variable z, which for our purposes means that the only singularities are poles, then the anti-clockwise integral over a closed loop Γ in the complex plane is
$$ \oint_\Gamma dz\, f(z) = 2\pi i \sum \left( \text{Residues at poles of } f(z) \text{ inside } \Gamma \right) \tag{4.129} $$

A very simple example of this is the integral of the function $\frac{1}{z}$ around a loop enclosing the origin, which is
$$ \oint_\Gamma dz\, \frac{1}{z} = 2\pi i \tag{4.130} $$

since the residue is just 1. If we do the integral clockwise instead then we pick up an extra minus sign.
Now let’s consider the integral over $k^0$:
$$ G(x, x') = -i\int \frac{d^3k}{(2\pi)^3} \int_{-\infty}^{+\infty} \frac{dk^0}{2\pi}\, \frac{e^{ik^0(t' - t) - i\vec k\cdot(\vec x' - \vec x)}}{-(k^0)^2 + \vec k^2 + m^2} \tag{4.131} $$

Let’s hold $\vec k$ fixed for now and focus on the $k^0$ integral, which I’ll call I. We see that it is done over the real line, but it actually hits two poles at $k^0 = \pm\omega_{\vec k}$, i.e. we have:
$$ I = -i\int \frac{dk^0}{2\pi}\, \frac{e^{ik^0(t' - t) - i\vec k\cdot(\vec x' - \vec x)}}{-(k^0)^2 + \vec k^2 + m^2} = +i\int \frac{dk^0}{2\pi}\, \frac{e^{ik^0(t' - t) - i\vec k\cdot(\vec x' - \vec x)}}{(k^0 + \omega_{\vec k})(k^0 - \omega_{\vec k})} \tag{4.132} $$

These poles actually mean that the integral is ill-defined as an ordinary real integral unless we provide some more information: we need to define this as an integral where $k^0$ is allowed to stray into the complex $k^0$ plane, and we then need to explain how to go around the poles.
The Feynman propagator is defined by following the contour in Figure 4.1, i.e. we go below the pole at −ω~k
and above the pole at +ω~k . Let’s now understand carefully how this works by computing the two branches
above. To use the residue integration formula we need to consider a closed contour, which means in this case
that we close the contour either above – i.e. at k 0 → +i∞, or below, at k 0 → −i∞. As usual in complex
analysis, if we pick the contour so that this closure does not contribute, then the integral over the closed
integral is equal to the integral which we wish to compute.
Which way we close the contour depends on the sign of t0 − t: if t0 − t > 0, then we should close the contour
0 0 0
above so that eik (t −t) ∼ e−∞(t −t) decays exponentially. This means that we are going anti-clockwise around
the pole at k = −ω~k . The residue is everything that multiplies the pole if one sets k 0 = −ω~k , i.e. the from
0

the integrand it is
0 ~ 0
i e−iω~k (t −t)−ik·(~x −~x)
residue = (4.133)
2π −2ω~k

Figure 4.1: Contour for frequency integration in the complex k 0 plane for the Feynman propagator

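so using the residue formula we find that
$$I = (2\pi i)\, \frac{-i}{2\pi}\, \frac{e^{-i\omega_{\vec{k}}(t'-t) - i\vec{k}\cdot(\vec{x}'-\vec{x})}}{2\omega_{\vec{k}}} \tag{4.134}$$
which becomes
$$I = \frac{e^{-i\omega_{\vec{k}}(t'-t) - i\vec{k}\cdot(\vec{x}'-\vec{x})}}{2\omega_{\vec{k}}} \qquad t' > t \tag{4.135}$$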
Now let us consider doing the same for the opposite sign, $t'-t < 0$. By the same argument, we now
need to close the contour below so that $e^{ik^0(t'-t)} \sim e^{-\infty|t'-t|}$ decays exponentially. We then pick up the other
pole at $k^0 = +\omega_{\vec{k}}$. Now the residue is again everything that multiplies that pole, if we set $k^0 = +\omega_{\vec{k}}$, i.e.
$$\text{residue} = \frac{i}{2\pi}\, \frac{e^{+i\omega_{\vec{k}}(t'-t) - i\vec{k}\cdot(\vec{x}'-\vec{x})}}{2\omega_{\vec{k}}} \tag{4.136}$$
Note now that we are going around the contour the other way (clockwise), so the residue formula picks up an additional
minus sign. Using the residue formula we find that
$$I = \frac{e^{+i\omega_{\vec{k}}(t'-t) - i\vec{k}\cdot(\vec{x}'-\vec{x})}}{2\omega_{\vec{k}}} \qquad t' < t \tag{4.137}$$

The key difference here is the sign of the quantity in the exponent multiplying $t'-t$, which we see now differs
depending on the sign of $t'-t$. Now putting everything together we have
\begin{align}
G(x, x') &= -i \int \frac{d^4 k}{(2\pi)^4}\, \frac{e^{-ik\cdot(x'-x)}}{k_\nu k^\nu + m^2} \tag{4.138}\\
&= \theta(t'-t) \int \frac{d^3 k}{(2\pi)^3} \frac{1}{2\omega_{\vec{k}}}\, e^{-i\omega_{\vec{k}}(t'-t) + i\vec{k}\cdot(\vec{x}'-\vec{x})} + \theta(t-t') \int \frac{d^3 k}{(2\pi)^3} \frac{1}{2\omega_{\vec{k}}}\, e^{+i\omega_{\vec{k}}(t'-t) - i\vec{k}\cdot(\vec{x}'-\vec{x})} \tag{4.139}
\end{align}
where we have changed the sign of $\vec{k} \to -\vec{k}$ in the first term. This is exactly what we had in (4.127), which
is indeed what we set out to prove.
Thus we see that the integral (4.128), together with the specification of the integration contour, gives us a nice
Lorentz-covariant expression for the Feynman propagator. We will make heavy use of this expression in the
future. Often we capture the information in the choice of contour by writing the expression for the Feynman
propagator compactly as
$$G(x, x') = -i \int \frac{d^4 k}{(2\pi)^4}\, \frac{e^{-ik\cdot(x'-x)}}{k^2 + m^2 - i\epsilon} \tag{4.140}$$
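where $\epsilon > 0$ is a positive infinitesimal. Its role in life is to tip the poles slightly off of the real axis in precisely
the same way as in our contour prescription: to understand this let’s solve for the location of the poles with the $\epsilon$
involved:
$$-(k^0)^2 + \vec{k}^2 + m^2 - i\epsilon = 0 \quad\to\quad (k^0)^2 = \omega_{\vec{k}}^2 - i\epsilon \tag{4.141}$$
which, to first order in $\epsilon$, becomes
$$k^0 = \pm\omega_{\vec{k}} \left(1 - \frac{i\epsilon}{2\omega_{\vec{k}}^2}\right) \tag{4.142}$$
We see that this moves the poles off of the real axis precisely as needed (i.e. the negative and positive poles
indeed move off in opposite directions: the pole near $+\omega_{\vec{k}}$ moves slightly below the real axis, and the pole
near $-\omega_{\vec{k}}$ slightly above it). This way of writing the propagator is often called the Feynman
prescription.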
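The $k^0$ integral can also be checked numerically. The sketch below is only illustrative: it assumes NumPy, keeps $\epsilon$ small but finite, drops the spatial phase (which is just an overall factor at fixed $\vec{k}$), and compares a brute-force sum along the real axis with the residue result evaluated at the shifted pole $\sqrt{\omega^2 - i\epsilon}$. The function names and parameter values are made up for the example.

```python
import numpy as np

def k0_integral(dt, omega=1.0, eps=0.2, cutoff=500.0, step=2e-3):
    """Brute-force -i * int dk0/(2 pi) e^{i k0 dt} / (-(k0)^2 + omega^2 - i eps) on the real line."""
    k0 = np.arange(-cutoff, cutoff + step, step)
    integrand = -1j / (2.0 * np.pi) * np.exp(1j * k0 * dt) / (-k0**2 + omega**2 - 1j * eps)
    return np.sum(integrand) * step

def residue_result(dt, omega=1.0, eps=0.2):
    """Residue evaluation at finite eps: the poles sit at +/- sqrt(omega^2 - i eps)."""
    w = np.sqrt(omega**2 - 1j * eps)  # principal root, with negative imaginary part
    return np.exp(-1j * w * abs(dt)) / (2.0 * w)

for dt in (1.5, -0.7):
    print(dt, k0_integral(dt), residue_result(dt))
```

As $\epsilon \to 0$ the residue result reduces to $e^{-i\omega|t'-t|}/(2\omega)$, which is the time dependence found in (4.135) and (4.137).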
This completes the derivation of the Feynman propagator. Note that this representation is manifestly Lorentz-
invariant, something which was not at all obvious from the definition in terms of time-ordering.
Teaser: I did not actually discuss this in lectures, but as we will see, this propagator has many uses. I
will highlight just one of them before we move on to interacting field theories: the Feynman propagator is
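actually a Green’s function of the Klein-Gordon equation, i.e. the PDE:

$$(\partial^2 - m^2)\phi(x) = 0 \tag{4.143}$$

To understand this, let’s just directly compute

\begin{align}
(\partial_x^2 - m^2) G(x, x') &= -i(\partial_x^2 - m^2) \int \frac{d^4 k}{(2\pi)^4}\, \frac{e^{-ik\cdot(x'-x)}}{k^2 + m^2 - i\epsilon} \tag{4.144}\\
&= -i \int \frac{d^4 k}{(2\pi)^4}\, e^{-ik\cdot(x'-x)}\, \frac{-k^2 - m^2}{k^2 + m^2 - i\epsilon} \tag{4.145}\\
&= +i \int \frac{d^4 k}{(2\pi)^4}\, e^{-ik\cdot(x'-x)} \tag{4.146}\\
&= +i\,\delta^{(4)}(x'-x) \tag{4.147}
\end{align}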

This means that you can use this to find classical solutions to the Klein-Gordon PDE. This also plays a more
central role in the path integral formulation of QFT that you will do in the next term.

5 Interacting quantum field theories

We will now move on to quantum field theories with interactions. In this section we will work mostly with
the real scalar field.
The free real scalar field has by now the extremely familiar action

$$S = \int d^4 x \left( -\frac{1}{2}(\partial\phi)^2 - \frac{m^2}{2}\phi^2 \right) \tag{5.1}$$

We did not carefully canonically quantize it, but it is mostly a simpler version of the story from the complex
scalar field. You are asked to work out some of the details in a homework problem, but basically one finds
the following Fourier expansion:

$$\phi(\vec{x}, t) = \int \frac{d^3 k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}} \left( a_{\vec{k}}\, e^{-i\omega_{\vec{k}} t + i\vec{k}\cdot\vec{x}} + a^\dagger_{\vec{k}}\, e^{i\omega_{\vec{k}} t - i\vec{k}\cdot\vec{x}} \right) \tag{5.2}$$

(i.e. we have that the operator formerly called $b$ is now equal to $a$: the anti-particle is the same as the particle), where
the $a$’s have the familiar commutation relation

$$[a_{\vec{k}}, a^\dagger_{\vec{k}'}] = (2\pi)^3 \delta^{(3)}(\vec{k} - \vec{k}') \tag{5.3}$$

and the Hamiltonian takes the form

$$H_0 = \int \frac{d^3 k}{(2\pi)^3}\, \omega_{\vec{k}}\, a^\dagger_{\vec{k}} a_{\vec{k}} \tag{5.4}$$
(I have subtracted off the infinite vacuum energy and will simply ignore it from now on).
We have by now completely solved this quantum theory: we know the spectrum of the Hamiltonian and we
worked out the full time-dependence of the fields in the Heisenberg picture, which is given above in (5.2).
The physics of this is simply that of a set of non-interacting particles – the fact that they are non-interacting
is clear from the energy spectrum, which simply counts the energy of each of the particles individually and
adds them up.
We would now like to move on to the interacting quantum field theory: the simplest such theory is called the $\lambda\phi^4$
theory, which has the following action:

$$S = \int d^4 x \left( -\frac{1}{2}(\partial\phi)^2 - \frac{m^2}{2}\phi^2 - \frac{\lambda}{4!}\phi^4 \right) \tag{5.5}$$

The action is no longer quadratic in the field φ. It will turn out that the term in λ now allows the φ particles
to interact with each other – the larger λ is, the stronger the interaction. The first thing to note is that the
simple addition of this new term makes the theory completely unsolvable; indeed if you can find some
way to generally solve interacting quantum field theories you will definitely win a Nobel prize.
Instead, what we will do is Taylor expand the problem in powers of λ: we understand what is happening
at λ = 0, and we will try to extend that understanding to a systematic expansion order by order in λ; this
is called “perturbation theory” and is one of the few tools we have to understand interacting quantum field
theory.
Let us first give ourselves a goal: we are interested in computing this sort of quantity:

$$\langle 0|T(\phi(x)\phi(y))|0\rangle \tag{5.6}$$

in the interacting theory. In the free theory, we have already computed this: it is the Feynman propagator
G(x, y).

5.1 The interaction picture

So how will we do this? First, let us construct the Hamiltonian of the full theory:
$$H = H_0 + H_{\rm int} = H_0 + \int d^3 x\, \frac{\lambda}{4!}\, \phi(\vec{x})^4 \tag{5.7}$$
Now the new interaction term affects our calculation in two different ways: it changes the time evolution of the
quantum fields $\phi(x)$, $\phi(y)$, and it also affects the choice of the vacuum state $|0\rangle$, which is no longer the same
as the free vacuum.

Let us tackle the time dependence issue first: recall that if we know what the Heisenberg picture quantum
field is at some time $t = t_0$, then we can figure out what it is at some later time via the usual expression
for Heisenberg time evolution:
$$\phi(t, \vec{x}) = e^{iH(t-t_0)}\, \phi(t_0, \vec{x})\, e^{-iH(t-t_0)} \tag{5.8}$$
Now in general, we do not know how to exponentiate the full Hamiltonian; however we do know how to
exponentiate the free Hamiltonian H0 ; indeed, the time dependence arising from that Hamiltonian is exactly
that which we wrote down previously in (5.2).
Let’s define a field $\phi_I(t, \vec{x})$ to be the field evolved according to only the free Hamiltonian, i.e.

$$\phi_I(t, \vec{x}) = e^{iH_0(t-t_0)}\, \phi(t_0, \vec{x})\, e^{-iH_0(t-t_0)} \tag{5.9}$$

This is sometimes called the interaction picture field. It is not equal to the “real” Heisenberg picture field
$\phi(t, \vec{x})$, but if $\lambda$ is small then there is a sense where it captures the “main” part of the time evolution, and
we fully understand it. It is thus a useful thing to use to organize our perturbative expansion: we just need
to figure out how to express everything in terms of $\phi_I(x)$.
The “real” Heisenberg picture field is related to $\phi_I$ by

$$\phi(t, \vec{x}) = e^{iH(t-t_0)} e^{-iH_0(t-t_0)}\, \phi_I(t, \vec{x})\, e^{+iH_0(t-t_0)} e^{-iH(t-t_0)} \tag{5.10}$$

where we are undoing the free evolution and then redoing it with the full Hamiltonian. It is now convenient
to define the following evolution operator:

$$U(t, t_0) \equiv e^{iH_0(t-t_0)} e^{-iH(t-t_0)} \tag{5.11}$$

so that we have
$$\phi(t, \vec{x}) = U^\dagger(t, t_0)\, \phi_I(t, \vec{x})\, U(t, t_0) \tag{5.12}$$
There is some temptation to combine the exponentials above: this would be a mistake, as the interaction
part of H does not commute with H0 (this is indeed the whole point; if they commuted it would be very
easy to solve this problem).
We will now develop an expansion of $U(t, t_0)$ in terms of $\phi_I$. To do this, let’s first note that it satisfies the
following differential equation:
\begin{align}
i\frac{\partial}{\partial t} U(t, t_0) &= i\left(\frac{\partial}{\partial t} e^{iH_0(t-t_0)}\right) e^{-iH(t-t_0)} + i\, e^{iH_0(t-t_0)} \left(\frac{\partial}{\partial t} e^{-iH(t-t_0)}\right) \tag{5.13}\\
&= i\left( e^{iH_0(t-t_0)} (iH_0)\, e^{-iH(t-t_0)} + e^{iH_0(t-t_0)} (-iH)\, e^{-iH(t-t_0)} \right) \tag{5.14}\\
&= e^{iH_0(t-t_0)} (H - H_0)\, e^{-iH(t-t_0)} \tag{5.15}\\
&= e^{iH_0(t-t_0)} H_{\rm int}\, e^{-iH(t-t_0)} \tag{5.16}\\
&= e^{iH_0(t-t_0)} H_{\rm int}\, e^{-iH_0(t-t_0)} e^{+iH_0(t-t_0)} e^{-iH(t-t_0)} \tag{5.17}\\
&= \left( e^{iH_0(t-t_0)} H_{\rm int}\, e^{-iH_0(t-t_0)} \right) U(t, t_0) \tag{5.18}
\end{align}
Look at the last expression – it is the interaction Hamiltonian $H_{\rm int}$, evolved in time using only the free
Hamiltonian $H_0$ as in (5.9): we might call it the interaction Hamiltonian in the interaction picture, so we
will call it $H_I$:
$$H_I = e^{iH_0(t-t_0)} H_{\rm int}\, e^{-iH_0(t-t_0)} \tag{5.19}$$
Let’s understand what it is in terms of the interaction picture $\phi_I$: since $H_{\rm int}$ in (5.7) is an integral over space only, we have
$$H_I = \int d^3 x\, \frac{\lambda}{4!}\, e^{iH_0(t-t_0)}\, \phi(t_0, \vec{x})^4\, e^{-iH_0(t-t_0)} = \int d^3 x\, \frac{\lambda}{4!}\, \phi_I(t, \vec{x})^4 \tag{5.20}$$
i.e. it is simple when written in terms of the interaction picture field $\phi_I$.
So to conclude, the differential equation obeyed by $U$ is
$$i\frac{\partial}{\partial t} U(t, t_0) = H_I(t)\, U(t, t_0) \tag{5.21}$$
Note also the boundary condition that $U(t_0, t_0) = 1$. This boundary condition and differential equation
uniquely determine $U(t, t_0)$. We will now explicitly write down a general solution for it.
The solution is called Dyson’s formula [10], and is really amazingly slick:
$$U(t, t_0) = T \exp\left( -i \int_{t_0}^{t} dt'\, H_I(t') \right) \tag{5.22}$$
Here the $T$ outside the exponential defines the time-ordered exponential, which simply means that you
should expand out the exponential and then time-order each term in the sum.
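To make "expand and then time-order" concrete, here is a sketch of the first few terms of (5.22), written out under the assumption $t > t_0$. The second line uses the fact that time-ordering a product of two $H_I$'s simply places the later one to the left, so the square region of integration can be folded onto the triangle $t_1 > t_2$, which cancels the $1/2!$:

```latex
\begin{align}
U(t, t_0) &= 1 - i \int_{t_0}^{t} dt_1\, H_I(t_1)
  + \frac{(-i)^2}{2!} \int_{t_0}^{t} dt_1 \int_{t_0}^{t} dt_2\, T\big(H_I(t_1) H_I(t_2)\big) + \cdots \\
&= 1 - i \int_{t_0}^{t} dt_1\, H_I(t_1)
  + (-i)^2 \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2\, H_I(t_1) H_I(t_2) + \cdots
\end{align}
```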
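Let us check that it satisfies the differential equation above: taking the derivative of (5.22) we have
\begin{align}
i\frac{\partial}{\partial t} U(t, t_0) &= i\frac{\partial}{\partial t}\, T \exp\left( -i \int_{t_0}^{t} dt'\, H_I(t') \right) \tag{5.23}\\
&= T \left[ H_I(t) \exp\left( -i \int_{t_0}^{t} dt'\, H_I(t') \right) \right] \tag{5.24}\\
&= H_I(t)\, T \exp\left( -i \int_{t_0}^{t} dt'\, H_I(t') \right) \tag{5.25}
\end{align}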
Note that though naively nothing in this expression commutes, we could freely take HI (t) out to the left,
because the integral is only over times that are less than t, and thus by the magic of time-ordering HI (t) is
always placed to the left of everything else and thus can be taken outside.
Finally, the boundary condition is trivial: clearly we have U (t0 , t0 ) = 1. Thus, by the uniqueness of solutions
to differential equations, we have indeed found that (5.22) is equal to U (t, t0 ) defined in (5.11).
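The equality of (5.22) and (5.11) can also be tested numerically in a toy setting. The sketch below is an illustration only: it replaces the field theory by random $3\times 3$ Hermitian matrices, sets $t_0 = 0$, and approximates the time-ordered exponential by a product of short-time factors with later times placed to the left. It assumes NumPy; all names and parameter values are invented for the example.

```python
import numpy as np

def expm_hermitian(h, t):
    """Return exp(-i h t) for a Hermitian matrix h via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(h)
    return (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T

def u_exact(h0, h, t):
    """U(t, 0) = exp(+i H0 t) exp(-i H t), i.e. (5.11) with t0 = 0."""
    return expm_hermitian(h0, -t) @ expm_hermitian(h, t)

def u_dyson(h0, h_int, t, steps=2000):
    """Time-ordered product of factors exp(-i H_I(t_j) dt), later times to the left."""
    dt = t / steps
    u = np.eye(h0.shape[0], dtype=complex)
    for j in range(steps):
        t_mid = (j + 0.5) * dt
        # H_I(t) = exp(+i H0 t) H_int exp(-i H0 t), as in (5.19)
        h_i = expm_hermitian(h0, -t_mid) @ h_int @ expm_hermitian(h0, t_mid)
        h_i = (h_i + h_i.conj().T) / 2  # remove rounding-level non-Hermiticity
        u = expm_hermitian(h_i, dt) @ u
    return u

rng = np.random.default_rng(0)

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

h0 = random_hermitian(3)
h_int = 0.3 * random_hermitian(3)
error = np.linalg.norm(u_dyson(h0, h_int, 1.0) - u_exact(h0, h0 + h_int, 1.0))
print(error)
```

The printed difference should shrink as `steps` is increased, since the discretised product converges to the time-ordered exponential.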
Let me state a few properties of U (t, t0 ) which you can check for yourselves:

1. U (t, t0 ) is unitary.
2. Composition property: U (t1 , t2 )U (t2 , t3 ) = U (t1 , t3 ), if t1 > t2 > t3 .
3. Taking the Hermitian conjugate evolves backwards in time: U (t1 , t3 )[U (t2 , t3 )]† = U (t1 , t2 )
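One way to check these properties is sketched below. It assumes the standard two-time generalisation of (5.11), which is not written out in these notes: for a general second argument $t'$ one takes the definition in the first line, which reduces to (5.11) when $t' = t_0$.

```latex
\begin{align}
U(t, t') &\equiv e^{iH_0(t-t_0)}\, e^{-iH(t-t')}\, e^{-iH_0(t'-t_0)} \\
U^\dagger(t, t')\, U(t, t') &= e^{iH_0(t'-t_0)}\, e^{iH(t-t')}\, e^{-iH_0(t-t_0)}\,
   e^{iH_0(t-t_0)}\, e^{-iH(t-t')}\, e^{-iH_0(t'-t_0)} = 1 \\
U(t_1, t_2)\, U(t_2, t_3) &= e^{iH_0(t_1-t_0)}\, e^{-iH(t_1-t_2)}\,
   \underbrace{e^{-iH_0(t_2-t_0)}\, e^{iH_0(t_2-t_0)}}_{=1}\, e^{-iH(t_2-t_3)}\, e^{-iH_0(t_3-t_0)}
   = U(t_1, t_3) \\
U(t_1, t_3)\, [U(t_2, t_3)]^\dagger &= e^{iH_0(t_1-t_0)}\, e^{-iH(t_1-t_3)}\, e^{iH(t_2-t_3)}\, e^{-iH_0(t_2-t_0)}
   = U(t_1, t_2)
\end{align}
```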

Next, let’s think about the vacuum of the interacting theory $|0\rangle$. Recall that in the free theory we had a
vacuum state $|0\rangle_0$ which was defined by the condition

$$H_0 |0\rangle_0 = 0 \tag{5.26}$$

This state is empty of particles. Now what can we say about the vacuum of the interacting theory $|0\rangle$?
We take this vacuum to be defined as follows:
$$|0\rangle = U(t_0, -\infty)|0\rangle_0 \tag{5.27}$$
This says the following: take the vacuum of the free theory in the far distant past. Evolve it forwards up to
a reference time $t_0$ using the interaction Hamiltonian. We take this to be the vacuum of the interacting theory.
[10] Though David Tong’s notes tell me that it’s actually originally due to Dirac. I believe the slick presentation is due to Dyson.
Strictly speaking this statement requires more justification. I will sketch how you might do it: imagine
that the free vacuum has some overlap with the interacting vacuum $|0\rangle$. Then if we evolve $|0\rangle_0$ in slightly
imaginary time, with $e^{-iHT(1-i\epsilon)}$, all states with energy $E$ will receive a suppression by $e^{-ET\epsilon}$. This will
project out states with higher energy faster – if we now take $T \to \infty$ then the only state that remains will
be the lowest energy state $|0\rangle$. The small amount $\epsilon$ turns out to be the same $\epsilon$ as in the definition of the Feynman
propagator.
This is somewhat heuristic and I will not actually try to justify it any further, mostly because we begin to
enter into the weeds of What Does It Mean, Really, To Have An Interacting Quantum Field Theory. Let’s
just note that the norm of the true vacuum state after this maneuver will not obviously be 1, because the $i\epsilon$
spoils the unitarity of $U$. Thus we can consider computing:

$$\langle 0|0\rangle = {}_0\langle 0|U(+\infty, t_0)\, U(t_0, -\infty)|0\rangle_0 = {}_0\langle 0|U(+\infty, -\infty)|0\rangle_0 \tag{5.28}$$

which is called the vacuum persistence amplitude. It basically measures how much we mis-normalized
the free vacuum state. One way to deal with this is simply to divide by it at the end.
We are now ready to write down a formal expression for the time-ordered correlation function $\langle 0|T(\phi(x)\phi(y))|0\rangle$
in the interacting theory. Let us first work this out for $x^0 > y^0$. Recall that we convert from interaction
picture field to real Heisenberg field by:

$$\phi(t, \vec{x}) = U^\dagger(t, t_0)\, \phi_I(t, \vec{x})\, U(t, t_0) \tag{5.29}$$

Placing this in and converting from free vacuum to interacting vacuum we have:

\begin{align}
\langle 0|T(\phi(x)\phi(y))|0\rangle &= \frac{{}_0\langle 0|U(+\infty, t_0)\, U^\dagger(x^0, t_0)\, \phi_I(x^0, \vec{x})\, U(x^0, t_0)\, U^\dagger(y^0, t_0)\, \phi_I(y^0, \vec{y})\, U(y^0, t_0)\, U(t_0, -\infty)|0\rangle_0}{{}_0\langle 0|U(+\infty, -\infty)|0\rangle_0} \tag{5.30}\\
&= \frac{{}_0\langle 0|T\big(\phi_I(x)\phi_I(y)\, U(+\infty, -\infty)\big)|0\rangle_0}{{}_0\langle 0|U(+\infty, -\infty)|0\rangle_0} \tag{5.31}
\end{align}

It may seem that something miraculous happened in the last line – after all, there are many different $U$’s
evolving us along different times, how did I combine them all? The key fact here is that the time-ordering
will automatically ensure that every operator is placed in the right place, because everything runs from the
past (on the right) to the future (on the left). You can verify that this construction also works out if $y^0 > x^0$
instead; one finds exactly the same expression.
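Now we use Dyson’s formula (5.22) to find the final form:
$$\langle 0|T(\phi(x)\phi(y))|0\rangle = \frac{{}_0\langle 0|T\left[\phi_I(x)\phi_I(y) \exp\left(-i\int_{-\infty}^{\infty} dt\, H_I(t)\right)\right]|0\rangle_0}{{}_0\langle 0|T\exp\left(-i\int_{-\infty}^{\infty} dt\, H_I(t)\right)|0\rangle_0} \tag{5.32}$$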

This is the key technical tool in the calculations that follow: because HI (t) is a polynomial in the field φI (x),
basically we can expand out the exponential to any desired order in λ, and we have reduced the problem to
computing expectation values of arbitrary time-ordered products of the field φI (x) in the vacuum.

5.2 Wick’s theorem

This is a problem in bookkeeping. Let’s first note that we have introduced two kinds of ordering so far:
time-ordering $T(O(x_1)O(x_2)\cdots)$ and normal-ordering $:O(x_1)O(x_2)\cdots:$. We will now try to establish a
relationship between these two kinds of ordering.

Let’s start by understanding the product of two fields. Start by decomposing the field $\phi_I(x)$ as:

$$\phi_I(x) = \phi_I^+(x) + \phi_I^-(x) \tag{5.33}$$

where the two components have only annihilation and creation operators respectively, and are:
$$\phi_I^+(x) = \int \frac{d^3 k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}}\, a_{\vec{k}}\, e^{ik\cdot x} \qquad \phi_I^-(x) = \int \frac{d^3 k}{(2\pi)^3} \frac{1}{\sqrt{2\omega_{\vec{k}}}}\, a^\dagger_{\vec{k}}\, e^{-ik\cdot x} \tag{5.34}$$

The point of this is that

$$\phi_I^+(x)|0\rangle = 0 \qquad \langle 0|\phi_I^-(x) = 0 \tag{5.35}$$
Now let us consider the time-ordered product in the case where $x^0 > y^0$:

$$T(\phi_I(x)\phi_I(y)) = \left(\phi_I^+(x) + \phi_I^-(x)\right)\left(\phi_I^+(y) + \phi_I^-(y)\right) \tag{5.36}$$

I want to place this in normal order: to do this we should move the $\phi_I^-$ to the left-hand side. Expanding out
the fields, there is only one commutator we need:
$$T(\phi_I(x)\phi_I(y)) = \phi_I^+(x)\phi_I^+(y) + \phi_I^-(x)\phi_I^+(y) + \phi_I^-(y)\phi_I^+(x) + \phi_I^-(x)\phi_I^-(y) + [\phi_I^+(x), \phi_I^-(y)] \tag{5.37}$$

Now all of the terms except for the commutator are normal ordered: so if we have $x^0 > y^0$,
we can write:

$$T(\phi_I(x)\phi_I(y)) = \; :\phi_I(x)\phi_I(y): + [\phi_I^+(x), \phi_I^-(y)] \qquad x^0 > y^0 \tag{5.38}$$
Now we can do the same thing in the case $y^0 > x^0$: you can verify that everything is the same, except that
the final commutator is different:

$$T(\phi_I(x)\phi_I(y)) = \; :\phi_I(x)\phi_I(y): + [\phi_I^+(y), \phi_I^-(x)] \qquad y^0 > x^0 \tag{5.39}$$

Let us now define some useful notation: the contraction of two fields is denoted by a line above them, and
is defined to be exactly the combination of commutators that we just obtained:
$$\wick{\c1 \phi_I(x) \c1 \phi_I(y)} = \begin{cases} [\phi_I^+(x), \phi_I^-(y)] & x^0 > y^0 \\ [\phi_I^+(y), \phi_I^-(x)] & y^0 > x^0 \end{cases} \tag{5.40}$$

Now here is the magical thing: it turns out that this combination of commutators is in fact exactly the
Feynman propagator!
$$\wick{\c1 \phi_I(x) \c1 \phi_I(y)} = G(x, y) \tag{5.41}$$
The easiest way to see this is to just explicitly work out the commutators in terms of the ladder operators
and compare to the expression in (4.126).
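As a sketch of that computation for the case $x^0 > y^0$, using only the mode expansions (5.34) and the commutator (5.3), with $k\cdot x = -\omega_{\vec{k}} x^0 + \vec{k}\cdot\vec{x}$:

```latex
\begin{align}
[\phi_I^+(x), \phi_I^-(y)]
 &= \int \frac{d^3k}{(2\pi)^3} \frac{d^3k'}{(2\pi)^3}\,
    \frac{e^{ik\cdot x - ik'\cdot y}}{\sqrt{2\omega_{\vec{k}}}\sqrt{2\omega_{\vec{k}'}}}\,
    [a_{\vec{k}}, a^\dagger_{\vec{k}'}] \\
 &= \int \frac{d^3k}{(2\pi)^3} \frac{1}{2\omega_{\vec{k}}}\,
    e^{-i\omega_{\vec{k}}(x^0 - y^0) + i\vec{k}\cdot(\vec{x} - \vec{y})}
\end{align}
```

This has the same form as the $\theta(t'-t)$ term of (4.139), with the later point $x$ playing the role of $x'$. The case $y^0 > x^0$ works the same way with $x$ and $y$ exchanged.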
So we now have a wonderfully simple expression for the time-ordered product of two fields: it is

$$T(\phi_I(x)\phi_I(y)) = \; :\phi_I(x)\phi_I(y): + \wick{\c1 \phi_I(x) \c1 \phi_I(y)} \tag{5.42}$$

This should make sense: consider taking the expectation value of this object in the free vacuum $|0\rangle_0$: the
normal-ordered part vanishes, and we obtain the known fact that ${}_0\langle 0|T(\phi_I(x)\phi_I(y))|0\rangle_0 = G(x, y)$. So the point of
this is to isolate the part of the correlation function that actually contributes when you take the vacuum
expectation value.
We are now ready to introduce the general form of Wick’s theorem: this tells us how to generalize the
formula above to arbitrary numbers of fields, and its statement is simply:
$$T(\phi_I(x_1)\phi_I(x_2)\cdots\phi_I(x_n)) = \sum \text{all possible contractions, with uncontracted fields normal ordered} \tag{5.43}$$

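All possible contractions is exactly what it sounds like: you look at all possible ways to draw contractions of
the fields. The simplest example after two fields is four, which I will now work out. It’s a bit lengthy:
\begin{align}
T(\phi_I(x_1)\phi_I(x_2)\phi_I(x_3)\phi_I(x_4)) = {} & :\phi_I(x_1)\phi_I(x_2)\phi_I(x_3)\phi_I(x_4): \tag{5.44}\\
& + \wick{\c1 \phi_I(x_1) \c1 \phi_I(x_2)} :\phi_I(x_3)\phi_I(x_4): + \wick{\c1 \phi_I(x_1) :\phi_I(x_2) \c1 \phi_I(x_3) \phi_I(x_4):} \tag{5.45}\\
& + \wick{\c1 \phi_I(x_1) :\phi_I(x_2)\phi_I(x_3): \c1 \phi_I(x_4)} + :\wick{\phi_I(x_1) \c1 \phi_I(x_2) \c1 \phi_I(x_3) \phi_I(x_4)}: \tag{5.46}\\
& + :\wick{\phi_I(x_1) \c1 \phi_I(x_2) \phi_I(x_3) \c1 \phi_I(x_4)}: + :\phi_I(x_1)\phi_I(x_2): \wick{\c1 \phi_I(x_3) \c1 \phi_I(x_4)} \tag{5.47}\\
& + \wick{\c1 \phi_I(x_1) \c1 \phi_I(x_2) \c2 \phi_I(x_3) \c2 \phi_I(x_4)} + \wick{\c1 \phi_I(x_1) \c2 \phi_I(x_2) \c1 \phi_I(x_3) \c2 \phi_I(x_4)} \tag{5.48}\\
& + \wick{\c1 \phi_I(x_1) \c2 \phi_I(x_2) \c2 \phi_I(x_3) \c1 \phi_I(x_4)} \tag{5.49}
\end{align}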

The first line has zero contractions; each of the terms in the next three lines has one contraction; and
the final two lines are fully contracted. Note that every time you have a contraction, you should replace the
contracted fields with the Feynman propagator, even if the fields aren’t next to each other. So e.g. this term is

$$\wick{\c1 \phi_I(x_1) :\phi_I(x_2)\phi_I(x_3): \c1 \phi_I(x_4)} = G(x_1, x_4) :\phi_I(x_2)\phi_I(x_3): \tag{5.50}$$
and if you have two contractions then
$$\wick{\c1 \phi_I(x_1) \c1 \phi_I(x_2) \c2 \phi_I(x_3) \c2 \phi_I(x_4)} = G(x_1, x_2)\, G(x_3, x_4) \tag{5.51}$$
I have not proven Wick’s theorem; it should sound vaguely plausible given how we worked out the case for
two fields. It can be proved by induction, but as always for proofs by induction I find it unenlightening, so
I’m not going to do it in lectures; see David Tong’s notes if you would like to see an explicit derivation.
The point of this all is of course to show you which terms are important if you calculate the expectation
value; if you do this, then any term with an uncontracted field vanishes (because it is all normal ordered).
So an important corollary to Wick’s theorem is:
$${}_0\langle 0|T(\phi_I(x_1)\phi_I(x_2)\cdots\phi_I(x_n))|0\rangle_0 = \sum \text{terms where all fields are fully contracted} \tag{5.52}$$
For example, taking the expectation value of two fields, we have
\begin{align}
{}_0\langle 0|T(\phi_I(x)\phi_I(y))|0\rangle_0 &= {}_0\langle 0|\left( :\phi_I(x)\phi_I(y): + \wick{\c1 \phi_I(x) \c1 \phi_I(y)} \right)|0\rangle_0 \tag{5.53}\\
&= G(x, y)\, {}_0\langle 0|0\rangle_0 = G(x, y) \tag{5.54}
\end{align}
and for four, we similarly have
$${}_0\langle 0|T(\phi_I(x_1)\phi_I(x_2)\phi_I(x_3)\phi_I(x_4))|0\rangle_0 = G(x_1, x_2)G(x_3, x_4) + G(x_1, x_3)G(x_2, x_4) + G(x_1, x_4)G(x_2, x_3) \tag{5.55}$$
Note there is now a temptation to draw pictures for these things. You can imagine that G(x1 , x2 ) represents
a particle moving from x1 to x2 . Then the diagrammatic interpretation of the 4-point function is as shown
in Figure 5.1 – i.e. one moving from x1 to x2 while another particle moves from x3 to x4 and so on.
This is the beginning of Feynman diagrams.
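It is useful to know in advance how many fully contracted terms to expect. A short counting argument, added here as a sketch: pick any field, pair it with one of the remaining $n-1$ fields, and repeat with what is left. For an odd number of fields some field is always left unpaired, so the free vacuum expectation value vanishes.

```latex
\begin{align}
\#\{\text{full contractions of } n \text{ fields}\} &= (n-1)(n-3)\cdots 3 \cdot 1 \equiv (n-1)!! \qquad (n \text{ even}) \\
n = 2:\ 1 \qquad n = 4:\ 3 \qquad n = 6:\ 15 &
\end{align}
```

The value 3 for $n = 4$ matches the three terms in (5.55).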

5.3 Feynman diagrams and Feynman rules

This sort of expansion becomes much more fun when we look at the sort of n-point functions that we get
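from including the actual interaction term in (5.32), which I record here for convenience:
$$\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle = \frac{{}_0\langle 0|T\left[\phi_I(x_1)\phi_I(x_2) \exp\left(-i\int_{-\infty}^{\infty} dt\, H_I(t)\right)\right]|0\rangle_0}{{}_0\langle 0|T\exp\left(-i\int_{-\infty}^{\infty} dt\, H_I(t)\right)|0\rangle_0} = \frac{N}{D} \tag{5.56}$$

Figure 5.1: Simple Feynman diagrams showing free propagation of particles

We will call the numerator of this expression $N$ and the denominator $D$. Now the idea is to expand the
interaction term out in powers of $\lambda$, i.e. we have
$$\exp\left(-i\int_{-\infty}^{\infty} dt\, H_I(t)\right) = \exp\left(-i\int d^4 x\, \frac{\lambda}{4!}\phi_I(x)^4\right) = 1 - i\frac{\lambda}{4!}\int d^4 x\, \phi_I(x)^4 + O(\lambda^2) \tag{5.57}$$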

And similarly we can expand out both the numerator and denominator in powers of $\lambda$, i.e. formally we have

$$N = N_0 + \lambda N_1 + \lambda^2 N_2 + \cdots \qquad D = D_0 + \lambda D_1 + \lambda^2 D_2 + \cdots \tag{5.58}$$

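Let’s now start with $N$. From above we have

$$N = {}_0\langle 0|T\left[\phi_I(x_1)\phi_I(x_2)\left(1 - i\frac{\lambda}{4!}\int d^4 x\, \phi_I(x)^4 + O(\lambda^2)\right)\right]|0\rangle_0 \tag{5.59}$$

The first term, of order $\lambda^0$, is

$$N_0 = {}_0\langle 0|T(\phi_I(x_1)\phi_I(x_2))|0\rangle_0 = G(x_1, x_2) \tag{5.60}$$
i.e. the result in the free theory. The effect from the interaction appears at $O(\lambda^1)$:
$$\lambda N_1 = -i\frac{\lambda}{4!}\int d^4 x\; {}_0\langle 0|T\left[\phi_I(x_1)\phi_I(x_2)\phi_I(x)\phi_I(x)\phi_I(x)\phi_I(x)\right]|0\rangle_0 \tag{5.61}$$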
From Wick’s theorem, we know that to determine this expectation value we have to contract these in all
possible ways. Everyone should now take a minute and figure out the possible contraction patterns: how
many times do we count each pattern? Draw some pictures!
Okay, it has been a minute [11]. There are essentially two different topological classes of diagrams. One of
them looks like this:
$${}_0\langle 0|T\left[\wick{\c1 \phi_I(x_1) \c1 \phi_I(x_2) \c2 \phi_I(x) \c2 \phi_I(x) \c3 \phi_I(x) \c3 \phi_I(x)}\right]|0\rangle_0 = 3\, G(x_1, x_2)\, G(x, x)^2 \tag{5.62}$$

A pictorial representation of this is shown in Figure 5.2. Note this has a physical/poetic interpretation: what
is happening is that one particle is moving from $x_1$ to $x_2$, and next to it a particle anti-particle pair has
bubbled forth from the vacuum, come together in a brief but passionate embrace at the intersection point $x$,
and then re-annihilated itself. The probability of the embrace is proportional to the coupling $\lambda$.
Such things are called virtual particles.

Figure 5.2: Feynman diagrams for $N_1^{\rm disc}$
The factor of 3 may require some thought: the point is that x1 and x2 are special because they are not
integrated over – there is only one way to connect the two of them. However the four φI (x)’s are all the
same. We could have chosen the first contraction in 3 ways (i.e. once you pick the first φI (x), you can
connect it to any of the other three), and then the remaining contraction is fixed.
We will call the contribution from this pattern of contractions $\lambda N_1^{\rm disc}$ since it consists of two disconnected
pieces.
The other topologically distinct way to connect them is:
$${}_0\langle 0|T\left[\wick{\c1 \phi_I(x_1) \c2 \phi_I(x_2) \c1 \phi_I(x) \c2 \phi_I(x) \c3 \phi_I(x) \c3 \phi_I(x)}\right]|0\rangle_0 = 12\, G(x_1, x)\, G(x_2, x)\, G(x, x) \tag{5.63}$$

Here the pictorial representation in Figure 5.3 is that a particle is trying to go from $x_1$ to $x_2$, but it spontaneously
emits another particle! The particle flits around the universe before being re-absorbed again at the
same space-time point $x$, and then the “original” particle moves on to $x_2$.
Now the factor of 12: there are four choices for the first connection from the external point $x_1$ to one of the
intersection $x$’s: after this there are 3 choices that remain to connect $x_2$, and then only 1 choice remains for
the last contraction: $12 = 4 \cdot 3 \cdot 1$.
Assembling the pieces we find that the full contribution at first order in $\lambda$ is
$$\lambda N_1 = -i\frac{\lambda}{4!}\left( 3\int d^4 x\, G(x_1, x_2)\, G(x, x)^2 + 12\int d^4 x\, G(x_1, x)\, G(x_2, x)\, G(x, x) \right) \tag{5.64}$$
We will call the contribution from the second pattern of contractions $\lambda N_1^{\rm conn}$, since it consists of a single connected piece.
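The factors of 3 and 12 can be confirmed by brute force. The following sketch (plain Python; the helper name `pairings` and the labelling of the six fields are illustrative choices) enumerates every complete contraction of the six fields in (5.61) and sorts them into the two topological classes.

```python
from fractions import Fraction

def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# Positions 0 and 1 are the external fields phi(x1), phi(x2);
# positions 2..5 are the four copies of phi(x) at the vertex.
positions = list(range(6))

counts = {"disconnected": 0, "connected": 0}
for pairing in pairings(positions):
    if (0, 1) in pairing:
        counts["disconnected"] += 1  # x1 contracted directly with x2
    else:
        counts["connected"] += 1     # x1 and x2 each attach to the vertex

print(counts)
print(Fraction(counts["disconnected"], 24), Fraction(counts["connected"], 24))
```

Dividing each count by the $4!$ from the coupling gives the overall coefficients that multiply the two integrals in (5.64).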
[11] For you.
Figure 5.3: Feynman diagrams for N1conn

You can clearly see the close association with the Feynman diagrams, which are an efficient way to represent
the pattern of contractions. The way we will proceed from now on is actually not in the excruciating
combinatoric way that we have so far: instead we will think of a set of rules – called Feynman rules – that
associate an analytic expression in terms of propagators with a given picture, and then we will just draw a
set of pictures.
You have by now probably already understood the idea of the position-space Feynman rules. If I write
them out explicitly for $\lambda\phi^4$ theory, they are as follows:

1. The number of external points is equal to the number of fields in the correlation function we are
calculating.
2. For each line connecting two points $x, y$, write $G(x, y)$.
3. For each vertex – which necessarily has 4 legs – we write $-i\lambda \int d^4 x$.
4. Divide the expression by the symmetry factor.

I have called these position-space because there is also a momentum-space version, which we will introduce
in due course. (It simply involves Fourier-transforming everything).
Here the last point deserves some further explanation. The question is: how many different ways of connecting
the lines represent the same analytic expression? Naively every permutation of the 4 lines coming out of each
vertex represents an equivalent expression that should be counted; this factor of 4! exactly cancels the 4! in
the denominator of the coupling, so there should be no combinatoric factor at all.
However it turns out that some of these 4! permutations are actually not independent; you can make up for
them by changing some of the other things around. This will happen when the diagram has some symmetry.
The degree of this symmetry is called a symmetry factor, and determining it is an interesting exercise in
pure thought. For example, in the connected bubble diagram in Figure 5.4 you can simultaneously switch
the two legs of the propagator and swap the lines coming out of the vertex, resulting in a symmetry factor
of 2. In the figure-of-eight diagram in Figure 5.4 we have instead a symmetry factor of 8, as shown. This
means that we have overcounted, and we should divide the expression by the symmetry factor.

Figure 5.4: Computation of symmetry factor S for the two diagrams above.

This results in the same combinatoric factors as above:

$$\frac{3}{4!} = \frac{1}{8} \qquad \frac{1}{2} = \frac{12}{4!} \tag{5.65}$$
Assembling the pieces from the Feynman rules, we see that we arrive at the same expressions as in (5.64).
We could do precisely the same thing at higher order in λ – we would just have more than one vertex. To
obtain the right answer at a given order we just have to make sure we have drawn all possible diagrams.
Now let us come to the denominator $D$ in (5.56), i.e.
$$D \equiv D_0 + \lambda D_1 + \cdots = {}_0\langle 0|T\exp\left(-i\int_{-\infty}^{\infty} dt\, H_I(t)\right)|0\rangle_0 = {}_0\langle 0|\left(1 - i\frac{\lambda}{4!}\int d^4 x\, \phi_I(x)^4 + \cdots\right)|0\rangle_0 \tag{5.66}$$
Clearly $D_0 = {}_0\langle 0|0\rangle_0 = 1$. We could determine $D_1$ through Wick contractions. However let’s use our fancy
new Feynman diagrams – here we are computing the vacuum persistence amplitude with no external points,
and so we should simply draw all Feynman diagrams with only closed loops. We find at order $O(\lambda)$, as in
Figure 5.5:
$$\lambda D_1 = \frac{-i\lambda}{8}\int d^4 x\, G(x, x)^2 \tag{5.67}$$
It is quite possible to compute higher order terms – you simply apply the Feynman rules above. We can do
this, but for now let’s work out the full expression, which involves dividing the numerator by the denominator:
$$\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle = \frac{N}{D} = \frac{N_0 + \lambda N_1 + \cdots}{1 + \lambda D_1 + \cdots} = N_0 + \lambda(N_1 - D_1 N_0) + O(\lambda^2) \tag{5.68}$$
Now let’s assemble all of the pieces: we find
$$N_0 = G(x_1, x_2) \qquad \lambda N_1 = \lambda N_1^{\rm disc} + \lambda N_1^{\rm conn} = -i\lambda\left( \frac{1}{8}\int d^4 x\, G(x_1, x_2)\, G(x, x)^2 + \frac{1}{2}\int d^4 x\, G(x_1, x)\, G(x_2, x)\, G(x, x) \right) \tag{5.69}$$
Similarly we have
$$\lambda D_1 N_0 = \frac{-i\lambda}{8}\, G(x_1, x_2)\int d^4 x\, G(x, x)^2 \tag{5.70}$$
We find in the end, up to $O(\lambda)$:
$$\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle = G(x_1, x_2) - i\frac{\lambda}{2}\int d^4 x\, G(x_1, x)\, G(x_2, x)\, G(x, x) + \cdots \tag{5.71}$$

Figure 5.5: Diagram contributing to D1

Note that the term in D1 N0 from the denominator exactly cancels the disconnected diagram from the numer-
ator! This is capturing the physical idea that the disconnected diagram in the numerator – called a vacuum
bubble diagram, as it has no external legs – is actually capturing the physics of how the vacuum shifts due
to the interaction, and has nothing to do with the propagation of the particle itself. Indeed it isn’t hard to
show that this pattern of cancellations actually holds to all orders in perturbation theory: the role of the
denominator is simply to cancel all of the vacuum bubbles. (Remember I told you that this slightly ad-hoc
prescription for adjusting the vacuum normalization would make sense?).
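The first-order cancellation is simple enough to verify mechanically. The sketch below treats $N$ and $D$ as truncated power series in $\lambda$ and divides them; the three complex numbers standing in for $G(x_1, x_2)$ and the two integrals are arbitrary placeholders, not physical values, and the function name is invented for the example.

```python
def divide_series(num, den, order):
    """Power-series division num/den in lambda, truncated at `order` (requires den[0] != 0)."""
    quotient = []
    for n in range(order + 1):
        acc = num[n] if n < len(num) else 0
        for m in range(1, n + 1):
            if m < len(den):
                acc -= den[m] * quotient[n - m]
        quotient.append(acc / den[0])
    return quotient

# Arbitrary stand-ins (placeholders only):
g12 = 0.7 + 0.2j      # G(x1, x2)
bubble = 1.3 - 0.4j   # integral of G(x, x)^2
conn = -0.5 + 0.9j    # integral of G(x1, x) G(x2, x) G(x, x)

numerator = [g12, -1j * (bubble * g12 / 8 + conn / 2)]  # N0, N1 as in (5.69)
denominator = [1.0, -1j * bubble / 8]                   # D0, D1 as in (5.67)

result = divide_series(numerator, denominator, order=1)
print(result)
```

The order-$\lambda$ coefficient of the quotient contains only the connected piece, $-\frac{i}{2}\,$`conn`, as in (5.71): the `bubble` stand-in drops out entirely.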
One thus ends up with the following extremely nice pictorial formula for the time-ordered correlation function:
$$\langle 0|T(\phi(x_1)\phi(x_2))|0\rangle = \sum \text{All Feynman diagrams with 2 external points and no vacuum bubbles} \tag{5.72}$$

The extension to arbitrary numbers of external fields is clearly

$$\langle 0|T(\phi(x_1)\phi(x_2)\cdots\phi(x_n))|0\rangle = \sum \text{All Feynman diagrams with } n \text{ external points and no vacuum bubbles} \tag{5.73}$$
This is the main idea of perturbative quantum field theory.
A brief philosophical interlude: the Feynman calculus for perturbation theory is nice for two reasons:

1. It gives a simple algorithm to work out all the terms in the perturbative expansion – just draw some
pictures and apply the rules.
2. It gives a physical interpretation to each mathematical term: they represent (possibly virtual) particles
running around and scattering off of each other.

You may now want to go further: for example, why don’t we evaluate the integral over x in (5.71), to get an
explicit function of x1 , x2 ?
It is quite possible to do this, but the reason I’m not doing it right now is that it is actually quite infinite. In
fact, almost all integrals that involve “loops” of virtual particles are infinite – the infinity in this particular
diagram arises from the fact that G(x, x) represents the probability for a particle to end up exactly where it
started (i.e. at point x), which is very sensitive to the short-distance structure of the theory. Dealing with
this is the idea of renormalization, and it is actually where the real fun begins. We will discuss this further
next term.

6 Scattering

We have now learned how to calculate time-ordered correlation functions of fields in an interacting quantum
field theory. Happily, this is mostly drawing pictures. We would now however like to learn how to relate
these time-ordered correlation functions to the sorts of things we measure.

6.1 LSZ reduction formula and the S-matrix

The sort of thing that we (in principle) measure in a particle physics experiment is the following: we imagine
throwing in two φ particles and ask: what is the probability that (say) three φ particles come out, as a
function of the energy and momentum of the ingoing particles.
To compute this, we need to write this as a quantum mechanical inner product: we are looking for some inner
product of the form
$$\langle \text{Three particles out}|\text{Two particles in}\rangle \tag{6.1}$$
We will now show that computing such things can be related to time-ordered correlation functions in the
vacuum, and thus to Feynman diagrams.
We first need to understand our initial states. We will take “in” to mean “from the distant past, t → −∞”,
and “out” to mean, “in the distant future, t → +∞”. We can construct the states quite explicitly in the free
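theory; there they are created by the ladder operators $a^\dagger_{\vec p}$. Let us remind ourselves how this worked: in the
free theory we have an expression for the time-dependence of the field,
\[ \phi(\vec x, t) = \int \frac{d^3k}{(2\pi)^3 \sqrt{2\omega_{\vec k}}} \left( a_{\vec k}\, e^{ik\cdot x} + a^\dagger_{\vec k}\, e^{-ik\cdot x} \right), \qquad k_\mu = (-\omega_{\vec k}, \vec k) \qquad \text{(free theory)} \tag{6.2} \]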

and we could invert this to obtain an expression for the creation operator in terms of the field (recall (4.55)):
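\[ a^\dagger_{\vec k}(t) = \int d^3x \left( \sqrt{\frac{\omega_{\vec k}}{2}}\, \phi(t,\vec x) - \frac{i}{\sqrt{2\omega_{\vec k}}}\, \partial_t \phi(t,\vec x) \right) e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} \tag{6.3} \]
\[ = \frac{-i}{\sqrt{2\omega_{\vec k}}} \int d^3x\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t}\, \overleftrightarrow{\partial_t}\, \phi(t,\vec x) \tag{6.4} \]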

where I’ve introduced some new notation: for any two functions f (t) and g(t) we have
\[ f(t)\, \overleftrightarrow{\partial_t}\, g(t) \equiv f(t)\, \partial_t g(t) - (\partial_t f(t))\, g(t) \tag{6.5} \]

This tells us how to create a single-particle state with momentum $\vec k$ in the free theory: we just act with $a^\dagger_{\vec k}|0\rangle$.
How about in the interacting theory?
We will simply assume that the exact same expression creates both the initial and the final states in the
interacting theory. In other words, we take, even in the interacting theory:

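\[ a^{\dagger,\text{in}}_{\vec k} = \lim_{t\to-\infty} \frac{-i}{\sqrt{2\omega_{\vec k}}} \int d^3x\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t}\, \overleftrightarrow{\partial_t}\, \phi(t,\vec x) \tag{6.6} \]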

and
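\[ a^{\dagger,\text{out}}_{\vec k} = \lim_{t\to+\infty} \frac{-i}{\sqrt{2\omega_{\vec k}}} \int d^3x\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t}\, \overleftrightarrow{\partial_t}\, \phi(t,\vec x) \tag{6.7} \]

where now $\phi(t,\vec x)$ is the interaction picture field.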

Why do we do this? The idea here is basically that we are considering a scattering experiment where all
of the action – i.e. all of physics involving the interaction – takes place in a localized region of space and
time, and we can neglect the interaction at very early and very late times – the theory essentially becomes
free there. This does not always work – e.g. sometimes the fields in the Lagrangian are not associated with
“asymptotic states” – but basically it works whenever the theory is weakly coupled, which is the only case
we study in this course.
With this assumption we can now formulate the basic question. Let us consider the scattering of an initial
state $|i\rangle$ of two ingoing φ particles with momenta $\vec k_1$ and $\vec k_2$ into an outgoing state $|f\rangle$ of two outgoing φ
particles with momenta $\vec k_1'$ and $\vec k_2'$. The probability amplitude for this to happen is given by the following inner product:

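\[ \langle f | i \rangle = \langle 0 |\, a^{\text{out}}_{\vec k_1'}\, a^{\text{out}}_{\vec k_2'}\, a^{\dagger,\text{in}}_{\vec k_1}\, a^{\dagger,\text{in}}_{\vec k_2} \,| 0 \rangle \tag{6.8} \]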

This sort of inner product – measuring the amplitude for something in the distant past to turn into something
else in the distant future – is called an S-matrix element.
Next, note that as the out-operators are defined at future infinity and the in-operators are defined at past
infinity we can insert a time-ordering product with no loss of generality, i.e. the expression above is equal to

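\[ \langle 0 |\, T\big( a^{\text{out}}_{\vec k_1'}\, a^{\text{out}}_{\vec k_2'}\, a^{\dagger,\text{in}}_{\vec k_1}\, a^{\dagger,\text{in}}_{\vec k_2} \big) | 0 \rangle \tag{6.9} \]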

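Now our basic goal is to move the $a$’s and $a^\dagger$’s past each other so that we can annihilate them on the vacuum.
To do this, we will relate $a^{\dagger,\text{in}}$ to $a^{\dagger,\text{out}}$ by writing the difference between the two as the integral of a derivative:
\[ a^{\dagger,\text{out}}_{\vec k} - a^{\dagger,\text{in}}_{\vec k} = \int_{-\infty}^{+\infty} dt\, \partial_t a^\dagger_{\vec k}(t) \tag{6.10} \]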

We now plug in the expression (6.4) to find:


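\[ a^{\dagger,\text{out}}_{\vec k} - a^{\dagger,\text{in}}_{\vec k} = \int dt\, \partial_t \left( \frac{-i}{\sqrt{2\omega_{\vec k}}} \int d^3x\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t}\, \overleftrightarrow{\partial_t}\, \phi(t,\vec x) \right) \tag{6.11} \]
\[ = \frac{-i}{\sqrt{2\omega_{\vec k}}} \int d^4x \left( e^{i\vec k\cdot\vec x - i\omega_{\vec k} t}\, \partial_t^2 \phi(t,\vec x) - \phi(t,\vec x)\, \partial_t^2 e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} \right) \tag{6.12} \]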

Now let’s look at the last term:


 
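\[ \partial_t^2 e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} = -\omega_{\vec k}^2\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} = -\left( \vec k^2 + m^2 \right) e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} = (\vec\nabla^2 - m^2)\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} \tag{6.13} \]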

So we obtain:
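\[ a^{\dagger,\text{out}}_{\vec k} - a^{\dagger,\text{in}}_{\vec k} = \frac{-i}{\sqrt{2\omega_{\vec k}}} \int d^4x \left( e^{i\vec k\cdot\vec x - i\omega_{\vec k} t}\, \partial_t^2 \phi(t,\vec x) - \phi(t,\vec x)\, (\vec\nabla^2 - m^2)\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} \right) \tag{6.14} \]
\[ = \frac{-i}{\sqrt{2\omega_{\vec k}}} \int d^4x \left( e^{i\vec k\cdot\vec x - i\omega_{\vec k} t}\, \partial_t^2 \phi(t,\vec x) - \left[ (\vec\nabla^2 - m^2)\, \phi(t,\vec x) \right] e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} \right) \tag{6.15} \]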

where we integrated by parts to transfer the spatial derivatives off of the exponential and onto the field. We
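now notice the appearance of the Lorentz-invariant operator $\partial_\mu \partial^\mu$ to find the final expression
\[ a^{\dagger,\text{in}}_{\vec k} = a^{\dagger,\text{out}}_{\vec k} + \frac{i}{\sqrt{2\omega_{\vec k}}} \int d^4x\, e^{ik\cdot x} \left( -\partial_\mu \partial^\mu + m^2 \right) \phi(x) \tag{6.16} \]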

Note that this makes a lot of sense – it tells us that the change in the operator that creates a particle over
time is proportional to the Klein-Gordon operator acting on the corresponding field. In a free theory we
would find $(\partial_\mu \partial^\mu - m^2)\phi(x) = 0$ and thus the creation operator is time-independent. In an interacting theory
the equation of motion is modified and the last term is no longer zero.
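The free-theory statement can be checked numerically. The following is a minimal sketch, assuming the mostly-plus convention $k\cdot x = -\omega_{\vec k} t + \vec k\cdot\vec x$ used above; the function names, the sample momentum and the finite-difference step are illustrative choices, not part of the notes. It applies $-\partial_t^2 + \vec\nabla^2 - m^2$ to an on-shell plane wave and confirms that the result vanishes up to discretisation error.

```python
import cmath
import math

def plane_wave(t, x, k, m):
    """e^{i k.x} with k.x = -omega*t + k_vec.x_vec and omega = sqrt(k_vec^2 + m^2)."""
    omega = math.sqrt(sum(ki * ki for ki in k) + m * m)
    phase = sum(ki * xi for ki, xi in zip(k, x)) - omega * t
    return cmath.exp(1j * phase)

def klein_gordon_residual(t, x, k, m, h=1e-3):
    """Central-difference estimate of (-d_t^2 + laplacian - m^2) acting on the plane wave."""
    f0 = plane_wave(t, x, k, m)
    d2t = (plane_wave(t + h, x, k, m) - 2 * f0 + plane_wave(t - h, x, k, m)) / h**2
    laplacian = 0
    for axis in range(3):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[axis] += h
        x_minus[axis] -= h
        laplacian += (plane_wave(t, x_plus, k, m) - 2 * f0
                      + plane_wave(t, x_minus, k, m)) / h**2
    return -d2t + laplacian - m * m * f0

# Illustrative (hypothetical) sample point, momentum and mass.
residual = klein_gordon_residual(t=0.3, x=[0.1, -0.4, 0.7], k=[0.5, 1.0, -0.2], m=1.3)
print(abs(residual))  # small: limited only by the finite-difference step
```

Since the free field (6.2) is a superposition of such plane waves, the integrand in (6.16) vanishes in the free theory, which is the time-independence of the creation operator stated above.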
We can of course take the Hermitian conjugate to find the expression for the annihilation operators:
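\[ a^{\text{out}}_{\vec k} = a^{\text{in}}_{\vec k} + \frac{i}{\sqrt{2\omega_{\vec k}}} \int d^4x\, e^{-ik\cdot x} \left( -\partial_\mu \partial^\mu + m^2 \right) \phi(x) \tag{6.17} \]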

Now let’s recall our S-matrix element

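\[ \langle 0 |\, T\big( a^{\text{out}}_{\vec k_1'}\, a^{\text{out}}_{\vec k_2'}\, a^{\dagger,\text{in}}_{\vec k_1}\, a^{\dagger,\text{in}}_{\vec k_2} \big) | 0 \rangle \tag{6.18} \]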

Now we use the expressions (6.16) and (6.17) to replace $a^{\text{out}}$ with $a^{\text{in}}$ plus an expression involving the field
$\phi(x)$, and similarly we replace $a^{\dagger,\text{in}}$ with $a^{\dagger,\text{out}}$ plus an expression involving the field $\phi(x)$. Now we see
the magic of time-ordering! Each of the $a^{\text{in}}$’s is now time-ordered to the right and annihilated against
the vacuum $|0\rangle$. Similarly each of the $a^{\dagger,\text{out}}$’s is now time-ordered to the left and annihilated against the
conjugate vacuum $\langle 0|$.
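Thus after the dust settles all that remain are the expressions involving the fields, which give
\[ \langle f|i\rangle = \frac{i^4}{\sqrt{2\omega_{\vec k_1}\cdot 2\omega_{\vec k_1'}\cdots}} \int d^4x_1\, d^4x_2\, d^4x_1'\, d^4x_2'\; e^{ik_1\cdot x_1 + ik_2\cdot x_2 - ik_1'\cdot x_1' - ik_2'\cdot x_2'} \times (-\partial_1^2+m^2)(-\partial_2^2+m^2)(-\partial_{1'}^2+m^2)(-\partial_{2'}^2+m^2)\, \langle 0|T\big(\phi(x_1)\phi(x_2)\phi(x_1')\phi(x_2')\big)|0\rangle \tag{6.19} \]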

This is the Lehmann-Symanzik-Zimmermann (or LSZ) reduction formula! The generalization to $n$ incoming
particles and $n'$ outgoing particles is quite obvious, so let me just write that out:
\[ \langle f|i\rangle = \frac{i^{n+n'}}{\sqrt{2\omega_{\vec k_1}\cdot 2\omega_{\vec k_1'}\cdots}} \int d^4x_1 \cdots d^4x_1' \cdots\; e^{ik_1\cdot x_1}(-\partial_1^2+m^2) \cdots e^{-ik_1'\cdot x_1'}(-\partial_{1'}^2+m^2)\, \langle 0|T\big(\phi(x_1)\cdots\phi(x_n)\phi(x_1')\cdots\phi(x'_{n'})\big)|0\rangle \tag{6.20} \]
As the left hand side (defined in terms of asymptotic states) is naturally defined in momentum space, it
suggests that we should express the right-hand side in momentum space as well. To understand this, let’s
define the Fourier transform of the correlation function:
\[ \langle 0|T\{\phi(k_1)\cdots\phi(k_n)\phi(k_1')\cdots\phi(k'_{n'})\}|0\rangle \equiv \int \prod_{i=1}^{n} d^4x_i\, e^{ik_i\cdot x_i} \prod_{i=1}^{n'} d^4x_i'\, e^{-ik_i'\cdot x_i'}\, \langle 0|T\big(\phi(x_1)\cdots\phi(x_n)\phi(x_1')\cdots\phi(x'_{n'})\big)|0\rangle \tag{6.21} \]
Note that I have Fourier transformed the ingoing particles with $e^{+ik\cdot x}$ and the outgoing ones with $e^{-ik'\cdot x'}$ to
match the exponents in (6.20).
We then find that the LSZ formula becomes:
\[ \langle f|i\rangle = \frac{i^{n+n'}}{\sqrt{2\omega_{\vec k_1}\cdot 2\omega_{\vec k_1'}\cdots}}\, (k_1^2+m^2)\cdots(k_{n'}'^2+m^2)\, \langle 0|T\{\phi(k_1)\cdots\phi(k'_{n'})\}|0\rangle \tag{6.22} \]

In other words – to determine the S-matrix element, work out the momentum-space $(n+n')$-point correlation
function and multiply by factors of $k_i^2 + m^2$, one for each particle. We should also divide by factors of $\sqrt{2\omega_k}$:
in fact this factor is arising from how I chose to normalize the states, and is not physically significant.

6.2 Scattering: $\lambda\phi^4$

Let’s do an example: let’s understand 2 − 2 scattering in $\lambda\phi^4$ theory. We will scatter two particles with
3-momenta $\vec k_1, \vec k_2$ into two particles with 3-momenta $\vec k_1', \vec k_2'$.
From the LSZ formula, we need to work out the Fourier transform
\[ \langle T\{\phi(k_1)\phi(k_2)\phi(k_1')\phi(k_2')\}\rangle \equiv \int d^4x_1\, d^4x_2\, d^4x_1'\, d^4x_2'\; e^{ik_1\cdot x_1 + ik_2\cdot x_2 - ik_1'\cdot x_1' - ik_2'\cdot x_2'}\, \langle T\{\phi(x_1)\phi(x_2)\phi(x_1')\phi(x_2')\}\rangle \tag{6.23} \]
(As there is no worry about forgetting which vacuum we are in now, I will omit the $\langle 0|$ and $|0\rangle$ on the sides.)
Let’s now work out this correlation function order by order. First, at $O(\lambda^0)$, we know that in the free theory
there are three trivial diagrams leading to
\[ \langle T\{\phi(x_1)\phi(x_2)\phi(x_1')\phi(x_2')\}\rangle\Big|_{O(\lambda^0)} = G(x_1,x_1')\,G(x_2,x_2') + G(x_1,x_2)\,G(x_1',x_2') + G(x_1,x_2')\,G(x_1',x_2) \tag{6.24} \]
I claim that this contribution does not represent scattering. To understand this, let’s look a bit more
carefully at the Fourier transforms of these terms. Let’s zoom in on
\[ \int d^4x_1 \cdots d^4x_2'\; e^{+ik_1\cdot x_1 \cdots - ik_2'\cdot x_2'}\, G(x_1,x_1')\,G(x_2,x_2') \tag{6.25} \]

Now, by translational invariance we know that $x_1$ and $x_1'$ will always appear in the combination $G(x_1,x_1') = G(x_1 - x_1')$, and so we find
\[ \int d^4x_1 \cdots d^4x_2'\; e^{+ik_1\cdot x_1 \cdots - ik_2'\cdot x_2'}\, G(x_1 - x_1')\,G(x_2 - x_2') \tag{6.26} \]

Let’s look carefully at the integral over x1 and x01 : we can rewrite the exponent as
\[ k_1 x_1 - k_1' x_1' = \frac{1}{2}(k_1 - k_1')(x_1 + x_1') + \frac{1}{2}(k_1 + k_1')(x_1 - x_1') \tag{6.27} \]
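The rewriting of the exponent in (6.27) is a purely algebraic identity, and it can be spot-checked with numbers. The sketch below assumes the mostly-plus inner product $a\cdot b = -a^0 b^0 + \vec a\cdot\vec b$; the helper names and the sample 4-vectors are hypothetical values chosen only for illustration.

```python
def minkowski_dot(a, b):
    """Mostly-plus inner product: a.b = -a0*b0 + a_vec.b_vec."""
    return -a[0] * b[0] + sum(ai * bi for ai, bi in zip(a[1:], b[1:]))

def add(a, b):
    return [ai + bi for ai, bi in zip(a, b)]

def sub(a, b):
    return [ai - bi for ai, bi in zip(a, b)]

# Arbitrary (hypothetical) 4-vectors: momenta k1, k1' and positions x1, x1'.
k1 = [1.7, 0.3, -0.5, 1.1]
k1p = [1.2, -0.4, 0.9, 0.2]
x1 = [0.6, 2.0, -1.3, 0.4]
x1p = [-0.8, 1.1, 0.7, -2.5]

lhs = minkowski_dot(k1, x1) - minkowski_dot(k1p, x1p)
rhs = (0.5 * minkowski_dot(sub(k1, k1p), add(x1, x1p))
       + 0.5 * minkowski_dot(add(k1, k1p), sub(x1, x1p)))
print(lhs, rhs)  # the two sides agree up to rounding
```

The identity only uses bilinearity of the inner product, so it holds for any choice of the four vectors.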
Now consider changing integration variables to the two orthogonal combinations $x_1 + x_1'$ and $x_1 - x_1'$. Note
that the combination $x_1 + x_1'$ actually does not appear anywhere in the integrand: by translational invariance
everything in the integrand depends only on $x_1 - x_1'$. We can thus trivially do the integral over the combination
$x_1 + x_1'$, and we see that the whole integral is going to be proportional to $\delta^{(4)}(k_1 - k_1')$, and by a similar
argument we also get a factor of $\delta^{(4)}(k_2 - k_2')$. Thus we find that
\[ \int d^4x_1 \cdots d^4x_2'\; e^{+ik_1\cdot x_1 \cdots - ik_2'\cdot x_2'}\, G(x_1 - x_1')\,G(x_2 - x_2') \propto \delta^{(4)}(k_1 - k_1')\, \delta^{(4)}(k_2 - k_2') \tag{6.28} \]

But this means that each ingoing momentum is the same as an outgoing momentum, and thus there is no
scattering! This is clear from the Feynman diagram – the ingoing particle is simply the same as
the outgoing particle. It is pretty clear that a similar thing will happen for all the disconnected diagrams –
which are all that we have at $O(\lambda^0)$ – and we will from now on ignore them.
Let’s move on to $O(\lambda^1)$. In that case we have the following contribution to the time-ordered correlation
function:
\[ \langle T\{\phi(x_1)\phi(x_2)\phi(x_1')\phi(x_2')\}\rangle = -i\lambda \int d^4z\, G(x_1,z)\,G(x_2,z)\,G(x_1',z)\,G(x_2',z) \tag{6.29} \]

Now we will want to examine this in Fourier space.
Basically what will happen is that each of the $G(x,z)$’s will be replaced by a $\frac{-i}{p^2+m^2}$. I’m going to go through
this a bit carefully for the $x_1$ coordinate alone. There are two things happening: we are replacing the
propagator with its Fourier transform via (4.140):
\[ G(x_1,z) = -i \int \frac{d^4p_1}{(2\pi)^4}\, \frac{e^{-ip_1\cdot(x_1 - z)}}{p_1^2 + m^2} \tag{6.30} \]

Note that it does not matter here whether we take $\pm ip$ in the exponent in the numerator – the integral is the same function
of $x_1$ and $z$. Next we are integrating over the $x_1$ in the Fourier transform (6.23) with kernel $e^{+ik_1\cdot x_1}$ – so the
“full $x_1$-part” of this integral looks like:
\[ \langle T\{\phi(k_1)\cdots\}\rangle = -i \int d^4x_1\, e^{ik_1\cdot x_1} \int \frac{d^4p_1}{(2\pi)^4}\, \frac{e^{-ip_1\cdot(x_1 - z)}}{p_1^2 + m^2} \cdots \tag{6.31} \]
\[ = \int d^4p_1\, \delta^{(4)}(p_1 - k_1)\, \frac{-i}{p_1^2 + m^2}\, e^{+ip_1\cdot z} \cdots \tag{6.32} \]
\[ = \frac{-i}{k_1^2 + m^2}\, e^{+ik_1\cdot z} \cdots \tag{6.33} \]

In other words, we can just replace the position-space propagator with the momentum-space propagator, evaluated at the momentum of the ingoing particle. Clearly the
same argument works for all of the rest of the external points, and we find:
\[ \langle T\{\phi(k_1)\phi(k_2)\phi(k_1')\phi(k_2')\}\rangle = -i\lambda \int d^4z\, e^{iz\cdot(k_1 + k_2 - k_1' - k_2')}\; \frac{-i}{k_1^2+m^2}\, \frac{-i}{k_2^2+m^2}\, \frac{-i}{k_1'^2+m^2}\, \frac{-i}{k_2'^2+m^2} \tag{6.34} \]

Note that I picked the exponents in the propagators (6.30) to be different for the ingoing and outgoing
particles, so that they would line up nicely with the sign differences in (6.23). Next let’s do the integral over
$z$, which results in
\[ \langle T\{\phi(k_1)\phi(k_2)\phi(k_1')\phi(k_2')\}\rangle = -i\lambda\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - k_1' - k_2') \prod_{a\in\{1,2,1',2'\}} \frac{-i}{k_a^2 + m^2} \tag{6.35} \]

Note that the integral over the internal interaction point can be thought of as conserving the momentum
flowing into the vertex. Finally, we are ready to plug into the momentum-space LSZ formula (6.22). The
factors of $i(k^2 + m^2)$ there exactly cancel all of the propagators connected to the external legs. This is
(somewhat violently) called amputating the propagators: you can see that it will always happen. Then
we find for the inner product:
\[ \langle f|i\rangle = (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - k_1' - k_2') \left( \prod_a \frac{1}{\sqrt{2\omega_a}} \right) (-i\lambda) \tag{6.36} \]

That’s it! This is the matrix element for the 2-2 scattering. It’s all kind of amazing: if you actually built an
accelerator and smashed together the particles then this is really what you would measure.
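The amputation step can be made concrete with a few lines of arithmetic. This is only a sketch: the function names are illustrative, the overall momentum-conserving delta function and the $1/\sqrt{2\omega}$ normalisation factors are stripped off, and the values of $k^2$ are taken slightly off-shell (a hypothetical choice) so that each factor is finite and the cancellation is visible.

```python
def propagator(k_squared, m):
    """External-leg propagator -i/(k^2 + m^2), with k^2 in mostly-plus signature."""
    return -1j / (k_squared + m * m)

def lsz_factor(k_squared, m):
    """Factor i(k^2 + m^2) supplied by the LSZ formula for each external particle."""
    return 1j * (k_squared + m * m)

def amputated_amplitude(external_k_squared, m, lam):
    """Tree-level four-point correlator (delta function stripped) times the LSZ factors."""
    correlator = -1j * lam
    for k_squared in external_k_squared:
        correlator *= propagator(k_squared, m)
    amplitude = correlator
    for k_squared in external_k_squared:
        amplitude *= lsz_factor(k_squared, m)
    return amplitude

# Hypothetical off-shell values of k^2 for the four external legs (on-shell would be -m^2).
m, lam = 1.0, 0.3
amp = amputated_amplitude([-0.9, -1.2, -0.7, -1.05], m, lam)
print(amp)  # equals -i*lam up to rounding, independent of the k^2 values chosen
```

Each pair of factors multiplies to one, so only the vertex factor $-i\lambda$ survives, in line with (6.36).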
Now we should note that actually quite a lot of this computation is kinematics – the overall delta function
that conserves momentum will always be there in any scattering process, as will the factors of $\sqrt{2\omega_{k_i}}$. It is
conventional to separate off all of this kinematic information and call the rest of the process $i\mathcal{M}$:
\[ \langle f|i\rangle = (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - k_1' - k_2') \left( \prod_a \frac{1}{\sqrt{2\omega_a}} \right) i\mathcal{M} \tag{6.37} \]

All of the dynamical information is present in the factor iM, which we call the invariant matrix element.
Indeed, from the manipulations above we see that we can calculate iM directly from the Feynman rules in
momentum-space, which are:

1. For each internal line, assign it a 4-momentum $p$ and write $\frac{-i}{p^2+m^2+i\epsilon}$.

2. For each external particle, draw a line but do not write a propagator, because it is amputated. The
4-momentum flowing in (or out) is determined by the momentum of the initial (or final) state.

3. For each 4-point vertex, write $-i\lambda$ and impose momentum conservation on the momenta flowing in.

4. Integrate over all the unfixed momenta with measure $\int \frac{d^4p}{(2\pi)^4}$.

5. Divide by the symmetry factor.

Figure 6.1: Momentum space Feynman rules for $\lambda\phi^4$ theory

Doing this, we see that for the $\phi^4$ theory we do indeed immediately get $i\mathcal{M} = -i\lambda$: all the propagators are
external and so there is nothing that is unamputated, and there are no unfixed momenta.

6.3 Scattering: $g\phi^2\sigma$ theory

At the end of the day, there was no momentum-dependence in this inner product. So it was slightly boring.
Let us thus try to do something slightly more intricate. I will first do this by introducing a new theory – this
is the theory of two interacting scalar fields $\phi(x)$ and $\sigma(x)$. The action for both will be
\[ S[\phi,\sigma] = \int d^4x \left( -\frac{1}{2}(\partial_\mu\phi\,\partial^\mu\phi) - \frac{1}{2}(\partial_\mu\sigma\,\partial^\mu\sigma) - \frac{1}{2} m_\sigma^2 \sigma^2 - \frac{1}{2} m_\phi^2 \phi^2 - \frac{g}{2}\phi^2\sigma \right) \tag{6.38} \]

This theory has two different sorts of particles: φ-type particles with mass $m_\phi$ and σ-type particles with
mass $m_\sigma$. The character of the interaction is different: in Feynman diagrams, we now have a 3-point vertex
with two φ’s and one σ.
We can now immediately figure out the Feynman rules, which we show in Figure 6.2. As there are two
different types of particles, we have two different propagators: one (solid) line representing the φ particles,
with momentum space propagator $\frac{-i}{p^2+m_\phi^2+i\epsilon}$, and one (dashed) line representing the σ particle, with momentum
space propagator $\frac{-i}{p^2+m_\sigma^2+i\epsilon}$. Finally, the vertex has weight $-ig$ (note that the factor of 2 from interchanging the 2 φ
lines is taken care of by the $\frac{1}{2}$).
Now let’s study the same scattering process – look at 2-2 scattering of 2 φ particles with incoming momenta
$\vec k_1, \vec k_2$ to 2 φ particles with outgoing momenta $\vec k_1', \vec k_2'$. We can now use the Feynman rules to calculate
the invariant matrix element $i\mathcal{M}$. We show the Feynman diagrams in Figure 6.3. There are also other
disconnected Feynman diagrams, but they do not contribute to scattering.
Using the Feynman rules to evaluate them we find
\[ i\mathcal{M}(\vec k_{1,2}, \vec k_{1,2}') = (-ig)^2 \left( \frac{-i}{(k_1+k_2)^2 + m_\sigma^2} + \frac{-i}{(k_1' - k_1)^2 + m_\sigma^2} + \frac{-i}{(k_2' - k_1)^2 + m_\sigma^2} \right) \tag{6.39} \]

This is the scattering amplitude!

Figure 6.2: Momentum space Feynman rules for $g\phi^2\sigma$ theory

Figure 6.3: Feynman diagrams contributing to 2 − 2 scattering

Note the interesting dependence on $k_1$, $k_2$ etc.; in principle this dependence could be measured in an
experiment, by varying the momenta of the things we are smashing together. In such 2 − 2 scattering
problems it is very convenient to introduce the so-called Mandelstam variables $s, t, u$:
\[ s \equiv (k_1 + k_2)^2 \qquad t \equiv (k_1 - k_1')^2 \qquad u \equiv (k_2' - k_1)^2 \tag{6.40} \]
as these represent all the scalar information present in the scattering amplitude. We can thus rewrite the
matrix element as
\[ i\mathcal{M}(s,t,u) = (-ig)^2 \left( \frac{-i}{s + m_\sigma^2} + \frac{-i}{t + m_\sigma^2} + \frac{-i}{u + m_\sigma^2} \right) \tag{6.41} \]
where this slightly simpler notation makes clear that $i\mathcal{M}$ actually depends on these scalar quantities alone.
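To see the Mandelstam variables in action, here is a small sketch that evaluates (6.40) and (6.41) for one kinematic configuration. It assumes the mostly-plus metric used throughout, so that $s$ is negative for physical scattering; the centre-of-mass configuration, the masses, the coupling and all function names are hypothetical choices made only for illustration.

```python
import math

def on_shell(k_vec, m):
    """4-momentum (omega, k_vec) with omega = sqrt(k_vec^2 + m^2)."""
    omega = math.sqrt(sum(c * c for c in k_vec) + m * m)
    return [omega, *k_vec]

def square(p):
    """Mostly-plus square: p^2 = -p0^2 + p_vec^2."""
    return -p[0] ** 2 + sum(c * c for c in p[1:])

def mandelstam(k1, k2, k1p, k2p):
    s = square([a + b for a, b in zip(k1, k2)])
    t = square([a - b for a, b in zip(k1, k1p)])
    u = square([a - b for a, b in zip(k2p, k1)])
    return s, t, u

def amplitude(s, t, u, g, m_sigma):
    """Tree-level iM of (6.41): sum of s-, t- and u-channel sigma exchange."""
    return (-1j * g) ** 2 * sum(-1j / (x + m_sigma ** 2) for x in (s, t, u))

# Hypothetical centre-of-mass kinematics: momentum magnitude p, scattering angle theta.
m_phi, m_sigma, g = 1.0, 0.5, 0.8
p, theta = 2.0, 0.7
k1 = on_shell([0.0, 0.0, p], m_phi)
k2 = on_shell([0.0, 0.0, -p], m_phi)
k1p = on_shell([p * math.sin(theta), 0.0, p * math.cos(theta)], m_phi)
k2p = on_shell([-p * math.sin(theta), 0.0, -p * math.cos(theta)], m_phi)

s, t, u = mandelstam(k1, k2, k1p, k2p)
iM = amplitude(s, t, u, g, m_sigma)
print(s, t, u, s + t + u, iM)
```

For this configuration $s = -4(p^2 + m_\phi^2)$, and momentum conservation with all four particles on-shell forces $s + t + u = -4m_\phi^2$ in this signature, so only two of the three variables are independent.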

6.4 Wrapping up

This is almost the end of our introduction to quantum field theory: in the next term you will continue to
build on the framework that we have developed. I want to conclude by reminding you of the list of things
that I said that we would learn back on page 6, and checking whether we have indeed learned them or not.

1. There are different, yet totally indistinguishable copies of elementary particles like the electron.
We have learned this, at least for scalar particles: it has to do with the fact that a particle is created
by an operator $a^\dagger_{\vec p}$ which commutes with other creation operators.

2. There is a relationship between the statistics of particles (i.e. the behavior of their exchange) and their
spin (i.e. what happens if you rotate them).
This is next term, because we did not study spin yet. Sorry.

3. Anti-particles exist.
Yep: if we have a complex scalar field there are two different creation operators b† and a† with opposite
charge. For the real scalar field it makes sense to think of the particle as its own antiparticle.
4. Particles can be created and destroyed.
Okay this one we have really studied to death. The Hilbert space of quantum field theory does not
have a fixed particle number, and we saw that this has something deep to do with relativity.
5. Finally, and perhaps most profoundly: the things that we call forces can be imagined as being caused by
the exchange of particles.
We have actually not quite pushed on this yet, so let me now discuss how we can understand this. In
fact the scattering processes shown above clearly indicate that (e.g.) the φ particles are feeling a force
– that is why they are scattering. Usually however in elementary physics we express these forces in
terms of a potential V (x) – is there a way for us to read a potential off of the scattering process in
Figure 6.3?
Here we are trying to take the non-relativistic limit of those diagrams and compare it to elementary
non-relativistic physics. As it turns out, the s-channel diagram looks very relativistic – two φ particles
collide and vanish into a σ particle. Similarly, in non-relativistic physics particles are distinguishable
and it is confusing to try and interpret the last u-channel one, which is trying to swap particles 1 and
2. We can however try and interpret the middle t-channel diagram.
To do this we actually need to compare our fancy QFT result to the formulas for non-relativistic
scattering in quantum mechanics. You might or might not have studied this; it turns out that the
matrix element for scattering a momentum eigenstate with momentum $\vec k$ to $\vec k'$ off a potential $V(x)$ is
(to lowest order in $V$ – this is called the Born approximation):

\[ \langle \vec k'|\, iT \,|\vec k\rangle = -(2\pi)\, i\, \tilde V(\vec k' - \vec k)\, \delta(\omega_{\vec k} - \omega_{\vec k'}) \tag{6.42} \]

(I’m skipping a lot of details here; in particular there are some kinematic factors that I’m ignoring. My
treatment here is even more sketchy than that in Peskin & Schroeder Chapter 4. The necessary background
for non-relativistic scattering can be found in Griffiths Introduction to Quantum Mechanics Chapter
11).
Here $\tilde V$ is the Fourier transform of the interaction potential, evaluated at the momentum transfer. In
other words, we can compare this to the non-relativistic limit of the t-channel scattering diagram in
(6.39), i.e.
\[ i\mathcal{M}_{\text{non-rel}} \sim +ig^2 \left( \frac{1}{(\vec k_1' - \vec k_1)^2 + m_\sigma^2} \right) \tag{6.43} \]

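Which means that we can read off that the Fourier transform of the interaction potential is
\[ \tilde V(\vec q) = -\frac{g^2}{\vec q^{\,2} + m_\sigma^2} \tag{6.44} \]
Now we can Fourier transform this back to get an answer in position space:
\[ V(\vec x) = -\int \frac{d^3q}{(2\pi)^3}\, e^{i\vec q\cdot\vec x}\, \frac{g^2}{\vec q^{\,2} + m_\sigma^2} = -\frac{g^2}{4\pi}\, \frac{1}{|\vec x|}\, e^{-m_\sigma |\vec x|} \tag{6.45} \]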

(The integral can be done in spherical polars and then using residues; again I refer you to Peskin for the
details). This should be interpreted in the following way: in a certain regime, exchanging σ particles
with mass $m_\sigma$ can be interpreted as pulling on other particles with an attractive potential of the
form $\frac{1}{r} e^{-m_\sigma r}$!
This is where “forces” come from.
Note that the potential decays away exponentially in space – this is because σ particles are heavy, and
they don’t want to be pulled forth from the vacuum and they don’t manage to go too far when they
do. But note that if we were exchanging massless particles then the potential would be a simple $\frac{1}{r}$;
this is exactly what happens for the Coulomb interaction in electrodynamics, where we are exchanging
massless photons.
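The relation between (6.44) and (6.45) can be checked numerically in the easier direction, by transforming the position-space potential back to momentum space. The sketch below assumes the convention $\tilde V(\vec q) = \int d^3x\, e^{-i\vec q\cdot\vec x} V(\vec x)$, which for a spherically symmetric potential reduces to the radial integral $\frac{4\pi}{q}\int_0^\infty dr\, r \sin(qr)\, V(r)$. The function names, the cutoff `r_max`, the step count and the sample values of $g$, $m_\sigma$ and $q$ are illustrative choices.

```python
import math

def yukawa_potential(r, g, m_sigma):
    """Position-space potential of (6.45): -(g^2 / 4 pi) * exp(-m_sigma r) / r."""
    return -g * g / (4 * math.pi) * math.exp(-m_sigma * r) / r

def fourier_transform_radial(q, g, m_sigma, r_max=60.0, n=200_000):
    """(4 pi / q) * integral_0^r_max r sin(q r) V(r) dr, by composite Simpson (n even)."""
    h = r_max / n

    def integrand(r):
        if r == 0.0:
            return 0.0  # r * V(r) is finite at r = 0 and sin(0) = 0
        return r * math.sin(q * r) * yukawa_potential(r, g, m_sigma)

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 4 * math.pi / q * total * h / 3

# Hypothetical sample values for the coupling, the sigma mass and the momentum transfer.
g, m_sigma, q = 2.0, 1.0, 1.5
numeric = fourier_transform_radial(q, g, m_sigma)
exact = -g * g / (q * q + m_sigma * m_sigma)
print(numeric, exact)  # the two agree to high accuracy
```

The exponential factor makes the radial integral converge quickly, which is why truncating at a finite `r_max` is harmless here; for a massless exchange the same integral would need more care.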

6.5 Loops and infinity and renormalization and all of that

Finally, I want to discuss one more thing. Let’s go back to our favorite λφ4 theory and imagine computing the
scattering amplitude to second order in λ. We then get diagrams that look like Figure 6.4. These have loops.
This means that the Feynman diagrams require us to integrate over an unfixed momentum. For example,
the first diagram is
\[ i\mathcal{M}_1(k_1,k_2) = (-i\lambda)^2 \int \frac{d^4p}{(2\pi)^4}\, \frac{-i}{(p + k_1 + k_2)^2 + m^2}\, \frac{-i}{p^2 + m^2} \tag{6.46} \]

This is an interestingly difficult integral. But let’s note one thing first – is it even finite at large p? Expanding
everything at large p we see that we have
\[ i\mathcal{M}_1(k_1,k_2) \sim \int^{\Lambda} \frac{d^4p}{p^4} \sim \log\Lambda \tag{6.47} \]
where Λ is some sort of cutoff at large momenta. The integral is divergent!
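The logarithmic growth in (6.47) can be seen numerically in a simplified version of the integral. The sketch below makes several assumptions that are not spelled out in the notes: the momentum is treated as Euclidean (as after a Wick rotation), the external momenta are set to zero, and all overall constants including the angular integral are dropped, leaving the radial integral $\int_0^\Lambda dp\, p^3/(p^2+m^2)^2$. Function names and cutoff values are illustrative.

```python
import math

def integrand(p, m):
    """Radial integrand p^3 / (p^2 + m^2)^2 of the simplified Euclidean loop integral."""
    return p ** 3 / (p * p + m * m) ** 2

def cutoff_integral(cutoff, m, n=200_000):
    """Composite Simpson estimate of integral_0^cutoff p^3/(p^2+m^2)^2 dp (n even)."""
    h = cutoff / n
    total = integrand(0.0, m) + integrand(cutoff, m)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h, m)
    return total * h / 3

def closed_form(cutoff, m):
    """Exact value: (1/2) * [ln(1 + L^2/m^2) + 1/(1 + L^2/m^2) - 1]."""
    ratio = 1.0 + (cutoff / m) ** 2
    return 0.5 * (math.log(ratio) + 1.0 / ratio - 1.0)

m = 1.0
values = {cutoff: cutoff_integral(cutoff, m) for cutoff in (100.0, 200.0, 400.0)}
growth = values[200.0] - values[100.0]
print(values, growth, math.log(2.0))
```

Each doubling of the cutoff adds approximately $\log 2$ to the integral, which is the signature of a logarithmic divergence: the result never settles down as $\Lambda$ is raised.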

Figure 6.4: 2 − 2 scattering at O(λ2 )

What does this mean?


In many ways this is really where the actual fun of quantum field theory starts.
What this is telling you is that the information at short distances (high momenta p) is somehow filtering
down to affect this scattering amplitude. Organizing this flow of information is the idea of renormalization
and effective theory, which encode the somewhat common-sense idea that we should think about physics
according to the distance scale that we are interested in. The mathematical formalization of this idea is
probably one of the great ideas of science in the twentieth century, and we continue to learn how to use these
principles to understand new systems.
But sadly, I am now really out of time.

The End
(of the first term of Advanced Quantum Theory)
