Lectures on Introduction to Quantum Field Theory

Ghanashyam Date

2022, arXiv (Cornell University)

249 pages

These are lecture notes of the QFT-I course I gave in an online mode at Chennai Mathematical Institute. The course focussed on the free relativistic quantum fields, their interactions in the perturbative scattering framework, standard computations of QED processes, radiative corrections at 1-loop with renormalization and an introduction to the toolbox of path integrals.

July 7, 2022. arXiv:2207.02243v1 [hep-th], 5 Jul 2022.

Author affiliation: Chennai Mathematical Institute, H1, SIPCOT IT Park, Siruseri, Kelambakkam 603103, INDIA. Electronic address: [email protected], [email protected]

Contents

1. What is Quantum Field Theory?
   A. Relativistic Covariance
   B. Compatibility with quantum conditions
   C. Manifest Covariance
2. The Poincare group and its representation: mass and spin/helicity
   A. Poincare group, Lie algebra and Casimir invariants
   B. Representations of the proper, orthochronous Poincare group
   C. Little groups
   D. The Discrete Subgroups and their actions
3. Representations suitable for manifest covariance: "field representations"
   A. Induced representation of Poincare from inducing representation of Lorentz
   B. Irreducibility and field equations
   C. Clifford Algebras and Spinor representations
4. Unitary representations on the space of solutions: anti-particles
5. Parity, time reversal and charge conjugation
   A. Representations of the Clifford algebra and relations among them
   B. Dirac-Majorana-Charge Conjugates
6. Relativistic Actions: classical fields
   A. Variational Principle, Symmetries of the Action and Noether's theorem
   B. Conserved Poincare charges for scalar field solutions
7. Fourier decompositions of fields: collection of harmonic oscillators
   A. Maxwell Field
   B. Dirac Field
   C. Interaction with source: Green's functions
8. Covariance of quantum fields and relativistic causality
   A. Poincare Covariance of Quantum fields
   B. Space Inversion, Time Reversal, Charge Conjugation of Dirac Field
   C. CPT theorem for Dirac Field
   D. Relativistic Causality and the Spin-Statistics Theorem
9. States of Free Quantum Fields: Particles, coherence and coherent states
   A. Particle/anti-particle wave packets
   B. Correlation Functions and Coherence of states
   C. Coherent States
      1. An aside: Harmonic Oscillator States
      2. Correlation functions
   D. Evolution into a coherent state
10. Quantum fields in Scattering Phenomena
   A. General scattering framework
   B. Scattering with quantum fields: Heisenberg Picture
   C. Kallen-Lehmann Representation
   D. The S-matrix and its properties
   E. Lehmann-Symanzik-Zimmermann Reduction of S-matrix
      1. LSZ reduction for Klein-Gordon field
      2. LSZ reduction for Dirac field
      3. LSZ reduction for Maxwell field
11. Covariant Perturbation Theory
   A. Normal ordering and Wick's theorem
   B. Differential Cross-section for 2 → n process
      1. The Special case of 2 → 2 processes
12. Diagrammatic recipe for S-matrix elements
13. Elementary processes in Yukawa and QED: NR limit
14. Basic QED processes
   A. Electron-muon processes
   B. Compton Scattering
   C. Electron-Positron Annihilation
15. Numerical Estimates of Cross-sections and Applications
16. Radiative corrections in QED
   A. The fermion propagator: self-energy
   B. The photon propagator: photon self-energy
      1. The Ward identity claim: $q^\mu \Pi_{\mu\nu}(q) = 0$
   C. The Vertex function: Form factors
   D. Electric Charge and Anomalous Magnetic Moment
17. Radiative corrections at 1-loop: Divergences
   A. Isolation of Divergence: Fermion Self Energy
   B. Isolation of Divergence: Photon Self-Energy
   C. Isolation of Divergence: Vertex function
   D. Bremsstrahlung Cross-section to $o(\alpha)$
18. Treatment of Divergences
   A. Treatment of the IR divergences
   B. Treatment of the UV divergences
      1. 1-Loop Renormalization: Charge Screening and Lamb shift
   C. The Method of Counter terms
19. Renormalized perturbation series
   A. Necessary Conditions for UV divergence: Power Counting
   B. An Example: The $(\Phi^3)_6$ theory
      1. At 2-Loops
   C. Renormalization with massless particles
20. Path Integrals in Quantum Mechanics
   A. The Ab Initio Path Integral
   B. Derivation From Transition Amplitude
   C. Functional Derivative
   D. Ground State-to-ground state Amplitude: $Z[J]$
   E. Explicit evaluation of a path integral
   F. Alternative Expression for $Z[J]$
21. Path Integrals in Quantum Field Theory
   A. The 1-point function: Renormalization ↔ normal ordering
   B. Path Integrals and Statistical Mechanics
22. Path Integrals as Generating Functionals
   A. The Generating Functionals: $Z[J]$, $W[J]$, $\Gamma[\Phi]$
   B. The Renormalization Group Equation
   C. The Background Field Method
23. Closing Remarks
References

1. WHAT IS QUANTUM FIELD THEORY?

There is of course no short answer to this question. There are different facets of quantum fields. An aspect that is used extensively in condensed matter physics is 'QFT as a framework for computing processes in interacting many body systems'. Here, each of the many bodies typically has its internal energy states, e.g. atoms/molecules at various lattice sites, localized states in periodic potentials etc. The interactions take place by making transitions among these internal states. Since the internal states are discrete, the interactions also involve discrete exchanges of energy/momentum/spin etc., which are described by a quantum field, e.g. $[\Psi(\vec r), \Psi^\dagger(\vec r\,')] \sim \delta^3(\vec r - \vec r\,')$. With a large number of bodies involved, a quantum field capable of an indefinite number of transitions is a well suited framework.

There is another aspect which makes quantized fields essential when we want to describe interactions of even a few bodies in a relativistically covariant manner.
Special relativity imposes two crucial modifications: (a) the notion of causality is modified by the finite upper speed limit and (b) the equivalence of mass and energy demands that the rest masses also be included in energy conservation. The latter allows annihilation and creation of 'particles' at the expense of energy. The framework of non-relativistic quantum mechanics is not capable of accommodating these possibilities. Why is it so? To answer this, we need to be sharper about what is meant by "relativistic covariance".

A. Relativistic Covariance

Special relativity posits that the appropriate arena for describing physical processes is the Minkowski space-time - $\mathbb{R}^4$ with the flat metric. This allows the space-time to be described in terms of coordinates $x^\mu \leftrightarrow (t, x^i)$ and metric $\eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1) = \eta_{\mu\nu}$. All inertial observers use such a space-time and relate their descriptions of experiments by the coordinate transformations $x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu$. The $\Lambda^\mu{}_\nu$ denote the Lorentz transformations. These transformations, termed Poincare transformations, preserve the space-time metric. Under composition of transformations, these form the Poincare group.

In general, covariance of a 'structure' with respect to some group means that there is a homomorphism of that group onto the invariance group of the structure. Here, the invariance group of the structure means the group of transformations of the structure which preserve the structure. For instance, if the 'structure' is a vector space, then its invariance group must be the group of linear transformations. For the quantum framework, the structure is a projective Hilbert space (Hilbert space modulo non-zero scaling) and the basic observable quantities are the transition probabilities $|\langle\psi|\phi\rangle|^2/(\|\psi\|^2\|\phi\|^2)$.
It is a theorem due to Wigner that the transformations of physical states (elements of the projective Hilbert space) preserving the transition probabilities can be represented by linear or anti-linear transformations of the Hilbert space, $\psi' = A\psi$, satisfying $\langle\psi'|\phi'\rangle = \langle\psi|\phi\rangle$ or $\langle\psi'|\phi'\rangle = \langle\phi|\psi\rangle$. The former defines $A$ to be a unitary operator while the latter defines an anti-unitary operator. The anti-unitary operator is necessarily anti-linear as well. The only anti-unitary operator one encounters is the time reversal operator (more on this later). Thus covariance requires the Hilbert space of the quantum system to carry a unitary representation of the Poincare group.

If $g \in G \to R(g)$ denotes a group action which satisfies the composition rule $R(g_1 \cdot g_2) = R(g_1) \cdot R(g_2)$, then $R(g)$ constitutes a representation of the group $G$. On a Hilbert space, we denote a group action as $|\psi'\rangle := g \cdot |\psi\rangle := U(g)|\psi\rangle$. It follows that $g_1 \cdot (g_2 \cdot |\psi\rangle) = U(g_1)(U(g_2)|\psi\rangle) = U(g_1 \cdot g_2)|\psi\rangle = (g_1 \cdot g_2) \cdot |\psi\rangle$. This action also induces a natural action on the operators via $\langle\psi|\, g \cdot A\, |\psi\rangle := \langle g^{-1} \cdot \psi|A|g^{-1} \cdot \psi\rangle \leftrightarrow A' := g \cdot A := U(g) A U(g)^\dagger$. Check that (a) this is indeed a homomorphism: $(g_1 \cdot g_2) \cdot A = g_1 \cdot (g_2 \cdot A)$ and (b) $\langle\psi'|A'|\phi'\rangle = \langle\psi|A|\phi\rangle$. All algebraic relations among the operators are preserved under this group action. Incidentally, this is exactly how one defines the action of diffeomorphisms on functions on a manifold: $[g \cdot f](p) := f(g^{-1} \cdot p)$ (this is not the pull back definition).

All this essentially says that covariance is implemented through a representation of the group. Which representation?

B. Compatibility with quantum conditions

Any particular quantum system is however distinguished by what Dirac called quantum conditions, e.g. $[q^i, p_j] = i\hbar\,\delta^i{}_j$. It is on the Hilbert space carrying a representation of the quantum conditions that the covariance group must be represented.
To appreciate what this means, let us take the specific examples of the Galilean group and the Poincare group and try to seek unitary representations on the Hilbert space of a single particle with the above quantum conditions, with $i, j = 1, 2, 3$. Taking the generators of the infinitesimal transformations to be represented by self-adjoint operators, the Lie algebras of the two groups have the following commutation relations.

Galilean transformations: $x'^i = R^i{}_j x^j + v^i t + a^i$, $t' = t + b$, $R^i{}_m R^j{}_n \delta^{mn} = \delta^{ij}$.
Generators: $P_0, P_i, C_i, M_{ij}$, with non-vanishing commutators:
$$[M_{ij}, M_{kl}] = i\big[\delta_{ik} M_{jl} - (i \leftrightarrow j) - (k \leftrightarrow l) + (i,k \leftrightarrow j,l)\big],$$
$$[M_{ij}, P_k] = i\big[\delta_{ik} P_j - \delta_{jk} P_i\big], \qquad [M_{ij}, C_k] = i\big[\delta_{ik} C_j - \delta_{jk} C_i\big],$$
$$[C_i, P_j] = i M \delta_{ij}, \qquad [C_i, P_0] = i P_i .$$

Poincare transformations: $x'^\mu := \Lambda^\mu{}_\nu x^\nu + a^\mu$, $\Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta \eta^{\alpha\beta} = \eta^{\mu\nu}$, $\eta = \mathrm{diag}(-1, 1, 1, 1)$.
Generators: $P_0, P_i, K_i, M_{ij}$, with non-vanishing commutators:
$$[M_{ij}, M_{kl}] = i\big[\delta_{ik} M_{jl} - (i \leftrightarrow j) - (k \leftrightarrow l) + (i,k \leftrightarrow j,l)\big],$$
$$[M_{ij}, P_k] = i\big[\delta_{ik} P_j - \delta_{jk} P_i\big], \qquad [M_{ij}, K_k] = i\big[\delta_{ik} K_j - \delta_{jk} K_i\big],$$
$$[K_i, P_j] = i \delta_{ij} P_0, \qquad [K_i, P_0] = i P_i, \qquad [K_i, K_j] = i M_{ij} .$$

Both have the same number of generators, 10, and almost the same Lie algebra, except that the Galilean algebra has a 'central extension' denoted by the (mass) parameter $M$, and the 'Galilean boost' generators $C_i$ commute among themselves, unlike the 'Lorentz boost' generators $K_i$. These can be expressed in terms of the basic operators $q^i, p_j$ as follows (see footnote 1).

Galilean: $M_{ij} := q_i p_j - q_j p_i$, $P_i := p_i$, $C_i := M q_i$, $P_0 := \dfrac{\vec p \cdot \vec p}{2M}$.
Poincare: $M_{ij} := q_i p_j - q_j p_i$, $P_i := p_i$, $K_i := q_i P_0$, $P_0 := \sqrt{\vec p \cdot \vec p + M^2}$.

The parameter $M$ is naturally identified with the mass of our particle system. Notice that only the boost generators and the $P_0$ generators are different in the two groups.

Ex: Verify that the above definitions of the generators satisfy the respective algebras, up to factors of $\hbar$.

Footnote 1: Thomas Jordan, arXiv:0810.4637, and the first reference therein. Reference 1 shows that if $q^i$ is taken to represent position and to transform under the Galilean group accordingly, then the Galilean brackets suffice to fix the remaining generators, including P up to a constant. This also shows that there are no functions on the phase space which are invariant under the Galilean group apart from constant functions.

Ex: Under an infinitesimal group action, the infinitesimal change in any operator is given by $\delta_{\vec\epsilon} A := i[A, \vec\epsilon \cdot \vec T]$, where $\vec\epsilon$ denotes the infinitesimal parameters and $\vec T$ denotes the generators. Show that the infinitesimal changes $\delta_{\vec\epsilon}\, q^i$, $\delta_{\vec\epsilon}\, p_i$ are at the most linear in $q^i, p_i$ for the Galilean case, but not so in the Poincare case for the $K_i, P_0$. In particular, the Poincare group action is non-linear on the basic operators.

This is problematic for the following reason. At the classical level, the observables are sufficiently smooth functions on the phase space and the set of all observables can be given the structure of an algebra, i.e. a vector space with a multiplication (product of two functions) defined. A similar feature holds in the quantum case. Observables are self-adjoint operators. The set of observables too can be given the structure of an algebra (product of two operators). In both cases, the algebras can be thought of as being generated by the basic observables. The action of a group on the basic variables induces a corresponding action on all operators. There are domain issues in the quantum case, but much more importantly, the multiplication is non-commutative, which leads to ordering issues.

C. Manifest Covariance

When the basic observables transform non-linearly, it is much harder to deduce the transformation properties of the other operators. This is compounded further by the non-commutativity in the quantum case. When the transformations of basic variables are linear, the non-commutativity issue exists but is usually easier to manage.
With this in mind we now put the additional requirement that the basic observables transform linearly, i.e. in effect tensorially (modulo the inhomogeneous action of translations). This is sometimes called the requirement of manifest covariance.

Invoking manifest covariance discards the Poincare group action on the phase space of a particle. The action of the Galilean group is however permissible. This could of course be generalized to $N$ particles, $i, j = 1, \ldots, 3N$. We see that the non-relativistic quantum theory of $N$ particles is not capable of implementing manifest Poincare covariance, and we can expect potential hazards in incorporating relativistic causality and the creation/annihilation processes.

Could we not extend the quantum conditions to a relativistically covariant form? For instance, define a quantum system by postulating quantum conditions as $[q^\mu, p_\nu] = i\hbar\,\delta^\mu{}_\nu$. Then the Poincare generators can be taken to be $M_{\mu\nu} := q_\mu p_\nu - q_\nu p_\mu$ and $P_\mu := p_\mu$. It is trivial to check the Poincare Lie algebra. These also show that the basic observables are tensors of ranks as indicated by their index positions. This could be a model for a relativistic particle.

However, now there is another problem in quantum theory. From the basic commutators, we can define the creation/annihilation operators $a^\mu := \frac{1}{\sqrt{2\hbar}}\,(q^\mu + i\eta^{\mu\nu} p_\nu)$ and its adjoint $(a^\mu)^\dagger$, satisfying the commutation relations $[a^\mu, (a^\nu)^\dagger] = \eta^{\mu\nu}$. As usual, the states are labeled by the eigenvalues of the number operators and the vacuum state is defined by $a^\mu|0\rangle = 0\ \forall\, \mu$. This is a Lorentz covariant form of defining the vacuum state. It follows immediately that $\langle 0|[a^0, (a^0)^\dagger]|0\rangle = \|(a^0)^\dagger|0\rangle\|^2 = \eta^{00} = -1$, i.e. the state $(a^0)^\dagger|0\rangle$ has negative norm. Additionally, if we were to interpret $x^0$ as a "time" coordinate so that $p_0$ is interpreted as energy, then the energy eigenvalues must necessarily be unbounded below.
The Stone-von Neumann theorem implies that if the canonical commutation relations arise from the infinitesimal form of the Weyl relations, then the spectrum of each of the conjugate operators consists of all real numbers. Thus, the manifestly Lorentz covariant quantum conditions cannot be realized on a Hilbert space!

Let us turn attention to systems with infinitely many degrees of freedom. The best known example is classical electrodynamics. This is also relativistically covariant (in fact, it led to special relativity!). Its consequence, the electromagnetic waves, surprisingly show particulate behavior. While classically an accelerated charge radiates waves at the frequency of its mechanical motion, in atomic systems the electrons only radiate at frequencies related to energy differences between levels. When photographic films are exposed to low intensity light, the 'light marks' are localized. In interference experiments, the interference pattern builds up grain by grain. The photo-electric effect also shows a threshold behavior. All these are very particulate manifestations. The observation of intensity correlations, the $(g - 2)$ measurement and the Lamb shift all point to going beyond classical electrodynamics. The validation by these high precision measurements supports the quantum electrodynamical theory - a quantum field theory. The most recent example is of course the prediction of the Higgs particle and its discovery. These are positive arguments in favor of the quantum field theoretic framework being well suited for a quantum theory in relativistic regimes.

What is a QFT framework? Let us keep in mind the familiar classical electromagnetic field. As an example, consider a reflective cavity in which some classical electromagnetic field exists. This field satisfies the source free, first order in time, Maxwell equations which imply that the electric and magnetic fields satisfy a wave equation. Let us put some boundary conditions (the cavity is closed, say).
We can write the general solution of the wave equation as a linear combination of an infinite set of mode functions, which are by themselves convenient solutions satisfying the boundary condition. Typically, this selects a set of frequencies/wavelengths. The expansion coefficients encode the relative weights of the mode functions. One can express the energy and the momentum contained in the cavity field in terms of the expansion coefficients, with the disappearance of the mode functions - their memory remains only in the frequencies/wavelengths. Classically, the expansion coefficients are complex numbers without any particular restriction (except that the total energy/momentum should be finite!).

Schematically, we quantize the electromagnetic field by putting hats on the fields or equivalently on the expansion coefficients. The quantization procedure requires imposition of quantum conditions which make the expansion coefficients similar to creation/annihilation operators - and we have the quanta of the cavity field. Any exchange of energy/momentum within the cavity or with any outside environment is done in terms of these quanta, typically referred to as photons. The many body facet mentioned at the beginning does essentially this process in reverse.

It is useful to recall two equivalent ways of thinking about a quantum harmonic oscillator. We may think of it as a single particle performing motion which is somehow "discretized", or as a collection of "quanta", each carrying an energy $\hbar\omega$, with the 'particle motion' being understood as a consequence of emission/absorption of these quanta. It is the latter view which is more convenient in the context of a field.

In a nutshell then: a quantum field is a means to hold or supply an arbitrary collection of quanta, and all interactions are transactions of energy/momentum/charges etc. in terms of these quanta. This view is certainly borne out in perturbative analysis but is by no means the most general one.

2. THE POINCARE GROUP AND ITS REPRESENTATION: MASS AND SPIN/HELICITY

A. Poincare group, Lie algebra and Casimir invariants

The Poincare group: The Poincare group or inhomogeneous Lorentz group is defined by the transformations of space-time coordinates,
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu, \quad \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta \eta^{\alpha\beta} = \eta^{\mu\nu}, \quad \eta^{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1) = \eta_{\mu\nu}, \quad a^\mu \in \mathbb{R}^4 . \tag{2.1}$$
The defining conditions on $\Lambda^\mu{}_\nu$ imply that
$$\eta_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta = \eta_{\alpha\beta} = \Lambda_{\nu\alpha} \Lambda^\nu{}_\beta \ \Rightarrow\ \delta^\alpha{}_\beta = \Lambda_\nu{}^\alpha \Lambda^\nu{}_\beta \tag{2.2}$$
$$\therefore\ \Lambda_\nu{}^\alpha = (\Lambda^{-1})^\alpha{}_\nu \ \leftrightarrow\ (\Lambda^{-1})_\nu{}^\alpha = \Lambda^\alpha{}_\nu \qquad \text{(note the index positions)} \tag{2.3}$$
It is convenient to view $\Lambda^\mu{}_\nu$ as matrices. It is then important to adopt a consistent convention regarding the index positions. Viewed as matrices, in the last equation, $\Lambda_\nu{}^\alpha$ is the left inverse of $\Lambda$ while $\Lambda^\alpha{}_\nu$ is the right inverse of $\Lambda$.

The infinitesimal transformations are defined by taking $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu$ and $a^\mu = \epsilon^\mu$. The defining equation for Lorentz transformations then implies $\omega^{\mu\nu} = -\omega^{\nu\mu}$, with $\omega^{\mu\nu} := \omega^\mu{}_\alpha \eta^{\alpha\nu}$. This also explains the parameter counting of 6 for the Lorentz transformations and 4 for the translations. Thus, the Poincare group is a 10 parameter continuous group (actually a Lie group), with the 6 parameters in $\Lambda$ constituting the Lorentz subgroup and the 4 parameters $a^\mu$ constituting the translations subgroup. We denote a generic element of the group as $(\Lambda, a)$.

Clearly $\det\Lambda = \pm 1$. The transformations with positive determinant are called proper Lorentz transformations while those with negative determinant are called improper. Furthermore,
$$-(\Lambda^0{}_0)^2 + \sum_{i=1}^{3} (\Lambda^0{}_i)^2 = -1 \ \Rightarrow\ \Lambda^0{}_0 = \pm\sqrt{1 + \sum_i (\Lambda^0{}_i)^2} .$$
Those with positive $\Lambda^0{}_0$ are called orthochronous Lorentz transformations. Since the identity transformation is both proper and orthochronous, the subgroup of proper, orthochronous transformations is the continuous subgroup connected to the identity. The two improper transformations with $\Lambda = \mathrm{diag}(1, -1, -1, -1)$ and $\Lambda = \mathrm{diag}(-1, 1, 1, 1)$ are called space inversion and time reversal transformations, respectively.
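The classification just described can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names (`is_lorentz`, `classify`, `boost_z`, `rotation_z`) and the sample rapidity and angle are illustrative assumptions. It tests the defining condition $\Lambda \eta \Lambda^T = \eta$ and reads off properness and orthochronicity from $\det\Lambda$ and $\Lambda^0{}_0$.

```python
import math

# Metric eta = diag(-1, 1, 1, 1), as in the text.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def boost_z(rapidity):
    """Boost along the z axis (illustrative parametrization by rapidity)."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, 0, 0, s], [0, 1, 0, 0], [0, 0, 1, 0], [s, 0, 0, c]]

def rotation_z(angle):
    """Rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def is_lorentz(lam, tol=1e-12):
    """Check the defining condition Lambda eta Lambda^T = eta."""
    lhs = matmul(matmul(lam, ETA), transpose(lam))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

def classify(lam):
    """Return (proper, orthochronous) for a Lorentz matrix."""
    return det(lam) > 0, lam[0][0] > 0

SPACE_INVERSION = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

lam = matmul(rotation_z(0.3), boost_z(0.7))
print(is_lorentz(lam), classify(lam))
print(classify(SPACE_INVERSION), classify(TIME_REVERSAL))
```

A rotation composed with a boost comes out proper and orthochronous, while space inversion and time reversal are both improper and only time reversal fails to be orthochronous.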
It can be shown that any Lorentz transformation can be obtained from a proper, orthochronous transformation followed by space inversion and/or time reversal.

Group composition: The composition of two Poincare transformations is defined as $(\Lambda_2, a_2) \cdot (\Lambda_1, a_1) = (\Lambda_2 \Lambda_1, \Lambda_2 a_1 + a_2)$. It follows that $(\mathbb{1}, 0)$ is the identity and $(\Lambda, a)^{-1} = (\Lambda^{-1}, -\Lambda^{-1} a)$.

Check: $(\Lambda, a) \cdot (\mathbb{1}, b) \cdot (\Lambda, a)^{-1} = (\mathbb{1}, \Lambda b)$, i.e. $g H g^{-1} \subseteq H\ \forall\, g$, where $H$ is the subgroup of translations. Such a subgroup $H$ is called an invariant or normal subgroup. Thus the translations subgroup is a normal subgroup of the Poincare group, and the Poincare group is said to be a semi-direct product of the Lorentz and the translations groups.

Check: Any group element can be uniquely written as $(\Lambda, a) = (\Lambda, 0) \cdot (\mathbb{1}, \Lambda^{-1} a) = (\mathbb{1}, a) \cdot (\Lambda, 0)$.

While it is possible to proceed with the abstract identification of the Lie algebra via the vector fields generating the infinitesimal transformations, we can simplify by directly considering a unitary representation of the Poincare group. That is, a set of unitary operators $U(g) : H \to H$ such that (i) $U^\dagger(g) U(g) = \mathbb{1} = U(g) U(g)^\dagger$ and (ii) $U(g_2 g_1) = U(g_2) U(g_1)$, $\forall\, g_1, g_2 \in G$. To allow for the possibility of infinite dimensional representations, we use operators on a Hilbert space. We also restrict to irreducible representations, i.e. the representation space has no proper subspace which is invariant under the $U(g)$ operators.

It is a result (from Schur's lemma) that if $U(g)$ is a unitary, irreducible representation, then the only bounded operator that commutes with all the $U(g)$ operators is a multiple of the identity operator. Its converse also holds: if the only operator that commutes with all the $U(g)$'s is a multiple of the identity operator, then the unitary representation is irreducible. A simple application of this result is that unitary, irreducible representations of abelian groups are one dimensional. Hence the translation subgroup has only 1-dimensional irreducible representations.
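The group composition rule, the inverse, and the normality of the translation subgroup can be verified with explicit matrices. This is a minimal sketch under stated assumptions: a group element is stored as a pair `(lam, a)`, the helper names are hypothetical, and the inverse Lorentz matrix is computed as $\eta \Lambda^T \eta$, which follows from the defining condition (2.1).

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ZERO = [0.0, 0.0, 0.0, 0.0]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(lam, v):
    """Act with a 4x4 matrix on a 4-vector."""
    return [sum(lam[i][j] * v[j] for j in range(4)) for i in range(4)]

def compose(g2, g1):
    """(L2, a2) . (L1, a1) = (L2 L1, L2 a1 + a2)."""
    (lam2, a2), (lam1, a1) = g2, g1
    shifted = apply(lam2, a1)
    return matmul(lam2, lam1), [shifted[i] + a2[i] for i in range(4)]

def inverse(g):
    """(L, a)^{-1} = (L^{-1}, -L^{-1} a), with L^{-1} = eta L^T eta."""
    lam, a = g
    lam_t = [list(row) for row in zip(*lam)]
    lam_inv = matmul(matmul(ETA, lam_t), ETA)
    return lam_inv, [-x for x in apply(lam_inv, a)]

def close(g, h, tol=1e-12):
    """Compare two group elements entry by entry."""
    (lam_g, a_g), (lam_h, a_h) = g, h
    return (all(abs(lam_g[i][j] - lam_h[i][j]) < tol
                for i in range(4) for j in range(4))
            and all(abs(a_g[i] - a_h[i]) < tol for i in range(4)))

def boost_x(rapidity):
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Sample element g = (Lambda, a) and a pure translation (1, b); values are arbitrary.
g = (boost_x(0.5), [1.0, 2.0, 0.0, -1.0])
b = [0.3, -0.2, 0.7, 0.1]

# Conjugating a translation by g gives again a translation, by Lambda b.
conjugated = compose(compose(g, (IDENTITY, b)), inverse(g))
print(close(conjugated, (IDENTITY, apply(g[0], b))))
```

The same helpers also confirm the two factorizations of $(\Lambda, a)$ quoted in the text, as a Lorentz transformation times a translation in either order.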
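We define the infinitesimal generators as,
$$U(\Lambda, a) = U(\delta^\mu{}_\alpha + \omega^\mu{}_\alpha,\ \epsilon^\mu) := \mathbb{1} + \frac{i}{2}\,\omega_{\mu\nu} M^{\mu\nu} - i\,\epsilon_\mu P^\mu + o(\omega^2, \epsilon^2), \qquad M, P \text{ are self-adjoint}. \tag{2.4}$$
The commutation relations among the generators are obtained by using the homomorphism property, $U(\Lambda, a)^{-1} U(\lambda, b) U(\Lambda, a) = U[(\Lambda, a)^{-1} \cdot (\lambda, b) \cdot (\Lambda, a)]$, and taking $\lambda = \mathbb{1} + \omega$, $b = \epsilon$. Using the definition of the generators, it follows that
$$\omega_{\mu\nu}\, U(\Lambda, a)^{-1} M^{\mu\nu} U(\Lambda, a) = (\Lambda^{-1} \omega \Lambda)_{\alpha\beta} M^{\alpha\beta} - 2 (\Lambda^{-1} \omega a)_\alpha P^\alpha \tag{2.5}$$
$$\epsilon_\mu\, U(\Lambda, a)^{-1} P^\mu U(\Lambda, a) = (\Lambda^{-1} \epsilon)_\alpha P^\alpha . \tag{2.6}$$
Using $(\Lambda^{-1}\epsilon)_\alpha = \eta_{\alpha\sigma} (\Lambda^{-1})^\sigma{}_\mu \epsilon^\mu = (\Lambda^{-1})_{\alpha\mu} \epsilon^\mu = \Lambda^\mu{}_\alpha \epsilon_\mu$, $(\Lambda^{-1} \omega \Lambda)_{\alpha\beta} = \omega_{\mu\nu} \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta$ and $(\Lambda^{-1} \omega a)_\alpha = \omega_{\mu\nu} \Lambda^\mu{}_\alpha a^\nu$, we get
$$U(\Lambda, a)^{-1} M^{\mu\nu} U(\Lambda, a) = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta M^{\alpha\beta} - (\Lambda^\mu{}_\alpha a^\nu - \Lambda^\nu{}_\alpha a^\mu) P^\alpha, \qquad U(\Lambda, a)^{-1} P^\mu U(\Lambda, a) = \Lambda^\mu{}_\nu P^\nu . \tag{2.7}$$
The last two equations show that under the 'adjoint' action of the group, the generators transform as Lorentz tensors.

To deduce the commutation relations, take $\Lambda, a$ to be infinitesimal, i.e. $U(\Lambda, a)^{\pm 1} = \mathbb{1} \pm \frac{i}{2}\,\omega_{\alpha\beta} M^{\alpha\beta} \mp i\,\epsilon_\alpha P^\alpha$. Substitution and reading off coefficients gives,
$$[M^{\mu\nu}, M^{\alpha\beta}] = i\big[\eta^{\mu\alpha} M^{\nu\beta} - (\mu \leftrightarrow \nu) - (\alpha \leftrightarrow \beta) + (\mu,\alpha \leftrightarrow \nu,\beta)\big] \tag{2.8}$$
$$[M^{\mu\nu}, P^\alpha] = i\big[\eta^{\mu\alpha} P^\nu - \eta^{\nu\alpha} P^\mu\big] \tag{2.9}$$
$$[P^\mu, P^\nu] = 0 . \tag{2.10}$$
It is convenient to introduce the notation $K^i := M^{i0}$ and $J_i := \frac{1}{2}\epsilon_{ijk} M^{jk} \leftrightarrow M^{ij} = \epsilon^{ijk} J_k$, and also $H := P^0$. Then the Poincare commutators take the form (non-zero commutators only),
$$[J_i, J_j] = i\,\epsilon_{ij}{}^k J_k, \qquad [J_i, K_j] = i\,\epsilon_{ij}{}^l K_l \tag{2.11}$$
$$[K_i, K_j] = -i\,\epsilon_{ij}{}^l J_l, \qquad [J_i, P_j] = i\,\epsilon_{ij}{}^k P_k \tag{2.12}$$
$$[K_i, P_j] = i H \delta_{ij}, \qquad [K_i, H] = i P_i .$$
(The sign of $[K_i, P_j]$ is the one that follows from (2.9) with $K^i = M^{i0}$.)

Define the Pauli-Lubanski vector,
$$W_\mu := \frac{1}{2}\,\epsilon_{\mu\nu\alpha\beta} M^{\nu\alpha} P^\beta, \qquad \epsilon_{0123} := 1 . \tag{2.13}$$
It follows that $[P^\mu, W^\nu] = 0$, $[M^{\mu\nu}, W^\lambda] = i\big[\eta^{\mu\lambda} W^\nu - \eta^{\nu\lambda} W^\mu\big]$ and $[W_\mu, W_\nu] = -i\,\epsilon_{\mu\nu\alpha\beta} W^\alpha P^\beta$. These relations imply that $P^2 := \eta_{\mu\nu} P^\mu P^\nu$ and $W^2 := \eta^{\mu\nu} W_\mu W_\nu$ commute with all the generators of the Poincare group and hence also with all the group elements of the proper, orthochronous Poincare group.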
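The commutation relations (2.8)-(2.10) can be sanity-checked in a small matrix representation. The sketch below is an illustration, not the construction used in the notes: it assumes the 5×5 (non-unitary) representation acting on columns $(x^0, \ldots, x^3, 1)$, with $(M^{\mu\nu})^\rho{}_\sigma = -i(\eta^{\mu\rho}\delta^\nu_\sigma - \eta^{\nu\rho}\delta^\mu_\sigma)$ and $(P^\mu)^\rho{}_5 = i\,\eta^{\mu\rho}$, chosen so that (2.4) reproduces $x \to x + \omega x + \epsilon$. All function names are hypothetical.

```python
ETA = [-1, 1, 1, 1]  # diagonal entries of eta^{mu nu}

def eta(m, n):
    return ETA[m] if m == n else 0

def zeros():
    return [[0j] * 5 for _ in range(5)]

def lorentz_gen(mu, nu):
    """M^{mu nu} in the assumed 5x5 representation on columns (x^0..x^3, 1)."""
    mat = zeros()
    for rho in range(4):
        for sig in range(4):
            mat[rho][sig] = -1j * (eta(mu, rho) * (nu == sig)
                                   - eta(nu, rho) * (mu == sig))
    return mat

def translation_gen(mu):
    """P^mu in the same representation: only the last column is non-zero."""
    mat = zeros()
    for rho in range(4):
        mat[rho][4] = 1j * eta(mu, rho)
    return mat

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(5)) for j in range(5)]
            for i in range(5)]

def comm(a, b):
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(5)] for i in range(5)]

def combo(terms):
    """Linear combination sum_k c_k * A_k of 5x5 matrices."""
    out = zeros()
    for coeff, mat in terms:
        for i in range(5):
            for j in range(5):
                out[i][j] += coeff * mat[i][j]
    return out

def same(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(5) for j in range(5))

def check_poincare_algebra():
    """Verify (2.8), (2.9) and (2.10) for every index combination."""
    M, P, idx = lorentz_gen, translation_gen, range(4)
    for mu in idx:
        for nu in idx:
            if not same(comm(P(mu), P(nu)), zeros()):
                return False
            for al in idx:
                rhs = combo([(1j * eta(mu, al), P(nu)), (-1j * eta(nu, al), P(mu))])
                if not same(comm(M(mu, nu), P(al)), rhs):
                    return False
                for be in idx:
                    rhs = combo([(1j * eta(mu, al), M(nu, be)),
                                 (-1j * eta(nu, al), M(mu, be)),
                                 (-1j * eta(mu, be), M(nu, al)),
                                 (1j * eta(nu, be), M(mu, al))])
                    if not same(comm(M(mu, nu), M(al, be)), rhs):
                        return False
    return True

# Rotation and boost generators: J_i = (1/2) eps_ijk M^{jk}, K_i = M^{i0}.
J = {1: lorentz_gen(2, 3), 2: lorentz_gen(3, 1), 3: lorentz_gen(1, 2)}
K = {i: lorentz_gen(i, 0) for i in (1, 2, 3)}

print(check_poincare_algebra())
```

Because the structure constants of a Lie algebra do not depend on the representation, agreement in this finite dimensional representation is a useful cross-check on the signs in (2.8)-(2.10) and on the $J$, $K$ brackets in (2.11) and (2.12), although it says nothing about unitarity.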
The operators $P^2$ and $W^2$ are the two independent Casimir invariants of the Poincare group and must be multiples of the identity operator in irreducible representations. These multiples serve to label the unitary, irreducible representations.

B. Representations of the proper, orthochronous Poincare group

Each of its representations induces a representation of its Lie algebra (2.11). Since the $P^\mu$'s commute, their simultaneous eigenvalues can be taken as a group of labels for a basis,
$$P^\mu |k^\mu, \xi\rangle = k^\mu |k^\mu, \xi\rangle \ \Rightarrow\ P^2 |k^\mu, \xi\rangle = k^2 |k^\mu, \xi\rangle, \qquad k^2 = \eta_{\mu\nu} k^\mu k^\nu . \tag{2.14}$$
Here, $\xi$ collectively labels the degenerate eigenvectors and $k^2$ is the value of the Casimir $P^2$. Thus, the first level of classification is done by the value of the Lorentz invariant $k^2$. There are four classes that arise naturally: (i) $m^2 := -k^2 > 0$ (massive), (ii) $k^2 = 0$, $k^\mu \neq 0$ (massless), (iii) $k^2 > 0$ (tachyonic), and (iv) $k^\mu = 0$ (vacuum). The vacuum representation is just a single vector. The tachyonic representations have not shown up in experiments yet.

• Since $P^\mu$ transforms as a Lorentz vector, so does $k^\mu$, and for $k^2 \le 0$ the $\mathrm{sgn}(k^0)$ distinguishes two further subclasses of the massive and the massless representations. We restrict to the massive and the massless representations with $k^0 > 0$ as being physically relevant.

• Under a Lorentz transformation $(\Lambda, 0)$, $k^\mu$ goes to another point on the hyperboloid $k^2 = \text{constant}$. How does the collective label $\xi$ transform?

We would like to deduce the action of $U(\Lambda, a)$ on the vectors $|k^\mu, \xi\rangle$. Since $U(\mathbb{1}, a) = \exp(-i a^\mu P_\mu)$, we already know that $U(\mathbb{1}, a)|k^\mu, \xi\rangle = \exp(-i a^\mu k_\mu)\,|k^\mu, \xi\rangle$, and we focus on the Lorentz transformations $U(\Lambda, 0) =: U(\Lambda)$. The Lorentz transformation relation (2.7) implies
$$P^\mu U(\Lambda)|k, \xi\rangle = \Lambda^\mu{}_\nu U(\Lambda) P^\nu |k, \xi\rangle = \Lambda^\mu{}_\nu k^\nu\, U(\Lambda)|k, \xi\rangle \ \Rightarrow\ U(\Lambda)|k, \xi\rangle = \sum_{\xi'} C_{\xi,\xi'}\, |\Lambda k, \xi'\rangle . \tag{2.15}$$
We want to find the coefficients $C_{\xi,\xi'}$ corresponding to irreducible representations of the Lorentz group.
In principle, these coefficients could (and do) depend on the $k$ label; however, their number (the degree of degeneracy) must be the same over a given hyperboloid. This is essentially a continuity argument - the label $k$ changes continuously while the degree of degeneracy is integer valued.

Let $\hat k$ be some convenient, fixed vector on the $k^2 \le 0$ hyperboloid and let $L(k)$ be some arbitrarily chosen Lorentz transformation such that $k^\mu = L(k)^\mu{}_\nu \hat k^\nu$. The Lorentz transformation $L$ depends explicitly on $k$ (and also implicitly on $\hat k$) and is not unique. For instance, for $\hat k = (m, \vec 0)$, any $k$ can be obtained by a combination of a rotation and a boost transformation. Having made a choice of $L(k)$, define,
$$|k, \xi\rangle := N(k)\, U(L(k))\, |\hat k, \xi\rangle . \tag{2.16}$$
Since this is a definition, the same label $\xi$ appears on both sides. It follows that
$$U(\Lambda)|k, \xi\rangle = N(k)\, U(\Lambda L(k))\, |\hat k, \xi\rangle = N(k)\, \big[ U(L(\Lambda k)) \cdot U((L(\Lambda k))^{-1}) \big]\, U(\Lambda) \cdot U(L(k))\, |\hat k, \xi\rangle .$$
The last three factors of $U$ combine to give $U(L^{-1}(\Lambda k) \cdot \Lambda \cdot L(k))$, whose argument, acting on $\hat k$, takes $\hat k \to k \to \Lambda k \to \hat k$, since $(\Lambda k)^\mu = [L(\Lambda k)]^\mu{}_\nu \hat k^\nu$. Denote $W(\Lambda, k) := L^{-1}(\Lambda k) \cdot \Lambda \cdot L(k)$. It follows that $W(\Lambda, k)\, \hat k = \hat k\ \forall\, \Lambda$.

The Lorentz transformations $W(\Lambda, k)$ leaving $\hat k$ invariant form a subgroup of the Lorentz transformations. It is known as the little group of $\hat k$ or the stability subgroup of $\hat k$. Defining a representation $D(W)$ by $U(W(\Lambda, k))\, |\hat k, \xi\rangle := \sum_{\xi'} D_{\xi'\xi}(W)\, |\hat k, \xi'\rangle$, we get,
$$U(\Lambda)|k, \xi\rangle = N(k)\, U(L(\Lambda k)) \sum_{\xi'} D_{\xi'\xi}(W)\, |\hat k, \xi'\rangle = N(k) \sum_{\xi'} D_{\xi'\xi}(W) \big[ U(L(\Lambda k))\, |\hat k, \xi'\rangle \big]$$
$$\therefore\ U(\Lambda)|k, \xi\rangle = \frac{N(k)}{N(\Lambda k)} \sum_{\xi'} D_{\xi'\xi}(W)\, |\Lambda k, \xi'\rangle . \tag{2.17}$$
In the last equation, we have used the definition (2.16). The irreducible representations of the little group can thus be used to characterize the irreducible representations of the Lorentz group. We are not done yet; the normalization factors $N(k)$ need to be determined.

Determination of $N(k)$: For different $\hat k, \hat k'$, the corresponding little groups are different.
The unitarity of the Poincare (and hence of the Lorentz group) representation implies that the representations of the little groups are also unitary. We can choose $\langle \hat k', \xi' | \hat k, \xi \rangle = \delta_{\xi,\xi'}\, \delta^3(\hat k - \hat k')$ and we would like this to be preserved under Lorentz transformations, i.e. with the hatted $k$'s being replaced by their Lorentz transforms. This is possible only for a specific choice of $N(k)$, which we determine now.

Note that this requirement associates the ortho-normalization with the hyperboloids determined by $\hat k^2, (\hat k')^2$. Since a delta function is defined in the context of an integration, we are implicitly envisaging an integration over the hyperboloids and not over $\mathbb{R}^4$. The integration is to be Lorentz invariant. This can be inferred as follows.
$$\int_{\text{mass shell}} \theta(k^0) f(k) := \int_{\mathbb{R}^4} d^4k\, \theta(k^0)\, \delta(k^2 + m^2)\, f(k) = \int d^3k\, dk^0\, \theta(k^0)\, \delta\big(-(k^0)^2 + \vec k \cdot \vec k + m^2\big) f(k)$$
$$= \int d^3k\, \frac{f\big(\sqrt{\vec k^2 + m^2},\, \vec k\big)}{2\sqrt{\vec k^2 + m^2}} \qquad \because\ \delta(f(x)) = \sum_i \frac{\delta(x - x_i)}{|f'(x_i)|} . \tag{2.18}$$
Although there are two roots from the delta function of the mass shell, the $\theta(k^0)$ picks out only the one term with $\omega_k := k^0 := +\sqrt{\vec k \cdot \vec k + m^2}$. The $\frac{d^3k}{2\omega_k}$ is the Lorentz invariant volume element (measure) on the mass shell. Since we have,
$$\int_{\mathbb{R}^3} d^3k'\, f(\vec k')\, \delta^3(\vec k - \vec k') = f(\vec k) = \int_{\mathbb{R}^3} \frac{d^3k'}{2\omega_{k'}}\, f(\vec k') \big[ 2\omega_k\, \delta^3(\vec k - \vec k') \big], \tag{2.19}$$
we identify the invariant delta function on the mass shell as $\delta^3_{inv}(\vec k - \vec k') := 2\omega_k\, \delta^3(\vec k - \vec k') = 2\sqrt{\vec k^2 + m^2}\ \delta^3(\vec k - \vec k')$.

Let $k = L(k)\hat k$ and define $k' := L(k)\hat k'$ with the same $L(k)$. Then
$$\langle k', \xi' | k, \xi \rangle = N^*(k') N(k)\, \langle \hat k', \xi' |\, U^\dagger(L(k))\, U(L(k))\, | \hat k, \xi \rangle = |N(k)|^2\, \delta_{\xi',\xi}\, \delta^3(\hat k' - \hat k) .$$
We have used the unitarity of $U(L(k))$. But we also know that $\delta^3(\vec k' - \vec k)$ and $\delta^3(\hat k' - \hat k)$ are related by the same Lorentz transformation.
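The Lorentz invariance of the measure $d^3k/(2\omega_k)$ used above can be illustrated numerically. The sketch below is a hedged check under stated assumptions, not part of the original derivation: it takes a boost along the $z$ axis with rapidity `y`, so that $k_x, k_y$ are unchanged and the Jacobian of $\vec k \to \vec k'$ reduces to $\partial k'_z/\partial k_z$, and it compares a finite-difference estimate of that derivative with $\omega_{k'}/\omega_k$. The function names and sample values are illustrative.

```python
import math

def omega(kx, ky, kz, m):
    """On-shell energy omega_k = +sqrt(k.k + m^2)."""
    return math.sqrt(kx * kx + ky * ky + kz * kz + m * m)

def boosted_kz(kx, ky, kz, m, y):
    """z-component of the momentum after a boost of rapidity y along z."""
    return kz * math.cosh(y) + omega(kx, ky, kz, m) * math.sinh(y)

def boosted_omega(kx, ky, kz, m, y):
    """Energy after the same boost."""
    return omega(kx, ky, kz, m) * math.cosh(y) + kz * math.sinh(y)

def jacobian(kx, ky, kz, m, y, h=1e-6):
    """Central-difference estimate of d k'_z / d k_z at fixed k_x, k_y."""
    return (boosted_kz(kx, ky, kz + h, m, y)
            - boosted_kz(kx, ky, kz - h, m, y)) / (2 * h)

def energy_ratio(kx, ky, kz, m, y):
    """omega_{k'} / omega_k, the value the Jacobian should take."""
    return boosted_omega(kx, ky, kz, m, y) / omega(kx, ky, kz, m)

# Arbitrary sample point on the mass shell and an arbitrary rapidity.
sample = (0.4, -1.1, 0.8, 1.5, 0.9)
print(jacobian(*sample), energy_ratio(*sample))
```

Since $d^3k' = (\omega_{k'}/\omega_k)\, d^3k$, the combination $d^3k/\omega_k$ is unchanged by the boost, and correspondingly $\omega_k\, \delta^3(\vec k' - \vec k)$ is the invariant delta function.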
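Therefore, using the invariant delta function, we have
$$\omega_k\, \delta^3(\vec k' - \vec k) = \omega_{\hat k}\, \delta^3(\hat k' - \hat k) \tag{2.20}$$
$$\therefore\ \langle k', \xi' | k, \xi \rangle = |N(k)|^2\, \delta_{\xi',\xi}\, \frac{k^0}{\hat k^0}\, \delta^3(\vec k' - \vec k), \quad \text{and the choice} \tag{2.21}$$
$$|N(k)| := \sqrt{\frac{\hat k^0}{k^0}} \ \Rightarrow \tag{2.22}$$
$$\langle k', \xi' | k, \xi \rangle = \delta_{\xi',\xi}\, \delta^3(\vec k - \vec k') \quad \text{and} \tag{2.23}$$
$$U(\Lambda)|k, \xi\rangle = \sqrt{\frac{(\Lambda k)^0}{k^0}} \sum_{\xi'} D_{\xi'\xi}(W(\Lambda, k))\, |\Lambda k, \xi'\rangle . \tag{2.24}$$

C. Little groups

It remains to determine the little groups for the various cases.

The vacuum representation ($k^\mu = 0$): The little group is all of the Lorentz group. However the representation is the trivial one: $U(\Lambda) = \mathbb{1}$, $\forall\, \Lambda$.

Massive representation ($k^2 = -m^2$): Choose $\hat k^\mu = (m, \vec 0)$. The little group is defined by $\Lambda^\mu{}_\nu \hat k^\nu = \hat k^\mu \Rightarrow \Lambda^0{}_0 = 1$, $\Lambda^i{}_0 = 0$. The defining condition on $\Lambda$, $\eta^{00} = \Lambda^0{}_0 \Lambda^0{}_0\, \eta^{00} + \sum_i (\Lambda^0{}_i)^2$, then gives $\Lambda^0{}_i = 0$. Hence $\Lambda$ is block diagonal, with $\delta^{ij} = \Lambda^i{}_k \Lambda^j{}_l\, \delta^{kl}$, and the little group is $SO(3)$, the group of rotations. Its unitary representations are all finite dimensional and labeled by its Casimir, $J^2 = j(j+1)$, $j \in \mathbb{N}/2$. The half integer ones come from the double cover $SU(2)$. The $D_{\xi'\xi}$ are the usual $(2s + 1)$ dimensional representations. The representations of this class are thus labeled by 'mass' $m$ and 'spin' $s$.

Massless representation ($k^2 = 0$): Choose $\hat k^\mu = (k, 0, 0, k)$, $k > 0$. It is more convenient to determine the generators of the little group. These are some linear combinations of the Lorentz generators $M^{\alpha\beta}$. Consider the commutator,
$$\big[\epsilon_{\alpha\beta} M^{\alpha\beta}, P^\lambda\big] = i\big[\epsilon^\lambda{}_\beta P^\beta - \epsilon_\alpha{}^\lambda P^\alpha\big] \ \Rightarrow\ \big[\epsilon_{\alpha\beta} M^{\alpha\beta}, P^\lambda\big]\, |\hat k, \xi\rangle = i\big[\epsilon^\lambda{}_\beta \hat k^\beta - \epsilon_\alpha{}^\lambda \hat k^\alpha\big]\, |\hat k, \xi\rangle = 2i\, \epsilon^\lambda{}_\alpha \hat k^\alpha\, |\hat k, \xi\rangle .$$
The l.h.s. of the above equation vanishes because the generators of the little group, $\epsilon \cdot M$, leave the eigenstate of $\hat k$ invariant. The r.h.s. then gives the conditions on the $\epsilon$'s. Explicitly, $0 = \epsilon_{\lambda\alpha} \hat k^\alpha = (\epsilon_{\lambda 0} + \epsilon_{\lambda 3})\, k \Rightarrow \epsilon_{\alpha 0} + \epsilon_{\alpha 3} = 0$. This in turn gives $\epsilon_{03} = 0$, $\epsilon_{01} = \epsilon_{13}$, $\epsilon_{02} = \epsilon_{23}$.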
Hence,

$$\frac{1}{2}\epsilon_{\alpha\beta} M^{\alpha\beta}\Big|_{little} = \epsilon_{12} M^{12} + \epsilon_{13}(M^{13} - M^{01}) + \epsilon_{23}(M^{23} - M^{02}) := \epsilon_1 (J_1 - K_2) + \epsilon_2 (-J_2 - K_1) + \epsilon_3 J_3. \tag{2.25}$$

It is conventional to denote: $A := J_2 + K_1,\ B := -J_1 + K_2$ and write the little algebra of $\hat{k} = k(1, 0, 0, 1)$ as,

$$[A, B] = 0, \quad [J_3, A] = iB, \quad [J_3, B] = -iA. \tag{2.26}$$

This is the Lie algebra of the Euclidean group in 2 dimensions: two translations and one rotation. Its representations are not as familiar. To study these, define $R(\theta) := \exp(i\theta J_3)$. It is easy to check that $R(\theta)\, A\, R^{-1}(\theta) = A\cos\theta - B\sin\theta$ and $R(\theta)\, B\, R^{-1}(\theta) = A\sin\theta + B\cos\theta$. Consider a representation $D$ in which $A|\hat{k}, a, b\rangle = a|\hat{k}, a, b\rangle$ and $B|\hat{k}, a, b\rangle = b|\hat{k}, a, b\rangle$. Define $|\hat{k}, a, b, \theta\rangle := R(\theta)|\hat{k}, a, b\rangle$. It follows that,

$$A|\hat{k}, a, b, \theta\rangle = (a\cos\theta - b\sin\theta)|\hat{k}, a, b, \theta\rangle, \quad B|\hat{k}, a, b, \theta\rangle = (a\sin\theta + b\cos\theta)|\hat{k}, a, b, \theta\rangle.$$

Hence, if $a, b$ are non-zero, then for each $\theta$, the states $|\hat{k}, a, b, \theta\rangle$ are simultaneous eigenstates of $A$ and $B$. Such a representation has the continuous parameter $\theta$ and no such additional parameter is seen in observations. To avoid such a parameter, we restrict the representations of the little group to those for which $a = b = 0$, i.e. $A, B$ vanish in the representation. The little group then effectively reduces to just $U(1)$ generated by $J_3$. Its irreducible representations are labeled by half integers. We denote these representations as $J_3|\hat{k}, \sigma\rangle = \sigma|\hat{k}, \sigma\rangle,\ \sigma \in \frac{1}{2}\mathbb{Z}$. The label $\sigma$ is called helicity. The one dimensional representation is denoted as $D_{\sigma'\sigma}$. The representations with $a, b$ non-zero are called "continuous spin representations".

In summary:

$$|k, \sigma\rangle := \sqrt{\frac{\hat{k}^0}{k^0}}\ U(L(k))|\hat{k}, \sigma\rangle \tag{2.27}$$

For $m > 0$:

$$U(\Lambda)|k, \sigma\rangle = \sqrt{\frac{(\Lambda k)^0}{k^0}}\ \sum_{\sigma'} D^j_{\sigma'\sigma}[W(\Lambda, k)]\, |\Lambda k, \sigma'\rangle, \quad \text{where}\ \ W(\Lambda, k) := L^{-1}(\Lambda k) \cdot \Lambda \cdot L(k),\ \ L(k)\hat{k} := k,\ \ \sigma\ \text{is the}\ j_3\ \text{eigenvalue}. \tag{2.28}$$

For $m = 0$:

$$U(\Lambda)|k, \sigma\rangle = \sqrt{\frac{(\Lambda k)^0}{k^0}}\ \exp\{i\sigma\theta(\Lambda, k)\}\, |\Lambda k, \sigma\rangle, \quad \text{where}\ \sigma\ \text{is the helicity label and}\ \ W(\Lambda, k) := L^{-1}(\Lambda k) \cdot \Lambda \cdot L(k) := S(\alpha(\Lambda, k), \beta(\Lambda, k))\, R(\theta(\Lambda, k)). \tag{2.29}$$

Let us evaluate the second Casimir invariant of the Poincare group, $W^2$.
For this we evaluate $W_\mu|\hat{k}, \sigma\rangle$.

(i) $-\hat{k}^2 = m^2 \neq 0$: For $\hat{k} = (m, 0, 0, 0)$, $W_\mu|\hat{k}, \sigma\rangle = \frac{1}{2}\epsilon_{\mu\nu\alpha 0} M^{\nu\alpha}\, m\, |\hat{k}, \sigma\rangle$. This vanishes for $\mu = 0$, and for $\mu = i$ we get $W_i|\hat{k}, \sigma\rangle = -\frac{1}{2}\epsilon_{0ijk} M^{jk}\, m\, |\hat{k}, \sigma\rangle = -m J_i|\hat{k}, \sigma\rangle$. Therefore, $W^2|\hat{k}, \sigma\rangle = m^2 j(j+1)|\hat{k}, \sigma\rangle$.

(ii) $-\hat{k}^2 = 0$: For $\hat{k} = k(1, 0, 0, 1)$, $W_\mu|\hat{k}, \sigma\rangle = \frac{1}{2}(\epsilon_{\mu\nu\alpha 0} + \epsilon_{\mu\nu\alpha 3}) M^{\nu\alpha}\, k\, |\hat{k}, \sigma\rangle$. This implies,

$$W_0|\hat{k}, \sigma\rangle = k\sigma|\hat{k}, \sigma\rangle, \quad W_1|\hat{k}, \sigma\rangle = k(-J_1 + K_2)|\hat{k}, \sigma\rangle = 0, \tag{2.30}$$

$$W_3|\hat{k}, \sigma\rangle = -k\sigma|\hat{k}, \sigma\rangle, \quad W_2|\hat{k}, \sigma\rangle = k(-J_2 - K_1)|\hat{k}, \sigma\rangle = 0. \tag{2.31}$$

Hence $W^2|\hat{k}, \sigma\rangle = [-(k\sigma)^2 + (-k\sigma)^2]|\hat{k}, \sigma\rangle = 0$, for our choice of representations.

Note that the action of the translation subgroup on the Poincare representation is given by $U(1, a)|k, \sigma\rangle = \exp(-i a^\mu k_\mu)|k, \sigma\rangle$, while that of the Lorentz group is given in the box. Since any element of the Poincare group can be uniquely written as a product of a Lorentz transformation times a translation, we have specified the full action. Since this action is determined by the choice of the unitary, irreducible representation of the little group, the Poincare representation is said to be induced by a representation of the little group. These representations are taken to identify elementary quanta.

D. The Discrete Subgroups and their actions

Recall that we have two improper Lorentz transformations, the space inversion $\mathcal{P}: \Lambda^\mu{}_\nu = \mathrm{diag}(1, -1, -1, -1)$ and the time reversal $\mathcal{T}: \Lambda^\mu{}_\nu = \mathrm{diag}(-1, 1, 1, 1)$. All Lorentz transformations can be generated from the proper, orthochronous transformations combined with either or both of these. Clearly these transformations are of order 2, i.e. $\mathcal{P}^{-1} = \mathcal{P}$ and $\mathcal{T}^{-1} = \mathcal{T}$. Let us assume that these have an action on the physical state space preserving probabilities. Then by Wigner's theorem, these are represented either as linear and unitary or as anti-linear and anti-unitary operators on the Hilbert space. Let us use the same symbols to denote the corresponding operators, which should be clear from the context.
The homomorphism property,

$$U(\Lambda, a)^{-1}\, U(\lambda, b)\, U(\Lambda, a) = U\big((\Lambda, a)^{-1} \cdot (\lambda, b) \cdot (\Lambda, a)\big) \tag{2.32}$$

gives us the action of space inversion and time reversal on the proper, orthochronous transformations. Setting $a = 0$ and $\Lambda = \mathcal{P}$ or $\mathcal{T}$ leads to:

$$\mathcal{P}^{-1} U(\lambda, b)\, \mathcal{P} = U(\mathcal{P}^{-1}\lambda\mathcal{P},\ \mathcal{P}^{-1}b), \quad \mathcal{T}^{-1} U(\lambda, b)\, \mathcal{T} = U(\mathcal{T}^{-1}\lambda\mathcal{T},\ \mathcal{T}^{-1}b). \tag{2.33}$$

The same homomorphism (2.32), together with the infinitesimal form $U(1 + \omega, \epsilon) = 1 + \frac{i}{2}\omega_{\alpha\beta} M^{\alpha\beta} - i\epsilon_\alpha P^\alpha$ for $U(\lambda, b)$, gives us the relations: $U^{-1}(\Lambda)\, M^{\mu\nu}\, U(\Lambda) = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta M^{\alpha\beta}$, $U^{-1}(\Lambda)\, P^\mu\, U(\Lambda) = \Lambda^\mu{}_\nu P^\nu$. Following the same steps as before, but not canceling the factors of $i$ so as to allow for anti-linearity, we get,

$$\mathcal{P}^{-1}(iP^0)\mathcal{P} = iP^0, \quad \mathcal{P}^{-1}(iP^i)\mathcal{P} = -iP^i, \tag{2.34}$$

$$\mathcal{T}^{-1}(iP^0)\mathcal{T} = -iP^0, \quad \mathcal{T}^{-1}(iP^i)\mathcal{T} = iP^i. \tag{2.35}$$

Let, if possible, $\mathcal{P}$ be anti-linear. Then it anti-commutes with $P^0$, which represents the energy. This in turn means that for every positive energy state, there is a negative energy state. This conflicts with the energy being bounded below. An identical implication follows if $\mathcal{T}$ is linear! Hence, the space inversion operator must be linear and unitary while the time reversal operator must be anti-linear and anti-unitary. With this understood, the action of these discrete operators on the Poincare generators is given by,

$$\mathcal{P}^{-1} P^0 \mathcal{P} = P^0, \quad \mathcal{P}^{-1} P^i \mathcal{P} = -P^i, \quad \mathcal{P}^{-1} J_i \mathcal{P} = J_i, \quad \mathcal{P}^{-1} K_i \mathcal{P} = -K_i, \tag{2.36}$$

$$\mathcal{T}^{-1} P^0 \mathcal{T} = P^0, \quad \mathcal{T}^{-1} P^i \mathcal{T} = -P^i, \quad \mathcal{T}^{-1} J_i \mathcal{T} = -J_i, \quad \mathcal{T}^{-1} K_i \mathcal{T} = K_i. \tag{2.37}$$

From these defining actions, we can determine how these operators act on the unitary, irreducible representations of the Poincare group. It is easy to see that under space inversion and time reversal, the two Casimir invariants are invariant: $\mathcal{P}^{-1} P^2 \mathcal{P} = P^2$ and $\mathcal{P}^{-1} W^2 \mathcal{P} = W^2$, and likewise for $\mathcal{T}$. This is obvious from the action of these operators on the generators, which transform as their tensor indices indicate. The Casimir invariants are Lorentz scalars (not pseudo-scalars) and hence invariant under inversions and time reversal.
Thus neither parity nor time reversal mixes different irreducible representations. To evaluate their actions on individual vectors, we note that the irreducible representations are characterized in terms of the $\hat{k}$ vector and the eigenvalue(s) of the $J_3$ generator.

Action of $\mathcal{P}$:

$m > 0$: Here, $\hat{k} = (m, 0, 0, 0)$ and $\sigma = -s, -s+1, \ldots, s-1, s$, with $s$ being the spin. Since $\mathcal{P}$ commutes with $P^0$, anti-commutes with $P^i$ and commutes with $J_3$, we see that $\mathcal{P}|\hat{k}, \sigma\rangle$ also has the same eigenvalues of $P^\mu$, $J_3$. Hence, $\mathcal{P}|\hat{k}, \sigma\rangle = \eta_\sigma|\hat{k}, \sigma\rangle$, with $\eta_\sigma$ being a phase (since the vectors are non-degenerate). Next, from the usual angular momentum algebra, we know

$$(J_1 \pm iJ_2)|\hat{k}, \sigma\rangle = \sqrt{(s \mp \sigma)(s \pm \sigma + 1)}\ |\hat{k}, \sigma \pm 1\rangle \ \Rightarrow\ (J_1 \pm iJ_2)(\mathcal{P}|\hat{k}, \sigma\rangle) = \sqrt{(s \mp \sigma)(s \pm \sigma + 1)}\ (\mathcal{P}|\hat{k}, \sigma \pm 1\rangle) \ \Rightarrow\ \eta_\sigma = \eta_{\sigma \pm 1}.$$

Thus $\eta_\sigma$ is independent of $\sigma$ and equals $\pm 1$ since $\mathcal{P}^2 = 1$. The phase $\eta$ is called the intrinsic parity of the representation.

What about the action on the states $|k, \sigma\rangle$? Recall that these states are obtained as in (2.27),

$$|k, \sigma\rangle = \sqrt{\frac{\hat{k}^0}{k^0}}\ U(L(k))|\hat{k}, \sigma\rangle, \quad k := L(k)\hat{k},$$

for a chosen $L(k)$. Acting by the matrix $\mathcal{P}$ on the defining equation for the Lorentz transformation $L(k)$ gives $(\mathcal{P}k) = (\mathcal{P} L(k) \mathcal{P}^{-1})(\mathcal{P}\hat{k})$, which implies that $(\mathcal{P} L(k) \mathcal{P}^{-1}) = L(\mathcal{P}k)$ since $\mathcal{P}\hat{k} = \hat{k}$. The homomorphism property (2.32) then gives $\mathcal{P}^{-1} U(L(k)) \mathcal{P} = U(L(\mathcal{P}k))$. Acting on $|k, \sigma\rangle$ gives,

$$\mathcal{P}|k, \sigma\rangle = \sqrt{\frac{\hat{k}^0}{k^0}}\ U(L(\mathcal{P}k))\, \eta\, |\hat{k}, \sigma\rangle = \eta\, |\mathcal{P}k, \sigma\rangle. \tag{2.38}$$

Thus $\mathcal{P}$ acts on all vectors of the representation with the same $\eta$, and $\eta$ is therefore a property of the irreducible representation.

$m = 0$: Now $\hat{k} = (k, 0, 0, k)$ and the helicity $\sigma$ is the eigenvalue of $J_3$, the component of the angular momentum along the momentum direction. Since $\mathcal{P}$ changes the sign of the momentum but not of the angular momentum, $\mathcal{P}|\hat{k}, \sigma\rangle \propto |{-\hat{k}}, -\sigma\rangle$. Unlike the massive case, the direction of the reference momentum $\hat{k}$ is changed and this makes it inconvenient to infer the action of parity on a general state.
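For this reason, it is useful to choose a rotation $R_2$ about, say, the 2-axis to bring back $-\hat{k} \to \hat{k}$. Hence consider a new operator $\mathcal{P}' := U_2 \mathcal{P}$, with $U_2 := e^{-i\pi J_2}$. But $U_2$ also rotates $J_3$ (just as $\hat{k}$) and hence the helicity is unchanged by the $U_2$ operation. Hence, $\mathcal{P}'|\hat{k}, \sigma\rangle = \eta_\sigma|\hat{k}, -\sigma\rangle$. Here $\eta_\sigma$ is some other phase factor.

What about the action of $\mathcal{P}$ on a general state $|p, \sigma\rangle$ defined in equation (2.27)? Note that we are using $p := (|\vec{p}|, \vec{p})$ instead of $k$ to denote the general vector. Let $R(p, \hat{k})$ be the rotation that rotates the reference $\hat{k}$ into the direction of the general vector $p = (|\vec{p}|, \vec{p})$. Let $B(p, \hat{k})$ denote the boost along the 3-direction which changes the reference magnitude $k$ to $|\vec{p}|$. We can thus go from $\hat{k}$ to $p$ by first using the boost $B$ followed by the rotation $R(p, \hat{k})$: $|p, \sigma\rangle \sim U\big(R(p, \hat{k}) B(p, \hat{k})\big)|\hat{k}, \sigma\rangle$. Noting that $\mathcal{P}'$ commutes with boosts along the 3-axis, we have,

$$U(B(p, \hat{k})) = \mathcal{P}'\, U(B(p, \hat{k}))\, (\mathcal{P}')^{-1} = U_2\, \mathcal{P}\, U(B)\, \mathcal{P}^{-1} U_2^{-1} \ \Rightarrow\ \mathcal{P}\, U(B(p, \hat{k})) = U_2^{-1}\, U(B)\, U_2\, \mathcal{P}$$

$$\therefore\ \mathcal{P}|p, \sigma\rangle \sim U\big(R(p, \hat{k})\big)\, \mathcal{P}\, U\big(B(p, \hat{k})\big)|\hat{k}, \sigma\rangle = U\big(R(p, \hat{k}) R_2^{-1} B\big)\, \eta_\sigma\, |\hat{k}, -\sigma\rangle.$$

Now, although $R(p, \hat{k}) R_2^{-1}$ is a rotation that takes the 3-axis along $-\vec{p}$, its unitary representative is not quite $U(R(-\vec{p}, \hat{k}))$. It introduces additional phases. I refer you to equation (2.6.21) in [1]. The net result is that $\mathcal{P}|p, \sigma\rangle = \eta_\sigma \exp\{\mp i\pi\sigma\}\, |\mathcal{P}(p), -\sigma\rangle$. The additional phase $\mp\pi\sigma$ correlates with the sign of the 2-component of the momentum $\vec{p}$.

Action of $\mathcal{T}$:

$m > 0$: Now $\mathcal{T}$ commutes with $P^0$ and anti-commutes with $P^i$ and $J_i$.

$$\therefore\ P^i\, \mathcal{T}|\hat{k}, \sigma\rangle = 0, \quad P^0\, \mathcal{T}|\hat{k}, \sigma\rangle = m\, \mathcal{T}|\hat{k}, \sigma\rangle, \quad J_3\, \mathcal{T}|\hat{k}, \sigma\rangle = -\sigma\, \mathcal{T}|\hat{k}, \sigma\rangle.$$

The last one implies that $\mathcal{T}|\hat{k}, \sigma\rangle = \zeta_\sigma|\hat{k}, -\sigma\rangle$, where $\zeta_\sigma$ is a phase factor. From the angular momentum algebra we get,

$$(J_1 \pm iJ_2)|\hat{k}, \sigma\rangle = \sqrt{(s \mp \sigma)(s \pm \sigma + 1)}\ |\hat{k}, \sigma \pm 1\rangle \ \Rightarrow$$

$$(-J_1 \pm iJ_2)(\mathcal{T}|\hat{k}, \sigma\rangle) = \sqrt{(s \mp \sigma)(s \pm \sigma + 1)}\ (\mathcal{T}|\hat{k}, \sigma \pm 1\rangle) = \sqrt{(s \mp \sigma)(s \pm \sigma + 1)}\ \zeta_{\sigma \pm 1}\, |\hat{k}, -\sigma \mp 1\rangle \ \Rightarrow\ -\zeta_\sigma = \zeta_{\sigma \pm 1}, \quad \zeta_{\sigma+1} = \zeta_{\sigma-1}.$$

Choosing the phase for, say, $\zeta := \zeta_{\sigma = j}$, the phases for $j-1, j-2, \ldots$ alternate between $\pm\zeta$.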
Thus we can write the phase as $\zeta_\sigma = \zeta(-1)^{j-\sigma}$, where $\zeta$ is an arbitrarily chosen phase. The phase $\zeta$ can actually be absorbed away by putting $|\hat{k}, \sigma\rangle' := \sqrt{\zeta}\, |\hat{k}, \sigma\rangle$. The action of $\mathcal{T}$ on the new vector gives,

$$\mathcal{T}|\hat{k}, \sigma\rangle' = \sqrt{\zeta^*}\ \mathcal{T}|\hat{k}, \sigma\rangle = |\zeta|\, (-1)^{j-\sigma}\, |\hat{k}, -\sigma\rangle'.$$

Note: A similar manipulation with the $\eta$ phase will retain the phase in the redefined vectors and thus the intrinsic parity cannot be absorbed away. The anti-linearity of $\mathcal{T}$ is responsible for this feature. We may retain the irrelevant phase $\zeta$.

The action of the time reversal operator on the general vectors $|k, \sigma\rangle$ proceeds in exactly the same manner as above, leading to $\mathcal{T}|k, \sigma\rangle = \zeta(-1)^{j-\sigma}|k, -\sigma\rangle$. It follows immediately that $\mathcal{T}^2|k, \sigma\rangle = (-1)^{2j}|k, \sigma\rangle$, which distinguishes integer from half-integer spins.

$m = 0$: The analysis here proceeds in exactly the same manner as that for parity and we just note the final result: $\mathcal{T}|p, \sigma\rangle = \xi_\sigma \exp\{\mp i\pi\sigma\}\, |\mathcal{T}(p), \sigma\rangle$. The additional phase $\mp\pi\sigma$ correlates with the sign of the 2-component of the momentum $\vec{p}$.

This completes the basic definitions related to the Poincare group and its unitary, irreducible representations. These representations specify the attributes that we may assign to elementary quanta. However, they are not suitable for manifest covariance. Firstly, the label $\sigma$ transforms covariantly only under the little group and not the full Lorentz group. Secondly, the label $k$ is restricted to the positive hyperboloid and not the full $\mathbb{R}^4$. This makes it inconvenient to take a Fourier transform to link them with space-time coordinates. For this we need to construct representations on vector valued functions on the space-time. This is done next.

3. REPRESENTATIONS SUITABLE FOR MANIFEST COVARIANCE: "FIELD REPRESENTATIONS"

As noted earlier, the abstractly classified unitary, irreducible representations of the Poincare group are not suitable for manifest Lorentz covariance.
For this, we begin by defining a vector space of suitably chosen functions and define an action (homomorphism) of the Poincare group on it. These representations will be reducible in general and irreducibility will be imposed by partial differential equations ("field equations"). Making the action unitary requires an inner product to be defined. This will further show a 'doubling of representations' leading to the particle/anti-particle interpretation.

Let $\Psi^A(x)$ be a finite dimensional column vector of a suitable class of complex/real valued functions of the space-time coordinates $x^\mu$. The defining action of the Poincare group is: $x \to gx := x_{(\Lambda,a)} = \Lambda x + a$. It induces a corresponding action on the functions which we denote as:

$$\Psi \to (g\Psi) := \Psi_g, \quad \Psi_g(x) = D(h(g))\Psi(g^{-1}x) \ \leftrightarrow\ \Psi'^A_{(\Lambda,a)}(x) = D^A{}_B(h(\Lambda, a))\, \Psi^B((\Lambda, a)^{-1}x).$$

Here, $D$ is a finite dimensional representation of the Lorentz group and $h(\Lambda, a)$ is a map from the Poincare group to the Lorentz group. The above action is required to be a homomorphism, i.e.

$$(g\Psi)(x) = D(h(g))\Psi(g^{-1}x) \ \text{ such that }\ (g'(g\Psi))(x) = (g'g\Psi)(x) \quad \forall\ g, g' \in \text{Poincare and } \forall\ x. \tag{3.1}$$

$$\text{l.h.s.} = D(h(g'))\, (g\Psi)(g'^{-1}x) = D(h(g'))\, D(h(g))\, \Psi(g^{-1}g'^{-1}x) = D(h(g'))\, D(h(g))\, \Psi((g'g)^{-1}x). \tag{3.2}$$

$$\text{r.h.s.} = D(h(g'g))\, \Psi((g'g)^{-1}x). \tag{3.3}$$

$$\therefore\ \text{the } h(g)\text{'s must satisfy } D(h(g'))\, D(h(g)) = D(h(g'g)) \quad \forall\ g, g' \in \text{Poincare}. \tag{3.4}$$

$$\text{Equivalently, } D(h(g')h(g)) = D(h(g'g)), \text{ which implies } h(g')h(g) = h(g'g), \text{ or } h(g) \text{ must be a homomorphism}. \tag{3.5}$$

Since our primary focus is on the Lorentz group, and $h(g) = h((1, a) \cdot (\Lambda, 0)) = h((1, a))\, h((\Lambda, 0))$, we may choose $h((1, a)) = 1$. This reduces the homomorphism from Poincare into Lorentz to a homomorphism from Lorentz to Lorentz. We can and do take this homomorphism to be the identity homomorphism. With this we write,

$$\Psi_{(\Lambda,a)}(x) := D(\Lambda)\Psi(\Lambda^{-1}(x - a)). \tag{3.6}$$

A.
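Induced representation of Poincare from inducing representation of Lorentz

We have thus got an induced representation of the Poincare group from an inducing representation $D(\Lambda)$ of the Lorentz group, and we take $D$ to be an irreducible representation. There is no requirement of unitarity as $\Psi^A(x)$ is not a quantum mechanical wave function. Additionally, it is a fact that the Lorentz group has no non-trivial finite dimensional, unitary representations. Our task now reduces to finding all possible irreducible, finite dimensional representations of the Lorentz group.

To appreciate irreducibility, consider the infinitesimal action: $(\Lambda, a) = (1 + \omega, \epsilon)$, $D(\Lambda) = 1 + \frac{i}{2}\omega_{\alpha\beta} T^{\alpha\beta}$, where the $T$'s represent the Lorentz generators in the $D$ representation. Then,

$$\Psi_{1+\omega,\epsilon}(x) = \Big(1 + \frac{i}{2}\omega \cdot T\Big)\Psi(x - \omega x - \epsilon) = \Psi(x) + \frac{i}{2}\omega \cdot T\, \Psi(x) - (\omega^{\alpha\beta} x_\beta + \epsilon^\alpha)\, \partial_\alpha \Psi\big|_x. \tag{3.7}$$

This defines the Poincare generators acting on $\Psi(x)$. Explicitly,

$$(M^{\alpha\beta}\Psi)^A := (T^{\alpha\beta})^A{}_B\, \Psi^B(x) + i(\eta^{\gamma\alpha} x^\beta - \eta^{\gamma\beta} x^\alpha)\, \partial_\gamma \Psi^A(x), \tag{3.8}$$

$$(P_\mu \Psi)^A := -i\partial_\mu \Psi^A(x). \tag{3.9}$$

The two Casimir invariants, $P^2$, $W^2$, acting on $\Psi$ should evaluate to some constants to satisfy the necessary condition for irreducibility of the Poincare representation. The $P^2$ Casimir is easy to evaluate and gives,

$$(P^2\Psi)^A = \eta^{\mu\nu}(-i)^2 \partial_\mu \partial_\nu \Psi^A = -\Box\Psi^A, \quad \Box := -\partial_0^2 + \nabla^2. \tag{3.10}$$

For the second Casimir we have,

$$(W_\mu \Psi)^A = \frac{1}{2}\epsilon_{\mu\nu\alpha\beta}\, (M^{\nu\alpha} P^\beta \Psi)^A = \frac{1}{2}\epsilon_{\mu\nu\alpha\beta} \Big[(T^{\nu\alpha})^A{}_B + i\delta^A{}_B (\eta^{\gamma\nu} x^\alpha - \eta^{\gamma\alpha} x^\nu)\, \partial_\gamma\Big] \Big[-i\delta^B{}_C\, \partial^\beta \Psi^C\Big]$$

$$= -\frac{i}{2}\epsilon_{\mu\nu\alpha\beta}\, (T^{\nu\alpha})^A{}_B\, \partial^\beta \Psi^B + \epsilon_{\mu\nu\alpha\beta}\, x^\alpha \partial^\nu \partial^\beta \Psi^A = -\frac{i}{2}\epsilon_{\mu\nu\alpha\beta}\, (T^{\nu\alpha})^A{}_B\, \partial^\beta \Psi^B + 0. \tag{3.11}$$

Note that the 2-derivative term has canceled by the antisymmetry of $\epsilon$ in $\nu, \beta$. Then,

$$(W^2\Psi)^A = \frac{1}{2}\epsilon^{\mu\nu'\alpha'\beta'}\, (M_{\nu'\alpha'} P_{\beta'})^A{}_B\, (W_\mu \Psi)^B = \frac{1}{2}\epsilon^\mu{}_{\nu'\alpha'\beta'} \Big[(T^{\nu'\alpha'})^A{}_B + i\delta^A{}_B (\eta^{\gamma\nu'} x^{\alpha'} - \eta^{\gamma\alpha'} x^{\nu'})\, \partial_\gamma\Big] \Big[-i\delta^B{}_C\, \partial^{\beta'}\Big] (W_\mu \Psi)^C$$

$$= -\frac{i}{2}\epsilon^\mu{}_{\nu'\alpha'\beta'}\, (T^{\nu'\alpha'})^A{}_B\, \partial^{\beta'} (W_\mu \Psi)^B + \epsilon^\mu{}_{\nu'\alpha'\beta'}\, x^{\alpha'} \partial^{\nu'} \partial^{\beta'} (W_\mu \Psi)^A = -\frac{1}{4}\epsilon^\mu{}_{\nu'\alpha'\beta'}\, \epsilon_{\mu\nu\alpha\beta}\, (T^{\nu'\alpha'} T^{\nu\alpha})^A{}_B\, \partial^{\beta'} \partial^\beta \Psi^B. \tag{3.12}$$

Notice that $W^2\Psi$ can be non-zero only when the $D$ representation is not one dimensional, since $T^{\alpha\beta} = 0$ in a one dimensional representation.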
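As illustrations consider two examples: (i) $D$ is the trivial representation of the Lorentz group and (ii) $D$ is the defining representation of the Lorentz group.

$D(\Lambda) = 1$: Now $W^2\Psi = 0$ while $P^2\Psi = -\Box\Psi = -m^2\Psi$. Thus, irreducibility requires that $\Psi$ must satisfy $(\Box - m^2)\Psi(x) = 0$, and we recognize this as the Klein-Gordon equation. The $\Psi$ transforms as: $\Psi_{(\Lambda,a)}(x) = \Psi(\Lambda^{-1}(x - a))$.

$D(\Lambda)$ is the defining representation: This means $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu := \delta^\mu{}_\nu + \frac{i}{2}(\omega_{\alpha\beta} T^{\alpha\beta})^\mu{}_\nu$ and hence $(T^{\alpha\beta})^\mu{}_\nu := -i(\eta^{\alpha\mu}\delta^\beta{}_\nu - \eta^{\beta\mu}\delta^\alpha{}_\nu)$. This gives,

$$(T^{\nu'\alpha'} T^{\nu\alpha})^\rho{}_\sigma = -\Big[\eta^{\nu'\rho}\eta^{\nu\alpha'}\delta^\alpha{}_\sigma - \eta^{\alpha'\rho}\eta^{\nu\nu'}\delta^\alpha{}_\sigma - \eta^{\nu'\rho}\eta^{\alpha\alpha'}\delta^\nu{}_\sigma + \eta^{\alpha'\rho}\eta^{\alpha\nu'}\delta^\nu{}_\sigma\Big].$$

This is manifestly antisymmetric in $(\nu'\alpha')$ and $(\nu\alpha)$. Contraction with the epsilons gives four equal contributions which cancel the factor of 4, leading to $(W^2\Psi)^\rho = \epsilon^{\mu\rho\nu}{}_{\beta'}\, \epsilon_{\mu\nu\sigma\beta}\, \partial^{\beta'}\partial^\beta \Psi^\sigma$. Recall the general identity:

$$\epsilon^{a_1 \cdots a_j a_{j+1} \cdots a_n}\, \epsilon_{a_1 \cdots a_j b_{j+1} \cdots b_n} = (-1)^s\, (n-j)!\, j!\ \delta^{[a_{j+1}}{}_{b_{j+1}} \cdots \delta^{a_n]}{}_{b_n}, \tag{3.13}$$

where $s$ is the index of the metric, i.e. the number of negative eigenvalues of the metric. In our case the metric is the Minkowski metric and $s = 1$. This gives,

$$\epsilon^{\mu\rho\nu}{}_{\beta'}\, \epsilon_{\mu\nu\sigma\beta} = -\epsilon^{\mu\nu\rho}{}_{\beta'}\, \epsilon_{\mu\nu\sigma\beta} = -(-1)\, 2!\, 2!\, \frac{1}{2}\big(\delta^\rho{}_\sigma\, \eta_{\beta'\beta} - \eta_{\beta'\sigma}\, \delta^\rho{}_\beta\big),$$

leading to,

$$(W^2\Psi)^\rho = 2\big(\delta^\rho{}_\sigma\, \eta^{\beta'\beta} - \eta^{\beta\rho}\, \delta^{\beta'}{}_\sigma\big)\, \partial_{\beta'}\partial_\beta \Psi^\sigma = 2\big[\Box\Psi^\rho - \partial^\rho(\partial_\sigma \Psi^\sigma)\big], \tag{3.14}$$

$$(P^2\Psi)^\rho = -\Box\Psi^\rho. \tag{3.15}$$

Note: We have specified the action of the Poincare group on $\Psi^A$. How does $\partial_\mu \Psi^A$ transform? We can think of the derivative as a new quantity with two indices. The index $A$ will transform by $(T^{\alpha\beta})^A{}_B$ as before while the index $\mu$ will transform by the defining representation of the Lorentz group, $(T^{\alpha\beta})_\mu{}^\nu = +i(\eta^{\alpha\nu}\delta^\beta{}_\mu - \eta^{\beta\nu}\delta^\alpha{}_\mu)$. In the above calculation of the second $W_\mu$, this was missed. However it may be checked that this extra contribution vanishes in the $W^2$ calculation.

B. Irreducibility and field equations

The Poincare representation will be irreducible provided the Casimir invariants have fixed values. Let $P^2\Psi^\rho = -m^2\Psi^\rho$ and $W^2\Psi^\rho = C\Psi^\rho$.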
For non-zero mass, with spin $j$, we know that $W^2$ equals $m^2 j(j+1)$ and hence $C = m^2 j(j+1)$. Thus $\Psi^\rho$ satisfies,

$$\Box\Psi^\rho = m^2\Psi^\rho, \quad 2(m^2\Psi^\rho - \partial^\rho\, \partial \cdot \Psi) = m^2 j(j+1)\Psi^\rho.$$

We have two cases to consider. The defining vector representation of the Lorentz group splits as $1 \oplus 0$ under the rotation subgroup. Hence $j = 1$ or $0$. For $j = 1$, the second equation reduces to $\partial^\rho(\partial \cdot \Psi) = 0$; taking its divergence and using $\Box(\partial \cdot \Psi) = m^2\, \partial \cdot \Psi$ implies $\partial \cdot \Psi = 0$. For $j = 0$, this argument gives an identity. However, the vanishing of $W^2$ gives $\partial_\rho\, \partial \cdot \Psi = m^2\Psi_\rho$. Differentiating by $\partial_\sigma$ and antisymmetrising in the indices gives $\partial_\rho \Psi_\sigma = \partial_\sigma \Psi_\rho \Rightarrow \Psi_\rho = \partial_\rho \Phi$ for some scalar $\Phi$ which is defined up to a constant. This scalar satisfies $(\Box - m^2)\Phi = $ constant, which can be taken to be zero.

Thus, if the defining representation is to give the massive, spin one irreducible representation, $\Psi^\rho$ must satisfy: $(\Box - m^2)\Psi^\rho = 0 = \partial \cdot \Psi$. If it is to give the massive, spin zero irreducible representation, then $\Psi_\rho = \partial_\rho \Phi$ with $(\Box - m^2)\Phi = 0$.

For $m = 0$ and our restriction on the representations of the little group, the Casimir conditions give $\Box\Psi_\mu = 0$ and $\partial_\mu \partial_\nu \Psi^\nu = 0$. Introduce a new field, $A_\mu := \Psi_\mu + \partial_\mu \Lambda$. It follows that,

$$\Box A_\mu = 0 + \partial_\mu \Box\Lambda, \quad \partial_\mu \partial^\nu A_\nu = 0 + \partial_\mu \Box\Lambda \ \Rightarrow\ \Box A_\mu - \partial_\mu \partial^\nu A_\nu = 0 = \partial^\nu(\partial_\nu A_\mu - \partial_\mu A_\nu).$$

We recognize the last equality as the source free Maxwell equation, including its gauge invariance: $A_\mu \to A_\mu + \partial_\mu \Lambda$!

These examples show that the induced representation $\Psi^A$ of the Poincare group, defined through the inducing representation of the Lorentz subgroup, is in general reducible. Requiring that the Poincare Casimir invariants evaluated on $\Psi$ give the values corresponding to a unitary, irreducible representation of the Poincare group imposes differential conditions on $\Psi$, e.g. the Klein-Gordon equation and the subsidiary conditions. Usually, these equations are proposed as the field equations. Here we see them arising as irreducibility conditions.

Let us return to the classification of the finite dimensional representations of the Lorentz group.
We already know that the defining representation acts as: $v'^\mu = \Lambda^\mu{}_\nu v^\nu \leftrightarrow v'_\mu = \Lambda_\mu{}^\nu v_\nu = (\Lambda^{-1})^\nu{}_\mu v_\nu$. From these we can trivially construct other irreducible representations by taking tensor products and subtracting suitable 'traces'. For instance, a rank-2 tensor $V^{\mu\nu}$ transforms as $V'^{\mu\nu} = \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta V^{\alpha\beta}$. This is a reducible representation though. Why? Because its symmetrised and anti-symmetrised parts transform among themselves and thus form invariant subspaces. Furthermore, tensors of the form $\eta^{\mu\nu} C$ also transform into the same form. Hence the symmetric combination splits further into a traceless and a trace part. Thus the space $V$ of second rank tensors decomposes as: $V = V_{\text{anti-sym}} \oplus V_{\text{sym,traceless}} \oplus V_{\text{trace}}$, each being an irreducible representation. An analogous construction can be carried out for higher rank tensors.

These however do not exhaust all the finite dimensional irreducible representations. There are the spinor representations which are missed. It is a result that all finite dimensional representations of pseudo-orthogonal groups, $SO(p, q),\ p + q \geq 2$, can be constructed from the irreducible representations of the corresponding Clifford algebra.

C. Clifford Algebras and Spinor representations

Clifford algebras are algebras generated by elements $\{1, \gamma^\mu,\ \mu = 0, \ldots, (p + q - 1)\}$ satisfying $\gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2\bar{\eta}^{\mu\nu}\, 1$, $\bar{\eta}^{\mu\nu} = \mathrm{diag}(-1, \ldots, -1, +1, \ldots, +1)$. There are $p$ negative and $q$ positive signs. We have put a bar on the metric since we will relate it to the $\eta$ defining the Lorentz group, for which $p = 1$ and $q = 3$. Note that for $\mu \neq \nu$ the gammas anti-commute, $(\gamma^0)^2 = \bar{\eta}^{00}$ and $(\gamma^i)^2 = \bar{\eta}^{ii}$.

Since the gammas anti-commute, their bilinears satisfy commutation relations. Let $\Sigma^{\mu\nu} := a(\gamma^\mu \gamma^\nu - \gamma^\nu \gamma^\mu)$. We will choose the proportionality constant $a$ suitably.
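Consider,

$$\big[\Sigma^{\mu\nu}, \Sigma^{\alpha\beta}\big] = a^2 \Big\{ \big[\gamma^\mu \gamma^\nu, \gamma^\alpha \gamma^\beta\big] - (\mu \leftrightarrow \nu) - (\alpha \leftrightarrow \beta) + (\mu, \alpha \leftrightarrow \nu, \beta) \Big\} \tag{3.16}$$

$$\gamma^\mu \gamma^\nu \gamma^\alpha \gamma^\beta = 2\bar{\eta}^{\nu\alpha} \gamma^\mu \gamma^\beta - 2\bar{\eta}^{\mu\alpha} \gamma^\nu \gamma^\beta + 2\bar{\eta}^{\nu\beta} \gamma^\alpha \gamma^\mu - 2\bar{\eta}^{\mu\beta} \gamma^\alpha \gamma^\nu + \gamma^\alpha \gamma^\beta \gamma^\mu \gamma^\nu$$

$$\therefore\ \frac{1}{2}\big[\gamma^\mu \gamma^\nu, \gamma^\alpha \gamma^\beta\big] = -\bar{\eta}^{\mu\alpha} \gamma^\nu \gamma^\beta + \bar{\eta}^{\nu\alpha} \gamma^\mu \gamma^\beta - \bar{\eta}^{\mu\beta} \gamma^\alpha \gamma^\nu + \bar{\eta}^{\nu\beta} \gamma^\alpha \gamma^\mu$$

$$\therefore\ \big[\Sigma^{\mu\nu}, \Sigma^{\alpha\beta}\big] = (4a) \Big\{ -\bar{\eta}^{\mu\alpha} \Sigma^{\nu\beta} + (\mu \leftrightarrow \nu) + (\alpha \leftrightarrow \beta) - (\mu, \alpha \leftrightarrow \nu, \beta) \Big\} \tag{3.17}$$

This has the same form as the algebra satisfied by the Lorentz generators: $\big[M^{\mu\nu}, M^{\alpha\beta}\big] = i\big(\eta^{\mu\alpha} M^{\nu\beta} - (\mu \leftrightarrow \nu) - (\alpha \leftrightarrow \beta) + (\mu, \alpha \leftrightarrow \nu, \beta)\big)$, where the $M$'s were defined through $U(\Lambda = 1 + \omega) = 1 + \frac{i}{2}\omega_{\mu\nu} M^{\mu\nu}$. Thus we need $-4a\bar{\eta}^{\mu\nu} = i\eta^{\mu\nu}$. There are two ways to satisfy this and we choose our conventions as: $\bar{\eta} = -\eta$ and $a = \frac{i}{4}$. Thus,

$$\gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = -2\eta^{\mu\nu}, \quad \Sigma^{\mu\nu} := \frac{i}{4}(\gamma^\mu \gamma^\nu - \gamma^\nu \gamma^\mu) = \frac{i}{4}\big[\gamma^\mu, \gamma^\nu\big]. \tag{3.18}$$

The $\Sigma$'s satisfy the Lorentz Lie algebra. Furthermore, the definitions imply that $[\Sigma^{\mu\nu}, \gamma^\lambda] = i(\eta^{\mu\lambda}\gamma^\nu - \eta^{\nu\lambda}\gamma^\mu)$, i.e. the $\gamma$'s transform as Lorentz vectors. It is useful to define $\gamma_5 := +i\gamma^0 \gamma^1 \gamma^2 \gamma^3$. Just from the Clifford algebra and the definitions it follows that $\gamma_5^2 = +1$, $\gamma_5 \gamma^\mu = -\gamma^\mu \gamma_5$, and $[\gamma_5, \Sigma^{\mu\nu}] = 0$.

Claim: It is always possible to choose the $\gamma$'s to be finite dimensional unitary matrices.

This is based on the following facts: (a) The set of elements $G := \{\pm 1, \pm\gamma^\mu, \pm\gamma^\mu \gamma^\nu, \pm\gamma^\mu \gamma^\nu \gamma^\lambda, \ldots, \pm\gamma_5\}$, with all indices distinct in the products, forms a finite group of 32 elements (in our 4 dimensional case); (b) all representations of finite groups can be made unitary; (c) the representation theory of finite groups applied to the group $G$ gives that there is exactly one non-trivial, unitary, irreducible representation of $G$, and hence of the Clifford algebra, and that its dimension is $2^{[4/2]} = 4$. The results also extend to other dimensions.

Unitarity of the $\gamma$'s and their squares being $\pm 1$ imply further that $(\gamma^0)^\dagger = \gamma^0$ and $(\gamma^i)^\dagger = -\gamma^i$. This in turn gives $(\Sigma^{0i})^\dagger = -\Sigma^{0i}$, $(\Sigma^{ij})^\dagger = \Sigma^{ij}$ and $\gamma_5^\dagger = \gamma_5$.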
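The algebraic statements above can be verified on an explicit set of matrices. The sketch below assumes the Dirac representation, $\gamma^0 = \mathrm{diag}(1, 1, -1, -1)$ and $\gamma^i$ with off-diagonal blocks $\pm\sigma_i$; this choice, and all helper names (`block`, `mul`, `lin`, `sigma`, ...), are assumptions made for illustration, since the notes do not fix a representation. It uses only the standard library and the mostly-plus metric $\eta = \mathrm{diag}(-1, 1, 1, 1)$ used in these notes.

```python
# Assumed explicit representation (Dirac); the notes only use the abstract algebra.
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # mostly-plus metric
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def neg2(s):
    return [[-x for x in row] for row in s]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def lin(ca, a, cb, b):
    """Return the linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def comm(a, b):
    return lin(1, mul(a, b), -1, mul(b, a))

def anticomm(a, b):
    return lin(1, mul(a, b), 1, mul(b, a))

ONE = block(I2, Z2, Z2, I2)
ZERO = [[0] * 4 for _ in range(4)]
GAMMA = [block(I2, Z2, Z2, neg2(I2))] + [block(Z2, s, neg2(s), Z2) for s in PAULI]
GAMMA5 = lin(1j, mul(mul(GAMMA[0], GAMMA[1]), mul(GAMMA[2], GAMMA[3])), 0, ONE)

def sigma(mu, nu):
    """Sigma^{mu nu} = (i/4) [gamma^mu, gamma^nu], as in (3.18)."""
    return lin(0.25j, mul(GAMMA[mu], GAMMA[nu]), -0.25j, mul(GAMMA[nu], GAMMA[mu]))

def lorentz_rhs(m, n, a, b):
    """i (eta^{ma} S^{nb} - eta^{na} S^{mb} - eta^{mb} S^{na} + eta^{nb} S^{ma})."""
    first = lin(1j * ETA[m][a], sigma(n, b), -1j * ETA[n][a], sigma(m, b))
    second = lin(-1j * ETA[m][b], sigma(n, a), 1j * ETA[n][b], sigma(m, a))
    return lin(1, first, 1, second)

idx = range(4)
sp = (1, 2, 3)
clifford_ok = all(close(anticomm(GAMMA[m], GAMMA[n]), lin(-2 * ETA[m][n], ONE, 0, ONE))
                  for m in idx for n in idx)
vector_ok = all(close(comm(sigma(m, n), GAMMA[l]),
                      lin(1j * ETA[m][l], GAMMA[n], -1j * ETA[n][l], GAMMA[m]))
                for m in idx for n in idx for l in idx)
lorentz_ok = all(close(comm(sigma(m, n), sigma(a, b)), lorentz_rhs(m, n, a, b))
                 for m in idx for n in idx for a in idx for b in idx)
gamma5_ok = (close(mul(GAMMA5, GAMMA5), ONE)
             and all(close(anticomm(GAMMA5, g), ZERO) for g in GAMMA)
             and all(close(comm(GAMMA5, sigma(m, n)), ZERO) for m in idx for n in idx)
             and close(dagger(GAMMA5), GAMMA5)
             and abs(sum(GAMMA5[i][i] for i in idx)) < 1e-12)
hermiticity_ok = (all(close(dagger(sigma(0, i)), lin(-1, sigma(0, i), 0, ONE)) for i in sp)
                  and all(close(dagger(sigma(i, j)), sigma(i, j)) for i in sp for j in sp))
print(clifford_ok, vector_ok, lorentz_ok, gamma5_ok, hermiticity_ok)
```

Each flag corresponds to one statement in the text: the Clifford relation with $\bar{\eta} = -\eta$, the vector transformation of the $\gamma$'s, the Lorentz algebra of the $\Sigma$'s, the properties of $\gamma_5$, and the (anti-)Hermiticity of the boost and rotation generators. A check in one representation is of course not a proof, only a guard against sign slips in the conventions.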
Hence, the representation $D$ of the Lorentz group provided by the $\Sigma$'s is (i) non-unitary and (ii) reducible, since $\gamma_5$ is Hermitian and commutes with the generators.

Since the $\gamma$'s are finite dimensional, their traces are defined and all of them, including $\gamma_5$, are traceless. In particular, this implies that $\gamma_5$ has two eigenvalues equal to $+1$ and the other two equal to $-1$. Hence $\frac{1 \pm \gamma_5}{2}$ are projection matrices and reduce the 4 dimensional representation of the Lorentz group into two irreducible representations of dimension 2. By convention,

$$\Psi_L := \frac{1 - \gamma_5}{2}\Psi \ \leftrightarrow\ \gamma_5 \Psi_L = -\Psi_L \ \text{ is called a left handed Weyl spinor, while}$$

$$\Psi_R := \frac{1 + \gamma_5}{2}\Psi \ \leftrightarrow\ \gamma_5 \Psi_R = +\Psi_R \ \text{ is called a right handed Weyl spinor.}$$

The 4-component $\Psi^A$'s are called Dirac spinors while the 2-component projections, $\Psi_\pm := \frac{1 \pm \gamma_5}{2}\Psi$, are called Weyl spinors. Choosing $D$ to be generated by the $\Sigma_\pm := \frac{1 \pm \gamma_5}{2}\Sigma$ gives irreducible representations of the Lorentz group and in turn an irreducible representation of the Poincare group.

Returning to the Poincare representation, consider the combination $\gamma^\mu P_\mu$, apparently a Lorentz scalar. Indeed, recalling that $[M^{\mu\nu}, P_\lambda] = +i(\delta^\mu{}_\lambda P^\nu - \delta^\nu{}_\lambda P^\mu)$, it follows that,

$$\big[T^{\mu\nu}, \gamma^\lambda P_\lambda\big] = \big[\Sigma^{\mu\nu}, \gamma^\lambda\big] P_\lambda + \gamma^\lambda \big[M^{\mu\nu}, P_\lambda\big] = 0.$$

We have used $T$ to denote a general representation, $M$ to denote the defining representation and $\Sigma$ to denote the reducible spinor representation of the Lorentz group. We use the conventional abbreviation $\not{a} := \gamma^\mu a_\mu$ for all 4-vectors $a_\mu$.

Consider the Poincare Casimir $P^2$. Irreducibility forces $P^2\Psi = -m^2\Psi$. Let $m \neq 0$. The combinations $m 1 \pm \not{p}$ commute with the Lorentz generators and also satisfy $(m \pm \not{p})^2 = m^2 + (\not{p})^2 \pm 2m\not{p} = 2m(m \pm \not{p})$, where we used $(\not{p})^2 = -p^2 = m^2$. Therefore, $\pi_\pm := \frac{m \pm \not{p}}{2m}$ is a projection operator (an operator, since $P$ is an operator) that also commutes with the Lorentz generators. It trivially commutes with the Poincare generators as well.
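The projector property of $\pi_\pm$ can be illustrated numerically. The sketch below makes two assumptions that are not part of the notes: the $\gamma$'s are taken in the Dirac representation, and the operator $P_\mu$ is replaced by its eigenvalue $p_\mu$ on a momentum eigenfunction with $p^2 = -m^2$ (sample values chosen arbitrarily). The helper names are illustrative only.

```python
import math

# Assumed Dirac representation with {gamma^mu, gamma^nu} = -2 eta^{mu nu}, eta = diag(-1,1,1,1).
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def neg2(s):
    return [[-x for x in row] for row in s]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def lin(ca, a, cb, b):
    """Return the linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-10):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

ONE = block(I2, Z2, Z2, I2)
ZERO = [[0] * 4 for _ in range(4)]
GAMMA = [block(I2, Z2, Z2, neg2(I2))] + [block(Z2, s, neg2(s), Z2) for s in PAULI]
GAMMA5 = lin(1j, mul(mul(GAMMA[0], GAMMA[1]), mul(GAMMA[2], GAMMA[3])), 0, ONE)

# Sample on-shell momentum (illustrative values): p^mu = (E, p_vec), p^2 = -m^2.
m = 1.5
p_vec = (0.3, -1.2, 0.7)
E = math.sqrt(sum(x * x for x in p_vec) + m * m)
p_lower = [-E, p_vec[0], p_vec[1], p_vec[2]]  # p_mu = eta_{mu nu} p^nu

# pslash = gamma^mu p_mu
pslash = ZERO
for mu in range(4):
    pslash = lin(1, pslash, p_lower[mu], GAMMA[mu])

pi_plus = lin(0.5, ONE, 0.5 / m, pslash)
pi_minus = lin(0.5, ONE, -0.5 / m, pslash)

mass_shell_ok = close(mul(pslash, pslash), lin(m * m, ONE, 0, ONE))   # pslash^2 = m^2
idempotent_ok = (close(mul(pi_plus, pi_plus), pi_plus)
                 and close(mul(pi_minus, pi_minus), pi_minus))
orthogonal_ok = close(mul(pi_plus, pi_minus), ZERO)
complete_ok = close(lin(1, pi_plus, 1, pi_minus), ONE)
# gamma_5 anti-commutes with pslash, so conjugation by gamma_5 exchanges the projectors.
swapped_ok = close(mul(mul(GAMMA5, pi_plus), GAMMA5), pi_minus)
print(mass_shell_ok, idempotent_ok, orthogonal_ok, complete_ok, swapped_ok)
```

The last flag illustrates why the chiral projectors $\frac{1 \pm \gamma_5}{2}$ and the projectors $\pi_\pm$ cannot be imposed simultaneously for $m \neq 0$: $\gamma_5$ maps the image of $\pi_+$ onto that of $\pi_-$.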
Hence the Poincare representation induced by the Dirac spinor is reducible in yet another way, the irreducible subspaces being provided by $(\pm m + \not{p})\Psi = (-i\gamma^\mu \partial_\mu \pm m)\Psi = 0$. This is just the Dirac equation! Note that the projection property holds only for non-zero mass.

Note: The projectors $\frac{1 \pm \gamma_5}{2}$ reduce the $D$ representation of the Lorentz group and each of these induces an irreducible representation of the Poincare group. By contrast, the $\frac{m \pm \not{p}}{2m}$ do not reduce the $D$ representation, but nevertheless reduce the Poincare representations. Are either of these Poincare representations further reduced by the other projector? The answer is 'No' because $\gamma_5$ anti-commutes with $\not{p}$ and thus exchanges the two projectors.

Note: Consider the action of $\not{p}$ on the left (right) handed Weyl spinors. It follows immediately that $\gamma_5(\not{p}\,\Psi_{L/R}) = \pm\not{p}\,\Psi_{L/R}$, i.e. $\not{p}$ maps a left handed spinor to a right handed one and vice versa. Therefore, if we restrict to any one irreducible Weyl representation, then $\not{p}$ must annihilate it. That is, Weyl spinors satisfy the massless Dirac equation, also known as the Weyl equation. Conversely, if we have a Dirac spinor $\Psi$ that satisfies the Weyl equation, then its Weyl components also satisfy the Weyl equation individually: $\not{p}(\Psi_L + \Psi_R) = 0 \Rightarrow \not{p}\,\Psi_{L/R} = 0$.

Exercises:

1. Form the anti-symmetrized products of the $\gamma$-matrices satisfying the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$,

$$\Gamma^\mu := \gamma^\mu$$
$$\Gamma^{\mu\nu} := \gamma^\mu \gamma^\nu - \gamma^\nu \gamma^\mu$$
$$\Gamma^{\mu\nu\lambda} := \gamma^\mu \gamma^\nu \gamma^\lambda + \gamma^\nu \gamma^\lambda \gamma^\mu + \gamma^\lambda \gamma^\mu \gamma^\nu - \gamma^\nu \gamma^\mu \gamma^\lambda - \gamma^\mu \gamma^\lambda \gamma^\nu - \gamma^\lambda \gamma^\nu \gamma^\mu$$
$$\Gamma^{\mu\nu\alpha\beta} := \gamma^{[\mu} \gamma^\nu \gamma^\alpha \gamma^{\beta]} \ \text{ with no numerical overall factors.}$$

Since the indices take 4 values only, we cannot have any more antisymmetric products. Together with $1$, this set comprises 16 matrices. Show that products of any number of $\gamma$'s can be expressed in terms of these 16 matrices together with products of $\eta$'s. Conclude that these 16 matrices constitute a basis for arbitrary products of $\gamma$'s. Since they have different Lorentz transformation properties, they are all independent.
The vector space of $k \times k$ matrices has dimension $k^2$, hence the minimum matrix order for the $\gamma$'s is 4 and they are necessarily an irreducible representation of the Clifford algebra. Hence our $\gamma$'s are $4 \times 4$. Analogous arguments hold for the Clifford algebra of $n$ dimensions (indices taking $n$ values).

2. For spinors satisfying the Dirac equation with $m \neq 0$, evaluate the Pauli-Lubanski scalar, $W_\mu W^\mu \Psi$.

3. The Lorentz $J_i, K_i$ satisfy the algebra $[J_i, J_j] = i\epsilon_{ijl} J_l$; $[K_i, K_j] = -i\epsilon_{ijl} J_l$; $[J_i, K_j] = i\epsilon_{ijl} K_l$. Define: $A_i := \frac{1}{2}(J_i + iK_i)$, $B_i := \frac{1}{2}(J_i - iK_i)$. Check that $[A_i, A_j] = i\epsilon_{ijl} A_l$, $[B_i, B_j] = i\epsilon_{ijl} B_l$, $[A_i, B_j] = 0$. Thus the $M^{\mu\nu}$ Lie algebra is equivalent to the direct sum of two mutually commuting $SU(2)$ algebras. They have the Casimir invariants $\vec{A}^2, \vec{B}^2$. Evaluate these for the spinor representation provided by the $\Sigma^{\mu\nu}$'s.

In summary:

$$\Psi_{\Lambda,a}(x) := D(\Lambda)\Psi(\Lambda^{-1}x - \Lambda^{-1}a) \qquad \text{(Reducible) Poincare action} \tag{3.19}$$

Require the Poincare Casimir invariants to take constant values:

$$(\Box - m^2)\Phi = 0: \qquad \text{Klein-Gordon equation (scalar)} \tag{3.20}$$

$$(\Box - m^2)v^\mu = 0,\ \partial \cdot v = 0: \qquad \text{Proca equation (massive vector)} \tag{3.21}$$

$$\Box A_\mu = 0,\ \partial \cdot A = 0 \ \leftrightarrow\ \partial_\mu F^{\mu\nu} = 0,\ F_{\mu\nu} := \partial_\mu A_\nu - \partial_\nu A_\mu: \qquad \text{Maxwell equation} \tag{3.22}$$

$$(-i\not{\partial} + m)\Psi = 0: \qquad \text{Dirac equation},\ m = 0 \text{ is the Weyl equation} \tag{3.23}$$

4. UNITARY REPRESENTATIONS ON THE SPACE OF SOLUTIONS: ANTI-PARTICLES

At this stage, we have two sets of irreducible representations of the Poincare group: the "particle" representations $\{|k, \sigma\rangle\}$, and the "field" representations $\Psi^A(x)$ satisfying the appropriate field equations (or irreducibility conditions). The former are unitary while the latter have no such notion defined for them. We fill this gap now.

The space of vector valued functions $\Psi^A$ can be made a complex vector space easily enough, but we need to define an inner product to define the notion of unitarity.
Since the irreducible representations are solutions of the field equations and the equations are linear, we consider the complex vector space of the solutions of the field equations and look for an inner product.

Consider the massive scalar field first: $(\partial^\mu \partial_\mu - m^2)\Phi(x) = 0$, $\Phi_{\Lambda,a}(x) = \Phi(\Lambda^{-1}(x - a))$. Let $u, v$ be two solutions of the Klein-Gordon equation. Then,

$$u^*(\Box - m^2)v - v(\Box - m^2)u^* = \partial_\mu(u^* \partial^\mu v - v\, \partial^\mu u^*) = 0.$$

Hence, $J^\mu := \lambda(u^* \partial^\mu v - v\, \partial^\mu u^*)$ is a conserved current. This has an immediate implication. Consider a 4 dimensional region bounded by two hypersurfaces of constant value of the time coordinate. More generally, these are two Cauchy surfaces. Restricting to solutions which vanish sufficiently rapidly as one approaches asymptotic infinity along the space-like directions (e.g. $|\vec{x}| \to \infty$),

$$\int_{region} d^4x\, \partial_\mu J^\mu = 0 = \int_{\Sigma_2 \cup \Sigma_1} d^3x\, J^0 \ \Rightarrow\ \int_{\Sigma_2} d^3x\, J^0 = \int_{\Sigma_1} d^3x\, J^0.$$

Here we have used that both the Cauchy surfaces have their normals directed 'outward' (future pointing for the later one and past pointing for the earlier one). This suggests that we define a candidate inner product between two solutions as,

$$(v, u) := \lambda \int_\Sigma d^3x\, J^0(v, u), \quad J^0(v, u) := v^* \partial^0 u - u\, \partial^0 v^*, \quad \Sigma \text{ a Cauchy surface}. \tag{4.1}$$

The conserved current implies that the inner product is independent of the Cauchy surface.

Note: On a Cauchy surface, the solution and its time derivative form an initial data set and the inner product is really defined on these data. However the $\Sigma$-independence of the inner product allows us to think of this as an inner product on the space of solutions.

Is $(v, u)$ really an inner product? It is (i) linear in $u$ and anti-linear in $v$; (ii) $(v, u)^* = (u, v)$ provided $\lambda^* = -\lambda$; but is (iii) $(u, u) \geq 0$ with equality only for $u = 0$? To check the third property, let us write a solution of the field equation in the form $u(t, \vec{x}) = e^{\pm i\omega t} u_\omega(\vec{x}),\ \omega > 0$. Let us conventionally call $e^{-i\omega t}$ a positive frequency solution. Then, $(u, u) = \lambda(-2i\omega) \int_\Sigma d^3x\, |u_\omega(\vec{x})|^2$.
The inner product then satisfies the crucial third property provided we choose $\lambda = +i|\lambda| := i$. The absolute value of $\lambda$ has been taken to be one as it only affects the normalization of the solutions. Thus, with the convention adopted, $(v, u)$ with $\lambda = i$ is indeed an inner product on the subspace of positive frequency solutions. Equally well, the choice $\lambda = -i$ defines an inner product on the subspace of negative frequency solutions.

Consider a family of solutions, the plane wave solutions, $u_{\vec{k}}(x) := A_{\vec{k}}\, e^{ik \cdot x}$, $k \cdot x := -k^0 t + \vec{k} \cdot \vec{x}$, $k^0 := \sqrt{\vec{k}^2 + m^2} =: \omega_{\vec{k}}$. Substitution gives,

$$(u_{\vec{k}'}, u_{\vec{k}}) = i A^*_{\vec{k}'} A_{\vec{k}} \int_{\Sigma_t} d^3x\, (-ik^0 - ik'^0)\, e^{i(k - k') \cdot x} \tag{4.2}$$

$$= +A^*_{\vec{k}'} A_{\vec{k}}\, (k^0 + k'^0)\, e^{-i(k^0 - k'^0)t}\, (2\pi)^3 \delta^3(\vec{k} - \vec{k}') \tag{4.3}$$

$$= \Big[(2\pi)^3 |A_{\vec{k}}|^2\Big] \Big[(2k^0)\, \delta^3(\vec{k} - \vec{k}')\Big]. \tag{4.4}$$

Since $k^0$ depends only on the magnitude of $\vec{k}$, the delta function forces the frequencies to be equal. We recognize the second square bracket as the Lorentz invariant delta function and choose the normalization constant to be $A_{\vec{k}} := (2\pi)^{-3/2}$ so that,

$$u_{\vec{k}}(x) := \frac{1}{(2\pi)^{3/2}}\, e^{ik \cdot x}, \quad k \cdot x := -\sqrt{\vec{k}^2 + m^2}\ t + \vec{k} \cdot \vec{x}, \quad \vec{k} \in \mathbb{R}^3, \tag{4.5}$$

$$(u_{\vec{k}'}, u_{\vec{k}}) = \delta^3_{inv}(\vec{k} - \vec{k}') = (2k^0)\, \delta^3(\vec{k} - \vec{k}'). \tag{4.6}$$

The above family of solutions formally forms an orthonormal set in the space of positive frequency solutions. Technically, these do not belong to the space of solutions, which have to die off suitably at spatial infinity. This is handled as usual by either using 'box normalization' or forming wave-packets. Manipulations done using the above do not lead to any inconsistency.

Now we are ready to check if the Poincare action on the above inner product space is unitary. We will check this by showing that under the Poincare action, the orthonormality is preserved.
Consider,

$$[u_{\vec k}(x)]_{\Lambda,a} = u_{\vec k}(\Lambda^{-1}(x-a)) = \frac{1}{(2\pi)^{3/2}}\,e^{ik\cdot\Lambda^{-1}(x-a)} \tag{4.7}$$
$$= \frac{1}{(2\pi)^{3/2}}\,e^{i(\Lambda k)\cdot(x-a)} \qquad \because\ k_\mu(\Lambda^{-1})^\mu{}_\nu x^\nu = (\Lambda_\nu{}^\mu k_\mu)\,x^\nu = (\Lambda k)\cdot x \tag{4.8}$$
$$\therefore\ [u_{\vec k}(x)]_{\Lambda,a} = e^{-i(\Lambda k)\cdot a}\,u_{\vec{\Lambda k}}(x)\ , \text{ and}$$
$$\left([u_{\vec k'}]_{\Lambda,a},[u_{\vec k}]_{\Lambda,a}\right) = e^{i(\Lambda(k'-k))\cdot a}\left(u_{\vec{\Lambda k'}},u_{\vec{\Lambda k}}\right) = e^{i(\Lambda(k'-k))\cdot a}\,\delta^3_{\text{inv}}\big(\vec{\Lambda k} - \vec{\Lambda k'}\big) = \left(u_{\vec k'},u_{\vec k}\right) . \tag{4.9}$$

The last step uses the Lorentz invariance of $\delta^3_{\text{inv}}$ and the fact that the phase equals one on the support of the delta function. Hence, unitarity!

Note: The orthochronous Lorentz group preserves the sign of the frequency, $k^0 > 0 \Rightarrow (\Lambda k)^0 > 0$, and hence maps positive (negative) frequency solutions into positive (negative) frequency solutions. Thus we see that the space of solutions of the irreducibility condition (field equations) itself decomposes into two Lorentz invariant subspaces of positive/negative frequency solutions. There is no contradiction with the Casimir being constant - we just happen to have two unitary representations which have the same values of the Poincare (orthochronous, proper) Casimir invariants. These in fact represent 'particle' and 'antiparticle' representations respectively. The plane wave orthonormal basis $\{u_{\vec k}(x)\}$ is in one-to-one and onto correspondence with the 'particle basis' $\{|k\rangle\}$.

The same features are exhibited by the other solution spaces. It remains to identify a conserved current on the space of solutions, define an inner product, obtain an orthonormal set and show unitarity. Since all field equations imply that the fields always satisfy the Klein-Gordon equation, we will always have the positive/negative frequency subspaces and the particle/anti-particle identification.

Consider the Proca equation with the divergence condition: $(\Box - m^2)v^\mu = 0 = \partial\cdot v$. Let $V$ denote the space of solutions of these equations. As before, let $v^\mu$ and $u^\mu$ be two solutions. Then it follows that $J^\mu := \lambda\,(v^{*\nu}\partial^\mu u_\nu - u^\nu\partial^\mu v^*_\nu)$ is conserved exactly as before. The divergence condition just selects a subspace of the space of solutions of the Klein-Gordon equation.
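The inner product is defined as (with the same convention of positive frequency solutions),

$$(v,u) := i\int_{\Sigma_t} d^3x\,\left[v^{*\nu}\partial^0 u_\nu - u^\nu\partial^0 v^*_\nu\right] .$$

Consider the family of solutions,

$$u^\mu_{\vec k}(x) := \frac{1}{(2\pi)^{3/2}}\,\varepsilon^\mu(k)\,e^{ik\cdot x}\ ,\qquad k_\mu\varepsilon^\mu(k) = 0\ ,\qquad k\cdot x := -k^0 t + \vec k\cdot\vec x \tag{4.10}$$
$$\left(u_{\vec k'},u_{\vec k}\right) = \frac{1}{(2\pi)^3}\,(k^0 + k'^0)\left[\varepsilon^{*\mu}(k')\,\varepsilon_\mu(k)\right]e^{-i(k^0-k'^0)t}\,(2\pi)^3\delta^3(\vec k - \vec k')$$
$$= \left[\varepsilon^{*\mu}(k')\,\varepsilon_\mu(k)\right]\delta^3_{\text{inv}}(\vec k' - \vec k)\ ,\qquad \text{here } k^0 := \sqrt{\vec k^2 + m^2} . \tag{4.11}$$

The transversality condition $k\cdot\varepsilon(k) = 0$ on the polarizations $\varepsilon(k)$ selects 3 independent vectors for $m \neq 0$ and 2 independent vectors for $m = 0$, thanks to the equivalence of $A_\mu$ and $A_\mu + \partial_\mu\Lambda$. To distinguish the different polarization vectors, we introduce an additional label $a$, taking 3 and 2 values respectively for the massive and massless cases. We choose them to satisfy the orthonormality relations $\varepsilon^{*\mu}(k,a)\,\varepsilon_\mu(k,b) = \delta_{ab}$ and of course $\varepsilon(k,a)\cdot k = 0\ \forall\ a = 1,2,3$. For the massless case, $a$ is usually denoted by $\lambda$ and takes only two values. The plane wave family of solutions so defined constitutes an orthonormal set. Under the Poincare action (here the subscript $b$ denotes the translation, not a polarization label),

$$\left[u^\mu_{\vec k,a}(x)\right]_{\Lambda,b} = \Lambda^\mu{}_\nu\,u^\nu_{\vec k,a}(\Lambda^{-1}(x-b)) = \frac{1}{(2\pi)^{3/2}}\,\Lambda^\mu{}_\nu\,\varepsilon^\nu(k,a)\,e^{ik\cdot\Lambda^{-1}(x-b)} = \frac{1}{(2\pi)^{3/2}}\,e^{-i(\Lambda k)\cdot b}\left(\Lambda^\mu{}_\nu\,\varepsilon^\nu(k,a)\right)e^{i(\Lambda k)\cdot x} .$$

But $(\Lambda k)_\mu\left(\Lambda^\mu{}_\nu\,\varepsilon^\nu(k,a)\right) = k_\mu\varepsilon^\mu(k,a) = 0$, so that $\Lambda\varepsilon(k,a)$ is a polarization vector at $\Lambda k$:

$$\therefore\ \text{r.h.s.} = \frac{1}{(2\pi)^{3/2}}\,e^{-i(\Lambda k)\cdot b}\,\varepsilon^\mu(\Lambda k,a)\,e^{i(\Lambda k)\cdot x}\ ,\qquad \left[u^\mu_{\vec k,a}(x)\right]_{\Lambda,b} = e^{-i(\Lambda k)\cdot b}\,u^\mu_{\vec{\Lambda k},a}(x) . \tag{4.12}$$

Noting further that $(\Lambda\varepsilon(k',a))^*\cdot(\Lambda\varepsilon(k,b)) = \varepsilon^*(\Lambda k',a)\cdot\varepsilon(\Lambda k,b)$, unitarity of the Poincare action now follows in exactly the same manner as for the scalar. As in the case of the scalar, here too we have two unitary, irreducible representations of the (orthochronous, proper) Poincare group corresponding to the 'particle' and 'anti-particle' tags. For the zero mass case, the only difference is that the polarizations are restricted to the two spatially transverse directions.

The case of the spinors is more interesting.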
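As a concrete illustration of the massive polarization vectors used in the vector case above, the sketch below builds three polarizations by boosting the rest-frame spatial unit vectors and checks orthonormality and transversality in the metric $\eta = \text{diag}(-1,1,1,1)$. The explicit form of the standard boost and the helper names are assumptions made for this illustration; the notes only require the relations $\varepsilon^{*\mu}(k,a)\varepsilon_\mu(k,b) = \delta_{ab}$ and $k\cdot\varepsilon(k,a) = 0$.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # mostly-plus Minkowski metric

def standard_boost(k_vec, m):
    """Boost matrix L^mu_nu taking the rest momentum (m, 0, 0, 0) to (k^0, k_vec)."""
    k_vec = np.asarray(k_vec, dtype=float)
    k0 = np.sqrt(k_vec @ k_vec + m**2)
    L = np.eye(4)
    L[0, 0] = k0 / m
    L[0, 1:] = L[1:, 0] = k_vec / m
    L[1:, 1:] += np.outer(k_vec, k_vec) / (m * (k0 + m))
    return L

def polarizations(k_vec, m):
    """Three polarization vectors eps^mu(k, a): boosted rest-frame unit vectors."""
    L = standard_boost(k_vec, m)
    return [L[:, a] for a in (1, 2, 3)]

mass = 0.8                                   # illustrative values
k_vec = np.array([0.3, -0.5, 1.2])
k = np.concatenate(([np.sqrt(k_vec @ k_vec + mass**2)], k_vec))  # k^mu
eps = polarizations(k_vec, mass)

# Gram matrix eps(a) . eps(b) and the contractions k . eps(a)
gram = np.array([[e1 @ ETA @ e2 for e2 in eps] for e1 in eps])
transverse = np.array([k @ ETA @ e for e in eps])
print(np.round(gram, 12))
print(np.round(transverse, 12))
```

These polarizations are real, so no complex conjugation is needed in the Gram matrix; complex (helicity) combinations could be formed from them without changing the two checks.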
We follow the same strategy of looking for a conserved current and a corresponding inner product, but we will do this for the Dirac equation, which is a first order differential equation. We have:

$$\Psi^A_{\Lambda,a}(x) = D^A{}_B(\Lambda)\,\Psi^B(\Lambda^{-1}(x-a))\ ,\qquad (-i\gamma^\mu\partial_\mu \pm m)\Psi = 0 .$$

Note that there are two equations.

To look for a conserved current, we need to consider the (matrix) adjoint of the Dirac equation. The adjoint will involve $\gamma^\dagger$, which are inconvenient for manipulations. However, it follows readily that $(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0$. This suggests that we use the Dirac conjugate defined as $\bar\Psi := \Psi^\dagger\gamma^0$. It follows that

$$(-i\gamma^\mu\partial_\mu \pm m)\Psi = 0 \quad\leftrightarrow\quad i\partial_\mu\bar\Psi\gamma^\mu \pm m\bar\Psi = 0 .$$

Using these equations for two solutions $\Psi_1, \Psi_2$ and their Dirac conjugates, it follows that $J^\mu(\Psi_2,\Psi_1) := \lambda\,(\bar\Psi_2\gamma^\mu\Psi_1)$ is the conserved current for both the $\pm m$ equations. With the choice $\lambda = +1$, the inner product on the solutions of the Dirac equations is defined as:

$$(\Psi_2,\Psi_1) := \int_{\Sigma_t} d^3x\,\bar\Psi_2\gamma^0\Psi_1 \quad\leftrightarrow\quad (\Psi,\Psi) = \int_{\Sigma_t} d^3x\,\Psi^\dagger\Psi \ \ge\ 0 .$$

Caution: With $\gamma_0$ in place of $\gamma^0$ in the above equation, $(\Psi,\Psi) \le 0$.

To construct a family of orthonormal solutions, consider the 'plane wave ansatz', $\Phi^A(\vec k,\sigma,x) := \frac{1}{(2\pi)^{3/2}}\,\varphi^A(k,\sigma)\,e^{ik\cdot x}$. The Dirac equations then require $(\slashed{k} \pm m)\varphi(k,\sigma) = 0$ or $\slashed{k}\varphi = \mp m\varphi$. Multiplying by $\slashed{k}$ and using $\slashed{k}\slashed{k} = -k^2$ leads to $k^2 + m^2 = 0 \leftrightarrow k^0 := \pm\sqrt{\vec k^2 + m^2}$. We fix $k^0$ to be positive and call a solution with $e^{ik\cdot x}$ as having positive frequency and a solution with $e^{-ik\cdot x}$ as having negative frequency. The two Dirac equations can now be taken as a single Dirac equation with $+m$, admitting positive and negative frequency solutions. Explicitly, we refine the ansatz as

$$\Phi^A_+(\vec k,\sigma,x) := \frac{1}{(2\pi)^{3/2}}\,u^A(k,\sigma)\,e^{ik\cdot x} \quad\text{and}\quad \Phi^A_-(\vec k,\sigma,x) := \frac{1}{(2\pi)^{3/2}}\,v^A(k,\sigma)\,e^{-ik\cdot x} ,$$

both satisfying $(-i\slashed{\partial} + m)\Phi_\pm = 0$. This implies that $(\slashed{k} + m)\,u(k,\sigma) = 0 = (\slashed{k} - m)\,v(k,\sigma)$. The $u, v$ spinors get their $k$-dependence through these defining equations. These spinors are eigen-spinors of $\slashed{k}$ with eigenvalues $-m$ and $+m$ respectively.
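Since the trace of the $\gamma$'s vanishes, $\slashed{k}$ is traceless too and hence each eigenvalue is doubly degenerate, i.e. the '$\sigma$' label in the eigen-spinors takes two values. These will be linked to the helicities later on. Substitution of these solutions in the inner product leads to,

$$(\Phi(k',\sigma'),\Phi(k,\sigma)) = \frac{1}{(2\pi)^3}\int_{\Sigma_t} d^3x\;u^\dagger(k',\sigma')\,u(k,\sigma)\,e^{-i(k'-k)\cdot x} = u^\dagger(k',\sigma')\,u(k,\sigma)\,e^{-i(k^0-k'^0)t}\,\delta^3(\vec k - \vec k') \tag{4.13}$$
$$\therefore\ (\Phi(k',\sigma'),\Phi(k,\sigma)) = \frac{u^\dagger(k',\sigma')\,u(k,\sigma)}{2k^0}\,\delta^3_{\text{inv}}(\vec k - \vec k') := \delta_{\sigma',\sigma}\,\delta^3_{\text{inv}}(\vec k - \vec k') . \tag{4.14}$$

In the last step we have chosen a normalization of the $u$-spinors. Identical orthonormality relations follow for the $v$-spinors.

Under Lorentz transformations, the solutions transform as,

$$[\Phi(k,\sigma)]_\Lambda(x) = \left[D(\Lambda)\,u(k,\sigma)\right]\frac{e^{i(\Lambda k)\cdot x}}{(2\pi)^{3/2}} \ \overset{?}{=}\ \Phi(\Lambda k,\sigma)(x) .$$

The last equality will hold if $D(\Lambda)\,u(k,\sigma) = u(\Lambda k,\sigma)$, i.e. if $\left[(\Lambda k)_\mu\gamma^\mu + m\right]\left[D(\Lambda)\,u(k,\sigma)\right] = 0$. We already have $[\Sigma^{\mu\nu},\gamma^\lambda] = i\,(\eta^{\mu\lambda}\gamma^\nu - \eta^{\nu\lambda}\gamma^\mu)$. This implies (as may be checked by taking the infinitesimal form of $D(\Lambda)$),

$$D^{-1}(\Lambda)\,\gamma^\mu\,D(\Lambda) = \Lambda^\mu{}_\nu\,\gamma^\nu \quad\Rightarrow\quad \gamma^\mu D(\Lambda) = \Lambda^\mu{}_\nu\,D(\Lambda)\,\gamma^\nu .$$

It follows,

$$\left[(\Lambda k)_\mu\gamma^\mu + m\right]\left(D(\Lambda)\,u(k,\sigma)\right) = \left[(\Lambda k)_\mu\gamma^\mu D(\Lambda) + D(\Lambda)\,m\right]u(k,\sigma) \tag{4.15}$$
$$= \left[(\Lambda k)_\mu\Lambda^\mu{}_\nu\,D(\Lambda)\,\gamma^\nu + D(\Lambda)\,m\right]u(k,\sigma) = D(\Lambda)\left[\Lambda_\mu{}^\alpha k_\alpha\,\Lambda^\mu{}_\nu\,\gamma^\nu + m\right]u(k,\sigma) ,\qquad \text{but } \Lambda_\mu{}^\alpha\Lambda^\mu{}_\nu = \delta^\alpha{}_\nu$$
$$\therefore\ \left[(\Lambda k)_\mu\gamma^\mu + m\right]\left(D(\Lambda)\,u(k,\sigma)\right) = D(\Lambda)\left[k_\nu\gamma^\nu + m\right]u(k,\sigma) = 0 . \qquad \text{(Covariance Result)} \tag{4.16}$$

Thus, indeed we have $D(\Lambda)\,u(k,\sigma) = u(\Lambda k,\sigma)$ and hence, $[\Phi(k,\sigma)]_{\Lambda,a}(x) = e^{-i(\Lambda k)\cdot a}\,\Phi(\Lambda k,\sigma)(x)$. Under Poincare transformations then the orthonormality relation transforms as,

$$(\Phi(k',\sigma')_{\Lambda,a},\Phi(k,\sigma)_{\Lambda,a}) = e^{i\Lambda(k'-k)\cdot a}\,(\Phi(\Lambda k',\sigma'),\Phi(\Lambda k,\sigma)) = \frac{u^\dagger(\Lambda k',\sigma')\,u(\Lambda k,\sigma)}{2(\Lambda k)^0}\,\delta^3_{\text{inv}}(\vec k' - \vec k) = \delta_{\sigma',\sigma}\,\delta^3_{\text{inv}}(\vec k' - \vec k) . \tag{4.17}$$

In the last equality we have used the normalization of the $u$ spinors. This proves the unitarity of the Poincare representation on the positive frequency solutions. As before, Lorentz transformations do not mix the subspaces of the positive/negative frequency solutions.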
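The covariance property $D^{-1}(\Lambda)\gamma^\mu D(\Lambda) = \Lambda^\mu{}_\nu\gamma^\nu$ used above can be spot-checked numerically. The sketch below is written under stated assumptions: it uses the Dirac-Pauli matrices in the mostly-plus convention $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$, takes $D = \exp(\tfrac{i}{2}\omega_{\mu\nu}\Sigma^{\mu\nu})$ with $\Sigma^{\mu\nu} = \tfrac{i}{4}[\gamma^\mu,\gamma^\nu]$, and reads $\Lambda$ off from $D$ via traces instead of assuming a particular sign convention relating $\omega$ and $\Lambda$. The random parameters and helper names are illustrative only.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # {gamma^mu, gamma^nu} = -2 eta^{mu nu}

# Dirac-Pauli gamma matrices
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def Sigma(mu, nu):
    """Lorentz generator Sigma^{mu nu} = (i/4) [gamma^mu, gamma^nu]."""
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

def expm_series(X, terms=40):
    """Matrix exponential via a truncated Taylor series (adequate for small 4x4 X)."""
    out = np.eye(4, dtype=complex)
    term = np.eye(4, dtype=complex)
    for n in range(1, terms):
        term = term @ X / n
        out = out + term
    return out

def slash(k_low):
    """k-slash = k_mu gamma^mu for lower-index components k_mu."""
    return sum(k_low[mu] * gamma[mu] for mu in range(4))

# Arbitrary antisymmetric parameters omega_{mu nu} (rotations and boosts mixed).
rng = np.random.default_rng(0)
A = rng.normal(scale=0.3, size=(4, 4))
omega = A - A.T

X = 0.5j * sum(omega[mu, nu] * Sigma(mu, nu) for mu in range(4) for nu in range(4))
D, D_inv = expm_series(X), expm_series(-X)

# Read off Lambda^mu_nu from D^{-1} gamma^mu D, using tr(gamma^mu gamma^nu) = -4 eta^{mu nu}.
Lam = np.zeros((4, 4))
for mu in range(4):
    M = D_inv @ gamma[mu] @ D
    for nu in range(4):
        Lam[mu, nu] = (-0.25 * np.trace(M @ gamma[nu]) * ETA[nu, nu]).real

closes_on_gammas = all(
    np.allclose(D_inv @ gamma[mu] @ D, sum(Lam[mu, nu] * gamma[nu] for nu in range(4)))
    for mu in range(4))
is_lorentz = np.allclose(Lam.T @ ETA @ Lam, ETA)

# Covariance of the spinor equation: (Lambda k)-slash D = D k-slash.
mass = 1.0
k_up = np.array([np.sqrt(mass**2 + 0.5**2 + 0.2**2 + 0.9**2), 0.5, -0.2, 0.9])
k_low = ETA @ k_up
k_low_transformed = ETA @ Lam @ k_up      # (Lambda k)_mu
intertwines = np.allclose(slash(k_low_transformed) @ D, D @ slash(k_low))
print(closes_on_gammas, is_lorentz, intertwines)
```

The last check is the matrix form of eq. (4.16): if $(\slashed{k} + m)u = 0$, then $D(\Lambda)u$ is annihilated by $(\Lambda k)_\mu\gamma^\mu + m$.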
The normalization definition does not look Lorentz invariant, but it is. Both $u^\dagger u = \bar u\gamma^0 u$ and $k^0$ transform the same way. A more convenient form will be displayed later on.

Note: The covariance result allows us to define the $u, v$ spinors from their definition in the rest frame (for the massive case), $\hat k = (m,\vec 0)$. In this frame, the spinors satisfy the equations: $-m\gamma^0 u(\hat k,\sigma) = -m\,u(\hat k,\sigma)$ and $-m\gamma^0 v(\hat k,\sigma) = +m\,v(\hat k,\sigma)$. Thus these spinors at $\hat k$ are eigen-spinors of $\gamma^0$ with eigenvalues $\pm 1$ respectively. $\gamma^0$ being Hermitian, these spinors are orthogonal: $u^\dagger(\hat k,\sigma)\,v(\hat k,\sigma') = 0$. As noted before, these eigenvalues are doubly degenerate and $\sigma$ labels these. For the massless case, $\hat k = (k,0,0,k)$ and $u, v$ are eigen-spinors of the $J_3$ generator of the Little group.

Note: Incidentally, the covariance result and the identification of $u(\hat k,\sigma), v(\hat k,\sigma)$ as eigen-spinors of $\gamma^0$ also lead to a completeness relation as follows. Given that $\gamma^0 u(\hat k,\sigma) = u(\hat k,\sigma)$, $\gamma^0 v(\hat k,\sigma) = -v(\hat k,\sigma)$ and each eigen-space being two dimensional, the completeness relations ('spectral representation' - $A = \sum_n\lambda_n|n\rangle\langle n|$) take the form

$$\sum_\sigma u(\hat k,\sigma)\,u^\dagger(\hat k,\sigma) = \frac{1+\gamma^0}{2}\ ,\qquad \sum_\sigma v(\hat k,\sigma)\,v^\dagger(\hat k,\sigma) = \frac{1-\gamma^0}{2}$$
$$\sum_\sigma u(\hat k,\sigma)\,\bar u(\hat k,\sigma) = \frac{\gamma^0+1}{2} = \frac{-\slashed{\hat k}+m}{2m}\ ,\qquad \sum_\sigma v(\hat k,\sigma)\,\bar v(\hat k,\sigma) = \frac{\gamma^0-1}{2} = -\frac{\slashed{\hat k}+m}{2m}$$
$$\sum_\sigma u(\vec k,\sigma)\,\bar u(\vec k,\sigma) = \frac{-\slashed{k}+m}{2m}\ ,\qquad \sum_\sigma v(\vec k,\sigma)\,\bar v(\vec k,\sigma) = -\frac{\slashed{k}+m}{2m}$$

In the second equation, we used $\gamma^0 = -\slashed{\hat k}/m$ and in the last equation we multiplied by $D(L)$ on the left and $D^{-1}(L)$ on the right and used the covariance result. Here $L$ is a boost which takes $\hat k \to \vec k$. Notice that multiplying by $u(\vec k,\sigma')$ and $v(\vec k,\sigma')$ respectively, and using the equations satisfied by the spinors, gives $\bar u(\vec k,\sigma)\,u(\vec k,\sigma') = \delta_{\sigma,\sigma'} = -\bar v(\vec k,\sigma)\,v(\vec k,\sigma')$. These are consistent with the previously chosen normalizations: $u^\dagger(\vec k,\sigma)\,u(\vec k,\sigma') = 2k^0\,\delta_{\sigma,\sigma'} = v^\dagger(\vec k,\sigma)\,v(\vec k,\sigma')$.

Thus, for all the field equations we have found unitary, irreducible representations of the Poincare group.
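The algebraic content of the completeness relations above can be checked without constructing explicit spinors. The sketch below, which assumes the Dirac-Pauli matrices with $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$ and an illustrative on-shell momentum, verifies that $\slashed{k}^2 = m^2$, that $\slashed{k}$ is traceless, and that $P_u := (-\slashed{k}+m)/2m$ and $P_v := -(\slashed{k}+m)/2m$ behave as the spin sums $\sum_\sigma u\bar u$ and $\sum_\sigma v\bar v$ should. The names `P_u`, `P_v` are labels introduced here.

```python
import numpy as np

# Dirac-Pauli gamma matrices, {gamma^mu, gamma^nu} = -2 eta^{mu nu}, eta = diag(-1, 1, 1, 1)
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def slash(k_up):
    """k-slash = k_mu gamma^mu with k_mu = eta_{mu nu} k^nu."""
    k_low = np.array([-k_up[0], k_up[1], k_up[2], k_up[3]])
    return sum(k_low[mu] * gamma[mu] for mu in range(4))

mass = 1.0                                   # illustrative values
k_vec = np.array([0.6, -0.2, 0.9])
k_up = np.concatenate(([np.sqrt(k_vec @ k_vec + mass**2)], k_vec))

ks = slash(k_up)
one = np.eye(4)
P_u = (-ks + mass * one) / (2 * mass)        # candidate for sum_sigma u ubar
P_v = -(ks + mass * one) / (2 * mass)        # candidate for sum_sigma v vbar

checks = {
    "kslash^2 = m^2": np.allclose(ks @ ks, mass**2 * one),
    "tr kslash = 0": np.isclose(np.trace(ks), 0),
    "P_u is a rank-2 projector": np.allclose(P_u @ P_u, P_u) and np.isclose(np.trace(P_u), 2),
    "(kslash + m) P_u = 0": np.allclose((ks + mass * one) @ P_u, 0),
    "(kslash - m) P_v = 0": np.allclose((ks - mass * one) @ P_v, 0),
    "P_v^2 = -P_v": np.allclose(P_v @ P_v, -P_v),
    "P_u - P_v = 1": np.allclose(P_u - P_v, one),
}
for name, ok in checks.items():
    print(name, ok)
```

The relation $P_v^2 = -P_v$ reflects the negative norm $\bar v v = -\delta_{\sigma,\sigma'}$ quoted above, and the trace of $P_u$ counts the two values of $\sigma$.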
We have also discovered that the field representations automatically include 'anti-particles'. This is really a consequence of the requirement of manifest Poincare covariance, which necessitates the field representations.

Note: From the expressions above, it should be clear that the ket vectors $|k,\sigma\rangle$ are in one-to-one correspondence with the "plane wave solutions" with positive and negative frequencies, displayed above.

We need to study the behavior of these representations under space inversion, time reversal and the new possibility of 'charge conjugation' (particle-antiparticle exchange).

5. PARITY, TIME REVERSAL AND CHARGE CONJUGATION

For the field representations above, we focussed on the subgroup of the Poincare group connected to the identity. The remaining elements of the Poincare group - improper and non-orthochronous - are generated by the two transformations of space inversion and time reversal. Their actions on the generators are given in eq. (2.36). These relations continue to hold in any representation, in particular also on the field representations. Looking at the general form of the Poincare action on the $\Psi^A(x)$'s, $\Psi^A_{\Lambda,a}(x) = D^A{}_B(\Lambda)\,\Psi^B(\Lambda^{-1}(x-a))$, we see that the action on the space-time point is common to all $D(\Lambda)$ representations and we may focus on the $D(\Lambda)$ part of it. Recalling that the time reversal operation is anti-linear and anti-unitary, we have to be careful about taking complex conjugates. Thus we define the actions as,

$$\Psi^A_P(x) := D^A{}_B(P)\,\Psi^B(t,-\vec x)\ ,\qquad D(P) =: \Pi \tag{5.1}$$
$$\Psi^A_T(x) := D^A{}_B(T)\,(\Psi^B(-t,\vec x))^*\ ,\qquad D(T) =: \tau \tag{5.2}$$

It is customary to denote the anti-unitary, anti-linear operator $T$ as $\tau K$, where $\tau$ is a unitary operator and $K$ takes the complex conjugate of numbers on its right. $T^2 = 1$ implies $\tau\tau^* = 1$.

Consider the translation generators. These act as differential operators on the $\Psi$'s: $P_\mu\Psi^A(x) = -i\partial_\mu\Psi^A$. Consider the $P$ and $T$ actions on $P_\mu$ as given in eq. (2.36).
For instance,

$$(T^{-1}P_0T)\,\Psi^A(t,\vec x) = T^{-1}\{-i\partial_0(\tau^A{}_B\Psi^B(-t,\vec x)^*)\} = T^{-1}\{+i\partial_{-t}(\tau^A{}_B\Psi^B(-t,\vec x)^*)\} = -i\partial_t\,(\tau\tau^*)^A{}_B\,\Psi^B(t,\vec x) = P_0\Psi^A(t,\vec x) . \tag{5.3}$$

The time has been reversed twice and the complex conjugation has been effected twice. For the spatial components, there is no '$t$' reversal and the $T$ introduces a sign. Space inversion is likewise straightforward.

We now focus on the Lorentz generators only. The scalar case has no non-trivial matrices and the vector case is simpler and is left as an exercise. The spinor case is non-trivial. We have the Lorentz generators:

$$\Sigma^{\mu\nu} = \frac{i}{4}[\gamma^\mu,\gamma^\nu] \quad\leftrightarrow\quad J_i = \frac{\epsilon_{ijk}}{2}\,\Sigma^{jk}\ ,\quad K^i = \Sigma^{i0} .$$

The $\Pi$ and $\tau$ matrices are also $4\times 4$ matrices. We note a result.

Result: Any $4\times 4$ matrix can be written as a linear combination of the 16 $\Gamma$ matrices, $\{\Gamma\} = \{1,\ \gamma^\mu,\ \gamma^{[\mu}\gamma^{\nu]},\ \gamma^{[\mu}\gamma^\nu\gamma^{\lambda]},\ \gamma_5\}$, the square brackets denoting anti-symmetrization ($16 = 1 + 4 + 6 + 4 + 1$). The result follows by showing that the 16 matrices are linearly independent.

We now deduce the $\Pi$ and $\tau$ matrices from the commutation relations (see eq. (2.36)):

$$\Sigma^{jk}\Pi = \Pi\Sigma^{jk}\ ,\quad \Sigma^{0i}\Pi = -\Pi\Sigma^{0i}\ ,\quad \Sigma^{jk}\tau = -\tau(\Sigma^{jk})^*\ ,\quad \Sigma^{0i}\tau = \tau(\Sigma^{0i})^* .$$

The determination of $\Pi$ is straightforward. Commutation with $\Sigma^{ij}$ implies that $\Pi$ is a linear combination of $1, \gamma_5, \gamma_0$. Anti-commutation with $\Sigma^{0i} = \frac{i}{2}\gamma^0\gamma^i$ implies that only $\gamma_0$ survives and $\Pi = c\,\gamma^0$. Now $\Pi^2 = 1$ implies $c = \pm 1$. Hence, $(\Psi_P)^A(x) = \pm(\gamma^0)^A{}_B\,\Psi^B(t,-\vec x)$. It follows immediately that a left (right) handed Weyl spinor becomes a right (left) handed Weyl spinor under space inversion.

The determination of $\tau$ involves complex conjugation. From the unitarity of the $\gamma$ matrices and the defining Clifford relations, the hermiticity properties are determined: $\gamma_0^\dagger = \gamma_0$, $\gamma_i^\dagger = -\gamma_i$. The transpose/complex conjugation properties however depend on the explicit choice of the $\gamma$'s. The $\tau$ matrix thus depends on the explicit representation of the $\gamma$'s.
There are three commonly employed representations:

$$\text{Dirac-Pauli}:\quad \gamma^0 := \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix},\quad \gamma^i := \begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0\end{pmatrix},\quad \gamma_5 := \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix}; \tag{5.4}$$
$$\text{Weyl}:\quad \gamma^0 := \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix},\quad \gamma^i := \begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0\end{pmatrix},\quad \gamma_5 := \begin{pmatrix} -1 & 0\\ 0 & 1\end{pmatrix}; \tag{5.5}$$
$$\text{Majorana}:\quad \gamma^0 := \begin{pmatrix} 0 & \sigma_2\\ \sigma_2 & 0\end{pmatrix},\quad \gamma^1 := \begin{pmatrix} i\sigma_3 & 0\\ 0 & i\sigma_3\end{pmatrix},\quad \gamma^2 := \begin{pmatrix} 0 & -\sigma_2\\ \sigma_2 & 0\end{pmatrix}; \tag{5.6}$$
$$\phantom{\text{Majorana}}:\quad \gamma^3 := \begin{pmatrix} -i\sigma_1 & 0\\ 0 & -i\sigma_1\end{pmatrix},\quad \gamma_5 := \begin{pmatrix} \sigma_2 & 0\\ 0 & -\sigma_2\end{pmatrix},\quad \text{all purely imaginary.} \tag{5.7}$$

In the Dirac-Pauli and Weyl representations, only $\gamma^2$ is imaginary. The $\tau$ matrix thus satisfies:

$$\Sigma^{12}\tau = -\tau\Sigma^{12}\ ,\quad \Sigma^{23}\tau = -\tau\Sigma^{23}\ ,\quad \Sigma^{31}\tau = \tau\Sigma^{31}\ ,$$
$$\Sigma^{01}\tau = -\tau\Sigma^{01}\ ,\quad \Sigma^{02}\tau = \tau\Sigma^{02}\ ,\quad \Sigma^{03}\tau = -\tau\Sigma^{03} .$$

The last relation in the first line suggests $\tau = \lambda\gamma^0\gamma^2$. It also checks with all the other relations. The $\lambda$ is determined as follows. The operator $D(T)$ is anti-unitary. Hence, $(D(T)D(T)\Psi_1, D(T)D(T)\Psi_2) = (D(T)\Psi_2, D(T)\Psi_1) = (\Psi_1,\Psi_2) \Rightarrow D^2(T)$ is unitary. Next, $D^2(T) = (\tau K)(\tau K) = \tau\tau^* = |\lambda|^2\,(\gamma^0\gamma^2\gamma^0(-\gamma^2)) = -|\lambda|^2\,1$. The unitarity then implies that $|\lambda| = 1$, i.e. $\lambda$ is a phase, and $D^2(T) = -1$ in the spinor representation.

In the Majorana representation, all $\gamma$'s are imaginary. Hence, $(\Sigma^{\mu\nu})^* = -\Sigma^{\mu\nu}$. Hence $\tau$ commutes with $\Sigma^{jk}$ and anti-commutes with $\Sigma^{0i}$. This is exactly as for the space inversion and we deduce that $D(T) = \lambda\gamma^0 K$. Once again $D^2(T)$ is unitary and equal to $-1$ and $\lambda$ is a phase.

For the vector representation, the '$A$' index will be a tensorial index and it is left as an exercise to work out the $D$ matrices for space inversion and time reversal.

The existence of the anti-particle representations suggests one more discrete transformation of order 2, called Charge Conjugation. Recall that the anti-particle representation is the subspace of negative frequency solutions. These subspaces are spanned by plane wave solutions and involve the operation of complex conjugation, which takes $e^{ik\cdot x} \to e^{-ik\cdot x}$. For the non-trivial $D(\Lambda)$ representations, complex conjugation also takes the complex conjugate of the $D(\Lambda)$ matrices.
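The three representations (5.4)-(5.7) listed above can be checked mechanically. The sketch below assumes the Clifford relation in the form $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$ with $\eta = \text{diag}(-1,1,1,1)$ (the form implied by $\slashed{k}\slashed{k} = -k^2$ used earlier) and the definition $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; the helper names are illustrative.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])

def blk(a, b, c, d):
    """4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return np.block([[a, b], [c, d]])

reps = {
    "Dirac-Pauli": [blk(I2, Z2, Z2, -I2)] + [blk(Z2, s, -s, Z2) for s in (s1, s2, s3)],
    "Weyl": [blk(Z2, I2, I2, Z2)] + [blk(Z2, s, -s, Z2) for s in (s1, s2, s3)],
    "Majorana": [blk(Z2, s2, s2, Z2), blk(1j * s3, Z2, Z2, 1j * s3),
                 blk(Z2, -s2, s2, Z2), blk(-1j * s1, Z2, Z2, -1j * s1)],
}
gamma5_listed = {
    "Dirac-Pauli": blk(Z2, I2, I2, Z2),
    "Weyl": blk(-I2, Z2, Z2, I2),
    "Majorana": blk(s2, Z2, Z2, -s2),
}

def satisfies_clifford(g):
    """{gamma^mu, gamma^nu} = -2 eta^{mu nu} for all index pairs."""
    return all(np.allclose(g[m] @ g[n] + g[n] @ g[m], -2 * ETA[m, n] * np.eye(4))
               for m in range(4) for n in range(4))

def has_expected_hermiticity(g):
    """gamma^0 Hermitian, gamma^i anti-Hermitian."""
    return (np.allclose(g[0].conj().T, g[0])
            and all(np.allclose(g[i].conj().T, -g[i]) for i in (1, 2, 3)))

def gamma5(g):
    return 1j * g[0] @ g[1] @ g[2] @ g[3]

for name, g in reps.items():
    print(name, satisfies_clifford(g), has_expected_hermiticity(g),
          np.allclose(gamma5(g), gamma5_listed[name]))
```

The same loop can be extended to confirm which matrices are real or imaginary in each representation, which is the input for the determination of $\tau$ above.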
However, the complex conjugate of a solution need not be a solution again (especially for spinors, as we will see) and hence the charge conjugation must also involve additional transformations over and above complex conjugation. For the Klein-Gordon, Proca and Maxwell equations, the complex conjugate of a solution is also a solution as the differential operators are real. For the Dirac equation we need to do further work. For instance, let $\Psi$ be a positive frequency solution of the Dirac equation, $(-i\gamma^\mu\partial_\mu + m)\Psi = 0$. Taking the complex conjugate of the equation, we get $(+i(\gamma^\mu)^*\partial_\mu + m)\Psi^* = 0$. If we could find an invertible matrix $B$ such that $(\gamma^\mu)^* = -B\gamma^\mu B^{-1}$, then $\Psi_c := B^{-1}\Psi^*$ satisfies the same Dirac equation and of course is a negative frequency solution. $\Psi_c$ is called the charge conjugate$^2$ of $\Psi$. Apparently, $B$ depends on the choice of explicit $\gamma$ matrices.

$^2$ The terminology comes when coupling to an external electromagnetic field is considered by the minimal substitution $\partial_\mu \to \partial_\mu - ieA_\mu$. Under complex conjugation, $e \to -e$. So if $\Psi$ is thought of as a charge '$e$' solution then $\Psi_c$ is a charge '$-e$' solution.

As a preparation for subsequent development, we have a subsection on properties of $\gamma$ matrices.

A. Representations of the Clifford algebra and relations among them

Quite generally, for any group, given a (matrix) representation $R(G)$, we have three other representations, namely, $R^*(G)$, $(R^T)^{-1}(G)$ and $(R^\dagger)^{-1}(G)$. From the basic relation $R(g_1)R(g_2) = R(g_1\cdot g_2)$, taking the complex conjugate, transpose inverse and adjoint inverse immediately verifies the assertion. If in addition the representation $R(G)$ is unitary (always true for finite groups and compact Lie groups), then $(R^T)^{-1}(G) = R^*(G)$ and $(R^\dagger)^{-1}(G) = R(G)$. $R(G)$ and $R^*(G)$ are then the only independent representations. These are either equivalent, $R^*(g) = SR(g)S^{-1}$, or inequivalent.
It turns out that for unitary, irreducible representations, if $R$ and $R^*$ are equivalent, then $S$ is either symmetric or anti-symmetric. The following terminology ensues. For unitary, irreducible representations:

$R(G)$ is complex if $R^*(G) \not\simeq R(G)$
$R(G)$ is pseudo-real if $R^*(G) = SR(G)S^{-1}$, $S^T = -S$
$R(G)$ is real if $R^*(G) = SR(G)S^{-1}$, $S^T = S$

This is relevant for representations of internal symmetry groups as well as the Clifford group of the 32 elements. The $D(\Lambda)$ representations are finite dimensional but not unitary since the Lorentz group does not have unitary finite dimensional representations.

For the Clifford group in 4 dimensions (and in an even number of dimensions) there is only one (up to unitary equivalence) non-trivial representation, which is 4 dimensional (or $2^{N/2}$ dimensional for $N$ an even number of dimensions). Since $\pm\gamma^*$, $\pm\gamma^T$ and $\pm\gamma^\dagger$ all satisfy the same Clifford algebra and the representation is unique, $\exists$ matrices $B, C, D$ such that,

$$(\gamma^\mu)^* = -B\gamma^\mu B^{-1}\ ,\qquad (\gamma^\mu)^T = -C\gamma^\mu C^{-1}\ ,\qquad (\gamma^\mu)^\dagger = +D\gamma^\mu D^{-1} .$$

The choice of signs above is conventional. Notice that replacing any of the $B, C, D$ matrices by multiplying by $\gamma_5$ on the right reverses the signs. It follows immediately that,

$$-(\Sigma^{\mu\nu})^* = B\Sigma^{\mu\nu}B^{-1}\ ,\qquad -(\Sigma^{\mu\nu})^T = C\Sigma^{\mu\nu}C^{-1}\ ,\qquad (\Sigma^{\mu\nu})^\dagger = D\Sigma^{\mu\nu}D^{-1} .$$

Given a representation $D(\Lambda)$ we have the $(D^\dagger)^{-1}(\Lambda)$, $(D^T)^{-1}(\Lambda)$ and $D^*(\Lambda)$ representations. The infinitesimal forms are $(1 + \frac{i}{2}\omega\cdot\Sigma)$, $(1 - \frac{i}{2}\omega\cdot\Sigma)^\dagger$, $(1 - \frac{i}{2}\omega\cdot\Sigma^T)$, $(1 - \frac{i}{2}\omega\cdot\Sigma^*)$, which imply that the corresponding generators are $\Sigma^{\mu\nu}$, $(\Sigma^{\mu\nu})^\dagger$, $-(\Sigma^{\mu\nu})^T$, $-(\Sigma^{\mu\nu})^*$. The $B, C, D$ matrices precisely relate these generators. These relations can be exponentiated to get the corresponding relations among $D(\Lambda)$, $D^\dagger(\Lambda)$, $D^T(\Lambda)$, $D^*(\Lambda)$, namely (the matrix $D$ without an argument is to be distinguished from the representation $D(\Lambda)$),

$$D^\dagger(\Lambda) = D\,D^{-1}(\Lambda)\,D^{-1}\ ,\qquad D^T(\Lambda) = C\,D^{-1}(\Lambda)\,C^{-1}\ ,\qquad D^*(\Lambda) = B\,D(\Lambda)\,B^{-1} .$$

These can be checked easily by using the series form of the exponentials derived from $D(\Lambda) = \sum_{k=0}^\infty\frac{(\frac{i}{2}\omega\cdot\Sigma)^k}{k!}$.

B. Dirac-Majorana-Charge Conjugates

This allows us to construct Lorentz invariant bi-linears from the spinors. For instance, define the Majorana conjugate $\tilde\Psi := \Psi^TC$. Then,

$$(\tilde\Psi)_\Lambda := (\Psi^T_\Lambda)\,C = \Psi^T(D^T(\Lambda))\,C = \Psi^TC\,D^{-1}(\Lambda) = \tilde\Psi\,D^{-1}(\Lambda) .$$

Thus the Majorana conjugate, like the Dirac conjugate $\bar\Psi := \Psi^\dagger D$, transforms by $D^{-1}(\Lambda)$. Recall that the charge conjugate $\Psi_c$ transforms by $D(\Lambda)$. Consequently, $\bar\Psi\Psi$, $\bar\Psi\Psi_c$, $\tilde\Psi\Psi$, $\tilde\Psi\Psi_c$ are all Lorentz scalars. Recalling that $D^{-1}(\Lambda)\gamma^\mu D(\Lambda) = \Lambda^\mu{}_\nu\gamma^\nu$, it is easy to see that,

$\bar\Psi\Psi$ is a Scalar
$\bar\Psi\gamma^\mu\Psi$ is a Vector
$\bar\Psi\gamma^\mu\gamma^\nu\Psi$ is a Tensor of rank 2
$\bar\Psi\gamma^\mu\gamma_5\Psi$ is an Axial (or pseudo) vector
$\bar\Psi\gamma_5\Psi$ is a Pseudo-scalar

Combine these with the definitions,

Dirac Spinor : $\Psi = \Psi$ , $(\Psi)_\Lambda = D(\Lambda)\Psi$;
Dirac Conjugate : $\bar\Psi = \Psi^\dagger D$ , $(\bar\Psi)_\Lambda = \bar\Psi D^{-1}(\Lambda)$;
Charge Conjugate : $\Psi_c = B^{-1}\Psi^*$ , $(\Psi_c)_\Lambda = D(\Lambda)\Psi_c$;
Majorana Conjugate : $\tilde\Psi = \Psi^TC$ , $(\tilde\Psi)_\Lambda = \tilde\Psi D^{-1}(\Lambda)$.

Thus, the same Lorentz transformation properties hold with $\Psi \to \Psi_c$ and/or $\bar\Psi \to \tilde\Psi$.

The last two entries in the list of bi-linears need a little explanation. We defined $\gamma_5 := i\gamma^0\gamma^1\gamma^2\gamma^3 = \frac{i}{4!}\,\varepsilon_{\mu\nu\alpha\beta}\,\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta$, where $\varepsilon$ is the completely anti-symmetric symbol, with $\varepsilon_{0123} := 1$. This symbol is also used to define the determinant of a matrix, e.g. $\varepsilon_{\mu'\nu'\alpha'\beta'}\,\Lambda^{\mu'}{}_\mu\Lambda^{\nu'}{}_\nu\Lambda^{\alpha'}{}_\alpha\Lambda^{\beta'}{}_\beta = \det(\Lambda)\,\varepsilon_{\mu\nu\alpha\beta}$. It follows immediately that $D^{-1}(\Lambda)\gamma_5D(\Lambda) = \det(\Lambda)\,\gamma_5$. Consider now an axial vector combination,

$$[\bar\Psi\gamma^\mu\gamma_5\Psi]_\Lambda = \bar\Psi D^{-1}(\Lambda)\gamma^\mu\gamma_5D(\Lambda)\Psi = \Lambda^\mu{}_\nu\,\bar\Psi\gamma^\nu D^{-1}(\Lambda)\gamma_5D(\Lambda)\Psi = \det(\Lambda)\,\Lambda^\mu{}_\nu\,\bar\Psi\gamma^\nu\gamma_5\Psi .$$

Note that for proper Lorentz transformations, the axial vector and the pseudo-scalar transform as a vector and a scalar respectively. Quantities that transform as tensors but with an extra factor of the determinant of the transformation are called pseudo-tensors. There are also tensor looking quantities that transform with an additional factor of $|\det|^w$ and these are called tensor densities of weight $w$.

Let us return to determine the matrices $B, C, D$. Since the $\gamma$'s are unitary, we can always choose any of the $B, C, D$ matrices to be unitary.
Quite generally, if $R' = SRS^{-1}$ for two irreducible, equivalent unitary representations, then $R'^\dagger R' = 1 \Rightarrow R^\dagger S^\dagger S\,R = S^\dagger S \Rightarrow S^\dagger S\,R = R\,S^\dagger S \Rightarrow S^\dagger S = \alpha 1$ by Schur's lemma. Positivity of $S^\dagger S$ gives $\alpha > 0$. Hence $S' := S/\sqrt\alpha$ is a unitary matrix. This proves the claim.

Next, $\gamma^\mu = (\gamma^{\mu*})^* = -B^*\gamma^{\mu*}(B^{-1})^* = +B^*B\,\gamma^\mu\,(B^*B)^{-1} \Rightarrow B^*B = \lambda 1$ with $\lambda$ being real since $B^*B$ is. The determinant of $B$ being a phase (unitarity of $B$) fixes $\lambda := \epsilon_B := \pm 1$ and $B^*B = \epsilon_B 1$.

Similar manipulation with $\gamma^\mu = (\gamma^{\mu T})^T$ leads to $C^{-1}C^T = \lambda 1$. Unitarity gives $C^{-1}C^T = C^\dagger C^T = (C^T)^*C^T$, hence $\lambda$ is real and the determinant gives $\lambda^4 = 1$. Hence, $C^T = \epsilon_CC$, $\epsilon_C = \pm 1$.

For $\gamma = (\gamma^\dagger)^\dagger$ we get $D^\dagger = \lambda D$ for some phase $\lambda$. Since $D$ is determined to within a phase, we can define $D' := e^{i\alpha}D$ and choose $\alpha$ so that $\lambda = e^{2i\alpha}$ to get $(D')^\dagger = D'$. Thus, without loss of generality, we can always choose $D^\dagger = D$. Now, knowing the hermiticity properties of the $\gamma$'s, we get $D = \gamma^0$, independent of any choice of explicit $\gamma$ matrices.

The $\epsilon_B, \epsilon_C$ are correlated since $\gamma^T = (\gamma^*)^\dagger = (\gamma^\dagger)^*$.

$\gamma^T = (\gamma^*)^\dagger$ leads to, $-C\gamma C^{-1} = -(B^\dagger)^{-1}D\gamma D^{-1}B^\dagger \Rightarrow C = \lambda BD$.

$\gamma^T = (\gamma^\dagger)^*$ leads to, $-C\gamma C^{-1} = -(D^*)B\gamma B^{-1}(D^{-1})^* \Rightarrow C = \lambda'D^*B$.

Eliminating $C$ gives $BD = \frac{\lambda'}{\lambda}D^*B$ or $D^* = \frac{\lambda}{\lambda'}BDB^{-1}$. However, $D = \gamma^0$ can always be achieved as shown above. Therefore, $\lambda' = -\lambda$ and $C = \lambda BD = -\lambda D^*B$. Next,

$$C^T = \lambda D^TB^T = \lambda(D^\dagger)^*(B^\dagger)^* = \lambda D^*(B^*)^{-1} = \lambda D^*(\epsilon_BB) = -\epsilon_BC \quad\Rightarrow\quad \epsilon_C = -\epsilon_B .$$

Note that this is independent of explicit $\gamma$ matrices and also independent of the phase $\lambda$! We still need to determine $\epsilon_B$, say.

Claim: The $\epsilon_B$ does not depend on the choice of $\gamma$ matrices.

This is easily proved. Let $\gamma'$ and $\gamma$ be two distinct choices of explicit $\gamma$-matrices. Both being unitary, they are related by some unitary matrix $S$ as $\gamma = S\gamma'S^{-1}$. Then substitution in $\gamma^* = -B\gamma B^{-1}$ gives $(\gamma')^* = -B'\gamma'(B')^{-1}$ with $B' := (S^*)^{-1}BS$. It follows that $(B')^*B' = (S^{-1}B^*S^*)((S^*)^{-1}BS) = \epsilon_B1$, proving the claim.
It therefore suffices to choose an explicit representation and evaluate $\epsilon_B$. Referring to, say, the Dirac-Pauli representation, see (5.4), only $\gamma^2$ is pure imaginary. Hence, $\gamma^{0,1,3}B = -B\gamma^{0,1,3}$ and $\gamma^2B = +B\gamma^2$. Furthermore, $\gamma_5^* = \gamma_5 \Rightarrow \gamma_5B = -B\gamma_5$ as well. Therefore $B$ must be made up of an odd number of $\gamma$'s. By inspection, $B = \alpha\gamma^2$ satisfies the conditions. Unitarity of $B$ restricts $\alpha$ to be a phase and $B^*B = |\alpha|^2(-\gamma^2)(\gamma^2) = 1 \Rightarrow \epsilon_B = +1$ and $\epsilon_C = -1$. Also $C = \lambda BD = \lambda\alpha\gamma^2\gamma^0 = (-\lambda\alpha)\gamma^0\gamma^2$. The phases $\lambda, \alpha$ are arbitrary and convention dependent. We choose $\lambda = 1$ and $\alpha = i$ so that $B = i\gamma^2$, $C = i\gamma^2\gamma^0$. This also holds for the Weyl representation.

For the Majorana representation, all $\gamma^\mu$'s are imaginary and thus commute with $B$. Hence $B$ must be a phase multiple of $1$. This is also consistent with commutation with $\gamma_5$. Clearly $\epsilon_B = +1$ and $C = (\lambda\alpha)\gamma^0$. This completes the discussion of representations and relations among them.

It is customary to introduce a charge conjugation operator, $\mathcal{C} := B^{-1}K$, where $K$ instructs to take the complex conjugate of the numbers on its right. This operator is anti-linear and since $B$ is unitary, it is also an anti-unitary operator. It follows,

$$\mathcal{C}\,\Sigma^{\mu\nu} = B^{-1}(\Sigma^{\mu\nu})^*K = B^{-1}(-B\Sigma^{\mu\nu}B^{-1})K = -\Sigma^{\mu\nu}\,\mathcal{C} \quad\Rightarrow\quad \mathcal{C}\,D(\Lambda) = D(\Lambda)\,\mathcal{C} .$$

Thus charge conjugation acts invariantly on Lorentz representations. Provided $\mathcal{C}^2 = 1$, we have $\left(\frac{1\pm\mathcal{C}}{2}\right)^2 = \frac{1\pm\mathcal{C}}{2}$ and the Lorentz representation can be reduced further. The corresponding subspaces satisfy $\mathcal{C}\Psi = \pm\Psi$ and these spinors are called Majorana spinors. Thus, Majorana spinors are Dirac spinors which satisfy $\mathcal{C}\Psi = \pm\Psi \leftrightarrow B^{-1}\Psi^* = \pm\Psi$.

So is $\mathcal{C}^2 = 1$? We have $\mathcal{C}^2 = B^{-1}KB^{-1}K = (B^*B)^{-1}K^2 = \epsilon_B1 = 1$. Therefore, we do have Majorana spinors (in 4 dimensions). Do we have Weyl-Majorana spinors? Well, $\mathcal{C}\gamma_5 = B^{-1}\gamma_5^*K = -\gamma_5B^{-1}K = -\gamma_5\,\mathcal{C}$ and we cannot have Weyl-Majorana spinors in 4 dimensions.

A similar analysis can be carried out for spinors in any dimensions and with any metric signature.
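The choices $B = i\gamma^2$, $C = i\gamma^2\gamma^0$, $D = \gamma^0$ made above for the Dirac-Pauli representation can be verified directly. The sketch below assumes the Dirac-Pauli matrices of (5.4) with $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$; the variable names (`D_mat`, `eps_B`, and so on) and the sample momentum are illustrative. It also checks on one example that $B^{-1}u^*$ solves the $v$-spinor equation when $u$ solves the $u$-spinor equation.

```python
import numpy as np

# Dirac-Pauli gamma matrices, {gamma^mu, gamma^nu} = -2 eta^{mu nu}, eta = diag(-1, 1, 1, 1)
I2, Z2 = np.eye(2), np.zeros((2, 2))
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
one = np.eye(4)

B = 1j * gamma[2]
C = 1j * gamma[2] @ gamma[0]
D_mat = gamma[0]
B_inv, C_inv, D_inv = (np.linalg.inv(M) for M in (B, C, D_mat))

conj_ok = all(np.allclose(g.conj(), -B @ g @ B_inv) for g in gamma)        # gamma^* = -B gamma B^-1
transpose_ok = all(np.allclose(g.T, -C @ g @ C_inv) for g in gamma)        # gamma^T = -C gamma C^-1
dagger_ok = all(np.allclose(g.conj().T, D_mat @ g @ D_inv) for g in gamma) # gamma^dag = D gamma D^-1

eps_B_ok = np.allclose(B.conj() @ B, one)             # B^* B = +1
eps_C_ok = np.allclose(C.T, -C)                       # C^T = -C
c_squared_ok = np.allclose(B_inv @ B_inv.conj(), one) # (B^-1 K)^2 = 1
no_weyl_majorana = np.allclose(B_inv @ gamma5.conj(), -gamma5 @ B_inv)

# Charge conjugate of a positive frequency spinor: (kslash + m) u = 0  =>  (kslash - m) B^-1 u^* = 0
mass = 1.0
k_vec = np.array([0.6, -0.2, 0.9])
k_low = np.concatenate(([-np.sqrt(k_vec @ k_vec + mass**2)], k_vec))  # k_mu
ks = sum(k_low[mu] * gamma[mu] for mu in range(4))
u = (-ks + mass * one) @ np.array([1.0, 0.5j, -0.3, 0.2])  # lies in the kernel of (kslash + m)
u_c = B_inv @ u.conj()

print(conj_ok, transpose_ok, dagger_ok, eps_B_ok, eps_C_ok, c_squared_ok, no_weyl_majorana)
print(np.allclose((ks + mass * one) @ u, 0), np.allclose((ks - mass * one) @ u_c, 0))
```

Replacing `gamma` by the Weyl matrices of (5.5) leaves all checks unchanged, in line with the remark that the same $B$ and $C$ work there.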
Such an analysis may be seen, for instance, in [2].

6. RELATIVISTIC ACTIONS: CLASSICAL FIELDS

So far we focused on the representation theory of the Poincare group. The abstract, algebraic approach revealed the attributes of these representations, namely mass and spin/helicity. In order to have a framework which is manifestly Poincare covariant, we constructed and analyzed representations on vector valued (complex in general) functions, $\Psi^A$ - henceforth these will be generically referred to as fields. The irreducibility condition emerged as "field equations". These equations are all homogeneous, linear and with at the most two derivatives. They all admit plane waves and their linear combinations as solutions, but no other 'phenomenon' involving (say) different types of waves modifying their propagation etc. Intuitively, there are no interactions. We would like to have a framework which is not only Poincare covariant but also involves "interaction" or non-trivial "dynamics". The well tested and successful strategy is to have an action formulation.

Let us quickly note several advantages of having an action formulation:

1. The equations of motion are retrieved invoking a variational principle as the Euler-Lagrange equations of motion;

2. "Interactions" can be understood naturally as leading to non-linear and/or coupled equations which can be easily introduced as more than quadratic order terms in the action;

3. Covariance of equations of motion (or dynamics) can be easily incorporated by requiring appropriate invariance of the action;

4. Noether's theorem gives a recipe for obtaining quantities conserved by equations of motion (and hence during interactions as well);

5. It leads to a canonical framework of symplectic structure ("Poisson brackets") and a Hamiltonian evolution. This provides a systematic method of identifying "degrees of freedom" (e.g. Dirac's theory of constrained systems).
A canonical structure is already inherent in the quantum framework: the imaginary part of the inner product provides the symplectic structure while the Schrodinger equation gives a Hamiltonian evolution;

6. Path integral quantization - very well suited for gauge theories, especially the non-abelian ones - has the action as the central quantity.

Without further ado, let us proceed with an action formulation. Our first aim is to obtain the Klein-Gordon, the Dirac and the Proca/Maxwell equations. Because the equations are local, partial differential equations, the action must be an integral over the Minkowski space-time of a Lagrangian density built out of the Poincare covariant fields and their derivatives, i.e. it must be of the form $S = \int d^4x\,\sqrt{|\det(\eta)|}\,\mathcal{L}$ where $\mathcal{L}$ is a Lorentz scalar built out of the fields. It is called the Lagrangian density. Here, $\eta$ denotes the Minkowski metric and is necessary to absorb the Jacobian of Lorentz transforms of the space-time coordinates. The absolute value of the determinant happens to be 1 and is suppressed throughout.

A variational principle has the form,

$$\delta S[\Phi] := S[\Phi + \delta\Phi] - S[\Phi] = \int d^4x\,\delta\mathcal{L} := \int d^4x\,\delta\Phi\ \text{“}\frac{\delta\mathcal{L}}{\delta\Phi}\text{”} .$$

Extremization of the action under arbitrary variation $\delta\Phi$ leads to the Euler-Lagrange equations: $\frac{\delta\mathcal{L}}{\delta\Phi} = 0$.

Since the equations we want to derive are linear, it suffices to have $\mathcal{L}$ quadratic in the fields. Furthermore, the equations have no more than two derivatives, which can be obtained by restricting to first derivatives in the Lagrangian.

Let us begin with the scalar field,

$$\phi_{\Lambda,a}(x) = \phi(\Lambda^{-1}(x-a))\ ,\qquad \partial_\mu\phi_{\Lambda,a}(x) = \Lambda_\mu{}^\alpha\,(\partial_\alpha\phi)(\Lambda^{-1}(x-a)) \quad\Rightarrow$$
$$[\partial^\mu\phi\,\partial_\mu\phi]_{\Lambda,a} = \eta^{\mu\nu}[\partial_\mu\phi\,\partial_\nu\phi]_{\Lambda,a} = \eta^{\mu\nu}\Lambda_\mu{}^\alpha\Lambda_\nu{}^\beta\,[\partial_\alpha\phi\,\partial_\beta\phi](\Lambda^{-1}(x-a)) = \eta^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi\,(\Lambda^{-1}(x-a)) = [\partial^\alpha\phi\,\partial_\alpha\phi](\Lambda^{-1}(x-a)) .$$

When integrated over the space-time, we can change the dummy variable $x \to \Lambda x + a$, which restores the argument of the $\phi$'s without changing the integration measure. Hence $\int d^4x\,\partial^\mu\phi\,\partial_\mu\phi$ is clearly Poincare invariant.
Since all fields have the same shift of the space-time argument, we will suppress the translation part and focus on the Lorentz part. This works as long as there are no externally prescribed fields/functions which break translation invariance. Thus, let

$$S[\phi] := \int_M d^4x\left[-\frac12\partial^\mu\phi\,\partial_\mu\phi - \frac12m^2\phi^2\right] =: \int_M d^4x\,\mathcal{L}(\phi,\partial\phi) \tag{6.1}$$
$$\delta S[\phi] = \int_M d^4x\left[-\partial_\mu\phi\,\partial^\mu\delta\phi - m^2\phi\,\delta\phi\right] = \int_M d^4x\left[-\partial_\mu(\partial^\mu\phi\,\delta\phi) + (\Box\phi - m^2\phi)\,\delta\phi\right]$$
$$= \int_M d^4x\;\delta\phi\,(\Box - m^2)\phi - \int_{\partial M} d^3x\;n_\mu\,(\partial^\mu\phi\,\delta\phi) \tag{6.2}$$

The integration is over all of Minkowski space-time, which has no boundary, and hence the second term should be zero. In practice, such space-times extending to infinite coordinates are handled/defined by putting the system in a large box and taking a limit. The fields must satisfy suitable boundary conditions so that the boundary contribution again vanishes. Alternatively, requiring the field to vanish fast enough in the asymptotic regions makes the boundary term vanish. We can always choose the variational principle to set $\delta\phi = 0$ at the boundary/asymptotic regions. While the boundary contributions are important in some contexts, we will restrict ourselves to the boundary contribution being zero. Demanding that the action be stationary for arbitrary variations of the field, we get the equation of motion $(\Box - m^2)\phi = 0$.

Incidentally, we will get the same equation of motion from another Lagrangian $\mathcal{L}' = \mathcal{L} + \partial_\mu\Lambda^\mu$. Thus, several Lagrangians can give the same equations of motion (although the Hamiltonian formulation will vary with the Lagrangians).

Consider now a vector field. For the massive case, the equations we want are $(\Box - m^2)v^\mu = 0 = \partial_\mu v^\mu$. Consider $\mathcal{L} = aF^{\mu\nu}F_{\mu\nu} + bm^2v^\mu v_\mu$, where $F_{\mu\nu} := \partial_\mu v_\nu - \partial_\nu v_\mu$ and $a, b$ are non-zero constants to be determined.

$$\delta\mathcal{L} = 2aF^{\mu\nu}(\partial_\mu\delta v_\nu - \partial_\nu\delta v_\mu) + 2bm^2v^\mu\delta v_\mu = 4aF^{\mu\nu}\partial_\mu\delta v_\nu + 2bm^2v^\nu\delta v_\nu$$
$$= \partial_\mu(4aF^{\mu\nu}\delta v_\nu) - 4a(\partial_\mu F^{\mu\nu})\,\delta v_\nu + 2bm^2v^\nu\delta v_\nu \quad\Rightarrow\quad -4a\,\partial_\mu F^{\mu\nu} + 2bm^2v^\nu = 0 .$$
$$\therefore\ 0 = \Box v^\nu - \frac{b}{2a}m^2v^\nu - \partial^\nu\,\partial\cdot v\ ;\quad \text{taking the divergence and choosing } b = 2a \text{ gives}$$
$$0 = (\Box - m^2)v^\nu \quad\text{and}\quad \partial\cdot v = 0 . \tag{6.3}$$

We got $b = 2a$ and we choose $b = -\frac12$, similar to the scalar field case. The reason for this choice will be clear a little later.

Note: For $m = 0$, we denote the vector field by $A_\mu$. Now we cannot conclude $\partial\cdot A = 0$, and the equation we get is $\partial_\mu F^{\mu\nu} = 0$, which is just the Maxwell equation. It has the well known gauge invariance $A_\mu \to A_\mu + \partial_\mu\Lambda$, which allows us to take one of the components of $A_\mu$ to be zero.

Exercise: Another natural choice of the Lagrangian is: $\mathcal{L}' = a\,\partial^\mu v^\nu\partial_\mu v_\nu + bm^2v_\mu v^\mu + c\,(\partial_\mu v^\mu)^2$. Repeat the steps and deduce the choices for the constants so as to get the Proca equations. Check that $\mathcal{L}, \mathcal{L}'$ differ by a divergence.

Lastly, let us consider the action for getting the Dirac equation. We have already noted the Lorentz transformations of $\Psi, \bar\Psi, \gamma^\mu, \partial_\mu$. From these it follows that we can have the following Lorentz invariant terms, quadratic in the fields and with at most first derivatives of the fields: $\bar\Psi\gamma^\alpha\partial_\alpha\Psi$, $\partial^\alpha\bar\Psi\,\partial_\alpha\Psi$, $\bar\Psi\Psi$ and the same terms with a $\gamma_5$ inserted between the spinors. The terms with the $\gamma_5$ are all pseudo-scalars and may be dropped if parity (space-inversion plus a rotation) is required to be a symmetry. We assume so for the present. The candidate Lagrangian is then,

$$\mathcal{L} := a\,\bar\Psi\gamma^\alpha\partial_\alpha\Psi + b\,\partial^\alpha\bar\Psi\,\partial_\alpha\Psi + cm\,\bar\Psi\Psi \quad\text{and}\quad \delta_{\bar\Psi}\mathcal{L} = \delta\bar\Psi\left[a\,\slashed{\partial}\Psi - b\,\Box\Psi + cm\,\Psi\right] \text{ (up to a divergence)}. \tag{6.4}$$

For the choice $a = -i$, $b = 0$ and $c = 1$, we get the Dirac equation. If we varied $\Psi$, then for the same choice, we will get the conjugate Dirac equation: $i\partial_\alpha\bar\Psi\gamma^\alpha + m\bar\Psi = 0$.

In summary, we have $S = \int_M d^4x\,\sqrt{|\det\eta|}\,\mathcal{L}$ with,

$$\mathcal{L}_{\text{scalar}} = -\frac12\partial^\mu\phi\,\partial_\mu\phi - \frac12m^2\phi^2\ ; \tag{6.5}$$
$$\mathcal{L}_{\text{vector}} = -\frac14F^{\mu\nu}F_{\mu\nu} - \frac12m^2v^\mu v_\mu\ ; \tag{6.6}$$
$$\mathcal{L}_{\text{spinor}} = -i\bar\Psi\gamma^\mu\partial_\mu\Psi + m\bar\Psi\Psi . \tag{6.7}$$

$\delta S = 0$ leads to the field equations implementing the irreducibility condition. For the massless vector, the action is invariant under the gauge transformation: $A_\mu \to A_\mu + \partial_\mu\Lambda$. The actions are also invariant under parity.

Note: We have taken the scalar and vector fields to be real.
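The statement that varying the scalar action (6.1) produces $(\Box - m^2)\phi$ can be illustrated on a lattice. The sketch below is an illustration under several assumptions that are not in the notes: it works in 1+1 dimensions with $\eta = \text{diag}(-1,+1)$, uses forward differences on a periodic grid (so that there is no boundary term), and compares a finite-difference derivative of the discretized action with the lattice version of $(\Box - m^2)\phi$. All names and grid parameters are illustrative.

```python
import numpy as np

def action(phi, dt, dx, m):
    """Lattice S = sum dt dx [ -1/2 d^mu phi d_mu phi - 1/2 m^2 phi^2 ], eta = diag(-1, +1).

    With this signature -1/2 d^mu phi d_mu phi = +1/2 (d_t phi)^2 - 1/2 (d_x phi)^2.
    Axis 0 is time, axis 1 is space; both directions are periodic.
    """
    dphi_dt = (np.roll(phi, -1, axis=0) - phi) / dt
    dphi_dx = (np.roll(phi, -1, axis=1) - phi) / dx
    lagrangian = 0.5 * dphi_dt**2 - 0.5 * dphi_dx**2 - 0.5 * m**2 * phi**2
    return lagrangian.sum() * dt * dx

def box_minus_m2(phi, dt, dx, m):
    """Lattice (Box - m^2) phi with Box = -d_t^2 + d_x^2 (second differences)."""
    d2t = (np.roll(phi, -1, axis=0) - 2 * phi + np.roll(phi, 1, axis=0)) / dt**2
    d2x = (np.roll(phi, -1, axis=1) - 2 * phi + np.roll(phi, 1, axis=1)) / dx**2
    return -d2t + d2x - m**2 * phi

def numerical_gradient(phi, dt, dx, m, idx, h=1e-3):
    """Central-difference derivative of S with respect to the single value phi[idx]."""
    bump = np.zeros_like(phi)
    bump[idx] = h
    return (action(phi + bump, dt, dx, m) - action(phi - bump, dt, dx, m)) / (2 * h)

rng = np.random.default_rng(1)
phi = rng.normal(size=(12, 10))          # an arbitrary field configuration
dt, dx, mass = 0.1, 0.2, 0.7

expected = box_minus_m2(phi, dt, dx, mass) * dt * dx   # delta S / delta phi per site
samples = [(0, 0), (3, 7), (11, 9)]
grads = [numerical_gradient(phi, dt, dx, mass, idx) for idx in samples]
for idx, g in zip(samples, grads):
    print(idx, g, expected[idx])
```

Because the action is quadratic, the central difference is exact up to rounding, so the two printed columns should agree; a configuration is stationary precisely when the lattice $(\Box - m^2)\phi$ vanishes at every site.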
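We could have a complex scalar field (say) with two field equations: $(\Box - m^2)\phi = 0 = (\Box - m^2)\phi^*$. The complex field may be considered as two real fields, $\phi = \phi_1 + i\phi_2$, and two terms may be included in the action. Alternatively, the complex scalar field equations may be derived from $\mathcal{L} = -\partial^\mu\phi^*\partial_\mu\phi - m^2\phi^*\phi$.

Notice that all actions are real (for spinors it is convenient to take the matrix hermitian conjugate). We will always take the actions to be real so that $e^{\frac{i}{\hbar}S}$ will be a phase. Only when dissipation of energy/momentum/angular momentum is to be incorporated do we need to take the action to be complex. In this course, we will not do so.

A. Variational Principle, Symmetries of the Action and Noether's theorem

Let us denote a generic field by $X$ with all indices suppressed. Let an action be expressed as $S[X] := \int_M d^4x\, \mathcal{L}(X, \partial X)$. Let $\delta X := \chi(X(x))$ be an arbitrary, infinitesimal variation of the field $X(x)$. Then,

\[ \delta S = \int_M d^4x \left[ \frac{\delta\mathcal{L}}{\delta X}\delta X + \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X_{,\mu} \right] = \int_M d^4x \left[ \frac{\delta\mathcal{L}}{\delta X}\delta X + \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\partial_\mu\delta X \right] = \int_M d^4x \left( \frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}} \right)\delta X + \int_M d^4x\, \partial_\mu\!\left( \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X \right) \]

Here, $X_{,\mu} = \partial_\mu X$. The $\frac{\delta\mathcal{L}}{\delta X}$ is really like a partial derivative, but since both $\mathcal{L}$ and $X$ depend on the coordinates, these should be mentioned too. We have kept these implicit and used the $\delta$ as a reminder. We follow this customary practice. The last term is a divergence and can be expressed as a boundary integral: $\int_{\partial M} d^3y\, n_\mu \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X$. This is typically dropped/vanishes for various reasons.

Note: For an action principle to be well defined (this includes the specification of the class of variations), it is necessary that the boundary contribution vanishes. If in some cases it does not vanish, an additional 'surface action' needs to be added to cancel the total boundary contribution. This is typically encountered in gravitational actions on manifolds with boundaries.

To summarize: a variational principle asserts that $\delta S[X]$ vanishes for arbitrary variation $\delta X$ around $X$ iff $\frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}} = 0$, i.e. iff $X(x)$ is a solution of the equation of motion.

We now consider a different situation. We consider special, restricted variations such that $\delta S[X] = 0$ at all fields $X(x)$. Such variations are called infinitesimal symmetries of the action. This is translated in terms of the Lagrangian density as,

\[ \delta S[X] = \int_M d^4x\, \delta\mathcal{L} = 0 \quad\text{if either}\quad \delta\mathcal{L} = 0 \quad\text{or}\quad \delta\mathcal{L} = \partial_\mu\delta\Lambda^\mu , \]

where $\delta\Lambda^\mu$ is some 4-vector which vanishes/falls off on the boundary. Now we have Noether's theorem:

Noether's Theorem: For every infinitesimal symmetry of the action, there exists a conserved current, conserved on every solution of the equation of motion.

Proof: We already have the infinitesimal variation of $\mathcal{L}$, which must equal a divergence, i.e. for a symmetry variation,

\[ \left( \frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}} \right)\delta X + \partial_\mu\!\left( \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X \right) = \delta\mathcal{L} = \partial_\mu\delta\Lambda^\mu \ \Rightarrow\ \left( \frac{\delta\mathcal{L}}{\delta X} - \partial_\mu\frac{\delta\mathcal{L}}{\delta X_{,\mu}} \right)\delta X + \partial_\mu\!\left( \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X - \delta\Lambda^\mu \right) = 0 . \]

Thus, if $X$ satisfies the equation of motion, the first term vanishes and we have a conserved current,

\[ \delta J^\mu := \frac{\delta\mathcal{L}}{\delta X_{,\mu}}\delta X - \delta\Lambda^\mu , \qquad \partial_\mu\delta J^\mu = 0 . \]

This is a neat prescription to discover conserved currents and their corresponding conserved charges, $\delta Q := \int_{\Sigma} d^3\sigma\, n_\mu\,\delta J^\mu$, with $\Sigma$ a space-like hypersurface. Discovering conserved currents by inspection of the equation of motion is easy only in the simplest of cases.

Let us see an example, particularly a symmetry variation induced by infinitesimal Poincare transformations, for which $\delta S = 0$ by construction. Consider a real scalar field action for simplicity. Under an infinitesimal Poincare transformation,

\[ \delta\phi_{\Lambda,a}(x) := \phi(\Lambda^{-1}(x - a)) - \phi(x) = \phi\big( (\delta^\mu{}_\nu - \omega^\mu{}_\nu)(x - \epsilon)^\nu \big) - \phi(x) = (-\omega^\mu{}_\nu x^\nu - \epsilon^\mu)\partial_\mu\phi =: -\xi^\mu\partial_\mu\phi(x) , \qquad \xi^\mu := \omega^\mu{}_\nu x^\nu + \epsilon^\mu . \]

This leads to

\[ \delta\mathcal{L} = -\partial^\mu\phi\,\partial_\mu(-\xi\cdot\partial\phi) - m^2\phi\,(-\xi\cdot\partial\phi) = \partial^\mu\phi\,\partial_\mu(\xi^\nu\partial_\nu\phi) + m^2\phi\,\xi^\nu\partial_\nu\phi = (\partial_\mu\xi^\nu)\,\partial^\mu\phi\,\partial_\nu\phi + \xi^\nu\left[ \partial^\mu\phi\,\partial_\nu\partial_\mu\phi + m^2\phi\,\partial_\nu\phi \right] \]
\[ = \tfrac{1}{2}(\partial_\mu\xi_\nu + \partial_\nu\xi_\mu)\,\partial^\mu\phi\,\partial^\nu\phi - \xi^\nu\left[ \frac{\delta\mathcal{L}}{\delta\phi_{,\mu}}\partial_\nu\phi_{,\mu} + \frac{\delta\mathcal{L}}{\delta\phi}\partial_\nu\phi \right] , \]

where we used $\frac{\delta\mathcal{L}}{\delta\phi_{,\mu}} = -\partial^\mu\phi$ and $\frac{\delta\mathcal{L}}{\delta\phi} = -m^2\phi$. Explicitly, $\partial_\mu\xi_\nu = \partial_\mu(\omega_{\nu\alpha}x^\alpha + \epsilon_\nu) = \omega_{\nu\mu} + 0$, which is antisymmetric in $\mu \leftrightarrow \nu$, and hence the first term vanishes. The second term is just $-\xi^\nu\partial_\nu\mathcal{L}$.

\[ \therefore\ \delta\mathcal{L} = -\xi^\mu\partial_\mu\mathcal{L} = -\partial_\mu(\xi^\mu\mathcal{L}) + (\partial\cdot\xi)\mathcal{L} = \partial_\mu\delta\Lambda^\mu + 0 , \quad\text{or}\quad \delta\mathcal{L} = \partial_\mu\delta\Lambda^\mu , \quad \delta\Lambda^\mu = -\xi^\mu\mathcal{L} . \]

Note: We ensured that $\mathcal{L}$ is a Lorentz scalar, but it is not a translation scalar. The action gets its Poincare invariance thanks to the $\int d^4x$. Since all fields under translations shift their coordinates by $a$, in all cases we will have $\delta\mathcal{L} = -\xi\cdot\partial\mathcal{L}$ provided $\xi_\mu$ satisfies $\partial_\mu\xi_\nu + \partial_\nu\xi_\mu = 0$, i.e. $\xi^\mu$ is a "Killing vector" of the Minkowski metric.

By Noether's theorem, we get

\[ \delta J^\mu = \frac{\delta\mathcal{L}}{\delta\phi_{,\mu}}\delta\phi - \delta\Lambda^\mu = (-\partial^\mu\phi)(-\xi^\nu\partial_\nu\phi) + \xi^\mu\mathcal{L} = \xi_\nu\left( \partial^\mu\phi\,\partial^\nu\phi + \eta^{\mu\nu}\mathcal{L} \right) =: \xi_\nu T^{\mu\nu} . \]

Conservation of the Noether current gives $(\partial_\mu\xi_\nu)T^{\mu\nu} + \xi_\nu\partial_\mu T^{\mu\nu} = 0$. Provided $T^{\mu\nu}$ is symmetric (as it is here), the first term vanishes and, by independence of the $\xi^\mu$, we get $\partial_\mu T^{\mu\nu} = 0$. The tensor $T^{\mu\nu} := \partial^\mu\phi\,\partial^\nu\phi + \eta^{\mu\nu}\mathcal{L}$ is called the canonical stress tensor for the scalar field.

Thus, for each Killing vector $\xi^\mu$, we have $J^\mu := T^{\mu\nu}\xi_\nu$, which is conserved on the solutions of the field equation. We define the corresponding charges as $Q(\xi) := \int_{\Sigma_t} d^3x\, T^{0\nu}\xi_\nu$, which are independent of the hypersurfaces $\Sigma_t$.

Here are practice exercises:
(1) For the Proca Lagrangian, show that $\delta\mathcal{L} = -\xi^\mu\partial_\mu\mathcal{L}$ for $\xi^\mu = \omega^\mu{}_\nu x^\nu + \epsilon^\mu$, with $\omega, \epsilon$ being infinitesimal, and obtain the canonical stress tensor.
(2) For $m = 0$, show that this stress tensor is not gauge invariant and one needs to "improve" it to get a symmetric, gauge invariant and conserved stress tensor. Guess it.
(3) For the scalar and the Maxwell field, obtain the conserved charges corresponding to the 10 Killing vectors. The charges corresponding to the translations are the energy and momentum, while those corresponding to the $\omega_{ij}$ are the angular momentum components.

B. Conserved Poincare charges for scalar field solutions

Let us see an explicit example of Noether charges for a general solution of the Klein-Gordon equation.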
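Before turning to the general solution, the conservation law $\partial_\mu T^{\mu\nu} = 0$ can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: it assumes a 1+1 dimensional restriction with signature $(-,+)$, a superposition of two real plane waves with arbitrarily chosen amplitudes, wave numbers and phases, and hypothetical helper names (`field`, `stress`, `divergence`). The divergence of the canonical stress tensor is estimated by central differences; an off-shell field (wrong frequency) is included as a control.

```python
import math

M = 1.0  # mass, illustrative value
# (amplitude, wave number, phase) for each mode; arbitrary illustrative choices
MODES = [(1.0, 0.7, 0.3), (0.5, -1.3, 1.1)]


def omega(k, m=M):
    """On-shell frequency omega_k = sqrt(k^2 + m^2)."""
    return math.sqrt(k * k + m * m)


def field(t, x, on_shell=True):
    """Return (phi, d_t phi, d_x phi) for a sum of real plane waves."""
    phi = phi_t = phi_x = 0.0
    for amp, k, delta in MODES:
        # off-shell control: deliberately use a wrong frequency
        w = omega(k) if on_shell else 1.5 * omega(k)
        phase = k * x - w * t + delta
        phi += amp * math.cos(phase)
        phi_t += amp * w * math.sin(phase)
        phi_x += -amp * k * math.sin(phase)
    return phi, phi_t, phi_x


def stress(t, x, on_shell=True):
    """Canonical stress tensor T^{mu nu} = d^mu phi d^nu phi + eta^{mu nu} L."""
    phi, phi_t, phi_x = field(t, x, on_shell)
    lag = 0.5 * phi_t**2 - 0.5 * phi_x**2 - 0.5 * M**2 * phi**2
    d_up = (-phi_t, phi_x)              # d^mu phi with eta = diag(-1, +1)
    eta = ((-1.0, 0.0), (0.0, 1.0))
    return [[d_up[a] * d_up[b] + eta[a][b] * lag for b in range(2)]
            for a in range(2)]


def divergence(t, x, nu, on_shell=True, h=1e-4):
    """Central-difference estimate of d_mu T^{mu nu} at the point (t, x)."""
    d_t = (stress(t + h, x, on_shell)[0][nu]
           - stress(t - h, x, on_shell)[0][nu]) / (2 * h)
    d_x = (stress(t, x + h, on_shell)[1][nu]
           - stress(t, x - h, on_shell)[1][nu]) / (2 * h)
    return d_t + d_x


for nu in (0, 1):
    print("nu =", nu,
          " on-shell:", divergence(0.4, -0.9, nu),
          " off-shell:", divergence(0.4, -0.9, nu, on_shell=False))
```

On a solution the printed divergence should be zero up to discretization error, while the off-shell control gives a value of order one, illustrating that conservation holds only on solutions of the field equation.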
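We noted above that the canonical stress tensor for the scalar field is given by

\[ T_{\mu\nu} = \partial_\mu\phi\,\partial_\nu\phi + \eta_{\mu\nu}\mathcal{L} , \qquad \mathcal{L} := -\tfrac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \tfrac{1}{2}m^2\phi^2 , \qquad \partial_\mu T^{\mu\nu} = 0 \ \ \forall\ (\Box - m^2)\phi = 0 . \]

Define $M^{\mu\nu\lambda} := T^{\mu\nu}x^\lambda - T^{\mu\lambda}x^\nu$. Then $\partial_\mu M^{\mu\nu\lambda} = T^{\lambda\nu} - T^{\nu\lambda}$ when $\partial_\mu T^{\mu\nu} = 0$, and therefore $\partial_\mu M^{\mu\nu\lambda} = 0$ on solutions iff $\partial_\mu T^{\mu\nu} = 0$ and $T^{\mu\nu} = T^{\nu\mu}$.

These 6 conserved currents ($M^{\mu\nu\lambda}$ are antisymmetric in the last two indices) together with the conserved stress tensor give the 10 Poincare charges:

\[ P^\mu := \int_{\Sigma_t} d^3x\, T^{0\mu} , \qquad M^{\mu\nu} := \int_{\Sigma_t} d^3x\, M^{0\mu\nu} . \]

We had noted the plane wave solutions of the Klein-Gordon equation, labeled by $\vec k \in \mathbb{R}^3$, namely,

\[ u_{\vec k}(x) = \frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}} , \qquad k\cdot x := -\omega_{\vec k}t + \vec k\cdot\vec x , \qquad \omega_{\vec k} := \sqrt{\vec k^2 + m^2} , \]

and their complex conjugates. The general solution is thus expressed as

\[ \phi(x) = \int d^3k \left[ a(\vec k)u_{\vec k}(x) + a^*(\vec k)u^*_{\vec k}(x) \right] \]

with $a(\vec k)$ being complex numbers with a suitable dependence on $\vec k$ so that the field has the appropriate boundary behavior. The general solution is manifestly real.

To compute the energy-momentum and angular momentum charges, we have to integrate expressions quadratic in the solutions. So we note: $\partial_\mu u_{\vec k}(x) = ik_\mu u_{\vec k}(x)$, $\partial_\mu u^*_{\vec k}(x) = -ik_\mu u^*_{\vec k}(x)$, $k^0 = \omega_{\vec k}$, and the orthogonality relations:

\[ \int d^3x\, u^*_{\vec k}(x)u_{\vec k'}(x) = \frac{\delta^3(\vec k - \vec k')\, e^{i(\omega_{\vec k} - \omega_{\vec k'})t}}{\sqrt{2\omega_{\vec k}}\sqrt{2\omega_{\vec k'}}} = \frac{\delta^3(\vec k - \vec k')}{2\omega_{\vec k}} =: \delta^3_{inv}(\vec k - \vec k') , \]
\[ \int d^3x\, u_{\vec k}(x)u_{\vec k'}(x) = e^{-2i\omega_{\vec k}t}\,\delta^3_{inv}(\vec k + \vec k') , \qquad \int d^3x\, u^*_{\vec k}(x)u^*_{\vec k'}(x) = e^{2i\omega_{\vec k}t}\,\delta^3_{inv}(\vec k + \vec k') . \]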
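We need,

\[ P^\mu = \int d^3x \left[ \partial^0\phi\,\partial^\mu\phi + \eta^{0\mu}\left( -\tfrac{1}{2}\partial^\alpha\phi\,\partial_\alpha\phi - \tfrac{1}{2}m^2\phi^2 \right) \right] \tag{6.8} \]
\[ M^{\mu\nu} = \int d^3x \left[ \left( \partial^0\phi\,\partial^\mu\phi + \eta^{0\mu}\mathcal{L} \right)x^\nu - \left( \partial^0\phi\,\partial^\nu\phi + \eta^{0\nu}\mathcal{L} \right)x^\mu \right] \tag{6.9} \]

And we have,

\[ \mathcal{L} = -\frac{1}{2}\int d^3k \int d^3k' \Big\{ \big[ a(\vec k)a(\vec k')u_{\vec k}u_{\vec k'} + a^*(\vec k)a^*(\vec k')u^*_{\vec k}u^*_{\vec k'} \big]\big( -k\cdot k' + m^2 \big) + \big[ a(\vec k)a^*(\vec k')u_{\vec k}u^*_{\vec k'} + a^*(\vec k)a(\vec k')u^*_{\vec k}u_{\vec k'} \big]\big( +k\cdot k' + m^2 \big) \Big\} \tag{6.10} \]
\[ \partial^0\phi\,\partial^\mu\phi = \int d^3k \int d^3k' \Big\{ a(\vec k)a(\vec k')u_{\vec k}u_{\vec k'} + a^*(\vec k)a^*(\vec k')u^*_{\vec k}u^*_{\vec k'} - a(\vec k)a^*(\vec k')u_{\vec k}u^*_{\vec k'} - a^*(\vec k)a(\vec k')u^*_{\vec k}u_{\vec k'} \Big\}\big( -k^0 k'^\mu \big) \tag{6.11} \]

Substituting and carrying out the $\int d^3x$ using the orthogonality relations gives,

\[ P^i = \int d^3k \int d^3k' \Big[ -\frac{k^0 k'^i}{2\omega_{\vec k}} \Big\{ a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t} \Big\}\delta^3(\vec k + \vec k') + \frac{k^0 k'^i}{2\omega_{\vec k}} \Big\{ a(\vec k)a^*(\vec k) + a^*(\vec k)a(\vec k) \Big\}\delta^3(\vec k - \vec k') \Big] \]
\[ = \frac{1}{2}\int d^3k\, k^i \Big\{ a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t} + a(\vec k)a^*(\vec k) + a^*(\vec k)a(\vec k) \Big\} \tag{6.12} \]

The first two, time dependent terms in the braces are symmetric under $\vec k \leftrightarrow -\vec k$, while the last two, time independent terms merely change their argument. Since the prefactor $k^i$ is odd, the time dependent terms vanish under symmetric integration and we are left with,

\[ P^i = \int d^3k\, \frac{k^i}{2} \left[ a(\vec k)a^*(\vec k) - a(-\vec k)a^*(-\vec k) \right] \tag{6.13} \]

Since the $a$'s are ordinary complex numbers, the factor of 2 cancels and a new factor arises from explicit anti-symmetrization w.r.t. $\vec k$. The claimed conserved quantity is manifestly independent of $t$.

The calculation of $P^0$ proceeds much the same way. We have the $t$-dependent term proportional to $\delta^3(\vec k + \vec k')$ while the $t$-independent term is proportional to $\delta^3(\vec k - \vec k')$. We get,

\[ P^0 = \int \frac{d^3k}{2\omega_{\vec k}} \Big[ -\omega_{\vec k}^2 \Big\{ a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t} \Big\} + \omega_{\vec k}^2 \Big\{ 2a(\vec k)a^*(\vec k) \Big\} + \frac{1}{2}\Big\{ a(\vec k)a(-\vec k)e^{-2i\omega_{\vec k}t} + a^*(\vec k)a^*(-\vec k)e^{2i\omega_{\vec k}t} \Big\}\big( \omega_{\vec k}^2 + \vec k^2 + m^2 \big) + \frac{1}{2}\Big\{ 2a(\vec k)a^*(\vec k) \Big\}\big( -\omega_{\vec k}^2 + \vec k^2 + m^2 \big) \Big] \tag{6.14} \]

The $t$-dependent terms cancel, the last term vanishes since $\omega_{\vec k}^2 = \vec k^2 + m^2$, and the remaining $t$-independent term gives,

\[ P^0 = \int d^3k\, \omega_{\vec k}\, a(\vec k)a^*(\vec k) \tag{6.15} \]

The calculation of $M^{\mu\nu}$ is similar. The new feature is the explicit $x^\lambda$ in the integral.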
In the $M^{ij}$ we have $\int (T^{0i}x^j - T^{0j}x^i)$. We trade $x^j$ for $\partial_{k_j}$ by noting that

\[ \partial_{k_j}u_{\vec k}(x) = u_{\vec k}(x)\left[ -it\frac{k^j}{\omega_{\vec k}} + ix^j \right] \quad\text{or}\quad x^j u_{\vec k} = -i\,\partial_{k_j}u_{\vec k} + \frac{t\,k^j}{\omega_{\vec k}}u_{\vec k} . \]

As before, the spatial integration will produce $\delta$ functions and also derivatives of $\delta$ functions from the $\partial_{k_i}$. While doing the $k'$-integration, we need to flip the derivative as per the rules of the $\delta$ function. As before, the $e^{\pm 2i\omega_{\vec k}t}$ terms will turn out to be symmetric under $\vec k \leftrightarrow -\vec k$ and will not contribute. Filling in the algebra leads to the final results,

\[ M^{ij} = i\int d^3k \left[ a(\vec k)\left( k^i\partial_{k_j} - k^j\partial_{k_i} \right)a^*_{\vec k} - a^*(\vec k)\left( k^i\partial_{k_j} - k^j\partial_{k_i} \right)a_{\vec k} \right] \tag{6.16} \]
\[ M^{0i} = -\frac{i}{2}\int d^3k\, \omega_{\vec k}\left[ a_{\vec k}\,\partial_{k_i}a^*_{\vec k} - a^*_{\vec k}\,\partial_{k_i}a_{\vec k} \right] . \tag{6.17} \]

All are manifestly independent of time.

Apart from verifying that the conserved quantities are indeed time independent, the expressions for energy and momentum show that these quantities are a sum (integral) of the energy-momentum of individual solutions. Thus each of these plane waves can, at least heuristically, be thought of as carrying an energy $\omega_{\vec k}$ and momentum $\vec k$. Note that there is no $\hbar$ yet, so these are mathematically defined conserved quantities, much like the energy-momentum fluxes of electromagnetic fields identified via the Poynting theorem. This is helpful for interpreting the solutions of the field equations as physical entities carrying energy-momentum. To strengthen this further, we would like to see if the fields and the action can be cast in the form of a "dynamical system".

7. FOURIER DECOMPOSITIONS OF FIELDS: COLLECTION OF HARMONIC OSCILLATORS

In the previous section, we wrote the general solution of the Klein-Gordon equation. Now we want to see if the action takes the 'form of a dynamical system'. The meaning will be clear shortly.

Consider again the scalar field. Let $\Sigma_t$ be a constant $t$ surface. On $\Sigma_t$, the set of functions $\{ \varphi_{\vec k} = \frac{1}{(2\pi)^{3/2}}e^{i\vec k\cdot\vec x},\ \vec k \in \mathbb{R}^3 \}$ forms a complete, orthonormal set of functions, $\int_{\Sigma_t} d^3x\, \varphi^*_{\vec k}(\vec x)\varphi_{\vec k'}(\vec x) = \delta^3(\vec k - \vec k')$.
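Let us Fourier decompose the field as,

\[ \phi(t, \vec x) := \int_{\mathbb{R}^3} d^3k\, f_{\vec k}(t)\varphi_{\vec k}(\vec x) . \tag{7.1} \]

The reality of the field, $\phi^*(x) = \phi(x)$, implies $f^*_{\vec k}(t) = f_{-\vec k}(t)$. The expansion coefficient functions are determined by the equations of motion. It is trivial to check that $(\Box - m^2)\phi(x) = 0 \Rightarrow \int d^3k\, (-\ddot f_{\vec k} - \vec k^2 f_{\vec k} - m^2 f_{\vec k})\varphi_{\vec k}(\vec x) = 0$. By independence of the basis functions, we get $\ddot f_{\vec k} + \omega_{\vec k}^2 f_{\vec k} = 0\ \forall\ \vec k \in \mathbb{R}^3$, with $\omega_{\vec k}^2 := \vec k^2 + m^2$. Its general solution is $f_{\vec k}(t) = a_{\vec k}e^{-i\omega_{\vec k}t} + b_{\vec k}e^{i\omega_{\vec k}t}$. The reality condition $f^*_{\vec k} = f_{-\vec k}$ gives $a^*_{\vec k} = b_{-\vec k}$ or $b_{\vec k} = a^*_{-\vec k}$. Substitution gives,

\[ \phi_{soln}(x) = \int_{\mathbb{R}^3} d^3k \left( a_{\vec k}e^{-i\omega_{\vec k}t} + b_{\vec k}e^{i\omega_{\vec k}t} \right)\frac{e^{i\vec k\cdot\vec x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}} = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \left( a_{\vec k}e^{-i\omega_{\vec k}t + i\vec k\cdot\vec x} + b_{-\vec k}e^{i\omega_{\vec k}t - i\vec k\cdot\vec x} \right) \]
\[ \therefore\ \phi_{soln}(x) = \int d^3k \left[ a_{\vec k}u_{\vec k}(x) + a^*_{\vec k}u^*_{\vec k}(x) \right] , \qquad u_{\vec k} := \frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \tag{7.2} \]

This is exactly the same form we had for the general solution. We have adjusted the normalization factor for future convenience. The last equation will be referred to as a mode decomposition, with the $u_{\vec k}(x)$'s denoting the mode functions.

Let us rewrite the Fourier decomposition in a manifestly real form:

\[ \phi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k \left[ f_{\vec k}(t)\varphi_{\vec k}(\vec x) + f^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x) \right] \]

We have separated the $-\vec k$ tagged terms and used the reality condition. The integration is over the positive half of $\mathbb{R}^3$. Substituting the Fourier decomposition into the Lagrangian, we get,

\[ L = \int_{\Sigma_t} d^3x\, \mathcal{L} = \int_{\Sigma_t} d^3x \left[ -\frac{1}{2}\left( -\dot\phi^2 + (\nabla\phi)^2 \right) - \frac{m^2}{2}\phi^2 \right] \]
\[ \dot\phi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k \left[ \dot f_{\vec k}(t)\varphi_{\vec k}(\vec x) + \dot f^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x) \right] , \qquad \partial_j\phi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k\, (ik_j)\left[ f_{\vec k}(t)\varphi_{\vec k}(\vec x) - f^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x) \right] \]

Using ortho-normality, we get

\[ L = \int_{\mathbb{R}^3/2} d^3k \left[ \dot f_{\vec k}\dot f^*_{\vec k} - \omega_{\vec k}^2 f_{\vec k}f^*_{\vec k} \right] , \qquad \omega_{\vec k}^2 = \vec k^2 + m^2 . \tag{7.3} \]

Let $f_{\vec k}(t) := \frac{1}{\sqrt 2}\left( q_{\vec k}(t) + iq'_{\vec k}(t) \right)$, where $q, q'$ are real and $\vec k$ are in the positive half of $\mathbb{R}^3$. Substitution in the Lagrangian gives,

\[ L = \int_{\mathbb{R}^3/2} d^3k\, \frac{1}{2}\left[ \dot q_{\vec k}^2 - \omega_{\vec k}^2 q_{\vec k}^2 + \dot q'^{\,2}_{\vec k} - \omega_{\vec k}^2 q'^{\,2}_{\vec k} \right] . \]

Denoting $q'_{\vec k} =: q_{-\vec k}$, we can express the Lagrangian as,

\[ L = \int_{\mathbb{R}^3} d^3k\, \frac{1}{2}\left( \dot q_{\vec k}^2 - \omega_{\vec k}^2 q_{\vec k}^2 \right) . \tag{7.4} \]

The $\vec k$ now range over the full $\mathbb{R}^3$. The Lagrangian is manifestly expressed as a sum (integral) of Lagrangians for harmonic oscillators $q_{\vec k}$, each with frequency $\omega_{\vec k}$, for $\vec k \in \mathbb{R}^3$. One can easily pass to the Hamiltonian form by defining $p_{\vec k} := \frac{\partial L}{\partial\dot q_{\vec k}} = \dot q_{\vec k}$. We get,

\[ H = \int_{\mathbb{R}^3} d^3k\, \frac{1}{2}\left( p_{\vec k}^2 + \omega_{\vec k}^2 q_{\vec k}^2 \right) . \]

To complete the canonical form, introduce the usual basic Poisson brackets: $\{q_{\vec k}, q_{\vec k'}\} = 0 = \{p_{\vec k}, p_{\vec k'}\}$ and $\{q_{\vec k}, p_{\vec k'}\} = \delta^3(\vec k - \vec k')$, $\vec k \in \mathbb{R}^3$.

Introduce the field $\pi(t, \vec x) := \dot\phi(t, \vec x)$ and define Poisson brackets among the fields $\phi, \pi$ from the Poisson brackets among the $q_{\vec k}$ and $p_{\vec k}$ using the Fourier decompositions. Note that $f_{\vec k} = \frac{1}{\sqrt 2}(q_{\vec k} + iq_{-\vec k})$ and $\dot f_{\vec k} = \frac{1}{\sqrt 2}(p_{\vec k} + ip_{-\vec k})$, $\vec k \in \mathbb{R}^3/2$, so that on the half space $\{f_{\vec k}, \dot f_{\vec k'}\} = 0$ and $\{f_{\vec k}, \dot f^*_{\vec k'}\} = \delta^3(\vec k - \vec k')$. It follows,

\[ \{\phi(t, \vec x), \pi(t, \vec y)\} = \int_{\mathbb{R}^3/2} d^3k \int_{\mathbb{R}^3/2} d^3k' \left\{ f_{\vec k}\varphi_{\vec k}(\vec x) + f^*_{\vec k}\varphi^*_{\vec k}(\vec x)\ ,\ \dot f_{\vec k'}\varphi_{\vec k'}(\vec y) + \dot f^*_{\vec k'}\varphi^*_{\vec k'}(\vec y) \right\} \]
\[ = \int_{\mathbb{R}^3/2} d^3k \int_{\mathbb{R}^3/2} d^3k' \Big[ \varphi_{\vec k}(\vec x)\varphi_{\vec k'}(\vec y)\{f_{\vec k}, \dot f_{\vec k'}\} + \varphi_{\vec k}(\vec x)\varphi^*_{\vec k'}(\vec y)\{f_{\vec k}, \dot f^*_{\vec k'}\} + \varphi^*_{\vec k}(\vec x)\varphi_{\vec k'}(\vec y)\{f^*_{\vec k}, \dot f_{\vec k'}\} + \varphi^*_{\vec k}(\vec x)\varphi^*_{\vec k'}(\vec y)\{f^*_{\vec k}, \dot f^*_{\vec k'}\} \Big] \]
\[ = \int_{\mathbb{R}^3/2} d^3k \left[ \varphi_{\vec k}(\vec x)\varphi^*_{\vec k}(\vec y) + \varphi^*_{\vec k}(\vec x)\varphi_{\vec k}(\vec y) \right] = \int_{\mathbb{R}^3} d^3k\, \varphi_{\vec k}(\vec x)\varphi^*_{\vec k}(\vec y) \tag{7.5} \]
\[ \therefore\ \{\phi(t, \vec x), \pi(t, \vec y)\} = \delta^3(\vec x - \vec y) \tag{7.6} \]

The remaining Poisson brackets $\{\phi, \phi\}$, $\{\pi, \pi\}$, defined similarly, are zero. Note: The same $t$ is taken for the fields since the $q$'s and $p$'s are also at the same time.

Starting from the Lagrangian and using the Fourier decomposition, we saw explicitly that the Lagrangian can be expressed as a sum of Lagrangians of harmonic oscillators. Furthermore, we could define Poisson brackets among the fields from the same oscillator system. It is apparent now that a field satisfying the Klein-Gordon equation as irreducibility condition can be given a canonical formulation wherein it appears as a system of infinitely many, uncoupled harmonic oscillators.

Note: The integration domains appearing above can be confusing. For real valued fields, using a manifestly real form of a Fourier decomposition, the $\vec k \in \mathbb{R}^3/2$.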
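Once the solutions for the time dependent coefficients of the Fourier decomposition are substituted back in the Fourier decomposition, we get the mode decomposition of the field, with $\vec k \in \mathbb{R}^3$. For complex valued fields, in both Fourier and mode decompositions, $\vec k \in \mathbb{R}^3$.

We now repeat the exercise for the Maxwell field and the Dirac field. We will use the same orthonormal set of $\varphi_{\vec k}(\vec x)$ in developing the Fourier decomposition. The steps being very similar, we will be brief and suppress the integration range of the $\vec k$.

A. Maxwell Field

Knowing the plane wave solutions having two polarizations, we write the Fourier decomposition of the vector field $A_\mu(t, \vec x)$ as,

\[ A_\mu(t, \vec x) = \int d^3k\, \varepsilon_\mu(\vec k, a)f_{\vec k,a}(t)\varphi_{\vec k}(\vec x) , \qquad \varphi_{\vec k}(\vec x) = \frac{e^{i\vec k\cdot\vec x}}{(2\pi)^{3/2}} . \]

The $\varepsilon_\mu$ is a polarization vector which depends on $\vec k$, while the label $a$ enumerates the polarization vectors. A priori, there are four independent polarization vectors (a tetrad) and $a$ takes 4 values. One of the Maxwell equations fixes the 0th component of all polarization vectors in terms of the spatial components. One of the second set of equations is trivially true for the longitudinal polarization, $\vec\varepsilon(\vec k) \propto \vec k$, and we are left with only two independent equations for the $f_{\vec k,a}$ corresponding to the transverse polarizations. We could have kept the $f(t, a)$ inside the polarization vector, but this is more convenient. We will suppress the sum-over-$a$ till the final expression.

This leads to,

\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \int d^3k \left[ \varepsilon_\nu\partial_\mu(f\varphi) - \varepsilon_\mu\partial_\nu(f\varphi) \right] . \]
\[ \therefore\ F_{0i} = \int d^3k\, \varphi_{\vec k}(\vec x)\left\{ \varepsilon_i\dot f_{\vec k} - ik_i\varepsilon_0 f_{\vec k} \right\} , \qquad F_{ij} = \int d^3k\, f_{\vec k}\varphi_{\vec k}(\vec x)\left\{ i(k_i\varepsilon_j - k_j\varepsilon_i) \right\} \]

The equations of motion, using independence of $\varphi_{\vec k}(\vec x)$, give,

\[ \partial^i F_{i0} = 0 :\quad \varepsilon_0(\vec k, a)f_{\vec k}(t) = -\frac{i\,\vec k\cdot\vec\varepsilon}{\vec k^2}\dot f_{\vec k}(t) \ \Rightarrow\ \varepsilon_0\dot f_{\vec k} = -\frac{i\,\vec k\cdot\vec\varepsilon}{\vec k^2}\ddot f_{\vec k}(t) \]
\[ \partial^0 F_{0i} + \partial^j F_{ji} = 0 :\quad \varepsilon_i\left( -\ddot f_{\vec k} - \vec k^2 f_{\vec k} \right) + ik_i(\varepsilon_0\dot f_{\vec k}) + (\vec k\cdot\vec\varepsilon)k_i f_{\vec k} = 0 \]
\[ \therefore\ 0 = \ddot f_{\vec k}\left( -\varepsilon_i + \frac{k_i k^j\varepsilon_j}{\vec k^2} \right) + \vec k^2 f_{\vec k}\left( -\varepsilon_i + \frac{k_i k^j\varepsilon_j}{\vec k^2} \right) \quad\text{or,}\quad 0 = \left[ \left( -\delta_i{}^j + \frac{k_i k^j}{\vec k^2} \right)\varepsilon_j \right]\left[ \ddot f_{\vec k} + \vec k^2 f_{\vec k} \right] . \]

The prefactor is a projector, projecting a 3-vector onto the plane perpendicular to $\vec k$. Thus, $\varepsilon_i(\vec k, a) := \left( \delta_i{}^j - \frac{k_i k^j}{\vec k^2} \right)\varepsilon_j(\vec k, a)$ define the transverse polarizations and there are two independent ones. Since the prefactor is non-zero, the Maxwell equations imply that the $f_{\vec k}$ satisfy the same harmonic oscillator equation as before, with $\omega_{\vec k}^2 = \vec k^2$, i.e. the mass is zero. It is important to remember that the transverse polarization, $\varepsilon_i(\vec k, a)$, is defined by $\vec k$ and hence acquires the $\vec k$ dependence as well as the label $a$. Hence the $f_{\vec k,a}$ satisfy the oscillator equations only for the transverse polarizations. We may not always display this, to avoid clutter.

The next task is to compute the Lagrangian, $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} = \frac{1}{2}(F_{0i})^2 - \frac{1}{4}(F_{ij})^2$.

\[ (F_{0i})^2 = \int d^3k \int d^3k' \left\{ \varepsilon_i\dot f_{\vec k} - ik_i\varepsilon_0 f_{\vec k} \right\}\left\{ \varepsilon'_i\dot f_{\vec k'} - ik'_i\varepsilon'_0 f_{\vec k'} \right\}\varphi_{\vec k}(\vec x)\varphi_{\vec k'}(\vec x) \]

Eliminating $\varepsilon_0 f_{\vec k} = -i\frac{\vec k\cdot\vec\varepsilon}{\vec k^2}\dot f_{\vec k}$ and likewise $\varepsilon'_0 f_{\vec k'}$ (footnote 3: this is equivalent to using one of the Maxwell equations, the Gauss law equation, which is a constraint) simplifies the product of the $\{\ldots\}$ to,

\[ \{\ldots\}\{\ldots\} = \dot f_{\vec k}\dot f_{\vec k'}\left[ \vec\varepsilon\cdot\vec\varepsilon' + \vec k\cdot\vec k'\,\frac{(\vec k\cdot\vec\varepsilon)(\vec k'\cdot\vec\varepsilon')}{\vec k^2\,\vec k'^2} - (\vec\varepsilon\cdot\vec k')\frac{\vec\varepsilon'\cdot\vec k'}{\vec k'^2} - (\vec\varepsilon'\cdot\vec k)\frac{\vec\varepsilon\cdot\vec k}{\vec k^2} \right] \]

Integration over $\vec x$ gives $\delta^3(\vec k + \vec k')$, which cancels two terms, and we are left with,

\[ \frac{1}{2}\int_{\Sigma_t} d^3x\, (F_{0i})^2 = \frac{1}{2}\int d^3k \left[ \vec\varepsilon(\vec k)\cdot\vec\varepsilon'(-\vec k) - \frac{(\vec\varepsilon(\vec k)\cdot\vec k)(\vec\varepsilon'(-\vec k)\cdot\vec k)}{\vec k^2} \right]\dot f_{\vec k}\dot f_{-\vec k} \]

But $[\ldots] = \varepsilon_i(\vec k)\left( \delta^{ij} - \frac{k^i k^j}{\vec k^2} \right)\varepsilon_j(-\vec k) = \vec\varepsilon(\vec k)\cdot\vec\varepsilon(-\vec k)$, where on the right the polarizations are the transverse ones.

\[ \therefore\ \frac{1}{2}\int_{\Sigma_t} d^3x\, (F_{0i})^2 = \frac{1}{2}\int d^3k\, \vec\varepsilon(\vec k)\cdot\vec\varepsilon(-\vec k)\, \dot f_{\vec k}\dot f_{-\vec k} \quad\text{and similarly,}\quad -\frac{1}{4}\int d^3x\, (F_{ij})^2 = \int d^3k \left[ -\frac{\vec k^2}{2}\,\vec\varepsilon(\vec k)\cdot\vec\varepsilon(-\vec k) \right] f_{\vec k}f_{-\vec k} \]

The dot product of the transverse polarizations includes the sum over the polarizations. The longitudinal polarization has dropped out explicitly.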
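The defining properties of the transverse projector $\delta_i{}^j - k_i k^j/\vec k^2$ used above are easy to confirm numerically. The following is a minimal sketch, not taken from the notes: the function names are hypothetical and the wave vector and trial polarization are arbitrary illustrative values. It checks that the projected polarization is orthogonal to $\vec k$, that projecting twice changes nothing, and that the trace equals 2, the number of transverse polarizations.

```python
def transverse_projector(k):
    """Return P_ij = delta_ij - k_i k_j / |k|^2 as a 3x3 nested list."""
    k2 = sum(c * c for c in k)
    if k2 == 0.0:
        raise ValueError("the projector is undefined for k = 0")
    return [[(1.0 if i == j else 0.0) - k[i] * k[j] / k2 for j in range(3)]
            for i in range(3)]


def apply(matrix, vec):
    """Matrix-vector product for 3x3 nested lists."""
    return [sum(matrix[i][j] * vec[j] for j in range(3)) for i in range(3)]


def dot(u, v):
    """Euclidean dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


k = [1.0, -2.0, 0.5]      # arbitrary non-zero wave vector
eps = [0.3, 0.4, -1.2]    # arbitrary trial polarization, not transverse
P = transverse_projector(k)
eps_T = apply(P, eps)     # transverse part of the trial polarization
eps_TT = apply(P, eps_T)  # projecting a second time

print("k . eps_T        =", dot(k, eps_T))
print("P eps_T - eps_T  =", [a - b for a, b in zip(eps_TT, eps_T)])
print("trace of P       =", sum(P[i][i] for i in range(3)))
```

The first two printed quantities vanish up to rounding and the trace is 2, consistent with the statement that only two independent polarizations survive.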
Making it explicit, we have,

\[ L_{Maxwell} = \int_{\mathbb{R}^3/2} d^3k \sum_{a,b=1}^{2} \left[ \vec\varepsilon(\vec k, a)\cdot\vec\varepsilon(-\vec k, b) \right]\left[ \dot f_{\vec k,a}\dot f_{-\vec k,b} - \omega_{\vec k}^2 f_{\vec k,a}f_{-\vec k,b} \right] \tag{7.7} \]

We may now choose the transverse polarizations such that $\vec\varepsilon(\vec k, a)\cdot\vec\varepsilon(-\vec k, b) = \delta_{a,b}$ (note the $\pm\vec k$ for the two polarizations). Using this, the Maxwell action becomes,

\[ S[A] = \int dt \int_{\mathbb{R}^3/2} d^3k \sum_{a=1,2} \left[ \dot f_{\vec k,a}\dot f_{-\vec k,a} - \omega_{\vec k}^2 f_{\vec k,a}f_{-\vec k,a} \right] \tag{7.8} \]
\[ = \int dt \int_{\mathbb{R}^3} d^3k\, \frac{1}{2}\sum_{a=1,2} \left[ \dot q_{\vec k,a}^2 - \omega_{\vec k}^2 q_{\vec k,a}^2 \right] \quad\text{for,} \tag{7.9} \]
\[ A_\mu(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k \sum_a \left[ \varepsilon_\mu(\vec k, a)f_{\vec k,a}(t)\varphi_{\vec k}(\vec x) + \varepsilon^*_\mu(\vec k, a)f^*_{\vec k,a}(t)\varphi^*_{\vec k}(\vec x) \right] \quad\text{with,} \tag{7.10} \]
\[ \varepsilon_0(\vec k, a)f_{\vec k,a}(t) = -i\,\frac{\vec k\cdot\vec\varepsilon(\vec k, a)}{\vec k^2}\dot f_{\vec k,a} , \qquad \varepsilon_i(\vec k, a) := \left( \delta_i{}^j - \frac{k_i k^j}{\vec k^2} \right)\varepsilon_j(\vec k, a) , \tag{7.11} \]
\[ \delta_{a,b} = \vec\varepsilon(\vec k, a)\cdot\vec\varepsilon^*(\vec k, b) \quad\text{(Normalization)} . \tag{7.12} \]

As in the case of the scalar, we have gone through introducing the real $q_{\vec k,a}$ degrees of freedom and expressed the Maxwell action too as a sum of twice as many harmonic oscillators. The canonical form goes through as well.

Note: If we were to begin with the Maxwell action and attempt to get its Hamiltonian formulation, we would obtain the Gauss Law equation of motion as a constraint equation. By using the Fourier decomposition, we have explicitly solved this constraint and eliminated the $\varepsilon_0$ polarization. Once this is done, we get the transverse projector which eliminates the longitudinal polarization. This gives the action involving only the physical degrees of freedom.

B. Dirac Field

Lastly, let us consider the Dirac action. Here the Lagrangian is first order in the derivatives and we do not expect a harmonic oscillator form. We will also find it clearer to write the Fourier decomposition using the positive half in $\vec k$ space.
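We take the Fourier decomposition of the spinor as,

\[ \psi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k \left\{ \left[ \sum_\sigma u(\vec k, \sigma)f_{\vec k,\sigma}(t) \right]\varphi_{\vec k}(\vec x) + \left[ \sum_\sigma v(\vec k, \sigma)g_{\vec k,\sigma}(t) \right]\varphi^*_{\vec k}(\vec x) \right\} \]

Substitution in the Dirac equation, keeping summation over $\sigma$ implicit, gives,

\[ 0 = -i\gamma^\mu\partial_\mu\psi + m\psi = \int d^3k \left[ \left\{ -i\gamma^0 u_{\vec k}\dot f_{\vec k} + (\vec k\cdot\vec\gamma + m)u_{\vec k}f_{\vec k} \right\}\varphi_{\vec k} + \left\{ -i\gamma^0 v_{\vec k}\dot g_{\vec k} + (-\vec k\cdot\vec\gamma + m)v_{\vec k}g_{\vec k} \right\}\varphi^*_{\vec k} \right] \]

Linear independence of the basis functions gives two equations:

\[ -i\gamma^0 u_{\vec k}\dot f_{\vec k} + (\vec k\cdot\vec\gamma + m)u_{\vec k}f_{\vec k} = 0 = -i\gamma^0 v_{\vec k}\dot g_{\vec k} + (-\vec k\cdot\vec\gamma + m)v_{\vec k}g_{\vec k} . \]

Differentiating w.r.t. time once more, multiplying by $-i\gamma^0$ on the left and using the above equations again leads to,

\[ -\sum_\sigma u_{\vec k,\sigma}\left\{ \ddot f_{\vec k,\sigma} + \omega_{\vec k}^2 f_{\vec k,\sigma} \right\} = 0 = -\sum_\sigma v_{\vec k,\sigma}\left\{ \ddot g_{\vec k,\sigma} + \omega_{\vec k}^2 g_{\vec k,\sigma} \right\} , \qquad \omega_{\vec k}^2 := \vec k^2 + m^2 . \]

Since $u_{\vec k,\sigma}, v_{\vec k,\sigma}$ are presumed linearly independent, each of the $f$ and the $g$ functions satisfies the same, familiar harmonic oscillator equation, with solutions $e^{\pm i\omega_{\vec k}t}$. Of course, one of the solutions for each of $f, g$ is spurious since the original equations are first order. Substituting $f_{\vec k,\sigma} \sim e^{-i\omega_{\vec k}t}$ and $g_{\vec k,\sigma} \sim e^{+i\omega_{\vec k}t}$ in the first order equations requires the $u, v$ spinors to satisfy: $(\slashed{k} + m)u_{\vec k,\sigma} = 0 = (\slashed{k} - m)v_{\vec k,\sigma}$. Choice of the other signs does not give a Lorentz covariant equation for the spinors. Hence we restrict to: $\dot f_{\vec k,\sigma} = -i\omega_{\vec k}f_{\vec k,\sigma}$, $\dot g_{\vec k,\sigma} = +i\omega_{\vec k}g_{\vec k,\sigma}$.

These equations satisfied by the spinors are analogous to the transversality conditions we got on the polarization vectors. These are consequences of the Dirac equation, i.e. they hold for solutions.

To compute the Lagrangian, we need the decomposition of the Dirac conjugate,

\[ \bar\psi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k \left[ \left\{ \sum_\sigma \bar u(\vec k, \sigma)f^*_{\vec k,\sigma}(t) \right\}\varphi^*_{\vec k}(\vec x) + \left\{ \sum_\sigma \bar v(\vec k, \sigma)g^*_{\vec k,\sigma}(t) \right\}\varphi_{\vec k}(\vec x) \right] \]

We will now choose the $u, v$ spinors to be solutions of the equations $(\slashed{k} + m)u(\vec k, \sigma) = 0 = (\slashed{k} - m)v(\vec k, \sigma)$ and express the Dirac action in terms of the $f, g$ functions alone.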
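\[ -i\gamma^\mu\partial_\mu\psi = \int_{\mathbb{R}^3/2} d^3k \left[ \left\{ -i\gamma^0 u(\vec k, \sigma)\dot f_{\vec k,\sigma}\varphi_{\vec k}(\vec x) - i\gamma^0 v(\vec k, \sigma)\dot g_{\vec k,\sigma}\varphi^*_{\vec k}(\vec x) \right\} + \left\{ k_i\gamma^i u(\vec k, \sigma)f_{\vec k,\sigma}\varphi_{\vec k}(\vec x) - k_i\gamma^i v(\vec k, \sigma)g_{\vec k,\sigma}\varphi^*_{\vec k}(\vec x) \right\} \right] \]
\[ \therefore\ (-i\gamma^\mu\partial_\mu + m)\psi = \int d^3k \left[ \left\{ -i\gamma^0 u(\vec k, \sigma)\dot f_{\vec k,\sigma} + k_i\gamma^i u(\vec k, \sigma)f_{\vec k,\sigma} + m\,u(\vec k, \sigma)f_{\vec k,\sigma} \right\}\varphi_{\vec k}(\vec x) + \left\{ -i\gamma^0 v(\vec k, \sigma)\dot g_{\vec k,\sigma} - k_i\gamma^i v(\vec k, \sigma)g_{\vec k,\sigma} + m\,v(\vec k, \sigma)g_{\vec k,\sigma} \right\}\varphi^*_{\vec k}(\vec x) \right] \]
\[ \therefore\ L = \int_{\Sigma_t} d^3x\, \bar\psi(-i\gamma^\mu\partial_\mu + m)\psi = \int_{\mathbb{R}^3/2} d^3k' \int_{\mathbb{R}^3/2} d^3k \int_{\Sigma_t} d^3x \left[ \left\{ \bar u(\vec k', \sigma')f^*_{\vec k',\sigma'}(t) \right\}\varphi^*_{\vec k'}(\vec x) + \left\{ \bar v(\vec k', \sigma')g^*_{\vec k',\sigma'}(t) \right\}\varphi_{\vec k'}(\vec x) \right] \times \left[ \left\{ -i\gamma^0 u(\vec k, \sigma)\dot f_{\vec k,\sigma} + k_i\gamma^i u(\vec k, \sigma)f_{\vec k,\sigma} + m\,u(\vec k, \sigma)f_{\vec k,\sigma} \right\}\varphi_{\vec k}(\vec x) + \left\{ -i\gamma^0 v(\vec k, \sigma)\dot g_{\vec k,\sigma} - k_i\gamma^i v(\vec k, \sigma)g_{\vec k,\sigma} + m\,v(\vec k, \sigma)g_{\vec k,\sigma} \right\}\varphi^*_{\vec k}(\vec x) \right] \]

Thanks to the integration domain of the $\vec k, \vec k'$ integrations, only the $\varphi^*_{\vec k'}\varphi_{\vec k}$ terms (and their conjugates) contribute, giving $\delta^3(\vec k - \vec k')$. Hence only $\bar u$ pairs with $u$ and $\bar v$ pairs with $v$. Using the equations satisfied by the spinors, we eliminate $(k_i\gamma^i + m)u = +\omega_{\vec k}\gamma^0 u$ and $(-k_i\gamma^i + m)v = -\omega_{\vec k}\gamma^0 v$. This leads to,

\[ L_{Dirac} = \int_{\mathbb{R}^3/2} d^3k \left\{ \sum_{\sigma,\sigma'} \left[ \bar u(\vec k, \sigma')\gamma^0 u(\vec k, \sigma) \right]\left[ f^*_{\vec k,\sigma'}\left( -i\dot f_{\vec k,\sigma} + \omega_{\vec k}f_{\vec k,\sigma} \right) \right] + \sum_{\sigma,\sigma'} \left[ \bar v(\vec k, \sigma')\gamma^0 v(\vec k, \sigma) \right]\left[ g^*_{\vec k,\sigma'}\left( -i\dot g_{\vec k,\sigma} - \omega_{\vec k}g_{\vec k,\sigma} \right) \right] \right\} \tag{7.13} \]

Choosing the normalization of the spinors so that $\delta_{\sigma,\sigma'} = \bar u(\vec k, \sigma')\gamma^0 u(\vec k, \sigma) = \pm\bar v(\vec k, \sigma')\gamma^0 v(\vec k, \sigma)$ gives the Dirac action as,

\[ S[\bar\psi, \psi] = \int dt \int_{\mathbb{R}^3/2} d^3k\, (-i)\sum_\sigma \left[ f^*_{\vec k,\sigma}\left( \dot f_{\vec k,\sigma} + i\omega_{\vec k}f_{\vec k,\sigma} \right) \pm g^*_{\vec k,\sigma}\left( \dot g_{\vec k,\sigma} - i\omega_{\vec k}g_{\vec k,\sigma} \right) \right] \tag{7.14} \]

We can group the $f, g$'s together to get a uniform Lagrangian with $\vec k \in \mathbb{R}^3$. Denote $f_{-\vec k,\sigma} := g^*_{\vec k,\sigma}\ \forall\ \vec k \in \mathbb{R}^3/2$. Do a partial integration of the $g^*_{\vec k,\sigma}\dot g_{\vec k,\sigma} = f_{-\vec k,\sigma}\dot f^*_{-\vec k,\sigma}$ term. This makes the relative signs the same in both the terms. Choosing the '$-$' sign in the normalization condition for the $v$-spinors allows us to combine the two terms and extend the integration domain to the full $\mathbb{R}^3$.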
Thus, we get,

\[ S_{Dirac}[\bar\psi, \psi] = \int dt \int_{\mathbb{R}^3} d^3k \sum_\sigma f^*_{\vec k,\sigma}\left( -i\dot f_{\vec k,\sigma} + \omega_{\vec k}f_{\vec k,\sigma} \right) \quad\text{for} \tag{7.15} \]
\[ \psi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k \sum_\sigma \left[ u(\vec k, \sigma)f_{\vec k,\sigma}(t)\varphi_{\vec k}(\vec x) + v(\vec k, \sigma)f^*_{-\vec k,\sigma}(t)\varphi^*_{\vec k}(\vec x) \right] \tag{7.16} \]
\[ 0 = (\slashed{k} + m)u(\vec k, \sigma) = (-\slashed{k} + m)v(\vec k, \sigma) \quad\text{normalized as,} \tag{7.17} \]
\[ \delta_{\sigma',\sigma} = \bar u(\vec k, \sigma')\gamma^0 u(\vec k, \sigma) = -\bar v(\vec k, \sigma')\gamma^0 v(\vec k, \sigma) . \tag{7.18} \]

Note that the action is real again.

An aside: It is an interesting exercise to express the action in terms of the real $q$ type variables as we did for the scalar. Since the action is manifestly a sum over variables with uncoupled labels $\sigma, \vec k$, we focus on just one of the terms. Thus consider the manifestly real Lagrangian, $L(f, f^*) := \frac{1}{2}\{ -if^*\dot f + i\dot f^* f \} + \omega f^* f$. Introduce the real variables $x, y$ as: $f := x + iy$, $\dot f = \dot x + i\dot y$. The Lagrangian then takes the form,

\[ L(x, y, \dot x, \dot y) = x\dot y - y\dot x + \omega(x^2 + y^2) \ \Rightarrow\ p_x = -y , \quad p_y = +x \]

Notice that the equations defining the conjugate momenta cannot be inverted to solve for the velocities - we have a constrained system [3]. This is not the place to discuss it in detail; the reference gives the necessary background. I will just list the steps.

$\phi := p_x + y$, $\chi := p_y - x$ (primary constraints);
$H_{canonical} := p_x\dot x + p_y\dot y - L = -\omega(x^2 + y^2)$; $H_{total} = -\omega(x^2 + y^2) + \lambda\phi + \mu\chi$;
Preservation of primary constraints $\Rightarrow H_{total} = -\omega(xp_y - yp_x)$ $\because$ $\lambda = \omega y$, $\mu = -\omega x$;
Dirac brackets: $\{x, p_x\}_* = \{y, p_y\}_* = \frac{1}{2}$, $\{x, y\}_* = \{p_x, p_y\}_* = -\frac{1}{2}$.
Use $H_{total} = -\omega(x^2 + y^2)$ and the Dirac brackets for the equations of motion: $\dot x = \omega y$, $\dot y = -\omega x$;
finally drop $p_x, p_y$ and set $x := \frac{p}{\sqrt 2}$, $y := \frac{q}{\sqrt 2}$:

\[ H_{total} = -\frac{1}{2}\omega(p^2 + q^2) , \quad \{q, p\} = 1 , \quad \dot p = \omega q , \quad \dot q = -\omega p . \]

Re-introducing the $(\vec k, \sigma)$ labels shows that the spinor field Hamiltonian too is a sum of harmonic oscillators for each of the $\vec k \in \mathbb{R}^3$ and $\sigma = \pm$ labels.

Puzzle: Why is the Hamiltonian of the wrong sign? Is it merely a choice of the overall sign in the Lagrangian density (which will not change the equation of motion)? What is the rationale for a choice of sign?
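The equations of motion obtained in the aside, $\dot x = \omega y$, $\dot y = -\omega x$, are equivalent to $\dot f = -i\omega f$ for $f = x + iy$, i.e. to the positive frequency solution $f \sim e^{-i\omega t}$ selected earlier. A minimal numerical sketch of this statement follows; it is an illustration rather than part of the notes, and the classical Runge-Kutta integrator, the function names and the parameter values are all assumptions made for the example.

```python
import cmath


def rhs(x, y, omega):
    """Right-hand side of the first order system x' = omega*y, y' = -omega*x."""
    return omega * y, -omega * x


def rk4_step(x, y, omega, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = rhs(x, y, omega)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, omega)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, omega)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, omega)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def evolve(x0, y0, omega, t_final, steps):
    """Integrate from t = 0 to t_final in the given number of equal steps."""
    dt = t_final / steps
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, omega, dt)
    return x, y


OMEGA, T_FINAL = 2.0, 5.0   # illustrative frequency and final time
x0, y0 = 0.8, -0.3          # illustrative initial data
x, y = evolve(x0, y0, OMEGA, T_FINAL, 2000)

# Expected closed form: f(t) = f(0) * exp(-i * omega * t)
f_exact = complex(x0, y0) * cmath.exp(-1j * OMEGA * T_FINAL)
print("deviation from f(0) e^{-i omega t}:", abs(complex(x, y) - f_exact))
print("change in x^2 + y^2:", x * x + y * y - (x0 * x0 + y0 * y0))
```

Both printed numbers are at the level of the integration error: the trajectory is the rotation $f(0)e^{-i\omega t}$, and $x^2 + y^2$, which is proportional to $H_{total}$, is conserved.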
We will not pursue this classical formulation further here.

Remarks: The scalar and the vector fields that we discussed have been real. This was reflected in the Fourier decomposition with complex conjugate coefficients. What if we have a complex field? Well, we can always write the Fourier decomposition as (see the spinorial case),

\[ \phi(t, \vec x) = \int_{\mathbb{R}^3/2} d^3k \left[ f_{\vec k}(t)\varphi_{\vec k}(\vec x) + g^*_{\vec k}(t)\varphi^*_{\vec k}(\vec x) \right] \]

For a real scalar field, the reality condition simply identifies $g^*_{\vec k}(t) = f_{-\vec k}(t)$ (equivalently $g_{\vec k} = f_{\vec k}$). For a complex field, there is no such identification. We will then have two harmonic oscillators for each $\vec k \in \mathbb{R}^3$. For the spinorial case, we had a constrained system and hence half the number of oscillators.

Remark: Consider the time-Fourier transform of a scalar field: $\phi(\omega, \vec x) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dt\, e^{-i\omega t}\phi(t, \vec x)$. The Fourier transform of its complex conjugate is given by: $\phi_c(\omega) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dt\, e^{-i\omega t}\phi^*(t, \vec x)$. Clearly, $\phi_c(\omega) = \phi^*(-\omega)$. A positive frequency $\phi(t, \vec x)$ has $\phi(\omega, \vec x) = 0$ for $\omega < 0$. This immediately also gives that $\phi^*(t, \vec x)$ is a negative frequency field, i.e. $\phi_c(\omega, \vec x) = 0$ for $\omega > 0$. This gets reflected in the mode decomposition.

While the Fourier decomposition displays the "degrees of freedom" of the field, the mode decomposition will turn out to be convenient for the passage to quantum fields. To appreciate this, consider a single oscillator (a single $\vec k$). The degrees of freedom view gives: $\ddot q + \omega^2 q = 0$, $p = \dot q$, $H = \frac{1}{2}(p^2 + \omega^2 q^2)$, $\{q, p\} = 1$. The mode decomposition gives ($a, a^*$ are constants),

\[ q(t) = a\,e^{-i\omega t} + a^* e^{i\omega t} \ \Rightarrow\ p(t) = -i\omega\left( a\,e^{-i\omega t} - a^* e^{i\omega t} \right) ; \quad\text{inverting gives,}\quad a = \frac{e^{i\omega t}}{2}\left( q + i\frac{p}{\omega} \right) , \quad a^* = \frac{e^{-i\omega t}}{2}\left( q - i\frac{p}{\omega} \right) \ \therefore\ \{a, a^*\} = -\frac{i}{2\omega} . \]

Rescale: $a \to \sqrt{2\omega}\,a$, $a^* \to \sqrt{2\omega}\,a^*$. In terms of the rescaled variables,

\[ \{a, a^*\} = -i , \qquad q(t) = \frac{e^{-i\omega t}}{\sqrt{2\omega}}a + \frac{e^{i\omega t}}{\sqrt{2\omega}}a^* . \]

Notice the normalization factor of $\sqrt{2\omega}$. This is correlated with the normalization of the Poisson bracket of $a, a^*$ (and of course has nothing to do with Lorentz covariance!). As per the usual canonical quantization procedure, $\{\ ,\ \} \to -\frac{i}{\hbar}[\ ,\ ]$ with $\hbar = 1$.
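This would lead to $[a, a^\dagger] = 1$. Keeping this in mind, we summarize the mode decompositions for the fields:

\[ \phi(x) = \int_{\mathbb{R}^3} \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \left[ a_{\vec k}e^{ik\cdot x} + b^*_{\vec k}e^{-ik\cdot x} \right] , \qquad k\cdot x := -\omega_{\vec k}t + \vec k\cdot\vec x . \quad\text{If } \phi^* = \phi \text{ then } b_{\vec k} = a_{\vec k} . \tag{7.19} \]

\[ A_\mu(x) = \int_{\mathbb{R}^3} \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_{\lambda=1,2} \left[ a_{\vec k,\lambda}\varepsilon_\mu(\vec k, \lambda)e^{ik\cdot x} + a^*_{\vec k,\lambda}\varepsilon^*_\mu(\vec k, \lambda)e^{-ik\cdot x} \right] \tag{7.20} \]
\[ \varepsilon_0(\vec k, \lambda) := -\frac{\vec k\cdot\vec\varepsilon(\vec k, \lambda)}{\omega_{\vec k}} , \quad \omega_{\vec k} = |\vec k| , \qquad \varepsilon_i(\vec k, \lambda) := \left( \delta_i{}^j - \frac{k_i k^j}{\vec k^2} \right)\varepsilon_j(\vec k, \lambda) , \qquad \vec\varepsilon(\vec k, \lambda)\cdot\vec\varepsilon^*(\vec k, \lambda') = \delta_{\lambda,\lambda'} \]

\[ \psi(x) = \int_{\mathbb{R}^3} \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_{\sigma=\pm} \left[ b_{\vec k,\sigma}u(\vec k, \sigma)e^{ik\cdot x} + d^*_{\vec k,\sigma}v(\vec k, \sigma)e^{-ik\cdot x} \right] \tag{7.21} \]
\[ (\slashed{k} + m)u(\vec k, \sigma) = 0 = (-\slashed{k} + m)v(\vec k, \sigma) , \qquad \bar u(\vec k, \sigma)u(\vec k, \sigma') = \delta_{\sigma,\sigma'} = -\bar v(\vec k, \sigma)v(\vec k, \sigma') \]
\[ \sum_\sigma u(\vec k, \sigma)\bar u(\vec k, \sigma) = \frac{-\slashed{k} + m}{2m} , \qquad \sum_\sigma v(\vec k, \sigma)\bar v(\vec k, \sigma) = -\frac{\slashed{k} + m}{2m} \]

For a real scalar field, we note,

\[ \text{NOTE:}\quad \{a_{\vec k}, a^*_{\vec k'}\} = -i\,\delta^3(\vec k - \vec k') \quad\leftrightarrow\quad \{\phi(t, \vec x), \pi(t, \vec x')\} = \delta^3(\vec x - \vec x') . \tag{7.22} \]

So far, whatever we have done has no quantum in it. We have obtained a classical theory of non-interacting fields.

C. Interaction with source: Green's functions

The simplest kind of 'interaction' we are familiar with, e.g. from classical electrodynamics, is interaction with a "source". So let us consider a real scalar field interacting with a source $J(x)$ which is a real function. The interaction is described by the equation (footnote 4: this follows from the Lagrangian $\mathcal{L} = \mathcal{L}_{scalar} + J(x)\phi(x)$ with $\mathcal{L}_{scalar}$ given in (6.5)):

\[ \underbrace{(\Box - m^2)\phi(x) = -J(x)}_{\text{Source}} \quad\leftrightarrow\quad \underbrace{(\Box_x - m^2)G(x, x') = -\delta^4(x - x')}_{\text{Green's function}} \quad\Rightarrow\quad \phi(x) = +\int d^4x'\, G(x, x')J(x') \]

Poincare invariance of the defining equation for the Green's function implies that it is a Lorentz invariant function of $(x - x')$. It can be represented as,

\[ G(x - x') = \int_{\mathbb{R}^4} \frac{d^4k}{(2\pi)^4}\, \frac{e^{ik\cdot(x - x')}}{k^2 + m^2} , \qquad k^2 = -(k^0)^2 + \vec k^2 . \tag{7.23} \]

The denominator vanishes for $k^2 = -m^2$ and thus we need a definition of the integral and hence of the Green's function. The usual method is to write $\int d^4k = \int d^3k \int dk^0$ and interpret the $k^0$ integral as a contour integral along a suitably chosen contour.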
Each choice is supposed to reflect a “boundary condition” for the wave operator. In the complex k 0 plane, there are poles p at k 0 = ± ω~ := ~k 2 + m2 . The integral is sought to be defined by shifting the poles off the k real axis. This can be done in 4 ways: (i) shift both poles in the Lower Half Plane (LHP), k 0 → k 0 + i, (ii) shift both poles in the UHP, k 0 → k 0 − i, (iii) shift the pole at +ω~k in the LHP and shift the pole at −ω~k in the UHP, ω~k2 → ω~k2 − i and (iv) reverse of the (iii). To see the consequences of any of these choices, let us consider the special case of a source localized at the space-time origin: J(t, x) = δ 4 (x). The solution takes the form, Z Z Z d4 k eik·x −i(−k0 t0 +~k·~x0 ) 0 3 0 0 3 0 e δ(t )δ (~x ) φ(x) = dt dx (2π)4 k 2 + m2 "Z # Z 0 dk 0 e−ik t d3 k i~k·~x e and, = (2π)3 2π (k 0 )2 − ωk2 Z 1 dk 0 1 1 0 [. . . ] = − 0 e−ik t 2ω~k 2π k0 − ω~k k + ω~k (7.24) Without the i’s, both sides of the above equations are real. Now we introduce the i’s. The contour integrals will have the segment of a semicircle at infinity whose contribution should vanish. On the semicircle at large R, k 0 = Reiθ , the integrand behaves as R−2 eRsin(θ)t and the measure provides factor of R. This suffices for convergence for t = 0. However, for a non-zero t, we can get an exponential divergence unless θ is restricted appropriately. This dictates how the contour should be closed in UHP or LHP. Since φ(x) is a function of the Lorentz invariant x2 = −t2 + ~x2 , we can consider the two cases as: (a) x2 ≤ 0 (sign of t is invariant) and we may evaluate the integrals for ~x = 0; (b) x2 > 0 and we can evaluate the integrals for t = 0. (i) Both poles to LHP: To pickup the residues, we should close the contour in the LHP. For space-like interval, taking t = 0, we see that the residues at the two poles cancel each 67 other and the solution vanishes. For time-like intervals, t 6= 0 but we can take ~x = 0. 
To have a vanishing contribution from the semicircle at infinity, we have to restrict $t > 0$. The evaluation of the integral is trivial and gives,

\[
[\,\dots]_{Retarded} = -\,\frac{1}{\omega_{\vec k}}\, \theta(t)\, \sin(\omega_{\vec k} t) .
\]

(ii) Both poles to UHP: To pick up the residues, the contour should be closed in the UHP and for a time-like interval, we need to restrict to $t < 0$. The evaluation gives,

\[
[\,\dots]_{Advanced} = \frac{1}{\omega_{\vec k}}\, \theta(-t)\, \sin(\omega_{\vec k} t) .
\]

As before, for a space-like interval, the solution vanishes.

(iii) Positive pole to LHP and negative to UHP: Now closing the contour in either of UHP/LHP will always pick up a contribution. For a time-like interval, for LHP closure, we have to take $t > 0$ and we will pick up the contribution from the first term. Likewise, for the UHP closure, we have to take $t < 0$ and we will pick up the contribution from the second term. Evaluation gives, for time-like or light-like separation,

\[
[\,\dots]_{Feynman} = -\,\frac{i}{2\omega_{\vec k}} \left[ \theta(t)\, e^{-i\omega_{\vec k} t} + \theta(-t)\, e^{+i\omega_{\vec k} t} \right] = -\,\frac{i}{2\omega_{\vec k}}\, e^{-i\omega_{\vec k}|t|} \left[ \theta(t) + \theta(-t) \right] .
\]

For a space-like interval, taking $t = 0$, we get the non-zero answer, $[\,\dots]_{Feynman} = -\,\frac{i}{2\omega_{\vec k}}$, for closure in either LHP or UHP.

The case (iv) is similar to (iii) and is obtained in an obvious manner. The solution $\phi(x)$ is obtained by the spatial Fourier transform as given above (7.24).

Remarks: There are two properties of the Retarded and the Advanced choices that stand out. In both cases, the solution $\phi(x)$ (a) is real for a real source function and (b) reflects the expected causal behavior - vanishing outside the future/past light cones. By contrast, the Feynman choice gives a non-real $\phi(x)$ even for a real source function and the solution is non-vanishing outside the light cones! The same holds for the choice (iv).

What do we make of this? The retarded and advanced choices support the interpretation that the field can be regarded as an observable responding to a source in a causally consistent manner. This is what we would have in a classical field theory.
The Feynman choice however disallows such an interpretation - the particular solution $\phi(x)$ cannot be interpreted as a causally consistent response to a source.

Both the causal and the Feynman Green's functions appear naturally when $\phi(x)$ is promoted to be an operator field, i.e. in Quantum Field Theory. In the scattering theory, section 10, the causal Green's functions are used in articulating the asymptotic conditions while the Feynman Green's function (or Feynman propagator) appears naturally in the Lehmann-Symanzik-Zimmermann (LSZ) reduction of the scattering matrix.

For completeness, we note the full Green's functions$^{5}$:

\[
G_{ret}(x) = +\frac{\theta(t)}{2\pi}\,\delta(x^2) - \theta(t)\,\theta(-x^2)\, \frac{m\, J_1(m\sqrt{-x^2})}{4\pi\sqrt{-x^2}} \tag{7.25}
\]
\[
G_{adv}(x) = +\frac{\theta(-t)}{2\pi}\,\delta(x^2) - \theta(-t)\,\theta(-x^2)\, \frac{m\, J_1(m\sqrt{-x^2})}{4\pi\sqrt{-x^2}} \tag{7.26}
\]
\[
\begin{aligned}
G_{Feynman}(x^2 < 0) &= +\frac{\delta(x^2)}{4\pi} - \frac{m}{8\pi\sqrt{-x^2}}\, H_1^{(2)}(m\sqrt{-x^2}) \\
G_{Feynman}(x^2 > 0) &= +\frac{i\, m}{4\pi^2\sqrt{x^2}}\, K_1(m\sqrt{x^2})
\end{aligned} \tag{7.27}
\]

$^{5}$ The overall signs are convention dependent. It is easiest to match them for the massless case. For details see: [4, 5].

8. COVARIANCE OF QUANTUM FIELDS AND RELATIVISTIC CAUSALITY

So far we have studied representations of the Poincare group and seen the emergence of a "field dynamical system" - an infinite collection of harmonic oscillators. In our attempt to incorporate interactions with classical sources, we encountered the Green's functions. We met the expected and familiar retarded and advanced Green's functions, but also encountered the mathematical possibility of the Feynman alternative which does not gel with a classical field theory interpretation. Taking it as a hint of an opportunity, we attempt a 'quantum' interpretation. Here, the field as a collection of harmonic oscillators gives a strong clue: use the mode decomposition and promote the expansion coefficients to operators. We will then have a collection of creation and annihilation operators and can tag the states of the system by the number of quanta.
We already saw in (7.22), in the simpler case of a real scalar field, that by postulating Poisson brackets for the coefficients and using the completeness of the mode functions, we can deduce the Poisson brackets for the fields themselves. The same method works for operators as well. That is, not only do we promote the fields to operators but we also postulate the necessary commutation relations.

A. Poincare Covariance of Quantum fields

The first feature, fields as operators, immediately modifies the implementation of covariance: a covariant quantum field requires specific transformations for the coefficient operators and conversely. Transformation laws of the mode functions do not suffice. To see this, consider a linear combination of classical fields, $\psi^A(x) := \sum_n c_n \psi_n^A(x)$. A group action on $\psi^A$ is deduced from that on the individual $\psi_n^A$:

\[
g\cdot\psi_n^A(x) = D^A_{\ B}(g)\, \psi_n^B(g^{-1}x) , \qquad
g\cdot\psi^A(x) := \sum_n c_n\; g\cdot\psi_n^A(x) = \sum_n c_n\, D^A_{\ B}(g)\, \psi_n^B(g^{-1}x) = D^A_{\ B}(g)\, \psi^B(g^{-1}x) .
\]

This of course presumes, quite naturally, that the coefficients are invariant under the group action. When a similar linear combination is promoted to an operator by making the $c_n$ operators, the above procedure breaks down:

\[
U(g)\, \hat\psi^A(x)\, U^\dagger(g) := \sum_n U(g)\, \hat c_n\, U(g)^\dagger\, \psi_n^A(x) \overset{?}{=} D^A_{\ B}(g^{-1})\, \hat\psi^B(gx) .
\]

Evidently, if we assume the $\hat c_n$ to be invariant, then the field cannot be covariant. So the $\hat c_n$ should transform in a manner that shifts the group action on to the $\psi_n^A(x)$ and then works out correctly so as to produce the expected right hand side$^{6}$. Taking now $c_n \to \hat a, \hat a^\dagger$, $\sum_n \to \int d^3k \sum \dots$ and using the mode expansion form, we observe that if we postulate the transformation laws for the $\hat a, \hat a^\dagger$ as,

\[
U(g)\, \hat a^\dagger(\vec k, \sigma)\, U^\dagger(g) := \hat a^\dagger(\vec{\Lambda k}, \text{``}D(\Lambda)\text{''}\sigma) , \qquad
U(g)\, \hat a(\vec k, \sigma)\, U^\dagger(g) := \hat a(\vec{\Lambda k}, \text{``}D(\Lambda)\text{''}\sigma) ,
\]

then we can shift the changed labels from the operators to the mode function labels using the change of integration (and summation) variables.
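Provided the integration measure is invariant, the right hand side takes the expected form and we have the covariance of the quantum field. Note that this essentially fixes the form of the mode decomposition! In equations (for a Dirac field say),

\[
\begin{aligned}
\left[U(g)\, \hat\psi\, U^\dagger(g)\right](x)
&:= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left[ (U \hat b_{\vec k,\sigma} U^\dagger)\, u(\vec k,\sigma)\, e^{ik\cdot x} + (U \hat d^\dagger_{\vec k,\sigma} U^\dagger)\, v(\vec k,\sigma)\, e^{-ik\cdot x} \right] \\
&= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left[ \hat b_{\vec{\Lambda k},\sigma_\Lambda}\, u(\vec k,\sigma)\, e^{ik\cdot x} + \hat d^\dagger_{\vec{\Lambda k},\sigma_\Lambda}\, v(\vec k,\sigma)\, e^{-ik\cdot x} \right] \\
&= \int \frac{d^3(\Lambda^{-1}k)}{\sqrt{2\omega_{\vec{\Lambda^{-1}k}}(2\pi)^3}} \sum_{\sigma_{\Lambda^{-1}}} \left[ \hat b_{\vec k,\sigma}\, u(\vec{\Lambda^{-1}k},\sigma_{\Lambda^{-1}})\, e^{i(\Lambda^{-1}k)\cdot x} + \hat d^\dagger_{\vec k,\sigma}\, v(\vec{\Lambda^{-1}k},\sigma_{\Lambda^{-1}})\, e^{-i(\Lambda^{-1}k)\cdot x} \right] \\
&= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left[ \hat b_{\vec k,\sigma} \left\{ D(\Lambda^{-1})\, u(\vec k,\sigma) \right\} e^{ik\cdot(\Lambda x)} + \hat d^\dagger_{\vec k,\sigma} \left\{ D(\Lambda^{-1})\, v(\vec k,\sigma) \right\} e^{-ik\cdot(\Lambda x)} \right] \\
&= D(\Lambda^{-1})\, \hat\psi(\Lambda x)
\end{aligned}
\]

Note: If we postulate a Poincare invariant state $|0\rangle$ and define $|\vec k, \sigma\rangle := b^\dagger(\vec k, \sigma)|0\rangle$, then these states transform precisely as per the particle representation. Thus the postulated group actions on the $b, d$ operators are precisely as needed for building up the particle representations and we will use them shortly.

$^{6}$ The homomorphism dictates the expected form on the r.h.s.

B. Space Inversion, Time Reversal, Charge Conjugation of Dirac Field

Let us quickly see how the discrete symmetries act on quantum fields. Again, we will consider a Dirac field for illustration and now suppress the 'hats' on the operators.

Space Inversion: We want $\Psi_P(t, \vec x) := \mathcal{P}\Psi\mathcal{P}^\dagger = D(P)\Psi(t, -\vec x)$. From the defining relations for $\mathcal{P}$, namely, $\mathcal{P} P_i \mathcal{P}^\dagger = -P_i$ and $\mathcal{P} J_i \mathcal{P}^\dagger = J_i$, we postulate,

\[
\mathcal{P}\, b^\dagger(\vec k,\sigma)\, \mathcal{P}^\dagger := \eta\, b^\dagger(-\vec k,\sigma) , \qquad \mathcal{P}\, d^\dagger(\vec k,\sigma)\, \mathcal{P}^\dagger := \eta\, d^\dagger(-\vec k,\sigma) , \qquad \eta^2 = \pm 1 .
\]

As noted while discussing the discrete symmetry actions on the particle representations, the phase is $\sigma$ independent. Substituting the mode expansion and using these postulated actions gives,

\[
\mathcal{P}\, \Psi(x)\, \mathcal{P}^\dagger = \int [d^3k]_{inv} \sum_\sigma \left\{ \eta^*\, b(\vec k,\sigma)\, u_{-\vec k,\sigma}\, e^{ik\cdot(Px)} + \eta\, d^\dagger(\vec k,\sigma)\, v_{-\vec k,\sigma}\, e^{-ik\cdot(Px)} \right\}
\]

Claim: The $u, v$ spinors satisfy, $u(-\vec k, \sigma) = \gamma^0 u(\vec k, \sigma)$ and $v(-\vec k, \sigma) = -\gamma^0 v(\vec k, \sigma)$.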
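This follows by noting that

\[
\begin{aligned}
0 &= (\not{k} + m)\, u(\vec k,\sigma) = (-\omega\gamma^0 + \vec k\cdot\vec\gamma + m)\, u(\vec k,\sigma) , \qquad u(\vec k,\sigma) = (\gamma^0)^2\, u(\vec k,\sigma) \\
\therefore\ 0 &= (-\omega\gamma^0\gamma^0 + \vec k\cdot\vec\gamma\,\gamma^0 + m\gamma^0)\,(\gamma^0 u(\vec k,\sigma)) = \gamma^0\,(-\omega\gamma^0 + (-\vec k)\cdot\vec\gamma + m)\,(\gamma^0 u(\vec k,\sigma)) \\
&\Rightarrow\ \gamma^0 u(\vec k,\sigma) \propto u(-\vec k,\sigma)
\end{aligned}
\]

Similarly, $\gamma^0 v(\vec k,\sigma) \propto v(-\vec k,\sigma)$. The proportionality factors are fixed by noting that for $\vec k = \hat k$ (the rest frame), the $u, v$ spinors are eigen-spinors of $\gamma^0$ with eigenvalues $\pm 1$. This implies,

\[
\mathcal{P}\, \Psi(x)\, \mathcal{P}^\dagger = \gamma^0 \int [d^3k]_{inv} \sum_\sigma \left\{ \eta^*\, b(\vec k,\sigma)\, u_{\vec k,\sigma}\, e^{ik\cdot(Px)} - \eta\, d^\dagger(\vec k,\sigma)\, v_{\vec k,\sigma}\, e^{-ik\cdot(Px)} \right\}
\]

Clearly $\eta$ should be imaginary for the right hand side to have the field $\Psi(Px)$. We choose $\eta = -i$ and deduce that,

\[
\mathcal{P}\, \Psi(x)\, \mathcal{P}^\dagger = i\gamma^0\, \Psi(Px) , \quad \text{i.e.,}\ D(P) = i\gamma^0 \ \text{and}\ \eta^2 = -1 .
\]

Time reversal: We want $\Psi_T(t, \vec x) := \mathcal{T}\Psi\mathcal{T}^\dagger = D(T)\Psi(-t, \vec x)$. From the defining relations, where both $P_i$ and $J_i$ change sign, we postulate,

\[
\mathcal{T}\, b^\dagger(\vec k,\sigma)\, \mathcal{T}^\dagger = \xi_\sigma\, b^\dagger(-\vec k,-\sigma) , \qquad \mathcal{T}\, d^\dagger(\vec k,\sigma)\, \mathcal{T}^\dagger = \xi_\sigma\, d^\dagger(-\vec k,-\sigma) .
\]

As noted while discussing the discrete symmetry actions on the particle representations, the phase is $\sigma$ dependent.

Substituting the mode decomposition and noting that the $\mathcal{T}$ operator is anti-unitary, we get,

\[
\begin{aligned}
[\mathcal{T}\Psi\mathcal{T}^\dagger](x) &= \int [d^3k]_{inv} \sum_\sigma \left\{ \xi^*_\sigma\, b(-\vec k,-\sigma)\, u^*(\vec k,\sigma)\, e^{-ik\cdot x} + \xi_\sigma\, d^\dagger(-\vec k,-\sigma)\, v^*(\vec k,\sigma)\, e^{ik\cdot x} \right\} \\
&= \int [d^3k]_{inv} \sum_\sigma \left\{ \xi^*_{-\sigma}\, b(\vec k,\sigma)\, u^*(-\vec k,-\sigma)\, e^{ik\cdot(Tx)} + \xi_{-\sigma}\, d^\dagger(\vec k,\sigma)\, v^*(-\vec k,-\sigma)\, e^{-ik\cdot(Tx)} \right\}
\end{aligned}
\]

where in the second line we relabelled $\vec k \to -\vec k$, $\sigma \to -\sigma$ and used $(\omega, -\vec k)\cdot x = -k\cdot(Tx)$ with $Tx = (-t, \vec x)$. Now we need to use the relations,

\[
u^*(-\vec k,-\sigma) = -\sigma\, C\gamma_5\, u(\vec k,\sigma) , \qquad v^*(-\vec k,-\sigma) = -\sigma\, C\gamma_5\, v(\vec k,\sigma) .
\]

Once again these are proved from the defining equations for the spinors. Clearly, if we choose $\xi_\sigma = \sigma\ (= \pm 1)$, then we get $\Psi(Tx)$ on the right hand side and we get,

\[
\mathcal{T}\, \Psi(x)\, \mathcal{T}^\dagger = C\gamma_5\, \Psi(Tx) , \quad \text{i.e.}\ D(T) = C\gamma_5 , \qquad C\gamma^\mu C^\dagger = -(\gamma^\mu)^T .
\]

Charge conjugation: Lastly, consider the charge conjugation action. This is defined through the particle-antiparticle exchange and we postulate,

\[
\mathcal{C}\, b^\dagger_{\vec k,\sigma}\, \mathcal{C}^\dagger = \xi\, d^\dagger_{\vec k,\sigma} , \qquad \mathcal{C}\, d^\dagger_{\vec k,\sigma}\, \mathcal{C}^\dagger = \xi\, b^\dagger_{\vec k,\sigma} . \tag{8.1}
\]

Substitution of the mode expansion gives,

\[
[\mathcal{C}\Psi\mathcal{C}^\dagger](x) = \int [d^3k]_{inv} \sum_\sigma \left\{ \xi^*\, d_{\vec k,\sigma}\, u(\vec k,\sigma)\, e^{ik\cdot x} + \xi\, b^\dagger_{\vec k,\sigma}\, v(\vec k,\sigma)\, e^{-ik\cdot x} \right\}
\]

Now we need to use: $v(\vec k,\sigma) = C^\dagger \bar u^T(\vec k,\sigma)$ and recall that the $C$ matrix is given by $C = i\gamma^2\gamma^0$, $C^T = -C$, $C^\dagger C = 1$ so that $v = -C\bar u^T$ and $u = -C\bar v^T$. This gives,

\[
[\mathcal{C}\Psi\mathcal{C}^\dagger](x) = \int [d^3k]_{inv} \sum_\sigma \left\{ \xi^*\, d_{\vec k,\sigma}\, (-C\bar v^T(\vec k,\sigma))\, e^{ik\cdot x} + \xi\, b^\dagger_{\vec k,\sigma}\, (-C\bar u^T(\vec k,\sigma))\, e^{-ik\cdot x} \right\}
\]

and,

\[
\bar\Psi^T(x) = \int [d^3k]_{inv} \sum_\sigma \left\{ b^\dagger_{\vec k,\sigma}\, \bar u^T(\vec k,\sigma)\, e^{-ik\cdot x} + d_{\vec k,\sigma}\, \bar v^T(\vec k,\sigma)\, e^{ik\cdot x} \right\}
\]

Hence, if we choose $\xi^* = \xi = -1$, then we get, $\mathcal{C}\, \Psi(x)\, \mathcal{C}^\dagger = +C(\bar\Psi)^T(x)$. Note that the $\mathcal{C}$ operator is unitary!

From these it is easy to get the transformations of the Dirac conjugates. Here is a summary:

\[
\mathcal{P}\Psi\mathcal{P}^\dagger(x) = i\gamma^0\, \Psi(Px) , \qquad \mathcal{P}\bar\Psi\mathcal{P}^\dagger(x) = -i\, \bar\Psi(Px)\, \gamma^0 ; \tag{8.2}
\]
\[
\mathcal{T}\Psi\mathcal{T}^\dagger(x) = C\gamma_5\, \Psi(Tx) , \qquad \mathcal{T}\bar\Psi\mathcal{T}^\dagger(x) = \bar\Psi(Tx)\, \gamma_5 C^\dagger ; \tag{8.3}
\]
\[
\mathcal{C}\Psi\mathcal{C}^\dagger(x) = C\, \bar\Psi^T(x) , \qquad \mathcal{C}\bar\Psi\mathcal{C}^\dagger(x) = \Psi^T(x)\, C ; \tag{8.4}
\]
\[
C\gamma^\mu C^\dagger = -(\gamma^\mu)^T , \quad C^\dagger C = 1 , \quad C^T = -C , \quad C := i\gamma^2\gamma^0 \tag{8.5}
\]

We are now ready to verify one of the important general theorems of relativistic quantum field theory, the CPT theorem. We verify it for the spinorial quantum field.

C. CPT theorem for Dirac Field

We want to consider the combined action of the discrete transformations. Let us choose one ordering, say, $\mathcal{CPT}$. Then,

\[
\begin{aligned}
(\mathcal{CPT})\, \Psi(x)\, (\mathcal{CPT})^\dagger &= \mathcal{CP}\, (C\gamma_5 \Psi(Tx))\, \mathcal{P}^\dagger \mathcal{C}^\dagger \\
&= C\gamma_5\; \mathcal{C}\, (i\gamma^0 \Psi(PTx))\, \mathcal{C}^\dagger \\
&= i\, C\gamma_5\gamma^0 C\, \bar\Psi^T(-x) \qquad \because\ PTx = -x \\
(\mathcal{CPT})\, \bar\Psi(x)\, (\mathcal{CPT})^\dagger &= \mathcal{CP}\, (\bar\Psi(Tx)\gamma_5 C^\dagger)\, \mathcal{P}^\dagger \mathcal{C}^\dagger \\
&= \mathcal{C}\, (\bar\Psi(PTx)\, (-i\gamma^0)\gamma_5 C^\dagger)\, \mathcal{C}^\dagger \\
&= i\, \Psi^T(-x)\, C\gamma^0\gamma_5 C \qquad \because\ C^\dagger = -C
\end{aligned}
\]

Next, consider the CPT transform of a general bilinear of the form $\bar\Psi(x) A \Psi(x)$. Using $(\mathcal{CPT})^\dagger(\mathcal{CPT}) = 1$, the CPT transform will be the product of the CPT transform of the Dirac conjugate, that of the matrix $A$ and that of $\Psi$. The CPT transform of $A$ is just $A^*$ since only the complex conjugation acts on $A$.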
\[
\begin{aligned}
&(\mathcal{CPT})\bar\Psi(x)(\mathcal{CPT})^\dagger\; (\mathcal{CPT}) A (\mathcal{CPT})^\dagger\; (\mathcal{CPT})\Psi(x)(\mathcal{CPT})^\dagger \\
&\quad = \left[ i\, \Psi^T(-x)\, C\gamma^0\gamma_5 C \right] [A^*] \left[ i\, C\gamma_5\gamma^0 C\, \bar\Psi^T(-x) \right] \\
&\quad = i^2 \left[ \bar\Psi(-x)\, (C\gamma_5\gamma^0 C)^T A^\dagger (C\gamma^0\gamma_5 C)^T\, \Psi(-x) \right] \times (-1) \\
&\quad = +\bar\Psi(-x) \left[ (C\gamma_5\gamma^0 C)^T A^\dagger (C\gamma^0\gamma_5 C)^T \right] \Psi(-x)
\end{aligned} \tag{8.6}
\]

The explicit $(-1)$ in the third line is due to the fermionic nature of the spinors, which we will see shortly in the discussion of the spin-statistics theorem. The factors of the gamma matrices simplify as, $C\gamma_5\gamma^0 C = +\gamma_5 C\gamma^0 C = -\gamma_5 C\gamma^0 C^\dagger = +\gamma_5(\gamma^0)^T = \gamma_5\gamma^0$ and $C\gamma^0\gamma_5 C = \gamma^0\gamma_5$.

\[
(\mathcal{CPT}) \left[ \bar\Psi(x) A \Psi(x) \right] (\mathcal{CPT})^\dagger = \bar\Psi(-x) \left[ (\gamma^0\gamma_5)\, A^\dagger\, (\gamma_5\gamma^0) \right] \Psi(-x) \tag{8.7}
\]

It remains to evaluate the middle braces for $A = 1, \gamma^\mu, \gamma^\mu\gamma^\nu, \gamma^\mu\gamma_5$ and $\gamma_5$. These are obtained easily as,

\[
\begin{aligned}
\gamma^0\gamma_5\; 1\; \gamma_5\gamma^0 &= 1 \\
\gamma^0\gamma_5\; \gamma^\mu\; \gamma_5\gamma^0 &= -\gamma^\mu \\
\gamma^0\gamma_5\; \Sigma^{\mu\nu}\; \gamma_5\gamma^0 &= +\Sigma^{\mu\nu} \\
\gamma^0\gamma_5\; \gamma^\mu\gamma_5\; \gamma_5\gamma^0 &= -\gamma^\mu\gamma_5 \\
\gamma^0\gamma_5\; (i\gamma_5)\; \gamma_5\gamma^0 &= \gamma_5
\end{aligned}
\]

The Lagrangian density is made up of the bi-linears and also involves derivatives. For a $\partial_\mu\Psi(x)$, there is an extra minus sign due to $PTx = -x$. Thus we see that every tensor index, on $\gamma$ or on $\partial$, gives a minus sign under the CPT transform. For a Hermitian Lagrangian (coefficients are appropriately real or pure imaginary) which is a Lorentz scalar (so an even rank tensor), CPT contributes no extra sign and only changes the space-time argument, i.e.

\[
(\mathcal{CPT})\, \mathcal{L}(x)\, (\mathcal{CPT})^\dagger = \mathcal{L}(-x) .
\]

Thus, a Hermitian action, invariant under proper and orthochronous Lorentz transformations, is invariant under the discrete CPT transformations. We have explicitly verified it for the quantum Dirac field. Verification for the scalar and Maxwell fields is left as an exercise.

Note: Had we not used the extra minus sign attributed to the fermionic nature of the fields, we would have got the CPT transform of the Lagrangian to be minus of $\mathcal{L}(-x)$ and got CPT non-invariance of the action! This minus sign is tightly connected with the spin-statistics theorem which we discuss next.

D. Relativistic Causality and the Spin-Statistics Theorem

Now we come to the second feature of quantum fields - the commutation relations.
Here we get another surprise. The Pauli exclusion principle follows from the requirement of relativistic causality!

The Fourier decomposition suggested the field as a collection of harmonic oscillators. A natural quantization procedure is to postulate $[a_{\vec k}, a^\dagger_{\vec k'}] = \delta^3(\vec k - \vec k')$. This has a potential problem for $\vec k = \vec k'$. The usual way this is interpreted is to imagine the field being confined to a large box satisfying periodic or Dirichlet boundary conditions. This discretizes the momentum labels, $\vec k \to \vec k_n \sim 2\pi\vec n/L$, and also replaces the delta function by a Kronecker delta. Another way is to choose a suitable, countable orthonormal set of functions $\{\varphi_n(\vec x)\}$ such that $\nabla^2\varphi_n = -\omega_n^2\varphi_n$. Completeness of the $\varphi_n$'s allows translating $[a_m, a^\dagger_n]$ to $[\Phi(t,\vec x), \Pi(t,\vec y)] = i\,\delta^3(\vec x - \vec y)$. We will implicitly assume such a procedure and proceed formally using the delta function.

From the postulated commutation relations we have, for each $\vec k$,

\[
\begin{aligned}
& N_{\vec k} := a^\dagger_{\vec k} a_{\vec k} , \qquad [N_{\vec k}, a_{\vec k}] = -a_{\vec k} , \qquad [N_{\vec k}, a^\dagger_{\vec k}] = a^\dagger_{\vec k} , \\
& N_{\vec k}\, |n_{\vec k}\rangle = n_{\vec k}\, |n_{\vec k}\rangle , \qquad n_{\vec k} = 0, 1, \dots ; \qquad \langle m_{\vec k}|n_{\vec k}\rangle = \delta_{m_{\vec k}, n_{\vec k}} , \\
& |n_{\vec k}\rangle := \frac{(a^\dagger_{\vec k})^{n_{\vec k}}}{\sqrt{(n_{\vec k})!}}\, |0_{\vec k}\rangle , \qquad a_{\vec k}\, |n_{\vec k}\rangle = \sqrt{n_{\vec k}}\; |n_{\vec k} - 1\rangle , \qquad a^\dagger_{\vec k}\, |n_{\vec k}\rangle = \sqrt{n_{\vec k} + 1}\; |n_{\vec k} + 1\rangle .
\end{aligned}
\]

The state label $n_{\vec k}$ is interpreted as denoting the number of quanta, created by $a^\dagger_{\vec k}$ and destroyed by $a_{\vec k}$.

Note: In this case of a scalar field, all quanta have the same mass parameter, but different $\vec k$ labels. For a vector field, we have the polarization label $\lambda$ and for the spinor field we have the spin projection label $\sigma$. All quanta of a given mass $m$ and spin/helicity are identical.

Now we add a postulate of relativistic causality [1], also referred to as micro-causality, namely, all pairs of fields and their adjoints commute or anti-commute for space-like separation. This requirement comes about as follows. In a Lorentz invariant theory, all observables would be Lorentz tensors of various ranks. Spinors being double valued quantities, they themselves are not observables.
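Spinorial fields must come in even powers (single valued) in an observable. The general principle of quantum theory that simultaneously measurable observables must commute, together with the notion of relativistic causality that simultaneously measurable observables must be space-like separated, suggests that for space-time dependent operators to be observables, they must commute for space-like separation. Since operators built out of spinorial fields are at least quadratic in the spinorial fields, this allows possible anti-commutation rules for the spinor fields themselves$^{7}$.

$^{7}$ Weinberg bases his argument in favor of relativistic causality by invoking Lorentz invariance of the scattering matrix.

All our mode decompositions have annihilation operators as coefficients of the positive frequency modes and the creation operators as coefficients of the negative frequency modes. Let us denote these parts as $\Psi_+(x)$ and $\Psi_-(x)$ respectively. Consider a general linear combination of these fields: $\Psi(x) := \alpha\Psi_+(x) + \beta\Psi_-(x)$ and let $\Psi^\dagger(x)$ denote its adjoint. Let all observables be built out of $\Psi(x)$ and $\Psi^\dagger(x)$. Let $[A, B]_\pm := AB \pm BA$, the anti-commutator and commutator respectively. The requirement of micro-causality is then stated as:

\[
[\Psi(x), \Psi(y)]_\pm = 0 = [\Psi(x), \Psi^\dagger(y)]_\pm \quad \text{for}\ (x - y)^2 > 0 .
\]

This requirement is quite restrictive, as seen below.

Let us begin with a real scalar field. We have,

\[
\phi_+(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\; a_{\vec k}\, e^{ik\cdot x} , \qquad
\phi_-(x) := \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}}\; a^\dagger_{\vec k}\, e^{-ik\cdot x} = \phi_+^\dagger(x) .
\]
\[
\therefore\ [\phi_+(x), \phi_-(y)]_\pm = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int \frac{d^3k'}{\sqrt{2\omega_{\vec k'}(2\pi)^3}}\; e^{ik\cdot x - ik'\cdot y}\; [a_{\vec k}, a^\dagger_{\vec k'}]_\pm
\]

For the scalar field, we use the commutator given above. Let us assume the same form for the anti-commutator, i.e. we assume, $[a_{\vec k}, a^\dagger_{\vec k'}]_\pm = \delta^3(\vec k - \vec k')$. This gives,

\[
[\phi_+(x), \phi_-(y)]_\pm = \int \frac{d^3k}{2\omega_{\vec k}(2\pi)^3}\; e^{ik\cdot(x-y)} =: \Delta_+(x - y) ,
\]

which is Poincare invariant. For space-like separation, choose $(x - y)^2 = (\vec x - \vec y)^2 =: r^2$. Thus,

\[
\begin{aligned}
\Delta_+(x^2) &= \frac{1}{(2\pi)^3} \int_0^\infty \frac{k^2\, dk}{2\sqrt{k^2 + m^2}} \int_{-1}^{1} d\cos(\theta) \int_0^{2\pi} d\varphi\; e^{ikr\cos\theta} \\
&= \frac{1}{4\pi^2} \int_0^\infty dk\, \frac{k^2}{\sqrt{k^2 + m^2}}\, \frac{\sin(kr)}{kr} \qquad \text{put}\ k = m\alpha , \\
&= \frac{1}{4\pi^2}\, \frac{m}{r} \int_0^\infty d\alpha\, \frac{\alpha}{\sqrt{\alpha^2 + 1}}\, \sin(mr\alpha) \\
&= -\frac{1}{4\pi^2}\, \frac{1}{r}\, \frac{\partial}{\partial r} \int_0^\infty d\alpha\, \frac{\cos(mr\alpha)}{\sqrt{\alpha^2 + 1}}
\end{aligned}
\]

The last integral is $K_0(mr)$ and $-\partial_r K_0(mr) = m K_1(mr)$. Thus, $\Delta_+(x^2 > 0) = \frac{m}{4\pi^2\sqrt{x^2}}\, K_1(m\sqrt{x^2})$ ($K_1$ is the modified Bessel function of the second kind), and it is non-zero for space-like separation.

Define $\phi(x) := \alpha\phi_+(x) + \beta\phi_-(x)$, $\phi^\dagger(y) = \alpha^*\phi_-(y) + \beta^*\phi_+(y)$. It follows,

\[
\begin{aligned}
[\phi(x), \phi(y)]_\pm &= \alpha\beta\, [\phi_+(x), \phi_-(y)]_\pm + \beta\alpha\, [\phi_-(x), \phi_+(y)]_\pm \\
&= \alpha\beta \left( \Delta_+(x - y) \pm \Delta_+(y - x) \right) = \alpha\beta\, (1 \pm 1)\, \Delta_+(x - y) \qquad \text{since}\ \Delta_+\ \text{is symmetric in}\ x, y . \\
[\phi(x), \phi^\dagger(y)]_\pm &= |\alpha|^2\, [\phi_+(x), \phi_-(y)]_\pm + |\beta|^2\, [\phi_-(x), \phi_+(y)]_\pm \\
&= (|\alpha|^2 \pm |\beta|^2)\, \Delta_+(x - y)
\end{aligned}
\]

To satisfy the requirement of causality, both the brackets must vanish, i.e. in both equations we must choose the minus sign (a commutator) and also impose the condition $|\alpha| = |\beta|$. This is quite a strong restriction on both the sign as well as the coefficients $\alpha, \beta$.

Let us consider the spinor field now. Let,

\[
\begin{aligned}
\psi_+(x) &:= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left( u(\vec k,\sigma)\, e^{ik\cdot x} \right) b_{\vec k,\sigma}
&\longleftrightarrow\quad (\psi_+)^\dagger(x) &:= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left( u^*(\vec k,\sigma)\, e^{-ik\cdot x} \right) b^\dagger_{\vec k,\sigma} ; \\
\psi_-(x) &:= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left( v(\vec k,\sigma)\, e^{-ik\cdot x} \right) d^\dagger_{\vec k,\sigma}
&\longleftrightarrow\quad (\psi_-)^\dagger(x) &:= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \sum_\sigma \left( v^*(\vec k,\sigma)\, e^{+ik\cdot x} \right) d_{\vec k,\sigma} .
\end{aligned}
\]

The Dirac index is suppressed and the dagger refers to the adjoint of the operators and not of the spinors.

Let $\psi(x) := \mu\psi_+(x) + \nu\psi_-(x)$ and $\psi^\dagger(y) := \mu^*\psi_+^\dagger(y) + \nu^*\psi_-^\dagger(y)$.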
Then,

\[
\left[ \psi_\alpha(x), \psi^\dagger_\beta(y) \right]_\pm = \int \frac{d^3k}{2\omega_{\vec k}(2\pi)^3} \sum_\sigma \left\{ |\mu|^2\, u_\alpha(\vec k,\sigma)\, u^*_\beta(\vec k,\sigma)\, e^{ik\cdot(x-y)} \pm |\nu|^2\, v_\alpha(\vec k,\sigma)\, v^*_\beta(\vec k,\sigma)\, e^{-ik\cdot(x-y)} \right\}
\]

The sums over $\sigma$ are given by,

\[
\begin{aligned}
\sum_\sigma u_\alpha(\vec k,\sigma)\, u^*_\beta(\vec k,\sigma) &= \sum_\sigma \left[ u(\vec k,\sigma)\, \bar u(\vec k,\sigma)\, \gamma^0 \right]_{\alpha\beta} = \left[ \frac{-\not{k} + m}{2m}\, \gamma^0 \right]_{\alpha\beta} \\
\Rightarrow\ \sum_\sigma u_\alpha(\vec k,\sigma)\, u^*_\beta(\vec k,\sigma)\, e^{ik\cdot(x-y)} &= (2m)^{-1} \left[ (i\not{\partial} + m)\, \gamma^0 \right]_{\alpha\beta} e^{ik\cdot(x-y)}
\end{aligned}
\]

and likewise,

\[
\begin{aligned}
\sum_\sigma v_\alpha(\vec k,\sigma)\, v^*_\beta(\vec k,\sigma) &= \sum_\sigma \left[ v(\vec k,\sigma)\, \bar v(\vec k,\sigma)\, \gamma^0 \right]_{\alpha\beta} = \left[ \frac{-\not{k} - m}{2m}\, \gamma^0 \right]_{\alpha\beta} \\
\Rightarrow\ \sum_\sigma v_\alpha(\vec k,\sigma)\, v^*_\beta(\vec k,\sigma)\, e^{-ik\cdot(x-y)} &= -(2m)^{-1} \left[ (i\not{\partial} + m)\, \gamma^0 \right]_{\alpha\beta} e^{-ik\cdot(x-y)} ;
\end{aligned}
\]

and the integration over $\vec k$ just gives the $\Delta_+((x - y)^2)$ function computed above. Hence,

\[
\left[ \psi_\alpha(x), \psi^\dagger_\beta(y) \right]_\pm = (|\mu|^2 \mp |\nu|^2) \left[ \frac{i\not{\partial} + m}{2m}\, \gamma^0 \right]_{\alpha\beta} \Delta_+((x - y)^2)
\]

Once again, the right hand side vanishes for space-like separation provided the upper signs are chosen and $|\mu| = |\nu|$. Thus, for spinors we have to choose anti-commutation relations and the weights of the positive and negative frequency solutions must be the same up to a phase factor. We may just take $\mu = \nu = 1$ which gives back our previous mode decomposition.

Note: While writing the mode decomposition, we did not bother about the normalizations of the coefficients, $b, d^*$ etc. Now, in choosing the brackets, $[b_{\vec k,\sigma}, b^\dagger_{\vec k',\sigma'}]_+ = \delta_{\sigma,\sigma'}\, \delta^3(\vec k - \vec k')$ and ditto for the $d$'s, we have fixed these normalizations. We could still have an arbitrary multiple in each term. The requirement of causality restricts this freedom to the above.

The two examples considered above generalize to other representations as well and what we have is the celebrated spin-statistics theorem:

A relativistic quantum field theory satisfies the requirement of causality provided integer spin/helicity field quantum conditions use commutators (bosons) while the half integer spin/helicity field quantum conditions use anti-commutators (fermions).

Note: While discussing quantum statistics in statistical mechanics, we invoke the additional attribute of indistinguishability among identical particles and require the Bose-Einstein/Fermi-Dirac statistics.
Identical particles are defined by having identical intrinsic attributes such as mass, spin, color, flavor etc. The (in)distinguishability arises from a spatial localization and subsequent ability to tag them through, say, a scattering process. This is lost when their wave functions overlap, the 'quantum' becomes operative and quantum statistics becomes essential. In the above discussion, no such additional property emerged. We already have the 'quantum' operative so indistinguishability is also operative. It is the demand of relativistic causality that forced the spin-statistics correlation.

With the anti-commutators around, $[b_{\vec k,\sigma}, b^\dagger_{\vec k',\sigma'}]_+ = \delta_{\sigma,\sigma'}\, \delta^3(\vec k - \vec k')$, $[b_{\vec k,\sigma}, b_{\vec k',\sigma'}]_+ = 0$, we define the number operators for each $(\vec k, \sigma)$ as before: $N := b^\dagger b$. It follows that $[N, b]_- = -b$, $[N, b^\dagger]_- = +b^\dagger$. Eigenvalues of $N$ continue to be integers but thanks to $(b^\dagger)^2 = 0$, they take only two values, $0, 1$. The corresponding eigenstates are $|0\rangle$ and $|1\rangle$. This is the incorporation of the Pauli exclusion principle.

Consider states of two modes (scalar for simplicity), say $\vec k_1, \vec k_2$. A general basis state would be $|\vec k_1, \dots \vec k_1, \vec k_2, \dots \vec k_2\rangle \sim (a^\dagger_{\vec k_1})^m (a^\dagger_{\vec k_2})^n |0, 0\rangle$. If the $a$'s anti-commute, then $m, n \le 1$ and we have 4 possible states: $|0, 0\rangle$, $|\vec k_1, 0\rangle$, $|0, \vec k_2\rangle$, $|\vec k_1, \vec k_2\rangle$. Thanks to anti-commutation, $|\vec k_2, \vec k_1\rangle = -|\vec k_1, \vec k_2\rangle$. For bosonic operators, there is no minus sign under exchange. Generalizing from this, it is clear that (anti-)commutation relations among the creation operators automatically ensure complete (anti-)symmetry under permutations of the labels.

To understand the extra minus sign in the third line of eq. (8.6), momentarily write the $\bar\Psi$ as $\bar\Psi_1$ and $\Psi$ as $\Psi_2$. Then in the third line of the equation, we see an explicit exchange of $\Psi_1$ and $\Psi_2$. The fermionic anti-commutation relation then gives that extra minus sign. As noted before, this is also important for the CPT theorem.
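The algebra of a single fermionic mode, and the sign under exchange of two modes, can be checked with small explicit matrices. The following is a minimal sketch, not part of the original notes: it assumes the basis $(|0\rangle, |1\rangle)$ for one mode and uses a Jordan-Wigner sign matrix `Z` (an illustrative choice) to make operators of two different modes anti-commute. The helper names `matmul`, `add` and `kron` are hypothetical.

```python
# Sketch: one and two fermionic modes as integer matrices (nested lists).
# Basis for one mode is (|0>, |1>); all helper names are illustrative.

def matmul(A, B):
    """Matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    """Entry-wise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def kron(A, B):
    """Kronecker (tensor) product; rows indexed by (i, k), columns by (j, l)."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

b = [[0, 1], [0, 0]]    # annihilation operator: b|1> = |0>, b|0> = 0
bd = [[0, 0], [1, 0]]   # creation operator b^dagger
I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]   # Jordan-Wigner sign factor (assumed construction)

anticommutator = add(matmul(b, bd), matmul(bd, b))  # [b, b^dagger]_+
bd_squared = matmul(bd, bd)                         # (b^dagger)^2
number_op = matmul(bd, b)                           # N = b^dagger b

# Two modes: the Z factor makes creation operators of different modes anti-commute.
bd1 = kron(bd, I2)
bd2 = kron(Z, bd)
create_12 = matmul(bd1, bd2)   # acts as b1^dagger b2^dagger
create_21 = matmul(bd2, bd1)   # acts as b2^dagger b1^dagger
```

Here `anticommutator` is the identity, `bd_squared` vanishes (so the occupation number is 0 or 1), and `create_12` is the negative of `create_21`, which is the matrix form of $|\vec k_2, \vec k_1\rangle = -|\vec k_1, \vec k_2\rangle$.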
To summarize: Quantum fields are operators tagged by space-time points and obeying certain commutation relations. We may think of them as a collection of creation-annihilation operators for each mode (solution of the equations of motion). A general state of a quantum field is a linear combination of basis states which are multi-quanta states. Each quantum with momentum label $\vec k$ and a spin/helicity label $\sigma$ carries energy-momentum-angular momentum as given by the Poincare charges. As per quantum field theory, all processes involve exchanges of quanta among different quantum fields.

9. STATES OF FREE QUANTUM FIELDS: PARTICLES, COHERENCE AND COHERENT STATES

While we argued that particle dynamics with position-momentum is not compatible with relativity, we observe particles in a variety of experiments, e.g. as tracks in bubble chambers, photographic emulsions, stacks of particle counters etc., and also use the particle view in designing accelerator beams. So the more sophisticated relativistic framework should also have a suitable description of "particles". Since quantum fields are operators, the description must be in terms of states of a quantum field.

In section 7, we saw that from a dynamical view point, a (free) field is an infinite collection of non-interacting harmonic oscillators. Furthermore, its quantization follows from the quantization of the oscillators and a quantum field emerges as a linear combination of creation-annihilation operators of its modes. Its states are built by the action of the creation operators on the vacuum state of each mode. We have to identify "particles" in this huge space of states of a quantum field.

A. Particle/anti-particle wave packets

Let $|0\rangle$ denote the unique vacuum state annihilated by all annihilation operators $a_{\vec k}$. We can generate "1-particle states" by taking a linear combination of the form $\sum_{\vec k} \alpha(\vec k)\, a^\dagger_{\vec k}|0\rangle$, "2-particle states" by taking a linear combination of the form $\sum_{\vec k, \vec l} \alpha(\vec k, \vec l)\, a^\dagger_{\vec k} a^\dagger_{\vec l}|0\rangle$ and so on.
The totality of all such states forms the state space of a free quantum field. For the $n$-particle states, the (anti-)commutators of the creation operators automatically take care of (anti-)symmetrization of the states.

Notice that a 1-particle state does not mean a single $a^\dagger_{\vec k}|0\rangle$, but a linear combination of several creation operators acting on the vacuum. These allow us to form wave packets. As an explicit example, let us construct a wave packet of an anti-fermion with some momentum and spin distribution.

We have the Dirac field,

\[
\Psi(x) = \int [d^3k] \sum_\sigma \left( b_{\vec k,\sigma}\, u(\vec k,\sigma)\, e^{ik\cdot x} + d^\dagger_{\vec k,\sigma}\, v(\vec k,\sigma)\, e^{-ik\cdot x} \right) , \quad \text{where,}\quad [d^3k] := \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} , \quad \text{and}\quad (\not{k} + m)\,u = 0 = (-\not{k} + m)\,v .
\]

Acting on the vacuum, it will create a 1-anti-fermion state, the linear combination from the $d^\dagger_{\vec k,\sigma}$ creation operators. Let $f(\vec k', \sigma')$ be a suitable complex valued function. Define

\[
\langle f| := \int d^3k' \sum_{\sigma'} f(\vec k', \sigma')\, \langle 0|\, d_{\vec k',\sigma'} , \qquad \int d^3k \sum_\sigma |f(\vec k, \sigma)|^2 = 1 . \tag{9.1}
\]
\[
(\chi)_c(x) := \langle f|\Psi(x)|0\rangle = \int [d^3k] \sum_\sigma f(\vec k,\sigma)\, v(\vec k,\sigma)\, e^{-ik\cdot x} \qquad \because\ \langle 0|\, d_{\vec k',\sigma'}\, d^\dagger_{\vec k,\sigma}\, |0\rangle = \delta_{\sigma,\sigma'}\, \delta^3(\vec k' - \vec k) \tag{9.2}
\]

The so constructed $(\chi)_c(x)$ is a spinor-valued function of $x$ which satisfies the Dirac equation. The $v(\vec k, \sigma)$ spinor signifies that it is an anti-fermion wave packet and the suffix $c$ on $\chi$ is a reminder.

How do we get a fermion wave packet? Well, in place of the $\Psi(x)$ operator, we use its charge conjugate, $C\bar\Psi^T(x)$ (see (8.4)), and in place of the $d_{\vec k,\sigma}$ in $\langle f|$ we use $b_{\vec k,\sigma}$. The corresponding fermion wave packet takes the form,

\[
\langle g| := \int d^3k' \sum_{\sigma'} g(\vec k', \sigma')\, \langle 0|\, b_{\vec k',\sigma'} , \qquad \int d^3k \sum_\sigma |g(\vec k, \sigma)|^2 = 1 . \tag{9.3}
\]
\[
\chi(x) := \langle g|\, C\bar\Psi^T(x)\, |0\rangle = \int [d^3k] \sum_\sigma g(\vec k,\sigma)\, u_c(\vec k,\sigma)\, e^{-ik\cdot x} \qquad \because\ \langle 0|\, b_{\vec k',\sigma'}\, b^\dagger_{\vec k,\sigma}\, |0\rangle = \delta_{\sigma,\sigma'}\, \delta^3(\vec k' - \vec k) \tag{9.4}
\]

where $u_c(\vec k, \sigma) = (i\gamma^2 u^*(\vec k, \sigma))$ is the charge conjugate of the $u(\vec k, \sigma)$ spinor.

The generalization for wave packets of other fields should be obvious. All these wave packets are positive or negative frequency solutions of the corresponding field equations.
We have already discussed the inner products on these spaces and we may view these as the usual probability amplitudes. Of course we do not have the analogue of a position operator unless we restrict to a non-relativistic approximation/limit. Suffice it to say that one can use the Dirac equation to analyze beams of relativistic fermions, as is done for instance in [8].

B. Correlation Functions and Coherence of states

There are of course more general states which are not labeled by the number of quanta. Even for a single oscillator, states with a given occupation number are only one class of states. We can have finite or infinite linear combinations of these number states, squeezed states, coherent states etc. They are distinguished by their properties and utility. A somewhat more general characterization of quantum states is in terms of correlation functions. Let us see an example of this.

Consider a real scalar quantum field. We have its mode decomposition and had also defined the $\phi_\pm(x)$ fields consisting of the positive/negative frequency parts. Consider the measurement of the operator $\phi_-(x)\phi_+(x')$. Its average value in some state, which is kept implicit, is denoted as

\[
\begin{aligned}
G(x, x') &:= \langle \phi_-(x)\, \phi_+(x') \rangle \qquad \text{(a correlation function)} \\
&= \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int \frac{d^3k'}{\sqrt{2\omega_{\vec k'}(2\pi)^3}}\; e^{-ik\cdot x + ik'\cdot x'}\; \langle a^\dagger_{\vec k}\, a_{\vec k'} \rangle
\end{aligned}
\]

where $\langle a^\dagger_{\vec k}\, a_{\vec k'} \rangle := Tr(\rho\, a^\dagger_{\vec k}\, a_{\vec k'})$ and thus the measured average value depends on the density matrix or "state", $\rho$.

The usual Schrodinger equation for the ket vectors becomes the equation $i\, d_t\rho = [H, \rho]$ whose solution is $\rho(t) = e^{-itH}\rho_0\, e^{itH}$ for a time independent Hamiltonian. These are the only Hamiltonians considered below. For example, the Hamiltonian of our scalar field is given by the quantum version of (6.15).

A time-independent density matrix $\rho$ is said to represent a stationary state. Thus for a stationary state, $[\rho, H] = 0$ which implies, $\rho = e^{-itH}\rho\, e^{itH}\ \forall\ t$.
\[
\begin{aligned}
Tr\left[ \rho\, a^\dagger_{\vec k}\, a_{\vec k'} \right] &= Tr\left[ e^{-itH} \rho\, e^{itH}\, a^\dagger_{\vec k}\, a_{\vec k'} \right] = Tr\left[ \rho\, e^{itH}\, a^\dagger_{\vec k}\, a_{\vec k'}\, e^{-itH} \right] \\
&= Tr\left[ \rho\, (e^{itH} a^\dagger_{\vec k}\, e^{-itH})\, (e^{itH} a_{\vec k'}\, e^{-itH}) \right] \\
&= Tr\left[ \rho\, a^\dagger_{\vec k}\, a_{\vec k'} \right] e^{it(\omega_{\vec k} - \omega_{\vec k'})} \qquad \because\ H \sim \int [d^3k'']\; \omega_{\vec k''}\, N_{\vec k''}
\end{aligned}
\]

The left hand side is time independent, so $Tr\left[ \rho\, a^\dagger_{\vec k}\, a_{\vec k'} \right] = 0$ unless $\omega_{\vec k} = \omega_{\vec k'}$, i.e. unless $|\vec k| = |\vec k'|$. For the same frequency, the trace need not vanish and we have a matrix in the labels $\vec k, \vec k'$. This matrix is hermitian ($\rho$ is hermitian) and can be diagonalised. Hence, without loss of generality, for a stationary state, $Tr\left[ \rho\, a^\dagger_{\vec k}\, a_{\vec k'} \right] =: n_{\vec k}\, \delta_{\vec k, \vec k'}$, where $n_{\vec k}$ denotes the average number of quanta in the $\vec k$ mode. This leads to,

\[
G(x, x') = \int \frac{d^3k}{2\omega_{\vec k}(2\pi)^3}\; n_{\vec k}\; e^{i\omega_{\vec k}(t - t') - i\vec k\cdot(\vec x - \vec x')} \qquad \text{(in a stationary state).}
\]

Clearly, a stationary state leads to translational invariance of the above correlation function and this correlation function contains the information of the average number of quanta in each mode.

Assume now that the average number of quanta is vanishingly small outside a small neighbourhood of some direction $\hat k_0$ (collimation) and also outside a small neighbourhood of $\omega_{\vec k_0}$ (monochromaticity). We may then approximate the integral as,

\[
G(x, x') \approx \frac{N_0}{2\omega_{\vec k_0}(2\pi)^3}\; e^{i\omega_{\vec k_0}(t - t') - i\vec k_0\cdot(\vec x - \vec x')} , \qquad N_0 \approx \int_{ngbd(\vec k_0)} d^3k\; n_{\vec k} .
\]

It is apparent that the correlation function has a factorised form: $G(x, x') \approx \varphi^*(x)\varphi(x')$, $\varphi(x) := \sqrt{\frac{N_0}{2\omega_{\vec k_0}(2\pi)^3}}\; e^{ik_0\cdot x}$, which is a plane wave. A state $\rho$ with such a form of the correlation function is said to have first order coherence$^{8}$.

Notice that the factorised form results when the integral can be approximated. This in turn is valid for $|\vec x - \vec x'| \approx (2\pi)/|\vec k_0|$ and $|t - t'| \approx (2\pi)/\omega_{\vec k_0}$. Furthermore, the imperfections in the monochromaticity and collimation affect the approximation and also limit the validity of the first order coherence.
This introduces the ideas of coherence time and coherence length as the time and space intervals over which the factorization property or coherence property holds. Thus a stationary state with first order coherence, with finite coherence length and time, may be interpreted as a "beam" along the direction $\hat k_0$ with intensity proportional to $N_0\, \omega_{\vec k_0}$.

Let us continue further and assume perfect collimation along, say, the $x$-axis. The beam may have a distribution of frequencies though. The phase$^{9}$ then becomes $i\omega_{\vec k}(t - t' - x + x')$ and $G(x, x') =: G(\tau)$, $\tau := (t - t' - x + x')$. Since we have a single direction, we can denote the dependence on $\vec k$ by the frequency $\omega$. Suppose further that $n_{\vec k} =: n_\omega$ has a Gaussian dependence on $\omega$: $n_\omega = A \exp\left[ -\frac{(\omega - \omega_0)^2}{2\sigma^2} \right]$. Now the integral can be done exactly and we get,

\[
G(\tau) = A \int_0^\infty \frac{d\omega}{\omega(2\pi)}\; e^{-\frac{(\omega - \omega_0)^2}{2\sigma^2}}\; e^{i\omega\tau} = G(0)\; e^{i\omega_0\tau}\; e^{-\frac{\sigma^2\tau^2}{2}} .
\]

Thus, a highly collimated beam with a frequency range of $\sigma$ around $\omega_0$ has the correlation function $G(x, x')$ decaying with $\tau = (t - t' - x + x') \sim \sigma^{-1}$. For example, if we have say a laser beam with $\omega_0 \sim 10^{14}$ Hz and $\sigma \sim 10^6$ Hz, then $\tau \sim 10^{-6}$ seconds or about $10^2$ meters. Since the beam is presumed to be a stationary state, the $\tau$ does not vary with the time, $t$ (or $t'$). However, beyond the coherence length, the beam will show frequency dispersion. Note that first order coherence suffices for the notions of coherence length and time.

$^{8}$ $k^{th}$ order coherence is defined in terms of the factorization property of $G_n(x_1, \dots, x_n, y_1, \dots, y_n) := \langle \phi_-(x_1) \dots \phi_+(y_n) \rangle = E^*(x_1) \dots E^*(x_n)\, E(y_1) \dots E(y_n)$, where $E(x)$ is independent of $n$ for all $1 \le n \le k$ [7]. The coherent states, eigenstates of the annihilation operators, familiar from the harmonic oscillator, have infinite order coherence.

$^{9}$ The formula is given for a massless field. More generally, we will have $\tau(\omega) = t - t' - \sqrt{1 - m^2/\omega^2}\, (x - x')$. For very high frequency/energy, the mass can be neglected.
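The laser estimate quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the width $\sigma$ is read as an angular-frequency width in inverse seconds and the coherence time is taken as $\tau_c = 1/\sigma$; the function name `coherence_envelope` and the variable names are illustrative, not from the notes.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def coherence_envelope(tau, sigma):
    """Gaussian envelope |G(tau)/G(0)| = exp(-sigma^2 tau^2 / 2) from the formula above."""
    return math.exp(-0.5 * (sigma * tau) ** 2)

sigma = 1.0e6                    # assumed frequency width, in s^-1
tau_c = 1.0 / sigma              # coherence time estimate, in seconds
length_c = C_LIGHT * tau_c       # corresponding coherence length, in meters
envelope_at_tau_c = coherence_envelope(tau_c, sigma)  # equals exp(-1/2)
```

With these inputs the coherence time is a microsecond and the coherence length is a few hundred meters, consistent with the order-of-magnitude estimate of $10^2$ meters in the text.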
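Remark: In this example, we have stipulated the properties of stationarity and 1st order coherence to restrict the implicit state of the quantum field. These stipulations do not single out a particular state, but permit a subset of states $\rho$. All such states suffice to describe a beam with stable properties over the coherence interval.

C. Coherent States

There is a very important class of states of quantum fields, especially for the electromagnetic field, namely the class of coherent states. These are familiar from the quantum harmonic oscillator. Let us recall their construction briefly.

1. An aside: Harmonic Oscillator States

With the usual creation-annihilation operators, $[a, a^\dagger] = 1$, we take the oscillator variables as:
\[
q(t) := a\,e^{-i\omega t} + a^\dagger e^{i\omega t} := q_+(t) + q_-(t) , \qquad
p(t) := -\frac{i}{2}\, a\, e^{-i\omega t} + \frac{i}{2}\, a^\dagger e^{i\omega t} \quad\leftrightarrow\quad [q(t), p(t)] = i .
\]
Define:

(i) Displacement operators: For each $\alpha \in \mathbb{C}$,
\[
D(\alpha) := e^{\alpha a^\dagger - \alpha^* a} = e^{-\frac{|\alpha|^2}{2}}\, e^{\alpha a^\dagger}\, e^{-\alpha^* a} \quad \text{(using the BCH formula)},
\]
\[
\therefore\ D^\dagger(\alpha) = e^{-\frac{|\alpha|^2}{2}}\, e^{-\alpha a^\dagger}\, e^{\alpha^* a} = D(-\alpha) \quad\text{and}\quad D^\dagger(\alpha) D(\alpha) = \mathbb{1} ,
\]
\[
D^\dagger(\alpha)\, a\, D(\alpha) = a + \alpha\,\mathbb{1} , \qquad D^\dagger(\alpha)\, a^\dagger D(\alpha) = a^\dagger + \alpha^*\,\mathbb{1} , \qquad
D(\alpha + \beta) = D(\alpha)\, D(\beta)\, e^{-\frac{1}{2}(\alpha\beta^* - \alpha^*\beta)} .
\]

(ii) Squeezing operators: for each complex number $\epsilon := r e^{i\phi}$,
\[
S(\epsilon) := e^{\frac{1}{2}\left(\epsilon^* a^2 - \epsilon\, (a^\dagger)^2\right)} , \qquad S^\dagger(\epsilon) = S(-\epsilon) , \qquad S^\dagger(\epsilon) S(\epsilon) = \mathbb{1} ,
\]
\[
S^\dagger(\epsilon)\, a\, S(\epsilon) = a\, \mathrm{ch}(r) - a^\dagger e^{-2i\phi}\, \mathrm{sh}(r) , \qquad
S^\dagger(\epsilon)\, a^\dagger S(\epsilon) = a^\dagger \mathrm{ch}(r) - a\, e^{2i\phi}\, \mathrm{sh}(r) .
\]

These operators define a two parameter family of states, the squeezed coherent states, as
\[
|\alpha, \epsilon\rangle := D(\alpha) S(\epsilon) |0\rangle ; \qquad \epsilon = 0 \leftrightarrow \text{(coherent states)}, \quad \alpha = 0 \leftrightarrow \text{(squeezed states)}.
\]
For this family of states, it is straightforward to see,
\[
\begin{aligned}
\langle \alpha, \epsilon| a |\alpha, \epsilon\rangle &= \alpha \ \ \forall\, \epsilon , &
\langle \alpha, \epsilon| a^2 |\alpha, \epsilon\rangle &= -e^{-2i\phi}\, \mathrm{ch}(r)\,\mathrm{sh}(r) + \alpha^2 , \\
\langle \alpha, \epsilon| a^\dagger |\alpha, \epsilon\rangle &= \alpha^* , &
\langle \alpha, \epsilon| (a^\dagger)^2 |\alpha, \epsilon\rangle &= -e^{2i\phi}\, \mathrm{ch}(r)\,\mathrm{sh}(r) + (\alpha^*)^2 , \\
\langle \alpha, \epsilon| N |\alpha, \epsilon\rangle &= |\alpha|^2 + \mathrm{sh}^2(r) , &
a\, |\alpha, \epsilon\rangle &= -\mathrm{sh}(r)\, e^{-2i\phi}\, D(\alpha) S(\epsilon)\, a^\dagger |0\rangle + \alpha\, |\alpha, \epsilon\rangle .
\end{aligned}
\]
The last equation shows that only for the squeezing parameter $r = 0$ is $|\alpha, 0\rangle$ an eigenstate of the annihilation operator, which is the usual definition of coherent states.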
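The Heisenberg uncertainty relation takes the form,
\[
\begin{aligned}
A &:= 1 + 2\langle N\rangle - 2|\langle a\rangle|^2 = 1 + 2\,\mathrm{sh}^2(r) , \\
B &:= (\Delta a)^2 e^{-2i\omega t} + (\Delta a^\dagger)^2 e^{2i\omega t} = -2\,\mathrm{ch}(r)\,\mathrm{sh}(r)\cos\!\big(2(\omega t + \phi)\big) , \\
(\Delta q)^2(t) &= A + B , \qquad 4(\Delta p)^2(t) = A - B ,
\end{aligned}
\]
where $(\Delta a)^2 := \langle a^2\rangle - \langle a\rangle^2$. Clearly, for the coherent states $r = 0 \Rightarrow A = 1, B = 0$ and $(\Delta q)^2 (\Delta p)^2 = \frac{1}{4}$, which saturates the uncertainty bound. The uncertainties are also time independent.

Some of the basic properties of the coherent states are:
\[
|\alpha\rangle = |\alpha, 0\rangle = D(\alpha)|0\rangle \ \Rightarrow\ \langle\alpha|\alpha\rangle = 1 \tag{9.5}
\]
\[
|\alpha\rangle = e^{-\frac{|\alpha|^2}{2}} \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, |n\rangle \tag{9.6}
\]
\[
\langle\beta|\alpha\rangle = e^{-\frac{|\alpha|^2 + |\beta|^2}{2} + \alpha\beta^*} \tag{9.7}
\]
\[
\mathbb{1} = \frac{1}{\pi}\int d^2\alpha\, |\alpha\rangle\langle\alpha| = \sum_{n \ge 0} |n\rangle\langle n| . \tag{9.8}
\]

2. Correlation functions

This subsection is based on [6, 7]. We begin with the definition:
\[
G^n(t_1, \ldots, t_n; t_{n+1}, \ldots, t_{2n}) := \langle q_-(t_1) \cdots q_-(t_n)\; q_+(t_{n+1}) \cdots q_+(t_{2n}) \rangle \tag{9.9}
\]
\[
= e^{i\omega(t_1 + \cdots + t_n - t_{n+1} - \cdots - t_{2n})}\; \mathrm{Tr}\big[\rho\, (a^\dagger)^n a^n\big] . \tag{9.10}
\]
Since the $\hat q_-$ and the $\hat q_+$ commute among themselves, the ordering of $(t_1, \ldots, t_n)$ and of $(t_{n+1}, \ldots, t_{2n})$ is unimportant. Note that for (and only for) $\rho = |0\rangle\langle 0|$, the correlation functions vanish identically. In the following, this density matrix is excluded.

We invoke the general result: $\forall\, \hat A$ and $\forall\, \rho$, $\mathrm{Tr}[\rho A^\dagger A] \ge 0$. Taking $A := \sum_i \alpha_i\, \hat q_+(t_i)$,
\[
\langle A^\dagger A\rangle = \sum_{i,j} \alpha_i^*\, G^1(t_i, t_j)\, \alpha_j \ \ge\ 0 \qquad \forall\ \alpha_i\text{'s} .
\]
Hence, the matrix $G^1(t_i, t_j)$ is non-negative, and so is its determinant. Thus, $G^1(t_i, t_i)\, G^1(t_j, t_j) - G^1(t_i, t_j)\, G^1(t_j, t_i) \ge 0$. From the definition of $G^1$, it follows that $G^1(t_j, t_i) = G^1(t_i, t_j)^*$. Therefore $G^1(t, t)$ is real and in fact non-negative. The non-negative determinant condition becomes $|G^1(t_i, t_j)|^2 \le G^1(t_i, t_i)\, G^1(t_j, t_j)$. This leads to the definition of the normalized correlation functions,
\[
g^n(t_1, \ldots, t_{2n}) := \frac{G^n(t_1, \ldots, t_{2n})}{\sqrt{G^1(t_1, t_1) \cdots G^1(t_{2n}, t_{2n})}} . \tag{9.11}
\]
The determinant condition may be expressed as $|g^1(t_i, t_j)| \le 1$. Using the explicit definition,
\[
g^1(t_i, t_j) = \frac{e^{i\omega(t_i - t_j)}\; \mathrm{Tr}(\rho\, a^\dagger a)}{\sqrt{\mathrm{Tr}(\rho\, a^\dagger a)\; \mathrm{Tr}(\rho\, a^\dagger a)}} = e^{i\omega(t_i - t_j)} \qquad \therefore\ |g^1(t_i, t_j)| = 1 .
\]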
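The coherent-state properties (9.5)-(9.7) quoted earlier in this subsection can be spot-checked in a truncated number basis. This is a minimal sketch, not a proof: the truncation `nmax = 80`, the sample values of $\alpha$ and $\beta$, and the helper names are assumptions made for illustration. The amplitudes are built from the expansion (9.6), and the annihilation operator acts as $(a\psi)_n = \sqrt{n+1}\,\psi_{n+1}$.

```python
import cmath
import math


def coherent_coeffs(alpha, nmax=80):
    """Number-basis amplitudes c_n = exp(-|alpha|^2/2) alpha^n / sqrt(n!)
    of eq. (9.6), truncated at n = nmax and built by recursion."""
    coeffs = [cmath.exp(-abs(alpha) ** 2 / 2.0)]
    for n in range(1, nmax + 1):
        coeffs.append(coeffs[-1] * alpha / math.sqrt(n))
    return coeffs


def overlap(beta, alpha, nmax=80):
    """<beta|alpha> computed as a sum over the truncated number basis."""
    cb, ca = coherent_coeffs(beta, nmax), coherent_coeffs(alpha, nmax)
    return sum(b.conjugate() * a for b, a in zip(cb, ca))


def annihilate(coeffs):
    """(a psi)_n = sqrt(n+1) psi_{n+1}; the result has one component fewer."""
    return [math.sqrt(n + 1) * coeffs[n + 1] for n in range(len(coeffs) - 1)]


alpha, beta = 1.2 + 0.7j, -0.4 + 0.9j   # arbitrary sample values
c = coherent_coeffs(alpha)
norm = sum(abs(x) ** 2 for x in c)                        # (9.5): should be 1
mean_n = sum(n * abs(x) ** 2 for n, x in enumerate(c))    # <N> = |alpha|^2
eig_err = max(abs(x - alpha * y) for x, y in zip(annihilate(c), c))  # a|alpha> = alpha|alpha>
closed_form = cmath.exp(-(abs(alpha) ** 2 + abs(beta) ** 2) / 2.0
                        + alpha * beta.conjugate())       # (9.7)
```

The truncation is harmless here because the amplitudes fall off factorially; for much larger $|\alpha|$ the cutoff `nmax` would have to be raised.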
$k^{\rm th}$ order coherence: A state $\rho$ is said to have $k^{\rm th}$-order coherence if $|g^n(t_1, \ldots, t_{2n})| = 1\ \ \forall\ 1 \le n \le k$.

Remark: This is a property of a state, defined with respect to a specific kind of correlation function. Thus we could have different notions of coherence if we adopt different correlation functions. The choice of this specific class of correlation functions has its roots in the measurements in optics, which involve the electromagnetic field. The correlation functions are also defined for classical fields (with the averages defined as ensemble averages over repeated observations) and have properties very similar to those of their quantum counterparts. The distinctions begin to show up when the intensities of the light beams are decreased to almost the "single photon" level.

Remark: In light of the result $|g^1(t_i, t_j)| = 1$ obtained just above, every state has 1st order coherence. It follows immediately that for $k^{\rm th}$ order coherence, $|G^n(t_1, \ldots, t_{2n})|^2 = \prod_i G^1(t_i, t_i)$.

Result: A state has $k^{\rm th}$ order coherence iff there exists a function $E(t)$, independent of $n$, such that $G^n(t_1, \ldots, t_{2n}) = E^*(t_1) \cdots E^*(t_n)\, E(t_{n+1}) \cdots E(t_{2n})\ \ \forall\ 1 \le n \le k$.

The proof is easy. First prove it for $G^1(t_1, t_2)$. From this the result for $G^n$ follows. This factorization property is another characterization of $k^{\rm th}$ order coherence.

Among the pure states, we have states which are finite linear combinations of number states and those which are infinite linear combinations. Let $|\psi\rangle$ be a finite linear combination of the number eigenstates, and let $K$ be the maximum eigenvalue of $N$ appearing in the linear combination. Then $a^n|\psi\rangle = 0\ \forall\ n > K$. Since, for pure states,
\[
|g^n| = \frac{\langle\psi| (a^\dagger)^n a^n |\psi\rangle}{\left(\langle\psi| a^\dagger a |\psi\rangle\right)^n} ,
\]
it follows that $|g^n| = 0\ \forall\ n > K$. Hence, finite linear combinations are necessarily partially coherent, i.e. their order of coherence is at most $K$. A fully coherent state must be an infinite linear combination. But not all infinite linear combinations have full coherence!
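The pure-state formula for $|g^n|$ is easy to evaluate from the number distribution alone, because $(a^\dagger)^n a^n$ is diagonal in the number basis with eigenvalue $m(m-1)\cdots(m-n+1)$ on $|m\rangle$. The sketch below is illustrative only: the probabilities of the finite superposition are made-up values with $K = 2$, and the Poisson distribution is the number distribution of a coherent state read off from (9.6), truncated at an assumed cutoff.

```python
import math


def falling_moment(probs, n):
    """<(a^dagger)^n a^n> = sum_m P(m) m (m-1) ... (m-n+1)
    for a state whose number distribution is probs[m] = P(m)."""
    total = 0.0
    for m, p in enumerate(probs):
        term = p
        for j in range(n):
            term *= (m - j)
        total += term
    return total


def g(probs, n):
    """Normalized correlation |g^n| = <(a^dagger)^n a^n> / <a^dagger a>^n."""
    return falling_moment(probs, n) / falling_moment(probs, 1) ** n


def poisson(mean, nmax=120):
    """Poisson distribution P(m) = exp(-mean) mean^m / m!, truncated at nmax."""
    probs = [math.exp(-mean)]
    for m in range(1, nmax + 1):
        probs.append(probs[-1] * mean / m)
    return probs


# Hypothetical finite superposition of |0>, |1>, |2> (so K = 2); values are |c_m|^2.
finite_state = [0.2, 0.3, 0.5]
# Coherent state with |alpha|^2 = 2.5 (sample value).
coherent = poisson(2.5)
```

For `finite_state` the value `g(finite_state, 3)` vanishes, in line with $|g^n| = 0$ for $n > K$, while `g(coherent, n)` stays at 1 for every $n$ tried.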
Here are two examples.

Coherent states $|\alpha, 0\rangle$: We have $|g^n| = \dfrac{(\alpha^*)^n \alpha^n}{(|\alpha|^2)^n} = 1\ \forall\ n \ge 1$, and coherent states are fully coherent. These are the only non-vacuum states which have infinite order coherence. This may be checked by considering an arbitrary, infinitesimal perturbation of a coherent state, $|\Psi\rangle := (1 - \epsilon/2)|\alpha\rangle + (\epsilon/2)|\phi\rangle$, $|\phi\rangle \ne |0\rangle$, and computing $\langle\Psi| (a^\dagger)^n a^n |\Psi\rangle - (\langle\Psi| a^\dagger a |\Psi\rangle)^n$ to first order in $\epsilon$.

Squeezed vacuum $|0, \epsilon\rangle$: Now,
\[
|g^2| = \frac{\langle 0| S^\dagger(\epsilon)\, (a^\dagger)^2 a^2\, S(\epsilon) |0\rangle}{\left(\langle 0| S^\dagger(\epsilon)\, a^\dagger a\, S(\epsilon) |0\rangle\right)^2} .
\]
This evaluates to,
\[
\begin{aligned}
\text{Numerator} &= \langle 0| S^\dagger a^\dagger S\; S^\dagger a^\dagger S\; S^\dagger a S\; S^\dagger a S |0\rangle \\
&= \langle 0| (\mathrm{ch}(r) a^\dagger - \mathrm{sh}(r) e^{2i\phi} a)(\mathrm{ch}(r) a^\dagger - \mathrm{sh}(r) e^{2i\phi} a)\,(\mathrm{ch}(r) a - \mathrm{sh}(r) e^{-2i\phi} a^\dagger)(\mathrm{ch}(r) a - \mathrm{sh}(r) e^{-2i\phi} a^\dagger) |0\rangle \\
&= (\mathrm{sh}(r))^2\, \langle 0| a\, (\mathrm{ch}(r) a^\dagger - \mathrm{sh}(r) e^{2i\phi} a)(\mathrm{ch}(r) a - \mathrm{sh}(r) e^{-2i\phi} a^\dagger)\, a^\dagger |0\rangle \\
&= (\mathrm{sh}(r))^2 \left[ (\mathrm{ch}(r))^2 + 2 (\mathrm{sh}(r))^2 \right] = (\mathrm{sh}(r))^2 \left[ 1 + 3 (\mathrm{sh}(r))^2 \right] , \\
\text{Denominator} &= \left(\langle 0| S^\dagger N S |0\rangle\right)^2 = (\mathrm{sh}(r))^4 \\
\Longrightarrow\quad |g^2| &= \frac{1 + 3(\mathrm{sh}(r))^2}{(\mathrm{sh}(r))^2} = 3 + \frac{1}{(\mathrm{sh}(r))^2} > 1 \qquad \forall\ |0, \epsilon\rangle .
\end{aligned}
\]
Thus while full coherence requires infinite linear combinations, they do not guarantee even second order coherence!

Points to note as a summary:

• The quantum framework has vastly more states than the classical framework. The observed measurements are always averages with uncertainties. There are many states which have the same averages for a given set of observables. While uncertainties serve to distinguish among these, alternative attributes are provided by the correlation functions and the notion of coherence. There is at least one class of states, the coherent states, which have full coherence, have time independent uncertainties (for oscillator dynamics) and have the smallest uncertainties compatible with the Heisenberg relation. This class of states is large enough to provide an (over-complete) basis for the Hilbert space.

• We used a specific kind of correlation function and defined coherence of states w.r.t. these. These correlators have equal numbers of creation and annihilation operators, ordered with the annihilation operators to the right (the so-called normal order).
This choice is tailored to typical detectors, which work by absorbing quanta. For the opposite type of detectors, anti-normal order would be appropriate. The correlators with unequal numbers of $a$, $a^\dagger$ operators are used in the measurement of phase information. All these correlation functions are premised on processes being understood as emission and absorption of quanta.

• Generalizing, the coherent states of fields are essentially tensor products of single mode coherent states. We may thus denote a coherent state of a scalar field as $|\{\alpha_{\vec k}\}\rangle$ with $\alpha_{\vec k} := \langle\alpha_{\vec k}| a_{\vec k} |\alpha_{\vec k}\rangle$. Then,
\[
\varphi(x) := \langle\{\alpha_{\vec k}\}|\, \hat\phi(x)\, |\{\alpha_{\vec k}\}\rangle = \int d^3k \left[ \alpha_{\vec k}\, e^{ik\cdot x} + \alpha_{\vec k}^*\, e^{-ik\cdot x} \right] .
\]
Thus every coherent state of the quantum field gives a classical solution determined by the complex parameters $\alpha_{\vec k}$. Conversely, we may interpret a classical (free) field as the expectation value of the quantum field in the corresponding coherent state. It will faithfully mimic the usual classical description, e.g. descriptions of wave solutions etc. This is how one understands the usual classical electromagnetic waves.

• There is also a generalization of coherent states for fermionic fields. This uses Grassmann variables, their algebra and calculus. The constructions are closely analogous. I refer you to [7].

D. Evolution into a coherent state

Since we see a large set of classical field solutions, the question arises: why does a quantum system go into a coherent state? Under what conditions are these states manifested?

Under a free evolution (linear equation), a state which is, say, a finite linear combination of the number states will continue to be a finite linear combination and will not evolve into a coherent state. Interactions are essential. It suffices to have an interaction with a classical source, i.e. a c-number source. This is demonstrated most easily by using the interaction picture for a single harmonic oscillator.
Recall: Let $|\psi\rangle_S$ denote a state vector in the Schrodinger picture, let $A_S$ denote a generic operator without explicit time dependence, and let the Hamiltonian also be time independent. The Schrodinger picture is defined by: $i d_t |\psi(t)\rangle_S = H |\psi(t)\rangle_S$, $d_t A_S = 0$. Define a new 'picture I' with an arbitrary unitary operator $V(t)$, $V(0) = \mathbb{1}$, as
\[
|\psi(t)\rangle_I := V^\dagger(t) |\psi(t)\rangle_S , \qquad A_I(t) := V^\dagger(t) A_S V(t)
\]
\[
\Rightarrow\quad i d_t |\psi(t)\rangle_I = \big(i d_t V^\dagger(t)\big) |\psi(t)\rangle_S + V^\dagger(t) H_S |\psi(t)\rangle_S = V^\dagger(t) \left[ -i d_t + H_S \right] V(t)\, |\psi(t)\rangle_I
\]
\[
\therefore\quad i d_t |\psi(t)\rangle_I = \left[ H_I - V^\dagger(t)\, i d_t V(t) \right] |\psi(t)\rangle_I , \qquad H_I(t) := V^\dagger(t) H_S V(t) .
\]
Here $d_t$ inside the square bracket acts only on $V(t)$, and we used $i d_t V^\dagger = -V^\dagger (i d_t V) V^\dagger$. Similarly, $i d_t A_I(t) = \left[ A_I(t),\, V^\dagger(t)\, i d_t V(t) \right]$.

For the choice $d_t V(t) = 0$, we get back the Schrodinger picture. For the choice $i d_t V(t) = V(t) H_I(t) \leftrightarrow i d_t V(t) = H_S V(t)$, we get $d_t |\psi(t)\rangle_I = 0$, $i d_t A_I(t) = [A_I(t), H_I(t)]$ and we have the Heisenberg picture.

Let us split the Schrodinger Hamiltonian as $H_S := H_0 + H'$, where $H_0$ is the "free" Hamiltonian and $H'$ is the "interaction" Hamiltonian. Now we choose $i d_t V_I(t) = H_0 V_I(t)$. This results in the equations,
\[
i d_t |\psi(t)\rangle_I = \big(H_I - (H_0)_I\big) |\psi(t)\rangle_I := H'_I\, |\psi(t)\rangle_I , \qquad i d_t A_I(t) = [A_I, (H_0)_I] . \tag{9.12}
\]
This is the "Dirac" or the "interaction" picture, wherein the states evolve by the interaction Hamiltonian while the operators evolve by the free Hamiltonian. Notice that $(H_0)_I(t) = H_0$.

Consider now a quantum harmonic oscillator interacting with a "classical source", i.e. $H = (\frac{1}{2}\hat p^2 + \frac{1}{2}\omega^2 \hat q^2) + J(t)\hat q(t)$, where $J(t)$ is an externally prescribed real function of time. We express the Hamiltonian as,
\[
H = \omega\big(a^\dagger a + \tfrac{1}{2}\big) + J(t)(a + a^\dagger) := H_0 + H' .
\]
In the interaction picture,
\[
i d_t a_I(t) = [a_I(t), H_0] = \omega \left[ a_I, a_I^\dagger a_I \right] = \omega\, a_I \qquad \therefore\ a_I(t) = e^{-i\omega t} a_I(0) , \quad a_I^\dagger(t) = e^{+i\omega t} a_I^\dagger(0)
\]
\[
\therefore\quad H'_I = J(t)\, e^{-i\omega t} a_I(0) + J(t)\, e^{i\omega t} a_I^\dagger(0) .
\]
The state vector evolves by $H'_I(t)$. Since the commutator $[H'_I(t), H'_I(t')]$ is a c-number, time ordering only contributes an overall phase, and up to this phase $|\psi(t)\rangle_I = \exp\!\left[-i\int_0^t dt'\, H'_I(t')\right] |\psi(0)\rangle_I$. The exponential simplifies as,
\[
e^{-i\int_0^t dt'\, H'_I(t')} = e^{-i\int_0^t dt'\, J(t') \left[ e^{-i\omega t'} a + e^{i\omega t'} a^\dagger \right]} = e^{\alpha(t) a^\dagger - \alpha^*(t) a} , \qquad a := a_I(0) ,\ a^\dagger := a_I^\dagger(0) ,
\]
\[
\alpha(t) := -i \int_0^t dt'\, J(t')\, e^{i\omega t'} .
\]
Evidently, $|\psi(t)\rangle_I = D(\alpha(t))\, |\psi(0)\rangle_I$. We see immediately that if $|\psi(0)\rangle_I = |0\rangle$, then at each $t$, $|\psi(t)\rangle_I$ is the coherent state $|\alpha(t)\rangle$.

Thus a quantum oscillator interacting with a (linearly coupled) classical source will evolve into a coherent state from its ground state. The generalization to a quantum field is immediate. A quantum field in its ground state (vacuum state), upon interacting (linear coupling) with a classical source, will evolve into a coherent state.

Remark: You may try out a different coupling to a classical source, e.g. $J(t) q^2$, to see what happens. This is indeed what happens with a quantum field in an FRW expanding universe. The result is squeezed states of the field modes, in exactly the same manner. If the source is also an operator, then in general we do not expect the state to evolve into a coherent state.

10. QUANTUM FIELDS IN SCATTERING PHENOMENA

So far we have introduced and studied quantum fields in the "non-interacting" context, in the sense that the equations of motion they satisfied were linear. We alluded to interactions of the fields with sources/detectors, involving emission/absorption of quanta. A more detailed version of this is typically treated in interactions of quantum fields with atoms/molecules, which are non-relativistic systems making transitions among their bound states. There is another class of interactions which are manifested in scattering phenomena. These can be studied in high energy collisions and are a direct tool to probe the basic microscopic interactions. We first discuss a typical scattering arrangement, identify the relevant measurable physical quantities, set up a theoretical framework and see how quantum fields fit into the framework.
A typical scattering experiment involves shooting particles at some target to be studied and observing the deflected or emerging set of particles. Instead of a fixed target we could also have colliding beams. It is usually the case that: (i) we understand the projectile particles reasonably well and are able to control their properties such as energy, momentum, spin, charge, shooting angles etc. fairly accurately. It is presumed (a necessary condition) that the presence of the target and/or other particles does not affect these properties significantly, at least as long as they are well separated (spatially and/or temporally); (ii) after a large but finite time, we can observe the emergent particles, whose properties can be similarly ascertained and which too are presumed to be well understood; (iii) the correlations among the incoming and outgoing particles, though determined by the intervening target region, can be measured independent of the target.

If we wish to view the combined system of projectile, target and emergent particles as a single dynamical system, then the dynamical system must have three separate evolutions, at least approximately. These are: (a) evolution in the distant past ($t \to -\infty$); (b) evolution in the distant future ($t \to +\infty$) and (c) evolution during the intermediate time of interaction.

For instance, in classical scattering, we will have a trajectory beginning and ending in a region of the phase space marked as the "asymptotic region". In these regions, the exact trajectory may be well approximated by that of a different, usually "free", Hamiltonian. In a quantum setting, we may replace a trajectory in phase space by one in the state space. It should be noted right away that not all trajectories are of the scattering type, classically or quantum mechanically.

What are the typically observed quantities? In practice, it is not a single particle that is sent in but rather a well collimated beam directed at the target region.
After individual particles emerge, there is a spread in their directions as the initial conditions are not identical. Typically, these are detected at different locations by detectors which have a finite aperture, and therefore the state of the detected particles is also known only approximately. Given the aperture of a detector and its distance to the target, we know the solid angle subtended at the scattering region. Thus, the basic measurements are: (i) $I(\hat n)\,d\Omega$ := number of particles or events detected per second along the direction $\hat n$, within the solid angle $d\Omega$. This may be measured for each species of particle separately or just as the total; (ii) the prepared beam has a flux $J$ := number of incident particles per second per unit transverse area of the beam. Both these numbers are known directly. They are obviously proportional and define: $I(\hat n)\,d\Omega := d\sigma(\hat n, \ldots)\,J$. The $d\sigma(\hat n, \ldots)$ is called the differential cross-section and the $\ldots$ refer to its dependence on other properties such as energy etc. These reflect the averaged attributes of the beam: average energy, momentum, polarization etc. The detection can be made finer by detecting particles in coincidence, selecting specific polarization, charge etc. Depending upon the attributes tagged and included in $I(\hat n)\,d\Omega$, we get exclusive/inclusive cross-sections. If one does not care even about the angle, we have the total cross-section. These are typically used in determining decay rates and, in applications in statistical mechanics, as input from the microscopic dynamics.

In summary, the only information that is used about the beam is the flux, while finer details of the final state are used in defining various differential cross-sections. Thus any cross-section is defined as the ratio above and has dimensions of area. Experimentalists give these measured numbers while a theorist has to choose a favorite model and compute these numbers. For this, we turn to the theoretical formulation of scattering processes in the quantum framework.
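As a small worked illustration of the defining ratio $I(\hat n)\,d\Omega = d\sigma\, J$, the sketch below turns a count rate, a beam flux and a detector geometry into a differential cross-section per unit solid angle. All numbers (count rate, flux, aperture, distance) are made-up values chosen only to show the units at work, the small-aperture estimate $d\Omega \approx \text{area}/\text{distance}^2$ is an assumption of the sketch, and the function names are hypothetical.

```python
BARN_IN_M2 = 1.0e-28  # 1 barn = 1e-28 m^2


def detector_solid_angle(aperture_area_m2, distance_m):
    """Small-aperture approximation: dOmega ~ area / distance^2 (steradian)."""
    return aperture_area_m2 / distance_m ** 2


def differential_cross_section(rate_per_s, flux_per_m2_s, solid_angle_sr):
    """d(sigma)/d(Omega) in m^2 per steradian, from I(n) dOmega = dsigma * J:
    the detected rate divided by the flux gives dsigma, which is then
    divided by the solid angle of the detector."""
    return rate_per_s / (flux_per_m2_s * solid_angle_sr)


# Illustrative (made-up) numbers: a 1 cm^2 aperture at 1 m from the target,
# 2 detected events per second, incident flux 1e32 particles / (m^2 s).
d_omega = detector_solid_angle(1.0e-4, 1.0)
dsigma_domega = differential_cross_section(2.0, 1.0e32, d_omega)
dsigma_domega_barn = dsigma_domega / BARN_IN_M2
```

With these inputs the result is about 2 barn per steradian; the point is only that a rate divided by a flux has the dimensions of an area, as stated above.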
Here are some basic observations.

A. General scattering framework

Let $\mathcal{H}$ be the Hilbert space of the combined system (projectile-target-scattered/emerged). Let $H_0$, $H_0'$ and $H$ be the three Hamiltonian operators governing the time evolution in the remote past, the distant future and all through. For simplicity, we take $H_0' = H_0$, though they could be different. The evolution generated by $H_0$ is said to be "free". All are assumed to have no explicit time dependence and to be self-adjoint. Let their corresponding eigenvalue equations be $H_0 \phi_n = \epsilon_n \phi_n$ and $H \psi_\alpha = \varepsilon_\alpha \psi_\alpha$. We have the most general solutions expressed as,
\[
\psi(t) = \sum_\alpha c_\alpha\, e^{-i\varepsilon_\alpha t}\, \psi_\alpha(0) , \qquad \phi(t) = \sum_n a_n\, e^{-i\epsilon_n t}\, \phi_n(0) ,
\]
\[
\psi_\alpha(0) = \sum_n b_{\alpha,n}\, \phi_n(0) \qquad \because\ \{\psi_\alpha(0)\},\ \{\phi_n(0)\}\ \text{are complete},
\]
\[
\therefore\quad \psi(t) = \sum_{\alpha,n} c_\alpha\, b_{\alpha,n}\, e^{-i\varepsilon_\alpha t}\, \phi_n(0) , \qquad \text{all coefficients are time independent.}
\]
We are interested in those solutions $\psi(t)$ which resemble some free solution as $t \to \pm\infty$. Thus, as $t \to -\infty$,
\[
\psi(t) - \phi(t) \to 0 \ \Rightarrow\ \sum_n e^{-i\epsilon_n t} \left\{ \sum_\alpha c_\alpha\, b_{\alpha,n}\, e^{-i\varepsilon_\alpha t + i\epsilon_n t} - a_n \right\} \phi_n(0) \to 0
\]
\[
\therefore\quad \sum_\alpha c_\alpha\, b_{\alpha,n}\, e^{i(\epsilon_n - \varepsilon_\alpha)t} \to a_n \quad \forall\ n .
\]
The $t$-independence of the coefficients implies $c_\alpha\, b_{\alpha,n}$ (no sum) $= 0$ whenever $\varepsilon_\alpha \ne \epsilon_n$.

So, if $H_0$ and $H$ do not have any eigenvalue in common, then for every $c_{\alpha_*} \ne 0$, $b_{\alpha_*,n} = 0\ \forall\ n$. Hence, no solution $\psi_\alpha(0)$ is approximable by a free evolution. For a sufficiently general solution $\phi(t)$ to approximate some exact solution $\psi(t)$, the spectrum of $H_0$ should be a subset of the spectrum of $H$. Conversely, every solution with $c_\alpha = 0$ for every $\varepsilon_\alpha \notin \mathrm{Spec}(H_0)$ can be asymptotically approximated by some free evolution. Such solutions are called scattering solutions.

Note: Usually, the spectrum of $H_0$ is continuous and bounded below. Its eigenstates span the Hilbert space. The scattering states of $H$ however span only a subspace of $\mathcal{H}$. To span the full Hilbert space we need to include the bound states of $H$.

We have been somewhat informal in the above argument; here are the sharper definitions.
Definition: A solution $|\psi(t)\rangle$ is said to be incoming (outgoing) if $\exists$ a solution $|\phi_{in}(t)\rangle$ ($|\phi_{out}(t)\rangle$) such that
\[
\lim_{t\to-\infty} \|\psi(t) - \phi_{in}(t)\| = 0 \qquad \Big( \lim_{t\to+\infty} \|\psi(t) - \phi_{out}(t)\| = 0 \Big) .
\]
Note: If the prepared beam or the detected scattered particles are to be associated with a state of the full system, then it is necessary that at most one $|\psi(t)\rangle$ is asymptotic to a given $|\phi_{in}(t)\rangle$ or a given $|\phi_{out}(t)\rangle$. Establishing such a property for a model of scattering is called the basic existence and uniqueness problem.

The solutions can be tagged by states at any particular instant, for example "$|\psi(t=0)\rangle$". We identify $\mathcal{H}_{in} := \{\psi \in \mathcal{H}\, /\, \psi(t) \text{ is incoming}\}$ and $\mathcal{H}_{out} := \{\psi \in \mathcal{H}\, /\, \psi(t) \text{ is outgoing}\}$. Potentially it is possible to have (i) $\psi \notin \mathcal{H}_{in}$, $\psi \notin \mathcal{H}_{out}$ (no scattering); (ii) $\psi \in \mathcal{H}_{in}$, but $\psi \notin \mathcal{H}_{out}$ ('capture' process); (iii) $\psi \notin \mathcal{H}_{in}$, but $\psi \in \mathcal{H}_{out}$ ('decay' process) and (iv) $\psi \in \mathcal{H}_{in} \cap \mathcal{H}_{out}$ (scattering).

Definition: A scattering system is weakly asymptotically complete if $\mathcal{H}_{in} = \mathcal{H}_{out}$. It may so happen that, in addition, the subspace of bound states $\mathcal{H}_{bound}$ is such that $\mathcal{H} = \mathcal{H}_{bound} \oplus \mathcal{H}_{in}$ (with $\mathcal{H}_{in} = \mathcal{H}_{out}$). Then the system is said to be asymptotically complete. These distinctions have a bearing on the definition of the $S$-matrix and its unitarity.

Let $U(t)$, $U_0(t)$ denote the unitary evolution operators corresponding to $H$, $H_0$ respectively. Let $|\psi(t)\rangle = U(t - t_0)|\psi(t_0)\rangle$ be an incoming solution. Then $\exists\ |\phi_{in}(t)\rangle = U_0(t - t_0)|\phi_{in}(t_0)\rangle$ ($t_0$ an arbitrarily chosen instant) such that, as $t \to -\infty$,
\[
\|U(t - t_0)\psi(t_0) - U_0(t - t_0)\phi_{in}(t_0)\| \to 0 \quad\therefore\quad \|\psi(t_0) - U^\dagger(t - t_0) U_0(t - t_0)\phi_{in}(t_0)\| \to 0
\]
\[
\therefore\quad \psi(t_0) = \lim_{t\to-\infty} U^\dagger(t - t_0)\, U_0(t - t_0)\, \phi_{in}(t_0) .
\]
Define: $\Omega_+ := \lim_{t\to-\infty} U^\dagger(t - t_0) U_0(t - t_0) = \lim_{t\to-\infty} U^\dagger(t) U_0(t)$. Using $U^\dagger(t) = U(-t)$, we have,
\[
\Omega_+ = \lim_{t\to-\infty} U(-t)\, U_0(t) \quad\text{and}\quad |\psi(t_0)\rangle = \Omega_+ |\phi_{in}(t_0)\rangle \quad \forall\ t_0 .
\]
Likewise, for an asymptotically outgoing solution, we have,
\[
\Omega_- = \lim_{t\to+\infty} U(-t)\, U_0(t) \quad\text{and}\quad |\psi(t_0)\rangle = \Omega_- |\phi_{out}(t_0)\rangle \quad \forall\ t_0 .
\]
The $\Omega_\pm$ are called the Moller operators.
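They map free solutions to asymptotic solutions. The assumption of existence and uniqueness of scattering solutions guarantees the existence of these operators. Their adjoints are defined as $\Omega_+^\dagger := \lim_{t\to-\infty} U_0(-t) U(t)$, $\Omega_-^\dagger := \lim_{t\to+\infty} U_0(-t) U(t)$. The adjoints give $\phi_{in}(t_0) = \Omega_+^\dagger \psi(t_0)$, $\phi_{out}(t_0) = \Omega_-^\dagger \psi(t_0)\ \forall\ t_0$.

It follows that $\psi(t_0) = \Omega_\pm \Omega_\pm^\dagger \psi(t_0)$, $\Omega_+^\dagger \Omega_+ \phi_{in}(t_0) = \phi_{in}(t_0)$ and $\Omega_-^\dagger \Omega_- \phi_{out}(t_0) = \phi_{out}(t_0)$. Since the $\phi_{in/out}$ span the full Hilbert space, the last two equalities imply that $\Omega_\pm^\dagger \Omega_\pm = \mathbb{1}$. The same is not true for the $\psi(t)$'s and hence $\Omega_\pm \Omega_\pm^\dagger \ne \mathbb{1}$. Thus the Moller operators are isometries but not unitary operators. For the special case of asymptotic completeness, we can write $\Omega_\pm \Omega_\pm^\dagger = \mathbb{1} - P_{bound}$, where $P_{bound}$ is the projection operator on $\mathcal{H}_{bound}$.

From the Schrodinger equations satisfied by the $\psi(t)$ and $\phi(t)$, it is easy to see that
\[
\boxed{\ H\,\Omega_\pm = \Omega_\pm H_0 \quad\text{and}\quad H_0\, \Omega_\pm^\dagger = \Omega_\pm^\dagger H\ } .
\]
Indeed, $U(\tau)\, U(-t)\, U_0(t) = U(-(t-\tau))\, U_0(t - \tau)\, U_0(\tau)$, and the limit $t \to \mp\infty$ is unaffected by the finite shift $\tau$, so that $U(\tau)\Omega_\pm = \Omega_\pm U_0(\tau)$; differentiating at $\tau = 0$ gives the first relation and its adjoint gives the second.

Consider now a solution which is both asymptotically incoming and outgoing. Then we have $\psi(t) = \Omega_+ \phi_{in}(t)$ and $\phi_{out}(t) = \Omega_-^\dagger \psi(t)$. Therefore,
\[
\phi_{out}(t) = \Omega_-^\dagger \Omega_+\, \phi_{in}(t) , \qquad \phi_{in}(t) = \Omega_+^\dagger \Omega_-\, \phi_{out}(t) \tag{10.1}
\]
and we define,
\[
S := \Omega_-^\dagger \Omega_+ \quad\text{on}\quad [\mathrm{Range}(\Omega_+)] \cap [\mathrm{Domain}(\Omega_-^\dagger)] \ne \emptyset . \tag{10.2}
\]
Note that the range of $\Omega_+$ consists of states in $\mathcal{H}_{in}$ while the domain of $\Omega_-^\dagger$ consists of states in $\mathcal{H}_{out}$. The scattering operator, $S$, is thus well defined when weak asymptotic completeness holds. Similarly, we define $S^\dagger := \Omega_+^\dagger \Omega_-$, which is also well defined when weak asymptotic completeness holds. If in addition asymptotic completeness holds, then
\[
S^\dagger S = \Omega_+^\dagger \Omega_- \Omega_-^\dagger \Omega_+ = \Omega_+^\dagger (\mathbb{1} - P_{bound}) \Omega_+ = \mathbb{1} - 0 , \tag{10.3}
\]
likewise,
\[
S S^\dagger = \Omega_-^\dagger \Omega_+ \Omega_+^\dagger \Omega_- = \Omega_-^\dagger (\mathbb{1} - P_{bound}) \Omega_- = \mathbb{1} - 0 . \tag{10.4}
\]
Thus the scattering operator is unitary when asymptotic completeness holds. This is regardless of the Moller operators being non-unitary.

The scattering operator $S$ maps free solutions $\phi_{in}(t)$ to free solutions $\phi_{out}(t)$. Its matrix elements in a basis of free solutions define the $S$-matrix.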
Explicitly, let $\phi^a_{in}(t)$ and $\phi^b_{out}(t)$ denote incoming and outgoing free solutions. Then,
\[
S_{ba} := \langle \phi^b_{out}(t)| S |\phi^a_{in}(t)\rangle =: \langle \phi^b_{out}(t)| \tilde\phi^a_{out}(t)\rangle . \tag{10.5}
\]
The state $\tilde\phi^a_{out}(t)$ is the state that evolved from a specific incoming state, while $\phi^b_{out}(t)$ is an arbitrary free state, and $S_{ba}$ is the inner product between the two. Hence, $S_{ba}$ gives the probability amplitude for a free state $|\phi^a_{in}(t)\rangle$ to evolve into an outgoing state $|\phi^b_{out}(t)\rangle$, in short a transition amplitude.

The $S$ operator defined in terms of the Moller operators is manifestly time independent. Its matrix elements however seem to have a time dependence through that of the $\phi_{in/out}(t)$. This apparent time dependence is actually absent, as can be seen as follows. Recall the boxed relations among $\Omega_\pm$, $\Omega_\pm^\dagger$, $H$ and $H_0$. From these we get,
\[
S H_0 = \Omega_-^\dagger \Omega_+ H_0 = \Omega_-^\dagger H\, \Omega_+ = H_0\, \Omega_-^\dagger \Omega_+ = H_0 S ,
\]
\[
i d_t S_{ba} = i d_t \big( \langle \phi^b_{out}(t)| S |\phi^a_{in}(t)\rangle \big) = -\langle \phi^b_{out}(t)| H_0 S |\phi^a_{in}(t)\rangle + 0 + \langle \phi^b_{out}(t)| S H_0 |\phi^a_{in}(t)\rangle = \langle \phi^b_{out}(t)| [S, H_0] |\phi^a_{in}(t)\rangle = 0 .
\]
In summary:

• A generic scattering system has $\mathcal{H}$ and the three Hamiltonians $H$, $H_0$, $H_0'$ specified;

• There exist asymptotic subspaces $\mathcal{H}_{in}$, $\mathcal{H}_{out}$ of $\mathcal{H}$, elements of which give full solutions approaching some free solutions as $t \to \pm\infty$;

• Provided the system admits scattering states, there exist the Moller operators $\Omega_\pm$ on $\mathcal{H}$ with ranges in $\mathcal{H}_{in/out}$ respectively;

• If weak asymptotic completeness holds, $\exists$ the scattering operator $S : \mathcal{H}_{in} \to \mathcal{H}_{out}$ and its adjoint $S^\dagger$. The scattering operator $S$ is unitary provided asymptotic completeness holds.

• Once the $S_{ba}$ are known, the cross-section can be computed, since the number of scattered particles is proportional to $|S_{ba}|^2$, while the flux of the incoming particles is given by the probability current corresponding to $\phi_{in}(t)$.

• Much of the rigorous analysis goes into addressing the existence and uniqueness properties of the proposed scattering system. Further details may be seen in [9, 10].
In practice, a scattering system is rarely specified in sufficient detail to address these issues, but they are at the very basic definitional level.

Returning to interacting quantum fields, we have the candidate free states: the states of free quantum fields with known free Hamiltonian $H_0$. We may specify the interacting system by additional Lorentz invariant terms in the Lagrangian or by specifying the Hamiltonian $H$. However, we do not know what the Hilbert space of interacting quantum fields is, let alone whether it admits scattering states. But we do know what properties we would like to have for a scattering interpretation. The task now is to make suitable assumptions and build a computational recipe to define and compute the $S$-matrix elements.

B. Scattering with quantum fields: Heisenberg Picture

The above discussion has been in terms of the states evolving in time, i.e. in the Schrodinger picture. For interacting quantum fields we do not know the state space (except for the free fields). We do have Poincare covariance as a guide, and this puts the field operators and their Poincare transforms at center stage. We are thus led to use the Heisenberg picture and formulate the recipe in terms of the evolving operators. As states do not evolve in the Heisenberg picture, we postulate certain properties for the states, postulate Poincare covariance of the operator evolution and make the specific additional assumption of an "asymptotic condition" to incorporate the scattering feature. A reference for this material is [11].

1. We choose the basis states of the interacting quantum fields to be labeled by mutually commuting, conserved Noether charges whose existence is guaranteed by the presumed symmetries. Among these symmetries is included the Poincare group (proper, orthochronous with/without discrete symmetries). The Poincare symmetry immediately gives eigenvalues of $P^\mu$ as a set of labels. Our first assumption is regarding the spectrum of $P^\mu$.

2.
• There exists a unique vacuum, the eigenstate $|0\rangle$ with $P^\mu |0\rangle = 0$;

• Eigenvalues, $p^\mu$, of $P^\mu$ lie in the forward light cone, i.e. $p^2 \le 0$, $p^0 > 0$;

• $\exists$ stable, single particle states with masses $m_i$: $P^\mu |p_i\rangle = p_i^\mu |p_i\rangle$, $p_i^2 = -m_i^2$.

• The eigenvalues $0$ and $-m_i^2$ of $P^2$ are discrete in the sense of being isolated values. The "multi-particle" states have their total momentum time-like, but the corresponding $-p^2$ can take any value from the square of their total rest mass upwards.

[FIG. 1: Spectrum for a single, massive scalar field, drawn in the ($\vec p$, $P^0$) plane with the hyperboloids $p^2 = -m^2$ and $p^2 = -4m^2$ and the continuum marked. The two particle states form a continuum above and including the $-(p_1 + p_2)^2 = 4m^2$ hyperboloid.]

Note: If there are "internal symmetries" (i.e. non-space-time symmetries), then it is conceivable that the zero eigenvalue of $P^\mu$ is degenerate and carries some representation of the internal symmetry group (usually compact). Every one of these states must still be a singlet of the Poincare group. But now, these vacua themselves carry some quantum numbers, and quantum fields acting on these will create single particle states labeled by tensor products of representations of the internal symmetry group.
This is not done and one stipulates the vacuum to be a singlet under all symmetries.

Note: We have taken the stable particles to have a non-zero mass. This is a statement about the spectrum of $P^2$ in the full, exact interacting quantum field theory and not in some approximation of it. This would seem to exclude the massless representations of the Poincare group for interacting quantum fields. In practice, i.e. in perturbation theory, we will use massless photons and see the accompanying infrared problems.

3. • An interacting field operator, denoted schematically by $\Phi(x)$, satisfies an operator equation of the form $(\Box - m_0^2)\Phi(x) = J(x)$, where $J(x)$ is an operator built out of $\Phi$ and possibly other field operators. We re-write the field equation as
\[
(\Box - m^2)\Phi(x) = J(x) - (m^2 - m_0^2)\Phi(x) =: \tilde J(x) .
\]

• The field obeys the equal time (anti-)commutation relations:
\[
[\Phi(t, \vec x), \Phi(t, \vec y)] = 0 = [\Pi(t, \vec x), \Pi(t, \vec y)] , \qquad [\Phi(t, \vec x), \Pi(t, \vec y)] = i\delta^3(\vec x - \vec y) .
\]
If $J(x)$ has no derivatives of fields, the momentum field is $\Pi(t, \vec x) = \dot\Phi(t, \vec x)$. This stipulates the quantized nature of the field while the equation of motion stipulates the nature of the interaction.

• $\exists$ another set of quantized fields, $\Phi_{in}(x)$, $\Phi_{out}(x)$, built out of the interacting field $\Phi(x)$ and satisfying the same set of equal time (anti-)commutation relations but obeying free field equations: $(\Box - m^2)\Phi_{in/out} = 0$, where $m$ is the "physical" mass. The in and out fields thus have the same mode decomposition, introducing the creation/annihilation operators which create/destroy stable particles with physical mass $m$. These fields also transform the same way as the field $\Phi(x)$ under the symmetries, in particular, $\Phi_{in}(x + a) = e^{ia\cdot P}\, \Phi_{in}(x)\, e^{-ia\cdot P}$.

4. Asymptotic conditions: The exact interacting quantum field $\Phi(x)$ is linked to the in and out fields by the asymptotic conditions,
\[
\Phi(x)\big|_{t\to-\infty} \longrightarrow \sqrt{Z}\, \Phi_{in}(x) , \qquad \Phi(x)\big|_{t\to\infty} \longrightarrow \sqrt{Z}\, \Phi_{out}(x) .
\]
Here $Z$ is a possible normalization constant which we will show to be necessarily different from 1.
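Note: The asymptotic conditions are not operator equations but are weak equations, i.e. they hold for arbitrary matrix elements in the basis of normalized states.

Below we elaborate the conditions and note a number of properties.

Asymptotic conditions are consistent with Poincare covariance: A formal solution of the equation satisfied by $\Phi(x)$ is given by,
\[
\Phi(x) = \int d^4y\; G_{ret}(x - y, m)\, \tilde J(y) + \sqrt{Z}\, \Phi_{in}(x) . \tag{10.6}
\]
As $t \to -\infty$, the first term vanishes (a property of the retarded Green function), recovering the asymptotic condition. This is a heuristic argument, since $\tilde J(y)$ is built out of $\Phi$ and some iterative procedure is implicit. The consistency with covariance is seen as,
\[
\begin{aligned}
\sqrt{Z}\, \Phi_{in}(x - a) &= \Phi(x - a) - \int d^4y\; G_{ret}(x - a - y, m)\, \tilde J(y) \\
&= e^{-ia\cdot P}\, \Phi(x)\, e^{ia\cdot P} - \int d^4y\; G_{ret}(x - y, m)\, \tilde J(y - a) \\
&= e^{-ia\cdot P} \left[ \Phi(x) - \int d^4y\; G_{ret}(x - y, m)\, \tilde J(y) \right] e^{ia\cdot P} = \sqrt{Z}\, e^{-ia\cdot P}\, \Phi_{in}(x)\, e^{ia\cdot P} .
\end{aligned}
\]
Note: Only $G_{ret}$ is used in the formal solution and not $G_{Feynman}$.

$\Phi_{in}(x)$ creates states with physical mass $m$: The infinitesimal form of the transformation law for $\Phi_{in}(x)$ implies $[P^\mu, \Phi_{in}(x)] = -i\partial^\mu \Phi_{in}(x)$, and we also have the equation of motion $(\Box - m^2)\Phi_{in} = 0$. For any physical state $|k, \alpha\rangle$, $P^\mu |k, \alpha\rangle = k^\mu |k, \alpha\rangle$. Therefore,
\[
\begin{aligned}
-i\partial^\mu \langle k, \alpha| \Phi_{in}(x) |0\rangle &= \langle k, \alpha| [P^\mu, \Phi_{in}(x)] |0\rangle = k^\mu \langle k, \alpha| \Phi_{in}(x) |0\rangle \\
\therefore\quad -\Box\, \langle k, \alpha| \Phi_{in}(x) |0\rangle &= k^2\, \langle k, \alpha| \Phi_{in}(x) |0\rangle \\
\therefore\quad (-\Box + m^2)\, \langle k, \alpha| \Phi_{in}(x) |0\rangle &= 0 = (k^2 + m^2)\, \langle k, \alpha| \Phi_{in}(x) |0\rangle \ \Rightarrow\ k^2 = -m^2 .
\end{aligned}
\]
Thus, $\Phi_{in}(x)$ behaves exactly like a free field producing physical states from the vacuum, and it can be taken to have the usual mode decomposition,
\[
\Phi_{in}(x) = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \left[ a_{in}(\vec k)\, e^{ik\cdot x} + a_{in}^\dagger(\vec k)\, e^{-ik\cdot x} \right]
\]
\[
\partial_0 \Phi_{in}(x) = \int \frac{d^3k}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \left[ -i\omega_{\vec k}\, a_{in}(\vec k)\, e^{ik\cdot x} + i\omega_{\vec k}\, a_{in}^\dagger(\vec k)\, e^{-ik\cdot x} \right] \ \Rightarrow
\]
\[
a_{in}^\dagger(\vec k) = \frac{-i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; e^{ik\cdot x} \overleftrightarrow{\partial_0}\, \Phi_{in}(x) , \qquad f \overleftrightarrow{\partial} g := f\,\partial g - \partial(f)\, g . \tag{10.7}
\]
\[
\text{And,}\quad a_{in}(\vec k) = \frac{i}{\sqrt{2\omega_{\vec k}(2\pi)^3}} \int d^3x\; e^{-ik\cdot x} \overleftrightarrow{\partial_0}\, \Phi_{in}(x) .
\]
Note: In writing the inversion formula, we have done the spatial integration over a $t =$ constant hypersurface $\Sigma_t$.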
The choice of this hypersurface does not matter ($a^\dagger_{in}(\vec{k})$ is independent of $t$). This is kept implicit in all the inversion formulae below. It follows immediately that $\langle 0|\Phi_{in}(x)|0\rangle = 0$ and,
$$\langle k,\alpha|\Phi_{in}(x)|0\rangle = \langle k,\alpha|e^{-ix\cdot P}\Phi_{in}(0)e^{ix\cdot P}|0\rangle = e^{-ik\cdot x}\langle k,\alpha|\Phi_{in}(0)|0\rangle,$$
$$\langle k,\alpha|\Phi_{in}(0)|0\rangle = \frac{1}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}} \ \Rightarrow\ \langle k,\alpha|\Phi_{in}(x)|0\rangle = \frac{e^{-ik\cdot x}}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}.$$
Suffice it to say that corresponding expressions exist for $\Phi_{out}$. In particular the heuristic expression takes the form,
$$\Phi(x) = \sqrt{Z}\,\Phi_{out}(x) + \int d^4y\, G_{adv}(x-y, m^2)\,\tilde{J}(y).$$
How far can we get with Lorentz covariance and the assumptions on the spectrum?

C. Kallen-Lehmann Representation

Consider the vacuum expectation value of the commutator of the interacting field, $i\Delta'(x,x') := \langle 0|[\Phi(x), \Phi(x')]|0\rangle$. We had evaluated this for the free fields while considering the spin-statistics, $\sim \Delta_+(x-x')$. We evaluate it as,
$$i\Delta'(x,x') = \sum_n \langle 0|\Phi(x)|n\rangle\langle n|\Phi(x')|0\rangle - (x \leftrightarrow x'), \tag{10.8}$$
$$= \sum_n \langle 0|\Phi(0)e^{ix\cdot P}|n\rangle\langle n|e^{-ix'\cdot P}\Phi(0)|0\rangle - (x \leftrightarrow x'), \tag{10.9}$$
$$= \sum_n |\langle 0|\Phi(0)|n\rangle|^2\left[e^{i(x-x')\cdot k_n} - e^{-i(x-x')\cdot k_n}\right], \quad \text{a function of } (x-x'). \tag{10.10}$$
The states $|n\rangle$ contain the vacuum, the single particle states on the isolated hyperboloid $k_n^2 = -m^2$ as well as the multi-particle continuum. To collect terms with the same total $k_n^\mu$, insert $1 = \int d^4q\,\delta^4(q-k_n)$ and write,
$$\Delta'(x-x') = -\frac{i}{(2\pi)^3}\int d^4q \sum_n (2\pi)^3\delta^4(q-k_n)|\langle 0|\Phi(0)|n\rangle|^2\left[e^{i(x-x')\cdot k_n} - e^{-i(x-x')\cdot k_n}\right]$$
$$:= -\frac{i}{(2\pi)^3}\int d^4q \left[e^{i(x-x')\cdot q} - e^{-i(x-x')\cdot q}\right]\rho(q), \quad \text{where,}$$
$$\rho(q) := (2\pi)^3\sum_n \delta^4(q-k_n)|\langle 0|\Phi(0)|n\rangle|^2 \quad \text{(the spectral density).}$$
Claim: The spectral density is a scalar function of $q^2$.

Proof: Under a Lorentz transformation $U(\Lambda)$, we have $U(\Lambda)|0\rangle = |0\rangle$ and $U(\Lambda)\Phi(0)U^{-1}(\Lambda) = \Phi(\Lambda 0) = \Phi(0)$. Hence,
$$\rho(q) = (2\pi)^3\sum_n \delta^4(q-k_n)|\langle 0|\Phi(0)U(\Lambda)|n\rangle|^2.$$
Next,
$$\delta^4(k) \sim \int d^4x\, e^{ik\cdot x} = \int d^4(\Lambda x)\, e^{ik\cdot(\Lambda x)} = \int d^4x\, e^{i(\Lambda^{-1}k)\cdot x} \sim \delta^4(\Lambda^{-1}k).$$
$$\therefore\ \rho(q) = (2\pi)^3\sum_n \delta^4(\Lambda^{-1}(q-k_n))|\langle 0|\Phi(0)U(\Lambda)|n\rangle|^2.$$
Let $U(\Lambda)|n\rangle = |m\rangle$. Then $U^{-1}(\Lambda)P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu$.
We have $P^\mu|m\rangle = k_m^\mu|m\rangle$ and we already had $P^\mu|n\rangle = k_n^\mu|n\rangle$. Acting with the above relation on $|n\rangle$ then implies $k_m^\mu\,U^{-1}(\Lambda)|m\rangle = \Lambda^\mu{}_\nu k_n^\nu|n\rangle$, or $k_m^\mu = \Lambda^\mu{}_\nu k_n^\nu$. Lowering the Lorentz index gives $(k_m)_\mu = (k_n)_\nu(\Lambda^{-1})^\nu{}_\mu$. Replacing the sum over $n$ by that over $m$ gives,
$$\rho(q) = (2\pi)^3\sum_m \delta^4(k_m - \Lambda^{-1}q)|\langle 0|\Phi(0)|m\rangle|^2 = \rho(\Lambda^{-1}q).$$
Thus, $\rho(q)$ can only depend on $q$ through $q^2$. Since the physical spectrum has $q^2 < 0$ with $q^0 > 0$, we write $\rho(q) := \theta(q^0)\rho(q^2)$, with $\rho(q^2) = 0$ for $q^2 > 0$. Substituting in $\Delta'$ gives,
$$\Delta'(x-x') = \frac{-i}{(2\pi)^3}\int d^4q\,\rho(q^2)\theta(q^0)\left[e^{iq\cdot(x-x')} - e^{-iq\cdot(x-x')}\right]$$
$$= \int_0^\infty d\sigma^2\,\rho(\sigma^2)\left[\frac{-i}{(2\pi)^3}\int d^4q\,\delta(q^2-\sigma^2)\,\epsilon(q^0)\,e^{iq\cdot(x-x')}\right].$$
$$\therefore\ \Delta'(x-x') = \int_0^\infty d\sigma^2\,\rho(\sigma^2)\,\Delta(x-x',\sigma^2). \tag{10.11}$$
Here we have introduced the function $\epsilon(x) = \theta(x) - \theta(-x)$ to combine the two exponentials and also recognized that the square bracket is just $-i[\Phi_{in}(x), \Phi_{in}(x')]$ for $m^2 = \sigma^2$. The last equation is known as the Kallen-Lehmann representation or the spectral representation for the commutator function $\Delta'(x-x')$ of the exact interacting quantum fields. It has followed from Poincare covariance (separation of the $x$-dependence), the assumptions regarding the spectrum of the interacting system ($\rho(q^2 > 0) = 0$) and the normalizations used for $\Phi_{in}$ in identifying $\Delta(x-x',\sigma^2)$.

We can separate the contribution of the 1-particle states at $\sigma^2 = m^2$. We obtain $\langle 0|\Phi(x)|k\rangle$ using the asymptotic condition,
$$\langle 0|\Phi(x)|k\rangle = \sqrt{Z}\langle 0|\Phi_{in}(x)|k\rangle + \int d^4y\, G_{ret}(x-y, m)\langle 0|\tilde{J}(y)|k\rangle,$$
but
$$\langle 0|\tilde{J}(y)|k\rangle = \langle 0|(\Box_y - m^2)\Phi(y)|k\rangle = (\Box_y - m^2)\langle 0|\Phi(0)|k\rangle e^{ik\cdot y} = (-k^2 - m^2)\langle 0|\Phi(0)|k\rangle e^{ik\cdot y} = 0.$$
The first term then gives,
$$\langle 0|\Phi(x)|k\rangle = \sqrt{Z}\langle 0|\Phi_{in}(x)|k\rangle = \sqrt{Z}\,\frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}} \ \Rightarrow\ |\langle 0|\Phi(x)|k\rangle|^2 = \frac{Z}{2\omega_{\vec{k}}(2\pi)^3}.$$
$$\therefore\ \rho(q^2)\big|_{1\text{-particle}} = (2\pi)^3\int d^3k\,\delta^4(k-q)\frac{Z}{2\omega_{\vec{k}}(2\pi)^3} = Z\,\delta(q^2-m^2)\,\theta(q^0) \tag{10.12}$$
$$\therefore\ \Delta'(x-x') = Z\,\Delta(x-x',m^2) + \int_{m_1^2}^\infty d\sigma^2\,\rho(\sigma^2)\,\Delta(x-x',\sigma^2). \tag{10.13}$$
The lower limit on the integral is the smallest invariant mass of the multi-particle continuum and $m_1^2 > m^2$ is assumed.

We can now derive a bound on $Z$. Observe that
$$\lim_{t'\to t}\left(i\partial_t\Delta'(x-x')\right) = \langle 0|[\dot{\Phi}(x), \Phi(x')]|0\rangle = -i\delta^3(\vec{x}-\vec{x}') = \lim_{t'\to t}\left(i\partial_t\Delta(x-x',m^2)\right).$$
We are using the assumption that there are no derivative terms in $J(x)$ or equivalently, $\Pi(x) = \dot{\Phi}(x)$, and noting that the in/out fields satisfy the same commutation relations. Applying $i\partial_t$ to eq.(10.13) and taking the limit $t' \to t$, we deduce
$$-i\delta^3(\vec{x}-\vec{x}') = Z\left(-i\delta^3(\vec{x}-\vec{x}')\right) + \int_{m_1^2}^\infty d\sigma^2\,\rho(\sigma^2)\left(-i\delta^3(\vec{x}-\vec{x}')\right)$$
$$\text{Or,}\quad 1 = Z + \int_{m_1^2}^\infty d\sigma^2\,\rho(\sigma^2) \ \Longrightarrow\ 0 \le Z < 1. \tag{10.14}$$
The inequalities on $Z$ can be understood as follows. Since the asymptotic condition is needed to link $\Phi(x)$ with $\Phi_{in/out}(x)$, $Z \neq 0$. To understand the upper limit, notice that the free fields can only produce 1-particle states since they are linear in creation/annihilation operators. The interacting field has no such restriction and can produce multi-particle states as well. Hence the probability for it to produce 1-particle states is less than that for the free fields, i.e. $|\langle 0|\Phi(x)|k\rangle|^2 < |\langle 0|\Phi_{in/out}(x)|k\rangle|^2$. This results in $Z < 1$. In principle the $Z$ factors for the in and out fields could have been different in the asymptotic conditions. However, following identical steps as above would lead us to the same relation (10.13), implying equality of the $Z$'s. An important point to note is that $Z < 1$ has a physical reason, quite independent of any infinities and their renormalization that we will encounter later on.
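The sum rule (10.14) can be illustrated numerically. The following is a minimal sketch under stated assumptions: the continuum spectral density `toy_rho`, its threshold `s0` (playing the role of $m_1^2$) and its strength `c` are hypothetical choices made only so that the integral is finite and known in closed form; they do not come from any particular field theory.

```python
import math


def continuum_weight(rho, s0, n=200_000):
    """Integrate rho(s) over s in [s0, infinity).

    Uses the substitution u = s0 / s, which maps the range to (0, 1],
    followed by the midpoint rule with n sub-intervals.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        s = s0 / u
        # ds = (s0 / u**2) du after the change of variables
        total += rho(s) * (s0 / (u * u)) * h
    return total


def toy_rho(s, s0=4.0, c=0.3):
    """Hypothetical continuum density, zero below the threshold s0.

    Chosen so that its integral over [s0, infinity) equals 2*c/3.
    """
    return c * s0 * math.sqrt(1.0 - s0 / s) / (s * s)


def z_from_sum_rule(rho, s0):
    """Solve 1 = Z + (integral of rho over the continuum) for Z."""
    return 1.0 - continuum_weight(rho, s0)


if __name__ == "__main__":
    z = z_from_sum_rule(toy_rho, 4.0)
    print(f"Z = {z:.6f}")
```

For this toy density the continuum weight is $2c/3 = 0.2$, so the sketch gives $Z \approx 0.8$, inside the allowed range $0 \le Z < 1$. Any non-negative density with a finite integral gives $Z < 1$ in the same way.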
Remark: In the Kallen-Lehmann spectral representation we considered the vacuum expectation value of the commutator of the fully interacting quantum field, presumed to satisfy the usual canonical commutation relations. Because of this, the interacting quantum field gets identified with the so called "bare" quantum field in the renormalized perturbation theory. We could instead use a "renormalized" quantum field, $\Phi_R(x) := \Phi(x)/\sqrt{Z}$, and correspondingly define $\Delta'_R(x-x') := Z^{-1}\Delta'(x-x')$. It follows that $\rho_R(q^2) = Z^{-1}\rho(q^2)$ and $[\Phi_R(x), \dot{\Phi}_R(x')] = iZ^{-1}\delta^3(\vec{x}-\vec{x}')$. In terms of the renormalized quantities, there is no factor of $Z$ in equation (10.13). The manipulations leading to the bound on $Z$ will now give,
$$\lim_{t'\to t}\left(i\partial_t\Delta'_R(x-x')\right) = \langle 0|[\dot{\Phi}_R(x), \Phi_R(x')]|0\rangle = -iZ^{-1}\delta^3(\vec{x}-\vec{x}') = Z^{-1}\lim_{t'\to t}\left(i\partial_t\Delta(x-x',m^2)\right),$$
and eq.(10.13) will lead to $1 = Z\left[1 + \int_{m_1^2}^\infty d\sigma^2\,\rho_R(\sigma^2)\right]$, implying again that $Z < 1$. These bounds are true provided the spectral integral is finite. In perturbation theory, the two point functions (commutator or the Feynman propagator) of the unrenormalized fields are ultraviolet divergent and the bound cannot be inferred (see section 10.7 of [1]). The Kallen-Lehmann representation eq.(10.11) itself does not depend on the canonical commutation relations for the interacting field and can indeed be derived for composite fields as well.

D. The S-matrix and its properties

Having the $\Phi_{in/out}(x)$ and their corresponding mode expansions at our disposal, we can now define the in/out states as $|k\rangle_{in} := a^\dagger_{in}(\vec{k})|0\rangle$ and $|k\rangle_{out} := a^\dagger_{out}(\vec{k})|0\rangle$. Similarly, multi-particle states can be defined. We can form wave packets for normalizable states and take the limit of infinite sharpness at the end. We bypass these steps and work with the momentum eigenstates. Note that these states are time independent.
While we can certainly define the general in/out states and take their spans, we need to (and do) make a further assumption that these spans generate the full Hilbert space of the interacting quantum fields$^{10}$. The sets of in and out states just constitute two orthonormal bases, conveniently transforming by representations of the Poincare group. This guarantees the existence of a unitary operator connecting the two orthonormal bases. This is our scattering operator in this Heisenberg picture formulation. Taking the basis elements to be generated by monomials of $a^\dagger_{in}(\vec{k},\sigma,\dots)$ and $a^\dagger_{out}(\vec{k},\sigma,\dots)$, the S-matrix is defined as,
$$S_{\beta\alpha} := \langle\beta\ out|\alpha\ in\rangle =: \langle\beta\ in|S|\alpha\ in\rangle \tag{10.15}$$
Note that $\alpha, \beta$ label basis states which are created by monomials in $a^\dagger_{in}$, $a^\dagger_{out}$. The definition of the $S$ operator shows that it preserves the labels: $\langle\beta\ out| =: \langle\beta\ in|S$. We stipulate some conditions on the S-matrix.

Footnote 10: This automatically precludes bound states in the interacting system and amounts to postulating the asymptotic completeness property mentioned earlier.

Stability of the vacuum and 1-particle states: We expect the vacuum state to suffer no scattering spontaneously, except maybe acquiring a phase (uniqueness of the vacuum allows a phase since rays are defined up to phases). We choose the phase to be 1, and stipulate $S_{00} = 1$. Likewise the presumed stability of single particle states also disallows any spontaneous change (nothing to scatter against). Thus we stipulate that $S_{k',k} = \delta^3(\vec{k}'-\vec{k})$. Scattering takes place with multi-particle states.

Claim: $\Phi_{out}(x) = S^{-1}\Phi_{in}(x)S$.

Proof: Consider $\langle\beta\ out|\Phi_{out}(x)|\alpha\ in\rangle$ and evaluate it in two ways. First, $\langle\beta\ out|\Phi_{out}(x)|\alpha\ in\rangle = (\langle\beta\ in|S)\Phi_{out}(x)|\alpha\ in\rangle$. Next, $\langle\beta'\ out| := \langle\beta\ out|\Phi_{out}(x)$ is a linear combination of the out-basis elements. Each of these can be expressed as the corresponding in-basis element times $S$. And the in-basis element is also obtained by the action of $\Phi_{in}(x)$. In equations,
$$\langle\beta'\ out| = \sum_\gamma C_\gamma\langle\gamma\ out| = \sum_\gamma C_\gamma(\langle\gamma\ in|S) = \left(\sum_\gamma C_\gamma\langle\gamma\ in|\right)S =: \langle\beta'\ in|S = (\langle\beta\ in|\Phi_{in}(x))S.$$
Taking the inner product of the two expressions with $|\alpha\ in\rangle$ gives,
$$\langle\beta\ out|\Phi_{out}(x)|\alpha\ in\rangle = \langle\beta\ in|S\,\Phi_{out}(x)|\alpha\ in\rangle = \langle\beta\ in|\Phi_{in}(x)\,S|\alpha\ in\rangle \quad \text{for all } \alpha, \beta.$$
Completeness of the in-basis implies $\Phi_{out}(x) = S^{-1}\Phi_{in}(x)S$, proving the claim.

Claim: Covariance of the in/out fields gives invariance of the scattering operator.

Proof: We have,
$$U(\Lambda,a)\Phi_{in}(x)U^{-1}(\Lambda,a) = \Phi_{in}(\Lambda x+a), \quad U(\Lambda,a)\Phi_{out}(x)U^{-1}(\Lambda,a) = \Phi_{out}(\Lambda x+a) \quad \text{and} \quad \Phi_{out}(x) = S^{-1}\Phi_{in}(x)S$$
$$\therefore\ U(S^{-1}\Phi_{in}(x)S)U^{-1} = S^{-1}\Phi_{in}(\Lambda x+a)S = S^{-1}U\Phi_{in}(x)U^{-1}S \tag{10.16}$$
$$\therefore\ U^{-1}SUS^{-1}\Phi_{in}(x) = \Phi_{in}(x)U^{-1}SUS^{-1} \ \Rightarrow\ U^{-1}SUS^{-1} = 1 \ \text{ or } \ S\,U(\Lambda,a) = U(\Lambda,a)\,S. \tag{10.17}$$
Question: Are the Schrodinger picture definition of the scattering matrix element (10.5) and the Heisenberg picture scattering matrix element defined using $\Phi_{out} = S^{-1}\Phi_{in}S$ related? If so, how?

We now relate the S-matrix elements to vacuum expectation values of time ordered products of the interacting fields: the Lehmann-Symanzik-Zimmermann (LSZ) reduction formulae. This is followed by the covariant perturbation series leading to Feynman rules. We will first detail these steps for the notationally simpler case of a scalar field and then summarize the corresponding steps for the Dirac and the Maxwell fields.

E. Lehmann-Symanzik-Zimmermann Reduction of S-matrix

We have defined the S-matrix elements using the in and out states which form the basis elements created by monomials in $a^\dagger_{in}(\vec{k},\dots)$ and $a^\dagger_{out}(\vec{k},\dots)$ acting on the unique vacuum. The in and out fields are not independent or uncorrelated; they are linked through the interacting field $\Phi(x)$ via the asymptotic condition. The LSZ reduction of the S-matrix elements expresses them in terms of the interacting field explicitly. For simplicity, let us first consider the case of an interacting Hermitian scalar field.

1. LSZ reduction for Klein-Gordon field

Consider a matrix element of the form $\langle\beta\ out|\alpha,k\ in\rangle$ where we have separated the $k$ label in the in-state. The definition gives,
$$\langle\beta\ out|\alpha,k\ in\rangle = \langle\beta\ out|a^\dagger_{in}(\vec{k})|\alpha\ in\rangle = \langle\beta\ out|\left[a^\dagger_{in}(\vec{k}) - a^\dagger_{out}(\vec{k}) + a^\dagger_{out}(\vec{k})\right]|\alpha\ in\rangle.$$
The $a^\dagger_{out}(\vec{k})$ acting on the out-state will remove a particle with label $\vec{k}$ if it is present in the label set $\beta$, or will annihilate the out-state. For simplicity, let us assume that there is no such state in the $\beta$ label. Then we have just added and subtracted a 0. For the first two terms in the brackets, use the inversion formula (10.7):
$$\langle\beta\ out|\alpha,k\ in\rangle = \langle\beta\ out|\left[\frac{-i}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\int d^3x\, e^{ik\cdot x}\overleftrightarrow{\partial_0}\Phi_{in}(x) + \frac{+i}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\int d^3x\, e^{ik\cdot x}\overleftrightarrow{\partial_0}\Phi_{out}(x)\right]|\alpha\ in\rangle.$$
$$[\dots] = \left(\lim_{t\to+\infty} - \lim_{t\to-\infty}\right)\int d^3x\,\frac{+i}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\,e^{ik\cdot x}\overleftrightarrow{\partial_0}\frac{\Phi(x)}{\sqrt{Z}}$$
$$\begin{aligned}
\therefore\ \sqrt{Z}\,[\dots] &= \frac{i}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\int d^4x\,\partial_0\left\{e^{ik\cdot x}\overleftrightarrow{\partial_0}\Phi(x)\right\} \\
&= \frac{i}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\int d^4x\left[e^{ik\cdot x}\partial_0^2\Phi(x) - \partial_0^2(e^{ik\cdot x})\Phi(x)\right], \qquad \left(-\partial_0^2 e^{ik\cdot x} = (-\nabla^2+m^2)e^{ik\cdot x}\right) \\
&= \frac{i}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\int d^4x\, e^{ik\cdot x}\left[\partial_0^2\Phi(x) + (-\nabla^2+m^2)\Phi(x)\right], \qquad (\nabla^2 \text{ flipped onto } \Phi(x)).
\end{aligned}$$
$$\therefore\ \langle\beta\ out|\alpha,k\ in\rangle = \frac{-i}{\sqrt{Z}}\int d^4x\,\frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\,(\Box_x - m^2)\langle\beta\ out|\Phi(x)|\alpha\ in\rangle.$$
Now let us separate a particle with label $k'$ from the out-state and write $\langle\beta\ out| = \langle\gamma,k'\ out|$ in the inner product on the right hand side of the above equation. Consider,
$$\langle\gamma,k'\ out|\Phi(x)|\alpha\ in\rangle = \langle\gamma\ out|a_{out}(\vec{k}')\Phi(x)|\alpha\ in\rangle = \langle\gamma\ out|\left(a_{out}(\vec{k}')\Phi(x) - \Phi(x)a_{in}(\vec{k}') + \Phi(x)a_{in}(\vec{k}')\right)|\alpha\ in\rangle$$
As before, for simplicity, assume the $\alpha$-label does not contain $k'$ and thus drop the third term in the brackets. Notice that we have added and subtracted the zero term with $a_{in}(\vec{k}')$ to the right of $\Phi(x)$ while $a_{out}(\vec{k}')$ is naturally to the left of $\Phi(x)$. We need to maintain this order as we do not have the commutation relations between the in/out fields and the interacting field. As before, using the inversion formula (10.7) and the asymptotic condition, we get
$$\langle\gamma,k'\ out|\Phi(x)|\alpha\ in\rangle = \langle\gamma\ out|\left[\frac{i}{\sqrt{2\omega_{\vec{k}'}(2\pi)^3}}\int d^3x'\, e^{-ik'\cdot x'}\left\{\overleftrightarrow{\partial_0'}\Phi_{out}(x')\,\Phi(x) - \Phi(x)\,\overleftrightarrow{\partial_0'}\Phi_{in}(x')\right\}\right]|\alpha\ in\rangle$$
$$\therefore\ \sqrt{Z}\,[\dots] = \frac{i}{\sqrt{2\omega_{\vec{k}'}(2\pi)^3}}\int d^3x'\left\{\lim_{t'\to\infty}\left(e^{-ik'\cdot x'}\overleftrightarrow{\partial_0'}\Phi(x')\right)\Phi(x) - \lim_{t'\to-\infty}\Phi(x)\left(e^{-ik'\cdot x'}\overleftrightarrow{\partial_0'}\Phi(x')\right)\right\}$$
It is convenient to define a time ordering instruction. Define a time ordered product as:
$$T\{A(t_1,\vec{x})B(t_2,\vec{y})\} := \theta(t_1-t_2)A(t_1,\vec{x})B(t_2,\vec{y}) + \theta(t_2-t_1)B(t_2,\vec{y})A(t_1,\vec{x}).$$
Notice that the two limits already have the time ordering incorporated. Hence we can combine the terms and write,
$$\begin{aligned}
\sqrt{Z}\,[\dots] &= \frac{i}{\sqrt{2\omega_{\vec{k}'}(2\pi)^3}}\int d^3x'\left(\lim_{t'\to\infty} - \lim_{t'\to-\infty}\right)e^{-ik'\cdot x'}\overleftrightarrow{\partial_0'}\,T\{\Phi(x')\Phi(x)\} \\
&= \frac{i}{\sqrt{2\omega_{\vec{k}'}(2\pi)^3}}\int d^4x'\,\partial_0'\left[e^{-ik'\cdot x'}\overleftrightarrow{\partial_0'}\,T\{\Phi(x')\Phi(x)\}\right]
\end{aligned}$$
Proceeding exactly as before we get,
$$\langle\gamma,k'\ out|\alpha,k\ in\rangle = \left(\frac{-i}{\sqrt{Z}}\right)^2\int d^4x'\,\frac{e^{-ik'\cdot x'}}{\sqrt{2\omega_{\vec{k}'}(2\pi)^3}}\int d^4x\,\frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\,(\Box_{x'}-m^2)(\Box_x-m^2)\,\langle\gamma\ out|T\{\Phi(x')\Phi(x)\}|\alpha\ in\rangle \tag{10.18}$$
The generalization is immediate and we state the master formula for the S-matrix element as,
$$\begin{aligned}
\langle k_1',\dots,k_n'\ out|k_1,\dots,k_m\ in\rangle = {}& \left(\frac{-i}{\sqrt{Z}}\right)^{m+n}\int d^4x_1'\cdots d^4x_n'\int d^4x_1\cdots d^4x_m\ \prod_{j=1}^{n}\frac{e^{-ik_j'\cdot x_j'}}{\sqrt{2\omega_{\vec{k}_j'}(2\pi)^3}}\ \prod_{j=1}^{m}\frac{e^{+ik_j\cdot x_j}}{\sqrt{2\omega_{\vec{k}_j}(2\pi)^3}} \\
& (\Box_{x_1'}-m^2)\cdots(\Box_{x_n'}-m^2)(\Box_{x_1}-m^2)\cdots(\Box_{x_m}-m^2) \\
& \langle 0|T\{\Phi(x_1')\cdots\Phi(x_n')\Phi(x_1)\cdots\Phi(x_m)\}|0\rangle
\end{aligned} \tag{10.19}$$
The vacuum expectation value of the time ordered product of $n$ quantum fields (interacting or free) is called an $n$-point function, Green function or $n$-point correlation function. To get the S-matrix elements, the $n$-point function is operated on by the free field equation expression, multiplied by the in/out mode functions and integrated over all the space-time points.

Note that the time ordering arises because the definition of the S-matrix requires all the creation operators $a^\dagger_{in}(\vec{k})$ to be on the right while the annihilation operators $a_{out}(\vec{k})$ have to be on the left. This will also naturally lead to the Feynman Green's function (free, 2-point function).

Let us note the steps that were followed to get the S-matrix elements in terms of the $n$-point functions.
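• Mode decomposition of the free field and the inversion formulae;
• Equation of motion for the interacting field with a 'source' term on the right hand side;
• Equal time (anti-)commutation relations for both the interacting and the in/out fields;
• Asymptotic conditions;
• Reduction process.

2. LSZ reduction for Dirac field

We begin by recalling the mode decomposition.
$$\Psi(x) = \int\frac{d^3k}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\sum_\sigma\left[b(\vec{k},\sigma)u(\vec{k},\sigma)e^{ik\cdot x} + d^\dagger(\vec{k},\sigma)v(\vec{k},\sigma)e^{-ik\cdot x}\right]$$
$$\Psi^\dagger(x) = \int\frac{d^3k}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\sum_\sigma\left[b^\dagger(\vec{k},\sigma)u^\dagger(\vec{k},\sigma)e^{-ik\cdot x} + d(\vec{k},\sigma)v^\dagger(\vec{k},\sigma)e^{ik\cdot x}\right]$$
We have used the $\Psi^\dagger$ field instead of $\bar{\Psi}$ because the inversion formulae take a more convenient form. We had defined the $u, v$ spinors as,
$$(\not{k}+m)u(\vec{k},\sigma) = 0, \qquad (-\not{k}+m)v(\vec{k},\sigma) = 0, \qquad \bar{u}(\vec{k},\sigma)u(\vec{k},\sigma') = \delta_{\sigma,\sigma'} = -\bar{v}(\vec{k},\sigma)v(\vec{k},\sigma'),$$
$$\sum_\sigma u(\vec{k},\sigma)\bar{u}(\vec{k},\sigma) = \frac{-\not{k}+m}{2m}, \qquad \sum_\sigma v(\vec{k},\sigma)\bar{v}(\vec{k},\sigma) = -\frac{\not{k}+m}{2m}.$$
We need to express the orthonormality relations in terms of the adjoints rather than in terms of the Dirac adjoints. This alternative form can be derived as follows. We have,
$$\bar{u}(\vec{k},\sigma)\gamma^\mu u(\vec{k},\sigma') = \Lambda^\mu{}_\nu\,\bar{u}(\hat{k},\sigma)\gamma^\nu u(\hat{k},\sigma'), \qquad k^\mu = \Lambda^\mu{}_\nu\hat{k}^\nu, \qquad \hat{k}^\mu = (m,\vec{0}).$$
For $\mu = 0$,
$$\begin{aligned}
u^\dagger(\vec{k},\sigma)u(\vec{k},\sigma') &= \Lambda^0{}_\nu\,u^\dagger(\hat{k},\sigma)\gamma^0\gamma^\nu u(\hat{k},\sigma') = \Lambda^0{}_\nu\,u^\dagger(\hat{k},\sigma)\gamma^\nu u(\hat{k},\sigma') \qquad \because\ \gamma^0u(\hat{k},\sigma') = u(\hat{k},\sigma') \\
&= \Lambda^0{}_0\,u^\dagger(\hat{k},\sigma)\gamma^0u(\hat{k},\sigma') + \Lambda^0{}_i\,u^\dagger(\hat{k},\sigma)\gamma^iu(\hat{k},\sigma') \\
\therefore\ u^\dagger(\vec{k},\sigma)u(\vec{k},\sigma') &= \Lambda^0{}_0\,\bar{u}(\hat{k},\sigma)u(\hat{k},\sigma') + 0 = \Lambda^0{}_0\,\delta_{\sigma,\sigma'} + 0
\end{aligned}$$
The second term in the last equation is zero because, for the $\hat{k}$ spinors, $\gamma^0u(\hat{k},\sigma) = u(\hat{k},\sigma)$. This implies $u^\dagger\gamma^iu = u^\dagger\gamma^0\gamma^iu = -u^\dagger\gamma^i\gamma^0u = -u^\dagger\gamma^iu$. Similarly, $v^\dagger(\vec{k},\sigma)v(\vec{k},\sigma') = -\Lambda^0{}_0\,\bar{v}(\hat{k},\sigma)v(\hat{k},\sigma') = +\Lambda^0{}_0\,\delta_{\sigma,\sigma'}$. But $\Lambda^0{}_0$ is defined through $k^\mu = \Lambda^\mu{}_\nu\hat{k}^\nu$. For $\mu = 0$ this gives $\Lambda^0{}_0 = \omega_{\vec{k}}/m$. Our orthonormality relations then take the form,
$$u^\dagger(\vec{k},\sigma)u(\vec{k},\sigma') = \frac{\omega_{\vec{k}}}{m}\,\delta_{\sigma,\sigma'} = v^\dagger(\vec{k},\sigma)v(\vec{k},\sigma'), \qquad u^\dagger v = 0 = v^\dagger u.$$
We can use these relations to derive the inversion formulae.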
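For instance, multiplying the mode decomposition for $\Psi(x)$ by $u^\dagger(\vec{k},\sigma)e^{-ik\cdot x}$ and integrating over $d^3x$ yields,
$$b(\vec{k},\sigma) = \frac{2m}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\int d^3x\,u^\dagger(\vec{k},\sigma)e^{-ik\cdot x}\Psi(x)$$
It is convenient to define
$$U(\vec{k},\sigma) := 2m\,\frac{u(\vec{k},\sigma)}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}, \qquad V(\vec{k},\sigma) := 2m\,\frac{v(\vec{k},\sigma)}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}.$$
The inversion formulae are,
$$b(\vec{k},\sigma) = \int d^3x\,e^{-ik\cdot x}\,U^\dagger(\vec{k},\sigma)\Psi(x) \tag{10.20}$$
$$d^\dagger(\vec{k},\sigma) = \int d^3x\,e^{+ik\cdot x}\,V^\dagger(\vec{k},\sigma)\Psi(x) \tag{10.21}$$
$$b^\dagger(\vec{k},\sigma) = \int d^3x\,e^{+ik\cdot x}\,\Psi^\dagger(x)U(\vec{k},\sigma) \tag{10.22}$$
$$d(\vec{k},\sigma) = \int d^3x\,e^{-ik\cdot x}\,\Psi^\dagger(x)V(\vec{k},\sigma) \tag{10.23}$$
In the reduction process, we will have for instance,
$$b^\dagger_{in}(\vec{k},\sigma) - b^\dagger_{out}(\vec{k},\sigma) = \int d^3x\,e^{ik\cdot x}\left[\Psi^\dagger_{in}(x) - \Psi^\dagger_{out}(x)\right]U(\vec{k},\sigma)$$
$$\text{and using the asymptotic condition}\ \longrightarrow\ \frac{1}{\sqrt{Z}}\int d^4x\,\partial_0\left[e^{ik\cdot x}\Psi^\dagger(x)\right]U(\vec{k},\sigma)$$
Next,
$$\partial_0(e^{ik\cdot x}\Psi^\dagger(x))U(\vec{k},\sigma) = e^{ik\cdot x}(\partial_0\Psi^\dagger + ik_0\Psi^\dagger)U; \qquad ik_0\Psi^\dagger U = i\bar{\Psi}(k_0\gamma^0)U = i\bar{\Psi}(\not{k} - k_i\gamma^i)U = i\bar{\Psi}(-mU) + \bar{\Psi}\gamma^i(-ik_iU)$$
$$\therefore\ e^{ik\cdot x}\,ik_0\Psi^\dagger U = e^{ik\cdot x}(-i\bar{\Psi}mU) + \bar{\Psi}\gamma^i(-\partial_ie^{ik\cdot x})U = e^{ik\cdot x}\left[-im\bar{\Psi} + \partial_i\bar{\Psi}\gamma^i\right]U \quad \text{(after a spatial integration by parts)}$$
$$\therefore\ \partial_0\left\{e^{ik\cdot x}\Psi^\dagger\right\}U(\vec{k},\sigma) = e^{ik\cdot x}\left[\partial_0\bar{\Psi}\gamma^0 + \partial_i\bar{\Psi}\gamma^i - im\bar{\Psi}\right]U = (-ie^{ik\cdot x})\left[i\bar{\Psi}\overleftarrow{\not{\partial}} + m\bar{\Psi}\right]U$$
Hence, instead of $(-\Box_x + m^2)$ acting on $\Phi(x)$, we will have $(-2mi)(i\overleftarrow{\not{\partial}} + m)$ acting on $\bar{\Psi}(x)$ and $(-2mi)(-i\overrightarrow{\not{\partial}} + m)$ acting on $\Psi(x)$. The wavefunction factors will be:
$$\text{Particle in in-state: } U(\vec{k},\sigma)e^{ik\cdot x}, \qquad \text{Particle in out-state: } \bar{U}(\vec{k},\sigma)e^{-ik\cdot x} \tag{10.24}$$
$$\text{anti-Particle in in-state: } \bar{V}(\vec{k},\sigma)e^{ik\cdot x}, \qquad \text{anti-Particle in out-state: } V(\vec{k},\sigma)e^{-ik\cdot x} \tag{10.25}$$
For the multi-particle case we will again have a time ordering instruction. For fermions, it is defined with a relative minus sign,
$$T\{\Psi_\alpha(x)\Psi_\beta(y)\} := \theta(x^0-y^0)\Psi_\alpha(x)\Psi_\beta(y) - \theta(y^0-x^0)\Psi_\beta(y)\Psi_\alpha(x),$$
and $T\{\Psi_\alpha(x)\Psi_\beta(y)\} = -T\{\Psi_\beta(y)\Psi_\alpha(x)\}$. Thus for fermions, changing the ordering inside the time ordering generates a minus sign for each exchange. We summarize the general matrix element for $m$ fermions going into $n$ fermions as,
$$\begin{aligned}
{}_{out}\langle ..\vec{k}'_{j'}\sigma'_{j'}..|..\vec{k}_j\sigma_j..\rangle_{in} = {}& \left(\frac{-i}{\sqrt{Z_\Psi}}\right)^{m+n}\prod_{j,j'}\int d^4x'_{j'}\,d^4x_j\ U(\vec{k}_j,\sigma_j)\,\bar{U}(\vec{k}'_{j'},\sigma'_{j'})\ e^{-i\sum_{j'}k'_{j'}\cdot x'_{j'}} \\
& (-i\not{\partial}'_1+m)\cdots(-i\not{\partial}'_n+m)\ \langle 0|T\left\{\Psi(x_1')..\Psi(x_n')\bar{\Psi}(x_1)..\bar{\Psi}(x_m)\right\}|0\rangle \\
& (-i\overleftarrow{\not{\partial}}_1+m)\cdots(-i\overleftarrow{\not{\partial}}_m+m)\ e^{+i\sum_jk_j\cdot x_j}
\end{aligned} \tag{10.26}$$
For anti-fermions in the in/out states, the spinors $U, \bar{U}$ are changed to $V, \bar{V}$ as appropriate.

3. LSZ reduction for Maxwell field

This is a bosonic field so there is no additional minus sign. Like the Dirac field, it has two polarizations and thanks to the zero mass, these are transverse polarizations. We begin by recalling the mode decomposition.
$$A_\mu(x) = \int\frac{d^3k}{\sqrt{2\omega(2\pi)^3}}\sum_{\lambda=1,2}\left[a_{\vec{k},\lambda}\,\varepsilon_\mu(\vec{k},\lambda)e^{ik\cdot x} + a^\dagger_{\vec{k},\lambda}\,\varepsilon^*_\mu(\vec{k},\lambda)e^{-ik\cdot x}\right]$$
$$\varepsilon_0(\vec{k},\lambda) := -\frac{\vec{k}\cdot\vec{\varepsilon}(\vec{k},\lambda)}{\omega_{\vec{k}}}, \qquad \omega_{\vec{k}} = |\vec{k}|, \qquad \varepsilon_i(\vec{k},\lambda) := \left(\delta_i^{\,j} - \frac{k_ik^j}{\vec{k}^2}\right)(\vec{\varepsilon})_j(\vec{k},\lambda)$$
$$\vec{\varepsilon}(\vec{k},\lambda)\cdot\vec{\varepsilon}^{\,*}(\vec{k},\lambda') = \delta_{\lambda,\lambda'}, \qquad \sum_\lambda\varepsilon_i(\vec{k},\lambda)\varepsilon^*_j(\vec{k},\lambda) = \delta_{ij} - \frac{k_ik_j}{\vec{k}^2}$$
This gives the inversion formulae as,
$$a(\vec{k},\lambda) = i\varepsilon^*_i(\vec{k},\lambda)\int d^3x\,e^{-ik\cdot x}\overleftrightarrow{\partial_0}A^i(x), \qquad a^\dagger(\vec{k},\lambda) = -i\varepsilon_i(\vec{k},\lambda)\int d^3x\,e^{+ik\cdot x}\overleftrightarrow{\partial_0}A^i(x)$$
In the transverse/radiation/Coulomb gauge, $\vec{k}\cdot\vec{\varepsilon}(\vec{k},\lambda) = 0$, so that $\varepsilon_0 = 0$ and $\varepsilon_i = (\vec{\varepsilon})_i$, and the equations of motion are $\Box A_i(x) = 0$. The commutation relations are:
$$[A_i(t,\vec{x}), \pi_j(t,\vec{y})] = i\left(\delta_{ij} - \frac{\partial_i\partial_j}{\nabla^2}\right)\delta^3(\vec{x}-\vec{y}) := i\int\frac{d^3k}{(2\pi)^3}\,e^{i\vec{k}\cdot(\vec{x}-\vec{y})}\left(\delta_{ij} - \frac{k_ik_j}{\vec{k}^2}\right)$$
The reduction formula goes similarly to the scalar case. We will have $(-\Box_x + m^2) \to -\Box_x$ and the wave function factors would be $E_\mu(\vec{k},\lambda) := \varepsilon_\mu(\vec{k},\lambda)/\sqrt{2\omega_{\vec{k}}(2\pi)^3}$. Then the general S-matrix element for $m \to n$ particles takes the form,
$$\begin{aligned}
{}_{out}\langle ..\vec{k}'_{j'}\lambda'_{j'}..|..\vec{k}_j\lambda_j..\rangle_{in} = {}& \left(\frac{-i}{\sqrt{Z_A}}\right)^{m+n}\prod_{j,j'}\int d^4x'_{j'}\,d^4x_j\ E(\vec{k}_j,\lambda_j)\,E^*(\vec{k}'_{j'},\lambda'_{j'})\ e^{-i\sum_{j'}k'_{j'}\cdot x'_{j'}} \\
& (-\Box'_1)\cdots(-\Box'_n)\ \langle 0|T\left\{A_{\mu_1'}(x_1')..A_{\mu_n'}(x_n')A_{\mu_1}(x_1)..A_{\mu_m}(x_m)\right\}|0\rangle \\
& (-\overleftarrow{\Box}_1)\cdots(-\overleftarrow{\Box}_m)\ e^{+i\sum_jk_j\cdot x_j}
\end{aligned} \tag{10.27}$$
Having related the S-matrix elements to the $n$-point functions of the interacting fields, our next task is to see if these can be expressed in terms of the free fields in some systematic, well defined way. This is achieved in the so-called covariant perturbation series.

11. COVARIANT PERTURBATION THEORY

In formulating the scattering theory for interacting fields, we postulated the in/out fields satisfying the free field equations. More importantly, we also postulated them to satisfy the same equal time (anti-)commutation relations. With some assumptions of existence, these suffice to develop a perturbation scheme to compute the $n$-point functions.

Since the interacting and the free (in/out) fields satisfy the same basic quantum conditions, it is plausible that they are related by some unitary transformation. For systems with finitely many degrees of freedom, there are theorems that guarantee the existence of canonical transformations at the classical level and unitary transformations at the quantum level (thanks to the Stone-von Neumann theorem). For field theories, there is no such guarantee and we need to postulate the required existence. Their utility gives a post-facto justification for the assumptions. Let,
$$\Phi(t,\vec{x}) = U^{-1}(t)\Phi_{in}(t,\vec{x})U(t), \qquad \Pi(t,\vec{x}) = U^{-1}(t)\Pi_{in}(t,\vec{x})U(t).$$
The fields satisfy the equations of motion in the form,
$$\partial_t\Phi_{in}(x) = i[H_{in}(\Phi_{in},\Pi_{in}), \Phi_{in}(x)], \qquad \partial_t\Pi_{in}(x) = i[H_{in}(\Phi_{in},\Pi_{in}), \Pi_{in}(x)] \tag{11.1}$$
$$\partial_t\Phi(x) = i[H(\Phi,\Pi), \Phi(x)], \qquad \partial_t\Pi(x) = i[H(\Phi,\Pi), \Pi(x)] \tag{11.2}$$
These enable us to derive an equation for $U(t)$.
$$\begin{aligned}
\partial_t\Phi_{in} = \partial_t[U(t)\Phi U^{-1}(t)] &= \partial_tU\cdot U^{-1}\Phi_{in} + U(\partial_t\Phi)U^{-1} + \Phi_{in}\cdot U\partial_tU^{-1} \\
&= [\partial_tU\cdot U^{-1}, \Phi_{in}] + iU[H(\Phi,\Pi), \Phi]U^{-1}
\end{aligned}$$
$$\text{But,}\quad U[H(\Phi,\Pi), \Phi]U^{-1} = [UH(\Phi,\Pi)U^{-1}, U\Phi U^{-1}] = [H(\Phi_{in},\Pi_{in}), \Phi_{in}]$$
$$\therefore\ \partial_t\Phi_{in} = \left[\partial_tU\cdot U^{-1} + i\left\{H(\Phi_{in},\Pi_{in}) - H_{in}(\Phi_{in},\Pi_{in}) + H_{in}(\Phi_{in},\Pi_{in})\right\}, \Phi_{in}\right]$$
$$\partial_t\Phi_{in} = \partial_t\Phi_{in} + [\partial_tU\cdot U^{-1} + iH_I(\Phi_{in},\Pi_{in}), \Phi_{in}]$$
$$\text{Similarly,}\quad \partial_t\Pi_{in} = \partial_t\Pi_{in} + [\partial_tU\cdot U^{-1} + iH_I(\Phi_{in},\Pi_{in}), \Pi_{in}].$$
Here $H_I(\Phi_{in},\Pi_{in}) := H(\Phi_{in},\Pi_{in}) - H_{in}(\Phi_{in},\Pi_{in})$. Since $\partial_tU\cdot U^{-1} + iH_I(\Phi_{in},\Pi_{in})$ commutes with both $\Phi_{in}$ and $\Pi_{in}$, it must be a multiple of the identity, which we write as $-iE_0(t)\,\mathbb{1}$.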
This gives the equation determining $U(t)$ as,
$$i\partial_tU(t) = H'_I(\Phi_{in},\Pi_{in})\,U(t), \qquad H'_I(\Phi_{in},\Pi_{in}) := H(\Phi_{in},\Pi_{in}) - H_{in}(\Phi_{in},\Pi_{in}) + E_0(t) \tag{11.3}$$
To solve the equation, it is convenient to define the combination $U(t,t') := U(t)U^{-1}(t')$. It follows that this combination too satisfies the same first order equation, with the initial condition $U(t',t') = 1$. The integral form of this equation is:
$$U(t,t') = 1 - i\int_{t'}^{t}dt''\,H'_I(t'')\,U(t'',t').$$
This is solved by iteration. Define $U_0(t,t') = 1$ and, for $n \ge 1$, $U_n(t,t') := 1 - i\int_{t'}^{t}dt_n\,H'(t_n)\,U_{n-1}(t_n,t')$. Then $U(t,t') = \lim_{n\to\infty}U_n(t,t')$ is the formal solution of the integral equation. It is trivial to verify the formal solution. It is formal because no conditions are imposed for convergence of the series. The first few terms are,
$$\begin{aligned}
U_0(t,t') &= 1 \\
U_1(t,t') &= 1 - i\int_{t'}^{t}dt_1\,H'(t_1)\cdot 1 \\
U_2(t,t') &= 1 - i\int_{t'}^{t}dt_2\,H'(t_2)\,U_1(t_2,t') \\
&= 1 + (-i)\int_{t'}^{t}dt_2\,H'(t_2) + (-i)^2\int_{t'}^{t}dt_2\int_{t'}^{t_2}dt_1\,H'(t_2)H'(t_1) \\
U_3(t,t') &= 1 + (-i)\int_{t'}^{t}dt_3\,H'(t_3)\left[1 + (-i)\int_{t'}^{t_3}dt_2\,H'(t_2) + (-i)^2\int_{t'}^{t_3}dt_2\int_{t'}^{t_2}dt_1\,H'(t_2)H'(t_1)\right] \\
&= 1 + (-i)\int_{t'}^{t}dt_3\,H'(t_3) + (-i)^2\int_{t'}^{t}dt_3\int_{t'}^{t_3}dt_2\,H'(t_3)H'(t_2) \\
&\quad + (-i)^3\int_{t'}^{t}dt_3\int_{t'}^{t_3}dt_2\int_{t'}^{t_2}dt_1\,H'(t_3)H'(t_2)H'(t_1)
\end{aligned}$$
$$\therefore\ U_n(t,t') = 1 + \sum_{k=1}^{n}(-i)^k\int_{t'}^{t}dt_k\int_{t'}^{t_k}dt_{k-1}\cdots\int_{t'}^{t_2}dt_1\,H'(t_k)H'(t_{k-1})\cdots H'(t_1) \tag{11.4}$$
Consider the term $\int_{t'}^{t}dt_n\int_{t'}^{t_n}dt_{n-1}\,H'(t_n)H'(t_{n-1})$.
By interchanging the order of integration we can write it as ($H'_n := H'(t_n)$),
$$\int_{t'}^{t}dt_n\int_{t'}^{t_n}dt_{n-1}\,H'_nH'_{n-1} = \int_{t'}^{t}dt_{n-1}\int_{t_{n-1}}^{t}dt_n\,H'_nH'_{n-1} = \int_{t'}^{t}dt_n\int_{t_n}^{t}dt_{n-1}\,H'_{n-1}H'_n$$
$$\begin{aligned}
\therefore\ \text{LHS} = \frac{1}{2}(\text{LHS}+\text{RHS}) &= \frac{1}{2}\left[\int_{t'}^{t}dt_n\int_{t'}^{t_n}dt_{n-1}\,H'_nH'_{n-1} + \int_{t'}^{t}dt_n\int_{t_n}^{t}dt_{n-1}\,H'_{n-1}H'_n\right] \\
&= \frac{1}{2}\left[\int_{t'}^{t}dt_n\int_{t'}^{t_n}dt_{n-1}\,T\{H'_nH'_{n-1}\} + \int_{t'}^{t}dt_n\int_{t_n}^{t}dt_{n-1}\,T\{H'_nH'_{n-1}\}\right] \\
&= \frac{1}{2}\int_{t'}^{t}dt_n\int_{t'}^{t}dt_{n-1}\,T\{H'_nH'_{n-1}\}
\end{aligned}$$
Similarly, we can symmetrize the higher order terms to replace each product by its time ordered product and then combine the integrals to have the same limits of integration. The $n$th order term then takes the form $\frac{1}{n!}\int_{t'}^{t}dt_n\int_{t'}^{t}dt_{n-1}\cdots\int_{t'}^{t}dt_1\,T\{H'(t_n)H'(t_{n-1})\cdots H'(t_1)\}$. We write the iterated formal solution of (11.3) as,
$$U(t,t') := T\exp\left[-i\int_{t'}^{t}dt''\,H'_I(\Phi_{in}(t''),\Pi_{in}(t''))\right] \tag{11.5}$$
$$= 1 + \sum_{n=1}^{\infty}\frac{(-i)^n}{n!}\int_{t'}^{t}dt_1\cdots\int_{t'}^{t}dt_n\,T\{H'_I(t_1)\cdots H'_I(t_n)\} \tag{11.6}$$
The solution is thus obtained entirely in terms of the 'in' fields. Note that the solution gives $U(t,t')$ and not $U(t)$. While the equations satisfied by $U(t)$ and $U(t,t')$ are the same, there is no "initial" condition provided for $U(t)$ and in a sense it is ill-defined. There is no canonical way to deduce/define $U(t)$ from $U(t,t')$. Fortunately, $U(t,t')$ suffices.

Note: From the definition follows the useful composition property $U(t,t') = U(t,t'')U(t'',t')$, regardless of any inequalities for $t''$. This may also be verified directly.

Let us see an implication of the existence of $U(t)$. Consider an $n$-point function for a scalar, $G(x_1,\dots,x_n) = \langle 0|T\{\Phi(x_1)\cdots\Phi(x_n)\}|0\rangle$. Insert in this $\Phi(x) = U^{-1}(t)\Phi_{in}(x)U(t)$, which gives,
$$\begin{aligned}
\Phi(x_1)\Phi(x_2)\cdots\Phi(x_n) &= U^{-1}(t_1)\Phi_{in}(x_1)U(t_1)\cdot U^{-1}(t_2)\Phi_{in}(x_2)U(t_2)\cdots U^{-1}(t_n)\Phi_{in}(x_n)U(t_n) \\
&= U^{-1}(t)\left[U(t,t_1)\Phi_{in}(x_1)U(t_1,t_2)\Phi_{in}(x_2)\cdots U(t_{n-1},t_n)\Phi_{in}(x_n)U(t_n,-t)\right]U(-t), \qquad U(t,t') := U(t)U^{-1}(t')
\end{aligned}$$
Observe that in the limit $t \to \infty$, the factor $U^{-1}(t)$ goes to the extreme left and $U(-t)$ to the extreme right. All other space-time arguments being finite, we can take these factors outside the time ordering symbol. The product of the fields $\Phi(x)$ is already under time ordering, which then allows us to arrange all the factors within $T\{\dots\}$ to be conveniently grouped. In particular, all the $U(t_i,t_j)$ can be combined using their composition property noted above. With the limit $t \to \infty$ implicit, we can write the $n$-point function as,
$$G(x_1,\dots,x_n) = \langle 0|U^{-1}(t)\,T\left\{\Phi_{in}(x_1)\cdots\Phi_{in}(x_n)\exp\left[-i\int_{-t}^{t}dt'\,H'_I(t')\right]\right\}U(-t)|0\rangle. \tag{11.7}$$
Next, we evaluate $U(-t)|0\rangle$ in the limit $t \to \infty$ and its adjoint.

Claim: $U(-t)|0\rangle \to \lambda_-|0\rangle$ as $t \to \infty$.

Proof: Consider an in-state containing at least one particle of momentum $k$, $|\alpha,k\rangle_{in}$. Then,
$$\begin{aligned}
{}_{in}\langle\alpha,k|U(-t)|0\rangle &= {}_{in}\langle\alpha|a_{in}(\vec{k})U(-t)|0\rangle \\
&= i\int_{\Sigma_{t'}}\frac{d^3x}{\sqrt{2\omega_{\vec{k}}(2\pi)^3}}\,e^{-ik\cdot x}\overleftrightarrow{\partial_0'}\ {}_{in}\langle\alpha|\Phi_{in}(-t',\vec{x})U(-t)|0\rangle
\end{aligned}$$
We have used the inversion formula for $a_{in}$ by choosing an arbitrary $\Sigma_{t'}$ hypersurface, and $a_{in}$ is of course independent of this choice. In the integrand above, substitute $\Phi_{in}(t',\vec{x}) = U(t')\Phi(t',\vec{x})U^{-1}(t')$ and evaluate the time derivative.
$$\begin{aligned}
\text{Integrand} = e^{-ik\cdot x}\langle\alpha|\Big[ & \dot{U}(-t')U^{-1}(-t')\Phi_{in}(-t',\vec{x})U(-t) + \Phi_{in}(-t',\vec{x})U(-t')\partial_{t'}U^{-1}(-t')U(-t) \\
& + U(-t')\partial_0'\Phi(-t')U^{-1}(-t')U(-t) - U(-t')\Phi(-t')U^{-1}(-t')U(-t)\left(\partial_0'(-ik\cdot x)\right)\Big]|0\rangle
\end{aligned}$$
Now choose $t' = t$. Then, the first two terms on the r.h.s. combine as,
$$[\dot{U}(-t)U^{-1}(-t), \Phi_{in}(-t,\vec{x})]U(-t) = -i[H'_I(\Phi_{in},\Pi_{in}), \Phi_{in}(-t,\vec{x})]U(-t) = 0.$$
In the usual theories, the interaction Hamiltonian has no $\Pi$ dependence and hence, at equal time, commutes with the field giving 0.

In the last two terms, the $U^{-1}(-t')U(-t)$ factor becomes 1 for $t' = t$. These terms combine to give ${}_{in}\langle\alpha|U(-t)\left[e^{-ik\cdot x}\overleftrightarrow{\partial_0}\Phi(-t)\right]|0\rangle$.
Taking the limit $t \to \infty$ and invoking the asymptotic condition converts $\Phi(x)$ into $\sqrt{Z}\,\Phi_{in}(x)$. The spatial integration converts the expression into $\sqrt{Z}\ {}_{in}\langle\alpha|U(-t)a_{in}(\vec{k})|0\rangle = 0$. Thus we conclude that ${}_{in}\langle\alpha,k|U(-t)|0\rangle \to 0$ as $t \to \infty$ for all states containing at least one particle. $U(-t)|0\rangle$ must therefore be proportional to $|0\rangle$ itself, thereby proving the claim.

By similar reasoning, we obtain $U(t)|0\rangle \to \lambda_+|0\rangle$ as $t \to \infty$.

Corollary:
$$\lambda_+^*\lambda_- = \left(\langle 0|T\exp\left[-i\int_{-\infty}^{\infty}dt'\,H'_I(t')\right]|0\rangle\right)^{-1} \tag{11.8}$$
The proof is simple,
$$\begin{aligned}
\lambda_+^*\lambda_- &= \lim_{t\to\infty}\langle 0|U^{-1}(t)|0\rangle\langle 0|U(-t)|0\rangle \\
&\approx \langle 0|U^{-1}(t)U(-t)|0\rangle \quad \text{since intermediate states do not contribute by the above result} \\
&\approx \langle 0|U(-t)U^{-1}(t)|0\rangle = \langle 0|U(-t,t)|0\rangle \\
&\approx \langle 0|(U(t,-t))^{-1}|0\rangle = \left(\langle 0|U(t,-t)|0\rangle\right)^{-1}.
\end{aligned}$$
The final expression follows by substituting the formal solution in the limit $t \to \infty$. Using the equations (11.7, 11.8), we express the $n$-point function as,
$$G(x_1,\dots,x_n) = \frac{\langle 0|T\left\{\Phi_{in}(x_1)\cdots\Phi_{in}(x_n)\exp\left[-i\int_{-\infty}^{\infty}dt'\,H_I(t')\right]\right\}|0\rangle}{\langle 0|T\left\{\exp\left[-i\int_{-\infty}^{\infty}dt'\,H_I(t')\right]\right\}|0\rangle} \tag{11.9}$$
Notice that the multiple of identity $E_0(t)\mathbb{1}$ has been canceled between the numerator and the denominator before taking the limit $t \to \infty$. Suffice it to say that analogous expressions exist for other fields as well. However we will not write them explicitly.

The reduction formulae give S-matrix elements in terms of the $n$-point functions, and the $n$-point functions have a perturbative expansion in terms of the free fields alone. These two together are the master formulae for computations. A further simplification is provided by Wick's theorem.

A. Normal ordering and Wick's theorem

Recall that the mode expansion of the free quantum fields can be grouped as $\Phi(x) = \Phi^+(x) + \Phi^-(x)$ with $\Phi^\pm(x)$ denoting the sum over positive/negative frequency modes. The positive frequency part is made up of annihilation operators while the negative frequency part is made up of creation operators alone.
For a Hermitian quantum field, the negative frequency part is the Hermitian adjoint of the positive frequency part. A product of fields at different points will mix these parts and the idea is to bring all annihilation operators to the right and the creation operators to the left. The vacuum expectation value of so ordered groups will of course be zero. In this process, several commutators of fields at different space-time points are generated, but they all are c-numbers. This leads to a simplification.

We define normal ordered products of two free fields as follows. For bosons,
$$:\Phi^-(x)\Phi^+(y): \ = \Phi^-(x)\Phi^+(y), \qquad :a^\dagger a: \ := a^\dagger a \tag{11.10}$$
$$:\Phi^+(x)\Phi^-(y): \ = \Phi^-(y)\Phi^+(x), \qquad :aa^\dagger: \ := a^\dagger a \tag{11.11}$$
For fermions,
$$:\Psi^-(x)\Psi^+(y): \ = \Psi^-(x)\Psi^+(y), \qquad :b^\dagger b: \ := b^\dagger b \tag{11.12}$$
$$:\Psi^+(x)\Psi^-(y): \ = -\Psi^-(y)\Psi^+(x), \qquad :bb^\dagger: \ := -b^\dagger b \tag{11.13}$$
For a product of two or more positive (or negative) frequency fields, the normal ordering does not change the order. It is immediately obvious that $\langle 0|:\text{Operator}:|0\rangle = 0$. Note that normal ordering is meaningful only for products of free fields (in/out). Hence the in/out suffixes are suppressed for these.

To get a general form for several bosonic as well as fermionic free fields, notice that for a quantum field $Q(x) = Q^+(x) + Q^-(x)$ a product of $n$ fields generates terms with $k$ of the $Q^+$'s interspersed with $(n-k)$ of the $Q^-$'s, with the order of the space-time points maintained, with $k = 0,\dots,n$. Under normal ordering each of these terms will shift the $Q^+$'s to the right and the $Q^-$'s to the left, generating a permutation of the space-time points, say $p$. If $Q$ is a fermionic field, the term with permutation $p$ will get a factor of $\sigma_p = \mathrm{sgn}(p)$ while a bosonic field will have $\sigma_p = 1$. Note that for any given $k$, there will be several terms, e.g. $Q_1Q_2$, $Q_3Q_7$, ... etc. Each will generate its own permutation under normal ordering. With this understood, we can write the general expression for the normal ordered product of $n$ fields as,
$$:Q(x_1)\cdots Q(x_n): \ := \sum_{A,B}\sigma_p\prod_{i\in A}Q^-(x_{p(i)})\prod_{j\in B}Q^+(x_{p(j)}).$$
The $A, B$ are two groups of space-time points corresponding to the $k$, $(n-k)$ mentioned above and the sum refers to the various possible groupings of the space-time points within each class.

Exercise: Verify for $n = 4$.

Consider now the relation between time ordered products and normal ordered products. For a single field, both orderings are trivial and their vacuum expectation value vanishes: $\langle 0|T\{\Phi_{in}(x)\}|0\rangle = 0 = \langle 0|:\Phi(x):|0\rangle$.

For the time ordered product of two fields, for any particular instants of time they appear in a particular order which can be explicitly put in the normal ordered form, e.g. $a(\vec{k})a^\dagger(\vec{k}') = \text{"1"} + a^\dagger(\vec{k}')a(\vec{k}) = \ :a(\vec{k})a^\dagger(\vec{k}'): + \text{"1"}$. The last term is a c-number (multiple of the identity operator). Similarly, $\Phi(x)\Phi(y) = \ :\Phi(x)\Phi(y): + $ c-number. And the c-number is trivial to evaluate by just taking the vev (vacuum expectation value): c-number $= \langle 0|\Phi(x)\Phi(y)|0\rangle$. Noting that $T\{\Phi(x)\Phi(y)\} = \theta(x^0-y^0)\Phi(x)\Phi(y) + \theta(y^0-x^0)\Phi(y)\Phi(x)$, we get,
$$T\{\Phi(x)\Phi(y)\} = \ :T\{\Phi(x)\Phi(y)\}: + \langle 0|T\{\Phi(x)\Phi(y)\}|0\rangle.$$
Notice that normal ordering and time ordering commute: $:T\{A(x)B(y)\}: \ = T\{:A(x)B(y):\}$ for all non-trivial operators $A(x), B(y)$, i.e. for operators which are polynomials in the basic field operators of degree greater than zero.

The generalization of this to a product of an arbitrary number of fields is Wick's theorem. We state it explicitly for a scalar field.

Wick's Theorem:
$$\begin{aligned}
T\{\Phi(x_1)\cdots\Phi(x_n)\} = {}& :T\{\Phi(x_1)\cdots\Phi(x_n)\}: \\
& + \left[\langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle :T\{\Phi(x_3)\cdots\Phi(x_n)\}: + \text{permutations}\right] \\
& + \left[\langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle\langle 0|T\{\Phi(x_3)\Phi(x_4)\}|0\rangle :T\{\Phi(x_5)\cdots\Phi(x_n)\}: + \text{permutations}\right] \\
& + \cdots \\
& + \begin{cases}
\left[\langle 0|T\{\Phi_1\Phi_2\}|0\rangle\cdots\langle 0|T\{\Phi_{n-1}\Phi_n\}|0\rangle + \text{permutations}\right] & (n \text{ even}) \\
\left[\langle 0|T\{\Phi_1\Phi_2\}|0\rangle\cdots\langle 0|T\{\Phi_{n-2}\Phi_{n-1}\}|0\rangle\,\Phi_n + \text{permutations}\right] & (n \text{ odd})
\end{cases}
\end{aligned}$$
In the last line we have used the abbreviation $\Phi_i := \Phi(x_i)$ and we will use the same when convenient.
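The bookkeeping in the statement of the theorem can be made concrete with a small enumeration. The sketch below is only an illustration for a bosonic field: it lists every term of the expansion as a set of contracted pairs plus the fields left inside the normal ordered product, treating the fields as abstract labels. The function names `wick_terms` and `count_by_contractions` are hypothetical helpers introduced here, and no signs $\sigma_p$ are tracked.

```python
from collections import Counter


def wick_terms(labels):
    """Yield (contractions, uncontracted) for every term in Wick's expansion.

    `contractions` is a list of label pairs, each standing for one vev
    <0|T{Phi_i Phi_j}|0>; `uncontracted` lists the labels that remain
    inside the normal ordered product.
    """
    labels = list(labels)
    if not labels:
        yield [], []
        return
    first, rest = labels[0], labels[1:]
    # Case 1: the first field stays inside the normal ordered product.
    for pairs, free in wick_terms(rest):
        yield pairs, [first] + free
    # Case 2: the first field is contracted with one of the later fields.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for pairs, free in wick_terms(remaining):
            yield [(first, partner)] + pairs, free


def count_by_contractions(n):
    """Count the terms for n fields, grouped by the number of contractions."""
    return Counter(len(pairs) for pairs, _ in wick_terms(range(1, n + 1)))


if __name__ == "__main__":
    for pairs, free in wick_terms([1, 2, 3, 4]):
        print("contractions:", pairs, " normal ordered:", free)
    print(dict(count_by_contractions(4)))
```

For $n = 4$ this gives one term with no contraction, six terms with a single contraction and three fully contracted terms, matching the "permutations" indicated in each line of the theorem.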
Proof: The proof is by induction and we have already seen the validity for $n = 1, 2$. Assume the theorem is true for some $n$. Include an extra field $\Phi(x_{n+1})$ and choose $t_{n+1}$ to be the earliest instant so that it stays on the extreme right. Then,
\[ T\{\Phi_1\cdots\Phi_{n+1}\} \ = \ T\{\Phi_1\cdots\Phi_n\}\,\Phi_{n+1} \ = \ \Big[ :T\{\Phi_1\cdots\Phi_n\}: \ + \ \sum_{\rm perm} \langle 0|T\{\Phi_1\Phi_2\}|0\rangle \, :T\{\ldots\}: \ + \ \ldots \Big] \Phi_{n+1} , \]
\[
\begin{aligned}
:\Phi_1\cdots\Phi_n: \, \Phi_{n+1} \ = \ & \Big\{ \sum_{A,B}\sigma_p \prod_{i\in A}\Phi^-_i \prod_{j\in B}\Phi^+_j \Big\} \big( \Phi^-_{n+1} + \Phi^+_{n+1} \big) \\
= \ & \sum_{A,B}\sigma_p \prod_{i\in A}\Phi^-_i \prod_{j\in B}\Phi^+_j \, \Phi^+_{n+1} \\
& + \sum_{A,B}\sigma_p \prod_{i\in A}\Phi^-_i \Big\{ \Phi^-_{n+1}\prod_{j\in B}\Phi^+_j \ + \ \Big[ \prod_{j\in B}\Phi^+_j \, , \, \Phi^-_{n+1} \Big] \Big\}
\end{aligned}
\]
The first two terms in the last equation have the requisite normal ordered form for $n+1$ fields. The third term has several commutators, $[\Phi^+_j, \Phi^-_{n+1}]$, which again are c-numbers and can be evaluated by taking the vev. That is,
\[
\begin{aligned}
[\Phi^+_j, \Phi^-_{n+1}] \ & = \ \langle 0|(\Phi^+_j\Phi^-_{n+1} - \Phi^-_{n+1}\Phi^+_j)|0\rangle \ = \ \langle 0|\Phi^+_j\Phi^-_{n+1}|0\rangle + 0 \\
& = \ \langle 0|\Phi_j\Phi_{n+1}|0\rangle \ = \ \langle 0|T\{\Phi_j\Phi_{n+1}\}|0\rangle , \quad \text{since } t_{n+1} \text{ is the earliest.} \\
\therefore \ :\Phi_1\cdots\Phi_n: \, \Phi_{n+1} \ & = \ :\Phi_1\cdots\Phi_n\Phi_{n+1}: \ + \ \sum_j \sigma_p \, :\Phi_1\cdots\Phi_{j-1}\Phi_{j+1}\cdots\Phi_n: \, \langle 0|T\{\Phi_j\Phi_{n+1}\}|0\rangle
\end{aligned}
\]
Thus, inserting the time ordering, the first term, $:T\{\Phi_1\cdots\Phi_n\}: \Phi_{n+1}$, gets expressed as $:T\{\Phi_1\cdots\Phi_n\Phi_{n+1}\}:$ + {terms which have a structure similar to the second term}, and so on. Hence the induction hypothesis allows the expression to hold for $n+1$ and the theorem is proved by induction.

Exercise: Verify the theorem for $n = 3, 4$.

Observe that $G(x_1,\ldots,x_n) = \langle 0|T\{\Phi(x_1)\cdots\Phi(x_n)\}|0\rangle$ vanishes for odd $n$ and equals $\sum_{\rm perm}\sigma_p \langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle \cdots \langle 0|T\{\Phi(x_{n-1})\Phi(x_n)\}|0\rangle$ for even $n$. Thus, the $2n$-point function of free fields is the sum of products of $n$ 2-point functions of free fields. We denote:
\[ \langle 0|T\{\Phi(x)\Phi(y)\}|0\rangle \ =: \ i\Delta_F(x-y) \quad \text{(Feynman propagator)} . \]
We have the S-matrix elements in terms of the $n$-point functions of interacting quantum fields, expressed in terms of vevs of time ordered products of free fields, and thanks to Wick's theorem these are expressed in terms of 2-point functions of free fields.
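The statement that a free $2n$-point function is a sum over products of $n$ two-point functions can be made concrete by listing the pairings explicitly. The following is a minimal sketch for a bosonic field (all $\sigma_p = 1$); the function names and the string labels for the points are illustrative choices, not notation from these notes. It enumerates all complete pairings of a list of points and confirms that their number is $(2n-1)!!$, and that no complete pairing exists for an odd number of points.

```python
from math import prod


def pairings(points):
    """Yield every way of splitting `points` into unordered pairs.

    Each yielded item is a list of 2-tuples; for an odd number of
    points nothing is yielded, mirroring the vanishing of odd
    free-field n-point functions.
    """
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def double_factorial_odd(n_pairs):
    """Return (2n - 1)!! = 1 * 3 * 5 * ... * (2n - 1)."""
    return prod(range(1, 2 * n_pairs, 2))


if __name__ == "__main__":
    for n_pairs in (1, 2, 3):
        pts = [f"x{i}" for i in range(1, 2 * n_pairs + 1)]
        n_terms = sum(1 for _ in pairings(pts))
        print(2 * n_pairs, "points:", n_terms, "terms,",
              "(2n-1)!! =", double_factorial_odd(n_pairs))
    # The three terms of the free 4-point function, with D(a,b)
    # standing for the propagator i Delta_F(a - b):
    for term in pairings(["x1", "x2", "x3", "x4"]):
        print(" * ".join(f"D({a},{b})" for a, b in term))
```

For four points this prints the three familiar products D(x1,x2) D(x3,x4), D(x1,x3) D(x2,x4) and D(x1,x4) D(x2,x3).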
In short, the S-matrix elements are obtained from a bunch of 2-point functions of free fields, together with some integrations and operations of $(\Box - m^2)$. It remains to have a convenient algorithm for computations.

A few remarks are in order. The normal ordering is introduced here as a simplification step. It works because the postulated unique Poincare invariant vacuum state is annihilated by all (free field) annihilation operators. This also requires the Poincare generators of the free fields to be normal ordered [footnote 11]. The Poincare generators of the interacting fields also need to annihilate the vacuum. Thanks to the assumed unitary operator linking the free and the interacting fields, these generators too are expressed in terms of the free field creation-annihilation operators. These expressions, in particular the Hamiltonian, must also be normal ordered for Poincare invariance. We will make this explicit for the interaction Hamiltonian $H_I(\Phi_{\rm in}, \Pi_{\rm in})$. In the quantum optics context, normal ordering was used since the detectors worked by absorbing a quantum from the field; in principle, a detector working by emission of a quantum would require anti-normal ordering. Such an option is not available with manifest Poincare invariance with a unique invariant vacuum state.

An explicit example of the evaluation of an $n$-point function will be useful. Consider $G_2(x_1, x_2)$ computed to second order in $H_I$. Consider the numerator first. All fields are 'in' fields and the 'in' subscript is suppressed below.
\[
\begin{aligned}
G_2(x_1,x_2)\big|_{Nr} \ = \ & \langle 0|T\Big\{ \Phi(x_1)\Phi(x_2) \Big[ 1 - i\int_{-\infty}^{\infty} dt' \, H_I(t') + \frac{(-i)^2}{2!}\int_{-\infty}^{\infty} dt_1 dt_2 \, T\{H_I(t_1)H_I(t_2)\} \Big] \Big\}|0\rangle \\
= \ & \langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle \ - \ i\int_{-\infty}^{\infty} dt_1 \, \langle 0|T\{\Phi(x_1)\Phi(x_2)H_I(t_1)\}|0\rangle \\
& + \frac{(-i)^2}{2}\int_{-\infty}^{\infty} dt_1 dt_2 \, \langle 0|T\{\Phi(x_1)\Phi(x_2)H_I(t_1)H_I(t_2)\}|0\rangle
\end{aligned}
\]
Now let us take a specific interaction Hamiltonian, the simplest non-trivial one being $H_I(t) := g\int d^3y \, :\Phi(t,\vec y)^3:$.
As noted above, for the case of a Poincare invariant vacuum, the Poincare generators and hence in particular the Hamiltonian must be normal ordered. The time integrations combine with the spatial integrations to give a space-time integral. Thus, we get,
\[
\begin{aligned}
G_2(x_1,x_2)\big|_{Nr} \ = \ & \langle 0|T\{\Phi(x_1)\Phi(x_2)\}|0\rangle \ - \ ig\int d^4y \, \langle 0|T\{\Phi(x_1)\Phi(x_2) :\Phi^3(y):\}|0\rangle \\
& + \frac{(-i)^2}{2} g^2 \int d^4y_1 \int d^4y_2 \, \langle 0|T\{\Phi(x_1)\Phi(x_2) :\Phi^3(y_1): \, :\Phi^3(y_2):\}|0\rangle \\
= \ & i\Delta_F(x_1 - x_2) \ - \ \frac{g^2}{2}\int d^4y_1 \int d^4y_2 \, \langle 0|T\{\Phi(x_1)\Phi(x_2) :\Phi^3(y_1): \, :\Phi^3(y_2):\}|0\rangle
\end{aligned}
\]
[Footnote 11: Note that while abelian symmetries could allow a non-zero multiple of the identity for invariance (i.e. $P^\mu|0\rangle = \alpha^\mu|0\rangle$), non-abelian symmetry generators have to annihilate the vacuum. Hence the Lorentz generators must annihilate the vacuum and then the commutator of $M^{\mu\nu}$ with $P^\lambda$ shows that $\alpha^\mu = 0$ must also hold. The Poincare group, being non-abelian, must have its generators normal ordered for the vacuum to be invariant.]

The first term is just the Feynman propagator and the $o(g)$ term vanishes since it is a 5-point function. The non-trivial term is the $o(g^2)$ term. This is not in the form of Wick's theorem, in that there are normal ordered factors. We can replace the normal ordered product by the unordered product minus the products of two point functions, i.e. use Wick's theorem in reverse form. For instance, consider (for three distinct points)
\[
\begin{aligned}
:\Phi(y_1)\Phi(y_1')\Phi(y_1''): \ = \ & \Phi(y_1)\Phi(y_1')\Phi(y_1'') \ - \ \langle 0|\Phi(y_1)\Phi(y_1')|0\rangle \, \Phi(y_1'') \\
& - \langle 0|\Phi(y_1')\Phi(y_1'')|0\rangle \, \Phi(y_1) \ - \ \langle 0|\Phi(y_1)\Phi(y_1'')|0\rangle \, \Phi(y_1')
\end{aligned}
\]
and likewise for the second factor of $:\Phi(y_2)^3:$. If we now substitute these in the $o(g^2)$ term and eventually take the coincidence limit of the primed arguments, we see that the two point functions of coincident points cancel out. The net result is that in writing the permutation terms, we omit the Feynman propagators of coincident points, and this holds because we use the normal ordered interaction Hamiltonian.
If we did not normal order the interaction Hamiltonian then these terms, which are formally divergent, would remain and would have to be handled differently. This is what will happen in the functional integral method which will be discussed later.

Keeping this in mind, the $o(g^2)$ term under the vev becomes,
\[
\begin{aligned}
\langle 0|T\{\Phi(x_1)\Phi(x_2) :\Phi^3(y_1): \, :\Phi^3(y_2):\}|0\rangle \ = \ & i\Delta_F(x_1 - x_2)\cdot i\Delta_F(y_1 - y_2)\cdot i\Delta_F(y_1 - y_2)\cdot i\Delta_F(y_1 - y_2) \ + \ \text{permutations} \\
= \ & (i^4)\big[ \Delta_F(x_1 - x_2)\Delta_F(y_1 - y_2)^3 \ + \ \Delta_F(x_1 - y_1)\Delta_F(x_2 - y_2)\Delta_F(y_1 - y_2)^2 \\
& \quad + \Delta_F(x_1 - y_2)\Delta_F(x_2 - y_1)\Delta_F(y_1 - y_2)^2 \big]
\end{aligned}
\]
where in the last form each distinct product is to be weighted by its combinatorial factor, discussed below. There are no other 'pairings' of space-time points as they would involve Feynman propagators with coincident arguments.

The denominator of the $n$-point function has the same structure as the numerator except for the $\Phi(x_1)\Phi(x_2)$ fields, i.e.
\[
\begin{aligned}
G_2(x_1,x_2)\big|_{Dr} \ = \ & \langle 0|0\rangle \ - \ ig\int d^4y \, \langle 0|T\{:\Phi^3(y):\}|0\rangle \ + \ \frac{(-i)^2}{2} g^2 \int d^4y_1 \int d^4y_2 \, \langle 0|T\{:\Phi^3(y_1): \, :\Phi^3(y_2):\}|0\rangle \\
= \ & 1 \ - \ \frac{g^2}{2}\int d^4y_1 \int d^4y_2 \, \langle 0|T\{:\Phi^3(y_1): \, :\Phi^3(y_2):\}|0\rangle \\
= \ & 1 \ + \ \frac{(-ig)^2}{2}\,(CF)\int d^4y_1 \int d^4y_2 \, \big( i\Delta_F(y_1 - y_2) \big)^3
\end{aligned}
\]
Here, $CF$ is the combinatorial factor - the number of distinct ways of effecting the same pairings - and equals $3!$ in the example above.

The generalization is fairly obvious. For definiteness consider the interaction Hamiltonian density to be $\mathcal{H}_I(\Phi(y)) := \frac{g}{k!} :\Phi^k(y):$. Then, it is apparent that the numerator of a general $n$-point function is a sum of the $m$-th order contributions, $m = 0, 1, \ldots$. The $m$-th order contribution has the form:
• $\frac{\left( (-ig)/k! \right)^m}{m!} \int d^4y_1 \ldots d^4y_m \, \langle 0|T\{\Phi(x_1)\cdots\Phi(x_n) :\Phi^k(y_1): \cdots :\Phi^k(y_m):\}|0\rangle$ ;
• Wick's theorem gives a sum of terms, each of which is a product of a total of $(n + km)/2$ Feynman propagators, $i\Delta_F(x_j - y_l)$, whose arguments give the 'pairing' of space-time points from the set: $x_1, \ldots, x_n$, $k$ copies of $y_1$, ..., $k$ copies of $y_m$. The normal ordered form of the interaction Hamiltonian ensures that the paired points are necessarily distinct. The $x_1, \ldots, x_n$ points are assumed to be distinct.
A 'pairing' is also called a 'Wick contraction'. The sum of terms is generated from any particular one by permutations. If $n + km$ is odd, the contribution vanishes;
• There is also a combinatorial factor counting the distinct ways of getting the same set of pairings (or Feynman propagators).

The denominator has similar contributions except for the $x_1, \ldots, x_n$ points.

There is a convenient diagrammatic representation to keep track of the generated terms as well as to associate specific factors and integrals with them. The diagrams are the Feynman diagrams while the prescription to construct the associated integral is given by the Feynman rules.

Procedure to generate Feynman diagrams for a Green's function:
• For each distinct $x_1, \ldots, x_n$, draw a vertex with a single edge. Call these vertices external vertices;
• For a normal ordered monomial of order $k$, draw a vertex $y_j$ with $k$ edges sticking out. Call these the internal vertices;
• Wick contract or pair, by joining two edges from two distinct vertices, internal or external. If one or both vertices of the pair are external vertices, the line is called an external line, while a pairing with both vertices internal is called an internal line.
This constructs a Feynman diagram with vertices and internal/external lines.

Procedure to associate an integral with a Feynman diagram:
• With each contraction, associate a Feynman propagator, $i\Delta_F(z' - z'')$.
• With each internal vertex, associate a factor of $(-i)\times$ the coefficient of the monomial of the fields in $\mathcal{H}_I(\Phi(y))$ and an integral over its space-time location, $\int d^4y$.
• Associate a numerical factor as a ratio. In the numerator, count the number of distinct ways of generating the same diagram (or the same contractions). In the denominator, put $m!$ if $m$ is the total number of internal vertices.

For the above example of $G_2(x_1, x_2)$ to second order for $\mathcal{H}_I = g :\Phi(y)^3:$, here is a summary:

$o(g)$: Has one loose end and the vev is zero.
$o(g^2)$:
(a) [Diagram: $x_1$ joined to $y_1$, $x_2$ joined to $y_2$, and $y_1$, $y_2$ joined by two lines.]
\[ \frac{(-ig)^2}{2!}\,(3\cdot 3\cdot 2\cdot 1)\int d^4y_1 d^4y_2 \, \big(i\Delta_F(x_1 - y_1)\big)\big(i\Delta_F(x_2 - y_2)\big)\big(i\Delta_F(y_1 - y_2)\big)^2 \]
(b) [Diagram: $x_1$ joined to $y_2$, $x_2$ joined to $y_1$, and $y_1$, $y_2$ joined by two lines.]
\[ \frac{(-ig)^2}{2!}\,(3\cdot 3\cdot 2\cdot 1)\int d^4y_1 d^4y_2 \, \big(i\Delta_F(x_1 - y_2)\big)\big(i\Delta_F(x_2 - y_1)\big)\big(i\Delta_F(y_1 - y_2)\big)^2 \]
(c) [Diagram: $x_1$ joined to $x_2$, and separately $y_1$, $y_2$ joined by three lines.]
\[ \frac{(-ig)^2}{2!}\,(1\cdot 3\cdot 2\cdot 1)\int d^4y_1 d^4y_2 \, \big(i\Delta_F(x_1 - x_2)\big)\big(i\Delta_F(y_1 - y_2)\big)^3 \]

The diagrams (a) and (b) are examples of connected diagrams while (c) is an example of a topologically disconnected diagram. Here (dis)connection refers to (dis)connection with external vertices. A diagram with no external vertices is called a vacuum bubble. The diagram in (c) has a disconnected piece which is a vacuum bubble. The denominator diagrams are all vacuum bubbles.

It is actually possible to separate the vacuum bubble components of the diagrams in the numerator and cancel them against those in the denominator. The proof goes as follows.

In an $n$-point function, summing over the orders $k$, the numerator has the form,
\[ Nr \ = \ \sum_{k=0}^{\infty}\frac{(-i)^k}{k!}\int d^4y_1 \ldots d^4y_k \, \langle 0|T\{\Phi(x_1)\cdots\Phi(x_n) :\mathcal{H}_I(y_1): \cdots :\mathcal{H}_I(y_k):\}|0\rangle \]
Group the contractions into two groups: (i) those involving $l$ of the internal vertices, each connected by contractions to at least one of the external vertices, and (ii) those among the remaining $(k - l)$ internal vertices, which have no segment connecting to an external vertex. The above vev then splits as,
\[
\begin{aligned}
\langle 0|T\{\Phi(x_1)\cdots\Phi(x_n) :\mathcal{H}_I(y_1): \cdots :\mathcal{H}_I(y_k):\}|0\rangle \ = \ & \langle 0|T\{\Phi(x_1)\cdots\Phi(x_n) :\mathcal{H}_I(y_1): \cdots :\mathcal{H}_I(y_l):\}|0\rangle \\
& \times \langle 0|T\{:\mathcal{H}_I(y_1): \cdots :\mathcal{H}_I(y_{k-l}):\}|0\rangle
\end{aligned}
\]
where the first factor is understood to contain only the contractions of group (i). This split can happen in ${}^kC_l$ ways. Hence,
\[
\begin{aligned}
Nr \ = \ \sum_{k=0}^{\infty}\frac{(-i)^k}{k!}\sum_{l=0}^{k}\frac{k!}{l!(k-l)!} & \int d^4y_1 \ldots d^4y_l \, \langle 0|T\{\Phi(x_1)\cdots\Phi(x_n) :\mathcal{H}_I(y_1): \cdots :\mathcal{H}_I(y_l):\}|0\rangle \\
\times & \int d^4y_{l+1} \ldots d^4y_k \, \langle 0|T\{:\mathcal{H}_I(y_{l+1}): \cdots :\mathcal{H}_I(y_k):\}|0\rangle
\end{aligned}
\]
Putting $(-i)^k = (-i)^l(-i)^{(k-l)}$ and interchanging the order of summation we can write,
\[
\begin{aligned}
Nr \ = \ & \Big[ \sum_{l=0}^{\infty}\frac{(-i)^l}{l!}\int d^4y_1 \ldots d^4y_l \, \langle 0|T\{\Phi(x_1)\cdots\Phi(x_n) :\mathcal{H}_I(y_1): \cdots :\mathcal{H}_I(y_l):\}|0\rangle \Big] \\
& \times \Big[ \sum_{k=l}^{\infty}\frac{(-i)^{(k-l)}}{(k-l)!}\int d^4y_{l+1} \ldots d^4y_k \, \langle 0|T\{:\mathcal{H}_I(y_{l+1}): \cdots :\mathcal{H}_I(y_k):\}|0\rangle \Big]
\end{aligned}
\]
Shifting $k \to k + l$ in the second square bracket shows it to be $\langle 0|T\{\exp(-i\int_{-\infty}^{\infty} dt \, H_I(t))\}|0\rangle$, which is just the denominator! The first factor is called the connected Green's function and is denoted $G_{cn}$. It consists of only the diagrams connected to external vertices. The diagrams may be topologically disconnected. From now on we focus on the connected Green's functions and drop the denominator.

Having obtained Feynman diagrams and Feynman integrals, we further simplify the expression for the S-matrix elements.

To go from Green's functions to S-matrix elements (a) we operate on the Green's function by the equation of motion differential operator for each external vertex; (b) insert the wave function factors for each external vertex; (c) divide by $\sqrt{Z}$ for each external vertex and (d) integrate over the external vertices.

We have only connected diagrams. Among these are topologically disconnected diagrams too. There is a special subclass of topologically disconnected diagrams in which two external vertices are Wick contracted. These represent an incoming particle which goes out without any scattering. We can separate such un-scattered processes from the S-matrix (which we will do a little later) and focus on processes wherein every incoming particle necessarily scatters, i.e. every external vertex is necessarily connected to an internal vertex.

Consider an external vertex $x$ connected with an internal vertex $y$. Associated with the external vertex is a wavefunction factor $\frac{e^{ik\cdot x}}{\sqrt{2\omega_{\vec k}(2\pi)^3}}$ (or $\frac{e^{-ik'\cdot x'}}{\sqrt{2\omega_{\vec k'}(2\pi)^3}}$), an integration over $x$ (or $x'$), a Feynman propagator $\Delta_F(x - y)$ and $(\Box_x - m^2)$ acting on the propagator. Since the propagator is a Green's function, $(\Box_x - m^2)\Delta_F(x - y) = -\delta^4(x - y)$. Integration over $x$ then removes the delta function and $e^{ik\cdot x} \to e^{ik\cdot y}$. For the external vertex of an out-going particle, $e^{-ik'\cdot x'} \to e^{-ik'\cdot y}$. Thus the integration over external vertices is trivially carried out and the propagators involving an external vertex are removed - the "external leg is said to be amputated".

Now use the Fourier representation of the remaining propagators,
\[ \Delta_F(y_i - y_j) \ = \ \int\frac{d^4l}{(2\pi)^4}\,\frac{e^{il\cdot(y_i - y_j)}}{l^2 + m^2 - i\epsilon} . \]
Each internal vertex $y_i$ is connected to possibly several other internal vertices as well as possibly several external vertices. For each internal line ending on it, the Fourier transform supplies a factor of $e^{il\cdot y_i}$ and each external vertex provides a factor of $e^{ik\cdot y_i}$. All of these combine and the $\int d^4y_i$ then gives a $(2\pi)^4\delta^4(\sum k - \sum k' + \sum l)$. Thus, the integrations over internal vertices result in momentum conserving delta functions. The space-time integrations are done but we are left with integrations over the Fourier momenta which are associated with internal lines.

The delta functions trivialize many momentum integrations, leaving an overall delta function involving only the external momenta and enforcing momentum conservation. We are also left with some unconstrained loop momenta. To be explicit, let $n_E, n_I$ denote the number of external and internal lines and let $n_v$ denote the number of internal vertices. The $n_E$ external momenta are fixed by the in/out states, so we have $n_I$ momenta to integrate over and $n_v$ conservation equations. Thus $n_I - n_v + 1$ is the number of undetermined momenta, or the loop momenta. The $+1$ in the counting signifies the loss of one conservation equation due to the left over delta function enforcing conservation of external momenta.

There is no dependence on space-time points and no space-time integrations are left. The scattering matrix element is given by an integration over a bunch of loop momenta with a product of momentum space propagators forming the integrand and of course the various factors of $i$, $\pi$ and numerical combinatorial factors.
\[ S_{fi} \ \sim \ \delta^4\Big( \sum k' - \sum k \Big)\prod_j\int d^4l_j \prod_{i=1}^{n_I}\frac{1}{p_i^2(k,l) + m^2 - i\epsilon} \]
We will elaborate the numerical factors in explicit examples.
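The bookkeeping of this section can be cross-checked by brute force for the $G_2$ example with $\mathcal{H}_I = g :\Phi^3:$ at second order. The sketch below is only an illustration under stated assumptions: each vertex is given labelled "legs", pairings within the same vertex are discarded (this is how the effect of normal ordering is modelled here), and the loop number is taken as (internal lines) - (internal vertices) + 1, which assumes the internal vertices form a single connected piece. The function names are hypothetical helpers, not notation from these notes.

```python
from collections import Counter


def pairings(legs):
    """Yield every complete pairing (Wick contraction pattern) of `legs`."""
    if not legs:
        yield []
        return
    first, rest = legs[0], legs[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def contraction_classes(external, internal, valence):
    """Count leg-level contractions per diagram.

    External vertices carry one leg, internal vertices `valence` legs.
    Pairings that join two legs of the same vertex are skipped, since
    normal ordering removes propagators with coincident arguments.
    Returns a Counter mapping the sorted tuple of lines to its count.
    """
    legs = [(v, 0) for v in external]
    legs += [(v, j) for v in internal for j in range(valence)]
    classes = Counter()
    for pairing in pairings(legs):
        if any(a[0] == b[0] for a, b in pairing):
            continue
        lines = tuple(sorted(tuple(sorted((a[0], b[0]))) for a, b in pairing))
        classes[lines] += 1
    return classes


def loop_number(lines, internal):
    """Internal lines - internal vertices + 1 (one connected internal piece assumed)."""
    internal = set(internal)
    n_internal_lines = sum(1 for a, b in lines if a in internal and b in internal)
    return n_internal_lines - len(internal) + 1


if __name__ == "__main__":
    ext, inn = ["x1", "x2"], ["y1", "y2"]
    for lines, count in sorted(contraction_classes(ext, inn, 3).items()):
        print(count, "contractions, loops =", loop_number(lines, inn), ":", lines)
```

Only three diagrams survive: the two with an external leg on each vertex arise from 3·3·2·1 = 18 contractions each and carry one loop momentum, while the one with $x_1$ joined to $x_2$ arises from 1·3·2·1 = 6 contractions and contains a two-loop vacuum bubble.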
The structure of the S-matrix elements (non-trivial scattering) should be clear from the above discussion.

To separate out the trivial scattering it is customary to introduce the so-called T-matrix as,
\[ S \ := \ 1 + iT , \qquad \langle k'_{j'}|T|k_j\rangle \ := \ (2\pi)^4\delta^4\Big( \sum_{j'}k'_{j'} - \sum_j k_j \Big)\,\mathcal{M}(k_j \to k'_{j'}) \]
Here the $k, k'$ denote the on-shell momenta of the incoming and outgoing particles. The delta function enforcing momentum conservation is a consequence of translation invariance and $\mathcal{M}$ is called the invariant matrix element. This is what is computed in practice. The $S_{fi}$ given above is really the $iT_{fi}$.

In practice, computation of the T-matrix elements is only part of what is needed to compare with experiments, which measure cross-sections.

To compute cross-sections, recall that the S-matrix elements are the transition probability amplitudes whose non-trivial contributions are identified as $\langle f|i\rangle = (2\pi)^4\delta^4(k'_{total} - k_{total})(i\mathcal{M}_{i\to f})$. The cross-section is the ratio of the number of outgoing particles per second to the incident flux. The numerator is proportional to the transition probability rate while the denominator is determined by the initial state.

We have given the S-matrix elements in the plane wave basis, which is strictly incorrect. One should use wave packets for representing the asymptotic states. Alternatively, a commonly used practice is to put the system in a finite space-time box and take the limit of an infinite box at the end. This is simpler to implement in practice and suffices for most purposes. We will use this [12] and refer to [13] for the wave packet treatment.

B. Differential Cross-section for $2 \to n$ process

Imagine the scattering experiment to be enclosed in a large spatial box of volume $V = L^3$, with periodic boundary conditions imposed on the mode functions. Let the duration of the experiment be a large time interval $T$.
The probability of transition from an initial state $|i\rangle$ to a final state $|f\rangle$ is given by,
\[ \mathrm{Prob}_{i\to f} \ = \ \frac{|\langle f|i\rangle|^2}{\langle i|i\rangle\langle f|f\rangle} \ = \ \frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - \Sigma k_j)\,[(2\pi)^4\delta^4(0)]\,|i\mathcal{M}_{i\to f}|^2}{\langle i|i\rangle\langle f|f\rangle} \]
The square of the momentum conservation delta function is written with one factor as $\delta(0)$, which is to be understood as: $(2\pi)^4\delta^4(0) = \int d^4x \, e^{i0\cdot x} = \int d^4x := VT$. The initial and final states are normalized using $\langle k|k\rangle := \lim_{k'\to k}\delta^3(\vec k' - \vec k) = \delta^3(0) := \frac{V}{(2\pi)^3}$. Typically, the initial state consists of two particles. So let us specialize to this case of $2 \to n$ processes. Then, $\langle i|i\rangle\langle f|f\rangle = \left[ \frac{V}{(2\pi)^3} \right]^{(n+2)}$. Therefore,
\[ \mathrm{Prob}_{i\to f} \ = \ VT\,\frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - \Sigma k_j)\,|i\mathcal{M}_{i\to f}|^2}{[V/(2\pi)^3]^{n+2}} \]
Real detectors have a finite aperture and hence the detected particle's momentum is anywhere within a small window around the central value. The corresponding probabilities are thus to be added. An estimate for such a window follows from the box normalization we have taken. The momenta are given by $\vec k_j = \frac{2\pi}{L}\vec n_j$. Summing over the momenta within a window is the same as summing over the integers within a window and for large volume, $\Sigma_{\vec n_j} \approx \frac{V}{(2\pi)^3}\int d^3k$, for each final state particle. This leads to the total probability per unit time for the transition,
\[
\begin{aligned}
\frac{d\,\mathrm{Prob}_{2\to n}}{T} \ = \ & V\,\frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - \Sigma k_j)\,|i\mathcal{M}_{2\to n}|^2}{[V/(2\pi)^3]^{n+2}}\,\prod_{j=1}^{n}\frac{V d^3k_j}{(2\pi)^3} \\
= \ & \frac{(2\pi)^4\delta^4(\Sigma k'_{j'} - k_1 - k_2)\,\Big| i\mathcal{M}_{2\to n}\sqrt{\prod_{j=1}^{n+2}\big[(2\pi)^3 2\omega_{\vec k_j}\big]}\,\Big|^2}{V\,(2\omega_{\vec k_1})(2\omega_{\vec k_2})}\,\prod_{j=1}^{n}\frac{d^3k_j}{2\omega_{\vec k}(2\pi)^3}
\end{aligned}
\]
In going from the first to the second line, we have divided and multiplied by the product of $(2\omega_{\vec k})$ for all the $(n+2)$ particles. The last product is the $d\Gamma_n$, which is the Lorentz invariant phase space volume. The square root factor will get absorbed in the invariant matrix element and will get rid of similar factors coming from the wave functions of the incoming and outgoing particles.

We also need the incident flux.
For the initial state of two particles, consider the laboratory frame where particle '2' is at rest. Particle '1' has a speed $|\vec k_1|/E_1$ and the box normalization gives a number density of one particle per volume $V$, i.e. $1/V$. Hence, the incident number flux is $|\vec k_1|/(E_1 V)$. Dividing by the flux, the exclusive, differential cross-section for the $2 \to n$ process is given by,
\[ d\sigma_{lab} \ = \ \frac{(2\pi)^4\delta^4(k_1 + k_2 - \Sigma_j k_j)\,|i\tilde{\mathcal{M}}_{2\to n}|^2}{4 m_2 |(\vec k_1)_{lab}|}\,d\Gamma_n \]
where,
\[ \tilde{\mathcal{M}}_{2\to n} \ := \ \mathcal{M}_{2\to n}\sqrt{\prod_{j=1}^{n+2}\big[ 2\omega_{\vec k_j}(2\pi)^3 \big]} \qquad \text{and,} \qquad d\Gamma_n \ := \ \prod_{j=1}^{n}\frac{d^3k_j}{2\omega_{\vec k}(2\pi)^3} \]
We have used $\omega_{\vec k_2} = m_2$ and $\omega_{\vec k_1} = E_1$. Notice that all factors of the volume $V$ and the duration $T$ have disappeared. The explicit square root factors will also cancel out.

Many basic calculations involve scattering processes with two particles going into two particles. We have already used the initial state of two particles to obtain the incident flux. We will now also specialize to $n = 2$ for the out-going particles and write the phase space integrals more explicitly.

1. The Special case of $2 \to 2$ processes

Let the initial momenta be denoted by $k_1, k_2$ and the final momenta be denoted by $k'_1, k'_2$. Lorentz invariance implies that the scattering amplitude will have Lorentz indices (tensorial or spinorial) carried by the wavefunctions and the invariant amplitude will be a function of Lorentz invariants. We have three independent momenta thanks to the overall conservation due to translation invariance and we can form 6 Lorentz scalars of the form $p_i\cdot p_j$. Of these, three are masses and the remaining three are conveniently defined as the "center of mass energy" and two types of "squared momentum transferred".
These are known as the Mandelstam variables and are defined as ($k_1 + k_2 = k'_1 + k'_2$),
\[ s \ := \ -(k_1 + k_2)^2 \ = \ -(k'_1 + k'_2)^2 \qquad \text{(Squared Centre of Mass Energy)} \tag{11.14} \]
\[ t \ := \ -(k'_1 - k_1)^2 \ = \ -(k'_2 - k_2)^2 \qquad \text{(Squared Momentum transfer)} \tag{11.15} \]
\[ u \ := \ -(k'_2 - k_1)^2 \ = \ -(k_2 - k'_1)^2 \qquad \text{(Squared Momentum transfer)} \tag{11.16} \]
The definitions imply the Mandelstam identity: $s + u + t = m^2_{k_1} + m^2_{k_2} + m^2_{k'_1} + m^2_{k'_2}$.

Laboratory and Centre of Mass Frames:

The lab frame is defined by regarding particle 2, say, at rest while particle 1 is incident on it. Thus,
\[ k_1^{lab} \ = \ \Big( \sqrt{\vec k_1^2 + m_1^2} \, , \, \vec k_1^{lab} \Big) , \qquad k_2^{lab} \ = \ (m_2 , \vec 0) \]
The lab frame is convenient for expressing the incident flux.

The center of mass frame is defined by $\vec k_{total} := \vec k_1 + \vec k_2 = 0 = \vec k'_1 + \vec k'_2$. Thus,
\[ k_1^{cm} \ = \ \Big( \sqrt{\vec k_{cm}^2 + m_1^2} \, , \, \vec k_{cm} \Big) , \qquad k_2^{cm} \ = \ \Big( \sqrt{\vec k_{cm}^2 + m_2^2} \, , \, -\vec k_{cm} \Big) \]
This frame is more convenient for defining the scattering angle. The common momentum direction singles out, say, the z-axis. Orienting the frame accordingly, we take
\[ k_1 = (E_1, k_{cm}\hat z) , \quad k_2 = (E_2, -k_{cm}\hat z) , \quad k'_1 = (E'_1, \vec k') , \quad k'_2 = (E'_2, -\vec k') , \quad \hat k'\cdot\hat z =: \cos(\Theta_{cm}) . \tag{11.17} \]
The Mandelstam invariant $s$ can be used to relate the lab frame momentum $|k_1|_{lab}$ and the center of mass momentum $|k|_{cm}$. The invariant $s := -(k_1 + k_2)^2 = -(-m_1^2 - m_2^2 - 2E_1E_2 + 2\vec k_1\cdot\vec k_2)$ has two equivalent expressions:
\[ s|_{lab} \ = \ m_1^2 + m_2^2 + 2m_2\sqrt{m_1^2 + \vec k_1^2} \]
\[ s|_{cm} \ = \ m_1^2 + m_2^2 + 2\sqrt{m_1^2 + \vec k_{cm}^2}\,\sqrt{m_2^2 + \vec k_{cm}^2} + 2\vec k_{cm}^2 . \]
These can be solved for the momenta and give manifestly Lorentz invariant expressions,
\[ |k_1|_{lab} \ = \ \frac{1}{2m_2}\sqrt{s^2 - 2s(m_1^2 + m_2^2) + (m_1^2 - m_2^2)^2} \tag{11.18} \]
\[ |k|_{cm} \ = \ \frac{1}{2\sqrt{s}}\sqrt{s^2 - 2s(m_1^2 + m_2^2) + (m_1^2 - m_2^2)^2} \tag{11.19} \]

Phase space volume in center of mass frame:

We have the Lorentz invariant definition,
\[ d\Gamma_2 \ = \ (2\pi)^4\delta^4(k'_1 + k'_2 - k_1 - k_2)\,\frac{d^3k'_1}{2\omega_{k'_1}(2\pi)^3}\,\frac{d^3k'_2}{2\omega_{k'_2}(2\pi)^3} . \]
In the CM frame we simplify it using: $\delta^4 \to \delta(E'_1 + E'_2 - \sqrt{s})\,\delta^3(\vec k'_1 + \vec k'_2)$ since $\vec k_1 + \vec k_2 = 0$, $s = (E_1 + E_2)^2$. We can remove the momentum delta function by integrating over $\vec k'_2$.
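As a numerical cross-check of the kinematic relations (11.14)-(11.19), the following minimal sketch builds an elastic $2 \to 2$ configuration in the CM frame, evaluates $s, t, u$ with the mostly-plus metric used in these notes, and compares against the closed-form momenta. The sample masses, momentum and angle, and all function names, are illustrative assumptions.

```python
from math import sqrt, sin, cos, isclose


def kallen(s, m1, m2):
    """s^2 - 2 s (m1^2 + m2^2) + (m1^2 - m2^2)^2, the expression under the roots."""
    return s * s - 2.0 * s * (m1**2 + m2**2) + (m1**2 - m2**2) ** 2


def k_lab(s, m1, m2):
    """Lab-frame momentum of particle 1 (particle 2 at rest), eq. (11.18)."""
    return sqrt(kallen(s, m1, m2)) / (2.0 * m2)


def k_cm(s, m1, m2):
    """Centre-of-mass momentum, eq. (11.19)."""
    return sqrt(kallen(s, m1, m2)) / (2.0 * sqrt(s))


def minus_square(p):
    """-(p.p) with metric (-,+,+,+); equals m^2 for an on-shell momentum."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def combine(p, q, sign=1.0):
    """Return p + sign * q, component by component."""
    return tuple(a + sign * b for a, b in zip(p, q))


def elastic_cm_momenta(k, theta, m1, m2):
    """Incoming momenta along z, outgoing rotated by theta in the x-z plane."""
    e1, e2 = sqrt(m1**2 + k**2), sqrt(m2**2 + k**2)
    k1 = (e1, 0.0, 0.0, k)
    k2 = (e2, 0.0, 0.0, -k)
    k1p = (e1, k * sin(theta), 0.0, k * cos(theta))
    k2p = (e2, -k * sin(theta), 0.0, -k * cos(theta))
    return k1, k2, k1p, k2p


if __name__ == "__main__":
    m1, m2, k, theta = 0.5, 1.2, 0.8, 0.7
    k1, k2, k1p, k2p = elastic_cm_momenta(k, theta, m1, m2)
    s = minus_square(combine(k1, k2))
    t = minus_square(combine(k1p, k1, -1.0))
    u = minus_square(combine(k2p, k1, -1.0))
    print("s + t + u =", s + t + u, " sum of m^2 =", 2 * (m1**2 + m2**2))
    print("k_cm from s:", k_cm(s, m1, m2), " input:", k)
    print("2 m2 k_lab =", 2 * m2 * k_lab(s, m1, m2),
          " 2 sqrt(s) k_cm =", 2 * sqrt(s) * k_cm(s, m1, m2))
```

The last printed pair illustrates the relation $2m_2|k_1|_{lab} = 2\sqrt{s}\,|k|_{cm}$, which follows directly from dividing (11.18) by (11.19).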
Denoting $d^3k'_1 := d^3k'_{cm} = dk'_{cm}\,(k')^2_{cm}\,d\Theta_{cm}\sin(\Theta_{cm})\,d\phi$, we write,
\[ \int_{\vec k'_2} d\Gamma_2 \ = \ \frac{1}{(2\pi)^2}\,\frac{1}{4E'_1E'_2}\,\delta(E'_1 + E'_2 - \sqrt{s})\;dk'_{cm}\,(k')^2_{cm}\,d\Theta_{cm}\sin(\Theta_{cm})\,d\phi \]
Since $E'_1 + E'_2 = \sqrt{(m')^2_1 + (k')^2_{cm}} + \sqrt{(m')^2_2 + (k')^2_{cm}}$ is a function of $k'_{cm}$, we simplify the delta function using $\delta(f(x)) = \sum_i\frac{\delta(x - x_i)}{|f'(x_i)|}$, $f(x_i) = 0$, with $f(k'_{cm}) := E'_1 + E'_2 - \sqrt{s}$. We have
\[ f' \ = \ \frac{k'_{cm}}{E'_1} + \frac{k'_{cm}}{E'_2} \ = \ k'_{cm}\,\frac{E'_1 + E'_2}{E'_1E'_2} \ = \ \frac{k'_{cm}\sqrt{s}}{E'_1E'_2} , \]
\[ f(k'_{cm}) = 0 \ \Rightarrow \ k'_{cm} \ = \ \frac{\sqrt{s^2 - 2s((m')^2_1 + (m')^2_2) + ((m')^2_1 - (m')^2_2)^2}}{2\sqrt{s}} \]
Doing the $dk'_{cm}$ integration using the delta function gives,
\[ \int dk'_{cm}\,(k')^2_{cm}\,\delta(E'_1 + E'_2 - \sqrt{s}) \ = \ \frac{(k')^2_{cm}E'_1E'_2}{k'_{cm}\sqrt{s}} \qquad \text{which gives} \qquad d\Gamma_2 \ = \ \frac{1}{16\pi^2}\,\frac{k'_{cm}}{\sqrt{s}}\,d^2\Omega_{cm} . \tag{11.20} \]
Thus, for the special case of a process with 2 particles going to 2 particles, one gets the differential cross-section as,
\[ \frac{d\sigma}{d\Omega_{cm}} \ = \ \frac{1}{64\pi^2}\,\frac{1}{s}\,\frac{k'_{cm}}{k_{cm}}\,|\mathcal{M}|^2 \qquad \text{where we have used } 2m_2|k_1|_{lab} = 2\sqrt{s}\,|k|_{cm} , \text{ and} \tag{11.21} \]
\[ \frac{k'_{cm}}{k_{cm}} \ = \ \sqrt{\frac{s^2 - 2s((m')^2_1 + (m')^2_2) + ((m')^2_1 - (m')^2_2)^2}{s^2 - 2s(m_1^2 + m_2^2) + (m_1^2 - m_2^2)^2}} . \tag{11.22} \]
The invariant amplitude depends on the scattering angle $\Theta_{cm}$ and the total cross-section is obtained by integrating over the center of mass solid angle.

12. DIAGRAMMATIC RECIPE FOR S-MATRIX ELEMENTS

We now specify explicit interacting fields by giving an interaction Lagrangian and state the Feynman rules to complete the diagrammatic recipe for the T-matrix elements. We will state the rules for the $\Phi^4$ theory, the Yukawa theory and Quantum Electrodynamics (QED). In the next section we will compute specific processes.

There are a good many conventions of normalization etc. and they have to be kept track of carefully. The free (quadratic) part of the Lagrangian density sets the normalization of the propagators while the interaction terms (beyond quadratic in fields) contribute to the numerical factors. Here are the terms in the Lagrangian densities.
\[ \text{Free Scalar} : \ -\frac{1}{2}\partial^\mu\Phi\partial_\mu\Phi - \frac{1}{2}m^2\Phi^2(x) \ ; \tag{12.1} \]
\[ \text{Free Spinor} : \ -i\bar\Psi\not\!\partial\Psi + m\bar\Psi\Psi \ ; \tag{12.2} \]
\[ \text{Massless Vector} : \ -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} \ , \quad F_{\mu\nu} := \partial_\mu A_\nu - \partial_\nu A_\mu \ ; \tag{12.3} \]
\[ \text{Scalar self coupling} : \ -\frac{g}{3!}\Phi^3(x) - \frac{\lambda}{4!}\Phi(x)^4 \ ; \tag{12.4} \]
\[ \text{Yukawa coupling} : \ -g\Phi(x)\bar\Psi(x)\Psi(x) \ ; \tag{12.5} \]
\[ \text{QED coupling} : \ -ieA_\mu(x)\bar\Psi(x)\gamma^\mu\Psi(x) \ . \tag{12.6} \]
In the interaction Hamiltonian, all coupling terms will change signs. (12.7)

The propagators are the Feynman Green's functions for the free equations of motion. Consider the massless vector field as this has a new feature. The free Lagrangian (Maxwell) can be expressed as $\frac{1}{2}A_\mu(\eta^{\mu\nu}\Box - \partial^\mu\partial^\nu)A_\nu$ + divergence terms. The equation of motion is $(\eta^{\mu\nu}\Box - \partial^\mu\partial^\nu)A_\nu(x) = 0$. Equivalently, in Fourier space the equation takes the form $-(\eta^{\mu\nu}k^2 - k^\mu k^\nu)\tilde A_\nu(k) = 0$. However, the differential operator is not invertible and hence does not admit a Green's function! The non-invertibility follows because every $A_\nu$ of the form $\partial_\nu f(x)$ solves the equation. That is, $\partial_\nu f$ is a non-trivial eigenvector of the differential operator, with zero eigenvalue. For perturbation theory though we need a propagator, for which extra terms are added to the action to break its gauge invariance - invariance under $\delta A_\mu(x) = \partial_\mu\Lambda(x)$. This can be done in several ways and each choice corresponds to a gauge. A common and convenient choice is to add a $-\frac{1}{2}(\partial_\mu A^\mu)^2$ term to the action. Up to a divergence term, this is just $+\frac{1}{2}A_\mu\partial^\mu\partial^\nu A_\nu$ and precisely cancels the second term in the equation of motion operator, making it $\eta^{\mu\nu}\Box$ which is invertible. The added term is manifestly Lorentz invariant and the corresponding gauge is called the Lorentz gauge. For our purposes, this gauge will suffice. The propagator is an inverse of the differential operator since $(\eta^{\mu\nu}\Box_x)(D_F)_{\nu\lambda}(x - y) = \delta^\mu_{\ \lambda}\,\delta^4(x - y)$.

The QED coupling arises from the minimal substitution rule: $\vec P \to \vec P + e\vec A \ \leftrightarrow \ -i\partial_\mu \to -i\partial_\mu + eA_\mu \ \leftrightarrow \ \partial_\mu \to \partial_\mu + ieA_\mu$. Substitution in the free Dirac action gives a $-ieA_\mu$ as the QED coupling.
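The non-invertibility of the Maxwell operator, and its cure by the gauge-fixing term, can be seen in momentum space with a few lines of arithmetic. The sketch below is illustrative only: it uses the metric $\eta = \mathrm{diag}(-1,1,1,1)$ of these notes, an arbitrarily chosen sample momentum, and hypothetical helper names. It shows that the pure-gauge direction $\tilde A_\nu \propto k_\nu$ is annihilated by $\eta^{\mu\nu}k^2 - k^\mu k^\nu$ but not by $\eta^{\mu\nu}k^2$.

```python
from math import isclose

# Metric diag(-1, +1, +1, +1); numerically the same with upper or lower indices.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def lower(k_up):
    """Lower the index: k_mu = eta_{mu nu} k^nu."""
    return [sum(ETA[m][n] * k_up[n] for n in range(4)) for m in range(4)]


def k_squared(k_up):
    """k^2 = k^mu k_mu."""
    k_dn = lower(k_up)
    return sum(k_up[m] * k_dn[m] for m in range(4))


def maxwell_operator(k_up):
    """M^{mu nu} = eta^{mu nu} k^2 - k^mu k^nu (overall sign is irrelevant here)."""
    k2 = k_squared(k_up)
    return [[ETA[m][n] * k2 - k_up[m] * k_up[n] for n in range(4)] for m in range(4)]


def gauge_fixed_operator(k_up):
    """eta^{mu nu} k^2, the operator after adding -(1/2)(d.A)^2 to the action."""
    k2 = k_squared(k_up)
    return [[ETA[m][n] * k2 for n in range(4)] for m in range(4)]


def apply(op_up_up, a_dn):
    """Contract an operator with upper indices against a lower-index vector."""
    return [sum(op_up_up[m][n] * a_dn[n] for n in range(4)) for m in range(4)]


if __name__ == "__main__":
    k = [1.0, 0.3, -0.5, 2.0]          # sample off-shell momentum, k^2 = 3.34
    pure_gauge = lower(k)              # A_nu proportional to k_nu
    print("Maxwell operator on k_nu:    ", apply(maxwell_operator(k), pure_gauge))
    print("gauge-fixed operator on k_nu:", apply(gauge_fixed_operator(k), pure_gauge))
    print("k^2 * k^mu:                  ", [k_squared(k) * c for c in k])
```

The first line of output is zero up to rounding, which is the zero mode; the second coincides with $k^2 k^\mu$, so for $k^2 \neq 0$ the gauge-fixed operator has the inverse $\eta_{\nu\lambda}/k^2$.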
Let us gather the various factors for the scalar field in the T-matrix elements.

• Each in/out particle gives a wave function factor of $[2\omega_{\vec k}(2\pi)^3]^{-1/2}e^{\pm ik\cdot x}$, an $\int d^4x$ and $(\Box_x - m^2)$ acting on the $n$-point function. For a spinor field, the amputation operator changes to $(-i\not\!\partial + m)$ and we have additionally the $u, v, \bar u, \bar v$ spinors. For a vector field, the amputation operator changes and we have the polarization vectors in addition. Other factors remain the same.
• Each order in $H_I$ gives $(-i)\int d^4y$ from the T-ordered exponential and a $(m!)^{-1}$ for the $m$-th order, from the exponential;
• Each Wick contraction gives $(i\Delta_F(z - z')) := \int\frac{d^4l}{(2\pi)^4}\,\frac{i e^{il\cdot(z - z')}}{l^2 + m^2 - i\epsilon}$ ;
• Each amputation, i.e. action of $(\Box - m^2)$, gives a $(-\delta^4(x - y))$ while each integration over $x$ or $y$ gives $(2\pi)^4$ times a momentum conserving $\delta^4$. The momentum integrations do not produce any factors.
• Let $E, I, V$ denote the number of external lines (= number of external vertices), the number of internal lines and the number of internal vertices (= order of $H_I$) respectively. Then, the factors of $[2\omega_{\vec k}(2\pi)^3]^{-1/2}$ precisely cancel the explicit factor we found in the amplitude $\tilde{\mathcal{M}}_{2\to n}$. This is due to the normalization choices which cancel out in $\frac{|\langle f|i\rangle|^2}{\langle i|i\rangle\langle f|f\rangle}$ and can now be dropped from both $\tilde{\mathcal{M}}$ and from the T-matrix element.

Factors of $i$: $(-i)^V(i)^I = (-1)^V i^{V+I}$; Factors of $2\pi$: $(2\pi)^{-4I+4E+4V-4}$. The last $-4$ is because it has been taken out in the definition of $\mathcal{M}$ due to the overall momentum conservation.

Vertex factors, including the $i$ in the QED vertex, are to be taken case-by-case. There is the $(m!)^{-1}$ and a combinatorial factor that will come for each diagram. These too are taken case-by-case.

With these, we now state the Feynman rules for Feynman diagrams.
External Line (momentum $p$):

              scalar    fermion              anti-fermion            photon
  incoming    1         $u(p,\sigma)$        $\bar v(p,\sigma)$      $\varepsilon^*_\mu(p,\lambda)$
  outgoing    1         $\bar u(p,\sigma)$   $v(p,\sigma)$           $\varepsilon_\mu(p,\lambda)$

Internal Line (momentum $k$):

  scalar                    $\frac{-i}{k^2 + m^2 - i\epsilon}$
  fermion                   $\frac{(-i)(-\not\! k + m)}{k^2 + m^2 - i\epsilon}$
  photon (Lorentz gauge)    $\frac{-i\eta^{\mu\nu}}{k^2 - i\epsilon}$

Vertex:

  $\Phi^3$    $\frac{i}{3!}g$
  $\Phi^4$    $\frac{i}{4!}\lambda$
  Yukawa      $ig$
  QED         $ie\gamma^\mu$

13. ELEMENTARY PROCESSES IN YUKAWA AND QED: NR LIMIT

We begin with the scattering of a fermion off another fermion, interacting via the Yukawa coupling, and compute the T-matrix element to the leading order. We will take the non-relativistic limit and identify an equivalent potential. We will consider anti-fermion scattering and fermion-anti-fermion scattering as well. We will then compare the QED coupling and briefly the gravitational coupling. This will lead us to appreciate the dependence of the attractive/repulsive nature of the interaction on the spin of the exchanged particle.

Consider a general process depicted below together with its 'expansion'.

[Figure (13.1): the general $2 \to 2$ fermion scattering diagram with incoming momenta $p, k$ and outgoing momenta $p', k'$, expanded as: no scattering, plus a scalar exchange diagram with exchanged momentum $p' - p$, minus a scalar exchange diagram with exchanged momentum $k' - p$, plus higher orders.]

The first term ($o(g^0)$) denotes 'no scattering'. There are two contributions at $o(g^2)$ corresponding to the exchange of the two out-going fermions. The overall momentum conservation delta function is the same, enforcing $p + k = p' + k'$. The Feynman rules give the expression as,
\[ i\mathcal{M} \ = \ (ig)^2\Big[ \bar u(p')u(p)\,\frac{-i}{(p' - p)^2 + m_\varphi^2 - i\epsilon}\,\bar u(k')u(k) \ - \ \bar u(k')u(p)\,\frac{-i}{(k' - p)^2 + m_\varphi^2 - i\epsilon}\,\bar u(p')u(k) \Big] \tag{13.2} \]
The relative minus sign between the two terms is due to the T-ordering definition for the fermions. The overall sign of the amplitude is determined by the convention adopted for ordering the initial/final state labels for the fermions. See [12, 13]. Notice how the fermion arrows are followed.

Note: If we had anti-fermion scattering, then all the fermion arrows would be reversed and their momenta would be denoted as minus the previous momenta [footnote 12]. Apart from the reversal of fermion arrows, the $u(p), \bar u(p')$ spinors go to $v(p'), \bar v(p)$ spinors.
If it is a fermion-anti-fermion scattering, then the second exchange diagram will be absent.

Footnote 12: Thus, we may adopt a convention that the diagram displays the fermion arrow and the momentum is also in the same direction. For a fermion the momentum is $p$ and for an anti-fermion it is $-p$.

A specific scattering arrangement may permit the initial and the final state fermions to be distinguishable. Then only one of the two scattering diagrams will contribute.

Note: The reduction formula for fermions gave a factor of $\sqrt{2m}$-type normalization for the wavefunctions (a factor of $2m$ per bilinear), since we had normalized the spinors as $\bar u(p,\sigma)u(p,\sigma') = \delta_{\sigma,\sigma'}$. With the Feynman rules we have adopted in the table, the wavefunctions have only the spinors. This is equivalent to using the normalization $\bar u(p,\sigma)u(p,\sigma') = 2m\,\delta_{\sigma,\sigma'} = -\bar v(p,\sigma)v(p,\sigma')$, where $m$ is of course the fermion mass. With this noted, the amplitude becomes,

$$ i\mathcal{M} = (ig)^2(-i)(2m)^2\left[\frac{\delta_{\sigma_p,\sigma_{p'}}\,\delta_{\sigma_k,\sigma_{k'}}}{(p'-p)^2+m_\varphi^2-i\epsilon} - \frac{\delta_{\sigma_p,\sigma_{k'}}\,\delta_{\sigma_{p'},\sigma_k}}{(k'-p)^2+m_\varphi^2-i\epsilon}\right] \tag{13.3} $$

Noting that the scalar field momentum is just the momentum transfer in both the diagrams, we denote it by $q$. In the non-relativistic limit the energy transfer is negligible, so $q^2 = -(q^0)^2 + |\vec q|^2 = |\vec q|^2$ in both cases. We thus write the amplitude more conveniently as,

$$ i\mathcal{M} = \frac{+i g^2\,4m^2}{|\vec q|^2+m_\varphi^2-i\epsilon}\left\{\delta_{\sigma_p,\sigma_{p'}}\,\delta_{\sigma_k,\sigma_{k'}} - \delta_{\sigma_p,\sigma_{k'}}\,\delta_{\sigma_{p'},\sigma_k}\right\}. $$

As noted above, only one of the two terms will contribute if the fermions are distinguishable.

In the non-relativistic scattering theory, we have $\langle\vec k'|S|\vec k\rangle := \langle\vec k'|\vec k\rangle - 2\pi i\,\delta(E_{k'}-E_k)\,\langle\vec k'|T|\vec k\rangle$ with, to leading order, $\langle\vec k'|T|\vec k\rangle = \langle\vec k'|H'|\vec k\rangle = \langle\vec k'|V|\vec k\rangle = V(\vec q = \vec k'-\vec k)$. We can read off the potential as,

$$ V(|\vec q|) = -\frac{4m^2 g^2\,(\delta\delta-\delta\delta)}{|\vec q|^2+m_\varphi^2} := -\frac{g_{eff}^2}{\vec q^{\,2}+m_\varphi^2}. \tag{13.4} $$

The inverse Fourier transform gives,

$$ V(\vec x) = -g_{eff}^2\int\frac{d^3q}{(2\pi)^3}\,\frac{e^{i\vec q\cdot\vec x}}{\vec q^{\,2}+m_\varphi^2} = -\frac{g_{eff}^2}{(2\pi)^2}\int_0^\infty dq\,\frac{q^2}{q^2+m_\varphi^2}\int_{-1}^{1}d(\cos\theta)\,e^{iqr\cos\theta} $$
$$ \phantom{V(\vec x)} = -\frac{g_{eff}^2}{4\pi^2\,(ir)}\int_{-\infty}^{\infty}dq\,\frac{q\,e^{iqr}}{q^2+m_\varphi^2} \qquad \text{(contour integrate to pick up } q = i m_\varphi\text{)}, $$
$$ \therefore\quad V(r) = -\frac{g_{eff}^2}{4\pi}\,\frac{e^{-m_\varphi r}}{r} \qquad \text{(Yukawa Potential)} \tag{13.5} $$

This is a simple illustration of how an effective potential can be inferred from the underlying relativistically specified interaction. Several remarks are in order.

Remark: The most important point is the sign of the potential, which renders it attractive. But is the sign unambiguous? We have already noted that the overall sign of the amplitude is determined by the convention adopted for ordering of the fermion labels in the in/out states, while the relative sign is due to the Pauli principle. When the fermions are distinguishable, either of the two diagrams may be chosen; there is no preference. But the diagrams contribute opposite signs!

The convention arises as follows [13]. We have taken $|p,k\rangle \sim b^\dagger_p b^\dagger_k|0\rangle$ (the sequence of creation operators follows the label order in the in-state) and $\langle p',k'| \sim \langle 0|b_{k'}b_{p'}$, which is opposite to that of the in-state but consistent with Hermitian conjugation. Hence,

$$ \langle p',k'|p,k\rangle \sim \langle 0|b_{k'}b_{p'}b^\dagger_p b^\dagger_k|0\rangle = \langle 0|b_{k'}b^\dagger_k|0\rangle\,\delta^3(p'-p) - \langle 0|b_{k'}b^\dagger_p b_{p'}b^\dagger_k|0\rangle \approx \delta^3(k'-k)\,\delta^3(p'-p) - \delta^3(p'-k)\,\delta^3(k'-p). $$

Actually there are $\bar\Psi\Psi$ fields in between, but being bilinears they do not change the relative sign. This explains the overall sign.

In a non-relativistic comparison, $|\vec q| \ll m_\varphi$, the small momentum transfer means large spatial separation and hence the fermions are distinguishable. If so, there is no Pauli principle or anti-commutation and no relative sign either! So either of the two diagrams will give the same answer. Ideally, we should consider one of the fermions to be very massive to mimic a source of potential (which means the $\vec q \to \vec 0$ limit) and match all the factors for a precise comparison.
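As a numerical cross-check of (13.4)-(13.5), the sketch below transforms the position-space Yukawa potential back to momentum space and compares it with $-g_{eff}^2/(q^2+m_\varphi^2)$. It is only an illustration: the function names, the cutoff `r_max`, the step count and the sample values are assumptions made here, and the radial integral is done with a plain composite Simpson rule rather than by contour integration.

```python
import math


def yukawa_potential(r, m_phi, g_eff):
    """Position-space Yukawa potential, V(r) = -(g_eff^2 / 4 pi) exp(-m_phi r) / r."""
    return -(g_eff ** 2) * math.exp(-m_phi * r) / (4.0 * math.pi * r)


def momentum_space_potential(q, m_phi, g_eff, r_max=60.0, steps=200_000):
    """Fourier transform of V(r) for a spherically symmetric potential:
    V(q) = (4 pi / q) * int_0^inf dr r sin(q r) V(r),
    evaluated with composite Simpson on [0, r_max] (steps must be even).
    r_max is a hypothetical cutoff; exp(-m_phi r) makes the tail negligible."""
    h = r_max / steps

    def integrand(r):
        if r == 0.0:
            return 0.0  # r * V(r) * sin(q r) vanishes as r -> 0
        return r * math.sin(q * r) * yukawa_potential(r, m_phi, g_eff)

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return (4.0 * math.pi / q) * total * h / 3.0


def expected_momentum_space(q, m_phi, g_eff):
    """Right-hand side of (13.4): -g_eff^2 / (q^2 + m_phi^2)."""
    return -(g_eff ** 2) / (q * q + m_phi * m_phi)


if __name__ == "__main__":
    for q in (0.5, 2.0):
        print(q, momentum_space_potential(q, 1.0, 1.3), expected_momentum_space(q, 1.0, 1.3))
```

The two printed columns should agree to many digits, which confirms both the overall minus sign (attraction) and the $1/(4\pi)$ normalization of (13.5).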
The effective coupling contains the Kronecker deltas enforcing conservation of the spin projections. Hence, the Yukawa potential preserves the spin projection. For matching with the non-relativistic normalization we have to take $2m_{fermion}\,\delta_{\sigma_p,\sigma_{p'}} \to 1$. This is needed for inferring the strength of the Yukawa potential.

Remark: Suppose we consider fermion-anti-fermion scattering. Then one of the Kronecker deltas will come with a minus sign. However, in this case $b^\dagger \to d^\dagger$, and doing the contractions with the $\Psi$'s from the Yukawa coupling we pick up another minus sign. Hence the overall sign does not change, and the Yukawa potential between a fermion and an anti-fermion is also attractive. The same holds for anti-fermion scattering.

Thus, the inferred Yukawa potential is always attractive. Its underlying QFT description is an exchange of a (virtual) massive scalar particle of mass $m_\varphi$. Yukawa proposed this potential as a model for the binding of nucleons and, working with the properties of nucleons, estimated the scalar mass to be about 200 MeV, which is close to the mass of the pion. Of course pions come in three varieties, $\pi^\pm, \pi^0$, while we have taken only a single scalar.

Consider fermion scattering in QED, by replacing the Yukawa coupling by the QED coupling. The relevant diagrams are:

[Diagrams: one-photon exchange between fermions with incoming momenta $p, k$ and outgoing momenta $p', k'$ (exchanged momentum $p'-p$), minus the diagram with the outgoing fermions interchanged (exchanged momentum $k'-p$).]

The corresponding expression is,

$$ i\mathcal{M} = (ie)^2\left[\bar u(p')\gamma^\mu u(p)\,\frac{-i\eta_{\mu\nu}}{(p'-p)^2-i\epsilon}\,\bar u(k')\gamma^\nu u(k) \;-\; \bar u(k')\gamma^\mu u(p)\,\frac{-i\eta_{\mu\nu}}{(k'-p)^2-i\epsilon}\,\bar u(p')\gamma^\nu u(k)\right], \tag{13.6} $$

with the same relative minus sign. As before, consider the non-relativistic limit with distinguishable fermions and keep only the first diagram. The denominator simplifies the same way: $(p'-p)^2 - i\epsilon \to |\vec q|^2 - i\epsilon$. In the numerator we have $[\bar u(p',\sigma')\gamma^\mu u(p,\sigma)]\,[\bar u(k',\rho')\gamma_\mu u(k,\rho)]$.

Claim: In the NR limit, only the $\mu = 0$ component survives: $\bar u(p',\sigma')\gamma^\mu u(p,\sigma) \to \bar u\gamma^0 u$.

Proof: The spinor $u$ satisfies $(\slashed{p}+m)u = 0$. Consider, say, the Dirac representation of the gamma matrices.
Then in terms of the two component notation we have,

$$ \begin{pmatrix} -\omega_{\vec p}+m & \vec p\cdot\vec\sigma \\ -\vec p\cdot\vec\sigma & \omega_{\vec p}+m \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = 0 \;\Rightarrow\; \vec p\cdot\vec\sigma\,u_1 = (\omega_{\vec p}+m)\,u_2\,,\quad \vec p\cdot\vec\sigma\,u_2 = (\omega_{\vec p}-m)\,u_1 . $$

The NR limit, $\omega_{\vec p} \approx m + \frac{\vec p^{\,2}}{2m}$, then implies $u_2 \sim o(|p|/m)\,u_1$. Also, in the two component notation, we have

$$ \bar u'\gamma^i u = \big((u_1')^\dagger, (u_2')^\dagger\big)\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = (u_1')^\dagger\sigma^i u_2 + (u_2')^\dagger\sigma^i u_1 \sim o(p/m)\,; $$
$$ \bar u'\gamma^0 u = \big((u_1')^\dagger, (u_2')^\dagger\big)\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = (u_1')^\dagger u_1 + (u_2')^\dagger u_2 \sim o(1) + o(|p|^2/m^2). $$

This proves the claim.

Hence the numerator approximates to $[\bar u(\vec p',\sigma')\gamma^0 u(\vec p,\sigma)]\,[\bar u(\vec k',\rho')\gamma^0 u(\vec k,\rho)]$ and we get,

$$ i\mathcal{M} \approx i e^2\,\eta_{00}\,\frac{[u^\dagger(\vec p',\sigma')u(\vec p,\sigma)]\,[u^\dagger(\vec k',\rho')u(\vec k,\rho)]}{\vec q^{\,2}-i\epsilon}. \tag{13.7} $$

Since the square brackets are positive ($p' \approx p$, $k' \approx k$ for small momentum transfer) and $\eta_{00} < 0$, we have the opposite sign relative to the Yukawa matrix element. Hence the effective potential from the QED coupling is repulsive for fermion scattering. The form of the potential itself can be obtained from the Yukawa one by taking $m_\varphi \to 0$ and, as expected, we get the Coulomb potential,

$$ V(r) = \frac{(e')^2}{4\pi r}. $$

We have absorbed the normalization factors in $e'$. By redefining the original $e$, we can take $e' \to e$, the measured electric charge (in the natural units). The constant $\alpha := \frac{e^2}{4\pi} \approx \frac{1}{137}$ is called the fine structure constant.

If we consider fermion-anti-fermion scattering via the QED coupling, the replacement $u \to v$ does not introduce a negative sign since the relevant bilinear is $u^\dagger u$ and not $\bar u u$. The replacement $b^\dagger \to d^\dagger$ introduces a minus sign as before and hence the overall sign changes. Hence, for fermion-anti-fermion scattering the potential is attractive. For anti-fermion scattering the potential is of course repulsive.

Note: It seems that the sign depends on $\eta_{00}$, which is convention dependent. However, the signs in the propagator also change accordingly, and the sign of the potential is independent of the metric signature.
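The suppression of the spatial components of the current in the NR limit can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the lower row of the matrix equation $(\slashed{p}+m)u=0$ in the form $u_2 = \vec p\cdot\vec\sigma\,u_1/(\omega_{\vec p}+m)$, uses equal incoming and outgoing momentum and spin, and the helper names (`matvec`, `inner`, `current`) and sample values are hypothetical. For this special case the ratio $\bar u\gamma^i u/\bar u\gamma^0 u$ works out to $p^i/\omega_{\vec p}$, the velocity, which is $o(|p|/m)$.

```python
import math

# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]


def matvec(mat, vec):
    """Multiply a 2x2 matrix with a 2-component vector."""
    return [sum(mat[i][j] * vec[j] for j in range(2)) for i in range(2)]


def inner(a, b):
    """Hermitian inner product a^dagger b of two 2-component vectors."""
    return sum(complex(x).conjugate() * complex(y) for x, y in zip(a, b))


def p_dot_sigma(p, vec):
    """Apply (p . sigma) to a 2-component vector."""
    out = [0j, 0j]
    for i in range(3):
        sv = matvec(SIGMA[i], vec)
        out = [out[0] + p[i] * sv[0], out[1] + p[i] * sv[1]]
    return out


def current(p, m, u1):
    """Return (ubar gamma^0 u, [ubar gamma^i u], omega) for identical in/out spinors.
    Lower components follow from u2 = (p . sigma) u1 / (omega + m)."""
    omega = math.sqrt(sum(c * c for c in p) + m * m)
    u2 = [c / (omega + m) for c in p_dot_sigma(p, u1)]
    j0 = inner(u1, u1) + inner(u2, u2)                     # u1^dag u1 + u2^dag u2
    ji = [inner(u1, matvec(SIGMA[i], u2)) + inner(u2, matvec(SIGMA[i], u1))
          for i in range(3)]                               # u1^dag s^i u2 + u2^dag s^i u1
    return j0, ji, omega


if __name__ == "__main__":
    p, m = (0.01, 0.02, -0.015), 1.0
    j0, ji, omega = current(p, m, [1 + 0j, 0.5j])
    print([(ji[i] / j0).real for i in range(3)], [c / omega for c in p])
```

With $|p|/m \sim 10^{-2}$ the spatial components are smaller than the time component by the same factor, so only $\gamma^0$ survives at leading order.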
Note: For a tensorial interaction like gravity, helicity 2, we will have two factors of $\eta_{00}$ and the overall sign will not change. The Newtonian potential will be attractive and will be so for fermion-anti-fermion scattering as well. This argument is to be taken as heuristic, since we need to be explicit about the gravitational coupling to infer the numerator factors of the spinors, which are sensitive to the $u \to v$ changes.

These examples indicate that the exchange of quanta can be interpreted as giving an equivalent potential (and hence force) in the non-relativistic limit (where the concept of a potential is meaningful). The qualitative property of being attractive or repulsive is already encoded in the Feynman rules. The comparison with the effective potential also allows the coupling parameters to be determined experimentally.

14. BASIC QED PROCESSES

The basic processes of quantum electrodynamics are the scattering of two charged particles, the scattering of light by a charged particle, particle-anti-particle annihilation and pair creation. These are analyzed as: (i) electron-muon scattering, (ii) electron-positron scattering as well as production of muon-anti-muon pairs, (iii) electron-photon scattering (Compton scattering) and (iv) electron-positron annihilation into two photons. We will evaluate these in the leading $e^2$ approximation. We will also compute the cross-sections for comparison with experiments. These cross-sections will be for un-polarized particles in the initial state, for which we average the cross-section over the initial spins/polarizations. We will also not detect the spins/polarizations of the final particles and hence sum the cross-section over the final spins/polarizations.

Here is the list of the processes with their diagrams and the corresponding invariant amplitudes.
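$e^-(p)\,\mu^-(k) \to e^-(p')\,\mu^-(k')$ (with $\mu$ replaced by $e$ it is Moller scattering):

[Diagram: the electron line $p \to p'$ and the muon line $k \to k'$ joined by a photon of momentum $q$.]

$$ i\mathcal{M} = [\bar u(p')(ie\gamma^\mu)u(p)]\left[\frac{-i\eta_{\mu\nu}}{q^2-i\epsilon}\right][\bar u(k')(ie\gamma^\nu)u(k)] $$

$e^+(p')\,e^-(p) \to \mu^+(k')\,\mu^-(k)$ (with $\mu$ replaced by $e$ it is Bhabha scattering):

[Diagram: the incoming $e^-(p)$ and $e^+(p')$ annihilate into a photon of momentum $q$, which produces the outgoing $\mu^-(k)$ and $\mu^+(k')$.]

$$ i\mathcal{M} = [\bar v(p')(ie\gamma^\mu)u(p)]\left[\frac{-i\eta_{\mu\nu}}{q^2-i\epsilon}\right][\bar u(k)(ie\gamma^\nu)v(k')] $$

In the next two processes there are two diagrams contributing, obtained by exchanging the initial and final state photons in the Compton scattering and the two final state photons in the annihilation process. Although the diagrams may not look 'crossed', they are!

$e^-(p)\,\gamma(k) \to e^-(p')\,\gamma(k')$ (Compton scattering):

[Diagrams: (a) the photon $\gamma(k)$ is absorbed first, the internal electron carries $p+k$, then $\gamma(k')$ is emitted; (b) $\gamma(k')$ is emitted first, the internal electron carries $p'-k$, then $\gamma(k)$ is absorbed.]

$$ i\mathcal{M}_a = \varepsilon_\nu(\vec k',\lambda')\,\bar u(p')(ie\gamma^\nu)\,\frac{-i\big(-(\slashed{p}+\slashed{k})+m\big)}{(p+k)^2+m_e^2-i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon^*_\mu(\vec k,\lambda) $$
$$ i\mathcal{M}_b = \varepsilon^*_\nu(\vec k,\lambda)\,\bar u(p')(ie\gamma^\nu)\,\frac{-i\big(-(\slashed{p}'-\slashed{k})+m\big)}{(p-k')^2+m_e^2-i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k',\lambda') $$

$e^+(p')\,e^-(p) \to \gamma(k)\,\gamma(k')$ (annihilation process):

[Diagrams: (a) the electron emits $\gamma(k)$, the internal electron carries $p-k$, and annihilates with the positron into $\gamma(k')$; (b) the same with the two photons interchanged.]

$$ i\mathcal{M}_a = \varepsilon_\nu(\vec k',\lambda')\,\bar v(p')(ie\gamma^\nu)\,\frac{-i\big(-(\slashed{p}-\slashed{k})+m\big)}{(p-k)^2+m_e^2-i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k,\lambda) $$
$$ i\mathcal{M}_b = \varepsilon_\nu(\vec k,\lambda)\,\bar v(p')(ie\gamma^\nu)\,\frac{-i\big(-(\slashed{k}-\slashed{p}')+m\big)}{(p-k')^2+m_e^2-i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k',\lambda') $$

Since we will be computing un-polarized (average over initial spins/polarizations), inclusive (sum over final spins/polarizations) cross-sections, we have to evaluate $\frac12\cdot\frac12\sum_{\sigma,\sigma',\rho,\rho'}|\mathcal{M}|^2$ for the fermion scattering and likewise for the other cases. For the Compton scattering and pair annihilation, where two diagrams contribute, we need to add them before taking the mod-square.

A. Electron-muon processes:

Consider the electron-muon scattering. We have,

$$ (i\mathcal{M})(-i\mathcal{M}^*) = \frac{e^4}{(q^2-i\epsilon)(q^2+i\epsilon)}\,\big\{(\bar u(p')\gamma^\mu u(p))\,(u^\dagger(p)(\gamma^\nu)^\dagger(\gamma^0)^\dagger u(p'))\big\}\times\big\{(\bar u(k')\gamma_\mu u(k))\,(u^\dagger(k)(\gamma_\nu)^\dagger(\gamma^0)^\dagger u(k'))\big\} $$

The spins $\sigma, \sigma', \rho, \rho'$ are implicit in the above expression.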
Summing over all of these replaces each $u\bar u$ using the completeness relation. Each of the braces can then be expressed as a trace over gamma matrices. With the new normalizations used, the completeness relations take the form,

$$ \sum_\sigma u(p,\sigma)\bar u(p,\sigma) = -\slashed{p}+m\,,\qquad \sum_\sigma v(p,\sigma)\bar v(p,\sigma) = -\slashed{p}-m\,. \tag{14.1} $$

Thus the electron and muon spin sums become,

$$ \{\dots\}_{p,p'} = \mathrm{Tr}[\gamma^\mu(-\slashed{p}+m_e)\gamma^\nu(-\slashed{p}'+m_e)]\,,\qquad \{\dots\}_{k,k'} = \mathrm{Tr}[\gamma_\mu(-\slashed{k}+m_\mu)\gamma_\nu(-\slashed{k}'+m_\mu)]\,. $$

This leads to,

$$ \frac14\sum_{\sigma,\sigma',\rho,\rho'}|\mathcal{M}|^2_{e\mu\to e\mu} = \frac14\,\frac{e^4}{(q^2-i\epsilon)(q^2+i\epsilon)}\,\mathrm{Tr}[\gamma^\mu(-\slashed{p}+m_e)\gamma^\nu(-\slashed{p}'+m_e)]\times\mathrm{Tr}[\gamma_\mu(-\slashed{k}+m_\mu)\gamma_\nu(-\slashed{k}'+m_\mu)] \tag{14.2} $$

For $e^+e^-\to\mu^+\mu^-$ we will have,

$$ (i\mathcal{M})(-i\mathcal{M}^*) = \frac{e^4}{(q^2-i\epsilon)(q^2+i\epsilon)}\,\big\{(\bar v(p')\gamma^\mu u(p))\,(u^\dagger(p)(\gamma^\nu)^\dagger(\gamma^0)^\dagger v(p'))\big\}\times\big\{(\bar u(k)\gamma_\mu v(k'))\,(v^\dagger(k')(\gamma_\nu)^\dagger(\gamma^0)^\dagger u(k))\big\} $$

This is the same as before except that the $p', k'$ spinors are $v$ spinors. Using the corresponding completeness relation we get,

$$ \frac14\sum_{\sigma,\sigma',\rho,\rho'}|\mathcal{M}|^2_{e^+e^-\to\mu^+\mu^-} = \frac14\,\frac{e^4}{(q^2-i\epsilon)(q^2+i\epsilon)}\,\mathrm{Tr}[\gamma^\mu(-\slashed{p}+m_e)\gamma^\nu(-\slashed{p}'-m_e)]\times\mathrm{Tr}[\gamma_\mu(-\slashed{k}+m_\mu)\gamma_\nu(-\slashed{k}'-m_\mu)] \tag{14.3} $$

The Compton scattering and annihilation processes have two diagrams each, and each of these has a single Dirac trace. But each diagram also has photon polarization completeness relations. Incidentally, for the Moller and Bhabha processes there are 'crossed diagrams' too. We will do these later.

To proceed with the two processes above, we need the traces of 2 and 4 gamma matrices. Here are the relevant formulae.

$$ \mathrm{Tr}\,\mathbb{1} = 4\,; \tag{14.4} $$
$$ \mathrm{Tr}\,\gamma^\mu\gamma^\nu = -4\eta^{\mu\nu}\,; \tag{14.5} $$
$$ \mathrm{Tr}\,\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta = 4\left(\eta^{\mu\nu}\eta^{\alpha\beta} - \eta^{\mu\alpha}\eta^{\nu\beta} + \eta^{\mu\beta}\eta^{\nu\alpha}\right) \tag{14.6} $$

Using these, we get

$$ \mathrm{Tr}[\gamma^\mu(-\slashed{p}+m_e)\gamma^\nu(-\slashed{p}'+m_e)] = p_\alpha p'_\beta\,\mathrm{Tr}[\gamma^\mu\gamma^\alpha\gamma^\nu\gamma^\beta] + m_e^2\,\mathrm{Tr}[\gamma^\mu\gamma^\nu] = 4\left(p^\mu p'^\nu - \eta^{\mu\nu}\,p\cdot p' + p^\nu p'^\mu\right) - 4m_e^2\,\eta^{\mu\nu} $$

and,

$$ \mathrm{Tr}[\gamma_\mu(-\slashed{k}+m_\mu)\gamma_\nu(-\slashed{k}'+m_\mu)] = k_\alpha k'_\beta\,\mathrm{Tr}[\gamma_\mu\gamma^\alpha\gamma_\nu\gamma^\beta] + m_\mu^2\,\mathrm{Tr}[\gamma_\mu\gamma_\nu] = 4\left(k_\mu k'_\nu - \eta_{\mu\nu}\,k\cdot k' + k_\nu k'_\mu\right) - 4m_\mu^2\,\eta_{\mu\nu}\,. $$

Dotting the traces gives,

$$ \mathrm{Tr}[\dots]\,\mathrm{Tr}[\dots] = 32\left[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) + m_\mu^2\,p\cdot p' + m_e^2\,k\cdot k' + 2m_e^2m_\mu^2\right] $$
$$ \therefore\quad \frac14\sum_{spins}|\mathcal{M}|^2_{e\mu\to e\mu} = \frac{8e^4}{q^4}\left[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) + m_\mu^2\,p\cdot p' + m_e^2\,k\cdot k' + 2m_e^2m_\mu^2\right] \tag{14.7} $$

where $q = p'-p$. For the $e^+e^-\to\mu^+\mu^-$ process, the only difference is that the terms linear in $m_e^2$ and $m_\mu^2$ change sign. Thus we have,

$$ \frac14\sum_{spins}|\mathcal{M}|^2_{e^+e^-\to\mu^+\mu^-} = \frac{8e^4}{q^4}\left[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) - m_\mu^2\,p\cdot p' - m_e^2\,k\cdot k' + 2m_e^2m_\mu^2\right] \tag{14.8} $$

where $q = p+p'$.

To relate to the cross-section, we can use the center of mass frame. For the electron-muon scattering, we may choose

$$ p^\mu = (E_e, p\hat z)\,,\quad k^\mu = (E_\mu, -p\hat z)\,,\quad p'^\mu = (E'_e, \vec p')\,,\quad k'^\mu = (E'_\mu, -\vec p')\,,\quad \cos(\Theta_{cm}) := \hat z\cdot\hat p'\,. $$

Notice that $E_e + E_\mu = \sqrt{p^2+m_e^2} + \sqrt{p^2+m_\mu^2} = E'_e + E'_\mu = \sqrt{(p')^2+m_e^2} + \sqrt{(p')^2+m_\mu^2}$ implies that $|p'| = |p|$. The magnitude of the center of mass momentum and the scattering angle are the only two independent parameters, given the electron and muon masses.

For $e^+e^-\to\mu^+\mu^-$, a good deal of convenience ensues. In each of the initial and final states the two particles have equal masses and equal magnitudes of momenta, so the two particles in each state have equal energies, and total energy conservation implies that the initial and final energies are equal too. The initial and final momentum magnitudes are then simply related by the masses. Thus we may choose,

$$ p^\mu = (E, p\hat z)\,,\quad (p')^\mu = (E, -p\hat z)\,,\quad k^\mu = (E, \vec k)\,,\quad (k')^\mu = (E, -\vec k)\,,\quad \cos(\Theta_{cm}) := \hat z\cdot\hat k\,. $$

Again, the center of mass energy and the scattering angle are the only independent parameters, given the masses.

It remains to simplify the expression for the cross-sections. We will give it for the muon production process and leave the $e\mu$ scattering process as an exercise. We will also neglect the electron mass compared to the muon mass.

The dot products of momenta in the center of mass frame take the form, $q^2 = (p+p')^2 = -(2E)^2 = -4(p^2+m_e^2) \approx -4p^2$.
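Next, $p\cdot p' = -E^2 - p^2 \approx -2E^2$ and, writing $k := |\vec k|$,

$$ p\cdot k = p'\cdot k' = -E^2 + Ek\cos(\Theta_{cm})\,,\qquad p\cdot k' = p'\cdot k = -E^2 - Ek\cos(\Theta_{cm})\,. $$

Substituting in the invariant amplitude gives (with $m_e = 0$ set),

$$ \frac14\sum_{spins}|\mathcal{M}|^2_{e^+e^-\to\mu^+\mu^-} = \frac{8e^4}{(-4E^2)^2}\left[(E^2+Ek\cos\Theta_{cm})^2 + (E^2-Ek\cos\Theta_{cm})^2 - m_\mu^2(-2E^2)\right] $$
$$ = e^4\left[1 + \frac{k^2\cos^2(\Theta_{cm})}{E^2} + \frac{m_\mu^2}{E^2}\right] = e^4\left[1 + \frac{m_\mu^2}{E^2} + \left(1-\frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] \tag{14.9} $$

The differential cross-section is given by (11.21),

$$ \frac{d\sigma}{d\Omega_{cm}} = \frac{1}{64\pi^2 s}\,\frac{k'_{cm}}{k_{cm}}\left(\frac14\sum_{spins}|\mathcal{M}|^2\right)\quad\text{where } s = 4E^2,\; m_e = 0,\; m'_1 = m'_2 = m_\mu \;\Rightarrow $$
$$ = \frac{1}{64\pi^2 s}\,\frac{\sqrt{s^2-4sm_\mu^2}}{s}\;e^4\left[1 + \frac{m_\mu^2}{E^2} + \left(1-\frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] $$
$$ = \left(\frac{e^2}{4\pi}\right)^2\frac{1}{4\cdot 4E^2}\sqrt{1-\frac{m_\mu^2}{E^2}}\left[1 + \frac{m_\mu^2}{E^2} + \left(1-\frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] \tag{14.10} $$

Putting $\alpha := \frac{e^2}{4\pi}$, the fine structure constant, and $E_{cm} = 2E$, we write the unpolarized $e^+e^-\to\mu^+\mu^-$ cross-section as,

$$ \frac{d\sigma}{d\Omega_{cm}}(E,\Theta_{cm}) = \frac{\alpha^2}{4E_{cm}^2}\sqrt{1-\frac{m_\mu^2}{E^2}}\left[1 + \frac{m_\mu^2}{E^2} + \left(1-\frac{m_\mu^2}{E^2}\right)\cos^2(\Theta_{cm})\right] \tag{14.11} $$
$$ \sigma_{total} := \int d\Omega\,\frac{d\sigma}{d\Omega} = \frac{4\pi\alpha^2}{3E_{cm}^2}\sqrt{1-\frac{m_\mu^2}{E^2}}\left[1 + \frac{m_\mu^2}{2E^2}\right] \tag{14.12} $$

The prefactor in (14.12) follows from (14.11) on using $\int d\Omega = 4\pi$ and $\int d\Omega\,\cos^2\Theta = 4\pi/3$.

Remarks: The square root factor shows that $E \ge m_\mu$ must hold for the muon production to proceed. In the ultra-relativistic limit, $E \gg m_\mu$, the differential cross-section has the characteristic $(1+\cos^2(\Theta))$ angular dependence. Since the cross-section is a measured quantity and the center of mass energy is under our control, by comparing the pair production cross-sections for two different final states, e.g. $\mu, \tau$, and taking the ratio, we can obtain bounds on the masses of the heavier leptons.

Exercises: Obtain the differential and the total cross-section for the $e\mu\to e\mu$ process. Also for the Moller and the Bhabha processes.

B. Compton Scattering:

Recall the total amplitude from the two diagrams (we take the photon polarizations to be real),

$$ i\mathcal{M} = \varepsilon_\nu(\vec k',\lambda')\,\bar u(p')(ie\gamma^\nu)\,\frac{-i\big(-(\slashed{p}+\slashed{k})+m\big)}{(p+k)^2+m_e^2-i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k,\lambda) + \varepsilon_\nu(\vec k,\lambda)\,\bar u(p')(ie\gamma^\nu)\,\frac{-i\big(-(\slashed{p}'-\slashed{k})+m\big)}{(p'-k)^2+m_e^2-i\epsilon}\,(ie\gamma^\mu)u(p)\,\varepsilon_\mu(\vec k',\lambda') \tag{14.13} $$
$$ \mathcal{M} = e^2\,\varepsilon_\mu(\vec k,\lambda)\,\varepsilon_\nu(\vec k',\lambda')\,\bar u(p')\left[\frac{\gamma^\nu\big(-(\slashed{p}+\slashed{k})+m\big)\gamma^\mu}{(p+k)^2+m_e^2-i\epsilon} + \frac{\gamma^\mu\big(-(\slashed{p}'-\slashed{k})+m\big)\gamma^\nu}{(p'-k)^2+m_e^2-i\epsilon}\right]u(p) \tag{14.14} $$

In the second (crossed diagram) term, we have used $p-k' = p'-k$, which is more convenient.

Simplification: Since the momenta are on shell, $p^2 = -m^2$, $k^2 = 0 = (k')^2$, we get

$$ (p+k)^2+m^2 = p^2+m^2+k^2+2p\cdot k = 2p\cdot k\,,\qquad (p'-k)^2+m^2 = -2p'\cdot k\,; $$
$$ (-\slashed{p}+m)\gamma^\mu u(p) = \big(2p^\mu + \gamma^\mu(\slashed{p}+m)\big)u(p) = 2p^\mu u(p)\,, $$
$$ \bar u(p')\gamma^\mu(-\slashed{p}'+m) = \bar u(p')\big(2(p')^\mu + (\slashed{p}'+m)\gamma^\mu\big) = 2(p')^\mu\,\bar u(p')\,. $$

$$ \mathcal{M} = e^2\,\varepsilon_\mu(\vec k,\lambda)\,\varepsilon_\nu(\vec k',\lambda')\,\bar u(p')\left[\frac{\gamma^\nu(2p^\mu-\slashed{k}\gamma^\mu)}{2p\cdot k-i\epsilon} + \frac{(2(p')^\mu+\gamma^\mu\slashed{k})\gamma^\nu}{-2p'\cdot k-i\epsilon}\right]u(p) \tag{14.15} $$
$$ \mathcal{M}^\dagger = e^2\,\varepsilon_\alpha(\vec k,\lambda)\,\varepsilon_\beta(\vec k',\lambda')\,\bar u(p)\left[\frac{(2p^\alpha-\gamma^\alpha\slashed{k})\gamma^\beta}{2p\cdot k+i\epsilon} + \frac{\gamma^\beta(2(p')^\alpha+\slashed{k}\gamma^\alpha)}{-2p'\cdot k+i\epsilon}\right]u(p') \tag{14.16} $$

In the second equation, we have used $[\bar u(p')\Gamma u(p)]^* = \bar u(p)\,\gamma^0\Gamma^\dagger\gamma^0\,u(p') = \bar u(p)\,\Gamma_{reversed}\,u(p')$. Here $\Gamma_{reversed}$ has the order of the gamma matrices reversed due to the $\dagger$ operation.

Summing and averaging $|\mathcal{M}|^2$ over the spins and polarizations, we get,

$$ \frac14\sum_{\sigma,\sigma',\lambda,\lambda'}|\mathcal{M}|^2 = \frac{e^4}{4}\,\mathrm{Tr}\Bigg[\left\{\frac{\gamma^\nu(2p^\mu-\slashed{k}\gamma^\mu)}{2p\cdot k-i\epsilon} + \frac{(2(p')^\mu+\gamma^\mu\slashed{k})\gamma^\nu}{-2p'\cdot k-i\epsilon}\right\}(-\slashed{p}+m)\left\{\frac{(2p^\alpha-\gamma^\alpha\slashed{k})\gamma^\beta}{2p\cdot k+i\epsilon} + \frac{\gamma^\beta(2(p')^\alpha+\slashed{k}\gamma^\alpha)}{-2p'\cdot k+i\epsilon}\right\}(-\slashed{p}'+m)\Bigg] $$
$$ \times\left(\sum_\lambda\varepsilon_\mu(\vec k,\lambda)\varepsilon_\alpha(\vec k,\lambda)\right)\left(\sum_{\lambda'}\varepsilon_\nu(\vec k',\lambda')\varepsilon_\beta(\vec k',\lambda')\right) \tag{14.17} $$

We have used the completeness relation (14.1) to convert the spin sum. We have a trace over strings of $\gamma$ matrices, but now we also have sums over the photon polarizations. An explicit expression can be obtained by introducing an explicit set of 4 orthonormal vectors. Consider the polarization sum.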
Given a $k$ with $k^2 = 0$, we can write $k^\mu = (|\vec k|, \vec k)$. Introduce another vector $\tilde k := C(|\vec k|, -\vec k)$ so that $\tilde k^2 = 0$. Fix $C$ by demanding $k\cdot\tilde k = -2 \leftrightarrow C = 1/|\vec k|^2$. We have two transverse directions and we take the polarizations $\varepsilon(\vec k,\lambda)$ along these two directions, mutually orthonormalized. These are space-like vectors and have no time component. The 4 vectors $k, \tilde k, \varepsilon(\vec k,1), \varepsilon(\vec k,2)$ are independent and the completeness relation takes the form,

$$ -\frac{k^\mu\tilde k^\nu + k^\nu\tilde k^\mu}{2} + \varepsilon^\mu(\vec k,1)\varepsilon^\nu(\vec k,1) + \varepsilon^\mu(\vec k,2)\varepsilon^\nu(\vec k,2) = \eta^{\mu\nu}. $$

Thus we get the sum over photon polarizations as,

$$ \sum_{\lambda=1,2}\varepsilon_\mu(\vec k,\lambda)\varepsilon_\nu(\vec k,\lambda) = \eta_{\mu\nu} + \frac{k_\mu\tilde k_\nu + k_\nu\tilde k_\mu}{2} \tag{14.18} $$

Claim: The terms containing $k, \tilde k$ in the polarization sum do not contribute to $\sum|\mathcal{M}|^2$.

Proof: Observe that we can always express $\mathcal{M} = \varepsilon_\mu(\vec k,\lambda)M^\mu$, by just taking out the $\varepsilon_\mu$. Then $\sum_\lambda|\mathcal{M}|^2 = M^\mu(M^*)^\nu\{\eta_{\mu\nu} + \frac12(k_\mu\tilde k_\nu + k_\nu\tilde k_\mu)\}$. The $k$ terms contracting with $M$ can be obtained by replacing $\varepsilon_\mu \to k_\mu$. It is more convenient to write the fermion propagators using $\frac{-\slashed{q}+m}{q^2+m^2-i\epsilon} = \frac{1}{\slashed{q}+m}$, since $(-\slashed{q}+m)(\slashed{q}+m) = q^2+m^2$. Then,

$$ \mathcal{M}\big|_{\varepsilon_\mu = k_\mu} = e^2\,\bar u(p')\left[\slashed{\varepsilon}(k')\,\frac{1}{\slashed{p}+\slashed{k}+m}\,\{\slashed{k}+\slashed{p}+m-\slashed{p}-m\} + \{\slashed{k}-\slashed{p}'-m+\slashed{p}'+m\}\,\frac{1}{\slashed{p}'-\slashed{k}+m}\,\slashed{\varepsilon}(k')\right]u(p) $$
$$ = e^2\left[\bar u(p')\slashed{\varepsilon}(k')u(p) - \bar u(p')\slashed{\varepsilon}(k')\,\frac{1}{\slashed{p}+\slashed{k}+m}\,(\slashed{p}+m)u(p) + \bar u(p')(-\slashed{\varepsilon}(k'))u(p) + \bar u(p')(\slashed{p}'+m)\,\frac{1}{\slashed{p}'-\slashed{k}+m}\,\slashed{\varepsilon}(k')u(p)\right] = 0\,! $$

We have used $p-k' = p'-k$ in the second term in the first equation. In the last equation, the first and the third terms cancel while the second and the fourth terms vanish by the equation of motion. This proves the claim. Hence, effectively, each of the polarization sums gives only the $\eta_{\mu\nu}$.

Note: That the total scattering amplitude vanishes when the transverse photon polarization is replaced by a longitudinal one is a general result known as a 'Ward identity'. This is a consequence of gauge invariance.
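While we do not discuss the general proof here, it suffices to note that (a) we used the Dirac equation: $(\slashed{p}+m)u(p) = 0 = \bar u(p')(\slashed{p}'+m)$; (b) the QED coupling has the form $\sim A_\mu J^\mu$ with $J^\mu \sim \bar\Psi\gamma^\mu\Psi \sim \bar u(p')\gamma^\mu u(p)$; (c) $\delta A_\mu = \partial_\mu\Lambda$ is a gauge transformation ($\leftrightarrow \varepsilon_\mu(k) \to \varepsilon_\mu(k) + k_\mu\lambda(k)$) and gauge invariance of the interaction implies $\partial_\mu J^\mu = 0$. Thus, the vanishing of the amplitude is related to gauge invariance.

The averaged $|\mathcal{M}|^2$ in equation (14.17) becomes,

$$ \sum_{\sigma,\sigma',\lambda,\lambda'}|\mathcal{M}|^2 = e^4\,\mathrm{Tr}\Bigg[\left\{\frac{\gamma^\nu(2p^\mu-\slashed{k}\gamma^\mu)}{2p\cdot k-i\epsilon} + \frac{(2(p')^\mu+\gamma^\mu\slashed{k})\gamma^\nu}{-2p'\cdot k-i\epsilon}\right\}(-\slashed{p}+m)\left\{\frac{(2p_\mu-\gamma_\mu\slashed{k})\gamma_\nu}{2p\cdot k+i\epsilon} + \frac{\gamma_\nu(2(p')_\mu+\slashed{k}\gamma_\mu)}{-2p'\cdot k+i\epsilon}\right\}(-\slashed{p}'+m)\Bigg] \tag{14.19} $$

$$ = e^4\Bigg[\underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu-\gamma^\nu\slashed{k}\gamma^\mu)\,\slashed{p}\,(2p_\mu\gamma_\nu-\gamma_\mu\slashed{k}\gamma_\nu)\,\slashed{p}'\}}{(2p\cdot k)^2}}_{①} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu+\gamma^\mu\slashed{k}\gamma^\nu)\,\slashed{p}\,(2p'_\mu\gamma_\nu+\gamma_\nu\slashed{k}\gamma_\mu)\,\slashed{p}'\}}{(-2p'\cdot k)^2}}_{②} $$
$$ \qquad + \underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu-\gamma^\nu\slashed{k}\gamma^\mu)\,\slashed{p}\,(2p'_\mu\gamma_\nu+\gamma_\nu\slashed{k}\gamma_\mu)\,\slashed{p}'\}}{-4\,p\cdot k\;p'\cdot k}}_{③} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu+\gamma^\mu\slashed{k}\gamma^\nu)\,\slashed{p}\,(2p_\mu\gamma_\nu-\gamma_\mu\slashed{k}\gamma_\nu)\,\slashed{p}'\}}{-4\,p\cdot k\;p'\cdot k}}_{④}\Bigg] $$
$$ + m^2e^4\Bigg[\underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu-\gamma^\nu\slashed{k}\gamma^\mu)(2p_\mu\gamma_\nu-\gamma_\mu\slashed{k}\gamma_\nu)\}}{(2p\cdot k)^2}}_{⑤} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu+\gamma^\mu\slashed{k}\gamma^\nu)(2p'_\mu\gamma_\nu+\gamma_\nu\slashed{k}\gamma_\mu)\}}{(-2p'\cdot k)^2}}_{⑥} $$
$$ \qquad + \underbrace{\frac{\mathrm{Tr}\{(2p^\mu\gamma^\nu-\gamma^\nu\slashed{k}\gamma^\mu)(2p'_\mu\gamma_\nu+\gamma_\nu\slashed{k}\gamma_\mu)\}}{-4\,p\cdot k\;p'\cdot k}}_{⑦} + \underbrace{\frac{\mathrm{Tr}\{(2(p')^\mu\gamma^\nu+\gamma^\mu\slashed{k}\gamma^\nu)(2p_\mu\gamma_\nu-\gamma_\mu\slashed{k}\gamma_\nu)\}}{-4\,p\cdot k\;p'\cdot k}}_{⑧}\Bigg] \tag{14.20} $$

The terms linear in $m$ have an odd number of $\gamma$'s and hence vanish under the trace. It may be checked easily that ② is obtained from ① by $p \leftrightarrow -p'$, and likewise for ⑤ $\leftrightarrow$ ⑥. Noting the identity $\mathrm{Tr}(\gamma_1\gamma_2\cdots\gamma_n) = \mathrm{Tr}(\gamma_n\gamma_{n-1}\cdots\gamma_2\gamma_1)$, it follows that ③ = ④ and ⑦ = ⑧. The first group of 4 terms involves traces of a maximum of 8 gamma matrices, while the last 4 terms involve a maximum of 6 gamma matrices. These are simplified using various identities among the gamma matrices.

Consider ①.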
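We have the traces,

$$ \mathrm{Tr}\left[4p^2\,\gamma^\nu\slashed{p}\gamma_\nu\slashed{p}' - 2\gamma^\nu\slashed{p}\slashed{p}\slashed{k}\gamma_\nu\slashed{p}' - 2\gamma^\nu\slashed{k}\slashed{p}\slashed{p}\gamma_\nu\slashed{p}' + \gamma^\nu\slashed{k}\gamma^\mu\slashed{p}\gamma_\mu\slashed{k}\gamma_\nu\slashed{p}'\right] $$

Use: $\gamma^\nu\slashed{p}\gamma_\nu = +2\slashed{p}$, $\slashed{p}\slashed{p} = -p^2$, $\slashed{k}\slashed{p}+\slashed{p}\slashed{k} = -2p\cdot k$, $\mathrm{Tr}\,\slashed{p}\slashed{p}' = -4p\cdot p'$ and the cyclic property of the trace to get the above trace as,

$$ \mathrm{Tr}\{\cdots\}_{①} = (4p^2)(2)(-4p\cdot p') + 2p^2(2)(-4k\cdot p')\times 2 + (2)(2)\,\mathrm{Tr}(\slashed{k}\slashed{p}\slashed{k}\slashed{p}') = 32m^2(p\cdot p' + k\cdot p') + 32(k\cdot p)(k\cdot p') \tag{14.21} $$

We have used $p^2 = -m^2$ and $k^2 = 0$. The sign of the middle term follows from $\slashed{p}\slashed{p} = -p^2$ applied to $-2\gamma^\nu\slashed{p}\slashed{p}\slashed{k}\gamma_\nu\slashed{p}'$. Similarly we get the other traces (without the denominator factors) as,

$$ \mathrm{Tr}\{\cdots\}_{③} = 4p\cdot p'\,\mathrm{Tr}(\gamma^\nu\slashed{p}\gamma_\nu\slashed{p}') - 2\,\mathrm{Tr}(\slashed{k}\slashed{p}'\slashed{p}\gamma_\nu\slashed{p}'\gamma^\nu) + 2\,\mathrm{Tr}(\gamma^\nu\slashed{p}\gamma_\nu\slashed{k}\slashed{p}\slashed{p}') - \mathrm{Tr}(\gamma^\nu\slashed{k}\gamma^\mu\slashed{p}\gamma_\nu\slashed{k}\gamma_\mu\slashed{p}') \tag{14.22} $$
$$ \phantom{\mathrm{Tr}\{\cdots\}_{③}} = -32(p\cdot p')^2 + (k\cdot p - k\cdot p')(16m^2 + 32p\cdot p') + 0 \tag{14.23} $$
$$ \mathrm{Tr}\{\cdots\}_{⑤} = 4p^2\,\mathrm{Tr}(\gamma^\nu\gamma_\nu) - 2\,\mathrm{Tr}(\slashed{k}\slashed{p}(-4\cdot\mathbb{1})) + 8\,\mathrm{Tr}(\slashed{p}\slashed{k}) + 16\,\mathrm{Tr}(\slashed{k}\slashed{k}) = 64m^2 - 64k\cdot p \tag{14.24} $$
$$ \mathrm{Tr}\{\cdots\}_{⑦} = -64p\cdot p' + 32(k\cdot p - k\cdot p') - \mathrm{Tr}(\gamma^\nu\slashed{k}\gamma^\mu\gamma_\nu\slashed{k}\gamma_\mu) = -64p\cdot p' + 32(k\cdot p - k\cdot p') + 0 \tag{14.25} $$

To simplify the expression further, we need to use $p\cdot p' = -m^2 + k\cdot p - k\cdot p'$, which follows from $0 = (k')^2 = (p+k-p')^2$. For comparison with [13], it is useful to note: $k\cdot p = k'\cdot p'$, $p\cdot k' = p'\cdot k$. The final expression is,

$$ \frac14\sum_{spin/Pol}|\mathcal{M}|^2 = 2e^4\left[\frac{p\cdot k'}{p\cdot k} + \frac{p\cdot k}{p\cdot k'} + 2m^2\left(\frac{1}{p\cdot k'} - \frac{1}{p\cdot k}\right) + m^4\left(\frac{1}{p\cdot k} - \frac{1}{p\cdot k'}\right)^2\right] \tag{14.26} $$

Exercise: Check the algebra!

The Compton cross-section is usually presented in the lab frame with the electron initially at rest, i.e. $p^\mu = (m, \vec 0)$, $k^\mu = (\omega, \omega\hat z)$, $(p')^\mu = (E', \vec p')$, $(k')^\mu = (\omega', \omega'\sin\theta, 0, \omega'\cos\theta)$. The final electron and photon momenta define a plane, which is taken to be the $z$-$x$ plane with the final photon making an angle $\theta$ to the $z$-axis.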
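These choices give: $p\cdot k' = -m\omega'$, $p\cdot k = -m\omega$ and,

$$ \frac14\sum_{spin/Pol}|\mathcal{M}|^2 = 2e^4\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 2m\left(-\frac{1}{\omega'} + \frac{1}{\omega}\right) + m^2\left(-\frac{1}{\omega} + \frac{1}{\omega'}\right)^2\right] \tag{14.27} $$

Next, $(p')^2 = (p+k-k')^2$ implies, $-m^2 = -m^2 + 2p\cdot(k-k') + (k-k')^2 \Rightarrow 0 = m(\omega-\omega') - \omega\omega'(1-\cos\theta)$, leading to the Compton formula for the shift in the photon wavelength with the scattering angle:

$$ \text{Compton Formula:}\qquad \frac{1}{\omega'} - \frac{1}{\omega} = \frac{1-\cos\theta}{m} \;\leftrightarrow\; \omega' = \frac{\omega}{1+\frac{\omega}{m}(1-\cos\theta)}\,. \tag{14.28} $$

To get the differential cross-section, we simplify the phase space integral as,

$$ \int\frac{d^3p'}{2E'(2\pi)^3}\,\frac{d^3k'}{2\omega'(2\pi)^3}\,(2\pi)^4\delta^4(p'+k'-p-k) = \int\frac{d\omega'\,(\omega')^2\,d\Omega}{2\omega'(2\pi)^3}\,\frac{1}{2E'(2\pi)^3}\,(2\pi)^4\,\delta(\omega'+E'-m-\omega) $$

We have used up a $\delta^3$ to get $\vec p' = \vec p + \vec k - \vec k' = (-\omega'\sin\theta, 0, \omega-\omega'\cos\theta)$, which gives $(E')^2 = (\omega')^2 + \omega^2 - 2\omega'\omega\cos\theta + m^2$. Hence

$$ \int d\omega'\,\frac{\omega'}{4E'(\omega')}\,\delta(\omega'+E'(\omega')-m-\omega) = \frac{\omega'_*}{4E'(\omega'_*)\,\left|1+\frac{dE'}{d\omega'}\right|_*}\,,\qquad \frac{dE'}{d\omega'} = \frac{\omega'-\omega\cos\theta}{E'}\,. $$

Inserting in the phase space integral,

$$ \int_{\Gamma_2} = \frac{1}{2\pi}\int d\cos\theta\,\frac{\omega'_*}{4E'\,\left|1+\frac{\omega'-\omega\cos\theta}{E'}\right|_*}\,,\qquad \omega'_* + E'(\omega'_*) = m+\omega $$
$$ \phantom{\int_{\Gamma_2}} = \frac{1}{8\pi}\int d\cos\theta\,\frac{\omega'^2}{m\omega}\,. \tag{14.29} $$

We have,

$$ \frac{d\sigma}{d\cos\theta} = \frac{1}{4m\omega}\,\frac{1}{8\pi}\,\frac{(\omega')^2}{m\omega}\left[\frac14\sum|\mathcal{M}|^2\right] = \frac{e^4}{16\pi}\,\frac{1}{m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} + 2m\,\frac{-1+\cos\theta}{m} + m^2\,\frac{(1-\cos\theta)^2}{m^2}\right] $$

This gives the differential cross-section for the Compton scattering, known as the Klein-Nishina formula,

$$ \frac{d\sigma}{d\cos\theta_{lab}} = \frac{\pi\alpha^2}{m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta_{lab}\right]\,,\quad\text{with}\quad \omega' = \frac{\omega}{1+\frac{\omega}{m}(1-\cos\theta)}\,. \tag{14.30} $$

Remarks:

(a) There has been no approximation in the computation at the $\alpha^2$ level. It can now be used for various limits. At $\theta = 0$, forward scattering, there is no frequency change, the square bracket equals 2 and the differential cross-section $d\sigma/d\cos\theta$, equaling $\frac{2\pi\alpha^2}{m^2}$, is independent of the photon energy. For a massive charged particle, one defines its Compton length to be $\lambda_c := h/(mc) = 2\pi/m$ in the natural units, $\hbar = 1 = c$. The forward scattering cross-section is then $2\alpha^2(\pi\bar\lambda^2)$, where $\bar\lambda := \lambda_c/(2\pi)$.

(b) There are two scales in the problem: $m, \omega$. This gives two natural limits: (i) $\omega \to 0 \leftrightarrow \omega/m \ll 1$ and (ii) $\omega \to \infty \leftrightarrow \omega/m \gg 1$. For a generic $\theta$, consider $\omega \to 0$.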
Then $\omega/\omega' \to 1 + (\omega/m)(1-\cos\theta)$ and

$$ \frac{d\sigma}{d\cos\theta}\bigg|_{\omega/m\ll 1} = \frac{\pi\alpha^2}{m^2}\left[1 + \cos^2\theta - 4\,\frac{\omega}{m}\,\sin^2(\theta/2)\,(1+\cos^2\theta) + \dots\right]\,,\qquad \sigma_{tot} = \frac{8\pi\alpha^2}{3m^2} + \dots \tag{14.31} $$

The leading term is the Thomson cross-section for classical radiation scattering off free electrons. In the opposite limit, $m/\omega \ll 1$, we have

$$ \omega'/\omega = \frac{m}{\omega}\,\frac{1}{1-\cos\theta}\left(1 - \frac{m}{\omega}\,\frac{1}{1-\cos\theta} + \dots\right) := \epsilon\,(1 - \dots)\,,\qquad \epsilon(\theta) := \frac{m}{\omega}\,\frac{1}{1-\cos\theta}\,. $$

In this limit,

$$ \frac{d\sigma}{d\cos\theta} \approx \frac{\pi\alpha^2}{m^2}\,\epsilon(\theta) = \frac{\pi\alpha^2}{m^2}\,\frac{m}{\omega(1-\cos\theta)}\,, \tag{14.32} $$
$$ \sigma_{tot}(\delta) := \int_{-1}^{1-\delta}d\cos\theta\,\frac{d\sigma}{d\cos\theta} \approx \frac{\pi\alpha^2}{m\omega}\,[-\ln(\delta/2)] \to \infty\,,\quad\text{as } \delta \to 0. \tag{14.33} $$

For the condition $\epsilon(\theta) \ll 1$ to hold, so that the expansion is meaningful, $\theta \gg \sqrt{2m/\omega}$ must hold. At $\theta = 0$ we know the exact answer, which is finite.

C. Electron-Positron Annihilation:

Having evaluated the Compton scattering amplitude, this is actually simple to evaluate. From the table giving the amplitudes (notice the use of $p'-k$ and $k-p'$ momenta in the crossed diagrams), we can see that the expressions for the annihilation diagrams are obtained by making the substitutions: $p \to p$, $k \to -k$, $p' \to -p'$, $k' \to k'$ and $\bar u \to \bar v$. These substitutions preserve the momentum conservation appropriate for the processes: $p+k = p'+k' \to p-k = -p'+k'$. The photon polarizations are insensitive to the 'sign' of the momentum and will give $\eta_{\mu\nu}$ as before. One of the spin sums however changes from $[-\slashed{p}'+m] \to [-(-\slashed{p}'+m)] = [\slashed{p}'-m]$, which generates an overall minus sign in the summed, squared amplitude.

This is actually an example of what is known as the crossing symmetry of the $S$-matrix: a scattering amplitude for a particle with momentum $k$ in the initial state is the same as the amplitude with the initial particle moved to the final state as an anti-particle with momentum $-k$.
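The limits quoted for the Klein-Nishina formula (14.30) are easy to check numerically. The following is a minimal sketch in natural units; the function names, the numerical value used for $\alpha$ and the Simpson integration over $\cos\theta$ are choices made here for illustration, not part of the derivation above.

```python
import math

ALPHA = 1.0 / 137.036  # fine structure constant (illustrative value)


def omega_out(omega, cos_theta, m):
    """Compton formula (14.28): energy of the scattered photon."""
    return omega / (1.0 + (omega / m) * (1.0 - cos_theta))


def klein_nishina(omega, cos_theta, m, alpha=ALPHA):
    """d(sigma)/d(cos theta) in the lab frame, equation (14.30)."""
    ratio = omega_out(omega, cos_theta, m) / omega
    sin2 = 1.0 - cos_theta ** 2
    return math.pi * alpha ** 2 / m ** 2 * ratio ** 2 * (ratio + 1.0 / ratio - sin2)


def total_cross_section(omega, m, alpha=ALPHA, steps=2000):
    """Integrate (14.30) over cos(theta) in [-1, 1] with composite Simpson (steps even)."""
    h = 2.0 / steps
    total = klein_nishina(omega, -1.0, m, alpha) + klein_nishina(omega, 1.0, m, alpha)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * klein_nishina(omega, -1.0 + i * h, m, alpha)
    return total * h / 3.0


def thomson(m, alpha=ALPHA):
    """Low-energy limit of the total cross-section, 8 pi alpha^2 / (3 m^2)."""
    return 8.0 * math.pi * alpha ** 2 / (3.0 * m ** 2)


if __name__ == "__main__":
    m = 1.0
    print(total_cross_section(1e-6, m) / thomson(m))        # close to 1 for omega << m
    print(klein_nishina(5.0, 1.0, m) * m ** 2 / (math.pi * ALPHA ** 2))  # bracket at theta = 0
    omega = 1e4
    approx = math.pi * ALPHA ** 2 / (m * omega)             # (14.32) at cos(theta) = 0
    print(klein_nishina(omega, 0.0, m) / approx)            # close to 1 for omega >> m
```

For $\omega \ll m$ the integrated result approaches the Thomson value, at $\theta = 0$ the square bracket of (14.30) equals 2 for any photon energy, and for $\omega \gg m$ at a generic angle the result approaches (14.32).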
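Without any explicit calculation we can write down the summed, squared amplitude for the annihilation process as,

$$ \frac14\sum_{spin/Pol}|\mathcal{M}|^2 = (-)2e^4\left[-\frac{p\cdot k'}{p\cdot k} - \frac{p\cdot k}{p\cdot k'} + 2m^2\left(\frac{1}{p\cdot k'} + \frac{1}{p\cdot k}\right) + m^4\left(\frac{1}{p\cdot k} + \frac{1}{p\cdot k'}\right)^2\right] \tag{14.34} $$

It is more convenient to express the cross-section in the center of mass frame:

$$ p^\mu = (E, p\hat z)\,,\quad (p')^\mu = (E, -p\hat z)\,,\quad E^2 = p^2+m^2\,,\quad s = -(p+p')^2 = 4E^2\,, $$
$$ k^\mu = (E, E\sin\Theta, 0, E\cos\Theta)\,,\quad (k')^\mu = (E, -E\sin\Theta, 0, -E\cos\Theta)\,. $$

This gives,

$$ p\cdot k = -E^2 + pE\cos\Theta = -E(E-p\cos\Theta)\,,\qquad p\cdot k' = -E^2 - pE\cos\Theta = -E(E+p\cos\Theta)\,, $$
$$ \frac{|k'|_{cm}}{|k|_{cm}} = \sqrt{\frac{s^2-0+0}{s^2-2s(2m^2)+0}} = \frac{1}{\sqrt{1-m^2/E^2}} = \frac{E}{\sqrt{E^2-m^2}}\,. $$

The differential cross-section takes the form,

$$ \frac{d\sigma}{d\Omega_{cm}} = \frac{1}{64\pi^2}\,\frac{1}{4E^2}\,\frac{E}{|p|}\left[\frac14\sum_{spin/Pol}|\mathcal{M}|^2\right]\quad\text{with} $$
$$ \frac14\sum_{spin/Pol}|\mathcal{M}|^2 = (-2e^4)\Bigg[-\frac{E+p\cos\Theta}{E-p\cos\Theta} - \frac{E-p\cos\Theta}{E+p\cos\Theta} \tag{14.35} $$
$$ \qquad\qquad + \frac{2m^2}{-E}\left(\frac{1}{E+p\cos\Theta} + \frac{1}{E-p\cos\Theta}\right) \tag{14.36} $$
$$ \qquad\qquad + \frac{m^4}{E^2}\left(\frac{1}{E+p\cos\Theta} + \frac{1}{E-p\cos\Theta}\right)^2\Bigg] $$
$$ = 4e^4\left[\frac{E^2+p^2\cos^2\Theta}{E^2-p^2\cos^2\Theta} + \frac{2m^2}{E^2-p^2\cos^2\Theta} - \frac{2m^4}{(E^2-p^2\cos^2\Theta)^2}\right] \tag{14.37} $$

The differential cross-section for the unpolarized $e^+e^-$ annihilation process is then given by (using $E^2 - p^2\cos^2\Theta = m^2 + p^2\sin^2\Theta$),

$$ \frac{d\sigma}{d\Omega_{cm}} = \frac{\alpha^2}{s}\,\frac{E}{|p|}\left[\frac{E^2+p^2\cos^2\Theta+2m^2}{m^2+p^2\sin^2\Theta} - \frac{2m^4}{(m^2+p^2\sin^2\Theta)^2}\right]\,,\qquad s = 4E^2\,. \tag{14.38} $$

In the high energy limit, $E \gg m$, this reduces to,

$$ \frac{d\sigma}{d\Omega_{cm}} \to \frac{\alpha^2}{s}\,\frac{1+\cos^2\Theta}{\sin^2\Theta} \tag{14.39} $$

To obtain the total cross-section, we need to integrate over the final state. Since the two photons are identical, $\Theta$ needs to be integrated between $0$ and $\pi/2$. The Bose factor of 2 (which we did not include in the amplitude) cancels the factor of 2 from the angular integration. We get,

$$ \sigma_{tot} = \frac{\alpha^2}{s}\,(2\pi)\int_0^1 dx\,\frac{1+x^2}{1-x^2} \tag{14.40} $$

The integral is clearly divergent from the 'forward' direction ($\Theta = 0$). This is artificial since we set $m = 0$ in the integrand. Split the integration as $\int_0^{1-\Delta} + \int_{1-\Delta}^1$. The first term is a finite number and this contribution falls off as $s^{-1}$. For the second term, we go back to the exact differential cross-section and approximate the integrand for $1-\Delta \le \cos(\Theta) \le 1$.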
Then, the factors of $E$ all cancel and the second term $\sim \alpha^2(1-\Delta)m^{-2}$ where $\Delta \ll m^2/p^2$, so that the denominator can be approximated. Thus, the total cross-section does not diverge but is bounded by $m^{-2}$. This is an example of the bounds satisfied by total cross-sections at very high energies.

15. NUMERICAL ESTIMATES OF CROSS-SECTIONS AND APPLICATIONS

We have obtained examples of cross-sections for some of the basic processes in quantum electrodynamics. We also saw how they relate to non-relativistic versions of the processes, e.g. potential scattering, inferred the modification due to relativistic effects and estimated the 'strengths' of interactions. Before we study the higher order corrections, it is useful to have a sense of the orders of magnitude of cross-sections and of how these numbers are used in applications. We will focus on the QED processes.

What is the value of the fine structure constant? It is related to the electric charge, to the Planck constant due to the quantum framework and to the speed of light due to special relativity. We first obtain the expression in terms of "engineering units" and then use the measured values to compute the fine structure constant.

Observe that the Feynman rules are derived from the Lagrangian density which has properly normalized kinetic (quadratic) terms, $-\frac14F_{\mu\nu}F^{\mu\nu} + i\bar\Psi\gamma^\mu\partial_\mu\Psi - m\bar\Psi\Psi$, and the interaction term introduces '$e$' as $e\bar\Psi\gamma^\mu\Psi A_\mu$, so that $e$ is dimensionless. This definition gives the equation of motion, $\partial_\mu F^{\mu\nu} = -e\bar\Psi\gamma^\nu\Psi \leftrightarrow \partial_iE^i = e\Psi^\dagger\Psi$, using the identifications: $F^{0i} := E^i$, $\partial_\mu F^{\mu\nu} = -J^\nu$, $J^\mu = (\rho, J^i)$. Furthermore, the Hamiltonian and hence the energy density is $\vec E^2/2$.

The MKSA (SI) units have the Gauss law in the form: $\vec\nabla\cdot\vec E_{SI} = \rho_{SI}/\epsilon_0$. The field energy density (as inferred from the Poynting theorem) is $\epsilon_0\vec E_{SI}^2/2$. Comparing the energy densities, we introduce the identification $\vec E := \sqrt{\epsilon_0}\,\vec E_{SI}$. Then $\rho = \vec\nabla\cdot\vec E = \sqrt{\epsilon_0}\,\vec\nabla\cdot\vec E_{SI} := \frac{\rho_{SI}}{\sqrt{\epsilon_0}}$. This leads to the identification, $e := \frac{e_{SI}}{\sqrt{\epsilon_0}}$.

Thus, we identify the variables used in the action with those used in engineering units by comparing the equations of motion and the energy densities. In the natural units, $\hbar = 1 = c$, used in writing the action, the fine structure constant was defined as $\alpha := e^2/(4\pi)$, which is dimensionless as noted above. Substituting $e$ in terms of $e_{SI}$ and introducing $\hbar, c$, we write $\alpha := \frac{e_{SI}^2}{4\pi\epsilon_0}\hbar^a c^b$ and determine the powers $a, b$ by requiring $\alpha$ to be dimensionless. For this, we need the dimensions of $e_{SI}$ and $\epsilon_0$! Now we also use the SI expression of the Lorentz force, $F = e_{SI}E_{SI}$. Hence, dimensionally, $(\text{Force})^2/(\text{energy density}) \sim e_{SI}^2/\epsilon_0 \sim M^1L^3T^{-2}$. This and $\alpha$ being dimensionless gives $a = b = -1$ and

$$ \alpha = \frac{e_{SI}^2}{4\pi\epsilon_0\hbar c} \simeq 1/137\,, $$

using the values: $(4\pi\epsilon_0)^{-1} = c^2\,10^{-7}$, $e_{SI} = 1.6\times10^{-19}$ Coulomb, $\hbar \simeq 1.05\times10^{-34}$ Joule.sec and $c \simeq 3\times10^8$ m/sec.

An Aside: Another set of units is commonly used in gravitational physics, the so-called geometrized units. These are defined by setting $G = c = 1$. The choice of $c = 1$ gives: 1 sec $= 3\times10^8$ meters. Newton's constant in SI units is $G \approx 6.67\times10^{-11}$ kg$^{-1}$ meter$^3$ sec$^{-2}$. Hence setting $G = 1$ along with $c = 1$ gives 1 kg $= 7.4\times10^{-28}$ meters.

It is conventional in gravitational physics to use the Gauss law in the form $\vec\nabla\cdot\vec E = 4\pi\rho$ and the electric field energy density as $E^2/(8\pi)$. This gives the identifications: $E_{geom} = \sqrt{4\pi\epsilon_0}\,E_{SI}$, $q_{geom} = q_{SI}/\sqrt{4\pi\epsilon_0}$. As before, the Lorentz force equation gives the dimensions as: $[q_{SI}^2/\epsilon_0] = [4\pi q_{geom}^2] = M^1L^3T^{-2} = L^2$. Setting $q_{geom} = \frac{q_{SI}}{\sqrt{4\pi\epsilon_0}}G^\alpha c^\beta$, which must have dimensions of $L$, we infer $\alpha = 1/2$ and $\beta = -2$ and

$$ q_{geom} = \frac{q_{SI}}{\sqrt{4\pi\epsilon_0}}\,G^{1/2}c^{-2} = q_{SI}\sqrt{c^2\times10^{-7}}\;G^{1/2}c^{-2} \approx (8.6\times10^{-18})\,q_{SI}\ \text{meters}. $$

Looking at the expressions for the cross-sections and noting that in the natural units energy $\sim$ length$^{-1}$, we see that the cross-sections are of the form $\alpha^2/E^2$. This gets more and more accurate at ultra-relativistic energies, where masses can be neglected.
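The arithmetic behind these conversions can be reproduced in a few lines. The sketch below uses only the rounded input values quoted in the text; the constant and function names are introduced here for illustration.

```python
import math

# Rounded SI inputs as quoted in the text.
C = 3.0e8                      # speed of light, m/s
HBAR = 1.05e-34                # reduced Planck constant, J s
E_SI = 1.6e-19                 # elementary charge, Coulomb
G_NEWTON = 6.67e-11            # Newton's constant, kg^-1 m^3 s^-2
COULOMB_CONST = C ** 2 * 1e-7  # 1 / (4 pi epsilon_0) in SI units


def fine_structure_constant():
    """alpha = e_SI^2 / (4 pi epsilon_0 hbar c), dimensionless."""
    return E_SI ** 2 * COULOMB_CONST / (HBAR * C)


def kilogram_in_meters():
    """Length equivalent of 1 kg in geometrized units (G = c = 1): G / c^2."""
    return G_NEWTON / C ** 2


def charge_in_meters(q_si):
    """q_geom = q_SI / sqrt(4 pi epsilon_0) * G^(1/2) / c^2, a length in meters."""
    return q_si * math.sqrt(COULOMB_CONST) * math.sqrt(G_NEWTON) / C ** 2


if __name__ == "__main__":
    print(1.0 / fine_structure_constant())   # close to 137
    print(kilogram_in_meters())              # about 7.4e-28
    print(charge_in_meters(1.0))             # about 8.6e-18 per Coulomb
```

With these rounded inputs the inverse fine structure constant comes out close to 137, and the two geometrized conversion factors reproduce the numbers quoted in the aside.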
Since the cross-section has dimensions of area while energies are typically given in MeV/GeV/TeV etc., it is useful to have a conversion between energy units and length units. As already noted, $c = 1 \Rightarrow$ 1 second $= 3\times10^8$ meters. $\hbar = 1 \Rightarrow$ 1 Joule $= 10^{34}$ sec$^{-1}$ $\simeq 3.3\times10^{25}$ meter$^{-1}$. Next, 100 MeV $= 1.6\times10^{-19+8}$ J $\simeq 5\times10^{14}$ meter$^{-1}$. (This is also expressed as 200 MeV.fermi $= 1$.) Clearly, a cross-section at 100 MeV (the electron mass being 0.5 MeV) is approximately equal to $\sigma_{100\,MeV} \simeq \alpha^2/E^2 \simeq 2\times10^{-34}$ m$^2$. The conventional unit for these cross-sections is 1 barn := 100 fermi$^2$ $= 10^{-28}$ m$^2$. Typical numbers encountered in high energy processes (around 1 GeV) are: strong interactions $\sim 10^{-30}$ m$^2$, electromagnetic interactions $\sim 10^{-36}$ m$^2$ and weak interactions $\sim 10^{-42}$ m$^2$.

So, scattering experiments will provide us with such numbers. How are they useful? In many applications, say the passage of a collimated set of particles through some medium, we have an estimate of the cross-section for the scattering of the individual particles comprising the 'beam' off those of the medium. Intuitively, a cross-section may be viewed as a target disk of area $\sigma$ which is bombarded by 'marbles'. They are reflected if they hit the disk, or pass by if not. Thus, $\sigma$ gives the likelihood of a scattering interaction.

Suppose we have multiple targets (the medium) and multiple beam particles. The probability of interaction is clearly proportional to the ratio of the effective area of the target particles to the area exposed to the beam. Thus, probability of scattering = (number of target particles exposed to the beam) $\times\,\sigma$ / (total area presented by the medium).

[Figure: a beam incident on a target slab presenting a given total area.]

Let $n$ be the number of target particles exposed per unit area perpendicular to the beam. Then the number of particles exposed $= n\times$ total area and the probability of scattering $= n\cdot\sigma$. Let $\rho$ be the number density of the target particles, i.e. the number per unit volume. Let $d$ be the thickness of the medium.
Then the areal density $n = \rho\cdot d$. The probability of scattering is $1 = n\sigma = \rho\cdot d\cdot\sigma$ for $d = \hat d = (\rho\cdot\sigma)^{-1}$. Let $\langle v\rangle$ be the average speed of the beam particles. Then the average reaction time is $\tau_r := \hat d/\langle v\rangle = 1/(\rho\langle v\rangle\sigma)$. Thus, if we have a confined plasma of say electrons and we bombard it with a beam of positrons and look for production of $\mu^\pm$ pairs, then we should expect to wait for a time $\tau = 1/(\rho\langle v\rangle\sigma_{e^+e^-\to\mu^+\mu^-})$. This type of use of microscopic cross-sections is required in discussing bulk processes, e.g. in nuclear physics or astrophysics, and is useful in estimating thermalization times.

Sometimes, we aren't interested in tracking individual processes, any interaction may suffice. In such cases we need to sum over all possible final states, $\sigma_{tot} = \sum_f\sigma_{i\to f}$. The $\sigma_{i\to f}\sim|\mathcal{M}_{i\to f}|^2$. We already have expressions using phase space integration for a fixed number of final state particles. Now we have to sum over the number of particles and the labels,
\[ \sigma_{tot} = \frac{1}{4E_kE_p\,v_{lab}}\sum_{f,\alpha}\int\prod_{j=1}^{N_f}\frac{d^3k'_j}{(2\pi)^3\,2\omega_{k'_j}}\,(2\pi)^4\delta^4(\Sigma k' - P_{in})\,|\mathcal{M}|^2 . \]
This can be related to the forward scattering amplitude via the "Optical Theorem".

We have $S^\dagger S = 1 \Rightarrow \sum_f\langle i|S^\dagger|f\rangle\langle f|S|i\rangle = 1 = \sum_f|S_{fi}|^2$. Substituting for $S_{fi}$ we get,
\[ |S_{fi}|^2 = \delta_{fi}\delta_{fi} + (2\pi)^4\delta^4(P_f-P_i)\,\delta_{fi}\,[i(\mathcal{M}_{fi}-\mathcal{M}^*_{fi})] + (2\pi)^8[\delta^4(P_f-P_i)]^2|\mathcal{M}_{fi}|^2 . \]
Summing over $f$ and using unitarity gives,
\[ (2\pi)^4\sum_f\delta^4(P_f-P_i)\,\delta_{fi}\,[i(\mathcal{M}_{fi}-\mathcal{M}^*_{fi})] = -(2\pi)^8\sum_f\delta^4(P_f-P_i)\,\delta^4(P_f-P_i)\,|\mathcal{M}_{fi}|^2 \]
Or,
\[ 2\,\mathrm{Im}(\mathcal{M}_{ii}) = (2\pi)^4\sum_f\delta^4(P_f-P_i)\,|\mathcal{M}_{fi}|^2 . \qquad (15.1) \]
Substituting the rhs in the expression for $\sigma_{tot}$ gives,
\[ \sigma_{tot} = \frac{2\,\mathrm{Im}(\mathcal{M}_{ii})}{4k\sqrt{s}}\ ,\quad\text{where } k = k_{CM}\ \text{and}\ s = -\Big(\sum P_i\Big)^2 . \]
The $\mathcal{M}_{ii}$ is the forward scattering, $i\to i$, amplitude and $\sigma_{tot}$ is the inclusive cross-section for $i\to$ any $f$ processes. This could also be used to estimate the total cross-section.

Remark: Let us note that we defined the scattering matrix in the basis of number states or Fock basis.
Further, we chose the idealization of plane waves for both in and out states. This is convenient but loses the information about spatial locations of the scattering particles. This also loses any notion of "impact parameter" - how far the projectile particle is from the target particle or equivalently the 'orbital angular momentum' of the projectile about the target particle. The information about the transverse position is of course contained in the parameters of the outgoing particles. This should be kept in mind while interpreting some of the divergences of cross-sections.

16. RADIATIVE CORRECTIONS IN QED

Consider now corrections due to higher powers of the interaction Hamiltonian, also called radiative corrections. Having seen the correspondence between the correction terms and Feynman diagrams, we can list the corrections directly in terms of the diagrams. A Feynman diagram is made up of lines (free propagators) and vertices. In QED, there are two types of propagators and one type of vertex. So consider corrections to the two point functions and the three point function.

A. The fermion propagator: self-energy

[Diagrammatic equations (16.1) and (16.2): the full fermion two point function ("All") written as a sum of diagrams (16.1), and regrouped as the free propagator plus chains of 1PI blobs connected by free propagators (16.2).]

Here is a graphical expansion of the exact two point function of the fermions. In the second line, the diagrams have been grouped into a series of terms involving 1-particle irreducible (1PI) blobs, connected by free propagators. The 1PI blob represents the sum of all diagrams with two external legs and which cannot be disconnected by removing a single fermion line. The 2nd and the 3rd diagrams in the first line are 1PI (their extreme propagators being the external lines) while the last diagram is 1-particle reducible. The second line follows from the first one by inspection.

Let $-iS'_F(p)$ denote the lhs - the full 2-point function or full propagator - and $-iS_F(p)$ denote the free propagator. Let the 1PI blob be denoted by $i\Sigma(p)$.
Then the series is:
\[ -iS'_F(p) = -iS_F(p) + (-iS_F)\cdot i\Sigma(p)\cdot(-iS_F) + (-iS_F)\cdot i\Sigma\cdot(-iS_F)\cdot i\Sigma\cdot(-iS_F) + \ldots \]
\[ \phantom{-iS'_F(p)} = -iS_F + (-iS_F)\cdot i\Sigma\cdot\big(-iS_F + (-iS_F)\cdot i\Sigma\cdot(-iS_F) + \ldots\big) \]
\[ \therefore\ -iS'_F = -iS_F + (-iS_F)\cdot i\Sigma\cdot(-iS'_F) \]
\[ \therefore\ S_F = (1 - S_F\Sigma)\,S'_F\ \leftrightarrow\ 1 = (S_F^{-1} - \Sigma)\,S'_F \]
\[ \therefore\ S'_F(p) = \frac{1}{\slashed{p} + m - \Sigma(p)}\ ,\quad\text{and}\quad S_F^{-1} = \slashed{p} + m . \]
$\Sigma(p)$ is called the fermion self energy.

The $S'_F(p), \Sigma(p)$ are matrices with the Dirac spinor indices though we may not always be explicit about it. What can we say about the singularities of the exact propagator? We can appeal to Lorentz covariance and decompose $\Sigma(p) = a(p^2)\slashed{p} + b(p^2)\mathbb{1} + c(p^2)\gamma_5$. With parity conserving interactions such as QED, we can take $c(p^2) = 0$. Furthermore, to the leading order, $a(p^2) = 0 = b(p^2)$. Thus the denominator of the exact propagator is,
\[ [S'_F(p)]^{-1} = (1-a(p^2))\slashed{p} + (m - b(p^2)) = (1-a(p^2))\left[\slashed{p} + \frac{m-b(p^2)}{1-a(p^2)}\right] \]
\[ \Rightarrow\ S'_F(p) = \frac{1}{1-a(p^2)}\left[\frac{-\slashed{p} + \frac{m-b(p^2)}{1-a(p^2)}}{p^2 + \left(\frac{m-b(p^2)}{1-a(p^2)}\right)^2}\right] \]
Clearly, the denominator vanishes for $p^2 = -m_{ph}^2$, $m_{ph} := \frac{m - b(-m_{ph}^2)}{1 - a(-m_{ph}^2)}$. Since the numerator has the form $-\slashed{p} + m_{ph}$, we can regard the pole to be defined by $\slashed{p} = -m_{ph}$. Since $a, b$ are functions of $p^2 = -\slashed{p}\slashed{p}$, we may regard the self energy as a function of $\slashed{p}$ and define the pole by the condition: $\slashed{p} + m - \Sigma(\slashed{p})|_{\slashed{p}=-m_{ph}} = 0$. Clearly, $m_{ph}\neq m$. The exact propagator can be expanded about its pole as,
\[ \slashed{p} + m - \Sigma(\slashed{p}) = 0 + (\slashed{p}+m_{ph})\left[1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\Big|_{\slashed{p}=-m_{ph}}\right] + o((\slashed{p}+m_{ph})^2) \]
\[ \therefore\ S'_F(p) \simeq \frac{1}{\slashed{p}+m_{ph}}\left[\frac{1}{1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\big|_{\slashed{p}=-m_{ph}}}\right] := \frac{Z_2}{\slashed{p}+m_{ph}} \]
The shift of the position of the pole in the exact propagator from that of the free propagator, $\delta m := m_{ph} - m = -\Sigma(\slashed{p} = -m_{ph})$, is important in an S-matrix element which has the spinors on the external lines. These spinors satisfy the equation of motion with the physical mass ($= m_{ph}$) and not the mass parameter in the free propagator inferred from the Lagrangian.
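The resummation of the 1PI chain into $S'_F = (S_F^{-1} - \Sigma)^{-1}$ is a matrix geometric series and can be checked numerically. The sketch below is only an illustration: the 2x2 matrices standing in for $S_F^{-1}$ and $\Sigma$ are arbitrary assumed numbers (not Dirac matrices), chosen so that the series converges.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def matsub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def dyson_partial_sum(s_free, sigma, n_terms):
    """S + S.Sigma.S + S.Sigma.S.Sigma.S + ... truncated after n_terms terms."""
    term = s_free
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_terms):
        total = matadd(total, term)
        term = matmul(matmul(s_free, sigma), term)  # prepend one more (S.Sigma)
    return total


# Hypothetical stand-ins: S_FREE_INV plays the role of (pslash + m),
# SIGMA the role of a "small" self energy insertion.
S_FREE_INV = [[2.0, 0.3], [0.1, 1.5]]
SIGMA = [[0.2, 0.05], [-0.1, 0.3]]

S_FREE = inverse(S_FREE_INV)
S_EXACT = inverse(matsub(S_FREE_INV, SIGMA))   # (S^-1 - Sigma)^-1
S_SERIES = dyson_partial_sum(S_FREE, SIGMA, 60)
MAX_DIFF = max(abs(S_EXACT[i][j] - S_SERIES[i][j])
               for i in range(2) for j in range(2))

if __name__ == "__main__":
    print("max |exact - series| =", MAX_DIFF)
```

The ordering of the factors matters because the objects are matrices; the same ordering is kept in the series above.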
In particular this means that in the LSZ formula for an S-matrix element, the amputation of external fermion legs is to be done with $(-i\slashed{\partial} + m_{ph})$. This in turn means that in the Green's functions with external fermion legs, the external lines must include the self-energy corrections i.e. should have the $S'_F$ propagator instead of the free propagator $S_F$ (and of course no more self energy corrections on the external lines). We will return to this point later again while discussing renormalized perturbation series.

We also see that the residue at the pole in the exact propagator is not 1 and we identify it with the $Z_2$ which we know from the Kallen-Lehmann representation to be less than 1.

B. The photon propagator: photon self-energy

[Diagrammatic equations (16.3) and (16.4): the full photon two point function ("All") written as a sum of diagrams (16.3), and regrouped as the free propagator plus chains of 1PI blobs connected by free propagators (16.4).]

Here is a graphical representation of the exact photon propagator. As before, we group together the 1PI diagrams connected by free propagators. Let the exact propagator be denoted by $-iD'_{\mu\nu}(q)$ and the free propagator by $-iD_{\mu\nu}(q)$. Let the 1PI blob be denoted by $i\Pi_{\mu\nu}(q)$. In the Lorentz gauge, the free propagator is: $D_{\mu\nu}(q) = \eta_{\mu\nu}/(q^2 - i\epsilon)$.

While the series can be formally summed as before, it is more convenient to separate the tensor indices in the $\Pi_{\mu\nu}(q)$ tensor. Lorentz covariance implies, $\Pi_{\mu\nu}(q) = \eta_{\mu\nu}A(q^2) + q_\mu q_\nu B(q^2)$. The $\Pi_{\mu\nu}$ tensor may be thought of as a $\gamma\to\gamma$ process, although $q^2\neq0$. We now appeal to a Ward identity: $q^\mu\Pi_{\mu\nu}(q) = 0$. This is proved separately below. Presently, it implies $A(q^2) = -q^2B(q^2)$ and we define: $\Pi_{\mu\nu}(q) := [\eta_{\mu\nu}q^2 - q_\mu q_\nu]\,\Pi(q^2)$. With this notation, the exact 2-point function takes the form,
\[ -iD'_{\mu\nu}(q) = -i\frac{\eta_{\mu\nu}}{q^2-i\epsilon} + \left(-i\frac{\eta_{\mu\alpha}}{q^2-i\epsilon}\right)\cdot[i(q^2\eta^{\alpha\beta} - q^\alpha q^\beta)\Pi(q^2)]\cdot\left(-i\frac{\eta_{\beta\nu}}{q^2-i\epsilon}\right) + (-iD_{\mu\alpha})\cdot[i(\ldots)^{\alpha\beta}]\cdot(-iD_{\beta\rho})\cdot[i(\ldots)^{\rho\sigma}]\cdot(-iD_{\sigma\nu}) + \ldots \]
Define,
\[ i(q^2\eta^{\alpha\beta} - q^\alpha q^\beta)(-i\eta_{\beta\nu}\,q^{-2}) = \delta^\alpha{}_\nu - \frac{q^\alpha q_\nu}{q^2} =: \Delta^\alpha{}_\nu\ \Rightarrow\ \Delta^\alpha{}_\nu\Delta^\nu{}_\beta = \Delta^\alpha{}_\beta . \]
Then the exact propagator series takes the form,
\[ -iD'_{\mu\nu}(q) = -i\frac{\eta_{\mu\nu}}{q^2-i\epsilon} + \left(-i\frac{\eta_{\mu\alpha}}{q^2-i\epsilon}\right)\Delta^\alpha{}_\nu\,\Pi(q^2) + \left(-i\frac{\eta_{\mu\alpha}}{q^2-i\epsilon}\right)\Delta^\alpha{}_\beta\Delta^\beta{}_\nu\,\Pi^2(q^2) + \ldots \]
\[ = -i\frac{\eta_{\mu\nu}}{q^2-i\epsilon} + \left(-i\frac{\eta_{\mu\alpha}}{q^2-i\epsilon}\right)\Delta^\alpha{}_\nu\,(\Pi + \Pi^2 + \Pi^3 + \ldots) \]
\[ = \frac{-i}{q^2-i\epsilon}\left[\left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right) + \frac{q_\mu q_\nu}{q^2} + \left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)(\Pi + \Pi^2 + \Pi^3 + \ldots)\right] \]
\[ \therefore\ D'_{\mu\nu}(q) = \frac{1}{q^2\,(1-\Pi(q^2))}\left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right) + \frac{1}{q^2}\,\frac{q_\mu q_\nu}{q^2} . \]
In any S-matrix calculation, the $D'$ propagator will land on a fermion line and thanks to the Ward identity, the terms proportional to $q_\mu, q_\nu$ will vanish. Hence, for S-matrix element computations, we can take $D'_{\mu\nu}(q) = \frac{\eta_{\mu\nu}}{(q^2-i\epsilon)(1-\Pi(q^2))}$. The $\Pi(q^2)$ is called the photon self energy.

To the leading order, the self energy vanishes and hence in perturbation theory, it can never equal 1 and cause another pole at some other $q^2$. As long as it is regular at $q^2 = 0$, the exact propagator continues to have a simple pole at $q^2 = 0$, just like the free propagator. Hence, the photon wavefunction $\epsilon_\mu(\vec k)$ continues to be transverse and longitudinal polarizations will decouple. Unlike the fermion self energy which shifts the mass, the photon self energy does not shift the photon mass from zero thanks to the Ward identity. However, like the fermion self energy, the residue at the pole does shift away from 1, $(Z_3)^{-1} := 1 - \Pi(q^2 = 0)$.

1. The Ward identity claim: $q^\mu\Pi_{\mu\nu}(q) = 0$:

[Figure: a fermion loop with photon vertices $\gamma_1, \gamma_2, \ldots, \gamma_n$ and fermion momenta $p_1, \ldots, p_n$; an additional photon vertex $\gamma_j$ injecting momentum $q$ is inserted between the vertices $\gamma_i$ and $\gamma_{i+1}$, so that the momenta $p_i, \ldots, p_n$ after it become $p_i + q, \ldots, p_n + q$.]

Proof: Recall that $\Pi_{\mu\nu}$ is defined without external propagators i.e. it is an amputated 2-point function. The momentum $q$ however, need not be on-shell and hence it does not correspond to an S-matrix element (even if we disregard the polarization factors). Consider diagrams contributing to the photon self energy with some fixed number of vertices (or order in the coupling). The set of these diagrams will be generated by inserting a vertex injecting momentum $q^\mu$ at various points on the available fermion lines. The fermion lines will necessarily be loops as there are no external fermions contributing to the photon self energy.
Consider the subset of diagrams wherein the $q^\mu$ vertex is on some particular loop which contributes a factor of the form: (the fermion arrow points from $i = 1$ to $i = n$ and so do the fermion momenta)
\[ Tr\left[S_F(p_n+q)\gamma^{\mu_n}\ldots S_F(p_{i+1}+q)\gamma^{\mu_{i+1}}\{S_F(p_i+q)\,\slashed{q}\,S_F(p_i)\}\gamma^{\mu_i}\ldots S_F(p_1)\gamma^{\mu_1}\right] \]
The $q^\mu$ vertex is inserted between the $i$th and $(i+1)$th vertices and adds the propagator $S_F(p_i+q)$. All momenta from $p_{i+1}$ to $p_n$ are shifted by $q$. We have also contracted with $q_\mu$.

The expression between the braces simplifies as (see footnote 13),
\[ \frac{1}{\slashed{p}_i+\slashed{q}+m}\,\slashed{q}\,\frac{1}{\slashed{p}_i+m} = \frac{1}{\slashed{p}_i+m} - \frac{1}{\slashed{p}_i+\slashed{q}+m} = S_F(p_i) - S_F(p_i+q) . \]
The trace thus becomes,
\[ Tr\left[S_F(p_n+q)\gamma^{\mu_n}\ldots S_F(p_{i+1}+q)\gamma^{\mu_{i+1}}\{S_F(p_i)\}\gamma^{\mu_i}\ldots S_F(p_1)\gamma^{\mu_1}\right] - Tr\left[S_F(p_n+q)\gamma^{\mu_n}\ldots S_F(p_{i+1}+q)\gamma^{\mu_{i+1}}\{S_F(p_i+q)\}\gamma^{\mu_i}\ldots S_F(p_1)\gamma^{\mu_1}\right] \]
Note that in the first term the momenta after $i$, and in the second term the momenta after $i-1$, have $q$ added to them. Summing over the diagrams where the $q^\mu$ vertex is on this fermion loop, i.e. over $i = 1, \ldots, n$, there is a pairwise cancellation. The first term at $i$ cancels the second term at $i+1$ and so on. We are left with the second term at $i = 1$ and the first term at $i = n$, i.e.
\[ \sum_{i=1}^{n}Tr[\ldots] = Tr\left[S_F(p_n)\gamma^{\mu_n}\ldots S_F(p_1)\gamma^{\mu_1}\right] - Tr\left[S_F(p_n+q)\gamma^{\mu_n}\ldots S_F(p_1+q)\gamma^{\mu_1}\right] \]
Because we have a fermion loop, there is an integration over, say, $p_1$. Provided the integral is finite, we can shift the integration variable and absorb away the $q$. The integrals of the two traces then cancel out and the claim is proved.

Remark: We have not yet introduced loop diagrams. The above fermion loop, before the $q$ insertion, has $n$ vertices and $n$ momenta entering at those vertices. The conservation of momenta, taking out the overall conservation, thus leaves one momentum undetermined and this is to be integrated.

Remark: As an extension of the result, consider an open fermion line with incoming momentum $p_0$, outgoing momentum $p_n$ and insertions on it.
[Figure: an open fermion line with incoming momentum $p_0$ and photon vertices $\gamma_1, \gamma_2, \ldots, \gamma_n$; an additional photon vertex $\gamma_j$ injecting momentum $q$ is inserted between the vertices $\gamma_i$ and $\gamma_{i+1}$, so that the momenta after it become $p_i + q, \ldots, p_{n-1} + q, p_n + q$.]

Footnote 13: This identity holds as long as both the fermion propagators have the same mass, $m_{ph}$ or $m$. This is relevant while dealing with external fermion lines.

For the insertion between the $i, i+1$ vertices, we will have a string of factors as,
\[ S_F(p_n+q)\gamma^{\mu_n}S_F(p_{n-1}+q)\ldots\gamma^{\mu_{i+1}}\{S_F(p_i+q)\,\slashed{q}\,S_F(p_i)\}\gamma^{\mu_i}\ldots S_F(p_1)\gamma^{\mu_1}S_F(p_0) \]
Replacing $S_F(p_i+q)\slashed{q}S_F(p_i) = S_F(p_i) - S_F(p_i+q)$ and summing over the insertions from $i = 0, \ldots, n$, we get
\[ \sum_{i=0}^{n}[\ldots] = S_F(p_n)\gamma^{\mu_n}S_F(p_{n-1})\ldots S_F(p_1)\gamma^{\mu_1}S_F(p_0) - S_F(p_n+q)\gamma^{\mu_n}S_F(p_{n-1}+q)\ldots S_F(p_1+q)\gamma^{\mu_1}S_F(p_0+q) \]
To get an S-matrix contribution, we have to multiply by $\bar u(p_n+q)[S_F(p_n+q)]^{-1}$ on the left and by $[S_F(p_0)]^{-1}u(p_0)$ on the right and take the limits $p_0^2\to-m^2$, $(p_n+q)^2\to-m^2$. Now we see that each of the two terms has only one of the poles which can cancel the inverse propagator. In the on-shell limit then, each of the terms vanishes, thereby proving that for S-matrix elements with external fermion lines, the sum over insertions of photon lines followed by contraction with the photon momentum gives zero.

Note: There is one subtlety here. For external lines, we should be using the exact propagator as per the LSZ rules. The $S\slashed{q}S$ identity requires both propagators to have the same mass, the physical mass. This would require all internal fermion lines to use the exact propagator (and then we do not consider the fermion self energy corrections on them). For a fermion loop, this issue does not arise. Notice however that $S'_F(p)^{-1} = S_F(p)^{-1} - \Sigma(p)$ and $\Sigma(p)$ is always of order $\alpha$ and higher. Thus, in using the free inverse propagator instead of the exact inverse propagator, we are dropping a higher order contribution which we can do in a perturbation theory. The conclusion is then that $q_\mu\mathcal{M}^\mu(q,\ldots) = 0$ which is a statement of the Ward identity.

C. The Vertex function: Form factors

The expectation is that the electron-photon coupling will undergo a change due to higher order corrections. Unlike the 2-point functions above which were considered for arbitrary momentum, we consider the 3-point function with the fermion momenta "on-shell" (this means that $q^2$ is necessarily space-like (prove this) and hence the photon is an internal line). When viewed as part of an S-matrix element, the external fermion legs are amputated with the physical mass inverse propagator and replaced by the $\bar u(p'), u(p)$ spinors respectively. Here is a graphical view of the possible corrections.

[Diagrammatic equations (16.5) and (16.6): the full three point function ("All") with incoming fermion momentum $p$, outgoing fermion momentum $p'$ and photon momentum $q$ written as a sum of diagrams (16.5), and $ie\Gamma^\mu(p,p') = ie\gamma^\mu + \ldots$ written in terms of 1PI blobs (16.6).]

Notice that some corrections on the fermion lines and the photon line are just self-energy corrections. The remaining corrections have photon lines necessarily connecting the two fermion lines and these diagrams are already 1PI. The last diagram in the second equation is 1PR.

We can simplify the form of the exact vertex function, $\Gamma^\mu(p,p')$, by appealing to Lorentz covariance.
\[ \Gamma^\mu(p,p') = \gamma^\mu A(q^2) + (p^\mu + p'^\mu)B(q^2) + (p'^\mu - p^\mu)C(q^2) . \]
With two independent momenta, we have 3 Lorentz scalars, $p^2, (p')^2, p\cdot p'$. The first two equal $-m_{ph}^2$ while the third is traded for $q^2$ for convenience.

Claim: $\bar u(p')q_\mu\Gamma^\mu u(p) = 0$.

Proof: This is the Ward identity argument given above.

The above form for $\Gamma^\mu$ then implies,
\[ 0 = \bar u(p')\slashed{q}u(p)A(q^2) + (-m_{ph}^2 + m_{ph}^2)\,\bar u(p')B(q^2)u(p) + q^2\,\bar u(p')C(q^2)u(p) = 0 + 0 + q^2\,\bar u(p')C(q^2)u(p)\ \Rightarrow\ \bar u(p')C(q^2)u(p) = 0 . \]
The $A, B$ terms can be rewritten using the "Gordon identity",
\[ \bar u(p')\gamma^\mu u(p) = \frac{1}{2m_{ph}}\,\bar u(p')\left[(p+p')^\mu - i\sigma^{\mu\nu}q_\nu\right]u(p) \]
This can be established as follows ($m_{ph}\to m$ for convenience): We have, $u(p) = -\frac{1}{m}\slashed{p}u(p)$, $\bar u(p') = -\frac{1}{m}\bar u(p')\slashed{p}'$. Therefore
\[ \bar u(p')\gamma^\mu u(p) = -\frac{1}{2m}\,\bar u(p')\left(\gamma^\mu\slashed{p} + \slashed{p}'\gamma^\mu\right)u(p) . \quad\text{But,}\quad \gamma^\mu\slashed{p} = p_\nu\gamma^\mu\gamma^\nu = p_\nu\left(-\eta^{\mu\nu} + \frac{1}{2}[\gamma^\mu,\gamma^\nu]\right) \]
\[ \therefore\ \gamma^\mu\slashed{p} = -p^\mu - 2i\Sigma^{\mu\nu}p_\nu\ ,\quad \slashed{p}'\gamma^\mu = -(p')^\mu - 2i\Sigma^{\nu\mu}p'_\nu\ ,\quad \Sigma^{\mu\nu} := \frac{i}{4}[\gamma^\mu,\gamma^\nu]\ ; \]
\[ \therefore\ \bar u(p')\gamma^\mu u(p) = -\frac{1}{2m}\,\bar u(p')\left[-p^\mu - (p')^\mu - 2i\Sigma^{\mu\nu}(p_\nu - p'_\nu)\right]u(p) = \frac{1}{2m}\,\bar u(p')\left[(p+p')^\mu - i\sigma^{\mu\nu}q_\nu\right]u(p)\ ,\quad \sigma^{\mu\nu} := 2\Sigma^{\mu\nu}\ ,\ q := p'-p . \]
Here the antisymmetry $\Sigma^{\nu\mu} = -\Sigma^{\mu\nu}$ has been used to combine the two $\Sigma$ terms.

Eliminating the $(p+p')^\mu$ from the $B(q^2)$ term, and identifying $A + 2m_{ph}B =: F_1(q^2)$ and $2\Sigma^{\mu\nu}B(q^2) =: \sigma^{\mu\nu}\frac{F_2(q^2)}{2m_{ph}}$, we take the general form of $\Gamma^\mu(p,p')$ as,
\[ \Gamma^\mu(p,p') = \gamma^\mu F_1(q^2) + \frac{i}{2m_{ph}}\sigma^{\mu\nu}q_\nu F_2(q^2)\ ,\quad q := p'-p . \]
The $F_i(q^2)$ are called form factors. To the lowest order, $F_1(q^2) = 1$, $F_2(q^2) = 0$. These get corrected at higher orders.

D. Electric Charge and Anomalous Magnetic Moment

The form factor decomposition helps us identify the electric charge and the anomalous magnetic moment of the electron (and other charged fermions). This is seen as follows.

Consider the scattering of an electron off a heavy charged particle. We imagine the scattering to be effected by exchanging a single photon between the heavy charge and the electron. Schematically the amplitude can be written as,
\[ i\mathcal{M} = ie^2\,[\bar u(p')\Gamma^\mu(p',p)u(p)]\,\frac{1}{q^2}\,[\bar u(k')\gamma_\mu u(k)] \]
The first factor is from the electron, the second is the single photon propagator and the third factor is from the heavy charge. The $\Gamma$ vertex has only 1PI diagrams and excludes the photon self energy on the propagator. This represents the modification of the electron response due to the 1PI corrections. The photon self energy would indicate a modification in the electromagnetic field felt by the electron and hence is also referred to as vacuum polarization. This is ignored in this calculation.
In the limit of an infinitely heavy fermion, the second and third factors are replaced by an external classical potential $A_\mu(q)$ and the amplitude is defined as,
\[ i\mathcal{M}\,(2\pi)\,\delta((p'-p)^0) = -ie\,\bar u(p')\Gamma^\mu u(p)\,\tilde A^{cl}_\mu(p'-p) \]
We consider two cases - Coulomb potential and magnetic field - and consider the non-relativistic limit of the scattering.

Coulomb potential: $A^{cl}_\mu(x) = (\phi(\vec x), \vec 0)$. The Coulomb field is static and we also assume it to be slowly varying spatially. This means that we take $\tilde A^{cl}_\mu(q) = 2\pi\delta(q^0)(\tilde\phi(\vec q), \vec 0)$ with the $\tilde\phi(\vec q)$ having support near $\vec q\simeq 0$. We can thus take the limit $q\to0$ of the first factor. This in turn means that only the $F_1$ term contributes. In the non-relativistic limit, $\bar u(p')\gamma^\mu u(p)\to\bar u(p')\gamma^0u(p) = 2m\,u^\dagger(p)u(p)$ and the amplitude takes the form, $i\mathcal{M}\to-ieF_1(0)\,\tilde\phi(\vec q)\,(2m\,u^\dagger(p)u(p))$. Comparing with the Born approximation for scattering off a potential leads to the identification, $V(x) = eF_1(0)\phi(\vec x)$. Thus, electric charge $= eF_1(0)$. Since $F_1(0) = 1$ at the leading order, the radiative corrections to $F_1(q^2)$ should vanish as $q^2\to0$.

Magnetic field: $A^{cl}_\mu = (0, \vec A^{cl}(\vec x))$. Again this is taken to be time independent and spatially slowly varying. Thus we consider the limit $q\to0$. Earlier, we dropped the $F_2$ term, because the $F_1$ term was non-zero. Now however $\bar u(p')\gamma^iu(p)\simeq o(p/m)$ in the non-relativistic limit which is comparable to the $F_2$ term. Recall from the NR limit of the Dirac spinors, $u^T := (u_1^T, u_2^T)$, $(\slashed{p}+m)u(p) = 0\ \Rightarrow\ u_2(p)\simeq-\frac{\vec p\cdot\vec\sigma}{2m}u_1$ and we have the normalization $u^\dagger u(p) = 2m$. We also had,
\[ \bar u(p')\gamma^iu(p)\simeq u_1^\dagger(p')\sigma^iu_2(p) + u_2^\dagger(p')\sigma^iu_1(p)\simeq-\frac{1}{2m}\,u_1^\dagger(p')\{\sigma^i\,\vec p\cdot\vec\sigma + \vec p\,'\cdot\vec\sigma\,\sigma^i\}u_1(p) . \]
Using $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$, we get,
\[ \bar u(p')\,\vec\gamma\cdot\vec A^{cl}\,u(p) = -\frac{1}{2m}\,u_1^\dagger(p')\left[\vec p\cdot\vec A^{cl} + \vec p\,'\cdot\vec A^{cl} - iA^{cl}_i\,\epsilon^{ij}{}_k\,q_j\sigma^k\right]u_1(p) \]
Noting that $q^0 = 0$, the $F_2$ term has $\sigma^{ij}q_j$.
Going over to the two component spinors, we get
\[ \frac{i}{2m}\,\bar u(p')\sigma^{ij}u(p)\,q_j\approx\frac{i}{2m}\,\epsilon^{ij}{}_k\,[u_1^\dagger(p')\sigma^ku_1(p)]\,q_j \]
The contribution of the $u_2$ spinors is negligible in the NR limit. Combining the terms we get in the NR limit,
\[ \bar u(p')\Gamma^iu(p)\,\tilde A^{cl}_i(q)\approx u_1^\dagger(p')\left[\frac{(\vec p+\vec p\,')\cdot\vec A^{cl}}{2m}F_1(0) - \frac{i}{2m}\,\epsilon^{ij}{}_k\,q_jA^{cl}_i\sigma^k\,(F_1(0)+F_2(0))\right]u_1(p) \]
In the non-relativistic Hamiltonian, $(p-eA)^2/(2m)$, we have the $e(p\cdot A + A\cdot p)$ terms which are recovered. The remaining terms however are new. Focusing only on those, we write the amplitude as,
\[ i\mathcal{M} = -ie\,u_1^\dagger(p')\left[-\frac{1}{2m}(F_1(0)+F_2(0))\,\sigma^kB_k\right]u_1(p)\ ,\quad B_k := -i\epsilon^{kij}q_i\tilde A^{cl}_j(q) . \]
This amplitude is interpreted as the Born approximation to a potential scattering with potential $V(x) = -\langle\vec\mu\rangle\cdot\vec B(\vec x)$, with the effective magnetic moment,
\[ \langle\vec\mu\rangle = \frac{e}{m_{ph}}\,[F_1(0)+F_2(0)]\,u_1^\dagger(p)\frac{\vec\sigma}{2}u_1(p) := g\,\frac{e}{2m_{ph}}\,\vec S\ ,\quad g := 2[F_1(0)+F_2(0)] = 2[1+F_2(0)] . \]
$g$ is called the Lande g-factor. Its value being 2 is a prediction of the Dirac equation while $F_2(0)\neq0$ is a prediction of QED. $F_2(0)$ is called the anomalous magnetic moment.

Without actually calculating the higher order corrections, we noted what they mean for the 2-point functions (self-energies). We used Lorentz covariance to get a form factor decomposition of the vertex function and interpreted the form factors by appealing to the NR limit. One of the immediate predictions was the anomalous magnetic moment.

We also note that to the leading order (see diagrams), we have $Z_2^{-1} = 1$, $Z_3^{-1} = 1$, $F_1(q^2) = 1$, $F_2(q^2) = 0$. This is the reason that the tree level calculations that we did, did not have any factors of $Z$'s. This will change when we do explicit evaluations of the leading corrections in the next section.

17. RADIATIVE CORRECTIONS AT 1-LOOP: DIVERGENCES

We now calculate the leading radiative corrections to the self energies and the QED vertex function i.e. the 1-loop contributions to $\Sigma(\slashed{p}), \Pi(q^2), F_1(q^2), F_2(q^2)$. Here are the Feynman diagrams and the corresponding invariant amplitudes.
Fermion self energy [diagram: fermion line of momentum $p$ with a photon loop; internal fermion momentum $k$, photon momentum $p-k$]:
\[ i\Sigma(p) = (ie)^2\int\frac{d^4k}{(2\pi)^4}\,\gamma^\mu\,\frac{-i(-\slashed{k}+m)}{k^2+m^2-i\epsilon}\,\gamma^\nu\,\frac{-i\eta_{\mu\nu}}{(p-k)^2+\mu^2-i\epsilon} \]
Photon self energy (vacuum polarization) [diagram: photon line of momentum $q$ with a fermion loop; loop momenta $k$ and $k+q$]:
\[ i\Pi^{\mu\nu}(q) = (ie)^2(-1)\int\frac{d^4k}{(2\pi)^4}\,Tr\left[\gamma^\mu\,\frac{-i}{\slashed{k}+m}\,\gamma^\nu\,\frac{-i}{\slashed{k}+\slashed{q}+m}\right] \]
QED Vertex function, $\delta\Gamma^\mu := \Gamma^\mu - \gamma^\mu$ [diagram: incoming fermion $p$, outgoing fermion $p'$, external photon $q$; internal photon momentum $k$, internal fermion momenta $p-k$ and $p'-k$]:
\[ \delta\Gamma^\mu(p,p') = (ie)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{-i\eta_{\alpha\beta}}{k^2+\mu^2-i\epsilon}\times\gamma^\alpha\,\frac{-i(-(\slashed{p}'-\slashed{k})+m)}{(p'-k)^2+m^2-i\epsilon}\,\gamma^\mu\,\frac{-i(-(\slashed{p}-\slashed{k})+m)}{(p-k)^2+m^2-i\epsilon}\,\gamma^\beta \]
In the photon self energy expression, the $(-1)$ is due to the fermion loop. A small mass $\mu$ has been added to the photon propagator in anticipation of infrared divergences. In the vertex correction, the expression is understood to be sandwiched between $\bar u(p')$ and $u(p)$.

There are some new points to be noted.

• There is a momentum integration over $k$, unrelated to the external momenta. This $k$ can be space-like, time-like or light-like. In any frame, we may consider the region where $k^\mu\to\infty$, then: (i) the electron self-energy $\sim\int d^4k\,k^\alpha/k^4\sim\int dk$ which is naively linearly divergent; (ii) the photon self-energy $\sim\int d^4k/k^2\sim\int k\,dk$ which is naively quadratically divergent and (iii) the vertex correction $\sim\int d^4k\,k^2/k^6\sim\int dk/k$ which is naively logarithmically divergent.

• There are also other regions of integration space which give divergent contributions, but these will be visible after the integrand is put in a convenient form.

• The $i\epsilon$ in the denominator is quite significant now. It implies that the 4-dimensional integration must be defined with the $k^0$ integration being done first and the spatial integrations done subsequently.

• The naively divergent integrals need to be regularised i.e. a prescription for the integration must be supplied which will manifest the divergence in an explicit form. The divergent contribution so obtained must be subtracted to obtain finite answers. This is the process of renormalization.

• The integrand has a numerator which has tensor/spinor indices while the denominator is a product of scalar factors.
One combines the denominator factors into a single factor form using the so called Feynman parameters. This is followed by the momentum integration with a regularization and a finite part is identified. The integration over the Feynman parameters is done last to get the answer.

We explain these steps now.

The Feynman/Schwinger trick:

Claim:
\[ \frac{1}{A_1\ldots A_n} = (n-1)!\int_0^\infty dx_1\ldots dx_n\,\frac{\delta(\Sigma_ix_i-1)}{(\Sigma_ix_iA_i)^n} . \]
Proof: Observe that $A^{-1} = \int_0^\infty d\alpha\,e^{-\alpha A}$, $Re(A)>0$. Therefore,
\[ \frac{1}{A_1\ldots A_n} = \int_0^\infty d\alpha_1\ldots d\alpha_n\,e^{-\Sigma_i\alpha_iA_i}\cdot1\ ,\quad 1 = \int_0^\infty dt\,\delta(t-\Sigma_i\alpha_i)\quad\text{and}\quad\alpha_i\to tx_i , \]
\[ = \int_0^\infty dt\int_0^\infty dx_1\ldots dx_n\,t^n\,e^{-t\Sigma_ix_iA_i}\,\frac{1}{t}\,\delta(\Sigma_ix_i-1) \]
\[ = \int_0^\infty dx_1\ldots dx_n\,\delta(\Sigma_ix_i-1)\int_0^\infty dt\,t^{n-1}e^{-t(\Sigma_ix_iA_i)} \]
\[ = (n-1)!\int_0^\infty dx_1\ldots dx_n\,\frac{\delta(\Sigma_ix_i-1)}{(\Sigma_ix_iA_i)^n}\qquad\because\ \int_0^\infty dt\,t^{n-1}e^{-t\Lambda} = \Lambda^{-n}\,\Gamma(n) . \]
Note: Since each of the denominator factors is of the form $A_i = (p_i-k)^2 + m_i^2 - i\epsilon$, we see that (using $\Sigma_ix_i = 1$)
\[ \Sigma_ix_iA_i = k^2 - 2k\cdot(\Sigma_ix_ip_i) + \Sigma_ix_i(p_i^2+m_i^2) - i\epsilon = (k-\Sigma_ix_ip_i)^2 + M^2 - i\epsilon\ ,\quad M^2(x_i,p_i,m_i) := \Sigma_ix_i(p_i^2+m_i^2) - (\Sigma_ix_ip_i)^2 . \]
The obvious step is to shift the integration variable, $k\to k+\Sigma_ix_ip_i$, which simplifies the denominator - it has the same form as a single propagator with the same $-i\epsilon$. Let us begin with the fermion self energy.

A. Isolation of Divergence: Fermion Self Energy

\[ i\Sigma(p) = (ie)^2\int\frac{d^4k}{(2\pi)^4}\,\gamma^\mu\,\frac{-i(-\slashed{k}+m)}{k^2+m^2-i\epsilon}\,\gamma^\nu\,\frac{-i\eta_{\mu\nu}}{(p-k)^2+\mu^2-i\epsilon} \]
The Feynman trick gives,
\[ \frac{1}{(k^2+m^2-i\epsilon)((p-k)^2+\mu^2-i\epsilon)} = \int_0^1dx\,\frac{1}{[(k-xp)^2+M^2-i\epsilon]^2}\ , \]
with $M^2 := x(1-x)p^2 + (1-x)m^2 + x\mu^2$.

Shifting $k\to k+xp$, and using $\gamma^\mu\gamma_\mu = -4\,\mathbb{1}$, $\gamma^\mu\slashed{k}\gamma_\mu = 2\slashed{k}$, gives
\[ i\Sigma = e^2\int_0^1dx\int\frac{d^4k}{(2\pi)^4}\,\frac{-4m-2(\slashed{k}+x\slashed{p})}{[k^2+M^2(x,p)-i\epsilon]^2} \]
Recalling that the $k^0$ integration is to be done first and that the poles are at $k^0 = \pm\sqrt{\vec k^2+M^2}\mp i\epsilon$, we rotate the contour anti-clockwise without crossing any singularity. This is equivalent to putting $k^0\to ik^0$ and $k^2\to(k^0)^2+\vec k^2$ = Euclidean vector norm. This sends $\int d^4k\to i\int d^4k$.
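For Euclidean integrals $\int d^4k\,\frac{k^\alpha\gamma_\alpha}{(k^2+M^2)^2} = 0$, since the integrand is odd under $k\to-k$. We are left with,
\[ i\Sigma = -ie^2\int_0^1dx\int\frac{d^4k}{(2\pi)^4}\,\frac{4m+2x\slashed{p}}{(k^2+M^2)^2}\ ,\quad M^2 := x(1-x)p^2 + (1-x)m^2 + x\mu^2 \]
The $k$-integral is logarithmically divergent and needs to be regulated. There are several ways of doing this. One is the Pauli-Villars regularization which subtracts from the photon propagator another identical piece with arbitrarily large mass squared, $\Lambda^2$, i.e.
\[ \frac{1}{(p-k)^2+\mu^2}\ \to\ \frac{1}{(p-k)^2+\mu^2} - \frac{1}{(p-k)^2+\Lambda^2} . \]
For large $k$, both terms go as $k^{-2}$ and cancel each other leaving a finite answer. Thus $i\Sigma\to i\Sigma_{reg} := i\Sigma(\Lambda=\infty) - i\Sigma(\Lambda)$. The same combining of denominators will produce identical terms with $M^2(\mu^2)\to M^2(\Lambda^2)$. The momentum integration becomes,
\[ \int\frac{d^4k}{(2\pi)^4}\left[\frac{1}{(k^2+M_\mu^2)^2} - \frac{1}{(k^2+M_\Lambda^2)^2}\right] = \int\frac{d\Omega_3}{(2\pi)^4}\int_0^\infty dk\,|k|^3\,[\ldots] \]
The angular integration gives: $\Omega_{n-1} = \frac{2\pi^{n/2}}{\Gamma(n/2)} = 2\pi^2$ for $n = 4$ and the integral becomes,
\[ \int\frac{d^4k}{(2\pi)^4}[\cdots-\ldots] = \frac{1}{8\pi^2}\int_0^\infty\frac{dy}{2}\,y\left[\frac{1}{(y+M_\mu^2)^2} - \frac{1}{(y+M_\Lambda^2)^2}\right]\ ,\quad k^2 =: y\ \text{substituted.} \]
\[ = \frac{1}{16\pi^2}\int_0^\infty dy\left[\frac{1}{y+M_\mu^2} - \frac{1}{y+M_\Lambda^2} - \frac{M_\mu^2}{(y+M_\mu^2)^2} + \frac{M_\Lambda^2}{(y+M_\Lambda^2)^2}\right] \]
\[ = \frac{1}{16\pi^2}\left[\ln\left(\frac{y+M_\mu^2}{y+M_\Lambda^2}\right)\Bigg|_0^\infty + \left(\frac{M_\mu^2}{y+M_\mu^2} - \frac{M_\Lambda^2}{y+M_\Lambda^2}\right)\Bigg|_0^\infty\right] \]
\[ \therefore\ \int\frac{d^4k}{(2\pi)^4}[\cdots-\ldots] = \frac{1}{16\pi^2}\ln\left(\frac{M_\Lambda^2}{M_\mu^2}\right) . \]
In the limit $\Lambda\to\infty$, $M_\Lambda^2\to x\Lambda^2$ and we get,
\[ i\Sigma(p) = -i\frac{e^2}{8\pi^2}\int_0^1dx\,(2m+x\slashed{p})\,\ln\left(\frac{x\Lambda^2}{x(1-x)p^2+(1-x)m^2+x\mu^2}\right)\qquad(17.1) \]
Recall that the mass shift as well as the $Z_2$ are obtained from the self energy as,
\[ \delta m := m_{ph}-m = -\Sigma(\slashed{p} = -m_{ph})\ ,\quad Z_2^{-1} = 1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\Big|_{\slashed{p}=-m_{ph}} . \]
Since the self energy already has an explicit factor of $e^2$, in the integrand we can use $\slashed{p} = -m_{ph}\approx-m$.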
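The Pauli-Villars regulated radial integral obtained above, $\int_0^\infty dy\,y\,[(y+M_\mu^2)^{-2} - (y+M_\Lambda^2)^{-2}] = \ln(M_\Lambda^2/M_\mu^2)$, can be checked numerically. The following is a minimal sketch, assuming illustrative values for the two mass parameters and a simple midpoint rule after mapping $y\in[0,\infty)$ to $t\in[0,1)$ via $y = t/(1-t)$; the function name is hypothetical.

```python
import math


def pv_radial_integral(m_mu_sq, m_lambda_sq, n_steps=200_000):
    """Midpoint-rule estimate of
    int_0^inf dy  y * [ (y + M_mu^2)^-2 - (y + M_Lambda^2)^-2 ].

    The half line is mapped to the unit interval with y = t / (1 - t),
    for which dy = dt / (1 - t)^2.
    """
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        y = t / (1.0 - t)
        jacobian = 1.0 / (1.0 - t) ** 2
        integrand = y * (1.0 / (y + m_mu_sq) ** 2 - 1.0 / (y + m_lambda_sq) ** 2)
        total += integrand * jacobian
    return total * h


if __name__ == "__main__":
    m_mu_sq, m_lambda_sq = 1.0, 100.0   # illustrative values only
    numeric = pv_radial_integral(m_mu_sq, m_lambda_sq)
    exact = math.log(m_lambda_sq / m_mu_sq)
    print(f"numeric = {numeric:.6f}, ln(M_Lambda^2/M_mu^2) = {exact:.6f}")
```

Each of the two terms separately diverges logarithmically at large $y$; only their difference is finite, which is why the integrand is evaluated as a difference before summing.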
The derivative of $\Sigma$ is obtained as,
\[ \frac{d\Sigma}{d\slashed{p}}\Big|_{\slashed{p}=-m_{ph}} = -\frac{\alpha}{2\pi}\int_0^1dx\left[x\,\ln\left(\frac{x\Lambda^2}{-x(1-x)m^2+(1-x)m^2+x\mu^2}\right) + (2m-x\cdot m)\,\frac{(-1)\,(-2x(1-x)(-m))}{-x(1-x)m^2+(1-x)m^2+x\mu^2}\right] \]
Simplification leads to,
\[ \frac{d\Sigma}{d\slashed{p}}\Big|_{\slashed{p}=-m_{ph}} = \frac{\alpha}{2\pi}\int_0^1dx\left[-x\,\ln\left(\frac{x\Lambda^2}{(1-x)^2m^2+x\mu^2}\right) + 2(2-x)\,\frac{x(1-x)m^2}{(1-x)^2m^2+x\mu^2}\right]\qquad(17.2) \]
\[ \delta m = -\Sigma(\slashed{p} = -m_{ph}) = \frac{\alpha}{2\pi}\,m\int_0^1dx\,(2-x)\,\ln\left(\frac{x\Lambda^2}{(1-x)^2m^2+x\mu^2}\right)\qquad(17.3) \]
The integrands are well behaved at both end points so the integrals are finite, and the leading contribution is given by the $\Lambda^2$ dependent part alone; for dimensional reasons we divide $\Lambda^2$ by the fermion mass squared. Thus, the divergent contributions as $\Lambda\to\infty$ take the form,
\[ \delta m\simeq\frac{\alpha}{2\pi}\,m\int_0^1dx\,(2-x)\,\ln(\Lambda^2/m^2) = \frac{3\alpha}{4\pi}\,m\,\ln(\Lambda^2/m^2) \]
\[ Z_2^{-1}\simeq1-\frac{\alpha}{2\pi}\int_0^1dx\,(-x\,\ln(\Lambda^2/m^2)) = 1+\frac{\alpha}{4\pi}\ln(\Lambda^2/m^2)\ \Rightarrow\ Z_2\approx1-\frac{\alpha}{4\pi}\ln(\Lambda^2/m^2) \]
The logarithmic divergence is thus manifested in terms of $\ln(\Lambda^2/m^2)$. There are of course finite pieces, but until we take care of the divergent parts the finite parts are irrelevant.

B. Isolation of Divergence: Photon Self-Energy

\[ i\Pi^{\mu\nu}(q) = -e^2\int\frac{d^4k}{(2\pi)^4}\,\frac{Tr[\gamma^\mu(-\slashed{k}+m)\gamma^\nu(-\slashed{k}-\slashed{q}+m)]}{(k^2+m^2-i\epsilon)((k+q)^2+m^2-i\epsilon)} \]
Using the trace formulae for the Dirac matrices (14.4), we get

Numerator:
\[ Tr[\gamma^\mu\slashed{k}\gamma^\nu(\slashed{k}+\slashed{q}) + m^2\gamma^\mu\gamma^\nu] = 4[k^\mu(k+q)^\nu - \eta^{\mu\nu}k\cdot(k+q) + (k+q)^\mu k^\nu - m^2\eta^{\mu\nu}] = 4[2k^\mu k^\nu - \eta^{\mu\nu}(k^2+m^2) + (k^\mu q^\nu + k^\nu q^\mu - \eta^{\mu\nu}k\cdot q)] \]
Denominator:
\[ \frac{1}{k^2+m^2-i\epsilon}\,\frac{1}{(k+q)^2+m^2-i\epsilon} = \int_0^1dx\,\frac{1}{[(k+xq)^2+x(1-x)q^2+m^2]^2}\ ,\quad\text{shift}\ k\to k-xq . \]
The shift generates additional terms linear in $k$ in the numerator. Under integration, these terms drop out and we are left with,
\[ i\Pi^{\mu\nu}(q) = (-4e^2)\int_0^1dx\int\frac{d^4k}{(2\pi)^4}\,\frac{\{2k^\mu k^\nu + 2x^2q^\mu q^\nu - \eta^{\mu\nu}(k^2+x^2q^2+m^2) - x(2q^\mu q^\nu - \eta^{\mu\nu}q^2)\}}{(k^2+M^2-i\epsilon)^2} \]
where, $M^2 = x(1-x)q^2+m^2$.
Doing the Wick rotation as before,
\[ i\Pi^{\mu\nu}(q) = (-4e^2)\,i\int_0^1dx\int\frac{d^4k}{(2\pi)^4}\,\frac{2k^\mu k^\nu - 2x(1-x)q^\mu q^\nu + x(1-x)\eta^{\mu\nu}q^2 - \eta^{\mu\nu}(k^2+m^2)}{(k^2+M^2)^2} \]
The tensorial integral is,
\[ \int\frac{d^4k}{(2\pi)^4}\,\frac{k^\mu k^\nu}{(k^2+M^2)^2} = A\,\delta^{\mu\nu}\ ,\quad\text{with}\quad A = \frac{1}{4}\int\frac{d^4k}{(2\pi)^4}\,\frac{k^2}{(k^2+M^2)^2} \]
There are several problems to be faced now. First, the integral is quadratically divergent and the shift in the momentum variable is unjustified. Second, we have now $\delta^{\mu\nu}$ and $\eta^{\mu\nu}$ in the numerator, confusing Lorentz covariance. Third, naively we see a quadratically divergent piece with coefficient $\eta^{\mu\nu}$ but only a logarithmically divergent one with coefficient $q^\mu q^\nu$, casting doubts on the possibility of gauge invariance.

We could try the Pauli-Villars regularization, but we will introduce a different one, the dimensional regularization. The Pauli-Villars treatment may be seen in [13]. The dimensional regularization has the main virtue of keeping the Ward identity satisfied, even for non-abelian gauge theories.

The idea is to think of the perturbation theory as a member of a class of similar theories formulated in general $n$ dimensions. One chooses a value of $n$ where the integrals are well defined ($n<4$) and analytically continues $n\to4$. This isolates the original divergence in a particular form (poles) providing the needed regularization. Since the mass/length dimensions also depend on the space-time dimension, we need the coupling constants to be dimensionful. To maintain space-time covariance, we need to regard the external momenta and the $\gamma$ matrices etc also to be $n$-dimensional. We set $\epsilon := 4-n$ and consider the limit $\epsilon\to0$. Since the integrals are over Euclidean momenta due to the Wick rotation, the external momenta and the metric tensor need to be continued back to Minkowski signature at the end of the calculations. In this regularization scheme, the following rules are adopted.
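\[ \int\frac{d^nk}{(2\pi)^n}\,f(|k|^2) = \int\frac{d\Omega_{n-1}}{(2\pi)^n}\int dk\,|k|^{n-1}f(|k|^2)\ ,\quad\int d\Omega_{n-1} = \frac{2\pi^{n/2}}{\Gamma(n/2)}\qquad(17.4) \]
\[ \therefore\ \int\frac{d^nk}{(2\pi)^n}\,f(|k|^2) = \frac{2\pi^{n/2}}{\Gamma(n/2)}\,\frac{1}{(2\pi)^n}\int dk\,k^{n-1}f(k^2)\qquad(17.5) \]
\[ \int_0^\infty dk\,\frac{k^{n-1}}{(k^2+M^2)^\alpha} = \frac{1}{2}\int_0^\infty dx\,\frac{x^{\frac{n}{2}-1}}{(x+M^2)^\alpha}\ ,\quad\text{put}\ y = \frac{M^2}{x+M^2} \]
\[ = \frac{M^{n-2\alpha}}{2}\int_0^1dy\,y^{\alpha-1-\frac{n}{2}}(1-y)^{\frac{n}{2}-1} = \frac{M^{n-2\alpha}}{2}\,\beta(\alpha-n/2,\,n/2) \]
\[ \therefore\ \int_0^\infty dk\,\frac{k^{n-1}}{(k^2+M^2)^\alpha} = \frac{M^{n-2\alpha}}{2}\,\frac{\Gamma(\alpha-n/2)\,\Gamma(n/2)}{\Gamma(\alpha)}\qquad(17.6) \]
and
\[ \int\frac{d^nk}{(2\pi)^n}\,\frac{1}{(k^2+M^2)^\alpha} = \frac{1}{(4\pi)^{n/2}}\,\frac{\Gamma(\alpha-n/2)}{\Gamma(\alpha)}\,(M^2)^{n/2-\alpha}\qquad(17.7) \]
The $\Gamma$ function has isolated poles at $\alpha-n/2 = 0, -1, -2, \ldots$ i.e. at $n = 2(\alpha+m)$, $m = 0, 1, \ldots$. The Gamma function has the expansion $\Gamma(\epsilon/2) = 2/\epsilon - \gamma + o(\epsilon)$ where $\gamma = 0.57721\ldots$ is the Euler-Mascheroni constant.

Since finite parts can be affected, we need to keep $\epsilon$ in various places too, e.g. $\{\gamma^\mu,\gamma^\nu\} = -2\delta^{\mu\nu}\mathbb{1}$, $Tr(\mathbb{1}) = n = 4-\epsilon$, $\gamma^\mu\gamma^\nu\gamma_\mu = (2-\epsilon)\gamma^\nu$. These terms will give contributions from the $1/\epsilon$ terms.

The coupling constant dimensions work as follows. $[\Psi] = (n-1)/2$, $[A_\mu] = (n-2)/2\ \Rightarrow\ [e] = n-(n-1)-(n/2-1) = \epsilon/2$. Therefore we write $e\to e\mu_0^{\epsilon/2}$ where $\mu_0$ is an arbitrary mass scale which should disappear from physical quantities. The dimensions are in mass units.

The tensorial integral will now follow from $k^\mu k^\nu\to\frac{1}{n}\delta^{\mu\nu}k^2$. We use these in evaluating the $\Pi^{\mu\nu}(q)$.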
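The radial formula (17.6) and the $n$-dimensional formula (17.7) can be checked numerically for values of $n$ and $\alpha$ where the integrals converge. The sketch below is an illustration under stated assumptions: the sample values $(n,\alpha,M)$ are chosen for convenience, the midpoint rule with the map $k = t/(1-t)$ is just one simple way to do the integral, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def radial_numeric(n, alpha, mass, n_steps=100_000):
    """Midpoint-rule estimate of int_0^inf dk k^(n-1) / (k^2 + M^2)^alpha,
    using k = t / (1 - t) to map the half line to the unit interval."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        k = t / (1.0 - t)
        jacobian = 1.0 / (1.0 - t) ** 2
        total += k ** (n - 1) / (k * k + mass * mass) ** alpha * jacobian
    return total * h


def radial_formula(n, alpha, mass):
    """Right hand side of (17.6)."""
    return (0.5 * mass ** (n - 2 * alpha)
            * math.gamma(alpha - n / 2) * math.gamma(n / 2) / math.gamma(alpha))


def euclidean_integral_formula(n, alpha, mass):
    """Right hand side of (17.7)."""
    return ((mass * mass) ** (n / 2 - alpha) * math.gamma(alpha - n / 2)
            / ((4 * math.pi) ** (n / 2) * math.gamma(alpha)))


if __name__ == "__main__":
    for n, alpha, mass in [(3, 2, 1.5), (2, 2, 1.0), (3, 3, 2.0)]:
        print(n, alpha, mass,
              radial_numeric(n, alpha, mass), radial_formula(n, alpha, mass))
    # Leading behaviour of the Gamma function near its pole, Gamma(eps/2) ~ 2/eps - gamma
    eps = 1.0e-3
    print(math.gamma(eps / 2), 2.0 / eps - EULER_GAMMA)
```

For $n = 3$, $\alpha = 2$ the formula (17.7) reduces to $1/(8\pi M)$, which gives an independent closed-form cross-check of the angular and radial factors.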
174 µν Π (q) = (−ne2 µ0 ) 1 Z −2x(1 − x)q µ q ν + δ µν (x(1 − x)q 2 − m2 ) × 0 Z Z dn k 1 dn k k 2 + M 2 − M 2 µν + δ (2/n − 1) , where, (2π)n (k 2 + M 2 )2 (2π)n (k 2 + M 2 )2 dx M 2 = x(1 − x)q 2 + m2   Z Z  dn k dn k 1 k2  µν 2 2 (−ne µ0 dx δ −1 + ×  n (2π)n k 2 + M 2 (2π)n (k 2 + M 2 )2 0  {z } | | {z } 1 2 2 2 µ ν µν 2 2 −m − x(1 − x)q − −2x(1 − x)q q + δ −1 M n 2 −/2 M Γ(/2) 1 Γ(1 − n/2) M2 2 n/2−1 (M ) = n/2 2 (4π) Γ(1) (4π) 4π (/2 − 1) 2 −/2 1 Γ(2 − n/2) 1 M (M 2 )n/2−2 = Γ(/2) n/2 2 (4π) Γ(1) (4π) 4π M2 n n 2 2 , −1=1− = −1 /2 − 1 2 2 2 n Z 1 2 M2 µν 2 dx δ (−ne µ0 ) −1 n 2 n ( − 1) 0 2 n 2 µν 2 2 2 µ ν +δ −m − − 1 M + x(1 − x)q − 2x(1 − x)q q × 2 n         2  2 2 µν 2 2 2 2 δ M −m − M − M 1 + x(1 − x)q − 2x(1 − x)q µ q ν n  n    | {z }   Z = 1 = 2 = ∴ 1 = Πµν (q) = But, [. . . ] = 1 2x(1−x)q 2 = 2x(1 − x)(q 2 δ µν − q µ q ν ) ⇒ Πµν (q) = (q 2 δ µν − q µ q ν )Π(q 2 ) with, 2 −/2 Z 1 1 M 2 2 Π(q ) := −ne µ0 dx Γ(/2)(2x)(1 − x) 2 (4π) 4π 0 Z 2(4 − )e2 1 2 4πµ20 dx x(1 − x) −γ ln +1 = − (4π)2 M2 2 0 Using a/2 = 1 + 2 ln(a) + o(2 ), we write, 2α Π(q ) = − π 2 Z 1 dx x(1 − x) 0 2 −γ 4πµ20 1 + ln 1− 2 M2 4 175 (17.8) Notice that for q 2 , M 2 = m2 and 2α 2 Π(0) = − π 1 Z dx(x − x2 ) + constant = − 0 2α 1 2α , ⇒ Z3 = 1 − 3π 3π (17.9) We see the divergence in Π(0), Z3 and there is of course no mass shift. We return to the interpretation of Π(q 2 ) later. Remark: The −1 pole in dimensional regularization corresponds to logarithmic diverR gence in Pauli-Villars. this is indicated by comparing the regulated integral [(k 2 + M 2 )−2 − (k 2 Λ2 )−2 ] in Pauli-Villars, which goes over to ization, our integral 2 goes as 1 2 . 4π 2 1 ln(xΛ2 /M 2 ). 4π 2 In the dimensional regular- Hence, −1 ↔ ln(Λ/M ). Thanks to the dimensional regularization maintaining the Ward identity, the q 2 δ µν − q µ q ν got pulled out and the superficially quadratically divergent integral go converted to a logarithmically divergent one. C. 
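Isolation of Divergence: Vertex function

We compute $\delta\Gamma^\mu = \Gamma^\mu - \gamma^\mu$, with $p^2 = (p')^2 = -m^2$; sandwiching between $\bar u(p')$ and $u(p)$ is implicit.

$$\delta\Gamma^\mu(p,p') = -ie^2\int\frac{d^4k}{(2\pi)^4}\,\frac{\eta_{\alpha\beta}}{k^2+\mu^2-i\epsilon}\,\gamma^\alpha\,\frac{(-(\slashed{p}'-\slashed{k})+m)}{(p'-k)^2+m^2-i\epsilon}\,\gamma^\mu\,\frac{(-(\slashed{p}-\slashed{k})+m)}{(p-k)^2+m^2-i\epsilon}\,\gamma^\beta$$

We have three denominators and the Feynman trick leads to,

$$\frac{1}{k^2+\mu^2-i\epsilon}\,\frac{1}{(p-k)^2+m^2-i\epsilon}\,\frac{1}{(p'-k)^2+m^2-i\epsilon} = \int_0^1 dx\int_0^1 dy\int_0^1 dz\ \delta(1-x-y-z)\,\frac{2}{[(k-xp-yp')^2+M^2-i\epsilon]^3}$$

where, $M^2 := x(p^2+m^2) + y((p')^2+m^2) + z\mu^2 - (xp+yp')^2$

$$\therefore\ M^2 = 0 + 0 + z\mu^2 + m^2(x+y)^2 + q^2xy$$

Since $q^2 > 0$ (space-like) for this diagram, $M^2 = (x+y)^2m^2 + xyq^2 + z\mu^2$ is manifestly positive. As usual, shifting the momentum $k \to k + xp + yp'$, doing the Wick rotation and dropping the terms linear in $k$ in the numerator, we get,

$$\delta\Gamma^\mu(p,p') = e^2\int_0^1 dx\int_0^1 dy\int_0^1 dz\ \delta(1-x-y-z)\int\frac{d^4k}{(2\pi)^4}\,\frac{2}{[k^2+M^2]^3}\times Nr$$

$$Nr = \gamma^\alpha\{(\slashed{p}'-\slashed{k}-x\slashed{p}-y\slashed{p}')\gamma^\mu(\slashed{p}-\slashed{k}-x\slashed{p}-y\slashed{p}')\}\gamma_\alpha + m^2\gamma^\alpha\gamma^\mu\gamma_\alpha - m\gamma^\alpha\{\gamma^\mu(\slashed{p}-x\slashed{p}-y\slashed{p}') + (\slashed{p}'-x\slashed{p}-y\slashed{p}')\gamma^\mu\}\gamma_\alpha + \text{linear in } k$$

Now we need to use the identities,

$$\gamma^\alpha\gamma^\mu\gamma_\alpha = -2\eta^{\alpha\mu}\gamma_\alpha - \gamma^\mu(-4) = 2\gamma^\mu \tag{17.10}$$

$$\gamma^\alpha\gamma^\rho\gamma^\sigma\gamma_\alpha = 4\eta^{\rho\sigma} \tag{17.11}$$

$$\gamma^\alpha\gamma^\rho\gamma^\mu\gamma^\sigma\gamma_\alpha = 2\gamma^\sigma\gamma^\mu\gamma^\rho \tag{17.12}$$

$$\therefore\ \gamma^\alpha\slashed{k}\gamma^\mu\slashed{k}\gamma_\alpha = 2\slashed{k}\gamma^\mu\slashed{k} = -4k^\mu\slashed{k} + 2\gamma^\mu k^2 \qquad \because\ \slashed{k}\slashed{k} = -k^2 \tag{17.13}$$

$$\gamma^\alpha\{\ldots\}\gamma^\mu\{\ldots\}\gamma_\alpha = 2\{(1-x)\slashed{p}-y\slashed{p}'\}\gamma^\mu\{(1-y)\slashed{p}'-x\slashed{p}\} \tag{17.14}$$

$$\gamma^\alpha\{\gamma^\mu(\ldots)+(\ldots)\gamma^\mu\}\gamma_\alpha = 4\{(1-x)p^\mu - y(p')^\mu + (1-y)(p')^\mu - xp^\mu\} \tag{17.15}$$

Using these, the numerator takes the form,

$$\bar u(p')(Nr)u(p) = \bar u(p')\Big[\underbrace{[-4k^\mu\slashed{k}+2k^2\gamma^\mu]}_{\text{①}} + \underbrace{2[\{(1-x)\slashed{p}-y\slashed{p}'\}\gamma^\mu\{(1-y)\slashed{p}'-x\slashed{p}\}]}_{\text{②}} + \underbrace{2[m^2\gamma^\mu]}_{\text{③}} - \underbrace{4m[(1-2x)p^\mu + (1-2y)(p')^\mu]}_{\text{④}}\Big]u(p)$$

Observe that $\slashed{p}u(p) = -mu(p)$, $\bar u(p')\slashed{p}' = -m\bar u(p')$. In ①, $\int d^4k\, k^\mu\slashed{k} \propto \gamma^\mu$ (after Pauli-Villars). We simplify ② as,

$$\frac{1}{2}\bar u(p')\,\text{②}\,u(p) = \bar u(p')\big[(1-x)(1-y)\slashed{p}\gamma^\mu\slashed{p}' + xym^2\gamma^\mu + x(1-x)m\slashed{p}\gamma^\mu + y(1-y)m\gamma^\mu\slashed{p}'\big]u(p)$$

with $\slashed{p}\gamma^\mu \to -2p^\mu - (-m)\gamma^\mu$, $\gamma^\mu\slashed{p}' \to -2(p')^\mu + m\gamma^\mu$ between the spinors.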
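The contraction identities (17.10)-(17.12) depend on the sign conventions $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$ with $\eta = \mathrm{diag}(-1,1,1,1)$, so a direct numerical check is a useful safeguard. The sketch below is not from the notes: it assumes the Dirac representation of the gamma matrices (which happens to satisfy this Clifford algebra) and uses hypothetical helper names; only the Python standard library is needed.

```python
ETA = (-1, 1, 1, 1)  # diagonal of the mostly-plus metric eta^{mu nu}


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def scaled(c, a):
    return [[c * x for x in row] for row in a]


def added(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def same(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def blocks(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


def neg2(m):
    return [[-x for x in row] for row in m]


I2 = [[1, 0], [0, 1]]
O2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

IDENTITY = blocks(I2, O2, O2, I2)
# Dirac representation (an assumption of this sketch): gamma^0 and gamma^i.
GAMMA_UP = [blocks(I2, O2, O2, neg2(I2))] + [blocks(O2, s, neg2(s), O2) for s in PAULI]
# Lower the index with the diagonal metric: gamma_alpha = eta_{alpha alpha} gamma^alpha.
GAMMA_DOWN = [scaled(ETA[a], GAMMA_UP[a]) for a in range(4)]


def contract(middle):
    """Return gamma^alpha * middle * gamma_alpha, summed over alpha."""
    total = [[0] * 4 for _ in range(4)]
    for a in range(4):
        total = added(total, matmul(matmul(GAMMA_UP[a], middle), GAMMA_DOWN[a]))
    return total


def check_clifford():
    """{gamma^mu, gamma^nu} = -2 eta^{mu nu}."""
    return all(
        same(added(matmul(GAMMA_UP[m], GAMMA_UP[n]), matmul(GAMMA_UP[n], GAMMA_UP[m])),
             scaled(-2 * (ETA[m] if m == n else 0), IDENTITY))
        for m in range(4) for n in range(4))


def check_17_10():
    return all(same(contract(g), scaled(2, g)) for g in GAMMA_UP)


def check_17_11():
    return all(
        same(contract(matmul(GAMMA_UP[r], GAMMA_UP[s])),
             scaled(4 * (ETA[r] if r == s else 0), IDENTITY))
        for r in range(4) for s in range(4))


def check_17_12():
    return all(
        same(contract(matmul(matmul(GAMMA_UP[r], GAMMA_UP[m]), GAMMA_UP[s])),
             scaled(2, matmul(matmul(GAMMA_UP[s], GAMMA_UP[m]), GAMMA_UP[r])))
        for r in range(4) for m in range(4) for s in range(4))
```

Since the identities are representation independent, passing the checks in one representation confirms the signs used above; (17.13)-(17.15) then follow by contracting (17.11) and (17.12) with momenta.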
Thus we see that, between the spinors, all the terms in the numerator can be put in the form of $(p+p')^\mu[\ldots] + \gamma^\mu[\ldots]$ and therefore the integral can be arranged as contributing to the form factors $F_1$, $F_2$. The intermediate steps, including the Pauli-Villars regularization, are left as an exercise. The result is (from $\delta\Gamma^\mu$):

$$F_1(q^2) - 1 = \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\ \delta(1-x-y-z)\Big[\ln\frac{z\Lambda^2}{M^2} + \frac{1}{M^2}\big(-(1-x)(1-y)q^2 + (1-4z+z^2)m^2\big)\Big] \tag{17.16}$$

$$F_2(q^2) = \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\ \delta(1-x-y-z)\,\frac{2m^2z(1-z)}{M^2}\ , \quad\text{with } M^2 := q^2xy + m^2(1-z)^2 + z\mu^2 \tag{17.17}$$

(The sign of the $q^2$ term in (17.16) is written so as to agree with eq. (18.6) and with the numerator $-2m^2 - q^2$ used in the infrared analysis below.)

Note:

(i) In the $\delta F_1$, there is a UV divergence ($\Lambda \to \infty$) coming from the large $k$ region, even for $q^2 = 0$. This violates the $\delta F_1(0) = 0$ condition and thus changes the electric charge. The condition can be naturally restored by defining $\delta F_{1,ren}(q^2) := \delta F_1(q^2) - \delta F_1(0)$. This is a renormalization prescription.

(ii) The $\delta F_1$ also has an “infrared divergence” ($\mu^2 \to 0$) at $q^2 = 0$ coming from the $M^{-2}(0) = m^{-2}(1-z)^{-2}$ from the $z \to 1$ region.

(iii) The $F_2(q^2)$ has no divergences. Its value at $q^2 = 0$ gives the anomalous magnetic moment and can be evaluated explicitly as,

$$F_2(0) = \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\ \delta(1-x-y-z)\,\frac{2z}{1-z} = \frac{\alpha}{\pi}\int_0^1 dz\int_0^{1-z} dy\,\frac{z}{1-z} = \frac{\alpha}{2\pi} \quad\Rightarrow\quad \frac{g-2}{2} = \frac{\alpha}{2\pi} \tag{17.18}$$

One of the triumphs of QED!

The renormalized $F_1(q^2)$ is obtained as,

$$F_{1,ren}(q^2) = 1 + \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\ \delta(1-x-y-z)\Big[\ln\frac{m^2(1-z)^2}{m^2(1-z)^2+q^2xy} + \frac{m^2(1-4z+z^2) - q^2(1-x)(1-y)}{m^2(1-z)^2+q^2xy+\mu^2z} - \frac{m^2(1-4z+z^2)}{m^2(1-z)^2+\mu^2z}\Big] \tag{17.19}$$

We have to face the divergences now. All three radiative corrections have UV divergences while the vertex function has an IR divergence as well. For dealing with the IR divergence, we need to look at another process called “Bremsstrahlung” (German for braking radiation).

D. Bremsstrahlung Cross-section to o(α)

We know from classical electrodynamics that an accelerated charge radiates.
A quantum mechanical view of such radiation is depicted in the diagrams below.

[Figure: two scattering diagrams with a photon of momentum $k$ emitted from the upper electron line, before and after the exchange with the lower line, plus emission of any number of photons.]

The diagrams depict a scattering process which causes the top electron to ‘accelerate’ and ‘emit’ an (on-shell) photon. This is the QFT view of ‘radiation from an accelerated charge’. The bottom charged line represents some heavy fermion/boson/source, while the connecting wavy line denotes mediation of the interaction. The first diagram suggests photon emission before the kick while the second diagram suggests photon emission after the kick. As per the rules of the relativistic QFT, both contributions are to be added and mod-squared to get to the cross-section. To the leading order, i.e. single photon emission, the cross-section is of order $\alpha$. This cross-section also has an IR divergence as $k \to 0$. An on-shell photon with arbitrarily small momentum is called a ‘soft photon’. To isolate the divergence, focus on the following diagrams:

[Figure: two diagrams for single soft-photon emission; external electron momenta $p$, $p'$, photon momentum $k$, internal electron momenta $p-k$ and $p'+k$, with the interaction with the external source shown as a blob.]

We are computing the amplitude for the process $e \to e + \gamma$. Let $\mathcal{M}_0(k',l')$ denote the part of the amplitude of the electron interacting with the external source; it is depicted as the blob. Here $k'$, $l'$ denote the out-going and in-coming momenta respectively. The full amplitude is then given by (polarizations $\varepsilon(\vec k,\lambda)$ taken to be real),

$$i\mathcal{M}(p',p,k) = (ie)\,\bar u(p')\Big[\mathcal{M}_0(p',p-k)\,\frac{-i(-(\slashed{p}-\slashed{k})+m)}{(p-k)^2+m^2-i\epsilon}\,\gamma^\mu\varepsilon_\mu + \varepsilon_\mu\gamma^\mu\,\frac{-i(-(\slashed{p}'+\slashed{k})+m)}{(p'+k)^2+m^2-i\epsilon}\,\mathcal{M}_0(p'+k,p)\Big]u(p)$$

To compare with the classical radiation formula, consider the limit wherein the external source interaction approximates as $\mathcal{M}_0(p',p-k) \approx \mathcal{M}_0(p'+k,p) \approx \mathcal{M}_0(p',p)$. Next, since the photon is soft, we neglect the $\slashed{k}$ in the fermion numerators. The denominators simplify as $(p-k)^2+m^2 = -m^2 - 2p\cdot k + 0 + m^2 = -2p\cdot k$ and likewise, $(p'+k)^2+m^2 = +2p'\cdot k$. Furthermore, $(-\slashed{p}+m)\gamma^\mu u(p) \to 2p^\mu u(p)$, and $\bar u(p')\gamma^\mu(-\slashed{p}'+m) \to \bar u(p')\,2(p')^\mu$.
Thus the amplitude takes the form,

$$i\mathcal{M}(p',p,k) = e\,\underbrace{[\bar u(p')\mathcal{M}_0(p',p)u(p)]}_{\text{elastic scattering amplitude}}\times\underbrace{\Big[-\frac{\varepsilon\cdot p}{p\cdot k} + \frac{\varepsilon\cdot p'}{p'\cdot k}\Big]}_{\text{extra } k \text{ dependent factor}}$$

The total, unpolarized cross-section is obtained as,

$$d\sigma(p\to p'+k) = d\sigma(p\to p')\int\frac{d^3k}{2|\vec k|(2\pi)^3}\sum_{\lambda=1,2}e^2\Big|\frac{\varepsilon\cdot p'}{p'\cdot k} - \frac{\varepsilon\cdot p}{p\cdot k}\Big|^2 \tag{17.20}$$

The integration is over soft photons, i.e. $|\vec k| < k_{max}$. Notice that the integral is dimensionless. The integrand is the probability density of radiating a photon of momentum $k$ within $d^3k$ and any transverse polarization, accompanying the electron scattering with $p \to p'$ (“acceleration kick”, $p' \neq p + k$), while the integral is the total probability of emission of a soft photon with energy up to $k_{max}$. Multiplying the integrand by the photon energy $|\vec k|$ gives the expectation value of the energy carried away by soft photons up to energy $k_{max}$,

$$E_{soft}(k_{max}) := \langle E\rangle = \int\frac{d^3k}{(2\pi)^3}\sum_\lambda\frac{e^2}{2}\Big|\vec\varepsilon\cdot\Big(\frac{\vec p\,'}{p'\cdot k} - \frac{\vec p}{p\cdot k}\Big)\Big|^2$$

The polarization sum in the cross-section can be done as follows. Noting that $k\cdot\big(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\big) = 0$, we can drop the $k\tilde k$ terms in the completeness relation for the polarizations, effectively taking $\Sigma_\lambda\varepsilon_\mu\varepsilon_\nu = \eta_{\mu\nu}$. Hence,

$$\sum_\lambda\Big(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\Big)^\mu\varepsilon_\mu\Big(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\Big)^\nu\varepsilon_\nu = \Big(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\Big)\cdot\Big(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\Big)\ ,$$

and, using $p^2 = (p')^2 = -m^2$, the expectation value of energy carried by soft photons is given by,

$$E_{soft}(k_{max}) = \int\frac{d^3k}{(2\pi)^3}\,\frac{e^2}{2}\Big[-\frac{m^2}{(p'\cdot k)^2} - \frac{m^2}{(p\cdot k)^2} - 2\,\frac{p\cdot p'}{(p\cdot k)(p'\cdot k)}\Big] \tag{17.21}$$

Note that the integral represents the average radiated energy inferred from the quantum mechanical probability distribution for emission of a single soft photon with $|\vec k| < k_{max}$.

An Aside: Incidentally, the integrand in the r.h.s. is also the classical formula for energy carried away by the $k$-th Fourier mode of the electromagnetic field. To see this, consider a classical trajectory $z^\mu(\tau) = (p^\mu/m)\,\tau$ for $\tau < 0$ and $z^\mu(\tau) = ((p')^\mu/m)\,\tau$ for $\tau > 0$.
The corresponding current is,

$$J^\mu(x) = e\int_0^\infty d\tau\,\frac{(p')^\mu}{m}\,\delta^4(x-z(\tau)) + e\int_{-\infty}^0 d\tau\,\frac{p^\mu}{m}\,\delta^4(x-z(\tau)) \quad\Rightarrow$$

$$A^\mu(\vec k) = -\frac{e}{|\vec k|}\Big(\frac{(p')^\mu}{p'\cdot k} - \frac{p^\mu}{p\cdot k}\Big) \quad\text{(using the Retarded Green function.)}$$

The energy of the above radiation field is precisely as given by the integrand in the r.h.s. of eq.(17.21) [13]. The integral now represents the total energy carried by Fourier modes with momenta $|\vec k| < k_{max}$. The integral, $E_{soft}(k_{max})$, thus has two different interpretations.

Returning to the evaluation of the integral in eq.(17.21), let us choose a frame in which the initial and final energies of the electron are the same: $E' = E \Rightarrow (p')^0 = p^0$, and we write $p = E(1,\vec v)$, $p' = E(1,\vec v\,')$, $k = (k := |\vec k|, \vec k)$. The mass-shell condition $p^2 = -m^2$ is then equivalent to $m^2/E^2 = 1 - \vec v^{\,2}$. With this choice, $p\cdot k = Ek(-1+\hat k\cdot\vec v)$, $p'\cdot k = Ek(-1+\hat k\cdot\vec v\,')$ and $p\cdot p' = E^2(-1+\vec v\cdot\vec v\,')$. The $k^2$ from the $d^3k$ cancels and the integrated emitted energy takes the form,

$$E_{soft}(k_{max}) = \frac{e^2}{2}\int dk\int\frac{d\Omega}{8\pi^3}\Big[\frac{2(1-\vec v\cdot\vec v\,')}{(1-\hat k\cdot\vec v)(1-\hat k\cdot\vec v\,')} - \frac{m^2}{E^2}\Big(\frac{1}{(1-\hat k\cdot\vec v)^2} + \frac{1}{(1-\hat k\cdot\vec v\,')^2}\Big)\Big]$$

$$I(\vec v,\vec v\,') := \int\frac{d\Omega_{\hat k}}{4\pi}\Big[\frac{2(1-\vec v\cdot\vec v\,')}{(1-\hat k\cdot\vec v)(1-\hat k\cdot\vec v\,')} - \frac{m^2}{E^2}\Big(\frac{1}{(1-\hat k\cdot\vec v)^2} + \frac{1}{(1-\hat k\cdot\vec v\,')^2}\Big)\Big] \quad\Rightarrow$$

$$E_{soft}(k_{max}) = \frac{e^2}{4\pi^2}\int dk\, I(\vec v,\vec v\,') \approx \frac{\alpha}{\pi}\,k_{max}\,I(\vec v,\vec v\,')$$

The integral is actually divergent at the upper end. It has been cut off at $k_{max}$, which is provided by the inverse of the duration over which the electron receives the kick. This is given by $k_{max} \sim |\vec p - \vec p\,'| := |\vec q|$. This is a physical cutoff.

The angular integration receives the maximum contribution when $\hat k$ is parallel to $\vec v$ or $\vec v\,'$, depending upon whether it is the initial or the final state bremsstrahlung. Given the directions $\hat v$, $\hat v'$, the $\hat k$ varies between these two directions (smaller angle). Thus, to pick up the contribution when $\hat k$ is parallel to $\vec v$, we define $\cos\theta$ by $\hat k\cdot\vec v = |\vec v|\cos\theta$, which varies between $|\vec v|\cos\theta = \vec v\cdot\vec v\,'$ and $\cos\theta = 1$.
Similarly, for the contribution from $\hat k$ nearly parallel to $\vec v\,'$, we define $\cos\theta$ by $\hat k\cdot\vec v\,' = |\vec v\,'|\cos\theta$, which varies from $|\vec v\,'|\cos\theta = \vec v\cdot\vec v\,'$ to $\cos\theta = 1$. Additionally, in the extreme relativistic limit where we can neglect the $m^2/E^2$ terms, we can approximate $(1-\hat k\cdot\vec v)(1-\hat k\cdot\vec v\,') \approx (1-|\vec v|\cos\theta)(1-\vec v\cdot\vec v\,')$ or $\approx (1-|\vec v\,'|\cos\theta)(1-\vec v\cdot\vec v\,')$. The angular integral then approximates as,

$$I(\vec v,\vec v\,') \approx \int_{|\vec v|\cos\theta=\vec v\cdot\vec v\,'}^{\cos\theta=1} d\cos\theta\,\frac{1-\vec v\cdot\vec v\,'}{(1-v\cos\theta)(1-\vec v\cdot\vec v\,')} + \int_{|\vec v\,'|\cos\theta=\vec v\cdot\vec v\,'}^{\cos\theta=1} d\cos\theta\,\frac{1-\vec v\cdot\vec v\,'}{(1-v'\cos\theta)(1-\vec v\cdot\vec v\,')}$$

$$\simeq \ln\Big(\frac{1-\vec v\cdot\vec v\,'}{1-|\vec v|}\Big) + \ln\Big(\frac{1-\vec v\cdot\vec v\,'}{1-|\vec v\,'|}\Big) \approx \ln\Big(\frac{(p\cdot p')^2}{E^2(E-|\vec p|)(E-|\vec p\,'|)}\Big) \simeq 2\ln\Big(\frac{q^2}{m^2}\Big) \quad\text{where } q := p'-p$$

We have used: $(1-\vec v\cdot\vec v\,') = -p\cdot p'/E^2$, $(1-|\vec v|) = (E-|\vec p|)/E$, $(E-|\vec p|)(E-|\vec p\,'|) \approx (E-|\vec p|)^2$ and $E(E-|\vec p|) \approx E^2 - \vec p^{\,2} = m^2$ in getting to the last equation. Thus the radiated energy in the soft modes is given by,

$$E_{soft}(k_{max}) = \frac{\alpha}{\pi}\int_0^{k_{max}} dk\, I(\vec v,\vec v\,') \xrightarrow{\ E\gg m\ } \frac{2\alpha}{\pi}\int_0^{k_{max}} dk\,\ln(q^2/m^2)\ . \tag{17.22}$$

Interpreting the above energy classically, what would be the total number of photons emitted, $N_\gamma$? Notice that $I(\vec v,\vec v\,')dk$ is the contribution to the radiated energy from the Fourier modes with energy $|\vec k|$. The equivalent number of photons would be $I(\vec v,\vec v\,')dk/|\vec k|$ and the total number $N_\gamma$ is obtained by integrating over the energies:

$$N_\gamma = \frac{\alpha}{\pi}\int_0^{k_{max}}\frac{dk}{k}\,I(\vec v,\vec v\,') \approx \frac{2\alpha}{\pi}\int_0^{k_{max}}\frac{dk}{k}\,\ln(q^2/m^2)$$

This is clearly divergent at the lower limit. But this is the same expression as the total quantum mechanical probability for emission of a soft photon up to energy $k_{max}$, which is therefore divergent too. This is the IR divergence of the QED cross-section for the bremsstrahlung process.

If the photon is given a small mass $\mu$, then the $\int dk/k$ will give $\ln((|q| \simeq k_{max})/\mu)$ while $I(\vec v,\vec v\,')$ gives the $\ln(q^2/m^2)$. Hence,

$$d\sigma(p\to p'+k) = d\sigma(p\to p')\cdot\underbrace{\frac{\alpha}{\pi}\ln(q^2/\mu^2)\cdot\ln(q^2/m^2)}_{\text{Sudakov double log}} \tag{17.23}$$

18.
TREATMENT OF DIVERGENCES:

At the first non-trivial attempt at computing radiative corrections, we encounter divergences of the UV type (from large loop momentum) and of the IR type (from small loop momentum in massless propagators). How do we understand the physical origin of these? How do we adjust the computational procedure so as to make unambiguous predictions to be confronted with observations?

Let us recapitulate what we have got.

• Quite generally, using only Poincare covariance and an assumption about the possible spectrum of a theory with mass gap, the Kallen-Lehmann representation gave us,

$$\Delta'_F(p) = Z\Delta_0(p,m_{ph}) + \int_{m^2_{th}}^\infty d\sigma^2\,\rho(\sigma^2)\,\Delta_0(p,\sigma)\ , \qquad 0 < Z < 1$$

$$\Delta_0(p,\sigma) := \frac{-i}{p^2+\sigma^2-i\epsilon} \quad\text{(Free propagator of mass } \sigma)$$

$$S'_F(p) = Z_2S_0(p,m_{ph}) + \int_{m^2_{th}}^\infty d\sigma^2\,\frac{-i\{\rho_1(\sigma^2)\slashed{p}+\rho_2(\sigma^2)\}}{p^2+\sigma^2-i\epsilon}$$

$$S_0(p,\sigma) := -i\,\frac{(-\slashed{p}+\sigma)}{p^2+\sigma^2-i\epsilon} \quad\text{(Free propagator of mass } \sigma)$$

$$(D'_F)_{\mu\nu}(q) = \Big(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\Big)\Big[\frac{Z_3}{q^2-i\epsilon} + \int_0^\infty d\sigma^2\,\frac{\Pi(\sigma^2)}{q^2+\sigma^2-i\epsilon}\Big]$$

The $Z$'s are the field (or wavefunction) renormalization constants. They are determined as the residues at the (isolated) pole at the physical mass, $m_{ph}$. This is without any perturbation series. In perturbation series though we obtained (for fermion and photon),

$$S'_F(p) = \frac{1}{\slashed{p}+m-\Sigma(\slashed{p})} \qquad\&\qquad D'_F(q) = \Big(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\Big)\frac{1}{q^2}\,\frac{1}{1-\Pi(q^2)}$$

This is consistent with the general expectation that the physical masses are determined by the poles in the exact propagator while the corresponding residues determine the $Z$'s. Thus,

$$\delta m := m_{ph}-m = -\Sigma(\slashed{p}=-m_{ph})\ , \qquad Z_2^{-1} = 1 - \frac{d\Sigma(\slashed{p})}{d\slashed{p}}\Big|_{\slashed{p}=-m_{ph}}\ , \qquad Z_3^{-1} = 1-\Pi(0)\ ,$$

and of course the photon physical mass is zero.

• The vertex function for on-shell fermions takes the form,

$$\Gamma^\mu(p,p') := \gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m_{ph}}F_2(q^2)\ , \qquad \sigma_{\mu\nu} := \frac{i}{2}[\gamma_\mu,\gamma_\nu]\ , \qquad \slashed{p} = -m_{ph} = \slashed{p}'\ .$$

This decomposition is understood to be sandwiched between $\bar u(p')$ and $u(p)$; the factor $1/(2m_{ph})$ is written as in the amplitude used in the treatment of UV divergences below. Since $m_{ph} = m + \delta m = m + o(\alpha)$, for the first order corrections, we can take $m_{ph} = m$.
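Here are the expressions we obtained:

(I) Electron self-energy:

$$\Sigma(p) = \frac{\alpha}{2\pi}\int_0^1 dx\,(2m+x\slashed{p})\ln\Big(\frac{x\Lambda^2}{x(1-x)p^2+(1-x)m^2+x\mu^2-i\epsilon}\Big) \tag{18.1}$$

$$\delta m = \frac{\alpha}{2\pi}\,m\int_0^1 dx\,(2-x)\ln\Big(\frac{x\Lambda^2}{(1-x)^2m^2+x\mu^2}\Big)\ , \tag{18.2}$$

$$Z_2 = 1 + \frac{\alpha}{4\pi}\ln(\Lambda^2/m^2) \tag{18.3}$$

(The overall factor of $m$ in (18.2) is required on dimensional grounds.)

(II) Photon self-energy:

$$\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\,x(1-x)\Big[\frac{2}{\epsilon}-\gamma-\ln\Big(\frac{m^2+q^2x(1-x)}{\mu_0^2}\Big)\Big] \tag{18.4}$$

$$= -\frac{2\alpha}{3\pi}\frac{1}{\epsilon} + o(\epsilon^0) \quad\Rightarrow\quad Z_3 = 1-\frac{2\alpha}{3\pi}\frac{1}{\epsilon} \tag{18.5}$$

(III) Vertex function:

$$F_1(q^2) = 1+\frac{\alpha}{2\pi}\int dx\,dy\,dz\ \delta(1-x-y-z)\Big[\ln\Big(\frac{z\Lambda^2}{M^2}\Big) + \frac{-(1-x)(1-y)q^2+(1-4z+z^2)m^2}{M^2}\Big] \tag{18.6}$$

$$F_2(q^2) = \frac{\alpha}{2\pi}\int dx\,dy\,dz\ \delta(1-x-y-z)\,\frac{2m^2z(1-z)}{M^2}\ , \tag{18.7}$$

$$M^2 := q^2xy+m^2(1-z)^2+\mu^2z \tag{18.8}$$

$$\therefore\ F_1(0) = 1+\frac{\alpha}{2\pi}\int dx\,dy\,dz\ \delta(1-x-y-z)\Big[\ln\Big(\frac{z\Lambda^2}{(1-z)^2m^2}\Big) + \frac{1-4z+z^2}{(1-z)^2}\Big] \tag{18.9}$$

$$F_2(0) = \frac{\alpha}{2\pi}\int dx\,dy\,dz\ \delta(1-x-y-z)\,\frac{2z}{1-z} = \frac{\alpha}{2\pi} =: \frac{g-2}{2} \tag{18.10}$$

(IV) Bremsstrahlung:

$$d\sigma_{p\to p'+k} = d\sigma_{p\to p'}\int\frac{d^3k}{2k(2\pi)^3}\sum_\lambda e^2\Big|\frac{\varepsilon\cdot p'}{p'\cdot k}-\frac{\varepsilon\cdot p}{p\cdot k}\Big|^2 \tag{18.11}$$

$$= d\sigma_{p\to p'}\,\frac{e^2}{4\pi^2}\int\frac{dk}{k}\,I(\vec v,\vec v\,') \quad\text{where,} \tag{18.12}$$

$$I(\vec v,\vec v\,') = \int\frac{d\Omega}{4\pi}\Big\{\frac{2(1-\vec v\cdot\vec v\,')}{(1-\hat k\cdot\vec v)(1-\hat k\cdot\vec v\,')} - \frac{m^2/E^2}{(1-\hat k\cdot\vec v)^2} - \frac{m^2/E^2}{(1-\hat k\cdot\vec v\,')^2}\Big\} \tag{18.13}$$

where $p^\mu = (E,E\vec v)$, $(p')^\mu = (E,E\vec v\,')$, $k^\mu = (|\vec k|,\vec k)$ is used.

$$\text{For } |\vec v| \approx |\vec v\,'| \approx 1\ , \quad I(\vec v,\vec v\,') \approx 2\ln(q^2/m^2)\ , \quad q := p'-p \quad\Rightarrow \tag{18.14}$$

$$d\sigma_{p\to p'+k} = d\sigma_{p\to p'}\Big[\frac{\alpha}{\pi}\ln(q^2/\mu^2)\ln(q^2/m^2)\Big] \tag{18.15}$$

A. Treatment of the IR divergences

Let us consider the IR problem first. Since $F_1$ has both the UV and the IR divergence, we will separate these by using the UV renormalized $F_{1,ren}$ which was defined through $\delta F_{1,ren}(q^2) := \delta F_1(q^2) - \delta F_1(0)$. This guarantees that the renormalized $F_1$ satisfies the condition $F_1(0) = 1$. Using the expressions above, and keeping only the terms that can diverge in the infrared (the difference of the logarithms is IR finite), we see that,

$$\delta F_{1,ren}(q^2) \simeq \frac{\alpha}{2\pi}\int dx\,dy\,dz\ \delta(1-x-y-z)\Big[\frac{-(1-x)(1-y)q^2+(1-4z+z^2)m^2}{M^2(q^2)} - \frac{(1-4z+z^2)m^2}{(1-z)^2m^2+z\mu^2}\Big]$$

The IR divergence comes from $z \to 1 \leftrightarrow x \sim y \sim 0$, where $M(q^2,x,y)$ would vanish but for the photon mass $\mu^2$. To isolate the divergence, it suffices to take $x = y = 0$, $z = 1$ in the numerator.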
Doing the $x$ integration using the delta function gives $x = 1-y-z > 0 \Rightarrow y < 1-z$ and leads to,

$$\delta F_{1,ren} \approx \frac{\alpha}{2\pi}\int_0^1 dz\int_0^{1-z} dy\,\Big[\frac{-2m^2-q^2}{(1-z)^2m^2+y(1-y-z)q^2+\mu^2} + \frac{2m^2}{(1-z)^2m^2+\mu^2}\Big]$$

Substituting $y := (1-z)u$, $v := 1-z$, the leading contribution in the limit $\mu \to 0$ is expressed as [13],

$$F_{1,ren}(q^2) \approx 1-\frac{\alpha}{2\pi}\,f_{IR}(q^2)\ln\Big(\frac{m^2\ \text{or}\ q^2}{\mu^2}\Big) \tag{18.16}$$

$$\text{where,}\quad f_{IR}(q^2) = \int_0^1 du\,\frac{m^2+q^2/2}{m^2+u(1-u)q^2} - 1 \tag{18.17}$$

Since $F_1$ is the coefficient of the $\gamma^\mu$ term, we can replace $e \to eF_{1,ren}$ in the electron scattering off a classical potential. The cross-section is then given by,

$$\frac{d\sigma}{d\Omega} \simeq \Big(\frac{d\sigma}{d\Omega}\Big)_{tree}\Big[1-\frac{\alpha}{\pi}\,f_{IR}(q^2)\ln\Big(\frac{m^2\ \text{or}\ q^2}{\mu^2}\Big)\Big]\ .$$

Notice that $e \to eF_1 \Rightarrow e^2 \to e^2F_1^2 \Rightarrow \alpha/(2\pi) \to \alpha/\pi$. Not only is this divergent as $\mu \to 0$; for sufficiently small non-zero $\mu$ the bracket is actually negative, implying a negative cross-section!

In the limit of $q^2 \to \infty$ ($q^2$ is space-like and hence positive), $f_{IR}(q^2) \to \ln(q^2/m^2) \Rightarrow F_{1,ren}(q^2) \simeq 1-\frac{\alpha}{2\pi}\ln(q^2/m^2)\ln(q^2/\mu^2)$ and hence,

$$d\sigma(p\to p') \simeq d\sigma_{tree}(p\to p')\Big[1-\frac{\alpha}{\pi}\ln(q^2/m^2)\ln(q^2/\mu^2)\Big]\ , \quad q^2\to\infty\ ,\ \mu^2\to0. \tag{18.18}$$

The Bremsstrahlung cross-section on the other hand is,

$$d\sigma(p\to p'+k) = d\sigma(p\to p')\,\frac{\alpha}{\pi}\ln\Big(\frac{q^2}{m^2}\Big)\ln\Big(\frac{q^2}{\mu^2}\Big) = d\sigma(p\to p')_{tree}\,\frac{\alpha}{\pi}\ln\Big(\frac{q^2}{m^2}\Big)\ln\Big(\frac{q^2}{\mu^2}\Big)\ . \tag{18.19}$$

Both the cross-sections above are IR divergent and both suffer from the ambiguity due to contamination by soft photons. We already noted that detectors are unable to distinguish a charged particle from one accompanied by soft photons below the detector sensitivity. What is the appropriate theoretically computed quantity which reflects this limitation of the detection process?

When an experimenter reports a detection of a scattered electron, he/she is actually giving an estimate of the probability that an electron, $e(p')$, is detected and a photon is not detected. This probability is the probability for a process with no emitted photon plus the probability that there are accompanying soft photons with energies below the detection threshold, $\varepsilon_{th}$.
In equation,

$$(d\sigma)_{measured} = d\sigma(p\to p') + d\sigma(p\to p'+k\ ;\ |\vec k| < \varepsilon_{th})\ . \tag{18.20}$$

But this is precisely the sum of the two cross-sections given in eqns.(18.18, 18.19). We see that the leading contribution as $q^2 \to \infty$, $\mu^2 \to 0$ is exactly canceled out in the observed cross-section!

Note: For a general $q^2$, the sum of the two cross-sections is given by [13],

$$(d\sigma)_{measured} = d\sigma_{tree}(p\to p')\Big[1-\frac{\alpha}{\pi}\,f_{IR}(q^2)\ln\Big(\frac{q^2\ \text{or}\ m^2}{\mu^2}\Big) + \frac{\alpha}{2\pi}\,I(\vec v,\vec v\,')\ln(\varepsilon_{th}^2/\mu^2)\Big]$$

It turns out that without the limit $q^2 \to \infty$, it still holds that $I(\vec v,\vec v\,') \to 2f_{IR}$ [13]. For general $q^2$, the coefficient of $f_{IR}$ is $\ln(q^2)$ or $\ln(m^2)$. For large $q^2$ we can of course drop $m^2$. Furthermore, experimentally it is easier to track the behavior of the cross-section as a function of $q^2$, so taking $q^2 \gg m^2$, we write the unambiguous and measurable prediction as,

$$\Big(\frac{d\sigma}{d\Omega}\Big)_{measured} = \Big(\frac{d\sigma}{d\Omega}\Big)_{tree}(p\to p')\Big[1-\frac{\alpha}{\pi}\ln\Big(\frac{q^2}{m^2}\Big)\ln\Big(\frac{q^2}{\varepsilon_{th}^2}\Big) + o(\alpha^2)\Big]\ , \qquad q^2 \gg m^2 \tag{18.21}$$

Appreciate that we parametrised the IR divergence in terms of the photon mass $\mu^2$, used the renormalized $F_{1,ren}(q^2)$ to separate the IR from the UV and finally identified the quantity which is actually reported by experiments, namely the sum of the two cross-sections.

This has been a demonstration at 1-loop. It is a non-trivial result of a great deal of work that the basic mechanism of cancellation works to all orders in $\alpha$. There are other types of divergences analogous to the IR divergences, the so called “mass singularities”, which do not cancel but can be factorised in a convenient form and then eliminated from the observed cross-sections. See the book by G. Sterman [14].

UV divergences are a different ball game and need a different procedure.

B. Treatment of the UV divergences

All three radiative corrections, in $\Sigma$, $\Pi$ and $F_1$, have UV divergences parametrised in terms of the Pauli-Villars cut-off $\Lambda$ or the dimensional regularization $\epsilon^{-1}$ pole.
Note that the bremsstrahlung process does not have a UV divergence as there is a physical cut-off for the energy of the “soft photons”.

We need to pay attention to the $Z$ factors now. Recall that the S-matrix element definition, via the LSZ reduction procedure, had a factor of $\frac{1}{\sqrt Z}$ for each external line. There was also an amputation of the external line, effected by the equation of motion operator, e.g. $\Box - m^2$, acting on the Feynman propagator $\Delta'_F$ for the external line. The $\sqrt Z$ entered from the asymptotic condition relating the interacting field to the in/out fields. The in/out fields satisfy their respective field equations with physical masses and also have the normalization factors with $\omega_k$ which also contain the physical masses. In momentum space, these factors associated with external lines are of the form $\frac{1}{\sqrt Z}(p^2+m^2_{ph})\Delta'_F(p)$ where the $\Delta'_F$ contains the self-energy giving it the form $\Delta'_F(p) = (p^2+m^2-\Pi(p^2))^{-1}$ (for scalars).

In the perturbative computation of the self-energy however, the $m$ is the mass parameter in the $\mathcal{L}$, which is not the physical mass. In fact from the Kallen-Lehmann representation, we know that the physical mass is determined from $p^2+m^2-\Pi(p^2)|_{p^2=-m^2_{ph}} = 0 \leftrightarrow -m^2_{ph}+m^2 = \Pi(-m^2_{ph})$. Expanding $\Pi(p^2)$ about $-m^2_{ph}$ gives us,

$$\Pi(p^2) = \Pi(-m^2_{ph}) + (p^2+m^2_{ph})\,\Pi'(-m^2_{ph}) + \ldots \quad\Rightarrow$$

$$(\Delta'_F)^{-1}(p) = p^2+m^2-\Pi(-m^2_{ph}) - (p^2+m^2_{ph})\,\Pi'(-m^2_{ph}) + \ldots = (p^2+m^2_{ph})\big[1-\Pi'(-m^2_{ph})\big] + \ldots = (p^2+m^2_{ph})\,Z_\Phi^{-1} + \ldots$$

$$\therefore\ \frac{1}{\sqrt{Z_\Phi}}(p^2+m^2_{ph})\,\Delta'_F(p) = \sqrt{Z_\Phi}\,\big[1+o(p^2+m^2_{ph})\big]$$

Since we evaluate the S-matrix elements on shell, all the higher order terms vanish. The net result is that: In an S-matrix element, for each external line introduce a factor of $\sqrt{Z_{field}}\ \times$ appropriately normalized wavefunction and do not include self-energy corrections on the external lines.

Consider the electron scattering off an external potential. The corresponding invariant amplitude is given by $e(\sqrt{Z_2})^2\,\bar u(p')\Gamma^\mu(p',p)u(p)$ with $\Gamma^\mu = \gamma^\mu F_1 + (\ldots)F_2$.
The $F_2$ has no divergences while $Z_2$, $F_1$ are both divergent. Thus we write $Z_2 = 1+\delta Z_2$, $F_1 = 1+\delta F_1$ with the $\delta$'s representing the $o(\alpha)$ divergent corrections. To this order,

$$\mathcal{M}_{e\to e}(p',p) = e(1+\delta Z_2)\Big[\bar u(p')\gamma^\mu u(p)\,(1+\delta F_1(q^2)) + \bar u(p')\frac{i\sigma^{\mu\nu}q_\nu}{2m_{ph}}u(p)\,\delta F_2(q^2)\Big]$$

$$= e\,[\bar u(p')\gamma^\mu u(p)]\,\{1+\delta F_1+\delta Z_2\} + e\,\bar u(p')\frac{i\sigma^{\mu\nu}q_\nu}{2m_{ph}}u(p)\,\delta F_2(q^2)$$

But,

$$\delta Z_2 = \frac{\alpha}{2\pi}\int_0^1 dz\,\Big[-z\ln\frac{z\Lambda^2}{M^2} + 2(2-z)\frac{z(1-z)m^2}{M^2}\Big]\ , \qquad M^2 = (1-z)^2m^2+z\mu^2$$

$$\delta F_1(0) = \frac{\alpha}{2\pi}\int_0^1 dz\,(1-z)\Big[\ln\frac{z\Lambda^2}{M^2} + (1+z^2-4z)\frac{m^2}{M^2}\Big]$$

$$\therefore\ \delta F_1(0)+\delta Z_2 = \frac{\alpha}{2\pi}\int_0^1 dz\,\Big[(1-2z)\ln\frac{z\Lambda^2}{M^2} + \frac{m^2}{M^2}\big\{(4-2z)(1-z)z + (1-z)(1+z^2-4z)\big\}\Big]$$

Integrate the first term by parts,

$$\int_0^1 dz\,(1-2z)\ln\frac{z\Lambda^2}{M^2} = \Big[(z-z^2)\ln\frac{z\Lambda^2}{M^2}\Big]_0^1 - \int_0^1 dz\,(z-z^2)\Big[\frac{1}{z}-\frac{2(1-z)(-1)m^2+\mu^2}{M^2}\Big]$$

$$= -\int_0^1 dz\,\frac{1-z}{M^2}\,\{M^2+2z(1-z)m^2-z\mu^2\}$$

$$= -\int_0^1 dz\,\frac{1-z}{M^2}\,\{m^2(1-z)(1+z)\} = -\int_0^1 dz\,\frac{m^2(1-z)^2(1+z)}{m^2(1-z)^2+z\mu^2}$$

$$\therefore\ \delta F_1(0)+\delta Z_2 = \frac{\alpha}{2\pi}\int_0^1 dz\,\Big[-\frac{m^2(1-z)(1-z^2)}{M^2} + \frac{m^2}{M^2}(1-z)\{4z-2z^2+1+z^2-4z\}\Big]$$

$$= \frac{\alpha}{2\pi}\int_0^1 dz\,\frac{m^2}{M^2}(1-z)(1-z^2)\{-1+1\} = 0 \quad (!)$$

Thus not only has the $\ln\Lambda^2$ divergence canceled, we have got $\delta Z_2 = -\delta F_1(0)$ and thus the entire correction takes the form,

$$\mathcal{M}_{e\to e}(p',p) = e\,\bar u(p')\gamma^\mu u(p)\,\Big\{1+\underbrace{\delta F_1(q^2)-\delta F_1(0)}_{\delta F_{1,ren}}\Big\} + e\,\bar u(p')\frac{i\sigma^{\mu\nu}q_\nu}{2m_{ph}}u(p)\,\delta F_2(q^2)$$

We recover the ad hoc prescription of using $\delta F_{1,ren}$ introduced earlier while discussing the IR divergences. Note that the subtracted term was crucial for the IR divergence cancellation. It was of course physically expected since $\delta F_1(q^2=0) = 0$ should actually hold to all orders.

Note that the UV divergence in the vertex function, $\delta F_1(q^2)$, is canceled against the divergence in $\delta Z_2$ coming from the electron self-energy. Now we are left with the divergence in $Z_3$ from the $\Pi(q^2=0)$ for the photon self-energy. We still have the electron mass-shift $\delta m$ which is divergent.

1.
1-Loop Renormalization: Charge Screening and Lamb shift

Consider charged particle scattering by exchanging a photon, necessarily off-shell (space-like in fact). Replace the free propagator $(D_F)_{\mu\nu}(q^2)$ by the exact propagator which includes the photon self-energy, $(D'_F)_{\mu\nu}(q^2)$. The exact propagator has additional terms proportional to $q_\mu q_\nu$. But thanks to the Ward identity, these do not contribute to S-matrix elements and effectively, $(D'_F)_{\mu\nu}(q^2) = -i\eta_{\mu\nu}[q^2(1-\Pi(q^2))]^{-1}$. This has a pole at $q^2 = 0$ with residue $Z_3 = [1-\Pi(0)]^{-1}$. The replacement thus gives a factor of $Z_3$. The photon internal line connects two vertices contributing $e^2$. Thus, if we identify $e^2Z_3 =: e^2_{ph}$, then all divergences coming from the photon self-energy can be neatly absorbed in the physical, measured charge which is finite. For the mass parameter too, we identified $m_{ph} = m+\delta m$, which absorbs the divergence in $\delta m$ into the Lagrangian parameter $m$. The identification $e_{ph} := e\sqrt{Z_3}$ is called charge renormalization while $m_{ph} := m+\delta m$ is called mass renormalization. For an external photon line, we will just have a $\sqrt{Z_3}$ factor and no self-energy corrections and of course no mass shift. Thus the exact photon propagator may be replaced by the free propagator by simultaneously replacing $\alpha$ by $\alpha_{eff}$ defined below.

$$\alpha\,(D'_F)_{\mu\nu}(q^2) = \frac{\alpha}{1-\Pi(q^2)}\,(D_F)_{\mu\nu}(q^2) = \frac{\alpha}{1-\Pi(q^2)}\,\frac{\eta_{\mu\nu}}{q^2-i\epsilon} = \frac{\alpha_{ph}(1-\Pi(0))}{1-\Pi(q^2)}\,\frac{\eta_{\mu\nu}}{q^2-i\epsilon} := \alpha_{eff}(q^2)\,\frac{\eta_{\mu\nu}}{q^2-i\epsilon}$$

$$\text{with,}\quad \alpha_{eff}(q^2) = \frac{\alpha_{ph}}{1-(\Pi(q^2)-\Pi(0))}$$

(the last form holds up to terms of $o(\alpha^2)$ in the denominator).

To appreciate the implication of the above procedure, recall the discussion of the effective potential inferred from the tree level scattering for both the Yukawa and the QED coupling. Let us use the same formula (for QED), but include the self-energy correction for the photon propagator.
The inferred potential is,

$$V(\vec x) = \int\frac{d^3q}{(2\pi)^3}\,e^{i\vec q\cdot\vec x}\,\frac{-e^2_{ph}}{\vec q^{\,2}\,(1-\Pi(q^2)+\Pi(0))} \tag{18.22}$$

$$\Pi(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\,x(1-x)\Big[\frac{2}{\epsilon}-\gamma-\ln\Big(\frac{m^2+q^2x(1-x)}{\mu_0^2}\Big)\Big] \tag{18.23}$$

$$\Pi(q^2)-\Pi(0) = -\frac{2\alpha}{\pi}\int_0^1 dx\,x(1-x)\ln\Big(\frac{m^2}{m^2+q^2x(1-x)}\Big) \tag{18.24}$$

$$\xrightarrow{\ \vec q^{\,2}\ll m^2\ } \frac{2\alpha}{\pi}\int_0^1 dx\,x^2(1-x)^2\,\frac{q^2}{m^2} = \frac{\alpha}{15\pi}\,\frac{q^2}{m^2} \quad\text{(NR limit)} \tag{18.25}$$

$$\therefore\ V(\vec x) \approx \int\frac{d^3q}{(2\pi)^3}\,e^{i\vec q\cdot\vec x}\,\frac{-e^2_{ph}}{\vec q^{\,2}\,\big(1-\frac{\alpha}{15\pi}\frac{q^2}{m^2}\big)} \tag{18.26}$$

$$\approx -\frac{\alpha_{ph}}{r}\ \underbrace{-\ \frac{4\alpha^2_{ph}}{15m^2}\,\delta^3(\vec x)}_{\text{perturbation}} \tag{18.27}$$

The perturbed Coulomb potential induces a shift in the energy levels of the Coulomb potential, say in the Hydrogen atom. First order perturbation theory gives the shift as (the second factor is the value of $|\psi(0)|^2$ for the $2s$ state),

$$\Delta E = -\frac{4\alpha^2_{ph}}{15m^2}\,\frac{\alpha^3_{ph}m^3}{8\pi} = -\frac{\alpha^5_{ph}m}{30\pi} \simeq -1.123\times10^{-7}\ \text{eV} \quad\text{(Lamb Shift)} \tag{18.28}$$

This is the contribution of the photon self-energy to the famous Lamb shift. The photon self-energy $\leftrightarrow \alpha_{eff}$ is thus an observable and observed effect. A more exact calculation may be seen in [13].

We can also consider the ultra-relativistic regime $q^2 \gg m^2$. Now

$$\Pi(q^2)-\Pi(0) = -\frac{2\alpha}{\pi}\int_0^1 dx\,x(1-x)\Big[\ln\Big(\frac{m^2}{q^2}\Big)-\ln\Big(x(1-x)+\frac{m^2}{q^2}\Big)\Big]$$

$$\approx -\frac{2\alpha}{\pi}\Big[\frac{1}{6}\ln(m^2/q^2) - \int_0^1 dx\,x(1-x)\ln\{x(1-x)\} + o(m^2/q^2)\Big]$$

$$\simeq \frac{\alpha}{3\pi}\Big[\ln(q^2/m^2)-5/3+o(m^2/q^2)\Big]$$

$$\therefore\ \alpha_{eff} = \frac{\alpha_{ph}}{1-\frac{\alpha}{3\pi}\ln\Big(\frac{q^2}{m^2e^{5/3}}\Big)} \xrightarrow{\ q^2\gg m^2\ } -\frac{3\pi}{\ln(q^2e^{-5/3}/m^2)} \tag{18.29}$$

Note that the denominator is less than 1 and hence the effective coupling is larger than the physical coupling. Thus, the effective coupling gets stronger at shorter distances. This is interpreted as saying that the photon self-energy correction polarizes the space between, say, the nucleus and the electron, shielding the nuclear charge. As the shielding cloud is penetrated, a higher nuclear charge is seen and hence $\alpha_{eff}$ is larger. For this reason, the photon self-energy $\Pi(q^2)$ is also called the vacuum polarization.

To summarize:

1.
The UV divergences in the self energies and the vertex correction are absorbed away by introducing the physical mass for the electron, the physical charge for the electron and the electron wavefunction renormalization constant to cancel the divergence in $F_1(q^2)$;

2. The IR divergences in the amplitudes imply an IR divergence in the cross-section. For the e − e scattering, the IR divergence is canceled against the bremsstrahlung process, once the measured quantity is correctly identified taking into account the finite detector resolution;

3. The hiding of divergences in $\alpha_{ph}$ and $m^2_{ph}$ implies that $\alpha_{eff}$, $\delta m^2$ acquire $q^2$ dependence: they “run” with $q^2$.

This seems satisfactory within the context of the $o(\alpha)$ corrections and the magic of the Ward identity. But it raises more questions.

• Are the divergences generic? What are the mathematical and physical reasons for their existence?

• Can we sometimes/always take care of the divergences and make unambiguous predictions?

• What kind of predictions can be made?

• What are the effective/running parameters?

• Are the divergences an artifact of perturbation theory? Etc, etc . . . .

Addressing these questions is the genesis of the renormalization theory. As a matter of strategy, we will stay within the perturbative framework and try to make sense of the divergences. After all, despite divergences, we could obtain predictions from QED at 1-loop which have been well tested!

We begin with a comment that the UV divergences persist at higher loops as well and have to be faced. Can we always absorb them away in physical masses, couplings and the field renormalizations in S-matrix elements/cross-sections automatically and make finite, unambiguous predictions? Indeed it is so for QED and for the class of (super-)renormalizable theories. The proof needs a somewhat modified procedure, introducing additional diagrams representing the so called ‘counter terms’ and adjusting their coefficients/Feynman rules to systematically absorb the divergences.
We discuss this procedure in the context of a scalar field theory, specifically the $\Phi^4$ theory in 4 dimensions and the $\Phi^3$ theory in 6 dimensions.

C. The Method of Counter terms

Consider the $\Phi^4$ theory given by,

$$\mathcal{L} = -\frac{1}{2}\partial_\mu\Phi_0\partial^\mu\Phi_0 - \frac{1}{2}m_0^2\Phi_0^2 - \frac{\lambda_0}{4!}\Phi_0^4$$

The corresponding Feynman rules would be,

$$\text{propagator (momentum } p):\ \frac{-i}{p^2+m_0^2-i\epsilon}\ , \qquad \text{4-point vertex}:\ -i\,\frac{\lambda_0}{4!}$$

As seen in QED, we will have divergences in the radiative corrections.

[Figure: the exact propagator with momentum $p$ written as the sum of the free propagator and all self-energy insertions.]

The suffix 0 quantities are called “bare quantities”. Define: $\Phi_0 := \sqrt{Z}\,\Phi$, an arbitrary scaling of the field. This expresses the Lagrangian density as,

$$\mathcal{L} = -\frac{Z}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{Z}{2}m_0^2\Phi^2 - \frac{\lambda_0Z^2}{4!}\Phi^4$$

$$= -\frac{1}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{1}{2}m^2\Phi^2 - \frac{\lambda}{4!}\Phi^4$$

$$\quad -\frac{Z-1}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{Zm_0^2-m^2}{2}\Phi^2 - \frac{\lambda_0Z^2-\lambda}{4!}\Phi^4$$

The last line constitutes the “counter terms”; the field $\Phi$, the mass $m$ and the coupling $\lambda$ are called renormalized quantities. The corresponding Feynman rules would be:

$$\text{propagator}:\ \frac{-i}{p^2+m^2-i\epsilon}\ , \qquad \text{2-point counter term}:\ -i\{\underbrace{(Z-1)}_{\delta_Z}\,p^2 + \underbrace{(Zm_0^2-m^2)}_{\delta_m}\}\ ,$$

$$\text{4-point vertex}:\ -i\,\frac{\lambda}{4!}\ , \qquad \text{4-point counter term}:\ -i\,\frac{\overbrace{Z^2\lambda_0-\lambda}^{\delta_\lambda}}{4!}$$

We have used the renormalized quantities and have additional vertices with adjustable coefficients $\delta_Z$, $\delta_m$, $\delta_\lambda$. The new vertices are generated by the counter terms.

Note that we have only rewritten the bare Lagrangian density, split into two sets of terms. Since we have not stipulated any conditions, the split is completely arbitrary. Two sets of conditions, called renormalization conditions, are now provided:

(i) The exact propagator is given by $\frac{-i}{p^2+m^2-i\epsilon}$ + terms regular at $p^2 = -m^2$. That is, the renormalized mass should be identified with the physical mass and the residue should be 1;

(ii) The exact 4-point, 1PI function should equal $-i\lambda$ at $s = 4m^2$, $t = u = 0$. Here $s, t, u$ are the Mandelstam variables. The definition of $\lambda$ could be changed, but essentially it is the value of the exact 4-point scattering amplitude which is measurable.

Diagrams generated by these vertices will again have divergences which must be regulated.
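The statement that the split into renormalized terms and counter terms is only a rewriting of the bare Lagrangian can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function names and the sample numbers are hypothetical, and the three tracked coefficients are those multiplying $-\frac{1}{2}\partial_\mu\Phi\partial^\mu\Phi$, $-\frac{1}{2}\Phi^2$ and $-\frac{1}{4!}\Phi^4$ after the rescaling $\Phi_0 = \sqrt{Z}\,\Phi$.

```python
import math


def bare_coefficients(Z, m0_sq, lam0):
    """Coefficients of the kinetic, mass and quartic terms of the bare
    Lagrangian once it is written in terms of Phi = Phi_0 / sqrt(Z)."""
    return (Z, Z * m0_sq, Z ** 2 * lam0)


def counterterm_coefficients(Z, m0_sq, lam0, m_sq, lam):
    """delta_Z, delta_m, delta_lambda as defined by the counter-term line."""
    return (Z - 1, Z * m0_sq - m_sq, Z ** 2 * lam0 - lam)


def split_coefficients(Z, m0_sq, lam0, m_sq, lam):
    """Renormalized coefficient plus counter-term coefficient, term by term."""
    dZ, dm, dlam = counterterm_coefficients(Z, m0_sq, lam0, m_sq, lam)
    return (1 + dZ, m_sq + dm, lam + dlam)


def split_is_identity(Z, m0_sq, lam0, m_sq, lam):
    """True if the split reproduces the bare coefficients for this choice of
    renormalized parameters (it should, for any choice)."""
    bare = bare_coefficients(Z, m0_sq, lam0)
    split = split_coefficients(Z, m0_sq, lam0, m_sq, lam)
    return all(math.isclose(b, s, rel_tol=1e-12) for b, s in zip(bare, split))
```

For any values of the renormalized $m^2$ and $\lambda$ the two sets of coefficients coincide, which is why the renormalization conditions are needed to fix the split.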
The counter term coefficients, $\delta_Z$, $\delta_m$, $\delta_\lambda$, are to be so chosen as to ensure that the renormalization conditions are satisfied, order-by-order. Since the conditions are finite and cut-off independent (regularization independent), all divergences are spent in defining the counter term coefficients. By construction, we have generated finite quantities to all orders!

Note: There are different ways of absorbing the divergences in the counter term coefficients, especially when massless particles are involved. The different methods of absorbing the UV divergences are generically referred to as subtraction schemes. This will be illustrated below.

The perturbation series generated using the bare form of the Lagrangian is called the “bare perturbation series” while that generated using the renormalized quantities together with the renormalization conditions is called the “renormalized perturbation series”.

Does such a simple splitting procedure always generate UV finite (cut-off independent) quantities in terms of finitely many renormalization conditions? The answer is YES for a class of theories called “renormalizable theories”. There are non-renormalizable theories (= Lagrangians) for which this procedure fails.

19. RENORMALIZED PERTURBATION SERIES

Let a theory be specified by a (bare) Lagrangian density, $\mathcal{L}$, as,

$$\mathcal{L}(\Phi_0,m_0,g_{0,k}) = -\frac{1}{2}\partial_\mu\Phi_0\partial^\mu\Phi_0 - \frac{1}{2}m_0^2\Phi_0^2 - \sum_{k\ge3}\frac{g_{0,k}}{k!}\Phi_0^k\ .$$

Introduce renormalized quantities, $\Phi$, $m^2$, $g_k$, and scaling parameters, $Z_\phi$, $Z_m$, $Z_{g_k}$, through the definitions: $\Phi_0 =: \sqrt{Z_\phi}\,\Phi$, $m_0^2Z_\phi =: Z_mm^2$, $g_{0,k}Z_\phi^{k/2} =: Z_{g_k}g_k$. Writing the $Z$'s in $1+\delta$ form, we recast the Lagrangian density as,

$$\mathcal{L}(\Phi,m,g_k;\delta_\Phi,\delta_m,\delta_k) = -\frac{1}{2}\partial_\mu\Phi\partial^\mu\Phi - \frac{1}{2}m^2\Phi^2 - \sum_{k\ge3}\frac{g_k}{k!}\Phi^k$$

$$\quad -\frac{1}{2}\delta_\Phi\,\partial_\mu\Phi\partial^\mu\Phi - \frac{1}{2}\delta_m\,m^2\Phi^2 - \sum_{k\ge3}\delta_k\,\frac{g_k}{k!}\Phi^k \qquad\longleftarrow\ \text{(Counter terms)}$$

We choose the renormalization conditions to be:

(i) Exact propagator $= \frac{-i}{p^2+m^2-i\epsilon}$, which is equivalent to the two conditions: (a) $\Pi(p^2=-m^2) = 0$, and (b) $\Pi'(p^2=-m^2) = 0$;

(ii) $g_k$ = $k$-point amputated function, also called $k$-point vertex function, at some chosen values of its momentum arguments.

Note: Due to the conditions (i), this scheme is called on-shell renormalization.

Calculations of the vertex functions will yield functions of the external momenta, $m$, $g_k$, $\delta_\Phi$, $\delta_m$, $\delta_k$ and the regularization parameter: $\Lambda$ for a momentum cut-off, $\epsilon = 4-n$ for the dimensional regularization. The renormalization conditions will serve to define the counter term coefficients in terms of the regularization parameter and eliminate them from the vertex functions. We will be left with vertex functions having dependence on the momenta, $m$ and the couplings $g_k$. This is what we seek to understand.

A. Necessary Conditions for UV divergence: Power Counting

We begin by finding necessary conditions for the occurrence of UV divergence. These are obtained by estimating the Feynman integrals in the region where all loop momenta become large. Feynman integrands are rational functions of momenta and in the regime of large momenta, simply give a power of the large momenta.

Consider an arbitrary, connected diagram made up of $E$ external lines, $I$ internal lines, $n_k$ vertices of $k$-th order and $L$ loops. The loops arise because we have $I$ momenta, with $V := \Sigma_{k\ge3}n_k$ vertices enforcing momentum conservation (less one, for the overall momentum conservation), which leaves some momenta undetermined. All internal momenta are linear functions of some external momenta, $p_i$, and some loop momenta, $k_l$. There are several integration regions in the $d\times L$ dimensional space ($d$ is the space-time dimension). These regions correspond to various subsets of internal lines vanishing as $q(p_i,k_l)^{-2}$.
It suffices to consider the region wherein all loop momenta diverge, which means that all internal propagators vanish. If we take the loop momenta to diverge as $k_l = \Lambda\hat{k}_l$, $\Lambda \to \infty$, then we get the superficial degree of divergence, $D = dL - 2I$. For vertex functions the external lines are amputated and $2I = \sum_{k\ge3} k\,n_k - E$. (For the Green's functions, which retain the external propagators, the right hand side has $+E$ instead.) As noted before, the number of loop momenta is given by $L = I - V + 1$. Using these, the superficial degree of divergence is given by,

$$D = d - \frac{d-2}{2}E + \sum_{k\ge3} n_k\left(k\,\frac{d-2}{2} - d\right) .$$

A necessary condition for divergence is that $D \ge 0$. For $D < 0$, there is no UV divergence; $D = 0$ is (possibly) logarithmically divergent, $D = 1$ is (possibly) linearly divergent, $D = 2$ is (possibly) quadratically divergent. We have taken $k \ge 3$ to have some interaction, we take at least one $n_k \neq 0$ for a non-trivial diagram and of course $d \ge 2$, $E \ge 1$.

As an example, consider $d = 4$. Then $D = 4 - E + \sum_{k\ge3} n_k(k-4) \ge 0$ for divergence. That is, $4 \ge E + n_3 - \sum_{k\ge5} n_k(k-4)$. If we have $n_{k\ge5} = 0$, i.e. only $\Phi^3, \Phi^4$ terms, then the condition for divergence is independent of the number of $\Phi^4$ vertices, while with increasing $\Phi^3$ vertices the $E$ must decrease correspondingly.

• For $n_3 = 0$, we must have $E \le 4$, i.e. the 2-point and the 4-point vertex functions are superficially divergent, quadratically and logarithmically respectively. For pure $\Phi^4$ theory, the vertex functions with an odd number of external lines vanish identically. Thus, in 4 dimensional $\Phi^4$ theory, the self-energy and the 4-point vertex function are divergent to all orders. Since we do have the $\delta_\Phi, \delta_m, \delta_4$ coefficients, we can satisfy the renormalization conditions to all orders. This theory is renormalizable.

• For pure $\Phi^3$ theory, $n_4 = 0$ and $D = 4 - E - n_3$. For $n_3 = 1$, we can have only the tree level 3-point function. This gives $D = 0$. But of course there is no loop integration and hence no divergence (hence a non-negative superficial degree of divergence is not a sufficient condition). Let us also require $L \ge 1$.
For $n_3 = 2$, we can have 2-point function diagrams [diagram] with $E = 2$, $L = 1$, $D = 0$, which are logarithmically divergent, and also [diagram] with $E = 4$, $L = 0$, $D = -2$, which are (trivially) "convergent". For n3 = 3, E = 2 we can have the diagram [diagram] which is divergent even though $D < 0$. But this is because of the 1PR nature of the diagram. To exclude this triviality, we stipulate that contributing diagrams should all be 1PI. Then, for all $n_3 \ge 4$, all $E$-point vertex functions are convergent. This theory thus has divergences (e.g. in the 2-point function), but only up to a finite order. Beyond that, there are no divergences. Such a theory is called super-renormalizable.

• For a pure $\Phi^k$ theory with $k \ge 5$, $D = 4 - E + (k-4)n_k$. With increasing $n_k$, the number of loops also increases and $D$ keeps increasing. Equivalently, more and more $E$-point functions turn divergent and we will need infinitely many counter terms to absorb the divergences, even in the same $E$-point function. Such a theory is non-renormalizable.

• For the sake of variety, take $d = 3$, so that $D = 3 - E/2 + n_k(k/2 - 3)$. For $k = 6$, $D$ is independent of $n_k$. The 2, 4, 6 point vertex functions need counter terms to all orders and the $\Phi^6$ theory is renormalizable.

• Take $d = 6$ $\Rightarrow$ $D = 6 - 2E + n_k(2k - 6) = 2[(3 - E) + n_k(k - 3)]$. For $k = 3$, $D$ is independent of $n_3$. The 1, 2, 3 point functions are divergent to all orders and the theory is renormalizable.

Remarks:

• The formula for the superficial degree of divergence can be generalised to include fermions and photons. A fermion internal line contributes $-1$ (instead of $-2$) to the power counting. Fermion number conservation must also be paid attention to, by restricting the types of vertices. If there are derivative couplings, the numerator contributes positive powers to $D$.

• As noted above, $D \ge 0$ indicates the possibility of a UV divergence. The actual diagram may be less divergent due to symmetries, e.g. $\Pi(q^2)$ in QED is only log divergent though $D = 2$ suggests a quadratic divergence.
It is also possible that the coefficient of the indicated divergence is actually zero! An example of this is the photon 3-point vertex function in QED, at 1-loop.

[Diagram: 1-loop fermion triangle with three external photons carrying indices $\lambda, \mu, \nu$.]

Here $D = 4 - 3 = 1$. However, due to Furry's theorem, in QED any photon amplitude with an odd number of external lines is zero. Hence this amplitude vanishes identically. However, what is true is that if $D < 0$ for a diagram and for each of its sub-diagrams, then the diagram has no UV divergence (Dyson-Weinberg theorem).

The superficial degree of divergence, with the caveats mentioned above, suggests a classification of theories as:

(i) Super-renormalizable: Only finitely many diagrams have $D \ge 0$. The $D$ depends on the number of vertices such that it decreases with increase in $n_k$;

(ii) Renormalizable: Only a finite number of $E$-point vertex functions have $D \ge 0$. For these functions though, $D$ is non-negative at all orders;

(iii) Non-renormalizable: All vertex functions are superficially divergent at sufficiently high orders. The $D$ increases with increase in $n_k$.

A given theory may be in different classes in different dimensions. Dimensional analysis gives another convenient criterion for renormalizability. This goes as follows.

Dimensional Analysis: In $d$ dimensions $[\Phi] = \frac{d-2}{2}$ and $[\Phi^k] = k\frac{d-2}{2}$, $\therefore\ [g_k] = d - k\frac{d-2}{2} = d\left(1 - \frac{k}{2}\right) + k$. An $E$-point vertex function can come from a term in the Lagrangian such as $g_E\Phi^E$ with $[g_E] = d\left(1 - \frac{E}{2}\right) + E$. If a diagram contributing to such a vertex function has a momentum cut-off $\Lambda$ and the number of vertices of order $k$ is $n_k$, then the divergent part is proportional to $g_k^{n_k}\Lambda^D$. For $k = E$ and $n_E = 1$, there is no loop integration and the dimension comes only from the coupling $g_E$. Hence $[g_E] = [g_k^{n_k}\Lambda^D]$, i.e.

$$d\left(1 - \frac{E}{2}\right) + E = n_k\left(d - k\,\frac{d-2}{2}\right) + D \quad\Rightarrow\quad D = d - \frac{d-2}{2}E - \underbrace{\left\{d - k\,\frac{d-2}{2}\right\}}_{[g_k]}\,n_k .$$

Thus, $D$ is independent of $n_k$ if $[g_k] = 0$ and we have finitely many vertex functions potentially divergent. If $[g_k] > 0$, then only finitely many diagrams can be divergent.
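The power counting formula and the coupling dimension $[g_k]$ above lend themselves to a quick tabulation. The following is a minimal sketch, not part of the original notes: the function names `superficial_degree`, `coupling_dimension` and `classify` are hypothetical, and the code only evaluates the scalar-field formulas quoted above (no fermions, photons or derivative couplings).

```python
from fractions import Fraction


def superficial_degree(d, E, vertices):
    """D = d - (d-2)/2 * E + sum_k n_k * (k*(d-2)/2 - d).

    d        : space-time dimension
    E        : number of external lines of the vertex function
    vertices : dict {k: n_k}, number of k-th order vertices
    """
    half = Fraction(d - 2, 2)          # field dimension [Phi]
    D = Fraction(d) - half * E
    for k, n_k in vertices.items():
        D += n_k * (k * half - d)      # each vertex contributes -[g_k]
    return D


def coupling_dimension(d, k):
    """Mass dimension [g_k] = d - k*(d-2)/2 of the coupling of Phi^k."""
    return Fraction(d) - k * Fraction(d - 2, 2)


def classify(d, k):
    """Classify g_k Phi^k in d dimensions by the sign of [g_k]."""
    dim = coupling_dimension(d, k)
    if dim > 0:
        return "super-renormalizable"
    if dim == 0:
        return "renormalizable"
    return "non-renormalizable"


if __name__ == "__main__":
    for d, k in [(4, 3), (4, 4), (4, 5), (6, 3), (3, 6)]:
        print(f"Phi^{k} in d={d}: [g_k] = {coupling_dimension(d, k)}, {classify(d, k)}")
```

The use of `Fraction` keeps half-integer field dimensions (odd $d$) exact; the classification is only the superficial one, with the caveats stated in the remarks.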
If $[g_k] < 0$, then every vertex function has a divergence at some order. Thus, the dependence on the space-time dimension $d$ can be hidden in the dimensions of the couplings and

$g_k\Phi^k$ is renormalizable if $[g_k] = 0$;
$g_k\Phi^k$ is super-renormalizable if $[g_k] > 0$;
$g_k\Phi^k$ is non-renormalizable if $[g_k] < 0$.

Having identified simple criteria for the presence of UV divergences, we see now how the counter terms are used for renormalization.

B. An Example: The $(\Phi^3)_6$ theory

As a specific example, we will consider the $(\Phi^3)_6$ theory. Here are the Feynman rules for the renormalized perturbation series: propagator $\dfrac{-i}{p^2 + m^2 - i\epsilon}$, vertex $+ig$, self-energy counter vertex $-i(\delta_\Phi p^2 + \delta_m m^2)$, vertex counter term $+ig\delta_3$.

$D = 6 - 2E \ge 0 \Rightarrow E = 2$ (quadratically divergent) and $E = 3$ (logarithmically divergent) are the only vertex functions with UV divergence. Consider the 1-loop divergences first.

[Diagrams: the 1-loop self-energy bubble with internal momenta $p + k$ and $k$, and the self-energy counter vertex.]

$$i\Pi(p^2) = \frac{1}{2}(ig)^2\int\frac{d^dk}{(2\pi)^d}\;\frac{-i}{(p+k)^2 + m^2 - i\epsilon}\;\frac{-i}{k^2 + m^2 - i\epsilon} \;+\; 1\cdot(-i)(\delta_\Phi p^2 + \delta_m m^2) + \dots$$

The factor $\frac{1}{2}$ comes from $\frac{1}{2!\,(3!)^2}\times(6\times2\times3) = \frac{1}{2}$ and the factor 1 comes from $\frac{1}{1!\,2!}\times(2\times1) = 1$. The $\delta_\Phi, \delta_m$ are to be determined from $\Pi(-m^2) = 0 = \Pi'(-m^2)$.

The first term, after using the Feynman parameters, Wick rotation and equations (17.4, 17.6, 17.7), gives $i\,\frac{g^2}{2}I(p^2)$ with

$$I(p^2) := \int_0^1 dx\int\frac{d^dk}{(2\pi)^d}\frac{1}{(k^2 + M^2)^2} \ , \qquad M^2 := x(1-x)p^2 + m^2 ,$$

where $k$ is now Euclidean, and

$$I(p^2) = \frac{\Gamma(-1 + \epsilon/2)}{(4\pi)^3}\int_0^1 dx\, M^2\,(4\pi/M^2)^{\epsilon/2} \ , \qquad \epsilon := 6 - d .$$

Put $g \to g\,\bar\mu^{\epsilon/2}$ and $\alpha := \frac{g^2}{(4\pi)^3}$. The second term is $-i(\delta_\Phi p^2 + \delta_m m^2)$.

$$\therefore\ \Pi(p^2) = -\frac{\alpha}{2}\left[\left(1 + \frac{2}{\epsilon}\right)\left(\frac{p^2}{6} + m^2\right) + \int_0^1 dx\, M^2\ln\frac{4\pi\bar\mu^2}{e^\gamma M^2}\right] - \delta_\Phi p^2 - \delta_m m^2 + o(\alpha^2) .$$

Putting $\mu := \sqrt{4\pi e^{-\gamma}}\,\bar\mu$, the logarithm becomes $\ln(\mu^2/M^2)$. We continue the Euclidean momenta back to Minkowski momenta, $p_E^2 \to p^2$. Club the first group of terms with the counter terms.
This gives the self energy as [12],

$$\Pi(p^2) = \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/\mu^2) - \left[\frac{\alpha}{6}\left(\frac{1}{\epsilon} + \frac{1}{2}\right) + \delta_\Phi\right]p^2 - \left[\alpha\left(\frac{1}{\epsilon} + \frac{1}{2}\right) + \delta_m\right]m^2 + o(\alpha^2) \qquad (19.1)$$

Now, in the first term, use $\ln(M^2/\mu^2) = \ln(M^2/m^2) + \ln(m^2/\mu^2)$ and absorb the contribution of the second term into the counter term coefficients along with the UV divergences (the $\epsilon^{-1}$ pole) by choosing,

$$\delta_\Phi := -\frac{\alpha}{6}\left[\frac{1}{\epsilon} + \frac{1}{2} + C_\Phi\right] + o(\alpha^2) \ , \qquad \delta_m := -\alpha\left[\frac{1}{\epsilon} + \frac{1}{2} + C_m\right] + o(\alpha^2) . \qquad (19.2)$$

Note that we have absorbed only the pole together with some $\mu$-dependent finite parts (the $\ln(m^2/\mu^2)$ terms); the latter are subsumed in the undetermined finite constants, so that $C_\Phi, C_m$ below denote what remains after this shift. These constants are determined by the renormalization conditions. The self-energy then takes the form,

$$\Pi(p^2) = \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/m^2) + \frac{\alpha}{6}C_\Phi\,p^2 + \alpha C_m m^2 + o(\alpha^2) . \qquad (19.3)$$

The $C$'s are now determined by imposing $\Pi(-m^2) = 0 = \Pi'(-m^2)$. This leads to,

$$C_m - \frac{C_\Phi}{6} = -\frac{1}{2}\int_0^1 dx\,\frac{M_0^2}{m^2}\ln(M_0^2/m^2) \ , \qquad \frac{M_0^2}{m^2} = 1 - x + x^2 , \qquad (19.4)$$
$$-\frac{C_\Phi}{6} = \frac{1}{2}\int_0^1 dx\, x(1-x)\left[\ln(M_0^2/m^2) + 1\right] . \qquad (19.5)$$

Notice that the $C$'s so determined are independent of the arbitrary parameter $\mu$ that entered in the dimensional regularization scheme. We have expressed the 2-point vertex function explicitly free of UV divergence and of the arbitrary $\mu$ parameter. In $\Pi(p^2)$ we had two terms with the $\epsilon^{-1}$ pole, one with coefficient $p^2$ and one with coefficient $m^2$. The two counter terms $\delta_\Phi, \delta_m$ sufficed to absorb these.

Consider now the 3-point vertex function, $\Gamma(p_1, p_2, p_3)$:

[Diagrams: $i\Gamma_3$ as the sum of the tree vertex, the 1-loop triangle with loop momentum $k$, the vertex counter term, and higher orders.]

At 1-loop we have,

$$i\,\delta\Gamma_3 = [ig\delta_3] + (ig)^3(-i)^3\int\frac{d^dk}{(2\pi)^d}\;\frac{1}{k^2 + m^2 - i\epsilon}\;\frac{1}{(p_2 + k)^2 + m^2 - i\epsilon}\;\frac{1}{(k - p_1)^2 + m^2 - i\epsilon} .$$

In the second term, Feynman parameterization, shifting the momentum and Wick rotation give,

$$[\dots] = 2\int dx\,dy\,dz\,\frac{\delta(1 - x - y - z)}{(k^2 + M^2)^3} \ , \qquad M^2 := zx\,p_1^2 + zy\,p_2^2 + xy\,p_3^2 + m^2 .$$

As before, replacing $g \to g\bar\mu^{\epsilon/2}$, $\epsilon = 6 - d$, and using (17.7) gives,

$$\int\frac{d^dk}{(2\pi)^d}\frac{1}{(k^2 + M^2)^3} = \frac{\Gamma(3 - d/2)}{2(4\pi)^{d/2}}\,(M^2)^{-(3 - d/2)} .$$

Hence, putting $\alpha := g^2/(4\pi)^3$, $\mu^2 := 4\pi e^{-\gamma}\bar\mu^2$,

$$\frac{\delta\Gamma(p_1, p_2, p_3)}{g} = \delta_g + \frac{\alpha}{2}\,\Gamma(\epsilon/2)\;2\int dx\,dy\,dz\,\delta(1 - x - y - z)\left(\frac{4\pi\bar\mu^2}{M^2}\right)^{\epsilon/2}$$
$$\phantom{\frac{\delta\Gamma}{g}} = \left\{\frac{\alpha}{\epsilon} + \delta_g\right\} - \alpha\int dx\,dy\,dz\,\delta(1 - x - y - z)\ln(M^2/m^2) + o(\alpha^2) , \qquad (19.6)$$

where the finite $\ln(m^2/\mu^2)$ piece is again understood to be subsumed in the finite constant introduced next. Choosing

$$\delta_g := -\frac{\alpha}{\epsilon} - \alpha C_g + o(\alpha^2) \quad\Rightarrow \qquad (19.7)$$
$$\Gamma(p_1, p_2, p_3) = g\left[1 - \alpha\int dx\,dy\,dz\,\delta(1 - x - y - z)\ln(M^2/m^2) - \alpha C_g + o(\alpha^2)\right] . \qquad (19.8)$$

As before, we have obtained a completely finite and $\mu$-independent vertex function with one undetermined constant $C_g$. This is to be fixed by a suitable renormalization condition. What condition do we choose?

In QED we had a natural choice, $\Gamma^\mu(q^2 \to 0) = e_{ph}$. Here however, there is no natural choice. Any cross-section will involve both $g$ and $C_g$ and the value of a cross-section gives one condition. A convenient choice is $\Gamma_3(0, 0, 0) = g \leftrightarrow C_g = 0$.

Thus, at the 1-loop level, we see how the counter terms absorb the divergence and how the renormalization conditions determine the constants $C$. Note that we could have added any finite constant to the pole in defining the counter terms. The renormalization conditions would then give different expressions for the constants.

1. At 2-Loops

A natural question is: How does this work at higher orders? In our $(\Phi^3)_6$ theory, only the 2-point and 3-point vertex functions are divergent. At 2-loops the contributing diagrams are:

[Diagrams: the 2-loop contributions to the 2-point and the 3-point vertex functions, arranged in groups I, II, III, IV. They include diagrams with the 1-loop counter vertices $(\delta_\phi)_1$, $(\delta_m)_1$, $(\delta_3)_1$ inserted, and the 2-loop counter vertices $(\delta_\phi)_2$, $(\delta_m)_2$.]

The counter terms have the same form as the original bare terms. When they are clubbed together into the 'interaction' Lagrangian or Hamiltonian, one would expect that from the $H_{int}^k/k!$ terms we would get the counter vertices also appearing $k$ times. For example, at $o(g^2)$, we would have [diagram] and [diagram] vertices. In vertex functions, the 1PR diagrams from the self-energy counter terms are omitted.
Secondly, the coefficients of the counter terms are explicitly instructed to be adjusted to enforce the renormalization conditions at any given order in $g$. Hence, the counter vertices are not on par with the elementary vertices. The order of perturbation is determined by the number of elementary vertices and not by the counter vertices.

In the first of the group of diagrams, we have a subdiagram that is divergent (only $|k_1| \to \infty$, $k_2$ remains finite). This divergence is absorbed by the 1-loop counter vertices, the second and the third diagrams of group I. To absorb the divergence when both the $k_1, k_2$ loop momenta become large, we need the new 2-loop counter term as shown in III.

The group II exhibits the so called "overlapping divergences": the vertical line's momentum diverges when either of the loop momenta diverges. There are two subdiagrams which are divergent and to absorb these, we need the $\delta_3$ 1-loop counter term. The group III counter terms are needed to absorb the divergence coming from both momenta becoming large.

At higher loops, the procedure is thus recursive. The counter vertices themselves have an expansion in $g^2$ with coefficients absorbing divergences from the subdiagrams (lower loop orders). The renormalization conditions determine the highest loop order coefficients.

This explains how the divergences arise and how they are absorbed systematically via the counter terms and the renormalization conditions. We have taken for illustration the on-shell renormalization condition. This method fails when the physical masses vanish and the procedure needs to be adapted suitably. This impacts what exactly is subtracted: (divergent part) or (divergent part + a finite piece)? This leads to different schemes of renormalization. Let us see how the problem arises and how different renormalization conditions can be chosen. We continue with the $(\Phi^3)_6$ theory.

C. Renormalization with massless particles

We have the self energy in equation (19.1). Taking the derivative w.r.t. $p^2$ gives,

$$\Pi'(p^2) = -\left[\frac{\alpha}{6}\left(\frac{1}{\epsilon} + \frac{1}{2}\right) + \delta_\Phi\right] + \frac{\alpha}{2}\int_0^1 dx\, x(1-x)\left\{\ln(M^2/\mu^2) + 1\right\} + o(\alpha^2) .$$

As $m \to 0$, $M^2 \to x(1-x)p^2$. Then $\Pi(0) = 0$ identically while $\Pi'(0)$ is ill-defined (it depends on how the limit $p^2, m^2 \to 0$ is taken). Thus the previous renormalization conditions cannot be used and the counter term coefficients would remain undetermined!

We noted, while discussing the Kallen-Lehmann representation, that the physical spectrum was assumed to have a mass gap, i.e. the single particle pole being separated from the multi-particle branch point. Whenever this is violated, the above on-shell renormalization procedure fails.

All is not lost though. We can try different renormalization conditions. There are two commonly used subtraction schemes within dimensional regularization, the so called Minimal Subtraction scheme (MS) and the Modified Minimal Subtraction scheme ($\overline{MS}$). They are defined as: MS $\leftrightarrow$ define the counter term constants by absorbing only the pole(s); while the $\overline{MS}$ scheme uses $\mu^2 := 4\pi e^{-\gamma}\bar\mu^2$ in the MS scheme. With these definitions, we give the finite self-energy in the three schemes:

$$\Pi_{on-shell}(p^2) = -\frac{\alpha}{12}(p^2 + m^2) + \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/M_0^2) + o(\alpha^2), \qquad (19.9)$$

where $M^2 = x(1-x)p^2 + m^2$ and $M_0^2 := M^2(p^2 = -m^2)$,

$$\Pi_{MS}(p^2) = -\frac{\alpha}{12}(p^2 + 6m^2) + \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/\bar\mu^2) + o(\alpha^2) \qquad (19.10)$$
$$\Pi_{\overline{MS}}(p^2) = -\frac{\alpha}{12}(p^2 + 6m^2) + \frac{\alpha}{2}\int_0^1 dx\, M^2\ln(M^2/\mu^2) + o(\alpha^2) \qquad (19.11)$$

The last two follow from (19.1) on choosing $\delta_\Phi = -\alpha/(6\epsilon)$ and $\delta_m = -\alpha/\epsilon$, which leaves the finite terms $-\frac{\alpha}{12}p^2 - \frac{\alpha}{2}m^2$.

The $\Pi_{on-shell}$ is ill-defined as $m \to 0$, as noted above. For non-zero $m$ though, it is unambiguous, free of the arbitrary scale $\mu$, and $m$ is also the physical mass. By contrast, $\Pi_{\overline{MS}}$ is well defined as $m \to 0$, but it depends on the mathematical artifact parameter $\mu$! Now the $m$ parameter cannot be the pole of the propagator and hence is not the physical mass. The physical mass is determined from

$$\left[p^2 + m^2 - \Pi(p^2)\right]_{p^2 = -m_{ph}^2} = 0 \ \leftrightarrow\ m_{ph}^2 = m^2 - \Pi_{\overline{MS}}(-m_{ph}^2) .$$

Since $\Pi$ is of order $\alpha$, we can write, to order $\alpha$, $m_{ph}^2 = m^2 - \Pi_{\overline{MS}}(-m^2)$.
Substitution gives,

$$m_{ph}^2 = m^2 + \frac{\alpha}{12}m^2(6 - 1) - \frac{\alpha}{2}\int_0^1 dx\, m^2(1 - x + x^2)\ln\!\big((1 - x + x^2)m^2/\mu^2\big)$$
$$\text{or}\quad m_{ph}^2 = m^2\left[1 + \frac{5}{12}\alpha\left(\ln\frac{\mu^2}{m^2} + C'\right) + o(\alpha^2)\right] , \qquad C' = \frac{34 - 3\pi\sqrt{3}}{15} \approx 1.18 \qquad (19.12)$$

Observe that the physical mass explicitly depends on $\mu$, which it should not. Hence $\frac{d}{d\mu}m_{ph}^2(m^2, \alpha, \mu) = 0$ must hold. Clearly this would be possible if $\alpha$ and/or $m$ also have $\mu$ dependence. The independence of the physical mass then relates the $\mu$ dependences of $m$ and $\alpha$.

The residue $R$ at the physical pole is, as before,

$$R^{-1} = \frac{d}{dp^2}\left[p^2 + m^2 - \Pi_{\overline{MS}}(p^2)\right]_{p^2 = -m_{ph}^2} = 1 - \Pi'_{\overline{MS}}(-m_{ph}^2) = 1 - \Pi'_{\overline{MS}}(-m^2) + o(\alpha^2) .$$

This evaluates to,

$$R^{-1} = 1 + \frac{\alpha}{12}\left[\ln(\mu^2/m^2) + C''\right] , \qquad C'' = \frac{17 - 3\pi\sqrt{3}}{3} \approx 0.23 . \qquad (19.13)$$

The residue gives the wave function renormalization constant, $Z_\Phi = R$.

Finally, in the vertex function, we choose $\delta_g = -\alpha/\epsilon$ in the $\overline{MS}$ scheme. This leads to,

$$\Gamma_{3,\overline{MS}}(p_1, p_2, p_3) = g\left[1 - \alpha\int dx\,dy\,dz\,\delta(1 - x - y - z)\ln(M^2/\mu^2) + o(\alpha^2)\right] ,$$
$$\text{where}\quad M^2 = xy\,p_1^2 + yz\,p_2^2 + zx\,p_3^2 + m^2 . \qquad (19.14)$$

To see a possible $\mu$-dependence of the coupling $\alpha$, we need to identify a physical quantity which has explicit $\mu$ dependence. As discussed by Srednicki [12], we consider the two particle going to two particle process, which is related to the 4-point vertex function, which is UV finite. In the limit $m \to 0$, this process too suffers from an IR divergence which is handled as in QED: carefully considering what is observed and including the soft particles contribution. From [12], we borrow the formulae below. The amplitude for the process $p_1 + p_2 \to p_1' + p_2'$ is depicted below to order $\alpha$.

[Figure: (a) the 2 → 2 process with incoming $p_1, p_2$ and outgoing $p_1', p_2'$; (b) the same process with additional collinear particles (dotted lines) attached to the external lines.]

The contribution of (a), in the high energy limit, is given by,

$$\mathcal{M} = \mathcal{M}_0\left[1 - \frac{11}{12}\alpha\left(\ln(s/m^2) + o(m^0)\right) + o(\alpha^2)\right] \quad\text{with,}$$
$$\mathcal{M}_0 = -g^2\left(\frac{1}{s} + \frac{1}{t} + \frac{1}{u}\right) , \qquad s := -(p_1 + p_2)^2 ,\ t := -(p_1' - p_1)^2 ,\ u := -(p_2' - p_1)^2 .$$

The $o(m^0)$ term is free of the $\ln(m \to 0)$ singularity. The amplitude $\mathcal{M}$ is divergent as $m \to 0$ (see equation (26.1) of Srednicki [12]).
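The constants $C'$ and $C''$ quoted in (19.12) and (19.13) can be cross-checked numerically. The sketch below is an illustration added here, not taken from the notes or from [12]. It assumes the reduction of the two constants to Feynman-parameter integrals that follows from $m_{ph}^2 = m^2 - \Pi_{\overline{MS}}(-m^2)$ and $R^{-1} = 1 - \Pi'_{\overline{MS}}(-m^2)$, namely $C' = 1 - \frac{6}{5}\int_0^1 dx\,(1-x+x^2)\ln(1-x+x^2)$ and $C'' = -6\int_0^1 dx\,x(1-x)\ln(1-x+x^2)$. The helper names (`simpson`, `d0`) are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def d0(x):
    """M_0^2 / m^2 = 1 - x + x^2, the Feynman-parameter polynomial at p^2 = -m^2."""
    return 1 - x + x * x


# C' = 1 - (6/5) * integral of D0 ln D0 (assumed reduction, see lead-in)
c_prime_numeric = 1 - 1.2 * simpson(lambda x: d0(x) * math.log(d0(x)), 0.0, 1.0)
# C'' = -6 * integral of x(1-x) ln D0 (assumed reduction, see lead-in)
c_dprime_numeric = -6 * simpson(lambda x: x * (1 - x) * math.log(d0(x)), 0.0, 1.0)

# Closed forms quoted in (19.12) and (19.13)
c_prime_closed = (34 - 3 * math.pi * math.sqrt(3)) / 15
c_dprime_closed = (17 - 3 * math.pi * math.sqrt(3)) / 3

if __name__ == "__main__":
    print("C'  numeric, closed:", c_prime_numeric, c_prime_closed)
    print("C'' numeric, closed:", c_dprime_numeric, c_dprime_closed)
```

Agreement of the quadrature with the closed forms also confirms the coefficient $\frac{5}{12}$ of the logarithm, since $\int_0^1 dx\,(1 - x + x^2) = \frac{5}{6}$.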
In the limit $m \to 0$, the above process is experimentally indistinguishable from the one in which there are on-shell, collinear particles associated with any/all external lines [14]. This is shown in (b) above, with the dotted lines denoting the collinear particles. We must integrate the squared amplitude over the momenta that are collinear within the detector's angular resolution.

[14] For massless particles, $(\sum_i p_i)^2 = \sum_{i\neq j} p_i\cdot p_j = \sum_{ij}|p_i||p_j|(-1 + \cos\theta_{ij})$, which vanishes if all particles are collinear, $\theta_{ij} = 0$.

For the observable cross-section with collinear emission included (from all 4 external lines), we have

$$|\mathcal{M}|^2_{obs} = |\mathcal{M}|^2\left[1 + \frac{4\alpha}{12}\left(\ln\left(\frac{\Delta^2\vec{k}^2}{m^2}\right) + C\right) + \dots\right] , \qquad C := 4 - 3\sqrt{3}\pi ,\quad \vec{k}^2 = s/4 ,$$
$$\phantom{|\mathcal{M}|^2_{obs}} = |\mathcal{M}_0|^2\left[1 - \frac{11}{6}\alpha\ln(s/m^2)\right]\times\left[1 + \frac{\alpha}{3}\left(\ln(\Delta^2 s/(4m^2)) + C\right)\right] + \dots$$
$$\phantom{|\mathcal{M}|^2_{obs}} = |\mathcal{M}_0|^2\left[1 - \alpha\left(\frac{3}{2}\ln(s/m^2) + \frac{1}{3}\ln(1/\Delta^2) + o(m^0)\right) + o(\alpha^2)\right] \qquad \text{(26.15 of [12])}$$

The $\Delta$ is related to the detector angular resolution. In going to the second equation, we have included collinear emission from all 4 lines. There are no UV divergences here.

The above formulae implicitly assume the on-shell renormalization scheme, where we cannot take the $m \to 0$ limit. The same computation of the amplitude can be done in the $\overline{MS}$ scheme. This changes the amplitude $\mathcal{M}$ by changing $\ln(s/m^2) \to \ln(s/\mu^2)$. The correction due to the inclusion of collinear emission continues to have $\ln(\Delta^2 s/(4m^2))$. Additionally, the residue at the physical pole is not 1 but $R$ and hence the amplitude is multiplied by $R^2$. Including these changes, the observed squared amplitude is given by,

$$|\mathcal{M}|^2_{obs} = |\mathcal{M}_0|^2\left[1 - \frac{11}{6}\alpha\ln(s/\mu^2)\right]\left[1 + \frac{\alpha}{3}\ln(\Delta^2 s/m^2)\right]\left[1 - \frac{\alpha}{3}\ln(\mu^2/m^2)\right]$$
$$\phantom{|\mathcal{M}|^2_{obs}} = |\mathcal{M}_0|^2\left[1 - \alpha\left(\frac{11}{6}\ln(s/\mu^2) - \frac{1}{3}\ln(\Delta^2 s/m^2) + \frac{1}{3}\ln(\mu^2/m^2)\right) + \dots\right]$$
$$\therefore\ |\mathcal{M}|^2_{obs} = |\mathcal{M}_0|^2\left[1 - \alpha\left(\frac{3}{2}\ln(s/\mu^2) + \frac{1}{3}\ln(1/\Delta^2) + o(m^0)\right) + o(\alpha^2)\right] \qquad (19.15)$$

There is no dependence on $m$ now and the limit $m \to 0$ can be taken. However, there is explicit $\mu$ dependence and the observed cross-section cannot depend on the arbitrary $\mu$.
Hence $\frac{d}{d\mu}|\mathcal{M}|^2_{obs} = 0$ must hold. Writing $\ln|\mathcal{M}|^2_{obs} := \ln|\mathcal{M}_0|^2 + 3\alpha[\ln(\mu) + C_2] =: C_1 + 2\ln\alpha + 3\alpha\ln(\mu) + 3\alpha C_2$, where $C_1, C_2$ are functions of $s, t, u$ but independent of $\alpha$ and $\mu$, the condition of $\mu$-independence gives,

$$0 = \left[\frac{2}{\alpha} + 3(C_2 + \ln(\mu))\right]\frac{d\alpha}{d\ln(\mu)} + 3\alpha \quad\Rightarrow \qquad (19.16)$$
$$\beta(\alpha) := \frac{d\alpha}{d\ln(\mu)} = -\frac{3}{2}\alpha^2 + o(\alpha^3) . \qquad (19.17)$$

We also recall,

$$0 = \frac{dm_{ph}^2}{d\ln(\mu)} \quad\Rightarrow \qquad (19.18)$$
$$\gamma_m(\alpha) := \frac{d\ln(m)}{d\ln(\mu)} = -\frac{5}{12}\alpha + o(\alpha^2) , \qquad (19.19)$$

where the coefficient follows from differentiating (19.12) with respect to $\ln(\mu)$ at order $\alpha$.

The $\gamma_m$ is called the anomalous dimension of the mass parameter and $\beta(\alpha)$ is the famous "beta" function of the theory. We have just obtained these functions at 1-loop for the $(\Phi^3)_6$ theory, and these govern how the renormalized parameters must change with the "renormalization scale" $\mu$ in order that the physical quantities are independent of the renormalization scale. If we identify $\mu = \sqrt{s}$, then we have the running of the renormalized parameters with the center-of-mass energy scale. The differential equations can be trivially solved, giving the running as,

$$\mu = \hat\mu e^t , \qquad \alpha(t) = \frac{\hat\alpha}{1 + \frac{3}{2}\hat\alpha\ln(\mu/\hat\mu)} , \qquad m(t) = \hat m\left[1 + \frac{3}{2}\hat\alpha\,t\right]^{-5/18} . \qquad (19.20)$$

As $\mu \sim \sqrt{s} \gg \hat\mu$, the coupling decreases. Such a theory is said to be asymptotically free.

To summarize: We have seen how the introduction of counter terms makes it possible to absorb the UV divergences into unobservable coefficients such as $\delta_\Phi, \delta_m, \delta_g$. This process of absorption or 'subtraction' has an inherent ambiguity: exactly what is subtracted. This is determined either implicitly, by imposition of a set of renormalization conditions (e.g. the on-shell renormalization), or explicitly (e.g. the MS/$\overline{MS}$ scheme). When the pole (in $p^2$) in the self-energy is not a simple pole, which happens with a massless particle ($m_{ph} = 0$), alternative subtraction schemes are needed. Such schemes typically imply renormalization scale dependence in the renormalized parameters such as masses and couplings. This is governed by the beta function(s) and the anomalous dimension(s).
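The running in (19.20) can be checked by integrating the 1-loop equations directly. The sketch below is an added illustration under stated assumptions: it uses $\beta(\alpha) = -\frac{3}{2}\alpha^2$ and $\gamma_m(\alpha) = -\frac{5}{12}\alpha$ with all higher orders dropped, a hand-written fourth-order Runge-Kutta stepper, and hypothetical names (`run`, `alpha_closed`, `mass_closed`); the starting values $\hat\alpha = 0.1$, $\hat m = 1$ are arbitrary.

```python
import math


def beta(alpha):
    """1-loop beta function of the (Phi^3)_6 theory: d(alpha)/d ln(mu)."""
    return -1.5 * alpha ** 2


def gamma_m(alpha):
    """1-loop anomalous mass dimension: d ln(m)/d ln(mu)."""
    return -5.0 / 12.0 * alpha


def run(alpha_hat, m_hat, t_final, steps=1000):
    """RK4 integration of the coupled flow in t = ln(mu / mu_hat)."""
    h = t_final / steps
    alpha, log_m = alpha_hat, math.log(m_hat)
    for _ in range(steps):
        # Stage values of alpha; ln(m) does not feed back into the flow.
        a1 = alpha
        a2 = alpha + 0.5 * h * beta(a1)
        a3 = alpha + 0.5 * h * beta(a2)
        a4 = alpha + h * beta(a3)
        log_m += h * (gamma_m(a1) + 2 * gamma_m(a2) + 2 * gamma_m(a3) + gamma_m(a4)) / 6
        alpha += h * (beta(a1) + 2 * beta(a2) + 2 * beta(a3) + beta(a4)) / 6
    return alpha, math.exp(log_m)


def alpha_closed(alpha_hat, t):
    """Closed-form running coupling of (19.20)."""
    return alpha_hat / (1 + 1.5 * alpha_hat * t)


def mass_closed(alpha_hat, m_hat, t):
    """Closed-form running mass of (19.20)."""
    return m_hat * (1 + 1.5 * alpha_hat * t) ** (-5.0 / 18.0)


if __name__ == "__main__":
    a, m = run(0.1, 1.0, 5.0)
    print(a, alpha_closed(0.1, 5.0))
    print(m, mass_closed(0.1, 1.0, 5.0))
```

The exponent $-5/18$ is the ratio of the two 1-loop coefficients, $\frac{5}{12}\big/\frac{3}{2}$, which the numerical flow reproduces.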
For the sub-class of asymptotically free theories, this helps in improving perturbative predictions at high energies. A more general discussion of the renormalization group and its application is given in subsection (22 B).

20. PATH INTEGRALS IN QUANTUM MECHANICS

We consider an alternative strategy to view quantum dynamics. Recall that in a quantum framework, we have a (projective) Hilbert space (or more generally a density operator on a Hilbert space) that encodes the kinematics, while a family of unitary operators encodes the dynamics. Observable quantities are provided by operators whose expectation values and uncertainties in a given state (or density operator) provide numbers to be matched against experiments. To track the time evolution of observable quantities, the central quantity of interest is the transition probability amplitude for a state $|\Psi\rangle$ at time $t$ to make a transition at $t' > t$ to another state $|\Psi'\rangle$. The state $|\Psi\rangle$ evolves to $e^{-i(t'-t)H}|\Psi\rangle$ by the Schrodinger equation ($H$ is taken time independent for simplicity). Its inner product with $|\Psi'\rangle$ gives the probability amplitude. This is denoted as:

$$\text{Transition probability amplitude} := \langle\Psi'|e^{-i(t'-t)H}|\Psi\rangle .$$

Assuming the quantum system to be describing a particle with a configuration space $\{\vec{q}\}$, we can use the position representation completeness relation and express the amplitude as,

$$\langle\Psi'|e^{-i(t'-t)H}|\Psi\rangle = \int d\vec{q}\,'\int d\vec{q}\;\langle\Psi'|\vec{q}\,'\rangle\langle\vec{q}\,'|e^{-i(t'-t)H}|\vec{q}\rangle\langle\vec{q}|\Psi\rangle =: \int d\vec{q}\,'\int d\vec{q}\;\Psi'^*(\vec{q}\,')\,K(\vec{q}\,', t'; \vec{q}, t)\,\Psi(\vec{q}) .$$

The idea is to focus on the kernel $K$ and get a convenient representation for it.

Note: We have already presumed the usual framework of quantum theory. It is possible to develop the ideas ab initio, taking the kernel as a central quantity with a proposed form, and arrive at the usual quantum framework. The book of Feynman and Hibbs follows this line. We will first sketch the heuristic approach and then relate it to the usual quantum framework.

A. The Ab Initio Path Integral

Consider a classical system with a configuration space $Q$ which, for convenience of notation, we take to be $\mathbb{R}$. Let $q_1, q_2$ be two points of it. The particle is supposed to make a transition from $q_1$ at $t_1$ to $q_2$ at $t_2$. Such a transition takes place classically as well; we use it in the Lagrangian framework and deduce that the transition takes place along a curve $q(t)$ which extremises the action. In a departure from the classical dynamics, it is proposed that there is a certain probability for the transition $(q_1, t_1) \to (q_2, t_2)$. This is to be computed by squaring the total probability amplitude obtained by summing over the probability amplitudes for every path connecting $(q_1, t_1)$ and $(q_2, t_2)$. The amplitude for each path is given as a certain phase. Thus, heuristically, $K(q_2, t_2; q_1, t_1) \sim \sum_{paths} e^{i\varphi[q(t)]}$.

Questions: What do we choose for the phase? How do we restrict the paths? How is the 'sum' to be performed?

• We expect to recover classical dynamics in the limit $\hbar \to 0$. So we expect that the phase must be such that the amplitude is dominated by a single path corresponding to the classical transition. The obvious choice is $\varphi[q(t)] \sim S[q(t)]/\hbar$!

• A first guess about the paths would be smooth paths, but this turns out to be not enough. Since the action involves derivatives of $q(t)$, we expect at least piecewise differentiability. But even this may not suffice; we can define derivatives as differences. It seems continuity suffices. Without worrying about this too much, let us proceed by discretizing the time interval. This will also lead us to answer the third question.

Let $T := t_2 - t_1 := N\epsilon$, where $N$ is a large number eventually to be taken to infinity. For definiteness, let the action be

$$S = \int_0^T dt\left[\frac{m}{2}\dot{q}^2 - V(q)\right] \ \to\ \sum_{k=0}^{N-1}\epsilon\left[\frac{m}{2}\frac{(q_{k+1} - q_k)^2}{\epsilon^2} - V\!\left(\frac{q_{k+1} + q_k}{2}\right)\right] , \qquad q_k := q(t_k) .$$

For uniformity of notation, denote $q_0 =: q_{initial}$, $q_N =: q_{final}$. We have used the natural discretization of the action.
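The discretized action above can be evaluated on explicit paths. The sketch below is an added illustration: it assumes a harmonic potential $V(q) = \frac{1}{2}m\omega^2q^2$ (the notes keep $V$ general), compares the discretized action on the classical path with the known closed-form classical action, and shows that a perturbed path with the same end points has a larger action when $T < \pi/\omega$. All function names and the numerical values are hypothetical choices.

```python
import math


def discretized_action(path, T, m, V):
    """Sum over k of eps * [ (m/2) ((q_{k+1}-q_k)/eps)^2 - V((q_{k+1}+q_k)/2) ]."""
    N = len(path) - 1
    eps = T / N
    S = 0.0
    for k in range(N):
        dq = path[k + 1] - path[k]
        q_mid = 0.5 * (path[k + 1] + path[k])   # midpoint rule for the potential
        S += eps * (0.5 * m * (dq / eps) ** 2 - V(q_mid))
    return S


def classical_path(q1, q2, T, omega, N):
    """Harmonic oscillator solution with q(0) = q1, q(T) = q2, sampled at N+1 points."""
    s = math.sin(omega * T)
    return [(q1 * math.sin(omega * (T - k * T / N)) + q2 * math.sin(omega * k * T / N)) / s
            for k in range(N + 1)]


def classical_action(q1, q2, T, m, omega):
    """Closed-form classical action of the harmonic oscillator."""
    return m * omega / (2 * math.sin(omega * T)) * (
        (q1 ** 2 + q2 ** 2) * math.cos(omega * T) - 2 * q1 * q2)


# Illustrative parameters (assumptions, not from the notes)
m, omega, T, q1, q2, N = 1.0, 1.0, 1.0, 0.3, -0.5, 2000


def potential(q):
    return 0.5 * m * omega ** 2 * q ** 2


path_cl = classical_path(q1, q2, T, omega, N)
# Same end points, deformed in between by a small sine bump
path_bumped = [q + 0.1 * math.sin(math.pi * k / N) for k, q in enumerate(path_cl)]

S_discrete = discretized_action(path_cl, T, m, potential)
S_exact = classical_action(q1, q2, T, m, omega)
S_bumped = discretized_action(path_bumped, T, m, potential)

if __name__ == "__main__":
    print(S_discrete, S_exact, S_bumped)
```

The discretization error of the sum falls off as $\epsilon^2$ for a smooth path, which is why a moderate $N$ already reproduces the continuum value closely.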
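The space of paths is now described by the $N - 1$ $q_k$'s varying independently over $\mathbb{R}$. The integration measure may be taken as the product measure $\prod_{k=1}^{N-1}\frac{dq_k}{C(\epsilon)}$, where the constant $C(\epsilon)$ is left free and will be chosen shortly. Since $N$ is arbitrary, we define a path integral as:

$$K(q_{in}, q_f; T) := \lim_{\epsilon\to0}\frac{1}{C(\epsilon)}\prod_{k=1}^{N-1}\int_{-\infty}^{\infty}\frac{dq_k}{C(\epsilon)}\;\exp\left\{\frac{i}{\hbar}\,\epsilon\sum_{k=0}^{N-1}\left[\frac{m}{2}\frac{(q_{k+1} - q_k)^2}{\epsilon^2} - V\!\left(\frac{q_{k+1} + q_k}{2}\right)\right]\right\} \qquad (20.1)$$

We can rewrite it by separating the $k = (N-1)$th integral as,

$$K(q_{in}, q_f; T) = \int_{-\infty}^{\infty}\frac{dq'}{C(\epsilon)}\;\exp\left\{\frac{i}{\hbar}\,\epsilon\left[\frac{m}{2}\frac{(q_f - q')^2}{\epsilon^2} - V\!\left(\frac{q_f + q'}{2}\right)\right]\right\}\cdot K(q_{in}, q'; T - \epsilon) .$$

As $\epsilon \to 0$, the rapid oscillations of the $\frac{1}{\epsilon}$ term imply that the dominant contribution to the integral comes from $q'$ very close to $q_f$. Therefore, Taylor expanding the potential and $K(q_{in}, q'; T - \epsilon)$ about $q_f$, we get

$$K(q_{in}, q_f; T) \approx \int_{-\infty}^{\infty}\frac{dq'}{C(\epsilon)}\;e^{\frac{i}{\hbar}\frac{m}{2\epsilon}(q_f - q')^2}\cdot\left\{1 + \frac{-i\epsilon}{\hbar}V(q_f) + \frac{1}{2}\left(\frac{-i\epsilon}{\hbar}\right)^2V_f^2 + \dots\right\}$$
$$\times\left\{1 + (q' - q_f)\partial_{q'} + \frac{(q' - q_f)^2}{2!}\partial^2_{q'} + \dots\right\}K(q_{in}, q'; T - \epsilon)\Big|_{q' = q_f} .$$

Notice that the first braces are independent of $q'$ ($V(q)$ is assumed to be reasonably well behaved). The second braces are a power series in $(q' - q_f)$.

$$K(q_{in}, q_f; T) \approx e^{-\frac{i\epsilon}{\hbar}V(q_f)}\int_{-\infty}^{\infty}\frac{dx}{C(\epsilon)}\;e^{\frac{i}{\hbar}\frac{m}{2\epsilon}x^2}\cdot\left\{1 + x\,\partial_{q_f} + \frac{x^2}{2!}\partial^2_{q_f} + \dots\right\}K(q_{in}, q_f; T - \epsilon) .$$

In effect, we have a series of Gaussian integrals of the form $\int_{-\infty}^{\infty}dx\,x^n\,e^{\frac{i}{\hbar}\frac{m}{2\epsilon}x^2}$. The Gaussian integrals are all well known,

$$\int_{-\infty}^{\infty}dx\,e^{-ax^2} = \sqrt{\frac{\pi}{a}} \ , \qquad \int_{-\infty}^{\infty}dx\,x^{2k}e^{-ax^2} = \frac{\Gamma(k + 1/2)}{a^{k+1/2}} \ , \qquad \int_{-\infty}^{\infty}dx\,x^{2k+1}e^{-ax^2} = 0 .$$

For us, $a = \frac{-im}{2\hbar\epsilon}$. For this to be defined, we have to define the usual Gaussian integral for $a > 0$ and continue analytically in the complex $a$-plane with the real part of $a$ being positive. Hence, we must put $m \to m + i\varepsilon$ to provide the convergence factor. Thus,

$$K(q_{in}, q_f; T) \approx \frac{1}{C(\epsilon)}\sqrt{\frac{\pi}{-im/(2\hbar\epsilon)}}\times\left\{1 - \frac{i\epsilon}{\hbar}V(q_f) + \frac{i\hbar\epsilon}{2m}\partial^2_{q_f} + \dots\right\}K(q_{in}, q_f; T - \epsilon) .$$

The terms in the braces are regular as $\epsilon \to 0$ and hence the right hand side can be arranged to have a regular limit as $\epsilon \to 0$ by choosing $C(\epsilon) = \sqrt{\frac{2\pi\hbar\epsilon}{-im}}$. Additionally, we also get,

$$K(q_{in}, q_f; T) - K(q_{in}, q_f; T - \epsilon) = \epsilon\left\{\frac{i\hbar}{2m}\partial^2_{q_f} - \frac{i}{\hbar}V(q_f)\right\}K(q_{in}, q_f; T) \qquad (20.2)$$
$$\text{or}\quad i\hbar\,\partial_TK(q_{in}, q_f; T) = \left\{-\frac{\hbar^2}{2m}\partial^2_{q_f} + V\right\}K(q_{in}, q_f; T) \qquad (20.3)$$

Thus, the $K$ defined above satisfies the time dependent Schrodinger equation!

What about the initial condition? To see the limit $T \to 0$, take $N = 1$ ($T = \epsilon$). Then there is no $\int dq$. Only the $k = 0$ term in the exponent survives and we get,

$$K(q_{in}, q_f; \epsilon \to 0) = \lim_{\epsilon\to0}\frac{1}{C(\epsilon)}\,e^{\frac{i}{\hbar}\epsilon\left[\frac{m}{2}\frac{(q_f - q_{in})^2}{\epsilon^2} - V((q_f + q_{in})/2)\right]} \ , \qquad C(\epsilon) = \sqrt{\frac{2\pi\hbar\epsilon}{-im}} .$$

The right hand side is just a representation of $\delta(q_f - q_{in})$. The usual quantum framework definition of the transition amplitude satisfies the time dependent Schrodinger equation with the same initial condition! So we have a strong hint that the $K(q_{in}, q_f; T)$ defined above may indeed be identified with $\langle q_f|e^{-iTH/\hbar}|q_{in}\rangle$. We can see this directly as well.

B. Derivation From Transition Amplitude

Divide the interval $[0, T]$ into $N$ intervals of size $\epsilon$ each, $N\epsilon = T$. Since the Hamiltonian is assumed to be time independent, we can write,

$$e^{-\frac{i}{\hbar}TH} = e^{-\frac{i}{\hbar}\sum_{k=0}^{N-1}H(t_{k+1} - t_k)} = \prod_{k=0}^{N-1}e^{-\frac{i}{\hbar}(t_{k+1} - t_k)H} \approx \prod_{k=0}^{N-1}\left[1 - \frac{i\epsilon}{\hbar}H + \dots\right] .$$

Insert the completeness relation $\mathbb{1} = \int d\vec{q}_k\,|\vec{q}_k\rangle\langle\vec{q}_k|$ between each factor so that

$$\langle\vec{q}_f|e^{-\frac{iTH}{\hbar}}|\vec{q}_{in}\rangle = \int d\vec{q}_1\dots d\vec{q}_{N-1}\prod_{k=0}^{N-1}\langle\vec{q}_{k+1}|e^{-\frac{i\epsilon H}{\hbar}}|\vec{q}_k\rangle \ , \qquad \vec{q}_0 =: \vec{q}_{in} ,\ \vec{q}_N =: \vec{q}_f ,$$
$$e^{-\frac{i\epsilon H}{\hbar}} \approx 1 - \frac{i\epsilon}{\hbar}H + \dots \quad\text{for small } \epsilon .$$

To simplify further, we consider types of Hamiltonian operators (we suppress the vector arrows).

(i) $H(\hat{q}, \hat{p}) = f(\hat{p}) + g(\hat{q})$. Then

$$\langle q_{k+1}|f(\hat{p})|q_k\rangle = \int dp_k\,\langle q_{k+1}|f(\hat{p})|p_k\rangle\langle p_k|q_k\rangle = \int dp_k\,f(p_k)\,\langle q_{k+1}|p_k\rangle\langle p_k|q_k\rangle . \qquad \text{Use: } \langle p_k|q_k\rangle = \frac{e^{-\frac{i}{\hbar}\vec{q}_k\cdot\vec{p}_k}}{\sqrt{2\pi\hbar}}$$
$$\therefore\ \langle q_{k+1}|f(\hat{p})|q_k\rangle = \int\frac{dp_k}{2\pi\hbar}\,f(p_k)\,e^{\frac{i}{\hbar}\vec{p}_k\cdot(\vec{q}_{k+1} - \vec{q}_k)} .$$

Likewise,

$$\langle q_{k+1}|g(\hat{q})|q_k\rangle = g(\vec{q}_k)\,\delta(\vec{q}_{k+1} - \vec{q}_k) = g(\vec{q}_k)\int\frac{dp_k}{2\pi\hbar}\,e^{\frac{i}{\hbar}\vec{p}_k\cdot(\vec{q}_{k+1} - \vec{q}_k)}$$
$$\therefore\ \langle q_{k+1}|\hat{H}(\hat{q}, \hat{p})|q_k\rangle = \int\frac{d\vec{p}_k}{2\pi\hbar}\,H_{cl}\!\left(\frac{\vec{q}_{k+1} + \vec{q}_k}{2}, \vec{p}_k\right)e^{\frac{i}{\hbar}\vec{p}_k\cdot(\vec{q}_{k+1} - \vec{q}_k)} .$$

We have used $g(q_k) \to g((q_{k+1} + q_k)/2)$ when multiplied by the $\delta(q_{k+1} - q_k)$.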
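(ii) $\hat{H}$ contains the $\hat{q}, \hat{p}$ in various orderings such as $\hat{q}\hat{p}$, $\hat{p}\hat{q}$, $\hat{q}^2\hat{p}$, $\hat{q}\hat{p}\hat{q}$, $\hat{p}\hat{q}^2, \dots$ etc. As an example, consider a term $\hat{q}^a\hat{p}^b$ in the quantum Hamiltonian. Then,

$$\langle q_{k+1}|\hat{q}^a\hat{p}^b|q_k\rangle = \int d\vec{p}_k\,\langle q_{k+1}|\hat{q}^a|p_k\rangle\langle p_k|\hat{p}^b|q_k\rangle = \int\frac{d\vec{p}_k}{2\pi\hbar}\,q_{k+1}^a\,(p_k)^b\,e^{\frac{i}{\hbar}\vec{p}_k\cdot(\vec{q}_{k+1} - \vec{q}_k)}$$
$$\langle q_{k+1}|\hat{p}^b\hat{q}^a|q_k\rangle = \int d\vec{p}_k\,\langle q_{k+1}|\hat{p}^b|p_k\rangle\langle p_k|\hat{q}^a|q_k\rangle = \int\frac{d\vec{p}_k}{2\pi\hbar}\,q_k^a\,(p_k)^b\,e^{\frac{i}{\hbar}\vec{p}_k\cdot(\vec{q}_{k+1} - \vec{q}_k)}$$
$$\therefore\ \langle q_{k+1}|\frac{\hat{q}^a\hat{p}^b + \hat{p}^b\hat{q}^a}{2}|q_k\rangle = \int\frac{d\vec{p}_k}{2\pi\hbar}\,\frac{q_{k+1}^a + q_k^a}{2}\,(p_k)^b\,e^{\frac{i}{\hbar}\vec{p}_k\cdot(\vec{q}_{k+1} - \vec{q}_k)}$$

Thus for the natural ordering, the classical Hamiltonian has its $\vec{q}$ dependence in the averaged form. Here is another example involving $\hat{q}^2, \hat{p}^2$, taken in a particular order called the Weyl order: $Weyl(\hat{q}^2, \hat{p}^2) := \frac{1}{4}(\hat{q}^2\hat{p}^2 + 2\hat{q}\hat{p}^2\hat{q} + \hat{p}^2\hat{q}^2)$. It is easily seen that,

$$\langle q_{k+1}|Weyl(\hat{q}^2, \hat{p}^2)|q_k\rangle = \frac{1}{4}\left[q_{k+1}^2\langle q_{k+1}|\hat{p}^2|q_k\rangle + 2q_{k+1}q_k\langle q_{k+1}|\hat{p}^2|q_k\rangle + q_k^2\langle q_{k+1}|\hat{p}^2|q_k\rangle\right] = \left(\frac{\vec{q}_{k+1} + \vec{q}_k}{2}\right)^2\langle q_{k+1}|\hat{p}^2|q_k\rangle .$$

More generally, for monomials in $\hat{q}, \hat{p}$, the Weyl order $Weyl(\hat{q}^n, \hat{p}^m)$ equals the fully symmetrized and averaged product. It is a result that

$$\langle q_{k+1}|Weyl(\hat{q}^n, \hat{p}^m)|q_k\rangle = \left(\frac{\vec{q}_{k+1} + \vec{q}_k}{2}\right)^n\langle q_{k+1}|\hat{p}^m|q_k\rangle .$$

Hence, modulo Weyl ordering, we do get,

$$\langle q_{k+1}|e^{-\frac{i\epsilon H}{\hbar}}|q_k\rangle = \int\frac{dp_k}{2\pi\hbar}\;e^{-\frac{i}{\hbar}\epsilon H\left(p_k, \frac{q_{k+1} + q_k}{2}\right)}\,e^{\frac{i}{\hbar}\vec{p}_k\cdot(\vec{q}_{k+1} - \vec{q}_k)}$$
$$\langle q_f|e^{-\frac{iTH}{\hbar}}|q_{in}\rangle = \int d\vec{q}_1\dots d\vec{q}_{N-1}\int\frac{d\vec{p}_1}{2\pi\hbar}\dots\frac{d\vec{p}_N}{2\pi\hbar}\;\exp\left\{\frac{i}{\hbar}\,\epsilon\sum_{k=0}^{N-1}\left[\vec{p}_{k+1}\cdot\frac{(\vec{q}_{k+1} - \vec{q}_k)}{\epsilon} - H\!\left(\vec{p}_{k+1}, \frac{q_{k+1} + q_k}{2}\right)\right]\right\} \qquad (20.4)$$

where in the second equation the momentum attached to the interval $(t_k, t_{k+1})$ has been labelled $\vec{p}_{k+1}$, so that the labels run from 1 to $N$.

Notice that we have $N$ $p_i$ integrations, one for each interval, and $N - 1$ $q_i$ integrations, two less than the number of points since $q_0, q_N$ are fixed. The exponent is clearly a discretized form of $\int_0^Tdt\left[\vec{p}(t)\cdot\dot{\vec{q}}(t) - H(\vec{p}(t), \vec{q}(t))\right] = \int_0^Tdt\,L(\vec{q}(t), \dot{\vec{q}}(t))$. The $d\vec{q}\,\frac{d\vec{p}}{(2\pi\hbar)}$ measure suggests that we have paths in the "phase space". What type of paths?

The mismatch in the number of integration variables makes it harder to view the multiple integrals as a measure on the space of paths in phase space.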
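Introduce an arbitrary momentum $\vec p_0$ so that $(\vec q_0, \vec p_0)$ denotes a point in the phase space just as $(\vec q_k, \vec p_k),\ k = 1, \dots, N$ do. The $\int d\vec p_N$ shows that the momentum at the final point is integrated over. Thus, the phase space path may be specified by an initial point $(\vec q_0, \vec p_0)$. The arbitrary momentum does not enter anywhere and just serves to anchor the initial point. The final point however is the set of all points $(\vec q_N, \vec p_N)$ with $\vec p_N$ integrated over. Thus, the space of paths is ($\Gamma$ is a phase space of dimension $2n$):

\[ \text{Space of paths} =: P_\Gamma := \left\{\gamma(t) \in \Gamma\ /\ \gamma(0) = (\vec q_{in}, \vec p_0),\ \gamma(T) = (\vec q_f, \vec p),\ \vec p \in \mathbb{R}^n\right\} . \]

There are arbitrary constants here, the $\vec p_0$ is an arbitrary constant vector. Since the $\vec p_k, \vec q_k$ are all independent, the paths are continuous but non-differentiable everywhere. Thus we denote:

\[ \langle q_f|e^{-\frac{iT\hat H}{\hbar}}|q_{in}\rangle := \lim_{\epsilon\to 0}\int d\vec q_1\dots d\vec q_{N-1}\int\frac{d\vec p_1}{2\pi\hbar}\dots\frac{d\vec p_N}{2\pi\hbar}\, \exp\left[\frac{i}{\hbar}\sum_{k=0}^{N-1}\epsilon\left\{\vec p_{k+1}\cdot\frac{(\vec q_{k+1}-\vec q_k)}{\epsilon} - H\left(\vec p_{k+1}, \frac{\vec q_{k+1}+\vec q_k}{2}\right)\right\}\right] \tag{20.5} \]

\[ = \int_{P_\Gamma} Dq(t)\,Dp(t)\, \exp\left\{\frac{i}{\hbar}\int_0^T dt\left[\vec p(t)\cdot\dot{\vec q}(t) - H(\vec p(t), \vec q(t))\right]\right\}, \tag{20.6} \]

\[ \int_{P_\Gamma} Dq(t)\,Dp(t) := \lim_{N\to\infty}\prod_{i=1}^{N-1}\int d\vec q_i\ \prod_{j=1}^{N}\int\frac{d\vec p_j}{2\pi\hbar} \tag{20.7} \]

For the special case of $H(\vec p, \vec q) = \frac{\vec p^2}{2m} + V(\vec q)$, the momentum dependence is quadratic and the momentum integrals can be done trivially. For instance,

\[ \int\frac{d\vec p_{k+1}}{(2\pi\hbar)^d}\, e^{\frac{i}{\hbar}\left[\vec p_{k+1}\cdot(\vec q_{k+1}-\vec q_k) - \epsilon\left(\frac{\vec p_{k+1}^2}{2m} + V\left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)\right)\right]} = \frac{1}{(2\pi\hbar)^d}\, e^{\frac{i}{\hbar}\frac{m}{2\epsilon}(\vec q_{k+1}-\vec q_k)^2}\, e^{-\frac{i}{\hbar}\epsilon V}\int d^dp\ e^{-\frac{i\epsilon}{2m\hbar}\left(\vec p - \frac{m(\vec q_{k+1}-\vec q_k)}{\epsilon}\right)^2} \]

\[ = \left(\frac{m}{2\pi\hbar(i\epsilon)}\right)^{d/2}\exp\left\{\frac{i\epsilon}{\hbar}\left[\frac{m}{2}\frac{(\vec q_{k+1}-\vec q_k)^2}{\epsilon^2} - V\left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)\right]\right\} \]

The square root prefactor is just $C(\epsilon)^{-1}$. Doing all the momentum integrals gives,

\[ \langle q_f|e^{-\frac{i}{\hbar}T\hat H}|q_{in}\rangle = \lim_{\epsilon\to 0}\frac{1}{(2\pi\hbar(i\epsilon/m))^{d/2}}\int\frac{d\vec q_1}{(2\pi\hbar(i\epsilon/m))^{d/2}}\cdots\int\frac{d\vec q_{N-1}}{(2\pi\hbar(i\epsilon/m))^{d/2}}\ \exp\left\{\frac{i\epsilon}{\hbar}\sum_{k=0}^{N-1}\left[\frac{m}{2}\frac{(\vec q_{k+1}-\vec q_k)^2}{\epsilon^2} - V\left(\frac{\vec q_{k+1}+\vec q_k}{2}\right)\right]\right\} \tag{20.8} \]

Each $d\vec q$ integral gets a factor of $C(\epsilon)^{-d}$ and an extra such factor is left over from the extra momentum integration. This was the factor introduced and determined by a regular behavior as $\epsilon \to 0$.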
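The single Gaussian momentum integral above can be checked numerically in a setting where it converges absolutely. The sketch below is not the oscillatory integral of the text: it assumes the substitution $\epsilon \to -i\epsilon_E$ (an imaginary-time step, our assumption for illustration), sets $V = 0$ and $d = 1$, and compares a trapezoid sum with the closed form $\sqrt{m/(2\pi\hbar\epsilon_E)}\,e^{-m\,\Delta q^2/(2\hbar\epsilon_E)}$, which is what the prefactor $(m/2\pi\hbar i\epsilon)^{1/2}$ and the kinetic exponent become under that substitution. The function names and parameter values are illustrative only.

```python
import math

def momentum_integral(dq, eps_e, m=1.0, hbar=1.0, p_max=60.0, n=120001):
    """Trapezoid estimate of
    int dp/(2 pi hbar) exp(i p dq/hbar - eps_e p^2/(2 m hbar)).
    The imaginary part cancels by p -> -p symmetry, so only cos is summed."""
    dp = 2.0 * p_max / (n - 1)
    total = 0.0
    for k in range(n):
        p = -p_max + k * dp
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * math.cos(p * dq / hbar) * math.exp(-eps_e * p * p / (2.0 * m * hbar))
    return total * dp / (2.0 * math.pi * hbar)

def closed_form(dq, eps_e, m=1.0, hbar=1.0):
    """sqrt(m/(2 pi hbar eps_e)) * exp(-m dq^2/(2 hbar eps_e))."""
    return math.sqrt(m / (2.0 * math.pi * hbar * eps_e)) * math.exp(-m * dq * dq / (2.0 * hbar * eps_e))

# Illustrative values: step eps_e = 0.1 and displacement dq = 0.3.
numeric = momentum_integral(0.3, 0.1)
exact = closed_form(0.3, 0.1)
difference = abs(numeric - exact)
```

The agreement illustrates how each momentum integration produces one factor of $C(\epsilon)^{-1}$ together with the discretized kinetic term.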
Thus, the upshot is that we can either take the heuristically motivated definition of $K$ and show that it matches the quantum mechanical definition of the transition amplitude, or directly deduce it from the quantum mechanical definition. The $K$ of the transition amplitude is the central quantity in the path integral approach.

A note: We defined $K(q_{in}, q_f; T)$, identified it with $\langle q_f|\exp\{-(i/\hbar)TH\}|q_{in}\rangle$ and called it the probability amplitude for a particle at $q_{in}$ at $t_0$ to transit to $q_f$ at time $t_0 + T$. This is sometimes also denoted as $\langle q_f, T|q_{in}, 0\rangle$ or $\langle q'', t''|q', t'\rangle$ with $t'' - t' = T$. This notation can be confusing since $|q', t'\rangle$ does not denote a solution of the Schrodinger equation. Rather, it denotes the instantaneous eigenvector of the Heisenberg picture operator $\hat Q(t) := e^{itH/\hbar}\,Q_{Sch}(0)\,e^{-itH/\hbar}$. Thus,

\[ Q(t)|q, t\rangle = q|q, t\rangle\ \leftrightarrow\ Q_{Sch}\,e^{-itH/\hbar}|q, t\rangle = q\,e^{-itH/\hbar}|q, t\rangle\ \Rightarrow\ e^{-itH/\hbar}|q, t\rangle = |q\rangle . \]

Or, $|q, t\rangle = e^{itH/\hbar}|q\rangle$. If $|q, t\rangle$ were the time evolution of $|q\rangle$, we would have had $e^{-itH/\hbar}|q\rangle$ instead. Thus we have the consistent notation:

\[ \langle q_f|e^{-\frac{iTH}{\hbar}}|q_{in}\rangle = \langle q_f|e^{-\frac{i(t''-t')H}{\hbar}}|q_{in}\rangle = \langle q_f|e^{-\frac{it''H}{\hbar}}\cdot e^{\frac{it'H}{\hbar}}|q_{in}\rangle = \langle q_f, t''|q_{in}, t'\rangle . \]

Consider now $\langle q_f, T|Q_H(t)|q_{in}, -T\rangle$, for $t \in [-T, T]$.

\[ \langle q_f, T|Q_H(t)|q_{in}, -T\rangle = \langle q_f|e^{-iTH/\hbar}\,e^{itH/\hbar}\,Q_S\,e^{-itH/\hbar}\,e^{-iTH/\hbar}|q_{in}\rangle \]

\[ = \int dq'\,dq''\ \langle q_f|e^{i(t-T)H/\hbar}|q'\rangle\langle q'|Q_S|q''\rangle\langle q''|e^{-i(t+T)H/\hbar}|q_{in}\rangle \]

\[ = \int dq'\,dq''\ \langle q_f, T|q', t\rangle\langle q'|Q_S|q''\rangle\langle q'', t|q_{in}, -T\rangle \]

\[ = \int dq'\ \langle q_f, T|q', t\rangle\, q'(t)\,\langle q', t|q_{in}, -T\rangle \]

In the last equation, we have used $\langle q'|Q_S|q''\rangle = q'\delta(q'-q'')$ - the property of the Schrodinger picture operator, carried out the $dq''$ integration and as a reminder inserted the argument $t$ in $q'(t)$.
Using the path integral notation, we write the last equation as,

\[ \langle q_f, T|Q_H(t)|q_{in}, -T\rangle = \int Dq\ q(t)\,e^{\frac{i}{\hbar}S[-T,T]} = \int dq\int Dq\, e^{\frac{i}{\hbar}S[t,T]}\ q(t)\int Dq\, e^{\frac{i}{\hbar}S[-T,t]} \tag{20.9} \]

Next, consider the quantity,

\[ \int Dq\ q(t_1)q(t_2)\,e^{\frac{i}{\hbar}S[-T,T]} := \int dq_1\,dq_2\int Dq\, e^{\frac{i}{\hbar}S[t_1,T]}\ q_1(t_1)\times\int Dq\, e^{\frac{i}{\hbar}S[t_2,t_1]}\ q_2(t_2)\int Dq\, e^{\frac{i}{\hbar}S[-T,t_2]}, \qquad -T \le t_2 \le t_1 \le T ; \]

For $t_2 \ge t_1$ the factors will switch accordingly. We summarize the formulae as,

\[ \langle q_f, t''|q_{in}, t'\rangle = \int Dq\, e^{\frac{i}{\hbar}S[t',t'']} =: \langle q_f|e^{-\frac{i}{\hbar}(t''-t')\hat H}|q_{in}\rangle \tag{20.10} \]

\[ \langle q_f, T|Q_H(t)|q_{in}, -T\rangle = \int Dq\ q(t)\,e^{\frac{i}{\hbar}S[-T,T]} := \int dq_t\ \langle q_f, T|q_t, t\rangle\, q_t\,\langle q_t, t|q_{in}, -T\rangle \tag{20.11} \]

\[ \langle q_f, T|T\{Q_H(t_1)Q_H(t_2)\}|q_{in}, -T\rangle = \int Dq\ q(t_1)q(t_2)\,e^{\frac{i}{\hbar}S[-T,T]} := \int dq_1\,dq_2 \begin{cases} \langle q_f, T|q_1, t_1\rangle\, q_1(t_1)\,\langle q_1, t_1|q_2, t_2\rangle\, q_2(t_2)\,\langle q_2, t_2|q_{in}, -T\rangle, & t_2 \le t_1 \\ \langle q_f, T|q_2, t_2\rangle\, q_2(t_2)\,\langle q_2, t_2|q_1, t_1\rangle\, q_1(t_1)\,\langle q_1, t_1|q_{in}, -T\rangle, & t_1 \le t_2 \end{cases} \tag{20.12} \]

In each case the later time stands to the left, in agreement with the ordering $-T \le t_2 \le t_1 \le T$ used above.

C. Functional Derivative

It will be convenient to have the notation of functional derivative. Recall from the variational principle of mechanics, the action functional, $S[q(t)] := \int_{t_1}^{t_2} dt\, L(q(t), \dot q(t))$. For any given function $q : [t_1, t_2] \to \mathbb{R},\ t \mapsto q(t)$, the right hand side computes a number and that number is the action. It is a function on the space of paths/curves/functions on $[t_1, t_2]$, and in short is regarded as a functional of any given $q(t)$. This is not a functional in the sense of being in the dual of a vector space - it is not linear in $q(t)$'s. Under a variation of a curve, $q(t) \to q(t) + \delta q(t)$, we compute

\[ \delta S := S[q + \delta q] - S[q] = \int dt\ \delta q\left\{\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right\} + \text{end-point contributions.} \]

Comparing this with $df(x_1, \dots, x_n) = \sum_{i=1}^n dx_i\,\partial_i f$, we can identify and denote the functional derivative of the action with respect to the curve as:

\[ \frac{\delta S}{\delta q(t)} := \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q} . \]

Note that $S$ has no explicit $t$ dependence while its functional derivative does.
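The identification of $\delta S/\delta q(t)$ with the coefficient of $\delta q$ can be illustrated on a time grid. The following is a minimal sketch under stated assumptions: it takes the oscillator Lagrangian $L = \frac{1}{2}(\dot q^2 - \omega^2 q^2)$ as an example, discretizes $S$ with a step `dt`, and uses the correspondence $\delta S/\delta q(t_j) \approx \frac{1}{dt}\,\partial S/\partial q_j$ (the factor $1/dt$ comes from $\int dt \to \sum_j dt$). The discretization, the test path $q(t) = \sin 2t$ and all names are choices made for illustration.

```python
import math

def action(q, dt, omega):
    """Discretized S = sum dt [ (1/2) qdot^2 - (1/2) omega^2 q^2 ]."""
    kinetic = sum(0.5 * (q[k + 1] - q[k]) ** 2 / dt for k in range(len(q) - 1))
    # Trapezoid weights for the potential term: half weight at the end points.
    potential = sum(0.5 * omega ** 2 * q[k] ** 2 * dt for k in range(1, len(q) - 1))
    potential += 0.25 * omega ** 2 * dt * (q[0] ** 2 + q[-1] ** 2)
    return kinetic - potential

def functional_derivative(q, dt, omega, j, h=1e-4):
    """Central difference in q_j, divided by dt to turn dS/dq_j into delta S/delta q(t_j)."""
    up = list(q)
    up[j] += h
    down = list(q)
    down[j] -= h
    return (action(up, dt, omega) - action(down, dt, omega)) / (2.0 * h * dt)

omega, n_steps, t_final = 1.5, 200, 2.0
dt = t_final / n_steps
times = [k * dt for k in range(n_steps + 1)]
path = [math.sin(2.0 * t) for t in times]

# For q = sin(2t): dL/dq - d/dt dL/dqdot = -omega^2 q - qddot = (4 - omega^2) sin(2t).
max_error = max(
    abs(functional_derivative(path, dt, omega, j) - (4.0 - omega ** 2) * math.sin(2.0 * times[j]))
    for j in range(1, n_steps)
)
```

Only interior grid points are compared, since the end points carry the "end-point contributions" that the definition discards. The residual difference is the $O(dt^2)$ error of the second difference.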
Thus the idea of a functional derivative is to consider the first order variation of a functional and read off the coefficient of the $\delta$. We can also notice that for any function $f(t)$ we can write, $\delta f(t) = \int dt'\,\delta(t-t')\,\delta f(t')\ \leftrightarrow\ \frac{\delta f(t)}{\delta f(t')} = \delta(t-t')$. Thus we define the derivative with respect to a function as:

(i) $\frac{\delta}{\delta f(t)}\left(\alpha F[f] + \beta G[f]\right) = \alpha\frac{\delta F}{\delta f(t)} + \beta\frac{\delta G}{\delta f(t)}$ (linearity)

(ii) $\frac{\delta}{\delta f(t)}\left(F[f]\cdot G[f]\right) = \frac{\delta F}{\delta f(t)}\cdot G[f] + F[f]\cdot\frac{\delta G}{\delta f(t)}$ (Leibniz rule)

(iii) $\frac{\delta}{\delta f(t)}F(G[f]) = \frac{\partial F}{\partial G}\frac{\delta G}{\delta f(t)}$ (Chain rule)

(iv) $\frac{\delta f(t')}{\delta f(t)} := \delta(t'-t)$

For an arbitrary "source function" $J(t)$, define the functional,

\[ \langle q'', t''|q', t'\rangle[J] := \int Dq\ e^{\frac{i}{\hbar}\int_{t'}^{t''}dt\{L(q,\dot q)+J(t)q(t)\}} . \tag{20.13} \]

Then it follows that,

\[ \frac{\delta}{\delta J(t)}\langle q'', t''|q', t'\rangle[J] = \int Dq\ e^{\frac{i}{\hbar}\int_{t'}^{t''}dt\{L(q,\dot q)+J(t)q(t)\}}\cdot\frac{i}{\hbar}q(t) \]

\[ \text{Or,}\quad -i\hbar\frac{\delta}{\delta J(t)}\langle q'', t''|q', t'\rangle[J] = \int Dq\ q(t)\,e^{\frac{i}{\hbar}\int_{t'}^{t''}dt\{L(q,\dot q)+J(t)q(t)\}} = \langle q'', t''|Q(t)|q', t'\rangle_J . \]

\[ \left(-i\hbar\frac{\delta}{\delta J(t_1)}\right)\left(-i\hbar\frac{\delta}{\delta J(t_2)}\right)\langle q'', t''|q', t'\rangle[J] = \int Dq\ q(t_1)q(t_2)\,e^{\frac{i}{\hbar}\int_{t'}^{t''}dt\{L(q,\dot q)+J(t)q(t)\}} = \langle q'', t''|T\{Q(t_1)Q(t_2)\}|q', t'\rangle_J . \]

The generalization is obvious. Evaluating the functional derivatives at $J(t) = 0$ gives us,

\[ \langle q'', t''|T\{Q(t_1)\dots Q(t_n)\}|q', t'\rangle = (-i\hbar)^n\left.\frac{\delta^n\langle q'', t''|q', t'\rangle[J]}{\delta J(t_1)\dots\delta J(t_n)}\right|_{J=0} \tag{20.14} \]

Thus, the "correlation functions" (left hand side) are given by the functional derivatives of the transition amplitude in presence of a source function, evaluated at vanishing source function. To relate it to ground state/vacuum expectation values of time ordered products of Heisenberg picture operators, we take a closer look at the transition amplitude with non-zero source function. Choose the source function to have a compact support, $J(t) = 0$ for $t < t',\ t > t''$. Choose $T' < t'$ and $T'' > t''$. Then,

\[ \langle Q'', T''|Q', T'\rangle_J = \int dq''\,dq'\ \langle Q'', T''|q'', t''\rangle_{J=0}\ \langle q'', t''|q', t'\rangle_J\ \langle q', t'|Q', T'\rangle_{J=0} . \]
The $J = 0$ amplitudes can be written in the energy representation (energy eigenvalues assumed to be discrete for convenience),

\[ \langle q', t'|Q', T'\rangle = \langle q'|e^{-i(t'-T')H/\hbar}|Q'\rangle = \sum_n\langle q'|\varphi_n\rangle\langle\varphi_n|Q'\rangle\, e^{-i(t'-T')E_n/\hbar} = \sum_n\varphi_n(q')\varphi^*_n(Q')\, e^{-i(t'-T')E_n/\hbar} \]

\[ \therefore\ e^{-iT'E_0/\hbar}\langle q', t'|Q', T'\rangle = \sum_n\varphi_n(q')\varphi^*_n(Q')\, e^{-it'E_n/\hbar}\, e^{iT'(E_n-E_0)/\hbar} \]

\[ \therefore\ \lim_{T'\to i\infty} e^{-iT'E_0/\hbar}\langle q', t'|Q', T'\rangle = \varphi_0(q')\varphi^*_0(Q')\, e^{-it'E_0/\hbar} \qquad \because\ E_n \ne E_0 \text{ terms drop out, likewise} \]

\[ \lim_{T''\to -i\infty} e^{iT''E_0/\hbar}\langle Q'', T''|q'', t''\rangle = \varphi^*_0(q'')\varphi_0(Q'')\, e^{it''E_0/\hbar} \]

We have assumed that the lowest energy eigenvalue is non-degenerate, otherwise the $\sum_n$ will reduce to the sum over the degenerate states. Thus we get, with $\varphi_0(q, t) := \varphi_0(q)\,e^{-itE_0/\hbar}$,

\[ \lim_{\substack{T'\to i\infty \\ T''\to -i\infty}}\frac{\langle Q'', T''|Q', T'\rangle_J}{e^{-iE_0(T''-T')/\hbar}\,\varphi_0(Q'')\varphi^*_0(Q')} = \int dq'\,dq''\ \varphi^*_0(q'', t'')\,\varphi_0(q', t')\,\langle q'', t''|q', t'\rangle_J \tag{20.15} \]

D. Ground State-to-ground state Amplitude: Z[J]

The right hand side of eq. (20.15) is the ground state-to-ground state transition amplitude, the one that we were looking for. The left hand side tells us how to compute it from $\langle Q'', T''|Q', T'\rangle_J$ which is similar to $\langle q'', t''|q', t'\rangle_J$. The factor in the denominator is independent of $J$ and will not matter. We introduce the definition,

\[ Z[J] := \int dq'\,dq''\ \varphi^*_0(q'', t'')\,\langle q'', t''|q', t'\rangle_J\ \varphi_0(q', t') \tag{20.16} \]

\[ \left.\frac{\delta^n Z[J]}{\delta J(t_1)\dots\delta J(t_n)}\right|_{J=0} = \left(\frac{i}{\hbar}\right)^n\int dq'\,dq''\ \varphi^*_0(q'', t'')\times\langle q'', t''|T\{Q(t_1)\dots Q(t_n)\}|q', t'\rangle\ \varphi_0(q', t') . \tag{20.17} \]

The previous result tells us that $Z[J]$ may be computed as,

\[ Z[J] \sim \lim_{\substack{T'\to i\infty \\ T''\to -i\infty}}\langle Q'', T''|Q', T'\rangle_J = \lim_{\substack{T'\to i\infty \\ T''\to -i\infty}}\int Dq\ \exp\left[\frac{i}{\hbar}\int_{T'}^{T''}dt\,\{L(q,\dot q) + J(t)q(t)\}\right] \]

We have dropped the unimportant denominator on the left hand side. We have continued $T' \to i\infty,\ T'' \to -i\infty$, but not the intermediate times $t_i$ appearing as arguments of the Heisenberg operators. We will do so now and get to the Euclidean formulation.
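We begin with,

\[ \langle Q'', T''|T\{Q(t_1)\dots Q(t_n)\}|Q', T'\rangle_{J=0} \sim \lim_{\substack{T'\to i\infty \\ T''\to -i\infty}}\int dq_1\dots dq_n\ \langle Q'', T''|q_1, t_1\rangle\, q_1\,\langle q_1, t_1|q_2, t_2\rangle\dots q_n\,\langle q_n, t_n|Q', T'\rangle , \]

where (the intermediate integration variables are labeled by $j$ to keep them distinct from the time label $i$),

\[ \langle q_i, t_i|q_{i+1}, t_{i+1}\rangle \sim \int\prod_{j=1}^{N-1}\frac{dq_j}{\sqrt{2\pi i\epsilon}}\ \exp\left[\frac{i\epsilon}{\hbar}\sum_j L\left(\frac{q_j+q_{j+1}}{2}, \frac{q_{j+1}-q_j}{\epsilon}\right)\right], \qquad N\epsilon = t_{i+1}-t_i \]

\[ \langle q_i, -i\tau_i|q_{i+1}, -i\tau_{i+1}\rangle \sim \int\prod_{j=1}^{N-1}\frac{dq_j}{\sqrt{2\pi\epsilon'}}\ \exp\left[\frac{\epsilon'}{\hbar}\sum_j L\left(\frac{q_j+q_{j+1}}{2}, \frac{q_{j+1}-q_j}{-i\epsilon'}\right)\right], \qquad N\epsilon' := \tau_{i+1}-\tau_i \]

The analytic continuation of the $\langle T\{\dots\}\rangle$ is defined through the $\langle ..|..\rangle$. Thus,

\[ \left.\langle Q'', T''|T\{Q(t_1)\dots Q(t_n)\}|Q', T'\rangle_{J=0}\right|_{t_i=-i\tau_i} \sim \lim_{\substack{\tau_{in}\to -\infty \\ \tau_f\to\infty}}\int Dq\ q(\tau_1)\dots q(\tau_n)\times\exp\left[\frac{1}{\hbar}\int_{\tau_{in}}^{\tau_f}d\tau\ L\left(q, i\frac{dq}{d\tau}\right)\right] \]

This suggests going over to a Euclidean formulation,

\[ Z_E[J] := \int Dq\ \exp\left\{\frac{1}{\hbar}\int_{-\infty}^{\infty}d\tau\left[L\left(q, i\frac{dq}{d\tau}\right) + J(\tau)q(\tau)\right]\right\} \tag{20.18} \]

and the paths are between some $Q' = \lim_{\tau\to -\infty}q(\tau)$ and $Q'' = \lim_{\tau\to\infty}q(\tau)$. The Euclidean $Z_E$ and the Minkowskian $Z$ are related through,

\[ \left.\frac{1}{Z[J]}\frac{\delta^n Z[J]}{\delta J(t_1)\dots\delta J(t_n)}\right|_{J=0} = i^n\left.\frac{1}{Z_E[J]}\frac{\delta^n Z_E[J]}{\delta J(\tau_1)\dots\delta J(\tau_n)}\right|_{J=0} \]

This will be used in the field theory Green's functions.

E. Explicit evaluation of a path integral

We would like to see the various expressions above explicitly for a 1-dimensional harmonic oscillator with a source function. We have $L = \frac{1}{2}(\dot q^2 - \omega^2 q^2) + J(t)q(t)$ and we want to compute $\langle q'', t''|q', t'\rangle_J$. Discretizing the corresponding action, the definition gives,

\[ \langle q'', t''|q', t'\rangle_J = \lim_{\epsilon\to 0}\frac{1}{C(\epsilon)}\int_{-\infty}^{\infty}\prod_{k=1}^{N-1}\frac{dq_k}{C(\epsilon)}\ \exp\left[\frac{i}{\hbar}\sum_{k=0}^{N-1}\epsilon\left\{\frac{1}{2}\frac{(q_{k+1}-q_k)^2}{\epsilon^2} - \frac{\omega^2}{2}\left(\frac{q_k+q_{k+1}}{2}\right)^2 + J_k\,\frac{q_k+q_{k+1}}{2}\right\}\right], \]

with $q_0 := q(t'),\ q_N := q(t''),\ J_k := J(t_k)$. All are Gaussian integrals with coupled variables. It is more convenient to discretize a different action. The equations of motion are: $\ddot q + \omega^2 q = J(t),\ q(t') = q',\ q(t'') = q''$. Let $q_{cl}(t)$ be a classical solution of this equation satisfying the end point conditions. Let us assume that there is just one such solution. Introduce $\eta(t)$ via the definition: $q(t) := q_{cl}(t) + \eta(t)$. Then $\eta(t') = 0 = \eta(t'')$.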
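The action becomes,

\[ S(t', t'') = \int_{t'}^{t''}dt\left[\frac{1}{2}(\dot q_{cl}+\dot\eta)^2 - \frac{\omega^2}{2}(q_{cl}+\eta)^2 + J(t)(q_{cl}+\eta)\right] \]

\[ = \int_{t'}^{t''}dt\left[\frac{1}{2}(\dot q_{cl}^2 - \omega^2 q_{cl}^2) + J(t)q_{cl} + \frac{1}{2}(\dot\eta^2 - \omega^2\eta^2) + \dot\eta\,\dot q_{cl} - \omega^2\eta\,q_{cl} + J\eta\right] \]

Up to a total derivative, the last three terms combine to $-\eta(\ddot q_{cl} + \omega^2 q_{cl} - J) = 0$. The total derivative term gives $\eta\,\dot q_{cl}\big|_{t'}^{t''} = 0$ thanks to the end point condition satisfied by $\eta(t)$. Thus, $S(t', t'') = S_{cl}(t', t'') + \frac{1}{2}\int_{t'}^{t''}dt\,(\dot\eta^2 - \omega^2\eta^2)$, where $S_{cl}$ is the action evaluated at the presumed solution $q_{cl}$. The terms linear in $\eta$ have vanished thanks to $q_{cl}$ being a solution. We now "quantize" the $\eta$ variable, i.e. we define

\[ \langle q'', t''|q', t'\rangle_J := \left[e^{\frac{i}{\hbar}S^J_{cl}(t',t'')}\right]\lim_{\epsilon\to 0}\left[\frac{1}{C(\epsilon)}\int_{-\infty}^{\infty}\prod_{k=1}^{N-1}\frac{d\eta_k}{C(\epsilon)}\ \exp\left\{\frac{i}{\hbar}\sum_{k=0}^{N-1}\epsilon\left[\frac{1}{2}\frac{(\eta_{k+1}-\eta_k)^2}{\epsilon^2} - \frac{\omega^2}{2}\left(\frac{\eta_k+\eta_{k+1}}{2}\right)^2\right]\right\}\right] \tag{20.19} \]

The paths $\eta(t)$ begin and end at $\eta = 0$. The second factor is independent of $J$ and depends on $t', t''$ only through $T = t'' - t'$. Our task is to evaluate the first factor and carry out the coupled Gaussian integral.

The exponent in the second factor is of the form,

\[ \frac{i}{\hbar}\sum_{k=0}^{N-1}\dots = \frac{i}{2\hbar}\sum_{k=0}^{N-1}\epsilon\left[\frac{(\eta_{k+1}-\eta_k)^2}{\epsilon^2} - \omega^2\left(\frac{\eta_{k+1}+\eta_k}{2}\right)^2\right] := \frac{i}{2\hbar}\sum_{k,l=1}^{N-1}\eta_k A_{kl}\eta_l , \quad\text{where,} \]

\[ A_{lk} := \underbrace{2\left(\frac{1}{\epsilon} - \frac{\epsilon\omega^2}{4}\right)}_{a}\delta_{lk} - \underbrace{\left(\frac{1}{\epsilon} + \frac{\epsilon\omega^2}{4}\right)}_{b}\{\delta_{l,k+1} + \delta_{l,k-1}\}, \qquad l, k = 1, \dots, N-1 . \]

\[ A = \begin{pmatrix} a & -b & 0 & \cdots & 0 \\ -b & a & -b & \cdots & 0 \\ 0 & -b & a & \cdots & 0 \\ \vdots & & \ddots & \ddots & -b \\ 0 & \cdots & 0 & -b & a \end{pmatrix} \qquad\text{A tri-diagonal, symmetric matrix.} \]

\[ \therefore\ \langle q'', t''|q', t'\rangle_J := \left[e^{\frac{i}{\hbar}S^J_{cl}(t',t'')}\right]\lim_{\epsilon\to 0}\left[\frac{1}{C(\epsilon)}\int_{-\infty}^{\infty}\frac{d^{N-1}\eta}{C(\epsilon)^{N-1}}\ \exp\left\{\frac{i}{2\hbar}\sum_{i,j=1}^{N-1}\eta_i A_{ij}\eta_j\right\}\right] \]

For a single variable we have $\int_{-\infty}^{\infty}dx\,e^{-ax^2} = \sqrt{\pi/a}\ \Rightarrow\ \int_{-\infty}^{\infty}dx\,e^{iax^2} = \sqrt{\pi/(-ia)}$. Its multi-dimensional generalization is

\[ \int_{-\infty}^{\infty}d^nx\ e^{i\sum_{ij}x_iA_{ij}x_j} = \frac{(i\pi)^{n/2}}{\sqrt{\det A}} \]

Using this with $A \to A/2\hbar$ and $n = N-1$, our transition amplitude takes the form,

\[ \langle q'', t''|q', t'\rangle_J = \left[e^{\frac{i}{\hbar}S^J_{cl}(t',t'')}\right]\lim_{\epsilon\to 0}\left[\frac{1}{C(\epsilon)}\frac{1}{C(\epsilon)^{N-1}}\frac{(2i\pi\hbar)^{(N-1)/2}}{\sqrt{\det A}}\right] \]

Using $C(\epsilon) = \sqrt{2i\pi\hbar\epsilon}$ since we have taken unit mass, $m = 1$, we have,

\[ \langle q'', t''|q', t'\rangle_J = \left[e^{\frac{i}{\hbar}S^J_{cl}(t',t'')}\right]\lim_{\epsilon\to 0}\left[\frac{1}{\sqrt{2\pi i\hbar}\ \epsilon^{N/2}\sqrt{\det A(\epsilon)}}\right] \tag{20.20} \]

Now we need to compute the determinant of the tridiagonal, symmetric matrix. This is usually solved by using a recursion relation. Denote $D_n := \det A_{n\times n}$ where $A$ has the form given above. Clearly, $D_1 = a$ and $D_2 = a^2 - b^2$. By checking for $4\times 4$, $5\times 5$ matrices, it is easy to see that $D_n$ satisfies the recursion relation,

\[ D_n = aD_{n-1} - b^2D_{n-2}, \quad D_0 := 1,\ D_1 = a ; \qquad a = 2\left(\frac{1}{\epsilon} - \frac{\epsilon\omega^2}{4}\right),\ b = \frac{1}{\epsilon} + \frac{\epsilon\omega^2}{4} . \]

We can pull out a factor of $\epsilon^{-1}$ from $A$ and since our $A$ is of order $(N-1)$, we pull out a factor of $\epsilon^{-(N-1)/2}$ from $\sqrt{\det A}$. This replaces the second $[\dots]$ by $\lim_{\epsilon\to 0}\left(\sqrt{2\pi i\hbar\,\epsilon\det A'}\right)^{-1}$. The $A'$ matrix has elements $a' := 2 - \epsilon^2\omega^2/2,\ b' := 1 + \epsilon^2\omega^2/4$.

Note: For a given $N$, $\epsilon = T/N$, hence (suppressing the primes) $a, b$ have an $N$ dependence. The matrix itself is also $(N-1)\times(N-1)$ and our notation $D_n$ as the determinant of $A_{n\times n}$ is valid for $n \le N-1$. For a given $N$, the $D_{n\ge N}$ is not defined. Hence, the recursion relation is a difference equation with constant coefficients [15]. It is important to keep the distinction between fixed $N$ and a variable $n$. The $a, b$ are functions of $N$ but independent of $n$.

Such difference equations are solved by the ansatz, $D_n = \lambda^n$. Shifting $n \to n+2$, we write the difference equation as,

\[ \forall\ 0 \le n \le N-3 : \quad D_{n+2} - aD_{n+1} + b^2D_n = 0, \qquad D_0 = 1,\ D_1 = a . \]

Substitution gives the characteristic equation $\lambda^2 - a\lambda + b^2 = 0$. Its solutions, for $\epsilon \ll 1$, are

\[ \lambda_\pm \approx \left(1 - \frac{\epsilon^2\omega^2}{4}\right) \pm \sqrt{\left(1 - \frac{\epsilon^2\omega^2}{2}\right) - \left(1 + \frac{\epsilon^2\omega^2}{2}\right)} \approx 1 - \frac{\epsilon^2\omega^2}{4} \pm i\epsilon\omega \]

\[ \therefore\ \lambda_\pm \simeq 1 \pm i\omega\epsilon\ \Rightarrow\ D_n = \alpha\lambda_+^n + \beta\lambda_-^n\quad \forall\ n \in [0, N-1]. \]

Initial conditions give, $\alpha = \frac{1}{2}\left(1 - \frac{i}{\epsilon\omega}\right),\ \beta = \frac{1}{2}\left(1 + \frac{i}{\epsilon\omega}\right)$. The desired determinant is then given by,

\[ D_{N-1} = \frac{1}{2}\left(1 - \frac{i}{\epsilon\omega}\right)(1 + i\omega\epsilon)^{N-1} + \frac{1}{2}\left(1 + \frac{i}{\epsilon\omega}\right)(1 - i\omega\epsilon)^{N-1}, \qquad \epsilon = T/N \]

\[ \xrightarrow[N\to\infty]{}\ \frac{1}{2}\left(1 - \frac{iN}{\omega T}\right)(1 + i\omega T/N)^N + \frac{1}{2}\left(1 + \frac{iN}{\omega T}\right)(1 - i\omega T/N)^N \]

\[ \to\ -\frac{iN}{2\omega T}\left(e^{i\omega T} - e^{-i\omega T}\right) = \frac{N}{\omega T}\sin(\omega T) \]

\[ \therefore\ \epsilon D_{N-1} \to \frac{\sin(\omega T)}{\omega} . \]

We finally get,

\[ \langle q'', t''|q', t'\rangle_J = \left[e^{\frac{i}{\hbar}S^J_{cl}(t',t'')}\right]\frac{1}{\sqrt{2\pi i\hbar}}\sqrt{\frac{\omega}{\sin(\omega T)}} . \tag{20.21} \]

Note: All the $J$ dependence is in the first factor only while the "quantum corrections" are in the second factor and independent of $J$.

Note: This method of evaluating the amplitude near a classical solution can also be adopted for more general (non-linear) equations of motion. The $S_{cl}$ always comes out, the term linear in $\eta$ always vanishes while the $o(\eta^2)$ terms always give the $(\text{determinant})^{-1/2}$. In the general context, it is termed as a semi-classical approximation which is exact for the oscillator.

The calculation of the first factor requires solving the equation of motion with the source function and then evaluating the action for $q_{cl}(t)$. This is a little involved as the equation is inhomogeneous and requires use of a Green's function. The Green's function can be obtained by directly solving the differential equation with a $\delta$-function source and matching the discontinuity due to the delta function. We just note the final result [Problem 3.11 of Feynman-Hibbs, Abers-Lee].

\[ S^J_{cl}(t', t'') = \frac{\omega}{2\sin(\omega T)}\left[((q')^2 + (q'')^2)\cos(\omega T) - 2q'q''\right] + \frac{q''}{\sin(\omega T)}\int_{t'}^{t''}dt\,J(t)\sin(\omega(t-t')) + \frac{q'}{\sin(\omega T)}\int_{t'}^{t''}dt\,J(t)\sin(\omega(t''-t)) \]
\[ \qquad -\ \frac{1}{\omega\sin(\omega T)}\int_{t'}^{t''}d\sigma\int_{t'}^{\sigma}d\tau\ J(\sigma)J(\tau)\sin(\omega(t''-\sigma))\sin(\omega(\tau-t')) \tag{20.22} \]

This exercise was done to illustrate the schematics of evaluating the path integral directly using the definition.

Note: If we have a system of two degrees of freedom, the $q(t)$ and the $J(t)$ which are coupled by a linear coupling, $\int_{t'}^{t''}dt\,J(t)q(t)$, then the $S^J_{cl}(t', t'')$ can be viewed as an effective action contribution after integrating out the $q(t)$ degree of freedom.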
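The limit $\epsilon D_{N-1} \to \sin(\omega T)/\omega$ obtained in subsection E can be checked by iterating the three-term recursion directly, without the approximation $\lambda_\pm \simeq 1 \pm i\omega\epsilon$. The sketch below assumes the rescaled entries $a' = 2 - \epsilon^2\omega^2/2$ and $b' = 1 + \epsilon^2\omega^2/4$ with $\epsilon = T/N$; the function name and the sample values of $\omega$, $T$, $N$ are illustrative choices.

```python
import math

def eps_times_det(omega, T, N):
    """Return eps * D_{N-1}, with D_n = a' D_{n-1} - b'^2 D_{n-2}, D_0 = 1, D_1 = a'."""
    eps = T / N
    a = 2.0 - (eps * omega) ** 2 / 2.0   # a' (diagonal entry of the rescaled matrix)
    b = 1.0 + (eps * omega) ** 2 / 4.0   # b' (off-diagonal entry, up to sign)
    d_prev, d = 1.0, a                   # D_0, D_1
    for _ in range(2, N):                # builds D_2, ..., D_{N-1}
        d_prev, d = d, a * d - b * b * d_prev
    return eps * d

omega, T = 1.3, 2.0
target = math.sin(omega * T) / omega
# Error of eps * D_{N-1} against sin(omega T)/omega for increasing N.
errors = {N: abs(eps_times_det(omega, T, N) - target) for N in (200, 2000, 20000)}
```

The error shrinks roughly like $1/N$, which is consistent with the limit being approached as $\epsilon \to 0$ at fixed $T$.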
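This effective action has the form $\sim \int dt\,J(t)\alpha(t) - \int dt\,J(t)\int^t d\sigma\,\beta(t,\sigma)J(\sigma)$. This last term is a non-local term.

F. Alternative Expression for Z[J]

We now consider another method which is closer to what is done in field theory. We begin by modifying the Hamiltonian operator as: $\hat H \to (1 - i\epsilon)\hat H$ [12]. Then,

\[ |q', t'\rangle = e^{\frac{i}{\hbar}t'\hat H}|q'\rangle = \sum_n\varphi^*_n(q')\,e^{\frac{i}{\hbar}t'E_n}|n\rangle\ \to\ \sum_n\varphi^*_n(q')\,e^{\frac{i}{\hbar}t'(1-i\epsilon)E_n}|n\rangle \]

Assuming $E_0 = 0$ for convenience and taking the limits, we get

\[ |q', t'\rangle \xrightarrow[t'\to -\infty]{} \varphi^*_0(q')|0\rangle \quad\text{and}\quad \langle q'', t''| \xrightarrow[t''\to\infty]{} \langle 0|\varphi_0(q'') . \tag{20.23} \]

Thus, $|q', t'\rangle,\ \langle q'', t''|$ both go to the (presumed non-degenerate) ground state as $t' \to -\infty$ and $t'' \to \infty$, provided we make the substitution: $\hat H \to (1 - i\epsilon)\hat H$. We will work with the limits. Consider, for our 1-dimensional oscillator, $H = p^2/2 + \omega^2q^2/2$,

\[ \langle 0|0\rangle_J = \int Dp\,Dq\ \exp\left[\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\,\{p\dot q - (1 - i\epsilon)H + J(t)q\}\right] \]

We do the momentum integration by using $(1 - i\epsilon)p^2/2 \approx \frac{p^2}{2(1+i\epsilon)}$. This gives the term $\frac{1}{2}(1 + i\epsilon)\dot q^2$ and we get,

\[ \langle 0|0\rangle_J = \int Dq\ \exp\left[\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\left\{\frac{1}{2}(1 + i\epsilon)\dot q^2 - \frac{1}{2}\omega^2(1 - i\epsilon)q^2 + J(t)q\right\}\right] . \tag{20.24} \]

Define the Fourier transform,

\[ \tilde q(\nu) := \int_{-\infty}^{\infty}dt\ e^{i\nu t}q(t)\ \leftrightarrow\ q(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}d\nu\ e^{-i\nu t}\tilde q(\nu) \]

and similarly for $J(t)$. Substitution gives,

\[ L = \frac{1}{2}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\int_{-\infty}^{\infty}\frac{d\nu'}{2\pi}\ e^{-i(\nu+\nu')t}\left[\left\{-(1 + i\epsilon)\nu\nu' - (1 - i\epsilon)\omega^2\right\}\tilde q(\nu)\tilde q(\nu') + \tilde J(\nu)\tilde q(\nu') + \tilde J(\nu')\tilde q(\nu)\right] \]

After the $t$ integration sets $\nu' = -\nu$, the expression in the braces simplifies to $\{\dots\} = \nu^2 - \omega^2 + i\epsilon(\nu),\ \epsilon(\nu) := \epsilon(\nu^2 + \omega^2)$. Define $\tilde x(\nu) := \tilde q(\nu) + \frac{\tilde J(\nu)}{\nu^2 - \omega^2 + i\epsilon(\nu)}$. Then the action becomes,

\[ S = \frac{1}{2}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\left[\tilde x(\nu)\left(\nu^2 - \omega^2 + i\epsilon(\nu)\right)\tilde x(-\nu) - \frac{\tilde J(\nu)\tilde J(-\nu)}{\nu^2 - \omega^2 + i\epsilon(\nu)}\right] \]

The shift, $\tilde q(\nu) \to \tilde x(\nu)$ is a constant shift, i.e. the shift is independent of $\tilde q$. Hence the Jacobian of the transformation will be 1 and $D\tilde q(\nu) = D\tilde x(\nu)$. Taking the Fourier transform, the shift takes the form $q(t) \to x(t) + f(t)$ with $f(t)$ independent of $q(t)$. This is also a constant shift in the time domain, and we expect $Dq(t) = Dx(t)$.

\[ \therefore\ \langle 0|0\rangle_J = \exp\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\left(\frac{\tilde J(\nu)\tilde J(-\nu)}{-\nu^2 + \omega^2 - i\epsilon(\nu)}\right)\right]\times \tag{20.25} \]
\[ \qquad\int D\tilde x(\nu)\ \exp\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\ \tilde x(\nu)\left(\nu^2 - \omega^2 + i\epsilon(\nu)\right)\tilde x(-\nu)\right] \tag{20.26} \]

The second factor is a path integral independent of $J$ and hence equals $\langle 0|0\rangle_{J=0}$. However, without any source interaction, the ground state remains a ground state and hence $\langle 0|0\rangle_{J=0} = 1$! This gives finally,

\[ \langle 0|0\rangle_J = \exp\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\left(\frac{\tilde J(\nu)\tilde J(-\nu)}{-\nu^2 + \omega^2 - i\epsilon(\nu)}\right)\right] \tag{20.27} \]

We never needed to evaluate the path integral!

The above expression can also be expressed in the time domain as,

\[ \langle 0|0\rangle_J = \exp\left[\frac{i}{2\hbar}\int_{-\infty}^{\infty}dt'\,dt''\ J(t'')G(t''-t')J(t')\right] \quad\text{with,} \tag{20.28} \]

\[ G(t''-t') = \int_{-\infty}^{\infty}\frac{d\nu}{2\pi}\frac{e^{-i\nu(t''-t')}}{-\nu^2 + \omega^2 - i\epsilon(\nu)} = \frac{i}{2\omega}e^{-i\omega|t''-t'|} \quad\text{(by contour integration)} \tag{20.29} \]

Note: The $\nu$ dependence in the $\epsilon(\nu)$ does not affect the contour integration.

We have discussed two different methods of computing the transition amplitude in the limit of infinite time separation. Notice that the $\langle 0|0\rangle_J$ above is the same as the $Z[J]$ defined earlier in eq. (20.16) since $\int dq'\,|q', t'\rangle\varphi_0(q', t') = |0, t'\rangle$ and likewise for $\langle 0, t''|$. The above frequency domain form is very convenient to obtain the correlation functions as seen below.

Our general formula (20.14) tells us that $\langle 0|T\{Q(t_1)\dots Q(t_n)\}|0\rangle$ is given by the functional derivatives of $Z[J]$ evaluated at $J = 0$. Choosing the above form of $Z[J]$ we see that,

\[ \langle 0|T\{Q(t_1)Q(t_2)\}|0\rangle = (-i\hbar)^2\left.\frac{\delta^2\langle 0|0\rangle_J}{\delta J(t_1)\delta J(t_2)}\right|_{J=0} \tag{20.30} \]

\[ = (-i\hbar)^2\frac{\delta}{\delta J(t_1)}\left[\frac{i}{2\hbar}\cdot 2\int_{-\infty}^{\infty}dt'\ G(t_2-t')J(t')\times\exp\left\{\frac{i}{2\hbar}\int\!\!\int JGJ\right\}\right]_{J=0} \]

\[ = (-i\hbar)^2(i/\hbar)\,G(t_2-t_1) = -i\hbar\,G(t_2-t_1) . \tag{20.31} \]

The derivative of the $e^{\frac{i}{2\hbar}\int\!\int JGJ}$ does not contribute in the limit of $J = 0$. Taking one more derivative to get the three point function, we have

\[ \frac{\delta}{\delta J_1}\frac{\delta}{\delta J_2}\left[\frac{i}{\hbar}\int dt\,G(t_3-t)J(t)\cdot\langle 0|0\rangle_J\right]_{J=0} = \frac{\delta}{\delta J_1}\left[\frac{i}{\hbar}G(t_3-t_2)\cdot\langle 0|0\rangle_J + \frac{i}{\hbar}\int dt\,G(t_3-t)J(t)\cdot\frac{i}{\hbar}\int dt'\,G(t_2-t')J(t')\cdot\langle 0|0\rangle_J\right]_{J=0} = 0 + 0 = 0 \]

$\because$ there is always a factor of $J$ which kills the term at $J = 0$.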
For 4 derivatives, we will have,

\[ \frac{\delta}{\delta J_0}\Big[\ \frac{i}{\hbar}G(t_3-t_2)\cdot\frac{i}{\hbar}\int dt\,G(t_1-t)J(t)\cdot\langle 0|0\rangle_J\ +\ \frac{i}{\hbar}G(t_3-t_1)\cdot\frac{i}{\hbar}\int dt\,G(t_2-t)J(t)\cdot\langle 0|0\rangle_J \]
\[ \qquad +\ \frac{i}{\hbar}\int dt\,G(t_3-t)J(t)\cdot\frac{i}{\hbar}G(t_2-t_1)\cdot\langle 0|0\rangle_J \]
\[ \qquad +\ \left(\frac{i}{\hbar}\right)^3\int dt\,G(t_3-t)J(t)\cdot\int dt'\,G(t_2-t')J(t')\cdot\int dt''\,G(t_1-t'')J(t'')\cdot\langle 0|0\rangle_J\ \Big] \]

Carrying out the $J_0$ differentiation and putting $J = 0$, only the first 3 terms contribute since they have a single factor of $J(t)$. For a 4-point function, this is multiplied by $(-i\hbar)^4$ and we get,

\[ \langle 0|T\{Q(t_0)Q(t_1)Q(t_2)Q(t_3)\}|0\rangle = (-i\hbar)^2\left\{G_{01}G_{23} + G_{02}G_{31} + G_{03}G_{12}\right\}, \qquad G_{ij} \leftrightarrow G(t_i - t_j) . \tag{20.32} \]

We recognize this as the same pattern seen in the vacuum expectation value calculation using Wick's theorem. Indeed, the quantum field theory Wick's theorem shows up here just as a result of the functional differentiation. We now have all the definitions, notations and pattern of computation needed in the field theory generalization.

21. PATH INTEGRALS IN QUANTUM FIELD THEORY

We consider the path integral formulation of a field theory, specifically a scalar field theory. While discussing the field as a dynamical system, we had already noted that the notation $\phi(t, \vec x)$ refers to infinitely many degrees of freedom labeled by $\vec x \in \mathbb{R}^3$ (for us). Thus we could view $\phi(t, \vec x)$ as $\phi_{\vec x}(t)$ and draw analogy with $q_i(t)$. Each of these degrees of freedom can be quantized a la path integral exactly as the single degree of freedom discussed before. In place of $\int Dq(t) \xrightarrow[T=N\epsilon]{} \int\prod_{k=0}^{N-1}\frac{dq_k}{C(\epsilon)}$, we now have $\int D\phi(x) \xrightarrow[T=N\epsilon]{} \int\prod_{\vec x\in\Sigma}\prod_{k=0}^{N-1}d\phi_{\vec x,k}$. We have dropped the $C(\epsilon)$ which will be subsumed in the overall normalization constant. The action for the field theory is the usual one. Thus, the vacuum-to-vacuum transition amplitude in presence of a source is denoted as,

\[ \langle 0|0\rangle_J := \int D\phi\ \exp\left\{\frac{i}{\hbar}\int d^4x\left[\mathcal{L}(\phi, \partial_\mu\phi) + J(x)\phi(x)\right]\right\} =: Z[J] \tag{21.1} \]

The 'paths' implicit in the measure $D\phi$ are of course paths in the configuration space of the field i.e. the space of all $\phi(\vec x)$ at any fixed $t$.
A path itself connects $\phi_1(\vec x, t_1)$ to $\phi_2(\vec x, t_2),\ \forall\ \vec x$. The paths are not paths in the space-time.

For the Lagrangian density $\mathcal{L}$, we write $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$, with $\mathcal{L}_0 := -\frac{1}{2}\partial^\mu\phi\,\partial_\mu\phi - \frac{1}{2}m^2\phi^2$. The $m^2$ has $-i\epsilon$ implicit, in anticipation. The $\mathcal{L}_1$ will typically involve interaction terms, polynomials in $\phi$ such as $g_3\phi^3(x) + g_4\phi^4(x)\dots$ etc. Consider

\[ Z_0[J] := \int D\phi\ \exp\left\{\frac{i}{\hbar}\int d^4x\,(\mathcal{L}_0 + J(x)\phi(x))\right\} . \]

We go to the momentum space as in the case of the single particle. Let $\tilde\phi(k) := \int d^4x\ e^{-ik\cdot x}\phi(x)\ \leftrightarrow\ \phi(x) = \int\frac{d^4k}{(2\pi)^4}\,e^{ik\cdot x}\tilde\phi(k)$, with $k\cdot x := -k^0x^0 + \vec k\cdot\vec x$ and $k^0$ is the integration variable in $d^4k$ and not $k_0$. Substituting in the action and doing the $d^4x$ integration gives,

\[ S_0 = \frac{1}{2}\int\frac{d^4k}{(2\pi)^4}\left[-\tilde\phi(k)(k^2 + m^2)\tilde\phi(-k) + \tilde J(k)\tilde\phi(-k) + \tilde J(-k)\tilde\phi(k)\right] \]

Note that $S_0$ is not $\int\mathcal{L}_0$, but includes the source. Define $\tilde\phi(k) := \tilde\chi(k) + \frac{\tilde J(k)}{k^2 + m^2}$. Then $D\phi = D\chi$ since we have a "constant shift transformation". This also holds in the $\phi(x)$ space. Then, as in the case of the single oscillator, we get

\[ S_0 = \frac{1}{2}\int\frac{d^4k}{(2\pi)^4}\left[\frac{\tilde J(k)\tilde J(-k)}{k^2 + m^2} - \tilde\chi(k)(k^2 + m^2)\tilde\chi(-k)\right] . \]

The path integral $\int D\chi$ just gives a factor independent of $\tilde J$, namely $\langle 0|0\rangle_{J=0} = 1$ and we write, $\hbar = 1$ from now on,

\[ \langle 0|0\rangle_J = Z_0[J] = \exp\left[i\int\frac{d^4k}{(2\pi)^4}\left\{\frac{1}{2}\frac{\tilde J(k)\tilde J(-k)}{k^2 + m^2 - i\epsilon}\right\}\right] = \exp\left[\frac{i}{2}\int d^4x\int d^4x'\ J(x)\Delta(x-x')J(x')\right] \quad\text{where,} \]

\[ \Delta(x-x') = \int\frac{d^4k}{(2\pi)^4}\frac{e^{ik\cdot(x-x')}}{k^2 + m^2 - i\epsilon}, \qquad\text{(Feynman propagator)} \]

Note that the Feynman propagator arose because of the $-i\epsilon$ included in the mass term in the $\mathcal{L}_0$. As discussed for the oscillator, we get

\[ \langle 0|T\{\phi(x_1)\phi(x_2)\}|0\rangle = (-i)\frac{\delta}{\delta J(x_1)}\cdot(-i)\frac{\delta}{\delta J(x_2)}\,Z_0[J]\Big|_{J=0} = (-i)\frac{\delta}{\delta J(x_1)}\left[Z_0[J]\cdot\int\Delta(x'-x_2)J(x')\,d^4x'\ (-i)(i)\right]_{J=0} \]

\[ \Rightarrow\ \langle 0|T\{\phi(x_1)\phi(x_2)\}|0\rangle = -i\Delta(x_1-x_2) . \]

Exactly as for the oscillator, we get

\[ \langle 0|T\{\phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\}|0\rangle = (-i)^2\left[\Delta(x_1-x_2)\Delta(x_3-x_4) + \Delta(x_1-x_3)\Delta(x_2-x_4) + \Delta(x_1-x_4)\Delta(x_2-x_3)\right] \]

which easily generalizes to,

\[ \langle 0|T\{\phi(x_1)\dots\phi(x_{2n})\}|0\rangle = (-i)^n\sum_{\text{pairings}}\left[\Delta(x_1-x_2)\Delta(x_3-x_4)\dots\Delta(x_{2n-1}-x_{2n}) + \text{permutations}\right] \]

This is what we had obtained as Wick's theorem.

Note: Wick's theorem expressed the time ordered product of quantum fields in terms of the normal ordered product plus contractions. The normal ordering was for all Poincare generators to ensure invariance of the vacuum. The theorem was also proved for free fields for which we do have Fourier decomposition. Here too the "Wick's theorem" is again seen for free fields. Here it is no more than the chain rule of differentiation.

Consider now interacting fields, by which we mean $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$. Recall that $\langle 0|Q(t)|0\rangle = \int Dq\ q(t)e^{iS}$ and we can get the $q(t)$ in the integrand by $\frac{\delta}{\delta J(t)}S_J\big|_{J=0}$ (remember $S = S_J|_{J=0}$). Thus, insertion of $q(t)$'s, or $\phi(x)$'s in field theory, can be effected by $\frac{\delta}{\delta J(x)}S_J\big|_{J=0}$. If we expand $e^{i\int\mathcal{L}_1(\phi)} = \sum_{n=0}^{\infty}[i\int\mathcal{L}_1]^n/n!$, then we have integrals of polynomials of the fields which can be expressed as $\frac{\delta}{\delta J}S_J$. The $\delta_J$ can be taken outside of the path integration. Thus we write,

\[ Z[J] := \langle 0|0\rangle^{int}_J := \int D\phi\ \exp\left\{i\int d^4x\left[\mathcal{L}_0(\phi) + \mathcal{L}_1(\phi) + J(x)\phi(x)\right]\right\} \propto e^{i\int\mathcal{L}_1(-i\delta_{J(x)})}\cdot\int D\phi\ e^{i\int d^4x\{\mathcal{L}_0 + J(x)\phi(x)\}} \quad\text{or,} \]

\[ Z[J] \propto \exp\left[i\int d^4x\ \mathcal{L}_1\left(-i\frac{\delta}{\delta J(x)}\right)\right]\cdot Z_0[J], \quad\text{with} \tag{21.2} \]

\[ Z_0[J] := \exp\left[\frac{i}{2}\int d^4x\int d^4y\ J(x)\Delta(x-y)J(y)\right] \tag{21.3} \]

In the oscillator case, for $Z_0[J]$ we had the $\exp\left\{\frac{i}{2}\int\!\!\int J\Delta J\right\}$ term and a Gaussian path integral which turned out to be equal to $Z_0[J = 0]$. By taking the asymptotic time limits, we could assert this to be equal to 1 since it was the vacuum-to-vacuum amplitude without any source. In presence of interactions, even with $J = 0$, the vacuum may not remain unaffected and we cannot justify $\langle 0|0\rangle^{int}_{J=0} = 1$. Instead, the proportionality constant is determined by demanding $Z[J = 0] = 1$.

The prescription to compute $Z[J]$ via the $e^{i\mathcal{L}_1(-i\delta_J)}Z_0[J]$, is in effect the perturbative prescription. Let us see how this works in an example $\mathcal{L}_1(\phi) = g\phi^3(x)/3!$. Then $Z[J] \propto \exp\left[\frac{ig}{3!}\int d^4x\left(-i\frac{\delta}{\delta J(x)}\right)^3\right]\cdot Z_0[J]$.
Expand the exponential,

\[ Z[J] = N\sum_{v=0}^{\infty}\frac{1}{v!}\left[\frac{ig}{3!}\int d^4x\left(-i\frac{\delta}{\delta J(x)}\right)^3\right]^v\times\sum_{p=0}^{\infty}\frac{1}{p!}\left[\frac{i}{2}\int d^4y\int d^4z\ J(y)\Delta(y-z)J(z)\right]^p . \]

This too is a power series in $J$ as is $Z_0[J]$, but with different coefficients. Consider a particular term in this double sum with a fixed $v$ and $p$.

• Then we have $3v$ derivatives acting on $2p$ $J$'s, leaving $E = (2p - 3v)$ $J$'s. Clearly, $E \ge 0$ must hold and there are several such terms;

• The overall numerical factor associated with such a group of terms is:

\[ \left(\frac{ig}{3!}\right)^v\frac{(-i)^{3v}(i/2)^p}{v!\,p!} = \frac{i^{v-3v+p}\,g^v}{(3!)^v\,2^p\,v!\,p!} = (i)^{E+v-p}\left(\frac{g}{3!}\right)^v\frac{1}{2^p\,v!\,p!} \]

• The combinatorial factor resulting from the number of ways the derivative acts is: $3v$ derivatives on $2p$ $J$'s give $\frac{(2p)!}{(2p-3v)!}$ (This is because the first derivative can act on $2p$ $J$'s, second on $(2p-1)$, $\dots$ $(3v)$th on $(2p-3v+1)$ $J$'s.)

We can generate and keep track of the various derivatives by denoting

$\frac{ig}{3!}\frac{\delta^3}{\delta J^3(x)}\ \leftrightarrow$ [diagram: a vertex with three free ends, carrying the factor $\frac{ig}{3!}$; has $\int d^4x$],

$\frac{i}{2}J(x)\Delta(x-y)J(y)\ \leftrightarrow$ [diagram: a propagator line with two free ends, carrying the factor $\frac{i}{2}$; has $\int d^4y\int d^4z$],

and the operation of evaluating the derivative by joining the free ends from the 'vertices' to the free ends of the 'propagator lines' - exactly as we represented the Wick contractions. The number of terms with a given $v, p$ is the number of ways of joining the free ends and generating a diagram. Some of the diagrams may have identical factors associated with them. For example, the 3 $\delta_J$'s at a given vertex can be permuted in joining up with the propagator lines. This clearly gives a factor of $3!$. Likewise the ends of propagator lines give a factor of $2!$. The diagram will look the same if the $v$ vertices are themselves permuted (the $\int d^4x$ are dummy variables) and this gives $v!$. Similarly the propagator lines give $p!$. Thus, all the numerical factors, except $i, g$ cancel out.

Compared to the previous counting based on Wick's theorem, we have a double expansion and lines (pairs of $J$'s) to be contracted with the edges of the vertices.
Secondly, in Wick's theorem, thanks to the normal ordering of the interaction terms, we don't have self contractions at the vertices ($\Delta(x, x)$ propagators). In the present approach, there is no "normal ordering" and we do get such terms. Eg.

[Diagram: a self-contraction at a vertex; recoverable labels are $\delta(x-y)$, $\Delta(x-z)$, $\delta(x-z)$ and the points $x$, $y$, $z$.]

Such diagrams are called 'tadpoles'.

Thus we have more diagrams and different ways of working out the total combinatorial factor. The above bulleted points have gotten rid of all explicit factors, but could involve an over counting. This is compensated by dividing by a factor called the symmetry factor. This is a geometrical problem of identifying the groups of diagrams. Here are a couple of examples [12].

[Diagram 1]: We can exchange the two loops in 2 ways. We can exchange the ends of each loop in $2\times 2$ ways. Therefore the total symmetry factor is $2\cdot 2\cdot 2 = 8$.

[Diagram 2]: The 3 propagators can be permuted in $3\cdot 2$ ways. The 2 vertices can be exchanged in 2 ways. So the symmetry factor is: $3!\,2! = 12$.

A general statement and a derivation may be seen in the appendix of Sterman's book [14].

In the double expansion, we have terms with no $J$'s, a single $J$, 2 $J$'s $\dots$ etc. When $J = 0$ is put, only the terms with no $J$ survive. These are termed as the "vacuum bubbles" and are the contributions to the $Z[J = 0]$. A general contribution to a given number of left over $J$'s would consist of a product of topologically connected diagrams. Let $C_\gamma$ denote the contribution of a particular topologically connected diagram, $\gamma$. In a given diagram, $\Gamma$, let $\gamma$ occur $n_\gamma$ times. The symmetry factors resulting from permutations within each $\gamma$ are included in the contribution $C_\gamma$. But now we can also have permutations across different, topologically connected diagrams. Such permutations can leave the product diagram invariant if the different copies of a given $\gamma$ are permuted as a whole. Hence, we have an additional symmetry factor of $n_\gamma!$ by which we have to divide. A product diagram with different $\gamma$'s thus has a symmetry factor of $\prod_\gamma n_\gamma!$.
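The two quoted symmetry factors, 8 and 12, can be recovered by brute-force counting for the vacuum terms with $v = 2$ vertices and $p = 3$ propagators. The sketch below rests on assumptions that are ours: the two example diagrams (lost in extraction) are taken to be the "dumbbell" (a loop at each vertex, joined by one line) and the "theta" diagram (three lines joining the two vertices), and each way of pairing the six vertex legs is weighted by $1/((3!)^v\,v!)$, which is what remains of $1/((3!)^v\,2^p\,v!\,p!)$ once the $2^p\,p!$ relabelings of propagator ends and lines are accounted for. The names `pairings`, `legs`, `totals` are illustrative.

```python
from fractions import Fraction
from math import factorial

def pairings(items):
    """Yield every perfect matching of a list with an even number of entries."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# Six legs: three at vertex 0 and three at vertex 1.
legs = [(vertex, leg) for vertex in (0, 1) for leg in range(3)]
weight = Fraction(1, factorial(3) ** 2 * factorial(2))  # 1/((3!)^v v!) with v = 2

totals = {"theta": Fraction(0), "dumbbell": Fraction(0)}
count = 0
for matching in pairings(legs):
    count += 1
    # A pair with both legs on the same vertex is a self-contraction (a loop).
    self_loops = sum(1 for (a, b) in matching if a[0] == b[0])
    key = "theta" if self_loops == 0 else "dumbbell"
    totals[key] += weight
```

Each total is the reciprocal of the corresponding symmetry factor, and their sum equals $(2p)!/((3!)^v\,2^p\,v!\,p!) = 5/24$, the full coefficient of the $v = 2$, $p = 3$ term.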
We now return to the full $Z[J]$, which we can represent as a contribution from a sum of diagrams $\Gamma$, each of which can have several topologically connected $\gamma$'s with $n_\gamma$ copies. Hence,

\[ Z[J] \propto \sum_\Gamma C_\Gamma \sim \sum_{\{n_\gamma\}}\prod_\gamma\frac{(C_\gamma)^{n_\gamma}}{n_\gamma!} := N\prod_\gamma\sum_{n_\gamma=0}^{\infty}\frac{(C_\gamma)^{n_\gamma}}{n_\gamma!} = N\prod_\gamma\exp[C_\gamma] = N\exp\Big[\sum_\gamma C_\gamma\Big] . \]

Thus, $Z[J]$ is proportional to the exponential of the sum of contributions of topologically connected diagrams. In this sum are also the contributions of topologically connected vacuum diagrams i.e. $Z[J = 0]$. Imposing the normalization condition, $Z[J = 0] = 1$, gives $N\times\exp(\text{topologically connected vacuum bubbles}) = 1$. Hence, the normalization is trivially incorporated by simply dropping the contributions of the vacuum diagrams and we are left with,

\[ Z[J] = \exp\big[iW[J]\big], \qquad iW[J] = \sum\text{topologically connected diagrams.} \tag{21.4} \]

The Green's functions that we get from $Z[J]$, now get expressed as,

\[ \langle 0|\phi(x)|0\rangle = -i\left.\frac{\delta Z[J]}{\delta J(x)}\right|_{J=0} = \left.\frac{\delta W[J]}{\delta J(x)}\right|_{J=0}, \qquad \because\ e^{iW[J=0]} = 1 . \]

This gives the contribution of diagrams with a single $J(x)$, which is removed by the derivative before $J = 0$ is put. Similarly other Green's functions can be expressed in terms of the derivatives of the $W[J]$.

As an illustration of the organization of the topologically connected diagrams in $W[J]$, consider the two point function.

\[ G(x_1, x_2) := (-i)^2\left.\frac{\delta^2Z[J]}{\delta J_1\delta J_2}\right|_{J=0} = -i\,\delta_{J_1}\left[e^{iW}\delta_{J_2}W\right]_{J=0} = e^{iW}\left[\delta_{J_1}W\cdot\delta_{J_2}W - i\,\delta^2_{J_1J_2}W\right]_{J=0} \]

\[ \therefore\ G(x_1, x_2) = G_c(x_1)G_c(x_2) - G_c(x_1, x_2), \quad\text{where,}\quad G_c(x) := \delta_{J(x)}W[J]\big|_{J=0},\quad G_c(x, y) := i\,\delta^2_{J(x)J(y)}W[J]\big|_{J=0} \]

[Diagram: $G(x_1, x_2)$ = (two separate blobs $G_c(x_1)$, $G_c(x_2)$; topologically disconnected) $-$ (a single blob $G_c(x_1, x_2)$ joining $x_1$ and $x_2$; topologically connected).]

The $G_c$ denote contributions from connected (to external points) and topologically connected diagrams. This continues to the other $n$-point functions, as proved by the argument leading to eqn. (21.4). To conclude,

$G(x_1, \dots, x_n)$ : $\Sigma$ of all connected diagrams : $(-i)^n\left.\frac{\delta^nZ[J]}{\delta J(x_1)\dots\delta J(x_n)}\right|_{J=0}$

$G_c(x_1, \dots, x_n)$ : $\Sigma$ of topologically connected diagrams : $(-i)^{n-1}\left.\frac{\delta^nW[J]}{\delta J(x_1)\dots\delta J(x_n)}\right|_{J=0}$

$n = 0$ : $\Sigma$ of all vacuum diagrams : $Z[J = 0] = 1$

$n = 0$ : $\Sigma$ of all topologically connected vacuum diagrams : $W[J = 0] = 0$

The computation of the diagrams uses the same Feynman rules that we discussed earlier. The $Z$ and $W$ provide a convenient way of dealing with all the diagrams together and serve as generating functions for the Green's functions.

A. The 1-point function: Renormalization ↔ normal ordering

The 1-point function of a scalar field has a distinction. Since this represents the $\langle 0|\Phi(x)|0\rangle$, the Poincare invariance allows this to be a non-zero constant, say $v$. No other Lorentz covariant field can have its vacuum expectation value to be non-zero since there are no Lorentz invariant spinors or vectors (footnote 15). However, if we want the quanta exchanged during interactions to have the appropriate quantum numbers, the vacuum should not have any quantum numbers. As noted in the discussion of uniqueness of vacuum, a non-unique vacuum would have to carry non-trivial representation labels of the symmetry group which will add to the labels of the exchanged quanta. Consequently, we must have a unique vacuum and then, by the mode expansion of the quantum field, its expectation value must vanish. Does our path integral definition of the $n$-point function satisfy this property?

Consider computation of the 1-point function in the $(\phi)^3_n$ theory. Notice that topologically disconnected diagrams contributing to the 1-point function can only be the vacuum bubbles giving a multiplicative contribution of 1. So we can limit to only topologically connected diagrams. Furthermore, if we have 1PR diagrams, their contribution is again of a product form: $(1PI)\times\Delta(x-y)\times(1PI)\times\Delta(z-w)\times(1PI)\cdots$. Hence it suffices to consider only the 1PI diagrams. Thus we have,

$G_c(x)$ = [diagram] + [diagram] + $\cdots$

At 1-loop (and higher), the diagrams are non-zero and in fact divergent.
We invoke the counter term method to absorb away these divergences. Thus, we add a term in $L_1$ of the form $Y\phi(x)$, with $\phi(x)$ represented as $-i\,\delta/\delta J(x)$. This is represented by a new vertex diagram. With its inclusion, the contributing diagrams are:

[Figure: $G_c(x)$ as the sum of (a) the 1-loop tadpole with vertex $y$ and external point $x$, (b) the counter term vertex $Y$, and 2-loop diagrams including those with $Y$ insertions.]

Footnote 15: A second rank, symmetric tensor field can have $\langle 0|\hat{h}_{\mu\nu}(x)|0\rangle = \Lambda \eta_{\mu\nu}$. But this typically arises only when gravity is included.

These give, to 1-loop,

\[
\langle 0|\phi(x)|0\rangle = \Big( \underbrace{iY}_{(b)} + \underbrace{\frac{ig}{2} \cdot \big(-i\Delta(0)\big)}_{(a)} \Big) \int d^n y \, \big( i\Delta(x-y) \big) + o(g^3) .
\]

For the left hand side to be zero to $o(g)$, we must set $Y = i \frac{g}{2} \Delta(0) + o(g^3)$ with,

\[
\Delta(0) = \int \frac{d^n k}{(2\pi)^n} \frac{1}{k^2 + m^2 - i\epsilon} \qquad (\text{divergent for } n \ge 2)
\]
\[
= i \int \frac{d^n \bar{k}}{(2\pi)^n} \frac{1}{\bar{k}^2 + m^2} = i\, \frac{1}{(4\pi)^{n/2}} \frac{\Gamma(1 - n/2)}{\Gamma(1)} (m^2)^{n/2 - 1} \qquad \text{for } n = 4 - \epsilon \text{ (say)}
\]
\[
= i\, \frac{1}{16\pi^2} (-1) \frac{2}{\epsilon} (m^2)^{1 - \epsilon/2} .
\]

Here $\bar{k}$ denotes the Euclidean momentum obtained after the Wick rotation, and in the last line only the pole term is kept. Therefore,

\[
Y = \frac{g}{2} \cdot \frac{2}{\epsilon} \cdot \frac{m^2}{16\pi^2} \Big[ 1 - \frac{\epsilon}{2} \ln(m^2) + \cdots \Big] \approx \frac{g\, m^2}{16\pi^2 \epsilon} + o(\epsilon^0) + \cdots .
\]

Thus, we fix the counter term $Y$ by demanding, as the renormalization condition, that the 1-point function be zero, and this can be continued at higher orders. The upshot is that we can ensure $\langle 0|\phi(x)|0\rangle = 0$ by means of a counter term. Hence, the sum of all connected diagrams with a single external line vanishes. Clearly, a tadpole can attach to any other diagram only in a 1PR manner (e.g. replacing the source end of the tadpole by the other diagram). This is a multiplicative contribution and immediately renders all such diagrams zero. Hence all diagrams with a tadpole attached, when summed to a given order, vanish. We may thus simply remove the tadpole diagrams and obtain results equivalent to those obtained by the normal ordering prescription employed before.

Note: Had the 1-point function been finite, we would still need to use the counter term method to set it to zero to ensure uniqueness of the vacuum. In dimensional regularization, massless tadpole diagrams are zero.
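The pole of the counter term can be checked numerically. The sketch below evaluates $Y = i\frac{g}{2}\Delta(0)$ from the Gamma-function formula quoted above in $n = 4 - \epsilon$ dimensions and compares it with the leading term $g m^2/(16\pi^2\epsilon)$. It assumes units in which $m^2$ is a pure number; the function names are illustrative and not part of the notes.

```python
import math


def tadpole_counterterm(g, m2, eps):
    """Y = i (g/2) Delta(0) in n = 4 - eps dimensions (illustrative helper)."""
    n = 4.0 - eps
    # Euclidean integral: Gamma(1 - n/2) / (4 pi)^(n/2) * (m^2)^(n/2 - 1)
    euclidean_integral = (
        math.gamma(1.0 - n / 2.0) / (4.0 * math.pi) ** (n / 2.0) * m2 ** (n / 2.0 - 1.0)
    )
    # Delta(0) = i * euclidean_integral, so Y = i (g/2) Delta(0) = -(g/2) * integral
    return -0.5 * g * euclidean_integral


def pole_term(g, m2, eps):
    """Leading 1/eps term of Y."""
    return g * m2 / (16.0 * math.pi ** 2 * eps)


for eps in (1e-1, 1e-2, 1e-3):
    ratio = tadpole_counterterm(1.0, 1.0, eps) / pole_term(1.0, 1.0, eps)
    print(f"eps = {eps:g}: Y / pole term = {ratio:.5f}")
```

The ratio approaches 1 as $\epsilon \to 0$, the remainder being the $o(\epsilon^0)$ terms. Setting `m2 = 0.0` returns zero for any $n > 2$, in line with the remark that massless tadpoles vanish in dimensional regularization.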
Qn: Normal ordering eliminates all self contractions - the 1-point function in $\Phi^3$ as well as the 2-point function in $\Phi^4$, and likewise if higher order vertices are included. Consider, for instance, the 1-loop diagram contributing to the 2-point function in $\Phi^4$. Now there will be a 1-loop counter term for the 2-point function (absent in the normal ordered version). Does it affect the finite part?

Ans: The self contraction at any vertex, in any diagram, produces a 1-loop contribution as a multiplicative factor, $\Delta_F(0)$, and this multiplicative factor is independent of any momenta entering/exiting the vertex. Consider such a contribution to the self-energy, $\Pi(p^2)$, in the $\Phi^4$ theory. Unless a counter vertex is introduced, it is not possible to satisfy the renormalization condition, $\Pi(-m^2) = 0$, at the order $\lambda^1$ (here we consider on-shell renormalization for simplicity, so that $m^2 = m^2_{ph}$). So introduce an order $\lambda$ counter vertex. Then at 1-loop, the self energy equals $\Delta_F(0) + \delta_1$. Both are independent of $p^2$ and the renormalization condition implies that the two must add to zero. The self energy thus receives no contribution at 1-loop, exactly as if normal ordering had been invoked. In any other (1PI) diagram, if a self contraction appears, then the same counter vertex, $\delta_1$, will again automatically cancel the contribution, effectively enforcing normal ordering. Again, this is independent of the UV divergence and is a consequence of the renormalization condition on the self-energy.

B. Path Integrals and Statistical Mechanics

We discuss briefly an interesting connection between the path integral representation of the transition amplitude and the (grand) partition function of statistical mechanics [16].

The basic observation is that the transition amplitude, schematically
\[
\langle Q_2, t_2 | Q_1, t_1 \rangle = \int \mathcal{D}q \, \exp\Big\{ \frac{i}{\hbar} \int_{t_1}^{t_2} dt \, L(q(t), \dot{q}(t)) \Big\},
\]
can be taken as a postulate rather than a derivation from the matrix element of the quantum unitary evolution operator.
We can also separately postulate that the transition amplitude is the matrix element $\langle Q_2 | e^{-\frac{i}{\hbar}(t_2 - t_1)H} | Q_1 \rangle$. Mathematically, set $t_2 - t_1 =: -i\beta$, set $Q_2 = Q_1 =: Q$ and integrate over $Q$. On the left hand side, we get $\mathrm{Tr}\, e^{-\beta H}$, which is the usual canonical partition function. On the right hand side, the path integral turns into a path integral over all closed paths, of an integrand which is the exponential of a Euclidean continuation ($t \to -i\tau$) of the action. We thus obtain a path integral representation of the statistical mechanical canonical partition function, in terms of the Euclidean continuation of a classical action. This is used in the context of black hole entropy, which may be seen in [16].

22. PATH INTEGRALS AS GENERATING FUNCTIONALS

We have noted the two basic quantities, $Z[J]$ and $W[J] = -i \ln Z[J]$. Their functional derivatives with respect to $J(x)$ give the contributions to the $n$-point ($n \ge 1$) functions: $(-i)^n \delta^n Z[J]|_{J=0}$ contains the contributions of all the connected diagrams, while $(-i)^{n-1} \delta^n W[J]|_{J=0}$ contains the contributions of all the connected and topologically connected diagrams. The normalization condition, $Z[J=0] = 1 \leftrightarrow W[J=0] = 0$, omits the contributions of the vacuum bubbles. We had encountered the 1PI diagrams while discussing the self-energies. Their contributions too can be obtained by functional differentiation of another quantity, which we obtain below. Note that the $n$-point functions being obtained by functional derivatives also means that $Z[J], W[J]$ can be viewed as generating functionals for Green's functions.

A. The Generating Functionals: $Z[J]$, $W[J]$, $\Gamma[\Phi]$

Consider a connected and topologically connected diagram contributing to $\delta^n W[J]|_{J=0}$. It is said to be 1-particle reducible (1PR) if it can be disconnected by cutting 1 internal line. A diagram which is not 1PR is 1-particle irreducible (1PI). Any given diagram can be viewed as a collection of 1PI sub-diagrams connected by one internal line.
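The 1PR/1PI definition above translates directly into a graph test: a diagram is 1PI if it stays connected after any single internal line is cut. A minimal sketch, assuming a diagram is encoded as a list of vertex labels plus a list of internal lines (pairs of vertices, repeated pairs allowed), with external legs left out; this encoding and the function names are illustrative choices, not the notation of the notes.

```python
def is_connected(vertices, edges):
    """True if every vertex can be reached from the first one along `edges`."""
    vertices = list(vertices)
    if not vertices:
        return True
    adjacency = {v: [] for v in vertices}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(vertices)


def is_one_particle_irreducible(vertices, internal_lines):
    """True if cutting any single internal line leaves the diagram connected."""
    if not is_connected(vertices, internal_lines):
        return False
    for i in range(len(internal_lines)):
        remaining = internal_lines[:i] + internal_lines[i + 1:]
        if not is_connected(vertices, remaining):
            return False
    return True


# One-loop bubble: two vertices joined by two propagators.
bubble = is_one_particle_irreducible([0, 1], [(0, 1), (0, 1)])

# Two bubbles strung together by a single propagator (1, 2).
chain = is_one_particle_irreducible(
    [0, 1, 2, 3], [(0, 1), (0, 1), (1, 2), (2, 3), (2, 3)]
)

print(bubble, chain)
```

In the second example the line $(1,2)$ is the single internal line whose removal splits the diagram into two 1PI pieces.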
As noted in the discussion of self energy diagrams, all 1PI diagrams with two external lines, when strung together by an internal line (free propagator), lead to the exact propagator. Thus, the sum of all connected and topologically connected diagrams contributing to an $n$-point function can be viewed as new connected and topologically connected diagrams whose "vertices", $\gamma_k$, are all 1PI diagrams with $k > 2$ edges, connected by lines representing the exact propagator. These new diagrams all have a "tree topology" - there are no loops in such a reorganization of diagrams, since any loop would make a 1PI sub-diagram and will already be included in one of the $\gamma_k$'s. These vertices can be of any order, unlike the elementary vertices dictated by the Lagrangian defining a theory. The set of these new diagrams can be generated from another generating functional,

\[
Z_\gamma[J] := \int \mathcal{D}\varphi \, \exp\Big\{ i\gamma[\varphi] + i \int d^4x \, J(x)\varphi(x) \Big\} := e^{iW_\gamma[J]},
\]
\[
\gamma[\varphi] = \sum_{n \ge 2} \int dx_1 \cdots dx_n \, \frac{\gamma_n(x_1, \cdots, x_n)}{n!} \, \varphi(x_1) \cdots \varphi(x_n) .
\]

Note: The coefficients of the $\varphi^n$ terms are not simple numbers and this action is non-local. The coefficient of the $n = 2$ term, $\gamma_2(x_1, x_2)$, is the Fourier transform of the inverse of the exact propagator.

Note that this establishes that $W_S[J] = W_\gamma[J]\big|_{tree}$.

However, such $Z_\gamma[J], W_\gamma[J]$ will generate diagrams with loops as well, and we are interested in generating only the tree diagrams. To pick out these alone, introduce a fictitious dimensionless parameter, $\lambda$, and define

\[
Z_{\gamma,\lambda}[J] := \int \mathcal{D}\varphi \, \exp\Big\{ \frac{i}{\lambda} \Big( \gamma[\varphi] + \int d^4x \, J(x)\varphi(x) \Big) \Big\} := \exp\big[ iW_{\gamma,\lambda}[J] \big] .
\]

Let us count the powers of $\lambda$ in any connected and topologically connected diagram. Noting that the $\gamma_2$ term in $\gamma[\varphi]$ is the (Fourier transform of the) inverse of the exact propagator, the scaling by $\lambda^{-1}$ means that each propagator gives a factor of $\lambda$ while each vertex and every $J$ gives a factor of $\lambda^{-1}$.
Thus, for any diagram contributing to $W_{\gamma,\lambda}[J]$ we get the factor $\lambda^{P - V - E}$, where $P, V, E$ are the number of propagator lines, the number of vertices and the number of "external" lines (lines connected to a source) respectively. The number of "internal" lines is $I = P - E$. In the example diagram, we have $P = 6$, $V = 2$, $E = 4$ giving $\lambda^0$.

[Figure: example diagram with each propagator labelled $\lambda$ and each vertex and each source labelled $\lambda^{-1}$.]

For topologically connected diagrams we also have $L = I - V + 1 = P - E - V + 1$. Hence the factor of $\lambda$ is $\lambda^{L-1}$. We can thus write $W_{\gamma,\lambda}[J] = \sum_{L=0}^{\infty} \lambda^{L-1} W_{\gamma,L}[J]$ and organize the diagrams by the number of loops. In the formal limit $\lambda \to 0$, $W_{\gamma,\lambda \to 0}[J] \to \frac{1}{\lambda} W_{\gamma,L=0}$, which is the contribution of the tree diagrams (with exact propagators and exact vertices alone).

On the other hand, we may evaluate $Z_{\gamma,\lambda}[J]$ in the limit $\lambda \to 0$ by the stationary phase approximation. Clearly, the path integral is dominated by the fields $\varphi(x)$ which satisfy $\frac{\delta\gamma}{\delta\varphi} + J = 0$. Denoting its solution by $\varphi_{cl}(x)$, we get,

\[
Z_{\gamma,\lambda \to 0}[J] \simeq \exp\Big\{ \frac{i}{\lambda} \Big( \gamma[\varphi_{cl}] + \int d^4x \, J(x)\varphi_{cl}(x) \Big) \Big\} = e^{iW_{\gamma,\lambda \to 0}[J]} \simeq \exp\Big\{ \frac{i}{\lambda} W_{\gamma,L=0}[J] \Big\} .
\]
\[
\therefore\ \gamma[\varphi_{cl}] + \int d^4x \, J(x)\varphi_{cl}(x) = W_\gamma[J]\big|_{tree} = W_S[J] .
\]

The last equation shows that $\gamma[\varphi]$ is just the Legendre transform of $W[J]$ and, by construction, it is the generating function of connected, topologically connected, 1PI diagrams with external legs amputated! The last property follows because the vertex functions $\gamma_n(x_1, \cdots, x_n)$ explicitly do not have external propagator lines included in them.

Note: We have followed a route of organizing the diagrams as a tree of 1PI diagrams connected by exact propagators and inferred the corresponding generating function $\gamma[\varphi]$ as a Legendre transform of $W[J]$. The usual approach is to begin with a Legendre transform definition and arrive at its interpretation.

This conventional approach goes as follows. Define $\Phi(x)[J] := \frac{\delta W[J]}{\delta J(x)} = \langle 0|\hat{\Phi}(x)|0\rangle_J$. Here $J$ is not set to zero and hence this is not the 1-point function.
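We have the path integral representation as,

\[
\Phi(x)[J] = \frac{-i\,\delta \ln(Z[J])}{\delta J(x)} = \frac{-i}{Z[J]} \frac{\delta Z[J]}{\delta J(x)} = \frac{\int \mathcal{D}\varphi \, \varphi(x)\, e^{iS[J]}}{\int \mathcal{D}\varphi \, e^{iS[J]}}, \qquad v := \Phi(x)[J = 0] .
\]

Define $\bar{\Phi}(x) := \Phi(x) - v$. Then $\langle 0|\hat{\bar{\Phi}}(x)|0\rangle = 0$. This generalizes to,

Claim:
\[
\frac{\delta^n W[J]}{\delta J(x_1) \ldots \delta J(x_n)}\Big|_{J=0} = i^{n-1} \langle 0|T\{\bar{\Phi}(x_1) \ldots \bar{\Phi}(x_n)\}|0\rangle, \qquad \forall\ n \ge 2 .
\]

The proof is by induction starting at $n = 2$. For $n = 2$ we have,

\[
\frac{\delta^2 W[J]}{\delta J_1 \delta J_2}\Big|_{J=0} = \frac{\delta}{\delta J_2}\Big( \frac{-i}{Z[J]} \frac{\delta Z[J]}{\delta J_1} \Big)\Big|_{J=0}
= \Big[ \frac{i}{Z^2} \frac{\delta Z[J]}{\delta J_2} \frac{\delta Z[J]}{\delta J_1} - \frac{i}{Z} \frac{\delta^2 Z[J]}{\delta J_2 \delta J_1} \Big]_{J=0}
\]
\[
= i\,(iv)(iv) - i\,(i^2) \langle 0|\Phi(x_1)\Phi(x_2)|0\rangle
= i \big[ \langle 0|\Phi(x_1)\Phi(x_2)|0\rangle - v^2 \big] = i\, \langle 0|\bar{\Phi}(x_1)\bar{\Phi}(x_2)|0\rangle ,
\]

which verifies the claim for $n = 2$. The pattern repeats and the proof follows. As a corollary, we can write

\[
W[J] = \sum_{n \ge 2} \frac{1}{n!} \int dx_1 \ldots dx_n \, \frac{\delta^n W}{\delta J_1 \ldots \delta J_n}\Big|_{J=0} J(x_1) \ldots J(x_n) \tag{22.1}
\]
\[
= \sum_{n \ge 2} \frac{(i)^{n-1}}{n!} \int dx_1 \ldots dx_n \, \langle 0|T\{\bar{\Phi}(x_1) \ldots \bar{\Phi}(x_n)\}|0\rangle \, J(x_1) \ldots J(x_n) \tag{22.2}
\]

Using $\Phi(x)[J] = \delta_{J(x)} W[J]$, define the Legendre transform of $W[J]$,

\[
\Gamma[\Phi] := W[J] - \int d^4x \, J(x)\Phi(x), \qquad (\text{compare with: } -H(p) = L(\dot{q}) - p\dot{q}\,) \tag{22.3}
\]

Notice that this is exactly the same as the $\gamma[\varphi]$ defined above. The $\Gamma[\Phi]$ is defined by expressing the right hand side as a function of $\Phi$; in particular $J = J[\Phi]$ is understood, which is obtained by inverting the relation $\Phi[J] = \delta_J W[J]$. It follows, as usual for a Legendre transform,

\[
\frac{\delta \Gamma[\Phi]}{\delta \Phi(x)} = \int d^4y \, \frac{\delta W}{\delta J(y)} \frac{\delta J(y)}{\delta \Phi(x)} - \int d^4y \, \frac{\delta J(y)}{\delta \Phi(x)} \Phi(y) - \int d^4y \, J(y) \frac{\delta \Phi(y)}{\delta \Phi(x)} = -J(x) .
\]

In the last equality we have used $\frac{\delta \Phi(y)}{\delta \Phi(x)} = \delta^4(x - y)$. Since $\Phi(x)[J = 0] = v$, we also have $\frac{\delta \Gamma[\Phi]}{\delta \Phi(x)}\Big|_{\Phi(x) = v} = 0$. Thus, $v$ is that value which extremises $\Gamma[\Phi(x)]$, and this hints at $\Gamma[\Phi]$ being some sort of action whose extremization leads to a solution. This is supported further as follows.

We have $Z[J] = e^{iW[J]} = \int \mathcal{D}\varphi \, e^{iS[J,\varphi]}$. Let $\varphi_{cl}(x)$ be a classical solution, i.e. $\delta S[J, \varphi_{cl}] = 0$. Expand the action around such a solution by setting $\varphi(x) = \varphi_{cl}(x) + \eta(x)$. We get (schematically, to avoid clutter),

\[
S[\varphi] = S[\varphi_{cl} + \eta] = S[\varphi_{cl}] + \underbrace{\frac{\delta S}{\delta \varphi_{cl}}}_{0} \cdot \eta + \frac{1}{2} \frac{\delta^2 S}{\delta \varphi^2} \eta^2 + \ldots
= S[\varphi_{cl}] + \frac{1}{2} \int d^4x \int d^4y \, \eta(x) \frac{\delta^2 S[J, \varphi]}{\delta \varphi_{cl}(x) \delta \varphi_{cl}(y)} \eta(y) + \ldots
\]
\[
\therefore\ e^{iW[J]} = e^{iS[J, \varphi_{cl}]} \int \mathcal{D}\eta \, \exp\Big\{ \frac{i}{2} \int d^4x \int d^4y \, \eta(x) \frac{\delta^2 S}{\delta \varphi_{cl}(x) \delta \varphi_{cl}(y)} \eta(y) + \ldots \Big\} .
\]

The path integral does have an implicit dependence on $J(x)$ through $\varphi_{cl}(x)$, unless the action itself is quadratic, as was the case for the oscillator. Momentarily, let us just ignore the path integral altogether. Then $W[J] \approx W_0[J] := S[J, \varphi_{cl}] = \int d^4x \, [L(\varphi_{cl}) + J(x)\varphi_{cl}(x)]$, where the classical solution is to be first obtained by solving $\delta S[J, \Phi] = 0$ and then substituted back in the action.

Applying our definition, $\Phi(x) = \delta_{J(x)} W[J] \approx \delta_{J(x)} W_0[J] = \delta_{J(x)} S_{cl}[J]$, with $S_{cl} := S[J, \varphi_{cl}(x)[J]]$. Straightforward evaluation gives,

\[
\Phi(x) = \int d^4y \, \underbrace{\frac{\delta S_{cl}}{\delta \varphi_{cl}(y)}}_{=0} \frac{\delta \varphi_{cl}(y)}{\delta J(x)} + \varphi_{cl}(x),
\]

the last term coming from $\delta_J \int J\varphi$. Hence, in this approximation, $\Phi(x) = \varphi_{cl}(x)$.

Proceeding with the Legendre transform, we get

\[
\Gamma_0[\Phi] = W_0[J] - \int d^4x \, J(x)\Phi(x)
= \underbrace{S[\varphi_{cl}]}_{\text{no explicit } J} + \int d^4x \, J(x)\varphi_{cl}(x) - \int d^4x \, J(x)\Phi(x) = \underbrace{S[\varphi_{cl}]}_{\text{no explicit } J} . \tag{22.4}
\]

Thus, within the approximation, $\Gamma[\Phi]$ is just the classical action ($\varphi_{cl}(x) \to \Phi(x)$). But this also shows that when the path integral is included, $W[J]$ will be very different and so also the corresponding $\Gamma[\Phi]$. Thus, $\Phi(x)$ is identified as the "quantum corrected solution" while $\Gamma[\Phi]$ is called a "quantum action" or an "effective action" incorporating quantum corrections.

Consider Taylor expanding the effective action about $\Phi(x) = v$. We have $\Gamma[v] = W[J = 0] - \int 0 \cdot \phi = 0$, since $W[0] = 0$ by our normalization, $Z[0] = 1$. We have also seen that $\delta_v \Gamma = 0$. The remaining terms are a power series in $(\Phi(x) - v) =: \bar{\Phi}(x)$ with the coefficients evaluated at $v$. This is the same as regarding $\Gamma$ as a function of $\bar{\Phi}(x)$ and expanding it about $\bar{\Phi}(x) = 0$, i.e.

\[
\Gamma[\bar{\Phi}] = \sum_{n \ge 2} \int dx_1 \cdots dx_n \, \frac{\Gamma(x_1, \cdots, x_n)}{n!} \, \bar{\Phi}(x_1) \cdots \bar{\Phi}(x_n) .
\]

What do these coefficients represent? We already have the answer. We just need to recognize that $\Gamma[\bar{\Phi}] = \gamma[\varphi]$!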
The coefficients are the contributions of all connected, topologically connected, 1PI diagrams without external legs. An explicit demonstration of the amputation of the external legs may be seen in [17]. The generating functions are summarized below.

In summary

\begin{align}
Z[J] &:= \int \mathcal{D}\varphi \, e^{iS[J,\varphi]}, \qquad Z[J = 0] = 1 \quad (\text{normalization}) \tag{22.5} \\
G(x_1, \ldots, x_n) &:= (-i)^n \frac{\delta^n Z[J]}{\delta J(x_1) \ldots \delta J(x_n)}\Big|_{J=0} = \langle 0|T\{\Phi(x_1) \ldots \Phi(x_n)\}|0\rangle \tag{22.6} \\
W[J] &:= -i \ln(Z[J]), \qquad W[J = 0] = 0 \tag{22.7} \\
G_c(x_1, \ldots, x_n) &:= (-i)^{n-1} \frac{\delta^n W[J]}{\delta J(x_1) \ldots \delta J(x_n)}\Big|_{J=0} = \langle 0|T\{\Phi(x_1) \ldots \Phi(x_n)\}|0\rangle_c \tag{22.8} \\
\Phi(x)[J] &:= \frac{\delta W[J]}{\delta J(x)}, \qquad v := \Phi(x)[J = 0], \qquad \bar{\Phi}(x) := \Phi(x) - v \tag{22.9} \\
\Gamma[\Phi] &:= W[J] - \int d^4x \, J(x)\Phi(x) \tag{22.10} \\
J(x) &= -\frac{\delta \Gamma[\Phi]}{\delta \Phi(x)}, \qquad \frac{\delta \Gamma[\Phi]}{\delta \Phi(x)}\Big|_{\Phi(x) = v} = 0 \tag{22.11} \\
Z[J] &= 1 + \sum_{n \ge 1} \frac{i^{n}}{n!} \int d^4x_1 \ldots d^4x_n \, G(x_1, \ldots, x_n) J(x_1) \ldots J(x_n) \tag{22.12} \\
W[J] &= \sum_{n \ge 1} \frac{i^{n-1}}{n!} \int d^4x_1 \ldots d^4x_n \, G_c(x_1, \ldots, x_n) J(x_1) \ldots J(x_n) \tag{22.13} \\
\Gamma[\bar{\Phi}] &= \sum_{n \ge 2} \frac{1}{n!} \int d^4x_1 \ldots d^4x_n \, \Gamma_n(x_1, \ldots, x_n) \bar{\Phi}(x_1) \ldots \bar{\Phi}(x_n) \tag{22.14}
\end{align}

$G(x_1, \ldots, x_n)$ ↔ all connected diagrams (22.15)
$G_c(x_1, \ldots, x_n)$ ↔ all connected and topologically connected diagrams (22.16)
$\Gamma(x_1, \ldots, x_n)$ ↔ all connected, topologically connected, 1PI diagrams with external legs amputated (22.17)

In (22.12) the powers of $i$ and the leading 1 are fixed by (22.5) and (22.6).

B. The Renormalization Group Equation

Recall that while discussing the renormalized perturbation series, we introduced $\phi_0, m_0, g_0$ as the 'bare' quantities, with $L(\phi_0, m_0, g_0)$ generating the diagrams containing the UV divergences. We then introduced the renormalized variables $\phi, m, g_k$, defined through the scaling $\phi_0 := \sqrt{Z_\phi}\, \phi$, $m_0^2 Z_\phi := m^2 Z_m$, $g_{0k} Z_\phi^{k/2} := Z_{g_k} g_k$, which were finite by definition. We also introduced counter terms to take care of the UV divergences. With the $Z, G, \Gamma$ we have the formal representation of the totality of all diagrams. Going over to the momentum space labels we have (we take the 1-point function to be zero and a single coupling constant for convenience),

\[
\Gamma_{0,n}(p_1, \ldots, p_n) := \frac{\delta^n \Gamma_0[\phi_0]}{\delta \phi_0(p_1) \ldots \delta \phi_0(p_n)}\Big|_{\phi_0 = 0}, \qquad
\Gamma_n(p_1, \ldots, p_n) := \frac{\delta^n \Gamma_0[\sqrt{Z_\phi}\, \phi]}{\delta \phi(p_1) \ldots \delta \phi(p_n)}\Big|_{\phi = 0}
\]
\[
\Rightarrow\ \Gamma_{0n}(p_i, m_0, g_0) = Z_\phi^{-n/2} \, \Gamma_n(p_i, m, g) . \tag{22.18}
\]

The renormalized parameters are defined by a set of renormalization conditions, which introduce a scale, say $\mu$ (e.g. an MS or $\overline{\text{MS}}$ scheme). The bare quantities know nothing about this scale and therefore $\mu \frac{d \Gamma_0(p_i, m_0, g_0)}{d\mu} = 0$. Substitution gives,

\[
0 = \Big( \mu \frac{d}{d\mu} Z_\phi^{-n/2} \Big) \Gamma_n + Z_\phi^{-n/2} \, \mu \frac{d}{d\mu} \Gamma_n \quad \text{and,} \quad
\mu \frac{d}{d\mu} \Gamma_n = \Big( \mu \frac{\partial}{\partial \mu} + \mu \frac{dm}{d\mu} \frac{\partial}{\partial m} + \mu \frac{dg}{d\mu} \frac{\partial}{\partial g} \Big) \Gamma_n(p_i, m, g) .
\]

Thus we have the renormalization group (RG) equation for the renormalized vertex function as,

\[
\Big[ \mu \frac{\partial}{\partial \mu}\Big|_{m,g} + \beta(g) \frac{\partial}{\partial g}\Big|_{\mu,m} - \gamma_m(g)\, m \frac{\partial}{\partial m}\Big|_{\mu,g} - n \gamma_\phi \Big] \Gamma_n(p_i, m(\mu), g(\mu)) = 0, \tag{22.19}
\]

where,

\[
\beta(g, m) := \mu \frac{dg}{d\mu}, \qquad \gamma_m(g, m) := -\mu \frac{d \ln(m)}{d\mu}, \qquad \gamma_\phi(g, m) := \frac{1}{2} \mu \frac{d \ln(Z_\phi)}{d\mu} . \tag{22.20}
\]

The first is the beta function, the second is called the mass anomalous dimension, while the last is called the anomalous dimension. These are typically computed in perturbation theory as a power series in the renormalized coupling (see below). Different schemes give different simplifications and there is some scheme dependence in the coefficients of these functions.

As mentioned above, we describe a method of computation of the renormalization constants and their dependence on the couplings in the MS scheme [18]. We have the defining equations (22.19, 22.20). These were obtained by noting that the bare vertex functions, as functions of the bare parameters and the regulator, are independent of $\mu$ for fixed $g_0$, $m_0$ and $\epsilon$. When the renormalized parameters are not defined in terms of physically measured quantities, how do we compute the $\mu$ dependence of these parameters?

For this, we go back to the basic definitions and recall the relation between the bare and the renormalized parameters, in particular, $g_0 := \mu^\epsilon Z_g g$. Note that we have absorbed the $Z_\phi$ factor into $Z_g$ and have also introduced the $\mu$ dependence by taking $g$ to be dimensionless.
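The $\mu \partial_\mu$ is now evaluated with the bare parameters and the regulator held fixed, i.e.

\[
\beta(g) = \mu \partial_\mu \, g(g_0 \mu^{-\epsilon}, \epsilon)\Big|_{g_0, \epsilon} = \mu \partial_\mu \big( g_0 \mu^{-\epsilon} Z_g^{-1} \big) = -\epsilon g - g \, \mu \partial_\mu \ln(Z_g) .
\]

Noting that $Z_g$ is determined as a function of $g$, $Z_g(g, \epsilon) = 1 + \sum_{k \ge 1} \epsilon^{-k} Z_g^{(k)}$, we can write the defining equation for the $\beta$ function in the form,

\[
\beta(g, \epsilon) = -\epsilon g - g \, \beta(g, \epsilon) \, \partial_g \ln(Z_g(g, \epsilon)) \quad \leftrightarrow \quad \beta(g, \epsilon) \, \partial_g \big( g Z_g(g, \epsilon) \big) + \epsilon g Z_g(g, \epsilon) = 0 .
\]

The $\beta(g, \epsilon)$ must have a smooth limit as $\epsilon \to 0$ and we may consider $\beta \sim \beta_0 + \epsilon \beta_1 + \epsilon^2 \beta_2 + \cdots$. The $\partial_g(g Z_g) \sim 1 + \epsilon^{-1} + \cdots$ while $\epsilon g Z_g(g, \epsilon) \sim \epsilon + \epsilon^0 + \epsilon^{-1} + \cdots$. Clearly, the coefficients of the powers of $\epsilon$ greater than 1 must vanish. Hence we take $\beta(g, \epsilon) := \beta_0(g) + \epsilon \beta_1$. It follows that $\beta_1 = -g$ and the equation reduces to,

\[
0 = (\beta_0 - \epsilon g)(Z_g + g \partial_g Z_g) + \epsilon g Z_g \tag{22.21}
\]
\[
0 = \beta_0 \Big[ 1 + \sum_{k \ge 1} \frac{1}{\epsilon^k} \partial_g \big( g Z_g^{(k)} \big) \Big] - \Big[ g^2 \partial_g Z_g^{(1)} + \sum_{k \ge 1} \frac{g^2 \partial_g Z_g^{(k+1)}}{\epsilon^k} \Big] \tag{22.22}
\]
\[
\therefore\ \beta_0(g) = g^2 \partial_g Z_g^{(1)}, \qquad \beta_0 \, \partial_g \big( g Z_g^{(k)} \big) = g^2 \partial_g Z_g^{(k+1)} . \tag{22.23}
\]

The $\beta_0(g)$ is the usual beta function and is obtained from the coefficient $Z_g^{(1)}$ of the renormalization constant $Z_g$. The $Z_g^{(k)}$ coefficients with $k \ge 2$ are then determined recursively.

Note: The above expressions are in the context of the $\Phi^4$ coupling. For the Yukawa or the Yang-Mills cubic coupling, the coupling has $g_{Yukawa} = \mu^{\epsilon/2} g_{dimensionless}$ and the above equations will change accordingly.

As a practical application of the renormalization group equation, consider a theory which is massless and remains so perturbatively. Then the mass anomalous dimension term is absent. Putting $t := \ln(\mu/\mu_0)$ for some arbitrary scale $\mu_0$ at which the theory is defined (renormalization conditions are imposed), we write $\mu \frac{\partial}{\partial \mu} = \frac{\partial}{\partial t}$ and the equation takes the form,

\[
\Big[ \frac{\partial}{\partial t} + \beta(g) \frac{\partial}{\partial g} - n \gamma(g) \Big] \Gamma_n(g, t, p_i) = 0 \quad \text{with} \quad \beta(g) = \frac{dg(t)}{dt} .
\]

This immediately gives $\frac{d \Gamma_n(g(t), t, p_i)}{dt} = n \gamma(g(t)) \, \Gamma_n$. Its solution is obtained as,

\[
\Gamma_n(t, g_0, p_i) = \Gamma_{n,0}(g(t, g_0), p) \, \exp\Big\{ n \int_0^t dt' \, \gamma(g(t', g_0)) \Big\} \quad \text{where,} \tag{22.24}
\]
\[
\frac{dg(t, g_0)}{dt} = \beta(g(t, g_0)), \qquad g(0, g_0) = g_0 .
\]

Typically, the beta function is known only as a power series in the coupling.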
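The flow equation $dg/dt = \beta(g)$, $g(0) = g_0$, can be integrated numerically once a beta function is given. The sketch below uses a purely hypothetical beta function, $\beta(g) = g(1-g)(2-g)(3-g)$, chosen only because it has several zeros; it is not derived from any theory discussed in these notes. The integrator is a standard fourth-order Runge-Kutta step and the function names are illustrative.

```python
def beta(g):
    """Hypothetical beta function with zeros at g = 0, 1, 2, 3 (toy example)."""
    return g * (1.0 - g) * (2.0 - g) * (3.0 - g)


def run_coupling(g0, t_final, steps=2000):
    """Integrate dg/dt = beta(g) from t = 0 to t_final with RK4, g(0) = g0."""
    dt = t_final / steps
    g = g0
    for _ in range(steps):
        k1 = beta(g)
        k2 = beta(g + 0.5 * dt * k1)
        k3 = beta(g + 0.5 * dt * k2)
        k4 = beta(g + dt * k3)
        g += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return g


for start in (0.5, 1.5, 2.5):
    print(f"g0 = {start}: g(t = 20) = {run_coupling(start, 20.0):.6f}")
```

With this toy input, couplings starting on either side of $g = 1$ are driven towards it as $t$ increases, while a coupling starting between 2 and 3 is driven to 3: the coupling stays inside the interval between two neighbouring zeros and moves in the direction set by the sign of $\beta$.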
Once the solution $g(t, g_0)$ is known, the $t$ dependence of all vertex functions is determined. A good deal of qualitative behavior can be gleaned from studying the beta function - especially near its zeros. We know it has a zero at $g = 0$ (perturbative calculation), but it may have other zeros. The derivative of $\beta(g)$ near its fixed points controls how $g(t)$ evolves with $t$. Let us assume it to be some given function of the form shown in the figure.

[Figure: $\beta(g)$ plotted against $g$, with zeros at $g_0, g_1, g_2, g_3$ and the intervals I, II, III between them.]

It has multiple zeros, say $g_0 = 0 < g_1 < g_2 < g_3 \ldots$. Let I, II, III, $\ldots$ denote the intervals on the $g$-axis bounded by the zeros. Clearly, given an initial coupling in one of these intervals, it will remain so for all $t$. If $\beta(g) > 0$ in an interval, then $g(t)$ will evolve to its upper bound, and the opposite if $\beta(g) < 0$. A coupling will always flow to a UV-stable (we are considering increasing $t$) fixed point. In the figure, $g_1, g_3$ are the stable fixed points while $g_0, g_2$ are unstable ones.

Let the initial coupling $g_0$ be in region I so that as $t \to \infty$, $g(t, g_0) \to g_1$. The asymptotic behavior of $\Gamma_n$ is controlled by the integral of the anomalous dimension. Let us write,

\[
\int_0^t dt' \, \gamma(g(t', g_0)) = \int_0^t dt' \, \big[ \gamma(g(t', g_0)) - \gamma(g_1) \big] + \int_0^t dt' \, \gamma(g_1) .
\]

In the first term, we may take the upper limit $t \to \infty$ since the integrand provides exponential suppression. This integral is then some constant, $C$. In the second term, $\gamma(g_1)$ is independent of $t'$ and thus evaluates to $\gamma(g_1) \cdot t$. Hence,

\[
\Gamma_n(t, g_0, p_i) \xrightarrow[t \to \infty]{} \big[ \Gamma_{n,0}(g_1, p_i) \cdot C \cdot \exp\{\gamma(g_1) t\} \big], \qquad t = \ln(\mu/\mu_0) .
\]

Here $\mu$ is some large scale, e.g. some of the invariants $p_i \cdot p_j$. Thus, we get $\Gamma_n(\ln(\mu/\mu_0), g_0, p_i) \xrightarrow[\mu \to \infty]{} (\mu/\mu_0)^{\gamma(g_1)}\ \forall\ n$. The exponent is the same for all $n$-point vertex functions and is governed by a stable fixed point in the vicinity of $g_0$. All these vertex functions have the same asymptotic behavior. For further applications, I refer you to [19].

C. The Background Field Method

As an illustration of the formal advantage of the functional methods, we discuss the so called background field method, extremely useful for non-abelian gauge theories [20]. For illustration, we consider the simplest case of a scalar field $\varphi$. We have the basic definitions

\[
e^{iW[J]} = Z[J] := \int \mathcal{D}\varphi \, e^{iS[\varphi] + i\int J(x)\varphi(x)}, \qquad \bar{\varphi}(x) := \frac{\delta W[J]}{\delta J(x)}, \qquad \Gamma[\bar{\varphi}] := W[J(\bar{\varphi})] - \int J(x)\bar{\varphi}(x) .
\]

Introduce a new, arbitrary background field, $\chi(x)$, and define,

\[
e^{i\tilde{W}[J, \chi]} = \tilde{Z}[J, \chi] := \int \mathcal{D}\varphi \, e^{iS[\varphi + \chi] + i\int J(x)\varphi(x)}, \qquad
\tilde{\varphi}(x) := \frac{\delta \tilde{W}[J, \chi]}{\delta J(x)}, \qquad \tilde{\Gamma}[\tilde{\varphi}, \chi] := \tilde{W}[J(\tilde{\varphi}, \chi), \chi] - \int J(x)\tilde{\varphi}(x) .
\]

In the new generating functionals, shift the integration variable as $\varphi \to \varphi - \chi$. This is a simple translation and gives the Jacobian to be 1. It is immediate that $\tilde{Z}[J, \chi] = Z[J] \, e^{-i\int J(x)\chi(x)}$ and $\tilde{W}[J, \chi] = W[J] - \int J(x)\chi(x)$. Clearly, $\tilde{\varphi}(x) = \bar{\varphi}(x) - \chi(x)$ and

\[
\tilde{\Gamma}[\tilde{\varphi}, \chi] = W[J] - \int J(x)\chi(x) - \int J(x)\big( \bar{\varphi}(x) - \chi(x) \big) = \Gamma[\bar{\varphi}] = \Gamma[\tilde{\varphi} + \chi] .
\]

If we arrange that $\tilde{\varphi} = 0$, then $\Gamma[\chi] = \tilde{\Gamma}[0, \chi]$. The background field $\chi$ being arbitrary, we get an alternate method of computing the effective action $\Gamma[\chi]$ using the shifted fields. The shifted action has the form:

\[
S[\varphi + \chi] = S[\chi] + \int L_1(\chi)\varphi(x) + \int L_2(\chi)\varphi^2(x) + \cdots .
\]

Since there is no functional integration over $\chi$, the first term is just the classical action. Since $\chi$ is arbitrary, the second term does not vanish (it vanishes if $\chi$ is an exact solution of the classical equations of motion). The third term gives the propagator for the $\varphi$ field and the higher order terms give interactions among the $\varphi, \chi$ fields.

Now, the shifted effective action, $\tilde{\Gamma}$, is a generator of 1PI diagrams in the presence of $\tilde{\varphi}$ and $\chi$. That is, its derivatives with respect to $\tilde{\varphi}$ will give the 1PI diagrams in the presence of $\chi$. The $\varphi$ propagator, which comes from the terms in the action which are quadratic in $\varphi$, will in general be $\chi$-dependent and the vertices will have factors of $\chi(x)$. When $\tilde{\varphi} = 0$, only the diagrams with no external $\tilde{\varphi}$ lines i.e.
only the vacuum diagrams will contribute. Thus the desired effective action can be computed using only vacuum diagrams, albeit with vertex factors having the $\chi$ field.

There are two ways to approach the computation. If we treat the background field exactly, then in obtaining the Feynman rules the propagator of the $\varphi$ field will have the $\chi$ field as well (the terms quadratic in $\varphi$ in $S[\varphi + \chi]$). Except when $\chi$ is space-time independent, this propagator is complicated, and for a general background field this method is not useful. An alternative is to treat $\chi$ also perturbatively, i.e. the $\varphi$ propagator remains exactly the same as before and the additional $\chi$-dependent terms are treated as additional couplings (the $\varphi$ factors are replaced by $\frac{\delta}{\delta J}$). The Feynman rules have the same $\varphi$-propagator, there is no propagator for the $\chi$ field, and the vertices have extra edges denoting the background field. Since there is no $\chi$ propagator, these vertices generate only external $\chi$ lines.

As an explicit example, consider the $(\varphi^3)_6$ theory. The shift by a background field gives,

\[
S[\varphi + \chi] = S[\chi] + S[\varphi] + \int d^6x \Big[ -\partial_\mu \varphi \, \partial^\mu \chi - m^2 \varphi(x)\chi(x) - \frac{g}{2!} \big( \varphi(x)\chi^2(x) + \varphi^2(x)\chi(x) \big) \Big] .
\]

The $S[\chi]$ comes out of the path integral, the $S[\varphi]$ gives the usual $\varphi$-propagator and the $\varphi^3$ vertex, while the last four terms give the additional interaction vertices. Since in the 1PI diagrams we are interested in, the internal lines are only $\varphi$-lines and the external lines are only $\chi$-lines, the first three of the four additional vertices are irrelevant and only the $\varphi^3$ and $\varphi^2\chi$ vertices remain.

It is a simple exercise in power counting to determine the superficially divergent 1PI diagrams with no internal $\chi$-lines and no external $\varphi$-lines. Consider a 1PI diagram with $n_{0,3}$ $\varphi^3$ vertices and $n_{1,2}$ $\chi\varphi^2$ vertices. Then the number of internal $\varphi$-lines is given by $2I_\varphi = 3n_{0,3} + 2n_{1,2}$ and the number of external $\chi$-lines is given by $E_\chi = n_{1,2}$. It is easy to see that the superficial degree of divergence, $D$, is then given by $D = 6 - 2E_\chi$.
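The power counting above can be spelled out in a few lines. The sketch assumes the usual counting in six dimensions: each loop contributes 6 powers of momentum, each internal $\varphi$-propagator contributes $-2$, and the number of loops of a connected diagram is $L = I_\varphi - V + 1$. The function name and argument names are illustrative.

```python
def superficial_degree(n_phi3, n_chi_phi2, dim=6):
    """Superficial degree of divergence D = dim * L - 2 * I for a connected
    diagram with only internal phi-lines and only external chi-lines."""
    phi_legs = 3 * n_phi3 + 2 * n_chi_phi2
    if phi_legs == 0 or phi_legs % 2:
        raise ValueError("phi legs must pair up into internal lines")
    internal = phi_legs // 2            # 2 I = 3 n_{0,3} + 2 n_{1,2}
    vertices = n_phi3 + n_chi_phi2
    loops = internal - vertices + 1     # L = I - V + 1
    return dim * loops - 2 * internal


for n_phi3 in (0, 2, 4):
    for n_chi_phi2 in (0, 1, 2, 3, 4):
        if n_phi3 == 0 and n_chi_phi2 == 0:
            continue
        degree = superficial_degree(n_phi3, n_chi_phi2)
        # E_chi = n_{1,2}, so the text's formula reads D = 6 - 2 E_chi.
        assert degree == 6 - 2 * n_chi_phi2
        print(n_phi3, n_chi_phi2, degree)
```

In every case the result depends only on the number of $\chi\varphi^2$ vertices, i.e. on $E_\chi$.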
Notice that $n_{0,3}$ drops out of the degree of divergence. The renormalizability of the theory is thus manifest and only the 2-point and the 3-point vertex functions are divergent.

How does renormalization work in the background field method? Re-scale the fields as $\varphi \to \sqrt{Z_\varphi}\, \varphi$, $\chi \to \sqrt{Z_\chi}\, \chi$. Then the $\varphi$-propagator will get a factor of $Z_\varphi^{-1}$. The $n_{0,3}$ vertices give a factor of $(Z_\varphi^{3/2})^{n_{0,3}}$ and the $n_{1,2}$ vertices give a factor of $Z_\varphi^{n_{1,2}} \cdot (Z_\chi^{1/2})^{n_{1,2}}$. The $I_\varphi$ lines give a factor of $Z_\varphi^{-I_\varphi}$. Clearly, the factors of $Z_\varphi$ cancel out since $-I_\varphi + \frac{3}{2} n_{0,3} + \frac{2}{2} n_{1,2} = 0$. We may thus choose not to renormalise the $\varphi$ fields. The left over factor is $Z_\chi^{n_{1,2}/2} = Z_\chi^{E_\chi/2}$, as expected.

Notice that the diagrams that need to be computed for any vertex function are exactly the same as without the background field, except for the vertex factors from the vertices connecting the external lines. The renormalization procedure(s) then proceed as usual, e.g. as discussed in subsection 19 B. For the scalar fields used in the illustration, there is no particular advantage. But with gauge theories the method allows enormous simplification [20]. This is beyond the scope of this course though.

Returning to our generating functionals $Z, W, \Gamma$: we have seen how the perturbative diagrams for the Green's functions and the S-matrix elements (through the vertex functions) can be subsumed by formal manipulations with the basic path integral. The utility of the formalism goes beyond perturbation theory and computations of cross-sections. When it comes to gauge theories, especially the non-abelian ones with/without spontaneous symmetry breaking, the generating functionals provide a convenient tool to establish renormalizability and unitarity. The proof of the Ward identities - constraints or relations enforced by gauge invariance on different $n$-point functions - is most economical in this framework.
The existence of anomalies (violations of classical symmetries at the quantum level) and their treatment are also much more transparent in these functional methods. The basic path integral is defined without any presupposition of perturbation theory. If the basic definition can be implemented, e.g. via a lattice discretization, we can potentially have a non-perturbative handle through the computation of the vertex functions.

23. CLOSING REMARKS

The lectures were organized around four strands. The main references used for each strand are also cited.

1. Poincare symmetry realization and its consequences [1, 2]: This involved the particle representations, which identified the attributes of the permissible quanta; manifest covariance needs the field representations, and the irreducibility condition imposed the linear field equations; the unitarity analysis revealed anti-particles; the action formulation showed the free classical fields as a dynamical system of infinitely many independent oscillators; introduction of interaction with a non-dynamical source brought in the various Green's functions, with the Feynman propagator conflicting with a causally consistent classical interpretation; quantization of free fields required their covariance to be formulated differently and led to the CPT theorem; the requirement of causality led to the spin-statistics theorem; states of free fields naturally contain particles as relativistic wave packets and pave the way for recovering the usual non-relativistic limit.

2. Interacting quantum fields [9–11]: This was discussed within the context of scattering experiments; the corresponding scattering theory postulates were discussed, leading to the Kallen-Lehmann spectral representation; vanishing of the 1-point function is required for the stability of the (unique) vacuum; LSZ reduction was discussed to relate the S-matrix elements to vacuum expectation values of time-ordered fields; covariant perturbation theory was premised on the interacting fields having the same canonical commutation relations as the in/out fields, which are deemed physical (masses and residues at poles); S-matrix elements are given entirely in terms of free quantum fields with arbitrarily specified but Lorentz invariant interactions, and a diagrammatic recipe follows. This characterizes interacting quantum fields as facilitating discrete transactions of energy/momentum/spin/charge etc. via virtual quanta. The off-shell quanta do not satisfy the equations of motion and hence represent a model of quantum fluctuations.

3. Application to the Yukawa interaction and QED up to 1-loop [12, 13]: The early successes at tree level were discussed, followed by the radiative corrections; the IR divergences highlighted the care needed in applying the formalism to what is actually measured; the UV divergences highlighted the unacceptable, unbounded dominance of the quantum fluctuations and questioned the purely theoretical ('bare') parameters in the Lagrangian; some renormalization procedure needs to be adopted to identify the physically measured parameters; the generic problem is to be handled recursively using counter terms, which were illustrated at the 2-loop level; the renormalization process also led to the renormalization group equation.

4. The path integral formulation [12, 17, 18]: The classical action has a direct and central role in computing quantum transition amplitudes; the path integral provides generating functionals for various n-point functions and is a powerful tool both for formal studies and for richer non-abelian gauge theories with/without spontaneous symmetry breaking (SSB); when extended to these more general theories, it follows that one needs to identify the 'correct fields' and their interactions before computing the various n-point functions; qualitative physics guesses are invoked to propose the choice of the unique vacuum. In the gauge theory context, for instance, gauge invariance under large gauge transformations (non-trivial at infinity) reveals the θ-vacua, and some choice is made by nature.

[1] Steven Weinberg, The Quantum Theory of Fields, Volume I, Cambridge University Press, 1995.
[2] C. Wetterich, Massless Spinors in More Than Four Dimensions, Nucl. Phys. B 211, 177 (1983).
[3] G Date, Lecture on Constrained Systems, arXiv:1010.2062.
[4] Hong-Hao Zhang, Kai-Xi Feng, Si-Wei Qiu, An Zhao, Xue-Song Li, A Note on analytic formulas of Feynman propagators in position space, Chin. Phys. C 34, 1576 (2010), arXiv:0811.1261.
[5] W. Greiner and J. Reinhardt, Quantum Electrodynamics, 4th edition, Springer, 2009, page 68.
[6] D F Walls and G J Milburn, Quantum Optics, second edition, Springer-Verlag, 2008.
[7] R J Glauber, Quantum Theory of Optical Coherence: Selected Papers and Lectures, Wiley-VCH Verlag GmbH & Co. KGaA, 2007.
[8] R. Jagannathan and S. A. Khan, Quantum Mechanics of Charged Particle Beam Optics, CRC Press, 2019.
[9] M Reed and B Simon, Methods of Modern Mathematical Physics, Scattering Theory, Vol III, Academic Press, 1979.
[10] R G Newton, Scattering Theory of Waves and Particles, chapters 6 and 7, Springer-Verlag, NY, 1982.
[11] J D Bjorken and S D Drell, Relativistic Quantum Fields, chapters 16 and 17, McGraw Hill, 1965.
[12] M Srednicki, Quantum Field Theory, Cambridge University Press, 2007.
[13] M Peskin and D V Schroeder, An Introduction to Quantum Field Theory, Addison-Wesley, 1995.
[14] George Sterman, An Introduction to Quantum Field Theory, Cambridge University Press, 1993.
[15] S. Elaydi, An Introduction to Difference Equations, Springer, 2005.
[16] G W Gibbons and S W Hawking, Action integrals and partition functions in quantum gravity, Phys. Rev. D 15, 2752 (1977).
[17] E S Abers and B W Lee, Gauge Theories, Physics Reports, 9, 1 (1973).
[18] D Gross, Methods in Field Theory, Les Houches, Ed. R Balian and J Zinn-Justin, 1975.
[19] S Coleman, Quantum Field Theory: Lectures of Sidney Coleman, Chapter 50, World Scientific, 2019.
[20] L F Abbott, Introduction to the background field method, Acta Physica Polonica, B13, 33 (1982).
