Les Houches 96

Theory and Observations of the Cosmic
Background Radiation
J. Richard Bond
CIAR Cosmology Program
Canadian Institute for Theoretical Astrophysics
Abstract
These lecture notes describe the state of CMB research up to Spring
1996. The lectures were originally delivered at the Les Houches Ses-
sion LX in August 1993, but as the written version evolved an attempt
was made to keep up with the tremendous pace of development in CMB
theory and observation. A beta-release preprint version was generated
in Winter 1995. This final version corresponds to the published one,
with a few further bugs corrected. It was updated to include the “final”
COBE/FIRAS data, 4-year COBE/DMR and various smaller angle
anisotropy data sets, and a brief discussion of the CMB satellites, MAP
and Planck. Among the topics covered are: A comprehensive treatment
of energy injection of all kinds at high redshift, and the constraints de-
rived from the limits on spectral distortion of the CMB. The observation
of anisotropies and phenomenological temperature power spectrum esti-
mates. How 3D random fields illuminated by differential visibility map
via transport to various 2D radiation patterns, with applications to pri-
mary sources such as the integrated Sachs Wolfe effect, photon bunching
and the Doppler effect, and to secondary sources such as the Sunyaev-
Zeldovich thermal effect and emission from dusty galaxies. The theory
of primary anisotropies (those derived from linear perturbation theory)
is developed and the historical path in calculational procedures to the
current high precision emphasis is described. Procedures for calculating
secondary anisotropies (those derived from nonlinear development) are
also sketched. CMB observations are related to early universe fluctuation
power spectra amplitudes and shapes for inflation-based models, and are
connected to large scale structure observations and galaxy formation to
constrain structure formation theories and the cosmological parameters
that define them. Technical appendices give a comprehensive treatment
of perturbation theory of the Einstein-Boltzmann equations, beginning
from the nonlinear ADM equations of transport, conservation, constraint
and geometrodynamics. Included are: a full treatment of momentum space
gauge freedom as well as the coordinate freedom in the radiative transport equa-
tion, full derivations of polarized photon source functions, tensor (gravity wave)
perturbations, gravitational lensing, the shear viscosity and thermal diffusions with
polarization in the tight-coupling regime, etc.
in Cosmology and Large Scale Structure, pp. 469-674,

eds. R. Schaeffer, J. Silk, M. Spiro and J. Zinn-Justin, Elsevier (1996)
Contents
1 Introduction and basic properties 5
2 Spectral observations and constraints 9
3 Spectral distortion theory 13

3.1 Radiative transport in the expanding universe . . . . . . . . . . 13
3.2 Source functions for spectral distortions . . . . . . . . . . . . . 17
3.2.1 Compton scattering and the Kompaneets source term . 17
3.2.2 Bremsstrahlung . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.3 Double Compton scattering . . . . . . . . . . . . . . . . 21
3.2.4 Rayleigh scattering . . . . . . . . . . . . . . . . . . . . . 22
3.2.5 Line radiation . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.6 Synchrotron . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.7 Dust grains . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3 The cosmic photosphere and Bose–Einstein distortions . . . . . 27
3.4 Recombination and photon decoupling . . . . . . . . . . . . . . 28
3.4.1 Hydrogen and Helium Recombination . . . . . . . . . . 28
3.4.2 Visibility and decoupling . . . . . . . . . . . . . . . . . 33
3.5 Reionization of the universe . . . . . . . . . . . . . . . . . . . . 34
3.6 Post-recombination energy sources . . . . . . . . . . . . . . . . 36
4 Phenomenology of CMB anisotropy 41

4.1 Statistical measures of the radiation pattern: C(θ), $C_\ell$, . . . . . . . 42
4.2 Experimental arrangements and their filters . . . . . . . . . . . 45
4.2.1 Pixel–pixel correlation filters . . . . . . . . . . . . . . . 45
4.2.2 Beams and dmr and firs . . . . . . . . . . . . . . . . . . 48
4.2.3 2-Beams, 3-beams, oscillating beams, . . . . . . . . . . . . 49
4.3 Primary power spectra for inflation-based theories . . . . . . . 51
4.4 2D spectra with tilt and a Gaussian coherence angle . . . . . . 55
4.5 Experimental band-powers: past and present . . . . . . . . . . 57
4.6 Measuring cosmological parameters with the CMB . . . . . . . 65
5 Primary and secondary sources of anisotropy 76

5.1 Angular power spectra from 3D random source-fields . . . . . . 76
5.1.1 Simple sample sources . . . . . . . . . . . . . . . . . . . 78
5.1.2 Angular power spectra for simple sample sources . . . . 79
5.1.3 Products of Bessel functions . . . . . . . . . . . . . . . . 81
5.1.4 Fourier derivation of the simple sample spectra at high $\ell$ 83
5.1.5 Narrow and broad visibilities . . . . . . . . . . . . . . . 84
5.1.6 Power-law spectra with coherence scales in 3D and 2D . 85
5.2 The primary primary anisotropy effects . . . . . . . . . . . . . 86
5.2.1 Sachs–Wolfe, photon-bunching and Doppler sources . . 87
5.2.2 Longitudinal and synchronous pictures of the Sachs–Wolfe
effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.2.3 Differential power spectrum and form factors . . . . . . 88
5.2.4 Damping . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.2.5 Early reionization form factors . . . . . . . . . . . . . . 91
5.2.6 The isocurvature effect on low multipoles . . . . . . . . 91
5.3 Secondary anisotropies . . . . . . . . . . . . . . . . . . . . . . . 92
5.3.1 Sample secondary anisotropy power spectra . . . . . . . 92
5.3.2 Anisotropy power from dusty primeval galaxies . . . . . 94
5.3.3 SZ and nonlinear Thomson scattering from clusters . . . 96
5.3.4 Single-cluster observations of the SZ effect . . . . . . . . 98
5.3.5 The maximum entropy nature of Gaussian anisotropies 101
5.3.6 Quadratic nonlinearities in Thomson scattering . . . . . 101
5.3.7 The influence of weak gravitational lensing on the CMB 104
6 Perturbation theory of primary anisotropies 105

6.1 Overview of fluctuation formalism . . . . . . . . . . . . . . . . 105
6.2 Perturbed Einstein equations . . . . . . . . . . . . . . . . . . . 107
6.2.1 Time-hypersurface and gauge freedom . . . . . . . . . . 107
6.2.2 Scalar mode Einstein equations . . . . . . . . . . . . . . 109
6.2.3 Useful gauge invariant combinations for scalar modes . . 111
6.2.4 Longitudinal and synchronous gauges . . . . . . . . . . 112
6.2.5 Tensor mode metric equations . . . . . . . . . . . . . . . 113
6.3 Connection with primordial post-inflation power spectra . . . . 115
6.4 Relating scalar and tensor power measures to the dmr band-power 118
6.5 The Boltzmann transport equation . . . . . . . . . . . . . . . . 120
6.5.1 Scalar mode transfer equations . . . . . . . . . . . . . . 121
6.5.2 Tensor mode transfer equations . . . . . . . . . . . . . . 132
7 Connection with other cosmic probes of k-space 135

7.1 Density power spectra and characteristic scales . . . . . . . . . 135
7.2 The observable range in k-space . . . . . . . . . . . . . . . . . 137
7.3 Relating the cluster-amplitude σ8 and the dmr band-power . . 140
7.4 The future . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
A The ADM formalism and perturbation theory 146

A.1 The ADM equations . . . . . . . . . . . . . . . . . . . . . . . . 147
A.2 Scalar perturbations . . . . . . . . . . . . . . . . . . . . . . . . 151
A.3 Tensor perturbations . . . . . . . . . . . . . . . . . . . . . . . . 155
B Transport theory in General Relativity 156

B.1 The distribution function and the BTE in GR . . . . . . . . . . 156
B.2 Number, energy and momentum conservation equations . . . . 159
B.3 The transport of extremely relativistic particles . . . . . . . . . 161
B.4 Momentum space gauge transformations . . . . . . . . . . . . . 163
C Polarized transport for Thomson scattering 166
C.1 The polarization matrix and Stokes parameters . . . . . . . . . 166
C.2 Scalar perturbation source terms . . . . . . . . . . . . . . . . . 171
C.2.1 Thomson source functions . . . . . . . . . . . . . . . . . 171
C.2.2 The moment equations for photons . . . . . . . . . . . . 172
C.2.3 CDM and baryon transport . . . . . . . . . . . . . . . . 173
C.2.4 The transport of massless neutrinos . . . . . . . . . . . 174
C.2.5 Hot and warm dark matter transport . . . . . . . . . . 175
C.3 Numerically useful regimes for scalar perturbations . . . . . . . 178
C.3.1 Tight-coupling, shear viscosity and thermal diffusion . . 178
C.3.2 Free-streaming . . . . . . . . . . . . . . . . . . . . . . . 182
C.4 Modifications with mean curvature . . . . . . . . . . . . . . . . 183
C.5 Lensing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
C.6 Tensor perturbation source terms . . . . . . . . . . . . . . . . . 188
1 Introduction and basic properties
We shall look back on this decade as a golden age for cosmic background
radiation research, with signals unveiled by very high precision spectrum and
angular anisotropy experiments revealing much about how structure arose in
the Hubble patch in which we live. Although the theory was reasonably well
developed before the observation of anisotropy, much new work on all aspects
of CMB theory and phenomenology has occurred, to better place the new
experimentation in a cosmological framework. Sample lecture notes and reviews
giving earlier snapshots of the state of the art in theory are [1, 2, 3, 4, 5, 7, 6].
Peebles’ book [8] covers some of the theoretical ground, the White et al. [7]
Annual Review article gives a shorter overview and extensive references, while
Partridge [9] covers experimental techniques. [10] gives a recent overview of
cosmology and how the CMB fits in. These lectures will be about equally
divided among: the spectrum and what the remarkable lack of distortion tells us
(sections 2 and 3); the observations and phenomenology of anisotropies (section 4);
primary and secondary sources of anisotropy (section 5); the coupled Einstein–
Boltzmann equations which describe the development of primary anisotropies
(section 6), including a set of Appendices providing much more detail about
these equations and their solution; how the CMB results connect with large
scale structure results (section 7). Emphasis is on inflation-based models of
cosmic structure formation. What is not covered here is how topological defect
theories of structure formation impact upon the CMB: for cosmic strings see
e.g., [281, 282, 283] and for texture defects see e.g., [284, 285, 286, 287], and
references therein.
In section 2, I review the status of spectrum observations: although impor-
tant historic work from the ground, in balloons and in rockets shortward of a
centimeter will now be a footnote to FIRAS, ground-based radio telescopes still
control the spectral constraints at the long wavelength end. The long-sustained
assault on the mm-wave CMB peak led to one strong distortion after another,
each one stimulating a flurry of theoretical papers which, by now, have largely
sorted out the issues of how early energy release in the Universe would have
been processed into observable signals (section 3). The photon transport is
rather simple for spectral distortion calculations, homogeneous radiative trans-
fer, a warmup for the more complicated inhomogeneous transfer in random
media required for the treatment of CMB anisotropies. The source functions
describing the predominant emission, absorption, and scattering processes are
given there (bremsstrahlung from Coulomb-scattered and Compton-scattered
electrons, low energy Compton scattering, and interactions with any primeval
dust present). Of course, it is the secondary anisotropies that would accompany
these distortions that can give us insights into the structure at emission time.
Primary anisotropies (section 6) are those that we can calculate with linear
perturbation theory. The primary goal of theoretical anisotropy research is to
work out detailed predictions within a given cosmic structure formation model
of primary and secondary anisotropies as a function of scale. Because of the
linearity, primary anisotropies are the simplest to predict and offer the least
ambiguous glimpse of the underlying fluctuations that define the structure for-
mation theory. With detailed high precision observations, we expect to be able
to use CMB anisotropies to measure various cosmological parameters to high
accuracy (section 4.6). The nonlinearity inherent in secondary anisotropies
makes those predictions more ambiguous.
If energy is injected early enough in the Universe, it is just reprocessed
by interaction with the plasma into a Planck spectrum, albeit with a higher
entropy than the starting state. We must rely on indirect arguments based on
primordial nucleosynthesis to constrain exactly when the entropy of our Hubble
patch came into being, and this only if it was injected later than a redshift of
ten billion. The cosmic photosphere exists around a redshift of ten million or
so. With a FIRAS temperature of $T_\gamma({\rm now}) \equiv T_{c*} = 2.728 \pm 0.004$ K [12, 11],
the entropy per comoving volume is
$$ s_{\gamma*} = \frac{4\pi^2}{45}\,(\bar a T_\gamma)^3 \left(\frac{k_B}{\hbar c}\right)^3 = 1.48 \times 10^3\ {\rm cm}^{-3} . \tag{1} $$
The (mean) scale factor of the Universe is $\bar a$, which I take to be normalized to be
unity at the present time, so that it is related to the redshift $z$ by $\bar a = (1+z)^{-1}$.
I also invariably take the temperature to be in energy units, which is equivalent
to taking Boltzmann’s constant $k_B$ to be unity. Recall that 1 eV = $1.16 \times 10^4$ K.
As well, $\hbar$ and $c$ are taken to be unity. Returning from these theorist units to
the real world requires insertion of as many factors of $\hbar c = 0.1973$ eV µm as
are necessary to take the energy factors into lengths, and once that is done, $c$
is inserted to take the lengths to time. Recall that for a Planck distribution of
photons, we have a comoving number and energy density and a pressure
$$ n_{\gamma*} = \frac{2\zeta_3}{\pi^2}\, T_{c*}^3 , \quad \rho_{\gamma*} = \tfrac{3}{4}\, s_{\gamma*} T_\gamma \approx 2.7\, n_{\gamma*} T_\gamma , \quad p_\gamma = \tfrac{1}{3}\, \rho_\gamma \approx 0.9\, n_\gamma T_\gamma , \tag{2} $$
where $T_{\gamma*} \equiv \bar a T_\gamma = T_{c*}$; numerically,
$$ n_{\gamma*} = 412\ {\rm cm}^{-3} , \quad \rho_{\gamma*} = 0.261\ {\rm eV\,cm}^{-3} , \quad \Omega_\gamma h^2 = 2.46 \times 10^{-5} , \tag{3} $$
scaling as the appropriate power of $(T_{\gamma*}/2.728\ {\rm K})$. The Hubble parameter is
$h \equiv H_0/(100\ {\rm km\,s^{-1}\,Mpc^{-1}})$.
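As a numerical check of eqs. (1)–(3), the following minimal sketch restores the factors of $k_B$ and $\hbar c$ and evaluates the Planck number, entropy and energy densities for $T_{c*} = 2.728$ K. The function and constant names are illustrative assumptions, not notation from these notes; the physical constants are standard values.

```python
import math

K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K (1 eV ~ 1.16e4 K)
HBAR_C_EV_CM = 1.973270e-5   # hbar*c in eV cm (i.e. 0.1973 eV micron)
ZETA_3 = 1.2020569           # Riemann zeta(3)

def photon_densities(t_kelvin=2.728):
    """Return (n, s, rho) for a Planck spectrum at temperature t_kelvin.

    n   : photon number density [cm^-3]
    s   : entropy density in units of k_B [cm^-3]
    rho : energy density [eV cm^-3]
    """
    t_ev = K_B_EV_PER_K * t_kelvin                 # temperature in energy units
    inv_length_cubed = (t_ev / HBAR_C_EV_CM) ** 3  # (k_B T / hbar c)^3 in cm^-3
    n = 2.0 * ZETA_3 / math.pi**2 * inv_length_cubed
    s = 4.0 * math.pi**2 / 45.0 * inv_length_cubed
    rho = 0.75 * s * t_ev                          # rho = (3/4) s T, eq. (2)
    return n, s, rho

n_gamma, s_gamma, rho_gamma = photon_densities()
print(n_gamma, s_gamma, rho_gamma, rho_gamma / (n_gamma * K_B_EV_PER_K * 2.728))
```

The last printed number is the ratio $\rho_{\gamma*}/(n_{\gamma*} T_\gamma)$, which should come out close to the 2.7 quoted in eq. (2); the three densities agree with eq. (3) and eq. (1) to within rounding.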
Because the expansion of the Universe is adiabatic, the photon entropy
per comoving volume is a conserved quantity. If we suppose the entropy was
generated early enough that neutrinos and $e^+e^-$ pairs would have been in thermodynamic
equilibrium with the photons ($T > 1$ MeV), then the annihilation
of the $e^+e^-$ pairs into photons, when the temperature was about a few hundred
keV, would have increased the photon entropy from $\frac{4}{11} s_{\gamma*}$ to the $s_{\gamma*}$ we
observe. The comoving neutrino temperature would have remained that associated
with the lower entropy level per particle, $\bar a T_\nu = (4/11)^{1/3}\, \bar a T_\gamma$, i.e.,
1.95 K. The total entropy, apart from the minor bits residing in the gas and
stellar radiation of our Universe, is fully determined by the single number $T_\gamma$
and the number of low mass neutrino degrees of freedom:
$$ s_{{\rm tot}*} = s_{\gamma*} + s_{\nu*} = s_{\gamma*} + \frac{7}{8}\,\frac{4}{11}\, N_\nu\, s_{\gamma*} = \frac{43}{22}\, s_{\gamma*} = 2.90 \times 10^3\ {\rm cm}^{-3} , \tag{4} $$
for $N_\nu = 3$ light neutrino generations, each contributing a left-handed particle
and an antiparticle, but no right-handed components. The associated energy
density parameter for relativistic particles is $\Omega_{er} h^2 = (1 + \frac{7}{8}(4/11)^{4/3} N_\nu)\,\Omega_\gamma h^2$,
$4.1 \times 10^{-5}$ for $N_\nu = 3$. The origin of $s_{{\rm tot}*}$ is, of course, a mystery, enshrouded
by the cosmic photosphere. It used to be considered to be a gift of the Planck
era. In inflation models, our patch of the Universe was once in accelerated
expansion, during which any primordial temperature would have dropped to
essentially zero. $s_{{\rm tot}*}$ could then have arisen only once deceleration began, for
only then could coherent field energy have been able to dissipate into entropy.
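The neutrino factors quoted above are simple arithmetic and can be checked directly. The sketch below, with hypothetical function names chosen only for illustration, evaluates the neutrino temperature, the total-to-photon entropy ratio of eq. (4), and the relativistic energy density factor for $N_\nu$ massless species.

```python
def neutrino_temperature(t_gamma=2.728):
    """Present neutrino temperature [K] from T_nu = (4/11)^(1/3) T_gamma."""
    return (4.0 / 11.0) ** (1.0 / 3.0) * t_gamma

def total_entropy_factor(n_nu=3):
    """s_tot / s_gamma = 1 + (7/8)(4/11) N_nu, as in eq. (4)."""
    return 1.0 + (7.0 / 8.0) * (4.0 / 11.0) * n_nu

def relativistic_energy_factor(n_nu=3):
    """Omega_er / Omega_gamma = 1 + (7/8)(4/11)^(4/3) N_nu."""
    return 1.0 + (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0) * n_nu

print(neutrino_temperature())                 # close to 1.95 K
print(total_entropy_factor(), 43.0 / 22.0)    # the two should agree for N_nu = 3
print(relativistic_energy_factor() * 2.46e-5) # Omega_er h^2, close to 4.1e-5
```

Note that the entropy ratio scales as $(T_\nu/T_\gamma)^3$ while the energy ratio scales as $(T_\nu/T_\gamma)^4$, which is why the exponents on $4/11$ differ between the two factors.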
It is usual to divide the entropy by another (partially) conserved quantity,
the comoving baryon number density, $n_{B*} = 1.13 \times 10^{-5}\, \Omega_B h^2\ {\rm cm}^{-3}$, expressed in
terms of the baryon density parameter (relative to the critical density) $\Omega_B$.
After the identification of physical processes that could plausibly have led to
the generation of baryon number from a hot medium, it became usual to invert
the large numbers
$$ \frac{s_{{\rm tot}*}}{n_{B*}} = 2.56 \times 10^{10} \left(\frac{\Omega_B h^2}{0.01}\right)^{-1} , \quad \frac{s_{\gamma*}}{n_{B*}} = 1.31 \times 10^{10} \left(\frac{\Omega_B h^2}{0.01}\right)^{-1} , $$
and try to explain why the baryon number is so tiny relative to the entropy
through the extreme weakness of baryon-violating interactions.
A nice way to picture CMB transport from the early Universe to the present
is to consider when and where various phenomena occurred on our past light
cone, with “when” defined by redshift and “where” by comoving distance to that redshift.
For concrete numbers, I shall take the example of a universe with a critical
density in nonrelativistic matter, $\Omega_{nr} = 1$, where $\Omega_{nr}$ has contributions from
cold dark matter, baryons, etc. The cosmic photosphere is then 5924 $h^{-1}$ Mpc
away from us, very close to our “horizon”, $\approx 2H_0^{-1} = 6000\ h^{-1}$ Mpc. (Of
course, inflation could have made the true event horizon much bigger; some
process must have.) FIRAS gives stringent upper limits to distortions of various
types. For example, the photon chemical potential constraint strongly
limits the energy output that occurred just shortward of the cosmic photosphere
(within about 200 $h^{-1}$ kpc comoving distance from it). Barring early
energy input which escapes the COBE bounds, the photons decouple at a
redshift $\approx 1000$, a distance 5796 $h^{-1}$ Mpc away, 128 $h^{-1}$ Mpc from the photosphere.
The shell between the photosphere and this last scattering surface
where the Compton depth is unity defines an electron scattering “atmosphere”,
quite thick to photons. In particular, when helium recombines, the photons are
very tightly coupled.
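The distances quoted above can be approximately reproduced with the analytic conformal time of a universe containing only nonrelativistic matter and radiation, $\eta(\bar a) = 2 H_0^{-1} \Omega_{nr}^{-1/2} [(\bar a + \bar a_{eq})^{1/2} - \bar a_{eq}^{1/2}]$ with $\bar a_{eq} = \Omega_{er}/\Omega_{nr}$. The sketch below is a rough check under stated assumptions, not the calculation used in these notes: in particular $h = 0.5$ is an assumed value (the text does not state which $h$ enters the radiation correction), and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]

def conformal_distance(a, h=0.5, omega_nr=1.0, omega_er_h2=4.1e-5):
    """Comoving distance [h^-1 Mpc] travelled by light from a = 0 to scale factor a.

    Assumes a matter + radiation universe; h = 0.5 is an assumption made
    here only to fix the matter-radiation equality scale factor a_eq.
    """
    hubble_length = C_KM_S / 100.0                # c/H0 in h^-1 Mpc
    a_eq = omega_er_h2 / (omega_nr * h * h)       # Omega_er / Omega_nr
    prefactor = 2.0 * hubble_length / math.sqrt(omega_nr)
    return prefactor * (math.sqrt(a + a_eq) - math.sqrt(a_eq))

def distance_to_redshift(z, **kwargs):
    """Comoving distance [h^-1 Mpc] from us (a = 1) back to redshift z."""
    return conformal_distance(1.0, **kwargs) - conformal_distance(1.0 / (1.0 + z), **kwargs)

d_photosphere = distance_to_redshift(1.0e7)   # cosmic photosphere, z ~ 10^7
d_decoupling = distance_to_redshift(1000.0)   # photon decoupling, z ~ 1000
print(d_photosphere, d_decoupling, d_photosphere - d_decoupling)
```

With these assumptions the three numbers land within about 0.1% of the 5924, 5796 and 128 $h^{-1}$ Mpc quoted in the text; most of the residual comes from rounding $2c/H_0$ to 6000 $h^{-1}$ Mpc.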
The theory of the hydrogen atom (section 3.4) is so well known that we can
be quite confident that we have the physics of recombination well described.
The essential ingredients were worked out immediately after the discovery of the
CMB, and the novel feature is the dominant role that the two-photon decay of
the 2s state to the 1s state plays. The width of the region over which decoupling
takes place is only about 10 $h^{-1}$ Mpc comoving distance (section 3.4.2). That
the width is nonzero plays a fundamental role in defining how small the scale
of anisotropies is that we can see. The relation between angular scale and
comoving distance at high redshifts is about $d \approx 100\,\Omega_{nr}^{-1/2}\, h^{-1}$ Mpc $(\theta/1^\circ)$,
hence we might expect that fluctuations on scales below about $10'$ are affected:
they are strongly damped below this “coherence” angle and this will define
which experiments are most useful to do if we wish to probe the moment when
the photons were first released to freely propagate from their point of origin to
us, without much further modification, apart from some gravitational redshifts,
some lensing, and possibly some scattering from hot gas.
Even with the FIRAS limits, it is still quite possible that enough energy
was injected either prior to recombination, or sufficiently shortly after (above
redshift $\sim 150$) so that the photons had their decoupling delayed (section 3.5).
The decoupling position then moves forward to $\sim 5570\ h^{-1}$ Mpc. The thickness
of the region over which decoupling would have taken place is more than an
order of magnitude larger, $\sim 200\ h^{-1}$ Mpc comoving distance, corresponding
to a few degrees. Damping of anisotropies below a few degrees is the result,
although nonlinear effects can lead to interesting short distance signatures of
such early reionization (section 5.3.6).
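The conversion between comoving thickness and damping angle used in the last two paragraphs follows from the rule of thumb $d \approx 100\,\Omega_{nr}^{-1/2}\, h^{-1}$ Mpc per degree. A minimal sketch of that conversion follows; the function names are illustrative assumptions.

```python
import math

def scale_from_angle_deg(theta_deg, omega_nr=1.0):
    """Comoving distance [h^-1 Mpc] subtended by theta_deg at high redshift."""
    return 100.0 * omega_nr ** -0.5 * theta_deg

def angle_deg_from_scale(d_mpc, omega_nr=1.0):
    """Angle [degrees] subtended by comoving distance d_mpc [h^-1 Mpc]."""
    return d_mpc * math.sqrt(omega_nr) / 100.0

# Standard recombination: ~10 h^-1 Mpc width, a few arcminutes.
print(angle_deg_from_scale(10.0) * 60.0, "arcmin")
# Early reionization: ~200 h^-1 Mpc width, a couple of degrees.
print(angle_deg_from_scale(200.0), "deg")
```

For $\Omega_{nr} = 1$ the 10 $h^{-1}$ Mpc decoupling width maps to roughly 6 arcminutes and the 200 $h^{-1}$ Mpc reionized width to about 2 degrees, consistent with the damping scales described in the text.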
Distortions of the background may occur before or after recombination. If
the source is Compton cooling of hot gas, the spectral signature of the resulting $y$-distortion
of the background radiation has allowed FIRAS to place very powerful constraints
on the Compton $y$-parameter, which strongly rule out many models.
If pregalactic dust, or dust in primeval galaxies, exists, it will absorb higher
frequency radiation (UV and optical) and down-shift it into the infrared; com-
bined with the redshift, a sub-mm background is expected but, with FIRAS,
is now quite strongly constrained (section 3.2.7). Accompanying these sec-
ondary backgrounds are anisotropies that carry invaluable information about
the epochs that the relevant structures formed. Even if the angle-averaged
distortions are well below the level that absolute spectrum experiments like
FIRAS probe, it is certain that these secondary anisotropies are accessible to
experiments: the question is only for what fraction of the sky do they rise
above experimental noise and the primary signal. A major goal of experimen-
tal/phenomenological anisotropy research is to design experiments and statis-
tical processing procedures that will allow the various primary, secondary and
foreground contributions to anisotropy to be separated (section 4). With the
wealth of signals to be unveiled, we have a CMB future that “looks marvelous,
simply marvelous”.
2 Spectral observations and constraints
We now know from COBE’s Far Infrared Absolute Spectrophotometer, FIRAS,
that the CMB is well fit by a blackbody with T ≈ 2.728 ± 0.004 K over the
region from 5000 µm to 500 µm [12, 11], a number compatible with the COBRA
rocket experiment of Gush et al. [15] covering the same band, and also with
ground based measurements at centimeter wavelengths – although there is still
room for significant spectral distortion longward of 1 cm. Figure 1 gives a
view of the current state of the data on thermodynamic temperature T (λ) as
a function of wavelength for FIRAS and selected experiments described below.
Following the Penzias and Wilson [13] discovery, during the 60s and 70s
there were a large number of radio observations with coherent receivers that
obtained T (λ) in the Rayleigh–Jeans (RJ) portion of the spectrum. These re-
sults were reviewed by Weiss in 1980 [14], and a best fit temperature of 2.74 K
was given, with a ±0.08 K “one sigma” error. Throughout the 80s, a Berkeley–
Italian team [17], the White Mountain collaboration, made measurements at
many wavelengths, from 12 cm down to 0.33 cm, using corrugated horn antennas
with 15° beamwidths switched from sky to a 3.8 K “cold load” calibrator.
And Johnson and Wilkinson [18] used a balloon to get a temperature estimate
at 1 cm.
Wavelengths longer than 10 cm are extremely difficult to explore, both
because of large Galactic corrections that must be made and because of con-
tamination from man-made radio signals. For many years a rather heroic early
experiment by Howell and Shakeshaft in 1967 [16] was all that defined the con-
straints at long wavelengths. Recent experiments at the relatively radio-quiet
South Pole [19, 20, 21, 22] have considerably improved the error bars. There
are hints of deviation from the FIRAS temperature extrapolated into the RJ
regime but the corrections are large.
One of the more remarkable aspects of the CMB story is that the population
of rotational states of diatomic molecules – found by optical observations of
interstellar absorption lines in the spectrum of bright stars – can be used to
estimate CMB temperature. The first molecules discovered in interstellar space
were CH and CN, found using the spectrum of ζ Ophiuchi. In 1941, McKellar
inferred a 2.3 K excitation temperature to explain the relative intensity ratios
of the lines originating from the K = 0 and K = 1 levels in the 3883 Å
band of CN (frequency difference (2640 µm)$^{-1}$, just longward of the CMB
peak). This observation was very well known to the astronomical community,
since it was given prominent play in the classic Herzberg text [23], although
it was dismissed as having “only a very restricted meaning”. Had Gamow
or his students Alpher and Herman made the connection, how different the
development of cosmology might have been, but it was only in 1966, after
the Penzias and Wilson discovery, that the connection was made. In 1972,
Thaddeus [24] reviewed the CN work and gave T (2640 µm) = 2.78 ± 0.10 K
for ζ Ophiuchi, with much larger errors for other stellar spectra. In the 80s
and 90s great improvements were made using very high signal-to-noise spectra
Figure 1: Selected old and new data on CMB distortions in terms of thermody-
namic temperature. The dotted point at 7 cm is the original Penzias and Wilson
(1965) result, the long-dashed point at 63 cm is from Howell and Shakeshaft
(1966). The situation in the Rayleigh–Jeans region was improved quite a bit
with the White Mountain collaboration results (solid). Results from Bersanelli
(1995) at 21 cm and Staggs and Wilkinson (1995) at 19 cm are shown. The
point with the small error bar at λ = 1.2 cm is that of Johnson and Wilkinson
(1987). Cyanogen results are given at 2640 µm (Roth et al. 1993, Crane 1989,
1995). The tiny error bars are from FIRAS (Fixsen et al. 1996). The inset
gives a blowup of the region for FIRAS.
of ζ Ophiuchi; e.g., Meyer and Jura [25] got 2.73 ± 0.04 K at λ = 0.264 cm and
2.8 ± 0.3 K at λ = 0.132 cm, important at the time because it failed to confirm
large reported excesses found with other techniques. Of course, excitation
temperatures are really only upper bounds on the CMB temperature, since local
contributions to the excitation, e.g., through collisions with electrons, might
enhance the upper level’s population. These frequencies overlap with those
probed by FIRAS, and, within their much larger errors, agree with FIRAS.
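To illustrate how a rotational level population translates into a temperature, the sketch below applies the Boltzmann relation $n_1/n_0 = (g_1/g_0)\exp(-hc/\lambda k_B T)$ to the 2640 µm CN transition. The statistical weights $g = 2K + 1$ (so $g_1/g_0 = 3$) are an assumption of this example, as are the function names; the actual analyses cited above work from measured line strengths and are considerably more involved.

```python
import math

HC_OVER_K_CM_K = 1.438777  # second radiation constant h c / k_B [cm K]

def population_ratio(temperature_k, wavelength_cm=0.264, g_upper=3.0, g_lower=1.0):
    """Upper-to-lower level population ratio at a given excitation temperature."""
    return (g_upper / g_lower) * math.exp(-HC_OVER_K_CM_K / (wavelength_cm * temperature_k))

def excitation_temperature(ratio, wavelength_cm=0.264, g_upper=3.0, g_lower=1.0):
    """Invert the Boltzmann relation: temperature [K] implied by a population ratio."""
    return HC_OVER_K_CM_K / (wavelength_cm * math.log(g_upper / (g_lower * ratio)))

ratio = population_ratio(2.73)
print(ratio, excitation_temperature(ratio))
```

Because collisions or other local excitation can only add to the upper level population, a temperature inferred this way is an upper bound on the CMB temperature, as noted above.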
The assault on the CMB peak and into the Wien region by the experimentalists
proved very difficult, with distortion reports being the norm rather than the
exception. In the 60s and early 70s there were rocket and balloon experiments
which reported significant post-peak excesses, but Muehlner and Weiss (1973,
reviewed in Weiss [14]), using five broad band filters, were able to show that
these large excesses were not there. Around 1980, Woody and Richards [28]
and Gush [29] used Fourier transform spectroscopy to get the spectrum around
the peak, and both reported large (but qualitatively different) distortions, that
cyanogen results and an experiment by Peterson, Richards and Timusk [30]
failed to confirm. In 1988, a Nagoya–Berkeley rocket experiment with 6 broad
band filters (Matsumoto et al. [31]) indicated a large excess energy content
(about 20% of that in the CMB) that spurred much theoretical exploration of
energy injection.
The issue was forever settled, to a standing ovation, at the famous January
1990 AAS presentation by COBE team leader John Mather of the perfect
blackbody that 9 minutes of data taken shortly after the November 1989 launch
revealed, with $T_{c*} = 2.735 \pm 0.06$ K, a result beautifully confirmed shortly after
by the COBRA rocket experiment [15], with $T_{c*} = 2.736 \pm 0.02$ K.
Both experiments also used the elegant method of Fourier transform spec-
troscopy, with on-board reference blackbodies to compare with the sky signal.
FIRAS also had an auxiliary external calibrator blackbody which could be
moved in for further in-flight calibration. The FIRAS calibrators could range
from 2 to 25 K in temperature, but when the sky was being observed, temper-
atures near 2.7 K were chosen to nearly null the difference between the internal
and sky signals. FIRAS used polarizing Michelson interferometers, with mir-
rors that moved at constant velocity so that path difference was proportional
to time lag, to construct the correlation between sky and reference blackbody
as a function of time lag, an interferogram. FIRAS made two million of them.
A Fourier transform of each interferogram gives the power as a function of frequency. A dichroic filter
split the FIRAS signal into low and high frequency parts, $> 500$ and $< 500$ µm respectively
(with the best results from the low frequency $10^4$ to 500 µm part).
With Fourier transform spectroscopy, determining the absolute tempera-
ture (which requires absolute calibration of the reference blackbodies) cannot
be done with nearly the same precision as determining the level of deviation
from a blackbody. The most complete analysis of the FIRAS data is [12], who
used the full (all channels, nine months) low frequency data set, whereas [11]
concentrated on the last 6 weeks of the FIRAS experiment, for which calibra-
tions were frequent and the instruments were operating very stably. Models
were needed for the dipole, determined from the DMR experiment on COBE,
and for the Galactic emission – modelled by $G(\ell, b)\,g(\nu)$, with the geometrical
function of Galactic longitude and latitude, $G(\ell, b)$, taken from the DIRBE
240 µm map, and $g(\nu)$ from the FIRAS data. This is of course dominated
by emission from the Galactic plane. There is evidence that the high-$b$ gas is
colder than the $g(\nu)$ determined in this way would imply [48, 49].
The dipole amplitude is $3.372 \pm 0.007$ mK (95% CL error bars), i.e., $\Delta T/T =
1.2 \times 10^{-3} = v/c$, with $v$ our velocity relative to the CMB local rest frame.
Although this is small, the precision of FIRAS was such that the difference
between the spectrum determined for a patch in the dipole direction and that
from the opposite direction could be taken. This should be proportional to the
derivative of a blackbody, and indeed it is to a very high degree of accuracy
(an rms deviation consistent with the level of detector noise). From the 4-year
DMR data, the derived value is 3.353 ± 0.024 mK [85, 86], in good agreement
with the FIRAS result. The DMR-derived direction in Galactic coordinates
$(\ell, b)$ is $(264.26^\circ \pm 0.33^\circ,\ 48.22^\circ \pm 0.13^\circ)$.
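The step from dipole amplitude to velocity is the first-order Doppler relation $\Delta T/T = v/c$ quoted above. As a small worked check (function name hypothetical):

```python
C_KM_S = 299792.458  # speed of light [km/s]

def dipole_velocity_km_s(dipole_mk, monopole_k=2.728):
    """Velocity relative to the CMB rest frame from v/c = Delta T / T (first order in v/c)."""
    return C_KM_S * (dipole_mk * 1.0e-3) / monopole_k

print(dipole_velocity_km_s(3.372))  # FIRAS dipole amplitude
print(dipole_velocity_km_s(3.353))  # 4-year DMR dipole amplitude
```

Both amplitudes correspond to a velocity of roughly 370 km/s; the two values differ by about 2 km/s, well within the quoted DMR uncertainty.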
At the 95% confidence level, the temperatures determined from the monopole
spectrum and from the dipole spectrum in [12] are:
$$ \text{monopole:}\quad T_{c*} = 2.728 \pm 0.004\ {\rm K}\ \ (95\%\ {\rm CL}), \tag{5} $$
$$ \text{dipole:}\quad T_{c*} = 2.717 \pm 0.014\ {\rm K}\quad (2.725 \pm 0.020\ {\rm K,\ DMR}) . \tag{6} $$
Thus the dipole temperature agrees to within the errors with the monopole
temperature. The 0.004 K error should be compared with the error bars on
the monopole T (λ) shown in the inset of fig. 1.
The data was also used to place stringent constraints on distortions to the
spectrum. We now turn to the implications of these, but here just quote the
values [12]:
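$$ \text{Compton } y\text{-parameter:}\quad \bar y < 1.5 \times 10^{-5}\ \ (95\%\ {\rm CL}), \tag{7} $$
$$ \text{chemical potential:}\quad |\mu_\gamma|/T_\gamma < 0.9 \times 10^{-4}\ \ (95\%\ {\rm CL}), \tag{8} $$
$$ \text{general distortions:}\quad \frac{\delta E}{E_{\rm cmb}}\,(500\text{–}5000\ \mu{\rm m}) < 0.00025\ \ (1\sigma) . \tag{9} $$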
For the general distortions, the constraint on the fractional energy release over
the waveband from 5000 to 500 µm follows from the FIRAS team result using
the monopole spectrum that over this band the maximum 1-sigma intensity
deviation from a blackbody was < 0.012% of the peak brightness. The rms
intensity deviation from a blackbody over all channels in this range is even
more stringent, 0.005%. (If a sub-mm background mimicked Galactic emission
the constraints would not be as severe. See section 3.6.)
3 Spectral distortion theory
3.1 Radiative transport in the expanding universe
The development of spectral distortions or angular anisotropies in the mi-
crowave background is described by radiative transfer equations for the photon
distribution function, which are coupled to Einstein’s equations for the gravi-
tational field and to the hydrodynamic and transport equations for the other
types of matter present. The photon distribution function for the total intensity,
$f_t(q, \hat q, x^i, \tau)$, is a dimensionless general relativistic invariant giving the
average photon occupation number as a function of the photon momentum $q^I$,
$I = 1, 2, 3$, with magnitude $q$ and direction vector $\hat q$, in the neighborhood of
the spatial point $x^i$, $i = 1, 2, 3$, at time $\tau$. Not only is $f_t$ a general relativistic
scalar under the change of the spacetime coordinates $(x^i, \tau)$, it also remains invariant
under change of the 3-momentum coordinates $q^I$. Apart from $f_t$, there
are three other photon distribution functions needed to describe the state of
polarization: $f_t, f_U, f_V, f_Q$ correspond to the four Stokes parameters $I, U, V, Q$.
Because physical momentum p redshifts as the Universe expands, the co-
moving momentum, q ≡ ā(t)p is a better choice than p. The comoving photon
energy qc and comoving wavelength λ are therefore related to the physical
frequency ν, physical energy ω = hν, and wavelength λe by
$$ qc = \frac{2\pi\hbar c}{\lambda} = \bar{a}(t)\,\omega = \bar{a}(t)\,\frac{2\pi\hbar c}{\lambda_e} . \qquad (10) $$
Thus, if $\lambda_e$ is the wavelength at emission at time $t$, $\lambda$ is the observed wavelength
at $t_0$ in the absence of frequency shifts beyond that from cosmological
expansion. A Planck distribution function is of form $f_{Pl} = [\exp(q/T_{\gamma*}) - 1]^{-1}$, where
$T_{\gamma*} \equiv \bar{a}T_\gamma$ is the (comoving) photon temperature. We denote the Planckian
with the observed CMB temperature $T_{c*} = 2.728$ K by
$$ f_c \equiv (e^x - 1)^{-1}, \quad x \equiv \frac{q}{T_{c*}} = \frac{5273\ \mu{\rm m}}{\lambda}\,\frac{2.728\ {\rm K}}{T_{c*}} = 1.76\,\frac{\nu}{100\ {\rm GHz}} . \qquad (11) $$
The dimensionless x remains constant as the universe expands. Instead of
the distribution function, it is often convenient to work with a generalized
(comoving) thermodynamic temperature,
$$ T_{t*}(q, \hat{q}, x^i, \tau) \equiv q/\ln(f_t^{-1} + 1) . \qquad (12) $$
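The conversions in eqs. (11) and (12) can be sketched numerically. The following is a minimal illustration, not part of the original notes: the function names are hypothetical, and the values of $h/k_B$ and $hc/k_B$ are standard physical constants supplied here as assumptions.

```python
import math

T_C_KELVIN = 2.728               # observed CMB temperature quoted in the text
H_OVER_K_K_PER_GHZ = 0.0479924   # h / k_B in kelvin per GHz (assumed constant)
HC_OVER_K_UM_K = 14387.77        # h c / k_B in micron kelvin (assumed constant)


def x_from_frequency(nu_ghz, t_c=T_C_KELVIN):
    """Dimensionless x = h nu / (k T_c) for a frequency in GHz."""
    return H_OVER_K_K_PER_GHZ * nu_ghz / t_c


def x_from_wavelength(lambda_um, t_c=T_C_KELVIN):
    """Dimensionless x = h c / (lambda k T_c) for a wavelength in microns."""
    return HC_OVER_K_UM_K / (lambda_um * t_c)


def planck_occupation(x):
    """Planckian occupation number f_c = 1 / (e^x - 1)."""
    return 1.0 / math.expm1(x)


def thermodynamic_temperature(f, x, t_c=T_C_KELVIN):
    """Eq. (12) with q = x * T_c: T_t = q / ln(1/f + 1)."""
    return x * t_c / math.log1p(1.0 / f)
```

For a Planckian occupation number the thermodynamic temperature returned is the CMB temperature at every x, which is a quick consistency check on the definition.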
We are interested in the fluctuations in $f_t$ and $T_{t*}$,
$$ \Delta f_t \equiv f_t - f_c, \quad \Delta_t \equiv \left(\frac{\Delta T}{T}\right)_t \equiv \frac{(T_{t*} - T_{c*})}{T_{c*}} . \qquad (13) $$
These will generally store both distortion and anisotropy information and may
often be nonlinear. Let us denote the spatial averages of $f_t$ and $\Delta_t$ at a given
time by $\bar{f}_t \equiv f_c + \overline{\Delta f_t}$ and $\overline{\Delta_t}$ and the spatial fluctuation in $f_t$ by $\delta f_t =
\Delta f_t - \overline{\Delta f_t}$.
The specific intensity $I_\nu$ historically used by astronomers to describe radiative
transfer is related to the distribution function by
$$ I_\nu(\nu, \hat{q}, x^i, t) = \frac{2\pi\hbar\,\nu^3}{c^2}\, 2 f_t , \qquad (14) $$
with the 2 coming from the two photon polarizations. The energy per unit
3-volume radiated into solid angle dΩq̂ in the frequency interval ν to ν + dν
is Iν dνdΩq̂ . For a generally inhomogeneous spacetime, both the 3-volume
and the momentum (hence ν) can be transformed by a coordinate (gauge)
transformation, which is why invariant distribution functions are far preferable
to work with.
For spectral distortions, and for anisotropies that arise from secondary pro-
cesses such as Compton cooling of hot gas in clusters and emission from point
sources, the following form of the transfer equation is sufficient:
$$ \left.\frac{\partial f_t}{\partial\tau}\right|_q + \hat{q}\cdot\nabla f_t = \bar{a}\,S[f_t] , \qquad (15) $$
where S is the source function describing the difference between the rate at
which photons are being added to the momentum volume $d^3q/(2\pi)^3$, and the
rate at which they are being removed. Instead of “cosmic time”, it is more
convenient to use conformal time dτ = dt/ā and comoving space coordinates
xi in the transfer equation. In terms of (∆T /T )t , the transfer equation takes
the form

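$$ \left[\left.\frac{\partial}{\partial\tau}\right|_q + \hat{q}\cdot\nabla\right]\left(\frac{\Delta T}{T}\right)_t \equiv G(\mathbf{r}, q, \hat{q}, \tau) \equiv \frac{\bar{a}\,S[f_t]\,(1 + \Delta_t)^2}{x\,(f_c + \Delta f_t)(1 + f_c + \Delta f_t)} . \qquad (16) $$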
The solution of the transfer equation for a source at position rs emitting in
a burst at time τs , hence with S ∝ δ(τ − τs )δ(r − rs ), is the Green function at
time τ0 and position r0 :
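$$ \left[\frac{\partial}{\partial\tau} + \hat{q}\cdot\nabla\right]^{-1} = \vartheta(\tau_0 - \tau_s)\,\delta^{(3)}(\mathbf{r}_0 - \mathbf{r}_s - \hat{q}(\tau_0 - \tau_s)) , \qquad (17) $$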
where ϑ is the Heaviside unit function, 0 for τ < τs , 1 otherwise. It describes
the free-streaming of the radiation along the line-of-sight to the source, with q
kept constant over the look-back. It can be used to map the radiation pattern
from a time just after all emissions, absorptions and scatterings have become
negligible (so S ≈ 0) to the present. If there is a contribution −āΓ̄a ft of
uniform absorbers in āS as well, then the Green function is
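$$ \left[\frac{\partial}{\partial\tau} + \hat{q}\cdot\nabla + \bar{a}\bar{\Gamma}_a\right]^{-1} = e^{-\zeta_a(\tau_0|\tau_s)}\,\vartheta(\tau_0 - \tau_s)\,\delta^{(3)}(\mathbf{r}_0 - \mathbf{r}_s - \hat{q}(\tau_0 - \tau_s)) , \qquad (18) $$
where the absorption depth for the process $\Gamma_a$ is
$$ \zeta_a(\tau_0|\tau_s) \equiv \int_{\tau_s}^{\tau_0} \bar{a}(\tau)\,\bar{\Gamma}_a(\tau)\,d\tau . \qquad (19) $$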
With inhomogeneous absorbers, the Green function naturally depends on the
absorption depth along the line-of-sight.
In the tight coupling limit valid in the early universe, sources and sinks in
S approximately balance, so S ≈ 0, but the solutions then are an equilibrium,
with a small perturbation describing diffusion and viscous coupling of the pho-
ton fluid to other matter present. As usual with radiative transport, most of
the complications arise in the transition from tight coupling to free streaming.
If the spatial fluctuation δf = ft − f¯t is small enough so that the spatial av-
erage of āS[ft ] can be replaced by āS[f¯t ] to zeroth order, then f¯t obeys the
zeroth-order (background) transfer equation,
$$ \left.\partial\bar{f}_t/\partial\tau\right|_q = \bar{a}\,S[\bar{f}_t] . \qquad (20) $$
In both the tight coupling and free streaming regimes, any form-invariant
function of $q$ is a solution, in particular a Planckian with $T_{t*}$ constant, or a
Bose–Einstein distribution $(\exp[q/T_{t*} + \alpha] - 1)^{-1}$ with the chemical potential
parameter $\alpha \equiv -\mu_\gamma/T_t$ constant as well. If the distortion and/or anisotropy
fluctuation $\Delta f_t$ is small compared with $f_c$, then
$$ \Delta f_t \approx T_c\frac{\partial f_c}{\partial T_c}\frac{\Delta T}{T} \equiv x f_c(1 + f_c)\frac{\Delta T}{T} , \quad x f_c(1 + f_c) = \frac{x e^x}{(e^x - 1)^2} . \qquad (21) $$
We typically use this transformation to go from distribution function to tem-
perature fluctuation, although it is sometimes not a good approximation, e.g.,
in the Wien region with dust emission sources, since fc drops so rapidly.
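The caveat about the Wien region can be checked directly. The sketch below is illustrative only, with hypothetical function names: it compares the linear relation of eq. (21) with the exact change in a Planckian occupation number when the temperature is scaled by $(1 + \Delta T/T)$ at fixed $q$.

```python
import math


def planck(x):
    """Planckian occupation number 1 / (e^x - 1)."""
    return 1.0 / math.expm1(x)


def delta_f_exact(x, delta):
    """Exact change in occupation number for T -> T (1 + delta) at fixed q."""
    return planck(x / (1.0 + delta)) - planck(x)


def delta_f_linear(x, delta):
    """Linearized change from eq. (21): x f_c (1 + f_c) * delta."""
    fc = planck(x)
    return x * fc * (1.0 + fc) * delta


def relative_error(x, delta):
    """Fractional error of the linear approximation relative to the exact change."""
    exact = delta_f_exact(x, delta)
    return abs(delta_f_linear(x, delta) - exact) / abs(exact)
```

For a fixed fractional temperature perturbation, the relative error is tiny in the Rayleigh–Jeans regime and grows with $x$, because the exponential cutoff of $f_c$ makes the occupation number very sensitive to temperature.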
The full treatment of the transport theory with gravitational redshift and
lensing effects and polarization effects is developed in Appendix B, and dis-
cussed in section 6.5. The transport operator, the left-hand side of eq. (15),
is augmented by a term that depends upon the connection coefficients of the
spacetime metric, $-q^{-1}\Gamma^i_{\alpha\beta}\,q^\alpha q^\beta\,\partial f/\partial q^i$. The Green function describing
free-streaming from a source is a delta function along the photon’s geodesic path.
The bending of q̂ is essentially a lensing effect, a nonlinear correction involving
a product of the perturbed metric and ∆ft . The gravitational frequency shift
as the photon climbs into and out of local pockets of curvature is very impor-
tant in linear theory, the Sachs–Wolfe effect [32]. It is legitimate to take the
Sachs–Wolfe term to the right-hand side and treat it as a source. If we write
the metric as $ds^2 = \bar{a}^2(\eta_{\alpha\beta} + h_{\alpha\beta})\,dx^\alpha dx^\beta$, with $\eta_{\alpha\beta} = {\rm diag}(-1, 1, 1, 1)$,$^1$ where
$h_{\alpha\beta}$ is the metric fluctuation, then the effective $G$ in linear perturbation theory
(from eq. (289)) is
$$ G_{tSW} = -\tfrac{1}{2}\dot{h}_{ij}\hat{q}^i\hat{q}^j + \tfrac{1}{2}\hat{q}^i\partial_i h_{00} + \hat{q}^i\hat{q}^j\partial_i h_{0j} . \qquad (22) $$
$^1$ The MTW [196] sign conventions and the summation convention are used. Mean curvature
is ignored here but is discussed in Appendix B.
It is usual in perturbation theory to simplify this by adopting a coordinate
system in which hi0 = 0 – called a time-orthogonal gauge choice. Also, by
means of a change of the momentum variable in ft , one can modify this form.
For example, $\Delta_t - \tfrac{1}{2}h_{00}$ has $G_{tSW} = -\tfrac{1}{2}\dot{h}_{ij}\hat{q}^i\hat{q}^j - \tfrac{1}{2}\dot{h}_{00}$.
Linear perturbations in the expanding universe can be separated into scalar,
vector and tensor modes, which are mutually independent. For a flat universe,
we can Fourier transform the transport equation. If k is the comoving wavevec-
tor and we choose the 3-axis to be k̂, then tensor modes involve the two (trans-
verse traceless) gravitational wave polarization modes, h(T +) = (h11 − h22 )/2
and h(T ×) = h12 , vector modes describing vorticity2 involve h13 , h23 (and
h01 , h02 ), and scalar modes involve
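$$ \nu \equiv -\tfrac{1}{2}h_{00} , \quad \varphi \equiv \frac{h - h_{33}}{4} , \quad {\rm where}\ h \equiv \delta^{ij}h_{ij} , \qquad (23) $$
$$ \psi \equiv -\frac{h - 3h_{33}}{4k^2} , \quad \Psi_n \equiv -i\frac{h_{03}}{\bar{a}k} , \quad \Psi_\sigma \equiv \Psi_n + \bar{a}\dot{\psi} . \qquad (24) $$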
The tensor modes are invariant under coordinate changes, whereas the scalar
mode potentials defined by eq. (24) do change, i.e., are gauge dependent. For
scalar perturbations, the Einstein equations and the various transport equa-
tions only involve ν, ϕ and Ψσ , and the perturbations to the various matter
densities and velocity potentials, as well as to the distribution functions; fur-
ther, ν, ϕ and Ψσ only depend upon the choice of time surfaces upon which
they are measured, not on changes of spatial coordinates on the hypersurfaces.
The time can be chosen to make some linear combination of the three vanish.
The two standard choices that have been used in the computation of radiative
transport in linear perturbation theory are the synchronous gauge, for which
ν = 0 (and Ψn =0), and the longitudinal gauge, for which ψ = 0 (and Ψn =0).1
In terms of these metric variables,
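$$ G^{(S)}_{tSW} = -i\hat{q}\cdot\mathbf{k}\,\nu - \dot{\varphi} - (\hat{q}\cdot\mathbf{k})^2\,\bar{a}^{-1}\Psi_\sigma , $$
$$ G^{(T\{+,\times\})}_{tSW} = -(1 - (\hat{q}\cdot\hat{k})^2)\,\tfrac{1}{2}\dot{h}^{(T\{+,\times\})} . \qquad (25) $$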
The source function for Compton scattering when energy transfers are impor-
tant is described in the next section, and the source function GtC in the low
energy Thomson scattering limit including polarization and angular anisotropy
effects in the scattering is derived in detail in Appendix C. The dominant terms
2 The vector parts of the vector parts, $h^{(V)}_{0i}$ and $w_i$ of $h^{(V)}_{ij} = k_i w_j + k_j w_i$, are curls of
vectors.
1 Many different notations are used for the perturbation variables $\{\nu, \varphi, \psi, \Psi_n, \Psi_\sigma\}$; e.g.,
Bardeen (1980): $\{A, H_L + \tfrac{1}{3}H_T, -k^{-2}H_T, k^{-1}\bar{a}B, k^{-1}\bar{a}(B - k^{-1}\dot{H}_T)\}$ [171, 173]; Bardeen
(1988): $\{\alpha, \varphi, -\gamma, -\bar{a}\beta, -\chi\}$ [178]; Mukhanov et al. $\{\phi, -\psi, -E, \bar{a}B, \bar{a}(B - \dot{E})\}$ [176]. In
[2, 4, 88, 134, 195, 215, 216], we used a (+, −, −, −) metric signature, hence $h$, $h_{33}$ are of
opposite signs to those given here. I use $\psi$ because it is basically the displacement potential
familiar from use in the Zeldovich approximation, $\Psi_n$ and $\Psi_\sigma$ because they are velocity
potentials for the 4-velocity and shear of observers following the flow of time.
for scalar (S) and tensor (T ) modes do not depend upon these complications:
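$$ G^{(S)}_{tC} = -n_e\sigma_T\bar{a}\,(\Delta^{(S)}_t - \Delta^{(S)}_{t0} - \hat{q}\cdot\mathbf{v}^{(S)}_B) + ({\rm anisotropy, polarization}) , $$
$$ G^{(T\{+,\times\})}_{tC} = -n_e\sigma_T\bar{a}\,\Delta^{(T\{+,\times\})}_t + ({\rm anisotropy, polarization}) , \qquad (26) $$
where $n_e$ is the electron density, $\sigma_T$ is the Thomson cross section and $\Delta^{(S)}_{t0}$ is
the angle-averaged temperature fluctuation.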
3.2 Source functions for spectral distortions

Provided the temperature of the universe is well below me c2 where e+ e− pairs
recombine, only a small number of processes have to be included to adequately
describe the photon transport. In the following expressions, ργ is the photon
energy density, $n_B$ is the baryon number density, $\epsilon_B$ is the energy per baryon in
gas, $Y_e = n_e/n_B$ is the electron fraction per baryon, $T_e$ and $T_\gamma$ are the electron
and photon temperatures (in energy units), $x_e = \omega/T_e$, $\sigma_T = (8\pi/3)\alpha^2/m_e^2$ is
the Thomson cross section, $m_e$ and $m_N$ are the electron and nucleon masses,
and $\alpha$ is the fine structure constant. The energy rates are those appropriate
to near-equilibrium transfer from photons to plasma. All source functions and
the photon energy ω are given in the reference frame in which the electrons are
at rest (the comoving-baryon gauge).
3.2.1 Compton scattering and the Kompaneets source term

For the nonrelativistic electrons appropriate to the period after pair recom-
bination, Compton scattering is primarily Thomson scattering, a conservative
scattering process in which the outgoing photon energy $\omega'$ equals that of the
incoming one ω, so momentum but not energy is transferred. The associated
source function can describe the development of anisotropy, but will give rise
to no spectral distortion. For this source function to vanish, it is necessary that
the radiation field be isotropic in the comoving frame of the electrons.
The general structure of the source function for γe → γe scattering is
$$ S[f](\mathbf{q}) = \sum_{\mathbf{q}'} \left[ R(\mathbf{q}' \to \mathbf{q})\,f(\mathbf{q}')(1 + f(\mathbf{q})) - R(\mathbf{q} \to \mathbf{q}')\,f(\mathbf{q})(1 + f(\mathbf{q}')) \right] . $$
The first term describes stimulated emission of photons in momentum state
$\mathbf{q}$, the second describes stimulated absorption. Here $R$ is the scattering kernel,
which is related to the Klein–Nishina cross section averaged over the thermal
electron distribution, $n_e\langle d\sigma_{KN}/[(2\pi)^3 d^3q]\rangle$. If the electrons are in thermal
equilibrium at temperature $T_e$, $R$ obeys the detailed balance relation
$$ R(\mathbf{q} \to \mathbf{q}') = R(\mathbf{q}' \to \mathbf{q})\,e^{(\omega - \omega')/T_e} . $$
The source function for scattering vanishes if $f^{-1} + 1$ is proportional to $e^{\omega/T_e}$,
that is if the distribution function is a Bose–Einstein one,
$$ f_{BE} = [\exp(\omega/T_e + \alpha) - 1]^{-1} , \quad \alpha \equiv -\mu_\gamma/T_e . \qquad (27) $$
The photon chemical potential µγ enters because photon number is a conserved
quantity in Compton scattering.
For homogeneous transfer, $R(\mathbf{q} \to \mathbf{q}') = R_0(\omega \to \omega')$ is a function only of
the energies in and out. In the Thomson (very heavy electron) limit, $R_0 \propto
\delta(\omega - \omega')$, hence $S[f] \to 0$: inhomogeneity is needed to have nonzero sources
for Thomson scattering. In the next order in $m_e^{-1}$, small energy transfers
$\Delta\omega = \omega - \omega'$ do occur. Let us introduce a redistribution probability $\phi(\omega \to \omega')$,
defined by
$$ \phi(\omega \to \omega') \equiv \frac{1}{n_e\sigma_T}\,R_0(\omega \to \omega')\,\frac{(\omega')^3}{\pi^2} , \quad {\rm with} \quad \int \phi(\omega \to \omega')\,\frac{d\omega'}{\omega'} = 1 . $$
It is sharply peaked, concentrated near $\Delta\omega \approx 0$, with deviations of order $m_e^{-1}$,
as described by moments taken with respect to $\phi$:
$$ \left\langle\frac{\Delta\omega}{\omega}\right\rangle_\phi = 4\frac{T_e}{m_e} - \frac{\omega}{m_e} , \quad \left\langle\left(\frac{\Delta\omega}{\omega}\right)^2\right\rangle_\phi = 2\frac{T_e}{m_e} , \qquad (28) $$
describing both a net upward drift in the scatters if the photon energy is smaller
than $4T_e$ (i.e., the electrons on average Compton-cool) and a random walk
of the photon energy about the net drift.
To derive the Kompaneets form [33] for S, one relies on the peaked nature
of φ to “punch out” the distribution function at ω, using the Taylor expansion
in $\Delta\omega$ of $f(\omega')$ and of the detailed balance relation:
$$ S_K[f] = n_e\sigma_T c\,\frac{T_e}{m_e}\,\frac{1}{T_e\,\omega^2}\frac{\partial}{\partial\omega}\left[\omega^4\left(T_e\frac{\partial f}{\partial\omega} + f(1 + f)\right)\right] . \qquad (29) $$
The following properties can be readily verified: (1) no photons are created or
destroyed ($\int \frac{1}{\pi^2}\omega^2 d\omega\,S_K[f] = 0$); (2) $S_K[f]$ vanishes only if $f$ is of the Bose–
Einstein form (since $[f(1 + f)]^{-1}df = -d(\omega/T_e)$); (3) the rate per unit volume
at which photons are heated, $d\rho_\gamma/dt$, is the negative of the Compton cooling
rate per unit volume of the electrons, $n_B(d\epsilon_B/dt)$:
$$ -n_B\left(\frac{d\epsilon_B}{dt}\right)_K = \left(\frac{d\rho_\gamma}{dt}\right)_K = \int \frac{\omega^2 d\omega}{\pi^2}\,\omega\,S_K[f_t] \approx \frac{4n_e\sigma_T}{m_e}\,\rho_\gamma\,(T_e - T_\gamma) , \qquad (30) $$
where the last term assumes $f_t \approx f_{Pl}$, the Planck function. Thus the electron
temperature is driven by Compton cooling towards the photon temperature.
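Property (2) can be illustrated numerically. The sketch below is a hedged illustration rather than anything from the original notes: it evaluates the bracket of eq. (29) in the variable $x_e = \omega/T_e$ with a central finite difference, and the function names and step size are assumptions.

```python
import math


def bose_einstein(xe, alpha=0.0):
    """Bose-Einstein occupation number in the variable x_e = omega / T_e."""
    return 1.0 / math.expm1(xe + alpha)


def planck_at_ratio(xe, t_gamma_over_t_e):
    """Planckian at photon temperature T_gamma, expressed in x_e = omega / T_e."""
    return 1.0 / math.expm1(xe / t_gamma_over_t_e)


def kompaneets_flux(f, xe, h=1e-5):
    """Bracket of eq. (29) in x_e units: x_e^4 * (df/dx_e + f (1 + f)).

    The derivative is approximated by a central difference of step h.
    """
    dfdx = (f(xe + h) - f(xe - h)) / (2.0 * h)
    fx = f(xe)
    return xe**4 * (dfdx + fx * (1.0 + fx))
```

The flux vanishes (to finite-difference accuracy) for any Bose–Einstein distribution at the electron temperature, while a Planckian at a different temperature gives a nonzero flux whose sign depends on whether the photons are cooler or hotter than the electrons.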
If Compton scattering dominates energy redistribution, but it is not so
strong as to shape a Bose–Einstein “kinetic equilibrium” distribution, a y-
distortion spectrum is the solution to the Kompaneets equation. For small
distortions of the distribution function, $\Delta f_t \ll f_c$, we have
$$ G_K = -2\bar{a}\,n_e\sigma_T\,\frac{(T_e - T_c)}{m_e}\,\psi_K(x) , \quad \psi_K(x) \equiv 2 - \frac{x}{2}\frac{(e^x + 1)}{(e^x - 1)} . \qquad (31) $$
The solution of the radiative transfer equation can therefore be written in terms
of the Compton y-parameter (along a line-of-sight from the current (conformal)
time $\tau_0$ back to time $\tau$):
$$ \left(\frac{\Delta T}{T}\right)_K = -2y\,\psi_K(x) , \quad y \equiv \int_\tau^{\tau_0} \bar{a}\,d\tau\,n_e\sigma_T\,\frac{(T_e - T_c)}{m_e} . \qquad (32) $$
In terms of the Thomson scattering (optical) depth,
$$ \zeta_C(\tau_0|\tau) \equiv \int_\tau^{\tau_0} n_B Y_e\sigma_T\,\bar{a}c\,d\tau , \qquad (33) $$
this is $\zeta_C(\tau_0|\tau)\langle T_e - T_c\rangle/m_e$. For a fully ionized medium with free electron
abundance $Y_e$ per baryon, $\zeta_C \approx 0.0465\,Y_e\,\Omega_{B,gas}h\,\Omega_{nr}^{-1/2}(1 + z)^{3/2}$ for $z \gg 1$.
There is another useful solution to the Kompaneets equation given by Zeldovich
and Sunyaev [34] which is valid for larger y than the perturbation expansion
allows, but with the restriction that the electron temperature is well in excess
of Tγ , which eq. (32) does not require:
Z ∞
1 dξ 1 (3y − ln x + ln ξ)2
f (x) ≈ √ − f c (x) exp − .
4πy 0 ξ eξ − 1 4y
In either case, the asymptotic Rayleigh–Jeans temperature is related to the
unperturbed photon temperature by $T_{RJ} = e^{-2y}T_c$ and the total energy is
$\rho_\gamma = e^{4y}\rho_{cmb}$. The FIRAS constraint eq. (7) implies that energy injected into
the medium (below a redshift $z_y \sim 10^5$ defined below) which Compton-cooled
can be at most
$$ \frac{\delta E_{\rm Compton\ cool}}{E_{cmb}} = 4y < 6.0 \times 10^{-5}\ (95\%\ {\rm CL}) . \qquad (34) $$
The spectral signature of this y-distortion is uniquely characteristic: $\Delta T/T = -2y$ on
the Rayleigh–Jeans side, $xy$ on the far Wien side, passing through zero at $x =
3.83$, as shown in fig. 2 for $y = 0.001$. A Bose–Einstein curve with $\alpha = 0.0057$
is also shown. Both correspond to 0.4% energy injections relative to the CMB.
(For a few years in the late eighties there was a flurry of activity as theoreticians
tried to come to grips with a y = 0.016 distortion reported by Matsumoto et
al. [31].)
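The limits and the zero crossing quoted for the y-distortion follow from $\psi_K(x)$ of eq. (31). A minimal sketch, with hypothetical function names and a simple bisection chosen here purely for illustration:

```python
import math


def psi_k(x):
    """psi_K(x) = 2 - (x/2)(e^x + 1)/(e^x - 1), written with coth(x/2) = 1/tanh(x/2)."""
    return 2.0 - 0.5 * x / math.tanh(0.5 * x)


def y_distortion(x, y):
    """Thermodynamic temperature shift of eq. (32): Delta T / T = -2 y psi_K(x)."""
    return -2.0 * y * psi_k(x)


def crossover(lo=1.0, hi=10.0, tol=1e-10):
    """Bisection for the zero of psi_K between lo and hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi_k(lo) * psi_k(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def fractional_energy_injection(y):
    """Fractional energy change implied by rho_gamma = e^{4y} rho_cmb."""
    return math.expm1(4.0 * y)
```

With eq. (11), the crossover at $x \approx 3.83$ corresponds to a frequency near 218 GHz for $T_c = 2.728$ K; for $y = 0.001$ the fractional energy injection is about 0.4%, as stated in the text.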
3.2.2 Bremsstrahlung
The source function for free–free emission and absorption from ionized hydrogen
and helium is
$$ S_{ff}[f_t] = -\Gamma'_B\,(\Delta f_t - (f_{eq} - f_c)) , \quad f_{eq} = (e^{x_e} - 1)^{-1} , \qquad (35) $$
$$ \Gamma'_B = \Gamma_B\,(1 - e^{-x_e}) , \quad x_e \equiv \omega/T_e , $$
$$ \Gamma_B = \frac{(2\pi)^{3/2}}{\sqrt{3}}\,\alpha\,n_B^2\,\sigma_T\,Y_e\,(Y_p + Y_{HeII} + 4Y_{HeIII})\,\frac{m_e^{1/2}\,g(x_e)}{T_e^{1/2}\,T_c^3\,x^3} , $$
Gaunt factor: $g(x_e) \approx 1$, $x_e > 1$; $\approx \frac{\sqrt{3}}{\pi}\ln\frac{2.25}{x_e}$, $x_e < 1$.

Figure 2: Sample types of spectral distortions are compared with the FIRAS
data (Fixsen et al. 1996). SZ.004 is a y-distortion with $y = 0.001$, BE.004 is a
Bose–Einstein distortion with $\alpha = 0.0057$, du.04 is a model with ordinary dust
grains with abundance $10^{-6}$ reprocessing injected energy which was taken to
be 4% of that in the CMB between redshifts 50 and 25. Two models mimicking
the effect of an optically thin abundance of needle-like grains (whiskers) acting
over the same redshift, with 40% and 4% of the CMB energy injected (wh.4 and
wh.04), are also shown.

Since the rate at which photons are emitted into the energy interval $\omega$ to $\omega + d\omega$
by free–free processes, $dn_\gamma/dt \sim (\omega^2 d\omega/\pi^2)\,\Gamma'_B\,(f_{eq} - f_c)$, $\to d\omega/\omega$ at low $\omega$,
bremsstrahlung is very efficient at filling in an equilibrium Planck distribution
(with zero chemical potential) at low energies. Although it is also not inefficient
at high energies, Compton scattering dominates.
It is convenient to characterize the strength of bremsstrahlung by a parameter
$y_{ff}$ analogous to the Compton y-parameter:
$$ y_{ff} \equiv \int dt\,[\Gamma_B x^3]\,(1 - T_c/T_e) . \qquad (36) $$
The approximate constancy of $\Gamma_B(x)x^3$ has been exploited in this formula. The
current $2\sigma$ limit on this parameter is $y_{ff} < 1.9 \times 10^{-5}$ [19]. The total energy
input relative to the CMB for $T_e \gg T_c$ and over the long wavelength range up
to say $\lambda = 2$ cm is
$$ \frac{\delta E_{bremss}}{E_{cmb}}({\rm tot}) = 15\,y_{ff}\,\frac{T_{e*}}{T_{c*}} , $$
$$ \frac{\delta E_{bremss}}{E_{cmb}}(\lambda > 2\ {\rm cm}) \approx 15\,y_{ff}\,x \lesssim 7 \times 10^{-5} , \quad {\rm with}\ x = 0.26 . \qquad (37) $$
This can be used to constrain reionized models, with the caveat that $y_{ff} \propto
n_e^2 T_e^{-1/2}$ is dominated by dense regions and so is very sensitive to clumping in
the medium.
The source for the temperature fluctuations can be written
$$ G_{bremss} = \bar{a}\,\frac{dy_{ff}}{dt}\,\psi_{ff}(x) , \quad \psi_{ff}(x) \equiv \frac{g(x_e)}{x^3}\,\frac{(e^{-x_e} - e^{-x})}{1 - T_c/T_e}\,\frac{(e^x - 1)}{x} . \qquad (38) $$
The signature of bremsstrahlung in the thermodynamic temperature is
$$ \left(\frac{\Delta T}{T}\right)_{bremss} = y_{ff}\,\langle\psi_{ff}\rangle , \quad \psi_{ff} \approx x^{-2}\ln(2.35/x_e)\ {\rm for}\ x \ll 1 . \qquad (39) $$
Thus for low frequencies, the thermodynamic temperature follows a $\nu^{-2}$ law.
(For large $x$, but $x_e$ small ($T_e \gg T_c$), $\psi_{ff} \approx x^{-4}e^x\ln(2.35/x_e)$, so the slope
eventually turns positive.)
3.2.3 Double Compton scattering

In Compton scattering, the electron can shake off a soft photon, γe → γe + γ,
basically a bremsstrahlung process with a form very similar to that for free–
free emission. In particular, there is a logarithmic divergence in the number
of low energy photons emitted. The source functions for this Double Compton
scattering were derived in [2] using the cross sections given by Gould [35]. These
revealed a different dependence on photon energy than bremsstrahlung at high
energies:
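$$ S_{DC}[f] = -\Gamma'_{DC}\,(f - f_{eq}) , \quad \Gamma'_{DC} = \Gamma_{DC}\,(1 - e^{-x_e}) , $$
$$ \Gamma_{DC} = \frac{16\pi^3}{45}\,\alpha\,n_e\sigma_T\left(\frac{T_e}{m_e}\right)^2\frac{g_{DC}(x_e)}{x_e^3} , $$
$$ g_{DC}(x) = \frac{15}{4\pi^4}\int_{2x}^{\infty} f(y)\,(1 + f(y - x))\,y^4\,dy\,\frac{(x/y)\,F(x/y)}{2} , $$
$$ \frac{wF(w)}{2} = \tfrac{1}{2}(1 - w)\left[1 + (1 - w)^2 + \frac{w^2(1 + w^2)}{(1 - w)^2} + w^4 + w^2(1 - w)^2\right] . \qquad (40) $$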
For small w, [wF (w)/2] → 1. Burigana et al. [36] fit gDC (xe ) by exp(−xe /2),
valid for xe < 1, an improvement (in the cosmologically-relevant regime) over
a more complicated approximation I gave in [2]. The net effect is that the
Double Compton process is usually subdominant to free–free emission for cos-
mologically interesting parameters unless ΩB h2 is quite low.
3.2.4 Rayleigh scattering

Photon scattering from neutral hydrogen and helium has an identical source
function to that for Thomson scattering, but with a frequency dependence given
by the fourth power law. For hydrogen, the ratio of the rates $\sim (n_{HI}/Y_e)(\omega/\omega_\alpha)^4$,
where $\omega_\alpha = 10.2$ eV is the Lyman $\alpha$ transition energy. For typical photon
energies at recombination ($z \approx 1000$) this is small, $2 \times 10^{-5}\,n_{HI}/n_e$, and it declines
precipitously as the radiation temperature drops, so is never significant. Al-
though helium is neutral above z = 1000, Thomson scattering dominates even
in the very tightly coupled regime.
3.2.5 Line radiation

Lines formed during the recombination of helium when the temperature is a
few eV and of hydrogen at ∼ 1/4 eV are either too weak to be easily observable,
or are buried in the background associated with interstellar dust emission [37].
Such processes play a very important role in the recombination process itself
of course, and this is discussed later.
3.2.6 Synchrotron
Since synchrotron emission requires both magnetic fields and energetic elec-
trons to have been generated, it seems unlikely that a synchrotron background
from high redshift will generate a measurable distortion. However, at radio fre-
quencies the anisotropy from extragalactic radio sources and Galactic emission
is significant and will contaminate anisotropies from other signals. Fortunately
the spectral signature is sufficiently different from primary anisotropies that
with enough frequency coverage this component could be isolated. The
synchrotron intensity is parameterized by a power law index $p_s$: $I_\nu \sim \nu^{-p_s}$. For
extragalactic radio sources, ps ∼ 0.5 is the conventional value, but from deep
VLA counts there is evidence for a flatter population of sources [38]; how flat
and how abundant in the frequency range of interest for anisotropy observa-
tions is currently not well known. For Galactic sources, one has ps ∼ 0.3–0.7
at low frequencies; e.g., using maps at 408 MHz [39], 1.4 GHz [40], and 2 GHz
[22] gives ps ∼ 0.6 for moderate Galactic latitudes. One expects the index to
steepen at higher frequency, and there are indications that around 15 GHz,
ps ∼ 1 may be more appropriate [103]. In any case, (small) spatial variations
in ps are both expected and observed. The thermodynamic temperature is

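$$ \left(\frac{\Delta T}{T}\right)_{synch} \propto \psi_{synch} , \quad \psi_{synch} \equiv \frac{1}{x^{3+p_s}}\,\frac{(e^x - 1)^2}{x e^x} , \qquad (41) $$
going as $\nu^{-(2+p_s)}$ for low frequencies, an even steeper law than the bremsstrahlung
$\nu^{-2}$.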
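The low-frequency slope of the synchrotron signature can be confirmed numerically from eq. (41). This is a small sketch under the stated form of $\psi_{synch}$; the helper names and the finite-difference step are assumptions.

```python
import math


def psi_synch(x, p_s):
    """psi_synch = x^-(3 + p_s) (e^x - 1)^2 / (x e^x), as in eq. (41)."""
    return x ** -(3.0 + p_s) * math.expm1(x) ** 2 / (x * math.exp(x))


def log_slope(func, x, eps=1e-4):
    """Numerical logarithmic derivative d ln(func) / d ln(x) by central difference."""
    up = func(x * (1.0 + eps))
    down = func(x * (1.0 - eps))
    return (math.log(up) - math.log(down)) / (math.log1p(eps) - math.log1p(-eps))
```

For example, `log_slope(lambda x: psi_synch(x, 0.6), 1e-3)` should be close to $-(2 + p_s) = -2.6$, steeper than the $-2$ of bremsstrahlung.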
3.2.7 Dust grains

Radiation from heated primeval dust at high redshift would naturally reside at
submillimeter wavelengths, with the energy density peaking at several hundred
microns (e.g., BCH2 [42] and references therein); e.g., with 30 K dust typical of
starburst galaxies, the dust temperature would only be a factor of two above the
CMB at redshift 5. Of course Galactic sources abound to obscure cosmological
signals: dust at 20 K and possibly cold dust (∼ 5 K) at high Galactic latitude
[43]. The dust source function for emission/absorption is
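$$ S_{dust} = \Gamma_a\,(f_{eq} - f_c) - \Gamma_a\,\Delta f , \qquad (42) $$
$$ f_{eq} = (e^{x_d} - 1)^{-1} , \quad x_d \equiv \frac{\omega}{T_d} , \quad \Delta f \equiv f - f_c , \quad f_c \equiv (e^x - 1)^{-1} , \qquad (43) $$
$$ \Gamma_a = \frac{\rho_d}{\rho_{id}}\,A_d(\lambda_e)\left(\frac{\lambda_e}{2\pi c}\right)^{-1} . \qquad (44) $$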
There is also a dust scattering source term. Here feq is the equilibrium distri-
bution function for dust in thermal equilibrium at a (single) temperature Td
(obtained by balancing the energy absorbed from the local radiation field to the
energy emitted – usually). ρd is the mass density of grains, ρid is their internal
density ($\approx 3$ g cm$^{-3}$), and the parameterizing function $A_d$ depends upon the
photon energy ω = 2π/λe and grain properties. In BCH2, we adopted a three
parameter form (αd , Ad100 , rd ), only two of which were needed at the infrared
emission wavelengths of relevance for CMB observations, an amplitude Ad100
at 100 µm and a slope:
$$ A_d(\lambda_e, z) = A_{d100}\,(100\ \mu{\rm m}/\lambda_e)^{\alpha_d - 1} , \qquad (45) $$
where Ad100 ≡ Ad (100 µm). For almost all types of grains and plausible con-
ditions, the absorption part −Γa ∆f of eq. (42) is of relevance only in the UV
and visible. However, the large contribution of the unperturbed cosmic back-
ground itself must be included as an absorption component −Γa fc that partly
counteracts the emission component Γa feq at long wavelengths. The source
function for the thermodynamic temperature is therefore
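$$ G_{dust} = K\,\bar{a}^{1-\alpha_d}\,\frac{\rho_d}{\rho_{id}}\,A_{d100}\,\psi_{dust} , \qquad (46) $$
$$ \psi_{dust}(x) \equiv x^{\alpha_d}\,\frac{(e^{-x_d} - e^{-x})}{(1 - e^{-x_d})}\,\frac{(e^x - 1)}{x} , \qquad (47) $$
where $K$ is a constant, hence the shape of the thermodynamic temperature
spectral form is
$$ \left(\frac{\Delta T}{T}\right)_{dust} \propto \psi_{dust} , \quad \approx \frac{(T_d - T_c)}{T_c}\,x^{\alpha_d}\ {\rm for}\ x \ll 1 . \qquad (48) $$
For $x_d \ll 1$, $x > 1$, $\psi_{dust} \approx x^{\alpha_d - 2}e^x$.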
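The dust spectral shape of eq. (47) and its low-frequency limit in eq. (48) can be compared numerically. The sketch below is illustrative; the function names are hypothetical, and the dust temperature enters only through the ratio $T_d/T_c$ (so that $x_d = x\,T_c/T_d$).

```python
import math


def psi_dust(x, alpha_d, td_over_tc):
    """Eq. (47): x^alpha_d (e^-x_d - e^-x)/(1 - e^-x_d) (e^x - 1)/x, with x_d = x T_c/T_d."""
    xd = x / td_over_tc
    emission = (math.exp(-xd) - math.exp(-x)) / (-math.expm1(-xd))
    return x ** alpha_d * emission * math.expm1(x) / x


def psi_dust_low_x(x, alpha_d, td_over_tc):
    """Eq. (48) limit for x << 1: (T_d - T_c)/T_c * x^alpha_d."""
    return (td_over_tc - 1.0) * x ** alpha_d
```

Dust warmer than the CMB gives a positive signature and dust colder than the CMB a negative one, reflecting the competition between the emission term and the absorption of the unperturbed background.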
Of course the dust population will be a mix of grains of differing compo-
sition, size, shape and temperature. Unravelling the components making up
“conventional” Galactic dust remains a hotly debated subject, nicely reviewed
in [50]. An example of a recently proposed mix to explain all of the data from
the UV to the sub-mm [44] is: most of the mass in “usual” ∼ 0.01–0.1 µm
silicate grains, with an added carbon-dominated coating and separate amor-
phous carbon and graphite grains; ∼ 6% of the mass in very small (∼ 10 Å)
carbon-dominated grains; and ∼ 6% in ∼ 10–100 Å polycyclic aromatic hy-
drocarbon molecules (PAHs), of which “bucky-balls” are an example. Dust
which is porous and fractal [52], consisting of large random aggregates of small
grains, and grains which are triaxial, possibly with extreme elongations (needles
or whiskers), are also proposed constituents of the Galactic mix. For spherical
amorphous carbon, graphite and silicate grains, Ad100 ∼ 0.3. The slope αd de-
pends upon the mix of grains and their shapes. On broad theoretical grounds,
one expects αd ≈ 2 for large λ. If the FIRAS sub-mm to mm emission is fit
to a single temperature dust model, αd = 1.65 (and Td = 23 K) are obtained
[43]; similar slopes are inferred from other CMB experiments, while earlier data
over the 100–1300 µm range gave αd ∼ 1–2, with the steeper slopes inferred
for star forming regions and molecular clouds, and the shallower ones inferred
for the Galactic center, dust forming stars and compact HII regions. Forcing
αd to be exactly 2, but allowing two dust temperatures, gives a better fit to
FIRAS [43, 47], with the 20 K dust augmented by a 5 K component. A cold
component persists at high Galactic latitude, which could be Galactic [48] or
due to redshifted extragalactic sources [49]. The dust temperatures associated
with most of the Galactic IR luminosity, from diffuse HI clouds, and also from
molecular clouds, are around 20 K, with warmer 30 K dust in lower density
HII regions, which do not contribute much luminosity. In starburst galaxies,
30 K dust dominates.
The dust absorption law at short wavelengths is also of concern because
it determines how efficiently stellar and other radiation is absorbed to be
re-emitted in the infrared. If the absorption cross section were geometrical, $\pi r_d^2$,
where rd is the grain radius, then Ad = 0.75λe /(2πrd ) and Γa is approximately
constant – a rough guide, but for realistic materials Ad is broadly frequency
independent at intermediate wavelengths with resonance features superposed.
Galactic dust is observed to have Ad ≈ 0.8 at λ = 0.1 µm, rising from the visual
to the UV (until $\sim 0.1$ µm), probably due to very small grains, with a strong
resonant feature at λ ≈ 0.22 µm associated with graphite. There is a strong broad
resonance at λ ≈ 10 µm, attributed to silicates in an amorphous or disordered
state, another silicate feature at 19 µm, and a resonance feature around 3 µm,
attributed to carbonaceous grains or coating on the silicates [44]. Dust grains
in molecular clouds exhibit more resonances. The size distribution of grains
can be derived from Galactic extinction data only with specific assumptions
about the nature of the dust; e.g., [51] apply the 0.1–5 µm extinction data
to spherical grain models and obtain $dn_d/dr_d \sim r_d^{-3.1}\exp[-r_d/0.14\ \mu{\rm m}]$ for
silicates and $\sim r_d^{-3.5}\exp[-r_d/0.28\ \mu{\rm m}]$ for graphite and/or amorphous carbon
– not far off the oft-used MRN law $dn_d/dr_d \sim r_d^{-3.5}$ [46].
The usual way to make predictions about dust emission in the extragalactic,
protogalactic and pregalactic realms is to assume the dust is similar to Galactic
dust – as it is currently envisaged. For emission redshifts below about 10,
resonances would not appear in the 5000–500 µm FIRAS band but pregalactic
dust emission at z ∼ 50–100 would bring broadened resonance features into
the FIRAS band to aid in emission epoch determination – if distortions had
been found. Complicating the constraints that one can impose from the FIRAS
limits on high redshift emission is the freedom one has with dust models. In
particular, fractal grains would have large effective sizes which would lower the
effective αd in the far infrared and thereby increase Ad (λ); it could easily be
by more than an order of magnitude over the conventional dust value. A much
more radical absorption rate would result if grains were long conducting needles,
basically little antennae for which Ad could be thousands of times bigger than
the conventional value at long wavelengths. The cosmological importance of
these whiskers was suggested by Layzer and Hively [53], and the subject has
been developed by Rana [56], Hoyle [54], Wright [57], and is the mainstay of the
attempts by Hoyle, Burbidge and Narlikar [55] to create a viable neo-steady-
state model.
In the limit in which the photon wavelength is large compared with the scale
$r_d$ of the grain (volume $\equiv (4\pi/3)r_d^3$), $A_d$ can be written in terms of the trace
of the (electric) polarizability tensor, $\alpha_e^{ij}$, of the grain (which can be treated as
a coherent unit in this limit): $A_d = \frac{4\pi}{3}\sum_j \Im[\alpha_e^{jj}]$, where $\Im$ denotes imaginary
part. For example, for homogeneous ellipsoidal grains with a complex isotropic
dielectric tensor $\epsilon(\omega)\delta_{ij}$,
$$ A_d = \Im[\mathcal{A}_d] , \quad \mathcal{A}_d = \frac{1}{3}\sum_{j=1}^{3}\frac{\epsilon(\omega) - 1}{1 + L_j(\epsilon(\omega) - 1)} \quad ({\rm for}\ \omega r_d \ll 1) . \qquad (49) $$
The sum is over the axes of the grains, and the Lj are “depolarizing factors”,
functions of the axis ratios for ellipsoidal grains. The Lj sum to unity. Setting
all Lj = 1/3 gives the classical Mie expression for spherical grains; it is used
together with laboratory data on $\epsilon(\omega)$ to estimate $A_d$.
For conductors at IR wavelengths, the dielectric function is of form $\epsilon(\omega) \approx
\epsilon_d + i2\sigma_c\lambda$, where $\epsilon_d$ is the static (real) dielectric constant and $\sigma_c$ is the
conductivity. For iron grains, $(2\sigma_c)^{-1} \approx 0.015$ µm, and for carbon (graphite) grains,
it is $\approx 0.6$ µm. For needle-like grains $L_j$ is 1/2 in the transverse directions
and nearly vanishes along the needle. For example, assuming a prolate
ellipsoid with semi-minor axis $b_d$ much smaller than the semi-major axis $a_d$, the
depolarizing factor along the needle is $L_\parallel \approx (b_d/a_d)^2\ln(a_d/b_d)$, hence eq. (49)
gives
$$ A_d \approx \frac{1}{3}\left[\frac{2\sigma_c\lambda}{1 + (L_\parallel 2\sigma_c\lambda)^2} + 8\,\frac{2\sigma_c\lambda}{(1 + \epsilon_d)^2 + (2\sigma_c\lambda)^2}\right] , \qquad (50) $$
$\propto \lambda$ until $\lambda$ exceeds $L_\parallel^{-1}(2\sigma_c)^{-1}$, which could be in the centimeter range if $a_d/b_d$
could be above a thousand. Thus, $\alpha_d \approx 0$, perhaps rising to 2 only beyond the
FIRAS range. Formation scenarios that could lead to such elongated grains
have been proposed but there is no evidence that they are produced in nature.
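The step from eq. (49) to the needle approximation of eq. (50) can be checked with complex arithmetic. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the transverse depolarizing factors are set to $(1 - L_\parallel)/2$ so that the three factors sum to unity, and any value used for $\epsilon_d$ in a call is a placeholder rather than a measured material constant.

```python
import math


def absorption_efficiency(eps, depol):
    """Eq. (49): A_d = Im[(1/3) sum_j (eps - 1) / (1 + L_j (eps - 1))]."""
    total = sum((eps - 1.0) / (1.0 + l_j * (eps - 1.0)) for l_j in depol)
    return (total / 3.0).imag


def needle_depolarization(aspect):
    """Depolarizing factors for a prolate needle with aspect = a_d / b_d >> 1.

    L_parallel ~ (b/a)^2 ln(a/b); the two transverse factors are taken as
    (1 - L_parallel)/2 (an assumption that enforces sum(L_j) = 1).
    """
    l_par = math.log(aspect) / aspect**2
    l_perp = 0.5 * (1.0 - l_par)
    return (l_par, l_perp, l_perp)


def needle_approx(s, eps_d, l_par):
    """Eq. (50) with s = 2 sigma_c lambda."""
    axial = s / (1.0 + (l_par * s) ** 2)
    transverse = 8.0 * s / ((1.0 + eps_d) ** 2 + s**2)
    return (axial + transverse) / 3.0


def conductor_epsilon(eps_d, s):
    """Dielectric function eps = eps_d + i s for a conductor at IR wavelengths."""
    return complex(eps_d, s)
```

Comparing `absorption_efficiency` for a needle against the sphere case (all $L_j = 1/3$) at the same $\epsilon$ shows the large enhancement at long wavelengths that motivates the whisker discussion.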
Wright et al. [58] argue that the FIRAS data implies such a good blackbody
that a large optical depth to needles is needed in a whisker-impregnated steady
state model, and this would mask high redshift objects (we have also seen the
SZ effect in a cluster at z = 0.55, section 5.3.4). Although it seems improbable
that the entire CMB could be just dust-emitted radiation, a small fraction of
grains in the whisker form could hide more modest energy injections. Figure 2
illustrates what happens when one flattens the dust index to αd = 0 and
uses a whisker-motivated value for Ad100 (2222 was chosen) on a model with
energy injected in a burst between redshifts 50 and 25 and a dust abundance
$\Omega_d = 10^{-7}$, arranged to give a depth just below unity. Whereas a model with
injected energy 4% of the CMB is strongly ruled out for normal dust with
αd = 1.5 and Ad100 = 0.3, the redshifted whisker temperature remains so
near the CMB temperature that the distortion is small (but the 40% injected
energy model is ruled out). Of course, a mix of grain types with only a small
percentage of whiskers will give larger distortions [57]; and even the whisker-
only model will be enhanced by nonequilibrium effects: these antennae are
such efficient radiators that a balance between absorbed and emitted energy
leading to a steady dust temperature will not happen, but rather there will
be strong temperature fluctuations as absorbed energy is immediately radiated
away, a phenomenon expected in very small grains as well [59]. With improved
exploration of the sub-mm and mm sky, a necessary part of the next generation
of CMB anisotropy experiments, we can expect that the exotic dust loophole
will be more strongly constrained.
3.3 The cosmic photosphere and Bose–Einstein distortions
To determine what happens to injected energy at early epochs, we must solve
$(\partial f/\partial t)_q = S_{bremss} + S_{DC} + S_K$. The other processes mentioned above are not
important. To be accurate, numerical solutions are required; Burigana, Danese
and De Zotti [36] give the most detailed to date. Three redshifts characterize
the solutions. Energy injection prior to
$$z_{Pl} \approx 10^{6.9}\left(\frac{\Omega_B h^2}{0.01}\right)^{-0.39} \qquad (51)$$
is redistributed into a Planckian form, hence $z_{Pl}$ defines the redshift of the
cosmic photosphere. Between $z_{Pl}$ and
$$z_{BE} \approx 10^{5.6}\left(\frac{\Omega_B h^2}{0.01}\right)^{-1/2} \qquad (52)$$
injected energy is redistributed into a Bose–Einstein shape characterized by a
chemical potential. Below
$$z_y \approx 10^{5}\left(\frac{\Omega_B h^2}{0.01}\right)^{-1/2} \qquad (53)$$
the y-distortion formula holds. There is an intermediate range between $z_{BE}$
and $z_y$ when neither the Bose–Einstein nor y-distortion forms are accurate.
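The three characteristic redshifts of eqs. (51)–(53) can be tabulated with a few lines of Python. This is only a minimal sketch of the fitting formulas quoted above; the function names and the regime labels are illustrative choices, not part of the original treatment.

```python
def thermalization_redshifts(omega_b_h2):
    """Return (z_Pl, z_BE, z_y) from the fitting formulas (51)-(53)."""
    s = omega_b_h2 / 0.01
    z_pl = 10**6.9 * s**-0.39   # cosmic photosphere, eq. (51)
    z_be = 10**5.6 * s**-0.5    # lower edge of the Bose-Einstein era, eq. (52)
    z_y = 10**5.0 * s**-0.5     # upper edge of the y-distortion era, eq. (53)
    return z_pl, z_be, z_y


def distortion_regime(z, omega_b_h2):
    """Classify the fate of energy injected at redshift z (labels are hypothetical)."""
    z_pl, z_be, z_y = thermalization_redshifts(omega_b_h2)
    if z > z_pl:
        return "planck"          # fully rethermalized
    if z > z_be:
        return "bose-einstein"   # chemical-potential distortion
    if z > z_y:
        return "intermediate"    # neither form is accurate
    return "y-distortion"
```

For example, `distortion_regime(1e6, 0.01)` falls in the Bose–Einstein window, and the ratio `z_be / z_y` is $10^{0.6} \approx 4$ for any baryon density.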
To understand the magnitudes of these redshifts, an analytic treatment
based on Zeldovich and Sunyaev [34] is quite adequate. Assume the distribution
function has the form $f = [\exp(x + \alpha(x,t)) - 1]^{-1}$ and linearize the
transport equation in $\alpha$. ($\alpha$ is more transparent to work with than the
thermodynamic temperature fluctuation $\Delta_t = -\alpha/(x+\alpha)$.) In the tight coupling
regime, $S_K + S_{bremss} + S_{DC}$ approximately vanishes; this condition is satisfied
for small $x_e = \omega/T_e$ if $\alpha = \alpha_0(t)\exp(-x_0/x_e)$. Thus for low frequencies,
$x < x_0$, bremsstrahlung and the Double Compton process dump photons in
fast enough to yield a Planck form, but for $x > x_0$ the Bose–Einstein form
prevails. Here $x_0 = (4x^3(\Gamma_{bremss}+\Gamma_{DC})/\Gamma_K)^{1/2}$, where the “Kompaneets” rate
is $\Gamma_K \equiv 4n_e\sigma_T T_e/m_e$. The approximate constancy of $x^3(\Gamma_{bremss}+\Gamma_{DC})$ has
been exploited to obtain this result. If we assume $\dot Y_\gamma$ photons per baryon are
being injected with average energy $\bar E_\gamma$ at time t, adding to the $Y_{\gamma 0}$ photons per
baryon already there, then we find the scaling parameter $\alpha_0$ evolves according
to
$$\frac{d\alpha_0}{dt} = -\frac{\alpha_0}{\tau_D} + 1.87\left(\frac{\bar E_\gamma}{3.6\,T_\gamma} - 1\right)\frac{\dot Y_\gamma}{Y_{\gamma 0}},$$
$$\tau_D = 1.29\,x_0\,[(\Gamma_{bremss}+\Gamma_{DC})x^3]^{-1} = 1.29\,(\Gamma_K/4)^{-1}x_0^{-1}. \qquad (54)$$
Thus there is a damping term with timescale $\tau_D$ driving $\alpha_0$ towards zero,
i.e., a Planck distribution, against which the injection term tries to drive the
distortion. When the damping time is shorter than the expansion time of the
universe, any injected energy would be rethermalized into a Planckian in
equilibrium with the electrons within one Hubble time. This basically defines
$z_{Pl}$. When the Kompaneets rate is a few times the expansion rate, $x_0$ will
be low but $\alpha_0$ will not be zero, and the BE form is appropriate. This defines
$z_{BE}$. However, it is not until the Kompaneets rate is a few times below the
expansion rate that the perturbative y-distortion solution prevails. This defines
$z_y \approx z_{BE}/4$. Naturally $z_{BE}$ and $z_y$ scale in the way defined by $\Gamma_K/H$.
To constrain the allowed energy input in the Bose–Einstein regime, we
take the BE distribution and linearize it in $\Delta T/T$ and $\alpha$, where now both are
frequency-independent. The photon number density and photon energy density
are related to the unperturbed values by
$$n_\gamma = n_\gamma^{(0)}\left[1 + 3\frac{\Delta T}{T} - \frac{\zeta_2}{\zeta_3}\alpha\right], \qquad \rho_\gamma = \rho_\gamma^{(0)}\left[1 + 4\frac{\Delta T}{T} - \frac{\zeta_3}{\zeta_4}\alpha\right],$$
where $\zeta_s = \sum_j j^{-s}$ denotes the Riemann zeta function of index s. With fixed
photon number throughout energy injection, we must have $n_\gamma = n_\gamma^{(0)}$ remaining
invariant, hence a relationship between the temperature perturbation and the
chemical potential, $\Delta T/T = (\zeta_2/\zeta_3)\alpha/3$, leading to a relative energy
perturbation
$$\frac{\delta\rho_\gamma}{\rho_\gamma^{(0)}} = \left(\frac{4\zeta_2}{3\zeta_3} - \frac{\zeta_3}{\zeta_4}\right)\alpha = 0.71\,\alpha. \qquad (55)$$
Using the FIRAS constraint eq. (8), the allowed energy injection relative to the
primeval radiation in the $z_{Pl}$ to $z_{BE}$ epoch is at most [58]
$$\frac{\delta E_{BE}}{E_{cmb}} \lesssim 6.4\times10^{-5} \quad (95\%\ {\rm CL}). \qquad (56)$$
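The numerical coefficients in eqs. (54)–(56) follow from zeta-function ratios and can be checked directly. The sketch below is a cross-check under stated assumptions, not part of the original derivation: it evaluates $\zeta_3$ by direct summation, uses the closed forms $\zeta_2 = \pi^2/6$ and $\zeta_4 = \pi^4/90$, and inverts eq. (55) to get the chemical potential implied by the bound (56). The variable names are illustrative.

```python
import math


def zeta3(n_terms=100000):
    """Riemann zeta(3) by direct summation plus an integral estimate of the tail."""
    partial = sum(1.0 / j**3 for j in range(1, n_terms + 1))
    return partial + 0.5 / n_terms**2


ZETA2 = math.pi**2 / 6.0
ZETA3 = zeta3()
ZETA4 = math.pi**4 / 90.0

# Coefficient of alpha in eq. (55): delta rho / rho at fixed photon number.
energy_coeff = 4.0 * ZETA2 / (3.0 * ZETA3) - ZETA3 / ZETA4

# Mean Planck photon energy in units of T, and 4/3 of it (the "3.6 T" of eq. 54).
mean_energy_over_t = math.pi**4 / (30.0 * ZETA3)
threshold_energy_over_t = 4.0 * mean_energy_over_t / 3.0

# Source coefficient of eq. (54): (4/3) / energy_coeff.
injection_coeff = (4.0 / 3.0) / energy_coeff

# Chemical potential corresponding to the energy bound of eq. (56).
alpha_max = 6.4e-5 / energy_coeff

print(energy_coeff, threshold_energy_over_t, injection_coeff, alpha_max)
```

The printed values should reproduce the 0.71, 3.6 and 1.87 quoted in the text to the digits given there.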
3.4 Recombination and photon decoupling

3.4.1 Hydrogen and Helium Recombination
The subject of the recombination of the primeval plasma was well developed
immediately after the discovery of the background radiation [60, 61]. In this
subsection, we display the ODEs we solved in the Bond and Efstathiou papers
for hydrogen recombination [134, 88]. Although helium is neutral through
hydrogen recombination, helium recombination is now also explicitly included
for our anisotropy calculations for increased accuracy [302], and the relevant
equations are also given. In implementing these equations, it is important to
use very accurate and self-consistent physical parameters.
The availability of photons per baryon in the background radiation illustrates
that there are not enough photons above the Lyman α energy to guarantee
equilibrium of the 1s state with states above it, though there are plenty
below the Balmer continuum. Thus absorption and production timescales for
the 2s → 3p transitions, for example, are measured in seconds at redshifts
above 1000. We can therefore take the population of excited states with n > 1
to be in thermal equilibrium with the 2s state. In the following, we denote
the abundances per baryon of various hydrogen states {n, ℓ} by $Y_{n\ell}$, the total
abundance of hydrogen atoms and ions per baryon by $Y_{HT}$, and the free electron
and proton abundances by $Y_e$ and $Y_p$. The (positive) binding energy of
the state {n, ℓ} is denoted by $B_n$, and $g_{n\ell} = 4(2\ell+1)$ is its statistical weight,
with the 4 coming from the proton and electron spins. As before, $T_e$ and $T_\gamma$ are
the electron and photon temperatures in energy units ($k_B = 1$). When account
is taken of the equilibrium associated with the fast timescales, the network of
equations describing the normal recombination transition is:
1. Equilibrium of the state {n, ℓ} with the 2s:
$$Y_{n\ell} = (g_{n\ell}/4)\,Y_{2s}\exp[-(B_2-B_n)/T_\gamma]. \qquad (57)$$
2. Baryon conservation:
$$Y_p + Y_{1s} + Y_{2s}Z(T) = Y_{HT}, \qquad Z(T) = \sum_{n>1,\,\ell}(g_{n\ell}/4)\,e^{-(B_2-B_n)/T_\gamma}. \qquad (58)$$
The partition function for states above n = 1 is Z(T).

3. Loss of free electrons through recombination:
$$\dot Y_e = -\alpha_c n_B Y_e Y_p + Y_{2s}\beta_c.$$
Here $\alpha_c$ is the recombination rate, excluding direct recombinations to
the ground state since the released photon above the Lyman edge leads
immediately to another ionization. The factor
$$\beta_c = \left(\frac{m_e c^2 T_e}{2\pi(\hbar c)^2}\right)^{3/2} e^{-B_2/T_\gamma}\,\alpha_c \qquad (59)$$
describes the detailed balance relating the photoionization rate to the
recombination coefficient $\alpha_c$. For $\alpha_c$, we use the analytic approximation
$$\alpha_c = 1.948\times10^{-13}\,(10^4\,{\rm K}/T_e)^{1/2}\,\varphi(y)\ {\rm cm^3\,s^{-1}}, \qquad (60)$$
$$\varphi(y) \approx \tfrac{1}{2}(1.735 + \ln y + y^{-1}/6) - (1 - y^{-1} - 2y^{-2}), \qquad y \equiv \frac{13.6\ {\rm eV}}{T_e}$$
(Bates and Dalgarno [62]; see footnote 1).
4. 1s production:
$$\dot Y_{1s} = \frac{Y_{2s}}{\tau_{2\gamma}} - \frac{Y_{1s}}{\tau_{2\gamma}}e^{-(B_1-B_2)/T_\gamma} + \frac{R}{n_B}. \qquad (61)$$
The first two terms describe the 2s → 1s + γγ transition and its inverse,
with lifetime $\tau_{2\gamma} = 0.12$ s. The last describes the rate at which Lyman
alpha photons from the 2p → 1s + γ transition are shifted out of the line due
to the expansion of the Universe before they can be reabsorbed. Thermal CMB
photons are irrelevant for this, since, at the temperatures of recombination,
essentially no photons with energies as high as Ly alpha exist. (Balmer lines
do yield a thermal distribution function.) Thus a detailed solution of the
photon distribution function across the line, including redshift effects, is
needed. This is straightforward. For given $Y_{2s}$ and $Y_{1s}$, Peebles shows
$$R = H(a)\,\frac{(B_1-B_2)^3}{\pi^2(\hbar c)^3}\left[\frac{Y_{2s}}{Y_{1s}} - e^{-(B_1-B_2)/T_\gamma}\right]. \qquad (62)$$
Here the Hubble parameter is H(a).

1 This recombination rate is superior to the Boardman [63] form,
$\alpha_c = 2.84\times10^{-13}\,T_{e4}^{-1/2}$, $T_{e4} \equiv T_e/10^4$ K, used by Peebles, and to the oft-used
Seaton approximation, $\alpha_c = 2.6\times10^{-13}\,T_{e4}^{-0.85}$. The latter is accurate at
$10^4$ K, but differs from the Osterbrock [64] values by 3% at 5000 K, 9% at
2500 K and by 19% at 1250 K, whereas the formula we adopt differs by only a
percent in all three cases (and by even less from the tabulated values of Bates
and Dalgarno). The original Peebles formula differs by 12%, 26% and 37%,
respectively.
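The three recombination-coefficient fits discussed around eq. (60) are easy to compare numerically. The following is a minimal sketch that transcribes the formulas as quoted in the text; the function names and the value used for Boltzmann's constant are assumptions made for illustration.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K (assumed value)


def alpha_adopted(t_kelvin):
    """Eq. (60), in cm^3/s, with y = 13.6 eV / T_e."""
    y = 13.6 / (K_B_EV * t_kelvin)
    phi = 0.5 * (1.735 + math.log(y) + 1.0 / (6.0 * y)) - (1.0 - 1.0 / y - 2.0 / y**2)
    return 1.948e-13 * math.sqrt(1.0e4 / t_kelvin) * phi


def alpha_boardman(t_kelvin):
    """Boardman form used by Peebles: 2.84e-13 * T_e4^(-1/2) cm^3/s."""
    return 2.84e-13 * (t_kelvin / 1.0e4) ** -0.5


def alpha_seaton(t_kelvin):
    """Seaton approximation: 2.6e-13 * T_e4^(-0.85) cm^3/s."""
    return 2.6e-13 * (t_kelvin / 1.0e4) ** -0.85


for t in (1.0e4, 5000.0, 2500.0, 1250.0):
    ref = alpha_adopted(t)
    print(t, ref, alpha_seaton(t) / ref, alpha_boardman(t) / ref)
```

The printed ratios show the Seaton fit drifting high and the Boardman fit drifting low relative to the adopted formula as the temperature drops, in line with the percentages quoted in the footnote.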
The net effect of the rapid equilibration of the 2s state with the 2p and
higher states yields the equation
$$\dot Y_{2s} = \alpha_c n_B Y_e Y_p - Y_{2s}\beta_c - \frac{Y_{2s}}{\tau_{2\gamma}} + \frac{Y_{1s}}{\tau_{2\gamma}}e^{-(B_1-B_2)/T_\gamma} - \frac{R}{n_B}. \qquad (63)$$
The rates are large enough that $\dot Y_{2s}$ can be taken to vanish, yielding an
expression for $Y_{2s}$ in terms of $Y_{1s}$ and $Y_e$. Baryon conservation gives a further
relation of $Y_{1s}$ in terms of $Y_p = Y_e$, so the entire system of equations reduces
to one for the evolution of the free electron abundance, $Y_e$.
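Denoting the ionization fraction by $x = Y_p/Y_{HT}$, putting $Y_{1s}/Y_{HT} \approx 1 - x$,
and transforming the time derivative to one over the photon temperature,
$T_\gamma = T_{\gamma *}/\bar a$, we have
$$\frac{dx}{dT_\gamma} = \frac{1}{T_\gamma H(a)}\,\frac{n_B Y_{HT}\,\alpha_c x^2 - \beta_c(1-x)\,e^{-(B_1-B_2)/T_\gamma}}{1 + \beta_c\left[\tau_{2\gamma}^{-1} + \dfrac{(B_1-B_2)^3 H(a)}{\pi^2(\hbar c)^3\,n_B Y_{HT}(1-x)}\right]^{-1}}. \qquad (64)$$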
This is a stiff equation. At high redshift, Saha equilibrium holds, with $\dot Y_{1s} = 0$,
thus $Y_{2s} = Y_{1s}e^{-(B_1-B_2)/T_\gamma}$, and $\dot Y_e = 0$, hence
$$Y_{1s} = c_{HI}Y_eY_p, \qquad c_{HI} = n_B\left(\frac{m_e c^2 T_e}{2\pi(\hbar c)^2}\right)^{-3/2}\frac{g_{HI}}{g_e g_p}\,e^{B_1/T_\gamma}, \qquad (65)$$
but by $T_\gamma \approx 4000$ K one should shift over to the ODE solution. The
statistical weights are $g_e = 2$, $g_p = 2$, $g_{HI} = 4$.
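The Saha limit of eq. (65) can be solved in closed form for the hydrogen ionization fraction. The sketch below is illustrative only: it sets $T_e = T_\gamma$, ignores helium, and the present-day CMB temperature, the baryon density coefficient, the default $\Omega_B h^2$ and the hydrogen abundance per baryon are all assumed values that do not come from the text.

```python
import math

M_E_C2_EV = 510998.95           # electron rest energy in eV
HBAR_C_EV_CM = 1.973269804e-5   # hbar*c in eV cm
K_B_EV = 8.617333e-5            # Boltzmann constant in eV/K
B1_EV = 13.6                    # hydrogen ground-state binding energy in eV


def saha_hydrogen_x(z, omega_b_h2=0.0125, y_ht=0.76, t_cmb0=2.728):
    """Ionization fraction x = Y_p/Y_HT from eq. (65), hydrogen only.

    With Y_e = Y_p = x*Y_HT and Y_1s = (1-x)*Y_HT, eq. (65) becomes
    (1 - x) = c_HI * Y_HT * x**2, a quadratic in x.
    """
    t_ev = K_B_EV * t_cmb0 * (1.0 + z)
    n_b = 1.123e-5 * omega_b_h2 * (1.0 + z) ** 3   # baryons per cm^3 (assumed coefficient)
    phase = (M_E_C2_EV * t_ev / (2.0 * math.pi * HBAR_C_EV_CM**2)) ** 1.5
    s = phase * math.exp(-B1_EV / t_ev) / (n_b * y_ht)   # s = 1/(c_HI * Y_HT)
    if s < 1e-200:
        return math.sqrt(s)     # x ~ sqrt(s) when almost fully recombined
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / s))


for z in (2000, 1500, 1300, 1000):
    print(z, saha_hydrogen_x(z))
```

The equilibrium fraction falls steeply below z of about 1500, which is why the text switches to the rate equation (64) by $T_\gamma \approx 4000$ K.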
This equation is coupled to the Compton cooling equation (30) for the
evolution of the electron temperature $T_e$ as it breaks equality with the photon
temperature $T_\gamma$ to follow the $(1+z)^2$ redshift evolution of a nonrelativistic
ideal fluid:
$$\frac{1}{a^2}\frac{d}{dt}\left(T_e a^2\right) = -\frac{8\sigma_T\rho_\gamma}{3m_e c}\frac{Y_e}{Y_T}(T_e - T_\gamma) - \left[\frac{2}{3}\frac{\dot Y_e}{Y_T}\left(B_1 + \frac{3}{2}T_e\right)\right]. \qquad (66)$$
Here $Y_T \equiv Y_{HT} + Y_e + Y_{HeT}$ is the number of gas particles per baryon. (The
term in square brackets from the binding and thermal energy gained when an
electron recombines is ignorable here.) The large value of the photon energy
density $\rho_\gamma$ ensures that this Compton heating keeps $T_e$ and $T_\gamma$ nearly equal
until a redshift below about 400 (as shown in fig. 3(c)). These equations must
be integrated numerically with stiff ODE solvers. Solutions for some CDM
models are shown in fig. 3(a). If one is just interested in the development of
anisotropies, the critical region is not around the redshift ∼ 1500 when the
universe passes from 95% to 10% ionized, but rather a redshift interval from
about 1200 to 900 when the radiation passes from being tightly coupled to
freely streaming, when the optical depth to Thomson scattering, ζC defined
by eq. (33), passes through unity. The final values of the residual ionization
are also of interest since those few free electrons present catalyze the formation
of molecular hydrogen, which can be an important coolant in the first objects
that collapse in the Universe.
Krolik [65] discusses extra Fokker–Planck diffusive terms arising from scat-
tering in the lines, but shows that these result in numerically small corrections
to recombination over that obtained using the system of equations given here.
Although photons are quite tightly coupled to the baryons when helium re-
combines, for high precision calculations of CMB anisotropies at small angular
scales the effect should be taken into account [302]. With more free electrons
present, the photons do not diffuse as easily. It seems to be adequate to solve
for the Saha equilibrium rather than doing the full time evolution as is required
for hydrogen recombination. One should solve for the ionization fractions of
the states of helium and hydrogen together, in practice done by iterating the
following equations and demanding convergence in $Y_{HI}$ and $Y_{HeII}$:
$$Y_e = Y_{HT} + 2Y_{HeT} - (Y_{HI} + 2Y_{HeI} + Y_{HeII}),$$
$$Y_{HI} = Y_{HT}/(1 + (c_{HI}Y_e)^{-1}),$$
$$Y_{HeII} = Y_{HeT}/(1 + (c_{HeII}Y_e)^{-1} + c_{HeI}Y_e),$$
$$Y_{HeI} = Y_{HeT}/(1 + (1 + (c_{HeII}Y_e)^{-1})(c_{HeI}Y_e)^{-1}). \qquad (67)$$
The coefficients entering are
$$c_{HeI} = n_B\left(\frac{m_e c^2 T_e}{2\pi(\hbar c)^2}\right)^{-3/2}\frac{g_{HeI}}{g_e g_{HeII}}\,e^{B_{HeI}/T_\gamma},$$
$$c_{HeII} = n_B\left(\frac{m_e c^2 T_e}{2\pi(\hbar c)^2}\right)^{-3/2}\frac{g_{HeII}}{g_e g_{HeIII}}\,e^{B_{HeII}/T_\gamma}, \qquad (68)$$
with statistical weights reflecting the spinless alpha particle in the fully ionized
state, $g_{HeIII} = 1$, the electron spin in the once-ionized helium hydrogenic
ground state, $g_{HeII} = 2$, and the two electrons in the singlet $^1S_0$ ground state
of neutral helium, $g_{HeI} = 1$. The partition functions can be assumed to be
temperature independent. The binding energies are $B_{HeI} = 24.6$ eV and
$B_{HeII} = 54.4$ eV. When $c_{HeI}^{-1}$ and $c_{HeII}^{-1}$ are very small, helium is fully
recombined and the hydrogen-only Saha equation is adequate.
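The coupled hydrogen and helium Saha system of eqs. (67)–(68) can be solved for $Y_e$ as a one-dimensional root-finding problem. The sketch below deliberately swaps the simple iteration mentioned in the text for bisection on $Y_e$, which is guaranteed to converge because the residual of the first line of eq. (67) decreases monotonically with $Y_e$. It works with the inverse coefficients $1/c$ to avoid overflow. The cosmological defaults (CMB temperature, baryon density coefficient, $\Omega_B h^2$, and the hydrogen and helium abundances per baryon) are assumed illustrative values.

```python
import math

M_E_C2_EV = 510998.95
HBAR_C_EV_CM = 1.973269804e-5
K_B_EV = 8.617333e-5
B_HI, B_HEI, B_HEII = 13.6, 24.6, 54.4   # binding energies in eV


def inverse_saha_coeffs(z, omega_b_h2, t_cmb0):
    """Return (1/c_HI, 1/c_HeI, 1/c_HeII) of eqs. (65), (68) with T_e = T_gamma."""
    t_ev = K_B_EV * t_cmb0 * (1.0 + z)
    n_b = 1.123e-5 * omega_b_h2 * (1.0 + z) ** 3   # assumed baryon density, cm^-3
    phase = (M_E_C2_EV * t_ev / (2.0 * math.pi * HBAR_C_EV_CM**2)) ** 1.5 / n_b
    s_hi = phase * math.exp(-B_HI / t_ev)           # g_HI/(g_e g_p) = 1
    s_hei = 4.0 * phase * math.exp(-B_HEI / t_ev)   # g_HeI/(g_e g_HeII) = 1/4
    s_heii = phase * math.exp(-B_HEII / t_ev)       # g_HeII/(g_e g_HeIII) = 1
    return s_hi, s_hei, s_heii


def saha_h_he(z, omega_b_h2=0.0125, y_ht=0.76, y_het=0.06, t_cmb0=2.728):
    """Return (Y_e, Y_HI, Y_HeI, Y_HeII) satisfying eq. (67)."""
    s_hi, s_hei, s_heii = inverse_saha_coeffs(z, omega_b_h2, t_cmb0)

    def abundances(y_e):
        # Algebraically identical to the last three lines of eq. (67).
        y_hi = y_ht * y_e / (y_e + s_hi)
        denom = y_e * y_e + s_hei * y_e + s_hei * s_heii
        y_hei = y_het * y_e * y_e / denom
        y_heii = y_het * s_hei * y_e / denom
        return y_hi, y_hei, y_heii

    def residual(y_e):
        y_hi, y_hei, y_heii = abundances(y_e)
        return y_ht + 2.0 * y_het - (y_hi + 2.0 * y_hei + y_heii) - y_e

    lo, hi = 1e-30, y_ht + 2.0 * y_het
    for _ in range(200):                 # bisection on the electron abundance
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    y_e = 0.5 * (lo + hi)
    return (y_e,) + abundances(y_e)
```

With these assumed abundances, calling `saha_h_he(z)` at z = 10000, 3000 and 1500 steps through fully ionized helium, singly ionized helium, and neutral helium with hydrogen still mostly ionized.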
Figure 3: (a) Evolution of the ionization fraction. Effect of varying $\Omega_B$, $\Omega_{nr}$, h.
(b) Differential visibility functions $de^{-\zeta_C}/d\ln\bar a$ for standard recombination
(concentrated around z ≈ 1000, rather like a Gaussian in τ) and for “no
recombination”. (c) Closeup of (b), the $Y_e \sim a^{-p}$ power, and on the extreme left
the relative difference between the electron and photon temperatures amplified
by 10.
3.4.2 Visibility and decoupling
The visibility of the Universe to Thomson scattering is defined by $e^{-\zeta_C}$ and
the differential visibility by $V_C \equiv de^{-\zeta_C}/d\tau = e^{-\zeta_C}/\tau_C$, where $\tau_C^{-1} \equiv \bar a\,\bar n_e\sigma_T$.
Figure 3(b) shows $V_C/(H\bar a)$ for the universes of (a); a closeup of a subset of the
models is shown in (c). For normal recombination the differential visibility is
sharply peaked, only weakly dependent on cosmological parameters. Although
the distribution is somewhat skew, a Gaussian fit is not a bad approximation.
We define the conformal time of decoupling $\tau_{dec}$ to be where $V_C$ has a peak and
the width of decoupling, $R_{V_C,dec}$, to be the fwhm of $V_C$ times a factor 0.425,
which turns the fwhm into a dispersion for a Gaussian. The corresponding
expansion factors are $\bar a_{dec}$ and $\sigma_{a,Cdec}$, related by
$$\tau_{dec} = 190\,\Omega_{nr}^{-1/2}h^{-1}\,{\rm Mpc}\ (10^3 a_{dec})^{1/2}\left[\left(1+\frac{a_{eq}}{a_{dec}}\right)^{1/2} - \left(\frac{a_{eq}}{a_{dec}}\right)^{1/2}\right],$$
$$R_{V_C,dec} = 9.5\,(10\,\sigma_{a,Cdec})\,\Omega_{nr}^{-1/2}h^{-1}\,{\rm Mpc}\ \frac{(10^3 a_{dec})^{1/2}}{(1 + a_{eq}/a_{dec})^{1/2}} \approx \tfrac{1}{2}\sigma_{a,dec}\tau_{dec},$$
$$a_{eq} \equiv \frac{\Omega_{er}}{\Omega_{nr}} \approx [24200\,\Omega_{nr}h^2]^{-1}, \qquad {\rm CDM:}\ 0.06 \lesssim \sigma_{a,Cdec} \lesssim 0.1. \qquad (69)$$
The $a_{eq}/a_{dec} \sim [6\Omega_{nr}(2h)^2]^{-1}$ corrections usually cannot be ignored. For normal
recombination CDM-dominated universes, the $\sigma_{a,dec}$ range, as measured
from the fwhm of the fig. 3(b) curves, implies the Gaussian width $R_{V_C,dec}$ is
only about 0.03–0.05 of the horizon size at decoupling. The last scattering
region is therefore quite thin, with typical (comoving) Gaussian width
$\sim (5-10)\,\Omega_{nr}^{-1/2}h^{-1}$ Mpc. The Gaussian approximation,
$$V_C \equiv \frac{de^{-\zeta_C}}{d\tau} \approx \frac{\exp[-(\tau-\tau_{dec})^2/(2R_{V_C,dec}^2)]}{(2\pi R_{V_C,dec}^2)^{1/2}} \qquad (70)$$
is not bad for these cases, and is nice for analytic purposes ([2], section 5.2).
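The decoupling time and width formulas of eq. (69) are simple enough to evaluate directly. The following is a minimal sketch with illustrative function names; it returns lengths in Mpc (not $h^{-1}$ Mpc), and the example inputs $a_{dec} = 10^{-3}$ and $\sigma_{a,Cdec} = 0.08$ are assumed values inside the ranges quoted above.

```python
import math

# Converts a Gaussian fwhm into a dispersion: 1/(2*sqrt(2 ln 2)), about 0.425.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def a_eq(omega_nr, h):
    """Expansion factor at matter-radiation equality, eq. (69)."""
    return 1.0 / (24200.0 * omega_nr * h**2)


def tau_dec_mpc(a_dec, omega_nr, h):
    """Conformal time of decoupling in Mpc, first line of eq. (69)."""
    r = a_eq(omega_nr, h) / a_dec
    return (190.0 / (math.sqrt(omega_nr) * h) * math.sqrt(1.0e3 * a_dec)
            * (math.sqrt(1.0 + r) - math.sqrt(r)))


def width_dec_mpc(a_dec, sigma_a, omega_nr, h):
    """Gaussian width of the last scattering region in Mpc, second line of eq. (69)."""
    r = a_eq(omega_nr, h) / a_dec
    return (9.5 * (10.0 * sigma_a) / (math.sqrt(omega_nr) * h)
            * math.sqrt(1.0e3 * a_dec) / math.sqrt(1.0 + r))


tau = tau_dec_mpc(1.0e-3, omega_nr=1.0, h=0.5)
width = width_dec_mpc(1.0e-3, sigma_a=0.08, omega_nr=1.0, h=0.5)
print(tau, width, width / tau)
```

For these inputs the width is a few percent of $\tau_{dec}$, somewhat above $\sigma_a/2$ because the $a_{eq}/a_{dec}$ correction is not negligible.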
Figure 3(c) shows how the instantaneous power law scaling $p_{e,dec} \equiv -d\ln Y_e/d\ln a$
varies with redshift. Around decoupling $p_{e,dec} \sim 10$ is typical. The Compton
scattering time is related to the Hubble time at decoupling by
$$V_C\ {\rm max:}\quad n_e\sigma_T/H = (p_{e,dec} + 2) \quad {\rm defines}\ \bar a_{dec},\ \tau_{dec},$$
$$\frac{V_C}{(H\bar a)}\ {\rm max:}\quad n_e\sigma_T/H = (p_{e,dec} + 2 - q_{dec}),$$
$$q_{dec} = \frac{1}{2}\,\frac{(1 + 2(a_{eq}/a_{dec}))}{(1 + (a_{eq}/a_{dec}))}. \qquad (71)$$
We could also define the decoupling redshift when $V_C/(H\bar a)$ has a maximum:
this occurs slightly later than that determined by $V_C$. Here $q_{dec}$ is the value of
the deceleration parameter at decoupling. The Compton time is therefore only
∼ 5% of the “horizon” size at decoupling. In section 5.1.2, we shall see that a
local measure of the width of the visibility at time τ is useful to characterize
the damping of anisotropies associated with the fuzziness of the last scattering
surface:
$$R_{V_C}(\tau) \equiv \left[-\frac{\partial^2\ln V_C}{\partial\bar\chi^2}\right]^{-1/2} = \frac{(\bar H\bar a)^{-1}}{\left[(p_e+2)\left(-q + \dfrac{\bar n_e\sigma_T}{\bar H} + \dfrac{d\ln(p_e+2)}{d\ln a}\right)\right]^{1/2}}.$$
This also gives $\sigma_{a,C} \approx \bar H\bar a R_{V_C}$; in particular, if we substitute
$n_e\sigma_T/H = p_{e,dec}+2$ in this and note from the figure that $d\ln(p_e+2)/d\ln a$ is
typically 2 or 3, we get $0.06 \lesssim \sigma_{a,Cdec} \lesssim 0.1$ for $7 \lesssim p_{e,dec} \lesssim 12$, in good accord with
the fwhm estimates. This expression also shows that $R_{V_C,dec} \approx \tau_{C,dec}$. (The
time-dependent $R_{V_C}(\tau)$ expression must eventually break down, once $n_e\sigma_T/H$
drops below the deceleration parameter q.)
Figure 3(b) shows the dramatic effect of early reionization on the visibility.
For full ionization ($p_{e,dec} = 0$), the redshift at which $V_C/(H\bar a)$ peaks is exactly
where the optical depth to us is unity,
$$z_{\zeta_C=1} \approx 10^{2.1}\left(\frac{\Omega_B h}{0.02}\right)^{-2/3}\Omega_{nr}^{1/3}. \qquad (72)$$
The redshift $z_{dec}$ at which $V_C$ peaks is about 20% larger: by eq. (71) with
$p_{e,dec} = 0$ and $q_{dec} = 1/2$, the $V_C$ peak requires $n_e\sigma_T/H = 2$ rather than 3/2,
and $n_e\sigma_T/H \propto (1+z)^{3/2}$. Figure 3 shows the
Gaussian approximation is not very good (the half power points in τ are
at $0.75\tau_{dec}$ and $1.5\tau_{dec}$, with “Gaussian” width of $0.32\tau_{dec}$). For the typical
$\Omega_{nr} = 1$, $\Omega_B = 0.05$ dark-matter dominated Universes, $z_{dec} \approx 130$ and
$R_{V_C,dec} \approx 170\,h^{-1}$ Mpc, but for the $\Omega_B = \Omega_{nr} = 0.1$ universe in fig. 3, whether open or
vacuum-dominated (to make Ω = 1), the decoupling redshift is pushed dramatically
forward, to $z_{dec} \approx 28$ and $R_{V_C,dec} \approx 360\,h^{-1}$ Mpc.
3.5 Reionization of the universe

Erasure of CMB temperature anisotropies is dramatic if re-ionization occurs
earlier than the minimum redshift required to make the optical depth to us
unity, eq. (72). Although this seems unlikely in CDM dominated models [134],
it is reasonable to expect a 10% effect on $\Delta T/T \propto e^{-\zeta_C}$ even if re-ionization
occurs as late as $z_{\zeta_C=1}/3$, say, since $\zeta_C = [(1+z)/(1+z_{\zeta_C=1})]^{3/2}$.
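The optical depth scaling just quoted, together with eq. (72), gives a quick estimate of how much a given reionization redshift damps small-angle anisotropies. The sketch below is illustrative only: the function names are hypothetical, full ionization back to the reionization redshift and a matter-dominated expansion are assumed, and the example parameters are not taken from the text.

```python
import math


def z_unit_depth(omega_b, h, omega_nr):
    """Redshift of unit Thomson optical depth for full ionization, eq. (72)."""
    return 10**2.1 * (omega_b * h / 0.02) ** (-2.0 / 3.0) * omega_nr ** (1.0 / 3.0)


def optical_depth(z_reion, z_unit):
    """zeta_C = [(1+z)/(1+z_unit)]^(3/2) for reionization at z_reion."""
    return ((1.0 + z_reion) / (1.0 + z_unit)) ** 1.5


def damping_factor(z_reion, z_unit):
    """Suppression exp(-zeta_C) of Delta T/T."""
    return math.exp(-optical_depth(z_reion, z_unit))


z1 = z_unit_depth(omega_b=0.05, h=0.5, omega_nr=1.0)
print(z1, damping_factor(z1 / 3.0, z1))
```

For these assumed parameters, reionization at one third of the unit-depth redshift still leaves a suppression of order ten to twenty percent.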
The Gunn–Peterson test shows that the cumulative optical depth to Lyman
alpha radiation back to the most distant quasars at z ∼ 3 is less than 0.05,
implying the universe is extremely highly ionized, with neutral hydrogen fraction
$Y_H \lesssim 10^{-6}$. Quasars, which contribute a significant amount of this ionizing
flux, are expected to have formed too late to have had much influence on CMB
anisotropies. An early population of massive stars or more exotic sources such
as decaying Big Bang relic particles with a radiative channel could reionize early
enough. In [66], we estimated the fraction of the closure density in massive stars
of various types required for re-ionization to occur via the overlapping of the
HII regions they generate. We found that to reionize by $z_{\zeta_C=1}$ requires a cosmic
abundance of ionizing stars
$$\Omega_* = K\,10^{-6}\left(\frac{\Omega_B h}{0.02}\right)^{0.8}(1+\bar\delta_{gas})^{1.5}, \qquad (73)$$
where K is a factor depending upon the type of stars: for stars with mass
$\sim 30\,M_\odot$, K ≈ 30 if they have Population III abundances (i.e., with essentially
no heavy elements) and is somewhat higher if there is Population II metallicity,
while for the limiting case of Very Massive Objects (mass $\gtrsim 100\,M_\odot$), K ≈ 1
for Population III abundances and K ≈ 5 for Population II abundances. $\Omega_*$
depends upon the overdensity of the gas relative to the background, $1+\delta_{gas}$,
i.e., the clumpiness factor. $\Omega_*$ is lowest if the gas is unclumped, but the gas
in the neighborhood of the stars will be overdense and the HII region would
first have to break out of this gas before entering into the $\delta_{gas} \approx 0$ background
medium. It is therefore unclear what to take for the average $\bar\delta_{gas}$ entering
eq. (73), and thus how much larger a fraction than $10^{-6}$ in massive stars is
required.
To assess whether it is plausible that such relatively large fractions of the
universe can have gone into massive stars by $z_{\zeta_C=1}$, we use the Press–Schechter
formula [67] for the fraction of the baryons that would be in collapsed objects
by redshift z, $\Omega_{Bcoll}(z) = \Omega_B\,{\rm erfc}(\nu_{coll}/\sqrt{2})$, where $\nu_{coll}(z) \equiv f_{coll}(1+z)/\sigma_{\rho B}$.
Here the factor $f_{coll} \approx 1.686$ is the average linear density fluctuation within
a sphere needed for that sphere to have collapsed to infinite density when
nonlinearities are included. $\sigma_{\rho B}(z)$ denotes the rms level of the gas density
fluctuations at redshift z. (For rare events, i.e., high $\nu_{coll}$, we have
$\Omega_{Bcoll} \approx \Omega_B(2/\pi)^{1/2}\nu_{coll}^{-1}e^{-\nu_{coll}^2/2}$; the better physically-motivated “peak-patch picture”
[68] based on collapse about peaks in the linear density field yields similar
results.)
There is a natural filter $\sim 1\,h^{-1}$ kpc for the gas associated with the Jeans
mass at recombination. In [134], we showed that $\sigma_\rho$ on this scale is typically
$\approx 20\sigma_8/(1+z)$ for initially scale invariant $\Omega_{nr} = 1$ CDM-dominated models with
h = 0.5 and about the same for initially scale invariant nonzero Λ models with
h ≈ 0.75 and $\Omega_{nr} \approx 0.3$. Here $\sigma_8$ denotes the rms linear density fluctuations
on cluster scales at the current time. For CDM and the nonzero Λ models,
we have $\nu_{coll}(z_{\zeta_C=1})$ ranging from about 7 to 10, hence $\Omega_{Bcoll}(z_{\zeta_C=1})$ is very
tiny indeed. However, by $z_{\zeta_C=1}/3$ it would have grown to a number which
can exceed $\Omega_*$. Thus, although we concluded in [134] that the drastic case of
extreme damping of small angle CMB anisotropies was unlikely unless there
were an extremely high efficiency of massive star formation from collapsed gas,
it is quite conceivable that there will be some small effect from the earliest
generation of stars on the anisotropies provided there is a reasonable amount
of “short-distance power” in the density fluctuation spectrum.
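The comparison between the collapsed baryon fraction and the required stellar abundance of eq. (73) can be made concrete with the Press–Schechter expression quoted above. This is a minimal sketch with hypothetical function names; it takes $\sigma_\rho \approx 20\sigma_8/(1+z)$ on the Jeans filter scale as stated in the text, and the choices $\sigma_8 = 1$, $\Omega_B = 0.05$, h = 0.5, K = 30 and unclumped gas are assumptions for illustration.

```python
import math

F_COLL = 1.686   # linear overdensity threshold for collapse


def nu_coll(z, sigma8, boost=20.0):
    """nu = f_coll (1+z) / (boost * sigma8), using sigma_rho ~ 20 sigma8/(1+z)."""
    return F_COLL * (1.0 + z) / (boost * sigma8)


def collapsed_fraction(nu):
    """Press-Schechter fraction Omega_Bcoll/Omega_B = erfc(nu/sqrt(2))."""
    return math.erfc(nu / math.sqrt(2.0))


def collapsed_fraction_rare(nu):
    """High-nu asymptotic form (2/pi)^(1/2) exp(-nu^2/2) / nu."""
    return math.sqrt(2.0 / math.pi) / nu * math.exp(-0.5 * nu * nu)


def omega_star(k_factor, omega_b, h, delta_gas=0.0):
    """Required abundance of ionizing stars, eq. (73)."""
    return k_factor * 1.0e-6 * (omega_b * h / 0.02) ** 0.8 * (1.0 + delta_gas) ** 1.5


omega_b = 0.05
for z in (100.0, 33.0):
    nu = nu_coll(z, sigma8=1.0)
    print(z, nu, omega_b * collapsed_fraction(nu), omega_star(30.0, omega_b, 0.5))
```

At z near 100 the collapsed fraction is negligible compared with the required $\Omega_*$, while by a third of that redshift it exceeds it, which is the qualitative point made in the text.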
What complicates this enormously is that the entities which form may well
be rather fragile with a small binding energy, easily disrupted by the massive
stars they generate. But it is also possible that the amount of nonlinear gas
could be amplified by the explosion of such stars sweeping up shells of gas far
from the parent object. It is difficult to argue definitively either way and this
issue of efficiency and amplification or suppression will likely remain a subject
of uncertainty in interpretation of CMB anisotropies for a long time to come.
For recent discussions of the issues involved in reionization see [69, 70].
Although the influence of early reionization on inflation-based CDM models
and models with nonzero Λ is ambiguous, the situation seems clearer in other
models. In isocurvature baryon models with (nearly) white noise initial con-
ditions popular in the late seventies [71], the first objects collapse at z ∼ 300,
making reionization easy, and, indeed, expected. Similarly, in models in which
there are isocurvature seeds, such as in texture models, one also expects early
ionization to be quite plausible, although by no means certain.
If there is no recombination, there is a constraint from the y-distortion on
how early energy can be injected:
$$z_{max,reh} \approx 10^{3.8}\left(\frac{\Omega_B h}{0.02}\right)^{-2/3}\Omega_{nr}^{1/3}. \qquad (74)$$
This is a result from Zeldovich and Sunyaev [34], revisited by Bartlett and
Stebbins [72], which I modified to take into account the FIRAS limit [12]. This
limit can be avoided if the cooling electrons can be kept at nearly the CMB
temperature. In any case, it is no limitation for the low
$\Omega_B$ favored by standard Big Bang nucleosynthesis [73].
3.6 Post-recombination energy sources

After recombination, we expect energy release to accompany the formation of
nonlinear cosmic structure as stars, black holes etc. form. Although the limits
on this release in the CMB region are now very stringent, they are not as strong
in the optical and near infrared. I now survey a number of sources that would
be expected to contribute to a background, choosing normalization parameters
to be relatively conservative. Even so they are not far off the FIRAS bound
(eq. 9), $< 2.5\times10^{-4}$ from 500–5000 µm, a useful limit to bear in mind
when considering the following energy source formulas. On the other hand,
there is a tentative identification of a sub-mm background in the FIRAS data
[49] in the range ∼ 200–1000 µm, with energy $\delta E/E_{cmb} \sim 10^{-3}$ longward
of ∼ 400 µm, which partly mimics the Galactic contribution (and could be
partly due to cold high latitude Galactic dust [48]). There are also residuals
after source subtractions in the DIRBE data which could be interpreted as a
cosmological infrared background at shorter (∼ 1–200 µm) wavelengths at the
$\delta E/E_{cmb} \sim 10^{-2}$ level [79, 80]. These are shown in fig. 4.
We first consider an exotic source before the more prosaic ones we know
must exist at some level. Decaying (cold) particles with a radiative channel
$X \to X' + \gamma$ having a branching ratio $B_{X\gamma}$ contribute a relative energy
$$\frac{E_{decay}}{E_{cmb}} \sim 0.02\,B_{X\gamma}\,\Omega_{X,i}h^2\,\frac{10^6}{(1+z_{dec})} \qquad (75)$$
to the Universe, where $\Omega_{X,i}$ is the initial density parameter of the cold particles
which are destined to decay, which may easily be in excess of unity; e.g., for
keV neutrinos it is 40. $z_{dec}$ is the decay redshift, when the lifetime equals
the Hubble time. In cases like this, $z_{dec} > z_{Pl}$ unless the branching ratio is
tiny, i.e., with a lifetime shorter than a month. And if a considerable fraction
of the CMB energy were created this way, the success of standard Big Bang
nucleosynthesis would come into jeopardy. If the particle has a longer lifetime
and if there is dust to reprocess the radiation into the sub-mm band probed
by FIRAS, the constraints on the branching ratio are quite severe; if there is
no dust so the decay radiation is just redshifted, then it would lie at shorter
wavelengths where the bounds are much less stringent.
The nuclear energy output of stars with efficiency $\epsilon_{nuc}$ radiating at redshift
$z_*$ with an abundance $\Omega_*$ relative to the CMB is
$$\frac{E_*}{E_{cmb}} \approx 0.03\,\frac{\Omega_*h^2}{0.001}\,\frac{5}{(1+z_*)}\,\frac{\epsilon_{nuc}}{0.004}. \qquad (76)$$
Massive stars have an efficiency which is not much less than the maximum
value of 0.004 for Very Massive Objects [66], those with mass $> 100\,M_\odot$. The
radiant energy release from stars which eject a mass Zej M in metals when they
undergo supernova explosions is limited by the metal fraction Z they contribute
to a gas of density Ωgas ,
0.5
EpreSN ∗ Z Ωgas h2 Zej M
≈ 0.0008 −3 . (77)
Ecmb 10 0.01 0.2 20 M
Radiation generated by mass accreting onto black holes with an efficiency $\epsilon_{acc}$,
typically taken to be about 0.1 for quasar models, delivers energy
$$\frac{E_{BHacc}}{E_{cmb}} \sim 0.0008\,\frac{\Omega_{BHacc}h^2}{10^{-6}}\,\frac{5}{(1+z_{acc})}\,\frac{\epsilon_{acc}}{0.1}. \qquad (78)$$
We might reasonably expect that $\Omega_*\epsilon_{nuc}$, $Z\Omega_{gas}$ and $\Omega_{BHacc}\epsilon_{acc}$ would be
larger than the normalizations indicate and so they would be in conflict with
the FIRAS limit, eq. (9), if that radiation were to find its way to the sub-mm. In
particular, the prospect of ($\sim 10^2$–$10^5\,M_\odot$) VMO remnant black holes forming
a considerable component of the dark matter is ruled out if the unavoidable
thermonuclear energy release prior to collapse passed through pregalactic dust
or through dusty galaxies.
Although there is a contribution from the gravitational energy released
during the collapse of various structures in the universe in all wavebands, it
is typically smaller than that from other sources. Letting $\Omega_{B,coll}f_{cool}$ be the
density of baryons which have cooled in a potential well characterized by the
three-dimensional virial velocity dispersion $v_T$ which formed at redshift $z_{coll}$
and taking the average over all collapsed structures, we get an energy release
$$\frac{E_{formation}}{E_{cmb}} \sim 0.0002\,\frac{\Omega_{B,coll}h^2 f_{cool}}{10^{-3}(1+z_{coll})}\left(\frac{v_T}{1000\ {\rm km\,s^{-1}}}\right)^2. \qquad (79)$$
Taking typical parameters for gas that has cooled in forming galaxies gives a
value about a factor of ten lower than the prefactor.
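The scaling relations (75), (76), (78) and (79) can be collected into small helper functions to compare candidate sources against the FIRAS bound quoted from eq. (9). This is only a sketch of the order-of-magnitude formulas above; the function and argument names are illustrative, and eq. (77) is omitted.

```python
FIRAS_SUBMM_BOUND = 2.5e-4   # eq. (9) bound on delta E / E_cmb quoted in the text


def decay_energy(branching, omega_x_h2, z_dec):
    """Eq. (75): radiative decay of cold relic particles."""
    return 0.02 * branching * omega_x_h2 * 1.0e6 / (1.0 + z_dec)


def stellar_energy(omega_star_h2, z_star, eff_nuc=0.004):
    """Eq. (76): nuclear energy output of stars."""
    return 0.03 * (omega_star_h2 / 1.0e-3) * (5.0 / (1.0 + z_star)) * (eff_nuc / 0.004)


def accretion_energy(omega_bh_h2, z_acc, eff_acc=0.1):
    """Eq. (78): accretion onto black holes."""
    return 0.0008 * (omega_bh_h2 / 1.0e-6) * (5.0 / (1.0 + z_acc)) * (eff_acc / 0.1)


def formation_energy(omega_coll_h2, f_cool, z_coll, v_t_kms):
    """Eq. (79): gravitational energy released in cooled, collapsed gas."""
    return (0.0002 * (omega_coll_h2 * f_cool / 1.0e-3) / (1.0 + z_coll)
            * (v_t_kms / 1000.0) ** 2)


for name, value in [
    ("stars", stellar_energy(1.0e-3, z_star=4.0)),
    ("accretion", accretion_energy(1.0e-6, z_acc=4.0)),
    ("formation", formation_energy(1.0e-3, 1.0, z_coll=0.0, v_t_kms=1000.0)),
]:
    print(name, value, value > FIRAS_SUBMM_BOUND)
```

At the fiducial normalizations each function returns the prefactor of its equation, which makes it easy to see which sources would exceed the bound if all of their radiation were reprocessed into the sub-mm.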
The FIRAS limit on the y-distortion does place a powerful constraint on
how effective explosions could have been in generating cosmic structure. As
Ikeuchi and Ostriker emphasized (e.g., [74]), a predominantly hydrodynamic
explanation for cosmic structure development is a perfectly reasonable extrap-
olation of known behavior in the interstellar medium to the pregalactic medium.
In [2], I gave a conservative lower estimate of the amount of Compton cooling
that would have accompanied the explosive formation of bubbles of radius $R_{exp}$
with filling factor $f_{exp}$ by equating the thermal energy to the minimum energy
per baryon required to scour out a bubble of size $R_{exp}$ at redshift $z_{exp}$:
$$\frac{E_{Compton\ cool}}{E_{cmb}} \sim u\,10^{-3}\,f_{exp}\,\Omega_B h\,\Omega_{nr}^{1/2}\left(\frac{R_{exp}}{20\,h^{-1}\,{\rm Mpc}}\right)^2, \qquad (80)$$
with u ∼ 1/2. Chris Thompson [78] gave a more refined derivation and got
the same functional form with prefactors u ranging from 1/3 to 1, assuming
that the electrons would be much cooler than the ions. The $z_{exp}$ dependence is
weak for redshifts $\gtrsim 10$ when Compton cooling dominates, and $\sim (1+z_{exp})^{5/2}$
below. Thus the FIRAS limit of $6\times10^{-5}$ very strongly constrains the scale
$R_{exp}$ and/or the filling factor $f_{exp}$. If supernova explosions were responsible
for energy injection, one expects that the presupernova light radiated would
exceed the explosive energy by a factor of more than 100, which would
lead to even stronger restrictions on the model; and if the supernova debris is
metal-enriched, the allowed amount of metals also poses a strong constraint.2
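Equation (80) can be inverted to show what bubble radius the FIRAS limit allows for a given filling factor. The sketch below is a direct rearrangement of the scaling relation in its weakly redshift-dependent regime; the function names and the example cosmological parameters are assumptions for illustration.

```python
import math


def compton_cooling_energy(r_exp_hmpc, f_exp, omega_b, h, omega_nr, u=0.5):
    """Eq. (80): relative energy from Compton cooling; radius in h^-1 Mpc."""
    return (u * 1.0e-3 * f_exp * omega_b * h * math.sqrt(omega_nr)
            * (r_exp_hmpc / 20.0) ** 2)


def max_bubble_radius(limit, f_exp, omega_b, h, omega_nr, u=0.5):
    """Largest R_exp (h^-1 Mpc) compatible with a bound on delta E / E_cmb."""
    per_unit = u * 1.0e-3 * f_exp * omega_b * h * math.sqrt(omega_nr)
    return 20.0 * math.sqrt(limit / per_unit)


r_max = max_bubble_radius(6.0e-5, f_exp=1.0, omega_b=0.05, h=0.5, omega_nr=1.0)
print(r_max, compton_cooling_energy(r_max, 1.0, 0.05, 0.5, 1.0))
```

Because the energy scales as $f_{exp}R_{exp}^2$, the allowed radius grows only as the inverse square root of the filling factor and of $\Omega_B h$.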
Another victim of the powerful FIRAS y-distortion limit was the superconducting
cosmic string model of structure formation, in which the strings would
radiate magnetohydrodynamic and (damped) extremely low frequency waves
that would heat the medium, giving a picture of structure formation similar in
spirit to the explosion model, but with a more exotic energy source. Thompson
has estimated $\delta E/E_{cmb} \sim 10^{-2.2}\,(G_N\mu/10^{-6})$ for $\Omega_B(2h)^2 = 0.1$, where $G_N$ is
Newton’s constant and µ is the string tension. This is much too large for the
range of µ needed to make the model viable.
Figure 4(a) compares current DIRBE constraints with sample theoretical
models having spectra peaking in the near-infrared [81], examples of the energy
releases discussed above: metal generating stars which generate $\Omega_{metals} = 10^{-4}$
at z = 100 and at z = 9 (two solid lines), Eddington-limited accreting black
holes (AGN precursors) at z = 9 with $\Omega_{bh} = 10^{-5}$, VMOs with abundance
$\Omega_{VMO} = 0.05$ at z = 100 (dashed), that make $\Omega_{bh}$ = 0.025–0.05. The curves
scale up linearly with $\Omega_{metals}$, $\Omega_{bh}$, or $\Omega_{VMO}$, and with $(2h)^2$. (All models
shown have Ω = 1 and h = 0.5.) With increasing 1 + z of formation, there is a
linear increase to longer wavelength and a linear drop in the amplitude of the
curves, as in the transition between the 2 solid curves.

2 The limits from anisotropy are not as strong: the packed shell model above gives
anisotropies at the few times $10^{-5}$ level, but the expectation was that in the early fireball
development phase, the hot gas would create large anisotropies [2, 75], although if
$T_e \ll T_{ion}$ Thompson suggests these can be avoided.

Table 1: Sample dust emission models

Model:               M8       M11      M14       M13      pk
Ω_dust               10^-6    10^-5    10^-5     10^-6    ∼ 10^-6
δE_inj/E_cmb         0.04     0.004    0.004     0.004    0.01S
Mode                 burst    burst    steep Ė   flat Ė   burst
z_gf                 5        9        5         9        6–4
δE_NIR/δE_inj        0.82     0.01     0.14      0.81     –

Figure 4: (a) The intensity levels $\nu I_\nu$ in units of the total CMB
intensity $I_{cmb} = 10^{-3}$ erg cm$^{-2}$ s$^{-1}$ sr$^{-1}$ for a variety of near-IR and far-IR
models of energy generation associated with galaxy formation are compared
with current limits and potential measurement levels. The straight heavy line
in the sub-mm shows the current FIRAS constraint on spectral distortions of
the CMB, the light upper lines show the 1990 announcement limit, and the
improvement one year later (using Baade window observations). The upper
heavy line shows the COBRA limit. Typical optical and UV limits are denoted
by daggers, IRAS measurements are solid squares. Open circles are DIRBE’s
“dark sky” values, hence upper limits to an infrared background, heavy error
bars give the estimated DIRBE range of residuals at high Galactic latitude after
removing “foreground” sources, and the open squares denote the sensitivities
DIRBE could in principle have gotten to with perfect source removal in 1
field-of-view after 1 year of integration. Inverted triangles are limits from the FIRAS
HF channels. (b) A closeup in the sub-mm. The solid circle data points are
positive FIRAS residuals; the open data points are absolute values of negative
FIRAS residuals. They are bounded by the solid line. If a spectrum mimics
Galactic emission then it is not as strongly constrained. The solid line above
GPole/4 is the FIRAS determination of the Galactic Pole emission; the lower one is
1/4 of this. The heavy large-dashed curves in both panels denote the tentative
sub-mm background suggested for the FIRAS data (Puget et al. 1996). The
upper dotted curve is the CMB blackbody, next is the perturbed CMB with
fixed δT = 0.004, and last is the CMB times 0.00025. The solid curves below
these are BCH2 primeval galaxy models of dust emission.
If intervening dust is present, these curves will have the same underlying
energy but be shifted, at least partially, into the sub-mm. Figure 4(a) shows
the FIRAS bound given as a fraction of the CMB peak, while fig. 4(b) gives a
closeup showing what freedom there really is: the most powerful FIRAS
bound was derived by modelling the Galactic emission, so if cosmological sources
can mimic that Galactic emission they are not as strongly constrained (e.g.,
[49]).
For the sub-mm theoretical curves shown, corresponding to the BCH2 [42]
models listed in table 1, “normal” Galactic dust is assumed, with far infrared
opacity index $\alpha_d = 1.5$, similar to the value derived for single temperature
dust from the COBE observation. ($\alpha_d = 2$ is better motivated theoretically
in this range, and, with a cold component added, by the COBE observations,
section 3.2.7.) The dust abundance is $\Omega_d$ ($\sim 10^{-5}$ corresponds roughly to
a Pop I abundance of dust in bright galaxies). The radiative energy input
relative to that in the CMB is $\delta E_{\rm inj}/E_{\rm cmb}$, of which a fraction $\delta E_{\rm NIR}/\delta E_{\rm inj}$
is not absorbed by the dust, and just redshifts to appear as a near infrared
background. This fraction depends upon the dust distribution. The peak of
emission also depends upon how clumped the dust is; BCH2 used homogeneous
models which have maximally cool dust, hence somewhat bigger emission in the
FIRAS bands than the hotter compact dust of starbursts. A peak-patch model
[68] for starbursting galaxies [82], including normal and dwarf contributions,
radiating with a dust temperature Td = 30 K that form according to a σ8 = 0.7
CDM model, but which were not allowed to burst below redshift 4, is also given
in the table, since maps based upon it are shown in fig. 15: the parameter S is
variable, but should apparently be less than 0.1 to satisfy the COBE bounds
of [12]; S = 1 would have all the normal galaxies that formed pass through a
phase at birth during which their luminosity output was at the Arp 220 level –
Arp 220 being the canonical strong starburst example. Wright et al. [58] and
De Zotti et al. [83] give other versions of the constraints on dust emission from
high (and low) redshift galaxies that can be derived from FIRAS.
4 Phenomenology of CMB anisotropy

Generally many sources will contribute to the CMB anisotropy pattern. Now
that fluctuations in the temperature have been discovered, the challenge is
to design experiments that can separate the many components that will be
present, in particular, the cosmological signals from those that are merely
Galactic or conventionally extragalactic (e.g., radio galaxies). Ultimately, it
will probably require a sophisticated combination of spectral and angular in-
formation, and cross-correlation with other datasets, such as X-ray and HI
maps. With enough frequency bands covered, the prospects for separation on
the basis of spectrum alone are not bad. Figure 5 draws together the spectral
signatures of the different sources of anisotropy that are likely to appear, us-
ing eqs. (32), (39), (41), (48). Although the different signals are gratifyingly
different, many parameters must be fit.
The angular patterns could also be used, for example to get rid of point
sources. Of course, this can be dangerous since what we are trying to discover is
the angular pattern in the background. We now turn to measures of this angular
pattern, with special emphasis on the power spectrum as a way of codifying the
contribution of different angular scales to the anisotropies for different cosmic
signals. However, the patterns may be non-Gaussian, especially for secondary
anisotropies, and so how the power is concentrated in hot and cold spots defines
a crucial aspect of the distribution.
4.1 Statistical measures of the radiation pattern: $C(\theta)$, $C_\ell$, ...

To relate observations of anisotropy to theory, statistical measures quite fa-
miliar from their application to the galaxy distribution have been widely used.
Denote the radiation pattern as measured here and now by the two-dimensional
random field ∆T (q̂), where −q̂ = (θ, φ) is the unit direction vector on the sky
(and q̂ is the direction the photons are travelling in). The correlation function
is
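\[
C(\theta) \equiv \left\langle \frac{\Delta T}{T_c}(\hat q)\, \frac{\Delta T}{T_c}(\hat q') \right\rangle , \qquad \cos(\theta) = \hat q \cdot \hat q' , \tag{81}
\]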
where Tc is the background temperature (the monopole). In theoretical treat-
ments, a probability functional describes the distribution of the sky patterns.
Generally all N -point correlation functions are required to specify the statistical
distribution. The random field is statistically isotropic if all N -point functions
are rotationally-invariant. In particular, this implies C(θ) is only a function of
the angle between the vectors q̂ and q̂′. The theoretical correlation function is
an ensemble-average of possible skies, while experimentally C(θ) must be an
angle-averaged estimate for the patch of the sky over which the observations
have taken place. Even if there were perfect resolution and all-sky coverage, the
observed C(θ) and the theoretical C(θ) would differ. For realistic experiments,
the errors arising from both observational sources and fluctuations because the
observed patch of the sky is just one realization from the ensemble are crucial
to properly include. The latter effect is called “cosmic variance” [90].
Other analogues of 3D measures that have been applied to CMB maps
include: constructing the one-point distribution for ∆T /T as a function of
resolution scale, the analogue of “counts-in-cells”; particular aspects are the
rms fluctuation on a given resolution scale, and the skewness and kurtosis of
the distribution; the statistics of hot and cold spots (high positive and negative
excursions in the maps); the genus, etc. Many of these are rather obscured by
the intrinsic observational noise, and only full scale Monte Carlo treatments
are possible to assess how well a theory is faring.

Figure 5: The flat spectrum in thermodynamic temperature predicted for primary
anisotropies is contrasted with the spectral signatures for other sources
of anisotropy (normalized at 4 mm): SZ anisotropies (long-dashed, with a
sign change at 1300 µm); bremsstrahlung (short-dashed); synchrotron (dotted),
with index varying from $p_s = 0.5$ to 0.9; and dust (with index $\alpha_d = 2$
as indicated by a two-temperature fit to COBE), both the usual Galactic dust
at 20 K (heavy solid) and dust at 6 K and 4 K (light solid lines, which could
represent a cold Galactic component or, e.g., 30 K dust radiating at redshift
∼ 5); a shallower, less physically-motivated, $\alpha_d = 1.5$ dust opacity law for the
20 K grains is also shown, appropriate for the single-temperature COBE fit.
The frequency bands which various experiments probe are indicated. There is
a minimum of the Galactic foregrounds at about 90 GHz, the highest frequency
COBE channel.

As for the galaxy distribution on large scales, the most useful statistic is
the power spectrum, $C_\ell$, for a 2D distribution a function of multipole number $\ell$.
For CMB anisotropies, it is natural to expand the radiation pattern in spherical
harmonics $Y_{\ell m}(\theta, \phi)$:
\[
\frac{\Delta T}{T_c}(\hat q) = \sum_{\ell m} a_{\ell m} Y_{\ell m}(\hat q)
= \sum_{\ell} \Big\{ \xi_{\ell 0} Y_{\ell 0}(\theta, 0)
+ \sqrt{2} \sum_{m=1}^{\ell} Y_{\ell m}(\theta, 0) \big[ \xi_{\ell m} \cos(m\phi) - \eta_{\ell m} \sin(m\phi) \big] \Big\} , \tag{82}
\]
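with the latter splitting the complex $a_{\ell m}$ into $\ell + 1$ symmetric real components,
$\xi_{\ell m}$, and $\ell$ antisymmetric real components, $\eta_{\ell m}$, the symmetry defined by the
behavior under change of the sign of the longitude:
\[
a_{\ell 0} = \xi_{\ell 0} ; \quad \eta_{\ell 0} = 0 ; \quad a_{\ell m} = \frac{1}{\sqrt{2}} (\xi_{\ell m} + i \eta_{\ell m}) , \quad
a_{\ell, -m} = a^*_{\ell m} = \frac{1}{\sqrt{2}} (\xi_{\ell m} - i \eta_{\ell m}) , \quad m \ge 1 . \tag{83}
\]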
If the temperature pattern is statistically isotropic, then $\langle a^*_{\ell m} a_{\ell' m'} \rangle = 0$ unless
$\ell = \ell'$, $m = m'$. The nonzero components are the ensemble-averaged angular
power spectrum,
\[
\mathcal{C}_\ell \equiv \ell(\ell + 1) C_\ell / (2\pi) , \qquad C_\ell = \langle a^*_{\ell m} a_{\ell m} \rangle = \langle \xi^2_{\ell m} \rangle = \langle \eta^2_{\ell m} \rangle . \tag{84}
\]
At high $\ell$, this corresponds to the power in a logarithmic waveband $d \ln(\ell)$. The
specific $\ell(\ell + 1)$ factor is chosen because $\mathcal{C}_\ell$ is predicted to be flat at small $\ell$ for
theories with scale invariant adiabatic density perturbations [247] (section 5.1).
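In terms of a discrete “logarithmic integral”, $I(f)$, of a function $f_\ell$, defined by
\[
I(f) \equiv \sum_\ell \frac{(\ell + \frac{1}{2})}{\ell(\ell + 1)} f_\ell , \tag{85}
\]
the correlation function is given by
\[
C(\theta) = \sum_\ell \frac{2\ell + 1}{4\pi} C_\ell P_\ell(\cos(\theta)) = I(\mathcal{C}_\ell P_\ell(\cos(\theta))) . \tag{86}
\]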
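The two forms of eq. (86) can be checked against each other numerically. The following is a minimal sketch, not a production code: the toy spectrum (flat $\mathcal{C}_\ell = 10^{-10}$ up to $\ell = 30$), the function names, and the choice to start the sums at $\ell = 2$ (monopole and dipole dropped) are all assumptions made for illustration.

```python
import math

def legendre_all(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] using the standard three-term recurrence."""
    p = [1.0, x]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def log_integral(f, lmin=2):
    """Discrete 'logarithmic integral' of eq. (85); f[l] is indexed by multipole l."""
    return sum((l + 0.5) / (l * (l + 1)) * f[l] for l in range(lmin, len(f)))

def corr_direct(C, theta, lmin=2):
    """First form of eq. (86): sum over l of (2l+1)/(4 pi) C_l P_l(cos theta)."""
    P = legendre_all(len(C) - 1, math.cos(theta))
    return sum((2 * l + 1) / (4 * math.pi) * C[l] * P[l] for l in range(lmin, len(C)))

def corr_via_log_integral(C, theta, lmin=2):
    """Second form of eq. (86): I(calC_l P_l) with calC_l = l(l+1) C_l / (2 pi)."""
    P = legendre_all(len(C) - 1, math.cos(theta))
    f = [l * (l + 1) * C[l] / (2 * math.pi) * P[l] for l in range(len(C))]
    return log_integral(f, lmin)

# Toy spectrum (an assumption for illustration): flat calC_l = 1e-10 for 2 <= l <= 30.
LMAX = 30
C_toy = [0.0, 0.0] + [1e-10 * 2 * math.pi / (l * (l + 1)) for l in range(2, LMAX + 1)]
theta = math.radians(10.0)
print(corr_direct(C_toy, theta), corr_via_log_integral(C_toy, theta))
```

Both numbers printed should agree to rounding error, since the two sums differ only in how the factor $\ell(\ell+1)/(2\pi)$ is grouped.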
The rms fluctuations in the multipole $\ell$ are found by squaring the $\ell$-poles
of $\Delta T/T_c$ and averaging over angles:
\[
\sigma^2_{T,\ell} = \frac{1}{4\pi} \sum_{m=-\ell}^{\ell} |a_{\ell m}|^2
= \frac{1}{4\pi} \Big[ \xi^2_{\ell 0} + \sum_{m=1}^{\ell} (\xi^2_{\ell m} + \eta^2_{\ell m}) \Big] . \tag{87}
\]
For example, $Q_{\rm rms} = T_c \sigma_{T,2}$ is the quadrupole amplitude. The full four-year
data [85] gives $Q_{\rm rms} = 10.7 \pm 3.6 \pm 7.1$ µK, the first the 1-sigma statistical error,
the second Galactic modelling errors; i.e., $\sigma_{T,2} = 0.4 \times 10^{-5}\, [1 \pm 0.3 \pm 0.7]$.
Over small patches of the sky, the curvature of the sky is not important
and we can Fourier transform the radiation pattern:
\[
\frac{\Delta T}{T_c}(\vec\varpi \,|\, \hat q_P) = \int \frac{d^2 Q}{(2\pi)^2}\, \widetilde{\frac{\Delta T}{T}}(\vec Q)\, e^{i \vec Q \cdot \vec\varpi} . \tag{88}
\]
This is certainly useful, for the fast Fourier transform can then be applied to
small scale map construction. Here I describe the way we did this in [88].
Choose a pole $\hat q_P$ within the patch and, in the neighborhood of the pole, let
$\vec\varpi = (\varpi_x, \varpi_y) = \varpi(\cos\phi, \sin\phi)$, where $\varpi = 2\sin(\theta/2)$ is confined to the range
$0 \le \varpi < 2$. Its magnitude is $\varpi = |\hat q - \hat q_P|$, and to terms of order $\varpi^2$, we
can decompose the unit vector $\hat q \approx \hat q_P + \vec\varpi$ into parallel and transverse pieces.
This representation is an equal area projection of the sphere onto a disk in
the sense that a solid angle element $d\Omega = \sin\theta\, d\theta\, d\phi$ is just $\varpi\, d\varpi\, d\phi = d^2\varpi$.
However, only for $\varpi < 1$ does the map look good: as one goes into the opposite
hemisphere, the distortions are severe. Note that the opposite pole to the one
we are expanding about is the $\varpi = 2$ circle (footnote 1). To evaluate the angular power
spectrum, we make use of the property that in the limit of large $\ell$ and small
$\varpi$,
\[
P_\ell(\cos\theta) \approx J_0((\ell + 1/2)\varpi) . \tag{89}
\]
We therefore have
\[
C_\ell \approx \left\langle \Big| \widetilde{\frac{\Delta T}{T}}(\vec Q) \Big|^2 \right\rangle , \quad \text{if } Q = \ell + 1/2 . \tag{90}
\]
This suggests that the analogue of the power per logarithmic wavenumber
interval is actually $(\ell + 1/2)^2 C_\ell/(2\pi)$. The form $\mathcal{C}_\ell \equiv \ell(\ell + 1) C_\ell/(2\pi)$ adopted
differs by only 4% for $\ell = 2$ and by less than a percent for $\ell > 4$. Since the
dimensions of $Q$ are inverse radians, the $\ell$–pole can be considered to probe
angles around $3438/(\ell + 1/2)$ arcminutes (and angular wavelengths $2\pi$ bigger).
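The size of the difference between the two prefactors, and the angular scale associated with a given multipole, are easy to tabulate. The short sketch below does so; the function names are illustrative and not taken from the text.

```python
import math

ARCMIN_PER_RADIAN = 180.0 * 60.0 / math.pi  # about 3438 arcminutes in one radian

def prefactor_ratio(l):
    """Ratio of (l + 1/2)^2 to l(l + 1), the two candidate power-per-log-interval weights."""
    return (l + 0.5) ** 2 / (l * (l + 1))

def angle_arcmin(l):
    """Angular scale 1/Q in arcminutes for Q = l + 1/2."""
    return ARCMIN_PER_RADIAN / (l + 0.5)

for l in (2, 4, 5, 10, 100, 1000):
    print(l, f"{100 * (prefactor_ratio(l) - 1):.2f}%", f"{angle_arcmin(l):.1f} arcmin")
```

Since $(\ell + 1/2)^2 = \ell(\ell + 1) + 1/4$, the fractional difference is $1/[4\ell(\ell + 1)]$, which is 1/24 at $\ell = 2$ and falls below one percent from $\ell = 5$ onward.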
Footnote 1: We can choose to zero $\Delta T/T$ at the $\varpi = 2$ circle; the expansion is then a Fourier-Bessel series
with cylindrical eigenfunctions $\propto e^{im\phi} J_m(Q_{nm}\varpi)$, where the $Q_{nm}$ are the positive roots of
$J_m(2Q_{nm}) = 0$, thus with a discrete spectrum, though not useful like the $Y_{\ell m}$ expansion
unless we are interested in small enough angles so that $\varpi = 2$ can be considered to be infinity
and $Q_{nm}$ becomes continuous.

4.2 Experimental arrangements and their filters

4.2.1 Pixel–pixel correlation filters
We now discuss anisotropy experiments in more detail. Typically we are given
the data in the form of measurements $(\Delta T/T)_p \pm \sigma_{Dp}$ of the anisotropy in the
$p$th pixel, where $\sigma^2_{Dp}$ is the variance about the mean for the measurements. In
general, there may be pixel–pixel correlations in the noise, defining a correlation
matrix $C_{Dpp'}$ with off-diagonal components as well as the diagonal $\sigma^2_{Dp}$. Also
there is usually more than one frequency channel, with the generalized pixels
having frequency as well as spatial designations. The signal $(\Delta T/T)_p$ can be
expressed in terms of linear filters $F_{p,\ell m}$ acting on the $a_{\ell m}$:
\[
(\Delta T/T)_p = \sum_{\ell m} F_{p,\ell m}\, a_{\ell m} . \tag{91}
\]
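The $F_{p,\ell m}$ encode the experimental beam and the switching strategy that defines
the temperature difference, the former filtering high $\ell$, the latter low $\ell$.
They can also encode the frequency dependence if the signal has a fixed spectral
signature, as primary CMB and secondary SZ fluctuations do. Reality
implies $F_{p,\ell,-m} = F^*_{p,\ell m}$. The pixel–pixel correlation function of the temperature
differences can be expressed in terms of a quadratic $N_{\rm pix} \times N_{\rm pix}$ filter
matrix $W_{pp',\ell}$ acting on $\mathcal{C}_{T\ell}$:
\[
C_{Tpp'} \equiv \left\langle \Big( \frac{\Delta T}{T} \Big)_p \Big( \frac{\Delta T}{T} \Big)_{p'} \right\rangle = I[W_{pp',\ell}\, \mathcal{C}_{T\ell}] , \tag{92}
\]
\[
W_{pp',\ell} \equiv \frac{4\pi}{2\ell + 1} \sum_m F_{p,\ell m} F^*_{p',\ell m} ; \qquad \overline{W}_\ell \equiv \frac{1}{N_{\rm pix}} \sum_{p=1}^{N_{\rm pix}} W_{pp,\ell} . \tag{93}
\]
The trace $\overline{W}_\ell$ defines the average filters [3, 42, 243] shown in fig. 6, which
determine the rms anisotropies $\sigma_T[\overline{W}]$:
\[
\sigma_T^2[\overline{W}] \equiv \Big( \frac{\Delta T}{T} \Big)^2_{\rm rms} \equiv \frac{1}{N_{\rm pix}} \sum_{p=1}^{N_{\rm pix}} C_{Tpp} \equiv I[\overline{W}_\ell\, \mathcal{C}_{T\ell}] . \tag{94}
\]
We define the band-power associated with the filter $\overline{W}_\ell$ to be the average power
across the filter [5, 89]:
\[
\langle \mathcal{C}_\ell \rangle_W \equiv I[\overline{W}_\ell\, \mathcal{C}_{T\ell}] / I[\overline{W}_\ell] . \tag{95}
\]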
Usually the band-power is the quantity that can be most accurately determined
from the experimental data, and it is used extensively in what follows to assess
what various experiments have measured, and what various theories predict.
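The definitions in eqs. (94) and (95) translate directly into a few lines of code. The sketch below is illustrative only: the filter is a hypothetical Gaussian single-beam shape, the two spectra are toy inputs, and the sums are started at $\ell = 2$ by assumption.

```python
import math

def log_integral(f, lmin=2):
    """Discrete logarithmic integral I(f) of eq. (85); f[l] is indexed by multipole l."""
    return sum((l + 0.5) / (l * (l + 1)) * f[l] for l in range(lmin, len(f)))

def band_power(W, calC, lmin=2):
    """Eq. (95): <calC_l>_W = I[W_l calC_l] / I[W_l]; both lists are indexed by l."""
    num = log_integral([w * c for w, c in zip(W, calC)], lmin)
    return num / log_integral(W, lmin)

def sigma_T(W, calC, lmin=2):
    """rms anisotropy from eq. (94): sqrt(I[W_l calC_l])."""
    return math.sqrt(log_integral([w * c for w, c in zip(W, calC)], lmin))

LMAX = 200
# Hypothetical single-beam filter W_l = B^2 with l_s = 19 (example scale only).
W = [math.exp(-((l + 0.5) / 19.5) ** 2) for l in range(LMAX + 1)]
flat = [1e-10] * (LMAX + 1)                                        # toy flat calC_l
tilted = [1e-10 * ((l + 0.5) / 10.5) ** 0.15 for l in range(LMAX + 1)]  # toy tilted calC_l
print(band_power(W, flat), band_power(W, tilted), sigma_T(W, flat))
```

For a flat spectrum the band-power returns the input amplitude whatever the filter, which is one reason it is a convenient quantity to quote; for a tilted spectrum it returns a filter-weighted average.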
In the high $\ell$ limit it is often more convenient to use the Fourier transform
representation:
\[
(\Delta T/T)_p = \sum_{\vec Q} \widetilde{F}_p(\vec Q)\, \widetilde{\frac{\Delta T}{T}}(\vec Q) , \qquad \widetilde{F}_p(\vec Q) = e^{i \vec Q \cdot \vec\varpi_p} B(Q)\, U_p(\vec Q) , \tag{96}
\]
where $\vec\varpi_p$ is the position defining the pixel, $B(Q)$ defines the beam profile, and
$U_p(\vec Q)$ contains details of the switching strategy. The associated filter for $C_{Tpp'}$
is
\[
\widetilde{W}_{pp'}(Q) = \int_0^{2\pi} \frac{d\phi_Q}{2\pi}\, \widetilde{F}_p(\vec Q)\, \widetilde{F}_{p'}(\vec Q)^*
= \int_0^{2\pi} \frac{d\phi_Q}{2\pi}\, e^{i \vec Q \cdot (\vec\varpi_p - \vec\varpi_{p'})} B^2(Q)\, U_p(\vec Q)\, U^*_{p'}(\vec Q) . \tag{97}
\]
The decomposition of the filter into a Fourier phase factor associated with the
pixel position, a beam function and a switching strategy function $U_p$ (which
can depend upon the pixel position too for some experiments) is generally
useful for experiments on scales below a few degrees – provided distortions in
the $\varpi$-representation over the region of the sky mapped are not large; if they
are, it is better to work with the full spherical harmonic representation. The
analogue for the spherical harmonic representation $F_{p,\ell m}$ of pulling out the
phase associated with $\vec\varpi_p$ is to pull out an overall $Y_{\ell m'}(\hat q_p)$, but with penalty
that the switching factor is a function of $m'$ as well as the $\ell m$ and possibly the
pixel position:
\[
F_{p,\ell m} = \sum_{m'} Y_{\ell m'}(\hat q_p)\, B(\ell, m)\, U_{p,\ell m m'} . \tag{98}
\]
Discretization into time bins and aspects of pixelization are encoded in the
functions $U_{p,\ell m m'}$ or $U_p(\vec Q)$.

Figure 6: Filter functions for some current experiments: cobe’s dmr and firs
are treated as single-beam maps; ten is Tenerife; sp94 is the UCSB 1994 South
Pole experiment, for which two filters for different HEMT receiver systems
are shown; sk95 is the 1993-95 BigPlate (Saskatchewan) experiment, which is
sensitive to a large range in $\ell$, and for which two filters at $\ell \sim 100$ and $\ell \sim 300$
are shown; py is Python; g2, g3 are 2 and 3 beam configurations for MSAM; max
is for the MAX3,4,5 experiments; wd1,2 are the m = 1, 2 analysis modes for the
WhiteDish experiment. ovro is the filter for the 1987 OVRO experiment using
a 40-m radio dish, ovro22 uses a 5-m single dish, JCMT/IRAM illustrates what
bolometer arrays on submillimeter telescopes are sensitive to, and mmOVRO
and VLA denote approximations to the $\ell$-space probed by a mm interferometer
array and by the Very Large Array in a compact configuration.

4.2.2 Beams and dmr and firs

Experimental beams are characterized by a full width at half maximum $\theta_{\rm fwhm}$.
Beams must be determined experimentally, typically by determining the pattern
of a point source on the sky. Usually there is a nice monotonic fall-off
from the central point to low levels of power. However, beams do have side
lobes which experimenters suppress as much as possible. Also the beams are
not always rotationally symmetric. Still, for many experiments a Gaussian as
a function of angle is not a bad approximation. The beam would then also be
Gaussian in multipole (Fourier transform) space,
\[
B(\ell|\ell_s) \equiv \exp\left[ - \frac{(\ell + \frac{1}{2})^2}{2(\ell_s + \frac{1}{2})^2} \right] , \qquad
\ell_s + \frac{1}{2} = \frac{1}{2\sin(\theta_s/2)} , \qquad \theta_s = \frac{\theta_{\rm fwhm}}{\sqrt{8 \ln 2}} \approx 0.425\, \theta_{\rm fwhm} . \tag{99}
\]
The square of the COBE beam is shown in fig. 6: it falls off more rapidly than
the rough 7° fwhm Gaussian used [90] before it was precisely determined [91].
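Equation (99) can be evaluated directly to convert a quoted beam width into a filter scale. The sketch below does this for three beam widths that appear in the text (7°, 3.9° and 30′); the function names are illustrative.

```python
import math

def beam_scale(theta_fwhm_deg):
    """Gaussian filter scale l_s of eq. (99) for a beam FWHM given in degrees."""
    theta_s = math.radians(theta_fwhm_deg) / math.sqrt(8.0 * math.log(2.0))
    return 1.0 / (2.0 * math.sin(theta_s / 2.0)) - 0.5

def beam(l, l_s):
    """Gaussian beam B(l|l_s) of eq. (99)."""
    return math.exp(-((l + 0.5) ** 2) / (2.0 * (l_s + 0.5) ** 2))

for name, fwhm_deg in (("7 deg", 7.0), ("3.9 deg", 3.9), ("30 arcmin", 0.5)):
    l_s = beam_scale(fwhm_deg)
    print(name, round(l_s, 1), beam(l_s, l_s) ** 2)
```

Rounded to integers these reproduce the values $\ell_s \approx 19$, 34 and 269 quoted in these notes for the corresponding beams. By construction $B^2$ has dropped to $e^{-1}$ at $\ell = \ell_s$.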
One can imagine a “one-beam” experiment, with the temperature fluctuation
relative to an absolute temperature being determined. In this case, the
average filter is just $\overline{W}_\ell \approx B^2(\ell|\ell_s)$. But this is never the case in practice,
although the processed COBE and firs maps can be analyzed as if they were
one-beam experiments. COBE actually measures the difference between $\Delta T$
values at two beam-smeared points 60° apart, but as the satellite spins and
rotates, the entire sky is covered, albeit with different integration times for
different sections of the sky. The set of dmr measurements give the difference
in $\Delta T$ from a given beam-smeared point to enough connected points 60°
away to allow a successful inversion and construction of a map: i.e., beam-smeared
$\Delta T(\hat q)$ values at 6144 pixels for each of the 2 × 3 frequency channels
(using a convenient oversampled digitization in squares of size 2.6° of each
beam-smeared point). The price one pays is that residual correlation in the
experimental variance occurs between map pixels separated by 60° [92]. The
gain is that the COBE maps can be thought of as giving $\Delta T(\theta, \phi)$ directly,
smoothed with a “single-beam” high-$\ell$ filter associated with the beam-size. Of
course, the monopole and dipole components are also filtered out: the $\ell = 0$
component, the average temperature on the sky, obviously is inaccessible; and
because of the large dipole anisotropy induced by the motion of the earth relative
to the cosmic background radiation, the “intrinsic” $\ell = 1$ component is
also inaccessible.
The coverage of the firs experiment is more complicated than for dmr because
it was a balloon experiment taking useful data for only about 5 hours.
Nonetheless, a map with highly inhomogeneous weighting of each of the 1.3°
pixels can be constructed for each of its four frequency channels. Although
one may be more sophisticated in taking this into account in the construction
of $\overline{W}_\ell$, it is reasonable to characterize the experiment by a one-beam filter
function (99), with $\ell_s \approx 34$ corresponding to the 3.9° fwhm beam.
4.2.3 2-Beams, 3-beams, oscillating beams, . . .

For a given theory, experiments could be designed to get the optimal signal.
For example, MAX, MSAM and other half-degree experiments probe multipole
ranges which optimize the signal from power spectra like that for primary
anisotropies if the recombination of the primeval plasma occurred normally. A
filter with a beam like that of sp94 is better for probing primary anisotropies
if early reionization occurred. It is of course best to get information from
experiments probing the entire $\ell$ range, and thus the emphasis on large scale
mapping experiments for the future. I now describe the $U_p$ and $\overline{W}_\ell$ for a variety
of current experimental configurations to give a flavor for what goes into fig. 6.
Versions are also shown and discussed in [3, 5, 42, 89, 106, 140, 243].
A single-differencing (or 2-beam) experiment differences the temperatures of
two points displaced by $\theta_{\rm throw}/2$ to either side of the central point (the pixel
label). Let us denote the separation direction by $\hat\varpi_{\rm throw}$. For a pixel at $\vec\varpi_p$ we
have:
\[
\Delta_p = \frac{\Delta T}{T}(\vec\varpi_p + \tfrac{1}{2} \vec\varpi_{\rm throw}; \varpi_s) - \frac{\Delta T}{T}(\vec\varpi_p - \tfrac{1}{2} \vec\varpi_{\rm throw}; \varpi_s) , \quad
\text{hence } U_p(\vec Q) = 2i \sin(\vec Q \cdot \vec\varpi_{\rm throw}/2) . \tag{100}
\]
The filter is simply expressed in terms of the $J_0$ Bessel function:
\[
\overline{W}_\ell = [2(1 - J_0(x_t))]\, B^2(\ell|\ell_s) , \quad \text{where } x_t = \frac{\ell + \frac{1}{2}}{\ell_{\rm throw} + \frac{1}{2}} . \tag{101}
\]
The $J_0(x_t)$ term is really the high $\ell$ approximation to $P_\ell(\cos\theta_{\rm throw})$. $\overline{W}_\ell$ rises
like $\ell^2 B^2$.
In a double-difference (3-beam) experiment, the smoothed fluctuation at the
pixel site has subtracted from it the average of the fluctuations at a distance
$\theta_{\rm throw}$ away. Thus
\[
\Delta_p = \frac{\Delta T}{T}(\vec\varpi_p; \varpi_s) - \tfrac{1}{2} \Big[ \frac{\Delta T}{T}(\vec\varpi_p + \vec\varpi_{\rm throw}; \varpi_s) + \frac{\Delta T}{T}(\vec\varpi_p - \vec\varpi_{\rm throw}; \varpi_s) \Big] , \quad
\text{hence } U_p(\vec Q) = 2 \sin^2(\vec Q \cdot \vec\varpi_{\rm throw}/2) , \tag{102}
\]
with the average filter
\[
\overline{W}_\ell = [2(1 - J_0(x_t)) - \tfrac{1}{2}(1 - J_0(2x_t))]\, B^2(\ell|\ell_s) . \tag{103}
\]
$\overline{W}_\ell$ rises like $\ell^4 B^2$.
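The bracketed factors in eqs. (101) and (103) are the averages of $|U_p(\vec Q)|^2$ over the direction of $\vec Q$, in the flat-sky limit where $\vec Q \cdot \vec\varpi_{\rm throw} = x_t \cos\phi_Q$. The following sketch checks this numerically. It is a minimal illustration: $J_0$ is evaluated from its standard integral representation rather than from a library, and all function names are assumptions.

```python
import math

def j0(x, n=400):
    """Bessel J_0 from (1/pi) * int_0^pi cos(x sin t) dt, using the midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) / n

def angle_average(u_abs2, x, n=400):
    """Average |U_p|^2 over the direction phi_Q of Q, with Q.varpi_throw = x cos(phi_Q)."""
    h = 2.0 * math.pi / n
    return sum(u_abs2(x * math.cos((k + 0.5) * h)) for k in range(n)) / n

def w2_closed(x):
    """Bracketed factor of eq. (101) for a 2-beam experiment."""
    return 2.0 * (1.0 - j0(x))

def w3_closed(x):
    """Bracketed factor of eq. (103) for a 3-beam experiment."""
    return 2.0 * (1.0 - j0(x)) - 0.5 * (1.0 - j0(2.0 * x))

def u2_abs2(a):
    """|2i sin(a/2)|^2 from eq. (100)."""
    return 4.0 * math.sin(a / 2.0) ** 2

def u3_abs2(a):
    """|2 sin^2(a/2)|^2 from eq. (102)."""
    return 4.0 * math.sin(a / 2.0) ** 4

for x in (0.05, 0.5, 2.0, 5.0):
    print(x, w2_closed(x), angle_average(u2_abs2, x), w3_closed(x), angle_average(u3_abs2, x))
```

Expanding $J_0$ for small $x_t$ gives $x_t^2/2$ for the 2-beam bracket and $3x_t^4/32$ for the 3-beam bracket, which is the origin of the $\ell^2$ and $\ell^4$ rises noted above.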
In the MSAM (g2, g3) experiment, the raw data was projected into both a
2-beam and 3-beam mode, an example of a growing trend to adopt switching
strategies in software rather than hardware. Often the experimental filters are
more complicated than 2 or 3 beam ones and the associated matrix elements
must be calculated precisely, taking into account the details of the pattern on
the sky. For example, the sp91 and sp94 experiments are similar to single-differencing
experiments, except that the beam oscillates about the pixel position
in a direction $\hat\varpi_{\rm osc}$ with an oscillation amplitude $\varpi_{\rm osc}$, frequency $\omega$, and
time behavior $\vec\varpi_p + \vec\varpi_{\rm osc} \sin(\omega t)$, with the temperature positively weighted on
one side and negatively weighted on the other. The sp91 beam was 1.5° with
oscillation amplitude of 2.95/2 degrees, roughly corresponding to a 2-beam
throw of about 2°, not that much larger than the beam: thus beam and throw
interference result in a relatively small width and maximum of $\overline{W}_\ell$. A similar
story holds for sp94. For such an experiment,
\[
U_p(\vec Q) = 2i H_0(\vec Q \cdot \vec\varpi_{\rm osc}) , \quad \text{where } H_0(x) \equiv \frac{2}{\pi} \int_0^{\pi/2} d\theta\, \sin(x \cos(\theta)) \tag{104}
\]
is called the Struve function of index zero [243].
The sp89 experiment and the MAX balloon-borne bolometer experiment
both had fwhm beams of 30′ ($\ell_s \approx 269$) and also measured temperature differences
via oscillating beams, with oscillation amplitude ≈ 1.4/2 degrees. The
filters differ because MAX used a sine weighting of the temperature to make
the temperature difference rather than the plus/minus step function technique
of sp89 and sp91. For sine weighting, $U_p(\vec Q) = 2i\, \frac{\pi}{2}\, J_1(\vec Q \cdot \vec\varpi_{\rm osc})$.
The 1.8′ beam Owens Valley (ov7) experiment [94] used a 40-m radio dish
at 1.5 cm to observe 7 fields on the sky with a double-differencing (or 3-beam)
experiment, which (basically) subtracted the average of the temperatures 7.15′
to each side of a central point from the temperature at the central point. Thus
$\ell_{\rm throw} \approx 480$ corresponds to the “throw” angle $\theta_{\rm throw} = 7'$ and $\ell_s \approx 2246$ is
the beam’s filter scale. It is theoretically advantageous to have $\theta_{\rm throw} \gg \theta_{\rm beam}$
to get the maximum rms signal, but it is difficult to manage experimentally.
Python was a 4-beam experiment with beam 45′ and throw 2.75°, hence
also a very large ratio of throw to beam; for it,
\[
U_p(\vec Q) = 2i \sin^3(\tfrac{1}{2} \vec Q \cdot \vec\varpi_{\rm throw}) , \qquad
\overline{W}_\ell = \tfrac{1}{4} \big[ 5 - \tfrac{15}{2} J_0(x_t) + 3 J_0(2x_t) - \tfrac{1}{2} J_0(3x_t) \big]\, B^2(\ell|\ell_s) .
\]
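The Python filter follows from the same direction average as the 2-beam and 3-beam cases: with $\vec Q \cdot \vec\varpi_{\rm throw} = x_t \cos\phi_Q$, one averages $|2i \sin^3(a/2)|^2 = 4\sin^6(a/2)$ over $\phi_Q$. The sketch below compares that average with the closed form quoted above. As before, $J_0$ is computed from its integral representation and the function names are illustrative assumptions.

```python
import math

def j0(x, n=400):
    """Bessel J_0 from (1/pi) * int_0^pi cos(x sin t) dt, using the midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) / n

def python_closed(x):
    """Closed-form bracket (1/4)[5 - (15/2) J0(x) + 3 J0(2x) - (1/2) J0(3x)]."""
    return 0.25 * (5.0 - 7.5 * j0(x) + 3.0 * j0(2.0 * x) - 0.5 * j0(3.0 * x))

def python_direct(x, n=400):
    """Direction average of 4 sin^6(a/2) with a = x cos(phi_Q)."""
    h = 2.0 * math.pi / n
    return sum(4.0 * math.sin(0.5 * x * math.cos((k + 0.5) * h)) ** 6
               for k in range(n)) / n

for x in (0.1, 0.5, 2.0, 5.0):
    print(x, python_closed(x), python_direct(x))
```

For small $x_t$ the average tends to $5x_t^6/256$, so under these assumptions this filter rises like $\ell^6 B^2$ at low $\ell$, steeper than the 3-beam case.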
The raw data $\Delta_p(\phi)$ for the WhiteDish wd experiment (beam 12′) are differences
between the temperatures of points on a circle of radius $\varpi_{\rm throw}/2$ (= 14′)
centered on the pixel and the temperature at the pixel. Among ways to analyze the data, the most
straightforward is to form angular moments, $\int e^{-im\phi} \Delta_p(\phi)\, d\phi/(2\pi)$:
\[
U_p^m(\vec Q) = i^m J_m(\tfrac{1}{2} Q \varpi_{\rm throw})\, (\hat Q_x - i \hat Q_y)^m , \qquad
\overline{W}_\ell = J_m^2(\tfrac{1}{2} Q \varpi_{\rm throw})\, B^2 . \tag{105}
\]
Data was given for the m = 1, 2 modes, derived from 5 pixels in a line.
As we move into the next generation of experiments, the goal is to make
maps of extended regions. An example of the increasing sophistication is pro-
vided by the sk95 experiment [150], which projected from 3-beam to 19-beam
configurations in software, leading to an interpretation of an even more gen-
eralized pixel-space than one just including frequencies and spatial centering.
The filter functions can also be designed after the fact with this approach, as
in fact was done for broad-band power spectrum analysis in [150]. Two such
filters at either end of the $\ell$-range that sk95 was sensitive to are shown in fig. 6.
4.3 Primary power spectra for inflation-based theories

Sample theoretical $\mathcal{C}_\ell$’s are shown in figs. 7, 8 for a number of inflation-inspired
theories with modest variations in cosmological parameters [144, 261, 305].
The “standard” scale invariant adiabatic CDM model ($\Omega = 1$, $n_s = 1$, $h = 0.5$,
$\Omega_B = 0.05$) with normal recombination shown in fig. 7 and repeated in
each of the panels of fig. 8 illustrates the typical form: the Sachs–Wolfe effect
dominating at low $\ell$, followed by rises and falls in the first and subsequent
“Doppler peaks”, with an overall decline due to destructive interference across
the photon decoupling surface and damping by shear viscosity in the photon
plus baryon fluid. A similar CDM model, but with early reionization (at $z > 200$),
shows the Doppler peaks are damped, a result of destructive interference
from forward and backward flows across the decoupling region, illustrating
that the “short-wavelength” part of the density power spectrum can have a
dramatic effect upon $\mathcal{C}_\ell$, since it determines how copious UV production from
early stars was. Lower redshifts of reionization still maintain a Doppler peak,
but suppressed relative to the standard CDM case (as illustrated in fig. 8(e)).

Figure 7: Temperature power spectra normalized to $\langle \mathcal{C}_\ell \rangle_{\rm dmr} = 10^{-10}$: for a
standard $n_s = 1$ CDM model with standard recombination, early reionization,
a (dashed) tilted primordial spectrum with $n_s = 0.95$, with the gravity wave
contribution shown, a (dotted) $H_0 = 75$ model with $\Lambda \ne 0$, and an open
$H_0 = 60$ CDM model (with the peaks shifted to larger $\ell$). Band-powers with
10% (dmr-level) error bars (for selected experimental configurations) are shown
for the tilted and untilted CDM models. A hot/cold hybrid model power
spectrum with $\Omega_\nu = 0.2$ is plotted as well but is indistinguishable here from
the standard CDM case. The power spectra of SZ maps constructed using the
peak-patch method (Bond and Myers 1996) are shown for a $\sigma_8 = 1$ standard
CDM model, a hot/cold hybrid model ($\Omega_\nu = 0.3$) with $\sigma_8 = 0.7$ (a tilted CDM
model with $n_s = 0.8$ and $\sigma_8 = 0.7$ is also shown). Spectra for a BCH2 dust
model (13) are also shown, the larger (arbitrarily normalized) part a shot-noise
effect for galaxies with dust distributed over 10 kpc, the smaller a continuous
clustering contribution, including a nonlinear correction. The $\sim \ell^2$ shot-noise
rise also characterizes the power spectrum for extragalactic radio sources. On
the other hand, Galactic foregrounds have power spectra falling $\sim \ell^{-1}$ with
$\ell$. Average filter functions for a variety of experiments are shown in the lower
panel.

Figure 8: Spectra for a variety of inflation-inspired models, normalized to the
COBE band-power. Theoretical band-powers for various experimental configurations
are placed at $\langle \ell \rangle_W$, horizontal error bars extend to the $e^{-1/2}\, \overline{W}_{\rm max}$
points. Unless otherwise indicated, $\Omega_B h^2 = 0.0125$, $h = 0.5$, $n_s = 1$; when
the gravity wave contribution is nonzero, $\nu_t = \nu_s$ and $r_{ts} \approx -7\nu_t$ are assumed
($r_{ts} \equiv C_2^{(T)}/C_2^{(S)}$). The untilted $n_s = 1$, $r_{ts} = 0$ model is repeated in each panel
(solid line). (a) CDM models with variable tilt $n_s$. (b) $n_s = 1$ models with
$\Omega_B h^2$ changed, $h$ fixed. (c) $n_s = 1$ models, with $\Omega_B h^2$ changed, $\Omega_B$ fixed. (d)
$n_s = 1$ models with fixed age, 13 Gyr, but variable $H_0$ and $\Omega_\Lambda = 1 - \Omega_{\rm cdm} - \Omega_B$
(.92, .79, .43, 0 for 100, 80, 60, 50). (e) CDM models with very early reionization
at $z_{\rm reh} \gtrsim 150$ (equivalent to “no recombination”), and later reionization at
$z_{\rm reh} = 30, 50$ are contrasted with standard recombination (SR). The $z_{\rm reh} = 50$
spectrum is close to the $n_s = 0.95$ spectrum with SR (thin, dot-dashed): the
moderate suppression if $20 \lesssim z_{\rm reh} \lesssim 150$ can be partially mimicked by decreasing
$n_s$ or increasing $h$. (f) Sample cosmologies with nearly degenerate spectra
and band-powers. Dashed curve: increasing $\Omega_\Lambda$ is compensated by increasing
$h$. Dot-dashed curve: tilting to $n_s = 0.94$ ($\tilde r_{ts} = 0.42$) is compensated by
increasing $\Omega_\Lambda$ to 0.6. The dotted hot/cold model curves (Bond and Lithwick
1995) (with $\Omega_\nu$ indicated) are nearly identical to the standard CDM one, but
even these few percent differences can be distinguished in principle by satellite
all-sky experiments with currently available detector technology.

The primary spectra are calculated by solving for each mode $M$ the linearized
Boltzmann transport equation for photons (including polarization) and
light neutrinos, coupled to the equations of motion for baryons and cold dark
matter, and to the perturbed gravitational metric equations (section 6).
If the post-inflation fluctuations are Gaussian-distributed, then so are the
multipole coefficients $a^{(M)}_{\ell m}$, with amplitudes fully determined by just the angular
power spectra $\mathcal{C}^{(M)}_\ell$. Figures 7, 8 include adiabatic scalar and tensor
contributions. The relative magnitude of each is characterized by either the
ratio of the quadrupole powers, $r_{ts} = C_2^{(T)}/C_2^{(S)}$, or the ratio of the dmr band-powers
$\tilde r_{ts} = \langle \mathcal{C}^{(T)}_\ell \rangle_{\rm dmr} / \langle \mathcal{C}^{(S)}_\ell \rangle_{\rm dmr}$. For the scale invariant cases, $r_{ts}$ is taken to
vanish.
A simple variant of CDM-like models is to tilt the initial spectrum. We deal
with the physics of tilt in more detail in section 6, and just sketch the main
results here. The scalar tilt is defined by νs = ns − 1, in terms of the usual
primordial index for density fluctuations, ns , which is one for scale invariant
adiabatic fluctuations. There is a corresponding tilt which characterizes the ini-
tial spectrum of gravitational waves which induce primary tensor anisotropies,
$\nu_t$. Inflation models give $\nu_t < 0$ and usually give $\nu_s < 0$. For small tensor tilts,
$r_{ts} \approx -6.9\nu_t$ and $r_{ts} \approx 1.3\, \tilde r_{ts}$ are expected (with corrections given by eq. 184).
For a reasonably large class of inflation models $\nu_t \approx \nu_s$, but in some popular
inflation models $\nu_t$ may be nearly zero even though $\nu_s$ is not. Figures 7 and
8(a) show $\mathcal{C}^{(S)}_\ell + \mathcal{C}^{(T)}_\ell$ derived for tilted cases when $\nu_t \approx \nu_s$ is assumed to hold.
Figure 7 also shows the contribution that $\mathcal{C}^{(T)}_\ell$ makes to the total; $\mathcal{C}^{(T)}_\ell$ for both
the standard and early reionization cases are actually both shown; they cannot
be distinguished on this graph.
Spectra for hot/cold hybrid models with a light massive neutrino look quite
similar to those for CDM only [260, 262, 261], as fig. 7 and 8(f) show, with
small differences appearing at higher $\ell$. This is even true for pure hot dark
matter models [134] because the scale associated with neutrino damping is near
to the scale associated with the width of decoupling.
The dotted $\mathcal{C}_\ell$ in fig. 7 also has a flat initial spectrum, but has a large nonzero
cosmological constant in order to have a high $H_0$, in better accord with most
observational determinations (∼ 65–85), e.g., [10]. The specific model has the
same age (13 Gyr) as the standard CDM model, and $\Omega_\Lambda = 0.734$, $\Omega_{\rm cdm} = 0.243$,
$\Omega_B = 0.022$, $H_0 = 75$, $n_s = 1$. (The best current estimate for globular cluster
ages, along with one and two sigma error estimates, is $14.6^{+1.7;\,3.7}_{-1.6;\,3.0}$ Gyr [111].)
Other nonzero Λ examples with this age are given in fig. 8(d). As one goes
from $\ell = 2$ to $\ell = 3$ and above there is first a drop in $\mathcal{C}_\ell$ [110], a consequence of
the time dependence of the gravitational potential fluctuations $\Phi_N$ (see fig. 23
for a closeup of this).
The model whose peak is shifted to high $\ell$ is an open CDM cosmology [305]
with the same 13 Gyr age, but now $H_0 = 60$, and $\Omega_{\rm tot} = 0.33$ (and $\Omega_{\rm cdm} = 0.30$,
$\Omega_B = 0.035$). By $H_0 = 70$, $\Omega_{\rm tot}$ is down to 0.055 at this age. The shift to higher
$\ell$ for open models is a simple consequence of the cosmological angle-distance
relation (section 5.1.4, eq. (130)); for closed models, the shift is to smaller $\ell$.
To get a visual impression of what the spectral structure means, fig. 9 shows
what the sky looks like on a few resolution scales for the standard $n_s = 1$ CDM
model: on the COBE beamscale (Gaussian filtering $\ell_s = 19$ here, see also
fig. 11), the nearly scale invariant form; on the half-degree scale ($\ell_s = 269$
here), where the standard recombination spectrum is a maximum; with no
smoothing at all, with the shapes defined entirely by the destructive interference
that occurred across the photon decoupling region. For early-reionization, the
shapes in the 60° NR map are also the naturally occurring ones, since there is
no power left at $\ell_s \sim 269$ to artificially filter.
4.4 2D spectra with tilt and a Gaussian coherence angle

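A phenomenology characterized by three parameters, a broad-band power
$\langle \mathcal{C}_\ell \rangle_W$, a broad-band tilt $\nu_{\Delta T}$, and a Gaussian coherence scale $\theta_c$ is often a
good local approximation to $\mathcal{C}_\ell$:
\[
\mathcal{C}_\ell = \langle \mathcal{C}_\ell \rangle_W\, \frac{Q^{\nu_{\Delta T}}\, e^{-\frac{1}{2} Q^2 \varpi_c^2}\, I[\overline{W}_\ell]}{I[\overline{W}_\ell\, Q^{\nu_{\Delta T}}\, e^{-\frac{1}{2} Q^2 \varpi_c^2}]} , \qquad Q \equiv \ell + \tfrac{1}{2} , \qquad \varpi_c \equiv 2\sin(\tfrac{1}{2}\theta_c) . \tag{106}
\]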
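The normalization in eq. (106) is arranged so that the band-power of the model spectrum through the same filter returns the input amplitude. The sketch below illustrates this under stated assumptions: the filter is a hypothetical 2-beam-like shape, the sums start at $\ell = 2$, and the function and variable names are not from the text.

```python
import math

def log_integral(f, lmin=2):
    """Discrete logarithmic integral I(f) of eq. (85); f[l] is indexed by multipole l."""
    return sum((l + 0.5) / (l * (l + 1)) * f[l] for l in range(lmin, len(f)))

def model_spectrum(band, W, nu, theta_c_deg):
    """calC_l of eq. (106) for band-power `band`, tilt nu and coherence angle theta_c (degrees)."""
    varpi_c = 2.0 * math.sin(0.5 * math.radians(theta_c_deg))
    shape = [(l + 0.5) ** nu * math.exp(-0.5 * (l + 0.5) ** 2 * varpi_c ** 2)
             for l in range(len(W))]
    norm = log_integral(W) / log_integral([w * s for w, s in zip(W, shape)])
    return [band * norm * s for s in shape]

LMAX = 300
# Hypothetical filter: a 2-beam-like shape l^2 exp(-(l/60)^2), used only for illustration.
W = [l ** 2 * math.exp(-((l / 60.0) ** 2)) for l in range(LMAX + 1)]
calC = model_spectrum(1e-10, W, nu=2.0, theta_c_deg=0.5)
recovered = log_integral([w * c for w, c in zip(W, calC)]) / log_integral(W)
print(recovered)
```

The recovered value equals the input band-power for any choice of tilt and coherence angle, which is why the three parameters can be varied independently when fitting.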
Instead of $Q^{\nu_{\Delta T}}$, it has become standard to use a form $U_\ell$ which, as is shown
in section 5.1, arises when the anisotropy is generated by emission from a thin
shell at cosmological distance of sources described by a 3D Gaussian random
field with power spectrum $\mathcal{P}(k) \propto k^{\nu_{\Delta T}}$:
\[
U_\ell \equiv \frac{\Gamma(\ell + \nu_{\Delta T}/2)\, \Gamma(\ell + 2)}{\Gamma(\ell)\, \Gamma(\ell + 2 - \nu_{\Delta T}/2)} = Q^{\nu_{\Delta T}} (1 + O(Q^{-2})) . \tag{107}
\]
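How quickly $U_\ell$ approaches $Q^{\nu_{\Delta T}}$ can be seen numerically. The sketch below evaluates eq. (107) with log-gamma functions, which avoids overflow at large $\ell$; the function name is illustrative, and the evaluation assumes $\ell \ge 1$ with both gamma-function arguments positive.

```python
import math

def U(l, nu):
    """U_l of eq. (107), computed via log-gamma for numerical stability."""
    return math.exp(math.lgamma(l + nu / 2.0) + math.lgamma(l + 2.0)
                    - math.lgamma(l) - math.lgamma(l + 2.0 - nu / 2.0))

for nu in (0.15, 1.0, 2.0):
    for l in (2, 10, 100, 1000):
        Q = l + 0.5
        print(nu, l, U(l, nu) / Q ** nu - 1.0)
```

Two exact cases serve as checks: $\nu_{\Delta T} = 0$ gives $U_\ell = 1$, and $\nu_{\Delta T} = 2$ gives $U_\ell = \ell(\ell + 1) = Q^2 - 1/4$, consistent with the stated $O(Q^{-2})$ correction.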
The band-power hC` iW derived for an experiment is estimated using one or

more of these functional forms, and is often quite insensitive to ν∆T or to θc .
Two special cases are usually analyzed:
(1) The pure power law case has zero coherence scale and $\nu_{\Delta T}$ variable. $\nu_{\Delta T}$
nearly 0 corresponds to scale invariant in $\Delta T$; $\nu_{\Delta T} \approx 0.15$ is an effective index
appropriate for COBE-scale anisotropies for a standard initially scale-invariant
CDM model with Ω = 1, h = 0.5 and 5% baryon content [144, 89], and varying
$\nu_{\Delta T}$ can model changing the primordial tilt of the spectrum: it is relatively
insensitive to h and ΩB changes for low $\ell$, but see fig. 23.
(2) The other case has a coherence scale and a white noise spectral index
$\nu_{\Delta T} = 2$, a Gaussian correlation function model with
$C(\varpi) = (\Delta T/T)_c^2 \exp(-\frac{1}{2}\varpi^2/\varpi_c^2)$. It describes uncorrelated blobs of size $\sim \theta_c$, and is similar
to the spectrum for a shot noise distribution of blobs with Gaussian profiles
(section 5.3.1, also see fig. 7) and so is a reasonable form to try. Note that
Figure 9: How a CDM model normalized to COBE varies with resolution.
The contours begin at 109 µK in the half-degree smoothing cases, 54.5 µK in
the no-smoothing case, 27.3 µK in the all-sky aitoff projection map. Positive
contours are heavy, negative are light. SR denotes standard recombination,
NR denotes very early reionization, so there is no Doppler peak. The hills
and valleys in the 5◦ SR (60◦ NR) map are naturally smooth: mapping them
will give a direct probe of the physics of how the photon decoupling region at
redshift ∼ 1000 (200) damped the primary signal.
the form $\sim \ell^2 \exp[-\ell^2(\varpi_c^2/2 + \varpi_s^2)]$ when the beam smoothing is included is
not very dissimilar from the form of the $W_\ell$ of a 2-beam experiment, again not
unreasonable. However, the $(\Delta T/T)_c$ versus $\theta_c$ plots that became the standard
way of representing experimental data until the band-power representation was
developed are somewhat misleading: even if there is no information at all in the
experiment on the shape of the spectrum, $(\Delta T/T)_c$ will have a minimum at a
scale corresponding to where the filter function has its maximum; by contrast,
the band-power is largely $\theta_c$-independent unless one has a mapping experiment
with very broad filters, i.e., one that is sensitive to the shape.
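The behaviour of $(\Delta T/T)_c$ versus $\theta_c$ described above can be sketched for a single measured variance. The example assumes the small-angle form of the Gaussian-correlation spectrum, $\mathcal{C}_\ell \approx (\Delta T/T)_c^2\,(Q\varpi_c)^2 e^{-Q^2\varpi_c^2/2}$, the convention that the measured variance is $I[W_\ell \mathcal{C}_\ell]$ with $I[f_\ell] = \sum_\ell(\ell+\frac{1}{2})f_\ell/[\ell(\ell+1)]$, and a hypothetical filter; none of these numbers corresponds to a real experiment.

```python
import numpy as np

ell = np.arange(2, 3001, dtype=float)
Q = ell + 0.5
weight = Q / (ell * (ell + 1.0))         # assumed I[f] = sum_l weight_l * f_l

Q_peak = 100.0                           # hypothetical location of the filter maximum
x = (Q / Q_peak) ** 2
W = x * np.exp(-x)                       # toy filter, maximal at Q = Q_peak
measured_variance = 1.0e-10              # hypothetical I[W C] fixed by the data

def gaussian_shape(theta_c):
    """Small-angle spectrum shape of a Gaussian correlation function (assumed form)."""
    varpi_c = 2.0 * np.sin(0.5 * theta_c)
    y = (Q * varpi_c) ** 2
    return y * np.exp(-0.5 * y)

def amplitude_c(theta_c):
    """(Delta T/T)_c that reproduces the measured variance for a given theta_c."""
    return np.sqrt(measured_variance / np.sum(weight * W * gaussian_shape(theta_c)))

thetas = np.geomspace(0.002, 0.2, 400)   # coherence angles in radians
amps = np.array([amplitude_c(t) for t in thetas])
theta_min = thetas[np.argmin(amps)]

# The band-power follows from the variance and the filter alone: no theta_c enters.
band_power_sqrt = np.sqrt(measured_variance / np.sum(weight * W))
print(theta_min * Q_peak, band_power_sqrt)
```

For this toy filter the minimum of the inferred $(\Delta T/T)_c$ falls close to $\theta_c \approx \sqrt{2}/Q_{\rm peak}$, where the maximum of the Gaussian-model spectrum coincides with the filter maximum, even though the single datum carries no shape information.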
4.5 Experimental band-powers: past and present

The story of the experimental quest for anisotropies is a heroic one. The
original Penzias and Wilson (1965) discovery paper quoted angular anisotropies
below 10%, but by the late sixties $10^{-3}$ limits were reached [123, 124]. As
calculations of baryon-dominated adiabatic and isocurvature models improved
in the seventies, with the seminal work of Peebles and Yu [130], Doroshkevich,
Sunyaev and Zeldovich [131] and Wilson and Silk [132, 133], the theoretical
expectation was that the experimentalists just had to get to $10^{-4}$. And they
did march from $10^{-3}$ down to $10^{-4}$ in the seventies, with results from Boynton
and Partridge [125], among others. The only signal that was found was the
dipole, hinted at by Conklin and Bracewell in 1973, but found definitively
in Berkeley and Princeton balloon experiments. Throughout the 1980s, the
upper limits kept coming down, punctuated by a few experiments widely used
by theorists to constrain models: the Uson and Wilkinson [126] limit of $5\times10^{-5}$
with a 4.5′ switch (and therefore not much primary signal); the 1987 OVRO
limit [94] of $2 \times 10^{-5}$ on 7′ scales (also below the coherence scale); the 6° limit
of Melchiorri’s group [127]; early versions of the Tenerife experiment [129]; the
7◦ -beam Relict 1 satellite limit [128]; Lubin and Meinhold’s 1989 half-degree
South Pole limit [95], on an angular scale which was optimal for testing CDM-
like models.
These data were used to rule out adiabatic baryon-dominated models, but
by then the dark matter dominated universes had come to the rescue to lower
the theoretical predictions by about an order of magnitude [134, 135]. Many
groups developed codes to solve the perturbed Boltzmann–Einstein equations
when dark matter was present e.g., [134, 135, 215, 136, 137, 138, 139], and,
post-COBE [140, 141, 142, 143, 159, 302, 306]. With the results of the pre-
COBE computations, a number of otherwise interesting models fell victim to
the data: scale invariant isocurvature cold dark matter models [215]; large
regions of parameter space for isocurvature baryon models [216, 218]; many
broken scale invariant inflation models with enhanced power on large scales
[233, 192]; CDM models with a decaying ($\sim$ keV) neutrino if its lifetime was too
long ($\gtrsim 10$ yr) [233, 252]; constraints on ΩB , Ω and Λ in CDM models [243, 136].
For all of these the strategy was to normalize the anisotropy predictions using
the clustering properties predicted by the model, in particular by σ8 .
Figure 10: Band-power estimates derived for the anisotropy data up to March
1996. The lower panel is a closeup of the first “Doppler peak” region. The
theoretical curves are those of the filter figure, normalized to the dmr4 data:
the standard SR CDM model, the nearly degenerate one with Ων = 0.2 in light
massive neutrinos, the NR CDM model, the H0 = 75 vacuum-dominated model
(upper), the slightly tilted CDM model with a gravity wave contribution, and
the H0 = 60 open model. All are for universes with age 13 Gyr. Although
current data broadly follows the inflation-based expectations, the band-powers
shown may have signals from systematic effects such as sidelobe contamination
or Galactic effects such as bremsstrahlung and dust, as well as the cosmological
primary and secondary anisotropy signals.
Now that we have detections, we normalize spectra to the COBE anisotropy
level, and can now use the data to rule out theories from below as well as from
above. In fig. 10, I use the band-power estimates with their error bars to give a
snapshot of the current data at this time and use it as a vehicle for discussing the
associated experiments. To determine band-powers for an experiment [5, 89],
a local model of $\mathcal{C}_\ell$ is constructed, assumed to be valid over the scale of the
experiment’s average filter $W_\ell$. I usually choose eq. (106) with zero coherence
angle. The once popular $\nu_{\Delta T} = 2$ coherence angle form is rapidly disappearing
from the scene, but it is also easy to transform such results into the band-power
language. As we learn more, a shape that fits the data will be the preferred
form [89].
Because there are so many detections now, fig. 10 is split into two panels
for clarity, the upper giving the overview, the lower focussing on the crucial
first Doppler peak region. Data points either denote the maximum likelihood
values for the band-power, with error bars giving the 16% and 84% Bayesian
probability values (corresponding to ±1σ if the probability distributions were
Gaussian), or are my translations of the averages and errors given by the
experimental groups into this language. Aspects of these statistical techniques are
described in section 4.6. Upper and lower triangles denote 95% confidence
limits unless otherwise stated. The horizontal location is at $\langle\ell\rangle_W$ and the
horizontal error bars denote where the filters have fallen to $e^{-0.5}$ of the maximum
(with fig. 6 providing a more complete representation of sensitivity as a
function of $\ell$). Only wavelength-independent Gaussian anisotropies in $\Delta T/T$ are
assumed to be contributing to the signals, but nonprimary sources (e.g., dust,
synchrotron) may contribute to these $\mathcal{C}_\ell$’s (as can unknown systematic errors
of course). In almost all of these cases, either it has been shown that the
frequency spectrum is compatible with the CMB and incompatible with expected
contaminating foregrounds, or some attempt has been made at cleaning the
observations of residual signals. With residual contaminants, one generally expects
the underlying primary $\mathcal{C}_\ell$ to be lower than the values shown, but it can be
higher because of “destructive interference” among component signals. In the
following, considerable space is devoted to the dmr data since the definitive 4-year
data set has been released and it is so important for normalizing spectra. (A closeup
view of the large angle band-powers is given in fig. 23.)
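The 16% and 84% Bayesian probability values quoted for the data points can be read off a one-dimensional likelihood curve. The following is a minimal sketch, assuming a uniform prior on the band-power amplitude and a likelihood tabulated on a grid; the Gaussian toy likelihood and its numbers are hypothetical and only serve to show that the resulting interval is close to ±1σ in that case.

```python
import numpy as np

def credible_points(param, likelihood, probs=(0.16, 0.50, 0.84)):
    """Parameter values at the given cumulative posterior probabilities.

    Assumes a uniform prior, so the posterior is the normalized likelihood.
    """
    param = np.asarray(param, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    # Trapezoid-rule cumulative integral of the likelihood over the grid.
    steps = 0.5 * (likelihood[1:] + likelihood[:-1]) * np.diff(param)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    return np.interp(probs, cdf, param)

# Toy example: a Gaussian likelihood for an amplitude (hypothetical numbers).
amp = np.linspace(0.5, 1.5, 2001)
like = np.exp(-0.5 * ((amp - 1.0) / 0.07) ** 2)
lo, mid, hi = credible_points(amp, like)
print(f"{mid:.3f} +{hi - mid:.3f} -{mid - lo:.3f}")
```

For a skewed likelihood the same function returns asymmetric upper and lower errors, which is the form in which the band-powers in this section are quoted.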
The $\ell = 2$ power uses the 4-year quadrupole value [85, 86], determined from
high Galactic latitude data. It is the multipole most likely to have a residual
Galactic signal contaminating it, possibly destructively, and the “systematic”
error, the dashed addition to the statistical error bar (solid), reflects this. In
determining the dmr band-power, it is therefore wise to assume that in addition
to any primordial signal, unknown monopole and dipoles, there is an unknown
Galactic quadrupole contamination. Further, the Galactic plane should be
cut out, and other regions of known large Galactic contamination should be
removed as well. I used to take |b| > 25◦ as a safe cut [146], but have now
adopted the |b| > 20◦ customized cut used by the dmr team in its analysis
of the 4-year data [85], which also removes regions found to be high when
correlated with the 140 µm DIRBE map, in particular Ophiuchus and Orion.
The two heavy points at $\ell \sim 7$ are band-powers derived for the 4-year dmr
53+90+31 GHz “A+B” maps, using the methods of section 4.6 [146, 162],
the solid point assuming a $\nu_{\Delta T} = 0$ spectrum, the open marginalizing over all
possible $\nu_{\Delta T}$. (The latter is lower by 6% if a Galactic quadrupole component
is not allowed for.) A good fit as a function of $\nu_{\Delta T}$ to the band-power is
$\langle\mathcal{C}_\ell\rangle^{1/2}_{dmr} \approx [0.82 + 0.26(1 - \nu_{\Delta T}/2)^{2.8}] \times 1^{+.07}_{-.06}$, which is reasonably insensitive to
modest variations in $\nu_{\Delta T}$. Band-powers derived for the “A+B” maps as a
function of frequency are in excellent agreement: the functional form with the
2.8 power fits well, but the coefficients and errors are slightly different: 53 GHz:
0.88, 0.24, $1^{+.08}_{-.07}$; 90 GHz: 0.95, 0.15, $1^{+.11}_{-.10}$; 31 GHz: 0.90, 0.08, $1^{+.16}_{-.14}$. The
band-powers are also remarkably insensitive to making signal-to-noise cuts in the
data (i.e., filtering it). The “A-B” maps are consistent with no signal.
DMR had enough coverage in $\ell$-space that one can estimate the spectral
index $1 + \nu_{\Delta T}$ from the data as well by Bayesian means (marginalizing over
$\langle\mathcal{C}_\ell\rangle^{1/2}_{dmr}$): $1.07^{+.27;.52}_{-.29;.58}$ for 53+90+31 GHz (agreeing with the dmr team’s result
using this method [87] and also with the $1.05^{+.27;.52}_{-.28;.57}$ I determine for the Ecliptic
(as opposed to Galactic) coordinate versions of the maps); $0.97^{+.30;.57}_{-.32;.64}$ for 53+90
GHz; $1.15^{+.31;.59}_{-.33;.66}$ for 53 GHz; $1.27^{+.40;.77}_{-.43;.86}$ for 90 GHz. (First errors are 1σ,
second are 2σ.) Notice the preferred index actually goes down when the 53 GHz
and 90 GHz are added. For the 31 GHz map, there is a residual bremsstrahlung
contribution that lowers the index determination to 0.3, with large errors.
The approximate relation between $\nu_{\Delta T}$ and the primordial tilt $\nu_s$ for the
standard CDM model over the dmr band is $\nu_{\Delta T} \approx 0.15 + \nu_s$, hence the values
of $1 + \nu_{\Delta T}$ are quite compatible with the simplest inflation expectations,
$0.6 \lesssim n_s \lesssim 1$. Indeed when standard CDM models with tilts ranging from 0.5 to 1.5
are considered, the index marginalized over σ8 for the summed map is
$n_s = 1.02^{+.23;.41}_{-.25;.52}$ for the case with no gravity waves ($\nu_t = 0$), and
$n_s = 1.02^{+.23;.40}_{-.18;.46}$ with gravity waves ($\nu_t = \nu_s$). Band-powers for specific
$\ell$ ranges also show the nearly flat character of $\mathcal{C}_\ell$, as the light open points
at $\ell \sim 4, 8, 16$ from [87] show.
Optimally-filtered maps show the same large scale features independent of
frequency. The 53+90+31 GHz version of this in fig. 11 shows the true sky
anisotropy features as revealed by COBE (cleaned of experimental noise). It is
compared with a realization of a scale invariant Ω = 1 dark matter dominated
model which has driven so much of the theory of the last decade. The basic
lesson is that there is a tremendous consistency in the 4-year dmr data set, with
the overall band-power being very well determined and the shape moderately
well determined.
Many papers estimating $Q_{rms,PS}$ (i.e., $C_2$), $\nu_{\Delta T}$, σ8, etc. have been written
using the 1-year and 2-year dmr data, e.g., [90, 84, 256, 257, 89, 6, 146],
[288, 289, 290, 291, 292], which will all now be footnotes to history except for
their importance in developing the statistical techniques that have been applied
to the definitive 4-year dmr data. A variety of different measures, such as
correlation functions and power spectra estimated by quadratic pixel combinations,
and multipole modes and S/N-modes using linear pixel combinations, have been
used, and there is good agreement among the methods [85] and with the results
given here.2
The two points at $\ell \sim 10$ are for the 170 GHz firs map, solid with the
restriction $\nu_{\Delta T} = 0$, open with $\nu_{\Delta T}$ allowed to float. A “Galactic quadrupole”
as well as a model for residual “noise” that exists in the data [89, 6] were
integrated over. The band-power $\langle\mathcal{C}_\ell\rangle^{1/2}_{firs} \approx 1.15 + 0.03(1 + \nu_{\Delta T})^2 \times 1^{+.25}_{-.23}$
is compatible with the COBE value. For example, for the $n_s = 1$ standard
CDM model, $\sigma_8 = 1.27^{+.30,.62}_{-.29,.57}$, compared with the 53+90+31 GHz A+B dmr
map value of $\sigma_8 = 1.20^{+.08,.17}_{-.08,.15}$. This strengthens the case for a CMB origin,
extending the 31-90 GHz band to 170 GHz. The firs team [93] also showed
that a significant cross-correlation with dmr exists. Although the firs coverage in
$\ell$-space is large enough that a spectral index can be estimated, the value I
obtain for $1 + \nu_{\Delta T}$, $1.6^{+.7}_{-.8}$, has very large errors, and the small angle residual
“noise” is probably driving the higher values [89]. Using a correlation function
analysis, which filters some of the residual noise, the firs team [93, 147] derived
similar amplitudes but a smaller $\nu_{\Delta T}$, although quite compatible within the
large error bars.
The Tenerife point [103] at $\ell \sim 20$ uses combined 15 and 33 GHz data,
agrees with the band-power for their data at 15 GHz only, which covered a
much larger region of the sky, and is rather remarkable in view of the relatively
low frequency (fig. 5). The Tenerife results have also been shown to strongly
correlate with the DMR maps.
We now come to the crowded region from two degrees to half a degree. The
lower open circle is from a joint 4-channel analysis of the 9 and 13 point sp91
scans [96, 5, 89, 162] (with the individual 9 point and 13 point values given in
the lower panel). The upper solid is for a simultaneous analysis of all channels of
the sp94 data [148, 162], with separate values for the Ka and Q bands (fig. 5)
in the lower panel. The solid triangle in the upper panel is the sk93 result;
the big solid circle at $\ell \sim 80$ in the lower panel is the sk93+94 result. The
nearness of the sp94, sk93 and sk93+94 band-powers, and the demonstration
for both experiments that the preferred frequency dependence is nearly flat in
$\Delta T/T$ and many sigma away from bremsstrahlung or synchrotron, the expected
contaminants in this 30-40 GHz range, lend confidence that the spectrum in
the $\ell \sim 60$–80 region has really been determined; and it looks quite compatible
with the COBE-normalized CDM spectrum: sp94 gives $\sigma_8 = 1.26^{+.37,.85}_{-.27,.47}$, and
sk93+94 gives $\sigma_8 = 1.21^{+.24,.54}_{-.19,.39}$ [162], very close to the dmr value given above:
2 The 1-year $\langle\mathcal{C}_\ell\rangle^{1/2}_{dmr}$ and σ8 value is rather close to the preferred 4-year estimate, e.g.,
only 3% lower for 53 GHz A+B [89]. For this channel, the 2-year value is about 4% higher,
but for 90 GHz A+B it is 14%, leading to a 12% higher $\langle\mathcal{C}_\ell\rangle^{1/2}_{dmr}$ and a 14% higher σ8 in the
53+90+31 GHz A+B map. Much of this can be attributed to the customized cut: a straight
|b| > 20° cut gives only a 4% discrepancy. The main lesson is that one should disregard
the earlier values and only use the new ones with the customized cuts; further foreground
corrections beyond this do not change the values by much [85].
Figure 11: 140° diameter maps centered on the South and North Galactic Poles
are shown for a realization of a CDM $\mathcal{C}_\ell$-spectrum convolved with the dmr beam
in (c). No noise has been added. This is how the primary sky would appear in
a ns = 1 CDM Universe with σ8 = 1.2 (or in a Ωmν = 0.2 hot/cold universe
with σ8 = 0.8), the most likely values for the dmr data. This is contrasted with
the 4-year dmr (53+90+31)a+b map shown in (a) and the map after the data
has undergone optimal signal-to-noise filtering in (b) (using the same $\mathcal{C}_\ell$-shape
and amplitude for the filter). The statistically significant features are also seen
in each of the dmr channel maps after optimal filtering (which preferentially
removes high angular frequencies, more so for noisier maps). Thus, to compare,
(d) shows the theoretical realization after passing (c) through the same optimal
filter used for (b); the average, dipole and quadrupole of the full |b| > 20° sky
were also removed, an effective low $\ell$-filter; if they stay in, the maps look
similar to the unfiltered theory maps except small scale smoothing leads to
loss of the higher contour levels. Note that the contours are linearly spaced at
±15n µK for all but (a), for which the spacings are ±15, 30, 60, 120 µK. The
maps have been smoothed by an additional 1.66° Gaussian filter.
that this would be so is evident from the curves. The 5 heavy open circle
points probing $\ell$’s ranging from 60 to 400, repeated in the upper and lower panels
and labelled sk95, are combined sk93+94+95 results [150]. The estimated 14% error
in the overall amplitude because of calibration uncertainties associated with
Cas A is included. The large $\ell$-space coverage from this one intermediate
angle experiment gives a first glimpse of the $\ell$-space coverage that will become
standard in the next round of anisotropy experiments.
Python [104], py, the heavy solid curve at $\ell \sim 90$, is sensitive to a wide
coverage in $\ell$-space as the horizontal error bars in the top panel indicate. Argo
[105], ar, a balloon-borne experiment, is next. The next five points in the lower
panel are from the fourth and fifth flights of MAX [100, 99], max4, max5,
another balloon experiment. Because the filters changed with frequency, the
points are placed at the average over all max filters. In the upper panel three
max4 scans are combined into one data point as are two max5 scans. The
lines ending in triangles at $\ell \approx 145$ and 240 denote the 90% limits for the
MSAM [101] single (msam2) and double (msam3) difference configurations. A
limitation on these balloon experiments is the ∼ 5 hours over which data can
be effectively taken. Planned long duration balloon flights that would circle
Antarctica for about a week would allow extensive mapping at high precision
to be done, and a number of groups have been proposing designs labelled ACE,
Boomerang, Maxima, Top Hat.
The CAT points at $\ell \approx 400$ and 600 represent a very different experimental
technique, interferometry, so I will discuss the approach in some detail.
CAT is a 3-element synthesis telescope, probing ∼ 15 GHz frequencies with a
27′ synthesized beam and a 2° field-of-view (the fwhm of the individual
telescopes). It is a precursor to the larger VSA (Very Small Array), covering a
wider frequency range with more telescopes and a larger (4°) FOV. Two other
CMB interferometers are also planned: CBI (fig. 16) and VCA. Interferometers
directly measure the Fourier amplitudes $\widetilde{\Delta T}(\mathbf{Q})$ for wavevectors $\mathbf{Q}$ associated
with the baseline separation of the telescopes; $\mathbf{Q}$ rotates with the rotation of
the earth and with many movable telescopes many $|\mathbf{Q}|$ can be probed as well.
Analysis is most naturally done in $\mathbf{Q}$-space with the power spectrum a direct
outcome. The phase information can be used to reduce atmospheric
contamination. Maps sensitive within the FOV can be made using methods such as
maximum entropy reconstruction. The CAT team have done this. The low
frequencies of CAT imply that radio source contamination is a problem: these
sources are found using the higher resolution RYLE interferometer (section 5.3.4) and
subtracted from the CAT data.
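To give a feel for the link between baseline separation and the wavevector probed, the sketch below uses the standard flat-sky relation $|\mathbf{Q}| \approx \ell \approx 2\pi d/\lambda$ for a baseline of projected length $d$ observed at wavelength $\lambda$. This relation is an assumption brought in for illustration, not a formula stated in this section, and the baseline lengths it returns are not claimed to be those of CAT or any other instrument.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def multipole_for_baseline(baseline_m, freq_ghz):
    """Flat-sky estimate ell ~ 2 pi d / lambda (assumed relation)."""
    wavelength_m = C_LIGHT / (freq_ghz * 1.0e9)
    return 2.0 * math.pi * baseline_m / wavelength_m

def baseline_for_multipole(ell, freq_ghz):
    """Projected baseline length (m) that probes multipole ell at the given frequency."""
    wavelength_m = C_LIGHT / (freq_ghz * 1.0e9)
    return ell * wavelength_m / (2.0 * math.pi)

for ell in (400, 600):
    d = baseline_for_multipole(ell, 15.0)
    print(f"ell = {ell}: projected baseline of about {d:.2f} m at 15 GHz")
```

At these frequencies, metre-scale baselines already reach several hundred in $\ell$, which is why a compact array of small telescopes suffices.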
The ovro experiments are also at radio frequencies, but use single dishes.
The 1987 ovro 7 point upper limit [94] used a 40 meter dish. A 5 meter dish
has been used for the larger scale ovro22 experiment (fig. 6) that has detections
awaiting the cleaning of radio sources found with the 40 meter dish. WhiteDish
[102] had a small amplitude filter function, fig. 6, a hint of a detection in the
m = 1 mode and a 95% limit in m = 2 mode at $\ell \approx 520$, wd2. The open
triangle at $\ell \approx 160$ is the (historically important) 95% credible limit for the
sp89 9 point scan [95, 243].
4.6 Measuring cosmological parameters with the CMB

In the future we will be able to strongly select the preferred theories by simul-
taneously analyzing experiments like these. Although combining the statistics
for a number of experiments was quite effective when we just had upper lim-
its (e.g., using sp89 and ov7 in [243]) and interesting when we had a mix of
detections and upper limits (e.g., using dmr1, sp91, sp89 and ov7 [140]), it
can also be quite misleading unless we are careful to include secondary back-
grounds, foregrounds, instrumental systematics and calibration uncertainties
as well as primary anisotropies in our model for ∆p , or can demonstrate that
they are absent. Band-power diagrams such as fig. 10 are very useful guides in
the evolving progress towards a primary $\mathcal{C}_\ell$ spectrum, and help to inhibit
theorists from over-interpreting the cosmological consequences of the current data.
Some of the datasets have now been shown convincingly to be consistent with a
CMB rather than a foreground or systematic origin, and this warrants a return
to the multiresolution approach, since so much more can be determined using
long baselines in $\ell$-space. In this subsection I illustrate how the multiresolution
approach works with two exercises. The first, fig. 12, taken from [140], shows
what the near future looked like as seen from summer 1993, with experiments
still characterized by a narrow $W_\ell$ because of beam-to-throw constraints, like
most experiments in fig. 10. Although the prognosis was good, an approximate
degeneracy in the parameter space was identified that showed that apart from
an overall amplitude (e.g., σ8 ) only a single parameter, ν̃s (described below),
was really well determined by few-scan data [144]. The second illustration,
fig. 13, shows how well all-sky high-resolution satellite experiments can do,
as examples, albeit best-case ones, of what low noise maps of contiguous re-
gions may achieve: the approximate degeneracy is broken and a large number
of cosmological parameters can be simultaneously determined with relatively
high accuracy. Less spectacular, in $\ell$ range and sky coverage, but not by a
huge amount, are projections for what will be achievable in $\ell$-space from long
duration balloon and ground-based interferometry experiments. In between,
statistical methods are briefly discussed.
In [140], simulations of data sets with assumed noise levels were constructed
for experiments probing $\ell \lesssim 15$ (dmr), $\ell \sim 100$ (sp91), $\ell \sim 200$ (sp89) and
$\ell \sim 500$ (ov22). The noise levels were being achieved even then and the number
of scans chosen for each configuration is conservative. The input signal was
a model with $r_{ts} = 1$, equal scalar and tensor quadrupole powers and tilts
$\nu_s = \nu_t = -0.15$, with spectrum shown in fig. 12(a). The issue addressed was
how well the input signal could be recovered as one progressively relaxed what
was known. What fig. 12(b) shows is that if we happened to know $\nu_s$ and $\nu_t$,
recovery of the overall amplitude is excellent, and recovery of the ratio $r_{ts}$ is
also quite good. (With only dmr data, $r_{ts}$ cannot be determined because the
tensor and scalar spectra look the same apart from $\ell = 2$.) Allowing $\nu_s$ to vary
Figure 12: A vintage simulation of how the multiresolution
combination of experiments can determine cosmological parameters, from
Crittenden et al. (1993). (a) $C_\ell/C_2$ for tensor, scalar and the sum for a tilted
($\nu_s = \nu_t = -0.15$) but otherwise standard CDM model with normal
recombination. Hot/cold hybrid models look quite similar. The light dashed line is an
ΩB = 0.01 model. The rest of the panels show contour maps in parameter space
derived from simulated large and small angle data consisting of the dmr
correlation function (with error bars appropriate to 4 years of data), six 13 point sp91
strips (1.5° beam, 18–27 µK error bars for each of the 4 frequency channels),
six 9 point sp89 strips (0.5° beam, 15 µK error bars) and one ov22 strip (7′
beam, 22′ throw, 25 µK error bars). Error bars as good or better than these are
now being achieved. The mean signal input into the simulated data is denoted
by the square; “x” denotes maximum likelihood. (b) Shows 1, 2 and 3 sigma
contour lines in the scalar ($[C_2^{(S)}]^{1/2}/10^{-5}$) versus tensor ($[C_2^{(T)}]^{1/2}/10^{-5}$)
amplitude plane, assuming the index 0.85 is known. (c) Shows 1, 2 and 3 sigma
likelihood contours for the simulated data in $[C_2^{(S)}]^{1/2}$–$n_s$ space, constrained
to obey the $C_2^{(T)}/C_2^{(S)} \approx -7\nu_s$ and $\nu_t = \nu_s$ relationship (solid) and with this
ratio unconstrained, but $\nu_t = \nu_s$. Shading indicates the range for which CDM
models are not dynamically viable based on σ8. Without other information
such as this, one recovers well only one parameter, a combination of $[C_2^{(S)}]^{1/2}$
and $n_s$, while parameters orthogonal to this have wide error bars. This
ambiguity increases when the space is opened up to encompass more cosmological
parameters.
in fig. 12(c) shows that we can get one direction in parameter space very well,
that corresponding to the ν̃s variable of eq. (108), but the orthogonal one is
sloppy. However, if we accept that we know the relationship between νt and
νs , which can be computed for any specific inflation model, e.g., νt ≈ νs , with
rts then following, then recovery is excellent. Using the 4-year dmr data in
conjunction with just the sk94 and sp94 data gives error bars on ns that are
similar to what this simulated exercise gave [162]. Another important point
to note in fig. 12(c) is that information on the allowed value of σ8 taken from
cluster observations tightens up the precision with which the parameters are
determined. This is addressed more fully in section 7.3. However, if we open
up the parameter space to include variations in ΩΛ , h, zreh , etc. – as in fig. 8
– then more ambiguity arises.
Superposed upon the spectra in figs. 7 and 8 are theoretical band-powers
derived for a variety of anisotropy experiments. Figure 7 also shows 10% one-
sigma error bars: with 4 years of data, the dmr band-power errors are 14%; to
achieve this with smaller angle experiments one would need to have about the
same number of pixels as COBE, but scaled to the beam size hence covering
a smaller region of the Universe: that is, mapping experiments on smaller
angular scales. Even if there were idealized perfect all-sky coverage with noise-free
versions of the experiments of fig. 7, there would still be cosmic variance
errors on the band-powers to limit the accuracy. These go as $\langle\ell\rangle^{-1}$ [89] even for
quite narrow bands, as is shown below. Thus it appears that by using (nearly
perfect) CMB experiments which are sensitive to a wide range of angular scales,
we might expect to distinguish even among the nearly degenerate theoretical
models shown in fig. 8(f), and be able to measure the parameters that define
the variations in these models.
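The $\langle\ell\rangle^{-1}$ scaling of the cosmic variance error can be illustrated with the standard Gaussian-sky mode-counting estimate, in which a band of width $\Delta\ell$ centred on $\ell$ contains about $f_{\rm sky}(2\ell+1)\Delta\ell$ independent modes and the fractional band-power error is $\sqrt{2/N_{\rm modes}}$. This estimate is an assumption introduced here for illustration and is not the derivation given in these notes; the fractional band width and sky fraction are hypothetical inputs.

```python
import math

def cosmic_variance_fraction(ell_mean, frac_width, f_sky=1.0):
    """Fractional band-power error for a noise-free map (mode-counting estimate).

    ell_mean   : central multipole of the band
    frac_width : band width as a fraction of ell_mean (Delta ell / ell)
    f_sky      : fraction of sky covered
    """
    n_modes = f_sky * (2.0 * ell_mean + 1.0) * frac_width * ell_mean
    return math.sqrt(2.0 / n_modes)

for ell_mean in (10, 100, 200, 400):
    err = cosmic_variance_fraction(ell_mean, frac_width=0.3)
    print(f"<ell> = {ell_mean:4d}: fractional error ~ {err:.3f}")
```

With a fixed fractional band width the error halves each time the central multipole doubles, which is the inverse-$\langle\ell\rangle$ behaviour quoted above.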
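The near-degeneracy is especially prevalent through the first Doppler peak.
In [144], we showed that for small variations about the “standard” CDM model,
with $n_s = 1$, $\Omega = \Omega_{nr} = 1$, $\Omega_B = 0.0125h^{-2}$ (from big bang nucleosynthesis),
the height of the first Doppler peak relative to the dmr band-power is (within
∼ 15%)
\[
\frac{\mathcal{C}_\ell|_{max}}{\langle\mathcal{C}_\ell\rangle_{dmr}} \approx 5\, e^{3.6\tilde\nu_s}\,, \tag{108}
\]
\[
\tilde\nu_s \approx \nu_s - \frac{\ln(1 + \tilde r_{ts})}{3.6}
 - 0.5\left[\Omega_{nr}^{1/2} h - \tfrac{1}{2}\right]
 + 0.08\left(\frac{\Omega_B h^2}{0.0125} - 1\right)
 - \left(\frac{z_{reh}}{200}\right)^{3/2} .
\]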
The nature of the tensor reduction term is clear. The $\Omega_{nr}^{1/2} h$ term follows from
a strong dependence on the redshift at equal energy densities in relativistic and
non-relativistic matter or $a_{eq}$, eq. (69). This term shows that the height of the
peak goes up as $\Omega_{vac} = 1 - \Omega_{nr}$ goes up, quite dramatically for fixed h, but not
by much for models with fixed age, since h goes down, as fig. 8(d) illustrates. $a_{eq}$
also varies with the relativistic energy density, $\Omega_{er}$, if it is not the “standard”
value with three massless neutrino species; if not, $\Omega_{nr}^{1/2} h$ should be divided
by $[\Omega_{er}/(1.68\Omega_\gamma)]^{1/2}$ in eq. (108). (See [158] for small variations breaking the
simple $a_{eq}$ degeneracy, and for another form for $\tilde\nu_s$.) The reionization term
is simply related to the depth to Compton scattering from the re-ionization
redshift $z_{reh}$ to the present by $2\zeta_C/3.6 \propto (z_{reh}/z_{\zeta_C=1})^{3/2}$, where
$z_{\zeta_C=1} \approx 10^{2.1}\, (\Omega_B h/0.02)^{-2/3}\, \Omega_{nr}^{1/3}$ (eq. 72), and so depends on $\Omega_{nr} h^2$ (and on $\Omega_B h^2$).
$z_{reh}$ must be $\lesssim 150$ to have a local maximum, as fig. 8(e) shows.
In [144], we fixed ΩB h2 at 0.0125, but, as fig. 8(b) shows, a linear depen-
dence in ν̃s on ΩB gives the variation of the peak height to sufficient accuracy.
However, fig. 8(b) also shows that the relative heights of the secondary Doppler
peaks are sensitive to ΩB , so the approximate degeneracy is broken in the vari-
able ΩB h2 . It is also broken by Ωtot , since the position of the peak, determined
by the angle-distance relation, changes. The formula eq. (108) shows that a
model with no gravity wave contribution but ns ≈ 0.88 has a spectrum that is
almost degenerate with the ns = 0.95, r̃ts = 0.3 spectrum, so much so that it
will be difficult to tell them apart. We argued that the precision required to
separately determine ns , r̃ts , ΩΛ , . . . was too high for what was then the near-
term future, but ν̃s could be determined accurately, and that to separate the
various contributions to ν̃s in the near term would require other cosmological
experiments, e.g., measuring the scalar perturbation shape through galaxy–
galaxy power spectra and amplitude through cluster abundances or streaming
velocities (section 7.3); and, in some happy future, determining H0 definitively.
In the future, NASA and ESA high precision CMB space experiments should
achieve the sensitivities necessary using CMB anisotropy information alone
[161, 152, 154].
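The near-degeneracy quoted above can be checked directly from eq. (108). The sketch below is a minimal transcription of that formula, using $n_s = 1 + \nu_s$; the function names are illustrative, and the default arguments are the “standard” model values of the text together with the assumption that no reionization corresponds to $z_{reh} = 0$.

```python
import math

def nu_tilde_s(nu_s=0.0, r_ts=0.0, omega_nr=1.0, h=0.5,
               omega_b_h2=0.0125, z_reh=0.0):
    """Effective tilt of eq. (108); defaults give the standard CDM reference model."""
    return (nu_s
            - math.log(1.0 + r_ts) / 3.6
            - 0.5 * (math.sqrt(omega_nr) * h - 0.5)
            + 0.08 * (omega_b_h2 / 0.0125 - 1.0)
            - (z_reh / 200.0) ** 1.5)

def peak_to_dmr(nu_tilde):
    """Height of the first Doppler peak relative to the dmr band-power, eq. (108)."""
    return 5.0 * math.exp(3.6 * nu_tilde)

tilted_with_gw = nu_tilde_s(nu_s=-0.05, r_ts=0.3)   # n_s = 0.95 with gravity waves
tilted_no_gw = nu_tilde_s(nu_s=-0.12, r_ts=0.0)     # n_s = 0.88 without gravity waves

print(peak_to_dmr(nu_tilde_s()))                     # reference model
print(peak_to_dmr(tilted_with_gw), peak_to_dmr(tilted_no_gw))
```

The two tilted models give peak heights that agree to about a percent, well inside the roughly 15% accuracy quoted for the formula itself, so the peak height alone cannot separate them.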
Fig. 13 gives a closeup view of how very fine differences in the theoretical
$\mathcal{C}_\ell$ can be measured using detector sensitivities and the long observing times
appropriate for satellite experiments feasible with present technology [152, 153,
154].
To discuss how cosmic variance, experimental noise, and sky coverage af-
fect the estimation of the predicted band-powers, it is worthwhile to make
a brief excursion into statistical analysis. For the CMB data sets that have
been obtained up to now, including COBE, it has been possible to do a relatively
complete Bayesian statistical analysis [155] if the primary anisotropies
are Gaussian and the non-Gaussian Galactic foregrounds are not large. The
goal is to determine the best error bars on the parameters of a target set of
theories with correlation matrices $C_{T pp'}$, by first determining the likelihood
function L for each theory, and then comparing the likelihoods as a function
of the parameters. To give preferred values and errors for a specific cosmolog-
ical parameter of interest such as the Hubble parameter, one often integrates
(marginalizes) over the other parameters, such as σ8 and ns , assuming a prior
probability distribution, which can be a statement of a priori maximal igno-
rance, or take into account constraints from other information such as large
scale structure observations, as is done in section 7.3.
A useful method for likelihood determination is to expand in signal-to-
noise eigenmodes [146], those linear combinations of pixels which diagonalize
69
−1/2 −1/2
the matrix Cn CT Cn , where the noise correlation matrix Cn = CD + Cres
consists of the pixel errors CD and the correlation of any unwanted residuals
Cres , whether of known origin such as Galactic or extragalactic foregrounds or
unknown extra residuals within the data.3
The S/N-mode basis facilitates the many $N_{\rm pix} \times N_{\rm pix}$ matrix inversions of
$C_t \equiv C_n + C_T$ involved in evaluating the likelihood function,
$$ \ln \mathcal{L}(\sigma_{\rm th}) = -\tfrac{1}{2}\, \overline{\Delta}^\dagger (C_n + C_T)^{-1} \overline{\Delta} - \tfrac{1}{2}\, {\rm Trace} \ln(C_n + C_T) - N_{\rm pix} \ln \sqrt{2\pi} , \qquad (109) $$
as a function of an overall amplitude $\sigma_{\rm th} \propto [{\rm Trace}(C_T)]^{1/2}$ (e.g., $\sigma_8$, $C_2^{1/2}/10^{-5}$,
$\langle C_\ell \rangle_{\rm dmr}^{1/2}/10^{-5}$). Here † denotes transpose. (This form of the likelihood function
assumes a Gaussian distribution of errors and that the target signal and
residuals are also Gaussian-distributed. To derive it, integrate $\delta^{(N_{\rm pix})}(\overline{\Delta} - \Delta)$
over each $\Delta_A$ probability distribution, $A = n, T, {\rm res}$. The total $\Delta$ coming in
to the detector is modelled as $\Delta = \Delta_n + \Delta_T + \Delta_{\rm res}$, each with a distribution
$\exp[-\tfrac{1}{2} \Delta_A^\dagger C_A^{-1} \Delta_A] / ((2\pi)^{N_{\rm pix}} \det[C_A])^{1/2}$. If the target signal or any of the
residuals has a non-Gaussian distribution, the integrations cannot usually be
done and Monte Carlo treatments of the statistics become necessary.)
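To make the structure of eq. (109) concrete, here is a minimal Python sketch that evaluates the Gaussian log-likelihood for a small pixel set through a Cholesky factorization. The function names are hypothetical and the dense list-of-lists matrices are purely illustrative; production codes would work in the S/N-mode basis or use an optimized linear algebra library.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def log_likelihood(delta, c_n, c_t):
    """Eq. (109): -1/2 d^T C^-1 d - 1/2 Trace ln C - N_pix ln sqrt(2 pi), with C = C_n + C_T."""
    n = len(delta)
    c = [[c_n[i][j] + c_t[i][j] for j in range(n)] for i in range(n)]
    low = cholesky(c)
    # Forward substitution: solve L z = delta, so that delta^T C^-1 delta = z . z
    z = []
    for i in range(n):
        z.append((delta[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    chi2 = sum(v * v for v in z)
    trace_ln = 2.0 * sum(math.log(low[i][i]) for i in range(n))  # Trace ln C = ln det C
    return -0.5 * chi2 - 0.5 * trace_ln - n * math.log(math.sqrt(2.0 * math.pi))

def scan_amplitude(delta, c_n, c_t_unit, amplitudes):
    """ln L as a function of an overall amplitude sigma_th, with C_T = sigma_th^2 * c_t_unit."""
    n = len(delta)
    out = []
    for amp in amplitudes:
        c_t = [[amp * amp * c_t_unit[i][j] for j in range(n)] for i in range(n)]
        out.append(log_likelihood(delta, c_n, c_t))
    return out
```

Scanning `scan_amplitude` over a grid of amplitudes and locating the maximum is the simplest version of the amplitude estimation described in the text.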
Constraints such as averages, gradients (dipoles, quadrupoles) and known
templates, which may be frequency dependent (e.g., IRAS or DIRBE dust
maps) can also be modelled in the total $\Delta$, as “nuisance variables” to be
integrated (marginalized) over. Denoting each constraint $c$ on pixel $p$ by $\kappa_c \Upsilon_{pc}$,
where the template for constraint $c$ is $\Upsilon_{pc}$ (e.g., the $F_{p,1m}$ and $F_{p,2m}$ of eq. (91)
for the dipole and quadrupole) and the amplitude is $\kappa_c$, we need only replace
$\overline{\Delta}_p$ in eq. (109) by $\overline{\Delta}_p - \sum_c \Upsilon_{pc} \kappa_c$, then integrate, assuming a prior
probability distribution for the amplitudes $\kappa_c$. This is most easily done if we assume
the $\kappa_c$ are also Gaussian-distributed with a very broad distribution reflecting
our ignorance of their values (or if we know their likely range, incorporating that
as prior information in the Gaussian spreads). The integration over $\kappa_c$ then
yields
$$ \ln \mathcal{L}_{+C} = \ln \mathcal{L} + \tfrac{1}{2}\, \overline{\Delta}^\dagger [\Upsilon^\dagger C_t^{-1}]^\dagger (K^{-1} + \Upsilon^\dagger C_t^{-1} \Upsilon)^{-1} [\Upsilon^\dagger C_t^{-1}] \overline{\Delta} - \tfrac{1}{2}\, {\rm Trace} \ln(I + K \Upsilon^\dagger C_t^{-1} \Upsilon) , \qquad (110) $$
3 The $N_{\rm pix}$ modes $\xi_k = \sum_{p=1}^{N_{\rm pix}} (R C_n^{-1/2})_{kp} (\Delta T/T)_p$, having the “dimensions” of signal-to-noise,
can be expanded into noise $n_k$, signal $s_k$, residual “noise” ${\rm res}_k$ not accounted
for by $C_n$, and any further “constraints” $c_k$ (residual dipoles, quadrupoles, etc.): $\xi_k =
s_k + n_k + c_k + {\rm res}_k$ [5, 89, 6, 146]. Here $R$ is a rotation matrix. In this basis, the noise
and signal have diagonal correlations: $\langle n_k n_{k'} \rangle = \delta_{kk'}$, $\langle s_k s_{k'} \rangle = E_{TR,k}\, \delta_{kk'}$. The great
simplification of orthogonality, i.e., no mode-mode correlations, is destroyed somewhat by
off-diagonal terms in the $\langle c_k c_{k'} \rangle$ and $\langle {\rm res}_k {\rm res}_{k'} \rangle$ (if they are not fully modelled by $C_n$). The
modes are sorted in order of decreasing S/N-eigenvalues, $E_{TR,k}$, so low $k$-modes probe the
theory in question best. This expansion is a complete (unfiltered) representation of the map.
In S/N-filtering, only restricted ranges in this $k$-space are kept. The sum of $\xi_k^2$ over bands
in S/N-space defines a S/N-power spectrum which gives a valuable picture of the data and
shows how well the target theory fares [89, 146, 162].
where $K_{cc'} = \langle \kappa_c \kappa_{c'} \rangle$ is the assumed prior variance for the constraint amplitudes.
Evaluating this involves only $N_C \times N_C$ matrix inversions, where $N_C$ is
small compared with $N_{\rm pix}$. Taking into account constraints with amplitudes
that are not linear multipliers times the template is much more complex.
An equivalent expression to eq. (110) for $\ln \mathcal{L}_{+C}$ takes the form eq. (109)
but with $C_n$ replaced by $\widetilde{C}_n \equiv C_n + \Upsilon K \Upsilon^\dagger$. The constraint portion of the
matrix is just $\langle \sum_c \kappa_c \Upsilon_{pc} \sum_{c'} \kappa_{c'} \Upsilon_{p'c'} \rangle$. The span of the templates $\Upsilon_{pc}$ defines
a subspace in the data. As the eigenvalues of $K$ become very large, the effect of
the constraint matrix is to project onto the data subspace orthogonal to that
spanned by $\Upsilon_{pc}$. Although one can directly use the likelihood equation in this
projection limit (using $\delta^{(N_c)}(\kappa)$ for the constraint prior), it is computationally
simpler to use the Gaussian prior.
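The equivalence of the two forms can be checked numerically. The following Python sketch, with hypothetical function names and restricted for brevity to two pixels and a single template ($N_C = 1$), evaluates the likelihood both through the correction terms of eq. (110) and through eq. (109) with the covariance augmented by $\Upsilon K \Upsilon^\dagger$. It also illustrates the projection limit: for a very broad prior, adding a multiple of the template to the data leaves the likelihood essentially unchanged.

```python
import math

def inv2(m):
    """Inverse and determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]], det

def ln_like(delta, c):
    """Eq. (109) for a 2x2 total covariance c = C_n + C_T."""
    ci, det = inv2(c)
    chi2 = sum(delta[i] * ci[i][j] * delta[j] for i in range(2) for j in range(2))
    return -0.5 * chi2 - 0.5 * math.log(det) - 2 * math.log(math.sqrt(2 * math.pi))

def ln_like_constrained(delta, c_total, template, k_prior):
    """Eq. (110) for one template with Gaussian prior variance k_prior."""
    ci, _ = inv2(c_total)
    u = [sum(ci[i][j] * template[j] for j in range(2)) for i in range(2)]  # C_t^-1 Upsilon
    ucu = sum(template[i] * u[i] for i in range(2))    # Upsilon^T C_t^-1 Upsilon
    proj = sum(u[i] * delta[i] for i in range(2))      # Upsilon^T C_t^-1 Delta
    return (ln_like(delta, c_total)
            + 0.5 * proj * proj / (1.0 / k_prior + ucu)
            - 0.5 * math.log(1.0 + k_prior * ucu))

def ln_like_augmented(delta, c_total, template, k_prior):
    """Eq. (109) with the covariance replaced by c_total + Upsilon K Upsilon^T."""
    c = [[c_total[i][j] + k_prior * template[i] * template[j] for j in range(2)]
         for i in range(2)]
    return ln_like(delta, c)

c_total = [[2.0, 0.5], [0.5, 1.5]]
monopole = [1.0, 1.0]            # a constant ("average") template
data = [0.3, -1.2]
a = ln_like_constrained(data, c_total, monopole, 100.0)
b = ln_like_augmented(data, c_total, monopole, 100.0)   # agrees with a
```

Only a scalar inverse is needed in `ln_like_constrained` because there is a single constraint, which is the $N_C \times N_C$ economy mentioned above.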
Which form of the likelihood to use depends upon the application: using
eq. (110), one can vary the number of constraints to include without recomputing
the S/N modes associated with $C_n^{-1/2} C_T C_n^{-1/2}$ (e.g., allowing for a Galactic
quadrupole contamination in the dmr data or not); for a fixed but large number
of constraints, the eq. (109) form is better, using S/N modes associated with
$\widetilde{C}_n^{-1/2} C_T \widetilde{C}_n^{-1/2}$.
Many of the determinations of the band-powers and their error bars shown
in Fig. 10 were facilitated by S/N mode expansions. This technique has the
highest sensitivity to the data, but a byproduct is that it is also sensitive to low
level residuals. If these exist, they can sometimes be removed by signal-to-noise
filtering, getting rid of modes that are very insensitive to the class of theories
being tested.
A strong indication of the robustness of the dmr data set is the insensitiv-
ity of the band-powers to the degree of signal-to-noise filtering and to which
frequencies are probed (section 4.5). This S/N -filtering is a form of data com-
pression: when the eigenmodes are rank-ordered by decreasing eigenvalues,
one usually finds that only the moderate to high S/N-modes (e.g., ∼ 10% for
COBE) probe the target theory well and the rest must be consistent with noise
[89, 6, 146, 162]; and if they are not, filtering out the high S/N-modes leaves
offending residuals whose nature can then be explored [89, 6, 146].
Filtering using S/N -modes has a long history in signal processing where it
is called the Karhunen-Loeve method [156], and it is now being widely adopted
for analysis of astronomical databases.
When the number of pixels becomes too large, statistical compromises are
necessary because the eigenvectors of the full S/N matrices cannot be determined.
An all-sky experiment with 10′ resolution will have more than a million
pixels per frequency channel, and long duration balloon experiments will have
tens of thousands of pixels. Exploring how to best estimate power spectra and
cosmological parameters given computational constraints by first projecting
the data onto well chosen smaller subsets is sure to become a very active area.
This happy day of too many pixels is now upon us.
The optimal (Wiener) filtering shown in fig. 11 is an immediate byproduct
of the S/N-eigenmode expansion [89, 157, 146]: given observations $\overline{\Delta}_p$, the
mean value and variance matrix of the desired signal $\Delta_{Tp}$ are [232]
$$ \langle \Delta_T | \overline{\Delta} \rangle = C_T C_t^{-1} \overline{\Delta} , \qquad \langle \delta\Delta_T \otimes \delta\Delta_T | \overline{\Delta} \rangle = C_T C_t^{-1} (C_t - C_T) . \qquad (111) $$
The mean field, $\langle \Delta_{Tp} | \overline{\Delta} \rangle$, is the optimally-filtered map. The operator
multiplying $\overline{\Delta}$ is the Wiener filter. If the map is very sensitive to the assumed $C_T$
or if the fluctuation, $\delta\Delta_{Tp} = \Delta_{Tp} - \langle \Delta_{Tp} | \overline{\Delta} \rangle$, of the signal about the mean
is large in some region of space or on some resolution scale, then this tells us
that the data there are not yet good enough in the optimally-filtered maps to
identify real structures on the sky with this method. (Marginalization over the
constraints is incorporated into the mean field and variance by adding $\Upsilon K \Upsilon^\dagger$
to $C_n$ and thus $C_t$ [146].)
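In the S/N-eigenmode basis the Wiener filter becomes a simple mode-by-mode weighting, since there $C_n \to 1$ and $C_T \to E_k$, so that $C_T C_t^{-1} \to E_k/(1+E_k)$. The Python sketch below (the function name is hypothetical, and constraints and unmodelled residuals are assumed absent) applies eq. (111) in that basis.

```python
def wiener_in_sn_basis(xi, eigenvalues):
    """Apply eq. (111) mode by mode in the S/N basis.

    xi          : observed S/N-mode amplitudes (signal plus unit-variance noise)
    eigenvalues : S/N eigenvalues E_k of the assumed theory
    Returns (mean_signal, residual_variance), one entry per mode.
    """
    # Mean field: C_T C_t^-1 applied to the data, with C_t = 1 + E_k
    mean = [e / (1.0 + e) * x for x, e in zip(xi, eigenvalues)]
    # Variance about the mean: C_T C_t^-1 (C_t - C_T) = E_k / (1 + E_k)
    var = [e / (1.0 + e) for e in eigenvalues]
    return mean, var

# High-S/N modes pass almost unchanged; pure-noise modes (E_k = 0) are removed.
mean, var = wiener_in_sn_basis([2.0, 2.0, 2.0], [9.0, 1.0, 0.0])
```

Rotating the filtered mode amplitudes back to pixel space with $C_n^{1/2} R^\dagger$ would give the optimally-filtered map.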
To get an idea of how experimental noise and sky coverage affect the
estimation of the predicted band-powers, we consider an experiment with noise
matrix $C_{D pp'} = \sigma_{\rm pix}^2 \delta_{pp'}$, with the per-pixel error $\sigma_{\rm pix}$ independent of the pixel
position (i.e., homogeneous uncorrelated noise). Suppose first that the pixels
are sufficiently separated that $C_{T pp'} \approx 0$ for $p \ne p'$, i.e., that only $\overline{W}_\ell$ is an
effective probe of $C_\ell$. For large $N_{\rm pix}$, the 1-sigma uncertainty in the experimental
value of the band-power about the maximum likelihood value, $\langle C_\ell \rangle_{B,{\rm maxL}}$,
is [89]
$$ \Delta \langle C_\ell \rangle_B = \sqrt{2/N_{\rm pix}}\, \left[ \langle C_\ell \rangle_{B,{\rm maxL}} + \sigma_{\rm pix}^2 / \mathcal{I}[\overline{W}_\ell] \right] . \qquad (112) $$
For large $N_{\rm pix}$, the observed maximum likelihood will fluctuate from $\langle C_\ell \rangle_{B,{\rm th}}$,
the theoretical quantity we want, but the error bars of eq. (112) include these
realization-to-realization fluctuations (thus $\sqrt{2}$ appears, not 1). To get 10%
error bars as in fig. 7 requires low experimental noise and $N_{\rm pix} \sim 200$
“independent” pixels, i.e., a mapping experiment.
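The pixel count quoted here follows directly from eq. (112). A minimal Python sketch, with hypothetical function names and the noise expressed as the ratio $\sigma_{\rm pix}^2/(\mathcal{I}[\overline{W}_\ell] \langle C_\ell \rangle_B)$, evaluates the fractional error and inverts it for the number of independent pixels required.

```python
import math

def bandpower_fractional_error(n_pix, noise_to_signal=0.0):
    """Fractional 1-sigma band-power error from eq. (112).

    noise_to_signal : sigma_pix^2 / (I[W_l] <C_l>_B); zero for a noise-free experiment.
    """
    return math.sqrt(2.0 / n_pix) * (1.0 + noise_to_signal)

def pixels_needed(target_fraction, noise_to_signal=0.0):
    """Invert eq. (112) for the number of independent pixels."""
    return 2.0 * ((1.0 + noise_to_signal) / target_fraction) ** 2

n_noise_free = pixels_needed(0.10)          # about 200 pixels for 10% errors
n_equal_noise = pixels_needed(0.10, 1.0)    # four times as many when noise equals signal
```

The quadratic dependence on $(1 + \hbox{noise-to-signal})$ shows why low experimental noise matters as much as the number of pixels.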
In a mapping experiment, the pixels will be adjacent and off-diagonal
correlations in $C_{T pp'}$ are very important, but for a large enough contiguous region
and simple observing strategies this can be adequately treated with an
expansion in the $a_{\ell m}$ (or Fourier) modes. With uniform weighting and all-sky
coverage, the S/N-modes are just the independent ${\rm Re}(a_{\ell m})$ and ${\rm Im}(a_{\ell m})$. For
each $\ell$, there is a $(2\ell + 1)$ degeneracy, an effective pixel number for $\ell$-modes. If
only a fraction $f_{\rm sky}$ of the sky is covered, then for high $\ell$, so that the angular
scale $\ell^{-1}$ is small compared with the patch probed, the effective pixel number
scales by $f_{\rm sky}$. Thus, for each $\ell$, we have
$$ \Delta C_{T\ell} \approx \frac{\sqrt{2}}{\sqrt{(2\ell + 1) f_{\rm sky}}}\, (C_{T\ell} + C_{{\rm res},\ell} + C_{D\ell} B_\ell^{-2}) , \qquad (113) $$
$$ C_{D\ell} \equiv \frac{\ell(\ell + 1) \sigma_\nu^2}{2\pi} , \qquad \sigma_\nu \equiv \sigma_{\rm pix} \varpi_{\rm pix} . $$
Thus the cosmic variance for each $\ell$ goes as $Q^{-1/2}$, where, as usual, $Q \equiv \ell + \frac{1}{2}$.
The filter function associated with the beam is $B_\ell$. It has been divided out
to show that the effective noise level in $C_\ell$ determination picks up enormously
above $\ell_s \sim (0.425\, \theta_{\rm fwhm})^{-1}$. For fixed experimental parameters, the combination
$\sigma_\nu$ remains the same as the pixel size is varied.
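The following Python sketch evaluates eq. (113) and the beam scale $\ell_s$. The function names are hypothetical, and the Gaussian beam form $B_\ell^2 = \exp[-(Q/\ell_s)^2]$ is an assumption adopted for the illustration; the text specifies only that $B_\ell$ is the beam filter with characteristic scale $\ell_s$.

```python
import math

ARCMIN = math.pi / (180.0 * 60.0)   # one arcminute in radians

def ell_s(fwhm_arcmin):
    """Beam filter scale l_s = 1 / (0.425 theta_fwhm), with theta_fwhm in radians."""
    return 1.0 / (0.425 * fwhm_arcmin * ARCMIN)

def delta_cl(ell, c_t, sigma_nu_sq, fwhm_arcmin, f_sky=1.0, c_res=0.0):
    """1-sigma error on C_Tl from eq. (113) for homogeneous uncorrelated noise."""
    q = ell + 0.5
    c_d = ell * (ell + 1.0) * sigma_nu_sq / (2.0 * math.pi)
    inv_b2 = math.exp((q / ell_s(fwhm_arcmin)) ** 2)   # B_l^-2, assumed Gaussian beam
    return math.sqrt(2.0 / ((2.0 * ell + 1.0) * f_sky)) * (c_t + c_res + c_d * inv_b2)

# A 20 arcminute beam gives l_s of about 404 and a 10 arcminute beam about 809.
print(round(ell_s(20.0)), round(ell_s(10.0)))
```

With the noise set to zero, `delta_cl` reduces to the pure cosmic-variance limit $C_{T\ell}\, Q^{-1/2} f_{\rm sky}^{-1/2}$, while the exponential factor makes the noise term dominate rapidly once $\ell$ exceeds $\ell_s$.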
Figs. 7 and 8 show that the variation in $C_{T\ell}$ with cosmological parameters is
quite smooth so we can broaden the band-power filters to encompass more than
a single $\ell$. In fig. 13, the errors shown are those appropriate for logarithmic
binning of width $\pm\frac{1}{2} \Delta \ln \ell$ about $\ln \ell$, with $\frac{1}{2} \Delta \ln \ell = 0.05$. This gives a better
feeling for how well parameter estimation can occur. The variance is
$$ \Delta C_{T\ell} \approx \frac{[(C_{T\ell} + C_{{\rm res},\ell})^2 + 2\gamma_1 (C_{T\ell} + C_{{\rm res},\ell}) C_{D\ell} B_\ell^{-2} + \gamma_2 C_{D\ell}^2 B_\ell^{-4}]^{1/2}}{\sqrt{Q f_{\rm sky}}\, \sqrt{\cosh(\Delta \ln \ell)\, [1 + Q \sinh(\Delta \ln \ell)]}} . \qquad (114) $$
The factors $\gamma_1$ and $\gamma_2$ are nearly unity if $\Delta \ln \ell$ is small. There is a crossover
point at which $\Delta C_{T\ell}$ from cosmic variance goes from the usual $Q^{-1/2}$ dependence
to a $Q^{-1}$ dependence.
(The derivation evaluates the likelihood function within the (integer-spaced)
$[e^{-\Delta \ln \ell/2} \ell] \le L \le [e^{\Delta \ln \ell/2} \ell]$ interval. The cosmic variance term is just the sum
of $\ell + \frac{1}{2}$ over the bin. The $\gamma_1$ and $\gamma_2$ terms are estimated by expanding in
noise-to-signal, $C_{D\ell}/[(C_{T\ell} + C_{{\rm res},\ell}) B_\ell^2]$, up to second order, grouping terms to preserve
the basic form of $(\Delta C_{T\ell})^2$ in eq. (113). If $\nu_{\Delta T}$ is the local slope of $(C_{T\ell} + C_{{\rm res},\ell})$,
then
$$ \gamma_1 \approx \frac{\sinh[(2 - \nu_{\Delta T}/2 + (Q\varpi_s)^2) \Delta \ln \ell]}{(2 - \nu_{\Delta T}/2 + (Q\varpi_s)^2) \sinh(\Delta \ln \ell)} , \qquad (115) $$
$$ \gamma_2 \approx 4\gamma_1^2 - \frac{3 \sinh[(3 - \nu_{\Delta T} + (Q\varpi_s)^2) \Delta \ln \ell]}{(3 - \nu_{\Delta T} + (Q\varpi_s)^2) \sinh(\Delta \ln \ell)} . $$
If $\varpi_s$ is small and for a flat $\nu_{\Delta T} = 0$, $\gamma_1 \approx \cosh(\Delta \ln \ell)$ and $\gamma_2 \approx 1$. For
example, although fig. 13 includes the full corrections of eq. (115), the result
without them is indistinguishable for the $\Delta \ln \ell = 0.1$ chosen.)
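The limiting values quoted for $\gamma_1$ and $\gamma_2$ can be verified directly. The Python sketch below (function and argument names are hypothetical; `q_varpi_s` stands for the product $Q\varpi_s$) evaluates eq. (115) and the binned error of eq. (114), guarding the removable singularity that arises when one of the sinh prefactors vanishes.

```python
import math

def _sinh_ratio(a, d):
    """sinh(a d) / (a sinh d), using the limit d / sinh(d) when a is ~ 0."""
    if abs(a) < 1e-12:
        return d / math.sinh(d)
    return math.sinh(a * d) / (a * math.sinh(d))

def gammas(dlnl, nu_dt=0.0, q_varpi_s=0.0):
    """gamma_1 and gamma_2 of eq. (115) for bin width dlnl = Delta ln l."""
    g1 = _sinh_ratio(2.0 - 0.5 * nu_dt + q_varpi_s ** 2, dlnl)
    g2 = 4.0 * g1 ** 2 - 3.0 * _sinh_ratio(3.0 - nu_dt + q_varpi_s ** 2, dlnl)
    return g1, g2

def binned_delta_cl(ell, c_signal, c_noise_over_b2, dlnl,
                    f_sky=1.0, nu_dt=0.0, q_varpi_s=0.0):
    """Eq. (114): c_signal = C_Tl + C_res,l and c_noise_over_b2 = C_Dl / B_l^2."""
    q = ell + 0.5
    g1, g2 = gammas(dlnl, nu_dt, q_varpi_s)
    num = math.sqrt(c_signal ** 2 + 2.0 * g1 * c_signal * c_noise_over_b2
                    + g2 * c_noise_over_b2 ** 2)
    den = math.sqrt(q * f_sky) * math.sqrt(math.cosh(dlnl) * (1.0 + q * math.sinh(dlnl)))
    return num / den

g1, g2 = gammas(0.1)   # flat spectrum, negligible beam: cosh(0.1) and 1
```

For a flat spectrum the identity $\sinh 3x = 3\sinh x + 4\sinh^3 x$ makes $\gamma_2$ exactly unity, which the test of this sketch confirms to rounding error.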
The all-sky uniform-noise assumption was used to model the dmr correla-
tion function errors before the 1-year data were released, as in [5, 140] and
Fig. 4.6. The uniform-noise assumption for regular connected patches cover-
ing a fraction fsky of the sky has been used recently to address the ultimate
accuracy in measuring cosmological parameters that satellite and balloon ex-
periments might achieve if foreground contamination (i.e., Cres,` ) is ignored
[159, 160, 161, 154, 163, 164, 165]. That application will be sketched here,
following the treatment in [164], since it represents a nice exercise for working
with the likelihood formula, eq.(109), is being widely used, and it allows us to
focus on the two forthcoming satellite experiments.
We shall use current specifications for MAP and COBRAS/SAMBA (now
called Planck), although these may well evolve. In fig. 13, parameters roughly
suitable for the NASA mission MAP [152] and the higher resolution
COBRAS/SAMBA [154] are shown. Of the 5 HEMT channels for MAP, we
shall assume the 3 highest frequency channels, at 40, 60 and 90 GHz, will
be dominated by the primary cosmological signal, and adopt fwhm beams of
24, 17 and 14 arcminutes, respectively. We shall take the noise power to be
$C_{D\ell} = 2.3 \times 10^{-15}$ for each channel (i.e., 25 µK per 18′ pixel) for a 2 year
mission. For COBRAS/SAMBA, which has both HEMTs and bolometers, we
take the 4 best bolometer CMB channels at 100, 150, 220 and 350 GHz to be
the primary cosmological ones, with fwhm beams 15, 10, 7, 4 arcminutes and
noise the remarkable $C_{D\ell} = \{2.8, 1.5, 0.5, 35\} \times 10^{-17}$ (3.5, 3.5, 3.3, and 444
µK per fwhm pixel), respectively, corresponding to a 14 month mission. We
shall also assume $f_{\rm sky} = 0.65$ will be usable, the same as the fraction used in
the analysis of the 4-year dmr data.
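The MAP noise figure can be related to the per-pixel sensitivity through $\sigma_\nu = \sigma_{\rm pix} \varpi_{\rm pix}$. The Python sketch below rests on three assumptions that are not stated in this section: the pixel size is read as 18 arcminutes, $\varpi_{\rm pix}$ is taken to be the pixel side in radians, and the mean CMB temperature is taken as 2.726 K to convert µK into $\Delta T/T$. Under those assumptions $\sigma_\nu^2$ comes out close to the quoted $2.3 \times 10^{-15}$.

```python
import math

T_CMB_K = 2.726                      # assumed mean CMB temperature in kelvin
ARCMIN = math.pi / (180.0 * 60.0)    # one arcminute in radians

def sigma_nu_squared(sigma_pix_microkelvin, pixel_arcmin):
    """(sigma_pix * varpi_pix)^2, with sigma_pix converted to Delta T / T.

    varpi_pix is taken to be the pixel side in radians (an assumption).
    """
    sigma_pix = sigma_pix_microkelvin * 1e-6 / T_CMB_K
    varpi_pix = pixel_arcmin * ARCMIN
    return (sigma_pix * varpi_pix) ** 2

map_noise = sigma_nu_squared(25.0, 18.0)   # close to 2.3e-15
```

Because $\sigma_{\rm pix}$ scales inversely with the pixel side for a fixed observing time, the product is independent of the pixel size chosen, which is the invariance of $\sigma_\nu$ noted after eq. (113).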
Consider a class of cosmological models with Gaussian-distributed temperature
anisotropies defined by a parameter set $\{y_A\}$. For definiteness here we
shall use the parameter space $\{\Omega_{\rm tot}, h, \Omega_B h^2, \nu_s, r_{ts}, \Omega_{\rm vac}, \langle C_\ell \rangle_B, \zeta_C\}$, with the
residual energy density, $\Omega_{\rm tot} - \Omega_{\rm vac} - \Omega_B$, assumed to be in cold dark matter,
$\langle C_\ell \rangle_B$ the total bandpower for the experiment and $\zeta_C$ the Compton optical
depth from a reheating redshift $z_{\rm reh}$ to the present. This is similar to the space
used in [154], except that $C_2$ was used instead of $\langle C_\ell \rangle_B$ (which does not change
the results much); [161] added 2 more parameters, while [164] added 7 more.
For illustration we shall assume that the correct underlying theory is an
untilted standard CDM one (the hot/cold model has a very similar power
spectrum). After integrating the 8 parameter probability distribution over
all other parameters but one, sample results for COBRAS/SAMBA are $\nu_s$ to
±0.006, $\Omega_B h^2$ to 0.6%, $H_0$ to 2%, $\Omega_{\rm vac}$ to ±0.05, $\Omega_{\rm tot}$ to ±0.007, while for
MAP they are ±0.04, 5%, 11%, ±0.28, ±0.04. Here $\Omega_{\rm tot}$ is forced to be zero
if $\Omega_{\rm vac}$ is not, and vice versa, because there is an approximate degeneracy in
the $C_\ell$ spectrum for cosmologies with different $\Omega_{\rm tot}$, $\Omega_{\rm vac}$ but the same
angle-distance relationship which maps the post-recombination spatial structure to
angular structure at high $\ell$. COBRAS/SAMBA has such high sensitivity that
it could even determine $\Omega_{m\nu}$ to ±0.02. Of course, between the experimental
data and these wonderful numbers many complications, especially foreground
removal, must be overcome.
We now sketch the method used for this analysis. Choose the parameter
set $\{y_{Am}\}$ which approximately maximizes the likelihood (e.g., using quadratic
estimators to determine the power spectrum from the data and fitting it with
$C_\ell(y_{Am})$). Expand $\ln \mathcal{L}$ to quadratic order in $\delta y_A \equiv y_A - y_{Am}$. Adjust the
$y_{Am}$ so that the linear term $\partial \ln \mathcal{L}/\partial y_A$ vanishes, thereby yielding a Gaussian
approximation with zero mean to the likelihood,
$$ \mathcal{L} \sim \mathcal{L}_m \exp\Big[-\tfrac{1}{2} \sum_{AB} S_{AB}\, \delta y_A\, \delta y_B\Big] , \qquad S_{AB} = \sum_\ell (\Delta C_{T\ell})^{-2}\, \frac{\partial C_{T\ell}}{\partial y_A} \frac{\partial C_{T\ell}}{\partial y_B} . \qquad (116) $$
Here the parameter derivatives $\partial C_{T\ell}/\partial y_A$ are evaluated at $y_{Am}$ as is the $C_{T\ell}$
in the variance $(\Delta C_{T\ell})$ given by eq. (113).4
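The matrix $S_{AB}$ of eq. (116) is straightforward to assemble once the spectrum derivatives are in hand. The Python sketch below uses a deliberately simple toy spectrum, $C_{T\ell} = A (\ell/\ell_0)^{n-1}$ with parameters $(\ln A, n)$, which is an assumption made for illustration and not one of the models discussed in the text; the variance is taken from eq. (113) in the noise-free, residual-free limit.

```python
import math

def curvature_matrix(ells, c_l, derivs, f_sky=1.0):
    """S_AB of eq. (116) using the noise-free, residual-free variance of eq. (113).

    ells   : iterable of multipoles
    c_l    : function ell -> C_Tl at the fiducial parameters
    derivs : list of functions ell -> dC_Tl/dy_A, one per parameter
    """
    n = len(derivs)
    s = [[0.0] * n for _ in range(n)]
    for ell in ells:
        var = 2.0 / ((2.0 * ell + 1.0) * f_sky) * c_l(ell) ** 2   # (Delta C_Tl)^2
        d = [f(ell) for f in derivs]
        for a in range(n):
            for b in range(n):
                s[a][b] += d[a] * d[b] / var
    return s

# Toy fiducial model (hypothetical): amplitude A = 1, tilt n = 1, pivot l_0 = 10.
ELL0 = 10.0
def toy_cl(ell):
    return 1.0

derivs = [lambda ell: toy_cl(ell),                          # dC/d(ln A) = C
          lambda ell: toy_cl(ell) * math.log(ell / ELL0)]   # dC/dn = C ln(l/l_0)
s = curvature_matrix(range(2, 11), toy_cl, derivs)
```

For the amplitude parameter the sum collapses to $\sum_\ell (2\ell+1) f_{\rm sky}/2$, i.e., half the number of independent modes, which gives a quick sanity check on any implementation.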
4 The derivation is most easily done using S/N eigenmodes, which, as a result of the
Just as was done for the constraints, it is convenient to choose a Gaussian
prior probability for the fluctuations $\delta y_A$, with covariance matrix $H_{AB}$. The
limit of very large eigenvalues of $H$ corresponds to no prior information on the
$\delta y_A$. The final probability for the parameter fluctuation $\delta y_A$ is then a Gaussian
with mean zero and variance $(S + H^{-1})^{-1}$. If we are interested in the error bars
on $\delta y_A$ irrespective of the values of the other variables, we would marginalize
over these. The 1-sigma error is then $\pm\sqrt{[(S + H^{-1})^{-1}]_{AA}}$, the numbers quoted
above.
Generally the errors in the parameters will be correlated through nondiagonal
components of $(S + H^{-1})^{-1}$. There are linear combinations of the parameters
which are uncorrelated, namely $(S + H^{-1})^{1/2} \delta y$. When the eigenvalues of
$(S + H^{-1})$ are rank ordered, from high to low, the variable combinations
corresponding to the top ones will be very accurately determined, while those for the
lowest may be very poorly determined, representing the degenerate directions
in parameter space. Such parameter eigenmode combinations are therefore a
natural generalization of the degeneracy parameter $\tilde\nu_s$ of eq. (108).
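The difference between marginalized and fixed-parameter errors, and the role of the eigenvalues, can be seen in a two-parameter example. The Python sketch below is illustrative only: the function names and the numerical matrix are hypothetical, and the restriction to two parameters simply allows closed-form inverses and eigenvalues.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def parameter_errors(s, h=None):
    """1-sigma errors for two parameters from S and an optional prior covariance H."""
    m = [row[:] for row in s]
    if h is not None:
        h_inv = inv2(h)
        m = [[m[i][j] + h_inv[i][j] for j in range(2)] for i in range(2)]
    cov = inv2(m)                                                  # (S + H^-1)^-1
    marginalized = [math.sqrt(cov[i][i]) for i in range(2)]
    conditional = [1.0 / math.sqrt(m[i][i]) for i in range(2)]     # other parameter held fixed
    return marginalized, conditional

def eigenvalues_sym2(m):
    """Eigenvalues of a symmetric 2x2 matrix, ordered from high to low."""
    half_tr = 0.5 * (m[0][0] + m[1][1])
    disc = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    return half_tr + disc, half_tr - disc

s = [[4.0, 2.0], [2.0, 3.0]]
marg, cond = parameter_errors(s)          # marginalized errors exceed conditional ones
well, poorly = eigenvalues_sym2(s)        # large eigenvalue: well-determined combination
tight = parameter_errors(s, h=[[0.01, 0.0], [0.0, 1e6]])  # strong prior on parameter 0
```

The inverse square roots of the eigenvalues are the 1-sigma errors on the uncorrelated parameter combinations, so a small eigenvalue flags a nearly degenerate direction.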
These idealized studies do not take into account the issue of separating the
many components expected in the data, in particular Galactic and extragalactic
foregrounds. As can be seen from fig. 7, the effects of Sunyaev-Zeldovich
fluctuations on the power spectrum are not large. However the power comes
from the clusters and so non-Gaussian aspects of this “foreground” are important
(fig. 16). Little is known about high redshift extragalactic sources in the
sub-mm. The shape of the power spectrum will have a $\sim \ell^2$ part just from the
source counts, and could also have a tail into lower $\ell$ associated with clustering,
as shown in fig. 7. By contrast, much is known about the abundance of
extragalactic radio sources as a function of flux at long wavelengths. However
extrapolations to higher frequencies are required, some poorly known fraction
of the sources have flat ($p_s \sim 0$) spectra, and it is not known how much of
a problem this will be in the optimal 50-150 GHz observing window for the
CMB. The $\sim \ell^2$ Poisson part should strongly dominate.
homogeneous noise assumption, are spherical harmonics for all-sky coverage or Fourier modes
for smaller patches. If we expand about parameters $\{y_{A*}\}$ for which the likelihood is not
necessarily a maximum, we have
$$ \ln \mathcal{L} = \ln \mathcal{L}_* + \sum_A F_A\, \delta y_A - \tfrac{1}{2} \sum_{AB} (S_{AB} + \delta S_{AB})\, \delta y_A\, \delta y_B , $$
$$ F_A = \tfrac{1}{2} \sum_k \frac{[\xi_k^2 - (1 + E_{*k})]}{(1 + E_{*k})^2}\, (\partial E_k/\partial y_A) , \qquad S_{AB} = \tfrac{1}{2} \sum_k \frac{(\partial E_k/\partial y_A)(\partial E_k/\partial y_B)}{(1 + E_{*k})^2} , $$
$$ \delta S_{AB} = \tfrac{1}{2} \sum_k \frac{[\xi_k^2 - (1 + E_{*k})]}{(1 + E_{*k})^2} \left\{ 2\, \frac{(\partial E_k/\partial y_A)(\partial E_k/\partial y_B)}{(1 + E_{*k})} - (\partial^2 E_k/\partial y_A \partial y_B) \right\} , $$
where $\delta y_A = y_A - y_{A*}$ and the S/N eigenvalue $E_k$ and its derivatives are evaluated at $\{y_{A*}\}$.
The appropriate (linear) adjustment to the maximum likelihood parameters is $y_{Am} - y_{A*} =
(S + \delta S)^{-1} F$. The $F_A (y_A - y_{Am})$ term then vanishes, where now $\delta y_A = y_A - y_{Am}$, leaving
$\ln \mathcal{L} = \ln \mathcal{L}_m - \frac{1}{2} \sum_{AB} (S_{AB} + \delta S_{AB})\, \delta y_A\, \delta y_B$. The matrix $\delta S_{AB}$ contains the fluctuations
in the S/N power spectrum, $\xi_k^2$, about its mean value $(1 + E_{*k})$. If $E_{*k}$ is the correct theory,
the ensemble average of $\delta S_{AB}$ vanishes and it is usually ignored – as was done for the specific
numbers given.
There is currently some optimism that the Galactic foregrounds may not be
a severe problem. The individual warm and cold clouds in the standard three
phase ISM model are quite small (see, e.g., [50] for an inventory), and the
observed structure of the far-infrared emission, dominated by the dust-laden
Cirrus clouds discovered by the IRAS satellite, is actually rather filamentary
with a power spectrum rising towards low $\ell$ with $\nu_{\Delta T} \sim -1$ [107]. Galactic
bremsstrahlung also has a $\nu_{\Delta T} \sim -1$ power spectrum, found using the dmr data
[108]. This is extremely important since it suggests that in the $\ell \sim 100-500$
range, especially in the frequency range around 90 GHz, these backgrounds will
not overly contaminate high precision experiments. Complications will arise
however, the most important being the non-Gaussian concentration of power
and the possible multicomponent nature of the dust. Because of this rise to low
$\ell$, the quadrupole is more contaminated than higher multipoles, which results in
a large (70%) systematic error in its value from Galactic modelling uncertainties
[85], hence the large error bars in fig. 10. An important exercise is to see how
well we can do parameter estimation as we vary experimental mapping strate-
gies and sky-coverage when realistic non-Gaussian foregrounds are included
in simulated data sets. In [154], optimal-filtering techniques (eq. 111) were
used on simulations of primary and secondary signals and realistic frequency-
dependent foregrounds to show that a well-designed high resolution experiment
with good frequency coverage (e.g., the COBRAS/SAMBA set of channels in
fig. 5) should be able to accurately recover the primary signal.
5 Primary and secondary sources of anisotropy

The radiative transfer solution involves a line-of-sight integral of a 3D ran-
dom source-field G(q, q̂, r, τ ) through some region, projecting it onto the 2D
sky through the action of the Green function, eq. (18), on the source. General
features of the $C_\ell$’s can be understood from projections of simplified forms for
G, which is the exercise undertaken in section 5.1. Since this section also in-
troduces some of the typical mathematical manipulations used to treat trans-
port, it is reasonable on a first pass to just read the introductory overview,
and then go directly to section 5.2 which describes the sources for linear pri-
mary anisotropies and to section 5.3 which describes some nonlinear secondary
anisotropy sources.
5.1 Angular power spectra from 3D random source-fields

A general selection function or visibility $\mathcal{V}(\tau)$ is taken out of $\mathcal{G}$. It “illuminates”
the portion of the 3D random source-field we are to look at. In this subsection,
general formulae are derived for the multipole coefficients $a_{\ell m}$ in terms of
fluctuations in $\mathcal{G}(q, \hat q, {\bf k}, \tau)$ (eq. (123)) and for the associated $C_\ell$ in terms of the 3D
power spectra for $\mathcal{G}$ (eq. (125)). Seeing what happens in special cases is quite
instructive: limiting cases for high $\ell$ (eq. (126)) and the relation to the Fourier
transform approximation for 2D maps (eq. (129)); narrow and broad visibility
limits (section 5.1.5); simplified 3D spectra which allow analytic evaluations
(section 5.1.6). The latter tells us in what limits the phenomenological spectra
of section 4.4 are realized by the transport of physical 3D fields; in particular,
3D power law spectra with narrow visibilities lead to the 2D formula eq. (107);
white noise spectra with Gaussian coherence-filtering lead to the 2D “Gaussian
correlation function” model for a narrow visibility function, but the form is
modified for broad visibilities.

Figure 13: This shows the ability of satellites to measure cosmic parameters to
high accuracy. The relative difference of the power spectrum in question from
a comparison spectrum (both normalized to the 4-year dmr (53+90+31)A+B
COBE maps) are shown so that the few percent deviations can be clearly seen
over the entire $\ell$ range. The lighter lines are 1-sigma error bars for all-sky
coverage (averaged over the smoothing width shown, with $\frac{1}{2} \Delta \ln \ell = 0.05$)
and include cosmic variance (dominant at low $\ell$) and pixel noise at 20 µK or
6 µK (dominant at high $\ell$), with the very rapid growth relative to the theory
curve at high $\ell$ coming from the finite beam-size (with the fwhm indicated,
corresponding to a Gaussian filter in multipole space of $\ell_s = 404$ and $\ell_s = 809$
respectively). The first choice corresponds to the NASA satellite experiment
MAP, the second choice to the ESA mission COBRAS/SAMBA, if the entire
sky were usable (errors scaling $\propto f_{\rm sky}^{-1/2}$). The ultimate accuracy achievable
will depend upon the decontamination of the primary signal of non-Gaussian
Galactic synchrotron, bremsstrahlung and dust signals. The models shown all
have a uniform age of 13 Gyr, $\Omega_{\rm cdm} + \Omega_{m\nu} + \Omega_\Lambda + \Omega_B = 1$, $\Omega_B h^2 = 0.0125$,
$n_s = 1$ and no gravity wave contribution. Notice the scale change for the
hot/cold model panels. (One species of massive neutrino was adopted for these
two cases.)
These results are applied to a treatment of SZ and dust-emission secondary
anisotropies in section 5.3 to give an understanding of why the spectra for
these anisotropies look as they do in fig. 7. All that is needed from this section,
section 5.1, is the broad visibility high $\ell$ limit. For secondary anisotropies, the
back action of the radiation field on the fluctuations in G is usually ignorable,
but G is determined by the nonlinear physics of cosmic structure evolution –
and subject to the inevitable approximations the treatment of that entails.
A look at the dominant source fields for primary scalar anisotropies is given
in section 5.2, which relies upon the results quoted in section 3.1 for G for Thom-
son scattering and the Sachs–Wolfe effect. Among other things it describes how
one arrives at the ∆T /T ∼ ΦN /3 “naive Sachs–Wolfe formula,” with ΦN the
gravitational potential, and what it neglects. This section will be easier to
follow in conjunction with section 6 which gives a full treatment of perturba-
tion theory and the primary scalar (section 6.5.1) and tensor (section 6.5.2)
anisotropies. We shall see that the 3D source fields are highly coupled to ∆T
and to each other so we can expect analytic forms for G to be only approximate.
5.1.1 Simple sample sources

The source function G(q, q̂, r, τ ) can be expanded in powers of q̂. For all sources
we need to consider, the expansion contains at most terms of quadratic order in
q̂. The quadratic terms for scalar and tensor modes come from the Sachs–Wolfe
gravitational redshift sources of eq. (22) and some of the subdominant Thomson
scattering sources. However the dominant Thomson scattering source terms
and all of the secondary sources have only terms of zeroth and first order in
q̂, i.e., monopole and dipole sources; further momentum space transformations
can put the scalar Sachs–Wolfe terms into this form.
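For the illustrations in this section, we shall consider only monopole and
dipole sources, and further simplify the dipole by assuming it is a gradient:
$$ \mathcal{G} \approx \mathcal{V}(\tau)\, [G_0({\bf r}) - \hat q \cdot \nabla G_1({\bf r}) + \cdots] , \qquad \frac{\Delta T}{T} \approx \mathcal{T}(\tau)\, [\Delta_{G_0} + \Delta_{G_1} + \cdots] . \qquad (117) $$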
The secondary anisotropy sources accompanying distortions are of the scalar
$G_0$ form: eq. (31) for Compton upscattering, eq. (46) for dust emission and
eq. (38) for bremsstrahlung. The asymmetry in Thomson scattering from the
flow of electrons contributes a dipole term, $\sigma_T \bar a\, \hat q \cdot [n_e {\bf v}_e]$, to $\mathcal{G}$. The current
$n_e {\bf v}_e$ can be expanded in terms of the gradient of a scalar potential, thus of
the $G_1$ form, and the curl of a divergenceless vector potential. For primary
scalar perturbations, the curl vanishes, leaving only a $G_1$-type term, but when
the gas is nonlinear and clumpy, $\nabla n_e$ will not be aligned with ${\bf v}_e$, generating a
curl source which can sometimes exceed the anisotropy driven by the gradient
term.1 For primary scalar fluctuations there are also $G_0$-type terms.
For mathematical convenience, a (differential) “visibility function” $\mathcal{V}$ has
been removed from the sources and a “transparency function” $\mathcal{T}$ from $\Delta T/T$.
This is useful to do if there is a reasonably strong concentration of the “emissivity”
in redshift space. For Thomson scattering, the transparency is $\mathcal{T} = e^{-\zeta_C}$.
Depending upon which sources we are interested in, $\mathcal{V}$ will either be the differential
Thomson visibility function, $\mathcal{V}_C = n_e \sigma_T \bar a\, e^{-\zeta_C}$, of fig. 3 or the integrated
visibility $e^{-\zeta_C}$. For the Sunyaev–Zeldovich effect from clusters, we would take
both $\mathcal{T}$ and $\mathcal{V}$ to be unity. For dust emission from primeval galaxies, $\mathcal{V}$ can be
chosen to define an (angle-averaged) emission shell and $\mathcal{T}$ to be unity, as we
described in BCH2 [42].
5.1.2 Angular power spectra for simple sample sources

We now manipulate and solve the transfer equation in a manner which shows
how one passes to $C_\ell$ from the 3D power spectra for the random source fields,
$$ P_{G_A G_B}(k; \bar\chi, \Delta\chi) = \frac{k^3}{2\pi^2}\, \langle G_A({\bf k}, \tau_+)\, G_B^*({\bf k}, \tau_-) \rangle , \qquad A, B = 0, 1 , \qquad (118) $$
where $\bar\chi = (\chi_+ + \chi_-)/2$, $\Delta\chi = \chi_+ - \chi_-$, $\chi_\pm = \bar\chi \pm \frac{1}{2} \Delta\chi$. We shall assume that
the sources are statistically homogeneous and isotropic, so that the 3D power
spectra are functions only of $|{\bf k}|$. If the sources are Galactic, for example, this
will not be correct.
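The easiest way to deal with the gradient terms is to rewrite the transfer
equation as
$$ \frac{\partial}{\partial\tau} \Big(\frac{\Delta T}{T}\Big)' + \hat q \cdot \nabla \Big(\frac{\Delta T}{T}\Big)' = \mathcal{G}' , $$
$$ \Big(\frac{\Delta T}{T}\Big)' = \frac{\Delta T}{T} + \mathcal{V} G_1 , \qquad \mathcal{G}'(q, {\bf r}, \tau) = \mathcal{V}(\tau) G_0 + \frac{\partial}{\partial\tau} (\mathcal{V} G_1) , \qquad (119) $$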
where $\mathcal{G}'$ now has no $\hat q$ dependence. If we assume that there is initially no
anisotropy, then the solution (in a flat background cosmology) is
$$ \Big(\frac{\Delta T}{T}\Big)'(q, \hat q, {\rm here}, {\rm now}) = \int_0^\infty d\chi \int \frac{d^3k}{(2\pi)^3}\, \delta\mathcal{G}'(q, {\bf k}, \tau)\, e^{-i {\bf k} \cdot \hat q R(\chi)} , \qquad R(\chi) = \chi = \tau_0 - \tau . \qquad (120) $$
In open or closed universes, the mean curvature precludes making a Fourier
transform expansion, but a prescription for small angles using a modified $R(\chi)$
is described in eq. (130) below.

1 Nonlinear Thomson scattering in a flowing plasma is responsible for the moving cluster
effect generated at relatively low redshift from ionized gas in groups and clusters, and for
the Vishniac effect, which has quadratic nonlinearities included to correct scalar primary
anisotropies in a baryon isocurvature example with early reionization.
For secondary anisotropies, there is a nonzero angle-averaged part to the
random function $\mathcal{G}(q, \hat q, {\bf r}, \tau)$, which gives average spectral distortions:
$$ \Big(\frac{\Delta T}{T}\Big)_{\rm angle}(q, {\bf r} = 0, \tau_0) = \int_0^\infty \int \frac{d\Omega_{\hat r}}{4\pi}\, \mathcal{V}(\tau)\, G_0(q, r = \chi, \hat r, \tau)\, d\chi . \qquad (121) $$
For primary Thomson scattering anisotropies, this term vanishes.

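To go from eq. (119) to the anisotropy pattern on the sky as embodied
in the multipole coefficients $a_{\ell m}$ of eq. (82), we make use of the plane wave
expansion
$$ e^{-i {\bf k} \cdot \hat q \chi} = \sum_{\ell m} (-i)^\ell\, 4\pi\, Y^*_{\ell m}(\hat k)\, Y_{\ell m}(\hat q)\, j_\ell(k\chi) . \qquad (122) $$
(Recall that $(2\ell + 1) P_\ell(\hat q \cdot \hat k) = \sum_m 4\pi\, Y^*_{\ell m}(\hat k)\, Y_{\ell m}(\hat q)$.) Denoting the contribution
of mode ${\bf k}$ to the anisotropy $a_{\ell m}$ by $\tilde a_{\ell m}({\bf k})$, we have
$$ a_{\ell m} = \int \frac{d^3k}{(2\pi)^3}\, \tilde a_{\ell m}({\bf k}) , \qquad (123) $$
$$ \tilde a_{\ell m}({\bf k}) = \int_0^\infty d\chi\, (-i)^\ell\, 4\pi\, Y^*_{\ell m}(\hat k)\, j_\ell(k\chi)\, \delta\mathcal{G}'(q, {\bf k}, \tau) $$
$$ \qquad\quad = \int_0^\infty d\chi\, \mathcal{V}(\chi)\, (-i)^\ell\, 4\pi\, Y^*_{\ell m}(\hat k)\, [G_0\, j_\ell(k\chi) + k G_1\, j_\ell'(k\chi)] \qquad (124) $$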
using an integration by parts on the $\widetilde{G}_1$ term. The statistical homogeneity and
isotropy assumption implies in particular that the correlation $\langle \tilde a_{\ell m}({\bf k})\, \tilde a^*_{\ell m}({\bf k}') \rangle$
is zero unless ${\bf k}' = {\bf k}$, so the 2D radiation power spectrum, $C_\ell = \ell(\ell + 1)
\langle |a_{\ell m}|^2 \rangle/(2\pi)$, is
$$ C_\ell = 2\ell(\ell + 1) \int d\ln k \int d\bar\chi\, d\Delta\chi\, \mathcal{V}(\bar\chi + \tfrac{1}{2}\Delta\chi)\, \mathcal{V}(\bar\chi - \tfrac{1}{2}\Delta\chi) \sum_{A,B=0,1} k^{A+B}\, P_{G_A G_B}(k; \bar\chi, \Delta\chi)\, j_\ell^{(A)}(k\chi_+)\, j_\ell^{(B)}(k\chi_-) \qquad (125) $$
with $j_\ell^{(0)} = j_\ell$, $j_\ell^{(1)} = j_\ell'$ and $A, B = 0, 1$. Thus to understand how the $C_\ell$
will look, we must get into the arcana of how products of spherical Bessel
functions behave. We show below that the product $j_\ell^{(A)}(k\chi_+)\, j_\ell^{(B)}(k\chi_-)$ can
be written as $\langle j_\ell^{(A)} j_\ell^{(B)} \rangle(k\bar\chi)$ times ($\cos(\bar k_\parallel \Delta\chi)$ plus a fast-oscillation term which
will often average to zero). In that case, the $\Delta\chi$ integration reduces to a
Fourier transform defining a function $p(\bar k_\parallel; \bar\chi, A, B)$ which encodes information
about variations in the visibility about the average longitudinal distance $\bar\chi$:
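$$ \frac{dC_\ell}{d\ln k} \approx 2\ell(\ell + 1) \sum_{A,B} \int d\bar\chi\, \mathcal{V}^2(\bar\chi)\, p(\bar k_\parallel)\, P_{G_A G_B}(k; \bar\chi, 0)\, \langle j_\ell^{(A)} j_\ell^{(B)} \rangle(k\bar\chi)\, k^{A+B} , $$
$$ p(\bar k_\parallel) \equiv \int d\Delta\chi\, \cos(\bar k_\parallel \Delta\chi)\, \frac{\mathcal{V}(\chi_+)\, \mathcal{V}(\chi_-)}{[\mathcal{V}(\bar\chi)]^2}\, \frac{[P_{G_A G_B} + P_{G_B G_A}](k; \bar\chi, \Delta\chi)}{[P_{G_A G_B} + P_{G_B G_A}](k; \bar\chi, 0)} , \qquad (126) $$
where $\bar k_\parallel \equiv k_\parallel(\bar\chi)$, $k_\parallel(\chi) \equiv \sqrt{k^2 - Q^2/\chi^2}$, $Q \equiv \ell + \frac{1}{2}$. Here $\langle j_\ell^{(A)} j_\ell^{(B)} \rangle$ is either
a direct product of the Bessel functions evaluated at $k\bar\chi$ or an approximation
to it given by eq. (128) below. For example, the main feature of $\langle j_\ell j_\ell \rangle(x)$ at
high $\ell$ is that it is nearly zero for $x < Q$, falling from a finite maximum as
$(2x \sqrt{x^2 - Q^2})^{-1}$ for $x > Q$.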
Examples of $dC_\ell/d\ln k$ for primary anisotropies are shown in fig. 14 for the
standard CDM model for a variety of $\ell$’s. At $\ell = 4, 10$ the oscillatory behavior
of the products of $j_\ell$’s is apparent. For $\ell = 59, 121$, smoothing over nearby $\ell$’s
has been done, with the sharp rise in $k$ (at $\sim \ell/\bar\chi$) and power law decline a
characteristic shape for averages of $j_\ell$ products, as we now describe.
5.1.3 Products of Bessel functions

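In this technical subsection, we make use of standard Bessel function
asymptotics to define approximations for $\langle j_\ell^{(A)} j_\ell^{(B)} \rangle(x)$:
$$ j_\ell(k\chi) \sim \frac{1}{\sqrt{k_\parallel k}\, \chi}\, \cos\!\big(k_\parallel \chi - Q \arcsin(k_\parallel/k) - \tfrac{\pi}{4}\big) + O(Q^{-1}) , \qquad k_\parallel > 0 , $$
$$ j_\ell(k\chi) \sim \frac{1}{2} \left( \frac{k}{|k_\parallel| + \sqrt{|k_\parallel|^2 + k^2}} \right)^{\ell+1} \frac{\exp[|k_\parallel| \chi]}{\sqrt{|k_\parallel| k}\, \chi} , \qquad k_\parallel\ {\rm imaginary} , $$
$$ j_\ell(k\chi) \approx \frac{\sin(\pi/3)\, \Gamma(1/3)}{3^{2/3}\, 2^{1/6} \sqrt{\pi}}\, \frac{1}{(k\chi)^{5/6}} \left( 1 - \sqrt{2}\, \frac{k_\parallel}{k} + O\!\left( \Big(\frac{k_\parallel}{k}\Big)^2 \right) \right) , \qquad k_\parallel \approx 0 . $$
For high $\ell$, the $(k\chi/Q)^{\ell+1}$ behavior in the $k\chi < Q$ “imaginary-$k_\parallel$” regime
ensures that it is almost zero.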
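The quality of the oscillatory ($k_\parallel > 0$) asymptotic form can be checked against a direct evaluation of $j_\ell$. The Python sketch below, with hypothetical function names, computes $j_\ell(x)$ by the standard upward recurrence, which is adequate only in the regime $x > \ell$ used here, and compares it with the leading asymptotic term. The recurrence is a generic numerical method chosen for the example and is not taken from the text.

```python
import math

def spherical_jn_upward(ell, x):
    """j_ell(x) by upward recurrence; adequate when x > ell (the oscillatory regime)."""
    j_prev = math.sin(x) / x                            # j_0
    if ell == 0:
        return j_prev
    j_curr = math.sin(x) / x ** 2 - math.cos(x) / x     # j_1
    for n in range(1, ell):
        # j_{n+1} = (2n + 1)/x * j_n - j_{n-1}
        j_prev, j_curr = j_curr, (2 * n + 1) / x * j_curr - j_prev
    return j_curr

def jl_asymptotic(ell, k, chi):
    """Leading oscillatory asymptotic form, valid for k * chi > Q = ell + 1/2."""
    q = ell + 0.5
    k_par = math.sqrt(k * k - (q / chi) ** 2)
    phase = k_par * chi - q * math.asin(k_par / k) - math.pi / 4.0
    return math.cos(phase) / (math.sqrt(k_par * k) * chi)

ell, k, chi = 50, 1.0, 100.0
exact = spherical_jn_upward(ell, k * chi)
approx = jl_asymptotic(ell, k, chi)
```

Averaging the square of the asymptotic form over the rapidly varying phase replaces $\cos^2$ by $\frac{1}{2}$ and gives the $(2x\sqrt{x^2 - Q^2})^{-1}$ envelope quoted above for $\langle j_\ell j_\ell \rangle(x)$.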
(A) (B)
To evaluate products of form j` (kχ+ )j` (kχ− ), with A, B = 0, 1 as in
eq. (125), expand in ∆χ/χ̄:
(A) (B) (A) (B)
j` (kχ+ )j` (kχ− ) ∼ hj` j` i(k χ̄)(cos(k̄k ∆χ) + fast) . (127)
“Fast” denotes a cosine or sine term with a large argument consisting of terms
(A) (B)
like k̄k χ̄ which average to zero. For the hj` j` i(k χ̄), one can either take the
81
Figure 14: dC` /d ln k for the scale invariant CDM model shown demonstrates
the basic j`2 (kχdec ) oscillatory behavior for low `. For the two higher ` cases
shown, smoothing over nearby `’s has been done to damp the fast oscillations,
and the result basically follows the limiting high ` behavior of the products of j`
and/or j`0 . The vertical lines are defined by k −1 = 2cH0−1 and πk −1 = 2cH0−1 ,
when half a wavelength equals the horizon size.
82
(A) (B)
product j` (k χ̄)j` (k χ̄), or use an average based on the high-Q asymptotics
[88]:
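$$\langle j_\ell^{(A)} j_\ell^{(B)}\rangle(k\bar\chi) = 0 \ \ \text{for}\ k < Q/\bar\chi\,; \ \text{and for}\ k > Q/\bar\chi,$$

$$\langle j_\ell^{(0)} j_\ell^{(0)}\rangle(k\bar\chi) = \min\!\left[\frac{1}{2k\bar k_\parallel\bar\chi^2},\; j_\ell^2(Q)\left(\frac{Q}{k\bar\chi}\right)^{5/3}\right], \qquad j_\ell(Q)\,Q^{5/6} \approx 0.59,$$

$$\langle j_\ell^{(1)} j_\ell^{(0)}\rangle = -\min\!\left[\frac{k^2 + \bar k_\parallel^2}{(2k\bar\chi)^2\,\bar k_\parallel^2\,\bar\chi\bar k_\parallel},\; \frac{5}{6Q}\,j_\ell^2(Q)\left(\frac{Q}{k\bar\chi}\right)^{8/3}\right],$$

$$\langle j_\ell^{(1)} j_\ell^{(1)}\rangle = \min\!\left[\frac{1}{2k\bar k_\parallel\bar\chi^2}\left(\frac{\bar k_\parallel^2}{k^2} + \frac{(k^2 + \bar k_\parallel^2)^2}{4(k\bar\chi)^2\,\bar k_\parallel^4}\right),\; \frac{25}{36Q^2}\,j_\ell^2(Q)\left(\frac{Q}{k\bar\chi}\right)^{11/3}\right]. \tag{128}$$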
The apparent singularities as $\bar k_\parallel \to 0$ are avoided by saturating the $\langle j_\ell^{(A)} j_\ell^{(B)}\rangle$ at their values for $\bar k_\parallel = 0$, as indicated by the minimum function. For low $Q$, the drop on the imaginary-$\bar k_\parallel$ side is not rapid enough to use this approximation, but the expansion in $\Delta\chi$ may still be good: the direct product $j_\ell^{(A)} j_\ell^{(B)}$ should then be used. It is by dropping the “fast” term that we get eq. (126).
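The high-$Q$ average for the $(0,0)$ product can be sketched numerically as follows. This is a minimal illustration, not the author's code: the function name `avg_jj_00` is hypothetical, and the saturation coefficient $j_\ell(Q)Q^{5/6}$ is evaluated from the $k_\parallel \approx 0$ asymptotic form of this subsection rather than hard-coded.

```python
import math

# Coefficient of the k_par ~ 0 (turning-point) asymptotic form of this
# subsection: the high-Q limit of j_l(Q) * Q**(5/6).
TURNING_POINT_COEFF = (math.sin(math.pi / 3) * math.gamma(1 / 3)
                       / (3 ** (2 / 3) * 2 ** (1 / 6) * math.sqrt(math.pi)))


def avg_jj_00(k, chi_bar, ell):
    """High-Q average <j_l j_l>(k chi_bar), following the (0,0) line of eq. (128).

    Hypothetical helper: zero for k < Q/chi_bar, otherwise the smaller of the
    smooth 1/(2 k k_par chi_bar^2) form and its k_par = 0 saturation value.
    """
    Q = ell + 0.5
    x = k * chi_bar
    if x < Q:
        return 0.0
    j_q_squared = (TURNING_POINT_COEFF * Q ** (-5 / 6)) ** 2
    saturation = j_q_squared * (Q / x) ** (5 / 3)
    # k_par^2 = k^2 - (Q/chi_bar)^2; guard against tiny negative round-off.
    k_par = math.sqrt(max(0.0, k * k - (Q / chi_bar) ** 2))
    if k_par == 0.0:
        return saturation
    smooth = 1.0 / (2.0 * k * k_par * chi_bar ** 2)
    return min(smooth, saturation)
```

Far above the turning point the smooth branch is returned; within a narrow band around $k\bar\chi = Q$ the minimum selects the saturated value, which is how the apparent singularity is avoided.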
5.1.4 Fourier derivation of the simple sample spectra at high `

With the form for $\langle j_\ell^{(A)} j_\ell^{(B)}\rangle(k\bar\chi)$ valid for high $\ell$, we encounter the first of a class of “small angle approximations” that have been used over the years to simplify the calculations of $C(\theta)$ and $C_\ell$. They turn out to be reasonably good provided we are not interested in low multipoles and do not need the answer to high precision for higher multipoles. A useful exercise to guide understanding of eqs. (125), (126) is to first derive the correlation function $C(\varpi)$ for an isotropic source-field of the $G_0$ form, then calculate its 2D Fourier transform (using the notation of eq. (88), and splitting the wavenumber into components $k_\parallel$ along the average line of sight and $k_\perp$, perpendicular to it):

$$C_\ell \approx \int d\bar\chi\, d\Delta\chi\; V(\chi_+)V(\chi_-)\int_0^\infty dk_\parallel\, e^{-ik_\parallel\Delta\chi}\left(\frac{Q}{R(\bar\chi)}\right)^2\frac{1}{k^3}\,P_{GG}(k;\bar\chi,\Delta\chi), \tag{129}$$

$$k^2 \approx (Q/R(\bar\chi))^2 + k_\parallel^2, \qquad Q \equiv \ell + \tfrac12, \qquad R(\bar\chi) = \bar\chi;$$

$$\int_0^\infty (\cdots)\,dk_\parallel = \int_{Q/R(\bar\chi)}^\infty (\cdots)\,k^2\,dk\,\frac{1}{k\sqrt{k^2 - (Q/R(\bar\chi))^2}},$$

$$V(\chi_+)V(\chi_-)\,P_{GG}(k;\bar\chi,\Delta\chi) \equiv \int\frac{d\Omega_{\hat q}}{4\pi}\,\frac{k^3}{2\pi^2}\,\langle G(q,\hat q,k,\tau_+)\,G^*(q,\hat q,k,\tau_-)\rangle.$$

Equation (129) allows one to turn the $k_\parallel$ integral into one over $k$; hence, with $p(k_\parallel)$ defined as a Fourier transform over $\Delta\chi$, we regain eq. (126) with the eq. (128) approximation. Note that the $k > Q/\bar\chi$ restriction is just a consequence of positive $k_\parallel$.
As a further approximation when the source-fields are not isotropic, but have more complicated angular dependence, e.g., $G(q,\hat q,k,\tau)$, an isotropized power spectrum for the source fields has been used; for the $G = V(G_0 - i\hat q\cdot\hat k\,kG_1)$ case of this section, $P_{GG} = P_{G_0G_0} + P_{G_1G_1}k^2/3$: cross-correlation terms do not appear.²
These small-angle $C_\ell$ formulas can also be applied to open (or closed) universes for multipoles on angular scales $\ell \gg 2|1-\Omega|^{1/2}\Omega^{-1}$ if we replace $R(\chi) = \chi$ by

$$R(\chi) = d_{curv}\sinh((\tau_0 - \tau)/d_{curv}) \quad \text{if}\ \Omega < 1,$$

$$d_{curv} = H_0^{-1}\Omega_{curv}^{-1/2}, \qquad \Omega_{curv} \equiv 1 - \Omega, \qquad \tau_0 = f_\tau\,2H_0^{-1}\Omega_{nr}^{-1/2},$$

$$f_\tau = \frac{\Omega_{nr}^{1/2}}{\Omega_{curv}^{1/2}}\left[\ln(\Omega_{curv}^{1/2} + 1) - \ln(\Omega_{nr}^{1/2})\right], \qquad (\Omega = \Omega_{nr} < 1), \tag{130}$$

where $\tau_0$ is the conformal time now and $f_\tau$ is a factor that must sometimes be computed, e.g., in universes with sizable vacuum energies. For closed universes, $f_\tau = (\Omega_{nr}/|\Omega_{curv}|)^{1/2}\arcsin[(|\Omega_{curv}|/\Omega_{nr})^{1/2}]$ and “sinh” is replaced by “sin”. In flat universes, $dC_\ell/d\ln k$ is concentrated around $Q \sim k\bar\chi$, in open universes $Q$ is pushed higher than $k\bar\chi$ and in closed universes $Q$ is pushed lower: hence features in $C_\ell$ will be at smaller angular scales in open models (as fig. 18(c,d) shows) and larger scales in closed models than in flat ones.
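The curvature replacement of eq. (130) can be sketched as below. This is an illustration only, assuming $\Omega = \Omega_{nr}$ (no vacuum energy), lengths in units of $cH_0^{-1}$ by default, and hypothetical function names `f_tau` and `angular_distance`.

```python
import math


def f_tau(omega_nr):
    """Factor f_tau of eq. (130), assuming Omega = Omega_nr (no vacuum energy)."""
    omega_curv = 1.0 - omega_nr
    if omega_curv == 0.0:
        return 1.0  # flat, matter-dominated limit
    ratio = math.sqrt(omega_nr / abs(omega_curv))
    if omega_curv > 0.0:  # open universe
        return ratio * (math.log(math.sqrt(omega_curv) + 1.0)
                        - math.log(math.sqrt(omega_nr)))
    # closed universe: the logarithm is replaced by an arcsin
    return ratio * math.asin(math.sqrt(abs(omega_curv) / omega_nr))


def angular_distance(chi, omega_nr, hubble_length=1.0):
    """R(chi) for chi = tau_0 - tau, in the same units as hubble_length = c/H_0."""
    omega_curv = 1.0 - omega_nr
    if omega_curv == 0.0:
        return chi
    d_curv = hubble_length / math.sqrt(abs(omega_curv))
    if omega_curv > 0.0:
        return d_curv * math.sinh(chi / d_curv)
    return d_curv * math.sin(chi / d_curv)
```

For a given comoving separation, $R(\chi)$ exceeds $\chi$ in the open case and falls below it in the closed case, which is the origin of the shift of $C_\ell$ features to smaller or larger angular scales.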
5.1.5 Narrow and broad visibilities

A few other nonessential approximations are useful to get a simple analytic form for $p(\bar k_\parallel)$: e.g., absorb the leading growth terms in $P_{G_AG_B}$ in the visibility product so the remaining weak dependence upon $\Delta\chi$ can be ignored – e.g., the linear growth factors $D(\chi_+)D(\chi_-)$ may describe the dependence. To get a nice result for discussion of limiting cases, it is useful and is often not even a bad approximation to assume that the product of visibilities looks like a Gaussian in $\Delta\chi$, $V(\chi_+)V(\chi_-) \approx V^2(\bar\chi)\,e^{-(\Delta\chi/R_V)^2/4}$:

$$p(\bar k_\parallel) \approx 2\sqrt{\pi}\,R_V\,e^{-\bar k_\parallel^2R_V^2}, \qquad R_V^{-2} \equiv -\frac{\partial^2\ln V}{\partial\bar\chi^2}. \tag{131}$$
The function $R_V(\bar\chi)$ will generally be dependent upon $\bar\chi$ and could also depend upon $A, B = 0, 1$. A case of some interest where it is constant is a Gaussian visibility. We showed in section 3.4.2 that this was a reasonable approximation for $V_C$ for standard recombination, with $R_{V_C}(\bar\chi) = R_{V_C,dec}$. It is used in section 5.2.

² Replacing $\langle G(q,\hat q,k,\tau_+)\,G^*(q,\hat q',k,\tau_-)\rangle$ by $\langle G(q,\hat q_P,k,\tau_+)\,G^*(q,\hat q_P,k,\tau_-)\rangle$, where $\hat q_P \approx (\hat q + \hat q')/2$ – without the averaging – is a “DSZ” approximation [131]; [132, 134, 88] exploited the isotropized form. These methods have been applied to $C(\theta)$ and $C_\ell$ estimations for primary anisotropies (section C.3.2) and to the secondary $\delta n_e v_e$ nonlinear source-field ([216] and section 5.3.6).

There are also a few interesting limiting cases:
1. Broad visibility: When the selection function is broad, $R_V \to \infty$, $p(\bar k_\parallel) \to 2\pi\delta(\bar k_\parallel)$. This is the approximation used in [2, 42] and eq. (140) in section 5.3.1 below for anisotropy power spectra from primeval dust and the SZ effect. The broad limit has a nice interpretation: the “column depth” across the visibility surface is $\Sigma_0(\hat x) = \int d\chi\,V(\chi)\,G_0(x = \hat x\chi,\tau)$. The radiation correlation function $C(\theta)$ is the correlation function of the column depths along lines of sight separated by angle $\theta$. For SZ anisotropies, the “column depth” is the Compton $y$-parameter, and for dust-emission anisotropies it is the dust optical depth (for a constant redshifted dust temperature).
2. Narrow visibility: When the selection function is narrow, $R_V \to 0$, $p(\bar k_\parallel) \to 2\sqrt{\pi}R_V$, and the power spectrum is integrated over the unobserved $\bar k_\parallel$, the projection of the 3D spectrum onto 2D. Only for primary anisotropies with normal recombination is one ever really close to this limit and even then damping of the spectrum for $\ell \gtrsim \bar\chi/R_V$ due to the “fuzziness” of the visibility surface is important to include, as encoded in $p(k_\parallel)$. The surface is perpendicular to the photon path to us, so the spatial oscillations are across the surface, giving destructive interference from both peaks and troughs for waves with $kR_V > \pi$. There is no destructive interference if the photons are only received from either peaks or troughs, but not both, the case if oscillations are along the surface, or if the wavenumbers are small.
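The Gaussian form of eq. (131) and its narrow limit can be checked numerically. The sketch below is illustrative only (function names are hypothetical): it compares the closed form with a direct trapezoid estimate of the Fourier transform over $\Delta\chi$ of $e^{-(\Delta\chi/R_V)^2/4}$.

```python
import math


def p_gaussian(k_par, r_v):
    """Closed form of eq. (131): 2 sqrt(pi) R_V exp(-(k_par R_V)^2)."""
    return 2.0 * math.sqrt(math.pi) * r_v * math.exp(-(k_par * r_v) ** 2)


def p_numeric(k_par, r_v, n=20001, span=12.0):
    """Trapezoid estimate of the Fourier transform over Delta-chi of
    exp(-(Delta chi / R_V)^2 / 4). The integrand is even, so only the cosine
    part of exp(-i k_par Delta chi) contributes."""
    half_width = span * r_v
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(-(x / r_v) ** 2 / 4.0) * math.cos(k_par * x)
    return total * h
```

In the narrow limit `p_gaussian(k, r_v)` tends to $2\sqrt{\pi}R_V$ for any fixed $k$, while in the broad limit the function narrows toward a delta function whose integral over $\bar k_\parallel$ is $2\pi$.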
5.1.6 Power-law spectra with coherence scales in 3D and 2D

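There are a few power spectra for which there is an analytic integral over the Bessel function products. To deal with these we shall restrict ourselves to sources of the $G_0$ form. The first case we shall consider has a thin shell source concentrated at $z_s$. For the power law 3D spectrum

$$P_{G_0G_0}(k;\chi_s,0) = \frac{2\sigma_{G_0}^2}{2^{\nu_{G_0}/2}\,\Gamma(\nu_{G_0}/2)}\,(kR_{coh})^{\nu_{G_0}}\,e^{-(kR_{coh})^2/2},$$

$$C_\ell = \pi\,\frac{\Gamma(2-\nu_{G_0})}{2^{2-\nu_{G_0}}\,\Gamma^2\!\left(\frac{3-\nu_{G_0}}{2}\right)}\,\frac{2\sigma_{G_0}^2}{2^{\nu_{G_0}/2}\,\Gamma(\nu_{G_0}/2)}\left(\frac{R_{coh}}{\bar\chi_s}\right)^{\nu_{G_0}} \times \frac{\Gamma\!\left(\ell + \frac{\nu_{G_0}}{2}\right)\Gamma(\ell+2)}{\Gamma(\ell)\,\Gamma\!\left(\ell + 2 - \frac{\nu_{G_0}}{2}\right)}. \tag{132}$$

This formula is valid as long as the angular scale $\ell^{-1}$ is large compared with the coherence angle $R_{coh}/R(\bar\chi_s)$.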
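Eq. (132) is straightforward to evaluate if the Gamma-function ratio is handled in logarithms. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the index is restricted to $0 < \nu_{G_0} < 2$ so that every Gamma-function argument is positive.

```python
import math


def c_ell_thin_shell(ell, nu, sigma_sq, r_coh, chi_s):
    """Evaluate eq. (132) for a thin shell at comoving distance chi_s.

    Illustrative helper. Restricted here to 0 < nu < 2 and ell >= 1 so that
    all Gamma-function arguments are positive.
    """
    if not (0.0 < nu < 2.0) or ell < 1:
        raise ValueError("this sketch assumes 0 < nu < 2 and ell >= 1")
    amplitude = 2.0 * sigma_sq / (2.0 ** (nu / 2.0) * math.gamma(nu / 2.0))
    shape = math.gamma(2.0 - nu) / (
        2.0 ** (2.0 - nu) * math.gamma((3.0 - nu) / 2.0) ** 2)
    # Ratio of Gamma functions in logs, to avoid overflow at large ell.
    log_ratio = (math.lgamma(ell + nu / 2.0) + math.lgamma(ell + 2.0)
                 - math.lgamma(ell) - math.lgamma(ell + 2.0 - nu / 2.0))
    return (math.pi * shape * amplitude
            * (r_coh / chi_s) ** nu * math.exp(log_ratio))
```

At large $\ell$ the Gamma-function ratio behaves like $\ell^{\nu_{G_0}}$, so the result scales as $(\ell R_{coh}/\bar\chi_s)^{\nu_{G_0}}$.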
If the visibility has finite extension, the integral over $k$ can still be done, but in terms of a hypergeometric function which is not very useful. What can be done easily once again is the Fourier calculation for large $\ell$:

$$C_\ell \approx \int d\bar\chi\,V^2(\bar\chi)\,\pi\,\frac{2\sigma_{G_0}^2}{2^{\nu_{G_0}/2}\,\Gamma(\nu_{G_0}/2)}\left(\frac{QR_{coh}}{R(\bar\chi)}\right)^{\nu_{G_0}} e^{-\frac12(QR_{coh}/\bar\chi)^2} \times \frac{\sqrt2\,R_V}{\sqrt{Q^2(R_{coh}^2 + 2R_V^2)/\bar\chi^2 + (3-\nu_{G_0})}}. \tag{133}$$

The $(3-\nu_{G_0})$ term in the denominator is an approximation based on a first order expansion in $(k_\parallel\bar\chi/Q)^2$. With the visibility concentrated at $\bar\chi_s$, eq. (133) is just the 2D power law equation, eq. (106), with $\varpi_c = R_{coh}/R(\bar\chi_s)$ and $\nu_{\Delta T} = \nu_{G_0}$.
For the special case of a 3D white noise distribution, $\nu_{G_0} = 3$, the exact result including the Gaussian coherence length $R_{coh}$ can be expressed in terms of a modified Bessel function:

$$P_{G_0G_0} = (2/\pi)^{1/2}\,\sigma_{G_0}^2\,(kR_{coh})^3\,e^{-(kR_{coh})^2/2},$$

$$C_\ell = \int d\bar\chi\,d\Delta\chi\;V(\chi_+)V(\chi_-)\,\sqrt{2\pi}\,\sigma_{G_0}^2\,\ell(\ell+1) \times e^{-(\Delta\chi)^2/(2R_{coh}^2)}\,e^{-(\chi_+\chi_-)/R_{coh}^2}\,I_Q(\chi_+\chi_-/R_{coh}^2). \tag{134}$$
If the coherence scale of the blobs is small compared with the cosmological distance at which the visibility is concentrated, the asymptotic expansion of the modified Bessel function, $I_{\ell+1/2}(z) \sim \frac{1}{\sqrt{2\pi z}}\,e^z\,(1 - \ell(\ell+1)/(2z) + \cdots)$, can be used to simplify the expression:

$$C_Q \approx \int d\bar\chi\,V^2(\bar\chi)\,\sqrt{2\pi}\,\sigma_{G_0}^2\left(\frac{QR_{coh}}{R(\bar\chi)}\right)^2 e^{-\frac12(QR_{coh}/R(\bar\chi))^2}\,\frac{R_{coh}\sqrt2\,R_V}{\sqrt{R_{coh}^2 + 2R_V^2}}.$$

This is the $\nu_{G_0} \to 3$ limit of eq. (133), as expected, and if the visibility is concentrated at $\bar\chi_s$, it is the $\nu_{\Delta T} = 2$, $\varpi_c = R_{coh}/R(\bar\chi_s)$ version of the 2D law eq. (106). If the visibility is broad, there is a distribution of coherence angles contributing so the final result is only roughly of the Gaussian form.
It might be thought that clouds in our Galaxy could be modelled by such a blob spectrum with no long range correlations, but this is not so. As we saw in section 4.6, $C_\ell$ for dust-emission and Galactic bremsstrahlung apparently rise as $\ell^{-1}$ [107, 108] at the resolutions at which they have been observed.
5.2 The primary primary anisotropy effects

For primary scalar anisotropies, we describe here the leading terms associated
with the Sachs–Wolfe and Thomson scattering sources, eqs. (25), (26), using
the ν, ϕ, Ψσ notation for the metric perturbations introduced there. This is just
a preview to show which fields are “illuminated” by the visibility of Thomson
scattering. The terminology and manipulations of this section will become
more familiar after reading section 6.
5.2.1 Sachs–Wolfe, photon-bunching and Doppler sources
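By manipulation of eqs. (25), (26), they can be cast into the $G_{0,1}$ form for a modified field $\tilde\Delta_t^{(S)}$ (and for a flat Universe):

$$\left(\frac{\partial}{\partial\tau} + \hat q^i\frac{\partial}{\partial x^i}\right)T_C\,\tilde\Delta_t^{(S)} = V_C\left[\frac14\tilde\delta_\gamma - \hat q^i\frac{\partial}{\partial x^i}\,\bar a^{-1}\tilde\Psi_{v,B} + (\cdots)\right] + T_C\,\frac{\partial}{\partial\tau}\left[\nu - \varphi + \partial_\tau(\bar a^{-1}\Psi_\sigma)\right],$$

$$\tilde\Delta_t^{(S)} \equiv \Delta_t^{(S)} + \nu + \frac{\partial\,\bar a^{-1}\Psi_\sigma}{\partial\tau} - \hat q^i\partial_i\,\bar a^{-1}\Psi_\sigma, \qquad v_B = -\bar a^{-1}\nabla\Psi_{v,B}, \tag{135}$$

$$G_{nSW,0} + G_{\delta_\gamma,0} \equiv \frac14\tilde\delta_\gamma \equiv \frac14\delta_\gamma + \nu + \frac{\partial}{\partial\tau}\,\bar a^{-1}\Psi_\sigma \equiv F_\Phi(k,\tau)\,\Phi_N/3, \tag{136}$$

$$G_1 \equiv \bar a^{-1}\tilde\Psi_{v,B} \equiv \bar a^{-1}(\Psi_{v,B} + \Psi_\sigma) \equiv F_v(k,\tau)\,\tau\,\Phi_N/3, \tag{137}$$

$$G_{iSW,0} = \frac{\partial^2}{\partial\tau^2}\,\bar a^{-1}\Psi_\sigma + \dot\nu - \dot\varphi = F_{\dot\Phi}(k,\tau)\,\Phi_N/3. \tag{138}$$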
Here $\delta_\gamma$ is the photon energy density perturbation, $v_B$ is the baryon velocity and $\Psi_{v,B}$ is the baryon velocity potential. The $(\cdots)$ refer to source terms driven by the quadrupole and by polarization terms. These are subdominant and can be ignored in rough treatments. There are three types of terms multiplied by $V_C$. The “Doppler” source is $G_1$. The $G_0 = \tilde\delta_\gamma/4$ term has two parts. The metric terms give the “naive” Sachs–Wolfe effect, in particular the famous $\Phi_N/3$, where $\Phi_N$ is the gravitational potential, if $\Phi_N$ is constant – which it rarely is. The $\delta_\gamma/4$ term describes the amount of “photon bunching” when decoupling releases the photons, which gives the equally famous $\frac13\delta\rho_B/\bar\rho_B$ term if the entropy per baryon is constant – which it is not. The source term multiplied by $T_C$ is known as the integrated Sachs–Wolfe or Rees–Sciama effect. While the Doppler and integrated Sachs–Wolfe sources do not depend upon the gauge which is chosen, the relative amount which is attributed to photon bunching and the naive Sachs–Wolfe effect does, although the $\tilde\delta_\gamma/4$ combination is gauge-invariant.
All of the effects have been normalized to $\Phi_N(k,\tau_0)/3$ through form factors $F_\Phi(k)$, $F_v(k)$ [2] and $F_{\dot\Phi}$, to emphasize that the magnitude of $\Phi_N$ controls the magnitude of $\Delta T/T$ – although it is the explicit form of these order-unity form factors that defines the bumps and wiggles of the spectral shape. The statistical distribution of $\Delta T/T$ is also completely determined by the statistical distribution of the stochastic field $\Phi_N$, with the spatial Fourier transforms $\{F_\Phi, F_v, F_{\dot\Phi}\}(x,\tau)$ defining nonstochastic time-dependent fields which are convolved with $\Phi_N(x,\tau_0)$ to give the sources.
5.2.2 Longitudinal and synchronous pictures of the Sachs–Wolfe effect

The longitudinal gauge has $\Psi_\sigma = 0$. The metric is characterized by $\nu_L$, which is the closest analogue to the perturbed Newtonian potential $\Phi_N$; and $\varphi_L$ goes to $-\Phi_N$ once anisotropic pressure forces can be neglected, which they can after $\bar a_{eq}$ and $\bar a_{dec}$. In the regime in which nonrelativistic (nr) matter dominates the evolution, $\Phi_N$ is constant (in linear perturbation theory). The velocity potential for baryons in that gauge is $\Psi_{v,BL} \equiv \tilde\Psi_{v,B}$ and the velocity potential for cold dark matter is $\Psi_{v,cdmL} \equiv \tilde\Psi_{v,cdm}$. In the nr-dominated regime, $\bar a^{-1}\tilde\Psi_{v,cdm} = \frac13\Phi_N\tau$. Compton drag stops the baryons from following the nr dark matter flow, but once the photons do let go, $G_1$ also approaches $\frac13\Phi_N\tau$. For normal recombination, there is by this time no differential visibility left; the $G_1$-field is determined by the earlier baryon physics, i.e., the transition through the optical-depth-unity regime of tight-coupling to damped-streaming. In universes with early reionization, much of the “visible” region can be after the Compton drag lets go and $G_1 \approx \frac13\Phi_N\tau$ can be a good approximation.
In the synchronous gauge, $\nu$ is set to zero and the constant time surfaces chosen to be those on which cold dark matter is at rest; the synchronous gauge metric variable $\Psi_{\sigma S}$ is then just $\tilde\Psi_{v,cdm}$, and the metric part of $G_{nSW,0} + G_{\delta_\gamma,0}$ gives $\frac13\Phi_N$ in the nr-dominated regime, i.e., the classic naive Sachs–Wolfe term. This suggests we define the photon bunching source $G_{\delta_\gamma,0}$ to be $\frac14\delta_{\gamma S}$, which is then a gauge-invariant term, with the remainder of eq. (136) defining the naive Sachs–Wolfe source $G_{nSW,0}$. In the oft-used longitudinal gauge, the correct $\Phi_N/3$ behavior is obtained only when a piece of $\delta_{\gamma L}/4$, photon bunching as viewed in this gauge, is added to $\nu_L = \Phi_N$: we show in section 6 that $\delta_{\gamma L}/4 = \delta_{\gamma S}/4 - \bar H\Psi_\sigma$, which becomes $\delta_{\gamma S}/4 - \frac23\Phi_N$ in the nr-dominated regime.
The integrated Sachs–Wolfe term at late times becomes $T_C\,2\dot\Phi_N$. Thus $F_{\dot\Phi} = 6\dot\Phi_N/\Phi_N$. In the nr-dominated regime, it vanishes for linear perturbations. Nonlinear clustering generates nonzero $\dot\Phi_N$. Even though the potential change may not be very great, the factor of 6 enhances the impact on the CMB. When the equation of state changes from nr-dominance, $\dot\Phi_N$ no longer vanishes. This is the source for the relative upturn in $C_\ell$ in the vacuum-dominated model in fig. 7 [110]. It is also rarely true that $a_{eq}$ is so much less than $a_{dec}$ that changes in $\Phi_N$ around recombination can be ignored. When they cannot, we can absorb the change into a $V_C$ style source by replacing $F_\Phi$ by $F_\Phi + \tau_CF_{\dot\Phi}$, where $\tau_C^{-1} = \bar n_e\sigma_T\bar a = V_C/T_C$.
5.2.3 Differential power spectrum and form factors

We can apply the machinery leading to eq. (126) to get the power spectrum. Let us ignore late-time integrated Sachs–Wolfe effects associated with nonzero $\Lambda$, etc. so we can use just the visibility $V_C$ sources. Since we are also interested in low $\ell$, we use the Bessel function product $j_\ell^{(A)}j_\ell^{(B)}$ for $\langle j_\ell^{(A)}j_\ell^{(B)}\rangle$:

$$\frac{dC_\ell}{d\ln k} = 2\ell(\ell+1)\,\frac19P_{\Phi_N}(k,\tau_0)\int d\bar\chi\,V_C^2(\bar\chi)\,p(\bar k_\parallel)\left[(F_\Phi(k,\bar\tau) + \tau_CF_{\dot\Phi}(k,\bar\tau))\,j_\ell(kR(\bar\tau)) + k\bar\tau\,F_v(k,\bar\tau)\,j_\ell'(kR(\bar\tau))\right]^2. \tag{139}$$

We have seen that the visibility $V_C$ is roughly a Gaussian in conformal time with width $R_{V,dec}$ for normal recombination. If the 3D source functions do not change much over $V_C$, the form factors can be evaluated at $\bar\tau$ and even at $\tau_{dec}$; otherwise an average over the shell is needed, defining effective form factors which can also absorb $\tau_CF_{\dot\Phi}$ and the last scattering surface fuzziness damping associated with $p(\bar k_\parallel)$.
The goal of analytic approaches is to use approximate form factors like these to understand the physics defining the basic features of the spectra and to provide a tool for rapid estimation of $C_\ell$. Such approaches have been developed in various approximations in [2, 143, 269, 270, 271]. Hu and Sugiyama [271] have included the most effects, in particular the time variation of $\Phi_N$ that arises because $a_{eq}$ is not very far from $a_{dec}$; by doing so they obtain remarkably good reproductions of the spectra derived using full Boltzmann transport codes, within about 10% or so even at high $\ell$. Here I shall just use a simple analytic result [2, 201] to illustrate the physics that determines the nature of the oscillations that translate into the $C_\ell$ peaks and troughs, but caution that the more elaborate scheme of [271] is needed if one wants a quantitative tool. (It was, for example, used in [161] to rapidly calculate $C_\ell$ for a large parameter set to assess how well parameters could be determined in idealized all-sky satellite experiments.)
Earlier than decoupling, the photons and baryons are so tightly coupled by Thomson scattering that they can be treated as a single fluid with shear viscosity $(4/(15f_\eta))\bar\rho_\gamma\bar a\tau_C$, zero bulk viscosity, thermal conductivity $\kappa_\gamma = (4\rho_\gamma/(3T_\gamma))\bar a\tau_C$ and sound speed $c_{s,(\gamma+B)} = (c/\sqrt3)[1 + 3\bar\rho_B/(4\bar\rho_\gamma)]^{-1/2}$, lowered from the $(c/\sqrt3)$ of a pure photon gas because of the inertia in the baryons. Here $f_\eta$ is a parameter which depends upon the approximations that are made to treat the photons: it is 3/4 if all effects are included, 9/10 for unpolarized photons and 1 if the angular dependence in the Thomson cross section is also ignored. These results are derived in Appendix C.3.1.
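As a quick numerical illustration of the baryon-loading correction to the sound speed quoted above (the function name and default units are assumptions made for this sketch):

```python
import math


def sound_speed(rho_b_over_rho_gamma, c=1.0):
    """Sound speed of the tightly coupled photon-baryon fluid,
    c_s = (c / sqrt(3)) * [1 + 3 rho_B / (4 rho_gamma)]**(-1/2)."""
    return (c / math.sqrt(3.0)) / math.sqrt(1.0 + 0.75 * rho_b_over_rho_gamma)
```

With no baryons this returns $c/\sqrt3$; for $\bar\rho_B/\bar\rho_\gamma = 4/3$ the baryon inertia lowers it by a further factor of $\sqrt2$.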
Let us assume constant $\Phi_N$ and $\bar\rho_B \ll \bar\rho_\gamma$ through decoupling. The WKB solution of the tight coupling fluid equations is, for $\tau < \tau_{dec}$,

$$F_\Phi \approx e^{-\frac12(\sigma_Dk\tau)^2}\,(3\bar c_s^2)^{1/4}\cos(k\bar c_s\tau),$$

$$F_v \approx e^{-\frac12(\sigma_Dk\tau)^2}\,(3\bar c_s^2)^{1/4}\,\frac{\sin(k\bar c_s\tau)}{k\bar c_s\tau}.$$

Also $F_{\dot\Phi} \approx 0$. $\sigma_D$ is a parameter describing Silk damping. In this tiny baryon number limit, the sound speed is $\bar c_s = (c/\sqrt3)$, but for finite $\bar\rho_B$, $\bar c_s$ is a suitable time-average of $c_{s,(\gamma+B)}$ and there is also a weak amplitude-diminisher, $(3\bar c_s^2)^{1/4}$.
The WKB solution for $\delta_{\gamma S}/(-\Phi_N) = 1 - F_\Phi$ shows $\delta_{\gamma S}$ growing outside the horizon like $\tau^2$; the horizon is “entered” for photons when $k\bar c_s\tau \approx \pi/2$; and thereafter oscillations spaced equally in $k\bar c_s\tau$ should be expected in the evolution of individual $k$-modes. Some examples of this behavior for different $k$'s are shown in fig. 19. By contrast, the view of the density fluctuations in the longitudinal gauge is $\delta_{\gamma L}/(-\Phi_N) \approx 5/3 - F_\Phi$, dominated by the constant 2/3 part which swamps the rising part. This emphasizes the care that must be taken in choosing which variables to integrate – no matter what the initial gauge choice.
The phase of the waves as they hit the narrow recombination band, $k\bar c_s\tau_{dec}$, determines the oscillations in $C_\ell$ that appear in fig. 7. The combination of viscous and fuzziness damping diminishes the amplitude of the Doppler peaks. Because the oscillations are in both $\tilde\delta_\gamma$ and $\tilde\Psi_{v,B}$, both contribute to the detailed structure.
In section 5.1.4, a high $\ell$ form of $C_\ell$ was given, eq. (129), and a further simplification associated with isotropizing the total source power spectrum was described. For the limiting WKB case, the isotropized source-power evaluated at $\tau = \tau_{dec}$ is $\frac19P_{\Phi_N}\,e^{-(\sigma_Dk\tau)^2}\,(3\bar c_s^2)^{1/2}\left(1 + \frac{3\bar\rho_B}{4\bar\rho_\gamma}\sin^2(k\bar c_s\tau)\right)$. This illustrates that in the instantaneous recombination limit with no damping and tiny baryon abundance, $P_{\Phi_N}/9$, the naive Sachs–Wolfe effect, is recovered. But this is obviously not what one sees in the figures. It is in the finite $\Omega_B$ effects, the time dependence of $\Phi_N$ and even the differences between $j_\ell$ and $j_\ell'$ that the dramatic hills and valleys of $C_\ell$ owe their origin – and it is with just those factors that the $C_\ell$–landscape can be estimated accurately.
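The isotropized WKB source power can be reproduced directly from the form factors, using the isotropization rule $P_{GG} = P_{G_0G_0} + P_{G_1G_1}k^2/3$ of section 5.1.4. The sketch below is illustrative: function names are hypothetical, units with $c = 1$ are assumed, a constant sound speed stands in for the time average $\bar c_s$, and `baryon_loading` denotes $3\bar\rho_B/(4\bar\rho_\gamma)$.

```python
import math


def wkb_form_factors(k, tau, baryon_loading, sigma_d):
    """F_Phi and F_v of the WKB tight-coupling solution (F_Phidot ~ 0), c = 1.

    baryon_loading = 3 rho_B / (4 rho_gamma); a constant sound speed is assumed.
    Requires k * tau > 0 (the sin(x)/x limit at zero phase is not handled).
    """
    cs = 1.0 / math.sqrt(3.0 * (1.0 + baryon_loading))
    damping = math.exp(-0.5 * (sigma_d * k * tau) ** 2)
    amplitude = (3.0 * cs ** 2) ** 0.25
    phase = k * cs * tau
    f_phi = damping * amplitude * math.cos(phase)
    f_v = damping * amplitude * math.sin(phase) / phase
    return f_phi, f_v


def isotropized_power(k, tau, baryon_loading, sigma_d):
    """F_Phi^2 + (k tau F_v)^2 / 3: the isotropized source power
    in units of P_PhiN / 9."""
    f_phi, f_v = wkb_form_factors(k, tau, baryon_loading, sigma_d)
    return f_phi ** 2 + (k * tau * f_v) ** 2 / 3.0
```

With `baryon_loading = 0` and `sigma_d = 0` the result is exactly 1, i.e. the naive Sachs–Wolfe value $P_{\Phi_N}/9$; a finite baryon loading brings in the $\sin^2(k\bar c_s\tau)$ modulation.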
5.2.4 Damping
The parameter $\sigma_D$ is an integral of the damping rate involving the shear viscosity and thermal conductivity. In the WKB limit, it is given by eq. (357) in section C.3.1: for CDM models with low $\Omega_B$, $\sigma_D$ is roughly 0.02–0.03 with polarization included, which enhances damping, and is about 10% lower if the radiation is assumed to be unpolarized. With $\Omega_Bh^2 = 0.0125$ preferred by Big Bang nucleosynthesis, $\sigma_D \approx 0.02$, and the same rough value is obtained in the limit of large $\Omega_B$. Of course the tight coupling equations break down as the radiation passes through decoupling, so it is better to treat $\sigma_D$ as a phenomenological factor, but matching to numerical results for Silk damping in baryon dominated models also gives the 0.02 estimate for CDM-model parameters [2]. The damping acting on $\Delta T/T$ due to fuzziness of the last scattering surface is $e^{-(k_\parallel R_{V_C,dec})^2/2}$, while that from Silk damping is $e^{-(k\sigma_D\tau_{dec})^2/2}$. From eq. (69), we have $R_{V_C,dec}/(\sigma_D\tau_{dec}) \approx \sigma_{a,dec}/(2\sigma_D)$ which is $\sim 2$ for the examples of fig. 3(b).
The fuzziness damping acts only on $k_\parallel$, while the WKB viscous damping acts on $k$. Effective isotropized fuzziness filters are found by expanding in $k_\parallel R_{V_C,dec}$ and angle-averaging [2], which reduces the effective filter to $R_{V_C,dec}/\sqrt3$; this makes the WKB and fuzziness damping values similar in magnitude. The WKB tight-coupling solution does in fact calculate a version of fuzziness damping acting on $\tilde\delta_\gamma$, along with other transport effects, but the $k_\perp$–$k_\parallel$ asymmetry is obscured by the truncation of the $\ell$-hierarchy at such low $\ell$: up to $\approx \tau_{dec}$, higher moments are strongly damped, but as the photons pass through $\zeta_C = 1$, fuzziness damping in this “scattering atmosphere” occurs. At decoupling, $\tau_C$ is only 5% of $\tau_{dec}$.
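The statement that the two damping scales are similar in magnitude can be made concrete with a short sketch. The function names are hypothetical; the numerical values in the usage note ($\sigma_D = 0.02$, $R_{V_C,dec} = 2\sigma_D\tau_{dec}$, $\tau_{dec} = 1$ in arbitrary units) follow the rough figures quoted above.

```python
import math


def silk_damping(k, sigma_d, tau_dec):
    """Silk (viscous) damping factor on Delta T / T: exp(-(k sigma_D tau_dec)^2 / 2)."""
    return math.exp(-0.5 * (k * sigma_d * tau_dec) ** 2)


def fuzziness_damping(k_par, r_v_dec):
    """Last-scattering fuzziness damping: exp(-(k_par R_V)^2 / 2), acting on k_par only."""
    return math.exp(-0.5 * (k_par * r_v_dec) ** 2)


def isotropized_fuzziness_damping(k, r_v_dec):
    """Angle-averaged fuzziness filter, with effective scale R_V / sqrt(3)."""
    return math.exp(-0.5 * (k * r_v_dec / math.sqrt(3.0)) ** 2)
```

For $\sigma_D = 0.02$ and $R_{V_C,dec} = 0.04$ (with $\tau_{dec} = 1$), the effective Gaussian scales are $0.02$ and $0.04/\sqrt3 \approx 0.023$, so the exponents of the two filters differ only by the factor $4/3$.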
5.2.5 Early reionization form factors
If we assume early reionization, and a decoupling redshift (where the visibility peaks) in the nr-dominated regime and below the epoch at which Compton drag lets up, $\lesssim 200$, then we have $F_v = 1$. For small $k\tau$ and adiabatic perturbations, we expect to have $F_\Phi = 1$ in this nr-regime, damping as $k\tau$ increases, but not exponentially once $\tau_C \sim \bar a^2$ grows to a point where tight-coupling breaks down. What one does get is a photon density gradient responding to the residual Compton drag; a converging baryon flow increases $\tilde\delta_\gamma/4$, a diverging one diminishes it: the net effect for large $k\tau$ is $F_\Phi \approx -\tau/\tau_C$, which falls like $\bar a^{-3/2}$ for a fully ionized medium. Thus in reionized adiabatic models, one expects a normal Sachs–Wolfe behavior at small $\ell$, with a velocity-induced extra piece pushing it up a bit at larger $\ell$, both being diminished by an overall large fuzziness factor, typically with $R_{V_C} \sim 0.3\tau$ (section 3.4.2). The high-$k\tau$ part of $F_\Phi$ has been shown to augment the velocity-induced term by an order-unity factor [217, 218].
5.2.6 The isocurvature effect on low multipoles

If the perturbation mode is isocurvature rather than adiabatic, the fluctuations are initially perturbations in the entropy (per CDM particle for isocurvature CDM perturbations or per baryon) without accompanying curvature perturbations. For these, there is another effect which amplifies $F_\Phi$ to 6, the isocurvature effect. Let $\delta_s \equiv \frac34\delta_\gamma - \delta_x$ denote the relative perturbation in the entropy per $x$-particle, where $x = cdm, B$. To have no energy density perturbation in the $k\tau \to 0$ limit and yet have a nonzero $\delta_s$, we must have $\delta_{\gamma S} \approx \frac43\delta_s\,(1 + \frac43\bar\rho_{er}/\bar\rho_{nr})^{-1} \approx -(\bar\rho_{nr}/\bar\rho_{er})\,\delta_{xS}$, where $\bar\rho_{nr}$ is the density of nonrelativistic particles, $\bar\rho_{er}$ is the density of relativistic particles and it has been assumed for this illustration that all nr and er particles will have the same relative density perturbations, $\delta_{xS}$ and $\delta_{\gamma S}$, respectively. At very early times, $\delta_{\gamma S}$ is tiny, with the entropy perturbation being carried by the $x$-particles, but as $\bar\rho_{nr}/\bar\rho_{er} = \bar a/\bar a_{eq}$ grows from unity to $\sim 10^4$, the perturbation is primarily carried by the radiation. When a given wave enters the horizon, $\delta_{xS}$ ceases declining, and begins to grow after $\tau_{eq}$ via the usual Jeans instability. This diminishment of $\delta_x$ at low $k$ translates to a sharp bend in the isocurvature CDM transfer function at $k \sim \tau_{eq}^{-1}$, falling as $k^2$ at low $k$, but being unity at high $k$. The reciprocal impact of this on $\delta_{\gamma S}$ gives the isocurvature effect.
It is easiest to see why $F_\Phi = 6$ using the equation for $\tilde\delta_\gamma(k,\tau)/4$ in the $k\tau \lesssim 1$ limit, the angle-average of the $\tilde\Delta_t$ transfer equation (and eq. (356) of Appendix C.3.1). Since $\delta_{\gamma S}$ is initially nearly zero, we have $\tilde\delta_\gamma(k,\tau)/4 \approx (\nu - \varphi + \partial_\tau[\bar a^{-1}\Psi_\sigma])(k,\tau)$. Both sides turn out to be gauge-invariant and so the right-hand side is the quantity $\nu_L - \varphi_L$; but $\nu_L$ is $\Phi_N$ by definition, and $\varphi_L \approx -\Phi_N$, which becomes exact if there are no anisotropic pressure terms; i.e., $\tilde\delta_\gamma(k,\tau)/4 \approx 2\Phi_N$ for low $k$: i.e., $F_\Phi \approx 6$ [215, 174]. The isocurvature effect can also be expressed in terms of the initial entropy fluctuation, $2\delta_s/5 = \delta_s/3 + \Phi_N/3$, where the gravitational potential is related to the initial entropy perturbation by $\Phi_N = \delta_s/5$.
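The bookkeeping behind the factor of 6 can be verified with exact rational arithmetic. This is a worked check only; the variable and function names are illustrative, and the density ratio in the usage is an arbitrary example value.

```python
from fractions import Fraction


def delta_gamma_from_entropy(delta_s, rho_er_over_rho_nr):
    """delta_gamma for vanishing total energy density perturbation:
    (4/3) delta_s / (1 + (4/3) rho_er / rho_nr)."""
    return Fraction(4, 3) * delta_s / (1 + Fraction(4, 3) * rho_er_over_rho_nr)


delta_s = Fraction(1)                 # initial entropy perturbation (arbitrary unit)
phi_n = delta_s / 5                   # Phi_N = delta_s / 5
source = 2 * phi_n                    # tilde-delta_gamma / 4 ~ 2 Phi_N at low k
f_phi = source / (phi_n / 3)          # form factor, normalized to Phi_N / 3

# Consistency of the entropy definition delta_s = (3/4) delta_gamma - delta_x
# when the energy perturbation vanishes: delta_x = -(rho_er/rho_nr) delta_gamma.
ratio = Fraction(1, 10)
delta_gamma = delta_gamma_from_entropy(delta_s, ratio)
delta_x = -ratio * delta_gamma
```

Here `f_phi` evaluates to 6, and `source` equals both $2\delta_s/5$ and $\delta_s/3 + \Phi_N/3$, as stated in the text.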
The $F_\Phi = 6$ factor is so large that isocurvature theories with nearly scale invariant spectra are strongly ruled out by the observations [215], although there is still some room for them to contribute at a subdominant level to adiabatic scalar perturbations [233] for dark matter dominated models. For isocurvature baryon perturbations, there are significant features in the density transfer function for scales $k\tau_{eq} \sim 1$ [131, 248, 216]. Although the isocurvature effect does lead to nearly scale invariant spectra being strongly ruled out, isocurvature baryon models with arbitrary spectra (which are steep with $n_s \approx 0$ corresponding to seed models) and arbitrary ionization histories are usually adopted; with this freedom, the case against them is strong but not yet definitive [216, 218, 219, 243]. (See also fig. 18).
5.3 Secondary anisotropies

Secondary anisotropies are non-Gaussian, with the spectral power concentrated in hot and/or cold spots on the sky, rather than being democratically distributed as it is for Gaussian anisotropies. The power spectra are instructive since they do tell us what the best scales to probe are, but they are far from the whole story. Examples of $C_\ell$ for the ambient SZ effect from clusters and dust-emission from primeval galaxies are shown in fig. 7. The dual nature of the power spectra for dust anisotropies can be understood using the methods of section 5.1, as sketched in section 5.3.1. The power spectrum can be used to calculate rms anisotropies. In the mid to late 80s, estimates made on the strength of the SZ effect from clusters concentrated on the distribution of $y$ and the rms variations as a function of beam, using simplified peak or Press–Schechter based models to calculate the abundance and clustering of the sources [2, 112, 113]. This work showed that for inflation-based CDM-like models, the rms SZ fluctuations would be quite small, well below $10^{-5}$. Similar techniques were applied to estimating the rms anisotropies in the far infrared and sub-mm from dusty primeval galaxies [81, 2, 42]. More powerful methods were developed in the nineties [117, 118, 120, 121, 122] to address how non-Gaussian the signals would actually be (e.g., figs. 15, 16).
5.3.1 Sample secondary anisotropy power spectra

For secondary anisotropies, the broad visibility case of section 5.1.5 is the appropriate limit for the angular power spectrum. For high $Q \equiv \ell + \frac12$,

$$C_\ell \approx \int d\bar\chi\,V^2(\bar\chi)\,(Q/R(\bar\chi))^{-1}\,\pi P_{G_0G_0}(k = Q/R(\bar\chi);\bar\chi,0). \tag{140}$$

It is often suitable to adopt a shot noise model for the distribution of the random source-field $G_0$ [2, 42]: this consists of (1) a class of objects defined by parameters $C$ (e.g., mass, luminosity, X-ray temperature) whose positions are specified by a random point process $n_{C*}(r) = \sum_{j\in C}\delta^{(3)}(r - r_j)$ with the sum over points $j$ satisfying the conditions $C$; and (2) profiles for $G_0$ centered at each point, $g(r|C,\tau)$. $n_{C*}$ is a comoving density if $r_j$ are comoving positions. The points $C$ could define galaxies, clusters, $N$-body groups, the centers of cosmic explosions, . . ., and the profiles $g$ may be asymmetrical (e.g., filaments, pancakes). In the “peak-patch picture” of [2, 68, 117, 120], the shots are equated with specially selected peaks of the smoothed linear density field.
The source function $G_0$ for a shot noise model is the sum of convolutions:

$$G_0(r,\tau) = \sum_C\int d^3r'\,g(r - r'|C)\,n_{C*}(r',\tau), \tag{141}$$

$$\tilde G_0(k,\tau) = \sum_C\tilde g(k|C,\tau)\,\tilde n_{C*}(k,\tau), \qquad \tilde g(k|C,\tau) \equiv g_c(\tau|C)\,V_C\,F_g(k|C).$$

We have separated $\tilde g(k)$ into a central value $g_c$, a weighted volume of the region $V_C \equiv \int g(r)\,d^3r/g_c$, and a form factor $F_g(k)$ which is dimensionless and equal to unity at $k = 0$ by construction. Although $g$ can be considered to be a random field as well, it is usual to just assume fixed profile shapes. An example of some interest is a truncated spherical $\beta$-profile,

$$g(r) = g_c\,(1 + r^2/r_{core*}^2)^{-3\beta/2}\,\vartheta(R_{C*} - r),$$

with core radius $r_{core}$ and truncation radius $R_C$; the $*$ denotes comoving quantities (e.g., $r_{core*} = \bar a^{-1}r_{core}$). This form is widely used to model the gas density in clusters and thus $g$ for the SZ effect if the cluster is isothermal. Fits to X-ray profiles give $\beta \approx 2/3$. An approximate form factor which roughly takes the truncation into account is

$$F_g(k|C) \approx \frac{\exp[-kr_{core*}]}{((kR_{C*})^2 + 1)^{\frac32(1-\beta)}}. \tag{142}$$

For small $k$, $F_g(k|C) \approx 1$ and $g_c(\tau|C)V_C$ is $\propto \langle T_e\rangle B_C$ for the SZ effect from clusters, where $B_C$ is the cluster baryon number, while for dust at fixed temperature, it is $\propto$ the mass of dust in the galaxy.
The power spectrum for $G_0$ can be written in terms of the cross-correlation power spectra for the shots:

$$P_{G_0G_0}(k) = \sum_{C_1}|\tilde g(k|C_1)|^2\,\bar n_{C_1*}\,\frac{k^3}{2\pi^2} + \sum_{C_1C_2}\tilde g(k|C_1)\,\tilde g^*(k|C_2)\,\bar n_{C_1*}\bar n_{C_2*}\,P^{cc}_{C_1C_2}(k), \tag{143}$$

where the tilde denotes Fourier transform. The shot correlation power has been decomposed into a Poisson contribution $\delta_{C_1C_2}\,(k\bar n_{C_1}^{-1/3})^3/(2\pi^2)$ describing the self-correlation of the discrete objects and a continuous correlation piece $P^{cc}_{C_1C_2}$ describing the clustering of the objects. In a linear biasing approximation, we would have $P^{cc}_{C_1C_2}(k,\tau) = b_{C_1}(\tau)\,b_{C_2}(\tau)\,P_{\rho\rho}(k,\tau)$, where the $b_{C_j}(\tau)$ are biasing factors and $P_{\rho\rho}(k,\tau)$ is an underlying mass density power spectrum. Even if such a relation were to hold for low $k$ one would expect considerable modification at high $k$.
For the Poisson piece, the contribution from an object which subtends an angle $\theta_C \sim R_{C*}/R(\bar\chi)$, whose core subtends $\theta_{core} \sim r_{core*}/R(\bar\chi)$, is

$$\sim Q^2\,(1 + Q^2\theta_C^2)^{-3(1-\beta)}\,e^{-2Q\theta_{core}}, \tag{144}$$

i.e., white noise (Poissonian) for small $Q$, $Q^{3\beta-1}$ for $Q > \theta_C^{-1}$, with an exponential suppression at very high $Q$. For the continuous clustering contribution, the overall amplitude is usually lower and the shape is multiplied by $Q^{n_{\rho,eff}(Q/\bar\chi)}$, where $n_{\rho,eff}(k)$ is the local index of $P_\rho$ (i.e., $\sim k^{3+n_{\rho,eff}}$). For angular scales $> \theta_C$, it can often dominate, $\sim Q^{2+n_{\rho,eff}(k)}$, cf. the $\sim Q^2$ Poisson term.
Notice that if we use a Gaussian profile for the shots and have a narrow visibility at redshift $z_s$, the $C_\ell$ we get from the Poisson piece is a Gaussian coherence spectrum with coherence angle $\theta_c \approx \varpi_c = \sqrt2\,(1 + z_s)R_C/R(\chi_s)$, i.e., eq. (106) with $\nu_{\Delta T} = 2$.
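The connection between the $\beta$-profile form factor of eq. (142) and the Poisson shape of eq. (144) can be sketched as follows. Function names are hypothetical; the default $\beta = 2/3$ is the X-ray fit value quoted above, and all lengths are comoving.

```python
import math


def form_factor(k, r_core, r_trunc, beta=2.0 / 3.0):
    """Approximate form factor of the truncated beta-profile, eq. (142).

    r_core and r_trunc are the comoving core and truncation radii.
    """
    return math.exp(-k * r_core) / ((k * r_trunc) ** 2 + 1.0) ** (1.5 * (1.0 - beta))


def poisson_shape(q, theta_c, theta_core, beta=2.0 / 3.0):
    """Angular shape of the Poisson piece, eq. (144), up to normalization."""
    return (q ** 2 * (1.0 + (q * theta_c) ** 2) ** (-3.0 * (1.0 - beta))
            * math.exp(-2.0 * q * theta_core))
```

Evaluating the form factor at $k = Q/R(\bar\chi)$ with $\theta_C = R_{C*}/R(\bar\chi)$ and $\theta_{core} = r_{core*}/R(\bar\chi)$ gives `poisson_shape(Q, ...) == Q**2 * form_factor(...)**2`, which is the $k^3|\tilde g|^2/k$ scaling of the Poisson term in eq. (143) carried through eq. (140).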
5.3.2 Anisotropy power from dusty primeval galaxies

The BCH2 [42] spectra shown in fig. 15 show the basic features: a Gaussian radial profile of scale $R_G = 10$ kpc for the dust in galaxies defines the cutoff at high $\ell$, the amplitude is determined by the galaxy (shot) density, here chosen to be that of bright galaxies $n_{G*} = 0.02\,(h^{-1}\,{\rm Mpc})^{-3}$. The continuous clustering piece dominates at lower $\ell$.³
The spectra clearly show that to get the maximum signal one would like
to probe the shot noise power, i.e., have a small beam. This is misleading
because a small beam may be unlikely to capture a galaxy. Large beams have
too many galaxies in them to give much shot-noise anisotropy. Clustering
dominates the signal there. Figure 15 emphasizes how useful very small angle
anisotropy experiments can be for detecting high redshift dust emission from
primeval galaxies even for cases which fall well below the FIRAS bounds. The
dust maps in fig. 15 were constructed using the peak-patch method to identify
the high redshift galaxies [82]. The most promising instrument coming on line
for this is SCUBA on JCMT [114], with 12″ resolution, and the ability to
probe a number of frequencies, in particular an atmospheric window around
850µm. The peak model shown in the figure gives rms anisotropies $(\Delta\,\nu I_\nu)_{\rm rms}$
of 0.2S, as measured in units of $10^{-6}$ erg cm$^{-2}$ s$^{-1}$ sr$^{-1}$. Assuming galaxies
with a density of ∼ 0.02 $(h^{-1}\,{\rm Mpc})^{-3}$ (the current density of “bright” galaxies)
do all of the emission in a biased CDM model for the other (BCH2) models
given in table 1, the rms anisotropies that SCUBA would see (assuming a
double differencing mode) would be quite large: 1, 4, 4, 0.3 in the above units,
corresponding to $\Delta T/T$ of $(7, 20, 12, 2) \times 10^{-5}$ for BCH2 models 8, 11, 14, 13,
respectively. These anisotropies should be compared with the current 800µ
JCMT 95% C.L. upper limit for an 18″ beam of $3.4 \times 10^{-3}$ [115, 42], and the
1300µ IRAM millimeter 95% C.L. upper limit for an 11″ beam of $2.4 \times 10^{-4}$
[116, 42]. (The signal would also have fallen off from the 800µ value by 1300µ.)
Of course the map in fig. 15 also demonstrates that the rms emission is somewhat
misleading for SCUBA since it is concentrated in bright patches; and it is
totally misleading for the interferometric arrays.

³ The particular model chosen hybridized a biased linear density power spectrum shape for
small k and a nonlinear power law contribution for high k, with the two joined at $k_{NL}$ where
the power is unity. The shape change in the graph is a result of this rough approximation.
The maximum occurs where $(Q/R(\bar\chi))^{-1}\pi P_{G'G'}(k = Q/R(\bar\chi)) \sim Q^{2+n_{\rho,{\rm eff}}(k)}$ is maximum,
at $n_{\rho,{\rm eff}}(k) \sim -2$, which occurs at ∼ 0.5 $h^{-1}$ Mpc for the CDM spectrum, and on somewhat
larger scales for adiabatic models with nonzero Λ.

Figure 15: Illustration of what the sub-mm emission might look like from
primeval galaxies in a $\sigma_8 = 0.7$ CDM model. (a) A 4′ × 4′ contour map for
dust-emission from primeval galaxies at z ∼ 5 convolved with a 12″ beam
appropriate for the 855µ 37-pixel SCUBA array. The minimum contour is
1000S µJy/beam and subsequent contours increase linearly in 250S µJy/beam
steps. SCUBA has a 2′ × 2′ FOV and is expected to achieve 470 µJy/beam
at the 1σ level in just one hour of integration. (b) Shows the same map
seen with a 1″ beam with 250S µJy/beam contours for an 800µ sub-mm array.
(c) Shows the map with a 0.86″ beam with 200S µJy/beam contours
for a 1.36 mm array. S is a scaling factor which is 1 if all “bright galaxies”
have Arp 220 luminosities down to redshift 4. To satisfy the FIRAS
bound (Fixsen et al. 1996), apparently either $S \lesssim 0.1$ is required, or $\lesssim 0.1$ of
the sources present could be bursting. (1 Jansky ≡ $10^{-26}$ W m$^{-2}$ Hz$^{-1}$, hence
$I_\nu = (\lambda/3\,\mu{\rm m})(\delta E(\lambda)/E_{\rm cmb})$ MJy sr$^{-1}$.)
5.3.3 SZ and nonlinear Thomson scattering from clusters

The most direct way to make maps of secondary anisotropies is to do hydro-
dynamical simulations, then calculate the line-of-sight integrals of G through
the computational volume. It is difficult to make large enough simulations and
still get the resolution needed to treat the structure in the objects. A good
example of the current state of the art is given in [122], in which SZ maps were
constructed by using many hydrodynamical simulations. The “peak-patch pic-
ture” that Steve Myers and I developed [68] allows us to determine the spatial
distribution and properties of rare events in the medium such as clusters over
very large volumes of space by identifying them with carefully selected peaks
of the linear density field [68, 117, 120]. Peak-patch catalogues accord well
with N -body cluster and group catalogues, both statistically and spatially, re-
produce well the gross internal properties such as mass and internal energy,
and do reasonably well at getting the bulk flow of the rare events [68]. The
maps in fig. 16 were constructed in this way, finding all clusters and groups in
a 16◦ × 16◦ patch over a region extending out to redshift 1.5 for a σ8 = 0.7
CDM model. A truncated β profile was used with β = 2/3 to give the gas
density distribution, the core radius was calibrated with X-ray observations,
and the gas was assumed to be isothermal. σ8 controls the overall abundance
of rich clusters: maps such as these look dramatically different with even small
variations. The shape of the power spectrum controls the poor-to-rich cluster
ratio. The σ8 was chosen so the cluster abundances as a function of tem-
perature roughly agreed with X-ray observations. A COBE-normalized CDM
model has σ8 ∼ 1.2 (eq. (222)) and far too many large clusters, but, for ex-
ample, a COBE-normalized Ω = 1 model with a mixture of hot and cold dark
matter (see section 7.3) has σ8 ∼ 0.7, fits the X-ray data reasonably well, and
has a similar appearance to the CDM model shown here, albeit with a smaller
poor-to-rich cluster ratio [120].
The SZ, moving cluster and primary maps of fig. 16 have the following minima,
maxima, mean offsets, and rms, in units of $10^{-6}$: (a) $(-47, 0, -2, 3)\,C_{SZ}$;
(c) $(-8, 6, -0.04, 0.4)\,C_V$; (d) $(-53, 48, -0.06, 18)$. Thus the SZ effect is
competitive with the much larger primary anisotropies expected in this model only
in the cores of clusters; and the moving-cluster anisotropies are disappointingly
small, even when nonlinear corrections are included. For the X-rays, the map
flux characteristics are (b) $(0, 12, 0.05, 0.2) \times 10^{-14}\,C_X$ erg cm$^{-2}$ s$^{-1}$. Using
the information in a deep field cluster catalogue such as (b) will clearly be
invaluable for separating SZ from primary. Even so, since the true sky will be
the sum of (a), (c), and (d), plus Galactic and extragalactic synchrotron and
bremsstrahlung sources for CBI, and plus dust for SAMBA (some possibly cold),
separation using many well-spaced frequency bands will be essential and also
quite difficult.

Figure 16: 2° × 2° maps for a $\sigma_8 = 0.7$ CDM model that could be probed
by the Cosmic Background Imager (CBI) being built by Caltech: an 8 small-dish
interferometer to map scales from ∼ 2′–20′, with optimal sensitivity $\gtrsim 5'$,
using HEMTs to cover frequencies 30–40 GHz, with a 15 GHz channel to help
to remove contamination. (a) Shows the SZ effect for 30 GHz, with contours
$-5 \times 10^{-6}\,C_{SZ} \times 2^{n-1}$; (b) the associated ROSAT map (0.1–2.4 keV), with
contours $10^{-14}\,C_X \times 2^{n-1}$ erg cm$^{-2}$ s$^{-1}$, so the minimum contour level is similar
to the ROSAT 5σ sensitivity for long exposure pointed observations; (c) the
Thomson scattering anisotropy induced by the bulk motion of the clusters, with
contours now $\pm 1.25 \times 10^{-6}\,C_V \times 2^{n-1}$, $C_V \approx 1.2$; (d) primary anisotropies, with
contour levels at $\pm 10^{-5} \times 2^{n-1}$. Negative contours are light and dotted. The
$C_{SZ}$, $C_X$ and $C_V$ are order-unity correction factors.

5.3.4 Single-cluster observations of the SZ effect

There has been dramatic improvement in observations of the SZ effect from
individual clusters in the last few years, with the promise of much more to
come. The effect has now clearly been seen in more than a dozen clusters at
between the 5 and 10 sigma level [202, 203], with redshifts ranging from 0.023
for COMA to 0.55. The immediate implication of this sort of observation is
that the CMB comes from a redshift > 0.55.
The SZ effect involves pressure integrals along the line of sight through the
cluster. Even in a medium with gas in states of mixed cooling, so with large
density and temperature inhomogeneities, the pressure tends to uniformize on
a sound crossing time into a distribution defined by the gravitational potential.
By contrast, X-ray emission, which involves line of sight integrals of $n_e^2 T_e^{1/2}$
for bremsstrahlung and more complicated temperature and abundance dependences
for recombination and line cooling, is very sensitive to clumping. Because
the SZ effect is proportional to $n_e$ rather than $n_e^2$, it can probe the
intracluster medium out to much larger radial distance than the X-ray emission,
especially when sensitivities in the few times $10^{-6}$ range can be achieved.
A further advantage is that ∆T /T for clusters does not diminish with red-
shift for a nonevolving cluster population, whereas cluster X-ray fluxes drop off
dramatically. We have evidence from X-ray observations that there is strong
evolution of the cluster population, and this is expected theoretically as well.
Even so, we should eventually be able to probe clusters at z ∼ 1 with the
SZ effect. An illustration of this is fig. 17: a smooth particle hydrodynamics
calculation in a CDM model of a rare-event region that grows into a cluster
of COMA-like mass $10^{15}\,h^{-1}\,M_\odot$ by z = 0 after a major merger at z ∼ 0.05
is seen in $\Delta T/T$ at z = 0.7, 0.5, 0.2. (SPH and other hydro calculations of
the SZ effect for individual clusters were pioneered by Evrard [213] but have
not been exploited much to date. The SPH example shown, from Bond and
Wadsley [214], evolved from peak-patch initial conditions for a spherical re-
gion 30 h−1 Mpc across constrained to give a cluster of the final state mass,
as described in [68]. The cosmology is a standard CDM model of fig. 7 with
σ8 = 1 (20% lower σ8 than COBE-normalization would give). The simulation
used 65247 gas and 65247 dark particles, and a $128^3$ multigrid Gauss-Seidel
gravity solver with particle-particle corrector forces to improve short distance
resolution, by a factor of 20 or so. The gravitational and SPH smoothing were
matched, which means that about 40 neighbors were required to be within the
softening radius. As such the resolution achieved, 30 h−1 kpc at the final stage,
is a factor of 5 or so better than X-ray core radii, just good enough for the X-
ray calculations. Calculations that are optimized for X-rays with an order of
magnitude more particles are easily feasible on current workstations and are
currently being done by a number of groups.)
Combining the SZ and X-ray observations is one of the main paths to
$H_0$ (and in principle $q_0$). Because $(\Delta T) \propto \int n_e T_e\,\bar a\,d\chi$ and the X-ray surface
brightness $\Sigma_X \propto \bar a_{cl}^4 \int n_e^2 T_e^{1/2}\,\bar a\,d\chi$ for bremsstrahlung, the proper linear
scale of the cluster, $R_{cl} \propto (\bar a_{cl}^2\,\Delta T)^2/(T_e^{3/2}\,\Sigma_X)$, can be measured. But
$R_{cl} = 2H_0^{-1}(1-\sqrt{\bar a_{cl}})\,\bar a_{cl}\,\theta_{cl}$ (for flat universes, with a weak $q_0$-dependence for
nearby clusters), so combining the SZ and X-ray observations with the angular
size $\theta_{cl}$ and the redshift of the cluster $z_{cl}$ allows $H_0$ to be estimated. In practice
the data are used in a more sophisticated way than this, but even so, spheri-
cal symmetry is assumed: cluster elongation along the line of sight pushes H0
down, clumpiness pushes it up. These and other effects make the values of
H0 derived from SZ/X observations uncertain and it is difficult to set realistic
error bars.
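The scaling quoted above can be spelled out in a few lines. The sketch below assumes, purely for illustration, an isothermal cluster of uniform electron density whose line-of-sight depth is comparable to $R_{cl}$, and sets $c = 1$; it is not the more sophisticated fitting procedure used in practice.

```latex
\begin{align}
\Delta T &\propto n_e\,T_e\,R_{cl}\,, \qquad
\Sigma_X \propto \bar a_{cl}^{4}\,n_e^{2}\,T_e^{1/2}\,R_{cl}\,, \\
\frac{(\Delta T)^2}{\Sigma_X} &\propto \frac{T_e^{3/2}\,R_{cl}}{\bar a_{cl}^{4}}
\quad\Longrightarrow\quad
R_{cl} \propto \frac{(\bar a_{cl}^{2}\,\Delta T)^2}{T_e^{3/2}\,\Sigma_X}\,, \\
H_0 &= \frac{2\,(1-\sqrt{\bar a_{cl}})\,\bar a_{cl}\,\theta_{cl}}{R_{cl}}
\;\propto\; \frac{\Sigma_X\,T_e^{3/2}}{(\Delta T)^2}\,.
\end{align}
```

In this form the direction of the biases is easy to read off: clumping raises $\Sigma_X$ at fixed $\Delta T$ and so raises the inferred $H_0$, while elongation along the line of sight makes the depth inferred from $(\Delta T)^2/\Sigma_X$ larger than the transverse size and so lowers it.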
So far the SZ effect has only been observed in massive clusters where the
effect is quite large. Birkinshaw [204, 205, 206], using the single 40-m OVRO
dish at 1.5 cm (ovro filter, fig. 6), observed the SZ effect in 3 clusters, 0016+16,
Abell 665 (the richest cluster in Abell’s catalogue), and Abell 2218, with ob-
served central decrements −(1.8, 1.5, 1.5) × 10−4 . These were used to estimate
Hubble parameters for the latter two of 51 ± 18 and 65 ± 25. Nearby clusters,
such as Coma at z = 0.023 [207] and three other X-ray luminous ones [208],
have recently been detected using the 5-m OVRO dish (ovro22 filter, fig. 6).
H0 values obtained with this data vary, with 74 ± 29 for COMA, similarly large
values for Abell 2256 and 2142, but a significantly smaller one for Abell 478.
The Ryle telescope is a 5-km interferometer array with 8 13-m dishes and re-
ceivers operating at 2 and 6 cm. [209] showed the first SZ image of a cluster,
Abell 2218, at z = 0.17 (with H0 = 38 ± 17 cf. [206]) and the Ryle team have
now imaged the SZ effect in a dozen clusters, including 0016+16, Abell 665,
1722 and 773. Since some clusters give low H0 , some high, it is unclear what
conclusion to draw: although we can be confident that the statistical error bars
will shrink with new technological advances, systematic error bars may never
be reliable because clusters are decidedly not idealized spherical distributions.
However, the distribution of H0 determinations for a well-selected sample of
clusters may help to reduce these biases.
Other interferometers are also being applied to this problem. Because clusters
at moderate redshift subtend a reasonably large angle, high resolution
instruments with long baselines such as the VLA are not effective. On the
other hand, smaller dishes in a close-packed configuration are quite promising:
the OVRO millimeter-wave interferometer at 32 GHz was used to observe
Abell 773 and 0016+16 [212]; the Australia Telescope Compact Array, ATCA,
is another example of a compact array being applied to this problem [211].

Figure 17: SZ contour maps derived from a cosmological hydrodynamics simulation
for a cluster which becomes massive, hot and large at redshift zero after
a major merger at z ∼ 0.05. The SZ image ($\Delta T/T = -2y$ here) is contrasted
with the ROSAT image at redshift 0.2. The ROSAT contours correspond to
deep-pointing mode. The upper panels show that the SZ effect in the pre-merge
pieces at z = 0.7, 0.5 is reasonably large on a few arcminute scales. The solid SZ
contour levels $\geq 10^{-5}$ are experimentally accessible now and the dotted $< 10^{-5}$
contours should eventually be feasible. The contours roughly scale with $\Omega_B$,
here chosen to be 0.05 for this $\Omega_{nr} = 1$ CDM model.

SuZIE uses bolometers (operating at 300 mK, at wavelengths 1.2 and 2.2
mm) on the Caltech sub-mm telescope (1.4′ beam, 2′ separation). The SZ effect
in Abell 2613 has been observed, with a large $-(2.6 \pm 0.6) \times 10^{-4}$ decrement.
The advantage here is that one can straddle the SZ sign change (fig. 5). It is
hoped that one can use this to extract the moving cluster effect from the SZ
effect, at least for cases when the cluster is moving mostly toward or away
from us (cf. fig. 16) at a high speed.
5.3.5 The maximum entropy nature of Gaussian anisotropies

One of the fundamental features of these secondary maps is that they have
their power concentrated in hot and/or cold spots: they are decidedly non-
Gaussian. The fundamental characterization of Gaussian fluctuations is de-
scribed by the following lemma [117, 120]: Consider a general statistical distri-
bution functional P[∆t (q̂)]D∆t (q̂) giving the probability of an anisotropy con-
figuration ∆t (q̂) of the random field ∆t . Define the entropy of this probability
to be
$$ {\rm Entropy}[P] = -\int P[\Delta_t(\hat q)]\,\ln\left(P[\Delta_t(\hat q)]\right)\,\mathcal{D}\Delta_t(\hat q)\,. \qquad (145) $$
Among all of the distributions with a specified spectrum $C_\ell$, the Gaussian one
is the one which maximizes this entropy. Thus the Gaussian statistics of the
primary anisotropy maps displayed in fig. 9 show maximally random distribu-
tions of the power available. The best observing strategy is then to concentrate
the observing time on just a dozen or so patches of the sky because you are
bound to hit something. For non-Gaussian fluctuations, with power more con-
centrated around the “hot” or “cold” spots than in the Gaussian case, a better
observing strategy is to sample many patches at lower sensitivity to look for
the regions of high power concentration. Because we now expect that the ob-
served anisotropy will be a sum of many component signals, Galactic as well as
cosmological, most of which will be source-like non-Gaussian ones, it is really
essential to sample very many patches: i.e., to make large maps.
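A one-variable analogue of the lemma can be checked directly. The sketch below is an illustration, not the functional statement of eq. (145): it compares the closed-form differential entropies of three single-variable distributions that share the same variance. The choice of comparison distributions (Laplace and top-hat) and all function names are assumptions made here for the example.

```python
import math


def gaussian_entropy(sigma):
    """Differential entropy of a Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)


def laplace_entropy(sigma):
    """Differential entropy of a Laplace (double-exponential) distribution
    whose variance 2*b**2 equals sigma**2."""
    b = sigma / math.sqrt(2.0)
    return 1.0 + math.log(2.0 * b)


def uniform_entropy(sigma):
    """Differential entropy of a top-hat distribution whose variance
    width**2 / 12 equals sigma**2."""
    width = sigma * math.sqrt(12.0)
    return math.log(width)


def compare_entropies(sigma=1.0):
    """Return the three entropies for a common variance sigma**2."""
    return {
        "gaussian": gaussian_entropy(sigma),
        "laplace": laplace_entropy(sigma),
        "uniform": uniform_entropy(sigma),
    }


if __name__ == "__main__":
    for name, value in compare_entropies(1.0).items():
        print(f"{name:9s} {value:.4f}")
```

At fixed variance the Gaussian value is the largest of the three, which is the single-pixel version of the statement that, at fixed $C_\ell$, the Gaussian distribution spreads the available power as randomly as possible.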
5.3.6 Quadratic nonlinearities in Thomson scattering

As noted in section 5.1, quadratic nonlinearities in Thomson scattering can
sometimes dominate over the first-order anisotropies if the latter are strongly
damped and there is early ionization. Their importance was originally suggested
by Vishniac [109], and calculations have been done by Efstathiou and
me [216, 218, 243], Dodelson and Jubas [142] and Hu and Scott [221]. Even if
there is early reionization in nearly scale invariant models, there is generally not
sufficient power on small length scales for the Vishniac effect to be important.
Thus it can usually safely be ignored in inflation-based models.

Figure 18: 10° × 10° maps, with grey scale extending from $-4\sigma_{\rm map}$ to $4\sigma_{\rm map}$, for
an $n_{is} = 0$, $\Omega = \Omega_B = 0.2$ isocurvature baryon model with (b) no recombination,
calculated using only linear perturbation theory, (a) with quadratic
nonlinearities added as well (and assuming the superposition of many sources along
each line of sight is sufficient to make the nonlinear contributions
Gaussian-distributed, which should be reasonable). (c) and (d) show standard
recombination models, (d) with $\Omega_B = 1$ and (c) with $\Omega = \Omega_B = 0.2$, illustrating how
the changed geometry concentrates the signal to smaller angular scales. The
total power $I[C_\ell]$ for (b) is $(3 \times 10^{-5})^2\,\sigma_8^2$, rising to $\approx 10^{-8}\,\sigma_8^2$ for (a) with the
nonlinearities. The SR model (c) has total power $10^{-8}\,\sigma_8^2$.

This is not so for isocurvature baryon models in which the initial spectral
index nis is considered a free parameter, as the maps in fig. 18 adapted from
[222] demonstrate, using power spectra taken from [243] calculated using the
methods of [88, 216]. The nonlinear source-field is G = VC (τ )δe q̂ · ve , where
δe = δne /n̄e is the perturbation to the electron density and ve is the electron
velocity. Subdominant nonlinearities ∝ (δne )2 have been ignored. The electron
and baryon velocities can be taken to be the same, but δe = δB + δYe /Ȳe can
have a piece associated with fluctuations in the ionization fraction, although it
is usually ignored. The calculation uses some of the techniques of section 5.1,
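in particular the Fourier transform expression for $C_\ell$, eq. (129), using a
time-averaged isotropized power spectrum for the nonlinear source-field given by
[216]
$$
\begin{aligned}
C_\ell &\sim Q^2 \int_{Q/\bar R}^{\infty} \frac{d\ln k}{k\bar R\,\sqrt{(k\bar R)^2 - Q^2}}\; I_1\, P_{GG}\,, \\
I_1 &= \int d\bar\chi\; V_C^2(\bar\chi)\, \frac{(D\dot D)^2}{(D_1\dot D_1)^2}\, \tau_{\zeta_C=1}\,, \\
P_{GG} &\approx \frac{\pi^2}{32}\,\frac{1}{9}\, P_{\Phi_N}(k,\tau_{\zeta_C=1})\,(k\tau_{\zeta_C=1})^5\, I_2(k)\,, \qquad (146) \\
I_2(k) &= \int_0^\infty dy \int d\mu\; \frac{y(1-\mu^2)(1-2\mu y)^2}{(1+y^2-2\mu y)^{3/2}}\,
\frac{P_{\Phi_N}(k(1+y^2-2\mu y)^{1/2})}{P_{\Phi_N}(k)}\, \frac{P_{\Phi_N}(ky)}{P_{\Phi_N}(k)}\,.
\end{aligned}
$$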
The gravitational potential fluctuations are assumed to be time-independent
appropriate to nr-dominated evolution; D(τ ) ∼ τ 2 is the linear growth factor
for the density fluctuations and Ḋ = dD/dτ is the linear growth factor for the
velocity fluctuations. The normalization time, τζC =1 , is chosen to be when the
optical depth is unity, i.e., at the maximum of VC /(H̄ā). The Vishniac effect
actually comes from a broad redshift range because of the D Ḋ growth factor.
R̄ is an average cosmological distance.
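The growth-factor weighting in $I_1$ can be made explicit for the nr-dominated case quoted above. This is only a sketch: it assumes that the subscript 1 on $D_1\dot D_1$ denotes evaluation at the normalization time $\tau_1 \equiv \tau_{\zeta_C=1}$ with redshift $z_1$, and it uses $\bar a \propto \tau^2$, which holds for a flat nr-dominated background.

```latex
\begin{align}
D \propto \tau^2 \;&\Rightarrow\; \dot D \propto \tau\,, \qquad D\dot D \propto \tau^3\,, \\
\bar a \propto \tau^2 \;&\Rightarrow\; \tau \propto (1+z)^{-1/2}\,, \\
\frac{(D\dot D)^2}{(D_1\dot D_1)^2} &= \left(\frac{\tau}{\tau_1}\right)^{6}
 = \left(\frac{1+z_1}{1+z}\right)^{3}\,.
\end{align}
```

Under these assumptions the integrand of $I_1$ gains a factor $(1+z)^{-3}$ toward low redshift, which partly offsets the decline of $V_C^2$ and is one way to see why the contribution is spread over a broad redshift range.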
The spectral index in the maps has been chosen to correspond to the Pois-
son seed model (although Gaussian initial conditions were assumed). Phe-
nomenologically it seems nis between –1 and 0 is preferred. Since there is a
great deal of power at short distances in these models, star formation is ex-
pected to occur very early, hence it seems likely that the Universe would have
been photoionized shortly after the usual epoch of recombination at z ∼ 1000.
Thus no-recombination (NR) models provide a more realistic description of the
anisotropies expected in isocurvature baryon models, although our limited abil-
ity to deal with star formation in the early universe gives us arbitrary freedom
in designing an ionization history. The isocurvature effect due to photon bunch-
ing at large scales is augmented by Thomson scattering anisotropies from the
flow of baryons during decoupling, giving an enhanced signal around $\ell \sim 200$
for the NR ΩB = 0.2 model. Although primary small angle anisotropies are
diminished if there is NR, the quadratic nonlinearities in the scattering induce
a significant anisotropy in, e.g., the ovro window of fig. 6, and even more so for
experiments with filters like “VLA”, especially for high nis models. With σ8
about unity (the conventional value for these models), large regions in ΩB –nis
space are ruled out by current small and medium angle observations
[243], but exactly how much depends upon one’s assumptions about
ionization history. Nonlinearities beyond quadratic order could also obscure
this result. There is also uncertainty in how to extrapolate the spectrum to the
curvature scale, so it is unclear how such a model is to be “COBE-normalized”.
5.3.7 The influence of weak gravitational lensing on the CMB

Another nonlinear effect is gravitational lensing which bends, focusses and
defocusses the CMB photons as they propagate from decoupling through the
clumpy medium to us. Of course lensing is a mature subject in astronomy (e.g.,
[272, 273]). There was a flurry of activity in the late 80s assessing whether or
not lensing would significantly decrease anisotropies by taking photons from
a high ∆T /T region and dispersing them into lower ∆T /T regions [274, 275,
276, 277, 278]. Given the difficulties that astronomers have had detecting
lensing, with the best observations coming from clusters of galaxies, it may
seem obvious that the effect on the ∼ 10′ coherence scale typical for primary
CMB anisotropies is likely to be quite small; and this is what these papers found.
However there is an effect on sub-arcminute scales that may affect some types
of secondary anisotropies.
An important issue to re-address is how weak-lensing from late-time linear
and nonlinear structure development may complicate the interpretation of the
primary anisotropy power spectrum even if it is very well determined. In the
post-COBE era, the CMB lensing issue has been picked up again by [279, 280].
In particular, while the earlier papers emphasized the influence of lensing on
the correlation function, Seljak [280] has shown what its impact will be on
$C_\ell$. To show the effects on both, in Appendix C.5 I apply the Boltzmann
transport equation formalism of Appendix B to lensing of primary anisotropies,
in particular to $C^{\rm lens}(\theta)$ and $C_\ell^{\rm lens}$ in eqs. (368), (370).
The critical quantity to determine is the statistical distribution of the extra
displacement between two photons due to lensing relative to their unlensed
separation; i.e., the component of geodesic deviation driven by curvature. At
small separations, the displacement defines a 2 × 2 shear tensor. The surface
upon which the radiation pattern is constructed and upon which the separation
is measured should be well after decoupling so that the distribution function
for the here and now is just a direct map of the initial distribution function on
the post-decoupling hypersurface by the action of a Green function that now
fully incorporates the bending geodesic trajectories.
The total angular power is conserved: i.e., $C^{\rm lens}(0) = C^{\rm no\text{-}lens}(0)$. However,
at finite separation θ, $C^{\rm lens}(\theta)$ is a smoothed version of $C^{\rm no\text{-}lens}(\theta)$, with
smoothing scale ∼ εθ where ε is basically an rms shear. Since C(θ)/C(0) gives
the statistically-averaged profile about a point, this means that lensing smooths
out hot and cold spots e.g., [275, 276, 277], but it does so in such a way as to
preserve the overall power in a map. How much spreading occurs depends upon
how outrageous the structure formation model is, but the consensus of the pa-
pers on this subject is that the effect is not very large for primary anisotropies.
Seljak [280] has used realistic power spectra that include nonlinear corrections,
which enhances the role of lensing at sub-arcminute scales, to translate the
correlation function decline into its effect on $C_\ell$: the effect on the primary spectra
of fig. 7 is to smooth the Doppler peaks; the typical range in $\ell$ over which the
power is spread in $\Delta\ell/\ell$ is basically the weak-lensing shear, about 10% to 20%
or so at a few arcminutes, depending upon the model [280], in agreement with
the levels estimated by people advocating using the influence of weak-lensing
on the ellipticities of faint galaxy images to determine the mass density power
spectrum e.g., [273].
6 Perturbation theory of primary anisotropies

6.1 Overview of fluctuation formalism
A generic fluctuation variable D(x, t) can be expanded in terms of modes M ∈
{adiabatic scalar, isocurvature scalar, vector or tensor; growing or decaying}:
$$ D(x,t) = w \sum_{kM} \left\{ u^{(D)}_{kM}(t)\,Q_{kM}(x)\,a_{kM} + u^{(D)*}_{kM}(t)\,Q^*_{kM}(x)\,a^\dagger_{kM} \right\}, $$
$$ w = 1/2 \ \ {\rm classical}\,, \qquad w = 1 \ \ {\rm quantum}\,. \qquad (147) $$
For classical fluctuations, $a_{kM}$ is a random variable and $a^\dagger_{kM}$ its complex
conjugate, while for quantum fluctuations, $a_{kM}$ is an annihilation operator for
the mode kM and $a^\dagger_{kM}$ is the creation operator. The $u^{(D)}_{kM}(t)$ are mode
functions which describe the evolution (and, for now, include polarization effects,
e.g., for gravitational waves). The spatial dependence of the modes is given
by eigenfunctions $Q_{kM}(x)$ of the Laplacian of the background geometry. For
a flat background of most relevance to inflation models, it is simply a plane
wave, $Q_{kM}(x) = e^{ik\cdot x}$, labelled by a comoving wavevector k. For curved
backgrounds, the eigenfunctions are more complex.
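The power spectrum of D associated with mode M is the fluctuation variance
per log wavenumber and can be expressed in terms of the statistics of
$a_{kM}$ and $a^\dagger_{kM}$:
$$
\begin{aligned}
{\rm quantum:}\quad P^M_D(k) &\equiv \frac{d\sigma^2_{D|M}}{d\ln k}
 = \frac{k^3}{2\pi^2}\,|u^{(D)}_{kM}(t)|^2\,\left(1 + 2\langle a^\dagger_{kM}a_{kM}\rangle\right), \\
{\rm classical:}\quad P^M_D(k) &= \frac{k^3}{2\pi^2}\,|u^{(D)}_{kM}(t)|^2\,\langle a^*_{kM}a_{kM}\rangle\,. \qquad (148)
\end{aligned}
$$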
In the quantum case, $\langle(-)\rangle$ denotes Trace(ρ(−)), where ρ is the density matrix
operator; in the classical case, it denotes ensemble average with respect to
the probability distribution functional. If the modes are Gaussian-distributed,
statistically homogeneous and isotropic, then this is all that is needed to specify
the patterns in the field D(x, t). The local shape is characterized by the index
$$ n_D(k) + 3 \equiv d\ln P^M_D(k)/d\ln k\,. \qquad (149) $$
Thus $-n_D$ is a “fractal dimension”: zero is white noise, while three is scale
invariance in D.
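The definition in eq. (149) is easy to evaluate numerically for any tabulated or analytic spectrum. The following is a minimal sketch; the function names and the two test spectra are illustrative choices made here, not anything taken from the calculations described in these notes.

```python
import math


def local_index(power, k, dlnk=1.0e-3):
    """Estimate n_D(k) from n_D + 3 = d ln P_D / d ln k (eq. 149),
    using a centered finite difference in ln k."""
    ln_p_hi = math.log(power(k * math.exp(dlnk)))
    ln_p_lo = math.log(power(k * math.exp(-dlnk)))
    slope = (ln_p_hi - ln_p_lo) / (2.0 * dlnk)
    return slope - 3.0


def white_noise(k):
    """Power per ln k for white noise: P_D proportional to k^3 (n_D = 0)."""
    return k ** 3


def scale_invariant(k):
    """Power per ln k independent of k (n_D = -3)."""
    return 1.0


if __name__ == "__main__":
    print("white noise     n_D =", local_index(white_noise, 0.1))
    print("scale invariant n_D =", local_index(scale_invariant, 0.1))
```

For a pure power law the centered difference in ln k is exact up to rounding, so the two cases return indices of 0 and −3, matching the "fractal dimension" values of zero and three quoted in the text.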
In the inflation picture, the wavenumbers in the observable regime are usually
considered to be so high that any pre-inflation mode occupation, $\langle a^\dagger_{kM}a_{kM}\rangle$,
is negligible, and only the unity zero point oscillation term appears. In that
case, we connect to the random field description by making the real and
imaginary parts of $a_{kM}$ Gaussian-distributed with variance 1/2. Although
quantization is at least self consistent in linear perturbation theory about a classical
background, and gauge invariant, there are still obvious subtleties associated
with the transition from a quantum to a classical random field description. A
true inconsistency appears if we include the nonlinear backreaction of the fluc-
tuations upon the background fields and upon themselves. For this, we would
need a quantum gravity theory. The stochastic inflation theory is an attempt
to bypass this using classical fields acted upon by quantum-derived noise (e.g.,
[180, 181]). In the inflation regime,
D ∈ {δφinf , δφis , h+ , h× , δ ln a, δ ln H, δq, . . .} .
That is, D would refer to fluctuations in: (1) the inflaton field δφinf whose
equation of state can give the negative pressure needed to drive the acceler-
ation; (2) other scalar field degrees of freedom δφis which can, for example,
induce scalar isocurvature perturbations. (If axions are the dark matter, φis
would be the axion field.) The isocurvature baryon mode would need to have a
φis (“isocons”) coupled some way to the baryon number, e.g., [246]; (3) grav-
itational wave modes h+ , h× ; (4) the inhomogeneous scale factor a(x, t), the
Hubble parameter H(x, t) and the deceleration parameter:
q(x, t) ≡ −d ln Ha/d ln a , (150)
or other geometrical variables encoding scalar metric perturbations and their
variations.¹ Inflation ends when q passes from negative to positive. Provided
the fluctuations over the observable k-range remain Gaussian, the outcome
of inflation is therefore a set of amplitudes for scalar metric (adiabatic) per-
turbations, gravity wave modes and various possible isocurvature modes, and
primordial spectral tilts for each, in particular:
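$$
\begin{aligned}
{\rm scalar:}\quad & \nu_s(k) \equiv n_s(k) - 1 \equiv \frac{d\ln P_{\ln a|_{H*}}(k)}{d\ln k}
 \qquad \left(a|_{H*} \equiv a(x, t(x, H^{-1}))\right), \\
{\rm tensor:}\quad & \nu_t(k) \equiv n_t(k) + 3 \equiv \frac{d\ln P_{GW}(k)}{d\ln k}\,. \qquad (151)
\end{aligned}
$$

¹ To be more precise, in terms of the variables of eq. (24), in the longitudinal gauge with
$\Psi_\sigma = 0$, we have $\delta\ln a \equiv \varphi_L$, $(\delta\ln H) = (\bar H\bar a)^{-1}\dot\varphi_L - \nu_L$, $\delta q = (\bar H\bar a)^{-1}\dot\nu_L + (1+\bar q)\,\delta\ln H$,
and the fluctuation used to characterize post-inflation amplitudes is $\delta\ln a|_{H*} = \delta\ln a -
(d\ln\bar a/d\ln\bar H)\,\delta\ln H = \varphi_{\rm com}$.
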
Measuring the power in scalar metric fluctuations on the time surfaces upon
which the inhomogeneous Hubble parameter H(x, t) – the proper time deriva-
tive of ln a(x, t) – is constant is useful [175, 192, 176, 181]: Once Ha exceeds
k for a mode with wavenumber k, (δ ln a|H∗ ) becomes time-independent dur-
ing an inflation epoch with a single dynamically-important scalar field, and it
remains so through reheating and the passage from radiation into matter dom-
inance until Ha falls below k (the wave “re-enters” the horizon). Although
transforming calculations to a uniform Hubble hypersurface is instructive, it
does not mean that solving the equations for fluctuations defined on that hy-
persurface is best. The perturbation quantities used in practice depend upon
the gauge and choice of time surfaces, and are described in the next section.
In the post-inflation period,
D ∈ {δρcdm , δvcdm , δρB , δvB , δfγ , δferν , δfmν , h+ , h× , ν, ϕ, Ψσ , . . .} .
That is, D would refer to fluctuations in the density and velocity of dark
matter and baryons (δρcdm , δvcdm , δρB , δvB ), in the distribution functions for
photons (δfγ ) and relativistic or semi-relativistic neutrinos (δferν , δfmν ), and
in the metric (dispersing gravitational wave modes h+,× and the scalar variables
such as the “gravitational potential”, ΦN = νL ). The Gaussian nature of the
statistics is not modified until mode–mode coupling occurs in the nonlinear
regime.
6.2 Perturbed Einstein equations

6.2.1 Time-hypersurface and gauge freedom
In two relatively technical appendices, A and B, the Einstein–Boltzmann equa-
tions are viewed as defining a Cauchy problem: the spacetime metric plus
matter variables step forward from a set of initial conditions through a se-
quence of spatial hypersurfaces, each labelled by a time coordinate. This “fo-
liation” of spacetime into a 3 + 1 split is described by the ADM formalism
[167, 168, 169, 171, 178, 196]. Appendices A and B give the full nonlinear
equations for transport and metric evolution, and only then are reduced to lin-
ear perturbation theory, because the nonlinear version illuminates the physical
meaning of the perturbation terms. Because the ADM formalism restricts at-
tention to foliations which are covered by a single time parameter, a change of
foliation (spacelike hypersurfaces) is conceptually intermingled with a change of
coordinate system (gauge transformations). The gauge invariance aspect of this
which looms so large in much of the cosmological literature is not as important
as the choice of time surfaces upon which the perturbations are instantaneously
measured. The time surfaces have a spatial 3-geometry, defined by a metric
$^{(3)}g_{ij}$, which are the geometrodynamical variables encoding the dynamics of
the gravitational field. The theorist can decide how to push/pull his/her spatial
hypersurfaces forward. This is encoded in the 4 remaining components of
the spacetime metric, parameterized in terms of a lapse function N and a shift
three-vector $N^i$, i = 1, 2, 3 ($g_{00} = -(N^2 - N_k N^k)$, $g_{0i} = N_i$). The contravariant
4-vector, $e_n$, with components $(e_n^\alpha) = N^{-1}(1, -N^1, -N^2, -N^3)$, is timelike
and unity-normalized, $e_n^\alpha\,g_{\alpha\beta}\,e_n^\beta = -1$: it is the 4-velocity of observers who each
have fixed positions on the spatial hypersurfaces (fiducial observers).
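The normalization and orthogonality of $e_n$ can be checked numerically for an arbitrary lapse, shift and 3-metric. The sketch below is illustrative only: the helper names and the numerical values of the lapse, shift and 3-metric are arbitrary assumptions, and the spatial block of the 4-metric is taken to be the 3-metric itself.

```python
def lower_shift(gamma, shift_up):
    """Lower the shift index with the 3-metric: N_i = g_ij N^j."""
    return [sum(gamma[i][j] * shift_up[j] for j in range(3)) for i in range(3)]


def adm_metric(lapse, shift_up, gamma):
    """Assemble g_ab from lapse N, shift N^i and 3-metric g_ij, using
    g_00 = -(N^2 - N_k N^k), g_0i = N_i, and spatial block g_ij."""
    shift_dn = lower_shift(gamma, shift_up)
    shift_sq = sum(shift_dn[i] * shift_up[i] for i in range(3))
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -(lapse ** 2 - shift_sq)
    for i in range(3):
        g[0][i + 1] = shift_dn[i]
        g[i + 1][0] = shift_dn[i]
        for j in range(3):
            g[i + 1][j + 1] = gamma[i][j]
    return g


def fiducial_velocity(lapse, shift_up):
    """Components e_n^a = N^{-1} (1, -N^1, -N^2, -N^3)."""
    return [1.0 / lapse] + [-n_i / lapse for n_i in shift_up]


def inner(g, u, v):
    """Inner product g_ab u^a v^b."""
    return sum(g[a][b] * u[a] * v[b] for a in range(4) for b in range(4))


# Arbitrary illustrative values (assumptions, not taken from the text).
LAPSE = 1.3
SHIFT_UP = [0.2, -0.1, 0.4]
GAMMA = [[2.0, 0.3, 0.0], [0.3, 1.5, 0.1], [0.0, 0.1, 1.2]]

if __name__ == "__main__":
    g = adm_metric(LAPSE, SHIFT_UP, GAMMA)
    e_n = fiducial_velocity(LAPSE, SHIFT_UP)
    print("e_n . e_n =", inner(g, e_n, e_n))
```

The same `inner` helper applied to $e_n$ and any purely spatial coordinate vector returns zero, which is the statement that the fiducial observers move orthogonally to the time surfaces.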
The covariant derivative of any 4-velocity U (of which $e_n$ is a special case)
can be decomposed into an acceleration 4-vector $A \equiv \nabla_U U$ ($A^a \equiv U^b\,U^a{}_{;b}$,
where ‘;’ denotes covariant derivative), expansion rate θ, a vorticity $\omega_{ab}$ and an
anisotropic shear $\sigma_{ab}$ (where a, b ∈ {0, 1, 2, 3}):
$$ U_{b;a} = -U_a A_b + \tfrac{1}{3}\theta\,\perp_{ab} + \sigma_{ab} + \omega_{ab}\,, \qquad \perp_{ab} \equiv g_{ab} + U_a U_b\,. \qquad (152) $$
The tensor $\perp_{ab}$ satisfies $\perp(U) = 0$, $\perp^2 = \perp$ (i.e., $\perp_{ac}U^c = 0$ and $\perp_a{}^c\perp_{cb} = \perp_{ab}$),
hence is a projection operator onto the three-dimensional subspace orthogonal
to the flow $U^a$. The vorticity tensor is the antisymmetric (and trace-free) part
of $\perp^c{}_a\,U_{b;c}$, θ is its trace, and $\sigma_{ab}$ is the remaining symmetric trace-free part.
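The projection properties of $\perp_{ab}$ can be verified in the simplest setting. The sketch below assumes flat (Minkowski) spacetime with signature (−,+,+,+) and a uniformly boosted observer; the helper names and the chosen 3-velocity are illustrative assumptions only.

```python
import math

# Minkowski metric with signature (-,+,+,+); it is its own inverse.
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def four_velocity(v):
    """U^a = gamma (1, v) for a 3-velocity v with |v| < 1 (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return [gamma] + [gamma * c for c in v]


def lower(u_up):
    """U_a = eta_ab U^b."""
    return [sum(ETA[a][b] * u_up[b] for b in range(4)) for a in range(4)]


def projector(u_up):
    """perp_ab = eta_ab + U_a U_b."""
    u_dn = lower(u_up)
    return [[ETA[a][b] + u_dn[a] * u_dn[b] for b in range(4)] for a in range(4)]


def apply_to_velocity(perp, u_up):
    """perp_ac U^c, which should vanish."""
    return [sum(perp[a][c] * u_up[c] for c in range(4)) for a in range(4)]


def squared(perp):
    """perp_a^c perp_cb, with the index raised by the inverse metric."""
    return [
        [
            sum(perp[a][d] * ETA[d][c] * perp[c][b]
                for c in range(4) for d in range(4))
            for b in range(4)
        ]
        for a in range(4)
    ]


def trace(perp):
    """perp^a_a, the dimension of the subspace projected onto."""
    return sum(ETA[a][b] * perp[a][b] for a in range(4) for b in range(4))


if __name__ == "__main__":
    u = four_velocity([0.3, -0.2, 0.5])
    perp = projector(u)
    print("perp(U) =", apply_to_velocity(perp, u))
    print("trace   =", trace(perp))
```

The trace evaluates to 3, consistent with a projection onto the three-dimensional subspace orthogonal to the flow.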
By construction, spacelike hypersurfaces exist which are orthogonal to the
$U^\alpha = e_n^\alpha$ flow. This implies vanishing vorticity for the flow of time, $\omega_{ab} = 0$.
(In general spacetimes, a global time parameter may not in fact exist.) The
remaining spacelike part is the total shear, and its negative, $K_{ab}$, is called the
extrinsic curvature:
$$ {\rm for}\ U = e_n\,, \qquad K_{ab} \equiv -\left(\tfrac{1}{3}\theta\,\perp_{ab} + \sigma_{ab}\right). \qquad (153) $$
It measures the relative deviation of the fiducial flow lines and defines how the
spatial 3-geometry changes in time.
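For a given flow $U$, in particular for the time flow $e_n^\alpha$, the stress
energy tensor $T^{ab}_{\rm type}$ for matter of type “type” can be decomposed into an
energy density, $\rho_{\rm type}$, a momentum current $J^a_{(e),{\rm type}}$, an
isotropic pressure, $p_{\rm type}$, and an anisotropic pressure tensor
$\Pi^{ab}_{\rm type}$:
$$T^{ab}_{\rm type} = \rho_{\rm type} U^a U^b + J^a_{(e),{\rm type}} U^b + U^a J^b_{(e),{\rm type}} + p_{\rm type} \perp^{ab} + \Pi^{ab}_{\rm type},$$
$$\perp^{ab} \equiv g^{ab} + U^a U^b, \qquad \rho_{\rm type} \equiv U_a T^{ab}_{\rm type} U_b, \qquad J^a_{(e),{\rm type}} \equiv \perp^a{}_c T^{cb}_{\rm type} U_b,$$
$$p_{\rm type} \perp^{ab} + \Pi^{ab}_{\rm type} \equiv \perp^a{}_c T^{cd}_{\rm type} \perp^b{}_d, \qquad (\Pi_{\rm type})^a{}_a = 0 . \qquad (154)$$
The total values $\rho_{\rm tot}$, $J^a_{(e),{\rm tot}}$, $p_{\rm tot}$, $\Pi^{ab}_{\rm tot}$
are just the sums over types.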
In perturbation theory, we expand the spatial three metric, the lapse and the
shift in terms of normal modes for the Einstein equations, expressed in terms
of scalar metric variables $\varphi, \psi, \nu, \Psi_n$, the vector contributions,
$h^{(V)}_{ij}$, $N^{(V)}_i$, and the (transverse traceless) tensor contributions,
$h^{(TT)}_{ij}$:
$${}^{(3)}g_{ij} = {}^{(3)}\bar g_{ij} + {}^{(3)}\bar g_{ij}\, 2\varphi - \bar a^2\, {}^{(3)}\nabla_i {}^{(3)}\nabla_j\, 2\psi + \bar a^2 h^{(V)}_{ij} + \bar a^2 h^{(TT)}_{ij},$$
$$N = \bar N (1 + \nu), \qquad N_i = \bar N\, {}^{(3)}\nabla_i \Psi_n + N^{(V)}_i, \qquad \Psi_\sigma \equiv \frac{\bar a^2}{\bar N}\dot\psi + \Psi_n . \qquad (155)$$
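The parameters of the background geometry are the average lapse $\bar N$, which
is taken to be $\bar a$ if conformal time $\tau$ is chosen, and an unperturbed FRW
background 3-metric ($k = 0, \pm 1$ gives the 3 FRW curvature possibilities and
isotropic coordinates are used here):
$${}^{(3)}\bar g_{ij} = \bar a^2 f^2(r)\delta_{ij}, \qquad f^{-1} = 1 + \frac{k r^2}{4 d_{curv}^2}, \qquad \chi = \tau_0 - \tau,$$
$$k = -1: \quad \frac{r}{2 d_{curv}} = \tanh\frac{\chi}{2 d_{curv}}, \qquad \Sigma = f r = d_{curv} \sinh\frac{\chi}{d_{curv}},$$
$$k = 1: \quad \frac{r}{2 d_{curv}} = \tan\frac{\chi}{2 d_{curv}}, \qquad \Sigma = f r = d_{curv} \sin\frac{\chi}{d_{curv}},$$
$${}^{(3)}\bar R = \frac{6k}{(d_{curv}\bar a(t))^2}, \qquad \bar K_{ij} = -\frac{\dot{\bar a}}{\bar N \bar a} f^2 \bar a^2 \delta_{ij} . \qquad (156)$$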
Here $\tau_0$ is the current conformal time, and $\chi$ is the comoving distance
back to redshift $z$ (and $\chi = c(\tau_0 - \tau)$ is the solution for radial photon
geodesics). The covariant derivative with respect to the background 3-metric in
the direction $i$ is ${}^{(3)}\nabla_i$, ${}^{(3)}\nabla^i \equiv {}^{(3)}g^{ij}\, {}^{(3)}\nabla_j$,
and the Laplacian is ${}^{(3)}\nabla^2 = {}^{(3)}\nabla^i\, {}^{(3)}\nabla_i$. For flat
($k = 0$) models, ${}^{(3)}\nabla_i = \partial_i$, ${}^{(3)}\nabla^i \equiv \delta^{ij}\bar a^{-2}\partial_j$,
and the Laplacian is ${}^{(3)}\nabla^2 = \bar a^{-2}\sum_j \partial_j \partial_j$; recall
these $\bar a^{-1}$ factors in the following.
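The isotropic-coordinate relations in eq. (156) can be verified directly. The sketch below, using an arbitrary illustrative value of $d_{curv}$ and a few sample distances $\chi$ (both are assumptions made only for the check), confirms that $f r$ reproduces $d_{curv}\sinh(\chi/d_{curv})$ for $k = -1$ and $d_{curv}\sin(\chi/d_{curv})$ for $k = 1$, with $\Sigma = \chi$ in the flat case.

```python
import math

def f_conformal(r, k, d_curv):
    """Conformal factor f(r), defined by 1/f = 1 + k r^2 / (4 d_curv^2)."""
    return 1.0 / (1.0 + k * r * r / (4.0 * d_curv ** 2))

def r_of_chi(chi, k, d_curv):
    """Isotropic radial coordinate r at comoving distance chi."""
    x = chi / (2.0 * d_curv)
    if k == -1:
        return 2.0 * d_curv * math.tanh(x)
    if k == 1:
        return 2.0 * d_curv * math.tan(x)
    return chi  # flat case: f = 1 and r = chi

def sigma_closed_form(chi, k, d_curv):
    """Closed-form Sigma(chi) quoted in eq. (156)."""
    if k == -1:
        return d_curv * math.sinh(chi / d_curv)
    if k == 1:
        return d_curv * math.sin(chi / d_curv)
    return chi

d_curv = 2.0  # illustrative curvature scale, not a value from the text
results = []
for k in (-1, 0, 1):
    for chi in (0.1, 0.7, 1.5):
        r = r_of_chi(chi, k, d_curv)
        results.append((k, chi, f_conformal(r, k, d_curv) * r,
                        sigma_closed_form(chi, k, d_curv)))

for k, chi, numeric, closed in results:
    print(f"k={k:+d} chi={chi:.1f}  f*r={numeric:.12f}  closed form={closed:.12f}")
```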
6.2.2 Scalar mode Einstein equations

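The physical meaning of the scalar metric variables is determined by their
relation to such physical quantities as the fluctuations in the 3-curvature
scalar, $(\delta\,{}^{(3)}R)$, in the anisotropic 3-curvature tensor,
$(\delta\,{}^{(3)}R')^i{}_j$, in the expansion rate, $(\delta H)$, and in the
anisotropic shear $\sigma^i{}_j$ of $e_n$. For a conformal time choice, with
$\bar N = \bar a$, from eqs. (248), (247), (244), (245), (248) we have:
$$(\delta\,{}^{(3)}R) = -4\,{}^{(3)}\nabla^2\varphi - {}^{(3)}\bar R\, 2\varphi, \qquad (157)$$
$$(\delta\,{}^{(3)}R')^i{}_j = -[{}^{(3)}\nabla^i\, {}^{(3)}\nabla_j - \tfrac{1}{3}\delta^i{}_j\, {}^{(3)}\nabla^2]\varphi, \qquad (158)$$
$$\bar a(\delta H) = \dot\varphi - \bar N \bar H \nu - \tfrac{1}{3}\, {}^{(3)}\nabla^2 \bar a \Psi_\sigma, \qquad (159)$$
$$\sigma^i{}_j = -({}^{(3)}\nabla^i\, {}^{(3)}\nabla_j - \tfrac{1}{3}\delta^i{}_j\, {}^{(3)}\nabla^2)\Psi_\sigma . \qquad (160)$$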
In addition to the three metric scalars, for each type of matter present
there will be a relative density perturbation, $\delta_{\rm type}$, and a velocity
perturbation which, because the flow is irrotational for scalar perturbations,
can be written in terms of a velocity potential, $\Psi_{v,{\rm type}}$:
$v_{\rm type} = -\bar a^{-1}\nabla\Psi_{v,{\rm type}}$ (footnote 2). Here type
runs over inflaton and isocon fields, massless and massive neutrinos, photons,
baryons, CDM, etc. For some types of matter there may be an anisotropic
pressure perturbation, $\pi_{t,{\rm type}}$, and for photons and neutrinos, there
will be higher moments, expressing all of the degrees of freedom in the
perturbed distribution functions.

Footnote 2: This is the usual definition for nonrelativistic matter, but is
better defined in terms of the momentum current,
$(\rho_{\rm type} + p_{\rm type})v_{\rm type}$, especially for scalar fields $\phi$ for
which $\Psi_{v,\phi} = \bar a\,\dot{\bar\phi}^{-1}\delta\phi$ is ill-defined since
$\dot{\bar\phi}$ can vanish [178, 192].
There are generally 10 Einstein equations, $G^a{}_b = 8\pi G_N T^a{}_b$. These split
into 4 constraint equations, $G^n{}_n = 8\pi G_N T^n{}_n$ (where
$G^n{}_n \equiv e^n{}_\alpha G^\alpha{}_\beta e_n{}^\beta$) and
$G^n{}_I = 8\pi G_N T^n{}_I$, and 6 dynamical equations,
$G^I{}_J = 8\pi G_N T^I{}_J$. (Here $I, J$ refer to spatial components taken with
respect to a triad $e_I{}^\alpha$ of spacelike 4-vectors perpendicular to
$e_n{}^\alpha$ described in more detail in Appendix A.1.) Because
spatial components of scalar variables can be expressed in terms of gradients,
first integrals of the gradient equations can be done, reducing the total to 2
constraint and 2 dynamical equations.
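The perturbed energy constraint $(\delta G^n{}_n)$ and momentum constraint
$(\delta G^n{}_I)$ equations are
$$2\bar H(\delta H) - \frac{2}{3}\,{}^{(3)}\nabla^2\varphi - \frac{1}{6}\,{}^{(3)}\bar R\, 2\varphi = \frac{8\pi G_N}{3}(\delta\rho)_{\rm tot}, \qquad (161)$$
$$\bar a(\delta H) + \frac{1}{3}\,{}^{(3)}\nabla^2 \bar a\Psi_\sigma = \dot\varphi - \bar H\bar a\nu = -4\pi G_N \sum_{\rm type}(\bar\rho + \bar p)_{\rm type}\,\bar a\Psi_{v,{\rm type}} - \frac{1}{6}\,{}^{(3)}\bar R\,\bar a\Psi_\sigma . \qquad (162)$$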
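A combination of the momentum and energy constraints gives a “Poisson–Newton”
constraint equation:
$$-{}^{(3)}\nabla^2(\varphi + \bar H\Psi_\sigma) - \tfrac{1}{4}\,{}^{(3)}\bar R\, 2(\varphi + \bar H\Psi_\sigma) = 4\pi G_N\left((\delta\rho)_{\rm tot} + 3\bar H(\bar\rho + \bar p)_{\rm tot}\Psi_{v,{\rm tot}}\right) . \qquad (163)$$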
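The combination leading to eq. (163) can be written out explicitly. The following sketch assumes that the total velocity potential is defined through $(\bar\rho + \bar p)_{\rm tot}\Psi_{v,{\rm tot}} \equiv \sum_{\rm type}(\bar\rho + \bar p)_{\rm type}\Psi_{v,{\rm type}}$ (a definition not stated in this passage) and uses the fact that $\bar a$ and $\bar H$ are spatially constant, so they commute with ${}^{(3)}\nabla^2$:

```latex
\begin{aligned}
\tfrac{3}{2}\times(161):\quad
 & -{}^{(3)}\nabla^2\varphi - \tfrac{1}{2}\,{}^{(3)}\bar R\,\varphi
   = 4\pi G_N(\delta\rho)_{\rm tot} - 3\bar H(\delta H), \\
-3\bar H\bar a^{-1}\times(162):\quad
 & -3\bar H(\delta H)
   = \bar H\,{}^{(3)}\nabla^2\Psi_\sigma
   + 12\pi G_N\bar H(\bar\rho+\bar p)_{\rm tot}\Psi_{v,{\rm tot}}
   + \tfrac{1}{2}\,{}^{(3)}\bar R\,\bar H\Psi_\sigma, \\
\Rightarrow\quad
 & -{}^{(3)}\nabla^2(\varphi+\bar H\Psi_\sigma)
   - \tfrac{1}{2}\,{}^{(3)}\bar R\,(\varphi+\bar H\Psi_\sigma)
   = 4\pi G_N\left((\delta\rho)_{\rm tot}
   + 3\bar H(\bar\rho+\bar p)_{\rm tot}\Psi_{v,{\rm tot}}\right).
\end{aligned}
```

The last line is eq. (163), since $\tfrac{1}{4}\,{}^{(3)}\bar R\,2 = \tfrac{1}{2}\,{}^{(3)}\bar R$.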
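The dynamical Einstein equations are those for the isotropic pressure
$(\delta G^I{}_I)$ and the anisotropic stress
$(\delta G^I{}_J - \tfrac{1}{3}\delta^I{}_J\,\delta G^K{}_K)$; instead of
$(\delta G^I{}_I)$, it is more useful and usual to use the perturbed Raychaudhuri
equation, $\delta R^n{}_n = -(\delta G^n{}_n + \delta G^I{}_I)/2$:
$$\frac{\partial(\bar a\delta H)}{\partial\tau} + \frac{\dot{\bar a}}{\bar a}(\bar a\delta H) - (1 + \bar q)(\bar H\bar a)^2\nu - \frac{1}{3}\bar a^2\,{}^{(3)}\nabla^2\nu = -\frac{4\pi G_N}{3}\bar a^2\left((\delta\rho) + 3(\delta p)\right)_{\rm tot}, \qquad (164)$$
$$\bar a^{-1}\dot\Psi_\sigma + \bar H\Psi_\sigma + (\varphi + \nu) = -8\pi G_N\sum_{\rm type}\bar p_{\rm type}\pi_{t,{\rm type}}, \qquad (165)$$
$$\Pi^{(S)}_{{\rm type}\,ij} = \left({}^{(3)}\nabla_i\,{}^{(3)}\nabla_j - \tfrac{1}{3}\,{}^{(3)}g_{ij}\,{}^{(3)}\nabla^2\right)\bar p_{\rm type}\pi_{t,{\rm type}} .$$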
Note that only the combination Ψσ of the two scalars ψ, Ψn enters the linearized
equations. Terms contributing to the total pressure are photons, ρ̄γ δγ /3, mass-
less neutrinos, ρ̄erν δerν /3, hot and warm dark matter, and scalar field pressure
if appropriate. The pressure of the baryons can be neglected – except on very
small scales.
6.2.3 Useful gauge invariant combinations for scalar modes
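Under change of the hypersurface from $\tau$ to $\tilde\tau = \tau + T(x, \tau)$, the
scalar metric variables change according to
$\tilde\varphi = \varphi - \bar H\bar a T$, $\tilde\Psi_\sigma = \Psi_\sigma + \bar a T$,
$\tilde\nu = \nu - \bar a^{-1}\partial_\tau(\bar a T)$ (see eq. (260) for the behavior of
other variables). Three simple combinations can be made that are $T$-independent:
$$\nu_L = \nu + \bar a^{-1}\dot\Psi_\sigma \equiv \Phi_A, \qquad \varphi_L = \varphi + \bar H\Psi_\sigma \equiv \Phi_H,$$
$$\varphi + \frac{(\bar H\bar a)^{-1}\dot\varphi - \nu}{1 + \bar q} = \varphi_{\rm com} - \bar H^{-2}\,{}^{(3)}\bar R\,(\Psi_{v,{\rm tot}} + \Psi_\sigma) \equiv (\delta\ln a|_{H_*}),$$
$$\text{where } \varphi_{\rm com} \equiv \varphi - \bar H\Psi_{v,{\rm tot}} . \qquad (166)$$
Here $\bar q$ is the mean deceleration parameter, expressible in terms of the mean
densities and pressures of the matter present.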
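The $T$-independence of the first two combinations follows in one line each from the transformation rules quoted above. As a short worked check (using only those rules, with $\bar H$ and $\bar a$ functions of $\tau$ alone):

```latex
\begin{aligned}
\tilde\Phi_H &= \tilde\varphi + \bar H\tilde\Psi_\sigma
  = (\varphi - \bar H\bar a T) + \bar H(\Psi_\sigma + \bar a T)
  = \varphi + \bar H\Psi_\sigma = \Phi_H, \\
\tilde\Phi_A &= \tilde\nu + \bar a^{-1}\partial_\tau\tilde\Psi_\sigma
  = \nu - \bar a^{-1}\partial_\tau(\bar a T)
    + \bar a^{-1}\dot\Psi_\sigma + \bar a^{-1}\partial_\tau(\bar a T)
  = \nu + \bar a^{-1}\dot\Psi_\sigma = \Phi_A .
\end{aligned}
```

The third combination additionally requires the transformation of $\Psi_{v,{\rm tot}}$, which is given in eq. (260) and is not reproduced here.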
In the absence of mean curvature the metric combination (δ ln a|H∗ ) reduces
to the scalar curvature potential on the hypersurfaces in which the net momen-
tum current vanishes, ϕcom . This quantity deserves some comment. In early
universe calculations and to characterize the initial conditions for the photon
transport through decoupling, the power in adiabatic scalar fluctuations on
scales beyond the Hubble radius is best characterized in terms of quantities
such as ϕcom which become time-independent; ϕcom has been used to simplify
calculations of linear fluctuations generated by quantum noise since the early
eighties by Mukhanov and others [176, 171, 185].
The variable ln a|H∗ is the inhomogeneous scale factor as measured on time
surfaces upon which the space creation rate, H∗ ≡ ∂ ln a/(N ∂τ ), is uniform.
It gives a nice characterization of even nonlinear fluctuations that can arise in
stochastic inflation [181]. However, H∗ here is not exactly H ≡ −K/3, the
usual Hubble parameter. For scalar perturbations, given a foliation, we can
change spatial coordinates on the time surface to get ψ = 0, with all of Ψσ
moved to the shift potential Ψn . In these coordinates,
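$$H_* = H + \frac{1}{3\bar N}\,{}^{(3)}\nabla_j N^j \approx \bar H + \frac{1}{\bar N}\dot\varphi - \bar H\nu,$$
$$(\delta H_*) = (\delta H) + \frac{1}{3}\,{}^{(3)}\nabla^2\Psi_\sigma, \qquad \ln a = \ln\bar a + \varphi . \qquad (167)$$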
Under the purely spatial transformation, ϕ remains unchanged. Both (δH∗ )
and ϕ are modified under time surface changes, but in such a way that the
combination (δ ln a|H∗ ) is invariant.
In complex numerical or analytic inflation calculations, it is impractical to
work on a uniform Hubble foliation. One determines $(\delta\ln a|_{H_*})$
by hypersurface shifting after the computations are done. In linear
perturbation theory this is particularly simple. For example, suppose that the
calculation has been performed in the longitudinal gauge, for which the
variables of relevance are $\ln N$, $\ln a$ and $\ln H$. Keeping the same spatial
coordinates, the time we transform to defines a scalar function $T(x, \tau)$, and
the inhomogeneous scale factor and Hubble parameter become
$a(x, \tau + T(x, \tau))$, $H_*(x, \tau + T(x, \tau))$.
Choosing $T$ to make $H_*$ constant gives a nonlinear equation for $a|_{H_*}$. With
linearization, $(\delta\ln a|_{H_*}) = \delta\ln a - \frac{d\ln a}{d\ln H}(\delta\ln H_*)$,
i.e., eq. (166).
Bardeen [171] emphasized the virtues of $\varphi_h$, the value of $\varphi$ on surfaces
upon which $H \equiv -K/3$ is constant. The difference between $(\delta H)$ and
$(\delta H_*)$ is a term of order $(k/(\bar H\bar a))^2$, hence $\varphi_h$ differs from
$(\delta\ln a|_{H_*})$ by terms of order $(k/(\bar H\bar a))^2$ as well; i.e., they are
the same well outside the horizon. Another quantity which I have used
extensively is Bardeen, Steinhardt and Turner’s $\zeta$ [175, 232, 192, 178]:
$$\zeta = \varphi + \frac{(\delta\rho)_{\rm tot}}{3(\bar\rho + \bar p)_{\rm tot}} = \varphi_{\rm com} + \frac{{}^{(3)}\nabla^2\varphi_L}{3(\bar H)^2(1 + \bar q)} . \qquad (168)$$
Again, far outside the “horizon”, $k/(\bar H\bar a) \ll 1$,
$\zeta \approx \varphi_{\rm com} \approx \delta\ln a|_{H_*}$.
The most commonly used gauge choice in nonlinear numerical relativity
computations of black hole formation has been one on which K is not just uni-
form but is zero, because it turns out to be the time slicing which maximizes
the 3-volume. This has the virtue of avoiding singularities, but in cosmology
we usually care about following collapsing objects. The cosmological general-
ization of maximal slicing is one on which the Hubble parameter is uniform,
i.e., basically the hypersurfaces whose scalar curvature parameter ϕ is used to
characterize the initial conditions for adiabatic fluctuations.
6.2.4 Longitudinal and synchronous gauges

For scalar perturbations, there are no intrinsic dynamics of the gravitational
field. This is similar to Newtonian theory in which, given the density, the
Newtonian potential is found by solving the Poisson equation, but no ODEs in
time. To specify the gauge, a single combination of the scalar degrees of freedom
can be fixed. Two have been most widely used in cosmological calculations of
radiative transport. In the longitudinal gauge (L), ψ and Ψn are both set to
zero. Thus Ψσ is also set to zero, so the hypersurfaces have zero shear. The
remaining two metric variables in Bardeen’s notation [171] are ΦA = νL , ΦH =
$\varphi_L$ ($= \delta\ln a$). In the longitudinal gauge, one can use the anisotropic
dynamical Einstein equation (165) to algebraically relate $(\nu_L + \varphi_L)$ to the
anisotropic stress, and the Poisson–Newton equation to get $\nu_L$ in terms of the
total comoving energy density. All dynamical information is then carried by the
matter present. Refs. [173, 138, 251] adopt this approach, but instead of solving
for the longitudinal gauge $\delta_{{\rm type}\,L}$, they solve for the comoving
densities $\delta_{{\rm type}\,{\rm com}} = \delta_{\rm type} + 3(1 + \bar p/\bar\rho)_{\rm type}\bar H\Psi_{v,{\rm type}}$.
Ref. [260] solves for $\delta_{{\rm type}\,L}$, but uses the momentum constraint
equation in place of the anisotropic shear equation. The longitudinal gauge is
considered to be the one closest to Newtonian in the nonrelativistic regime,
for both metric variables are given by the perturbed Newtonian potential ΦN :
ΦA → ΦN , ΦH → −ΦN .
In the synchronous gauge (S), the lapse perturbation $\nu$ and $\Psi_n$ are set
to zero. One could solve the momentum constraint equation as a first order
ODE for $\varphi_S$ in terms of the various velocity potentials, and then determine
$\bar a(\delta H)$, $\propto$ the perturbation to the trace of the extrinsic
curvature, through the energy constraint. We recommended this approach for
inflation in [192]. The perturbed equations in the synchronous gauge for
radiation and matter involve only $\bar a(\delta H)$ and $\dot\varphi_S$, so actually
solving for $\varphi_S$ is not really necessary, except to get $\bar a(\delta H)$.
However, $\bar a(\delta H)$ can be directly determined from the Raychaudhuri
equation, which is just an ODE at each point, and does not depend upon
$\varphi_S$ or even $\dot\varphi_S$. If one uses this equation to evolve
$\bar a(\delta H)$, $\dot\varphi_S$ is set algebraically by the momentum constraint,
and $\varphi_S$ is not solved for. The transport equation for CDM in the
synchronous gauge is simply $\dot\delta_{cdm} = -\bar a(\delta H)$. Transforming
to normal $\bar N = 1$ time gives the usual density perturbation growth equation,
$\ddot\delta_{cdm} + 2\bar H\dot\delta_{cdm} = 4\pi G_N\bar\rho_{cdm}\delta_{cdm} + 4\pi G_N\sum_{{\rm type}\neq cdm}(\delta\rho_{\rm type} + 3\delta p_{\rm type})$.
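As an illustration of the growth equation just quoted, the sketch below integrates it in the simplest setting: an Einstein-de Sitter background containing only CDM, so the source from other species is dropped and $\bar H = 2/(3t)$, $4\pi G_N\bar\rho_{cdm} = 3\bar H^2/2$. These background choices, the fixed-step RK4 integrator and the time units are all assumptions made for the example. Starting on the growing mode, the density contrast should grow as $t^{2/3}$.

```python
# Sketch: integrate  delta'' + 2 H delta' = 4 pi G rho delta  in an
# Einstein-de Sitter (CDM-only) background, where H = 2/(3t) and
# 4 pi G rho = (3/2) H^2.  Sources from other species are set to zero.

def rhs(t, y):
    delta, ddelta = y
    hubble = 2.0 / (3.0 * t)
    four_pi_g_rho = 1.5 * hubble ** 2
    return (ddelta, -2.0 * hubble * ddelta + four_pi_g_rho * delta)

def rk4(t0, y0, t1, steps):
    """Classical fixed-step fourth-order Runge-Kutta integration."""
    h = (t1 - t0) / steps
    t, y = t0, tuple(y0)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, tuple(y[i] + h / 2 * k1[i] for i in range(2)))
        k3 = rhs(t + h / 2, tuple(y[i] + h / 2 * k2[i] for i in range(2)))
        k4 = rhs(t + h, tuple(y[i] + h * k3[i] for i in range(2)))
        y = tuple(y[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                  for i in range(2))
        t += h
    return y

t0, t1 = 1.0, 8.0               # arbitrary time units
delta0 = 1.0e-3                 # illustrative initial amplitude
# Growing-mode initial condition: delta ~ t^(2/3)  =>  delta' = 2 delta / (3 t)
y_end = rk4(t0, (delta0, 2.0 * delta0 / (3.0 * t0)), t1, 4000)

growth = y_end[0] / delta0
expected = (t1 / t0) ** (2.0 / 3.0)
print("numerical growth factor:", growth)
print("t^(2/3) prediction:     ", expected)
```

The second solution of the same equation decays as $t^{-1}$, so small errors in the initial condition do not contaminate the late-time growth.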
The spatial hypersurfaces are those on which cold dark matter is comoving
(the velocity potential for cdm particles in the longitudinal gauge is
$\Psi_{v,cdm,L} = \Psi_{\sigma,S}$). In the nr limit,
$\psi_S \to \Phi_N/(4\pi G_N\bar\rho_{nr}\bar a^2)$ becomes the potential for the
displacement field that appears in the mapping from Lagrangian space to Eulerian
space, $x(r, \tau) = r - s(r, \tau)$, $s(r, \tau) = \nabla\psi_S$. The coordinates
$x^I(r, \tau)$ are Eulerian ones appropriate to the longitudinal gauge and the
$r^i$ coordinates are Lagrangian ones labelling the cold matter particles. The
deformation tensor $e^J_{*\,i} = \partial x^J/\partial r^i$ defines a triad of
orthogonal vectors $e^J_*$ for the space in which the cold dark matter is at rest
(i.e., perpendicular to the flow $e_n^\alpha$ of the CDM). The way gravitational
collapse manifests itself is through the shrinking of the comoving lengths
$e^J_{*\,i}\,dr^i$, although $dr^i$ remains fixed: i.e., collapse is viewed as
a motionless distortion of the geometry. The synchronous gauge breaks down
once $e^J_{*\,i}$ becomes singular, i.e., once caustics form and shell crossing
occurs. (Once the universe has become dominated by nonrelativistic matter,
$\dot\varphi_S \to 0$, $\varphi_S \to -5\Phi_N/3$, $\bar a^{-1}\Psi_\sigma \to \tau\Phi_N/3$,
$\ddot\psi_S = \partial_\tau(\bar a^{-1}\Psi_\sigma) \to \Phi_N/3$.)
6.2.5 Tensor mode metric equations

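The tensor modes satisfy the anisotropic
$\delta G^J{}_I - \tfrac{1}{3}\delta^J{}_I\,\delta G^K{}_K$ Einstein equations.
Expressed in conformal time this is
$$\ddot h^{(TT)}{}_i{}^j + 2\bar H\bar a\,\dot h^{(TT)}{}_i{}^j - \bar a^2\,{}^{(3)}\nabla^2 h^{(TT)}{}_i{}^j + \tfrac{1}{3}\bar a^2\,{}^{(3)}\bar R\,h^{(TT)}{}_i{}^j = 16\pi G_N\bar a^2(\Pi^{(T)}_{\rm tot})^j{}_i, \qquad (169)$$
where $(\Pi^{(T)}_{\rm tot})^j{}_i$ is the tensor mode part of the anisotropic stress.
The mode expansion (147) for gravity waves (in a flat FRW background) is
$$h^{(TT)}_{ij}(x, \tau) = \sum_{\epsilon=+,\times}\sum_{kM}\left[h_{(T\epsilon)}E^{(T\epsilon)}_{ij}e^{ik\cdot x}a_{k(T\epsilon)} + h^*_{(T\epsilon)}E^{(T\epsilon)*}_{ij}e^{-ik\cdot x}a^\dagger_{k(T\epsilon)}\right],$$
$$E^{(T+)} = (e_1\otimes e_1 - e_2\otimes e_2), \qquad E^{(T\times)} = (e_1\otimes e_2 + e_2\otimes e_1),$$
$$k\cdot e_{\{1,2\}} = 0, \qquad e_1\cdot e_2 = 0 . \qquad (170)$$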
Here $\{e_1, e_2, \hat k\}$ form an orthonormal triad. With the wavevector $k$
oriented in the $z$-direction, $e_1$ in the $x$-direction and $e_2$ in the
$y$-direction, we have the usual
$$h_{(T+)}(k, \tau) = \frac{E^{(T+)\,ij}h_{ij}(k, \tau)}{E^{(T+)\,ij}E^{(T+)}_{ij}} = (h_{11} - h_{22})/2,$$
$$h_{(T\times)}(k) = \frac{E^{(T\times)\,ij}h_{ij}(k, \tau)}{E^{(T\times)\,ij}E^{(T\times)}_{ij}} = h_{12} .$$
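The projection onto the two polarization amplitudes is easy to check with explicit matrices. The sketch below, with made-up amplitudes chosen only for illustration, builds $E^{(T+)}$ and $E^{(T\times)}$ for $k$ along $z$, forms a transverse traceless $h_{ij}$, and recovers the amplitudes by the contractions written above.

```python
# Sketch: polarization tensors for k along z, e1 along x, e2 along y, and
# the projection of a transverse-traceless h_ij onto the + and x amplitudes.
# The input amplitudes are illustrative numbers, not values from the text.

def outer(u, v):
    return [[u[i] * v[j] for j in range(3)] for i in range(3)]

def combine(a, b, ca=1.0, cb=1.0):
    """Return ca*a + cb*b for two 3x3 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(3)] for i in range(3)]

def contract(a, b):
    """Full contraction a^{ij} b_{ij} (flat space, so index position is immaterial)."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 1.0, 0.0]

E_plus = combine(outer(e1, e1), outer(e2, e2), 1.0, -1.0)
E_cross = combine(outer(e1, e2), outer(e2, e1), 1.0, 1.0)

h_plus_in, h_cross_in = 0.3, -0.7
h = combine(E_plus, E_cross, h_plus_in, h_cross_in)

h_plus = contract(E_plus, h) / contract(E_plus, E_plus)
h_cross = contract(E_cross, h) / contract(E_cross, E_cross)

print("h_+ =", h_plus, " (h11 - h22)/2 =", (h[0][0] - h[1][1]) / 2.0)
print("h_x =", h_cross, " h12 =", h[0][1])
```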
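The only degree of freedom of the stress energy tensor which has a nonzero
amplitude in the tensor mode is the anisotropic stress, which has components
$$\bar p_{\rm tot}\,\pi^{(T\{+,\times\})}_{t,{\rm tot}} \equiv \frac{E^{(T\{+,\times\})\,ij}\,\Pi_{{\rm tot}\,ij}(k, \tau)}{E^{(T\{+,\times\})\,ij}E^{(T\{+,\times\})}_{ij}} .$$
Hence (for flat backgrounds) the Einstein equations reduce to
$$\ddot h_{(T\epsilon)} + 2\bar H\bar a\,\dot h_{(T\epsilon)} + k^2 h_{(T\epsilon)} = 16\pi G_N\bar a^2\,\bar p_{\rm tot}\,\pi^{(T\epsilon)}_{\rm tot}, \qquad \epsilon = +, \times . \qquad (171)$$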
For inflation-based models in which gravity waves are zero point fluctu-
ations, the anisotropic stress driver can be ignored, even during evolution
through the radiation-dominated epoch where the anisotropic stress may not be
irrelevant. This contrasts with the scalar perturbation case for which the Ein-
stein equations have source terms depending upon the density and the velocity
potential, and one cannot solve for the metric variables without simultaneously
solving for the radiation and matter. For tensor perturbations the predomi-
nant behavior is just free evolution from a given set of initial conditions for the
waves.
The solution for $k \ll Ha$ is $h_{(T\{+,\times\})}$ constant for all relevant
equations of state. Let us suppose that the gravitational waves are
characterized by a power spectrum $P_{h(T\{+,\times\})}(k, \tau_e)$ at some initial
time $\tau_e$ for which $k\tau_e \ll 1$. To see the character of the solutions for
$k > Ha$, let us consider the case where we have only relativistic particles with
density parameter now given by $\Omega_{er,0}$ (photons and massless or very light
neutrinos) and nonrelativistic particles (baryons and cold dark matter say) with
density parameter now $\Omega_{nr,0}$. When eq. (169) is expanded in $k$-modes,
$h_{(T\{+,\times\})}$ obeys
$$\frac{d^2 h_{(T\{+,\times\})}}{dx^2} + 2\,\frac{x/(2x_*) + a_{eq}^{1/2}}{x/(4x_*) + a_{eq}^{1/2}}\,\frac{1}{x}\frac{dh_{(T\{+,\times\})}}{dx} + h_{(T\{+,\times\})} = 0, \qquad x \equiv k(\tau - \tau_e), \qquad (172)$$
where $x_* = k[H_0\Omega_{nr,0}^{1/2}]^{-1}$ and
$$a - a_e = \tfrac{1}{4}[H_0\Omega_{nr,0}^{1/2}(\tau - \tau_e)]^2 + a_{eq}^{1/2}[H_0\Omega_{nr,0}^{1/2}(\tau - \tau_e)], \qquad a_{eq} \equiv \frac{\Omega_{er,0}}{\Omega_{nr,0}} . \qquad (173)$$
For waves which reach $Ha \sim k$ in the er-dominated regime, the solution is
$$P_{h(T\{+,\times\})}(k, \tau) = P_{h(T\{+,\times\})}(k, \tau_e)\,\left(j_0(k(\tau - \tau_e))\right)^2, \qquad a - a_e \ll a_{eq}, \qquad (174)$$
while for those which reach $Ha \sim k$ in the nr-dominated regime it is
$$P_{h(T\{+,\times\})}(k, \tau) = P_{h(T\{+,\times\})}(k, \tau_e)\left[\frac{3 j_1(k(\tau - \tau_e))}{k(\tau - \tau_e)}\right]^2, \qquad a - a_e \gg a_{eq} . \qquad (175)$$
It is more complicated in the transition region or if there are other
constituents in the equation of state, such as vacuum energy, decaying
particles, light massive neutrinos, etc. These solutions explicitly show the
constancy outside of the “horizon” and the loss of power due to the
free-streaming of gravity wave perturbations inside the “horizon”, i.e., for
$k(\tau - \tau_e) \gg 1$.
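The two limiting solutions can be checked by integrating eq. (172) directly. The sketch below is a minimal illustration under stated assumptions: it uses a fixed-step RK4 integrator, starts from the small-$x$ power series with amplitude normalized to 1, and realizes the er-dominated limit by sending $x_* \to \infty$ and the nr-dominated limit by setting $a_{eq} = 0$. The starting point `x0`, the step count and the end point are arbitrary choices.

```python
import math

def friction_factor(x, x_star, sqrt_a_eq):
    """Bracket multiplying (2/x) dh/dx in eq. (172): 1 in the er limit, 2 in the nr limit."""
    return (x / (2.0 * x_star) + sqrt_a_eq) / (x / (4.0 * x_star) + sqrt_a_eq)

def evolve(x_end, x_star, sqrt_a_eq, x0=0.05, steps=20000):
    """Integrate eq. (172) from x0 to x_end, with h -> 1 as x -> 0."""
    f0 = friction_factor(x0, x_star, sqrt_a_eq)
    # Series h = 1 + a1 x^2 + a2 x^4 + ..., treating the bracket as constant near x0
    a1 = -1.0 / (2.0 * (1.0 + 2.0 * f0))
    a2 = -a1 / (4.0 * (3.0 + 2.0 * f0))
    h = 1.0 + a1 * x0 ** 2 + a2 * x0 ** 4
    dh = 2.0 * a1 * x0 + 4.0 * a2 * x0 ** 3

    def rhs(x, h_val, dh_val):
        f = friction_factor(x, x_star, sqrt_a_eq)
        return dh_val, -2.0 * f * dh_val / x - h_val

    step = (x_end - x0) / steps
    x = x0
    for _ in range(steps):
        k1 = rhs(x, h, dh)
        k2 = rhs(x + step / 2, h + step / 2 * k1[0], dh + step / 2 * k1[1])
        k3 = rhs(x + step / 2, h + step / 2 * k2[0], dh + step / 2 * k2[1])
        k4 = rhs(x + step, h + step * k3[0], dh + step * k3[1])
        h += step / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dh += step / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        x += step
    return h

x_end = 10.0
h_er = evolve(x_end, x_star=float("inf"), sqrt_a_eq=1.0)   # a - a_e << a_eq
h_nr = evolve(x_end, x_star=1.0, sqrt_a_eq=0.0)            # a - a_e >> a_eq

j0 = math.sin(x_end) / x_end
j1 = math.sin(x_end) / x_end ** 2 - math.cos(x_end) / x_end

print("er limit:", h_er, " j0(x) =", j0)
print("nr limit:", h_nr, " 3 j1(x)/x =", 3.0 * j1 / x_end)
```

Squaring these transfer functions gives the power ratios in eqs. (174) and (175).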
6.3 Connection with primordial post-inflation power spectra

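The evolution equations for fluctuations $\delta\phi_j$ in scalar fields $\phi_j$ are
$$\frac{\partial^2\delta\phi_j}{\partial\tau^2} + 2\bar H\bar a\frac{\partial\delta\phi_j}{\partial\tau} - \bar a^2\,{}^{(3)}\nabla^2\delta\phi_j + \bar a^2\sum_i\frac{\partial^2 V}{\partial\phi_j\partial\phi_i}\delta\phi_i = -\dot{\bar\phi}_j\left(3\dot\varphi - \dot\nu - {}^{(3)}\nabla^2\bar a\Psi_\sigma\right) - 2\bar a^2\frac{\partial V}{\partial\phi_j}\nu . \qquad (176)$$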
The inflaton and isocons are coupled through a potential
$V(\phi_{inf}, \phi_{is}, \ldots)$. No explicit dissipative coupling term has been
included (but one is needed to turn the oscillating scalar field into radiation
and matter). The combinations
$$\delta\phi_j - \dot{\bar\phi}_j\bar a^{-1}\Psi_\sigma = \delta\phi_{jL} \quad\text{and}\quad \delta\phi_j + (\bar H\bar a)^{-1}\dot{\bar\phi}_j\varphi = \delta\phi_j|_{\ln a} \qquad (177)$$
are gauge invariant. Thus the perturbation in the longitudinal gauge is an
invariant combination $\delta\phi_{jL}$. Scalar fields have no anisotropic stress
in linear perturbation theory. For the longitudinal gauge this gives the simple
relation $\nu_L = -\varphi_L$. The second gauge invariant combination,
$\delta\phi_j|_{\ln a} = \delta\phi_{jL} + \frac{d\bar\phi_j}{d\ln a}\delta\ln a$, is the
value of the scalar field on hypersurfaces of fixed $\ln a$; i.e., beginning in
the longitudinal gauge, one forms $\phi_{\ln a} = \phi(x, \tau + T(x, \tau))$ with $T$
defined by requiring that $\ln a(x, \tau + T(x, \tau))$ be constant. Mukhanov [177]
showed in perturbation theory that the metric terms disappear in the scalar
field evolution equation when the choice $\bar a\,\delta\phi_j|_{\ln a}$ is made,
resulting in considerable simplification for the case of a single scalar field
being important in inflation. In [181], we emphasized some of the virtues in the
nonlinear case.
The equation for $h_{(T\{+,\times\})}$ is identical to that for scalar fields with no
effective mass. (There is a $(\delta H)$ term multiplying $h_{(T\{+,\times\})}$, but
this is an ignorable quadratic nonlinearity.) A factor is required to make the
actions the same: $(m_P/\sqrt{16\pi})\,h_{(T\{+,\times\})}$, where the Planck mass
is related to Newton’s gravitational constant by $m_P \equiv G_N^{-1/2}$ in units
with $\hbar = c = 1$.
We now describe the power spectra resulting from zero-point quantum
oscillations found by solving these equations. During inflation $Ha$ increases.
The solution of the massless scalar equation shows rapid oscillation of the
respective mode functions “inside the horizon”, almost freeze-out outside
($k < Ha$), with a power amplitude $P_\phi^{1/2}(k, \tau_k) \approx H(\tau_k)/(2\pi)$
on the $k = Ha$ boundary. The Hawking temperature $H/(2\pi)$ result (footnote 3)
follows from a WKB solution to (169) evaluated at $k = Ha$ provided the effective
masses of the scalars are small compared with $H$. The perturbation in the
inflaton field $\phi_{inf}$ translates into scalar perturbations in the metric
through $\delta\ln a = (H/\dot\phi_{inf})\delta\phi_{inf}$. If we denote the end of
inflation by $\tau_e$ and horizon crossing by $\tau_k$, the post-inflation spectra are
$$P^{1/2}_{\ln a|_{H_*}}(k, \tau_e) = \frac{1}{\sqrt{q + 1}}\,\frac{\sqrt{4\pi}}{m_P}\,\frac{H(\tau_k)}{2\pi}\,e^{u_s}, \qquad (178)$$
$$P^{1/2}_{GW} = \sqrt{8}\,\frac{\sqrt{4\pi}}{m_P}\,\frac{H(\tau_k)}{2\pi}\,e^{u_t}, \qquad P_{GW}(k) \equiv P_{h(T+)}(k) + P_{h(T\times)}(k) .$$
The correction factors ut and us to “the H/(2π) at k = Ha WKB approxi-
mation” are in practice nearly zero. How near is now of considerable interest
because the COBE results have created a desire for calculational precision
[184, 6]. Complicated potential interactions between the inflaton and isocon
degrees of freedom can also change these results.
In eq. (178), $H(\phi)$ and the deceleration parameter $q(\phi)$ are treated as
functions of the inflaton field. These functions naturally follow from the
Hamilton–Jacobi formulation [181, 182], in which $H(\phi)$ is related to the
potential $V(\phi)$ by
$$H^2 = \frac{8\pi}{3m_P^2}\left[\frac{1}{2}\left(\frac{m_P^2}{4\pi}\frac{\partial H}{\partial\phi}\right)^2 + V(\phi)\right], \qquad (1 + q) = \frac{m_P^2}{4\pi}\left(\frac{\partial\ln H}{\partial\phi}\right)^2 . \qquad (179)$$
The oft-used slow rollover approximation, valid in many inflation calculations
but certainly not all, is the zeroth order solution in an expansion in $1 + q$:
$H^2 \approx 8\pi V/(3m_P^2)$.
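The sense in which slow rollover is the zeroth order term of an expansion in $1 + q$ can be made explicit by substituting the second relation of eq. (179) into the first. This is a short worked rearrangement using only eq. (179):

```latex
\begin{aligned}
\frac{8\pi}{3m_P^2}\cdot\frac{1}{2}\left(\frac{m_P^2}{4\pi}\frac{\partial H}{\partial\phi}\right)^2
  &= \frac{m_P^2}{12\pi}\left(\frac{\partial H}{\partial\phi}\right)^2
   = \frac{H^2}{3}\,\frac{m_P^2}{4\pi}\left(\frac{\partial\ln H}{\partial\phi}\right)^2
   = \frac{H^2}{3}(1+q), \\
\Rightarrow\quad
H^2\left[1 - \frac{1+q}{3}\right] &= \frac{8\pi V(\phi)}{3m_P^2} .
\end{aligned}
```

Setting $1 + q \to 0$ in the bracket recovers $H^2 \approx 8\pi V/(3m_P^2)$; the relation itself is exact within the Hamilton–Jacobi formulation.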
The adiabatic scalar and tensor tilts are logarithmic derivatives of eq. (178):
$$\frac{\nu_t}{2} = \frac{n_t + 3}{2} = 1 + q^{-1} + C_t,$$
$$\frac{\nu_s}{2} = \frac{n_s - 1}{2} = 1 + q^{-1} - q^{-1}\frac{m_P^2}{4\pi}\frac{\partial^2\ln H}{\partial\phi^2} + C_s . \qquad (180)$$
Here $C_{t,s}$ are small and essentially ignorable correction factors associated
with derivatives of the $u_{t,s}$.

Footnote 3: In stochastic inflation, noise at the Hawking temperature radiates
from short distances across the decreasing $(Ha)^{-1}$ boundary into an
inhomogeneous background field built from longer wavelengths.
Eq. (180) shows that tilt mostly depends upon how far the acceleration is
below the critical value of unity. For uniform acceleration, the scalar and
tensor tilts are equal:
$$\nu_s = \nu_t = -2(p - 1)^{-1}, \qquad q + 1 = p^{-1} . \qquad (181)$$
It is realized by power law inflation, $a \propto t^p$, with $p$ constant, and an
exponential potential in $\phi$. Over the small observable window we have in
$k$-space, this is often a good approximation, e.g., for extended inflation, one
of a class of theories with variable gravitational coupling. Of course, $q$ must
go negative for a viable model of inflation. Power law potentials of the form
$V(\phi) = \lambda_e m_P^4(\phi/m_P)^{2n}/(2n)$ with $n$ constant have the acceleration
naturally dropping through zero: $q \approx -1 + (\phi/m_P)^{-2}n^2/(4\pi)$. In
chaotic inflation examples [188], one often takes power law potentials with
$n = 1$ or 2. A characteristic of such potentials is that the range of values of
$\phi$ which correspond to all of the large scale structure that we observe is
actually remarkably small: e.g., for $n = 2$, the region of the potential curve
responsible for the structure between the scale of galaxies and the scales up to
our current Hubble length is just $4m_P \lesssim \phi \lesssim 4.4m_P$ [192].
Consequently, $H(\phi)$ does not evolve by a large factor over the large scale
structure region and we therefore expect approximate uniformity of $\nu_s(k)$ and
$\nu_t(k)$ over the narrow observable bands of $k$-space, and near-scale-invariance
for both. Although this is usually quoted in the form of a logarithmic
correction to the $\ln a|_{H_*}$-spectrum, a power law approximation is quite
accurate [190]:
$$\nu_s(k) \approx -\frac{n + 1}{N_I(k) - n/6}, \qquad \nu_t \approx -\frac{n}{N_I(k) - n/6}, \qquad q + 1 \approx \frac{n/2}{N_I(k) + n/3} .$$
$N_I(k)$ is the number of e-foldings between the point at which wavenumber $k$
“crosses the horizon” (when $k = Ha$) and the end of inflation. For waves
the size of our current Hubble length we have the familiar $N_I(k) \sim 60$, hence
$\nu_s \approx -0.05$, $\nu_t \approx -0.03$ for $n = 2$ and $\nu_s \approx -0.03$,
$\nu_t \approx -0.02$ for $n = 1$ (massive scalar field case). Further, the
observable scales are sufficiently far from the reheating scale that $N_I$ is
relatively large over the observable range: e.g., over the range from our Hubble
radius down to the galaxy scale, $\nu_s$ decreases by only about 0.01.
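The quoted tilts follow directly from the power law approximation. The short sketch below evaluates the three formulas at $N_I = 60$ and also checks that the $\nu_t$ and $q + 1$ expressions are mutually consistent with the leading term of eq. (180), $\nu_t/2 = 1 + q^{-1}$ (with $C_t$ neglected). The function name `chaotic_tilts` is a label chosen for this example.

```python
# Sketch: evaluate the power law approximation for chaotic inflation tilts.

def chaotic_tilts(n, n_efold):
    """Return (nu_s, nu_t, q + 1) for V ~ phi^(2n) at N_I = n_efold e-foldings."""
    nu_s = -(n + 1) / (n_efold - n / 6.0)
    nu_t = -n / (n_efold - n / 6.0)
    q_plus_1 = (n / 2.0) / (n_efold + n / 3.0)
    return nu_s, nu_t, q_plus_1

table = {}
for n in (1, 2):
    nu_s, nu_t, q_plus_1 = chaotic_tilts(n, 60.0)
    q = q_plus_1 - 1.0
    nu_t_from_q = 2.0 * (1.0 + 1.0 / q)   # leading term of eq. (180), C_t dropped
    table[n] = (nu_s, nu_t, q_plus_1, nu_t_from_q)
    print(f"n={n}: nu_s={nu_s:.4f}  nu_t={nu_t:.4f}  "
          f"q+1={q_plus_1:.4f}  2(1+1/q)={nu_t_from_q:.4f}")
```

The agreement between `nu_t` and `2(1+1/q)` is exact algebraically, since $-2\epsilon/(1 - \epsilon)$ with $\epsilon = (n/2)/(N_I + n/3)$ reduces to $-n/(N_I - n/6)$.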
In natural inflation [189, 190], the inflaton for the region of $k$-space that
we can observe is identified with a pseudo-Goldstone boson with a potential
$V = 2\Lambda^4\sin^2(\phi/(2f))$. This is similar to the axion, except that the
symmetry breaking scale $f$ is taken to be of order $m_P$ and the energy scale for
the potential is taken to be of order the grand unified scale, $m_{GUT}$, so that
an effective weak coupling, $\lambda_e = \Lambda^4/(f m_P)^2 \sim (m_{GUT}/m_P)^4$,
arises “naturally”, giving the required $10^{-13}$ for $m_{GUT} = 10^{16}$ GeV. To
obtain sufficient inflation and a high enough post-inflation reheat temperature
for baryogenesis, $f \gtrsim 0.3 m_P$ is required. To have a tilted spectrum and
also get enough inflation in our Hubble patch, $\phi/f$ must have started near the
maximum at $\pi$, an inflection point where $q$ is nearly $-1$, hence tensor tilt
and gravity wave power are both exponentially suppressed; however the scalar
$\nu_s \approx -m_P^2/(8\pi f^2)$ does not have to be small [190].
The index νs can have complex k-dependent structure when the acceleration
changes considerably over the k-band in question. According to eqs. (180),
the post-inflation gravitational wave spectrum will have power increasing with
wavelength, whereas artfully using the ∂ 2 ln H/∂φ2 term allows essentially any
prescribed shape for the adiabatic scalar spectrum (e.g., [191, 192]). However,
most broken scale invariance models which do give considerable variation in
νs (k) over the relatively narrow band of k-space that we can observe are not
very compelling, since rather dramatic features must be tuned to lie on the
potential surface in just that stretch which corresponds to our observable band.
A slowly varying νs (k) is certainly a better bet.
6.4 Relating scalar and tensor power measures to the dmr band-power

For early universe calculations and also to characterize the initial conditions
for the photon transport through decoupling, the power in adiabatic scalar
fluctuations on scales beyond the Hubble radius is best characterized in terms of
quantities which become time-independent. We have seen that some examples
are the spatial curvature of time surfaces on which there is no net flow of
momentum, the expansion factor fluctuation on time surfaces with uniform
space creation rate and $\zeta$. An initially scale invariant adiabatic spectrum
has $k$-independent power per $d\ln k$ in these variables (for
$k/(\bar H\bar a) \ll 1$), while for models with spectral tilt $\nu_s$, we have
$P_{\varphi_{com}}(k) = P_{\varphi_{com}}(\tau_0^{-1})(k\tau_0)^{\nu_s}$, where
we use the instantaneous comoving horizon size at the current epoch, $\tau_0$, as
the normalization point. For CDM-like models (those with $\Omega = \Omega_{nr} = 1$
and $\tau_0 = 2H_0^{-1}$), these spectra are related to the portion of the dmr band
power $\langle C_\ell^{(S)}\rangle_{dmr}$ in the scalar adiabatic mode,
$\langle C_\ell^{(S)}\rangle_{dmr} = \langle C_\ell\rangle_{dmr}/(1 + \tilde r_{ts})$,
and to the quadrupole power, $C_2^{(S)} = C_2/(1 + r_{ts})$, by
$$P_{\varphi_{com}}(\tau_0^{-1}) \approx 23.4\,\langle C_\ell^{(S)}\rangle_{dmr}\,e^{-1.99\nu_s(1 + 0.1\nu_s)} \approx 23.5\,C_2^{(S)}e^{-1.1\nu_s}, \qquad (182)$$
$$P_{\varphi_{com}}(k) \approx P_{\ln a|_{H_*}}(k) \approx P_\zeta(k), \qquad P_{\varphi_{com}}(k) = P_{\varphi_{com}}(\tau_0^{-1})(k\tau_0)^{\nu_s},$$
i.e., roughly $3 \times 10^{-9}$. This relation, determined for the $\Omega_B = 0.05$,
$h = 0.5$ CDM model, is quite insensitive to variations in $h$ and $\Omega_B$ (e.g.,
23.6 to 23.0 as $\Omega_B$ rises from 0.0125 to 0.20 for
$\langle C_\ell^{(S)}\rangle_{dmr}$ and almost no change for $C_2^{(S)}$). For
scales of order our present Hubble size, we also have
$P_\zeta \approx \frac{25}{9}P_{\Phi_H} \approx \frac{25}{4}P_{(\delta\rho)_{hor}}$,
where $\Phi_H \equiv \varphi_L = -\nu_L = -\Phi_N$ is the perturbed Newtonian
gravitational potential and $(\delta\rho)_{hor}$ is the density fluctuation (in the
synchronous gauge) at “horizon crossing”, $k\tau = 1$.
Quantum noise in the transverse traceless modes of the perturbed metric
tensor would also have arisen in the inflation epoch and for many models may
have been quite significant, as is discussed below. The gravitational radiation
power spectrum $P_{GW} = P_{h+} + P_{h\times}$ is the sum of the two independent
gravitational wave polarizations. It is related to the amplitude of the dmr band
power $\langle C_\ell^{(T)}\rangle_{dmr} = \langle C_\ell\rangle_{dmr}\,\tilde r_{ts}/(1 + \tilde r_{ts})$
and to the quadrupole $C_2^{(T)}$ by
$$P_{GW}(\tau_0^{-1}) \approx 17.6\,\langle C_\ell^{(T)}\rangle_{dmr}\,e^{-1.92\nu_t(1 + 0.1\nu_t)} \approx 13.7\,C_2^{(T)}e^{-1.25\nu_t},$$
$$P_{GW}(k) = P_{GW}(\tau_0^{-1})(k\tau_0)^{\nu_t}, \qquad (183)$$
with very little $\Omega_B$ dependence.

The inflation model determines the ratio of $P_{GW}(\tau_0^{-1})$ to
$P_{\varphi_{com}}(\tau_0^{-1})$, through eq. (178), which is related to the tilt of
the gravity wave spectrum, $-4\nu_t/(1 - \nu_t/2)$ in zeroth order, with small
corrections associated with $C_t$ and $u_t - u_s$ predominantly dependent upon
$\nu_t - \nu_s$ and which can usually be ignored [184, 6]. The fits given above can
then be used to relate the ratio of dmr band-powers (and quadrupoles) to the
tilts (for $\Omega_{vac} = \Omega_{curv} = 0$):
$$\tilde r_{ts} \equiv \frac{\langle C_\ell^{(T)}\rangle_{dmr}}{\langle C_\ell^{(S)}\rangle_{dmr}} \approx 1.33\,\frac{P_{GW}(\tau_0^{-1})}{P_{\varphi_{com}}(\tau_0^{-1})}\,e^{-0.07\nu_t}e^{-1.99(\nu_s - \nu_t)},$$
$$r_{ts} \equiv \frac{C_2^{(T)}}{C_2^{(S)}} \approx 1.71\,\frac{P_{GW}(\tau_0^{-1})}{P_{\varphi_{com}}(\tau_0^{-1})}\,e^{-0.15\nu_t}e^{-1.1(\nu_s - \nu_t)}, \qquad (184)$$
$$\frac{P_{GW}(\tau_0^{-1})}{P_{\varphi_{com}}(\tau_0^{-1})} = 8(1 + q)e^{2(u_t - u_s)} \approx \frac{-4\nu_t}{(1 - \nu_t/2)} .$$
Recall from eq. (180) that the tensor tilt is simply related to the deceleration
parameter $q = -a\ddot a/\dot a^2$ of the Universe in the inflationary epoch,
$\nu_t/2 \approx 1 + q^{-1}$; although $1 + q^{-1}$ is the leading term for the scalar
tilt, other terms can dominate when the deceleration is near the critical
deSitter-space value of $-1$. The tensor tilt is invariably negative. When
assessing the effect of gravity waves on the normalization of the spectrum, as
noted earlier it is useful to consider two limiting cases: $\nu_s \approx \nu_t$,
which holds for the widest class of models, including power law and chaotic
inflation, and $\nu_t \approx 0$, with $\nu_s$ arbitrary, which holds for some models
such as “natural” inflation. To lowest order in $\nu_t$,
$\tilde r_{ts} \approx -5.3\nu_t$ and $r_{ts} \approx -6.8\nu_t$ (often rounded up to
$-7\nu_t$, which is nearly the value one gets if only the naive Sachs-Wolfe formula
is used to estimate $C_2^{(S)}$).
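The fits in eq. (184) are simple to evaluate. The sketch below codes them exactly as quoted, taking the zeroth order power ratio $-4\nu_t/(1 - \nu_t/2)$ (that is, setting $u_t = u_s$, an assumption of this example), and compares the result for $\nu_s = \nu_t$ with the lowest order estimates $-5.3\nu_t$ and $-6.8\nu_t$. The function names and the sample tilt are illustrative choices.

```python
import math

def power_ratio(nu_t):
    """Zeroth-order P_GW / P_phicom = -4 nu_t / (1 - nu_t/2), with u_t = u_s assumed."""
    return -4.0 * nu_t / (1.0 - nu_t / 2.0)

def band_power_ratio(nu_s, nu_t):
    """Tensor-to-scalar ratio of dmr band-powers, first line of eq. (184)."""
    return (1.33 * power_ratio(nu_t)
            * math.exp(-0.07 * nu_t) * math.exp(-1.99 * (nu_s - nu_t)))

def quadrupole_ratio(nu_s, nu_t):
    """Tensor-to-scalar ratio of quadrupoles, second line of eq. (184)."""
    return (1.71 * power_ratio(nu_t)
            * math.exp(-0.15 * nu_t) * math.exp(-1.1 * (nu_s - nu_t)))

nu = -0.03   # illustrative tilt, roughly the n = 2 chaotic inflation tensor tilt
r_band = band_power_ratio(nu, nu)
r_quad = quadrupole_ratio(nu, nu)
print("band-power ratio:", r_band, " lowest order -5.3 nu_t:", -5.3 * nu)
print("quadrupole ratio:", r_quad, " lowest order -6.8 nu_t:", -6.8 * nu)

# Limiting case nu_t -> 0 (e.g. natural inflation): no tensor contribution.
r_natural = band_power_ratio(-0.2, 0.0)
print("nu_t = 0 band-power ratio:", r_natural)
```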
There are also corrections as one goes away from the $\Omega_{nr} = \Omega = 1$ models.
For example, models with nonzero cosmological constant $\Omega_{vac}$, but
$\Omega_{nr} + \Omega_{vac} = 1$, have $P_{GW}/\langle C_\ell^{(T)}\rangle_{dmr}$ being
only weakly dependent upon $\Omega_{vac}$ whereas
$P_\zeta/\langle C_\ell^{(S)}\rangle_{dmr} \propto (1 - 0.6\,\Omega_{vac}^{3.5})$ is
strongly dependent upon it (section 7.3).
6.5 The Boltzmann transport equation
In Appendices B and C, transport theory with polarization in the ADM framework
is derived with full nonlinearities in the gravitational field included. The
Boltzmann transport equation depends upon not only the spacetime foliation
chosen, but also upon the momentum variables $q^I$ chosen. A natural set to
select are those referred to the orthonormal basis $\{e_n, e_I\}$, where the
spatial triad $e_I$ is normal to the hypersurface flow vector $e_n^\alpha$; however,
this choice, $p^I$, is a physical momentum, not a comoving momentum, and we have
seen that the transport equations are much simpler if we use comoving momentum,
hence use $q^I = \Omega p^I$. The factor $\Omega$ must reduce to the average
expansion factor, $\bar a$, in the unperturbed case, but can be inhomogeneous if we
like. Choosing different $\Omega/\bar a$ corresponds to choosing different
momentum-space gauges, and leads to different forms for the Boltzmann transport
equation. This is only one example of the extra gauge transformation freedom
that exists in dealing with transport phenomena. Momenta defined with respect to
tetrads other than the $\{e_n, e_I\}$ also lead to modified equations.
Recall from section 3.1 that to treat polarized photons, four distribution
functions are required, $f_t, f_U, f_V, f_Q$, corresponding to the four Stokes
parameters. These are best understood as elements of a $2 \times 2$ polarization
matrix $f_{ss'}$, where $s$ denotes the photon polarization, $s = \pm 1$ for
circular polarization, $s = 1, 2$ say for linear polarization, with associated
polarization basis $\varepsilon_s$ satisfying $\varepsilon_s \perp \hat q$. One can
combine the tensor product basis $\varepsilon_s \otimes \varepsilon_{s'}$ with
$f_{ss'}$ to make a (spatial) tensor of rank 2 which is conceptually extremely
useful for understanding the Stokes parameters and how they behave under
rotations:
$$f = \sum_{ss'} f_{ss'}\,\varepsilon_s \otimes \varepsilon_{s'} = f_t E_{(t)} + f_U E_{(U)} + f_V E_{(V)} + f_Q E_{(Q)} . \qquad (185)$$
The tensor basis E(µ) for the Stokes parameters are linear combinations of the
εs ⊗ εs0 , defined by eq. (309). In Appendix C, the source function for Thomson
scattering is derived using this language. A more conventional approach is to
apply Chandrasekhar’s classic development of the scattering source term for
Rayleigh (and thus Thomson) scattering in a plane parallel atmosphere [200].
Of course, the transfer problem is not plane parallel; however, it effectively
becomes so for each normal Fourier transform mode for flat universes [134]. The
nonlinear Boltzmann transport equation for $\Delta_{\{t,Q,U,V\}}$ is given by
eq. (284). The linearized version for photons takes the form (using
$\bar N = \bar a$, i.e., conformal time)
$$\frac{\partial}{\partial\tau}\Delta_{\{t,Q,U,V\}} + \hat q^J\frac{\partial}{\partial x^J}\Delta_{\{t,Q,U,V\}} = G_{\{t,Q,U,V\}\,SW} + G_{\{t,Q,U,V\}\,curv} + G_{\{t,Q,U,V\}\,C}, \qquad (186)$$
in terms of a Sachs–Wolfe source from redshift effects, $G_{\{t,Q,U,V\}\,SW}$, a
source associated with mean curvature of the Universe, $G_{\{t,Q,U,V\}\,curv}$,
and the Thomson scattering source, $G_{\{t,Q,U,V\}\,C}$. In nonlinear theory,
there is a term associated with the bending (lensing) of light, $G_{t\,bend}$, but
to linear order it manifests itself only if there is mean curvature. Mean
curvature terms, grouped into $G_{\{t,Q,U,V\}\,curv}$,
are described in section C.4. Although the solution method when there is mean
curvature is quite similar to the flat case one, the discussion is complicated
because the mode functions are not plane waves. For this section, we assume
a flat background so the modes are characterized by a comoving wavevector
kI , with the understanding that we really mean the action of the operator
−iā (3) ∇I .
6.5.1 Scalar mode transfer equations

For scalar perturbations (and flat universes – see section C.4 for mean curvature
modifications),
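$$ G^{(S)}_{t\,SW} = -i\,\hat q\cdot\hat k\,k\nu - \dot\varphi - (\hat q\cdot\hat k)^2 k^2\,\bar a^{-1}\Psi_\sigma , \tag{187} $$
$$ G^{(S)}_{Q\,SW} = 0, \qquad G^{(S)}_{U\,SW} = 0, \qquad G^{(S)}_{V\,SW} = 0, $$
$$ \tau_C G^{(S)}_{tC} = -\Delta^{(S)}_t + \Delta^{(S)}_{t0} - i\,\hat q\cdot\hat k\,k\,\bar a^{-1}\Psi_{v,B} - \tfrac{1}{2} P_2(\hat q\cdot\hat k)\,(\Delta^{(S)}_{t2} + \Delta^{(S)}_{Q2} + \Delta^{(S)}_{Q0}) . \tag{188} $$
$$ \tau_C G^{(S)}_{QC} = -\Delta^{(S)}_Q + \tfrac{1}{2}\,(1 - P_2(\hat k\cdot\hat q))\,(\Delta^{(S)}_{Q0} + \Delta^{(S)}_{t2} + \Delta^{(S)}_{Q2}) , \tag{189} $$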
$$ \tau_C G^{(S)}_{UC} = 0 , \qquad \tau_C G^{(S)}_{VC} = -\Delta^{(S)}_V + \tfrac{3}{4}\,\hat q\cdot\hat k\,\Delta^{(S)}_{V1} . \tag{190} $$
The moments $\Delta^{(S)}_{\{t,Q,U,V\}\,\ell}$ are defined by expanding in Legendre polynomials:
$$ \Delta^{(S)}_{\{t,Q,U,V\}} = \sum_\ell (2\ell+1)\,(-i)^\ell\, \Delta^{(S)}_{\{t,Q,U,V\}\,\ell}\, P_\ell(\hat k\cdot\hat q) . \tag{191} $$
$\Delta^{(S)}_U$ and $\Delta^{(S)}_V$ are initially zero, and remain decoupled from other sources,
so remain zero. Thus for each eigenmode only transport equations for $\Delta^{(S)}_t$
and $\Delta^{(S)}_Q$ need to be solved. However, when one reconstructs from these
mode solutions the statistical distribution of the observed polarization pattern,
$\Delta_{\{Q,U,V\}}(x, \tau_0, \hat q)$, the U and Q components mix, so both are nonzero.
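The Legendre expansion of eq. (191) can be inverted by projection, $\Delta_\ell = (i^\ell/2)\int_{-1}^{1} d\mu\, P_\ell(\mu)\,\Delta(\mu)$ with $\mu = \hat k\cdot\hat q$. The sketch below is illustrative only and is not the code used for these notes: the function names, the Simpson quadrature and the test pattern $e^{-ix\mu}$ (a freely streamed monopole, whose moments are the spherical Bessel functions $j_\ell(x)$) are assumptions chosen here to check the normalization and phase convention.

```python
import cmath
import math

def legendre(lmax, mu):
    """Return [P_0(mu), ..., P_lmax(mu)] from the Bonnet recurrence."""
    p = [1.0, mu]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * mu * p[l] - l * p[l - 1]) / (l + 1))
    return p[: lmax + 1]

def moments(delta, lmax, n=2000):
    """Delta_l = (i^l / 2) * int_{-1}^{1} Delta(mu) P_l(mu) dmu, by Simpson's rule (n even)."""
    step = 2.0 / n
    sums = [0j] * (lmax + 1)
    for j in range(n + 1):
        mu = -1.0 + j * step
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        value = delta(mu)
        for l, pl in enumerate(legendre(lmax, mu)):
            sums[l] += weight * value * pl
    return [(1j ** l) * 0.5 * s * step / 3.0 for l, s in enumerate(sums)]

# Test pattern (an assumption for illustration): exp(-i x mu) has moments j_l(x).
x = 1.0
d = moments(lambda mu: cmath.exp(-1j * x * mu), lmax=3)
j0 = math.sin(x) / x
j1 = math.sin(x) / x**2 - math.cos(x) / x
print(abs(d[0] - j0), abs(d[1] - j1))
```

With the $(-i)^\ell$ convention of eq. (191) the moments of this test pattern come out real, which is a quick way to catch a sign or phase slip when coding the hierarchy.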
These equations are valid for arbitrary gauge choices. The momenta have
been chosen as the components defined with respect to an orthogonal basis so
that the direction q̂ is what would be measured. The change of momentum
coordinates is itself part of a gauge transformation.4 It is often convenient in
dealing with the transport equation to rewrite it with many of the explicit terms
of form q̂ I (3) ∇I brought into the transport operator. This is equivalent to a
momentum component transformation. An explicit example is to use momenta
$$ \tilde q^I = \exp[\nu + (\partial_\tau - \hat q^i\partial_i)(\bar a^{-1}\Psi_\sigma)]\; q^I , \tag{192} $$
4 The oft-used approach of viewing the CMB photon transport equation as one for the
radiation brightness, the integral of the distribution function over q but not q̂, obscures this
view. Also for most sources but Thomson scattering or for other particles such as massive
neutrinos, the q-dependence is very relevant.
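which transforms the distribution function and source terms to the gauge-invariant
combination $\tilde\Delta^{(S)}_t$ introduced in section 5.2:
$$ \tilde\Delta^{(S)}_t = \Delta^{(S)}_t + \nu + \frac{\partial(\bar a^{-1}\Psi_\sigma)}{\partial\tau} - \hat q^i\partial_i\,\bar a^{-1}\Psi_\sigma , \tag{193} $$
$$ \tilde G^{(S)}_{t\,SW} = \dot\nu - \dot\varphi + \frac{\partial^2(\bar a^{-1}\Psi_\sigma)}{\partial\tau^2} , $$
$$ \tfrac{1}{4}\tilde\delta_\gamma \equiv \tfrac{1}{4}\delta_\gamma + \nu + \partial_\tau(\bar a^{-1}\Psi_\sigma) , $$
$$ \tilde\Psi_{v,\gamma} \equiv \Psi_{v,\gamma} + \Psi_\sigma , \qquad \tilde\Delta^{(S)}_{t\ell} \equiv \Delta^{(S)}_{t\ell} \quad (\ell \ge 2) . $$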
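As a consistency check, the shift in eq. (193) can be applied to the transport operator directly to see that it converts the Sachs–Wolfe source of eq. (187) into $\tilde G^{(S)}_{t\,SW}$. The short derivation below is a sketch that assumes a flat background and the Fourier convention $\hat q^i\partial_i \to i k\,\hat q\cdot\hat k$ implied by the streaming term; the shorthand $D$ is introduced here only for this check.

```latex
\begin{align}
D &\equiv \partial_\tau + \hat q^i\partial_i , \qquad
  \hat q^i\partial_i \to i k\,\hat q\cdot\hat k , \\
\tilde\Delta^{(S)}_t - \Delta^{(S)}_t
  &= \nu + (\partial_\tau - \hat q^i\partial_i)(\bar a^{-1}\Psi_\sigma) , \\
D\,(\tilde\Delta^{(S)}_t - \Delta^{(S)}_t)
  &= \dot\nu + i k\,\hat q\cdot\hat k\,\nu
   + \partial_\tau^2(\bar a^{-1}\Psi_\sigma)
   + (\hat q\cdot\hat k)^2 k^2\,\bar a^{-1}\Psi_\sigma , \\
G^{(S)}_{t\,SW} + D\,(\tilde\Delta^{(S)}_t - \Delta^{(S)}_t)
  &= \dot\nu - \dot\varphi + \partial_\tau^2(\bar a^{-1}\Psi_\sigma)
   = \tilde G^{(S)}_{t\,SW} .
\end{align}
```

The direction-dependent pieces of eq. (187) cancel term by term, which is why the transformed source has no explicit $\hat q$ dependence.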
If the Compton sources are small and the gravitational potential perturbations
do not change in time (or equivalently if the time flow has $\bar a^{-1}\Psi_\sigma \propto \tau$), then
the source term can be neglected, $\tilde\Delta^{(S)}_t$ propagates freely, and is particularly
simple to integrate [88]. We used this quantity extensively in [134, 88]; in
the longitudinal gauge it has the simple interpretation that it is the
Tolman combination $e^{\nu_L} T_\gamma$ which free-streams.
These equations are coupled to the transport equations for massless neutri-
nos, hot, warm and cold dark matter, and baryons. Massless neutrinos obey
a transfer equation identical to that for photons, except of course there is no
Compton coupling, only the Sachs–Wolfe and curvature sources. (If the relativistic
particles are decay products generated during evolution, there is also a
$\tilde G^{(S)}_{er\nu,\mathrm{decay}}$ source term [252].) Just as for photons, these equations are solved
by a moment expansion. For hot dark matter (light massive neutrinos), warm
dark matter, etc. the transport equation has the form
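$$ \partial_\tau[\Delta_{hdm}] + i\,\frac{q}{q^n}\,\hat q\cdot\hat k\,k\,\Delta_{hdm} = (G_{hdm\,SW} + G_{hdm\,curv}) , \tag{194} $$
$$ G^{(S)}_{hdm\,SW} = -i\,\frac{q^n}{q}\,\hat q\cdot k\,\nu - \dot\varphi - (\hat q\cdot k)^2\,\bar a^{-1}\Psi_\sigma , \qquad q^n = \sqrt{q^2 + m^2\bar a^2} . $$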
It is the semi-relativistic stage, when q/q n is not simply unity or q/(mā), that
creates the difficulty. A straightforward method is to solve this by moment
expansion, one for each neutrino momentum, q. To feed back into the perturbed
Einstein equations one needs to appropriately sum over the momenta q, to get
the energy density, velocity, pressure and anisotropic stress – with an adequate
number of moments and energy groups, and proper treatment of boundary
conditions in `-space, one can get away with only a few hundred extra coupled
ODEs to be added to the already formidable number of moment equations used
for the photons [260, 261]. In earlier work [195] we described another method,
expressing the metric equations as integro-differential equations – which had
the penalty of integrating over past time to get the current perturbed energy–
momentum tensor of the neutrinos. To make the method practical and indeed
relatively rapid for CMB anisotropy calculations, optimal sampling of the past
history [261] and a switch into (essentially) cold dark matter equations once the
particles were strongly in the nonrelativistic regime and the wavenumber was
much below the redshift-dependent Jeans wavenumber was helpful ([134, 2, 233]
and eqs. (350), (352)).
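The sum over momentum groups can be illustrated for the unperturbed background alone. The sketch below is a minimal example and not the scheme of [260, 261]: it assumes a Fermi–Dirac distribution with zero chemical potential, measures $q$ and $m\bar a$ in units of the comoving neutrino temperature, and uses a plain Simpson rule in place of an optimized set of energy groups. It returns the pressure-to-density ratio, which runs from 1/3 in the er regime to zero in the nonrelativistic regime, with the semi-relativistic stage in between.

```python
import math

def fermi_dirac(q):
    """Unperturbed occupation number (assumed: Fermi-Dirac, zero chemical potential)."""
    return 1.0 / (math.exp(q) + 1.0)

def pressure_over_density(m_a, qmax=40.0, n=4000):
    """p/rho for comoving momenta q and mass parameter m_a = m * a-bar, in temperature units.

    rho ~ sum over q of q^2 * q^n * f,  p ~ sum over q of q^4 / (3 q^n) * f,
    with q^n = sqrt(q^2 + m^2 a^2) as in eq. (194). Simpson's rule, n even.
    """
    step = qmax / n
    rho = 0.0
    p = 0.0
    for j in range(n + 1):
        q = j * step
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        qn = math.sqrt(q * q + m_a * m_a)
        f = fermi_dirac(q)
        rho += weight * q * q * qn * f
        if qn > 0.0:
            p += weight * q**4 / (3.0 * qn) * f
    return p / rho

for m_a in (0.0, 1.0, 10.0, 1000.0):
    print(m_a, pressure_over_density(m_a))
```

The same weights would be reused for the perturbed moments (velocity and anisotropic stress), so the cost scales with the number of momentum groups times the number of multipoles kept per group.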
The mass and momentum conservation equations for nr dark matter and
for baryons are (eqs. (280), (282))
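$$ \text{CDM:}\qquad \tfrac{1}{3}\dot\delta_{cdm} + \dot\varphi + \tfrac{1}{3}k^2\bar a^{-1}(\Psi_{v,cdm} + \Psi_\sigma) = 0 , \tag{195} $$
$$ \bar a^{-1}\dot\Psi_{v,cdm} = \nu , \tag{196} $$
$$ \text{baryons:}\qquad \tfrac{1}{3}\dot\delta_B + \dot\varphi + \tfrac{1}{3}k^2\bar a^{-1}(\Psi_{v,B} + \Psi_\sigma) = 0 , \tag{197} $$
$$ \bar a^{-1}\dot\Psi_{v,B} = \nu + \frac{4}{3}\frac{\rho_\gamma}{\rho_B}\, n_e\sigma_T\, \bar a^{-1}\Psi_{v,\gamma B} , \tag{198} $$
$$ \text{relative velocity potential:}\qquad \Psi_{v,\gamma B} \equiv \Psi_{v,\gamma} - \Psi_{v,B} . \tag{199} $$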
In eq. (198), the baryon pressure is neglected, a valid approximation for primary
anisotropy calculations: it manifests itself through the post-recombination
baryon Jeans length, k −1 ∼ 1h−1 kpc, very small compared to the ∼ 5h−1 Mpc
damping scale for primary anisotropies [134]. For Thomson scattering with ∆t
independent of q, the first few moments of the photon distribution function
are related to the density and pressure perturbations, velocity potential and
anisotropic stress by
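$$ \Delta^{(S)}_{t0} = \frac{\delta_\gamma}{4} = \frac{(\delta p)_\gamma}{4\bar p_\gamma} , \qquad \Delta^{(S)}_{t1} = \frac{k\Psi_{v,\gamma}}{3\bar a} , \qquad \Delta^{(S)}_{t2} = k^2\,\frac{\pi_{t,\gamma}}{12} , $$
and the first two moments of the $\Delta^{(S)}_t$ Boltzmann transport equation are just
the energy and momentum conservation equations for photons:
$$ \text{photons:}\qquad \tfrac{1}{4}\dot\delta_\gamma + \dot\varphi + \tfrac{1}{3}k^2\bar a^{-1}(\Psi_{v,\gamma} + \Psi_\sigma) = 0 , \tag{200} $$
$$ \bar a^{-1}\dot\Psi_{v,\gamma} - \bar H\Psi_{v,\gamma} - (\tfrac{1}{4}\delta_\gamma + \nu) + \tfrac{1}{6}k^2\pi_{t,\gamma} = -n_e\sigma_T\Psi_{v,\gamma B} , \tag{201} $$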
The right-hand side of eq. (198) gives the body force (i.e., per unit volume)
from Compton drag felt by the baryons, and the right-hand side of eq. (201)
gives the equal and opposite body force felt by the photons:
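$$ \{\mathrm{Force}_B\}_{C\text{-drag}} = -\frac{1}{\bar a^4}\int \frac{2\,d^3q}{(2\pi)^3}\, S_t\, q = -(\rho_\gamma + p_\gamma)\, n_e\sigma_T\,(v_B - v_\gamma) , $$
$$ \{\mathrm{Force}_\gamma\}_{C\text{-drag}} = -\{\mathrm{Force}_B\}_{C\text{-drag}} . \tag{202} $$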
Compton drag effectively damps the gas motion down to z ∼ 300 for CDM-
type models if the universe remains ionized, but lets up at z below ∼ 1000 for
normal recombination.
There are two kinds of limiting behavior in the baryon plus photon transport
equations that simplify calculations. The first is at the beginning of computa-
tions, when the photons plus baryons are tightly coupled, acting like a single
fluid, albeit with a shear viscosity and a thermal conductivity (section C.3.1).
In this limit, the infinite hierarchy of moment equations is not needed, but is
truncated by assuming the ℓ = 3 term vanishes. The anisotropic stress $\pi_{t,\gamma}$ is
then related to $\Psi_{v,\gamma B}$, which also has a thermal diffusion contribution to it. To
get these transport coefficients accurately, $\Psi_{v,\gamma B}$ must be expanded to second
order in $\tau_C$. These are used until it is unsafe to do so (using a conservative
safety tolerance).
The full transport equations are then solved, with an algorithm for opening
up the number of multipoles being calculated: the main effect of transport is to
propagate a pulse in ℓ-space localized around ℓ = kτ from low ℓ to high ℓ as τ
increases to $\tau_0$. In fig. 14, the nature of the pulse (smoothed in ℓ, as described
below) at its final location at $\tau_0$ is shown for representative ℓ’s.5
A second regime, used long after photon decoupling and only if the curvature
term and $\tilde G^{(S)}_{t\,SW}$ are negligible, is free-streaming in the $\tilde\Delta_t$ variable. It can be
used to propagate from some stopping point $\tau_s(k)$ to the present in one step.
This translates the ℓ-space pulse from a location centered on ℓ ∼ $k\tau_s$ to one
centered on ℓ ∼ $k\tau_0$.
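The pulse picture can be seen in a toy version of the source-free hierarchy. The sketch below is only an illustration under stated assumptions: no Compton, Sachs–Wolfe or curvature sources, all power initially in the monopole, a crude truncation $\Delta_{\ell_{max}+1} = 0$, and a fixed-step RK4 integrator, none of which is taken from the production code described in the text. With the convention of eq. (191), free streaming gives $\dot\Delta_\ell = k\,[\ell\Delta_{\ell-1} - (\ell+1)\Delta_{\ell+1}]/(2\ell+1)$, whose solution from a unit monopole is $\Delta_\ell = j_\ell(k\tau)$.

```python
import math

def rhs(y, k):
    """Source-free hierarchy: dDelta_l/dtau = k [l Delta_{l-1} - (l+1) Delta_{l+1}] / (2l+1)."""
    lmax = len(y) - 1
    out = []
    for l in range(lmax + 1):
        lower = y[l - 1] if l > 0 else 0.0
        upper = y[l + 1] if l < lmax else 0.0  # crude truncation: Delta_{lmax+1} = 0
        out.append(k * (l * lower - (l + 1) * upper) / (2 * l + 1))
    return out

def free_stream(k, tau_end, lmax=60, steps=1000):
    """Integrate the hierarchy from tau = 0 with fixed-step RK4."""
    y = [1.0] + [0.0] * lmax  # all power starts in the monopole
    h = tau_end / steps
    for _ in range(steps):
        k1 = rhs(y, k)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(y, k1)], k)
        k3 = rhs([a + 0.5 * h * b for a, b in zip(y, k2)], k)
        k4 = rhs([a + h * b for a, b in zip(y, k3)], k)
        y = [a + h * (b + 2 * c + 2 * d + e) / 6.0
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y

y = free_stream(k=1.0, tau_end=20.0)
power = [(2 * l + 1) * v * v for l, v in enumerate(y)]
peak_l = max(range(len(power)), key=power.__getitem__)
print(peak_l, sum(power), y[0] - math.sin(20.0) / 20.0)
```

The summed power $\sum_\ell (2\ell+1)\Delta_\ell^2$ stays at its initial value while the peak moves out to ℓ just below kτ, so the number of multipoles carried only needs to grow as the pulse advances; lmax must stay well above kτ or the truncation reflects power back down the hierarchy.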
The hypersurface choice should be the one it is easiest to make the com-
putations in. What often happens is that the equations suggest best variables
for numerical reasons which pick out the gauge choice. The perturbed source
function was derived in the comoving gauge of the baryons, then transformed to
other gauges. Even with evolution in the synchronous gauge at the beginning of
a CDM-dominated evolution, after photon decoupling when the radiation free-
streams, a combination of the photon distribution function and metric variables
is suggested which turns out to be the Tolman combination eνL Tγ in the lon-
gitudinal gauge. Quantities which are manifestly invariant under infinitesimal
coordinate transformations are also usually numerically preferred, such as the
photon entropy per baryon and differences in velocities (the most obvious of
which is velocity relative to CDM). A crucial one for numerical accuracy is the
relative velocity of the photons and baryons.
5 The continuum limit of the moment hierarchy in ℓ (Appendix C.2.1, eqs. (333), (334))
gives a wave equation for $D(k\tau, Q) = \Delta^{(S)}_{t\ell}$ in the variables $k\tau$ and $Q = \ell + 1/2$ with the wave
solution $Q^{-1/2}\,\mathrm{fn}(Q - k\tau)$ which conserves the power $\sum_\ell (2\ell+1)|\Delta^{(S)}_{t\ell}|^2 \sim \int \mathrm{fn}^2(q)\,dq$. Here
fn is a function describing the pulse whose form is determined by the power that is injected
at the base of the hierarchy, through the sources acting on ℓ = 0, 1, 2. The $Q^{-1/2}$ prefactor
is a little more complicated with curvature, but the pulse description is still valid,
$D \propto \mathrm{fn}(kd_{curv}\arcsin(Q/kd_{curv}) - k\tau)$ for closed models, $D \propto \mathrm{fn}(kd_{curv}\,\mathrm{arcsinh}(Q/kd_{curv}) - k\tau)$
for open models. Thus it is the combination $Q/(kR(\chi))$ which is relevant for propagation of
the ℓ-space pulse, where $R(\chi)$ is defined by eq. (130).
Figure 19: The synchronous gauge evolution of scalar perturbations with the
4 wavenumbers shown for the standard CDM model with normal recombination
illustrates such basic physical phenomena as Hubble drag on the CDM
perturbation growth in the er-dominated regime after the wave “enters the
horizon”, Silk damping of the baryon and photon perturbations, and the catch-up
of the baryons to the CDM after photon decoupling. k = 1 h−1 Mpc is about
the highest k one needs to go to get an accurate computation of $C_\ell$ for this
model. Even so, by z = 100 one needs to follow multipoles up to ℓ = 460, and
the number of photon ODEs is twice this because of the polarization. After
free-streaming to z = 0, one needs to go to ℓ about 6000. Although this is easiest
to do with the one step free-streaming method, it is also quite feasible to do
the full Boltzmann equation integration numerically. For relativistic neutrinos,
modes only up to ℓ = 40 were included, but once they exert a negligible effect
on the metric variables they are shut off.
Figure 20: The gauge invariant relative velocity potentials $\bar a^{-1}\Psi_{v,\gamma B}$,
$\bar a^{-1}\Psi_{v,B\,cdm} \equiv \Psi_{v,BS}$ and the photon entropy per baryon perturbations are
shown for the standard CDM model with normal recombination for the 4
wavenumbers of the last figure. They are all normalized to the amplitude
of the CDM density fluctuation at the current time if linear growth prevailed.
The synchronous gauge metric variables $\bar a^{-1}\Psi_{\sigma S} \equiv \bar a^{-1}\Psi_{v,cdm,L}$ and $\dot\varphi_S$ and
the comoving curvature parameter $\varphi_{com}$ are also shown: $\dot\varphi_S$ becomes negligible
and $\bar a^{-1}\Psi_{\sigma S} \propto \tau$ at late times. The velocity potentials and $\dot\varphi$ are in units used
for the Boltzmann integration code, so the relative magnitudes are meaningful.
(The physical and conformal time units of the code are $ct_u$ = 22.28 Mpc,
$\tau_u$ = 1280 yr, $t_u/\tau_u = (a_0)_u$ = 56776.)
In figs. 19 and 20, a few scalar perturbation mode functions are shown:
fig. 19 shows relative density perturbations as computed in the synchronous
gauge, while fig. 20 shows some gauge invariant velocity potentials $\Psi_{v,\gamma B}$, $\Psi_{v,B\,cdm}$
and the metric variables $\dot\varphi$ and $\Psi_{\sigma,S}$. The behavior of the relative density
perturbations in the longitudinal gauge outside the horizon is dramatically different
(see eq. (261)):
$$ \frac{\delta_{type,L}}{3(1+\bar p/\bar\rho)_{type}} = \frac{\delta_{type,S}}{3(1+\bar p/\bar\rho)_{type}} - \bar H\Psi_{\sigma,S} . \tag{203} $$
For τ ≪ τeq and kτ ≪ 1, H̄Ψσ,S is approximately constant, hence so are the

relative perturbations. At late times with kτ ≫ 1, the H̄Ψσ,S term is dominated by
$\delta_{type,S}$, so $\delta_{type,L} \approx \delta_{type,S}$. That is why one can compute transfer functions
for density perturbations in either gauge without hypersurface shifting. The
quantities $\delta_{s\gamma} = (1 + \bar p_\gamma/\bar\rho_\gamma)^{-1}\delta_\gamma - \delta_B$ (fig. 20) and $(1 + \bar p/\bar\rho)^{-1}_{type}\,\delta_{type} - \delta_{cdm}$ are
gauge invariant of course. The latter are useful for accurately following scalar
isocurvature CDM models (for small k).
A catalogue of mode functions with varying k is generated. Depending
upon the accuracy one wishes, anywhere from many hundreds to many thousands
of modes are typical for a CDM calculation. The output of the Einstein–Boltzmann
calculations is therefore $\Delta^{(S)}_{\{t,Q\},\ell}(k, \tau_0)$, even in open or closed FRW models,
where $k^2/\bar a^2$ is the eigenvalue of $-{}^{(3)}\nabla^2$. This allows one to form the
k-space spectra for given ℓ, $dC^{(S)}_{\{t,Q\},\ell}/d\ln k$, which, when integrated over ln k,
yield the spectra $C^{(S)}_{\{t,Q\},\ell}$. Figure 14 showed the standard CDM example for
ℓ = 4, 10, 59, 121. The ℓ = 59, 121 cases have been averaged over nearby ℓ’s,
from ℓ − δℓ to ℓ + δℓ, to smooth out the dominant rapid oscillation associated
with the typical $j_\ell^2(k\tau_0)$ behavior. If one has sparse k coverage, just a few hundred
logarithmically spaced from $(10^{-7}\,h^{-1}\,\mathrm{Mpc})^{-1}$ to $(1\,h^{-1}\,\mathrm{Mpc})^{-1}$, then
δℓ should not be too small. With many thousand, little smoothing is needed.
Another approach to smoothing is to wait until the ln k integration has been
done. Too much smoothing lowers the heights of the Doppler peaks, too little
leaves high frequency oscillations in $C_\ell$.
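The sensitivity of the ln k integration to the sampling density can be seen in a toy integrand. The sketch below is an assumption-laden illustration, not the calculation behind fig. 14: it replaces $dC_\ell/d\ln k$ by the bare $j_2^2(k\tau_0)$ oscillation with a flat weight, for which $\int_0^\infty j_\ell^2(x)\,d\ln x = 1/[2\ell(\ell+1)]$, and compares a sparse and a dense logarithmic grid using the trapezoid rule. The grid limits and the small-x series switch are choices made here.

```python
import math

def j2(x):
    """Spherical Bessel j_2; a short series avoids cancellation at small x."""
    if x < 0.1:
        return x * x / 15.0 * (1.0 - x * x / 14.0)
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - 3.0 * math.cos(x) / x**2

def integrate_lnk(n_k, x_min=1e-3, x_max=200.0):
    """Trapezoid rule in ln k for int j_2(k tau0)^2 d ln k with n_k log-spaced samples."""
    lo, hi = math.log(x_min), math.log(x_max)
    step = (hi - lo) / (n_k - 1)
    total = 0.0
    for i in range(n_k):
        weight = 0.5 if i in (0, n_k - 1) else 1.0
        total += weight * j2(math.exp(lo + i * step)) ** 2
    return total * step

coarse = integrate_lnk(30)      # sparse k coverage
fine = integrate_lnk(20000)     # dense k coverage
print(coarse, fine, 1.0 / 12.0)
```

For higher ℓ the oscillation in k is faster relative to the width of the integrand, which is the regime where averaging over neighbouring ℓ’s or a denser k grid becomes necessary.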
Figure 21(a) shows the differential spectrum $dC^{(S)}_{t,\ell}/d\ln k$ for the quadrupole
and a window, corresponding to typical half-degree-beam anisotropy experi-
ments, that probes the same k-band as many large scale structure observa-
tions. The no-recombination CDM model has very little power at ` = 214 as
expected. The nonzero Λ and standard CDM models look similar except for
a shift to smaller k for the nonzero Λ model associated with τ0 being larger.
Notice that the quadrupole probes k’s whose wavelength exceeds the size of our
Hubble patch, although unless the power spectrum is rising rapidly to small k,
waves with kτ0 < 1 contribute very little to the observed quadrupole, and even
less to the octopole and higher multipoles. Still it is this behavior which allows
one to set useful constraints on “fluctuations bigger than the horizon”.
Figure 22 shows where the polarization power, $C^{(S)}_{Q,\ell}$, lies in ℓ-space for
scalar modes when there is standard recombination and early reionization.
Figure 21(b) shows $dC^{(S)}_{Q,\ell}/d\ln k$, i.e., where the polarization power lies in
k-space, for the ℓ choices of fig. 14. The polarization is a 10% effect in ∆T /T
[134]. The polarization power spectrum can be used, for example, to make
theoretical polarization maps, [88] and Appendix C.1. [166] has shown that
Figure 21: (a) $dC_\ell/d\ln k$ for the scale invariant models listed, for the quadrupole
and an ℓ ∼ 200 multipole that lies within the MAX and MSAM windows, which
probes k extending into the large scale structure region. The vertical lines are
defined by $k^{-1} = 2cH_0^{-1}$ and $\pi k^{-1} = 2cH_0^{-1}$, when half a wavelength equals
the horizon size. The extension beyond this line is what one means when one
says that CMB data can constrain fluctuations bigger than our horizon: a huge
increase would be ruled out by the quadrupole observations. (b) How the
polarization power is concentrated in k for selected multipoles. The contribution
at low ℓ for the SR model is negligible.
Figure 22: Polarization power spectra for the models shown demonstrate that
over a limited multipole band the polarization power has signals about 10% of
the primary signal. As experimental noise decreases, it can provide a signature
for early reionization models.
there is a small but interesting cross-correlation between polarization and total
anisotropy maps which may be useful in differentiating among polarization
components. Given the strides in decreasing receiver noise, it seems quite plausible
that the 10% effect (on selected angular scales) can be used to differentiate
among models; in particular, it could provide a nice signature for early reionization
models since the polarization power is concentrated at relatively low ℓ’s,
around ℓ ∼ 10–50, whereas it is a small angle signature with normal recombination.
Of course, the presence or absence of a Doppler peak in $C^{(S)}_{t,\ell}$ (upper
curves) is a more direct signature, but the more signals we have to select on
primary anisotropies the better.
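As we have seen, after Compton scattering has become negligible as a
source, at say $\tau_s(k)$, the solution to the radiative transfer problem for the
Tolman combination $\tilde\Delta^{(S)}_t$ is
$$ \tilde\Delta^{(S)}_t(\hat q, k, \tau_0) = e^{-\zeta_C(\tau_0|\tau_s(k))}\, e^{-i\hat k\cdot\hat q\,k(\tau_0 - \tau_s(k))}\, \tilde\Delta^{(S)}_t(\hat q, k, \tau_s(k)) + \int_{\tau_s(k)}^{\tau_0} d\tau\, e^{-\zeta_C(\tau_0|\tau)}\, e^{-i\hat k\cdot\hat q\,k(\tau_0 - \tau)} \left[\dot\nu - \dot\varphi + \frac{\partial^2(\bar a^{-1}\Psi_\sigma)}{\partial\tau^2}\right] . \tag{204} $$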
The first term represents the free-streaming of the temperature pattern at $\tau_s$ to
$\tau_0$. The second term, involving a line-of-sight integral of $(\dot\nu - \dot\varphi + \partial^2(\bar a^{-1}\Psi_\sigma)/\partial\tau^2)$,
is the integrated Sachs–Wolfe effect. This term vanishes for standard $\Omega_{nr} = \Omega = 1$
universes, provided we take $\tau_s \gg \tau_{dec}, \tau_{eq}$. The classical $\Phi_N/3$ Sachs–Wolfe
factor (where $\Phi_N$ is the Newtonian gravitational potential perturbation)
is easiest to see in the synchronous gauge: $\nu_S = 0$ defines the gauge, $\dot\varphi_S \to 0$ and
$\bar a^{-1}\Psi_\sigma \to \tfrac{1}{3}\Phi_N\tau$ for $\tau_s \gg \tau_{eq}$, hence the $\partial_\tau[\bar a^{-1}\Psi_\sigma]$ term in $\tilde\Delta^{(S)}_t$ gives $\Phi_N/3$.
Of course, photon bunching and Doppler effects also have small influences even
at low ℓ. Figure 23 contrasts the shape of the spectrum that would result if
only $\Phi_N/3$ contributed with the exact result. The effective tilt in $C^{(S)}_\ell$ for this
$n_s = 1$ case is $\nu_{\Delta T} = 0.15$, not 0 as the $\Phi_N/3$ approximation would give.
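The synchronous gauge argument can be written out in two lines. The following is a sketch that only restates the limits quoted in the text, and it assumes a time-independent $\Phi_N$ for $\tau \gg \tau_{eq}$ in the $\Omega_{nr} = \Omega = 1$ case:

```latex
\begin{align}
\nu_S = 0, \quad \dot\varphi_S \to 0, \quad \bar a^{-1}\Psi_\sigma \to \tfrac{1}{3}\Phi_N\tau
 &\;\Longrightarrow\;
 \partial_\tau(\bar a^{-1}\Psi_\sigma) \to \tfrac{1}{3}\Phi_N , \qquad
 \partial_\tau^2(\bar a^{-1}\Psi_\sigma) \to 0 , \\
\tilde\Delta^{(S)}_t - \Delta^{(S)}_t
 &\to \tfrac{1}{3}\Phi_N - \hat q^i\partial_i\left(\tfrac{1}{3}\Phi_N\tau\right) , \\
\dot\nu - \dot\varphi + \partial_\tau^2(\bar a^{-1}\Psi_\sigma) &\to 0 .
\end{align}
```

The first line supplies the $\Phi_N/3$ term, and the last line shows why the line-of-sight integral in eq. (204) drops out in this limit.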
The integrated Sachs–Wolfe effect is important if we try to take τs (k) too
close to τeq or if the gravitational potential changes as a result of a change in
the equation of state of the Universe, for example the period between τΛ , when
vacuum energy becomes important, and $\tau_0$. For an initially scale invariant
spectrum it causes an upturn in $C^{(S)}_\ell$ at low ℓ, as shown in fig. 23.
Not only do open models have a nontrivial integrated Sachs–Wolfe effect,
there is also a direct effect of the curvature on the mode function evolution,
as is described in section C.4. Of course whatever mechanism generated the
ultra-large-scale mean curvature may well have had associated with it strong
fluctuations on observable scales, so much so that this is an argument against
large mean curvature because of the absence of such effects in the CMB. Even if
the background curvature is determined by an entirely different mechanism, it
should influence the fluctuation generation mechanism. An open issue in open
models has always been what is a natural shape for the spectrum for k near
$d_{curv}^{-1}$. Power laws in $k d_{curv}$, $\sqrt{(k d_{curv})^2 - 1}$, etc. have often been adopted.
Figure 23: This illustrates the role of the integrated Sachs–Wolfe effect for
scalar perturbations when there is vacuum energy or when there is negative
curvature. The vacuum effect was first considered by Kofman and Starobinski
(1985). The h = 0.6 open model also has an integrated Sachs–Wolfe effect,
but uses a scale-invariant initial condition from inflation which is naturally
truncated at kdcurv = 1; by contrast, an h = 0.55, Ω = 0.6 model with the same
13 Gyr age turns down at low `. Also shown is a comparison between the C`
from the full Einstein–Boltzmann transport and the C` found using just 1/3 of
the current gravitational potential.
The case shown in fig. 23 has equal power per decade in the initial gravitational
potential power spectrum (or more correctly in Pϕcom ), that would arise if the
fluctuations were generated by quantum oscillations during an inflation epoch
subsequent to the mean curvature generation, for tilt νs ≈ 0 [245, 244, 305].
Spurred on by the promise of high precision space experiments [152, 154],
a considerable fraction of the CMB theoretical community with Boltzmann
transport codes compared their approaches and validated the results to ensure
subpercent accuracy [301]. An important byproduct of this was an emphasis on
speed, since one hopes to constrain a large multidimensional parameter space
with the anisotropy data. The important issues for methods based on solving
the moment equations are discussed in various places in these notes; although
the techniques were in place prior to the COBE discovery, to get the high
accuracy with speed has been somewhat of a challenge: e.g., if the number of
wavenumbers run is too small then smoothing is required, and this smooths
the C` curves, but to run the number needed to avoid smoothing is slow.
Although solving the hierarchy of moment equations became the standard
approach for evaluating the transport of photons and neutrinos, there are
alternatives. One can cast the entire problem of photon transport in terms of
integral equations in which the multipoles with ℓ > 2 are expressed as
history-integrals of metric variables, photon-bunching ($\Delta^{(S)}_{t,0}$), Doppler and polarization
(e.g., $\Delta^{(S)}_{t,2}$) sources; and the problem of neutrino transport, massive and massless,
can be cast into history-integrals of metric variables only. This approach
was used by [195, 134, 233, 261] for hot and warm dark matter to evaluate
moments that fed into the metric equations (eq. 350). It was used by Kaiser
[266] to evaluate photon polarization. If applied to just the integrated Sachs-
Wolfe term it can augment the one-step free-streaming result and allow one to
begin the free-streaming transport to the present shortly after recombination
without any loss of accuracy. It has now been used by Seljak and Zaldarriaga
[306] to develop an accurate and fast code for $C_\ell$ evaluation. One aspect of the
speedup is that since $C_\ell$’s do not change that rapidly with ℓ, one does not need
to evaluate the history-integrals ℓ by ℓ, whereas with the moment hierarchy the
$\Delta_\ell$’s are all coupled to each other.
6.5.2 Tensor mode transfer equations

For tensor perturbations, and for flat universes, we have seen that for wavenumber
k there are two independent tensor modes defined in terms of two transverse
traceless matrices, $E^{(T\{+,\times\})}_{ij}$ (satisfying $k^j E^{(T\{+,\times\})}_{ij} = 0$, $E^{(T\{+,\times\})}_{ij}\delta^{ij} = 0$).
The expansion of $h^{(TT)}_{ij}$ and the anisotropic stresses $\Pi_{type\,ij}$ in this basis
gives the $h^{(T\{+,\times\})}$ and $\pi^{(T)}_{type}$ mode functions, and the reduction of the
(G0 )ij dynamical Einstein equation to eq. (171), $\ddot h^{(T)} + 2\bar H\bar a\,\dot h^{(T)} + k^2 h^{(T)} = 16\pi G_N\,\bar a^2\,\bar p_{tot}\,\pi^{(T)}_{tot}$
(for the flat case).
The radiation field can also be expanded in these modes. The natural mode
e (T )
variables are ∆ {t,U,V,Q} in the expansion
X X X (T )
(T )
∆ij = w e (T ) E
∆
· E(µ) (T ) ik·x
Eij e ak(T ) + cc.
(µ) E
(µ) · E(µ)
(µ)=t,Q,U,V =+,× k
(205)
The E(µ) are the tensor product combinations of the polarization basis εs ,
eq. (185) and Appendix C, eq. (309). In the frame in which k̂ is the pole and
q̂ has polar coordinates (θ, φ), the E (T ) · E(µ) terms are proportional to either
cos(2φ) or sin(2φ) and functions of µ = cos(θ) that are at most quadratic, and
are given by eq. (372) of Appendix C.6. Apart from an overall sign, these are
the combinations first suggested by Polnarev [230] and which we used in [140]:
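$$ \Delta^{(T)}_t = -\tilde\Delta^{(T+)}_t\,(1 - \mu^2)\cos 2\phi - \tilde\Delta^{(T\times)}_t\,(1 - \mu^2)\sin 2\phi $$
$$ \Delta^{(T)}_Q = \tilde\Delta^{(T+)}_Q\,(1 + \mu^2)\cos 2\phi + \tilde\Delta^{(T\times)}_Q\,(1 + \mu^2)\sin 2\phi $$
$$ \Delta^{(T)}_U = \tilde\Delta^{(T+)}_U\,2\mu\sin 2\phi - \tilde\Delta^{(T\times)}_U\,2\mu\cos 2\phi . \tag{206} $$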
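The source functions in these modes, $\tilde G^{(T\{+,\times\})}_{\{t,U,V,Q\}\,SW}$, are also derived in
Appendix C.6:
$$ \tilde G^{(T)}_{t\,SW} = \tfrac{1}{2}\dot h^{(T)} , \qquad \tilde G^{(T)}_{\{U,V,Q\}\,SW} = 0 , \tag{207} $$
$$ \tilde G^{(T)}_{tC} = -\tau_C^{-1}(\tilde\Delta^{(T)}_t - \Upsilon^{(T)}) , \qquad \tilde G^{(T)}_{VC} = -\tau_C^{-1}\tilde\Delta^{(T)}_V , \tag{208} $$
$$ \tilde G^{(T)}_{QC} = -\tau_C^{-1}(\tilde\Delta^{(T)}_Q + \Upsilon^{(T)}) , \qquad \tilde G^{(T)}_{UC} = -\tau_C^{-1}(\tilde\Delta^{(T)}_U - \Upsilon^{(T)}) , $$
$$ \Upsilon^{(T)} \equiv \frac{3}{8}\int \tfrac{1}{2}\,d\mu'\,\left[\tfrac{1}{2}(1 - (\mu')^2)^2\,\tilde\Delta^{(T)}_t - \tfrac{1}{2}(1 + (\mu')^2)^2\,\tilde\Delta^{(T)}_Q + \tfrac{1}{2}(2\mu')^2\,\tilde\Delta^{(T)}_U\right] $$
$$ = \tfrac{1}{10}\tilde\Delta^{(T)}_{t0} + \tfrac{1}{7}\tilde\Delta^{(T)}_{t2} + \tfrac{3}{70}\tilde\Delta^{(T)}_{t4} + \tfrac{3}{5}\tilde\Delta^{(T)}_{Q0} - \tfrac{6}{7}\tilde\Delta^{(T)}_{Q2} + \tfrac{3}{70}\tilde\Delta^{(T)}_{Q4} . $$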
Recall from section 6.5.1 that in the scalar case only $\Delta^{(S)}_t$ and $\Delta^{(S)}_Q$ can be
excited, so two transfer equations are required [134]. In the tensor case, $\tilde\Delta^{(T)}_t$,
$\tilde\Delta^{(T)}_Q$ and $\tilde\Delta^{(T)}_U$ can be excited, but the source for $\tilde\Delta^{(T)}_Q + \tilde\Delta^{(T)}_U$ has only a
pure damping term, so the combination will be unexcited in the early universe
and will remain so, as will $\tilde\Delta^{(T)}_V$. Thus, the four Stokes radiative transfer
equations again reduce to two.
The back action on the gravity wave collisionless damping is from the
anisotropic stress for the photons,
$$ \pi^{(T)}_\gamma = 12\left(\tfrac{2}{15}\tilde\Delta^{(T)}_{t0} + \tfrac{4}{21}\tilde\Delta^{(T)}_{t2} + \tfrac{2}{35}\tilde\Delta^{(T)}_{t4}\right) , \tag{209} $$
with a similar contribution from extremely relativistic neutrinos. There will
also be a contribution from hot or warm dark matter in the er and semi-
relativistic phase.
The main features of the solution can be readily understood by writing the
transport equations in terms of a combination that only has ḣ(T ) as a source:
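$$ (\partial_\tau + ik\mu + \tau_C^{-1})\,[\tilde\Delta^{(T)}_t - \tilde\Delta^{(T)}_Q] = \tfrac{1}{2}\dot h^{(T)} , $$
$$ (\partial_\tau + ik\mu + \tau_C^{-1})\,[\tilde\Delta^{(T)}_Q] = \tau_C^{-1}\Upsilon^{T} , \tag{210} $$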
As in the scalar case, these equations are solved by expanding in Legendre

polynomials [140]. The polarization induced by the tensor mode is quite small
( 1%) [141]. To the extent that polarization and the small back action of the
anisotropic stress of the neutrinos and photons upon the gravity waves can be
neglected, the solution (for the flat background case) is simply
$$ \tilde\Delta^{(T)}_{t\ell} \approx \int_0^{\tau_0} d\tau\, e^{-\zeta_C(\tau)}\, j_\ell(k\chi)\, \tfrac{1}{2}\dot h^{(T)}(\tau) , \qquad \chi \equiv \tau_0 - \tau . \tag{211} $$
The $e^{-\zeta_C(\tau)}$ implies that waves that entered the horizon before decoupling will
not be able to develop anisotropy in $\tilde\Delta^{(T)}_t$ until after recombination, when the
gravity waves will have already decayed as a result of collisionless dispersion,
as embodied in the spherical Bessel function behavior of $h^{(T)}$.
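The Bessel-function decay of $h^{(T)}$ can be checked numerically in the simplest case. The sketch below is an illustration under assumptions that are not part of the full calculation: a flat, matter-dominated background with $\bar a \propto \tau^2$ (so $2\bar H\bar a = 4/\tau$ in conformal time), no anisotropic stress source on the right-hand side of eq. (171), and a fixed-step RK4 integrator. In that limit the regular solution is $h^{(T)} = 3 j_1(k\tau)/(k\tau)$.

```python
import math

def tensor_mode(k, tau_end, tau_start=0.05, steps=20000):
    """RK4 for h'' + (4/tau) h' + k^2 h = 0, started on the regular series solution."""
    x0 = k * tau_start
    h = 1.0 - x0**2 / 10.0 + x0**4 / 280.0      # series of 3 j_1(x) / x
    hdot = k * (-x0 / 5.0 + x0**3 / 70.0)       # its derivative with respect to tau

    def deriv(tau, h_val, hdot_val):
        return hdot_val, -(4.0 / tau) * hdot_val - k * k * h_val

    dt = (tau_end - tau_start) / steps
    tau = tau_start
    for _ in range(steps):
        a1, b1 = deriv(tau, h, hdot)
        a2, b2 = deriv(tau + 0.5 * dt, h + 0.5 * dt * a1, hdot + 0.5 * dt * b1)
        a3, b3 = deriv(tau + 0.5 * dt, h + 0.5 * dt * a2, hdot + 0.5 * dt * b2)
        a4, b4 = deriv(tau + dt, h + dt * a3, hdot + dt * b3)
        h += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        hdot += dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
        tau += dt
    return h, hdot

def analytic(k, tau):
    """Regular matter-era solution 3 j_1(k tau) / (k tau)."""
    x = k * tau
    return 3.0 * (math.sin(x) / x**2 - math.cos(x) / x) / x

h_num, hdot_num = tensor_mode(1.0, 10.0)
print(h_num, analytic(1.0, 10.0))
```

Once kτ exceeds a few, the amplitude falls roughly as $(k\tau)^{-2}$, which is why modes that entered the horizon well before decoupling contribute little to the integral in eq. (211).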
To go from the mode variable ∆ e (T ) to angular power spectra, one must
(µ)
take into account the angular dependence of E (T ) ·E(µ) . For ∆ e (T ) , the angular
t
power spectrum found by summing over k and polarizations is
(T ) e (T )
dCt` 1 2 k3 1 X ∆ t,`−2
= `(` + 1) 1 − 2 1+ 1 1
d ln k ` ` 2π 2 2 =+,× (1 − 2` )(1 + 2` )
e (T ) e (T ) 2
∆ t,` ∆ t,`+2
+2 1 3 + 1 3 . (212)
(1 − 2` )(1 + 2` ) (1 + 2` )(1 + 2` )
If we assume recombination is sudden at τ = τdec and use the eq. (211)
approximation, this reduces to the classical Abbott and Wise [225] formula for $C^{(T)}_{t\ell}$.
With the full $e^{-\zeta_C(\tau)}$ included, the approximation is a good one compared with
the results of the full integration. Typical solutions are shown in figs. 4.6 and
7. The decline at ℓ above ∼ 50 is due to the loss of gravity wave power by
decoupling. Note also the rise at ℓ = 2. Even though eq. (211) is simple in form,
the directional decay of $h^{(T)}$ implies that eq. (212) is needed even for $C^{(T)}_2$, hence the ratio
$r_{ts} = C^{(T)}_2/C^{(S)}_2$ needs numerical evaluation. The feedback of the anisotropic
stress in relativistic neutrinos and photons upon the gravitational wave evolution
does have a significant (∼ 20%) effect at ℓ ≳ 100, which is somewhat larger
for smaller $\Omega_{nr}/\Omega_{er}$, but by then the power has fallen off sufficiently that it is
unmeasurable. In fact, in fig. 7 the $C^{(T)}_\ell$ actually have both curves, with and
without anisotropic stress feedback, drawn, and on this scale one cannot see a
difference. For open universes, the influence of the gravitational radiation on
the spectrum extends to higher ℓ because of the angle-distance relation [305].
7 Connection with other cosmic probes of k-space
7.1 Density power spectra and characteristic scales
A byproduct of the linear perturbation calculations used to compute ∆T /T
is the transfer function for density fluctuations, which maps the initial density
fluctuation spectrum in the very early universe into the final post-recombination
one. From this, fluctuation spectra appropriate to the linear regime for the
density, velocity and gravitational potential can be constructed. Various (co-
moving) wavenumber scales determined by the transport of the many species
of particles present in the universe characterize these spectra. The most im-
portant of these for dark matter dominated universes is the scale of the horizon
at redshift Ωnr /Ωer when the density in nonrelativistic matter, Ωnr ā−3 , equals
that in relativistic matter, Ωer ā−4 ,
$$ k_{Heq}^{-1} = 5\,\Gamma_{eq}^{-1}\,h^{-1}\,\mathrm{Mpc} , \qquad \Gamma_{eq} = \Omega_{nr}h\,[\Omega_{er}/(1.68\,\Omega_\gamma)]^{-1/2} . \tag{213} $$
(It is defined by kτeq = π, where τeq is the conformal time at er/nr equality.)
In [231], we adopted the functional form:
$$ P_\rho(k) \propto k^{3+n_s(k)}\,\{1 + [ak + (bk)^{3/2} + (ck)^2]^p\}^{-2/p} , \tag{214} $$
$$ (a, b, c) = (6.4, 3.0, 1.7)\;\Gamma^{-1}\,h^{-1}\,\mathrm{Mpc} , \qquad p = 1.13 , $$
$$ \Gamma \approx \Gamma_{eq}\,e^{-(\Omega_B(1 + \Omega_{nr}^{-1}(2h)^{1/2}) - 0.06)} , $$
where Γ is an effective index. For Γ = 0.5, this accurately fits the linear power-spectrum
of the standard adiabatic CDM model with $\Omega_{nr} = 1$, h = 0.5 and
$\Omega_B = 0.03$ [134]. Although one does not expect that this fit will be highly
accurate if we change $\Omega_B$, and indeed the best-fit parameters a, b, c, p do vary
with $\Omega_B$ [134, 88], it is usually sufficiently accurate for large scale structure work
to use a simple $\exp[-2(\Omega_B - 0.03)]$ correction factor for modest $\Omega_B$ variations,
even if h varies [250]; a further improvement [251] occurs for low $\Omega_{nr}$ if the 2 is
replaced by $(1 + \Omega_{nr}^{-1}(2h)^{1/2})$, as indicated. (The oft-used $\Omega_B \to 0$ form given
in BBKS [232], Appendix G,
$$ P_\rho(k) \propto k^{3+n_s(k)}\,\frac{[\ln(1 + ek)]^2}{(ek)^2}\,\{1 + ak + (bk)^2 + (ck)^3 + (dk)^4\}^{-1/2} , $$
$$ (a, b, c, d, e) = (3.97, 16.4, 5.57, 6.85, 2.39)\;\tilde\Gamma^{-1}\,h^{-1}\,\mathrm{Mpc} , $$
$$ \tilde\Gamma = \Gamma_{eq}\,e^{-\Omega_B(1 + \Omega_{nr}^{-1}(2h)^{1/2})} , $$
is best fit by Γ = 0.53, and with the $\tilde\Gamma$ form the fits are at least as good as
the eq. (214) form [250]. The coefficients have been increased by $(2.728/2.70)^2$
over the BBKS values. For the standard CDM model, both transfer function
formulae fit to better than 3% to $k^{-1} = 1\,h^{-1}$ Mpc, with eq. (214) better over
the crucial large scale structure region.)
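Both fitting forms are simple enough to code directly. The sketch below transcribes eq. (214), the BBKS form and the Γ correction as quoted above; the function names, the argument `er_ratio` (standing for $\Omega_{er}/(1.68\,\Omega_\gamma)$) and the choice of a constant $n_s$ are assumptions made for illustration, and k is taken in units of h Mpc⁻¹. The overall normalization is left arbitrary, as in the text.

```python
import math

def gamma_eff(omega_nr, h, omega_b, er_ratio=1.0):
    """Gamma ~ Gamma_eq * exp{-[Omega_B (1 + sqrt(2h)/Omega_nr) - 0.06]}, Gamma_eq from eq. (213)."""
    gamma_eq = omega_nr * h / math.sqrt(er_ratio)
    return gamma_eq * math.exp(-(omega_b * (1.0 + math.sqrt(2.0 * h) / omega_nr) - 0.06))

def power_shape_214(k, gamma, ns=1.0):
    """Unnormalized P_rho(k) of eq. (214); k in h/Mpc, constant ns assumed."""
    a, b, c = (x / gamma for x in (6.4, 3.0, 1.7))  # lengths in Mpc/h
    p = 1.13
    s = a * k + (b * k) ** 1.5 + (c * k) ** 2
    return k ** (3.0 + ns) * (1.0 + s ** p) ** (-2.0 / p)

def power_shape_bbks(k, gamma, ns=1.0):
    """Unnormalized BBKS-form P_rho(k) with the rescaled coefficients quoted in the text."""
    a, b, c, d, e = (x / gamma for x in (3.97, 16.4, 5.57, 6.85, 2.39))
    poly = 1.0 + a * k + (b * k) ** 2 + (c * k) ** 3 + (d * k) ** 4
    log_term = math.log(1.0 + e * k) / (e * k)
    return k ** (3.0 + ns) * log_term ** 2 * poly ** -0.5

gamma = gamma_eff(omega_nr=1.0, h=0.5, omega_b=0.03)
for k in (0.01, 0.1, 1.0):
    print(k, power_shape_214(k, gamma), power_shape_bbks(k, 0.53))
```

Both forms reduce to the primordial $k^{3+n_s}$ law as k → 0, so they can be normalized to the same large-scale amplitude before comparison.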
To fit the APM angular correlation function using a power spectrum for
galaxies described by eq. (214) requires 0.15 ≲ Γ ≲ 0.3 [231] for $n_s = 1$ and
0.2 ≲ $n_s$ ≲ 0.6 for Γ = 0.53 [190, 249]. More generally, $dn_{\rho,eff}(k)/d\Gamma \approx 2$ over
the APM waveband, hence it is $\Gamma + \nu_s/2$ that should lie in the 0.15–0.3 range
[6]. A recent estimate of Γ using power spectra from redshift surveys as well
as from the APM data suggests Γ ≈ 0.23 fits best [250]. Figure 24 compares
the COBE-normalized $n_s = 1$, Γ = 0.5 linear density power spectrum with an
$n_s = 1$, Γ = 0.25 and an $n_s = 0.6$, Γ = 0.5 spectrum.
To lower Γ into the 0.15 to 0.3 range one can [233]: lower h; lower Ωnr ; or
raise Ωer (= 1.68Ωγ with the canonical three massless neutrino species present).
Raising ΩB also helps. Low density CDM models in a spatially flat universe (i.e.
with Λ > 0) lower Ωnr to 1 − ΩΛ . CDM models with decaying neutrinos raise
Ωer [233, 252]: Γ ≈ 1.08Ωnr h(1 + 0.96(mν τd /keV yr)2/3 )−1/2 , where mν is the
neutrino mass and τd is its lifetime. Decaying neutrino models have the added
feature of a bump in the power at subgalactic scales to ensure early galaxy
formation, a consequence of the large effective Ωnr of the neutrinos before they
decayed. As we saw in section 6.3, we expect a tilt in inflation models, so we
can probably relax the amount by which Γ needs to be lowered. One could do it
entirely by tilt by invoking one of the inflation models of section 6.3 utilizing a
deceleration parameter q ≈ −(ns + 1)/2 or, for natural inflation, the curvature
in ln H away from the peak of the potential, $(m_P^2/4\pi)\,\partial^2\ln H/\partial\phi^2 \approx (n_s-1)/2$.
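Generally, more scales are needed to characterize the spectrum than just
kHeq :
\begin{align}
k_{\nu damp}^{-1} &\approx 6\,(\Omega_{nr}\Omega_\nu (2h)^2)^{-1/2}\,(g_{m\nu}/2)^{1/2}\;h^{-1}\,{\rm Mpc} , \tag{215}\\
k_{Hrec}^{-1} &\approx 41\,(\Omega_{nr})^{-1/2}\;h^{-1}\,{\rm Mpc} , \tag{216}\\
k_{Silk}^{-1} &\approx 3.8\,(\Omega_{nr})^{-1/2}\;h^{-1}\,{\rm Mpc} , \tag{217}\\
k_{JBrec}^{-1} &\approx 0.0016\,(\Omega_{nr})^{-1/2}\;h^{-1}\,{\rm Mpc} , \tag{218}\\
k_{curv}^{-1} &\equiv d_{curv} \approx 3000\,|1-\Omega_{tot}|^{-1/2}\;h^{-1}\,{\rm Mpc} . \tag{219}
\end{align}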
These are: (215) the collisionless damping scale for hot dark matter (massive
neutrinos), with gmν the number of massive species (counting particle and
antiparticle); (216) the horizon scale at recombination; (217) the Silk damping
scale; (218) the baryon Jeans length at recombination (below which the baryon
power spectrum in CDM models is effectively filtered, multiplying the power by
approximately (1 + (k/kJBrec )2 /2)−2 ); and (219) the curvature scale for open
universes (in which case k is not exactly wavenumber).
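The scalings of eqs. (215)–(219) are easy to tabulate. The sketch below simply transcribes the five formulae; the function name and the convention of returning infinity when a scale does not apply (no massive neutrinos, or Ωtot = 1) are my own illustrative choices, not something taken from the text.

```python
import math


def characteristic_scales(omega_nr, h, omega_nu=0.0, g_mnu=2.0, omega_tot=1.0):
    """Return the inverse wavenumbers of eqs. (215)-(219) in h^-1 Mpc.

    Scales that are undefined for the chosen parameters (no hot dark
    matter, or a flat universe) are reported as infinity (an assumption
    made here for convenience).
    """
    root_nr = omega_nr ** -0.5
    if omega_nu > 0.0:
        nu_damp = (6.0 * (omega_nr * omega_nu * (2.0 * h) ** 2) ** -0.5
                   * math.sqrt(g_mnu / 2.0))
    else:
        nu_damp = math.inf
    if omega_tot != 1.0:
        curv = 3000.0 * abs(1.0 - omega_tot) ** -0.5
    else:
        curv = math.inf
    return {
        "nu_damp": nu_damp,         # eq. (215)
        "H_rec": 41.0 * root_nr,    # eq. (216)
        "Silk": 3.8 * root_nr,      # eq. (217)
        "JB_rec": 0.0016 * root_nr, # eq. (218)
        "curv": curv,               # eq. (219)
    }


if __name__ == "__main__":
    print(characteristic_scales(1.0, 0.5, omega_nu=0.3))
```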
One could try to mimic some of these effects on the power spectrum by
modifying Γ. In hot/cold hybrid models, there is a stable light neutrino of
mass mν contributing a density Ων = 0.3(mν /7.2 eV)(2h)−2 , combining with
the CDM and baryon densities to make a total Ωnr = 1. A Γ-shape is not
a very accurate representation of the entire spectrum, dropping from about
0.5 for small k to Γ ∼ 0.22(Ων /0.3)−1/2 over the band 0.04–2 (h−1 Mpc)−1 of
relevance to large scale structure calculations [233, 253, 231]. For pure hot
dark matter models, BBKS showed that a good fit is provided by a Γ-law –
with Ωer = 1.46Ωγ for one species of massive neutrino (hence Γ = 1.07Γcdm)
– but with an exponential filtering multiplying the Γ-form of $P_\rho^{1/2}$:
$D_\nu = \exp[-0.32(kR_{f\nu}) - (kR_{f\nu})^2]$, where $R_{f\nu} = 2.6(\Omega_\nu h)^{-1}\,h^{-1}$ Mpc. The damping
is dominated by the Gaussian part of the filter. If we define a characteristic
Gaussian filtering length by the radius at which the filtering function drops
to $1/e^2$, then this radius defines $k_{\nu damp}^{-1} = 1.1R_{f\nu}$. For the mixed hot/cold
models, the $P_\rho^{1/2}$ modification factor
\[
D_\nu = \left[\frac{1 + (Ak)^2 + (1-\Omega_\nu)^{\beta^{-1}}\,\Omega_{er}\,(Bk)^4}{1 + (Bk)^2 - (Bk)^3 + (Bk)^4}\right]^{\beta} ,
\]
\[
\beta = \frac{1}{4}\left(5 - \sqrt{25 - 24\Omega_\nu}\right) , \qquad B = 10.73\,\frac{1.14}{(\Omega_\nu + 0.14)} ,
\]
\[
A = 69.06\,\frac{(1 + 10.91\Omega_\nu)\sqrt{\Omega_\nu(1 - .9465\Omega_\nu)}}{(1 + (9.26\Omega_\nu)^2)}
\]
is quite accurate [255], even for finite ΩB [261].

For warm dark matter, Γ is the same as for the CDM model and a rough fit
to the influence of free-streaming is provided by the exponential damping factor
form $D_\nu = \exp[-kR_{fw} - (kR_{fw})^2]$, where $R_{fw} = 0.2\,(g_{w,dec}/100)^{-4/3}\,(\Omega_{warm}h)^{-1}\,h^{-1}$ Mpc,
where gw,dec is the effective number of particle degrees of freedom when the
warm particles decoupled, typically about 60–300 for minimal grand unified
theories over the range of decoupling temperature $T \sim 1$–$10^{18}$ GeV, and
$\Omega_{warm}h^2 = 1.0\,(g_{w,dec}/100)^{-1}\,(m_{warm}/{\rm keV})$.
Scales characterizing the CMB anisotropy power spectrum include $k_{Silk}^{-1}$,
$k_{curv}^{-1}$, and $k_{Hrec}^{-1}$ (above which causal processes cannot occur at the
recombination epoch). In addition, we have seen that $k_{LS}^{-1} \approx (5\text{–}10)\,\Omega_{nr}^{-1/2}\,h^{-1}$ Mpc,
the fuzziness of the last scattering surface below which destructive interference
damps CMB anisotropies, is very important. Associated with these physical
scales are angular scales $\theta_{LS} \approx (3'\text{–}6')\,\Omega_{nr}^{1/2}$ and $\theta_{Hrec} \approx 2^\circ\,\Omega_{nr}^{1/2}$, evaluated
using the angle-distance relation $\theta(d) = 0.95^\circ\,\Omega_{nr}\,d/(100\,h^{-1}\,{\rm Mpc})$ appropriate
for an Ω = Ωnr = 1 universe and for an Ω = Ωnr ≪ 1 universe.
7.2 The observable range in k-space

Figure 24 contrasts k-space filters FW (k) for representative CMB anisotropy
experiments (characterized by ℓ-space filters Wℓ ) with the bands in k-space
probed by large scale structure (LSS) observations and the bands associated
with the formation of collapsed structures such as clusters. The LSS probes
shown are: the angular correlation of galaxies wgg (θ); the power spectrum
and redshift space correlation function of galaxies ξgg (r) as probed by redshift
surveys; large scale streaming velocities LSSV; and the correlation function of
clusters ξcc . The abundances of clusters (“cls”) and groups (“gps”) provide
information on slightly smaller scales. Abundances of galaxies (and quasars and
damped Lyman alpha systems) at high redshift provide valuable information
on the power in higher k-bands, but these probes are sensitive to gas dynamical
processing which may obscure the hierarchical relationship between object
and primordial fluctuation waveband; indeed damping processes or tilted initial
spectra may require some of the shorter distance structure to arise from
fragmentation and other nongravitational effects.

Figure 24: Cosmic waveband probes. The bands of cosmic fluctuation spectra
probed by LSS observations are contrasted with the bands that current CMB
experiments can probe. The (linear) density power spectrum for the standard
ns = 1 CDM model, labelled Γ = 0.5, is contrasted with (COBE-normalized)
power spectra that fit the galaxy clustering data, one tilted (ns = 0.6, Γ = 0.5)
and the other scale invariant with a modified shape parameter (ns = 1, Γ =
0.25). Biasing must raise the spectra up (uniformly?) to fit into the hatched
wgg range and nonlinearities must raise it at k ≳ 0.2 h Mpc−1 to (roughly)
match the heavy solid (γ = 1.8) line. The solid data point in the cluster-band
denotes the constraint on the power spectrum from the abundance of clusters,
and the open data point at 10 h−1 Mpc denotes an estimate from streaming
velocities (for Ωnr = Ω = 1 models).
We can define a k-space filter FW (k) as one acting upon a k-space “power
spectrum for ∆T /T fluctuations” P∆T (k):
\[
F_W(k) = P_{\Delta T}^{-1}\,\mathcal{I}\!\left[ W_\ell\,\frac{dC_\ell}{d\ln k} \right] , \qquad
P_{\Delta T}(k,\tau) = \mathcal{I}\!\left[ \frac{dC_\ell(k,\tau)}{d\ln k} \right] , \tag{220}
\]
with $dC_\ell/d\ln k$ evaluated at some post-decoupling time τ . P∆T defined in this
way is basically conserved through free streaming [88] and $\int P_{\Delta T}\,d\ln k$ gives
the total anisotropy power. (The choice of P∆T is really up to the theorist;
e.g., a filter acting on the primordial gravitational potential power spectrum
can be constructed by choosing P∆T = PΦN .) In [88], we showed that a rather
good approximation to Cℓ is obtained by putting the P∆T (k) of eq. (220) in
place of $P_{G_0 G_0}(k)$ in eq. (125) and δ(τ − τs ) in place of V, where τs is the
time at which P∆T (k, τs ) is evaluated (which should be well after decoupling).
The k-filters actually plotted in the figure use the high ℓ approximation for the
Bessel function product:
\[
F_W(k) = \sum_{\ell < k\chi_s - \frac12} W_\ell\,\frac{(\ell + \frac12)}{k\chi_s\sqrt{(k\chi_s)^2 - (\ell + \frac12)^2}} . \tag{221}
\]
The filters shown in fig. 24 are for the large angle dmr and firs experiments, with
beams ∼ 7◦ and ∼ 3.9◦ , two filters for the sk95 experiment, showing the k-space
that it covers, and the Caltech OVRO ov7 (1.8′ beam) experiment. max and
msam cover ranges between the sk95 points. CAT [151], WhiteDish [102] and a
new OVRO (7′ beam) experiment (ov22) cover the region between sk95 and ov7.
These experiments are all sensitive to primary anisotropies. The line labelled
SR shows the length scale below which the primary power is basically erased if
hydrogen recombination is standard; NR denotes the scale if there is an early
injection of energy which ionizes the medium. These depend upon Ωnr , ΩB etc.
The light long-dashed filters at high k show the bands probed by very small
angle microwave background experiments, the VLA, the SCUBA array on the
sub-mm telescope JCMT, and the OVRO mm-array. Although their beams are
too small to see primary CMB anisotropies, they will provide invaluable probes
of secondary anisotropies generated by nonlinear effects, including redshifted
dust emission from galaxies and Thomson scattering from nonlinear structures
in the pregalactic medium.
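The sum in eq. (221) is straightforward to evaluate once an ℓ-space window is given. The following is a minimal sketch: the function name is hypothetical, the window is passed as a plain dictionary from ℓ to Wℓ, and any value of the distance χs supplied by the caller (e.g., roughly 6000 h−1 Mpc for an Ω = 1 universe) is an input assumption rather than something computed here.

```python
import math


def k_space_filter(k, chi_s, window):
    """Evaluate the high-ell approximation of eq. (221).

    k      : wavenumber, in the inverse of the units used for chi_s
    chi_s  : comoving distance to the surface where P_dT is evaluated
    window : dict mapping multipole ell -> W_ell
    Only multipoles with ell + 1/2 < k * chi_s contribute.
    """
    x = k * chi_s
    total = 0.0
    for ell, w_ell in window.items():
        half = ell + 0.5
        if half < x:
            total += w_ell * half / (x * math.sqrt(x * x - half * half))
    return total


if __name__ == "__main__":
    # Illustrative top-hat window over 50 <= ell <= 100 (not a real experiment).
    window = {ell: 1.0 for ell in range(50, 101)}
    for k in (0.005, 0.02, 0.05):
        print(k, k_space_filter(k, 6000.0, window))
```

Because each multipole only contributes for k > (ℓ + 1/2)/χs, a window concentrated at high ℓ automatically maps to a filter concentrated at high k.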
The (linear) density fluctuation power spectra (actually their square roots,
$P_\rho^{1/2}(k)$) shown in fig. 24 are for three (Ωnr = 1) models normalized to the
COBE dmr data (i.e., within the small-k hatched region which includes the
8% dmr error on overall amplitude): a standard Γ = 0.5 CDM model with
ns = 1, one with the spectrum tilted to ns = 0.6, and an ns = 1 CDM model
whose shape is characterized by Γ = 0.25. To fit the galaxy clustering data
requires 0.15 ≲ Γ + νs /2 ≲ 0.3. A biasing factor bg is relied upon to move the
curves up into the allowed wgg band (i.e., into the higher-k hatched region) and
nonlinearities to bend the shape upward to match the (approximate) γ = 1.8 law
for k −1 < 5 h−1 Mpc (heavy line extending the hatched wgg region). Power
spectra derived from the QDOT [235], IRAS 1.2 Jansky [236] and CfA2 [237]
redshift surveys are compatible with the range inferred from wgg when account
is taken of redshift space distortions and biasing offsets between IRAS and
optically identified galaxies. Cluster–cluster correlations and galaxy–cluster
cross correlations [240] also seem to be compatible with this inferred spectrum.
Power spectrum estimates derived from the abundance of clusters as a function
of temperature [120] and from the Mark III peculiar velocity catalogue [298]
are also shown. There is a lesson to draw from an overview figure like this
while we concentrate on the LSS power issue that has led to intense research
on variations in the scale-invariant minimal-CDM theme for many years: the
great success inherent in the extrapolation over so many decades from COBE
normalization to large and small scale structure formation suggests that scale
invariance cannot be wildly broken and nonminimality cannot be too extreme,
even if the generation mechanism has nothing to do with inflation.
7.3 Relating the cluster-amplitude σ8 and the dmr band-power

Apart from the shape parameters for Pρ (k), there is also an overall amplitude
parameter, which we now take to be $\langle C_\ell^{(S)}\rangle_{dmr} = \langle C_\ell\rangle_{dmr}/(1 + \tilde r_{ts})$, where
$\tilde r_{ts} = \langle C_\ell^{(T)}\rangle_{dmr}/\langle C_\ell^{(S)}\rangle_{dmr}$. The band-powers obtained from the 4-year dmr
data as a function of the phenomenological slope ν∆T for each frequency channel
and for the 53+90+31 A+B GHz map were given in section 4.5. The effective
slope of the standard ns = 1 ΩB = 0.05 CDM model of figs. 7, 23 is ν∆T ≈ 0.15
over the dmr band; variation in ΩB and H0 does not change this very much
as fig. 8(b,c) shows; nor does a change in the recombination history (fig. 7).
Vacuum-dominated models do raise the slope to low ℓ because of the
time-dependence of the gravitational potential [110]: fig. 23 shows it is not well
represented by a single power law, but if we were forced to choose an effective
index it would be ν∆T ≈ 0.
Figure 25: This illustrates the accuracy and utility of the fitting formula for
σ8 . Top left shows the average and ±1σ variation of σ8 against tilt for Ωnr = 1
CDM models, with no gravity waves (upper) and with them. A reduction factor
for hot/cold hybrid models is also given. The heavy closed data points are σ8 ’s
derived using the exact Cℓ . The two vertical lines denote two estimates of σ8
from clusters. Upper right shows σ8 (Ωmν ) for ns = 1 (upper) and 0.85 hot/cold
models. Open circles shifted left of the dmr points are σ8 ’s for the sp94 data,
open squares for the sk93+94 data. The lower panels show σ8 (h) for a sequence
with fixed age, no mean curvature, and Ωvac (h) = 1 − Ωnr , the rising curve.
The solid dropping curve is Ωnr h and the almost indistinguishable dashed one
is Γ, the error bars defining its likely range. The rising hatched regions are the
two cluster σ8 estimates.

Before the COBE detection, normalization of the density spectrum was done
using σ8 , the rms (linear) mass density fluctuations on the scale of 8 h−1 Mpc,
or using a biasing factor bg for galaxies, which was usually assumed to obey
bg σ8 ≈ 1, e.g., [134, 243]. The COBE-normalized value of σ8 is thus
extremely important for deciding on viability of any specific model of cosmic
structure formation. Bayesian determinations of σ8 from the dmr data for a
number of selected models can be used to calibrate a more general relation
between σ8 and the dmr band-power, following [231, 190, 6]. Although we have
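seen in fig. 23 that the naive Sachs–Wolfe formula with $dC_\ell/d\ln k = 2\ell(\ell+1)\,P_\Phi(k)\,j_\ell^2(k\chi_{dec})/9$
is not particularly good for the standard CDM model, we
can use the scalings predicted by PΦ to parameterize the $\langle C_\ell\rangle_{dmr}^{1/2}/\sigma_8$ relation:
\[
\Gamma\text{-law:}\quad \sigma_8 \approx \frac{1.25\,\bigl(10^5\langle C_\ell\rangle_{dmr}^{1/2}\bigr)\,\Omega_{nr}^{-0.77}\,(2(\Gamma - 0.03))\,e^{2.63\nu_s}}{f_{SW}\,(1+\tilde r_{ts})^{1/2}\,(1 + 0.55(\Omega_\nu/0.3)^{1/2})} , \tag{222}
\]
\[
\text{4yr(53\&90\&31)a\&b:}\quad 10^5\langle C_\ell\rangle_{dmr}^{1/2} \approx \left[0.82 + 0.26\left(1 - \frac{\nu_{\Delta T}}{2}\right)^{2.8}\right] \times 1^{+.07}_{-.06} ,
\]
\[
f_{SW} \approx (1 + 0.12\,\Omega_{vac})(1 + \Omega_{vac}^{10}) , \qquad \nu_{\Delta T} \approx 0.15(1 - \Omega_{vac}) + \nu_s ,
\]
\[
\tilde r_{ts} \approx 5.4\,\frac{(-\nu_t)}{(1 - \nu_t/2)}\,e^{-0.07\nu_t}\,e^{-1.99(\nu_s - \nu_t)}\,(1 - 0.6\,\Omega_{vac}^{3.5}) .
\]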
Here νs ≡ ns − 1. The fit was originally made in ns with Γ fixed at 0.5, and
in Γ with ns fixed at 1, but it works well even when both vary significantly
from these standard values. The Ωvac = 0 formulae are the same as I gave in
[6, 146]. How well the fitting formula does is shown in fig. 25. As expected,
this calibration works well for the H0 variation of fig. 8(c) and, using
$\Gamma = \Gamma_{eq}\,e^{-[\Omega_B(1+\Omega_{nr}^{-1}(2h)^{1/2}) - 0.06]}$, works well for the cases of fig. 8(b).
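A direct transcription of the fitting formula, eq. (222), makes it easy to reproduce the numbers quoted in this section. The sketch below is illustrative only: the function names are hypothetical, only the central value of the 4-year band-power fit is used (the +.07/−.06 error factor is dropped), and the exponent in fSW is read as Ωvac to the tenth power.

```python
import math


def dmr_amplitude(nu_dt):
    """Central value of 10^5 <C_ell>^(1/2)_dmr from the 4-year fit."""
    return 0.82 + 0.26 * (1.0 - nu_dt / 2.0) ** 2.8


def tensor_to_scalar(nu_s, nu_t, omega_vac=0.0):
    """r_ts fit of eq. (222); vanishes when there is no tensor tilt."""
    return (5.4 * (-nu_t) / (1.0 - nu_t / 2.0)
            * math.exp(-0.07 * nu_t) * math.exp(-1.99 * (nu_s - nu_t))
            * (1.0 - 0.6 * omega_vac ** 3.5))


def sigma8(gamma, n_s, nu_t=0.0, omega_nr=1.0, omega_vac=0.0, omega_nu=0.0):
    """COBE-normalized sigma_8 from the Gamma-law fit, eq. (222)."""
    nu_s = n_s - 1.0
    nu_dt = 0.15 * (1.0 - omega_vac) + nu_s
    f_sw = (1.0 + 0.12 * omega_vac) * (1.0 + omega_vac ** 10)
    r_ts = tensor_to_scalar(nu_s, nu_t, omega_vac)
    hot_cold = 1.0 + 0.55 * math.sqrt(omega_nu / 0.3)
    numerator = (1.25 * dmr_amplitude(nu_dt) * omega_nr ** -0.77
                 * 2.0 * (gamma - 0.03) * math.exp(2.63 * nu_s))
    return numerator / (f_sw * math.sqrt(1.0 + r_ts) * hot_cold)


if __name__ == "__main__":
    print("standard CDM:", sigma8(0.5, 1.0))
    print("n_s = 0.76 with nu_t = nu_s:", sigma8(0.5, 0.76, nu_t=-0.24))
    print("n_s = 0.60, no tensors:", sigma8(0.5, 0.60))
```

With Γ = 0.5 and ns = 1 this gives σ8 ≈ 1.2, and it drops to just under 0.5 for ns ≈ 0.76 with νt = νs or for ns ≈ 0.60 without tensors, in line with the values discussed in the text.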
Although fSW = 1 takes into account some of the enhancements over the
naive Sachs–Wolfe formula by normalizing to the calculated $\sigma_8$–$\langle C_\ell\rangle_{dmr}^{1/2}$ relation
for standard CDM, it does not take into account the enhancement of $\langle C_\ell\rangle_{dmr}^{1/2}$
associated with the time dependence of the gravitational potential when Λ
dominates, hence for that case we expect fSW to exceed unity. Using the Ωvac
dependences of fSW , ν∆T and r̃ts (section 6.4) allows good σ8 fits, as the lower
panels of fig. 25 show. All models shown have ΩB h2 =0.0125, with the rest of
the nr-matter in cold dark matter (Ωnr = Ωcdm + ΩB ). For these sequences
of models with a uniform age t0 , the variation of Ωvac with Hubble parameter
(the rising curve in fig. 25) is
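\[
h = h_1\,\Omega_{nr}^{-1/2}\,\frac{\ln\!\left[\sqrt{\Omega_{vac}/\Omega_{nr}} + \sqrt{\Omega_{vac}/\Omega_{nr} + 1}\,\right]}{\sqrt{\Omega_{vac}/\Omega_{nr}}} , \qquad h_1 \equiv 0.5\,(13\,{\rm Gyr}/t_0) ,
\]
\[
\Omega_{vac}(h) \sim 0.9\,\bigl(0.3(h/h_1 - 1)^{0.3} + 0.7(h/h_1 - 1)^{0.4}\bigr) . \tag{223}
\]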
The latter is a rough inversion. The ages shown in fig. 25 bracket a recent
estimate for globular cluster ages, $14.6^{+1.7}_{-1.6}$ Gyr [111]. The Ωvac = 0 model
with 13 Gyr age is therefore the H0 = 50 standard CDM model, and H0 = 43
for the 15 Gyr age.
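The fixed-age relation of eq. (223) and its rough inversion can be checked numerically. This is a small sketch with hypothetical function names; it assumes a spatially flat model with Ωnr = 1 − Ωvac, as in the sequences of fig. 25.

```python
import math


def hubble_for_fixed_age(omega_vac, t0_gyr=13.0):
    """h as a function of Omega_vac at fixed age t0, first line of eq. (223)."""
    h1 = 0.5 * 13.0 / t0_gyr
    if omega_vac == 0.0:
        return h1  # the x -> 0 limit of ln[x + sqrt(x^2 + 1)] / x is 1
    omega_nr = 1.0 - omega_vac
    x = math.sqrt(omega_vac / omega_nr)
    return h1 * omega_nr ** -0.5 * math.log(x + math.sqrt(x * x + 1.0)) / x


def omega_vac_rough(h, t0_gyr=13.0):
    """Rough inversion, second line of eq. (223); zero for h <= h1."""
    h1 = 0.5 * 13.0 / t0_gyr
    y = h / h1 - 1.0
    if y <= 0.0:
        return 0.0
    return 0.9 * (0.3 * y ** 0.3 + 0.7 * y ** 0.4)


if __name__ == "__main__":
    for ov in (0.0, 0.3, 0.66, 0.8):
        h = hubble_for_fixed_age(ov)
        print(ov, round(h, 3), round(omega_vac_rough(h), 3))
```

For a 13 Gyr age, Ωvac = 0.66 corresponds to h ≈ 0.70, and the inversion recovers Ωvac to within a few hundredths there.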
For open CDM models, the COBE-determined σ8 goes down with decreas-
ing Ω (and increasing h). These models are not so attractive because Ω drops
so precipitously with increasing h for fixed age. Equation (222) has not been
modified to treat open models (see e.g., [292]).
Section 4.6 showed that different combinations of cosmological parameters
can lead to sufficiently similar spectra that it will be quite an experimental
challenge to differentiate among them [144]. In the near term, we must rely on
such important ratios as $\sigma_8/\langle C_\ell\rangle_{dmr}^{1/2}$ and the shape of the galaxy correlations to
further constrain cosmological parameter space. We can also hope to constrain
parameters through observations of galaxies at high redshift and by large scale
streaming velocities. As is evident from fig. 24, to have a COBE-normalized
power spectrum pass through the error bars associated with the power spectrum
from cluster abundances on the scale k ∼ 0.2 h Mpc−1 and the LSSV estimate
at k ∼ 0.1 h Mpc−1 [297, 298], and to satisfy the shape restriction, albeit with a
free galaxy biasing factor bg , is like threading the eye of a needle, and clearly
severely restricts the range of models. Much discussion in the post-COBE era
has been about which COBE-normalized models pass these tests. We now
consider a few examples of the use of eq. (222) in conjunction with the current
large scale structure data. As an illustration here I will consider the shape and
cluster constraints and, to a lesser extent, the LSSV constraint, on models of
fixed age with variable tilt and Ωvac . In fig. 25, the dashed curve shows the
power spectrum shape parameter Γ, almost indistinguishable from the Ωnr h
curve, for the 13 and 15 Gyr model sequence. The rising curves with error bars
denote estimates of σ8 from clusters. The upper rising regions also roughly
denote the σ8 behavior, as derived from optical galaxy samples, in units of
bg σ8 .
The mass enclosed within 8 h−1 Mpc is that of a typical rich cluster,
$1.2\times10^{15}\,\Omega_{nr}(2h)^{-1}\,M_\odot$. Because rich clusters are rare events in the medium, their
number density is extremely sensitive to the value of σ8 . The abundance as a
function of cluster mass, velocity dispersion or X-ray temperature also depends
upon the shape of Pρ (k) in the cluster band of fig. 24, i.e., on ns and Γ.
Cluster X-ray data implies 0.6 ≲ σ8 ≲ 0.8 for CDM-like Ωnr = 1 theories, with
the best value depending upon Γ, ns , some issues of theoretical calibration of
models, and especially which region of the dncl /dTX data one wishes to fit,
since the data prefer a local spectral index d ln Pρ /d ln k substantially flatter
over the cluster region than the standard CDM model gives [120]. I believe a
good target number is 0.7 and values below about 0.5 are unacceptable, but
because CDM spectra do not fit the data well, this normalization depends
upon whether one focusses on the high or low temperature end. Other authors
who concentrated on the low to median region found lower values for ns = 1
models, 0.57 ± 0.05 [293] and 0.50 ± 0.04 [294], but do not fit the high TX
end well. (A small upward correction should be applied to these low estimates
to account for the nonzero redshift of the calibrating samples.) For ΩΛ ≠ 0,
a higher value is better [231, 254]; [293] adopt $\Omega_{nr}^{-0.56}$ as the correction, [294]
give a more moderate dependence, $\Omega_{nr}^{-0.53+0.13\Omega_{nr}}$ for nonzero vacuum models,
$\Omega_{nr}^{-0.47+0.10\Omega_{nr}}$ for open models. The rising curves with error bars in Fig. 25
show the higher and lower σ8 estimates from clusters. Allowed models would
have to lie in the overlap region between the cluster σ8 and the dmr σ8 .
There are many estimates of the combination $\sigma_8\Omega_{nr}^{0.56}$ that are obtained by
relating the galaxy flow field to the galaxy density field inferred from redshift
surveys, which all take the form [bg σ8 ] βg , where βg is a numerical factor whose
value depends upon data set and analysis procedure: in [295], the rather varied
estimations are reviewed, and raw averages are given, 0.78 ± 0.33 for IRAS-
selected galaxy surveys, 0.71 ± 0.25 for optically-selected galaxy surveys. Even
more recent estimates give βg ∼ 0.4–0.6. (In this case, the $\Omega_{nr}^{0.56}$ is the factor by
which the linear growth rate Ḋ/D differs from the Hubble expansion rate ȧ/a.)
For this to be a σ8 estimator requires the simplifying assumption of linear
amplification bias, and a choice for bg σ8 . It is usual to take $b_g \approx \sigma_8^{-1}$ for galaxies,
but bg can depend upon the galaxy types being probed, upon scale, and could
be bigger or smaller than $\sigma_8^{-1}$, and certainly cannot be determined by theory
alone. Recent estimates of parameters derived from the LSSV data, in this case
the Mark III velocity catalogue [296], are Γ = 0.5 ± 0.15, bg nearly unity and
$\sigma_8\Omega_{nr}^{0.56} \sim 0.85 \pm 0.1$, with sampling errors adding another ∼ 0.1 uncertainty
[297, 298]. The emphasis in [298] is on Pρ (k) estimation from the LSSV data
since it allows a direct comparison with models in figures like fig. 24. For
example, they give $P_\rho^{1/2}\,\Omega_{nr}^{0.56} = 0.48^{+0.07}_{-0.08}$ at $k^{-1} \approx 10\,h^{-1}$ Mpc, which compares
with $P_\rho^{1/2} \sim 0.54\,\sigma_8\,(1 - 0.65(\Gamma + \nu_s/2 - 0.5))$ for tilted Γ models. However,
the 17% should be augmented by a theoretical “cosmic variance” sample error,
which may be quite large. In [299], parametric models give similar results,
$P_\rho^{1/2}\,\Omega_{nr}^{0.56} = 0.49^{+0.07}_{-0.08}$, $\sigma_8\Omega_{nr}^{0.56} \sim 0.88 \pm 0.15$. (Earlier work on LSSV
concentrated on estimates of large scale rms bulk flows, e.g., over 40 and 60 h−1 Mpc
regions: σv (40 h−1 Mpc) had the same 17% data errors, but there the cosmic
variance fluctuations contributed 50% uncertainty; even so the Γ = 0.5 model
needed ns > 0.83 with the typical gravity wave contribution and > 0.55 without
[190].) Since the peculiar velocity data relies on having spatially-independent
and accurate distance indicators (e.g., the empirical Tully–Fisher relation be-
tween luminosity and rotation velocity in spiral galaxies), how seriously we take
the LSSV constraints depends upon how reliable we think the indicators are –
a subject of much debate.
Fig. 25 shows σ8 is a sensitive function of ns : for CDM models with Ωnr = 1,
it is far too high at 1.2 for ns =1, but too low by ns ≈ 0.76 with the “standard”
gravity wave contribution (νt = νs ) or by ns ≈ 0.60 if there is no tensor mode
contribution. However, the shape constraint wants lower ns . In [162], we
marginalize likelihood functions determined with the COBE data (and smaller
angle data) using a prior probability requiring that Γ + νs /2 be 0.22 ± 0.08 and
$\sigma_8\Omega_{nr}^{0.56}$ be $0.65^{+0.15}_{-0.08}$ in order to condense the tendencies evident in fig. 25 into
single numbers with error bars. Threading the “eye of the needle” this way is so
exacting that the error bars are too small to take too seriously. Sample numbers
using only the 4-year dmr data and these priors are $n_s = 0.76^{+.03;.06}_{-.03;.06}$ for h = 0.5
with gravity waves, $n_s = 0.61^{+.04;.09}_{-.04;.08}$ without. For h = 0.7 and Ωvac = 0.66,
we get $n_s = 0.99^{+.03;.06}_{-.02;.04}$; and when Hubble parameters in the range from 0.5
to 1 are marginalized over, the preferred index is $n_s = 0.99^{+.06;.18}_{-.04;.08}$ with gravity
waves, $n_s = 0.95^{+.09;.19}_{-.10;.17}$ without. These are of course significantly better than
can be determined from dmr alone (section 4.5).
For the decaying neutrino model with ns = 1 to have σ8 > 0.5 we need
Γ > 0.22, i.e., mν τd < 14 keV yr. The hot/cold hybrid model formula in
eq. (222) is for one massive neutrino species. As fig. 25 shows, an ns =1 hot/cold
hybrid model with Ων < 0.3 would have σ8 > 0.8; however, even with a modest
tilt to ns = 0.95 this can drop to 0.7 for Ων = 0.25. (See also ref. [255].) That
is, little tilt is required, in contrast to the CDM case.
It is also evident from fig. 25 that the cluster data in combination with the
dmr data stops h from becoming too high for a fixed age, but also would prefer
a nonzero Λ value, with H0 ∼ 60 − 70 for 13 Gyr, and H0 ∼ 50 − 60 for 15
Gyr. When the tilt is allowed to vary as well, the preferred values lower to
very near 50 and 43, respectively, i.e., with little Ωvac : h < 0.70 at 2σ with
gravity waves, h < 0.56 with no gravity waves for 13 Gyr; h < 0.56 at 2σ with
gravity waves for 15 Gyr. For the hot/cold models, the values near 50 and 43
are preferred even more, even with very little tilt.
The redshift of galaxy formation cannot be too low or we would get too few
z ∼ 4 quasars and too little neutral gas compared with that inferred using the
damped Lyman alpha systems seen in the spectra of quasars. A fairly conservative
estimate of the redshift of galaxy formation is [190] $z_{gf} \approx 1.3\,\sigma_{0.5}\,\Omega_{nr}^{-0.23} - 1$,
where $\sigma_{0.5} \equiv \sigma_\rho(0.5\,h^{-1}\,{\rm Mpc})$ is the analogue of σ8 but at a galactic mass scale
rather than a cluster mass scale and $D/a \approx \Omega_{nr}^{-0.23}$ for the linear growth rate
D(t) at high redshift has been used. This suggests $2 \lesssim \sigma_{0.5}\,\Omega_{nr}^{-0.23} \lesssim 5$ or
so. For the Γ models with tilt we have roughly $\sigma_{0.5} \sim 6.4\,\sigma_8\,e^{\nu_s}\,(\Gamma/0.5)^{0.44}$.
(If we characterize galactic scales by the baryonic mass then we should use
σ1 Mpc ≡ σρ (1 Mpc) rather than σ0.5 in the zgf estimation if ΩB h2 is
treated as fixed by primordial nucleosynthesis. For the Γ models with tilt,
$\sigma_{1\,{\rm Mpc}} \sim \sigma_{0.5}/(2h)^{0.3}$.) The zgf requirement leads to serious constraints on ns
in standard CDM models: ns > 0.76 with gravity waves, ns > 0.63 without.
With Γ < 0.5, the restrictions on the primordial spectral index from galaxy
and cluster formation are even more severe (for Ωnr ∼ 1), but the $\Omega_{nr}^{-0.23}$ factor
ameliorates the situation for Λ ≠ 0 models. The zgf constraint is also the
Achilles’ heel of hot/cold hybrid models with Ων ≳ 0.3 [233, 255]. Observations
of the CMB on small scales could in principle help to normalize the power
spectrum there; e.g., using sub-mm sky observations as in fig. 15 (if one could
get redshifts by other means).
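The two rough relations used in this paragraph, σ0.5 ∼ 6.4 σ8 e^νs (Γ/0.5)^0.44 and zgf ≈ 1.3 σ0.5 Ωnr^−0.23 − 1, can be chained as in the sketch below. The function names are hypothetical, and the inputs in the example are illustrative values rather than fits to any data set.

```python
import math


def sigma_half(sigma8, gamma, n_s):
    """Rough galactic-scale amplitude sigma_0.5 for tilted Gamma models."""
    return 6.4 * sigma8 * math.exp(n_s - 1.0) * (gamma / 0.5) ** 0.44


def z_galaxy_formation(sigma_05, omega_nr=1.0):
    """Conservative galaxy formation redshift estimate quoted from [190]."""
    return 1.3 * sigma_05 * omega_nr ** -0.23 - 1.0


if __name__ == "__main__":
    s05 = sigma_half(0.7, 0.5, 1.0)
    print(s05, z_galaxy_formation(s05))
```

The suggested range 2 ≲ σ0.5 Ωnr^−0.23 ≲ 5 then corresponds to zgf between about 1.6 and 5.5.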
7.4 The future

A consistent story that accommodates all of the current data on the CMB, large
scale structure, the Hubble parameter, the ages of stars, the deceleration pa-
rameter, clusters, lensing, etc. does not yet leap out at us. With the large Sloan
and 2dF redshift surveys, we will have a wealth of LSS data to compare with the
evolving CMB spectrum, and many of the current puzzles will be definitively
answered. As we have seen, if just the shape of the density power spectrum
over the LSS band and the amplitude of the power spectrum on cluster scales
are considered to be known, then the range of inflation and dark matter models
is restricted considerably when combined with the COBE anisotropy level (and
indeed the anisotropy levels of intermediate angle experiments). Whether the
solution will be a simple variant on the CDM+inflation theme [233], involving
slight tilt (or more radical broken scale invariance), stable eV-mass neutrinos,
decaying (>keV)-neutrinos, vacuum energy, low H0 , high baryon fraction, neg-
ative mean curvature or some combination, is still open, but can be decided as
the observations tighten, and, in particular, as the noise in the Cℓ figure
subsides, revealing the details of the Doppler peaks, a very happy future for those
of us who wish to peer into the mechanism by which structure was generated
in the Universe.
Although there are undoubtedly many surprises in store for us as the
anisotropy data improves, we should be very encouraged by how far we have
come since the COBE discovery. We are now beginning to map the sky’s pri-
mary and secondary anisotropy signals. It is fitting to end by pointing back
to fig. 11 that shows the anisotropy at low resolution as revealed by COBE,
and forward to the interferometric arrays (VSA, CBI, VCA), long duration
balloon experiments (ACE, Boomerang, Maxima, Top Hat,...) and especially
the all-sky satellite experiments (MAP, COBRAS/SAMBA), that will tell us
the parameters defining how cosmic structure formed in detail.
Acknowledgements
First a nod to my fellow CMB enthusiasts who kept digging for three decades
until gold was revealed. Now we’re all rich. And the data-bank deposits just
keep accumulating. Fifteen years of collaboration with George Efstathiou on
CMB topics and strong interactions with Bernard Carr and Craig Hogan on the
spectrum, Steve Myers, Paul Steinhardt, Rick Davis, Rob Crittenden, Andrew
Jaffe, Lloyd Knox, Yoram Lithwick, Dmitry Pogosyan and Tarun Souradeep
on the anisotropy is especially noted. Support from a Canadian Institute for
Advanced Research Fellowship, NSERC and from the Institut d’Astrophysique
de Paris, where some of these lecture notes were written, is gratefully acknowl-
edged.
A The ADM formalism and perturbation theory

The ADM treatment of the Cauchy problem in relativity [167] is well covered
in MTW [196]. The ADM formalism is the natural language for numerical
relativity, so there has been intense post-MTW development; see, in particular,
Jimmy York’s highly influential 1979 Battelle and 1982 Les Houches lecture
notes [168, 169]. The approach to perturbation theory which I ascribe to
[195, 2, 192] is based upon this 3+1 split. I usually use either the synchronous
gauge or the longitudinal gauge, but with liberal use of transformation to other
variables and hypersurfaces if it simplifies analytic or numerical calculations or
helps in understanding. This approach underlay Bardeen’s influential 1980
paper and many of the main papers in the subject. However, there was also
excessive zeal for the “gauge invariant approach” that made sacrosanct the per-
turbation to the lapse and the inhomogeneous scale factor in the longitudinal
gauge. These variables refer to just one choice of time slicing, which is some-
times a rather bad choice from the point of view of hypersurface warping. By
contrast, the much-maligned synchronous gauge – for which the hypersurfaces
are those on which cold dark matter is at rest – is often excellent and a great
workhorse in General Relativity, e.g., Landau and Lifshitz [179]. Bardeen’s
China lectures [178] redress the balance, giving a clear compact enunciation of
the issues starting from the ADM formalism in a paper which deserves to be
better known in cosmology.
The main equations for perturbation theory are given in sections A.2, B.4,
C.2, C.3.1, C.3.2, C.4 for scalar modes and in sections A.3, C.6 for tensor
modes. The other sections develop these equations from first principles.
A.1 The ADM equations

A foliation is a set of spacelike 3-surfaces {(3) G} that fills spacetime, for which
a closed 1-form Ω exists which is normal to the surfaces. It is therefore locally
exact, i.e., can be written as Ω = dτ , where τ is a time coordinate labelling the
hypersurfaces. The metric can be decomposed into the ADM form in terms of
the lapse function N , the shift (three) vector N i , and a spatial metric (3) gij :
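\[
ds^2 = -N^2 d\tau^2 + {}^{(3)}g_{ij}\,(dx^i + N^i d\tau)(dx^j + N^j d\tau) , \tag{224}
\]
\[
g_{00} = -N^2 + N_k N^k , \qquad g_{0i} = N_i \equiv {}^{(3)}g_{ij}N^j , \qquad g_{ij} = {}^{(3)}g_{ij} ,
\]
\[
g^{00} = -\frac{1}{N^2} , \qquad g^{0i} = \frac{N^i}{N^2} , \qquad g^{ij} = {}^{(3)}g^{ij} - \frac{N^i N^j}{N^2} , \qquad {}^{(3)}g^{ij}\,{}^{(3)}g_{jk} \equiv \delta^i{}_k .
\]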
Here, xi are local coordinates on the τ = constant surfaces and (3) g ij is the
contravariant spatial 3-metric.
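It can be reassuring to confirm numerically that the contravariant components listed with eq. (224) really are the inverse of the covariant ones. The sketch below builds both 4 × 4 matrices from a lapse, a shift and a spatial metric and multiplies them; the helper names and the sample numbers are arbitrary illustrative choices, and only the Python standard library is used.

```python
def inverse_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def adm_metric(lapse, shift_up, g3):
    """Covariant 4-metric g_{mu nu} from N, N^i and the 3-metric, eq. (224)."""
    shift_dn = [sum(g3[i][j] * shift_up[j] for j in range(3)) for i in range(3)]
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = -lapse ** 2 + sum(shift_dn[k] * shift_up[k] for k in range(3))
    for i in range(3):
        g[0][i + 1] = g[i + 1][0] = shift_dn[i]
        for j in range(3):
            g[i + 1][j + 1] = g3[i][j]
    return g


def adm_inverse_metric(lapse, shift_up, g3):
    """Contravariant 4-metric g^{mu nu} from the components listed with eq. (224)."""
    g3_inv = inverse_3x3(g3)
    ginv = [[0.0] * 4 for _ in range(4)]
    ginv[0][0] = -1.0 / lapse ** 2
    for i in range(3):
        ginv[0][i + 1] = ginv[i + 1][0] = shift_up[i] / lapse ** 2
        for j in range(3):
            ginv[i + 1][j + 1] = g3_inv[i][j] - shift_up[i] * shift_up[j] / lapse ** 2
    return ginv


def matmul4(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


if __name__ == "__main__":
    N, shift = 1.3, [0.1, -0.2, 0.05]
    g3 = [[2.0, 0.3, 0.0], [0.3, 1.5, 0.1], [0.0, 0.1, 1.2]]
    for row in matmul4(adm_metric(N, shift, g3), adm_inverse_metric(N, shift, g3)):
        print([round(x, 12) for x in row])
```

The product should come out as the 4 × 4 identity to rounding error for any positive-definite spatial metric and nonzero lapse.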
One can refer tensors to the coordinate basis, $dx^\alpha$ and its dual basis
$\partial_\alpha \equiv \partial/\partial x^\alpha$, or to a more general contravariant basis (tetrad), $e^a$, and its covariant
dual basis, $e_a$, where a = 0, 1, 2, 3; for the spatial components with respect to
the basis I shall use I, J, K, . . .. It is natural to choose the 4-velocity
$e_n = N^{-1}\partial_0 - N^{-1}N^i\partial_i$ as the timelike basis vector (and $e^n = N\,d\tau$): it describes
observers comoving with the flow of time (section 6.2.1). The spatial triad
$\{e_1, e_2, e_3\}$ is chosen to be perpendicular to $e_n$ ($\langle e^n, e_I\rangle = \langle e_n, e^I\rangle = 0$). Thus,
$\{e_I\}$ is invariant under the action of the projector $\perp^{\alpha\beta} = g^{\alpha\beta} + e_n^\alpha e_n^\beta$. Tetrads
are not usually expressible as coordinate bases (i.e., are nonholonomic), but
components of tensors with respect to tetrads often have more direct physical
meaning than components referred to coordinates. With the $e_I$ chosen to be
perpendicular to $e_n$, to go from spatial coordinate components of a tensor
$T^{ij\ldots}{}_{k\ell\ldots}$ to triad components $T^{IJ\ldots}{}_{KL\ldots}$, one just forms
$T^{ij\ldots}{}_{k\ell\ldots}\,e^I{}_i\,e^J{}_j\,e_K{}^k\,e_L{}^\ell\ldots$; 3-space
spatial covariant derivatives with respect to the 3-metric ${}^{(3)}g_{ij}$ are denoted
by $T^{ij\ldots}{}_{k\ell\ldots|m}$ or by ${}^{(3)}\nabla_m T^{ij\ldots}{}_{k\ell\ldots}$, with $T^{IJ\ldots}{}_{KL\ldots|M}$ or
${}^{(3)}\nabla_M T^{IJ\ldots}{}_{KL\ldots}$ denoting the
action of the covariant derivative ${}^{(3)}\nabla_{e_M}$ on the tensor. If T is invariant under
projection, then $[{}^{(3)}\nabla]T = [\perp{}^{(4)}\nabla]T$, where ${}^{(4)}\nabla$ is the covariant derivative
with respect to the 4-metric ${}^{(4)}g_{ij}$. The 3-space metric coefficients in the $e_I$
basis are ${}^{(3)}g_{IJ} \equiv e_I\cdot e_J$, e.g., $\delta_{IJ}$ for an orthonormal choice. Considered as
matrices, $(e_I{}^i) = [(e^J{}_j)^{tr}]^{-1}$. The matrix $e^J{}_j$ is sometimes called the deformation
tensor since $e^J{}_j\,dr^j$ gives the proper length of an element of coordinate length
$dr^j$.
In the following equations, we shall refer the time components to the basis
$e_n$, using the subscript n. For the spatial components, because $e_I$ and $e^I$ are
just linear combinations at each point of the $\partial_i$ and $\partial^i$, independent of $e_n$ and
$e^n$, the transformation to basis components involves changing i, j to I, J (with
some care for the treatment of the shift; also note that although $e_I{}^0$ vanishes,
$e^I{}_0 = \delta^{IJ}N_j\,e_J{}^j$ does not for nonzero shift – however, $e^I{}_n$ does vanish). It
is useful to introduce a modified basis $e_{*I}$ which “takes out the expansion of
the Universe” from $e_I$: $e_*^I = A^{-1}e^I$, $e_{*I} = A\,e_I$. Here A(x, τ ) is a “conformal
factor” that should reduce to ā(τ ) for homogeneous backgrounds, but also could
be spatially dependent for fluctuations if it results in simplified equations.
The only nonvanishing components of the extrinsic curvature (with respect
to the basis) are

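\begin{align}
K_{IJ} &= e_I{}^i e_J{}^j\,\frac{1}{2N}\left[-\frac{\partial\,{}^{(3)}g_{ij}}{\partial\tau} + N_{j|i} + N_{i|j}\right] \tag{225}\\
&= -\frac{\dot A}{NA}\,\delta_{IJ} - \frac{1}{2N}\,(\delta_{KJ}\,e_{*I}{}^i + \delta_{IK}\,e_{*J}{}^i)\,\dot e_*^K{}_i + \frac{1}{2N}\,e_I{}^i e_J{}^j\,(N_{j|i} + N_{i|j}) . \nonumber
\end{align}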
The last form assumes the basis is orthonormal, and is explicitly given to show
(with zero shift) that this is just the familiar matrix relation for the shear
tensor in terms of the deformation tensor when one maps from Lagrangian to
Eulerian space in Newtonian dynamics in the expanding Universe.
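We define an inhomogeneous Hubble parameter in terms of the trace of the extrinsic curvature $K = K^I{}_I$ and a hypersurface anisotropic shear in terms of the anisotropic part of the extrinsic curvature, $(K')_{IJ}$. Letting
\[ a(x,\tau) \equiv [\det({}^{(3)}g)]^{1/6} = \exp\left[\tfrac16\,{\rm Trace}\,\ln {}^{(3)}g_{ij}\right] , \tag{226} \]
we have
\[ H \equiv \theta \equiv -\frac{K}{3} = \frac{1}{N}\frac{\partial\ln a}{\partial\tau} - \frac{1}{3N}\,{}^{(3)}\nabla_j N^j = e_n[\ln a] - \frac{1}{3N}\,\partial_j N^j \]
\[ = \frac{\dot A}{NA} + \frac13\frac1N\, e_{*K}{}^i\, \dot e_*^K{}_i - \frac13\frac1N\, N^j{}_{|j} , \]
\[ \sigma_{IJ} \equiv -(K')_{IJ} \equiv -\left(K_{IJ} - \tfrac13\,{}^{(3)}g_{IJ} K\right) \tag{227} \]
\[ = \left[\frac12\left(\delta_{KJ}\, e_{*I}{}^i + \delta_{IK}\, e_{*J}{}^i\right) - \frac13\,\delta_{IJ}\, e_{*K}{}^i\right]\frac1N\,\dot e_*^K{}_i - e_I{}^i e_J{}^j\,\frac1N\left[\frac12\left(N_{i|j}+N_{j|i}\right) - \frac13\,{}^{(3)}g_{ij}\, N^k{}_{|k}\right] , \tag{228} \]
\[ \sigma_{ij} = \frac{1}{2N}\, a^2\, \frac{\partial\,({}^{(3)}g_{ij}/a^2)}{\partial\tau} - \frac1N\left[\frac12\left(N_{i|j}+N_{j|i}\right) - \frac13\,{}^{(3)}g_{ij}\, N^k{}_{|k}\right] . \]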
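The two expressions for the local scale factor in eq. (226), and the zero-shift form of the shear, can be checked numerically. This is a minimal sketch, not a method from the text: the function names and the sample metric are made up for illustration, and the shift is assumed to vanish so that $\sigma_{ij} = (a^2/2N)\,\partial({}^{(3)}g_{ij}/a^2)/\partial\tau$.

```python
import numpy as np

def local_scale_factor(g3):
    """a = det(g)^(1/6) for a symmetric positive-definite 3-metric."""
    return np.linalg.det(g3) ** (1.0 / 6.0)

def local_scale_factor_trace_log(g3):
    """Same quantity via exp[(1/6) Trace ln g], using the eigenvalues of g."""
    eigenvalues = np.linalg.eigvalsh(g3)
    return np.exp(np.sum(np.log(eigenvalues)) / 6.0)

def shear_zero_shift(g3, g3_dot, lapse=1.0):
    """sigma_ij = (a^2 / 2N) d(g_ij / a^2)/dtau, assuming zero shift."""
    # d ln a / dtau = (1/6) g^{ij} dg_ij/dtau (Jacobi's formula for det g).
    dlna_dtau = np.trace(np.linalg.solve(g3, g3_dot)) / 6.0
    return (g3_dot - 2.0 * dlna_dtau * g3) / (2.0 * lapse)

# Hypothetical sample metric and its time derivative.
g_example = np.array([[4.0, 0.5, 0.0],
                      [0.5, 9.0, 0.2],
                      [0.0, 0.2, 16.0]])
g_dot_example = np.array([[0.8, 0.1, 0.0],
                          [0.1, 0.9, 0.0],
                          [0.0, 0.0, 3.0]])
sigma_example = shear_zero_shift(g_example, g_dot_example)
```

The shear computed this way is trace-free with respect to ${}^{(3)}g_{ij}$, and it vanishes for a pure conformal expansion ${}^{(3)}g_{ij} = a^2\delta_{ij}$.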
Just as the stress–energy tensor was decomposed in eq. (154), so the Einstein tensor $G_{ab}$ can be decomposed into $\{G_{nn},\, G_{In},\, \tfrac13 G^I{}_I,\, G_{IJ} - \tfrac13\,{}^{(3)}g_{IJ}\, G^K{}_K\}$, and the 10 Einstein equations written in this form. The energy constraint equation can be re-expressed as an inhomogeneous Friedmann equation, which is also the general relativistic version of the Poisson–Newton equation:
\[ {}^{(4)}G_{nn} = \tfrac12\left({}^{(3)}R + \tfrac23 K^2 - (K')_{IJ}(K')^{IJ}\right) = 8\pi G_N\,\rho_{\rm tot} , \tag{229} \]
\[ {\rm i.e.,}\quad H^2 = \tfrac83\pi G_N\,\rho_{\rm tot} + \tfrac13\sigma^2 - \tfrac16\,{}^{(3)}R , \qquad \sigma^2 \equiv \tfrac12\,\sigma_{IJ}\sigma^{IJ} . \tag{230} \]
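As a small illustration of the inhomogeneous Friedmann equation (230), the local expansion rate can be evaluated from the local density, shear and 3-curvature. This is a sketch only; the function name, argument names and the choice of units with $G_N = 1$ are assumptions made for the example.

```python
import math

def hubble_squared(rho_tot, sigma2=0.0, ricci3=0.0, G_N=1.0):
    """H^2 = (8 pi G_N / 3) rho_tot + sigma^2 / 3 - R3 / 6, as in eq. (230)."""
    return (8.0 * math.pi * G_N / 3.0) * rho_tot + sigma2 / 3.0 - ricci3 / 6.0

# Homogeneous closed model: R3 = 6 k / a^2 with k = 1, a = 2, so R3/6 = 0.25.
h2_closed = hubble_squared(1.0, ricci3=6.0 * 1.0 / 2.0**2)
```

With vanishing shear and ${}^{(3)}R = 6k/a^2$ this reduces to the familiar homogeneous Friedmann equation $H^2 = \tfrac83\pi G_N\rho - k/a^2$.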
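The momentum constraint equation is
\[ {}^{(4)}G_n{}^i = {}^{(3)}\nabla_j\left((K')^{ij} - \tfrac23 K\,{}^{(3)}g^{ij}\right) = 8\pi G_N\, J^i_{(e),{\rm tot}} . \tag{231} \]
The isotropic dynamical equation ($G^I{}_I/3$) is
\[ \frac23\, e_n[K] + \frac{2}{3N}\,{}^{(3)}\nabla^2 N - \frac13 K^2 - \frac16\,{}^{(3)}R - \frac12 (K')_{IJ}(K')^{IJ} = 8\pi G_N\, p_{\rm tot} . \tag{232} \]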
The curvature term in eq. (229) can be eliminated by forming the combination $R_{nn} = -(G_{nn} + G^I{}_I)/2$ equation, which is the Raychaudhuri equation for this zero vorticity hypersurface flow ($\omega^2 \equiv \tfrac12\,\omega_{IJ}\omega^{IJ} = 0$):
\[ 3e_n[H] + 3H^2 + 2(\sigma^2 - \omega^2) - \frac1N\,{}^{(3)}\nabla^2 N + 4\pi G_N(\rho + 3p)_{\rm tot} = 0 . \tag{233} \]
The anisotropic dynamical Einstein equations ($(G')^i{}_j$) are
\[ -e_n[(K')^i{}_j] + K(K')^i{}_j + ({}^{(3)}R')^i{}_j - \frac1N\left({}^{(3)}\nabla^i\,{}^{(3)}\nabla_j - \frac13\,\delta^i{}_j\,{}^{(3)}\nabla^2\right)N - \frac1N (K')^k{}_j\,{}^{(3)}\nabla_k N^i + \frac1N (K')^i{}_k\,{}^{(3)}\nabla_j N^k = 8\pi G_N\,(\Pi_{\rm tot})^i{}_j . \tag{234} \]
I now give a few examples of the stress–energy tensors which we shall have occasion to use. The stress energy of a classical fluid can be decomposed into a comoving density $\rho_{\rm com} = U_a T^{ab} U_b$, momentum current $J^a_{(e){\rm com}}$, pressure $p_{\rm com}$, and anisotropic stress $\Pi^{ab}_{\rm com}$, defined by eq. (154) but with $U$ the 4-velocity of the fluid in question. The fluid may be imperfect, with shear and bulk viscosity, $\eta$, $\zeta$, and a thermal conductivity $\kappa$, obeying the constitutive relations [264, 196]:
\[ p_{\rm com} = p(\rho_{\rm com}, T) - \zeta\,\theta_{(U)} , \qquad \Pi^{ab}_{\rm com} = -2\eta\,\sigma^{ab}_{(U)} , \tag{235} \]
\[ J^a_{(e)(U)} = -\kappa T \perp^{ab}_{(U)} \left({}^{(4)}\nabla_b[\ln T] + A_{(U)b}\right) , \tag{236} \]
where $T$ is the fluid temperature and $p(\rho, T)$ is the equation of state. The fluid’s acceleration is $A_{(U)b} \equiv {}^{(4)}\nabla_U U_b$, where the subscript $(U)$ indicates projection with respect to $U$, e.g., $\theta_{(U)} \equiv \perp^{a}{}_{(U)b}\, T^{bc} \perp_{(U)ca}$. The stress energy derived from a distribution function is given by eq. (276) below. A last example is a scalar field, $\phi$, interacting through a potential $V(\phi, \ldots)$; projecting onto $U = e_n$, we have
\[ \rho_\phi = \tfrac12\left[(e_n[\phi])^2 + {}^{(3)}g^{IJ} e_I[\phi]\, e_J[\phi]\right] + V , \tag{237} \]
\[ J^I_{(e),\phi} = -T_{(\phi)}{}^I{}_n = -\,{}^{(3)}g^{IJ} e_n[\phi]\, e_J[\phi] , \]
\[ p_\phi = \tfrac12 (e_n[\phi])^2 - \tfrac16\,{}^{(3)}g^{IJ} e_I[\phi]\, e_J[\phi] - V , \]
\[ \Pi_{(\phi)IJ} = e_I[\phi]\, e_J[\phi] - \tfrac13\,{}^{(3)}g_{IJ}\,{}^{(3)}g^{KM} e_K[\phi]\, e_M[\phi] . \]
The scalar field evolution equation, ${}^{(4)}\nabla^2\phi = \partial V/\partial\phi$, is
\[ \hbox{scalar field momentum:}\quad \Pi_{(\phi)} = e_n[\phi] , \tag{238} \]
\[ e_n[\Pi_{(\phi)}] - K\,\Pi_{(\phi)} - {}^{(3)}g^{IJ} e_J[\ln N]\, e_I[\phi] - {}^{(3)}\nabla^2\phi + \frac{\partial V}{\partial\phi} = 0 . \]
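The scalar field density and pressure can be evaluated directly from the kinetic, gradient and potential energies. The sketch below assumes the standard form in which the gradient energy enters $\rho_\phi$ with a factor $\tfrac12$ and $p_\phi$ with a factor $-\tfrac16$; the function and argument names are hypothetical.

```python
def scalar_field_fluid(phi_n, grad_sq, V):
    """Return (rho, p) for a scalar field.

    phi_n   : e_n[phi], the time derivative along the hypersurface normal
    grad_sq : g^{IJ} e_I[phi] e_J[phi], the squared spatial gradient
    V       : potential energy density
    """
    rho = 0.5 * (phi_n**2 + grad_sq) + V
    p = 0.5 * phi_n**2 - grad_sq / 6.0 - V
    return rho, p

# Limiting equations of state p / rho.
rho_k, p_k = scalar_field_fluid(1.0, 0.0, 0.0)   # kinetic dominated
rho_g, p_g = scalar_field_fluid(0.0, 1.0, 0.0)   # gradient dominated
rho_v, p_v = scalar_field_fluid(0.0, 0.0, 1.0)   # potential dominated
```

The three limits give $p/\rho = 1$, $-1/3$ and $-1$ respectively, the last being the usual inflationary case.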
One can split the 20 independent components of the spacetime curvature tensor ${}^{(4)}R_{abcd}$ into 14 that just depend upon the properties of the 3-geometry, as embodied in the space curvature tensor ${}^{(3)}R_{ijkm}$, and upon the extrinsic curvature and its spatial derivatives,
\[ {}^{(4)}R_{ijkm} = {}^{(3)}R_{ijkm} + (K_{ik}K_{jm} - K_{im}K_{jk}) , \tag{239} \]
\[ {}^{(4)}R_{ijkn} = {}^{(3)}\nabla_j K_{ik} - {}^{(3)}\nabla_i K_{jk} , \tag{240} \]
\[ {}^{(3)}R_{ijkm} = {}^{(3)}g_{ik}\,{}^{(3)}R_{jm} - {}^{(3)}g_{im}\,{}^{(3)}R_{jk} + {}^{(3)}g_{jm}\,{}^{(3)}R_{ik} - {}^{(3)}g_{jk}\,{}^{(3)}R_{im} + \tfrac12\,{}^{(3)}R\left({}^{(3)}g_{im}\,{}^{(3)}g_{jk} - {}^{(3)}g_{ik}\,{}^{(3)}g_{jm}\right) , \]
and into 6 dynamical components that depend upon how the extrinsic curvature changes in time, i.e., dependent upon ${}^{(3)}G$-evolution:
\[ {}^{(4)}R_{injn} = K_{ik}K^k{}_j + N^{-1}\,{}^{(3)}\nabla_i\,{}^{(3)}\nabla_j N + \left\{\frac1N \dot K_{ij} - \frac1N\left[N^k\,{}^{(3)}\nabla_k K_{ij} + K_{ik}\,{}^{(3)}\nabla_j N^k + K_{kj}\,{}^{(3)}\nabla_i N^k\right]\right\} . \tag{241} \]
Eqs. (240), (239) are called the Gauss–Codazzi equations in the differential geometry of surfaces.$^1$
1 Many of these quantities are most naturally expressed in terms of Lie derivatives: e.g., $K_{IJ}$ is the Lie derivative of ${}^{(3)}g_{ab}$ with respect to $e_n$, the term in curly brackets in eq. (241) is the Lie derivative of $K_{ij}$ with respect to $e_n$, ${\cal L}_{e_n} K_{ij}$, and the term in square brackets is the Lie derivative of $K_{ij}$ along the shift vector $N^K e_K$, which vanishes for zero shift.
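The statement that in three dimensions the Riemann tensor is fixed by the Ricci tensor and the metric can be checked numerically. The sketch below builds ${}^{(3)}R_{ijkm}$ from ${}^{(3)}R_{ij}$ using $R_{ijkm} = g_{ik}R_{jm} - g_{im}R_{jk} + g_{jm}R_{ik} - g_{jk}R_{im} + \tfrac12 R\,(g_{im}g_{jk} - g_{ik}g_{jm})$; the function name and the sample tensors are assumptions for illustration.

```python
import numpy as np

def riemann_from_ricci_3d(g, ricci):
    """Build the 3D Riemann tensor R_ijkm from the metric and Ricci tensor."""
    ricci_scalar = np.einsum('ij,ij->', np.linalg.inv(g), ricci)
    return (np.einsum('ik,jm->ijkm', g, ricci)
            - np.einsum('im,jk->ijkm', g, ricci)
            + np.einsum('jm,ik->ijkm', g, ricci)
            - np.einsum('jk,im->ijkm', g, ricci)
            + 0.5 * ricci_scalar * (np.einsum('im,jk->ijkm', g, g)
                                    - np.einsum('ik,jm->ijkm', g, g)))

# Hypothetical sample metric and symmetric Ricci tensor.
g_sample = np.array([[2.0, 0.1, 0.0],
                     [0.1, 1.0, 0.3],
                     [0.0, 0.3, 1.5]])
ricci_sample = np.array([[0.5, 0.2, 0.0],
                         [0.2, -0.1, 0.4],
                         [0.0, 0.4, 0.3]])
riemann_sample = riemann_from_ricci_3d(g_sample, ricci_sample)

# Contracting the first and third indices should return the Ricci tensor.
ricci_back = np.einsum('ik,ijkm->jm', np.linalg.inv(g_sample), riemann_sample)

# Constant-curvature case: R_ij = 2 k g_ij gives R_ijkm = k (g_ik g_jm - g_im g_jk).
k_curv = 0.7
riemann_const = riemann_from_ricci_3d(g_sample, 2.0 * k_curv * g_sample)
expected_const = k_curv * (np.einsum('ik,jm->ijkm', g_sample, g_sample)
                           - np.einsum('im,jk->ijkm', g_sample, g_sample))
```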
Normal coordinates have $N^i = 0$. In perturbation theory this defines time-orthogonal gauges. Because the equations simplify, this has also often been adopted in numerical relativity. Gaussian normal coordinates have $N = 1$ (or $\bar a$) as well, defining the synchronous gauge. There is a gauge which maximizes the 3-space volume, one with $K = 0$, which was used to retard horizon formation in black hole calculations, but is of little interest for cosmology. Constant $K$ hypersurfaces are used to characterize the outcome of inflation calculations, and have been generally advocated for inhomogeneous numerical cosmology because they are singularity-avoiding, e.g., [172]. However, this positive feature is a negative one if we are interested in following the collapse of cosmic structures such as clusters. Other choices that have been used in black hole calculations share this singularity-avoiding characteristic. There is also a large class of comoving hypersurfaces, one for each “type” of matter present, and one on which the total energy current $J^a_{(e),{\rm tot}}$ vanishes. These are very useful for deriving source functions, etc. and are sometimes useful for calculations.
Perturbation theory beyond first order in General Relativity depends upon exactly what spacetime we expand about. It is often useful to take out some aspect of the dynamics via a conformal transformation on 4-space ($g_{\alpha\beta} = \Omega^2 \tilde g_{\alpha\beta}$) or on 3-space (${}^{(3)}g_{ij} = A^2\,{}^{(3)}g_{*ij}$). In the usual cosmological perturbation theory, it is $A^2 = \bar a^2$ or $\Omega^2 = \bar a^2$ which is removed, but inhomogeneous parts could also be transformed. The spatial metric ${}^{(3)}g_{ij}$ can even in the nonlinear case be decomposed into terms that we can identify with scalar, vector and tensor (transverse traceless) components, but the nature of these depends upon exactly what we pull out in $A$ or $\Omega$ and the Einstein equations couple them – unless the metric coefficients and the conformal factors are all treated fully linearly. Nonlinear choices of some interest are $\Omega = N(x,\tau)$ and $A = a(x,\tau) \equiv (\det({}^{(3)}g))^{1/6}$.
A.2 Scalar perturbations

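In the following, unperturbed variables and covariant derivative operators have bars over them. For scalar perturbations, we have
\[ {}^{(3)}g_{ij} = {}^{(3)}\bar g_{ij}(1 + 2\varphi) - \bar a^2\left({}^{(3)}\bar\nabla_j\,{}^{(3)}\bar\nabla_i + {}^{(3)}\bar\nabla_i\,{}^{(3)}\bar\nabla_j\right)\psi , \tag{242} \]
\[ g_{00} = -\bar N^2(1 + 2\nu) , \qquad g_{0i} = N_i = \bar N\,{}^{(3)}\bar\nabla_i \Psi_n , \tag{243} \]
\[ e_{n0} = -\bar N(1 + \nu) , \qquad e_n{}^0 = \bar N^{-1}(1 - \nu) , \qquad e_n{}^i = -\,{}^{(3)}\bar\nabla^i \Psi_n , \]
\[ \Psi_\sigma \equiv \Psi_n + \frac{\bar a^2}{\bar N}\,\dot\psi , \tag{244} \]
\[ (\delta H) \equiv -\frac13(\delta K) = \frac{1}{\bar N}\,\dot\varphi - \bar H\nu - \frac13\,{}^{(3)}\bar\nabla^2\Psi_\sigma , \tag{245} \]
\[ \sigma^i{}_j \equiv -(K')^i{}_j = -\left({}^{(3)}\bar\nabla^i\,{}^{(3)}\bar\nabla_j - \tfrac13\,\delta^i{}_j\,{}^{(3)}\bar\nabla^2\right)\Psi_\sigma , \tag{246} \]
\[ (\delta\,{}^{(3)}R')^i{}_j = -\left[{}^{(3)}\bar\nabla^i\,{}^{(3)}\bar\nabla_j - \tfrac13\,\delta^i{}_j\,{}^{(3)}\bar\nabla^2\right]\varphi , \tag{247} \]
\[ (\delta\,{}^{(3)}R) = -4\,{}^{(3)}\bar\nabla^2\varphi - {}^{(3)}\bar R\,2\varphi , \qquad {}^{(3)}\bar R = 6\,\frac{k_c}{d_{\rm curv}^2\,\bar a^2} , \tag{248} \]
\[ \rho_{\rm type} = \bar\rho_{\rm type} + (\delta\rho)_{\rm type} = \bar\rho_{\rm type}(1 + \delta_{\rm type}) , \tag{249} \]
\[ J_{(e),{\rm type}\,I} = T_{{\rm type}\,nI} = -(\bar\rho + \bar p)_{\rm type}\,{}^{(3)}\bar\nabla_I \Psi_{v,{\rm type}} , \tag{250} \]
\[ U_{{\rm type}\,I} = -\,{}^{(3)}\bar\nabla_I \Psi_{v,{\rm type}} , \qquad p_{\rm type} = \bar p_{\rm type} + (\delta p)_{\rm type} , \]
\[ \hbox{fluid acceleration:}\quad A_{(U)I} = -\,{}^{(3)}\bar\nabla_I \Psi_{A,{\rm type}} , \qquad \Psi_{A,{\rm type}} = \frac{1}{\bar N}\,\dot\Psi_{v,{\rm type}} - \nu , \]
\[ \hbox{hypersurface acceleration:}\quad A_{(e_n)I} = {}^{(3)}\bar\nabla_I \nu , \tag{251} \]
\[ \Pi_{{\rm type}\,ij} = \left({}^{(3)}\bar\nabla_i\,{}^{(3)}\bar\nabla_j - \tfrac13\,{}^{(3)}g_{ij}\,{}^{(3)}\bar\nabla^2\right)\bar p_{\rm type}\,\pi_{t,{\rm type}} , \]
\[ \hbox{scalar field:}\quad \phi = \bar\phi + \delta\phi , \qquad \Psi_{v,\phi} = (\bar e_n[\phi])^{-1}\,\delta\phi . \tag{252} \]
\[ (\delta\rho)_{\rm tot} \equiv \bar\rho_{\rm tot}\,\delta_{\rm tot} \equiv \sum_{\rm type} \bar\rho_{\rm type}\,\delta_{\rm type} , \]
\[ \Psi_{v,{\rm tot}} \equiv \frac{\sum_{\rm type}(\bar\rho + \bar p)_{\rm type}\,\Psi_{v,{\rm type}}}{(\bar\rho + \bar p)_{\rm tot}} , \tag{253} \]
\[ J_{(e),{\rm tot}\,I} = T_{{\rm tot}\,nI} = -(\bar\rho + \bar p)_{\rm tot}\,{}^{(3)}\bar\nabla_I \Psi_{v,{\rm tot}} . \tag{254} \]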
Thus $\varphi$ fully parameterizes the Ricci 3-space tensor ($k_c = 0, \pm1$ gives the 3 FRW curvature possibilities). The velocity potentials for “type”-matter are $\Psi_{v,{\rm type}}$. It is also convenient to define a total velocity perturbation through eqs. (253), (254). $\Psi_n$ is like a velocity potential for the shift, and, as we shall see, only the combination $\Psi_\sigma$, which is a potential for the anisotropic shear of the hypersurfaces $\sigma^i{}_j$, enters into the equations of motion. The anisotropic stress for type-matter can also be expressed in terms of a scalar potential, $\pi_{t,{\rm type}}$. It vanishes for scalar fields, as eq. (237) shows, and also for perfect fluids, including CDM and the baryons. It does not vanish for photons and relativistic neutrinos. The acceleration of a fluid moving with velocity $U$ is $A_{(U)I} = \bar a^{-1} U^n \bar e_n[\bar a U_I] + e_I[\ln N]$ to first order, and is $e_J[\ln N]$ to all orders for the time surfaces (as is shown in Appendix B), yielding eq. (251) expressed in terms of an acceleration potential. The acceleration, $\nabla_U U$, is from nongravitational forces only, hence the $+\,{}^{(3)}\bar\nabla_I \nu$ term is there to take out the gravitational acceleration derived from geodesic motion.
The expansion of ${}^{(3)}g_{ij}$ is based on the removal of the 3-space conformal factor $\bar a$ rather than some inhomogeneous function. We define ${}^{(3)}\bar\nabla^j$ in terms of the ${}^{(3)}\bar g_{ij}$ without the $\bar a^2$ taken out, so that ${}^{(3)}\bar\nabla^i = \bar a^{-2}\,{}^{(3)}g_*^{ij}\,\partial_j$ (with ${}^{(3)}g_*^{ij} = \delta^{ij}$ for a flat Universe) has extra $\bar a$ terms designed to confuse the reader. So does the Laplacian ${}^{(3)}\bar\nabla^2$. One of the advantages in working in an orthonormal basis is that the correct $\bar a$ multipliers enter into the expressions (e.g., ${}^{(3)}\bar\nabla_I = \bar a^{-1}\partial/\partial x^I$ for a flat Universe).
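The energy constraint and (the first integral of) the momentum constraint are
\[ 2\bar H(\delta H) - \frac23\,{}^{(3)}\bar\nabla^2\varphi - \frac16\,{}^{(3)}\bar R\,2\varphi = \frac{8\pi G_N}{3}\,(\delta\rho)_{\rm tot} , \tag{255} \]
\[ (\delta H) + \frac13\,{}^{(3)}\bar\nabla^2\Psi_\sigma = \frac{1}{\bar N}\,\dot\varphi - \bar H\nu = -4\pi G_N \sum_{\rm type}(\bar\rho + \bar p)_{\rm type}\,\Psi_{v,{\rm type}} - \frac16\,{}^{(3)}\bar R\,\Psi_\sigma . \tag{256} \]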
It is sometimes better to work with a modified form of the energy constraint equation, found by inserting the relation for $(\delta H)$ from the momentum constraint equation into eq. (255):
\[ -\,{}^{(3)}\bar\nabla^2(\varphi + \bar H\Psi_\sigma) - \frac14\,{}^{(3)}\bar R\,2(\varphi + \bar H\Psi_\sigma) = 4\pi G_N\,(\delta\rho)_{\rm com,tot} , \tag{257} \]
\[ (\delta\rho)_{\rm com,tot} \equiv (\delta\rho)_{\rm tot} + 3\bar H(\bar\rho + \bar p)_{\rm tot}\,\Psi_{v,{\rm tot}} , \qquad \Phi_H \equiv \varphi + \bar H\Psi_\sigma , \]
involving the energy density in the frame in which the total energy current $J_{(e),{\rm tot}}$ vanishes and Bardeen’s gauge invariant $\Phi_H$, which is also $\varphi$ in the longitudinal gauge: $\varphi_L = \Phi_H$.
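For a flat background (${}^{(3)}\bar R = 0$, ${}^{(3)}\bar\nabla^2 = \bar a^{-2}\partial^2$) the Poisson-like constraint (257) becomes algebraic in Fourier space, $\Phi_H(k) = 4\pi G_N\,\bar a^2\,(\delta\rho)_{\rm com,tot}(k)/k^2$. The following is a minimal one-dimensional sketch of that inversion on a periodic grid; the function name, grid set-up and units ($G_N = 1$) are assumptions for illustration only.

```python
import numpy as np

def phi_H_flat(delta_rho_com, box_size, a_bar=1.0, G_N=1.0):
    """Solve -(1/a^2) d^2 Phi_H / dx^2 = 4 pi G_N delta_rho_com on a periodic 1D grid."""
    n = delta_rho_com.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)  # comoving wavenumbers
    rho_k = np.fft.fft(delta_rho_com)
    phi_k = np.zeros_like(rho_k)
    nonzero = k != 0.0
    # The k = 0 mode is left at zero: the mean of the perturbation must vanish.
    phi_k[nonzero] = 4.0 * np.pi * G_N * a_bar**2 * rho_k[nonzero] / k[nonzero]**2
    return np.fft.ifft(phi_k).real

# Single-mode check: delta_rho = cos(k0 x) gives Phi_H = 4 pi G a^2 cos(k0 x) / k0^2.
n_grid, box = 64, 10.0
x_grid = np.arange(n_grid) * box / n_grid
k0 = 2.0 * np.pi * 3 / box
phi_numeric = phi_H_flat(np.cos(k0 * x_grid), box, a_bar=0.5)
phi_expected = 4.0 * np.pi * 0.5**2 * np.cos(k0 * x_grid) / k0**2
```

With this sign convention $\Phi_H$ is positive where the comoving density perturbation is positive.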
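The Raychaudhuri equation, slightly reworked, is
\[ \frac{1}{\bar N^2}\left[\frac{\partial\,\bar N(\delta H)}{\partial\tau} + \frac{\partial\ln(\bar N^{-1}\bar a^2)}{\partial\tau}\,\bar N(\delta H) - \bar N\,\frac{\partial\bar H}{\partial\tau}\,\nu - \frac13\,\bar N^2\,{}^{(3)}\bar\nabla^2\nu\right] = -\frac{4\pi G_N}{3}\,\left((\delta\rho) + 3(\delta p)\right)_{\rm tot} . \tag{258} \]
Note that $\bar N(\delta H)$ is negative: a growing density perturbation slows the expansion rate. The $(G')_{IJ}$ equation simplifies considerably when expressed in terms of the potentials:
\[ \frac{1}{\bar N}\,\dot\Psi_\sigma + \bar H\Psi_\sigma + (\varphi + \nu) = -8\pi G_N \sum_{\rm type} \bar p_{\rm type}\,\pi_{t,{\rm type}} . \tag{259} \]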
Although for scalar perturbations, the constraint equations together with the matter conservation equations form a complete system from which the dynamical Einstein equations follow by taking appropriate time derivatives and linear combinations, sometimes it is worth it to solve the Raychaudhuri equation, extra time derivative and all, or the anisotropic $G_{IJ}$ equation in the place of one of the constraint equations. In a gauge with $\nu = 0$, the Raychaudhuri equation becomes a simple ODE for $\bar N(\delta H)$ at each point in the space. The momentum constraint equation is an ODE for $\varphi$, but it turns out that only $\dot\varphi$ enters the matter evolution equations and its expression in terms of the velocity potentials can be substituted. This is the usual approach taken for solving scalar perturbations in the synchronous gauge, and is the one adopted in the Bond and Szalay and Bond and Efstathiou papers [195, 134, 88].
Although it is fine to solve eq. (258) for the evolution of matter and radiation through photon decoupling and free-streaming to the present, intractable numerical problems arise in inflation calculations with scalar fields [192]. A robust solution strategy for synchronous gauge fluctuations does exist, however: the momentum constraint is treated as an ODE for $\varphi$, and $(\delta H)$ is then fixed through the energy constraint equation.
For the synchronous gauge, the anisotropic $G_{IJ}$ equation follows from a combination of the matter evolution equations and the other Einstein equations and is not usually separately solved for. It is an algebraic relation for zero shear hypersurfaces ($\Psi_\sigma = 0$), e.g., for the longitudinal gauge (with $\psi = \Psi_n = 0$). For example, if there is no anisotropic stress (e.g., universes with only perfect fluids and/or scalar fields), then $\nu_L = -\varphi_L$. In [192], we also solved for scalar field fluctuations in the longitudinal gauge, using eq. (259) and a sum of the Raychaudhuri and energy constraint equations, a dynamical equation of second order in $\varphi_L$. For the CMB problem, the standard approach [138] has been to also use a constraint equation, the Poisson equation, eq. (257), relating the total comoving energy density to ${}^{(3)}\bar\nabla^2\nu_L$. [260] use the momentum constraint equation instead of the anisotropic shear equation.
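Under scalar mode gauge transformations [171, 178], $\tau_{\rm new} = \tau_{\rm old} + T$, $x^i_{\rm new} = x^i_{\rm old} + \bar a^2\,{}^{(3)}\bar\nabla^i L$, where $T$ and $L$ are scalar functions, we have
\[ \nu_{\rm new} = \nu_{\rm old} - \frac{1}{\bar N}\frac{\partial\,\bar N T}{\partial\tau} , \qquad \varphi_{\rm new} = \varphi_{\rm old} - \bar H\bar N T , \tag{260} \]
\[ \psi_{\rm new} = \psi_{\rm old} + L , \qquad \Psi_{n,{\rm new}} = \Psi_{n,{\rm old}} - \frac{\bar a^2}{\bar N}\,\dot L + \bar N T , \]
\[ \Psi_{\sigma,{\rm new}} = \Psi_{\sigma,{\rm old}} + \bar N T , \qquad \Psi_{v,{\rm type,new}} = \Psi_{v,{\rm type,old}} - \bar N T , \]
\[ (\delta H)_{\rm new} = (\delta H)_{\rm old} - \dot{\bar H}\,T - \tfrac13\,{}^{(3)}\bar\nabla^2\,\bar N T , \]
\[ (\delta H)_{*{\rm new}} = (\delta H)_{*{\rm old}} + (1 + \bar q)\bar H^2\bar N T , \]
\[ (\delta\rho)_{\rm tot,new} = (\delta\rho)_{\rm tot,old} + 3\bar H(\bar\rho + \bar p)_{\rm tot}\,\bar N T , \]
\[ (\delta\rho)_{\rm type,new} = (\delta\rho)_{\rm type,old} - \dot{\bar\rho}_{\rm type}\,T , \]
\[ (\delta p)_{\rm type,new} = (\delta p)_{\rm type,old} - \dot{\bar p}_{\rm type}\,T , \qquad \pi_{t,{\rm new}} = \pi_{t,{\rm old}} , \]
\[ \Psi_{A,{\rm type,new}} = \Psi_{A,{\rm type,old}} , \]
\[ \hbox{scalar field:}\quad (\delta\phi)_{\rm new} = (\delta\phi)_{\rm old} + \bar e_n[\phi]\,\bar N T . \]
The modified inhomogeneous Hubble parameter $H_*$ is defined by eq. (167). Notice that the “acceleration potential” of a fluid is gauge invariant. The unperturbed momentum of the scalar field is $\bar e_n[\phi]$.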
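To transform from the synchronous to the longitudinal gauge:
\[ \bar N T = -\Psi_{\sigma,S} , \qquad \Phi_A \equiv \nu_L = \frac{1}{\bar N}\,\dot\Psi_{\sigma,S} , \qquad \Phi_H \equiv \varphi_L = \varphi_S + \bar H\Psi_{\sigma,S} , \]
\[ \Psi_{v,{\rm type},L} = \Psi_{v,{\rm type},S} + \Psi_{\sigma,S} , \qquad \Psi_{v,{\rm cdm},L} = \Psi_{\sigma,S} , \tag{261} \]
\[ \delta_{{\rm type},L} = \delta_{{\rm type},S} + \frac{d\ln\bar\rho_{\rm type}}{d\ln a}\,\bar H\Psi_{\sigma,S} , \]
\[ \hbox{scalar field:}\quad (\delta\phi)_L = (\delta\phi)_S - \bar e_n[\bar\phi]\,\Psi_{\sigma,S} . \tag{262} \]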
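$\Phi_A$ and $\Phi_H$ are gauge invariant. Some other gauge invariant quantities that are often used are:
\[ \zeta = \varphi + \frac{(\delta\rho)_{\rm tot}}{3(\bar\rho + \bar p)_{\rm tot}} , \tag{263} \]
\[ \varphi_{\rm com} = \varphi - \bar H\Psi_{v,{\rm tot}} , \tag{264} \]
\[ (\delta\rho)_{\rm com,type} = (\delta\rho)_{\rm type} - \frac{1}{\bar N}\,\dot{\bar\rho}_{\rm type}\,\Psi_{v,{\rm type}} , \tag{265} \]
\[ (\delta p)_{\rm type} - \frac{\dot{\bar p}_{\rm type}}{\dot{\bar\rho}_{\rm type}}\,(\delta\rho)_{\rm type} . \tag{266} \]
Also gauge invariant are any differences between quantities which may themselves not be gauge invariant, such as velocity and appropriately normalized density differences:
\[ \Psi_{v,{\rm type}_1,{\rm type}_2} \equiv \Psi_{v,{\rm type}_1} - \Psi_{v,{\rm type}_2} , \tag{267} \]
\[ \frac{(\delta\rho)_{{\rm type}_1}}{(\bar\rho + \bar p)_{{\rm type}_1}} - \frac{(\delta\rho)_{{\rm type}_2}}{(\bar\rho + \bar p)_{{\rm type}_2}} . \tag{268} \]
Examples used below are the relative photon–baryon velocity potential, $\Psi_{v,\gamma B}$, and photon entropy per baryon perturbation, $\delta s_\gamma = \tfrac34\,\delta_\gamma - \delta_B$.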
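The gauge invariance of $\Phi_H$, $\zeta$, $\varphi_{\rm com}$ and $(\delta\rho)_{\rm com,tot}$ follows by direct substitution of the transformation rules $\varphi \to \varphi - \bar H\bar N T$, $\Psi_\sigma \to \Psi_\sigma + \bar N T$, $\Psi_{v} \to \Psi_{v} - \bar N T$ and $(\delta\rho)_{\rm tot} \to (\delta\rho)_{\rm tot} + 3\bar H(\bar\rho+\bar p)_{\rm tot}\bar N T$. The sketch below checks this arithmetic at a single spacetime point with made-up numbers; the dictionary keys and function names are hypothetical.

```python
def gauge_transform(pert, NT, H_bar, rho_plus_p):
    """Apply the time-shift part of the scalar gauge transformation (NT = N_bar * T)."""
    return {
        "phi": pert["phi"] - H_bar * NT,
        "Psi_sigma": pert["Psi_sigma"] + NT,
        "Psi_v": pert["Psi_v"] - NT,
        "delta_rho": pert["delta_rho"] + 3.0 * H_bar * rho_plus_p * NT,
    }

def invariants(pert, H_bar, rho_plus_p):
    """Gauge invariant combinations built from the total-matter perturbations."""
    return {
        "Phi_H": pert["phi"] + H_bar * pert["Psi_sigma"],
        "zeta": pert["phi"] + pert["delta_rho"] / (3.0 * rho_plus_p),
        "phi_com": pert["phi"] - H_bar * pert["Psi_v"],
        "delta_rho_com": pert["delta_rho"] + 3.0 * H_bar * rho_plus_p * pert["Psi_v"],
    }

# Made-up background values and perturbations at one point.
H_bar, rho_plus_p = 0.7, 1.3
old = {"phi": 0.02, "Psi_sigma": -0.4, "Psi_v": 0.15, "delta_rho": 0.05}
new = gauge_transform(old, NT=0.31, H_bar=H_bar, rho_plus_p=rho_plus_p)

# Going to zero-shear (longitudinal) slicing: NT = -Psi_sigma.
longitudinal = gauge_transform(old, NT=-old["Psi_sigma"], H_bar=H_bar, rho_plus_p=rho_plus_p)
```

In the `longitudinal` result `Psi_sigma` is zero and `phi` equals the invariant `Phi_H`, as stated for $\varphi_L$.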
A.3 Tensor perturbations

For tensor perturbations, we have
(T T )
gij = (3) gij = ēIi ēJj (δIJ + hij ), g00 = −N̄ 2 , g0i = Ni = 0 ,
1 (T T ) ik
(δH) = 0 , σji = −(K 0 )ij = − ḣ δ ,
2N̄ kj
2 1 (T T ) j (T T ) j
(δ (3) R0 )ij = −(3) ∇ 2 hi + (3) R̄ hi , δ (3) R = 0 . (269)
The tensor mode is already gauge invariant. Only the anisotropic dynamical Einstein equations are needed: multiplying both sides by $2\bar N^2$ gives
\[ \ddot h^{(TT)j}_i + \frac{\partial\ln(\bar a^3/\bar N)}{\partial\tau}\,\dot h^{(TT)j}_i - \bar N^2\,{}^{(3)}\bar\nabla^2 h^{(TT)j}_i + \bar N^2\,\frac13\,{}^{(3)}\bar R\, h^{(TT)j}_i = 16\pi G_N\,\bar N^2\,(\Pi_{\rm tot})^j{}_i . \tag{270} \]
With scalar fields only, there is no anisotropic stress, hence the gravity waves are freely propagating. Of course they can still be generated by quantum noise in the $h^{(TT)}_{ij}$ field. Anisotropic stresses from neutrinos and photons can lead to gravitational wave generation, but this is a very small effect. Cosmic strings decay by emitting gravitational waves, generated in response to their anisotropic stress.
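As an illustration of free gravity-wave propagation, consider a single Fourier mode of wavenumber $k$ in a flat, matter-dominated background with conformal time ($\bar N = \bar a \propto \tau^2$) and no anisotropic stress. Under these assumptions, which are choices made for this example rather than taken from the text, the tensor equation reduces to $\ddot h + (4/\tau)\dot h + k^2 h = 0$, whose regular solution is $h = 3 j_1(k\tau)/(k\tau)$. The sketch integrates the equation with a hand-written fourth-order Runge-Kutta step and compares with the analytic form.

```python
import math

def analytic_tensor_mode(k, tau):
    """Regular solution 3 j_1(x) / x with x = k tau, normalized to 1 at early times."""
    x = k * tau
    j1 = math.sin(x) / x**2 - math.cos(x) / x
    return 3.0 * j1 / x

def evolve_tensor_mode(k, tau_end, tau_start=1e-2, n_steps=20000):
    """Integrate h'' + (4/tau) h' + k^2 h = 0 with classical RK4."""
    def rhs(tau, h, h_dot):
        return h_dot, -(4.0 / tau) * h_dot - k * k * h

    # Series start: h = 1 - x^2/10, h' = -k x / 5 for x = k tau << 1.
    x0 = k * tau_start
    h = 1.0 - x0**2 / 10.0
    h_dot = -k * x0 / 5.0
    tau = tau_start
    dt = (tau_end - tau_start) / n_steps
    for _ in range(n_steps):
        a1, b1 = rhs(tau, h, h_dot)
        a2, b2 = rhs(tau + 0.5 * dt, h + 0.5 * dt * a1, h_dot + 0.5 * dt * b1)
        a3, b3 = rhs(tau + 0.5 * dt, h + 0.5 * dt * a2, h_dot + 0.5 * dt * b2)
        a4, b4 = rhs(tau + dt, h + dt * a3, h_dot + dt * b3)
        h += dt * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        h_dot += dt * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
        tau += dt
    return h

h_numeric = evolve_tensor_mode(k=1.0, tau_end=5.0)
h_exact = analytic_tensor_mode(k=1.0, tau=5.0)
```

The mode stays constant while $k\tau \ll 1$ and then oscillates with decaying amplitude once it is inside the horizon.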
B Transport theory in General Relativity
B.1 The distribution function and the BTE in GR
The theoretical framework used to calculate the anisotropies and distortions of the CMB is general relativistic polarized photon transport theory. Kinetic theory in general relativity was actively developed in the late sixties and early seventies (e.g., Ehlers 1971 and Stewart 1971). For the cosmological transport problem, we need a set of Boltzmann transport equations for single particle distribution functions. Due to the nonlocalizability of position and momentum, one must be careful in defining the distribution function. For flat cosmologies, the eigenmodes are plane waves, momenta are Fourier transform variables conjugate to positions, and a Wigner distribution function can be defined (in terms of a two-particle equal-time propagator). If the particles have spin (or polarization) labelled by $s$, then the Wigner distribution function is a matrix in spin space:
\[ f_{s's}(q, x, \tau) \equiv \sum_k e^{ik\cdot x}\,\langle a^\dagger_{s',q-k/2}(\tau)\, a_{s,q+k/2}(\tau)\rangle , \tag{271} \]
where $\langle\cdots\rangle$ denotes a (nonequilibrium) ensemble average, the operator $a_{s,q+k/2}$ annihilates a particle with spin $s$ of momentum $q + k/2$, and $a^\dagger_{s',q-k/2}$ creates a particle with spin $s'$ of momentum $q - k/2$. The trace of $f_{s's}$ in spin space, $f_t(q^i, x^\alpha) = \tfrac12\sum_s f_{ss}$, is the mean occupation number of the state of momentum $q^i$ in the neighborhood of the spacetime point $x^\alpha$. $f_t$ defined this way is not a positive definite quantity and so the interpretation of $f_t$ as phase space density is invalid.$^1$
1 A natural way to get a positive definite distribution is to discretize phase space into cells of size $(2\pi\hbar)^3$ and centers $(X, Q)$. The uncertainty principle implies further localization within a cell is not possible. The photon field can be expanded in annihilation and creation operators $a_{sXQ}$, $a^\dagger_{sXQ}$, with associated wave functions $\langle x|X, Q\rangle$ which are zero outside of the spatial part of the box and which are box-normalized plane waves, $\exp(iQ\cdot x)$, inside. These form a complete orthonormal set. The relative degree to which the boxes are spatially elongated is at our disposal provided the quantum volume constraint is maintained. By shrinking the spatial directions at the expense of increasing the separation between wavenumbers one recovers the delta function wavefunctions of the position space representation of quantum mechanics; by shrinking the momentum directions one approaches the continuum plane wavefunctions of the momentum space representation. The occupation indices of each such fundamental phase space cell can be used to define the distribution function: $f_{ss'}(Q, X, \tau) = \langle a^\dagger_{s'XQ}(\tau)\, a_{sXQ}(\tau)\rangle$. Compared with the usual Wigner distribution, there are disadvantages (does not have a continuous dependence on position and momentum, boundary terms involving transport from one box to another are complicated) and advantages (emphasizes the fundamental graininess imposed by quantum mechanics on phase space, and coarse-graining of phase space only involves making the boxes of much larger volume than that required by the Heisenberg uncertainty principle). Both approaches give exact quantum evolution equations which reduce to the usual form of the Boltzmann transport equation in the classical limit.
Coherent effects – such as the modification of the photon propagator by collective plasma effects – must be taken into account by appropriately defined quasiparticles which have these collective interactions included, but this is not of importance for the $\Delta T/T$ problem. In the classical limit – when spatial inhomogeneities of $f_t$ and gravitational field curvature are both of long wavelength compared with the typical de Broglie wavelength of the particles, $q^{-1}$ – localizability is a good approximation, $f_t$ is positive definite, and the quantum evolution equation for $f_t$ reduces to a Boltzmann transport equation. The transport model considers the particles propagating along geodesics in spacetime. The particles may undergo absorptions or emissions or scatterings at single points. For such a description of collisions to be valid it is also necessary that the interaction regime be small in spatial and temporal extent compared with the scale of inhomogeneity in $f_{s's}$. Also, in order for the equations to be closed off at the single-particle distribution level (rather than requiring e.g., a full Liouville equation or higher moments in a BBGKY hierarchy), the only correlations allowed to be explicitly included are those due to the particle statistics, Bose–Einstein or Fermi–Dirac. The equation for the evolution of the distribution function is
\[ q^\alpha \left.\frac{\partial f_{s_2 s_1}}{\partial x^\alpha}\right|_q - \Gamma^i{}_{\alpha\beta}\, q^\alpha q^\beta \left.\frac{\partial f_{s_2 s_1}}{\partial q^i}\right|_x = q^0 S_{s_2 s_1}[f] , \tag{272} \]
where $S_{s_2 s_1}[f]$ is the source function. $f_{s's}$ is a general relativistic scalar under coordinate transformations of the position coordinates and of the momentum coordinates. $q^0 S_{s_2 s_1}[f]$ also transforms as a general relativistic scalar, which conveniently allows transformation from one gauge to another.
Although eq. (272) is not manifestly covariant because the summation in the second term runs over spatial indices only, it is actually covariant – a consequence of the momentum being constrained to lie on the mass shell, $q^\alpha g_{\alpha\beta}\, q^\beta = -m^2$. Any other 3 parameters labelling the mass shell instead of the coordinate momenta $q^i$ would also do. Thus, for the transport problem selecting a gauge involves choosing a spacetime coordinate system and a momentum space coordinate system, and these can be chosen relatively independently of each other if we wish. Coordinate momenta $q^i$ are generally not very physically meaningful. It is usually better to use spatial momentum components relative to an orthonormal tetrad $e^a{}_\alpha(x^\mu)$: $q^I = e^I{}_\alpha\, q^\alpha_{\rm coord}$, often along with further momentum-gauge transformations beyond this to simplify analytics or numerics, in particular one that makes the momentum a comoving one [195]. The transfer equation in the triad momentum variables$^1$ looks similar to eq. (272), and is easily obtained by appropriately transforming it:
\[ q^a e_a[f] - \Gamma^I{}_{ab}\, q^a q^b \left.\frac{\partial f}{\partial q^I}\right|_x = q^n S[f] , \quad \hbox{where}\quad e_a[f] \equiv e_a{}^\alpha \left.\frac{\partial f}{\partial x^\alpha}\right|_q , \tag{273} \]
and the $\Gamma^I{}_{ab}$ are connection coefficients relative to the tetrad $e_a$, defined by the expansion $\nabla_{e_b} e_a = \Gamma^c{}_{ab}\, e_c$.
1 If the geodesic motion is $x^\alpha(\lambda)$, $q^I(\lambda)$, where $\lambda$ is an affine parameter, then the geodesic equations $q^a = e^a{}_\alpha\, dx^\alpha/d\lambda$, $dq^I/d\lambda = -\Gamma^I{}_{ab}\, q^a q^b$ applied to $Df/d\lambda = (Df/d\lambda)_{\rm coll}$, where $(Df/d\lambda)_{\rm coll}$ describes the change in the distribution function as a result of local interactions, yield the transport equation.
To reduce this to a usable but quite general form, we use a little more of the machinery of differential geometry. The $\Gamma^c{}_{ab}$ are often termed Ricci rotation coefficients (and denoted by $\omega^c{}_{ab}$), and are related to the structure coefficients $C^c{}_{ab}$ of the basis,$^2$
\[ [e_a, e_b] \equiv C^c{}_{ab}\, e_c , \quad \hbox{by}\quad \Gamma^c{}_{ab} = -\tfrac12\, g^{cd}\{e_d[g_{ab}] - e_b[g_{da}] - e_a[g_{db}]\} - \tfrac12\left[C^c{}_{ab} - g_{db}\, g^{cf} C^d{}_{fa} - g_{ad}\, g^{cf} C^d{}_{fb}\right] . \tag{274} \]
Let us first introduce an orthonormal basis, $e_n$, $e_I$, where $e_a{}^\alpha g_{\alpha\beta}\, e_b{}^\beta = \eta_{ab} = {\rm diag}(-1, 1, 1, 1)$, so $\Gamma^c{}_{ab}$ only involves the $C^c{}_{ab}$. If $p^a$ is the momentum in the $e_a$ basis, then $p^a p^b\, \Gamma^I{}_{ab}\, \partial f/\partial p^I = p^a p^b\, \eta_{ad}\, \delta^{IJ} C^d{}_{Jb}\, \partial f/\partial p^I$, often easier to calculate. The momentum $p \equiv (p^I \delta_{IJ}\, p^J)^{1/2}$ will redshift as the universe expands, so it is not the momentum we finally wish to work with. Since we have seen that the equations greatly simplify with the introduction of a comoving momentum, we introduce $q^I = \Omega p^I$, hence $q = \Omega p$, $\hat q^I = \hat p^I$ and $q^n = \sqrt{q^2 + m^2\Omega^2}$, where the function $\Omega(x, \tau)$ is at our disposal, except that it should be $\bar a(\tau)$ for the unperturbed case. The transformation of the action of the vector $e_a$ on $f$ from the space in which $p$ is fixed to the space in which $q$ is fixed is simply shown to be
\[ e_a|_p[f_t] = e_a|_q[f_t] + e_a[\ln\Omega]\; q\,\partial f_t/\partial q . \]
Note that $q\,\partial f_t/\partial q = q\,\hat q^I\,\partial f_t/\partial q^I$. For the basis $e_n$, $e_I$, we have
\[ C^n{}_{nI} = -C^n{}_{In} = e_I[\ln N] , \qquad C^n{}_{IJ} = 0 , \]
\[ C^J{}_{nI} = -C^J{}_{In} = e^J{}_i\left\{e_n[e_I{}^i] + \frac1N\, e_I[N^i]\right\} , \]
\[ C^K{}_{IJ} = e_I[e_J{}^j]\, e^K{}_j - e_J[e_I{}^j]\, e^K{}_j . \]
An example of the use of this is the computation of the acceleration 4-vector of the timelike hypersurfaces: $A^n$ vanishes and $A^I = \Gamma^I{}_{nn} = {}^{(3)}g^{IJ} C^n{}_{Jn} = {}^{(3)}g^{IJ} e_J[\ln N]$.
2 The commutator of the differential operator $[e_a, e_b]$ is defined by its action on a function $f$: $[e_a, e_b](f) = (e_a{}^\alpha\, e_b{}^\beta{}_{,\alpha} - e_b{}^\alpha\, e_a{}^\beta{}_{,\alpha})\,\partial f/\partial x^\beta$. To define the sign conventions I use here, the curvature tensor is $R(e_a, e_b, e_c, e^d) = \langle e^d, R(e_a, e_b)e_c\rangle = R^d{}_{cab}$, hence $R^d{}_{cab} = e_a[\Gamma^d{}_{cb}] - e_b[\Gamma^d{}_{ca}] + \Gamma^f{}_{cb}\Gamma^d{}_{fa} - \Gamma^f{}_{ca}\Gamma^d{}_{fb} - C^f{}_{ab}\Gamma^d{}_{cf}$ where $C^c{}_{ab} = \Gamma^c{}_{ba} - \Gamma^c{}_{ab}$. Here the operator $R(e_a, e_b) = {}^{(4)}\nabla_{e_a}{}^{(4)}\nabla_{e_b} - {}^{(4)}\nabla_{e_b}{}^{(4)}\nabla_{e_a} - {}^{(4)}\nabla_{[e_a, e_b]}$, the sign of the last term being the one required by the component expression just given. The Ricci tensor, scalar, Einstein tensor and Einstein equations are: $R_{cb} = R^a{}_{cab}$, $R = g^{cb}R_{cb}$, $G_{cb} = R_{cb} - \tfrac12 R\, g_{cb}$, $G_{cb} = 8\pi G_N T_{cb}$. The connection and curvature forms are $\omega^c{}_b \equiv \Gamma^c{}_{ba}\, e^a$, $\theta^d{}_c = d\omega^d{}_c + \omega^d{}_a \wedge \omega^a{}_c$, obeying $dg_{ab} = \omega_{ab} + \omega_{ba}$, where $\omega_{ab} \equiv g_{ac}\,\omega^c{}_b$, and the first and second Cartan equations, $de^a + \omega^a{}_b \wedge e^b = 0$ (or $\tfrac12$ torsion) and $\theta^d{}_c = \tfrac12 R^d{}_{cab}\, e^a \wedge e^b$. Here $\wedge$ is the exterior product and $d$ is the exterior derivative of forms. The latter 3 equations are all that is needed to compute connection coefficients and the curvature tensor for a metric in any basis, and this is usually simpler than using direct $\Gamma^c{}_{ab}$ calculation in a given basis.
As in section A.1, we use the 3-space “conformally transformed” basis, $e^I_* = A^{-1} e^I$, $e_{*I} = A e_I$, with $A(x, \tau)$ reducing to $\bar a(\tau)$ for the unperturbed case, but possibly inhomogeneous in the fluctuation case. This means the momentum is $q^a = \Omega A\,\langle e^a_*, p\rangle$. As we shall see, it turns out to be most desirable to have $\Omega = A$ to ensure that there are no terms representing the redshifting of the radiation for the unperturbed background. It is this $q^a$ and its $q$ which is the inhomogeneous generalization of the comoving momentum introduced in section 3.1. Because of the flexibility in the spatial dependence of $A$, it is not unique. The way the basis change manifests itself is through $e^K{}_i\, e_n[e_J{}^i] = e_*^K{}_i\, e_n[e_{*J}{}^i] - \delta^K{}_J\, e_n[\ln A]$. In terms of $q$ and $e_{*J}$, the transport equation becomes

qa ∂ft IJ qn 1
n
e [f
a t ] − q I
δ q̂J en [ln(A/Ω)] + e∗J [ln N ]
q ∂q q A
q 1 1
− n q̂J q̂ K e∗K [ln Ω] − q̂K eK i
∗ i {en [e∗J ] + e∗J [N i ]}
q A N

q M 1 K i i
+ n q̂K q̂ e {e∗J [e∗M ] − e∗M [e∗J ]}
q A ∗i
∂ft q 1 IJ
+q I n δ − q̂ I q̂ J e∗J [ln A] = S[ft ] . (275)
∂q q A
B.2 Number, energy and momentum conservation equations
A first application of this equation is to derive the energy and momentum conservation equations for “type”-matter. Just as the stress energy tensor can be decomposed into $(\rho, J^I_{(e)}, p, \Pi^{IJ})_{\rm type}$, so the number current 4-vector $J^a_{n_{\rm type}}$ of particles of a given type with respect to a flow $U^a$ can be decomposed as $J^a_{n_{\rm type}} = n_{\rm type} U^a + J^a_{(n_{\rm type})}$, where $\perp^b{}_a J^a_{(n_{\rm type})} = 0$. Similarly the “type”-entropy 4-vector $J^a_{s_{\rm type}}$ can be decomposed as $J^a_{s_{\rm type}} = s_{\rm type} U^a + J^a_{(s_{\rm type})}$. (In the comoving frame of an imperfect fluid moving with velocity $U$, $J^I_{(s_{\rm type})} = T^{-1} J^I_{(e){\rm type}}$, given by eq. (236).) If $U$ is the flow of time, $e_n{}^\alpha$, and we use the basis $e_I$, these various densities and currents are related to the distribution function by
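\[ \rho = \sum_{qs} \Omega^{-4}\, q^n f_t , \qquad p = \frac13 \sum_{qs} \Omega^{-4}\, \frac{q^2}{q^n}\, f_t , \tag{276} \]
\[ n_{\rm type} = \sum_{qs} \Omega^{-3} f_t , \]
\[ s = -\sum_{qs} \Omega^{-3}\{f_t \ln f_t - (\pm)(1 \pm f_t)\ln(1 \pm f_t)\} , \]
\[ J^I_{(e)} = \sum_{qs} \Omega^{-4}\, q^I f_t , \qquad J^I_{(n)} = \sum_{qs} \Omega^{-3}\, \frac{q^I}{q^n}\, f_t , \]
\[ J^I_{(s)} = -\sum_{qs} \Omega^{-3}\, \frac{q^I}{q^n}\,\{f_t \ln f_t - (\pm)(1 \pm f_t)\ln(1 \pm f_t)\} , \]
\[ \Pi^{IJ} = \sum_{qs} \Omega^{-4}\left(\hat q^I \hat q^J - \frac13\,\delta^{IJ}\right)\frac{q^2}{q^n}\, f_t , \qquad \sum_{qs}(\cdots) \equiv \sum_{\rm spin} \int \frac{d^3q}{(2\pi)^3}\,(\cdots) . \]
Here $(+)$ is for bosons, $(-)$ is for fermions. The number density is written with weight $\Omega^{-3}$ alone, which is what the general-basis form quoted below gives for the time component, since $q^n/(-q_n) = 1$ in an orthonormal basis. A sum over spins (or polarizations) is needed because of the way $f_t$ has been defined. (For a general basis, the form for $J^a_{n_{\rm type}}$ involves $\Omega^{-3}(-{}^{(4)}g)^{1/2}\, q^a/(-q_n)$ and for $T^{ab}_{\rm type}$ involves $\Omega^{-4}(-{}^{(4)}g)^{1/2}\, q^a q^b/(-q_n)$, where $q_n$ is the covariant time component of the momentum.)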
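The moment relations can be illustrated for an unperturbed Planck distribution of massless particles with two polarization states. The sketch below assumes $\Omega = 1$, $q^n = q$, units with $\hbar = c = k_B = 1$, and the weights $\rho \propto q^n f_t$, $p \propto \tfrac13 (q^2/q^n) f_t$, $n \propto f_t$; the function name and the simple trapezoidal quadrature are choices made for this example.

```python
import numpy as np

def planck_moments(T, n_grid=200001, x_max=60.0):
    """Energy density, pressure and number density of a Planck distribution."""
    x = np.linspace(1e-8, x_max, n_grid)        # x = q / T
    f = 1.0 / np.expm1(x)                        # occupation number f_t
    energy = x                                   # q^n / T for massless particles

    def integrate(y):
        # Composite trapezoidal rule on the fixed grid.
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

    # Two spin states times 4 pi q^2 dq / (2 pi)^3 gives a prefactor 1 / pi^2.
    prefactor = 2.0 * 4.0 * np.pi / (2.0 * np.pi) ** 3
    rho = prefactor * T**4 * integrate(x**2 * energy * f)
    p = prefactor * T**4 * integrate(x**2 * (x**2 / energy) * f) / 3.0
    n = prefactor * T**3 * integrate(x**2 * f)
    return rho, p, n

rho_gamma, p_gamma, n_gamma = planck_moments(T=1.5)
```

The results reproduce $\rho = (\pi^2/15)\,T^4$, $p = \rho/3$ and $n = (2\zeta(3)/\pi^2)\,T^3$ to the accuracy of the quadrature.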
Consider the limit of the BTE eq. (275) for nr-matter, for which $q^n \to m\Omega$. We take $\Omega = A = \bar a$ and ignore $q/q^n$ terms. The BTE and the zeroth and first order moment equations w.r.t. $q$ which give mass and momentum conservation are then:
qI ∂ft
en [ft ] + eI [ft ] − me∗J [ln N ] I + eK i
∗ i {en [e∗J ]
mā ∂q
1 ∂f
+ e∗J [N i ]}qK δ IJ I = S[ft ] ,
N ∂q
1 1 X
en [ρΩ3 ] + 2 e∗I [J(e) I
Ω4 ] − ρΩ3 eJ∗ i {en [e∗Ji ] + e∗J [N i ]} = m S,
ā N qs
I
en [J(e) Ω4 ] + ā−2 e∗J [(pδ IJ + ΠIJ )Ω5 ] + ρΩ3 δ IJ e∗J [ln N ] − Ω4 (J(e)
I
eJ∗ i
1 X
K
+ J(e) gKM eM IJ i
∗ i δ ){en [e∗J ] + e∗J [N i ]} = qI S . (277)
N qs
The nr-transport equation does indeed take the form of a Boltzmann equation
with a gravitational force −m∇ ln N , with ln N the gravitational potential
perturbation. The last term in eqs. (277) is a shearing term related to the
extrinsic curvature. The general equation for semi-relativistic matter can also
be obtained this way, but it is more easily derived from the ADM formulation
of the energy and momentum conservation laws for type-matter:
en [ρ] − K(ρ + p) − (K 0 )IJ ΠIJ + (3) ∇I J(e)

I
X
I (3)
+ 2J(e) ∇I ln N = pn S ,
ps
I I
en [J(e) ] − KJ(e) − 2(K ) ΠIJ + (3) ∇J (p (3) g IJ + ΠIJ )
0 IJ
X
+ ((ρ + p) (3) g IJ + ΠIJ )(3) ∇J ln N + J(e)
J (3)
∇J N I = pI S .
ps
The physical interpretation of the different terms is clear.

The perturbed energy and momentum conservation equations for nr-type
particles follow from eqs. (277). More generally, we shall keep in the terms of
order p̄/ρ̄ to have a generally valid result:
ρ̄type 2
ēn [δtype ] + 3ēn [ϕ] − (3) ∇ (Ψv,type + Ψσ )
3(ρ̄ + p̄)type
P n n
(δp − p̄δ)type ps p Stype − p S type (1 + δtype )
+ H̄ = , (278)
(ρ̄ + p̄)type (ρ̄ + p̄)type
ρ̄type p̄type p̄type
en [(1 + )Ψv,type ] − 3H̄ Ψv,type
(ρ̄ + p̄)type ρ̄type ρ̄type
2
(δp)type ( 2 (3) ∇ + 13 (3) R)p̄type πt,type
=ν+ + 3
(ρ̄ + p̄)type (ρ̄ + p̄)type
−2 P I I

(3)
∇ (3) ∇I ps pn Stype ppn + S type (3) ∇ Ψv,type
− . (279)
(ρ̄ + p̄)type
The perturbed energy conservation equations for the total energy and momen-
tum are the same, except that the sums over sources Stot vanish: the total
energy and momentum are conserved.
The nonrelativistic limit of eq. (279) is $\bar p_{\rm type}/\bar\rho_{\rm type} \to 0$, $p^n \to m_{nr}$. The energy equation, eq. (278), is handled by writing $\rho_{nr} = n_{nr}(m_{nr} + \epsilon_{nr})$, where $m_{nr}$, $n_{nr}$ and $\epsilon_{nr}$ are the mass, number density and thermal energy per nr-type particle. Terms of zeroth order in $m_{nr}^{-1}$ give the number conservation equation and terms of first order give the thermal energy conservation law, which is just $d\epsilon_{nr} + p_{nr}\, dn_{nr}^{-1} = T_{nr}\, ds_{nr}$, where $ds_{nr}$ denotes the entropy generation in the nr-matter in time $d\tau$, $T_{nr}$ the temperature. These laws are:
2 X
ēn [δn,nr ] + 3ēn [ϕ] − (3) ∇ (Ψv,nr + Ψσ ) = n̄−1
nr (δSnr − S nr δn,nr ),
ps
(280)
3H̄ n̄−1
ēn [δnr ] − p̄nr ēn [δn,nr ] + nr ((δp)nr − p̄nr δn,nr )
X p2
= n̄−1
nr (δSnr − S nr δn,nr ) . (281)
ps
2mnr
2
(δp)nr ( 2 (3) ∇ + 13 (3) R)p̄nr πt,nr
ēn [Ψv,nr ] = ν + + 3
mnr n̄nr mnr n̄nr
−2 X pI I
p

−1 (3) (3) 3 I
− n̄nr ∇ ∇I δSnr + + ∇ Ψv,nr S nr . (282)
ps
mnr mnr
For cold dark matter we only need eq. (280) and eq. (282), and this is all we need
for baryons as well if the baryonic pressure and heating can be neglected, which
is the case if we only wish to follow the development of primary anisotropies.
B.3 The transport of extremely relativistic particles

We now turn back to the transport equation to apply it to radiative transfer.
Instead of using q I , we shall change to q, q̂ I and define a derivative with respect
to q̂ I = q̂I by
∂ ∂ ∂
I
≡ q̂I + q −1 I . (283)
∂q ∂q ∂ q̂
161
The reason for this separation is that while terms involving derivatives with
respect to q̂ I are very relevant for the bending of light, i.e., lensing, they are not
relevant for most issues in primary and secondary anisotropy development (they
can contribute if there is large scale mean curvature). We shall call the source
associated with the first term StSW and the source associated with the second
Stbend . Note that q̂I ∂/∂ q̂ I = 0. Instead of repeating the Boltzmann equation,
we shall write this in terms of the ∆t notation introduced in section 3.1:
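$$1 + \Delta_t \equiv \frac{q/T_{c*}}{\ln(f_t^{-1} \pm 1)}, \quad (+)\ {\rm BE}, \quad (-)\ {\rm FD},$$
$$e_n[\Delta_t] + \frac{q}{q^n}\frac{1}{A}\,\hat q^I e_{*I}[\Delta_t] = \bar N^{-1}(G_{tSW} + G_{t\,bend} + G_{tC}), \qquad (284)$$
$$\bar N^{-1}G_{tSW} = \Big(1 + \Delta_t - \frac{\partial\Delta_t}{\partial\ln q}\Big)\Big[\frac{1}{N}\hat q_K\hat q^J e^K_{*i}\{e_n[e^i_{*J}] + e_{*J}[N^i]\} - \frac{q^n}{q}\frac{1}{A}\hat q^J e_{*J}[\ln N] + \frac{q}{q^n}\frac{1}{A}\hat q^J e_{*J}[\ln\Omega] - e_n[\ln(A/\Omega)]\Big], \qquad (285)$$
$$\bar N^{-1}G_{t\,bend} = \frac{\partial\Delta_t}{\partial\hat q^I}\,(\delta^{IJ} - \hat q^I\hat q^J)\Big[\frac{q^n}{q}\frac{1}{A}e_{*J}[\ln N] - \frac{q}{q^n}\frac{1}{A}e_{*J}[\ln A] - \frac{1}{N}\hat q_K e^K_{*i}\{e_n[e^i_{*J}] + e_{*J}[N^i]\} + \frac{q}{q^n}\frac{1}{A}\hat q_K\hat q^M e^K_{*i}\{e_{*J}[e^i_{*M}] - e_{*M}[e^i_{*J}]\}\Big], \qquad (286)$$
$$\bar N^{-1}G_{t,{\rm source}} \equiv \frac{S[f_t]\,(1+\Delta_t)^2}{(q/T_{c*})(f_c + \Delta f_t)(1 \pm (f_c + \Delta f_t))}. \qquad (287)$$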
For light massive neutrinos and photons whose spectrum is frequency dependent,
it is better to use either $(1+\Delta_t)^{-1}$ or $\ln(f_t^{-1} \pm 1)$, which is akin to a
dimensionless generalized chemical potential, for the transport.
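For massless er-particles, the components of the stress–energy tensor are
related to $\Delta_t$ by
$$\left\{\frac{\rho_{er}}{\bar\rho_{er}},\ \frac{p_{er}}{\bar p_{er}},\ \frac{J^I_{(e)er}}{\bar\rho_{er}},\ \frac{\Pi^{IJ}}{\bar\rho_{er}}\right\} = \int\frac{d\Omega_{\hat q}}{4\pi}\,(1+\Delta_t)^4\,\{1,\ 1,\ \hat q^I,\ \hat q^I\hat q^J - \tfrac13\delta^{IJ}\}.$$
The radiation brightness perturbation is defined to be $\bar\rho_{er}^{-1}\,d\rho_{er}/(d\Omega_{\hat q}/4\pi)$.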

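In [195, 2], I used $\Omega = A = \bar a$, conformal time, $\bar N = \bar a$, and a triad orthogonal
to linear order in the metric perturbation $h_{ij} = \bar a^{-2}({}^{(3)}g_{ij} - {}^{(3)}\bar g_{ij})$, where ${}^{(3)}\bar g_{ij}$
is the unperturbed (flat) spatial metric:
$$e^I_{*i} = \delta^I_{*i} + \tfrac12 h^I{}_i, \quad e^i_{*I} = \delta^i_{*I} - \tfrac12 h^i{}_I, \quad {}^{(3)}\bar g_{ij} = \bar a^2\delta_{ij}. \qquad (288)$$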
Raising and lowering of indices is here done with respect to δIJ = eI · eJ . We

now concentrate on massless particle transport (q n = q) in a flat unperturbed
Universe for which the bending source is of second order, take Ω = A = ā and
assume ∆t is q-independent, as for Thomson scattering of a Planck distribution.
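To linear order in $h_{\alpha\beta}$ and $\Delta_t$, eq. (285) can be written as
$$G_{tSW} = \bar N\Big[\hat q_K\hat q^J e^K_{*i}\frac{1}{N}\{e_n[e^i_{*J}] + e_{*J}[N^i]\} - \hat q^J\frac{1}{A}e_{*J}[\ln N]\Big]$$
$$= -\tfrac12\hat q^i\hat q^j\dot h_{ij} + \hat q^i\hat q^j\partial_j h_{0i} + \tfrac12\hat q^j\partial_j h_{00} \quad (= \dot\Delta_t + \hat q^i\partial_i\Delta_t). \qquad (289)$$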
B.4 Momentum space gauge transformations

There are two kinds of gauge transformations that operate on ft , coordinate and
momentum. For Thomson scattering problems, we usually restrict ourselves to
classes of momenta for which ∂ft /∂τ vanishes in the unperturbed state, so
ft is a function only of q; in that case, δft changes only under momentum
gauge transformations. However, since we usually tie the momentum variable
choice to the coordinate system choice (using a triad eI perpendicular to the
flow of time), the two are intimately related. We now discuss the general
situation, where we allow the new momenta to be arbitrary functions of the
old: $q^I_{\rm new}(q^J_{\rm old})$. This accompanies the transformations of time, $\tau_{\rm new} = \tau_{\rm old} + T$,
and space, $x^i_{\rm new} = x^i_{\rm old} + L^i$, coordinates.
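The class of momentum transformations we have been discussing so far
consists of conformal transformations of an orthonormal basis, hence
$q^a_{\rm new} = \Omega L^a{}_b\,q^b_{\rm old}$, where $L^a{}_b$ is a Lorentz transformation. We can therefore
identify a velocity vector $\mathbf v$, a gamma factor $\gamma = (1 - \mathbf v\cdot\mathbf v)^{-1/2}$ and a rotation
matrix $R^I{}_J$ such that
$$q^I_{\rm new} = \frac{\Omega_{\rm new}}{\Omega_{\rm old}}\,R^I{}_J\,\big(q^J_{\rm old} - \gamma\,q^n_{\rm old}v^J + (\gamma - 1)\,\mathbf q_{\rm old}\cdot\hat{\mathbf v}\,\hat v^J\big). \qquad (290)$$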
The old hypersurface as seen on the new hypersurface is moving with velocity
v. To linear order, we have the following transformations
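$$\mathbf q_{\rm new} = (1 + \delta\ln\Omega)\,R(\mathbf q_{\rm old} - q^n_{\rm old}\mathbf v), \qquad (291)$$
$$q_{\rm new} = q_{\rm old} + q_{\rm old}\,\delta\ln\Omega - q^n_{\rm old}\,\hat{\mathbf q}_{\rm old}\cdot\hat{\mathbf v}\,v, \qquad (292)$$
$$\hat{\mathbf q}_{\rm new} = R\Big(\hat{\mathbf q}_{\rm old} - \frac{q^n_{\rm old}}{q_{\rm old}}\,(\mathbf v - \hat{\mathbf q}_{\rm old}\cdot\mathbf v\,\hat{\mathbf q}_{\rm old})\Big), \qquad (293)$$
$$\delta f_{t\,{\rm new}} = \delta f_{t\,{\rm old}} - q\frac{\partial\bar f}{\partial q}\,\frac{q_{\rm new} - q_{\rm old}}{q_{\rm old}} - \Big[\frac{\partial\bar f}{\partial\tau}T\Big] \qquad (294)$$
$$= \delta f_{t\,{\rm old}} - q\frac{\partial\bar f}{\partial q}\Big(\delta\ln\Omega - \frac{q^n_{\rm old}}{q_{\rm old}}\,\hat{\mathbf q}_{\rm old}\cdot\hat{\mathbf v}\,v\Big) - \Big[\frac{\partial\bar f}{\partial\tau}T\Big], \qquad (295)$$
$$\Delta_{t\,{\rm new}} = \Delta_{t\,{\rm old}} + \frac{q_{\rm new} - q_{\rm old}}{q_{\rm old}} + \Big[\frac{\partial\bar f/\partial\tau}{\partial\bar f/\partial\ln q}\,T\Big]$$
$$= \Delta_{t\,{\rm old}} + \delta\ln\Omega - \frac{q^n_{\rm old}}{q_{\rm old}}\,\hat{\mathbf q}_{\rm old}\cdot\hat{\mathbf v}\,v + \Big[\frac{\partial\bar f/\partial\tau}{\partial\bar f/\partial\ln q}\,T\Big]. \qquad (296)$$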
Notice that it is only the redshifting associated with the conformal factor or
relative flow that enters in the transformation of f . If we restrict ourselves to
the class of comoving momenta, then the terms in square brackets vanish.
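As a rough numerical illustration of eqs. (290) and (292), the following Python sketch boosts a massless momentum exactly and compares its magnitude with the linearized expression. It assumes no conformal rescaling (δ ln Ω = 0), no rotation (R is the identity) and q^n = q; the function names are illustrative choices, not taken from the text.

```python
import math

def norm(vec):
    return math.sqrt(sum(c * c for c in vec))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def boost_massless_momentum(q_old, v):
    """Exact boost of eq. (290), assuming Omega_new = Omega_old, R = 1, q^n = |q|."""
    speed = norm(v)
    gamma = 1.0 / math.sqrt(1.0 - speed * speed)
    v_hat = [c / speed for c in v]
    q_n = norm(q_old)                       # massless particle: q^n = q
    q_along_v = dot(q_old, v_hat)
    return [qj - gamma * q_n * vj + (gamma - 1.0) * q_along_v * vhj
            for qj, vj, vhj in zip(q_old, v, v_hat)]

def linearized_magnitude(q_old, v):
    """Eq. (292) with delta ln Omega = 0: q_new = q_old - q^n (q_hat . v)."""
    return norm(q_old) - dot(q_old, v)      # q^n q_hat.v equals q.v when q^n = q

q_old = [0.3, -0.4, 1.2]
v = [1.0e-3, 2.0e-3, -1.0e-3]
q_exact = norm(boost_massless_momentum(q_old, v))
q_linear = linearized_magnitude(q_old, v)
relative_error = abs(q_exact - q_linear) / q_exact
print("relative error of the linearized magnitude:", relative_error)
```

The residual is second order in the velocity, as expected when the (γ − 1) term of eq. (290) is dropped.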
We know how the velocity $\mathbf v$ and the scale factor transform ($\bar a_{\rm new} = \bar a_{\rm old} + \bar H\bar N T$).
Although we have restricted ourselves to momenta that have $\Omega$ reducing
to $\bar a$ in the unperturbed case, we have some freedom in deciding how $(\bar a^{-1}\Omega)$
transforms. Therefore, for the combined coordinate and gauge transformation
of the radiation distribution function for scalar and tensor perturbations we
have
$$\Delta^{(S)}_{t\,{\rm new}} = \Delta^{(S)}_{t\,{\rm old}} + \bar H\bar N T + \hat q^I\,{}^{(3)}\nabla_I\bar N T + \ln\frac{(\bar a^{-1}\Omega)^{(S)}_{\rm new}}{(\bar a^{-1}\Omega)_{\rm old}},$$
$$\Delta^{(T)}_{t\,{\rm new}} = \Delta^{(T)}_{t\,{\rm old}} + \ln\frac{(\bar a^{-1}\Omega)^{(T)}_{\rm new}}{(\bar a^{-1}\Omega)_{\rm old}}. \qquad (297)$$
For the most common choice for $\Omega$, namely $\Omega = \bar a$, we see that, as expected,
the angle average of $4\Delta^{(S)}_t$ transforms as a density perturbation and the first
moment with respect to $\hat q^I$ transforms as a velocity, while all higher moments,
including that for the anisotropic stress, are gauge invariant. $\Delta^{(T)}_t$ is also
gauge invariant. Looking at eq. (260), we see that the following quantity is
gauge invariant,
$$\Delta^{(S)}_t - \bar H\Psi_\sigma - \hat q^I\bar e_I[\Psi_\sigma] = \Delta^{(S)}_{tL}, \qquad (298)$$
as are many other combinations. Equation (298) relates the distribution function
in the longitudinal gauge to that in the synchronous gauge.
A pure momentum gauge transformation with $\bar a^{-1}\Omega_{\rm old} = 1$ and $\bar a^{-1}\Omega_{\rm new} = e^{\nu}$
gives $\Delta_{\rm new} = \Delta_{\rm old} + \nu$. This turns out to be a more relevant combination
for the longitudinal gauge. The synchronous gauge combination is:
$$\Delta^{(S)}_{tS} + \ddot\psi - \hat q^I\partial_I\dot\psi = \Delta^{(S)}_{tL} + \nu_L. \qquad (299)$$
When we solve the transport equations in the synchronous gauge, it is this
quantity which free-streams after the photons have decoupled [88].
The momentum gauge transformation can be quite decoupled from coordi-
nate transformations: it is worthwhile to show explicitly the remarkable flex-
ibility that it allows. We can discuss this entirely in terms of ā−1 Ω which we
now allow to be an arbitrary function of q, q̂ I . In particular, it can be expanded
in spherical harmonics to induce the following transformation:
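$$\ln(\bar a^{-1}\Omega)_{\rm new} = \ln(\bar a^{-1}\Omega)_{\rm old} + \sum_{\ell m} z_{\ell m}(q_{\rm old}, x, \tau)\,Y_{\ell m}(\hat q_{\rm old}), \qquad (300)$$
$$\Delta_{\rm new} = \Delta_{\rm old} + \sum_{\ell m} z_{\ell m}(q_{\rm old})\,Y_{\ell m}(\hat q). \qquad (301)$$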
We are therefore allowed to perform gauge transformations on $\Delta_t$ beyond
$\ell = 0, 1$ if we wish, although these are not connected to coordinate transformations.
Indeed, it appears to be possible to use the momentum transformation
to completely remove the distribution function perturbation. Of course, tensor,
octopole and higher multipoles in the momentum gauge transformation
modify the transport operator: $G_{tSW}$ transforms as well and if we allow an
order $\ell$ term to appear in $\Delta_t$, the $\hat q^j\partial_j\Delta_t$ term in the transport operator will
induce a term of order $\ell + 1$ in $\hat q$ in $G_{tSW}$. Since $G_{tSW}$ and the Compton source
function have terms that are at most quadratic in $\hat q$, it would seem wise not to
induce terms cubic and higher order in $\hat q$. This restricts the class of momentum
transformations on $\Delta_t$ to have only $\ell = 0, 1$ terms. Consider how we would get
the combination eq. (299) in the synchronous gauge: we would make a pure
momentum gauge transformation, $\ln(\bar a^{-1}\Omega)_{\rm new} = \ddot\psi - \hat q^I\partial_I\dot\psi$. In practice one
doesn’t usually think of it this way. Rather one takes the transport operator
and $G_{tSW}$ and shuffles terms from the right-hand side to the left-hand side if
it looks convenient to do so. That is how we decided that the combination
$\Delta_{tS} + \ddot\psi - \hat q^I\partial_I\dot\psi$ was useful computationally [88].
If we change the momentum variables for one species, but not another, then
the interpretation becomes more complicated. For example, we should require
that such physically meaningful quantities as the entropy per baryon perturbation
$\tfrac34\delta_\gamma - \delta_B$ and the relative velocity $v_\gamma - v_B$ be gauge invariant under
the combination of spacetime and momentum coordinate changes. However,
all species present will have distribution functions, and they can all be transformed.
Thus, for example, just as the photon density transforms to $\delta_\gamma + 4\nu$
under $\bar a^{-1}\Omega = e^{\nu}$, so the baryon density transforms to $\delta_B + 3\nu$.
For the flat unperturbed case, we can do a Fourier expansion of the distri-
bution function and the Sachs–Wolfe source. In the frame in which k is taken
to be along the 3-axis and (θ, φ) are the polar angles, the Sachs–Wolfe source
terms under the standard momentum gauge choice are
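$$(\Omega = \bar a)\quad \text{scalar:}\quad G^{(S)}_{tSW} = -ik\mu\nu - \dot\varphi - \mu^2k^2\bar a^{-1}\Psi_\sigma, \quad \mu = \hat k\cdot\hat q, \qquad (302)$$
$$\text{tensor:}\quad G^{(T)}_{tSW} = -\tfrac12\hat q^i\hat q^j\dot h^{TT}_{ij} = -(1-\mu^2)\Big[\tilde G^{(T+)}_{tSW}\cos(2\phi) + \tilde G^{(T\times)}_{tSW}\sin(2\phi)\Big],$$
$$\tilde G^{(T)}_{tSW} = \tfrac12\dot h^{(T)}, \quad \dot h^{(T+)} = \tfrac12(\dot h_{11} - \dot h_{22}), \quad \dot h^{(T\times)} = \dot h_{12}. \qquad (303)$$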
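The decomposition in eq. (303) can be checked numerically. The Python sketch below evaluates the tensor source directly from a transverse-traceless matrix for k along the 3-axis and compares it with the cos(2φ), sin(2φ) form. The parametrization of the matrix by two amplitudes and the function names are assumptions made for illustration.

```python
import math

def direction(mu, phi):
    """Unit vector q_hat with polar angle cosine mu about the 3-axis and azimuth phi."""
    s = math.sqrt(1.0 - mu * mu)
    return [s * math.cos(phi), s * math.sin(phi), mu]

def tensor_source_direct(mu, phi, hdot_plus, hdot_cross):
    """-(1/2) q^i q^j hdot_ij for a transverse-traceless hdot_ij (k along the 3-axis)."""
    hdot = [[hdot_plus, hdot_cross, 0.0],
            [hdot_cross, -hdot_plus, 0.0],
            [0.0, 0.0, 0.0]]
    q = direction(mu, phi)
    return -0.5 * sum(q[i] * q[j] * hdot[i][j] for i in range(3) for j in range(3))

def tensor_source_mode_form(mu, phi, hdot_plus, hdot_cross):
    """Second form of eq. (303), with G~ = hdot/2 for each of the two modes."""
    g_plus = 0.5 * hdot_plus      # hdot^(T+) = (hdot_11 - hdot_22)/2 = hdot_plus
    g_cross = 0.5 * hdot_cross    # hdot^(Tx) = hdot_12 = hdot_cross
    return -(1.0 - mu * mu) * (g_plus * math.cos(2.0 * phi) + g_cross * math.sin(2.0 * phi))

for mu, phi in [(0.2, 0.7), (-0.9, 2.5), (0.0, 4.0)]:
    print(tensor_source_direct(mu, phi, 0.8, -0.3),
          tensor_source_mode_form(mu, phi, 0.8, -0.3))
```

Each printed pair should agree to rounding error, which reflects the identity q̂^i q̂^j h_ij = (1 − μ²)[h₊ cos 2φ + h× sin 2φ].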
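There is of course no effect on the polarization components. Under a further
pure momentum transformation, the distribution function and Sachs–Wolfe
scalar terms transform to:
$$\Omega = \bar a\Big[1 + \nu + \frac{\partial\,\bar a^{-1}\Psi_\sigma}{\partial\tau} - \hat q^i\partial_i\,\bar a^{-1}\Psi_\sigma\Big], \qquad (304)$$
$$\tilde\Delta^{(S)}_t = \Delta^{(S)}_t + \nu + \frac{\partial\,\bar a^{-1}\Psi_\sigma}{\partial\tau} - \hat q^i\partial_i\,\bar a^{-1}\Psi_\sigma, \qquad (305)$$
$$\tilde G^{(S)}_{tSW} = \dot\nu - \dot\varphi + \frac{\partial^2\,\bar a^{-1}\Psi_\sigma}{\partial\tau^2}, \qquad (306)$$
while the tensor terms remain invariant.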
C Polarized transport for Thomson scattering
C.1 The polarization matrix and Stokes parameters
The off-diagonal components $f_{s_2s_1}$, $s_2 \ne s_1$, of the distribution function contain
phase information, describing the probability amplitude for propagation from a
state of spin $s_1$ to a state of spin $s_2$. For photons, there are two polarizations,
hence a $2\times2$ “polarization matrix” [258] transverse to $\hat q\otimes\hat q$ is required for
photons in the direction $\hat q$. Consider linear polarization. If we expand the
polarization distribution function in terms of the basis consisting of the Pauli
matrices $\sigma^{(i)}$, $i = 1, 2, 3$ and the identity $\sigma^{(0)} = {\rm diag}(1, 1)$,
$$(f_{ss'}) = \tfrac12\sum_{\mu=0}^{3} f_{(\mu)}\,\sigma^{(\mu)}, \qquad (307)$$
then the 4 real distribution functions $f_{(\mu)}$ correspond to conventional Stokes
parameters, except that they are defined for distribution functions, as described
in section 3:
$$f_{(0)} = f_t, \quad f_{(1)} = f_U, \quad f_{(2)} = f_V, \quad f_{(3)} = f_Q. \qquad (308)$$
By using polarization vectors one can define an object in the combined
position and momentum space, f , which has properties similar to a spatial
tensor of rank two for fixed momentum. Consider a photon travelling in the
direction q̂ and two polarization vectors ε1,2 perpendicular to q̂ and to each
other. The εA will be functions of q̂ and possibly of x or k. To make a tensor
out of the 2 × 2 matrix, f(µ) σ (µ) /2, we use the tensor product basis
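$$E_t = \tfrac12(\varepsilon_1\otimes\varepsilon_1 + \varepsilon_2\otimes\varepsilon_2), \quad E_Q = \tfrac12(\varepsilon_1\otimes\varepsilon_1 - \varepsilon_2\otimes\varepsilon_2),$$
$$E_U = \tfrac12(\varepsilon_1\otimes\varepsilon_2 + \varepsilon_2\otimes\varepsilon_1), \quad E_V = -\tfrac12 i(\varepsilon_1\otimes\varepsilon_2 - \varepsilon_2\otimes\varepsilon_1),$$
$$\text{i.e.,}\quad E_{(\mu)} = \tfrac12\sum_{A,B=1,2}\sigma^{AB}_{(\mu)}\,\varepsilon_A\otimes\varepsilon_B. \qquad (309)$$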
For observations, the basis $\varepsilon_A(\hat q, x, \tau)$ would, for given $\hat q$, be defined with some
axis convention on the celestial sphere; the tensor $f = \sum_{\mu=0}^{3} f_{(\mu)}E_{(\mu)}$ is independent
of polarization basis orientation, with $f_{(\mu)}$ transforming under rotation
of the polarization basis in a complementary way to $E_{(\mu)}$. It is useful to also
use a polarization basis whose orientation is defined with respect to eigenmode
variables in the expansion eq. (147). For the flat case, a wavenumber $k$ can be
used to label the eigenfunctions, $\varepsilon_A(\hat q, \hat k, \tau)$ can be a function of $\hat k$, independent
of $x$, and a mode expansion can be made:
$$f =_w \sum_{Mk} f^{(M)}_{(\mu)}\,E_{(\mu)}\,Q_{kM}(x, \tau)\,a_{kM} + {\rm cc}. \qquad (310)$$
We can also expand f in the basis {eI } of the time hypersurfaces, f = f IJ eI ⊗eJ ,
which makes the spatial tensor aspect manifest. Just as (3) gij is expanded in
scalar, vector, and tensor modes, so can $f_{ij}$. For scalar perturbations, we
project onto $\delta_{ij}$ and $\hat k_i\hat k_j - \tfrac13\delta_{ij}$. As we show below, we can choose $\varepsilon_2 \perp \hat k$ as
well as $\perp \hat q$, which implies $f^{(S)}_U = 0$ and $f^{(S)}_V = 0$. For tensor perturbations, we
project using $E^{(T)}_{ij}$ of eq. (170):
(T )
X X X (T ) E
(T )
· E(µ) (T ) ik·x
fij =w fe(µ) E e ak(T ) + cc .
E(µ) · E(µ) ij
(µ)=t,Q,U,V =+,× k
(311)
The quantities $\tilde f^{(T)}_{(\mu)}$ or equivalently $\tilde\Delta^{(T)}_{(\mu)}$ are the natural mode functions for
tensor perturbations. These are evaluated in section C.6.
Thomson scattering is conservative, hence in the comoving frame of the
baryons, the photon energy out equals the photon energy in. The scattering
function that enters the Boltzmann transport equation can then be written as
$$S_{s_2s_1}(x, \tau, q, \hat q) = -\sum_{s_1's_2'}\Big[\int d\Omega_{\hat q'}\,R_{s_2s_1;s_2's_1'}(x, \tau, q; \hat q\to\hat q')\,f_{s_2s_1}(x, \tau, q, \hat q) - \int d\Omega_{\hat q'}\,R_{s_2's_1';s_2s_1}(x, \tau, q; \hat q'\to\hat q)\,f_{s_2's_1'}(x, \tau, q, \hat q')\Big].$$
Denote the total scattering rate (per unit conformal time) by
$$\tau_C^{-1} \equiv n_e\sigma_T a = \sum_{s_1's_2'}\int d\Omega_{\hat q'}\,R_{s_2s_1;s_2's_1'}(x, \tau, q; \hat q\to\hat q') \qquad (312)$$
and define a phase function by
$$P_{s_2's_1';s_2s_1}(x, \tau; \hat q'\to\hat q) \equiv 4\pi\tau_C\,R_{s_2's_1';s_2s_1}(x, \tau, q; \hat q'\to\hat q) \qquad (313)$$
(independent of the magnitude of $q$ for Thomson scattering). Instead of
proceeding with the polarization matrix language, let us go over into the Stokes
parameter language, noting that we can expand any symmetric matrix in spin
space in terms of $\sigma^{(\mu)}$,
$$S_{s_2s_1} = S_{(\mu)}\frac{\sigma^{(\mu)}}{2}; \qquad (314)$$
in particular, we can expand the source function. The phase function $P^{(\nu)}_{(\mu)}$
maps the distribution from a $\hat q'$-orthogonal system to a $\hat q$-orthogonal system.
We can then write

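$$S_{(\mu)}(x, \tau, q, \hat q) = \tau_C^{-1}\Big[-f_{(\mu)}(x, \tau, q, \hat q) + \int\frac{d\Omega_{\hat q'}}{4\pi}\,P^{(\nu)}_{(\mu)}(x, \tau, \hat q'\to\hat q)\,f_{(\nu)}(x, \tau, q, \hat q')\Big]. \qquad (315)$$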
Our goal is therefore to calculate $P^{(\nu)}_{(\mu)}$, or equivalently the spatial tensor map
$$P = \sum_{\mu,\nu=0}^{3} P^{(\nu)}_{(\mu)}\,E_{(\mu)}(\hat q, x, \tau)\otimes E_{(\nu)}(\hat q, x, \tau) \qquad (316)$$
expressed in terms of sky orientation (or via mode expansions).

The calculation of $P^{(\nu)}_{(\mu)}$ is done through a sequence of “rotations” of the
Stokes parameters which progressively take us: from (1) a linear polarization
basis $E'_{1,2}$ perpendicular to the photon direction $\hat q'$ before the scattering and
referred to the sky reference frame; through (2) a linear polarization basis $\varepsilon'_{1,2}$
in a plane perpendicular to $\hat q'$, which, for convenience, also has $\varepsilon'_2$ perpendicular
to $\hat k$; into (3) a polarization basis $e'_{1,2}$ in a plane $\perp \hat q'$, and also $e'_2 \perp \hat q$, a
natural basis for action on the distribution function by the scattering phase
matrix, with the result re-expressed in terms of a new polarization basis $e_{1,2}$
spanning a plane $\perp \hat q$ with $e_2 \perp \hat q'$; through (4) a linear polarization basis
$\varepsilon_{1,2}$ in a plane perpendicular to $\hat q$ with $\varepsilon_2$ perpendicular to $\hat k$ as well; and,
finally, into (5) a linear polarization basis $E_{1,2}$ in a plane perpendicular to $\hat q$
referred to the sky reference frame. The transformations are all designed to
get the distribution function into the correct form for step (3), in which the
familiar action of Thomson scattering of light linearly polarized in a direction
perpendicular and parallel to the scattering plane (that spanned by $\hat q'$ and $\hat q$)
can be performed. The bases in steps (2) and (4) are suited to the free transport
between scatters, since they are a natural polarization basis for the independent
modes of the system. The rotations (1), (2), (3) leave $f(\hat q', x, \tau)$ invariant and
the rotations (3), (4), (5) leave $f(\hat q, x, \tau)$ invariant, with the entire action of
the scatter expressible as the transformation step from $\hat q'$ to $\hat q$, in terms of the
mapping $P(\hat q'\to\hat q)$:
$$P(\hat q'\to\hat q) = \tfrac34\,(1 + (\hat q'\cdot\hat q)^2)\,\{(e_2(\hat q)\otimes e_2(\hat q))\otimes(e'_2(\hat q')\otimes e'_2(\hat q'))\}$$
$$\qquad + \tfrac34\,2\,\hat q'\cdot\hat q\,\tfrac12\,\{(e_1(\hat q)\otimes e_2(\hat q))\otimes(e'_2(\hat q')\otimes e'_1(\hat q')) + (e_2(\hat q)\otimes e_1(\hat q))\otimes(e'_1(\hat q')\otimes e'_2(\hat q'))\}. \qquad (317)$$
This relatively simple expression demonstrates the utility of the $f$ approach,
although for it to be usable $e'_A$ and $e_A$ must be expressed in terms of the
mode-bases $\varepsilon'_A$ and $\varepsilon_A$ and sky-bases $E'_A$ and $E_A$, which is where the work
lies. Chandrasekhar [200] develops the Stokes parameter equations in his section
on Rayleigh scattering, which has the same angular scattering dependence
as Thomson scattering, by doing these rotations, but using a more classical
language and approach.
The full sequence of operations can be expressed in terms of a total phase
tensor
$$P^{(\nu)}_{(\mu)} = [P_5]^{(\alpha)}_{(\mu)}\,[P_4]^{(\beta)}_{(\alpha)}\,[P_{\rm scat}]^{(\gamma)}_{(\beta)}\,[P_2]^{(\delta)}_{(\gamma)}\,[P_1]^{(\nu)}_{(\delta)}, \qquad (318)$$
acting on the distribution function $f_{(\nu)}(x, \tau, \hat q')$. However, since the computational
method to solve the transport equation uses the modes of the system,
$f_{(\delta)}(k, \tau, \hat q')$, we actually do not need to do step (1).
In linear perturbation theory for an Einstein–de Sitter Universe, the modes
are plane waves, labelled by the comoving wavevector $k$. A linear polarization
basis in which $\varepsilon'_2$ is perpendicular to $\hat k$ as well as $\hat q'$ is ([88], Appendix 5)
$$\varepsilon'_2 = \frac{\hat k\times\hat q'}{(1 - (\hat k\cdot\hat q')^2)^{1/2}}, \quad \varepsilon'_1 = \hat q'\times\varepsilon'_2 = \frac{\hat k - (\hat k\cdot\hat q')\,\hat q'}{(1 - (\hat k\cdot\hat q')^2)^{1/2}}. \qquad (319)$$
Thus $\{\varepsilon'_1, \varepsilon'_2, \hat q'\}$ is an orthonormal triad for the incoming photon state. For
given $\hat k$ and $\hat q'$, the incoming Stokes parameters are in this coordinate system.
Similarly, $\{\varepsilon_1, \varepsilon_2, \hat q\}$ with $\hat q$ replacing $\hat q'$ in eq. (319) is an appropriate triad for
the outgoing (scattered) photon state, but after the polarizing action of the
scatter is taken into account.
In the scattering frame, we define
$$e'_1 = \frac{\hat q - \hat q\cdot\hat q'\,\hat q'}{(1 - (\hat q\cdot\hat q')^2)^{1/2}}, \quad e'_2 = e'_1\times\hat q' = \frac{\hat q\times\hat q'}{(1 - (\hat q\cdot\hat q')^2)^{1/2}}. \qquad (320)$$
Thus, $e'_1 = \hat q'\times e'_2$ and, very importantly, $e'_2$ is $\perp \hat q$, i.e., is perpendicular to the
scattering plane. By interchanging $\hat q'$ and $\hat q$ we get the outgoing triad $\{e_1, e_2, \hat q\}$,
with polarization basis differing from the incoming one by sign changes:
$$e_2 \equiv -e'_2, \quad e_1 = \hat q\times e_2. \qquad (321)$$
The angular dependence of Thomson scattering on the Stokes parameters is
described by the phase tensor
$$[P_{\rm scat}]^{(0)}_{(0)} = [P_{\rm scat}]^{(3)}_{(3)} = \tfrac32\,(1 + (\hat q\cdot\hat q')^2)/2,$$
$$[P_{\rm scat}]^{(0)}_{(3)} = [P_{\rm scat}]^{(3)}_{(0)} = -\tfrac32\,(1 - (\hat q\cdot\hat q')^2)/2,$$
$$[P_{\rm scat}]^{(1)}_{(1)} = [P_{\rm scat}]^{(2)}_{(2)} = \tfrac32\,\hat q\cdot\hat q'. \qquad (322)$$
The rest of the components vanish, thanks to the particular $e'_1$, $e'_2$ basis choice
with $e'_2$ perpendicular to the scattering plane. This gives eq. (317).
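But we wish to use the incoming mode-basis, $\varepsilon'_{1,2}$, and outgoing basis $\varepsilon_{1,2}$.
A rotation about the direction $\hat q'$ by an angle $\psi'$ takes $\varepsilon'_{1,2}$ into $e'_{1,2}$, where
$$\cos\psi' = \varepsilon'_2\cdot e'_2 = \varepsilon'_1\cdot e'_1 = \frac{\hat k\cdot\hat q - \hat k\cdot\hat q'\,\hat q\cdot\hat q'}{(1 - (\hat q\cdot\hat q')^2)^{1/2}\,(1 - (\hat k\cdot\hat q')^2)^{1/2}}. \qquad (323)$$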
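The bases of eqs. (319) and (320) and the angle formula of eq. (323) lend themselves to a quick numerical check. The following Python sketch builds both incoming bases for arbitrarily chosen directions and compares the two dot products with the closed form; the helper names and the sample angles are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def scaled(a, s):
    return [s * x for x in a]

def minus(a, b):
    return [x - y for x, y in zip(a, b)]

def unit(theta, phi):
    return [math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta)]

def mode_basis(k_hat, qp_hat):
    """Eq. (319): (eps1', eps2') with eps2' perpendicular to both k_hat and q_hat'."""
    inv = 1.0 / math.sqrt(1.0 - dot(k_hat, qp_hat) ** 2)
    eps2 = scaled(cross(k_hat, qp_hat), inv)
    return cross(qp_hat, eps2), eps2

def scattering_basis(q_hat, qp_hat):
    """Eq. (320): (e1', e2') with e2' perpendicular to the scattering plane."""
    c = dot(q_hat, qp_hat)
    e1 = scaled(minus(q_hat, scaled(qp_hat, c)), 1.0 / math.sqrt(1.0 - c * c))
    return e1, cross(e1, qp_hat)

def cos_psi_prime(k_hat, q_hat, qp_hat):
    """Right-hand side of eq. (323)."""
    c = dot(q_hat, qp_hat)
    kq, kqp = dot(k_hat, q_hat), dot(k_hat, qp_hat)
    return (kq - kqp * c) / (math.sqrt(1.0 - c * c) * math.sqrt(1.0 - kqp * kqp))

k_hat = [0.0, 0.0, 1.0]
q_hat = unit(0.9, 0.4)
qp_hat = unit(2.1, -1.3)
eps1p, eps2p = mode_basis(k_hat, qp_hat)
e1p, e2p = scattering_basis(q_hat, qp_hat)
print(dot(eps2p, e2p), dot(eps1p, e1p), cos_psi_prime(k_hat, q_hat, qp_hat))
```

The three printed numbers should coincide, and e2' should be orthogonal to both photon directions.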
The effect of the basis change on $f_{(\mu)}(k, \hat q', \tau)$ is encoded in the action of the
$2\times2$ rotation matrix
$$e^{i\psi'\sigma^{(2)}} = \cos(\psi')\,\sigma^{(0)} + \sin(\psi')\,i\sigma^{(2)} = \begin{pmatrix}\cos(\psi') & \sin(\psi')\\ -\sin(\psi') & \cos(\psi')\end{pmatrix} \qquad (324)$$
acting on the left of the polarization matrix and its inverse (adjoint) acting on
the right:
$$\tfrac12\big([P_2]^{(\mu)}_{(\beta)}f_{(\mu)}\big)\sigma^{(\beta)} = e^{i\psi'\sigma^{(2)}}\,\tfrac12 f_{(\mu)}(k, \hat q', \tau)\,\sigma^{(\mu)}\,e^{-i\psi'\sigma^{(2)}},$$
$$[P_2]^{(0)}_{(0)} = [P_2]^{(2)}_{(2)} = 1,$$
$$[P_2]^{(3)}_{(3)} = [P_2]^{(1)}_{(1)} = \cos(2\psi'), \quad [P_2]^{(3)}_{(1)} = -[P_2]^{(1)}_{(3)} = \sin(2\psi').$$
The rest of the $[P_2]^{(\mu)}_{(\beta)}$ vanish. The rotation by angle $\psi$ from the triad $\{e_1, e_2, \hat q\}$
to the triad $\{\varepsilon_1, \varepsilon_2, \hat q\}$ gives a phase tensor $[P_4]^{(\nu)}_{(\mu)}$ identical in form to $[P_2]^{(\nu)}_{(\mu)}$
if we replace $\psi'$ by $-\psi$, where $\cos\psi$ is similar to eq. (323) with $\hat q$ and $\hat q'$
interchanged.
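To see how the rotation matrix of eq. (324) acts on the Stokes parameters, the Python sketch below conjugates the polarization matrix of eq. (307) and reads the rotated parameters back off. It assumes the ordering of eq. (308), (f_t, f_U, f_V, f_Q), and the helper names are illustrative; the overall sign of the Q and U mixing depends on the sense chosen for the rotation.

```python
import math

# Pauli basis ordered as in eq. (308): index 0 -> t, 1 -> U, 2 -> V, 3 -> Q.
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rotation(psi):
    """The real 2x2 matrix of eq. (324)."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, s], [-s, c]]

def polarization_matrix(stokes):
    """Eq. (307): (1/2) sum_mu f_(mu) sigma^(mu), with stokes = (f_t, f_U, f_V, f_Q)."""
    return [[0.5 * sum(f * SIGMA[m][i][j] for m, f in enumerate(stokes)) for j in range(2)]
            for i in range(2)]

def stokes_from_matrix(mat):
    """Invert eq. (307) using Tr(sigma^(mu) sigma^(nu)) = 2 delta^(mu nu)."""
    out = []
    for m in range(4):
        prod = matmul(mat, SIGMA[m])
        out.append(complex(prod[0][0] + prod[1][1]).real)
    return out

def rotate_stokes(stokes, psi):
    """Rotation acting on the left, its adjoint on the right."""
    rot = rotation(psi)
    return stokes_from_matrix(matmul(matmul(rot, polarization_matrix(stokes)), adjoint(rot)))

f_t, f_u, f_v, f_q = rotate_stokes((1.0, 0.2, 0.05, 0.3), 0.4)
print(f_t, f_u, f_v, f_q)
```

With this convention f_t and f_V are unchanged, while f_Q and f_U mix through cos(2ψ') and sin(2ψ'), which is the structure of the nonvanishing [P_2] components.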
We now have all of the ingredients to get $P^{(\delta)}_{(\alpha)} = [P_4]^{(\beta)}_{(\alpha)}[P_{\rm scat}]^{(\gamma)}_{(\beta)}[P_2]^{(\delta)}_{(\gamma)}$. To
make the form useful, we need to express $\hat q$, $\hat q'$, $\hat k$ in some coordinate basis. Let us
choose polar coordinates with $\hat k$ the pole and $\hat q = (\theta, \phi)$ and $\hat q' = (\theta', \phi')$. $P^{(\nu)}_{(\mu)}$ is
then a function of $\mu = \hat k\cdot\hat q$, $\mu' = \hat k\cdot\hat q'$, and $\hat q\cdot\hat q' = \sqrt{1-\mu^2}\sqrt{1-(\mu')^2}\cos(\phi - \phi') + \mu\mu'$.
The phase tensor can be expanded in terms of $\cos(m(\phi - \phi'))$,
$\sin(m(\phi - \phi'))$, where $m = 0, 1, 2$ terms appear. Thus we have a sequence of
products such as $\cos(m\phi)\cos(m\phi')$ and $\cos(m\phi)\sin(m\phi')$. The conventional
approach is to expand the incoming distribution function (or equivalently the
temperature fluctuation $\Delta_{(\mu)}$) in $\cos(m\phi')$ and $\sin(m\phi')$ terms and the outgoing
distribution in $\cos(m\phi)$ and $\sin(m\phi)$ terms; i.e., into scalar and tensor terms,
and vector terms denoted by “vec” which we ignore:
$$\begin{aligned}
\Delta_t(\hat q, k) &= \Delta^{(S)}_t(\mu) + {\rm vec} + \Delta^{(T+)}_t(\mu)\cos(2\phi) + \Delta^{(T\times)}_t(\mu)\sin(2\phi),\\
\Delta_Q(\hat q, k) &= \Delta^{(S)}_Q(\mu) + {\rm vec} + \Delta^{(T+)}_Q(\mu)\cos(2\phi) + \Delta^{(T\times)}_Q(\mu)\sin(2\phi),\\
\Delta_U(\hat q, k) &= \Delta^{(S)}_U(\mu) + {\rm vec} - \Delta^{(T+)}_U(\mu)\sin(2\phi) + \Delta^{(T\times)}_U(\mu)\cos(2\phi),\\
\Delta_V(\hat q, k) &= \Delta^{(S)}_V(\mu) + {\rm vec} - \Delta^{(T+)}_V(\mu)\sin(2\phi) + \Delta^{(T\times)}_V(\mu)\cos(2\phi).
\end{aligned} \qquad (325)$$
In the same way, we can also expand the source function $S_{(\mu)}$ for $f_{(\mu)}$ – or $G_{(\mu)}$
for $\Delta_{(\mu)}$, defining $G^{(S)}_{(\mu)}(\mu)$, $G^{(Vc,s)}_{(\mu)}(\mu)$, $G^{(T+,\times)}_{(\mu)}(\mu)$. The reason for the different
sin and cos combinations for $\Delta^{(T+)}_U$, $\Delta^{(T+)}_Q$ is that the phase tensor expansion
couples $+$ to $+$ and $\times$ to $\times$ but not $+$ to $\times$: i.e., the modes are independent.
Using the 3-tensor $\Delta$ and the 3-tensor map $P$, we do not go through this
intermediate step of defining $\Delta^{(T\{+,\times\})}_{(\mu)}$, but rather go directly to variables
$\tilde\Delta^{(T\{+,\times\})}_{(\mu)}$, which have a further $\mu$ dependence removed from them. For $\Delta^{(S)}_{(\mu)}$
there is no difference.
To calculate the polarization a detector would observe, we must choose a
fixed frame on the sky, say Galactocentric coordinates. Since −q̂ points outward
in the radial direction, the two polarization vectors on the sky E1 , E2 = E1 × q̂
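form an orthonormal basis for the celestial sphere. An angle $\psi_k$ defines the
rotation to $\{\varepsilon_1, \varepsilon_2\}$. It is the angle between the fixed $E_1$ and $k_\perp$, where
$k_\perp = k - k\cdot\hat q\,\hat q$ is, as we look out upon a specific spot on the celestial sphere, the
projection of $k$ onto it. The phase tensor $[P_5]^{(\nu)}_{(\mu)}$ is identical in form to $[P_4]^{(\nu)}_{(\mu)}$
with $\psi$ replaced by $\psi_k$, which acts on $\Delta_{(\mu)}(k, \hat q, \tau)$ to give
$$\begin{aligned}
\Delta_t(\hat q, {\rm here}, {\rm now}) &= \int\frac{d^3k}{(2\pi)^3}\,\Delta_t(k, \hat q, \tau_0),\\
\Delta_Q(\hat q, {\rm here}, {\rm now}) &= \int\frac{d^3k}{(2\pi)^3}\,\big(\Delta_Q(k, \hat q, \tau_0)\cos(2\psi_k) + \Delta_U\sin(2\psi_k)\big),\\
\Delta_U(\hat q, {\rm here}, {\rm now}) &= \int\frac{d^3k}{(2\pi)^3}\,\big(-\Delta_Q(k, \hat q, \tau_0)\sin(2\psi_k) + \Delta_U\cos(2\psi_k)\big),\\
\Delta_V(\hat q, {\rm here}, {\rm now}) &= \int\frac{d^3k}{(2\pi)^3}\,\Delta_V(k, \hat q, \tau_0),
\end{aligned}$$
$$\cos(2\psi_k) = -E_{2i}E_{2j}\,\frac{(\hat k_i - \hat k\cdot\hat q\,\hat q_i)(\hat k_j - \hat k\cdot\hat q\,\hat q_j) - \epsilon_{imn}\epsilon_{jrs}\,\hat q_m\hat q_r\hat k_n\hat k_s}{1 - (\hat k\cdot\hat q)^2},$$
$$\sin(2\psi_k) = -2E_{2i}E_{2j}\,\frac{(\hat k_j - \hat k\cdot\hat q\,\hat q_j)\,\epsilon_{imn}\,\hat q_m\hat k_n}{1 - (\hat k\cdot\hat q)^2}, \qquad (326)$$
where the summation convention on repeated indices has been used and $\epsilon_{imn}$
is the completely antisymmetric Levi-Civita symbol. (The factor of 2 in
$\sin(2\psi_k)$ follows from $\sin 2\psi = 2\sin\psi\cos\psi$, since the two projections of $E_2$
have magnitudes $|\sin\psi_k|$ and $|\cos\psi_k|$.) Because of the $k$ dependence
of the phases implicit in $\Delta_Q(\hat q, k, \tau_0)$, etc. we cannot do the $\psi_k$ integration
in eq. (326). A strategy for making small angle polarization maps using
this formula and knowledge of the polarization power spectrum is described in
[88].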
C.2 Scalar perturbation source terms

C.2.1 Thomson source functions
For the scalar components of the phase function, we have
$$[P_S]^{(0)}_{(0)} = \tfrac38\,[3 - \mu^2 - (\mu')^2 + 3\mu^2(\mu')^2],$$
$$[P_S]^{(3)}_{(0)} = \tfrac38\,[3\mu^2 - 1][(\mu')^2 - 1], \quad [P_S]^{(0)}_{(3)} = \tfrac38\,[\mu^2 - 1][3(\mu')^2 - 1],$$
$$[P_S]^{(3)}_{(3)} = \tfrac38\,[3 - 3\mu^2 - 3(\mu')^2 + 3\mu^2(\mu')^2], \quad [P_S]^{(2)}_{(2)} = \tfrac34\,\mu\mu',$$
with the rest vanishing. In terms of an expansion of $\Delta^{(S)}_t$ and $\Delta^{(S)}_Q$ in angular
moments, $\Delta^{(S)}_{t\ell}$ and $\Delta^{(S)}_{Q\ell}$, with respect to Legendre polynomials [88]
$$\Delta^{(S)}_{t,Q,U,V}(\hat q, k, \tau) = \sum_\ell (2\ell + 1)\,(-i)^\ell\,\Delta^{(S)}_{t,Q,U,V\,\ell}(k, \tau)\,P_\ell(\hat q\cdot\hat k), \qquad (327)$$
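we have
$$\tau_C G^{(S)}_{tC} = -\Delta^{(S)}_t + \Delta^{(S)}_{t0} - \tfrac12 P_2(\hat k\cdot\hat q)\,(\Delta^{(S)}_{t2} + \Delta^{(S)}_{Q2} + \Delta^{(S)}_{Q0}), \qquad (328)$$
$$\tau_C G^{(S)}_{QC} = -\Delta^{(S)}_Q + \tfrac12\,(1 - P_2(\hat k\cdot\hat q))\,(\Delta^{(S)}_{Q0} + \Delta^{(S)}_{t2} + \Delta^{(S)}_{Q2}), \qquad (329)$$
$$\tau_C G^{(S)}_{UC} = 0, \qquad (330)$$
$$\tau_C G^{(S)}_{VC} = -\Delta^{(S)}_V + \tfrac34\,\hat k\cdot\hat q\,\Delta^{(S)}_{V1}. \qquad (331)$$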
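The step from the scattering kernel to the temperature source term can be illustrated numerically. The Python sketch below checks two things under stated assumptions: that the (0)(0) scalar phase function quoted above is the azimuthal average of (3/4)(1 + cos²) for the scattering angle, and that integrating it against a temperature pattern containing only ℓ = 0 and ℓ = 2 (with polarization set to zero) returns Δ_t0 − ½P₂(μ)Δ_t2, the temperature part of eq. (328). Function names and test values are illustrative.

```python
import math

def legendre_p2(x):
    return 0.5 * (3.0 * x * x - 1.0)

def scattering_kernel_tt(cos_angle):
    """(3/4)(1 + cos^2 theta) for scattering angle theta."""
    return 0.75 * (1.0 + cos_angle * cos_angle)

def azimuthal_average(mu, mu_p, n_phi=64):
    """Average over phi - phi' with the midpoint rule (exact here, a degree-2 trig polynomial)."""
    s = math.sqrt(1.0 - mu * mu) * math.sqrt(1.0 - mu_p * mu_p)
    total = 0.0
    for j in range(n_phi):
        dphi = 2.0 * math.pi * (j + 0.5) / n_phi
        total += scattering_kernel_tt(s * math.cos(dphi) + mu * mu_p)
    return total / n_phi

def scalar_phase_tt(mu, mu_p):
    """The (0)(0) scalar phase function quoted in section C.2.1."""
    return 0.375 * (3.0 - mu * mu - mu_p * mu_p + 3.0 * mu * mu * mu_p * mu_p)

def gain_term(mu, delta_t0, delta_t2, n=2000):
    """(1/2) int dmu' P_S(mu, mu') Delta_t(mu') by Simpson's rule (n must be even)."""
    h = 2.0 / n
    total = 0.0
    for j in range(n + 1):
        mu_p = -1.0 + j * h
        weight = 1.0 if j in (0, n) else (4.0 if j % 2 else 2.0)
        # Eq. (327) restricted to l = 0, 2: (2l+1)(-i)^l = 1 and -5.
        delta = delta_t0 - 5.0 * delta_t2 * legendre_p2(mu_p)
        total += weight * scalar_phase_tt(mu, mu_p) * delta
    return 0.5 * total * h / 3.0

print(azimuthal_average(0.3, -0.7), scalar_phase_tt(0.3, -0.7))
print(gain_term(0.3, 1.0, 0.2), 1.0 - 0.5 * legendre_p2(0.3) * 0.2)
```

Each printed pair should agree to quadrature accuracy.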
This equation was derived in the comoving baryon gauge, but the transformation
of $\Delta^{(S)}_t - \Delta^{(S)}_{t0}$ to a frame in which the baryons are moving with velocity
$v^{(S)}_B$ can be done using eq. (296), which only modifies eq. (328):
$$\tau_C G^{(S)}_{tC} = -\Delta^{(S)}_t + \Delta^{(S)}_{t0} + \hat q\cdot v^{(S)}_B - \tfrac12 P_2(\mu)\,(\Delta^{(S)}_{t2} + \Delta^{(S)}_{Q2} + \Delta^{(S)}_{Q0}). \qquad (332)$$
In eq. (332), the source term proportional to $\Delta^{(S)}_{t2}$ arises because of the angular
dependence of Thomson scattering. This quadrupole anisotropy is also responsible
for the generation of polarization. The Sachs–Wolfe source term $G^{(S)}_{tSW}$ is
given by eq. (302); $G^{(S)}_{Q,U,V\,SW}$ all vanish. Since $\Delta^{(S)}_V = 0$ in the early Universe
and there is no coupling through $G^{(S)}_{VC}$ nor through gravity to excite it, it remains
zero and an evolution equation for $V$ is unnecessary. Although $\Delta^{(S)}_U(\hat q, k, \tau)$
also vanishes, hence the power spectrum $dC^{(S)}_{U\ell}/d\ln k = 0$, $\Delta^{(S)}_U(\hat q, x, \tau)$ does
not vanish since it appears when one rotates from the polarization basis fixed
by $\hat q$, $\hat k$ to one defined relative to sky coordinates: $\Delta^{(S)}_U(\hat q, x, \tau)$ is a random field
determined from the nonzero power spectrum $\tfrac12\,dC^{(S)}_{Q\ell}/d\ln k$. (See eq. (326).)
C.2.2 The moment equations for photons

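The moment equations are explicitly (for flat universes, see section C.4 for
nonflat modifications):
$$\begin{aligned}
\ell = 0:\quad & \dot\Delta^{(S)}_{t0} + k\Delta^{(S)}_{t1} = -\dot\varphi - \tfrac13 k^2\bar a^{-1}\Psi_\sigma,\\
\ell = 1:\quad & \dot\Delta^{(S)}_{t1} - k\,(\tfrac13\Delta^{(S)}_{t0} - \tfrac23\Delta^{(S)}_{t2}) = k\,\tfrac13\nu - n_e\sigma_T a\,(\Delta^{(S)}_{t1} - \tfrac13\hat k\cdot i v^{(S)}_B),\\
\ell = 2:\quad & \dot\Delta^{(S)}_{t2} - k\,(\tfrac25\Delta^{(S)}_{t1} - \tfrac35\Delta^{(S)}_{t3}) = \tfrac{2}{15}k^2\bar a^{-1}\Psi_\sigma - n_e\sigma_T a\,\tfrac{1}{10}\,(9\Delta^{(S)}_{t2} - \Delta^{(S)}_{Q0} - \Delta^{(S)}_{Q2}),\\
\ell \ge 3:\quad & \dot\Delta^{(S)}_{t\ell} - k\Big(\frac{\ell}{2\ell+1}\Delta^{(S)}_{t(\ell-1)} - \frac{\ell+1}{2\ell+1}\Delta^{(S)}_{t(\ell+1)}\Big) = -n_e\sigma_T a\,\Delta^{(S)}_{t\ell}.
\end{aligned} \qquad (333)$$
The moment equations for the polarization are:
$$\begin{aligned}
\ell = 0:\quad & \dot\Delta^{(S)}_{Q0} + k\Delta^{(S)}_{Q1} = -n_e\sigma_T a\,\tfrac12\,(\Delta^{(S)}_{Q0} - \Delta^{(S)}_{Q2} - \Delta^{(S)}_{t2}),\\
\ell = 1:\quad & \dot\Delta^{(S)}_{Q1} - k\,(\tfrac13\Delta^{(S)}_{Q0} - \tfrac23\Delta^{(S)}_{Q2}) = -n_e\sigma_T a\,\Delta^{(S)}_{Q1},\\
\ell = 2:\quad & \dot\Delta^{(S)}_{Q2} - k\,(\tfrac25\Delta^{(S)}_{Q1} - \tfrac35\Delta^{(S)}_{Q3}) = -n_e\sigma_T a\,\tfrac{1}{10}\,(9\Delta^{(S)}_{Q2} - \Delta^{(S)}_{Q0} - \Delta^{(S)}_{t2}),\\
\ell \ge 3:\quad & \dot\Delta^{(S)}_{Q\ell} - k\Big(\frac{\ell}{2\ell+1}\Delta^{(S)}_{Q(\ell-1)} - \frac{\ell+1}{2\ell+1}\Delta^{(S)}_{Q(\ell+1)}\Big) = -n_e\sigma_T a\,\Delta^{(S)}_{Q\ell}.
\end{aligned} \qquad (334)$$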
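The structure of the hierarchies (333) and (334) can be made concrete with a small right-hand-side routine. The Python sketch below is only an illustration: it truncates the hierarchy by setting the moment above the last one to zero (a crude closure chosen here for brevity, not a scheme endorsed by the text), treats all moments as real, and takes the baryon velocity term (1/3) k̂·i v_B as a single real input. All argument names are hypothetical.

```python
def photon_hierarchy_rhs(delta_t, delta_q, k, kappa_dot, phi_dot, nu,
                         k2_psi_sigma, baryon_term):
    """Time derivatives of the scalar temperature and polarization moments.

    delta_t, delta_q : lists of moments for l = 0..l_max (l_max >= 2)
    kappa_dot        : n_e sigma_T a, the scattering rate per unit conformal time
    k2_psi_sigma     : k^2 abar^-1 Psi_sigma
    baryon_term      : (1/3) k_hat . i v_B, supplied as a real number (assumption)
    """
    l_max = len(delta_t) - 1

    def moment(arr, l):
        # Crude closure: moments outside 0..l_max are taken to vanish.
        return arr[l] if 0 <= l <= l_max else 0.0

    def streaming(arr, l):
        return k * (l / (2 * l + 1) * moment(arr, l - 1)
                    - (l + 1) / (2 * l + 1) * moment(arr, l + 1))

    ddt, ddq = [], []
    for l in range(l_max + 1):
        if l == 0:
            gravity = -phi_dot - k2_psi_sigma / 3.0
            scat_t = 0.0
            scat_q = -kappa_dot * 0.5 * (delta_q[0] - delta_q[2] - delta_t[2])
        elif l == 1:
            gravity = k * nu / 3.0
            scat_t = -kappa_dot * (delta_t[1] - baryon_term)
            scat_q = -kappa_dot * delta_q[1]
        elif l == 2:
            gravity = 2.0 / 15.0 * k2_psi_sigma
            scat_t = -kappa_dot * 0.1 * (9.0 * delta_t[2] - delta_q[0] - delta_q[2])
            scat_q = -kappa_dot * 0.1 * (9.0 * delta_q[2] - delta_q[0] - delta_t[2])
        else:
            gravity = 0.0
            scat_t = -kappa_dot * delta_t[l]
            scat_q = -kappa_dot * delta_q[l]
        ddt.append(streaming(delta_t, l) + gravity + scat_t)
        ddq.append(streaming(delta_q, l) + scat_q)
    return ddt, ddq

rates_t, rates_q = photon_hierarchy_rhs(
    [1.0, 0.5, 0.2, 0.1], [0.3, 0.0, 0.1, 0.0],
    k=2.0, kappa_dot=10.0, phi_dot=0.1, nu=0.3, k2_psi_sigma=0.6, baryon_term=0.4)
print(rates_t, rates_q)
```

Such a routine would be handed to an ODE integrator together with the matter and metric equations; the truncation rule at l_max is the part most in need of replacement in any serious use.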
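We can rewrite the $\ell = 0, 1$ photon and neutrino moment equations using
photon–fluid potentials:
$$\tfrac14\dot\delta_\gamma + \dot\varphi + \tfrac13 k^2\bar a^{-1}(\Psi_{v,\gamma} + \Psi_\sigma) = 0, \qquad (335)$$
$$\bar a^{-1}\dot\Psi_{v,\gamma} - \bar H\Psi_{v,\gamma} - (\tfrac14\delta_\gamma + \nu) + \tfrac16 k^2\pi_{t,\gamma} = -n_e\sigma_T\Psi_{v,\gamma B}, \qquad (336)$$
$$\text{relative velocity potential:}\quad \Psi_{v,\gamma B} \equiv \Psi_{v,\gamma} - \Psi_{v,B}. \qquad (337)$$
The photon density, velocity potential, isotropic pressure ($(\delta p)_\gamma$) and anisotropic
stress ($\pi_{t,\gamma}$) perturbations are related to the low order moments by
$$\Delta^{(S)}_{t0} = \frac{\delta_\gamma}{4} = \frac{(\delta p)_\gamma}{4\bar p_\gamma}, \quad \Delta^{(S)}_{t1} = \frac{k\Psi_{v,\gamma}}{3\bar a}, \quad \Delta^{(S)}_{t2} = k^2\frac{\pi_{t,\gamma}}{12}. \qquad (338)$$
Under $(\Omega = \bar a)$-gauge transformations, $\delta_\gamma$ and $\Psi_{v,\gamma}$ can change, but $\pi_{t,\gamma}$, $\Delta^{(S)}_{t\ell}$
for $\ell \ge 2$, $\Delta^{(S)}_{t,\ell\ge2} \equiv \Delta^{(S)}_t - \Delta^{(S)}_{t0} + 3i\hat q\cdot\hat k\,\Delta^{(S)}_{t1}$, and $\Delta^{(S)}_{Q\ell}$ for $\ell \ge 0$ do not.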
C.2.3 CDM and baryon transport

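These are coupled to the equations for the other types of matter present. The
equations for cold dark matter and for the baryons are of the form of eqs. (280),
(282), with the proper Thomson scattering coupling included in the latter case.
$$\text{CDM:}\quad \tfrac13\dot\delta_{cdm} + \dot\varphi + \tfrac13 k^2\bar a^{-1}(\Psi_{v,cdm} + \Psi_\sigma) = 0, \qquad (339)$$
$$\bar a^{-1}\dot\Psi_{v,cdm} = \nu, \qquad (340)$$
$$\text{baryons:}\quad \tfrac13\dot\delta_B + \dot\varphi + \tfrac13 k^2\bar a^{-1}(\Psi_{v,B} + \Psi_\sigma) = 0, \qquad (341)$$
$$\bar a^{-1}\dot\Psi_{v,B} = \nu + n_e\sigma_T\bar a\,\frac43\frac{\rho_\gamma}{\rho_B}\,\bar a^{-1}\Psi_{v,\gamma B}. \qquad (342)$$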
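Overall momentum conservation of the photon–baryon fluid determines the
form of the Compton drag. The baryon pressure and anisotropic stress from
electron–ion viscosity have been neglected. In dealing with the combined photon
plus baryon system, as well as $\Psi_{v,\gamma B}$, it is useful to consider the equations
for the entropy per baryon and for the momentum current of the combined
$(\gamma + B)$-fluid:
$$\text{entropy per baryon:}\quad \delta_{s\gamma} \equiv \tfrac34\delta_\gamma - \delta_B, \qquad (343)$$
$$\dot\delta_{s\gamma} + k^2\bar a^{-1}\Psi_{v,\gamma B} = 0, \qquad (344)$$
$$\bar a^{-1}\dot\Psi_{v,\gamma B} - \bar H\,(y_B\Psi_{v,\gamma B} + \Psi_{v,(\gamma+B)}) = -(y_B\tau_C)^{-1}\,\bar a^{-1}\Psi_{v,\gamma B} + \tfrac14\,(\delta_\gamma - \tfrac23 k^2\pi_{t,\gamma}), \qquad (345)$$
$$(\gamma + B)\ \text{vel. potential:}\quad \Psi_{v,(\gamma+B)} = \frac{(\bar\rho_\gamma + \bar p_\gamma)\Psi_{v,\gamma} + \bar\rho_B\Psi_{v,B}}{\bar\rho_\gamma + \bar p_\gamma + \bar\rho_B}, \qquad (346)$$
$$\bar\rho_B\,\bar a^{-1}\partial_\tau\Big[\Big(\frac43\frac{\bar\rho_\gamma}{\bar\rho_B} + 1\Big)\Psi_{v,(\gamma+B)}\Big] = (\delta p)_\gamma - \tfrac23 k^2\bar p_\gamma\pi_{t,\gamma} + (\tfrac43\bar\rho_\gamma + \bar\rho_B)\,\nu,$$
$$\text{where}\quad y_B \equiv \frac{\bar\rho_B}{\bar\rho_\gamma + \bar p_\gamma + \bar\rho_B} = \Big(\frac43\frac{\bar\rho_\gamma}{\bar\rho_B} + 1\Big)^{-1}. \qquad (347)$$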
The entropy generation equation, eq. (344), takes the form of a conservation
law. The combined momentum current dissipates only because of the viscous
anisotropic stresses in $\pi_{t,\gamma} = \pi_{t,(\gamma+B)}$. In computations, we evolve $\Psi_{v,\gamma B}$ instead
of $\Psi_{v,\gamma}$. We shall see that $\Psi_{v,\gamma B}$ goes to zero linearly with $\tau_C$ at high
redshift (section C.3.1). For isocurvature baryon perturbations, $\delta_{s\gamma}$ is a useful
variable to solve for [216], while for isocurvature CDM perturbations, the manifestly
gauge invariant $\tfrac34\delta_\gamma - \delta_{cdm}$ and $\delta_B - \delta_{cdm}$ are useful for small $k$, but not
for large $k$ [215].
C.2.4 The transport of massless neutrinos

Massless neutrinos and any other freely-streaming extremely relativistic particles
(denoted by fer) obey the same transport equation as the photons,
except that there is of course no Thomson scattering source, and usually the spin
(or mixing of neutrino types as in oscillations) does not need to be treated.
If they are stable, then, once neutrino scattering is negligible when the temperature
drops below an MeV, only the Sachs–Wolfe source term needs to be
included. The initial conditions at time $\tau_i$ (assumed to be $\ll \tau_{eq}$, i.e., safely in
the er-dominated regime) for the neutrino distribution function are found by
expanding it to order $k\tau_i$, which implies only terms up to $\ell = 3$ are needed,
$\Delta_{fer} \approx \sum_{\ell=0}^{3}(2\ell + 1)(-i)^\ell\Delta_{fer,\ell}\,P_\ell(\hat q\cdot\hat k)$. For example, for adiabatic perturbations
in the synchronous gauge the initial conditions are
$$\begin{aligned}
\Delta_{fer}(k, \mu, \tau_i) &\approx \tfrac14\delta_{fer,i}\,(\beta + 3(1-\beta)\mu^2)\,(1 - ik\mu\tau_i/3),\\
\Delta_{fer,0}(k, \tau_i) &\approx \tfrac14\delta_{fer,i},\\
\bar a^{-1}\Psi_{v,fer}(k, \tau_i) &= 3\Delta_{fer,1}/k = \frac{9 - 4\beta}{60}\,\delta_{fer,i}\,\tau_i,\\
\Delta_{fer,2} &= \tfrac{1}{12}k^2\pi_{t,fer} = -\frac{1-\beta}{10}\,\delta_{fer,i}, \quad \Delta_{fer,3}/k = -\frac{1-\beta}{70}\,\delta_{fer,i}\,\tau_i,\\
\beta &\equiv \frac{5\bar\rho_\gamma(\tau_i) + 9\bar\rho_{fer,tot}(\tau_i)}{15\bar\rho_\gamma(\tau_i) + 19\bar\rho_{fer,tot}(\tau_i)}.
\end{aligned} \qquad (348)$$
Here $\bar\rho_{fer,tot}(\tau_i)$ is the total density of extremely relativistic particles that are
freely streaming at time $\tau_i$. For three relativistic neutrino species,
$\bar\rho_{fer,tot}(\tau_i)/\bar\rho_\gamma(\tau_i) = 3\times2\times(7/16)\times(4/11)^{4/3} = 0.6813$ and $\beta = 0.3984$.
These initial conditions contrast with those for the tightly coupled photons: with
$\tau_C(\tau_i) \approx 0$, we have $\Delta_{\gamma,t,0} = \tfrac14\delta_{\gamma,i}$, $\Psi_{v,\gamma B,i} \approx 0$,
$\bar a^{-1}\Psi_{v,\gamma} \approx \bar a^{-1}\Psi_{v,B} \approx \tfrac{1}{12}\delta_{\gamma,i}\tau_i$, and $\Delta_{\gamma,t,\ell} \approx 0$
for $\ell \ge 2$. As well, $\delta_{s\gamma} \approx 0$, hence $\delta_{B,i} \approx \tfrac34\delta_{\gamma,i}$. For CDM, the defining condition
of the synchronous gauge is $\Psi_{v,cdm} = 0$, and the density perturbation starts off
the same as that for baryons, $\delta_{cdm,i} = \tfrac34\delta_{\gamma,i}$. The initial conditions for the metric
variables in this gauge are $\dot\varphi = -\tfrac12\beta\,\delta_{\gamma,i}\,\tau_i^{-1}$, $k^2\bar a^{-1}\Psi_\sigma = -\tfrac32(1-\beta)\,\delta_{\gamma,i}\,\tau_i^{-1}$.
(The initial conditions for the relativistic neutrinos follow from expanding the
past-history integration, eq. (350) given below.)
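The numbers quoted for three neutrino species, and the internal consistency of the moments in eq. (348), can be checked with a short Python sketch. It evaluates the density ratio and β, then projects the full angular form of eq. (348) onto Legendre polynomials by Simpson quadrature using Δ_ℓ = i^ℓ (1/2)∫dμ P_ℓ Δ, which inverts the Legendre expansion. The function names and the sample values of k and τ_i are illustrative assumptions.

```python
def fer_to_photon_density(n_species=3):
    """Density ratio for n_species two-helicity neutrino species after e+e- annihilation."""
    return n_species * 2 * (7.0 / 16.0) * (4.0 / 11.0) ** (4.0 / 3.0)

def beta_parameter(ratio):
    """beta of eq. (348) with ratio = rho_fer,tot / rho_gamma."""
    return (5.0 + 9.0 * ratio) / (15.0 + 19.0 * ratio)

def legendre(l, x):
    """P_l(x) from the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def delta_fer(mu, delta_i, k, tau_i, beta):
    """Angular form in the first line of eq. (348)."""
    return (0.25 * delta_i * (beta + 3.0 * (1.0 - beta) * mu * mu)
            * (1.0 - 1j * k * mu * tau_i / 3.0))

def projected_moment(l, delta_i, k, tau_i, beta, n=2000):
    """Delta_l = i^l (1/2) int dmu P_l(mu) Delta(mu), Simpson's rule with n even."""
    h = 2.0 / n
    total = 0.0
    for j in range(n + 1):
        mu = -1.0 + j * h
        weight = 1.0 if j in (0, n) else (4.0 if j % 2 else 2.0)
        total += weight * legendre(l, mu) * delta_fer(mu, delta_i, k, tau_i, beta)
    return (1j ** l) * 0.5 * total * h / 3.0

def quoted_moments(delta_i, k, tau_i, beta):
    """Moments l = 0..3 as listed in eq. (348)."""
    return [0.25 * delta_i,
            (9.0 - 4.0 * beta) / 180.0 * delta_i * k * tau_i,
            -(1.0 - beta) / 10.0 * delta_i,
            -(1.0 - beta) / 70.0 * delta_i * k * tau_i]

ratio = fer_to_photon_density()
beta = beta_parameter(ratio)
print(ratio, beta)
for l, quoted in enumerate(quoted_moments(1.0, 0.01, 2.0, beta)):
    print(l, projected_moment(l, 1.0, 0.01, 2.0, beta), quoted)
```

The ratio and β agree with the quoted values to about three significant figures, and the projected moments match the listed ones to quadrature accuracy.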
C.2.5 Hot and warm dark matter transport

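For scalar perturbations and hot or warm dark matter
$$\partial_\tau[\Delta_{hdm}] + i\frac{q}{q^n}\,k\,\hat q\cdot\hat k\,\Delta_{hdm} = \frac{\bar N}{\bar a}\,(G_{hdm\,SW} + G_{hdm\,curv}), \qquad (349)$$
$$G^{(S)}_{hdm\,SW} = -i\frac{q^n}{q}\,\hat q\cdot k\,\nu - \dot\varphi - (\hat q\cdot k)^2\,\bar a^{-1}\Psi_\sigma, \quad q^n = \sqrt{q^2 + m^2\bar a^2}.$$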
It is the semi-relativistic stage, when $q/q^n$ is not simply unity or $q/(m\bar a)$, that
creates the difficulty. Thus, it is perhaps worthwhile to make a brief aside on
the numerical methods used in [134, 2, 233, 253, 259, 255, 260, 261, 262] to
solve collisionless damping equations for semi-relativistic particles. Just as for
photons, a hierarchy of moment equations can be written for $\Delta_{hdm,\ell}$, $\Delta_{wdm,\ell}$.
For massless neutrinos, the moment expansion became our preferred method in
[134]. For hot and warm dark matter, the number of equations to be solved is
the product of the number of multipoles being followed and the number
of momentum groups, which then must be summed over with appropriate
weights to get the neutrino stress–energy tensor for the source side of Einstein’s
equations. In [195], we described an efficient Gauss–Legendre integration
method using as the integration variable $g$, where $dg = -(x\,\partial\bar f/\partial x)\,x^2dx$,
$x \equiv q/(\bar a\bar T_{hdm})$, which gives all momentum groups significant weights in the
energy group sum: 24 groups give accurate results. The moment expansion
with truncation is called the “P–N” method, and it requires many moments to
give accurate results and so is expensive. However, with a suitable boundary
condition in $\ell$ space, Lithwick and I [261] showed that high precision can be
achieved with just 20 moments for high $k$ and 10 moments for low $k$, a very
modest numerical cost. One can also lower the number of multipoles to 2 in the
very nonrelativistic regime. This seems to be the best approach to this problem
now [260, 261, 262]. Another method is to discretize the BTE in angle as well
as energy, which reduces the problem to a set of ODEs with total number equal
to the number of momentum groups times the number of angular bins. Durrer
[259] used this “S–N” method to solve for massive neutrino transport.
In Bond and Szalay [195] and subsequent work on hot and hot/cold models
[134, 233, 2], we adopted a history-integration method. Since this is very
different from the moment approach, I shall discuss it in a little detail. We are
not interested in the detailed angular distribution of neutrinos as we are for
photons, but only in the density, pressure, velocity, anisotropic stress and the
action these have upon the metric variables. The transport equation can be
integrated and low order moments taken, which expresses the result in terms of
momentum integrals of spherical Bessel functions with momentum-dependent
arguments:
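\[
\begin{aligned}
\Delta_{{\rm hdm},\ell}(q,k,\tau) ={}& D_{{\rm hdm},\ell}(q,k,\tau)
 - \int_{\tau_i}^{\tau} d\tau' \Big\{ \dot\varphi(k,\tau')\, j_\ell(k\Delta\eta) \\
&+ k^2 \bar a^{-1}\Psi_\sigma(k,\tau') \Big[ \frac{2\ell(\ell+1)-1}{(2\ell-1)(2\ell+3)}\, j_\ell(k\Delta\eta)
 - \frac{(\ell-1)\ell}{(2\ell-1)(2\ell+1)}\, j_{\ell-2}(k\Delta\eta)
 - \frac{(\ell+1)(\ell+2)}{(2\ell+1)(2\ell+3)}\, j_{\ell+2}(k\Delta\eta) \Big] \\
&- \frac{q^n}{q}\, k\,\nu(k,\tau') \Big[ \frac{\ell}{2\ell+1}\, j_{\ell-1}(k\Delta\eta)
 - \frac{\ell+1}{2\ell+1}\, j_{\ell+1}(k\Delta\eta) \Big] \Big\} ,
\end{aligned}
\tag{350}
\]
\[
\Delta\eta = \eta(q,\tau) - \eta(q,\tau') , \quad {\rm where} \quad
\eta(q,\tau) = \int_0^{\tau} d\tau\, \frac{q}{\sqrt{q^2 + m^2\bar a^2}} ,
\]
\[
\bar\eta = \int_0^{\tau} d\tau\, \frac{\langle q\rangle}{\sqrt{\langle q\rangle^2 + m^2\bar a^2}} , \quad
\langle q^m\rangle = \frac{\int q^{2+m}\, dq\, \bar f}{\int q^{2}\, dq\, \bar f} , \quad m = 1, 2, \ldots ,
\]
\[
\frac{\langle q\rangle}{\bar a \bar T_{\rm hdm}} = \frac{7\zeta_4}{2\zeta_3} = 3.151 \ \ {\rm fermions} , \qquad
= \frac{3\zeta_4}{\zeta_3} = 2.701 \ \ {\rm bosons} .
\]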
The explicit numbers for the average momenta assume unperturbed
f̄ = (e^{q/(āT̄)} ± 1)⁻¹ distributions for the light fermions and bosons. Recall that for light
neutrinos āT̄mν = 1.95 K. The Dhdm,ℓ(q, k, τ) describe the evolution of the initial
conditions. For example, in the synchronous gauge for an adiabatic mode we
have ν = 0 and [195]
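\[
\begin{aligned}
D_{{\rm hdm},0}(q,k,\tau) &= \tfrac14\, \delta_{\rm hdm}(q,k,\tau_i)\,
 \big[ j_0(k\Delta\eta_i) - 2(1-\beta)\, j_2(k\Delta\eta_i) \big] , \\
D_{{\rm hdm},1}(q,k,\tau) &= \tfrac14\, \delta_{\rm hdm}(q,k,\tau_i)\,
 \Big[ \frac{9-4\beta}{5} \big( j_1(k\Delta\eta_i) + \tfrac19 k\tau_i\, j_0(k\Delta\eta_i) \big)
 - \tfrac65 (1-\beta)\, j_3(k\Delta\eta_i) \Big] , \\
\Delta\eta_i &\equiv \eta(q,\tau) - \eta(q,\tau_i) .
\end{aligned}
\tag{351}
\]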
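The average momenta quoted after eq. (350) can be checked numerically. The following is a minimal sketch, not the code of [195]: it assumes only the standard integrals ∫ x^(p−1)/(eˣ ± 1) dx = Γ(p) η(p) or Γ(p) ζ(p), and the helper names (`zeta`, `eta`, `mean_momentum_moment`) are hypothetical.

```python
import math

def zeta(s, n_terms=2000):
    """Riemann zeta for s > 1: direct sum plus an integral estimate of the tail."""
    head = sum(n ** -s for n in range(1, n_terms + 1))
    tail = n_terms ** (1 - s) / (s - 1) - 0.5 * n_terms ** -s
    return head + tail

def eta(s):
    """Alternating (eta) series value, eta(s) = (1 - 2^(1-s)) zeta(s)."""
    return (1.0 - 2.0 ** (1 - s)) * zeta(s)

def mean_momentum_moment(m, fermion=True):
    """<q^m> / (a T)^m for an unperturbed f = 1 / (exp(x) +- 1) distribution.

    Numerator:   int x^(2+m) f dx = Gamma(3+m) * z(3+m)
    Denominator: int x^2     f dx = Gamma(3)   * z(3)
    with z = eta for fermions and z = zeta for bosons.
    """
    z = eta if fermion else zeta
    return math.gamma(3 + m) * z(3 + m) / (math.gamma(3) * z(3))

print(round(mean_momentum_moment(1, fermion=True), 3))   # 3.151
print(round(mean_momentum_moment(1, fermion=False), 3))  # 2.701
```

The same function with m = 2 gives the ⟨q²⟩ that enters the pressure of the nonrelativistic particles.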
The complication in eq. (350) is the integral over past time τ′ of
the metric variables, turning the metric ODEs into integro-differential equa-
tions, not by itself a great numerical problem, but for speed some care is
needed to efficiently yet fully sample the past history. In [261] we show adap-
tive (Romberg) integration makes this method competitive in accuracy and
numerical cost with the moment method. Even with less efficient sampling,
the speedup in the [134] neutrino code, which was also applied to hot/cold
hybrid models in [2, 233], was considerable. Past-history approaches are now
also being used to great advantage for rapid computation of ∆t,ℓ for the
radiation [306]. The Sachs–Wolfe metric part of this is similar to eq. (350), except
the optical depth exp[−ζC] enters in the obvious way. The Compton terms
associated with the source G^(S)_tC have similar jℓ expansions, but now the low
order moments of ∆^(S)_t enter into the integral. One of the features of the
past-history method is that one does not have to calculate ∆^(S)_t,ℓ at every ℓ, whereas
this is necessary because of the way the equations are coupled for the moment
method.
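The two Bessel-function brackets in eq. (350) are the first and (minus the) second derivatives of jℓ with respect to its argument, which is how powers of the direction cosine in the sources appear after the angular integration. A minimal numerical check of that reading is sketched below; the function names are hypothetical, and the upward recurrence used here is only adequate for arguments larger than ℓ.

```python
import math

def spherical_jn(l, x):
    """j_l(x) by upward recurrence from j_0 and j_1 (use only for x > l)."""
    j_prev = math.sin(x) / x
    if l == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for n in range(1, l):
        j_prev, j_curr = j_curr, (2 * n + 1) / x * j_curr - j_prev
    return j_curr

def nu_bracket(l, x):
    """Bracket multiplying (q^n/q) k nu in eq. (350); equals d j_l / dx."""
    return (l * spherical_jn(l - 1, x) - (l + 1) * spherical_jn(l + 1, x)) / (2 * l + 1)

def psi_sigma_bracket(l, x):
    """Bracket multiplying k^2 a^-1 Psi_sigma in eq. (350); equals -d^2 j_l / dx^2 (l >= 2)."""
    a = (2 * l * (l + 1) - 1) / ((2 * l - 1) * (2 * l + 3))
    b = (l - 1) * l / ((2 * l - 1) * (2 * l + 1))
    c = (l + 1) * (l + 2) / ((2 * l + 1) * (2 * l + 3))
    return a * spherical_jn(l, x) - b * spherical_jn(l - 2, x) - c * spherical_jn(l + 2, x)

# Compare with centred finite differences at an arbitrary test point.
l, x, h = 3, 7.3, 1.0e-4
d1 = (spherical_jn(l, x + h) - spherical_jn(l, x - h)) / (2 * h)
d2 = (spherical_jn(l, x + h) - 2 * spherical_jn(l, x) + spherical_jn(l, x - h)) / h**2
print(abs(nu_bracket(l, x) - d1) < 1e-6, abs(psi_sigma_bracket(l, x) + d2) < 1e-6)
```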
For the hot or warm dark matter, one can also save by shifting into P–1
equations once the particles are strongly nonrelativistic and the wavenumber
is much below the Jeans wavenumber, kJhdm(a) = (4πGN ρ̄nr ā²/c²s,hdm)^{1/2} ∼ ā^{1/2},
where cs,hdm ∼ ā⁻¹ is the adiabatic sound speed. I shall sketch this since it
exercises some of the equations derived earlier. The energy and momentum
conservation laws for hdm, eqs. (278), (279) with no source terms, Shdm = 0,
are generally valid of course, but to close off the equations, a model is required
for the pressure fluctuation (δp)hdm and the anisotropic stress πt,hdm . This
relation is complex because of collisionless damping, but at late times the hdm
obeys an equation similar to cdm. To roughly model residual effects of the
random velocity dispersion, we can introduce fudge factors akin to variable
Eddington factors to close off the hierarchy:
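\[
\begin{aligned}
&(\delta p)_{\rm hdm} = C_{p,{\rm hdm}}\, \bar p_{\rm hdm}\, \delta_{\rm hdm} , \\
&\big( \tfrac23\, {}^{(3)}\nabla^2 + \tfrac13\, {}^{(3)}R \big)\, \pi_{t,{\rm hdm}} = C_{\pi_t,{\rm hdm}}\, \delta_{\rm hdm} , \\
&\dot\delta_{\rm hdm} + 3\dot\varphi + k^2 \bar a^{-1} (\Psi_{v,{\rm hdm}} + \Psi_\sigma)
 + 3\bar H (C_{p,{\rm hdm}} - 1)\, \tfrac35 c_{s,{\rm hdm}}^2\, \delta_{\rm hdm} = 0 , \\
&\bar a^{-1} \dot\Psi_{v,{\rm hdm}} = \nu + (C_{p,{\rm hdm}} + C_{\pi_t,{\rm hdm}})\, \tfrac35 c_{s,{\rm hdm}}^2\, \delta_{\rm hdm} , \\
&\text{``adiabatic sound'' speed:} \quad c_{s,{\rm hdm}}^2 \equiv \frac53 \frac{\bar p_{\rm hdm}}{\bar\rho_{\rm hdm}}
 = \frac53 \frac{\langle q^2\rangle}{3 m_{\rm hdm}^2 \bar a^2} , \\
&\frac{c_{s,{\rm hdm}}}{\langle q\rangle_{\rm hdm}/(m_{\rm hdm}\bar a)}
 = \Big( \frac{20\,\eta_5\eta_3}{27\,\eta_4^2} \Big)^{1/2} = 0.85 \ \ {\rm fermions} , \qquad
 \Big( \frac{20\,\zeta_5\zeta_3}{27\,\zeta_4^2} \Big)^{1/2} = 0.89 \ \ {\rm bosons} ; \\
&\text{P–1 eqs. are used for} \quad c_{s,{\rm hdm}}(a) < {\rm TOL}_{vnr} , \quad k < {\rm TOL}_J\, k_{J{\rm hdm}}(a) , \\
&\text{and} \ (C_{p,{\rm hdm}} + C_{\pi_t,{\rm hdm}}) \ \text{is set to} \ 5/3 .
\end{aligned}
\tag{352}
\]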
The fudge factor is arbitrary: the 5/3 choice is arranged so that it is the
adiabatic sound speed, cs,hdm , rather than the isothermal sound speed, be-
cause compressing neutrinos that are gravitationally bound would be better
approximated this way. (The Riemann eta and zeta values are ηj , ζj .) The
pdV term on the neutrino energy density, ∼ H̄(Cp,hdm − 1)c2s,hdm δhdm , is not
important. Although one could get more sophisticated by better modelling
Cp,hdm , Ct,hdm and thus the damping (see [300] for a nice nr analytic model
of neutrino-damping), it is better to deal with the damping by the full past-
history integration or with a hierarchy of moments. Thus the tolerance factors
are chosen to be quite conservative. The “very nonrelativistic” tolerance factor,
TOLvnr , should be quite small (< 0.05) and the Jeans tolerance factor, TOLJ ,
should be at most a tenth.
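The numerical factors and the switching rule of eq. (352) can be sketched as follows. This is an illustration only: the function names are hypothetical, the default tolerances are simply the upper bounds quoted above, and the sound speed is assumed to be expressed in units of c.

```python
import math

def zeta(s, n_terms=2000):
    """Riemann zeta for s > 1: direct sum plus an integral estimate of the tail."""
    head = sum(n ** -s for n in range(1, n_terms + 1))
    return head + n_terms ** (1 - s) / (s - 1) - 0.5 * n_terms ** -s

def eta(s):
    """Alternating (eta) series value, eta(s) = (1 - 2^(1-s)) zeta(s)."""
    return (1.0 - 2.0 ** (1 - s)) * zeta(s)

def sound_speed_ratio(fermion=True):
    """c_s,hdm divided by <q>/(m a): sqrt(20 z5 z3 / (27 z4^2)), z = eta or zeta."""
    z = eta if fermion else zeta
    return math.sqrt(20.0 * z(5) * z(3) / (27.0 * z(4) ** 2))

def use_p1_equations(c_s, k, k_jeans, tol_vnr=0.05, tol_jeans=0.1):
    """True when both conservative conditions for the P-1 closure hold."""
    return c_s < tol_vnr and k < tol_jeans * k_jeans

print(round(sound_speed_ratio(True), 2), round(sound_speed_ratio(False), 2))  # 0.85 0.89
print(use_p1_equations(c_s=0.01, k=0.05, k_jeans=1.0))  # True
```

Because k_J grows as ā^{1/2} while the sound speed falls as ā⁻¹, a mode that fails the test early can pass it later, so the check would be repeated as the integration proceeds.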
C.3 Numerically useful regimes for scalar perturbations

C.3.1 Tight-coupling, shear viscosity and thermal diffusion
Tight coupling equations adequately approximate the hierarchy prior to a (k-
dependent) redshift ztc (k). These are obtained by first developing a two-fluid
treatment of the photon–baryon interaction, which is adequate provided the
Compton timescale τC is short compared with all other timescales in the
problem, in particular, the light-crossing time across half a wavelength, πk⁻¹, and
the Hubble time at that epoch. In the [134, 88] code, we choose ztc to be at
least 2000, and also require that kτC < 0.01 and H̄āτC < 0.01 to remain
in tight coupling: the results are insensitive to considerable relaxation of this
criterion.
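The criterion just described amounts to a three-way test. The sketch below is a hypothetical helper, not an excerpt from the [134, 88] code; it assumes that k, the Compton time and H̄ā are supplied in mutually consistent units.

```python
def in_tight_coupling(z, k, tau_c, hubble_a, z_min=2000.0, tol=0.01):
    """Return True while the tight-coupling equations may be used.

    z        : redshift
    k        : comoving wavenumber
    tau_c    : Compton timescale
    hubble_a : the product (H a) at this redshift
    """
    return z >= z_min and k * tau_c < tol and hubble_a * tau_c < tol

print(in_tight_coupling(z=5000.0, k=0.1, tau_c=0.01, hubble_a=0.05))  # True
print(in_tight_coupling(z=1500.0, k=0.1, tau_c=0.01, hubble_a=0.05))  # False
```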
Two-fluid equations are obtained from the infinite hierarchy of moment
equations (eq. 333) by setting ∆^(S)_t3 to zero in the ℓ = 2 equation, thereby
truncating the hierarchy, and neglecting ∆̇^(S)_t2. The polarization is also assumed
to change quickly enough so that ∆^(S)_Q0 and ∆^(S)_Q2 are in the steady state found
by setting the right-hand side of the ℓ = 0 and ℓ = 2 equations to zero:
\[
\Delta^{(S)}_{Q0} = \tfrac54\, \Delta^{(S)}_{t2} , \qquad \Delta^{(S)}_{Q2} = -\tfrac14\, \Delta^{(S)}_{t2} .
\tag{353}
\]
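Thus, the ℓ = 2 equation fixes the anisotropic stress (total quadrupole anisotropy):
\[
\begin{aligned}
\tfrac16 \pi_{t,\gamma} = k^{-2}\, 2\Delta^{(S)}_{t2}
 &= \frac{4}{15 f_\eta}\, \tau_C\, \bar a^{-1} (\Psi_{v,\gamma} + \Psi_\sigma) \\
 &= \frac{4}{15 f_\eta}\, \tau_C\, \bar a^{-1} (\Psi_{v,(\gamma+B)} + \Psi_\sigma)
 - \frac{4}{15 f_\eta}\, \tau_C\, c^2_{s(\gamma+B)}\, \bar a^{-1} \Psi_{v,\gamma B} ,
\end{aligned}
\tag{354}
\]
\[
\begin{aligned}
& f_\eta = \tfrac34 \ \text{with polarization}, \quad f_\eta = \tfrac{9}{10} \ \text{without}, \quad
 f_\eta = 1 \ \text{isotropic, no polarization}, \\
& \text{shear viscosity:} \quad \eta_\gamma = \frac{4}{15 f_\eta}\, \rho_\gamma\, \bar a \tau_C , \\
& \text{sound speed:} \quad c^2_{s(\gamma+B)} = \frac{c^2}{3}\, \frac{1}{1 + \frac34 \frac{\rho_B}{\rho_\gamma}} , \\
& \text{thermal diffusion:} \quad \kappa_\gamma = \frac43\, \frac{\rho_\gamma}{T_\gamma}\, \bar a \tau_C .
\end{aligned}
\]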
3 Tγ
Identifying this result with the form of eq. (236) for scalar perturbations gives
the photon’s shear viscosity ηγ . The photon kinematic viscosity is ηγ /(ρ̄γ + p̄γ ),
hence is (5fη )−1 āτC . Weinberg’s classic text [264, 265] gives the fη = 1 result.
(He restricted himself to the approximation that Compton scattering was an-
gle and polarization independent.) The bulk viscosity for photons vanishes.
To determine the thermal diffusion coefficient, we must identify terms in these
equations with the defining relation for κγ , eq. (236), which involves the fluid
acceleration as well as the temperature gradient. The temperature gradient
is a projected one, so that it has no component in the direction of the fluid’s
velocity U ; i.e., it reduces to a spatial gradient in the fluid’s comoving frame.
In the frame defined by the time-surface velocity e^α_n, it picks up a time
component: ⊥(U)^b_I (⁽⁴⁾∇_b[ln T] + A(U)_b) is ⁽³⁾∇_I[ln T] + U^n ē_n[U_I]. For scalar photon
perturbations, this reduces to ⁽³⁾∇_I[¼δγ + H̄Ψv,γ]; thus,
\[
(J_{(e)\gamma})_I = -\kappa_\gamma \bar T_\gamma\, {}^{(3)}\nabla_I \big( \tfrac14 \delta_\gamma + \bar H \Psi_{v,\gamma} + \nu - \bar a^{-1} \dot\Psi_\gamma \big)
\]
defines the combination we are looking for. With appropriate multiplication
by ρ̄γ to relate to the energy conservation equation, we get Weinberg’s [265]
result for κγ : the photon entropy per unit volume times τC . Of course, it
is unaffected by fη which arises in the anisotropic shear stress. Indeed, the
potential for (J(e)γ)I is just (4/3)ρ̄γ (Ψv,γB + (āτC)⁻¹ (1/6)k² πt,γ), i.e., basically Ψv,γB ,
but with the anisotropic stress contribution to it removed.
The two-fluid equations are obtained from the ℓ = 0 and ℓ = 1 equations,
and the baryon mass and momentum conservation equations. They are
eqs. (341), (342), (335) and the Ψ̇v,γB equation, (345), with πt,γ substituted
into it. Alternatively one can use the δ̇sγ and Ψ̇v,(γ+B) equations instead of
the δ̇B and Ψ̇v,B equations.
The tight coupling equations are a one-fluid (γ + B) approximation in
which the two-fluid character is encoded in diffusion and viscosity coefficients.
They are obtained by creating a “constitutive relation” for Ψv,γB by expanding
yB τC ā−1 Ψ̇v,γB in eq. (345) in powers of yB τC . Even if one is only interested in
first order τC effects in the evolution equations, the expansion of Ψv,γB must
go to quadratic order:
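\[
\begin{aligned}
\bar a^{-1} \Psi_{v,\gamma B} ={}& -y_B\, k^2 \tau_C^2\, \frac{4}{15 f_\eta}\, \bar a^{-1} (\Psi_{v,B} + \Psi_\sigma) \\
&+ y_B \tau_C \big( \tfrac14 \delta_\gamma + \bar H \Psi_{v,B} \big) \big[ 1 - (3 - y_B + p_e)\, y_B \tau_C \bar H \bar a \big] \\
&+ y_B^2 \tau_C^2 \big( (q+1)(\bar H \bar a)^2\, \bar a^{-1}\Psi_{v,B} + \tfrac13 k^2\, \bar a^{-1}\Psi_{v,B} + \bar a (\delta H) \big) ,
\end{aligned}
\tag{355}
\]
\[
\begin{aligned}
& (\gamma+B) \ \text{kinematic shear viscosity:} \quad
 \frac{\eta_\gamma}{\frac43 \rho_\gamma + \rho_B} = \frac{3}{5 f_\eta}\, c^2_{s(\gamma+B)}\, \bar a \tau_C , \\
& (\gamma+B) \ \text{sound damping rate:} \quad
 \Gamma = \frac12\, \bar a^{-1} k^2 c^2_{s(\gamma+B)}\, \tau_C \Big[ \frac{4}{5 f_\eta} + y_B \frac{3\bar\rho_B}{4\bar\rho_\gamma} \Big] , \\
& \text{where} \quad q = -\frac{d\ln(\bar H \bar a)}{d\ln \bar a} , \quad p_e \equiv -d\ln Y_e / d\ln a .
\end{aligned}
\]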
The term in ā⁻¹Ψv,γB of order yB τC² is from the shear viscosity, while the
yB τC and (yB τC)² terms together are the thermal diffusion contributions (the
nonnegligible (yB τC)² terms come from the fluid acceleration which enters the
thermal diffusion expression). Three tight coupling equations are then to be
solved: for entropy generation, δ̇sγ , and for (γ + B) mass and momentum
conservation. In practice, we solve the δ̇γ equation and the Ψ̇v,B equation,
(342), instead of (346).
We saw in section 5.2.1 that the transfer equation can be recast in terms
of TC ∆̃t, with TC the Compton transparency, and leading sources VC ¼δ̃γ,
−q̂^i ⁽³⁾∇_i VC ā⁻¹Ψ̃v,B, and the integrated Sachs–Wolfe term. Because the source
term G̃^(S)_tSW (eq. (304)) vanishes if the gravitational potential is constant, it is
sometimes useful to regroup the ℓ = 0, 1 photon transport equations, and the 2-
fluid and tight-coupling equations following from it, to exploit this in searching
for analytic solutions:
\[
\begin{aligned}
\tfrac14\, \partial_\tau [\tilde\delta_\gamma] + \tfrac13 k^2 \bar a^{-1} \tilde\Psi_{v,\gamma} &= \tilde G^{(S)}_{tSW} , \\
\partial_\tau [\bar a^{-1} \tilde\Psi_{v,\gamma}] + \tfrac14 \tilde\delta_\gamma + \tfrac16 k^2 \pi_{t,\gamma} &= -n_e \sigma_T \Psi_{v,\gamma B} .
\end{aligned}
\tag{356}
\]
(To put all density and velocity perturbations on the same footing, we should
also transform the momenta of baryons, massive and massless neutrinos, cold,
warm and hot dark matter to q̃^I, which yields gauge-invariant quantities; e.g.,
⅓δ̃B = ⅓δB + ν + ∂τ[ā⁻¹Ψσ], Ψ̃v,B ≡ Ψv,B + Ψσ.)
A WKB analysis of these equations gives the usual damped oscillation
behavior, e.g., [130, 265, 264, 8]. For a mode of wavenumber k, we have
\[
\tfrac14 \tilde\delta_\gamma \propto (1 - y_B)^{1/4}\, e^{-i \int k c_{s,(\gamma+B)}\, d\tau}\, e^{-\int \Gamma \bar a\, d\tau}
\]
(if metric terms are ignored). The 4/(5fη) is from viscous shear, while the
smaller (3ρ̄B/(4ρ̄γ)) yB part is due to the thermal diffusion. (Without the acceleration
correction, Ψ̇v,γB, in eq. (355), the correct yB multiplier in the diffusion term of
the damping rate is not obtained.) The Silk damping scale factor σD defined in
[2] is related to Γ by σD² = (kτdec)⁻² ∫₀^{τdec} Γā dτ. The integrand ∼ ā^{pe+5/2} d ln ā
is a very steeply rising function, so truncating the integral at precisely adec
(defined by eq. (71)) provides only a rough estimate of the overall factor.
Assuming the region near decoupling dominates, so that we can take pe ≈ pe,dec
(also a crude approximation), we get
\[
\sigma_D^2 \approx \frac{1}{15 f_\eta}\, \frac{1}{(p_e + \frac52)(p_e + 2)}\,
 \frac{\big( 1 + 15 f_\eta \bar\rho_B y_B / (16 \bar\rho_\gamma) \big)}{\big( 1 + 3\bar\rho_B / (4\bar\rho_\gamma) \big)} .
\tag{357}
\]
With fη = 3/4 and 7 ≲ pe,dec ≲ 12 (from fig. 3(c)), we have 0.02 ≲ σD ≲ 0.03
in the small ΩB limit. For the low ΩB values inferred from nucleosynthesis,
the shear viscosity is by far the dominant damping term, and σD ∝ fη^{−1/2}:
the inclusion of polarization therefore results in a 10% increase in σD over
the value that is obtained if polarization is not included. Thus, apart from the
intrinsic interest in polarization [266, 134, 88], it is clearly important to include
it because of the enhanced damping.
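As a quick numerical illustration of eq. (357), the sketch below evaluates σD over the quoted range of pe and compares fη = 3/4 with fη = 9/10. The function name and argument names are hypothetical; the baryon correction is switched off by default to mimic the small ΩB limit.

```python
import math

def sigma_D(f_eta, p_e, rho_b_over_rho_gamma=0.0, y_b=0.0):
    """Silk damping scale factor from eq. (357)."""
    prefactor = 1.0 / (15.0 * f_eta * (p_e + 2.5) * (p_e + 2.0))
    baryons = ((1.0 + 15.0 * f_eta * rho_b_over_rho_gamma * y_b / 16.0)
               / (1.0 + 0.75 * rho_b_over_rho_gamma))
    return math.sqrt(prefactor * baryons)

for p_e in (7.0, 12.0):
    print(p_e, round(sigma_D(0.75, p_e), 3))          # 0.032 and 0.021

# Effect of including polarization (f_eta = 3/4 instead of 9/10):
print(round(sigma_D(0.75, 10.0) / sigma_D(0.9, 10.0), 3))  # 1.095
```

The ratio is independent of pe in this limit, since σD scales as fη^{−1/2}.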
The photon–baryon fluid equations are coupled to those for CDM, the Boltz-
mann equations for massless neutrinos and massive neutrinos or warm dark
matter if applicable, and the metric equations to obtain Ψσ and ϕ̇. In a typical
run, we begin evolution when the waves are far outside the horizon and also in
the relativistic dominated regime (so the initial conditions can be integrated
analytically). For the photons and baryons, we start off with the tight coupling
equations. For massless neutrinos, we solve for typically 40 moments,
and shut them off once the energy density becomes negligible. After ztc(k),
we solve the full moment hierarchy equations up to some ℓmax(τ), which we
increase according to an algorithm based on a monitor of the radiation power
in high ℓ-modes (ℓmax scales with kτ). Special care must be taken with the
computational procedure and time-stepping through recombination. We either
integrate the full equations forward to the present – which can often mean we
are just generating Bessel functions by ODE solvers, not the most straightfor-
ward nor accurate method, or, for many models and wavenumbers, we can do
this more accurately in a single step using free-streaming equations. Before
turning to these, it is worthwhile to note how well one can do with just two
fluid or tight coupling equations.
Two-fluid and tight coupling equations dominated theoretical explorations
of the sixties, seventies and even into the eighties, with the notable exception of
the expansion of the transport equation in angular bins by Peebles and Yu [130]
and unpublished earlier work using a moment expansion by Bardeen. For ex-
ample, among others, [267, 268, 143] used two-fluid models to calculate transfer
functions for matter, for which it is often quite accurate, and to estimate CMB
anisotropies, which Seljak [143] has recently shown to come reasonably close
to a full transport solution. Tight coupling equations have not only been used
to begin full transport calculations when the equations are very stiff. They
have also been used successfully in calculating transfer functions, estimating
Silk damping by WKB solutions for baryon-dominated models, and have also
been extensively used to make estimates for G0 , G1 used in the approximate
equations of section 5.1, e.g., [131, 2, 269, 270, 271]. Keeping only the tight
coupling equations to lowest order in τC has a solution which can be expressed
in terms of hypergeometric functions [269] plus a special solution of the in-
homogeneous equation driven by the metric variables. Since hypergeometric
functions are not very useful for calculations, searching for WKB solutions is
usually more profitable. In [2], I used the approximation of (1) tiny ρB , (2)
a constant gravitational potential νL through photon decoupling, a limiting
case for CDM-dominated universes, to elucidate the role that the Sachs–Wolfe
effect, electron bulk flows and photon compression had on the development of
anisotropy in both adiabatic and isocurvature models. Doroshkevich, Starobin-
sky and collaborators included finite ρB effects. None of our results could be
viewed as particularly accurate, especially at high ℓ. Hu and Sugiyama [271]
explicitly included models for the metric variables in the finite but small ρB
cases and showed that one can use the G0, G1 results to get the spectra even at
high ℓ to within about 10% accuracy. Of course, if back action of the photon
and baryons upon the metric variables becomes important, this will not work
so well. These semi-analytical methods still require computation of Bessel
functions to large order to get the Cℓ’s, but this is quite fast. However, with current
computing power, full Boltzmann transport calculation of an entire model runs
very quickly, measured in hours on DEC alphas for enough k-mode coverage
to get good accuracy.
C.3.2 Free-streaming
Either by direct rearrangement of the transfer equation or by the use of inho-
mogeneous momentum transformations of the sort used to get eq. (304), we
can rearrange terms in the transport equation to give the modified distribution
and source
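\[
\begin{aligned}
\tilde\Delta_t &= \Delta_t + \nu + \partial_\tau (\bar a^{-1} \Psi_\sigma)
 - \hat q^i\, {}^{(3)}\nabla_i\, \bar a^{-1} \Psi_\sigma - n_e \sigma_T \Psi_{v,\gamma B} , \\
\tilde G^{(S)}_{tSW} + \tilde G^{(S)}_{tC} &= \dot\nu - \dot\varphi
 + \frac{\partial^2\, \bar a^{-1} \Psi_\sigma}{\partial \tau^2}
 - \frac{\partial\, n_e \sigma_T \Psi_{v,\gamma B}}{\partial \tau} \\
&\quad - n_e \sigma_T \bar a\, \tfrac12 P_2(\hat k \cdot \hat q)\, (\Delta_{t2} + \Delta_{Q2} + \Delta_{Q0})
 - n_e \sigma_T \bar a\, \Delta_{t,\ell \ge 2} .
\end{aligned}
\tag{358}
\]
The distribution function perturbation ∆̃t is gauge invariant. (Taking the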
ne σT Ψv,γB term into the modified distribution function is not really impor-
tant, since it is expected to fall to zero quite quickly after photon decoupling,
especially for normal recombination. With this form, the Thomson source falls
even more quickly to zero.) In [88], we identified this variable, expressed in
terms of āδH for use when ϕ̇ was negligible, as the one of relevance for free-
streaming (but without the ne σT Ψv,γB term.)
If the metric source terms become small beyond some time τs (k) (redshift
zs (k)), the radiation free-streams:
\[
\tilde\Delta_t(k, \hat q, \tau) = e^{-i \hat q \cdot k\, (\tau - \tau_s(k))}\, \tilde\Delta_t(k, \hat q, \tau_s(k)) .
\tag{359}
\]
The numerical output at redshift zs(k) is ∆tℓ(k, τs), including multipoles up
to Lmax(τs), from which ∆̃tℓ can be constructed. Expanding the plane waves
e^{−iμk(τ−τs(k))} in terms of Legendre polynomials and jℓ(k(τ − τs(k))) and
integrating over μ gives a direct relation involving Clebsch–Gordan coefficients that
allows one to get ∆tℓ(k, τ0) at the present time τ0 in a single step (Eq. (4.5) of
[88]):
\[
\tilde\Delta_{t\ell}(k, \tau_0) = \sum_{\ell' L} (-1)^{(L + \ell' - \ell)/2}\,
 \langle \ell 0 \ell' 0 | L 0 \rangle^2\, (2\ell' + 1)\, j_L(k\chi_s)\, \tilde\Delta_{t\ell'}(k, \tau_s) .
\tag{360}
\]
The Clebsch–Gordan coefficients, ⟨ℓmℓ′m′|LM⟩, use standard notation (e.g.,
deShalit and Feshbach [263]). Note that |ℓ − ℓ′| ≤ L ≤ ℓ + ℓ′ and L must be
even (odd) if ℓ + ℓ′ is even (odd). The spherical Bessel sum has to go to very
high L, to Lmax(τs) + Lmax(τ0). The former may be only a few hundred, but
the latter will be at least 3000. Spherical Bessel functions can be evaluated
to ℓ ∼ 6000 and higher with accuracy using Miller’s method on a recursion
relation.
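Miller’s method runs the three-term recurrence j_{ℓ−1} = ((2ℓ+1)/x) jℓ − j_{ℓ+1} downward from a starting order well above both the largest ℓ wanted and the argument, using arbitrary small trial values, and then fixes the overall normalization with a known low-order value. The following is a minimal sketch of that idea, not the routine used in [88]; the function name, the choice of starting order and the rescaling threshold are all assumptions made for illustration.

```python
import math

def spherical_bessel_miller(lmax, x, extra=30):
    """Return [j_0(x), ..., j_lmax(x)] for x > 0 by downward (Miller) recursion."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    ltop = max(lmax, 1)                       # keep j_1 for the normalization step
    n0 = max(ltop, int(math.ceil(x)))
    lstart = n0 + int(math.sqrt(40.0 * n0)) + extra   # heuristic starting order
    vals = [0.0] * (ltop + 1)
    f_above, f_here = 0.0, 1.0e-30            # trial j_{lstart+1}, j_{lstart}
    for l in range(lstart, 0, -1):
        f_below = (2 * l + 1) / x * f_here - f_above
        f_above, f_here = f_here, f_below     # f_here is now the trial j_{l-1}
        if l - 1 <= ltop:
            vals[l - 1] = f_here
        if abs(f_here) > 1.0e200:             # rescale everything to avoid overflow
            f_above *= 1.0e-200
            f_here *= 1.0e-200
            vals = [v * 1.0e-200 for v in vals]
    # Normalize with whichever of the closed forms j_0, j_1 is larger in magnitude.
    j0 = math.sin(x) / x
    j1 = math.sin(x) / x**2 - math.cos(x) / x
    scale = j0 / vals[0] if abs(j0) >= abs(j1) else j1 / vals[1]
    return [v * scale for v in vals[: lmax + 1]]

js = spherical_bessel_miller(40, 2.5)
# Sum rule: sum over l of (2l+1) j_l(x)^2 equals 1.
print(abs(sum((2 * l + 1) * j * j for l, j in enumerate(js)) - 1.0) < 1e-12)
```

Downward recursion is used because the upward direction amplifies rounding errors once ℓ exceeds x, which is exactly the regime needed for the high-L terms of eq. (360).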
As we have seen in eq. (261), νL − ϕL = ψ̈S − ϕS . The time derivatives
of these terms go to zero when the anisotropic stress becomes negligible (so
νL = −ϕL ) and the gravitational potential becomes constant: this occurs for
Ωnr = Ω = 1 universes well after radiation–matter equality τeq . We would typ-
ically take zs (k) ∼ 100 for standard CDM models with normal recombination,
although for accuracy long waves are integrated to the present, which is a trivial
computational burden; in reionized models there is persistent damping down to
low redshifts so a zs near or at the present is needed for accuracy for high k as
well. (The streaming formula can be modified to take the dominant damping
effect into account.) The potential terms ν̇L − ϕ̇L are nonzero at late times
if the universe becomes vacuum-dominated [110], but these effects have little
influence on high k’s, so although for low k’s one evolves the equations forward
to the present, for high k’s one can still use the free-streaming prescription.
As a last aspect of this free-streaming, we described “small-angle” approx-
imations in section 5.1.4 that have been used to speed up the evaluation of
correlation functions and power spectra in the past; they are not used anymore
for primary anisotropies because the techniques and computing are well in place
for doing full Boltzmann transport. A conceptually useful way of thinking of
the free-streaming transport which connects to section 5.1 is to treat the radi-
ation pattern itself as the source, with a delta function visibility at some time
τs:
\[
G(q, \hat q, k, \tau) = V(\tau)\, \tilde\Delta_t(q, \hat q, k, \tau_s) , \qquad V(\tau) = \delta(\tau - \tau_s) .
\]
With G so defined, one just applies eq. (129) with either PGG(k; χs, 0) being
proportional to ⟨|∆̃t(kℓ, μℓ, τs)|²⟩, where μℓ ≡ k∥/kℓ and kℓ ≡ [Q²/R(χs)² + k∥²]^{1/2}
(a DSZ [131] style approximation), or one isotropizes, with PGG proportional to
Wt²(k; τs) = Σ_{ℓ≥2} (2ℓ + 1)⟨|∆̃tℓ(k, τs)|²⟩, a nearly-conserved quantity which is
what the second approximation method exploits. For example, the way it was
used in [88] for CDM-type models was to integrate the Boltzmann equations
down to redshift zs = 200 or so, construct Wt²(k; τs), then use eq. (129) to get
Cℓ (e.g., fig. 7 of [88]).
C.4 Modifications with mean curvature

In the seventies and eighties, when approximate methods were still being heav-
ily used for anisotropy calculations, it was usual to free-stream the radiation
from an early time when the curvature was unimportant to now using flat model
results, but with an angle-distance relation appropriate for the curved model,
eq. (130). The results for open CDM and isocurvature baryon models were
then used to constrain parameters with data from the small and intermediate
angle CMB experiments of the time, e.g., [134, 135, 216, 243].
Now the calculations are being done precisely. When there is mean curvature,
one cannot expand in plane waves. The modes QkM are eigenfunctions,
−ā² ⁽³⁾∇² QkM = k² QkM, of the background Laplacian. Although plane waves
are not solutions for curved FRW spaces, spherical waves ∝ Yℓm are solutions,
with radial wavefunctions Xkℓ(χ/dcurv) which go to spherical Bessel functions
jℓ(kχ) when k is large compared with dcurv⁻¹, and which, like Bessel functions,
can be generated by solving various recursion relations.¹ This suggests
multipole expansions are indeed the way to try to solve the equations. One wants
this to be as close to the flat case as possible. Let us define a polynomial of
order ℓ by polyℓ(xμ, x²) ≡ x^ℓ Pℓ(μ). In the curved case, just as in the flat case,
we can write²
\[
\Delta_{kM}(x, \tau, \hat q) = \sum_\ell (2\ell + 1)\, \big( \Delta_\ell(k, \tau) / \kappa_{\ell,t} \big)\, (-k)^{-\ell}\,
 {\rm poly}_\ell \big( \hat q^i\, {}^{(3)}\nabla_i ,\ \bar a^2\, {}^{(3)}\nabla^2 \big)\, Q_{kM}(x) ,
\tag{361}
\]
\[
\kappa_{\ell,t} = \prod_{\ell'=0}^{\ell} \kappa_{\ell'} , \qquad
\kappa_\ell^2 \equiv 1 - (\ell^2 - 1)\, \frac{{}^{(3)}R\, \bar a^2}{6 k^2} , \qquad \kappa_0 = \kappa_1 = 1 .
\]
The ∆ℓ correspond to ∆^(S)_{{t,Q,U,V},ℓ}, ∆erν,ℓ, etc. and ⁽³⁾Rā² = ±6 dcurv⁻². The
product of κℓ’s in the denominator helps to regulate the hierarchy of moment
equations in the presence of curvature [303, 304, 305]. When we express the
hierarchy equations for neutrinos and photons in terms of ∆ℓ(k, τ)/κℓ,t they
remain the same as for the flat case, e.g., eqs. (333), except an effective source
term is added to the right-hand side:
\[
G_{{\rm curv},\ell} = k\, \frac{\ell + 1}{2\ell + 1}\, \frac{\ell(\ell + 2)\, {}^{(3)}R\, \bar a^2}{6 k^2}\, \Delta^{(S)}_{t(\ell+1)} / \kappa_{\ell,t} .
\tag{362}
\]
The ⁽³⁾R corrections to the metric equations must be included as well of course.
For numerical solution, one should rewrite the equations explicitly in terms of
∆ℓ(k, τ). In that case, the Gcurv,ℓ is absorbed into the left hand side, with κℓ
terms now appearing in the coupling of ∆(ℓ−1), ∆(ℓ+1) to ∆̇ℓ:
\[
\dot\Delta_\ell - k \Big[ \frac{\ell}{2\ell + 1}\, \kappa_\ell\, \Delta_{(\ell-1)} - \frac{\ell + 1}{2\ell + 1}\, \kappa_{\ell+1}\, \Delta_{(\ell+1)} \Big] = \text{usual RHS} .
\tag{363}
\]
1 For open FRW universes the spectrum of the Laplacian has kdcurv ≥ 1, and the radial
functions are
\[
X_{k\ell}(x) = \sqrt{\frac{\pi}{2}}\, (k d_{\rm curv})^{\ell}\, P^{-(\ell + \frac12)}_{i\beta - \frac12}(\cosh x)\, \sinh^{-1/2}(x) , \qquad x \equiv \chi / d_{\rm curv} ,
\]
where the P^μ_ν(x) are associated Legendre functions (e.g., [133]).
2 This expansion format suggests that we define generalized potentials Utℓ by
∆t = Σℓ (−)^ℓ polyℓ(q̂^i ⁽³⁾∇_i, ā² ⁽³⁾∇²) Utℓ/κℓ,t, so that Ut1 = ā⁻¹Ψv,γ, Ut2 = 5κ2,t πt,γ/12 and
the higher ℓ equations become more like the energy and momentum conservation laws.
(Sources in the ℓ = 2 equation also have to be multiplied by κ2, and, to
the extent they explicitly involve ∆ℓ, rewritten with the correct κℓ,t factors.)
Because the angle-distance relation for open universes results in the typical ℓ
associated with a given wavenumber being much larger than in a flat universe,
being able to free-stream from an early time to the present is very useful to
speed up numerical evaluations, but this is difficult to make efficient, unlike
in the flat case [305]. The full numerical problem for open universes was first
tackled by Mike Wilson [133], was picked up again by [138], and, more recently,
by [245, 244, 305] for open CDM models and by [287] for texture and other
isocurvature seed models; closed models are addressed in [304, 305].
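The curvature factors κℓ entering eqs. (361) and (363) are simple to tabulate. The sketch below is illustrative only: the function names are hypothetical, and it assumes the usual sign convention that ⁽³⁾Rā² = −6/dcurv² for open models and +6/dcurv² for closed ones (passed as `curvature_sign`). For closed models κℓ² reaches zero at finite ℓ, so the product is returned as zero from that ℓ onward.

```python
import math

def kappa_sq(l, k_dcurv, curvature_sign):
    """kappa_l^2 = 1 - (l^2 - 1) R a^2 / (6 k^2), with kappa_0 = kappa_1 = 1.

    k_dcurv        : the dimensionless product k * d_curv
    curvature_sign : -1 for open models, +1 for closed models (assumed convention)
    """
    if l < 2:
        return 1.0
    return 1.0 - curvature_sign * (l * l - 1) / k_dcurv**2

def kappa_product(l, k_dcurv, curvature_sign):
    """kappa_{l,t}: product of kappa_l' for l' = 0..l."""
    prod = 1.0
    for lp in range(l + 1):
        ksq = kappa_sq(lp, k_dcurv, curvature_sign)
        if ksq <= 0.0:
            return 0.0
        prod *= math.sqrt(ksq)
    return prod

# Open model with beta = 1, i.e. (k d_curv)^2 = beta^2 + 1 = 2:
print(round(kappa_sq(2, math.sqrt(2.0), -1), 6))   # 2.5
# Far above the curvature scale the flat hierarchy is recovered:
print(round(kappa_product(10, 1.0e4, -1), 6))      # 1.0
```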
In the absence of knowing what the generation mechanism is for the fluc-
tuations, it is usual in cosmology to consider “natural” spectral shapes such
as power laws. What complicates matters is that the phase space for curved
universes goes like β² dβ, where β ≡ [(kdcurv)² − 1]^{1/2} for scalar perturbations
and [(kdcurv)² − 3]^{1/2} for tensor perturbations, with the spectrum of β going
from 0 to ∞. (In closed models, β ≡ [(kdcurv)² + 1]^{1/2} for scalar perturbations,
[(kdcurv)² + 3]^{1/2} for tensor perturbations, with β > 0 but in this case the β
spectrum is discrete.) It is unclear a priori whether the power law should be
in kdcurv , β, volume or another combination.
In inflation models with mean curvature, if the generation mechanism is
the usual zero point quantum fluctuations in scalar or gravity wave fields, the
equations of sections 6.3 and 6.2.5 describe the development. In [245, 244],
it was shown that dcurv⁻¹ (kdcurv)⁻² (3 + (kdcurv)²)² ((kdcurv)² − 1)^{−1/2} is an
inflation-inspired analogue of the k¹ Harrison–Zeldovich energy density
spectrum for flat Universes. This looks complicated but has a very simple
physical interpretation: just as for the flat case, this translates to equal power
per decade of wavenumber in the gravitational potential. Thus, it is advan-
tageous to use power per logarithmic waveband to express this. Actually
the scale independence is in the gauge invariant variables ζ or ϕcom (sec-
tion 6.2.3), which are ∝ ΦN ≡ −ΦH , the gravitational potential, on large
scales. With tilt νs, Pζ(k) ∼ (kdcurv)^{νs} is suggested by the absence of
curvature effects explicitly appearing in the equation for scalar field perturbations,
eq. (176). The analogue for tensor perturbations for which curvature
corrections explicitly appear in the gravitational wave evolution equation, eq. (169),
is PGW(k) ∼ ((kdcurv)² − 2)^{νt/2}. In realistic inflation models there are further
small corrections near β = 0 [305].
C.5 Lensing
Even though one usually linearizes in the metric variables to treat gravitational
lensing in cosmological contexts, in transport theory it is a nonlinear process:
Gtbend involves the transverse derivative to the instantaneous direction of the
photon path, −∂∆t /∂ q̂ I F I , where F I is a linear combination of the perturbed
metric variables, ν, ϕ, Ψσ . What complicates this is that under linear gauge
transformations, ∆t can get new components ∝ q̂ J VJ , where VJ involves metric
components; thus terms F I VI of quadratic order in the metric components are
induced. The situation can be clarified by recognizing that, in the absence of
interactions with matter, the Boltzmann equation is just a book-keeping device
saying that the mean photon occupation number (or phase space density) is
conserved along photon trajectories and the photon trajectories can be solved
with linearized potentials. As expected dq̂ I /dτ = F I .
The expressions for the angular power spectrum derived in this section are
meant to exercise some of the machinery and approximations given previously
in these appendices. The relationship between C^lens(ϖ) and Cℓ^no-lens is
equivalent to an expression given by Seljak [280] whose numerical results are described
below; see also [275].
It is customary (e.g., [273]) to work in the longitudinal gauge for lensing,
with metric variables νL = ΦN and ϕL → −ΦN once anisotropic stress can be
neglected, so one’s Newtonian insight into the potential ΦN can be applied. In
terms of these variables,
\[
G_{t\,{\rm bend}} = \frac{\partial \Delta_t}{\partial \hat q^I}\, (\delta^{IJ} - \hat q^I \hat q^J)\, e_{*J} [\nu_L - \varphi_L] .
\tag{364}
\]
To relate this to the equations of motion, the expressions in the footnote in
section B.1 are evaluated using the Ricci rotation coefficients eq. (274). For
each geodesic there is an affine parameter λ “clocking” changes. We can also
measure changes by transforming to conformal time τ (λ) or, as is done here,
to comoving radial distance χ(λ) which is set to zero (as is λ) at the end
of the photon trajectory; i.e., here, at x0 , and now, at τ0 . In terms of the
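photon momenta q^I that gives us the gauge-invariant ∆̃t variable (i.e., with
ln Ω = ln ā + νL, ln A = ln ā + ϕL), the geodesic equations are
\[
\begin{aligned}
\frac{1}{\bar N} \frac{d \ln q}{d\tau} &= \bar e_n [\nu_L - \varphi_L] , \\
\frac{1}{\bar N} \frac{d \hat q^I}{d\tau} &= -(\delta^{IJ} - \hat q^I \hat q^J)\, \frac{1}{\bar a}\, e_{*J} [\nu_L - \varphi_L] , \\
\frac{d x^i}{d\tau} &= \Big( \frac{N}{A} \frac{q}{q^n}\, e_{*I}{}^{i}\, \hat q^I + A e_n{}^{i} \Big) \ \to \ e^{\nu_L - \varphi_L}\, \delta_I^i\, \hat q^I , \\
\frac{d\tau}{d\lambda} &= -\frac{q^n}{N \Omega} \ \to \ -e^{-2\nu_L} q , \qquad
 d\chi / d\lambda \ \to \ e^{-(\nu_L + \varphi_L)} q .
\end{aligned}
\tag{365}
\]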
(Note that a surface of constant conformal time is not a surface at a fixed
comoving distance in this gauge when one takes the perturbations into account.)
The photon position as it meanders back and forth under the action of the
metric obeys
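\[
\begin{aligned}
x^I &\approx r^I - s^I , \qquad r^I = x_0^I - \hat q_0^I \chi , \\
s^I &= -\int_0^{\chi_s} d\chi \int_0^{\chi} d\chi'\, (\delta^{IJ} - \hat q^I \hat q^J)\, e_{*J} [\nu_L - \varphi_L] .
\end{aligned}
\tag{366}
\]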
There are many similarities to the Zeldovich approximation, with the unper-
turbed photon trajectory r I like the unperturbed (Lagrangian space) position,
with the deviation from that trajectory sI like the displacement field, and with
the true trajectory xI like the “true” (Eulerian space) position. One can use
the same methods for solving this problem as is used to map from Lagrangian
space to Eulerian space in 3D cosmology. A flat Universe has been assumed.
Thus we can use Fourier transform methods to find the solution. For example
the correlation function at time τ0 can be expressed in terms of the radiation
pattern on the surface a distance χs away by
\[
\begin{aligned}
C^{\rm lens}(\varpi) &= \langle \tilde\Delta_t(\hat q_0, x_0, \tau_0)\, \tilde\Delta_t(\hat q_0', x_0, \tau_0) \rangle \\
&\approx \sum_k e^{-i k \cdot (\hat q_0 - \hat q_0') \chi_s}\,
 \langle e^{-i k \cdot (s - s')}\, \tilde\Delta_t(k, \tau_s, \hat k \cdot \hat q_s)\, \tilde\Delta_t^*(k, \tau_s, \hat k \cdot \hat q_s') \rangle .
\end{aligned}
\tag{367}
\]
Here as usual ϖ = q̂0 − q̂0′, q̂s and q̂s′ are the directions of the photons at χs,
and ∆s ≡ s − s′. The ensemble-average encompasses the statistics of both the
radiation pattern at χs and the distribution of the clumped matter lying
between χs and us which is responsible for the bending. In practice it will be an
excellent approximation to assume they are statistically independent of each
other. As a further simplification along the lines of the “small angle
approximations” described in section C.3.2, we replace ⟨∆̃t(k, τs, k̂·q̂s) ∆̃t*(k, τs, k̂·q̂s′)⟩
by the DSZ approximation, ⟨|∆̃t(k, τs, μ̄)|²⟩. In the usual DSZ approximation,
μ̄ = k̂·(q̂s + q̂s′)/2. In principle the average lensed polar direction, (q̂s + q̂s′)/2,
could be shifted considerably from the unperturbed direction (q̂0 + q̂0′)/2 on
the sky. Still, as a first approximation we replace μ̄ by its ensemble average,
μ̄ = k̂·(q̂0 + q̂0′)/2, invoking [88] who showed that one still gets a good
approximation by going one step beyond DSZ by isotropizing ⟨|∆̃t(k, τs, μ̄)|²⟩.
For small angles we can also use a Fourier transform approximation to the
power spectrum, utilizing a split into components transverse and parallel to the
average line of sight, which sets the unlensed 2D wavenumber to be Q0 = k⊥ χs :
\[
C^{\rm lens}(\varpi) \approx \int \frac{d^2 Q_0}{(2\pi)^2}\, e^{i Q_0 \cdot \varpi}\,
 \langle e^{-i Q_0 \cdot (s - s')/\chi_s} \rangle\, \frac{2\pi}{Q_0^2}\, C_{\ell_0}^{\rm no\text{-}lens} .
\tag{368}
\]
As usual, Q0 = |Q0| = ℓ0 + ½. The statistical average ⟨e^{−iQ0·∆s/χs}⟩ is the
characteristic function for the random variable Q0·∆s, expressible in terms
of all of the connected N-point correlation functions of it. A subject which is
interesting to explore is the extent to which non-Gaussian features will
manifest themselves. To date the papers have focussed on simplified approximations
to get an idea of the magnitude of the effect. The leading term for this
average is a Gaussian approximation, exp[−½ Σ_AB Q0A Q0B ⟨∆sA ∆sB⟩/χs²], where
A, B = 1, 2 for the two components of the transverse vector. If the
separation |ϖ| is small, then ∆sA can be expanded in terms of the “shear tensor”
εAB = −∂∆sA/∂(χs ϖB). (Strain tensor rather than shear tensor is the more
appropriate name.) For the basis of the illustration of this section, we shall
just consider the isotropized version of ⟨(Q0·∆s)²⟩, i.e., ½Q0²⟨∆s·∆s⟩, which
I define to be ½Q0² ε² ϖ² χs². In the small angle limit of the isotropized version,
ε² = ½ εAB εAB.
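We can use Fourier methods to determine the rms displacement. In the
$\varphi_L = -\Phi_N$ limit [280, 273],
$$ \tfrac12 \varepsilon^2\varpi^2 \approx \int \frac{d\chi}{\chi_s} \int \frac{d^2Q_0}{(2\pi)^2}\, (1 - e^{i\mathbf{Q}_0\cdot\boldsymbol{\varpi}})\, \frac{2\pi}{Q_0^2}\, 4 P_{\Phi_N}\!\left(\frac{Q_0}{\chi_s}\right) (1-\chi/\chi_s)^2 . \tag{369} $$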
In the small $\varpi$ limit, $\varepsilon$ is $\varpi$-independent as expected. For this constant
$\varepsilon$ case, the Fourier transform of the correlation function can be done
explicitly:
$$ C_\ell^{\rm lens} = \int d^2Q_0\, \frac{\exp[-|\mathbf{Q}-\mathbf{Q}_0|^2/(Q_0^2\varepsilon^2)]}{\pi (Q_0^2\varepsilon^2)}\, \frac{Q^2}{Q_0^2}\, C_{\ell_0}^{\rm no\text{-}lens} . \tag{370} $$
$\mathbf{Q}$ is the lensed angular wavenumber and $Q = \ell + \frac12$. The total power is
conserved (the logarithmic integrals of $C_\ell^{\rm lens}$ and $C_{\ell_0}^{\rm no\text{-}lens}$ are the
same), but it is rearranged via the convolution, which is a smoothing in
$\ell$-space. If $\varepsilon(\varpi)$ is changing slowly with angular scale $\varpi$, $\varepsilon(Q_0^{-1})$ is
reasonable to use. Seljak [280] has used realistic gravitational potential
power spectra (linear theory on large scales with a good approximation to
nonlinear effects on small scales, thereby enhancing the lensing effect) to
estimate $\varepsilon$. A rough fit, covering a range from arcsecond scales up to tens
of arcminutes, is $\varepsilon(\theta) \sim 0.2 - 0.03\ln(\theta/1')$ for a CDM model with $\sigma_8 = 1$
and $\sim 0.14 - 0.03\ln(\theta/1')$ for a $\Omega_{\rm vac} = 0.8$ model. Thus the spread around
$\ell_0$, $\Delta\ell/\ell_0 \sim \varepsilon$, is not very large, $\sim 0.2$ at a few arcminutes, less for
larger scales; note also that $\varepsilon$ is changing slowly with $\theta$, with a local
power law index $\lesssim 0.2$ for arcminute scales, so the constant $\varepsilon$
approximation is not too bad. The net effect is that the higher Doppler
peaks and troughs are smoothed out enough that one must take the lensing
effect into account in some happy future where we have an extremely well
determined $C_\ell$.
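As a rough numerical illustration of the smoothing scale, the sketch below
evaluates the two fits for ε(θ) quoted above and checks that the Gaussian
kernel of eq. (370) integrates to unity over the Q-plane, which is what
underlies the conservation of total power. The function names, the model
labels and the grid parameters are illustrative choices and not part of the
original analysis; θ is assumed to be in arcminutes.

```python
import math

# Constant terms of the rough fits quoted in the text (theta in arcminutes).
# The dictionary keys are hypothetical labels chosen for this sketch.
_FIT_CONSTANT = {"cdm_sigma8_1": 0.20, "omega_vac_0.8": 0.14}


def epsilon_fit(theta_arcmin, model="cdm_sigma8_1"):
    """Rough fit eps(theta) ~ const - 0.03 ln(theta / 1 arcmin)."""
    return _FIT_CONSTANT[model] - 0.03 * math.log(theta_arcmin)


def lens_kernel(qx, qy, q0x, q0y, eps):
    """Gaussian kernel of eq. (370): exp[-|Q-Q0|^2/(Q0^2 eps^2)] / (pi Q0^2 eps^2)."""
    s2 = (q0x * q0x + q0y * q0y) * eps * eps
    d2 = (qx - q0x) ** 2 + (qy - q0y) ** 2
    return math.exp(-d2 / s2) / (math.pi * s2)


def kernel_norm(q0=1000.0, eps=0.2, half_width=6.0, n=241):
    """Riemann sum of the kernel over the Q-plane for fixed Q0 = (q0, 0)."""
    s = q0 * eps
    h = 2.0 * half_width * s / (n - 1)
    total = 0.0
    for i in range(n):
        qx = q0 - half_width * s + i * h
        for j in range(n):
            qy = -half_width * s + j * h
            total += lens_kernel(qx, qy, q0, 0.0, eps)
    return total * h * h
```

For instance, `epsilon_fit(3.0)` is roughly 0.17, so under this fit the
fractional spread in ℓ at a few arcminutes stays below about 20 per cent.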
C.6 Tensor perturbation source terms

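As we saw in eq. (311), the natural variables to use for tensor perturbations
are $\tilde\Delta^{(T)}_{\{t,U,V,Q\}}$ defined by the expansion
$$ \Delta^{(T)}_{ij} = w \sum_{(\mu)=t,Q,U,V}\ \sum_{+,\times}\ \sum_{\mathbf{k}} \tilde\Delta^{(T)}_{(\mu)}\, \frac{E^{(T)}\cdot E_{(\mu)}}{E_{(\mu)}\cdot E_{(\mu)}}\, E_{(\mu)ij}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, a^{(T)}_{\mathbf{k}} + {\rm cc} . $$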
The polarization basis for k-modes is eq. (319), with $\hat q$ replacing $\hat q'$:
$$ \varepsilon_2 = (-\sin\phi,\ \cos\phi,\ 0), \qquad \varepsilon_1 = (-\mu\cos\phi,\ -\mu\sin\phi,\ \sqrt{1-\mu^2}) . \tag{371} $$
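To determine $\tilde\Delta^{(T)}_{t,U,V,Q}$, we need the $2\times4$ transformation matrix of inner
products
$$ \left(\frac{E^{(T\{+,\times\})}\cdot E_{\{t,U,V,Q\}}}{E_{\{t,U,V,Q\}}\cdot E_{\{t,U,V,Q\}}}\right) = \begin{pmatrix} -(1-\mu^2)\cos(2\phi) & (1+\mu^2)\cos(2\phi) & 2\mu\sin(2\phi) & 0 \\ -(1-\mu^2)\sin(2\phi) & (1+\mu^2)\sin(2\phi) & -2\mu\cos(2\phi) & 0 \end{pmatrix} ; $$
$$ \text{e.g.,}\quad \frac{E^{(T+)}\cdot E_Q}{E_Q\cdot E_Q} = (\varepsilon_1)_1(\varepsilon_1)_1 - (\varepsilon_2)_1(\varepsilon_2)_1 - (\varepsilon_1)_2(\varepsilon_1)_2 + (\varepsilon_2)_2(\varepsilon_2)_2 = (1+\mu^2)\cos(2\phi) . \tag{372} $$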
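Note that there is no $\tilde\Delta^{(T)}_V$. One can also expand the source functions
$G^{(T)}_{ij,C}$ and $G^{(T)}_{ij,SW}$ in modes:
$$ G^{(T)}_{ij\{C,SW\}} = w \sum_{(\mu)}\ \sum_{+,\times}\ \sum_{\mathbf{k}} \tilde G^{(T)}_{(\mu)\{C,SW\}}\, \frac{E^{(T)}\cdot E_{(\mu)}}{E_{(\mu)}\cdot E_{(\mu)}}\, E_{(\mu)ij}\, e^{i\mathbf{k}\cdot\mathbf{x}}\, a^{(T)}_{\mathbf{k}} + {\rm cc} . $$
The evaluation of $\tilde G^{(T)}_{(\mu)SW}$ is simple, with the result eq. (207):
$\tilde G^{(T)}_{tSW} = \frac12 \dot h^{(T)}$, with the rest vanishing.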
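The entries of the transformation matrix in eq. (372) can be checked
numerically. The sketch below is a minimal check under stated assumptions:
the Stokes tensors are taken as E_t = ε1ε1 + ε2ε2, E_Q = ε1ε1 − ε2ε2,
E_U = ε1ε2 + ε2ε1, the gravity wave tensors as E^(T+) = x̂x̂ − ŷŷ and
E^(T×) = x̂ŷ + ŷx̂, and ε2 = (−sin φ, cos φ, 0), the choice orthogonal to ε1.
These definitions and all function names are assumptions made here for
illustration. Following the worked example in eq. (372), the raw contraction
E^(T)·E_(µ) is compared directly with the quoted matrix entry.

```python
import math


def basis(mu, phi):
    """Polarization basis (eps1, eps2) for a photon direction with cosine mu."""
    s = math.sqrt(1.0 - mu * mu)
    eps1 = (-mu * math.cos(phi), -mu * math.sin(phi), s)
    eps2 = (-math.sin(phi), math.cos(phi), 0.0)  # assumed orthogonal to eps1
    return eps1, eps2


def outer(a, b):
    return [[a[i] * b[j] for j in range(3)] for i in range(3)]


def combine(A, B, sign=1.0):
    return [[A[i][j] + sign * B[i][j] for j in range(3)] for i in range(3)]


def contract(A, B):
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))


def inner_products(mu, phi):
    """Contractions E^(T+,x) . E_(t,Q,U), keyed by (polarization, Stokes label)."""
    e1, e2 = basis(mu, phi)
    stokes = {
        "t": combine(outer(e1, e1), outer(e2, e2)),
        "Q": combine(outer(e1, e1), outer(e2, e2), -1.0),
        "U": combine(outer(e1, e2), outer(e2, e1)),
    }
    x, y = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    waves = {
        "+": combine(outer(x, x), outer(y, y), -1.0),
        "x": combine(outer(x, y), outer(y, x)),
    }
    return {(p, s): contract(waves[p], stokes[s]) for p in waves for s in stokes}


def quoted_matrix(mu, phi):
    """Entries of the matrix as quoted in eq. (372), V column omitted."""
    c2, s2 = math.cos(2 * phi), math.sin(2 * phi)
    return {
        ("+", "t"): -(1 - mu * mu) * c2, ("+", "Q"): (1 + mu * mu) * c2,
        ("+", "U"): 2 * mu * s2,
        ("x", "t"): -(1 - mu * mu) * s2, ("x", "Q"): (1 + mu * mu) * s2,
        ("x", "U"): -2 * mu * c2,
    }
```

Comparing `inner_products(mu, phi)` with `quoted_matrix(mu, phi)` for a few
values of µ and φ reproduces all six nonzero entries under these assumptions.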
Getting the Thomson scattering source functions eq. (208) for $\tilde G^{(T)}_{(\mu)C}$ takes
more work. A straightforward route is to isolate the $\cos(2(\phi-\phi'))$,
$\sin(2(\phi-\phi'))$ terms in the phase tensor components $[P]^{(\mu)}_{(\nu)}$. Let us denote
the perturbation variables in an expansion in $\cos(2\phi)$ and $\sin(2\phi)$ by
$\Delta^{(T)}_{(\mu)}$, $G^{(T)}_{(\mu)\{C,SW\}}$, without the tilde. The relation to the tilde variables is
$$ \Delta^{(T)}_t \equiv -(1-\mu^2)\,\tilde\Delta^{(T)}_t , \qquad \Delta^{(T)}_Q = (1+\mu^2)\,\tilde\Delta^{(T)}_Q , \qquad \Delta^{(T)}_U = -2\mu\,\tilde\Delta^{(T)}_U , $$
and similarly for $G^{(T)}_{(\mu)SW}$ and $G^{(T)}_{(\mu)C}$, which is given, for each of the two
polarizations $+$ and $\times$, by
$$ \begin{aligned}
\tau_C\, G^{(T)}_{tC} &= -\Delta^{(T)}_t + (1-\mu^2)\,\Upsilon^{(T)} , \\
\tau_C\, G^{(T)}_{QC} &= -\Delta^{(T)}_Q - (1+\mu^2)\,\Upsilon^{(T)} , \\
\tau_C\, G^{(T)}_{UC} &= -\Delta^{(T)}_U + 2\mu\,\Upsilon^{(T)} , \\
\tau_C\, G^{(T)}_{VC} &= -\Delta^{(T)}_V , \\
\Upsilon^{(T)} &\equiv \tfrac38\, \tfrac12 \int d\mu'\, \big[ \tfrac12 (1-(\mu')^2)\,\Delta^{(T)}_t - \tfrac12 (1+(\mu')^2)\,\Delta^{(T)}_Q + \tfrac12\, 2\mu'\,\Delta^{(T)}_U \big] .
\end{aligned} \tag{373} $$
Although the derivation of $[P]^{(\mu)}_{(\nu)}$ was done in the comoving baryon gauge, the
tensor terms $\Delta^{(T)}_{(\mu)}$ are all gauge invariant, so are valid in any gauge.
The classical route is to have the form of $[P]^{(\mu)}_{(\nu)}$ compel one to first
transform to variables $\Delta^{(T)}_{(\mu)}$ which take out the $\cos(2\phi)$, $\sin(2\phi)$, then be
compelled by the form of eqs. (373) to introduce the Polnarev combinations
$\tilde\Delta^{(T)}_{(\mu)}$. Note that $\Delta^{(T)}_V$ obeys pure damping equations with no source terms,
hence remains unexcited by gravitational waves, and so vanishes identically,
a result which follows directly in the tilde representation.
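The $\tilde\Delta^{(T)}_{t,U,V,Q}$ obey the simplified transfer equations
$$ \begin{aligned}
\frac{\partial}{\partial\tau}\tilde\Delta^{(T)}_t + \hat q\cdot\nabla\tilde\Delta^{(T)}_t &= \tfrac12 \dot h^{(T)} - \tau_C^{-1}\tilde\Delta^{(T)}_t + \tau_C^{-1}\Upsilon^{(T)} , \\
\frac{\partial}{\partial\tau}\tilde\Delta^{(T)}_Q + \hat q\cdot\nabla\tilde\Delta^{(T)}_Q &= -\tau_C^{-1}\tilde\Delta^{(T)}_Q + \tau_C^{-1}\Upsilon^{(T)} , \qquad \tilde\Delta^{(T)}_Q = \tilde\Delta^{(T)}_U , \\
\Upsilon^{(T)} &\equiv \tfrac38\, \tfrac12 \int d\mu'\, \big[ \tfrac12 (1-(\mu')^2)^2\, \tilde\Delta^{(T)}_t + \tfrac12 (1 + 6(\mu')^2 + (\mu')^4)\, \tilde\Delta^{(T)}_Q \big] , \\
\Upsilon^{(T)} &= \tfrac{1}{10}\tilde\Delta^{(T)}_{t0} + \tfrac17\tilde\Delta^{(T)}_{t2} + \tfrac{3}{70}\tilde\Delta^{(T)}_{t4} + \tfrac35\tilde\Delta^{(T)}_{Q0} - \tfrac67\tilde\Delta^{(T)}_{Q2} + \tfrac{3}{70}\tilde\Delta^{(T)}_{Q4} .
\end{aligned} \tag{374} $$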
As for scalar perturbations, these two transfer equations are solved by
expanding in Legendre polynomials, $P_\ell(\mu)$ [140]. The moment equations are
identical in form to those for scalar perturbations, except that only the
$\ell = 0$ equations have nonzero sources for both $\tilde\Delta^{(T)}_{t,Q\,\ell}$: Higher moments have
only the usual $\tau_C^{-1}\tilde\Delta^{(T)}_{t\ell}$ damping and grow only as a result of the flux from
lower $\ell$'s through the $\hat q\cdot\hat k\,\tilde\Delta^{(T)}_{t,Q}$ propagation term. The $\ell = 0$ source
feeding the development of total anisotropy is $\frac12 \dot h^{(T)} + \tau_C^{-1}\Upsilon^{(T)}$. The
polarization growth is fed by $\tau_C^{-1}\Upsilon^{(T)}$ in the $\ell = 0$ equation.
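The moment form of Υ given with eq. (374) can be cross-checked against its
integral form by projecting the two angular kernels onto Legendre
polynomials. The sketch below does this in exact rational arithmetic. It
assumes the moment convention Δ̃(µ) = Σ_ℓ (2ℓ+1)(−i)^ℓ Δ̃_ℓ P_ℓ(µ), which is the
convention implied by the plane-wave expansion of exp(−ik·q̂χ) in spherical
Bessel functions; all function and variable names are illustrative.

```python
from fractions import Fraction as F

# Even Legendre polynomials as coefficient lists in ascending powers of mu.
LEGENDRE = {
    0: [F(1)],
    2: [F(-1, 2), F(0), F(3, 2)],
    4: [F(3, 8), F(0), F(-15, 4), F(0), F(35, 8)],
}


def poly_mul(a, b):
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def integrate_m1_p1(poly):
    # The integral of mu^n over [-1, 1] is 2/(n+1) for even n and zero for odd n.
    return sum(c * F(2, n + 1) for n, c in enumerate(poly) if n % 2 == 0)


def moment_weights(kernel):
    """Weights w_ell such that (1/2) * Int dmu kernel(mu) Delta(mu) = sum_ell w_ell Delta_ell."""
    weights = {}
    for ell, p_ell in LEGENDRE.items():
        a_ell = F(2 * ell + 1, 2) * integrate_m1_p1(poly_mul(kernel, p_ell))
        phase = -1 if ell % 4 == 2 else 1  # (-i)^ell for even ell
        weights[ell] = phase * a_ell
    return weights


# (3/8)(1/2)(1 - mu^2)^2 and (3/8)(1/2)(1 + 6 mu^2 + mu^4), ascending powers of mu.
KERNEL_T = [F(3, 16) * c for c in (1, 0, -2, 0, 1)]
KERNEL_Q = [F(3, 16) * c for c in (1, 0, 6, 0, 1)]
```

With this convention `moment_weights(KERNEL_T)` gives 1/10, 1/7 and 3/70 for
ℓ = 0, 2, 4, and `moment_weights(KERNEL_Q)` gives 3/5, −6/7 and 3/70.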
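Given $\dot h^{(T)}$, there is an exact solution for $\tilde\Delta^{(T)}_t - \tilde\Delta^{(T)}_Q$, which is a
free-streaming solution including damping (associated with the Thomson depth
$\zeta_C$). The polarization is quite small [141], so this is also a good
approximation for $\tilde\Delta^{(T)}_t$, the solution when the $\Upsilon^{(T)}$ feed is neglected:
$$ \tilde\Delta^{(T)}_t \approx \tilde\Delta^{(T)}_t - \tilde\Delta^{(T)}_Q = \int_0^{\tau_0} e^{-\zeta_C(\tau)}\, d\tau\; e^{-i\mathbf{k}\cdot\hat q\chi}\, \tfrac12 \dot h^{(T)}(\tau) , \tag{375} $$
$$ \tilde\Delta^{(T)}_{t\ell} \approx \tilde\Delta^{(T)}_{t\ell} - \tilde\Delta^{(T)}_{Q\ell} = \int_0^{\tau_0} e^{-\zeta_C(\tau)}\, d\tau\; j_\ell(k\chi)\, \tfrac12 \dot h^{(T)}(\tau) . \tag{376} $$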
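The line-of-sight integral of eq. (376) is straightforward to evaluate
numerically once the damping factor is replaced by a step function (unity
after recombination, zero before). The sketch below is a minimal
trapezoidal-rule version under stated assumptions: χ = τ0 − τ is taken as the
comoving distance back along the line of sight, and the optional
`hdot_matter` helper uses the standard matter-dominated gravity wave solution
h = 3 j1(kτ)/(kτ), which is an outside assumption and is not derived in this
section. Function names and the step count are illustrative.

```python
import math


def sph_j2(x):
    """Spherical Bessel function j_2(x), with a series branch near x = 0."""
    if abs(x) < 0.05:
        return x * x / 15.0 * (1.0 - x * x / 14.0)
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - 3.0 * math.cos(x) / x**2


def brick_wall_moment(k, jl, hdot, tau_rec, tau0, n=4000):
    """Trapezoidal estimate of Int_{tau_rec}^{tau0} dtau jl(k chi) (1/2) hdot(tau).

    The step-function damping factor is implemented by starting the integral
    at tau_rec; chi = tau0 - tau is an assumption of this sketch.
    """
    step = (tau0 - tau_rec) / n
    total = 0.0
    for i in range(n + 1):
        tau = tau_rec + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * jl(k * (tau0 - tau)) * 0.5 * hdot(tau)
    return total * step


def hdot_matter(k):
    """ASSUMPTION: matter-dominated h = 3 j1(k tau)/(k tau), so hdot = -3 k j2(k tau)/(k tau)."""
    def hdot(tau):
        x = k * tau
        return -3.0 * k * sph_j2(x) / x
    return hdot
```

A call such as `brick_wall_moment(k, sph_j2, hdot_matter(k), tau_rec, tau0)`
then gives the ℓ = 2 moment for one wavenumber; higher ℓ would need a
spherical Bessel routine that is stable for kχ < ℓ.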
Although working with the $+$, $\times$ quantities has some advantages, for
derivations it is useful to use the expansion
$$ \Delta^{(T)}_t = -\sqrt{\frac{16\pi}{15}}\; \tilde\Delta^{(TG)}_t\, Y_{22} + {\rm cc} , \qquad \tilde\Delta^{(TG)}_t \equiv \frac{1}{\sqrt2}\,\big(\tilde\Delta^{(T+)}_t - i\,\tilde\Delta^{(T\times)}_t\big) . \tag{377} $$
Here cc denotes complex conjugate. This explicitly shows that an $\ell = 2$ tensor
component is the leading term coming out of gravity waves, whereas for scalar
modes there are (gauge dependent) $\ell = 0$ and $\ell = 1$ terms. We also introduce
the notation $h_G \equiv (h^{(T+)} - i h^{(T\times)})/\sqrt2$ for the analogous gravity wave
contribution. To go from $\tilde\Delta^{(TG)}_t$ to multipole components on the sky and the
angular power spectrum, we make use of
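$$ Y_{\ell\,\pm2}(\mu,0) = \sum_m D^{(\ell)}_{\pm2\,m}(\hat k)\, Y_{\ell m}(\hat q) , \qquad \mu = \hat k\cdot\hat q , $$
$$ \begin{aligned}
(2\ell+1)\, Y_{2\,\pm2}(\mu,0)\, P_\ell(\mu)
&= \sum_{\ell'} \sqrt5\, \sqrt{2\ell'+1}\; \langle 2\,0\,\ell'\,0|\ell\,0\rangle\, \langle 2\,{\pm2}\;\ell'\,{-(\pm2)}|\ell\,0\rangle\; Y_{\ell'\,\pm2}(\mu,0) \\
&= \sqrt5 \sum_{\ell'} \sqrt{2\ell'+1}\; \frac12\sqrt{\frac32}\; \sqrt{(\ell'-1)\ell'(\ell'+1)(\ell'+2)} \\
&\quad \times \left[ \frac{\delta_{\ell,\ell'+2}}{(2\ell'+1)(2\ell'+3)} - 2\,\frac{\delta_{\ell,\ell'}}{(2\ell'-1)(2\ell'+3)} + \frac{\delta_{\ell,\ell'-2}}{(2\ell'-1)(2\ell'+1)} \right] Y_{\ell'\,\pm2}(\mu,0) .
\end{aligned} $$
$D^{(\ell)}_{\pm2\,m}(\alpha,\beta,\gamma)$ denotes the irreducible rotation tensor of rank $\ell$ for a
rotation with Euler angles $\alpha, \beta, \gamma$, with here $\alpha = 0$ and $\beta, \gamma$ the polar angles
of $\hat k$. The $\langle \ell\, m\, \ell'\, m'|L\,M\rangle$ are Clebsch–Gordan coefficients [263]. Thus the
multipole coefficients are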
$$ a^{(T)}_{t,\ell m} = \sum_{\mathbf{k}} D^{(\ell)}_{\pm2\,m}(\hat k)\, \sqrt{2\ell+1}\, \sqrt{(\ell-1)\ell(\ell+1)(\ell+2)} \left[ \frac{\tilde\Delta^{(TG)}_{t,\ell-2}}{(2\ell-1)(2\ell+1)} + 2\,\frac{\tilde\Delta^{(TG)}_{t,\ell}}{(2\ell-1)(2\ell+3)} + \frac{\tilde\Delta^{(TG)}_{t,\ell+2}}{(2\ell+1)(2\ell+3)} \right] + {\rm cc} $$
and the differential angular power spectrum is
$$ \frac{dC^{(T)}_{t\ell}}{d\ln k} = \ell(\ell+1)\Big(1-\frac{1}{\ell^2}\Big)\Big(1+\frac{2}{\ell}\Big)\, \frac{k^3}{2\pi^2}\, \left| \frac{\tilde\Delta^{(TG)}_{t,\ell-2}}{(1-\frac{1}{2\ell})(1+\frac{1}{2\ell})} + 2\,\frac{\tilde\Delta^{(TG)}_{t,\ell}}{(1-\frac{1}{2\ell})(1+\frac{3}{2\ell})} + \frac{\tilde\Delta^{(TG)}_{t,\ell+2}}{(1+\frac{1}{2\ell})(1+\frac{3}{2\ell})} \right|^2 . \tag{378} $$
When we use the brick wall approximation for $e^{-\zeta_C(\tau)}$, unity after
recombination, zero before, in eq. (376) we obtain the Abbott and Wise [225]
approximation for tensor mode microwave background fluctuations. Keeping the
full $e^{-\zeta_C(\tau)}$ improves the approximation. Obtaining the power spectra for the
polarization is more complex because the multiplying functions going from
$\Delta^{(T)}_{Q,U}$ to the $\tilde\Delta^{(T)}_Q$ variables are not simply $Y_{2\,\pm2}$.
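The combination of neighbouring moments entering eq. (378) can be written
as a small helper. The sketch below follows the form of eq. (378) as it can
be read from the text: a prefactor ℓ(ℓ+1)(1 − 1/ℓ²)(1 + 2/ℓ) k³/(2π²) times
the squared modulus of the weighted sum of the ℓ−2, ℓ and ℓ+2 moments. The
function name and argument names are illustrative, and the moments
themselves are assumed to be supplied by some other calculation.

```python
import math


def tensor_dcl_dlnk(ell, k, delta_lm2, delta_l, delta_lp2):
    """Differential tensor power per ln k from three neighbouring moments.

    delta_lm2, delta_l, delta_lp2 are the (possibly complex) moments at
    ell-2, ell and ell+2 for wavenumber k; ell must be at least 2.
    """
    x = 1.0 / (2.0 * ell)
    prefactor = (ell * (ell + 1) * (1.0 - 1.0 / ell**2) * (1.0 + 2.0 / ell)
                 * k**3 / (2.0 * math.pi**2))
    combo = (delta_lm2 / ((1.0 - x) * (1.0 + x))
             + 2.0 * delta_l / ((1.0 - x) * (1.0 + 3.0 * x))
             + delta_lp2 / ((1.0 + x) * (1.0 + 3.0 * x)))
    return prefactor * abs(combo) ** 2
```

In the large-ℓ limit the weights tend to 1, 2 and 1, so for slowly varying
moments the bracket approaches four times a single moment.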
References
[1] Ya.B. Zeldovich and R.A. Sunyaev, 1980, Ann. Rev. Astron. Ap. 18,
537; R.A. Sunyaev and Ya.B. Zeldovich, 1981, Space Physics Reviews 1,
1.
[2] J.R. Bond, 1988, in: W.G. Unruh, ed., The Early Universe, Proc. NATO
Summer School, Vancouver Is., Aug. 1986 (Reidel, Dordrecht).
[3] J.R. Bond, 1989, The formation of cosmic structure, in: A. Astbury et
al., eds., Frontiers in Physics – From Colliders to Cosmology (World
Scientific, Singapore) p. 182.
[4] G. Efstathiou, 1990, in: A. Heavens, A. Davies and J. Peacock, eds.,
Physics of the Early Universe, Scottish Universities Summer School Pub-
lications.
[5] J.R. Bond, 1994, Cosmic Structure Formation and the Background Ra-
diation, in: T. Padmanabhan, ed., Proceedings of the IUCAA Dedication
Ceremonies, Pune, India, Dec. 28–30, 1992 (Wiley).
[6] J.R. Bond, 1994, Testing inflation with the cosmic background radia-
tion, in: M. Sasaki, ed., Relativistic Cosmology, Proc. 8th Nishinomiya-
Yukawa Memorial Symposium (Academic Press) astro-ph/9406075.
[7] M. White, D. Scott and J. Silk, 1994, Ann. Rev. Astron. Ap. 32, 319.
[8] P.J.E. Peebles, 1993, Principles of Physical Cosmology (Princeton Uni-
versity Press, Princeton, NJ), especially chapters 6 and 24.
[9] R.B. Partridge, 1994, 3 K: The Cosmic Microwave Background Radiation
(Cambridge University Press, Cambridge).
[10] The Evolution of the Universe, Dahlem Workshop Report ES 19, ed.
G. Borner and S. Gottlober, (John Wiley and Sons); e.g., J.R. Bond
1997, pp. 199-223, Implications of the Background Radiation for Cosmic
Structure Formation
[11] J.C. Mather et al., 1994, Ap. J. Lett. 420, 439.
[12] D.J. Fixsen et al., 1996, Ap. J. 473, 576.
[13] A.A. Penzias and R.W. Wilson, 1965, Ap. J. 142, 419.
[14] R. Weiss, 1980, Ann. Rev. Astron. Ap. 18, 489.
[15] H.P. Gush, M. Halpern and E. Wishnow, 1990, Phys.Rev. Lett. 65, 537.
[16] T.F. Howell and J.R. Shakeshaft, 1966, Nature 210, 1318; 1967, 216,
753.
[17] A. Kogut, 1991, in: S.S. Holt, C.L. Bennett and V. Trimble, eds., After
the First Three Minutes, AIP Conference Proceedings, Vol. 222 (AIP,
New York).
[18] D.G. Johnson and D. Wilkinson, 1987, Ap. J. Lett. 313, L1.
[19] M. Bensadoun et al., 1993, Ap. J. 409, 1.
[20] G. Sironi et al., 1990 Ap. J. 357, 301; 1994, 1995, in Astrophys. Lett.
and Comm., 32.
[21] S. Staggs and D. Wilkinson, 1995, in Astrophys. Lett. and Comm., 32.
[22] M. Bersanelli et al., 1995, in Astrophys. Lett. and Comm., 32.

[23] G. Herzberg, 1945, Molecular Spectra and Molecular Structure (Prentice
Hall, New York).
[24] P. Thaddeus, 1972, Ann. Rev. Astron. Ap. 10, 305.
[25] D.M. Meyer and M. Jura, 1985, Ap. J. 297, 119.
[26] K. Roth, D.M. Meyer and I. Hawkins, 1993, Ap. J. Lett. 413, L67.
[27] P. Crane, 1989, Ap. J. 346, 136; 1995, in Astrophys. Lett. and Comm.
32; E. Palazzi, 1990, Ap. J. 357, 14.
[28] D.P. Woody and P.L. Richards, 1981, Ap. J. 248, 18.
[29] H.P. Gush, 1981, Phys. Rev. Lett. 47, 745.
[30] J.B. Peterson, P.L. Richards and T. Timusk, 1985, Phys. Rev. Lett. 55,
332.
[31] T. Matsumoto, S. Hayakawa, H. Matsuo, H. Murakami, S. Sato, A.E.
Lange and P.L. Richards, 1988, Ap. J., 329, 567.
[32] R. Sachs and A. Wolfe, 1967, Ap. J. 147, 73.
[33] A. Kompaneets, 1957, Sov. Phys. JETP 4, 730.

[34] Ya.B. Zeldovich and R.A. Sunyaev, 1969, Ap. Space Sci. 4, 301.
[35] R.J. Gould, 1984, Ap. J. 285, 275.
[36] C. Burigana, L. Danese and G. DeZotti, 1991, Astron. Ap. 246, 49.
[37] Yu.E. Lyubarskii and R.A. Sunyaev, 1983, Astron. Ap. 123, 171.
[38] R.A. Windhorst, E.B. Fomalont, R.B. Partridge and J.D. Lowenthal
1993, Ap. J. 405, 498; R.A. Windhorst et al.1995, Nature 375, 471.
[39] C.G.T. Haslam, H. Stoffel, C.J. Salter, and W.E. Wilson 1982, Astron.
Ap. Supp. 47, 1.
[40] P. Reich and W. Reich 1988, Astron. Ap. Supp. 74, 7.
[41] M. Jones 1996, preprint
[42] J.R. Bond, B.J. Carr and C.J. Hogan, 1991, Ap. J. 367, 420 [BCH2].
[43] E.L. Wright et al., 1991, Ap. J. 381, 200.
[44] F.-X. Desert, F. Boulanger and J.L. Puget, 1990, Astron. Ap. 237, 215.
[45] J.L. Puget and A. Leger, 1989, Ann. Rev. Astron. Ap. 27, 161.
[46] J.S. Mathis, W. Rumpl and K.H. Nordsieck, 1977, Ap. J. 217, 425.
[47] W.J. Barnes, 1994, MIT Thesis, A Model of Galactic Dust and Gas from
FIRAS.
[48] W. Reach et al., 1995, Ap. J. 451, 188.
[49] J.-L. Puget et al., 1996, Astron. Ap. 308, L5.
[50] J. Dorscher and T. Henning, 1995, Astron. Ap. Reviews 6, 271.
[51] S.-H. Kim, P.G. Martin and P.D. Hendry, 1994, Ap. J. 422, 164.
[52] E.L. Wright, 1987, Ap. J. 320, 818.
[53] D. Layzer and R. Hively, 1973, Ap. J. 179, 361
[54] F. Hoyle, 1980, Steady State Cosmology Revisited, (University College
Cardiff Press, Cardiff, Wales).
[55] F. Hoyle, G.R. Burbidge and J.V. Narlikar, 1993, Ap. J. 410, 437; 1994,
M.N.R.A.S. 267, 1007.
[56] N.C. Rana, 1979, Ap. Space Sci. 66, 173; 1980 Ap. Space Sci. 71, 123;
1981, M.N.R.A.S. 197, 1125.
[57] E.L. Wright, 1982, Ap. J. 255, 401; I. Hawkins and E.L. Wright, 1988,
Ap. J. 324, 46.
[59] P. Guhathakurta and B.T. Draine, 1989, Ap. J. 345, 230.
[60] P.J.E. Peebles, 1968, Ap. J. 153, 1.
[61] Ya.B. Zeldovich, V.G. Kurt and R.A. Sunyaev, 1969, Sov. Phys. JETP
28, 146.
[62] D.R. Bates and A. Dalgarno, 1962, in: D.R. Bates, ed., Atomic and
Molecular Processes (Academic Press) p. 245.
[63] W.J. Boardman, 1964, Ap. J. Suppl. 9, 185.
[64] D.E. Osterbrock, 1974, Astrophysics of Gaseous Nebulae (Freeman, San
Francisco).
[65] J.H. Krolik, 1990, Ap. J. 353, 21.
[66] B.J. Carr, J.R. Bond and W.D. Arnett, 1984, Ap. J. 277, 445.
[67] W.H. Press and P. Schechter, 1974, Ap. J. 187, 425.
[68] J.R. Bond and S. Myers, 1996, The peak-patch picture of cosmic cat-
alogues I: algorithms; II: validation, Ap. J. Supp. 103, 1; IV: analytic
methods, preprint.
[69] M. Tegmark, J. Silk and A. Blanchard, 1994, Ap. J. 420, 484.
[70] M. Fukugita and M. Kawasaki, 1994, M.N.R.A.S. 269, 563.
[71] P.J.E. Peebles, 1987, Ap. J. 277, L1.
[72] J. Bartlett and A. Stebbins, 1991, Ap. J. 371, 8.
[73] T.P. Walker, G. Steigman, D.N. Schramm, K.A. Olive and H.S. Kang,
1991, Ap. J. 376, 51; M.S. Smith, L.H. Kawano and R.A. Malaney, 1993,
Ap. J. Supp. 85, 219.
[74] S. Ikeuchi, K. Tomisaka and J.P. Ostriker, 1983, Ap. J. 265, 583.
[75] J.J. Levin, K. Freese and D.N. Spergel, 1992, Ap. J. 389, 464.
[76] J.P. Ostriker, C. Thompson and E. Witten, 1986, Phys. Lett. B180,
231.
[77] J.P. Ostriker and C. Thompson, 1987, Ap. J. Lett. 323, L97.
[78] C. Thompson, 1993, private communication.
[79] M.G. Hauser et al., 1991, in: S.S. Holt, C.L. Bennett and V. Trimble,
eds., After the First Three Minutes, AIP Conference Proceedings, Vol.
222, (AIP, New York) p. 161.
[80] M.G. Hauser, 1995, in Unveiling the Cosmic Infrared Background, ed. E.
Dwek, AIP Conference Proceedings, (AIP, New York); 1995, IAU Sympo-
sium 168, Examining the Big Bang and Diffuse Background Radiations,
ed. M. Kafatos and Y. Kondo (Kluwer, Dordrecht).
[81] J.R. Bond, B.J. Carr and C.J. Hogan, 1986, Ap. J. 306, 428.
[82] J.R. Bond and S. Myers, 1993, in: M. Shull and H. Thronson, eds., The
Evolution of Galaxies and their Environment, Proceedings of the Third
Teton Summer School, NASA Conference Publication 3190, p. 21.
[83] A. Franceschini, L. Toffolatti, P. Mazzei, L. Danese and G. De Zotti,
1991, Astron. Ap. Suppl. 89, 285; G. De Zotti et al., 1995, in: Dust,
Molecules and Backgrounds: from Laboratory to Space, Planetary and
Space Science, in press.
[84] C. Bennett et al., 1994, Ap. J., 436, 423.
[85] C. Bennett et al., 1996, Ap. J. Lett., 464, 1; and 4-year DMR references
therein.
[86] A. Kogut et al., 1996, Ap. J. Lett. 464, 5.
[87] G. Hinshaw et al., 1996, Ap. J. Lett. 464, 17.
[88] J.R. Bond and G. Efstathiou, 1987, M.N.R.A.S. 226, 655.
[89] J.R. Bond, 1995, Astrophys. Lett. and Comm., 32, 63.
[90] G.F. Smoot et al., 1992, Ap. J. Lett. 396, L1.
[91] R. Kneissl and G. Smoot, 1993, COBE note 5053.
[92] C. Lineweaver and G. Smoot, 1993, COBE note 5051.
[93] K. Ganga, E. Cheng, S. Meyer and L. Page, 1993, Ap. J. Lett. 410, L57
[firs].
[94] A.C.S. Readhead et al., 1989, Ap. J. Lett. 346, 556 [ov7].
[95] P. Meinhold and P. Lubin, 1991, Ap. J. Lett. 370, 11 [sp89].
[96] T. Gaier et al., 1992, Ap. J. Lett. 398, L1; J. Schuster et al., 1993,
Ap. J. Lett. 412, L47 [sp91].
[97] E.J. Wollack et al., 1993, Ap. J. Lett. 419, L49 [sk93].
[98] P. Meinhold et al., 1993, Ap. J. Lett. 409, L1; J. Gunderson et al., 1993,
Ap. J. Lett. 413, L1 [max3].
[99] A. Clapp et al., 1994, M. Devlin et al., 1994, Ap. J. 430, 1 [MAX4,
max4].
[100] S. Tanaka et al., 1996, Ap. J. Lett. 468, 81 [max5].
[101] E.S. Cheng et al., 1993, Ap. J. Lett. 420, L37 [msam2 (g2), msam3 (g3)].
[102] G.S. Tucker et al., 1993, Ap. J. Lett. 419, L45 [wd2, wd1].
[103] S.M. Gutierrez de la Cruz et al., 1995, Ap. J. 442, 10; S. Hancock et al.,
1994, Nature 367, 333; R. Watson et al., 1992, Nature 357, 660 [ten].
[104] J.E. Ruhl et al., 1995, Ap. J. Lett. 453, L1 [py].
[105] P. de Bernardis et al., 1994, Ap. J. Lett. 422, L33 [ar].
[106] M. White and M. Srednicki, 1995, Ap. J. 443, 6.
[107] T.N. Gautier et al., 1992, AJ 103, 1313.
[108] A. Kogut et al., 1996, Ap. J. 460, 1.
[109] E.T. Vishniac, 1987, Ap. J. 322, 597.
[110] L. Kofman and A.A. Starobinsky, 1985, Sov. Astron. Lett. 11, 271.
[111] B. Chaboyer et al., 1996, Science, in press.
[112] S. Coles and N. Kaiser, 1987, M.N.R.A.S., 233, 637.
[113] R. Schaeffer and J. Silk, 1988, Ap. J., 332, 1.
[114] W.K. Gear and C.R. Cunningham, 1995, ASP conference series 75, 215,
(Multi-feed Systems for Radio Telescopes, ed., D.T. Emerson and J.M.
Payne).
[115] S. Church et al., 1993, M.N.R.A.S., in press.
[116] E. Kreysa and A. Chini, 1989, Proc. Particle Astrophysics Workshop,
Berkeley (World Scientific, Singapore).
[117] J.R. Bond and S. Myers, 1991, in: D. Cline and R. Peccei, eds., Trends
in Astroparticle Physics (World Scientific, Singapore) 262.
[118] M. Markevitch, G.R. Blumenthal, W. Forman, C. Jones and R.A. Sun-
yaev, 1992, Ap. J. 395, 326.
[119] J.G. Bartlett, A.K. Gooding and D.N Spergel, 1993, Ap. J. 403, 1; J.G.
Bartlett and J. Silk, 1994, Ap. J. 423, 12.
[120] J.R. Bond and S. Myers, 1996, The peak-patch picture of cosmic cat-
alogues III: application to cluster X-ray emission and the SZ effect,
Ap. J. Supp. 103, 1.
[121] S. Colafrancesco, P. Mazzotta, Y. Rephaeli and N. Vittorio, 1994, Ap. J.
433, 454.
[122] R. Scaramella, R. Cen and J.P. Ostriker et al., 1994, Ap. J. 416, 399.
[123] R.B. Partridge and D.T. Wilkinson, 1967, Phys. Rev. Lett. 18, 557.
[124] E.K. Conklin and R.N. Bracewell, 1967, Nature 216, 777.
[125] P.E. Boynton and R.B. Partridge, 1973, Ap. J. 181, 243; R.B. Partridge,
1980, Ap. J. 235, 681.
[126] J.M. Uson and D.T. Wilkinson, 1984, Ap. J. Lett. 277, L1.
[127] F. Melchiorri, B.O. Melchiorri, C. Ceccarelli and L. Pietranera, 1981,
Ap. J. Lett. 250, L1.
[128] I.A. Strukov, D.P. Skulachev and A.A. Klypin, 1987, in: J. Audouze and
A.S. Szalay, eds., Proceedings I.A.U. Symposium 130 (Reidel, Dordrecht).
[129] R.D. Davies et al., 1987, Nature 326, 462.
[130] P.J.E. Peebles and J.T. Yu, 1970, Ap. J. 162, 815.
[131] A.G. Doroshkevich, Ya.B. Zeldovich and R.A. Sunyaev, 1978, Sov. As-
tron. 22, 523.
[132] M.L. Wilson and J. Silk, 1981, Ap. J. 243, 14.
[133] M.L. Wilson, 1983, Ap. J. 273, 2.
[134] J.R. Bond and G. Efstathiou, 1984, Ap. J. Lett. 285, L45.
[135] N. Vittorio and J. Silk, 1984, Ap. J. Lett. 285, L39.
[136] N. Vittorio and J. Silk, 1992, Ap. J. Lett. 385, 9.
[137] M. Fukugita, N. Sugiyama and M. Umemura, 1990, Ap. J. 358, 28.
[138] N. Gouda, N. Sugiyama and M. Sasaki, 1991, Prog. Theor. Phys. 85,
1023; 1991, Ap. J. Lett. 327, 49; N. Gouda, M. Sasaki and Y. Suto, 1989,
Ap. J. 341, 557.
[139] K.M. Gorski, R. Stompor and R. Juszkiewicz, 1993, Ap. J. Lett. 410, 1.
[140] R. Crittenden, J.R. Bond., R.L. Davis., G. Efstathiou and P.J. Stein-
hardt, 1993, Phys. Rev. Lett. 71, 324.
[141] R. Crittenden, R. Davis and P. Steinhardt, 1993, Ap. J. Lett., L13.
[142] S. Dodelson and J. Jubas, 1994, Phys. Rev. Lett. 70, 2224.
[143] U. Seljak, 1994, Ap. J. Lett., in press.
[144] J.R. Bond, J.R. Crittenden, R.L. Davis, G. Efstathiou and P.J. Stein-
hardt, 1994, Phys. Rev. Lett. 72, 13.
[145] J.R. Bond, R.L. Davis and P.J. Steinhardt, 1995, Astrophys. Lett. and
Comm., 32, 53.
[146] J.R. Bond, 1994, Phys. Rev. Lett. 74, 4369.
[147] K. Ganga, L. Page, E. Cheng and S. Meyer, 1994, Ap. J. Lett. 432, L15.
[148] J.O. Gundersen et al., 1995, Ap. J. Lett. 443, L57 [sp94].
[149] C.B. Netterfield, N. Jarosik, L. Page and D. Wilkinson, 1995, Ap. J. Lett.
455, L69 [sk94].
[150] C.B. Netterfield, M.J. Devlin, N. Jarosik, L. Page and E.J. Wollack, 1997,
Ap. J. 474, 47 [sk95].
[151] P.F. Scott et al., 1996, Ap. J. Lett. 461, 1 [cat96].
[152] C. Bennett et al., 1996, MAP experiment home page,
http://map.gsfc.nasa.gov
[153] M.A. Janssen et al., 1997, Ap. J., in press, astro-ph/9602009.
[154] M. Bersanelli et al., 1996, COBRAS/SAMBA, The Phase A Study for
an ESA M3 Mission, preprint.
[155] J.O. Berger 1985, Statistical Decision Theory and Bayesian Analysis
(Springer-Verlag, New York).
[156] C.W. Therrien 1992, Discrete Random Signals in Statistical Signal Pro-
cessing, ISBN 0-13-852112-3 (Prentice Hall).
[157] E.F. Bunn et al., 1994, Ap. J. Lett. 432, 75.
[158] M. White 1996, Phys. Rev. D53, 3011.
[159] L. Knox 1995, Phys. Rev. D 52, 4307.
[160] M. Tegmark, and G. Efstathiou 1996, M.N.R.A.S. in press.
[161] G. Jungman, M. Kamionkowski, A. Kosowsky, and D.N. Spergel, 1996,
Phys. Rev. Lett. 76, 1007.
[162] Bond, J.R. and Jaffe, A., 1996, CITA preprint, astro-ph/9610091.
[163] L. Knox 1996, CITA preprint, astro-ph/9606066.
[164] J.R. Bond, G. Efstathiou and M. Tegmark, 1997, astro-ph/9702100.
[165] M. Zaldarriaga, D. Spergel, and U. Seljak, 1997, astro-ph/9702157.
[166] D. Coulson, R.G. Crittenden and N.G. Turok, 1994, Phys. Rev. Lett. 73,
2390.
[167] R. Arnowitt, S. Deser and C.W. Misner, 1962, in: L. Witten, ed., Grav-
itation (Wiley, New York) pp. 227–265.
[168] J.W. York, 1979, in: L. Smarr, ed., Sources of Gravitational Radiation
(Cambridge University Press, Cambridge) p. 83.
[169] J.W. York, 1983, in: N. Deruelle and T. Piran, eds., Gravitational Ra-
diation, 1982 Les Houches Proceedings (North-Holland, Amsterdam) p.
175.
[170] E.M. Lifshitz, 1946, JETP Letters 16, 587.
[171] J.M. Bardeen, 1980, Phys. Rev. D 22, 1882.
[172] T. Piran, 1988, in: W.G. Unruh, ed., The Early Universe, Proc. NATO
Summer School, Vancouver Is., Aug. 1986 (Reidel, Dordrecht).
[173] H. Kodama and M. Sasaki, 1984, Prog. Theor. Phys. Suppl. 78, 1.
[174] H. Kodama and M. Sasaki, 1986, Intern. J. Mod. Phys. A1, 265.
[175] J. Bardeen, P.J. Steinhardt and M.S. Turner, 1983, Phys. Rev. D 28,
679.
[176] G. Chibisov and V.F. Mukhanov, 1982, M.N.R.A.S. 200, 535; V.F.
Mukhanov, H.A. Feldman and R.H. Brandenberger, 1992, Phys. Rep.
215, 203; D.H. Lyth, 1985, Phys. Rev. D 31, 1792.
[177] V. Mukhanov, 1988, JETP 94, 1.
[178] J.M. Bardeen, 1988, in: A. Zee, ed., Proc. CCAST Symposium on Parti-
cle Physics and Cosmology (Gordon and Breach, New York).
[179] L. Landau and Lifshitz, 1971, Classical Theory of Fields (Addison-Wesley,
Reading, MA).
[180] A.A. Starobinski, 1986, in: H.T. de Vega and N. Sanchez, eds., Current
Topics in Field Theory, Quantum Gravity, and Strings, Proc. Meudon
and Paris VI, Vol. 246 (Springer Verlag) p. 107.
[181] D.S. Salopek and J.R. Bond, 1990, Phys. Rev. D 42, 3936; 1991,
Phys. Rev. D 43, 1005.
[182] D.S. Salopek and J. Stewart, 1994, preprint.
[183] E.D. Stewart and D.H. Lyth, 1993, Phys.Lett. 302B, 171.
[184] A.R. Liddle and M.S. Turner, 1994, Fermilab preprint, astro-ph/9402021
FERMILAB-Pub-93/399-A; E.W. Kolb and S.L. Vadas, 1994, Fermilab
preprint, astro-ph/9403001.
[185] D.H. Lyth and E.D. Stewart, 1992, Phys. Lett. 274B, 168.
[186] D. La and P.J. Steinhardt, 1989, Phys. Rev. Lett. 62, 376.
[187] E.W. Kolb, D.S. Salopek and M.S. Turner, 1990, Phys. Rev. D 42, 3925.
[188] A.D. Linde, 1983, Phys. Lett. B 129, 177.
[189] K. Freese, J.A. Frieman, and A.V. Olinto, 1990, Phys. Rev. Lett. 65,
3233.
[190] F.C. Adams, J.R. Bond, K. Freese, J.A. Frieman and A.V. Olinto, 1993,
Phys. Rev. D, 47, 426.
[191] L.A. Kofman and A.D. Linde, 1987, Nucl. Phys. B 282, 555; L.A. Kofman
and D.Yu. Pogosyan, 1988, Phys. Lett. 214 B, 508.
[192] D.S. Salopek, J.R. Bond and J.M. Bardeen, 1989, Phys. Rev. D 40, 1753.
[193] J. Ehlers, 1971, in: R. Sachs, ed., General Relativity and Cosmology
(Academic Press, New York).
[194] J.M. Stewart, 1971, Non-Equilibrium Relativistic Kinetic Theory, Lecture
notes in Physics 10 (Springer Verlag, Berlin).
[195] J.R. Bond and A.S. Szalay, 1983, Ap. J. 277, 443.
[196] C.W. Misner, K.S. Thorne and J.A. Wheeler, 1973, Gravitation (Free-
man, San Francisco).
[197] G. Baym and L. Kadanoff, 1962, Quantum Statistical Mechanics (Ben-
jamin).
[198] R.K. Osborn and S. Yip, 1966, The Foundations of Neutron Transport
Theory (Gordon and Breach, New York).
[199] U. Fano, 1954, Phys. Rev. 93, 121.
[200] S. Chandrasekhar, 1960, Radiative Transfer (Dover, New York) pp. 1–53.
[201] J.R. Bond, 1987, in: C. Dyer and B. Tupper, eds., General Relativity and
Astrophysics, Proc. Second Canadian Conference on General Relativity
(World Scientific, Singapore) pp. 310–314.
[202] M. Jones, 1994, in: F. Durret et al., eds., Clusters of Galaxies, Proc.
XXIXth Rencontre de Moriond (Edition Frontières).
[203] Y. Rephaeli 1995, Ann. Rev. Astron. Ap. 33, 541.
[204] M. Birkinshaw, 1990, in: N. Mandolesi and N. Vittorio, eds., The Cosmic
Microwave Background: 25 Years Later (Kluwer, Dordrecht) p. 77.
[205] M. Birkinshaw, J.P. Hughes and K.A. Arnaud, 1991, Ap. J. 379, 466.
[206] M. Birkinshaw and J.P. Hughes, 1993, Ap. J..
[207] T. Herbig, C.R. Lawrence, A.C.S. Readhead and S. Gulkis, 1995
Ap. J. Lett. 449, 5.
[208] S.T. Myers et al., 1997 Ap. J. 485, 1.
[209] M. Jones et al., 1993, Nature 365, 320; K. Grainge et al., 1994,
M.N.R.A.S., 265, L57.
[210] T.M. Wilbanks, P.A.R. Ade, M.L. Fischer, W.L. Holzapfel and A.E.
Lange, 1994, Ap. J. Lett. 427, 75.
[211] H. Liang and R. Ekers 1995, Publ. Astron. Soc. Australia 12, 123.
[212] J.E. Carlstrom, M. Joy and L. Grego, 1996, Ap. J. Lett. 456, 75.
[213] G. Evrard, 1990, Ap. J. 363, 349.
[214] J.R. Bond and J. Wadsley, 1994, unpublished.
[215] G. Efstathiou and J.R. Bond, 1986, M.N.R.A.S. 218, 103.
[216] G. Efstathiou and J.R. Bond, 1987, M.N.R.A.S. 227, 33P.
[217] N. Kaiser, 1984, Ap. J. 282, 374.
[218] G. Efstathiou, 1988, in: G. Coyne and V. Rubin, eds., Large Scale Mo-
tions in the Universe, Proc. Pontifical Academy of Sciences Study Week
# 27 (Princeton University Press, Princeton).
[219] W. Hu, E. Bunn and N. Sugiyama, 1995, Ap. J. Lett. 447, 59.
[220] S. Dodelson and J. Jubas, 1995, Ap. J. 439, 503.
[221] W. Hu, D. Scott, and J. Silk 1994, Phys. Rev. D 49, 648.
[222] J.R. Bond, 1990, in: N. Mandolesi and N. Vittorio, eds., The Cosmic
Microwave Background: 25 Years Later, Proceedings of the L’Aquila
Conference (Kluwer, Dordrecht) p. 45.
[223] N. Kaiser and A. Stebbins, 1984, Nature 310, 391.
[224] A.A. Starobinsky, 1979, Pis’ma Zh. Eksp. Teor. Fiz. 30, 719; 1985, Sov.
Astron. Lett. 11, 133.
[225] L.F. Abbott and M. Wise, 1984, Nucl. Phys. B 244, 541.
[226] D.S. Salopek, 1992, Phys. Rev. Lett. 69, 3602.
[227] T. Souradeep and V. Sahni, 1992, Mod. Phys. Lett. A7, 3541.
[228] R.L. Davis et al., 1992, Phys. Rev. Lett. 69, 1856.
[229] L. Krauss and M. White, 1992, Phys. Rev. Lett. 69, 869; F. Lucchin, S.
Matarrese and S. Mollerach, 1992, Ap. J. Lett. 401, 49; A. Liddle and
D. Lyth, 1992, Phys. Lett. B 291, 391; J.E. Lidsey and P. Coles, 1992,
M.N.R.A.S. 258, 57P.
[230] A.G. Polnarev, 1985, Sov. Astron. 29, 607.
[231] G. Efstathiou, J.R. Bond and S.D.M. White, 1992, M.N.R.A.S. 258, 1P.
[232] J.M. Bardeen, J.R. Bond, N. Kaiser and A.S. Szalay, 1986, Ap. J. 304,
15.
[233] J.M. Bardeen, J.R. Bond and G. Efstathiou, 1987, Ap. J. 321, 28.
[234] E. Bertschinger et al., 1990, Ap. J. 364, 370.
[235] N. Kaiser et al., 1991, M.N.R.A.S. 252, 1.
[236] K.B. Fisher et al., 1992 Ap. J. 402, 42.
[237] M.S. Vogeley et al., 1992, Ap. J. Lett. 391, L5.
[238] G.D. Dalton et al., 1992, Ap. J. Lett. 390, L1.
[239] R.C. Nichol et al., 1992, M.N.R.A.S. 255, 21p.
[240] C.M. Baugh and G. Efstathiou, 1993, M.N.R.A.S. 265, 145.
[241] S.J. Maddox, G. Efstathiou and W.J. Sutherland, 1990, M.N.R.A.S. 246,
433.
[242] C.A. Collins, R.C. Nichol and S.L. Lumsden, 1992, M.N.R.A.S. 254, 295.
[243] J.R. Bond, G. Efstathiou, P.M. Lubin and P. Meinhold, 1991,
Phys. Rev. Lett. 66, 2179.
[244] M. Kamionkowski, B. Ratra, D.N. Spergel and N. Sugiyama, 1994,
Ap. J. Lett. 426, 57.
[245] B. Ratra and P.J.E. Peebles, 1994, Ap. J. Lett. 432, L5; Phys. Rev. D50,
5232; Phys. Rev. D, in press (PUPT-1444).
[246] J. Yokoyama et al., 1992, Ap. J. Lett. 396, L13.
[247] P.J.E. Peebles, 1983, Ap. J. Lett. 263, L1.
[248] P.J.E. Peebles, 1987, Ap. J. Lett. 277, L1.
[249] R. Cen, N. Gnedin, L. Kofman and J. Ostriker, 1992, Ap. J. 413, 1.
[250] J.A. Peacock and S.J. Dodds, 1994, M.N.R.A.S. 267, 1020.
[251] N. Sugiyama, 1995, Ap. J. Supp. 100, 281.
[252] J.R. Bond and G. Efstathiou, 1991, Phys. Lett. B 379, 440.
[253] J.A. Holtzman, 1989, Ap. J. Supp. 71, 1.
[254] L. Kofman, N. Gnedin and N. Bahcall, 1993, Ap. J. 413, 1.
[255] D.Yu. Pogosyan and A.A. Starobinsky, 1993, M.N.R.A.S. 265, 507; 1995
Ap. J.447, 465.
[257] K. Gorski et al., 1994, Ap. J. Lett. 430, L89.
[258] U. Fano, 1954, Phys. Rev. 93, 121.
[259] R. Durrer, 1989, Astron. Ap. 208, 1.
[260] C.-P. Ma and E. Bertschinger, 1995, Ap. J. 455, 7.
[261] J.R. Bond and Y. Lithwick 1995, unpublished.
[262] S. Dodelson, E. Gates, and A. Stebbins, 1996 Ap. J. 467, 10.
[263] A. deShalit and H. Feshbach, 1974, Theoretical Nuclear Physics, Volume
1: Nuclear Structure (Wiley, New York).
[264] S. Weinberg, 1972, Gravitation and Cosmology: Principles and Appli-
cations of the General Theory of Relativity (Wiley, New York).
[265] S. Weinberg, 1971, Ap. J. 168, 175.
[266] N. Kaiser, 1983, M.N.R.A.S. 202, 1169.
[267] W.H Press and E.T. Vishniac, 1980, Ap. J. 236, 323.
[268] S.A. Bonometto, F. Lucchin and R. Valdarnini, 1984, Astron. Ap. 140,
27.
[269] A.G. Doroshkevich, 1988, Pis’ma Zh. Eksp. Teor. Fiz. 14, 296; F.
Atrio-Barandela, A.G. Doroshkevich and A.A. Klypin, 1991, Ap. J. 378, 1.
[270] A.A. Starobinsky, 1988, Pis’ma Zh. Eksp. Teor. Fiz. 14, 394.
[271] W. Hu and N. Sugiyama, 1995, Phys. Rev. D 51, 2599.
[272] P. Schneider, J. Ehlers and E.E. Falco, 1992, Gravitational Lenses
(Springer Verlag, New York).
[273] N. Kaiser, 1992, Ap. J. 388, 272; 1992, in: New Insights of the Universe,
Proc. Valencia Summer School, Sept. 1991.
[274] A. Blanchard and J. Schneider, 1987, Astron. Ap. 184, 1.
[275] S. Cole and G. Efstathiou, 1989, M.N.R.A.S. 239, 195.
[276] K. Tomita and K. Watanabe, 1989, Prog. Theor. Phys. 82, 563.
[277] M. Sasaki, 1989, M.N.R.A.S. 240, 415.
[278] V.E. Linder, 1990, M.N.R.A.S. 243, 362.
[279] L. Cayon, E. Martinez-Gonzalez and J.L. Sanz, 1993, Ap. J. 403, 471;
413, 10.
[280] U. Seljak, 1996, Ap. J. 463, 1.
[281] M.B. Hindmarsh and T.W.B. Kibble, 1994, Cosmic Strings, SUSX-TP-
94-74, IMPERIAL /TP/94-95/5, hep-ph/9411342.
[282] D.P. Bennett, A. Stebbins and F.R. Bouchet, 1992, Ap. J. Lett. 399, 5.
[283] B. Allen, R.R. Caldwell, E.P.S. Shellard, A. Stebbins and S.
Veeraraghavan, 1996, Phys. Rev. Lett. 77, 3061.
[284] D.P. Bennett and S.H. Rhie, 1993, Ap. J. Lett. 406, 7.
[285] Ue-Li Pen, D.N. Spergel and N. Turok, 1994, Phys. Rev. D 49, 692.
[286] D. Coulson, P. Ferreira, P. Graham and N. Turok, 1994, Nature 368,
27.
[287] Ue-Li Pen and D.N. Spergel, 1994, Phys. Rev. D, in press.
[288] M. Tegmark and E.F. Bunn 1995, Ap. J. 455, 1.
[289] M. Tegmark 1996, M.N.R.A.S. in press.
[290] E.F. Bunn, D. Scott and M. White, 1995, Ap. J. Lett. 441, L9; M. White
and E.F. Bunn, 1996, Ap. J. 460 1071.
[291] R. Stompor, K.M. Gorski and A.J. Banday, 1995 M.N.R.A.S. 277, 1225.
[292] K.M. Gorski, B. Ratra, N. Sugiyama and A.J. Banday, 1995, Ap. J. Lett.
446, 67.
[293] S.D.M. White, G. Efstathiou and C.S. Frenk, 1995, M.N.R.A.S. 292,
371.
[294] V.R. Eke, S. Cole and C.S. Frenk, 1996, M.N.R.A.S. 282, 263.
[295] M. Strauss and J. Willick, 1995, Phys. Rep. 261, 271.
[296] J.A. Willick et al., 1996, Ap. J. 457, 460.
[297] S. Zaroubi et al., 1996, preprint, astro-ph/9603068.
[298] T. Kolatt and A. Dekel, 1996, preprint, astro-ph/9512132.
[299] S. Zaroubi, I. Zehavi, A. Dekel, Y. Hoffmann and T. Kolatt, 1996,
preprint, astro-ph/9610226.
[300] E. Bertschinger, 1996, in: Cosmology and Large Scale Structure, Les
Houches Session LX, August 1993, ed. R. Schaeffer (Elsevier Science Press).
[301] E. Bertschinger, P. Bode, J.R. Bond, D. Coulson, R. Crittenden, S. Do-
delson, G. Efstathiou, K. Gorski, W. Hu, L. Knox, Y. Lithwick, D. Scott,
U. Seljak, A. Stebbins, P. Steinhardt, R. Stompor, T. Souradeep, N.
Sugiyama, N. Turok, N. Vittorio, M. White and M. Zaldarriaga, 1995, ITP
workshop on Cosmic Radiation Backgrounds and the Formation of Galaxies,
Santa Barbara [COMBA].
[302] W. Hu, D. Scott, N. Sugiyama and M. White, 1995, Phys. Rev. D 52,
5498.
[303] L. Abbott and R.K. Schaeffer, 1985, Ap. J. 308, 546.
[304] M. White and D. Scott, 1996, Ap. J. 459, 415.
[305] J.R. Bond and T. Souradeep, 1996, preprint.
[306] U. Seljak and M. Zaldarriaga, 1996, Ap. J. 469, 437.
