Symmetries and invariances in classical physics
Katherine Brading∗ and Elena Castellani†
December 17, 2005
Abstract
Symmetry, intended as invariance with respect to a transformation
(more precisely, with respect to a transformation group), has acquired
more and more importance in modern physics. This Chapter explores
in 8 Sections the meaning, application and interpretation of symmetry
in classical physics. This is done both in general, and with attention to
specific topics. The general topics include illustration of the distinctions
between symmetries of objects and of laws, and between symmetry principles and symmetry arguments (such as Curie’s principle), and reviewing
the meaning and various types of symmetry that may be found in classical
physics, along with different interpretative strategies that may be adopted.
Specific topics discussed include the historical path by which group theory
entered classical physics, transformation theory in classical mechanics, the
relativity principle in Einstein’s Special Theory of Relativity, general covariance in his General Theory of Relativity, and Noether’s theorems. In
bringing these diverse materials together in a single Chapter, we display
the pervasive and powerful influence of symmetry in classical physics, and
offer a possible framework for the further philosophical investigation of
this topic.
1 Introduction
The term ‘symmetry’ comes with a variety of ancient connotations, including
beauty, harmony, correspondence between parts, balance, equality, proportion,
and regularity. These senses of the term are clearly related to one another;
the concept of symmetry used in modern physics arose out of this family of
ideas. We are familiar with the approximate symmetries of physical objects that
∗ Department of Philosophy, 100 Malloy Hall, University of Notre Dame, Notre Dame, Indiana 46556.
† Department of Philosophy, University of Florence, via Bolognese 52, 50139, Firenze, Italy.
we find around us – the bilateral symmetry of the human face, the rotational
symmetry of a snowflake turned through 60°, and so forth. We may define a
symmetry of a given geometric figure as the invariance of that figure when equal
component parts are exchanged under a specified operation (such as rotation).
The development of the algebraic concept of a group, in the nineteenth century,
allowed a generalization and refinement of this idea; a precise mathematical
notion of symmetry emerged which was applicable not just to physical objects
and geometrical figures, but also to mathematical equations – and thus, to what
is of particular interest to us, the laws of physics expressed as mathematical
equations. The group theoretical notion of symmetry is the notion of invariance
under a specified group of transformations. ‘Invariance’ is a mathematical term:
something is invariant when it is left unaltered by a given transformation. This
mathematical notion is used to express the notion of physical symmetry that
we are interested in, i.e. invariance under a group of transformations. This is
the concept of symmetry that has proved so successful in modern science, and
the one that will concern us in what follows.
We begin in Section 2 with the distinction between symmetries of objects
and of laws, and that between symmetry principles and symmetry arguments.
This section includes a discussion of Curie’s principle. Section 3 discusses the
important connection between symmetries, as studied in physics, and the mathematical techniques of group theory. We offer a brief history of how group theoretical techniques came to be of such central importance in twentieth century
physics. With these considerations in mind, Section 4 offers an account of what
is meant by symmetry in physics, and a taxonomy of the different types of symmetry that are found within physics. In Section 5 we discuss some applications
of symmetries in classical physics, beginning with transformation theory in classical mechanics, and then turning to Einstein’s Special and General Theories of
Relativity (see Section 6). We focus on the roles and meaning of symmetries
in these theories, and this leads into the discussion of Noether’s theorems in
Section 7. Finally, in Section 8, we offer some concluding remarks concerning
the place, role and interpretation of symmetries in classical physics.
2 Symmetries of objects and of laws
That we must distinguish between symmetries of objects versus symmetries of
laws can be seen as follows. It is one thing to ask about the geometric symmetries
of certain objects – such as the 60° rotational symmetry of a snowflake and
the approximate bilateral symmetry of the human face mentioned above – and
the asymmetries of objects – such as the failure of a chair to be rotationally
symmetric. It is another thing to ask about the symmetries of the laws governing
the time-evolution of those objects: we can apply the laws of mechanics to the
evolution of our chair, considered as an isolated system, and these laws are
rotationally invariant (they do not pick out a preferred orientation in space) even
though the chair itself is not. Re-phrasing the same point, we should distinguish
between symmetries of states or solutions, versus symmetries of laws. Having
distinguished these two types of symmetry we can, of course, go on to ask about
the relationship between them: see, for example, current discussions of Curie’s
principle, referred to in Section ??, below.
2.1 Symmetry principles and symmetry arguments
It is also important to distinguish between symmetry principles and symmetry
arguments. The application of symmetry principles to laws was of central importance to physics in the twentieth century, as we shall see below in the context
of Einstein’s Special and General Theories of Relativity. Requiring that the laws
– whatever their precise form might be – satisfy certain symmetry properties,
became a central methodological tool of theoretical physicists in the process of
arriving at the detailed form of various laws.
Symmetry arguments, on the other hand, involve drawing specific consequences with regard to particular phenomena on the basis of their symmetry
properties. This type of use of symmetry has a long history; examples include
Anaximander’s argument for the immobility of the Earth, Archimedes’s equilibrium law for the balance, and the case of Buridan’s ass.1 In each case the
associated argument can be understood as an example of the application of the
Leibnizean Principle of Sufficient Reason (PSR): if there is no sufficient reason
for one thing to happen instead of another, then nothing happens (i.e. the initial situation does not change). There is something more that the above cases
have in common: in each of them PSR is applied on the grounds that the initial
situation has a certain symmetry.2 The symmetry of the initial situation implies
the complete equivalence between the offered alternatives. If the alternatives
are completely equivalent, then there is no sufficient reason for choosing between
them and the initial situation remains unchanged. Arguments of this kind most
frequently take the following form: a situation with a certain symmetry evolves
in such a way that, in the absence of an asymmetric cause, the initial symmetry
is preserved. In other words, a breaking of the initial symmetry cannot happen
without a reason: an asymmetry cannot originate spontaneously. This style of
argumentation is also to be found in recent discussions of ‘Curie’s principle’, the
principle to which we now turn.
2.2 Curie’s principle
Pierre Curie (1859-1906) was led to reflect on the question of the relationship
between physical properties and symmetry properties of a physical system by
his studies on the thermal, electric and magnetic properties of crystals, since
these properties were directly related to the structure, and hence the symmetry,
of the crystals studied. More precisely, the question he addressed was the following: in a given physical medium (for example, a crystalline medium) having
specified symmetry properties, which physical phenomena (for example, which
1 For a discussion of these examples, see Brading and Castellani (2003, ch. 1, Section 2.2).
2 In the first case rotational symmetry, in the second and third bilateral symmetry.
electric and magnetic phenomena) are allowed to happen? His conclusions, systematically presented in his 1894 work ‘Sur la symétrie dans les phénomènes
physiques’, can be summarized as follows:3
(a1) When certain causes produce certain effects, the symmetry elements of the causes must
be found in their effects.
(a2) When certain effects show a certain dissymmetry, this dissymmetry must be found in
the causes which gave rise to them.4
(a3) In practice, the converses of these two propositions are not true, i.e., the effects can be
more symmetric than their causes.
(b) A phenomenon may exist in a medium having the same characteristic symmetry or
the symmetry of a subgroup of its characteristic symmetry. In other words, certain
elements of symmetry can coexist with certain phenomena, but they are not necessary.
What is necessary, is that certain elements of symmetry do not exist. Dissymmetry is
what creates the phenomenon.
Conclusion (a1) is what is usually called Curie’s principle in the literature.
Conclusion (a2) is logically equivalent to (a1); the claim is that symmetries
are necessarily transferred from cause to effect, while dissymmetries are not.
Conclusion (a3) clarifies this claim, emphasizing that since dissymmetries need
not be transferred from cause to effect, the effect may be more symmetric than
the cause.5 Conclusion (b) invokes a distinction found in all of Curie’s examples,
between the ‘medium’ and the ‘phenomena’. We have a medium with known
symmetry properties, and Curie’s principle concerns the relationship between
the phenomena that can occur in the medium and the symmetry properties –
or rather, ‘dissymmetry’ properties – of the medium. Conclusion (b) shows that
Curie recognized the important function played by the concept of dissymmetry
– of broken symmetries in current terminology – in physics.
In order for Curie’s principle to be applicable, various conditions need to be
satisfied: the cause and effect must be well-defined, the causal connection between them must hold good, and the symmetries of both the cause and the effect
must also be well-defined (this involves both the physical and the geometrical
properties of the physical systems considered). Curie’s principle then furnishes
a necessary condition for given phenomena to happen: only those phenomena
can happen that are compatible with the symmetry conditions stated by the
principle. Curie’s principle has thus an important methodological function: on
3 For an English translation of Curie’s paper, see Curie (1981); some aspects of the translation are misleading.
4 Curie uses the term dissymmetry in his paper, as was current in his time. The sense is the
same as that of symmetry breaking in modern terminology, which is today often identified with
the sense of asymmetry. To be more precise one should distinguish between the result of a
symmetry-breaking process (broken symmetry), the absence of one of the possible symmetries
compatible with the situation considered (dissymmetry, as it was called in the nineteenth
century literature, notably by Louis Pasteur in his works on molecular dissymmetry), and the
absence of all the possible symmetries compatible with the situation considered (asymmetry).
5 Note that for some authors conclusion (b) is a principle on its own. Radicati (1987) goes
further, describing conclusions (a1), (a2) and (b) as three different principles: Curie’s first,
second and third principle, respectively.
the one hand, it furnishes a kind of selection rule (given an initial situation
with a specified symmetry, only certain phenomena are allowed to happen); on
the other hand, it offers a falsification criterion for physical theories (a violation of Curie’s principle may indicate that something is wrong in the physical
description).
Such applications of Curie’s principle depend, of course, on our accepting its
truth, and this is something that has been questioned in the literature, especially in relation to spontaneous symmetry breaking. Different proposals have
been offered for justifying the principle. Curie himself seems to have regarded
it as a form of causality principle, and the question in the recent literature has
been whether the principle can be demonstrated from premises that include a
definition of “cause” and “effect”. In this direction it has become current of
late to understand the principle as following from the invariance properties of
deterministic physical laws. The seminal paper for this approach is Chalmers
(1970), which introduces the formulation of Curie’s principle in terms of the
relationship between the symmetries of earlier and later states of a system, and
the laws connecting these states. This “received view” can be criticized for offering a reformulation that is significantly different from Curie’s intentions (so
that the label ‘Curie’s principle’ is a misnomer), and for resting on an assumption that may undermine the interest and importance of the view, as we discuss
in the following brief remarks.6
The received view, by concerning itself with temporally ordered cause and
effect pairs (or states of systems), offers a diachronic or dynamic analysis. In
fact, Curie himself focusses on synchronic or static situations, concerning the
compatibility of different phenomena occurring at the same time, rather than
the evolution of one state of a system into another state. In other words, the
‘cause–effect’ terminology used by Curie is not intended to indicate a temporal
ordering of phenomena being considered. This is clear from his examples, and
also from the fact that discussion of the laws – so central to the diachronic
version – is absent from Curie’s own analysis. That the diachronic version has
come to have the label ‘Curie’s principle’ therefore misrepresents Curie’s original
principle and his discussion of that principle.
Is the diachronic version interesting and important, nevertheless? The account can be understood as an application of PSR in which we pay careful
attention to whether the laws provide a “sufficient reason” for a symmetry to
be broken as a system evolves from its initial to final state by means of those
laws. The reformulation of the diachronic version by Earman (2004) has the
strong merit of being precise, and thereby enabling a proof that if the initial
state possesses a given symmetry, and the laws deterministically preserve that
symmetry, then the final state will also possess that symmetry. However, things
are not so simple as they might seem, because the proof takes as its starting point a state with a
given symmetry. Specifying the symmetries of a state requires, in general, recourse to a background structure – such as space or spacetime, or the space of
6 For detailed discussion see Brading and Castellani (2006). The “received view” that we
attribute first to Chalmers is developed in Ismael (1997) and Earman (2004). See also Earman
(this vol., ch. 15, Section 2.3).
solutions. In some cases, the required structure may seem trivial or minimal,
but nevertheless the dynamics of the system will not be independent of this
structure (consider the examples of the spatial or spatiotemporal structure or,
more strongly still, the space of solutions). This has the consequence that, in
general, the structures on which the symmetries of a state and the symmetries of
the dynamics depend are not independent of one another, and any appearance
to the contrary in the “proof” needs to be handled with caution. Indeed, we
think that answering the question of whether the diachronic version is interesting and important depends in part upon investigating this lack of independence
and the role it plays in the proof, something which has yet to be provided in
the literature on the diachronic version of ‘Curie’s principle’.
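The core step of the diachronic argument can be made explicit in a short schematic derivation. The notation used here ($S$ for the state space, $U_t$ for the deterministic evolution map, $g$ for the symmetry transformation) is introduced purely for illustration and is not taken from Earman (2004). It is a sketch under the assumption that both $g$ and $U_t$ act on one and the same space $S$, which is precisely the shared background structure that the remarks above ask us to handle with caution.

```latex
% Illustrative notation (an assumption of this sketch, not the source's):
%   S    : space of states,   s_0 \in S : initial state
%   U_t  : S -> S, deterministic evolution over a time t
%   g    : S -> S, a transformation acting on states
\begin{align*}
  &\text{Symmetry of the law:}           && g \circ U_t = U_t \circ g \quad \text{for all } t,\\
  &\text{Symmetry of the initial state:} && g\, s_0 = s_0,\\
  &\text{Hence, for } s_t := U_t\, s_0:  && g\, s_t = g\, U_t\, s_0 = U_t\, g\, s_0 = U_t\, s_0 = s_t .
\end{align*}
```

The derivation is only as informative as the two premises are independent: both presuppose that $g$ is already defined on $S$, so the symmetry of the state and the symmetry of the dynamics are specified against the same structure.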
Both Curie’s original version of his principle, and the diachronic version, begin with the symmetries of states of physical systems. In contemporary physics,
focus has shifted to symmetries of laws, and the significant connection between
symmetries of physical systems and symmetries of laws has to do not with
symmetries of states of those systems, but with symmetries of ensembles of
solutions.7 The symmetries of a dynamical equation are not, in general, the
symmetries of the individual solutions (let alone states), but rather the symmetries of the whole set of solutions, in the sense that a symmetry of a dynamical
equation transforms a given solution into another solution. Considering this relationship between laws and solutions leads to an alternative version of Curie’s
principle, which we propose here.8 As with the diachronic version of Curie’s
principle, our proposal departs from Curie’s original proposal, but our contention is that it remains true to the main motivation behind Curie’s original
investigation. In this version we seek to unite two things:
1. We understand Curie’s motivating question to be ‘which phenomena are
physically possible?’, and his suggestion to be that we can use symmetries
as a guide towards answering this question; and
2. We go beyond Curie in making use of symmetries of laws, something
about which he said nothing, but which has become a central concern in
contemporary physics.
Combining these two ingredients, a “modern” version of Curie’s principle
would then simply state that the symmetries of the law (equation) are to be
found in the ensemble of its solutions. This version expresses Curie’s basic
idea – that “symmetry does not get lost (without a reason)” – in virtue of the
fact that the symmetry of the law is to be found in the ensemble of solutions.
The fact that this is how we define the relationship between symmetries and
laws does not render it empty of significance with respect to Curie’s motivating
7 By “solution” here we mean a temporally extended history of a system, the “state” of a
system being a “solution at an instant”.
8 Notice that this version does not involve the temporal evolution from cause to effect (as in
the diachronic version), nor is it restricted to a state of a system at a given instant or during
a certain temporal period (as in the synchronic version); rather, it concerns the structure of an
ensemble of solutions, considered as a whole.
question. On the contrary, the point is that we can use the symmetries of the
law as a guide to finding solutions, i.e. to determining which phenomena are
physically possible, when not all the solutions are known. We can ask, following
Curie, ‘What phenomena are possible?’, and we can use the connection between
the symmetries of the law and the symmetries of the ensemble of solutions as a
guide to finding the physically possible phenomena. Thus, what is on the one
hand a definitional statement (that the symmetries of the law (equation) are
to be found in the ensemble of the solutions) comes on the other hand to have
epistemic bite when we don’t know all the solutions. This, we believe, is true
to Curie’s motivating question, as expressed in item (1), above.
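The claim that a symmetry of a law maps solutions to solutions, without leaving each individual solution invariant, can be illustrated numerically. The following is a minimal sketch, not drawn from the text: it assumes the harmonic-oscillator equation x'' + x = 0 as a stand-in "law", and the helper names (`solution`, `residual`, `is_solution`, and the three transformations) are hypothetical. Time translation and reflection are symmetries of this equation; rescaling time is included as a contrast case that is not.

```python
import math

def solution(A, B):
    """A solution x(t) = A cos t + B sin t of the assumed law x'' + x = 0."""
    return lambda t: A * math.cos(t) + B * math.sin(t)

def residual(x, t, h=1e-3):
    """Central-difference estimate of x''(t) + x(t); close to zero for solutions."""
    return (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2 + x(t)

def is_solution(x, times=(0.0, 0.7, 1.9, 3.2), tol=1e-5):
    """Numerically test the law at a few sample times."""
    return all(abs(residual(x, t)) < tol for t in times)

def time_translate(x, tau):
    """t -> t + tau: a symmetry of the law."""
    return lambda t: x(t + tau)

def reflect(x):
    """x -> -x: a symmetry of the law."""
    return lambda t: -x(t)

def time_rescale(x, k):
    """t -> k t: not a symmetry of this law for k != 1 (contrast case)."""
    return lambda t: x(k * t)

x = solution(1.0, 0.0)             # x(t) = cos t
shifted = time_translate(x, 0.5)   # a different member of the ensemble
flipped = reflect(x)
rescaled = time_rescale(x, 2.0)

print(is_solution(x), is_solution(shifted), is_solution(flipped))  # True True True
print(is_solution(rescaled))                                       # False
print(abs(shifted(0.0) - x(0.0)) > 0.1)                            # True
```

The last line shows that the translated solution differs from the original one: the symmetry belongs to the ensemble of solutions, not to each solution separately.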
3 Symmetry and group theory: early history
Group theory is the powerful mathematical tool by means of which the symmetry properties of theories are studied. In this section, we begin with the
definition of a group, and outline the origins of this notion in the mathematics
of algebraic equations. We then turn our attention to the manner in which
group theory was applied first to geometry and then to physics in the course of
the nineteenth century.
3.1 The introduction of the group concept and the first developments of group theory
A group is a family $G$ of elements $g_1, g_2, g_3, \ldots$ for which there is defined a multiplication that assigns to every two elements $g_i$ and $g_j$ of the group a third element
(their product) $g_k = g_i g_j \in G$, in such a way that the following requirements
hold:9
• $(g_i g_j) g_k = g_i (g_j g_k)$ for all $g_i, g_j, g_k \in G$ (associativity of the product);
• there exists an identity element $e \in G$ such that $g_i e = e g_i = g_i$ for all $g_i \in G$;
• for all $g_i \in G$ there exists an inverse element $g_i^{-1} \in G$ such that $g_i g_i^{-1} = g_i^{-1} g_i = e$.
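For a finite set, the requirements listed above can be checked by brute force. The following is a minimal sketch under stated assumptions: permutations are represented as tuples, and the helper names `compose` and `is_group` are hypothetical, introduced only for this illustration. It tests the permutations of three objects (the kind of group Galois worked with) and a subset of them that fails closure.

```python
from itertools import permutations, product

def compose(p, q):
    """Product of permutations: apply q first, then p (tuples map index -> image)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_group(elements, op):
    """Brute-force check of closure, associativity, identity and inverses."""
    elems = set(elements)
    # Closure: the product of any two elements is again in the set.
    if any(op(a, b) not in elems for a, b in product(elems, repeat=2)):
        return False
    # Associativity: (a b) c == a (b c) for all triples.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elems, repeat=3)):
        return False
    # Identity: some e with a e == e a == a for every a.
    identities = [e for e in elems
                  if all(op(a, e) == a and op(e, a) == a for a in elems)]
    if not identities:
        return False
    e = identities[0]
    # Inverses: every a has some b with a b == b a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elems)
               for a in elems)

S3 = list(permutations(range(3)))                # all 6 permutations of 3 objects
not_closed = [(0, 1, 2), (1, 0, 2), (0, 2, 1)]   # identity plus two transpositions

print(is_group(S3, compose))          # True
print(is_group(not_closed, compose))  # False: the two transpositions compose to a 3-cycle
```

The same checker accepts any finite set with a binary operation, for example `is_group(range(6), lambda a, b: (a + b) % 6)` for the six rotations of a snowflake represented as multiples of 60°.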
The concept of a group was introduced by Évariste Galois in the short time
he was able to contribute to mathematics (born in 1811, he died as a result of
a duel in 1832) in connection with the question of the resolution of equations
by radicals.10 The resolvent formulas for cubic and quartic equations were
9 The concept of a group can be weakened by relaxing these conditions (for example, dropping the inverse requirement leads to the concept of a monoid, and retaining only associativity
leads to the concept of a semigroup). The question that then arises is whether the full group
structure, or some weaker structure, is related to the symmetry properties of a given theory.
10 That is, in terms of a finite number of algebraic operations – addition, subtraction, multiplication, division, raising to a power and extracting roots – on the coefficients of the equations
found by the mathematicians of the Renaissance,11 while the existence of a
formula for solving the general equations of the fifth and higher degrees by
radicals remained an open question for a long time, stimulating developments
in algebra. In particular, the studies of the second half of the eighteenth century
focussed on the role played in the solution of equations by functions invariant
under permutation of the roots, so giving rise to the theory of permutations.
In J. L. Lagrange’s Réflexions sur la résolution algébrique des équations, the
most influential text on the subject, some fundamental results of permutation
theory were obtained.12 Lagrange’s text served as a basis for successive algebraic
developments, from P. Ruffini’s first proof in 1799 of the impossibility of solving
nth degree equations in radicals for n ≥ 5, to the seminal works by A. L. Cauchy,
N. H. Abel, and finally Galois.13
Galois’s works14 marked a turning point, providing answers to the open
questions in the solutions of equations by using new methods and algebraic
notions, first of all the notion of a group. This notion was introduced by Galois
in relation to the properties of the set of permutations of the roots of equations
(the permutations constituting what he named a ‘group’), together with other
basic notions of group theory such as “subgroup”, “normal subgroup”, and
“simple group”.15 By characterizing an equation in terms of its “degree of
symmetry”, determined by the permutation group of the roots preserving their
algebraic relations (later known as the Galois group of the equation), Galois
could transform the problem of the resolution of equations into that of studying
the properties of the permutation groups involved. In this way he obtained,
among other things, the necessary and sufficient conditions for solving equations
by radicals.
Galois’s achievements in group theory, first brought to publication by Joseph
Liouville in 1846, were collected and expanded in Camille Jordan’s 1870 Traité
des substitutions et des équations algébriques. Jordan’s Treatise, the first systematic textbook on group theory, had a decisive influence on the application
of this new theory, including its application to other domains of mathematical
science, such as geometry and mathematical physics.
11 The resolvent formula for a quadratic equation was known since Babylonian times. A
historical survey on the question of the existence of resolvent formulas for algebraic equations
is in Yaglom (1988, pp. 3 f.).
12 Among other results, the so-called Lagrange’s theorem which states – in modern terminology – that the order of a subgroup of a finite group is a divisor of the order of the group.
13 Cauchy (1789–1857) generalized Ruffini’s results in 1815; Abel (1802–1829) published in
1824 a proof of the impossibility of solving the quintic equation by radicals and in 1826 the
paper Démonstrations de l’impossibilité de la résolution algébrique des équations générales
qui passent le quatrième degré.
14 A few “mémoires” submitted to the Académie des Sciences, three brief papers published
in 1830 in the Académie’s ‘Bulletin’, and some letters, among which is the last one written to
his friend Auguste Chevalier in the night before the fatal duel.
15 See Yaglom (1988, pp. 9 f.), for details.
3.2 Applications of group theory: the contributions of Klein and Lie
3.2.1 Projective geometry, the theory of invariants and group theory: Klein and Lie’s starting point
In the same year as the appearance of Jordan’s Treatise, Sophus Lie and Felix
Klein, two young mathematicians who were to become the key figures in extending the domain of application of group theory, moved for a period from Berlin
to Paris to enter into contact with the French school of mathematics. Lie and
Klein had just written a joint paper investigating the properties of some curves
in terms of the groups of projective transformations leaving them invariant. In
fact they were drawn to Paris mostly by their interest in projective geometry, the
science founded by J. V. Poncelet to study the properties of figures preserved
under central projections. Projective geometry had become, at the time, a particularly fruitful research field for the combination of algebraic and geometrical
methods based on the notion of invariance. The theory of invariants was itself
a flourishing branch of mathematics, centered on the systematic study of the
invariants of “algebraic forms”. Using the theory of invariants, the English algebraist Arthur Cayley16 had recently clarified the relationship between Euclidean
and projective geometry, showing the former to be a special case of the latter.
Before leaving for France, Klein had tried to extend Cayley’s results, based on
the possibility of defining a distance (a “metric”) in terms of a quadratic form
defined on the projective space, to the case of non-Euclidean geometries.
While in Paris Lie and Klein became acquainted not only with Jordan, but
also with the expert in differential geometry Gaston Darboux, who stimulated
their interest in the relations between differential geometry and projective geometry.
3.2.2 Klein’s Erlangen Program
The question of the relations between the different contemporary geometrical
systems particularly interested Klein. He aimed at obtaining a unifying foundational principle for the various branches into which geometry had apparently
recently separated. In this respect, he fruitfully combined (a) the application of
the theory of invariants to the study of geometrical properties, with (b) his and
Lie’s idea of applying algebraic group theory to treating also geometrical transformations. The new group theoretical conception of a geometrical theory which
resulted was announced in his famous Erlangen Program, as it became known
following the inaugural lecture entitled ‘Comparative Considerations on Recent
Geometrical Research’17 that the 23-year-old Klein delivered in 1872 on joining
the staff of the University of Erlangen as a professor. Guided by the
idea that geometry is in the end a unity, Klein’s solution to the problem posed
16 Cayley was one of the three members of the ‘invariant trio’, as the French mathematician Hermite dubbed them, the other two being James Sylvester, inventor of most of the
terminology of the theory including the word ‘invariant’, and George Salmon.
17 ‘Vergleichende Betrachtungen über neuere geometrische Forschungen’.
by the existence of different geometries was to propose a general characterization
of a geometrical theory by using the notion of invariance under a transformation
group (i.e., the notion of symmetry). According to his characterization a geometry is defined, with respect to a given domain (the plane, the space, or a given
“manifold”) and a group of transformations acting on it, as the science studying
the invariants under the transformations of the group. Each specific geometry
is thus determined by the characterizing symmetry group (for example, planar
Euclidean geometry is determined by the group of rigid motions (isometries) acting
on the plane), and the interrelations between geometries can be described by the
relations between the corresponding groups (for example, the equivalence of two
geometries amounts to the isomorphism between the corresponding groups).
With Klein’s definition of a geometry, geometrical and symmetrical properties become very close: the symmetry of a figure, which is defined in a given
“space”18 the “geometrical” properties of which are preserved by the transformations of a group G, is determined by the subgroup of G leaving the figure
invariant. The new group theoretical techniques prompted a transition from an
inductive approach (familiar from the nineteenth century classifications of crystalline forms in terms of their visible – and striking – symmetry properties)19
to a more abstract and deductive approach. This is the procedure formulated
in Weyl’s classic book on symmetry (Symmetry, 1952) as follows:
Whenever you have to do with a structure-endowed entity Σ try to
determine its group of automorphisms, the group of those elementwise transformations which leave all structural relations undisturbed.
[...] After that you may start to investigate symmetric configurations
of elements, i.e. configurations which are invariant under a certain
subgroup of the group of all automorphisms (Weyl, 1952, p. 144,
emphasis in original).
In this way, the symmetry classifications could be extended to figures in “spaces”
different from the plane and space of common experience.
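Weyl's two-step procedure (first find the automorphism group of the structure, then find the subgroup leaving a given configuration invariant) can be sketched on a toy example. The choice of a square with labelled vertices as the "structure-endowed entity", and the names `is_automorphism` and `symmetry_group`, are assumptions made for this illustration only.

```python
from itertools import permutations

N = 4  # vertices of a square, labelled 0..3 in cyclic order (an assumed toy "space")
EDGES = {frozenset((i, (i + 1) % N)) for i in range(N)}  # its structural relations

def is_automorphism(p):
    """True if the vertex permutation p maps the set of edges onto itself."""
    return {frozenset(p[v] for v in edge) for edge in EDGES} == EDGES

# Step 1: the group of automorphisms of the structure.
AUTOMORPHISMS = [p for p in permutations(range(N)) if is_automorphism(p)]

def symmetry_group(figure, group=AUTOMORPHISMS):
    """Step 2: the subgroup of `group` leaving the figure (a set of vertices) invariant."""
    target = set(figure)
    return [p for p in group if {p[v] for v in target} == target]

print(len(AUTOMORPHISMS))            # 8: four rotations and four reflections
print(len(symmetry_group({0, 2})))   # 4: symmetries of a diagonal
print(len(symmetry_group({0, 1})))   # 2: symmetries of a single side
print(len(symmetry_group({0})))      # 2: symmetries of a single corner
```

Less symmetric figures are picked out by smaller subgroups of the same automorphism group, which is the sense in which the symmetry of a figure is determined by the subgroup of G leaving it invariant.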
Klein himself contributed to the classification of symmetry groups of figures
with his works on discrete groups; in particular, he studied the transformation
groups related to the symmetry properties of regular polyhedra, which proved
to be useful in the solution of algebraic equations by radicals.
3.2.3 Lie’s theory of continuous groups
After 1872, while Klein was concerned especially with discrete transformation
groups, Lie devoted all his research work to building the theory of continuous
transformation groups, the results of which were systematically collected in his
three-volume Theorie der Transformationsgruppen (I: 1888, II: 1890, and III:
1893), written with the collaboration of F. Engel. Lie’s interest in continuous
groups arose in relation to the theory of differential equations, which he took
18 A set of points endowed with a structure.
19 A classic textbook in this respect is Shubnikov–Koptsik (1974). See also Section ??,
below.
to be ‘the most important discipline in modern mathematics’. By the time he
was in Paris, Lie had begun to study the theory of first-order partial differential
equations, a theory of particular interest because of the central role it played in
the formulation given by W. R. Hamilton and C. G. Jacobi to mechanics.20 His
project was to extend to the case of differential equations Galois’s method for
solving algebraic equations: that is, using the knowledge of the ‘Galois group’
of an equation (the symmetry group formed by the transformations taking solutions into solutions) so as to solve it or reduce it to a simpler equation. Thus
Lie’s guiding idea was that continuous transformation groups could, in the solution of differential equations, play a role analogous to that of the permutation
groups used by Galois in the case of algebraic equations.
Lie had already considered continuous groups of transformations in some
earlier geometrical works. In his studies with Klein on special kinds of curves
(called by them ‘W-curves’), he had examined transformations that were continuously related in the sense that they were all generated by repeating an
infinitesimal transformation.21 The relevance of infinitesimal transformations
to continuous groups of transformations was to become a central point in his
studies of contact transformations, so called because they preserved the contact
or tangency of surfaces. Lie had started to investigate contact transformations
in association with geometrical reciprocities implied in his “line-to-sphere mapping”, a mapping between a line geometry and a sphere geometry that he had
discovered while in Paris.22 When he turned to considering first-order partial
differential equations, he soon realized that they admitted contact transformations as symmetry transformations (i.e. transformations taking solutions into
solutions). Thus contact transformations could form the “Galois group” of first-order partial differential equations. This motivated him to develop the invariant
theory of contact transformations, which represented the first step of his general
theory of continuous groups.
Lie’s crucial result, allowing him to pursue his program, was the discovery
that to each continuous transformation group could be assigned what is today
called its Lie algebra. Lie showed that the infinitesimal generators of a continuous transformation group obey a linearized version of the group law, involving
the commutator bracket (or Lie bracket); this linearized law then represents the
structure of the algebra. In short (and in modern terminology): we describe the
elements (transformations) of a continuous group (now called a Lie group)23
as functions of a certain number $r$ of continuous parameters $a_l$ ($l = 1, 2, \ldots, r$).
And these group elements can be written in terms of a corresponding number
20 For details on classical mechanics we refer the reader to Butterfield (this vol., ch. 1) and the references therein.
21 See on this part Hawkins (2000, Section 1.2). According to Hawkins (p. 15), with the
works of Lie and Klein on W-curves ‘for the first time not only is a continuous group the
starting point for an investigation, but also for the first time in print we have the idea that
infinitesimal transformations are a characteristic and useful feature of continuous systems of
transformations’.
22 See Hawkins (2000, Section 1.4).
23 For a precise definition of this and other terms in this paragraph, see Butterfield (this
vol., ch. 1, Section 3) and Harvey (this vol., ch. 14).
$r$ of infinitesimal operators $X_l$, the generators of the group, which satisfy the
“multiplication law” represented by the Lie brackets
$$[X_s, X_t] = c^{q}_{st} X_q,$$
so forming what is called the Lie algebra of the group. The coefficients $c^{q}_{st}$ are
constants characterizing the structure of the group and are called the structure
constants of the group.24
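To make the bracket relation concrete, the following sketch checks it numerically for one familiar case, the rotation group SO(3). The choice of group, the matrix form of the generators, and all function names are assumptions made for illustration; they are not drawn from Lie's own presentation. For this group the structure constants are the components of the totally antisymmetric Levi-Civita symbol.

```python
from itertools import product

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """Lie bracket [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

# Assumed illustration: generators of rotations about the three axes,
# with matrix elements (X_q)_ij = -epsilon_qij.
GENERATORS = [[[-levi_civita(q, i, j) for j in range(3)] for i in range(3)]
              for q in range(3)]

def satisfies_lie_algebra(generators):
    """Check [X_s, X_t] = sum_q c^q_st X_q with c^q_st = epsilon_stq."""
    for s, t in product(range(3), repeat=2):
        lhs = commutator(generators[s], generators[t])
        rhs = [[sum(levi_civita(s, t, q) * generators[q][i][j] for q in range(3))
                for j in range(3)] for i in range(3)]
        if lhs != rhs:
            return False
    return True

print(satisfies_lie_algebra(GENERATORS))
```

The three matrices carry only integer entries, so the comparison is exact. The nine brackets are fixed by the handful of constants $c^{q}_{st}$, which is the sense in which the constants characterize the structure of the group.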
Thanks to this sort of result, the study and classification of continuous groups
could be conducted in terms of the corresponding Lie algebras. This proved to
be extremely fruitful in the successive developments, not only algebraic and
geometrical, but also physical. With regard first of all to the physics of Lie’s
time, Lie had arrived at the correspondence between continuous groups and
Lie algebras by reinterpreting, in the light of his program for solving differential equations, the results obtained by Poisson and especially Jacobi about the
integration of first-order partial differential equations arising in mechanics.25
His achievements were thus of great relevance to the solution of the dynamical
problems discussed by his contemporaries.
But it is in twentieth century physics, with the works of such figures as
Hermann Weyl, Emmy Noether and Eugene Wigner (just to recall the central figures who first contributed to the applications of Lie’s theory to modern
physics), that the theory of Lie groups and Lie algebras acquired a fundamental
role in the description of physical phenomena. Today, the applications of the
theory that originated from Lie’s works include the whole of theoretical physics,
of both the large and the small: classical and quantum mechanics, relativity
theories, quantum field theory, and string theory.26
4 What are symmetries in physics? Definitions and varieties
4.1 What is meant by ‘symmetry’ in physics
We can understand intuitively the generalization of the scientific notion of symmetry from physical or geometric objects to laws, as follows. We write down our
law as a mathematical equation, and appearing in this equation will be various
mathematical objects and operators. For a particular group of transformations,
these objects and operators transform according to rules that may be fixed either by the mathematical nature of the object or operator concerned, or (where
the mathematics does not fix the transformation rules) by our specification. If
the “form” of the equation is preserved when we transform each of the objects
24 For more details, see Butterfield (this vol., ch. 1, Sections 3.2 and 3.4).
25 For details see Hawkins (2000, Section 2.5).
26 In this volume, see especially ’t Hooft (this vol., ch. 7), Harvey (this vol., ch. 14) and Belot (this vol., ch. 2).
and operators appearing in our equation by any element of the group, then we
say that the group is a symmetry group of the equation.
More precisely, what we mean by the symmetry transformations of the laws
in physics can be formulated in either of the following ways, which are equivalent
in the sense that they pick out the same set of transformations:
(1) Transformations, applied to the independent and dependent variables of
the theory in question, that leave the form of the laws unchanged.
(2) Transformations that map solutions into solutions.
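Formulation (2) can be illustrated with a small numerical check. The sketch below assumes a simple law chosen only for illustration, the harmonic oscillator equation x'' + ω²x = 0; the frequency value, the tolerance, and all function names are hypothetical. Each function of t stands for an entire history of the system. A time translation maps a solution into a solution, whereas adding a constant to x does not, so the former is a symmetry of this law and the latter is not.

```python
import math

OMEGA = 2.0  # assumed angular frequency for the illustration

def residual(x, t, h=1e-4):
    """Value of x'' + OMEGA**2 * x at time t, using a central difference for x''."""
    second_derivative = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
    return second_derivative + OMEGA**2 * x(t)

def is_solution(x, times, tol=1e-4):
    """True if the history x satisfies the law at every sampled time."""
    return all(abs(residual(x, t)) < tol for t in times)

def solution(t):
    """One solution of the oscillator equation."""
    return math.cos(OMEGA * t)

def time_translated(t, tau=0.7):
    """The same history shifted in time by tau."""
    return solution(t - tau)

def space_shifted(t, a=0.5):
    """The same history displaced by a constant a."""
    return solution(t) + a

TIMES = [0.1 * k for k in range(50)]

print(is_solution(solution, TIMES))
print(is_solution(time_translated, TIMES))
print(is_solution(space_shifted, TIMES))
```

The displaced history fails because the restoring force singles out the origin; it would solve a different equation, x'' + ω²(x − a) = 0.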
Symmetry transformations may be viewed either actively or passively. From
the passive point of view we re-describe the same physical evolution in two different coordinate systems.27 That is, we transform the independent and dependent
variables, as in (1). If the description in the original set of coordinates is a solution of given equations, then the new description in the new set of coordinates
is a solution of the same equations. (If the transformation is not a symmetry
transformation, then the new description in the new coordinates will not, in
general, be a solution of the same equations, but rather of different equations.)
The mapping of one solution into another solution of the same equations, by
means of a symmetry transformation, leads to the active interpretation of such
transformations. On this interpretation, the two solutions are viewed as different
physical evolutions described in the same coordinate system. Thus, formulation
(2) lends itself naturally to an active interpretation.
The ‘form of the law’ in (1) means the functional form of the law, expressed
in terms of the independent and dependent variables. A transformation of those
variables will, in general, lead to an expression whose functional form differs from
that of the original expression ($x$ goes to $x^2$, for example). At this point it will be
helpful to say a few words about “invariance” and “covariance”. Let the reader
beware that there is no unanimity over how these terms are used in discussing
the laws of physics, especially in the philosophy of physics literature. Often,
the term ‘invariant’ is reserved for objects, and ‘covariant’ is used for equations
or laws. However, this is a product of a more fundamental distinction, which
when understood correctly allows for the application of the notion of invariance
to laws as well.
We think that the discussion of Ohanian and Ruffini (1994, Section 7.1) is
very useful, and that it nicely distils much of the best of what can be found in the
literature, both in physics and in philosophy of physics. The upshot is as follows.
We may say that an equation is covariant under a given transformation when
its form is left unchanged by that transformation. This is the notion at work in
formulation (1). In a way, it is rather weak: given an equation that is not covariant
under a given transformation, we can always re-write it so that it becomes
covariant. On the other hand, this re-writing may involve the introduction of
new functions of the variables, and it is the physical interpretation of these new
27 By ‘coordinates’ here we are referring to generalized coordinates; in general, one coordinate
for each degree of freedom of the system.
quantities that allows covariance to gain physical significance. We will have
more to say about this for the specific case of general covariance and Einstein’s
General Theory of Relativity in Section ?? below.
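A standard textbook illustration of this re-writing, offered here as a sketch and not as Ohanian and Ruffini's own example, is the law of motion of a free particle. In Cartesian coordinates $x^i$ the law keeps its form only under linear transformations of the coordinates. Re-written in arbitrary (time-independent) coordinates $q^a$, it keeps its form under all such coordinate transformations, at the price of introducing the new functions $\Gamma^{a}_{bc}$.

```latex
% Assumed example: a free particle, first in Cartesian coordinates x^i,
% then in arbitrary time-independent coordinates q^a.
\begin{align}
\frac{d^2 x^i}{dt^2} &= 0, \\
\frac{d^2 q^a}{dt^2}
  + \Gamma^{a}_{bc}(q)\, \frac{dq^b}{dt}\, \frac{dq^c}{dt} &= 0,
\qquad
\Gamma^{a}_{bc}
  = \frac{\partial q^a}{\partial x^i}\,
    \frac{\partial^2 x^i}{\partial q^b\, \partial q^c}.
\end{align}
```

The second line follows from the first by the chain rule. The covariant form is therefore always available; what carries physical weight is the interpretation given to the new functions.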
Invariance of an equation, as characterized by Ohanian and Ruffini, is a
stronger requirement than covariance. Not only should the form of the equation
remain the same, but so too should the values of any non-dynamical quantities,
including “constants” such as the speed of light. By “non-dynamical quantities”
we mean all those objects which appear in the equations yet which do not
themselves satisfy equations of motion. We here enter the muddy waters of
how to distinguish between “absolute” and “dynamical” objects, as discussed
by Anderson (1967).28
In both cases (covariance and invariance), the associated transformations –
when actively construed – take solutions into solutions. When using formulation
(2), it is important to be clear about what is meant by a solution. This does not
mean a solution-at-an-instant, i.e. an instantaneous state of a system; rather, it
means an entire history, i.e. possible time-evolution, of the system in question.29
4.2 Varieties of symmetry
Symmetries in physics come in a number of different varieties, distinguished
by such terms as ‘global’ and ‘local’; ‘internal’ and ‘external’; ‘continuous’ and
‘discrete’. In this Section we briefly review this terminology and the associated
distinctions.
The most familiar are the global spacetime symmetries, such as the Galilean
invariance of Newtonian mechanics, and the Lorentz invariance of the Special
Theory of Relativity. Global spacetime symmetries are intended to be valid
for all the laws of nature, for all the processes that unfold in the spacetime.
Symmetries with this universal character were labelled ‘geometric’ by Wigner
(see 1967, especially p. 17).
This universal character is not shared by some of the symmetries introduced
into physics during the twentieth century. Most of these were of an entirely new
kind, with no roots in the history of science, and in some cases expressly introduced to describe specific forms of interactions – whence the name ‘dynamical
symmetries’ due to Wigner (1967, see especially pp. 15, 17–18, 22–27, 33).
The various symmetries of modern physics can also be classified according
to a second distinction: that between global and local symmetries. The terms
‘global’ and ‘local’ are used in physics, and in philosophy of physics, with a
variety of meanings. The distinction intended here is between symmetries that
depend on constant parameters (global symmetries) and symmetries that depend on arbitrary smooth functions of space and time (local symmetries). While
Lorentz invariance is an example of a global symmetry, the gauge symmetry of
28 See also Section ??, below. One difficulty in tackling the literature on this issue is the
variety of uses and meanings attaching to the common terminology of covariance, principle of
covariance, invariance, absolute and dynamical objects, and so forth.
29 The distinction is important in, for example, our discussion of Curie’s principle, Section
?? above.
classical electromagnetism (an internal symmetry)30 and the diffeomorphism
invariance in General Relativity (a spacetime symmetry) are examples of local symmetries, since they are parameterized by arbitrary functions of space
and time.31 Recalling Wigner’s distinction, Lorentz invariance is a geometric
symmetry, applying to all interactions, whereas the gauge symmetry of electromagnetism concerns the electromagnetic interaction specifically and is therefore
a dynamical symmetry.
The gauge symmetry of classical electromagnetism is an internal symmetry
because the transformations of the vector potential occur in the internal space
of the field system, rather than in spacetime. The gauge symmetry of classical
electromagnetism can seem to be no more than a mathematical curiosity, specific
to this theory; but with the advent of quantum theory the use of internal degrees
of freedom, and the related internal symmetries, became fundamental.32
The translations, rotations and boosts of the inhomogeneous Lorentz group
are all examples of continuous symmetries, for which any finite symmetry transformation can be built up of infinitesimal symmetry transformations. In contrast
with the continuous symmetries we have the discrete symmetries of charge conjugation, parity, and time reversal (CPT), along with permutation invariance.
Thus, Newtonian mechanics and classical electrodynamics are invariant under
parity (left-right inversion) and under time reversal (roughly: the laws hold for
a sequence of states evolved in the backwards time direction just as they hold
for the states ordered in the forwards direction). Classical electrodynamics is
also invariant under charge conjugation, so long as we correctly implement the
associated transformations of the electric and magnetic fields. Finally, there
is a sense in which classical statistical mechanics is permutation invariant: the
particles postulated are identical to one another, and their permutation takes
a solution into a solution. However, the power and significance of the discrete
symmetries achieves its full force only in quantum theory, and for discussion of
this we refer the reader to Harvey (this vol., ch. 14).
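The claim that a finite continuous transformation can be built up from infinitesimal ones can be checked in a minimal setting. The sketch below assumes rotations of the plane as the example; the angle, the step counts, and the function names are hypothetical choices. It composes n copies of a near-identity transformation and compares the result with the finite rotation.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def infinitesimal_rotation(eps):
    """Identity plus eps times the rotation generator [[0, -1], [1, 0]]."""
    return [[1.0, -eps], [eps, 1.0]]

def built_from_infinitesimals(theta, n):
    """Compose n near-identity steps, each of angle theta / n."""
    step = infinitesimal_rotation(theta / n)
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul2(result, step)
    return result

def finite_rotation(theta):
    """Exact rotation of the plane through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def max_abs_difference(a, b):
    """Largest entrywise discrepancy between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

THETA = 1.0  # assumed rotation angle in radians
coarse_error = max_abs_difference(built_from_infinitesimals(THETA, 10),
                                  finite_rotation(THETA))
fine_error = max_abs_difference(built_from_infinitesimals(THETA, 10000),
                                finite_rotation(THETA))
PARITY = [[1.0, 0.0], [0.0, -1.0]]  # left-right inversion of the plane

print(coarse_error, fine_error)
print(determinant(built_from_infinitesimals(THETA, 10000)), determinant(PARITY))
```

The discrepancy shrinks as the steps become smaller. Every step has positive determinant, so no composition of them reaches the reflection, whose determinant is negative; this is one way to see why parity counts as a discrete symmetry.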
In Section ?? below, we discuss some of the interpretative issues associated
with these different varieties of symmetry in classical physics.
30 For more on gauge and internal symmetries, see the following paragraph.
31 We discuss the local symmetry of General Relativity further in Section ?? below. See also Belot (this vol., ch. 2).
32 For interpretative issues associated with gauge symmetry in classical electromagnetism,
see Belot (1998). Gauge symmetries came to prominence with the development of quantum
theory. For symmetries in quantum theory see Harvey (this vol., ch. 14). The term ‘gauge
symmetry’ itself stems from Weyl’s 1918 theory of gravitation and electromagnetism. For
discussions of all these aspects of gauge symmetry, see Brading and Castellani, 2003.
5 Some applications of symmetries in classical physics
5.1 Transformation theory in classical mechanics
As we have seen, Lie’s interest in continuous groups arose in relation to his
studies of the theory of first-order partial differential equations, which played
a central role in the formulation given by Hamilton and Jacobi to mechanics.
The transformation theory of mechanics based on this formulation is indeed one
of the first examples of a systematic exploitation in physics of the invariance
properties of dynamical equations. These symmetries are exploited according to
the following strategy: the integration of the equations of motion is simplified by
transforming – by means of symmetry transformations – the original dynamical
system into another system with fewer degrees of freedom.
Historically, the road to the possibility of applying the above ‘transformation
strategy’ to solving dynamical problems was opened by the works of J. L. Lagrange and L. Euler. The Euler-Lagrange analytical formulation of mechanics,
grounded in the seminal Mécanique Analytique (1788) of Lagrange, expressed
the laws of motion in a form which was covariant (cf. Section ??) under all
coordinate transformations. This meant one could more easily choose coordinates to suit the dynamical problem concerned. In particular, one hoped to find
a coordinate system containing “cyclic” (a.k.a. “ignorable”) coordinates. The
presence of ignorable coordinates amounts to a partial integration of the equations: if all the coordinates are ignorable, the problem is completely and trivially
solved. The method was thus to try to find (by applying coordinate transformations leaving the dynamics unchanged) more and more ignorable variables, thus
transforming the problem of integrating the equations of motion into a problem
of finding suitable coordinate transformations.
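A familiar worked case, given here as an assumed illustration, is a particle of mass $m$ moving in a plane under a central potential $V(r)$. In polar coordinates the angle $\theta$ does not appear in the Lagrangian, so it is ignorable; its conjugate momentum is conserved, and the problem reduces to a one-dimensional one for $r$ alone.

```latex
% Assumed example: planar motion in a central potential V(r).
\begin{align}
L &= \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) - V(r), \\
\frac{\partial L}{\partial \theta} &= 0
  \;\Longrightarrow\;
  p_\theta \equiv \frac{\partial L}{\partial \dot{\theta}}
  = m r^2 \dot{\theta} = \text{constant}, \\
m \ddot{r} &= \frac{p_\theta^2}{m r^3} - \frac{dV}{dr}.
\end{align}
```

In Cartesian coordinates neither coordinate is ignorable, so the partial integration supplied by $p_\theta$ is obtained purely by the choice of coordinates.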
The successive developments in the analytical approach to mechanics, from
Hamilton’s “canonical equations of motion” to the general transformation theory of these equations (the theory of canonical transformations) obtained by
Jacobi, presented many advantages of the “transformation strategy” point of
view. For further details we refer the reader to Butterfield’s chapter of this
volume, along with classic references such as Lanczos ([1949], 1962) and Whittaker ([1904], 1989). Butterfield (this vol., ch. 1), by expounding the theory of
symplectic reduction in classical mechanics, thoroughly illustrates the strategy
of simplifying a mechanical problem by exploiting a symmetry. This strategy is
also the main subject of Butterfield (2006), focussing on how symmetries yield
conserved quantities according to Noether’s first theorem (see Section ??, below), and thereby reduce the number of variables that need to be considered in
solving a problem.
We end these brief remarks on symmetry and transformation theory in classical mechanics by emphasizing two points.
First, we note that a problem-solving strategy according to which a dynamical problem (equation) is transformed into another equivalent problem (equation) by means of a symmetry might be seen as an example of the application
of Curie’s principle in its modern version (see here Section ??): by transforming
an equation into another equivalent equation using a specific symmetry we may
arrive at an equation which we can solve; the solution of the new equation is
related to the unknown solution of the old equation by the specific symmetry;
that is, we thereby arrive at an equivalent solution.
Second, we emphasize that in all these developments the invariance properties of the dynamical equations, though undoubtedly important, were considered exclusively in an instrumental way. That is, canonical transformations
were studied only for the purpose of solving the dynamical problem at hand.
The equations were given, and their invariance properties were investigated to
help find their solutions. The formulation of Einstein’s Special Theory of Relativity at the beginning of the twentieth century brings an inversion of this way
of thinking about the relationship between symmetries and physical laws, as we
shall see in the following section.
5.2 Symmetry principles as guides to theory construction
The principle of relativity, as expressed by Einstein in his 1905 paper announcing
the Special Theory of Relativity, asserts that
The laws by which the states of physical systems undergo changes
are independent of whether these changes are referred to one or the
other of two coordinate systems moving relatively to each other in
uniform motion.33
It further turns out that these coordinate systems are to be inertial coordinate
systems, related to one another by the Lorentz transformations comprising the
inhomogeneous Lorentz group.
The principle of relativity thus stated meets the conditions listed above in
Section ?? for a symmetry principle:
• The Lorentz transformations, applied to the independent and dependent
variables of the theory, leave the form of the laws as stated in one inertial system unchanged on transformation to another inertial coordinate
system.
• The Lorentz transformations map a solution, given relative to an inertial
coordinate system, into another solution.
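The second condition can be checked numerically for one simple law. The sketch below assumes the wave equation in one space dimension as the law and a sinusoidal travelling wave as the solution; the units (c = 1), the boost velocity, and the function names are hypothetical choices. A Lorentz boost maps the solution into another solution, while a Galilean boost of the same velocity does not.

```python
import math

C = 1.0  # speed of light, in assumed natural units

def wave(t, x):
    """A travelling-wave solution of the wave equation."""
    return math.sin(x - C * t)

def lorentz_boosted(field, v):
    """The field evaluated at Lorentz-transformed arguments."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    def boosted(t, x):
        return field(gamma * (t + v * x / C**2), gamma * (x + v * t))
    return boosted

def galilean_boosted(field, v):
    """The field evaluated at Galilean-transformed arguments."""
    def boosted(t, x):
        return field(t, x + v * t)
    return boosted

def wave_residual(field, t, x, h=1e-4):
    """Value of field_tt - C**2 * field_xx, using central differences."""
    d2t = (field(t + h, x) - 2.0 * field(t, x) + field(t - h, x)) / h**2
    d2x = (field(t, x + h) - 2.0 * field(t, x) + field(t, x - h)) / h**2
    return d2t - C**2 * d2x

def max_residual(field):
    """Largest residual over a small grid of sample events."""
    points = [(0.3 * i, 0.4 * j) for i in range(5) for j in range(5)]
    return max(abs(wave_residual(field, t, x)) for t, x in points)

print(max_residual(wave))
print(max_residual(lorentz_boosted(wave, 0.5)))
print(max_residual(galilean_boosted(wave, 0.5)))
```

The Lorentz-boosted wave is again a function of x − ct, with a rescaled wavelength. The Galilean-boosted wave travels at a speed other than c, so it fails to satisfy the original equation.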
This principle was explicitly used by Einstein as a guide to theory construction: it is a principle that must be satisfied whatever the final details of
the theory.34 Indeed, using just the principle of relativity and the light postulate, Einstein derives various results, including the Lorentz transformations.
33 Miller’s (1981) translation, p. 395.
34 For discussion of the principle/constructive theory distinction in Einstein, see Brown (2006, ch. 6) and Howard (2007).
As noted above, this represents a reversal in the priority that, since the time of
Newton, had been given to the relativity principle versus the dynamical laws.
Huygens used the relativity principle as a basic postulate from which to derive
dynamical results, but in Newton the relativity principle, initially presented in
his manuscripts as an independent postulate, is relegated in the Principia to a
corollary.35 From then until Einstein, the relativity of inertial motion is seen as
a consequence of the particular laws under consideration, and something that
could turn out to be false once the details of the laws of some particular interaction are known. Similarly for classical physics in general, symmetries – such as
spatial translations and rotations – were viewed as properties of the laws that
hold as a consequence of those particular laws. With Einstein that changed:
symmetries could be postulated prior to details of the laws being known, and
used to place restrictions on what laws might be postulated. Thus, symmetries
acquired a new status, being postulated independently of the details of the laws,
and as a result having strong heuristic power. As Wigner wrote, Einstein’s papers on special relativity ‘mark the reversal of a trend’: after Einstein’s works,
‘it is now natural to try to derive the laws of nature and to test their validity
by means of the laws of invariance, rather than to derive the laws of invariance
from what we believe to be the laws of nature’ (Wigner, 1967, p. 5).
The methodology that had served Einstein well with the Special Theory
of Relativity (STR) also had a role in his development of the General Theory
(GTR), for which he used various different principles as restrictions on the
possible form that the eventual theory might take.36 One of these was, so
Einstein maintained, an extension of the principle of relativity found in STR
to include coordinate systems that are in accelerated motion relative to one
another, implemented by means of the requirement that the equations of his
new theory be generally covariant. Einstein was seeking a “Machian” solution
to the challenge of Newton’s bucket, which he took to require that there be
no preferred reference frames. Thus, in his 1916 review article Einstein wrote
that ‘The laws of physics must be of such a nature that they apply to systems of
reference in any kind of motion. Along this road we arrive at an extension of
the postulate of relativity’ (emphasis in original).
The questions of whether or not the principle of general covariance (a) makes
any arbitrary smooth coordinate transformation into a symmetry transformation, and (b) is a generalization of the principle of relativity, have been much
discussed. The answer to (b) is a definitive ‘no’, but there is less consensus at
present about the answer to (a).37 In the following section we take up discussion
of (a). Here we close with a few brief remarks concerning (b).
35 In fact, it does not follow from Newton’s three laws of motion – we must further assume
the velocity independence of mass and force. See Barbour (1989, Section 1.2).
36 Primarily the following: the principle of relativity, later (in 1918) distinguished from what
Einstein referred to as ‘Mach’s principle’; the principle of equivalence; and the principle of
conservation of energy–momentum.
37 See for example Torretti (1983, pp. 152–4); Norton (1993), who also discusses the relationship with the principle of equivalence; Anderson (1967).
18
Even if general covariance in GTR is a symmetry principle, it is not an extension of the relativity principle. That is to say, general covariance says nothing
about the observational equivalence of distinct reference frames.38 As already
noted, the thought that general covariance might provide such a principle was,
for Einstein, connected with his attempts to provide a “Machian” resolution to
the challenge of Newton’s bucket, and with his principle of equivalence. However, the principle of equivalence does not imply the observational equivalence
of reference frames in arbitrary states of motion (Einstein never thought that it
did), and Einstein eventually realized that GTR does not vindicate a solution
to Newton’s bucket that depends only on the relative motion of matter.39
Whatever the subtleties of whether, and to what extent, general covariance
is a symmetry principle, it is clear that it had enormous heuristic power, not
just for Einstein in his development of GTR, but also beyond. Think, for example, of Hilbert’s work on the axiomatization of physics (see Corry, 2004, and
references therein), and Weyl’s attempts to construct a unified field theory (see
O’Raifeartaigh, 1997, for an English translation of Weyl’s 1918 paper ‘Gravitation and Electricity’, and see also Weyl, 1922). In all these cases, general
covariance provided a powerful tool for theory construction. In the following
Section we discuss further the significance and interpretation of general covariance in GTR.40
6 General covariance in General Relativity
In the preceding Section we noted the role of the principle of general covariance as a guide to theory construction. In this Section we turn our attention
to a number of further issues relating to general covariance in GTR that have
received attention in the philosophical literature. We begin with the issue,
raised in the preceding section, concerning the status of arbitrary smooth coordinate transformations as symmetry transformations. We then discuss various
characteristics associated with general covariance, including those pointed to
by Einstein’s so-called ‘hole argument’, before turning to the issue of whether
or not general covariance has physical content.41 We postpone discussion of
Noether’s theorems to Section ??, below.
6.1 General covariance and arbitrary coordinate transformations as symmetry transformations
Does the principle of general covariance make any arbitrary smooth coordinate
transformation into a symmetry transformation? One way to approach this
38 For further discussion see, for example, Norton (1993) and Torretti (1983, Section 5.5).
39 For a clear and concise discussion, see Janssen (2005).
40 For detailed presentation of the Special and General Theories of Relativity, see Malament
(this vol., ch. 3). See also Rovelli (this vol., ch 12, Section 4).
41 See also Belot (this vol., ch. 2).
question is to consider active rather than passive transformations (see Section
4.1, above), and to compare the situation in GTR with that in STR.
In STR, a Lorentz transformation – actively construed – picks up the matter
fields and redistributes them with respect to the spacetime structure encoded
in the metric. The principle of relativity holds for such transformations because
the evolutions of the matter fields in the two cases (related by the Lorentz transformation) are observationally indistinguishable: no observations, in practice or
in principle, could distinguish between the two scenarios. In GTR, active general
covariance is implemented by active diffeomorphisms on the spacetime manifold
(see Rovelli, this vol., ch. 12, Section 4.1). These involve transformations of not
just the matter fields, but also the metric field, in which both are redistributed
with respect to the spacetime manifold. Once again, the “two cases” are observationally “indistinguishable”, but this time the reason generally given is that
the “two cases” are in fact just one case.42
Why should we accept that there are two genuinely distinct cases when
considering the Lorentz transformations in STR, and only one case for the diffeomorphisms of GTR? One approach would be to claim that a crucial difference
between the two is that a Lorentz transformation can be implemented on an
effectively isolated sub-system of the matter fields, producing an observably distinct scenario in which, nevertheless, the evolution of the sub-system in question
is indistinguishable assuming no reference is made to matter fields outside that
subsystem. For example, in Galileo’s famous ship experiment we consider two
observably distinct scenarios – one in which the ship is at rest with respect to
the shore, and one in which it moves uniformly with respect to the shore – and
we notice that the behaviour of physical systems within the cabin of the ship
does not distinguish between the two scenarios.43 No analogue of the Galilean
ship experiment can be generated for the general covariance of GTR.44
The importance of symmetry transformations being implementable to produce observationally distinct scenarios has been emphasized by Kosso (2000).
On this view, the observational significance of symmetry transformations rests
on a combination of two observations being possible in principle. First, it must
be possible to confirm empirically the implementation of the transformation –
hence the importance of being able to generate an observationally distinguishable scenario through the transformation of a subsystem. Second, we must be
able to observe that the subsequent internal evolution of the subsystem is unaffected. That we cannot meet the first of these requirements for arbitrary smooth
42 See also Section ??, penultimate paragraph.
43 This implementation can be only approximate, relying on the degree to which the subsystem in question can be isolated from the “external” matter fields.
44 One suggestion might be that we perform a transformation T which is the identity outside
some region R, and which differs from the identity within that region. This will not achieve
the desired result. The two scenarios must have observationally distinct consequences, at least
in principle. In the case of Galileo’s ship, if we allow the subsystem to interact with other
matter once again, we will see that in one case the ship crashes into rocks (for example),
while in the other it suffers no such collision. Thus, we have observational distinguishability
in principle. The transformation T does not produce a scenario which any future events could
enable us to distinguish from the original.
coordinate transformations in GTR marks a difference between these and the
Lorentz transformations.45
On this approach, while the field equations of GTR take the same form for
any choice of coordinate system, this is not sufficient for arbitrary coordinate
transformations to be symmetries. In addition, the actively construed transformations must have a physical interpretation – we must be transforming one
thing with respect to something else. When we perform a diffeomorphism, we
get back the same solution, not a new solution, for we are not re-arranging the
matter fields with respect to the metric.
We stress that this is only one way to approach the issue of whether general
covariance should be understood as a symmetry principle in GTR. A contrasting position may be found in Anderson (1967, Section 10-3), who argues that
we must understand Einstein as viewing general covariance as a symmetry requirement, and attempts to spell out the conditions under which it can function
as such.
6.2 Characteristics of generally covariant theories
Any generally covariant theory will possess certain characteristics that are philosophically noteworthy. First, there will be a prima facie problem with causality
and determinism within the theory, and second, there will be constraints on the
specification of the initial data. Einstein recognized aspects of the first characteristic while he was searching for his theory of gravitation, maintaining from
1913 through until the fall of 1915 that his so-called ‘hole argument’ provided
grounds for concluding that no generally covariant theory could be physically
acceptable.
In the ‘hole argument’, Einstein considers a region of spacetime in which
there are no matter fields (the “hole”), and then shows that in a generally
covariant theory no amount of data about the values of the matter and gravitational fields outside the hole is sufficient to uniquely determine the values of
the gravitational field inside the hole. From this, Einstein concluded that no
generally covariant theory could be physically acceptable.46
The context to bear in mind here is that Einstein was searching for a theory
in which the matter fields plus the field equations would uniquely determine
the metric.47 In the summer of 1915 Einstein lectured on relativity theory
in Göttingen where his audience included David Hilbert. If we assume that
Einstein’s presentation included a version of his ‘hole argument’, then we can
reasonably infer that Hilbert was quick to reinterpret the issue that the ‘hole
argument’ points to, and to present the problem raised for generally covariant
theories in terms of whether such theories permit well-posed Cauchy problems.48
45 Indeed, this result applies generally to local versus global symmetries. See also Brading
and Brown (2004).
46 For presentation and discussion of the ‘hole argument’, see Norton (1984, pp. 286–291,
and 1993, Sections 1-3), Stachel (1993), and Ryckman (2005, Section 2.2.2). See also Rovelli,
this vol., ch. 12, Section 4.1.1
47 For more on Einstein’s (mis)appropriation of Mach’s principle, see Barbour (2005).
48 Brading and Ryckman (2007); see also Brown and Brading (2002, especially Section IV).
In the years immediately following the advent of GTR, Hilbert played a
central role in spelling out the problems of causality and determinism faced by
any generally covariant theory. He pointed out that in any such theory, including
GTR, there will be four fewer field equations than there are variables, leading
to a mathematical underdetermination in the theory. As Hilbert stressed, the
Cauchy problem is not well-posed: given a specification of initial data, the field
equations do not determine a unique evolution of the variables.
We can see the connection between the underdetermination problem and
general covariance as follows. For the Cauchy problem to be well-posed, we
must be able to express the second time derivatives of the metric in terms
of the initial data (plus the further spatial derivatives that can be calculated
from the initial data). However, if we re-express the 10 (source-free) Einstein
field equations $G_{\mu\nu} = 0$ so as to explicitly display all the terms containing the
second time derivative of the metric, we see that we have ten equations for
six unknowns $g_{ij,00}$, the remaining four second time derivatives $g_{\mu 0,00}$ failing to
appear in the equations.49 This is a direct consequence of general covariance:
we can always make a coordinate transformation in the neighborhood of the
initial data surface such that the metric components and their first derivatives
are unchanged, while the second time derivatives $g_{\mu 0,00}$ vanish on that surface.
Thus the field equations, which must be valid in all coordinate systems, cannot
possibly contain information on these four second time derivatives. The initial data
do not determine the metric uniquely: there are four arbitrary functions $g_{\mu 0,00}$
that we are free to choose.
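A complementary route to the same counting is the standard textbook argument from the contracted Bianchi identity. The sketch below is not taken from this chapter; it assumes the Levi-Civita connection $\Gamma$, with Latin indices $i$ running over the spatial coordinates only.

```latex
\begin{align}
  % Contracted Bianchi identity: holds identically, for any metric
  \nabla_\mu G^{\mu\nu} &= 0 , \\
  % Isolate the time derivative of the components G^{0\nu}
  \partial_0 G^{0\nu}
    &= -\,\partial_i G^{i\nu}
       - \Gamma^{\mu}_{\mu\lambda}\, G^{\lambda\nu}
       - \Gamma^{\nu}_{\mu\lambda}\, G^{\mu\lambda} .
\end{align}
% The right-hand side contains at most second time derivatives of g_{\mu\nu},
% so G^{0\nu} itself can contain at most first time derivatives.
```

Read this way, the four equations $G^{0\nu} = 0$ involve only the metric and its first time derivatives on the initial surface, while the second time derivatives $g_{ij,00}$ can enter only through the six equations $G^{ij} = 0$.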
Today, it is customary to assert from the outset that solutions of Einstein’s
field equations differing only in the choice of these four arbitrary functions are
physically equivalent.50 But here we should note that this “gauge freedom” interpretation of general covariance leads to problems of its own.51 For example,
within this framework the observables of the theory must be “gauge invariant”
quantities, but such quantities have (to date) turned out to be far removed from
anything “observable” in the operational sense. The gauge freedom interpretation of general covariance is sometimes accompanied by the view that this
freedom – and therefore general covariance itself – lacks physical content. We
turn to consider this issue in Section ??, below.
In our explanation of the underdetermination problem, above, we noted that
the Einstein equations provide ten equations for the six unknowns $g_{ij,00}$. The
other face of the underdetermination problem is therefore an overdetermination
problem with respect to the $g_{ij,00}$, and what this means is that there will be
constraints on the specification of the data on the initial hypersurface. This is
the second characteristic of all generally covariant theories that we mentioned
in our opening remarks of the current subsection. Indeed, the presence of constraint equations is a feature shared with other theories with a local symmetry
structure, such as electromagnetism. Philosophically, the significance lies in the relationship between the theory and the initial data.
49 See Adler, Bazin and Schiffer (1975, ch. 8) for details of the over- and under-determination issues.
50 Recall the discussion of Section ??, above.
51 See Belot (this vol., ch. 2).
In the seventeenth century
Descartes wrote a story of a world created in a state of disorder from which, by
the ordinary operation of the laws of nature, a world seemingly similar to our
own emerged.52 This image of the world emerging from an initial chaos has a
long history, of course, but the emergence of order by means of the operation of
the laws of nature offered a novel twist to the tale. It involves the separation of
initial conditions, which could be anything, from the subsequent law-governed
evolution of the cosmos. In modern terms, this is a theory without constraints:
the theory determines which properties of a system must be specified in order
to give adequate initial data, but we are then free to assign whatever values we
please to these properties; the equations of the theory are used to evolve that
data forwards in time. A theory with constraints, by contrast, contains two
types of equations: constraint equations that must be satisfied by the initial
data, as well as evolution equations. In GTR, four of the ten field equations
connect the curvature of the initial data hypersurface with the distribution of
mass–energy on that hypersurface, and the remaining six field equations are
evolution equations. To sum up, in a theory with constraints, the initial “disorder” cannot be so disordered after all, but must itself satisfy constraints set
down by the laws of the theory.
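Electromagnetism, mentioned above as another theory with constraint equations, offers a compact illustration of the split between constraints and evolution equations. The following is a minimal sketch under assumed conventions (Heaviside-Lorentz units with $c = 1$, charge density $\rho$, current density $\mathbf{j}$); these symbols are ours, not the chapter's.

```latex
\begin{align}
  % Constraint equation (Gauss's law): no time derivatives appear
  \nabla \cdot \mathbf{E} &= \rho , \\
  % Evolution equation (Ampere-Maxwell law)
  \partial_t \mathbf{E} &= \nabla \times \mathbf{B} - \mathbf{j} , \\
  % The constraint is preserved by the evolution, given charge conservation
  \partial_t \left( \nabla \cdot \mathbf{E} - \rho \right)
    &= \nabla \cdot \left( \nabla \times \mathbf{B} - \mathbf{j} \right) - \partial_t \rho
     = - \left( \partial_t \rho + \nabla \cdot \mathbf{j} \right) = 0 .
\end{align}
```

Initial data for $\mathbf{E}$ therefore cannot be chosen freely: they must satisfy Gauss's law on the initial surface, and the evolution equation then keeps the constraint satisfied at later times.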
6.3 Does general covariance have any physical significance?
As we saw in Section ??, Einstein treated general covariance as a symmetry
principle guiding the search that produced his General Theory of Relativity.
There is no doubt that general covariance proved a useful heuristic for Einstein,
but there remains an ongoing dispute over whether general covariance in fact
has any physical significance. Kretschmann raised the issue forcefully
as early as 1917. The thrust of the argument, which continues to reverberate
today, is that any theory can be given a generally covariant formulation given
sufficient mathematical ingenuity, and therefore the principle of general covariance places no restrictions on the physical content of a theory. Indeed, Norton
(2003) begins his discussion of the issue by claiming that this negative view of
general covariance has become mainstream, before going on to give an alternative viewpoint (see below).
It seems clear to us that the characteristic features of generally covariant
theories discussed above may, in some theories at least (including GTR), be far
from trivial, and that the mainstream view – which would indeed render these issues trivial – should be opposed. Those wishing to oppose the mainstream view
adopt a two-step general strategy: first, show under what conditions general
covariance places a restriction on the physical content of a theory; and second,
demonstrate what those implications for physical content consist in. Thus, the
general mathematical point that any theory can be put into generally covariant form is conceded, but the implication that general covariance is therefore
necessarily physically vacuous is resisted by attention to the manner in which
general covariance is implemented in a given theory or class of theories.
52 Written around 1633, Le Monde was not published in Descartes’s lifetime. For an English
translation see Descartes (1998). The “order out of disorder” story is in the Treatise on Light,
chs. 6 and 7. Whether the ordinary operation of the laws of nature was sufficient to bring
order out of chaos became a much-disputed issue.
For example, Anderson (1967), Ohanian and Ruffini (1994), Norton (2003),
and Earman (2006) each attempt to explain under what conditions the purely
mathematical feature of general covariance comes to have physical bite.53 Anderson distinguishes between the symmetries of a theory (which have physical
significance) and the covariance group of the equations (which need not). Anderson is the classic reference for the distinction between “absolute” and “dynamical” objects,54 and in this terminology the covariance group of the equations of
a theory becomes a symmetry group if and only if the theory contains no absolute objects. Ohanian and Ruffini (1994) appeal to the distinction they make
between invariance and covariance of the equations of a theory.55 Covariance,
they agree, is a mathematical feature (perhaps simply an artefact of the particular formulation of the theory at hand); but we require not only the covariance
of the equations, but also that for any objects (with one or more components)
appearing in the theory that are nevertheless independent of the state of matter
(such as the speed of light, Planck’s constant, etc.), their value should be unchanged by the general coordinate transformations. Norton (2003) emphasizes
the role of physical considerations in fixing the content of a theory such that
this restricts the formal games that we can play. Earman (2006) begins by taking pains to emphasize the distinction between the ‘mere co-ordinate freedom’
(associated with arbitrary coordinate transformations, passively construed) and
‘the substantive demand that diffeomorphism invariance is a gauge symmetry
of the theory at issue’. That is to say, he reminds us that the issue at stake is
not our ability to re-write a theory in generally covariant form (it is conceded
that this is something we can always do, given sufficient mathematical ingenuity), but the relationship between the physical situations that are related by
diffeomorphisms, i.e. by (active) point transformations (see Section 6.1, above).
‘Substantive general covariance’ holds when diffeomorphically related models of
the theory represent different descriptions of the same physical situation. The
claim is that GTR satisfies substantive general covariance whereas generally covariant formulations of such theories as STR need not, and the goal is to show
that this requirement provides demarcation between theories in which general
covariance represents a physically significant property of the theory, and those
in which it does not.56
53 See also Norton (1993, especially Section 5), and Rovelli (this vol., ch. 12, Section 4.1.3).
54 It has proved difficult to make the distinction between absolute and dynamical objects
precise, but the intuitive idea is clear enough. Dynamical objects satisfy field equations
and interact with other objects, whereas absolute objects are not affected by the dynamical
behaviour of other fields appearing in the theory. For a careful and detailed treatment of
Anderson’s approach, and the counter-examples that have been raised, see Pitts (2006). The
conclusion of this paper is that Anderson’s intuition can be made sufficiently precise to cope
with all counter-examples that have appeared in the literature to date (including one due to
Pitts himself), but that there is another example, due to Geroch, that Pitts has been unable
to resolve. The debate goes on!
55 See Section ??.
56 One important tool for distinguishing genuine ‘gauge theories’ from those in which the
Thus, Anderson, Ohanian and Ruffini, Norton, and Earman each seek to
add bite to the “merely mathematical” requirement of general covariance by
placing conditions on the manner in which it is implemented in the theory.
Once these requirements are added, various consequences follow for the content
of the theory, such as that the metric be a dynamical object. In each case,
the aim is to elevate general covariance as implemented in GTR to a symmetry
principle.57
Considerations of the significance of general covariance in theories of gravitation led to the formulation of three theorems important for the general interpretation of symmetries in physics. These theorems are due to Emmy Noether
and Felix Klein, and will be discussed in the following section.
7 Noether’s theorems
Any discussion of the significance of symmetries in physics would be incomplete without mention of Noether’s theorems. These theorems relate symmetry
properties of theories to other important properties, such as conservation laws.
Within physics, the term ‘Noether’s theorem’ is most frequently associated
with a connection between global continuous symmetries and conserved quantities. Familiar examples from classical mechanics include the connections between: spatial translations and conservation of linear momentum; spatial rotations and conservation of angular momentum; and time translations and conservation of energy. In fact, this theorem is the first of two presented
in Noether’s 1918 paper ‘Invariante Variationsprobleme’.58
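The first of these familiar connections can be made concrete with a minimal worked example. The two-particle Lagrangian below, with masses $m_1, m_2$ and a potential $V$ depending only on the separation, is our own illustrative choice and does not appear in the source.

```latex
\begin{align}
  % Two particles in one dimension; V depends only on the relative position
  L &= \tfrac{1}{2} m_1 \dot{x}_1^2 + \tfrac{1}{2} m_2 \dot{x}_2^2 - V(x_1 - x_2) , \\
  % Euler expressions for each coordinate (a = 1, 2)
  E_a &\equiv \frac{\partial L}{\partial x_a} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}_a} , \\
  % Under the translation x_a -> x_a + \epsilon the potential terms cancel in the sum
  E_1 + E_2 &= - \frac{d}{dt} \left( m_1 \dot{x}_1 + m_2 \dot{x}_2 \right) .
\end{align}
```

The last line holds identically. When the equations of motion $E_1 = E_2 = 0$ are satisfied, it reduces to the conservation of total linear momentum.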
Before stating the two theorems, we begin with the following cautionary
remark. The connection between variational symmetries (connected to the invariance of the action, and in terms of which Noether’s theorems are formulated)
and dynamical symmetries (concerning the dynamical laws, which is the topic of
our discussion here) is subtle (see Olver, 1993, ch. 4). Noether herself never addressed the connection, and never used the word ‘symmetry’ in her paper. She
discusses integrals mathematically analogous to (but generalizations of) the action integrals of Lagrangian physics, and uses variational techniques and group
theory to elicit a pair-wise correspondence between variational symmetries of
the integral and a set of identities.
Noether then proves two theorems, the first for the case where the variational symmetry group depends on constant parameters, and the second for the
case where the variational symmetry group depends on arbitrary functions of
the variables.59 In the following statement of her theorems we use the term
‘Noether symmetry’ to refer to a symmetry of the field equations for which the
change in the action arising from the infinitesimal symmetry transformation is at most a surface term.
local symmetry in question is merely formal is Noether’s second theorem; see Section 7, below.
57 Brown and Brading (2002) attempt to analyze in more detail, by means of Noether’s theorems (see Section ??, below), what additional conditions must be added to general covariance
in order to arrive at specific aspects of the content of GTR.
58 For an English translation see Noether (1971).
59 See Brading and Brown, 2007.
Using the terminology of Section ??, the first type of
symmetry then corresponds to a global dynamical symmetry, and the second to
a local one. We state the theorems in a form appropriate to Lagrangian field
theory; Noether’s own statement of the theorems involves no such specialization.
For discussion of the first theorem in the context of finite-dimensional classical
mechanics see Butterfield (this vol., ch. 2, Section 2.1.3). We state the theorems so that we can refer back to them to characterize the conceptual content,
but for discussion of the mathematical detail of their derivation and content
we refer the reader elsewhere – see especially Olver (1993) and Barbashov and
Nesterenko (1983).
We can state Noether’s two theorems, for a Lagrangian density $L$ depending
on the fields $\phi_i(x)$ and their first derivatives, as follows.
Noether’s first theorem
If a continuous group of transformations depending smoothly on $\rho$ constant parameters $\omega_k$ ($k = 1, 2, \ldots, \rho$) is a Noether symmetry group of the Euler-Lagrange
equations associated with a Lagrangian $L(\phi_i, \partial_\mu \phi_i, x^\mu)$, then the following $\rho$
relations are satisfied, one for every parameter on which the symmetry group
depends:60
$$\sum_i E_i^L \xi_i^k = \partial_\mu j_k^\mu . \tag{1}$$
On the left-hand side we have a linear combination of Euler expressions,
$$E_m^L \equiv \frac{\partial L}{\partial \phi_m} - \partial_\mu \left( \frac{\partial L}{\partial \phi_{m,\mu}} \right) \tag{2}$$
where
$$E_m^L = 0 \tag{3}$$
are the Euler-Lagrange equations for the field $\phi_m$. (The $\xi_i^k$ depend on the
particular symmetry transformations and fields under consideration, and the
details are not important for our current purposes.)
On the right-hand side we have the divergence of a current, $j_k^\mu$. When
the left-hand side vanishes, the divergence of the current is equal to zero, and
this expression can be converted into a conserved quantity subject to certain
conditions. Thus, Noether’s first theorem gives us a connection between global
symmetries and conserved quantities.61
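To see the first theorem at work in a field theory, consider a free complex scalar field with a global phase symmetry. This is a minimal sketch, not an example from the source: the Lagrangian, the mass parameter $m$, the treatment of $\phi$ and $\phi^*$ as independent fields, and the sign conventions are all our assumptions, and the parameter index $k$ is dropped because there is a single parameter $\omega$.

```latex
\begin{align}
  % Free complex scalar field
  L &= \partial_\mu \phi^* \, \partial^\mu \phi - m^2 \phi^* \phi , \\
  % Infinitesimal global phase transformation, constant parameter \omega
  \delta\phi &= i \omega \phi , \qquad \delta\phi^* = - i \omega \phi^* , \\
  % Euler expressions for the two fields
  E_{\phi} &= - m^2 \phi^* - \Box \phi^* , \qquad
  E_{\phi^*} = - m^2 \phi - \Box \phi , \\
  % Linear combination of Euler expressions equals a divergence, identically
  E_{\phi} \, (i\phi) + E_{\phi^*} \, (-i\phi^*)
    &= \partial_\mu \left[ \, i \left( \phi^* \partial^\mu \phi - \phi \, \partial^\mu \phi^* \right) \right] .
\end{align}
```

The mass terms cancel on the left-hand side, so the final relation holds whether or not the field equations are satisfied. When they are satisfied, the left-hand side vanishes and the current in square brackets is divergence-free, which under suitable boundary conditions yields a conserved charge.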
60 Note that we are using the Einstein summation convention to sum over repeated Greek
indices.
61 This theorem is widely discussed. See especially Barbashov and Nesterenko (1983);
Doughty (1990). We refer the reader to Butterfield (this vol., ch. 2, and 2006) for further
discussion of Noether’s first theorem in the context of finite-dimensional classical mechanics.
Noether’s second theorem
If a continuous group of transformations depending smoothly on $\rho$ arbitrary
functions of time and space $p_k(x)$ ($k = 1, 2, \ldots, \rho$) and their first derivatives is
a Noether symmetry group of the Euler-Lagrange equations associated with a
Lagrangian $L(\phi_i, \partial_\mu \phi_i, x^\mu)$, then the following $\rho$ relations are satisfied, one for
every function on which the symmetry group depends:
$$\sum_i E_i^L a_{ki} = \sum_i \partial_\nu \left( b^\nu_{ki} E_i^L \right). \tag{4}$$
The $a_{ki}$ and $b^\nu_{ki}$ depend on the particular transformations of the fields in question, and while again the details need not concern us here, we note for use
below that while the $a_{ki}$ arise even when the symmetry transformation is a
global transformation, the $b^\nu_{ki}$ occur only when it is local.62 What we have here,
essentially, is a dependency between the Euler expressions and their first derivatives. This dependency holds as a consequence of the local symmetry used in
deriving the theorem. In the case when all the fields are dynamical (i.e. satisfy
Euler-Lagrange equations) it follows that not all the field equations are independent of one another. This formal underdetermination is characteristic of
theories with a local symmetry structure.63
As Hilbert recognized in the context of generally covariant theories of gravitation, the underdetermination is independent of the specific form of the Lagrangian.64 In the case of General Relativity, once we specify the Lagrangian
and substitute it into (??), we arrive at the (contracted) Bianchi identities.
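Source-free electromagnetism gives a simpler illustration of the same kind of dependency. The sketch below is ours rather than the source's: it assumes the Lagrangian $L = -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}$ with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, a single arbitrary function $p(x)$, and our reading of the second theorem as relating $\sum_i E_i^L a_{ki}$ to $\sum_i \partial_\nu ( b^\nu_{ki} E_i^L )$.

```latex
\begin{align}
  % Local gauge transformation: only derivatives of p(x) appear,
  % so the a-coefficients vanish and b^{\nu}{}_{\mu} = \delta^\nu_\mu
  \delta A_\mu &= \partial_\mu p , \\
  % Euler expressions for the field components A_\mu
  E^\mu &= \frac{\partial L}{\partial A_\mu}
          - \partial_\nu \frac{\partial L}{\partial (\partial_\nu A_\mu)}
         = \partial_\nu F^{\nu\mu} , \\
  % Second-theorem identity: holds whether or not E^\mu = 0
  \partial_\mu E^\mu &= \partial_\mu \partial_\nu F^{\nu\mu} = 0 .
\end{align}
```

The last line vanishes identically because $F^{\nu\mu}$ is antisymmetric while $\partial_\mu \partial_\nu$ is symmetric. The four field equations $E^\mu = 0$ are therefore not independent of one another, which is the electromagnetic counterpart of the role played by the contracted Bianchi identities in General Relativity.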
For Noether herself, the impetus for the paper arose from the discussions over
the status of energy conservation in generally covariant theories between Hilbert,
Klein and Einstein, during which Hilbert commented that energy conservation
for the matter fields no longer has the same status in generally covariant theories
as it had in previous (non-generally covariant) theories, because it follows independently of the field equations for the matter fields. Noether’s two theorems
can be used to support this conjecture (see Brading, 2005). The discussion over
the status of energy conservation in General Relativity continues, the root of
the issue being that energy–momentum cannot, in general, be defined locally.65
62 Once again, the reader is referred to Brading and Brown (2007) for further details.
63 Whether the dependencies expressed by the second theorem are trivial or not depends
on the status of the fields with respect to which the local symmetry holds. It is in this way
that Noether’s second theorem can be used as a tool in the attempt to demarcate ‘true gauge
theories’ from theories where the local symmetry is a ‘mere mathematical artefact’ (see Section
6.3 above, and Earman, 2006). For a ‘true gauge theory’ the dependencies have significant
physical implications.
64 Hilbert (1915).
65 The energy–momentum conservation law in General Relativity is formulated in terms of
the vanishing of the covariant divergence of the energy–momentum tensor associated with the
matter fields. Alternatively, we can express this in terms of the vanishing of the coordinate
divergence of the energy–momentum of the matter fields plus that of the gravitational field.
The latter term falling under the divergence operator is not uniquely defined and, pertinent
to the issue of non-localizability, may vanish in some coordinate systems and not in others.
We can understand this coordinate dependence by reflecting on the equivalence principle,
according to which partitioning the inertial-gravitational field to obtain a division between
inertial and gravitational forces is itself a coordinate-dependent issue. For further discussion
see, for example, Misner, Thorne and Wheeler (1970), pp. 467–8, and Wald (1984), p. 70.
See also Malament (this vol., ch. 3).
Today, the significance of Noether’s results lies in their generality. Many
of the specific connections between global spacetime symmetries and their associated conserved quantities were known before Noether’s 1918 paper, and
both Einstein and Hilbert anticipated some aspects of the second theorem in
their investigations of energy conservation during and after the development of
GTR.66 However, her systematic treatment allows us to understand that these
relations do not rely on the detailed dynamics of a particular theory, but in fact
follow from the structure of Lagrangian theories and significantly weaker stipulations than the full dynamics of the theory. For example, general covariance
leads to energy conservation in GTR given satisfaction of the gravitational field
equations, but independently of the detailed form of those equations, and independently of the field equations for the matter fields (indeed, independently of
whether the matter fields satisfy Euler-Lagrange equations at all).67 Noether’s
theorems are a powerful tool for investigating the structure of theories – which
assumptions are required to generate which aspects of the theory, and so forth.68
It is worthwhile mentioning a third theorem, connected with Noether’s two
theorems and derived in the same context (i.e. the study of generally covariant theories of gravitation and conservation of energy) by Felix Klein (1918).
We call it the ‘Boundary theorem’ for reasons associated with its method of
derivation.69 As with Noether’s second theorem, the Boundary theorem concerns local symmetries, and results in a series of identities (termed the ‘cascade
equations’ by Julia and Silva (1998)).70 We state here a simplified version of
the Boundary theorem in which the action is left unchanged by an infinitesimal
symmetry transformation (i.e. we do not allow for the possibility of a surface
term).71
The Boundary theorem (restricted form)
If a continuous group of transformations depending smoothly on $\rho$ arbitrary
functions of time and space $p_k(x)$ ($k = 1, 2, \ldots, \rho$) and their first derivatives is
a Noether symmetry group72 of the Euler-Lagrange equations associated with
a Lagrangian $L(\phi_i, \partial_\mu \phi_i, x^\mu)$, then the following three sets of $\rho$ relations are
satisfied, one for every parameter on which the symmetry group depends:
66 On Einstein, see Janssen (2005), pp. 75–82; and see Sauer (1999) on Hilbert.
67 See Brading and Brown (2007).
68 For a discussion of this in the case of general covariance, see Brading and Brown (2002).
69 The Boundary theorem also appears in the work of Hermann Weyl, specialized to the case
of his unified field theory (see Weyl, 1922, pp. 287–289; the first appearance was in the 1919
third edition), and was published in a non-theory-specific form by Utiyama (1956, 1959).
70 As with Noether’s second theorem, the Boundary theorem is a useful tool in the attempt to demarcate ‘true gauge theories’ from theories where the local symmetry is a ‘mere
mathematical artefact’, through inspection of the identities that result from the theorem, and
through the physical significance – or otherwise – of these identities.
71 For further details of the Boundary theorem, including the generalization that allows for
a surface term, see Brading and Brown (2007).
72 The Boundary theorem is here stated in a restricted form such that the Noether symmetry
group must belong to the restricted class of such groups associated with an invariant action.
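$$\sum_i \partial_\mu \left( b^\mu_{ki} E_i^L \right) = \partial_\mu j_k^\mu \tag{5}$$
$$\sum_i \left( b^\mu_{ki} E_i^L \right) = j_k^\mu - \sum_i \partial_\nu \left( \frac{\partial L}{\partial(\partial_\nu \phi_i)} \, b^\mu_{ki} \right) \tag{6}$$
$$\sum_i \left[ \frac{\partial L}{\partial(\partial_\mu \phi_i)} \, b^\nu_{ki} + \frac{\partial L}{\partial(\partial_\nu \phi_i)} \, b^\mu_{ki} \right] = 0. \tag{7}$$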
Once again, the $b^\nu_{ki}$ depend on the particular transformations of the fields in
question, the details of which need not concern us here. The first identity is
connected to the existence of superpotentials associated with local symmetries.73
The second equation can be used to investigate the relationship between a field
and its sources. For example, in the case of classical electromagnetism, we can
investigate the relationship between the local gauge symmetry of the theory and
the condition that:
$$j^\mu = \partial_\nu F^{\mu\nu}, \tag{8}$$
i.e. that Maxwell’s equations with dynamical sources hold. Using the case of
classical electromagnetism as our example once again, the third equation becomes the condition that the electromagnetic tensor be antisymmetric (showing
the relationship between this condition and the local gauge symmetry of that
theory):
$$F^{\mu\nu} + F^{\nu\mu} = 0. \tag{9}$$
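The step from the third identity to the antisymmetry condition can be spelled out as follows. This is a hedged sketch: it assumes the Maxwell Lagrangian $L = -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}$, the gauge transformation $\delta A_\alpha = \partial_\alpha p$ (so that the coefficient of $\partial_\mu p$ is $\delta^\mu_\alpha$), and our reading of the third identity as the symmetrized combination summed over the field components.

```latex
\begin{align}
  % Derivative of the Lagrangian with respect to the field gradients
  \frac{\partial L}{\partial (\partial_\mu A_\alpha)} &= - F^{\mu\alpha} , \\
  % Third identity with b^{\mu}{}_{\alpha} = \delta^\mu_\alpha, summed over \alpha
  \sum_\alpha \left[ \left( - F^{\mu\alpha} \right) \delta^\nu_\alpha
                   + \left( - F^{\nu\alpha} \right) \delta^\mu_\alpha \right]
    &= - \left( F^{\mu\nu} + F^{\nu\mu} \right) = 0 .
\end{align}
```

With $F_{\mu\nu}$ defined as $\partial_\mu A_\nu - \partial_\nu A_\mu$ the antisymmetry already holds by definition, so in this sketch the identity is satisfied trivially; the calculation only shows how the index bookkeeping leads to equation (9).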
These remarks have been necessarily brief, and the reader is referred to
Barbashov and Nesterenko (1983), along with Brading and Brown (2003 and
2007), for detailed derivations and discussion of these results. The identities of
the Boundary theorem and of Noether’s two theorems are not all independent of
one another, and which is most useful depends on the context and the question
under consideration. As with Noether’s theorems, the Boundary theorem holds
independently of the specific details of the dynamical equations, and together
they allow us to investigate structural features of our theories that are associated
with the symmetry properties of those theories.
8 The interpretation of symmetries in classical physics
In what follows, we begin with ‘Wigner’s hierarchy’, which has become the
canonical view of the relationship between symmetries, laws and events. We
supplement this with a brief discussion of the connection between symmetry and irrelevance, and how this bears on the interpretation of the various symmetries
described in Section ??, above.
73 See, for example, Trautman (1962, p. 179).
The general interpretation of symmetries in physical theories can adopt a
number of complementary approaches. We can ask about the different roles that
various symmetries play; about the epistemological, ontological or other status
that various symmetries have; and about the significance of the structures left
invariant by symmetry transformations. We end with some remarks on each of
these issues.
8.1 Wigner’s hierarchy
The starting point for contemporary philosophical discussion of the status and
significance of symmetries in physics is Wigner’s 1949 paper ‘Invariance in Physical Theory’, along with his three later papers published in 1964.74 In these papers, Wigner makes the distinction mentioned above (see Section ??) between
geometrical and dynamical symmetries, which we will return to below. He also
presents his view of the hierarchy of physical knowledge, according to which
symmetries are viewed as properties of laws:
There is a strange hierarchy in our knowledge of the world around
us. Every moment brings surprises and unforeseeable events – truly
the future is uncertain. There is, nevertheless, a structure in the
events around us, that is, correlations between the events of which
we take cognizance. It is this structure, these correlations, which
science wishes to discover, or at least the precise and sharply defined
correlations. ... We know many laws of nature and we hope and
expect to discover more. Nobody can foresee the next such law that
will be discovered. Nevertheless, there is a structure in the laws of
nature which we call the laws of invariance. This structure is so
far-reaching in some cases that laws of nature were guessed on the
basis of the postulate that they fit into the invariance structure. ...
This then, the progression from events to laws of nature, and from
laws of nature to symmetry or invariance principles, is what I meant
by the hierarchy of our knowledge of the world around us. (Wigner,
1967, pp. 28–30).
This view of symmetries, as properties of laws, has become canonical.
8.2 Symmetry and irrelevance
There is a general property of laws, or of the underlying events, to which symmetries are connected: the irrelevance of certain quantities that might otherwise
be thought to have physical significance.75
74 Wigner’s papers can be found in the collection Symmetries and Reflections (Wigner,
1967).
In Section ?? we outlined the variety of symmetries found in physics, and in each case the symmetry is associated
with a property that is deemed irrelevant for the purposes of describing the law-governed behaviour of a system. For example, left-right symmetry means that
whether a system is left-handed or right-handed is irrelevant to its law-governed
evolution. Famously, this symmetry is violated in the weak interaction: the law-governed behaviour of systems turns out to be sensitive to handedness for certain
processes (see Pooley, 2003).
In Section ?? we characterized the distinction between global and local symmetries mathematically, in terms of the dependence on constant parameters and
arbitrary functions of time (and space) respectively. The physical meaning of
this distinction can be understood through the associated properties that are
deemed irrelevant. A global symmetry reflects the irrelevance of absolute values
of a certain quantity: only relative values are relevant. So in Newtonian mechanics, for example, spatial translation invariance holds and absolute position
is irrelevant to the behaviour of systems.76 Only relative positions matter, and
this is reflected in the structure of the theory through the equations being invariant under global spatial translations – the equations do not depend upon,
or invoke, a background structure of absolute positions.
A global symmetry is a special case of a local symmetry. A local symmetry
reflects the irrelevance not only of absolute values, but furthermore of relative
values specified at-a-distance: only local relative values (i.e. relative values
specified at a point) are relevant. This is reflected in the structure of the theory
by the equations of motion not depending upon some background structure
that determines relative values at-a-distance (i.e. there is no global background
structure associated with the property in question).77
8.3 Roles of symmetries
The various different roles in which symmetries are invoked in physics have
become much more evident with the advent of quantum theory.78 Nevertheless, already with the classification of crystals using their remarkable and varied
symmetry properties, we see the powerful classificatory role at work. Indeed, it
75 For an analysis of the connection between symmetry, equivalence and irrelevance, see
Castellani (2003).
76 We are considering here Newtonian mechanics, without Newton’s absolute space.
77 Instead, we require the explicit appearance of a connection in our theory, which provides
the rules by which two distant objects may be brought together so that comparisons between
them may be made locally.
78 The application of the theory of groups and their representations for the exploitation
of symmetries in the quantum mechanics of the 1920s represents a dramatic step-change
in the significance of symmetries in physics, with respect to both the foundations and the
phenomenological interpretation of the theory. As Wigner emphasized on many occasions,
one essential reason for the ‘increased effectiveness of invariance principles in quantum theory’
(Wigner, 1967, p. 47) is the linear nature of the state space of a quantum physical system,
corresponding to the possibility of superposing quantum states. For details on the application
of symmetries in quantum physics we refer the reader to Harvey (this vol., ch. 14). For
philosophical discussions see Brading and Castellani (2003).
was with René-Just Haüy’s use of symmetries in this way that crystallography
emerged in 1801 as a discipline distinct from mineralogy.79 Furthermore, the
heuristic and/or normative role is clear for the principle of relativity in the construction of both Special and General Relativity (see above, Section ??). The
unificatory role, so prominent now in the attempts to unify the fundamental
forces, was already present (although differing methodologically somewhat) in
Hilbert’s attempt to construct a generally covariant theory of gravitation and
electromagnetism (see Sauer, 1999) and in Weyl’s 1918 unified theory of gravitation and electromagnetism, for example. Symmetries may also be invoked in a
variety of explanatory roles. For example, on the basis of Noether’s first theorem
(see Section ??) we might say that it is because of the translational symmetry of
classical mechanics (plus satisfaction of other conditions) that linear momentum
is conserved in that theory. Another example would be an appeal to symmetry
principles as an explanation, via Wigner’s hierarchy, for (i) aspects of the form
of the laws, and thereby (ii) why certain events occur and others do not.
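The explanatory link between translational symmetry and conservation of linear momentum can be illustrated numerically. The following is a rough sketch under our own assumptions (two unit-mass particles in one dimension, a spring interaction, a simple kick-then-drift integrator, and hypothetical names such as simulate and external_k); it is an illustration of the connection, not a proof of Noether's first theorem.

```python
# Illustrative sketch only: the system, integrator and names are assumptions.

def simulate(external_k, steps=1000, dt=0.01):
    """Integrate two unit-mass particles and return (initial, final) total momentum.

    external_k = 0 gives a translation-invariant system (spring only).
    external_k > 0 adds a force -external_k * x tied to an absolute origin,
    which breaks the translational symmetry.
    """
    x = [0.0, 1.5]
    p = [0.3, -0.1]
    k, rest_length = 1.0, 1.0

    def forces(x):
        stretch = (x[1] - x[0]) - rest_length
        internal = k * stretch
        return [internal - external_k * x[0],
                -internal - external_k * x[1]]

    p_initial = p[0] + p[1]
    for _ in range(steps):
        f = forces(x)
        p = [p[i] + dt * f[i] for i in range(2)]   # update momenta
        x = [x[i] + dt * p[i] for i in range(2)]   # update positions (unit masses)
    return p_initial, p[0] + p[1]


free_before, free_after = simulate(external_k=0.0)
pinned_before, pinned_after = simulate(external_k=0.5)

print("translation-invariant:", free_before, free_after)
print("symmetry broken:      ", pinned_before, pinned_after)
```

In the translation-invariant case the internal forces cancel pairwise and the total momentum is constant up to rounding error. Once a force tied to an absolute position is added, the total momentum changes over time.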
8.4
Status of symmetries
Are symmetries ontological, epistemological, or methodological in status? It is
clear that symmetries have an important heuristic function, as discussed above
(Section ??) in the context of relativity. This indicates a methodological status, something that becomes further developed within the context of quantum
theory. We can also ask whether we should attribute an ontological or epistemological status to symmetries.
According to an ontological viewpoint, symmetries are seen as “existing in
nature”, or characterizing the structure of the physical world. One reason for
attributing symmetries to nature is the so-called geometrical interpretation of
spatiotemporal symmetries, according to which the spatiotemporal symmetries
of physical laws are interpreted as symmetries of spacetime itself, the “geometrical structure” of the physical world. Moreover, this way of seeing symmetries
can be extended to non-external symmetries, by considering them as properties
of other kinds of spaces, usually known as “internal spaces”.80 The question of
exactly what a realist would be committed to on such a view of internal spaces
remains open, and an interesting topic for discussion.
One approach to investigating the limits of an ontological stance with respect to symmetries would be to investigate their empirical or observational
status: can the symmetries in question be directly observed? We first have to
address what it means for a symmetry to be observable, and indeed whether
all symmetries have the same observational status. Kosso (2000) arrives at the
conclusion that there are important differences in the empirical status of the different kinds of symmetries. In particular, while global continuous symmetries
can be directly observed – via such experiments as the Galilean ship experiment
– a local continuous symmetry can have only indirect empirical evidence.81
79 The use of discrete symmetries in crystallography continued through the nineteenth century in the work of J. F. Hessel and A. Bravais, leading to the 32 point transformation crystal
classes and the 14 Bravais lattices. These were combined into the 230 space groups in the
1890s by E. S. Fedorov, A. Schönflies, and W. Barlow. The theory of discrete groups continues
to be important in such fields as solid state physics, chemistry, and materials science.
80 See Section 4.2, above, for the varieties of symmetry.
The direct observational status of the familiar global spacetime symmetries
leads us to an epistemological aspect of symmetries. According to Wigner, the
spatiotemporal invariance principles play the role of a prerequisite for the very
possibility of discovering the laws of nature: ‘if the correlations between events
changed from day to day, and would be different for different points of space,
it would be impossible to discover them’ (Wigner, 1967). For Wigner, this
conception of symmetry principles is essentially related to our ignorance (if we
could directly know all the laws of nature, we would not need to use symmetry
principles in our search for them). Such a view might be given a methodological
interpretation, according to which such spatiotemporal regularities are presupposed in order for the enterprise of discovering the laws of physics to get off the
ground.82 Others have arrived at a view according to which symmetry principles
function as “transcendental principles” in the Kantian sense (see for instance
Mainzer, 1996). It should be noted in this regard that Wigner’s starting point,
as quoted above, does not imply exact symmetries – all that is needed epistemologically (or methodologically) is that the global symmetries hold approximately, for suitable spatiotemporal regions, so that there is sufficient stability
and regularity in the events for the laws of nature to be discovered.
As this discussion, and that of the preceding Subsections, indicate, the differences between various types of symmetry become important before we have
ventured very far into interpretational issues. For this reason, much recent work
on the interpretation of symmetry in physical theory has focussed not on general questions, such as those sketched above, but on addressing interpretational
questions specific to particular symmetries.83
8.5
Symmetries, objectivity, and objects
Turning now to the issue of the structures left invariant by symmetry transformations, the old and natural idea that what is objective should not depend
upon the particular perspective under which it is taken into consideration is
reformulated in the following group theoretical terms: what is objective is what
is invariant with respect to the relevant transformation group. This connection
between symmetries and objectivity has a long history, going
back at least to the early twentieth century. It was highlighted by Weyl (1952),
81 See Section 6.1, above; and Brading and Brown (2003b), who argue for a different interpretation of Kosso’s examples.
82 We are grateful to Brandon Fogel for this point, and for the comparison he suggested
between this view of spatiotemporal symmetries and the methodological face of Einstein’s
notion of separability.
83 These include the varieties of gauge invariance found in classical electromagnetism and in
quantum theories, along with general covariance in GTR (these being continuous symmetries),
plus the discrete symmetries of parity (violated in the weak interaction) and permutation
invariance, both of which are found in classical theory but require reconsideration in the light
of quantum theory. See Brading and Castellani (2003).
where he writes that ‘We found that objectivity means invariance with respect
to the group of automorphisms.’ This connection between objectivity and invariance was discussed particularly in the context of Relativity Theory, both
Special and General. We recall Minkowski’s famous phrase ([1908] 1923, p. 75)
that ‘Henceforth space by itself, and time by itself, are doomed to fade away into
mere shadows, and only a kind of union of the two will preserve an independent
reality’, following his geometrization of Einstein’s Special Theory of Relativity,
and the recognition of the spacetime interval (rather than intervals of space and
of time) as the geometrically invariant quantity. The connection between objectivity and invariance in General Relativity was discussed by, amongst others,
Hilbert and Weyl, and continues to be an issue today.84
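The invariance of the spacetime interval, as opposed to intervals of space and of time taken separately, can be checked with a short numerical sketch. The function names, the choice of units with c = 1, the sign convention (ct)^2 - x^2 and the numerical values are our own assumptions, introduced purely for illustration.

```python
import math

# Illustrative sketch only: names, units (c = 1) and sign convention are assumptions.

def boost(t, x, v, c=1.0):
    """Lorentz boost with velocity v along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t_new = gamma * (t - v * x / c ** 2)
    x_new = gamma * (x - v * t)
    return t_new, x_new


def interval_squared(t, x, c=1.0):
    """Squared spacetime interval between the origin and the event (t, x)."""
    return (c * t) ** 2 - x ** 2


t, x = 5.0, 3.0
t_boosted, x_boosted = boost(t, x, v=0.6)

# The time and space coordinates each change under the boost,
# while the combination (ct)^2 - x^2 agrees up to rounding error.
print(t, x, interval_squared(t, x))
print(t_boosted, x_boosted, interval_squared(t_boosted, x_boosted))
```

The time separation and the space separation differ between the two frames, whereas the interval is the same in both: this is the sense in which only 'a kind of union of the two' is invariant with respect to the relevant transformation group.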
Related to this is the use of symmetries to characterize the objects of physics
as sets of invariants. Originally developed in the context of quantum theory,
this approach can also be applied in classical physics.85 The basic idea is that
the invariant quantities – such as mass and charge – are those by which we
characterize objects. Thus, through the application of group theory we can use
symmetry considerations to determine the invariant quantities and “construct”
or “constitute” objects as sets of these invariants.86
In conclusion, then, the philosophical questions associated with symmetries
in classical physics are wide-ranging. What we have offered here is nothing more
than an overview, influenced by our own interests and puzzles, which we hope
will be of service in further explorations of this philosophically and physically
rich field.
Acknowledgements – We are grateful to the editors, Jeremy Butterfield and John
Earman, for their encouragement and detailed comments. We would also like to
thank Brandon Fogel, Brian Pitts, and Thomas A. Ryckman for their comments
and suggestions.
9
References
R. Adler, M. Bazin, and M. Schiffer (1975), Introduction to General Relativity,
2nd edition, McGraw-Hill Book Company, New York, St Louis, San Francisco,
Toronto, London, Sydney.
J. L. Anderson (1967), Principles of Relativity Physics, New York and London: Academic Press.
B. M. Barbashov and V. V. Nesterenko (1983), ‘Continuous symmetries in
field theory’, Fortschr. Phys. 31, pp. 535-67.
84 We saw above (Sections 6.2 and 6.3) some aspects of this debate in the discussion of
Einstein’s ‘hole argument’ and of the status of observables in GTR.
85 See Max Born, reprinted in Castellani (1998).
86 For further discussion see Castellani (1998), part II.
J. B. Barbour (1989), Absolute or relative motion? Vol. 1 The discovery of
dynamics, Oxford University Press.
J. B. Barbour (2005), Absolute or relative motion? Vol. 2 The deep structure
of general relativity, Oxford University Press.
G. Belot (1998), ‘Understanding Electromagnetism’, British Journal for the
Philosophy of Science 49, pp. 531-55.
G. Belot (this volume), ‘Geometric Mechanics’.
K. Brading (2005), ‘A note on general relativity, energy conservation, and
Noether’s theorems’, in The Universe of General Relativity (Einstein Studies),
ed. J. Eisenstaedt and A. J. Kox, Birkhäuser.
K. Brading and H. R. Brown (2002), ‘General covariance from the perspective of Noether’s theorems’, Diálogos, pp. 59-86.
K. Brading and H. R. Brown (2003), ‘Symmetries and Noether’s theorems’,
in K. Brading and E. Castellani (eds.) (2003), pp. 89-109.
K. Brading and H. R. Brown (2004), ‘Are gauge symmetry transformations
observable?’, British Journal for the Philosophy of Science 55, pp. 645-65.
K. Brading and H. R. Brown (2007), ‘Noether’s theorems, gauge symmetries
and general relativity’, manuscript.
K. Brading and E. Castellani (eds.) (2003), Symmetry in Physics: Philosophical Reflections, Cambridge University Press.
K. Brading and E. Castellani (2006), ‘Curie’s Principle, Encore’, in preparation.
K. Brading and T. A. Ryckman (2007), ‘Hilbert’s axiomatic method and the
foundations of physics: an interpretation of generally covariant physics and a
revision of Kant’s epistemology’, manuscript.
H. Brown (2006), Physical relativity: spacetime structure from a dynamical
perspective, Oxford University Press.
J. Butterfield (2004), ‘Between Laws and Models: Some Philosophical Morals
of Lagrangian Mechanics’; available at Los Alamos arXiv: http://arxiv.org/abs/physics/0409030;
and at Pittsburgh archive: http://philsci-archive.pitt.edu/archive/00001937/.
J. Butterfield (2006), ‘On Symmetry and Conserved Quantities in Classical
Mechanics’, forthcoming in a Festschrift for Jeffrey Bub, ed. W. Demopoulos
and I. Pitowsky, Kluwer: University of Western Ontario Series in Philosophy of
Science; available at Los Alamos arXiv: http://arxiv.org/abs/physics/; and at
Pittsburgh archive: http://philsci-archive.pitt.edu/archive/00002362/.
J. Butterfield (this volume), ‘On Symplectic Reduction in Classical Mechanics’.
E. Castellani (ed.) (1998), Interpreting Bodies. Classical and Quantum
Objects in Modern Physics, Princeton University Press.
E. Castellani (2003), ‘Symmetry and Equivalence’, in K. Brading and E.
Castellani (eds.) (2003), pp. 321-334.
A. F. Chalmers (1970), ‘Curie’s Principle’, The British Journal for the Philosophy of Science 21, pp. 133-148.
L. Corry (2004), David Hilbert and the Axiomatization of Physics (1898-1918), Dordrecht: Kluwer Academic.
P. Curie (1894), ‘Sur la symétrie dans les phénomènes physiques. Symétrie
d’un champ électrique et d’un champ magnétique’, Journal de Physique, 3e série,
vol. 3, pp. 393-417.
P. Curie (1981), ‘On symmetry in physical phenomena’, trans. J. Rosen and
P. Copié, Am. J. Phys. 49(4), pp. 17-25.
R. Descartes (1998), The World and other writings, ed. and trans. S.
Gaukroger, Cambridge University Press.
N. Doughty (1990), Lagrangian interaction, Addison-Wesley.
J. Earman (2004), ‘Curie’s Principle and Spontaneous Symmetry Breaking’,
International Studies in Philosophy of Science 18, pp. 173-199.
J. Earman (2006), ‘Two Challenges to the Substantive Requirement of General Covariance’, Synthese, in press.
J. Earman (this volume), ‘Determinism in Modern Physics’.
J. Harvey (this volume), ‘Symmetries and Invariances in Quantum Physics’.
T. Hawkins (2000), Emergence of the Theory of Lie Groups: an essay in the
history of mathematics 1869-1926, New York: Springer.
D. Hilbert (1915), ‘Die Grundlagen der Physik (Erste Mitteilung)’, Nachrichten
von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-physikalische
Klasse, pp. 395-407.
G. ’t Hooft (this volume), ‘Conceptual Basis of Quantum Field Theory’.
D. Howard (2007), ‘ ‘And I Shall not Mingle Conjectures and Certainties’:
The Roots of Einstein’s Principle Theories/Constructive Theories Distinction’,
manuscript.
J. Ismael (1997), ‘Curie’s Principle’, Synthese 110, pp. 167-190.
M. Janssen (2005), ‘Of pots and holes: Einstein’s bumpy road to general
relativity’, Ann. Phys. 14, Supplement, pp. 58-85.
B. Julia and S. Silva (1998), ‘Current and superpotentials in classical gauge
invariant theories I. Local results with applications to perfect fluids and general
relativity’, gr-qc/9804029 v2.
F. Klein (1918), ‘Über die Differentialgesetze für die Erhaltung von Impuls
und Energie in der Einsteinschen Gravitationstheorie’, Königliche Gesellschaft
der Wissenschaften zu Göttingen, Mathematisch-physikalische Klasse, Nachrichten,
pp. 469-92.
P. Kosso (2000), ‘The empirical status of symmetries in physics’, British
Journal for the Philosophy of Science 51, pp. 81-98.
C. Lanczos ([1949], 1962), The Variational Principles of Mechanics, University of Toronto Press.
K. Mainzer (1996), Symmetries of Nature. A Handbook, New York: de
Gruyter.
D. Malament (this volume), ‘Special and General Relativity Theory’.
A. I. Miller (1981), Albert Einstein’s Special Theory of Relativity, Addison-Wesley Publishing Company.
H. Minkowski ([1908], 1923), ‘Space and Time’, in The Principle of Relativity.
A collection of original memoirs on the special and general theory of relativity,
New York: Dover, pp. 75-91.
C. W. Misner, K. S. Thorne and J. A. Wheeler (1970), Gravitation, New
York: W. H. Freeman and Company.
E. Noether (1971), ‘Noether’s theorem’, Transport Theory and Statistical
Physics 1, 183-207.
J. Norton (1984), ‘How Einstein found his field equations: 1912-1915’, Hist.
Stud. Phys. Sci. 14, pp. 253-316.
J. Norton (1993), ‘General covariance and the foundations of general relativity: eight decades of dispute’, Rep. Prog. Phys. 56, pp. 791-858.
J. Norton (2003), ‘General covariance, gauge theories, and the Kretschmann
objection’, in K. Brading and E. Castellani (eds.) (2003), pp. 110-123.
H. C. Ohanian and R. Ruffini (1994), Gravitation and Spacetime, 2nd edition.
London, New York: W. W. Norton and Company.
P. J. Olver (1993), Applications of Lie Groups to Differential Equations, 2nd
edition, New York: Springer.
L. O’Raifeartaigh (1997), The Dawning of Gauge Theory, Princeton University Press.
J. B. Pitts (2006), ‘Absolute objects and counterexamples: Jones-Geroch
Dust, Torretti Constant Curvature, Tetrad-Spinor, and Scalar Density’, Studies
in History and Philosophy of Modern Physics, forthcoming. Manuscript available at Los Alamos arXiv: http://arxiv.org/abs/gr-qc/0506102/.
O. Pooley (2003), ‘Handedness, parity violation, and the reality of space’,
in K. Brading and E. Castellani (eds.) (2003), pp. 250-280.
L. A. Radicati (1987), ‘Remarks on the early developments of the notion
of symmetry breaking’, in M.G. Doncel, A. Hermann, L. Michel, and A. Pais
(eds.), Symmetries in physics (1600-1980), Barcelona: Servei de Publicaciones,
pp. 195-206.
C. Rovelli (this volume), ‘Quantum Gravity’.
T. Ryckman (2005), The Reign of Relativity, Oxford University Press.
T. Sauer (1999), ‘The relativity of discovery: Hilbert’s first note on the
foundations of physics’, Arch. Hist. Exact Sci. 53, pp. 529-575.
A. V. Shubnikov and V. A. Koptsik (1974), Symmetry in Science and Art,
London: Plenum Press.
J. Stachel (1993), ‘The meaning of general covariance: the hole story’, in
Philosophical Problems of the Internal and External World: Essays on the Philosophy of Adolf Grünbaum, ed. J. Earman et al., University of Pittsburgh
Press, pp. 129-160.
R. Torretti (1983), Relativity and Geometry, Oxford: Pergamon Press.
A. J. Trautman (1962), ‘Conservation laws in general relativity’, in Gravitation: an introduction to current research, ed. L. Witten, New York: John
Wiley and Sons, pp. 169-98.
R. Utiyama (1956), ‘Invariant theoretical interpretation of interaction’, Physical Review 101, pp. 1597-607.
R. Utiyama (1959), ‘Theory of invariant variation in the generalized canonical dynamics’, Prog. Theor. Phys. Suppl. 9, pp. 19-44.
R. Wald (1984), General Relativity, University of Chicago Press.
H. Weyl (1922), Space, Time, Matter, reprinted by Dover (1952).
H. Weyl (1952), Symmetry, Princeton University Press.
E. T. Whittaker ([1904] 1989), A Treatise on the Analytical Dynamics of
Particles and Rigid Bodies, Cambridge University Press.
E. P. Wigner (1967), Symmetries and Reflections, Bloomington: Indiana
University Press.
I. M. Yaglom (1988), Felix Klein and Sophus Lie. Evolution of the Idea of
Symmetry in the Nineteenth Century, Boston-Basel: Birkhäuser.