Differential Gometry and General Relativity http://people.hofstra.edu/stefan_waner/diff_geom/...

Introduction to Differential Geometry and General Relativity
Lecture Notes by Stefan Waner,
Department of Mathematics, Hofstra University

These notes are dedicated to the memory of Hanno Rund.

TABLE OF CONTENTS
1. Preliminaries: Distance, Open Sets, Parametric Surfaces and Smooth Functions
2. Smooth Manifolds and Scalar Fields
3. Tangent Vectors and the Tangent Space
4. Contravariant and Covariant Vector Fields
5. Tensor Fields
6. Riemannian Manifolds
7. Locally Minkowskian Manifolds: A Little Relativity
8. Covariant Differentiation
9. Geodesics and Local Inertial Frames
10. The Riemann Curvature Tensor
11. A Little More Relativity: Comoving Frames and Proper Time
12. The Stress Tensor and the Relativistic Stress-Energy Tensor
13. Three Basic Premises of General Relativity
14. The Einstein Field Equations and Derivation of Newton's Law
15. The Schwarzschild Metric and Event Horizons
16. White Dwarfs, Neutron Stars and Black Holes by Gregory C. Levine


References and Suggested Further Reading

(Listed in the rough order reflecting the degree to which they were used)

Bernard F. Schutz, A First Course in General Relativity (Cambridge University Press, 1986)
David Lovelock and Hanno Rund, Tensors, Differential Forms, and Variational Principles (Dover, 1989)
Charles E. Weatherburn, An Introduction to Riemannian Geometry and the Tensor Calculus (Cambridge University Press, 1963)
Charles W. Misner, Kip S. Thorne and John A. Wheeler, Gravitation (W.H. Freeman, 1973)
Keith R. Symon, Mechanics (3rd ed., Addison-Wesley)

Further Reading on the Web


For a comprehensive catalog of internet sites on special and general
relativity, visit Relativity on the Web.

Last Updated: January, 2002


Copyright © Stefan Waner
Distance and Open Sets http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 1: Distance, Open Sets, Curves, and Surfaces

Here, we do just enough topology to be able to talk about smooth manifolds. We begin with n-dimensional Euclidean space

\[ E_n = \{(y_1, y_2, \ldots, y_n) \mid y_i \in \mathbb{R}\}. \]

($\mathbb{R}$ is the set of real numbers.) Thus, $E_1$ is just the real line, $E_2$ is the Euclidean plane, and $E_3$ is 3-dimensional Euclidean space.

The magnitude, or norm, $\|y\|$ of $y = (y_1, y_2, \ldots, y_n)$ in $E_n$ is defined to be

\[ \|y\| = \left(y_1^2 + y_2^2 + \cdots + y_n^2\right)^{1/2}, \]

which we think of as its distance from the origin.

The distance between two points $y = (y_1, y_2, \ldots, y_n)$ and $z = (z_1, z_2, \ldots, z_n)$ in $E_n$ is defined as $\|z - y\|$:

Distance Formula

\[ \text{Distance between } y \text{ and } z = \|z - y\| = \left((z_1 - y_1)^2 + (z_2 - y_2)^2 + \cdots + (z_n - y_n)^2\right)^{1/2} \]

The properties of the norm are summed up in the following result.

Proposition 1.1 (Properties of the norm)

The norm satisfies the following:

(a) $\|y\| \ge 0$, and $\|y\| = 0$ if and only if $y = 0$ (positive definite)
(b) $\|cy\| = |c|\,\|y\|$ for every $c \in \mathbb{R}$ and $y \in E_n$.
(c) $\|y + z\| \le \|y\| + \|z\|$ for every $y, z \in E_n$ (triangle inequality 1)
(d) $\|y - z\| \le \|y - w\| + \|w - z\|$ for every $y, z, w \in E_n$ (triangle inequality 2)

Why are they called "triangle inequalities"?

The proof of Proposition 1.1 is an exercise which may require reference to a linear algebra text (see "inner products").
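The norm and the distance formula translate directly into a few lines of code. The following is a minimal sketch using only the Python standard library; the function names `norm` and `distance` and the sample points are illustrative choices, not part of the notes. It also checks triangle inequality 2 for one triple of points, which is of course a spot check and not a proof.

```python
import math

def norm(y):
    """Euclidean norm ||y|| = (y_1^2 + ... + y_n^2)^(1/2)."""
    return math.sqrt(sum(c * c for c in y))

def distance(y, z):
    """Distance between y and z, defined as ||z - y||."""
    return norm([zc - yc for yc, zc in zip(y, z)])

y = (3.0, 4.0)
z = (0.0, 0.0)
w = (1.0, 1.0)

print(norm(y))          # 5.0
print(distance(y, z))   # 5.0
# Triangle inequality 2 from Proposition 1.1(d), checked for one triple of points:
print(distance(y, z) <= distance(y, w) + distance(w, z))  # True
```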


Definition 1.2 A subset $U$ of $E_n$ is called open if, for every $y$ in $U$, all points of $E_n$ within some positive distance $r$ of $y$ are also in $U$. (The size of $r$ may depend on the point $y$ chosen.)

Intuitively, an open set is a solid region minus its boundary. If we include the boundary, we get a closed set, which formally is defined as the complement of an open set.

Examples 1.3

(a) If $a \in E_n$, then the open ball with center $a$ and radius $r$ is the set of all points in $E_n$ whose distance from $a$ is less than $r$:

\[ B(a, r) = \{x \in E_n \mid \|x - a\| < r\}. \]

Open balls are open sets: if $x \in B(a, r)$, then, with $s = r - \|x - a\|$, one has $B(x, s) \subset B(a, r)$; that is, all points within a distance $s$ of $x$ are still inside $B(a, r)$.

(b) $E_n$ is open.

(c) ∅ is open.

(d) Unions of open sets are open.

(e) Open sets are unions of open balls. (Why is that?)
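The claim in Example 1.3(a), that the ball of radius s = r - ||x - a|| around x stays inside B(a, r), can be spot-checked numerically. This is only a sketch in the plane with randomly sampled points; the helper names and the particular center, radius and point are assumptions made for illustration.

```python
import math
import random

def dist(p, q):
    return math.sqrt(sum((pc - qc) ** 2 for pc, qc in zip(p, q)))

def in_ball(x, a, r):
    """True if x lies in the open ball B(a, r)."""
    return dist(x, a) < r

a, r = (0.0, 0.0), 1.0
x = (0.6, 0.3)                 # a point of B(a, r)
s = r - dist(x, a)             # radius used in Example 1.3(a)

random.seed(0)
all_inside = True
for _ in range(10_000):
    # Draw a random point p of B(x, s) using a random direction and length.
    angle = random.uniform(0.0, 2.0 * math.pi)
    length = s * random.random() * 0.999
    p = (x[0] + length * math.cos(angle), x[1] + length * math.sin(angle))
    all_inside = all_inside and in_ball(p, a, r)

print(all_inside)              # True
```

The underlying reason is triangle inequality 2: any p within distance s of x satisfies ||p - a|| <= ||p - x|| + ||x - a|| < s + ||x - a|| = r.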

Definition 1.4 Now let $M \subset E_s$. A subset $U \subset M$ is called open in $M$ (or relatively open) if, for every $y$ in $U$, all points of $M$ within some positive distance $r$ of $y$ are also in $U$.

In the following diagram, $M$ is the hemisphere, $E_s$ is three-dimensional space (yellow) and $U$ is the small "patch" on $M$ (excluding its boundary). Notice that $U$ is not open in $E_s$, since there are points in $E_s$ arbitrarily close to $U$ that lie outside $U$. However, it is open in $M$, since given any point $y$ in $U$, all points of $M$ within a small enough distance from $y$ are still in $U$.

Examples 1.5

(a) Open balls in $M$. If $M \subset E_s$, $m \in M$, and $r > 0$, define

\[ B_M(m, r) = \{x \in M \mid \|x - m\| < r\}. \]

For example, if $M$ is the surface of the earth, $m$ is the center of Honolulu and $r = 100$ miles, then $B_M(m, r)$ consists of all points on the surface of the earth less than 100 miles from central Honolulu. However, points above the surface, even one inch above central Honolulu, are not in $B_M(m, r)$.

Notice that

\[ B_M(m, r) = B(m, r) \cap M, \]

and so $B_M(m, r)$ is open in $M$.

(b) M is open in M.

(c) ∅ is open in M.

(d) Unions of open sets in M are open in M.

(e) Open sets in M are unions of open balls in M.

Parametric Paths and Surfaces in $E_3$

From now on, the three coordinates of 3-space will be referred to as $y_1$, $y_2$, and $y_3$.

Definition 1.6 A smooth path in $E_3$ is a set of three smooth (infinitely differentiable) real-valued functions of a single real variable $t$:

\[ y_1 = y_1(t), \quad y_2 = y_2(t), \quad y_3 = y_3(t). \]

The variable $t$ is called the parameter of the curve. If the vector $(dy_1/dt, dy_2/dt, dy_3/dt)$ is nowhere zero, we speak of a non-singular path.

Notes

(a) Instead of writing $y_1 = y_1(t)$, $y_2 = y_2(t)$, $y_3 = y_3(t)$, we shall simply write $y_i = y_i(t)$.

(b) Since there is nothing special about three dimensions, we define a smooth path in $E_n$ in exactly the same way: as a collection of smooth functions $y_i = y_i(t)$, where this time $i$ goes from 1 to $n$.
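To make the non-singularity condition of Definition 1.6 concrete, here is a small sketch that estimates the velocity vector of a path by central differences. Both paths are assumptions chosen for illustration: a helix with an assumed parametrization (cos t, sin t, t/2), and a cusp (t^2, t^3, 0) whose velocity vanishes at t = 0, so it is smooth but not non-singular.

```python
import math

def velocity(path, t, h=1e-6):
    """Central-difference approximation to (dy_1/dt, ..., dy_n/dt)."""
    forward, backward = path(t + h), path(t - h)
    return tuple((f - b) / (2.0 * h) for f, b in zip(forward, backward))

def speed(path, t):
    return math.sqrt(sum(c * c for c in velocity(path, t)))

def helix(t):
    # Assumed parametrization; the exact speed is sqrt(1 + 0.25) at every t.
    return (math.cos(t), math.sin(t), 0.5 * t)

def cusp(t):
    # Smooth, but its velocity (2t, 3t^2, 0) vanishes at t = 0.
    return (t ** 2, t ** 3, 0.0)

print(min(speed(helix, 0.1 * k) for k in range(100)))  # close to sqrt(1.25)
print(speed(cusp, 0.0))                                 # close to 0
```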

Examples 1.7


(a) Straight lines in E3

(b) Helix in E3

Definition 1.8 A smooth surface immersed in $E_3$ is a collection of three smooth real-valued functions of two variables $x^1$ and $x^2$ (notice that $x$ finally makes a debut):

\[ y_1 = y_1(x^1, x^2), \quad y_2 = y_2(x^1, x^2), \quad y_3 = y_3(x^1, x^2), \]

or just

\[ y_i = y_i(x^1, x^2) \quad (i = 1, 2, 3). \]

Note that holding $x^1$ constant gives a smooth path, with different constants yielding different paths. Similarly, holding $x^2$ constant gives another batch of paths that intersect the first ones. (See the picture.)

We also require that the $3 \times 2$ matrix whose $ij$ entry is $\partial y_i/\partial x^j$ has rank two. We call $x^1$ and $x^2$ the parameters or local coordinates.

Examples 1.9

(a) Planes in $E_3$
We can parametrize the plane through the point $(p_1, p_2, p_3)$ and parallel to the (independent) vectors $(a_1, a_2, a_3)$, $(b_1, b_2, b_3)$ by

\[ \begin{aligned} y_1 &= p_1 + a_1x^1 + b_1x^2 \\ y_2 &= p_2 + a_2x^1 + b_2x^2 \\ y_3 &= p_3 + a_3x^1 + b_3x^2 \end{aligned} \]

or simply

\[ y_i = p_i + a_ix^1 + b_ix^2 \quad (i = 1, 2, 3). \]

(b) The paraboloid $y_3 = y_1^2 + y_2^2$ can be parametrized by setting

\[ \begin{aligned} y_1 &= x^1 \\ y_2 &= x^2 \\ y_3 &= (x^1)^2 + (x^2)^2 \end{aligned} \]

Note $(x^2)^2$ means the coordinate $x^2$ squared, and not $x^4$. (Yes, I know the notation is strange, but that's the tradition...)

(c) The unit sphere $y_1^2 + y_2^2 + y_3^2 = 1$, using spherical polar coordinates:

\[ \begin{aligned} y_1 &= \sin(x^1)\cos(x^2) \\ y_2 &= \sin(x^1)\sin(x^2) \\ y_3 &= \cos(x^1) \end{aligned} \]

$x^1$ and $x^2$ are the usual polar coordinates (the angles shown in the figure).

(d) The ellipsoid

\[ \frac{y_1^2}{a^2} + \frac{y_2^2}{b^2} + \frac{y_3^2}{c^2} = 1, \]

where $a$, $b$ and $c$ are positive constants, can be parametrized using similar polar coordinates:

\[ \begin{aligned} y_1 &= a \sin(x^1)\cos(x^2) \\ y_2 &= b \sin(x^1)\sin(x^2) \\ y_3 &= c \cos(x^1) \end{aligned} \]

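(e) The Jacobian matrix for spherical polar coordinates (Example (c)) is the matrix

\[ J = \begin{pmatrix} \dfrac{\partial y_1}{\partial x^1} & \dfrac{\partial y_2}{\partial x^1} & \dfrac{\partial y_3}{\partial x^1} \\[2ex] \dfrac{\partial y_1}{\partial x^2} & \dfrac{\partial y_2}{\partial x^2} & \dfrac{\partial y_3}{\partial x^2} \end{pmatrix} = \begin{pmatrix} \cos x^1 \cos x^2 & \cos x^1 \sin x^2 & -\sin x^1 \\ -\sin x^1 \sin x^2 & \sin x^1 \cos x^2 & 0 \end{pmatrix} \]

Exercise Show that $J$ has rank 2 everywhere except $x^1 = n\pi$ ($n$ an integer).

(f) The torus with radii $a > b$:

\[ \begin{aligned} y_1 &= (a + b\cos x^2)\cos x^1 \\ y_2 &= (a + b\cos x^2)\sin x^1 \\ y_3 &= b\sin x^2 \end{aligned} \]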
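The rank condition on the Jacobian matrix in Example (e) can be explored numerically before attempting the exercise. The sketch below is one possible check, not the intended proof: it uses the fact that two vectors in 3-space are linearly independent exactly when their cross product is nonzero. The function names and the tolerance are assumptions.

```python
import math

def jacobian_rows(x1, x2):
    """Rows of J from Example 1.9(e): derivatives with respect to x1, then x2."""
    row1 = (math.cos(x1) * math.cos(x2), math.cos(x1) * math.sin(x2), -math.sin(x1))
    row2 = (-math.sin(x1) * math.sin(x2), math.sin(x1) * math.cos(x2), 0.0)
    return row1, row2

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def has_rank_two(x1, x2, tol=1e-12):
    """J has rank 2 exactly when the cross product of its rows is nonzero."""
    row1, row2 = jacobian_rows(x1, x2)
    return math.sqrt(sum(c * c for c in cross(row1, row2))) > tol

print(has_rank_two(1.0, 2.0))        # True: a generic point
print(has_rank_two(0.0, 2.0))        # False: x1 = 0 is a pole
print(has_rank_two(math.pi, 2.0))    # False: x1 = pi is the other pole
```

The length of the cross product works out to |sin x1|, which is why a small tolerance is needed at x1 = pi, where floating-point sin(pi) is tiny but not exactly zero.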

Question The parametric equations of a surface show us how to obtain a point on the surface once we know the two local coordinates (parameters). In other words, we have specified a function $E_2 \to E_3$. How do we obtain the local coordinates from the Cartesian coordinates $y_1$, $y_2$, $y_3$?

Answer We need to solve for the local coordinates $x^i$ as functions of $y_j$. For instance, in the case of a sphere, we get

\[ x^1 = \cos^{-1}(y_3), \qquad x^2 = \begin{cases} \cos^{-1}\!\left(\dfrac{y_1}{(y_1^2 + y_2^2)^{1/2}}\right) & \text{if } y_2 \ge 0 \\[2ex] 2\pi - \cos^{-1}\!\left(\dfrac{y_1}{(y_1^2 + y_2^2)^{1/2}}\right) & \text{if } y_2 < 0 \end{cases} \qquad (*) \]

This allows us to give each point on much of the sphere two unique coordinates, $x^1$ and $x^2$. There is a problem with continuity when $y_2 = 0$ (with $y_1 > 0$), since then $x^2$ switches from $0$ to $2\pi$. There is also a problem at the poles ($y_1 = y_2 = 0$), since then the above functions are not even defined. Thus, we restrict to the portion of the sphere given by

\[ 0 < x^1 < \pi, \quad 0 < x^2 < 2\pi, \]

which is an open subset $U$ of the sphere. (Think of it as the surface of the earth with the Greenwich Meridian removed.) We call $x^1$ and $x^2$ the coordinate functions. They are functions

\[ x^1: U \to E_1 \]

and

\[ x^2: U \to E_1. \]

We can put them together to obtain a single function $x: U \to E_2$ given by

\[ x(y_1, y_2, y_3) = (x^1(y_1, y_2, y_3),\; x^2(y_1, y_2, y_3)), \]

where $x^1$ and $x^2$ are the functions specified by the above formulas (*). We refer to this function as a chart.
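The relationship between the parametrization and the chart can be tested by a round trip: start with local coordinates, compute the point on the sphere, and then recover the local coordinates from formulas (*). The sketch below assumes the first coordinate lies strictly between 0 and pi and the second strictly between 0 and 2*pi; the function names are illustrative.

```python
import math

def parametrize(x1, x2):
    """Local coordinates -> ambient coordinates on the unit sphere."""
    return (math.sin(x1) * math.cos(x2),
            math.sin(x1) * math.sin(x2),
            math.cos(x1))

def chart(y1, y2, y3):
    """Ambient coordinates -> local coordinates, following formulas (*)."""
    x1 = math.acos(y3)
    ratio = y1 / math.sqrt(y1 * y1 + y2 * y2)
    ratio = max(-1.0, min(1.0, ratio))   # guard against rounding just outside [-1, 1]
    angle = math.acos(ratio)
    x2 = angle if y2 >= 0 else 2.0 * math.pi - angle
    return (x1, x2)

for x1, x2 in [(0.5, 1.0), (2.0, 4.0), (1.2, 6.0)]:
    y = parametrize(x1, x2)
    print((x1, x2), "->", chart(*y))     # recovers (x1, x2) up to rounding
```

Calling `chart` at a pole, where y1 = y2 = 0, divides by zero, which mirrors the remark that the formulas are not defined there.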

Definition 1.10 A chart of a surface $S$ is a pair of functions

\[ x = (x^1(y_1, y_2, y_3),\; x^2(y_1, y_2, y_3)) \]

which specify each of the local coordinates (parameters) $x^1$ and $x^2$ as smooth functions of a general point (global or ambient coordinates) $(y_1, y_2, y_3)$ on the surface.

Question Why are these functions called a chart?

Answer The chart above assigns to each point on the sphere (away from the
meridian) two coordinates. So, we can think of it as giving a two-dimensional
map of the surface of the sphere, just like a geographic chart.

Question Our chart for the sphere is very nice, but it only appears to chart a portion of the sphere. What about the missing meridian?

Answer We can use another chart to get those points by using a different parametrization that places the poles on the equator. (Diagram in class.)

In general, we chart an entire manifold $M$ by "covering" it with open sets $U$ which become the domains of coordinate charts.

Smooth Manifolds http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 2: Smooth Manifolds and Scalar Fields

We now generalize the ideas discussed in Lecture 1.

Definition 2.1 An open cover of $M \subset E_s$ is a collection $\{U_\alpha\}$ of open sets in $M$ such that $M = \bigcup_\alpha U_\alpha$.

Examples

(a) $E_s$ can be covered by open balls.

(b) $E_s$ can be covered by the single (open) set $E_s$.

(c) The unit sphere in $E_3$ can be covered by the collection $\{U_1, U_2\}$ where, among points $(y_1, y_2, y_3)$ of the sphere,

\[ U_1 = \{(y_1, y_2, y_3) \mid y_3 > -1/2\}, \qquad U_2 = \{(y_1, y_2, y_3) \mid y_3 < 1/2\}. \]

Definition 2.2 A subset $M$ of $E_s$ is called an n-dimensional smooth manifold if we are given a collection

\[ \{U_\alpha;\ x_\alpha^1, x_\alpha^2, \ldots, x_\alpha^n\} \]

where:

(a) The $U_\alpha$ form an open cover of $M$.

(b) Each $x_\alpha^r$ is a smooth real-valued function defined on $U_\alpha$ (that is, $x_\alpha^r: U_\alpha \to E_1$), called the $r$-th coordinate, such that the map $x: U_\alpha \to E_n$ given by

\[ x(u) = (x_\alpha^1(u), x_\alpha^2(u), \ldots, x_\alpha^n(u)) \]

is one-to-one. (That is, to each point in $U_\alpha$, we are assigned a unique set of $n$ coordinates.) The tuple $(U_\alpha; x_\alpha^1, x_\alpha^2, \ldots, x_\alpha^n)$ is called a local chart of $M$. The collection of all charts is called a smooth atlas of $M$. Further, $U_\alpha$ is called a coordinate neighborhood.

(c) If $(U, x^i)$ and $(V, \bar{x}^j)$ are two local charts of $M$, and if $U \cap V \ne \emptyset$, then we can write

\[ x^i = x^i(\bar{x}^j) \]

with inverse

\[ \bar{x}^k = \bar{x}^k(x^l) \]

for each $i$ and $k$, where all functions in sight are smooth. These functions are called the change-of-coordinates transformations.

By the way, we call the "big" space $E_s$ in which the manifold $M$ is embedded the ambient space.

Notes
1. Always think of the $x^i$ as the local coordinates (or parameters) of the manifold. We can parametrize each of the open sets $U$ by using the inverse function $x^{-1}$ of $x$, which assigns to each point in some open subset of $E_n$ a corresponding point in the manifold.
2. Condition (c) implies that

\[ \det\left(\frac{\partial \bar{x}^i}{\partial x^j}\right) \ne 0 \quad \text{and} \quad \det\left(\frac{\partial x^i}{\partial \bar{x}^j}\right) \ne 0, \]

since the associated matrices must be invertible.
3. The ambient space need not be present in the general theory of manifolds; that is, it is possible to define a smooth manifold $M$ without any reference to an ambient space at all; see any text on differential topology or differential geometry.
4. More terminology: We shall sometimes refer to the $x^i$ as the local coordinates, and to the $y_j$ as the ambient coordinates. Thus, a point in an n-dimensional manifold $M$ in $E_s$ has $n$ local coordinates, but $s$ ambient coordinates.

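Examples 2.3
(a) $E_n$ is an n-dimensional manifold, with the single identity chart defined by

\[ x^i(y_1, \ldots, y_n) = y_i. \]

(b) $S^1$, the unit circle, with the exponential map, is a 1-dimensional manifold. Here is a possible structure, with two charts as shown in the following figure.

One has

\[ x: S^1 - \{(1, 0)\} \to E_1, \qquad \bar{x}: S^1 - \{(-1, 0)\} \to E_1, \]

with $0 < x, \bar{x} < 2\pi$, and the change-of-coordinate maps are given by

\[ \bar{x} = \begin{cases} x + \pi & \text{if } x < \pi \\ x - \pi & \text{if } x > \pi \end{cases} \qquad \text{(See the figure for the two cases.)} \]

and

\[ x = \begin{cases} \bar{x} + \pi & \text{if } \bar{x} < \pi \\ \bar{x} - \pi & \text{if } \bar{x} > \pi. \end{cases} \]

Notice the symmetry between $x$ and $\bar{x}$. Also notice that these change-of-coordinate functions are only defined on the overlap of the two charts, that is, away from the points $(1, 0)$ and $(-1, 0)$ (angles $0$ and $\pi$). Further,

\[ \partial\bar{x}/\partial x = \partial x/\partial\bar{x} = 1. \]

Note that, in terms of complex numbers, we can write, for a point $p = z \in S^1$ (a complex number of unit modulus),

\[ x = \arg(z), \qquad \bar{x} = \arg(-z), \]

with $\arg$ taking values in $(0, 2\pi)$.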
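The two charts on the circle and the change-of-coordinate map between them can be checked against each other numerically. In this sketch the charts are implemented with `math.atan2` reduced modulo 2*pi, which is one way (an assumption here) to realize an argument function with values in (0, 2*pi); the function names are illustrative.

```python
import math

def x_chart(p):
    """Angle of p measured from (1, 0), in (0, 2*pi); not defined at (1, 0)."""
    return math.atan2(p[1], p[0]) % (2.0 * math.pi)

def xbar_chart(p):
    """Angle of p measured from (-1, 0), in (0, 2*pi); not defined at (-1, 0)."""
    return math.atan2(-p[1], -p[0]) % (2.0 * math.pi)

def change_of_coordinates(x):
    """xbar as a function of x on the overlap of the charts (x != pi)."""
    return x + math.pi if x < math.pi else x - math.pi

for theta in (0.3, 2.0, 4.0, 6.0):
    p = (math.cos(theta), math.sin(theta))
    # The last two printed values agree up to rounding.
    print(theta, xbar_chart(p), change_of_coordinates(x_chart(p)))
```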

(c) Generalized Polar Coordinates Let us take $M = S^n$, the unit n-sphere,

\[ S^n = \{(y_1, y_2, \ldots, y_n, y_{n+1}) \in E_{n+1} \mid \textstyle\sum_i y_i^2 = 1\}, \]

with coordinates $(x^1, x^2, \ldots, x^n)$ with

\[ 0 < x^1, x^2, \ldots, x^{n-1} < \pi \quad \text{and} \quad 0 < x^n < 2\pi, \]

given by

\[ \begin{aligned} y_1 &= \cos x^1 \\ y_2 &= \sin x^1 \cos x^2 \\ y_3 &= \sin x^1 \sin x^2 \cos x^3 \\ &\;\;\vdots \\ y_{n-1} &= \sin x^1 \sin x^2 \cdots \sin x^{n-2} \cos x^{n-1} \\ y_n &= \sin x^1 \sin x^2 \cdots \sin x^{n-1} \cos x^n \\ y_{n+1} &= \sin x^1 \sin x^2 \cdots \sin x^{n-1} \sin x^n \end{aligned} \]

In the homework, you will be asked to obtain the associated chart by solving for the $x^i$. Note that if the sphere has radius $r$, then we can multiply all the above expressions by $r$, getting

\[ \begin{aligned} y_1 &= r \cos x^1 \\ y_2 &= r \sin x^1 \cos x^2 \\ y_3 &= r \sin x^1 \sin x^2 \cos x^3 \\ &\;\;\vdots \\ y_{n-1} &= r \sin x^1 \sin x^2 \cdots \sin x^{n-2} \cos x^{n-1} \\ y_n &= r \sin x^1 \sin x^2 \cdots \sin x^{n-1} \cos x^n \\ y_{n+1} &= r \sin x^1 \sin x^2 \cdots \sin x^{n-1} \sin x^n. \end{aligned} \]
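The pattern in the generalized polar coordinates (each new coordinate multiplies the running product of sines by one cosine) is easy to express as a loop. The following sketch, with the hypothetical function name `generalized_polar`, builds a point of the sphere from its n angles and confirms numerically that it has the expected norm.

```python
import math

def generalized_polar(x, r=1.0):
    """Map local coordinates (x^1, ..., x^n) to ambient coordinates (y_1, ..., y_{n+1})."""
    y = []
    sines = r                      # running product r * sin x^1 * ... * sin x^(k-1)
    for angle in x:
        y.append(sines * math.cos(angle))
        sines *= math.sin(angle)
    y.append(sines)                # last coordinate: r * sin x^1 * ... * sin x^n
    return y

point = generalized_polar([0.7, 1.1, 2.5, 4.0])   # a point of S^4 in E_5
print(point)
print(sum(c * c for c in point))                   # approximately 1
```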

(d) The torus $T = S^1 \times S^1$, with the following four charts:

\[ x: (S^1 - \{(1, 0)\}) \times (S^1 - \{(1, 0)\}) \to E_2, \]

given by

\[ \begin{aligned} x^1((\cos\theta, \sin\theta), (\cos\phi, \sin\phi)) &= \theta \\ x^2((\cos\theta, \sin\theta), (\cos\phi, \sin\phi)) &= \phi. \end{aligned} \]

The remaining charts are defined similarly, and the change-of-coordinate maps are omitted.

(e) The cylinder (exercise)

(f) $S^n$, with (again) stereographic projection, is an n-manifold; the two charts are given as follows. Let $P$ be the point $(0, 0, \ldots, 0, 1)$ and let $Q$ be the point $(0, 0, \ldots, 0, -1)$. Then define two charts $(S^n - P, x^i)$ and $(S^n - Q, \bar{x}^i)$ as follows. (See the figure.)

If $(y_1, y_2, \ldots, y_n, y_{n+1})$ is a point in $S^n$, let

\[ x^i = \frac{y_i}{1 - y_{n+1}}, \qquad \bar{x}^i = \frac{y_i}{1 + y_{n+1}} \qquad (i = 1, 2, \ldots, n). \]

We can invert these maps (that is, solve for the global coordinates $y_i$ in terms of the local coordinates $x^i$ and $\bar{x}^i$) as follows:

Let $r^2 = \sum_i x^i x^i$ and $\bar{r}^2 = \sum_i \bar{x}^i \bar{x}^i$. Then, for $i = 1, 2, \ldots, n$:

\[ y_i = \frac{2x^i}{r^2 + 1}, \qquad y_i = \frac{2\bar{x}^i}{1 + \bar{r}^2}, \]

\[ y_{n+1} = \frac{r^2 - 1}{r^2 + 1}, \qquad y_{n+1} = \frac{1 - \bar{r}^2}{1 + \bar{r}^2}. \]

The change-of-coordinate maps are therefore:

\[ x^1 = \frac{y_1}{1 - y_{n+1}} = \frac{\dfrac{2\bar{x}^1}{1 + \bar{r}^2}}{1 - \dfrac{1 - \bar{r}^2}{1 + \bar{r}^2}} = \frac{\bar{x}^1}{\bar{r}^2}, \qquad x^2 = \frac{\bar{x}^2}{\bar{r}^2}, \qquad \ldots, \qquad x^n = \frac{\bar{x}^n}{\bar{r}^2}. \]

This makes sense, since the maps are not defined when $\bar{x}^i = 0$ for all $i$, corresponding to the north pole.

Note Since $\bar{r}$ is the distance from $\bar{x}^i$ to the origin, this map is hyperbolic reflection in the unit circle:

\[ x^i = \frac{1}{\bar{r}} \cdot \frac{\bar{x}^i}{\bar{r}}, \]

and squaring and adding gives

\[ r = \frac{1}{\bar{r}}. \]

That is, project it to the circle, and invert the distance from the origin. This also gives the inverse relations, since we can write

\[ \bar{x}^i = \bar{r}^2 x^i = \frac{x^i}{r^2}. \]

In other words, we have the following transformation rules.

Change of Coordinate Transformations for Stereographic Projection

Let $r^2 = \sum_i x^i x^i$ and $\bar{r}^2 = \sum_i \bar{x}^i \bar{x}^i$. Then

\[ \bar{x}^i = \frac{x^i}{r^2}; \qquad x^i = \frac{\bar{x}^i}{\bar{r}^2}; \qquad r\bar{r} = 1. \]
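The stereographic formulas can be verified numerically on a sample point. This sketch takes local coordinates in the first chart, maps them to the sphere with the inverse formula, applies the second chart, and compares the result with the change-of-coordinate rule. The function names and the sample point are illustrative assumptions.

```python
import math

def x_chart(y):
    """Chart on S^n minus P = (0, ..., 0, 1): x^i = y_i / (1 - y_{n+1})."""
    return [c / (1.0 - y[-1]) for c in y[:-1]]

def xbar_chart(y):
    """Chart on S^n minus Q = (0, ..., 0, -1): xbar^i = y_i / (1 + y_{n+1})."""
    return [c / (1.0 + y[-1]) for c in y[:-1]]

def x_inverse(x):
    """Inverse of x_chart: y_i = 2 x^i / (r^2 + 1), y_{n+1} = (r^2 - 1) / (r^2 + 1)."""
    r2 = sum(c * c for c in x)
    return [2.0 * c / (r2 + 1.0) for c in x] + [(r2 - 1.0) / (r2 + 1.0)]

x = [0.4, -1.3]                       # local coordinates of a point of S^2
y = x_inverse(x)
xbar = xbar_chart(y)
r2 = sum(c * c for c in x)
rbar2 = sum(c * c for c in xbar)

print(sum(c * c for c in y))          # approximately 1: y lies on the sphere
print(xbar, [c / r2 for c in x])      # the two lists agree: xbar^i = x^i / r^2
print(math.sqrt(r2 * rbar2))          # approximately 1: r * rbar = 1
```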

Note We can put all the coordinate functions $x_\alpha^r: U_\alpha \to E_1$ together to get a single map

\[ x_\alpha: U_\alpha \to W_\alpha \subset E_n. \]

A more precise formulation of condition (c) in the definition of a manifold is then the following: each $W_\alpha$ is an open subset of $E_n$, each $x_\alpha$ is invertible, and each composite

\[ W_\alpha \xrightarrow{\;x_\alpha^{-1}\;} E_s \xrightarrow{\;x_\beta\;} W_\beta \]

is a smooth function defined on an open subset.

We now want to discuss scalar and vector fields on manifolds, but how do we specify such things? First, a scalar field.

Definition 2.4 A smooth scalar field on a smooth manifold $M$ is just a smooth real-valued map $\Phi: M \to E_1$. (In other words, it is a smooth function of the coordinates of $M$ as a subset of $E_s$.) Thus, $\Phi$ associates to each point $m$ of $M$ a unique scalar $\Phi(m)$.
If $U$ is a subset of $M$, then a smooth scalar field on $U$ is a smooth real-valued map $\Phi: U \to E_1$. If $U \ne M$, we sometimes call such a scalar field local.

If $\Phi$ is a scalar field on $M$ and $x$ is a chart, then we can express $\Phi$ as a smooth function $\phi$ of the associated parameters $x^1, x^2, \ldots, x^n$. If the chart is $\bar{x}$, we shall write $\bar{\phi}$ for the function of the other parameters $\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^n$. Note that we must have $\phi = \bar{\phi}$ at each point of the manifold (see the transformation rule below).

Examples 2.5 (a) Let $M = E_n$ (with its usual structure) and let $\phi$ be any smooth real-valued function in the usual sense. Then, using the identity chart, we have $\Phi = \phi$.

(b) Let $M = S^2$, and define $\Phi(y_1, y_2, y_3) = y_3$. Using stereographic projection, we find both $\phi$ and $\bar{\phi}$:

\[ \phi(x^1, x^2) = y_3(x^1, x^2) = \frac{r^2 - 1}{r^2 + 1} = \frac{(x^1)^2 + (x^2)^2 - 1}{(x^1)^2 + (x^2)^2 + 1} \]

\[ \bar{\phi}(\bar{x}^1, \bar{x}^2) = y_3(\bar{x}^1, \bar{x}^2) = \frac{1 - \bar{r}^2}{1 + \bar{r}^2} = \frac{1 - (\bar{x}^1)^2 - (\bar{x}^2)^2}{1 + (\bar{x}^1)^2 + (\bar{x}^2)^2} \]

(c) Local Scalar Field The most obvious candidates for local fields are the coordinate functions themselves. If $U$ is a coordinate neighborhood, and $x = \{x^i\}$ is a chart on $U$, then the maps $x^i$ are local scalar fields.

Sometimes, as in the above example, we may wish to specify a scalar field purely by specifying it in terms of its local parameters; that is, by specifying the various functions $\phi$ instead of the single function $\Phi$. The problem is, we can't just specify it any way we want, since it must give a value to each point in the manifold independently of local coordinates. That is, if a point $p \in M$ has local coordinates $(x^j)$ with one chart and $(\bar{x}^h)$ with another, they must be related via the relationship

\[ \bar{x}^j = \bar{x}^j(x^h). \]

Transformation Rule for Scalar Fields

\[ \bar{\phi}(\bar{x}^j) = \phi(x^h). \]

Example 2.6 Look at Example 2.5(b) above. If you substituted $\bar{x}^i$ as a function of the $x^j$, you would get $\bar{\phi}(\bar{x}^1, \bar{x}^2) = \phi(x^1, x^2)$ (after some laborious algebra!).
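The "laborious algebra" of Example 2.6 can be sidestepped with a numerical spot check. The sketch below evaluates the scalar field y_3 on the 2-sphere in both stereographic charts, using the change of coordinates in which each barred coordinate is the unbarred one divided by r^2, and confirms that the two local expressions agree at a few sample points. The function names and sample points are assumptions.

```python
def phi(x1, x2):
    """y_3 in the chart x (projection from the north pole)."""
    r2 = x1 * x1 + x2 * x2
    return (r2 - 1.0) / (r2 + 1.0)

def phi_bar(xb1, xb2):
    """y_3 in the chart xbar (projection from the south pole)."""
    rb2 = xb1 * xb1 + xb2 * xb2
    return (1.0 - rb2) / (1.0 + rb2)

def x_to_xbar(x1, x2):
    """Change of coordinates xbar^i = x^i / r^2 (defined away from x = 0)."""
    r2 = x1 * x1 + x2 * x2
    return (x1 / r2, x2 / r2)

for x in [(0.4, -1.3), (2.0, 0.5), (-0.1, 0.2)]:
    print(phi(*x), phi_bar(*x_to_xbar(*x)))   # the two values agree up to rounding
```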

Exercise Set 2

1. Give the paraboloid $z = x^2 + y^2$ the structure of a smooth manifold.

2. Find a smooth atlas of $E_2$ consisting of three charts.

3. (a) Extend the method in Exercise 1 to show that the graph of any smooth function $f: E_2 \to E_1$ can be given the structure of a smooth manifold.
(b) Generalize part (a) to the graph of a smooth function $f: E_n \to E_1$.

4. Two atlases of the manifold $M$ give the same smooth structure if their union is again a smooth atlas of $M$.
(a) Show that the smooth atlases $(E_1, f)$ and $(E_1, g)$, where $f(x) = x$ and $g(x) = x^3$, are incompatible.
(b) Find a third smooth atlas of $E_1$ that is incompatible with both the atlases in part (a).

5. Consider the ellipsoid $L \subset E_3$ specified by

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \quad (a, b, c \ne 0). \]

Define $f: L \to S^2$ by $f(x, y, z) = (x/a, y/b, z/c)$.
(a) Verify that $f$ is invertible (by finding its inverse).
(b) Use the map $f$, together with a smooth atlas of $S^2$, to construct a smooth atlas of $L$.

6. Find the chart associated with the generalized spherical polar coordinates described in Example 2.3(c) by inverting the coordinates. How many additional charts are needed to get an atlas? Give an example.

7. Obtain the equations in Example 2.3(f).

Tangent Vectors and the Tangent Space http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 3: Tangent Vectors and the Tangent Space

Lecture 2 described scalar fields on manifolds. We now turn to vectors on smooth manifolds. We must first talk about smooth paths in $M$.

Definition 3.1 A smooth path in the smooth manifold $M$ is a smooth map defined on an open segment of the real line, $r: (-a, a) \to M$, where $r$ is a vector-valued function with coordinates $(y_1, y_2, \ldots, y_s)$.

We say that $r$ is a smooth path through $m \in M$ if $r(t_0) = m$ for some $t_0$.

We can specify a path in $M$ at $m$ by its coordinates:

\[ y_1 = y_1(t), \quad y_2 = y_2(t), \quad \ldots, \quad y_s = y_s(t), \]

where $m$ is the point $(y_1(t_0), y_2(t_0), \ldots, y_s(t_0))$. Equivalently, since the ambient and local coordinates are functions of each other, we can also express a path (at least that part of it inside a coordinate neighborhood) in terms of its local coordinates:

\[ x^1 = x^1(t), \quad x^2 = x^2(t), \quad \ldots, \quad x^n = x^n(t). \]

Examples 3.2
(a) (borrowed from Example 1.7 in Lecture 1) Straight lines in E3
(b) (from Example 1.7 in Lecture 1) Helix in a cylinder radius r embedded
in E3
(c) A smooth path in Sn

Definition 3.3 A tangent vector at $m \in M \subset E_s$ is a vector $v$ in $E_s$ of the form $v = y'(t_0)$ for some path $y = y(t)$ in $M$ through $m$ (with $y(t_0) = m$).

Examples 3.4
(a) Let $M$ be the surface $y_3 = y_1^2 + y_2^2$, which we parametrize by

\[ y_1 = x^1, \quad y_2 = x^2, \quad y_3 = (x^1)^2 + (x^2)^2. \]

This corresponds to the single chart $(U = M; x^1, x^2)$, where

\[ x^1 = y_1 \quad \text{and} \quad x^2 = y_2. \]

To specify a tangent vector, let us first specify a path in $M$, such as

\[ y_1 = t \sin t, \quad y_2 = t \cos t, \quad y_3 = t^2. \]

(Check that the equation of the surface is satisfied.) This gives the path shown in the figure.
Now we obtain a tangent vector field along the path by taking the derivative:

\[ \left(\frac{dy_1}{dt}, \frac{dy_2}{dt}, \frac{dy_3}{dt}\right) = (t \cos t + \sin t,\; -t \sin t + \cos t,\; 2t). \]

(To get actual tangent vectors at points in $M$, evaluate this at a fixed point $t_0$.)

Note We can also express the coordinates $x^i$ in terms of $t$:

\[ x^1 = y_1 = t \sin t, \quad x^2 = y_2 = t \cos t, \]

giving

\[ \left(\frac{dx^1}{dt}, \frac{dx^2}{dt}\right) = (t \cos t + \sin t,\; -t \sin t + \cos t), \]

since $x^i = y_i$ for this manifold. We also think of this as the tangent vector, given in terms of the local coordinates. A lot more will be said about the relationship between the above two forms of the tangent vector below.
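The computations in Example 3.4(a) can be confirmed numerically. The sketch below checks that the path stays on the paraboloid, compares the stated derivative with a central-difference estimate, and verifies that the tangent vector is orthogonal to the gradient of F = y_3 - y_1^2 - y_2^2, whose zero set is the surface. The orthogonality test and the function names are additions for illustration, not part of the notes.

```python
import math

def path(t):
    """The path y_1 = t sin t, y_2 = t cos t, y_3 = t^2 of Example 3.4(a)."""
    return (t * math.sin(t), t * math.cos(t), t * t)

def tangent(t):
    """Its derivative, as computed in the text."""
    return (t * math.cos(t) + math.sin(t), -t * math.sin(t) + math.cos(t), 2.0 * t)

def numerical_tangent(t, h=1e-6):
    forward, backward = path(t + h), path(t - h)
    return tuple((f - b) / (2.0 * h) for f, b in zip(forward, backward))

t0 = 1.5
y1, y2, y3 = path(t0)
print(y3 - (y1 ** 2 + y2 ** 2))                 # approximately 0: the path stays on the surface
print(tangent(t0))
print(numerical_tangent(t0))                    # agrees with the line above to about 6 digits
# A tangent vector to the surface F = 0 must be orthogonal to grad F = (-2 y_1, -2 y_2, 1).
v = tangent(t0)
print(-2 * y1 * v[0] - 2 * y2 * v[1] + v[2])    # approximately 0
```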


Algebra of Tangent Vectors: Addition and Scalar Multiplication

The sum of two tangent vectors is, geometrically, also a tangent vector, and the same goes for scalar multiples of tangent vectors. However, we have defined tangent vectors using paths in $M$, and we cannot produce these new vectors by simply adding or scalar-multiplying the corresponding paths: if $y = f(t)$ and $y = g(t)$ are two paths through $m \in M$ where $f(t_0) = g(t_0) = m$, then adding them coordinate-wise need not produce a path in $M$. However, we can add these paths using some chart as follows.

Choose a chart $x$ at $m$, with the property (for convenience) that $x(m) = 0$. Then the paths $x(f(t))$ and $x(g(t))$ (defined as in the note above) give two paths through the origin in coordinate space. Now we can add these paths or multiply them by a scalar without leaving coordinate space and then use the chart map to lift the result back up to $M$. In other words, define

\[ (f + g)(t) = x^{-1}\big(x(f(t)) + x(g(t))\big) \]

and

\[ (\lambda f)(t) = x^{-1}\big(\lambda\, x(f(t))\big). \]

Taking their derivatives at the point $t_0$ will, by the chain rule, produce the sum and scalar multiples of the corresponding tangent vectors. Since we can add and scalar-multiply tangent vectors, we make the following definition.

Definition 3.5 If $M$ is an n-dimensional manifold, and $m \in M$, then the tangent space at $m$ is the set $T_m$ of all tangent vectors at $m$.

The above constructions turn $T_m$ into a vector space.

Let us return to the issue of the two ways of describing the coordinates of a tangent vector at a point $m \in M$: writing the path as $y_i = y_i(t)$ we get the ambient coordinates of the tangent vector:

\[ y'(t_0) = \left.\left(\frac{dy_1}{dt}, \ldots, \frac{dy_s}{dt}\right)\right|_{t = t_0} \qquad \text{Ambient coordinates} \]

and, using some chart $x$ at $m$, we get the local coordinates

\[ x'(t_0) = \left.\left(\frac{dx^1}{dt}, \ldots, \frac{dx^n}{dt}\right)\right|_{t = t_0} \qquad \text{Local coordinates} \]

Question In general, how are the $dx^i/dt$ related to the $dy_i/dt$?

Answer By the chain rule,

\[ \frac{dy_1}{dt} = \frac{\partial y_1}{\partial x^1}\frac{dx^1}{dt} + \frac{\partial y_1}{\partial x^2}\frac{dx^2}{dt} \]

and similarly for $dy_2/dt$ and $dy_3/dt$. Thus, we can recover the original three ambient vector coordinates from the local coordinates. In other words, the local vector coordinates completely specify the tangent vector.

Note The chain rule as used above shows us how to convert local coordinates to ambient coordinates and vice-versa:

Converting Between Local and Ambient Coordinates of a Tangent Vector

If the tangent vector $V$ has ambient coordinates $(v_1, v_2, \ldots, v_s)$ and local coordinates $(v^1, v^2, \ldots, v^n)$, then they are related by the formulae

\[ v_i = \sum_{k=1}^{n} \frac{\partial y_i}{\partial x^k}\, v^k \]

and

\[ v^i = \sum_{k=1}^{s} \frac{\partial x^i}{\partial y_k}\, v_k. \]

Note To obtain the coordinates of sums or scalar multiples of tangent vectors, simply take the corresponding sums and scalar multiples of the coordinates. In other words:

\[ (v + w)^i = v^i + w^i \quad \text{and} \quad (\lambda v)^i = \lambda v^i, \]

just as we would expect to do for ambient coordinates. (Why can we do this?)
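As a concrete instance of the two conversion formulae, consider again the paraboloid of Example 3.4(a), where the partial derivatives are simple enough to write down by hand. The sketch below converts a local vector to ambient coordinates and back; the function names, the sample point and the sample vector are assumptions.

```python
def local_to_ambient(x, v_local):
    """v_i = sum_k (dy_i/dx^k) v^k for y = (x^1, x^2, (x^1)^2 + (x^2)^2)."""
    x1, x2 = x
    jac = [[1.0, 0.0],
           [0.0, 1.0],
           [2.0 * x1, 2.0 * x2]]           # entry [i][k] is dy_i/dx^k
    return [sum(jac[i][k] * v_local[k] for k in range(2)) for i in range(3)]

def ambient_to_local(v_ambient):
    """v^i = sum_k (dx^i/dy_k) v_k for the chart x^1 = y_1, x^2 = y_2."""
    jac = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]                # entry [i][k] is dx^i/dy_k
    return [sum(jac[i][k] * v_ambient[k] for k in range(3)) for i in range(2)]

x = (0.5, -1.0)
v_local = [2.0, 3.0]
v_ambient = local_to_ambient(x, v_local)
print(v_ambient)                      # [2.0, 3.0, -4.0]
print(ambient_to_local(v_ambient))    # [2.0, 3.0]
```

The third ambient coordinate carries no extra information: it is determined by the first two and the point, which is one way to see that the local coordinates specify the tangent vector completely.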

From now on, we shall omit the summation signs, and use the Einstein Summation Convention:

Einstein Summation Convention

If an index appears twice in an expression, then summation over that index is implied.

Thus,

\[ \sum_{k=1}^{n} \frac{\partial y_i}{\partial x^k}\, v^k \quad \text{becomes} \quad \frac{\partial y_i}{\partial x^k}\, v^k \quad \text{(because the index } k \text{ repeats)} \]

and

\[ \sum_{k=1}^{s} \frac{\partial x^i}{\partial y_k}\, v_k \quad \text{becomes} \quad \frac{\partial x^i}{\partial y_k}\, v_k \quad \text{(again because the index } k \text{ repeats).} \]

Examples 3.4 Contd.

(b) Take $M = E_n$, and let $v$ be any vector in the usual sense with coordinates $\alpha^i$. Choose $x$ to be the usual chart $x^i = y_i$. If $p = (p^1, p^2, \ldots, p^n)$ is a point in $M$, then $v$ is the derivative of the path

\[ x^1 = p^1 + t\alpha^1, \quad x^2 = p^2 + t\alpha^2, \quad \ldots, \quad x^n = p^n + t\alpha^n \]

at $t = 0$. Thus this vector has local and ambient coordinates equal to each other, and equal to

\[ \frac{dx^i}{dt} = \alpha^i, \]

which are the same as the original coordinates. In other words, the tangent vectors are "the same" as ordinary vectors in $E_n$.

(c) Let $M = S^2$, and consider the path in $S^2$ given by

\[ y_1 = \sin t, \quad y_2 = 0, \quad y_3 = \cos t. \]

This is a path (circle) through $m = (0, 0, 1)$ following the line of longitude $x^2 = 0$, and has tangent vector

\[ \left(\frac{dy_1}{dt}, \frac{dy_2}{dt}, \frac{dy_3}{dt}\right) = (\cos t, 0, -\sin t) = (1, 0, 0) \text{ at the point } m. \]

(d) We can also use the local coordinates to describe a path; for instance, the path in part (c) can be described using spherical polar coordinates by

\[ x^1 = t, \quad x^2 = 0. \]

The derivative

\[ \left(\frac{dx^1}{dt}, \frac{dx^2}{dt}\right) = (1, 0) \]

gives the local coordinates of the tangent vector (the coordinates of its image in coordinate Euclidean space).

(e) In general, if $(U; x^1, x^2, \ldots, x^n)$ is a coordinate system near $m$, then we can obtain paths $y_i(t)$ by setting

\[ x^j(t) = \begin{cases} t + \text{const.} & \text{if } j = i \\ \text{const.} & \text{if } j \ne i, \end{cases} \]

where the constants are chosen to make $x^i(t_0)$ correspond to $m$. (The paths in parts (c) and (d) are examples of this.) To view this as a path in $M$, we just apply the parametric equations $y_i = y_i(x^j)$, giving the $y_i$ as functions of $t$.

The associated tangent vector at the point where $t = t_0$ is called $\partial/\partial x^i$. It has local coordinates

\[ v^j = \left.\frac{dx^j}{dt}\right|_{t = t_0} = \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \ne i \end{cases} = \delta_i^j. \]

$\delta_i^j$ is called the Kronecker Delta, and is defined by

\[ \delta_i^j = \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \ne i. \end{cases} \]

Question Which matrix has $ij$ entry equal to $\delta_i^j$?

Answer The $n \times n$ identity matrix.

Question Using the Einstein summation convention, evaluate $v^i \delta_i^j$.

Answer $v^i \delta_i^j = v^j$, since the only nonzero term in the sum is the one with $i = j$.

We can now get the ambient coordinates by the above conversion formula (we are using the Einstein summation convention from this point on):

\[ v_j = \frac{\partial y_j}{\partial x^k}\, v^k = \frac{\partial y_j}{\partial x^k}\, \delta_i^k = \frac{\partial y_j}{\partial x^i}. \]

We call this vector $\partial/\partial x^i$. Summarizing,

Definition of $\partial/\partial x^i$

$\partial/\partial x^i$ is the vector whose local coordinates are given by

\[ j\text{-th coordinate} = \left(\frac{\partial}{\partial x^i}\right)^{j} = \delta_i^j = \frac{\partial x^j}{\partial x^i}. \]

Its ambient coordinates are given by

\[ j\text{-th coordinate} = \frac{\partial y_j}{\partial x^i}. \]
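To get a feel for these coordinate vectors, one can estimate their ambient coordinates numerically for the spherical polar parametrization of the unit sphere and check that they really are tangent, that is, orthogonal to the position vector. This is a sketch using central differences; the function names are hypothetical, and the index `i` is 0-based in the code (0 for the first local coordinate, 1 for the second).

```python
import math

def sphere(x1, x2):
    return (math.sin(x1) * math.cos(x2), math.sin(x1) * math.sin(x2), math.cos(x1))

def coordinate_vector(i, x, h=1e-6):
    """Ambient coordinates dy_j/dx^i of d/dx^i at local coordinates x (i = 0 or 1)."""
    forward, backward = list(x), list(x)
    forward[i] += h
    backward[i] -= h
    yf, yb = sphere(*forward), sphere(*backward)
    return tuple((f - b) / (2.0 * h) for f, b in zip(yf, yb))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = (1.0, 2.0)
y = sphere(*x)
d1 = coordinate_vector(0, x)   # d/dx^1
d2 = coordinate_vector(1, x)   # d/dx^2
print(d1, d2)
print(dot(d1, y), dot(d2, y))  # both approximately 0: the vectors are tangent to the sphere
```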

Now that we have a better feel for local and ambient coordinates of vectors, let us state some more "general nonsense": Let $M$ be an n-dimensional manifold, and let $m \in M$.

Proposition 3.6 (The Tangent Space)

There is a linear one-to-one correspondence between tangent vectors at $m$ and plain old vectors in $E_n$. In other words, the tangent space "looks like" $E_n$.

The proof (not reproduced here) also explains why local coordinates are better than ambient ones.

Question Wait a minute! Isn't that obvious from the picture? The tangent space is just an n-dimensional plane, and all n-dimensional planes are just copies of n-dimensional space!

Answer Geometrically, that seems true, at least in three dimensions. But remember, we have defined the tangent space at $m \in M$ as the set of tangent vectors to paths through $m$. How can we be so sure that this 1-1 correspondence works (a) with this definition, and (b) in arbitrary s-dimensional space? That is why we really need the proof.

Note Under the one-to-one correspondence in the proposition, the standard basis vectors in $E_n$ correspond to the tangent vectors $\partial/\partial x^1, \partial/\partial x^2, \ldots, \partial/\partial x^n$. Therefore, the latter vectors are a basis of the tangent space $T_m$.

Exercise Set 3

1. Suppose that $v$ is a tangent vector at $m \in M$ with the property that there exists a local coordinate system $x^i$ at $m$ with $v^i = 0$ for every $i$. Show that $v$ has zero coordinates in every coordinate system, and that, in fact, $v = 0$.

2. (a) Calculate the ambient coordinates of the vectors $\partial/\partial\theta$ and $\partial/\partial\phi$ at a general point on $S^2$, where $\theta$ and $\phi$ are spherical polar coordinates ($\theta = x^1$, $\phi = x^2$).
(b) Sketch these vectors at some point on the sphere.

3. Prove that

\[ \frac{\partial}{\partial \bar{x}^i} = \frac{\partial x^j}{\partial \bar{x}^i}\, \frac{\partial}{\partial x^j}. \]

4. Consider the torus $T^2$ with the chart $x$ given by

\[ \begin{aligned} y_1 &= (a + b\cos x^1)\cos x^2 \\ y_2 &= (a + b\cos x^1)\sin x^2 \\ y_3 &= b\sin x^1 \end{aligned} \]

with $0 < x^i < 2\pi$. Find the ambient coordinates of the two orthogonal tangent vectors at a general point, and sketch the resulting vectors.

Last Updated: January, 2002
Copyright © Stefan Waner

Contravariant and Covariant Vector Fields http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 4: Contravariant and Covariant Vector Fields

Question How are the local coordinates of a given tangent vector for one
chart related to those for another?

Answer Again, we use the chain rule, which gives the formula

$$ \frac{d\bar{x}^i}{dt} = \frac{\partial \bar{x}^i}{\partial x^j}\, \frac{dx^j}{dt}. $$

In other words, a tangent vector through a point m in M is a collection of n
numbers (local coordinates) $v^i = dx^i/dt$ (specified for each chart x at m),
where the quantities for one chart are related to those for another
according to the formula

$$ \bar{v}^i = \frac{\partial \bar{x}^i}{\partial x^j}\, v^j. $$

This leads to the following definition.

Definition 4.1 A contravariant vector at $m \in M$ is a collection $v^i$ of n
quantities (defined for each chart at m) which transform according to
the formula

$$ \bar{v}^i = \frac{\partial \bar{x}^i}{\partial x^j}\, v^j. $$

It follows that contravariant vectors "are" just tangent vectors: the
contravariant vector $v^i$ corresponds to the tangent vector given by

$$ v = v^i\, \frac{\partial}{\partial x^i}, $$

so we shall henceforth refer to tangent vectors and contravariant
vectors interchangeably.
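The transformation formula can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes $M = E^2$ with a Cartesian chart x and a polar chart $\bar{x} = (r, \theta)$, and a sample path chosen purely for illustration. It compares the local coordinates $\bar{v}^i$ computed directly from the path with those obtained from $v^j$ through the transformation formula.

```python
import math

def path(t):
    """A sample path in E^2, written in the Cartesian chart x = (x^1, x^2)."""
    return (1.0 + t, 2.0 + 3.0 * t * t)

def to_polar(x):
    """The second chart x-bar = (r, theta), valid away from the origin."""
    return (math.hypot(x[0], x[1]), math.atan2(x[1], x[0]))

def velocity(curve, t, h=1e-6):
    """Central-difference approximation of the coordinates dx^i/dt."""
    a, b = curve(t - h), curve(t + h)
    return tuple((q - p) / (2 * h) for p, q in zip(a, b))

def jacobian_polar(x):
    """Matrix of partials d(x-bar^i)/d(x^j) for x-bar = (r, theta)."""
    r = math.hypot(x[0], x[1])
    return [[x[0] / r, x[1] / r],
            [-x[1] / r ** 2, x[0] / r ** 2]]

t0 = 0.5
v = velocity(path, t0)                                     # v^j in the chart x
v_bar_direct = velocity(lambda t: to_polar(path(t)), t0)   # v-bar^i from the path itself
J = jacobian_polar(path(t0))
v_bar_rule = tuple(sum(J[i][j] * v[j] for j in range(2)) for i in range(2))
```

Up to the finite-difference error, `v_bar_direct` and `v_bar_rule` agree, which is exactly what the definition requires.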

A contravariant vector field V on M associates with each chart x a
collection of n smooth real-valued coordinate functions $V^i$ of the n


variables $(x^1, x^2, \ldots, x^n)$, such that evaluating $V^i$ at any point gives a
vector at that point. Further, the domain of the $V^i$ is the whole of the
range of x. Similarly, a contravariant vector field V on $U \subset M$ is
defined in the same way, but its domain is restricted to x(U).

The transformation rule for all contravariant vector fields is therefore given
as follows.

Contravariant Vector Transformation Rule

$$ \bar{V}^i = \frac{\partial \bar{x}^i}{\partial x^j}\, V^j $$

where now the $V^i$ are functions of the associated coordinates $(x^1, x^2, \ldots, x^n)$,
and similarly for the barred coordinates. Note that the transformation
rule is only valid on the intersection of the images of x and $\bar{x}$.

Notes 4.2
1. The above formula is reminiscent of matrix multiplication: In fact, let $\bar{D}$ be
the matrix whose ij-th entry is $\partial \bar{x}^i/\partial x^j$; then the above equation becomes, in
matrix form:

$$ \bar{V} = \bar{D} V, $$

where we think of V and $\bar{V}$ as column vectors.

2. By "transform," we mean that the above relationship holds between the
coordinate functions $V^i$ of the $x^i$ associated with the chart x, and the
functions $\bar{V}^i$ of the $\bar{x}^i$, associated with the chart $\bar{x}$.

3. Note the formal symbol cancellation: if we cancel the $\partial$'s, the x's, and the
superscripts on the right, we are left with the symbols on the left!

4. From the proof of 3.6, we saw that, if V is any smooth contravariant
vector field on M, then

$$ V = V^i\, \frac{\partial}{\partial x^i}. $$
Examples 4.3
(a) Take $M = E^n$, and let F be any tangent vector field in the usual sense
with coordinates $F^i$. If $p = (p^1, p^2, \ldots, p^n)$ is a point in M, then F is the
derivative of the path


$x^1 = p^1 + tF^1$
$x^2 = p^2 + tF^2$
...
$x^n = p^n + tF^n$

at t = 0. Thus this vector has coordinate functions

$$ \frac{dx^i}{dt} = F^i, $$

which are the same as the original coordinates. In other words, the tangent
vectors are "the same" as ordinary vectors in $E^n$.

(b) An Important Local Vector Field Recall from Example 3.4 (e) the
definition of the vectors $\partial/\partial x^i$: At each point m in a manifold M, we have the
n vectors $\partial/\partial x^1, \partial/\partial x^2, \ldots, \partial/\partial x^n$, where the typical vector $\partial/\partial x^i$ was obtained
by taking the derivative of the path:

$$ \frac{\partial}{\partial x^i} = \text{vector obtained by differentiating the path } x^j = \begin{cases} t + \text{const.} & \text{if } j = i \\ \text{const.} & \text{if } j \neq i \end{cases} $$

where the constants are chosen to make $x^i(t_0)$ correspond to m for some $t_0$.
This gave

$$ \left(\frac{\partial}{\partial x^i}\right)^j = \delta^j_i. $$

Now, there is nothing to stop us from defining n different vector fields $\partial/\partial x^1$,
$\partial/\partial x^2, \ldots, \partial/\partial x^n$, in exactly the same way: at each point in the coordinate
neighborhood of the chart x, associate the vector above.

Note: $\partial/\partial x^i$ is a field, and not the i-th coordinate of a field. Its j-th local
coordinate under the chart x is given by $\delta^j_i = \partial x^j/\partial x^i$ at every point in the
image of x.

Question Since the coordinates do not depend on x, does it mean that the
vector field is constant?
Answer No. Remember that a tangent field is a field on (part of) a manifold,
and as such, it is not, in general, constant. The only things that are constant
are its coordinates under the specific chart x. The corresponding
coordinates under another chart $\bar{x}$ are $\partial \bar{x}^j/\partial x^i$ (which are not constant in


general).

Question What does the vector field $\partial/\partial x^i$ look like?


Answer Click here to see.

(c) Patching Together Local Vector Fields

The vector field in the above example has the disadvantage that it is local. We
can "extend" it to the whole of M by making it zero near the boundary of the
coordinate patch, as follows. If $m \in M$ and x is any chart of M, let x(m) = y
and let D be a disc of some radius r centered at y and entirely contained in the
image of x. Now define a vector field on the whole of M by

$$ w(p) = \begin{cases} \dfrac{\partial}{\partial x^j}\, e^{-R^2} & \text{if } x(p) \text{ is in } D \\ 0 & \text{otherwise} \end{cases} $$

where

$$ R = \frac{|x(p) - y|}{r - |x(p) - y|}. $$

The following figure shows what this field looks like on M.

The fact that the local coordinates vary smoothly with $p \in M$ now follows
from the fact that all the partial derivatives of all orders vanish as you leave
the domain of x. Note that this field agrees with $\partial/\partial x^j$ at the point m.
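To get a feel for the cutoff factor $e^{-R^2}$, here is a small sketch (an illustration only, not from the original notes). The function name `bump` and its arguments are hypothetical: `dist` stands for $|x(p) - y|$ and `r` for the radius of the disc D.

```python
import math

def bump(dist, r):
    """Cutoff factor exp(-R^2) with R = dist / (r - dist); zero outside the disc."""
    if dist >= r:
        return 0.0
    R = dist / (r - dist)
    return math.exp(-R * R)
```

The factor equals 1 at the center of the disc (so the patched field agrees with the local one at m) and decays to 0 as `dist` approaches `r`, which is what lets the field be extended by zero.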

(d) (Based on Example 3.2(c)) Take $M = S^n$, with stereographic projection
given by the two charts discussed earlier (Example 2.3(f) in Lecture 2).
Consider the circulating vector field on $S^n$ defined at the point
$y = (y_1, y_2, \ldots, y_n, y_{n+1})$ by the paths


$$ t \mapsto (y_1\cos t - y_2\sin t,\; y_1\sin t + y_2\cos t,\; y_3, \ldots, y_{n+1}). $$

(For fixed $y = (y_1, y_2, \ldots, y_n, y_{n+1})$ this defines a path at the point y; see
Example 3.2(c).) This is a circulating field in the $y_1y_2$-plane. (See the figure.
Note: the length of the tangent vector at a given point equals the radius of
the latitude circle on which it sits.)

Question What are its local coordinates under the two charts x and $\bar{x}$
associated with stereographic projection?

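Answer We saw in Lecture 2 that

$$ x^1 = \frac{y_1}{1 - y_{n+1}} = \frac{y_1\cos t - y_2\sin t}{1 - y_{n+1}}, \quad\text{so}\quad V^1 = \frac{dx^1}{dt} = -\frac{y_1\sin t + y_2\cos t}{1 - y_{n+1}} = -x^2 $$

$$ x^2 = \frac{y_2}{1 - y_{n+1}} = \frac{y_1\sin t + y_2\cos t}{1 - y_{n+1}}, \quad\text{so}\quad V^2 = \frac{dx^2}{dt} = \frac{y_1\cos t - y_2\sin t}{1 - y_{n+1}} = x^1 $$

$$ x^3 = \frac{y_3}{1 - y_{n+1}}, \quad\text{so}\quad V^3 = \frac{dx^3}{dt} = 0 $$

...

$$ x^n = \frac{y_n}{1 - y_{n+1}}, \quad\text{so}\quad V^n = \frac{dx^n}{dt} = 0 $$

and

$$ \bar{x}^1 = \frac{y_1}{1 + y_{n+1}} = \frac{y_1\cos t - y_2\sin t}{1 + y_{n+1}}, \quad\text{so}\quad \bar{V}^1 = \frac{d\bar{x}^1}{dt} = -\frac{y_1\sin t + y_2\cos t}{1 + y_{n+1}} = -\bar{x}^2 $$

$$ \bar{x}^2 = \frac{y_2}{1 + y_{n+1}} = \frac{y_1\sin t + y_2\cos t}{1 + y_{n+1}}, \quad\text{so}\quad \bar{V}^2 = \frac{d\bar{x}^2}{dt} = \frac{y_1\cos t - y_2\sin t}{1 + y_{n+1}} = \bar{x}^1 $$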


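$$ \bar{x}^3 = \frac{y_3}{1 + y_{n+1}}, \quad\text{so}\quad \bar{V}^3 = \frac{d\bar{x}^3}{dt} = 0 $$

...

$$ \bar{x}^n = \frac{y_n}{1 + y_{n+1}}, \quad\text{so}\quad \bar{V}^n = \frac{d\bar{x}^n}{dt} = 0 $$

Thus, the local coordinates are given by

$$ V = [-x^2, x^1, 0, 0, \ldots, 0], \quad\text{and}\quad \bar{V} = [-\bar{x}^2, \bar{x}^1, 0, 0, \ldots, 0]. $$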

Question I don't believe that they transform according to the


transformation rule for contravariant vectors!

Answer They do. Click here for the interesting details.
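For the skeptical reader, here is a numerical spot check in the case n = 2 (a sketch under stated assumptions, not the proof referred to above). It assumes the unit sphere $S^2$, for which the two stereographic charts are related by $\bar{x} = x/|x|^2$; the function names are hypothetical.

```python
def to_barred(x):
    """Change of chart on the unit S^2: x-bar = x / |x|^2."""
    r2 = x[0] ** 2 + x[1] ** 2
    return (x[0] / r2, x[1] / r2)

def jacobian(x):
    """Matrix of partials d(x-bar^i)/d(x^j) for x-bar = x / |x|^2."""
    r2 = x[0] ** 2 + x[1] ** 2
    return [[((1.0 if i == j else 0.0) * r2 - 2.0 * x[i] * x[j]) / r2 ** 2
             for j in range(2)] for i in range(2)]

def circulating(x):
    """Local coordinates (-x^2, x^1) of the circulating field, in either chart."""
    return (-x[1], x[0])

def transformed(x):
    """Apply the contravariant rule: V-bar^i = d(x-bar^i)/d(x^j) V^j."""
    J, V = jacobian(x), circulating(x)
    return tuple(sum(J[i][j] * V[j] for j in range(2)) for i in range(2))
```

At any point x other than the origin, `transformed(x)` agrees with `circulating(to_barred(x))`, that is, with $[-\bar{x}^2, \bar{x}^1]$.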

Covariant Vector Fields

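We now look at the gradient. If $\phi$ is a smooth scalar field on M, and if x is a
chart, then we obtain the locally defined vector field $\partial\phi/\partial x^i$. By the chain
rule, these functions transform as follows:

$$ \frac{\partial\phi}{\partial \bar{x}^i} = \frac{\partial\phi}{\partial x^j}\, \frac{\partial x^j}{\partial \bar{x}^i}, $$

or, writing $C_j = \partial\phi/\partial x^j$ and $\bar{C}_i = \partial\phi/\partial \bar{x}^i$,

$$ \bar{C}_i = C_j\, \frac{\partial x^j}{\partial \bar{x}^i}. $$

This leads to the following definition.

Definition 4.4 A covariant vector field C on M associates with each
chart x a collection of n smooth functions $C_i(x^1, x^2, \ldots, x^n)$ which
satisfy:

Covariant Vector Transformation Rule

$$ \bar{C}_i = C_j\, \frac{\partial x^j}{\partial \bar{x}^i}. $$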
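As a sanity check on the covariant rule, the following sketch (an illustration, not from the original notes) takes a sample scalar field on $E^2$, assumed here to be $\phi = (x^1)^2 x^2$ in a Cartesian chart, and compares its partial derivatives in a polar chart $\bar{x} = (r, \theta)$, computed directly, with the values predicted by the transformation rule.

```python
import math

def phi(x, y):
    """A sample scalar field, written in the Cartesian chart."""
    return x * x * y

def grad_cartesian(x, y):
    """C_j = d(phi)/d(x^j) in the chart x = (x, y)."""
    return (2.0 * x * y, x * x)

def grad_polar_direct(r, th, h=1e-6):
    """C-bar_i = d(phi)/d(x-bar^i), by central differences in the chart (r, theta)."""
    f = lambda r_, th_: phi(r_ * math.cos(th_), r_ * math.sin(th_))
    return ((f(r + h, th) - f(r - h, th)) / (2 * h),
            (f(r, th + h) - f(r, th - h)) / (2 * h))

def grad_polar_rule(r, th):
    """C-bar_i = C_j * d(x^j)/d(x-bar^i), the covariant transformation rule."""
    x, y = r * math.cos(th), r * math.sin(th)
    C = grad_cartesian(x, y)
    D = [[math.cos(th), -r * math.sin(th)],   # D[j][i] = d(x^j)/d(x-bar^i)
         [math.sin(th), r * math.cos(th)]]
    return tuple(sum(C[j] * D[j][i] for j in range(2)) for i in range(2))
```

Note that the matrix used here has the unbarred coordinates on top, the opposite of the contravariant case.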

Notes 4.5


1. If D is the matrix whose ij-th entry is $\partial x^i/\partial \bar{x}^j$, then the above equation
becomes, in matrix form:

$$ \bar{C} = CD, $$

where now we think of C and $\bar{C}$ as row vectors.

2. Note that

$$ (\bar{D}D)^i_j = \frac{\partial \bar{x}^i}{\partial x^k}\, \frac{\partial x^k}{\partial \bar{x}^j} = \frac{\partial \bar{x}^i}{\partial \bar{x}^j} = \delta^i_j, $$

and similarly for $D\bar{D}$. Thus, $\bar{D}$ and D are inverses of each other.

3. Note again the formal symbol cancellation: if we cancel the $\partial$'s, the x's,
and the superscripts on the right, we are left with the symbols on the left!

4. Guide to memory: In the contravariant objects, the barred x goes on top;
in covariant vectors, on the bottom. In both cases, the non-barred indices
match.

Note From now on, all scalar and vector fields are assumed smooth.

Question Geometrically, a contravariant vector is a vector that is tangent
to the manifold. How do we think of a covariant vector?

Answer The key to the answer is this:

Definition 4.6 A smooth 1-form, or a smooth cotangent vector
field on the manifold M (or on an open subset U of M) is a function F
that assigns to each tangent vector field V on M (or on the subset U) a
scalar field F(V) which is smooth (in the sense that F converts smooth
vector fields to smooth scalar functions), and which has the following
properties:

$F(V + W) = F(V) + F(W)$
$F(\alpha V) = \alpha F(V)$

for every pair of tangent vector fields V and W, and every scalar $\alpha$. (In
the language of linear algebra, this says that F is a linear transformation
from the vector space of smooth tangent vector fields on M to the
vector space of smooth scalar fields on M.)


Proposition 4.7 (Covariant Fields are One-Form Fields)

There is a one-to-one correspondence between covariant vector fields on
M (or U) and 1-forms on M (or U). Thus, we can think of covariant
tangent fields as nothing more than 1-forms.

Click here for a proof

Examples 4.8
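(a) Let $M = S^1$ with the charts:

$$ x = \arg(z), \quad \bar{x} = \arg(-z) $$

discussed in Lecture 2. There, we saw that the change-of-coordinate maps
are given by

$$ x = \begin{cases} \bar{x} + \pi & \text{if } \bar{x} < \pi \\ \bar{x} - \pi & \text{if } \bar{x} > \pi \end{cases} \qquad \bar{x} = \begin{cases} x + \pi & \text{if } x < \pi \\ x - \pi & \text{if } x > \pi \end{cases} $$

with $\partial \bar{x}/\partial x = \partial x/\partial \bar{x} = 1$,

so that the changes of coordinates do nothing. It follows that functions C and
$\bar{C}$ specify a covariant vector field iff $C = \bar{C}$. (Then they are automatically a
contravariant field as well.) For example, let

$$ C(x) = 1 = \bar{C}(\bar{x}). $$

This field circulates around $S^1$. On the other hand, we could define

$$ C(x) = \sin x \quad\text{and}\quad \bar{C}(\bar{x}) = -\sin\bar{x} = \sin x. $$

This field is illustrated in the following figure.

(The length of the vector at the point $e^{i\theta}$ is given by $\sin\theta$.)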

(b) Let $\phi$ be a scalar field. Its ambient gradient, grad $\phi$, is given by


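$$ \text{grad}\,\phi = \left[ \frac{\partial\phi}{\partial y_1}, \frac{\partial\phi}{\partial y_2}, \ldots, \frac{\partial\phi}{\partial y_s} \right], $$

that is, the garden-variety gradient you learned about in calculus. This
gradient is, in general, neither covariant nor contravariant. However, we can
use it to obtain a 1-form as follows: If V is any contravariant vector field,
then the rate of change of $\phi$ along V is given by $V \cdot \text{grad}\,\phi$. (If V happens to
be a unit vector at some point, then this is the directional derivative at that
point.) In other words, dotting with grad $\phi$ assigns to each contravariant
vector field V the scalar field $F(V) = V \cdot \text{grad}\,\phi$, which tells us how fast $\phi$ is
changing along V. We also get the 1-form identities:

$F(V + W) = F(V) + F(W)$
$F(\alpha V) = \alpha F(V)$.

The coordinates of the corresponding covariant vector field are

$$ F\!\left(\frac{\partial}{\partial x^i}\right) = \frac{\partial}{\partial x^i} \cdot \text{grad}\,\phi = \left[ \frac{\partial y_1}{\partial x^i}, \frac{\partial y_2}{\partial x^i}, \ldots, \frac{\partial y_s}{\partial x^i} \right] \cdot \left[ \frac{\partial\phi}{\partial y_1}, \frac{\partial\phi}{\partial y_2}, \ldots, \frac{\partial\phi}{\partial y_s} \right] = \frac{\partial\phi}{\partial x^i}, $$

which is the example that first motivated the definition.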

(c) Generalizing (b), let $\Sigma$ be any smooth vector field in $E^s$ defined on an
open set containing M itself. Then the operation of dotting with $\Sigma$ is a linear
function from smooth tangent fields on M to smooth scalar fields. Thus, by
the proof of Proposition 4.7, it is a cotangent field on M with local
coordinates given by applying the linear function to the canonical fields
$\partial/\partial x^i$:

$$ C_i = \frac{\partial}{\partial x^i} \cdot \Sigma. $$

The gradient is an example of this, since we are taking $\Sigma = \text{grad}\,\phi$
in the preceding example.

Note that dotting with $\Sigma$ depends only on the tangent component of $\Sigma$. This
leads us to the (very important!) next example.

(d) If V is any tangent (contravariant) field, then we can appeal to (c) above
and obtain an associated covariant field. The coordinates of this field are not


the same as those of V. To find them, we write:

$$ V = V^i\, \frac{\partial}{\partial x^i} \quad \text{(See Note 4.2 (4).)} $$

Hence, the local coordinates are

$$ C_j = \frac{\partial}{\partial x^j} \cdot V = V^i\, \frac{\partial}{\partial x^j} \cdot \frac{\partial}{\partial x^i}. $$

Question The vectors $\partial/\partial x^i$ are mutually orthogonal, so that the last dot
product is just $\delta_{ij}$, right?

Answer Wrong! The tangent vectors $\partial/\partial x^i$ are not necessarily orthogonal in
general (look at the picture of these fields from earlier in this lecture), so
the dot products don't behave as simply as we might suspect.

Instead, we can define certain functions $g_{ij}$ by

$$ g_{ij} = \frac{\partial}{\partial x^i} \cdot \frac{\partial}{\partial x^j} $$

so that

$$ C_j = g_{ij} V^i $$

gives the correct relation between the coordinates of a covariant vector field and
the corresponding contravariant vector field. (Note how the indices cancel
to leave us with a lowered index.) We shall see the quantities $g_{ij}$ again
presently. One last thing:

Definition 4.9 If V and W are contravariant (or covariant) vector fields
on M, and if $\alpha$ is a real number, we can define new fields V + W and $\alpha V$ by

$(V + W)^i = V^i + W^i$
and $(\alpha V)^i = \alpha V^i$.

It is easily verified that the resulting quantities are again contravariant
(or covariant) fields.

These operations turn the set of all smooth contravariant (or covariant)
fields on M into a vector space. Note that we cannot expect to obtain a
vector field by adding a covariant field to a contravariant field.
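Returning to Example 4.8(d), the relation $C_j = g_{ij}V^i$ can be tried out on a concrete surface. The sketch below is an illustration only: the surface $y = (x^1, x^2, (x^1)^2 + (x^2)^2)$ is an assumed example, not one taken from these notes, chosen because its basis vectors $\partial/\partial x^1$ and $\partial/\partial x^2$ are not orthogonal.

```python
def basis(x1, x2):
    """Ambient coordinates of d/dx^1 and d/dx^2 on the surface y = (x1, x2, x1^2 + x2^2)."""
    return [(1.0, 0.0, 2.0 * x1), (0.0, 1.0, 2.0 * x2)]

def dot(u, v):
    """Ordinary dot product in the ambient space."""
    return sum(a * b for a, b in zip(u, v))

def metric(x1, x2):
    """g_ij = (d/dx^i) . (d/dx^j)."""
    e = basis(x1, x2)
    return [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]

def lower(V, x1, x2):
    """Covariant coordinates C_j = g_ij V^i of the contravariant vector V."""
    g = metric(x1, x2)
    return tuple(sum(g[i][j] * V[i] for i in range(2)) for j in range(2))
```

At the point with local coordinates (1, 0.5) the off-diagonal entry $g_{12}$ is 2 rather than 0, so the covariant coordinates of a vector differ from its contravariant ones.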

Exercise Set 4

1. Suppose that $X^j$ is a contravariant vector field on the manifold M with the


following property: at every point m of M, there exists a local coordinate
system $x^i$ at m with $X^j(x^1, x^2, \ldots, x^n) = 0$. Show that $X^i$ is identically zero in
any coordinate system.

2. Give an example of a contravariant vector field that is not covariant.
Justify your claim.

3. Verify the following claim: If V and W are contravariant (or covariant)
vector fields on M, and if $\alpha$ is a real number, then V + W and $\alpha V$ are again
contravariant (or covariant) vector fields on M.

4. Verify the following claim in the proof of Proposition 4.7: If $C_i$ is covariant
and $V^j$ is contravariant, then $C_kV^k$ is a scalar.

5. Let $\phi: S^n \to E^1$ be the scalar
field defined by $\phi(p_1, p_2, \ldots, p_{n+1}) = p_{n+1}$.
(a) Express $\phi$ as a function of the $x^i$ and as a function of the $\bar{x}^j$.
(b) Calculate $C_i = \partial\phi/\partial x^i$ and $\bar{C}_j = \partial\phi/\partial \bar{x}^j$.
(c) Verify that $C_i$ and $\bar{C}_j$ transform according to the covariant vector
transformation rules.

6. Is it true that the quantities $x^i$ themselves form a contravariant vector
field? Prove or give a counterexample.

7. Prove that the two correspondences constructed in the proof of Proposition 4.7 are inverse functions.

8. Prove: Every covariant vector field is of the type given in Example 4.8(d).
That is, it is obtained from the dot product with some contravariant field.

Tensor Fields http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 5: Tensor Fields



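Lecture 4 described vector fields on manifolds. We now look at tensors on
smooth manifolds.

Suppose that $v = (v^1, v^2, v^3)$ and $w = (w^1, w^2, w^3)$ are vector fields on $E^3$.
Then their tensor product is defined to consist of the nine quantities $v^iw^j$.

Let us see how such things transform. Thus, let V and W be contravariant,
and let C and D be covariant. Then:

$$ \bar{V}^i \bar{W}^j = \frac{\partial \bar{x}^i}{\partial x^k} V^k\, \frac{\partial \bar{x}^j}{\partial x^m} W^m = \frac{\partial \bar{x}^i}{\partial x^k}\, \frac{\partial \bar{x}^j}{\partial x^m}\, V^k W^m, $$

and similarly,

$$ \bar{V}^i \bar{C}_j = \frac{\partial \bar{x}^i}{\partial x^k}\, \frac{\partial x^m}{\partial \bar{x}^j}\, V^k C_m, $$

and

$$ \bar{C}_i \bar{D}_j = \frac{\partial x^k}{\partial \bar{x}^i}\, \frac{\partial x^m}{\partial \bar{x}^j}\, C_k D_m. $$

We call these product fields "tensors" of type (2, 0), (1, 1), and (0, 2)
respectively.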

Definition 5.1 A tensor field of type (2, 0) on the n-dimensional
smooth manifold M associates with each chart x a collection of $n^2$
smooth functions $T^{ij}(x^1, x^2, \ldots, x^n)$ which satisfy the transformation
rules shown below. Similarly, we define tensor fields of type (0, 2), (1,
1), and, more generally, a tensor field of type (m, n).

Some Tensor Transformation Rules

Type (2, 0): $\bar{T}^{ij} = \dfrac{\partial \bar{x}^i}{\partial x^k}\, \dfrac{\partial \bar{x}^j}{\partial x^m}\, T^{km}$ "contravariant rank 2"


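Type (1, 1): $\bar{M}^i_j = \dfrac{\partial \bar{x}^i}{\partial x^k}\, \dfrac{\partial x^m}{\partial \bar{x}^j}\, M^k_m$ "mixed, with contravariant rank 1 and covariant rank 1"

Type (0, 2): $\bar{N}_{ij} = \dfrac{\partial x^k}{\partial \bar{x}^i}\, \dfrac{\partial x^m}{\partial \bar{x}^j}\, N_{km}$ "covariant rank 2"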

Note A tensor field of type (1, 0) is just a contravariant vector field, while a
tensor field of type (0, 1) is a covariant vector field. Similarly, a tensor field
of type (0, 0) is a scalar field. Type (1, 1) tensors correspond to linear
transformations in linear algebra.

Examples 5.2

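(a) Of course, by definition, we can take tensor products of vector fields to
obtain tensor fields, as we did above, just before Definition 5.1.

(b) The Kronecker Delta Tensor, given by

$$ \delta^i_j = \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \neq i \end{cases} $$

is, in fact, a tensor field of type (1, 1). Indeed, one has

$$ \delta^i_j = \frac{\partial x^i}{\partial x^j}, $$

and the latter quantities transform according to the rule

$$ \bar{\delta}^i_j = \frac{\partial \bar{x}^i}{\partial \bar{x}^j} = \frac{\partial \bar{x}^i}{\partial x^k}\, \frac{\partial x^k}{\partial x^m}\, \frac{\partial x^m}{\partial \bar{x}^j} = \frac{\partial \bar{x}^i}{\partial x^k}\, \frac{\partial x^m}{\partial \bar{x}^j}\, \delta^k_m, $$

whence they constitute a tensor field of type (1, 1).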

Question OK, so is this how it works: Given a point p of the manifold and a
chart x at p, this strange object assigns the $n^2$ quantities $\delta^i_j$; that is, the
identity matrix, regardless of the chart we chose?
Answer Yes.

Question But how can we interpret this strange object?

Answer Just as a covariant vector field converts contravariant fields into
scalars (see Lecture 4), we shall see that a type (1, 1) tensor converts
contravariant fields to other contravariant fields. This particular tensor does


nothing: put in a specific vector field V, out comes the same vector field. In
other words, it is the identity transformation.

Notes

1. $\delta^i_j$ is independent of the chart used (the coordinates are the same as the
barred coordinates). Also, $\delta^i_j = \delta^j_i$. That is, it is a symmetric tensor.

2. $\delta^i_j\, \delta^j_k = \dfrac{\partial x^i}{\partial x^j}\, \dfrac{\partial x^j}{\partial x^k} = \dfrac{\partial x^i}{\partial x^k} = \delta^i_k$

(c) We can make new tensor fields out of old ones by taking products of
existing tensor fields in various ways. For example,

$M^{ijk} N_{pqrs}$ is a tensor of type (3, 4),

while

$M^{ijk} N_{jkrs}$ is a tensor of type (1, 2).

Specific examples of these involve the Kronecker delta, and are in the
homework.

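(d) If X is a contravariant vector field, then the functions $\partial X^i/\partial x^j$ do not define
a tensor. Indeed, let us check the transformation rule directly:

$$ \frac{\partial \bar{X}^i}{\partial \bar{x}^j} = \frac{\partial}{\partial \bar{x}^j}\left( X^k\, \frac{\partial \bar{x}^i}{\partial x^k} \right) = \frac{\partial}{\partial x^h}\left( X^k\, \frac{\partial \bar{x}^i}{\partial x^k} \right) \frac{\partial x^h}{\partial \bar{x}^j} $$

$$ = \frac{\partial X^k}{\partial x^h}\, \frac{\partial \bar{x}^i}{\partial x^k}\, \frac{\partial x^h}{\partial \bar{x}^j} + X^k\, \frac{\partial^2 \bar{x}^i}{\partial x^h\, \partial x^k}\, \frac{\partial x^h}{\partial \bar{x}^j}. $$

The extra term on the right violates the transformation rules.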
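A concrete instance of this failure can be computed directly. The sketch below is an illustration under stated assumptions (it is not from the original notes): on $E^2$, take the field whose Cartesian components are the constants (1, 0), so that every $\partial X^i/\partial x^j$ vanishes in the Cartesian chart. If these partials formed a tensor, they would vanish in the polar chart $\bar{x} = (r, \theta)$ as well.

```python
import math

def X_bar(r, th):
    """Polar components of the field with constant Cartesian components (1, 0).

    X-bar^i = d(x-bar^i)/d(x^k) X^k, with d(r)/dx = cos(theta)
    and d(theta)/dx = -sin(theta)/r.
    """
    return (math.cos(th), -math.sin(th) / r)

def partials_bar(r, th, h=1e-6):
    """Matrix of d(X-bar^i)/d(x-bar^j), by central differences."""
    d_r = [(a - b) / (2 * h) for a, b in zip(X_bar(r + h, th), X_bar(r - h, th))]
    d_th = [(a - b) / (2 * h) for a, b in zip(X_bar(r, th + h), X_bar(r, th - h))]
    return [[d_r[0], d_th[0]], [d_r[1], d_th[1]]]
```

The entry $\partial \bar{X}^1/\partial \bar{x}^2$ comes out as $-\sin\theta$, which is non-zero in general; this is the contribution of the extra second-derivative term.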

We will see more interesting examples later.

Proposition 5.3 (If It Looks Like a Tensor, It Is a Tensor)

Suppose that we are given smooth local functions $g_{ij}$ with the property
that for every pair of contravariant vector fields $X^i$ and $Y^i$, the smooth
functions $g_{ij}X^iY^j$ determine a scalar field. Then the $g_{ij}$ determine a
smooth tensor field of type (0, 2).


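Proof Since the $g_{ij}X^iY^j$ form a scalar field, we must have

$$ \bar{g}_{ij}\bar{X}^i\bar{Y}^j = g_{hk}X^hY^k. $$

On the other hand,

$$ \bar{g}_{ij}\bar{X}^i\bar{Y}^j = \bar{g}_{ij}\, \frac{\partial \bar{x}^i}{\partial x^h}\, \frac{\partial \bar{x}^j}{\partial x^k}\, X^hY^k $$

by the transformation rules for contravariant vectors. Equating the
right-hand sides gives

$$ g_{hk}X^hY^k = \bar{g}_{ij}\, \frac{\partial \bar{x}^i}{\partial x^h}\, \frac{\partial \bar{x}^j}{\partial x^k}\, X^hY^k \qquad \text{(I)} $$

Now, if we could only cancel the terms $X^hY^k$! Well, choose a point $m \in M$. It
suffices to show that

$$ g_{hk} = \bar{g}_{ij}\, \frac{\partial \bar{x}^i}{\partial x^h}\, \frac{\partial \bar{x}^j}{\partial x^k}, $$

when evaluated at the coordinates of m. However, by Example 4.3(c), we
can arrange for vector fields X and Y such that

$$ X^i(\text{coordinates of } m) = \begin{cases} 1 & \text{if } i = h \\ 0 & \text{if } i \neq h \end{cases} $$

and

$$ Y^i(\text{coordinates of } m) = \begin{cases} 1 & \text{if } i = k \\ 0 & \text{if } i \neq k \end{cases} $$

Substituting these into equation (I) now gives the required transformation
rule.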

Example 5.4 Metric Tensor

Define a set of quantities $g_{ij}$ by

$$ g_{ij} = \frac{\partial}{\partial x^i} \cdot \frac{\partial}{\partial x^j}. $$

Question Why should I believe that this is a tensor?
Answer Let us invoke the above proposition:


If $X^i$ and $Y^j$ are any contravariant fields on M,
then their dot product $X \cdot Y$ is a scalar, and

$$ X \cdot Y = X^i\, \frac{\partial}{\partial x^i} \cdot Y^j\, \frac{\partial}{\partial x^j} = g_{ij}X^iY^j. $$

Thus, by Proposition 5.3, it is a type (0, 2) tensor. We call this tensor "the
metric tensor inherited from the imbedding of M in $E^s$."

Exercise Set 5

1. Compute the transformation rules for each of the following, and hence
decide whether or not they are tensors. Sub-and superscripted quantities
(other than coordinates) are understood to be tensors.

dX ij xi Xi 2 2 p
(a) (b) j (c) j (d) i j (e) xi j
dt x x x x x x

2. (Rund, p. 95 #3.4) Show that if $A_j$ is a type (0, 1) tensor, then

$$ \frac{\partial A_k}{\partial x^h} - \frac{\partial A_h}{\partial x^k} $$

is a type (0, 2) tensor.

3. Show that, if M and N are tensors of type (1, 1), then:

(a) $M^i_j N^p_q$ is a tensor of type (2, 2)
(b) $M^i_j N^j_q$ is a tensor of type (1, 1)
(c) $M^i_j N^j_i$ is a tensor of type (0, 0) (that is, a scalar field)

4. Let X be a contravariant vector field, and suppose that M is such that all
change-of-coordinate maps have the form $\bar{x}^i = a^i_j x^j + k^i$ for certain constants
$a^i_j$ and $k^i$. (We call such a manifold affine.) Show that the functions $\partial X^i/\partial x^j$
define a tensor field of type (1, 1).

5. (Rund, p. 96, 3.12) If $B^{ijk} = -B^{jki}$, show that $B^{ijk} = 0$. Deduce that any
type (3, 0) tensor that is symmetric on the first pair of indices and
skew-symmetric on the last pair of indices vanishes.

6. (Rund, p. 96, 3.16) If $A_{kj}$ is a skew-symmetric tensor of type (0, 2), show
that the quantities $B_{rst}$ defined by

$$ B_{rst} = \frac{\partial A_{st}}{\partial x^r} + \frac{\partial A_{tr}}{\partial x^s} + \frac{\partial A_{rs}}{\partial x^t} $$

(a) are the components of a tensor; and
(b) are skew-symmetric in all pairs of indices.
(c) How many independent components does $B_{rst}$ have?

8. Suppose that $C^{ij}$ is a type (2, 0) tensor, and that, regarded as an $n \times n$
matrix C, it happens to be invertible in every coordinate system. Define a
new collection of functions $D_{ij}$ by taking

$$ D_{ij} = (C^{-1})_{ij}, $$

the ij-th entry of $C^{-1}$ in every coordinate system. Show that $D_{ij}$ is a type (0,
2) tensor. [Hint: Write down the transformation equation for $C^{ij}$ and invert
everything in sight.]

Riemannian Manifolds http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 6: Riemannian Manifolds



In the last lecture, we saw how the scalar product in $E^s$ gave rise to a type
(0, 2) covariant tensor field $g_{ij}$. Here, we generalize this concept.

Definition 6.1 A smooth inner product on a manifold M is a function
$\langle -, - \rangle$ that associates to each pair of smooth contravariant vector fields X
and Y a scalar (field) $\langle X, Y \rangle$, satisfying the following properties.

symmetry: $\langle X, Y \rangle = \langle Y, X \rangle$ for all X and Y,

bilinearity: $\langle aX, bY \rangle = ab\langle X, Y \rangle$ for all X and Y, and scalars a and b,
$\langle X, Y + Z \rangle = \langle X, Y \rangle + \langle X, Z \rangle$,
$\langle X + Y, Z \rangle = \langle X, Z \rangle + \langle Y, Z \rangle$.

non-degeneracy: If $\langle X, Y \rangle = 0$ for every Y, then X = 0.

We also call such a gizmo a symmetric bilinear form. A manifold
endowed with a smooth inner product is called a Riemannian
manifold.

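Before we look at some examples, let us see how these things can be
specified. First, notice that, if x is any chart, and p is any point in the
domain of x, then

$$ \langle X, Y \rangle = X^iY^j \left\langle \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j} \right\rangle. $$

This gives us smooth functions

$$ g_{ij} = \left\langle \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j} \right\rangle $$

such that

$$ \langle X, Y \rangle = g_{ij}X^iY^j $$

and which, by Proposition 5.3, constitute the coefficients of a type (0, 2)
symmetric tensor. We call this tensor the fundamental tensor or metric
tensor of the Riemannian manifold.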

Examples 6.2


(a) Take $M = E^n$, with the usual inner product; we find $g_{ij} = \delta_{ij}$.

(b) (Minkowski Metric) $M = E^4$, with $g_{ij}$ given by the matrix

$$ G = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -c^2 \end{pmatrix}, $$

where c is the speed of light.

Question How does this affect the length of vectors?

Answer We saw in Lecture 3 that, in $E^n$, we could think of tangent vectors
in the usual way, as directed line segments starting at the origin. The role
that the metric plays is that it tells you the length of a vector; in other
words, it gives you a new distance formula:

Euclidean 3-space: $d(x, y) = [(y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2]^{1/2}$

Minkowski 4-space: $d(x, y) = [(y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2 - c^2(y_4 - x_4)^2]^{1/2}$

Geometrically, the set of all points in Euclidean 3-space at a distance r from
the origin (or any other point) is a sphere of radius r. In Minkowski space, it
is a hyperbolic surface. In Euclidean space, the set of all points a distance of
0 from the origin is just a single point; in M, it is a cone, called the light
cone. (See the figure.)

Minkowski 4-Space


(c) If M is any manifold embedded in $E^s$, then we have seen above that M
inherits the structure of a Riemannian metric from a given inner product on
$E^s$. In particular, if M is any 3-dimensional manifold embedded in $E^4$ with
the metric shown above, then M inherits such an inner product.

(d) As a particular example of (c), let us calculate the metric of the
two-sphere $M = S^2$, with radius r, using polar coordinates $x^1 = \theta$, $x^2 = \phi$. To
find the coordinates of $g_{**}$ we need to calculate the inner product of the
basis vectors $\partial/\partial x^1$, $\partial/\partial x^2$ in the ambient space $E^s$. We saw in Section 3 that
the ambient coordinates of $\partial/\partial x^i$ are given by

$$ j\text{-th coordinate} = \frac{\partial y_j}{\partial x^i} $$

where

$y_1 = r\sin(x^1)\cos(x^2)$
$y_2 = r\sin(x^1)\sin(x^2)$
$y_3 = r\cos(x^1)$

Thus,

$$ \frac{\partial}{\partial x^1} = r(\cos(x^1)\cos(x^2),\; \cos(x^1)\sin(x^2),\; -\sin(x^1)) $$

$$ \frac{\partial}{\partial x^2} = r(-\sin(x^1)\sin(x^2),\; \sin(x^1)\cos(x^2),\; 0) $$


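This gives

$g_{11} = \langle \partial/\partial x^1, \partial/\partial x^1 \rangle = r^2$
$g_{22} = \langle \partial/\partial x^2, \partial/\partial x^2 \rangle = r^2\sin^2(x^1)$
$g_{12} = \langle \partial/\partial x^1, \partial/\partial x^2 \rangle = 0$,

so that

$$ g_{**} = \begin{pmatrix} r^2 & 0 \\ 0 & r^2\sin^2(x^1) \end{pmatrix}. $$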
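The same computation can be carried out numerically for any embedded surface. The following sketch (an illustration, not part of the original notes) approximates the ambient coordinates $\partial y_j/\partial x^i$ by central differences and forms the dot products; the function names are hypothetical.

```python
import math

def sphere(x, r=1.0):
    """Ambient coordinates (y_1, y_2, y_3) of the point with polar coordinates x = (x^1, x^2)."""
    return (r * math.sin(x[0]) * math.cos(x[1]),
            r * math.sin(x[0]) * math.sin(x[1]),
            r * math.cos(x[0]))

def basis_vector(embed, x, i, h=1e-6):
    """Ambient coordinates d(y_j)/d(x^i) of d/dx^i, by central differences."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(embed(xp), embed(xm))]

def metric(embed, x):
    """g_ij = <d/dx^i, d/dx^j>, using the ambient Euclidean dot product."""
    n = len(x)
    e = [basis_vector(embed, x, i) for i in range(n)]
    return [[sum(a * b for a, b in zip(e[i], e[j])) for j in range(n)]
            for i in range(n)]
```

For a sphere of radius 2 at $x = (0.7, 1.1)$, the result matches the matrix above up to finite-difference error: $g_{11} \approx 4$, $g_{22} \approx 4\sin^2(0.7)$, and $g_{12} \approx 0$.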

(e) The n-Dimensional Sphere Let M be the n-sphere of radius r with
generalized polar coordinates.

$y_1 = r\cos x^1$
$y_2 = r\sin x^1\cos x^2$
$y_3 = r\sin x^1\sin x^2\cos x^3$
...
$y_{n-1} = r\sin x^1\sin x^2\sin x^3\sin x^4 \cdots \cos x^{n-1}$
$y_n = r\sin x^1\sin x^2\sin x^3\sin x^4 \cdots \sin x^{n-1}\cos x^n$
$y_{n+1} = r\sin x^1\sin x^2\sin x^3\sin x^4 \cdots \sin x^{n-1}\sin x^n$.

(Notice that x1 is playing the role of and the x2, x3, . . . , xn-1 the role of .)
Following the line of reasoning in the previous example, we have

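$$ \frac{\partial}{\partial x^1} = (-r\sin x^1,\; r\cos x^1\cos x^2,\; r\cos x^1\sin x^2\cos x^3,\; \ldots,\; r\cos x^1\sin x^2 \cdots \sin x^{n-1}\cos x^n,\; r\cos x^1\sin x^2 \cdots \sin x^{n-1}\sin x^n) $$

$$ \frac{\partial}{\partial x^2} = (0,\; -r\sin x^1\sin x^2,\; \ldots,\; r\sin x^1\cos x^2\sin x^3 \cdots \sin x^{n-1}\cos x^n,\; r\sin x^1\cos x^2\sin x^3 \cdots \sin x^{n-1}\sin x^n) $$

$$ \frac{\partial}{\partial x^3} = (0,\; 0,\; -r\sin x^1\sin x^2\sin x^3,\; r\sin x^1\sin x^2\cos x^3\cos x^4,\; \ldots,\; r\sin x^1\sin x^2\cos x^3\sin x^4 \cdots \sin x^{n-1}\cos x^n,\; r\sin x^1\sin x^2\cos x^3\sin x^4 \cdots \sin x^{n-1}\sin x^n) $$

and so on. This gives

$g_{11} = \langle \partial/\partial x^1, \partial/\partial x^1 \rangle = r^2$
$g_{22} = \langle \partial/\partial x^2, \partial/\partial x^2 \rangle = r^2\sin^2 x^1$
$g_{33} = \langle \partial/\partial x^3, \partial/\partial x^3 \rangle = r^2\sin^2 x^1\sin^2 x^2$
...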


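$g_{nn} = \langle \partial/\partial x^n, \partial/\partial x^n \rangle = r^2\sin^2 x^1\sin^2 x^2 \cdots \sin^2 x^{n-1}$
$g_{ij} = 0$ if $i \neq j$

so that

$$ g_{**} = \begin{pmatrix} r^2 & 0 & 0 & \cdots & 0 \\ 0 & r^2\sin^2 x^1 & 0 & \cdots & 0 \\ 0 & 0 & r^2\sin^2 x^1\sin^2 x^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & r^2\sin^2 x^1\sin^2 x^2 \cdots \sin^2 x^{n-1} \end{pmatrix}. $$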

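(f) Diagonalizing the Metric Let G be the matrix of $g_{**}$ in some local
coordinate system, evaluated at some point p on a Riemannian manifold.
Since G is symmetric (and invertible, by non-degeneracy), it follows from
linear algebra that there is an invertible matrix $P = (P_{ji})$ such that

$$ PGP^T = \begin{pmatrix} \pm 1 & 0 & \cdots & 0 \\ 0 & \pm 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \pm 1 \end{pmatrix} $$

at the point p. Let us call the sequence $(\pm 1, \pm 1, \ldots, \pm 1)$ the signature of
the metric at p. (Thus, in particular, the Minkowski metric has signature (1,
1, 1, -1).) If we now define new coordinates $\bar{x}^j$ by

$$ x^i = \bar{x}^j P_{ji}, $$

(so that we are using the inverse of P for this) then $\partial x^i/\partial \bar{x}^j = P_{ji}$, and so

$$ \bar{g}_{ij} = \frac{\partial x^a}{\partial \bar{x}^i}\, g_{ab}\, \frac{\partial x^b}{\partial \bar{x}^j} = P_{ia}\, g_{ab}\, P_{jb} = P_{ia}\, g_{ab}\, (P^T)_{bj} = (PGP^T)_{ij}, $$

showing that, at the point p,

$$ \bar{g}_{**} = \begin{pmatrix} \pm 1 & 0 & \cdots & 0 \\ 0 & \pm 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \pm 1 \end{pmatrix}. $$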


Thus, in the eyes of the metric, the unit basis vectors $e_i = \partial/\partial \bar{x}^i$ are
orthogonal; that is,

$$ \langle e_i, e_j \rangle = \pm\delta_{ij}. $$
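In the special case where G is already diagonal, a suitable P is just a rescaling of the coordinate axes. The following sketch (an illustration only; the general case needs a full diagonalization, which is not attempted here) computes $PGP^T$ for such a P. The helper names and the placeholder value of c are assumptions made for the example.

```python
def congruence(P, G):
    """Return the matrix P G P^T."""
    n = len(G)
    PG = [[sum(P[i][a] * G[a][b] for a in range(n)) for b in range(n)]
          for i in range(n)]
    return [[sum(PG[i][b] * P[j][b] for b in range(n)) for j in range(n)]
            for i in range(n)]

def rescaling(G):
    """P = diag(1 / sqrt|g_ii|) for a diagonal G with non-zero diagonal entries."""
    n = len(G)
    return [[(abs(G[i][i]) ** -0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)]

c = 3.0  # placeholder value standing in for the speed of light
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, -c * c]]
diagonalized = congruence(rescaling(G), G)
```

For the Minkowski matrix this rescales the fourth coordinate by 1/c and yields the diagonal matrix with entries 1, 1, 1, -1, in agreement with the signature stated above.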

Note The non-degeneracy condition in Definition 6.1 is equivalent to the
requirement that the locally defined quantities

$$ g = \det(g_{ij}) $$

are nowhere zero.

Here are some things we can do with a Riemannian manifold.

Definition 6.3 If X is a contravariant vector field on M, then define the
square norm of X by

$$ \|X\|^2 = \langle X, X \rangle = g_{ij}X^iX^j. $$

Note that $\|X\|^2$ may be negative. If $\|X\|^2 < 0$, we call X timelike; if
$\|X\|^2 > 0$, we call X spacelike, and if $\|X\|^2 = 0$, we call X null. If X is
not timelike, then we can define

$$ \|X\| = (\|X\|^2)^{1/2} = (g_{ij}X^iX^j)^{1/2}. $$

In the exercise set you will show that null need not imply zero.
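The three cases are easy to experiment with. Here is a minimal sketch (not from the original notes) that evaluates $g_{ij}X^iX^j$ at a single point for a metric given as a matrix, using the Minkowski matrix of Example 6.2(b); setting c = 1 is an assumption made purely for readability.

```python
def square_norm(X, g):
    """||X||^2 = g_ij X^i X^j for a metric given as a square matrix g."""
    n = len(X)
    return sum(g[i][j] * X[i] * X[j] for i in range(n) for j in range(n))

def classify(X, g):
    """Return 'timelike', 'spacelike' or 'null' according to the sign of ||X||^2."""
    s = square_norm(X, g)
    if s < 0:
        return "timelike"
    if s > 0:
        return "spacelike"
    return "null"

c = 1.0  # units in which the speed of light is 1 (an assumption)
G = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -c * c]]
```

With this matrix, a vector pointing along the first axis is spacelike, one pointing along the fourth axis is timelike, and vectors on the light cone are null.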

Note Since $\langle X, X \rangle$ is a scalar field, so is $\|X\|$, if it exists, and it
satisfies $\|\phi X\| = |\phi| \cdot \|X\|$ for every contravariant vector field X and every


scalar field $\phi$. The expected inequality

$$ \|X + Y\| \le \|X\| + \|Y\| $$

need not hold. (See the exercises.)

Arc Length One of the things we can do with a metric is the following. A
path C given by $x^i = x^i(t)$ is non-null if $\|dx^i/dt\|^2 \neq 0$. It follows, by
continuity along the path, that $\|dx^i/dt\|^2$ is either always positive
("spacelike") or always negative ("timelike").

Definition 6.4 If C is a non-null path in M, then define its length as
follows: Break the path into segments S, each of which lies in some
coordinate neighborhood, and define the length of S by

$$ L(a, b) = \int_a^b \left( \pm g_{ij}\, \frac{dx^i}{dt}\, \frac{dx^j}{dt} \right)^{1/2} dt $$

where the sign ±1 is chosen as +1 if the curve is space-like and -1 if it is
time-like. In other words, we are defining the arc-length differential
form by

$$ ds^2 = \pm g_{ij}\, dx^i\, dx^j. $$

To show (as we must) that this definition is independent of the choice of
chart x, all we need observe is that the quantity under the square root sign,
being a contraction product of a type (0, 2) tensor with a type (2, 0) tensor,
is a scalar.
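The length integral can be approximated numerically. The sketch below (an illustration under stated assumptions, not part of the original notes) handles only space-like paths, uses a simple midpoint rule, and applies it to a latitude circle $x^1 = \theta_0$ on the unit two-sphere with the metric of Example 6.2(d). All function names are hypothetical.

```python
import math

def arc_length(path_velocity, metric, a, b, steps=10000):
    """Midpoint-rule approximation of L(a, b) for a space-like path.

    path_velocity(t) returns the pair (x(t), dx/dt); metric(x) returns
    the matrix g_ij at the point with local coordinates x.
    """
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        x, v = path_velocity(t)
        g = metric(x)
        n = len(v)
        s = sum(g[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
        total += math.sqrt(s) * h
    return total

def sphere_metric(x, r=1.0):
    """g_** for the two-sphere of radius r in polar coordinates."""
    return [[r * r, 0.0], [0.0, (r * math.sin(x[0])) ** 2]]

def latitude_circle(t, theta0=math.pi / 3):
    """The path x^1 = theta0 (constant), x^2 = t, together with its velocity."""
    return (theta0, t), (0.0, 1.0)
```

Integrating from 0 to $2\pi$ gives the circumference $2\pi\sin\theta_0$ of the latitude circle, as expected from elementary geometry.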

Proposition 6.5 (Parameterization by Arc Length)

Let C be a non-null path $x^i = x^i(t)$ in M. Fix a point t = a on this path,
and define a new function s (arc length) by s(t) = L(a, t) = length of
path from t = a to t. Then s is an invertible function of t, and, using s as
a parameter, $\|dx^i/ds\|^2$ is constant, and equals 1 if C is space-like and -1
if it is time-like.

Conversely, if t is any parameter with the property that $\|dx^i/dt\|^2 = \pm 1$,
then, choosing any parameter value t = a in the above definition of
arc-length s, we have

$$ t = \pm s + C $$

7 of 9 10/08/2010 05:09 PM
Riemannian Manifolds http://people.hofstra.edu/stefan_waner/diff_geom...

for some constant C. (In other words, t must be, up to a constant, arc
length.

Physicists call the parameter = s/c, where c is the speed of light,


proper time for reasons we shall see below.)

Click here for a proof.

Exercise Set 6

1. Give an example of a Riemannian metric on $E_2$ such that the
corresponding metric tensor $g_{ij}$ is not constant.

2. Let $a_{ij}$ be the components of any symmetric tensor of type (0, 2) such that
$\det(a_{ij})$ is never zero. Define

\[ \langle X, Y\rangle_a = a_{ij}X^iY^j. \]

Show that this is a smooth inner product on M.

3. Give an example to show that the triangle inequality
$\|X + Y\| \le \|X\| + \|Y\|$ is not always true on a Riemannian manifold.

4. Give an example of a Riemannian manifold M and a nowhere zero vector
field X on M with the property that $\|X\| = 0$. We call such a field a null
field.

5. Show that if g is any smooth type (0, 2) tensor field, and if $g = \det(g_{ij}) \neq 0$
for some chart x, then $\bar g = \det(\bar g_{ij}) \neq 0$ for every other chart $\bar x$ (at points where
the change-of-coordinates is defined). [Use the property that, if A and B are
matrices, then det(AB) = det(A)det(B).]

6. Suppose that $g_{ij}$ is a type (0, 2) tensor with the property that $g = \det(g_{ij})$
is nowhere zero. Show that the resulting inverse (of matrices) $g^{ij}$ is a type
(2, 0) tensor. (Note that it must satisfy $g^{ij}g_{jk} = \delta^i_k$.)

7. (Index lowering and raising) Show that, if $R_{abc}$ is a type (0, 3) tensor,
then $R_a{}^i{}_c$ given by

\[ R_a{}^i{}_c = g^{ib}R_{abc} \]

is a type (1, 2) tensor. (Here, $g^{**}$ is the inverse of $g_{**}$.) What is the inverse
operation?

8. A type (1, 1) tensor field T is orthogonal in the Riemannian manifold M
if, for all pairs of contravariant vector fields X and Y on M, one has

\[ \langle TX, TY\rangle = \langle X, Y\rangle, \]

where $(TX)^i = T^i_k X^k$. What can be said about the columns of T in a given
coordinate system x? (Note that the ith column of T is the local vector field
given by $T(\partial/\partial x^i)$.)

Last Updated: January, 2002
Copyright © Stefan Waner

Lecture 7: Locally Minkowskian Manifolds: A Little Relativity

In the last lecture, we saw how we can use a Riemannian metric to measure
distance. Here, we look at a very special metric.

First a general comment: We said in the last section that, at any point p in a
Riemannian manifold M, we can find a local chart at p with the property
that the metric tensor $g_{**}$ is diagonal, with diagonal terms ±1. In particular,
we said that Minkowski space comes with such a metric tensor having
signature (1, 1, 1, -1). Now there is nothing special about the number 1 in
the discussion: we can also find a local chart at any point p with the
property that the metric tensor $g_{**}$ is diagonal, with diagonal terms any
non-zero numbers we like (although we cannot choose the signs).

In relativity, we deal with 4-dimensional manifolds, and take the first
three coordinates $x^1, x^2, x^3$ to be spatial (measuring distance), and the
fourth one, $x^4$, to be temporal (measuring time). Let us postulate that we are
living in some kind of 4-dimensional manifold M (since we want to include
time as a coordinate). By the way, we refer to a chart x at the point p as a
frame of reference, or just frame. Suppose now we have a particle --
perhaps moving, perhaps not -- in M. Assuming it persists for a period of
time, we can give it spatial coordinates $(x^1, x^2, x^3)$ at every instant of time
($x^4$). Since the first three coordinates are then functions of the fourth, it
follows that the particle determines a path in M given by

\[ x^1 = x^1(x^4), \quad x^2 = x^2(x^4), \quad x^3 = x^3(x^4), \quad x^4 = x^4, \]

so that $x^4$ is the parameter. This path is called the world line of the
particle. Mathematically, there is no need to use $x^4$ as the parameter, and so
we can describe the world line as a path of the form

\[ x^i = x^i(t), \]

where t is some parameter. (Note: t is not time; it's just a parameter. $x^4$ is
time.) Conversely, if t is any parameter, and $x^i = x^i(t)$ is a path in M, then, if
$x^4$ is an invertible function of t, that is, $dx^4/dt \neq 0$ (so that, at each time $x^4$,
we can solve for the other coordinates uniquely) then we can solve for $x^1$,
$x^2$, $x^3$ as smooth functions of $x^4$, and hence picture the situation as a particle
moving through space.

Now, let's assume our particle is moving through M with world line $x^i = x^i(t)$
as seen in our frame (local coordinate system). The velocity and speed of
this particle (as measured in our frame) are given by

\[ v = \left( \frac{dx^1}{dx^4}, \frac{dx^2}{dx^4}, \frac{dx^3}{dx^4} \right) \]

\[ \text{speed}^2 = \left(\frac{dx^1}{dx^4}\right)^2 + \left(\frac{dx^2}{dx^4}\right)^2 + \left(\frac{dx^3}{dx^4}\right)^2. \]

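The problem is, we cannot expect v to be a vector -- that is, satisfy the
correct transformation laws. But we do have a contravariant 4-vector

\[ T^i = \frac{dx^i}{dt} \]

(T stands for tangent vector. Also, remember that t is not time). If the
particle is moving at the speed of light c, then

\[ \left(\frac{dx^1}{dx^4}\right)^2 + \left(\frac{dx^2}{dx^4}\right)^2 + \left(\frac{dx^3}{dx^4}\right)^2 = c^2 \qquad \text{...... (I)} \]

\[ \left(\frac{dx^1}{dt}\right)^2 + \left(\frac{dx^2}{dt}\right)^2 + \left(\frac{dx^3}{dt}\right)^2 = c^2\left(\frac{dx^4}{dt}\right)^2 \qquad \text{(using the chain rule)} \]

\[ \left(\frac{dx^1}{dt}\right)^2 + \left(\frac{dx^2}{dt}\right)^2 + \left(\frac{dx^3}{dt}\right)^2 - c^2\left(\frac{dx^4}{dt}\right)^2 = 0. \]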

Now this looks like the norm-squared, $\|T\|^2$, of the vector T under the
metric whose matrix is

\[ g_{**} = \mathrm{diag}[1, 1, 1, -c^2] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -c^2 \end{pmatrix}. \]

In other words, the particle is moving at light-speed $\Leftrightarrow$ $\|T\|^2 = 0$ $\Leftrightarrow$ T is
null under this rather interesting local metric. So, to check whether a
particle is moving at light speed, just check whether T is null.
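The check described above can be sketched in a few lines. The function names, the SI value used for c, and the relative tolerance are choices made for this illustration only; the classification itself follows the timelike/spacelike/null terminology of Lecture 6 applied to the metric diag[1, 1, 1, -c^2].

```python
C = 299_792_458.0  # speed of light in m/s (assumed units for this sketch)

def norm_squared(T, c=C):
    """||T||^2 under the metric diag[1, 1, 1, -c^2]; T = (T^1, T^2, T^3, T^4)."""
    t1, t2, t3, t4 = T
    return t1**2 + t2**2 + t3**2 - c**2 * t4**2

def classify(T, c=C, rel_tol=1e-12):
    """Return 'null', 'spacelike' or 'timelike' for the tangent vector T."""
    n2 = norm_squared(T, c)
    # Compare against the size of the individual terms to absorb rounding error.
    scale = T[0]**2 + T[1]**2 + T[2]**2 + c**2 * T[3]**2
    if abs(n2) <= rel_tol * scale:
        return "null"
    return "spacelike" if n2 > 0 else "timelike"

# World line x^1 = c*t, x^4 = t (a light ray along the first axis): T = (c, 0, 0, 1).
print(classify((C, 0.0, 0.0, 1.0)))        # null
# World line x^1 = 0.5*c*t, x^4 = t (a particle at half light speed).
print(classify((0.5 * C, 0.0, 0.0, 1.0)))  # timelike
```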

Question What's the $-c^2$ doing in place of $-1$ in the metric?


Answer Since physical units of time are (usually) not the same as physical
units of space, we would like to convert the units of x4 (the units of time) to
match the units of the other axes. Now, to convert units of time to units of
distance, we need to multiply by something with units of distance/time; that
is, by a non-zero speed. Since relativity holds that the speed of light c is a
universal constant, it seems logical to use c as this conversion factor.

Now, if we happen to be living in a Riemannian 4-manifold whose metric
diagonalizes to something with signature $(1, 1, 1, -c^2)$, then the physical
property of traveling at the speed of light is measured by $\|T\|^2$, which is a
scalar, and thus independent of the frame of reference. In other words, we
have discovered a metric signature that is consistent with the requirement
that the speed of light is constant in all frames in which $g_{**}$ has the above
diagonal form (so that it makes sense to say what the speed of light is).

Definition 7.1 A Riemannian 4-manifold M is called locally
Minkowskian if its metric has signature $(1, 1, 1, -c^2)$.

For the rest of this section, we will be in a locally Minkowskian manifold M.

Note If we now choose a chart x in locally Minkowskian space where the
metric has the diagonal form $\mathrm{diag}[1, 1, 1, -c^2]$ shown above at a given point
p, then we have, at the point p:

(a) If any path C has $\|T\|^2 = 0$, then

\[ \left(\frac{dx^1}{dt}\right)^2 + \left(\frac{dx^2}{dt}\right)^2 + \left(\frac{dx^3}{dt}\right)^2 - c^2\left(\frac{dx^4}{dt}\right)^2 = 0 \]

(because this is how we calculate $\|T\|^2$)

(b) If V is any contravariant vector with zero $x^4$-coordinate, then

\[ \|V\|^2 = (V^1)^2 + (V^2)^2 + (V^3)^2 \qquad \text{(for the same reason as above)} \]

(a) says that we measure the world line C as representing a particle
traveling with light speed, and (b) says that we measure ordinary length in
the usual way. This motivates the following definition.

Definition 7.2 A Lorentz frame at the point $p \in M$ is any coordinate
system $\bar x^i$ with the following properties: (a) If any path C has the scalar
$\|T\|^2 = 0$, then, at p,

\[ \left(\frac{d\bar x^1}{dt}\right)^2 + \left(\frac{d\bar x^2}{dt}\right)^2 + \left(\frac{d\bar x^3}{dt}\right)^2 - c^2\left(\frac{d\bar x^4}{dt}\right)^2 = 0 \qquad \text{...... (II)} \]

(Note: In general, $\langle \bar T, \bar T\rangle$ is not of this form, since $\bar g_{ij}$ may not be
diagonal.)

(b) If V is a contravariant vector at p with zero $\bar x^4$-coordinate, then

\[ \|V\|^2 = (\bar V^1)^2 + (\bar V^2)^2 + (\bar V^3)^2 \qquad \text{...... (III)} \]

(Again, this need not be $\|\bar V\|^2$.)

It follows from the remark preceding the definition that if x is any chart such
that, at the point p, the metric has the nice form $\mathrm{diag}[1, 1, 1, -c^2]$, then x is
a Lorentz frame at the point p. Note that in general, the coordinates of T in
the system $\bar x^i$ are given by matrix multiplication with some possibly
complicated change-of-coordinates matrix, and to further complicate things,
the metric may look messy in the new coordinate system. Thus, very few
frames are going to be Lorentz.

Physical Interpretation of a Lorentz Frame

What the definition means physically is that an observer in the $\bar x$-frame
who measures a particle traveling at light speed in the x-frame will also
reach the conclusion that its speed is c, because he makes the decision
based on (I), which is equivalent to (II). In other words:

A Lorentz frame in locally Minkowskian space is any frame in which
light appears to be traveling at light speed, and where we measure
length in the usual way.

Question Do all Lorentz frames at p have the property that the metric has the
nice form $\mathrm{diag}[1, 1, 1, -c^2]$?
Answer Yes, as we shall see below.

Question OK. But if x and $\bar x$ are two Lorentz frames at the point p, how are
they related?
Answer Here is an answer. First, continue to denote a specific Lorentz
frame at the point p by x.

Theorem 7.3 (Criterion for Lorentz Frames)

The following are equivalent for a locally Minkowskian manifold M:

(a) A coordinate system $\bar x^i$ in Minkowski space M is Lorentz at the point
p

(b) If x is any frame such that, at p, $G = \mathrm{diag}[1, 1, 1, -c^2]$, then the
columns of the change-of-coordinate matrix

\[ D^i_j = \frac{\partial \bar x^i}{\partial x^j} \]

satisfy

\[ \langle \text{column } i, \text{column } j\rangle = \langle e_i, e_j\rangle, \]

where the inner product is defined by the matrix G.

(c) $\bar G = \mathrm{diag}[1, 1, 1, -c^2]$

Click here for a proof.

We will call the transformation from one Lorentz frame to another a


generalized Lorentz transformation.

An Example of a Lorentz Transformation We would like to give a simple
example of such a transformation matrix D, so we look for a matrix D whose
first column has the general form $\langle a, 0, 0, b\rangle$, with a and b non-zero
constants. (Why? If we take b = 0, we will wind up with a less interesting
transformation: a rotation in 3-space.) There is no loss of generality in
taking a = 1, so let us use $\langle 1, 0, 0, -\beta/c\rangle$. Here, c is the speed of light, and
$\beta$ is a certain constant. (The meaning of $\beta$ will emerge in due course). Its
norm-squared is $(1 - \beta^2)$, and we want this to be 1, so we replace the vector
by

\[ \left\langle \frac{1}{(1 - \beta^2)^{1/2}},\ 0,\ 0,\ \frac{-\beta/c}{(1 - \beta^2)^{1/2}} \right\rangle. \]

This is the first column of D. To keep things simple, let us take the next two
columns to be the corresponding basis vectors $e_2$, $e_3$. Now we might be
tempted to take the fourth vector to be $e_4$, but that would not be orthogonal
to the above first vector. By symmetry (to get a zero inner product) we are
forced to take the last vector to be

\[ \left\langle \frac{-\beta c}{(1 - \beta^2)^{1/2}},\ 0,\ 0,\ \frac{1}{(1 - \beta^2)^{1/2}} \right\rangle. \]

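This gives the transformation matrix as

\[ D = \begin{pmatrix} \dfrac{1}{(1 - \beta^2)^{1/2}} & 0 & 0 & \dfrac{-\beta c}{(1 - \beta^2)^{1/2}} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \dfrac{-\beta/c}{(1 - \beta^2)^{1/2}} & 0 & 0 & \dfrac{1}{(1 - \beta^2)^{1/2}} \end{pmatrix}, \]

and hence the new coordinates (by integrating everything in sight; using
the boundary conditions $\bar x^i = 0$ when $x^i = 0$) as

\[ \bar x^1 = \frac{x^1 - \beta c x^4}{(1 - \beta^2)^{1/2}}; \quad \bar x^2 = x^2; \quad \bar x^3 = x^3; \quad \bar x^4 = \frac{x^4 - \beta x^1/c}{(1 - \beta^2)^{1/2}}. \]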
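As a sanity check on the construction, the following sketch builds the matrix D for sample values of β and c and computes the inner products of its columns under G = diag[1, 1, 1, -c^2], as required by condition (b) of Theorem 7.3. The function names and the sample values β = 0.6, c = 3 are assumptions chosen for the illustration.

```python
import math

def boost_matrix(beta, c):
    """The matrix D of the example, for a given constant beta with |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma,             0.0, 0.0, -beta * c * gamma],
        [0.0,               1.0, 0.0, 0.0],
        [0.0,               0.0, 1.0, 0.0],
        [-beta / c * gamma, 0.0, 0.0, gamma],
    ]

def minkowski_inner(u, v, c):
    """<u, v> under G = diag[1, 1, 1, -c^2]."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] - c**2 * u[3] * v[3]

def column(D, j):
    return [row[j] for row in D]

def column_inner_products(D, c):
    """Matrix whose (i, j) entry is <column i, column j>."""
    return [[minkowski_inner(column(D, i), column(D, j), c) for j in range(4)]
            for i in range(4)]

products = column_inner_products(boost_matrix(0.6, 3.0), 3.0)
for row in products:
    print([round(entry, 12) for entry in row])
# Up to rounding, this should reproduce diag[1, 1, 1, -c^2] = diag[1, 1, 1, -9].
```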

Notice that solving the first equation for $x^1$ gives

\[ x^1 = \bar x^1(1 - \beta^2)^{1/2} + \beta c x^4. \]

Since $x^4$ is just time t here, it means that the origin of the $\bar x$-system has
coordinates $(\beta ct, 0, 0)$ in terms of the original coordinates. In other words,
it is moving in the x-direction with a velocity of

\[ v = \beta c, \]

so we must interpret $\beta$ as the speed in "warp;"

\[ \beta = v/c. \]

This gives us the famous

Lorentz Transformations of Special Relativity

If two Lorentz frames x and $\bar x$ have the same coordinates at (x, y, z, t) =
(0, 0, 0, 0), and if the $\bar x$-frame is moving in the x-direction with a speed
of v, then the $\bar x$-coordinates of an event are given by

\[ \bar x = \frac{x - vt}{(1 - v^2/c^2)^{1/2}}, \qquad \bar y = y, \qquad \bar z = z, \qquad \bar t = \frac{t - vx/c^2}{(1 - v^2/c^2)^{1/2}}. \]
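The transformation can be tried out numerically. The sketch below applies it to a sample event and compares the quantity x^2 + y^2 + z^2 - c^2 t^2 before and after, which is the invariance asked for in the exercises. Units with c = 1 and the sample event are assumptions for the illustration.

```python
import math

def lorentz(x, y, z, t, v, c):
    """Coordinates of the event (x, y, z, t) in a frame moving with speed v along x."""
    gamma = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return (gamma * (x - v * t), y, z, gamma * (t - v * x / (c * c)))

def minkowski_form(x, y, z, t, c):
    """x^2 + y^2 + z^2 - c^2 t^2."""
    return x * x + y * y + z * z - c * c * t * t

c, v = 1.0, 0.6
event = (1.0, 2.0, 3.0, 4.0)
barred = lorentz(*event, v, c)

print(barred)
print(minkowski_form(*event, c), minkowski_form(*barred, c))  # should agree

# The origin of the barred frame sits at x = v*t in the original coordinates.
print(lorentz(v * 5.0, 0.0, 0.0, 5.0, v, c)[0])  # approximately 0
```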

Exercise Set 7

1. What can be said about the scalar $\|dx^i/dt\|^2$ in a Lorentz frame for a
particle traveling at (a) sub-light speed (b) super-light speed?

2. (a) Show that, if $x^i(t)$ is a timelike path in the Minkowskian manifold M so
that $dx^4/dt \neq 0$, then $d\bar x^4/dt \neq 0$ in every Lorentz frame $\bar x$. In other words, if a
particle is moving at sub-light speed in any one Lorentz frame, then it is
moving at sub-light speed in all Lorentz frames.
(b) Conclude that, if a particle is traveling at super-light speed in one
Lorentz frame, then it is traveling at super-light speeds in all such frames.

3. Referring to the Lorentz transformations for special relativity, consider a
"photon clock" constructed by bouncing a single photon back and forth
between two parallel mirrors as shown in the following figure.

Now place this clock in a train moving in the x-direction with velocity v. By
comparing the time it takes between a tick and a tock for a stationary
observer and one on the train, obtain the time contraction formula ($\Delta\bar t$ in
terms of $\Delta t$) from the length contraction one.

4. Prove the claim in the proof of 7.3, that if D is a 4 × 4 matrix whose
columns satisfy

\[ \langle \text{column } i, \text{column } j\rangle = \begin{cases} 0 & \text{if } i \neq j \\ k & \text{if } 1 \le i = j \le 3 \\ -kc^2 & \text{if } i = j = 4 \end{cases} \]

using the Minkowski inner product G (not the standard inner product), then
$D^{-1}$ has its columns satisfying

\[ \langle \text{column } i, \text{column } j\rangle = \begin{cases} 0 & \text{if } i \neq j \\ 1/k & \text{if } 1 \le i = j \le 3 \\ -c^2/k & \text{if } i = j = 4 \end{cases} \]

[Hint: use the given property of D to write down the entries of its inverse P
in terms of the entries of D.]

5. Invariance of the Minkowski Form
Show that, if $P = x_0^i$ and $Q = x_0^i + \Delta x^i$ are any two events in the Lorentz
frame $x^i$, then, for all Lorentz frames $\bar x^i$, one has

\[ (\Delta x^1)^2 + (\Delta x^2)^2 + (\Delta x^3)^2 - c^2(\Delta x^4)^2 = (\Delta\bar x^1)^2 + (\Delta\bar x^2)^2 + (\Delta\bar x^3)^2 - c^2(\Delta\bar x^4)^2 \]

[Hint: Consider the path $x^i(t) = x_0^i + \Delta x^i t$, so that $dx^i/dt$ is independent of t.
Now use the transformation formula to conclude that $d\bar x^i/dt$ is also
independent of t. (You might have to transpose a matrix before
multiplying...) Deduce that $\bar x^i(t) = z^i + r^i t$ for some constants $z^i$ and $r^i$.
Finally, set t = 0 and t = 1 to conclude that $\bar x^i(t) = \bar x_0^i + \Delta\bar x^i t$, and apply (c)
above.]

6. If the $\bar x^i$-system is moving with a velocity v in a certain direction with
respect to the $x^i$-system, we call this a boost in the given direction. Show
that successive boosts in two perpendicular directions do not give a "pure"
boost (the spatial axes are rotated, no longer parallel to the original axes).
Now do some reading to find the transformation for a pure boost in an
arbitrary direction.


Lecture 8: Covariant Differentiation



8. Covariant Differentiation

Intuitively, by a parallel vector field, we mean a vector field with the


property that the vectors at different points are parallel. Is there a notion of
a parallel field on a manifold? For instance, in En, there is an obvious notion:
just take a fixed vector v and translate it around. On the torus, there are
good candidates for parallel fields (see the figure) but not on the 2-sphere.
(There are, however, parallel fields on the 3-sphere...)

Let us restrict attention to parallel fields of constant length. Usually, we can


recognize such a field by taking the derivatives of its coordinates, or by
following a path, and taking the derivative of the vector field with respect to
t: we should come up with zero. The problem is, we won't always come up
with zero if the coordinates are not rectilinear, since the vector field may
change direction as we move along the curved coordinate axes.

Technically, this says that, if $X^j$ were such a field, we should check for its
parallelism by taking the derivatives $dX^j/dt$ along some path $x^i = x^i(t)$.
However, there are two catches to this approach: one geometric and one
algebraic.

Geometric Look, for example, at the field on either torus in the above figure.
Since it is circulating and hence non-constant, $dX/dt \neq 0$, which is not what
we want. However, the projection of dX/dt parallel to the manifold does
vanish -- we will make this precise below.

Algebraic Since

\[ \bar X^j = \frac{\partial \bar x^j}{\partial x^h} X^h, \]

one has, by the product rule,

\[ \frac{d\bar X^j}{dt} = \frac{\partial^2 \bar x^j}{\partial x^k\,\partial x^h}\, X^h \frac{dx^k}{dt} + \frac{\partial \bar x^j}{\partial x^h} \frac{dX^h}{dt} \qquad \text{... (I)} \]

showing that, unless the second derivatives vanish, dX/dt does not
transform as a vector field. What this means in practical terms is that we
cannot check for parallelism at present -- even in $E_3$ if the coordinates are
not linear.

The projection of dX/dt along M will be called the covariant derivative of X


(with respect to t), and written DX/dt. To compute it, we need to do a little
work. First, some linear algebra.

Lemma 8.1 (Projection onto the Tangent Space)

Let M be a Riemannian n-manifold with metric g, and let V be a vector
in $E_s$. The projection $\pi V$ of V onto $T_m$ has (local) coordinates given by

\[ (\pi V)^i = g^{ik}\left(V\cdot\frac{\partial}{\partial x^k}\right), \]

where $[g^{ij}]$ is the matrix inverse of $[g_{ij}]$, and $g_{ij} = (\partial/\partial x^i)\cdot(\partial/\partial x^j)$ as usual.

Proof We can represent V as a sum,

\[ V = \pi V + W, \]

where W is the component of V normal to $T_m$. Now write $\partial/\partial x^k$ as $e_k$, and
write

\[ \pi V = a^1e_1 + \dots + a^ne_n, \]

where the $a^i$ are the desired local coordinates. Then

\[ V = \pi V + W = a^1e_1 + \dots + a^ne_n + W \]

and so

\[ V\cdot e_1 = a^1 e_1\cdot e_1 + \dots + a^n e_n\cdot e_1 + 0 \]
\[ V\cdot e_2 = a^1 e_1\cdot e_2 + \dots + a^n e_n\cdot e_2 \]
\[ \dots \]
\[ V\cdot e_n = a^1 e_1\cdot e_n + \dots + a^n e_n\cdot e_n \]

which we can write in matrix form as

\[ [V\cdot e_i] = [a^i]\,g_{**} \]

whence

\[ [a^i] = [V\cdot e_i]\,g^{**}. \]

Finally, since $g^{**}$ is symmetric, we can transpose everything in sight to get

\[ [a^i] = g^{**}[V\cdot e_i], \]

as required.
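Lemma 8.1 can be tried out on a concrete surface. The sketch below uses the unit sphere in $E_3$ with the usual angles (θ, φ) as local coordinates, computes the coefficients $a^i = g^{ik}(V\cdot e_k)$ for a sample vector V, and confirms that the remainder W = V - πV is normal to both basis vectors. The choice of surface, point, and vector, and all function names, are assumptions for the illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sphere_basis(theta, phi):
    """e_theta and e_phi for y = (sin t cos p, sin t sin p, cos t) on the unit sphere."""
    e_theta = [math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta)]
    e_phi = [-math.sin(theta) * math.sin(phi), math.sin(theta) * math.cos(phi), 0.0]
    return [e_theta, e_phi]

def project(V, basis):
    """Local coordinates a^i = g^{ik} (V . e_k) of the projection of V (2-dimensional case)."""
    e1, e2 = basis
    g11, g12, g22 = dot(e1, e1), dot(e1, e2), dot(e2, e2)
    det = g11 * g22 - g12 * g12
    g_inv = [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]  # inverse of [g_ij]
    b = [dot(V, e1), dot(V, e2)]
    return [g_inv[i][0] * b[0] + g_inv[i][1] * b[1] for i in range(2)]

basis = sphere_basis(math.pi / 3, math.pi / 4)
V = [1.0, 2.0, 3.0]
a = project(V, basis)

# W = V - (a^1 e_1 + a^2 e_2) should be normal to the tangent plane.
W = [V[s] - a[0] * basis[0][s] - a[1] * basis[1][s] for s in range(3)]
print(a, dot(W, basis[0]), dot(W, basis[1]))
```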

For reasons that will become clear later, let us now look at some partial
derivatives of the fundamental matrix $[g_{**}]$ in terms of ambient coordinates.

\[ \frac{\partial}{\partial x^p}[g_{qr}] = \frac{\partial}{\partial x^p}\left( \frac{\partial y_s}{\partial x^q}\frac{\partial y_s}{\partial x^r} \right) = \frac{\partial^2 y_s}{\partial x^p\,\partial x^q}\frac{\partial y_s}{\partial x^r} + \frac{\partial^2 y_s}{\partial x^r\,\partial x^p}\frac{\partial y_s}{\partial x^q} \]

or, using "comma notation" (that is, $R_{,p}$ denotes partial derivative with
respect to $x^p$),

\[ g_{qr,p} = y_{s,pq}\,y_{s,r} + y_{s,rp}\,y_{s,q}. \]

Look now at what happens to the indices q, r, and p if we permute them
(they're just letters, after all) cyclically in the above formula (that is, p → q
→ r); we get two more formulas.

\[ g_{qr,p} = y_{s,pq}\,y_{s,r} + y_{s,rp}\,y_{s,q} \qquad \text{(Original formula)} \]
\[ g_{rp,q} = y_{s,qr}\,y_{s,p} + y_{s,pq}\,y_{s,r} \]
\[ g_{pq,r} = y_{s,rp}\,y_{s,q} + y_{s,qr}\,y_{s,p} \]

Note that each term on the right occurs twice altogether. This permits us to
solve for the term $y_{s,pq}\,y_{s,r}$ by adding the first two equations and
subtracting the third:

\[ y_{s,pq}\,y_{s,r} = \frac{1}{2}\left[ g_{qr,p} + g_{rp,q} - g_{pq,r} \right]. \]

Definition 8.2 Christoffel Symbols

We make the following definitions.

Christoffel Symbols of the First Kind

\[ [pq, r] = \frac{1}{2}\left[ g_{qr,p} + g_{rp,q} - g_{pq,r} \right] \]

Christoffel Symbols of the Second Kind

\[ \left\{{i \atop pq}\right\} = g^{ir}\,[pq, r] \]
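To see the two definitions at work, here is a small numerical sketch for a 2-dimensional chart. It approximates the partial derivatives $g_{qr,p}$ by central differences and then assembles the symbols of the first and second kind exactly as in Definition 8.2. The example metric (the plane in polar coordinates), the step size, and the function names are assumptions for the illustration; the well-known nonzero symbols for polar coordinates are $-r$ and $1/r$.

```python
def polar_metric(x):
    """Euclidean metric of the plane in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def metric_derivatives(metric, x, h=1e-5):
    """dg[p][q][r] approximates g_{qr,p} by central differences."""
    dg = []
    for p in range(2):
        ahead, behind = list(x), list(x)
        ahead[p] += h
        behind[p] -= h
        gp, gm = metric(ahead), metric(behind)
        dg.append([[(gp[q][r] - gm[q][r]) / (2 * h) for r in range(2)] for q in range(2)])
    return dg

def christoffel_first(metric, x):
    """first[p][q][r] = [pq, r] = (g_{qr,p} + g_{rp,q} - g_{pq,r}) / 2."""
    dg = metric_derivatives(metric, x)
    return [[[0.5 * (dg[p][q][r] + dg[q][r][p] - dg[r][p][q]) for r in range(2)]
             for q in range(2)] for p in range(2)]

def christoffel_second(metric, x):
    """second[i][p][q] = g^{ir} [pq, r], using the explicit 2 x 2 inverse of [g_ij]."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    first = christoffel_first(metric, x)
    return [[[sum(g_inv[i][r] * first[p][q][r] for r in range(2)) for q in range(2)]
             for p in range(2)] for i in range(2)]

gamma = christoffel_second(polar_metric, [2.0, 0.7])
print(gamma[0][1][1])  # approximately -r = -2
print(gamma[1][0][1])  # approximately 1/r = 0.5
```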

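Neither of these gizmos is a tensor; instead they transform as follows (which
you will prove in the exercises!)

Transformation Law for Christoffel Symbols of the First Kind

\[ \overline{[hk, l]} = [ri, j]\,\frac{\partial x^r}{\partial \bar x^h}\frac{\partial x^i}{\partial \bar x^k}\frac{\partial x^j}{\partial \bar x^l} + g_{ij}\,\frac{\partial^2 x^i}{\partial \bar x^h\,\partial \bar x^k}\frac{\partial x^j}{\partial \bar x^l} \]

Transformation Law for Christoffel Symbols of the Second Kind

\[ \overline{\left\{{p \atop hk}\right\}} = \left\{{t \atop ri}\right\}\frac{\partial \bar x^p}{\partial x^t}\frac{\partial x^r}{\partial \bar x^h}\frac{\partial x^i}{\partial \bar x^k} + \frac{\partial \bar x^p}{\partial x^t}\frac{\partial^2 x^t}{\partial \bar x^h\,\partial \bar x^k} \]

(Look at how the patterns of indices match those in the Christoffel
symbols...)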

We can now obtain a formula for the covariant derivative.

Proposition 8.2 (Formula for Covariant Derivative)

\[ \frac{DX^i}{dt} = \frac{dX^i}{dt} + \left\{{i \atop pq}\right\} X^p \frac{dx^q}{dt} \]

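Proof By definition,

\[ \frac{DX}{dt} = \pi\,\frac{dX}{dt}, \]

which, by the lemma, has local coordinates given by

\[ \frac{DX^i}{dt} = g^{ir}\left( \frac{dX}{dt}\cdot\frac{\partial}{\partial x^r} \right). \]

To evaluate the term in parentheses, we use ambient coordinates. dX/dt has
ambient coordinates

\[ \frac{d}{dt}\left( X^p\frac{\partial y_s}{\partial x^p} \right) = \frac{dX^p}{dt}\frac{\partial y_s}{\partial x^p} + X^p\frac{\partial^2 y_s}{\partial x^p\,\partial x^q}\frac{dx^q}{dt}. \]

Thus, dotting with $\partial/\partial x^r = \partial y_s/\partial x^r$ gives

\[ \frac{dX^p}{dt}\frac{\partial y_s}{\partial x^p}\frac{\partial y_s}{\partial x^r} + X^p\frac{\partial^2 y_s}{\partial x^p\,\partial x^q}\frac{\partial y_s}{\partial x^r}\frac{dx^q}{dt} = \frac{dX^p}{dt}\,g_{pr} + X^p\,[pq, r]\,\frac{dx^q}{dt}. \]

Finally,

\[ \frac{DX^i}{dt} = g^{ir}\left( \frac{dX}{dt}\cdot\frac{\partial}{\partial x^r} \right) = g^{ir}\left( \frac{dX^p}{dt}\,g_{pr} + X^p\,[pq, r]\,\frac{dx^q}{dt} \right) \]
\[ = \delta^i_p\,\frac{dX^p}{dt} + \left\{{i \atop pq}\right\} X^p\frac{dx^q}{dt} \qquad \text{(Defn of Christoffel symbols of the 2nd Kind)} \]
\[ = \frac{dX^i}{dt} + \left\{{i \atop pq}\right\} X^p\frac{dx^q}{dt} \]

as required.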

In the exercises, you will check directly that the covariant derivative
transforms correctly.

This allows us to say whether a field is parallel and of constant length by


seeing whether this quantity vanishes. This claim is motivated by the
following.

Proposition 8.3 (Parallel Fields of Constant Length)

$X^i$ is a parallel field of constant length in $E_n$ iff $DX^i/dt = 0$ for all paths in
$E_n$.

Proof Designate the usual coordinate system by $x^i$. Then $X^i$ is parallel and of
constant length iff its coordinates with respect to the chart x are constant;
that is, iff

\[ \frac{dX^i}{dt} = 0. \]

But, since for this coordinate system, $g_{ij} = \delta_{ij}$, the Christoffel symbols clearly
vanish, and so

\[ \frac{DX^i}{dt} = \frac{dX^i}{dt} = 0. \]

But, if the contravariant vector $DX^i/dt$ vanishes under one coordinate
system (whose domain happens to be the whole manifold) it must vanish
under all of them. (Notice that we can't say that about things that are not
vectors, such as $dX^i/dt$.)
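The contrast between $dX^i/dt$ and $DX^i/dt$ can be seen numerically. The sketch below takes the constant field $\partial/\partial x$ on the plane, writes its components in polar coordinates (r, θ), and follows it along a sample path. The ordinary derivative of the polar components is not zero, while the covariant derivative computed from Proposition 8.2 is zero up to rounding. The path, the step size, and the function names are assumptions for the illustration; the polar Christoffel symbols used ($-r$ and $1/r$) are the standard ones.

```python
import math

def path(t):
    """Sample path in polar coordinates: r = 1 + t, theta = t."""
    return (1.0 + t, t)

def field(x):
    """Polar components (X^r, X^theta) of the constant Cartesian field d/dx."""
    r, theta = x
    return (math.cos(theta), -math.sin(theta) / r)

def christoffel(x):
    """G[i][p][q]: second-kind symbols for polar coordinates (index 0 = r, 1 = theta)."""
    r, _theta = x
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G

def ddt(f, t, h=1e-6):
    """Central-difference derivative of a tuple-valued function of t."""
    ahead, behind = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

def ordinary_derivative(t):
    return ddt(lambda s: field(path(s)), t)

def covariant_derivative(t):
    """DX^i/dt = dX^i/dt + {i; pq} X^p dx^q/dt."""
    x, X, v = path(t), field(path(t)), ddt(path, t)
    dX, G = ordinary_derivative(t), christoffel(path(t))
    return tuple(dX[i] + sum(G[i][p][q] * X[p] * v[q] for p in range(2) for q in range(2))
                 for i in range(2))

print(ordinary_derivative(0.5))   # not zero
print(covariant_derivative(0.5))  # zero up to rounding
```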

Partial Derivatives

Let us make the following definition.

Definitions 8.4 The covariant partial derivative of the contravariant
field $X^p$ is the type (1, 1) tensor given by

Covariant Partial Derivative of $X^p$

\[ X^p{}_{|k} = \frac{\partial X^p}{\partial x^k} + \left\{{p \atop hk}\right\} X^h \]

(Some texts use $\nabla_k X^p$.) Similarly, the covariant partial derivative of
the covariant field $Y_p$ is the type (0, 2) tensor given by

Covariant Partial Derivative of $Y_p$

\[ Y_{p|k} = \frac{\partial Y_p}{\partial x^k} - \left\{{h \atop pk}\right\} Y_h \]

Question How do we know that these things are second order tensors as
claimed?
Answer Some of these will be in the exercises. Click here for a proof that
$X^p{}_{|k}$ is a type (1, 1) tensor.

Notes
1. All these forms of derivatives satisfy the expected rules for sums and also
products. (See the exercises.)
2. If C is a path on M, then we obtain the following analogue of the chain
rule:

\[ \frac{DX^i}{dt} = X^i{}_{|k}\,\frac{dx^k}{dt} \]

(Again, see the exercises).

Exercise Set 8

1. (a) Show that $\left\{{i \atop jk}\right\} = \left\{{i \atop kj}\right\}$.

(b) If $\Gamma_j{}^i{}_k$ are functions that transform in the same way as Christoffel
symbols of the second kind (called a connection) show that $\Gamma_j{}^i{}_k - \Gamma_k{}^i{}_j$ is
always a type (1, 2) tensor (called the associated torsion tensor).

(c) Suppose $a_{ij}$ and $g_{ij}$ are any two symmetric non-degenerate type (0, 2) tensor
fields with associated Christoffel symbols $\left\{{i \atop jk}\right\}_a$ and $\left\{{i \atop jk}\right\}_g$ respectively. Show
that $\left\{{i \atop jk}\right\}_a - \left\{{i \atop jk}\right\}_g$ is a type (1, 2) tensor.

2. Covariant Differential of a Covariant Vector Field Use the results
and analysis of the section (and look at, e.g., Rund) to show that, if $Y_i$ is a
covariant vector, then

\[ DY_p = dY_p - \left\{{i \atop pq}\right\} Y_i\,dx^q \]

are the components of a covariant vector field.

3. (See Rund, pp. 72-73) Covariant Differential of a Tensor Field We
can again use the same analysis to obtain, for a type (1, 1) tensor,

\[ DT^h{}_p = dT^h{}_p + \left\{{h \atop rq}\right\} T^r{}_p\,dx^q - \left\{{i \atop pq}\right\} T^h{}_i\,dx^q. \]

4. Obtain the transformation equations for Christoffel symbols of the first
and second kind. (You might wish to consult an earlier printing of these
notes or Rund's book...)

5. Show directly that the coordinates of $DX^p/dt$ transform as a contravariant
vector.

6. Show that, if $X^i$ is any vector field on $E_n$, then its ordinary partial
derivatives agree with $X^p{}_{|k}$.

7. Show that, if $X^i$ and $Y^j$ are any two (contravariant) vector fields on M,
then

\[ (X^i + Y^i)_{|k} = X^i{}_{|k} + Y^i{}_{|k} \]
\[ (X^iY^j)_{|k} = X^i{}_{|k}Y^j + X^iY^j{}_{|k}. \]

8. Show that, if C is a path on M, then

\[ \frac{DX^i}{dt} = X^i{}_{|k}\,\frac{dx^k}{dt}. \]

9. Show that, if X and Y are vector fields, then

\[ \frac{d}{dt}\langle X, Y\rangle = \left\langle \frac{DX}{dt}, Y \right\rangle + \left\langle X, \frac{DY}{dt} \right\rangle, \]

where the big D's denote covariant differentiation.

10. (a) What is $\phi_{|i}$ if $\phi$ is a scalar field?
(b) Give a definition of the "contravariant" derivative, $X^{a|b}$ of $X^a$ with
respect to $x^b$, and show that $X^{a|b} = 0$ if and only if $X^a{}_{|b} = 0$.


Lecture 9: Geodesics and Local Inertial Frames



9. Geodesics and Local Inertial Frames

Let us now apply some of this theory to curves on manifolds. If a non-null
curve C on M is parameterized by $x^i(t)$, then we can reparameterize the
curve using arc length,

\[ s(t) = \int_a^t \left( \pm g_{ij}\,\frac{dx^i}{du}\frac{dx^j}{du} \right)^{1/2} du, \]

(starting at some arbitrary point) as the parameter. The reason for wanting
to do this is that the tangent vector $T^i = dx^i/ds$ is then a unit vector (see the
exercises) and also independent of the parameterization.

If we were talking about a curve in $E_3$, then the derivative of the unit
tangent vector (again with respect to s to make it independent of the
parameterization) is a measure of how fast the curve is "turning," and so we
call the derivative of $T^i$ the curvature of C.

If C happens to be on a manifold, then the unit tangent vector is still

\[ T^i = \frac{dx^i}{ds} = \frac{dx^i}{dt}\Big/\frac{ds}{dt} = \frac{dx^i/dt}{\left( \pm g_{pq}\,\dfrac{dx^p}{dt}\dfrac{dx^q}{dt} \right)^{1/2}} \]

(the last formula is there if you want to actually compute it). But, to get the
curvature, we need to take the covariant derivative:

\[ P^i = \frac{DT^i}{ds} = \frac{D(dx^i/ds)}{ds} = \frac{d^2x^i}{ds^2} + \left\{{i \atop pq}\right\}\frac{dx^p}{ds}\frac{dx^q}{ds} \]

Definitions 9.1 The first curvature vector P of the curve C is

\[ P^i = \frac{d^2x^i}{ds^2} + \left\{{i \atop pq}\right\}\frac{dx^p}{ds}\frac{dx^q}{ds}. \]

A curve on M whose first curvature is zero is called a geodesic. Thus, a
geodesic is a curve that satisfies the system of second order differential
equations

\[ \frac{d^2x^i}{ds^2} + \left\{{i \atop pq}\right\}\frac{dx^p}{ds}\frac{dx^q}{ds} = 0. \]

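In terms of the parameter t, this becomes (see the exercises)

\[ \frac{d^2x^i}{dt^2}\frac{ds}{dt} - \frac{dx^i}{dt}\frac{d^2s}{dt^2} + \left\{{i \atop pq}\right\}\frac{dx^p}{dt}\frac{dx^q}{dt}\frac{ds}{dt} = 0, \]

where

\[ \frac{ds}{dt} = \left( \pm g_{ij}\,\frac{dx^i}{dt}\frac{dx^j}{dt} \right)^{1/2}. \]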

Note that P is a tangent vector at right angles to the curve C which
measures its change relative to M.

Question Why is P at right angles to the curve C?
Answer This can be checked as follows.

\[ \frac{d}{ds}\langle T, T\rangle = \left\langle \frac{DT}{ds}, T \right\rangle + \left\langle T, \frac{DT}{ds} \right\rangle \qquad \text{(Exercise Set 8 \#9)} \]
\[ = 2\left\langle \frac{DT}{ds}, T \right\rangle \qquad \text{(Symmetry of the scalar product)} \]
\[ = 2\langle P, T\rangle \qquad \text{(Definition of P)} \]

so that

\[ \langle P, T\rangle = \frac{1}{2}\frac{d}{ds}\langle T, T\rangle. \]

But

\[ \langle T, T\rangle = \pm 1 \qquad \text{(Refer back to the Proof of 6.5 to check this)} \]

whence

\[ \langle P, T\rangle = \frac{1}{2}\frac{d}{ds}(\pm 1) = 0, \]

as asserted.
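As a concrete check of Definitions 9.1 and of the orthogonality just shown, the sketch below works on the unit sphere with coordinates (θ, φ) and metric diag[1, sin^2 θ]. It evaluates the first curvature vector for two curves parameterized by arc length: the equator, which should be a geodesic, and the circle of latitude θ = π/4, which should not. The sphere's Christoffel symbols are the standard ones; the choice of curves, the finite-difference step, and the function names are assumptions for the illustration.

```python
import math

def sphere_metric(x):
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def sphere_christoffel(x):
    """G[i][p][q]: second-kind symbols on the unit sphere (index 0 = theta, 1 = phi)."""
    theta, _phi = x
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def tangent_and_curvature(path, s, h=1e-4):
    """Return (T, P) with T^i = dx^i/ds and P^i = d2x^i/ds2 + {i; pq} T^p T^q."""
    x0, ahead, behind = path(s), path(s + h), path(s - h)
    T = [(a - b) / (2 * h) for a, b in zip(ahead, behind)]
    acc = [(a - 2 * m + b) / (h * h) for a, m, b in zip(ahead, x0, behind)]
    G = sphere_christoffel(x0)
    P = [acc[i] + sum(G[i][p][q] * T[p] * T[q] for p in range(2) for q in range(2))
         for i in range(2)]
    return T, P

def inner(x, U, V):
    g = sphere_metric(x)
    return sum(g[i][j] * U[i] * V[j] for i in range(2) for j in range(2))

LAT = math.pi / 4

def equator(s):
    return [math.pi / 2, s]

def latitude_circle(s):
    # phi advances at rate 1/sin(LAT) so that s is arc length on this circle.
    return [LAT, s / math.sin(LAT)]

T_eq, P_eq = tangent_and_curvature(equator, 0.3)
T_lat, P_lat = tangent_and_curvature(latitude_circle, 0.3)
print(P_eq)    # approximately (0, 0): the equator is a geodesic
print(P_lat)   # approximately (-1, 0): the latitude circle is not
print(inner(latitude_circle(0.3), P_lat, T_lat))  # approximately 0: P is orthogonal to T
```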

Local Flatness, or "Local Inertial Frames"

Notation We will be changing some of the notation to simplify things from
now on.

1. First, we shall write the Christoffel symbols of the second kind as $\Gamma_i{}^j{}_k$
rather than $\left\{{j \atop ik}\right\}$.
2. Second, we shall continue to use comma notation for ordinary (not
covariant) partial derivatives:
$T_{i,k}$ instead of $\partial T_i/\partial x^k$
$T^i{}_{,k}$ instead of $\partial T^i/\partial x^k$ etc.

In "flat space" Es all the Christoffel symbols vanish, so the following


question arises:

Question Can we find a chart (local coordinate system) such that the
Christoffel symbols vanish -- at least in the domain of the chart?
Answer This is asking too much; we shall see later that the derivatives of
the Christoffel symbols give an invariant tensor (called the curvature)
which does not vanish in general. However, we do have the following.

Proposition 9.2 (Existence of a Local Inertial Frame)

If m is any point in the Riemannian manifold M, then there exists a local
coordinate system $x^i$ at m such that:

(a) $g_{ij}(m) = \begin{cases} \pm 1 & \text{if } j = i \\ 0 & \text{if } j \neq i \end{cases} \; = \pm\delta_{ij}$
(b) $g_{ij,k}(m) = 0$ for every k.

We call such a coordinate system a local inertial frame or a normal
frame.

(It follows that $\Gamma_i{}^j{}_k(m) = 0$ in an inertial frame.)

Before proving the proposition, we need a lemma.

Lemma 9.3 (Some Equivalent Things)

Let $m \in M$. Then the following are equivalent:

(a) $g_{pq,r}(m) = 0$ for all p, q, r.

(b) $[pq, r]_m = 0$ for all p, q, r.

(c) $\Gamma_p{}^r{}_q(m) = 0$ for all p, q, r.

Proof of Lemma 9.3

(a) ⇒ (b) follows from the definition of Christoffel symbols of the first kind.

(b) ⇒ (a) follows from the identity

\[ g_{pq,r} = [qr, p] + [rp, q] \qquad \text{(Check it!)} \]

(b) ⇒ (c) follows from the definition of Christoffel symbols of the second
kind.

(c) ⇒ (b) follows from the inverse identity

\[ [pq, s] = g_{sr}\,\Gamma_p{}^r{}_q. \]

In other words, the vanishing of Christoffel symbols at any point of M is
equivalent to the vanishing of the partial derivatives of the metric tensor at
that point.

Click here for a proof of Proposition 9.2.

Corollary 9.4 (Partial Derivatives Look Nice in Inertial Frames)

Given any point $m \in M$, there exist local coordinates such that

\[ X^p{}_{|k}(m) = \left[ \frac{\partial X^p}{\partial x^k} \right]_m. \]

Also, the coordinates of $\left[ \dfrac{\partial X^p}{\partial x^k} \right]_m$ in an inertial frame transform to those
of $X^p{}_{|k}(m)$ in every frame.

Corollary 9.5 (Geodesics are Locally Straight in Inertial Frames)

If C is a geodesic passing through $m \in M$, then, in any inertial frame, it
has zero classical curvature at m (that is, $d^2x^i/ds^2 = 0$).

Question Is there a local coordinate system such that all geodesics are in
fact straight lines?
Answer Not in general; if you make some geodesics straight, then others
wind up curved. It is the curvature tensor that is responsible for this. This
involves the derivatives of the Christoffel symbols, and we can't make it
vanish.

Question If I throw a ball in the air, then the path is curved and also a
geodesic. Does this mean that our earthly coordinates are not inertial?
Answer Yes. At each instant in time, we can construct a local inertial frame
corresponding to that event. But this frame varies from point to point along
our world line if our world line is not a geodesic (more about this below),
and the only way our world line can be a geodesic is if we were freely falling
(and therefore felt no gravity). Technically speaking, the "earthly"
coordinates we use constitute a momentary comoving reference frame;
it is inertial at each point along our world line, but the directions of the axes
are constantly changing in space-time.

Proposition 9.6 (Changing Inertial Frames) If $x$ and $\bar{x}$ are inertial
frames at $m \in M$, then, recalling that $D$ is the matrix whose ij th entry is
$(\partial x^i/\partial \bar{x}^j)$, one has

$$\det D = \det \bar{D} = \pm 1.$$

Proof By definition of inertial frames,

$$g_{ij}(m) = \pm\delta_{ij},$$

and similarly for $\bar{g}_{ij}$, so that $\bar{g}_{ij} = \pm g_{ij}$, whence
$\det(g_{**}) = \pm\det(\bar{g}_{**}) = \pm 1$. On the other hand,

$$\bar{g}_{ij} = \frac{\partial x^k}{\partial \bar{x}^i}\,\frac{\partial x^l}{\partial \bar{x}^j}\, g_{kl},$$

which, in matrix form, becomes

$$\bar{g}_{**} = D^T g_{**} D.$$

Taking determinants gives

$$\det(\bar{g}_{**}) = \det(D^T)\det(g_{**})\det(D) = \det(D)^2\det(g_{**}),$$

giving

$$\pm 1 = \pm\det(D)^2,$$

which must mean that $\det(D)^2 = +1$, so that $\det(D) = \pm 1$ as claimed.

Note that the above proposition also works if we use units in which $\det g = -c^2$,
as in Lorentz frames.
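As a concrete check of Proposition 9.6, here is a minimal sketch (not part of the original notes) that takes a Lorentz boost in one space and one time dimension, in units with c = 1, as the change-of-coordinates matrix D between two inertial frames. It confirms numerically that $D^T g_{**} D = g_{**}$ and $\det D = 1$. The helper names and the sample speed 0.6 are illustrative assumptions.

```python
import math

def boost(v):
    """Matrix of a 1+1 dimensional Lorentz boost with speed v (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, gamma * v],
            [gamma * v, gamma]]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

g = [[1.0, 0.0], [0.0, -1.0]]                 # metric diag(1, -1) in an inertial frame
D = boost(0.6)                                # sample speed, an arbitrary choice
g_bar = matmul(matmul(transpose(D), g), D)    # g-bar = D^T g D
```

Here `g_bar` reproduces `g` up to rounding, so the barred frame is again inertial, and `det2(D)` is 1, as the proposition requires for two frames of the same parity.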

Definition 9.7 Two (not necessarily inertial) frames $x$ and $\bar{x}$ have the
same parity if $\det \bar{D} > 0$. An orientation of M is an atlas of M such that
all the charts have the same parity. M is called orientable if it has such
an atlas, and oriented if it is equipped with one.

Notes
1. Reversing the direction of any one of the axes reverses the orientation.
2. It follows that every orientable manifold has two orientations; one
corresponding to each choice of equivalence class of orientations.
3. If M is an oriented manifold and m M, then we can choose an oriented
inertial frame at m, so that the change-of-coordinates matrix D has positive
determinant. Further, if D happens to be the change-of-coordinates from one
oriented inertial frame to another, then det(D) = +1.
4. $E_3$ has two orientations: one given by any left-handed system, and the
other given by any right-handed system.
5. In the homework, you will see that spheres are orientable, whereas Klein
bottles are not.

We now show how we can use inertial frames to construct a tensor field.

Definition 9.8 Let M be an oriented n-dimensional Riemannian
manifold. The Levi-Civita tensor $\epsilon$ of type (0, n) is defined as follows.
If $\bar{x}$ is any coordinate system and $m \in M$, then define

$$\bar{\epsilon}_{i_1 i_2 \dots i_n}(m) = \det\,(D_{i_1} D_{i_2} \dots D_{i_n})$$

that is, the determinant of $D$ with columns permuted according to the indices,
where $D_j$ is the j th column of the change-of-coordinates matrix $\partial x^k/\partial \bar{x}^l$,
and where $x$ is any oriented inertial frame at m.

Notes

1. $\epsilon$ is a completely antisymmetric tensor. If $\bar{x}$ is itself an inertial frame, then,
since $\det(D) = +1$ (see Note 3 above), the coordinates of $\epsilon(m)$ are given by

$$\bar{\epsilon}_{i_1 i_2 \dots i_n}(m) =
\begin{cases}
1, & \text{if } (i_1, i_2, \dots, i_n) \text{ is an even permutation of } (1, 2, \dots, n) \\
-1, & \text{if } (i_1, i_2, \dots, i_n) \text{ is an odd permutation of } (1, 2, \dots, n)
\end{cases}$$

2. I have not seen this tensor defined in this generality in any of the sources
I consulted. Note that this tensor cannot be defined without a metric being
present. In the absence of a metric, the best you can do is define a "relative
tensor," which is not quite the same, and what Rund calls the "Levi-Civita
symbols" in his book. Wheeler, et al. just define it for Minkowski space.

(Compare this with the metric tensor, which is also "nice" in inertial
frames.)
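To make Definition 9.8 concrete, the following sketch (an illustration, not taken from the notes) computes $\det\,(D_{i_1} \dots D_{i_n})$ by literally reordering the columns of a matrix. It uses 0-based indices, and the sample matrix `D` is an arbitrary assumption. For the identity matrix the result is the sign of the permutation, and for a general `D` it is that sign times `det(D)`.

```python
from itertools import permutations

def det(A):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def levi_civita(D, indices):
    """det(D_{i1} D_{i2} ... D_{in}): columns of D taken in the given order (0-based)."""
    permuted = [[row[i] for i in indices] for row in D]
    return det(permuted)

def sign(perm):
    """Sign of a permutation of (0, ..., n-1), found by counting inversions."""
    inversions = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
                     if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # D when x-bar is the inertial frame x itself
D = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]          # arbitrary sample matrix

# In an inertial frame the coordinates are just the permutation signs.
signs_match = all(levi_civita(identity, p) == sign(p) for p in permutations(range(3)))
```

A repeated index such as `(0, 0, 1)` repeats a column, so the determinant vanishes, which is the content of Exercise 6 below.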

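Proposition 9.9 (Levi-Civita Tensor)

The Levi-Civita tensor is a well-defined, smooth tensor field.

Proof To show that it is well-defined, we must show independence of the
choice of inertial frames. But, if $\epsilon$ and $\mu$ are defined at $m \in M$ as above by
using two different inertial frames, with corresponding change-of-coordinates
matrices $D$ and $E$, then $\bar{D}E$ is the change-of-coordinates from
one inertial frame to another, and therefore has determinant 1. Now,

$$\begin{aligned}
\bar{\epsilon}_{i_1 i_2 \dots i_n}(m) &= \det\,(D_{i_1} D_{i_2} \dots D_{i_n}) \\
&= \det\, D\, I_{i_1 i_2 \dots i_n} && \text{(where } I_{i_1 i_2 \dots i_n} \text{ is the identity matrix with columns ordered as shown in the indices)} \\
&= \det\, D \bar{D} E\, I_{i_1 i_2 \dots i_n} && \text{(since } \bar{D}E \text{ has determinant 1; this being where we use the fact that things are oriented!)} \\
&= \det\, E\, I_{i_1 i_2 \dots i_n} && \text{(since } D\bar{D} = I\text{)} \\
&= \bar{\mu}_{i_1 i_2 \dots i_n},
\end{aligned}$$

showing it is well-defined at each point. We now show that it is a tensor. If
$\bar{x}$ and $\bar{y}$ are any two oriented coordinate systems at m, with change-of-coordinate
matrices $D$ and $E$ with respect to some inertial frame $x$ at m,
and if the coordinates of the tensor with respect to these coordinates are
$\bar{\epsilon}_{k_1 k_2 \dots k_n}$ and $\bar{\mu}_{r_1 r_2 \dots r_n} = \det\,(E_{r_1} E_{r_2} \dots E_{r_n})$ respectively, then at the point m,

$$\begin{aligned}
\bar{\epsilon}_{k_1 k_2 \dots k_n} &= \det\,(D_{k_1} D_{k_2} \dots D_{k_n}) \\
&= \epsilon_{i_1 i_2 \dots i_n}\, \frac{\partial x^{i_1}}{\partial \bar{x}^{k_1}} \frac{\partial x^{i_2}}{\partial \bar{x}^{k_2}} \cdots \frac{\partial x^{i_n}}{\partial \bar{x}^{k_n}} \\
&\qquad \text{(by definition of the determinant(!) since } \epsilon_{i_1 i_2 \dots i_n} \text{ is just the sign of the permutation!)} \\
&= \epsilon_{i_1 i_2 \dots i_n}\, \frac{\partial x^{i_1}}{\partial \bar{y}^{r_1}} \frac{\partial x^{i_2}}{\partial \bar{y}^{r_2}} \cdots \frac{\partial x^{i_n}}{\partial \bar{y}^{r_n}}\; \frac{\partial \bar{y}^{r_1}}{\partial \bar{x}^{k_1}} \frac{\partial \bar{y}^{r_2}}{\partial \bar{x}^{k_2}} \cdots \frac{\partial \bar{y}^{r_n}}{\partial \bar{x}^{k_n}} \\
&= \bar{\mu}_{r_1 r_2 \dots r_n}\, \frac{\partial \bar{y}^{r_1}}{\partial \bar{x}^{k_1}} \frac{\partial \bar{y}^{r_2}}{\partial \bar{x}^{k_2}} \cdots \frac{\partial \bar{y}^{r_n}}{\partial \bar{x}^{k_n}},
\end{aligned}$$

showing that the tensor transforms correctly. Finally, we assert that
$\det\,(D_{k_1} D_{k_2} \dots D_{k_n})$ is a smooth function of the point m. This depends on the
change-of-coordinate matrices to the inertial coordinates. But we saw that
we could construct inertial frames by setting

$$\left[\frac{\partial x^i}{\partial \bar{x}^j}\right]_m = V(j)^i,$$

where the V(j) were an orthogonal base of the tangent space at m. Since we
can vary the coordinates of this base smoothly, the smoothness follows.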

Example
In $E_3$, the Levi-Civita tensor coincides with the totally antisymmetric
third-order tensor $\epsilon_{ijk}$ in Exercise Set 4. In the Exercises, we see how to use
it to generalize the cross-product.

Exercise Set 9
1. Recall that we can define the arc length of a smooth non-null curve by

$$s(t) = \int_a^t \left(\pm g_{ij}\,\frac{dx^i}{du}\frac{dx^j}{du}\right)^{1/2} du.$$

Assuming that this function is invertible (so that we can express $x^i$ as a
function of $s$), show that

$$\left\|\frac{dx^i}{ds}\right\|^2 = \pm 1.$$

2. Derive the equations for a geodesic with respect to the parameter t.

3. Obtain an analogue of Corollary 9.3 for the covariant partial derivatives
of type (2, 0) tensors.

4. Use an inertial frames argument to prove that $g_{ab|c} = g^{ab}{}_{|c} = 0$. (Also see
Exercise Set 3 #1.)

5. Show that, if the columns of a matrix D are orthonormal, then det D = ±1.

6. Prove that, if $\epsilon$ is the Levi-Civita tensor, then, in any frame, $\epsilon_{i_1 i_2 \dots i_n} = 0$
whenever two of the indices are equal. Thus, the only non-zero coordinates
occur when all the indices differ.

7. Use the Levi-Civita tensor to show that, if x is any inertial frame at m, and
if X(1), . . . , X(n) are any n contravariant vectors at m, then

$$\det\,\langle X(1)\,|\, \dots \,|\,X(n) \rangle$$

is a scalar.

8. The Volume 1-Form (A Generalization of the Cross Product)

If we are given n-1 vector fields X(2), X(3), . . . , X(n) on the n-manifold M,
define a covariant vector field by

$$(X(2) \wedge X(3) \wedge \dots \wedge X(n))_j = \epsilon_{j i_2 \dots i_n}\, X(2)^{i_2} X(3)^{i_3} \cdots X(n)^{i_n},$$

where $\epsilon$ is the Levi-Civita tensor. Show that, in any inertial frame at a point
m on a Riemannian 4-manifold, $\|X(2) \wedge X(3) \wedge X(4)\|^2$ evaluated at the point
m, coincides, up to sign, with the square of the usual volume of the
three-dimensional parallelepiped spanned by these vectors by justifying the
following facts.
(a) Restricting your attention to Riemannian 4-manifolds, let A, B, and C be
vectors at m, and suppose (as you may) that you have chosen an inertial
frame at m with the property that $A^1 = B^1 = C^1 = 0$. (Think about why you
can do this.) Show that, in this frame, $A \wedge B \wedge C$ has only one nonzero
coordinate: the first.
(b) Show that, if we consider A, B and C as 3-vectors a, b and c respectively
by ignoring their first (zero) coordinate, then

$$(A \wedge B \wedge C)_1 = a \cdot (b \times c),$$

which we know to be ± the volume of the parallelepiped spanned by a, b and
c.
(c) Defining $\|C\|^2 = C_i C_j g^{ij}$ (recall that $g^{ij}$ is the inverse of $g_{kl}$), deduce that
the scalar $\|A \wedge B \wedge C\|^2$ is numerically equal to the square of the volume of the
parallelepiped spanned by the vectors a, b and c. (Note also that we always get the
same answer for $\|A \wedge B \wedge C\|^2$, no matter what coordinate system we choose.)

9. Define the Levi-Civita tensor of type (n, 0), and show that

$$\epsilon_{i_1 i_2 \dots i_n}\,\epsilon^{j_1 j_2 \dots j_n} =
\begin{cases}
1 & \text{if } (i_1, \dots, i_n) \text{ is an even permutation of } (j_1, \dots, j_n) \\
-1 & \text{if } (i_1, \dots, i_n) \text{ is an odd permutation of } (j_1, \dots, j_n)
\end{cases}$$

Last Updated: January, 2002
Copyright © Stefan Waner

The Riemann Curvature Tensor http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 10: The Riemann Curvature Tensor



10. The Riemann Curvature Tensor

First, we need to know how to translate a vector along a curve C. Let $X^j$ be a
vector field. We have seen that a parallel vector field of constant length on
M must satisfy

$$\frac{DX^j}{dt} = 0 \qquad \text{...... (I)}$$

for any path C in M.

Definition 10.1 The vector field $X^j$ is parallel along the curve C if it
satisfies

$$\frac{DX^j}{dt} = \frac{dX^j}{dt} + \Gamma_i{}^j{}_h X^i \frac{dx^h}{dt} = 0,$$

for the specific curve C.

If $X^j$ is parallel along C, which has parametrization with domain [a, b] and
corresponding points $\alpha$ and $\beta$ on M, then, since

$$\frac{dX^j}{dt} = -\Gamma_i{}^j{}_h X^i \frac{dx^h}{dt} \qquad \text{......... (I)}$$

we can integrate to obtain

$$X^j(\beta) = X^j(\alpha) - \int_a^b \Gamma_i{}^j{}_h X^i \frac{dx^h}{dt}\, dt \qquad \text{......... (II)}$$

Question Given a fixed vector $X^j(\alpha)$ at the point $\alpha \in M$, and a curve C
originating at $\alpha$, is it possible to define a vector field along C by transporting
the vector along C in a parallel fashion?
Answer Yes. Notice that the formula (II) is no good for this, since the
integral already requires $X^j$ to be defined along the curve before we start.
But we can go back to (I), which is a system of first order linear differential
equations. Such a system always has a unique solution with given initial
conditions specified by $X^j(\alpha)$. Note however that it gives $X^j$ as a function of
the parameter t, and not necessarily as a well-defined function of position on
M. If it does, then we have a parallelizable manifold.

Definition 10.2 If $X^j(\alpha)$ is any vector at the point $\alpha \in M$, and if C is any
path from $\alpha$ to $\beta$ in M, then the parallel transport of $X^j(\alpha)$ along C is
the vector $X^j(\beta)$ given by the solution to the system (I) with initial
conditions given by $X^j(\alpha)$.
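The following sketch (an illustration, not from the notes) solves system (I) numerically for one concrete case: the unit 2-sphere with coordinates $x^1 = \theta$, $x^2 = \phi$ and metric diag($\sin^2\phi$, 1), transporting a vector once around the circle of latitude $\phi = \phi_0$. The Runge-Kutta integrator, the function names and the choice $\phi_0 = 1$ are assumptions made for the example; the only non-zero Christoffel symbols used are $\Gamma_1{}^1{}_2 = \Gamma_2{}^1{}_1 = \cot\phi$ and $\Gamma_1{}^2{}_1 = -\sin\phi\cos\phi$.

```python
import math

def transport_around_latitude(X, phi0, steps=2000):
    """Integrate system (I) with RK4 along theta = t, phi = phi0, for t in [0, 2*pi].

    X = (X^1, X^2) holds the theta- and phi-components on the unit sphere.
    """
    cot = math.cos(phi0) / math.sin(phi0)     # Gamma_1^1_2 = Gamma_2^1_1
    sc = math.sin(phi0) * math.cos(phi0)      # equals -Gamma_1^2_1

    def rhs(Y):
        # dX^j/dt = -Gamma_i^j_h X^i dx^h/dt, with dx^1/dt = 1 and dx^2/dt = 0
        return (-cot * Y[1], sc * Y[0])

    h = 2.0 * math.pi / steps
    Y = tuple(X)
    for _ in range(steps):
        k1 = rhs(Y)
        k2 = rhs((Y[0] + h / 2 * k1[0], Y[1] + h / 2 * k1[1]))
        k3 = rhs((Y[0] + h / 2 * k2[0], Y[1] + h / 2 * k2[1]))
        k4 = rhs((Y[0] + h * k3[0], Y[1] + h * k3[1]))
        Y = (Y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             Y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return Y

def norm_squared(X, phi0):
    """g_ij X^i X^j for g = diag(sin^2(phi), 1)."""
    return math.sin(phi0) ** 2 * X[0] ** 2 + X[1] ** 2

phi0 = 1.0
start = (0.0, 1.0)                            # unit vector pointing along phi
end = transport_around_latitude(start, phi0)
```

The transported vector keeps its length but comes back rotated by the angle $2\pi\cos\phi_0$, so on the sphere the solution of (I) is a function of the parameter t and not of position.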

Examples 10.3
(a) If C is a geodesic in M given by $x^i = x^i(s)$, where we are using arc-length
s as the parameter (see Exercise Set 8 #1), then the vector field $dx^i/ds$ is
parallel along C. (Note that this field is only defined along C, but (I) still
makes sense.) Why? Because

$$\frac{D(dx^j/ds)}{Ds} = \frac{d^2x^j}{ds^2} + \Gamma_i{}^j{}_h \frac{dx^i}{ds}\frac{dx^h}{ds},$$

which must be zero for a geodesic.

(b) Proper Coordinates in Relativity Along Geodesics

According to relativity, we live in a Riemannian 4-manifold M, but not the
flat Minkowski space. Further, the metric in M has signature (1, 1, 1, -1).
Suppose C is a geodesic in M given by $x^i = x^i(t)$, satisfying the property

$$\left\langle \frac{dx^i}{dt}, \frac{dx^i}{dt} \right\rangle < 0.$$

Recall that we refer to such a geodesic as timelike. Looking at the
discussion before Definition 7.1, we see that this corresponds, in Minkowski
space, to a particle traveling at sub-light speed. It follows that we can
choose an orthonormal basis of vectors {V(1), V(2), V(3), V(4)} of the
tangent space at m with the property given in the proof of 9.2, with V(4) =
$dx^i/dt$. We think of V(4) as the unit vector in the direction of time, and V(1),
V(2) and V(3) as the spatial basis vectors. Using parallel translation, we
obtain a similar set of vectors at each point along the path. (The fact that
the curve is a geodesic guarantees that parallel translation of the time axis
will remain parallel to the curve.) Finally, we can use the construction in
9.2 to flesh these frames out to full coordinate systems defined along the
path. (Just having a set of orthogonal vectors in a manifold does not give a
unique coordinate system, so we choose the unique local inertial one there,
because in the eyes of the observer, spacetime should be flat.)

Question Does parallel transport preserve the relationship of these vectors
to the curve? That is, does the vector V(4) remain parallel, and do the
vectors {V(1), V(2), V(3), V(4)} remain orthogonal in the sense of 8.2?
Answer If X and Y are vector fields, then

$$\frac{d}{dt}\langle X, Y \rangle = \left\langle \frac{DX}{dt}, Y \right\rangle + \left\langle X, \frac{DY}{dt} \right\rangle,$$

where the big D's denote covariant differentiation (Exercise Set 8 #9). But,
since the terms on the right vanish for fields that have been parallel
transported, we see that $\langle X, Y \rangle$ is independent of t, which means that
orthogonal vectors remain orthogonal and that all the directions and
magnitudes are preserved, as claimed.

Note At each point on the curve, we have a different coordinate system! All
this means is that we have a huge collection of charts in our atlas; one
corresponding to each point on the path. This (moving) coordinate system is
called the momentary comoving frame of reference and corresponds to
the "real life" coordinate systems.

(c) Proper Coordinates in Relativity Along Non-Geodesics

If the curve is not a geodesic, then the parallel transport of a tangent vector
need no longer be tangent. Thus, we cannot simply parallel translate the
coordinate axes along the world line to obtain new ones, since the resulting
frame may not be Lorentz. We shall see in Section 11 how to correct for that
when we construct our comoving reference frames.

Question Under what conditions is parallel transport independent of the
path? If this were the case, then we could use formula (I) to create a whole
parallel vector field of constant length on M, since then $DX^j/dt = 0$.
Answer To answer this question, let us experiment a little with a fixed
vector $V = X^j(a)$ by parallel translating it around a little rectangle
consisting of four little paths. To simplify notation, let the first two
coordinates of the starting point of the path (in some coordinates) be given
by

$$x^1(a) = r, \qquad x^2(a) = s.$$

Then, choose $\delta r$ and $\delta s$ so small that the following paths are within the
coordinate neighborhood in question:

$$C_1:\; x^j(t) = \begin{cases} x^j(a) & \text{if } j \neq 1 \text{ or } 2 \\ r + t\,\delta r & \text{if } j = 1 \\ s & \text{if } j = 2 \end{cases}
\qquad
C_2:\; x^j(t) = \begin{cases} x^j(a) & \text{if } j \neq 1 \text{ or } 2 \\ r + \delta r & \text{if } j = 1 \\ s + t\,\delta s & \text{if } j = 2 \end{cases}$$

$$C_3:\; x^j(t) = \begin{cases} x^j(a) & \text{if } j \neq 1 \text{ or } 2 \\ r + (1-t)\,\delta r & \text{if } j = 1 \\ s + \delta s & \text{if } j = 2 \end{cases}
\qquad
C_4:\; x^j(t) = \begin{cases} x^j(a) & \text{if } j \neq 1 \text{ or } 2 \\ r & \text{if } j = 1 \\ s + (1-t)\,\delta s & \text{if } j = 2 \end{cases}$$

These paths form a small rectangle with corners a, b, c, d, traversed along
$C_1$ (from a to b), $C_2$ (from b to c), $C_3$ (from c to d) and $C_4$ (from d back to a).
[Diagram not reproduced.]

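Now, if we parallel transport $X^j(a)$ along $C_1$, we must have, by (II),

$$\begin{aligned}
X^j(b) &= X^j(a) - \int_0^1 \Gamma_i{}^j{}_h X^i \frac{dx^h}{dt}\,dt && \text{(since t goes from 0 to 1 in the path } C_1\text{)} \\
&= X^j(a) - \int_0^1 \Gamma_i{}^j{}_1 X^i\, \delta r\, dt && \text{(see the definition of } C_1 \text{ above; only } x^1 \text{ changes...)}
\end{aligned}$$

Warning: The integrand term $\Gamma_i{}^j{}_1 X^i$ is not constant, and must be
evaluated as a function of t using the path $C_1$. However, if the path is a small
one, then the integrand is approximately equal to its value at the midpoint
of the path segment:

$$\begin{aligned}
X^j(b) &\approx X^j(a) - \Gamma_i{}^j{}_1 X^i(\text{midpoint of } C_1)\,\delta r \\
&\approx X^j(a) - \left[\Gamma_i{}^j{}_1 X^i(a) + 0.5\,\frac{\partial}{\partial x^1}\big(\Gamma_i{}^j{}_1 X^i\big)\,\delta r\right]\delta r
\end{aligned}$$

where the partial derivative is evaluated at the point a. Similarly,

$$\begin{aligned}
X^j(c) &= X^j(b) - \int_0^1 \Gamma_i{}^j{}_2 X^i\, \delta s\, dt \\
&\approx X^j(b) - \Gamma_i{}^j{}_2 X^i(\text{midpoint of } C_2)\,\delta s \\
&\approx X^j(b) - \left[\Gamma_i{}^j{}_2 X^i(a) + \frac{\partial}{\partial x^1}\big(\Gamma_i{}^j{}_2 X^i\big)\,\delta r + 0.5\,\frac{\partial}{\partial x^2}\big(\Gamma_i{}^j{}_2 X^i\big)\,\delta s\right]\delta s
\end{aligned}$$

where all partial derivatives are evaluated at the point a. (This makes sense
because the field is defined where we need it.)

$$\begin{aligned}
X^j(d) &= X^j(c) + \int_0^1 \Gamma_i{}^j{}_1 X^i\, \delta r\, dt \\
&\approx X^j(c) + \Gamma_i{}^j{}_1 X^i(\text{midpoint of } C_3)\,\delta r \\
&\approx X^j(c) + \left[\Gamma_i{}^j{}_1 X^i(a) + 0.5\,\frac{\partial}{\partial x^1}\big(\Gamma_i{}^j{}_1 X^i\big)\,\delta r + \frac{\partial}{\partial x^2}\big(\Gamma_i{}^j{}_1 X^i\big)\,\delta s\right]\delta r
\end{aligned}$$

and the vector arrives back at the point a according to

$$\begin{aligned}
X^{*j}(a) &= X^j(d) + \int_0^1 \Gamma_i{}^j{}_2 X^i\, \delta s\, dt && (X^{*j}(a) \text{ is the new vector at the point a)} \\
&\approx X^j(d) + \Gamma_i{}^j{}_2 X^i(\text{midpoint of } C_4)\,\delta s \\
&\approx X^j(d) + \left[\Gamma_i{}^j{}_2 X^i(a) + 0.5\,\frac{\partial}{\partial x^2}\big(\Gamma_i{}^j{}_2 X^i\big)\,\delta s\right]\delta s
\end{aligned}$$

To get the total change in the vector, you substitute back a few times and
cancel lots of terms (including the ones with 0.5 in front), being left with

$$\delta X^j = X^{*j}(a) - X^j(a) \approx \left[\frac{\partial}{\partial x^2}\big(\Gamma_i{}^j{}_1 X^i\big) - \frac{\partial}{\partial x^1}\big(\Gamma_i{}^j{}_2 X^i\big)\right]\delta r\,\delta s$$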

To analyze the partial derivatives in there, we first use the product rule,
getting

$$\delta X^j \approx \left[X^i \frac{\partial}{\partial x^2}\Gamma_i{}^j{}_1 + \Gamma_i{}^j{}_1 \frac{\partial}{\partial x^2}X^i - X^i \frac{\partial}{\partial x^1}\Gamma_i{}^j{}_2 - \Gamma_i{}^j{}_2 \frac{\partial}{\partial x^1}X^i\right]\delta r\,\delta s \qquad \text{......... (III)}$$

Next, we recall the "chain rule" formula

$$\frac{DX^j}{dt} = X^j{}_{|h}\,\frac{dx^h}{dt}$$

in the homework. Since the term on the right must be zero along each of the
path segments we see that (I) is equivalent to saying that the partial
derivatives

$$X^j{}_{|h} = 0$$

for every index j and h (and along the relevant path segment; notice that
we are taking partial derivatives in the direction of the path, so that they do
make sense for this curious field that is only defined along the square path!)
since the terms $dx^h/dt$ are non-zero. By definition of the partial derivatives,
this means that

$$\frac{\partial X^j}{\partial x^h} + \Gamma_i{}^j{}_h X^i = 0,$$

so that

$$\frac{\partial X^j}{\partial x^h} = -\Gamma_i{}^j{}_h X^i.$$

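We now substitute these expressions in (III) to obtain

$$\delta X^j \approx \left[X^i \frac{\partial}{\partial x^2}\Gamma_i{}^j{}_1 - \Gamma_i{}^j{}_1 \Gamma_p{}^i{}_2 X^p - X^i \frac{\partial}{\partial x^1}\Gamma_i{}^j{}_2 + \Gamma_i{}^j{}_2 \Gamma_p{}^i{}_1 X^p\right]\delta r\,\delta s$$

where everything in the square brackets is evaluated at a. Now change the
dummy indices in the first and third terms and obtain

$$\delta X^j \approx \left[\frac{\partial}{\partial x^2}\Gamma_p{}^j{}_1 - \Gamma_i{}^j{}_1 \Gamma_p{}^i{}_2 - \frac{\partial}{\partial x^1}\Gamma_p{}^j{}_2 + \Gamma_i{}^j{}_2 \Gamma_p{}^i{}_1\right] X^p\,\delta r\,\delta s$$

This formula has the form

$$\delta X^j \approx R_p{}^j{}_{12}\, X^p\,\delta r\,\delta s \qquad \text{............ (IV)}$$

(indices borrowed from the Christoffel symbol in the first term, with the
extra index from the x in the denominator) where the quantity $R_p{}^j{}_{12}$ is
known as the curvature tensor.

Curvature Tensor

$$R_b{}^a{}_{cd} = \Gamma_b{}^i{}_c\,\Gamma_i{}^a{}_d - \Gamma_b{}^i{}_d\,\Gamma_i{}^a{}_c + \frac{\partial \Gamma_b{}^a{}_c}{\partial x^d} - \frac{\partial \Gamma_b{}^a{}_d}{\partial x^c}$$

The terms are rearranged (and the Christoffel symbols switched) so you can
see the index pattern, and also that the curvature is antisymmetric in the
last two covariant indices.

$$R_b{}^a{}_{cd} = -R_b{}^a{}_{dc}$$

The fact that it is a tensor follows from the homework.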
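Formula (IV) can be checked numerically. The sketch below (an illustration under stated assumptions, not part of the notes) works on the unit 2-sphere with coordinates $x^1 = \theta$, $x^2 = \phi$ and metric diag($\sin^2\phi$, 1), transports a vector around the small rectangle $C_1, \dots, C_4$ by integrating (I) with a Runge-Kutta step, and compares the change with $R_p{}^j{}_{12} X^p\,\delta r\,\delta s$. For this metric the boxed formula gives $R_1{}^2{}_{12} = \sin^2\phi$ and $R_2{}^1{}_{12} = -1$. Indices in the code are 0-based, and all function names, the starting point and the step sizes are choices made for the example.

```python
import math

def christoffel(j, i, h, x):
    """Gamma_i^j_h on the unit sphere at x = (theta, phi), metric diag(sin^2 phi, 1)."""
    phi = x[1]
    if j == 0 and {i, h} == {0, 1}:
        return math.cos(phi) / math.sin(phi)
    if j == 1 and i == 0 and h == 0:
        return -math.sin(phi) * math.cos(phi)
    return 0.0

def transport(X, start, end, steps=200):
    """Parallel transport X along the straight coordinate segment start -> end (RK4 on (I))."""
    vel = [e - s for s, e in zip(start, end)]          # dx^h/dt for t in [0, 1]

    def rhs(t, Y):
        x = [s + t * v for s, v in zip(start, vel)]
        return [-sum(christoffel(j, i, h, x) * Y[i] * vel[h]
                     for i in range(2) for h in range(2)) for j in range(2)]

    dt, t, Y = 1.0 / steps, 0.0, list(X)
    for _ in range(steps):
        k1 = rhs(t, Y)
        k2 = rhs(t + dt / 2, [Y[n] + dt / 2 * k1[n] for n in range(2)])
        k3 = rhs(t + dt / 2, [Y[n] + dt / 2 * k2[n] for n in range(2)])
        k4 = rhs(t + dt, [Y[n] + dt * k3[n] for n in range(2)])
        Y = [Y[n] + dt / 6 * (k1[n] + 2 * k2[n] + 2 * k3[n] + k4[n]) for n in range(2)]
        t += dt
    return Y

def loop_change(X, a, dr, ds):
    """X*(a) - X(a) after transport around the rectangle a -> b -> c -> d -> a."""
    b, c, d = [a[0] + dr, a[1]], [a[0] + dr, a[1] + ds], [a[0], a[1] + ds]
    Y = list(X)
    for p, q in ((a, b), (b, c), (c, d), (d, a)):
        Y = transport(Y, p, q)
    return [Y[n] - X[n] for n in range(2)]

def predicted_change(X, a, dr, ds):
    """Right-hand side of (IV) with R_2^1_12 = -1 and R_1^2_12 = sin^2(phi)."""
    return [-X[1] * dr * ds, math.sin(a[1]) ** 2 * X[0] * dr * ds]

a, X, dr, ds = [0.3, 1.0], [0.7, -0.4], 0.01, 0.01
numeric = loop_change(X, a, dr, ds)
predicted = predicted_change(X, a, dr, ds)
```

With these step sizes the two results agree up to terms of third order in $\delta r$ and $\delta s$, which is the accuracy the midpoint argument above promises.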

It now follows from a grid argument that, if C is any (possibly large) planar
closed path within a coordinate neighborhood, then, if X is parallel
transported around the loop, it arrives back at the starting point with
change given by a sum of contributions of the form (IV). If the loop is not
planar, we choose a coordinate system that makes it planar, and if the loop
is too large for a single coordinate chart, then we can break it into a grid so
that each piece falls within a coordinate neighborhood. Thus we see the
following.

Proposition 10.4 (Curvature and Parallel Transport)


Assume M is simply connected. A necessary and sufficient condition
that parallel transport be independent of the path is that the curvature
tensor vanishes.

Definition 10.5 A manifold with zero curvature is called flat.

Properties of the Curvature Tensor

We first obtain a more explicit description of $R_b{}^a{}_{cd}$ in terms of the partial
derivatives of the $g_{ij}$. First, we have the notation

$$g_{ij,k} = \frac{\partial g_{ij}}{\partial x^k}$$

for partial derivatives, and remember that these are not tensors. Then, the
Christoffel symbols and curvature tensor are given in the convenient form

$$\Gamma_b{}^a{}_c = \frac{1}{2}\, g^{ak}\,(g_{ck,b} + g_{kb,c} - g_{bc,k})$$

$$R_b{}^a{}_{cd} = \left[\Gamma_b{}^i{}_c\,\Gamma_i{}^a{}_d - \Gamma_b{}^i{}_d\,\Gamma_i{}^a{}_c + \Gamma_b{}^a{}_{c,d} - \Gamma_b{}^a{}_{d,c}\right]$$

(Notice that the indices c and d are switched in the negative terms.)

We can lower the index by defining

$$R_{abcd} = g_{bi}\,R_a{}^i{}_{cd}$$

Substituting the first of the above (boxed) formulas into the second, and
using symmetry of the second derivatives and the metric tensor, we find
(exercise set)

Covariant Curvature Tensor in Terms of the Metric Tensor

$$R_{abcd} = \frac{1}{2}\,(g_{bc,ad} - g_{bd,ac} + g_{ad,bc} - g_{ac,bd}) + \Gamma_a{}^j{}_d\,\Gamma_{bjc} - \Gamma_a{}^j{}_c\,\Gamma_{bjd}$$

(We can remember this by breaking the indices a, b, c, d into pairs other
than ab, cd (we can do this two ways); the pairs with a and d together are
positive, the others negative.)

Notes
1. The "new kinds" of Christoffel symbols $\Gamma_{ijk}$ are given by

$$\Gamma_{ijk} = g_{pj}\,\Gamma_i{}^p{}_k.$$

2. Some symmetry properties: $R_{abcd} = -R_{abdc} = -R_{bacd}$ and $R_{abcd} = R_{cdab}$ (see
the exercise set)

3. We can raise the index again by noting that

$$g^{bi}R_{aicd} = g^{bi}g_{ij}\,R_a{}^j{}_{cd} = \delta^b_j\,R_a{}^j{}_{cd} = R_a{}^b{}_{cd}.$$

Now, let us evaluate some partial derivatives in an inertial frame (so that we
can ignore the Christoffel symbols) cyclically permuting the last three
indices as we go:

$$\begin{aligned}
R_{abcd,e} + R_{abec,d} + R_{abde,c} = \frac{1}{2}\,(&g_{ad,bce} - g_{ac,bde} + g_{bc,ade} - g_{bd,ace} \\
&+ g_{ac,bed} - g_{ae,bcd} + g_{be,acd} - g_{bc,aed} \\
&+ g_{ae,bdc} - g_{ad,bec} + g_{bd,aec} - g_{be,adc}) \\
= 0.\ \ &
\end{aligned}$$

Now, I claim this is also true for the covariant partial derivatives:

Bianchi Identities

$$R_{abcd|e} + R_{abec|d} + R_{abde|c} = 0$$

Indeed, let us evaluate the left-hand side at any point $m \in M$. Choose an
inertial frame at m. Then the left-hand side coincides with $R_{abcd,e} + R_{abec,d}
+ R_{abde,c}$, which we have shown to be zero. Now, since a tensor which is
zero in one frame is zero in all frames, we get the result!

Definitions 10.6 The Ricci tensor is defined by

$$R_{ab} = R_a{}^i{}_{bi} = g^{ij}R_{ajbi}.$$

We can raise the indices of any tensor in the usual way, getting

$$R^{ab} = g^{ai}g^{bj}R_{ij}.$$

In the exercise set, you will show that it is symmetric, and also (up to sign)
is the only non-zero contraction of the curvature tensor.

We also define the Ricci scalar by

$$R = g^{ab}R_{ab} = g^{ab}g^{cd}R_{acbd}$$

The last thing we will do in this section is play around with the Bianchi
identities. Multiplying them by $g^{bc}$:

$$g^{bc}\,[R_{abcd|e} + R_{abec|d} + R_{abde|c}] = 0$$

Since $g^{ij}{}_{|k} = 0$ (see Exercise Set 8), we can slip the $g^{bc}$ into the derivative,
getting

$$-R_{ad|e} + R_{ae|d} + R_a{}^c{}_{de|c} = 0.$$

Contracting again gives

$$g^{ad}\,[-R_{ad|e} + R_{ae|d} + R_a{}^c{}_{de|c}] = 0,$$

or

$$-R_{|e} + R^d{}_{e|d} + R^{dc}{}_{de|c} = 0,$$

or

$$-R_{|e} + R^d{}_{e|d} + R^c{}_{e|c} = 0.$$

Combining terms and switching the order now gives

$$R^b{}_{e|b} - \frac{1}{2}R_{|e} = 0,$$

or

$$R^b{}_{e|b} - \frac{1}{2}\,\delta^b_e\,R_{|b} = 0.$$

Multiplying this by $g^{ae}$, we now get

$$R^{ab}{}_{|b} - \frac{1}{2}\,g^{ab}R_{|b} = 0,$$

or

$$G^{ab}{}_{|b} = 0,$$

where we make the following definition:

Einstein Tensor

$$G^{ab} = R^{ab} - \frac{1}{2}\,g^{ab}R$$

Einstein's field equation for a vacuum states that

$$G^{ab} = 0$$

(as we shall see later...).

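Example 10.7
Take the 2-sphere of radius r with polar coordinates, where we saw that

$$g_{**} = \begin{pmatrix} r^2\sin^2\phi & 0 \\ 0 & r^2 \end{pmatrix}.$$

The coordinates of the covariant curvature tensor are given by

$$R_{abcd} = \frac{1}{2}\,(g_{bc,ad} - g_{bd,ac} + g_{ad,bc} - g_{ac,bd}) + \Gamma_a{}^j{}_d\,\Gamma_{bjc} - \Gamma_a{}^j{}_c\,\Gamma_{bjd}.$$

Let us calculate $R_{\theta\phi\theta\phi}$. (Note: when we use Greek letters, we are referring to
specific terms, so there is no summation when the indices repeat!) So, $a = c
= \theta$, and $b = d = \phi$. (Incidentally, this is the same as $R_{\phi\theta\phi\theta}$ by the last
exercise below.)

The only non-vanishing second derivative of $g_{**}$ is

$$g_{\theta\theta,\phi\phi} = 2r^2(\cos^2\phi - \sin^2\phi),$$

so the second-derivative part of the formula is $-\frac{1}{2}g_{\theta\theta,\phi\phi} = r^2(\sin^2\phi - \cos^2\phi)$. Also,

$$\Gamma_a{}^j{}_c\,\Gamma_{bjd} = \Gamma_\theta{}^j{}_\theta\,\Gamma_{\phi j\phi} = 0,$$

since $b = d = \phi$ eliminates the second factor (two of these indices need to be
$\theta$ in order for the term not to vanish), and

$$\Gamma_a{}^j{}_d\,\Gamma_{bjc} = \Gamma_\theta{}^j{}_\phi\,\Gamma_{\phi j\theta} = \frac{1}{4}\left(\frac{2\cos\phi}{\sin\phi}\right)(2r^2\sin\phi\cos\phi) = r^2\cos^2\phi.$$

Combining all these terms gives

$$R_{\theta\phi\theta\phi} = r^2(\sin^2\phi - \cos^2\phi) + r^2\cos^2\phi = r^2\sin^2\phi.$$

We now calculate, using $R_{ab} = g^{cd}R_{acbd}$,

$$R_{\theta\theta} = g^{\phi\phi}R_{\theta\phi\theta\phi} = \sin^2\phi$$

and

$$R_{\phi\phi} = g^{\theta\theta}R_{\phi\theta\phi\theta} = \frac{\sin^2\phi}{\sin^2\phi} = 1.$$

All other terms vanish, since g is diagonal and $R_{****}$ is antisymmetric. This gives

$$R = g^{ab}R_{ab} = g^{\theta\theta}R_{\theta\theta} + g^{\phi\phi}R_{\phi\phi} = \frac{1}{r^2\sin^2\phi}\,(\sin^2\phi) + \frac{1}{r^2} = \frac{2}{r^2}.$$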
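The result of Example 10.7 can be reproduced numerically. The following sketch (illustrative only) evaluates the boxed formula for $R_b{}^a{}_{cd}$ on the sphere of radius r, using the known Christoffel symbols for $g_{**}$ = diag($r^2\sin^2\phi$, $r^2$) and central differences for their derivatives, then contracts to the Ricci tensor and Ricci scalar. Indices are 0-based (0 for $\theta$, 1 for $\phi$); the function names, the step size `h` and the sample point are assumptions of the example.

```python
import math

def christoffel(a, b, c, x):
    """Gamma_b^a_c for g = diag(r^2 sin^2(phi), r^2) at x = (theta, phi); independent of r."""
    phi = x[1]
    if a == 0 and {b, c} == {0, 1}:
        return math.cos(phi) / math.sin(phi)
    if a == 1 and b == 0 and c == 0:
        return -math.sin(phi) * math.cos(phi)
    return 0.0

def d_christoffel(a, b, c, d, x, h=1e-5):
    """Central-difference estimate of the derivative of Gamma_b^a_c with respect to x^d."""
    xp, xm = list(x), list(x)
    xp[d] += h
    xm[d] -= h
    return (christoffel(a, b, c, xp) - christoffel(a, b, c, xm)) / (2 * h)

def riemann(b, a, c, d, x):
    """R_b^a_cd in the sign and index convention of the boxed formula in the text."""
    quadratic = sum(christoffel(i, b, c, x) * christoffel(a, i, d, x)
                    - christoffel(i, b, d, x) * christoffel(a, i, c, x)
                    for i in range(2))
    return quadratic + d_christoffel(a, b, c, d, x) - d_christoffel(a, b, d, c, x)

def ricci(a, b, x):
    """R_ab = R_a^i_bi."""
    return sum(riemann(a, i, b, i, x) for i in range(2))

def ricci_scalar(x, r):
    """R = g^ab R_ab for the diagonal metric diag(r^2 sin^2(phi), r^2)."""
    phi = x[1]
    return ricci(0, 0, x) / (r * r * math.sin(phi) ** 2) + ricci(1, 1, x) / (r * r)

x, r = (0.3, 1.0), 2.0          # sample point and radius, arbitrary choices
R = ricci_scalar(x, r)          # close to 2 / r^2
```

Up to finite-difference error, `ricci(0, 0, x)` equals $\sin^2\phi$, `ricci(1, 1, x)` equals 1, and `R` equals $2/r^2$, in agreement with the hand computation.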

Summary of Some Properties of Curvature Etc.

$$\Gamma_a{}^b{}_c = \Gamma_c{}^b{}_a \qquad \Gamma_{abc} = \Gamma_{cba}$$

$$R_a{}^b{}_{cd} = -R_a{}^b{}_{dc}$$

$$R_{abcd} = -R_{bacd} \qquad R_{abcd} = -R_{abdc}$$

$$R_{abcd} = R_{cdab} \quad \text{(Note that a,b and c,d always go together.)}$$

$$R_{ab} = R_a{}^i{}_{bi} = g^{ij}R_{ajbi}$$

$$R_{ab} = R_{ba}$$

$$R = g^{ab}R_{ab} = g^{ac}g^{bd}R_{abcd}$$

$$R^a{}_b = g^{ai}R_{ib}$$

$$R^{ab} = g^{ai}g^{bj}R_{ij}$$

$$G^{ab} = R^{ab} - \frac{1}{2}\,g^{ab}R$$

Exercise Set 10

1. Derive the formula for the covariant form of the curvature tensor in terms
of the $g_{ij}$.

2. (a) Show that the curvature tensor is antisymmetric in the last pair of
variables:

$$R_b{}^a{}_{cd} = -R_b{}^a{}_{dc}$$

(b) Use part (a) to show that the Ricci tensor is, up to sign, the only
non-zero contraction of the curvature tensor.
(c) Prove that the Ricci tensor is symmetric.

3. (cf. Rund, pp. 82-83)

(a) Show that

$$\begin{aligned}
X^j{}_{|h|k} &= \frac{\partial}{\partial x^k}\big(X^j{}_{|h}\big) + \Gamma_m{}^j{}_k\,\big(X^m{}_{|h}\big) - \Gamma_h{}^l{}_k\,\big(X^j{}_{|l}\big) \\
&= \frac{\partial^2 X^j}{\partial x^h \partial x^k} + \frac{\partial X^l}{\partial x^k}\,\Gamma_l{}^j{}_h + X^l \frac{\partial}{\partial x^k}\Gamma_l{}^j{}_h + \Gamma_m{}^j{}_k \frac{\partial X^m}{\partial x^h} + \Gamma_m{}^j{}_k\,\Gamma_l{}^m{}_h X^l - \Gamma_h{}^l{}_k\,\big(X^j{}_{|l}\big)
\end{aligned}$$

(b) Deduce that

$$X^j{}_{|h|k} - X^j{}_{|k|h} = R_l{}^j{}_{hk}X^l - S_h{}^l{}_k\,X^j{}_{|l} = R_l{}^j{}_{hk}X^l$$

where

$$S_h{}^l{}_k = \Gamma_h{}^l{}_k - \Gamma_k{}^l{}_h = 0.$$

(c) Now deduce that the curvature tensor is indeed a type (1, 3) tensor.

4. Show that $R_{abcd}$ is antisymmetric on the pairs (a, b) and (c, d).

5. Show that $R_{abcd} = R_{cdab}$ by first checking the identity in an inertial frame.

Last Updated: January, 2002
Copyright © Stefan Waner

A Little More Relativity: Comoving Frames and ... http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 11: A Little More Relativity: Comoving Frames and Proper Time

11. A Little More Relativity: Comoving Frames and Proper Time

First, we recall some terminology.

Definition 11.1 A Minkowskian 4-manifold is a 4-manifold in which
the metric has signature (1, 1, 1, -1) (e.g., the world according to
Einstein).

By Proposition 9.2, if M is Minkowskian and $m \in M$, then one can find a
locally inertial frame at m such that the metric at m has the form diag(1, 1,
1, -1). We actually have some flexibility: we can, if we like, adjust the scaling
of the $x^4$-coordinate to make the metric look like diag(1, 1, 1, $-c^2$). In that
case, the last coordinate is the local time coordinate. Later, we shall
convert to units of time to make c = 1, but for now, let us use this latter kind
of inertial frame.

Note If M is Minkowski space $E_4$, then inertial frames are nothing more
than Lorentz frames. (We saw in Theorem 6.3 that Lorentz frames were
characterized by the fact that the metric had the form diag(1, 1, 1, $-c^2$) at
every point, so they are automatically inertial everywhere.)

Now let C be a timelike curve in the Minkowskian 4-manifold M.

Definition 11.2 A momentary comoving reference frame for C
(MCRF) associates to each point $m \in C$ a locally inertial frame whose
last basis vector is parallel to the curve and in the direction of
increasing parameter s. Further, we require the frame coordinates to
vary smoothly with the parameter of the curve.

Proposition 11.3 (Existence of MCRF's)

If C is any timelike curve in the Minkowskian 4-manifold M, then there
exists an MCRF for C.

Proof Fix $p_0 \in C$ and a Lorentz frame W(1), W(2), W(3), W(4) of $M_{p_0}$ (so that
$g_{**}$ = diag(1, 1, 1, $-c^2$)). We want to change this set to a new Lorentz frame
V(1), V(2), V(3), V(4) with

$$V(4) = \frac{dx^i}{d\tau} \qquad \text{(Recall that } \tau = s/c\text{.)}$$

So let us take V(4) as above. Then it is tangent to C at $p_0$. Further,

$$\|V(4)\|^2 = \left\|\frac{dx^i}{d\tau}\right\|^2 = \left\|\frac{dx^i}{ds}\right\|^2 \left(\frac{ds}{d\tau}\right)^2 = (-1)\,c^2 = -c^2,$$

using Proposition 6.5. Intuitively, V(4) is the time axis for the observer at
$p_0$: it points in the direction of increasing proper time $\tau$. We can now invoke
Proposition 9.2 to flesh out this orthonormal set to obtain an inertial frame
at $p_0$. For the other vectors, take

$$V(i) = W(i) + \frac{2}{c^2}\,\langle W(i), V(4)\rangle\, V(4)$$

for i = 1, 2, 3. Then, for $i \neq j$,

$$\langle V(i), V(j)\rangle = \langle W(i), W(j)\rangle + \frac{4}{c^2}\,\langle W(i), V(4)\rangle\langle W(j), V(4)\rangle + \frac{4}{c^4}\,\langle W(i), V(4)\rangle\langle W(j), V(4)\rangle\,\|V(4)\|^2 = 0$$

by orthogonality of the W's and the calculation of $\|V(4)\|^2$ above. Also,

$$\langle V(i), V(i)\rangle = \langle W(i), W(i)\rangle + \frac{4}{c^2}\,\langle W(i), V(4)\rangle^2 + \frac{4}{c^4}\,\langle W(i), V(4)\rangle^2\,\|V(4)\|^2 = \|W(i)\|^2 = 1$$

so there is no need to adjust the lengths of the other axes. Call this
adjustment a time shear. Since we now have our inertial frame at $p_0$, we
can use 9.2 to flesh this out to an inertial frame there.

At another point p along the curve, proceed as follows. For V(4), again use
$dx^i/d\tau$ (evaluated at p). For the other axes, start by taking W(1), W(2), and
W(3) to be the parallel translates of the V(i) along C. These may not be
orthogonal to V(4), although they are orthogonal to each other (since
parallel translation preserves orthogonality). To fix this, use the same time
shearing trick as above to obtain the V(i) at p. Note that the spatial
coordinates have not changed in passing from W(i) to V(i); all that is
changed are the time-coordinates. Now again use 9.2 to flesh this out to an
inertial frame.

By construction, the frame varies smoothly with the point on the curve, so
we have a smooth set of coordinates.

Proposition 11.4 (Proper Time is Time in a MCRF)

In a MCRF $\bar{x}$, the $\bar{x}^4$-coordinate (time) is proper time $\tau$.

"Proof"
We are assuming starting with some coordinate system $x$, and then
switching to the MCRF $\bar{x}$. Notice that, at the point m,

$$\begin{aligned}
\frac{d\bar{x}^4}{d\tau} &= \frac{\partial \bar{x}^4}{\partial x^i}\,\frac{dx^i}{d\tau} \\
&= \frac{\partial \bar{x}^4}{\partial x^i}\,V(4)^i && \text{(by definition of V(4))} \\
&= \bar{V}(4)^4 = 1 && \text{(since V(4) has coordinates (0,0,0,1) in the barred system)}
\end{aligned}$$

In other words, the time coordinate $\bar{x}^4$ is moving at a rate of one unit per
unit of proper time $\tau$. Therefore, they must agree.

A particular (and interesting) case of this is the following, for special
relativity.

Proposition 11.5 (In SR, Proper Time = Time in the Moving Frame)

In SR, the proper time of a particle moving with a constant velocity v is
the t-coordinate of the Lorentz frame moving with the particle.

Proof

$$\tau = \frac{s}{c} = \frac{1}{c}\int \left(-g_{ij}\,\frac{dx^i}{dt}\frac{dx^j}{dt}\right)^{1/2} dt.$$

The curve C has parametrization (vt, 0, 0, t) (we are assuming here
movement in the $x^1$-direction), and $g_{**}$ = diag(1, 1, 1, $-c^2$). Therefore, the
above integral boils down to

$$\begin{aligned}
\tau &= \frac{1}{c}\int \left(-(v^2 - c^2)\right)^{1/2} dt \\
&= \frac{1}{c}\int c\,\left(1 - v^2/c^2\right)^{1/2} dt \\
&= t\,(1 - v^2/c^2)^{1/2}.
\end{aligned}$$

But, by the (inverse) Lorentz transformations:

$$t = \frac{\bar{t} + v\bar{x}/c^2}{(1 - v^2/c^2)^{1/2}} = \frac{\bar{t}}{(1 - v^2/c^2)^{1/2}} \qquad \text{since } \bar{x} = 0 \text{ for the particle.}$$

Thus,

$$\bar{t} = t\,(1 - v^2/c^2)^{1/2} = \tau,$$

as required.
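The integral in this proof is easy to evaluate numerically. The sketch below (an illustration, with hypothetical names) applies the midpoint rule to $\tau = \frac{1}{c}\int(-g_{ij}\dot{x}^i\dot{x}^j)^{1/2}\,dt$ for a world line moving along $x^1$ with coordinate velocity `velocity(t)`, assuming $g_{**}$ = diag(1, 1, 1, $-c^2$). For constant velocity it reproduces $t\,(1 - v^2/c^2)^{1/2}$.

```python
import math

C = 299_792_458.0   # speed of light in m/s

def proper_time(velocity, t_end, steps=1000):
    """Midpoint-rule value of (1/c) * integral_0^t_end sqrt(-g_ij xdot^i xdot^j) dt.

    The world line is (x^1(t), 0, 0, t) with dx^1/dt = velocity(t), and the
    metric is assumed to be diag(1, 1, 1, -c^2).
    """
    dt = t_end / steps
    total = 0.0
    for n in range(steps):
        v = velocity((n + 0.5) * dt)
        total += math.sqrt(-(v * v - C * C)) * dt
    return total / C

v = 0.6 * C
tau = proper_time(lambda t: v, 10.0)   # 10 coordinate seconds at 0.6 c
```

Here `tau` comes out as 8 seconds, which is $10\,(1 - 0.36)^{1/2}$. Passing a non-constant `velocity` gives the proper time along an accelerated world line in the same frame.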

Definition 11.6 Let C be the world line of a particle in a Minkowskian
manifold M. Its four velocity is defined by

$$u^i = \frac{dx^i}{d\tau}.$$

Note By the proof of Proposition 11.3, we have

$$\langle u, u\rangle = \left\|\frac{dx^i}{d\tau}\right\|^2 = -c^2.$$

In other words, four velocity is timelike and of constant magnitude.

Example 11.7 Four Velocity in SR


Let us calculate the four-velocity of a particle moving with uniform velocity
v with respect to some (Lorentz) coordinate system in Minkowski space M =
E4. Thus, xi are the coordinates of the particle at proper time . We need to
calculate the partial derivatives dxI/d , and we use the chain rule:

i
dxi = dx dx4
d dx4 d
4
= vi dx for i = 1, 2, 3
d

since x4 is time in the unbarred system. Thus, we need to know dx4/d . (In
the barred system, this is just 1, but this is the unbarred system...) Since 4
= , we use the (inverse) Lorentz transformation:

4
4= + v 1/c2
x ,
(1 - v2/c2)1/2

assuming for the moment that v = (v, 0, 0). However, in the frame of the
particle, 1 = 0, and 4 = , giving

x4 = (1 - v2/c2)1/2 ,

and hence

dx4 = 1
d (1 - v2/c2)1/2

Now, using the more general boost transformations, we can show that this is true regardless of the direction of v if we replace $v^2$ in the formula by $(v^1)^2 + (v^2)^2 + (v^3)^2$ (the square magnitude of v). Thus we find

$$u^i = \frac{dx^i}{d\tau} = v^i\,\frac{dx^4}{d\tau} = \frac{v^i}{(1 - v^2/c^2)^{1/2}} \quad (i = 1, 2, 3)$$

and

$$u^4 = \frac{dx^4}{d\tau} = \frac{1}{(1 - v^2/c^2)^{1/2}}.$$

Hence the coordinates of four velocity in the unbarred system are given as follows.

Four Velocity in SR

$$u^* = (v^1, v^2, v^3, 1)/(1-v^2/c^2)^{1/2}$$

We can now calculate $\langle u, u\rangle$ directly as

$$\langle u, u\rangle = u^*\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&-c^2\end{pmatrix}u^T = \frac{v^2 - c^2}{1 - v^2/c^2} = -c^2.$$
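The identity $\langle u, u\rangle = -c^2$ can also be checked numerically. The sketch below builds the four velocity from a 3-velocity and contracts it with diag(1, 1, 1, $-c^2$); the helper names and the sample velocity are hypothetical choices made only for illustration.

```python
import math

def four_velocity(v, c=1.0):
    """Return (v^1, v^2, v^3, 1) / (1 - v^2/c^2)^(1/2) for a 3-velocity v."""
    v_sq = sum(vi * vi for vi in v)
    gamma = 1.0 / math.sqrt(1.0 - v_sq / (c * c))
    return [gamma * vi for vi in v] + [gamma]

def minkowski_norm_sq(u, c=1.0):
    """Compute <u, u> using the metric diag(1, 1, 1, -c^2)."""
    g = (1.0, 1.0, 1.0, -c * c)
    return sum(gi * ui * ui for gi, ui in zip(g, u))

c = 3.0                                   # arbitrary units, chosen so that |v| < c
u = four_velocity([1.0, 2.0, 0.5], c)     # hypothetical sample 3-velocity
norm_sq = minkowski_norm_sq(u, c)
print(norm_sq, -c * c)
```

The printed values should coincide up to rounding for any 3-velocity with |v| < c.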

Special Relativistic Dynamics

If a contravariant "force" field F (such as an electromagnetic force) acts on a particle, then its motion behaves in accordance with

$$m_0\frac{du}{d\tau} = F,$$

where $m_0$ is a scalar, the rest mass, corresponding to the mass of the particle as measured in its own frame.

We use the four velocity to get four momentum, defined by

$$p^i = m_0u^i.$$

Its energy is given by the fourth coordinate, and is defined as

$$E = c^2p^4 = \frac{m_0c^2}{(1-v^2/c^2)^{1/2}}.$$

Note that, for small v,

$$E = m_0c^2(1-v^2/c^2)^{-1/2} \approx m_0c^2 + \tfrac{1}{2}m_0v^2.$$
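To see how good the small-v approximation is, one can compare the exact energy with rest energy plus Newtonian kinetic energy. This is only a sketch: the function names and the sample speed (about a thousandth of the speed of light) are assumptions made for illustration.

```python
import math

def energy(m0, v, c):
    """Relativistic energy E = m0 c^2 / (1 - v^2/c^2)^(1/2)."""
    return m0 * c * c / math.sqrt(1.0 - v * v / (c * c))

def low_speed_approximation(m0, v, c):
    """Rest energy plus Newtonian kinetic energy, m0 c^2 + (1/2) m0 v^2."""
    return m0 * c * c + 0.5 * m0 * v * v

c = 299_792_458.0      # speed of light in m/s
m0 = 1.0               # hypothetical rest mass in kg
v = 3.0e5              # hypothetical speed, roughly 0.001 c

exact = energy(m0, v, c)
approx = low_speed_approximation(m0, v, c)
relative_error = abs(exact - approx) / exact
print(relative_error)
```

The neglected terms are of order $(v/c)^4$, so the relative error at this speed is far below anything measurable in everyday mechanics.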

In the eyes of a comoving frame, v = 0, so that

$$E = m_0c^2.$$

This is called the rest energy of the particle, since it is the energy in a
comoving frame.

Definitions 11.7 If M is any locally Minkowskian 4-manifold and C is a timelike or spacelike path (thought of as the world line of a particle), we can define its four momentum as its four velocity times its rest mass, where the rest mass is the mass as measured in any MCRF.

Exercise Set 10

1. What are the coordinates of four velocity in a comoving frame? Use the result to check that $\langle u, u\rangle = -c^2$ directly in an MCRF.

2. What can you say about $\langle p, p\rangle$, where p is the 4-momentum?

3. Is energy a scalar? Explain.

4. Look up and obtain the classical Lorentz transformations for velocity. (We
have kind of done it already.)

5. Look up and obtain the classical Lorentz transformations for mass.

Last Updated: January, 2002
Copyright © Stefan Waner

The Stress Tensor and the Relativistic Stress-En... http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 12: The Stress Tensor and the Relativistic Stress-Energy Tensor

12. The Stress Tensor and the Relativistic Stress-Energy Tensor

Classical Stress Tensor

The classical stress tensor measures the internal forces that parts of a medium (such as a fluid or the interior of a star) exert on other parts (even though there may be zero net force at each point, as in the case of a fluid at equilibrium).

This is how you measure it: if $\Delta S$ is an element of surface in the medium, then the material on each side of this interface is exerting a force on the other side. (In equilibrium, these forces will cancel out.) To measure it physically, pretend that all the material on one side is suddenly removed. Then the force that would be experienced is the force we are talking about. (It can go in either direction: for a liquid under pressure, it will push out, whereas for a stretched medium, it will tend to contract in.)

To make this more precise, we need to distinguish one side of the surface $\Delta S$ from the other, and for this we replace $\Delta S$ by a vector $\Delta\mathbf{S} = \mathbf{n}\,\Delta S$ whose magnitude is $\Delta S$ and whose direction is normal to the surface element (n is a unit normal). Then associated to that surface element there is a vector $\Delta\mathbf{F}$ representing the force exerted by the fluid behind the surface (on the side opposite the direction of the vector $\Delta\mathbf{S}$) on the fluid on the other side of the interface.

Since this force is clearly affected by the magnitude $\Delta S$, we use instead the force per unit area (the pressure) given by

$$T(\mathbf{n}) = \lim_{\Delta S\to 0}\frac{\Delta\mathbf{F}}{\Delta S}.$$


Note that T is a function only of the direction n (as well as being a function
of the point in space at which we are doing the slicing of the medium);
specifying n at some point in turn specifies an interface (the surface normal
to n at that point) and hence we can define T.

One last adjustment: why insist that n be a unit vector? If we replace n by an arbitrary vector v, still normal to $\Delta S$, we can still define T(v) by multiplying T(v/|v|) by |v|. Thus, for general v normal to $\Delta S$,

$$T(\mathbf{v}) = \lim_{\Delta S\to 0}\frac{\Delta\mathbf{F}}{\Delta S}\cdot|\mathbf{v}|.$$

We now find that T has this rather interesting algebraic property: T operates on vector fields to give new vector fields. If it were a linear operator, it would therefore be a tensor, and we could define its coordinates by

$$T^{ab} = T(\mathbf{e}_b)^a,$$

the a-component of stress on the b-interface. In fact, we have

Proposition 12.1 (Linearity and Symmetry)

T is a symmetric tensor, called the stress tensor.

Sketch of Proof To show it's a tensor, we need to establish linearity. By definition, we already have

$$T(\lambda\mathbf{v}) = \lambda T(\mathbf{v})$$

for any constant $\lambda$. Thus, all we need to show is that if a, b and c are three vectors whose sum is zero, then

$$T(\mathbf{a}) + T(\mathbf{b}) + T(\mathbf{c}) = 0.$$

Further, we can assume that the first two vectors are at right angles. Why?
Since all three vectors are coplanar, we can think of the three forces above
as stresses on the faces of a prism as shown in the figure. (Note that the
vector c in the figure is meant to be at right angles to the bottom face,
pointing downwards, and coplanar with a and b.)


If we take a prism that is much longer than it is thick, we can ignore the forces on the ends. It now follows from Pythagoras' theorem that the areas in this prism are proportional to the three vectors. Therefore, multiplying through by a constant reduces the equation to one about actual forces on the faces of the prism, with T(a) + T(b) + T(c) the resultant force (since the lengths of the vectors a, b and c are equal to the respective areas). If this force were not zero, then there would be a resultant force F on the prism, and hence an acceleration of its material. The trouble is, if we shrink the prism by scaling all linear dimensions down by a factor $\alpha$, then the areas scale down by a factor of $\alpha^2$, whereas the volume (and hence mass) scales down by a factor $\alpha^3$. In other words,

$$T(\alpha^2\mathbf{a}) + T(\alpha^2\mathbf{b}) + T(\alpha^2\mathbf{c}) = \alpha^2\mathbf{F}$$

is the resultant force on the scaled version of the prism, whereas its mass is proportional to $\alpha^3$. Thus its acceleration is proportional to $1/\alpha$ (using Newton's law). This means that, as $\alpha$ becomes small (and hence the prism shrinks) the acceleration becomes infinite -- hardly a likely proposition.
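The scaling argument can be made concrete with a few numbers. In the sketch below, a hypothetical unbalanced force and mass (both set to 1 in arbitrary units) are rescaled as in the argument above, and the resulting acceleration is seen to grow like $1/\alpha$.

```python
def acceleration(alpha, force=1.0, mass=1.0):
    """Acceleration of the prism after scaling linear dimensions by alpha.

    The resultant force scales with area (alpha^2), the mass with volume
    (alpha^3), so Newton's law gives an acceleration proportional to 1/alpha.
    """
    scaled_force = alpha ** 2 * force
    scaled_mass = alpha ** 3 * mass
    return scaled_force / scaled_mass

scales = (1.0, 0.1, 0.01, 0.001)
accelerations = [acceleration(a) for a in scales]
for a, acc in zip(scales, accelerations):
    print(a, acc)
```

Unless the unscaled resultant force is exactly zero, the acceleration is unbounded as the prism shrinks, which is the contradiction the proof relies on.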

The argument that the resulting tensor is symmetric follows by a similar argument applied to a square prism; the asymmetry results in a rotational force on the prism, and its angular acceleration would become infinite if this were not zero.

The Relativistic Stress-Energy Tensor

Now we would like to generalize the stress tensor to 4-dimensional space. First we set the scenario for our discussion:

We now work in a 4-manifold M whose metric has signature (1, 1, 1, -1).

We have already called such a manifold a locally Minkowskian 4-manifold. (All this means is that we are using different units for time in our inertial frames and MCRFs.)


Example 12.2
Let M be Minkowski space, where one unit of time is defined to be the time it takes light to travel one spatial unit. (For example, if units are measured in meters, then a unit of time would be approximately 0.000 000 003 3 seconds.) In these units, c = 1, so the metric does have this form.

The use of MCRFs allows us to define new physical scalar fields as follows: If
we are, say, in the interior of a star (which we think of as a continuous fluid)
we can measure the pressure at a point by hitching a ride on a small solid
object moving with the fluid. Since this should be a smooth function, we
consider the pressure, so measured, to be a scalar field. Mathematically, we
are defining the field by specifying its value on MCRFs. Note that there is a
question here about ambiguity: MCRFs are not unique except for the time
direction: once we have specified the time direction, the other axes might
be "spinning" about the path; it is hard to prescribe directions for the
remaining axes in a convoluted twisting path. However, since we are using a
small solid object, we can choose directions for the other axes at proper
time 0, and then the "solid-ness" hypothesis guarantees (by definition of
solid-ness!) that the other axes remain at right angles; that is, that we
continue to have an MCRF after applying a time shear as in Lecture 11.

Now, we would like to measure a 4-space analogue of the force exerted across a plane, except this time, the only way we can divide 4-space is by using a hyperplane; the span of three vectors in some frame of reference. Thus, we seek a 4-dimensional analogue of the quantity $\mathbf{n}\,\Delta S$. By coincidence, we just happen to have such a gizmo lying around: the Levi-Civita tensor. Namely, if a, b, and c are any three vectors in 4-space, then we can define an analogue of $\mathbf{n}\,\Delta S$ to be $\epsilon_{ijkl}a^ib^kc^l$, where $\epsilon$ is the Levi-Civita tensor. (See the exercises.)

Next, we want to measure stress by generalizing the classical formula

$$\text{stress} = T(\mathbf{n}) = \frac{\Delta\mathbf{F}}{\Delta S}$$

for such a surface element. Hopefully, the space-coordinates of the stress will continue to measure force. The first step is to get rid of all mention of unit vectors -- they just don't arise in Minkowski space (recall that vectors can be time-like, space-like, or null...). We first rewrite the formula as

$$T(\mathbf{n}\,\Delta S) = \Delta\mathbf{F},$$

the total force across the area element $\Delta S$. Now multiply both sides by a time coordinate increment:

$$T(\mathbf{n}\,\Delta S\,\Delta x^4) = \Delta\mathbf{F}\,\Delta x^4 = \Delta\mathbf{p},$$

where p is the 3-momentum (classically, force is the time rate of change of momentum). This is fine for three of the dimensions. In other words,

$$T(\mathbf{n}\,\Delta V) = \Delta\mathbf{p}, \quad\text{or}\quad T(\mathbf{n}) = \frac{\Delta\mathbf{p}}{\Delta V} \qquad\ldots\text{(I)}$$

where V is volume in Euclidean 4-space, and where we take the limit as $\Delta V \to 0$.

But now, generalizing to 4-space is forced on us: first replace momentum by the 4-momentum P, and then, noting that $\mathbf{n}\,\Delta S\,\Delta x^4$ is a 3-volume element in 4-space (because it is a product of three coordinate increments), replace it by the correct analogue for Minkowski space,

$$(\Delta V)_i = \epsilon_{ijkl}\,\Delta x^j\,\Delta y^k\,\Delta z^l,$$

getting

$$T(\Delta V) = \Delta P,$$

where P is the 4-momentum exerted on the positive side of the 3-volume $\Delta V$ ("positive" being given by the direction of $\Delta V$) by the opposite side. But there is a catch: the quantity $\Delta V$ has to be really small (in terms of coordinates) for this formula to be accurate. Thus, we rewrite the above formula in differential form:

$$T(dV) = T(\mathbf{n}\,dV) = dP.$$

This describes T as a function which converts the covariant vector dV into a contravariant field (P), and thus suggests a type (2, 0) tensor. To get an honest tensor, we must define T on arbitrary covariant vectors (not just those of the form $\Delta V$). However, every covariant vector $Y_*$ defines a 3-volume as follows.

Recall that a one-form at a point p is a linear real-valued function on the tangent space $T_p$ at that point. If it is non-zero, then its kernel, which consists of all vectors which map to zero, is a three-dimensional subspace of $T_p$. This describes (locally) a (hyper-)surface. (In the special case that the one-form is the gradient of a scalar field $\phi$, that surface coincides with the level surface of $\phi$ passing through p.) If we choose a basis {v, w, u} for this subspace of $T_p$, then we can recover the one-form at p (up to constant multiples) by forming $\epsilon_{ijkl}v^jw^ku^l$. (Indeed, all you have to check is that the covariant vector $\epsilon_{ijkl}v^jw^ku^l$ has u, w, and v in its kernel. But that is immediate from the anti-symmetric properties of the Levi-Civita tensor.) This gives us the following formal definition of the tensor T at a point:

Definition 12.3 (The Stress Energy Tensor) For an arbitrary covariant vector Y at p, we choose a basis {v, w, u} for its kernel, scaled so that $Y_i = \epsilon_{ijkl}v^jw^ku^l$, and define T(Y) as follows: Form the parallelepiped $V = \{r_1v + r_2w + r_3u \mid 0 \le r_i \le 1\}$ in the tangent space, and compute the total 4-momentum P exerted on the positive side of the volume element V by the negative side. Call this quantity P(1). More generally, define

$P(\varepsilon)$ = total 4-momentum P exerted on the positive side of the (scaled) volume element $\varepsilon^3V$ by the negative side.

Then define

$$T(Y) = \lim_{\varepsilon\to 0}\frac{P(\varepsilon)}{\varepsilon^3}.$$

Note Of course, physical reality intervenes here: how do you measure momentum across volume elements in the tangent space? Well, you do all your measurements in a locally inertial frame. Proposition 9.6 then guarantees that you get the same physical measurements near the origin regardless of the inertial frame you use (we are, after all, letting $\varepsilon$ approach zero).
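The kernel property used in Definition 12.3 is easy to verify numerically. The following sketch assumes an orthonormal frame, so that the Levi-Civita tensor reduces to the permutation symbol; the function names and the three sample vectors are hypothetical. It forms $Y_i = \epsilon_{ijkl}v^jw^ku^l$ and checks that Y annihilates v, w and u.

```python
from itertools import permutations

def permutation_sign(p):
    """Sign of a permutation of (0, 1, 2, 3), found by sorting with swaps."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # each swap flips the sign
            sign = -sign
    return sign

def hyperplane_covector(v, w, u):
    """Y_i = eps_{ijkl} v^j w^k u^l, with eps the permutation symbol."""
    y = [0.0] * 4
    for p in permutations(range(4)):
        i, j, k, l = p
        y[i] += permutation_sign(p) * v[j] * w[k] * u[l]
    return y

def pairing(y, x):
    """Evaluate the covariant vector y on the contravariant vector x."""
    return sum(yi * xi for yi, xi in zip(y, x))

# Three linearly independent sample vectors (illustrative values only).
v = [1.0, 0.0, 2.0, 0.0]
w = [0.0, 1.0, 1.0, 3.0]
u = [2.0, -1.0, 0.0, 1.0]
Y = hyperplane_covector(v, w, u)
print(Y, [pairing(Y, x) for x in (v, w, u)])
```

All three pairings vanish because contracting the antisymmetric symbol with a repeated vector gives zero, while Y itself is non-zero since the vectors are independent.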

To evaluate its coordinates on an orthonormal (Lorentz) frame, we define

$$T^{ab} = T(e^b)^a,$$

so that we can take u, w, and v to be the other three basis vectors. This
permits us to use the simpler formula (I) to obtain the coordinates. Of
interest to us is a more usable form -- in terms of quantities that can be
measured. For this, we need to move into an MCRF, and look at an example.

Note It can be shown, by an argument similar to the one we used at the beginning of this section, that T is a symmetric tensor.

Definition 12.4 Classically, a fluid has no viscosity if its stress tensor is diagonal in an MCRF (viscosity is a force parallel to the interfaces).

Thus, for a viscosity-free fluid, the top $3\times 3$ portion of the matrix should be diagonal in all MCRFs (independent of spatial axes). This forces it to be a constant multiple of the identity (since every vector being an eigenvector implies that all the eigenvalues are equal). This single eigenvalue measures the force at right angles to the interface, and is called the pressure, p.


Question Why the pressure?

Answer Let us calculate $T^{11}$ (in an MCRF). It is given by

$$T^{11} = T(e^1)^1 = \frac{\Delta P^1}{\Delta V},$$

where the 4-momentum is obtained physically by suddenly removing all material on the positive side of the $x^1$-axis, and then measuring the 1-component of the 4-momentum at the origin. Since we are in an MCRF, we can use the SR 4-velocity formula:

$$P = m_0(v^1, v^2, v^3, 1)/(1-v^2/c^2)^{1/2}.$$

At the instant the material is removed, the velocity is zero in the MCRF, so

$$P(t=0) = m_0(0, 0, 0, 1).$$

After an interval $\Delta t$ in this frame, the 4-momentum changes to

$$P(t=\Delta t) = m_0(\Delta v, 0, 0, 1)/(1-(\Delta v)^2/c^2)^{1/2},$$

since there is no viscosity (we must take $v^2 = v^3 = 0$ or else we will get off-diagonal spatial terms in the stress tensor). Thus,

$$\Delta P = m_0(\Delta v, 0, 0, 1)/(1-(\Delta v)^2/c^2)^{1/2}.$$

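This gives

$$(\Delta P)^1 = \frac{m_0\,\Delta v}{(1-(\Delta v)^2/c^2)^{1/2}} = m\,\Delta v = \Delta(mv) = \text{change of measured momentum},$$

where m is the apparent mass. Thus,

$$\frac{\Delta P^1}{\Delta V} = \frac{\Delta(mv)}{\Delta y\,\Delta z\,\Delta t} = \frac{\Delta F}{\Delta y\,\Delta z} \quad\text{(force = rate of change of momentum)}$$

and we interpret force per unit area as pressure.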

What about the fourth coordinate? The 4th coordinate of the 4-momentum is the energy. A component of the form $T^{41}$ measures energy-flow per unit time, per unit area, in the direction of the $x^1$-axis. In a perfect fluid, we insist that, in addition to zero viscosity, we also have zero heat conduction. This forces all these off-diagonal terms to be zero as well. Finally, $T^{44}$ measures energy per unit volume in the direction of the time-axis. This is the total energy density, $\rho$. Think of it as the "energy being transferred from the past to the future."

This gives the stress-energy tensor in a comoving frame of the particle as

$$T = \begin{pmatrix}p&0&0&0\\0&p&0&0\\0&0&p&0\\0&0&0&\rho\end{pmatrix}.$$

What about other frames? To do this, all we need do is express T as a tensor whose coordinates in the comoving frame happen to be as above. To help us, we recall from above that the coordinates of the 4-velocity in the particle's frame are

u = [0 0 0 1] (just set v = 0 in the 4-velocity).

(It follows that

$$u^au^b = \begin{pmatrix}0&0&0&0\\0&0&0&0\\0&0&0&0\\0&0&0&1\end{pmatrix}$$

in this frame.) We can use that, together with the metric tensor,

$$g = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&-1\end{pmatrix},$$

to express T as

$$T^{ab} = (\rho + p)u^au^b + pg^{ab}.$$

Stress-Energy Tensor for Perfect Fluid

The stress-energy tensor of a perfect fluid (no viscosity and no heat conduction) is given at a point $m \in M$ by

$$T^{ab} = (\rho + p)u^au^b + pg^{ab},$$

where:

$\rho$ is the mass energy density of the fluid
p is the pressure
$u^i$ is its 4-velocity.

Note that the scalars in this definition are their physical magnitudes as measured in an MCRF.
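A quick way to build confidence in the perfect-fluid formula is to evaluate it in the comoving frame and confirm that it returns diag(p, p, p, ρ). The sketch below does this in units with c = 1; the function name and the sample values of ρ and p are illustrative assumptions. It also evaluates $T^{44}$ in a frame where the fluid moves with speed 0.6 along the $x^1$-axis.

```python
import math

def perfect_fluid_tensor(rho, p, u, g_inv):
    """T^{ab} = (rho + p) u^a u^b + p g^{ab}, returned as a 4x4 nested list."""
    return [[(rho + p) * u[a] * u[b] + p * g_inv[a][b] for b in range(4)]
            for a in range(4)]

# Inverse Minkowski metric in units with c = 1 (equal to the metric itself).
g_inv = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, -1.0]]

rho, p = 2.5, 0.5                      # hypothetical density and pressure

# Comoving frame: u = [0, 0, 0, 1].
T_rest = perfect_fluid_tensor(rho, p, [0.0, 0.0, 0.0, 1.0], g_inv)

# Frame in which the fluid moves with speed v along x^1.
v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v * v)
T_moving = perfect_fluid_tensor(rho, p, [gamma * v, 0.0, 0.0, gamma], g_inv)

print(T_rest)
print(T_moving[3][3])
```

In the moving frame the energy density component works out to $\gamma^2(\rho + pv^2)$, so pressure contributes to the measured energy density once the fluid is in motion.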

Conservation Laws

Let us now go back to the general formulation of T (not necessarily in a perfect fluid), work in an MCRF, and calculate some covariant derivatives of T. Consider a little cube with each side of length $\Delta l$, oriented along the axes (in the MCRF). We saw above that $T^{41}$ measures energy-flow per unit time, per unit area, in the direction of the $x^1$-axis. Thus, the quantity

$$T^{41}{}_{,1}\,\Delta l$$

is the approximate increase of that quantity (per unit area per unit time). Thus, the increase of outflowing energy per unit time in the little cube is

$$T^{41}{}_{,1}(\Delta l)^3$$

due to energy flow in the $x^1$-direction. Adding the corresponding quantities for the other directions gives

$$-\frac{\partial E}{\partial t} = T^{41}{}_{,1}(\Delta l)^3 + T^{42}{}_{,2}(\Delta l)^3 + T^{43}{}_{,3}(\Delta l)^3,$$

which is an expression of the law of conservation of energy. Since E is given by $T^{44}(\Delta l)^3$, and $t = x^4$, we therefore get

$$-T^{44}{}_{,4}(\Delta l)^3 = (T^{41}{}_{,1} + T^{42}{}_{,2} + T^{43}{}_{,3})(\Delta l)^3,$$

giving

$$T^{41}{}_{,1} + T^{42}{}_{,2} + T^{43}{}_{,3} + T^{44}{}_{,4} = 0.$$


A similar argument using each of the three components of momentum instead of energy now gives us the law of conservation of momentum (3 coordinates):

$$T^{a1}{}_{,1} + T^{a2}{}_{,2} + T^{a3}{}_{,3} + T^{a4}{}_{,4} = 0$$

for a = 1, 2, 3. Combining all of these and reverting to an arbitrary frame now gives us:

Einstein's Conservation Law

$$\nabla\cdot T = 0,$$

where $\nabla\cdot T$ is the contravariant vector given by $(\nabla\cdot T)^j = T^{jk}{}_{|k}$.

This law combines both energy conservation and momentum conservation into a single elegant law.
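The energy equation $T^{41}{}_{,1} + T^{44}{}_{,4} = 0$ can be illustrated with a toy example in flat space. The sketch below assumes a one-dimensional pressure-free stream whose density profile is simply carried along at constant speed V (in units with c = 1), so that $T^{44} = \rho$ and $T^{41} = \rho V$ up to a common constant factor. The Gaussian profile, the names, and the sample point are all hypothetical; derivatives are estimated by central differences.

```python
import math

V = 0.3   # hypothetical uniform flow speed along x^1, in units with c = 1

def density(x, t):
    """A density profile transported rigidly at speed V (illustrative choice)."""
    return math.exp(-(x - V * t) ** 2)

def T44(x, t):
    """Energy density component."""
    return density(x, t)

def T41(x, t):
    """Energy flux in the x^1-direction: energy density times flow speed."""
    return density(x, t) * V

def energy_divergence(x, t, h=1e-5):
    """Central-difference estimate of T^{41}_{,1} + T^{44}_{,4}."""
    dT41_dx = (T41(x + h, t) - T41(x - h, t)) / (2 * h)
    dT44_dt = (T44(x, t + h) - T44(x, t - h)) / (2 * h)
    return dT41_dx + dT44_dt

residual = energy_divergence(0.7, 1.2)
print(residual)
```

The residual is zero up to discretization error: whatever energy leaves a small cell through its faces is exactly the energy the cell loses per unit time.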

Exercise Set 11

1. If a, b, and c are any three vector fields in a locally Minkowskian 4-manifold, show that the field $\epsilon_{ijkl}a^ib^kc^l$ is orthogonal to a, b, and c. ($\epsilon$ is the Levi-Civita tensor.)

Three Basic Premises of General Relativity http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 13: Three Basic Premises of General Relativity

13. Three Basic Premises of General Relativity

Spacetime

General relativity postulates that spacetime (the set of all events) is a smooth 4-dimensional Riemannian manifold M, where points are called events, with the properties A1-A3 listed below.

A1. Locally, M is Minkowski spacetime (so that special relativity holds locally).

This means that, if we diagonalize the scalar product on the tangent space
at any point, we obtain the matrix

$$\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&-1\end{pmatrix}.$$

The metric is measurable by clocks and rods.

Before stating the next axiom, we recall some definitions.

Definitions 13.1 Let M satisfy axiom A1. If $V^i$ is a contravariant vector at a point in M, define

$$\|V^i\|^2 = \langle V^i, V^i\rangle = V^iV^jg_{ij}.$$

(Note that we are not defining $\|V^i\|$ here.) We say the vector $V^i$ is

timelike if $\|V^i\|^2 < 0$,
lightlike if $\|V^i\|^2 = 0$, and
spacelike if $\|V^i\|^2 > 0$.


Examples 13.2
(a) If a particle moves with constant velocity v in some Lorentz frame, then at time $t = x^4$ its position is

$$\mathbf{x} = \mathbf{a} + \mathbf{v}x^4.$$

Using the local coordinate $x^4$ as a parameter, we obtain a path in M given by

$$x^i(x^4) = \begin{cases}a^i + v^ix^4 & \text{if } i = 1, 2, 3\\ x^4 & \text{if } i = 4\end{cases}$$

so that the tangent vector (velocity) $dx^i/dx^4$ has coordinates $(v^1, v^2, v^3, 1)$ and hence square magnitude

$$\|(v^1, v^2, v^3, 1)\|^2 = |\mathbf{v}|^2 - c^2.$$

It is timelike at sub-light speeds, lightlike at light speed, and spacelike at faster-than-light speeds.

(b) If u is the proper velocity of some particle in locally Minkowskian spacetime, then we saw (normal condition in Section 10) that $\langle u, u\rangle = -c^2 = -1$ in our units.
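The classification in Definitions 13.1 translates directly into a few lines of code. The sketch below evaluates $V^iV^jg_{ij}$ for the tangent vectors of Example (a) in units with c = 1; the function names and the tolerance used to detect the lightlike case are assumptions made for illustration.

```python
def norm_squared(V, g):
    """||V||^2 = V^i V^j g_ij for a 4-vector V and a 4x4 metric g."""
    return sum(g[i][j] * V[i] * V[j] for i in range(4) for j in range(4))

def classify(V, g, tol=1e-12):
    """Return 'timelike', 'lightlike' or 'spacelike' according to the sign of ||V||^2."""
    n = norm_squared(V, g)
    if n < -tol:
        return "timelike"
    if n > tol:
        return "spacelike"
    return "lightlike"

c = 1.0
g = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, -c * c]]

# Tangent vectors (v^1, v^2, v^3, 1) for three sample speeds along x^1.
for speed in (0.5, 1.0, 1.5):
    print(speed, classify([speed, 0.0, 0.0, 1.0], g))
```

The three sample speeds fall below, at, and above the speed of light, matching the three cases listed in Example (a).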

A2. Freely falling particles move on timelike geodesics of M.

Here, a freely falling particle is one that is affected only by gravity, and recall that a timelike geodesic is a geodesic $x^i(t)$ with the property that $\|dx^i/dt\|^2 < 0$ in any parameterization. (This property is independent of the parameterization -- see the exercise set.)

A3 (Strong Equivalence Principle) All physical laws that hold in flat Minkowski space (i.e. "special relativity"), are expressible in terms of vectors and tensors, and are meaningful in the manifold M, continue to hold in every frame (provided we replace derivatives by covariant derivatives).

Note Here are some consequences:

1. No physical laws can use the term "straight line," since that concept has no meaning in M; what's straight in the eyes of one chart is curved in the eyes of another. "Geodesic," on the other hand, does make sense, since it is independent of the choice of coordinates.
2. If we can write down physical laws, such as Maxwell's equations, that work in Minkowski space, then those same laws must work in curved space-time, without the addition of any new terms, such as the curvature tensor. In other words, there can be no form of Maxwell's equations for general curved spacetime that involves the curvature tensor.

An example of such a law is the conservation law, $\nabla\cdot T = 0$, which is thus postulated to hold in all frames.

A Consequence of the Axioms: Forces in Almost Flat Space

Suppose now that the metric in our frame is almost Lorentz, with a slight, not necessarily constant, deviation $\phi$ from the Minkowski metric, as follows.

$$g_{**} = \begin{pmatrix}1+2\phi&0&0&0\\0&1+2\phi&0&0\\0&0&1+2\phi&0\\0&0&0&-1+2\phi\end{pmatrix} \qquad\ldots\text{(I)}$$

or

$$ds^2 = (1+2\phi)(dx^2 + dy^2 + dz^2) - (1-2\phi)\,dt^2.$$

Notes

1. We are not in an inertial frame (modulo scaling) since $\phi$ need not be constant, but we are in a frame that is almost inertial.
2. The metric $g_{**}$ is obtained from the Minkowski g by adding a small multiple of the identity matrix. We shall see that such a metric does arise, to first order of approximation, as a consequence of Einstein's field equations.

Now, we would like to examine the behavior of a particle falling freely under the influence of this metric. What do the timelike geodesics look like? Let us assume we have a particle falling freely, with 4-momentum $P = m_0U$, where U is its 4-velocity, $dx^i/d\tau$. The parameterized path $x^i(\tau)$ must satisfy the geodesic equation, by A2. Definition 9.1 gives this as

$$\frac{d^2x^i}{d\tau^2} + \Gamma_r{}^i{}_s\,\frac{dx^r}{d\tau}\frac{dx^s}{d\tau} = 0.$$

Multiplying both sides by $m_0^2$ gives

$$m_0\frac{d^2(m_0x^i)}{d\tau^2} + \Gamma_r{}^i{}_s\,\frac{d(m_0x^r)}{d\tau}\frac{d(m_0x^s)}{d\tau} = 0,$$

or

$$m_0\frac{dP^i}{d\tau} + \Gamma_r{}^i{}_s\,P^rP^s = 0 \quad(\text{since } P^i = d(m_0x^i)/d\tau),$$

where, by the (ordinary) chain rule (note that we are not taking covariant derivatives here... that is, $dP^i/d\tau$ is not a vector -- see Lecture 7 on covariant differentiation),

$$\frac{dP^i}{d\tau} = P^i{}_{,k}\,\frac{dx^k}{d\tau},$$

so that

$$P^i{}_{,k}\,\frac{d(m_0x^k)}{d\tau} + \Gamma_r{}^i{}_s\,P^rP^s = 0,$$

or

$$P^i{}_{,k}P^k + \Gamma_r{}^i{}_s\,P^rP^s = 0 \qquad\ldots\text{(I)}$$

Now let us do some estimation for slowly-moving particles v << 1 (the speed
of light in our units) where we work in a frame where g has the given form.

Question Why don't we work in an inertial frame (the frame of the particle)?


First, since the frame is almost inertial (Lorentz), we are close to being in SR, so that

$$P^* \approx m_0U^* = m_0[v^1, v^2, v^3, 1]/(1-v^2/c^2)^{1/2} \approx [0, 0, 0, m_0] \quad(\text{since } v \ll 1)$$

(in other words, the frame is almost comoving). Thus (I) reduces to

$$P^i{}_{,4}\,m_0 + \Gamma_4{}^i{}_4\,m_0^2 = 0 \qquad\ldots\text{(II)}$$

Let us now look at the spatial coordinates, i = 1, 2, 3. By definition,

$$\Gamma_4{}^i{}_4 = \tfrac{1}{2}g^{ij}(g_{4j,4} + g_{j4,4} - g_{44,j}).$$

We now evaluate this at a specific coordinate i = 1, 2 or 3, where we use the definition of the metric g, recalling that $g^{**} = (g_{**})^{-1}$, and obtain

$$\tfrac{1}{2}(1+2\phi)^{-1}(0 + 0 - 2\phi_{,i}) \approx \tfrac{1}{2}(1-2\phi)(-2\phi_{,i}) \approx -\phi_{,i}.$$

(Here and in what follows, we are ignoring terms of order $O(\phi^2)$.)
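The approximation $\Gamma_4{}^i{}_4 \approx -\phi_{,i}$ can be checked numerically for a specific weak field. The sketch below assumes a static potential (so the time derivatives in the Christoffel formula vanish, as in the computation above) and uses central differences; the particular polynomial chosen for $\phi$, the sample point, and all function names are hypothetical.

```python
def phi(x, y, z):
    """A hypothetical weak, static potential chosen only for illustration."""
    return 1e-4 * (x * x + 2.0 * y - z)

def metric_diagonal(point):
    """Diagonal of g_** = diag(1+2phi, 1+2phi, 1+2phi, -1+2phi) at a spatial point."""
    p = phi(*point)
    return [1 + 2 * p, 1 + 2 * p, 1 + 2 * p, -1 + 2 * p]

def partial(f, point, j, h=1e-6):
    """Central-difference derivative of f with respect to spatial coordinate j."""
    fwd, bwd = list(point), list(point)
    fwd[j] += h
    bwd[j] -= h
    return (f(fwd) - f(bwd)) / (2 * h)

def gamma_4i4(point, i):
    """Gamma_4^i_4 = (1/2) g^{ii} (0 + 0 - g_{44,i}) for this static diagonal metric."""
    g_ii = metric_diagonal(point)[i]
    d_g44 = partial(lambda q: metric_diagonal(q)[3], point, i)
    return 0.5 * (1.0 / g_ii) * (0.0 + 0.0 - d_g44)

point = [1.0, 0.5, -0.3]
christoffels = [gamma_4i4(point, i) for i in range(3)]
minus_grad_phi = [-partial(lambda q: phi(*q), point, i) for i in range(3)]
print(christoffels)
print(minus_grad_phi)
```

The two lists differ only by terms of order $\phi\,\phi_{,i}$, which are exactly the second-order terms the derivation discards.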


Substituting this information in (II), and using the fact that

$$P^i{}_{,4} = \frac{\partial}{\partial x^4}(m_0v^i),$$

the time-rate of change of momentum, or the "force" as measured in that frame (see the exercise set), we can rewrite (II) as

$$m_0\frac{\partial}{\partial x^4}(m_0v^i) - m_0^2\,\phi_{,i} = 0,$$

or

$$\frac{\partial}{\partial x^4}(m_0v^i) - m_0\,\phi_{,i} = 0.$$

Thinking of $x^4$ as time t, and adopting vector notation for three-dimensional objects, we have, in old fashioned 3-vector notation,

$$\frac{\partial}{\partial t}(m_0\mathbf{v}) = m_0\nabla\phi,$$

that is

$$\mathbf{F} = m\nabla\phi.$$

This is the Newtonian force experienced by a particle in a force field potential of $\phi$. (See the exercise set.) In other words, we have found that we can duplicate, to a good approximation, the physical effects of Newton-like gravitational force from a simple distortion of the metric. Put differently -- and this is what Einstein realized -- gravity is nothing more than the geometry of spacetime; it is not a mysterious "force" at all.

Exercise Set 13
1. Show that, if $x^i = x^i(t)$ has the property that $\|dx^i/dt\|^2 < 0$ for some parameter t, then $\|dx^i/ds\|^2 < 0$ for any other parameter s such that $ds/dt \neq 0$ along the curve. In other words, the property of being timelike does not depend on the choice of parameterization.

2. What is wrong with the following (slickly worded) argument based on the
Strong Equivalence Principle?

I claim that there can be no physical law of the form A = R in curved spacetime, where A is some physical quantity and R is any
quantity derived from the curvature tensor. (Since we shall see
that Einstein's Field Equations have this form, it would follow from
this argument that he was wrong!) Indeed, if the postulated law A
= R was true, then in flat spacetime it would reduce to A = 0. But
then we have a physical law in SR, which must, by the Strong
Equivalence Principle, generalize to A = 0 in curved spacetime as
well. Hence the original law A = R was wrong.

3. Gravity and Antigravity Newton's law of gravity says that a particle of mass M exerts a force on another particle of mass m according to the formula

$$\mathbf{F} = -\frac{GMm\,\mathbf{r}}{r^3},$$

where $\mathbf{r} = \langle x, y, z\rangle$, $r = |\mathbf{r}|$, and G is a constant that depends on the units; if the masses M and m are given in kilograms, then $G \approx 6.67\times 10^{-11}$, and the resulting force is measured in newtons.* (Note that the magnitude of F is proportional to the inverse square of the distance r. The negative sign makes the force an attractive one.) Show by direct calculation that

$$\mathbf{F} = m\nabla\phi,$$

where

$$\phi = \frac{GM}{r}.$$

Hence write down a metric tensor that would result in an inverse square repelling force ("antigravity").

* A newton is the force that will cause a 1-kilogram mass to accelerate at 1 m/sec$^2$.
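Before attempting the direct calculation in Exercise 3, it can be reassuring to confirm the claim numerically. The sketch below compares $m\nabla\phi$, estimated by central differences with $\phi = GM/r$, against Newton's formula. The masses and position are hypothetical sample values (roughly an Earth-sized mass and a point a few thousand kilometers out, in SI units), and the function names are assumptions.

```python
import math

G = 6.67e-11   # gravitational constant in SI units, as quoted in the exercise

def potential(pos, M):
    """phi = G M / r at the position pos = [x, y, z]."""
    r = math.sqrt(sum(x * x for x in pos))
    return G * M / r

def force_from_potential(pos, M, m, h=1.0):
    """m * grad(phi), with each partial derivative taken by central differences."""
    force = []
    for j in range(3):
        fwd, bwd = list(pos), list(pos)
        fwd[j] += h
        bwd[j] -= h
        force.append(m * (potential(fwd, M) - potential(bwd, M)) / (2 * h))
    return force

def newton_force(pos, M, m):
    """F = -G M m r / r^3, Newton's inverse-square law in vector form."""
    r = math.sqrt(sum(x * x for x in pos))
    return [-G * M * m * x / r ** 3 for x in pos]

M, m = 5.97e24, 10.0                 # hypothetical masses in kg
pos = [7.0e6, 1.0e6, -2.0e6]         # hypothetical position in meters

f_potential = force_from_potential(pos, M, m)
f_newton = newton_force(pos, M, m)
print(f_potential)
print(f_newton)
```

The agreement component by component is a check, not a proof; the exercise still asks for the calculation by hand.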

The Einstein Field Equations and Derivation of ... http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 14: The Einstein Field Equations and Derivation of Newton's Law

14. The Einstein Field Equations and Derivation of Newton's Law

Einstein's field equations show how the sources of gravitational fields alter the metric. They can actually be motivated by Newton's law for gravitational potential $\phi$, with which we begin this discussion.

First, Newton's law postulates the existence of a certain scalar field $\phi$, called gravitational potential, which exerts a force on a unit mass given by

$$\mathbf{F} = \nabla\phi \quad\text{(classical gravitational field)}$$

Further, $\phi$ satisfies

$$\nabla^2\phi = \nabla\cdot(\nabla\phi) = 4\pi G\rho \qquad\ldots\text{(I)}$$

Div(gravitational field) = constant $\times$ mass density

where $\rho$ is the mass density and G is a constant. (The divergence theorem then gives the more familiar $F = \nabla\phi = GM/r^2$ for a spherical source of mass M -- see the exercise set.) In relativity, we need an invariant analogue of (I). First, we generalize the mass density to energy density (recall that energy and mass are interchangeable according to relativity), which in turn is only one of the components of the stress-energy tensor T. Thus we had better use the whole of T.

Question What about the mysterious gravitational potential $\phi$?

Answer That is a more subtle issue. Since the second principle of general relativity tells us that particles move along geodesics, we should interpret the gravitational potential as somehow affecting the geodesics. But the most fundamental determinant of geodesics is the underlying metric g. Thus we will generalize $\phi$ to g. In other words, Einstein replaced a mysterious "force" by a purely geometric quantity. Put another way, gravity is nothing but a distortion of the local geometry in space-time. But we are getting ahead of ourselves...

Finally, we generalize the (second order differential) operator $\nabla^2$ to some yet-to-be-determined second order differential operator $\Delta$. This allows us to generalize (I) to

$$\Delta(g^{**}) = kT^{**},$$

where k is some constant. In an MCRF, $\Delta(g)$ is some linear combination of $g^{ab}{}_{,ij}$, $g^{ab}{}_{,i}$ and $g^{ab}$, and must also be symmetric (since T is). Examples of such tensors are the Ricci tensor $R^{ab}$ and $g^{ab}R$, as well as $g^{ab}$. Let us take a linear combination as our candidate:

$$R^{ab} + \mu g^{ab}R + \Lambda g^{ab} = kT^{ab} \qquad\ldots\text{(II)}$$

We now apply the conservation laws $T^{ab}{}_{|b} = 0$, giving

$$(R^{ab} + \mu g^{ab}R)_{|b} = 0 \qquad\ldots\text{(a)}$$

since $g^{ab}{}_{|b} = 0$ already (Exercise Set 8 #4). But in Lecture 9 we also saw that

$$(R^{ab} - \tfrac{1}{2}g^{ab}R)_{|b} = 0, \qquad\ldots\text{(b)}$$

where the term in parentheses is the Einstein tensor $G^{ab}$. Calculating (a) - (b), using the product rule for differentiation and the fact that $g^{ab}{}_{|b} = 0$, we find

$$(\mu + \tfrac{1}{2})\,g^{ab}R_{|b} = 0,$$

giving (upon multiplication by $g_{**}$)

$$(\mu + \tfrac{1}{2})\,R_{|j} = 0,$$

which surely implies, in general, that $\mu$ must equal $-1/2$. Thus, (II) becomes

$$G^{ab} + \Lambda g^{ab} = kT^{ab}.$$

Finally, the requirement that these equations reduce to Newton's for v/c << 1 tells us that $k = 8\pi$ (discussed below) so that we have

Einstein's Field Equations

$$G^{ab} + \Lambda g^{ab} = 8\pi T^{ab}$$

The constant $\Lambda$ is called the cosmological constant. Einstein at first put $\Lambda = 0$, but later changed his mind when looking at the large scale behavior of the universe. Later still, he changed his mind again, and expressed regret that he had ever come up with it in the first place. The cosmological constant remains a problem child to this day. We shall set it equal to zero in what follows.

Solution of Einstein's Equations for Static Spherically Symmetric Stars

In the case of spherical symmetry, we use polar coordinates $(r, \theta, \phi, t)$ with origin
thought of as at the center of the star as our coordinate system (note it is singular
there, so in fact this coordinate system does not include the origin) and restrict
attention to g of the form

$$g_{**} = \begin{pmatrix}
g_{rr} & 0 & 0 & g_{rt} \\
0 & r^2 & 0 & 0 \\
0 & 0 & r^2\sin^2\theta & 0 \\
g_{rt} & 0 & 0 & -g_{tt}
\end{pmatrix},$$

or

$$ds^2 = 2g_{rt}\,dr\,dt + g_{rr}\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2 - g_{tt}\,dt^2,$$

where each of the coefficients $g_{rr}$, $g_{rt}$ and $g_{tt}$ is a function of r and t only. In
other words, at any fixed time t, the surfaces $\theta$ = const, $\phi$ = const and r = const
are all orthogonal. (This causes the zeros to be in the positions shown.)

Question Explain why the non-zero terms have the above form.

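Answer For motivation, let us first look at the standard metric on a 2-sphere of
radius r: (see Example 5.2(d))

$$g_{**} = \begin{pmatrix} r^2 & 0 \\ 0 & r^2\sin^2\theta \end{pmatrix}.$$

If we throw r in as the third coordinate, we could calculate

$$g_{**} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}.$$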

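Moving into Minkowski space, we have

$$\begin{aligned}
ds^2 &= dx^2 + dy^2 + dz^2 - dt^2 \\
     &= dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2) - dt^2,
\end{aligned}$$

giving us the metric

Minkowski Space Metric in Polar Coordinates

$$g_{**} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & r^2 & 0 & 0 \\
0 & 0 & r^2\sin^2\theta & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}$$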

For the general spherically symmetric stellar medium, we can still define the radial
coordinate to make $g_{\theta\theta} = r^2$ (through adjustment by scaling if necessary). Further,
we take as the definition of spherical symmetry, that the geometry of the surfaces r
= t = const. is spherical, thus forcing us to have the central 2 × 2 block.

For static spherical symmetry, we also require, among other things, (a) that the
geometry be unchanged under time-reversal, and (b) that g be independent of time
t. For (a), if we change coordinates using

$$(r, \theta, \phi, t) \to (r, \theta, \phi, -t),$$

then the metric remains unchanged; that is, $\bar g = g$. But changing coordinates in this
way amounts to multiplying on the left and right (we have an order 2 tensor here)
by the change-of-coordinates matrix diag(1, 1, 1, -1), giving

$$\bar g_{**} = \begin{pmatrix}
g_{rr} & 0 & 0 & -g_{rt} \\
0 & r^2 & 0 & 0 \\
0 & 0 & r^2\sin^2\theta & 0 \\
-g_{rt} & 0 & 0 & -g_{tt}
\end{pmatrix}.$$
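
The effect of the time-reversal change of coordinates can be checked numerically. The following Python sketch conjugates a sample metric of the above form by diag(1, 1, 1, -1). The numerical values chosen for g_rr, g_rt, g_tt and r are hypothetical and serve only as an illustration; the helper name `conjugate_by_diagonal` is likewise an assumption of this sketch.

```python
def conjugate_by_diagonal(g, d):
    """Return D g D for D = diag(d), i.e. entry (i, j) is d[i] * g[i][j] * d[j]."""
    n = len(g)
    return [[d[i] * g[i][j] * d[j] for j in range(n)] for i in range(n)]

# Hypothetical sample values at one point (theta = pi/2, so sin^2(theta) = 1).
g_rr, g_rt, g_tt, r = 1.5, 0.3, 0.8, 2.0
g = [
    [g_rr, 0.0,  0.0,  g_rt],
    [0.0,  r**2, 0.0,  0.0],
    [0.0,  0.0,  r**2, 0.0],
    [g_rt, 0.0,  0.0,  -g_tt],
]

# Coordinates ordered (r, theta, phi, t); time reversal flips only the last one.
time_reversal = [1, 1, 1, -1]
g_bar = conjugate_by_diagonal(g, time_reversal)

for row in g_bar:
    print(row)
```

Only the mixed r-t entries change sign, so demanding that the conjugated metric equal the original forces g_rt = -g_rt, that is, g_rt = 0.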

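Setting $\bar g = g$ gives $g_{rt} = 0$. Combining this with (b) results in g of the form

$$g_{**} = \begin{pmatrix}
e^{2\Lambda} & 0 & 0 & 0 \\
0 & r^2 & 0 & 0 \\
0 & 0 & r^2\sin^2\theta & 0 \\
0 & 0 & 0 & -e^{2\Phi}
\end{pmatrix},$$

where we have introduced the exponentials to fix the signs, and where $\Lambda = \Lambda(r)$ and
$\Phi = \Phi(r)$. (This $\Lambda$ is a function of r, not the cosmological constant, which we have
set to zero.) Using this version of g, we can calculate the Einstein tensor to be (see the
exercise set!)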

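Einstein Tensor for Static Spherically Symmetric Stars

$$G^{**} = \begin{pmatrix}
\dfrac{2}{r}\Phi' e^{-4\Lambda} - \dfrac{1}{r^2}e^{-2\Lambda}(1-e^{-2\Lambda}) & 0 & 0 & 0 \\
0 & \dfrac{e^{-2\Lambda}}{r^2}\left[\Phi'' + (\Phi')^2 + \dfrac{\Phi'}{r} - \Phi'\Lambda' - \dfrac{\Lambda'}{r}\right] & 0 & 0 \\
0 & 0 & \dfrac{G^{22}}{\sin^2\theta} & 0 \\
0 & 0 & 0 & \dfrac{1}{r^2}e^{-2\Phi}\dfrac{d}{dr}\left[r(1-e^{-2\Lambda})\right]
\end{pmatrix}$$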

We also need to calculate the stress energy tensor,

$$T^{ab} = (\rho + p)u^au^b + pg^{ab}.$$

In the static case, there is assumed to be no flow of star material in our frame, so
that $u^1 = u^2 = u^3 = 0$. Further, the normal condition for four velocity,
$\langle u, u\rangle = -1$, gives

$$\begin{pmatrix} 0 & 0 & 0 & u^4 \end{pmatrix}
\begin{pmatrix}
e^{2\Lambda} & 0 & 0 & 0 \\
0 & r^2 & 0 & 0 \\
0 & 0 & r^2\sin^2\theta & 0 \\
0 & 0 & 0 & -e^{2\Phi}
\end{pmatrix}
\begin{pmatrix} 0 \\ 0 \\ 0 \\ u^4 \end{pmatrix} = -1,$$

whence

$$u^4 = e^{-\Phi},$$

so that $T^{44} = (\rho + p)e^{-2\Phi} + p(-e^{-2\Phi})$ (note that we are using $g^{**}$ here). Hence,

$$\begin{aligned}
T^{**} &= (\rho + p)u^*u^* + pg^{**} \\
&= \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & (\rho + p)e^{-2\Phi}
\end{pmatrix}
+ p\begin{pmatrix}
e^{-2\Lambda} & 0 & 0 & 0 \\
0 & \dfrac{1}{r^2} & 0 & 0 \\
0 & 0 & \dfrac{1}{r^2\sin^2\theta} & 0 \\
0 & 0 & 0 & -e^{-2\Phi}
\end{pmatrix} \\
&= \begin{pmatrix}
pe^{-2\Lambda} & 0 & 0 & 0 \\
0 & \dfrac{p}{r^2} & 0 & 0 \\
0 & 0 & \dfrac{p}{r^2\sin^2\theta} & 0 \\
0 & 0 & 0 & \rho e^{-2\Phi}
\end{pmatrix}.
\end{aligned}$$

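(a) Equations of Motion $T^{ab}{}_{|b} = 0$

To solve these, we first notice that we are not in an inertial frame (the metric g is
not nice at the origin; in fact, nothing is even defined there!) so we need the
Christoffel symbols, and use

$$T^{ab}{}_{|b} = \frac{\partial T^{ab}}{\partial x^b} + \Gamma^{a}_{kb}T^{kb} + \Gamma^{b}_{bk}T^{ak},$$

where

$$\Gamma^{p}_{hk} = \frac{1}{2}g^{lp}\left(\frac{\partial g_{kl}}{\partial x^h} + \frac{\partial g_{lh}}{\partial x^k} - \frac{\partial g_{hk}}{\partial x^l}\right).$$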

Now, lots of the terms in $T^{ab}{}_{|b}$ vanish by symmetry, and the restricted nature of the
functions. We shall focus on a = 1, the r-coordinate. We have:

$$T^{1b}{}_{|b} = T^{11}{}_{|1} + T^{12}{}_{|2} + T^{13}{}_{|3} + T^{14}{}_{|4},$$

and we calculate these terms one at a time.

a = 1, b = 1:

$$T^{11}{}_{|1} = \frac{\partial T^{11}}{\partial x^1} + \Gamma^{1}_{11}T^{11} + \Gamma^{1}_{11}T^{11}.$$

To evaluate this, first look at the term $\Gamma^{1}_{11}$:

$$\begin{aligned}
\Gamma^{1}_{11} &= \tfrac{1}{2}g^{l1}(g_{1l,1} + g_{l1,1} - g_{11,l}) \\
&= \tfrac{1}{2}g^{11}(g_{11,1} + g_{11,1} - g_{11,1}) \quad \text{(because g is diagonal, whence l = 1)} \\
&= \tfrac{1}{2}g^{11}(g_{11,1}) \\
&= \tfrac{1}{2}e^{-2\Lambda}e^{2\Lambda}\cdot 2\Lambda'(r) = \Lambda'(r).
\end{aligned}$$

Hence,

$$T^{11}{}_{|1} = \frac{dp}{dr}e^{-2\Lambda} + \left(-2p\Lambda'(r)e^{-2\Lambda}\right) + 2\Lambda'(r)pe^{-2\Lambda} = \frac{dp}{dr}e^{-2\Lambda}.$$

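Now for the next term:

a = 1, b = 2:

$$\begin{aligned}
T^{12}{}_{|2} &= \frac{\partial T^{12}}{\partial x^2} + \Gamma^{1}_{22}T^{22} + \Gamma^{2}_{21}T^{11} \\
&= 0 + \tfrac{1}{2}g^{l1}(g_{2l,2} + g_{l2,2} - g_{22,l})T^{22} + \tfrac{1}{2}g^{l2}(g_{1l,2} + g_{l2,1} - g_{21,l})T^{11} \\
&= \tfrac{1}{2}g^{11}(-g_{22,1})T^{22} + \tfrac{1}{2}g^{22}(g_{22,1})T^{11} \\
&= \tfrac{1}{2}e^{-2\Lambda}(-2r)\frac{p}{r^2} + \tfrac{1}{2}\,\frac{1}{r^2}\,(2r)\,pe^{-2\Lambda} \\
&= 0.
\end{aligned}$$

Similarly (exercise set)

$$T^{13}{}_{|3} = 0.$$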

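Finally,

a = 1, b = 4:

$$\begin{aligned}
T^{14}{}_{|4} &= \frac{\partial T^{14}}{\partial x^4} + \Gamma^{1}_{44}T^{44} + \Gamma^{4}_{41}T^{11} \\
&= \tfrac{1}{2}g^{11}(-g_{44,1})T^{44} + \tfrac{1}{2}g^{44}(g_{44,1})T^{11} \\
&= \tfrac{1}{2}e^{-2\Lambda}\left(2\Phi'(r)e^{2\Phi}\right)\rho e^{-2\Phi} + \tfrac{1}{2}\left(-e^{-2\Phi}\right)\left(-2\Phi'(r)e^{2\Phi}\right)pe^{-2\Lambda} \\
&= e^{-2\Lambda}\Phi'(r)[\rho + p].
\end{aligned}$$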
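
The Christoffel symbols used in these three calculations can be spot-checked numerically. The sketch below evaluates the Christoffel formula for the diagonal metric diag(e^{2Λ}, r², r² sin²θ, -e^{2Φ}) with central differences. The profiles `lam` and `phi` are hypothetical smooth functions picked only for the check, the function names are assumptions of this sketch, and indices are 0-based, so the text's Γ^1_{22} is `christoffel(0, 1, 1, x)`.

```python
import math

# Hypothetical smooth profiles, chosen only for this check.
def lam(r):
    return 0.1 * r

def phi(r):
    return -0.2 / r

def metric_diag(x):
    """Diagonal entries of g_** at x = (r, theta, phi, t)."""
    r, theta = x[0], x[1]
    return [math.exp(2 * lam(r)), r**2, (r * math.sin(theta))**2, -math.exp(2 * phi(r))]

def d_metric(i, c, x, h=1e-5):
    """Central-difference estimate of the partial derivative of g_ii along x^c."""
    xp, xm = list(x), list(x)
    xp[c] += h
    xm[c] -= h
    return (metric_diag(xp)[i] - metric_diag(xm)[i]) / (2 * h)

def christoffel(up, a, b, x):
    """Gamma^up_{ab} = 1/2 g^{up,up} (g_{b up,a} + g_{up a,b} - g_{ab,up}), diagonal metric only."""
    total = 0.0
    if b == up:
        total += d_metric(up, a, x)
    if a == up:
        total += d_metric(up, b, x)
    if a == b:
        total -= d_metric(a, up, x)
    return 0.5 * total / metric_diag(x)[up]

sample_point = [2.0, 1.0, 0.5, 0.0]      # (r, theta, phi, t), arbitrary
gamma_1_11 = christoffel(0, 0, 0, sample_point)   # expected Lambda'(r)
gamma_1_22 = christoffel(0, 1, 1, sample_point)   # expected -r e^{-2 Lambda}
gamma_2_21 = christoffel(1, 1, 0, sample_point)   # expected 1/r
gamma_1_44 = christoffel(0, 3, 3, sample_point)   # expected Phi' e^{2 Phi - 2 Lambda}
gamma_4_41 = christoffel(3, 3, 0, sample_point)   # expected Phi'(r)
```

With T^{11} = p e^{-2Λ} and T^{22} = p/r², the two values `gamma_1_22` and `gamma_2_21` reproduce the cancellation in the b = 2 term for any pressure p.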

Hence, the conservation equation becomes

$$T^{1a}{}_{|a} = 0 \iff \left[\frac{dp}{dr} + (\rho + p)\frac{d\Phi}{dr}\right]e^{-2\Lambda} = 0
\iff \frac{dp}{dr} = -(\rho + p)\frac{d\Phi}{dr}.$$

This gives the pressure gradient required to keep the plasma static in a star.

Note In classical mechanics, the term on the right has $\rho$ rather than $\rho + p$. Thus, the
pressure gradient is larger in relativistic theory than in classical theory. This
increased pressure gradient corresponds to greater values for p, and hence bigger
values for all the components of T. By Einstein's field equations, this now leads to
even greater values of $\Phi$ (manifested as gravitational force) thereby causing even
larger values of the pressure gradient. If p is large to begin with (big stars) this
vicious cycle diverges, ending in the gravitational collapse of a star, leading to
neutron stars or, in extreme cases, black holes. You can go directly to Lecture 15 on
stellar collapse to find out more.

(b) Einstein Field Equations $G^{ab} = 8\pi T^{ab}$

Looking at the (4, 4) component first, and substituting from the expressions for G
and T, we find

$$\frac{1}{r^2}e^{-2\Phi}\frac{d}{dr}\left[r(1-e^{-2\Lambda})\right] = 8\pi\rho e^{-2\Phi}.$$

If we define

$$\tfrac{1}{2}r(1-e^{-2\Lambda}) = m(r),$$

then the equation becomes

$$\frac{1}{r^2}e^{-2\Phi}\frac{dm(r)}{dr} = 4\pi\rho e^{-2\Phi},$$

or

$$\frac{dm(r)}{dr} = 4\pi r^2\rho \tag{I}$$

This looks like an equation for classical mass, since classically,

$$M(R) = \int_0^R 4\pi r^2\rho(r)\,dr,$$

where the integrand is the mass of a shell whose thickness is dr. Thus,

$$\frac{dM(R)}{dR} = 4\pi R^2\rho(R).$$

Here, $\rho$ is energy density, and by our choice of units, energy is equal to rest mass, so
we interpret m(r) as the total mass of the star enclosed by a sphere of radius r.

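Now look at the (1, 1) component:

$$\begin{aligned}
\frac{2}{r}\Phi' e^{-4\Lambda} - \frac{1}{r^2}e^{-2\Lambda}(1-e^{-2\Lambda}) &= 8\pi pe^{-2\Lambda} \\
\frac{2}{r}\Phi' - \frac{1}{r^2}e^{2\Lambda}(1-e^{-2\Lambda}) &= 8\pi pe^{2\Lambda} \\
2r\Phi' - e^{2\Lambda}(1-e^{-2\Lambda}) &= 8\pi r^2pe^{2\Lambda} \\
\Phi' &= e^{2\Lambda}\,\frac{(1-e^{-2\Lambda}) + 8\pi r^2p}{2r}
\end{aligned}$$

In the expression for m, solve for $e^{2\Lambda}$ to get

$$e^{2\Lambda} = \frac{1}{1-2m/r},$$

giving

$$\frac{d\Phi}{dr} = \frac{8\pi r^2p + 2m/r}{2r(1-2m/r)}$$

or

$$\frac{d\Phi}{dr} = \frac{4\pi r^3p + m}{r(r-2m)} \tag{II}$$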

It can be checked using the Bianchi identities that we in fact get no additional
information from the (2,2) and (3,3) components, so we ignore them.
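
Equations (I) and (II), together with the pressure-gradient equation dp/dr = -(ρ + p) dΦ/dr, can be integrated numerically once an equation of state is chosen. The sketch below is a minimal illustration under a strong assumption that is not part of the notes: a star of uniform density ρ, with hypothetical values M = 1 and R = 10 in geometric units. For this case the central pressure is known in closed form (the standard interior solution for constant density), and the code uses it as the starting value so that the integrated pressure should fall to zero at the surface.

```python
import math

def structure_rhs(r, m, p, rho):
    """Return (dm/dr, dp/dr) from (I), (II) and dp/dr = -(rho + p) dPhi/dr."""
    dm_dr = 4.0 * math.pi * r**2 * rho
    dphi_dr = (4.0 * math.pi * r**3 * p + m) / (r * (r - 2.0 * m))
    return dm_dr, -(rho + p) * dphi_dr

def integrate_uniform_star(rho, p_center, radius, steps=2000):
    """Fourth-order Runge-Kutta integration of m(r), p(r) out to r = radius."""
    r = 1e-6 * radius                       # start just off the coordinate singularity at r = 0
    m = 4.0 / 3.0 * math.pi * r**3 * rho    # mass already enclosed at the starting radius
    p = p_center
    h = (radius - r) / steps
    for _ in range(steps):
        k1m, k1p = structure_rhs(r, m, p, rho)
        k2m, k2p = structure_rhs(r + h / 2, m + h / 2 * k1m, p + h / 2 * k1p, rho)
        k3m, k3p = structure_rhs(r + h / 2, m + h / 2 * k2m, p + h / 2 * k2p, rho)
        k4m, k4p = structure_rhs(r + h, m + h * k3m, p + h * k3p, rho)
        m += h / 6 * (k1m + 2 * k2m + 2 * k3m + k4m)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        r += h
    return m, p

# Hypothetical uniform-density star: total mass 1, radius 10 (geometric units).
M_TOTAL, R_STAR = 1.0, 10.0
RHO = 3.0 * M_TOTAL / (4.0 * math.pi * R_STAR**3)
root = math.sqrt(1.0 - 2.0 * M_TOTAL / R_STAR)
P_CENTER = RHO * (1.0 - root) / (3.0 * root - 1.0)   # closed-form central pressure, constant density

mass_at_surface, pressure_at_surface = integrate_uniform_star(RHO, P_CENTER, R_STAR)
print(mass_at_surface, pressure_at_surface / P_CENTER)
```

The integrated mass recovers M at r = R, and the pressure drops to (numerically) zero there, which is the boundary at which the exterior solution takes over. A realistic star would need an equation of state relating p and ρ in place of the uniform-density assumption.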

Consequences of the Field Equations: Outside the Star

Outside the star we take p = 0, and m(r) = M, the total stellar mass, getting

$$\text{(I):}\quad \frac{dm}{dr} = 0 \quad \text{(nothing new, since m = M = constant)}$$

$$\text{(II):}\quad \frac{d\Phi}{dr} = \frac{M}{r(r-2M)},$$

which is a separable first order differential equation with solution

$$e^{2\Phi} = 1 - \frac{2M}{r},$$

if we impose the boundary condition $\Phi \to 0$ as $r \to +\infty$. (See the exercise set.)
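
As a quick sanity check on this solution, the closed form Φ(r) = ½ ln(1 - 2M/r) can be differentiated numerically and compared with the right-hand side of (II). The sketch below does this with a central difference; the function names and the sample radii are assumptions made for illustration.

```python
import math

def phi_exterior(r, mass):
    """Closed form Phi(r) = (1/2) ln(1 - 2M/r), valid for r > 2M."""
    return 0.5 * math.log(1.0 - 2.0 * mass / r)

def dphi_dr_from_ii(r, mass):
    """Right-hand side of (II) outside the star, where p = 0 and m = M."""
    return mass / (r * (r - 2.0 * mass))

def central_difference(f, r, h=1e-4):
    """Second-order finite-difference estimate of f'(r)."""
    return (f(r + h) - f(r - h)) / (2.0 * h)

MASS = 1.0
for radius in (3.0, 5.0, 50.0):
    numeric = central_difference(lambda s: phi_exterior(s, MASS), radius)
    print(radius, numeric, dphi_dr_from_ii(radius, MASS))
```

The two columns agree to within the finite-difference error, and `phi_exterior` tends to zero for large r, matching the boundary condition.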

Recalling from the definition of m that

$$e^{2\Lambda} = \frac{1}{1-2M/r},$$

we can now express the metric outside a star as follows:

Schwarzschild Metric

$$g_{**} = \begin{pmatrix}
\dfrac{1}{1-2M/r} & 0 & 0 & 0 \\
0 & r^2 & 0 & 0 \\
0 & 0 & r^2\sin^2\theta & 0 \\
0 & 0 & 0 & -(1-2M/r)
\end{pmatrix}$$

In the exercise set, you will see how this leads to Newton's Law of Gravity.

Exercise Set 13

1. Use $\nabla^2\phi = 4\pi G\rho$ and the divergence theorem to deduce Newton's law
$|\nabla\phi| = GM/r^2$ for a spherical mass of uniform density $\rho$.

2. Calculate the Einstein tensor for the metric
$g = \mathrm{diag}(e^{2\Lambda}, r^2, r^2\sin^2\theta, -e^{2\Phi})$, and verify that it agrees with that
in the notes.

3. Referring to the notes above, show that $T^{13}{}_{|3} = 0$.

4. Show that $T^{i4}{}_{|4} = 0$ for i = 2, 3, 4.

5. If we impose the condition that, far from the star, spacetime is flat, show that this
is equivalent to saying that $\lim_{r\to+\infty}\Lambda(r) = \lim_{r\to+\infty}\Phi(r) = 0$. Hence obtain
the formula $e^{2\Phi} = 1 - 2M/r$.

6. A Derivation of Newton's Law of Gravity

(a) Show that, at a large distance R from a static stable star, the Schwarzschild
metric can be approximated as

$$g_{**} \approx \begin{pmatrix}
1 + 2M/R & 0 & 0 & 0 \\
0 & R^2 & 0 & 0 \\
0 & 0 & R^2\sin^2\theta & 0 \\
0 & 0 & 0 & -(1-2M/R)
\end{pmatrix}$$

(b) (Schutz, p. 272 #9) Define a new coordinate $\bar R$ by $R = \bar R(1 + M/2\bar R)^2$, and
deduce that, in terms of the new coordinates (ignoring terms of order $1/\bar R^2$)

$$g_{**} \approx \begin{pmatrix}
1 + 2M/\bar R & 0 & 0 & 0 \\
0 & \bar R^2(1 + 2M/\bar R) & 0 & 0 \\
0 & 0 & \bar R^2(1 + 2M/\bar R)\sin^2\theta & 0 \\
0 & 0 & 0 & -(1 - 2M/\bar R)
\end{pmatrix}$$

(c) Now convert to Cartesian coordinates, (x, y, z, t) to obtain

$$g_{**} \approx \begin{pmatrix}
1 + 2M/\bar R & 0 & 0 & 0 \\
0 & 1 + 2M/\bar R & 0 & 0 \\
0 & 0 & 1 + 2M/\bar R & 0 \\
0 & 0 & 0 & -(1 - 2M/\bar R)
\end{pmatrix}$$

(d) Now refer to the last formula in Lecture 13, and obtain Newton's Law of Gravity.
To how many kilograms does one unit of M correspond?


Last Updated: January, 2002


Copyright © Stefan Waner

The Schwarzschild Metric and Event Horizons http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 15: The Schwarzschild Metric and Event Horizons

15. The Schwarzschild Metric and Event Horizons

We saw that the metric outside a spherically symmetric static stable star
(Schwarzschild metric) is given by

$$ds^2 = \frac{1}{1-2M/r}\,dr^2 + r^2\,d\Omega^2 - (1-2M/r)\,dt^2,$$

where $d\Omega^2 = d\theta^2 + \sin^2\theta\,d\phi^2$. We see immediately that something strange
happens when 2M = r, and we look at two cases.

Case 1 (Not-So-Dense Stars) Radius of the star, $r_s$ > 2M. If we recall that
the Schwarzschild metric is only valid outside a star; that is, r > $r_s$, we
find that r > 2M as well, and so 1-2M/r is positive, and never zero. (If r ≤
2M, we are inside the star, and the Schwarzschild metric no longer applies.)

Case 2 (Extremely Dense Stars) Radius of the star, $r_s$ < 2M. Here, two
things happen. First, as a consequence of the equations of motion, it can be
shown that in fact the pressure inside the star is unable to hold up against
the gravitational forces, and the star collapses (see the next section),
overwhelming even the quantum mechanical forces. In fact, it collapses to a
singularity, a point with infinite density and no physical dimension: a black
hole. Second, for such objects, we have two distinct regions, defined by r > 2M
and r < 2M, separated by the event horizon, r = 2M, where the metric goes
infinite.


Particles Falling Inwards Suppose a particle is falling radially inwards.
Let us see how long, on the particle's clock (proper time), it takes to reach
the event horizon. Our approach will be as follows:

1. Use the principle that the path is a geodesic in space time.
2. Deduce information about $dr/d\tau$.
3. Integrate $d\tau$ to see how long it takes.

Recall first the geodesic equation for such a particle,

$$P^i{}_{,k}P^k + \Gamma^{i}_{rs}P^rP^s = 0.$$

We saw in the derivation (look back) that it came from the equation

$$m_0\frac{dP^i}{d\tau} + \Gamma^{i}_{rs}P^rP^s = 0.$$

There is a covariant version of this:

$$m_0\frac{dP_s}{d\tau} - \Gamma^{i}_{rs}P^rP_i = 0.$$

Now take this covariant version and write out the Christoffel symbols:

$$\begin{aligned}
m_0\frac{dP_s}{d\tau} &= \Gamma^{i}_{rs}P^rP_i \\
&= \tfrac{1}{2}g^{ik}(g_{rk,s} + g_{ks,r} - g_{sr,k})P^rP_i \\
&= \tfrac{1}{2}(g_{rk,s} + g_{ks,r} - g_{sr,k})P^rP^k.
\end{aligned}$$

But the sum of the second and third terms in parentheses is skew-symmetric
in r and k, whereas the term outside is symmetric in them. This results in
them canceling when we sum over repeated indices. Thus, we are left with

$$m_0\frac{dP_s}{d\tau} = \tfrac{1}{2}g_{rk,s}P^rP^k.$$

But by spherical symmetry, g is independent of $x^i$ if i = 2, 3, 4. (Strictly,
$g_{33} = r^2\sin^2\theta$ does depend on $x^2 = \theta$, but it enters only through $P^3P^3$, which
vanishes for the radial motion considered here.) Therefore
$g_{rk,s} = 0$ unless s = 1. This means that $P_2$, $P_3$ and $P_4$ are constant along the
trajectory. Since $P_4$ is constant, we define

$$E = -P_4/m_0,$$

another constant.

Question What is the meaning of E?

Answer Recall that the fourth coordinate of four momentum is the
energy. Suppose the particle starts at rest at $r = \infty$ and then falls inward.
Since space is flat there, and the particle is at rest, we have

$$P^* = [0, 0, 0, m_0] \quad \text{(fourth coordinate is rest energy = } m_0\text{)}$$

(which corresponds to $P_* = [0, 0, 0, -m_0]$, since $P_* = P^*g_{**}$). Thus,
$E = -P_4/m_0 = 1$, the rest energy per unit mass.

As the particle moves radially inwards, $P_2 = P_3 = 0$. What about $P^1$? Now we
know the first coordinate of the contravariant momentum is given by

$$P^1 = m_0\frac{dr}{d\tau} \quad \text{(by definition, } P^i = m_0\,dx^i/d\tau\text{, and } x^1 = r\text{)}$$

Thus, using the metric to get the fourth contravariant coordinate,

$$P^* = \left(m_0\frac{dr}{d\tau},\ 0,\ 0,\ m_0E(1-2M/r)^{-1}\right),$$

we now invoke the normalization condition $\langle u, u\rangle = -1$, whence
$\langle P, P\rangle = -m_0^2$, so that

$$-m_0^2 = m_0^2\left(\frac{dr}{d\tau}\right)^2(1-2M/r)^{-1} - m_0^2E^2(1-2M/r)^{-1},$$

giving

$$\left(\frac{dr}{d\tau}\right)^2 = E^2 - 1 + 2M/r,$$

which is the next step in our quest:

$$d\tau = -\frac{dr}{(E^2 - 1 + 2M/r)^{1/2}},$$

where we have introduced the negative sign since r is a decreasing function
of $\tau$. Therefore, the total time elapsed is

$$T = -\int_R^{2M}\frac{dr}{(E^2 - 1 + 2M/r)^{1/2}},$$

which is finite, since the integrand is bounded on [2M, R]. This is the time it
takes, on the hapless particle's clock, to reach the event horizon.
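
To see the finiteness concretely, the proper-time integral can be evaluated numerically. The sketch below uses Simpson's rule and, for the case E = 1 (a particle starting at rest far away), compares the result with the closed form obtained by integrating √(r/2M) directly. The starting radius R = 10 and mass M = 1 are hypothetical values in geometric units, and the function names are assumptions of this sketch.

```python
import math

def proper_time_numeric(r_start, mass, energy=1.0, steps=10000):
    """Simpson's rule for the integral of dr / sqrt(E^2 - 1 + 2M/r) from 2M to r_start."""
    a, b = 2.0 * mass, r_start
    h = (b - a) / steps          # steps must be even for Simpson's rule

    def integrand(r):
        return 1.0 / math.sqrt(energy**2 - 1.0 + 2.0 * mass / r)

    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3.0

def proper_time_closed_form(r_start, mass):
    """For E = 1 the integrand is sqrt(r / 2M), which integrates exactly."""
    return (2.0 / 3.0) * (r_start**1.5 - (2.0 * mass)**1.5) / math.sqrt(2.0 * mass)

tau_numeric = proper_time_numeric(10.0, 1.0)
tau_exact = proper_time_closed_form(10.0, 1.0)
print(tau_numeric, tau_exact)
```

Both values agree, and nothing special happens to the integrand at r = 2M: on the particle's own clock the horizon is reached in a finite time.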

Now let's recalculate this from the point of view of an observer who is
stationary with respect to the star. That is, let us use the coordinate $x^4$ as
time t. How is it related to proper time? Well, the four velocity tells how:

$$V^4 = \frac{dx^4}{d\tau} = \frac{dt}{d\tau} \quad \text{(by definition)}$$

We can get $V^4$ from the formula for $P^*$ (and divide by $m_0$) so that

$$dt = V^4\,d\tau = E(1-2M/r)^{-1}\,d\tau,$$

giving a total time of

$$T = -\int_R^{2M}\frac{E\,dr}{(1-2M/r)(E^2 - 1 + 2M/r)^{1/2}}.$$

This integral diverges! So, in the eyes of an outside observer, it takes that
particle infinitely long to get there!
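
The divergence can be made visible numerically by stopping the integration a small distance short of the horizon and watching the result grow. The sketch below treats the case E = 1, where dt = r^{3/2} dr / (√(2M) (r - 2M)); the substitution u = ln(r - 2M) is a choice made here (not taken from the notes) so that the integrand stays bounded. M = 1 and R = 10 are hypothetical values in geometric units.

```python
import math

def coordinate_time_to(r_end, r_start, mass, steps=20000):
    """Coordinate time for an E = 1 particle to fall from r_start to r_end > 2M.

    With u = ln(r - 2M), dt = r^{3/2} dr / (sqrt(2M) (r - 2M)) becomes
    dt = r^{3/2} du / sqrt(2M), which Simpson's rule handles without trouble.
    """
    u_lo = math.log(r_end - 2.0 * mass)
    u_hi = math.log(r_start - 2.0 * mass)
    h = (u_hi - u_lo) / steps    # steps must be even for Simpson's rule

    def integrand(u):
        r = 2.0 * mass + math.exp(u)
        return r**1.5 / math.sqrt(2.0 * mass)

    total = integrand(u_lo) + integrand(u_hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(u_lo + i * h)
    return total * h / 3.0

MASS, R_START = 1.0, 10.0
for gap in (1e-2, 1e-4, 1e-6, 1e-8):
    print(gap, coordinate_time_to(2.0 * MASS + gap, R_START, MASS))
```

Each factor of ten by which the particle gets closer to r = 2M adds roughly 2M ln 10 to the coordinate time, so the total grows without bound, logarithmically in the remaining distance.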

Inside the Event Horizon -- A Dialogue

Tortoise: I seem to recall that the metric for a stationary observer (situated
inside the event horizon) is still given by the Schwarzschild metric

$$ds^2 = (1-2M/r)^{-1}\,dr^2 + r^2\,d\Omega^2 - (1-2M/r)\,dt^2.$$

Achilles: Indeed, but notice that now the coefficient of $dr^2$ is negative,
while that of $dt^2$ is positive. What could that signify (if anything)?

Tortoise: Let us do a little thought experiment. If we are unfortunate(?)
enough to be there watching a particle follow either a null or timelike world
line, then, with respect to any parameter (such as $\tau$) we must have $dr/d\tau \neq 0$.
In other words, r must always change with the parameter!

Achilles: So you mean nothing can sit still. Why so?

Tortoise: Simple. First: for any world line, the vector $dx^i/d\tau$ is non-zero, (or
else it would not be a path at all!) so some coordinate must be non-zero. But
now if we calculate $\|dx^i/d\tau\|^2$ using the signature (-, +, +, +) we get

$$-\,\text{something}\cdot\left(\frac{dr}{d\tau}\right)^2 + \text{something}\cdot\text{the others},$$

so the only way the answer can come out zero or negative is if the first
coordinate ($dr/d\tau$) is non-zero.

Achilles: I think I see your reasoning... we could get a null path if all the
coordinates were zero, but that just can't happen in a path! So you mean to
tell me that this is true even of light beams. Mmm.... So you're telling me
that r must change along the world line of any particle or photon! But that
raises a question: since r is always changing with $\tau$, does it increase or
decrease with proper time $\tau$?

Tortoise: To tell you the truth, I looked in the Green Book, and all it said
was that "obviously" r must decrease with $\tau$, but I couldn't see anything
obvious about that.

Achilles: Well, let me try a thought experiment for a change. If you accept
for the moment the claim that a particle fired toward the black hole will
move so as to decrease r, then there is at least one direction for which $dr/d\tau$
< 0. Now imagine a particle being fired in any direction. Since $dr/d\tau$ will be
a continuous function of the angle in which the particle is fired, we
conclude that it must always be negative.

Tortoise: Nice try, my friend, but you are being too hasty (as usual). That
argument can work against you: suppose that a particle fired away from the
black hole will move (initially at least) so as to increase r, then your
argument proves that r increases no matter what direction the particle is
fired. Back to the drawing board.

Achilles: I see your point...

Tortoise (interrupting): Not only that. You might recall from Lecture 38 (or
thereabouts) that the 4-velocity of a radially moving particle in free-fall is
given by

$$V^* = \left(\frac{dr}{d\tau},\ 0,\ 0,\ E(1-2M/r)^{-1}\right),$$

so that the fourth coordinate, $dt/d\tau = E(1-2M/r)^{-1}$, is negative inside the
horizon. Therefore, proper time moves in the opposite direction to
coordinate time!

Achilles: Now I'm really confused. Does this mean that for r to decrease
with coordinate time, it has to increase with proper time?

Tortoise: Yes. So you were (as usual) totally wrong in your reason for
asserting that $dr/d\tau$ is negative for an inward falling particle.

Achilles: OK. So now the burden of proof is on you! You have to explain
what the hell is going on.

Tortoise: That's easy. You might dimly recall the equation

$$\left(\frac{dr}{d\tau}\right)^2 = E^2 - 1 + 2M/r$$

in those excellent on-line differential geometry notes, wherein we saw that
we can take E = 1 for a particle starting at rest far from the black hole. In
other words,

$$\left(\frac{dr}{d\tau}\right)^2 = 2M/r.$$

Notice that this is positive and never zero, so that $dr/d\tau$ can never change
sign during the trajectory of the particle, even as (in its comoving frame) it
passes through the event horizon. Therefore, since r was initially decreasing
with $\tau$ (outside, in "normal" space-time), it must continue to do so
throughout its world line. In other words, photons that originate outside the
horizon can never escape in their comoving frame. Now (and here's the
catch), since there are some particles whose world-lines have the property
that the arc-length parameter (proper time) decreases with increasing r,
and since r is the unique coordinate in the stationary frame that plays the
formal role of time, and further since, in any frame, all world lines must
move in the same direction with respect to the local time coordinate
(meaning r) as their parameter increases, it follows that all world lines must
decrease r with increasing proper time. Ergo, Achilles, r must always
decrease with increasing proper time. Of course, a consequence of all of
this is that no light, communication, or any physical object, can escape from
within the event horizon. They are all doomed to fall into the singularity.

Achilles: But what about the stationary observer?

Tortoise: Interesting point...the quantity $dt/d\tau = E(1-2M/r)^{-1}$ is negative,
meaning proper time goes in the opposite direction to coordinate time and
also becomes large as it approaches the horizon, so it would seem to the
stationary observer inside the event horizon that things do move out toward
the horizon, but take infinitely long to get there. There is a catch, however:
there can be no "stationary observer" according to the above analysis...

Achilles: Oh.

Exercise Set 15

1. Verify that the integral for the infalling particle diverges in the case E = 1.

2. Mini-Black Holes How heavy is a black hole with event horizon of radius
one meter? [Hint: Recall that the "M" corresponds to G × total mass.]

3. Calculate the components of the Riemann curvature tensor $R_{abcd}$ at the event
horizon, r = 2M.


Last Updated: January, 2002


Copyright © Stefan Waner

White Dwarfs, Neutron Stars and Black Holes http://people.hofstra.edu/stefan_waner/diff_geom...

Lecture 16. White Dwarfs, Neutron Stars and Black Holes by Gregory C. Levine

16. White Dwarfs, Neutron Stars and Black Holes

I Introduction

In this section we will look at the physical mechanisms responsible for the
formation of compact stellar objects. Compact objects such as white dwarf
stars, neutron stars, and ultimately black holes, represent the final state of a
star's evolution. Stars are born in gaseous nebulae in which clouds of
hydrogen coalesce, becoming highly compressed and heated through the
gravitational interaction. At a temperature of about $10^7$ K, a nuclear
reaction begins converting hydrogen into the next heavier element, helium,
and releasing a large quantity of electromagnetic energy (light). The helium
accumulates at the center of the star and eventually becomes compressed
and heated enough ($10^8$ K) to initiate nuclear fusion of helium into heavier
elements.

So far, the star is held in "near-equilibrium" by the countervailing forces of


gravity, which compresses the star, and pressure from the vast
electromagnetic energy produced during nuclear fusion, which tends to
make it expand. However, as the star burns hotter and ignites heavier
elements which accumulate in the core, electromagnetic pressure becomes
less and less effective against gravitational collapse. In most stars, this
becomes a serious problem when the core has reached the carbon rich
phase but the temperature is still insufficient to fuse carbon into iron. Even
if a star has reached sufficient temperature to create iron, no other nuclear
fusion reactions producing heavier elements are exothermic and the star
has exhausted its nuclear fuel. Without electromagnetic energy to hold the
core up, one would think that the core would become unstable and begin to
collapse---but another mechanism intervenes.

II The Electron Gas

There is, however, another "force" that holds the core up; we now turn to a
study of this force and how the balance between it and gravity leads
to the various stellar compact objects: white dwarfs, neutron stars and black
holes.

The stabilizing force that keeps the stellar core from collapsing operates at
terrestrial scales as well. All solid matter resists compression and we will
trace the origin of this behavior in a material that turns out to most
resemble a stellar compact object: ordinary metal. Although metal is "hard"
by human standards, it is to some degree elastic---capable of stretching and
compression. Metals all have a similar atomic structure. Positively charged
metal ion cores form a regular crystalline lattice and negatively charged
valence electrons form a kind of gas that uniformly permeates the lattice.

Surprisingly, the bulk properties of the metal such as heat capacity,
compressibility, and thermal conductivity are almost exclusively properties
of the electron gas and not the underlying framework of the metal ion cores.
We will begin by studying the properties of an electron gas alone and then
see if it is possible to justify such a simple model for a metal (or a star).

To proceed, two very important principles from Quantum Mechanics need to
be introduced:

Pauli Exclusion Principle: Electrons cannot be in the same quantum
state. For our purposes, this will effectively mean that electrons cannot be
at the same point in space.

Heisenberg Uncertainty Principle: A quantum particle has no precise
position, x, or momentum, p. However, the uncertainties in the outcome of
an experiment aimed at simultaneously determining both quantities are
constrained in the following way. Upon repeated measurements, the
"spread" in momentum, $\Delta p$, of a particle absolutely confined to a region in
space of size $\Delta x$, is constrained by

$$\Delta x\,\Delta p \ge \frac{\hbar}{2},$$

where $\hbar = h/2\pi \approx 1.05 \times 10^{-34}$ Joule-sec is a fundamental constant of nature
(the Planck constant $h \approx 6.6 \times 10^{-34}$ Joule-sec, divided by $2\pi$).

Here is how these two laws act together to give one of the familiar
properties of metals. The Pauli Exclusion Principle tends to make electrons
stay as far apart as possible. Each of N electrons confined in a box of volume
$R^3$ will typically have $R^3/N$ space of its own. Therefore, the average
interparticle spacing is $a_0 = R/N^{1/3}$. (The situation is actually a bit more
complicated than this.) Since the electrons are spatially confined within a
region of linear size $a_0$, the uncertainty in momentum is $\Delta p \approx \hbar/a_0$. The
precise meaning of $\Delta p^2$ is the variance of a large set of measurements of
momentum. Denoting average by angle brackets,

$$\Delta p^2 = \langle (p - \langle p\rangle)^2\rangle = \langle p^2\rangle - \langle p\rangle^2.$$

Therefore, the average value of $p^2$ must be greater than or equal to $\Delta p^2$.

Based on these results, let us calculate how the energy of an electron gas
depends upon the size of the box containing it. The kinetic energy of a
particle of mass m and speed v is

$$\varepsilon = \tfrac{1}{2}mv^2 = \frac{p^2}{2m}.$$

Now, taking the minimum value of momentum, $\langle p^2\rangle \approx \Delta p^2 \approx \hbar^2/a_0^2$, we arrive at
the energy, $\varepsilon = \hbar^2/m_ea_0^2$, for a single electron of mass $m_e$. The total kinetic
energy of N electrons is then $E_e = N\varepsilon$. Finally, putting in the dependence of
$a_0$ on N and the system size, R, we get for $E_e$,

$$E_e \approx \frac{\hbar^2}{m_eR^2}N^{5/3}.$$

As the system size R is reduced, the energy increases. Even though the
electrons do not interact with one another, there is an effective repulsive
force resisting compression. The origin of this force is the uncertainty
principle! (neglecting e-e interactions and neglecting temperature.)
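
The scaling of this "uncertainty pressure" is easy to explore numerically. The sketch below evaluates the order-of-magnitude formulas for the energy per electron and the total energy. The 0.2 nm spacing and the 1000-electron box are hypothetical illustrative values, not figures taken from the text, and the function names are assumptions of this sketch.

```python
HBAR = 1.055e-34          # reduced Planck constant, J s
M_ELECTRON = 9.109e-31    # electron mass, kg
EV = 1.602e-19            # joules per electron-volt

def energy_per_electron(spacing):
    """epsilon ~ hbar^2 / (m_e a_0^2) for interparticle spacing a_0, in joules."""
    return HBAR**2 / (M_ELECTRON * spacing**2)

def electron_gas_energy(n_electrons, box_size):
    """E_e ~ hbar^2 N^(5/3) / (m_e R^2) for N electrons in a box of side R, in joules."""
    return HBAR**2 * n_electrons**(5.0 / 3.0) / (M_ELECTRON * box_size**2)

# Hypothetical spacing of 0.2 nm between electrons.
epsilon_ev = energy_per_electron(0.2e-9) / EV
print(f"energy per electron: {epsilon_ev:.2f} eV")

# Halving the box size at fixed N quadruples the total energy.
n, size = 1000, 2.0e-9
print(electron_gas_energy(n, size / 2) / electron_gas_energy(n, size))
```

The energy per electron comes out at the electron-volt scale, and the 1/R² dependence means that compressing the box always costs energy, which is the effective repulsion described above.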

Let us test out this model by calculating the compressibility of metal.
Consider a metal block that undergoes a small change in volume, $\Delta V$, due to
an applied pressure P.

The bulk modulus, B, is defined as the constant of proportionality between
the applied pressure and the fractional volume change.

$$P = B\frac{\Delta V}{V}.$$

The outward pressure (towards positive R) exerted by the electron gas is
defined in the usual way in terms of a derivative of the total energy of the
system:

$$P = \frac{F}{A} = -\frac{1}{A}\frac{\partial E_e}{\partial R}.$$

The bulk modulus is then defined as

$$B = V\frac{\partial P}{\partial V} = -\frac{5}{9}\,\frac{\hbar^2}{m_e}\left(\frac{N}{V}\right)^{5/3},$$

which has magnitude $10^{10}$ to $10^{11}$ N/m².

(We've taken the volume per electron to be 1 nm³.) The values of B for steel
and aluminum are $B_{steel} \approx 6 \times 10^{10}$ N/m² and $B_{Al} \approx 2 \times 10^{10}$ N/m². It is hard to
imagine that this excellent agreement in magnitude is wholly fortuitous (it is
not). Having seen that the Heisenberg uncertainty principle is the
underlying physics behind the rigidity of metal, we will now see that it is
also the physical mechanism that keeps stars from collapsing under their own
weight.

III Compact Objects

A star can only be in a condition of static equilibrium if there is some force
to counteract the compressive force of gravity. In large stars this
countervailing force is the radiation pressure from thermally excited atoms
emitting light. But in a white dwarf star, the force counteracting gravity has
its origin in the uncertainty principle, as it did in a metal. The elements
making up the star (mostly iron) exist in a completely ionized state because
of the high temperatures. One can think of the star as a gas of positively
charged atomic nuclei and negatively charged electrons. Each metal nucleus is
a few thousand times heavier than the set of electrons that were attached to
it, so the nuclei (and not the electrons) are responsible for the sizable
gravitational force holding the star together. The electrons are strongly
electrostatically bound to the core of the star and therefore coexist in the same
volume as the nuclear core---gravity pulling the nuclei together and the
uncertainty principle effectively pushing the electrons apart.

We will proceed in the same way as in the calculation of the bulk modulus
by finding an expression for the total energy and taking its derivative with
respect to R to find the effective force.

The gravitational potential energy of a sphere of mass M and radius R is
approximately

$$E_g \approx -\frac{GM^2}{R}$$

where $G \approx 7 \times 10^{-11}$ N m²/kg² is the gravitational constant. (The exact result
has a coefficient of order unity in front; we are doing only
"order-of-magnitude" calculations and ignoring such factors.) The negative sign
means that the force of gravity is attractive---energy decreases with
decreasing R. We would like to express $E_g$ in terms of N, like $E_e$---this will
make the resulting expressions easier to adapt to neutron stars later on. The
mass M of the star is the collective mass of the nucleons, to an excellent
approximation. As you may know from chemistry, the number of nucleons
(protons and neutrons) is roughly double the number of electrons, for light
elements. If $\mu$ is the average number of nucleons per electron, for the
heavier elements making up the star, the mass of the star is expressed as
$M = \mu m_nN$. Putting the expressions for the electron kinetic energy and the
gravitational potential energy together, we get the total energy E:

$$E = E_e + E_g \approx \frac{\hbar^2N^{5/3}}{m_eR^2} - \frac{G\mu^2m_n^2N^2}{R}$$

The graph of the function E(R)
reveals that there is a radius at which the energy is minimum---that is to
say, a radius $R_0$ where the force $F = -\partial E/\partial R$ is zero and the star is in
mechanical equilibrium. A rough calculation of $R_0$ gives:

$$R_0 = \frac{\hbar^2N^{-1/3}}{Gm_e\mu^2m_n^2} \approx 10^7 \text{ m} = 10{,}000 \text{ km},$$

where we have used $N \approx 10^{57}$, a reasonable value for a star such as our sun.
$R_0$ corresponds to a star that is a little bigger than earth---a reasonable
estimate for a white dwarf star! The mass density may also be calculated
assuming the radius $R_0$: $\rho \approx 10^9$ kg/m³ = $10^5$ × density of steel. On the
average, the electrons are much closer to the nuclei in the white dwarf than
they are in ordinary matter.
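
The equilibrium radius can also be located numerically by finding where the force F = -∂E/∂R changes sign. The sketch below is an order-of-magnitude illustration only: it writes E(R) = A/R² - B/R with A and B built from rounded SI constants, takes μ = 2 as an assumption, and uses bisection (a choice made here for simplicity). Keeping the factor of 2 that comes from differentiating A/R² gives R₀ = 2A/B.

```python
HBAR = 1.055e-34          # reduced Planck constant, J s
G_NEWTON = 6.674e-11      # gravitational constant, N m^2 / kg^2
M_ELECTRON = 9.109e-31    # electron mass, kg
M_NUCLEON = 1.675e-27     # nucleon mass, kg
MU = 2.0                  # nucleons per electron (assumed)
N_ELECTRONS = 1e57        # electron count quoted in the text

A = HBAR**2 * N_ELECTRONS**(5.0 / 3.0) / M_ELECTRON        # E_e = A / R^2
B = G_NEWTON * MU**2 * M_NUCLEON**2 * N_ELECTRONS**2       # E_g = -B / R

def force(radius):
    """F = -dE/dR for E(R) = A/R^2 - B/R; positive means a net outward push."""
    return 2.0 * A / radius**3 - B / radius**2

def equilibrium_radius(lo=1e3, hi=1e12, iterations=200):
    """Bisection for the root of force(R); force is positive at lo, negative at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if force(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

R0 = equilibrium_radius()
print(f"equilibrium radius: {R0:.2e} m")
```

With these constants the root lies at a few thousand kilometers, between 10⁶ and 10⁷ m. Given the factors of order unity dropped throughout, that is the same rough scale as the 10,000 km quoted above.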

Under some circumstances, the star can collapse to an object even more
compact than a white dwarf: a neutron star. The Special Theory of Relativity
plays an important role in this further collapse. If we calculate the kinetic
energy of the most energetic electrons in the white dwarf, we get:

$$\epsilon \approx \frac{\hbar^2}{m_e a_0^2} = \frac{\hbar^2 N^{2/3}}{m_e R_0^2} \approx 10^{-14} \text{ Joules}.$$

This energy is actually quite close to the rest mass energy of the electron
itself, $m_e c^2 \approx 10^{-13}$ Joules. Recall that the expression for the kinetic energy,
$\epsilon = p^2/2m$, is only a nonrelativistic approximation. Rest mass energy is a
scalar formed from the product

$$p^\mu p_\mu = \frac{\epsilon^2}{c^2} - p^2 = (mc)^2.$$

The exact expression for the energy of a relativistic particle is then:

$$\epsilon = \left[(pc)^2 + (mc^2)^2\right]^{1/2} = mc^2 + \frac{p^2}{2m} + \text{terms of order } \left(\frac{p}{mc}\right)^4.$$

When $p \approx mc$ (or, equivalently, when $p^2/m \approx mc^2$ as above) the higher order
terms cannot be neglected.
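
To see how quickly the truncated expansion fails, the exact energy can be compared with its first two terms. This is an illustrative sketch only: the function names and the rounded constants are assumptions, not taken from the notes.

```python
import math

# Rounded SI constants (assumed values).
C = 3.0e8        # speed of light, m/s
M_E = 9.11e-31   # electron mass, kg


def exact_energy(p, m=M_E):
    """Full relativistic energy [(pc)^2 + (mc^2)^2]^(1/2)."""
    return math.sqrt((p * C)**2 + (m * C**2)**2)


def truncated_energy(p, m=M_E):
    """Rest energy plus p^2/2m, the first two terms of the expansion."""
    return m * C**2 + p**2 / (2.0 * m)


def relative_error(p, m=M_E):
    """Fractional error made by dropping the higher order terms."""
    exact = exact_energy(p, m)
    return abs(truncated_energy(p, m) - exact) / exact


if __name__ == "__main__":
    for ratio in (0.1, 1.0, 10.0):   # momentum in units of mc
        p = ratio * M_E * C
        print(f"p = {ratio} mc: relative error {relative_error(p):.1e}")
```

For momenta well below mc the truncation is excellent, at p = mc it is already off by several percent, and for p much larger than mc it is useless, while the exact energy approaches pc.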

Since the full expression for $\epsilon$ is unwieldy for our simple approximation
schemes, we will look at the extreme relativistic limit, $p \gg mc$. In this case,
$\epsilon \approx pc$. This limit is effectively the limit for extremely massive stars, where
the huge compressive force of gravity will force the electrons to have
compensatingly high kinetic energies and enter the extreme relativistic
regime.

The different form for the energy of the electrons (now linear rather than
quadratic in p) will have dramatic consequences for the stability equation
for the radius R0 derived earlier. The calculation proceeds as before;
according to the uncertainty principle the estimate for the momentum of an
electron within the star is


$$p \approx \frac{\hbar}{a_0} = \frac{\hbar N^{1/3}}{R}.$$

Therefore, the total electron energy is given by

$$E_e \approx N\epsilon \approx Npc \approx \frac{\hbar c N^{4/3}}{R}.$$

The same expression as before for Eg results in the following expression for
the total energy:

$$E = E_e + E_g \approx \frac{\hbar c N^{4/3}}{R} - \frac{G \mu^2 m_n^2 N^2}{R}.$$

The energy E(R) has a completely different behavior than in the
nonrelativistic case. If we look at the force $F = -\partial E/\partial R$, it is just equal to E/R.
If the total energy is positive, the force always induces expansion; if the
total energy is negative, the force always induces compression. Thus, if the
total energy E is negative, the star will continue to collapse (with an ever
increasing inward force) unless some other force intervenes. These
behaviors are suggested in the figure below.
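
The claim that the force equals E/R, and therefore carries the sign of the total energy, can be checked with a finite difference. The sketch below is illustrative: the function names, the rounded constants, the sample radius, and the two sample values of N are assumptions chosen to land on either side of the sign change.

```python
# Rounded SI constants (assumed values).
HBAR = 1.055e-34   # reduced Planck constant, J s
C = 3.0e8          # speed of light, m/s
G = 6.67e-11       # gravitational constant, N m^2 / kg^2
M_N = 1.67e-27     # nucleon mass, kg


def total_energy(R, N, mu=2.0):
    """Extreme relativistic estimate: (hbar c N^(4/3) - G (mu m_n N)^2) / R."""
    return (HBAR * C * N**(4.0 / 3.0) - G * (mu * M_N * N)**2) / R


def force(R, N, mu=2.0, rel_step=1e-6):
    """F = -dE/dR, estimated with a central finite difference."""
    dR = rel_step * R
    dE = total_energy(R + dR, N, mu) - total_energy(R - dR, N, mu)
    return -dE / (2.0 * dR)


if __name__ == "__main__":
    R = 1.0e7  # sample radius in metres
    for N in (1e56, 1e57):
        E = total_energy(R, N)
        F = force(R, N)
        trend = "expansion" if F > 0 else "compression"
        print(f"N = {N:.0e}: E = {E:.2e} J, F = {F:.2e} N, E/R = {E / R:.2e} N ({trend})")
```

Because both terms of E scale as 1/R, changing R rescales the energy without ever changing its sign, which is why no equilibrium radius exists in this regime.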

The expression for total energy tells us that the critical value of N (denoted
by $N_C$) for which the energy crosses over to negative values is

$$N_C = \frac{1}{\mu^3}\left(\frac{\hbar c}{G m_n^2}\right)^{3/2}.$$

This is conventionally written in terms of a critical mass for a star, MC, that
separates the two behaviors: expansion or collapse. The critical mass is

$$M_C = \mu N_C m_n = \frac{1}{\mu^2 m_n^2}\left(\frac{\hbar c}{G}\right)^{3/2}.$$
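
As a numerical illustration of these two formulas, here is a short sketch. The function names, the rounded constants, and the solar mass used for comparison are assumptions added for illustration; since order-unity prefactors were dropped in the derivation, only the order of magnitude of the result is meaningful.

```python
# Rounded SI constants (assumed values).
HBAR = 1.055e-34   # reduced Planck constant, J s
C = 3.0e8          # speed of light, m/s
G = 6.67e-11       # gravitational constant, N m^2 / kg^2
M_N = 1.67e-27     # nucleon mass, kg
M_SUN = 1.99e30    # solar mass, kg (used for comparison only)


def critical_number(mu=2.0):
    """N_C = (hbar c / (G m_n^2))^(3/2) / mu^3."""
    return (HBAR * C / (G * M_N**2))**1.5 / mu**3


def critical_mass(mu=2.0):
    """M_C = (hbar c / G)^(3/2) / (mu m_n)^2."""
    return (HBAR * C / G)**1.5 / (mu * M_N)**2


if __name__ == "__main__":
    mu = 2.0
    print(f"N_C = {critical_number(mu):.1e}")
    print(f"M_C = {critical_mass(mu):.1e} kg = {critical_mass(mu) / M_SUN:.2f} solar masses")
```

With these inputs the critical mass comes out as a fraction of a solar mass, so a star like the sun sits close to the boundary between the two behaviors in this rough treatment.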


If M > MC, the star will continue to collapse and its electrons will be pushed
closer and closer to the nuclei. At some point, a nuclear reaction begins to
occur in which electrons and protons combine to form neutrons (and
neutrinos which are nearly massless and noninteracting). A sufficiently
dense star is unstable against such an interaction and all electrons and
protons are converted to neutrons leaving behind a chargeless and
nonluminous star: a neutron star.

You may be wondering: what holds the neutron star up? Neutrons are
chargeless and the nuclear force between neutrons (and protons) is only
attractive, so what keeps the neutron star from further collapse? Just as with
electrons, neutrons obey the Pauli Exclusion Principle. Consequently, they
avoid one another when they are confined and have a sizable kinetic energy
due to the uncertainty principle. If the neutrons are nonrelativistic, the
previous calculation for the radius of the white dwarf star will work just the
same, with the replacement $m_e \to m_n$. This change reduces the radius $R_0$ of
the neutron star by a factor of about 2000 (the ratio of $m_n$ to $m_e$), giving $R_0 \approx 10$ km.
One of these would comfortably fit on Long Island but would produce
somewhat disruptive effects.
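
The effect of swapping the electron mass for the neutron mass can be sketched numerically. The helper below is hypothetical: its name, the rounded constants, the factor of 2 from dE/dR = 0, and the choice of one nucleon per degenerate particle for the neutron star are all assumptions made for illustration.

```python
# Rounded SI constants (assumed values).
HBAR = 1.055e-34   # reduced Planck constant, J s
G = 6.67e-11       # gravitational constant, N m^2 / kg^2
M_E = 9.11e-31     # electron mass, kg
M_N = 1.67e-27     # nucleon mass, kg


def degenerate_radius(N, m_fermion, mu):
    """Equilibrium radius 2 hbar^2 N^(-1/3) / (G m_fermion (mu m_n)^2)."""
    return 2.0 * HBAR**2 * N**(-1.0 / 3.0) / (G * m_fermion * (mu * M_N)**2)


if __name__ == "__main__":
    N = 1e57
    r_white_dwarf = degenerate_radius(N, M_E, mu=2.0)
    # Replace m_e by m_n only, as in the text.
    r_swapped = degenerate_radius(N, M_N, mu=2.0)
    # Also count one nucleon per degenerate particle (assumption).
    r_neutron_star = degenerate_radius(N, M_N, mu=1.0)
    print(f"white dwarf radius:        {r_white_dwarf:.1e} m")
    print(f"radius ratio after swap:   {r_white_dwarf / r_swapped:.0f}")
    print(f"neutron star radius (mu=1): {r_neutron_star:.1e} m")
```

Either way the result is a radius of order kilometres, consistent with the estimate in the text.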

Finally, if the neutron star is massive enough to make its neutrons
relativistic, continued collapse is possible if the total energy is negative, as
before in the white dwarf case. The expression for the critical mass $M_C$ is
easily adapted to neutrons by setting $\mu = 1$. Since $\mu \approx 2$ for a white dwarf, we
would expect that a star about four times more massive than a white dwarf
is susceptible to unlimited collapse. No known laws of physics are capable of
interrupting the collapse of a neutron star. In a sense, the laws of physics
leave the door open for the formation of stellar black holes.


Last Updated: January, 2002


Copyright © Stefan Waner and Gregory C. Levine
