Advanced Calculus Notes
James S. Cook
Liberty University
Department of Mathematics
Fall 2011
introduction and motivations for these notes

There are many excellent texts on portions of this subject. However, the particular path I choose this semester is not quite in line with any particular text. I required the text on Advanced Calculus by Edwards because it contains all the major theorems that traditionally are covered in an Advanced Calculus course.

My focus differs significantly. If I had students who had already completed a semester in real analysis then we could delve into the more analytic aspects of the subject. However, real analysis is not a prerequisite so we take a different path. Generically the story is as follows: a linear approximation replaces a complicated object very well so long as we are close to the base-point for the approximation. The first level of understanding is what I would characterize as algebraic; beyond that is the analytic understanding. I would argue that we must first have a firm grasp of the algebraic before we can properly attack the analytic aspects of the subject.

Edwards covers both the algebraic and the analytic. This makes his text hard to read in places because the full story is at some points technical. My goal is to focus on the algebraic. That said, I will try to at least point the reader to the section of Edwards where the proof can be found.
Linear algebra is not a prerequisite for this course. However, I will use linear algebra. Matrices, linear transformations and vector spaces are necessary ingredients for a proper discussion of advanced calculus. I believe an interested student can easily assimilate the needed tools as we go, so I am not terribly worried if you have not had linear algebra previously. I will make a point to include some baby¹ linear exercises to make sure everyone who is working at this course keeps up with the story that unfolds.
Doing the homework is doing the course. I cannot overemphasize the importance of thinking through the homework. I would be happy if you left this course with a working knowledge of:

• set-theoretic mapping language, fibers and images and how to picture relationships diagrammatically.
• continuity in view of the metric topology in n-space.
• the concept and application of the derivative and differential of a mapping.
• continuous differentiability
• inverse function theorem
• implicit function theorem
• tangent space and normal space via gradients
¹ if you view this as an insult then you haven't met the right babies yet. Baby exercises are cute.
• extrema for multivariate functions, critical points and the Lagrange multiplier method
• multivariate Taylor series.
• quadratic forms
• critical point analysis for multivariate functions
• dual space and the dual basis.
• multilinear algebra.
• metric dualities and Hodge duality.
• the work and flux form mappings for ℝ³.
• basic manifold theory
• vector fields as derivations.
• Lie series and how vector fields generate symmetries
• differential forms and the exterior derivative
• integration of forms
• generalized Stokes's Theorem.
• surfaces
• fundamental forms and curvature for surfaces
• differential form formulation of classical differential geometry
• some algebra and calculus of supermathematics
Before we begin, I should warn you that I assume quite a few things from the reader. These notes are intended for someone who has already grappled with the problem of constructing proofs. I assume you know the difference between ⇒ and ⇔. I assume the phrase "iff" is known to you. I assume you are ready and willing to do a proof by induction, strong or weak. I assume you know what ℕ, ℤ, ℚ, and ℝ denote. I assume you know what a subset of a set is. I assume you know how to prove two sets are equal. I assume you are familiar with basic set operations such as union and intersection (although we don't use those much). More importantly, I assume you have started to appreciate that mathematics is more than just calculations. Calculations without context, without theory, are doomed to failure. At a minimum, theory and proper mathematics allows you to communicate analytical concepts to other like-educated individuals.

Some of the most seemingly basic objects in mathematics are insidiously complex. We've been taught they're simple since our childhood, but as adults, mathematical adults, we find the actual definitions of such objects as ℝ or ℕ are rather involved. I will not attempt to provide foundational arguments to build numbers from basic set theory. I believe it is possible, I think it's well-thought-out mathematics, but we take the existence of the real numbers as an axiom for these notes. We assume that ℝ exists and that the real numbers possess all their usual properties. In fact, I assume ℕ, ℤ, ℚ, and ℂ all exist complete with their standard properties. In short, I assume we have numbers to work with. We leave the rigorization of numbers to a different course.
The format of these notes is similar to that of my calculus and linear algebra and advanced calculus notes from 2009-2011. However, I will make a number of definitions in the body of the text. Those sort of definitions are typically background-type definitions and I will make a point of putting them in bold so you can find them with ease.

I have avoided use of Einstein's implicit summation notation in the majority of these notes. This has introduced some clutter in calculations, but I hope the student finds the added detail helpful. Naturally if one goes on to study tensor calculations in physics then no such luxury is granted; you will have to grapple with the meaning of Einstein's convention. I suspect that is a minority in this audience so I took that task off the to-do list for this course.

The content of this course differs somewhat from my previous offering. The presentation of geometry and manifolds is almost entirely altered. Also, I have removed the chapter on Newtonian mechanics as well as the later chapter on variational calculus. Naturally, the interested student is invited to study those as independent studies past this course. If interested please ask.

I should mention that James Callahan's Advanced Calculus: a geometric view has influenced my thinking in this reformulation of my notes. His discussion of Morse's work was a useful addition to the critical point analysis.

I was inspired by Flanders' text on differential form computation. It is my goal to implement some of his nicer calculations as an addition to my previous treatment of differential forms. In addition, I intend to incorporate material from Burns and Gidea's Differential Geometry and Topology with a View to Dynamical Systems as well as Munkres' Analysis on Manifolds. These additions should greatly improve the depth of the manifold discussion. I intend to go significantly deeper this year so the student can perhaps begin to appreciate manifold theory.

I plan to take the last few weeks of class to discuss supermathematics. This will serve as a sideways review for calculus on ℝⁿ.
8.7.2 metric tensor on a smooth manifold
8.8 on boundaries and submanifolds
9 differential forms
9.1 algebra of differential forms
9.2 exterior derivatives: the calculus of forms
9.2.1 coordinate independence of exterior derivative
9.2.2 exterior derivatives on ℝ³
9.3 pullbacks
9.4 integration of differential forms
9.4.1 integration of k-form on ℝⁿ
9.4.2 orientations and submanifolds
9.5 Generalized Stokes Theorem
9.6 Poincaré's lemma and converse
9.6.1 exact forms are closed
9.6.2 potentials for closed forms
9.7 classical differential geometry in forms
9.8 E & M in differential form
9.8.1 differential forms in Minkowski space
9.8.2 exterior derivatives of charge forms, field tensors, and their duals
9.8.3 coderivatives and comparing to Griffiths' relativistic E & M
9.8.4 Maxwell's equations are relativistically covariant
9.8.5 Electrostatics in Five dimensions
10 supermath
Chapter 1
set-up
In this chapter we settle some basic terminology about sets and functions.
1.1 set theory
Let us denote sets by capital letters in as much as is possible. Often the lower-case letter of the same symbol will denote an element; a ∈ A is to mean that the object a is in the set A. We can abbreviate a₁ ∈ A and a₂ ∈ A by simply writing a₁, a₂ ∈ A; this is a standard notation. The union of two sets A and B is denoted A ∪ B = {x | x ∈ A or x ∈ B}. The intersection of two sets is denoted A ∩ B = {x | x ∈ A and x ∈ B}. If a set S has no elements then we say S is the empty set and denote this by writing S = ∅. It is sometimes convenient to use unions or intersections of several sets:

∪_{α∈Λ} U_α = {x | there exists α ∈ Λ with x ∈ U_α}        ∩_{α∈Λ} U_α = {x | for each α ∈ Λ we have x ∈ U_α}

we say Λ is the index set in the definitions above. If Λ is a finite set then the union/intersection is said to be a finite union/intersection. If Λ is a countable set then the union/intersection is said to be a countable union/intersection¹. Suppose A and B are both sets; then we say A is a subset of B and write A ⊆ B iff a ∈ A implies a ∈ B for all a ∈ A. If A ⊆ B then we also say B is a superset of A. If A ⊆ B then we say A ⊂ B iff A ≠ B and A ≠ ∅. Recall, for sets A, B we define A = B iff a ∈ A implies a ∈ B for all a ∈ A and conversely b ∈ B implies b ∈ A for all b ∈ B. This is equivalent to insisting A = B iff A ⊆ B and B ⊆ A. The difference of two sets A and B is denoted A − B and is defined by A − B = {a ∈ A such that a ∉ B}².

¹ recall the term countable simply means there exists a bijection to the natural numbers. The cardinality of such a set is said to be ℵ₀

² other texts sometimes use A − B = A ∖ B
We often make use of the following standard sets:

• natural numbers (positive integers); ℕ = {1, 2, 3, . . . }.
• natural numbers up to the number n; ℕ_n = {1, 2, 3, . . . , n−1, n}.
• integers; ℤ = {. . . , −2, −1, 0, 1, 2, . . . }. Note, ℤ_{>0} = ℕ.
• non-negative integers; ℤ_{≥0} = {0, 1, 2, . . . } = ℕ ∪ {0}.
• negative integers; ℤ_{<0} = {−1, −2, −3, . . . } = −ℕ.
• rational numbers; ℚ = {p/q | p, q ∈ ℤ, q ≠ 0}.
• irrational numbers; 𝕀 = {x ∈ ℝ | x ∉ ℚ}.
• open interval from a to b; (a, b) = {x ∈ ℝ | a < x < b}.
• half-open interval; (a, b] = {x ∈ ℝ | a < x ≤ b} or [a, b) = {x ∈ ℝ | a ≤ x < b}.
• closed interval; [a, b] = {x ∈ ℝ | a ≤ x ≤ b}.
The final, and for us the most important, construction in set-theory is called the Cartesian product. Let A, B, C be sets; we define:

A × B = {(a, b) | a ∈ A and b ∈ B}

By a slight abuse of notation³ we also define:

A × B × C = {(a, b, c) | a ∈ A and b ∈ B and c ∈ C}

In the case the sets comprising the Cartesian product are the same we use an exponential notation for the construction:

A² = A × A,        A³ = A × A × A

We can extend to finitely many sets. Suppose A₁, A₂, . . . , A_n are sets and define x ∈ A₁ × A₂ × ⋯ × A_n iff x = (x₁, x₂, . . . , x_n) where x_i ∈ A_i for each i; that is

A₁ × A₂ × ⋯ × A_n = {(x₁, x₂, . . . , x_n) | it is assumed from the context that x_i ∈ A_i for each i ∈ ℕ_n}.

³ technically A × (B × C) ≠ (A × B) × C since objects of the form (a, (b, c)) are not the same as ((a, b), c); we ignore these distinctions and map both of these to the triple (a, b, c) without ambiguity in what follows
In terms of Cartesian products you can imagine the x-axis as the number line; then if we paste another number line at each x value, the union of all such lines constructs the plane. This is the picture behind ℝ² = ℝ × ℝ. Another interesting Cartesian product is the unit square; [0, 1]² = [0, 1] × [0, 1] = {(x, y) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}. Sometimes a rectangle in the plane with its edges included can be written as [x₁, x₂] × [y₁, y₂]. If we want to remove the edges use (x₁, x₂) × (y₁, y₂). Moving to three dimensions we can construct the unit cube as [0, 1]³. A generic rectangular solid can sometimes be represented as [x₁, x₂] × [y₁, y₂] × [z₁, z₂] or if we delete the edges: (x₁, x₂) × (y₁, y₂) × (z₁, z₂).
1.2 vectors and geometry for n-dimensional space

Definition 1.2.1.

Let n ∈ ℕ. We define ℝⁿ = {(x₁, x₂, . . . , x_n) | x_i ∈ ℝ for i = 1, 2, . . . , n}. If v ∈ ℝⁿ then we say v is an n-vector. The numbers in the vector are called the components; v = (v₁, v₂, . . . , v_n) has i-th component v_i.

Notice, a consequence of the definition above and the construction of the Cartesian product is that two vectors v and w are equal iff v_i = w_i for all i ∈ ℕ_n.

Definition 1.2.2.

Define vector addition and scalar multiplication on ℝⁿ componentwise: for v, w ∈ ℝⁿ and c ∈ ℝ,

(1.) (v + w)_i = v_i + w_i
(2.) (cv)_i = c v_i

for all i ∈ {1, 2, . . . , n}. The operation + is called vector addition and it takes two vectors v, w ∈ ℝⁿ to their sum v + w ∈ ℝⁿ; likewise cv ∈ ℝⁿ is the scalar multiple of v by c.

Definition 1.2.3.

The symbol δ_ij defined by δ_ij = 1 if i = j and δ_ij = 0 if i ≠ j is called the Kronecker delta. For example, δ₂₂ = 1 while δ₁₂ = 0.
Definition 1.2.4.

Let i ∈ ℕ_n. Define e_i ∈ ℝⁿ by (e_i)_j = δ_ij for each j ∈ ℕ_n; the value of n is determined by context.

We call a sum c₁A₁ + c₂A₂ + ⋯ + c_kA_k a linear combination of A₁, A₂, . . . , A_k where c₁, c₂, . . . , c_k ∈ ℝ. If we take coefficients c₁, c₂, . . . , c_k ∈ ℂ then it is said to be a complex linear combination. I invite the reader to verify that every vector in ℝⁿ is a linear combination of e₁, e₂, . . . , e_n. It is not difficult to prove the following properties for vector addition and scalar multiplication: for all x, y, z ∈ ℝⁿ and a, b, c ∈ ℝ,

(i.) x + y = y + x,            (ii.) (x + y) + z = x + (y + z)
(iii.) x + 0 = x,              (iv.) x − x = 0
(v.) 1x = x,                   (vi.) (ab)x = a(bx),
(vii.) a(x + y) = ax + ay,     (viii.) (a + b)x = ax + bx
(ix.) x + y ∈ ℝⁿ               (x.) cx ∈ ℝⁿ

These properties show ℝⁿ is a vector space; in fact it is the quintessential model for all other vector spaces. Fortunately, much of the geometry we know from ℝ² and ℝ³ transfers to ℝⁿ. We take as our starting point the dot product, which takes in a pair of vectors and outputs a real number.
Definition 1.2.6. Let x, y ∈ ℝⁿ; we define x · y ∈ ℝ by

x · y = x₁y₁ + x₂y₂ + ⋯ + x_ny_n.

Example 1.2.7. Let x = (1, 2, 3, 4, 5) and y = (6, 7, 8, 9, 10),

x · y = 6 + 14 + 24 + 36 + 50 = 130
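The arithmetic is easy to check by machine; here is a minimal Python sketch (my own aside, not part of the original notes; the helper name dot is hypothetical):

def dot(x, y):
    # x . y = x1*y1 + x2*y2 + ... + xn*yn, summed componentwise
    return sum(xi * yi for xi, yi in zip(x, y))

print(dot((1, 2, 3, 4, 5), (6, 7, 8, 9, 10)))  # 130, matching Example 1.2.7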
Example 1.2.8. Suppose we are given a vector v ∈ ℝⁿ. Observe that v = Σ_{i=1}^n v_i e_i and consider,

v · e_j = ( Σ_{i=1}^n v_i e_i ) · e_j = Σ_{i=1}^n v_i (e_i · e_j) = Σ_{i=1}^n v_i δ_ij = v_j.

The dot product with e_j simply selects the j-th component of v. It is not difficult to show the dot product satisfies the following properties: for x, y, z ∈ ℝⁿ and c ∈ ℝ,

1. x · y = y · x
2. x · (y + z) = x · y + x · z
3. c(x · y) = (cx) · y = x · (cy)
4. x · x ≥ 0 and x · x = 0 iff x = 0

The formula θ = cos⁻¹( x · y / (||x|| ||y||) ) defines the angle θ between nonzero vectors x, y ∈ ℝⁿ, where ||x|| = √(x · x) is the length or norm of x.
1.2. VECTORS AND GEOMETRY FOR -DIMENSIONAL SPACE 17
Example 1.2.12. Let v = [1, 2, 3, 4, 5] and w = [6, 7, 8, 9, 10],

||v|| = √(1² + 2² + 3² + 4² + 5²) = √(1 + 4 + 9 + 16 + 25) = √55

||w|| = √(6² + 7² + 8² + 9² + 10²) = √(36 + 49 + 64 + 81 + 100) = √330

We find unit vectors via the standard trick: you just take the given vector and multiply it by the reciprocal of its length. This is called normalizing the vector,

v̂ = (1/√55)[1, 2, 3, 4, 5]        ŵ = (1/√330)[6, 7, 8, 9, 10]

The angle between v and w is

θ = cos⁻¹( 130 / (√55 √330) ) ≈ 15.21°

It's good we have this definition; 5-dimensional protractors are very expensive.
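The angle computation is likewise easy to verify numerically; a short sketch of mine using the definitions above:

import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # ||x|| = sqrt(x . x)
    return math.sqrt(dot(x, x))

v = [1, 2, 3, 4, 5]
w = [6, 7, 8, 9, 10]
theta = math.acos(dot(v, w) / (norm(v) * norm(w)))
print(math.degrees(theta))  # approximately 15.21 degrees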
Proposition 1.2.13.

Let x, y ∈ ℝⁿ paired with the Euclidean norm ||x|| = √(x · x); then ||x|| ≥ 0 with equality iff x = 0, ||cx|| = |c| ||x|| for c ∈ ℝ, and ||x + y|| ≤ ||x|| + ||y||.

Definition 1.2.14.

Let a, b ∈ ℝⁿ. The distance between a and b is defined to be d(a, b) = ||b − a||.

If we draw a picture this definition is very natural. Here we are thinking of the points a, b as vectors from the origin; then b − a is the vector which points from a to b (this is algebraically clear since a + (b − a) = b). Then the distance between the points is the length of the vector that points from one point to the other. If you plug in two-dimensional vectors you should recognize the distance formula from middle school math:

d((a₁, a₂), (b₁, b₂)) = √( (b₁ − a₁)² + (b₂ − a₂)² )
Proposition 1.2.15.

Let v, w ∈ ℝ³. You should recall that we can write any vector in ℝ³ as

v = <a, b, c> = a<1, 0, 0> + b<0, 1, 0> + c<0, 0, 1>

and the cross product of v and w is the vector

v × w = < v₂w₃ − v₃w₂, v₃w₁ − v₁w₃, v₁w₂ − v₂w₁ >.

The magnitude of v × w can be shown to satisfy ||v × w|| = ||v|| ||w|| sin(θ), and v × w is perpendicular to both of its inputs:

(v × w) · v = 0,        (v × w) · w = 0.
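A quick numerical check of the component formula for the cross product (my own sketch, assuming the formula as reconstructed above):

def cross(v, w):
    # < v2 w3 - v3 w2, v3 w1 - v1 w3, v1 w2 - v2 w1 >
    return (v[1]*w[2] - v[2]*w[1],
            v[2]*w[0] - v[0]*w[2],
            v[0]*w[1] - v[1]*w[0])

v, w = (1, 0, 0), (0, 1, 0)
print(cross(v, w))  # (0, 0, 1), perpendicular to both inputs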
If I wish to discuss both the point and the vector to which it corresponds we may use the notation

P = (p₁, p₂, . . . , p_n)        P⃗ = <p₁, p₂, . . . , p_n>

With this notation we can easily define directed line-segments as the vector which points from one point to another; also, the distance between points is simply the length of the vector which points from one point to the other:

Definition 1.2.17.

Let P, Q ∈ ℝⁿ. The directed line segment from P to Q is the vector PQ⃗ = <q₁ − p₁, q₂ − p₂, . . . , q_n − p_n>, and the distance between P and Q is d(P, Q) = ||PQ⃗||.
1.2.2 compact notations for vector arithmetic

I prefer the following notations over the hat-notation of the preceding section because this notation generalizes nicely to n dimensions.

e₁ = <1, 0, 0>        e₂ = <0, 1, 0>        e₃ = <0, 0, 1>.

Likewise the Kronecker delta and the Levi-Civita symbol are at times very convenient for abstract calculation:

δ_ij = 1 if i = j, 0 if i ≠ j        ε_ijk = 1 if (i, j, k) is an even permutation of (1, 2, 3), −1 if odd, 0 otherwise.

Now let us restate some earlier results in terms of the Einstein repeated index convention⁹: let v, w ∈ ℝ³ and c ∈ ℝ, then

v = v_k e_k                        orthonormal basis
e_i · e_j = δ_ij
(v + w)_k = v_k + w_k              vector addition
(v − w)_k = v_k − w_k              vector subtraction
(cv)_k = c v_k                     scalar multiplication
v · w = v_k w_k                    dot product
(v × w)_k = ε_ijk v_i w_j          cross product.

All but the last of the above are readily generalized to dimensions other than three by simply increasing the number of components. However, the cross product is special to three dimensions. I can't emphasize enough that the formulas given above for the dot and cross products can be utilized to yield great efficiency in abstract calculations.
Example 1.2.18. . .
⁹ there are more details to be seen in the Appendix if you're curious
1.3 functions

Suppose A and B are sets; we say f : A → B is a function if for each a ∈ A the function f assigns a single element f(a) ∈ B. Moreover, if f : A → B is a function we say it is a B-valued function of an A-variable, and we say A = dom(f) whereas B = codomain(f). For example, if f : ℝ² → [0, 1] then f is a real-valued function of ℝ². On the other hand, if f : ℂ → ℝ² then we'd say f is a vector-valued function of a complex variable. The term mapping will be used interchangeably with function in these notes. Suppose f : U → V and U ⊆ S and V ⊆ T; then we may concisely express the same data via the notation f : U ⊆ S → V ⊆ T.
Sometimes we can take two given functions and construct a new function.

1. if f : A → B and g : B → C then g ∘ f : A → C is the composite of g with f, defined by (g ∘ f)(a) = g(f(a)) for all a ∈ A.

2. if f : A → B is a function and B₁ ⊆ B then the inverse image of B₁ under f is f⁻¹(B₁) = {a ∈ dom(f) | f(a) ∈ B₁}.

The inverse image of a single point in the codomain is called a fiber. Suppose f : A → B. We say f is surjective or onto B₁ ⊆ B iff for each b₁ ∈ B₁ there exists a ∈ A such that f(a) = b₁. If a function is onto its codomain then the function is surjective. If f(a₁) = f(a₂) implies a₁ = a₂ for all a₁, a₂ ∈ A₁ ⊆ A then we say f is injective on A₁ or 1-1 on A₁. If a function is injective on its domain then we say the function is injective. If a function is both injective and surjective then the function is called a bijection or a 1-1 correspondence.
Example 1.3.2. Suppose f : ℝ² → ℝ and f(x, y) = x for each (x, y) ∈ ℝ². The function is not injective since f(1, 2) = 1 and f(1, 3) = 1 and yet (1, 2) ≠ (1, 3). Notice that the fibers of f are simply vertical lines:

f⁻¹({x₀}) = {(x, y) ∈ dom(f) | f(x, y) = x₀} = {(x₀, y) | y ∈ ℝ}
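A tiny sketch (mine; the helper name is hypothetical) which samples the fiber of the projection above, just to make the vertical line concrete:

def fiber_of_projection(x0, ys):
    # points (x0, y) all share the same image x0 under f(x, y) = x
    return [(x0, y) for y in ys]

print(fiber_of_projection(1, [-1, 0, 1, 2]))  # [(1, -1), (1, 0), (1, 1), (1, 2)]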
Example 1.3.3. Suppose f : ℝ → ℝ and f(x) = x² + 1 for each x ∈ ℝ. This function is not surjective because 0 ∉ f(ℝ). In contrast, if we construct g : ℝ → [1, ∞) with g(x) = f(x) for each x ∈ ℝ, then we can argue that g is surjective. Neither f nor g is injective; the fiber of a² + 1 is {±a} for each a ≠ 0. At all points except zero these maps are said to be two-to-one. This is an abbreviation of the observation that two points in the domain map to the same point in the range.
Definition 1.3.4.

Suppose f : ℝⁿ → ℝᵐ is a mapping; for each x ∈ dom(f) write f(x) = (f₁(x), f₂(x), . . . , f_m(x)). Then we say that f = (f₁, f₂, . . . , f_m) and the functions f_j : ℝⁿ → ℝ are called the component functions of f. Furthermore, we define the projection π_j : ℝᵐ → ℝ to be the map π_j(x) = x · e_j = x_j.
Example 1.3.5. Suppose f : ℝ³ → ℝ² and f(x, y, z) = (x² + y², z) for each (x, y, z) ∈ ℝ³. Identify that f₁(x, y, z) = x² + y² whereas f₂(x, y, z) = z. You can easily see that range(f) = [0, ∞) × ℝ. Suppose R² ∈ [0, ∞) and z₀ ∈ ℝ, then

f⁻¹({(R², z₀)}) = C_R × {z₀}

where C_R denotes a circle of radius R. This result is a simple consequence of the observation that f(x, y, z) = (R², z₀) implies x² + y² = R² and z = z₀.
Example 1.3.6. Let a, b, c ∈ ℝ be particular constants. Suppose f : ℝ³ → ℝ and f(x, y, z) = ax + by + cz for each (x, y, z) ∈ ℝ³. Here there is just one component function, so we could say that f = f₁, but we don't usually bother to make such an observation. If at least one of the constants a, b, c is nonzero then the fibers of this map are planes in three dimensional space with normal <a, b, c>:

f⁻¹({d}) = {(x, y, z) ∈ ℝ³ | ax + by + cz = d}

If a = b = c = 0 then the fiber of f is simply all of ℝ³ and range(f) = {0}.
The definition below explains how to put together functions with a common domain. The codomain of the new function is the Cartesian product of the old codomains.

Definition 1.3.7.

Let f : A → B₁ ⊆ ℝᵐ and g : A → B₂ ⊆ ℝⁿ be mappings; then (f, g) is a mapping from A to B₁ × B₂ defined by (f, g)(x) = (f(x), g(x)) for all x ∈ A.

There's more than meets the eye in the definition above. Let me expand it a bit here:

(f, g)(x) = (f₁(x), f₂(x), . . . , f_m(x), g₁(x), g₂(x), . . . , g_n(x))    where f = (f₁, f₂, . . . , f_m) and g = (g₁, g₂, . . . , g_n)

You might notice that Edwards uses π for the identity mapping whereas I use Id. His notation is quite reasonable given that the identity is the Cartesian product of all the projection maps:

Id = (π₁, π₂, . . . , π_n)

I've had courses where we simply used the coordinate notation itself for projections; in that notation we have formulas such as x(a, b, c) = a, x_j(v) = v_j and x_j(e_i) = δ_ij.
Another way to modify a given function is to adjust the domain of a given mapping by restriction and extension.

Definition 1.3.8.

Let f : A → B be a mapping. If C ⊆ A then the restriction of f to C is the mapping f|_C : C → B defined by f|_C(x) = f(x) for all x ∈ C. If A ⊆ D then a mapping g : D → B is an extension of f iff g|_A = f.

Notice f⁻¹(y) = x iff f(x) = y, and the value x must be a single value if the function is one-one. When a function is not one-one then there may be more than one point which maps to a particular point in the range. Notice that the inverse image of a set is well-defined even if there is no inverse mapping. Moreover, it can be shown that the fibers of a mapping are disjoint and their union covers the domain of the mapping:

f(x) ≠ f(y)  ⟹  f⁻¹{f(x)} ∩ f⁻¹{f(y)} = ∅        ∪_{y ∈ range(f)} f⁻¹{y} = dom(f).

This means that the fibers of a mapping partition the domain.
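The partition claim is easy to watch in action on a finite domain; a sketch of mine:

from collections import defaultdict

def fibers(f, domain):
    # group the domain points by their image value
    groups = defaultdict(list)
    for x in domain:
        groups[f(x)].append(x)
    return dict(groups)

print(fibers(lambda x: x * x, [-2, -1, 0, 1, 2]))
# {4: [-2, 2], 1: [-1, 1], 0: [0]}: disjoint fibers whose union is the domain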
Example 1.3.10. . .
Definition 1.3.11.

Let f : A → B be a mapping. If U ⊆ A is such that the restriction f|_U : U → f(U) is a bijection then we say f is locally invertible on U.

The proposition above tells us that we can take any mapping and cut down the domain and/or codomain to reduce the function to an injection, surjection or bijection. If you look for it you'll see this result behind the scenes in other courses. For example, in linear algebra if we throw out the kernel of a linear mapping then we get an injection. The idea of a local inverse is also important to the study of calculus.
Example 1.3.14. . .
Definition 1.3.15.

Let f : A → B be a mapping and suppose f|_U is injective for some U ⊆ A. The local inverse of f on U is the mapping (f|_U)⁻¹ : f(U) → U.

Usually we can find local inverses for functions in calculus. For example, f(x) = sin(x) is not 1-1, therefore it is not invertible. However, it does have a local inverse g(y) = sin⁻¹(y). If we were more pedantic we wouldn't write sin⁻¹(y). Instead we would write

g(y) = ( sin|_{[−π/2, π/2]} )⁻¹(y)

since the inverse sine is actually just a local inverse. To construct a local inverse for some mapping we must locate some subset of the domain upon which the mapping is injective. Then relative to that subset we can reverse the mapping. The inverse mapping theorem (which we'll study mid-course) will tell us more about the existence of local inverses for a given mapping.
1.4 elementary topology and limits

In this section we describe the metric topology for ℝⁿ. The open ball of radius ε centered at a ∈ ℝⁿ is

B_ε(a) = {x ∈ ℝⁿ | ||x − a|| < ε}

The closed ball of radius ε centered at a ∈ ℝⁿ is likewise defined

B̄_ε(a) = {x ∈ ℝⁿ | ||x − a|| ≤ ε}

Notice that in the n = 1 case we observe an open ball is an open interval: let a ∈ ℝ,

B_ε(a) = {x ∈ ℝ | |x − a| < ε} = (a − ε, a + ε)

In the n = 2 case we observe that an open ball is an open disk: let (a, b) ∈ ℝ²,

B_ε((a, b)) = { (x, y) ∈ ℝ² | ||(x, y) − (a, b)|| < ε } = { (x, y) ∈ ℝ² | √((x − a)² + (y − b)²) < ε }

For n = 3 an open ball is a sphere without the outer shell. In contrast, a closed ball in n = 3 is a solid sphere which includes the outer shell of the sphere.
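As a quick sanity check on the definitions (a sketch of mine, not from the notes):

import math

def in_open_ball(x, a, eps):
    # ||x - a|| < eps with the Euclidean norm
    return math.dist(x, a) < eps

print(in_open_ball((0.5, 0.5), (0, 0), 1.0))  # True: inside the open unit disk
print(in_open_ball((1.0, 0.0), (0, 0), 1.0))  # False: the boundary is excluded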
Definition 1.4.2.

Let D ⊆ ℝⁿ. We say y ∈ D is an interior point of D iff there exists some open ball centered at y which is completely contained in D. We say y ∈ ℝⁿ is a limit point of D iff every open ball centered at y contains points in D − {y}. We say y ∈ ℝⁿ is a boundary point of D iff every open ball centered at y contains points not in D and other points which are in D − {y}. We say y ∈ D is an isolated point of D if there exist open balls about y which do not contain other points in D. The set of all interior points of D is called the interior of D. Likewise the set of all boundary points for D is denoted ∂D. The closure of D is defined to be D̄ = D ∪ {y ∈ ℝⁿ | y a limit point of D}.

If you're like me, the paragraph above doesn't help much until I see the picture below. All the terms are aptly named. The term limit point is given because those points are the ones for which it is natural to define a limit.
Example 1.4.3. . .
Definition 1.4.4.

Let A ⊆ ℝⁿ. We say A is an open set iff for each x ∈ A there exists ε > 0 such that x ∈ B_ε(x) and B_ε(x) ⊆ A. We say B ⊆ ℝⁿ is a closed set iff its complement ℝⁿ − B = {x ∈ ℝⁿ | x ∉ B} is an open set.

Notice that ℝ − [a, b] = (−∞, a) ∪ (b, ∞). It is not hard to prove that open intervals are open, hence we find that a closed interval is a closed set. Likewise it is not hard to prove that open balls are open sets and closed balls are closed sets. I may ask you to prove the following proposition in the homework.

Proposition 1.4.5.

A closed set contains all its limit points; that is, A ⊆ ℝⁿ is closed iff Ā = A.
Example 1.4.6. . .
In calculus I the limit of a function is defined in terms of deleted open intervals centered about the limit point. We can define the limit of a mapping in terms of deleted open balls centered at the limit point.

Definition 1.4.7.

Let f : U ⊆ ℝⁿ → V ⊆ ℝᵐ be a mapping. We say that f has limit b ∈ ℝᵐ at limit point a of U iff for each ε > 0 there exists a δ > 0 such that x ∈ ℝⁿ with 0 < ||x − a|| < δ implies ||f(x) − b|| < ε. In such a case we write lim_{x→a} f(x) = b.

The term deleted refers to the fact that we assume 0 < ||x − a||, which means we do not consider x = a in the limiting process. In other words, the limit of a mapping considers values close to the limit point but not necessarily the limit point itself. The case that the function is defined at the limit point is special: when the limit and the mapping agree then we say the mapping is continuous at that point.
Example 1.4.8. . .
Definition 1.4.9.

Let f : U ⊆ ℝⁿ → V ⊆ ℝᵐ be a mapping. If a ∈ U is a limit point of U then we say f is continuous at a iff lim_{x→a} f(x) = f(a). If a ∈ U is an isolated point then we also say that f is continuous at a. The mapping f is continuous on S iff it is continuous at each point in S. The mapping f is continuous iff it is continuous on its domain.

Notice that in the n = m = 1 case we recover the definition of continuous functions from calc. I.
Proposition 1.4.10.

Let f : U ⊆ ℝⁿ → V ⊆ ℝᵐ be a mapping with component functions f₁, f₂, . . . , f_m, hence f = (f₁, f₂, . . . , f_m). If a is a limit point of U then

lim_{x→a} f(x) = b   ⟺   lim_{x→a} f_j(x) = b_j for each j = 1, 2, . . . , m.

Proof: (⇒) Suppose lim_{x→a} f(x) = b. Then for each ε > 0 choose δ > 0 such that 0 < ||x − a|| < δ implies ||f(x) − b|| < ε. This choice of δ suffices for our purposes as:

|f_j(x) − b_j| = √( (f_j(x) − b_j)² ) ≤ √( Σ_{k=1}^m (f_k(x) − b_k)² ) = ||f(x) − b|| < ε.

Hence we have shown that lim_{x→a} f_j(x) = b_j for all j = 1, 2, . . . , m.

(⇐) Suppose lim_{x→a} f_j(x) = b_j for all j = 1, 2, . . . , m. Let ε > 0. For each j choose δ_j > 0 such that 0 < ||x − a|| < δ_j implies |f_j(x) − b_j| < ε/√m. Choose δ = min{δ₁, δ₂, . . . , δ_m}, hence requiring 0 < ||x − a|| < δ automatically induces 0 < ||x − a|| < δ_j for all j. Thus

||f(x) − b|| = √( Σ_{j=1}^m (f_j(x) − b_j)² ) < √( Σ_{j=1}^m ε²/m ) = ε.

Therefore, lim_{x→a} f(x) = b. □

Proposition 1.4.11.

Let a ∈ U be a limit point of f = (f₁, . . . , f_m). Then f is continuous at a iff f_j is continuous at a for j = 1, 2, . . . , m. Moreover, f is continuous on S iff all the component functions of f are continuous on S. Finally, a mapping f is continuous iff all of its component functions are continuous. □
The proof of the proposition is in Edwards; it's his Theorem 7.2. It's about time I proved something.
Proposition 1.4.13.

The projection functions are continuous. The identity mapping is continuous.

Proof: Let ε > 0 and choose δ = ε. If x ∈ ℝⁿ such that 0 < ||x − a|| < δ then ||Id(x) − Id(a)|| = ||x − a|| < δ = ε since Id(x) = x for all x ∈ ℝⁿ. Hence Id is continuous on ℝⁿ, and since the projections are the component functions of the identity they are continuous as well. □

Preparing for the proof of continuity of the sum function s(x, y) = x + y: suppose (x, y) is within δ of (a, b); if

√( (x − a)² + (y − b)² ) < δ

it is clear that (x − a)² < δ², since if it were otherwise the inequality above would be violated, as adding a nonnegative quantity (y − b)² only increases the radicand, resulting in the square root being larger than δ. Hence we may assume (x − a)² < δ², and since δ > 0 it follows |x − a| < δ. Likewise, |y − b| < δ. Thus

|s(x, y) − (a + b)| = |x + y − a − b| ≤ |x − a| + |y − b| < δ + δ = 2δ

We see for the sum proof we can choose δ = ε/2 and it will work out nicely.
Proof: Let ε > 0 and let (a, b) ∈ ℝ². Choose δ = ε/2 and suppose (x, y) ∈ ℝ² such that ||(x, y) − (a, b)|| < δ. Observe that

||(x, y) − (a, b)|| < δ   ⟹   ||(x − a, y − b)||² < δ²   ⟹   (x − a)² + (y − b)² < δ².

It follows |x − a| < δ and |y − b| < δ. Thus

|s(x, y) − (a + b)| = |x + y − a − b| ≤ |x − a| + |y − b| < δ + δ = 2δ = ε.

Therefore, lim_{(x,y)→(a,b)} s(x, y) = a + b, and it follows that the sum function is continuous at (a, b). But (a, b) is an arbitrary point, thus s is continuous on ℝ², hence the sum function is continuous. □
Preparing for the proof of continuity of the product function p(x, y) = xy: I'll continue to use the same notation as above. We need to study |p(x, y) − p(a, b)| = |xy − ab| < ε. Consider that

xy − ab = xy − ya + ya − ab = y(x − a) + a(y − b)

We know that |x − a| < δ and |y − b| < δ. There is one less obvious factor to bound in the expression. What should we do about |y|? I leave it to the reader to show that:

|y − b| < δ   ⟹   |y| < |b| + δ

Now put it all together and hopefully we'll be able to "solve" for ε.

|xy − ab| = |y(x − a) + a(y − b)| ≤ |y||x − a| + |a||y − b| < (|b| + δ)δ + |a|δ = δ² + (|a| + |b|)δ = ε

I put "solve" in quotes because we have considerably more freedom in our quest for finding δ. We could just as well find δ which makes the "=" become an "<". That said, let's pursue equality,

δ² + (|a| + |b|)δ − ε = 0   ⟹   δ = ( −(|a| + |b|) ± √( (|a| + |b|)² + 4ε ) ) / 2

Since ε, |a|, |b| > 0 it follows that √( (|a| + |b|)² + 4ε ) > √( (|a| + |b|)² ) = |a| + |b|, hence the (+) solution to the quadratic equation yields a positive δ, namely:

δ = ( −|a| − |b| + √( (|a| + |b|)² + 4ε ) ) / 2

Yowsers, I almost made this a homework. There may be an easier route. You might notice we have run across a few little lemmas (I've boxed the punch lines for the lemmas) which are doubtless useful in other proofs. We should collect those once we're finished with this proof.
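A numerical sanity check on the δ just derived (my own sketch; by construction δ² + (|a| + |b|)δ should reproduce ε exactly):

import math

def delta_for(a, b, eps):
    # positive root of d**2 + (|a| + |b|)*d - eps = 0
    s = abs(a) + abs(b)
    return (-s + math.sqrt(s * s + 4 * eps)) / 2

a, b, eps = 3.0, -2.0, 0.1
d = delta_for(a, b, eps)
print(d, d * d + (abs(a) + abs(b)) * d)  # second value is 0.1, i.e. eps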
Proof: Let ε > 0 and let (a, b) ∈ ℝ². By the calculations that prepared for the proof we know that the following quantity is positive, hence choose

δ = ( −|a| − |b| + √( (|a| + |b|)² + 4ε ) ) / 2 > 0.

Note that¹¹,

|xy − ab| = |y(x − a) + a(y − b)|        algebra
          ≤ |y||x − a| + |a||y − b|      triangle inequality
          < (|b| + δ)δ + |a|δ            by the boxed lemmas
          = δ² + (|a| + |b|)δ            algebra
          = ε

where we know that last step follows due to the steps leading to the boxed equation in the proof preparation. Therefore, lim_{(x,y)→(a,b)} p(x, y) = ab, and it follows that the product function is continuous at (a, b). But (a, b) is an arbitrary point, thus p is continuous on ℝ², hence the product function is continuous. □
Lemma 1.4.16.

Assume δ > 0.

1. If |y − b| < δ then |b| − δ < |y| < |b| + δ.
2. If x, a ∈ ℝⁿ and ||x − a|| < δ then |x_j − a_j| < δ for j = 1, 2, . . . , n.

The proof of the proposition above is mostly contained in the remarks of the preceding two pages.
Example 1.4.17. . .
¹¹ my notation is that when we stack inequalities the inequality in a particular line refers only to the immediate vertical successor.
Proposition 1.4.18.

Let f : V ⊆ ℝᵖ → ℝᵐ and g : U ⊆ ℝⁿ → ℝᵖ be mappings such that g(U) ⊆ V. If a is a limit point of U with lim_{x→a} g(x) = b and f is continuous at b then

lim_{x→a} (f ∘ g)(x) = f( lim_{x→a} g(x) ).

The proof is in Edwards, see pages 46-47. Notice that the proposition above immediately gives us the important result below:

Proposition 1.4.19.

Let f and g be mappings such that f ∘ g is well-defined. The composite function f ∘ g is continuous at points a ∈ dom(g) such that the following two conditions hold:

1. g is continuous at a
2. f is continuous at g(a).

I make use of the earlier proposition that a mapping is continuous iff its component functions are continuous throughout the examples that follow. For example, I know (Id, Id) is continuous since Id was previously proved continuous.
Example 1.4.20. Note that if f = p ∘ (Id, Id) then f(x) = p( (Id, Id)(x) ) = p(x, x) = x². Therefore, the quadratic function f(x) = x² is continuous on ℝ as it is the composite of continuous functions.

Example 1.4.21. Note that if f = p ∘ (p ∘ (Id, Id), Id) then f(x) = p(x², x) = x³. Therefore, the cubic function f(x) = x³ is continuous on ℝ as it is the composite of continuous functions.

Example 1.4.22. The power function is inductively defined by x¹ = x and xⁿ = x xⁿ⁻¹ for all n ∈ ℕ with n > 1. We can prove f(x) = xⁿ is continuous by induction on n: the n = 1 case is the identity, and if f_{n−1}(x) = xⁿ⁻¹ is continuous then f_n(x) = xⁿ = x xⁿ⁻¹ = p(x, f_{n−1}(x)) = (p ∘ (Id, f_{n−1}))(x). Therefore, using the induction hypothesis, we see that f_n(x) = xⁿ is continuous for each n ∈ ℕ.
Proposition 1.4.23.

Suppose f, g : U ⊆ ℝⁿ → ℝ, c ∈ ℝ, and suppose a is a limit point of U. If lim_{x→a} f(x) = b₁ and lim_{x→a} g(x) = b₂ then

1. lim_{x→a} (f(x) + g(x)) = lim_{x→a} f(x) + lim_{x→a} g(x).
2. lim_{x→a} (f(x)g(x)) = ( lim_{x→a} f(x) )( lim_{x→a} g(x) ).
3. lim_{x→a} (cf(x)) = c lim_{x→a} f(x).

Moreover, if f, g are continuous then f + g, fg and cf are continuous.
Proof: Edwards proves (1.) carefully on pg. 48. I'll do (2.) here: we are given that lim_{x→a} f(x) = b₁ and lim_{x→a} g(x) = b₂, thus by Proposition 1.4.10 we find lim_{x→a} (f, g)(x) = (b₁, b₂). Consider then,

lim_{x→a} (f(x)g(x)) = lim_{x→a} ( p ∘ (f, g) )(x)        defn. of product function
                     = p( lim_{x→a} (f, g)(x) )           since p is continuous
                     = p(b₁, b₂)                          by Proposition 1.4.10
                     = b₁b₂                               definition of product function
                     = ( lim_{x→a} f(x) )( lim_{x→a} g(x) ).

A similar argument with the sum function s handles (1.): s ∘ (f, g) and p ∘ (f, g) are continuous at a, and the proof of items (1.) and (2.) is complete. To prove (3.) I refer the reader to their homework where it was shown that h(x) = c for all x ∈ U is a continuous function. We then find (3.) follows from (2.) by setting g = h (function multiplication commutes for real-valued functions). □
We can use induction arguments to extend these results to arbitrarily many products and sums of power functions. To prove continuity of algebraic functions we'd need to do some more work with quotient and root functions. I'll stop here for the moment; perhaps I'll ask you to prove a few more fundamentals from calculus I. I haven't delved into the definition of exponential or log functions, not to mention sine or cosine. We will assume that the basic functions of calculus are continuous on the interior of their respective domains. Basically, if the formula for a function can be evaluated at the limit point then the function is continuous.

It's not hard to see that the comments above extend to functions of several variables and mappings. If the formula for a mapping is comprised of finite sums and products of power functions then we can prove such a mapping is continuous using the techniques developed in this section. If we have a mapping with a more complicated formula built from elementary functions then that mapping will be continuous provided its component functions have formulas which are sensibly calculated at the limit point. In other words, a mapping whose components are built from functions such as sin(x), cos(x), eˣ, cosh(x²) and square-root terms such as √x and √(x + 1) is continuous at points where the radicands of the square root functions are nonnegative. It wouldn't be very fun to verify explicitly, but it is clear that such a mapping is the Cartesian product of functions which are the sum, product and composite of continuous functions.
Definition 1.4.25.

A polynomial P in n variables has the form:

P(x₁, x₂, . . . , x_n) = Σ_{i₁, i₂, . . . , i_n = 0} c_{i₁, i₂, . . . , i_n} x₁^{i₁} x₂^{i₂} ⋯ x_n^{i_n}

where only finitely many coefficients c_{i₁, i₂, . . . , i_n} ≠ 0. We denote the set of multinomials in n variables as ℝ(x₁, x₂, . . . , x_n).

Polynomials in one variable are ℝ(x). Polynomials in two variables are ℝ(x, y); for example,

f(x, y) = ax + by                          deg(f) = 1, linear function
f(x, y) = ax + by + c                      deg(f) = 1, affine function
f(x, y) = ax² + bxy + cy²                  deg(f) = 2, quadratic form
f(x, y) = ax² + bxy + cy² + dx + ey + g    deg(f) = 2

If all the terms in the polynomial have the same degree then it is said to be homogeneous. In the list above only the linear function and the quadratic form were homogeneous.
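A small sketch of mine evaluating the quadratic form from the list above and checking homogeneity numerically, i.e. f(tx, ty) = t² f(x, y):

def q(x, y, a=1.0, b=2.0, c=3.0):
    # quadratic form f(x, y) = a x^2 + b x y + c y^2
    return a*x*x + b*x*y + c*y*y

t, x, y = 2.0, 1.5, -0.5
print(q(t*x, t*y), t**2 * q(x, y))  # equal: homogeneous of degree 2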
Remark 1.4.26.

There are other topologies possible for ℝⁿ. For example, one can show that ||x||₁ = |x₁| + |x₂| + ⋯ + |x_n| gives a norm on ℝⁿ, and the open sets generated from this norm lead to the same limits and continuity as the Euclidean norm. We use the Euclidean norm by default.

Chapter 2

linear algebra

2.1 vector spaces

Suppose V is a set paired with an addition + : V × V → V and a scalar multiplication · : ℝ × V → V. If for all x, y, z ∈ V and a, b ∈ ℝ we have

(i.) x + y = y + x,            (ii.) (x + y) + z = x + (y + z),
(iii.) x + 0 = x,              (iv.) x − x = 0,
(v.) 1x = x,                   (vi.) (ab)x = a(bx),
(vii.) a(x + y) = ax + ay,     (viii.) (a + b)x = ax + bx

then we say that V is a vector space over ℝ. To be a bit more precise, by (iii.) I mean to say that there exists some element 0 ∈ V such that x + 0 = x for each x ∈ V. Also, (iv.) should be understood to say that for each x ∈ V there exists another element −x ∈ V such that x + (−x) = 0.
Example 2.1.1. ℝⁿ is a vector space with respect to the standard vector addition and scalar multiplication.

Example 2.1.2. ℂ = {a + ib | a, b ∈ ℝ} is a vector space where the usual complex number addition provides the vector addition and multiplication by a real number c(a + ib) = ca + i(cb) clearly defines a scalar multiplication.

Example 2.1.3. The set of all m × n matrices is a vector space with respect to the usual matrix addition and scalar multiplication. We will elaborate on the details in an upcoming section.

Example 2.1.4. Suppose F is the set of all functions from a set S to a vector space V; then F is naturally a vector space with respect to function addition and multiplication by a scalar. Both of those operations are well-defined on the values of the function since we assumed the codomain of each function in F is the vector space V.

There are many subspaces of function space which provide interesting examples of vector spaces. For example, the set of continuous functions:

Example 2.1.5. Let C⁰(ℝ) denote the set of continuous functions from ℝ to ℝ; then C⁰(ℝ) is a vector space with respect to function addition and the usual scalar multiplication. This fact relies on the fact that the sum and scalar multiple of continuous functions is once more continuous.
Definition 2.1.6.

We say a subset S of a vector space V is linearly independent (LI) iff for scalars c₁, c₂, . . . , c_k,

c₁v₁ + c₂v₂ + ⋯ + c_kv_k = 0   ⟹   c₁ = c₂ = ⋯ = c_k = 0

for each finite subset {v₁, v₂, . . . , v_k} of S.

In the case that S is finite it suffices to show the implication for a linear combination of all the vectors in the set. Notice that if any vector in the set S can be written as a linear combination of the other vectors in S then that makes S fail the test for linear independence. Moreover, if a set is not linearly independent then we say S is linearly dependent.
Example 2.1.7. The standard basis of ℝⁿ is denoted {e₁, e₂, . . . , e_n}. It is linearly independent: suppose c₁e₁ + c₂e₂ + ⋯ + c_ne_n = 0 and take the dot product of both sides with e_j to obtain

(c₁e₁ + c₂e₂ + ⋯ + c_ne_n) · e_j = 0 · e_j   ⟹   c₁δ₁ⱼ + c₂δ₂ⱼ + ⋯ + c_nδ_nⱼ = 0   ⟹   c_j(1) = 0

but j was arbitrary, hence it follows that c₁ = c₂ = ⋯ = c_n = 0.

Example 2.1.8. Consider ℂ as a real vector space and the subset {1, i}. Suppose c₁ + c₂ i = 0 where c₁, c₂ ∈ ℝ. A complex number is zero iff its real and imaginary parts are both zero; therefore, the complex equation c₁ + c₂ i = 0 yields two real equations c₁ = 0 and c₂ = 0.
Example 2.1.9. Let C⁰(ℝ) be the vector space of all continuous functions from ℝ to ℝ. Suppose S is the set of monic monomials S = {1, x, x², x³, . . . }. This is an infinite set. We can argue LI as follows: suppose c₁x^{p₁} + c₂x^{p₂} + ⋯ + c_kx^{p_k} = 0 where the powers are ordered by p₁ < p₂ < ⋯ < p_k; call this equation E. If p₁ = 0 then evaluate at x = 0 to obtain c₁ = 0. If p₁ > 0 then differentiate E p₁ times and denote this new equation by E₁. Evaluate E₁ at x = 0 to find

c₁ p₁(p₁ − 1) ⋯ 3 · 2 · 1 = 0

hence c₁ = 0. Since we set up p₁ < p₂ it follows that after p₁ differentiations the second summand is still nontrivial in E₁. However, we can continue differentiating until we reach p₂ derivatives, and then the constant term is c₂ p₂!, so evaluation will show c₂ = 0. We continue in this fashion until we have shown that c_j = 0 for all j; hence S is linearly independent.

Example 2.1.10. Consider S = {2^{−x}, 3(1/2)ˣ} ⊆ C⁰(ℝ). The equation c₁2^{−x} + c₂3(1/2)ˣ = 0 yields (c₁ + 3c₂)2^{−x} = 0. Hence c₁ + 3c₂ = 0, which means nontrivial solutions exist. Take c₂ = 1, then c₁ = −3. Of course the heart of the matter is that 3(1/2)ˣ = 3(2^{−x}); one function is a multiple of the other, so S is linearly dependent.

Example 2.1.11. Consider ℝ³ with the standard unit vectors {i, j, k}, where we could also denote i = e₁, j = e₂, k = e₃, but I'm aiming to make your mind connect with your calculus III background. Adjoining any further vector makes the set clearly linearly dependent since we can write any vector as a linear combination of the standard unit-vectors; moreover, we can use dot-products to select the x, y and z components as follows:

v = (v · i) i + (v · j) j + (v · k) k
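For a finite set of vectors in ℝⁿ there is a quick numerical test of linear independence (my own aside, not the author's method): the set is LI iff the matrix whose columns are the vectors has rank equal to the number of vectors.

import numpy as np

vecs = [(1, 0, 0), (0, 1, 0), (1, 1, 0)]      # third = first + second
A = np.column_stack(vecs)
print(np.linalg.matrix_rank(A) == len(vecs))  # False: linearly dependent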
Linear independence helps us quantify a type of redundancy for vectors in a given set. The next definition is equally important and it is sort of the other side of the coin; spanning is a criterion which helps us insure a set of vectors will cover a vector space without missing anything.
Definition 2.1.12.

We say a subset S of a vector space V is a spanning set for V iff for each v ∈ V there exist scalars c₁, c₂, . . . , c_k and vectors v₁, v₂, . . . , v_k ∈ S such that v = c₁v₁ + c₂v₂ + ⋯ + c_kv_k. We denote

span{v₁, v₂, . . . , v_k} = {c₁v₁ + c₂v₂ + ⋯ + c_kv_k | c₁, c₂, . . . , c_k ∈ ℝ}.

If S ⊆ V and V is a vector space then it is immediately obvious that span(S) ⊆ V. If S is a spanning set then it is obvious that V ⊆ span(S). It follows that when S is a spanning set for V we have span(S) = V.
Example 2.1.13. It is easy to show that if v ∈ ℝⁿ then v = v₁e₁ + v₂e₂ + ⋯ + v_ne_n. It follows that ℝⁿ = span{e_i}_{i=1}^n.

Example 2.1.14. Let a + ib ∈ ℂ where i² = −1; then a + ib = a(1) + b(i). Clearly ℂ = span{1, i}.

Example 2.1.15. Let P be the set of polynomials. Since the sum of any two polynomials and the scalar multiple of any polynomial is once more a polynomial, we find P is a vector space with respect to function addition and multiplication of a function by a scalar. We can argue that the set of monic monomials {1, x, x², . . . } is a spanning set for P. Why? Because if f(x) ∈ P then that means there are scalars a₀, a₁, . . . , a_n such that f(x) = a₀ + a₁x + a₂x² + ⋯ + a_nxⁿ.
Definition 2.1.16.

We say a subset β of a vector space V is a basis for V iff β is a linearly independent spanning set for V. If β is a finite set then V is said to be finite dimensional and the number of vectors in β is called the dimension of V. That is, if β = {v₁, v₂, . . . , v_n} is a basis for V then dim(V) = n. If no finite basis exists for V then V is said to be infinite dimensional.
The careful reader will question why this concept of dimension is well-defined. Why can we not have bases of differing dimension for a given vector space? I leave this question for linear algebra; the theorem which asserts the uniqueness of dimension is one of the deeper theorems in the course. However, like most everything in linear algebra, at some level it just boils down to solving some particular set of equations. You might tell Dr. Sprano it's just algebra. In any event, it is common practice to use the term dimension in courses where linear algebra is not understood. For example, we say ℝ² is a two-dimensional space, or we'll say that ℝ³ is a three-dimensional space. This terminology agrees with the general observation of the next example.
Example 2.1.17. The standard basis {e_i}_{i=1}^n for ℝⁿ is a basis for ℝⁿ and dim(ℝⁿ) = n. This result holds for all n ∈ ℕ. The line is one-dimensional, the plane is two-dimensional, three-space is three-dimensional, etc.

Example 2.1.18. The set {1, i} is a basis for ℂ. It follows that dim(ℂ) = 2. We say that the complex numbers form a two-dimensional real vector space.

Example 2.1.19. The set of polynomials is clearly infinite dimensional. Contradiction shows this without much effort. Suppose P had a finite basis β. Choose the polynomial of largest degree (say k) in β. Notice that f(x) = x^{k+1} is a polynomial and yet clearly f(x) ∉ span(β), hence β is not a spanning set. But this contradicts the assumption that β is a basis. Hence, by contradiction, no finite basis exists and we conclude the set of polynomials is infinite dimensional.
There is a more general use of the term dimension which is beyond the context of linear algebra. For example, in calculus II or III you may have heard that a circle is one-dimensional or a surface is two-dimensional. Well, circles and surfaces are not usually vector spaces, so the terminology is not taken from linear algebra. In fact, that use of the term dimension stems from manifold theory. I hope to discuss manifolds later in this course.
2.2 matrix calculation

An m × n matrix is an array of numbers with m rows and n columns. We define ℝ^{m×n} to be the set of all m × n matrices. The set of all n-dimensional column vectors is ℝ^{n×1}. The set of all n-dimensional row vectors is ℝ^{1×n}. A given matrix A ∈ ℝ^{m×n} has mn components A_{ij}. Notice that the components are numbers; A_{ij} ∈ ℝ for each i, j is quite fine. Suppose A ∈ ℝ^{m×n}; note for 1 ≤ j ≤ n we have col_j(A) ∈ ℝ^{m×1} whereas for 1 ≤ i ≤ m we find row_i(A) ∈ ℝ^{1×n}. In other words, an m × n matrix has n columns of length m and m rows of length n.²

² We will use the convention that points in ℝⁿ are identified with column vectors in ℝ^{n×1} rather than ℝ^{1×n}, and I don't have to pepper transposes all over the place. If you've read my linear algebra notes you'll appreciate the wisdom of our convention.
Two matrices A and B are equal iff A_{ij} = B_{ij} for all i, j. The zero matrix in ℝ^{m×n} is denoted 0 and defined by 0_{ij} = 0 for each 1 ≤ i ≤ m and 1 ≤ j ≤ n. In the case m = n = 1 the indices i, j are omitted in the equation since a 1 × 1 matrix is simply a number which needs no index. The identity matrix in ℝ^{n×n} is the square matrix I whose components are the Kronecker delta;

I_{ij} = δ_{ij} = 1 if i = j, 0 if i ≠ j.

The notation Aᵀ is called the transpose of A and is defined by (Aᵀ)_{ij} = A_{ji}; note (col_j(A))ᵀ = row_j(Aᵀ). Furthermore, note the dot-product of v, w ∈ ℝ^{n×1} is given by v · w = vᵀw.

The (i, j)-th standard basis matrix for ℝ^{m×n} is denoted E_{ij}; it is zero in all entries except for the (i, j)-th slot where it has a 1. In other words, we define (E_{ij})_{kl} = δ_{ik}δ_{jl}. I invite the reader to show that the term basis is justified in this context⁴. Given this basis we see that the vector space ℝ^{m×n} has dim(ℝ^{m×n}) = mn.
Theorem 2.2.1.

If A, B ∈ ℝ^{m×n}, v ∈ ℝⁿ and c ∈ ℝ then

(i.) v = Σ_{i=1}^n v_i e_i
(ii.) A = Σ_{i=1}^m Σ_{j=1}^n A_{ij} E_{ij}
(iii.) [Aᵀ]_{ij} = A_{ji}
(iv.) [A + B]_{ij} = A_{ij} + B_{ij}
(v.) [cA]_{ij} = c A_{ij}
(vi.) A e_j = col_j(A)
(vii.) (e_i)ᵀ A = row_i(A)

³ this product is defined so the matrix of the composite of a linear transformation is the product of the matrices of the composed transformations. This is illustrated later in this section and is proved in my linear algebra notes.

⁴ the theorem stated above contains the needed results and then some; you can find the proof in my linear algebra notes. It would be wise to just work it out in the 2 × 2 case as a warm-up if you are interested.
You can look in my linear algebra notes for the details of the theorem. I'll just expand one point here: Let A ∈ ℝ^{m×n}, then

A = [ A₁₁ A₁₂ ⋯ A₁ₙ ]
    [ A₂₁ A₂₂ ⋯ A₂ₙ ]
    [  ⋮    ⋮      ⋮ ]
    [ Aₘ₁ Aₘ₂ ⋯ Aₘₙ ]

  = A₁₁ [ 1 0 ⋯ 0 ]  +  A₁₂ [ 0 1 ⋯ 0 ]  +  ⋯  +  Aₘₙ [ 0 0 ⋯ 0 ]
        [ 0 0 ⋯ 0 ]        [ 0 0 ⋯ 0 ]              [ 0 0 ⋯ 0 ]
        [ ⋮      ⋮ ]        [ ⋮      ⋮ ]              [ ⋮      ⋮ ]
        [ 0 0 ⋯ 0 ]        [ 0 0 ⋯ 0 ]              [ 0 0 ⋯ 1 ]

  = A₁₁E₁₁ + A₁₂E₁₂ + ⋯ + AₘₙEₘₙ.

The calculation above follows from repeated mn-applications of the definition of matrix addition and another mn-applications of the definition of scalar multiplication of a matrix.
Example 2.2.2. Suppose A = [1 2 3; 4 5 6]. We see that A has 2 rows and 3 columns, thus A ∈ ℝ^{2×3}. Moreover, A₁₁ = 1, A₁₂ = 2, A₁₃ = 3, A₂₁ = 4, A₂₂ = 5, and A₂₃ = 6. It's not usually possible to find a formula for a generic element in the matrix, but this matrix satisfies A_{ij} = 3(i − 1) + j for all i, j⁵. The columns of A are,

col₁(A) = [1; 4],    col₂(A) = [2; 5],    col₃(A) = [3; 6].

The rows of A are

row₁(A) = [1 2 3],    row₂(A) = [4 5 6]

and the transpose is Aᵀ = [1 4; 2 5; 3 6]. Notice that

row₁(A)ᵀ = col₁(Aᵀ),    row₂(A)ᵀ = col₂(Aᵀ)

and

col₁(A)ᵀ = row₁(Aᵀ),    col₂(A)ᵀ = row₂(Aᵀ),    col₃(A)ᵀ = row₃(Aᵀ).

⁵ In the statement "for all i, j" it is to be understood that those indices range over their allowed values. In the preceding example 1 ≤ i ≤ 2 and 1 ≤ j ≤ 3.
Let A = [1 2; 3 4] and B = [5 6; 7 8]. Notice

B − A = [5 6; 7 8] − [1 2; 3 4] = [4 4; 4 4].

Now multiply A by the scalar 5,

5A = 5 [1 2; 3 4] = [5 10; 15 20]
Example 2.2.6. Let A, B ∈ ℝ^{m×n} be defined by A_{ij} = 3i + 5j and B_{ij} = i² for all i, j. Then we can calculate (A + B)_{ij} = 3i + 5j + i² for all i, j.
Example 2.2.7. Solve the following matrix equation,

0 = [x y; z w] + [−1 −2; −3 −4]   ⟹   [0 0; 0 0] = [x − 1  y − 2; z − 3  w − 4]

The definition of matrix equality means this single matrix equation reduces to 4 scalar equations: 0 = x − 1, 0 = y − 2, 0 = z − 3, 0 = w − 4. The solution is x = 1, y = 2, z = 3, w = 4.
The definition of matrix multiplication: for A ∈ ℝ^{m×n} and B ∈ ℝ^{n×p} the product AB ∈ ℝ^{m×p} has components

(AB)_{ij} = Σ_{k=1}^n A_{ik} B_{kj}

for all i, j. Recall that (row_i(A))ᵀ and col_j(B) are both vectors in ℝⁿ, thus

(AB)_{ij} = Σ_{k=1}^n A_{ik} B_{kj} = row_i(A) · col_j(B)

and the product matrix is the array of all possible dot products of rows of A with columns of B:

AB = [ row₁(A) · col₁(B)    row₁(A) · col₂(B)    ⋯    row₁(A) · col_p(B) ]
     [ row₂(A) · col₁(B)    row₂(A) · col₂(B)    ⋯    row₂(A) · col_p(B) ]
     [         ⋮                    ⋮                         ⋮           ]
     [ row_m(A) · col₁(B)   row_m(A) · col₂(B)   ⋯    row_m(A) · col_p(B) ]

Example 2.2.8. We calculate

[ 1 0 ]               [ [1,0]·[4,7]  [1,0]·[5,8]  [1,0]·[6,9] ]   [ 4 5 6 ]
[ 0 1 ] [ 4 5 6 ]  =  [ [0,1]·[4,7]  [0,1]·[5,8]  [0,1]·[6,9] ] = [ 7 8 9 ]
[ 0 0 ] [ 7 8 9 ]     [ [0,0]·[4,7]  [0,0]·[5,8]  [0,0]·[6,9] ]   [ 0 0 0 ]

Likewise,

[ 1 ]              [ 4·1  5·1  6·1 ]   [  4  5  6 ]
[ 2 ] [ 4 5 6 ]  = [ 4·2  5·2  6·2 ] = [  8 10 12 ]
[ 3 ]              [ 4·3  5·3  6·3 ]   [ 12 15 18 ]

Example 2.2.9. Let A = [1 2; 3 4] and B = [5 6; 7 8]. We calculate

AB = [ [1,2]·[5,7]  [1,2]·[6,8] ]  =  [ 5+14   6+16 ]  =  [ 19 22 ]
     [ [3,4]·[5,7]  [3,4]·[6,8] ]     [ 15+28  18+32 ]    [ 43 50 ]

Notice the product of square matrices is square. For numbers a, b ∈ ℝ we know the product of a and b is commutative (ab = ba). Let's calculate the product of A and B in the opposite order,

BA = [ [5,6]·[1,3]  [5,6]·[2,4] ]  =  [ 5+18   10+24 ]  =  [ 23 34 ]
     [ [7,8]·[1,3]  [7,8]·[2,4] ]     [ 7+24   14+32 ]     [ 31 46 ]

Clearly AB ≠ BA; thus matrix multiplication is noncommutative or nonabelian.

When we say that matrix multiplication is noncommutative that indicates that the product of two matrices does not generally commute. However, there are special matrices which commute with other matrices.
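Before looking at those, here is a two-line check of the noncommutativity computed above (my own aside):

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A @ B)  # [[19 22], [43 50]]
print(B @ A)  # [[23 34], [31 46]]: different, so AB != BA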
Example 2.2.11. Let I = [1 0; 0 1] and A = [a b; c d]. We calculate

IA = [1 0; 0 1][a b; c d] = [a b; c d]

Likewise calculate,

AI = [a b; c d][1 0; 0 1] = [a b; c d]

Since the matrix A was arbitrary we conclude that IA = AI = A for all A ∈ ℝ^{2×2}.
Example 2.2.12. Consider A, v, w from Example ??.

v + w = [5; 7] + [6; 8] = [11; 15]

Using the above we calculate,

A(v + w) = [1 2; 3 4][11; 15] = [11 + 30; 33 + 60] = [41; 93].

In contrast, we can add Av and Aw,

Av + Aw = [19; 43] + [22; 50] = [41; 93].

Behold, A(v + w) = Av + Aw for this example. It turns out this is true in general.
I collect all my favorite properties for matrix multiplication in the theorem below. To summarize,
matrix math works as you would expect with the exception that matrix multiplication is not
commutative. We must be careful about the order of letters in matrix expressions.
Theorem 2.2.13.
If A, B, C ∈ ℝ^{m×n}, X, Y ∈ ℝ^{n×p}, Z ∈ ℝ^{p×q} and c₁, c₂ ∈ ℝ then

1. (A + B) + C = A + (B + C),
2. (AX)Z = A(XZ),
3. A + B = B + A,
4. c₁(A + B) = c₁A + c₁B,
5. (c₁ + c₂)A = c₁A + c₂A,
6. (c₁c₂)A = c₁(c₂A),
7. (c₁A)X = c₁(AX) = A(c₁X) = (AX)c₁,
8. 1A = A,
9. I_mA = A = AI_n,
10. A(X + Y) = AX + AY,
11. A(c₁X + c₂Y) = c₁AX + c₂AY,
12. (A + B)X = AX + BX.
Proof: I will prove a couple of these primarily to give you a chance to test your understanding of the notation. Nearly all of these properties are proved by breaking the statement down to components then appealing to a property of real numbers. Just a reminder: we assume that it is known that ℝ is an ordered field. Multiplication of real numbers is commutative, associative and distributes across addition of real numbers. Likewise, addition of real numbers is commutative, associative and obeys familiar distributive laws when combined with multiplication.

Proof of (1.): assume A, B, C are given as in the statement of the Theorem. Observe that

((A + B) + C)_{ij} = (A + B)_{ij} + C_{ij} = (A_{ij} + B_{ij}) + C_{ij} = A_{ij} + (B_{ij} + C_{ij}) = A_{ij} + (B + C)_{ij} = (A + (B + C))_{ij}

for all i, j; hence (A + B) + C = A + (B + C).

Proof of (10.): assume A, X, Y are given as in the statement of the Theorem. Observe that

(A(X + Y))_{ij} = Σ_k A_{ik}(X + Y)_{kj} = Σ_k (A_{ik}X_{kj} + A_{ik}Y_{kj}) = (AX)_{ij} + (AY)_{ij} = (AX + AY)_{ij}

for all i, j; hence A(X + Y) = AX + AY. □
2.3 linear transformations

A mapping L : V → W between vector spaces is a linear transformation iff L(x + y) = L(x) + L(y) and L(cx) = cL(x) for all x, y ∈ V and c ∈ ℝ. Matrix multiplication provides a wealth of examples: if A ∈ ℝ^{m×n} then L(x) = Ax defines a linear transformation.

Proof: Let A ∈ ℝ^{m×n} and define L : ℝⁿ → ℝᵐ by L(x) = Ax for each x ∈ ℝⁿ. Let x, y ∈ ℝⁿ and c ∈ ℝ,

L(x + y) = A(x + y) = Ax + Ay = L(x) + L(y)

and

L(cx) = A(cx) = cAx = cL(x)

thus L is a linear transformation. □

Obviously this gives us a nice way to construct examples. The following proposition is really at the heart of all the geometry in this section.
Proposition 2.3.3.

Let S = {u + tv | t ∈ [0, 1]} for u, v ∈ ℝⁿ, a line segment. If L : ℝⁿ → ℝᵐ is a linear transformation then L(S) is either a point or a line segment; consequently a linear transformation is determined on a segment by what it does to the endpoints.

Example 2.3.4. Let A = [k 0; 0 k] for some k > 0. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [k 0; 0 k][x; y] = [kx; ky].

We find L(0, 0) = (0, 0), L(1, 0) = (k, 0), L(1, 1) = (k, k), L(0, 1) = (0, k). This mapping is called a dilation.
Example 2.3.5. Let A = [−1 0; 0 −1]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [−1 0; 0 −1][x; y] = [−x; −y].

We find L(0, 0) = (0, 0), L(1, 0) = (−1, 0), L(1, 1) = (−1, −1), L(0, 1) = (0, −1). This mapping is called an inversion.
Example 2.3.6. Let A = [1 2; 3 4]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [1 2; 3 4][x; y] = [x + 2y; 3x + 4y].

We find L(0, 0) = (0, 0), L(1, 0) = (1, 3), L(1, 1) = (3, 7), L(0, 1) = (2, 4). This mapping shall remain nameless; it is doubtless a combination of the other named mappings.
Example 2.3.7. Let A = (1/√2)[1 −1; 1 1]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = (1/√2)[1 −1; 1 1][x; y] = (1/√2)[x − y; x + y].

We find L(0, 0) = (0, 0), L(1, 0) = (1/√2)(1, 1), L(1, 1) = (1/√2)(0, 2), L(0, 1) = (1/√2)(−1, 1). This mapping is a rotation by π/4 radians.
Example 2.3.8. Let A = [1 −1; 1 1]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [1 −1; 1 1][x; y] = [x − y; x + y].

We find L(0, 0) = (0, 0), L(1, 0) = (1, 1), L(1, 1) = (0, 2), L(0, 1) = (−1, 1). This mapping is a rotation by π/4 followed by a dilation by k = √2.
Example 2.3.9. Let A = [cos(θ) −sin(θ); sin(θ) cos(θ)]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [cos(θ) −sin(θ); sin(θ) cos(θ)][x; y] = [x cos(θ) − y sin(θ); x sin(θ) + y cos(θ)].

We find L(0, 0) = (0, 0), L(1, 0) = (cos(θ), sin(θ)), L(1, 1) = (cos(θ) − sin(θ), cos(θ) + sin(θ)), L(0, 1) = (−sin(θ), cos(θ)). This mapping is a rotation by θ in the counter-clockwise direction. Of course you could have derived the matrix A from the picture below.
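A short numerical sketch (mine) of the rotation matrix acting on the corners of the unit square:

import math

def rotate(theta, v):
    c, s = math.cos(theta), math.sin(theta)
    x, y = v
    # [cos -sin; sin cos][x; y]
    return (c * x - s * y, s * x + c * y)

for corner in [(0, 0), (1, 0), (1, 1), (0, 1)]:
    print(corner, '->', rotate(math.pi / 4, corner))
# (1, 0) -> (0.707..., 0.707...), matching Example 2.3.7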
Example 2.3.10. Let A = [1 0; 0 1]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [1 0; 0 1][x; y] = [x; y].

We find L(0, 0) = (0, 0), L(1, 0) = (1, 0), L(1, 1) = (1, 1), L(0, 1) = (0, 1). This mapping is a rotation by zero radians, or you could say it is a dilation by a factor of 1 ... usually we call this the identity mapping because the image is identical to the preimage.
Example 2.3.11. Let A₁ = [1 0; 0 0]. Define L₁(v) = A₁v for all v ∈ ℝ². In particular this means,

L₁(x, y) = A₁(x, y) = [1 0; 0 0][x; y] = [x; 0].

We find L₁(0, 0) = (0, 0), L₁(1, 0) = (1, 0), L₁(1, 1) = (1, 0), L₁(0, 1) = (0, 0). This mapping is a projection onto the first coordinate.

Let A₂ = [0 0; 0 1]. Define L₂(v) = A₂v for all v ∈ ℝ². In particular this means,

L₂(x, y) = A₂(x, y) = [0 0; 0 1][x; y] = [0; y].

We find L₂(0, 0) = (0, 0), L₂(1, 0) = (0, 0), L₂(1, 1) = (0, 1), L₂(0, 1) = (0, 1). This mapping is projection onto the second coordinate.

We can picture both of these mappings at once:
Example 2.3.12. Let A = [1 1; 1 1]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [1 1; 1 1][x; y] = [x + y; x + y].

We find L(0, 0) = (0, 0), L(1, 0) = (1, 1), L(1, 1) = (2, 2), L(0, 1) = (1, 1). This mapping is not a projection, but it does collapse the square to a line-segment.
Remark 2.3.13.

The examples here have focused on linear transformations from ℝ² to ℝ². It turns out that higher dimensional mappings can largely be understood in terms of the geometric operations we've seen in this section.
Example 2.3.14. Let A = [0 0; 1 0; 0 1]. Define L(v) = Av for all v ∈ ℝ². In particular this means,

L(x, y) = A(x, y) = [0 0; 1 0; 0 1][x; y] = [0; x; y].

We find L(0, 0) = (0, 0, 0), L(1, 0) = (0, 1, 0), L(1, 1) = (0, 1, 1), L(0, 1) = (0, 0, 1). This mapping moves the xy-plane to the yz-plane. In particular, the horizontal unit square gets mapped to a vertical unit square; L([0, 1] × [0, 1]) = {0} × [0, 1] × [0, 1]. This mapping certainly is not surjective because no point with x ≠ 0 is covered in the range.
Example 2.3.15. Let A = [1 1 0; 1 1 1]. Define L(v) = Av for all v ∈ ℝ³. In particular this means,

L(x, y, z) = A(x, y, z) = [1 1 0; 1 1 1][x; y; z] = [x + y; x + y + z].

Let's study how L maps the unit cube. We have 2³ = 8 corners on the unit cube,

L(0, 0, 0) = (0, 0), L(1, 0, 0) = (1, 1), L(1, 1, 0) = (2, 2), L(0, 1, 0) = (1, 1)
L(0, 0, 1) = (0, 1), L(1, 0, 1) = (1, 2), L(1, 1, 1) = (2, 3), L(0, 1, 1) = (1, 2).

This mapping squished the unit cube to a shape in the plane which contains the points (0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 3). Face by face analysis of the mapping reveals the image is a parallelogram. This mapping is certainly not injective since two different points get mapped to the same point. In particular, I have color-coded the mapping of top and base faces as they map to line segments. The vertical faces map to one of the two parallelograms that comprise the image.

I have used terms like vertical or horizontal in the standard manner we associate such terms with three dimensional geometry. Visualization and terminology for higher-dimensional examples is not as obvious. However, with a little imagination we can still draw pictures to capture important aspects of mappings.
Example 2.3.16. Let A = [1 0 0 0; 1 0 0 0]. Define L(v) = Av for all v ∈ ℝ⁴. In particular this means,

L(x, y, z, t) = A(x, y, z, t) = [1 0 0 0; 1 0 0 0][x; y; z; t] = [x; x].

Let's study how L maps the unit hypercube [0, 1]⁴ ⊆ ℝ⁴. We have 2⁴ = 16 corners on the unit hypercube; note L(1, y, z, t) = (1, 1) whereas L(0, y, z, t) = (0, 0) for all y, z, t ∈ [0, 1]. Therefore, the unit hypercube is squished to a line-segment from (0, 0) to (1, 1). This mapping is neither surjective nor injective. In the picture below the vertical axis represents the y, z, t-directions.
Example 2.3.17. Suppose f(x, y) = (√x, x² + y); note that f(1, 1) = (1, 2) and f(4, 4) = (2, 20). Note that (4, 4) = 4(1, 1), thus we should see f(4, 4) = f(4(1, 1)) = 4f(1, 1), but that fails to be true, so f is not a linear transformation.

Example 2.3.18. Let f(x, y) = x² + y² define a mapping from ℝ² to ℝ. This is not a linear transformation since

f(c(x, y)) = f(cx, cy) = (cx)² + (cy)² = c²(x² + y²) = c²f(x, y).

We say f is a nonlinear transformation.

Example 2.3.19. Suppose L : ℝ → ℝ is defined by L(x) = mx + b for some constants m, b ∈ ℝ. Is this a linear transformation on ℝ? Observe:

L(0) = m(0) + b = b

thus L is not a linear transformation if b ≠ 0. On the other hand, if b = 0 then L is a linear transformation.
A mapping L on ℝⁿ which is a linear transformation has a standard matrix [L] ∈ ℝ^{m×n}; we say that [L] is the matrix with col_j([L]) = L(e_j), that is [L]_{ij} = (L(e_j))_i, so that L(x) = [L]x for all x ∈ ℝⁿ.
Example 2.3.21. Given that L([x, y, z]ᵀ) = [x + 2y, 3y + 4z, 5x + 6z]ᵀ for [x, y, z]ᵀ ∈ ℝ³, find the standard matrix of L. We wish to find a 3×3 matrix [L] such that L(v) = [L]v for all v = [x, y, z]ᵀ ∈ ℝ³. Write L(v) then collect terms with each coordinate in the domain,

L([x, y, z]ᵀ) = [x + 2y; 3y + 4z; 5x + 6z] = x[1; 0; 5] + y[2; 3; 0] + z[0; 4; 6]

It follows that

L(v) = [1 2 0; 0 3 4; 5 0 6][x; y; z]   ⟹   [L] = [1 2 0; 0 3 4; 5 0 6]

Notice that the columns in [L] are just as you'd expect from the proof of theorem ??: [L] = [L(e₁)|L(e₂)|L(e₃)]. In future examples I will exploit this observation to save writing.
Example 2.3.22. Suppose that L((x, y, z, t)) = (x + y + z + t, y − t, 0, 3x − t); find [L].

L(e₁) = L((1, 0, 0, 0)) = (1, 0, 0, 3)
L(e₂) = L((0, 1, 0, 0)) = (1, 1, 0, 0)
L(e₃) = L((0, 0, 1, 0)) = (1, 0, 0, 0)
L(e₄) = L((0, 0, 0, 1)) = (1, −1, 0, −1)

[L] = [1 1 1 1; 0 1 0 −1; 0 0 0 0; 3 0 0 −1].

I invite the reader to check my answer here and see that L(v) = [L]v for all v ∈ ℝ⁴ as claimed.
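The column-by-column recipe [L] = [L(e₁)| ⋯ |L(e₄)] is easy to automate; a sketch of mine:

import numpy as np

def L(v):
    x, y, z, t = v
    return np.array([x + y + z + t, y - t, 0, 3*x - t])

# the j-th column of the standard matrix is L(e_j)
M = np.column_stack([L(e) for e in np.eye(4)])
print(M)                          # matches [L] computed by hand
v = np.array([1.0, 2.0, 3.0, 4.0])
print(np.allclose(M @ v, L(v)))   # True: L(v) = [L]v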
Proposition 2.3.23.

Suppose L₁ : ℝⁿ → ℝᵐ and L₂ : ℝⁿ → ℝᵐ are linear transformations; then L₁ + L₂ and cL₁ are linear transformations with [L₁ + L₂] = [L₁] + [L₂] and [cL₁] = c[L₁].

Example 2.3.24. For instance, a pair of linear transformations on ℝ² whose standard matrices add to [3 1; 1 2] has

(L₁ + L₂)(x, y) = [3 1; 1 2][x; y] = (3x + y, x + 2y).

Naturally this is the same formula that we would obtain through direct addition of the formulas of L₁ and L₂.
Proposition 2.3.25.

If L₁ : ℝᵐ → ℝⁿ and L₂ : ℝⁿ → ℝᵖ are linear transformations then L₂ ∘ L₁ : ℝᵐ → ℝᵖ is a linear transformation with matrix [L₂ ∘ L₁] such that

[L₂ ∘ L₁]_{ij} = Σ_{k=1}^n [L₂]_{ik} [L₁]_{kj}.

Example 2.3.26. Let T : ℝ^{2×1} → ℝ^{2×1} be defined by

T([x, y]ᵀ) = [x + y, 2x − y]ᵀ

for all [x, y]ᵀ ∈ ℝ^{2×1}. Also let S : ℝ^{2×1} → ℝ^{3×1} be defined by

S([x, y]ᵀ) = [x, x, 3x + 4y]ᵀ

for all [x, y]ᵀ ∈ ℝ^{2×1}. We calculate the composite as follows:

(S ∘ T)([x, y]ᵀ) = S(T([x, y]ᵀ))
                 = S([x + y, 2x − y]ᵀ)
                 = [x + y, x + y, 3(x + y) + 4(2x − y)]ᵀ
                 = [x + y, x + y, 11x − y]ᵀ

Notice we can write the formula above as a matrix multiplication,

(S ∘ T)([x, y]ᵀ) = [1 1; 1 1; 11 −1][x; y]   ⟹   [S ∘ T] = [1 1; 1 1; 11 −1].

Notice that the standard matrices of S and T are:

[S] = [1 0; 1 0; 3 4]        [T] = [1 1; 2 −1]

It's easy to see that [S ∘ T] = [S][T].
Definition 2.3.27.

Let V be a finite dimensional vector space with basis β = {b₁, b₂, . . . , b_n}. The coordinate map Φ_β : V → ℝⁿ is defined by

Φ_β(v₁b₁ + v₂b₂ + ⋯ + v_nb_n) = (v₁, v₂, . . . , v_n)

for all v = v₁b₁ + v₂b₂ + ⋯ + v_nb_n ∈ V. Similarly, a coordinate map for ℝ^{m×n} can be defined by

Φ(A) = (A₁₁, . . . , A₁ₙ, A₂₁, . . . , A₂ₙ, . . . , Aₘ₁, . . . , Aₘₙ)

This map simply takes the entries in the matrix and strings them out to a vector of length mn.

Example 2.3.28. Let Φ : ℂ → ℝ² be defined by Φ(x + iy) = (x, y). This is the coordinate map for the basis {1, i}.
Matrix multiplication is for vectors in ℝⁿ; to make a matrix act on an abstract vector space we use coordinates. Given a linear transformation L : V → W and coordinate maps Φ_β : V → ℝⁿ and Φ_γ : W → ℝᵐ, the matrix [L]_{β,γ} is defined so that the following diagram commutes:

V  --------L-------->  W
|Φ_β                   |Φ_γ
ℝⁿ ----[L]_{β,γ}---->  ℝᵐ

Begin with x ∈ V and map it to the coordinate vector Φ_β(x) in ℝⁿ. Next we operate by L, which moves us over to the vector Φ_γ(L(Φ_β⁻¹(Φ_β(x)))) = Φ_γ(L(x)), which is in ℝᵐ. The same journey is accomplished by just multiplying Φ_β(x) by the matrix [L]_{β,γ}.
Example 2.3.29. Let β = {1, x, x²} be the basis for P₂ (polynomials of degree at most two) and consider the derivative mapping D : P₂ → P₂. Find the matrix of D assuming that P₂ has coordinates with respect to β on both copies of P₂. Write Φ = Φ_β and observe

Φ(xᵏ) = e_{k+1}    whereas    Φ⁻¹(e_{k+1}) = xᵏ

for k = 0, 1, 2. Recall D(ax² + bx + c) = 2ax + b.

col₁([D]_{β,β}) = Φ(D(Φ⁻¹(e₁))) = Φ(D(1)) = Φ(0) = 0
col₂([D]_{β,β}) = Φ(D(Φ⁻¹(e₂))) = Φ(D(x)) = Φ(1) = e₁
col₃([D]_{β,β}) = Φ(D(Φ⁻¹(e₃))) = Φ(D(x²)) = Φ(2x) = 2e₂

Therefore we find,

[D]_{β,β} = [0 1 0; 0 0 2; 0 0 0].

Calculate D³. Is this surprising?
A one-one correspondence is a map which is 1-1 and onto. If we can find such a mapping between two sets then it shows those sets have the same cardinality. Cardinality is a crude idea of size; it turns out that all finite dimensional vector spaces over ℝ have the same cardinality. On the other hand, not all vector spaces have the same dimension. Isomorphisms help us discern if two vector spaces have the same dimension.
Definition 2.3.30.

Let V, W be vector spaces; then Φ : V → W is an isomorphism if it is a 1-1 and onto mapping which is also a linear transformation. If there is an isomorphism between vector spaces V and W then we say those vector spaces are isomorphic and we denote this by V ≅ W.

Other authors sometimes denote isomorphism by equality. But I'll avoid that custom as I am reserving = to denote set equality. Details of the first two examples below can be found in my linear algebra notes.
Example 2.3.31. Let =
3
and =
2
. Dene a mapping :
2
3
by
(
2
+ +) = (, , )
for all
2
+ +
2
. As vector spaces,
3
and polynomials of upto quadratic order are the
same.
Example 2.3.32. Let
2
be the set of 2 2 symmetric matrices. Let :
3
2
be dened by
(, , ) =
_
_
.
Example 2.3.33. Let (
to
.
(
) forms a vector space under function addition and scalar multiplication. There is a
natural isomorphism to matrices. Dene : (
)
by () = [] for all
linear transformations (
.
Example 2.4.2.
.
Example 2.4.3.
for each
.
We use the Euclidean norm by default.
Example 2.4.4. Consider as a two dimensional real vector space. Let + and dene
+ =
2
+
2
. This is a norm for .
Example 2.4.5. Let
. For each = [
] we dene
=
2
11
+
2
12
+ +
2
=1
=1
.
This is the Frobenius norm for matrices.
Each of the norms above allows us to dene a distance function and hence open sets and limits for
functions. An open ball in (,
) is dened
) = {
< }.
62 CHAPTER 2. LINEAR ALGEBRA
We dene the deleted open ball by removing the center from the open ball
){
} =
{ 0 <
< }. We say
, if : is a func-
tion from normed space (,
() =
i for each > 0 there exists > 0 such that for all subject to 0 <
< it fol-
lows ()(
< . If lim
() = (
.
Let (,
= then we say {
} is a convergent
sequence. We say {
< for all , with , > . In other words, a sequence is Cauchy if the
terms in the sequence get arbitarily close as we go suciently far out in the list. Many concepts
we cover in calculus II are made clear with proofs built around the concept of a Cauchy sequence.
The interesting thing about Cauchy is that for some spaces of numbers we can have a sequence
which converges but is not Cauchy. For example, if you think about the rational numbers we
can construct a sequence of truncated decimal expansions of :
{
and
. I may guide you
through the proof that , ,
and
are Banach spaces in a homework. When you take
real analysis youll spend some time thinking through the Cauchy concept.
Proposition 1.4.23 was given for the specic case of functions whose range is in . We might be able
to mimick the proof of that proposition for the case of normed spaces. We do have a composition
of limits theorem and I bet the sum function is continuous on a normed space. Moreover, if the
range happens to be a Banach algebra
6
then I would wager the product function is continuous.
Put these together and we get the normed vector space version of Prop. 1.4.23. That said, a direct
proof works nicely here so Ill just forego the more clever route here.
Proposition 2.4.6.
6
if is a Banach space that also has a product : such that 12 12 then is a
Banach algebra.
2.4. NORMED VECTOR SPACES 63
Let , be normed vector spaces. Let be a limit point of mappings , :
and suppose . If lim
() =
1
and lim
() =
2
then
1. lim
() + lim
().
2. lim
(()) = lim
().
Moreover, if , are continuous then + and are continuous.
Proof: Let > 0 and suppose lim
() =
1
and lim
() =
2
. Choose
1
,
2
> 0
such that 0 < <
1
implies ()
1
< /2 and 0 < <
2
implies ()
2
< /2.
Choose = (
1
,
2
) and suppose 0 < <
1
,
2
hence
( +)() (
1
+
2
) = ()
1
+()
2
()
1
+()
2
< /2 +/2 = .
Item (2.) follows. To prove (2.) note that if = 0 the result is clearly true so suppose = 0.
Suppose > 0 and choose > 0 such that ()
1
< /. Note that if 0 < < then
()()
1
= (()
1
) = ()
1
< / = .
The claims about continuity follow immediately from the limit properties and that completes the
proof .
Perhaps you recognize these arguments from calculus I. The logic used to prove the basic limit
theorems on is essentially identical.
Proposition 2.4.7.
Suppose
1
,
2
,
3
are normed vector spaces with norms
1
,
2
,
3
respective. Let
: ()
2
3
and : ()
1
2
be mappings. Suppose that
lim
() =
then
lim
(
)() =
_
lim
()
_
.
Proof: Let > 0 and choose > 0 such that 0 <
2
< implies () (
)
3
< . We
can choose such a since Since is continuous at
() = (
).
Next choose > 0 such that 0 <
1
< implies ()
2
< . We can choose such
a because we are given that lim
() =
. Suppose 0 <
1
< and let = ()
note ()
2
< yields
2
< and consequently () (
)
3
< . Therefore, 0 <
1
< implies (())(
)
3
< . It follows that lim
((()) = (lim
()).
The squeeze theorem relies heavily on the order properties of . Generally a normed vector space
has no natural ordering. For example, is 1 > or is 1 < in ? That said, we can state a squeeze
theorem for functions whose domain reside in a normed vector space. This is a generalization of
64 CHAPTER 2. LINEAR ALGEBRA
what we learned in calculus I. That said, the proof oered below is very similar to the typical proof
which is not given in calculus I
7
Proposition 2.4.8. squeeze theorem.
Suppose : () , : () , : () where is a
normed vector space with norm . Let () () () for all on some > 0 ball
of
8
then we nd that the limits at
() lim
() lim
().
Moreover, if lim
() = lim
() = then lim
() = .
Proof: Suppose () () for all
1
()
for some
1
> 0 and also suppose lim
() =
and lim
() =
>
[() ()] =
by the linearity
of the limit. It follows that for =
1
2
(
2
()
implies
() () (
) < =
1
2
(
1
2
(
) < () () (
) <
1
2
(
)
adding
yields,
3
2
(
) < () () <
1
2
(
) < 0.
Thus, () > () for all
2
()
1
()
so we nd a contradic-
tion for each
() where = (
1
,
2
). Hence
() = lim
1
() for some
1
> 0. We seek to show that lim
()
2
> 0 because the limits of and are given at = . Choose = (
1
,
2
) and note that if
()
then
() () ()
hence,
() () ()
but () < and () < imply < () and () < thus
< () () () < .
7
this is lifted word for word from my calculus I notes, however here the meaning of open ball is considerably more
general and the linearity of the limit which is referenced is the one proven earlier in this section
2.4. NORMED VECTOR SPACES 65
Therefore, for each > 0 there exists > 0 such that
()
implies () < so
lim
() = .
Our typical use of the theorem above applies to equations of norms from a normed vector space.
The norm takes us from to so the theorem above is essential to analyze interesting limits. We
shall make use of it in the next chapter.
Proposition 2.4.9. norm is continuous with respect to itself.
Suppose has norm then : dened by () = denes a continuous
function on .
Proof: Suppose
and
< implies
.
Let
2
have basis = {
1
,
2
} = {
1
, 3
1
+
2
} note the vector = 3
1
+
2
= 3
1
+(3
1
+
2
) =
2
. With respect to the basis we nd
1
= 3 and
2
= 1. The concept of length is muddled in these
coordinates. If we tried (incorrectly) to use the pythagorean theorem wed nd =
9 + 1 =
10
and yet the length of the vector is clearly just 1 since =
2
= (0, 1). The trouble with is that
it has dierent basis elements which overlap. To keep clear the euclidean idea of distance we must
insist on the use of an orthonormal basis.
Id rather not explain what that means at this point. Sucient to say that if is an orthonormal
basis then the coordinates preserve essentially the euclidean idea of vector length. In particular,
we can expect that if
=
=1
then
2
=
=1
.
Proposition 2.4.10.
Let , be normed vector spaces and suppose has basis = {
=1
such that when
=
=1
then
2
=
=1
=1
). Let
be a limit point of then
lim
() = =
=1
lim
() =
for all = 1, 2, . . . .
66 CHAPTER 2. LINEAR ALGEBRA
Proof: Suppose lim
() = =
=1
()
=1
()
2
= ()
2
where in the rst equality I simply added nonzero terms. With the inequality above in mind,
let > 0 and choose > 0 such that 0 < < implies () < . It follows that
()
2
<
2
and hence
()
() =
for all
.
Conversely suppose lim
() =
for all
. Let = {
1
,
2
, . . . ,
}.
Let > 0 and choose, by virtue of the given limits for the component functions,
> 0 such
that 0 < <
implies
()
<
. Choose = {
1
,
2
, . . . ,
} and suppose
0 < < . Consider
() =
=1
(
()
=1
(
()
=1
()
However,
=1
()
<
=1
=1
= .
Therefore, lim
() = .
I leave the case of non-orthonormal bases to the reader. In all the cases we consider it is possible
and natural to choose orthogonal bases to describe the vector space. Ill avoid the temptation to
do more here (there is more).
9
9
add a couple references for further reading here XXX
Chapter 3
dierentiation
Our goal in this chapter is to describe dierentiation for functions to and from normed linear spaces.
It turns out this is actually quite simple given the background of the preceding chapter. The dif-
ferential at a point is a linear transformation which best approximates the change in a function at
a particular point. We can quantify best by a limiting process which is naturally dened in view
of the fact there is a norm on the spaces we consider.
The most important example is of course the case :
) and (,
_
= 0.
In such a case we call the linear mapping the dierential at and we denote =
.
In the case =
and =
] =
()
which means that
() =
() for all
.
Notice this denition gives an equation which implicitly denes
is educated guessing.
Example 3.1.2. Suppose : is a linear transformation of normed vector spaces and .
I propose = . In other words, I think we can show the best linear approximation to the change
in a linear function is simply the function itself. Clearly is linear since is linear. Consider the
dierence quotient:
( +) () ()
=
() +() () ()
=
0
.
Note = 0 implies
= 0 by the denition of the norm. Hence the limit of the dierence quotient
vanishes since it is identically zero for every nonzero value of . We conclude that
= .
Example 3.1.3. Let : where and are normed vector spaces and dene () =
for all . I claim the dierential is the zero transformation. Linearity of () = 0 is trivially
veried. Consider the dierence quotient:
( +) () ()
=
0
.
Using the arguments to the preceding example, we nd
= 0.
Typically the dierence quotient is not identically zero. The pair of examples above are very special
cases. Ill give a few more abstract examples later in this section. For now we turn to the question
of how this general denition recovers the concept of dierentiation we studied in calculus.
1
Some authors might put a norm in the numerator of the quotient. That is an equivalent condition since a function
: has lim
0
() = 0 i lim
0
() = 0
3.1. THE DIFFERENTIAL 69
Example 3.1.4. Suppose : () is dierentiable at . It follows that there exists a
linear function
: such that
2
lim
0
( +) ()
()
= 0.
Since
() = . In this silly
case the matrix is a 1 1 matrix which otherwise known as a real number. Note that
lim
0
( +) ()
()
= 0 lim
0
( +) ()
()
= 0.
In the left limit 0
(+)()()
= 0.
But we can pull the minus out of the left limit to obtain lim
0
(+)()()
= 0. Therefore,
lim
0
( +) ()
()
= 0.
We seek to show that lim
0
(+)()
= .
= lim
0
= lim
0
()
A theorem from calculus I states that if lim( ) = 0 and lim() exists then so must lim() and
lim() = lim(). Apply that theorem to the fact we know lim
0
()
exists and
lim
0
_
( +) ()
()
_
= 0.
It follows that
lim
0
()
= lim
0
( +) ()
.
Consequently,
() = lim
0
( +) ()
dened
() in calc. I.
Therefore,
() =
2
= .
70 CHAPTER 3. DIFFERENTIATION
Example 3.1.5. Suppose :
2
3
is dened by (, ) = (,
2
, +3) for all (, )
2
.
Consider the dierence function at (, ):
= ((, ) + (, )) (, ) = ( +, +) (, )
Calculate,
=
_
( +)( +), ( +)
2
, + + 3( +)
_
_
,
2
, + 3
_
Simplify by cancelling terms which cancel with (, ):
=
_
+, 2 +
2
, + 3)
_
Identify the linear part of as a good candidate for the dierential. I claim that:
(, ) =
_
+, 2, + 3
_
.
is the dierential for at (x,y). Observe rst that we can write
(, ) =
2 0
1 3
_
.
therefore :
2
3
is manifestly linear. Use the algebra above to simplify the dierence quotient
below:
lim
(,)(0,0)
_
(, )
(, )
_
= lim
(,)(0,0)
_
(0,
2
, 0)
(, )
_
Note (, ) =
2
+
2
therefore we fact the task of showing that (0,
2
/
2
+
2
, 0) (0, 0, 0)
as (, ) (0, 0). Recall from our study of limits that we can prove the vector tends to (0, 0, 0)
by showing the each component tends to zero. The rst and third components are obviously zero
however the second component requires study. Observe that
0
2
2
+
2
2
2
=
Clearly lim
(,)(0,0)
(0) = 0 and lim
(,)(0,0)
= 0 hence the squeeze theorem for multivariate
limits shows that lim
(,)(0,0)
2
+
2
= 0. Therefore,
(,)
(, ) =
2 0
1 3
_
.
Computation of less trivial multivariate limits is an art wed like to avoid if possible. It turns out
that we can actually avoid these calculations by computing partial derivatives. However, we still
need a certain multivariate limit to exist for the partial derivative functions so in some sense its
unavoidable. The limits are there whether we like to calculate them or not. I want to give a few
more abstract examples before I get into the partial dierentiation. The purpose of this section is
to showcase the generality of the denition for dierential.
3.1. THE DIFFERENTIAL 71
Example 3.1.6. Suppose () = ()+ () for all () and both and are dierentiable
functions on (). By the arguments given in Example 3.1.4 it suces to nd : such
that
lim
0
_
( +) () ()
_
= 0.
I propose that on the basis of analogy to Example 3.1.4 we ought to have
() = (
()+
()).
Let () = (
() +
() +
())(
1
+
2
) = (
() +
())
1
+(
() +
())
2
= (
1
) +(
2
).
for all
1
,
2
and . Hence : is linear. Moreover,
(+)()()
=
1
_
( +) + ( +) () + () (
() +
())
_
=
1
_
( +) ()
()
_
+
1
_
( +) ()
()
_
Consider the problem of calculating lim
0
(+)()()
_
( +) ()
()
_
= 0 lim
0
1
_
( +) ()
()
_
= 0.
Therefore,
() = (
() +
() +
() = (
(),
(),
())
12
which makes since as :
2
.
Generally constructing the matrix for a function : where , = involves a fair
number of relatively ad-hoc conventions because the constructions necessarily involving choosing
coordinates. The situation is similar in linear algebra. Writing abstract linear transformations in
terms of matrix multiplication takes a little thinking. If you look back youll notice that I did not
bother to try to write a matrix Examples 3.1.2 or 3.1.3. The same is true for the nal example of
this section.
Example 3.1.7. Suppose :
is dened by () =
2
. Notice
= ( +) () = ( +)( +)
2
= + +
2
I propose that is dierentiable at and () = +. Lets check linearity,
(
1
+
2
) = (
1
+
2
) + (
1
+
2
) =
1
+
1
+(
2
+
2
)
72 CHAPTER 3. DIFFERENTIATION
Hence :
is a linear transformation. By construction of the linear terms in the
numerator cancel leaving just the quadratic term,
lim
0
( +) () ()
= lim
0
.
It suces to show that lim
0
= 0. We
nd
() = +.
XXX- need to adjust example below to reect orthonormality assumption.
Example 3.1.8. Suppose is a normed vector space with basis = {
1
,
2
, . . . ,
}. Futhermore,
let : be dened by
() =
=1
()
where
=1
:
then lim
0
() =
=1
i lim
0
() =
=1
_
, factoring out the basis
yields:
lim
0
_
=1
[
( +)
()
_
=
=1
_
lim
0
( +)
()
The expression on the left is the limit of a vector whereas the expression on the right is a vector of
limits. I make the equality by applying the claim. In any event, I hope you are not surprised that:
() =
=1
, space curves in , :
to
()
where
() = lim
0
( +) ()
One great contrast we should pause to note is that the denition of the directional derivative is
explicit whereas the denition of the dierential was implicit. Many similarities do exist. For
example: the directional derivative
then if
() exists in
then
() =
()
Proof: Let : ()
and suppose
()
()
= lim
0
( +()) ()
= lim
0
( +()) ()
Therefore, the limit on the left of the equality exists as the limit on the right of the equality is
given and we conclude
() =
() for all .
If were given the derivative of a mapping then the directional derivative exists. The converse is
not so simple as we shall discuss in the next subsection.
Proposition 3.2.3.
If :
() exists
for each
and
() =
().
Proof: Suppose such that
()
= 0.
This is a limit in
, when it exists it follows that the limits that approach the origin along
particular paths also exist and are zero. In particular we can consider the path for = 0
and > 0, we nd
lim
0, >0
( +) ()
()
=
1
lim
0
+
( +) ()
()
= 0.
3.2. PARTIAL DERIVATIVES AND THE EXISTENCE OF THE DIFFERENTIAL 75
Hence, as = for > 0 we nd
lim
0
+
( +) ()
= lim
0
()
().
Likewise we can consider the path for = 0 and < 0
lim
0, <0
( +) ()
()
=
1
lim
0
( +) ()
()
= 0.
Note = thus the limit above yields
lim
0
( +) ()
= lim
0
()
lim
0
( +) ()
().
Therefore,
lim
0
( +) ()
()
and we conclude that
() =
() for all
() = [
] where [
] = [
(
1
)
(
2
)
) =
() =
().
Also we may use the notation
() =
() or
() exists.
Lets expand this denition a bit. Note that if = (
1
,
2
, . . . ,
) then
() = lim
0
( +
) ()
()]
= lim
0
( +
()
76 CHAPTER 3. DIFFERENTIATION
for each = 1, 2, . . . . But then the limit of the component function
in other words,
= (
1
,
2
, . . . ,
).
Proposition 3.2.5.
If :
() can
be expressed as a sum of partial derivative maps for each =<
1
,
2
, . . . ,
>
() =
=1
()
Proof: since is dierentiable at the dierential
exists and
() =
() for all
.
Use linearity of the dierential to calculate that
() =
(
1
1
+ +
) =
1
(
1
) + +
).
Note
) =
() =
= [
=
if we insist that = 1 then we recover the standard directional derivative we discuss in calculus
III. Naturally the () yields the maximum value for the directional derivative at if we
limit the inputs to vectors of unit-length. If we did not limit the vectors to unit length then the
directional derivative at can become arbitrarily large as
has derivative
matrix
for 1 and 1 .
Perhaps it is helpful to expand the derivative matrix explicitly for future reference:
() =
1
()
2
1
()
1
()
2
()
2
2
()
2
()
.
.
.
.
.
.
.
.
.
.
.
.
()
2
()
()
Lets write the operation of the dierential for a dierentiable mapping at some point in
terms of the explicit matrix multiplication by
(). Let = (
1
,
2
, . . .
() =
() =
1
()
2
1
()
1
()
2
()
2
2
()
2
()
.
.
.
.
.
.
.
.
.
.
.
.
()
2
()
()
2
.
.
.
You may recall the notation from calculus III at this point, omitting the -dependence,
= (
) =
_
1
,
2
, ,
So if the derivative exists we can write it in terms of a stack of gradient vectors of the component
functions: (I used a transpose to write the stack side-ways),
=
_
1
2
2
2
2
.
.
.
.
.
.
.
.
.
.
.
.
=
_
1
2
(
1
)
(
2
)
.
.
.
(
(,)
(, ) =
2 0
1 3
_
.
78 CHAPTER 3. DIFFERENTIATION
If you recall from calculus III the mechanics of partial dierentiation its simple to see that
(,
2
, + 3) = (, 2, 1) =
2
1
(,
2
, + 3) = (, 0, 3) =
0
3
Thus [] = [
()
exists for all = 0, we can calculate:
() =
1
2
+ 2sin
1
cos
1
Notice that (
) unless
is continuous at
.
The next example is sick.
Example 3.2.10. Let us dene (0, 0) = 0 and
(, ) =
2
2
+
2
for all (, ) = (0, 0) in
2
. It can be shown that is continuous at (0, 0). Moreover, since
(, 0) = (0, ) = 0 for all and all it follows that vanishes identically along the coordinate
axis. Thus the rate of change in the
1
or
2
directions is zero. We can calculate that
=
2
3
(
2
+
2
)
2
and
=
4
2
(
2
+
2
)
2
Consider the path to the origin (, ) gives
(, ) = 2
4
/(
2
+
2
)
2
= 1/2 hence
(, ) 1/2
along the path (, ), but
() = (1, 2, 3
2
). In this case we have
() = [
] =
1
2
3
2
The Jacobian here is a single column vector. It has rank 1 provided the vector is nonzero. We
see that
() = (0, 0, 0) for all . This corresponds to the fact that this space curve has a
well-dened tangent line for each point on the path.
Example 3.2.15. Let (, ) = be a mapping from
3
3
. Ill denote the coordinates
in the domain by (
1
,
2
,
3
,
1
,
2
,
3
) thus (, ) =
1
1
+
2
2
+
3
3
. Calculate,
[
(,)
] = (, )
= [
1
,
2
,
3
,
1
,
2
,
3
]
The Jacobian here is a single row vector. It has rank 6 provided all entries of the input vectors are
nonzero.
Example 3.2.16. Let (, ) = be a mapping from
,
1
, . . . ,
) thus (, ) =
=1
. Calculate,
=1
_
=
=1
=1
Likewise,
=1
_
=
=1
=1
1
, . . . ,
1
, . . . ,
),
[
(,)
]
= ()(, ) = = (
1
, . . . ,
,
1
, . . . ,
)
The Jacobian here is a single row vector. It has rank 2n provided all entries of the input vectors
are nonzero.
3.2. PARTIAL DERIVATIVES AND THE EXISTENCE OF THE DIFFERENTIAL 81
Example 3.2.17. Suppose (, , ) = (, , ) we calculate,
= (, 0, 0)
= (, 1, 0)
= (, 0, 1)
Remember these are actually column vectors in my sneaky notation; (
1
, . . . ,
) = [
1
, . . . ,
.
This means the derivative or Jacobian matrix of at (, , ) is
(, , ) = [
(,,)
] =
0 1 0
0 0 1
Note, (
(, , )) = 3 for all (, , )
3
such that , = 0. There are a variety of ways to
see that claim, one way is to observe [
= (2, 0)
= (0, )
= (2, )
The derivative is a 2 3 matrix in this example,
(, , ) = [
(,,)
] =
_
2 0 2
0
_
The maximum rank for
1
=
_
2 0
0
_
2
=
_
2 2
0
_
3
=
_
0 2
_
Well need either (
1
) = 2 = 0 or (
2
) = 2 = 0 or (
3
) = 2
2
= 0. I believe
the only point where all three of these fail to be true simulataneously is when = = = 0. This
mapping has maximal rank at all points except the origin.
Example 3.2.19. Suppose (, ) = (
2
+
2
, , +) we calculate,
= (2, , 1)
= (2, , 1)
The derivative is a 3 2 matrix in this example,
(, ) = [
(,)
] =
2 2
1 1
82 CHAPTER 3. DIFFERENTIATION
The maximum rank is again 2, this time because we only have two columns. The rank will be two
if the columns are not linearly dependent. We can analyze the question of rank a number of ways
but I nd determinants of submatrices a comforting tool in these sort of questions. If the columns
are linearly dependent then all three sub-square-matrices of
1
=
_
2 2
_
2
=
_
2 2
1 1
_
3
=
_
1 1
_
You can see (
1
) = 2(
2
2
), (
2
) = 2( ) and (
3
) = . Apparently we have
(
(, , )) = 2 for all (, )
2
with = . In retrospect this is not surprising.
Example 3.2.20. Suppose (, , ) = (
,
1
) = (
1
2
2
+
1
2
2
, ) for some constant . Lets
calculate the derivative via gradients this time,
= (
/,
/,
/) = (, ,
1
2
2
)
1
= (
1
/,
1
/,
1
/) = (0, , )
Therefore,
(, , ) =
_
1
2
2
0
_
Example 3.2.21. Let (, ) = ( cos , sin ). We calculate,
= ( sin , cos )
Hence,
(, ) =
_
cos sin
sin cos
_
We calculate (
(, )) = thus this mapping has full rank everywhere except the origin.
Example 3.2.22. Let (, ) = (
2
+
2
, tan
1
(/)). We calculate,
=
_
2
+
2
,
2
+
2
_
and
=
_
2
+
2
,
2
+
2
_
Hence,
(, ) =
_
2
+
2
2
+
2
2
+
2
2
+
2
_
=
_
2
_
_
using =
2
+
2
_
We calculate (
(, )) = 1/ thus this mapping has full rank everywhere except the origin.
3.2. PARTIAL DERIVATIVES AND THE EXISTENCE OF THE DIFFERENTIAL 83
Example 3.2.23. Let (, ) = (, ,
2
) for a constant . We calculate,
2
=
_
2
,
2
_
Also, = (1, 0) and = (0, 1) thus
(, ) =
1 0
0 1
This matrix clearly has rank 2 where is is well-dened. Note that we need
2
2
> 0 for the
derivative to exist. Moreover, we could dene (, ) = (
2
, , ) and calculate,
(, ) =
1 0
2
0 1
.
Observe that
(, ) exists when
2
2
2
> 0. Geometrically, parametrizes the sphere
above the equator at = 0 whereas parametrizes the right-half of the sphere with > 0. These
parametrizations overlap in the rst octant where both and are positive. In particular, (
)
(
) = {(, )
2
, > 0 and
2
+
2
<
2
}
Example 3.2.24. Let (, , ) = (, , ,
2
) for a constant . We calculate,
2
=
_
2
,
2
,
2
_
Also, = (1, 0, 0), = (0, 1, 0) and = (0, 0, 1) thus
(, , ) =
1 0 0
0 1 0
0 0 1
This matrix clearly has rank 3 where is is well-dened. Note that we need
2
2
> 0 for the
derivative to exist. This mapping gives us a parametrization of the 3-sphere
2
+
2
+
2
+
2
=
2
for > 0. (drawing this is a little trickier)
84 CHAPTER 3. DIFFERENTIATION
Example 3.2.25. Let (, , ) = ( +, +, +, ). You can calculate,
[
(,,)
] =
1 1 0
0 1 1
1 0 1
(, , ) has rank 3 in
3
provided we are at a point which is not on some coordinate plane. (the coordinate planes are
= 0, = 0 and = 0 for the , and coordinate planes respective)
Example 3.2.27. Let (, , ) = (, 1 ). You can calculate,
[
(,,)
] =
_
1 1 0
_
This matrix has rank 3 if either = 0 or ( ) = 0. In contrast to the preceding example, the
derivative does have rank 3 on certain points of the coordinate planes. For example,
(1, 1, 0) and
) = 3.
Example 3.2.28. Let :
3
3
be dened by () = for a xed vector = 0. We denote
= (
1
,
2
,
3
) and calculate,
( ) =
_
,,
_
=
,,
,,
It follows,
1
( ) =
=
2
2
= (0,
3
,
2
)
2
( ) =
=
3
3
= (
3
, 0,
1
)
3
( ) =
=
1
1
= (
2
,
1
, 0)
Thus the Jacobian is simply,
[
(,)
] =
0
3
2
3
0
1
2
1
0
In fact,
() = () = for each
3
. The given mapping is linear so the dierential of
the mapping is precisely the mapping itself.
3.2. PARTIAL DERIVATIVES AND THE EXISTENCE OF THE DIFFERENTIAL 85
Example 3.2.29. Let (, ) = (, , 1 ). You can calculate,
[
(,,)
] =
1 0
0 1
1 1
= (
) and
= (
)
Then the Jacobian is the 3 2 matrix
_
(,)
The matrix
_
(,)
_
_
_
_
_
Example 3.2.31. . .
Example 3.2.32. . .
86 CHAPTER 3. DIFFERENTIATION
3.3 additivity and homogeneity of the derivative
Suppose
1
:
and
2
:
. It follows that (
1
)
=
1
and (
2
)
=
2
are linear operators from
to
1
( +)
1
()
1
()
= 0 lim
0
2
( +)
2
()
2
()
= 0
To prove that =
1
+
2
is dierentiable at
= (
1
+
2
)
= (
1
)
+(
2
)
. Let = (
1
)
+(
2
)
and consider,
lim
0
(+)()()
= lim
0
1
(+)+
2
(+)
1
()
2
()
1
()
2
()
= lim
0
1
(+)
1
()
1
()
+ lim
0
2
(+)
2
()
2
()
= 0 + 0
= 0
Note that breaking up the limit was legal because we knew the subsequent limits existed and
were zero by the assumption of dierentiability of
1
and
2
at . Finally, since =
1
+
2
we
know is a linear transformation since the sum of linear transformations is a linear transformation.
Moreover, the matrix of is the sum of the matrices for
1
and
2
. Let and suppose =
1
then we can also show that
= (
1
)
= (
1
)
and
2
:
1
+
2
is dierentiable at and
(
1
+
2
)
= (
1
)
+ (
2
)
or (
1
+
2
)
() =
1
() +
2
()
Likewise, if then
(
1
)
= (
1
)
or (
1
)
() = (
1
())
These results suggest that the dierential of a function is a new object which has a vector space
structure. There is much more to say here later.
3.4 chain rule
The proof in Edwards is on 77-78. Ill give a heuristic proof here which captures the essence of the
argument. The simplicity of this rule continues to amaze me.
3.4. CHAIN RULE 87
Proposition 3.4.1.
If :
is dierentiable at and :
is dierentiable at
() then
is dierentiable at and
(
= ()
()
() =
(())
()
Proof Sketch:
In calculus III you may have learned how to calculate partial derivatives in terms of tree-diagrams
and intermediate variable etc... We now have a way of understanding those rules and all the
other chain rules in terms of one over-arching calculation: matrix multiplication of the constituent
Jacobians in the composite function. Of course once we have this rule for the composite of two
functions we can generalize to -functions by a simple induction argument. For example, for three
suitably dened mappings , , ,
(
() =
((()))
(())
()
Example 3.4.2. . .
88 CHAPTER 3. DIFFERENTIATION
Example 3.4.3. . .
Example 3.4.4. . .
Example 3.4.5. . .
3.4. CHAIN RULE 89
Example 3.4.6. . .
Example 3.4.7. . .
90 CHAPTER 3. DIFFERENTIATION
3.5 product rules?
What sort of product can we expect to nd among mappings? Remember two mappings have
vector outputs and there is no way to multiply vectors in general. Of course, in the case we have
two mappings that have equal-dimensional outputs we could take their dot-product. There is a
product rule for that case: if
,
:
then
) = (
)
) +
(
)
Or in the special case of = 3 we could even take their cross-product and there is another product
rule in that case:
) = (
)
+
(
)
What other case can we multiply vectors? One very important case is
2
= where is is
customary to use the notation (, ) = + and = + . If our range is complex numbers
then we again have a product rule: if :
and :
then
() = (
) +(
)
I have relegated the proof of these product rules to the end of this chapter. One other object worth
dierentiating is a matrix-valued function of
(+) =
() = (
) +(
)
Moral of this story? If you have a pair mappings whose ranges allow some sort of product then it
is entirely likely that there is a corresponding product rule
5
.
3.5.1 scalar-vector product rule
There is one product rule which we can state for arbitrary mappings, note that we can always
sensibly multiply a mapping by a function. Suppose then that :
and :
and
where
lim
0
( +) ()
()
= 0 lim
0
( +) ()
()
= 0
Since ( +) () +
() and ( +) () +
() we expect
( +) (() +
())(() +
())
()() +()
() +()
()
. .
linear in
+
()
()
. .
2
order in
5
In my research I consider functions on supernumbers, these also can be multiplied. Naturally there is a product
rule for super functions, the catch is that super numbers , do not necessarily commute. However, if theyre
homogeneneous = (1)
()
3.5. PRODUCT RULES? 91
Thus we propose: () = ()
() +()
=
= lim
0
( +)( +) ()() ()
() ()
()
= lim
0
( +)( +) ()() ()
() ()
()
+
+ lim
0
()( +) ( +)()
+ lim
0
( +)() ()( +)
+ lim
0
()() ()()
= lim
0
_
()
( +) ()
()
+
( +) ()
()
()+
+
_
( +) ()
_
( +) ()
_
= ()
_
lim
0
( +) ()
()
_
+
_
lim
0
( +) ()
()
_
()
= 0
Where we have made use of the dierentiability and the consequent continuity of both and at
. Furthermore, note
( +) = ()
( +) +()
( +)
= ()(
() +
()) +()(
() +
())
= ()
() +()(
() +(()
() +()
())
= () +()
for all ,
and hence = ()
+()
and :
= ()
() +()
()
() =
()() +()
()
The argument above covers the ordinary product rule and a host of other less common rules. Note
again that () and
() are vectors.
92 CHAPTER 3. DIFFERENTIATION
3.5.2 calculus of paths in
3
A path is a mapping from to
() =
() +
().
2. ()
() =
().
3. ()
() =
()() +()
().
4. ( )
() =
() () +()
().
5. provided = 3, ( )
() =
() () +()
().
6. provided () (
), (
)
() =
()(()).
We have to insist that = 3 for the statement with cross-products since we only have a standard
cross-product in
3
. We prepare for the proof of the proposition with a useful lemma. Notice this
lemma tells us how to actually calculate the derivative of paths in examples. The derivative of
component functions is nothing more than calculus I and one of our goals is to reduce things to
those sort of calculations whenever possible.
Lemma 3.5.3.
If :
() = (
1
(),
2
(), . . . ,
())
We are given that the following vector limit exists and is equal to
(),
() = lim
0
( +) ()
then by Proposition 1.4.10 the limit of a vector is related to the limits of its components as follows:
()
= lim
0
( +)
()
.
Thus (
())
= (
1
, . . . ,
) and =
=
(
1
, . . . ,
[( +)
] using def. ( +)
] +
] by calculus I, ( +)
.
= [
and
Hence ( )
[( )
] using def.
=
] repeatedly using, ( +)
] repeatedly using, ()
= (
+ (
() =
] using def.
=
] repeatedly using, ( +)
] repeatedly using, ()
=
_
. We likewise dene
= [
] for with
integrable components. Denite integrals and higher derivatives are also dened component-
wise.
Example 3.5.5. Suppose () =
_
2 3
2
4
3
5
4
_
. Ill calculate a few items just to illustrate the
denition above. calculate; to dierentiate a matrix we dierentiate each component one at a time:
() =
_
2 6
12
2
20
3
_
() =
_
0 6
24 60
2
_
(0) =
_
2 0
0 0
_
Integrate by integrating each component:
() =
_
2
+
1
3
+
2
4
+
3
5
+
4
_
2
0
() =
2
0
3
2
0
2
0
5
2
0
=
_
4 8
16 32
_
Proposition 3.5.6.
Suppose , are matrix-valued functions of a real variable, is a function of a real variable,
is a constant, and is a constant matrix then
1. ()
3. ()
4. ()
5. ()
6. (+)
where each of the functions is evaluated at the same time and I assume that the functions
and matrices are dierentiable at that value of and of course the matrices , , are such
that the multiplications are well-dened.
3.5. PRODUCT RULES? 95
Proof: Suppose ()
and ()
consider,
()
(()
) linearity of derivative
=
algebra
= (
+ (
(()
(
2
cos() +
2
sin()) =
(
2
cos()) +
(
2
sin())
= (2
2
cos()
2
sin()) +(2
2
sin() +
2
cos()) (3.1)
=
2
(2 +)(cos() + sin())
= (2 +)
(2+)
where I have made use of the identity
7
+
=
which seems obvious enough until you appreciate that we just proved it for = 2 +.
7
or denition, depending on how you choose to set-up the complex exponential, I take this as the denition in
calculus II
96 CHAPTER 3. DIFFERENTIATION
3.6 complex analysis in a nutshell
Dierentiation with respect to a real variable can be reduced to the slogan that we dierentiate
componentwise. Dierentiation with respect to a complex variable requires additional structure.
They key distinguishing ingredient is complex linearity:
Denition 3.6.1.
If we have some function : such that
(1.) ( +) = () +() for all , (2.) () = () for all ,
then we would say that is complex-linear.
Condition (1.) is additivity whereas condition (2.) is homogeneity. Note that complex linearity
implies real linearity however the converse is not true.
Example 3.6.2. Suppose () = where if = + for , then = is the complex
conjugate of . Consider for = + where , ,
() = (( +)( +))
= ( +( +))
= ( +)
= ( )( )
= ()
hence this map is not complex linear. On the other hand, if we study mutiplication by just ,
() = (( +)) = ( +) = = ( ) = ( +) = ()
thus is homogeneous with respect to real-number multiplication and it is also additive hence is
real linear.
Suppose that is a linear mapping from
2
to
2
. It is known from linear algebra that there exists
a matrix =
_
_
such that () = for all
2
. In this section we use the notation
+ = (, ) and
(, ) (, ) = ( +)( +) = +( +) = ( , +).
This construction is due to Gauss in the early nineteenth century, the idea is to use two component
vectors to construct complex numbers. There are other ways to construct complex numbers
8
.
Notice that ( + ) =
_
__
_
= ( + , + ) = + + ( + ) denes a real
linear mapping on for any choice of the real constants , , , . In contrast, complex linearity
puts strict conditions on these constants:
8
the same is true for real numbers, you can construct them in more than one way, however all constructions agree
on the basic properties and as such it is the properties of real or complex numbers which truly dened them. That
said, we choose Gauss representation for convenience.
3.6. COMPLEX ANALYSIS IN A NUTSHELL 97
Theorem 3.6.3.
The linear mapping () = is complex linear i the matrix will have the special form
below:
_
_
To be clear, we mean to identify
2
with as before. Thus the condition of complex
homogeneity reads ((, ) (, )) = (, ) (, )
Proof: assume is complex linear. Dene the matrix of as before:
(, ) =
_
__
_
This yields,
( +) = + +( +)
We can gain conditions on the matrix by examining the special points 1 = (1, 0) and = (0, 1)
(1, 0) = (, ) (0, 1) = (, )
Note that (
1
,
2
) (1, 0) = (
1
,
2
) hence ((
1
+
2
)1) = (
1
+
2
)(1) yields
(
1
+
2
) +(
1
+
2
) = (
1
+
2
)( +) =
1
2
+(
1
+
2
)
We nd two equations by equating the real and imaginary parts:
1
+
2
=
1
2
1
+
2
=
1
+
2
Therefore,
2
=
2
and
2
=
2
for all (
1
,
2
) . Suppose
1
= 0 and
2
= 1. We nd
= and = . We leave the converse proof to the reader. The proposition follows.
In analogy with the real case we dene
() as follows:
Denition 3.6.4.
Suppose : () and () then we dene
() = lim
0
( +) ()
.
The derivative function
() = lim
0
()
hence
lim
0
()
= lim
0
( +) ()
lim
0
( +) ()
()
= 0
Note that the limit above simply says that () =
then linearization () =
) is a complex linear
mapping.
Proof: let , and note () =
)() =
) = ().
It turns out that complex dierentiability automatically induces real dierentiability:
Proposition 3.6.6.
If is a complex dierentiable at
with () =
).
Proof: note that lim
0
(+)()
()
= 0 implies
lim
0
( +) ()
()
= 0
but then = and we know () =
and
= + then,
1. () =
) is complex linear.
2. () =
) =
_
)
_
Theorem 3.6.3 applies to
and
.
Example 3.6.8. Let () =
(cos() + sin()) =
cos() +
sin()
3.6. COMPLEX ANALYSIS IN A NUTSHELL 99
Identify for = + we have (, ) =
cos() and (, ) =
sin(). Calculate:
cos()
cos() &
cos()
sin(),
sin()
sin() &
sin()
cos().
Thus satises the CR-equations
and
2
that satisfy the CR-equations at
.
Example 3.6.9. Counter-example to converse of Theorem 3.6.7. Suppose (+) =
_
0 if = 0
1 if = 0
.
Clearly is identically zero on the coordinate axes thus along the -axis we can calculate the partial
derivatives for and and they are both zero. Likewise, along the -axis we nd
and
exist and
are zero. At the origin we nd
and
) =
)
and
) =
.
Proof: we are given that a function :
)
2
2
is continuous with continuous partial
derivatives of its component functions and . Therefore, by Theorem ?? we know is (real)
dierentiable at
)
_ _
1
2
_
.
Note then that the given CR-equations show the matrix of has the form
[] =
_
_
100 CHAPTER 3. DIFFERENTIATION
where =
) and =
= 0
note that the limit with in the denominator is equivalent to the limit above which followed directly
from the (real) dierentiability at
. (the following is not needed for the proof of the theorem, but
perhaps it is interesting anyway) Moreover, we can write
(
1
,
2
) =
_
_ _
1
2
_
=
_
1
+
1
+
2
_
=
1
+
2
+(
1
+
2
)
= (
)(
1
+
2
)
Therefore we nd
) =
gives () =
).
In the preceding section we found necessary and sucient conditions for the component functions
, to construct an complex dierentiable function = + . The denition that follows is the
next logical step: we say a function is analytic
9
at
.
Denition 3.6.11.
Let = + be a complex function. If there exists > 0 such that is complex
dierentiable for each
. If is analytic for
each
is a singular point. Singular points may be outside the domain of the function. If is
analytic on the entire complex plane then we say is entire. Analytic functions are
also called holomorphic functions
If you look in my complex variables notes you can nd proof of the following theorem (well, partial
proof perhaps, but this result is shown in every good complex variables text)
Theorem 3.6.12.
If : is a function and
: is an extension of which is analytic then
extends uniquely to
() =
()
(cos(()) + sin(())).
Note
( + 0) =
(cos(0) + sin(0)) =
thus
= 0 on .
Proof: since = + is analytic we know the CR-equations hold true;
and
.
Moreover, is continuously dierentiable so we may commute partial derivatives by a theorem
from multivariate calculus. Consider
= (
+ (
= (
+ (
= 0
Likewise,
= (
+ (
= (
+ (
= 0
Of course these relations hold for all points inside and the proposition follows.
Example 3.6.14. Note () =
2
is analytic with =
2
2
and = 2. We calculate,
= 2,
= 2
= 0
Note
= 1
= 0
Integrate these equations to deduce (, ) = +
2
for some constant
2
. We thus construct
an analytic function (, ) = +
1
+ ( +
2
) = + +
1
+
2
. This is just () = + for
=
1
+
2
.
Example 3.6.16. Suppose (, ) =
= whereas
= hence
cos()
sin()
Integrating
cos() +
sin(). Of course we
should recognize the function we just constructed, its just the complex exponential () =
.
Notice we cannot just construct an analytic function from any given function of two variables. We
have to start with a solution to Laplaces equation. This condition is rather restrictive. There
is much more to say about harmonic functions, especially where applications are concerned. My
goal here was just to give another perspective on analytic functions. Geometrically one thing we
could see without further work at this point is that for an analytic function = + the families
of level curves (, ) =
1
and (, ) =
2
are orthogonal. Note () =<
> and
() =<
> have
() () =
= 0
This means the normal lines to the level curves for and are orthogonal. Hence the level curves
of and are orthogonal.
Chapter 4
inverse and implicit function theorems
It is tempting to give a complete and rigourous proof of these theorems here, but I will resist the
temptation in lecture. Im actually more interested that the student understand what the theorem
claims. I will sketch the proof and show many applications. A nearly complete proof is found in
Edwards where he uses an iterative approximation technique founded on the contraction mapping
principle. All his arguments are in some sense in vain unless you have some working knowledge
of uniform convergence. Its hidden in his proof, but we cannot conclude the limit of his sequence
of function has the properties we desire unless the sequence of functions is uniformly convergent.
Sadly that material has its home in real analysis and I dare not trespass in lecture. That said, if
you wish Id be happy to show you the full proof if you have about 20 extra hours to develop the
material outside of class. Alternatively, as a course of future study, return to the proof after you
have completed Math 431 here at Liberty
1
. Some other universities put advanced calculus after
the real analysis course so that more analytical depth can be built into the course
2
4.1 inverse function theorem
Consider the problem of nding a local inverse for : () . If we are given a point
() such that there exists an open interval containing with
()( ) +
1
()
_
()
Therefore,
1
() +
1
()
_
()
() = 3 for all .
Suppose we want to nd the inverse function near = 2 then the discussion preceding this example
suggests,
1
() = 2 +
1
3
( 4).
I invite the reader to check that (
1
()) = and
1
(()) = for all , .
In the example above we found a global inverse exactly, but this is thanks to the linearity of the
function in the example. Generally, inverting the linearization just gives the rst approximation to
the inverse.
Consider : ()
. If is dierentiable at
()( ) for . Set = () and solve for via matrix algebra. This time we need
to assume
()( ) + (
())
1
_
()
Therefore,
1
() + (
())
1
_
()
in such
a way that the proof naturally generalizes to function space. This is done by arguing with properties rather than
formulas. The properties oen extend to innite dimensions whereas the formulas usually do not.
4.1. INVERSE FUNCTION THEOREM 105
interestingly the main non-trivial step is an application of the geometric series. For the student
of analysis this is an important topic which you should spend considerable time really trying to
absorb as deeply as possible. The contraction mapping is at the base of a number of interesting
and nontrivial theorems. Read Rosenlichts Introduction to Analysis for a broader and better
organized exposition of this analysis. In contrast, Edwards uses analysis as a tool to obtain results
for advanced calculus but his central goal is not a broad or well-framed treatment of analysis.
Consequently, if analysis is your interest then you really need to read something else in parallel to
get a better ideas about sequences of functions and uniform convergence. I have some notes from
a series of conversations with a student about Rosenlicht, Ill post those for the interested student.
These notes focus on the part of the material I require for this course. This is Theorem 3.3 on page
185 of Edwards text:
Theorem 4.1.2. ( inverse function theorem )
Suppose :
() = and
+1
() =
() [
()]
1
[(
and yet
(0) = 0 (therefore
dened by () = (
(). This function is clearly continuous since we are given that the
partial derivatives of the component functions of are all continuous.
2. note we are given
() is invertible on .
I would argue this is a topological argument because the key idea here is the continuity of .
Topology is the study of continuity in general.
Remark 4.1.4. James J. Callahans Advanced Calculus: a Geometric View, good reading.
James J. Callahan recently authored Advanced Calculus: a Geometric View. This text has
great merit in both visualization and well-thought use of linear algebraic techniques. In
addition, many student will enjoy his staggered proofs where he rst shows the proof for a
simple low dimensional case and then proceeds to the general case. I almost used his text
this semester.
Example 4.1.5. Suppose (, ) =
_
sin() +1, sin() +2
_
for (, )
2
. Clearly is contin-
uously dierentiable as all its component functions have continuous partial derivatives. Observe,
(, ) = [
] =
_
0 cos()
cos() 0
_
Hence
_
_
sin() + 1 =
sin() + 2 =
_
_
= sin
1
( 1)
= sin
1
( 2)
_
It follows that
1
(, ) =
_
sin
1
( 2), sin
1
( 1)
_
for (, ) [0, 2] [1, 3] where you should
note ([/2, /2]
2
) = [0, 2] [1, 3]. Weve found a local inverse for on the region [/2, /2]
2
.
In other words, we just found a global inverse for the restriction of to [/2, /2]
2
. Technically
we ought not write
1
, to be more precise we should write:
(
[/2,/2]
2)
1
(, ) =
_
sin
1
( 2), sin
1
( 1)
_
.
4.1. INVERSE FUNCTION THEOREM 107
It is customary to avoid such detail in many contexts. Inverse functions for sine, cosine, tangent
etc... are good examples of this slight of langauge.
A coordinate system on
is an invertible mapping of
to
2
+
2
=
2
cos
2
() +
2
sin
2
() =
2
(cos
2
() + sin
2
()) =
2
and
=
sin()
cos()
= tan().
It follows that =
2
+
2
and = tan
1
(/) for (, ) (0, ) . We nd
1
(, ) =
_
2
+
2
, tan
1
(/)
_
.
Lets see how the derivative ts with our results. Calcuate,
(, ) = [
] =
_
cos() sin()
sin() cos()
_
note that (
3
is dened by (, , ) = (, , ) for constants , ,
where = 0. Clearly is continuously dierentiable as all its component functions have
continuous partial derivatives. We calculate
(, , ) = [
] = [
1
3
]. Thus
(
(, , )) = = 0 for all (, , )
3
hence this function is locally invertible everywhere.
Moreover, we calculate the inverse mapping by solving (, , ) = (, , ) for (, , ):
(, , ) = (, , ) (, , ) = (/, /, /)
1
(, , ) = (/, /, /).
Example 4.1.8. Suppose :
. Under what conditions is such a function invertible ?. Since the formula for
this function gives each component function as a polynomial in the -variables we can conclude the
108 CHAPTER 4. INVERSE AND IMPLICIT FUNCTION THEOREMS
function is continuously dierentiable. You can calculate that
) = (,
2
, . . . ,
)
for all . Hence any point will have other points nearby which output the same value under .
Suppose () = 0, to calculate the inverse mapping formula we should solve () = for ,
= + =
1
( )
1
() =
1
( ).
Remark 4.1.9. inverse function theorem holds for higher derivatives.
In Munkres the inverse function theorem is given for -times dierentiable functions. In
short, a
2
+
2
= 1
2
= 1
2
=
1
2
.
A function cannot have two outputs for a single input, when we write in the expression above
it simply indicates our ignorance as to which is chosen. Once further information is given then we
may be able to choose a + or a . For example:
1. if
2
+
2
= 1 and we want to solve for near (0, 1) then =
1
2
is the correct choice
since > 0 at the point of interest.
2. if
2
+
2
= 1 and we want to solve for near (0, 1) then =
1
2
is the correct choice
since < 0 at the point of interest.
3. if
2
+
2
= 1 and we want to solve for near (1, 0) then its impossible to nd a single
function which reproduces
2
+
2
= 1 on an open disk centered at (1, 0).
What is the defect of case (3.) ? The trouble is that no matter how close we zoom in to the point
there are always two -values for each given -value. Geometrically, this suggests either we have a
discontinuity, a kink, or a vertical tangent in the graph. The given problem has a vertical tangent
and hopefully you can picture this with ease since its just the unit-circle. In calculus I we studied
4.2. IMPLICIT FUNCTION THEOREM 109
implicit dierentiation, our starting point was to assume = () and then we dierentiated
equations to work out implicit formulas for /. Take the unit-circle and dierentiate both sides,
2
+
2
= 1 2 + 2
= 0
.
Note
is not dened for = 0. Its no accident that those two points (1, 0) and (1, 0) are
precisely the points at which we cannot solve for as a function of . Apparently, the singularity
in the derivative indicates where we may have trouble solving an equation for one variable as a
function of the remaining variable.
We wish to study this problem in general. Given -equations in (+)-unknowns when can we solve
for the last -variables as functions of the rst -variables. Given a continuously dierentiable
mapping = (
1
,
2
, . . . ,
) :
are
constants)
1
(
1
, . . . ,
,
1
, . . . ,
) =
1
2
(
1
, . . . ,
,
1
, . . . ,
) =
2
.
.
.
(
1
, . . . ,
,
1
, . . . ,
) =
as functions of
1
, . . .
such that (, ) = . In
this section we use the notation = (
1
,
2
, . . .
) and = (
1
,
2
, . . . ,
).
Before we turn to the general problem lets analyze the unit-circle problem in this notation. We
are given (, ) =
2
+
2
and we wish to nd () such that = () solves (, ) = 1.
Dierentiate with respect to and use the chain-rule:
= 0
We nd that / =
and
and suppose :
has
(, ) = . Replace with its linearization based at (, ):
(, ) +
(, )( , )
110 CHAPTER 4. INVERSE AND IMPLICIT FUNCTION THEOREMS
here we have the matrix multiplication of the ( + ) matrix
(, ) with the ( + ) 1
column vector ( , ) to yield an -component column vector. It is convenient to dene
partial derivatives with respect to a whole vector of variables,
1
1
.
.
.
.
.
.
1
1
.
.
.
.
.
.
(, )
(, ) =
_
(, )
(, )
_
With this notation we have
(, ) +
(, )( ) +
(, )( )
If we are near (, ) then (, ) thus we are faced with the problem of solving the following
equation for :
+
(, )( ) +
(, )( )
Suppose the square matrix
(, )
_
1
_
(, )( )
_
.
Of course this is not a formal proof, but it does suggest that
_
(, )
= 0 is a necessary
condition for solving for the variables.
As before suppose :
for
_
(, ())
_
=
=1
=1
=1
= 0
we made use of the identity
1
+
=1
2
+
=1
=1
_
= [00 0]
4.2. IMPLICIT FUNCTION THEOREM 111
Properties of matrix addition allow us to parse the expression above as follows:
_
_
+
_
=1
=1
=1
_
= [00 0]
But, this reduces to
+
_
_
= 0
The concatenation property of matrix multiplication states [
1
] = [
1
]
we use this to write the expression once more,
_
= 0
= 0
is invertible.
Theorem 4.2.1. (Theorem 3.4 in Edwardss Text see pg 190)
Let : ()
). If the matrix
(, ) is invertible
then there exists an open ball containing in
such that (, ) =
i = () for all (, ) . Moreover, the mapping is the limit of the sequence of
successive approximations dened inductively below
() = ,
+1
=
() [
(, )]
1
(,
(, )
is invertible. We seek to use the inverse function theorem to prove the implicit function theorem.
Towards that end consider :
(, ) = [
] =
_
_
=
_
_
The determinant of the matrix above is the product of the deteminant of the blocks
and
; (
(, ) = (
)(
) =
(, )) = 0 thus (
(, ) = 0 and we nd
such that
1
is continuously dierentiable. Note (, ) and contains the
point (, ) = (, (, )) = (, ).
Our goal is to nd the implicit solution of (, ) = . We know that
1
((, )) = (, ) and (
1
(, )) = (, )
for all (, ) and (, ) . As usual to nd the formula for the inverse we can solve
(, ) = (, ) for (, ) this means we wish to solve (, (, )) = (, ) hence = . The
formula for is more elusive, but we know it exists by the inverse function theorem. Lets say
= (, ) where :
and thus
1
(, ) = (, (, )). Consider then,
(, ) = (
1
(, ) = (, (, )) = (, (, (, ))
Let = thus (, ) = (, (, (, )) for all (, ) . Finally, dene () = (, ) for
all (, ) and note that = (, ()). In particular, (, ) and at that point we nd
() = (, ) = by construction. It follows that = () provides a continuously dierentiable
solution of (, ) = near (, ).
Uniqueness of the solution follows from the uniqueness for the limit of the sequence of functions
described in Edwards text on page 192. However, other arguments for uniqueness can be oered,
independent of the iterative method, for instance: see page 75 of Munkres Analysis on Manifolds.
Remark 4.2.2. notation and the implementation of the implicit function theorem.
We assumed the variables were to be written as functions of variables to make explicit
a local solution to the equation (, ) = . This ordering of the variables is convenient to
argue the proof, however the real theorem is far more general. We can select any subset of
input variables to make up the so long as
(, , ) = 2 = 0.
4.2. IMPLICIT FUNCTION THEOREM 113
2. if we wish to solve = (, ) then we need
(, , ) = 2 = 0.
3. if we wish to solve = (, ) then we need
(, , ) = 2 = 0.
The point has no local solution for if it is a point on the intersection of the -plane and the
sphere (, , ) =
2
. Likewise, we cannot solve for = (, ) on the = 0 slice of the sphere
and we cannot solve for = (, ) on the = 0 slice of the sphere.
Notice, algebra veries the conclusions we reached via the implicit function theorem:
=
2
=
2
=
2
When we are at zero for one of the coordinates then we cannot choose + or since we need both on
an open ball intersected with the sphere centered at such a point
6
. Remember, when I talk about
local solutions I mean solutions which exist over the intersection of the solution set and an open
ball in the ambient space (
3
in this context). The preceding example is the natural extension of
the unit-circle example to
3
. A similar result is available for the -sphere in
+
3
= 2. Can we solve this equation for
= (, ) near (0, 0, 1)? Let (, , ) =
+
3
and note (0, 0, 1) =
0
+1+0 = 2 hence
(0, 0, 1) is a point on the solution set (, , ) = 2. Note is clearly continuously dierentiable
and
(, , ) = 3
2
(0, 0, 1) = 3 = 0
therefore, there exists a continuously dierentiable function : ()
2
which solves
(, , (, )) = 2 for (, ) near (0, 0) and (0, 0) = 1.
Ill not attempt an explicit solution for the last example.
Example 4.2.5. Let (, , ) i + + = 2 and + = 1. Problem: For which
variable(s) can we solve? Solution: dene (, , ) = ( + + , + ) we wish to study
(, , ) = (2, 1). Notice the solution set is not empty since (1, 0, 1) = (1 +0 +1, 0 +1) = (2, 1)
Moreover, is continuously dierentiable. In this case we have two equations and three unknowns
so we expect two variables can be written in terms of the remaining free variable. Lets examine
the derivative of :
(, , ) =
_
1 1 1
0 1 1
_
6
if you consider (, , ) =
2
as a space then the open sets on the space are taken to be the intersection with
the space and open balls in
3
. This is called the subspace topology in topology courses.
114 CHAPTER 4. INVERSE AND IMPLICIT FUNCTION THEOREMS
Suppose we wish to solve = () and = () then we should check invertiblility of
(, )
=
_
1 1
0 1
_
.
The matrix above is invertible hence the implicit function theorem applies and we can solve for
and as functions of . On the other hand, if we tried to solve for = () and = () then
well get no help from the implicit function theorem as the matrix
(, )
=
_
1 1
1 1
_
.
is not invertible. Geometrically, we can understand these results from noting that (, , ) = (2, 1)
is the intersection of the plane + + = 2 and + = 1. Subsituting + = 1 into + + = 2
yields +1 = 2 hence = 1 on the line of intersection. We can hardly use as a free variable for
the solution when the problem xes from the outset.
The method I just used to analyze the equations in the preceding example was a bit adhoc. In
linear algebra we do much better for systems of linear equations. A procedure called Gaussian
elimination naturally reduces a system of equations to a form in which it is manifestly obvious how
to eliminate redundant variables in terms of a minimal set of basic free variables. The of the
implicit function proof discussions plays the role of the so-called pivotal variables whereas the
plays the role of the remaining free variables. These variables are generally intermingled in
the list of total variables so to reproduce the pattern assumed for the implicit function theorem we
would need to relable variables from the outset of a calculation. The calculations in the examples
that follow are not usually possible. Linear equations are particularly nice and basically what Im
doing is following the guide of the linearization derivation in the context of specic examples.
Example 4.2.6. XXX
Example 4.2.7. XXX
Example 4.2.8. XXX
4.3. IMPLICIT DIFFERENTIATION 115
4.3 implicit dierentiation
Enough theory, lets calculate. In this section I apply previous theoretical constructions to specic
problems. I also introduce standard notation for constrained partial dierentiation which is
also sometimes called partial dierentiation with a side condition. The typical problem is the
following: given equations:
1
(
1
, . . . ,
,
1
, . . . ,
) =
1
2
(
1
, . . . ,
,
1
, . . . ,
) =
2
.
.
.
(
1
, . . . ,
,
1
, . . . ,
) =
calculate partial derivative of dependent variables with respect to independent variables. Contin-
uing with the notation of the implicit function discussion well assume that will be dependent
on . I want to recast some of our arguments via dierentials
7
. Take the total dierential of each
equation above,
1
(
1
, . . . ,
,
1
, . . . ,
) = 0
2
(
1
, . . . ,
,
1
, . . . ,
) = 0
.
.
.
(
1
, . . . ,
,
1
, . . . ,
) = 0
Hence,
1
+ +
1
+ +
= 0
1
+ +
1
+ +
= 0
.
.
.
1
+ +
1
+ +
= 0
Notice, this can be nicely written in column vector notation as:
1
+ +
1
+ +
= 0
Or, in matrix notation:
[
1
.
.
.
+ [
1
.
.
.
= 0
7
in contrast, In the previous section we mostly used derivative notation
116 CHAPTER 4. INVERSE AND IMPLICIT FUNCTION THEOREMS
Finally, solve for , we assume [
]
1
exists,
1
.
.
.
= [
]
1
[
1
.
.
.
= 0
We can solve for , or provided
or
&
&
&
In each case above, the implicit function theorem allows us to solve for one variable in terms of the
remaining two. If the partial derivative of in the denominator are zero then the implicit function
theorem does not apply and other thoughts are required. Often calculus text give the following as a
homework problem:
= 1.
In the equation above we have appear as a dependent variable on , and also as an independent
variable for the dependent variable . These mixed expressions are actually of interest to engineering
and physics. The less mbiguous notation below helps better handle such expressions:
_
= 1.
In each part of the expression we have clearly denoted which variables are taken to depend on the
others and in turn what sort of partial derivative we mean to indicate. Partial derivatives are not
taken alone, they must be done in concert with an understanding of the totality of the indpendent
variables for the problem. We hold all the remaining indpendent variables xed as we take a partial
derivative.
4.3. IMPLICIT DIFFERENTIATION 117
The explicit independent variable notation is more important for problems where we can choose
more than one set of indpendent variables for a given dependent variables. In the example that
follows we study = (, ) but we could just as well consider = (, ). Generally it will not
be the case that
_
is the same as
_
. In calculation of
_
we hold constant as we
vary whereas in
_
_
=
_
(2 2) + 2
_
Use Kramers rule, multiplication by inverse, substitution, adding/subtracting equations etc... what-
ever technique of solving linear equations you prefer. Our goal is to solve for and in terms
of and . Ill use Kramers rule this time:
=
_
1
(2 2) + 2 3
2
_
_
1 1
2 3
2
_ =
3
2
( ) + (2 2) 2
3
2
+ 2
Collecting terms,
=
_
3
2
+ 2 2
3
2
+ 2
_
+
_
3
2
2
3
2
+ 2
_
=
3
2
+ 2 2
3
2
+ 2
&
_
=
3
2
2
3
2
+ 2
The notation above indicates that is understood to be a function of independent variables , .
_
means we take the derivative of with respect to while holding xed. The appearance
8
a good exercise would be to do the example over but instead aim to calculate partial derivatives for , with
respect to independent variables ,
118 CHAPTER 4. INVERSE AND IMPLICIT FUNCTION THEOREMS
of the dependent variable can be removed by using the equations (, , , ) = (3, 5). Similar
ambiguities exist for implicit dierentiation in calculus I. Apply Kramers rule once more to solve
for :
=
_
1
2 (2 2) + 2
_
_
1 1
2 3
2
_ =
(2 2) + 2 2( +)
3
2
+ 2
Collecting terms,
=
_
2 + 2 2
3
2
+ 2
_
+
_
2 2
3
2
+ 2
_
=
2 + 2 2
3
2
+ 2
&
_
=
2 2
3
2
+ 2
You should ask: where did we use the implicit function theorem in the preceding example? Notice
our underlying hope is that we can solve for = (, ) and = (, ). The implicit function
theorem states this is possible precisely when
(,)
=
_
1 1
2 3
2
_
is non singular. Interestingly
this is the same matrix we must consider to isolate and . The calculations of the example
are only meaningful if the
_
1 1
2 3
2
_
= 0. In such a case the implicit function theorem
applies and it is reasonable to suppose , can be written as functions of , .
Example 4.3.3. Suppose the temperature in a room is given by (, , ) = 70+10(
2
2
).
Find how the temperature varies on a sphere
2
+
2
+
2
=
2
. We can choose any one
variable from (, , ) and write it as a function of the remaining two on the sphere. However, we
do need to a
Chapter 5
geometry of level sets
Our goal in this chapter is to develop a few tools to analyze the geometry of solution sets to equa-
tion(s) in
. These solution sets are commonly called level sets. I assume the reader is already
familiar with the concept of level curves and surfaces from multivariate calculus. We go much fur-
ther in this chapter. Our goal is to describe the tangent and normal spaces for a -dimensional level
set in
. The dimension of the level set is revealed by its tangent space and we discuss conditions
which are sucient to insure the invariance of this dimension over the entirety of the level set. In
contrast, the dimension of the normal space to a -dimensional level set in
is . The theory
of orthogonal complements is borrowed from linear algebra to help understand how all of this ts
together at a given point on the level set. Finally, we use this geometry and a few simple lemmas
to justify the method of Lagrange multipliers. Lagranges technique and the theory of multivariate
Taylor polynomials form the basis for analyzing extrema for multivariate functions. In short, this
chapter deals with the question of extrema on the edges of a set whereas the next chapter deals
with the interior point via the theory of quadratic forms applied to the second-order approximation
to a function of several variables. Finally, we should mention that -dimensional level sets provide
examples of -dimensional manifolds, however, we defer careful discussion of manifolds for a later
chapter.
5.1 denition of level set
A level set is the solution set of some equation or system of equations. We conne our interest to
level sets of
and suppose
= {
() has linearly
independent rows at each .
The condition of linear independence of the rows is give to eliminate possible redundancy in the
system of equations. In the case that = 1 the criteria reduces to the conditon level function has
2
and suppose =
1
{0}. Calculate,
(, , ) = [2, 2, 2]
Notice that (0, 0, 0) and
(, , ) =
2
22
. We clearly have rank two at all points in
hence is a 3 2 = 1-dimensional level set. Perhaps you realize is the vertical line which passes
through (, , 0) in the -plane.
5.2. TANGENTS AND NORMALS TO A LEVEL SET 121
5.2 tangents and normals to a level set
There are many ways to dene a tangent space for some subset of
.
Throughout this section we assume that is a -dimensional level set dened by :
where
1
() = . This means that we can apply the implicit function theorem to and for
any given point = (
) where
and
such that (
) =
) then it follows = (
)
over the subset () of . More explicitly, for all such that () () we have
() = (
(), (
, (
() = (
(),
())
(0) = (
(0),
(0) = = (
)
then = (
). The second component of the vector is not free of the rst, it essentially
redundant. This makes us suspect that the tangent space to at is -dimensional.
Theorem 5.2.1.
Let :
1
(0) =
1
and
2
(0) =
2
then there exists a dierentiable curve : such that
(0) =
1
+
2
and
(0) = . Moreover, there exists a dierentiable curve : such that
(0) =
1
and
(0) = .
Proof: It is convenient to dene a map which gives a local parametrization of at . Since
we have a description of locally as a graph = () (near ) it is simple to construct the
parameterization. Dene :
+) = (
+, (
+))
is a curve from to such that (0) = (
, (
)) = (
(0) = (,
)).
The construction above shows that any vector of the form (
(0) =
1
+
2
and (0) = .
122 CHAPTER 5. GEOMETRY OF LEVEL SETS
Likewise, apply the construction to the case =
1
to write () = (
1
() + (
1
)) with
(0) =
1
and (0) = .
The idea of the proof is encapsulated in the picture below. This idea of mapping lines in a at
domain to obtain standard curves in a curved domain is an idea which plays over and over as you
study manifold theory. The particular redundancy of the and sub-vectors is special to the
discussion level-sets, however anytime we have a local parametrization well be able to construct
curves with tangents of our choosing by essentially the same construction. In fact, there are in-
nitely many curves which produce a particular tangent vector in the tangent space of a manifold.
XXX - read this section again for improper, premature use of the term manifold
Theorem 5.2.1 shows that the denition given below is logical. In particular, it is not at all obvious
that the sum of two tangent vectors ought to again be a tangent vector. However, that is just what
the Theorem 5.2.1 told us for level-sets
1
.
Denition 5.2.2.
Suppose is a -dimensional level-set dened by =
1
{} for :
. We
dene the tangent space at to be the set of pairs:
(0)}
Moreover, we dene (i.) addition and (ii.) scalar multiplication of vectors by the rules
(.) (,
1
) + (,
2
) = (,
1
+
2
) (.) (,
1
) = (,
1
)
for all (,
1
), (,
2
)
and .
When I picture
) = { + (, )
}. I often picture
as (
)
2
1
technically, there is another logical gap which I currently ignore. I wonder if you can nd it.
2
In truth, as you continue to study manifold theory youll nd at least three seemingly distinct objects which are
all called tangent vectors; equivalence classes of curves, derivations, contravariant tensors.
5.2. TANGENTS AND NORMALS TO A LEVEL SET 123
We could set out to calculate tangent spaces in view of the denition above, but we are actually
interested in more than just the tangent space for a level-set. In particular. we want a concrete
description of all the vectors which are not in the tangent space.
Denition 5.2.3.
Suppose is a -dimensional level-set dened by =
1
{} for :
and
where
= {}
is given the
natural vector space structure which we already exhibited on the subspace
. We dene
the inner product on
,
(, ) (, ) = .
The length of a vector (, ) is naturally dened by (, ) = . Moreover, we say two
vectors (, ), (, )
we
dene the orthogonal complement by
= {(, )
(, ) (, ) for all (, ) }.
Suppose
1
,
2
then we say
1
is orthogonal to
2
i
1
2
= 0 for all
1
1
and
2
2
. We denote orthogonality by writing
1
2
. If every
can be written
as =
1
+
2
for a pair of
1
1
and
2
2
where
1
2
then we say that
is
the direct sum of
1
and
2
which is denoted by
=
1
2
.
There is much more to say about orthogonality, however, our focus is not in that vein. We just
need the langauge to properly dene the normal space. The calculation below is probably the most
important calculation to understand for a level-set. Suppose we have a curve : where
=
1
() is a -dimensional level-set in
(())
() = 0.
In particular, suppose for = 0 we have (0) = and =
with
() = 0.
Recall :
has an derivative matrix where the -th row is the gradient vector
of the -th component function. The equation
() for
= 1, 2, . . . , . We have derived the following theorem:
Theorem 5.2.4.
Let :
1
() = . The gradient vectors
(, (
())
) (
.
124 CHAPTER 5. GEOMETRY OF LEVEL SETS
Its time to do some counting. Observe that the mapping :
dened by () = (, )
is an isomorphism of vector spaces hence (
= (
) hence (
is found to be + = . Thus (
())
=1
forms just such a basis since it is given
to be linearly independent by the (
())
where equality can be obtained by the slightly tedious equation (
= ((
()
)) . That
equation simply does the following:
1. transpose
()
many wiser authors wouldnt bother. The comments above are primarily about notation. Certainly
hiding these details would make this section prettier, however, would it make it better? Finally, I
once more refer the reader to linear algebra where we learn that (())
= (
). Let me
walk you through the proof: let
. Observe (
) i
= 0 for
= 0 i
() = 0 for = 1, 2, . . . , i
() = 0 for = 1, 2, . . . , i ()
.
Another useful identity for the perp is that (
())
())
= (
()
)
Let me once more replace by a more tedious, but explicit, procedure:
= ((
()
))
Theorem 5.2.5.
Let :
1
() = . The tangent space
= {} (
()
) &
= {} (
()
).
Moreover,
= 0 and (0) = (2, 2, 1, 0). By assumption (()) = 0 since () for all . Dene
= 0 at = 0,
()
_
=
_
_
(())
2
be dened by (, , , ) = ( +
2
+
2
2, +
2
+
2
2). In
this case (, , , ) = (0, 0) gives a two-dimensional manifold in
4
lets call it . Notice that
1
= 0 gives +
2
+
2
= 2 and
2
= 0 gives +
2
+
2
= 2 thus = 0 gives the intersection of
both of these three dimensional manifolds in
4
(no I cant see it either). Note,
1
=< 2, 2, 1, 0 >
2
=< 0, 2, 1, 2 >
It turns out that the inverse mapping theorem says = 0 describes a manifold of dimension 2 if
the gradient vectors above form a linearly independent set of vectors. For the example considered
here the gradient vectors are linearly dependent at the origin since
1
(0) =
2
(0) = (0, 0, 1, 0).
In fact, these gradient vectors are colinear along along the plane = = 0 since
1
(0, , , 0) =
2
(0, , , 0) =< 0, 2, 1, 0 >. We again seek to contrast the tangent plane and its normal at
some particular point. Choose (1, 1, 0, 1) which is in since (1, 1, 0, 1) = (0 + 1 + 1 2, 0 +
1 + 1 2) = (0, 0). Suppose that : is a path in which has (0) = (1, 1, 0, 1) whereas
(0) =
1
((0)) < , , , >= 0 < 2, 2, 1, 0 > < , , , >= 0
(
2
)
(0) =
2
((0)) < , , , >= 0 < 0, 2, 1, 1 > < , , , >= 0
This is two equations and four unknowns, we can solve it and write the vector in terms of two free
variables correspondant to the fact the tangent space is two-dimensional. Perhaps its easier to use
126 CHAPTER 5. GEOMETRY OF LEVEL SETS
matrix techiques to organize the calculation:
_
2 2 1 0
0 2 1 1
_
=
_
0
0
_
We calculate,
_
2 2 1 0
0 2 1 1
_
=
_
1 0 0 1/2
0 1 1/2 1/2
_
. Its natural to chose , as free vari-
ables then we can read that = /2 and = /2 /2 hence
< , , , >=< /2, /2 /2, , >=
2
< 0, 1, 2, 0 > +
2
< 1, 1, 0, 2 >
We can see a basis for the tangent space. In fact, I can give parametric equations for the tangent
space as follows:
(, ) = (1, 1, 0, 1) + < 0, 1, 2, 0 > + < 1, 1, 0, 2 >
Not surprisingly the basis vectors of the tangent space are perpendicular to the gradient vectors
1
(1, 1, 0, 1) =< 2, 2, 1, 0 > and
2
(1, 1, 0, 1) =< 0, 2, 1, 1 > which span the normal plane
is orthogonal to
. In summary
and
=
4
. This is just a fancy way of saying that the normal and the tangent
plane only intersect at zero and they together span the entire ambient space.
5.3 method of Lagrange mulitpliers
Let us begin with a statement of the problem we wish to solve.
Problem: given an objective function :
()) =
then the constraint surface () = will form an ( )-dimensional level set. Let us make that
supposition throughout the remainder of this section.
In order to solve a problem it is sometimes helpful to nd necessary conditions by assuming an
answer exists. Let us do that here. Suppose
) on =
1
{}.
This means there exists an open ball around
for which (
, say :
with (0) =
(0) = 0
Let us expand a bit on both of these conditions:
1.
(0) = 0
2.
(0) = 0
The rst of these conditions places
(0)
) =
()(
is orthogonal to
can be written as a linear combination of the gradient vectors. In particular, this means
there exist constants
1
,
2
, . . . ,
such that
()(
=
1
(
1
)(
+
2
(
2
)(
+ +
)(
1
+
2
2
+ +
.
Well examine a few examples before I reveal a sucient condition. Well also see how absence of
that sucient condition does allow the method to fail.
Example 5.3.1. Suppose we wish to nd maximum and minimum distance to the origin for points
on the curve
2
2
= 1. In this case we can use the distance-squared function as our objective
(, ) =
2
+
2
and the single constraint function is (, ) =
2
2
. Observe that =<
2, 2 > whereas =< 2, 2 >. We seek solutions of = which gives us < 2, 2 >=
< 2, 2 >. Hence 2 = 2 and 2 = 2. We must solve these equations subject to the
condition
2
2
= 1. Observe that = 0 is not a solution since 0
2
= 1 has no real solution.
On the other hand, = 0 does t the constraint and
2
0 = 1 has solutions = 1. Consider
then
2 = 2 and 2 = 2 (1 ) = 0 and (1 +) = 0
Since = 0 on the constraint curve it follows that 1 = 0 hence = 1 and we learn that
(1 +1) = 0 hence = 0. Consequently, (1, 0 and (1, 0) are the two point where we expect to nd
extreme-values of . In this case, the method of Lagrange multipliers served its purpose, as you
can see in the graph. Below the green curves are level curves of the objective function whereas the
particular red curve is the given constraint curve.
128 CHAPTER 5. GEOMETRY OF LEVEL SETS
The picture below is a screen-shot of the Java applet created by David Lippman and Konrad
Polthier to explore 2D and 3D graphs. Especially nice is the feature of adding vector elds to given
objects, many other plotters require much more eort for similar visualization. See more at the
website: http://dlippman.imathas.com/g1/GrapherLaunch.html.
Note how the gradient vectors to the objective function and constraint function line-up nicely at
those points.
In the previous example, we actually got lucky. There are examples of this sort where we could get
false maxima due to the nature of the constraint function.
Example 5.3.2. Suppose we wish to nd the points on the unit circle (, ) =
2
+
2
= 1 which
give extreme values for the objective function (, ) =
2
2
. Apply the method of Lagrange
multipliers and seek solutions to = :
< 2, 2 >= < 2, 2 >
We must solve 2 = 2 which is better cast as (1) = 0 and 2 = 2 which is nicely written
as (1 +) = 0. On the basis of these equations alone we have several options:
1. if = 1 then (1 + 1) = 0 hence = 0
5.3. METHOD OF LAGRANGE MULITPLIERS 129
2. if = 1 then (1 (1)) = 0 hence = 0
But, we also must t the constraint
2
+
2
= 1 hence we nd four solutions:
1. if = 1 then = 0 thus
2
= 1 = 1 (1, 0)
2. if = 1 then = 0 thus
2
= 1 = 1 (0, 1)
We test the objective function at these points to ascertain which type of extrema weve located:
(0, 1) = 0
2
(1)
2
= 1 & (1, 0) = (1)
2
0
2
= 1
When constrained to the unit circle we nd the objective function attains a maximum value of 1 at
the points (1, 0) and (1, 0) and a minimum value of 1 at (0, 1) and (0, 1). Lets illustrate the
answers as well as a few non-answers to get perspective. Below the green curves are level curves of
the objective function whereas the particular red curve is the given constraint curve.
The success of the last example was no accident. The fact that the constraint curve was a circle
which is a closed and bounded subset of
2
means that is is a compact subset of
2
. A well-known
theorem of analysis states that any real-valued continuous function on a compact domain attains
both maximum and minimum values. The objective function is continuous and the domain is
compact hence the theorem applies and the method of Lagrange multipliers succeeds. In contrast,
the constraint curve of the preceding example was a hyperbola which is not compact. We have
no assurance of the existence of any extrema. Indeed, we only found minima but no maxima in
Example 5.3.1.
The generality of the method of Lagrange multipliers is naturally limited to smooth constraint
curves and smooth objective functions. We must insist the gradient vectors exist at all points of
inquiry. Otherwise, the method breaks down. If we had a constraint curve which has sharp corners
then the method of Lagrange breaks down at those corners. In addition, if there are points of dis-
continuity in the constraint then the method need not apply. This is not terribly surprising, even in
calculus I the main attack to analyze extrema of function on assumed continuity, dierentiability
and sometimes twice dierentiability. Points of discontinuity require special attention in whatever
context you meet them.
At this point it is doubtless the case that some of you are, to misquote an ex-student of mine, not-
impressed. Perhaps the following examples better illustrate the dangers of non-compact constraint
curves.
130 CHAPTER 5. GEOMETRY OF LEVEL SETS
Example 5.3.3. Suppose we wish to nd extrema of (, ) = when constrained to = 1.
Identify (, ) = = 1 and apply the method of Lagrange multipliers and seek solutions to
= :
< 1, 0 >= < , > 1 = and 0 =
If = 0 then 1 = is impossible to solve hence = 0 and we nd = 0. But, if = 0 then
= 1 is not solvable. Therefore, we nd no solutions. Well, I suppose we have succeeded here
in a way. We just learned there is no extreme value of on the hyperbola = 1. Below the
green curves are level curves of the objective function whereas the particular red curve is the given
constraint curve.
Example 5.3.4. Suppose we wish to nd extrema of (, ) = when constrained to
2
2
= 1.
Identify (, ) =
2
2
= 1 and apply the method of Lagrange multipliers and seek solutions to
= :
< 1, 0 >= < 2, 2 > 1 = 2 and 0 = 2
If = 0 then 1 = 2 is impossible to solve hence = 0 and we nd = 0. If = 0 and
2
2
= 1
then we must solve
2
= 1 whence = 1. We are tempted to conclude that:
1. the objective function (, ) = attains a maximum on
2
2
= 1 at (1, 0) since (1, 0) = 1
2. the objective function (, ) = attains a minimum on
2
2
= 1 at (1, 0) since (1, 0) =
1
But, both conclusions are false. Note
2
2
1
2
= 1 hence (
2, 1) =
2 and (
2, 1) =
?.
6.1.1 taylors polynomial for one-variable
If : is analytic at
) +
)(
) +
1
2
)(
)
2
+ =
=0
()
(
)
!
(
() =
_
=0
1
!
( )
()
_
=
=
133
134 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
I remind the reader that a function is called entire if it is analytic on all of , for example
, cos()
and sin() are all entire. In particular, you should know that:
= 1 + +
1
2
2
+ =
=0
1
!
cos() = 1
1
2
2
+
1
4!
4
=
=0
(1)
(2)!
2
sin() =
1
3!
3
+
1
5!
5
=
=0
(1)
(2 + 1)!
2+1
Since
2
+
1
4!
4
=
=0
1
(2)!
2
sinh() = +
1
3!
3
+
1
5!
5
=
=0
1
(2 + 1)!
2+1
The geometric series is often useful, for , with < 1 it is known
+ +
2
+ =
=0
=
1
This generates a whole host of examples, for instance:
1
1 +
2
= 1
2
+
4
6
+
1
1
3
= 1 +
3
+
6
+
9
+
3
1 2
=
3
(1 + 2 + (2)
2
+ ) =
3
+ 2
4
+ 4
5
+
Moreover, the term-by-term integration and dierentiation theorems yield additional results in
conjuction with the geometric series:
tan
1
() =
1 +
2
=
=0
(1)
2
=
=0
(1)
2 + 1
2+1
=
1
3
3
+
1
5
5
+
ln(1 ) =
ln(1 ) =
1
1
=
=0
=0
1
+ 1
+1
6.1. MULTIVARIATE POWER SERIES 135
Of course, these are just the basic building blocks. We also can twist things and make the student
use algebra,
+2
=
2
=
2
(1 + +
1
2
2
+ )
or trigonmetric identities,
sin() = sin( 2 + 2) = sin( 2) cos(2) + cos( 2) sin(2)
sin() = cos(2)
=0
(1)
(2 + 1)!
( 2)
2+1
+ sin(2)
=0
(1)
(2)!
( 2)
2
.
Feel free to peruse my most recent calculus II materials to see a host of similarly sneaky calculations.
6.1.2 taylors multinomial for two-variables
Suppose we wish to nd the taylor polynomial centered at (0, 0) for (, ) =
sin(). It is a
simple as this:
(, ) =
_
1 + +
1
2
2
+
__
1
6
3
+
_
= + +
1
2
2
1
6
3
+
the resulting expression is called a multinomial since it is a polynomial in multiple variables. If
all functions (, ) could be written as (, ) = ()() then multiplication of series known
from calculus II would often suce. However, many functions do not possess this very special
form. For example, how should we expand (, ) = cos() about (0, 0)?. We need to derive the
two-dimensional Taylors theorem.
We already know Taylors theorem for functions on ,
() = () +
()( ) +
1
2
()( )
2
+ +
1
!
()
()( )
and... If the remainder term vanishes as then the function is represented by the Taylor
series given above and we write:
() =
=0
1
!
()
()( )
.
Consider the function of two variables :
2
which is smooth with smooth partial
derivatives of all orders. Furthermore, let (, ) and construct a line through (, ) with
direction vector (
1
,
2
) as usual:
() = (, ) +(
1
,
2
) = ( +
1
, +
2
)
for . Note (0) = (, ) and
() = (
1
,
2
) =
(0). Construct =
: and
choose () such that () for (). This function is a real-valued function of a
136 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
real variable and we will be able to apply Taylors theorem from calculus II on . However, to
dierentiate well need tools from calculus III to sort out the derivatives. In particular, as we
dierentiate , note we use the chain rule for functions of several variables:
() = (
)
() =
(())
()
= (()) (
1
,
2
)
=
1
( +
1
, +
2
) +
2
( +
1
, +
2
)
Note
(0) =
1
(, )+
2
() =
1
( +
1
, +
2
) +
2
( +
1
, +
2
)
=
1
(()) (
1
,
2
) +
2
(()) (
1
,
2
)
=
2
1
+
1
+
2
+
2
2
=
2
1
+ 2
1
+
2
2
(0) =
2
1
(, ) +2
1
(, ) +
2
2
(, ). We
may construct the Taylor series for up to quadratic terms:
(0 +) = (0) +
(0) +
1
2
(0) +
= (, ) +[
1
(, ) +
2
(, )] +
2
2
_
2
1
(, ) + 2
1
(, ) +
2
2
(, )
+
Note that () = ( +
1
, +
2
) hence (1) = ( +
1
, +
2
) and consequently,
( +
1
, +
2
) = (, ) +
1
(, ) +
2
(, )+
+
1
2
_
2
1
(, ) + 2
1
(, ) +
2
2
(, )
_
+
Omitting point dependence on the 2
derivatives,
( +
1
, +
2
) = (, ) +
1
(, ) +
2
(, ) +
1
2
_
2
1
+ 2
1
+
2
2
+
Sometimes wed rather have an expansion about (, ). To obtain that formula simply substitute
=
1
and =
2
. Note that the point (, ) is xed in this discussion so the derivatives
are not modied in this substitution,
(, ) = (, ) + ( )
(, ) + ( )
(, )+
+
1
2
_
( )
2
(, ) + 2( )( )
(, ) + ( )
2
(, )
_
+
At this point we ought to recognize the rst three terms give the tangent plane to = (, ) at
(, , (, )). The higher order terms are nonlinear corrections to the linearization, these quadratic
6.1. MULTIVARIATE POWER SERIES 137
terms form a quadratic form. If we computed third, fourth or higher order terms we will nd that,
using =
1
and =
2
as well as =
1
and =
2
,
(, ) =
=0
2
1
=0
2
2
=0
2
=0
1
!
()
(
1
,
2
)
1
)(
2
) (
)
Example 6.1.1. Expand (, ) = cos() about (0, 0). We calculate derivatives,
= sin()
= sin()
=
2
cos()
= sin() cos()
=
2
cos()
=
3
sin()
= cos() cos() +
2
sin()
= cos() cos() +
2
sin()
=
3
sin()
Next, evaluate at = 0 and = 0 to nd (, ) = 1 + to third order in , about (0, 0). We
can understand why these derivatives are all zero by approaching the expansion a dierent route:
simply expand cosine directly in the variable (),
(, ) = 1
1
2
()
2
+
1
4!
()
4
+ = 1
1
2
2
+
1
4!
4
+ .
Apparently the given function only has nontrivial derivatives at (0, 0) at orders 0, 4, 8, .... We can
deduce that
be dened by () = + where = (
1
,
2
, . . . ,
) gives the
direction of the line and clearly
1
+
2
2
+ +
=1
() = (())
() = (()) =
=1
)(())
If we omit the explicit dependence on () then we nd the simple formula
() =
=1
.
Dierentiate a second time,
() =
=1
(())
_
=
=1
_
_
_
(())
_
=
=1
_
(())
()
Omitting the () dependence and once more using
() = we nd
() =
=1
Recall that =
=1
() =
=1
=1
_
=
=1
=1
()
() =
1
=1
2
=1
=1
More explicitly,
()
() =
1
=1
2
=1
=1
)(())
Hence, by Taylors theorem, provided we are suciently close to = 0 as to bound the remainder
1
() =
=0
1
!
_
1
=1
2
=1
=1
)(())
_
1
there exist smooth examples for which no neighborhood is small enough, the bump function in one-variable has
higher-dimensional analogues, we focus our attention to functions for which it is possible for the series below to
converge
6.1. MULTIVARIATE POWER SERIES 139
Recall that () = (()) = ( +). Put
2
= 1 and bring in the
1
!
to derive
( +) =
=0
1
=1
2
=1
=1
1
!
_
_
()
.
Naturally, we sometimes prefer to write the series expansion about as an expresssion in = +.
With this substitution we have = and
= ( )
thus
() =
=0
1
=1
2
=1
=1
1
!
_
_
() (
1
)(
2
) (
).
Example 6.1.2. Suppose :
3
lets unravel the Taylor series centered at (0, 0, 0) from the
general formula boxed above. Utilize the notation =
1
, =
2
and =
3
in this example.
() =
=0
3
1
=1
3
2
=1
3
=1
1
!
_
_
(0)
.
The terms to order 2 are as follows:
() = (0) +
(0) +
(0) +
(0)
+
1
2
_
(0)
2
+
(0)
2
+
(0)
2
+
+
(0) +
(0) +
(0) +
(0) +
(0) +
(0)
_
+
Partial derivatives commute for smooth functions hence,
() = (0) +
(0) +
(0) +
(0)
+
1
2
_
(0)
2
+
(0)
2
+
(0)
2
+ 2
(0) + 2
(0) + 2
(0)
_
+
1
3!
_
(0)
3
+
(0)
3
+
(0)
3
+ 3
(0)
2
+ 3
(0)
2
+3
(0)
2
+ 3
(0)
2
+ 3
(0)
2
+ 3
(0)
2
+ 6
(0)
_
+
Example 6.1.3. Suppose (, , ) =
= ()
2
= ()
2
= ()
2
+
2
+
2
+
2
2
if = 1 is not in the domain of then we should rescale the vector so that = 1 places (1) in (), if is
smooth on some neighborhood of then this is possible
140 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
Evaluating at = 0, = 1 and = 2,
(0, 1, 2) = 2
(0, 1, 2) = 0
(0, 1, 2) = 0
(0, 1, 2) = 4
(0, 1, 2) = 0
(0, 1, 2) = 0
(0, 1, 2) = 2
(0, 1, 2) = 0
(0, 1, 2) = 1
Hence, as (0, 1, 2) =
0
= 1 we nd
(, , ) = 1 + 2 + 2
2
+ 2( 1) + 2( 2) +
Another way to calculate this expansion is to make use of the adding zero trick,
(, , ) =
(1+1)(2+2)
= 1 +( 1 + 1)( 2 + 2) +
1
2
_
( 1 + 1)( 2 + 2)
2
+
Keeping only terms with two or less of , ( 1) and ( 2) variables,
(, , ) = 1 + 2 +( 1)(2) +(1)( 2) +
1
2
2
(1)
2
(2)
2
+
Which simplies once more to (, , ) = 1 + 2 + 2( 1) +( 2) + 2
2
+ .
6.2. A BRIEF INTRODUCTION TO THE THEORY OF QUADRATIC FORMS 141
6.2 a brief introduction to the theory of quadratic forms
Denition 6.2.1.
Generally, a quadratic form is a function :
for all
where
such that
= . In particular, if = (, )
and =
_
_
then
() =
=
2
+ + +
2
=
2
+ 2 +
2
.
The = 3 case is similar,denote = [
] and = (, , ) so that
() =
=
11
2
+ 2
12
+ 2
13
+
22
2
+ 2
23
+
33
2
.
Generally, if [
]
and = [
=1
<
2
.
In case you wondering, yes you could write a given quadratic form with a dierent matrix which
is not symmetric, but we will nd it convenient to insist that our matrix is symmetric since that
choice is always possible for a given quadratic form.
It is at times useful to use the dot-product to express a given quadratic form:
= () = () =
Some texts actually use the middle equality above to dene a symmetric matrix.
Example 6.2.2.
2
2
+ 2 + 2
2
=
_
_
2 1
1 2
_ _
_
Example 6.2.3.
2
2
+ 2 + 3 2
2
2
=
_
2 1 3/2
1 2 0
3/2 0 1
Proposition 6.2.4.
The values of a quadratic form on
= 1}. In particular, () =
2
( ) where
=
1
.
142 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
Proof: Let () =
. Notice that we can write any nonzero vector as the product of its
magnitude and its direction =
1
,
() = ( ) = ( )
=
2
=
2
( ).
Therefore () is simply proportional to ( ) with proportionality constant
2
.
The proposition above is very interesting. It says that if we know how works on unit-vectors then
we can extrapolate its action on the remainder of
and we denote
{0}
1.(negative denite) (
) < 0 i (
1
) < 0
2.(positive denite) (
) > 0 i (
1
) > 0
3.(non-denite) (
) = {0} i (
1
) has both positive and negative values.
Before I get too carried away with the theory lets look at a couple examples.
Example 6.2.6. Consider the quadric form (, ) =
2
+
2
. You can check for yourself that
= (, ) is a cone and has positive outputs for all inputs except (0, 0). Notice that () =
2
so it is clear that (
1
) = 1. We nd agreement with the preceding proposition. Next, think about
the application of (, ) to level curves;
2
+
2
= is simply a circle of radius
or just the
origin. Heres a graph of = (, ):
Notice that (0, 0) = 0 is the absolute minimum for . Finally, lets take a moment to write
(, ) = [, ]
_
1 0
0 1
_ _
_
in this case the matrix is diagonal and we note that the e-values are
1
=
2
= 1.
6.2. A BRIEF INTRODUCTION TO THE THEORY OF QUADRATIC FORMS 143
Example 6.2.7. Consider the quadric form (, ) =
2
2
2
. You can check for yourself
that = (, ) is a hyperboloid and has non-denite outputs since sometimes the
2
term
dominates whereas other points have 2
2
as the dominent term. Notice that (1, 0) = 1 whereas
(0, 1) = 2 hence we nd (
1
) contains both positive and negative values and consequently we
nd agreement with the preceding proposition. Next, think about the application of (, ) to level
curves;
2
2
2
= yields either hyperbolas which open vertically ( > 0) or horizontally ( < 0)
or a pair of lines =
2
in the = 0 case. Heres a graph of = (, ):
The origin is a saddle point. Finally, lets take a moment to write (, ) = [, ]
_
1 0
0 2
_ _
_
in this case the matrix is diagonal and we note that the e-values are
1
= 1 and
2
= 2.
Example 6.2.8. Consider the quadric form (, ) = 3
2
. You can check for yourself that =
(, ) is parabola-shaped trough along the -axis. In this case has positive outputs for all inputs
except (0, ), we would call this form positive semi-denite. A short calculation reveals that
(
1
) = [0, 3] thus we again nd agreement with the preceding proposition (case 3). Next, think
about the application of (, ) to level curves; 3
2
= is a pair of vertical lines: =
/3 or
just the -axis. Heres a graph of = (, ):
144 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
Finally, lets take a moment to write (, ) = [, ]
_
3 0
0 0
_ _
_
in this case the matrix is
diagonal and we note that the e-values are
1
= 3 and
2
= 0.
Example 6.2.9. Consider the quadric form (, , ) =
2
+2
2
+3
2
. Think about the application
of (, , ) to level surfaces;
2
+ 2
2
+ 3
2
= is an ellipsoid. I cant graph a function of three
variables, however, we can look at level surfaces of the function. I use Mathematica to plot several
below:
Finally, lets take a moment to write (, , ) = [, , ]
1 0 0
0 2 0
0 0 3
_
in this case the matrix
is diagonal and we note that the e-values are
1
= 1 and
2
= 2 and
3
= 3.
6.2.1 diagonalizing forms via eigenvectors
The examples given thus far are the simplest cases. We dont really need linear algebra to un-
derstand them. In contrast, e-vectors and e-values will prove a useful tool to unravel the later
examples
3
Denition 6.2.10.
Let
. If
1
is nonzero and = for some then we say is an
eigenvector with eigenvalue of the matrix .
Proposition 6.2.11.
Let
then is an eigenvalue of i () = 0. We say () = ()
the characteristic polynomial and () = 0 is the characteristic equation.
Proof: Suppose is an eigenvalue of then there exists a nonzero vector such that =
which is equivalent to = 0 which is precisely ( ) = 0. Notice that ( )0 = 0
3
this is the one place in this course where we need eigenvalues and eigenvector calculations, I include these to
illustrate the structure of quadratic forms in general, however, as linear algebra is not a prerequisite you may nd some
things in this section mysterious. The homework and study guide will elaborate on what is required this semester
6.2. A BRIEF INTRODUCTION TO THE THEORY OF QUADRATIC FORMS 145
thus the matrix ( ) is singular as the equation ( ) = 0 has more than one solution.
Consequently () = 0.
Conversely, suppose ( ) = 0. It follows that ( ) is singular. Clearly the system
( ) = 0 is consistent as = 0 is a solution hence we know there are innitely many solu-
tions. In particular there exists at least one vector = 0 such that () = 0 which means the
vector satises = . Thus is an eigenvector with eigenvalue for .
Example 6.2.12. Let =
_
3 1
3 1
_
nd the e-values and e-vectors of .
() =
_
3 1
3 1
_
= (3 )(1 ) 3 =
2
4 = ( 4) = 0
We nd
1
= 0 and
2
= 4. Now nd the e-vector with e-value
1
= 0, let
1
= [, ]
denote the
e-vector we wish to nd. Calculate,
(0)
1
=
_
3 1
3 1
_ _
_
=
_
3 +
3 +
_
=
_
0
0
_
Obviously the equations above are redundant and we have innitely many solutions of the form
3 + = 0 which means = 3 so we can write,
1
=
_
3
_
=
_
1
3
_
. In applications we
often make a choice to select a particular e-vector. Most modern graphing calculators can calcu-
late e-vectors. It is customary for the e-vectors to be chosen to have length one. That is a useful
choice for certain applications as we will later discuss. If you use a calculator it would likely give
1
=
1
10
_
1
3
_
although the
such that ( 4)
2
= 0. Notice that ,
are disposable variables in this context, I do not mean to connect the formulas from the = 0 case
with the case considered now.
(4)
1
=
_
1 1
3 3
_ _
_
=
_
+
3 3
_
=
_
0
0
_
Again the equations are redundant and we have innitely many solutions of the form = . Hence,
2
=
_
_
=
_
1
1
_
is an eigenvector for any such that = 0.
Theorem 6.2.13.
A matrix
is symmetric i there exists an orthonormal eigenbasis for .
146 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
There is a geometric proof of this theorem in Edwards
4
(see Theorem 8.6 pgs 146-147) . I prove half
of this theorem in my linear algebra notes by a non-geometric argument (full proof is in Appendix C
of Insel,Spence and Friedberg). It might be very interesting to understand the connection between
the geometric verse algebraic arguments. Well content ourselves with an example here:
Example 6.2.14. Let =
0 0 0
0 1 2
0 2 1
,
2
=
1
2
[0, 1, 1]
and
3
=
1
2
[0, 1, 1]
1 0 0
0
1
2
1
2
0
1
2
1
0 0 0
0 1 2
0 2 1
1 0 0
0
1
2
1
2
0
1
2
1
0 0 0
0 1 0
0 0 3
Its really neat that to nd the inverse of a matrix of orthonormal e-vectors we need only take the
transpose; note
1 0 0
0
1
2
1
2
0
1
2
1
1 0 0
0
1
2
1
2
0
1
2
1
1 0 0
0 1 0
0 0 1
.
XXX remove comments about e-vectors and e-value before this section and put them here as
motivating examples for the proposition that follows.
Proposition 6.2.15.
If is a quadratic form on
with orthonormal
e-vectors
1
,
2
, . . . ,
then
(
) =
2
for = 1, 2, . . . , . Moreover, if = [
1
] then
() = (
=
1
2
1
+
2
2
2
+ +
where we dened =
.
Let me restate the proposition above in simple terms: we can transform a given quadratic form to
a diagonal form by nding orthonormalized e-vectors and performing the appropriate coordinate
transformation. Since is formed from orthonormal e-vectors we know that will be either a
rotation or reection. This proposition says we can remove cross-terms by transforming the
quadratic forms with an appropriate rotation.
Example 6.2.16. Consider the quadric form (, ) = 2
2
+ 2 + 2
2
. Its not immediately
obvious (to me) what the level curves (, ) = look like. Well make use of the preceding
4
think about it, there is a 1-1 correspondance between symmetric matrices and quadratic forms
6.2. A BRIEF INTRODUCTION TO THE THEORY OF QUADRATIC FORMS 147
proposition to understand those graphs. Notice (, ) = [, ]
_
2 1
1 2
_ _
_
. Denote the matrix
of the form by and calculate the e-values/vectors:
() =
_
2 1
1 2
_
= ( 2)
2
1 =
2
4 + 3 = ( 1)( 3) = 0
Therefore, the e-values are
1
= 1 and
2
= 3.
()
1
=
_
1 1
1 1
_ _
_
=
_
0
0
_
1
=
1
2
_
1
1
_
I just solved + = 0 to give = choose = 1 then normalize to get the vector above. Next,
(3)
2
=
_
1 1
1 1
_ _
_
=
_
0
0
_
2
=
1
2
_
1
1
_
I just solved = 0 to give = choose = 1 then normalize to get the vector above. Let
= [
1
2
] and introduce new coordinates = [ , ]
dened by =
( , ) =
2
+ 3
2
It is clear that in the barred coordinate system the level curve (, ) = is an ellipse. If we draw
the barred coordinate system superposed over the -coordinate system then youll see that the graph
of (, ) = 2
2
+ 2 + 2
2
= is an ellipse rotated by 45 degrees. Or, if you like, we can plot
= (, ):
5
technically
( , ) is (( , ), ( , ))
148 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
Example 6.2.17. Consider the quadric form (, ) =
2
+2+
2
. Its not immediately obvious
(to me) what the level curves (, ) = look like. Well make use of the preceding proposition to
understand those graphs. Notice (, ) = [, ]
_
1 1
1 1
_ _
_
. Denote the matrix of the form by
and calculate the e-values/vectors:
() =
_
1 1
1 1
_
= ( 1)
2
1 =
2
2 = ( 2) = 0
Therefore, the e-values are
1
= 0 and
2
= 2.
(0)
1
=
_
1 1
1 1
_ _
_
=
_
0
0
_
1
=
1
2
_
1
1
_
I just solved + = 0 to give = choose = 1 then normalize to get the vector above. Next,
(2)
2
=
_
1 1
1 1
_ _
_
=
_
0
0
_
2
=
1
2
_
1
1
_
I just solved = 0 to give = choose = 1 then normalize to get the vector above. Let
= [
1
2
] and introduce new coordinates = [ , ]
dened by =
( , ) = 2
2
It is clear that in the barred coordinate system the level curve (, ) = is a pair of paralell
lines. If we draw the barred coordinate system superposed over the -coordinate system then youll
see that the graph of (, ) =
2
+ 2 +
2
= is a line with slope 1. Indeed, with a little
algebraic insight we could have anticipated this result since (, ) = (+)
2
so (, ) = implies
+ =
thus =
_
. Denote the matrix of the form by
and calculate the e-values/vectors:
() =
_
2
2
_
=
2
4 = ( + 2)( 2) = 0
Therefore, the e-values are
1
= 2 and
2
= 2.
(+ 2)
1
=
_
2 2
2 2
_ _
_
=
_
0
0
_
1
=
1
2
_
1
1
_
I just solved + = 0 to give = choose = 1 then normalize to get the vector above. Next,
(2)
2
=
_
2 2
2 2
_ _
_
=
_
0
0
_
2
=
1
2
_
1
1
_
I just solved = 0 to give = choose = 1 then normalize to get the vector above. Let
= [
1
2
] and introduce new coordinates = [ , ]
dened by =
( , ) = 2
2
+ 2
2
It is clear that in the barred coordinate system the level curve (, ) = is a hyperbola. If we
draw the barred coordinate system superposed over the -coordinate system then youll see that
the graph of (, ) = 4 = is a hyperbola rotated by 45 degrees. The graph = 4 is thus a
hyperbolic paraboloid:
150 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
The fascinating thing about the mathematics here is that if you dont want to graph = (, ),
but you do want to know the general shape then you can determine which type of quadraic surface
youre dealing with by simply calculating the eigenvalues of the form.
Remark 6.2.19.
I made the preceding triple of examples all involved the same rotation. This is purely for my
lecturing convenience. In practice the rotation could be by all sorts of angles. In addition,
you might notice that a dierent ordering of the e-values would result in a redenition of
the barred coordinates.
6
We ought to do at least one 3-dimensional example.
Example 6.2.20. Consider the quadric form dened below:
(, , ) = [, , ]
6 2 0
2 6 0
0 0 5
6 2 0
2 6 0
0 0 5
= [( 6)
2
4](5 )
= (5 )[
2
12 + 32](5 )
= ( 4)( 8)(5 )
Therefore, the e-values are
1
= 4,
2
= 8 and
3
= 5. After some calculation we nd the following
orthonormal e-vectors for :
1
=
1
1
1
0
2
=
1
1
1
0
3
=
0
0
1
Let = [
1
3
] and introduce new coordinates = [ , , ]
dened by =
. Note these
can be inverted by multiplication by to give = . Observe that
=
1
1 1 0
1 1 0
0 0
=
1
2
( + )
=
1
2
( + )
=
or
=
1
2
( )
=
1
2
( +)
=
The proposition preceding this example shows that substitution of the formulas above into yield:
( , , ) = 4
2
+ 8
2
+ 5
2
6.3. SECOND DERIVATIVE TEST IN MANY-VARIABLES 151
It is clear that in the barred coordinate system the level surface (, , ) = is an ellipsoid. If we
draw the barred coordinate system superposed over the -coordinate system then youll see that
the graph of (, , ) = is an ellipsoid rotated by 45 degrees around the . Plotted below
are a few representative ellipsoids:
In summary, the behaviour of a quadratic form () =
2
1
+
2
2
2
+ +
by choosing
the coordinate system which is built from the orthonormal eigenbasis of (). In this coordinate
system the shape of the level-sets of becomes manifest from the signs of the e-values. )
Remark 6.2.21.
If you would like to read more about conic sections or quadric surfaces and their connection
to e-values/vectors I reccommend sections 9.6 and 9.7 of Antons linear algebra text. I
have yet to add examples on how to include translations in the analysis. Its not much
more trouble but I decided it would just be an unecessary complication this semester.
Also, section 7.1,7.2 and 7.3 in Lays linear algebra text show a bit more about how to
use this math to solve concrete applied problems. You might also take a look in Gilbert
Strangs linear algebra text, his discussion of tests for positive-denite matrices is much
more complete than I will give here.
6.3 second derivative test in many-variables
There is a connection between the shape of level curves (
1
,
2
, . . . ,
(, ) +
(, ) +
1
2
(, ) = [, ][](, ). Since []
= [] we can
nd orthonormal e-vectors
1
,
2
for [] with e-values
1
and
2
respective. Using = [
1
2
] we
7
this set is called the spectrum of the matrix
152 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
can introduce rotated coordinates (
) =
1
2
+
2
2
Clearly if
1
> 0 and
2
> 0 then (, ) yields the local minimum whereas if
1
< 0 and
2
< 0
then (, ) yields the local maximum. Edwards discusses these matters on pgs. 148-153. In short,
supposing () + , if all the e-values of are positive then has a local minimum of ()
at whereas if all the e-values of are negative then reaches a local maximum of () at .
Otherwise has both positive and negative e-values and we say is non-denite and the function
has a saddle point. If all the e-values of are positive then is said to be positive-denite
whereas if all the e-values of are negative then is said to be negative-denite. Edwards
gives a few nice tests for ascertaining if a matrix is positive denite without explicit computation
of e-values. Finally, if one of the e-values is zero then the graph will be like a trough.
Example 6.3.1. Suppose (, ) = (
2
2
+ 2 1) expand about the point (0, 1):
(, ) = (
2
)(
2
+ 2 1) = (
2
)(( 1)
2
)
expanding,
(, ) = (1
2
+ )(1 ( 1)
2
+ ) = 1
2
( 1)
2
+
Recenter about the point (0, 1) by setting = and = 1 + so
(, 1 +) = 1
2
2
+
If (, ) is near (0, 0) then the dominant terms are simply those weve written above hence the graph
is like that of a quadraic surface with a pair of negative e-values. It follows that (0, 1) is a local
maximum. In fact, it happens to be a global maximum for this function.
Example 6.3.2. Suppose (, ) = 4(1)
2
+(2)
2
+((1)
2
(2)
2
)+2(1)(2)
for some constants , . Analyze what values for , will make (1, 2) a local maximum, minimum
or neither. Expanding about (1, 2) we set = 1 + and = 2 + in order to see clearly the local
behaviour of at (1, 2),
(1 +, 2 +) = 4
2
2
+(
2
2
) + 2
= 4
2
2
+(1
2
2
) + 2
= 4 +(+ 1)
2
+ 2 (+ 1)
2
+
There is no nonzero linear term in the expansion at (1, 2) which indicates that (1, 2) = 4 +
may be a local extremum. In this case the quadratic terms are nontrivial which means the graph of
this function is well-approximated by a quadraic surface near (1, 2). The quadratic form (, ) =
(+ 1)
2
+ 2 (+ 1)
2
has matrix
[] =
_
(+ 1)
(+ 1)
2
_
.
6.3. SECOND DERIVATIVE TEST IN MANY-VARIABLES 153
The characteristic equation for is
([] ) =
_
(+ 1)
(+ 1)
2
_
= ( ++ 1)
2
2
= 0
We nd solutions
1
= 1 + and
2
= 1 . The possibilities break down as follows:
1. if
1
,
2
> 0 then (1, 2) is local minimum.
2. if
1
,
2
< 0 then (1, 2) is local maximum.
3. if just one of
1
,
2
is zero then is constant along one direction and min/max along another
so technically it is a local extremum.
4. if
1
2
< 0 then (1, 2) is not a local etremum, however it is a saddle point.
In particular, the following choices for , will match the choices above
1. Let = 3 and = 1 so
1
= 3 and
2
= 1;
2. Let = 3 and = 1 so
1
= 3 and
2
= 5
3. Let = 3 and = 2 so
1
= 0 and
2
= 4
4. Let = 1 and = 3 so
1
= 1 and
2
= 5
Here are the graphs of the cases above, note the analysis for case 3 is more subtle for Taylor
approximations as opposed to simple quadraic surfaces. In this example, case 3 was also a local
minimum. In contrast, in Example 6.2.17 the graph was like a trough. The behaviour of away
from the critical point includes higher order terms whose inuence turns the trough into a local
minimum.
Example 6.3.3. Suppose (, ) = sin() cos() to nd the Taylor series centered at (0, 0) we can
simply multiply the one-dimensional result sin() =
1
3!
3
+
1
5!
5
+ and cos() = 1
1
2!
2
+
1
4!
4
+ as follows:
(, ) = (
1
3!
3
+
1
5!
5
+ )(1
1
2!
2
+
1
4!
4
+ )
=
1
2
2
+
1
24
1
6
1
12
2
+
= +
154 CHAPTER 6. CRITICAL POINT ANALYSIS FOR SEVERAL VARIABLES
The origin (0, 0) is a critical point since
(0, 0) = 0 and
= ,
and second derivatives,
= 0
= 0
= 0,
and the nonzero third derivatives,
= 1.
It follows,
( +, +, +) =
= (, , ) +
(, , ) +
(, , ) +
(, , ) +
1
2
(
) +
Of course certain terms can be combined since
.
Example 7.1.2. Suppose denotes the set of continuous functions on . Dene () =
1
0
() .
The mapping : is linear by properties of denite integrals therefore we identify the denite
integral denes a dual-vector to the vector space of continuous functions.
Example 7.1.3. Suppose = (, ) denotes a set of functions from a vector space to .
Note that is a vector space with respect to point-wise dened addition and scalar multiplication
of functions. Let
and dene () = (
) = (
) + (
.
Example 7.1.4. The determinant is a mapping from
to but it does not dene a dual-vector
to the vector space of square matrices since (+) = () +().
Example 7.1.5. Suppose () = for a particular vector
. We argue
where we
recall =
,
( +) = ( +) = + = () +()
for all ,
and .
Example 7.1.6. Let (, ) = 2 + 5 dene a function :
2
. Note that
(, ) = (, ) (2, 5)
hence by the preceding example we nd (
2
)
.
The preceding example is no accident. It turns out there is a one-one correspondance between row
vectors and dual vectors on
. Let
then we dene
() = . We proved in Example
7.1.5 that
. Suppose (
we see to nd
such that =
. Recall that a
linear function is uniquely dened by its values on a basis; the values of on the standard basis
will show us how to choose . This is a standard technique. Consider:
with
1
=
=1
() = (
=1
) =
=1
. .
) =
=1
)
. .
=
where we dene = ((
1
), (
2
), . . . , (
))
gives an isomorphism of
and (
()
by () =
to give the correspondance an explicit label. The image of the standard basis under
is called the standard dual basis for (
. Consider (
), let
and calculate
(
)() =
() =
then (
)(
) =
is denoted {
1
,
2
, . . . ,
} where we dene
) =
for all ,
. Generally, given a
vector space with basis = {
1
,
2
, . . . ,
= {
1
,
2
, . . . ,
} is
dual to i
) =
for all ,
.
The term basis indicates that {
1
,
2
, . . . ,
} is linearly independent
3
and {
1
,
2
, . . . ,
} =
(
with =
=1
then
() =
=1
_
=
=1
) =
=1
() =
.
1
the super-index is not a power in this context, it is just a notation to emphasize
and
and suppose
with =
=1
. Calculate,
() =
_
=1
_
=
=1
(
() =
=1
(
this shows every dual vector is in the span of the dual basis {
=1
.
7.2 multilinearity and the tensor product
A multilinear mapping is a function of a Cartesian product of vector spaces which is linear with
respect to each slot. The goal of this section is to explain what that means. It turns out the set
of all multilinear mappings on a particular set of vector spaces forms a vector space and well show
how the tensor product can be used to construct an explicit basis by tensoring a bases which are
dual to the bases in the domain. We also examine the concepts of symmetric and antisymmetric
multilinear mappings, these form interesting subspaces of the set of all multilinear mappings. Our
approach in this section is to treat the case of bilinearity in depth then transition to the case of
multilinearity. Naturally this whole discussion demands a familarity with the preceding section.
7.2.1 bilinear maps
Denition 7.2.1.
Suppose
1
,
2
are vector spaces then :
1
2
is a binear mapping on
1
2
i
for all ,
1
, ,
2
and :
(1.) ( +, ) = (, ) +(, ) (linearity in the rst slot)
(2.) (, +) = (, ) +(, ) (linearity in the second slot).
bilinear maps on
When
1
=
2
= we simply say that : is a bilinear mapping on . The set of
all bilinear maps of is denoted
2
0
. You can show that
2
0
forms a vector space under
the usual point-wise dened operations of function addition and scalar multiplication
4
. Hopefully
you are familar with the example below.
Example 7.2.2. Dene :
by (, ) = for all ,
by (, ) =
for all
,
= (
) =
= (, ) +(, )
(, +) =
( +) =
= (, ) +(, )
for all , ,
.
Suppose : is bilinear and suppose = {
1
,
2
, . . . ,
= {
1
,
2
, . . . ,
} is a basis of
with
) =
(, ) =
_
=1
=1
_
(7.1)
=
,=1
(
)
=
,=1
)
=
,=1
(
()
()
Therefore, if we dene
= (
,=1
. The calculation
above also indicates that is a linear combination of certain basic bilinear mappings. In particular,
can be written a linear combination of a tensor product of dual vectors on .
Denition 7.2.4.
Suppose is a vector space with dual space
. If ,
then we dene :
by ( )(, ) = ()() for all , .
Given the notation
5
preceding this denition, we note (
)(, ) =
()
,=1
(
)(
)(, ) therefore, =
,=1
(
We nd
6
that
2
0
= {
,=1
. Moreover, it can be argued
7
that {
,=1
is a linearly
independent set, therefore {
,=1
forms a basis for
2
0
. We can count there are
2
vectors
5
perhaps you would rather write (
)(, ) as
,=1
2
0
7
yes, again, in your homework
7.2. MULTILINEARITY AND THE TENSOR PRODUCT 159
in {
,=1
hence (
2
0
) =
2
.
If =
and if {
=1
denotes the standard dual basis, then there is a standard notation for
the set of coecients found in the summation for . In particular, we denote = [] where
= (
,=1
) =
=1
=1
Denition 7.2.5.
Suppose : is a bilinear mapping then we say:
1. is symmetric i (, ) = (, ) for all ,
2. is antisymmetric i (, ) = (, ) for all ,
Any bilinear mapping on can be written as the sum of a symmetric and antisymmetric bilinear
mapping, this claim follows easily from the calculation below:
(, ) =
1
2
_
(, ) +(, )
_
. .
+
1
2
_
(, ) (, )
_
. .
.
We say
is symmetric in , i
is antisymmetric in
, i
) = (
and
(
) = (
.
You can prove that the sum or scalar multiple of an (anti)symmetric bilinear mapping is once more
(anti)symmetric therefore the set of antisymmetric bilinear maps
2
( ) and the set of symmetric
bilinear maps
0
2
are subspaces of
0
2
. The notation
2
( ) is part of a larger discussion on
the wedge product, we will return to it in a later section.
Finally, if we consider the special case of =
= [] i is antisymmetric.
160 CHAPTER 7. MULTILINEAR ALGEBRA
bilinear maps on
Suppose :
}
is a basis for whereas
= {
1
,
2
, . . . ,
} is a basis of
with
) =
. Let ,
(, ) =
_
=1
=1
_
(7.2)
=
,=1
(
)
=
,=1
)
=
,=1
(
)(
)(
)
Therefore, if we dene
= (
,=1
. To
further rene the formula above we need a new concept.
The dual of the dual is called the double-dual and it is denoted
. In particular, :
is dened by
()() = () for all
and (
)(
) =
(, ) thus,
(, ) =
,=1
(
(, ) =
,=1
(
We argue that {
,=1
is a basis
8
Denition 7.2.7.
8
2
0
is a vector space and weve shown
2
0
( ) { }
,=1
but we should also show
2
0
and
check for LI of { }
,=1
.
7.2. MULTILINEARITY AND THE TENSOR PRODUCT 161
Suppose :
The discussion of the preceding subsection transfers to this context, we simply have to switch some
vectors to dual vectors and move some indices up or down. I leave this to the reader.
bilinear maps on
Suppose :
is bilinear, we say
1
1
(or, if the context demands this detail
1
1
). We dene
1
1
( ) by the natural rule; ( )(, ) = ()() for all
(, )
,=1
and =
,=1
where we dened
= (
).
bilinear maps on
Suppose :
is bilinear, we say
1
1
(or, if the context demands this detail
1
1
). We dene
1
1
by the natural rule; ( )(, ) = ()() for all
(, )
,=1
and =
,=1
where we dened
= (
).
7.2.2 trilinear maps
Denition 7.2.8.
Suppose
1
,
2
,
3
are vector spaces then :
1
3
is a trilinear mapping on
3
i for all ,
1
, ,
2
. ,
3
and :
(1.) ( +, , ) = (, , ) +(, , ) (linearity in the rst slot)
(2.) (, +, ) = (, , ) +(, , ) (linearity in the second slot).
(3.) (, , +) = (, , ) +(, , ) (linearity in the third slot).
162 CHAPTER 7. MULTILINEAR ALGEBRA
If : is trilinear on then we say is a trilinear mapping on and
we denote the set of all such mappings
0
3
. The tensor product of three dual vectors is dened
much in the same way as it was for two,
( )(, , ) = ()()()
Let {
=1
is a basis for with dual basis {
=1
for
. If is trilinear on it follows
(, , ) =
,,=1
and =
,,=1
where we dened
= (
) for all , ,
.
Generally suppose that
1
,
2
,
3
are possibly distinct vector spaces. Moreover, suppose
1
has basis
{
1
=1
,
2
has basis {
2
=1
and
3
has basis {
3
=1
. Denote the dual bases for
1
,
2
,
3
in
the usual fashion: {
1
=1
, {
1
=1
, {
1
=1
. With this notation, we can write a trilinear mapping
on
1
3
as follows: (where we dene
= (
))
(, , ) =
=1
=1
=1
and =
=1
=1
=1
However, if
1
,
2
,
3
happen to be related by duality then it is customary to use up/down indices.
For example, if :
,,=1
and say
1
2
. On the other hand, if :
,,=1
and say
2
1
. Im not sure that Ive ever seen this notation elsewhere, but perhaps it could
be useful to denote the set of trinlinear maps :
as
1
1 1
. Hopefully we will
not need such silly notation in what we consider this semester.
There was a natural correspondance between bilinear maps on
with its double-dual hence this tensor product is already dened, but to be safe let me write it out
in this context
(, , ) =
()
()(
).
7.2. MULTILINEARITY AND THE TENSOR PRODUCT 163
Example 7.2.9. Dene :
3
3
3
by (, , ) = (). You may not have
learned this in your linear algebra course
10
but a nice formula
11
for the determinant is given by the
Levi-Civita symbol,
() =
3
,,=1
3
note that
1
() = [
1
],
2
() = [
2
] and
3
() = [
3
]. It follows that
(, , ) =
3
,,=1
Multilinearity follows easily from this formula. For example, linearity in the third slot:
(, , +) =
3
,,=1
( +)
(7.3)
=
3
,,=1
) (7.4)
=
3
,,=1
+
3
,,=1
(7.5)
= (, , ) +(, , ). (7.6)
Observe that by properties of determinants, or the Levi-Civita symbol if you prefer, swapping a pair
of inputs generates a minus sign, hence:
(, , ) = (, , ) = (, , ) = (, , ) = (, , ) = (, , ).
If : is a trilinear mapping such that
(, , ) = (, , ) = (, , ) = (, , ) = (, , ) = (, , )
for all , , then we say is antisymmetric. Likewise, if : is a trilinear
mapping such that
(, , ) = (, , ) = (, , ) = (, , ) = (, , ) = (, , ).
for all , , then we say is symmetric. Clearly the mapping dened by the determinant
is antisymmetric. In fact, many authors dene the determinant of an matrix as the antisym-
metric -linear mapping which sends the identity matrix to 1. It turns out these criteria unquely
10
maybe you havent even taken linear yet!
11
actually, I take this as the denition in linear algebra, it does take considerable eort to recover the expansion
by minors formula which I use for concrete examples
164 CHAPTER 7. MULTILINEAR ALGEBRA
dene the determinant. That is the motivation behind my Levi-Civita symbol denition. That
formula is just the nuts and bolts of complete antisymmetry.
You might wonder, can every trilinear mapping can be written as a the sum of a symmetric and
antisymmetric mapping? The answer is no. Take the following trilinear mapping on
3
for example:
(, , ) = [
3
] +
You can verify this is linear in each slot however, it is antisymetric in the rst pair of slots
(, , ) = [
3
] + = [
3
] + = (, , )
and symmetric in the last pair,
(, , ) = [
3
] + = [
3
] + = (, , ).
Generally, the decomposition of a multilinear mapping into more basic types is a problem which
requires much more thought than we intend here. Representation theory is concerned with precisely
this problem: how can we decompose a tensor product into irreducible pieces. Their idea of tensor
product is not precisely the same as ours, however algebraically the problems are quite intertwined.
Ill leave it at that unless youd like to do an independent study on representation theory. Ideally
youd already have linear algebra and abstract algebra complete before you attempt that study.
7.2.3 multilinear maps
Denition 7.2.10.
Suppose
1
,
2
, . . .
is a -multilinear
mapping on
1
(
1
, . . . ,
, . . . ,
) = (
1
, . . . ,
, . . . ,
) +(
1
, . . . ,
, . . . ,
)
for = 1, 2, . . . , . In other words, we assume is linear in each of its -slots. If is
multilinear on
is
a type (0, 2) tensor with respect to .
We are free to dene tensor products in this context in the same manner as we have previously.
Suppose
1
1
,
2
2
, . . . ,
and
1
1
,
2
2
, . . . ,
then
(
1
,
2
, . . . ,
) =
1
(
1
)
2
(
2
)
)
It is easy to show the tensor produce of -dual vectors as dened above is indeed a -multilinear
mapping. Moreover, the set of all -multilinear mappings on
1
2
clearly forms a
7.2. MULTILINEARITY AND THE TENSOR PRODUCT 165
vector space of dimension (
1
)(
2
) (
1
,
2
, . . . ,
has
basis {
=1
which is dual to {
=1
the basis for
1
=1
2
=1
=1
2
...
1
1
2
2
If is a type (, ) tensor on then there is no need for the ugly double indexing on the basis
since we need only tensor a basis {
=1
for and its dual {
=1
for
in what follows:
=
1
,...,=1
1
,...,=1
2
...
2
...
.
permutations
Before I dene symmetric and antisymmetric for -linear mappings on I think it is best to discuss
briey some ideas from the theory of permutations.
Denition 7.2.11.
A permutation on {1, 2, . . . } is a bijection on {1, 2, . . . }. We dene the set of permutations
on {1, 2, . . . } to be
) = (
1
, . . . , , . . . , , . . . ,
)
for all possible , . Conversely, if a -linear mapping on has
(
1
, . . . , , . . . , , . . . ,
) = (
1
, . . . , , . . . , , . . . ,
)
for all possible pairs , then it is said to be completely antisymmetric or alter-
nating. Equivalently a -linear mapping L is alternating if for all
1
,
2
, . . . ,
) = ()(
1
,
2
, . . . ,
)
The set of alternating multilinear mappings on is denoted , the set of -linear alter-
nating maps on is denoted
2
...
is completely symmetric
in
1
,
2
, . . . ,
2
...
(1)
(2)
...
()
for all
2
...
is completely
antisymmetric in
1
,
2
, . . . ,
2
...
= ()
(1)
(2)
...
()
for all
. It is a simple
exercise to show that a completely (anti)symmetric tensor
13
has completely (anti)symmetric com-
ponents.
13
in this context a tensor is simply a multilinear mapping, in physics there is more attached to the term
7.3. WEDGE PRODUCT 167
The tensor product is an interesting construction to discuss at length. To summarize, it is asso-
ciative and distributive across addition. Scalars factor out and it is not generally commutative.
For a given vector space we can in principle generate by tensor products multilinear mappings
of arbitrarily high order. This tensor algebra is innite dimensional. In contrast, the space of
forms on is a nite-dimensional subspace of the tensor algebra. We discuss this next.
7.3 wedge product
We assume is a vector space with basis {
=1
throughout this section. The dual basis is denoted
{
=1
as is our usual custom. Our goal is to nd a basis for the alternating maps on and explore
the structure implicit within its construction. This will lead us to call the exterior algebra
of after the discussion below is complete.
7.3.1 wedge product of dual basis generates basis for
Suppose : is antisymmetric and =
.=1
, it follows that
for all
,
<
<
<
<
<
<
<
<
<
). (7.8)
Therefore, {
<
,=1
1
2
is an antisymmetric
bilinear mapping because
(, ) =
,,=1
, it follows that
and
for all , ,
= 0 for = 1, 2, . . . , . A calculation similar to the one just oered for the case of a bilinear
map reveals that we can write as follows:
=
<<
_
(7.9)
Dene
thus
=
<<
,,=1
1
3!
(7.10)
and it is clear that {
, ,
()
(1)
(2)
()
(7.11)
If , we would like to show that
(. . . , , . . . , , . . . ) =
(. . . , , . . . , , . . . ) (7.12)
follows from the complete antisymmetrization in the denition of the wedge product. Before we
give the general argument, lets see how this works in the trilinear case. Consider,
=
=
.
Calculate, noting that
(, , ) =
()
()
() =
hence
(, , ) =
Thus,
(, , ) =
(, , ) =
is an alternating trilinear map as it is clearly trilinear since it is built from the sum of
7.3. WEDGE PRODUCT 169
tensor products which we know are likewise trilinear.
The multilinear case follows essentially the same argument, note
(. . . ,
, . . . ,
, . . . ) =
()
(1)
1
()
()
()
(7.13)
whereas,
(. . . ,
, . . . ,
, . . . ) =
()
(1)
1
()
()
()
. (7.14)
Suppose we take each permutation and subsitute
(. . . ,
, . . . ,
, . . . )
=
(())
(1)
1
()
()
()
()
(1)
1
()
()
()
(. . . ,
, . . . ,
, . . . ) (7.15)
Here the of a permutation is (1)
.
Recall that
. .
. .
. .
. .
. .
Ive indicated how these signs are consistent with the = 2 antisymmetry. Any permutation of
the dual vectors can be thought of as a combination of several transpositions. In any event, it is
sometimes useful to just know that the wedge product of three elements is invariant under cyclic
permutations of the dual vectors,
= ()
(1)
(2)
()
This is just a slick formula which says the wedge product generates a minus whenever you ip two
dual vectors which are wedged.
170 CHAPTER 7. MULTILINEAR ALGEBRA
7.3.2 the exterior algebra
The careful reader will realize we have yet to dene wedge products of anything except for the dual
basis. But, naturally you must wonder if we can take the wedge product of other dual vectors or
morer generally alternating tensors. The answer is yes. Let us dene the general wedge product:
Denition 7.3.1. Suppose
and
. We dene
)
then introduce notation
hence:
=
1
,
2
,...,=1
1
!
2
...
1
!
and
=
1
,
2
,...,=1
1
!
2
...
1
!
Naturally,
1
!!
.
Again, but with less slick notation:
=
1
,
2
,...,=1
1
,
2
,...,=1
1
!!
2
...
2
...
All the denition above really says is that we extend the wedge product on the basis to distribute
over the addition of dual vectors. What this means calculationally is that the wedge product obeys
the usual laws of addition and scalar multiplication. The one feature that is perhaps foreign is the
antisymmetry of the wedge product. We must take care to maintain the order of expressions since
the wedge product is not generally commutative.
Proposition 7.3.2.
Let , , be forms on and then
() ( +) = + distributes across vector addition
() ( +) = + distributes across vector addition
() () = () = ( ) scalars factor out
() ( ) = ( ) associativity
I leave the proof of this proposition to the reader.
7.3. WEDGE PRODUCT 171
Proposition 7.3.3. graded commutivity of homogeneous forms.
Let , be forms on of degree and respectively then
= (1)
Proof: suppose =
1
!
is a -form on and =
1
!
is a -form on . Calculate:
=
1
!!
by defn. of ,
=
1
!!
1
!!
Lets expand in detail why
= (1)
. Suppose = (
1
,
2
, . . . ,
) and =
(
1
,
2
, . . . ,
= (1)
= (1)
(1)
.
.
.
.
.
.
.
.
.
= (1)
(1)
(1)
. .
= (1)
.
Example 7.3.4. Let be a 2-form dened by
=
1
2
+
2
3
And let be a 1-form dened by
= 3
1
Consider then,
= (
1
2
+
2
3
) (3
1
)
= (3
1
2
1
+ 3
2
3
1
= 3
1
2
3
.
(7.16)
whereas,
= 3
1
(
1
2
+
2
3
)
= (3
1
1
2
+ 3
1
2
3
= 3
1
2
3
.
(7.17)
172 CHAPTER 7. MULTILINEAR ALGEBRA
so this agrees with the proposition, (1)
= (1)
2
= 1 so we should have found that = .
This illustrates that although the wedge product is antisymmetric on the basis, it is not always
antisymmetric, in particular it is commutative for even forms.
The graded commutivity rule = (1)
1
2
= 0.
Proof: by assumption of linear dependence there exist constants
1
,
2
, . . . ,
1
+
2
2
+
= 0.
Suppose that
is a nonzero constant in the sum above, then we may divide by it and consequently
we can write
=
1
1
+ +
1
1
+
+1
+1
+ +
_
Insert this sum into the wedge product in question,
1
2
. . .
=
1
2
= (
1
/
)
1
2
1
+(
2
/
)
1
2
2
+
+(
1
/
)
1
2
1
+(
+1
/
)
1
2
+1
+
+(
)
1
2
= 0.
(7.18)
We know all the wedge products are zero in the above because in each there is at least one 1-form
repeated, we simply permute the wedge products till they are adjacent and by the previous propo-
sition the term vanishes. The proposition follows.
7.3. WEDGE PRODUCT 173
Let us pause to reect on the meaning of the proposition above for a -dimensional vector space
. The dual space
, it must consist
of the wedge product of distinct linearly independent one-forms. The number of ways to choose
distinct objects from a list of distinct objects is precisely n choose p,
_
_
=
!
( )!!
for 0 . (7.19)
Proposition 7.3.7.
If is an -dimensional vector space then
is an
_
_
-dimensional vector space of -
forms. Moreover, the direct sum of all forms over has the structure
=
1
1
Proof: dene
0
= then it is clear
= {0} for = hence the term direct sum is appropriate. It remains to show
( ) = 2
where ( ) = . A natural basis for is found from taking the union of the
bases for each subspace of -forms,
= {1,
1
,
2
, . . . ,
1
1
<
2
< <
}
But, we can count the number of vectors in the set above as follows:
= 1 + +
_
2
_
+ +
_
1
_
+
_
_
Recall the binomial theorem states
( +)
=0
_
+
1
+ +
1
+
.
Recognize that = (1 + 1)
= {
1
}. The form
1
and
2
= (
1
2
) is all we can do here. This makes a 2
2
= 4-
dimensional vector space with basis
{1,
1
,
2
,
1
2
}.
Example 7.3.9. exterior algebra of
3
Let us begin with the standard dual basis {
1
,
2
,
3
}.
By denition we take the = 0 case to be the eld itself;
0
, it has basis 1. Next,
1
=
(
1
,
2
,
3
) =
2
= (
1
2
,
1
3
,
2
3
)
and nally,
3
= (
1
2
3
).
This makes a 2
3
= 8-dimensional vector space with basis
{1,
1
,
2
,
3
,
1
2
,
1
3
,
2
3
,
1
2
3
}
it is curious that the number of independent one-forms and 2-forms are equal.
Example 7.3.10. exterior algebra of
4
Let us begin with the standard dual basis {
1
,
2
,
3
,
4
}.
By denition we take the = 0 case to be the eld itself;
0
, it has basis 1. Next,
1
=
(
1
,
2
,
3
,
4
) =
2
= (
1
2
,
1
3
,
1
4
,
2
3
,
2
4
,
3
4
)
and three forms,
3
= (
1
2
3
,
1
2
4
,
1
3
4
,
2
3
4
).
and
3
= (
1
3
). Thus a 2
4
= 16-dimensional vector space. Note that, in contrast
to
3
, we do not have the same number of independent one-forms and two-forms over
4
.
Lets explore how this algebra ts with calculations we already know about determinants.
Example 7.3.11. Suppose = [
1
2
]. I propose the determinant of is given by the top-form
on
2
via the formula () = (
1
2
)(
1
,
2
). Suppose =
_
_
then
1
= (, ) and
16
or volume form for reasons we will explain later, other authors begin the discussion of forms from the consideration
of volume, see Chapter 4 in Bernard Schutz Geometrical methods of mathematical physics
7.3. WEDGE PRODUCT 175
2
= (, ). Thus,
_
_
= (
1
2
)(
1
,
2
)
= (
1
1
)((, ), (, ))
=
1
(, )
2
(, )
2
(, )
1
(, )
= .
I hope this is not surprising!
Example 7.3.12. Suppose = [
1
3
]. I propose the determinant of is given by the top-
form on
3
via the formula () = (
1
3
)(
1
,
2
,
3
). Lets see if we can nd the expansion
by cofactors. By the denition we have
1
2
3
=
=
1
3
+
2
1
+
3
2
=
1
(
2
2
)
2
(
1
1
) +
3
(
1
1
)
=
1
(
2
3
)
2
(
1
3
) +
3
(
1
2
).
I submit to the reader that this is precisely the cofactor expansion formula with respect to the rst
column of . Suppose =
then
1
= (, , ),
2
= (, , ) and
3
= (, , ).
Calculate,
() =
1
(
1
)(
2
3
)(
2
,
3
)
2
(
1
)(
1
3
)(
2
,
3
) +
3
(
1
)(
1
2
)(
2
,
3
)
= (
2
3
)(
2
,
3
) (
1
3
)(
2
,
3
) +(
1
2
)(
2
,
3
)
= ( ) ( ) +( )
which is precisely my claim.
7.3.3 connecting vectors and forms in
3
There are a couple ways to connect vectors and forms in
3
. Mainly we need the following maps:
Denition 7.3.13.
Given =< , , >
3
we can construct a corresponding one-form
=
1
+
2
+
3
or we can construct a corresponding two-form
=
2
3
+
3
1
+
1
2
Recall that (
1
3
) = (
2
3
) = 3 hence the space of vectors, one-forms, and also two-
forms are isomorphic as vector spaces. It is not dicult to show that
1
+
2
=
1
+
2
and
1
+
2
=
1
+
2
for all
1
,
2
3
and . Moreover,
= 0 i = 0 and
= 0 i
= 0 hence () = {0} and () = {0} but this means that and are injective and since
176 CHAPTER 7. MULTILINEAR ALGEBRA
the dimensions of the domain and codomain are 3 and these are linear transformations
17
it follows
and are isomorphisms.
It appears we have two ways to represent vectors with forms in
3
. Well see why this is important
as we study integration of forms. It turns out the two-forms go with surfaces whereas the one-
forms attach to curves. This corresponds to the fact in calculus III we have two ways to integrate
a vector-eld, we can either calculate ux or work. Partly for this reason the mapping is called
the work-form correspondence and is called the ux-form correspondence. Integration
has to wait a bit, for now we focus on algebra.
Example 7.3.14. Suppose =< 2, 0, 3 > and =< 0, 1, 2 > then
= 2
1
+3
3
and
=
2
+2
3
.
Calculate the wedge product,
= (2
1
+ 3
3
) (
2
+ 2
3
)
= 2
1
(
2
+ 2
3
) + 3
3
(
2
+ 2
3
)
= 2
1
2
+ 4
1
3
+ 3
3
2
+ 6
3
3
= 3
2
3
4
3
1
+ 2
1
2
=
<3,4,2>
=
(7.20)
Coincidence? Nope.
Proposition 7.3.15.
Suppose ,
3
then
3
,,=1
.
Proof: Suppose =
3
=1
and =
3
=1
then
3
=1
and
3
=1
.
Calculate,
=
_
3
=1
_
3
=1
_
=
3
=1
3
=1
= (
3
=1
) where Im using
= () to make the
argument of the ux-form mapping easier to read, hence,
=
3
=1
3
=1
(
3
=1
) = (
3
,,=1
)
. .
= ( ) =
Of course, if you dont like my proof you could just work it out like the example that precedes this
proposition. I gave the proof to show o the mappings a bit more.
17
this is not generally true, note () =
2
has () = 0 i = 0 and yet is not injective. The linearity is key.
7.4. BILINEAR FORMS AND GEOMETRY; METRIC DUALITY 177
Is the wedge product just the cross-product generalized? Well, not really. I think theyre quite
dierent animals. The wedge product is an associative product which makes sense in any vector
space. The cross-product only matches the wedge product after we interpret it through a pair of
isomorphisms ( and ) which are special to
3
. However, there is debate, largely the question
comes down to what you think makes the cross-product the cross-product. If you think it must
pick a unique perpendicular direction to a pair of given directions then that is only going to work
in
3
since even in
4
there is a whole plane of perpendicular vectors to a given pair. On the other
hand, if you think the cross-product in
4
should be pick the unique perpendicular to a given triple
of vectors then you could set something up. You could dene =
1
((
))
where :
3
4
is an isomorphism well describe in a upcoming section. But, you see its
no longer a product of two vectors, its not a binary operation, its a tertiary operation. In any
event, you can read a lot more on this if you wish. We have all the tools we need for this course.
The wedge product provides the natural antisymmetric algebra for -dimensiona and the work and
ux-form maps naturally connect us to the special world of three-dimensional mathematics.
There is more algebra for forms on
3
however we defer it to a later section where we have a few
more tools. Chief among those is the Hodge dual. But, before we can discuss Hodge duality we
need to generalize our idea of a dot-product just a little.
7.4 bilinear forms and geometry; metric duality
The concept of a metric goes beyond the familar case of the dot-product. If you want a more
strict generalization of the dot-product then you should think about an inner-product. In contrast
to the denition below, the inner-product replaces non-degeneracy with the stricter condition of
positive-denite which would read (, ) > 0 for = 0. I included a discussion of inner products
at the end of this section for the interested reader although we are probably not going to need all
of that material.
7.4.1 metric geometry
A geometry is a vector space paired with a metric. For example, if we pair
i = 0. It follows that = 0
has no non-trivial solutions hence
1
exists.
Example 7.4.2. Suppose (, ) =
for all ,
, it is just
the dot-product. Note that (, ) =
0
+
1
1
+
2
2
+
3
3
It is useful to write the Minkowski product in terms of a matrix multiplication. Observe that for
,
4
,
(, ) =
0
0
+
1
1
+
2
2
+
3
3
=
_
3
_
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
where we have introduced the matrix of the Minkowski product. Notice that
= and () =
1 = 0 hence (, ) =
0
+
For vectors with zero in the zeroth slot this Minkowski product reduces to the dot-product. However,
for vectors which have nonzero entries in both the zeroth and later slots much diers. Recall that
any vectors dot-product with itself gives the square of the vectors length. Of course this means that
= 0 i = 0. Contrast that with the following: if = (1, 1, 0, 0) then
(, ) = 1 + 1 = 0
Yet = 0. Why study such a strange generalization of length? The answer lies in physics. Ill give
you a brief account by dening a few terms: Let = (
0
,
1
,
2
,
3
)
4
then we say
1. is a timelike vector if < , > < 0
2. is a lightlike vector if < , > = 0
3. is a spacelike vector if < , > > 0
7.4. BILINEAR FORMS AND GEOMETRY; METRIC DUALITY 179
If we consider the trajectory of a massive particle in
4
that begins at the origin then at any later
time the trajectory will be located at a timelike vector. If we consider a light beam emitted from
the origin then at any future time it will located at the tip of a lightlike vector. Finally, spacelike
vectors point to points in
4
which cannot be reached by the motion of physical particles that pass
throughout the origin. We say that massive particles are conned within their light cones, this
means that they are always located at timelike vectors relative to their current position in space
time. If youd like to know more I can reccomend a few books.
At this point you might wonder if there are other types of metrics beyond these two examples.
Surprisingly, in a certain sense, no. A rather old theorem of linear algebra due to Sylvester states
that we can change coordinates so that the metric more or less resembles either the dot-product or
something like it with some sign-ips. Well return to this in a later section.
7.4.2 metric duality for tensors
Throughout this section we consider a vector space paired with a metric : .
Moreover, the vector space has basis {
=1
which has a -dual basis {
=1
. Up to this point
we always have used a -dual basis where the duality was oered by the dot-product. In the
context of Minkowski geometry that sort of duality is no longer natural. Instead we must follow
the denition below:
Denition 7.4.4.
If is a vector space with metric and basis {
=1
then we say the basis {
=1
is -dual
i
Suppose
) =
and consider =
,=1
. Furthermore, suppose
=1
I like to think of this as some sort of conservation of indices. Strict adherence to the notation
drives us to write things such as
=1
. Suppose =
4
=0
the components
=
4
=0
.
For the minkowski metric this just adjoins a minus to the zeroth component: if (
) = (, , , )
then
= (, , , ).
Example 7.4.6. Suppose we are working on
and it follows
that
. In this case
. The covariant and contravariant components are the same. This is why is was ok
to ignore up/down indices when we work with a dot-product exclusively.
What if we raise an index and the lower it back down once more? Do we really get back where we
started? Given
}
on the other hand suppose we lower the index on the dual basis {
} to formally obtain {
) =
_
=
) =
or
or
by the rule
() = (, ).
2. given
we create
by the rule
() =
1
(, ) where
1
(, ) =
.
Recall we at times identify and
) = (,
) = (
) =
) =
Thus,
where
) =
1
(,
) =
1
(
) =
thus
where
.
Suppose we want to change a type (0, 2) tensor to a type (2, 0) tensor. Were given :
where =
. Dene
: as follows:
(, ) = (
)
What does this look like in components? Note
) = (
) =
hence
and
) = (
) =
_
_
=
) =
are not technically components for . If we have a metric we can recover either from
),
then these might be equilvalent if + =
1
to
2
. If you read Gravitation by Misner, Thorne and Wheeler youll nd many more thoughts
on this equivalence. Challenge: can you nd the explicit formulas like
(, ) = (
) which
back up the index calculations below?
or
I hope Ive given you enough to chew on in this section to put these together.
182 CHAPTER 7. MULTILINEAR ALGEBRA
7.4.3 inner products and induced norm
There are generalized dot-products on many abstract vector spaces, we call them inner-products.
Denition 7.4.8.
Suppose is a vector space. If <, >: is a function such that for all , ,
and :
1. < , >=< , > (symmetric)
2. < +, >=< , > + < , > (additive in the rst slot)
3. < , >= < , > (homogeneity in the rst slot)
4. < , > 0 and < , >= 0 i = 0
then we say (, <, >) is an inner-product space with inner product <, >.
Given an inner-product space (, <, >) we can easily induce a norm for by the formula =
< , > for all . Properties (1.), (3.) and (4.) in the denition of the norm are fairly obvious
for the induced norm. Lets think throught the triangle inequality for the induced norm:
+
2
= < +, + > def. of induced norm
= < , + > + < , + > additive prop. of inner prod.
= < +, > + < +, > symmetric prop. of inner prod.
= < , > + < , > + < , > + < , > additive prop. of inner prod.
=
2
+ 2 < , > +
2
At this point were stuck. A nontrivial identity
19
called the Cauchy-Schwarz identity helps us
proceed; < , > . It follows that +
2
2
+ 2 +
2
= ( + )
2
.
However, the induced norm is clearly positive
20
so we nd + +.
Most linear algebra texts have a whole chapter on inner-products and their applications, you can
look at my notes for a start if youre curious. That said, this is a bit of a digression for this course.
19
I prove this for the dot-product in my linear notes, however, the proof is written in such a way it equally well
applies to a general inner-product
20
note: if you have (5)
2
< (7)
2
it does not follow that 5 < 7, in order to take the squareroot of the inequality
we need positive terms squared
7.5. HODGE DUALITY 183
7.5 hodge duality
We can prove that
_
_
=
_
_
. This follows from explicit computation of the formula for
_
_
or
from the symmetry of Pascals triangle if you prefer. In any event, this equality suggests there is
some isomorphism between and ( )-forms. When we are given a metric on a vector space
(and the notation of the preceding section) it is fairly simple to construct the isomorphism.
Suppose we are given
1
,
2
,...,=1
1
!
2
...
1
,
2
,...,=1
1
!( )!
2
...
2
...
1
2
...
I should admit, to prove this is a reasonable denition wed need to do some work. Its clearly a
linear transformation, but bijectivity and coordinate invariance of this denition might take a little
work. I intend to omit those details and instead focus on how this works for
3
or
4
. My advisor
taught a course on ber bundles and there is a much more general and elegant presentation of the
hodge dual over a manifold. Ask if interested, I think I have a pdf.
7.5.1 hodge duality in euclidean space
3
To begin, consider a scalar 1, this is a 0-form so we expect the hodge dual to give a 3-form:
1 =
,,
1
0!3!
=
1
2
3
Interesting, the hodge dual of 1 is the top-form on
3
. Conversely, calculate the dual of the top-
form, note
1
3
=
1
6
thus:
(
1
2
3
) =
3
,,=1
1
3!(3 3)!
=
1
6
(1 + 1 + 1 + (1)
2
+ (1)
2
+ (1)
2
) = 1.
Next, consider
1
, note that
1
=
1
=
,,
1
1!2!
=
1
2
=
1
2
(
231
2
3
+
321
3
2
) =
2
3
Similar calculations reveal
2
=
3
1
and
3
=
1
2
. What about the duals of the two-forms?
Begin with =
1
2
note that
1
2
=
1
2
2
1
thus we can see the components are
184 CHAPTER 7. MULTILINEAR ALGEBRA
=
1
1
. Thus,
(
1
2
) =
,,
1
2!1!
(
1
1
)
=
1
2
_
12
21
_
=
1
2
(
3
(
3
)) =
3
.
Similar calculations show that (
2
3
) =
1
and (
3
1
) =
2
. Put all of this together and we
nd that
(
1
+
2
+
3
) =
2
3
+
3
1
+
1
2
and
(
2
3
+
3
1
+
1
2
) =
1
+
2
+
3
Which means that
and
1 =
1
2
3
(
1
2
3
) = 1
1
=
2
3
(
2
3
) =
1
2
=
3
1
(
3
1
) =
2
3
=
1
2
(
1
2
) =
3
A simple rule to calculate the hodge dual of a basis form is as follows
1. begin with the top-form
1
2
3
2. permute the forms until the basis form you wish to hodge dual is to the left of the expression,
whatever remains to the right is the hodge dual.
For example, to calculate the dual of
2
3
note
1
2
3
=
2
3
. .
1
..
(
2
3
) =
1
.
Consider what happens if we calculate , since the dual is a linear operation it suces to think
about the basis forms. Let me sketch the process of
where is a multi-index:
1. begin with
1
2
3
2. write
1
2
3
= (1)
and identify
= (1)
.
3. then to calculate the second dual once more begin with
1
2
3
and note
1
2
3
= (1)
to the left or
to the right.
7.5. HODGE DUALITY 185
4. It follows that
4
function = 0 1
one-form = 1 =
0
,
1
,
2
,
3
two-form = 2 =
,
1
2
2
3
,
3
1
,
1
2
0
1
,
0
2
,
0
3
three-form = 3 =
,,
1
3!
1
2
3
,
0
2
3
0
1
3
,
0
1
2
four-form = 4
0
1
2
3
0
1
2
3
the top form is degree four since in four dimensions we can have at most four dual-basis vectors
without a repeat. Wedge products work the same as they have before, just now we have
0
to play
with. Hodge duality may oer some surprises though.
Denition 7.5.1. The antisymmetric symbol in at
4
is denoted
0123
= 1
plus the demand that it be completely antisymmetric.
We must not assume that this symbol is invariant under a cyclic exhange of indices. Consider,
0123
=
1023
ipped (01)
= +
1203
ipped (02)
=
1230
ipped (03).
(7.21)
In four dimensions well use antisymmetry directly and forego the cyclicity shortcut. Its not a big
deal if you notice it before it confuses you.
Example 7.5.2. Find the Hodge dual of =
1
with respect to the Minkowski metric
, to begin
notice that has components
=
1
. Lets
186 CHAPTER 7. MULTILINEAR ALGEBRA
raise the index using as we learned previously,
=
1
=
1
Starting with the denition of Hodge duality we calculate
(
1
) =
,,,
1
!
1
()!
,,,
(1/6)
1
,,
(1/6)
1
= (1/6)[
1023
0
2
3
+
1230
2
3
0
+
1302
3
0
2
+
1320
3
2
0
+
1203
2
0
3
+
1032
0
3
2
]
= (1/6)[
0
2
3
2
3
0
3
0
2
+
3
2
0
+
2
0
3
+
0
3
2
]
=
2
3
0
=
0
2
3
.
(7.22)
the dierence between the three and four dimensional Hodge dual arises from two sources, for one
we are using the Minkowski metric so indices up or down makes a dierence, and second the
antisymmetric symbol has more possibilities than before because the Greek indices take four values.
I suspect we can calculate the hodge dual by the following pattern: suppose we wish to nd the
dual of where is a basis form for
4
with the Minkowski metric
1. begin with the top-form
0
1
2
3
2. permute factors as needed to place to the left,
3. the form which remains to the right will be the hodge dual of if no
0
is in otherwise the
form to the right multiplied by 1 is .
Note this works for the previous example as follows:
1. begin with
0
1
2
3
2. note
0
1
2
3
=
1
0
2
3
=
1
(
0
2
3
)
3. identify
1
=
0
2
3
(no extra sign since no
0
appears in
1
)
Follow the algorithm for nding the dual of
0
,
1. begin with
0
1
2
3
2. note
0
1
2
3
=
0
(
1
2
3
)
7.5. HODGE DUALITY 187
3. identify
0
=
1
2
3
( added sign since
0
appears in form being hodge dualed)
Lets check from the denition if my algorithm worked out right.
Example 7.5.3. Find the Hodge dual of =
0
with respect to the Minkowski metric
, to begin
notice that
0
has components
=
0
. Lets
raise the index using as we learned previously,
=
0
=
0
the minus sign is due to the Minkowski metric. Starting with the denition of Hodge duality we
calculate
(
0
) =
,,,
1
!
1
()!
,,,
(1/6)
0
,,
(1/6)
0
,,
(1/6)
0
,,
(1/6)
1
2
3
sneaky step
=
1
2
3
.
(7.23)
Notice I am using the convention that Greek indices sum over 0, 1, 2, 3 whereas Latin indices sum
over 1, 2, 3.
Example 7.5.4. Find the Hodge dual of =
0
1
with respect to the Minkowski metric
, to
begin notice the following identity, it will help us nd the components of
0
1
=
,
1
2
2
0
0
1
=
,
1
2
0
[
1
]
where
0
[
1
]
=
0
0
[
1
]
=
0
1
+
0
1
=
[0
]1
the minus sign is due to the Minkowski metric. Starting with the denition of Hodge duality we
188 CHAPTER 7. MULTILINEAR ALGEBRA
calculate
(
0
1
) =
1
!
1
()!
= (1/4)(
[0
]1
)
= (1/4)(
01
10
)
= (1/2)
01
= (1/2)[
0123
2
3
+
0132
3
2
]
=
2
3
(7.24)
Note, the algorithm works out the same,
0
1
2
3
=
0
1
. .
0
(
2
3
) (
0
1
) =
2
3
The other Hodge duals of the basic two-forms calculate by almost the same calculation. Let us make
a table of all the basic Hodge dualities in Minkowski space, I have grouped the terms to emphasize
1 =
0
1
2
3
(
0
1
2
3
) = 1
(
1
2
3
) =
0
0
=
1
2
3
(
0
2
3
) =
1
1
=
2
3
0
(
0
3
1
) =
2
2
=
3
1
0
(
0
1
2
) =
3
3
=
1
2
0
(
3
0
) =
1
2
(
1
2
) =
3
0
(
1
0
) =
2
3
(
2
3
) =
1
0
(
2
0
) =
3
1
(
3
1
) =
2
0
the isomorphisms between the one-dimensional
0
4
and
4
4
, the four-dimensional
1
4
and
4
, the six-dimensional
2
4
and itself. Notice that the dimension of
4
is 16 which we have
explained in depth in the previous section. Finally, it is useful to point out the three-dimensional
work and ux form mappings to provide some useful identities in this 1 + 3-dimensional setting.
=
0
=
0
_
=
I leave verication of these formulas to the reader ( use the table). Finally let us analyze the process
of taking two hodge duals in succession. In the context of
3
we found that = , we seek to
discern if a similar formula is available in the context of
4
with the minkowksi metric. We can
calculate one type of example with the identities above:
=
0
= (
0
) =
where is a multi-index,
1. transpose dual vectors so that
0
1
2
3
= (1)
2. if 0 / then
= (1)
) =
since 0 . We nd
= (1)
3
= (1)
) =
since 0 / . We
nd
1
,
2
, . . . ,
} and = {
1
,
2
, . . . ,
1
+
2
2
+ +
and =
1
1
+
2
2
+ +
() = (
1
,
2
, . . . ,
) = and
() = (
1
,
2
, . . . ,
) =
We sometimes use the notation
() = []
= whereas
() = []
= . A coordinate map
takes an abstract vector and maps it to a particular representative in
. A natural question
to ask is how do dierent representatives compare? How do and compare in our current
notation? Because the coordinate maps are isomorphisms it follows that
is an
isomorphism and given the domain and codomain we can write its formula via matrix multiplication:
() =
( ) =
However,
1
( ) = hence
= {
=1
is dual to
= {
=1
and
= {
=1
is dual to = {
=1
. By denition we are given that
) =
and
) =
for all ,
. Suppose
with respect
to the
=1
or =
=1
=1
or =
=1
=1
_
_
=1
_
=
,=1
) =
=1
=1
=1
=1
. Therefore,
=1
=1
.
Recall, = . In components,
=1
. Substituting,
=1
=1
=1
.
But, this formula holds for all possible vectors and hence all possible coordinate vectors . If we
consider =
then
hence
=1
=1
. Moreover,
=1
=1
=1
=1
=1
. Thus,
=1
=1
verses
=1
(
1
)
.
It is customary to use lower-indices on the components of dual-vectors and upper-indices on the
components of vectors: we say =
=1
=1
=1
(
1
)
verses
=1
.
The formulas above can be derived by arguments similar to those we already gave in this section,
7.6. COORDINATE CHANGE 191
however I think it may be more instructive to see how these rules work in concert:
=
=1
=1
=1
(
1
)
(7.25)
=
=1
=1
(
1
)
=1
=1
=1
=1
(
1
)
=1
=1
=1
.
7.6.1 coordinate change for
0
2
( )
For an abstract vector space, or for
1
,
2
, . . . ,
= {
1
,
2
, . . . ,
} and
, have coordinate vectors , (which means =
=1
and =
=1
) then,
(, ) =
,=1
where
= (
). If = {
1
,
2
, . . . ,
then we
dene
= (
) and we have
(, ) =
,=1
.
Recall that
=1
= (
) =
_
=1
=1
_
=
,=1
) =
,=1
,=1
XXX- include general coordinate change and metrics with sylvesters theorem.
192 CHAPTER 7. MULTILINEAR ALGEBRA
Chapter 8
manifold theory
In this chapter I intend to give you a fairly accurate account of the modern denition of a manifold
1
.
In a nutshell, a manifold is simply a set which allows for calculus locally. Alternatively, many people
say that a manifold is simply a set which is locally at, or it locally looks like
. This covers
most of the objects youve seen in calculus III. However, the technical details most closely resemble
the parametric view-point.
1
the denitions we follow are primarily taken from Burns and Gideas Dierential Geometry and Topology With
a View to Dynamical Systems, I like their notation, but you should understand this denition is known to many
authors
193
194 CHAPTER 8. MANIFOLD THEORY
8.1 manifolds
Denition 8.1.1.
We dene a smooth manifold of dimension as follows: suppose we are given a set ,
a collection of open subsets
of
which satises the following three criteria:
1. each map
is injective
2. if
:
1
)
1
)
such that
3. =
)
Moreover, we call the mappings
is
called a coordinate chart on . The component functions of a chart (,
1
) are usually
denoted
1
= (
1
,
2
, . . . ,
) where
: for each = 1, 2, . . . , . .
We could add to this denition that is taken from an index set (which could be an innite
set). The union given in criteria (3.) is called a covering of . Most often, we deal with nitely
covered manifolds. You may recall that there are innitely many ways to parametrize the lines
or surfaces we dealt with in calculus III. The story here is no dierent. It follows that when we
consider classication of manifolds the denition we just oered is a bit lacking. We would also like
to lump in all other possible compatible parametrizations. In short, the denition we gave says a
manifold is a set together with an atlas of compatible charts. If we take that atlas and adjoin
to it all possible compatible charts then we obtain the so-called maximal atlas which denes a
dierentiable structure on the set . Many other authors dene a manifold as a set together
with a dierentiable structure. That said, our less ambtious denition will do.
We should also note that
=
1
hence
1
= (
1
)
1
=
1
. The functions
are called the transition functions of . These explain how we change coordinates locally.
I now oer a few examples so you can appreciate how general this denition is, in contrast to the
level-set denition we explored previously. We will recover those as examples of this more general
denition later in this chapter.
Example 8.1.2. Let =
and suppose :
) denes the collection of paramterizations on . In this case the collection is just one
mapping and = =
and suppose
then :
dened by () =
makes () = an -dimensional manifold. Again we have no overlap and the covering criteria
is clearly satised so that leaves injectivity of . Note () = (
) implies
hence
=
.
Example 8.1.4. Suppose is an -dimensional vector space over with basis = {
=1
.
Dene :
() =
1
1
+
2
2
+ +
.
Injectivity of the map follows from the linear independence of . The overlap criteria is trivially
satised. Moreover, () = thus we know that (
.
Moreover, in the context of a vector space we also have innitely many coordinate systems to
use. We will have to analyze compatibility of those new coordinates as we adjoin them. For the
vector space its simple to see the transition maps are smooth since theyll just be invertible linear
mappings. On the other hand, it is more work to show new curvelinear coordinates on
are
compatible with Cartesian coordinates. The inverse function theorem would likely be needed.
Example 8.1.5. Let = {(cos(), sin()) [0, 2)}. Dene
1
() = (cos() sin()) for all
(0, 3/2) =
1
. Also, dene
2
() = (cos() sin()) for all (, 2) =
2
. Injectivity
follows from the basic properties of sine and cosine and covering follows from the obvious geometry
of these mappings. However, overlap we should check. Let
1
=
1
(
1
) and
2
=
2
(
2
). Note
1
2
= {(cos(), sin()) < < 3/2}. We need to nd the formula for
12
:
1
2
(
1
2
)
1
1
(
1
2
)
In this example, this means
12
: (, 3/2) (, 3/2)
Example 8.1.6. Lets return to the vector space example. This time we want to allow for all
possible coordinate systems. Once more suppose is an -dimensional vector space over . Note
that for each basis = {
=1
. Dene
() =
1
1
+
2
2
+ +
.
Suppose ,
are given by
=
1
. It follows
that () = for some () = {
() = 0}. It follows that is a smooth
mapping since each component function of is simply a linear combination of the variables in
.
Lets take a moment to connect with linear algebra notation. If =
1
then
=
1
hence
as we used
.
Thus,
() =
() implies []
= []
. In contrast, Examples 8.1.4 and 8.1.6 are called abstract manifolds since the points in the
manifold were not found in Euclidean space
2
. If you are only interested in embedded manifolds
3
then the denition is less abstract:
Denition 8.1.7. embedded manifold.
We say is a smooth embedded manifold of dimension i we are given a set
of
is injective
2. each map
is smooth
3. each map
1
is continuous
4. the dierential
.
You may identify that this denition more closely resembles the parametrized objects from your
multivariate calculus course. There are two key dierences with this denition:
1. the set
in the abstract denition. In practice, for the abstract case, we use the charts
to lift open sets to , we need not assume any topology on since the machinery of the
manifold allows us to build our own. However, this can lead to some pathological cases so
those cases are usually ruled out by stating that our manifold is Hausdor and the covering
has a countable basis of open sets
4
. I will leave it at that since this is not a topology course.
2. the condition that the inverse of the local parametrization be continuous and
be smooth
were not present in the abstract denition. Instead, we assumed smoothness of the transition
functions.
One can prove that the embedded manifold of Dentition 8.1.7 is simply a subcase of the abstract
manifold given by Denition 8.1.1. See Munkres Theorem 24.1 where he shows the transition
2
a vector space could be euclidean space, but it could also be a set of polynomials, operators or a lot of other
rather abstract objects.
3
The detion I gave for embedded manifold here is mostly borrowed from Munkres excellent text Analysis on
Manifolds where he primarily analyzes embedded manifolds
4
see Burns and Gidea page 11 in Dierential Geometry and Topology With a View to Dynamical Systems
198 CHAPTER 8. MANIFOLD THEORY
functions of an embedded manifold are smooth. In fact, his theorem is given for the case of a
manifold with boundary which adds a few complications to the discussion. Well discuss manifolds
with boundary at the conclusion of this chapter.
Example 8.1.8. A line is a one dimensional manifold with a global coordinate patch:
() =
+
for all . We can think of this as the mapping which takes the real line and glues it in
along
some line which points in the direction and the new origin is at
. In this case :
and
,
are any two linearly independent vectors in the plane, and
for all (, )
2
. This amounts to pasting a copy of the -plane in
. If we just wanted a little paralellogram then we could restrict (, ) [0, 1] [0, 1],
then we would envision that the unit-square has been pasted on to a paralellogram. Lengths and
angles need not be maintained in this process of gluing. Note that the rank two condition for says
the derivative
(, ) = [
] = [
which happens to be the normal to the plane.
Example 8.1.10. A cone is almost a manifold, dene
(, ) = ( cos(), sin(), )
for [0, 2] and 0. What two problems does this potential coordinate patch :
2
3
suer from? Can you nd a modication of which makes () a manifold (it could be a subset
of what we call a cone)
The cone is not a manifold because of its point. Generally a space which is mostly like a manifold
except at a nite, or discrete, number of singular points is called an orbifold. Recently, in the
past decade or two, the orbifold has been used in string theory. The singularities can be used to
t various charge to elds through a mathematical process called the blow-up.
Example 8.1.11. Let (, ) = (cos() cosh(), sin() cosh(), sinh()) for (0, 2) and .
This gives us a patch on the hyperboloid
2
+
2
2
= 1
Example 8.1.12. Let (, , , ) = (, , , cos(), sin()) for (0, 2) and (, , )
3
.
This gives a copy of
3
inside
5
where a circle has been attached at each point of space in the two
transverse directions of
5
. You could imagine that is nearly zero so we cannot traverse these
extra dimensions.
8.1. MANIFOLDS 199
Example 8.1.13. The following patch describes the Mobius band which is obtained by gluing a
line segment to each point along a circle. However, these segments twist as you go around the circle
and the structure of this manifold is less trivial than those we have thus far considered. The mobius
band is an example of a manifold which is not oriented. This means that there is not a well-dened
normal vectoreld over the surface. The patch is:
(, ) =
_
_
1 +
1
2
cos(
2
)
cos(),
_
1 +
1
2
sin(
2
)
sin(),
1
2
sin(
2
)
_
for 0 2 and 1 1. To understand this mapping better try studying the map evaluated
at various values of ;
(0, ) = (1 +/2, 0, 0), (, ) = (1, 0, /2), (2, ) = (1 /2, 0, 0)
Notice the line segment parametrized by (0, ) and (2, ) is the same set of points, however the
orientation is reversed.
Example 8.1.14. A regular surface is a two-dimensional manifold embedded in
3
. We need
2
3
such that, for each ,
,
has rank two for all (, )
. Moreover, in
this case we can dene a normal vector eld (, ) =
where
is open in
satises
1. each map
is bijective
2. if
)
such that
.
3. =
= {(, ) < 0} = (
) and dene
(, ) =
3. Let
= {(, ) > 0} = (
) and dene
(, ) =
4. Let
= {(, ) < 0} = (
) and dene
(, ) =
The set of charts = {(
+
,
+
), (
), (
), (
=
+
= {(, )
, > 0}, its easy to calculate that
1
+
() = (,
1
2
) hence
(
1
+
)() =
(,
1
2
) =
1
2
for each
(
+
). Note
(
+
) implies 0 < < 1 hence it is clear the transition
function is smooth. Similar calculations hold for all the other overlapping charts. This manifold is
usually denoted =
1
.
A cylinder is the Cartesian product of a line and a circle. In other words, we can create a cylinder
by gluing a copy of a circle at each point along a line. If all these copies line up and dont twist
around then we get a cylinder. The example that follows here illustrates a more general pattern,
we can take a given manifold an paste a copy at each point along another manifold by using a
Cartesian product.
Example 8.1.18. Let = {(, , )
3
2
+
2
= 1}.
1. Let
+
= {(, , ) > 0} = (
+
) and dene
+
(, , ) = (, )
2. Let
= {(, , ) < 0} = (
) and dene
(, , ) = (, )
3. Let
= {(, , ) > 0} = (
) and dene
(, , ) = (, )
4. Let
= {(, , ) < 0} = (
) and dene
(, , ) = (, )
The set of charts = {(
+
,
+
), (
), (
), (
=
+
= {(, , )
, > 0}, its easy to calculate that
1
+
(, ) = (,
1
2
, ) hence
(
1
+
)(, ) =
(,
1
2
, ) = (
1
2
, )
for each (, )
(
+
). Note (, )
(
+
) implies 0 < < 1 hence it is clear the
transition function is smooth. Similar calculations hold for all the other overlapping charts.
Generally, given two manifolds and we can construct by taking the Cartesian product
of the charts. Suppose
and
as =
1
. The -torus is constructed
by taking the product of -circles:
=
1
1
1
. .
The atlas on this space can be obtained by simply taking the product of the
1
charts -times.
One of the surprising discoveries in manifold theory is that a particular set of points may have many
dierent possible dierentiable structures. This is why mathematicians often say a manifold is a
set together with a maximal atlas. For example, higher-dimensional spheres (
7
,
8
, ...) have more
than one dierentiable structure. In contrast,
3
or the polar coordinate system for
2
. Technically, certain restrictions must be made on the
domain of these non-Cartesian coordinates if we are to correctly label them coordinate charts.
Interestingly, applications are greedier than manifold theorists, we do need to include those points
in
which spoil the injectivity of spherical or cylindrical coordinates. On the other hand, those
bad points are just the origin and a ray of points which do not contribute noticable in the calcula-
tion of a surface or volume integral.
I will not attempt to make explicit the domain of the coordinate charts in the following two examples
( you might nd them in a homework):
Example 8.1.20. Dene
2
+
2
+
2
, = tan
1
_
_
, = cos
1
_
2
+
2
+
2
_
To show compatibility with the standard Cartesian coordinates we would need to select a subset of
3
for which
.
Example 8.1.21. Dene
2
+
2
, = tan
1
_
_
, =
You can take (
and
such that
and
1
is a smooth mapping
from
to
and
such that
. .
.
=
1
. .
. .
. .
.
. .
. .
follows from the chain rule for mappings. This formula shows that if is smooth with respect to a
particular pair of coordinates then its representative will likewise be smooth for any other pair of
compatible patches.
8.1. MANIFOLDS 203
Example 8.1.24. Recall in Example 8.1.3 we studied = {
which is dened by () =
. Clearly
1
(
, ) = for all
(
, ) . Let
. Consider
the function = :
to
consider. Let
=
1
= .
Hence, is a smooth bijection from
to and we nd is dieomorphic to
and suppose
:
1
.
We nd
1
is smooth on
. It follows that
1
is a dieomorphism since we know transition
functions are smooth on a manifold. We arrive at the following characterization of a manifold: a
manifold is a space which is locally dieomorphic to
.
However, just because a manifold is locally dieomorphic to
. For example, it is a well-known fact that there does not exist a smooth
bijection between the 2-sphere and
2
. The curvature of a manifold gives an obstruction to making
such a mapping.
204 CHAPTER 8. MANIFOLD THEORY
8.2 tangent space
Since a manifold is generally an abstract object we would like to give a denition for the tangent
space which is not directly based on the traditional geometric meaning. On the other hand, we
should expect that the denition which is given in the abstract reduces to the usual geometric
meaning for the context of an embedded manifold. It turns out there are three common viewpoints.
1. a tangent vector is an equivalence class of curves.
2. a tangent vector is a contravariant vector.
3. a tangent vector is a derivation.
I will explain each case and we will nd explicit isomorphisms between each language. We assume
that is an -dimensional smooth manifold throughout this section.
8.2.1 equivalence classes of curves
I essentially used case (1.) as the denition for the tangent space of a level-set. Suppose :
is a smooth curve with (0) = . In this context, this means that all the local
coordinate representatives of are smooth curves on
1
(0) =
2
(0) = to be similar at i (
1
1
)
(0) = (
1
2
)
1
,
2
are similar at then we denote this by writing
1
2
. We insist the curves be
parametrized such that they reach the point of interest at the parameter = 0, this is not a severe
restriction since we can always reparametrize a given curve which reaches at =
by replacing
the parameter with
. Observe that
i (0) = and (
1
(0) = (
1
2
then (
1
1
)
(0) = (
1
2
)
(0) hence (
1
2
)
(0) =
(
1
1
)
(0) and we nd
2
1
thus
is a symmetric relation.
(iii) transitive: if
1
2
and
2
3
then (
1
1
)
(0) = (
1
2
)
(0) and (
1
2
)
(0) =
(
1
3
)
(0) thus (
1
1
)
(0) = (
1
3
)
3
.
The equivalence classes of
(0) in
6
Note, we may have to restrict the domain of
1
. Conversely, given = (
1
,
2
, . . . ,
with
direction and base-point
1
():
() =
1
() +.
We compose with to obtain a smooth curve through which corresponds to the vector .
In invite the reader to verify that =
has
(1.) (0) = (2.) (
1
(0) = .
Notice that the correspondence is made between a vector in
rel-
ative to the chart
1
: , with . In particular, we suppose (0) = (0) = and
(
1
(0) = (
1
(0). Let
1
:
, with
, we seek to show
relative to the
chart
1
. Note that
(0) = (
(
1
())(
1
(0)
Likewise, (
(0) = (
(
1
())(
1
(
1
()) is an in-
vertible matrix since it is the derivative of the invertible transition functions, label (
(
1
()) =
to obtain:
(
(0) = (
1
(0) and (
(0) = (
1
(0)
the equality (
(0) = (
relative to the
coordi-
nate chart. We nd that the equivalence classes of curves are independent of the coordinate system.
With the analysis above in mind we dene addition and scalar multiplication of equivalence classes
of curves as follows: given a coordinate chart
1
: with , equivalence classes
1
,
2
at and
, if
1
has (
1
1
)
(0) =
1
in
and
2
has (
1
2
)
(0) =
2
in
then we
dene
(i)
1
+
2
= where (
1
(0) =
1
+
2
(ii)
1
=
where (
1
(0) =
1
.
We know and exist because we can simply push the lines in
based at
1
() with directions
1
+
2
and
1
up to to obtain the desired curve and hence the required equivalence class.
Moreover, we know this construction is coordinate independent since the equivalence classes are
indpendent of coordinates.
Denition 8.2.1.
206 CHAPTER 8. MANIFOLD THEORY
Suppose is an -dimensional smooth manifold. We dene the tangent space at
to be the set of
(0) = and (
1
(
1
()). With this in
mind we could use the pair (, ) or (, ) to describe a tangent vector at . The cost of using (, )
is it brings in questions of coordinate dependence.
The equivalence class viewpoint is at times quite useful, but the denition of vector oered here is
a bit easier in certain respects. In particular, relative to a particular coordinate chart
1
: ,
with , we dene (temporary notation)
= {(, )
}
Vectors are added and scalar multiplied in the obvious way:
(,
1
) + (,
2
) = (,
1
+
2
) and (,
1
) = (,
1
)
for all (,
1
, (,
2
)
1
co-
ordinate chart then the vector changes form as indicated in the previous subsection; (, ) (, )
where = and = (
(
1
()). The components of (, ) are said to transform
contravariantly.
Technically, this is also an equivalence class construction. A more honest notation would be to
replace (, ) with (, , ) and then we could state that (, , ) (,
, ) i = and =
(
(
1
()). However, this notation is tiresome so we do not pursue it further. I prefer the
notation of the next viewpoint.
8.2.3 derivations
To begin, let us dene the set of locally smooth functions at :
is a derivation on
( + ) =
() +
() and
() = ()
() +
() and .
Example 8.2.3. Let = and consider
= /
.
Example 8.2.4. Consider =
2
. Pick = (
) and dene
and
.
Once more it is clear that
()
2
. These derivations action is accomplished by partial
dierentiation followed by evaluation at .
Example 8.2.5. Suppose =
. Pick
and dene =
. Clearly this is a
derivation for any
.
Are the other types of derivations? Is the only thing a derivation is is a partial derivative operator?
Before we can explore this question we need to dene partial dierentiation on a manifold. We
should hope the denition is consistent with the langauge we already used in multivariate calculus
(and the preceding pair of examples) and yet is also general enough to be stated on any abstract
smooth manifold.
Denition 8.2.6.
Let be a smooth -dimensional manifold and let : be a local parametrization
with . The -th coordinate function
1
: . In other words:
1
() = () = (
1
(),
2
(), . . . ,
())
These
via
) and viewed
as functions
where
() =
() =
at for
() as follows:
() =
_
(
)()
_
=
1
()
=
1
_
()
.
The idea of the dention is simply to take the function with domain in then pull it back to
a function
1
:
on
1
in the same way we did in multivariate calculus. In particular, the partial derivative w.r.t.
is
calculated by:
() =
_
_
_
(() +
)
_
=0
208 CHAPTER 8. MANIFOLD THEORY
which is precisely the directional derivative of
1
in the -direction at (). In fact, Note
_
_
(() +
) = (
1
(() +
)).
The curve
1
(() +
) reduces to +
. It follows that the partial derivative dened for manifolds naturally reduces to the ordinary
partial derivative in the context of =
.
Theorem 8.2.7. Partial dierentiation on manifolds
Let be a smooth -dimensional manifold with coordinates
1
,
2
, . . . ,
near . Fur-
thermore, suppose coordinates
1
,
2
, . . . ,
()
and then:
1.
_
+
2.
3.
= ()
()
4.
5.
=1
6.
=1
Proof: The proof of (1.) and (2.) follows from the calculation below:
( +)
() =
_
( +)
1
_
()
=
1
+
1
_
()
=
1
_
()
+
1
_
()
=
() +
() (8.1)
8.2. TANGENT SPACE 209
The key in this argument is that composition ( + )
1
=
1
+
1
along side the
linearity of the partial derivative. Item (3.) follows from the identity ()
1
= (
1
)(
1
)
in tandem with the product rule for a partial derivative on
_
(
1
)()
_
()
=
()
=
.
where the last equality is known from multivariate calculus. In invite the reader to prove it from
the denition if unaware of this fact. Before we prove (5.) it helps to have a picture and a bit
more notation in mind. Near the point we have two coordinate charts :
and
:
, we take the chart domain to be small enough so that both charts are
dened. Denote Cartesian coordinates on by
1
,
2
, . . . ,
1
are mappings from
to
and
we note
_
(
1
)()
_
=
1
are mappings from
to
_
(
1
)()
_
=
Recall that if , :
and
= then
)
1
=
.
Apply this general fact to the transition functions, we nd their derivative matrices are inverses.
Item (5.) follows. In matrix notation we item (5.) reads
_
(
1
)()
_
()
=
_
(
1
)()
_
()
=
_
_
1
_
(
1
(), . . . ,
())
_
()
: where
() = (
1
)
()
=
=1
(
1
)
()
(
1
)
1
)(())
: chain rule
=
=1
(
1
)
()
(
1
)
()
=
=1
=1
)
(,
) where
and = (
1
)
(()) =
()
. Notice this is the inverse of what we see
in (6.). This suggests that the partial derivatives change coordinates like as a basis for the tangent
space. To complete this thought we need a few well-known propositions for derivations.
Proposition 8.2.8. derivations on constant function gives zero.
If
then
() = 0.
Proof: Suppose () = for all , dene () = 1 for all and note = on . Since
() =
() = ()
() +()() =
() +
()
() = 0.
Moreover, by homogeneity of
, note
() =
() =
(). Thus,
() = 0.
Proposition 8.2.9.
If ,
then
() =
().
Proof: Note that () = () implies () = () () = 0 for all . Thus, the previous
proposition yields
() = 0. Thus,
( ) = 0 and by linearity
()
() = 0. The
proposition follows.
Proposition 8.2.10.
Suppose
=1
Proof: this is a less trivial proposition. We need a standard lemma before we begin.
8.2. TANGENT SPACE 211
Lemma 8.2.11.
Let be a point in smooth manifold and let : be a smooth function. If
: is a chart with and () = 0 then there exist smooth functions
:
whose values at satisfy
() =
=1
()
()
Proof: follows from proving a similar identity on
() =
():
() =
_
() +
=1
()
()
_
=
(()) +
=1
()
()
_
=
=1
_
() +
()
())
_
=
=1
().
The calculation above holds for arbitrary
then we dene
by (
)() =
() +
() and
by
(
)() =
() for all ,
= {
=1
hence
=1
or
=1
7
technically, we should show the coordinate derivations
) =
=1
(8.2)
This is the contravariant transformation rule. In contrast, recall
=1
. We
should have anticipated this pattern since from the outset it is clear there is no coordinate depen-
dence in the denition of a derivation.
8.2.4 dictionary between formalisms
We have three competing views of how to characterize a tangent vector.
1.
= {(, )
}
3.
Perhaps it is not terribly obvious how to get a derivation from an equivalence class of curves.
Suppose is a tangent vector to at and let ,
associated to
via
() = (
)
(0). Consider, ( +)
( +)() = ( +)
(0) =
()()+
() = (
)
()(()) +(())(
()
hence, noting (0) = we verify the Leibniz rule for
() = (()
(0) = (
)
(0)() +()(
(0) =
()() +()
()
In view of these calculations we nd that :
dened by ( ) =
is
well-dened. Moreover, we can show is an isomorphism. To be clear, we dene:
( )() =
() = (
)
(0).
Ill begin with injectivity. Suppose ( ) = (
() we have ( )() = (
)()
hence (
)
(0) = (
)
hence =
and we have shown is injective. Linearity of must be judged on
the basis of our denition for the addition of equivalence classes of curves. I leave linearity and
surjectivity to the reader. Once those are established it follows that is an isomorphism and
.
The isomorphism between
and
for each (,
, relative to coordinates
at ,
_
,
=1
_
=
=1
) (,
) and consequently
_
,
=1
_
=
=1
=1
.
Thus is single-valued on each equivalence class of vectors. Furthermore, the inverse mapping is
simple to write: for a chart at ,
1
(
) = (,
=1
)
and the value of the mapping above is related contravariantly if we were to use a dierent chart
1
(
) = (,
=1
).
See Equation 8.2 and the surrounding discussion if you forgot. It is not hard to verify that
is bijective and linear thus is an isomorphism. We have shown
. Let us
summarize:
Sorry to be anticlimatic here, but we choose the following for future use:
Denition 8.2.12. tangent space
We denote
.
214 CHAPTER 8. MANIFOLD THEORY
8.3 the dierential
In this section we generalize the concept of the dierential to the context of manifolds. Recall that
for :
the dierential
is called the push-forward by at because it pushes tangent vectors along side the
mapping.
Denition 8.3.1. dierential for manifolds.
Suppose and are smooth manifolds of dimension and respective. Furthermore,
suppose : is a smooth mapping. We dene
()
as follows: for
each
and
(())
)() =
).
Notice that : () and consequently
: and it follows
() and it is natural to nd
in the domain of
)
()
. Observe:
1.
)( +) =
(( +)
) =
) =
) +
)
2.
)() =
(()
) =
)) =
)) =
)()
The proof of the Leibniz rule is similar. Now that we have justied the denition lets look at an
interesting application to the study of surfaces in
3
.
Suppose
3
is an embedded two-dimensional manifold. In particular suppose is a regular
surface which means that for each parametrization : the normal vector eld (, ) =
(
3
= 1} is also manifold, perhaps you showed this in a homework. In any event, the mapping
:
2
dened by
(, ) =
provides a smooth mapping from the surface to the unit sphere. The change in measures how
the normal deects as we move about the surface . One natural scalar we can use to quantify
that curving of the normal is called the Gaussian curvature which is dened by = ().
Likewise, we dene = () which is the mean curvature of . If
1
,
2
are the eigen-
values the operator
) =
1
2
and
(
) =
1
+
2
. The eigenvalues are called the principal curvatures. Moreover, it can be
shown that the matrix of
.
8.3. THE DIFFERENTIAL 215
Example 8.3.2. Consider the plane with base point
_
(
) =
_
=
(
_
=
2
=1
(
is the 22 matrix
_
(
_
with respect to the choice of coordinates
1
,
2
on and
1
,
2
on the sphere.
Example 8.3.3. Suppose (, ) = ( , ,
2
) parameterizes part of a sphere
of
radius > 0. You can calculate the Gauss map and the result should be geometrically obvious:
(, ) =
1
_
, ,
2
_
Then the and components of (, ) are simply / and / respective. Calculate,
[
] =
_
]
_
=
_
1
0
0
1
_
Thus the Gaussian curvature of the sphere = 1/
2
. The principle curvatures are
1
=
2
= 1/
and the mean curvature is simply = 2/. Notice that as we nd agreement with the
curvature of a plane.
Example 8.3.4. Suppose is a cylinder which is parametrized by (, ) = (cos , sin , ).
The Gauss map yields (, ) = (cos , sin , 0). I leave the explicit details to the reader, but it can
be shown that
1
= 1/,
2
= 0 and hence = 0 whereas = 1/.
216 CHAPTER 8. MANIFOLD THEORY
The dierential is actually easier to frame in the equivalence class curve formulation of
. In
particular, suppose = [] as a more convenient notation for what follows. In addition, suppose
: is a smooth function and []
then we dene
()
as follows:
([]) = [
]
There is a chain-rule for dierentials. Its the natural rule youd expect. If : and
: then, denoting = (),
) =
.
The proof is simple in the curve notation:
_
_
([]) =
([])
_
=
([
]) = [
(
)] =
)[].
You can see why the curve formulation of tangent vectors is useful. It does simply certain questions.
That said, we will insist
in sequel.
The push-forward need not be an abstract
8
exercise.
Example 8.3.5. Suppose :
2
,
2
,
is the polar coordinate transformation. In particular,
(, ) = ( cos , sin )
Lets examine where pushes
and
or
, the values of
)() and
)()
as we know
) =
)()
)()
where = ().
) =
( cos )
( sin )
= cos
+ sin
= sin
+ cos
.
Therefore, the push-forward is a tool which we can use to change coordinates for vectors. Given
the coordinate transformation on a manifold we just push the vector of interest presented in one
coordinate system to the other through the formulas above. In multivariate calculus we simply
thought of this as changing notation on a given problem. I would be good if you came to the same
understanding here.
8
its not my idea of abstract that is wrong... think about that.
8.4. COTANGENT SPACE 217
8.4 cotangent space
The tangent space to a smooth manifold is a vector space of derivations and we denote it by
. The dual space to this vector space is called the cotangent space and the typical elements
are called covectors.
Denition 8.4.1. cotangent space
= {
}.
If is a local coordinate chart at and
, . . . ,
is a basis for
then we denote
the dual basis
1
,
2
, . . . ,
where
_
=
=1
where
=
_
_
and
is a short-hand for
at this time:
1.
()
is dened as a push-forward,
)() =
)
2.
where
_
=
It is customary to identify
()
with hence there is no trouble. Let us examine how the
dual-basis condition can be derived for the dierential, suppose : hence
: ,
_
() =
((
)) =
()
. .
=
()
_
_
=
()
( which is the nut and bolts of writing
()
=
) and hence have the beautiful identity:
_
=
.
9
we explained this for an arbitrary vector space and its dual
and : we nd
) =
()
This notation is completely consistent with the total dierential as commonly discussed in multi-
variate calculus. Recall that if :
then we dened
=
1
+
2
+ +
.
Notice that the -th component of is simply
) =
()
gives us the same component if we simply evaluate the covector
_
=
and
a cotangent space
meaning
. .
. .
.
Relative to a particular coordinate chart at we can build a basis for
2
...
2
...
such that
1
,...,,
1
,...,=1
(
2
...
2
...
)()
.
The components can be calculated by contraction with the appropriate vectors and covectors:
(
2
...
2
...
)() =
_
, . . . ,
1
, . . . ,
_
.
We can summarize the equations above with multi-index notation:
and
()
0
2
we can calculate
hodge duals in
as follows:
=
and
.
The cannonical projections , tell us where a particular vector or covector are found on the
manifold:
(
) = and (
) =
I usually picture this construction as follows:
XXX add projection pictures
Notice the bers of and are
1
() =
and
1
() =
is injective.
In other words, the image of a section hits each ber over its domain just once. A section selects
a particular element of each ber. Heres an abstract picture of section, I sometimes think of the
section as its image although technically the section is actually a mapping:
XXX add section picture
Given the language above we nd a natural langauge to dene vector and covector-elds on a
manifold. However, for reasons that become clear later, we call a covector-eld a dierential one-
form.
Denition 8.6.2. tensor elds.
10
I assume is Hausdor and has a countable basis, see Burns and Gidea Theorem 3.2.5 on page 116.
220 CHAPTER 8. MANIFOLD THEORY
Let , we dene:
1. is a vector eld on i is a section of on
2. is a dierential one-form on i is a section of
on .
3. is a type (, ) tensor-eld on i is a section of
on .
We consider only smooth sections and it turns out this is equivalent
11
to the demand that the
component functions of the elds above are smooth on .
8.7 metric tensor
Ill begin by discussing briey the informal concept of a metric. The calculations given in the rst
part of this section show you how to think for nice examples that are embedded in
. In such
cases the metric can be deduced by setting appropriate terms for the metric on
to zero. The
metric is then used to set-up arclength integrals over a curved space, see my Chapter on Varitional
Calculus from the previous notes if you want examples.
In the second part of this chapter I give the careful denition which applies to an arbitrary manifold.
I include this whole section mostly for informational purposes. Our main thrust in this course is
with the calculus of dierential forms and the metric is actually, ignoring the task of hodge duals,
not on the center stage. That said, any student of dierential geometry will be interested in the
metric. The problem of paralell transport
12
, and the denition and calculation of geodesics
13
are
fascinating problems beyond this course.
8.7.1 classical metric notation in
Denition 8.7.1.
The Euclidean metric is
2
=
2
+
2
+
2
. Generally, for orthogonal curvelinear
coordinates , , we calculate
2
=
1
2
+
1
2
+
1
2
.
The beauty of the metric is that it allows us to calculate in other coordinates, consider
= cos() = sin()
For which we have implicit inverse coordinate transformations
2
=
2
+
2
and = tan
1
(/).
From these inverse formulas we calculate:
= < /, / > = < /
2
, /
2
>
11
all the bundles above are themselves manifolds, for example is a 2-dimensional manifold, and as such the
term smooth has already been dened. I do not intend to delve into that aspect of the theory here. See any text on
manifold theory for details.
12
how to move vectors around in a curved manifold
13
curve of shortest distance on a curved space, basically they are the lines on a manifold
8.7. METRIC TENSOR 221
Thus, = 1 whereas = 1/. We nd that the metric in polar coordinates takes the form:
2
=
2
+
2
2
Physicists and engineers tend to like to think of these as arising from calculating the length of
innitesimal displacements in the or directions. Generically, for , , coordinates
=
1
=
1
=
1
and
2
=
2
+
2
+
2
= and
= . Notice then
that cylindircal coordinates have the metric,
2
=
2
+
2
2
+
2
.
For spherical coordinates = cos() sin(), = sin() sin() and = cos() (here 0 2
and 0 , physics notation). Calculation of the metric follows from the line elements,
= sin()
=
Thus,
2
=
2
+
2
sin
2
()
2
+
2
2
.
We now have all the tools we need for examples in spherical or cylindrical coordinates. What about
other cases? In general, given some -manifold embedded in
such that the manifold is described by setting all but of the coordinates to a constant.
For example, in
4
we have generalized cylindircal coordinates (, , , ) dened implicitly by the
equations below
= cos(), = sin(), = , =
On the hyper-cylinder = we have the metric
2
=
2
2
+
2
+
2
. There are mathemati-
cians/physicists whose careers are founded upon the discovery of a metric for some manifold. This
is generally a dicult task.
8.7.2 metric tensor on a smooth manifold
A metric on a smooth manifold is a type (2, 0) tensor eld on which is at each point a
metric on
is a symmetric, nondegenerate
bilinear form on
In this context
: are assumed to be smooth functions, the values may vary from point to
point in . Furthermore, we know that
for all ,
] is invertible
222 CHAPTER 8. MANIFOLD THEORY
by the nondegneracy of . Recall we use the notation
=1
.
Recall that according to Sylvesters theorem we can choose coordinates at some point which
will diagonalize the metric and leave (
3
. In the usual notation in
3
,
(, ) = ((, ), (, ), (, ))
Consider a curve : [0, 1] we can calculate the arclength of via the usal calculation in
3
.
The magnitude of velocity
() is
hence =
() and
the following integral calculates the length of ,
1
0
()
Since [0, 1] it follows there must exist some two-dimesional curve ((), ()) for which
() = ((), ()). Observe by the chain rule that
() =
_
_
We can calculate the square of the speed in view of the formula above, let
= and
= ,
()
2
=
_
2
2
+ 2
+
2
2
,
2
+ 2
+
2
2
,
2
+ 2
+
2
2
_
(8.3)
14
this was the denitin given in a general relativity course I took with the physicisist Martin Rocek of SUNY Stony
Brook. He then introduced non-coordinate form-elds which kept the metric constant. I may nd a way to show you
some of those calculations at the end of this course.
8.7. METRIC TENSOR 223
Collecting together terms which share either
2
, or
2
and noting that
2
+
2
+
2
and
2
+
2
+
2
we obtain:
()
2
=
2
+
2
Or, in the notation of Gauss,
= ,
= and
1
0
2
+ 2 +
2
We discover that on there is a metric induced from the ambient euclidean metric. In the current
coordinates, using (, ) =
1
,
= + 2 +
hence the length of a tangent vector is dened via =
2
=
hence =
and we can calculate surface area (if this integral exists!) via
() =
2
.
I make use of the standard notation for double integrals from multivariate calculus and the integra-
tion is to be taken over the domain of the parametrization of .
Many additional formulas are known for , , and there are entire texts devoted to exploring
the geometric intracies of surfaces in
3
. For example, John Opreas Dierential Geometry and
its Applications. Theorem 4.1 of that text is the celebrated Theorem Egregium of Gauss which
states the curvature of a surface depends only on the metric of the surface as given by , , . In
particular,
=
1
2
_
+
__
.
Where curvature at is dened by () = (
) and
() =
= ((
1
), (
2
), (
3
)) and is simply the normal
vector eld to dened by (, ) =
= . Dene
=
3
,=0
sphere.
The boundary of quadrants I and II of the -plane is the -axis. Or, to generalize this example,
we dene the upper-half of
as follows:
= {(
1
,
2
, . . . ,
1
,
0}.
The boundary of
is the
1
2
1
-hyperplane which is the solution set of
= 0 in
; we
can denote the boundary by
hence,
=
1
{0}. Furthermore, we dene
+
= {(
1
,
2
, . . . ,
1
,
> 0}.
15
I am glossing over some analytical details here concerning extensions and continuity, smoothness etc... see section
24 of Munkres a bit more detail in the embedded case.
8.8. ON BOUNDARIES AND SUBMANIFOLDS 225
It follows that
=
+
1
{0}. Note that a subset of
is said to be open in
i there exists some open set
such that
= . For example, if we consider
3
then the open sets in the -plane are formed from intesecting open sets in
3
with the plane; an
open ball intersects to give an open disk on the plane. Or for
2
an open disks intersected with
the -axis give open intervals.
Denition 8.8.1.
We say is a smooth -dimensional manifold with boundary i there exists a family
{
} of open subsets of
or
and local parameterizations
such
that the following criteria hold:
1. each map
is injective
2. if
:
1
)
1
)
such that
3. =
)
We again refer to the inverse of a local paramterization as a coordinate chart and often
use the notation
1
() = (
1
(),
2
(), . . . ,
such that
: is a local parametrization with then is an interior point. Any point
which is not an interior point is a boundary point. The set of all boundary points
is called boundary of is denoted .
A more pragmatic characterization
16
of a boundary point is that i there exists a chart
at such that
) : is a chart at with
() = 0. De-
ne the restriction of
1
to
= 0 by :
() by () = (, 0) where
=
{(
1
, . . . ,
1
)
1
(
1
, . . . ,
1
,
) }. It follows that
1
: ()
1
is just the rst 1 coordinates of the chart
1
which is to say
1
= (
1
,
2
, . . . ,
1
). We
16
I leave it to the reader to show this follows from the words in green.
226 CHAPTER 8. MANIFOLD THEORY
construct charts in this fashion at each point in . Note that
is open in
1
hence the man-
ifold only has interior points. There is no parametrization in which takes a boundary-type
subset half-plane as its domain. It follows that () = . I leave compatibility and smoothness
of the restricted charts on to the reader.
Given the terminology in this section we should note that there are shapes of interest which simply
do no t our terminology. For example, a rectangle = [, ] [, ] is not a manifold with bound-
ary since if it were we would have a boundary with sharp edges (which is not a smooth manifold!).
I have not included a full discussion of submanifolds in these notes. However, I would like to
give you some brief comments concerning how they arise from particular functions. In short, a
submanifold is a subset of a manifold which also a manifold in a natural manner. Burns and Gidea
dene for a smooth mapping from a manifold to another manifold that
a is a critical point of if
()
is not surjective. Moreover, the image
() is called the critical value of .
b is a regular point of if is not critical. Moreover, is called a regular value
of i
1
{} contains no critical points.
It turns out that:
Theorem 8.8.3.
If : is a smooth function on smooth manifolds , of dimensions ,
respective and is a regular value of with nonempty ber
1
{} then the ber
1
{} is a submanifold of of dimension ().
Proof: see page 46 of Burns and Gidea. .
The idea of this theorem is a variant of the implicit function theorem. Recall if we are given
:
is invertible.
But, this local solution suitably restricted is injective and hence the mapping () = (, ()) is a
local parametrization of a manifold in
. (think of =
hence = + and = so
we nd agreement with the theorem above at least in the concrete case of level-sets)
Example 8.8.4. Consider :
2
dened by (, ) =
2
+
2
. Calculate = 2 + 2
we nd that the only critical value of is (0, 0) since otherwise either or is nonzero and as a
consequence is surjective. It follows that
1
{
2
} is a submanifold of
2
for any > 0. I think
youve seen these submanifolds before. What are they?
Example 8.8.5. Consider :
3
dened by (, , ) =
2
2
2
calculate that =
2 2 +2. Note (0, 0, 0) is a critical value of . Furthermore, note
1
{0} is the cone
2
=
2
+
2
which is not a submanifold of
3
. It turns out that in general just about anything
8.8. ON BOUNDARIES AND SUBMANIFOLDS 227
can arise as the inverse image of a critical value. It could happen that the inverse image is a
submanifold, its just not a given.
Theorem 8.8.6.
If be a smooth manifold without boundary and : is a smooth function with a
regular value then
1
(, ] is a smooth manifold with boundar
1
{}.
Proof: see page 50 of Burns and Gidea. .
Example 8.8.7. Suppose :
is dened by () =
2
then = 0 is the only critical value
of and we nd
1
(,
2
] is a submanifold with boundary
1
{
2
}. Note that
1
(, 0) =
in this case. However, perhaps you also see
=
1
[0,
2
] is the closed -ball and
1
() is the (1)-sphere of radius .
Theorem 8.8.8.
Let be a smooth manifold with boundary and a smooth manifold without bound-
ary. If : and
1
= { 0} as the domain of a parametrization in the case of one-dimensional manifolds.
228 CHAPTER 8. MANIFOLD THEORY
Chapter 9
dierential forms
9.1 algebra of dierential forms
In this section we apply the results of the previous section on exterior algebra to the vector space
=
. Recall that {
} is a basis of
} of utilized throughout
the previous section on exterior algebra will be taken to be
, 1
in this section. Also recall that the set of covectors {
} is a basis of
which is dual to {
}
and consequently the {
, 1
in the present context. With these choices the machinery of the previous section takes over and
one obtains a vector space
) and we refer to
() as the
k-th exterior power of the tangent bundle . There is a projection :
() dened by
(, ) = for (, )
()
is a (smooth) function such that ()
.
To say that is a vector eld on an open subset of means that
=
1
1
+
2
2
+
where
1
,
2
, ,
) then is called a
dierential k-form on if for all local vector elds
1
,
2
, ,
(
1
(),
2
(), ,
())
is smooth on . For example if (
1
,
2
, ,
()
on () such that
=
1
2
,=1
()(
)
and such that
() =
1
!
()(
)
where the {
= ()
,
for every permutation . (this is just a fancy way of saying if you switch any pair of indices it
generates a minus sign).
The algebra of dierential forms follows the same rules as the exterior algebra we previously dis-
cussed. Remember, a dierential form evaluated a particular point gives us a wedge product of a
bunch of dual vectors. It follows that the dierential form in total also follows the general properties
of the exterior algebra.
Theorem 9.1.2.
9.2. EXTERIOR DERIVATIVES: THE CALCULUS OF FORMS 231
If is a -form, is a -form, and is a -form on then
1. ( ) = ( )
2. = (1)
( )
3. ( +) = ( ) +( ) ,
.
Notice that in
3
the set of dierential forms
= {1, , , , , , , }
is a basis of the space of dierential forms in the sense that every form on
3
is a linear combination
of the forms in with smooth real-valued functions on
3
as coecients.
Example 9.1.3. Let = + and let = 3 + where , are functions. Find ,
write the answer in terms of the basis dened in the Remark above,
= ( +) (3 +)
= (3 +) + (3 +)
= 3 + + 3 +
= 3
(9.1)
Example 9.1.4. Top form: Let = and let be any other form with degree > 0.
We argue that = 0. Notice that if > 0 then there must be at least one dierential inside
so if that dierential is
we can rewrite =
(9.2)
now has to be either 1, 2 or 3 therefore we will have
) is a chart and =
1
!
and we dene a
( + 1)-form to be the form
=
1
!
.
Where
=1
.
Note that
is well-dened as
=
1
!
1
=1
2
=1
=1
(
where
=
1
!
1
=1
2
=1
=1
()
.
9.2.1 coordinate independence of exterior derivative
The Einstein summation convention is used in this section and throughout the remainder
of this chapter, please feel free to email me if it confuses you somewhere. When an index is
repeated in a single summand it is implicitly assumed there is a sum over all values of that
index
It must be shown that this denition is independent of the chart used to dene . Suppose for
example, that
()(
)
for all in the domain of a chart (
1
,
2
,
) where
() (), = .
We assume, of course that the coecients {
.
9.2. EXTERIOR DERIVATIVES: THE CALCULUS OF FORMS 233
We need to show that
()
()
.
Using the identities
we have
so that
_
.
Consequently,
) =
_
](
_
(
)
+
1
2
_
(
)
=
1
_
_
_
=
__
is zero since:
) =
2
[(
] = 0.
It follows that is independent of the coordinates used to dene it.
Consequently we see that for each the operator maps
() into
+1
() and has the following
properties:
Theorem 9.2.2. properties of the exterior derivative.
If
(),
() and , R then
1. ( +) = () +()
2. ( ) = ( ) + (1)
( )
3. () = 0
234 CHAPTER 9. DIFFERENTIAL FORMS
Proof: The proof of (1) is obvious. To prove (2), let = (
1
, ,
) be a chart on then
(ignoring the factorial coecients)
( ) = (
= (
)
+
)
=
(1)
))
+
((
)
= ( (1)
) +
)
= + (1)
( ) .
9.2.2 exterior derivatives on
3
We begin by noting that vector elds may correspond either to a one-form or to a two-form.
Denition 9.2.3. dictionary of vectors verses forms on
3
.
Let
= (
1
,
2
,
3
) denote a vector eld in
3
. Dene then,
=
1
2
) =
1
2
)
which we will call the ux-form of
.
If you accept the primacy of dierential forms, then you can see that vector calculus confuses two
separate objects. Apparently there are two types of vector elds. In fact, if you have studied coor-
dinate change for vector elds deeply then you will encounter the qualiers axial or polar vector
elds. Those elds which are axial correspond directly to two-forms whereas those correspondant
to one-forms are called polar. As an example, the magnetic eld is axial whereas the electric eld
is polar.
Example 9.2.4. Gradient: Consider three-dimensional Euclidean space. Let :
3
then
=
= (
) + (
) + (
)
=
.
9.3. PULLBACKS 235
Thus we recover the curl.
Example 9.2.6. Divergence: Consider three-dimensional Euclidean space. Let
be a vector
eld and let
=
1
2
= (
1
2
=
1
2
=
1
2
=
1
2
2
)
=
= (
)
and in this way we recover the divergence.
9.3 pullbacks
Another important operation one can perform on dierential forms is the pull-back of a form
under a map
1
. The denition is constructed in large part by a sneaky application of the push-
forward (aka dierential) discussed in the preceding chapter.
Denition 9.3.1. pull-back of a dierential form.
If : is a smooth map and
() then
(
1
, ,
) =
()
(
(
1
),
(
2
), ,
)) .
for each and
1
,
2
, . . . ,
)() = (
)()
for all .
This operation is linear on forms and commutes with the wedge product and exterior derivative:
Theorem 9.3.2. properties of the pull-back.
If : is a
1
-map and
(),
() then
1.
( +) = (
) +(
) ,
2.
( ) =
)
3.
() = (
)
1
thanks to my advisor R.O. Fulp for the arguments that follow
236 CHAPTER 9. DIFFERENTIAL FORMS
Proof: The proof of (1) is clear. We now prove (2).
( )]
(
1
, ,
+
) = ( )
()
(
(
1
), ,
(
+
))
=
(sgn)( )
()
(
1
), ,
(
(+)
))
=
sgn()(
(
(1)
),
(
()
))((
(+1)
(+)
)
=
sgn()(
(
(1)
, ,
()
)(
)(
(+1)
, ,
(+)
)
= [(
) (
)]
(
1
,
2
, ,
(+)
)
Finally we prove (3).
()]
(
1
,
2
,
+1
) = ()
()
((
1
), (
+1
))
= (
)
()
((
1
), , (
+1
))
=
_
()
_
(
)
()
((
1
), , (
+1
))
=
_
()
_
[
)](
1
, ,
+1
)
= [(
) (
)](
1
, ,
+1
)
= [(
)](
1
, ,
+1
)
= (
(
1
, ,
+1
) .
The theorem follows. .
We saw that one important application of the push-forward was to change coordinates for a given
vector. Similar comments apply here. If we wish to change coordinates on a given dierential form
then we can use the pull-back. However, given the direction of the operation we need to use the
inverse coordinate transformation to pull forms forward. Let me mirror the example from the last
chapter for forms on
2
. We wish to convert from , to , notation.
Example 9.3.3. Suppose :
2
,
2
,
is the polar coordinate transformation. In particular,
(, ) = ( cos , sin )
The inverse transformation, at least for appropriate angles, is given by
1
(, ) =
_
2
+
2
, tan
1
(/)
_
.
Let calculate the pull-back of under
1
: let =
1
()
1
()
1
() = (
) +(
) =
1
() = (
) +(
) =
Note that =
2
+
2
and = tan
1
(/) have the following partial derivatives:
2
+
2
=
and
2
+
2
=
2
+
2
=
2
and
2
+
2
=
2
Of course the expressions using are pretty, but to make the point, we have changed into , -
notation via the pull-back of the inverse transformation as advertised. We nd:
=
+
2
+
2
and =
+
2
+
2
.
Once again we have found results with the pull-back that we might previously have chalked up to
substitution in multivariate calculus. Thats often the idea behind an application of the pull-back.
Its just a formal langauge to be precise about a substitution. It takes us past simple symbol
pushing and helps us think about where things are dened and how we may recast them to work
together with other objects. I leave it at that for here.
9.4 integration of dierential forms
The general strategy is generally as follows:
(i) there is a natural way to calculate the integral of a -form on a subset of
provided the
manifold is an oriented
2
-dimensional and thus by the previous idea we have an integral.
(iii) globally we should expect that we can add the results from various local charts and arrive at
a total value for the manifold, assuming of course the integral in each chart is nite.
We will only investigate items (.) and (.) in these notes. There are many other excellent texts
which take great eort to carefully expand on point (iii.) and I do not wish to replicate that eort
here. You can read Edwards and see about pavings, or read Munkres where he has at least 100
pages devoted to the careful study of multivariate integration. I do not get into those topics in my
notes because we simply do not have sucient analytical power to do them justice. I would encour-
age the student interested in deeper ideas of integration to nd time to talk to Dr. Skoumbourdis,
he has thought a long time about these matters and he really understands integration in a way we
dare not cover in the calculus sequence. You really should have that conversation after youve taken
2
we will discuss this as the section progresses
238 CHAPTER 9. DIFFERENTIAL FORMS
real analysis and have gained a better sense of what analysis purpose is in mathematics. That
said, what we do cover in this section and the next is fascinating whether or not we understand all
the analytical underpinnings of the subject!
9.4.1 integration of -form on
Note that on
a -form is the top form thus there exists some smooth function on
such that
= ()
1
2
. In particular,
()
. It
is sometimes convenient to write such an integral as:
()
()
1
but, to be more careful, the integration of over is a quantity which is independent of the
particular order in which the variables on
= ()
2
1
.
How the can we reasonably maintain the integral proposed above? Well, the answer is to make
a convention that we must write the form to match the standard orientation of
. The stan-
dard orientation of
is given by
=
1
2
= ()
then we dene
()
on
= () on some subset = [, ] of ,
() =
().
Or, if
(,)
= (, ) then for a aubset of
2
,
(, ) =
.
If
(,,)
= (, , ) then for a aubset of
3
,
(, , ) =
.
9.4. INTEGRATION OF DIFFERENTIAL FORMS 239
In practice we tend to break the integrals above down into an interated integral thanks to Fubinis
theorems. The integrals
and
1
:
1
1
is said to be right-handed i (
01
) > 0. Otherwise, if (
01
) < 0 then the
patch
1
:
1
1
is said to be left-handed. If the manifold is orientable then as we continue to
travel across the manifold we can choose coordinates such that on each overlap the transition func-
tions satisfy (
)
is positively oriented i (
1
,
2
, . . . ,
,
2
, . . . ,
) < 0 in
that case. It is important that we suppose
= ()
1
2
=1
and subsitute,
=
1
,...,
=1
(
1
)()
1
= (
1
)()
_
1
2
(9.3)
If you calculate the value of on
, . . . ,
youll nd (
) = (().
Whereas, if you evaluate on
, . . . ,
) =
(())
_
()
= (
) =
240 CHAPTER 9. DIFFERENTIAL FORMS
(()) > 0 then (
) = (())
_
()
> 0 i
_
()
> 0.
Let be an oriented -manifold with orientation given by the volume form and an associated
atlas of positively oriented charts. Furthermore, let be a -form dened on . Suppose
there exists a local parametrization :
= ()
1
2
1
()
(())
]
Is this denition dependent on the coordinate system :
? If we instead used
coordinate system :
where coordinates
1
,
2
, . . . ,
on
then the given
form has a dierent coecient of
1
2
= ()
1
2
= (
1
)()
_
1
2
Thus, as we change over to coordinates the function picks up a factor which is precisely the
determinant of the derivative of the transition functions.
1
)()
_
1
2
1
()
(
1
)()
_
]
We need
. Recall,
()
()
where
is more pedantically written as
=
1
, notation aside its just the function written
in terms of the new -coordinates. Likewise,
limits -coordinates so that the corresponding
-coordinates are found in . Applying this theorem to our pull-back expression,
1
()
(())
1
()
(
1
)()
.
Equality of
and
follows from the fact that is oriented and has transition functions
3
which satisfy (
) > 0. We see that this integral to be well-dened only for oriented manifolds.
To integrate over manifolds without an orientation additional ideas are needed, but it is possible.
3
once more recall the notation
. In this
context we must deal with both the coordinates of the ambient
? That is
the problem we concern ourselvew with for the remainder of this section.
Lets begin with a simple object. Consider a one-form =
=1
()
is smooth on some subset of
then
the natural chart on is provided by the parameter in particular we have
= {
}
where (
) = and
= {
and a
dierential form has the form (). How can we use the one-form on
to naturally obtain a
one-form dened along C? I propose:
() =
=1
(())
It can be shown that
=1
((()))
=1
(())
=1
(())
_
This is precisely the transformation rule we want for the components of a one-form.
Example 9.4.1. Suppose = + 3
2
+ and is the curve :
3
dened by
() = (1, ,
2
) we have = 1, = and =
2
hence = 0, = and = 2 on hence
= 0 + 3 +(2) = (3 + 2
2
).
Next, consider a two-form =
,=1
1
2
where , serve as
coordinates on the surface . We can write an arbitrary two-form on in the form (, )
where : is a smooth function on . How should we construct (, ) given ? Again, I
think the following formula is quite natural, honestly, what else would you do
4
?
(, ) =
,=1
((, ))
4
include the
1
2
you say?, well see why not soon enough
242 CHAPTER 9. DIFFERENTIAL FORMS
The coecient function of is smooth because we assume
is smooth on
and
,=1
,=1
,=1
.
Therefore, the restriction of to is coordinate independent and we have thus constructed a
two-form on a surface from the two-form in the ambient space.
Example 9.4.2. Consider =
2
+ + ( + + + ) . Suppose
4
is
parametrized by
(, ) = (1,
2
2
, 3, )
In other words, we are given that
= 1, =
2
2
, = 3, =
Hence, = 0, = 2
2
+ 2
2
, = 3 and = . Computing
is just a matter of
substuting in all the formulas above, fortunately = 0 so only the term is nontrivial:
= (2
2
+ 2
2
) (3) = 6
2
2
= 6
2
2
.
It is fairly clear that we can restrict any -form on
to
1
= (
1
,
2
, . . . ,
) then
1
2
(())
()
where () = (
1
(),
2
(), . . . ,
()) so we mean
to be the
component of ().
Moreover, the indices are understood to range over the dimension of the ambient space, if
we consider forms in
2
then = 1, 2 if in
3
then = 1, 2, 3 if in Minkowski
4
then
should be replaced with = 0, 1, 2, 3 and so on.
5
hopefully known to you already from multivariate calculus
9.4. INTEGRATION OF DIFFERENTIAL FORMS 243
Example 9.4.4. One form integrals vs. line integrals of vector elds: We begin with a
vector eld
and construct the corresponding one-form
(())
() =
You may note that the denition of a line integral of a vector eld is not special to three dimensions,
we can clearly construct the line integral in n-dimensions, likewise the correspondance can be
written between one-forms and vector elds in any dimension, provided we have a metric to lower
the index of the vector eld components. The same cannot be said of the ux-form correspondance,
it is special to three dimensions for reasons we have explored previously.
Denition 9.4.5. integral of two-form over an oriented surface:
Let =
1
2
((, ))
(, )
(, )
where (, ) = (
1
(, ),
2
(, ), . . . ,
(, )) so we mean
to be the
component
of (, ). Moreover, the indices are understood to range over the dimension of the ambient
space, if we consider forms in
2
then , = 1, 2 if in
3
then , = 1, 2, 3 if in Minkowski
4
then , should be replaced with , = 0, 1, 2, 3 and so on.
Example 9.4.6. Two-form integrals vs. surface integrals of vector elds in
3
: We begin
with a vector eld
and construct the corresponding two-form
=
1
2
which is to
say
=
1
+
2
+
3
. Next let be an oriented piecewise smooth surface
with parametrization :
2
3
, then
Proof: Recall that the normal to the surface has the form,
(, ) =
at the point (, ). This gives us a vector which points along the outward normal to the surface
and it is nonvanishing throughout the whole surface by our assumption that is oriented. Moreover
the vector surface integral of
over was dened by the formula,
((, ))
(, ) .
244 CHAPTER 9. DIFFERENTIAL FORMS
now that the reader is reminded whats what, lets prove the proposition, dropping the (u,v) depence
to reduce clutter we nd,
=
(
meaning we have
= (
=
1
2
(
= 3 . Lets calculate the surface integral and two-form integrals over the square =
[0, 1][0, 1] in the -plane, in this case the parameters can be taken to be and so (, ) = (, )
and,
(, ) =
=
(0, 0, 3) (0, 0, 1)
=
1
0
1
0
3
= 3.
Consider that
)
12
=
9.4. INTEGRATION OF DIFFERENTIAL FORMS 245
(
)
21
= 3 and all other components are zero,
=
(
=
_
3
_
=
1
0
1
0
_
3
_
= 3.
Denition 9.4.8. integral of a three-form over an oriented volume:
Let =
1
6
((, , ))
where (, , ) = (
1
(, , ),
2
(, , ), . . . ,
(, , )) so we mean
to be the
component of (, , ). Moreover, the indices are understood to range over the dimension
of the ambient space, if we consider forms in
3
then , , = 1, 2, 3 if in Minkowski
4
then
, , should be replaced with , , = 0, 1, 2, 3 and so on.
Finally we dene the integral of a -form over an -dimensional subspace of , we assume that
so that it is possible to embed such a subspace in ,
Denition 9.4.9. integral of a p-form over an oriented volume:
Let =
1
!
1
...
) :
(for ) then we
dene the integral of the p-form in the subspace as follows,
1
...
((
1
, . . . ,
))
where (
1
, . . . ,
) = (
1
(
1
, . . . ,
),
2
(
1
, . . . ,
), . . . ,
(
1
, . . . ,
)) so we mean
to be the
component of (
1
, . . . ,
where +1
and let be it boundary which is consistently oriented then for a -form which behaves
reasonably on we have that
The proof of this theorem (and a more careful statement of it) can be found in a number of places,
Susan Colleys Vector Calculus or Steven H. Weintraubs Dierential Forms: A Complement to
Vector Calculus or Spivaks Calculus on Manifolds just to name a few. I believe the argument in
Edwards text is quite complete. In any event, you should already be familar with the idea from
the usual Stokes Theorem where we must insist the boundary curve to the surface is related to
the surfaces normal eld according to the right-hand-rule. Explaining how to orient the boundary
given an oriented is the problem of generalizing the right-hand-rule to many dimensions. I
leave it to your homework for the time being.
Lets work out how this theorem reproduces the main integral theorems of calculus.
Example 9.5.2. Fundamental Theorem of Calculus in : Let : be a zero-form
then consider the interval [, ] in . If we let = [, ] then = {, }. Further observe that
=
()
However on the other hand we nd ( the integral over a zero-form is taken to be the evaluation
map, perhaps we should have dened this earlier, oops., but its only going to come up here so Im
leaving it.)
= () ()
Hence in view of the denition above we nd that
() = () ()
Then consider that the exterior derivative of a function corresponds to the gradient of the function
thus we are not to surprised to nd that
()
On the other hand, we use the denition of the integral over a a two point set again to nd
= () ()
Hence if the Generalized Stokes Theorem is true then so is the FTC in three dimensions,
()
= () ()
another popular title for this theorem is the fundamental theorem for line integrals. As a nal
thought here we notice that this calculation easily generalizes to 2,4,5,6,... dimensions.
Example 9.5.4. Greenes Theorem: Let us recall the statement of Greenes Theorem as I have
not replicated it yet in the notes, let be a region in the -plane and let be its consistently
oriented boundary then if
= ((, ), (, ), 0) is well behaved on
+ =
_
_
We begin by nding the one-form corresponding to
namely
= ( +) = + =
which simplies to,
=
_
_
=
(
+
where we have reminded the reader that the notation in the rightmost expression is just another
way of denoting the line integral in question. Next observe,
=
we have
)
Therefore,
+ =
_
Example 9.5.5. Gauss Theorem: Let us recall Gauss Theorem to begin, for suitably dened
and ,
First we recall our earlier result that
(
) = (
)
Now note that we may integrate the three form over a volume,
) =
(
)
whereas,
so there it is,
(
) =
) =
I have left a little detail out here, I may assign it for homework.
Example 9.5.6. Stokes Theorem: Let us recall Stokes Theorem to begin, for suitably dened
and ,
(
)
) =
Hence,
) =
(
)
whereas,
(
)
) =
2
...
)
(9.4)
then dierentiate again,
() =
_
1
!
(
2
...
)
_
=
1
!
(
2
...
)
= 0
(9.5)
since the partial derivatives commute whereas the wedge product anticommutes so we note that
the pair of indices (k,m) is symmetric for the derivatives but antisymmetric for the wedge, as we
know the sum of symmetric against antisymmetric vanishes ( see equation ?? part if you forgot.)
Denition 9.6.2.
A dierential form is closed i = 0. A dierential form is exact i there exists
such that = .
Proposition 9.6.3.
250 CHAPTER 9. DIFFERENTIAL FORMS
All exact forms are closed. However, there exist closed forms which are not exact.
Proof: Exact implies closed is easy, let be exact such that = then
= () = 0
using the theorem
2
= 0. To prove that there exists a closed form which is not exact it suces
to give an example. A popular example ( due to its physical signicance to magnetic monopoles,
Dirac Strings and the like..) is the following dierential form in
2
=
1
2
+
2
( ) (9.6)
You may verify that = 0 in homework. Observe that if were exact then there would exist
such that = meaning that
2
+
2
,
2
+
2
which are solved by = (/) + where is arbitrary. Observe that is ill-dened along
the -axis = 0 ( this is the Dirac String if we put things in context ), however the natural domain
of is
{(0, 0)}.
9.6.2 potentials for closed forms
Poincare suggested the following partial converse, he said closed implies exact provided we place
a topological restriction on the domain of the form. In particular, if the domain of a closed form
is smoothly deformable to a point then each closed form is exact. Well work out a proof of that
result for a subset of
.
Dene maps,
1
: ,
0
:
by
1
() = (1, ) and
0
() = (0, ) for each . Flanders encourages us to view as a
cylinder and where the map
1
maps to the top and
0
maps to the base. We can pull-back
forms on the cylinder to the on the top ( = 1) or to the base ( = 0). For instance, if we consider
= ( +) + for the case = 1 then
0
=
1
= ( + 1).
Dene a smooth mapping of ( + 1) forms on to -forms on as follows:
(1.) ((, )
) = 0, (2.) ((, )
) =
_
1
0
(, )
_
0
.
Proof: Since the equation is given for linear operations it suces to check the formula for mono-
mials since we can extend the result linearly once those are armed. As in the denition of
there are two basic categories of forms on :
Case 1: If = (, )
_
+
_
_
=
_
1
0
=
_
(, 1) (, 0)
0
(9.7)
where we used the FTC in the next to last step. The pull-backs in this case just amount to evalu-
ation at = 0 or = 1 as there is no -type term to squash in . The identity follows.
Case 2: Suppose = (, )
. Calculate,
=
. .
!
6
is a monomial whereas + is a binomial in this context
252 CHAPTER 9. DIFFERENTIAL FORMS
Thus, using
, we calculate:
() =
_
_
=
_
1
0
at which point we cannot procede further since is an arbitrary function which can include a
nontrivial time-dependence. We turn to the calculation of (()). Recall we dened
() =
_
1
0
(, )
_
.
We calculate the exterior derivative of ():
(()) =
_
1
0
(, )
_
=
_
_
1
0
(, )
. .
_
+
_
1
0
(, )
_
_
1
0
. (9.8)
Therefore, ()+(()) = 0 and clearly
0
=
1
= 0 in this case since the pull-backs squash
the to zero. The lemma follows. .
Denition 9.6.5.
A subset
1
=
0
=
Therefore, if is a ( + 1)-form on we calculate,
(
1
)
1
[
] =
9.6. POINCARES LEMMA AND CONVERSE 253
whereas,
(
0
)
= 0
0
[
] = 0
Apply the -lemma to the form =
on and we nd:
((
)) +((
)) = .
However, recall that we proved that pull-backs and exterior derivatives commute thus
(
) =
()
and we nd an extremely interesting identity,
(
()) +((
)) = .
Proposition 9.6.6.
If is deformable to a point then a -form on is closed i is exact.
Proof: Suppose is exact then = for some ( 1)-form on hence = () = 0 by
Proposition 9.6.1 hence is closed. Conversely, suppose is closed. Apply the boxed consequence
of the -lemma, note that (
)) =
identify that
is a -form on whereas (
() {
closed}
the space of closed p-forms on . Then,
() {
exact}
the space of exact p-forms where by convention
0
() = {0} The de Rham cohomology
groups are dened by the quotient of closed/exact,
()
()/
().
the (
) =
Betti number of U.
We observe that simply connected regions have all the Betti numbers zero since
() =
()
implies that
() = {0}. Of course there is much more to say about Cohomology, I just wanted
to give you a taste and alert you to the fact that dierential forms can be used to reveal aspects of
topology. Not all algebraic topology uses dierential forms though, there are several other calcula-
tional schemes based on triangulation of the space, or studying singular simplexes. One important
event in 20-th century mathematics was the discovery that all these schemes described the same
homology groups. Steenrod reduced the problem to a few central axioms and it was shown that all
the calculational schemes adhere to that same set of axioms.
One interesting aspect of the proof we (copied from Flanders
7
) is that it is not a mere existence
proof. It actually lays out how to calculate the form which provides exactness. Lets call the
potential form of if = . Notice this is totally reasonable langauge since in the case of
classical mechanics we consider conservative forces
which as derivable from a scalar potential
by
= . Translated into dierential forms we have
=
0
= 0
thus the one-form corresponding to a conservative vector eld is a closed form. Apply the identity:
let :
3
be the deformation of to a point ,
((
)) =
)
For convenience, lets suppose the space considered is the unit-ball and lets use a deformation to
the origin. Explicitly, (, ) = for all
3
such that 1. Note that clearly (0, ) = 0
7
I dont know the complete history of this calculation at the present. It would be nice to nd it since I doubt
Flanders is the originator.
9.6. POINCARES LEMMA AND CONVERSE 255
whereas (1, ) = and has a nice formula so its smooth
8
. We wish to calculate the pull-back
of
)() =
(())
for each smooth vector eld on . Dierential forms on are written as linear combi-
nations of , , , with smooth functions as coecients. We can calculate the coecents by
evalutaion on the corresponding vector elds
. Observe, since (, , , ) = (, , )
we have
(
) =
1
+
2
+
3
wheras,
(
) =
1
+
2
+
3
and similarly,
(
) =
) =
Furthermore,
((
)) =
) = ++
((
)) =
) = ,
((
)) =
) = ,
((
)) =
) =
Therefore,
= ( ++) + + + = ( ++) +
)(, , , ) =
_
_
(, , ) +(, , ) +(, , )
_
_
Therefore,
(, , ) = (
) =
1
0
_
(, , ) +(, , ) +(, , )
_
Notice this is precisely the line-integral of =< , , >along the line with direction < , , >
from the origin to (, , ). In particular, if () =< , , > then
1
0
_
()
_
8
there is of course a deeper meaning to the word, but, for brevity I gloss over this.
256 CHAPTER 9. DIFFERENTIAL FORMS
Perhaps you recall this is precisely how we calculate the potential function for a conservative vector
eld provided we take the origin as the zero for the potential.
Actually, this calculation is quite interesting. Suppose we used a dierent deformation
: . For xed point we travel to from the origin to the point by the path
(, ).
Of course this path need not be a line. The space considered might look like a snake where a line
cannot reach from the base point to the point . But, the same potential is derived. Why?
Path independence of the vector eld is one answer. The criteria = 0 suces for a sim-
ply connected region. However, we see something deeper. The criteria of a closed form paired
with a simply connected (deformable) domain suces to construct a potential for the given form.
This result reproduces the familar case of conservative vector elds derived from scalar potentials
and much more. In Flanders he calculates the potential for a closed two-form. This ought to
be the mathematics underlying the construction of the so-called vector potential of magnetism.
In junior-level electromagnetism
9
the magnetic eld satises = 0 and thus the two-form
such that
. But
this means
0
= = if you worried about it )
Name Degree Typical Element Basis for
(
4
)
function = 0 1
one-form = 1 =
, , ,
two-form = 2 =
1
2
, ,
, ,
three-form = 3 =
1
3!
,
,
four-form = 4
Greek indices are dened to range over 0, 1, 2, 3. Here the top form is degree four since in four
dimensions we can have four dierentials without a repeat. Wedge products work the same as they
have before, just now we have to play with. Hodge duality may oer some surprises though.
Denition 9.8.1. The antisymmetric symbol in at
4
is denoted
0123
= 1
plus the demand that it be completely antisymmetric.
We must not assume that this symbol is invariant under a cyclic exhange of indices. Consider,
0123
=
1023
ipped (01)
= +
1203
ipped (02)
=
1230
ipped (03).
(9.9)
Example 9.8.2. We now compute the Hodge dual of = with respect to the Minkowski metric
=
1
.
We raise the index using , as follows
=
1
=
1
.
9.8. E & M IN DIFFERENTIAL FORM 259
Beginning with the denition of the Hodge dual we calculate
() =
1
(41)!
= (1/6)
1
= (1/6)[
1023
+
1230
+
1302
+
1320
+
1203
+
1032
]
= (1/6)[
+ + + ]
= .
(9.10)
The dierence between the three and four dimensional Hodge dual arises from two sources, for
one we are using the Minkowski metric so indices up or down makes a dierence, and second the
antisymmetric symbol has more possibilities than before because the Greek indices take four values.
Example 9.8.3. We nd the Hodge dual of = with respect to the Minkowski metric
.
Notice that has components
=
0
. Raising the
index using as usual, we have
=
0
=
0
where the minus sign is due to the Minkowski metric. Starting with the denition of Hodge duality
we calculate
() = (1/6)
0
= (1/6)
0
= (1/6)
0
= (1/6)
= .
(9.11)
for the case here we are able to use some of our old three dimensional ideas. The Hodge dual of
cannot have a in it which means our answer will only have , , in it and that is why we
were able to shortcut some of the work, (compared to the previous example).
Example 9.8.4. Finally, we nd the Hodge dual of = with respect to the Minkowski metric
. Recall that
() =
1
(42)!
01
01
(
) and that
01
=
0
= (1)(1)
01
= 1.
Thus
( ) = (1/2)
01
= (1/2)[
0123
+
0132
]
= .
(9.12)
Notice also that since = we nd ( ) =
260 CHAPTER 9. DIFFERENTIAL FORMS
1 =
( ) = 1
( ) =
=
( ) =
=
( ) =
=
( ) =
=
( ) =
( ) =
( ) =
( ) =
( ) =
( ) =
The other Hodge duals of the basic two-forms follow from similar calculations. Here is a table of
all the basic Hodge dualities in Minkowski space, In the table the terms are grouped as they are to
emphasize the isomorphisms between the one-dimensional
0
() and
4
(), the four-dimensional
1
() and
3
(), the six-dimensional
2
() and itself. Notice that the dimension of () is
16 which just happens to be 2
4
.
Now that weve established how the Hodge dual works on the dierentials we can easily take the
Hodge dual of arbitrary dierential forms on Minkowski space. We begin with the example of the
4-current
Example 9.8.5. Four Current: often in relativistic physics we would even just call the four
current simply the current, however it actually includes the charge density and current density
. Consequently, we dene,
(
) (,
),
moreover if we lower the index we obtain,
(
) = (,
)
which are the components of the current one-form,
=
= +
This equation could be taken as the denition of the current as it is equivalent to the vector deni-
tion. Now we can rewrite the last equation using the vectors forms mapping as,
= +
.
Consider the Hodge dual of ,
=
( +
)
=
.
(9.13)
we will nd it useful to appeal to this calculation in a later section.
9.8. E & M IN DIFFERENTIAL FORM 261
Example 9.8.6. Four Potential: often in relativistic physics we would call the four potential
simply the potential, however it actually includes the scalar potential and the vector potential
) (,
)
we can lower the index to obtain,
(
) = (,
)
which are the components of the current one-form,
=
= +
Sometimes this equation is taken as the denition of the four potential. We can rewrite the four
potential vector eld using the vectors forms mapping as,
= +
.
The Hodge dual of is
. (9.14)
Several steps were omitted because they are identical to the calculation of the dual of the 4-current
above.
Denition 9.8.7. Faraday tensor.
Given an electric eld
= (
1
,
2
,
3
) and a magnetic eld
= (
1
,
2
,
3
) we dene a
2-form by
=
.
This 2-form is often called the electromagnetic eld tensor or the Faraday tensor. If
we write it in tensor components as =
1
2
)
of components then it is easy to see that
(
) =
0
1
2
3
1
0
3
2
2
3
0
1
3
2
1
0
(9.15)
Convention: Notice that when we write the matrix version of the tensor components we take the
rst index to be the row index and the second index to be the column index, that means
01
=
1
whereas
10
=
1
.
Example 9.8.8. In this example we demonstrate various conventions which show how one can
transform the eld tensor to other type tensors. Dene a type (1, 1) tensor by raising the rst index
by the inverse metric
as follows,
) = (
0
) = (0,
1
,
2
,
3
)
Then row one is unchanged since
1
=
1
,
(
1
) = (
1
) = (
1
, 0,
3
,
2
)
and likewise for rows two and three. In summary the (1,1) tensor
) has the
components below
(
) =
0
1
2
3
1
0
3
2
2
3
0
1
3
2
1
0
. (9.16)
At this point we raise the other index to create a (2, 0) tensor,
(9.17)
and we see that it takes one copy of the inverse metric to raise each index and
so
we can pick up where we left o in the (1, 1) case. We could proceed case by case like we did with
the (1, 1) case but it is better to use matrix multiplication. Notice that
is just
the (, ) component of the following matrix product,
(
) =
0
1
2
3
1
0
3
2
2
3
0
1
3
2
1
0
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
0
1
2
3
1
0
3
2
2
3
0
1
3
2
1
0
. (9.18)
So we nd a (2, 0) tensor
and
, it is in fact typically clear from the context which version of one is thinking about.
Pragmatically physicists just write the components so its not even an issue for them.
Example 9.8.9. Field tensors dual: We now calculate the Hodge dual of the eld tensor,
=
(
)
=
( ) +
( ) +
( )
+
( ) +
( ) +
( )
=
.
we can also write the components of
in matrix form:
(
) =
0
1
2
3
1
0
3
2
2
3
0
1
3
2
1
0.
(9.19)
Notice that the net-eect of Hodge duality on the eld tensor was to make the exchanges
and
.
9.8. E & M IN DIFFERENTIAL FORM 263
9.8.2 exterior derivatives of charge forms, eld tensors, and their duals
In the last chapter we found that the single operation of the exterior dierentiation reproduces the
gradiant, curl and divergence of vector calculus provided we make the appropriate identications
under the work and ux form mappings. We now move on to some four dimensional examples.
Example 9.8.10. Charge conservation: Consider the 4-current we introduced in example 9.8.5.
Take the exterior derivative of the dual of the current to get,
(
) = (
)
= (
) [(
) ]
=
= (
+
) .
We work through the same calculation using index techniques,
(
) = (
)
= () [
1
2
)
= (
)
1
2
= (
)
1
2
= (
)
1
2
= (
)
1
2
2
= (
+
) .
Observe that we can now phrase charge conservation by the following equation
(
) = 0
+
= 0.
In the classical scheme of things this was a derived consequence of the equations of electromagnetism,
however it is possible to build the theory regarding this equation as fundamental. Rindler describes
that formal approach in a late chapter of Introduction to Special Relativity.
Proposition 9.8.11.
If (
) = (,
) is the vector potential (which gives the magnetic eld) and = +
, then =
.
264 CHAPTER 9. DIFFERENTIAL FORMS
Proof: The proof uses the denitions
=
and
=
and some vector identities:
= ( +
)
= +(
)
= + (
+ (
=
( )
= (
( )
) +
=
(
)
+
= =
1
2
.
Moreover we also have:
= (
=
1
2
(
+
1
2
(
=
1
2
(
.
Comparing the two identities we see that
) +(
)
=
) + (
) +
1
2
)(
).
(9.20)
W pause here to explain our logic. In the above we dropped the
+
1
2
)]
+ (
)
= [
+
12
(
)]
+[
+
31
(
)]
+[
+
23
(
)]
+(
)
= (
+
+ (
)
=
+ (
)
(9.21)
9.8. E & M IN DIFFERENTIAL FORM 265
where we used the fact that is an isomorphism of vector spaces (at a point) and
1
= ,
2
= , and
3
= . Behold, we can state two of Maxwells equations as
= 0
+
= 0,
= 0 (9.22)
Example 9.8.13. We now compute the exterior derivative of the dual to the eld tensor:
= (
) +(
)
=
+ (
)
(9.23)
This follows directly from the last example by replacing
and
. We obtain the two
inhomogeneous Maxwells equations by setting
,
= (9.24)
Here we have used example 9.8.5 to nd the RHS of the Maxwell equations.
We now know how to write Maxwells equations via dierential forms. The stage is set to prove that
Maxwells equations are Lorentz covariant, that is they have the same form in all inertial frames.
9.8.3 coderivatives and comparing to Griths relativitic E & M
Optional section, for those who wish to compare our tensorial E & M with that of
Griths, you may skip ahead to the next section if not interested
I should mention that this is not the only way to phrase Maxwells equations in terms of
dierential forms. If you try to see how what we have done here compares with the equations
presented in Griths text it is not immediately obvious. He works with
and
and
none
of which are the components of dierential forms. Nevertheless he recovers Maxwells equations
as
and
) in Griths text,
(
( = 1)) =
0
1
2
3
1
0
3
2
2
3
0
1
3
2
1
0
= (
). (9.25)
we nd that we obtain the negative of Griths dual tensor ( recall that raising the indices has
the net-eect of multiplying the zeroth row and column by 1). The equation
does not
follow directly from an exterior derivative, rather it is the component form of a coderivative. The
coderivative is dened =
= =
=
where I leave the sign for you to gure out. Then the other equation
= 0
can be understood as the component form of
= 0
so even though it looks like Griths is using the dual eld tensor for the homogeneous Maxwells
equations and the eld tensor for the inhomogeneous Maxwells equations it is in fact not the case.
The key point is that there are coderivatives implicit within Griths equations, so you have to
read between the lines a little to see how it matched up with what weve done here. I have not en-
tirely proved it here, to be complete we should look at the component form of = and explicitly
show that this gives us
,
then the eld tensor =
then we observe,
(1.)
= (
1
)
(2.)
= (
1
)
(9.26)
where (2.) is simply the chain rule of multivariate calculus and (1.) is not at all obvious. We will
assume that (1.) holds, that is we assume that the 4-potential transforms in the appropriate way
for a one-form. In principle one could prove that from more base assumptions. After all electro-
magnetism is the study of the interaction of charged objects, we should hope that the potentials
9.8. E & M IN DIFFERENTIAL FORM 267
are derivable from the source charge distribution. Indeed, there exist formulas to calculate the
potentials for moving distributions of charge. We could take those as denitions for the potentials,
then it would be possible to actually calculate if (1.) is true. Wed just change coordinates via a
Lorentz transformation and verify (1.). For the sake of brevity we will just assume that (1.) holds.
We should mention that alternatively one can show the electric and magnetic elds transform as to
make
a tensor. Those derivations assume that charge is an invariant quantity and just apply
Lorentz transformations to special physical situations to deduce the eld transformation rules. See
Griths chapter on special relativity or look in Resnick for example.
Let us nd how the eld tensor transforms assuming that (1.) and (2.) hold, again we consider
= (
1
)
((
1
)
) (
1
)
((
1
)
)
= (
1
)
(
1
)
)
= (
1
)
(
1
)
.
(9.27)
therefore the eld tensor really is a tensor over Minkowski space.
Proposition 9.8.14.
The dual to the eld tensor is a tensor over Minkowski space. For a given Lorentz trans-
formation
it follows that
= (
1
)
(
1
)
Proof: homework (just kidding in 2010), it follows quickly from the denition and the fact we
already know that the eld tensor is a tensor.
Proposition 9.8.15.
The four-current is a four-vector. That is under the Lorentz transformation
we
can show,
= (
1
)
Proof: follows from arguments involving the invariance of charge, time dilation and length con-
traction. See Griths for details, sorry we have no time.
Corollary 9.8.16.
The dual to the four current transforms as a 3-form. That is under the Lorentz transfor-
mation
we can show,
= (
1
)
(
1
)
(
1
)
are coordinate invariant expressions which we have already proved give Maxwells equations in one
frame of reference, thus they must give Maxwells equations in all frames of reference.
The essential point is simply that
=
1
2
=
1
2
Again, we have no hope for the equation above to be true unless we know that
= (
1
)
(
1
)
= (
1
)
we will nd it convenient to make our convention for this section that , , ... = 0, 1, 2, 3, 4 whereas
, , ... = 1, 2, 3, 4 so we can rewrite the potential one-form as,
= +
= (,
) =
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
= (
) (9.28)
we could study the linear isometries of this metric, they would form the group (1, 4). Now we
form the eld tensor by taking the exterior derivative of the one-form potential,
= =
1
2
(
Now we would like to find the electric and magnetic fields in 4 dimensions. Perhaps we should say 4+1 dimensions; just understand that I take there to be 4 spatial directions throughout this discussion if in doubt. Note that we are faced with a dilemma of interpretation. There are 10 independent components of a 5 by 5 antisymmetric tensor; naively we would expect that the electric and magnetic fields each would have 4 components, but that is not possible: we'd be missing two components. The solution is this: the time components of the field tensor are understood to correspond to the electric part of the fields whereas the remaining 6 components are said to be magnetic. This aligns with what we found in 3 dimensions; it's just that in 3 dimensions we had the fortunate quirk that the numbers of linearly independent one- and two-forms were equal at any point. This definition means that the magnetic field will in general not be a vector field but rather a flux encoded by a 2-form.
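The counting argument can be made explicit (a trivial check I am adding): an antisymmetric 5 by 5 tensor has $\binom{5}{2} = 10$ independent components, and the split is 4 electric plus $\binom{4}{2} = 6$ magnetic.

from math import comb

print(comb(5, 2))        # 10 independent components of a 5x5 antisymmetric tensor
print(4 + comb(4, 2))    # 4 electric (F_{0i}) plus 6 magnetic (F_{ij}) = 10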
In matrix form,
$$(F_{\mu\nu}) = \begin{pmatrix}
0 & -E_1 & -E_2 & -E_3 & -E_4 \\
E_1 & 0 & B_{12} & B_{13} & B_{14} \\
E_2 & -B_{12} & 0 & B_{23} & B_{24} \\
E_3 & -B_{13} & -B_{23} & 0 & B_{34} \\
E_4 & -B_{14} & -B_{24} & -B_{34} & 0
\end{pmatrix} \qquad (9.29)$$
Now we can write this compactly via the following equation,
$$F = E \wedge dx^0 + B$$
where $E = E_i\, dx^i$ is the electric field one-form and $B = \frac{1}{2} B_{ij}\, dx^i \wedge dx^j$ is the magnetic field two-form.
I admit there are subtle points about how exactly we should interpret the magnetic field; however, I'm going to leave that to your imagination and instead focus on the electric sector. What is the generalized Maxwell's equation that $F$ must satisfy?
$$\delta F = \delta\big(E \wedge dx^0 + B\big) = \frac{1}{\epsilon_o}\,\mathcal{J} \qquad \text{where} \qquad \mathcal{J} = -\rho\, dx^0 + J_i\, dx^i$$
The charge density appears in the $dx^0$ component of $\mathcal{J}$, and
the corresponding term in $\delta F$ is $\delta(E \wedge dx^0)$; thus, equating $dx^0$ components in $\delta F = \frac{1}{\epsilon_o}\mathcal{J}$, we find that
$$\partial_1 E_1 + \partial_2 E_2 + \partial_3 E_3 + \partial_4 E_4 = \frac{1}{\epsilon_o}\,\rho \qquad (9.30)$$
is the 4-dimensional Gauss's equation. Now consider the case where we have an isolated point charge which has somehow always existed at the origin. Moreover, consider a 3-sphere that surrounds the charge. We wish to determine the generalized Coulomb field due to the point charge. First we note that the solid 3-sphere is a 4-dimensional object; it is the set of all $(x_1, x_2, x_3, x_4) \in \mathbb{R}^4$ such that
$$x_1^2 + x_2^2 + x_3^2 + x_4^2 \le r^2.$$
We may parametrize a three-sphere of radius $r$ via generalized spherical coordinates,
$$\begin{aligned}
x_1 &= r\,\sin(\chi)\sin(\phi)\cos(\theta) \\
x_2 &= r\,\sin(\chi)\sin(\phi)\sin(\theta) \\
x_3 &= r\,\sin(\chi)\cos(\phi) \\
x_4 &= r\,\cos(\chi)
\end{aligned} \qquad (9.31)$$
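As a sanity check on (9.31) (my addition; the angle names $\chi, \phi, \theta$ are my choice of labels), the parametrization does land on the sphere of radius $r$:

import sympy as sp

r, chi, phi, theta = sp.symbols('r chi phi theta', positive=True)
x1 = r * sp.sin(chi) * sp.sin(phi) * sp.cos(theta)
x2 = r * sp.sin(chi) * sp.sin(phi) * sp.sin(theta)
x3 = r * sp.sin(chi) * sp.cos(phi)
x4 = r * sp.cos(chi)
print(sp.simplify(x1**2 + x2**2 + x3**2 + x4**2))   # r**2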
Now it can be shown that the volume and surface area of the radius $r$ three-sphere are as follows,
$$\text{vol}(S^3) = \frac{\pi^2}{2}\, r^4 \qquad\qquad \text{area}(S^3) = 2\pi^2 r^3$$
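These two formulas are consistent with one another: integrating the surface volumes of concentric three-spheres recovers the 4-volume (a quick check I am adding).

import sympy as sp

s, r = sp.symbols('s r', positive=True)
area = 2 * sp.pi**2 * s**3            # surface volume of the three-sphere of radius s
vol = sp.integrate(area, (s, 0, r))   # add up spherical shells from 0 to r
print(vol)                            # pi**2*r**4/2, as claimed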
We may write the charge density of a smeared-out point charge as,
$$\rho = \begin{cases} \dfrac{2Q}{\pi^2 a^4}, & 0 \le r \le a \\[4pt] 0, & r > a. \end{cases} \qquad (9.32)$$
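Indeed, the density (9.32) is just $Q$ divided by the 4-volume $\frac{\pi^2}{2}a^4$ of the solid three-sphere, so the total charge comes out to $Q$; a short SymPy check (my addition):

import sympy as sp

s, a, Q = sp.symbols('s a Q', positive=True)
rho = 2 * Q / (sp.pi**2 * a**4)                                # density inside radius a
enclosed = sp.integrate(rho * 2 * sp.pi**2 * s**3, (s, 0, a))  # integrate over shells
print(sp.simplify(enclosed))                                   # Q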
Notice that integrating $\rho$ over any four-dimensional region which contains the solid three-sphere of radius $a$ will give the enclosed charge to be $Q$. Then integrate over the Gaussian three-sphere of radius $r > a$; call it $\partial B_r$, the boundary of the solid ball $B_r$. Gauss's equation (9.30) gives
$$\int_{B_r} \big(\partial_1 E_1 + \partial_2 E_2 + \partial_3 E_3 + \partial_4 E_4\big)\, dV = \frac{1}{\epsilon_o}\int_{B_r} \rho\; dV = \frac{Q}{\epsilon_o}$$
now use the Generalized Stokes Theorem to deduce,
$$\int_{\partial B_r} \vec{E}\cdot d\vec{A} = \frac{Q}{\epsilon_o}$$
but by the spherical symmetry of the problem we find that $\vec{E}$ must be independent of the direction it points; this means that it can only have a radial component, $\vec{E} = E\,\hat{r}$. Thus we may calculate the integral with respect to generalized spherical coordinates and we will find that it is the product of $E$ and the surface volume of the boundary of the four-dimensional solid three-sphere. That is,
$$\int_{\partial B_r} \vec{E}\cdot d\vec{A} = 2\pi^2 r^3 E = \frac{Q}{\epsilon_o} \qquad \Longrightarrow \qquad \vec{E} = \frac{Q}{2\pi^2 \epsilon_o\, r^3}\,\hat{r}.$$
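As a last check (my addition, with the conventions assumed above): away from the charge the field $\vec{E} = \frac{Q}{2\pi^2\epsilon_o r^3}\,\hat{r}$ is divergence-free in 4 spatial dimensions, consistent with Gauss's equation (9.30) where $\rho = 0$.

import sympy as sp

Q, eps_o = sp.symbols('Q epsilon_o', positive=True)
xs = sp.symbols('x1:5', real=True)
r = sp.sqrt(sum(x**2 for x in xs))
E = [Q * x / (2 * sp.pi**2 * eps_o * r**4) for x in xs]   # E_i = Q x_i/(2 pi^2 eps_o r^4)
div = sum(sp.diff(E[i], xs[i]) for i in range(4))
print(sp.simplify(div))   # 0 for r > 0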
The Coulomb field is weaker when it propagates in 4 spatial dimensions: it falls off like $1/r^3$ rather than the familiar $1/r^2$. Qualitatively, what has happened is that we have taken the same net flux and spread it out over an additional dimension; this means it thins out quicker. A very similar idea is used in some brane-world scenarios. String theorists posit that the gravitational field spreads out in more than four dimensions while in contrast the standard model fields of electromagnetism and the strong and weak forces are confined to a four-dimensional brane. That sort of model attempts an explanation as to why gravity is so weak in comparison to the other forces. Also, it gives large-scale corrections to gravity that some hope will match observations which at present don't seem to fit the standard gravitational models.
This example is but a taste of the theoretical discussion that differential forms allow. As a final comment, I remind the reader that we have done things for flat space for the most part in this course; when considering a curved space there are a few extra considerations that must enter.
Coordinate vector fields
or