Introduction To Vector and Tensor Analysis
Jesper Ferkinghoff-Borg
September 6, 2007
Contents
1 Physical space
1.1 Coordinate systems
1.2 Distances
1.3 Symmetries

3 Tensors
3.1 Definition
3.2 Outer product
3.3 Basic tensor algebra
3.3.1 Transposed tensors
3.3.2 Contraction
3.3.3 Special tensors
3.4 Tensor components in orthonormal bases
3.4.1 Matrix algebra
3.4.2 Two-point components
3.5 Tensor fields
3.5.1 Gradient, divergence and curl
3.5.2 Integral theorems

4 Tensor calculus
4.1 Tensor notation and Einstein's summation rule
4.2 Orthonormal basis transformation
4.2.1 Cartesian coordinate transformation
4.2.2 The orthogonal group
4.2.3 Algebraic invariance
4.2.4 Active and passive transformation
4.2.5 Summary on scalars, vectors and tensors
4.3 Tensors of any rank
4.3.1 Introduction
4.3.2 Definition
4.3.3 Basic algebra
4.3.4 Differentiation
4.4 Reflection and pseudotensors
4.4.1 Levi-Civita symbol
4.4.2 Manipulation of the Levi-Civita symbol
4.5 Tensor fields of any rank
Chapter 1
Physical space
1.2 Distances
We will take it as given that one can ascribe a unique value (i.e. independent of the chosen coordinate system), D(O, P), for the distance between two arbitrarily chosen points O and P. Also, we assume (or we may take it as an observational fact) that when D becomes sufficiently small¹, a coordinate system exists in which D can be calculated according to Pythagoras' law:

D(O, P) = sqrt( Σ_{i=1}^{d} (x_i(P) − x_i(O))² ),   Cartesian coordinate system,   (1.3)
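A minimal numerical sketch in Python (numpy assumed; the two points are chosen arbitrarily for illustration) of Eq. (1.3):

    import numpy as np

    # Two points O and P given by their coordinates in a Cartesian system (d = 3).
    x_O = np.array([0.0, 0.0, 0.0])
    x_P = np.array([1.0, 2.0, 2.0])

    # Pythagoras' law, Eq. (1.3): D(O,P) = sqrt(sum_i (x_i(P) - x_i(O))^2)
    D = np.sqrt(np.sum((x_P - x_O)**2))
    print(D)  # 3.0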
1.3 Symmetries

For classical mechanical phenomena it is found that a frame of reference can always be chosen in which space is homogeneous and isotropic and time is homogeneous. This implies that nature has no sense of an absolute origin in time and space, and no sense of an absolute direction. Mechanical or geometrical laws should therefore be invariant to coordinate transformations involving rotations and translations. Motivated by this fact we are inclined to develop a mathematical formalism for operating with geometrical objects without referring to a coordinate system or, equivalently, a formalism which will take the same form (i.e. is covariant) for all coordinates. This fundamental principle is called the principle of covariance. The mathematics of scalar, vector and tensor algebra is precisely such a formalism.
¹ Small would mean that the lengths of line segments are much smaller than the radius of the Earth.
Chapter 2
Vectors
2.1 Definitions
A vector is a quantity having both magnitude and a direction in space, such as
displacement, velocity, force and acceleration.
Graphically a vector is represented by an arrow OP from a point O to a point P, defining the direction, with the magnitude of the vector being indicated by the length of the arrow. Here, O is called the initial point and P is called the terminal point. Analytically, the vector is represented by either →OP or OP (bold face), and the magnitude by |→OP| or |OP|. We shall use the bold face notation in these notes. In this chapter we will assume that all points P belong to a Euclidean space, P ∈ Ω(O), meaning that lengths of line segments can be calculated according to Pythagoras.
A scalar is a quantity having magnitude but no direction, e.g. mass, length,
time, temperature and any real number.
We indicate scalars by letters of ordinary type. For example, a vector a will have length a = |a|. Operations with scalars follow the same rules as elementary algebra; multiplication, addition and subtraction (provided the scalars have the same units) follow the usual algebraic rules.
1. Two vectors a and b are equal if they have the same magnitude and
direction regardless of the position of their initial point.
3. The sum or resultant of vectors a and b is a vector c formed by placing the initial point of b on the terminal point of a and then joining the initial point of a to the terminal point of b. The sum is written c = a + b.
4. The difference between two vectors, a and b, represented by a − b is
defined as the sum a + (−b).
5. The product of a vector a with a scalar m is a vector ma with magnitude |m| times the magnitude of a, i.e. |ma| = |m||a|, and with direction the same as or opposite to that of a, according as m is positive or negative.
We stress that these definitions of vector addition, subtraction and scalar multiplication are geometric, i.e. they make no reference to coordinates. It then becomes a pure geometric exercise to prove the following laws:

Here, the last formula should just be read as a definition of a vector times a scalar. Note that in all cases only multiplication of a vector by one or more scalars is defined. One can define different types of bilinear vector products. The three basic types are called scalar product (or inner product), cross product and outer product (or tensor product). We shall define each in turn. The definition of the outer product is postponed to chapter 3.
The following definition of the scalar product will become useful:

a · b = ab cos(θ),

where a = |a|, b = |b| and θ is the angle between the two vectors. Note that a · b is a scalar. The following rules apply:
The last three formulas make the scalar product bilinear in the two arguments. Note also,
1. a · a = |a|²
2. If a · b = 0 and a and b are not null vectors, then a and b are perpendicular.   (2.4)
3. The projection of a vector a on b is equal to a · e_b, where e_b = b/|b| is the unit vector in the direction of b.
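A small numerical sketch in Python (numpy assumed; the vectors are arbitrary illustrations, not from the notes) of the three properties above:

    import numpy as np

    a = np.array([3.0, 4.0, 0.0])
    b = np.array([0.0, 2.0, 0.0])
    c = np.array([0.0, 0.0, 5.0])

    print(np.dot(a, a), np.linalg.norm(a)**2)  # 25.0 25.0 :  a.a = |a|^2
    print(np.dot(a, c))                        # 0.0  :  a and c are perpendicular
    e_b = b / np.linalg.norm(b)                # unit vector in the direction of b
    print(np.dot(a, e_b))                      # 4.0  :  projection of a on b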
a × b = ab sin(θ) u,   0 ≤ θ ≤ π,   (2.5)

where θ is the angle between a and b, and u is a unit vector in the direction perpendicular to the plane of a and b such that a, b and u form a right-handed system¹.
The following laws are valid:
By assumption, we can choose this coordinate system to be rectangular (R), C_R, cf. chapter 1. By selecting four points O, P_1, P_2 and P_3 having the coordinates

x_{C_R}(O) = (0, 0, 0),  x_{C_R}(P_1) = (1, 0, 0),  x_{C_R}(P_2) = (0, 1, 0),  x_{C_R}(P_3) = (0, 0, 1)   (2.7)

we can construct three vectors
and are linearly independent. A set of vectors will be a basis if and only if the triple product g_1 · (g_2 × g_3) ≠ 0.
⁴ The simple procedure of Eq. (2.7) and (2.8) for obtaining the basis vectors is not generally
In the following we shall retain the notation E = {e_1, e_2, e_3} for the particular case of an orthonormal basis⁵. If e_i is non-constant we explicitly write e_i(x). On the other hand, if E is constant, the coordinate system is rectangular and we use the letter R for the basis.
The notation for the set of components of a vector a, a = [a]_G, also means that we shall refer to the i'th component as a_i = [a]_{G,i}. Furthermore, when it is obvious from the context (or irrelevant) which basis is implied, we will omit the basis subscript and use the notation a = [a] and a_i = [a]_i. For the collection of components, a_i, into a triplet, ordinary brackets are commonly used, so
In order not to confuse this notation with the vector obtained from a set of components,

(a)_G =def. Σ_i a_i g_i,

we shall always retain the basis subscript in the latter expression. Trivially, we have
i.e. a Cartesian basis, the notation {i, j, k} is also often used. However, we shall retain the other notation for algebraic convenience.
⁶ Expressing vector identities in terms of their components is also referred to as tensor notation.
where δ_ij is the Kronecker delta,

δ_ij = 1 if i = j,  0 if i ≠ j.   (2.11)
With the risk of being pedantic, let us stress the difference between a vector and its components. For two different bases G ≠ G′, a vector a will be represented by the components a ∈ R³ and a′ ∈ R³ respectively, so that

Though representing the same vector, the two sets of components a and a′ will differ because they refer to different bases. The distinction between a vector and its components becomes irrelevant in the case where only one basis is involved.
we have

or

[ma]_i = m[a]_i.   (2.14)
The bilinear scalar and cross products are not as easy to operate with in terms of the components in an arbitrary basis G as in the case of vector addition, subtraction and scalar multiplication. For instance,

a · b = (a_1, a_2, a_3)_G · (b_1, b_2, b_3)_G
     = (Σ_i a_i g_i) · (Σ_j b_j g_j)
     = Σ_ij a_i b_j g_i · g_j                      (2.15)
     = Σ_ij a_i b_j g_ij,    g_ij = g_i · g_j

The quantities g_ij are called the metric coefficients. If G is spatially dependent, then g_ij = g_ij(x) will be functions of the coordinates. In fact, the functional form of the metric coefficients fully specifies the type of geometry involved (Euclidean or not), and they are the starting point for extending vector calculus to arbitrary geometries. Here it suffices to say that a Cartesian coordinate system uniquely implies that the metric coefficients are constant and satisfy g_ij = δ_ij.
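As a sketch of how the metric coefficients enter in practice (Python with numpy assumed; the skew basis G is an arbitrary choice for illustration), Eq. (2.15) can be checked numerically:

    import numpy as np

    # Rows are the basis vectors g_1, g_2, g_3 expressed in a Cartesian frame.
    G = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0],    # not orthogonal to g_1
                  [0.0, 0.0, 2.0]])
    g = G @ G.T                       # metric coefficients g_ij = g_i . g_j

    a_comp = np.array([1.0, 2.0, 3.0])  # components of a in the basis G
    b_comp = np.array([4.0, 5.0, 6.0])

    dot_via_metric = a_comp @ g @ b_comp             # sum_ij a_i b_j g_ij
    dot_direct = np.dot(G.T @ a_comp, G.T @ b_comp)  # assemble vectors, then dot
    print(dot_via_metric, dot_direct)                # identical values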
In other words, we recover the law of Pythagoras, cf. Eq. (1.3), which is reassuring since a constant orthonormal basis is equivalent to the choice of a rectangular coordinate system.
The symbol ε_ijk with the three indices is the Levi-Civita symbol. It consists of 3 × 3 × 3 = 27 real numbers given by

ε_123 = ε_231 = ε_312 = +1
ε_132 = ε_213 = ε_321 = −1
ε_ijk = 0 otherwise.

In words, ε_ijk is antisymmetric in all indices. From this it follows that the orthonormal basis vectors satisfy

e_i × e_j = Σ_k ε_ijk e_k,   (2.23)
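A sketch in Python (numpy assumed) that builds the Levi-Civita symbol and recovers the cross product from its components, consistent with Eq. (2.23):

    import numpy as np

    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0    # even permutations of (1,2,3)
        eps[j, i, k] = -1.0   # swapping two indices gives a minus sign

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 5.0, 6.0])

    # [a x b]_k = sum_ij eps_ijk a_i b_j
    cross = np.einsum('ijk,i,j->k', eps, a, b)
    print(cross, np.cross(a, b))   # both give [-3.  6. -3.]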
in modern vector analysis to define the cross product through the determinant, Eq. (2.20), or equivalently through the Levi-Civita symbol, Eq. (2.21). In replacing the geometric definition, Eq. (2.5), which presumes an absolute notion of handedness, with an algebraic one, the direction of the cross product will simply be determined by whatever convention of handedness has been adopted with the orthonormal coordinate system in the first place.
Physically, the arbitrariness in the choice of handedness implies that the orientation of the cross product of two ordinary vectors is not an objective physical quantity, in contrast to vectors representing displacements, velocities, forces etc. One therefore distinguishes between proper or polar vectors, whose direction is independent of the choice of handedness, and axial or pseudovectors, whose direction depends on the choice of handedness. The distinction between the two types of vectors becomes important when one considers transformations between left- and right-handed coordinate systems.
Dimensionality
For all vector operations, the only explicit reference to the dimension of the physical space appears in the cross product. This can be seen from the definition of the ε-symbol, which explicitly has three indices running from 1 to 3. All other vector operations are valid in arbitrary dimensions. The generalization of the cross product to any dimension d is obtained by constructing a Levi-Civita symbol having d indices, each running from 1 to d, and being totally antisymmetric.
For instance, for d = 2:

ε_12 = 1,   ε_21 = −1,   ε_11 = ε_22 = 0,   (2.25)

or in doublet notation
2.5.1 Derivatives

The definition of ordinary derivatives of functions can be directly extended to vector functions of scalars:

dc(u)/du = lim_{Δu→0} [c(u + Δu) − c(u)] / Δu.   (2.26)

Note that the definition involves subtracting two vectors, Δc = c(u + Δu) − c(u), which is a vector. Therefore, dc/du will also be a vector, provided the limit, Eq. (2.26), exists. Eq. (2.26) is defined independently of coordinate systems. In an orthonormal basis independent of u:

(d/du) a = Σ_i (da_i/du) e_i,   (2.27)
2.5.2 Integrals

The definition of the ordinary integral of a vector depending on a single scalar variable, c(u), also follows that of ordinary calculus. With respect to a basis E independent of u, the indefinite integral of c(u) is defined by
∫ c(u) du = Σ_i ( ∫ c_i(u) du ) e_i,
where ∫ c_i(u) du is the indefinite integral of an ordinary scalar function. If s(u) is a vector satisfying c(u) = (d/du) s(u), then

∫ c(u) du = ∫ (d/du)(s(u)) du = s(u) + k
where k is an arbitrary constant vector independent of u. The definite integral between two limits u = u_0 and u = u_1 can in that case be written

∫_{u0}^{u1} c(u) du = ∫_{u0}^{u1} (d/du)(s(u)) du = [s(u) + k]_{u0}^{u1} = s(u_1) − s(u_0).
This integral can also be defined as a limit of a sum in a manner analogous to
that of elementary integral calculus.
2.6 Fields
2.6.1 Definition
In continuous systems the basic physical variables are distributed over space.
A function of space is known as a field. Let an arbitrary coordinate system be
given.
Scalar field. If to each position x = (x_1, x_2, x_3) of a region in space there corresponds a number or scalar φ(x_1, x_2, x_3), then φ is called a scalar function of position, or scalar field. Physical examples of scalar fields are the mass or charge density distribution of an object, or the temperature or pressure distribution at a given time in a fluid.
Vector field. If to each position x = (x_1, x_2, x_3) of a region in space there corresponds a vector a(x_1, x_2, x_3), then a is called a vector function of position, or a vector field. Physical examples of vector fields are the gravitational field around the earth, the velocity field of a moving fluid, and the electromagnetic field of charged particle systems.
In the following we shall assume the choice of a rectangular coordinate system C = (O, R). Introducing the position vector r(x) = Σ_i x_i e_i, the expressions φ(r) and a(r) are taken to have the same meaning as φ(x) and a(x), respectively.
Equation (2.32) expresses how the vector field changes in the spatial direction of e_i. By the same arguments as in the previous section, ∂_i a will for each i also be a vector field. Note that in general, ∂_j a ≠ ∂_i a when j ≠ i.
Rules for partial differentiation of vectors are similar to those used in elementary calculus for scalar functions. Thus, if a and b are functions of x, then

∂_i(a · b) = a · (∂_i b) + (∂_i a) · b
∂_i(a × b) = (∂_i a) × b + a × (∂_i b)
∂²_ji(a · b) = ∂_j(∂_i(a · b)) = ∂_j((∂_i a) · b + a · (∂_i b))
            = a · (∂²_ji b) + (∂_j a) · (∂_i b) + (∂_i a) · (∂_j b) + (∂²_ji a) · b
∂_i r = e_i.   (2.33)
then one must remember to include the spatial derivatives of the basis vectors as well:

(∂a/∂x′_i)(x′) = Σ_j [ (∂a′_j/∂x′_i)(x′) g′_j(x′) + a′_j(x′) (∂g′_j/∂x′_i)(x′) ]   (2.35)
2.6.3 Differentials

Since partial derivatives of vector fields follow those used in elementary calculus for scalar functions, the same will be true for vector differentials. For example, for C = (O, R),

if a(x) = Σ_i a_i(x) e_i,   then da = Σ_i da_i(x) e_i   (2.36)

or simply

da = Σ_i ∂_i a dx_i.   (2.39)
The position vector r = Σ_i x_i e_i is a special vector field for which Eq. (2.38) implies

dr = Σ_i dx_i e_i.   (2.40)
Therefore, the arc length between two points on the curve r(u) given by u = u1
and u = u2 is
s = ∫_{u1}^{u2} sqrt( (dr/du) · (dr/du) ) du.
In general, we have (irrespective of the choice of basis)
d(a · b) = da · b + a · db
d(a × b) = da × b + a × db
Nabla operator ∇
The vector differential operator del or nabla written as ∇ is defined by
∇(·) = Σ_i e_i ∂_i (·)   (2.41)
where · represents a scalar or –as we shall see later– a vector field. Notice that
the i’th component of the operator is given by
∇_i = e_i · ∇ = ∂_i.
The gradient of a scalar field has some interesting geometrical properties. Con-
sider the change of φ in some particular direction. For an infinitesimal vector
displacement, dr, forming its scalar product with ∇φ we have
(∇φ)(r) · dr = ( Σ_i (∂_i φ)(r) e_i ) · ( Σ_j dx_j e_j ) = Σ_i (∂_i φ)(r) dx_i = dφ,
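A quick numerical sketch in Python (numpy assumed; the scalar field φ is an arbitrary test function) of the relation dφ = ∇φ · dr:

    import numpy as np

    def phi(r):
        return r[0]**2 * r[1] + np.sin(r[2])

    def grad_phi(r):  # analytic gradient of the test function above
        return np.array([2*r[0]*r[1], r[0]**2, np.cos(r[2])])

    r = np.array([1.0, 2.0, 0.5])
    dr = 1e-6 * np.array([0.3, -0.2, 0.1])
    print(phi(r + dr) - phi(r), grad_phi(r) @ dr)  # agree to first order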
(d/dt) a(r(t)) = Σ_i (da_i(r(t))/dt) e_i
             = Σ_i Σ_j (∂_j a_i)(r) v_j e_i
             = Σ_j v_j ∂_j ( Σ_i a_i(r) e_i )      (2.44)
             = Σ_j v_j ∂_j a(r)
             = (v · ∇) a(r)
This shows that the operator
v · ∇ = Σ_i v_i ∂_i   (2.45)
The full physical and geometrical meaning of the divergence is discussed in the next section. Clearly, (∇ · a)(r) is a scalar field. Now, if some vector field a is itself derived from a scalar field via a = ∇φ, then ∇ · a has the form ∇ · ∇φ or, as it is usually written, ∇²φ, where

∇² = Σ_i ∂²_ii = Σ_i ∂²/∂x_i²
Combinations of grad, div and curl

There is a myriad of identities between various combinations of the three important vector operators grad, div and curl. The identities involving cross products are more easily proven using tensor calculus, which is postponed to chapter 4. Here we simply list the most important identities:
∇(φ + ψ) = ∇φ + ∇ψ
∇ · (a + b) = ∇ · a + ∇ · b
∇ × (a + b) = ∇ × a + ∇ × b
∇ · (φa) = φ∇ · a + a · ∇φ
∇ · (a × b) = b · (∇ × a) − a · (∇ × b)      (2.48)
∇ × (∇ × a) = ∇(∇ · a) − ∇²a
∇ · (∇ × a) = 0
∇ × ∇φ = 0
Here, φ and ψ are scalar fields and a and b are vector fields. The last identity has an important meaning. If a is derived from the gradient of some scalar field, a = ∇φ, then the identity shows that a is necessarily irrotational, ∇ × a = 0. We shall return to this point in the next section.
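A symbolic sketch (Python with sympy assumed; φ is an arbitrary test field) verifying the last identity of (2.48), ∇ × ∇φ = 0:

    import sympy as sp

    x1, x2, x3 = sp.symbols('x1 x2 x3')
    phi = x1**2 * sp.sin(x2) + sp.exp(x1 * x3)

    g = [sp.diff(phi, v) for v in (x1, x2, x3)]   # components of grad(phi)
    curl = [sp.diff(g[2], x2) - sp.diff(g[1], x3),
            sp.diff(g[0], x3) - sp.diff(g[2], x1),
            sp.diff(g[1], x1) - sp.diff(g[0], x2)]
    print([sp.simplify(c) for c in curl])          # [0, 0, 0]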
Line integrals
In general, one may encounter line integrals of the forms
∫_C φ dr,   ∫_C a · dr,   ∫_C a × dr,   (2.49)
line integral in Eq. (2.49), for example, is defined as

∫_C a · dr =def lim_{N→∞} Σ_{p=1}^{N} a(x_p) · Δr_p,
and

∫_C a(r) × dr = ∫_C ( Σ_i a_i(r) e_i ) × ( Σ_j dx_j e_j )
  = ( ∫_C a_2(r) dx_3 − ∫_C a_3(r) dx_2 ) e_1 + ( ∫_C a_3(r) dx_1 − ∫_C a_1(r) dx_3 ) e_2
  + ( ∫_C a_1(r) dx_2 − ∫_C a_2(r) dx_1 ) e_3
   (2.52)
Note that in the above we have used relations of the form

∫_C a_i e_j dx_j = ( ∫_C a_i dx_j ) e_j,
1. The integral ∫_A^B a · dr, where A, B ∈ R, is independent of the path from A to B. Hence ∮_C a · dr = 0 around any closed loop in R.
2. There exists a scalar field φ in R such that a = ∇φ.
3. ∇ × a = 0.
4. a · dr is an exact differential.
We will not demonstrate the equivalence of these statements. If a vector field
is conservative we can write
a · dr = ∇φ · dr = dφ
and

∫_A^B a · dr = ∫_A^B ∇φ · dr = ∫_A^B dφ = φ(B) − φ(A).
This situation is encountered whenever a = f represents the force f derived from a potential (scalar) field, φ, such as the potential energy in a gravitational field, the potential energy in an elastic spring, the voltage in electrical circuits, etc.
Surface integrals
As with line integrals, integrals over surfaces can involve vector and scalar fields
and, equally, result in either a vector or a scalar. We shall focus on surface
integrals of the form

∫_S a · dS,   (2.53)
where a is a vector field and S is a surface in space which may be either open or closed. Following the notation of line integrals, for surface integrals over a closed surface, ∫_S is replaced by ∮_S. The vector differential dS in Eq. (2.53) represents a vector area element of the surface S. It may also be written dS = n dS, where n is a unit normal to the surface at the position of the element and dS is the scalar area of the element. The convention for the direction of the normal n to a surface depends on whether the surface is open or closed. For a closed surface the direction of n is taken to be outwards from the enclosed volume. An open surface spans some perimeter curve C. The direction of n is then given by the right-hand sense with respect to the direction in which the perimeter is traversed, i.e. it follows the right-hand screw rule.
The formal definition of a surface integral is very similar to that of a line
integral. One divides the surface into N elements of area ∆Sp , p = 1, · · · , N
each with a unit normal np . If xp is any point in ∆Sp then
∫_S a · dS = lim_{N→∞} Σ_{p=1}^{N} a(x_p) · n_p ΔS_p,
where it is required that ∆Sp → 0 for N → ∞.
A standard way of evaluating surface integrals is to use cartesian coordinates
and project the surface onto one of the basis planes. For instance, suppose a
surface S has projection R onto the 12-plane (xy-plane), so that an element of
surface area dS projects onto the area element dA. Then
dA = |e3 · dS| = |e3 · ndS| = |e3 · n|dS.
Since in the 12-plane dA = dx_1 dx_2, we have the expression for the surface integral

∫_S a(r) · dS = ∫_R a(r) · n dS = ∫_R a(r) · n (dx_1 dx_2)/|e_3 · n|
Now, if the surface S is given by the equation x_3 = z(x_1, x_2), where z(x_1, x_2) gives the third coordinate of the surface for each (x_1, x_2), then the scalar field

f(x_1, x_2, x_3) = x_3 − z(x_1, x_2)   (2.54)

is identically zero on S. The unit normal at any point of the surface will be given by n = ∇f/|∇f| evaluated at that point, c.f. section 2.6.4. We then obtain
dS = n dS = (∇f/|∇f|) (dA/|n · e_3|) = ∇f dA/|∇f · e_3| = ∇f dA/|∂_3 f| = ∇f dx_1 dx_2,
where the last identity follows from the fact that ∂_3 f = 1 from Eq. (2.54). The surface integral then becomes

∫_S a(r) · dS = ∫_R a(x_1, x_2, z(x_1, x_2)) · (∇f)(x_1, x_2) dx_1 dx_2.
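A numerical sketch in Python (numpy assumed; both the field a and the surface z(x_1, x_2) are arbitrary choices for illustration) of the projection formula above:

    import numpy as np

    # a = (0, 0, x3) integrated over the surface x3 = z(x1,x2) = x1^2 + x2^2
    # above R = [0,1]^2; grad f = (-dz/dx1, -dz/dx2, 1).
    n = 400
    xs = (np.arange(n) + 0.5) / n              # midpoint rule on R
    X1, X2 = np.meshgrid(xs, xs, indexing='ij')
    Z = X1**2 + X2**2                          # surface height z(x1, x2)

    integrand = 0.0*(-2*X1) + 0.0*(-2*X2) + Z*1.0   # a(x1,x2,z) . grad f
    print(np.mean(integrand))                  # ~ 2/3, the exact value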
Volume integrals
Volume integrals are generally simpler than line or surface integrals since the
element of the volume dV is a scalar quantity. Volume integrals are most often of the form

∫_V φ(r) dV,   ∫_V a(r) dV   (2.55)
Clearly, the first form results in a scalar, whereas the second one yields a vector.
Two closely related physical examples, one of each kind, are provided by the total mass M of a fluid contained in a volume V, given by M = ∫_V ρ(r) dV, and the total linear momentum of that same fluid, given by ∫_V ρ(r)v(r) dV, where ρ(r) is the density field and v(r) is the velocity field of the fluid.
The evaluation of the first volume integral in Eq. (2.55) is an ordinary
multiple integral. The evaluation of the second type of volume integral follows
directly since we can resolve the vector field into cartesian coordinates
∫_V a(r) dV = Σ_i ( ∫_V a_i(r) dV ) e_i.
Of course we could have written a in terms of the basis vectors of another coordinate system (e.g. spherical coordinates), but since such basis vectors are not in general constant, they cannot be taken out of the integrand.
Gauss' theorem

Let a be a (differentiable) vector field, and V be any volume bounded by a closed surface S. Then Gauss' theorem states

∫_V (∇ · a)(r) dV = ∮_S a(r) · dS.   (2.56)
by evaluating the vector field a(r) in the center point of the face r_0 + s(Δx_k/2) e_k,

∫_{F_sk} a(r) · dS ≈ a(r_0 + s(Δx_k/2) e_k) · s e_k A_k

The surface integral on the rhs. of Eq. (2.56) for the box B(r_0) then becomes

∮_{B(r_0)} a(r) · dS ≈ Σ_{sk} a(r_0 + s(Δx_k/2) e_k) · s e_k A_k
  = Σ_k [ a(r_0 + (Δx_k/2) e_k) − a(r_0 − (Δx_k/2) e_k) ] · e_k A_k
  = Σ_k [ a_k(r_0 + (Δx_k/2) e_k) − a_k(r_0 − (Δx_k/2) e_k) ] A_k      (2.57)
  ≈ Σ_k (∂_k a_k)(r_0) Δx_k A_k
  = (∇ · a)(r_0) V_B
Adding the surface integrals of each of these boxes, the contributions from the mutual interfaces vanish (since the outward normals of two adjacent boxes point in opposite directions). Consequently, the only contributions come from the surface S of V.
∮_S a(r) · dS ≈ Σ_i ∮_{B(r_0,i)} a(r) · dS_i   (2.59)
Since each term on the rhs. of Eq. (2.58) equals a corresponding term on the rhs. of Eq. (2.59), Gauss' theorem is demonstrated.
Gauss' theorem is often used in conjunction with the following mathematical theorem:

(d/dt) ∫_V φ(r, t) dV = ∫_V (∂/∂t) φ(r, t) dV,
where t is time and φ is a time-dependent scalar field (the theorem works in arbitrary spatial dimension). The two theorems are central in deriving partial differential equations for dynamical systems, in particular the so-called continuity equations, linking a flow field to the time changes of a scalar field advected by the flow. For instance, if ρ(r, t) is a density and v(r, t) is the velocity field of a fluid, then the vector j(r) = ρ(r)v(r) gives the density current. It can then be shown (try it) that under mass conservation

∂ρ/∂t + ∇ · j = 0.
The divergence of a vector field therefore has the physical meaning of giving the
net “outflux” of a scalar advected by the field within an infinitesimal volume.
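A numerical sketch in Python (numpy assumed; the field is an arbitrary test case) checking Gauss' theorem, Eq. (2.56), on the unit cube:

    import numpy as np

    # a(r) = (x1^2, x2^2, x3^2), so div a = 2(x1 + x2 + x3).
    n = 50
    xs = (np.arange(n) + 0.5) / n
    X1, X2, X3 = np.meshgrid(xs, xs, xs, indexing='ij')
    vol_integral = np.sum(2.0 * (X1 + X2 + X3)) / n**3   # volume integral of div a

    # Flux: on the face x_k = 1 we have a.n = 1; on x_k = 0, a.n = 0.
    flux = 3 * (1.0 - 0.0)
    print(vol_integral, flux)   # 3.0  3.0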
Stokes' theorem

Stokes' theorem is the “curl analogue” of the divergence theorem: it relates the integral of the curl of a vector field over an open surface S to the line integral of the vector field around the perimeter C bounding the surface,

∫_S (∇ × a)(r) · dS = ∮_C a(r) · dr   (2.60)
Following the same lines as for the derivation of the divergence theorem, the surface S can be divided into many small areas S_i with boundaries C_i and unit normals n_i. For each small area one can show that

(∇ × a) · n_i S_i ≈ ∮_{C_i} a · dr.

Summing over i, one finds that on the rhs. all parts of all interior boundaries that are not part of C are included twice, being traversed in opposite directions on each occasion and thus cancelling each other. Only contributions from line elements that are also part of C survive. If each S_i is allowed to tend to zero, Stokes' theorem, Eq. (2.60), is obtained.
2.7 Curvilinear coordinates

The vector operators, ∇φ, ∇ · a and ∇ × a, we have discussed so far have all been defined in terms of cartesian coordinates. In that respect we have been more restricted than in the algebraic definitions of the analogous ordinary scalar and cross products, Eq. (2.18) and Eq. (2.20) respectively, where we only assumed orthonormality of the basis. The reason is that the nabla operator involves spatial derivatives, which implies that one must account for the possible non-constancy of E, c.f. Eq. (2.35).

Many systems possess some particular symmetry which makes other coordinate systems more natural, notably cylindrical or spherical coordinates. These coordinates are just two examples of what are called curvilinear coordinates. Curvilinear coordinates refer to the general case in which the rectangular coordinates x of any point can be expressed as functions of another set of coordinates x′, thus defining a coordinate transformation, x = x(x′) or x′ = x′(x)⁸. Here we shall discuss the algebraic form of the standard vector operators for the two particular transformations into cylindrical or spherical coordinates. An important feature of these coordinates is that although they lead to non-constant bases, these bases retain the property of orthonormality.
The starting point for the digression is to realize that a non-constant basis implies that the basis vectors are vector fields, as opposed to the constant basis R = {e_1, e_2, e_3} associated with cartesian coordinates. Any vector field, a(x′), is then generally written as

a(x′) = Σ_j a′_j(x′) e′_j(x′)   (2.61)

where a′_j are the components of the vector field in the new basis, and x′ = x′(x) is the coordinate transformation. One notes that the functional form of the components a′_j(x′) differs from the functional form of the cartesian components, a_j(x), because the same point, P, will have two different numerical representations x′(P) and x(P). For a scalar field one must have the identity

φ′(x′) = φ(x),

where φ′ is the functional form of the scalar field in the primed coordinate system and φ is the functional form of the scalar field with respect to the unprimed system. Again, the two functional forms must be different because the same point has different numerical representations. However, the values of the two functions must be the same since x′ and x represent the same point in space.
⁸ Recall that the notation x′ = x′(x) is shorthand notation for the three functional relationships listed in Eq. (1.2).
2.7.1 Cylindrical coordinates

Cylindrical coordinates, x′ = (ρ, φ, z), are defined in terms of normal cartesian coordinates x = (x_1, x_2, x_3) = (x, y, z) by the coordinate transformation

x = ρ cos(φ)
y = ρ sin(φ)   (2.62)
z = z.
Local basis

If we take the partial derivatives with respect to ρ, φ, z, c.f. Eq. (2.33), and normalize, we obtain

e′_1(x′) = e_ρ(x′) = ∂_ρ r = cos(φ) e_1 + sin(φ) e_2
e′_2(x′) = e_φ(x′) = (1/ρ) ∂_φ r = −sin(φ) e_1 + cos(φ) e_2
e′_3 = e_3
These three unit vectors, like the Cartesian unit vectors e_i, form an orthonormal basis at each point in space. An arbitrary vector field may therefore be resolved in this basis

a_ρ = a · e_ρ,   a_φ = a · e_φ,   a_z = a · e_z
Resolution of the gradient

The derivatives with respect to the cylindrical coordinates are found by differentiating through the Cartesian coordinates (chain rule):

∂_ρ = (∂x/∂ρ) ∂_x + (∂y/∂ρ) ∂_y = cos(φ) ∂_x + sin(φ) ∂_y
∂_φ = (∂x/∂φ) ∂_x + (∂y/∂φ) ∂_y = −ρ sin(φ) ∂_x + ρ cos(φ) ∂_y

From these relations we can calculate the projections of the gradient operator ∇ = Σ_i e_i ∂_i on the cylindrical basis, and we obtain

∇_ρ = e_ρ · ∇ = ∂_ρ
∇_φ = e_φ · ∇ = (1/ρ) ∂_φ
∇_z = e_z · ∇ = ∂_z
The resolution of the gradient in the two bases therefore becomes

∇ = e_ρ ∇_ρ + e_φ ∇_φ + e_z ∇_z = e_ρ ∂_ρ + e_φ (1/ρ) ∂_φ + e_z ∂_z
Together with the only non-vanishing derivatives of the basis vectors,

∂_φ e_ρ = e_φ
∂_φ e_φ = −e_ρ

we have the necessary tools for calculating in cylindrical coordinates. Note that it is the vanishing derivatives of the basis vectors that lead to the simple form of the vector operators in cartesian coordinates.
Laplacian

The laplacian in cylindrical coordinates takes the form

∇² = ∇ · ∇ = (e_ρ ∇_ρ + e_φ ∇_φ + e_z ∇_z) · (e_ρ ∇_ρ + e_φ ∇_φ + e_z ∇_z)

Using the linearity of the scalar product it can be rewritten to the form

Σ_ij (e′_i ∇′_i) · (e′_j ∇′_j).

Applying the chain rule of differentiation and Eq. (2.7.1) one can then show

∇² = ∂²_ρρ + (1/ρ) ∂_ρ + (1/ρ²) ∂²_φφ + ∂²_zz
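A symbolic sketch (sympy assumed; the test field is arbitrary) checking this form of the Laplacian against the Cartesian computation:

    import sympy as sp

    rho, ph, z = sp.symbols('rho phi z', positive=True)
    f_cyl = rho**2 * sp.cos(ph)              # test field in cylindrical coordinates
    lap_cyl = (sp.diff(f_cyl, rho, 2) + sp.diff(f_cyl, rho)/rho
               + sp.diff(f_cyl, ph, 2)/rho**2 + sp.diff(f_cyl, z, 2))
    print(sp.simplify(lap_cyl))              # 3*cos(phi)

    x, y = sp.symbols('x y', real=True)
    f_cart = x * sp.sqrt(x**2 + y**2)        # same field, since rho^2 cos(phi) = x*rho
    lap_cart = sp.diff(f_cart, x, 2) + sp.diff(f_cart, y, 2)
    print(sp.simplify(lap_cart))             # 3*x/sqrt(x**2 + y**2) = 3*cos(phi)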
Local basis

The normalized tangent vectors along the directions of the spherical coordinates are

e_r(x′) = ∂_r r = sin(θ)cos(φ) e_1 + sin(θ)sin(φ) e_2 + cos(θ) e_3
e_θ(x′) = (1/r) ∂_θ r = cos(θ)cos(φ) e_1 + cos(θ)sin(φ) e_2 − sin(θ) e_3
e_φ(x′) = (1/(r sin(θ))) ∂_φ r = −sin(φ) e_1 + cos(φ) e_2

They define an orthonormal basis such that an arbitrary vector field may be resolved in these directions

a(x′) = a_r(x′) e_r(x′) + a_θ(x′) e_θ(x′) + a_φ(x′) e_φ(x′)
∇ = e_r ∇_r + e_θ ∇_θ + e_φ ∇_φ
  = e_r ∂_r + e_θ (1/r) ∂_θ + e_φ (1/(r sin(θ))) ∂_φ   (2.65)
∂_θ e_r = e_θ     ∂_φ e_r = sin(θ) e_φ
∂_θ e_θ = −e_r    ∂_φ e_θ = cos(θ) e_φ     (2.66)
∂_φ e_φ = −sin(θ) e_r − cos(θ) e_θ
Laplacian

The laplacian in spherical coordinates becomes

∇² = (e_r ∇_r + e_θ ∇_θ + e_φ ∇_φ) · (e_r ∇_r + e_θ ∇_θ + e_φ ∇_φ),

which, after using the linearity of the scalar product and Eq. (2.66), becomes

∇² = ∂²_rr + (2/r) ∂_r + (1/r²) ∂²_θθ + (cos(θ)/(r² sin(θ))) ∂_θ + (1/(r² sin²(θ))) ∂²_φφ
Notice that the first two terms, involving radial derivatives, can be given alternative expressions,

(∂²_rr φ) + (2/r)(∂_r φ) = (1/r²) ∂_r( r² (∂_r φ) ) = (1/r) ∂²_rr (r φ),

where φ is a scalar field.
Chapter 3
Tensors
In this chapter we will limit ourselves to the discussion of rank 2 tensors, unless stated otherwise. The precise definition of the rank of a tensor will become clear later.
3.1 Definition

A tensor, T, of rank two is a geometrical object that to any vector a associates another vector u = T(a) by a linear operation

In other words, a rank 2 tensor is a linear vector operator. We will denote tensors (of rank 2 or higher) by capital bold-face letters.

Any linear transformation of vectors, such as a rotation, reflection or projection, is an example of a tensor. In fact, we have already encountered tensors in disguise. The operator T = c× is a tensor. To each vector, a, it associates the vector T(a) = c × a, obtained by rotating a 90° counter-clockwise around c and scaling it with the magnitude |c|. Since the cross product is linear in the second argument, T is linear and thus a tensor. For reasons to become clear we will also extend and use the dot-notation to indicate the operation of a tensor on a vector, so

T · a =def. T(a)   (3.3)
Physical examples of tensors include the inertia tensor, I, or moment of inertia, which specifies how the angular velocity, ω (a vector), determines the angular momentum, l (a vector),

l = I · ω
Another example is the (transposed) stress tensor, σᵗ, that specifies the force f a continuous medium exerts on a surface element defined by the normal vector n,

f = σᵗ · n

A tensor often associated with the stress tensor is the strain tensor, U, that specifies how a strained material is distorted, Δu, in some direction Δr,

Δu = U · Δr
In words, for each c the tensor ab associates a vector in the direction of a and with a magnitude equal to the projection of c onto b. In order to call the object (ab) a tensor we should verify that it is a linear operator. Using the definition, Eq. (3.3), allows us to “place the brackets where we want”

which will ease the notation. Now, demonstrating that (ab) indeed is a tensor amounts to “moving brackets”:

We note that since the definition of the outer product involves vector operations that bear no reference to coordinates/components, the outer product will itself be invariant to the choice of coordinate system. The outer product is also known in the literature as the tensor, direct, exterior or dyadic product. The tensor formed by the outer product of two vectors is called a dyad.
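A small sketch in Python (numpy assumed; vectors chosen arbitrarily): the dyad (ab) acting on c equals a(b · c), and its component matrix is the outer product:

    import numpy as np

    a = np.array([1.0, 0.0, 2.0])
    b = np.array([0.0, 3.0, 1.0])
    c = np.array([2.0, 1.0, 1.0])

    T = np.outer(a, b)               # [ab]_ij = a_i b_j
    print(T @ c, a * np.dot(b, c))   # identical: [4. 0. 8.]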
(S + T) · c = S · c + T · c
(mS) · c = m(S · c).   (3.6)
Here, c is any vector. It is easy to show that (S + T) and (mS) are also tensors, i.e. they satisfy the definition of being linear vector operators. The definition Eq. (3.6) and the properties Eq. (3.5) guarantee that dyad products, sums of tensors and dot products of tensors with vectors satisfy all the usual algebraic rules for sums and products. For instance, the outer product between two vectors of the form c = Σ_i m_i a_i and d = Σ_j n_j b_j, where m_i and n_j are scalars, is

cd = ( Σ_i m_i a_i )( Σ_j n_j b_j ) = Σ_ij m_i n_j a_i b_j,   (3.7)

i.e. it is a sum of all dyad combinations a_i b_j. The sum of two or more dyads is called a dyadic. As we shall demonstrate in section 3.4, any tensor can be expressed as a dyadic, but not necessarily as a single dyad.
We can also extend the definition of the dot product. For any vector c we define the dot product of two tensors by

(T · S) · c = T · (S · c)   (3.8)

In words, application of the operator (T · S) to any vector means first applying S and then T. Since the composition of two linear functions is a linear function, (T · S) is a tensor itself. Sums and products of tensors also obey the usual rules of algebra, except that dot multiplication of two tensors is, in general, not commutative,

T · S ≠ S · T.
Tᵗ · c = c · T   (3.13)
As we shall see in section 3.4, any tensor can be expressed as a sum of dyads. Using this fact, the following properties can easily be proved:

(T + S)ᵗ = Tᵗ + Sᵗ   (3.15)
(T · S)ᵗ = Sᵗ · Tᵗ   (3.16)
(Tᵗ)ᵗ = T.   (3.17)
3.3.2 Contraction

Another useful operation on tensors is that of contraction. The contraction, ab : cd, of two dyads results in a scalar, defined by

ab : cd = (a · c)(b · d)   (3.18)

Note the following useful relation for the contraction of two dyads formed by basis vectors:

e_i e_j : e_k e_l = δ_ik δ_jl

The contraction of two dyadics is defined as the bilinear operation

( Σ_i m_i a_i b_i ) : ( Σ_j n_j c_j d_j ) = Σ_ij m_i n_j (a_i · c_j)(b_i · d_j),   (3.19)

S(T) = T + Tᵗ
A(T) = T − Tᵗ   (3.20)
Finally, an important tensor is the identity or unit tensor, 1. It may be defined as the operator which, acting on any vector, yields the vector itself. Evidently, 1 is one of the special cases for which, for all tensors A,

1 · A = A · 1

If m is any scalar, the product m1 is called a constant tensor and has the property

(m1) · A = A · (m1) = mA

Constant tensors therefore commute with all tensors, and no other tensors have this property.
Thus,

T · c = Σ_ij T_ij c_j e_i   (3.23)
Thus, the concepts of a dyadic and linear vector operator or tensor are identical
and are equivalent to the concept of linear vector function in the sense that
every linear vector function defines a certain tensor or dyadic, and conversely.
Conveniently, the components, T_ij, can be arranged in a matrix T = (T_ij)¹,

T = (T_ij) =def.
    [ T_11  T_12  T_13 ]
    [ T_21  T_22  T_23 ]   (3.25)
    [ T_31  T_32  T_33 ]
In analogy with the notation used for vectors we write T = [T]_E – or simply T = [T] when the basis is irrelevant – as a shorthand notation for the component matrix of a tensor with respect to a given basis. Also, for the inverse operation (i.e. the tensor obtained by expanding the components along the basis dyads) we shall use the notation

T = (T)_E =def Σ_ij T_ij e_i e_j.   (3.26)
Tensor · vector

For instance, Eq. (3.22) shows that if u = T · c, then the relationship between the components of u, T and c in any orthonormal basis will satisfy

u_i = [T · c]_i = Σ_j T_ij c_j,   (3.27)
(Tij ), is used to indicate the matrix collection of these. This is in analogy to the notation
used for triplets a = (ai ), cf. section 2.3.
or

u = T · c,

where the triplets u and c are to be considered as columns. Therefore, we have the correspondence between the tensor operation and its matrix representation

[T · c] = [T] · [c]
Tensor · tensor

Also, let us consider the implication of the definition of the dot product of two tensors, Eq. (3.8), in terms of the components:

T · (S · c) = T · ( Σ_kj S_kj c_j e_k )
           = Σ_kj S_kj c_j T · e_k
           = Σ_kj S_kj c_j Σ_i T_ik e_i
           = Σ_ij ( Σ_k T_ik S_kj ) c_j e_i   (3.28)

which again is identical to the ij'th element of the matrix product T · S. A more transparent derivation is obtained by noting that for any vector c

(e_k e_l) · c = e_k (e_l · c) = e_k c_l

and consequently

Therefore, one obtains the simple result for the dot product of two basis dyads,

e_i e_j · e_k e_l = δ_jk e_i e_l   (3.30)

[T · S] = [T] · [S]
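A numerical sketch in Python (numpy assumed; matrices chosen arbitrarily) of these component rules and of the non-commutativity noted earlier:

    import numpy as np

    T = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 3.0]])
    S = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    c = np.array([1.0, 2.0, 3.0])

    print(np.allclose((T @ S) @ c, T @ (S @ c)))  # True, definition Eq. (3.8)
    print(np.allclose(T @ S, S @ T))              # False: T.S != S.T in general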
Tensor + tensor

One can similarly show that the definition of the sum of two tensors, Eq. (3.6),

(S + T) · c = S · c + T · c,   (3.6)

implies that tensors are added by adding their component matrices. Indeed,

(S + T) · c = ( Σ_ij S_ij e_i e_j + Σ_ij T_ij e_i e_j ) · c
           = Σ_ij S_ij e_i (e_j · c) + Σ_ij T_ij e_i (e_j · c)   (def. Eq. (3.6))
           = Σ_ij (S_ij + T_ij) e_i (e_j · c)                    (3.32)
           = ( Σ_ij (S_ij + T_ij) e_i e_j ) · c

Consequently,

S + T = Σ_ij (S_ij + T_ij) e_i e_j   (3.33)
or equivalently,
[S + T]ij = Sij + Tij , (3.34)
which means that the component matrix of S + T is obtained by adding the
component matrices of S and T respectively
[S + T] = [S] + [T]
It is left as an exercise to demonstrate that
[mT]ij = mTij , (3.35)
where m is a scalar.
Transposition

Finally, the definition of the transpose of a tensor, Eq. (3.13), is also consistent with the algebraic definition of transposition. Specifically, comparing

c · ( Σ_ij T_ij e_i e_j ) = Σ_ij T_ij (c · e_i) e_j = Σ_ij T_ij e_j (e_i · c) = ( Σ_ij T_ji e_i e_j ) · c   (3.36)
Eq. (3.36) also demonstrates

[c · T]_i = Σ_j c_j T_ji   (3.38)
In keeping with the convention that c represents a column, the matrix form of this equation is

cᵗ · T,

where cᵗ and the image cᵗ · T will be rows.
and similarly, [T]_{E′E,ij} for T̃_ij. Two-point components do not involve any extra formalism, though. For instance, getting the components in the dyad basis E′E′ or in the basis EE is only a question of multiplying with the basis vectors e′_i to the right or the basis vectors e_i to the left on the expression Σ_kl T̃_kl e′_k e_l and collecting the terms. For instance,

[T]_{EE,ij} = e_i · ( Σ_kl T̃_kl e′_k e_l ) · e_j
           = Σ_kl T̃_kl (e_i · e′_k)(e_l · e_j)   (3.39)
           = Σ_kl T̃_kl (e_i · e′_k) δ_lj
In cartesian coordinates a tensor field is given by

T(x) = Σ_ij T_ij(x) e_i e_j   (3.40)
are vector fields with components [ai ]j = Tji and [bi ]j = Tij , respectively.
Nabla operator

For any vector field a(r) we can construct a tensor field by the nabla operator. Inserting a(r) = Σ_i a_i(r) e_i “blindly” into the definition, Eq. (2.41), gives

(∇a)(r) = Σ_i e_i ∂_i ( Σ_j a_j(r) e_j ) = Σ_ij (∂_i a_j)(r) e_i e_j   (3.42)

Notice that the components of ∇a are those of an outer product between two ordinary vectors, [ba]_ji = b_j a_i.

The tensor ∇a represents how the vector field a changes for a given displacement Δr,

a(r + Δr) ≈ a(r) + Δr · (∇a)(r)

or

da = dr · (∇a)

An alternative expression for the differential da was obtained in Eq. (2.44). Notice that this identity is in accordance with the usual definition for a dyad operating on a vector from the left.
2 We recall that the ∇ operator has been defined relative to a chosen coordinate system. It
remains to be proven that operators derived from ∇ such as gradients, curls, divergence etc.
“behave as” proper scalars or vectors. We return to this point in 4.5.
Divergence

The divergence of a tensor field is obtained straightforwardly by applying the definition of ∇ and the dot product. With respect to a cartesian basis R we have

(∇ · T)(x) = ( Σ_i e_i ∂_i ) · ( Σ_jk T_jk(x) e_j e_k )
          = Σ_ik ∂_i T_ik(x) e_k                        (3.44)
          = Σ_k ( Σ_i ∂_i T_ik(x) ) e_k.

Consequently, a = ∇ · T is a vector with components a_k = Σ_i ∂_i T_ik, cf. Eq. (3.41). The result is actually easier seen in pure subscript notation,

[∇ · T]_k = Σ_i ∂_i T_ik,

because it follows directly from the matrix algebra, Eq. (3.38). Another way of viewing the divergence of a tensor field is to take the divergence of each of the vector fields a_i = T · e_i,

∇ · T = Σ_i (∇ · a_i) e_i
Curl

The curl of a tensor field is obtained similarly to the divergence. The result is most easily obtained by applying the algebraic definition of the curl, Eq. (2.47). Then we obtain

[∇ × T]_ij = Σ_mn ε_imn ∂_m T_nj

Thus the curl of a tensor field is another tensor field. As for the divergence, we obtain the same result by considering the curl operator on each of the three vector fields a_j = T · e_j,

∇ × T = Σ_j (∇ × a_j) e_j,

where an outer vector product is involved in each of the terms (∇ × a_j) e_j.
The corresponding tensor versions are then

∫_V ( Σ_i ∂_i T_il ) dV = ∮_S ( Σ_i T_il n_i ) dS            Gauss
∫_S ( Σ_ijk ε_ijk ∂_j T_kl n_i ) dS = ∮_C ( Σ_i T_il dx_i )    Stokes   (3.46)

Here l refers to any component index, l = 1, 2, 3. The reason these formulas also work for tensors is that for a fixed l, a_l = T · e_l = Σ_i T_il e_i defines a vector field, c.f. (3.41), and Gauss' and Stokes' theorems work for each of these.
Due to Gauss' theorem, the physical interpretation of the divergence of a tensor field is analogous to that of the divergence of a vector field. For instance, if a flow field v(r, t) is advecting a vector a(r, t), then the outer product J(r, t) = v(r, t)a(r, t) is a tensor field where [J]_ij(r, t) = v_i(r, t)a_j(r, t) describes how much of the j'th component of the vector a is transported in direction e_i. The divergence ∇ · J is then a vector where each component [∇ · J]_j corresponds to the accumulation of a_j in an infinitesimal volume due to the flow. In other words, if a represents a conserved quantity, then we have a continuity equation for the vector field a,

∂a/∂t + ∇ · J = 0
Chapter 4
Tensor calculus
In the two preceding chapters we have developed the basic algebra for scalars, vectors and rank 2 tensors. We have seen that the notation of vectors and tensors comes in two flavours. The first notation insists on not referring to coordinates or components at all. This geometrically defined notation, called direct notation, is explicitly invariant to the choice of coordinate system and is the one adopted in the first parts of chapters two and three (2.1-2.2, 3.1-3.3). The second notation, based on components and indices, is called tensor notation and is the one that naturally arises with a given coordinate system. Here, vectorial or tensorial relations are expressed algebraically.
Seemingly, a discrepancy exists between the two notations in that the latter appears to depend on the choice of coordinates. In section 4.2 we remove this discrepancy by learning how to transform the components of vectors and tensors when the coordinate system is changed. As we shall see, these transformation rules guarantee that one can unambiguously express vectorial or tensorial relations using component/tensor notation, because the expressions will preserve their form upon a coordinate transformation. Indeed, we should already expect this to be the case, since we have not made any specific assumptions about the coordinate system in the propositions regarding relations between vector or tensor components presented so far, except for this system being rectangular. It is possible to generalize the tensor notation to ensure that the formalism takes the same form in non-rectangular coordinate systems as well. This requirement is known as general covariance. In the following we will, however, restrict the discussion to rectangular coordinates.
Since tensor notation often requires knowledge of transformation rules between coordinate systems, one may validly ask why use it at all. First of all, in quantifying any physical system we will eventually have to give a numerical representation, which for vectors or tensors implies specifying the value of their components with respect to some basis. Secondly, in many physical problems it is natural to introduce tensors of rank higher than two¹. To insist on a direct
1 The Levi-Civita symbol being an example of a (pseudo)-tensor of rank 3.
notation for these objects becomes increasingly tedious and unnatural. In tensor notation no new notation needs to be defined, only those we have already seen (some in disguise), as reviewed in section 4.3. It is therefore strongly recommended to become familiar with the tensor notation once and for all.
4.1 Tensor notation and Einstein's summation rule

Here, i is a bound index and can be renamed without changing the meaning of the expression. The situation where a dummy index appears exactly twice in a product occurs so often that it is convenient to introduce the convention that the summation symbol may be left out in this specific case. In particular, it occurs in any component representation of a “dot” or inner product. Hence we may write

a · b = Σ_i a_i b_i = a_i b_i                        (i a bound index)
[T · a]_j = Σ_i T_ji a_i = T_ji a_i                  (i a bound index, j a free index)
[T · S]_ij = Σ_k T_ik S_kj = T_ik S_kj               (i, j free indices, k a bound index)   (4.1)
T : S = Σ_ij T_ij S_ij = T_ij S_ij                   (i, j bound indices)

Similarly, for the expansion of the components along the basis vectors, Eq. (2.10),

a = Σ_i a_i g_i = a_i g_i
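The convention maps directly onto numpy's einsum, which may serve as a sketch (numpy assumed; inputs arbitrary) of the four lines of Eq. (4.1):

    import numpy as np

    a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])
    T, S = np.diag([1.0, 2.0, 3.0]), np.ones((3, 3))

    print(np.einsum('i,i->', a, b))       # a_i b_i       (scalar product)
    print(np.einsum('ji,i->j', T, a))     # T_ji a_i      ([T.a]_j)
    print(np.einsum('ik,kj->ij', T, S))   # T_ik S_kj     ([T.S]_ij)
    print(np.einsum('ij,ij->', T, S))     # T_ij S_ij     (T : S)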
4.2 Orthonormal basis transformation

Let E and E′ be two different orthonormal bases. The basis vectors of the primed basis may be resolved into the unprimed basis

where

a_ji =def e′_j · e_i   (4.3)

represents the cosine of the angle between e′_j and e_i. The relation between the primed and unprimed components of any (proper) vector

v = v_i e_i = v′_j e′_j
Here, v′_j = [v]_{E′,j} and v_i = [v]_{E,i}. Note that the two transformations differ in whether the sum is over the first or the second index of a_ij. The matrix version of Eq. (4.4) reads

v′ = A · v,   A = (a_ij),   (4.6)

and Eq. (4.5)

v = Aᵗ · v′.

Using the same procedure, the primed and unprimed components of a tensor²

A = [1]_{E′E}

² In keeping with Einstein's summation rule, this expression is shorthand notation for

T = Σ_ik T_ik e_i e_k = Σ_jl T′_jl e′_j e′_l
4.2.1 Cartesian coordinate transformation

A particular case of an orthonormal basis transformation is the transformation between two cartesian coordinate systems, C = (O, R) and C′ = (O′, R′). Here, the form of the vector transformation, Eq. (4.4), can be directly translated to the coordinates themselves. To show this we employ the simple relation between coordinates and the position vector, Eq. (2.9), valid in any cartesian system. The displacement vector between any two points is given by Δr = Δx_i e_i. This vector – or its differential analogue dr = dx_i e_i – is a proper vector, independent of the coordinate system. The following relation between the coordinate differentials dx_i and dx′_j must then be satisfied:

dr = dx_i e_i = dx′_j e′_j

Multiplying with e′_j we obtain
following axioms:
so

det(A) = ±1

Transformations with determinant +1 and transformations with determinant −1 cannot be continuously connected. Therefore, the set of transformations with determinant +1 itself forms a group, called SO(3), which represents the set of all rotations. By considering a pure reflection, R = (r_ij), in the origin of a cartesian coordinate system, we see that det(R) = −1. Clearly, R · R = 1, so the set Z(2) = {1, R} also forms a group. Consequently, we may write O(3) = Z(2) ⊗ SO(3). In words, any orthogonal matrix can be decomposed into a pure rotation and an optional reflection. Note that any matrix with det = −1 will change the handedness of the coordinate system.
third invariance above, i.e. that the i'th component of the image u = T · c of T operating on c is always obtained as

u_i = T_ij c_j,   (3.27)
Oᵗ · O = 1,

where 1 as usual denotes the identity tensor. In any orthonormal basis the matrix representation of this identity will be that of Eq. (4.12). Further, let a new basis system E′ = {e′_1, e′_2, e′_3} be defined by

e′_i = O · e_i

where E = {e_1, e_2, e_3} is the old basis. We shall use the shorthand notation

E′ = O(E)
The components of a vector v in the primed basis relate to the unprimed components as

v′_i = e′_i · v
    = (O · e_i) · v
    = [O · e_i]_{E,j} [v]_{E,j}     (scalar product wrt. E)
    = [O]_jk [e_i]_k v_j            (Eq. (3.27), all components wrt. E)
    = [O]_jk δ_ik v_j                                   (4.16)
    = [O]_ji v_j
    = [Oᵗ]_ij v_j
    = (Oᵗ · v)_i
This shows that the matrix, A, representing the basis transformation, Eq. (4.6), satisfies

A = [Oᵗ]_E = Oᵗ = O⁻¹.

Being explicit about the bases involved, we may write this identity as

which is to say that the matrix representing a passive transformation from one basis to another, E → E′, equals the matrix representation wrt. E of the tensor mapping the new basis vectors onto the old ones, O⁻¹(E′) = E.
Scalars

A quantity m is a scalar if it has the same value in every coordinate system. Consequently, it transforms according to the rule

m′ = m

Vectors

A triplet of real numbers, v, is a vector if the components transform according to

v′_j = a_ji v_i

The matrix notation of the transformation rule is

v′ = A · v
Rank 2 tensors

A rank two tensor is a 3 × 3 matrix of real numbers, T, which transforms as the outer product of two vectors

Not surprisingly, these quantities, T_ijk, are called the components of T₃, and since three indices are needed, T₃ is called a third rank tensor. Upon a coordinate transformation we would need to transform the components of each individual vector in the triple product. Expressing the original basis vectors in terms of the new basis,

e_i = a_ji e′_j

one obtains

T₃ = T′_lmn e′_l e′_m e′_n = T_ijk e_i e_j e_k
   = T_ijk (a_li e′_l)(a_mj e′_m)(a_nk e′_n) = a_li a_mj a_nk T_ijk e′_l e′_m e′_n   (4.19)
This shows that upon an orthogonal transformation the components of a third rank tensor transform as

T′_lmn = a_li a_mj a_nk T_ijk

The inverse transformation follows from expressing the new basis vectors in terms of the old ones, e′_j = a_ji e_i, in Eq. (4.19):

T_ijk = a_li a_mj a_nk T′_lmn.
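A sketch in Python (numpy assumed; the rotation angle and tensor are arbitrary) of the rank-3 transformation and its inverse:

    import numpy as np

    th = 0.3   # rotation about the 3-axis
    A = np.array([[ np.cos(th), np.sin(th), 0.0],
                  [-np.sin(th), np.cos(th), 0.0],
                  [ 0.0,        0.0,        1.0]])
    T = np.random.rand(3, 3, 3)

    # T'_lmn = a_li a_mj a_nk T_ijk, and back by summing on the first indices
    T_prime = np.einsum('li,mj,nk,ijk->lmn', A, A, A, T)
    T_back  = np.einsum('li,mj,nk,lmn->ijk', A, A, A, T_prime)
    print(np.allclose(T, T_back))   # True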
We can continue this procedure, defining the outer product of four vectors in terms of triple products, cf. Eq. (4.18), and introducing addition and scalar multiplication of these objects as well.

In keeping with previous notation, the set of all components is written as T = (T_ijk). Here, ijk are dummy indices and the bracket around T_ijk indicates the set of all of these:

(T_ijk) = { T_ijk | i = 1, 2, 3; j = 1, 2, 3; k = 1, 2, 3 }.
Also, the tensor is obtained from the components as

T = (T)_E = ( (T_ijk) )_E =def T_ijk e_i e_j e_k
4.3.2 Definition

In general, a tensor T of rank r is defined as a set of 3^r quantities, T_{i1 i2 ··· ir}, that upon an orthogonal change of coordinates, Eq. (4.11), transform as the outer product of r vectors⁴:

T′_{j1 j2 ··· jr} = a_{j1 i1} a_{j2 i2} ··· a_{jr ir} T_{i1 i2 ··· ir}.   (4.20)

Accordingly, a vector is a rank 1 tensor and a scalar is a rank 0 tensor. It is common to refer to both T and the set (T_{i1 i2 ··· ir}) as a tensor, although the latter quantities are strictly speaking only the components of the tensor with respect to some particular basis E. However, the distinction between a tensor and its components becomes unimportant, provided we keep in mind that T′_{j1 j2 ··· jr} in Eq. (4.20) are components of the same tensor, only with respect to a different basis E′. It is precisely the transformation rule, Eq. (4.20), that ensures that the two sets of components represent the same tensor.

For completeness, the inverse transformation is obtained by summing on the first indices of the transformation matrix elements instead of the second ones:

T_{i1 i2 ··· ir} = a_{j1 i1} a_{j2 i2} ··· a_{jr ir} T′_{j1 j2 ··· jr}.
⁴ The notation T_{i1 i2 ··· ir} may be confusing at first sight. However, i1, i2, ···, ir are simply r independent indices, each taking values in the range {1, 2, 3}. Choosing a second rank tensor as an example and comparing with previous notation, T_ij, it simply means that i1 = i, i2 = j. For any particular component of the tensor, say T_31, we would have with the old notation i = 3, j = 1 and in the new notation i1 = 3, i2 = 1.
4.3.3 Basic algebra

Tensors can be combined in various ways to form new tensors. Indeed, we have already seen several examples thereof⁵. The upshot of the rules below is that tensor operations that look “natural” are permissible, in the sense that if one starts with tensors the result will be a tensor.
Addition/subtraction

Tensors of the same rank may be added together. Thus

C_{i1 i2 ··· ir} = A_{i1 i2 ··· ir} + B_{i1 i2 ··· ir}

is a tensor of rank r if A and B are tensors of rank r. This follows directly from the linearity of Eq. (4.20),

C′_{j1 j2 ··· jr} = A′_{j1 j2 ··· jr} + B′_{j1 j2 ··· jr}
  = a_{j1 i1} a_{j2 i2} ··· a_{jr ir} A_{i1 i2 ··· ir} + a_{j1 i1} a_{j2 i2} ··· a_{jr ir} B_{i1 i2 ··· ir}
  = a_{j1 i1} a_{j2 i2} ··· a_{jr ir} (A_{i1 i2 ··· ir} + B_{i1 i2 ··· ir})
  = a_{j1 i1} a_{j2 i2} ··· a_{jr ir} C_{i1 i2 ··· ir}.

Consequently, the 3^r quantities (C_{i1 i2 ··· ir}) transform as the components of a rank r tensor when both (A_{i1 i2 ··· ir}) and (B_{i1 i2 ··· ir}) do (we made use of this in the second line of the demonstration above). It follows directly that the subtraction of two equally ranked tensors is also a tensor of the same rank, and that tensorial addition/subtraction is commutative and associative.
Outer product

The product of two tensors is a tensor whose rank is the sum of the ranks of the given tensors. This product, which involves ordinary multiplication of the components of the tensors, is called the outer product. It is the natural generalization of the outer product of two vectors (two tensors of rank 1) defined in section 3.2. For example,

C_{i1 i2 ··· ir j1 j2 ··· js} = A_{i1 i2 ··· ir} B_{j1 j2 ··· js}   (4.21)

is a tensor of rank r + s if A_{i1 i2 ··· ir} is a tensor of rank r and B_{j1 j2 ··· js} is a tensor of rank s. Note that this rule is consistent with the cartesian components of a dyad c = ab in the specific case where A = a and B = b are vectors. Also, a tensor may be multiplied with a scalar m = B (tensor of rank 0) according to the same rule,

C_{i1 i2 ··· ir} = m A_{i1 i2 ··· ir}

Note that not every tensor can be written as a product of two tensors of lower rank. We have already emphasized this point in the case of second rank tensors, which cannot in general be expressed as a single dyad. For this reason, division of tensors is not always possible.
⁵ The composition, T · S, of two second rank tensors, T and S, forms a new second rank tensor. The operation of a second rank tensor on a vector (rank 1 tensor) gives a new vector. The addition of two second rank tensors gives a new second rank tensor, etc.
Permutation

It is permitted to exchange or permute indices in a tensor and still remain with a tensor. For instance, if we define a new set of 3^r quantities C_{i1 i2 ··· ir} from a tensor A_{i1 i2 ··· ir} by permuting two arbitrarily chosen index numbers α and β > α:

C_{i1 i2 ··· i_{α−1} i_α i_{α+1} ··· i_{β−1} i_β i_{β+1} ··· ir} = A_{i1 i2 ··· i_{α−1} i_β i_{α+1} ··· i_{β−1} i_α i_{β+1} ··· ir},

this new set will also be a tensor. Its tensorial property follows from the symmetry among the a-factors in Eq. (4.20). Tensors obtained from permuting indices are called isomers. Tensors of rank less than two (i.e. scalars and vectors) have no isomers. A tensor of rank two has precisely one isomer, the transposed one, obtained by setting α = 1 and β = 2 in the above notation:

(Tᵗ)_{i1 i2} = T_{i2 i1}
Contraction

The most important rule in tensor algebra is the contraction rule, which states that if two indices in a tensor of rank r + 2 are put equal and summed over, then the result is again a tensor, with rank r. Because of the permutation rule discussed above we only have to demonstrate it for the first two indices. The contraction rule states that if A is a tensor of rank r + 2, then

B_{j1 j2 ··· jr} = A_{i i j1 j2 ··· jr}    (NB: A_{i i j1 j2 ··· jr} = Σ_{i=1}^{3} A_{i i j1 j2 ··· jr})

is also a tensor, of rank r. The proof follows:

B′_{j1 j2 ··· jr} = A′_{i i j1 j2 ··· jr}
  = a_{ik} a_{il} a_{j1 m1} a_{j2 m2} ··· a_{jr mr} A_{k l m1 m2 ··· mr}
  = δ_{kl} a_{j1 m1} a_{j2 m2} ··· a_{jr mr} A_{k l m1 m2 ··· mr}
  = a_{j1 m1} a_{j2 m2} ··· a_{jr mr} A_{k k m1 m2 ··· mr}
  = a_{j1 m1} a_{j2 m2} ··· a_{jr mr} B_{m1 m2 ··· mr}

Consequently, B_{j1 ··· jr} does transform as a tensor of rank r, as claimed.
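A one-line sketch (numpy assumed; the tensor is arbitrary) of the contraction rule lowering rank 3 to rank 1:

    import numpy as np

    A = np.random.rand(3, 3, 3)
    B = np.einsum('iij->j', A)   # B_j = A_iij, summed over the repeated index i
    print(B.shape)               # (3,)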
Inner product
By the process of forming the outer product of two tensors followed by a contraction, we obtain a new tensor called an inner product of the given tensors. If the ranks of the two given tensors are $r$ and $s$, respectively, then the rank of the new tensor will be $r + s - 2$. Its tensorial nature follows from the fact that an outer product of two tensors is a tensor and the contraction of two indices in a tensor gives a new tensor. In the previous chapters we have reserved the "dot" symbol for precisely this operation. For instance, the scalar product between two vectors, $a$ and $b$, is an inner product of two rank-1 tensors giving a rank-0 tensor (a scalar):

$$\begin{aligned}
[ab]_{ij} &= a_i b_j && \text{(outer product)} \\
a \cdot b &= a_i b_i && \text{(contraction of the outer product, setting } j = i\text{)}
\end{aligned}$$
A rank-2 tensor, $T$, operating on a vector $v$ is an inner product between a rank-2 and a rank-1 tensor, giving a rank-1 tensor:

$$\begin{aligned}
[Tv]_{ijk} &= T_{ij} v_k && \text{(outer product)} \\
[T \cdot v]_i &= T_{ij} v_j && \text{(contraction of the outer product, setting } k = j\text{)}
\end{aligned}$$

It is left as an exercise to see that the "dot" operation defined in chapter 2 for two second rank tensors, $T \cdot S$, is an inner product.

In general, any two indices in the outer product between two tensors can be contracted to define an inner product. For example,

$$C_{i_1 i_2 i_4 i_5 j_1 j_3} = A_{i_1 i_2 k i_4 i_5} B_{j_1 k j_3}$$

is a tensor of rank 6, obtained as an inner product between a tensor $A_{i_1 i_2 i_3 i_4 i_5}$ of rank 5 and a tensor $B_{j_1 j_2 j_3}$ of rank 3 by setting $i_3 = j_2$. Fortunately, tensors of rank higher than 4 are an oddity in the realm of physics.
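Both examples translate directly into einsum expressions; the following sketch (ours, with arbitrary random components) confirms the rank bookkeeping $r + s - 2$:

```python
import numpy as np

T = np.random.rand(3, 3)
v = np.random.rand(3)

# [T.v]_i = T_ij v_j : rank 2 + 1 - 2 = 1
assert np.allclose(np.einsum('ij,j->i', T, v), T @ v)

# Contracting i3 with j2 in the outer product of a rank-5 and a rank-3 tensor:
A = np.random.rand(3, 3, 3, 3, 3)
B = np.random.rand(3, 3, 3)
C = np.einsum('abkde,fkg->abdefg', A, B)   # C of rank 5 + 3 - 2 = 6
assert C.shape == (3,) * 6
```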
Summary
In summary, all tensor algebra of any rank basically boils down to these few operations: addition, outer product, permutation, contraction and inner product. Only one more operation is essential to know, namely differentiation.
4.3.4 Differentiation
Let us consider a tensor $A$ of rank $r$ which is a differentiable function of all of the elements of another tensor $B$ of rank $s$. The direct notation for this situation would be

$$A = A(B) \qquad (\operatorname{Rank}(A) = r,\ \operatorname{Rank}(B) = s).$$

In effect, it implies the existence of $3^r$ functions, $A_{i_1 \cdots i_r}$, each differentiable in each of its $3^s$ arguments $B = (B_{j_1 j_2 \cdots j_s})$,

$$A_{i_1 i_2 \cdots i_r} = A_{i_1 i_2 \cdots i_r}\big( (B_{j_1 j_2 \cdots j_s}) \big).$$

Then the partial derivatives

$$C_{i_1 i_2 \cdots i_r, j_1 j_2 \cdots j_s} = \frac{\partial A_{i_1 i_2 \cdots i_r}}{\partial B_{j_1 j_2 \cdots j_s}} \qquad (4.22)$$

form a tensor of rank $r + s$. As for ordinary derivatives of functions, the components $C_{i_1 i_2 \cdots i_r, j_1 j_2 \cdots j_s}$ will in general also be functions of $B$,

$$C_{i_1 i_2 \cdots i_r, j_1 j_2 \cdots j_s} = C_{i_1 i_2 \cdots i_r, j_1 j_2 \cdots j_s}\big( (B_{j_1 j_2 \cdots j_s}) \big).$$

The direct notation for taking partial derivatives with respect to a set of tensor components is

$$C(B) = \frac{\partial A}{\partial B}(B).$$
To demonstrate that $C$ is a tensor, and so to justify this direct notation in the first place, we have to look at the transformation properties of its elements. Indeed, we have

$$\begin{aligned}
C'_{i_1 i_2 \cdots i_r, j_1 j_2 \cdots j_s}(B') &= \frac{\partial A'_{i_1 i_2 \cdots i_r}}{\partial B'_{j_1 j_2 \cdots j_s}}(B') \\
&= \frac{\partial B_{l_1 l_2 \cdots l_s}}{\partial B'_{j_1 j_2 \cdots j_s}} \, \frac{\partial \left( a_{i_1 k_1} a_{i_2 k_2} \cdots a_{i_r k_r} A_{k_1 k_2 \cdots k_r} \right)}{\partial B_{l_1 l_2 \cdots l_s}}(B) \\
&= a_{i_1 k_1} a_{i_2 k_2} \cdots a_{i_r k_r} a_{j_1 l_1} a_{j_2 l_2} \cdots a_{j_s l_s} C_{k_1 k_2 \cdots k_r, l_1 l_2 \cdots l_s}(B)
\end{aligned} \qquad (4.23)$$

In the second step we have used the chain rule for differentiation. The last step follows from

$$B_{l_1 l_2 \cdots l_s} = a_{j_1 l_1} a_{j_2 l_2} \cdots a_{j_s l_s} B'_{j_1 j_2 \cdots j_s},$$

whence $\partial B_{l_1 \cdots l_s} / \partial B'_{j_1 \cdots j_s} = a_{j_1 l_1} \cdots a_{j_s l_s}$. It is essential that all components of $B$ are independent and can vary freely without constraints.
From this rule follows the quotient rule, which states that if the tensor $A_{i_1 i_2 \cdots i_r}$ is a linear function of the unconstrained tensor $B_{j_1 j_2 \cdots j_s}$ through the relation

$$A_{i_1 i_2 \cdots i_r} = C_{i_1 i_2 \cdots i_r j_1 j_2 \cdots j_s} B_{j_1 j_2 \cdots j_s} + D_{i_1 i_2 \cdots i_r},$$

then $C$ is a tensor of rank $r + s$ and consequently $D$ must also be a tensor of rank $r$.
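A concrete way to see Eq. (4.22) at work is to differentiate a simple tensor-valued function numerically. In the sketch below (our addition; the example function $A_{ij} = B_{ik} B_{kj}$ is an illustrative choice) the finite-difference derivative of a rank-2 function of a rank-2 argument is compared with the analytic rank-4 result $\partial A_{ij}/\partial B_{lm} = \delta_{il} B_{mj} + B_{il} \delta_{jm}$:

```python
import numpy as np

B = np.random.rand(3, 3)

def A(B):
    return B @ B                         # A_ij = B_ik B_kj, rank 2

# Finite-difference derivative C_{ij,lm} = dA_ij / dB_lm  (rank 2 + 2 = 4)
eps = 1e-6
C_num = np.empty((3, 3, 3, 3))
for l in range(3):
    for m in range(3):
        dB = np.zeros((3, 3))
        dB[l, m] = eps
        C_num[:, :, l, m] = (A(B + dB) - A(B - dB)) / (2 * eps)

# Analytic result: dA_ij/dB_lm = delta_il B_mj + B_il delta_jm
d = np.eye(3)
C_ana = np.einsum('il,mj->ijlm', d, B) + np.einsum('il,jm->ijlm', B, d)
assert np.allclose(C_num, C_ana, atol=1e-5)
```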
4.4 Reflection and pseudotensors

There are quantities, notably the Levi-Civita symbol, that do not obey the tensor transformation rule, Eq. (4.20), but acquire an extra minus sign under transformations involving a reflection. Such quantities are called pseudo-tensors, in contradistinction to ordinary or proper tensors. Consequently, a set of $3^r$ quantities, $P_{i_1 i_2 \cdots i_r}$, is called a pseudo-tensor if it transforms according to the rule

$$P'_{i_1 i_2 \cdots i_r} = \det(A)\, a_{i_1 j_1} a_{i_2 j_2} \cdots a_{i_r j_r} P_{j_1 j_2 \cdots j_r} \qquad (4.25)$$

upon the coordinate transformation, Eq. (4.11). For a pure reflection, $a_{ij} = -\delta_{ij}$ and $\det(A) = -1$, so

$$P'_{i_1 i_2 \cdots i_r} = -(-1)^r P_{i_1 i_2 \cdots i_r}.$$
By comparing Eq. (4.25) with Eq. (4.20) one observes that the difference between a tensor and a pseudo-tensor only appears if the transformation includes a reflection, $\det(A) = -1$. A proper tensor can be considered as a real geometrical object, independent of the coordinate system. A proper vector is an "arrow" in space, for which the components change sign upon a reflection of the axes of the coordinate system, cf. Eq. (4.24). Direct products of ordinary vectors are ordinary tensors, changing sign once for each index. A direct product involving an even number of pseudo-tensors makes an ordinary tensor, whereas one involving an odd number of pseudo-tensors leads to another pseudo-tensor.
4.4.1 Levi-Civita symbol

If one tentatively transforms the Levi-Civita symbol as an ordinary third rank tensor, the resulting quantity $\epsilon'_{ijk} = a_{il} a_{jm} a_{kn} \epsilon_{lmn}$ is totally antisymmetric in the indices $i$, $j$ and $k$,$^6$ and must therefore be proportional to $\epsilon_{ijk}$ itself. The constant of proportionality follows from setting $(i, j, k) = (1, 2, 3)$, where $a_{1l} a_{2m} a_{3n} \epsilon_{lmn} = \det(A)$, since $\epsilon_{123} = +1$. Thus the constant of proportionality is $\det(A)$, and $\epsilon'_{ijk} = \det(A) \epsilon_{ijk}$. If $\epsilon'_{ijk}$ is to be identical to $\epsilon_{ijk}$, we must multiply the transformation by an extra factor $\det(A)$ to account for the possibility that the transformation involves a reflection, where $\det(A) = -1$. The correct transformation law must therefore be

$$\epsilon'_{ijk} = \det(A)\, a_{il} a_{jm} a_{kn} \epsilon_{lmn},$$

whence the Levi-Civita symbol is a third rank pseudo-tensor. The transformation rule for pseudo-tensors leaves the Levi-Civita symbol invariant under all orthogonal transformations, $\epsilon'_{ijk} = \epsilon_{ijk}$.$^7$
The most important application of the Levi-Civita symbol is in the algebraic definition of the cross-product, $c = a \times b$, between two ordinary vectors:

$$c_i = [a \times b]_i = \epsilon_{ijk} a_j b_k.$$

Since the rhs. is a double inner product between a pseudo-tensor of rank three and two ordinary vectors, the lhs. will be a pseudo-vector. Indeed, the direction of $c$ depends on the handedness of the coordinate system, as previously mentioned. An equivalent manifestation of its pseudo-vectorial nature is that $c$ does not change its direction upon an active reflection.
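A small numerical sketch (ours; the $\epsilon$-array is built by hand) verifies both the einsum form of the cross product and its invariance under a pure reflection of $a$ and $b$:

```python
import numpy as np

# Levi-Civita symbol as an explicit array
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

a, b = np.random.rand(3), np.random.rand(3)

c = np.einsum('ijk,j,k->i', eps, a, b)   # c_i = eps_ijk a_j b_k
assert np.allclose(c, np.cross(a, b))

# Under an active reflection a -> -a, b -> -b the cross product does not change sign:
assert np.allclose(np.cross(-a, -b), c)
```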
6 For instance, $\epsilon'_{ikj} = a_{il} a_{km} a_{jn} \epsilon_{lmn} = a_{il} a_{kn} a_{jm} \epsilon_{lnm} = a_{il} a_{jm} a_{kn} \epsilon_{lnm} = -a_{il} a_{jm} a_{kn} \epsilon_{lmn} = -\epsilon'_{ijk}$. The second identity follows from the fact that $n$ and $m$ are bound indices and can therefore be renamed one to the other.
7 We should not be surprised by the fact that the components $\epsilon_{ijk}$ are unaffected by coordinate transformations, since the definition of the Levi-Civita symbol makes no distinction between the 1, 2 and 3 directions.
4.4.2 Manipulation of the Levi-Civita symbol
Two important relations for the Levi-Civita symbol are very useful for deriving the myriad of known identities between various vector products. The first is a formula that reduces the product of two Levi-Civita symbols to Kronecker deltas:

$$\epsilon_{ijk} \epsilon_{lmn} = \det \begin{pmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{pmatrix} \qquad (4.28)$$

$$\begin{aligned}
\epsilon_{ijk} \epsilon_{lmn} = {} & \delta_{il} \delta_{jm} \delta_{kn} + \delta_{im} \delta_{jn} \delta_{kl} + \delta_{in} \delta_{jl} \delta_{km} \\
& - \delta_{in} \delta_{jm} \delta_{kl} - \delta_{jn} \delta_{km} \delta_{il} - \delta_{kn} \delta_{im} \delta_{jl}
\end{aligned} \qquad (4.29)$$
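Identities (4.28) and (4.29), as well as the frequently used contracted form $\epsilon_{ijk}\epsilon_{imn} = \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}$, can be verified by brute force over all index values; a sketch (our addition):

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
d = np.eye(3)

# lhs of Eqs. (4.28)/(4.29): eps_ijk eps_lmn as a rank-6 array
lhs = np.einsum('ijk,lmn->ijklmn', eps, eps)

# rhs of Eq. (4.29): the six Kronecker-delta terms
rhs = (np.einsum('il,jm,kn->ijklmn', d, d, d) + np.einsum('im,jn,kl->ijklmn', d, d, d)
       + np.einsum('in,jl,km->ijklmn', d, d, d) - np.einsum('in,jm,kl->ijklmn', d, d, d)
       - np.einsum('il,jn,km->ijklmn', d, d, d) - np.einsum('im,jl,kn->ijklmn', d, d, d))
assert np.allclose(lhs, rhs)

# Contracted form: eps_ijk eps_imn = d_jm d_kn - d_jn d_km
assert np.allclose(np.einsum('ijk,imn->jkmn', eps, eps),
                   np.einsum('jm,kn->jkmn', d, d) - np.einsum('jn,km->jkmn', d, d))
```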
Another important relation for the $\epsilon$-symbol follows from the observation that it is impossible to construct a non-trivial totally antisymmetric symbol with more than three indices. The reason is that the antisymmetry implies that any component with two equal indices must vanish. Hence a non-vanishing component must have all indices different, and since there are only three possible values for an index this is impossible. Consequently, all components of a totally antisymmetric symbol with four indices must vanish. From this follows the rule

$$\det(N^t) = \det(N).$$
To prove Eq. (4.28), set

$$M = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix}, \qquad N = \begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{pmatrix};$$

then the matrix product $M \cdot N^t$ becomes

$$M \cdot N^t = \begin{pmatrix} a \cdot x & a \cdot y & a \cdot z \\ b \cdot x & b \cdot y & b \cdot z \\ c \cdot x & c \cdot y & c \cdot z \end{pmatrix}, \qquad (4.32)$$
where we, for pedagogical reasons, have reinserted the involved sums, denoting summation indices with a prime. The above expression equals the lhs of Eq. (4.28). For the rhs of Eq. (4.33) we also recover the rhs of Eq. (4.28), by noting that each scalar product indeed becomes a Kronecker delta, i.e. $a \cdot x = e_i \cdot e_l = \delta_{il}$ etc. This proves Eq. (4.28).
4.5 Tensor fields of any rank

A tensor field of rank $r$ attaches a tensor of rank $r$ to each point in space. Under the Cartesian coordinate transformation, Eq. (4.11), its components transform according to Eq. (4.20) point by point, and a pseudo-tensor field correspondingly acquires the extra factor $\det(A)$:

$$P'_{i_1 i_2 \cdots i_r}(x') = \det(A)\, a_{i_1 j_1} a_{i_2 j_2} \cdots a_{i_r j_r} P_{j_1 j_2 \cdots j_r}(x) \qquad (4.36)$$
The same transformation rules also hold true for a transformation between any two orthonormal bases, for instance from a rectangular to a spherical basis. One must bear in mind, however, that in this case the transformation matrix elements, $a_{ij}$, will themselves be functions of the position, $a_{ij} = a_{ij}(x)$.
A tensor field is just a particular realization of the more general case considered in section 4.3.4, with a tensor $A$ being a function of another tensor $B$, $A = A(B)$. Here, $B = r$. Therefore, we can apply Eq. (4.22) to demonstrate that the derivatives of a tensor field form another tensor field, of one rank higher. Although $r$ is an improper vector, cf. section 2.3, $C = \frac{\partial T}{\partial r}$ will still be a proper tensor. The reason is that in deriving the tensorial nature of $C$, Eq. (4.23), we have used the tensorial nature of $B$ only to show that

$$\frac{\partial B_{l_1 l_2 \cdots l_s}}{\partial B'_{j_1 j_2 \cdots j_s}} = a_{j_1 l_1} a_{j_2 l_2} \cdots a_{j_s l_s}.$$

However, this also holds true for any improper tensor in Cartesian coordinates. Specifically, for the position vector we have $[r]_{R,l} = x_l$, $[r]_{R',j} = x'_j$ and

$$\frac{\partial x_l}{\partial x'_j} = a_{jl}.$$
Consequently, if $T$ is a tensor field of rank $r$, $T(r) = (T_{i_1 i_2 \cdots i_r}(r))_R$, then the operation

$$\frac{\partial T}{\partial r} = \nabla T$$

gives a tensor field of rank $r + 1$ with the components

$$[\nabla T]_{i_1 i_2 \cdots i_r, j}(r) = \partial_j T_{i_1 i_2 \cdots i_r}(r).$$

Specifically, $\nabla \phi$ is a vector field when $T = \phi$ is a scalar field, cf. section 2.6.4, and $\nabla a$ is a rank-2 tensor field when $T = a$ is a vector field, cf. section 3.5.$^8$
Divergence
The fact that spatial derivatives of a tensor always lead to a new tensor of one rank higher makes it easy to deduce the type of object resulting from various vector operations. For instance, if $a(x)$ is a vector field then $\nabla a$ is a tensor field of rank two. By contracting the two indices we therefore obtain a scalar, cf. section 4.3.2:

$$[\nabla a]_{i,i} = \partial_i a_i, \qquad \text{a scalar quantity.}$$

Thus, the divergence of a vector field is a scalar.
For a rank-2 tensor field, $T$, $\nabla T$ is a rank-3 tensor field, and we can perform two different contractions, $a_j = \partial_i T_{ij}$ and $b_i = \partial_j T_{ij}$, yielding the components of two different vectors $a$ and $b$. Only if $T$ is symmetric will $a = b$. The direct notations for the two operations are $\nabla \cdot T$ and $\nabla \cdot T^t$, respectively.
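For a field with a constant gradient the derivatives are exact, which makes for an easy numerical check. The sketch below (ours; the constant derivative array $G_{ijk} = \partial_k T_{ij}$ is an illustrative choice) compares the two contractions of $\nabla T$ and confirms they agree only for symmetric $T$:

```python
import numpy as np

# For a rank-2 field T with constant derivatives G[i,j,k] = d_k T_ij,
# the two possible contractions generally differ:
G = np.random.rand(3, 3, 3)
a_vec = np.einsum('iji->j', G)          # a_j = d_i T_ij  (div T)
b_vec = np.einsum('ijj->i', G)          # b_i = d_j T_ij  (div T^t)
assert not np.allclose(a_vec, b_vec)    # true with probability one for random G

# They coincide when T is symmetric (G symmetric in its first two indices):
Gs = (G + G.transpose(1, 0, 2)) / 2
assert np.allclose(np.einsum('iji->j', Gs), np.einsum('ijj->i', Gs))
```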
8 Note that the convention in tensor notation is to place the index of the spatial derivative at the end of the tensor components, in conflict with the natural convention arising from the direct notation, which for, say, a vector $a$ reads $[\nabla a]_{ij} = \nabla_i a_j$. However, in tensor notation one always explicitly writes the index to be summed over, so no ambiguities arise in practice.
Curl
The tensor notation for the curl operator, $\nabla \times a$, acting on a vector field $a$ involves the Levi-Civita symbol:

$$[\nabla \times a]_i = \epsilon_{ijk} \partial_j a_k.$$

If $a$ is a polar vector then the rhs. is a double inner product between a pseudo-tensor of rank 3 and two polar vectors, yielding an axial (or pseudo-) vector field.
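As a final sketch (ours; it uses the linear field $a(x) = Mx$, so that $\partial_j a_k = M_{kj}$ exactly), the einsum form of the curl can be checked component by component:

```python
import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

# For a(x) = M x the derivatives are constant: d_j a_k = M_kj.
M = np.random.rand(3, 3)
curl = np.einsum('ijk,kj->i', eps, M)          # [curl a]_i = eps_ijk d_j a_k

# Cross-check the first component: d_1 a_2 - d_2 a_1 = M[2,1] - M[1,2]
assert np.isclose(curl[0], M[2, 1] - M[1, 2])
```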
Chapter 5