Special Relativity in The Affine Spacetime
Michael Yu
June 25, 2012
This article assumes knowledge of abstract mathematics. Properties of, and
terminology relating to, sets, abstract vector spaces, and the real numbers
will be used freely. The most important mathematical notation which may be
unfamiliar includes $\forall$, which is a symbol for "for all", and $\exists$,
which is a symbol for "there exists".
1.1
Let us first characterize the physical universe as a four dimensional space of some
kind, with the three spatial directions represented in three of the dimensions
and time in the remaining dimension. Experimentally, we find that the laws
of physics are uninfluenced by location or time (within human reach), so let us
define the spacetime continuum as an affine space over a spacetime vector space.
An affine space is just a vector space where only the differences between elements
act as vectors. To quote from Wikipedia, an affine space is what is left of a
vector space after you've forgotten which point is the origin. We proceed to
define an affine space.
Definition 1.1.1. An affine space is a set $A$, together with a vector space $V$
and an addition operator defined as the following map
$$V \times A \to A : (v, a) \mapsto v + a$$
which satisfies three properties:
Left identity
$\forall a \in A$, $0 + a = a$
Associativity
$\forall v, w \in V$ and $\forall a \in A$, $(v + w) + a = v + (w + a)$
Uniqueness
$\forall a \in A$, $V \to A : v \mapsto v + a$ is a bijection
(Copied here from Wikipedia for ease of reference.)
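As a minimal sketch of Definition 1.1.1, assume that both the vector space V and the point set A are modelled by 4-tuples of real numbers. This is a choice made purely for illustration, and the names `translate`, `add_vectors`, and `difference` are hypothetical rather than part of the definition. Under that assumption the three properties can be checked on sample values:

```python
from typing import Tuple

# Illustrative assumption: points of A and vectors of V are both stored as
# 4-tuples of floats; only `translate` (v + a) is part of the affine structure.
Vector = Tuple[float, float, float, float]
Point = Tuple[float, float, float, float]

ZERO: Vector = (0.0, 0.0, 0.0, 0.0)


def add_vectors(v: Vector, w: Vector) -> Vector:
    """Vector addition in V."""
    return tuple(vi + wi for vi, wi in zip(v, w))


def translate(v: Vector, a: Point) -> Point:
    """The map V x A -> A, (v, a) |-> v + a."""
    return tuple(vi + ai for vi, ai in zip(v, a))


def difference(p: Point, q: Point) -> Vector:
    """The unique vector r with r + q = p (it exists by the uniqueness property)."""
    return tuple(pi - qi for pi, qi in zip(p, q))


a: Point = (1.0, 2.0, 3.0, 4.0)
v: Vector = (0.5, 0.0, -1.0, 2.0)
w: Vector = (1.5, 1.0, 0.0, -2.0)

# Left identity: 0 + a = a.
left_identity_holds = translate(ZERO, a) == a
# Associativity: (v + w) + a = v + (w + a).
associativity_holds = translate(add_vectors(v, w), a) == translate(v, translate(w, a))
# Uniqueness: the point p is reached from a by the vector p - a.
p: Point = (0.0, 0.0, 0.0, 0.0)
uniqueness_holds = translate(difference(p, a), a) == p
```

The sample values are exactly representable as floats, so the equality checks are not disturbed by rounding. A check on a few values illustrates the axioms but does not prove them.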
1.2
As effective as an affine space is as a model for spacetime, it is far more useful and
efficient to work with the underlying vector space. In physics we almost always
measure or calculate time intervals and distances for direct use, as
opposed to absolute times and positions. Therefore we shall focus our attention on the
spacetime vector space, which I will call the vector spacetime from here on.
For convenience, we will just let the field of the vector space be the real numbers.
We frequently add together physical quantities with the same units and
multiply these quantities by real numbers. Physical quantities that can
be added and stretched like that form a one dimensional vector space. We
define a physical vector quantity (or just vector quantity for short) to be any
physically motivated vector space like this, and we shall call any element of any
vector quantity a vector value. Time interval is such a vector quantity
and is also a subspace of vector spacetime. We shall denote this time vector space
as $T$. The spatial vector subspace, which we shall denote as $X$, makes up
the rest of vector spacetime; it has three dimensions, and as such can be expressed as
the direct sum of three spatial vector quantities. In the case of space, we see an
important distinction between the physics used in applications and the mathematical
physical model we are using. Even though distances have the same units
of measurement (e.g. metres), they do not have the property that would make
them into one physical quantity. In particular, one metre in the x direction added
to one metre in the y direction does not give you two metres in some direction.
We also identify the 0 of every vector quantity as one unique element. This
is to capture the idea that 0 distance, 0 time interval, 0 force, or 0 anything else
might as well be the same thing, since they all represent a nil physical quantity.
Extending this idea, vector values with different units can actually be
added, just like how a metre in the x direction can be added to a metre in the y
direction. The added vector values will just happen to be linearly independent.
Of course, most of the time the resulting vector value is quite meaningless,
but if and when we need to do such a thing the mathematics for it is ready.
Nonetheless, note that any linear combination of vector values is a vector value
in some vector quantity. Let us denote the sum (in the sense of adding vector
spaces) of all vector quantities, which happens to be the set of all vector values,
as V.
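One way to picture the sum of all vector quantities is as formal linear combinations of labelled basis directions. The sketch below assumes a hypothetical representation: a dictionary from a label such as "metre_x" or "second" to a real coefficient. Neither the labels nor the helper names `add` and `scale` come from the article.

```python
from typing import Dict

# Hypothetical representation: a vector value is a dict mapping a basis label
# (one per one-dimensional vector quantity) to its real coefficient.
VectorValue = Dict[str, float]


def add(u: VectorValue, w: VectorValue) -> VectorValue:
    """Add two vector values; components with different labels stay independent."""
    total = dict(u)
    for label, coeff in w.items():
        total[label] = total.get(label, 0.0) + coeff
    # Drop zero components so that every zero is the same element: the empty dict.
    return {label: c for label, c in total.items() if c != 0.0}


def scale(r: float, u: VectorValue) -> VectorValue:
    """Multiply a vector value by a real number."""
    return add({}, {label: r * c for label, c in u.items()})


one_metre_x = {"metre_x": 1.0}
one_metre_y = {"metre_y": 1.0}
two_seconds = {"second": 2.0}

diagonal = add(one_metre_x, one_metre_y)   # not "two metres" in any direction
mixed = add(one_metre_x, two_seconds)      # allowed, though rarely meaningful
zero_distance = add(one_metre_x, scale(-1.0, one_metre_x))
zero_time = scale(0.0, two_seconds)
```

In this representation `zero_distance` and `zero_time` are the same empty dictionary, mirroring the identification of every 0 as one element, while `diagonal` keeps its two components separate.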
Just as often in physics we need to take the product of two vector values to
get another vector value. Let us agree to call the product developed in
this article the physical product, to avoid confusion with all the other products
in physics, and furthermore assume that all products of two elements of V that
appear later denote this physical product, unless otherwise stated. Now that
we have the two most basic operators that make up physical relationships and
laws, we shall make a wish list of properties that we would like them to have so
that they can best model the existing mathematics in physics.
Addition axioms
1.3
Let us now confirm that the axioms of the physical product are physically sound.
We begin by using some familiar physical quantities to derive the structure of the
physical product on the vector quantities we are currently concerned about, time
and three dimensional space. One example of a telltale physical relation is from
established physics:
$$E_k = \frac{1}{2}m|v|^2 \Rightarrow E_k = \frac{1}{2}m\frac{|s|^2}{t^2} \Rightarrow 2E_k m^{-1} t^2 = |s|^2.$$
The left side of the final equation should form a vector value, because we can
add together all the values that take on that form and they are scalar multiples of
each other. This implies that even though the spatial vector value on the right
side can be any vector from three dimensional space, its norm squared is always
an element of one particular vector quantity. However, to talk about $|s|$ as
a physical quantity would require the construction of a directionless distance
vector quantity, which is moreover not a good model of reality, since in physics
we never see expressions like $|s_1| + |s_2|$. Therefore, motivated by the fact that for a
general vector $v$ representing a physical quantity, $|v|^2 = v \cdot v$ where the product
on the right side is the dot product, we define the physical product to have the
property that $\forall x, y \in X \setminus \{0\}$, $\exists a \in \mathbb{R}^+$ such that $ax^2 = y^2$. The scalar coefficient
is restricted to be positive because when mass and time interval are positive,
In this section, let us agree that $t$ will denote a time interval value, $x$ will
denote a spatial displacement value, $v$ will denote a purely imaginary quaternion
representing $xt^{-1}$ for some $x$ and $t$, and $s$ will denote an element in S.
2.1
The notion of relativity is best explored by looking at the points of view of
observers moving at relative speeds to each other. We proceed to define inertial
reference frames.
Definition 2.1.1. An inertial reference frame is a way of defining the universe
as an affine spacetime such that objects moving at constant velocity will
keep moving at that velocity unless a force acts on them. The postulates of special
relativity give more properties of inertial reference frames.
Principle of relativity The laws of physics, including physical constants, hold
in every inertial reference frame and are the same as in any other inertial
reference frame. Which laws of physics are included here will be introduced
as they are needed. Furthermore, all the laws of physics are the same no
matter where and when in the universe they are tested. Also, no law of
physics should change depending on the intrinsic labelling of directions.
Constancy of the speed of light There is a speed such that, for all reference
frames, anything travelling at that speed in one frame also travels at that
speed in any other reference frame. Experimentally, this speed is the speed of light.
We want to turn the postulates of relativity into mathematical properties
that spacetime must satisfy. First, we shall work with inertial frame shifts, which
are transformations mapping all the affine spacetime points in one inertial frame
to another. Since two affine spaces over vector spaces with the same dimension
(in our case four) are isomorphic, inertial frame shifts are merely functions from
the affine spacetime to itself. Let us be mindful of which subspace is the time
dimension in the affine spacetime before and after, since time is distinct from
the spatial dimensions. As with our analysis of physical vector quantities, we
shall see that it is more convenient to work with the vector spacetime. Firstly
we define $p - q$, with $p$ and $q$ being elements of an affine space, to be the unique
vector $r$ such that $r + q = p$.
Theorem 2.1.2. Any function $f$ mapping an affine space $A$ to another affine
space $Z$ can be completely described by a function $g$ defined as $g(a) = f(a)$ for
some constant $a \in A$ and $g(b) = h(b - a) + g(a)$, where $h$ is a function from the
vector space underlying $A$ to the vector space underlying $Z$.
Proof. Choose any $a \in A$ and fix $g(a) = f(a)$. Now define $h(v) = f(v + a) - f(a)$
where $v$ is a vector underlying $A$. Then for all $b \in A$,
$g(b) = f((b - a) + a) - f(a) + g(a) = f(b) - f(a) + f(a) = f(b)$.
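The construction in the proof can be sketched numerically. The example assumes a two dimensional affine space modelled by pairs of floats and an arbitrarily chosen map `f`, both picked here only for illustration. The names `f`, `h`, and `g` mirror the theorem, while `plus`, `minus`, and `decompose` are hypothetical helper names.

```python
from typing import Callable, Tuple

Pair = Tuple[float, float]


def plus(v: Pair, a: Pair) -> Pair:
    """v + a: translate the point a by the vector v."""
    return (v[0] + a[0], v[1] + a[1])


def minus(p: Pair, q: Pair) -> Pair:
    """p - q: the unique vector r with r + q = p."""
    return (p[0] - q[0], p[1] - q[1])


def f(point: Pair) -> Pair:
    """An arbitrary example map between affine spaces (not assumed linear)."""
    x, y = point
    return (2.0 * x + y * y + 3.0, x - 4.0 * y - 1.0)


def decompose(func: Callable[[Pair], Pair], a: Pair):
    """Return (h, g) as in the proof:
    h(v) = func(v + a) - func(a) and g(b) = h(b - a) + func(a)."""
    fa = func(a)

    def h(v: Pair) -> Pair:
        return minus(func(plus(v, a)), fa)

    def g(b: Pair) -> Pair:
        return plus(h(minus(b, a)), fa)

    return h, g


a = (1.0, 2.0)
h, g = decompose(f, a)
sample_points = [(0.0, 0.0), (1.0, 2.0), (-3.0, 5.0), (4.0, -2.0)]
# g should reproduce f at every point, whatever base point a was chosen.
g_matches_f = all(g(b) == f(b) for b in sample_points)
```

The sample coordinates are small integers, so the floating point arithmetic is exact and `g(b) == f(b)` can be compared directly. Note that `h` depends on the chosen base point `a` for a general `f`.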
Since anyone concerned with two different reference frames can just agree on
some common point of space and time, we only need to investigate inertial frame
shifts as mappings from vector spacetime to itself. Let us agree to use frame
shift to mean the mapping from vector spacetime to itself that can be used to
completely describe an inertial frame shift. Let us use $F$ to denote any frame
shift and let us use $t' + x'$ to denote $F(t + x)$. First we note that frame shifts are
invertible, and therefore bijective, since we are always allowed to frame
shift back from the new inertial reference frame to the old one. By the postulates of
special relativity, the spacetime location does not affect physical laws, which
include laws governing frame shifts, so, using the same notation as in the proof
of 2.1.2, we must have that no matter what $a$ we choose, $h$ is the same mapping.
Then with $v, w \in T \oplus X$,
$$\begin{aligned}
h(v) + h(w) &= (f(v + a) - f(a)) + (f(w + a) - f(a)) \\
&= (f(v + (w + a)) - f(w + a)) + (f(w + a) - f(a)) \\
&= f((v + w) + a) - f(a) \\
&= h(v + w)
\end{aligned}$$
This shows that frame shifts are additive. Now consider $t$ and $F(t)$. By the
additivity of frame shifts, $F(nt) = nF(t)$ for $n \in \mathbb{N}$. Suppose $u = mt$, $m \in \mathbb{N}$.
Then
$$F(u) = mF(t) \Rightarrow \frac{1}{m}F(u) = F\left(\frac{1}{m}u\right)$$
So therefore $F(rt) = rF(t)$ for rational $r$. Honestly this has taken me so, so, so,
so, so, much more time than I had anticipated, and it's 5:26 AM, so I'm just
going to get straight to the point and summarize everything. Just assume that
frame shifts are homogeneous of degree one over the real numbers when the
vector value is time only, and use Newton's first law to derive that frame shifts
are homogeneous for any argument. Thus frame shifts are just linear operators
on $T \oplus X$.
To formalize the constancy of the speed of light, we set
$t^2 + x^2 = 0 \Rightarrow t'^2 + x'^2 = 0$ (recall that $t^2$ and $x^2$ are defined to
always have opposite signs). From the linearity of frame shifts, I proved the
stronger statement $t^2 + x^2 = t'^2 + x'^2$.
The spacetime interval $t^2 + x^2$ is therefore invariant under frame shifts. From
this, and the fact that direction does not affect physical laws, I derived the time
dilation formula. If $F(t) = t' + vt'$, then
$$t' = \frac{t}{\sqrt{1 - |v^2|}}$$
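A small numerical sketch of the time dilation formula, in units where the speed of light is 1 and with a single spatial dimension. It assumes the article's sign convention, under which a spatial value squares to a negative number, so that $t^2 + x^2$ reads $t^2 - |x|^2$ in real coordinates. The function names `dilated_time` and `interval` are hypothetical.

```python
import math


def dilated_time(t: float, speed: float) -> float:
    """Return t' = t / sqrt(1 - |v|^2) for a speed |v| < 1 (units with c = 1)."""
    if not 0.0 <= speed < 1.0:
        raise ValueError("speed must satisfy 0 <= speed < 1 in units where c = 1")
    return t / math.sqrt(1.0 - speed * speed)


def interval(t: float, x: float) -> float:
    """Spacetime interval t^2 + x^2 with x^2 = -|x|^2, i.e. t^2 - |x|^2.

    Here x is the real magnitude of the spatial displacement (assumption:
    one spatial dimension, so direction is not tracked).
    """
    return t * t - x * x


t = 1.0        # pure time interval in the original frame
speed = 0.6    # |v| as a fraction of the speed of light
t_new = dilated_time(t, speed)   # t' in F(t) = t' + v t'
x_new = speed * t_new            # magnitude of the spatial part v t'

# The interval of F(t) should agree with the interval of t itself.
interval_preserved = math.isclose(interval(t_new, x_new), interval(t, 0.0))
```

With a speed of 0.6 the factor $1/\sqrt{1 - 0.36}$ is 1.25, so the time component grows by a quarter. The spatial part $vt'$ offsets this growth so that the interval stays the same.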
The next steps are to derive the complete Lorentz transformations and then use
F = ma to derive relativistic mass.