1.2 The Lorentz Transformation

The Lorentz transformation describes how measurements of the position and time of events change from one observer to the next. It includes Lorentz-Fitzgerald contraction and time dilation as special cases.

1.2.1 The transformation formulae

This subsection explains how the position and time coordinates of events differ from one observer to the next.

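Consider two observers A and B that are in motion relative to each other with a relative speed $V$.

As the left side of figure 1.2 shows, observer A can believe herself to be at rest and see observer B moving away from her at speed $V$. Observer B can just as validly take himself to be at rest and see observer A moving away from him at the same speed.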

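It will further be assumed that both observers use coordinate systems with themselves at the origin to describe the locations and times of events. In addition, they both take their $x$-axes along the line of their relative motion, align their $y$- and $z$-axes, and take their times to be zero at the instant that they meet.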

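In that case the Lorentz transformation says that the relation between positions and times of events as perceived by the two observers is, {D.4}:

$$ct_B = \frac{ct_A - (V/c)\,x_A}{\sqrt{1-(V/c)^2}} \qquad x_B = \frac{x_A - (V/c)\,ct_A}{\sqrt{1-(V/c)^2}} \qquad y_B = y_A \qquad z_B = z_A \tag{1.6}$$

Here $V$ is the relative speed of the observers and $c$ is the speed of light. To get the inverse formulae, from B to A, swap A and B and replace $V$ by $-V$.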

The assumptions made are that A and B are at the origins of their coordinate systems, that their spatial coordinate systems are aligned, and that their relative motion is along the $x$-axes.
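As a rough numerical illustration of the transformation formulae, the following Python sketch works with $ct$ instead of $t$ and with the ratio $V/c$ instead of $V$. The function name `lorentz_transform` and the sample numbers are made up for this illustration; they are not part of the derivation.

```python
import math

def lorentz_transform(ct_a, x_a, y_a, z_a, beta):
    """Map an event from observer A's coordinates to observer B's.

    beta is V/c, with B moving along the shared x-axis relative to A.
    (Hypothetical helper; the name and signature are illustrative only.)
    """
    if not -1.0 < beta < 1.0:
        raise ValueError("V/c must lie strictly between -1 and 1")
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct_b = gamma * (ct_a - beta * x_a)
    x_b = gamma * (x_a - beta * ct_a)
    return ct_b, x_b, y_a, z_a

# An event at B's own position: x_A = V t_A, here with V/c = 0.6.
event_a = (1.0, 0.6, 0.0, 0.0)
event_b = lorentz_transform(*event_a, beta=0.6)

# Transforming back uses the same formulae with V replaced by -V.
round_trip = lorentz_transform(*event_b, beta=-0.6)
```

For these numbers, B should find the event at his own origin, and the round trip should reproduce the coordinates that A started with.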

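Time dilation is one special case of the Lorentz transformation. Assume that two events 1 and 2 happen at the same location in system A. Then the Lorentz transformation formula for the time gives

$$t_{2,B} - t_{1,B} = \frac{t_{2,A} - t_{1,A}}{\sqrt{1-(V/c)^2}}$$

where $V$ is the relative speed of the observers and $c$ the speed of light.

So observer B finds that the time difference between the events is larger. The same is of course true vice versa; just use the inverse formulae.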

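Lorentz-Fitzgerald contraction is another special case of the Lorentz transformation. Assume that two stationary locations in system B are apart by a distance $x_{2,B} - x_{1,B}$ in the direction of relative motion. The Lorentz transformation formula for the position then says how far these points are apart in system A at any given time $t_A$:

$$x_{2,B} - x_{1,B} = \frac{x_{2,A} - x_{1,A}}{\sqrt{1-(V/c)^2}}$$

Taking the square root to the other side gives the contraction: $x_{2,A} - x_{1,A} = (x_{2,B} - x_{1,B})\sqrt{1-(V/c)^2}$.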
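The two special cases can be tried out numerically. The sketch below is only an illustration; the function names and the choice $V/c = 0.8$ are assumptions made here, not taken from the text.

```python
import math

def dilated_time(dt_rest, beta):
    """Time difference seen by an observer moving at beta = V/c relative
    to the frame in which both events happen at the same location."""
    return dt_rest / math.sqrt(1.0 - beta**2)

def contracted_length(length_rest, beta):
    """Length along the direction of motion seen by an observer moving at
    beta = V/c relative to the frame in which the end points are at rest."""
    return length_rest * math.sqrt(1.0 - beta**2)

# Two events one time unit apart at a fixed location in system A.
dt_b = dilated_time(1.0, beta=0.8)

# Two points one length unit apart and at rest in system B.
length_a = contracted_length(1.0, beta=0.8)
```

With $V/c = 0.8$ the square root equals 0.6, so the time difference grows by a factor 1/0.6 while the length shrinks by a factor 0.6.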

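As a result of the Lorentz transformation, measured velocities are related as

$$v_{x,B} = \frac{v_{x,A} - V}{1 - (V/c^2)\,v_{x,A}} \qquad v_{y,B} = \frac{v_{y,A}\sqrt{1-(V/c)^2}}{1 - (V/c^2)\,v_{x,A}} \qquad v_{z,B} = \frac{v_{z,A}\sqrt{1-(V/c)^2}}{1 - (V/c^2)\,v_{x,A}}$$

Note that $v_x$, $v_y$, and $v_z$ refer here to the perceived velocity components of some moving object; they are not components of the relative velocity $V$ between the coordinate systems.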
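A quick way to get a feel for the velocity relations is to feed them a few numbers. The following sketch assumes units with $c = 1$ by default; the helper name `transform_velocity` is hypothetical.

```python
import math

def transform_velocity(v_a, V, c=1.0):
    """Velocity components seen by B, given those seen by A.

    v_a is (vx, vy, vz) as perceived by A; V is the speed of B relative
    to A along the x-axis. Units are arbitrary but must match c.
    """
    vx, vy, vz = v_a
    denom = 1.0 - V * vx / c**2
    root = math.sqrt(1.0 - (V / c)**2)
    return ((vx - V) / denom, vy * root / denom, vz * root / denom)

# Light moving along x keeps speed c for observer B.
light_b = transform_velocity((1.0, 0.0, 0.0), V=0.5)

# Light moving along y for A: B sees it tilted, but still at speed c.
tilted_b = transform_velocity((0.0, 1.0, 0.0), V=0.6)

# An object moving at 0.5 c toward negative x, seen from B moving at +0.5 c.
object_b = transform_velocity((-0.5, 0.0, 0.0), V=0.5)
```

The last case shows that two speeds of half the speed of light do not add up to the speed of light; the relative speed comes out as 0.8 c.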

1.2.2 Proper time and distance

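In classical Newtonian mechanics, time is absolute. All observers agree about the difference in time $\Delta t$ between any two events. The time difference is an invariant; it is the same for all observers.

All observers, regardless of how their spatial coordinate systems are oriented, also agree over the distance $|\Delta\vec r|$ between two events that occur at the same time.

Here the distance between any two points 1 and 2 is found as

$$|\Delta\vec r| \equiv \sqrt{(\Delta\vec r)\cdot(\Delta\vec r)} = \sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2} \qquad \Delta\vec r \equiv \vec r_2 - \vec r_1$$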

The fact that the distance may be expressed as the square root of the sum of the squares of the components is known as the “Pythagorean theorem.”

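Relativity messes all these things up big time. As time dilation shows, the time between events now depends on who is doing the observing. And as Lorentz-Fitzgerald contraction shows, distances now depend on who is doing the observing. For example, consider a moving ticking clock. Not only do different observers disagree over the distance $|\Delta\vec r|$ traveled between ticks, as they would do nonrelativistically, but they also disagree about the time difference $\Delta t$ between ticks, which they would not do nonrelativistically.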

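However, there is one thing that all observers can agree on. They do agree on how much time between ticks an observer moving along with the clock would measure. That time difference is called the “proper time” difference. (The word proper is a wrongly translated French “propre,” which here means “own.” So proper time really means the clock’s own time.) The time difference $\Delta t$ that an observer actually perceives is longer than the proper time difference $\Delta t_0$ due to time dilation:

$$\Delta t = \frac{\Delta t_0}{\sqrt{1-(v/c)^2}}$$

Here $v$ is the velocity of the clock as perceived by the observer.

To clean this up, take the square root to the other side and write $v$ as the distance $|\Delta\vec r|$ traveled by the clock divided by $\Delta t$. That gives the proper time difference between two events, like the ticks of the clock here, as

$$\Delta t_0 = \Delta t \sqrt{1 - \frac{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2}{(c\Delta t)^2}}$$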

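Note however that the proper time difference is imaginary if the quantity under the square root is negative. For example, if an observer perceives two events as happening simultaneously at two different locations, then the proper time difference between those two events is imaginary. To avoid dealing with complex numbers, it is then more convenient to define the “proper distance” $\Delta s$ between two events as

$$\Delta s = \sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - (c\Delta t)^2}$$

This is the ordinary distance between the events if they occur at the same time, $\Delta t = 0$.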

All observers agree about the values of the proper time difference $\Delta t_0$ and the proper distance $\Delta s$ for any two given events.
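The agreement between observers can be checked numerically for a ticking clock. The sketch below assumes a clock moving at 0.6 times the speed of light and works with $c\Delta t$ and $c\Delta t_0$ so that all quantities are lengths; the helper names are hypothetical.

```python
import cmath
import math

def boost(c_dt, dx, dy, dz, beta):
    """Differences between two events as seen by an observer moving at
    beta = V/c along the x-axis (the Lorentz transformation applied to
    coordinate differences)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (c_dt - beta * dx), gamma * (dx - beta * c_dt), dy, dz

def proper_time_c(c_dt, dx, dy, dz):
    """c times the proper time difference, returned as a complex number
    so that an imaginary value does not raise an error."""
    return cmath.sqrt(c_dt**2 - dx**2 - dy**2 - dz**2)

# Two ticks as seen by an observer for whom the clock moves at 0.6 c.
ticks = (5.0, 3.0, 0.0, 0.0)

# The same two ticks as seen by an observer moving along with the clock.
ticks_own = boost(*ticks, beta=0.6)

# Two simultaneous events at different locations: imaginary proper time.
simultaneous = proper_time_c(0.0, 2.0, 0.0, 0.0)
```

Both observers should find the same value for `proper_time_c`, and for the co-moving observer it is simply the time between the ticks, since the clock does not move.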

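Physicists define the square of the proper distance to be the “space-time interval.”

If the interval, defined as $(\Delta s)^2$, is positive, then the proper distance $\Delta s$ between the two events is real. Such an interval is called “space-like.” If the interval is negative, then the proper distance is imaginary, and it is the proper time difference between the events that is real. Such an interval is called “time-like.”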

If the proper time difference is real, the earlier event can affect, or even cause, the later event. If the proper time difference is imaginary however, then the effects of either event cannot reach the other event even if traveling at the speed of light. It follows that the sign of the interval is directly related to “causality,” to what can cause what. Since all observers agree about the value of the proper time difference, they all agree about what can cause what.
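The link between the sign of the interval and causality can be put into a small test. In the sketch below, events are `(ct, x, y, z)` tuples and the function names are invented for the illustration. The label "light-like" for a zero interval is standard terminology added here; such events can just be reached at the speed of light, so they are counted as reachable.

```python
def classify_interval(c_dt, dx, dy, dz):
    """Name the interval (ds)^2 = dx^2 + dy^2 + dz^2 - (c dt)^2 by its sign."""
    interval = dx**2 + dy**2 + dz**2 - c_dt**2
    if interval < 0:
        return "time-like"
    if interval > 0:
        return "space-like"
    return "light-like"  # exact zero; only reliable for exact inputs

def can_affect(event_1, event_2):
    """True if event_1 can influence event_2 at no more than light speed."""
    c_dt = event_2[0] - event_1[0]
    dx, dy, dz = (event_2[i] - event_1[i] for i in (1, 2, 3))
    return c_dt > 0 and classify_interval(c_dt, dx, dy, dz) != "space-like"

origin = (0, 0, 0, 0)
nearby_later = (2, 1, 0, 0)   # close enough to be reached
far_later = (1, 2, 0, 0)      # too far away for the available time
```

Note that the test also requires the second event to be the later one; the sign of the interval alone does not say which event comes first.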

For small differences in time and location, all differences $\Delta$ above become differentials ${\rm d}$.

1.2.3 Subluminal and superluminal effects

Suppose you stepped off the curb at the wrong moment and are now in the hospital. The pain is agonizing, so you contact one of the telecommunications microchips buzzing in the sky overhead. These chips are capable of sending out a “superluminal” beam, a beam that propagates with a speed greater than the speed of light. The factor with which the speed of the beam exceeds the speed of light is called the “warp factor” $w$.

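You select a microchip that is moving at high speed away from the location where the accident occurred. The microchip sends out its superluminal beam. In its coordinate system, the beam reaches the location of the accident at a time $t_m$, by which time the beam has traveled a distance $x_m = w c t_m$, with $w$ the warp factor. According to the Lorentz transformation, in the coordinate system fixed to the earth the beam reaches the location of the accident at a time

$$t = \frac{1 - wV/c}{\sqrt{1-(V/c)^2}}\, t_m$$

where $V$ is the speed of the microchip relative to the earth.

Because of the high speed $V$ of the microchip and the additional warp factor, this time is negative whenever $wV > c$: the beam has entered the past. A message relayed in this way, if necessary over several such bounces, could reach you before you step off the curb.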
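The sign change of the arrival time is easy to reproduce. The sketch below assumes units with $c = 1$, an $x$-axis that points from the microchip toward the accident location, and a beam that is sent out at time zero in both coordinate systems. The warp factor 2.5 and the speed 0.6 c are illustrative values only.

```python
import math

def earth_arrival(t_m, warp, beta):
    """Earth-frame time and position at which the beam arrives.

    t_m  : arrival time in the microchip's frame (beam sent at t_m = 0)
    warp : beam speed divided by c
    beta : V/c, the microchip's speed away from the accident location
    The earth moves at +V along x as seen from the microchip.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    x_m = warp * t_m                  # distance covered by the beam
    t = gamma * (t_m - beta * x_m)
    x = gamma * (x_m - beta * t_m)
    return t, x

# Superluminal beam: arrives before it was sent, as seen from earth.
t_super, _ = earth_arrival(t_m=1.0, warp=2.5, beta=0.6)

# Ordinary light (warp factor 1): arrives after it was sent.
t_light, _ = earth_arrival(t_m=1.0, warp=1.0, beta=0.6)
```

As long as the warp factor does not exceed one, the product of warp factor and $V/c$ stays below one and the arrival time cannot become negative.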

Sounds good, does it not? Unfortunately, there is a hitch. Physicists refuse to work on the underlying physics to enable this technology. They claim it will not be workable, since it will force them to think up answers to tough questions like: “if you did not end up in the hospital after all, then why did you still send the message?” Until they change their mind, our reality will be that observable matter or radiation cannot propagate faster than the speed of light.

Therefore, manipulating the past is not possible. An event can only
affect later events. Even more specifically, an event can only affect
a later event if the location of that later event is sufficiently
close that it can be reached with a speed of no more than the speed of
light. A look at the definition of the proper time interval then
shows that this means that the proper time interval between the events
must be real, or time-like.

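And while different observers may disagree about the location and time of the events, they all agree about the proper time interval. So all observers, regardless of their velocity, agree on whether an event can affect another event. And they also all agree on which event is the earlier one, because before the time difference $\Delta t$ between the events could change sign for some observer, it would have to pass through zero, and at that stage the proper time interval would be imaginary instead of real.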

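A more visual interpretation of those concepts can also be given. Imagine a hypothetical spherical wave front spreading out from the earlier event with the speed of light. Then a later event can be affected by the earlier event only if that later event is within or on that spherical wave front. If you restrict attention to events in the $x,y$-plane, you can use the third axis to plot time. In such a plot the expanding circular wave front becomes a cone, the future light cone of the earlier event, and the events that can be affected lie on or inside it.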

1.2.4 Four-vectors

The Lorentz transformation mixes up the space and time coordinates badly. In relativity, it is therefore best to think of the spatial coordinates and time as coordinates in a four-dimensional “space-time.”

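Since you would surely like all components in a vector to have the same units, you probably want to multiply time by the speed of light, because $ct$ has units of length. So the four-dimensional position vector can logically be defined as $(ct, x, y, z)$; $ct$ is the “zeroth” component of the vector where $x$, $y$, and $z$ are components number 1, 2, and 3 as usual. This four-dimensional position vector will be indicated by

$$\overset{\hookrightarrow}{r} \equiv (ct, x, y, z) \tag{1.10}$$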

How about the important dot product between vectors? In three-dimensional space this produces such important quantities as the length of vectors and the angle between vectors. Moreover, the dot product between two vectors is the same regardless of the orientation of the coordinate system in which it is viewed.

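It turns out that the proper way to define the dot product for four-vectors reverses the sign of the contribution of the time components:

$$\overset{\hookrightarrow}{r}_1 \cdot \overset{\hookrightarrow}{r}_2 \equiv -c^2 t_1 t_2 + x_1 x_2 + y_1 y_2 + z_1 z_2 \tag{1.11}$$

It can be checked by simple substitution that the Lorentz transformation preserves this dot product. In other words, this “inner product” is “invariant under the Lorentz transformation.” Different observers may disagree about the individual components of four-vectors, but not about their dot products.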
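The invariance can be spot-checked with a few lines of Python. The sketch below represents a four-vector as a `(ct, x, y, z)` tuple and uses the simple Lorentz transformation along the $x$-axis; the two vectors and the value $V/c = 0.6$ are arbitrary choices.

```python
import math

def dot4(a, b):
    """Four-vector dot product with the time contribution sign-reversed."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def boost(r, beta):
    """Components of four-vector r for an observer moving at beta = V/c
    along the x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct, x, y, z = r
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

r1 = (2.0, 1.0, 0.5, -1.0)
r2 = (1.0, 3.0, 2.0, 0.5)

before = dot4(r1, r2)                            # -2 + 3 + 1 - 0.5
after = dot4(boost(r1, 0.6), boost(r2, 0.6))     # same value, up to round-off
```

The individual components of both vectors change under the transformation, but the dot product should not, apart from floating-point round-off.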

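The difference between the four-vector positions of two events has a “proper length” equal to the proper distance between the events:

$$\Delta s = \sqrt{(\Delta\overset{\hookrightarrow}{r}) \cdot (\Delta\overset{\hookrightarrow}{r})} \tag{1.12}$$

So the fact that all observers agree about the proper distance can be seen as a consequence of the fact that they all agree about dot products.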

It should be pointed out that many physicists reverse the sign of the spatial components instead of the time in their inner product. Obviously, this is completely inconsistent with the nonrelativistic analysis, which is often still a valid approximation. And this inconsistent sign convention seems to be becoming the dominant one too. Count on physicists to argue for more than a century about a sign convention and end up getting it all wrong in the end. One very notable exception is [49]; you can see why he would end up with a Nobel Prize in physics.

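Some physicists also like to point out that if time is replaced by ${\rm i}t$, then the above dot product becomes the normal one. The Lorentz transformation can then be considered as a mere rotation of the coordinate system in four-dimensional space-time. That is mainly helpful for examining universes in which time is imaginary.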

Returning to our own universe, the proper length of a four-vector can be imaginary, and a zero proper length does not imply that the four-vector is zero as it does in normal three-dimensional space. In fact, a zero proper length merely indicates that it requires motion at the speed of light to go from the start point of the four-vector to its end point.

1.2.5 Index notation

The notations used in the previous subsection are not standard. In the literature, you will almost invariably find the four-vectors and the Lorentz transformation written out in index notation. Fortunately, it does not require courses in linear algebra and tensor algebra to make some basic sense out of it.

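First of all, in the nonrelativistic case position vectors are normally indicated by $\vec r$. The three components of this vector are commonly indicated by $x$, $y$, and $z$, or, using indices, by $r_1$, $r_2$, and $r_3$. To handle space-time, physicists add a zeroth component equal to $ct$, name the vector $x$, and write the index as a superscript instead of a subscript, so that the components become $x^0$, $x^1$, $x^2$, and $x^3$.

In short,

$$\overset{\hookrightarrow}{r} \equiv (ct, x, y, z) \equiv (x^0, x^1, x^2, x^3) \equiv \{x^\mu\}$$

shows this book’s common sense notation to the left and the index notation commonly used in physics to the right.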

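Recall now the Lorentz transformation (1.6). It described the relationship between the positions and times of events as observed by two different observers A and B. These observers were in motion compared to each other with a relative speed $V$. Physicists like to put the coefficients of such a Lorentz transformation into a table, as follows:

$$\Lambda \equiv \begin{pmatrix} \lambda^0{}_0 & \lambda^0{}_1 & \lambda^0{}_2 & \lambda^0{}_3 \\ \lambda^1{}_0 & \lambda^1{}_1 & \lambda^1{}_2 & \lambda^1{}_3 \\ \lambda^2{}_0 & \lambda^2{}_1 & \lambda^2{}_2 & \lambda^2{}_3 \\ \lambda^3{}_0 & \lambda^3{}_1 & \lambda^3{}_2 & \lambda^3{}_3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1.13}$$

where $\beta \equiv V/c$ and $\gamma \equiv 1/\sqrt{1-\beta^2}$.

A table like $\Lambda$ is called a “matrix” or “second-order tensor.” The individual entries in the matrix are indicated by $\lambda^\mu{}_\nu$, where $\mu$ is the number of the row and $\nu$ the number of the column.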

(Different sources use different letters for the Lorentz matrix and its entries. Some common examples are $\Lambda^\mu{}_\nu$ and $a^\mu{}_\nu$. The name “Lorentz” starts with L and the Greek letter for L is $\Lambda$, which makes $\Lambda$ a natural name for the matrix. An $a$ for the Lorentz matrix is good too: the name “Lorentz” consists of roman letters and a is the first letter of the roman alphabet.)

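The values of the entries $\lambda^\mu{}_\nu$ may vary. The ones in the simple matrix to the right in (1.13) apply only when the spatial axes of the two observers are aligned and their relative motion is along the $x$-axes.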

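In terms of the above notations, the Lorentz transformation (1.6) can be written as

$$x^\mu_B = \sum_{\nu=0}^{3} \lambda^\mu{}_\nu\, x^\nu_A \qquad \mbox{for all values } \mu = 0, 1, 2, 3$$

That is obviously a lot more concise than (1.6). Some further shorthand notation is now used. In particular, the “Einstein summation convention” is to leave away the summation symbol $\sum$. So you will likely find the Lorentz transformation written more concisely as

$$x^\mu_B = \lambda^\mu{}_\nu\, x^\nu_A$$

Whenever an index like $\nu$ appears twice in an expression, summation over that index is to be understood.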
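In code, the implied summation over the repeated index simply becomes an explicit sum. The following sketch stores the simple Lorentz matrix for motion along the $x$-axis as a nested list, with the first index the row; the function names are hypothetical.

```python
import math

def lorentz_matrix(beta):
    """Simple Lorentz matrix for relative motion along x, beta = V/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(lam, x_a):
    """x_B^mu = lambda^mu_nu x_A^nu, with the sum over nu written out."""
    return [sum(lam[mu][nu] * x_a[nu] for nu in range(4)) for mu in range(4)]

x_a = [1.0, 0.6, 2.0, 3.0]                    # (ct, x, y, z) for observer A
x_b = transform(lorentz_matrix(0.6), x_a)     # the same event for observer B
```

The free index `mu` labels the component being computed, while the repeated index `nu` is the one that is summed away.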

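It should be noted that mathematicians call the matrix $\Lambda$ the transformation matrix from B to A, even though it produces the coordinates of B from those of A.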

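In understanding tensor algebra, it is essential to recognize one thing. It is that a quantity like a position differential transforms differently from a quantity like a gradient:

$${\rm d}x^\mu_B = \frac{\partial x^\mu_B}{\partial x^\nu_A}\, {\rm d}x^\nu_A \qquad \frac{\partial f}{\partial x^\mu_B} = \frac{\partial f}{\partial x^\nu_A}\, \frac{\partial x^\nu_A}{\partial x^\mu_B}$$

In the first expression, the partial derivatives are by definition the entries of the Lorentz matrix $\Lambda$:

$$\frac{\partial x^\mu_B}{\partial x^\nu_A} \equiv \lambda^\mu{}_\nu$$

In the second expression, the corresponding partial derivatives will be indicated by

$$\frac{\partial x^\nu_A}{\partial x^\mu_B} \equiv (\lambda^{-1})^\nu{}_\mu$$

The entries $(\lambda^{-1})^\nu{}_\mu$ form the so-called “inverse” Lorentz matrix $\Lambda^{-1}$. If the Lorentz transformation takes the coordinates of observer A to those of observer B, then the inverse transformation takes those of B back to A.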

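Assuming that the Lorentz transformation matrix is the simple one to the right in (1.13), the inverse matrix $\Lambda^{-1}$ looks exactly the same as $\Lambda$ except that $-\beta = -V/c$ is replaced by $+\beta$. The reason is that the inverse transformation corresponds to a relative velocity $-V$ instead of $V$.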
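That claim about the inverse can be verified by multiplying the two matrices. The sketch below assumes the simple matrix for motion along the $x$-axis with $V/c = 0.6$; the helper names are made up for the illustration.

```python
import math

def lorentz_matrix(beta):
    """Simple Lorentz matrix for relative motion along x, beta = V/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(a, b):
    """Product of two 4 by 4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

forward = lorentz_matrix(0.6)      # from A to B
backward = lorentz_matrix(-0.6)    # candidate inverse, from B to A
product = matmul(backward, forward)
```

If the sign-flipped matrix really is the inverse, `product` must be the unit matrix up to round-off.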

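Consider now the reason why tensor analysis raises some indices. Physicists use a superscript index on a vector if it transforms using the normal Lorentz transformation matrix $\Lambda$. Such a vector is called a “contravariant” vector. For example, a position differential is a contravariant vector, so its components are written ${\rm d}x^\mu$ with a superscript index.

If a vector transforms using the inverse matrix $\Lambda^{-1}$, it is called a “covariant” vector, and subscript indices are used. For example, the gradient of a function $f$ is a covariant vector, so a component $\partial f/\partial x^\mu$ is commonly written as $\partial_\mu f$.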

Now suppose that you flip over the sign of the zeroth, time, component of a four-vector like a position or a position differential. It turns out that the resulting four-vector then transforms using the inverse Lorentz transformation matrix. That means that it has become a covariant vector. (You can easily verify this in case of the simple Lorentz transform above.) Therefore lower indices are used for the flipped-over vector:

$$\begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} \equiv \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \qquad (-ct, x, y, z) \equiv (x_0, x_1, x_2, x_3)$$

The convention of showing covariant vectors as rows instead of columns comes from linear algebra. Tensor notation by itself does not have such a graphical interpretation.

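Keep one important thing in mind though. If you flip the sign of a component of a vector, you get a fundamentally different vector. The vector $x_\mu$ is not just a somewhat different way to write the position four-vector $x^\mu$ of the space-time point that you are interested in. Normally, a new vector that differs from one already in use would be given a new name. Tensor algebra does not do that. Therefore the rule is:

The names of tensors are only correct if the indices are at the right height.

If you remember that, tensor algebra becomes a lot less confusing. The expression $x^\mu$ is your space-time location only if the index is a superscript; with a subscript, $x_\mu$ is the sign-flipped vector.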

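Now consider two different contravariant four-vectors, call them $x^\mu$ and $y^\mu$. The dot product between these two four-vectors can be written as

$$x_\mu y^\mu$$

To see why, recall that since the index $\mu$ appears twice, summation over that index is understood. Also, the lowered index of $x_\mu$ indicates that the sign of the zeroth component is flipped over. That produces the required minus sign on the product of the time components in the dot product.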

Note also from the above examples that summation indices appear once as a subscript and once as a superscript. That is characteristic of tensor algebra.
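Lowering an index and summing over a repeated index can both be mimicked with plain lists. The sketch below assumes the simple Lorentz matrix for motion along the $x$-axis; all function names are hypothetical. It also checks, for that simple case, that the sign-flipped vector transforms with the inverse matrix.

```python
import math

def lorentz_matrix(beta):
    """Simple Lorentz matrix for relative motion along x, beta = V/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(lam, x):
    """Matrix times four-vector, with the sum over the column index."""
    return [sum(lam[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

def lower(x_up):
    """Flip the sign of the zeroth (time) component: x^mu -> x_mu."""
    return [-x_up[0], x_up[1], x_up[2], x_up[3]]

def contract(x_low, y_up):
    """x_mu y^mu, with the sum over the repeated index written out."""
    return sum(x_low[mu] * y_up[mu] for mu in range(4))

x_up = [2.0, 1.0, 0.5, -1.0]
y_up = [1.0, 3.0, 2.0, 0.5]
dot = contract(lower(x_up), y_up)        # -2 + 3 + 1 - 0.5

# Lowering after the transformation equals applying the inverse matrix
# (beta replaced by -beta) to the lowered vector.
lhs = lower(transform(lorentz_matrix(0.6), x_up))
rhs = transform(lorentz_matrix(-0.6), lower(x_up))
```

Note that `lower(x_up)` is a different list than `x_up`, in line with the warning that the lowered vector is a fundamentally different vector.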

Addendum {A.4} gives a more extensive description of the most important tensor algebra formulae for those with a good knowledge of linear algebra.

1.2.6 Group property

The derivation of the Lorentz transformation as given earlier examined
two observers A and B. But now assume that a third observer C is in
motion compared to observer B. The coordinates of an event as
perceived by observer C may then be computed from those of B using the
corresponding Lorentz transformation, and the coordinates of B may in
turn be computed from those of A using that Lorentz transformation.
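Schematically,

$$\overset{\hookrightarrow}{r}_C = \Lambda_{C\leftarrow B}\, \overset{\hookrightarrow}{r}_B = \Lambda_{C\leftarrow B}\, \Lambda_{B\leftarrow A}\, \overset{\hookrightarrow}{r}_A$$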

But if everything is OK, that means that the Lorentz transformation from A to B followed by the Lorentz transformation from B to C must be the same as the Lorentz transformation from A directly to C. In other words, the combination of two Lorentz transformations must be another Lorentz transformation.

Mathematicians say that Lorentz transformations must form a “group.” It is much like rotations of a coordinate system in three spatial dimensions: a rotation followed by another one is equivalent to a single rotation over some combined angle. In fact, such spatial rotations are Lorentz transformations, just between coordinate systems that do not move compared to each other.

Using a lot of linear algebra, it may be verified that indeed the Lorentz transformations form a group, {D.5}.
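For two transformations along the same axis, the combination rule can at least be spot-checked without much linear algebra. The sketch below assumes that both relative motions are along the $x$-axis and that the combined speed follows the relativistic velocity relation; it is a numerical check for one pair of speeds, not a proof of the group property.

```python
import math

def lorentz_matrix(beta):
    """Simple Lorentz matrix for relative motion along x, beta = V/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(a, b):
    """Product of two 4 by 4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combined_beta(beta_1, beta_2):
    """Speed of C relative to A, given B relative to A and C relative to B."""
    return (beta_1 + beta_2) / (1.0 + beta_1 * beta_2)

a_to_b = lorentz_matrix(0.5)
b_to_c = lorentz_matrix(0.5)

# First from A to B, then from B to C; the later step multiplies from the left.
a_to_c = matmul(b_to_c, a_to_b)

# The single transformation from A directly to C, at 0.8 c.
direct = lorentz_matrix(combined_beta(0.5, 0.5))
```

The two matrices `a_to_c` and `direct` should agree entry by entry, up to round-off.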