Relativity Primer with Bondi K Calculus
with a flashlight
A simple primer
using the Bondi K calculus
Ananda Dasgupta
March 3, 2007
Contents
1 The background
1.1 Galileo - the father of relativity
1.2 Hunt for the elusive ether
1.3 Enter Einstein
The background
CHAPTER 1. THE BACKGROUND 7
Faced with such choices I am pretty sure that most people would
have sided with the first option. Electrodynamics was a fledgeling
science at that stage. Mechanics, on the other hand, was the grand
old man of physics - one, moreover, that had held sway over the
way that scientists thought about the natural world for more than
three centuries at that time. It was Einstein’s genius that he did
not hesitate to go the other way!
Chapter 2
CHAPTER 2. THE RADAR METHOD 10
not moving with respect to myself - all that I see is the lamppost
move steadily towards me. My friend crosses me, goes out - and
meets the oncoming lamppost at some point. What distance has he
covered with respect to me? From my origin to that point. Where
does he turn back from? From that very point! Where does he land
up finally? To my origin! Hence, with respect to me he has covered
the same distance on his outward journey as on his inward one![1]
Let me stress once again that I did expect most of you to give the
wrong answer. That is because despite learning a lot about relative
motion, all of us instinctively think in terms of one observer all
the time - the ground[2]! In fact, that is why most of you will be
concerned with how far the lamppost is from me - the lamppost
is fixed on the ground! That my friend travelled more during his
outward journey is the correct answer - but only with respect to
the ground! With respect to me, it is fifty-fifty.
Let’s come back to our puzzles. The next one is - if the speed of my
friend on the bike was the same on both halves of the journey, then
which half took him more time? This one is really simple - since
the outward journey was longer than the inward one - the first half
must have taken more time.
“Hold on!” - some of you must be screaming at this point. This
one is obviously correct with respect to the ground. Since the two
journeys are equally long with respect to me, shouldn’t the times be
equal when I measure them? If this is your answer, then congratu-
lations! Not because your answer is right (it isn’t!) but because you
have unwittingly surmounted the biggest stumbling block that lies
in the path of understanding the special theory of relativity! Jokes
apart, think about this issue a bit - would you really expect the
times to change depending on whether a man fixed on the ground
sees the events or whether I, who am moving, see them?
The situation, once again, is very simple. It is true that the
distances my friend covers in the two halves are the same with
respect to me. The speeds are also the same - but with respect
to the ground! When he goes away from me, his speed, as I see
it, is less (it is v − u if v and u are the bike’s and my speed with
respect to the ground, respectively) than when he is coming back
in (it is v + u). So, he takes more time to go out than to come back
in - just as the ground sees it. Indeed, if you take the trouble of
calculating the times from my viewpoint (you should!) you will see
that they exactly match the time that the ground observes[3]. The
moral of the story is - when you are using somebody’s displacement
measurements, you should not mix them up with someone else’s
velocity measurements!
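
If you would rather let a computer do the checking, here is a minimal sketch of the bike puzzle. The function names and the numbers (a lamppost 120 m ahead, bike speed 6 m/s, my speed 2 m/s) are made up purely for illustration; the only assumption is plain Newtonian relative motion with u smaller than v.

```python
def trip_times_ground(D, v, u):
    """Times of the two legs, worked out entirely in the ground frame.

    D: distance from me to the lamppost when the bike sets out
    v: bike's speed relative to the ground
    u: my speed relative to the ground, towards the lamppost (u < v)
    """
    t_out = D / v                # the bike rides up to the fixed lamppost
    gap = D - u * t_out          # by then I have moved u * t_out closer
    t_in = gap / (v + u)         # the bike and I close the gap together
    return t_out, t_in


def trip_times_mine(D, v, u):
    """The same two legs, using only my distances and my velocities."""
    # For me the lamppost approaches at u while the bike recedes at v - u,
    # so they meet when (v - u) * t = D - u * t, that is, t = D / v.
    t_out = D / v
    d = (v - u) * t_out          # my distance to the turning point
    t_in = d / (v + u)           # the bike comes back at v + u relative to me
    return t_out, t_in, d


D, v, u = 120.0, 6.0, 2.0        # illustrative values only
print(trip_times_ground(D, v, u))
print(trip_times_mine(D, v, u))
```

Both functions give the same pair of times, even though the second one uses equal outward and inward distances: the unequal speeds v − u and v + u make up the difference.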
Brace yourself - for this is where our hero the flashlight makes
its grand entrance into the story! Consider the same scenario as
before - except that instead of the lamppost I am now moving to-
wards a mirror fixed to the ground. I shine a flashlight at the
mirror. A pulse of light travels to the mirror and comes back to me.
As with my friend the biker, the flash of light travelled more during
the outward journey than during the inward one - with respect to
the ground. Again, both halves of its journey were equally long with
respect to me. What about the times? It stands to reason that the
outward journey takes more time than the inward one for a ground
based observer. What do I say about this? Now, as per Einstein,
light moved with the same speed c for both halves of its journey,
not only with respect to the ground - but also with respect to me!
This leaves me with no option but to conclude that - with respect
to me light takes the same time to complete the two halves of its
journey! Accept the constancy of the speed of light, and you have to
accept that time depends on the observer!
Our simple little puzzle has led us right into the heart of the
[3] This should not come as a surprise. After all, this is how we arrive at the
v ± u results in the first place!
have to use up all three directions for the spatial axes - which will
leave me nowhere to put the fourth, time, axis! With one space
dimension only, spacetime is two dimensional - something that I
can draw very easily on a sheet of paper.
Although we cannot draw a four dimensional spacetime diagram on a
sheet of paper - nothing prevents us from understanding it at a
mathematical level, or even imagining such a thing. There is nothing
magical about four dimensions - all that we are saying is that events
must be labelled by four numbers - the four coordinates x, y, z and t.
Of course, this was true even in Newton’s world. There is a deeper
sense, though, behind the statement that spacetime is four dimensional
in STR - I hope to make this clear in a coming section.

Figure 2.1: A space time diagram.
There is one thing about four dimensional spacetime that I must
bring up here. In the 2D space time in figure 2.1, the light lines that
go through any point form a pair of straight lines, both inclined at
45° to the two axes. In four dimensions, light emitted from the
origin (x = y = z = 0) at t = 0 obeys the equation
$$c^2 t^2 - x^2 - y^2 - z^2 = 0 \tag{2.1}$$
- which is simply the statement that light travels equally in all di-
rections with the uniform speed c. Although we can’t draw the
surface corresponding to this equation, we can definitely under-
stand its content. What I can draw, is the version of (2.1) in
which I use only two space dimensions, ignoring z. This equa-
$$t = \frac{1}{2}\,(t_2 + t_1) \tag{2.2-a}$$
$$x = \frac{c}{2}\,(t_2 - t_1) \tag{2.2-b}$$
You may find this hard to swallow - but almost all of relativity can
be derived from this nearly trivial set of rules!
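
The two rules (2.2-a) and (2.2-b) fit in a few lines of code. This is only a sketch: the function name `radar_event` is my own, and t1 and t2 are taken to be the emission and return times of the flash as read on the observer's own watch.

```python
C = 299_792_458.0  # speed of light in m/s


def radar_event(t1, t2, c=C):
    """Locate a bounce from the emission time t1 and the return time t2.

    Returns the time and the distance that the observer assigns to the
    bounce, following (2.2-a) and (2.2-b).
    """
    t = 0.5 * (t2 + t1)        # (2.2-a): the bounce is midway in time
    x = 0.5 * c * (t2 - t1)    # (2.2-b): light covers the distance twice
    return t, x


# In units where c = 1: a flash sent at t1 = 1 that returns at t2 = 3
# bounced at t = 2 at a distance x = 1.
print(radar_event(1.0, 3.0, c=1.0))
```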
Our humble flashlight is in rather good company - the all important
radar systems that help armies locate and shoot down enemy
planes and air traffic controllers guide friendly ones to a safe landing
work on exactly the same principle outlined above! So, I will
refer to this way of measuring space and time as the radar method.
The rules that I have laid down above may seem too simple to
be true. More importantly - they seem so much in tune with our
classical notions that it is hard to see just how the conclusions
that we can draw from these rules can differ from the classical
ones. Let me point out just where the rules cease to be classical.
In Newtonian mechanics, the equations (2.2-a) and (2.2-b) will be
valid, but only in the one unique frame where the speed of light is
the same in all directions. This is the frame fixed to the medium
in which light propagates - the luminiferous ether. Einstein freed
physics from the shackles of the ether - so that in STR (2.2-a) and
(2.2-b) are valid for all inertial observers. Thus, if an observer is
moving away from me, he can measure the in and out times of
a light beam and use the same equations to locate the bounce in
space and time - but this time according to his own coordinates.
One of the major new features in STR is that space time mea-
surements change from observer to observer in a rather new, un-
expected manner. Since all inertial observers can play the game
of measuring spacetime coordinates by using the equations (2.2-a)
and (2.2-b), we can find out how these changes work out by playing
with the rules of this game. This is precisely what I intend to do in
the next few sections.
Figure 2.3: The K factor. (a) Alice sends Bob flashlight signals
at an interval of τ, which Bob receives at an interval Kτ. (b) What
happens when Bob sends his signals - note that the same
factor K shows up again. Note that the times shown in purple are
times read by Alice’s watch, while those in blue are readings from
Bob’s watch.
engine suddenly drops in pitch when the engine crosses you? This
is why another name for K is the Doppler factor.
What happens if Bob decides to shine his flashlight at Alice? If
he flashes his flashlight at an interval τ′ as measured by his watch,
Alice will receive them at a larger interval of time according to her
watch. What may come as a surprise to you is that the interval that
Alice will see is exactly Kτ′. The same factor K comes in, both for
Bob receiving Alice’s signals and for Alice receiving Bob’s signals!
Why are the two factors the same? It is the basic principle of
relativity, the fact that all inertial observers are exactly equivalent,
in action! Note that if Bob had moved away from Alice to the left
instead of to the right, he would still have received Alice’s signals
at the same interval as long as his speed is the same as before -
there being no difference between left and right. Now, with respect
to Bob, Alice moves to the left with the same speed as
Bob is moving to the right with respect to her. Thus any difference
between the two factors will point to a fundamental difference
between Alice and Bob as observers. This precisely is what the
principle of relativity rules out!

Imagine that Alice has flashed her flashlight at Bob at time τ
according to her wristwatch, and that Bob receives it at a time when
his watch reads Kτ. Instead of letting the light beam flash past him,
Bob holds up a mirror that reflects the pulse back to Alice. This beam
will reach Alice at K²τ, who reflects it back to reach Bob at K³τ,
and so on ... Figure 2.4 shows this bouncing to and fro of the
flashlight beam.
When and where does the first bounce at Bob’s mirror happen,
according to Alice? Our basic rules of spacetime measurement,
(2.2-a) and (2.2-b), tell us the answer:
$$t = \frac{K^2 + 1}{2}\,\tau$$
$$x = c\,\frac{K^2 - 1}{2}\,\tau$$

Figure 2.4: Determination of K

One thing must be immediately clear - the time at which Bob
saw the first bounce occur was Kτ - which is clearly different
from the time $\frac{K^2 + 1}{2}\,\tau$ that Alice infers for it.
To find the value of K, just notice that
$$\frac{x}{t} = \frac{K^2 - 1}{K^2 + 1}\,c$$
The ratio x/t is nothing but the speed v of Bob that Alice sees. So,
calling the dimensionless quantity v/c β, the equation translates to
$$\beta = \frac{K^2 - 1}{K^2 + 1} \tag{2.3}$$
Thus the K factor goes from 1 when Bob is stationary with re-
spect to Alice, to ∞ when Bob’s speed approaches the speed of
light (β = 1). The factor is smaller than 1 for negative values of β,
and falls all the way to 0 for an observer moving to the left with a
speed c. Indeed, reversing the sign of the velocity changes the K
factor to its reciprocal.
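
Solving (2.3) for K gives K = √((1 + β)/(1 − β)). The sketch below, with function names of my own choosing, checks the claims of the last paragraph numerically: K is 1 at rest, grows without bound as β approaches 1, and turns into its reciprocal when the velocity is reversed.

```python
import math


def k_factor(beta):
    """Bondi K factor for a relative velocity beta = v/c, with |beta| < 1."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between -1 and 1")
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def beta_from_k(K):
    """Equation (2.3): recover beta from the K factor."""
    return (K * K - 1.0) / (K * K + 1.0)


for beta in (0.0, 0.6, 0.99, 0.999999):
    K = k_factor(beta)
    # Reversing the velocity must give the reciprocal factor.
    assert math.isclose(k_factor(-beta), 1.0 / K)
    # (2.3) must take us back to the beta we started from.
    assert math.isclose(beta_from_k(K), beta, abs_tol=1e-12)
    print(beta, K)
```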
If Bob moves away from Alice with a velocity v, then Alice must
move away from Bob with a velocity −v. It follows, then, that the K
factor of Alice with respect to Bob must be K⁻¹, if that of Bob with
respect to Alice is K. However, isn’t this contrary to what I said
a while ago - that when Bob sends her signals at time intervals
of τ, Alice must receive them at intervals of Kτ (and not K⁻¹τ),
according to the principle of relativity?
If you think about this a bit, you will realize that there is no
contradiction at all! Imagine a third fellow, maybe Charlie, who is
situated well to the left of both Alice and Bob, who is also shining
flashes of light at both of them. Both Alice and Bob see the flashes
travelling to the right[5] with speed c. If Bob sees these flashes at a
time interval of τ, then Alice must see them at the smaller interval
K⁻¹τ. This, exactly, is the meaning of the statement that the
Doppler factor of Alice with respect to Bob is K⁻¹! In this case,
Alice is moving, as far as Bob is concerned, in the direction of the
source of the light and is hence getting to meet each successive
pulse sooner than Bob does. When Bob sends Alice the signals,
though, the light signals must move towards the left to get to
Alice, which is the same direction that Bob sees Alice move away
in! The light signals take some extra time to catch up with the
receding Alice - hence, the increased time interval of Kτ.

[5] That the direction as well as the speed of light stays unchanged when you
jump from one inertial frame to another is a consequence of the fact that “you
just can’t outrun light”! I will show you how this result follows from the
relativistic rules of velocity addition in section 3.4.
In summary, then, the Doppler factor K for Bob with respect to
Alice is the factor by which you must multiply time intervals between
light signals as seen by Alice, to get those seen by Bob - but for light
travelling towards the positive x direction, namely, to the right! For
light travelling towards the left, the multiplying factor is not K at all,
but rather K⁻¹!
In terms of frequencies, Bob must divide the frequency of light
that Alice sees by K to get the frequency that he sees, but again,
only for light travelling to the right. For light moving the other way,
Bob will have to multiply by K, instead of dividing by it!
Let me address another point that you may have already thought
of on your own. The equation (2.4) clearly shows that the value of
K ceases to be real once β exceeds 1 in magnitude. Does this prove
the statement that I have made all along - that it is impossible for
a material object to move faster than light? Unfortunately, the an-
swer is - no! Just think about it a bit and you will realise that all
it means is that if Bob were to recede from Alice faster than light,
then the flashlight signals Alice sends out towards him will never
catch up with him! To understand why material particles cannot
exceed the speed of light, you will have to wait a while more!
K0 = 1 + β
CHAPTER 3. KINEMATICS FROM THE FLASHLIGHT 27
talking about - he is on the spot! This means that he had the lux-
ury of directly using his watch to time the bounce, while poor Alice
had to be content with an indirect determination of the time. Indeed,
if we look at the second bounce the light flash suffers, we see
that Alice times it at K²τ, while Bob infers that it occurs midway
in between the first and third bounces, i.e. at $\frac{K^3 + K}{2}\,\tau$. Once again, the
“on the spot” time (this time it’s Alice’s turn) is smaller than the
inferred time. There is nothing special, then, about either Alice or
Bob - whoever happens to be on the spot measures a smaller time.
So, STR does make a distinction between a time measurement
made directly and one made indirectly. In STR jargon, directly
measured time is called the proper time. Going by this, you may
be tempted to call the indirectly measured time the “improper time”
(some authors do). In my opinion, this latter term is not very suit-
able - since it gives us the notion that something is wrong about
such a time measurement!
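Going back to the first bounce suffered by the flash of light in
figure 2.4, we denote the proper time at which it occurs (Bob’s time)
by t0 and Alice’s time by plain t. We see that
$$\frac{t}{t_0} = \frac{K^2 + 1}{2K} = \frac{\frac{1+\beta}{1-\beta} + 1}{2\sqrt{\frac{1+\beta}{1-\beta}}} = \frac{1}{\sqrt{1 - \beta^2}}$$
so that
$$t = \frac{t_0}{\sqrt{1 - \beta^2}}. \tag{3.1}$$
Let us write
$$\gamma \equiv \frac{1}{\sqrt{1 - \beta^2}} = \frac{K^2 + 1}{2K} \tag{3.2}$$
and so
$$t = \gamma t_0.$$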
Note that the ratio of measured times is the same for the second
bounce of the flash as well - but this time t is Bob’s time, while t0
is the time as measured (directly) by Alice.
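
Here is a small numerical check of (3.1) and (3.2), offered as a sketch rather than a proof. It assumes K = √((1 + β)/(1 − β)), which is what solving (2.3) for K gives; the function names and the sample value τ = 1 are mine.

```python
import math


def k_factor(beta):
    """Bondi K factor, obtained by solving (2.3) for K."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def gamma_from_k(K):
    """Right-hand form of (3.2)."""
    return (K * K + 1.0) / (2.0 * K)


def gamma_from_beta(beta):
    """Left-hand form of (3.2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


tau = 1.0                                     # Alice's emission time (arbitrary)
for beta in (0.1, 0.6, 0.9):
    K = k_factor(beta)
    # First bounce: Bob is on the spot, Alice infers the time by radar.
    bob_direct = K * tau
    alice_inferred = 0.5 * (K * K + 1.0) * tau
    # Second bounce: Alice is on the spot, Bob infers the time by radar.
    alice_direct = K * K * tau
    bob_inferred = 0.5 * (K ** 3 + K) * tau
    gamma = gamma_from_beta(beta)
    assert math.isclose(gamma_from_k(K), gamma)
    assert math.isclose(alice_inferred / bob_direct, gamma)
    assert math.isclose(bob_inferred / alice_direct, gamma)
    print(beta, gamma)
```

In both cases the inferred time is γ times the on-the-spot time, whoever happens to be on the spot.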
Although equation (3.1) is pretty straightforward - it can lead
you into all sorts of trouble, especially if you get confused about
which observer’s time is t0 and which one is t. So, let me stress
this once again: if an observer is present at the location of both the
events (on the spot) then the time interval he measures between them
is the proper time. If you are talking of two events such that neither
Alice nor Bob is on the spot for both of them, then neither can claim
his time measurement to be the proper time. How, then, are the
two times related to each other? Well, you will just have to wait till
we get to the derivation of Lorentz transformations in a little while.
Figure 3.1: An idealized clock. (a) The time taken for light to return
to the lower mirror is 2l0/c. (b) The time taken is longer for someone
moving with respect to the clock, since light has to cover a bigger
distance.
2l0/c. Note that since both the start and end of the light ray’s to and
fro journey occur in the same place with respect to me, this time
interval is the proper time interval between the two events.
Consider what this will look like to someone else, who sees me
and my clock move uniformly to the right with a speed v. He will see
light leave the lower mirror, reach the upper one and bounce back
to the lower one, just as I do. However, by the time light reaches
the upper mirror, it would have shifted to the right ... - hence he will
see light follow the path shown in figure 3.1b. The path is obviously
longer than the one I see. If this had been Newtonian physics, he
would have claimed that light is moving faster (at a speed √(c² + v²)
to be exact), so that it covers the path in the same time 2l0/c that I
see. That option is out - STR tells us that light has to make the trip
at the same speed with respect to him too - and thus he must see
the trip take a larger time. Thus one tick on my moving clock takes
more time than the designated 2l0/c as far as he is concerned - it is
running slow!
To find out just how slow, all we have to do is appeal to good old
Pythagoras! If t is the time taken for the round trip according to
him, then light takes half of it to get to the upper mirror, covering a
distance of ct/2. In this time, the upper mirror has moved a distance
of vt/2. So, in figure 3.1b we have a right angled triangle with sides l0
and vt/2 and a hypotenuse ct/2. So
$$\left(\frac{ct}{2}\right)^2 = l_0^2 + \left(\frac{vt}{2}\right)^2$$
which gives
$$t = \frac{2l_0/c}{\sqrt{1 - \frac{v^2}{c^2}}} = \frac{t_0}{\sqrt{1 - \beta^2}}$$
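
The Pythagoras argument is easy to check numerically. In this sketch the function name, the units (c = 1) and the sample values l0 = 1 and v = 0.6 are all illustrative choices of mine.

```python
import math


def moving_clock_tick(l0, v, c=1.0):
    """Round-trip time of the light clock for an observer who sees it move at v."""
    t0 = 2.0 * l0 / c                          # proper tick, measured at rest
    return t0 / math.sqrt(1.0 - (v / c) ** 2)  # solution of the triangle relation


l0, v, c = 1.0, 0.6, 1.0
t = moving_clock_tick(l0, v, c)

# The right angled triangle of figure 3.1b: sides l0 and v*t/2, hypotenuse c*t/2.
assert math.isclose(c * t / 2.0, math.hypot(l0, v * t / 2.0))
print(t, t / (2.0 * l0 / c))   # the tick, and its ratio to the proper tick
```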
see around us moves too slowly for the effect to show up! Consider
a clock that rushes past you on an aircraft flying at Mach 1 (the
speed of sound). How much does this clock slow down by? To get
an estimate of this, note that the speed of sound is about 300 m s⁻¹,
while that of light is 3 × 10⁸ m s⁻¹, so that the β for our clock is 10⁻⁶.
So, this clock goes slow compared to the one that sits still next to
you by a factor γ given by
$$\gamma = \left(1 - 10^{-12}\right)^{-1/2} \approx 1 + 5.0 \times 10^{-13}$$
So, this clock that flies past you does go slower than yours - but
only by five parts in 10¹³! No wonder no one had noticed!
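
If you want to reproduce this estimate on a computer, some care is needed: γ differs from 1 only in the thirteenth decimal place, so computing γ first and subtracting 1 afterwards throws away most of the precision of ordinary floating point numbers. The sketch below sidesteps this with `math.log1p` and `math.expm1`; the function name and the round input numbers are my own choices.

```python
import math


def gamma_minus_one(beta):
    """Compute gamma - 1 without losing precision when beta is tiny.

    gamma = (1 - beta**2) ** (-1/2), so log(gamma) = -0.5 * log1p(-beta**2)
    and gamma - 1 = expm1(log(gamma)).
    """
    return math.expm1(-0.5 * math.log1p(-beta * beta))


beta = 300.0 / 3.0e8                 # Mach 1 over the speed of light
excess = gamma_minus_one(beta)
print(excess)                        # close to 5e-13
print(excess * 86400.0)              # seconds lost per day, a few times 1e-8
```

With these round numbers the moving clock falls behind by roughly 43 nanoseconds per day.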
Having said that, I must hasten to add that today’s technology
is good enough to actually detect even such a small effect! Today,
we have many direct verifications of the time dilation formula (3.1)
- something which would have been unthinkable in Einstein’s time.
What’s more, the time dilation formula is no longer a mere cu-
riosity producing either tiny effects that would not be detectable
except with very sophisticated technology, or large effects for ex-
otic objects that travel very fast - it is almost a part of daily life
today! You may have heard of the Global Positioning System (GPS)
- a network of satellites that broadcast precisely timed signals to
receivers carried by vehicles (like ships in the open sea, aircraft or
even your mobile phone!). The receiver times the signals from several
satellites and uses something very much like high school trigonometry
to work out its own location, to within a few meters. The clocks on
the satellites run slow compared to the clocks on the ground (because
of their large orbital speeds) and you must compensate for this effect
to achieve
this accuracy! Without this compensation, GPS calculations would
have been off by miles! This perhaps is one of the most drastic
that during most of his journey, Bob had driven just as straight
as Alice did! The one brief period at C when Bob makes a turn
is ultimately responsible for the length of Bob’s path being a lot
longer than Alice’s! Similarly, the one brief period of acceleration
during Bob’s intergalactic round trip is what makes Alice’s clock
read a lot more than Bob’s. The only reason why people never bat
an eyelid in the former case, while they are all full of wonder at
the latter is that our intuitions, which are honed by what we see
around us in a world of slowly moving objects, are not prepared to
grasp the concept that time can be different for different people!
One argument that many people come up with at this stage is
that from Bob’s point of view it is Alice that accelerates, so it is
equally correct to say that Bob is inertial and Alice is not - and
hence conclude that Bob is correct about their respective ages and
not Alice. This argument falls flat simply because the issue of
which observer is inertial and which is not is not a relative
matter! Remember, Newton’s first law asserts that there exists
at least one inertial observer for whom all force free objects are also
acceleration free. Anyone moving uniformly with respect to this
inertial observer is inertial, anyone who accelerates with respect to
it is non-inertial! Moreover, although an observer cannot make
out whether she is moving or not without looking out when in uniform
motion, she can easily figure out whether she is accelerating
with respect to an inertial observer. In the brief moment of reversal
during Bob’s journey, Bob’s accelerometer must have registered a
large reading, while Alice felt nothing at all! So, Alice is the one
who is inertial - and there is no ambiguity about this whatsoever!
Although we have disposed of the paradox simply by denying
its existence, it is natural to feel a bit uncomfortable about this.
It is true that the clocks of both friends run slow compared to the
[Figure: two spacetime diagrams of the signals exchanged by Alice and Bob. Vertical axis: Time (in years). Horizontal axis: Position (in lightyears).]
For the sake of convenience, I will assume that Bob does not
move as fast as in our example above, but with only a speed of
0.6 c. This gives us a time dilation factor γ of 1.25, so that a journey
that takes only 8 years according to Bob’s clock takes
10 years according to Alice. Before Bob sets out on his journey,
the two friends decide on a strategy by which they can keep track
of each other’s ages. They both agree that on their birthdays each
year they will send out a light flash towards the other one. So, all
this means that she is going to receive Bob’s yearly signals at gaps
of six months! Bob sends out four signals while coming back. Alice
receives all of them crammed up in the last two years of her ten.
Thus, Bob sends out a total of eight signals, Alice receives all eight!
Everything checks out, and they both agree that Alice is ten years
older than when the journey started, while Bob has aged by only
eight!
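
The bookkeeping of Bob's birthday signals can be replayed in a few lines. This is a sketch under stated assumptions: units with c = 1 (years and lightyears), an instantaneous turnaround, and 4 years of Bob's own time on each leg (half of his 8-year total). The function names are mine.

```python
BETA = 0.6
GAMMA = 1.0 / (1.0 - BETA ** 2) ** 0.5   # 1.25 for beta = 0.6
LEG_YEARS_BOB = 4.0                      # Bob's proper time per leg (assumed)


def bob_event(tau):
    """Alice-frame (t, x) of Bob when his own clock reads tau years."""
    if tau <= LEG_YEARS_BOB:             # outward leg
        t = GAMMA * tau
        return t, BETA * t
    t_turn = GAMMA * LEG_YEARS_BOB       # turnaround, as timed by Alice
    x_turn = BETA * t_turn
    dt = GAMMA * (tau - LEG_YEARS_BOB)   # Alice-frame time since the turnaround
    return t_turn + dt, x_turn - BETA * dt


def arrival_at_alice(tau):
    """Alice's clock reading when the signal Bob sent at tau reaches her."""
    t, x = bob_event(tau)
    return t + x                         # light needs x years to cover x lightyears


arrivals = [arrival_at_alice(n) for n in range(1, 9)]
print(arrivals)
print([b - a for a, b in zip(arrivals, arrivals[1:])])
```

The first four signals arrive two years apart (the factor K = 2 for β = 0.6) and the last four arrive six months apart (the factor 1/K), with the eighth reaching Alice just as Bob does, in her tenth year.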
Time dilation leads to a puzzle similar to, but not quite the same
as, the twin paradox for two observers equipped with this infinite
set of personal clocks. Just imagine that Bob, who has his own
private array of fixed clocks, is rushing past Alice and her clocks
at the speed of 0.6 c. When the two friends are just abreast, their
clocks both read 0. After a while Bob glances at his clock and finds
that it is reading 8 seconds. He also notices that Alice’s clock right
next to him is now reading 10 seconds. To Bob, the interval that
has elapsed is eight seconds long, whereas to Alice its length is 10
seconds. Of course, Bob uses the same clock for both the readings
of 8 seconds and 0 seconds - so his reading is the proper time
duration of this interval. Alice, on the other hand, must use two of
her clocks to time this interval - the first one at the origin, and the
second at a distance of 0.6 c × 10 s = 6 light-seconds away from her.
You should check that the two friends’ statements about the length
of the time interval are consistent with our time dilation equation,
(3.1).
So, where is the puzzle? Note that both friends must agree that
at the first event (Bob crossing Alice) their clocks were reading 0 s.
They must also both agree that when Bob next looked at his clock
(the second event), their adjacent clocks were reading 10 s and 8
s, respectively. All this fits our results beautifully. Hang on! Just
try to figure things out from Bob’s point of view. To him, it is Alice
and her clocks that are moving away, with a speed of 0.6 c to the
left. Hence, he should be seeing Alice’s clocks running slow, by
the same γ factor of 1.25! So, shouldn’t Alice’s clock be showing
8 s × (1/1.25) = 6.4 s, when his clock is showing 8 s?
In other words, the puzzle is that while relativity insists that
both of the friends will see the other one’s clocks running slow, the
readings on the clocks seem to insist that it is Bob’s clocks that have
slowed down - not Alice’s! Note that in this case both friends are
equally inertial - so that we do not have the escape route that we
used for the twin paradox. To see how to resolve this issue you will
have to wait a while until section 3.3 - where we will meet a con-
sequence of relativity that runs even more counter to our intuition
than time dilation!
duced five-six kilometers above the surface survive till they reach
the earth’s surface?
The answer lies, of course, in the fact that the muon decays by
its own clock, one that is slowed down by quite a big γ factor compared
to the ones that we carry. Put in a more technical language,
in the inertial frame in which a muon is at rest, it does decay with
the half-life of 2 × 10⁻⁶ s. This is the so called proper half-life of
the muon. However, the muons produced in the upper atmosphere
are travelling so fast (at nearly the speed of light) relative to us that
their clock runs slow by a factor of around γ ∼ 10, allowing most of
them to get to the surface undecayed.
So that’s it - if the muon can make it to the ground from 5-6
km above, it is all because of the fact that the clock that tells it to
decay runs slowly compared to our earthbound clocks. Before you
start to celebrate the solution of this case of the undying muon,
though, let me just show you the whole life of the muon from the
muon’s point of view - or rather, from an inertial frame in which
the muon is at rest. The muon, in its own frame, lives for 2 × 10⁻⁶ s.
From the muon’s point of view, it is the earth that rushes at it.
Does the earth move 5-6 km or more in this small a time? If it
did, it is obvious that the earth would have to move at many times
the speed of light!
So even though the fact that the cosmic ray muons reach the
surface can be explained using time dilation from the earthbound
observer’s reference frame, it still remains a mystery from the muon’s
frame. The only way that you can solve this is if what we earthlings
measure as 5-6 km is only a few hundred meters long for
the muon! This brings us to the topic of the next section - the fact
that moving rods get shortened!
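
A quick back-of-the-envelope check of the two viewpoints, as a sketch only: it uses the round figures quoted above (a proper half-life of 2 × 10⁻⁶ s and γ ∼ 10), rounds c to 3 × 10⁸ m/s, and treats the half-life loosely as "the" lifetime, as the text does. The variable names are mine.

```python
C = 3.0e8                    # speed of light in m/s (rounded)
PROPER_HALF_LIFE = 2.0e-6    # seconds, in the muon's rest frame
GAMMA = 10.0                 # rough time dilation factor for cosmic ray muons

beta = (1.0 - 1.0 / GAMMA ** 2) ** 0.5   # speed as a fraction of c

# Earth frame: the muon's clock runs slow, so it lives GAMMA times longer.
earth_frame_range = beta * C * GAMMA * PROPER_HALF_LIFE

# Muon frame: the earth rushes up for only one proper half-life.
muon_frame_travel = beta * C * PROPER_HALF_LIFE

# The two pictures agree if the muon sees the height shortened by GAMMA.
contracted_height = earth_frame_range / GAMMA

print(earth_frame_range)     # just under 6000 m
print(muon_frame_travel)     # just under 600 m
print(contracted_height)
```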
This illustrates a very important theme in relativity theory - two
time t1 bounces off the far end of Bob’s rod and returns to her. The
second flash, sent out at time t2 according to Alice’s clock, bounces
off the near end of the rod (which is at Bob’s location) at a time Kt2
(as read by Bob’s clock) to reach Alice at a time K²t2 (according to
her clock).
We have to exercise a bit more care to figure out the timings
on the first beam. It reaches Bob, of course, at a time Kt1 according
to Bob’s clock. Then it moves on to bounce off the other end of
the rod. Since the speed of the beam is c with respect to Bob, too,
the time it takes to come back to him must be 2l0/c. Thus, on its
way back, the beam meets Bob at the time Kt1 + 2l0/c, once again,
as read by Bob’s clock. This beam reaches Alice at a time (read by
her) that is larger by the factor K again, at K(Kt1 + 2l0/c). Figure 3.3
shows the times that Bob measures in blue, while those that Alice
measures are shown in purple.
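The groundwork is now done - all that is left is to use our basic
equations (2.2-a) and (2.2-b) to figure out the difference in time and
position of the two bounces as seen by Alice. It is easy to see that
the time gap between the two bounces is
$$\Delta t = \frac{1}{2}\left[K\left(Kt_1 + \frac{2l_0}{c}\right) + t_1\right] - \frac{1}{2}\left(K^2 + 1\right)t_2 = \frac{1}{2}\left(K^2 + 1\right)(t_1 - t_2) + \frac{Kl_0}{c} \tag{3.6}$$
In the same way, (2.2-b) gives the gap in position between the two
bounces as
$$\Delta x = \frac{c}{2}\left(K^2 - 1\right)(t_1 - t_2) + Kl_0$$
Alice reads off the length of the rod from the positions of its two
ends at the same time, so we need ∆t = 0, which means
$$t_1 - t_2 = -\frac{2K}{K^2 + 1}\,\frac{l_0}{c} \tag{3.8}$$
The length l that Alice measures is then
$$l = Kl_0 - \frac{K^2 - 1}{K^2 + 1}\,Kl_0 = \frac{2K}{K^2 + 1}\,l_0$$
The factor $\frac{2K}{K^2 + 1}$ is the reciprocal of one we have met before, in (3.2). So, we can
rewrite the formula for l as
$$l = \frac{1}{\gamma}\,l_0 = l_0\sqrt{1 - \beta^2} \tag{3.9}$$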
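
The radar bookkeeping behind (3.9) can be replayed numerically. The sketch below is an illustration under assumptions of my own: units with c = 1, β = 0.6, a rod of proper length 1, and the second flash sent at t2 = 5. It uses K = √((1 + β)/(1 − β)) and picks t1 from (3.8) so that Alice times both bounces at the same instant.

```python
import math


def k_factor(beta):
    """Bondi K factor, obtained by solving (2.3) for K."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def rod_ends_by_radar(t1, t2, l0, beta, c=1.0):
    """Alice's (t, x) for the bounces off the far and near ends of Bob's rod."""
    K = k_factor(beta)
    far_return = K * (K * t1 + 2.0 * l0 / c)    # first flash back at Alice
    near_return = K * K * t2                    # second flash back at Alice
    far = (0.5 * (far_return + t1), 0.5 * c * (far_return - t1))
    near = (0.5 * (near_return + t2), 0.5 * c * (near_return - t2))
    return far, near


def simultaneous_t1(t2, l0, beta, c=1.0):
    """Emission time t1 that makes the two bounces simultaneous, from (3.8)."""
    K = k_factor(beta)
    return t2 - (2.0 * K / (K * K + 1.0)) * l0 / c


beta, l0, t2 = 0.6, 1.0, 5.0
t1 = simultaneous_t1(t2, l0, beta)
far, near = rod_ends_by_radar(t1, t2, l0, beta)
length = far[1] - near[1]

assert math.isclose(far[0], near[0])                         # same time for Alice
assert math.isclose(length, l0 * math.sqrt(1.0 - beta ** 2))  # (3.9)
print(length)
```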
seconds away from Alice, according to Alice - Alice is only 4.8 light-
seconds away according to Bob. There is nothing wrong with this
- after all, Alice reckons Bob’s distance by noting the coordinates
of her clock that is just next to Bob at that instant - this space
interval is fixed as far as Alice is concerned, but it is moving from
Bob’s point of view. So, Bob sees it shortened by a factor of 1/1.25 = 0.8
- which is why his reckoning of the distance is 4.8 light-seconds!
Length contraction, again, is just what the doctor ordered for
the case of the undying muon! From the muon’s point of view, it
decays in its proper lifetime, in which the ground rushes towards
it to cover the distance from the upper atmosphere to the earth’s
surface. This is a distance that we on the earth measure to be
about 5-6 km, but to the muon it is shortened to about 500-600
meters - just the sort of distance the earth is expected to cover
before the muon decays!
Of course, Alice does not have to measure the two endpoints
of Bob’s rod at the same time to get its length correct. If there is a
time gap of ∆t between the two measurements, all she has to do is
correct for the distance v∆t that the rod would have moved in the
interim. Thus, another way to figure out the length l would be
l = ∆x − v∆t
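
As a small illustration of this correction, the sketch below has Alice fix the two ends of the rod at different times. The numbers are hypothetical: units with c = 1, a rod moving at v = 0.6 whose near end passed Alice at t = 0, and a contracted length of 0.8 for a rod of proper length 1. The function names are mine.

```python
import math

V = 0.6            # speed of the rod in Alice's frame (c = 1)
L_ALICE = 0.8      # length Alice should find for a rod of proper length 1


def near_end(t):
    """Alice-frame position of the near end of the rod at her time t."""
    return V * t


def far_end(t):
    """Alice-frame position of the far end of the rod at her time t."""
    return V * t + L_ALICE


def corrected_length(x_far, t_far, x_near, t_near, v):
    """Length from two fixes taken at different times: l = dx - v * dt."""
    dx = x_far - x_near
    dt = t_far - t_near
    return dx - v * dt


# Alice fixes the near end at t = 10 and the far end only at t = 12.
l = corrected_length(far_end(12.0), 12.0, near_end(10.0), 10.0, V)
assert math.isclose(l, L_ALICE)
print(l)
```

The raw gap ∆x between the two fixes is 2.0 here, far more than the rod's length; subtracting v∆t = 1.2 removes the distance the rod moved in the interim.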
Note that in the last statement I was being very, very careful.
I said ‘says’ - not ‘sees’. To see an object, light must travel from
that object to one’s eyes. What you see is the result of light falling
on your retina at a given instant. Light travelling from different
parts of an object actually travels for different lengths of time before
reaching your eye - so it must have started out at different
times. Of course, this does not bother us in our daily lives - the
times involved are just too small to make a difference. However, for
rapidly moving objects, this small time difference could result in
large shifts - leaving quite a different image of the object! The visual
appearance of rapidly moving objects is quite an involved topic
- even the great George Gamow got it wrong in the first edition of
his wonderful “Mr. Tompkins in Wonderland”! I will say a little bit
on this topic a while later.
3.2.2 What about the other directions?
So far we have been sticking to one spatial dimension. This is as
good a point as any in which to start worrying about the other
directions as well. We have seen that a moving rod gets shortened,
but that proof was only for a rod that is aligned along the
direction of motion. Indeed, the more perceptive among you may be
beginning to feel that there was something wrong with my photon
clock derivation of time dilation. What I had assumed there is that
the vertical distance between the two mirrors in figure 3.1 stays
the same for both Alice and Bob at $l_0$. At that stage, we had not
heard of length contraction, so that we had no reason to be
worried about this assumption! Now that we have learned that moving
objects contract, it is only natural to be quite suspicious of this
assumption.
Remember, though, that before talking about the photon clock I
had already proved the time dilation equation (3.1) by using the K
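[Figure 3.4: the photon clock turned parallel to the direction of motion. Panel (b), Alice’s view: the mirrors are at A1, A2 when the light starts out, at B1, B2 when it reaches the right hand mirror, and at C1, C2 when it returns; the marked distances are ct1, vt1, l and l0.]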
the clock around so that this line is now parallel to the motion?” Of
course, if we use this setup with proper care, this should give us
an alternative derivation of the length contraction equation, (3.9).
As far as Bob is concerned, the clock is stationary - so all that
light does is travel a distance of $l_0$ on both halves of the journey,
taking a time of $2l_0/c$. What does Alice see? Figure 3.4b shows the
whole process of light travelling from one mirror to the other one
and back from Alice’s point of view. I have indicated the initial
position of the two mirrors, when the light beam starts out from
the mirror on the left, as $A_1$ and $A_2$, respectively. By the time light
reaches the right hand mirror, the two mirrors have shifted to $B_1$
and $B_2$, respectively. Finally, when the light beam returns to the
left hand mirror, their respective positions are $C_1$ and $C_2$. The gaps
$A_1A_2$, $B_1B_2$ and $C_1C_2$ are each equal, of course, to the length of the
photon clock as measured by Alice. Prompted by twenty-twenty
hindsight, I will call this length l and not assume beforehand that
it is the same length $l_0$ that Bob sees.
The distances travelled by light and the mirrors are indicated in
figure 3.4b, where t1 is the time, according to Alice, that light takes
to travel from the left hand mirror at A1 to reach the right-hand one
at B2 . From the figure it is obvious that
$$ ct_1 = l + vt_1 $$
$$ t_1 = \frac{l}{c-v}. $$
In the same way, it is obvious that the time t2 that the light beam
takes to get back to the left hand mirror, now at C1 is given by
$$ t_2 = \frac{l}{c+v}. $$
So, Alice reckons that the round trip takes light a total time of
$$ t = \frac{l}{c-v} + \frac{l}{c+v} = \frac{2l}{c}\times\frac{1}{1-\frac{v^2}{c^2}} = \gamma^2\,\frac{2l}{c}. \qquad (3.10) $$
So, now we know that Bob reckons that the time taken for the
round trip by light is $2l_0/c$, while Alice reckons that it is $\gamma^2\,2l/c$. Should
these two times be equal? Although in pre-relativity days the
answer would have been an unequivocal “Yes!”, today we know better!
After all, as far as Bob is concerned, the round trip taken by light
begins and ends at the same place, namely his origin. So, the time
that Bob sees is the proper time interval that elapsed during the
trip. So, Alice must see a dilated time, given by $\gamma\times 2l_0/c$ - but we
already know that this is $\gamma^2\,2l/c$. This leads immediately to
$$ l = l_0\times\frac{1}{\gamma} = l_0\sqrt{1-\beta^2} $$
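The argument above can be checked with a few lines of arithmetic. The following is only a sketch under stated assumptions: units with c = 1, a sample speed of 0.6 c and a clock of unit rest length are illustrative choices, and the function name `round_trip_time` is made up for this example.

```python
import math

def round_trip_time(l, v, c=1.0):
    """Alice's round-trip time for light between two mirrors a distance l
    apart, when the clock moves at speed v along the line joining them."""
    t_out = l / (c - v)    # light chases the receding right hand mirror
    t_back = l / (c + v)   # the left hand mirror moves to meet the light
    return t_out + t_back

c = 1.0     # units with c = 1 (assumption)
v = 0.6     # sample speed (assumption)
l0 = 1.0    # rest length of the photon clock (assumption)
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)

t_bob = 2.0 * l0 / c                         # proper time of the round trip
l = l0 / gamma                               # contracted length, l = l0 / gamma
t_alice = round_trip_time(l, v, c)           # comes out as gamma * t_bob
t_uncontracted = round_trip_time(l0, v, c)   # comes out as gamma**2 * t_bob

print(t_alice, gamma * t_bob, t_uncontracted)
```

Only the contracted length reproduces the dilated time; using the rest length instead overshoots by one more factor of γ.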
such things that become apparent only for very fast objects! That
is not where the paradox is. The paradox becomes apparent when
you look at this from Bob’s point of view. To Bob, the car is at
its rest length of 10 m, it is the garage that is now moving at 0.6 c,
making it even shorter - 6.4 m to be precise! So, how on earth can
the car fit into the garage, when you see this from Bob’s point of
view?
To see how this paradox can be resolved, let’s examine just what
it means when Alice says that the car fits in her garage. Of course,
she means that the front end of the car hits the back wall of the
garage and the back end crosses the front gate simultaneously.
There’s our clue! As I have been stressing over and over again, the
major new thing in STR is the fact that time measurements differ
from observer to observer. So Bob does agree with Alice that the
front end of the car hits the back wall and the rear end crosses the
front gate - it’s only that he refuses to accept that the two events
occur at the same time! You will have to wait until the next section,
though, to see that this disagreement about which events occur
together and which do not is just enough to explain the difference
in the two friends’ points of view.
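The numbers in this paradox can be put through a quick consistency check. This is a sketch, not part of the original argument: the 8 m rest length of the garage is an assumption inferred from the quoted 6.4 m and the factor 1.25, and the simultaneity shift γβ∆x/c is borrowed from the result derived in the next section.

```python
import math

C = 299_792_458.0          # speed of light in m/s

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

car_rest = 10.0            # rest length of the car (from the text)
garage_rest = 8.0          # assumption: inferred from 6.4 m x 1.25

car_for_alice = car_rest / gamma       # car as measured by Alice
garage_for_bob = garage_rest / gamma   # garage as measured by Bob

# Bob's view: when the front of the car meets the back wall, the rear of
# the car is still outside the gate by this much ...
overhang = car_rest - garage_for_bob
# ... and the gate, moving at 0.6 c, needs this long to reach the rear end.
delay_for_bob = overhang / (beta * C)

# Simultaneity shift for two events that Alice calls simultaneous and
# that are one garage length apart in her frame.
delay_from_shift = gamma * beta * garage_rest / C

print(car_for_alice, garage_for_bob, delay_for_bob, delay_from_shift)
```

The two delays agree, which is the sense in which the disagreement about simultaneity is "just enough" to reconcile the two points of view.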
observer - so this may not come as that big a shock to you, after
all.
Just how big is this difference in time measurements? You will
hardly have to work at all to find this out, since I did almost all the
work in the last section! Remember, Alice had to measure the two
ends of Bob’s rod simultaneously in order to measure its length
l. Do cast your mind back to how Alice measures the length -
also, refer to figure 3.3. According to Bob, the two bounces from
the two ends of the rod happened at times of $Kt_1 + l_0/c$ and $Kt_2$,
respectively. So, the time difference between these two bounces
that Bob measures is $K(t_1 - t_2) + l_0/c$. Now, as we have seen in (3.8),
the two bounces will be simultaneous to Alice if
$t_1 - t_2 = -\frac{2K}{K^2+1}\frac{l_0}{c}$. In
this case, the time interval that Bob will see between them becomes
$$ \frac{l_0}{c} - \frac{2K^2}{K^2+1}\frac{l_0}{c} = -\frac{K^2-1}{K^2+1}\frac{l_0}{c} = -\beta\frac{l_0}{c} $$
Since $l_0 = \gamma l$, in terms of the length l that Alice measures this is
$$ \Delta t_{Bob} = -\beta\gamma\frac{l}{c} = -\frac{\frac{v}{c^2}\,l}{\sqrt{1-\frac{v^2}{c^2}}} \qquad (3.11) $$
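A numerical spot check of the algebra leading to (3.11) may be reassuring. The sketch below assumes units with c = 1, the relation K² = (1 + β)/(1 − β) between the Doppler factor and the speed, and sample values β = 0.6 and l0 = 5; the function names are illustrative only.

```python
import math

def doppler_k(beta):
    """Doppler factor K for relative speed beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def bob_interval(beta, l0, c=1.0):
    """Time gap Bob assigns to two bounces that Alice calls simultaneous."""
    K = doppler_k(beta)
    # Condition (3.8) on Alice's emission times t1 - t2 for simultaneity.
    t1_minus_t2 = -(2.0 * K / (K ** 2 + 1.0)) * l0 / c
    return K * t1_minus_t2 + l0 / c

beta = 0.6     # sample speed (assumption)
l0 = 5.0       # rest length of Bob's rod (assumption)
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
l = l0 / gamma                        # length that Alice measures

dt_bob = bob_interval(beta, l0)
dt_from_311 = -beta * gamma * l       # right hand side of (3.11), c = 1

print(dt_bob, dt_from_311)
```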
marks of the lightning strike and himself, than Alice does - that’s
length contraction for you! However, as Alice sees the strikes to
be at equal distances from her on both sides, then so should Bob
- length contraction does not distinguish between left and right.
Let me assume that these two lightning strikes are simultaneous
with respect to Alice. Then, since Alice is sitting midway between
the two flashes, light from the two of them, travelling with equal
speeds c, will reach her at the same instant. Of course, Bob, who
has been moving all the time to the right, meets the flash coming
in from the right hand lightning strike before it reaches Alice, while
light from the other strike does not reach him until after it has
reached Alice and crossed her. Before the days of Einstein, both
friends would have passed this over by saying that light from the
right hand strike is moving at a speed of c + v with respect to
Bob, while that from the left hand strike is moving at a speed of
only c − v, so it is no wonder that even though the two flashes
occur together (remember - no one had an inkling that simultaneity
is relative before Einstein), Bob sees the right hand strike earlier.
However, once we know that light travels at the same speed for all,
this escape route is out. Since both flashes are equally distant from
Bob, and he sees the right hand flash first, he must reckon that it
had occurred earlier! So, the flashes - which are simultaneous
with respect to Alice are not simultaneous to Bob - simultaneity is
relative!
What is the gap that Bob sees in between the two strikes? Well,
it is pretty easy to figure out that Alice will see Bob meet the light
from the right hand and left hand flashes after times $\frac{l}{2(c+v)}$ and
$\frac{l}{2(c-v)}$, respectively, where $l/2$ is the distance of each flash from her.
So, Alice will see the time difference between the two light beams
striking Bob as
$$ \frac{l}{2(c-v)} - \frac{l}{2(c+v)} = \frac{vl}{c^2-v^2} = \frac{\beta}{1-\beta^2}\frac{l}{c} = \gamma^2\beta\frac{l}{c} $$
What, then, is the time difference according to Bob? Note that Bob
is present at both the events - so that the time that he measures is
the proper time interval between the two events. This means that
Bob’s value for the time interval is smaller by a factor of γ. Hence
Bob sees the right hand flash $\gamma\beta\,l/c$ ahead of the left hand flash.
Since he is at the midpoint of the positions of the lightning strikes,
he will claim that the right hand flash occurs earlier, by a gap of
$\gamma\beta\,l/c$. This, of course, is the result that I had shown you a while ago
using the radar method.
Using an argument like this we can cover the case that’s not
so easy to deal with in terms of the radar method. What if the
train was moving not along the line joining the two points where
lightning had struck - but rather in a direction perpendicular to
this line? It is rather easy to see that the two flashes would have
reached Bob at the same time in this case, since he keeps on stay-
ing equally distant from their sources. Thus Bob and Alice will
both agree that the flashes are simultaneous!
What if Bob had moved off at an angle to the line joining the
flashes? It seems reasonable to expect that in this case, again,
Bob will see the flash towards which his velocity is slanted to occur
earlier - you might even be so bold as to suggest that for this case,
too, the equation (3.11) will continue to hold, except that instead
of the product vl you must have the scalar product $\vec v\cdot\vec l$ - so
that the equation becomes
$$ \Delta t_{Bob} = -\gamma\,\frac{\vec v\cdot\vec l}{c^2} \qquad (3.12) $$
Using $|\vec A + \vec B|^2 = A^2 + B^2 + 2\vec A\cdot\vec B$, you can rewrite the above as
$$ c^2t_1^2 = v^2t_1^2 - \vec v\cdot\vec l\;t_1 + \frac{l^2}{4} $$
$$ c^2t_2^2 = v^2t_2^2 + \vec v\cdot\vec l\;t_2 + \frac{l^2}{4} $$
Giving
$$ t_1 - t_2 = -\frac{\vec v\cdot\vec l}{c^2-v^2} $$
- which is the difference in the times at which the two light flashes
reach Bob. This doesn’t match equation (3.12) - but that’s only
because it is the time interval as reckoned by Alice. Bob, of course,
measures the proper time interval - throwing in the correction
factor of $\sqrt{1-\frac{v^2}{c^2}}$ gives you exactly (3.12)!
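The two quadratic equations above can also be solved numerically and compared with (3.12). The sketch below assumes units with c = 1, flashes at ±l/2 from Alice at her time zero, and Bob leaving her position at that instant; the sample vectors and the helper names `dot` and `meeting_time` are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def meeting_time(v, source, c=1.0):
    """Alice's time at which Bob (position v*t) meets light emitted at t = 0
    from `source`: the positive root of |v t - source|^2 = c^2 t^2."""
    a = c * c - dot(v, v)
    b = 2.0 * dot(v, source)
    s2 = dot(source, source)
    return (-b + math.sqrt(b * b + 4.0 * a * s2)) / (2.0 * a)

c = 1.0
v = (0.3, 0.4, 0.0)        # Bob's velocity, at an angle to l (assumption)
l = (2.0, 0.0, 0.0)        # vector joining the two flashes (assumption)
half = tuple(x / 2.0 for x in l)

t1 = meeting_time(v, half)                       # flash at +l/2
t2 = meeting_time(v, tuple(-x for x in half))    # flash at -l/2

gamma = 1.0 / math.sqrt(1.0 - dot(v, v) / c ** 2)
dt_alice = t1 - t2
dt_bob = dt_alice / gamma                        # Bob measures proper time
dt_from_312 = -gamma * dot(v, l) / c ** 2        # right hand side of (3.12)

print(dt_alice, dt_bob, dt_from_312)
```

Setting `v` perpendicular to `l` makes both differences vanish, in line with the perpendicular case discussed earlier.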
He of course can not deny the readings on the clocks, what he will
deny though is that the interval as measured by Alice’s clocks is
really 10 seconds! As far as he is concerned, Alice’s clocks are
not synchronized. You should have expected that - synchroniza-
tion involves setting the clocks at zero simultaneously, and events
that are simultaneous to Alice (who did the synchronization on her
clocks) are not simultaneous to Bob!
Now, Alice’s two clocks are separated by the distance that Bob
travels at a rate of 0.6 c in the 10 seconds that Alice sees as the
length of the interval - this of course is 6 light-seconds. Now, think
about Alice setting her clock at her origin and her clock 6
light-seconds away. These two events, that occur at the same time
according to her, are separated by a time gap given by equation (3.11)
as
$$ 0.6 \times 1.25 \times 6\ \mathrm{s} = 4.5\ \mathrm{s} $$
with the clock at Alice’s origin set later. Since the two friends’ clocks
at their respective origins showed zero at the same time, Bob will
have no option other than saying that the second clock being used
by Alice has been set too early, by 4.5 seconds.
To compare how fast Alice’s clocks are running, we must focus
on the time gap between two events as read by any one of them. Let
me take the clock that is 6 light-seconds away, and consider the
two events when it was set to zero - and when Bob came up just
abreast of it. Of course, this clock reads a gap of 10 seconds
between them. However, to Bob, the first event occurred, not at the
time origin, but at t = −4.5 seconds, while the second one occurs
at t = 8 seconds. Hence, as far as Bob is concerned, the gap
between the two events that he sees is not 8 seconds at all, but 12.5
seconds. No wonder, then, that he says that Alice’s clock is going
slow, by a factor of 12.5/10 = 1.25!
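The bookkeeping in this example is easy to lose track of, so here it is spelled out as a short calculation. This is a sketch in units of seconds and light-seconds; the variable names are made up for the illustration.

```python
import math

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

alice_interval = 10.0                    # reading of Alice's distant clock (s)
separation = beta * alice_interval       # distance between her clocks (light-s)

# Equation (3.11): how much too early Bob says the distant clock was set.
head_start = beta * gamma * separation

bob_watch = alice_interval / gamma       # Bob's own time at the meeting (s)
bob_gap = bob_watch + head_start         # from t = -head_start to t = bob_watch

slow_factor = bob_gap / alice_interval   # Bob's verdict on Alice's clock rate

print(separation, head_start, bob_watch, bob_gap, slow_factor)
```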
$$ K_3 = K_1K_2 \qquad (3.13) $$
Indeed, as we will see over and over again, this simple result is
the reason why the Doppler factor is more convenient to use than
the speed in relativistic contexts. We have already seen, near the
end of section 2.4 that the Doppler factor is more directly amenable
to experimental determination. Be that as it may, speeds being
more familiar, we would definitely like to write (3.13) in terms of
v1 , v2 and v3 . Using equation (2.4), this is readily done:
$K_3^2 = K_1^2K_2^2$ implies
$$ \frac{1+\beta_3}{1-\beta_3} = \frac{1+\beta_1}{1-\beta_1}\times\frac{1+\beta_2}{1-\beta_2} $$
which leads readily to
$$ \beta_3 = \frac{(1+\beta_1)(1+\beta_2) - (1-\beta_1)(1-\beta_2)}{(1+\beta_1)(1+\beta_2) + (1-\beta_1)(1-\beta_2)} = \frac{\beta_1+\beta_2}{1+\beta_1\beta_2} $$
$$ v_3 = \frac{v_1+v_2}{1+\frac{v_1v_2}{c^2}} \qquad (3.14) $$
v3 = c !
$$ u' = \frac{u-v}{1-\frac{uv}{c^2}} \qquad (3.16) $$
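The equivalence between multiplying Doppler factors, (3.13), and the velocity addition formula (3.14) can be tried out numerically. The following sketch works with β = v/c and assumes the relation K² = (1 + β)/(1 − β); the function names are illustrative.

```python
import math

def k_from_beta(beta):
    """Doppler factor for a relative speed beta = v/c (|beta| < 1)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def beta_from_k(K):
    """Inverse relation: speed (in units of c) from the Doppler factor."""
    return (K * K - 1.0) / (K * K + 1.0)

def add_by_doppler(beta1, beta2):
    """Compose two collinear velocities by multiplying K factors, (3.13)."""
    return beta_from_k(k_from_beta(beta1) * k_from_beta(beta2))

def add_by_formula(beta1, beta2):
    """Relativistic velocity addition, (3.14), written in terms of beta."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

for b1, b2 in [(0.5, 0.5), (0.9, 0.9), (0.6, -0.6), (0.999, 0.999)]:
    print(b1, b2, add_by_doppler(b1, b2), add_by_formula(b1, b2))
```

However close to 1 the two inputs are, the composed speed stays below 1, while the classical sum would exceed it. Changing the sign of the second speed turns the addition into the subtraction of (3.16).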
$$ D = \frac{(u-v)\,t}{\sqrt{1-\frac{v^2}{c^2}}} $$
Now that we have the distance to Charlie according to Bob, the next
question is, what is the time, again according to Bob, when Charlie
is at this distance? We have seen that Alice’s clock right next to
Charlie is showing a time t at this instant - but by now we know
enough to realize that the time that Bob reckons for this must be
different. Since Alice’s clock is moving past Bob at a speed of v, it
must be running slow by a factor of $\sqrt{1-\frac{v^2}{c^2}}$. Just compensating
for this will tell us that Bob reckons the time to be
$$ T = \frac{t}{\sqrt{1-\frac{v^2}{c^2}}} $$
$$ u' = \frac{D}{T} = \frac{(u-v)\,t\big/\sqrt{1-\frac{v^2}{c^2}}}{t\big/\sqrt{1-\frac{v^2}{c^2}}} = u - v $$
- the same as the classical result! Something must have gone very
wrong here - after all, the classical formula for the addition of ve-
locities just can not be right! Apart from anything else - if instead
of Charlie, both Alice and Bob had been observing a flash of light -
this formula would have told you that they would have seen differ-
ent speeds - something that runs smack against relativity’s basic
postulate!
Some of you must have caught on to the source of the trouble
by now - I have simply forgotten to take the disagreement between
the two friends on clock synchronization into account. Alice says
that Charlie has covered the distance ut in time t - because at
t = 0 he was right next to her, while at time t he is at a distance
of ut from her. Though the two clocks that tell her the time at
Charlie’s two locations are different, she is sure that subtracting
their readings does give her the time interval elapsed because she
has synchronized them so that they read zero at the same time.
Bob, on the other hand, will say that the second clock was set to
zero too early - it has a head-start of
$\frac{1}{\sqrt{1-v^2/c^2}}\,\frac{v}{c^2}\times(ut)$ over the clock at
Alice’s origin. This means that the time that has elapsed according
to Bob is not really $\frac{t}{\sqrt{1-v^2/c^2}}$, but rather
$$ T = \frac{t}{\sqrt{1-\frac{v^2}{c^2}}} - \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\,\frac{v}{c^2}\times(ut) = \frac{1-\frac{uv}{c^2}}{\sqrt{1-\frac{v^2}{c^2}}}\;t $$
velocities.
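To see the effect of the synchronization correction in numbers, the two versions of the calculation can be placed side by side. This is a sketch in units with c = 1; the function name and the sample speeds are assumptions made for the illustration.

```python
import math

def charlie_speed_for_bob(u, v, t=1.0, c=1.0):
    """Return (naive, corrected) values of D/T for Charlie's speed as seen
    by Bob, without and with the clock synchronization head-start."""
    root = math.sqrt(1.0 - v * v / (c * c))
    D = (u - v) * t / root                    # Bob's distance to Charlie
    T_naive = t / root                        # time dilation alone
    head_start = (1.0 / root) * (v / c ** 2) * (u * t)
    T = T_naive - head_start                  # corrected elapsed time
    return D / T_naive, D / T

naive, corrected = charlie_speed_for_bob(u=0.8, v=0.6)
expected = (0.8 - 0.6) / (1.0 - 0.8 * 0.6)    # formula (3.16) with c = 1

# A flash of light (u = c) must keep the speed c for Bob as well.
_, light_speed = charlie_speed_for_bob(u=1.0, v=0.6)

print(naive, corrected, expected, light_speed)
```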
As an added bonus we can easily adapt this argument to
understand the situation when Charlie is not moving in the same
direction as Bob is, with respect to Alice. In this case, the formula
for $u'$ certainly stays valid for the x component of Charlie’s velocity
(remember - the X component is special, because that is the
direction in which Bob is moving relative to Alice). Note that you have to
remember to change the u in the term $uv/c^2$ in the denominator to $u_x$,
too - check out the general formula for the relativity of simultaneity
(3.12)! This means that we have
$$ u'_x = \frac{u_x - v}{1-\frac{u_xv}{c^2}} \qquad (3.17-a) $$
fashion. This, despite the fact that distances in this direction are
the same for the two friends. Indeed, it is better to say that the
complication in this case arises exactly because of this - there is
no Lorentz contraction factor that cancels out the time dilation fac-
tor!
$$ t_2 = t + \frac{x}{c} \qquad (3.20-a) $$
$$ t_1 = t - \frac{x}{c} \qquad (3.20-b) $$
$$ t_1t_2 = t^2 - \frac{x^2}{c^2} \qquad (3.21) $$
What makes the above equation very interesting indeed is that the
corresponding product for Bob is $Kt_1\times\frac{t_2}{K} = t_1t_2$ - the same as for
Alice. So we immediately get
$$ t^2 - \frac{x^2}{c^2} = t'^2 - \frac{x'^2}{c^2} \qquad (3.22) $$
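$$ x' = \frac{x - vt}{\sqrt{1-\frac{v^2}{c^2}}} \qquad (3.23-a) $$
$$ y' = y \qquad (3.23-b) $$
$$ z' = z \qquad (3.23-c) $$
$$ t' = \frac{t - \frac{v}{c^2}x}{\sqrt{1-\frac{v^2}{c^2}}} \qquad (3.23-d) $$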
where I have thrown in (3.23-b) and (3.23-c) which simply say that
lengths perpendicular to the direction of relative motion are the
same for both Alice and Bob. These, then, are the transformation
equations that are going to take the place of the Galilean trans-
formations. They had been found on mathematical grounds by H.
Lorentz before Einstein put them on a strong physical footing. This
is why these are known as the Lorentz transformation equations -
and the passage from one inertial observer to another is called a
Lorentz transformation.
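The route from radar times to the transformation equations can be retraced numerically. The sketch below assumes that Bob's radar times for the same event are K t1 and t2/K (as used in the product above), that both observers convert radar times to coordinates in the same way, and that K² = (1 + β)/(1 − β); units have c = 1 by default and the function names are illustrative.

```python
import math

def lorentz_via_radar(t, x, beta, c=1.0):
    """Bob's (t', x') for Alice's event (t, x), built from radar times."""
    K = math.sqrt((1.0 + beta) / (1.0 - beta))
    t1 = t - x / c           # Alice's emission time, (3.20-b)
    t2 = t + x / c           # Alice's reception time, (3.20-a)
    t1_bob = K * t1          # signal passes Bob on the way out
    t2_bob = t2 / K          # echo passes Bob on the way back
    return (t1_bob + t2_bob) / 2.0, c * (t2_bob - t1_bob) / 2.0

def lorentz_direct(t, x, beta, c=1.0):
    """Bob's (t', x') from equations (3.23-d) and (3.23-a)."""
    v = beta * c
    root = math.sqrt(1.0 - beta ** 2)
    return (t - v * x / c ** 2) / root, (x - v * t) / root

t, x, beta = 3.0, 1.5, 0.6                      # sample event and speed
tp_radar, xp_radar = lorentz_via_radar(t, x, beta)
tp, xp = lorentz_direct(t, x, beta)
t_back, x_back = lorentz_direct(tp, xp, -beta)  # undo the boost with -v

print(tp_radar, xp_radar, tp, xp, t_back, x_back)
```

Reversing the sign of the velocity returns the original coordinates, and the combination t² − x²/c² is the same before and after, as (3.22) demands.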
Our basic postulate of relativity boils down to the mathematical
demand that the form of the equations of physics must stay the
same under a Lorentz transformation. In a slightly more convo-
luted fashion, scientists prefer to put this as “The laws of physics
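$$ x = \frac{x' + vt'}{\sqrt{1-\frac{v^2}{c^2}}} \qquad (3.24-a) $$
$$ y = y' \qquad (3.24-b) $$
$$ z = z' \qquad (3.24-c) $$
$$ t = \frac{t' + \frac{v}{c^2}x'}{\sqrt{1-\frac{v^2}{c^2}}} \qquad (3.24-d) $$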
You should check for yourself that solving the Lorentz transforma-
tion equations for (x, y, z, t) yields the same results. Also check that
equations (3.23-a) and (3.23-d) directly lead to (3.22).
$\Delta x' = 0$ and $\Delta t' = t_0$, for the two ticks of Bob’s wristwatch. Using
(3.26-d) immediately leads to
$$ t = \Delta t = \frac{t_0 + \frac{v}{c^2}\times 0}{\sqrt{1-\frac{v^2}{c^2}}} = \frac{t_0}{\sqrt{1-\frac{v^2}{c^2}}} $$
$$ t_0 = \Delta t' = \frac{t - \frac{v}{c^2}\Delta x}{\sqrt{1-\frac{v^2}{c^2}}} $$
The trouble is - here ∆x is the distance between the two ticks that
Alice sees and that is not zero! Putting in $\Delta x' = 0$ in (3.25-a) will
immediately give ∆x = vt (this is obvious directly too - after all,
the watch is moving away from Alice with a speed v), which will
immediately lead to the correct answer. It is much easier, of course,
to use the inverse transformation for this particular case, though -
and it is a cardinal sin to mix the two up!
Bob carries with him a rod of length $L_0$. What is the length L that
Alice measures for it? Since the rod is at rest with respect to Bob, he
will have $\Delta x' = L_0$ for any two events that measure the coordinates
of the two endpoints - irrespective of whether they are
simultaneous or not! On the other hand, as far as Alice’s measurements are
concerned, ∆x = L only if the two measurements are done
simultaneously.
$$ L_0 = \frac{L - v\times 0}{\sqrt{1-\frac{v^2}{c^2}}} $$
Two events are simultaneous with respect to Alice and she sees
them occur at a gap of ∆x = l. The direct transformation equation
(3.25-d) immediately gives
$$ \Delta t' = \frac{0 - \frac{v}{c^2}\times l}{\sqrt{1-\frac{v^2}{c^2}}} = -\beta\gamma\frac{l}{c} $$
$$ \Delta t_{Bob} = -\gamma\,\frac{\vec v\cdot\vec l}{c^2} $$
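The three results recovered above (time dilation, length contraction and the relativity of simultaneity) all come from feeding the right pair of coordinate differences into the same transformation. A minimal sketch, assuming units with c = 1, a sample speed of 0.6 c and illustrative function names:

```python
import math

def boost_interval(dt, dx, v, c=1.0):
    """Direct transformation of coordinate differences: Alice -> Bob."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (dt - v * dx / c ** 2), g * (dx - v * dt)

def inverse_boost_interval(dt_p, dx_p, v, c=1.0):
    """Inverse transformation: Bob -> Alice (the same map with -v)."""
    return boost_interval(dt_p, dx_p, -v, c)

v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v ** 2)

# Time dilation: two ticks of Bob's watch, dx' = 0 and dt' = t0.
t0 = 1.0
dt_alice, dx_alice = inverse_boost_interval(t0, 0.0, v)

# Length contraction: Alice marks both ends at the same time, dt = 0.
L = 4.0
_, L0 = boost_interval(0.0, L, v)      # Bob's dx' is the rest length

# Relativity of simultaneity: dt = 0 and dx = l for Alice.
l = 6.0
dt_bob, _ = boost_interval(0.0, l, v)

print(dt_alice, dx_alice, L0, dt_bob)
```

Using the direct transformation for the watch instead would require putting in ∆x = vt by hand, which is exactly the pitfall described above.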
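$$ u_x = \frac{\Delta x}{\Delta t},\quad u_y = \frac{\Delta y}{\Delta t},\quad u_z = \frac{\Delta z}{\Delta t} $$
$$ u'_x = \frac{\Delta x'}{\Delta t'},\quad u'_y = \frac{\Delta y'}{\Delta t'},\quad u'_z = \frac{\Delta z'}{\Delta t'} $$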
These are of course, the same set of formulae that you have seen
before.
As you can see, the velocity transformations are quite a bit more
complicated than the Lorentz transformations themselves - in par-
ticular, the components in the directions perpendicular to the mo-
tion change in a quite involved manner, while the space coordinates
do not change at all!
It turns out that the components of the vector
$\vec U = \frac{1}{\sqrt{1-u^2/c^2}}\,\vec u$ transform
much more neatly than $\vec u$ itself. Here the u in the square root
is the magnitude of the vector $\vec u$ itself and not the relative speed of
the two observers. What makes this possible is the rather cute
identity
$$ \frac{\sqrt{1-\frac{v^2}{c^2}}\times\sqrt{1-\frac{u^2}{c^2}}}{\sqrt{1-\frac{u'^2}{c^2}}} = 1 - \frac{u_xv}{c^2} \qquad (3.27) $$
which you should be able to prove for yourself after a bit of (slightly)
messy algebra. Check for yourself that this means, for example,
that
$$ U'_y = U_y $$
$$ U'_z = U_z $$
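Before attempting the algebra, the identity (3.27) can be tested on sample numbers. The sketch below assumes units with c = 1 and uses the velocity transformation with Bob moving along X at speed v; the sample velocity and the function names are illustrative.

```python
import math

def transform_velocity(u, v, c=1.0):
    """Velocity components seen by Bob, who moves along X at speed v."""
    ux, uy, uz = u
    root = math.sqrt(1.0 - v * v / (c * c))
    denom = 1.0 - ux * v / (c * c)
    return ((ux - v) / denom, uy * root / denom, uz * root / denom)

def big_u(u, c=1.0):
    """The vector U = u / sqrt(1 - u^2/c^2)."""
    speed2 = sum(comp * comp for comp in u)
    factor = 1.0 / math.sqrt(1.0 - speed2 / (c * c))
    return tuple(factor * comp for comp in u)

def root_factor(u, c=1.0):
    """sqrt(1 - u^2/c^2) for a velocity vector u."""
    return math.sqrt(1.0 - sum(comp * comp for comp in u) / (c * c))

u = (0.3, 0.4, 0.5)     # Charlie's velocity for Alice (assumption, |u| < 1)
v = 0.6                 # Bob's speed along X (assumption)
u_p = transform_velocity(u, v)

lhs = math.sqrt(1.0 - v * v) * root_factor(u) / root_factor(u_p)
rhs = 1.0 - u[0] * v    # right hand side of (3.27) with c = 1

print(lhs, rhs, big_u(u), big_u(u_p))
```

The transverse components of U agree for the two observers even though the transverse components of the ordinary velocity do not.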
[Figures (including figure 3.6): the X, Y, Z axes of the two observers with the relative velocity v, and the rotated axes Xr, Yr, Zr related to them by the rotation R and its inverse R^-1.]
special case, provided we also know how to work out the way
coordinates change under a general rotation. What’s needed is shown
in figure 3.6. The steps are - first, rotate Alice’s coordinate system
so that her new X axis (the one denoted $X_r$ in the figure) aligns
with the direction of Bob’s velocity. Now, use the special Lorentz
transformation to boost up to Bob’s speed - this gives us the
coordinate frame denoted by $X'_r$, $Y'_r$ and $Z'_r$. This, though, is not Bob’s
frame of reference - the speed is right - the orientation is not. To
correct for this, we need the final step - carry out the reverse of the
rotation in the first step. This leaves us with a coordinate system
that is parallel to the original one and is moving with the proper
velocity with respect to it - Bob’s frame!
All that is fine - and abstract! To understand this better let me
show you a concrete example. Instead of trying to figure out the
boost in a very general direction (it will be very difficult to check
whether the result is correct anyway!) - I will show you how to
figure out the equations for a boost along the Z axis.
First, we need a rotation that will take Alice’s X axis into her Z
axis. There are of course many rotations that can do the job for us
- I will use the most obvious - a 90◦ rotation about the Y axis. This
gives
$$ x \to x_r = z $$
$$ y \to y_r = y $$
$$ z \to z_r = -x $$
$$ t \to t_r = t $$
where I have included the (trivial) last line to make the notation
The final step is, of course, carrying out the opposite rotation to
the one in the first step.
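$$ x'_r \to x' = -z'_r = x $$
$$ y'_r \to y' = y'_r = y $$
$$ z'_r \to z' = x'_r = \gamma\,(z - vt) $$
$$ t'_r \to t' = t'_r = \gamma\left(t - \frac{v}{c^2}z\right) $$

$$ x' = x $$
$$ y' = y $$
$$ z' = \gamma\,(z - vt) $$
$$ t' = \gamma\left(t - \frac{v}{c^2}z\right) $$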
$$ x' = R^{-1}L_xR\,x \qquad (3.28) $$
in an obvious notation.
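The three-step recipe (rotate, boost along X, rotate back) can be carried out with explicit 4×4 matrices acting on the column (x, y, z, t). This is a sketch in units with c = 1; the matrix helpers and the names `boost_x` and `boost_z` are written for this illustration and use plain Python lists rather than any matrix library.

```python
import math

def matmul(A, B):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def boost_x(v):
    """Special Lorentz transformation along X, acting on (x, y, z, t)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, 0, 0, -g * v],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [-g * v, 0, 0, g]]

def boost_z(v):
    """Expected result: the boost along Z found in the text."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, g, -g * v],
            [0, 0, -g * v, g]]

# Rotation used in the text: x_r = z, y_r = y, z_r = -x, t_r = t.
R = [[0, 0, 1, 0],
     [0, 1, 0, 0],
     [-1, 0, 0, 0],
     [0, 0, 0, 1]]
R_inv = transpose(R)     # the inverse of a rotation is its transpose

v = 0.6
composite = matmul(R_inv, matmul(boost_x(v), R))   # R acts first
target = boost_z(v)
max_diff = max(abs(composite[i][j] - target[i][j])
               for i in range(4) for j in range(4))
print(max_diff)
```

The order of the factors matters: the rightmost matrix is the one applied first.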
The above description of how to figure out the transformation
equations for a general boost is perfectly correct - but there is,
however, a more direct way of figuring out the result for a
general boost. Just note that in the special Lorentz transformation
equations, all that is special about the x coordinate is that it is the
component of the position vector of an event that happens to be in
the same direction as the relative velocity of Bob and Alice. The fact
that y and z do not change under this transformation can be
restated as - components of the position vector transverse to the relative
velocity do not change. So, it pays off to decompose the position
vector of the event into two pieces, one parallel to $\vec v$ and one
perpendicular to it
$$ \vec r = \vec r_\parallel + \vec r_\perp \qquad (3.29) $$
4 Remember, matrices act to their right and thus in a product AB, it is B that
gets the first shot, not A!
$$ \vec r_\parallel = \frac{\vec r\cdot\vec v}{v^2}\,\vec v \qquad (3.30) $$
$$ \vec r_\perp = \vec r - \vec r_\parallel \qquad (3.31) $$
Check that in the two special cases that we have worked out so
far (~v along the X and the Z axes, respectively) these equations do
reduce to the ones we have found out.
This should immediately tell you how to figure out the equations
for a general Lorentz transformation. The job is almost done - all
that is left is one final rotation - one that will align the coordinate
system parallel to Alice’s to Bob’s actual coordinate system. If I
write the action of this rotation as R, I can write down the equa-
tions for the general Lorentz transformation as
$$ \vec r\,' = R\left[\vec r + \left([\gamma - 1]\,\frac{\vec v\cdot\vec r}{v^2} - \gamma t\right)\vec v\right] \qquad (3.33-a) $$
$$ t' = \gamma\left(t - \frac{\vec v\cdot\vec r}{c^2}\right) \qquad (3.33-b) $$
where $\vec r_0$ and $t_0$ are obviously constants that represent the space
and time translations. These final transformations are called the
Poincaré transformations in honour of the great French polymath
Henri Poincaré. They are also known, for obvious reasons, as
inhomogeneous Lorentz transformations. In summary, then, the
Poincaré transformations, which express the most general
connection between two inertial observers, comprise boosts, rotations
and spacetime translations.
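The vector form of the boost in (3.33-a) and (3.33-b) lends itself to a direct numerical check. The sketch below takes the final rotation R to be the identity and leaves out the translations, works in units with c = 1 by default, and uses an illustrative function name and sample values.

```python
import math

def general_boost(r, t, v, c=1.0):
    """Boost with velocity vector v (non-zero), rotation R = identity.
    Returns (r', t') following (3.33-a) and (3.33-b)."""
    v2 = sum(x * x for x in v)
    g = 1.0 / math.sqrt(1.0 - v2 / (c * c))
    v_dot_r = sum(a * b for a, b in zip(v, r))
    coeff = (g - 1.0) * v_dot_r / v2 - g * t
    r_new = tuple(ri + coeff * vi for ri, vi in zip(r, v))
    t_new = g * (t - v_dot_r / (c * c))
    return r_new, t_new

r, t = (1.0, 2.0, 3.0), 0.5          # sample event (assumption)

rx, tx = general_boost(r, t, (0.6, 0.0, 0.0))   # v along X
rz, tz = general_boost(r, t, (0.0, 0.0, 0.6))   # v along Z
ro, to = general_boost(r, t, (0.3, 0.4, 0.5))   # v in a general direction

def interval(r, t, c=1.0):
    return c * c * t * t - sum(x * x for x in r)

print(rx, tx, rz, tz, interval(r, t), interval(ro, to))
```

For v along X or Z the output reduces to the special cases worked out earlier, and the combination c²t² − r² comes out the same before and after the boost in a general direction.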
$$ t^2 - \frac{x^2+y^2+z^2}{c^2} = t'^2 - \frac{x'^2+y'^2+z'^2}{c^2} $$
$$ c^2t^2 - r^2 = c^2t'^2 - r'^2 $$
where $r = \sqrt{x^2+y^2+z^2}$ is the distance between the point where
the event occurred and the origin. Since the differences of the
spacetime coordinates of two events transform in the same way as the
coordinates themselves, it is easy to see that the interval
between two events, defined by5
Before you start out on a search for such a function, let me point
out that in general the relative speed of Charlie with respect to Alice
must depend on the angle between ~u and ~v . So the right hand side
of the equation above depends on this angle, while the left hand
side does not depend on it at all! The only way in which this can
be true is if λ is a constant - and this means that $\lambda^2 = \lambda$, leading to
λ = 1.
Let me stress once again that this proof is very general - all that
is involved is moving from one inertial observer to another. Perhaps
this is flogging a dead horse, but let me show you one more proof of
the invariance of the interval. In subsection 3.5.2 I showed you that
the most general such transformation is one that involves a boost
in a general direction followed by a rotation. Indeed, you can figure
out the transformations from the special Lorentz transformations,
preceded and followed by rotations. I have already shown above
that the interval stays the same under a special Lorentz transfor-
mation. Rotation does not change either time or length, leaving the
interval invariant. Thus, it is easy to see that the interval does not
change for a general Lorentz transformation.
The invariance of the interval goes through in the case of Poincaré
transformations, too. Note that since the constant spacetime
translations cancel out, both $\Delta\vec r$ and ∆t for the Poincaré
transformations are the same as those for the Lorentz transformations - and that
is all you need for the interval to be invariant.
A word on the notation. I have called $c^2\Delta t^2 - \Delta r^2$ the interval.
From the experience that you have in ordinary geometry, where
the square root of $\Delta x^2 + \Delta y^2 + \Delta z^2$ is the spatial length, you may
think that it is better to call the square root of this quantity the
interval. The trouble is, of course, that the invariant combination
above need not be positive - it is negative for two events that occur
far enough apart in space so that ∆r exceeds c∆t. I will play it safe
(most people do) and keep on calling the expression in (3.35) the
interval.
Another point about the notation - instead of calling $c^2\Delta t^2 - \Delta r^2$
the interval, I could just as well have reserved this name for the
quantity $\Delta r^2 - c^2\Delta t^2$. Which one to use is primarily a matter of
preference and both conventions are in use. In these lectures I will
use the former exclusively - but do check the convention that is
being used when you read any other book on STR.
ing “two events are timelike” instead of the more long-winded “two
events are separated by a timelike interval”.
You may be wondering why I am harping so much on the sign of
the interval - and dividing up event pairs into just three categories
depending on whether it is positive, negative or zero - when not
only just the sign but the entire value of the interval is the same
for all observers. Why not divide spacetime into an infinite num-
ber of classes - each class consisting of event pairs separated by
the same value of the interval? All inertial observers will certainly
agree on whether a particular pair of events belong to a particu-
lar class. In the following paragraphs I will give you a few reasons
why the division of spacetime events into light-like, spacelike and
timelike pairs is physically significant. However, the real physical
significance of this division will emerge only when we talk about
causality in section 3.9.
It directly follows from the definition of the interval that two
events that occur at the same place (but at different times) to Alice,
have a positive interval between them. Since Bob must measure
the same positive interval between them, even though he will see
them to occur at different places, it follows that such events are
timelike for everybody. Again, if Alice sees two events occur simul-
taneously but at different places, then the two events are spacelike
for everybody. Turning this on its head, it is easy to argue
that if Alice sees a timelike interval between two events, then Bob
will never see them to be simultaneous - or if Alice sees a space-
like interval between two events then Bob will never see them as
occurring at the same place.
Again, if Alice sees two events have a timelike interval - it is
possible for Bob to move with such a speed that the two events will
occur at the same place with respect to him. Note that this means
~v · ∆~r
γ = ∆t
c2
Consider two ticks of Bob’s watch. The time interval that he
measures between them is the proper time interval ∆τ of these two
events, while the spatial interval is, of course, 0. What does Alice
measure for the time interval ∆t between the two ticks? The invariance
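$$ c^2\Delta\tau^2 = c^2\Delta t^2 - \Delta r^2 $$
where ∆r, the spatial separation between the two ticks as seen by
Alice, is related to ∆t by ∆r = v∆t. This means
$$ \Delta\tau^2 = \Delta t^2\left(1 - \frac{\Delta r^2}{c^2}\times\frac{1}{\Delta t^2}\right) = \Delta t^2\left(1 - \frac{v^2}{c^2}\right) $$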
$$ u'_1 = \frac{u_1 - v}{1 - u_1v/c^2} $$
$$ u'_2 = \frac{u_2\sqrt{1 - v^2/c^2}}{1 - u_1v/c^2} $$
$$ u'_3 = \frac{u_3\sqrt{1 - v^2/c^2}}{1 - u_1v/c^2} $$
Carrying out the differentiation for this as well as the other two
components yields the acceleration transformations
$$ a'_1 = \frac{(1 - v^2/c^2)^{3/2}}{(1 - u_1v/c^2)^3}\,a_1 \qquad (3.37-a) $$
$$ a'_2 = \frac{(1 - v^2/c^2)}{(1 - u_1v/c^2)^3}\left[a_2 - \frac{v}{c^2}\,(u_1a_2 - u_2a_1)\right] \qquad (3.37-b) $$
$$ a'_3 = \frac{(1 - v^2/c^2)}{(1 - u_1v/c^2)^3}\left[a_3 - \frac{v}{c^2}\,(u_1a_3 - u_3a_1)\right] \qquad (3.37-c) $$
$$ u = u_0 + \alpha t $$
$$ x = x_0 + u_0t + \frac{1}{2}\alpha t^2 $$
$$ \frac{u/c}{(1 - u^2/c^2)^{1/2}} = \frac{\alpha t}{c} $$
where we have assumed that the frame S is the frame in which the
particle is at rest at t = 0. This can be simply rearranged to give
$$ u = \frac{\alpha t}{\sqrt{1 + (\alpha t/c)^2}} \qquad (3.41) $$
Let us compare this solution for the speed of the particle with the
classical result u = αt. As you can see, our solution is very close
to αt when this speed is much smaller than that of light. This is,
of course, only to be expected! However, the deviations show up as
the speed gets closer to that of light. What is gratifying is the fact
that though the classical result can increase without bound, the
expression in (3.41) can never exceed the speed of light!
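A short table of values makes the comparison between (3.41) and the classical result concrete. The sketch below assumes c = 3 × 10^8 m/s and an acceleration of α = 10 m/s²; both numbers and the function names are illustrative choices.

```python
import math

C = 3.0e8        # speed of light in m/s (rounded)
ALPHA = 10.0     # constant proper acceleration in m/s^2 (assumption)

def speed_relativistic(t, alpha=ALPHA, c=C):
    """Speed after coordinate time t, equation (3.41)."""
    return alpha * t / math.sqrt(1.0 + (alpha * t / c) ** 2)

def speed_classical(t, alpha=ALPHA):
    """Classical result u = alpha * t."""
    return alpha * t

day = 86400.0
year = 365.25 * day
for label, t in [("1 day", day), ("1 year", year), ("10 years", 10 * year)]:
    print(label, speed_classical(t) / C, speed_relativistic(t) / C)
```

After a day the two results are practically indistinguishable; after ten years the classical value is many times c while the relativistic one is still just below it.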
$$ \frac{c}{\alpha}\times\frac{0.999}{\sqrt{1 - 0.999^2}} \approx 6.7\times10^{8}\ \mathrm{s} \approx 21\ \text{years!} $$
How long will it take for our astronaut to reach the center of our
galaxy - roughly $2\times10^{20}$ m away? Since the time taken here is very
large, we can safely approximate x by ct and get the huge time
$$ \frac{2\times10^{20}}{3\times10^{8}}\ \mathrm{s} \approx 6.7\times10^{11}\ \mathrm{s} \approx 21000\ \text{years!} $$
This tells us that the 21000 years that we observe the journey
to the center of the galaxy to take corresponds to a proper time
of only about 10 years! So our interstellar traveller can make the
journey and return in about two decades of his time - only to find
all remains of civilization having vanished in the more than 42000
years that have elapsed on earth during his journey!
It may be instructive to write down the expressions for t and x
in terms of the proper time τ. You should be able to show easily
that
$$ t = \frac{c}{\alpha}\sinh\frac{\alpha\tau}{c} \qquad (3.45-a) $$
$$ x = \frac{c^2}{\alpha}\cosh\frac{\alpha\tau}{c} \qquad (3.45-b) $$
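The figures quoted for the trip to the galactic center can be reproduced from (3.45-a) by inverting it for the proper time. This is a sketch under stated assumptions: c = 3 × 10^8 m/s, an acceleration of α = 10 m/s² (a value inferred from the 21 year figure quoted earlier, not stated in this passage), a year of 365.25 days, and continuous acceleration with no slowing down at the destination.

```python
import math

C = 3.0e8                      # speed of light in m/s (rounded)
ALPHA = 10.0                   # proper acceleration in m/s^2 (assumption)
YEAR = 365.25 * 24 * 3600.0    # seconds in a year

def coordinate_time(tau):
    """Equation (3.45-a): earth time t for traveller's proper time tau."""
    return (C / ALPHA) * math.sinh(ALPHA * tau / C)

def position(tau):
    """Equation (3.45-b); note that x equals c^2/alpha at tau = 0."""
    return (C * C / ALPHA) * math.cosh(ALPHA * tau / C)

def proper_time(t):
    """Inverse of (3.45-a): proper time tau for earth time t."""
    return (C / ALPHA) * math.asinh(ALPHA * t / C)

distance = 2.0e20                        # metres to the galactic center
t_earth = distance / C                   # approximating x by c t
tau = proper_time(t_earth)
covered = position(tau) - position(0.0)  # distance actually covered

print(t_earth / YEAR, tau / YEAR, covered)
```

The earth time comes out near 21000 years and the proper time near 10 years, and the distance covered agrees with the 2 × 10^20 m target to well within the accuracy of the x ≈ ct approximation.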
∠T OR = ∠ORQ − ∠OT R
= ∠OQR − ∠OSQ
= ∠SOQ
Note that this relation must hold for all points R that depict events
that are simultaneous with O as seen by Bob. This means that
Bob’s X axis is the straight line OR that makes the same angle
with Alice’s X axis that his time axis makes with Alice’s! I leave it
as an exercise for you to show that all Bob’s lines of simultaneous
events turn out to be straight lines parallel to this one. Thus, one
half of Bob’s coordinate grid consists of equi-spaced straight lines,
each inclined to Alice’s X axis at the same angle as his worldline is
inclined to her time axis.
As for the other half of Bob’s coordinate grids - they can ob-
viously be thought of as the worldlines of particles at rest with
respect to Bob, placed at equal intervals with respect to him. So,
this grid consists of a set of equi-spaced straight lines each parallel
to Bob’s worldline. Figure 3.8 shows Bob’s space and time axes in
relation with Alice’s coordinate grid on the left. On the right I have
drawn Bob’s coordinate grid - where I have also shown Alice’s axes
for reference. You can graphically see that which two events are
simultaneous and which are not is dependent on the observer from
the fact that Bob’s space axis is tilted with respect to Alice’s.
Figure 3.8 may appear to reveal an asymmetry between the
two friends. Alice’s space and time axes are perpendicular to each
other; Bob’s axes, however, are tilted at a crazy angle to each other!
Why this difference?
The point is - what I’ve drawn for you is really Alice’s spacetime
diagram - I’ve just incorporated Bob’s spacetime axes in it. You could
just as well have drawn Bob’s spacetime diagram directly - with axes
perfectly perpendicular to each other. In that diagram, as can be
guessed, it is Alice’s axes that will turn out to be the tilted ones.
[Figure 3.9: Using a non-perpendicular coordinate grid. The axes are X in meters and t in light-meters, with grid lines numbered 0 to 5; the events O, P and Q are marked.]
You may find the fact that Bob’s spacetime axes are not at a
right angle slightly discomfiting. How does one use such a set of
axes to find the coordinates of any particular event? The coordinate
grid, of course, makes it easy - all you have to do is to find which
two lines of the coordinate grid intersect at the event - they will
immediately tell you what values of x and t the event occurs at. For
example, the point P in the figure has x = 2 m and t = 2 l-m [9]. What
will we do for events like Q that are not at the corners of the grid?
In that case, of course, the grid gives us only an approximate way
of locating the event. As you can see, the event Q has x somewhere
between 3 m and 4 m, and the time at which it occurs is also in the
range 3-4 l-m. You can certainly go in for a finer grid, which will
give you a better approximation. To get a more precise reading, all
you have to do is draw lines parallel to the x and t axes from the
point Q. Where these cut the t and x axes, respectively, gives the
time and position of the event Q.

[9] Here l-m stands for a light-meter, the time in which light travels
a distance of 1 m.

[Figure: Bob’s coordinate grid drawn together with Alice’s axes. Axes:
Alice’s x (m), Alice’s t (l-m), Bob’s x (m), Bob’s t (l-m).]
[Figure: construction on Alice’s spacetime diagram with marked points
O, P, Q, R, T, the X axis, and the angles α, 90° + α and 90° − 2α.]
show quite easily that the scale factor is the same between the two
friends’ time axes. Anyway, this should have been obvious from
the fact that the line showing the propagation of light is equally
inclined to Bob’s axes, just as it was to Alice’s axes.
Armed with a complete description of the spacetime diagram, we
are now in a position to derive the Lorentz transformations
themselves from it! The relevant diagram is figure 3.11. To find the
coordinates of the point P, all Alice has to do is drop
perpendiculars PX and PT to her X and T axes, respectively. Of course,
x = OX and ct = OT. As for Bob, he has to complete the parallelogram
OQ′PR′, where OQ′ and OR′ are along his X and T axes. Then OQ′ and
OR′ will represent his measurements of x′ and ct′ - once account is
taken of the difference in scale between the two friends. I have
extended the lines PQ′ and PR′ to cut Alice’s axes at Q and R,
respectively.
Now, from elementary geometry it follows that
$$\angle QPX = \angle TOR' = \angle TRP = \angle Q'OX = \alpha = \tan^{-1}\frac{v}{c} = \tan^{-1}\beta$$
$$\frac{OQ'}{\sin(90^\circ + \alpha)} = \frac{OQ}{\sin(90^\circ - 2\alpha)}$$
and hence
$$OQ' = OQ \times \frac{\cos\alpha}{\cos(2\alpha)}$$
I leave it as an exercise for you to show that this can be rewritten
as
$$OQ' = \frac{\sqrt{1+\beta^2}}{1-\beta^2}\,(x - vt)$$
A very similar reasoning shows that
$$OR' = \frac{\sqrt{1+\beta^2}}{1-\beta^2}\,(ct - \beta x)$$
We are nearly done - all that is left is to note that Bob’s scale for
units of length and time is larger than Alice’s by a factor
$\sqrt{\frac{1+\beta^2}{1-\beta^2}}$.
Taking this into account it is easy to see that
$$x' = OQ' \div \sqrt{\frac{1+\beta^2}{1-\beta^2}} = \frac{x - vt}{\sqrt{1-\beta^2}}$$
$$ct' = OR' \div \sqrt{\frac{1+\beta^2}{1-\beta^2}} = \frac{ct - \beta x}{\sqrt{1-\beta^2}}$$
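The geometric route to the Lorentz transformations can be checked
numerically. The sketch below is only an illustration under stated
assumptions: the helper name bob_from_diagram is hypothetical, the
event is described by Alice’s pair (x, ct), and Bob’s axes are taken
to be tilted by α = tan⁻¹ β towards the light line, as in the text. It
completes the parallelogram along Bob’s axes, divides by the scale
factor, and compares the outcome with the closed-form result.

```python
import math

def bob_from_diagram(x, ct, beta):
    """Read Bob's (x', ct') for the event P = (x, ct) off Alice's diagram.

    Hypothetical helper: completes the parallelogram along Bob's tilted
    axes and then divides by the scale factor quoted in the text.
    """
    alpha = math.atan(beta)                    # tilt angle, tan(alpha) = beta
    ex = (math.cos(alpha), math.sin(alpha))    # unit vector along Bob's X axis
    et = (math.sin(alpha), math.cos(alpha))    # unit vector along Bob's T axis
    det = ex[0] * et[1] - et[0] * ex[1]        # equals cos(2 * alpha)
    oq = (x * et[1] - ct * et[0]) / det        # side OQ' of the parallelogram
    orr = (ct * ex[0] - x * ex[1]) / det       # side OR' of the parallelogram
    scale = math.sqrt((1 + beta ** 2) / (1 - beta ** 2))
    return oq / scale, orr / scale

def lorentz(x, ct, beta):
    """Closed-form Lorentz transformation for comparison."""
    gamma = 1 / math.sqrt(1 - beta ** 2)
    return gamma * (x - beta * ct), gamma * (ct - beta * x)

for beta in (0.1, 0.5, 0.9):
    print(beta, bob_from_diagram(3.0, 5.0, beta), lorentz(3.0, 5.0, beta))
```

The two columns printed for each β should agree up to rounding, which
is just the statement that cos α / cos 2α divided by the scale factor
equals 1/√(1 − β²).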
$$\Delta t - \frac{v\,\Delta x}{c^2} < 0$$
which means that at least one of the two speeds $v_{\rm info}$ and v
must exceed c!
This, in fact, is the strongest reason why nothing can travel
faster than light. If the information that A has occurred could
get to the location of the event B faster than light, then it is possi-
ble that Bob will be in a position to see the effect occur before the
cause, even if his own speed is within the light-speed limit! On the
other hand, if Bob’s less-than-light speed satisfies the inequality
(3.46), then Bob will say that B has preceded A while Alice claims
just the opposite. However, in order that either of the two events
cause the other one, it would be necessary for the information that
the cause has occurred to reach the effect’s site faster than light.
Banning the possibility of faster than light propagation of the sig-
nal saves us from the prospect, then, that one of the two friends
will see the effect occur before the cause!
To round off the discussion, let me stress that although STR
does allow the possibility that the time order between two events
will be different for two different observers, as long as both signals
and observers obey the light speed limit, there is no conflict with
the principle of causality. In the jargon of STR, we say that such
reversal of time order for two observers is possible only for two
events that are not causally connected.
In order to avoid any possible misunderstanding, let me empha-
size that if event B is within the causal future of event A, then it
is possible that A is the cause of (or one of the causes of) event B.
This does not mean that A has to be the cause of B, however! Of
course, A can not be regarded as the cause of the infinite number
of other events that fall inside its future lightcone. On the other
hand, if B is not within the causal future of A, then it is impossible
for A to cause B. A similar set of statements could be made for the
$$\frac{v}{c}\,\Delta x > c\,\Delta t$$
and since $\frac{v}{c} < 1$, this requires
$$\Delta x > c\,\Delta t$$
So, temporal order can be reversed only if the two events are
separated by a spacelike interval. But this means that the distance
between the two events exceeds c times the time interval, i.e. the
distance is more than even light can cover in this time. Thus, two
events separated by a spacelike interval cannot be linked by a
cause-effect relation.
On the other hand, if the interval between a pair of events is
timelike or lightlike, then the time interval is sufficiently large for
a slower-than-light (or light) signal to cover the distance between
them. In this case, it is possible that the earlier event was the
cause of the later one. Note that in this case, all observers (as long
as they stay slower than light) will agree about the temporal order
in which the two events occur - so as long as causality holds for
one observer, it will hold for all of them.
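A quick numerical experiment makes the point concrete. The following
is a minimal sketch, assuming units in which c = 1 and a simple scan
over observer velocities; the function names boosted_time_gap and
order_can_flip are illustrative choices, not anything defined in the
text.

```python
import math

def boosted_time_gap(dt, dx, v, c=1.0):
    """Time separation dt' seen by an observer moving with velocity v."""
    gamma = 1 / math.sqrt(1 - (v / c) ** 2)
    return gamma * (dt - v * dx / c ** 2)

def order_can_flip(dt, dx, c=1.0, steps=999):
    """Scan slower-than-light observers for one who sees the order reversed."""
    for k in range(-steps, steps + 1):
        v = c * k / (steps + 1)   # always |v| < c
        if dt * boosted_time_gap(dt, dx, v, c) < 0:
            return True
    return False

print("spacelike pair:", order_can_flip(dt=1.0, dx=2.0))
print("timelike pair: ", order_can_flip(dt=2.0, dx=1.0))
print("lightlike pair:", order_can_flip(dt=1.0, dx=1.0))
```

The scan is a brute-force illustration rather than a proof; the
inequality above tells you directly that a reversal needs ∆x > c∆t.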
The causal structure of spacetime in relativity can be depicted
very forcefully by means of the lightcone that I showed you earlier.
Figure 3.12 shows the lightcone centered at a particular event -
labelled “Here & Now”. The lightcone comes in two halves - for
obvious reasons the part for positive time is called the future
lightcone, while the other half is labelled the past lightcone. The
future lightcone and its interior contain points which are separated
from the origin by a timelike (or lightlike) interval, and which,
moreover, lie to its future - this is the causal future of the origin.
The event at the origin can be the cause of events in this spacetime
region. Similarly, the past lightcone and its interior constitute the
causal past of the origin. The events here can be the cause of the
event at the origin. The region of spacetime lying outside the
lightcone consists of events that are separated by a spacelike
interval from the event at the origin. The appropriate label that we
can assign such events is elsewhere - no observer will ever see any
of them occur at the same place as the one at the origin.

[Figure 3.12: The lightcone centered at the event “Here & Now”, with
the regions labelled Future and Elsewhere.]
The figure also shows two dashed lines - the X axis and the T
axis. Why have I used dashed lines for these? Well, these lines
are relevant only for a particular observer - as I have shown you
earlier, the corresponding axes will be quite different for any other
observer moving with respect to this one. Note that events that are
in the causal future of the origin occur later than the origin, for
all observers (we are sticking to observers that do not exceed the
speed of light) - so the “future” label assigned to this region is not
specific to a particular observer. The same goes for the causal past.
As we have seen already, events that are separated from the origin
by a spacelike interval have no specific time order - an event in this
region may occur before the origin for one observer, but may very
well occur after it for another observer. This is precisely why we
don’t use the words past or future in the context of these events -
the only observer independent designation that they may be given
is the one that we have given them - elsewhere!
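The three observer-independent labels can be assigned mechanically
from the sign of the interval. Here is a minimal sketch, assuming the
event at the origin is “Here & Now” and that coordinates are supplied
as (ct, x, y, z); the function name classify and the label strings are
illustrative choices.

```python
def classify(ct, x, y=0.0, z=0.0):
    """Label an event relative to the origin using the sign of the interval."""
    s2 = ct ** 2 - x ** 2 - y ** 2 - z ** 2
    if s2 < 0:
        return "elsewhere"       # spacelike separation: no fixed time order
    if ct > 0:
        return "future"          # timelike or lightlike, later than the origin
    if ct < 0:
        return "past"            # timelike or lightlike, earlier than the origin
    return "here and now"        # s2 >= 0 with ct == 0 is the origin itself

for event in [(2, 1), (-2, 1), (1, 2), (1, 1), (0, 0)]:
    print(event, "->", classify(*event))
```

Events on the lightcone itself are counted with the causal future or
past here, following the convention used above.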
So that you can appreciate how different things were in the
pre-Einstein world, I have also drawn the Newtonian view of the
causal structure of spacetime in figure 3.12. In that diagram you
will notice that I have drawn the X axis with a solid line, unlike the
diagram for STR. This is because this line consists of all events that
are simultaneous with the event at the origin - and this designation
is something that every observer agrees with (this is Newtonian
physics - remember?). This line, labelled “now”, separates spacetime
into two halves - the past and the future. Note that since
Newtonian physics places no upper limit on the speed of signal
propagation, the event at the origin could have been caused by any
event in its past, while it could be the cause of any event in its
future. I am running the risk of being subjective here - but I do feel
that the causal structure of STR is much more beautiful than the
rather dull setup that Newtonian physics provides!
Chapter 4
Light - the messenger of relativity!
This is precisely the same as the formula for the Doppler effect
in sound, too - for the observer receding from a source of sound
stationary in air. Of course, for sound, the ratio β will have to
be changed to $v/v_s$, where $v_s$ is the speed of sound. Secondly,
Bob’s clock runs slow, making the time interval between the crests
smaller than that predicted by the first effect. This makes the actual
frequency that Bob sees larger than that predicted classically (but
still smaller than what Alice measures). Taking both factors into
account, the frequency that Bob actually sees is
$$\nu_{\rm Bob} = \nu_{\rm Alice}\,\frac{1-\beta}{\sqrt{1-\beta^2}} = \nu_{\rm Alice}\sqrt{\frac{1-\beta}{1+\beta}} \tag{4.2}$$
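To get a feel for the size of the two effects, you can tabulate the
classical and the relativistic frequency ratios side by side. This is
a minimal sketch; the function names classical_ratio and
relativistic_ratio and the sample values of β are illustrative
assumptions.

```python
import math

def classical_ratio(beta):
    """nu_Bob / nu_Alice from the stretching of the crest interval alone."""
    return 1 - beta

def relativistic_ratio(beta):
    """nu_Bob / nu_Alice including Bob's slow clock, equation (4.2)."""
    return math.sqrt((1 - beta) / (1 + beta))

for beta in (1e-6, 1e-3, 0.1, 0.5):
    diff = relativistic_ratio(beta) - classical_ratio(beta)
    print(f"beta = {beta:g}: classical {classical_ratio(beta):.9f}, "
          f"relativistic {relativistic_ratio(beta):.9f}, difference {diff:.3e}")
```

For small β the difference between the two columns behaves like β²/2,
so the relativistic correction only shows up at second order.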
each individual atom that radiates the light photons are in violent
to and fro motion. Atomic speeds are quite rapid, of the order of
the speed of sound or more. This might seem like a small fraction
of the speed of light - but remember that optical measurements like
that of frequency can be carried out to very high precision.
There is one small hitch, though. In this case, the classical
effect is already quite large - what STR does is that it introduces a
small correction to this result (at least, the correction is small when
the source is not moving very fast). To see this, let me write the
expression for the frequency as seen by Bob as a series expansion
in β, which is almost always rather small, to get
$$\nu_{\rm Bob} = \nu_{\rm Alice}\left(1 - \beta + \frac{1}{2}\beta^2 + \ldots\right) \tag{4.3}$$
where v is Bob’s speed with respect to Alice (as well as the source
of light). Note that Alice receives the peaks at exactly the same
interval as the source of light itself, so the frequency she measures
4.2.1 Aberration
4.2.2 Propagation
Chapter 5
Additional topics in kinematics
if you measure both space and time in the same units (so that, as
I mentioned earlier, the speed of light comes out to be unity). If
you want to stick to the same units that you have been using all
along, the same effect can be obtained by considering the product
ct instead of time.
I am going to use the abbreviation $x^0$ for ct, while I will use
the obvious nomenclature $x^1 = x$, $x^2 = y$ and $x^3 = z$ and a
similar notation for the spacetime coordinates according to the primed
observer. As you can see, I have chosen to write the indices on
the coordinates as superscripts rather than the usual subscripts.
Why? There is a very good reason - one that will hopefully become
clear in a while. For the time being, do pay attention to
the fact that $x^2$ refers to the second component of the quantity x
and not the square of x. If I mean the latter, I will write it
explicitly as $(x)^2$. In terms of these, the Lorentz boost equations
(3.23-a) to (3.23-d) become
$$x' = Lx \qquad\qquad x = L^{-1}x'$$
If you are afraid of matrices, you can also write this in terms of
components
$$x'^{\mu} = \sum_{\nu=0,1,2,3} L^{\mu}{}_{\nu}\, x^{\nu} \tag{5.3}$$
where obviously $L^{\mu}{}_{\nu}$ is the entry at the µ-th row and ν-th
column of the matrix L. You are certainly familiar with this entry
being denoted $L_{\mu\nu}$ - so why this perverse notation - in which
the row index is written as a superscript and the column index as a
subscript? I will explain this in due time - for the time being please
bear with this!
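In case the index notation looks forbidding, here is the sum in (5.3)
spelt out as a minimal Python sketch. It assumes the special Lorentz
boost along X, with coefficients γ and −βγ, and stores the coordinates
in the order (ct, x, y, z); the names boost_matrix and transform are
illustrative.

```python
import math

def boost_matrix(beta):
    """Coefficients L[mu][nu] for the special Lorentz boost along X."""
    g = 1 / math.sqrt(1 - beta ** 2)
    return [[g, -beta * g, 0, 0],
            [-beta * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def transform(L, x):
    """x'^mu = sum over nu of L^mu_nu x^nu, written out as in (5.3)."""
    return [sum(L[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

x = [5.0, 3.0, 1.0, 2.0]              # (ct, x, y, z) for some event
x_primed = transform(boost_matrix(0.6), x)
print(x_primed)
```

The row index µ labels which primed coordinate you are computing, and
the column index ν runs over the unprimed coordinates being mixed.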
I have written down the coefficients explicitly for the special
Lorentz boost. It should be obvious that the compact form of the
transformation equations, (5.3), is valid for the general Lorentz
transformation that I discussed in the last subsection. Of course,
the individual coefficients may be much more complicated than in
our special case. A natural question that will arise is, given a set of
four equations that look like (5.3), can we be sure that they stand
for a Lorentz transformation? You will have to wait a while for the
answer - which I will provide in the next section.
where on the left I have written the matrix form and on the right,
the explicit form in terms of the transformation coefficients (sub-
section 5.2).
If all the signs above had been +, it would be child’s play to write
the interval in a compact form -
$\Delta s^2 = \sum_{\mu=0}^{3} (\Delta x^{\mu})^2$. In terms of
the matrix x, this would have been even more compact [1],
$\Delta s^2 = (\Delta x)^T (\Delta x)$! Of course, all the signs are
not the same - so this simple form is not the right one. However, it
is very easy to see that if you were to introduce another set of four
numbers to coordinatize spacetime,
$$x_0 \equiv x^0 = ct \tag{5.4-a}$$
$$x_1 \equiv -x^1 = -x \tag{5.4-b}$$
$$x_2 \equiv -x^2 = -y \tag{5.4-c}$$
$$x_3 \equiv -x^3 = -z \tag{5.4-d}$$

[1] Here, the superscript T stands for the transpose of a matrix - the
matrix that is obtained from it by interchanging rows and columns.
$$\underline{x} = \eta x \tag{5.9}$$
$$x_{\mu} = \sum_{\nu=0}^{3} \eta_{\mu\nu}\, x^{\nu}$$
where, of course
$$= \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu}\, \Delta x^{\mu} \Delta x^{\nu} \tag{5.11}$$
which you can interpret as the formula that will give you the “length”
of a spacetime coordinate difference - in the same way as the length
of a vector in three dimensions is
$$\sum_{\mu=1}^{3} \Delta x^{\mu} \Delta x^{\mu}$$
$$(\Delta x')^T\, \eta\, (\Delta x') = (\Delta x)^T\, \eta\, (\Delta x)$$
we have
$$(\Delta x)^T L^T \eta L\, (\Delta x) = (\Delta x)^T\, \eta\, (\Delta x)$$
for all values of (∆x). This shows [2] that the matrix L has to obey
$$L^T \eta L = \eta \tag{5.13}$$
$$(\det L)^2 = 1$$
and thus
$$\det L = \pm 1 \tag{5.15}$$
Thus all Lorentz transformations fall into two categories - one with
det L = +1 and the other with det L = −1.
Another property of the Lorentz transformation matrix is easier
to derive from the condition on coefficients, equation (5.14). Writing
the equation for ρ = σ = 0 leads to
$$\sum_{\mu,\nu=0}^{3} \eta_{\mu\nu}\, L^{\mu}{}_{0}\, L^{\nu}{}_{0} = \eta_{00} = 1$$
leading to
$$\left(L^{0}{}_{0}\right)^2 = 1 + \sum_{i=1}^{3} \left(L^{i}{}_{0}\right)^2 \ge 1$$
Putting the conditions (5.15) and (5.16) together we see that all
Lorentz transformations can be grouped under four categories
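These conditions are easy to test on explicit matrices. The sketch
below is a hedged illustration: it assumes the metric
η = diag(+1, −1, −1, −1), uses small hand-written matrix helpers
instead of any library, and labels a transformation simply by the pair
(sign of det L, sign of the 00 entry). The names is_lorentz, det and
category are hypothetical.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def is_lorentz(L, tol=1e-12):
    """Check the defining condition L^T eta L = eta entry by entry."""
    lhs = matmul(transpose(L), matmul(ETA, L))
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def category(L):
    """Pair (sign of det L, sign of the 00 entry): four possibilities in all."""
    return (1 if det(L) > 0 else -1, 1 if L[0][0] > 0 else -1)

def boost_x(beta):
    g = 1 / math.sqrt(1 - beta ** 2)
    return [[g, -beta * g, 0, 0], [-beta * g, g, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]

space_flip = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
time_flip = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
for name, L in [("boost", boost_x(0.6)), ("space flip", space_flip),
                ("time flip", time_flip)]:
    print(name, is_lorentz(L), category(L))
```

The boost, the flip of all space axes and the flip of the time axis
land in three different categories; composing the two flips gives an
example of the fourth.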
variant - it mixes with space in much the same way the coordinates
do when you rotate the coordinate axes.
Before I explain what is meant precisely by the statement that
spacetime is four dimensional, let me take you back into the depths
of a statement that you must be much more comfortable with -
the statement that the space that we see around us is three di-
mensional. Of course, every schoolboy knows that you need three
numbers to locate a point in space. However, we are also used,
especially in dynamics problems, to treating the various space co-
ordinates independently - you write down separate equations for
the X component, the Y component and so on. Why is it, then,
that we do not regard the three components as three distinct quan-
tities, but rather look upon them as manifestations of one basic
object, three faces of the same coin - so to speak?
The answer is obvious - which component is X, which one is Y
... are, after all, artificial choices. I could just as easily have chosen
to orient my coordinate axes in a different way - this would mix all
the previous components together to give the new component. It is
this mixing up that shows that we have to regard the three spatial
coordinates as a single entity, unless we force ourselves to keep on
using one particular coordinate system. The latter would have been
a reasonable option, though, if one coordinate system could have
been somehow different from all others - which is not the case!
In much the same way, I could have kept on regarding time as
completely separate from space if I had one inertial observer whom
I could regard as special. However, the whole point of relativity is
that this can not be done - all inertial observers are created equal!
Add to this the fact that changing over from one inertial observer
to another mixes time up with the space coordinates as well as
vice versa and you have to regard the time along with the space
coordinates as four pieces of the same thing that we call, for want
of a better name, spacetime.
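[Figure: coordinate axes x and y, with the angles θ and φ marked.]
z coordinate stays intact under
from which a bit of the trigonometry that you learnt in high school
will take you to
If you are familiar with matrices, you will of course realize that
this can be written in the compact matrix form
$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\vartheta & -\sin\vartheta & 0 \\ \sin\vartheta & \cos\vartheta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$
$$l^2 = x^2 + y^2 + z^2$$
$$c^2t^2 - x^2 - y^2 - z^2.$$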
$(\cos\vartheta)^2 + (\sin\vartheta)^2 = 1$. What about the Lorentz
transformation coefficients? The sum of their squares is not something
simple, but their difference is
$(\gamma)^2 - (-\beta\gamma)^2 = (1 - \beta^2)\,\gamma^2 = 1$! I leave
it to you to figure out just what the connection is between the
orthogonality issue and this.
Let me point out, though, that we do know of simple functions
that can take the place of the cos and sin in the Lorentz
transformations. The fact that the difference of squares of the
coefficients is unity immediately suggests using the hyperbolic
functions - cosh φ = γ and sinh φ = βγ. The parameter φ is easily seen
to be
$$\phi = \tanh^{-1}\beta$$
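You can confirm the hyperbolic parametrization with a few lines of
Python. This is only a sketch; the function name boost_parameter is an
illustrative choice, and β is assumed to lie strictly between −1 and 1.

```python
import math

def boost_parameter(beta):
    """The parameter phi = arctanh(beta) of the hyperbolic form."""
    return math.atanh(beta)

for beta in (0.1, 0.6, 0.99):
    phi = boost_parameter(beta)
    gamma = 1 / math.sqrt(1 - beta ** 2)
    print(f"beta = {beta}: cosh(phi) = {math.cosh(phi):.6f}, gamma = {gamma:.6f}, "
          f"sinh(phi) = {math.sinh(phi):.6f}, beta*gamma = {beta * gamma:.6f}")
```

The columns for cosh φ and γ, and for sinh φ and βγ, should agree, and
cosh²φ − sinh²φ = 1 then reproduces the identity γ² − (βγ)² = 1.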
use of imaginary time made time look very much the same as the
space coordinates - but this was an illusory similarity. In fact, they
were instrumental in giving people the notion that relativity places
“space and time on the same footing”.
It is true that relativity does bring time a lot closer to space than
it was thought earlier. The very fact that time ceases to be abso-
lute, but rather observer dependent like space is a huge step in
this direction. In spite of this there is a very big difference between
space and time that persists in relativity - you can go in any
direction in space - but in only one direction in time. The fact that
causality tells us that cause always precedes effect in time shows
that this distinction is about as basic as can be! Indeed, hiding the
different sign of $t^2$ in the invariant interval by using an
imaginary time coordinate conceals this very important difference -
which turns out to be more of a burden than an advantage in the long
run. Thus, despite having some advantages, Minkowski coordinates are
used only very rarely today!
$$d^2 = \Delta x^2 + \Delta y^2 + \Delta z^2$$
which, of course stays unchanged when you move from one inertial
observer to another.
As I have been pointing out over and over again, the similarity
between the interval and the ordinary distance is what helps us to
motivate the generalization of ordinary space to spacetime. At the
same time, the difference between the interval with its (+, −, −, −)
signature and distance with its (+, +, +) signature is substantial -
and it is this difference that makes the geometry of spacetime very
different from that of space. In particular, let me stress once again
that despite its merger with space, time remains a very different
entity from space even in STR - so, unlike space, where all three
dimensions are equivalent, in spacetime we have a certain lack of
democracy!
One very important distinction between space and spacetime is
that the distance in four dimensions is not necessarily positive [5]!

[5] Indeed, if I try to push the analogy between distance and the
interval, then I
components of
$$A_x = x_2 - x_1, \qquad A_y = y_2 - y_1, \qquad A_z = z_2 - z_1$$
You can find out $A'_y$ and $A'_z$ in a similar fashion. The result is
a set of transformation equations
$$l'^2 = x'^2 + y'^2 + z'^2 = x^2 + y^2 + z^2 = l^2$$
- as promised.

[6] If you are wondering what’s the big deal with that - let me point
out that some other quantities, like the X component of a vector, that
you would be likely to think of as scalars are not really scalars
according to this criterion!
Of course, what I have done here can be hardly considered a
proof that rotations keep lengths unchanged. After all, all that has
been done is a verification that this happens for a particular kind
of rotation. However, since the choice of the Z axis is completely
artificial - it can be argued that if lengths do not change for rota-
tions about the Z axis, then they won’t change for rotations about
any axis.
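If you prefer to see the numbers, the verification for a rotation of
the axes about Z can be sketched as follows. The component rule used
here (x′ = x cos ϑ + y sin ϑ, y′ = −x sin ϑ + y cos ϑ, z′ = z) is one
common sign convention and is an assumption of this sketch; the
opposite sign of ϑ works equally well. The function names are
illustrative.

```python
import math

def rotate_axes_about_z(vec, theta):
    """Components of the same vector after turning the axes about Z."""
    x, y, z = vec
    return (x * math.cos(theta) + y * math.sin(theta),
            -x * math.sin(theta) + y * math.cos(theta),
            z)

def length(vec):
    return math.sqrt(sum(component ** 2 for component in vec))

r = (1.0, 2.0, 3.0)
for theta in (0.3, 1.0, 2.5):
    r_new = rotate_axes_about_z(r, theta)
    print(f"theta = {theta}: components {r_new}, length {length(r_new):.12f}")
print(f"original length {length(r):.12f}")
```

The individual components change with ϑ, while the printed length
stays put.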
Now that we have shown that the length of a position vector
stays unchanged under rotations, what about the length of any
other vector? Since all vectors transform in exactly the same way
as the position vector, it is easy to see that this invariance of the
length will work for all of them. If you are still sceptical, here is the
detailed verification
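$$A'^2 = \vec{A}'\cdot\vec{A}' = A_x'^2 + A_y'^2 + A_z'^2$$
$$= A_x^2 + A_y^2 + A_z^2 = \vec{A}\cdot\vec{A} = A^2$$
$$\vec{A}'^2 + 2\,\vec{A}'\cdot\vec{B}' + \vec{B}'^2 = \vec{A}^2 + 2\,\vec{A}\cdot\vec{B} + \vec{B}^2$$
- using $\vec{A}'^2 = \vec{A}^2$ and $\vec{B}'^2 = \vec{B}^2$
immediately gives
$$\vec{A}'\cdot\vec{B}' = \vec{A}\cdot\vec{B}$$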
So, the scalar product is really a scalar! Of course, you could have
checked this directly from the transformation law itself.
Let me point out that the fact that the dot product of two vectors
$\vec{A}\cdot\vec{B} = A_x B_x + A_y B_y + A_z B_z$ is a scalar
depends on the fact that the transformation that I am considering
preserves vector lengths. Such transformations are called orthogonal
transformations. Rotations, of course, are orthogonal transformations
in three dimensions - other well known transformations in this
category are reflections. Indeed, it can be shown that in three
dimensions, all

[7] There is a different way to think of a rotation. Instead of the
vector staying the same and the coordinate system rotating, we can
also keep the coordinate axes the same and turn the vector around the
other way. In this case, the components change in the same fashion as
above, and the notation $\vec{A}'$ is justified (we are really talking
of a different vector here). Technically, the kind of transformations
that we have been talking about are called passive transformations,
while this latter kind are called active transformations.
confusion with our ordinary run of the mill vectors we give these
entities a special name - four-vectors. If I talk of these vectors and
our familiar old vectors together, as I will do often, I will refer to the
latter as three-vectors. Remember, though, mathematically there
is not much of a difference between four-vectors and three-vectors,
they are just vectors under Lorentz transformations and rotations,
respectively.
As for scalars, their generalization to spacetime is even more
straightforward. The ordinary scalars that we dealt with in three
dimensions are single numbers that do not change under rotations.
Four-scalars are single numbers that do not change under Lorentz
transformations. Note that because rotations also form special cases
of Lorentz transformations, all four-scalars are three-scalars, too.
The converse is not true, though - time is a three-scalar but does
not stay invariant under a general Lorentz transformation. The
same goes for the length of the position vector.
The spacetime four vector (ct, x, y, z) naturally splits up into a
time part - the zero-th component, and the three other components
that form the spatial part. In the same way, every four-vector
$(A^0, A^1, A^2, A^3)$ can be split up into a timelike part - the
$A^0$, and a spacelike part, $\vec{A} = (A^1, A^2, A^3)$. The real
significance of this split up can be seen when you realize that a
rotation in three dimensional space is a special case of Lorentz
transformations in spacetime - one in which the (x, y, z) get mixed up
together, but t stays invariant. Under a rotation, too, a four-vector
transforms in the same way as $(ct, \vec{r})$ - so that $A^0$ does not
change and $\vec{A}$ turns around! Thus, the time part $A^0$ of a
four-vector is a scalar under rotations, while the space part
$\vec{A}$ is a three-vector under rotations!
Just as in the case of spacetime coordinates, the four-vector
that I have been describing here is written with the index as a
$$\phi \to \phi' = \phi$$
$$x'^0 = x^0$$
$$A'^0 = A^0$$
$$A'^1 = A^1 \cos\vartheta + A^2 \sin\vartheta$$
$$A'^2 = -A^1 \sin\vartheta + A^2 \cos\vartheta$$
$$A'^3 = A^3$$
velocity vector $\vec{u}$. Try as you might, you can’t find a three
scalar ζ that you can club with this one to get a four-vector. The
easiest way to see this is to consider the special Lorentz boost. If
$(\zeta, \vec{u})$ is a four vector, then under this the velocity
components in the Y and Z direction must stay unchanged - which is not
what our velocity transformation equations (3.17-b) and (3.17-c) say!
To understand how to proceed further, then, we have to take a
deeper look at what we had always thought obvious - just why is
velocity a three-vector? Well, $\vec{r}$ is a three-vector and hence
so is $\Delta\vec{r}$. The time interval ∆t is a three scalar, so that
the ratio $\frac{\Delta\vec{r}}{\Delta t}$ is really a product of a
scalar $\frac{1}{\Delta t}$ and a vector $\Delta\vec{r}$ - and such a
product does transform like the coordinates under a rotation. Velocity
is nothing but the limit of the ratio $\frac{\Delta\vec{r}}{\Delta t}$
as ∆t → 0, so it is a limit of a sequence of vectors. This is why
velocity is a three-vector. As you can see, the crucial point above is
that ∆t is a scalar under rotations.
You can try the same trick with the position four vector $x^{\mu}$ and
land up with $\frac{\Delta x^{\mu}}{\Delta t} = (c, \vec{u})$. Why is
this not a four-vector? The answer is simple - while ∆t was a
three-scalar under rotations, it is not a scalar under the Lorentz
transformations. So, if you want to get the four dimensional version
of the velocity three-vector, you have to look for something other
than ∆t - something that is similar to it (otherwise we will land up
with an entirely different object - not a generalization of the
velocity!) and is invariant under a Lorentz transformation.
Fortunately, such a thing is not very difficult to find - the proper
time ∆τ is the perfect candidate! Not only is it an invariant, it has
the same dimensions as time and reduces (approximately) to it when the
speed $\frac{\Delta\vec{r}}{\Delta t}$ is small. This helps us to
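$$\begin{aligned} A^{\mu} = \frac{dU^{\mu}}{d\tau} &= \gamma\,\frac{dU^{\mu}}{dt} \\ &= \gamma\,\frac{d}{dt}\left[\gamma\,(c, \vec{u})\right] \\ &= \gamma\left[\frac{d\gamma}{dt}\,(c, \vec{u}) + \gamma\,(0, \vec{a})\right] \end{aligned}$$
Now $\frac{1}{\gamma}\frac{d\gamma}{dt} = \frac{d}{dt}(\ln\gamma) = \frac{u}{c^2}\,\gamma^2\,\frac{du}{dt} = \frac{\gamma^2}{c^2}\,\vec{u}\cdot\vec{a}$,
so that
$$A^{\mu} = \gamma^2\left[\frac{\gamma^2}{c^2}\,(\vec{u}\cdot\vec{a})\,(c, \vec{u}) + (0, \vec{a})\right] = \gamma^2\left(\frac{\gamma^2}{c}\,\vec{u}\cdot\vec{a},\ \ \vec{a} + \frac{\gamma^2}{c^2}\,(\vec{u}\cdot\vec{a})\,\vec{u}\right) \tag{5.25}$$
which shows that the space part of the acceleration four-vector is
not necessarily even in the same direction as the acceleration
$\vec{a}$! This result is going to have quite an important bearing
on relativistic mechanics - the subject of the next chapter.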
$$\vec{A}\cdot\vec{B} = A_1 B_1 + A_2 B_2 + A_3 B_3$$
and try writing down its four dimensional version, what I will land
up with is not a scalar. This is easy to see - just try it out on
our prototype four vector $(ct, \vec{r})$ and you will get
$c^2t^2 + \vec{r}^{\,2}$, which is not invariant under a Lorentz
transformation! Fixing this up is not a very big problem, though - one
look at the invariant interval suggests that instead of the sum we
should try
$$A\cdot B = A^0 B^0 - A^1 B^1 - A^2 B^2 - A^3 B^3$$
$$A\cdot B = \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu}\, A^{\mu} B^{\nu} \tag{5.26}$$
or, in terms of column vectors $A = (A^0, A^1, A^2, A^3)^T$ and
$B = (B^0, B^1, B^2, B^3)^T$ and the matrix η
$$A\cdot B = A^T \eta B \tag{5.27}$$
$$\begin{aligned} A\cdot B \to A'\cdot B' &= A'^T \eta B' \\ &= (LA)^T \eta L B \\ &= A^T L^T \eta L B \\ &= A^T \eta B = A\cdot B \end{aligned}$$
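The invariance can also be seen numerically. The sketch below assumes
the special Lorentz boost along X and four-vectors stored as
(A⁰, A¹, A², A³); the names dot4, naive_sum and boost_x are
illustrative. It compares the dot product with the η signs against the
naive sum with all plus signs.

```python
import math

def dot4(A, B):
    """Four-vector dot product with the (+, -, -, -) signs."""
    return A[0] * B[0] - A[1] * B[1] - A[2] * B[2] - A[3] * B[3]

def naive_sum(A, B):
    """The all-plus sum, which is NOT a Lorentz scalar."""
    return sum(a * b for a, b in zip(A, B))

def boost_x(A, beta):
    """Apply the special Lorentz boost along X to a four-vector."""
    g = 1 / math.sqrt(1 - beta ** 2)
    return (g * (A[0] - beta * A[1]), g * (A[1] - beta * A[0]), A[2], A[3])

A = (5.0, 3.0, 1.0, 2.0)
B = (2.0, 1.0, 0.0, -1.0)
for beta in (0.0, 0.6, 0.9):
    Ap, Bp = boost_x(A, beta), boost_x(B, beta)
    print(f"beta = {beta}: dot4 = {dot4(Ap, Bp):.6f}, naive sum = {naive_sum(Ap, Bp):.6f}")
```

The first column should stay fixed as β changes, while the second
drifts, which is the reason the η signs are needed.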
$$\underline{A} = \eta A \tag{5.29}$$
$$A_{\mu} = \sum_{\nu=0}^{3} \eta_{\mu\nu}\, A^{\nu} \tag{5.30}$$
Note that though I have used the underbar sign in the matrix form
of the covariant vector to distinguish it from the contravariant
version, this is not necessary for the components - the placement of
the index µ (as a subscript rather than a superscript) is enough to
make the distinction.
In terms of the covariant version of a vector it is easy to see that
one can write a compact version of the formula for the dot product
$$A\cdot B = \sum_{\mu=0}^{3} A^{\mu} B_{\mu} = \sum_{\mu=0}^{3} A_{\mu} B^{\mu} \tag{5.31}$$
Note that in this case you had to take the corresponding compo-
nents of the two vectors, multiply them and add up all the terms
together to get a scalar. According to (5.31), you do exactly the
same for the four-vector dot product, except that in this case the
two vectors have to be of different kinds - one contravariant and
the other covariant! Another way to put this is - you can’t produce
a scalar by using the multiply coefficients and add rule on two con-
travariant vectors (or two covariant vectors either) - for this rule to
$$A\cdot B = \underline{A}^T B.$$
$$\underline{A}^T B \to \underline{A}'^T B' = (\underline{L}\,\underline{A})^T L B = \underline{A}^T\, \underline{L}^T L\, B = \underline{A}^T B$$
where the last step follows from the fact that
$\underline{L} = (L^T)^{-1}$. Note that had you tried to do the same
with two contravariant vectors, the matrix product
$\underline{L}^T L$ would have been replaced by $L^T L$ - which
would not have been equal to the identity matrix.
This should also tell you why you had not met two kinds of
vectors before, when you were studying them in three dimensions.
There, the transformations you were talking of were rotations, and
there is no distinction between the transformation matrix R and
$\underline{R} = (R^T)^{-1}$. So, it all boils down to the fact that
rotations are orthogonal (which makes $R^{-1} = R^T$), while Lorentz
transformations are not.
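Here is a small numerical illustration of this difference. It is a
sketch under stated assumptions: the matrix that transforms covariant
components is computed as ηLη (with η its own inverse), which the
condition $L^T \eta L = \eta$ shows to be the same thing as the inverse
of the transpose of L. The helper names are hypothetical, and the
rotation is embedded in a 4 × 4 matrix that leaves the time component
alone.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def is_identity(M, tol=1e-12):
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(4) for j in range(4))

def boost_x(beta):
    g = 1 / math.sqrt(1 - beta ** 2)
    return [[g, -beta * g, 0, 0], [-beta * g, g, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def covariant_version(L):
    """Matrix acting on covariant components: eta L eta."""
    return matmul(ETA, matmul(L, ETA))

L = boost_x(0.6)
R = rotation_z(0.3)
print("covariant with contravariant:",
      is_identity(matmul(transpose(covariant_version(L)), L)))
print("two contravariant (boost):   ", is_identity(matmul(transpose(L), L)))
print("two contravariant (rotation):", is_identity(matmul(transpose(R), R)))
```

For the rotation, covariant_version(R) comes out equal to R itself,
which is why the distinction never showed up in three dimensions.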
$$U^2 = \gamma^2 c^2 - \gamma^2 u^2 = \gamma^2\left(c^2 - u^2\right) = c^2 \tag{5.32}$$
$$A^2 = -\gamma^4\left[\vec{a}^{\,2} + \frac{\gamma^2}{c^2}\,(\vec{u}\cdot\vec{a})^2\right] \tag{5.33}$$
You can, if you are a glutton for punishment, verify directly that
this is invariant under a Lorentz transformation (or at least, under
the special Lorentz transformation)! One thing that follows directly
from the above is that the acceleration four-vector is always
spacelike, unless of course the three-acceleration $\vec{a}$ is zero,
in which case it is lightlike [10].
Let me show you another way of deriving the result above. At
any given instant the particle has a given velocity $\vec{u}$ with
respect to the inertial observer being used. Let us now move over to
another inertial observer, to whom the particle is at rest at that
instant. Note that if the particle has a non-zero acceleration, it is
going to start moving in that frame just afterwards. You can always
find a frame in which the particle stays at rest always - all you have
to do is put an observer right on the particle. Such an observer will
be non inertial, and we won’t have much to do with these creatures in
STR. What we have to be satisfied with is the instantaneously
comoving frame (ICF) of the kind described above. In that frame
the acceleration of the particle is often called its proper
acceleration

[10] That’s hardly surprising, of course, given that for
$\vec{a} = 0$ the acceleration four-vector is identically zero!
$$A\cdot U = 0 \times c - \vec{a}_0\cdot\vec{0} = 0 \tag{5.34}$$
and this means that this quantity vanishes for all frames. You can
(and should) check that this is true by doing the explicit calculation
using (5.23) and (5.25) - which should convince you of the utility of
this trick.
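If you would rather let a computer grind through the explicit
calculation, the following Python sketch does it for sample numbers. It
assumes the four-velocity has the form γ(c, u), takes the
four-acceleration from (5.25), and works in units with c = 1; the
function names and the sample values of u and a are illustrative.

```python
import math

def gamma(u, c=1.0):
    return 1 / math.sqrt(1 - sum(ui * ui for ui in u) / c ** 2)

def four_velocity(u, c=1.0):
    """U = gamma (c, u)."""
    g = gamma(u, c)
    return (g * c,) + tuple(g * ui for ui in u)

def four_acceleration(u, a, c=1.0):
    """Components of A as given by equation (5.25)."""
    g = gamma(u, c)
    ua = sum(ui * ai for ui, ai in zip(u, a))
    time_part = g ** 4 * ua / c
    space_part = tuple(g ** 2 * (ai + g ** 2 * ua * ui / c ** 2)
                       for ui, ai in zip(u, a))
    return (time_part,) + space_part

def dot4(A, B):
    return A[0] * B[0] - A[1] * B[1] - A[2] * B[2] - A[3] * B[3]

u = (0.3, 0.4, 0.0)      # sample three-velocity (|u| < c)
a = (1.0, -2.0, 0.5)     # sample three-acceleration
U = four_velocity(u)
A = four_acceleration(u, a)
print("U.U =", dot4(U, U))
print("A.U =", dot4(A, U))
print("A.A =", dot4(A, A))
```

Up to rounding, U·U should come out as c², A·U as zero, and A·A as the
negative quantity given by (5.33).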
There is another way in which the result (5.34) could be derived
- and the method is quite instructive in its own right. It is easy to
check that the familiar calculus law for the derivative of a product
$$\frac{d}{dt}(uv) = u\,\frac{dv}{dt} + v\,\frac{du}{dt}$$
$$\frac{d}{d\tau}(A\cdot B) = \frac{dA}{d\tau}\cdot B + A\cdot\frac{dB}{d\tau} \tag{5.35}$$
To use this in our present task, let’s start with the norm of the
four-velocity
$$U^2 = U\cdot U = c^2$$
$$U\cdot\frac{dU}{d\tau} + \frac{dU}{d\tau}\cdot U = 0$$
Using the definition $A^{\mu} \equiv \frac{dU^{\mu}}{d\tau}$ and the
symmetry of the dot product, this translates to
U · A + A · U = 2 A · U = 0, leading to our identity [11].
As a final example of the utility of this trick of evaluating four-
scalar quantities in a special frame, secure in the knowledge that
their value will not be affected when you move back to a general
frame of reference, I will show you yet another way of deriving
(??)- this time almost trivially. Note that for the four-velocities
11
You may recall a similar result from the world of three-vectors - the derivative
of any vector of fixed length is perpendicular to it - that is proved in exactly the
same way! Among its consequences are the fact that the tangent to a circle is
perpendicular to the radius, and the fact that the acceleration of a particle
in uniform circular motion is centripetal (since it is perpendicular to the velocity).
$V = \gamma(v)\,(c, \vec{v})$ and $U = \gamma(u)\,(c, \vec{u})$ the four-scalar product is given by
\[ U \cdot V = \gamma(u)\,\gamma(v) \left( c^2 - \vec{u} \cdot \vec{v} \right) \]
while its value in the rest frame of the second observer, where the
particle moves with the relative velocity $\vec{u}'$, is
\[ U \cdot V = \gamma(u')\,(c, \vec{u}') \cdot (c, \vec{0}) = c^2 \gamma(u'). \]
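Equating the two evaluations of $U \cdot V$ gives $\gamma(u') = \gamma(u)\,\gamma(v)\,(1 - \vec{u} \cdot \vec{v}/c^2)$. If you would like to see this at work with numbers, here is a minimal Python sketch for collinear motion. The function names and the choice of units with c = 1 are my own assumptions, made purely for illustration.

```python
import math

def gamma(beta):
    """Lorentz factor for a speed beta (in units of c)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def relative_gamma(u, v):
    """gamma(u') obtained by equating the two evaluations of U.V (collinear case)."""
    return gamma(u) * gamma(v) * (1.0 - u * v)

def relative_velocity(u, v):
    """One-dimensional relativistic velocity of the particle as seen by the second observer."""
    return (u - v) / (1.0 - u * v)

u, v = 0.8, 0.5                       # particle and observer speeds, in units of c
lhs = gamma(relative_velocity(u, v))  # gamma computed from the relative velocity
rhs = relative_gamma(u, v)            # gamma computed from the invariant U.V
print(lhs, rhs)
```

The two printed numbers should agree to rounding error, which is just the statement that the four-scalar $U \cdot V$ has the same value in both frames.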
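\[ \vec{L} = I \vec{\omega} \qquad (5.36) \]
\[ \vec{L} \to \vec{L}' = R \vec{L}, \qquad \vec{\omega} \to \vec{\omega}' = R \vec{\omega} \]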
which implies that the moment of inertia tensor in the rotated co-
ordinate frame must satisfy
\[ I' = R I R^{-1} = R I R^{T} \qquad (5.37) \]
where in the last line I have used the fact that the rotation matrix
is orthogonal. This is the law of transformation for the moment of
inertia tensor12 . A bit of thought should show you that the same
law should work for anything that can be written in the form of a
square matrix that describes the linear relationship between one
vector and another. Thus, the dielectric tensor (relating the electric
field vector $\vec{E}$ to the electric displacement $\vec{D}$), the stress tensor
(relating the area vector to the force vector), etc. all transform
according to this rule. In general, the class of objects that transform
like this are collectively called tensors of rank two.
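A quick numerical check may make the law $I' = R I R^T$ feel less abstract. The sketch below is only an illustration: the sample tensor, the angular velocity and the helper names are made up by me, and the rotation is taken about the z axis for simplicity. It verifies that rotating $\vec{L} = I\vec{\omega}$ directly gives the same answer as applying the rotated tensor to the rotated angular velocity.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    """Product of a 3x3 matrix and a 3-component vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rotation_z(theta):
    """Rotation matrix for an angle theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

I = [[2.0, 0.5, 0.0], [0.5, 3.0, 0.1], [0.0, 0.1, 4.0]]  # made-up symmetric tensor
omega = [0.3, -1.2, 0.7]                                  # made-up angular velocity
R = rotation_z(0.4)

L = matvec(I, omega)                        # L = I omega in the original frame
I_rot = matmul(matmul(R, I), transpose(R))  # I' = R I R^T
L_rot_direct = matvec(R, L)                 # L' = R L
L_rot_tensor = matvec(I_rot, matvec(R, omega))  # L' = I' omega'
print(L_rot_direct, L_rot_tensor)
```

Any other orthogonal matrix in place of `rotation_z` should work equally well, since the only property used is $R^{-1} = R^T$.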
Why “rank two”? Because the rotation matrix appears twice
(once as $R$ and once as $R^T$) in the transformation law (5.37)! This
may be even more obvious if I write this transformation law in
terms of components:
12
In the language of matrices, this is an orthogonal transformation, a special
kind of similarity transformation in which the transforming matrix happens to
be orthogonal. A theorem of linear algebra, which states that any real symmetric
matrix can always be orthogonally diagonalised, ensures that you will always be
able to rotate your coordinate system into one in which the moment of inertia
tensor is diagonal. This, of course, is the principal axis system - something
which makes the study of rigid body dynamics very convenient.
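\[ I'_{ij} = \left( R I R^{T} \right)_{ij} = \sum_{a,b=1}^{3} R_{ia} I_{ab} \left( R^{T} \right)_{bj} = \sum_{a,b=1}^{3} R_{ia} R_{jb} I_{ab} \]
For comparison, a vector transforms according to
\[ v'_i = \sum_{a=1}^{3} R_{ia} v_a \]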
This form makes it easy to see how to generalize this to get the
third rank tensor - this is nothing but the set of $3^3$ components
that transform according to the rule
\[ T'_{ijk} = \sum_{a,b,c=1}^{3} R_{ia} R_{jb} R_{kc} T_{abc}, \qquad i, j, k = 1, \ldots, 3 \qquad (5.39) \]
You can keep on going higher and higher in rank by adding more
indices and more R matrix coefficients in the transformation law!
All this was about tensors in three dimensions - or more pre-
cisely, tensors under rotations. What is the scenario in four di-
mensional spacetime? Our tensor of rank two (under rotations)
, respectively!
Let me focus on a tensor that converts a covariant vector to a
contravariant one for the time being - the other cases are similar.
So, we have
\[ V = T\,W \]
\[ T \to T' = L\,T\,L^{T} \]
\[ T'^{\mu\nu} = \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho}\, L^{\nu}{}_{\sigma}\, T^{\rho\sigma} \qquad (5.40) \]
As you can see, this tensor has two L matrix elements in its trans-
formation, that’s twice as many as a contravariant vector. It is easy
to see, then, why it is called a contravariant tensor of rank 2. This
also explains why I chose to put the indices up as superscripts in
this case.
You should be able to show quite easily that a tensor that changes
a contravariant vector into a covariant one transforms according to
the law
\[ T'_{\mu\nu} = \sum_{\rho,\sigma=0}^{3} L_{\mu}{}^{\rho}\, L_{\nu}{}^{\sigma}\, T_{\rho\sigma} \qquad (5.41) \]
and is called a covariant tensor of rank 2. As for the other two pos-
sibilities, I will leave you to figure out that they yield mixed tensors
with one contravariant and one covariant index, transforming as
\[ T'^{\mu}{}_{\nu} = \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho}\, L_{\nu}{}^{\sigma}\, T^{\rho}{}_{\sigma}. \qquad (5.42) \]
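\[ T^{\mu\nu} = A^{\mu} B^{\nu}, \qquad \mu, \nu = 0, 1, 2, 3 \qquad (5.43) \]
It is very easy to check that this does satisfy the defining transfor-
mation property of a contravariant tensor of rank two:
\[
\begin{aligned}
T'^{\mu\nu} = A'^{\mu} B'^{\nu}
&= \left( \sum_{\rho=0}^{3} L^{\mu}{}_{\rho} A^{\rho} \right) \left( \sum_{\sigma=0}^{3} L^{\nu}{}_{\sigma} B^{\sigma} \right) \\
&= \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho}\, L^{\nu}{}_{\sigma}\, A^{\rho} B^{\sigma} \\
&= \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho}\, L^{\nu}{}_{\sigma}\, T^{\rho\sigma}
\end{aligned}
\]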
\[ A^{\mu\nu} = U^{\mu} V^{\nu} - U^{\nu} V^{\mu} \qquad (5.44) \]
\[ \vec{A} = U^{0} \vec{V} - V^{0} \vec{U} \]
13
This is obviously a three-vector under rotations, since $\vec{U}$ and $\vec{V}$ are three-
vectors and $U^0$ and $V^0$ are three-scalars.
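\[ A'_1 = A_1 \qquad (5.45-a) \]
\[ A'_2 = \gamma \left( A_2 - \beta B_3 \right) \qquad (5.45-b) \]
\[ A'_3 = \gamma \left( A_3 + \beta B_2 \right) \qquad (5.45-c) \]
\[ B'_1 = B_1 \qquad (5.45-d) \]
\[ B'_2 = \gamma \left( B_2 + \beta A_3 \right) \qquad (5.45-e) \]
\[ B'_3 = \gamma \left( B_3 - \beta A_2 \right) \qquad (5.45-f) \]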
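The six relations (5.45) can be checked by brute force: build $A^{\mu\nu} = U^\mu V^\nu - U^\nu V^\mu$ from two arbitrary four-vectors, boost it along x with two $L$ matrices, and read off the components. The sketch below is only illustrative. It assumes that $\vec{A}$ has components $A^{0i}$ and that $\vec{B}$ has components $(A^{23}, A^{31}, A^{12})$, an identification I have inferred from (5.45-b); the sample numbers and function names are made up. With this identification the direct boost gives $A'_3 = \gamma (A_3 + \beta B_2)$.

```python
import math

def boost_x(beta):
    """Special Lorentz transformation along x; returns the 4x4 matrix and gamma."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    L = [[g, -g * beta, 0.0, 0.0],
         [-g * beta, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    return L, g

def transform_rank2(L, T):
    """Contravariant rank-two law: T'^{mn} = sum_{rs} L^m_r L^n_s T^{rs}."""
    return [[sum(L[m][r] * L[n][s] * T[r][s] for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]

def split(T):
    """Split an antisymmetric tensor into the two three-vectors A and B (assumed convention)."""
    A = [T[0][1], T[0][2], T[0][3]]
    B = [T[2][3], T[3][1], T[1][2]]
    return A, B

U = [2.0, 0.3, -0.5, 1.1]   # arbitrary sample four-vectors
V = [1.5, -0.7, 0.2, 0.4]
Amn = [[U[m] * V[n] - U[n] * V[m] for n in range(4)] for m in range(4)]

beta = 0.6
L, g = boost_x(beta)
A, B = split(Amn)
Ap, Bp = split(transform_rank2(L, Amn))

expected_A = [A[0], g * (A[1] - beta * B[2]), g * (A[2] + beta * B[1])]
expected_B = [B[0], g * (B[1] + beta * A[2]), g * (B[2] - beta * A[1])]
print(Ap, expected_A)
print(Bp, expected_B)
```

As a bonus you can check that $\vec{A} \cdot \vec{B}$ comes out the same before and after the boost, as it must for an antisymmetric tensor.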
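\[ T'^{\lambda\mu\nu} = \sum_{\vartheta,\rho,\sigma=0}^{3} L^{\lambda}{}_{\vartheta}\, L^{\mu}{}_{\rho}\, L^{\nu}{}_{\sigma}\, T^{\vartheta\rho\sigma} \qquad (5.46) \]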
Remember that, once again, a general third rank tensor will trans-
form according to this equation but will not be, in general, a direct
product of three vectors.
In general, it is easy to see that we can take the direct product
of m contravariant vectors and n covariant vectors to produce an
object that transforms like
\[ T'^{\mu_1 \ldots \mu_m}{}_{\nu_1 \ldots \nu_n} = \sum_{\rho_1, \ldots, \rho_m, \sigma_1, \ldots, \sigma_n = 0}^{3} L^{\mu_1}{}_{\rho_1} \cdots L^{\mu_m}{}_{\rho_m}\, L_{\nu_1}{}^{\sigma_1} \cdots L_{\nu_n}{}^{\sigma_n}\, T^{\rho_1 \ldots \rho_m}{}_{\sigma_1 \ldots \sigma_n} \qquad (5.47) \]
Any set of $4^{m+n}$ numbers that transform like the above is called
a mixed tensor having a contravariant rank of m and a covariant
rank of n or, in short, a tensor with a rank of m + n.
be equally valid for all inertial frames. In other words, the equa-
tions that express these laws must be Lorentz covariant. Note the
choice of words here - I am saying covariant, not invariant. What
this means is that the two sides of an equation may change (not re-
main invariant), but change in the same manner (co-vary) under a
Lorentz transformation - so that if the equation is correct for one
inertial frame of reference, it is correct for any other inertial frame.
So, given any proposed law of physics, it becomes important to
check whether it conforms to this principle of Lorentz covariance.
Now, this may be a difficult matter - you will have to take a look at
the way in which each entry on both sides of the equation changes
under a Lorentz transformation, carry out the necessary algebra
and often rather painfully verify that both sides do change in the
same manner.
This is exactly where four-tensors come in. If you write down
an equation in which both sides are four-tensors (of the same kind)
then Lorentz covariance is guaranteed - both sides of the equation
will change on changing the inertial observer, but will change in
exactly the same way. The point is - if we write equations in terms
of matching four dimensional entities, then there is no longer any
need to check for Lorentz covariance - such equations are manifestly
Lorentz covariant!
Chapter 6
Kinetics in relativity
\[ T = \frac{p^2}{2m} = \frac{1}{2} m u^2 \qquad (6.2) \]
Note that for a photon, which travels always with the speed of
light, the kinetic energy and the momentum are simply related by
\[ T_\gamma = \int_0^{p_\gamma} c\, dp = p_\gamma c \qquad (6.3) \]
We will borrow from quantum theory the result that the energy of
a photon of frequency ν is given by Tγ = hν. Here h stands for
Planck's constant. Then the photon must have a momentum given
by $p_\gamma = \frac{h\nu}{c} = \frac{h}{\lambda}$. I will make extensive use of these results in the
next few sections to show, firstly, that the classical definitions for
momentum (and hence kinetic energy) are inadequate in relativity,
and secondly, to find out what they should be redefined as!
I told you before that the Doppler factor is in many senses (ex-
cept familiarity) a better measure of motion than the speed. Not
only is it easier to determine, it also makes the algebra much sim-
pler in relativistic calculations. In keeping with this, I will write
both the momentum and kinetic energy of a particle as functions
of its Doppler factor1 k, namely, as p (k) and T (k). Classical physics
tells us that
\[ p(k) \equiv mu = mc\, \frac{k^2 - 1}{k^2 + 1} \]
\[ T(k) \equiv \frac{1}{2} m u^2 = \frac{1}{2} m c^2 \left( \frac{k^2 - 1}{k^2 + 1} \right)^2 \]
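[Figure: "Before" and "After" panels of the photon-electron collision, drawn in two frames and labelled with the photon frequencies and the Doppler factor of the electron.]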
vation works in all inertial frames will tell us that these relations
must be wrong! They are nevertheless very important - any correct
formula must reduce to these when the velocity is small compared
to the speed of light (i.e. when k is close to unity)!
\[ \frac{h}{c} \left( \nu_i + \nu_f \right) = p(k) \qquad (6.4) \]
\[ h \left( \nu_i - \nu_f \right) = T(k) \qquad (6.5) \]
In this problem, the two unknowns are νf , the final frequency of the
photon, and k the final Doppler factor (i.e. speed) of the electron.
We have two equations - so it seems we have all the ingredients
needed to solve the problem. The only trouble is, we don’t yet know
the functions p (k) and T (k)!
One result that follows immediately from (6.4) and (6.5) is that
for the ratio between the kinetic energy and momentum
\[ \frac{T(k)}{c\, p(k)} = \frac{\nu_i - \nu_f}{\nu_i + \nu_f} \qquad (6.6) \]
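\[ \frac{h}{c} \left( \frac{\nu_i}{k} + \nu_f k \right) = p(k) = \frac{h}{c} \left( \nu_i + \nu_f \right) \]
\[ \nu_f = \frac{\nu_i}{k} \qquad (6.7) \]
\[ \frac{T}{pc} = \frac{k - 1}{k + 1} \qquad (6.8) \]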
\[ \frac{T}{pc} = \frac{\frac{1}{2} m u^2}{c\, m u} = \frac{\beta}{2} \neq \frac{k - 1}{k + 1} \]
\[ mu + \frac{1}{2c} m u^2 = \frac{2 h \nu_i}{c} \]
\[ u \approx \frac{2 h \nu_i}{mc} \]
\[ \nu_f \approx \nu_i - \frac{2h}{mc^2}\, \nu_i^2. \]
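\[ \frac{dT}{dp} = v = c\beta = c\, \frac{k^2 - 1}{k^2 + 1} \qquad (6.9) \]
\[ \frac{k^2 - 1}{k^2 + 1} \cdot \frac{k + 1}{k - 1} \cdot \frac{1}{p} = \frac{2}{k^2 - 1} \frac{dk}{dp} + \frac{1}{p} \]
and finally to
\[ \frac{dp}{p} = \frac{k^2 + 1}{k \left( k^2 - 1 \right)}\, dk. \qquad (6.10) \]
Equation (6.10) can be integrated rather trivially to get
\[ p = \frac{k^2 - 1}{k}\, A \]
\[ p = 2 \beta \gamma A \]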
and all that is left is to identify the constant A. For this, just
remember that for small velocities, this formula must reduce to
the classical result $mu = mc\beta$. This means that $A = \frac{1}{2} mc$ and the
new relativistic formula for the momentum is
\[ p = \frac{k^2 - 1}{2k}\, mc = mc\beta\gamma = \gamma m u = \frac{mu}{\sqrt{1 - \frac{u^2}{c^2}}}. \qquad (6.11) \]
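If you want to convince yourself that the three faces of (6.11) really are the same thing, and that the formula falls back to $mu$ for slow particles, a few lines of Python will do. This is only a sketch under my own conventions: the function names are invented for the illustration, and c defaults to 1.

```python
import math

def beta_from_k(k):
    """Speed (in units of c) corresponding to a Doppler factor k."""
    return (k * k - 1.0) / (k * k + 1.0)

def momentum_k(m, k, c=1.0):
    """Relativistic momentum written with the Doppler factor: mc (k^2 - 1) / (2k)."""
    return m * c * (k * k - 1.0) / (2.0 * k)

def momentum_gamma(m, k, c=1.0):
    """The same momentum written as gamma m u."""
    b = beta_from_k(k)
    return m * (b * c) / math.sqrt(1.0 - b * b)

def momentum_classical(m, k, c=1.0):
    """The classical momentum m u."""
    return m * c * beta_from_k(k)

for k in (1.001, 1.5, 3.0, 10.0):
    print(k, momentum_k(1.0, k), momentum_gamma(1.0, k), momentum_classical(1.0, k))
```

For k close to unity all three columns agree closely; as k grows, the classical column saturates at mc while the other two keep growing together.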
What about the photon - the particle that has to move with the
speed of light? If (6.11) and (6.12) are valid for the photon, then
it would land up with infinite amounts of momentum and energy.
The only way out, then, is to assume that the mass m of the photon
is - zero! Since the factor γ is ∞ for a particle moving with the speed
of light, the factor mγ is not well defined - but is consistent with
any finite value. The momentum carried by a photon has nothing
to do with its speed - indeed, quantum mechanics tells us that it
is controlled by its wavelength. One point, though - if you take the
limit $m \to 0$ and $\gamma \to \infty$ in such a way that the ratio $\frac{p}{v} \to \frac{p}{c} = m\gamma$
stays finite, then the expression for the kinetic energy
\[ T = mc^2 (\gamma - 1) \to \frac{p}{c}\, c^2 = pc \]
which is exactly the expression that we used for the photon kinetic
energy, (6.3).
What about faster than light particles, tachyons - which used
to be (and still are, perhaps to a somewhat lesser extent) the staple
of science fiction writers everywhere? If you naively put in a veloc-
ity larger than c in the relativistic expressions for momentum and
kinetic energy, (6.11) and (6.12), you get imaginary values - which
seems to rule out faster than light travel. A better way to interpret
this is simply to follow what we started this section with - it takes an
unbounded amount of energy to speed up a particle moving slower
than light to the speed of light. Since you will not be able to speed a
particle up to the speed of light, the question of going beyond does
not arise!
If you think about this a bit, you will realize that there is nothing
in this argument about particles that have always been moving
faster than light. Such particles, on the other hand, can never
be decelerated below the speed of light. Even slowing them down
electron from two different inertial frames, one in which the elec-
tron is initially at rest and one in which it is finally at rest. Let us
now write down the law of momentum conservation in an arbitrary
frame, one travelling to the left with respect to our original inertial
frame with a Doppler factor of K −1 . The reason I have chosen an
observer moving to the left instead of the right is simply to make
the algebra slightly simpler (you should verify that the conclusion
stays unchanged if the observer were travelling to the right). Again,
I have chosen to write the observer’s Doppler factor as K −1 as op-
posed to K merely to ensure that K is larger than 1 - a choice that
marginally helps in keeping track of the terms.
The whole point behind using the Doppler factor instead of the
speed should become clear now. In section 3.4, I showed you that
the velocity addition formula is much simpler in terms of the for-
mer, rather than the latter. In our new frame, the electron initially
has a Doppler factor of K, while after collision its Doppler factor
becomes kK. What about the photon? The observer is approaching
the incident photon and hence she sees it blue-shifted to a fre-
quency $\nu_i K$, while the final photon is red-shifted to $\nu_f / K$. Thus, in
this new frame, the law of conservation of momentum becomes
\[ p(kK) - p(K) = \frac{h}{c} \left( K \nu_i + \frac{\nu_f}{K} \right) \]
\[ \frac{p(kK) - p(K)}{p(k)} = \frac{K^2 k + 1}{K (k + 1)} \qquad (6.13) \]
A moment's thought will tell you that our function p (k) must satisfy
(6.13) not only for all values of K, but also for all values of k! The
task, then, is to deduce the functional form for a function satisfying
this equation.
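Before solving the functional equation, it is reassuring to test candidates against it. The sketch below (function names are mine, and the overall constant is set to 1) checks that $p(k) \propto k - 1/k$, the form found in (6.11), satisfies (6.13) for several values of k and K, while the classical $p(k) \propto (k^2 - 1)/(k^2 + 1)$ does not.

```python
def p_relativistic(k):
    """Candidate p(k) = k - 1/k (overall constant dropped)."""
    return k - 1.0 / k

def p_classical(k):
    """Classical candidate p(k) = (k^2 - 1)/(k^2 + 1) (overall constant dropped)."""
    return (k * k - 1.0) / (k * k + 1.0)

def lhs(p, k, K):
    """Left hand side of (6.13) for a trial momentum function p."""
    return (p(k * K) - p(K)) / p(k)

def rhs(k, K):
    """Right hand side of (6.13)."""
    return (K * K * k + 1.0) / (K * (k + 1.0))

for k in (1.2, 2.0, 5.0):
    for K in (1.1, 3.0, 7.0):
        print(k, K, lhs(p_relativistic, k, K), lhs(p_classical, k, K), rhs(k, K))
```

In every row the first and last of the three computed columns should coincide, and the middle one should not.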
\[ p'(K) = \frac{K^2 + 1}{2K^2}\, p'(1) \qquad (6.14) \]
ergy? I am sure most of you would like to add the constant mc2
so that the right hand side becomes a lot neater! Of course, this
means that the energy of a stationary particle is no longer zero but
mc2 - so calling it the kinetic energy is not very justified. Physicists
call this quantity the total energy E, and hence expression (6.12)
becomes
\[ E = T + mc^2 = (\gamma - 1)\, mc^2 + mc^2 = \gamma mc^2 = \frac{mc^2}{\sqrt{1 - \frac{u^2}{c^2}}} \qquad (6.15) \]
an equation which you must all know as the one that has made
Einstein a household name3 !
There you have it - E = γmc2 - arguably the most famous equa-
tion in physics! Yet, I am sure you are feeling a bit disappointed
with the way in which I introduced it. “Is it true, then -” you must
be asking yourselves, “that physics’ most famous equation is a bit
of a sham - simply an attempt to window dress an expression for
kinetic energy so that it looks a little better?” If all collisions in
the world were elastic, then all this new expression would have
been is a mere rewrite of (6.12). Take heart, though - in the case
of processes in which the kinetic energy does not balance out, equa-
tion (6.15) takes on a new, deeper significance - one that we will now
explore.
Let's start by considering the simplest example of an inelastic
collision that I can think of - two particles of identical mass
m colliding head on with equal and opposite velocities, corresponding
to Doppler factors of $k$ and $k^{-1}$, respectively, and sticking
together. You know that after the collision, the composite particle
will be standing still. From your high school education, you would
3
Perhaps you are more used to hearing E = mc2 instead of E = γmc2 . Wait till
section 6.9 to see how these two expressions are really one and the same!
conclude that the final particle has a mass of 2m, and the total ki-
netic energy of $\frac{1}{2} m u^2 \times 2 = m u^2$ that the incident pair had will be
converted to heat energy that will, at least initially, heat up the
composite body. Let’s see what our new-found laws of relativistic
mechanics tell us about this scenario.
To feel the real impact of STR on this simple problem, let me now
describe the same collision from the point of view of another inertial
frame, moving with a Doppler factor of K −1 with respect to our first
one. In this frame, the two colliding particles have Doppler factors
of kK and k −1 K, respectively and the final, composite particle one
of K. Now, the net initial momentum of the system works out to be
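\[
\begin{aligned}
p_i &= p(kK) + p\left( \frac{K}{k} \right) \\
&= \frac{1}{2} mc \left( kK - \frac{1}{kK} \right) + \frac{1}{2} mc \left( \frac{K}{k} - \frac{k}{K} \right) \\
&= \frac{mc}{2kK} \left( k^2 K^2 - 1 + K^2 - k^2 \right) \\
&= \frac{mc}{2kK} \left( K^2 - 1 \right) \left( k^2 + 1 \right) \\
&= \frac{1}{2} mc \left( K - \frac{1}{K} \right) \left( k + \frac{1}{k} \right) \\
&= \frac{1}{2} Mc \left( K - \frac{1}{K} \right)
\end{aligned}
\]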
where I have used (6.16). Now, the first term on the left hand side
above is the total energy Ef = Tf + M c2 of the final particle, and
thus
\[ T_i = T_f + (M - 2m)\, c^2 \]
and this means that even in this frame, the increase in mass is
\[ M - 2m = \frac{1}{c^2} \left( T_i - T_f \right) \qquad (6.17) \]
\[ T_i + 2mc^2 = T_f + Mc^2 \]
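Here is a small numerical illustration of the sticky collision, seen from several frames at once. It is only a sketch: the helper names are mine, c is set to 1, and the composite mass is taken to be $M = m (k + 1/k)$, the value that the last two lines of the momentum calculation above imply. The loop checks that momentum balances in each frame and that the mass increase equals the lost kinetic energy divided by $c^2$.

```python
def p(m, k, c=1.0):
    """Relativistic momentum of a mass m with Doppler factor k."""
    return 0.5 * m * c * (k - 1.0 / k)

def E(m, k, c=1.0):
    """Total energy of a mass m with Doppler factor k."""
    return 0.5 * m * c * c * (k + 1.0 / k)

def T(m, k, c=1.0):
    """Kinetic energy: total energy minus rest energy."""
    return E(m, k, c) - m * c * c

m, k, c = 1.0, 2.0, 1.0
M = m * (k + 1.0 / k)          # composite mass implied by momentum conservation

results = []
for K in (1.0, 1.5, 4.0):      # K = 1 is the frame in which the composite is at rest
    p_i = p(m, k * K) + p(m, K / k)
    p_f = p(M, K)
    T_i = T(m, k * K) + T(m, K / k)
    T_f = T(M, K)
    momentum_mismatch = p_i - p_f
    mass_mismatch = (T_i - T_f) / c**2 - (M - 2.0 * m)
    results.append((K, momentum_mismatch, mass_mismatch))
    print(K, momentum_mismatch, mass_mismatch)
```

Both mismatches should vanish (up to rounding) for every K, even though the individual momenta and kinetic energies are quite different from frame to frame.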
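\[ m_1 k_1 + m_2 k_2 + \ldots + m_r k_r = \mu_1 \kappa_1 + \mu_2 \kappa_2 + \ldots + \mu_s \kappa_s \qquad (6.21) \]
and
\[ \frac{m_1}{k_1} + \frac{m_2}{k_2} + \ldots + \frac{m_r}{k_r} = \frac{\mu_1}{\kappa_1} + \frac{\mu_2}{\kappa_2} + \ldots + \frac{\mu_s}{\kappa_s} \qquad (6.22) \]
\[ \frac{c^2}{2} \left[ \left( m_1 k_1 + m_2 k_2 + \ldots + m_r k_r \right) + \left( \frac{m_1}{k_1} + \frac{m_2}{k_2} + \ldots + \frac{m_r}{k_r} \right) \right] = \frac{c^2}{2} \left[ \left( \mu_1 \kappa_1 + \mu_2 \kappa_2 + \ldots + \mu_s \kappa_s \right) + \left( \frac{\mu_1}{\kappa_1} + \frac{\mu_2}{\kappa_2} + \ldots + \frac{\mu_s}{\kappa_s} \right) \right] \qquad (6.23) \]
\[ m_1 c^2\, \frac{k_1^2 + 1}{2 k_1} + \ldots + m_r c^2\, \frac{k_r^2 + 1}{2 k_r} = \mu_1 c^2\, \frac{\kappa_1^2 + 1}{2 \kappa_1} + \ldots + \mu_s c^2\, \frac{\kappa_s^2 + 1}{2 \kappa_s} \qquad (6.24) \]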
Although I have written down this equation for the observer I started
out with, it is rather trivial to show that this conservation will work
for any arbitrary observer (show it!).
Thus, demanding that the law of conservation of momentum is
equally valid for all inertial observers leads to the law of conserva-
tion of energy as well, for a rather general process. You may have
already caught on to this - this works just as well the other way
round, too. Demanding that energy is conserved in all frames leads
to momentum conservation. Surely, then, there must be a deeper
connection between momentum and energy in relativity? There is!
I will discuss this connection in the very next section.
given by
\[ p(k) = \frac{mc}{2} \left( k - \frac{1}{k} \right) \]
\[ E(k) = \frac{mc^2}{2} \left( k + \frac{1}{k} \right) \]
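You can use (6.25-a) and (6.25-b) to rewrite Bob's measured values
in terms of those measured by Alice. This leads to
\[ p' = \frac{1}{2} \left( K + \frac{1}{K} \right) p - \frac{1}{2} \left( K - \frac{1}{K} \right) \frac{E}{c} \]
\[ \frac{E'}{c} = -\frac{1}{2} \left( K - \frac{1}{K} \right) p + \frac{1}{2} \left( K + \frac{1}{K} \right) \frac{E}{c} \]
and using our by now familiar values for the quantities $\frac{1}{2} \left( K + \frac{1}{K} \right)$
and $\frac{1}{2} \left( K - \frac{1}{K} \right)$ leads to
\[ p' = \gamma \left( p - \beta \frac{E}{c} \right) \qquad (6.26-a) \]
\[ \frac{E'}{c} = \gamma \left( -\beta p + \frac{E}{c} \right) \qquad (6.26-b) \]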
(x, ct).
To those of you who have not skipped the section on vectors and
scalars in spacetime, this should not come as a big surprise (If you
were the impatient sort, now is a good time to go back and read
section 5.5). After all, we have seen that four-vectors behave just
like spacetime coordinates under Lorentz transformations. So, it
seems that the pair $\left( p, \frac{E}{c} \right)$ that we have been seeing so far is just a
Note that the spatial part of this four-vector is precisely γm~u - the
formula that we have found out a while ago for the accurate rel-
ativistic form of the momentum (albeit in one dimensional form).
As an added bonus note that the temporal part of $p^\mu$ is $\gamma m c$ which
is just $\frac{E}{c}$ - so that in one swoop we have found out the relativistic
versions of both the momentum and the energy!
In fact now you see why adding the rest energy mc2 to the ki-
netic energy to get the total energy E = γmc2 was such a good idea!
Quite unwittingly, we had stumbled on the temporal part of the
four-vector whose spatial part is the momentum three-vector. This
also tells us why the demand that the conservation of momentum
be valid in all frames immediately leads to the conservation of en-
ergy as an added bonus. It is actually the four-momentum that is
conserved - the conservation of momentum and that of energy are
two pieces of this single conservation law!
One invariant quantity that can be built immediately, given
a four-vector, is its norm. Finding out the norm of the energy-
momentum four-vector is really trivial - because we already know
the norm of the four-velocity, U µ . So,
\[ P \cdot P = P^\mu P_\mu = m^2 U^\mu U_\mu = m^2 c^2 \qquad (6.28) \]
P = P1 + P2 + . . . + Pn
Since the γ factor for any velocity obeys γ (v) ≥ 1 (with the equality
\[ = c^2 \left( m_1 + \ldots + m_n \right)^2 = c^2 M^2 \]
\[ P^2 \geq c^2 M^2 \qquad (6.31) \]
with equality holding only when all the particles are moving to-
gether and thus all the inter-particle relative velocities vanish.
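The inequality (6.31) can be watched in action for particles moving along a line. The following sketch uses made-up masses and Doppler factors, helper names of my own, and c = 1. It computes $P^2$ for a set of particles with different Doppler factors, and again for the same masses all sharing one Doppler factor.

```python
def four_momentum(m, k, c=1.0):
    """(E/c, p) of a particle of mass m moving along x with Doppler factor k."""
    return (0.5 * m * c * (k + 1.0 / k), 0.5 * m * c * (k - 1.0 / k))

def norm_squared(P):
    """Minkowski norm P.P for a (time, space) pair."""
    return P[0] ** 2 - P[1] ** 2

def total(particles, c=1.0):
    """Sum of the four-momenta of a list of (mass, Doppler factor) pairs."""
    moms = [four_momentum(m, k, c) for m, k in particles]
    return (sum(P[0] for P in moms), sum(P[1] for P in moms))

spread = [(1.0, 2.0), (0.5, 0.7), (2.0, 1.3)]     # (mass, Doppler factor), all different
together = [(1.0, 1.8), (0.5, 1.8), (2.0, 1.8)]   # same masses, moving together
M = sum(m for m, _ in spread)

P2_spread = norm_squared(total(spread))
P2_together = norm_squared(total(together))
print(P2_spread, P2_together, M ** 2)
```

The first number should exceed $M^2$, while the second should equal it up to rounding, in line with the remark about equality holding only when all the relative velocities vanish.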
This is as good a point as any to pay attention to a special case
- the massless particle. We have already seen one example - the
photon. As we have already seen, massless particles have to travel
at the speed of light and for them our energy-momentum formulae
need to be applied with some care. We have already seen that in
the massless limit
T = pc.
\[ P_1 \cdot P_2 = \hbar k_0 m \qquad (6.37) \]
\[ p = \gamma m u = \frac{mu}{\sqrt{1 - \frac{u^2}{c^2}}} \]
At this point there are two possible courses of action open before
us. The first one, the one I have followed so far, is to gracefully
accept that the formula for momentum is not the old one of mass
times velocity, it is a different expression that, however, does re-
duce to the old formula in the case of slowly moving particles.
The other option is one that was very popular in the early days
of relativity. This one insisted that the momentum is still the same
old mass times velocity - only the mass is no longer the good old
constant quantity that we had been used to in the early days, but
one that increases with speed according to
\[ m = \frac{m_0}{\sqrt{1 - \frac{u^2}{c^2}}} \qquad (6.39) \]
Of course, all that has been done here is absorb the factor γ in m
to define this new quantity. In this second way of thinking about
things, the left hand side of the above equation is called the rela-
tivistic mass - while m0 , which is what I have been calling the mass
m all this time, is the value of this relativistic mass at u = 0, and is
hence called the rest mass. This explains the mystery of the miss-
ing $\gamma$ in arguably the most famous equation of all physics - it is
simply that the m in $E = mc^2$ is the relativistic mass (our $m\gamma$).
\[ \vec{F} = m \vec{a} \]
that stage, the most striking feature of the theory was that certain
things that we had always held to be absolute actually turn out
to be observer dependent! This, after all, was summed up by the
catchphrase - “everything is relative!” At that point, the fact that
mass was seen as dependent on velocity is just one more addition
to the long list of “relative” entities. Today, perhaps, the emphasis
has shifted. We regard the theory of relativity today, not so much
as a theory of what is relative (observer dependent) but as a theory
in which the emphasis is on the fact that the laws of physics (all
of them) are observer independent! So much so, that many people
have strongly advocated a change in the name of the theory from the
“theory of relativity” to the “theory of invariance”! In the modern
way of looking at things, the stress is on quantities that do not
depend on the observer. The so called rest mass (also known as
the proper mass) of a particle is such a quantity. That is, to my
mind, the single most important reason for giving the (rest) mass
its rightful status as the mass.
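\[ \vec{F} = m \vec{a} \qquad (6.40) \]
\[ \vec{F} = \frac{d}{dt} \left( m \gamma \vec{u} \right) = m \gamma \frac{d\vec{u}}{dt} + \frac{d}{dt} \left( m \gamma \right) \vec{u} = m \gamma \vec{a} + \frac{1}{c^2} \frac{dE}{dt}\, \vec{u} \]
\[ \frac{dE}{dt} = \frac{dK}{dt} = \nabla_{\vec{p}} K \cdot \frac{d\vec{p}}{dt} = \vec{u} \cdot \vec{F}, \qquad (6.42) \]
\[ \vec{F} = m \gamma \vec{a} + \frac{1}{c^2} \left( \vec{F} \cdot \vec{u} \right) \vec{u} \qquad (6.43) \]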
least for bodies with constant rest mass). As you can easily see,
the presence of the second term says that the acceleration of a
moving body is, in general, not in the same direction as the force.
There are only two cases where the force and the acceleration are
co-directional - one, when the second term vanishes because the
force is perpendicular to the velocity, and two, when the force is
parallel to the velocity. In the first case, we have
\[ \vec{F} = m \gamma \vec{a} \]
This means that in the two cases where the relativistic relation be-
tween the force and the acceleration mimics the classical one, the
ratio, which you would be inclined to call the mass, takes different
values. So, as far as generalizing $\vec{F} = m\vec{a}$ is concerned, we have
two kinds of mass - the transverse mass mt , which happens to be
equal to the so called relativistic mass mγ, and the longitudinal
mass ml which is bigger by a further factor of γ 2 .
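To see the two "masses" emerge from (6.43), one can solve that equation for the acceleration, $\vec{a} = \left[ \vec{F} - (\vec{F} \cdot \vec{u})\, \vec{u} / c^2 \right] / (m\gamma)$, and try a few forces. The sketch below does just that; the function name, the sample velocity and the choice c = 1 are assumptions of mine for the sake of illustration.

```python
import math

def acceleration(F, u, m, c=1.0):
    """Three-acceleration obtained by solving (6.43) for a, given force F and velocity u."""
    u2 = sum(x * x for x in u)
    gamma = 1.0 / math.sqrt(1.0 - u2 / c**2)
    F_dot_u = sum(f * x for f, x in zip(F, u))
    return [(f - F_dot_u * x / c**2) / (m * gamma) for f, x in zip(F, u)]

m = 1.0
u = [0.6, 0.0, 0.0]                                   # gamma = 1.25
a_perp = acceleration([0.0, 1.0, 0.0], u, m)          # force perpendicular to u
a_par = acceleration([1.0, 0.0, 0.0], u, m)           # force parallel to u
a_skew = acceleration([1.0, 1.0, 0.0], u, m)          # a general force

transverse_mass = 1.0 / a_perp[1]      # ratio F/a for the perpendicular case
longitudinal_mass = 1.0 / a_par[0]     # ratio F/a for the parallel case
print(transverse_mass, longitudinal_mass, a_skew)
```

The transverse ratio comes out as $m\gamma$ and the longitudinal one as $m\gamma^3$, while for the skew force the acceleration is visibly not parallel to the force.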
You may be slightly surprised at my assertion that (6.41) is
the correct generalization of Newton’s second law to the relativistic
case. After all, $\frac{d\vec{p}}{dt}$ cannot be the spatial part of a four-vector, for the
same reason that $\vec{v}$ is not! Surely it would have been more sensible
to have
\[ \vec{F}_M = \frac{d\vec{p}}{d\tau} = \gamma \vec{F} \qquad (6.44) \]
as the relativistic force (the M stands for Minkowski)? One way to
answer this is simply to say that both Newton’s second law and its
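\[ X = (ct, \vec{r}) \]
We have already met two other 4-vectors that follow very simply
from it
\[ U \equiv \frac{dX}{d\tau} = \frac{dt}{d\tau} \frac{dX}{dt} = \gamma(u)\, (c, \vec{u}) \]
\[ P \equiv mU = m \gamma(u)\, (c, \vec{u}) = \left( \frac{E}{c}, \vec{p} \right) \]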
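\[ \frac{d}{d\tau} P^2 = 2 P \cdot \frac{dP}{d\tau} = 2 P \cdot F = 2 m c^2 \frac{dm}{d\tau} = 2 m c^2 \gamma(u) \frac{dm}{dt} \]
and thus
\[ P \cdot F = m c^2 \gamma(u) \frac{dm}{dt} \]
Using $P = m \gamma(u)\, (c, \vec{u})$ and $F = \gamma(u) \left( \frac{1}{c} \frac{dE}{dt}, \vec{F} \right)$ yields, however
\[ P \cdot F = m \left[ \gamma(u) \right]^2 \left( \frac{dE}{dt} - \vec{F} \cdot \vec{u} \right) \]
\[ \frac{dE}{dt} = \vec{F} \cdot \vec{u} + \frac{c^2}{\gamma(u)} \frac{dm}{dt} \qquad (6.45) \]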
dt γ (u) dt
dP d dU
F = = (mU ) = m = mA (6.47)
dτ dτ dτ
This formula, once again, is only valid for constant rest mass m.
(6.47) gives us another way of deriving (6.43). Using our expres-
sion for the four-acceleration (5.25) and equating each component
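\[ \frac{\gamma}{c}\, \vec{F} \cdot \vec{u} = \frac{m \gamma^4}{c}\, \vec{u} \cdot \vec{a} \]
\[ \gamma \vec{F} = m \gamma^2 \vec{a} + \frac{m \gamma^4}{c^2} \left( \vec{u} \cdot \vec{a} \right) \vec{u} \]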
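\[ x'^1 = \gamma \left( x^1 - \beta x^0 \right) \]
\[ x'^2 = x^2 \]
\[ x'^3 = x^3 \]
\[ \gamma(u')\, F'^3 = \gamma(u)\, F^3 \]
\[ F'^1 = F^1 \]
\[ F'^2 = F^2 \sqrt{1 - \frac{v^2}{c^2}} \]
\[ F'^3 = F^3 \sqrt{1 - \frac{v^2}{c^2}} \]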
\[ \mu \kappa = m_1 k_1 + m_2 k_2 \]
\[ \frac{\mu}{\kappa} = \frac{m_1}{k_1} + \frac{m_2}{k_2} \]
\[ {}_{Z}X^{A} \to \alpha + {}_{Z-2}Y^{A-4} \]
If we write down the equation for the mass-energy balance for this
equation, what we get is
\[ E(X) = E(\alpha) + E(Y) \]
\[ m_X c^2 = m_\alpha c^2 + T(\alpha) + m_Y c^2 + T(Y) \]
and is thus equal to the mass excess that the decaying nucleus
has over the decay products, apart from a factor of c2 . Note that
the factor of c2 arises only if we insist on using conventional units
for measuring mass - in nuclear physics, mass is usually measured
in energy units like the MeV7 , in which case, the mass difference
is directly equal to the final kinetic energy. It should be easy for
you to see that if the frame being used is not the frame in which X
is at rest, then the mass difference gives the increase in net kinetic
energy in the process, T (Y ) + T (α) − T (X). As I have written in
(6.48), the mass difference is denoted by Q. Why Q? The analogy
with heat released by a chemical reaction should make this clear!
What I have said just now remains equally valid for all decay
7
The Mega-electronVolt, or $10^6$ eV. Remember, an eV is the amount of kinetic
energy gained by an electron accelerated by a potential difference of 1 Volt, and
is equal roughly to $1.6 \times 10^{-19}$ J in terms of conventional units.
\[ n \to p + e + \nu \]
for which, the final kinetic energies in the rest frame of the neutron
obey$^{8}$
\[ T(p) + T(e) + T(\nu) = \left( m_n - m_p - m_e - m_\nu \right) c^2 \]
\[ {}_{Z}X^{A} \to {}_{Z+1}Y^{A} + e + \nu \]
and a neutrino$^{9}$
\[ p \to n + e^{+} + \nu \]
\[ p(Y) + p(\alpha) = 0 \qquad (6.49) \]
\[ T(\alpha) = \frac{Q}{1 + \frac{m_\alpha}{m_Y}} \qquad (6.50) \]
To get an idea about the sort of energies that we are talking about
here, let me consider a real example - the α decay of U$^{238}$ into Th$^{234}$.
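To put rough numbers into (6.50), here is a tiny sketch. It assumes a Q value of about 4.27 MeV for the α decay of U$^{238}$ (an approximate figure that I am supplying, not one taken from the text), and it uses the mass numbers 4 and 234 as a stand-in for the mass ratio $m_\alpha / m_Y$, which is good to a fraction of a percent. The function name is made up for the illustration.

```python
def alpha_kinetic_energy(Q, m_alpha, m_daughter):
    """Non-relativistic share of Q carried by the alpha particle, equation (6.50)."""
    return Q / (1.0 + m_alpha / m_daughter)

Q = 4.27                     # MeV, approximate Q value (assumed for this example)
T_alpha = alpha_kinetic_energy(Q, 4.0, 234.0)   # mass numbers used as the mass ratio
T_daughter = Q - T_alpha     # the recoiling daughter nucleus takes the rest
print(T_alpha, T_daughter)
```

The α particle walks away with nearly all of the available energy, with the heavy daughter nucleus recoiling with less than 0.1 MeV. These kinetic energies are tiny compared to the rest energy of the α particle (several thousand MeV), which is why the non-relativistic treatment is adequate here.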
Note that the derivation of (6.50) was made very simple by the
fact that the energies involved were small enough to ensure that
non-relativistic approximations could be used. What if the speeds
involved are much larger (as happens, for example, in beta decay -
with electrons emerging at nearly the speed of light)?
A direct calculation of the energies carried away by the two de-
cay products can be carried out along the lines of the non-relativistic
calculations above. However, the algebra does tend to become a lit-
tle messy. Below we describe two ways of getting the result - the
first by using our k-calculus equations and the second by using
the energy-momentum 4-vector.
Let the Doppler factors of the two particles in the rest frame of
the parent particle be k1 and k2 , respectively. In this case, the
equations (6.21-6.22) become
\[ M = m_1 k_1 + m_2 k_2 = \frac{m_1}{k_1} + \frac{m_2}{k_2}. \qquad (6.51) \]
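\[ E_1 = \frac{m_1 c^2}{2} \times \frac{1}{m_1 M} \left( M^2 + m_1^2 - m_2^2 \right) = \frac{c^2}{2M} \left( M^2 + m_1^2 - m_2^2 \right) \qquad (6.54) \]
\[ E_2 = \frac{c^2}{2M} \left( M^2 - m_1^2 + m_2^2 \right) \qquad (6.55) \]
\[ T_1 = E_1 - m_1 c^2 = \frac{M - m_1 + m_2}{2M}\, Q \qquad (6.56) \]
\[ T_2 = \frac{M + m_1 - m_2}{2M}\, Q \qquad (6.57) \]
\[ |p_1| = \frac{c}{2M} \sqrt{(M + m_1 + m_2)(M + m_1 - m_2)(M - m_1 + m_2)(M - m_1 - m_2)} \qquad (6.58) \]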
The symmetry shows that |p2 | is the same, as it must be.
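The formulas (6.54), (6.55) and (6.58) can be cross-checked against each other in a few lines. In the sketch below the masses are arbitrary illustrative numbers, the function name is my own, and c defaults to 1. The checks are that the two energies add up to $Mc^2$ and that each decay product satisfies $E^2 - p^2 c^2 = m^2 c^4$ with the common momentum.

```python
import math

def two_body_decay(M, m1, m2, c=1.0):
    """Energies of the two products and their common momentum, for a parent at rest."""
    E1 = c**2 * (M**2 + m1**2 - m2**2) / (2.0 * M)
    E2 = c**2 * (M**2 - m1**2 + m2**2) / (2.0 * M)
    p = c / (2.0 * M) * math.sqrt((M + m1 + m2) * (M + m1 - m2)
                                  * (M - m1 + m2) * (M - m1 - m2))
    return E1, E2, p

M, m1, m2 = 5.0, 1.0, 2.0            # made-up masses with M > m1 + m2
E1, E2, p = two_body_decay(M, m1, m2)
Q = M - m1 - m2                      # Q value in units with c = 1
T1 = E1 - m1                         # kinetic energy of the first product
print(E1, E2, p, T1)
```

With these numbers $T_1$ should also agree with (6.56), that is, with $Q (M - m_1 + m_2) / (2M)$.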
\[ P_X = P_Y + P_\alpha \]
can be rewritten as
\[ P_Y = P_X - P_\alpha. \]
Note that this helps us get rid of the two unknown quantities in-
volving the Y nucleus - its momentum and energy in one go! Now,
$P_X^2 = m_X^2 c^2$, etc. Also,
\[ P_X \cdot P_\alpha = \left( m_X c, \vec{0} \right) \cdot \left( \frac{E_\alpha}{c}, \vec{p}_\alpha \right) = m_X E_\alpha \]
for the total energy of the α particle. Of course, all you have to do to
find the total energy of the Y nucleus after the decay is interchange
the values of mα and mY - which yields
As you can easily verify, the sum of the energies of the decay prod-
ucts is equal to the rest energy of the original particle,
Eα + EY = mX c2
To get the kinetic energies of the decay products, all you have to do
is subtract the respective rest energies, yielding
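\[ T_\alpha = \frac{\left( m_X - m_\alpha \right)^2 - m_Y^2}{2 m_X}\, c^2 \qquad (6.59-a) \]
\[ T_Y = \frac{\left( m_X - m_Y \right)^2 - m_\alpha^2}{2 m_X}\, c^2 \qquad (6.59-b) \]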
expect the α particle to come out with a definite energy - and that is
precisely what has been observed11 ! Of course, this can be traced
back to the fact that α decay is a two body process - the available
kinetic energy gets shared between the daughter nucleus and the
α particle in a precise ratio dictated by the conservation of momen-
tum. The β decay spectrum, on the other hand, is another story
altogether - instead of a precise energy (or a set of sharp lines) the
β particles come out with all energies from zero up to a maximum.
Indeed, the maximum kinetic energy of the β particles (called the
endpoint energy of the β spectrum) turns out to be what (6.59-a)
would predict (with $m_\alpha$ replaced by $m_\beta$, of course). The
explanation is very simple - β decay is not a two body process - the
available Q value is shared between the three particles in the final state!
Thus, the β particle can come out with any energy below the end-
point value - the rest is carried away by the neutrino, while main-
taining conservation of momentum. Note that the fact that a two
body decay could not explain the continuous energy spectrum in
beta decay is the precise reason why Wolfgang Pauli had proposed
the existence of a third decay product - the neutrino! Indeed, it
11
Actually, if you look at the α decay spectrum, which is nothing but a plot
of the number of α particles emitted against their respective energies, what you
usually see is not one, but several sharp lines. Thus, the α particles come out
with not one definite energy, but several well defined energies. Explaining this
is not too difficult, though - it is just that the daughter nucleus is not always
produced in its ground state. Thus the energy term that I wrote down for the
daughter nucleus is not just $m_Y c^2 + T_Y$, but has an extra term of $E_Y^*$ as well,
where $E_Y^*$ is the excitation energy of the state the nucleus is created in. As you
can easily check, this brings down the available kinetic energy from $Q$ to $Q - E_Y^*$
- which explains where the α particles of energies lower than that predicted by
(6.50) come from. In fact, these α particles of lower energies are accompanied by
γ rays (photons) that carry away the extra energy that becomes available when
the excited daughter nucleus de-excites to the ground state - this provides us
with evidence that this explanation is correct! Since quantum mechanics tells
us that the excited states of the daughter nucleus have a few definite energies
only, there are only a few sharp lines in the α spectrum.
X + α → Y1 + Y2 + . . .
is, say, −4.0 MeV, you may be inclined to believe that the reac-
tion will occur if you shoot α particles with a minimum energy of
4.0 MeV at a stationary target of X nuclei. In reality, the α particles
have to be more energetic than this. The reason is the same as
the one that prevented the fast moving proton from decaying - in
this case, the fact that the incoming α particle has some momen-
tum ensures that the decay products would also have to have some
minimum kinetic energy to begin with - just providing them with
\[ m_1 k + m_2 = \sum_{j=1}^{N} \mu_j \kappa_j \]
\[ \frac{m_1}{k} + m_2 = \sum_{j=1}^{N} \frac{\mu_j}{\kappa_j} \]
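Now, $m_1 m_2 \left( k + \frac{1}{k} \right) = 2 m_2 E_1 / c^2$, while elementary algebra tells us
that
\[ \frac{\kappa_j}{\kappa_k} + \frac{\kappa_k}{\kappa_j} = 2 + \left( \sqrt{\frac{\kappa_j}{\kappa_k}} - \sqrt{\frac{\kappa_k}{\kappa_j}} \right)^2 \geq 2 \]
with equality holding only for $\kappa_j = \kappa_k$. Thus
\[ m_1^2 + m_2^2 + \frac{2 m_2 E_1}{c^2} \geq \sum_{j=1}^{N} \mu_j^2 + \sum_{j<k=1}^{N} 2 \mu_j \mu_k = \left( \sum_{j=1}^{N} \mu_j \right)^2 = M^2 \]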
Another way to look at this hinges around the result for $P^2$ that
is expressed in (6.30) and the inequality (6.31). If the net four-
momentum of the products of the reaction is $P$, then we can write
\[ P_1 + P_2 = P \]
where $P_1$ and $P_2$ are the four-momenta of the projectile and the
target, respectively. Now, the two incident 4-momenta take the form
$P_1 = (E_1/c, \vec{p}_1)$ and $P_2 = (m_2 c, \vec{0})$ in the lab frame, where $E_1$ is the
energy of the projectile in this frame - so that we have $P_1 \cdot P_2 = m_2 E_1$.
“Squaring” both sides of $P_1 + P_2 = P$, we get
\[ c^2 \left( m_1^2 + m_2^2 \right) + 2 m_2 E_1 = P^2 \geq c^2 M^2 \]
This shows that the lighter the target is, the harder the reaction
gets. In fact, if an endothermic reaction is to occur by a projectile
particle hitting an identical, stationary, particle, the projectile must
have more kinetic energy than twice the Q deficit!
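To get a feel for the numbers, here is a minimal Python sketch of the threshold condition. It assumes the inequality in the form $c^2(m_1^2 + m_2^2) + 2 m_2 E_1 \geq c^2 M^2$, with every mass entered as a rest energy in MeV so that $c$ drops out. The function names, the rounded proton rest energy and the example reaction p + p → p + p + p + p̄ are my own choices for illustration, not something taken from the derivation above.

```python
def threshold_kinetic_energy(m1, m2, product_masses):
    """Minimum lab kinetic energy of a projectile (rest energy m1) that hits
    a stationary target (rest energy m2) and produces the listed products.
    All arguments are rest energies in the same unit (e.g. MeV)."""
    M = sum(product_masses)
    # Threshold total energy from m1^2 + m2^2 + 2 m2 E1 >= M^2 (c = 1 units)
    e1_min = (M**2 - m1**2 - m2**2) / (2.0 * m2)
    return e1_min - m1  # subtract the projectile rest energy


def q_value(m1, m2, product_masses):
    """Q of the reaction: rest energy in minus rest energy out."""
    return m1 + m2 - sum(product_masses)


# Illustrative case (assumed, rounded value): proton rest energy in MeV
m_p = 938.272
products = [m_p] * 4  # p + p -> p + p + p + pbar

t_min = threshold_kinetic_energy(m_p, m_p, products)
q = q_value(m_p, m_p, products)
print("threshold kinetic energy / proton rest energy:", t_min / m_p)
print("threshold kinetic energy / |Q|:", t_min / abs(q))
```

For identical projectile and target the ratio of the threshold kinetic energy to |Q| always comes out larger than 2, which is the statement made above.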
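\[ \hbar c k_i + m c^2 = \hbar c k_f + \frac{m c^2}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{6.60-a} \]
\[ \hbar k_i = \hbar k_f \cos\vartheta + \frac{m v}{\sqrt{1 - \frac{v^2}{c^2}}} \cos\phi \tag{6.60-b} \]
\[ 0 = \hbar k_f \sin\vartheta - \frac{m v}{\sqrt{1 - \frac{v^2}{c^2}}} \sin\phi \tag{6.60-c} \]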
Since our aim is to find out the shift in the photon wavelength
against its deflection angle ϑ, we must get rid of the electron speed
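\[ \frac{m^2 v^2}{1 - \frac{v^2}{c^2}} = \hbar^2 \left( k_i^2 + k_f^2 - 2 k_i k_f \cos\vartheta \right) \]
\[ \frac{m^2 c^2}{1 - \frac{v^2}{c^2}} = m^2 c^2 + 2 m c \hbar \left( k_i - k_f \right) + \hbar^2 \left( k_i - k_f \right)^2 \]
leading to
\[ \frac{\hbar}{mc} \left( 1 - \cos\vartheta \right) = \frac{1}{k_f} - \frac{1}{k_i} = \frac{1}{2\pi} \left( \lambda_f - \lambda_i \right) \]
This gives the formula
\[ \Delta\lambda = \frac{h}{mc} \left( 1 - \cos\vartheta \right) \tag{6.61} \]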
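As a quick numerical check of (6.61), the following Python sketch computes the shift for a given deflection angle and then verifies that the recoil electron obtained from (6.60-a) to (6.60-c) satisfies $E^2 - p^2 c^2 = m^2 c^4$. The function names and the sample incident wavelength are assumptions made only for this illustration. The constants are the standard SI values.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
M_E = 9.1093837015e-31  # electron mass, kg
C = 299792458.0         # speed of light, m/s


def compton_shift(theta):
    """Wavelength shift (m) for photon deflection angle theta (rad), eq. (6.61)."""
    return H / (M_E * C) * (1.0 - math.cos(theta))


def electron_recoil(lambda_i, theta):
    """Energy (J) and momentum magnitude (kg m/s) of the recoil electron,
    obtained from energy and momentum conservation."""
    lambda_f = lambda_i + compton_shift(theta)
    p_i = H / lambda_i  # incident photon momentum, hbar * k_i
    p_f = H / lambda_f  # scattered photon momentum, hbar * k_f
    energy = p_i * C + M_E * C**2 - p_f * C  # energy balance, (6.60-a)
    px = p_i - p_f * math.cos(theta)         # along the incident beam, (6.60-b)
    py = -p_f * math.sin(theta)              # transverse, (6.60-c)
    return energy, math.hypot(px, py)


# Hypothetical example: an X-ray photon of wavelength 0.071 nm scattered by 60 degrees
energy, momentum = electron_recoil(7.1e-11, math.pi / 3)
print("shift at 90 degrees (m):", compton_shift(math.pi / 2))
print("E^2 - p^2 c^2 (J^2):", energy**2 - (momentum * C) ** 2)
print("(m c^2)^2     (J^2):", (M_E * C**2) ** 2)
```

The last two printed numbers should agree to within rounding, which is just the statement that eliminating $v$ and $\phi$ from the three conservation equations leaves (6.61).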
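can be rewritten as
\[ P_{e_f} = P_{\gamma_i} - P_{\gamma_f} + P_{e_i} \]
On squaring both sides, where I have used $P_{\gamma_i}^2 = P_{\gamma_f}^2 = 0$ and $P_{e_i}^2 = P_{e_f}^2 = m^2 c^2$, this means
that
\[ P_{\gamma_i} \cdot P_{\gamma_f} = \left( P_{\gamma_i} - P_{\gamma_f} \right) \cdot P_{e_i} \]
From this, you can immediately arrive at the Compton shift formula,
equation (6.61).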
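In case you want to see this last step spelled out, here is one way of doing it. This is a sketch that assumes the metric convention in which $P^2 = m^2 c^2$ for a massive particle, evaluates everything in the rest frame of the initial electron, and uses unit vectors $\hat{n}_i$ and $\hat{n}_f$ (my notation) for the photon directions.

```latex
% Components in the rest frame of the initial electron (assumed convention: P^2 = (E/c)^2 - |p|^2)
P_{e_i} = (mc, \vec{0}), \qquad
P_{\gamma_i} = \hbar k_i (1, \hat{n}_i), \qquad
P_{\gamma_f} = \hbar k_f (1, \hat{n}_f), \qquad
\hat{n}_i \cdot \hat{n}_f = \cos\vartheta

% Left-hand side of the relation
P_{\gamma_i} \cdot P_{\gamma_f} = \hbar^2 k_i k_f (1 - \cos\vartheta)

% Right-hand side of the relation
(P_{\gamma_i} - P_{\gamma_f}) \cdot P_{e_i} = \hbar (k_i - k_f)\, mc

% Equate the two and divide by  \hbar\, mc\, k_i k_f
\frac{\hbar}{mc} (1 - \cos\vartheta) = \frac{1}{k_f} - \frac{1}{k_i}
  = \frac{\lambda_f - \lambda_i}{2\pi}
\quad\Longrightarrow\quad
\Delta\lambda = \frac{h}{mc} (1 - \cos\vartheta)
```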
\[ \vec{v}_1 \cdot \vec{v}_2 = -\vec{v}_{CM}^{\,2} + \vec{u}_{CM}^{\,2} = 0 \]
since $\vec{u}_{CM} = \vec{v}_{CM}$!
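\[ \tan\theta = \frac{v \sin\theta'}{\gamma(v) \left( v \cos\theta' + v \right)} = \frac{\sin\theta'}{\gamma(v) \left( \cos\theta' + 1 \right)} \]
\[ \tan\phi = \frac{\sin\phi'}{\gamma(v) \left( \cos\phi' + 1 \right)} = \frac{\sin\theta'}{\gamma(v) \left( -\cos\theta' + 1 \right)} \]
\[ \tan\theta \tan\phi = \frac{1}{\gamma^2(v)} \]
so that
\[ \gamma^2(v) = \frac{\gamma(u) + 1}{2} \]
and our formula becomes
\[ \tan\theta \tan\phi = \frac{2}{\gamma(u) + 1} \tag{6.62} \]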
This shows that the faster the projectile, the smaller tan θ tan φ is,
and hence the smaller the angle between the emergent directions.
This is one of the predictions of relativity that can be directly tested
in cloud chamber type experiments. Observing the tracks when a
fast cosmic-ray electron hits a stationary electron shows this re-
duction in angle vividly. Indeed, by observing the angular distri-
bution of the outgoing tracks a lot of information can be gathered
about the incident electron’s speed.
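A small Python sketch makes the shrinking angle concrete. It uses (6.62) together with the two tangent formulas that lead to it. The function names and the symbol `theta_cm` for the scattering angle in the centre of mass frame are assumptions of mine, introduced only for this illustration.

```python
import math


def tan_product(gamma_u):
    """tan(theta) * tan(phi) for projectile Lorentz factor gamma_u, eq. (6.62)."""
    return 2.0 / (gamma_u + 1.0)


def lab_angles(gamma_u, theta_cm):
    """Lab-frame angles (rad) of the two outgoing particles for a given
    centre of mass scattering angle theta_cm (rad, strictly between 0 and pi)."""
    gamma_v = math.sqrt((gamma_u + 1.0) / 2.0)  # gamma of the CM frame
    tan_theta = math.sin(theta_cm) / (gamma_v * (1.0 + math.cos(theta_cm)))
    tan_phi = math.sin(theta_cm) / (gamma_v * (1.0 - math.cos(theta_cm)))
    return math.atan(tan_theta), math.atan(tan_phi)


def symmetric_opening_angle_deg(gamma_u):
    """Angle (degrees) between the two tracks in the symmetric case theta = phi."""
    theta = math.atan(math.sqrt(tan_product(gamma_u)))
    return math.degrees(2.0 * theta)


for gamma_u in (1.0, 1.5, 3.0, 10.0):
    print(gamma_u, symmetric_opening_angle_deg(gamma_u))
```

For a slow projectile (Lorentz factor close to 1) the symmetric opening angle is the familiar right angle of Newtonian mechanics, and it closes up steadily as the projectile gets faster.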
rection with respect to the rocket. We will once again solve the problem
using two different approaches - one based on the K-calculus,
the other a more conventional one.
\[ \gamma(k) + \pi(k) = k \]
\[ \gamma(k) - \pi(k) = k^{-1} \]
We will write the constant K-factor of the ejected mass with respect
to the rocket as $K^{-1}$ (since the mass moves backwards, its K-factor
is less than 1 - our $K$, however, is larger than 1), where
\[ K = \sqrt{\frac{c + U}{c - U}} \]
and thus
\[ \frac{1}{K^2} = \frac{k \, dM + M \, dk}{k \, dM - M \, dk} \]
so that
\[ \frac{k \, dM}{M \, dk} = \frac{1 + K^2}{1 - K^2} = -\frac{c}{U} \]
which yields the differential equation
\[ \frac{dM}{M} = -\frac{c}{U} \frac{dk}{k} \]
which tells us how the final mass decreases with increasing speed
of the rocket. Thus, the higher the speed that you want your rocket
to achieve, the smaller the payload that it can carry!
To achieve a larger residual mass for a given final speed u, U must
be as large as possible. This gives us the photon ship - very pop-
ular in science fiction - which propels itself by ejecting a stream of
photons! For a photon ship (6.63) becomes
\[ \frac{M}{M_0} = \sqrt{\frac{c - u}{c + u}} \tag{6.64} \]
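To see how punishing this is in practice, here is a minimal Python sketch that evaluates the mass ratio. It assumes (6.63) in the form $M/M_0 = \left( \frac{c-u}{c+u} \right)^{c/2U}$, of which (6.64) is the special case $U = c$. The function name and the sample speeds are my own choices, and speeds are measured in units of $c$ by default.

```python
import math


def mass_ratio(u, U, c=1.0):
    """Residual mass fraction M/M0 of a rocket that reaches speed u by
    ejecting mass at speed U relative to itself (assumed form of (6.63))."""
    return ((c - u) / (c + u)) ** (c / (2.0 * U))


# Photon ship (U = c) versus a hypothetical exhaust speed of c/2, both reaching u = 0.6 c
print("photon ship :", mass_ratio(0.6, 1.0))
print("U = c/2     :", mass_ratio(0.6, 0.5))

# For u and U both much smaller than c the ratio approaches the
# non-relativistic rocket result exp(-u/U)
print("slow rocket :", mass_ratio(1e-4, 1e-3), "vs", math.exp(-0.1))
```

Halving the exhaust speed squares the mass fraction, so the slower the exhaust, the more drastically the payload shrinks.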
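\[ M c^2 \gamma(u) = (M + dM) \, c^2 \gamma(u + du) + \delta M \, c^2 \gamma(\bar{U}) \]
\[ M \gamma(u) \, u = (M + dM) \, \gamma(u + du) \, (u + du) + \delta M \, \gamma(\bar{U}) \, \bar{U} \]
\[ \bar{U} = \frac{d \left( M \gamma(u) \, u \right)}{d \left( M \gamma(u) \right)} = u + M \gamma(u) \frac{du}{d \left( M \gamma(u) \right)} \]
Here $\bar{U}$ denotes the velocity of the ejected mass in this frame, as opposed to its speed $U$ relative to the rocket.
This implies
\[ \frac{1}{M \gamma(u)} \frac{d}{du} \left( M \gamma(u) \right) = \frac{1}{\bar{U} - u} = -\frac{1 - uU/c^2}{\left( 1 - u^2/c^2 \right) U} = -\frac{1}{2U} \left[ \frac{1 - U/c}{1 - u/c} + \frac{1 + U/c}{1 + u/c} \right] \]
which integrates to
\[ \frac{M}{M_0} = \left( \frac{c - u}{c + u} \right)^{c/2U} \]
which is the solution we were after.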
There is an arguably slightly simpler way of arriving at this so-
lution. This involves using a special frame in which the description
of the process becomes simple. This frame is the ICF, the frame in
which the rocket is stationary at a particular instant of time. At
a time t, the ICF will move at a velocity u with respect to the ini-
tial rest frame of the rocket. After a time dt, the ejected gas (mass
$\delta M$) will have a velocity $-U$ while the rocket will have a velocity
of $du'$ (Remember, the ICF is not fixed to the rocket - and thus the
rocket is stationary in this frame only at the time t). Thus, energy
and momentum conservation in this frame lead to
\[ M c^2 = (M + dM) \, \gamma(du') \, c^2 + \delta M \, \gamma(-U) \, c^2 \]
\[ 0 = (M + dM) \, \gamma(du') \, du' + \delta M \, \gamma(-U) \, (-U) \]
To first order in the differentials these give $du' = -U \, dM/M$, so that the velocity addition formula yields
\[ du = \left( 1 - \frac{u^2}{c^2} \right) du' = -\left( 1 - \frac{u^2}{c^2} \right) U \frac{dM}{M} \]
which can be easily solved to get (6.63). If you ask me, I find the
simplicity that one gains by going over to the ICF is ruined by the
fact that you have to convert $du'$ back to $du$ before you can integrate
it. Make your own choice!
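If you would rather let a computer do the integration, the sketch below steps the equation $du = -\left( 1 - u^2/c^2 \right) U \, dM/M$ numerically and compares the outcome with the closed form $M/M_0 = \left( \frac{c-u}{c+u} \right)^{c/2U}$. The function names, the choice of a fourth-order Runge-Kutta stepper and the sample numbers are all assumptions of mine. Any reasonable integrator would do.

```python
import math


def final_speed_numeric(mass_fraction, U, c=1.0, steps=2000):
    """Integrate du/dx = -U (1 - u^2/c^2), with x = ln(M/M0), from x = 0
    down to x = ln(mass_fraction), starting from rest (classical RK4)."""
    h = math.log(mass_fraction) / steps  # negative step, since M decreases

    def rhs(u):
        return -U * (1.0 - (u / c) ** 2)

    u = 0.0
    for _ in range(steps):
        k1 = rhs(u)
        k2 = rhs(u + 0.5 * h * k1)
        k3 = rhs(u + 0.5 * h * k2)
        k4 = rhs(u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u


def mass_ratio_closed_form(u, U, c=1.0):
    """M/M0 from the closed-form solution."""
    return ((c - u) / (c + u)) ** (c / (2.0 * U))


# Hypothetical rocket: exhaust speed c/2, burning down to 10% of its initial mass
u_final = final_speed_numeric(0.1, 0.5)
print("final speed in units of c:", u_final)
print("closed-form mass ratio at that speed:", mass_ratio_closed_form(u_final, 0.5))
```

The second printed number should reproduce the mass fraction that was fed in, which is a direct check that the differential equation and the closed form agree.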
As far as the photon rocket is concerned, we have already de-
rived the result (6.64) for it as a special case of the general result
(6.63). There is, however, a much simpler way of deriving this result,
and this hinges on the fact that the energy-momentum
relation for photons is linear, which means that the net momentum
carried away by the photons is just the total energy of the photons
divided by c (since for a massive particle we have $|\vec{p}| = \sqrt{E^2 - m^2 c^4}/c$,
the total momentum of the gas, ejected at different speeds with
respect to a given inertial observer at different times, can not be
directly related to the total energy in a simple way for a rocket
ejecting massive particles. In other words, you can not calculate
\[ E_\gamma + M \gamma(u) \, c^2 = M_0 c^2 \]
\[ p_\gamma + M \gamma(u) \, u = 0 \]
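These two equations can be solved in a couple of lines once you put in $p_\gamma = -E_\gamma / c$ (the photons travel backwards). The Python sketch below does just that and checks the answer against (6.64). The function name, the default units ($M_0 = 1$, $c = 1$) and the sample speed are assumptions made only for this illustration.

```python
import math


def photon_rocket(u, M0=1.0, c=1.0):
    """Residual mass M and total emitted photon energy E_gamma for a photon
    rocket that has reached speed u, from energy and momentum conservation
    with p_gamma = -E_gamma / c."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    # Momentum balance gives E_gamma = M gamma u c; substituting this into the
    # energy balance gives M gamma c^2 (1 + u/c) = M0 c^2
    M = M0 / (gamma * (1.0 + u / c))
    E_gamma = M * gamma * u * c
    return M, E_gamma


M, E_gamma = photon_rocket(0.6)
print("residual mass fraction:", M)
print("sqrt((c - u)/(c + u)) :", math.sqrt((1.0 - 0.6) / (1.0 + 0.6)))
print("energy balance        :", E_gamma + M * 1.25)  # gamma(0.6 c) = 1.25
```

The first two printed numbers agree, so the linear energy-momentum relation of the photons does reproduce (6.64) without any integration.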
Electrodynamics and relativity
CHAPTER 7. ELECTRODYNAMICS AND RELATIVITY 241