
Relativity Primer with Bondi K Calculus

This document provides an overview of Einstein's theory of special relativity. It introduces key concepts like spacetime diagrams, Lorentz transformations, time dilation and length contraction using a "radar method" involving tracking objects with a flashlight. It explores various paradoxes and applications, summarizing the theory's key predictions regarding the relativity of simultaneity and invariance of the speed of light.


RELATIVITY

with a flashlight

[Cover figure: the regions of spacetime in (a) STR and (b) Newtonian physics. Region labels: Future, Past, Elsewhere, Here & Now, Now.]

A simple primer
using the Bondi K calculus

Ananda Dasgupta

March 3, 2007
Contents

1 The background
1.1 Galileo - the father of relativity
1.2 Hunt for the elusive ether
1.3 Enter Einstein

2 The radar method
2.1 A few puzzles
2.2 The Space-time diagram
2.3 Measuring space-time with the flashlight
2.4 The K factor

3 Kinematics from the flashlight
3.1 Time Dilation
3.1.1 Moving clocks run slow!
3.1.2 Accelerated clocks
3.1.3 The Doppler factor revisited
3.1.4 Testing relativity
3.1.5 The twin paradox
3.1.6 The writing on the clocks
3.1.7 The case of the immortal muon
3.2 Length contraction
3.2.1 A moving rod gets shortened
3.2.2 What about the other directions?
3.2.3 Length contraction via the photon clock
3.2.4 The length contraction paradox
3.3 The relativity of Simultaneity
3.3.1 Einstein’s train
3.3.2 The writing on the clocks revisited
3.3.3 The length contraction paradox revisited
3.4 Addition of velocities
3.4.1 The relative velocity formula
3.4.2 A deeper look into relative velocities
3.5 Lorentz transformations
3.5.1 Everything from L. T.
3.5.1.1 Time dilation
3.5.1.2 Length contraction
3.5.1.3 Relativity of simultaneity
3.5.1.4 The transformation of velocities
3.5.2 The general Lorentz transformation
3.5.3 Why use the radar method?
3.6 The invariant interval
3.6.1 Splitting up spacetime
3.6.2 Everything from the interval!
3.6.2.1 Time dilation
3.7 Acceleration in relativity
3.7.1 The transformation law for accelerations
3.7.2 Uniformly accelerated motion in STR
3.8 Charting out spacetime
3.9 Causality and STR

4 Light - the messenger of relativity!
4.1 The Doppler effect, again
4.1.1 The longitudinal Doppler effect
4.1.2 The transverse Doppler effect
4.2 More on light
4.2.1 Aberration
4.2.2 Propagation

5 Additional topics in kinematics
5.1 Deriving the Lorentz transformations, directly
5.2 A new notation
5.3 General Lorentz transformations and the invariance of the interval
5.3.1 The covariant coordinates
5.3.2 The metric
5.4 The geometry of special relativity
5.4.1 Coordinate changes in rotations
5.4.2 Lorentz transformations and rotations
5.4.3 A bit of history - Minkowski coordinates
5.4.4 Minkowski spacetime
5.5 Four-vectors and four-scalars
5.5.1 Vectors and scalars in 3 dimensions
5.5.2 Lorentz transformations vs. rotations
5.5.3 Vectors and scalars in spacetime
5.5.4 Examples of 4-vectors and 4-scalars
5.5.5 The dot product of four vectors
5.5.6 Going further - tensors
5.5.7 Why bother?

6 Kinetics in relativity
6.1 Rewriting Newton
6.2 The momentum - kinetic energy connection
6.3 The need to redefine momentum
6.4 The momentum in STR
6.4.1 The speed limit
6.4.2 A “better” derivation
6.5 Mass is energy
6.6 The conservation of energy from that of momentum
6.7 The momentum four vector
6.8 Covariance of the conservation of energy-momentum
6.9 Do we need a relativistic mass?
6.10 Force and acceleration in STR
6.11 The force and acceleration four vectors
6.12 Using energy-momentum conservation
6.12.1 Inelastic collision, again
6.12.2 Decay and stability
6.12.3 Dividing up the energy
6.12.3.1 The k-calculus approach
6.12.3.2 The 4-vector approach
6.12.4 Nuclear reactions
6.12.4.1 The k-calculus approach
6.12.4.2 The 4-momentum approach
6.12.5 The Compton effect
6.12.5.1 The shift
6.12.5.2 The four-vector approach
6.12.6 Relativistic billiards
6.12.6.1 The non-relativistic problem
6.12.6.2 The relativistic case
6.12.7 The relativistic rocket
6.12.7.1 The K-calculus approach
6.12.7.2 The conventional approach
6.13 Motion under force

7 Electrodynamics and relativity
7.1 Faraday and Einstein
7.2 Why does a current produce a magnetic field?
7.3 Transforming the fields
7.4 The field of moving charges
7.5 Potentials to the fore
7.6 Light - again!
Chapter 1

The background

1.1 Galileo - the father of relativity

1.2 Hunt for the elusive ether

1.3 Enter Einstein


Just what made Einstein embark upon his fantastic journey of
discovery? There are many conflicting theories. Some would say
that the Michelson-Morley experiment played an absolutely decisive
role; others would quote Einstein himself to show that he was
unaware of the experiment in 1905. Einstein himself often said
that his training in electrical engineering as a polytechnic
student had sown the seeds of enquiry about the deep significance
of electromagnetism in his mind. However, we can be reasonably
certain that, whatever other influences played a role here, at one
level the point that drove Einstein to his final theory was a
philosophical one. The relativity principle as enunciated by
Galileo appealed to him immensely - but he did have a very strong
objection to it. Galileo’s principle applied to mechanics only -
and this is what Einstein found hard to agree with. As he saw it,
there is no such thing as a purely mechanical experiment - if
nothing else, you will have to use light (and hence
electromagnetism) to at least see the result! As far as Einstein
was concerned, the principle of relativity was too beautiful to
give up, yet, confined as it was to mechanics only, it was
essentially an empty statement - devoid of any real physical
meaning! Thus, the only way out, according to Einstein, was to
assume that the principle of relativity applied not to the laws of
mechanics alone - but to all of physics! This principle became the
basis of Einstein’s special theory of relativity (STR), and today,
when we speak of the principle of relativity, we mean
“All laws of physics are the same for all inertial observers.”
This sweeping generalization of the principle of relativity from
one confined to mechanics alone to one that covers all physical
laws left Einstein with two options :

• The laws of mechanics, which are already known to be correct
for all inertial frames, need not be tampered with. This
presupposes, of course, that the law of transformation when
one goes over from one inertial observer to another is the
traditional Galilean transformation. On the other hand, if
the laws of electromagnetism are the same for all observers,
then they can’t be Maxwell’s equations - since these are not
Galilean covariant. Thus, the laws of electrodynamics must
be modified.

• The laws of electromagnetism are correct - and they are the
same in all inertial frames! This means that the transformation
laws between inertial frames must be something other than the
Galilean transformations. This also means that the laws of
mechanics may fail to be covariant under the new
transformations, and hence may require modification.

Faced with such choices I am pretty sure that most people would
have sided with the first option. Electrodynamics was a fledgeling
science at that stage. Mechanics, on the other hand, was the grand
old man of physics - one, moreover, that had held sway over the
way that scientists thought about the natural world for more than
three centuries at that time. It was Einstein’s genius that he did
not hesitate to go the other way!
Chapter 2

The radar method

2.1 A few puzzles


Let’s start with a simple puzzle. Suppose that I am walking down
a long, straight road. There is a lamppost ahead of me - and I
am walking straight towards it. A friend of mine crosses me on
a bike, gets to the lamppost before he realizes that he has to tell
me something important and then turns back to finally reach me.
I have been walking steadily all the while. The question is - with
respect to me, which half of his journey was longer, the outward
one (from me to the lamppost) or the inward one (back from the
lamppost to me)?
Although I am not a gambling man, I am ready to bet that most
of you will say - “the outward one, of course!” Since I kept on
moving all the while, it stands to reason that I was much closer
to the lamppost when he finished than when he started out. This
answer is the expected one, but it is also wrong!!! Note that the
question said, “with respect to me” - the answer that seems obvious
to all of you is the one “with respect to the ground”.
Try to look at the whole affair from my point of view. I am
not moving with respect to myself - all that I see is the lamppost
moving steadily towards me. My friend crosses me, goes out - and
meets the oncoming lamppost at some point. What distance has he
covered with respect to me? From my origin to that point. Where
does he turn back from? From that very point! Where does he land
up finally? At my origin! Hence, with respect to me he has covered
the same distance on his outward journey as on his inward one![1]
Let me stress once again that I did expect most of you to give the
wrong answer. That is because, despite learning a lot about relative
motion, all of us instinctively think in terms of one observer all
the time - the ground[2]! In fact, that is why most of you will be
concerned with how far the lamppost is from me - the lamppost
is fixed on the ground! That my friend travelled more during his
outward journey is the correct answer - but only with respect to
the ground! With respect to me, it is fifty-fifty.

[1] It is true that from my viewpoint, the lamppost is much closer
to me when my friend finally reaches me than when he started out -
but that is not our concern!

[2] More precisely, it is the observer who is fixed with respect to
the ground. I will take the linguistic liberty of calling the
“observer fixed with respect to the object A” simply “the observer
A”, however - it cuts down the verbiage quite a lot!

At this point, I can’t refrain from airing a pet peeve of
mine. I feel that the chapter on “relative velocity” that
you study in high school does more harm than good!
Most of the problems that are tackled under this topic
tend to be like “The true velocity of rain is so and so,
a man is moving with such and such a velocity - find
out the apparent speed of the rain as seen by the man”
or variants thereof. The trouble is - such language tacitly
keeps on reinforcing the concept that there is such
a thing as true velocity! Indeed, heading a chapter “relative
velocity” immediately implies that relative velocities
are somewhat different from ordinary velocity - where it
should really be stressed that all velocities, by their very
nature, are relative. It is nonsense to speak of a velocity
(or acceleration, or even displacement, for that matter)
without reference to some observer or the other! This is
simply because all of these quantities are related to the
position - and the position must be measured with respect
to an observer. The so called “true velocity” almost
invariably turns out to be the velocity with respect to the
ground - this only keeps on reinforcing the instinctive
idea that there is something special about the ground as
an observer.

Let’s come back to our puzzles. The next one is - if the speed of my
friend on the bike was the same on both halves of the journey, then
which half took him more time? This one is really simple - since
the outward journey was longer than the inward one - the first half
must have taken more time.
“Hold on!” - some of you must be screaming at this point. This
one is obviously correct with respect to the ground. Since the two
journeys are equally long with respect to me, shouldn’t the times be
equal when I measure them? If this is your answer, then
congratulations! Not because your answer is right (it isn’t!) but
because you have unwittingly surmounted the biggest stumbling block
that lies in the path of understanding the special theory of
relativity! Jokes apart, think about this issue a bit - would you
really expect the times to change depending on whether a man fixed
on the ground sees the events or whether I, who am moving, see them?
The situation, once again, is very simple. It is true that the
distances my friend covers in the two halves are the same with
respect to me. The speeds are also the same - but with respect
to the ground! When he goes away from me, his speed, as I see
it, is less (it is v − u if v and u are the bike’s and my speed with
respect to the ground, respectively) than when he is coming back
in (it is v + u). So, he takes more time to go out than to come back
in - just as the ground sees it. Indeed, if you take the trouble of
calculating the times from my viewpoint (you should!) you will see
that they exactly match the times that the ground observes[3]. The
moral of the story is - when you are using somebody’s displacement
measurements, you should not mix them up with someone else’s
velocity measurements!
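The calculation suggested above can be sketched in a few lines of Python. This is only an illustration under the Newtonian (Galilean) assumptions of the puzzle: the function names and the sample values of v, u and the initial distance d to the lamppost are hypothetical, chosen just to show that both observers get the same pair of times.

```python
def ground_frame_times(v, u, d):
    """Outward and inward times of the bike, worked out by the ground observer.

    v: bike speed, u: walker speed, both relative to the ground (v > u >= 0)
    d: distance from walker to lamppost when the bike crosses the walker
    """
    t_out = d / v                    # the bike covers the ground distance d
    gap = d - u * t_out              # the walker has advanced u * t_out meanwhile
    t_in = gap / (v + u)             # bike and walker now approach each other
    return t_out, t_in


def walker_frame_times(v, u, d):
    """The same two times, using only the walker's own distances and speeds."""
    # For the walker, the bike recedes at v - u while the lamppost approaches at u,
    # so the two of them close the initial gap d at the combined rate (v - u) + u.
    t_out = d / ((v - u) + u)
    turn_distance = (v - u) * t_out  # walker-to-turning-point distance
    t_in = turn_distance / (v + u)   # the bike returns over that same distance
    return t_out, t_in


if __name__ == "__main__":
    v, u, d = 5.0, 1.5, 100.0        # hypothetical values (m/s, m/s, m)
    print("ground:", ground_frame_times(v, u, d))
    print("walker:", walker_frame_times(v, u, d))
```

Note that the walker's version never uses a ground-frame distance together with a walker-frame speed, which is exactly the mix-up warned against above.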
Brace yourself - for this is where our hero, the flashlight, makes
its grand entrance into the story! Consider the same scenario as
before - except that instead of the lamppost I am now moving
towards a mirror fixed to the ground. I shine a flashlight at the
mirror. A pulse of light travels to the mirror and comes back to me.
As with my friend the biker, the flash of light travelled more during
the outward journey than during the inward one - with respect to
the ground. Again, both halves of its journey were equally long with
respect to me. What about the times? It stands to reason that the
outward journey takes more time than the inward one for a
ground-based observer. What do I say about this? Now, as per Einstein,
light moves with the same speed c for both halves of its journey,
not only with respect to the ground - but also with respect to me!
This leaves me with no option but to conclude that - with respect
to me light takes the same time to complete the two halves of its
journey! Accept the constancy of the speed of light, and you have to
accept that time depends on the observer!
[3] This should not come as a surprise. After all, this is how we
arrive at the v ± u results in the first place!

Our simple little puzzle has led us right into the heart of the
special theory of relativity. As we will see, time and again - this
dependence of time on the observer is the major point of departure
that makes STR distinct from Newtonian physics. To stress this
difference a bit further, let me put this in a slightly different way.
Newton would be quite comfortable with the idea that two events
that occur at the same place but at different times according to one
observer may happen at different places according to another. He
would certainly have a fit, however, if someone were to suggest that
two events that happened at different places but at the same time
according to me happen at different times according to someone
else!
Just think of a diner in a railway car starting on his soup as
the first event and finishing up the last spoonful of dessert as the
second one[4]. To the waiter in the dining car, both of these events
occur at the same place - the dining car table. For someone on the
ground, however, the two events may have occurred very far apart
indeed - the distance the train has travelled while the diner was
busy with his food! However, if the waiter sees another traveller
take a sip of water at the other end of the dining car at the same
instant when our diner is paying the bill - our instincts (which
form the basis of Newtonian physics) will tell us that someone on
the ground will also agree that the two events are simultaneous.
STR puts space and time on a much more equal footing, though.
It actually tells us that two events that occur at the same time but
at different places according to one observer may appear to occur
at different times to another!
[4] My apologies to George Gamow and Cleveland for lifting this
example almost verbatim from their classic text, Physics:
Foundations and Frontiers.

This leaves us in a somewhat difficult position. Newtonian physics
has the advantage of a universal time which serves as a fixed
background that keeps a strict order on things. In Newtonian physics,
the issue of measuring time was more of a matter of technology -
all you had to do to keep time better was build a better clock! The
deeper issue of exactly what measuring time means never reared
its ugly head there. However, with a time that changes from
observer to observer, STR forces us to take a deeper look at the way
in which we measure space and time. Indeed, the reason why STR
is fundamentally important is the fact that it forces us to take a
deeper look at the very nature of space and time themselves.

2.2 The Space-time diagram


For convenience I will now pretend that the world has only one
space dimension. Most of the amazing new things that relativity
throws at us make their appearance in one dimension - the
transition to three space dimensions from one is mostly devoid of new
shocks. Don’t worry, I will indicate the three dimensional
generalization of our results as and when needed.
Let me introduce a device that will be very useful to us in the
long run. This is none other than your familiar x vs. t diagram - but
with a slight twist. In this diagram I will plot x along the horizontal
axis and t along the vertical axis. Moreover, I will be plotting not t,
but ct along the vertical axis. This makes it possible for me to refer
to both the directions using the same units. You could also say
that I have chosen a system of units in which space and time are
measured by the same units (meters in this case), where a meter
of time is the time it takes light to travel 1 meter (this is about 3.3
nanoseconds in our conventional units). If I adopt this line, then
velocity becomes dimensionless! In particular, the speed of light in
this new system of units is just - one.
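As a small illustration of this choice of units, here is a sketch of the conversions involved. The function names are my own (hypothetical) choices; the only physical input is the defined value of the speed of light in meters per second.

```python
C = 299_792_458.0  # speed of light in m/s (exact, by the definition of the meter)


def meters_of_time_to_seconds(ct):
    """Convert a time expressed in meters (that is, ct) to seconds."""
    return ct / C


def seconds_to_meters_of_time(t):
    """Convert a time in seconds to meters of time (ct)."""
    return C * t


def dimensionless_speed(v):
    """Express a speed given in m/s as a pure number (a fraction of c)."""
    return v / C


if __name__ == "__main__":
    # One meter of time, in nanoseconds (compare with the figure quoted above).
    print(meters_of_time_to_seconds(1.0) * 1e9, "ns")
    # The speed of light itself in the new units.
    print(dimensionless_speed(C))
```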
Figure 2.1 shows a typical space time diagram. In this diagram,
the paths of light waves will be straight lines inclined at an angle of
45° to the axes. The solid red line is that of a beam of light emitted
by the observer’s flashlight at t = 0. The dashed red line is also
that of a light beam - but one travelling to the left. Moving particles
trace out a set of points in this diagram - the line joining them is
called the worldline of the particle. As in our familiar x − t
diagrams, the lines will be straight for particles moving with a uniform
velocity and curved for accelerating ones. One word of warning
- since we have reversed the positions of the time and space axes, the
slope of a worldline is the reciprocal of the particle’s velocity (in
units of c) rather than the velocity itself. Since one of the results
that follow from relativity (one that I will prove later) is that
material particles can’t travel faster than light, all worldlines, straight
or curved, must always be steeper than lightlines - i.e. make more
than a 45° angle with the x axis. The blue line in the figure
represents the worldline of a particle moving with uniform speed, while
the brown one is that of an accelerated particle. Both the green
lines are those of “impossible” particles - the straight one depicts a
particle moving uniformly faster than light, while the curved one
is that of a particle that moves slower than light at the beginning
and end of its track - but is faster than light over the flat middle
portion. Finally, what about the worldline of the observer herself?
Since she is fixed at x = 0, her worldline goes straight up in the
diagram - it is the ct axis!
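The "steeper than a lightline" rule can be turned into a simple check. The sketch below is only an illustration: it assumes that a worldline is given as a list of sampled events (x, ct), both in the same units, and it treats the worldline as straight between samples. The function name and the sample worldlines are hypothetical.

```python
def is_possible_worldline(events):
    """Check that every segment of a sampled worldline is slower than light.

    events: list of (x, ct) pairs in the same length units, ordered by
            strictly increasing ct.
    """
    for (x0, ct0), (x1, ct1) in zip(events, events[1:]):
        if ct1 <= ct0:
            raise ValueError("events must be ordered by strictly increasing ct")
        # Average velocity over the segment in units of c; this is the
        # reciprocal of the segment's slope on the space time diagram.
        beta = (x1 - x0) / (ct1 - ct0)
        if abs(beta) >= 1.0:
            return False  # segment is as flat as, or flatter than, a lightline
    return True


if __name__ == "__main__":
    uniform = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]                  # beta = 0.5
    flat_middle = [(0.0, 0.0), (0.5, 1.0), (3.0, 1.5), (3.5, 2.5)]  # fast middle part
    print(is_possible_worldline(uniform))
    print(is_possible_worldline(flat_middle))
```

The second sample mimics the curved "impossible" worldline described above: slow at both ends, but too flat in the middle.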
Space-time diagrams like this one are very useful in
understanding various concepts of relativity. Let me stress, though, that
the reason why I was able to depict our spacetime on a diagram
is that I decided to ignore two of the spatial directions. If I
had tried to do this for full three dimensional space, then I would
have had to use up all three directions for the spatial axes - which
would have left me nowhere to put the fourth, time, axis! With one
space dimension only, spacetime is two dimensional - something that I
can draw very easily on a sheet of paper.

[Figure 2.1: A space time diagram.]

Although we cannot draw a four dimensional spacetime diagram on a
sheet of paper - nothing prevents us from understanding it at a
mathematical level, or even imagining such a thing. There is nothing
magical about four dimensions - all that we are saying is that events
must be labelled by four numbers - the four coordinates x, y, z and
t. Of course, this was true even in Newton’s world. There is a deeper
sense, though, behind the statement that spacetime is four
dimensional in STR - I hope to make this clear in a coming section.
There is one thing about four dimensional spacetime that I must
bring up here. In the 2D space time in figure 2.1, the light lines that
go through any point form a pair of straight lines, both inclined at
45° to the two axes. In four dimensions, light emitted from the
origin (x = y = z = 0) at t = 0 obeys the equation

$$c^2 t^2 - x^2 - y^2 - z^2 = 0 \tag{2.1}$$

- which is simply the statement that light travels equally in all
directions with the uniform speed c. Although we can’t draw the
surface corresponding to this equation, we can definitely
understand its content. What I can draw is the version of (2.1) in
which I use only two space dimensions, ignoring z. This equation,
$c^2 t^2 - x^2 - y^2 = 0$, is that of a cone. This gives rise to one of the
most important pieces of jargon in relativity - the surface described
by (2.1) is called the lightcone. I will have a lot more to say about
the lightcone later.
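To make the content of (2.1) concrete, here is a minimal sketch that evaluates its left hand side for a given event and tests whether the event lies on the lightcone of the origin. The function names, the tolerance and the sample event are hypothetical choices of mine; time is entered as ct so that all four inputs share the same units.

```python
import math


def lightcone_function(ct, x, y=0.0, z=0.0):
    """Left hand side of (2.1): (ct)^2 - x^2 - y^2 - z^2."""
    return ct**2 - x**2 - y**2 - z**2


def on_lightcone(ct, x, y=0.0, z=0.0, rel_tol=1e-12):
    """True if the event can be reached by light emitted at the origin at t = 0."""
    return math.isclose(ct**2, x**2 + y**2 + z**2, rel_tol=rel_tol)


if __name__ == "__main__":
    # Light sent from the origin along the direction (3, 4, 12)/13 has, by the
    # time ct = 26, reached the point (6, 8, 24), since 6^2 + 8^2 + 24^2 = 26^2.
    print(lightcone_function(26.0, 6.0, 8.0, 24.0))
    print(on_lightcone(26.0, 6.0, 8.0, 24.0))
    # Dropping z gives the two dimensional cone that can actually be drawn.
    print(on_lightcone(5.0, 3.0, 4.0))
```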

2.3 Measuring space-time with the flashlight

A simple way of measuring the distances and times that characterize
where and when an event occurs follows simply from the
discussion in section 2.1. All I have to do is to shine our flashlight
and wait for the light to bounce back! Actually I have to do a bit
more - I must note the times at which the flash was sent out and
came back in - both according to my wristwatch. This is shown in
figure 2.2. Since light must have travelled the same distance going
out from me to the point where it bounced and back to me and must
have covered both of the halves at the same pace - the bounce must
have occurred just midway in between. If the flash was sent out at
t1 and came back at t2, then the time and position of the
bounce must be

$$t = \frac{1}{2}\,(t_2 + t_1) \tag{2.2-a}$$

$$x = \frac{c}{2}\,(t_2 - t_1) \tag{2.2-b}$$

[Figure 2.2: Measuring space and time with a flashlight. A flash sent out at t1 bounces at the event (x, t) and returns at t2.]
You may find this hard to swallow - but almost all of relativity can
be derived from this nearly trivial set of rules!
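The rules (2.2-a) and (2.2-b) are simple enough to write down as a function. What follows is only a sketch: the name radar_coordinates and the sample times are hypothetical, and the two times are assumed to be read off the same wristwatch.

```python
C = 299_792_458.0  # speed of light in m/s


def radar_coordinates(t1, t2, c=C):
    """Locate a bounce from the send time t1 and the return time t2.

    Implements (2.2-a) and (2.2-b): the bounce happens midway in time,
    at a distance that light covers in half the round trip.
    Returns the pair (t, x).
    """
    if t2 < t1:
        raise ValueError("the flash cannot return before it is sent")
    t = 0.5 * (t2 + t1)
    x = 0.5 * c * (t2 - t1)
    return t, x


if __name__ == "__main__":
    # A flash sent at 1 s and received back at 3 s, by my own watch.
    t, x = radar_coordinates(1.0, 3.0)
    print("t =", t, "s;  x =", x, "m")
    # In units where c = 1 the two rules look even more alike.
    print(radar_coordinates(1.0, 3.0, c=1.0))
```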
Our humble flashlight is in rather good company - the all-important
radar systems that help armies locate and shoot down enemy
planes, and air traffic controllers guide friendly ones to a safe
landing, work on exactly the principle outlined above! So, I will
refer to this way of measuring space and time as the radar method.
The rules that I have laid down above may seem too simple to
be true. More importantly - they seem so much in tune with our
classical notions that it is hard to see just how the conclusions
that we can draw from these rules can differ from the classical
ones. Let me point out just where the rules cease to be classical.
In Newtonian mechanics, the equations (2.2-a) and (2.2-b) will be
valid, but only in the one unique frame where the speed of light is
the same in all directions. This is the frame fixed to the medium
in which light propagates - the luminiferous ether. Einstein freed
physics from the shackles of the ether - so that in STR (2.2-a) and
(2.2-b) are valid for all inertial observers. Thus, if an observer is
moving away from me, he can measure the in and out times of
a light beam and use the same equations to locate the bounce in
space and time - but this time according to his own coordinates.
One of the major new features in STR is that space time mea-
surements change from observer to observer in a rather new, un-
expected manner. Since all inertial observers can play the game
of measuring spacetime coordinates by using the equations (2.2-a)
and (2.2-b), we can find out how these changes work out by playing
with the rules of this game. This is precisely what I intend to do in
the next few sections.

2.4 The K factor


Let us now consider two players in the game of measuring space-
time coordinates. I will call them Alice and Bob. Bob crosses Alice
just when Alice’s watch reads zero and he also synchronizes his
watch to read zero at the same time. From Alice’s viewpoint, Bob
moves to the right with a uniform speed v.
Both Alice and Bob are carrying flashlights of their own. Alice
keeps shining her flashlight in Bob’s direction at a regular interval
of τ . If Bob had been standing still, he would have received the
flashes of light the same interval apart. Since he is moving away
from Alice, each successive flash of light takes longer to catch up
with him - so he receives the signals at a larger interval. Let’s call
the factor by which this interval is larger K. So the flashes of light
will reach Bob at intervals of Kτ , as read by the watch carried by
him. This last point is of crucial importance - remember that I
have already shown you that the watches of Alice and Bob may not
agree.
The K factor will play a central role in the story of how Alice’s and
Bob’s readings of events are related to each other. It is called the
Bondi K factor in honour of Hermann Bondi - who introduced this
delightful little character onto the stage of relativity theory.
Think about it a bit, and you will realize that you have already
seen the K factor in action. Alice did not really have to switch the
flashlight on and off to produce a periodic signal - if she had just
kept on shining the flashlight steadily, the electric field in the light
would go up and down - reaching a maximum once every time period.
Bob will receive the peaks at a larger time interval - thus the time
period that Bob sees is greater by a factor of K. So, Bob sees the
light to have a smaller frequency than Alice does. This is nothing
other than the Doppler effect - remember how the whistle of a train
engine suddenly drops in pitch when the engine crosses you? This
is why another name for K is the Doppler factor.

[Figure 2.3: The K factor. In (a) Alice sends Bob flashlight signals
at an interval of τ, which Bob receives at an interval Kτ. (b) shows
what happens when Bob sends his signals - note that the same
factor K shows up again. The times shown in purple are
times read by Alice’s watch, while those in blue are readings from
Bob’s watch.]
What happens if Bob decides to shine his flashlight at Alice? If
he flashes his flashlight at an interval τ′ as measured by his watch,
Alice will receive the flashes at a larger interval of time according to
her watch. What may come as a surprise to you is that the interval that
Alice will see is exactly Kτ′. The same factor K comes in, both for
Bob receiving Alice’s signals and for Alice receiving Bob’s signals!
Why are the two factors the same? It is the basic principle of
relativity, the fact that all inertial observers are exactly equivalent,
in action! Note that if Bob had moved away from Alice to the left
instead of to the right, he would still have received Alice’s signals
at the same interval as long as his speed is the same as before -
there being no difference between left and right. Now, with respect
to Bob, Alice moves to the left with the same speed as
Bob is moving to the right with respect to her. Thus any difference
between the two factors will point to a fundamental difference
between Alice and Bob as observers. This precisely is what the
principle of relativity rules out!

Imagine that Alice has flashed her flashlight at Bob at time τ
according to her wristwatch, and that Bob receives it at a time
when his watch reads Kτ. Instead of letting the
light beam flash past him, Bob holds up a mirror that reflects the
pulse back to Alice. This beam will reach Alice at $K^2\tau$, who reflects
it back to reach Bob at $K^3\tau$, and so on ... Figure 2.4 shows this
bouncing to and fro of the flashlight beam.
When and where does the first bounce at Bob’s mirror happen,
according to Alice? Our basic rules of spacetime measurement,
(2.2-a) and (2.2-b), tell us the answer:

$$t = \frac{K^2+1}{2}\,\tau$$

$$x = c\,\frac{K^2-1}{2}\,\tau$$

[Figure 2.4: spacetime diagram of the flash bouncing between Alice and Bob, with the times τ, Kτ, $K^2\tau$ and $K^3\tau$ marked]

Figure 2.4: Determination of K

One thing must be immediately clear - the time at which Bob
saw the first bounce occur was Kτ - which is clearly different
from the time $\frac{K^2+1}{2}\tau$ that Alice infers for it.
To find the value of K, just notice that

$$\frac{x}{t} = \frac{K^2-1}{K^2+1}\,c$$

Since x is the position where Bob is at time t, according to Alice,
the ratio $x/t$ is nothing but the speed v of Bob that Alice sees. So,
calling the dimensionless quantity $v/c$ β, the equation translates to

$$\beta = \frac{K^2-1}{K^2+1} \tag{2.3}$$

which can be easily solved to find K

$$K = \sqrt{\frac{1+\beta}{1-\beta}}. \tag{2.4}$$

Thus the K factor goes from 1 when Bob is stationary with re-
spect to Alice, to ∞ when Bob’s speed approaches the speed of
light (β = 1). The factor is smaller than 1 for negative values of β,
and falls all the way to 0 for an observer moving to the left with a
speed c. Indeed, reversing the sign of the velocity changes the K
factor to its reciprocal.
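
If you would like to play with these two formulas, here is a minimal sketch in Python. The function names k_from_beta and beta_from_k are my own labels for (2.4) and (2.3), nothing more, and the sample speeds are arbitrary.

```python
import math


def k_from_beta(beta: float) -> float:
    """Bondi K factor from equation (2.4); needs -1 < beta < 1."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between -1 and 1")
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def beta_from_k(k: float) -> float:
    """Speed in units of c from equation (2.3); needs k > 0."""
    if k <= 0.0:
        raise ValueError("K must be positive")
    return (k * k - 1.0) / (k * k + 1.0)


if __name__ == "__main__":
    # Arbitrary sample speeds, chosen only for illustration.
    for beta in (0.0, 0.6, -0.6, 0.99):
        k = k_from_beta(beta)
        print(f"beta = {beta:+.2f}   K = {k:.4f}   back to beta = {beta_from_k(k):+.4f}")
```

Running it for β = 0.6 and β = −0.6 shows the reciprocal property directly: the two K values multiply to 1.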
If Bob moves away from Alice with a velocity v, then Alice must
move away from Bob with a velocity −v. It follows, then, that the K
factor of Alice with respect to Bob must be $K^{-1}$, if that of Bob with
respect to Alice is K. However, isn’t this contrary to what I said
a while ago - that when Bob sends her signals at time intervals
of τ, Alice must receive them at intervals of Kτ (and not $K^{-1}\tau$),
according to the principle of relativity?
If you think about this a bit, you will realize that there is no
contradiction at all! Imagine a third fellow, maybe Charlie, who is
situated well to the left of both Alice and Bob, and who is also shining
flashes of light at both of them. Both Alice and Bob see the flashes
travelling to the right⁵ with speed c. If Bob sees these flashes at a
time interval of τ, then Alice must see them at the smaller interval
$K^{-1}\tau$. This, exactly, is the meaning of the statement that the
Doppler factor of Alice with respect to Bob is $K^{-1}$! In this case,
Alice is moving, as far as Bob is concerned, in the direction of the
source of the light and hence gets to meet each successive
pulse sooner than Bob does. When Bob sends Alice the signals,
though, the light signals must move towards the left to get to
Alice, which is the same direction that Bob sees Alice move away
in! The light signals take some extra time to catch up with the
receding Alice - hence, the increased time interval of Kτ.

⁵ That the direction as well as the speed of light stays unchanged when you
jump from one inertial frame to another is a consequence of the fact that “you
just can’t outrun light”! I will show you how this result follows from the
relativistic rules of velocity addition in section 3.4.

In summary, then, the Doppler factor K for Bob with respect to
Alice is what you must multiply time intervals between light signals
as seen by Alice, to get that seen by Bob - but for light travelling
towards the positive x direction, namely, to the right! For light
travelling towards the left, the multiplying factor is not K at all,
but rather $K^{-1}$!
In terms of frequencies, Bob must divide the frequency of light
that Alice sees by K to get the frequency that he sees, but again,
only for light travelling to the right. For light moving the other way,
Bob will have to multiply by K, instead of dividing by it!
Let me address another point that you may have already thought
of on your own. The equation (2.4) clearly shows that the value of
K ceases to be real once β exceeds 1 in magnitude. Does this prove
the statement that I have made all along - that it is impossible for
a material object to move faster than light? Unfortunately, the an-
swer is - no! Just think about it a bit and you will realise that all
it means is that if Bob were to recede from Alice faster than light,
then the flashlight signals Alice sends out towards him will never
catch up with him! To understand why material particles cannot
exceed the speed of light, you will have to wait a while more!

You may be inclined to think of (2.4) as more physically significant
than (2.3). After all, velocity is something we all know and understand
- while the K factor is something of an unknown quantity.
However, now that I have shown you that measurements of space
and time may not be as straightforward as they may have seemed, it
should not come as much of a surprise that the speed is not that
easy a quantity to measure directly. On the other hand, K is something
that can be measured directly with a lot of ease. All Alice and
Bob have to do is measure the frequency of a given flash of light.
The ratio of these values gives us the K factor directly. So, in a
practical situation, it is more likely that we will calculate β from K
rather than the other way around. Indeed Alice may just as well
describe the motion of Bob by his K factor rather than his speed -
and this will turn out to be quite convenient for many purposes.
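
To see how this would go in practice, here is a small sketch in Python. It assumes the light travels to the right, in the direction Bob is moving, so that K is the frequency Alice measures divided by the frequency Bob measures. The function name and the sample frequencies are made up for illustration.

```python
def beta_from_frequencies(f_alice: float, f_bob: float) -> float:
    """Bob's speed (in units of c) from two frequency readings of the same light.

    Assumes the light travels in the direction of Bob's motion, so that
    K = f_alice / f_bob, and then applies equation (2.3).
    """
    if f_alice <= 0.0 or f_bob <= 0.0:
        raise ValueError("frequencies must be positive")
    k = f_alice / f_bob
    return (k * k - 1.0) / (k * k + 1.0)


if __name__ == "__main__":
    # Hypothetical readings: Bob sees half the frequency that Alice does.
    print(beta_from_frequencies(6.0e14, 3.0e14))
```

A frequency ratio of 2 corresponds to K = 2, and (2.3) then gives β = 3/5.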

In order to stress the difference between the STR scenario
and the classical one, let me show you what the K
factor would have been in a world that obeyed Newton’s
physics. The first light pulse, sent at t = 0, reaches Bob
immediately, since he is right next to Alice at this point.
If Bob receives the second flash at time τ′ (note that this
is classical physics - we are not bothered about who measures
this time), then the second flash must have travelled
for a time of τ′ − τ. This gives $c(\tau' - \tau) = v\tau'$, leading
to

$$K = \frac{\tau'}{\tau} = \frac{c}{c-v} = \frac{1}{1-\beta}$$

What about the signals Bob emits? Again, Bob’s first
signal reaches Alice immediately. When Bob emits the
second flash, at a time τ (say), he is at a distance of vτ
from Alice. The flash covers the distance back to Alice in
a time given by $\frac{v}{c}\tau = \beta\tau$, so that Alice receives the flash
at $(1+\beta)\tau$. So, the K factor for Alice for signals emitted
by Bob turns out to be

$$K' = 1 + \beta$$

Note that unlike STR’s prediction, the two K factors are
different! Obviously, both differ from the STR prediction,
(2.4). Note, though, that since $(1+\beta)^{-1} \approx 1-\beta$ for $\beta \ll 1$,
both these factors are quite close to the relativistic value
for low speeds.
Why do the two K factors differ? If you carefully go
through the derivation above, you will see that despite
appearances, there is a big difference between Alice and
Bob as observers. I have tacitly assumed that the speed
of light is c with respect to Alice, i.e. she is fixed with
respect to the ether! In the classical picture, then, Al-
ice does have a privileged position! Indeed, note that
in the propagation of sound, it does matter whether the
source moves or the observer moves, because in each
case there is a preferred reference frame - one fixed to
air, the medium of sound propagation.
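
A quick numerical comparison makes the point of this aside concrete. The sketch below, in Python, tabulates the two Newtonian factors next to the relativistic one. The function names are mine and the sample speeds are arbitrary.

```python
import math


def k_relativistic(beta: float) -> float:
    """Equation (2.4): the same factor in both directions."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def k_classical_alice_to_bob(beta: float) -> float:
    """Newtonian factor for Alice's signals received by Bob: 1 / (1 - beta)."""
    return 1.0 / (1.0 - beta)


def k_classical_bob_to_alice(beta: float) -> float:
    """Newtonian factor for Bob's signals received by Alice: 1 + beta."""
    return 1.0 + beta


if __name__ == "__main__":
    print("   beta   1/(1-beta)   1+beta   sqrt((1+beta)/(1-beta))")
    for beta in (0.001, 0.1, 0.6):
        print(f"{beta:7.3f} {k_classical_alice_to_bob(beta):12.6f} "
              f"{k_classical_bob_to_alice(beta):8.6f} {k_relativistic(beta):12.6f}")
```

At low speeds the three columns are nearly indistinguishable. At β = 0.6 they read 2.5, 1.6 and 2. Notice, incidentally, that the relativistic value is the geometric mean of the two classical ones, since their product is (1 + β)/(1 − β).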
Chapter 3

Kinematics from the flashlight - the K calculus in action

3.1 Time Dilation

Let’s go back to figure 2.4 again. As you have already seen, Alice
and Bob disagree about the time at which a flash sent out by
Alice at time τ meets Bob. According to Bob, this happens at a time
Kτ, whereas Alice reckons this to be $\frac{K^2+1}{2}\tau$. It is pretty easy to see
that Bob’s time is the smaller of the two, no matter what the value
of K is¹!
You may object that this destroys the equivalence of the two
players in this game. After all, if Bob always measures the smaller
time, then he can not be on the same footing as Alice! Note, however,
that Bob does have a special role in the event that we are
talking about - he is on the spot! This means that he had the luxury
of directly using his watch to time the bounce, while poor Alice
had to be content with an indirect determination of the time. Indeed,
if we look at the second bounce the light flash suffers, we see
that Alice times it at $K^2\tau$, while Bob infers that it occurs midway
in between the first and third bounces, i.e. at $\frac{K^3+K}{2}\tau$. Once again, the
“on the spot” time (this time it’s Alice’s turn) is smaller than the
inferred time. There is nothing special, then, about either Alice or
Bob - whoever happens to be on the spot measures a smaller time.

¹ $\frac{K^2+1}{2}\tau - K\tau = \frac{1}{2}(K-1)^2\tau > 0$ for all real values of K other than 1. Note that here (as well
as elsewhere) we are assuming that Bob can not move away from Alice at a rate
faster than light.

So, STR does make a distinction between a time measurement
made directly and one made indirectly. In STR jargon, directly
measured time is called the proper time. Going by this, you may
be tempted to call the indirectly measured time the “improper time”
(some authors do). In my opinion, this latter term is not very suit-
able - since it gives us the notion that something is wrong about
such a time measurement!
Going back to the first bounce suffered by the flash of light in
figure 2.4, we denote the proper time at which it occurs (Bob’s time)
by $t_0$ and Alice’s time by plain t. We see that

$$\frac{t}{t_0} = \frac{K^2+1}{2K} = \frac{\frac{1+\beta}{1-\beta}+1}{2\sqrt{\frac{1+\beta}{1-\beta}}} = \frac{1}{\sqrt{1-\beta^2}}$$

which gives us the famous time dilation formula

$$t = \frac{t_0}{\sqrt{1-\beta^2}}. \tag{3.1}$$

The factor $\frac{1}{\sqrt{1-\beta^2}}$ recurs so often in STR that it deserves a special symbol
of its own, namely γ:

$$\gamma \equiv \frac{1}{\sqrt{1-\beta^2}} = \frac{K^2+1}{2K} \tag{3.2}$$

and so

$$t = \gamma t_0.$$
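
If you want to convince yourself that the two expressions for γ in (3.2) really agree, a few lines of Python will do it. This is only a numerical spot check at a handful of arbitrary speeds, not a proof, and the function names are my own.

```python
import math


def gamma_from_beta(beta: float) -> float:
    """First form in (3.2): 1 / sqrt(1 - beta^2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def gamma_from_k(k: float) -> float:
    """Second form in (3.2): (K^2 + 1) / (2 K)."""
    return (k * k + 1.0) / (2.0 * k)


if __name__ == "__main__":
    for beta in (0.1, 0.6, 0.9):
        k = math.sqrt((1.0 + beta) / (1.0 - beta))  # equation (2.4)
        print(f"beta = {beta:.1f}   from beta: {gamma_from_beta(beta):.6f}   from K: {gamma_from_k(k):.6f}")
```

For β = 0.6, where K = 2, both forms give γ = 1.25.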

Note that the ratio of measured times is the same for the second
bounce of the flash as well - but this time t is Bob’s time, while $t_0$
is the time as measured (directly) by Alice.
Although equation (3.1) is pretty straightforward, it can lead
you into all sorts of trouble, especially if you get confused about
which observer’s time is $t_0$ and which one is t. So, let me stress
this once again: if an observer is present at the location of both the
events (on the spot) then the time interval he measures between them
is the proper time. If you are talking of two events such that neither
Alice nor Bob is on the spot for both of them, then neither can claim
his or her time measurement to be the proper time. How, then, are the
two times related to each other? Well, you will just have to wait till
we get to the derivation of Lorentz transformations in a little while.

3.1.1 Moving clocks run slow!


Let’s try looking at the watch Bob is carrying with him from Alice’s
point of view. Two successive ticks on the watch are two events and
both Bob and Alice can time the gap between them. Since Bob
is present at both the ticks (it’s his wristwatch, after all!), he can
directly read off the time interval. So, Bob measures the proper
time interval between the ticks, while Alice does not. So, if Bob
says that the gap between two successive ticks on his watch is one
second, Alice reckons the gap to be larger - γ seconds to be precise.
From Alice’s point of view, Bob’s watch is running slow!

You must have caught on by now to the conclusion that as far
as Bob is concerned, it is Alice’s watch which is running slow - by
the same factor γ. This leads to one of the most surprising claims
of STR - moving clocks run slow! Does Bob notice his own clock
run slow? Not on your life! That would have given him the ability
to detect his own motion just by looking at his watch - something that
an inertial observer just can’t do!

[Figure 3.1, panels (a) and (b): the path of the light ray between the two mirrors, with lengths $l_0$, $ct/2$ and $vt/2$ marked]

Figure 3.1: An idealized clock. (a) The time taken for light to return
to the lower mirror is $2l_0/c$. (b) The time taken is longer for someone
moving with respect to the clock, since light has to cover a bigger
distance.

Let me show you a more direct proof of this effect of a moving
clock running slow. To do this, you have to imagine an actual
clock, albeit a highly idealized one! Imagine a contraption which
has two mirrors, a distance $l_0$ apart. As shown in figure 3.1a, a ray
of light leaves the lower mirror and bounces from the top mirror
to return. This to and fro motion of light defines one period for our
clock, which I will refer to later on as the “photon clock”. If I stand
still next to this clock, I will see light cover a net distance of $2l_0$
- which means the time period of the clock with respect to me is
$2l_0/c$. Note that since both the start and the end of the light ray’s to and
fro journey occur in the same place with respect to me, this time
interval is the proper time interval between the two events.
Consider what this will look like to someone else, who sees me
and my clock move uniformly to the right with a speed v. He will see
light leave the lower mirror, reach the upper one and bounce back
to the lower one, just as I do. However, by the time light reaches
the upper mirror, the mirror will have shifted to the right - hence he will
see light follow the path shown in figure 3.1b. The path is obviously
longer than the one I see. If this had been Newtonian physics, he
would have claimed that light is moving faster (at a speed $\sqrt{c^2+v^2}$
to be exact), so that it covers the path in the same time $2l_0/c$ that I
see. That option is out - STR tells us that light has to make the trip
at the same speed with respect to him too - and thus he must see
the trip take a larger time. Thus one tick on my moving clock takes
more time than the designated $2l_0/c$ as far as he is concerned - it is
running slow!
To find out just how slow, all we have to do is appeal to good old
Pythagoras! If t is the time taken for the round trip according to
him, then light takes half of it to get to the upper mirror, covering a
distance of $ct/2$. In this time, the upper mirror has moved a distance
of $vt/2$. So, in figure 3.1b we have a right angled triangle with sides $l_0$
and $vt/2$ and a hypotenuse $ct/2$. So

$$\left(\frac{ct}{2}\right)^2 = l_0^2 + \left(\frac{vt}{2}\right)^2$$

which can be easily solved to give

$$t = \frac{2l_0/c}{\sqrt{1-\frac{v^2}{c^2}}} = \frac{t_0}{\sqrt{1-\beta^2}}$$

- exactly as our time dilation formula tells us.
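
Here is the photon clock worked through with numbers, as a rough sketch in Python. The mirror separation of 1.5 m and the speed of 0.6 c are arbitrary choices of mine, and the function name is only a label.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def photon_clock_period(l0: float, v: float) -> float:
    """Round-trip time of the photon clock for someone who sees it move at speed v.

    Solves (c t / 2)^2 = l0^2 + (v t / 2)^2 for t, which gives
    t = 2 l0 / sqrt(c^2 - v^2).
    """
    if not 0.0 <= v < C:
        raise ValueError("v must satisfy 0 <= v < c")
    return 2.0 * l0 / math.sqrt(C * C - v * v)


if __name__ == "__main__":
    l0 = 1.5          # assumed mirror separation in metres
    v = 0.6 * C       # assumed speed of the clock
    t0 = 2.0 * l0 / C                   # proper period, measured next to the clock
    t = photon_clock_period(l0, v)      # period seen by the other observer
    print(f"t0 = {t0:.3e} s   t = {t:.3e} s   t / t0 = {t / t0:.4f}")
```

The ratio of the two periods comes out as 1.25, the value of γ for β = 0.6.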

3.1.2 Accelerated clocks


So, a clock that is whizzing past you rapidly does run slow compared
to your clock. But what if, instead of just moving at a steady
pace, the clock also accelerates? This was a pretty thorny question
in the formative years of relativity - but today we know the answer:
acceleration does not matter! This does not mean that your real
clock is unaffected by accelerations - drop one on the floor and it
is likely to break! What it does mean is that it is in principle possible
to build clocks that are not affected by accelerations and that
in reality we can get pretty close to this ideal. So a nonuniformly
moving clock whose speed is u slows down by the same factor of
$\sqrt{1-u^2/c^2}$ as a uniformly moving one. The only trouble is that
this factor varies with the speed of the clock and hence with
the time. Thus the factor can be used, but only over infinitesimal
periods of time:

$$d\tau = dt\,\sqrt{1-\frac{u^2(t)}{c^2}} \tag{3.3}$$

Here we have used the notation τ for the proper time as observed
by an arbitrarily moving observer. We will reserve the notation $t_0$ for
the case where the observer is moving uniformly. The net proper
time that elapses between two events A and B is thus given by

$$\Delta\tau = \int_A^B dt\,\sqrt{1-\frac{u^2(t)}{c^2}} \tag{3.4}$$

Of course, this reduces to (3.1) when the velocity of the clock is a
constant, u(t) = v.
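
For a clock whose speed keeps changing, the integral (3.4) usually has to be done numerically. The sketch below, in Python, uses a plain midpoint rule. The function proper_time and the two speed profiles are illustrative assumptions of mine, with the speed supplied as β(t) = u(t)/c.

```python
import math
from typing import Callable


def proper_time(beta_of_t: Callable[[float], float],
                t_a: float, t_b: float, steps: int = 10_000) -> float:
    """Midpoint-rule estimate of equation (3.4).

    beta_of_t(t) must return u(t)/c with magnitude below 1.
    """
    dt = (t_b - t_a) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_a + (i + 0.5) * dt
        total += math.sqrt(1.0 - beta_of_t(t_mid) ** 2) * dt
    return total


if __name__ == "__main__":
    # Constant speed 0.6 c for 10 years: should reproduce (3.1), i.e. 8 years.
    print(proper_time(lambda t: 0.6, 0.0, 10.0))
    # A made-up profile: speed rising linearly from 0 to 0.8 c over 10 years.
    print(proper_time(lambda t: 0.08 * t, 0.0, 10.0))
```

The constant-speed case returns 8 years, as (3.1) demands. The accelerating clock records a little under 8.8 years out of the 10.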

3.1.3 The Doppler factor revisited


In section 2.4 I presented one of the major results of this book - the
Doppler factor for a moving observer. I also showed you that the
value calculated from STR differs from the classical result. Armed
with our result for time dilation, we may now re-derive the formula
(2.4) more directly.

As you have seen, the classical value for the K factor is $\frac{1}{1-\beta}$.
This means that classically, a signal that Alice sends out at time
τ would reach Bob at a time $\frac{\tau}{1-\beta}$. And so it does - even if we take
STR into account! If you take a look at the derivation there, all I
have used is the fact that in a time τ′, Bob has moved away from
Alice by a distance of vτ′ and at that instant the light pulse, which
has been travelling for a time τ′ − τ, must have moved a distance of
$c(\tau' - \tau)$. Equating these two gave me τ′, the time when
Alice’s second pulse reaches Bob. That sort of impeccable reasoning
is something you just can’t fault! The only catch is that, while
the second pulse that Alice sends out does reach Bob at a time
given by $\tau' = \frac{\tau}{1-\beta}$, this is the time according to Alice’s watch! What
does Bob’s watch read at this instant? Since his watch is running
slow compared to Alice’s by a factor of $\sqrt{1-\beta^2}$, his watch must be
reading

$$\tau' \times \sqrt{1-\beta^2} = \frac{\tau}{1-\beta}\sqrt{1-\beta^2} = \sqrt{\frac{1+\beta}{1-\beta}}\;\tau \tag{3.5}$$

which gives the K factor according to (2.4)!
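
The two steps of this argument can be followed with numbers in a short Python sketch. The function name is my own, and the sample values of τ and β are arbitrary.

```python
import math


def bob_watch_reading(tau: float, beta: float) -> float:
    """Bob's watch reading when Alice's pulse, sent at her time tau, reaches him.

    Step 1: the arrival time by Alice's watch is the classical result tau / (1 - beta).
    Step 2: Bob's watch runs slow by sqrt(1 - beta^2), as in equation (3.5).
    """
    arrival_by_alice = tau / (1.0 - beta)
    return arrival_by_alice * math.sqrt(1.0 - beta * beta)


if __name__ == "__main__":
    tau, beta = 1.0, 0.6
    k = math.sqrt((1.0 + beta) / (1.0 - beta))  # equation (2.4)
    print(bob_watch_reading(tau, beta), k * tau)
```

With τ = 1 and β = 0.6 the pulse arrives at 2.5 by Alice’s watch. Bob’s watch then reads 2.5 × 0.8 = 2, which is exactly Kτ.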

3.1.4 Testing relativity


Is time dilation for real? The first question that you may raise is,
if moving clocks really do run slow, how come no one had noticed?
The answer should be immediately obvious! All the clocks that we
see around us move too slowly for the effect to show up! Consider
a clock that rushes past you on an aircraft flying at Mach 1 (the
speed of sound). How much does this clock slow down by? To get
an estimate of this, note that the speed of sound is about $300\ \mathrm{m\,s^{-1}}$,
while that of light is $3\times10^{8}\ \mathrm{m\,s^{-1}}$, so that the β for our clock is $10^{-6}$.
So, this clock goes slow compared to the one that sits still next to
you by a factor γ given by

$$\gamma = \left(1-10^{-12}\right)^{-\frac{1}{2}} \approx 1 + 5.0\times10^{-13}$$

So, this clock that flies past you does go slower than yours - but
only by five parts in $10^{13}$! No wonder no one had noticed!
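
If you try this estimate on a computer, a word of caution: γ is so close to 1 here that computing it and then subtracting 1 throws away most of the significant digits. The Python sketch below avoids this with math.expm1 and math.log1p. The speeds are the rough values used above, and the function name is mine.

```python
import math


def gamma_minus_one(beta: float) -> float:
    """Return gamma - 1 without losing precision when beta is tiny.

    gamma = (1 - beta^2)^(-1/2) = exp(-0.5 * log(1 - beta^2)), so
    gamma - 1 = expm1(-0.5 * log1p(-beta^2)).
    """
    return math.expm1(-0.5 * math.log1p(-beta * beta))


SPEED_OF_SOUND = 300.0   # m/s, rough value used in the text
SPEED_OF_LIGHT = 3.0e8   # m/s, rough value used in the text

if __name__ == "__main__":
    beta = SPEED_OF_SOUND / SPEED_OF_LIGHT
    excess = gamma_minus_one(beta)
    lag_per_day = excess * 86_400.0  # seconds lost by the moving clock per day
    print(f"gamma - 1 = {excess:.3e}   lag per day = {lag_per_day:.3e} s")
```

The result is five parts in $10^{13}$, or a lag of roughly 43 nanoseconds over a whole day of flying.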
Having said that, I must hasten to add that today’s technology
is good enough to actually detect even such a small effect! Today,
we have many direct verifications of the time dilation formula (3.1)
- something which would have been unthinkable in Einstein’s time.
What’s more, the time dilation formula is no longer a mere curiosity
producing either tiny effects that would not be detectable
except with very sophisticated technology, or large effects for exotic
objects that travel very fast - it is almost a part of daily life
today! You may have heard of the Global Positioning System (GPS)
- a network of satellites that broadcast precisely timed signals. A
receiver (on a ship in the open sea, on an aircraft or even in your
mobile phone!) times the signals from several satellites and uses
something very much like high school trigonometry to work out its
own location, to within a few meters. The clocks on the satellites run
slow compared to the clocks on the ground (because of their large
orbital speeds) and you must compensate for this effect to achieve
this accuracy! Without this compensation, GPS calculations would
have been off by miles! This perhaps is one of the most drastic
confirmations of STR² that affect our daily lives.

3.1.5 The twin paradox


Scientists love paradoxes - apparently impeccable arguments that
lead to absurd or contradictory results. Resolving paradoxes helps
them hone their logical skills, and sometimes even make new dis-
coveries. Perhaps no other branch of physics has given rise to so
many paradoxes as the special theory of relativity. One of the best
known is the notorious twin paradox - which is based on the phe-
nomenon of time dilation that we have just learned about.
Remember what I said about the impossibility of an inertial observer
detecting his motion except by looking “outside”. This means
that when Bob rushes away from Alice, not only will she see his
“photon clock” slow down due to time dilation, but every periodic
activity that happens to Bob must slow down in proportion! This
includes, of course, Bob’s biological clock - the one that tells his
heart when to beat, his lungs when to breathe, his stomach when
to send hunger signals to the brain, ... This way, Bob will not feel
this slowing down at all! Otherwise, all he has to do is to time his
photon clock with his pulse (something very much like what Galileo
did with the swinging lamp in the cathedral of Pisa), to find out
that it is running slow - and hence detect his motion, without looking
at Alice or anyone else! The upshot of all this is that when Bob
is in high speed motion with respect to Alice, his bodily functions
slow down - and he ages less rapidly!

² Actually, there is one more effect that makes the clocks on board the satellites
run differently from the earth based clocks. The fact that the gravitational
field of the earth is a lot weaker at the satellite than on the surface has an effect
comparable in size to that due to (3.1) - and the GPS system has to compensate
for both the effects! The gravitational time dilation is one of the important
predictions of Einstein’s general theory of relativity (GTR). Thus, the GPS provides
confirmation, on a daily basis, of both of Einstein’s relativity theories!



Now, imagine that Alice and Bob are twins³. On their twentieth
birthday, Bob sets out on a high speed intergalactic journey,
while Alice stays behind on earth. Fifty years go by, by Alice’s
clock, when Bob returns home. Alice is now at the ripe old age
of seventy - Bob has a great deal of difficulty recognising her. To
Alice’s eye, however, Bob has changed very little - he is very much
the young man who had left. The reason, of course, is that Bob’s
clocks, including his biological one, have been running slowly all
this while, so that he has aged only five years during his round
trip! So, now when the two twins hug each other, one is seventy
years old, while the other is only twenty-five!
The scenario above may seem absurd to us - but is it para-
doxical? No! The only reason we have any difficulty in accepting
this scenario is because our small-speed experiences condition us
to the notion of time being the same for everyone. We don’t see
anything like this happening around us, but today we have ample
experimental evidence to show that this is exactly what happens!
So, where is the paradox? Just look at the thing from Bob’s
point of view. He sees Alice recede from him and then come back,
at the same very large speed! Since the laws of physics are the
same for him as well, he concludes that it is Alice’s clock that must
run slow, and thus if he has aged five years during the trip, Alice
must have aged by six months only! So, instead of being twenty-five
and seventy at the end of the trip, Bob and Alice should be
twenty-five and twenty and a half only, respectively! This is the
twin paradox - where the final outcome of which twin is older than
the other one appears to depend on which of the two friends is
thinking about it!

³ I could have invented a new pair to be the twins in our paradox - but I don’t
want to clutter up the story with too many characters!



This paradox - one that has bothered a lot of stalwarts in relativity
theory - has a very simple resolution - there is just no paradox!
The whole issue hinges on the claim that Bob has just
as much right to draw conclusions about Alice’s clock as she has about
his. Why? Because the basic principle of relativity says so! Well
... not quite - the basic principle of relativity is that all inertial
observers are equal - not that all observers are equal! Alice, the stay
at home twin, is inertial - but not Bob. Being inertial, the conclusion
that Alice draws is correct. So, Bob’s conclusion about Alice
being younger than him is simply - wrong!
So, the whole resolution of the twin paradox rests on the fact
that Bob accelerates during his journey with respect to Alice, once
each at the beginning and the end of his journey, and once at the
midpoint in order to turn back. Indeed, I can get rid of the initial
and final accelerations from the story by getting Bob to pass by
Alice on his rocket (neither starting from nor stopping at her location on
earth), but the acceleration at the middle is a must. Note that this
acceleration is over a very brief duration compared to Bob’s entire
journey (and indeed it can be made as brief as you like by making
the rockets stronger and stronger) - but the entire crux of the time
difference lies in that brief instant.
One thing that has worried a lot of people about this is that for
most of his journey Bob is just as inertial as Alice - so they find it
difficult to figure out how their conclusions about the time it takes
differ so much. However, Hermann Bondi has come up with a
great analogy to show why this is not something one should bother
about. If Alice had driven straight from a city A to a city B, while
Bob makes the trip A to C and then on to B, then, when they meet
Bob’s odometer reads a lot more than Alice’s. This despite the fact
that during most of his journey, Bob had driven just as straight
as Alice did! The one brief period at C when Bob makes a turn
is ultimately responsible for the length of Bob’s path being a lot
longer than Alice’s! Similarly, the one brief period of acceleration
during Bob’s intergalactic round trip is what makes Alice’s clock
read a lot more than Bob’s. The only reason why people never bat
an eyelid in the former case, while they are all full of wonder at
the latter is that our intuitions, which are honed by what we see
around us in a world of slowly moving objects, are not prepared to
grasp the concept that time can be different for different people!
One argument that many people come up with at this stage is
that from Bob’s point of view it is Alice that accelerates, so it is
equally correct to say that Bob is inertial and Alice is not - and
hence conclude that Bob is correct about their respective ages and
not Alice. This argument falls flat simply because the issue of
which observer is inertial and which is not is not a relative
matter! Remember, Newton’s first law asserts that there exists
at least one inertial observer for whom all force free objects are also
acceleration free. Anyone moving uniformly with respect to this inertial
observer is inertial, anyone who accelerates with respect to
it is non-inertial! Moreover, although an observer can not make
out whether she is moving or not without looking out when in uniform
motion, she can easily figure out whether she is accelerating
with respect to an inertial observer. In the brief moment of reversal
during Bob’s journey, Bob’s accelerometer must have registered a
large reading, while Alice felt nothing at all! So, Alice is the one
who is inertial - and there is no ambiguity about this whatsoever!
Although we have disposed of the paradox simply by denying
its existence, it is natural to feel a bit uncomfortable about this.
It is true that the clocks of both friends run slow compared to the
other one for most of the time, so it may be somewhat disconcerting
that the final outcome is so asymmetrical. In order to make this a
bit easier to digest, let me show you another way of understanding
what’s happening during the trip.

[Figure 3.2, two spacetime diagrams with axes Position (in lightyears) and Time (in years): (a) Alice’s signals to Bob, (b) Bob’s signals to Alice]

Figure 3.2: The twin paradox

For the sake of convenience, I will assume that Bob does not
move as fast as in our example above, but with only a speed of
0.6 c. This gives us a time dilation factor γ of 1.25, so that a journey
that takes only 8 years according to Bob’s clock takes
10 years according to Alice. Before Bob sets out on his journey,
the two friends decide on a strategy by which they can keep track
of each other’s ages. They both agree that on their birthdays each
year they will send out a light flash towards the other one. So, all
they have to do is keep count of the flashes they receive, to know
how old the other one is. These flashes are shown in the spacetime
diagrams of figure 3.2.
As you must have figured out, the Doppler factor will play a major
role in helping us keep track of when each friend receives the
other’s signals. With a value of β = 0.6, Bob’s K factor with respect
to Alice for the outward journey is 2, and that for the inward journey
is 1/2. Of course, the respective factors for Alice with respect to
Bob are 1/2 and 2.
Let us consider the flashes that Alice sends out first. Bob’s
Doppler factor of 2 ensures that her annual signals reach Bob at
gaps of two years according to his clock. So, during the outward
journey which lasts for four years according to Bob, Bob receives
only two of Alice’s signals. At the moment he receives Alice’s second
signal, Bob fires his rockets and turns back home. Now his Doppler
factor is 1/2, so that Alice’s next signals reach him at gaps of six
months. For the next four years of his journey, then, Bob receives
eight signals from Alice, the final one getting to him just as he
reaches the earth again. So, Alice sends out a total of ten signals in
all - Bob receives all ten, and they both agree that Alice has aged
by 10 years during the round trip.
Let’s take a look now at the signals that Bob sends Alice. As
long as Bob is moving away from Alice to the right, the Doppler
factor of Alice with respect to Bob is 1/2. Remember, though, that
Bob’s light signals sent to Alice are moving to the left, and this
means that Alice will see them at a gap that is twice as big as
the one that Bob sends them out at. Thus, Bob sends out a total
of four signals during his outward journey, and the fourth of these
reaches Alice after eight years (according to her clock). Once Bob
turns back, Alice’s K factor with respect to Bob becomes 2 - but
this means that she is going to receive Bob’s yearly signals at gaps
of six months! Bob sends out four signals while coming back. Alice
receives all of them crammed up in the last two years of her ten.
Thus, Bob sends out a total of eight signals, Alice receives all eight!
Everything checks out, and they both agree that Alice is ten years
older than when the journey started, while Bob has aged by only
eight!
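
The whole bookkeeping of birthday flashes can be reproduced with a few lines of Python. This is a sketch under the assumptions of the example: β = 3/5, γ = 5/4, and an instantaneous turnaround at Alice’s time of 5 years. Everything is worked out in Alice’s frame with exact fractions, and the function names are my own.

```python
from fractions import Fraction

BETA = Fraction(3, 5)    # Bob's speed in units of c
GAMMA = Fraction(5, 4)   # time dilation factor for beta = 3/5
T_TURN = Fraction(5)     # Alice's time (years) when Bob turns around
T_END = Fraction(10)     # Alice's time (years) when Bob gets home


def bob_position(t):
    """Bob's distance from Alice (light-years) at Alice's time t (years)."""
    return BETA * t if t <= T_TURN else BETA * (T_END - t)


def bob_receives(n):
    """Bob's clock reading (years) when Alice's n-th yearly flash reaches him."""
    emit = Fraction(n)
    # Outward leg: the flash meets Bob when emit + BETA * t = t.
    t = emit / (1 - BETA)
    if t > T_TURN:
        # Inward leg: the flash meets Bob when emit + BETA * (T_END - t) = t.
        t = (emit + BETA * T_END) / (1 + BETA)
    # Bob's speed is the same on both legs, so his clock reads t / GAMMA.
    return t / GAMMA


def alice_receives(m):
    """Alice's clock reading (years) when Bob's m-th yearly flash reaches her."""
    t_emit = GAMMA * m   # Alice's time at which Bob's clock reads m years
    return t_emit + bob_position(t_emit)


if __name__ == "__main__":
    print("Bob receives Alice's flashes at:", [str(bob_receives(n)) for n in range(1, 11)])
    print("Alice receives Bob's flashes at:", [str(alice_receives(m)) for m in range(1, 9)])
```

Bob’s list shows two flashes at gaps of two years followed by eight at gaps of six months. Alice’s list shows four flashes at gaps of two years followed by four crammed into her last two years, just as the K factors predict.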

3.1.6 The writing on the clocks


I have been careful so far to ensure that our observers use only one
clock, the one that he or she carries, to time various events. This
way, of course, each observer can directly time only the events
that occur at his or her origin. For all other events they have to
rely on the radar rule, (2.2-a). This, of course, is not the only
approach that one could take to time measurements - many others
are possible. One way, that Einstein himself was fond of using is
to assume that each observer carries with him or her an infinite
number of fixed clocks, one at each point in space. Thus, if Alice
was equipped with such an infinite number of clocks, all she would
have to do to locate a particular event in spacetime is to look up
the reading in her clock that is next to where the event occurs for
the time, while the fixed coordinates of that particular clock tells
her where that particular event occurs.
If such a way of measuring space and time is adopted, we will
have to change our way of defining proper time measurement slightly.
In this new way of doing things, proper time is, once again, any
time measurement done using a single clock, while any other ob-
server who must use two clocks to measure the time interval be-
tween two events (because the two events occur at different places
with respect to him) does not measure the proper time interval.
Time dilation leads to a puzzle similar to, but not quite the same
as, the twin paradox for two observers equipped with this infinite
set of personal clocks. Just imagine that Bob, who has his own
private array of fixed clocks, is rushing past Alice and her clocks
at the speed of 0.6 c. When the two friends are just abreast, their
clocks both read 0. After a while Bob glances at his clock and finds
that it is reading 8 seconds. He also notices that Alice’s clock right
next to him is now reading 10 seconds. To Bob, the interval that
has elapsed is eight seconds long, whereas to Alice its length is 10
seconds. Of course, Bob uses the same clock for both the readings
of 8 seconds and 0 seconds - so his reading is the proper time
duration of this interval. Alice, on the other hand, must use two of
her clocks to time this interval - the first one at the origin, and the
second at a distance of 0.6 c × 10 s = 6 light-seconds away from her.
You should check that the two friends’ statements about the length
of the time interval are consistent with our time dilation equation,
(3.1).
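
The suggested check takes only a couple of lines. The sketch below assumes nothing beyond the numbers in the text (a speed of 0.6 c and 8 seconds on Bob's clock); the variable names are illustrative.

```python
import math

beta = 0.6                                # Bob's speed as a fraction of c
gamma = 1 / math.sqrt(1 - beta**2)        # dilation factor for this speed

proper_time = 8.0                         # seconds, read off Bob's single clock
alice_time = gamma * proper_time          # interval between Alice's two clocks
alice_distance = beta * alice_time        # position of her second clock, in light-seconds

print(round(gamma, 6), round(alice_time, 6), round(alice_distance, 6))
```

With these inputs the dilated interval comes out at 10 seconds and the second clock sits 6 light-seconds from Alice, as stated above.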
So, where is the puzzle? Note that both friends must agree that
at the first event (Bob crossing Alice) their clocks were reading 0 s.
They must also both agree that when Bob next looked at his clock
(the second event), their adjacent clocks were reading 10 s and 8
s, respectively. All this fits our results beautifully. Hang on! Just
try to figure things out from Bob’s point of view. To him, it is Alice
and her clocks that are moving away, with a speed of 0.6 c to the
left. Hence, he should be seeing Alice’s clocks running slow, by
the same γ factor of 1.25! So, shouldn’t Alice’s clock be showing
8 s × 1/1.25 = 6.4 s, when his clock is showing 8 s?
In other words, the puzzle is that while relativity insists that
both of the friends will see the other one’s clocks running slow, the
writing on the clocks seems to insist that it is Bob’s clocks that have
slowed down - not Alice’s! Note that in this case both friends are
equally inertial - so that we do not have the escape route that we
used for the twin paradox. To see how to resolve this issue you will
have to wait a while until section 3.3 - where we will meet a con-
sequence of relativity that runs even more counter to our intuition
than time dilation!

3.1.7 The case of the immortal muon


In case the paradoxes in the last couple of sections have shaken
your resolve, cheer up! For I will be talking in this section of ar-
guably the most dramatic confirmation of time dilation. This is
provided by the mystery of the undying muon. No, I am not referring
to some supernatural mystery, but to a natural phenomenon
that physicists see quite regularly in their laboratories. Without
time dilation to help us out, though, this would have been just as
difficult to explain as the ghost stories that we don’t believe in (I
hope) but definitely enjoy!
The muon is an elementary particle - one of the many subnu-
clear constituents of matter. One of the places where the muon
is produced is in the upper atmosphere, about 5-6 kms above the
surface in collision processes involving cosmic rays. These muons
reach the surface on a regular basis - to be detected by particle
detectors that we have lined up for the purpose.
By itself, this would not have been a big surprise. What makes
the arrival of the muons a mystery is that they are unstable, de-
caying into other particles. How long do they live? What helps us
here is that muons can be produced in the laboratory by nuclear
processes. The half life of such lab bred muons can be measured,
and turns out to be about 2 × 10⁻⁶ s. In this small a time, even
light can travel only 600 meters! How, then, can the muons produced
five-six kilometers above the surface survive till they reach
the earth’s surface?
The answer lies, of course, in the fact that the muon decays by
its own clock, one that is slowed down by quite a big γ factor com-
pared to the ones that we carry. Put in a more technical language,
in the inertial frame in which a muon is at rest, it does decay with
the half life of 2 × 10⁻⁶ s. This is the so-called proper half-life of
the muon. However, the muons produced in the upper atmosphere
are travelling so fast (at nearly the speed of light) relative to us that
their clock runs slow by a factor of around γ ∼ 10, allowing most of
them to get to the surface undecayed.
So that’s it - if the muon can make it to the ground from 5-6
kms above, it is all because of the fact that the clock that tells it to
decay runs slowly compared to our earthbound clocks. Before you
start to celebrate the solution of this case of the undying muon,
though, let me just show you the whole life of the muon from the
muon’s point of view - or rather, from an inertial frame in which
the muon is at rest. The muon, in its own frame, lives for 2 × 10⁻⁶ s.
From the muon’s point of view, it is the earth that rushes at it.
Does the earth move 5-6 kms or more in this small a time? If it
did, it is obvious that the earth would have to move at many times
the speed of light!
So even though the fact that the cosmic ray muons reach the
surface can be explained using time dilation from the earthbound
observer’s reference frame, it still remains a mystery from the muon’s
frame. The only way that you can solve this is if what we earth-
lings measure as 5-6 kms, is only a few hundred meters long for
the muon! This brings us to the topic of the next section - the fact
that moving rods get shortened!
This illustrates a very important theme in relativity theory - two
different observers may differ in their explanations about physical
events, but will have to agree on what happens! That the muon
reaches the ground is something both the earth based observer
and the fellow riding in with the muon will have to agree about
(although the latter will state it as the ground reaching the muon!).
However, though the earth based observer will attribute the muon’s
amazing longevity to time dilation, the muon rider will explain the
facts by invoking the concept of length contraction.

3.2 Length contraction


The case of the muon that refuses to die, suggests, when you are
trying to explain this from the muon’s point of view, that there
must be something going on with moving lengths. Indeed, this is
a problem that you may have noticed even earlier. In subsection
3.1.6 we saw that after Bob has travelled for 10 seconds with re-
spect to Alice at a speed of 0.6 c, their respective clocks read 8 s
and 10 s, respectively. How far apart are they from each other?
As we have already noted, Alice reckons that Bob is 0.6 c × 10 s =
6 light-seconds away from her. What about Bob? To him, Alice is
moving away with the same speed in the other direction, but she
has only been moving for 8 s since crossing him! Thus, Bob must
insist that at this moment she is 0.6 c × 8 s = 4.8 light-seconds away
only! How can the two friends disagree about how far they are from
each other at the same instant? Once again, we see the hint that
like time measurements, in STR there is something subtle about
space measurements, too!
3.2.1 A moving rod gets shortened


At the end of the last section we have seen that moving rods
must contract if we are to apply physics equally in all inertial
frames. To see that this really occurs, let us go back to the radar
method. This time, as Bob rushes away from Alice, he carries a rod
with him, directed along the relative motion. The length of the rod,
according to Bob, is l0 . Anticipating the fact that the length may
be different to other observers, we call the length in Bob’s frame
(where the rod is at rest) the proper length of the rod. The question
that I am going to address now is: just how long will Alice claim
this rod to be?
To measure the length of the rod Alice must determine the position
of both ends of the rod simultaneously. Let me stress that this demand
of simultaneity has nothing to do with Einstein, even good old classical
physics will tell you that you can’t claim the difference between the
coordinates of two ends of a moving object to be its length - unless the
two measurements are made at the same instant, the object’s motion will
make your length measurement go all wrong. The only case where you don’t
have to bother to read the position of both ends together is when the
object is not moving with respect to you. For Bob, then, it does not
matter when the two ends are measured, he will always find that
the difference in coordinates is l0!

[Figure 3.3: Length contraction. Spacetime diagram of Alice’s two flashes, with the times t1, t2, Kt1, Kt2, Kt1 + 2l0/c, K²t2 and K(Kt1 + 2l0/c) marked.]

To carry out her measurement, Alice sends out two beams from
her flashlight, as shown in figure 3.3. The first one, sent out at
time t1, bounces off the far end of Bob’s rod and returns to her. The
second flash, sent out at time t2 according to Alice’s clock, bounces
off the near end of the rod (which is at Bob’s location) at a time Kt2
(as read by Bob’s clock) to reach Alice at a time K²t2 (according to
her clock).
We have to be a bit more careful to figure out the timings
on the first beam. It reaches Bob, of course, at a time Kt1 according
to Bob’s clock. Then it moves on to bounce off the other end of
the rod. Since the speed of the beam is c with respect to Bob, too,
the time it takes to come back to him must be 2l0/c. Thus, on its
way back, the beam meets Bob at the time Kt1 + 2l0/c, once again,
as read by Bob’s clock. This beam reaches Alice at a time (read by
her) that is larger by the factor K again, at K(Kt1 + 2l0/c). Figure 3.3
shows the times that Bob measures in blue, while those that Alice
measures are shown in purple.
The groundwork is now done - all that is left is to use our basic
equations (2.2-a) and (2.2-b) to figure out the difference in time and
position of the two bounces as seen by Alice. It is easy to see that
the time gap between the two bounces is
   
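$$
\begin{aligned}
\Delta t &= \frac{1}{2}\left[K\left(Kt_1 + \frac{2l_0}{c}\right) + t_1\right] - \frac{1}{2}\left(K^2 + 1\right)t_2 \\
&= \frac{1}{2}\left(K^2 + 1\right)(t_1 - t_2) + \frac{Kl_0}{c}
\end{aligned}
\tag{3.6}
$$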

while the gap in position is


   
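$$
\begin{aligned}
\Delta x &= \frac{c}{2}\left[K\left(Kt_1 + \frac{2l_0}{c}\right) - t_1\right] - \frac{c}{2}\left(K^2 - 1\right)t_2 \\
&= \frac{c}{2}\left(K^2 - 1\right)(t_1 - t_2) + Kl_0
\end{aligned}
\tag{3.7}
$$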

Now, ∆x above will be the length of the rod as measured by Alice, l,
only if the two bounces occurred simultaneously, according to her.
Thus, we must have ∆t = 0. According to (3.6) this means that

$$t_1 - t_2 = -\frac{2K}{K^2 + 1}\,\frac{l_0}{c} \tag{3.8}$$

This means that l is given by

$$l = Kl_0 - \frac{K^2 - 1}{K^2 + 1}\,Kl_0 = \frac{2K}{K^2 + 1}\,l_0$$

The factor 2K/(K² + 1) is one we have met before, in (3.2). So, we can
rewrite the formula for l as

$$l = \frac{1}{\gamma}\,l_0 = l_0\sqrt{1 - \beta^2} \tag{3.9}$$
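
The radar bookkeeping behind (3.9) can be checked with exact arithmetic. The sketch below assumes illustrative values (K = 2, which corresponds to β = 0.6, a rod of proper length 5 light-seconds and units in which c = 1); none of these numbers come from the text.

```python
from fractions import Fraction

c = 1                     # units in which c = 1 (lengths in light-seconds)
K = Fraction(2)           # K factor for beta = 0.6
l0 = Fraction(5)          # proper length of Bob's rod (illustrative)
t2 = Fraction(3)          # emission time of the near-end flash (illustrative)

# Condition (3.8): choose t1 so that the two bounces are simultaneous for Alice
t1 = t2 - (2 * K / (K**2 + 1)) * l0 / c

# Reception times on Alice's clock, as worked out in the text
far_received = K * (K * t1 + 2 * l0 / c)
near_received = K**2 * t2

# Radar rule: time = (sent + received)/2, position = c (received - sent)/2
dt = (far_received + t1) / 2 - (near_received + t2) / 2
dx = c * (far_received - t1) / 2 - c * (near_received - t2) / 2

beta = (K**2 - 1) / (K**2 + 1)
print(dt, dx, beta)
```

For these inputs the time gap vanishes and the position gap is 4/5 of the proper length, which is the square root of 1 − β² for β = 3/5.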

A word about the name of this effect. Einstein’s theory shows
us that this contraction of the length of a moving object is a direct
consequence of the way it forces us to rethink issues of measuring
length and time. However, before Einstein, Fitzgerald and Lorentz
had come up with the notion that moving objects contract in the
direction of motion, essentially so that they could explain the null
effect of the Michelson-Morley experiment. Their prediction was
more of a mathematical device based on the ether theory
than what we think of as length contraction today, but in their
honor this relativistic effect is called the Lorentz-Fitzgerald length
contraction.
The same factor comes into play in the length contraction of
moving rods as in time dilation. Think about it a bit and you will
realize that that’s exactly what we need to resolve the difference
between the two friends about their distances that we talked about
at the beginning of this section. At a time when Bob is 6 light-seconds
away from Alice, according to Alice - Alice is only 4.8 light-seconds
away according to Bob. There is nothing wrong with this
- after all, Alice reckons Bob’s distance by noting the coordinates
of her clock that is just next to Bob at that instant - this space
interval is fixed as far as Alice is concerned, but it is moving from
Bob’s point of view. So, Bob sees it shortened by a factor of 1/1.25 = 0.8
- which is why his reckoning of the distance is 4.8 light-seconds!
Length contraction, again, is just what the doctor ordered for
the case of the undying muon! From the muon’s point of view, it
decays in its proper lifetime, during which the ground rushes towards
it to cover the distance from the upper atmosphere to the earth’s
surface. This is a distance that we on the earth measure to be
about 5-6 kms, but to the muon it is shortened to about 500-600
meters - just the sort of distance the earth is expected to cover
before the muon decays!
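
The two accounts of the muon's survival can be put side by side numerically. This is a rough sketch using the round figures quoted in the text (a proper lifetime of 2 × 10⁻⁶ s, γ of about 10, a height of 6 km) and treating the muon's speed as essentially c; the constant names are illustrative.

```python
import math

C = 3.0e8                    # speed of light in m/s (rounded)
PROPER_LIFE = 2.0e-6         # s, the laboratory value quoted in the text
GAMMA = 10.0                 # rough dilation factor quoted in the text
HEIGHT = 6000.0              # m, production height in the earth frame

# Earth frame: the muon's clock runs slow, so it lasts GAMMA times longer.
range_without_dilation = C * PROPER_LIFE            # about 600 m
range_with_dilation = C * GAMMA * PROPER_LIFE       # about 6 km

# Muon frame: the lifetime is the proper one, but the atmosphere is contracted.
contracted_height = HEIGHT / GAMMA                  # about 600 m
ground_travel = C * PROPER_LIFE                     # distance the ground can cover

print(range_without_dilation, range_with_dilation, contracted_height, ground_travel)
```

Either way the comparison is the same: about 6 km against 6 km in the earth frame, or about 600 m against 600 m in the muon frame.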
Of course, Alice does not have to measure the two endpoints
of Bob’s rod simultaneously to get its length correct. If there is a time gap of ∆t
between the two measurements, all she has to do is correct for the
distance v∆t that the rod would have moved in the interim. Thus,
another way to figure out the length l would be

l = ∆x − v∆t

I leave it as an exercise to you to show that if you use (3.6) and
(3.7) in this equation, t1 and t2 will cancel out, leaving exactly (3.9)
behind.
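
Before doing the algebra, the claim can be spot-checked numerically. The sketch below assumes K = 2 (so that v = 0.6 in units with c = 1) and an illustrative proper length of 5; it plugs several unrelated pairs of emission times into (3.6) and (3.7) and collects the resulting values of l.

```python
from fractions import Fraction

K = Fraction(2)                          # K factor for beta = 0.6, with c = 1
l0 = Fraction(5)                         # proper length of the rod (illustrative)
v = (K**2 - 1) / (K**2 + 1)              # rod's speed according to Alice


def corrected_length(t1, t2):
    """l = dx - v*dt, with dt and dx taken from (3.6) and (3.7), c = 1."""
    dt = (K**2 + 1) * (t1 - t2) / 2 + K * l0
    dx = (K**2 - 1) * (t1 - t2) / 2 + K * l0
    return dx - v * dt


# Very different emission times should all give the same length.
lengths = {corrected_length(Fraction(a), Fraction(b))
           for a, b in [(0, 0), (1, 7), (-3, 2), (10, -4)]}
print(lengths)
```

The set collapses to a single value, 4/5 of the proper length, which is what (3.9) gives for β = 0.6. This is a numerical check only; the exercise of showing the cancellation in general still stands.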
By now you should have learned enough relativity to expect that
just as Alice says that the rod carried by Bob is shortened, so would
Bob say that any rod that is fixed with respect to Alice is shorter
than what Alice measures it to be, and shortened by exactly the
same factor to boot!
Note that in the last statement I was being very, very careful.
I said ‘says’ - not ‘sees’. To see an object, light must travel from
that object to one’s eyes. What you see is the result of light falling
on your retina at a given instant. Light travelling from different
parts of an object actually travels for different lengths of time before
reaching your eye - so it must have started out at different
times. Of course, this does not bother us in our daily lives - the
times involved are just too small to make a difference. However, for
rapidly moving objects, this small time difference could result in
large shifts - leaving quite a different image of the object! The visual
appearance of rapidly moving objects is quite an involved topic
- even the great George Gamow got it wrong in the first edition of
his wonderful “Mr. Tompkins in Wonderland”! I will say a little bit
on this topic a while later.

3.2.2 What about the other directions?

So far we have been sticking to one spatial dimension. This is as
good a point as any in which to start worrying about the other di-
rections as well. We have seen that a moving rod gets shortened,
but that proof was only for a rod that is aligned along the direc-
tion of motion. Indeed, the more perceptive among you may be
beginning to feel that there was something wrong with my photon
clock derivation of time dilation. What I had assumed there is that
the vertical distance between the two mirrors in figure 3.1 stays
the same for both Alice and Bob at l0. At that stage, we had not
heard of length contraction, so that we had no reason to be wor-
ried about this assumption! Now that we have learned that moving
objects contract, it is only natural to be quite suspicious of this
assumption.
Remember, though, that before talking about the photon clock I
had already proved the time dilation equation (3.1) by using the K
calculus. So, there is no reason to doubt the validity of this equation.
Indeed, we may even turn the argument with the photon clock
over on its head and use it to prove that the distance between the
two mirrors must remain l0 for Alice as well as Bob - from the fact
that this is the only way we will get the time dilation equation right
for this clock! Thus, instead of providing an alternative derivation
of the time dilation equation, the photon clock can be regarded as a
way of proving the fact that lengths perpendicular to the direction
of relative motion of two observers remain unchanged.
Thus, relative motion between two observers shortens lengths in
the direction of motion but leaves transverse lengths the same. We
can leave the matter here - but I cannot resist showing you another
delightful little proof of the invariance of lengths perpendicular to
the motion. Just imagine that there is a vertical wall standing next
to the path that Bob takes as he rushes away from Alice. Both
Alice and Bob have brushes dipped in color, Alice’s in her favourite
pink, while Bob’s is blue. Each friend draws a line at a height of
1 m above the ground on the wall. The question now is - whose
mark will be higher up? Now, the fact that there is no difference
between left and right is enough to ensure that neither Bob’s nor
Alice’s marks can be higher up than the other’s! This leaves us
with the only conclusion possible - both friends measure the same
vertical distance!

3.2.3 Length contraction via the photon clock


In the last subsection I showed you that the photon clock could be
used as a means of proving that lengths perpendicular to the mo-
tion are unaffected by it. There, of course, we had placed the clock
in such a way that the line joining the mirrors is perpendicular to
the motion. It is natural to ask now - “what will happen if we turn
the clock around so that this line is now parallel to the motion?” Of
course, if we use this setup with proper care, this should give us
an alternative derivation of the length contraction equation, (3.9).

[Figure 3.4: Length contraction from the photon clock. (a) Bob’s view, with the mirrors a distance l0 apart; (b) Alice’s view, with the mirror positions A1, A2, B1, B2, C1, C2 and the distances l, ct1 and vt1 marked.]

As far as Bob is concerned, the clock is stationary - so all that
light does is travel a distance of l0 on both halves of the journey,
taking a time of 2l0/c. What does Alice see? Figure 3.4b shows the
whole process of light travelling from one mirror to the other one
and back from Alice’s point of view. I have indicated the initial
position of the two mirrors, when the light beam starts out from
the mirror on the left, as A1 and A2 , respectively. By the time light
reaches the right hand mirror, the two mirrors have shifted to B1
and B2 , respectively. Finally, when the light beam returns to the
left hand mirror, their respective positions are C1 and C2. The gaps
A1 A2 , B1 B2 and C1 C2 are each equal, of course, to the length of the
photon clock as measured by Alice. Prompted by twenty-twenty
hindsight, I will call this length l and not assume beforehand that
it is the same length l0 that Bob sees.
The distances travelled by light and the mirrors are indicated in
figure 3.4b, where t1 is the time, according to Alice, that light takes
to travel from the left hand mirror at A1 to reach the right-hand one
at B2 . From the figure it is obvious that

ct1 = l + vt1

which leads immediately to

$$t_1 = \frac{l}{c - v}.$$

In the same way, it is obvious that the time t2 that the light beam
takes to get back to the left hand mirror, now at C1 is given by

$$t_2 = \frac{l}{c + v}.$$

So, Alice reckons that the round trip takes light a total time of

$$t = \frac{l}{c - v} + \frac{l}{c + v} = \frac{2l}{c} \times \frac{1}{1 - \frac{v^2}{c^2}} = \gamma^2\,\frac{2l}{c}. \tag{3.10}$$

So, now we know that Bob reckons that the time taken for the
round trip by light is 2l0/c, while Alice reckons that it is γ² × 2l/c. Should
these two times be equal? Although in pre-relativity days the answer
would have been an unequivocal “Yes!”, today we know better!
After all, as far as Bob is concerned, the round trip taken by light
begins and ends at the same place, namely his origin. So, the time
that Bob sees is the proper time interval that elapsed during the
trip. So, Alice must see a dilated time, given by γ × 2l0/c - but we
already know that this is γ² × 2l/c. This leads immediately to

$$l = l_0 \times \frac{1}{\gamma} = l_0\sqrt{1 - \beta^2}$$

- the length contraction formula!
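
The chain of reasoning above can be traced with numbers. The following sketch assumes illustrative values (β = 0.6, a proper mirror separation of 3 in units where c = 1); it first fixes the round-trip time Alice must measure from time dilation, then solves (3.10) for the separation l she must assign to the mirrors.

```python
import math

c = 1.0                      # work in units where c = 1
beta = 0.6                   # illustrative speed of Bob's clock
v = beta * c
gamma = 1 / math.sqrt(1 - beta**2)
l0 = 3.0                     # proper mirror separation (illustrative)

bob_round_trip = 2 * l0 / c                # proper time for the round trip
alice_round_trip = gamma * bob_round_trip  # dilated time Alice must measure

# Alice's own bookkeeping, (3.10): t = gamma**2 * 2l/c, solved for l
l = alice_round_trip * c / (2 * gamma**2)

# Cross-check by adding up the two halves of the journey directly
t_out = l / (c - v)
t_back = l / (c + v)
print(l, l0 / gamma, t_out + t_back, alice_round_trip)
```

The separation found this way equals l0/γ, and the two halves of the journey add up to the dilated round-trip time.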

Although we have already achieved our target, it is instructive
to look back at the way in which we calculated
Alice’s time measurements above. Had this
been pre-relativity physics - Alice could have calculated
t1 in this way : “I see light travel to the right at a speed
of c, so Bob sees it travel at a rate of c − v. The mirrors
are fixed as far as Bob is concerned, so to Bob, light just
covers the distance between the mirrors. Since Bob sees
light cover the distance l (the distance between the mirrors being
the same as the one I measure for him), it must take light
a time of l/(c − v)!” Wrong reasoning, as we know today - but
the right answer all the same!
Why is the above reasoning wrong? Well, as we all know,
Bob does not see light travel at c − v, he sees it travel at c!
Moreover, the distance between the mirrors that Bob sees
is not l - but a larger length (the proper length)! Perhaps
most importantly, even if this had given me the time that
Bob sees the journey from one mirror to another take (it
does not!), the result would not have been the same as
the time that Alice sees!
Now that we understand why the reasoning above is wrong,
let us turn to the question - “why, then, is the answer
right?” The answer to this question may surprise some
of you - it is right simply because it has no other option!
After all, why did Alice claim in her pre-STR days that
Bob would see the speed of light to be c − v? Answer:
so that the correct calculation of time (which gave l/(c − v))
matches for both the friends! Put another way, at a time
t Alice sees the lightbeam travel to a point ct away from
her, while Bob is vt away from her at that instant. Alice
reckons that the distance between Bob and the light
spot is ct − vt = (c − v) t. Dividing this by t gives Alice a
speed c − v, which is what Alice would have considered
the speed of the light spot relative to Bob, had she not
known that Bob’s length and time measurements do not
match hers.
This raises another question - exactly why does Bob see
the light spot recede from him at speed c? I will give you
the answer to this one a bit later, in section 3.4.

3.2.4 The length contraction paradox


If the heading of this subsection sounds too formal to you, feel free
to substitute the name under which it is better known - the “car
and garage paradox”.
Alice has bought a new car that is 10 m long. Her trouble is, her
garage, which accommodated her earlier, smaller car is only 8 m
long. Having studied the relativistic length contraction, Alice hits
upon a nice solution. She asks her friend Bob to drive the car into
the garage at a speed of 0.6 c. At this speed, the length contraction
factor γ⁻¹ will be √(1 − (0.6)²) = 0.8 - precisely enough to make the
car 8 m long according to Alice! What Alice can now do is slam the
door of the garage shut at the same instant that the front end of
the car hits the back wall - and so the 10 m car fits into the 8 m
garage (I am staying deliberately silent about what happens after
that)!
This scenario may seem completely absurd - but that is just
because the slow speeds of our daily lives do not prepare us for
such things that become apparent only for very fast objects! That
is not where the paradox is. The paradox becomes apparent when
you look at this from Bob’s point of view. To Bob, the car is at
its rest length of 10 m, it is the garage that is now moving at 0.6 c,
making it even shorter - 6.4 m to be precise! So, how on earth can
the car fit into the garage, when you see this from Bob’s point of
view?
To see how this paradox can be resolved, let’s examine just what
it means when Alice says that the car fits in her garage. Of course,
she means that the front end of her car hits the back wall of the
garage and the back end crosses the front gate simultaneously.
There’s our clue! As I have been stressing over and over again, the
major new thing in STR is the fact that time measurements differ
from observer to observer. So Bob does agree with Alice that the
front end of the car hits the back wall and the rear end crosses the
front gate - it’s only that he refuses to accept that the two events
occur at the same time! You will have to wait until the next section,
though, to see that this disagreement about which events occur
together and which do not is just enough to explain the difference
in the two friends’ points of view.

3.3 The relativity of simultaneity


The fact that amazes people most about STR, apart from things like
the twin paradox, is perhaps the fact that two events that appear
to occur at the same time (but at different places) to Alice will not
appear to be at the same time to Bob (Remember my story about
the waiter and the diners in section 2.1?)! Of course our very first
puzzle of me and my friend on the bike should have primed you
towards accepting the fact that times may differ from observer to
observer - so this may not come as that big a shock to you, after
all.
Just how big is this difference in time measurements? You will
hardly have to work at all to find this out, since I did almost all the
work in the last section! Remember, Alice had to measure the two
ends of Bob’s rod simultaneously in order to measure its length
l. Do cast your mind back to how Alice measures the length -
also, refer to figure 3.3. According to Bob, the two bounces from
the two ends of the rod happened at times of Kt1 + l0/c and Kt2,
respectively. So, the time difference between these two bounces
that Bob measures is K(t1 − t2) + l0/c. Now, as we have seen in (3.8),
the two bounces will be simultaneous to Alice if t1 − t2 = −(2K/(K² + 1)) l0/c. In
this case, the time interval that Bob will see between them becomes

$$\frac{l_0}{c} - \frac{2K^2}{K^2 + 1}\,\frac{l_0}{c} = -\frac{K^2 - 1}{K^2 + 1}\,\frac{l_0}{c} = -\beta\,\frac{l_0}{c}$$

which is a measure of just how relative (i.e. observer dependent)
the concept of simultaneity is. The minus sign shows that to Bob,
the bounce at the far end of the rod occurred earlier than the one at his origin. Before
we wrap the formula up, let me point out that the variables that are
in use there are slightly jumbled up - in particular note that l0 is
the spatial distance between the two bounces as seen by Bob, while
it is Alice who sees Bob’s velocity to be β (since the result involves
the first power of the velocity, it does matter whether we are talking
of Bob’s speed with respect to Alice, or Alice’s speed with respect to
Bob!) and more importantly, it is Alice who sees the two bounces
to be simultaneous! So, perhaps it would be better to use l, the
distance between the events as measured by Alice, instead of l0 to
write the time difference as measured by Bob (using l0 = γl from (3.9)) in the form

$$\Delta t_{\mathrm{Bob}} = -\beta\gamma\,\frac{l}{c} = \frac{-\frac{v}{c^2}\,l}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{3.11}$$

To wrap up, if Alice sees two events that occur a distance l apart to be
simultaneous, then Bob, who moves at a speed v with respect to
her to the right, will see a gap of βγ l/c between the two events, with
the one that is to the right occurring earlier. It is easy to see that
had Bob moved to the left, the event that occurs more to the left
would appear earlier to Bob.
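
The size of this gap is just what the car and garage of subsection 3.2.4 need. As a sketch, take the numbers from that subsection (a 10 m car, an 8 m garage, a speed of 0.6 c) and work in units with c = 1, so that times are expressed as metres divided by c. The variable names are illustrative.

```python
from fractions import Fraction

beta = Fraction(3, 5)                 # 0.6 c, in units with c = 1
inv_gamma = Fraction(4, 5)            # sqrt(1 - beta**2) for beta = 3/5
gamma = 1 / inv_gamma

car_rest, garage_rest = Fraction(10), Fraction(8)    # rest lengths in metres

# Alice's frame: "front hits back wall" and "rear passes gate" are
# simultaneous and a distance garage_rest apart, so (3.11) gives Bob's gap.
gap_from_formula = beta * gamma * garage_rest        # in metres / c

# Bob's frame: the garage is contracted and rushes at the car at speed beta.
garage_for_bob = garage_rest * inv_gamma             # 6.4 m
# Once the back wall meets the car's front, the gate still has to cover
# the part of the car that is sticking out before it reaches the rear end.
gap_from_bob = (car_rest - garage_for_bob) / beta

print(gap_from_formula, gap_from_bob)
```

Both routes give a gap of 6 m/c (about 2 × 10⁻⁸ s): the time lag that Bob assigns to the two events is exactly the time the 6.4 m garage needs to sweep past the extra 3.6 m of car.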

3.3.1 Einstein’s train


Like in the case of time dilation and length contraction, being able
to prove equation (3.11) in at least one more way will add to our
understanding of exactly what is going on. Let me now turn to
a thought experiment aimed at this which comes straight from the
horse’s mouth - Einstein’s famous relativistic train example. Let us
imagine that Alice is standing on the side of a railway track, while
Bob is rushing by her in a very, very long train that is moving very,
very fast. Just at the moment when Bob is abreast with Alice, two
flashes of lightning strike at equal distances to their left and right
- leaving marks on both the track and the train carriage. Hold on -
what does “just at the moment” mean? Since we know that events
that are simultaneous to one are not simultaneous to others, such
statements must be qualified with the identity of the observer who
is seeing the lightning strikes together. Since lengths change for
moving observers, too, you may doubt whether the statement “at
equal distances to their left and right” needs to be qualified too.
Well, it is true that Bob will see a different distance between the
marks of the lightning strike and himself, than Alice does - that’s
length contraction for you! However, since Alice sees the strikes to
be at equal distances from her on both sides, then so should Bob
- length contraction does not distinguish between left and right.
Let me assume that these two lightning strikes are simultaneous
with respect to Alice. Then, since Alice is sitting midway between
the two flashes, light from the two of them, travelling with equal
speeds c, will reach her at the same instant. Of course, Bob, who
has been moving all the time to the right, meets the flash coming
in from the right hand lightning strike before it reaches Alice, while
light from the other strike does not reach him until after it has
reached Alice and crossed her. Before the days of Einstein, both
friends would have passed this over by saying that light from the
right hand strike is moving at a speed of c + v with respect to
Bob, while that from the left hand strike is moving at a speed of
only c − v, so it is no wonder that even though the two flashes
occur together (remember - no one had an inkling that simultaneity
is relative before Einstein), Bob sees the right hand strike earlier.
However, once we know that light travels at the same speed for all,
this escape route is out. Since both flashes are equally distant from
Bob, and he sees the right hand flash first, he must reckon that it
had occurred earlier! So, the flashes - which are simultaneous
with respect to Alice are not simultaneous to Bob - simultaneity is
relative!
What is the gap that Bob sees in between the two strikes? Well,
it is pretty easy to figure out that Alice will see Bob meet the light
from the right hand and left hand flashes after times l/(2(c + v)) and
l/(2(c − v)), respectively, where l/2 is the distance of each flash from her.
So, Alice will see the time difference between the two lightbeams
striking Bob as

$$\frac{l}{2(c - v)} - \frac{l}{2(c + v)} = \frac{vl}{c^2 - v^2} = \frac{\beta}{1 - \beta^2}\,\frac{l}{c} = \gamma^2\beta\,\frac{l}{c}$$

What, then, is the time difference according to Bob? Note that Bob
is present at both the events - so that the time that he measures is
the proper time interval between the two events. This means that
Bob’s value for the time interval is smaller by a factor of γ. Hence
Bob sees the right hand flash γβ l/c ahead of the left hand flash.
Since he is at the midpoint of the positions of the lightning strikes,
he will claim that the right hand flash occurs earlier, by a gap of
γβ l/c. This, of course is the result that I had shown you a while ago
using the radar method.
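If you want to see the numbers come out, here is a minimal Python sketch of the train argument. The speed β = 0.6 and the separation l = 6 light-seconds are illustrative choices of mine, not values fixed by the argument above, and I work in units where c = 1.

```python
import math

c = 1.0        # units where c = 1 (seconds and light-seconds)
beta = 0.6     # illustrative speed of the train, v = 0.6 c
l = 6.0        # illustrative separation of the two strikes, in light-seconds
v = beta * c
gamma = 1.0 / math.sqrt(1.0 - beta**2)

# Times on Alice's clocks at which Bob meets each flash
t_right = l / (2.0 * (c + v))
t_left = l / (2.0 * (c - v))

gap_alice = t_left - t_right    # should equal gamma^2 * beta * l / c
gap_bob = gap_alice / gamma     # proper time between the two meetings

print(gap_alice, gamma**2 * beta * l / c)
print(gap_bob, gamma * beta * l / c)
```

Each printed pair should agree, which is just the statement that Bob's gap is γβ l/c.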
Using an argument like this we can cover the case that’s not
so easy to deal with in terms of the radar method. What if the
train was moving not along the line joining the two points where
lightning had struck - but rather in a direction perpendicular to
this line? It is rather easy to see that the two flashes would have
reached Bob at the same time in this case, since he keeps on stay-
ing equally distant from their sources. Thus Bob and Alice will
both agree that the flashes are simultaneous!
What if Bob had moved off at an angle to the line joining the
flashes? It seems reasonable to expect that in this case, again,
Bob will see the flash towards which his velocity is slanted to occur
earlier - you might even be so bold as to suggest that for this case,
too, the equation (3.11) will continue to hold, except that instead
of the product vl you must have the scalar product $\vec{v}\cdot\vec{l}$ - so
that the equation becomes

\[ \Delta t_{\rm Bob} = -\gamma\,\frac{\vec{v}\cdot\vec{l}}{c^2} \tag{3.12} \]

It is easy to check that this general equation definitely matches the
results for the two cases that we have considered so far - so that
you can have a degree of confidence in it.
You can check whether (3.12) is correct with a little bit of extra
effort. Figure ** shows the situation. It is easy to see that Alice will
see the flashes from the right hand thunder strike and left hand
thunder strike reach Bob at times $t_1$ and $t_2$, respectively, given by

\[ c\,t_1 = \left| \vec{v}\,t_1 - \frac{\vec{l}}{2} \right| \]

\[ c\,t_2 = \left| \vec{v}\,t_2 + \frac{\vec{l}}{2} \right| \]

Using $|\vec{A}+\vec{B}|^2 = A^2 + B^2 + 2\vec{A}\cdot\vec{B}$, you can rewrite the above as

\[ c^2 t_1^2 = v^2 t_1^2 - \left(\vec{v}\cdot\vec{l}\right) t_1 + \frac{l^2}{4} \]

\[ c^2 t_2^2 = v^2 t_2^2 + \left(\vec{v}\cdot\vec{l}\right) t_2 + \frac{l^2}{4} \]

Taking the difference of these two equations immediately gives

\[ \left(c^2 - v^2\right)\left(t_1^2 - t_2^2\right) = -\left(\vec{v}\cdot\vec{l}\right)\left(t_1 + t_2\right) \]

giving

\[ t_1 - t_2 = -\frac{\vec{v}\cdot\vec{l}}{c^2 - v^2} \]

- which is the difference in the times at which the two light flashes
reach Bob. This doesn’t match equation (3.12) - but that’s only
because it is the time interval as reckoned by Alice. Bob, of course,
measures the proper time interval - throwing in the correction factor
of $\sqrt{1-\frac{v^2}{c^2}}$ gives you exactly (3.12)!
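A numerical check of (3.12) for a slanted direction can be sketched in Python as follows. The vectors `v_vec` and `l_vec` are illustrative values of my own choosing, the motion is restricted to a plane, and the helper `meeting_time` is a hypothetical name for a function that solves the two quadratic equations above.

```python
import math

c = 1.0                 # units where c = 1
v_vec = (0.36, 0.48)    # illustrative velocity of Bob, |v| = 0.6 c
l_vec = (6.0, 0.0)      # illustrative vector from the left strike to the right one


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def meeting_time(sign):
    """Positive root of (c^2 - v^2) t^2 - sign (v.l) t - l^2/4 = 0.

    sign = -1 gives t1 (right hand flash), sign = +1 gives t2 (left hand flash).
    """
    a = c**2 - dot(v_vec, v_vec)
    b = -sign * dot(v_vec, l_vec)
    k = -dot(l_vec, l_vec) / 4.0
    return (-b + math.sqrt(b * b - 4.0 * a * k)) / (2.0 * a)


t1 = meeting_time(-1)
t2 = meeting_time(+1)
gamma = 1.0 / math.sqrt(1.0 - dot(v_vec, v_vec) / c**2)

gap_alice = t1 - t2                             # difference on Alice's clocks
gap_bob = gap_alice / gamma                     # proper time measured by Bob
formula = -gamma * dot(v_vec, l_vec) / c**2     # right hand side of (3.12)

print(gap_bob, formula)
```

The two printed numbers should coincide; the negative sign says that the flash towards which Bob's velocity is slanted comes first for him.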

3.3.2 The writing on the clocks revisited


In subsection 3.1.6 I had introduced you to a puzzle that involved the
symmetry of time dilation. In brief, if both Alice and Bob see the
other’s clock run slow, then how do you explain the fact that Bob
sees Alice’s clock adjacent to him read 10 seconds when his own
shows only 8 seconds?
When Bob’s clock showed zero, Alice’s clock right next to him
had shown zero, too. Thus it might seem to Bob that the time
interval that elapsed in between is 10 seconds according to Alice’s
clocks, whereas his own shows only eight seconds. This of course
would mean that Alice’s clocks are moving faster than Bob’s!
Think about it a bit and you will find that there is a slight gap
in the argument. Alice’s measurement of the time interval involves
two different clocks here - and even in our daily lives (let alone for
situations where things move around at nearly the speed of light)
we would not subtract the readings on two clocks to get the time
interval between two events, unless we were sure that the clocks
are set right. In other words, we must be sure that the clocks are
synchronized - they must both show zero at the same instant. The
same instant? This means of course that both had to read zero
simultaneously! There are many ways in which this can be done -
and since Alice is relying on her clocks to give her accurate time in-
tervals, we take it for granted that she has managed to synchronize
her clocks by some means.
Alice, then, does claim that the time interval that elapses be-
tween the two events is 10 seconds. For Bob’s measurements, of
course, we do not have to worry about synchronization - he is us-
ing the same clock to time the two events - just subtraction of the
readings will do.
Let me now show you what the whole affair looks like to Bob.
He of course can not deny the readings on the clocks, what he will
deny though is that the interval as measured by Alice’s clocks is
really 10 seconds! As far as he is concerned, Alice’s clocks are
not synchronized. You should have expected that - synchroniza-
tion involves setting the clocks at zero simultaneously, and events
that are simultaneous to Alice (who did the synchronization on her
clocks) are not simultaneous to Bob!
Now, Alice’s two clocks are separated by the distance that Bob
travels at a rate of 0.6 c in the 10 seconds that Alice sees as the
length of the interval - this of course is 6 light-seconds. Now, think
about Alice setting her clock at her origin and her clock 6 light-seconds
away. These two events, that occur at the same time according
to her, are separated for Bob by a time gap given by equation (3.11)
as

\[ 0.6 \times 1.25 \times 6\ \mathrm{s} = 4.5\ \mathrm{s} \]

with the clock at Alice’s origin set later. Since the two friends’ clocks
at their respective origins showed zero at the same time, Bob will
have no option other than saying that the second clock being used
by Alice has been set too early, by 4.5 seconds.
To compare how fast Alice’s clocks are running, we must focus
on the time gap between two events as read by any one of them. Let
me take the clock that is 6 light-seconds away, and consider the
two events when it was set to zero - and when Bob came up just
abreast of it. Of course, this clock reads a gap of 10 seconds between
them. However, to Bob, the first event occurred, not at the
time origin, but at t = −4.5 seconds, while the second one occurs
at t = 8 seconds. Hence, as far as Bob is concerned, the gap between
the two events that he sees is not 8 seconds at all, but 12.5
seconds. No wonder, then, that he says that Alice’s clock is going
slow, by a factor of $\frac{12.5}{10} = 1.25$!
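The bookkeeping of this puzzle is easy to lose track of, so here is a small Python sketch that redoes it with the numbers used above (0.6 c and 10 seconds). The variable names are my own, and c is set to 1 so that distances are in light-seconds.

```python
import math

beta = 0.6                                # Bob's speed in units of c
gamma = 1.0 / math.sqrt(1.0 - beta**2)

alice_interval = 10.0                     # seconds, from Alice's two clocks
bob_interval = alice_interval / gamma     # seconds on Bob's own clock
separation = beta * alice_interval        # light-seconds between Alice's clocks

# Relativity of simultaneity: how much earlier (to Bob) the far clock was set
head_start = gamma * beta * separation

# Time that passes for Bob while the far clock advances by 10 seconds
bob_gap = bob_interval + head_start
slow_factor = bob_gap / alice_interval

print(bob_interval, head_start, bob_gap, slow_factor)
```

The last number is the factor by which Bob finds Alice's clock to run slow, and it should equal γ.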

3.3.3 The length contraction paradox revisited


Remember Alice’s problem of getting her 10 m long car into her 8
m long garage (subsection 3.2.4)? She could get Bob to drive
the car at 0.6 c to get it to fit in the garage - the trouble was, Bob
would see the car to have a length of 10 m, while to him the garage
will be shortened to only 6.4 m!
Of course, when Alice says that her car just fits into her garage,
she means that just as the front bumper of her car touches the rear wall,
the back of the car crosses the gate. This means that two
events - the front end of the car touching the rear wall of the garage
and the back end crossing the gate - are simultaneous to Alice. By
now you must have caught on to the story. These two events that
occur simultaneously at a distance of 8 m to Alice (that, remember,
is how long the garage as well as the car is to her) occur at a time
gap of 0.6 × 1.25 × 8 m/c = 6 m/c according to Bob (that is, the time
light takes to cover 6 m). Now, Bob sees
himself sitting still in the 10 m long car, with a 6.4 m long garage
rushing in at him at a speed of 0.6 c. How far does the garage
move in the 6 m/c time gap that Bob sees between the two events?
Exactly 3.6 m - the gap in size between the car and the garage! So,
everything falls into place!
Come to think about it, demanding that the garage has to be
8 m long in order that the car fits into it is actually too restrictive -
a smaller garage will do! After all, all Alice needs to do is shut the
door on the back of the car before the information that the front
end of the car has crashed into the rear wall gets to her. If the
garage has a length L, it will take the news at least $\frac{L}{c}$ amount of
time to reach her, in which the car (or rather, the back end, which
does not know that the front end has crashed as yet!) gets to travel
a further distance of βL. So, the car manages to get into the garage
even if its length is L + βL = (1 + β) L according to Alice, that is,
even if its proper length is $L_0 = \gamma(1+\beta)L = \sqrt{\frac{1+\beta}{1-\beta}}\,L$. Note that the
Doppler factor K makes a rather unexpected appearance here, can
you figure out why?
So, in order to fit a 10 m long car moving at 0.6 c, the garage
needs to be only 5 m long! On the other hand, if Alice has an 8 m long
garage, she does not have to get Bob to drive at the breakneck speed
of 0.6 c, a mere 0.22 c will do!
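The garage numbers can be checked with a few lines of Python. This is only a sketch: the helper names `doppler_factor` and `beta_from_doppler` are mine, and I am assuming the usual relation K² = (1 + β)/(1 − β) between the Doppler factor and the speed.

```python
import math


def doppler_factor(beta):
    """Doppler factor K for a speed beta = v/c (assumed K^2 = (1+beta)/(1-beta))."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def beta_from_doppler(K):
    """Inverse relation: the speed beta = v/c that gives Doppler factor K."""
    return (K**2 - 1.0) / (K**2 + 1.0)


car_proper_length = 10.0    # metres
garage_length = 8.0         # metres
beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)

# Bob's view of the 8 m garage: time gap (in units of m/c) between the two
# events, and the distance the garage moves during that gap.
time_gap = gamma * beta * garage_length
garage_travel = beta * time_gap
size_mismatch = car_proper_length - garage_length / gamma

# Shortest garage that works at 0.6 c, and slowest speed for an 8 m garage.
shortest_garage = car_proper_length / doppler_factor(beta)
slowest_beta = beta_from_doppler(car_proper_length / garage_length)

print(time_gap, garage_travel, size_mismatch)
print(shortest_garage, slowest_beta)
```

The distance the garage travels should match the mismatch in size that Bob sees, and the last line should reproduce the 5 m and roughly 0.22 c quoted above.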

3.4 Addition of velocities


One thing may have been vexing you again and again - just how
does the speed of light turn out to be the same for all observers?
Or, why is it if Bob sees Charlie move away from him at a rate of
0.6 c, while he is moving away at the same rate from Alice, Charlie’s
speed as seen by Alice is not 1.2 c, but rather, something smaller
than c? To see the answer, let me take you back to the radar
method once again - you will find that finding out Charlie’s speed
as seen by Alice is very simple, indeed, if we think in terms of the
Doppler factors instead of conventional speeds.
Consider a situation where Alice, Bob and Charlie were together
at their respective zero times. Bob is moving away with a
speed v1 and a corresponding Doppler factor K1 with respect to Al-
ice. Charlie, on the other hand, is moving away with speed v2 (and
Doppler factor K2 ) with respect to Bob. The question is, what is the
velocity v3 of Charlie with respect to Alice? Of course, since K and
β are related to each other via the equations (2.3) and (2.4), we can
just as well ask, what is the value of the Doppler factor of Charlie
with respect to Alice, K3 ?
By now you know enough relativity to know that the answer
cannot be the simple classical result $v_1 + v_2$ - the formula for relativistic
addition of velocities must be different! To find out what it
is, just imagine that Alice flashes her flashlight at a time τ (according
to her clock) in the direction of Bob and Charlie. Bob receives
the flash of light at time $K_1\tau$ (according to his watch), which then
proceeds to meet Charlie at a time $K_3\tau$ (this time it’s Charlie’s watch).
Now, Bob could just as well have claimed that he had sent out the
signal at a time $K_1\tau$, which should have reached Charlie when the
latter’s watch reads $K_2 \times K_1\tau = K_1K_2\tau$. This means that we must
have

\[ K_3\tau = K_1K_2\tau \]

so that the relativistic addition of velocities simply becomes the
multiplication of Doppler factors

\[ K_3 = K_1K_2 \tag{3.13} \]

Indeed, as we will see over and over again, this simple result is
the reason why the Doppler factor is more convenient to use than
the speed in relativistic contexts. We have already seen, near the
end of section 2.4 that the Doppler factor is more directly amenable
to experimental determination. Be that as it may, speeds being
more familiar, we would definitely like to write (3.13) in terms of
v1 , v2 and v3 . Using equation (2.4), this is readily done:

\[ K_3^2 = K_1^2K_2^2 \]

implies

\[ \frac{1+\beta_3}{1-\beta_3} = \frac{1+\beta_1}{1-\beta_1}\times\frac{1+\beta_2}{1-\beta_2} \]

which leads readily to

\[ \beta_3 = \frac{(1+\beta_1)(1+\beta_2) - (1-\beta_1)(1-\beta_2)}{(1+\beta_1)(1+\beta_2) + (1-\beta_1)(1-\beta_2)} = \frac{\beta_1+\beta_2}{1+\beta_1\beta_2} \]

or, in terms of the more familiar speeds

\[ v_3 = \frac{v_1+v_2}{1+\frac{v_1v_2}{c^2}} \tag{3.14} \]

This, then, is our relativistic formula for addition of velocities.


Now that we have found out our formula for the addition of
velocities in STR, let us check some of the statements I have been
making so far. First, let us see what the scenario will be if in place
of Charlie we have a beam of light. Then v2 = c and our formula
gives us
\[ v_3 = \frac{v_1 + c}{1 + \frac{v_1c}{c^2}} = c\,! \]
- thus, the speed of light is c with respect to Alice, too. After all,
since I have assumed that the speed of light is the same for all
observers as the basis of our radar method, I would have been very
surprised indeed if anything else had come out of our calculation!
You can easily check that if Bob sees a light beam moving to the
left, v2 = −c , then v3 = −c, too. Remember what I had said about
being unable to outrun light?
The fact that c remains c under the velocity addition rule becomes
even more obvious in terms of the Doppler factors. Remember
that light travelling to the right (left) has a Doppler factor $K_2$ of
∞ (0) - and this obviously does not change upon multiplication by
$K_1$.
Another result that follows from our rule of addition of velocities
is the fact that if Bob sees Charlie move slower than light, so will
Alice, no matter how fast Bob moves with respect to Alice. You can
easily check that, for example, $v_1 = v_2 = 0.6\,c$ leads to

\[ v_3 = \frac{0.6\,c + 0.6\,c}{1 + \frac{0.6\,c\,\times\,0.6\,c}{c^2}} = \frac{1.2}{1.36}\,c \approx 0.88\,c \]

or even that $v_1 = v_2 = c$ leads to

\[ v_3 = c\,! \]

Proving this result in general involves a slightly lengthy bit of algebra.
It becomes almost immediate, though, if you think in terms
of the Doppler factor instead of the speeds.
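To see the two routes side by side, here is a short Python sketch that adds velocities once with (3.14) and once by multiplying Doppler factors as in (3.13). The function names are hypothetical, speeds are in units of c, and I assume the relation K² = (1 + β)/(1 − β).

```python
import math


def doppler_factor(beta):
    """Doppler factor K for a speed beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def beta_from_doppler(K):
    """Speed beta = v/c corresponding to a Doppler factor K."""
    return (K**2 - 1.0) / (K**2 + 1.0)


def add_velocities(beta1, beta2):
    """Relativistic addition (3.14), with speeds in units of c."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


def add_via_doppler(beta1, beta2):
    """The same composition, done by multiplying Doppler factors (3.13)."""
    return beta_from_doppler(doppler_factor(beta1) * doppler_factor(beta2))


beta3_formula = add_velocities(0.6, 0.6)
beta3_doppler = add_via_doppler(0.6, 0.6)
light = add_velocities(0.6, 1.0)     # Charlie replaced by a beam of light

# Sub-light speeds always combine to something below c
samples = [0.1, 0.5, 0.9, 0.99]
all_below_c = all(add_velocities(a, b) < 1.0 for a in samples for b in samples)

print(beta3_formula, beta3_doppler, light, all_below_c)
```

In the Doppler picture the last check is obvious: the product of two finite positive numbers is again finite and positive, so the combined speed can never reach c.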

3.4.1 The relative velocity formula


The difference between the relative velocity formula and the velocity
addition formula is really one of context. You use the latter if you
know Charlie’s velocity with respect to Bob and Bob’s velocity with
respect to Alice and want to find out Charlie’s velocity with respect
to Alice. You will use the former if given both Charlie and Bob’s
velocities with respect to Alice, you want to figure out Charlie’s
velocity with respect to Bob. In the notation that I used above,
then, you know K3 and K1 and want to find K2 . The result is
obvious and leads directly to

\[ K_{\text{Charlie w.r.t. Bob}} = K_{\text{Charlie w.r.t. Alice}} \,/\, K_{\text{Bob w.r.t. Alice}} \tag{3.15} \]

So, the calculation of relative velocity just boils down to the division
of Doppler factors. In terms of the velocities, this formula becomes,
instead of the classical u − v,

\[ u' = \frac{u-v}{1-\frac{uv}{c^2}} \tag{3.16} \]

- where u′ is Charlie’s velocity relative to Bob, with u and v being
the velocities of Charlie and Bob, respectively, with respect to Alice.
Check this!
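One way of checking this numerically is sketched below in Python. The speeds u = 0.9 c and v = 0.5 c are illustrative values of my own, and the helper functions assume the relation K² = (1 + β)/(1 − β).

```python
import math


def doppler_factor(beta):
    """Doppler factor K for a speed beta = v/c."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def beta_from_doppler(K):
    """Speed beta = v/c corresponding to a Doppler factor K."""
    return (K**2 - 1.0) / (K**2 + 1.0)


u = 0.9    # illustrative speed of Charlie with respect to Alice, in units of c
v = 0.5    # illustrative speed of Bob with respect to Alice, in units of c

# Route 1: divide the Doppler factors, as in (3.15)
u_rel_doppler = beta_from_doppler(doppler_factor(u) / doppler_factor(v))

# Route 2: the relative velocity formula (3.16)
u_rel_formula = (u - v) / (1.0 - u * v)

print(u_rel_doppler, u_rel_formula, u - v)
```

The first two numbers should agree with each other, and both should be larger than the classical u − v printed last.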

3.4.2 A deeper look into relative velocities


Now that we have the relative velocity formula safely under our
grasp, let me try to show you how to understand this result. To
give you a flavour of what I intend to do in this subsection, let
me first show you how the familiar u − v result for relative velocity
in pre-Einsteinian physics can be understood directly in terms of
quantities measured by our two friends.
Imagine that you are back in the good old comfortable Newto-
nian world, in which Alice sees Bob and Charlie receding from her
with speeds v and u, respectively, along the same direction. This
means that at a time t (you no longer have to specify by whose
watch - remember that time is universal in Newtonian physics)
they are at respective distances of vt and ut from Alice. Thus at
this point of time, Alice measures the distance between them to be
(u − v) t. This is the distance that Bob measures between them, too
- length being an invariant in Newtonian physics. Now, the time
according to Bob, again, is t - the same as that measured by Alice.
This gives Charlie’s speed according to Bob as $\frac{(u-v)t}{t} = u - v$ - our
familiar relative velocity equation!
Now, let us try to analyse the same situation - but this time
taking STR into account. The situation is exactly the same as far
as Alice is concerned - she does see the distance between Charlie
and Bob as (u − v) t at a time t - according to her distance and time
measurements. The question, then, of course is - just how far does
Bob measure Charlie to be, and what is the time according to him?
Now, Bob will certainly measure the distance to Charlie using
a measuring scale that is fixed with respect to him, and so length
contraction ensures that there is bound to be a difference between his
and Alice’s values for this quantity. If the length of the segment
of Bob’s measuring rod that stretches from him to Charlie is D with
respect to him, Alice will see it shortened to $D\sqrt{1-\frac{v^2}{c^2}}$. This of
course must match Alice’s distance reading, (u − v) t (a bit of care
must be exercised here to keep the logic straight - the two ends of
the segment of Bob’s rod are at ut and vt according to Alice, but
what ensures that the difference between these two readings is the
length of this segment as far as she is concerned is the fact that
both these readings have been taken, according to her, at the same
time, namely t! This is why I can simply use the length contraction
formula here.), and so D must be

\[ D = \frac{(u-v)\,t}{\sqrt{1-\frac{v^2}{c^2}}} \]

Now that we have the distance to Charlie according to Bob, the next
question is, what is the time, again according to Bob, when Charlie
is at this distance? We have seen that Alice’s clock right next to
Charlie is showing a time t at this instant - but by now we know
enough to realize that the time that Bob reckons for this must be
different. Since Alice’s clock is moving past Bob at a speed of v, it
must be running slow by a factor of $\sqrt{1-\frac{v^2}{c^2}}$. Just compensating
for this will tell us that Bob reckons the time to be

\[ T = \frac{t}{\sqrt{1-\frac{v^2}{c^2}}} \]

and this means that Bob will see Charlie’s speed to be

\[ u' = \frac{D}{T} = \frac{(u-v)\,t \,\big/ \sqrt{1-\frac{v^2}{c^2}}}{t \,\big/ \sqrt{1-\frac{v^2}{c^2}}} = u - v \]
- the same as the classical result! Something must have gone very
wrong here - after all, the classical formula for the addition of ve-
locities just can not be right! Apart from anything else - if instead
of Charlie, both Alice and Bob had been observing a flash of light -
this formula would have told you that they would have seen differ-
ent speeds - something that runs smack against relativity’s basic
postulate!
Some of you must have caught on to the source of the trouble
by now - I have simply forgotten to take the disagreement between
the two friends on clock synchronization into account. Alice says
that Charlie has covered the distance ut in time t - because at
t = 0 he was right next to her, while at time t he is at a distance
of ut from her. Though, the two clocks that tell her the time at
Charlie’s two locations are different, she is sure that subtracting
their readings does give her the time interval elapsed because she
has synchronized them so that they read zero at the same time.
Bob, on the other hand, will say that the second clock was set to
zero too early - it has a head-start of $\frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\,\frac{v}{c^2}\times(ut)$ over the clock at
Alice’s origin. This means that the time that has elapsed according
to Bob is not really $\frac{t}{\sqrt{1-\frac{v^2}{c^2}}}$, but rather

\[ T = \frac{t}{\sqrt{1-\frac{v^2}{c^2}}} - \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\,\frac{v}{c^2}\times(ut) = \frac{1-\frac{uv}{c^2}}{\sqrt{1-\frac{v^2}{c^2}}}\;t \]

Using this value for T immediately leads to our law of relative
velocities.
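The role played by the synchronization correction shows up clearly if you run both versions of the argument with numbers. The Python sketch below does that; the values of u, v and t are illustrative choices of mine and c is set to 1.

```python
import math

c = 1.0
u, v, t = 0.9, 0.5, 10.0    # illustrative speeds (units of c) and Alice's time

root = math.sqrt(1.0 - v**2 / c**2)

D = (u - v) * t / root                      # distance to Charlie on Bob's rod
T_naive = t / root                          # time if only time dilation is used
head_start = (v / c**2) * (u * t) / root    # offset of Alice's far clock, per Bob
T = T_naive - head_start                    # elapsed time according to Bob

u_naive = D / T_naive    # reproduces the classical u - v
u_rel = D / T            # reproduces the relativistic formula (3.16)

print(u_naive, u - v)
print(u_rel, (u - v) / (1.0 - u * v / c**2))
```

The first line shows the flawed argument landing on the classical answer, while the second shows that including the head-start repairs it.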
As an added bonus we can easily adapt this argument to un-
derstand the situation when Charlie is not moving in the same
direction as Bob is, with respect to Alice. In this case, the formula
for u′ certainly stays valid for the x component of Charlie’s velocity
(remember - the x component is special, because that is the direction
in which Bob is moving relative to Alice). Note that you have to
remember to change the u in the term $\frac{uv}{c^2}$ in the denominator to $u_x$,
too - check out the general formula for the relativity of simultaneity
(3.12)! This means that we have

\[ u'_x = \frac{u_x - v}{1-\frac{u_xv}{c^2}} \tag{3.17-a} \]

What about the other two components of Charlie’s velocity? You
know that both Alice and Bob will measure the same value for
lengths perpendicular to the direction of relative motion. Thus
along the Y and Z directions both will see the separation between
Charlie and Bob as $u_yt$ and $u_zt$, respectively (remember that
$v_y = v_z = 0$), when Alice’s watches show a time t. Although they both
measure the same displacement component, the disagreement on
time persists, leading to Charlie’s velocity components as measured
by Bob turning out to be

\[ u'_y = \frac{u_y\sqrt{1-\frac{v^2}{c^2}}}{1-\frac{u_xv}{c^2}} \tag{3.17-b} \]

\[ u'_z = \frac{u_z\sqrt{1-\frac{v^2}{c^2}}}{1-\frac{u_xv}{c^2}} \tag{3.17-c} \]

As you can see, the velocity components perpendicular to the relative
velocity of Alice and Bob transform in a rather complicated
fashion. This, despite the fact that distances in this direction are
the same for the two friends. Indeed, it is better to say that the
complication in this case arises exactly because of this - there is
no Lorentz contraction factor that cancels out the time dilation fac-
tor!
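A useful sanity check on (3.17-a) to (3.17-c) is that a beam of light moving sideways for Alice must still move at c for Bob. Here is a minimal Python sketch of that check; `transform_velocity` is a hypothetical helper name, and the beam direction and Bob's speed are illustrative choices.

```python
import math


def transform_velocity(u, v, c=1.0):
    """Charlie's velocity (ux, uy, uz) as seen by Bob, who moves at v along x."""
    ux, uy, uz = u
    root = math.sqrt(1.0 - v**2 / c**2)
    denom = 1.0 - ux * v / c**2
    return ((ux - v) / denom, uy * root / denom, uz * root / denom)


# A light beam that Alice sees travelling straight along her y axis (c = 1)
u_light = (0.0, 1.0, 0.0)
u_light_bob = transform_velocity(u_light, 0.6)
speed_bob = math.sqrt(sum(comp**2 for comp in u_light_bob))

print(u_light_bob, speed_bob)
```

For Bob the beam picks up a backward x component and its y component shrinks, but the two changes conspire to keep the speed at c.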

3.5 Lorentz transformations


All the ingredients necessary for studying relativistic kinematics
are now at hand - you can work out all the results with just the few
effects that you have learned about so far (just as in the last sec-
tion I showed you how to understand the velocity transformation
formula with just time dilation, length contraction and the relativ-
ity of simultaneity). However, the job will be much, much easier
if we add to our repertoire of tools the most general one of them
all - the rules about how space time coordinates of an event as
measured by one observer differ from those measured by another!
We are looking, then, for the successor to the venerable Galilean
transformations.
Deriving the Lorentz transformations is a very simple matter if
we make use of the radar method. Just imagine Alice sending out
a beam from her flashlight at time t1 - which bounces from some
distant point and comes back to her at time t2 . On its way out, this
beam had crossed Bob at a time $Kt_1$, while coming back in, it must
have met Bob at time $\frac{t_2}{K}$ (this should be obvious to you by now - if
not, just study section 2.4 in some detail). Of course, Bob could
just as well say that he had sent out the beam at time $Kt_1$ and
received it after the bounce at time $\frac{t_2}{K}$. So, from our by now familiar
rules (2.2-a) and (2.2-b), the spacetime coordinates at which the
bounce occurred are:

For Alice:

\[ x = \frac{c}{2}\left(t_2 - t_1\right) \tag{3.18-a} \]

\[ t = \frac{1}{2}\left(t_2 + t_1\right) \tag{3.18-b} \]

For Bob:

\[ x' = \frac{c}{2}\left(\frac{t_2}{K} - Kt_1\right) \tag{3.19-a} \]

\[ t' = \frac{1}{2}\left(\frac{t_2}{K} + Kt_1\right) \tag{3.19-b} \]

The equations (3.18-a) to (3.19-b) relate the space-time coordinates
of the bounce for Alice and Bob in terms of the times $t_1$ and $t_2$. All
we have to do is eliminate these two times from the equations in
order to find the transformations that we are after.
All that is left is the algebra. However, there are some interesting
things that we can learn on the way. Firstly, it is obvious that
(3.18-a) and (3.18-b) lead to

\[ t_2 = t + \frac{x}{c} \tag{3.20-a} \]

\[ t_1 = t - \frac{x}{c} \tag{3.20-b} \]

which leads immediately to the relation

\[ t_1t_2 = t^2 - \frac{x^2}{c^2} \tag{3.21} \]

What makes the above equation very interesting indeed is that the
corresponding product for Bob is $Kt_1 \times \frac{t_2}{K} = t_1t_2$ - the same as for
Alice. So we immediately get

\[ t^2 - \frac{x^2}{c^2} = t'^2 - \frac{x'^2}{c^2} \tag{3.22} \]

an equation that, as we will see, has immense importance in relativistic
physics.
Now, for the actual derivation! Substituting (3.20-a) and (3.20-b)
in (3.19-a) and (3.19-b) immediately leads to

\[ x' = \frac{1}{2}\left(K + \frac{1}{K}\right)x - \frac{c}{2}\left(K - \frac{1}{K}\right)t \]

\[ t' = -\frac{1}{2c}\left(K - \frac{1}{K}\right)x + \frac{1}{2}\left(K + \frac{1}{K}\right)t \]

which are the transformations that we are looking for. Of course,
the factors $\frac{1}{2}\left(K + K^{-1}\right)$ and $\frac{1}{2}\left(K - K^{-1}\right)$ are old friends - they are
none other than γ and βγ, respectively. In terms of these, the
equations become

\[ x' = \frac{x - vt}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.23-a} \]

\[ y' = y \tag{3.23-b} \]

\[ z' = z \tag{3.23-c} \]

\[ t' = \frac{t - \frac{v}{c^2}x}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.23-d} \]

where I have thrown in (3.23-b) and (3.23-c) which simply say that
lengths perpendicular to the direction of relative motion are the
same for both Alice and Bob. These, then, are the transformation
equations that are going to take the place of the Galilean trans-
formations. They had been found on mathematical grounds by H.
Lorentz before Einstein put them on a strong physical footing. This
is why these are known as the Lorentz transformation equations -
and the passage from one inertial observer to another is called a
Lorentz transformation.
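The radar derivation can be replayed numerically. The Python sketch below takes an event with Alice's coordinates (x, t), reconstructs the radar times from (3.20-a) and (3.20-b), applies (3.19-a) and (3.19-b), and compares the outcome with (3.23-a) and (3.23-d). The function names and the sample events are my own illustrative choices, and c = 1.

```python
import math

c = 1.0


def bob_coordinates_by_radar(x, t, beta):
    """Bob's (x', t') for an event, built from Alice's radar times."""
    K = math.sqrt((1.0 + beta) / (1.0 - beta))
    t1 = t - x / c                         # emission time, (3.20-b)
    t2 = t + x / c                         # reception time, (3.20-a)
    x_bob = 0.5 * c * (t2 / K - K * t1)    # (3.19-a)
    t_bob = 0.5 * (t2 / K + K * t1)        # (3.19-b)
    return x_bob, t_bob


def bob_coordinates_by_lorentz(x, t, beta):
    """Bob's (x', t') from the Lorentz transformation (3.23-a), (3.23-d)."""
    v = beta * c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


events = [(3.0, 5.0), (1.0, 7.0)]   # illustrative (x, t) pairs for Alice
beta = 0.6
for x, t in events:
    print(bob_coordinates_by_radar(x, t, beta),
          bob_coordinates_by_lorentz(x, t, beta))
```

Each printed pair of coordinates should agree. The first event is one that Bob is present at, so his x′ for it comes out as zero.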
Our basic postulate of relativity boils down to the mathematical
demand that the form of the equations of physics must stay the
same under a Lorentz transformation. In a slightly more convo-
luted fashion, scientists prefer to put this as “The laws of physics
should be Lorentz covariant”. Just what does covariance mean -
and just why am I stressing that as opposed to, say, Lorentz invariance -
are issues that I am going to describe in some detail later
on.
For the time being let’s see how Alice can find the space time co-
ordinates of an event, given the coordinates that Bob has measured
for it. In other words, we want to find (x, y, z, t) from (x′, y′, z′, t′). One
way to do this is simply to solve the Lorentz transformation equations
(3.23-a) to (3.23-d) to find (x, y, z, t). I leave the job of carrying out
the algebra to you. However, there is a very much simpler way of
finding the result - all you have to do is to remember our basic postulate
of relativity. The equations that connect Alice’s readings to
Bob’s must have the same form as those that connect Bob’s readings
to Alice’s - except that instead of v we must put in −v, which
is Alice’s velocity with respect to Bob. Thus, the so called inverse
Lorentz transformations are

\[ x = \frac{x' + vt'}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.24-a} \]

\[ y = y' \tag{3.24-b} \]

\[ z = z' \tag{3.24-c} \]

\[ t = \frac{t' + \frac{v}{c^2}x'}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.24-d} \]
You should check for yourself that solving the Lorentz transforma-
tion equations for (x, y, z, t) yields the same results. Also check that
equations (3.23-a) and (3.23-d) directly lead to (3.22).
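Both checks can also be done numerically before you attempt the algebra. In the Python sketch below, `lorentz` is a hypothetical helper implementing (3.23-a) and (3.23-d); calling it with −v plays the role of the inverse transformation. The event and the speed are illustrative values, with c = 1.

```python
import math


def lorentz(x, t, v, c=1.0):
    """Lorentz transformation of (x, t) to an observer moving at v along x."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


x, t, v = 2.0, 7.0, 0.8                 # illustrative event and relative speed

x_bob, t_bob = lorentz(x, t, v)         # Alice -> Bob
x_back, t_back = lorentz(x_bob, t_bob, -v)   # Bob -> Alice, using -v

interval_alice = t**2 - x**2            # left hand side of (3.22), c = 1
interval_bob = t_bob**2 - x_bob**2      # right hand side of (3.22)

print((x_back, t_back), (x, t))
print(interval_alice, interval_bob)
```

The round trip should hand back the original coordinates, and the two sides of (3.22) should agree.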

3.5.1 Everything from L. T.


Let me point out now that a more traditional presentation of the
special theory of relativity would have started out by first prov-
ing the Lorentz transformation equations and then deriving all the
other phenomena that we have discovered from them. By the way,
the more traditional presentations would also have called the two
observers S and S′ - I personally feel that Alice and Bob is much
nicer!
Once you have (3.23-a) to (3.23-d) and (3.24-a) to (3.24-d) in hand, finding
out things like time dilation, length contraction, etc. is a very
simple matter. As a preparation for this, just note that the equations
are linear, and thus differences of space time coordinates
obey the same set of equations. The direct transformations give

\[ \Delta x' = \frac{\Delta x - v\Delta t}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.25-a} \]

\[ \Delta y' = \Delta y \tag{3.25-b} \]

\[ \Delta z' = \Delta z \tag{3.25-c} \]

\[ \Delta t' = \frac{\Delta t - \frac{v}{c^2}\Delta x}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.25-d} \]

while the inverse transformations give

\[ \Delta x = \frac{\Delta x' + v\Delta t'}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.26-a} \]

\[ \Delta y = \Delta y' \tag{3.26-b} \]

\[ \Delta z = \Delta z' \tag{3.26-c} \]

\[ \Delta t = \frac{\Delta t' + \frac{v}{c^2}\Delta x'}{\sqrt{1-\frac{v^2}{c^2}}} \tag{3.26-d} \]

Let me quickly illustrate how you would go about discovering all
the phenomena that we have covered so far in this chapter from
the above equations.

3.5.1.1 Time dilation

Bob’s wristwatch shows a time interval of $t_0$ between two ticks (this
must be the proper time - Bob being on the spot at both the ticks).
What time interval t will Alice measure? Note that in this case,
$\Delta x' = 0$ and $\Delta t' = t_0$, for the two ticks of Bob’s wristwatch. Using
(3.26-d) immediately leads to

\[ t = \Delta t = \frac{t_0 + \frac{v}{c^2}\times 0}{\sqrt{1-\frac{v^2}{c^2}}} = \frac{t_0}{\sqrt{1-\frac{v^2}{c^2}}} \]

which is our familiar time dilation formula, equation (3.1)!


Before moving on to the next topic, let me warn you about some-
thing that most people get confused about at some point of time or
the other. In the derivation above I have used the inverse Lorentz
transformation - why not use the direct ones? Of course you could
have used the direct equations too, getting

\[ t_0 = \Delta t' = \frac{t - \frac{v}{c^2}\Delta x}{\sqrt{1-\frac{v^2}{c^2}}} \]

The trouble is - here ∆x is the distance between the two ticks that
Alice sees and that is not zero! Putting in $\Delta x' = 0$ in (3.25-a) will
immediately give ∆x = vt (this is obvious directly too - after all,
the watch is moving away from Alice with a speed v), which will
immediately lead to the correct answer. It is much easier, of course,
to use the inverse transformation for this particular case, though -
and it is a cardinal sin to mix the two up!
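Here is a small Python sketch of the two routes, together with the mistake that the warning is about. The proper time of 8 seconds and the speed 0.6 c are illustrative values, the variable names are mine, and c = 1.

```python
import math

beta = 0.6                               # Bob's speed in units of c
gamma = 1.0 / math.sqrt(1.0 - beta**2)
t0 = 8.0                                 # proper time between Bob's two ticks

# Inverse transformation (3.26-d) with dx' = 0
t_alice = gamma * (t0 + beta * 0.0)

# Direct transformation (3.25-d): first get dx = v t from (3.25-a) with dx' = 0
dx = beta * t_alice
t0_direct = gamma * (t_alice - beta * dx)

# The mix-up: using the direct equation but wrongly setting dx = 0
t0_wrong = gamma * (t_alice - beta * 0.0)

print(t_alice, t0_direct, t0_wrong)
```

The direct route gives back Bob's proper time only when the correct ∆x = vt is used; with ∆x = 0 it gives a number that is too large by a factor of γ².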

3.5.1.2 Length contraction

Bob carries with him a rod of length $L_0$. What is the length L that
Alice measures for it? Since the rod is at rest with respect to Bob, he
will have $\Delta x' = L_0$ for any two events that measure the coordinates
of the two endpoints - irrespective of whether they are simultaneous
or not! On the other hand, as far as Alice’s measurements are
concerned, ∆x = L only if the two measurements are done simultaneously,
∆t = 0. In this example, it is more convenient to use the
direct transformation, equation (3.25-a), which gives

\[ L_0 = \frac{L - v\times 0}{\sqrt{1-\frac{v^2}{c^2}}} \]

leading immediately to (3.9)

\[ L = L_0\sqrt{1-\frac{v^2}{c^2}} \]

3.5.1.3 Relativity of simultaneity

Two events are simultaneous with respect to Alice and she sees
them occur at a gap of ∆x = l. The direct transformation equation
(3.25-d) immediately gives

\[ \Delta t' = \frac{0 - \frac{v}{c^2}\times l}{\sqrt{1-\frac{v^2}{c^2}}} = -\beta\gamma\,\frac{l}{c} \]

as the time interval between the two events as seen by Bob.


Deriving the time interval in this way immediately tells us some-
thing that the radar method with its one space dimension, could
not have told us. The l that determines how far away in time the
two events, simultaneous for Alice, will be for Bob - is not the spa-
tial distance between the two events as seen by Alice, but rather,
∆x, the component of the separation vector in the x direction (of
course, I have already shown you how to derive the result using
Einstein’s train example in subsection 3.3.1)!
You may be slightly puzzled by the fact that the x component
appears to play a special role in the relativity of simultaneity - after
all, aren’t all directions in space supposed to be equivalent to each
other? A moment’s thought will tell you though that the x direction
is special - it is the direction of the relative motion of the two
friends! Of course, if Bob had been moving away from Alice in any
other direction, what would have occurred in the formula above
is the component of the separation vector in that direction. It is
easy to find out the formula that will work for an arbitrary direction
of relative motion - all you have to do is rotate the coordinates
around in such a way that the new X - X′ axes point along the
velocity. In this frame, of course, the time interval will be given by
$-\frac{1}{c^2}\gamma\,(v\Delta x)$ - but the combination v∆x can be written as $\vec{v}\cdot\Delta\vec{r}$, since
in this frame $\vec{v}$ has an X component only. Thus the expression in
the rotated frame is $-\frac{1}{c^2}\gamma\,(\vec{v}\cdot\Delta\vec{r})$. Now, since this expression involves
scalars only (the γ contains $v^2$, and that is a scalar too) - the
expression will stay the same for any rotated coordinate system,
including, in particular, the original one! So, in general, if Alice
sees two events separated by $\vec{l}$ to be simultaneous, then to Bob,
who is moving away from Alice with a velocity $\vec{v}$, the separation in
time between the two events is

\[ \Delta t_{\rm Bob} = -\gamma\,\frac{\vec{v}\cdot\vec{l}}{c^2} \]

which is the general formula that we had found out in subsection
3.3.1.

3.5.1.4 The transformation of velocities

We have derived the formula for the addition of velocities as well as


the formula for relative velocity using the radar method in section
3.4. I will now show that these follow immediately from the Lorentz
transformation equations. As an added bonus, I will now derive the
formulae for velocity addition (and relative velocity) if the observed
object is moving in an arbitrary direction - not just in the same
direction as the relative velocity between Bob and Alice.
Both Bob and Alice measure the spacetime coordinates of their
friend Charlie at two events. To Alice, Charlie’s velocity in the interval
between the two events is given by

\[ u_x = \frac{\Delta x}{\Delta t}, \qquad u_y = \frac{\Delta y}{\Delta t}, \qquad u_z = \frac{\Delta z}{\Delta t} \]

while, according to Bob, it has the components

\[ u'_x = \frac{\Delta x'}{\Delta t'}, \qquad u'_y = \frac{\Delta y'}{\Delta t'}, \qquad u'_z = \frac{\Delta z'}{\Delta t'} \]

From this it is easy to see that

\[ u'_x = \frac{\Delta x'}{\Delta t}\times\left(\frac{\Delta t'}{\Delta t}\right)^{-1} = \frac{\left(\frac{\Delta x}{\Delta t} - v\right)\Big/\sqrt{1-\frac{v^2}{c^2}}}{\left(1 - \frac{v}{c^2}\frac{\Delta x}{\Delta t}\right)\Big/\sqrt{1-\frac{v^2}{c^2}}} = \frac{u_x - v}{1-\frac{u_xv}{c^2}} \]

which is the same formula that we found out before.
Working from the Lorentz transformations it is easy to find out
the way in which the components of the velocities in the other directions
change. You should check that repeating the above steps
for $u_y$ or $u_z$ gives the equations

\[ u'_y = \frac{u_y\sqrt{1-\frac{v^2}{c^2}}}{1-\frac{u_xv}{c^2}} \]

\[ u'_z = \frac{u_z\sqrt{1-\frac{v^2}{c^2}}}{1-\frac{u_xv}{c^2}} \]

These are of course, the same set of formulae that you have seen
before.
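The three velocity formulae above can be checked numerically. The following is a minimal sketch, assuming units in which c = 1; the function names (lorentz_x, transform_velocity, velocity_from_events) and the sample numbers are illustrative choices and do not come from the text. It computes Charlie's velocity for Bob in two ways: from the closed-form formulae, and directly from the Lorentz-transformed coordinates of two events on Charlie's world line.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1


def lorentz_x(t, x, y, z, v):
    """Special Lorentz transformation: Bob moves with speed v along the common X axis."""
    g = 1.0 / math.sqrt(1.0 - v * v / C**2)
    return g * (t - v * x / C**2), g * (x - v * t), y, z


def transform_velocity(u, v):
    """Charlie's velocity components for Bob, from the closed-form formulae."""
    ux, uy, uz = u
    root = math.sqrt(1.0 - v * v / C**2)
    denom = 1.0 - ux * v / C**2
    return ((ux - v) / denom, uy * root / denom, uz * root / denom)


def velocity_from_events(u, v, dt=1.0):
    """Same quantity, obtained as coordinate differences of two transformed events."""
    ux, uy, uz = u
    t1, x1, y1, z1 = lorentz_x(0.0, 0.0, 0.0, 0.0, v)
    t2, x2, y2, z2 = lorentz_x(dt, ux * dt, uy * dt, uz * dt, v)
    dtp = t2 - t1
    return ((x2 - x1) / dtp, (y2 - y1) / dtp, (z2 - z1) / dtp)


u = (0.5, 0.3, -0.2)  # hypothetical velocity of Charlie as seen by Alice
v = 0.6               # hypothetical speed of Bob relative to Alice
print(transform_velocity(u, v))
print(velocity_from_events(u, v))
```

The two printed triples should agree to rounding error, which is just the statement that the velocity formulae are a consequence of the coordinate transformation.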
As you can see, the velocity transformations are quite a bit more
complicated than the Lorentz transformations themselves - in particular,
the velocity components in the directions perpendicular to the motion
change in a quite involved manner, while the corresponding space
coordinates do not change at all!
It turns out that the components of the vector

\[ \vec U = \frac{1}{\sqrt{1 - \frac{u^2}{c^2}}}\,\vec u \]

transform much more neatly than those of $\vec u$ itself. Here the $u$ in the square root
is the magnitude of the vector $\vec u$ itself and not the relative speed of
the two observers. What makes this possible is the rather cute
identity

\[ \frac{\sqrt{1 - \frac{v^2}{c^2}} \times \sqrt{1 - \frac{u^2}{c^2}}}{\sqrt{1 - \frac{u'^2}{c^2}}} = 1 - \frac{u_x v}{c^2} \tag{3.27} \]

which you should be able to prove for yourself after a bit of (slightly)
messy algebra. Check for yourself that this means, for example,
that

\[ U'_y = U_y \]
\[ U'_z = U_z \]

Find out the way in which $U_x$ transforms.

One thing that (3.27) shows quite clearly is that if both $u$ and
$v$ are less than $c$, $u'$ must be less than $c$ also! Can you figure this
out?
There is, actually, a deeper reason behind why the transformation
of $\vec U$ is somewhat simpler than that of the more familiar $\vec u$. I
will tell you about it in a later section where we will study the
generalization of our old friends the vectors so that they fit comfortably
in the space-time fabric of relativity.
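Before attempting the algebra, it can be reassuring to test identity (3.27) with numbers. The sketch below assumes units with c = 1 and a boost along X; the helper names (gamma, boost_velocity, norm) and the sample velocities are illustrative assumptions. It evaluates both sides of (3.27) and compares the transverse components of $\vec U$ before and after the boost.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1


def gamma(speed):
    """Lorentz factor for a given speed."""
    return 1.0 / math.sqrt(1.0 - speed**2 / C**2)


def boost_velocity(u, v):
    """Velocity transformation for a boost of speed v along X."""
    ux, uy, uz = u
    root = math.sqrt(1.0 - v * v / C**2)
    denom = 1.0 - ux * v / C**2
    return ((ux - v) / denom, uy * root / denom, uz * root / denom)


def norm(vec):
    """Magnitude of a 3-vector."""
    return math.sqrt(sum(comp * comp for comp in vec))


u = (0.5, 0.3, -0.2)  # hypothetical velocity seen by Alice
v = 0.6               # hypothetical relative speed of Bob and Alice
u_prime = boost_velocity(u, v)

# Left and right hand sides of identity (3.27)
lhs = (math.sqrt(1 - v**2 / C**2) * math.sqrt(1 - norm(u)**2 / C**2)
       / math.sqrt(1 - norm(u_prime)**2 / C**2))
rhs = 1 - u[0] * v / C**2

# The vector U = gamma(u) * u in both frames
U = tuple(gamma(norm(u)) * comp for comp in u)
U_prime = tuple(gamma(norm(u_prime)) * comp for comp in u_prime)

print(lhs, rhs)
print(U[1], U_prime[1])
print(U[2], U_prime[2])
```

Each printed pair should agree to rounding error. The X component is deliberately left out, since finding its transformation is the exercise posed above.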

3.5.2 The general Lorentz transformation


The Lorentz transformations that we have been dealing with so far
are rather special - they are ones in which the relative velocity of
Bob and Alice is along the common X axis, and their coordinate
axes are parallel to each other. This is why this is called a special
Lorentz transformation. However, a Lorentz transformation,
in general, is the relationship between the spacetime coordinates
as measured by any two inertial observers. And as I have already
observed, such observers may move with respect to each other in
any direction whatsoever, provided the velocity stays constant in
time. What’s more, it is by no means necessary that the two observers
use coordinate axes that are parallel to each other. They
may just as readily use coordinate axes that are rotated with respect
to each other (note - rotated and not rotating. In the latter
case, the observer would no longer be inertial). Figure 3.5 shows
the possibilities.
The kind of Lorentz transformations where the two observers
use coordinate systems parallel to each other are given a special
name - they are called Lorentz boosts. As you can easily see, the
special Lorentz transformation is actually a special kind of boost -
a boost along the X axis.
A more general Lorentz transformation can be easily understood
as a boost followed by a rotation. In figure 3.5c, the dashed blue
lines show the result of a boost corresponding to the velocity ~v . The
actual coordinate system for the observer S 0 is obtained by rotating
this dashed coordinate system.
Figure 3.5: Various kinds of Lorentz transformations. (a) Special
Lorentz transformation - note that the axes of both observers stay
parallel to each other, and the relative velocity is along the common
$X - X'$ axes. (b) The general Lorentz boost - the axes are still parallel
to each other, but the relative velocity is along some arbitrary
direction. Note that the special Lorentz transformation is nothing
but a boost along the X axis. (c) The general Lorentz transformation
- not only is the relative velocity in an arbitrary direction, but
the axes are also rotated with respect to each other.

How can we find out how coordinates change under a general
boost? You can easily figure this out from what we know of the
special case, provided we also know how to work out the way coordinates
change under a general rotation. What’s needed is shown
in figure 3.6. The steps are - first, rotate Alice’s coordinate system
so that her new X axis (the one denoted $X_r$ in the figure) aligns
with the direction of Bob’s velocity. Now, use the special Lorentz
transformation to boost up to Bob’s speed - this gives us the coordinate
frame denoted by $X'_r$, $Y'_r$ and $Z'_r$. This, though, is not Bob’s
frame of reference - the speed is right - the orientation is not. To
correct for this, we need the final step - carry out the reverse of the
rotation in the first step. This leaves us with a coordinate system
that is parallel to the original one and is moving with the proper
velocity with respect to it - Bob’s frame!

Figure 3.6: A general boost from the special boost.
All that is fine - and abstract! To understand this better let me
show you a concrete example. Instead of trying to figure out the
boost in a very general direction (it will be very difficult to check
whether the result is correct anyway!) - I will show you how to
figure out the equations for a boost along the Z axis.
First, we need a rotation that will take Alice’s X axis into her Z
axis. There are of course many rotations that can do the job for us
- I will use the most obvious - a 90° rotation about the Y axis. This
gives

\[
\begin{aligned}
x \to x_r &= z \\
y \to y_r &= y \\
z \to z_r &= -x \\
t \to t_r &= t
\end{aligned}
\]

where I have included the (trivial) last line to make the notation
consistent. Now, boost to the speed $v$ to get

\[
\begin{aligned}
x_r \to x'_r &= \gamma\,(x_r - v t_r) = \gamma\,(z - vt) \\
y_r \to y'_r &= y_r = y \\
z_r \to z'_r &= z_r = -x \\
t_r \to t'_r &= \gamma\left(t_r - \frac{v}{c^2}\,x_r\right) = \gamma\left(t - \frac{v}{c^2}\,z\right)
\end{aligned}
\]

The final step is, of course, carrying out the opposite rotation to
the one in the first step.

\[
\begin{aligned}
x'_r \to x' &= -z'_r = x \\
y'_r \to y' &= y'_r = y \\
z'_r \to z' &= x'_r = \gamma\,(z - vt) \\
t'_r \to t' &= t'_r = \gamma\left(t - \frac{v}{c^2}\,z\right)
\end{aligned}
\]

So the upshot of it all is

\[
\begin{aligned}
x' &= x \\
y' &= y \\
z' &= \gamma\,(z - vt) \\
t' &= \gamma\left(t - \frac{v}{c^2}\,z\right)
\end{aligned}
\]
If you are comfortable with matrices you should be able to see
that the calculation above could be carried out as a matrix product

\[
\begin{pmatrix} x' \\ y' \\ z' \\ t' \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}^{-1}
\begin{pmatrix} \gamma & 0 & 0 & -v\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\frac{v}{c^2}\gamma & 0 & 0 & \gamma \end{pmatrix}
\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ t \end{pmatrix}
\]

where the first matrix to be applied is the rightmost⁴ $4 \times 4$ matrix
that rotates Alice’s X axis to her Z axis. The next to have a go is
the matrix in the middle, which is just the one that carries out our
special boost. Last to get into the act is the matrix in the front -
which is just the inverse of the first rotation. You should carry out
the matrix multiplication and check that we do get our final result
correct! The beauty of using matrices of course is that the entire
calculation of the last page or so can be condensed into a few lines.
In the matrix notation, the instructions for finding out the general
boost above can be condensed into

\[ x' = R^{-1} L_x R\, x \tag{3.28} \]

in an obvious notation.

⁴ Remember, matrices act to their right and thus in a product AB, it is B that
gets the first shot, not A!

The above description of how to figure out the transformation
equations for a general boost is perfectly correct - but there is,
however, a more direct way of figuring out the result for a general
boost. Just note that in the special Lorentz transformation
equations, all that is special about the x coordinate is that it is the
component of the position vector of an event that happens to be in
the same direction as the relative velocity of Bob and Alice. The fact
that y and z do not change under this transformation can be restated
as - components of the position vector transverse to the relative
velocity do not change. So, it pays off to decompose the position
vector of the event into two pieces, one parallel to $\vec v$ and one
perpendicular to it

\[ \vec r = \vec r_\parallel + \vec r_\perp \tag{3.29} \]

where elementary vector algebra tells you that

\[ \vec r_\parallel = \frac{\vec r \cdot \vec v}{v^2}\,\vec v \tag{3.30} \]
\[ \vec r_\perp = \vec r - \vec r_\parallel \tag{3.31} \]

In terms of this division, the boost becomes

\[
\begin{aligned}
\vec r\,'_\parallel &= \gamma\left(\vec r_\parallel - \vec v\,t\right) = \gamma\left(\frac{\vec v \cdot \vec r}{v^2} - t\right)\vec v \\
\vec r\,'_\perp &= \vec r_\perp = \vec r - \frac{\vec v \cdot \vec r}{v^2}\,\vec v \\
t' &= \gamma\left(t - \frac{1}{c^2}\,\vec v \cdot \vec r_\parallel\right) = \gamma\left(t - \frac{\vec v \cdot \vec r}{c^2}\right)
\end{aligned}
\]

Putting this together, we find the final condensed(?!) form

\[ \vec r\,' = \vec r + \left([\gamma - 1]\,\frac{\vec v \cdot \vec r}{v^2} - \gamma t\right)\vec v \tag{3.32-a} \]
\[ t' = \gamma\left(t - \frac{\vec v \cdot \vec r}{c^2}\right) \tag{3.32-b} \]

Check that in the two special cases that we have worked out so
far ($\vec v$ along the X and the Z axes, respectively) these equations do
reduce to the ones we have found out.
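The check suggested above can also be done numerically. The following is a minimal sketch, assuming units with c = 1 and a nonzero boost velocity; the function names (general_boost, z_boost_by_rotation, matmul, matvec) and the sample event are illustrative assumptions. It implements (3.32-a) and (3.32-b) directly, and compares the result for a boost along Z with the matrix product $R^{-1} L_x R$ of (3.28), using the fact that the inverse of a rotation matrix is its transpose.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1


def general_boost(t, r, v):
    """Equations (3.32-a) and (3.32-b); v is a nonzero 3-velocity, axes stay parallel."""
    v2 = sum(vi * vi for vi in v)
    g = 1.0 / math.sqrt(1.0 - v2 / C**2)
    v_dot_r = sum(vi * ri for vi, ri in zip(v, r))
    factor = (g - 1.0) * v_dot_r / v2 - g * t
    r_new = tuple(ri + factor * vi for ri, vi in zip(r, v))
    t_new = g * (t - v_dot_r / C**2)
    return t_new, r_new


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def matvec(m, x):
    """4x4 matrix acting on a column (x, y, z, t)."""
    return [sum(m[i][k] * x[k] for k in range(4)) for i in range(4)]


def z_boost_by_rotation(t, r, v):
    """Boost of speed v along Z built as R^{-1} L_x R, as in (3.28)."""
    g = 1.0 / math.sqrt(1.0 - v * v / C**2)
    R = [[0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1]]
    R_inv = [[R[j][i] for j in range(4)] for i in range(4)]  # transpose
    L = [[g, 0, 0, -v * g], [0, 1, 0, 0], [0, 0, 1, 0],
         [-v * g / C**2, 0, 0, g]]
    M = matmul(R_inv, matmul(L, R))  # the rightmost matrix acts first
    x, y, z, t_new = matvec(M, [r[0], r[1], r[2], t])
    return t_new, (x, y, z)


event_t, event_r = 2.0, (1.0, -0.5, 0.3)  # hypothetical event
print(general_boost(event_t, event_r, (0.0, 0.0, 0.6)))
print(z_boost_by_rotation(event_t, event_r, 0.6))
```

Both lines of output should show the same transformed event, with x and y untouched and only z and t mixed.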
This should immediately tell you how to figure out the equations
for a general Lorentz transformation. The job is almost done - all
that is left is one final rotation - one that will take the coordinate
system parallel to Alice’s into Bob’s actual coordinate system. If I
write the action of this rotation as $R$, I can write down the equations
for the general Lorentz transformation as

\[ \vec r\,' = R\left(\vec r + \left([\gamma - 1]\,\frac{\vec v \cdot \vec r}{v^2} - \gamma t\right)\vec v\right) \tag{3.33-a} \]
\[ t' = \gamma\left(t - \frac{\vec v \cdot \vec r}{c^2}\right) \tag{3.33-b} \]

- as is obvious, the final rotation only changes the position vector
and does not touch the time.
Is this, then, the most general way in which spacetime coordinates
are related for two inertial observers? The answer, which may
come as a bit of a surprise to you, is - no! The trouble is, the above
equations are homogeneous - $\vec r = 0$, $t = 0$ changes to $\vec r\,' = 0$, $t' = 0$.
Of course, in our derivation we had assumed that at one instant of
time, Alice and Bob were side by side, and they had both set their
respective watches to zero at that instant. In general, however,
two observers may very well have origins of both time and space
translated by constant amounts. Taking this into account gives
us the most general transformations that can relate the spacetime
coordinate measurements of two inertial observers as

\[ \vec r\,' = R\left(\vec r + \left([\gamma - 1]\,\frac{\vec v \cdot \vec r}{v^2} - \gamma t\right)\vec v\right) + \vec r_0 \tag{3.34-a} \]
\[ t' = \gamma\left(t - \frac{\vec v \cdot \vec r}{c^2}\right) + t_0 \tag{3.34-b} \]

where $\vec r_0$ and $t_0$ are, obviously, constants that represent the space
and time translations. These final transformations are called the
Poincaré transformations in honour of the great French polymath
Henri Poincaré. They are also known, for obvious reasons, as
inhomogeneous Lorentz transformations. In summary, then, the
Poincaré transformations, which express the most general connection
between two inertial observers, comprise boosts, rotations
and spacetime translations.
3.5.3 Why use the radar method?


The foregoing may have prompted you to ask the following question
- “If everything can be derived from the Lorentz transformations,
why bother to go through the lengthier route of the radar calculus
at all?”. In my defense I will put forward the following points

• The direct derivation of the Lorentz transformations is not as
simple to understand as the radar method (although as we
will see, it is not very difficult, either!).

• More importantly, learning relativity from the LT equations
only leaves the impression that the whole thing is essentially
a mathematical sleight of hand. The radar method has, in my
honest opinion, a much more physical feel to it.

• Most importantly, the radar method emphasises the central
message of STR - pay careful attention to how we measure
space and time!

3.6 The invariant interval


A side result that I had shown you while deriving the Lorentz
transformation equations is (3.22), which states that the quantity
$t^2 - \frac{x^2}{c^2}$ is the same for both Alice and Bob. Since the measurements
the two make of lengths perpendicular to their relative motion are
bound to be the same, we can generalize (3.22) to 3 space dimensions
directly to get

\[ t^2 - \frac{x^2 + y^2 + z^2}{c^2} = t'^2 - \frac{x'^2 + y'^2 + z'^2}{c^2} \]

a result that is often written as

\[ c^2 t^2 - r^2 = c^2 t'^2 - r'^2 \]

where $r = \sqrt{x^2 + y^2 + z^2}$ is the distance between the point where
the event occurred and the origin. Since the differences of spacetime
coordinates of two events transform in the same way as the
coordinates themselves, it is easy to see that the interval
between two events, defined by⁵

\[ \Delta s^2 = c^2\Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2 \tag{3.35} \]

is the same for Bob as for Alice,

\[ \Delta s^2 = \Delta s'^2 \tag{3.36} \]

You may have a slightly uncomfortable feeling that in our derivation
we have considered a rather special case, where Bob moves
away from Alice along the common X axis, while they both make
use of coordinate axes that are parallel to each other’s. A moment’s
thought will tell you that you can always rotate each friend’s coordinate
system so that their common X axis gets aligned along
the direction of their relative velocity. If we use these rotated coordinate
systems, the proof of (3.36) goes through. During the
rotations, the time interval between two events stays the same, and
while the individual components $\Delta x$, $\Delta y$ and $\Delta z$ change, the sum
of their squares does not! This means that no matter what coordinate
systems the two friends use and which direction their relative
velocity points in, (3.36) remains valid for them!

⁵ Here, I am abusing notation slightly. $\Delta t^2$ stands for $(t_2 - t_1)^2$ and not $t_2^2 - t_1^2$,
and so on! Of course, I could have written the more precise $(\Delta t)^2$ - but it seems
that the usage here is considered to be standard by most authors.
Given the importance that the interval has in relativity, you
would be surprised if there was no other way of deriving this result!
So let me give you a more direct proof of the invariance of
the interval. For this, just consider two special events - a flash of
light being sent out from some point in space, and reaching some
other point a while later. For Alice, the time gap between these two
events must be related to the distance between them by $c\Delta t = \Delta r$ -
which is just the statement that light moves with a speed $c$ as seen
by Alice. So, for Alice we have $\Delta s^2 = 0$ for these two events. Since
Bob will also see the flash travel with the same speed, a very similar
argument will tell you that $\Delta s'^2 = 0$, too. Again, if the interval
between any two events is zero as seen by Alice, a light signal can
start from the earlier of the two events and get to the position of
the later event just when it occurs. Thus the interval must have
the property that whenever $\Delta s$ is zero, $\Delta s'$ must be zero too. Now,
the homogeneity of space tells us that the relation between the
spacetime coordinates as seen by the two friends must be linear
(If this seems too cryptic to you, don’t worry - I will give you a much
more detailed explanation of this when I give you a direct derivation
of the Lorentz transformation equations in subsection 5.1). Also,
since if the two events occur at the same time and the same place
for Alice, Bob must also agree that they are coincident - the relation
between space-time coordinate differences must be homogeneous
as well. This, coupled with the fact that they vanish together, means
that the quadratic expression for $\Delta s^2$ must be related to $\Delta s'^2$ by a
relation given by

\[ \Delta s'^2 = \lambda\,\Delta s^2 \]

where $\lambda$ is a dimensionless quantity that is independent of the
spacetime coordinates. However, at this stage, it is quite possible
that $\lambda$ depends on the dimensionless ratio $\beta = \frac{v}{c}$, i.e. $\lambda = \lambda(\beta)$.
All that remains to be done now is to convince you that $\lambda$ must
be 1 - irrespective of what the relative speed of the two friends is.
For this, just consider two events that Alice sees occurring at the
same time, but at two different points located along a line perpendicular
to the direction in which Bob is moving (In our special
choice of coordinates, this may mean that $c\Delta t = \Delta x = \Delta y = 0$,
while $\Delta z \neq 0$). As I have shown you before, these two events will
be simultaneous to Bob, too. Couple this with the fact that lengths
perpendicular to the direction of motion stay unchanged between
observers, and it follows that for these two events, at least, we must
have

\[ \Delta s^2 = \Delta s'^2 \neq 0 \]

which immediately tells us that the factor $\lambda$ must be one - leading
directly to the invariance of the interval.
You may object to the argument I have given you just now on the
grounds that it uses too many physical results. For the doubters, I
will give another argument, this time lifted straight out of Landau
and Lifshitz’s classic The Classical Theory of Fields⁶. Consider a
third friend Charlie, moving with a velocity $\vec u$ as seen by Bob. Let’s
denote his velocity, as seen by Alice, by $\vec w$. In this case, Charlie’s
value for the interval between the two events will be

\[ \Delta s''^2 = \lambda\left(\frac{u}{c}\right)\Delta s'^2 = \lambda\left(\frac{u}{c}\right)\lambda\left(\frac{v}{c}\right)\Delta s^2 \]

which must match $\lambda\left(\frac{w}{c}\right)\Delta s^2$. This means that the function $\lambda(\beta)$
must have the property that

\[ \lambda\left(\frac{u}{c}\right)\lambda\left(\frac{v}{c}\right) = \lambda\left(\frac{w}{c}\right) \]

Before you start out on a search for such a function, let me point
out that in general the relative speed of Charlie with respect to Alice
must depend on the angle between $\vec u$ and $\vec v$. So the right hand side
of the equation above depends on this angle, while the left hand
side does not depend on it at all! The only way in which this can
be true is if $\lambda$ is a constant - and this means that $\lambda^2 = \lambda$, leading to
$\lambda = 1$.

⁶ In case you are wondering what the invariance of the interval is doing in a
text with a name that contains “classical theory” - here the word classical means
anything that does not involve quantum mechanics - and this includes STR as
well as GTR!
Let me stress once again that this proof is very general - all that
is involved is moving from one inertial observer to another. Perhaps
this is flogging a dead horse, but let me show you one more proof of
the invariance of the interval. In subsection 3.5.2 I showed you that
the most general such transformation is one that involves a boost
in a general direction followed by a rotation. Indeed, you can figure
out the transformations from the special Lorentz transformations,
preceded and followed by rotations. I have already shown above
that the interval stays the same under a special Lorentz transformation.
Rotation does not change either time or length, leaving the
interval invariant. Thus, it is easy to see that the interval does not
change for a general Lorentz transformation.
The invariance of the interval goes through in the case of Poincaré
transformations, too. Note that since the constant spacetime translations
cancel out, both $\Delta\vec r$ and $\Delta t$ for the Poincaré transformations
are the same as those for the Lorentz transformations - and that
is all you need for the interval to be invariant.
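The invariance of the interval under the full set of transformations (boost, rotation and translation) can be illustrated with a few lines of code. This is a minimal sketch, assuming units with c = 1 and, purely for illustration, a rotation about the Z axis; the function names (boost, rotate_z, poincare, interval) and all the numbers are hypothetical choices, not part of the text.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1


def boost(t, r, v):
    """General boost with nonzero velocity v, following (3.32-a) and (3.32-b)."""
    v2 = sum(vi * vi for vi in v)
    g = 1.0 / math.sqrt(1.0 - v2 / C**2)
    v_dot_r = sum(vi * ri for vi, ri in zip(v, r))
    factor = (g - 1.0) * v_dot_r / v2 - g * t
    return g * (t - v_dot_r / C**2), tuple(ri + factor * vi for ri, vi in zip(r, v))


def rotate_z(r, angle):
    """Rotation about the Z axis (an illustrative choice for the rotation R)."""
    x, y, z = r
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle),
            z)


def poincare(t, r, v, angle, r0, t0):
    """Boost, then rotate, then translate in space and time, as in (3.34)."""
    tb, rb = boost(t, r, v)
    rr = rotate_z(rb, angle)
    return tb + t0, tuple(a + b for a, b in zip(rr, r0))


def interval(event1, event2):
    """Interval (3.35) between two events given as (t, (x, y, z))."""
    (t1, r1), (t2, r2) = event1, event2
    dr2 = sum((b - a) ** 2 for a, b in zip(r1, r2))
    return C**2 * (t2 - t1) ** 2 - dr2


e1 = (0.0, (0.0, 0.0, 0.0))
e2 = (2.0, (1.0, -0.5, 0.3))
params = dict(v=(0.3, -0.4, 0.2), angle=0.7, r0=(5.0, -2.0, 1.0), t0=3.0)
e1p = poincare(e1[0], e1[1], **params)
e2p = poincare(e2[0], e2[1], **params)
print(interval(e1, e2), interval(e1p, e2p))
```

The two printed numbers should coincide to rounding error, even though every individual coordinate of both events has changed.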
A word on the notation. I have called $c^2\Delta t^2 - \Delta r^2$ the interval.
From the experience that you have in ordinary geometry, where
the square root of $\Delta x^2 + \Delta y^2 + \Delta z^2$ is the spatial length, you may
think that it is better to call the square root of this quantity the
interval. The trouble is, of course, that the invariant combination
above need not be positive - it is negative for two events that occur
far enough apart in space so that $\Delta r$ exceeds $c\Delta t$. I will play it safe
(most people do) and keep on calling the expression in (3.35) the
interval.
Another point about the notation - instead of calling $c^2\Delta t^2 - \Delta r^2$
the interval, I could just as well have reserved this name for the
quantity $\Delta r^2 - c^2\Delta t^2$. Which one to use is primarily a matter of
preference and both conventions are in use. In these lectures I will
use the former exclusively - but do check the convention that is
being used when you read any other book on STR.

3.6.1 Splitting up spacetime


The fact that the interval between two events is the same for all
inertial observers leads us to the notion that pairs of events can be
classified in a very significant way using the interval. Two events
that have a positive interval between them are said to be separated
by a timelike interval, while if $\Delta s^2$ is negative, the interval is called
spacelike. The nomenclature is obvious from the way the interval
is defined - for timelike intervals, the separation in time is larger
than that in space (if both time and space are measured in the
same units - leading to $c = 1$), while for spacelike intervals the spatial
part dominates. Finally, events for which $\Delta s = 0$ are said to
be separated by a light-like or null interval, for obvious reasons.
Note that the fact that any two inertial observers measure the same
interval leads to everybody agreeing on whether the separation between
two events is timelike, spacelike or null - even though they
disagree about the actual values of the space-time coordinates of
the two events! Very often, I will follow the general practice of saying
“two events are timelike” instead of the more long-winded “two
events are separated by a timelike interval”.
You may be wondering why I am harping so much on the sign of
the interval - and dividing up event pairs into just three categories
depending on whether it is positive, negative or zero - when not
just the sign but the entire value of the interval is the same
for all observers. Why not divide spacetime into an infinite number
of classes - each class consisting of event pairs separated by
the same value of the interval? All inertial observers will certainly
agree on whether a particular pair of events belongs to a particular
class. In the following paragraphs I will give you a few reasons
why the division of spacetime events into light-like, spacelike and
timelike pairs is physically significant. However, the real physical
significance of this division will emerge only when we talk about
causality in section 3.9.
It directly follows from the definition of the interval that two
events that occur at the same place (but at different times) to Alice
have a positive interval between them. Since Bob must measure
the same positive interval between them, even though he will see
them occur at different places, it follows that such events are
timelike for everybody. Again, if Alice sees two events occur simultaneously
but at different places, then the two events are spacelike
for everybody. Turning this on its head, it is easy to argue
that if Alice sees a timelike interval between two events, then Bob
will never see them to be simultaneous - or if Alice sees a spacelike
interval between two events then Bob will never see them as
occurring at the same place.
Again, if Alice sees two events have a timelike interval - it is
possible for Bob to move with such a speed that the two events will
occur at the same place with respect to him. Note that this means
that Alice will see Bob have a speed given by $\frac{\Delta r}{\Delta t}$ - which is less than
the speed of light as $c\Delta t > \Delta r$ in this case. On the other hand,
if Alice sees a spacelike interval between them, it is possible for
Bob to move fast enough so that the two events are simultaneous
with respect to him. As you can easily figure out from (??), Bob’s
time separation $\Delta t' = \gamma\left(\Delta t - \frac{\vec v \cdot \Delta\vec r}{c^2}\right)$ vanishes only if his
velocity obeys

\[ \frac{\vec v \cdot \Delta\vec r}{c^2} = \Delta t \]

which means that the component of Bob’s velocity parallel to $\Delta\vec r$
must be

\[ v_\parallel = c \times \frac{c\Delta t}{\Delta r} \]

Since the ratio $\frac{\Delta r}{c\Delta t}$ is larger than 1 (because the interval is spacelike)
- this component is less than $c$. Thus, it is possible for Bob to move
with such a speed.
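The two constructions just described can be tried out on concrete numbers. The sketch below is a one-dimensional illustration, assuming units with c = 1 and a separation lying along the X axis; the names classify, frame_speed and boost_1d are hypothetical. For a timelike pair it uses the speed $\Delta x/\Delta t$; for a spacelike pair it uses the speed $c^2\Delta t/\Delta x$ that makes $\Delta t' = \gamma(\Delta t - v\Delta x/c^2)$ vanish.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1


def classify(dt, dx):
    """Label the separation of two events by the sign of the interval."""
    ds2 = C**2 * dt**2 - dx**2
    if ds2 > 0:
        return "timelike"
    if ds2 < 0:
        return "spacelike"
    return "null"


def frame_speed(dt, dx):
    """Speed of the observer who sees the events at one place (timelike)
    or at one time (spacelike). No such observer exists for a null pair."""
    kind = classify(dt, dx)
    if kind == "timelike":
        return dx / dt
    if kind == "spacelike":
        return C**2 * dt / dx
    raise ValueError("null separation: no inertial observer does the job")


def boost_1d(dt, dx, v):
    """Coordinate differences seen by an observer moving with speed v along X."""
    g = 1.0 / math.sqrt(1.0 - v * v / C**2)
    return g * (dt - v * dx / C**2), g * (dx - v * dt)


for dt, dx in [(2.0, 1.0), (1.0, 2.0)]:  # hypothetical separations
    v = frame_speed(dt, dx)
    print(classify(dt, dx), v, boost_1d(dt, dx, v))
```

In both cases the required speed comes out smaller than c, and the boosted separation has a vanishing space part (timelike case) or time part (spacelike case).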

3.6.2 Everything from the interval!


As we have seen a lot of times before - there are many apparently
different ways to arrive at the same physical conclusion in relativ-
ity. In the following, I will show you how to exploit the invariance
of the interval to derive things like time dilation etc.

3.6.2.1 Time dilation

Consider two ticks of Bob’s watch. The time interval that he measures
between them is the proper time interval $\Delta\tau$ of these two
events, while the spatial interval is, of course, 0. What does Alice
measure for the time interval $\Delta t$ between the two ticks? The invariance
of the interval tells us immediately that

\[ c^2\Delta\tau^2 = c^2\Delta t^2 - \Delta r^2 \]

where $\Delta r$, the spatial separation between the two ticks as seen by
Alice, is related to $\Delta t$ by $\Delta r = v\Delta t$. This means

\[ \Delta\tau^2 = \Delta t^2\left(1 - \frac{1}{c^2} \times \frac{\Delta r^2}{\Delta t^2}\right) = \Delta t^2\left(1 - \frac{v^2}{c^2}\right) \]

which immediately leads us to the familiar time dilation formula.

What about the other physical results like length contraction,
relativity of simultaneity etc.? Well, it is possible to derive these
results using the invariance of the interval - but the arguments are
more involved than the very simple one that gives us time dilation.
Anyway, as we will see in subsection 5.1, you can actually derive
the Lorentz transformations themselves from the invariance of the
interval - and that, of course, allows you to derive all these results
rather straightforwardly.

3.7 Acceleration in relativity

3.7.1 The transformation law for accelerations


We have already seen how space-time coordinates and velocities
change when we move over from one inertial frame to another. It is
now the turn of accelerations. This was really simple in the good
old days of Galilean transformations - the accelerations measured
by all inertial observers were, simply, the same (as long as we stick
to frames that are not rotated with respect to each other)! Things
are quite a bit more complicated under the Lorentz transformation.
Let us now consider a special Lorentz transformation from the
frame $S$ to the frame $S'$. As we have already seen, the velocity transforms
as

\[ u'_1 = \frac{u_1 - v}{1 - u_1 v/c^2} \]
\[ u'_2 = \frac{u_2\sqrt{1 - v^2/c^2}}{1 - u_1 v/c^2} \]
\[ u'_3 = \frac{u_3\sqrt{1 - v^2/c^2}}{1 - u_1 v/c^2} \]

Now, the acceleration components measured by $S$ and $S'$ are, simply,
$a_1 = du_1/dt$ and $a'_1 = du'_1/dt'$ and so on, respectively. Thus

\[
a'_1 = \frac{du'_1}{dt'} = \frac{du'_1/dt}{dt'/dt}
     = \frac{\dfrac{d}{dt}\left(\dfrac{u_1 - v}{1 - u_1 v/c^2}\right)}{\dfrac{d}{dt}\left(\dfrac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}\right)}
\]

Carrying out the differentiation for this as well as the other two
components yields the acceleration transformations

\[ a'_1 = \frac{(1 - v^2/c^2)^{3/2}}{(1 - u_1 v/c^2)^3}\,a_1 \tag{3.37-a} \]
\[ a'_2 = \frac{(1 - v^2/c^2)}{(1 - u_1 v/c^2)^3}\left[a_2 - \frac{v}{c^2}\,(u_1 a_2 - u_2 a_1)\right] \tag{3.37-b} \]
\[ a'_3 = \frac{(1 - v^2/c^2)}{(1 - u_1 v/c^2)^3}\left[a_3 - \frac{v}{c^2}\,(u_1 a_3 - u_3 a_1)\right] \tag{3.37-c} \]

for the special Lorentz boost.
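If you would rather not trust the differentiation, the formulae (3.37-a) and (3.37-b) can be tested with finite differences. The following is a minimal sketch, assuming units with c = 1 and a made-up motion in the frame S with constant acceleration components; all names (u1, u2, x, t_prime, numerical_acceleration and so on) and numbers are illustrative assumptions. It follows the recipe of the text: differentiate the transformed velocity with respect to t and divide by dt'/dt.

```python
import math

C = 1.0   # assumption: units in which the speed of light is 1
V = 0.5   # hypothetical speed of S' relative to S, along X
GAMMA = 1.0 / math.sqrt(1.0 - V * V / C**2)
A1, A2 = 0.1, -0.05  # hypothetical constant accelerations in S


def u1(t):
    return 0.3 + A1 * t


def u2(t):
    return 0.2 + A2 * t


def x(t):
    """X coordinate of the particle in S (integral of u1, starting at 0)."""
    return 0.3 * t + 0.5 * A1 * t * t


def t_prime(t):
    """Time in S' of the event on the world line labelled by t."""
    return GAMMA * (t - V * x(t) / C**2)


def u1_prime(t):
    return (u1(t) - V) / (1.0 - u1(t) * V / C**2)


def u2_prime(t):
    return u2(t) * math.sqrt(1.0 - V * V / C**2) / (1.0 - u1(t) * V / C**2)


def numerical_acceleration(u_prime, t, h=1e-5):
    """Central-difference estimate of du'/dt' = (du'/dt) / (dt'/dt)."""
    return (u_prime(t + h) - u_prime(t - h)) / (t_prime(t + h) - t_prime(t - h))


def formula_a1(t):
    """Right hand side of (3.37-a)."""
    return (1 - V**2 / C**2) ** 1.5 / (1 - u1(t) * V / C**2) ** 3 * A1


def formula_a2(t):
    """Right hand side of (3.37-b)."""
    pref = (1 - V**2 / C**2) / (1 - u1(t) * V / C**2) ** 3
    return pref * (A2 - V / C**2 * (u1(t) * A2 - u2(t) * A1))


print(numerical_acceleration(u1_prime, 1.0), formula_a1(1.0))
print(numerical_acceleration(u2_prime, 1.0), formula_a2(1.0))
```

Each line should show two numbers that agree to many decimal places. Note that the acceleration in S' differs from that in S even though S' is an inertial frame.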


I admit that these transformations look rather intimidating -
even for the case of rectilinear motion, where only the first equation
is involved. There is one case, however, where the 1-D version

\[ a' = \frac{(1 - v^2/c^2)^{3/2}}{(1 - uv/c^2)^3}\,a \tag{3.38} \]

simplifies a lot. This is the case where the frame $S$ happens to be
the Instantaneously Comoving Frame (ICF) of the particle in question.
This is the frame in which the particle is at rest at a given
instant of time - so that $u = 0$. Remember, an ICF is not a frame
fixed to the accelerating particle - that, of course, would be a non-inertial
frame! Thus, though the velocity components vanish in this
frame, they do so only for a particular instant, and the acceleration
components are, in general, nonzero in the ICF. Of course, if $S$
is an ICF, then the velocity $v$ of $S'$ with respect to $S$ is nothing other
than $-u'$, where $u'$ is the velocity of our particle in the $S'$
frame. Thus, in this special case we have

\[ \left(1 - \frac{u'^2}{c^2}\right)^{-3/2} a' = a_0 \tag{3.39} \]

where $a_0$ is the acceleration of the particle in the ICF. This, of
course, is a frame independent quantity by definition, and is called
the proper acceleration of the particle. Note that we had no need
of such a concept in classical physics - acceleration being the
same in all inertial frames. Since the frame $S'$ is arbitrary we see
that for rectilinear motion, we must have

\[ \gamma^3(u')\,a' = \gamma^3(u'')\,a'' = \ldots \]


3.7.2 Uniformly accelerated motion in STR


One of the first problems that we study in classical mechanics is
that of uniformly accelerated motion. Every high school kid knows
that

\[ a \equiv \frac{d^2x}{dt^2} = \frac{du}{dt} = \alpha, \qquad (\alpha = \text{constant}) \]

has the solution

\[ u = u_0 + \alpha t \]
\[ x = x_0 + u_0 t + \frac{1}{2}\alpha t^2 \]

where $u_0$ and $x_0$ are the values of $u$ and $x$, respectively, at $t = 0$.

Of course, if $a = \alpha = \text{constant}$, we will always get the above
solution - be it classical physics or relativity! However, the above
notion of uniform acceleration does not make much sense in relativity.
After all, even if the acceleration were a constant in one
inertial frame, the transformation law (3.38) tells us that the acceleration in
another frame will be time dependent (note that the acceleration
transformation involves, in addition to the acceleration, also the
velocity of the particle in the frame $S$ - which is where the time
dependence comes from)! Thus, our old notion of uniform acceleration
will be valid, if at all, in one inertial frame only - and as such,
cannot have any deep physical significance.
On the other hand, we have met a notion of the acceleration
which is frame independent in the last section - the proper acceleration!
If the proper acceleration of a body is a constant in one
frame, then it is the same constant in all frames (after all, it is the
acceleration of the body as observed from one particular frame - its
ICF). So, the notion of uniform acceleration that makes sense in
relativity is that of uniform proper acceleration. In what follows we
will investigate this kind of uniformly accelerated motion.
It might appear that the simplest way to solve the equation $a_0 = \alpha$
is to solve it in the ICF - where the solution will be just the same
as that in high school physics. This is wrong, however! The point
is - the ICF of an accelerated body is not a single frame - you have
a new ICF every instant of time! To solve this problem, you have to
transform the uniform proper acceleration to a fixed inertial frame
(which we will call the frame $S$ in this case). According to (3.39)
this gives the equation

\[ \left(1 - \frac{u^2}{c^2}\right)^{-3/2}\frac{du}{dt} = \alpha \tag{3.40} \]

which can be integrated to yield

\[ \frac{u/c}{(1 - u^2/c^2)^{1/2}} = \frac{\alpha t}{c} \]

where we have assumed that the frame $S$ is the frame in which the
particle is at rest at $t = 0$. This can be simply rearranged to give

\[ u = \frac{\alpha t}{\sqrt{1 + (\alpha t/c)^2}} \tag{3.41} \]

Let us compare this solution for the speed of the particle with the
classical result $u = \alpha t$. As you can see, our solution is very close
to $\alpha t$ when this speed is much smaller than that of light. This is,
of course, only to be expected! However, the deviations show up as
the speed gets closer to that of light. What is gratifying is the fact
that though the classical result can increase without bound, the
expression in (3.41) can never exceed the speed of light!
A simple integration gives the displacement $x$ as

\[ x = \frac{c^2}{\alpha}\sqrt{1 + \left(\frac{\alpha t}{c}\right)^2} \tag{3.42} \]

where we have chosen $x = c^2/\alpha$ at $t = 0$ to simplify the expression
slightly. For small times this does give us the “parabolic” behavior,
$x \approx c^2/\alpha + \alpha t^2/2 + \ldots$, as expected. However, it is easy to see that
the world line of the particle, as seen from the $S$ frame, is described
by the hyperbola

\[ x^2 - c^2 t^2 = \frac{c^4}{\alpha^2} \tag{3.43} \]

This is why uniformly accelerated motion in relativity is also often
called hyperbolic motion.
As can be easily seen, for large times $t$, the hyperbola approaches
$x \to \pm ct$ asymptotically. This makes eminent sense, since the speed
approaches $\pm c$ for $t \to \pm\infty$. However, this also throws up a surprise
- a photon starting out from the origin at $t = 0$ (which
is when our uniformly accelerated particle was momentarily at rest
at $x = c^2/\alpha$) will never catch up with the particle! This, in spite
of travelling faster than our particle at all times! In general, light
starting more than a distance $c^2/\alpha$ behind our particle when it
was momentarily at rest will never catch up with it.
To put the situation in perspective, let us try some numerical
estimates. Imagine a rocket on an interstellar journey. To maintain
optimum comfort, the proper acceleration of the rocket (which is
what an astronaut riding in it will feel) is kept at a comfortable
$1\,g \approx 10\ \mathrm{m\,s^{-2}}$. How long does it take for our rocket to reach a speed
of 0.999 c? It is

$$\frac{c}{\alpha} \times \frac{0.999}{\sqrt{1 - 0.999^2}} \approx 6.7 \times 10^{8}\ \mathrm{s} \approx 21\ \text{years!}$$
How long will it take for our astronaut to reach the center of our
galaxy - roughly $2 \times 10^{20}$ m away? Since the time taken here is very
large, we can safely approximate x by ct and get the huge time

$$\frac{2 \times 10^{20}}{3 \times 10^{8}}\ \mathrm{s} \approx 6.7 \times 10^{11}\ \mathrm{s} \approx 21000\ \text{years!}$$

Surely, such an attempt is completely futile for us, mortal men?!


There is hope for our interstellar traveller, though! His journey
does take 21000 years for us stay-at-homes - but remember that
for most of that time he has travelled at a very large speed with
respect to us - very nearly the speed of light! The immense time
dilation caused by this makes him age a lot less!
Let’s work out the proper time elapsed when our clock reads a
time t. Since we have already found out u as a function of time,
it is easy to integrate $d\tau = dt\,\sqrt{1 - u^2/c^2}$ to find out the proper time
elapsed in terms of t:

$$\tau = \int_0^t \frac{dt'}{\sqrt{1 + (\alpha t'/c)^2}}
      = \frac{c}{\alpha}\ln\left[\frac{\alpha t}{c} + \sqrt{1 + \left(\frac{\alpha t}{c}\right)^2}\,\right] \tag{3.44}$$

This tells us that the 21000 years time that we observe the journey
to the center of the galaxy to take corresponds to a proper time
of only about 10 years! So our interstellar traveller can make the
journey and return in about two decades of his time - only to find
all remains of civilization having vanished in the more than 42000
years that have elapsed on earth during his journey!
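
The three estimates above are easy to reproduce. The Python sketch below is only an illustration: the function names are mine, the constants are the same rounded values used in the text, and the rocket is assumed to accelerate at 1 g the whole way without slowing down at the destination. Instead of approximating x by ct, it reads the travel time off the hyperbola (3.43), which makes no visible difference at these distances.

```python
import math

C = 3.0e8       # speed of light in m/s (rounded)
ALPHA = 10.0    # proper acceleration in m/s^2, roughly 1 g
YEAR = 3.156e7  # seconds in a year (approximate)


def time_to_reach(beta, alpha=ALPHA, c=C):
    """Coordinate time needed to reach speed beta * c (inverting 3.41)."""
    return (c / alpha) * beta / math.sqrt(1.0 - beta ** 2)


def time_to_cover(distance, alpha=ALPHA, c=C):
    """Coordinate time to cover a distance, starting from rest at x = c^2/alpha.

    Uses the hyperbola x^2 - c^2 t^2 = c^4 / alpha^2 of equation (3.43).
    """
    x0 = c ** 2 / alpha
    return math.sqrt((x0 + distance) ** 2 - x0 ** 2) / c


def proper_time(t, alpha=ALPHA, c=C):
    """Proper time elapsed on board at coordinate time t, equation (3.44).

    asinh(z) is the same function as ln(z + sqrt(1 + z^2)).
    """
    return (c / alpha) * math.asinh(alpha * t / c)


if __name__ == "__main__":
    t_fast = time_to_reach(0.999)
    t_trip = time_to_cover(2.0e20)
    print(f"time to reach 0.999 c   : {t_fast / YEAR:8.1f} years")
    print(f"time to galactic center : {t_trip / YEAR:8.0f} years (our clocks)")
    print(f"time to galactic center : {proper_time(t_trip) / YEAR:8.1f} years (his clock)")
```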
It may be instructive to write down the expressions for t and x
in terms of the proper time τ . You should be able to show easily
that

$$t = \frac{c}{\alpha}\sinh\left(\frac{\alpha\tau}{c}\right) \tag{3.45-a}$$
$$x = \frac{c^2}{\alpha}\cosh\left(\frac{\alpha\tau}{c}\right) \tag{3.45-b}$$
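
In case you get stuck, here is one possible route, sketched with nothing beyond (3.42), (3.43) and (3.44). The logarithm in (3.44) is just the inverse hyperbolic sine:

```latex
\frac{\alpha\tau}{c}
  = \ln\left[\frac{\alpha t}{c} + \sqrt{1 + \left(\frac{\alpha t}{c}\right)^2}\,\right]
  = \sinh^{-1}\left(\frac{\alpha t}{c}\right)
\quad\Longrightarrow\quad
t = \frac{c}{\alpha}\sinh\left(\frac{\alpha\tau}{c}\right),
\\[1ex]
x = \frac{c^2}{\alpha}\sqrt{1 + \sinh^2\left(\frac{\alpha\tau}{c}\right)}
  = \frac{c^2}{\alpha}\cosh\left(\frac{\alpha\tau}{c}\right),
\qquad
x^2 - c^2t^2 = \frac{c^4}{\alpha^2}\left(\cosh^2 - \sinh^2\right) = \frac{c^4}{\alpha^2}.
```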

3.8 Charting out spacetime


Let’s take a second look at the device that we have been using
all the time - the spacetime diagram. So far, we have plotted the
spacetime diagrams with a bias. They are all diagrams referring
to Alice as the observer. Given the democracy among all inertial
observers, it is natural to enquire about the shape this diagram
takes for Bob.
The first things that Alice would plot in a spacetime diagram
are, obviously, the X and T axes. The T axis corresponds to x = 0
- it is, as I have already stressed earlier, nothing but Alice’s own
worldline. The X axis, on the other hand, cannot be the worldline
of any moving particle. It is, of course, the line t = 0 - the locus of
all events simultaneous with Alice’s wristwatch showing a reading
of 0!
Any kid who has used a sheet of graph paper knows that apart
from the two axes themselves, a great deal of utility lies in the
coordinate grid. In Alice’s spacetime diagram, this would mean
two sets of straight lines. First - a set of equi-spaced straight lines
parallel to the T axis. Physically, these are the worldlines of a set
of particles that are at rest with respect to Alice, spaced out at
a regular interval from her. The second set are the equi-spaced
lines parallel to the X axis. In our physics based language, each
one of these is the locus of a set of simultaneous events - those
simultaneous with each successive tick of Alice’s wristwatch.
As you can see - worldlines and loci of simultaneous events play
a very basic role even in the simplest of spacetime diagrams! Now
that we know that Bob does not agree with Alice about which events
are simultaneous, it stands to reason that for Bob, the X axis and
the coordinate lines parallel to it will be made up of a different set
of events⁷.
To see what Bob’s coordinate grid will look like, then, we must
first of all find out the locus of all events that are simultaneous,
according to him, to a particular tick on his wristwatch. I will now
show you how to do this using school level geometry. For simplicity,
I will specialize to the case of events simultaneous with Bob’s
wristwatch showing 0. Figure 3.7 shows Bob sending out a light signal
from event P⁸, which bounces at event R and returns to him at Q.
If P and Q are equidistant from the origin O, then our fundamental
rule, equation (2.2-a), says that the event R is simultaneous with
O. Let’s join the line OR. Also, let’s extend the line QR to meet
Alice’s axes at S and T, respectively.

[Figure 3.7: Finding Bob’s X axis. Labelled points: O, P, Q, R, S, T.]

Now, since light rays make angles of 45° with the axes, ∠PRQ
is a right angle. A simple theorem that you studied back in your
school days should tell you that this means that the point R lies on
the circumference of a circle that has PQ as a diameter, and hence
O as its center. Thus OQ = OR, so ∆OQR is isosceles - leading to ∠OQR = ∠ORQ.
Since the exterior angle of a triangle is equal to the sum of the two
other angles, it is easy to see that

$$\begin{aligned}
\angle TOR &= \angle ORQ - \angle OTR \\
           &= \angle OQR - \angle OSQ \\
           &= \angle SOQ
\end{aligned}$$

⁷ We already know that Bob’s T axis is different from Alice’s - it is, after all,
his worldline!
⁸ To be more precise, I should say - the event that is represented in the
spacetime diagram by the point P rather than the event P.

[Figure 3.8: Spacetime coordinates according to Alice and Bob. Alice’s
coordinate lines are in magenta, Bob’s in blue.]

Note that this relation must hold for all points R that depict events
that are simultaneous with O as seen by Bob. This means that
Bob’s X axis is the straight line OR that makes the same angle
with Alice’s X axis that his time axis makes with Alice’s! I leave it
as an exercise for you to show that all Bob’s lines of simultaneous
events turn out to be straight lines parallel to this one. Thus, one
half of Bob’s coordinate grid consists of equi-spaced straight lines,
each inclined to Alice’s X axis at the same angle as his worldline is
inclined to her time axis.
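
If you would rather let a computer do the geometry, the following Python sketch repeats the radar construction numerically. It works in units with c = 1, and the function name and the choice of emission and reception at Bob’s proper times −T and +T are my own assumptions for illustration. The outgoing ray keeps t − x fixed and the returning ray keeps t + x fixed, which pins down the bounce event R.

```python
import math


def reflection_event(beta, T):
    """Return (t, x) of the bounce event R in Alice's coordinates (c = 1).

    Bob moves along x = beta * t. He emits a light signal at his proper
    time -T (event P) and receives the echo at proper time +T (event Q).
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    t_p, x_p = -gamma * T, -beta * gamma * T   # event P on Bob's worldline
    t_q, x_q = gamma * T, beta * gamma * T     # event Q on Bob's worldline
    retarded = t_p - x_p    # t - x is constant along the outgoing ray
    advanced = t_q + x_q    # t + x is constant along the returning ray
    t_r = 0.5 * (advanced + retarded)
    x_r = 0.5 * (advanced - retarded)
    return t_r, x_r


if __name__ == "__main__":
    beta = 0.6
    for T in (0.5, 1.0, 2.0):
        t_r, x_r = reflection_event(beta, T)
        print(f"T = {T}: R = (t = {t_r:.4f}, x = {x_r:.4f}), t/x = {t_r / x_r:.4f}")
```

Whatever T you pick, the ratio t/x for R comes out equal to β, so all the events Bob calls simultaneous with O lie on the straight line ct = βx, tilted from Alice’s X axis by the same angle as his worldline is tilted from her T axis.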
As for the other half of Bob’s coordinate grid - its lines can obviously
be thought of as the worldlines of particles at rest with
respect to Bob, placed at equal intervals with respect to him. So,
this grid consists of a set of equi-spaced straight lines each parallel
to Bob’s worldline. Figure 3.8 shows Bob’s space and time axes in
relation with Alice’s coordinate grid on the left. On the right I have
drawn Bob’s coordinate grid - where I have also shown Alice’s axes
for reference. The tilt of Bob’s space axis with respect to Alice’s
shows graphically that whether two events are simultaneous or not
depends on the observer.
Figure 3.8 may appear to reveal an asymmetry between the
two friends. Alice’s space and time axes are nice and perpendicular
to each other - the way that we have always learned the
axes of a graph should be. Bob’s, however, are tilted at a crazy
angle to each other! Why this difference?
The point is - what I’ve drawn for you is really Alice’s spacetime
diagram - I’ve just incorporated Bob’s spacetime axes in it. You
could just as well have drawn Bob’s spacetime diagram directly - with
axes perfectly perpendicular to each other. In that diagram, as can
be guessed, it is Alice’s axes that will turn out to be the tilted ones.

[Figure 3.9: Using a non-perpendicular coordinate grid. Axes: X in meters,
T in light-meters; labelled points: O, P, Q.]

You may find the fact that Bob’s spacetime axes are not at a
right angle slightly discomfiting. How does one use such a set of
axes to find the coordinates of any particular event? The coordinate
grid, of course, makes it easy - all you have to do is to find which
two lines of the coordinate grid intersect at the event - they will
immediately tell you what values of x and t the event occurs at. For
example, the point P in figure 3.9 has x = 2 m and t = 2 l-m⁹. What
will we do for events like Q that are not at the corners of the grid?
In that case, of course, the grid gives us only an approximate way
of locating the event. As you can see, the event Q has x somewhere
between 3 and 4 m, and the time at which it occurs is also in the
range 3-4 l-m. You can certainly go in for a finer grid, which will
give you a better approximation. To get a more precise reading, all
you have to do is simple - draw lines parallel to the x and t axes
from the point Q. Where these cut the t and x axes, respectively,
gives the time and position of the event Q.

⁹ Here l-m stands for a light-meter, the time in which light travels a distance
of 1 m.

[Figure 3.10: Alice and Bob’s spacetime coordinates. Axes: Alice’s x (m) and
t (l-m); Bob’s x (m) and t (l-m).]

Once we have charted out spacetime coordinates for Bob as well
as Alice, we are very close to being able to work out how the spacetime
coordinates of any given event as measured by the two friends
are related to each other. A very simple question needs to be answered
before this can be carried out in practice - just how big must
the spacing between the coordinate lines of Bob be? The simplest
way of seeing the answer is to note that both friends agree about
the value of the invariant quantity $x^2 - c^2t^2$. So, a line of events
for which $x^2 - c^2t^2 = l^2$ will translate into the line $x'^2 - c^2t'^2 = l^2$ for
Bob. This line is, of course, a hyperbola. The hyperbola cuts Alice’s
X axis at $x = l$, and it cuts Bob’s X axis at $x' = l$. A set of
these hyperbolas, for equally spaced values of l, will tell us where to
locate both Alice and Bob’s constant x lines. Similarly, the hyperbola
$x^2 - c^2t^2 = -c^2\tau^2$ cuts the two friends’ T axes at $t = \tau$ and $t' = \tau$,
respectively. Using these hyperbolas (or rather, the points where
they intersect the T axes) we can draw the complete coordinate grid
for both friends together. This is what I have drawn in figure 3.10.
As you can see from the figure, the scales used in the two sets
of axes are quite different. Figuring out the ratio between the two
scales is a rather simple job of elementary coordinate geometry.
Bob’s T axis is, of course, his worldline - which, in terms of Alice’s
coordinates, is the line $x = \beta(ct)$. It is easy to show that his X
axis, which is the reflection of this line in the line $x = ct$, is given by
$x = \frac{1}{\beta}(ct)$. The hyperbola $x^2 - c^2t^2 = l^2$ will cut Bob’s X axis at
the points $x = \pm\frac{l}{\sqrt{1-\beta^2}}$ (you should have been able to deduce this
from what you know about length contraction) and $ct = \pm\frac{\beta l}{\sqrt{1-\beta^2}}$. In
Alice’s space-time diagram, the distance between these points and
the origin is, of course, $\sqrt{\frac{1+\beta^2}{1-\beta^2}}\;l$. However, this is precisely the length
that Bob will call l. This tells us that Bob’s unit of length is longer
than that used by Alice by a factor of $\sqrt{\frac{1+\beta^2}{1-\beta^2}}$. You should be able to
show quite easily that the scale factor is the same between the two
friends’ time axes. Anyway, this should have been obvious from
the fact that the line showing the propagation of light is equally
inclined to Bob’s axes, just as it was to Alice’s axes.

[Figure 3.11: Deriving the Lorentz transformations from the spacetime
diagram. Labelled points: O, P, Q, Q′, R, R′, T, X; marked angles: α,
90° − 2α, 90° + α.]
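
A quick numerical check of this scale factor, as a Python sketch in units where lengths along both of Alice’s axes are measured in meters (that is, plotting ct rather than t). The function names are mine and purely illustrative.

```python
import math


def bob_x_axis_intersection(beta, l=1.0):
    """Point (x, ct) where the hyperbola x^2 - (ct)^2 = l^2 meets Bob's X axis.

    Bob's X axis is the line ct = beta * x in Alice's coordinates.
    """
    x = l / math.sqrt(1.0 - beta ** 2)
    return x, beta * x


def scale_factor(beta):
    """Ratio of Bob's unit length to Alice's, as drawn on Alice's diagram."""
    return math.sqrt((1.0 + beta ** 2) / (1.0 - beta ** 2))


if __name__ == "__main__":
    for beta in (0.1, 0.3, 0.6, 0.9):
        x, ct = bob_x_axis_intersection(beta)
        drawn_length = math.hypot(x, ct)   # ordinary distance on the paper
        print(f"beta = {beta}: drawn length = {drawn_length:.4f}, "
              f"formula = {scale_factor(beta):.4f}")
```

The two columns agree for every β, and the factor grows without limit as β approaches 1, which is why Bob’s grid looks more and more stretched on Alice’s diagram.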
Armed with a complete description of the spacetime diagram, we
are now in a position to derive the Lorentz transformations themselves
from it! The relevant diagram is figure 3.11. To find the
coordinates of the point P, all Alice has to do is drop perpendiculars
PX and PT to her X and T axes, respectively. Of course,
x = OX and ct = OT. As for Bob, he has to complete the parallelogram
OQ′PR′, where OQ′ and OR′ are along his X and T axes.
Then OQ′ and OR′ will represent his measurements of x′ and ct′ -
once account is taken of the difference in scale between the two
friends. I have extended the lines PQ′ and PR′ to cut Alice’s axes
at Q and R, respectively.
Now, from elementary geometry it follows that

$$\angle QPX = \angle TOR' = \angle TPR = \angle Q'OX = \alpha = \tan^{-1}\frac{v}{c} = \tan^{-1}\beta$$
This tells us that $QX = PX\tan\alpha = \beta ct = vt$ and $TR = PT\tan\alpha = \beta x$.
Thus, $OQ = x - vt$ and $OR = ct - \beta x$. What I need to figure out Bob’s
measurements are the sides OQ′ and OR′. It is easy to see that
the three angles of ∆OQQ′ are as shown in the enlarged version
on the right of figure 3.11. Then the sine law from elementary
trigonometry tells us that

$$\frac{OQ'}{\sin(90^\circ + \alpha)} = \frac{OQ}{\sin(90^\circ - 2\alpha)}$$

and hence

$$OQ' = OQ \times \frac{\cos\alpha}{\cos(2\alpha)}$$

I leave it as an exercise for you to show that this can be rewritten
as

$$OQ' = (x - vt)\,\frac{\sqrt{1 + \beta^2}}{1 - \beta^2}$$

A very similar reasoning shows that

$$OR' = (ct - \beta x)\,\frac{\sqrt{1 + \beta^2}}{1 - \beta^2}$$

We are nearly done - all that is left is to note that Bob’s units
of length and time are larger than Alice’s by a factor $\sqrt{\frac{1+\beta^2}{1-\beta^2}}$.
Taking this into account it is easy to see that

$$x' = OQ' \div \sqrt{\frac{1 + \beta^2}{1 - \beta^2}} = \frac{x - vt}{\sqrt{1 - \beta^2}}$$
$$ct' = OR' \div \sqrt{\frac{1 + \beta^2}{1 - \beta^2}} = \frac{ct - \beta x}{\sqrt{1 - \beta^2}}$$

which are, of course, the Lorentz transformations!
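
The whole construction can also be carried out numerically. The Python sketch below is an illustration under my own naming: it splits the point P into components along Bob’s tilted axes (this is the parallelogram OQ′PR′), divides by the scale factor, and compares the outcome with the Lorentz formulas. It takes the event as the pair (x, ct), so both coordinates are lengths.

```python
import math


def bob_coordinates_from_diagram(x, ct, beta):
    """Read off (x', ct') for an event using the oblique-axis construction."""
    n = math.sqrt(1.0 + beta ** 2)
    ex = (1.0 / n, beta / n)   # unit vector along Bob's X axis (ct = beta * x)
    et = (beta / n, 1.0 / n)   # unit vector along Bob's T axis (x = beta * ct)
    # Solve (x, ct) = a * ex + b * et by Cramer's rule.
    det = ex[0] * et[1] - et[0] * ex[1]
    a = (x * et[1] - et[0] * ct) / det    # side OQ' of the parallelogram
    b = (ex[0] * ct - ex[1] * x) / det    # side OR' of the parallelogram
    scale = math.sqrt((1.0 + beta ** 2) / (1.0 - beta ** 2))
    return a / scale, b / scale


def lorentz(x, ct, beta):
    """The Lorentz transformation for a boost along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (x - beta * ct), gamma * (ct - beta * x)


if __name__ == "__main__":
    event = (3.0, 5.0)   # x = 3 m, ct = 5 m
    print("from the diagram:", bob_coordinates_from_diagram(*event, beta=0.6))
    print("Lorentz formulas:", lorentz(*event, beta=0.6))
```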


3.9 Causality and STR


Causality is perhaps the single most important concept of all physics
- yet it is so simple to state that it is difficult to think of it as
something non-trivial! So, just what is causality? It’s the simple
commonsense notion - cause must occur before effect! In a sense,
causality is not a principle of physics at all - it is a part of what you
can call proto-physics - something so fundamental that it takes
precedence over the laws of physics. Any proposed law of physics
that violates causality has to be rejected outright - that’s how pow-
erful this concept is!
In STR the issue of causality becomes especially important. Be-
fore I explain why, let me hasten to assure you that the concept
is equally valid no matter whether you are talking of Newtonian or
post-Newtonian physics. So, just why is it that people start talking
about causality when it comes to STR? The answer lies in that fea-
ture which I have repeatedly called the most distinctive feature of
STR - the transformation of time.
In Newtonian physics - the fact that cause must come before
effect in time is a principle that has to be ultimately checked with
experiments. However, if causality is valid for a particular cause-
effect pair according to one observer, all other observers must agree
that for this pair of events at least, causality works. This is because
Newton’s physics had one invariant absolute time for all - the time
order of any two events (for that matter, even the size of the time
interval), causally related or not, must be the same for all! So once
causality is checked for one inertial observer, there is no further
need to worry.
STR is different - simply because the time interval that Alice and
Bob will measure for the same pair of events will differ - even to the
extent of the sign of the interval. Equation (3.25-d) shows that ∆t′ can
be negative, even though ∆t is positive and vice versa. This means
that while event A might have occurred prior to event B according to
Alice, it is possible for Bob to see A occur after B!
Now if Alice sees A cause B, then causality worked fine for her.
What about Bob? He has no option but to agree that A causes B,
but for him, the order of the two events has been reversed! So, you
see, in STR, the concept of causality is not very straightforward - in
particular, we can see that there is a very real threat to its validity!
This means that we have to take a much more careful look at the
concept of causality when we apply it to STR.
Let’s see how fast Bob must move so that a time interval that is
positive to Alice becomes negative for him. Equation (3.25-d) tells
us that this will happen if

$$\Delta t - \frac{v\,\Delta x}{c^2} < 0$$

which means that

$$v > \frac{c^2\,\Delta t}{\Delta x} \tag{3.46}$$

Now, if the event A has caused event B, then there must have been
some sort of influence that travels from the location of the former to
that of the latter. At the very least, the “news” that A has occurred
has to reach the location of B, on or before it occurs. Thus, from
Alice’s point of view, the news has to cover the distance ∆x in at
most a time ∆t. Thus the speed at which the information travels
must satisfy

$$v_{\text{info}} \geq \frac{\Delta x}{\Delta t}$$

Thus, if A is to cause B and Bob is to see B occur before A, we
must have

$$v_{\text{info}}\, v > c^2$$

which means that at least one of the two speeds $v_{\text{info}}$ and v must
exceed c!
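
A concrete example may help. The Python sketch below uses units with c = 1 and takes two events that Alice finds separated by ∆x = 5 and ∆t = 3. The function names are mine, and I am assuming that equation (3.25-d) has the usual form ∆t′ = γ(∆t − v∆x/c²), which is what the inequality above relies on.

```python
import math


def threshold_speed(dx, dt, c=1.0):
    """Speed Bob must exceed to see the time order reversed, from (3.46)."""
    return c ** 2 * dt / dx


def dt_prime(dx, dt, v, c=1.0):
    """Time interval according to Bob (assumed form of equation 3.25-d)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (dt - v * dx / c ** 2)


if __name__ == "__main__":
    dx, dt = 5.0, 3.0
    print("Bob must move faster than", threshold_speed(dx, dt), "c")
    for v in (0.5, 0.6, 0.8):
        print(f"v = {v} c: dt' = {dt_prime(dx, dt, v):+.4f}")
    # Any signal linking the two events would need at least this speed:
    print("minimum signal speed:", dx / dt, "c")
```

Bob sees the order flip once he goes faster than 0.6 c, which is perfectly allowed. The signal that would have to link the two events, however, needs a speed of at least 5/3 c, and the product of the two threshold speeds is exactly c².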
This, in fact is the strongest reason why nothing can travel
faster than light. If the information that A has occurred could
get to the location of the event B faster than light, then it is possi-
ble that Bob will be in a position to see the effect occur before the
cause, even if his own speed is within the light-speed limit! On the
other hand, if Bob’s less-than-light speed satisfies the inequality
(3.46), then Bob will say that B has preceded A while Alice claims
just the opposite. However, in order that either of the two events
cause the other one, it would be necessary for the information that
the cause has occurred to reach the effect’s site faster than light.
Banning the possibility of faster than light propagation of the sig-
nal saves us from the prospect, then, that one of the two friends
will see the effect occur before the cause!
To round off the discussion, let me stress that although STR
does allow the possibility that the time order between two events
will be different for two different observers, as long as both signals
and observers obey the light speed limit, there is no conflict with
the principle of causality. In the jargon of STR, we say that such
reversal of time order for two observers is possible only for two
events that are not causally connected.
In order to avoid any possible misunderstanding, let me emphasize
that if event B is within the causal future of event A, then it
is possible that A is the cause of (or one of the causes of) event B.
This does not mean that A has to be the cause of B, however! Of
course, A cannot be regarded as the cause of every one of the infinite
number of other events that fall inside its future lightcone. On the other
hand, if B is not within the causal future of A, then it is impossible
for A to cause B. A similar set of statements could be made for the
case where B lies in the past lightcone of A.


The splitting up of spacetime intervals into timelike, spacelike
and lightlike picks up special significance in light of the discussion
above. Rewriting (3.46) slightly shows that two observers will
disagree about the relative time order of two events only if

$$\frac{v}{c}\,\Delta x > c\,\Delta t$$

and since $v/c < 1$, this requires

$$\Delta x > c\,\Delta t$$

So, temporal order can be reversed only if the two events are sepa-
rated by a spacelike interval. On the other hand - this means that
the distance at which the two events occur exceeds c times the time
interval, i.e. the distance is more than that even light can cover in
this time. Thus, two events separated by a spacelike interval can-
not be linked by a cause-effect relation.
On the other hand, if the interval between a pair of events is
timelike or lightlike, then the time interval is sufficiently large for
a slower-than-light (or light) signal to cover the distance between
them. In this case, it is possible that the earlier event was the
cause of the latter one. Note that in this case, all observers (as long
as they stay slower than light) will agree about the temporal order
in which the two events occur - so as long as causality holds for
one observer, it will hold for all of them.
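
The classification itself takes only a few lines of code. This Python sketch uses names of my own choosing and works with a single space dimension, as we have been doing throughout; the exact comparison with zero is fine for the tidy numbers used here but would need a tolerance for measured data.

```python
def classify_interval(dx, dt, c=1.0):
    """Label the interval between two events separated by dx and dt."""
    s2 = (c * dt) ** 2 - dx ** 2
    if s2 > 0:
        return "timelike"
    if s2 < 0:
        return "spacelike"
    return "lightlike"


def order_can_be_reversed(dx, dt, c=1.0):
    """True if some slower-than-light observer sees the opposite time order."""
    return classify_interval(dx, dt, c) == "spacelike"


if __name__ == "__main__":
    for dx, dt in ((5.0, 3.0), (3.0, 5.0), (4.0, 4.0)):
        print(f"dx = {dx}, dt = {dt}: {classify_interval(dx, dt)}, "
              f"order reversible: {order_can_be_reversed(dx, dt)}")
```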
The causal structure of spacetime in relativity can be depicted
very forcefully by means of the lightcone that I showed you earlier.
Figure 3.12 shows the lightcone centered at a particular event
- labelled “Here & Now”. The lightcone comes in two halves - for
obvious reasons the part for positive time is called the future lightcone,
while the other half is labelled the past lightcone. The future
lightcone and its interior contain points which are separated from
the origin by a timelike (or lightlike) interval, and which, moreover,
lie to its future - this is the causal future of the origin. The event
at the origin can be the cause of events in this spacetime region.
Similarly, the past lightcone and its interior constitute the causal
past of the origin. The events here can be the cause of the event
at the origin. The region of spacetime lying outside the lightcone
consists of events that are separated by a spacelike interval from
the event at the origin. The appropriate label that we can assign
such events is elsewhere - no observer will ever see any of them
occur at the same place as the one at the origin.

[Figure 3.12: The causal structure of spacetime. Panel (a) STR: regions
labelled Future, Past and Elsewhere around “Here & Now”. Panel (b)
Newtonian: regions labelled Future and Past, separated by the line “Now”.]
The figure also shows two dashed lines - the X axis and the T
axis. Why have I used dashed lines for these? Well, these lines
are relevant only for a particular observer - as I have shown you
earlier, the corresponding axes will be quite different for any other
observer moving with respect to this one. Note that events that are
in the causal future of the origin occur later than the origin, for
all observers (we are sticking to observers that do not exceed the
speed of light) - so the “future” label assigned to this region is not
specific to a particular observer. The same goes for the causal past.
As we have seen already, events that are separated from the origin
by a spacelike interval have no specific time order - an event in this
region may occur before the origin for one observer, but may very
well occur after it for another observer. This is precisely why we
don’t use the words past or future in the context of these events -
the only observer independent designation that they may be given
is the one that we have given them - elsewhere!
So that you can appreciate how different things were in the
pre-Einstein world, I have also drawn the Newtonian view of the
causal structure of spacetime in figure 3.12. In that diagram you
will notice that I have drawn the X axis with a solid line, unlike the
diagram for STR. This is because this line consists of all events that
are simultaneous with the event at the origin - and this designation
is something that every observer agrees with (this is Newtonian
physics - remember?). This line, labelled “Now”, separates spacetime
into two halves - the past and the future. Note that since
Newtonian physics places no upper limit on the speed of signal
propagation, the event at the origin could have been caused by any
event in its past, while it could be the cause of any event in its
future. I am running the risk of being subjective here - but I do feel
that the causal structure of STR is much more beautiful than the
rather dull setup that Newtonian physics provides!
Chapter 4

Light - the messenger of relativity!

4.1 The Doppler effect, again


Since the Doppler factor plays such an important role in our story,
it is only fitting that we devote a bit more time to the effect from
which it derives its name. As you all know, the Doppler effect is
what makes a train’s whistle sound high-pitched when it is ap-
proaching you and lowers the pitch when it is moving away. All of
us have noticed the sharp drop in the pitch of the train whistle as a
galloping train rushes by us. The corresponding effect in light has
some subtleties due to relativity as we have seen. In this section, I
am going to describe these in some detail.

4.1.1 The longitudinal Doppler effect


We have already seen this effect in action. This, precisely, is what
gives the Doppler factor its name. Bob rushes away from Alice, who
shines a flashlight at him. While Alice sees each successive crest
of the electric field in light at a time interval of $\tau_0$, Bob sees them
at a larger time interval. There are two factors at work here, as I
have already shown you. Firstly, the crests have to catch up with
Bob as he rushes away. This causes the lowering of the frequency
that is expected even in classical physics. If this had been the only
effect, the frequencies as measured by Bob and Alice would have
been related by

$$\nu_{\text{Bob}} = \nu_{\text{Alice}}\,(1 - \beta) \tag{4.1}$$

This is precisely the same as the formula for the Doppler effect
in sound, too - for the observer receding from a source of sound
stationary in air. Of course, for sound, the ratio β will have to
be changed to $v/v_s$, where $v_s$ is the speed of sound. Secondly, Bob’s
clock runs slow, making the time interval between the crests smaller
than that predicted by the first effect. This makes the actual frequency
that Bob sees larger than that predicted classically (but
still smaller than what Alice measures). Taking both factors into
account, the frequency that Bob actually sees is

$$\nu_{\text{Bob}} = \nu_{\text{Alice}}\,\frac{1 - \beta}{\sqrt{1 - \beta^2}} = \nu_{\text{Alice}}\sqrt{\frac{1 - \beta}{1 + \beta}} \tag{4.2}$$

The Doppler effect gives us a simple way of checking the predictions
of STR. All one has to do, it seems, is measure the frequency
of light emitted by a source that is moving with respect to oneself, and
verify whether the shift in frequency obeys the classical law, (4.1),
or the relativistic one, (4.2).
Sources of light that are moving are very common, indeed! Even
the light bulb that is fixed on the wall provides such sources, even if
you are sitting down right in front of it! You may have caught on
to what I am hinting at here from the fact that I used the plural -
each individual atom that radiates the photons is in violent
to and fro motion. Atomic speeds are quite rapid, of the order of
the speed of sound or more. This might seem like a small fraction
of the speed of light - but remember that optical measurements like
that of frequency can be carried out to very high precision.
There is one small hitch, though. In this case, the classical
effect is already quite large - what STR does is that it introduces a
small correction to this result (at least, the correction is small when
the source is not moving very fast). To see this, let me write the
expression for the frequency as seen by Bob as a series expansion
in β, which is almost always rather small, to get

$$\nu_{\text{Bob}} = \nu_{\text{Alice}}\left(1 - \beta + \frac{1}{2}\beta^2 + \ldots\right) \tag{4.3}$$

Unless β is rather large, the classical effect −β swamps out the
STR term of $\frac{1}{2}\beta^2$ - and hence the relativistic effect that time dilation
causes can be rather difficult to detect directly.
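
To see just how small the correction is, here is a Python sketch comparing (4.1), (4.2) and the truncated series (4.3). The function names are mine, and the sample value β = 10⁻⁶ is only meant to represent an atom moving at a few hundred meters per second.

```python
import math


def doppler_classical(beta):
    """Frequency ratio nu_Bob / nu_Alice from equation (4.1)."""
    return 1.0 - beta


def doppler_relativistic(beta):
    """Frequency ratio nu_Bob / nu_Alice from equation (4.2)."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))


def doppler_second_order(beta):
    """Series (4.3) truncated after the beta^2 term."""
    return 1.0 - beta + 0.5 * beta ** 2


if __name__ == "__main__":
    for beta in (1.0e-6, 1.0e-3, 0.1, 0.5):
        classical = doppler_classical(beta)
        exact = doppler_relativistic(beta)
        print(f"beta = {beta:g}: classical shift = {beta:.3e}, "
              f"relativistic correction = {exact - classical:.3e}")
```

At β = 10⁻⁶ the correction is about 5 × 10⁻¹³ of the frequency, a million times smaller than the classical shift.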
In spite of what I have said right now, experimentalists have
been clever enough to actually detect the small relativistic effects.
They were helped by the fact that the classical effect depends lin-
early on β - which means that the frequency shifts by opposite
amounts due to it for atoms moving in opposite directions. The
atoms move randomly due to thermal effects - which means that
the average frequency stays the same due to the classical term.
This term has a big effect, though - it causes the sharp lines
of an atomic spectrum to broaden out - something that spectroscopists
call Doppler broadening. Any shift in the average frequency,
though, must be caused by the relativistic term(s). Despite
the extreme smallness of this shift, experimentalists have managed
to use a wonderful effect of solid state physics called the Mössbauer
effect to detect this - providing a direct verification of relativity.

4.1.2 The transverse Doppler effect


Although we have been sticking to a one dimensional world for
simplicity, this is as good a point as any in which to introduce
the other directions as well. What happens if Alice and Bob were to
both receive a light signal when they were side by side, coming from
a source that is standing still with respect to Alice but displaced
from her in a direction that is perpendicular to Bob’s speed? For
such transverse beams of light, Bob is not really running away from
the source. Since light has no catching up to do, classical physics
says that both Alice and Bob will receive the electric field maxima
at the same time - and hence will see the same frequency!
There is a small catch here. Alice will keep on receiving the suc-
cessive peaks, one by one. Bob, however, will catch only one (the
one that reaches them both when they are side by side) directly.
When the next one reaches Alice, Bob has already moved slightly
(unless he is moving really fast!) to one side. Thus Bob has to
time the arrival of the second pulse at Alice’s location indirectly!
So, while Alice measures the proper time between the pulses, Bob
does not! Since Bob concludes that Alice’s watch is running slow,
he must have measured a bigger time gap and hence a smaller fre-
quency for the light, purely due to the time dilation effect! It is very
easy to figure out the relation between the two frequencies

$$\nu_{\text{Bob}} = \nu_{\text{Alice}}\sqrt{1 - \beta^2}$$

where β = v/c and v is Bob’s speed with respect to Alice (as well as the source
of light). Note that Alice receives the peaks at exactly the same interval
as the source of light itself, so the frequency she measures
is the proper frequency, while Bob measures a smaller one. This
transverse Doppler effect, then, is a purely relativistic effect - occurring
entirely due to time dilation.
You may be tempted to argue that Alice must see Bob’s clock
run slow too, and it is she who should be measuring the lower
frequency! Remember, though, that Alice receives all electric field
maxima at the same place, and hence is the one measuring the
proper time between them. The role of the two friends isn’t really
symmetrical here, it is Alice who is stationary with respect to the
source of light, and Bob who is moving with respect to it!

4.2 More on light

4.2.1 Aberration

4.2.2 Propagation
Chapter 5

Additional topics in kinematics

In this chapter I am going to dwell on several important
physical consequences of relativistic kinematics that we
have derived above. Although they are very important, if
you are in a hurry, you can skip ahead to the next chapter,
where I deal with the laws of mechanics in relativity.
I would of course prefer that you read all that follows (after
all, that is why I am writing all this!). Even if you
decide to skip ahead, do come back later to these topics.

5.1 Deriving the Lorentz transformations, directly

5.2 A new notation


As I will explain in some more detail a bit later, in a sense relativity
places time on an equal footing with space. It is easier to see this
if you measure both space and time in the same units (so that, as
I mentioned earlier, the speed of light comes out to be unity). If
you want to stick to the same units that you have been using all
along, the same effect can be obtained by considering the product
ct instead of time.
I am going to use the abbreviation $x^0$ for ct, while I will use
the obvious nomenclature $x^1 = x$, $x^2 = y$ and $x^3 = z$ and a similar
notation for the spacetime coordinates according to the primed
observer. As you can see, I have chosen to write the indices on
the coordinates as superscripts rather than the usual subscripts.
Why? There is a very good reason - one that will hopefully become
clear in a while from now. For the time being, do pay attention to
the fact that $x^2$ refers to the second component of the quantity x
and not the square of x. If I mean the latter, I will write it explicitly
as $(x)^2$. In terms of these, the Lorentz boost equations in (3.23-a)-(3.23-d)
become

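\[
\begin{array}{llll}
x'^0 = \gamma x^0 - \beta\gamma x^1 & \text{(5.1-a)} \qquad & x^0 = \gamma x'^0 + \beta\gamma x'^1 & \text{(5.2-a)} \\
x'^1 = -\beta\gamma x^0 + \gamma x^1 & \text{(5.1-b)} \qquad & x^1 = \beta\gamma x'^0 + \gamma x'^1 & \text{(5.2-b)} \\
x'^2 = x^2 & \text{(5.1-c)} \qquad & x^2 = x'^2 & \text{(5.2-c)} \\
x'^3 = x^3 & \text{(5.1-d)} \qquad & x^3 = x'^3 & \text{(5.2-d)}
\end{array}
\]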
This can obviously be written as the matrix equation

\[ x' = Lx, \qquad x = L^{-1}x' \]

where $x$ and $x'$ are the 4 × 1 column matrices

\[
x = \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}, \qquad
x' = \begin{pmatrix} x'^0 \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix}
\]

while $L$ is the 4 × 4 matrix

\[
L = \begin{pmatrix}
\gamma & -\beta\gamma & 0 & 0 \\
-\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]

If you are afraid of matrices, you can also write this in terms of
components

\[ x'^\mu = \sum_{\nu=0,1,2,3} L^\mu{}_\nu \, x^\nu \tag{5.3} \]

where obviously $L^\mu{}_\nu$ is the entry at the µ-th row and ν-th column
of the matrix $L$. You are certainly familiar with this entry being
denoted $L_{\mu\nu}$ - so why this perverse notation, in which the row index
is written as a superscript and the column index as a subscript? I
will explain this in due time - for the time being please bear with
this!
I have written down the coefficients explicitly for the special
Lorentz boost. It should be obvious that the compact form of the
transformation equations (5.3) is valid for the general Lorentz
transformation that I discussed in the last subsection. Of course,
the individual coefficients may be much more complicated than in
our special case. A natural question that will arise is, given a set of
four equations that look like (5.3), can we be sure that they stand
for a Lorentz transformation? You will have to wait a while for the
answer - which I will provide in the next section.
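To make the matrix form concrete, here is a small Python sketch (an illustration only, not part of the derivation) that builds the boost matrix $L$ for a given β and applies the component form (5.3) to an event. The helper names `boost_matrix` and `apply` are my own choices, and the coordinates are taken in the order $(x^0, x^1, x^2, x^3)$ with $x^0 = ct$.

```python
import math

def boost_matrix(beta):
    """Special Lorentz boost along X, laid out like the matrix L of section 5.2."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(L, x):
    """Component form (5.3): x'^mu = sum over nu of L^mu_nu * x^nu."""
    return [sum(L[mu][nu] * x[nu] for nu in range(4)) for mu in range(4)]

beta = 0.6                       # chosen so that gamma = 1.25
event = [5.0, 3.0, 1.0, 2.0]     # (ct, x, y, z), all in the same length units
primed = apply(boost_matrix(beta), event)
back = apply(boost_matrix(-beta), primed)   # the inverse boost just flips the sign of beta

print(primed)
print(back)
```

Boosting with −β undoes the boost with β, which is the content of equations (5.2-a - 5.2-d).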

5.3 General Lorentz transformations and the invariance of the interval

I have already shown you that the interval has to stay the same
under a general Lorentz transformation. This has some immediate
consequences for the coefficients of a general Lorentz transforma-
tion themselves.
Remember that we can write a general Lorentz transformation
as

\[ \text{either } x' = Lx, \quad \text{or } x'^\mu = \sum_{\nu=0,1,2,3} L^\mu{}_\nu \, x^\nu \]

where on the left I have written the matrix form and on the right,
the explicit form in terms of the transformation coefficients
(subsection 5.2).

5.3.1 The covariant coordinates


Now, in this notation, it is easy to see that I can write the invariant
interval as

\[ \Delta s^2 = (\Delta x^0)^2 - (\Delta x^1)^2 - (\Delta x^2)^2 - (\Delta x^3)^2 \]

If all the signs above had been +, it would be child’s play to write
the interval in a compact form - $\Delta s^2 = \sum_{\mu=0}^{3} (\Delta x^\mu)^2$. In terms of
the matrix $x$, this would have been even more compact¹, $\Delta s^2 =
(\Delta x)^T (\Delta x)$! Of course, all the signs are not the same - so this
simple form is not the right one. However, it is very easy to see that
if you were to introduce another set of four numbers to coordinatize
spacetime,

\[
\begin{aligned}
x_0 &\equiv x^0 = ct && \text{(5.4-a)} \\
x_1 &\equiv -x^1 = -x && \text{(5.4-b)} \\
x_2 &\equiv -x^2 = -y && \text{(5.4-c)} \\
x_3 &\equiv -x^3 = -z && \text{(5.4-d)}
\end{aligned}
\]

and use both sets of coordinates together, a compact notation can
be achieved.

\[ \Delta s^2 = \sum_{\mu=0}^{3} (\Delta x_\mu)(\Delta x^\mu) \tag{5.5} \]

¹ Here, the superscript $T$ stands for the transpose of a matrix - the matrix that
is obtained from it by interchanging rows and columns.

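If I define a column vector $\tilde{x}$ (the tilde marks the covariant
components) as

\[
\tilde{x} = \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix},
\]

then I can also write a nice and simple matrix form

\[ \Delta s^2 = (\Delta \tilde{x})^T (\Delta x) . \tag{5.6} \]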

A small bit of nomenclature - the set of numbers $(x^0, x^1, x^2, x^3)$
is called the contravariant position vector in spacetime - while
$(x_0, x_1, x_2, x_3)$ is called the covariant position vector. This
terminology will become clearer once we study vectors in spacetime in
section 5.5.
Since we know how $(x^0, x^1, x^2, x^3)$ transform under a Lorentz
transformation, it is easy to figure out the way in which the covariant
coordinates change. In the case of the special Lorentz transformation

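\[
\begin{array}{llll}
x'_0 = \gamma x_0 + \beta\gamma x_1 & \text{(5.7-a)} \qquad & x_0 = \gamma x'_0 - \beta\gamma x'_1 & \text{(5.8-a)} \\
x'_1 = \beta\gamma x_0 + \gamma x_1 & \text{(5.7-b)} \qquad & x_1 = -\beta\gamma x'_0 + \gamma x'_1 & \text{(5.8-b)} \\
x'_2 = x_2 & \text{(5.7-c)} \qquad & x_2 = x'_2 & \text{(5.8-c)} \\
x'_3 = x_3 & \text{(5.7-d)} \qquad & x_3 = x'_3 & \text{(5.8-d)}
\end{array}
\]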
In terms of the components, the transformation of the covariant
coordinates is written

\[ x'_\mu = \sum_{\nu=0}^{3} L_\mu{}^\nu \, x_\nu \]

The notation $L_\mu{}^\nu$ is standard. As you must have guessed, it is the
µ-th row and ν-th column element of a matrix, just as $L^\mu{}_\nu$ was.
What might be a source of confusion is that despite the same
letter $L$ being used, they are not elements of the same matrix -
indeed, it is the position of the indices that tells you which matrix
we are talking about. As I said before, $L^\mu{}_\nu$ is the µ-ν-th
element of the Lorentz transformation matrix $L$. One look at (5.2-a -
5.2-d) and (5.7-a - 5.7-d) might make you think that the other set
of coefficients $L_\mu{}^\nu$ are really elements of the matrix $L^{-1}$. Actually
they are elements of the matrix $(L^T)^{-1}$ - it is just that the matrix
$L$ happens to be symmetric for the special Lorentz transformation!
How can you tell? Well, you will have to wait a little while for the
answer!
I have just now answered a question that must have been both-
ering you since I introduced the new notation in subsection 5.2 -
why the superscripts? Of course, I had to use superscripts there,
because I have another set of four numbers that I also want to de-
note by the letter x. Because I was going to use the subscripts here,
I had to use the superscripts there!
All you do to get the covariant position vector from the
contravariant one is simply change the sign of the spatial vector - so
while the latter is $(ct, \vec{r})$, the former is $(ct, -\vec{r})$. The difference is so
slight that at this stage it is natural to feel slightly cheated - this
looks like a mere sleight of hand, a change of notation simply so
that I can write a formula slightly more simply! At this stage, that
is just what this is! You might even feel that having to deal with
indices above and below is too much of a price to pay for this sim-
plicity of notation. Have faith - this new addition to our notational
stable is going to pay off rich dividends later.
Let’s look back at just what has been going on here. Why did
I have to bring in this new set of coordinates with subscripts on
them? Because the interval, which the Lorentz transformation
leaves invariant, had its signs all mixed up. Put a bit formally,
this means that the Lorentz transformations are not orthogonal -
they do not preserve the sum of squares of coordinates (which is the
square of the length, according to good old Pythagoras). This forces
us to bring in this new kind of entity. Of course, since the Lorentz
transformations are only very slightly non-orthogonal (what they
do preserve is very nearly the sum of squares - except for a few
wayward signs!), the difference between the two versions of coor-
dinates is very slight. If you had to deal with transformations that
are even more badly non-orthogonal (and you would have to, if you
wanted to study Einstein’s General Theory of Relativity!), the dif-
ference between the contravariant and the covariant version of the
coordinates would have been even more marked.
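As a quick numerical illustration of this subsection, the sketch below lowers the index as in (5.4-a - 5.4-d), forms the interval as the sum in (5.5), and checks that a boost along X leaves it unchanged. The function names `lower`, `interval` and `boost_x` are hypothetical helpers of my own; the boost itself is just equations (5.1-a - 5.1-d).

```python
import math

def lower(x_up):
    """Covariant components (5.4): keep the time part, flip the sign of the space part."""
    return [x_up[0], -x_up[1], -x_up[2], -x_up[3]]

def interval(dx_up):
    """Interval (5.5): sum over mu of dx_mu * dx^mu."""
    dx_down = lower(dx_up)
    return sum(d * u for d, u in zip(dx_down, dx_up))

def boost_x(x_up, beta):
    """Contravariant components after the special Lorentz boost (5.1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    x0, x1, x2, x3 = x_up
    return [gamma * x0 - beta * gamma * x1,
            -beta * gamma * x0 + gamma * x1,
            x2,
            x3]

dx = [5.0, 3.0, 1.0, 2.0]            # (c dt, dx, dy, dz)
dx_primed = boost_x(dx, 0.6)

print(interval(dx))                  # 25 - 9 - 1 - 4
print(interval(dx_primed))           # same value, up to rounding
```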

5.3.2 The metric


I have already given you a prescription of how to change a con-
travariant vector into a covariant one - all you have to do is change
the sign of the space coordinates, keeping the time part intact! This

can, of course, be written in the form of a matrix $\eta$ acting on the
column vector $(x^0, x^1, x^2, x^3)^T$, where the matrix $\eta$, called the metric, is
a diagonal matrix diag (1, −1, −1, −1) :

\[ \tilde{x} = \eta x \tag{5.9} \]

or, more explicitly,

\[
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}
\tag{5.10}
\]

or, in terms of components,

\[ x_\mu = \sum_{\nu=0}^{3} \eta_{\mu\nu} \, x^\nu \]

where, of course

\[ \eta_{00} = -\eta_{11} = -\eta_{22} = -\eta_{33} = 1 \]

are the only nonzero elements of the matrix $\eta$.


We have seen the metric in its role as a converter from the
contravariant coordinates to the covariant coordinates. Another way
to look at this will become apparent if you take a look at the way
the invariant interval is defined

\begin{align}
\Delta s^2 &= \sum_{\mu=0}^{3} (\Delta x_\mu)(\Delta x^\mu) \nonumber \\
&= \sum_{\mu=0}^{3} (\Delta x^\mu) \left( \sum_{\nu=0}^{3} \eta_{\mu\nu} \Delta x^\nu \right) \nonumber \\
&= \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} \, \Delta x^\mu \Delta x^\nu \tag{5.11}
\end{align}

which you can interpret as the formula that will give you the “length”
of a spacetime coordinate difference - in the same way as the squared
length of a vector in three dimensions is

\[ \sum_{\mu=1}^{3} \Delta x^\mu \Delta x^\mu \]

This tells us the reason behind the name metric - it is what is


helpful in measuring “lengths” in spacetime. In terms of the matrix
η, you can write the interval as

\[ \Delta s^2 = (\Delta x)^T \eta \, (\Delta x) \tag{5.12} \]

The fact that the interval is invariant under a Lorentz transfor-


mation tells us that the Lorentz transformation matrix L has to
obey a very important restriction. Since $x' = Lx$ obeys

\[ (\Delta x')^T \eta \, (\Delta x') = (\Delta x)^T \eta \, (\Delta x) \]

we have

\[ (\Delta x)^T L^T \eta L \, (\Delta x) = (\Delta x)^T \eta \, (\Delta x) \]

for all values of $(\Delta x)$. This shows² that the matrix $L$ has to obey

\[ L^T \eta L = \eta \tag{5.13} \]

which is the condition that can tell us whether a particular 4 × 4
transformation matrix can qualify as a Lorentz transformation or
not. In terms of the coefficients it is easy to see that this condition
translates to

\[ \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} \, L^\mu{}_\rho \, L^\nu{}_\sigma = \eta_{\rho\sigma} \tag{5.14} \]
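Condition (5.13) is easy to test numerically. The following is a minimal sketch, using plain nested lists rather than any matrix library; `is_lorentz`, `matmul` and `transpose` are illustrative helpers of my own, and the Galilean matrix (t' = t, x' = x − vt written in terms of ct) is included only as an example of a transformation that fails the test.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def is_lorentz(L, tol=1e-12):
    """Check L^T eta L = eta, equation (5.13), entry by entry."""
    lhs = matmul(matmul(transpose(L), ETA), L)
    return all(abs(lhs[i][j] - ETA[i][j]) <= tol
               for i in range(4) for j in range(4))

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

boost = [[gamma, -beta * gamma, 0.0, 0.0],
         [-beta * gamma, gamma, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]

galilean = [[1.0, 0.0, 0.0, 0.0],
            [-beta, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

print(is_lorentz(boost))
print(is_lorentz(galilean))
```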

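I told you a while ago that the covariant version of spacetime
coordinates transforms according to the rule $\tilde{x}' = \tilde{L}\tilde{x}$, where the
transformation matrix $\tilde{L}$ is related to the Lorentz transformation
matrix $L$ by

\[ \tilde{L} = (L^T)^{-1} \]

This is easy to prove now. Note that

\[ \tilde{x}' = \eta x' = \eta L x = \eta L \eta^{-1} \tilde{x} \]

so that the transformation matrix for covariant coordinates is $\tilde{L} =
\eta L \eta^{-1}$. Multiplying equation (5.13) from the left by $(L^T)^{-1}$ gives
$\eta L = (L^T)^{-1} \eta$, and so $\tilde{L} = \eta L \eta^{-1} = (L^T)^{-1}$, as promised!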
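For the special boost this relation can be checked in a few lines. The sketch below forms the matrix ηLη⁻¹ (using the fact that η is its own inverse) and verifies that multiplying it by the transpose of L gives the identity, which is what being the inverse of the transpose means. The variable name `L_cov` for the covariant transformation matrix is my own label.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
L = [[gamma, -beta * gamma, 0.0, 0.0],
     [-beta * gamma, gamma, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

# eta is diagonal with entries +1 and -1, so eta^{-1} = eta
L_cov = matmul(matmul(ETA, L), ETA)

# If L_cov is the inverse of L^T, this product is the identity matrix
product = matmul(transpose(L), L_cov)

for row in L_cov:
    print(row)
for row in product:
    print(row)
```

The printed `L_cov` has +βγ in the off-diagonal slots, matching the coefficients of the covariant boost equations (5.7-a - 5.7-d).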
An important property of any Lorentz transformation matrix can
be derived by taking the determinant of both sides of equation
² Actually, the fact that $x^T A x = x^T B x$ for all values of $x$ does not really force
us to conclude that $A = B$. All this really says is that the matrix $C = A - B$ obeys
$x^T C x = 0$ for every column vector $x$. As you can easily check, this is satisfied
not only by the obvious choice - the null matrix - but also by any (and only by an)
antisymmetric matrix. However, in this case it is easy to see that $L^T \eta L - \eta$ is
a symmetric matrix - and is thus both symmetric and antisymmetric. The only
matrix that is both is the null matrix.

(5.13), which immediately gives

(det L)2 = 1

and thus
det L = ±1 (5.15)

Thus all Lorentz transformations fall into two categories - one with
det L = +1 and the other with det L = −1.
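Spelled out, the determinant step uses only the product rule for determinants and the fact that a matrix and its transpose have the same determinant:

```latex
\[
\det\left(L^T \eta L\right) = \det(L^T)\,\det(\eta)\,\det(L)
                            = (\det L)^2 \det\eta = \det\eta ,
\qquad \det\eta = -1 \neq 0 ,
\]
% dividing both sides by det(eta) leaves
\[
(\det L)^2 = 1 .
\]
```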
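Another property of the Lorentz transformation matrix is easier
to derive from the condition on coefficients, equation (5.14). Writing
the equation for ρ = σ = 0 leads to

\[ \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} \, L^\mu{}_0 \, L^\nu{}_0 = \eta_{00} = 1 \]

which can be rewritten in the form

\[ \left(L^0{}_0\right)^2 - \sum_{i=1}^{3} \left(L^i{}_0\right)^2 = 1 \]

leading to

\[ \left(L^0{}_0\right)^2 = 1 + \sum_{i=1}^{3} \left(L^i{}_0\right)^2 \geq 1 \]

and thus we have

\[ \text{either } L^0{}_0 \geq +1, \text{ or } L^0{}_0 \leq -1 \tag{5.16} \]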

Putting the conditions (5.15) and (5.16) together we see that all
Lorentz transformations can be grouped under four categories

a) $L_+^\uparrow$: Lorentz transformations with $\det L = +1$ and $L^0{}_0 \geq +1$.

b) $L_-^\uparrow$: Lorentz transformations with $\det L = -1$ and $L^0{}_0 \geq +1$.

c) $L_+^\downarrow$: Lorentz transformations with $\det L = +1$ and $L^0{}_0 \leq -1$.

d) $L_-^\downarrow$: Lorentz transformations with $\det L = -1$ and $L^0{}_0 \leq -1$.

The special Lorentz transformation obviously falls in the first cat-


egory - which is called, rather pompously, the set of proper or-
thochronous Lorentz transformations.
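The four-way classification can be sketched in code as follows. This is only an illustration: `classify` assumes that its argument already satisfies condition (5.13), so that by (5.15) and (5.16) the signs of det L and of the top-left entry are enough to decide the category. The sample matrices (space inversion, time reversal and the combination of the two) are standard examples that I have added; they are not discussed in the text above.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (fine for 4 x 4)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def classify(L):
    """Return (sign of det L, 'up' or 'down') for a matrix assumed to obey (5.13)."""
    det_sign = "+" if det(L) > 0 else "-"
    arrow = "up" if L[0][0] > 0 else "down"   # L^0_0 >= +1 or <= -1, never in between
    return det_sign, arrow

def diagonal(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

identity = diagonal([1.0, 1.0, 1.0, 1.0])
space_inversion = diagonal([1.0, -1.0, -1.0, -1.0])
time_reversal = diagonal([-1.0, 1.0, 1.0, 1.0])
full_inversion = diagonal([-1.0, -1.0, -1.0, -1.0])

for name, L in [("identity", identity),
                ("space inversion", space_inversion),
                ("time reversal", time_reversal),
                ("full inversion", full_inversion)]:
    print(name, classify(L))
```

Each of the four sample matrices lands in a different category, with the identity in the proper orthochronous one.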

5.4 The geometry of special relativity


Even if you had not learned any relativity before this, you are sure
to have heard of one of its results - that STR tells us that we live in
a four dimensional world - and that time is the fourth dimension³.
In this section we will take a slightly detailed view of what this
really means.
At the basic mathematical level, there is nothing surprising in
the fourth dimension - all that we are saying is that you need four
numbers to characterise an event - the three space coordinates
that tell you where the event has occurred and one more number
specifying the time when it occurred. This is true, of course, even
in Newtonian physics - so just what’s the big deal in relativity?
The big deal is, as we have seen over and over again already,
there is a big difference in the way in which time behaves in New-
tonian and relativistic physics. In Newtonian physics, time does
enter into the equations for Galilean transformation - but more as
a passive parameter. According to these transformations, all ob-
servers measure the same value for the time - it is absolute! In the
Lorentz transformations, on the other hand, time is no longer
invariant - it mixes with space in much the same way the coordinates
do when you rotate the coordinate axes.
³ If the idea of a fourth dimension above the usual three makes you feel a bit
uncomfortable - brace yourself. Superstring theory will have us believe that we
live in a ten dimensional spacetime!
Before I explain what is meant precisely by the statement that
spacetime is four dimensional, let me take you back into the depths
of a statement that you must be much more comfortable with -
the statement that the space that we see around us is three di-
mensional. Of course, every schoolboy knows that you need three
numbers to locate a point in space. However, we are also used,
especially in dynamics problems, to treating the various space co-
ordinates independently - you write down separate equations for
the X component, the Y component and so on. Why is it, then,
that we do not regard the three components as three distinct quan-
tities, but rather look upon them as manifestations of one basic
object, three faces of the same coin - so to speak?
The answer is obvious - which component is X, which one is Y
... are, after all, artificial choices. I could just as easily have chosen
to orient my coordinate axes in a different way - this would mix all
the previous components together to give the new component. It is
this mixing up that shows that we have to regard the three spatial
coordinates as a single entity, unless we force ourselves to keep on
using one particular coordinate system. The latter would have been
a reasonable option, though, if one coordinate system could have
been somehow different from all others - which is not the case!
In much the same way, I could have kept on regarding time as
completely separate from space if I had one inertial observer whom
I could regard as special. However, the whole point of relativity is
that this can not be done - all inertial observers are created equal!
Add to this the fact that changing over from one inertial observer
to another mixes time up with the space coordinates as well as
vice versa and you have to regard the time along with the space

coordinates as four pieces of the same thing that we call, for want
of a better name, spacetime.

5.4.1 Coordinate changes in rotations


I have used a comparison between rotations and Lorentz
transformations to explain why spacetime should be regarded as a four
dimensional entity with x, y, z and t as its components. The
connection between rotations and Lorentz transformations runs quite
a lot deeper than you may think. In this subsection I will take a
deeper look at the mathematics of rotations to lay the groundwork
for its comparison with the Lorentz transformations in the next
subsection. At the very outset, we must take a look at the way
in which the x, y, z coordinates transform when the coordinate sys-
tem is rotated around. This mathematics will also come in handy
when we discuss vectors and scalars in three and four dimensions
in section 5.5.
To keep things simple I will only consider a rotation through
an angle ϑ about a single axis - let’s say the Z axis, as shown
in figure 5.1. Of course, the z coordinate stays intact under
such a rotation. What about the other two coordinates? It is
clear from the figure that

\[
\begin{aligned}
x &= r\cos\phi, &\qquad x' &= r\cos(\phi - \vartheta) \\
y &= r\sin\phi, &\qquad y' &= r\sin(\phi - \vartheta)
\end{aligned}
\]

[Figure 5.1: Rotating the coordinate system]

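from which a bit of the trigonometry that you learnt in high school
($\cos(\phi - \vartheta) = \cos\phi\cos\vartheta + \sin\phi\sin\vartheta$, and similarly for the sine)
will take you to

\begin{align}
x' &= x\cos\vartheta + y\sin\vartheta \tag{5.17-a} \\
y' &= -x\sin\vartheta + y\cos\vartheta \tag{5.17-b} \\
z' &= z \tag{5.17-c}
\end{align}

If you are familiar with matrices, you will of course realize that
this can be written in the compact matrix form

\[
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix}
\cos\vartheta & \sin\vartheta & 0 \\
-\sin\vartheta & \cos\vartheta & 0 \\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]

where the matrix

\[
R_z(\vartheta) =
\begin{pmatrix}
\cos\vartheta & \sin\vartheta & 0 \\
-\sin\vartheta & \cos\vartheta & 0 \\
0 & 0 & 1
\end{pmatrix}
\]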

is called the rotation matrix for a rotation through an angle ϑ about


the Z axis. You should realize that the rather simple form that this
matrix has is due to the fact that I have considered a rather special
rotation - the matrix for a rotation about an arbitrary axis looks
much more complicated (no columns or rows made out of 0’s and
1’s!) - but is actually just as simple in concept. If you haven’t
learned about matrices yet, don’t worry - we will not use them any
further here.

5.4.2 Lorentz transformations and rotations


As in the case of rotations, we first start with a look at a particular
Lorentz transformation - the boost along the X axis.
One thing should be clear from (5.17-a-5.17-c) and (5.1-a-5.1-d)
- which is the rather deep similarity between rotations and the
Lorentz boost. Of course, there are some differences, too. The most
obvious of them is, of course, that in the Lorentz transformations t
plays a very active role - while it is conspicuous by its absence in
rotations. A direct result of this is that while rotations involve only
three variables - the Lorentz transformations involve four. From a
purely mathematical point of view, this is not a big deal, though!
What makes the real difference between the rotations and Lorentz
transformations is what quantities they keep invariant. While for
rotations the invariant quantity is the length

\[ l^2 = x^2 + y^2 + z^2 \]

for Lorentz transformations the invariant quantity is the interval

\[ c^2 t^2 - x^2 - y^2 - z^2 . \]

This difference is compactly expressed by what is called the signa-


ture - while this is (+, +, +) for rotations, it is (+, −, −, −) for the
Lorentz transformations. This makes the Lorentz transformations
non-orthogonal (orthogonal transformations, remember, preserve
the sum of squares of the coordinates), but not by very much!
This non-orthogonality of the Lorentz transformations reveals
itself in the coefficients of our special Lorentz transformations, too.
Note that the two coefficients in the equation for $x'^0$ are γ and −βγ,
whereas the corresponding quantities for the rotation are cos ϑ and
sin ϑ. As you have known for ages now, the latter coefficients obey
$(\cos\vartheta)^2 + (\sin\vartheta)^2 = 1$. What about the Lorentz transformation
coefficients? The sum of their squares is not something simple, but
the difference of their squares is: $(\gamma)^2 - (-\beta\gamma)^2 = (1 - \beta^2)\gamma^2 = 1$!
I leave it to you to figure out just what the connection is between
the orthogonality issue and this.
Let me point out, though, that we do know of simple functions
that can take the place of the cos and sin in the Lorentz
transformations. The fact that the difference of squares of the coefficients
is unity immediately suggests using the hyperbolic functions -
$\cosh\phi = \gamma$ and $\sinh\phi = \beta\gamma$. The parameter φ is easily seen to be

\[ \phi = \tanh^{-1}\beta \]

The parameter φ is rather similar to our Doppler factor K in the
sense that it too provides an alternative to the speed as a means of
parameterising how fast a body is moving. I personally prefer the
Doppler factor, especially since it is easily measurable experimentally.
However, the use of φ rather than the speed does simplify
some calculations - so it is important to remember this alternative.
This parameter is given a special name - it is called the rapidity.
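A short numerical check of the hyperbolic parameterisation may help. The sketch below computes the rapidity for β = 0.6 and compares cosh φ and sinh φ with γ and βγ. The last lines assume that the Doppler factor has the standard form K = √((1+β)/(1−β)), which is not restated in this section; with that assumption K is simply the exponential of the rapidity.

```python
import math

def rapidity(beta):
    """phi = arctanh(beta), the rapidity of a body moving at speed beta * c."""
    return math.atanh(beta)

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
phi = rapidity(beta)

print(math.cosh(phi), gamma)             # cosh(phi) reproduces gamma
print(math.sinh(phi), beta * gamma)      # sinh(phi) reproduces beta * gamma
print(math.cosh(phi) ** 2 - math.sinh(phi) ** 2)   # difference of squares

# Assumed standard form of the Doppler factor (not taken from this section)
K = math.sqrt((1.0 + beta) / (1.0 - beta))
print(math.exp(phi), K)
```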

5.4.3 A bit of history - Minkowski coordinates


You should be able to see a very simple trick which will make the
Lorentz transformations orthogonal. All you have to do is to ensure
that they preserve the sum of squares of the coordinates by taking
as the fourth coordinate not $ct$, but $ict$, where $i$ is the familiar
square root of minus 1! In order to distinguish this from our earlier
convention, I will call this fourth coordinate $x_4$ instead of the $x^0$ that
we were using earlier. In addition, I will use subscripts instead of
superscripts on the three space coordinates as well - $x_1 = x$, $x_2 = y$
and $x_3 = z$⁴! The person who introduced these coordinates into
STR is Einstein’s friend, the mathematician Hermann Minkowski.
In his honour, this choice of coordinates is called the Minkowski
coordinates.
In terms of these coordinates the invariant quantity is, of course

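\[ x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv x^2 + y^2 + z^2 - c^2 t^2 \tag{5.18} \]

and the special Lorentz transformations take the form

\[
\begin{array}{llll}
x'_1 = \gamma x_1 + i\beta\gamma x_4 & \text{(5.19-a)} \qquad & x_1 = \gamma x'_1 - i\beta\gamma x'_4 & \text{(5.20-a)} \\
x'_2 = x_2 & \text{(5.19-b)} \qquad & x_2 = x'_2 & \text{(5.20-b)} \\
x'_3 = x_3 & \text{(5.19-c)} \qquad & x_3 = x'_3 & \text{(5.20-c)} \\
x'_4 = -i\beta\gamma x_1 + \gamma x_4 & \text{(5.19-d)} \qquad & x_4 = i\beta\gamma x'_1 + \gamma x'_4 & \text{(5.20-d)}
\end{array}
\]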

The similarity between these equations and those for rotation


should be obvious. Indeed, even the sum of squares of the two
coefficients γ and (−iβγ) is 1, so that you can even identify them
with cos ϑ and sin ϑ, respectively! Of course, in this case we will get
tan ϑ = −iβ so that the angle involved is complex (That’s obvious,
since cos ϑ = γ ≥ 1!). Thus, in the Minkowski coordinates Lorentz
transformations are seen to be rotations, but in the x1 − x4 plane,
in the same way as the rotation about the Z axis was one in the
x1 − x2 plane.
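Python's built-in complex numbers make it easy to try this out. The sketch below, with helper names of my own choosing, stores an event in Minkowski coordinates (x, y, z, ict), applies the boost in the form (5.19-a - 5.19-d), and checks that the plain sum of squares (not the sum of squared moduli) is unchanged.

```python
import math

def minkowski_coords(ct, x, y, z):
    """Minkowski coordinates (x_1, x_2, x_3, x_4) with x_4 = i * ct."""
    return [x, y, z, 1j * ct]

def boost(xm, beta):
    """Special Lorentz boost written in Minkowski coordinates, equations (5.19)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    x1, x2, x3, x4 = xm
    return [gamma * x1 + 1j * beta * gamma * x4,
            x2,
            x3,
            -1j * beta * gamma * x1 + gamma * x4]

def sum_of_squares(xm):
    """x_1^2 + x_2^2 + x_3^2 + x_4^2, the invariant quantity (5.18)."""
    return sum(c * c for c in xm)

event = minkowski_coords(ct=5.0, x=3.0, y=1.0, z=2.0)
boosted = boost(event, 0.6)

print(sum_of_squares(event))     # 9 + 1 + 4 - 25
print(sum_of_squares(boosted))   # the same, up to rounding
```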
This identification of the Lorentz transformations with rotations
was one reason which made Minkowski coordinates very popu-
lar in their day. However, the fact that the rotation angle had to
be complex actually makes this identification rather tricky! The
⁴ “There he goes again!” - I can hear you groaning! Why do I revert to
subscripts here? Why was I using superscripts in the first place? All that I can tell
you now is, as before, be patient! All will be revealed in a short while!

use of imaginary time made time look very much the same as the
space coordinates - but this was an illusory similarity. In fact, they
were instrumental in giving people the notion that relativity places
“space and time on the same footing”.
It is true that relativity does bring time a lot closer to space than
it was thought earlier. The very fact that time ceases to be abso-
lute, but rather observer dependent like space is a huge step in
this direction. In spite of this there is a very big difference between
space and time that persists in relativity - you can go in any
direction in space - but in only one direction in time. The fact that
causality tells us that cause always precedes effect in time shows that
this distinction is about as basic as can be! Indeed, hiding the different
sign of $t^2$ in the invariant interval by using an imaginary time
coordinate conceals this very important difference - which turns out
to be more of a burden than an advantage in the long run. Thus,
despite having some advantages, Minkowski coordinates are used
only very rarely today!

5.4.4 Minkowski spacetime


Minkowski coordinates may have fallen out of favour today, but
Minkowski’s biggest contribution to STR has endured. This is the
idea that one has to consider spacetime as a single four dimen-
sional entity and what one observer calls space and time coordi-
nates of an event are simply components of this entity in axes that
are relevant for him or her. This is the Minkowski spacetime, in
which points have the coordinates $x^0, x^1, x^2, x^3$ or, in more familiar
terms $(ct, \vec{r})$. Let me stress once again that this clubbing together
of space and time coordinates is not merely a matter of putting two
different things in the same bracket - the Lorentz transformation
mixes them up - so that the distinction between space and time is

rather blurred in STR.


The geometry of Minkowski spacetime is in some ways very
similar to that of our ordinary three dimensional space - but in
some other ways it is very different. In ordinary space we have the
Pythagoras theorem that gives the distance between two points as

\[ d^2 = \Delta x^2 + \Delta y^2 + \Delta z^2 \]

- which stays invariant when you rotate your coordinate system.


In Minkowski spacetime we have a new version of the Pythagoras
theorem - the “distance” between two events in spacetime is given
by the interval

\[ \Delta s^2 = c^2 \Delta t^2 - \left( \Delta x^2 + \Delta y^2 + \Delta z^2 \right) \]




which, of course stays unchanged when you move from one inertial
observer to another.
As I have been pointing out over and over again, the similarity
between the interval and the ordinary distance is what helps us to
motivate the generalization of ordinary space to spacetime. At the
same time, the difference between the interval with its (+, −, −, −)
signature and distance with its (+, +, +) signature is substantial -
and it is this difference that makes the geometry of spacetime very
different from that of space. In particular, let me stress once again
that despite its merger with space, time remains a very different
entity from space even in STR. So, unlike space, where all three
dimensions are equivalent, in spacetime we have a certain lack of
democracy!
One very important distinction between space and spacetime is
that the distance in four dimensions is not necessarily positive⁵!
The interval can be positive, zero or even negative! Of course, this
is what gave us the distinction between spacelike, null and
timelike intervals. Mathematicians call such a “distance”-like function
a pseudo-metric; in their language Minkowski space is a
pseudo-metric space.

⁵ Indeed, if I try to push the analogy between distance and the interval, then I
should talk of $\Delta s$ rather than $\Delta s^2$. In that case, the four dimensional “distance”
is not even always real!

5.5 Four-vectors and four-scalars

When we were discussing the addition of velocities in STR in
section 3.4, you must have been struck by the fact that the rather
unusual quantities that I denoted by $U_x$, $U_y$ and $U_z$ transform much
more simply from observer to observer than the familiar velocity
components themselves. In this section I will show you why! But
first, I need to take you through a short detour - to ensure that you
understand what vectors and scalars really are.

5.5.1 Vectors and scalars in 3 dimensions

You may be understandably indignant at the last sentence - after
all, every high school kid has learnt about vectors and scalars!
However, to understand what four-vectors and four-scalars are,
you will have to go a bit deeper into the meaning of ordinary vectors
and scalars than what you have learnt in high school.
To keep the discussion simple, I will take off from what you
have always thought of vectors as being - arrows going from one
point to another. Imagine a vector $\vec{A}$ stretching from the point $P_1$ to
the point $P_2$. If the coordinates of the two points are $(x_1, y_1, z_1)$ and
$(x_2, y_2, z_2)$, respectively, then, as you all know, our vector will have
the components

\[ A_x = x_2 - x_1 , \qquad A_y = y_2 - y_1 , \qquad A_z = z_2 - z_1 \]

Now, the coordinates of the points $P_1$ and $P_2$ certainly depend on
the coordinate system that we are using - they will change when
we rotate the coordinate system. It stands to reason, then, that the
components of my vector $\vec{A}$ will change, too! Let’s investigate what
this change is like.
We have seen how the coordinates of a point change under a
rotation of the axes in equations (5.17-a - 5.17-c). From this it is
child’s play to figure out how the components of the vector $\vec{A}$ will
change. Let me show you how to handle the X component

\begin{align*}
A'_x &= x'_2 - x'_1 \\
&= (x_2\cos\vartheta + y_2\sin\vartheta) - (x_1\cos\vartheta + y_1\sin\vartheta) \\
&= (x_2 - x_1)\cos\vartheta + (y_2 - y_1)\sin\vartheta \\
&= A_x\cos\vartheta + A_y\sin\vartheta
\end{align*}

You can find out $A'_y$ and $A'_z$ in a similar fashion. The result is a set
of transformation equations

\begin{align}
A'_x &= A_x\cos\vartheta + A_y\sin\vartheta \tag{5.21-a} \\
A'_y &= -A_x\sin\vartheta + A_y\cos\vartheta \tag{5.21-b} \\
A'_z &= A_z \tag{5.21-c}
\end{align}

If you take a look at the set of transformation equations (5.17-a -
5.17-c) and (5.21-a - 5.21-c) side by side you will immediately notice
the striking fact that the set $(A_x, A_y, A_z)$ of the components of a
vector transforms in exactly the same fashion as the coordinates
$(x, y, z)$. A moment’s thought will show you that this will remain
true for more complex rotations, too.


There you have it - the definition of a vector that we will need
for relativity. An ordinary vector (formally known as a three vector)
is a set of 3 numbers (called components) that transform when you
rotate the coordinate system in exactly the same way in which the
coordinates themselves do! Another way to put this is - all vectors
are 3 component objects that transform in the same way under
coordinate rotations. What that transformation rule is is specified
by the fact that the position vector is a prototype vector.
What is a scalar? Well, firstly it is something that is described
by a single number. However, anything that is described by a single
number does not qualify - it has to transform in a very special way!
Indeed, the transformation law for a scalar quantity is so simple
that it hardly seems to be a transformation law at all! A scalar
simply does not change under a rotation of coordinates.
Let me now point out that one quantity that you had always
thought of as a scalar - the length of a vector - is really a scalar⁶!
To show this, let me start with the length of the prototype vector
- the position vector, $l = \sqrt{x^2 + y^2 + z^2}$. That this is unchanged by
rotations is no surprise - the fact that distances do not change is
one of the basic defining features of rotations!
There is no real need for a proof that under a rotation the length
of a vector remains the same, $l = l'$, given that it follows from the
definition of a rotation. I will verify that this works for the rotation
for which I have written down the coordinate transformation in
equations (5.17-a - 5.17-c) nevertheless! Here goes

\begin{align*}
l'^2 &= x'^2 + y'^2 + z'^2 \\
&= (x\cos\vartheta + y\sin\vartheta)^2 + (-x\sin\vartheta + y\cos\vartheta)^2 + z^2 \\
&= x^2\cos^2\vartheta + 2xy\cos\vartheta\sin\vartheta + y^2\sin^2\vartheta + x^2\sin^2\vartheta \\
&\quad - 2xy\cos\vartheta\sin\vartheta + y^2\cos^2\vartheta + z^2 \\
&= x^2\left(\cos^2\vartheta + \sin^2\vartheta\right) + y^2\left(\cos^2\vartheta + \sin^2\vartheta\right) + z^2 \\
&= x^2 + y^2 + z^2 = l^2
\end{align*}

- as promised.

⁶ If you are wondering what’s the big deal with that - let me point out that
some other quantities, like the X component of a vector, that you would be likely
to think of as scalars are not really scalars according to this criterion!
Of course, what I have done here can be hardly considered a
proof that rotations keep lengths unchanged. After all, all that has
been done is a verification that this happens for a particular kind
of rotation. However, since the choice of the Z axis is completely
artificial - it can be argued that if lengths do not change for rota-
tions about the Z axis, then they won’t change for rotations about
any axis.
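The same verification can be repeated numerically for a few angles. In this sketch `rotate_axes_z` is a hypothetical helper that returns the components of a vector in axes turned through an angle theta about Z, using the same expressions that appear in the verification above.

```python
import math

def rotate_axes_z(v, theta):
    """Components of v in axes rotated through theta about the Z axis."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c, z)

def length(v):
    return math.sqrt(sum(c * c for c in v))

v = (3.0, 4.0, 12.0)     # a vector of length 13
for theta in (0.3, 1.0, 2.5):
    rotated = rotate_axes_z(v, theta)
    print(theta, rotated, length(rotated))
```

The components change with every angle, while the last number printed on each line stays at 13 up to rounding.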
Now that we have shown that the length of a position vector
stays unchanged under rotations, what about the length of any
other vector? Since all vectors transform in exactly the same way
as the position vector, it is easy to see that this invariance of the
length will work for all of them. If you are still sceptical, here is the
detailed verification

\begin{align*}
A'^2 = \vec{A}' \cdot \vec{A}' &= A'^2_x + A'^2_y + A'^2_z \\
&= (A_x\cos\vartheta + A_y\sin\vartheta)^2 + (-A_x\sin\vartheta + A_y\cos\vartheta)^2 + A_z^2 \\
&= A_x^2\cos^2\vartheta + 2A_xA_y\cos\vartheta\sin\vartheta + A_y^2\sin^2\vartheta + A_x^2\sin^2\vartheta \\
&\quad - 2A_xA_y\cos\vartheta\sin\vartheta + A_y^2\cos^2\vartheta + A_z^2 \\
&= A_x^2\left(\cos^2\vartheta + \sin^2\vartheta\right) + A_y^2\left(\cos^2\vartheta + \sin^2\vartheta\right) + A_z^2 \\
&= A_x^2 + A_y^2 + A_z^2 = \vec{A} \cdot \vec{A} = A^2
\end{align*}

Note that above I have written $\vec{A}'$ to denote the vector in the rotated
coordinate system. Strictly speaking, this is an abuse of notation -
since I am actually referring to the same vector, only in a different
coordinate system⁷. I am sure that no harm will be done by this
little mis-notation, though.
Continuing with this notation, let us now ask whether something
that we have always called a scalar is really a scalar. I am
referring to the scalar product of two vectors, $\vec{A} \cdot \vec{B}$.
The question is, is $\vec{A}' \cdot \vec{B}' = \vec{A} \cdot \vec{B}$?
To see that it is, we can play a little trick. Since we have already
shown that lengths of all vectors do not change under a rotation, we
have $(\vec{A}' + \vec{B}')^2 = (\vec{A} + \vec{B})^2$, which means that

$$
\vec{A}'^2 + 2\,\vec{A}' \cdot \vec{B}' + \vec{B}'^2 = \vec{A}^2 + 2\,\vec{A} \cdot \vec{B} + \vec{B}^2
$$

- using $\vec{A}'^2 = \vec{A}^2$ and $\vec{B}'^2 = \vec{B}^2$ immediately gives

$$
\vec{A}' \cdot \vec{B}' = \vec{A} \cdot \vec{B}
$$

So, the scalar product is really a scalar! Of course, you could have
checked this directly from the transformation law itself.
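If you would like to see this invariance in numbers rather than
symbols, here is a minimal Python sketch. It applies the rotation
about the Z axis to two sample vectors and compares lengths and dot
products before and after. The function names (`rotate_z`, `dot`),
the angle and the sample vectors are my own illustrative choices, not
something taken from the text above.

```python
import math

def rotate_z(v, theta):
    """Components of v in axes rotated by theta about Z (passive rotation)."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c, z)

def dot(a, b):
    """Ordinary three dimensional dot product."""
    return sum(ai * bi for ai, bi in zip(a, b))

theta = 0.7              # arbitrary angle in radians (assumption)
A = (1.0, -2.0, 0.5)     # arbitrary sample vectors (assumption)
B = (3.0, 0.25, -1.5)

A_rot = rotate_z(A, theta)
B_rot = rotate_z(B, theta)

# The individual components change, but the squared length and the
# dot product are the same in both coordinate systems.
length_preserved = math.isclose(dot(A_rot, A_rot), dot(A, A))
dot_preserved = math.isclose(dot(A_rot, B_rot), dot(A, B))
print(length_preserved, dot_preserved)
```

The X component of `A` does change under the rotation, which is
exactly why a single component does not qualify as a scalar.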
Let me point out that the fact that the dot product of two vectors
$\vec{A} \cdot \vec{B} = A_xB_x + A_yB_y + A_zB_z$ is a scalar depends
on the fact that the transformation that I am considering preserves
vector lengths. Such transformations are called orthogonal
transformations. Rotations, of course, are orthogonal transformations
in three dimensions - other well known transformations in this
category are reflections. Indeed, it can be shown that in three
dimensions, all orthogonal transformations are either rotations,
reflections or combinations of the two.

[7] There is a different way to think of a rotation. Instead of the
vector staying the same and the coordinate system rotating, we can
also keep the coordinate axes the same and turn the vector around the
other way. In this case, the components change in the same fashion as
above, and the notation $\vec{A}'$ is justified (we are really talking
of a different vector here). Technically, the kind of transformations
that we have been talking about are called passive transformations,
while this latter kind are called active transformations.
What if the basic transformation had been something else - one
that does not preserve lengths? You may think it unnecessary to
waste time on such grotesque possibilities - but the fact of the mat-
ter is that such transformations are not very uncommon. Indeed,
the Lorentz transformations, as we will see very soon - fall in this
bracket! In this case, we will have to modify the way in which we
define the scalar product - so that it stays a scalar!
Now that we have a precise definition of vectors and scalars,
you may realise that many quantities that we had assumed to be
vectors may or may not be - the bottom line is that you have to check
whether the transformation law is obeyed. For example, it is easy
to check that the sum of two vectors is really a vector, i.e. it, too,
transforms like the position vector (indeed, I should have checked
whether this is correct before using this result in my proof of the
invariance of the dot product above) and the same goes for the
product of a vector with a scalar. As for the cross product of two
vectors, I leave it to you to verify that this does transform like a
vector, hence justifying its name.

5.5.2 Lorentz transformations vs. rotations


Just what is so special about rotations? Not much - if you are
looking at things from a mathematician’s point of view! A
mathematician would be happy to think about all sorts of other
linear transformations one can carry out on coordinates[8]. To him,
a vector is a set of three components that change under any linear
transformation of coordinates[9] in the same way as the coordinates
themselves. There is a special reason why rotations are special to
a physicist, though! The laws of physics are the same no matter
whether you write them with respect to one coordinate system or
another, rotated system. In other words, rotations are symmetries
of physical laws. This is precisely why we are so interested in what
happens to objects under rotations.

[8] A linear transformation is one in which $x'$ is a linear
combination of $x, y, z$ and so on. Thus, $x' = 3.6x - 1.7y + 0.5z$
is a linear transformation - while $x' = 2x^2$ is not.
Rotations are symmetries of all physical laws - but that can
be traced back to the fact that they are special cases of a much
more general symmetry - the Lorentz transformations. After all,
the basic tenet of relativity is that all physical laws are the same
for all observers - which means that the Lorentz transformations
are symmetries of all physical laws. You will agree that we must
look for quantities that transform in a decent way under Lorentz
transformation.

5.5.3 Vectors and scalars in spacetime


As I have stressed over and over again, the Lorentz transformations
play the same role in spacetime as the rotations in space.
This similarity helps us extend the concept of vectors in space to
vectors in spacetime. In perfect analogy with the ordinary vectors,
we can define vectors in four dimensional spacetime as a set of four
numbers $(A^0, A^1, A^2, A^3)$ that transform under a Lorentz
transformation in the same manner as the spacetime coordinates
$(x^0, x^1, x^2, x^3)$. In other words all vectors in spacetime
transform in the same manner and the spacetime coordinates give their
prototype. To avoid confusion with our ordinary run of the mill
vectors we give these entities a special name - four-vectors. If I
talk of these vectors and our familiar old vectors together, as I
will do often, I will refer to the latter as three-vectors. Remember,
though, mathematically there is not much of a difference between
four-vectors and three-vectors, they are just vectors under Lorentz
transformations and rotations, respectively.

[9] Indeed, the mathematician (and sometimes, the physicist as well!)
would be equally at home talking about general (nonlinear)
transformations of coordinates. In this case, the new coordinate
differentials $dx', dy', dz'$ are linear combinations of $dx$, $dy$
and $dz$. This allows vectors to be defined as a set of 3 components
that transform in the same way as the coordinate differentials do!
As for scalars, their generalization to spacetime is even more
straightforward. The ordinary scalars that we dealt with in three
dimensions are single numbers that do not change under rotations.
Four-scalars are single numbers that do not change under Lorentz
transformations. Note that because rotations also form special cases
of Lorentz transformations, all four-scalars are three-scalars, too.
The converse is not true, though - time is a three-scalar but does
not stay invariant under a general Lorentz transformation. The
same goes for the length of the position vector.
The spacetime four vector $(ct, x, y, z)$ naturally splits up into a
time part - the zero-th component, and the three other components
that form the spatial part. In the same way, every four-vector
$(A^0, A^1, A^2, A^3)$ can be split up into a timelike part - the
$A^0$, and a spacelike part, $\vec{A} = (A^1, A^2, A^3)$. The real
significance of this split up can be seen when you realize that a
rotation in three dimensional space is a special case of Lorentz
transformations in spacetime - one in which the $(x, y, z)$ get mixed
up together, but $t$ stays invariant. Under a rotation, too, a
four-vector transforms in the same way as $(ct, \vec{r})$ - so that
$A^0$ does not change and $\vec{A}$ turns around! Thus, the time part
$A^0$ of a four-vector is a scalar under rotations, while the space
part $\vec{A}$ is a three-vector under rotations!
Just as in the case of spacetime coordinates, the four-vector
that I have been describing here is written with the index as a
superscript. In the same vein, we call such vectors contravariant
four-vectors.
In case the foregoing seems too abstract, let’s take a look at the
way four-scalars and four-vectors behave in two very simple cases
- the special Lorentz boost, and a rotation about the Z axis. To
refresh your memory, let me repeat the transformation equations
for the special Lorentz transformation in terms of spacetime
coordinates $(ct, \vec{r})$, equations (5.1-a)-(5.1-d)

$$
\begin{aligned}
x'^0 &= \gamma x^0 - \beta\gamma x^1 \\
x'^1 &= -\beta\gamma x^0 + \gamma x^1 \\
x'^2 &= x^2 \\
x'^3 &= x^3
\end{aligned}
$$

Under this transformation, a four scalar changes to

$$
\phi \to \phi' = \phi
$$

- i.e. it does not change at all. What about a four-vector with
components $(A^0, \vec{A})$? It transforms into

\begin{align}
A'^0 &= \gamma A^0 - \beta\gamma A^1 \tag{5.22-a} \\
A'^1 &= -\beta\gamma A^0 + \gamma A^1 \tag{5.22-b} \\
A'^2 &= A^2 \tag{5.22-c} \\
A'^3 &= A^3 \tag{5.22-d}
\end{align}

What about the rotation? The spacetime transformation in this
case is

$$
\begin{aligned}
x'^0 &= x^0 \\
x'^1 &= x^1\cos\vartheta + x^2\sin\vartheta \\
x'^2 &= -x^1\sin\vartheta + x^2\cos\vartheta \\
x'^3 &= x^3
\end{aligned}
$$

Under this rotation, a four-scalar will obviously stay the same.
What about the four-vector?

$$
\begin{aligned}
A'^0 &= A^0 \\
A'^1 &= A^1\cos\vartheta + A^2\sin\vartheta \\
A'^2 &= -A^1\sin\vartheta + A^2\cos\vartheta \\
A'^3 &= A^3
\end{aligned}
$$

As promised, the component $A^0$ stays invariant under the rotation
- whereas the other three components change in just the same
way as the components of a three-vector do. Of course, for this
particular rotation, the component $A^3$ also stays unchanged, but
that is because the rotation I have considered here is one about
the third (Z) axis.
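The two transformation laws above are easy to play with on a
computer. The following Python sketch applies the special Lorentz
boost (5.22) and the rotation about the Z axis to a sample four-vector
and checks which pieces change. The helper names (`boost_x`,
`rotate_z4`, `interval`), the sample components and the value of
$\beta$ are assumptions made purely for illustration.

```python
import math

def boost_x(A, beta):
    """Special Lorentz boost along X, as in (5.22-a)-(5.22-d)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    A0, A1, A2, A3 = A
    return (gamma * A0 - beta * gamma * A1,
            -beta * gamma * A0 + gamma * A1,
            A2,
            A3)

def rotate_z4(A, theta):
    """Rotation about Z acting on a four-vector: A^0 is left alone."""
    A0, A1, A2, A3 = A
    c, s = math.cos(theta), math.sin(theta)
    return (A0, A1 * c + A2 * s, -A1 * s + A2 * c, A3)

def interval(A):
    """The combination (A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2."""
    A0, A1, A2, A3 = A
    return A0 * A0 - A1 * A1 - A2 * A2 - A3 * A3

A = (5.0, 1.0, 2.0, 3.0)      # arbitrary sample four-vector (assumption)
A_boost = boost_x(A, 0.6)     # beta = 0.6, so gamma = 1.25
A_rot = rotate_z4(A, 0.4)

print(A_boost)                          # A^0 and A^1 mix, A^2 and A^3 do not
print(A_rot)                            # A^0 and A^3 are untouched
print(interval(A), interval(A_boost), interval(A_rot))
```

Both transformations leave the combination computed by `interval`
unchanged, while only the rotation leaves $A^0$ alone.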
It is now time to go back to the general case once again. I will
adopt a convention that is used almost universally - which is using
Greek letters $\mu$, $\nu$ etc., which take the values 0, 1, 2, 3, to
stand for the components of four-vectors. Very often, I will write
$A^\mu$ to indicate not the $\mu$-th component of the four-vector $A$
but the four-vector itself - it should be clear from the context
which one is being meant at a given point. Another convention that is
used almost universally is to use Latin indices $i$, $j$, $k$ etc.,
which run over the values 1, 2 and 3, to stand for the spatial
components of a four vector.

5.5.4 Examples of 4-vectors and 4-scalars


Now that we know how to generalize the concept of vectors and
scalars to spacetime, the natural question to ask is - are these
just mathematical entities or do we have real life examples of these
objects? Of course we do! That is the whole point behind studying
them in the first place.
We have already met the simplest non-trivial scalar quantity -
the interval! The by now familiar quantity

$$
(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 \equiv c^2t^2 - \vec{r}^2
$$

is invariant under Lorentz transformations. This is exactly what
we mean by a four-scalar! The proper time interval between two
timelike related events, defined by

$$
\Delta\tau = \sqrt{\Delta t^2 - \frac{1}{c^2}(\Delta\vec{r})^2}
           = \Delta t\sqrt{1 - \frac{1}{c^2}\left(\frac{\Delta\vec{r}}{\Delta t}\right)^2}
$$

is also obviously a four scalar.

The most obvious four-vector is none other than the coordinate
four vector $x^\mu = (ct, \vec{r})$. We have already seen that
differences in spacetime coordinates change in the same way when you
jump from one inertial frame to another as the coordinates themselves
- so another four-vector is $\Delta x^\mu = (c\Delta t, \Delta\vec{r})$.
What about other four-vectors? Since the three-vector part of a
four vector has to be a vector under rotations, it might seem that
all you have to do to build more four-vectors of your own is to
take an ordinary vector and add to it an appropriate three-scalar
to serve as the zero-th (or temporal) component. That this does
not work can be seen immediately. Take the simplest three-vector
beyond the position vector that you can think of in mechanics - the
velocity vector $\vec{u}$. Try as you might, you can’t find a three
scalar $\zeta$ that you can club with this one to get a four-vector.
The easiest way to see this is to consider the special Lorentz boost.
If $(\zeta, \vec{u})$ is a four vector, then under this boost the
velocity components in the Y and Z directions must stay unchanged -
which is not what our velocity transformation equations (3.17-b) and
(3.17-c) say!
To understand how to proceed further, then, we have to take a
deeper look at what we had always thought obvious - just why is
velocity a three-vector? Well, $\vec{r}$ is a three-vector and hence
so is $\Delta\vec{r}$. The time interval $\Delta t$ is a three scalar,
so that the ratio $\Delta\vec{r}/\Delta t$ is really a product of a
scalar $1/\Delta t$ and a vector $\Delta\vec{r}$ - and such a product
does transform like the coordinates under a rotation. Velocity is
nothing but the limit of the ratio $\Delta\vec{r}/\Delta t$ as
$\Delta t \to 0$, so it is a limit of a sequence of vectors. This is
why velocity is a three-vector. As you can see, the crucial point
above is that $\Delta t$ is a scalar under rotations.
You can try the same trick with the position four vector $x^\mu$ and
land up with $\Delta x^\mu/\Delta t = (c, \vec{u})$. Why is this not a
four-vector? The answer is simple - while $\Delta t$ was a
three-scalar under rotations, it is not a scalar under the Lorentz
transformations. So, if you want to get the four dimensional version
of the velocity three-vector, you have to look for something other
than $\Delta t$ - something that is similar to it (otherwise we will
land up with an entirely different object - not a generalization of
the velocity!) and is invariant under a Lorentz transformation.
Fortunately, such a thing is not very difficult to find - the proper
time $\Delta\tau$ is the perfect candidate! Not only is it an
invariant, it has the same dimensions as time and reduces
(approximately) to it when the speed $|\Delta\vec{r}/\Delta t|$ is
small. This helps us to define the velocity four-vector by

$$
U^\mu \equiv \frac{dx^\mu}{d\tau} = \frac{dx^\mu}{dt} \times \frac{dt}{d\tau}
      = \left(\frac{c}{\sqrt{1 - \frac{u^2}{c^2}}},\ \frac{\vec{u}}{\sqrt{1 - \frac{u^2}{c^2}}}\right)
      = \gamma\,(c, \vec{u}) \tag{5.23}
$$

where I have used

$$
\frac{d\tau}{dt} = \lim_{\Delta t \to 0}\frac{\Delta\tau}{\Delta t}
                 = \lim_{\Delta t \to 0}\sqrt{1 - \frac{1}{c^2}\left(\frac{\Delta\vec{r}}{\Delta t}\right)^2}
                 = \sqrt{1 - \frac{u^2}{c^2}}.
$$

Note that the three-vector part $\vec{U} = \gamma\vec{u}$ of our
velocity four-vector is precisely the vector that I had introduced
back in subsection 3.5.1.4. It should now be obvious why the
components $U_y$ and $U_z$ were transforming that simply under the
special Lorentz transformation! Indeed, the identity (3.27) that
requires a somewhat messy bit of algebra to prove directly, becomes
nearly trivial once you realise that the quantity
$c\gamma = c/\sqrt{1 - u^2/c^2}$ forms the zero-th component of
a four vector. So, it would transform in the same way as $ct$ does
under the special Lorentz transformation, leading to

$$
\frac{c}{\sqrt{1 - \frac{u'^2}{c^2}}}
  = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}
    \left[\frac{c}{\sqrt{1 - \frac{u^2}{c^2}}} - \frac{v}{c}\,\frac{u_x}{\sqrt{1 - \frac{u^2}{c^2}}}\right]
  = \frac{c\left(1 - \frac{u_x v}{c^2}\right)}{\sqrt{1 - \frac{v^2}{c^2}}\,\sqrt{1 - \frac{u^2}{c^2}}}
$$

from which (3.27) follows immediately! By the way, this is a special
version of the identity - valid when the relative velocity of the two
frames is along the X axis. Generalising to an arbitrary direction of
the relative velocity becomes obvious when you realise that most of
the terms that are there in the identity are manifestly three-scalars
(remember - this means that they do not change under a rotation
of the axes), like $v^2$, $u^2$ etc. The only one that is not
manifestly a three-scalar is the term $u_x v$. However, once you
realise that here the vector $\vec{v}$ is $(v, 0, 0)$, it is easy to
see that $u_x v = \vec{u} \cdot \vec{v}$, and hence this identity
becomes, in general

$$
\frac{\gamma(u')}{\gamma(v)\,\gamma(u)}
  = \frac{\sqrt{1 - \frac{v^2}{c^2}}\,\sqrt{1 - \frac{u^2}{c^2}}}{\sqrt{1 - \frac{u'^2}{c^2}}}
  = 1 - \frac{\vec{u} \cdot \vec{v}}{c^2}. \tag{5.24}
$$

To round off our small discussion on the velocity four-vector, let
me do what I have done a lot of times before - show you yet another
way of deriving the same old results. This time, I am going to use
the fact that $U^\mu$ is a four vector to derive the transformation
laws for the ordinary velocity. It is easy to see that the
transformation laws for the spatial part of the four-vector $U^\mu$
are

$$
\begin{aligned}
U'^1 &\equiv \frac{u'_x}{\sqrt{1 - \frac{u'^2}{c^2}}}
      = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}\left(U^1 - \frac{v}{c}\,U^0\right)
      = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}
        \left[\frac{u_x}{\sqrt{1 - \frac{u^2}{c^2}}} - \frac{v}{c}\,\frac{c}{\sqrt{1 - \frac{u^2}{c^2}}}\right] \\
U'^2 &\equiv \frac{u'_y}{\sqrt{1 - \frac{u'^2}{c^2}}} = U^2 = \frac{u_y}{\sqrt{1 - \frac{u^2}{c^2}}} \\
U'^3 &\equiv \frac{u'_z}{\sqrt{1 - \frac{u'^2}{c^2}}} = U^3 = \frac{u_z}{\sqrt{1 - \frac{u^2}{c^2}}}
\end{aligned}
$$

Dividing these equations by the transformation equation for the
zero-th component above immediately leads to the familiar velocity
transformation equations.
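Here is a small Python sketch of this route to the velocity
transformation, in case you want to see the division carried out with
actual numbers. It builds $U^\mu$ from (5.23), boosts it with (5.22)
and divides the spatial components by the zero-th one. The comparison
function uses the standard textbook form of the velocity
transformation for a boost along X, which I am assuming agrees with
the equations (3.17) referred to earlier. All function names, the
sample velocities and the choice of units with $c = 1$ are
illustrative assumptions.

```python
import math

C = 1.0  # assumption: units in which c = 1

def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

def four_velocity(u):
    """U^mu = gamma(u) * (c, u), as in (5.23)."""
    g = gamma(math.sqrt(sum(ui * ui for ui in u)))
    return (g * C, g * u[0], g * u[1], g * u[2])

def boost_x(A, v):
    """Special Lorentz boost with speed v along X, as in (5.22)."""
    beta, g = v / C, gamma(v)
    return (g * A[0] - beta * g * A[1],
            -beta * g * A[0] + g * A[1],
            A[2],
            A[3])

def velocity_from_four_velocity(U):
    """Divide the spatial components by the zero-th: u^i = c U^i / U^0."""
    return tuple(C * Ui / U[0] for Ui in U[1:])

def textbook_velocity_transformation(u, v):
    """Standard velocity transformation for a boost along X (assumed form)."""
    ux, uy, uz = u
    denom = 1.0 - ux * v / C ** 2
    g = gamma(v)
    return ((ux - v) / denom, uy / (g * denom), uz / (g * denom))

u = (0.5, 0.3, -0.2)   # sample particle velocity (assumption)
v = 0.6                # sample frame speed along X (assumption)

u_via_four_vector = velocity_from_four_velocity(boost_x(four_velocity(u), v))
u_via_formula = textbook_velocity_transformation(u, v)
print(u_via_four_vector)
print(u_via_formula)
```

The two printed triples agree to rounding error, and the Y and Z
components of the three-velocity are seen to change even though
$U^2$ and $U^3$ do not.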
As you must have guessed, there is no reason to stop at the
velocity four-vector. You can continue in the same vein and define
the acceleration four-vector $A^\mu \equiv dU^\mu/d\tau$. This time
you have to be a bit careful, though - the factor $\gamma$ that
relates the velocity four-vector to the ordinary velocity is a
variable, too, making the connection between the acceleration
four-vector and its three-vector counterpart a bit more complicated

$$
\begin{aligned}
A^\mu &= \frac{dU^\mu}{d\tau} = \gamma\,\frac{dU^\mu}{dt} \\
      &= \gamma\,\frac{d}{dt}\left[\gamma\,(c, \vec{u})\right] \\
      &= \gamma\left[\frac{d\gamma}{dt}\,(c, \vec{u}) + \gamma\,(0, \vec{a})\right]
\end{aligned}
$$

Now $\frac{1}{\gamma}\frac{d\gamma}{dt} = \frac{d}{dt}(\ln\gamma)
= \frac{u}{c^2}\,\gamma^2\,\frac{du}{dt} = \frac{\gamma^2}{c^2}\,\vec{u} \cdot \vec{a}$,
so that

$$
A^\mu = \gamma^2\left[\frac{\gamma^2}{c^2}\,\vec{u} \cdot \vec{a}\,(c, \vec{u}) + (0, \vec{a})\right]
      = \gamma^2\left(\frac{\gamma^2}{c}\,\vec{u} \cdot \vec{a},\ \vec{a} + \frac{\gamma^2}{c^2}\,(\vec{u} \cdot \vec{a})\,\vec{u}\right) \tag{5.25}
$$

which shows that the space part of the acceleration four-vector
is not necessarily even in the same direction as the acceleration
$\vec{a}$! This result is going to have quite an important bearing
on relativistic mechanics - the subject of the next chapter.

5.5.5 The dot product of four vectors


A very standard way of getting a scalar from two vectors in three
dimensions is taking their dot product. We have already verified
above that this really is a scalar! Can we do such a thing in four
dimensions? Yes, we can - but there is a slight twist. If I just take
the formula for the three dimensional dot product

$$
\vec{A} \cdot \vec{B} = A_1B_1 + A_2B_2 + A_3B_3
$$

and try writing down its four dimensional version what I will land
up with is not a scalar. This is easy to see - just try it out on
our prototype four vector $(ct, \vec{r})$ and you will get
$c^2t^2 + \vec{r}^2$, which is not invariant under a Lorentz
transformation! Fixing this up is not a very big problem, though -
one look at the invariant interval suggests that instead of the sum
we should try

$$
A \cdot B = A^0B^0 - A^1B^1 - A^2B^2 - A^3B^3
$$

In terms of the metric tensor $\eta$ that I had introduced you to
earlier, this can be easily written in the form

$$
A \cdot B = \sum_{\mu,\nu=0}^{3} \eta_{\mu\nu} A^\mu B^\nu \tag{5.26}
$$

or, in terms of the column vectors $A = (A^0, A^1, A^2, A^3)^T$ and
$B = (B^0, B^1, B^2, B^3)^T$ and the matrix $\eta$,

$$
A \cdot B = A^T \eta B \tag{5.27}
$$

It is rather simple to check that this product is actually an
invariant. In the matrix representation, since $A \to A' = LA$ and
$B \to B' = LB$, we get

$$
\begin{aligned}
A \cdot B \to A' \cdot B' &= A'^T \eta B' \\
                          &= (LA)^T \eta LB \\
                          &= A^T L^T \eta L B \\
                          &= A^T \eta B = A \cdot B
\end{aligned}
$$

where I have used equation (5.13). Another important property of
the scalar product of four-vectors, that it shares with the
three-vector dot product, is that it is symmetric

$$
A \cdot B = B \cdot A \tag{5.28}
$$
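The matrix version of this argument can be checked in a few lines of
Python. The sketch below builds the matrix $L$ of the special Lorentz
boost, verifies the condition $L^T \eta L = \eta$ (which I am assuming
is what equation (5.13) states), and compares the four-vector dot
product before and after the boost. The helper names, the value of
$\beta$ and the sample four-vectors are hypothetical choices made for
illustration only.

```python
import math

ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def boost_matrix(beta):
    """Matrix L of the special Lorentz boost along X."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -beta * g, 0, 0],
            [-beta * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def apply(M, A):
    """Matrix times column vector."""
    return [sum(m * a for m, a in zip(row, A)) for row in M]

def minkowski_dot(A, B):
    """A . B = A^T eta B, as in (5.27)."""
    return sum(ETA[m][n] * A[m] * B[n] for m in range(4) for n in range(4))

def euclidean_dot(A, B):
    """The naive four dimensional sum, which is NOT invariant."""
    return sum(a * b for a, b in zip(A, B))

def matrices_close(X, Y, tol=1e-12):
    return all(math.isclose(x, y, abs_tol=tol)
               for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

L = boost_matrix(0.8)
A = [2.0, 1.0, 0.0, -1.0]   # arbitrary sample four-vectors (assumption)
B = [1.0, 3.0, 2.0, 0.5]

metric_preserved = matrices_close(matmul(matmul(transpose(L), ETA), L), ETA)
A_new, B_new = apply(L, A), apply(L, B)
print(metric_preserved)
print(minkowski_dot(A, B), minkowski_dot(A_new, B_new))
print(euclidean_dot(A, B), euclidean_dot(A_new, B_new))
```

The product with the minus signs survives the boost, while the plain
sum of products does not.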

In direct analogy with the covariant spacetime components $x_\mu$
that I had introduced earlier, we can define a covariant four-vector
$\underline{A}$ corresponding to a contravariant four vector $A$ by

$$
\underline{A} = \eta A \tag{5.29}
$$

or, in terms of components

$$
A_\mu = \sum_{\nu=0}^{3} \eta_{\mu\nu} A^\nu \tag{5.30}
$$

Note that though I have used the underbar sign in the matrix form
of the covariant vector to distinguish it from the contravariant
version, this is not necessary for the components - the placement of
the index $\mu$ (as a subscript rather than a superscript) is enough
to make the distinction.
In terms of the covariant version of a vector it is easy to see that
one can write a compact version of the formula for the dot product

$$
A \cdot B = \sum_{\mu=0}^{3} A^\mu B_\mu = \sum_{\mu=0}^{3} A_\mu B^\mu \tag{5.31}
$$

This formula should be strongly reminiscent of the formula for the
3 vector dot product

$$
\vec{A} \cdot \vec{B} = \sum_{i=1}^{3} A_i B_i
$$

Note that in this case you had to take the corresponding components
of the two vectors, multiply them and add up all the terms
together to get a scalar. According to (5.31), you do exactly the
same for the four-vector dot product, except that in this case the
two vectors have to be of different kinds - one contravariant and
the other covariant! Another way to put this is - you can’t produce
a scalar by using the multiply components and add rule on two
contravariant vectors (or two covariant vectors either) - for this
rule to work, you must involve the two different kinds of vectors!


To understand exactly why two kinds of vectors are needed, let
us try to prove directly the invariance of the product in (5.31) in
terms of the Lorentz transformation matrices. It is easy to see
that in terms of the column vectors
$\underline{A} = (A_0, A_1, A_2, A_3)^T$ and
$B = (B^0, B^1, B^2, B^3)^T$ the inner product can be written as

$$
A \cdot B = \underline{A}^T B.
$$

Now, under a Lorentz transformation, we have $B \to B' = LB$ but
$\underline{A} \to \underline{A}' = (L^T)^{-1}\underline{A}$. This
means that

$$
\underline{A}^T B \to \underline{A}'^T B'
  = \left((L^T)^{-1}\underline{A}\right)^T LB
  = \underline{A}^T L^{-1} L B
  = \underline{A}^T B
$$

where the last step follows from the fact that the transpose of
$(L^T)^{-1}$ is $L^{-1}$. Note that had you tried to do the same with
two contravariant vectors, the matrix product $L^{-1}L$ would have
been replaced by $L^T L$ - which would not have been equal to the
identity matrix.
This should also tell you why you had not met two kinds of
vectors before, when you were studying them in three dimensions.
There, the transformations you were talking of were rotations, and
there is no distinction between the transformation matrix $R$ and
$(R^T)^{-1}$. So, it all boils down to the fact that rotations are
orthogonal (which makes $(R^T)^{-1} = R$), while Lorentz
transformations are not.
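To make the contrast between boosts and rotations concrete, here is a
hedged Python sketch. It uses the relation $(L^T)^{-1} = \eta L \eta$,
which follows from $L^T \eta L = \eta$ together with $\eta^2 = 1$, to
build the matrix that transforms covariant components, and then
compares it with $L$ itself for a boost and for a rotation. The
function names and the sample parameters are my own assumptions.

```python
import math

ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def boost_matrix(beta):
    """Special Lorentz boost along X."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -beta * g, 0, 0],
            [-beta * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def rotation_matrix(theta):
    """Rotation about Z, written as a 4 x 4 matrix acting on (ct, x, y, z)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0],
            [0, c, s, 0],
            [0, -s, c, 0],
            [0, 0, 0, 1]]

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def covariant_transform(L):
    """eta L eta, which equals (L^T)^{-1} whenever L^T eta L = eta."""
    return matmul(matmul(ETA, L), ETA)

def matrices_close(X, Y, tol=1e-12):
    return all(math.isclose(x, y, abs_tol=tol)
               for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

L = boost_matrix(0.6)
R = rotation_matrix(0.9)
L_cov = covariant_transform(L)
R_cov = covariant_transform(R)

# One covariant and one contravariant vector: the product of the two
# transformation matrices collapses to the identity.
mixed_product_is_identity = matrices_close(matmul(transpose(L_cov), L), IDENTITY)
# Two contravariant vectors: L^T L is not the identity for a boost.
naive_product_is_identity = matrices_close(matmul(transpose(L), L), IDENTITY)
# For a rotation the two kinds of components transform identically.
rotation_needs_no_distinction = matrices_close(R_cov, R)
boost_needs_distinction = not matrices_close(L_cov, L)

print(mixed_product_is_identity, naive_product_is_identity)
print(rotation_needs_no_distinction, boost_needs_distinction)
```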

One reason why the Minkowski coordinates that I introduced in
subsection 5.4.3 were useful is that they made the Lorentz
transformations orthogonal! This meant that in terms of these
coordinates, you did not have to distinguish between contravariant
and covariant vectors. This should explain why we used subscripts on
the coordinates $x_1, x_2, x_3, x_4$ - since there is no need to
distinguish between two kinds of entities, we stick to the more
conventional choice! As I discussed in subsection 5.4.3, the
conceptual difficulties involved with an imaginary time coordinate
tend to outweigh these small advantages.

One of the most common examples of a dot product is that of a
vector with itself, written $A^2$ - which yields, as in the three
dimensional case, the four dimensional invariant norm of the vector.
In the three dimensional case, the square root of this quantity is
the length of the vector. This does not make much sense, though, for
the four dimensional case - since the mixed signature of the metric
allows the norm of a vector to be negative as well as positive or
zero. Note that the norm of a nonzero vector can be zero!
The norm of our prototype four-vector, $(ct, \vec{r})$, is the
invariant interval between the event at $(ct, \vec{r})$ and the
spacetime origin $(0, \vec{0})$

$$
(ct, \vec{r})^2 = c^2t^2 - \vec{r}^2
$$

It is easy to see that the interval between two events with
spacetime four-vectors $x_1$ and $x_2$ is given by $(x_2 - x_1)^2$.
In analogy to the way in which we classify coordinate four-vectors
as timelike, spacelike and lightlike in terms of the sign of their
norm, we classify all four-vectors as either temporal, spatial or
null according as their norm is positive, negative or zero.
Let’s take a look at the other four-vectors we have met in the last
section. The norm of the velocity four-vector $\gamma\,(c, \vec{u})$
is given by

$$
U^2 = \gamma^2c^2 - \gamma^2u^2 = \gamma^2\left(c^2 - u^2\right) = c^2 \tag{5.32}
$$

Thus the norm of the four-velocity is an invariant - though, being
constant, it does not seem to be a very interesting one at this stage!
As can be seen, the velocity four vector of any particle has to be
timelike.
What about the acceleration four-vector,

$$
A^\mu = \gamma^2\left(\frac{\gamma^2}{c}\,\vec{u} \cdot \vec{a},\ \vec{a} + \frac{\gamma^2}{c^2}\,(\vec{u} \cdot \vec{a})\,\vec{u}\right)?
$$

A bit of messy algebra shows that its norm is given by

$$
A^2 = -\gamma^4\left[\vec{a}^2 + \frac{\gamma^2}{c^2}\,(\vec{u} \cdot \vec{a})^2\right] \tag{5.33}
$$

You can, if you are a glutton for punishment, verify directly that
this is invariant under a Lorentz transformation (or at least, under
the special Lorentz transformation)! One thing that follows directly
from the above is that the acceleration four-vector is always
spacelike, unless of course the three-acceleration $\vec{a}$ is zero,
in which case it is lightlike[10].
[10] That’s hardly surprising, of course, given that for
$\vec{a} = 0$ the acceleration four-vector is identically zero!

Let me show you another way of deriving the result above. At
any given instant the particle has a given velocity $\vec{u}$ with
respect to the inertial observer being used. Let us now move over to
another inertial observer, to whom the particle is at rest at that
instant. Note that if the particle has a non-zero acceleration, it is
going to start moving in that frame just afterwards. You can always
find a frame in which the particle stays at rest always - all you
have to do is put an observer right on the particle. Such an observer
will be non inertial, and we won’t have much to do with these
creatures in STR. What we have to be satisfied with is the
instantaneously comoving frame (ICF) of the kind described above. In
that frame the acceleration of the particle is often called its
proper acceleration $\vec{a}_0$, and since $\vec{u} = 0$,
$\gamma = 1$ the acceleration four-vector simplifies significantly to
$(0, \vec{a}_0)$. Thus in a particle’s ICF, the norm of its
four-acceleration is given by $-\vec{a}_0^2$ - which is obviously
negative (or zero, if $\vec{a}_0 = \vec{0}$)! Now comes the crux of
the argument - since the norm of a four-vector is invariant, any
other observer has to measure it as $-\vec{a}_0^2$ - so that everyone
agrees that it is negative or zero!
The trick that I have used here is one that is used quite often
in relativistic calculations. The fact that scalars are the same for
all inertial observers allows us to jump from one inertial observer
to another, for whom the calculation can be simpler. In this context,
the rest frame of a particle (which has to be replaced by its ICF(s)
if it is accelerating) is very often the one in which the calculation
becomes simplest. The theme is - calculate once, use for all! The
same can be applied, to a somewhat lesser extent, to four-vector
quantities, too (and as we will see later, to other kinds of stuff,
like four-tensors and so on). The only trouble is, in this case you
have to take the trouble of transforming back to your original frame
of reference once the calculation is done. Actually, you had to do
this for scalars too - the only difference being that this is trivial
for scalars.
To illustrate the power of this technique let me use it to derive an
expression that will be of some importance in the study of
relativistic mechanics - the value of the scalar
$A \cdot U = A^\mu U_\mu$. In the ICF of a particle, these two
vectors take the values $A = (0, \vec{a}_0)$ and $U = (c, \vec{0})$
respectively. It is easy to see that in the ICF,

$$
A \cdot U = 0 \times c - \vec{a}_0 \cdot \vec{0} = 0 \tag{5.34}
$$

and this means that this quantity vanishes in all frames. You can
(and should) check that this is true by doing the explicit calculation
using (5.23) and (5.25) - which should convince you of the utility of
this trick.
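If the explicit calculation feels like too much algebra, a numerical
spot check is a reasonable first step. The Python sketch below builds
$U^\mu$ from (5.23) and $A^\mu$ from (5.25) for one sample velocity
and acceleration, and evaluates $A \cdot U$ together with the norm
(5.33). The function names, the sample values of $\vec{u}$ and
$\vec{a}$ and the units with $c = 1$ are assumptions made for
illustration; a single numerical example is of course no substitute
for the general proof.

```python
import math

C = 1.0  # assumption: units in which c = 1

def dot3(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def minkowski_dot(A, B):
    return A[0] * B[0] - A[1] * B[1] - A[2] * B[2] - A[3] * B[3]

def gamma_of(u):
    return 1.0 / math.sqrt(1.0 - dot3(u, u) / C ** 2)

def four_velocity(u):
    """U^mu = gamma (c, u), equation (5.23)."""
    g = gamma_of(u)
    return (g * C,) + tuple(g * ui for ui in u)

def four_acceleration(u, a):
    """A^mu from equation (5.25)."""
    g = gamma_of(u)
    ua = dot3(u, a)
    time_part = g ** 4 * ua / C
    space_part = tuple(g ** 2 * (ai + g ** 2 * ua * ui / C ** 2)
                       for ai, ui in zip(a, u))
    return (time_part,) + space_part

u = (0.3, -0.4, 0.5)   # sample three-velocity with |u| < c (assumption)
a = (1.0, 2.0, -0.5)   # sample three-acceleration (assumption)

U = four_velocity(u)
A = four_acceleration(u, a)
g = gamma_of(u)

A_dot_U = minkowski_dot(A, U)
A_norm = minkowski_dot(A, A)
A_norm_formula = -g ** 4 * (dot3(a, a) + g ** 2 * dot3(u, a) ** 2 / C ** 2)

print(A_dot_U)                  # vanishes up to rounding error
print(A_norm, A_norm_formula)   # both negative, and equal
```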
There is another way in which the result (5.34) could be derived
- and the method is quite instructive in its own right. It is easy to
check that the familiar calculus law for the derivative of a product

$$
\frac{d}{dt}(uv) = u\,\frac{dv}{dt} + v\,\frac{du}{dt}
$$

can be extended to the scalar product almost in toto

$$
\frac{d}{d\tau}(A \cdot B) = \frac{dA}{d\tau} \cdot B + A \cdot \frac{dB}{d\tau} \tag{5.35}
$$

To use this in our present task, let’s start with the norm of the
four-velocity

$$
U^2 = U \cdot U = c^2
$$

and differentiate to get

$$
U \cdot \frac{dU}{d\tau} + \frac{dU}{d\tau} \cdot U = 0
$$

Using the definition $A^\mu \equiv dU^\mu/d\tau$ and the symmetry of
the dot product, this translates to
$U \cdot A + A \cdot U = 2\,A \cdot U = 0$, leading to our
identity[11].
[11] You may recall a similar result from the world of three-vectors
- the derivative of any vector of fixed length is perpendicular to it
- that is proved in exactly the same way! Among its consequences is
the fact that the tangent of a circle is perpendicular to the radius,
as well as the fact that the acceleration of a particle in uniform
circular motion is centripetal (since it is perpendicular to the
velocity).

As a final example of the utility of this trick of evaluating
four-scalar quantities in a special frame, secure in the knowledge
that their value will not be affected when you move back to a general
frame of reference, I will show you yet another way of deriving
(5.24) - this time almost trivially. Note that for the four-velocities
$V = \gamma(v)\,(c, \vec{v})$ and $U = \gamma(u)\,(c, \vec{u})$ the
four-scalar product is given by

$$
U \cdot V = \gamma(u)\,\gamma(v)\left(c^2 - \vec{u} \cdot \vec{v}\right)
$$

while its value in the rest frame of the second observer, where the
particle moves with the relative velocity $\vec{u}'$, is

$$
U \cdot V = \gamma(u')\,(c, \vec{u}') \cdot (c, \vec{0}) = c^2\,\gamma(u').
$$

Equating these two results for $U \cdot V$ gives us (5.24) immediately!
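The "calculate once, use for all" idea can be tried out numerically.
In the Python sketch below I evaluate $U \cdot V$ in the original
frame, then boost both four-velocities into the rest frame of the
second observer and evaluate it again. For simplicity the relative
velocity is taken along the X axis so that the special Lorentz boost
(5.22) can be used. The helper names, the sample velocities and the
units with $c = 1$ are illustrative assumptions.

```python
import math

C = 1.0  # assumption: units in which c = 1

def dot3(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def gamma_of(u):
    return 1.0 / math.sqrt(1.0 - dot3(u, u) / C ** 2)

def four_velocity(u):
    g = gamma_of(u)
    return (g * C,) + tuple(g * ui for ui in u)

def boost_x(A, v):
    """Special Lorentz boost with speed v along X, as in (5.22)."""
    beta = v / C
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return (g * A[0] - beta * g * A[1],
            -beta * g * A[0] + g * A[1],
            A[2],
            A[3])

def minkowski_dot(A, B):
    return A[0] * B[0] - A[1] * B[1] - A[2] * B[2] - A[3] * B[3]

u = (0.2, 0.5, -0.3)       # particle velocity (assumption)
v = 0.7                    # speed of the second observer along X (assumption)
v_vec = (v, 0.0, 0.0)

U = four_velocity(u)
V = four_velocity(v_vec)
scalar_here = minkowski_dot(U, V)

# Move to the rest frame of the second observer.
U_rest = boost_x(U, v)
V_rest = boost_x(V, v)     # should come out as (c, 0, 0, 0)
scalar_there = minkowski_dot(U_rest, V_rest)

gamma_u_prime = U_rest[0] / C
identity_rhs = gamma_of(u) * gamma_of(v_vec) * (1.0 - dot3(u, v_vec) / C ** 2)

print(scalar_here, scalar_there)
print(gamma_u_prime, identity_rhs)
```

The second printed pair is the content of (5.24), rearranged as
$\gamma(u') = \gamma(u)\,\gamma(v)\,(1 - \vec{u} \cdot \vec{v}/c^2)$.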

5.5.6 Going further - tensors


Now that we have understood the most basic physical entities that
you can find in spacetime, it is time to go a bit further and ask
- what other kinds of physical entities can there be? The most
obvious extension to the concepts of four-scalars and four-vectors
are four-tensors. Indeed, as you may know, scalars and vectors
can even be thought of as special cases of tensors.
To understand tensors in four dimensions, let us first take a
look at a familiar tensor from the three dimensional world. I am
almost certain that the most familiar tensor that you can think of
is the moment of inertia tensor $I$. It is what relates the angular
velocity $\omega$ of a body to its angular momentum $L$,

$$
L = I\omega \tag{5.36}
$$

where the moment of inertia tensor is represented by a $3 \times 3$
matrix, with the vectors $L$ and $\omega$ being represented by
$3 \times 1$ column vectors. From the way we have been defining
scalars and vectors, it should be obvious that the question to ask is
- just how does the moment of inertia tensor transform? We can easily
deduce this from the vector transformation law. Simply note that

$$
L \to L' = RL, \qquad \omega \to \omega' = R\omega
$$

which implies that the moment of inertia tensor in the rotated
coordinate frame must satisfy

$$
RL = I'R\omega \quad \Rightarrow \quad RI\omega = I'R\omega \quad \text{for all } \omega
$$

which means that we must have

$$
I' = RIR^{-1} = RIR^T \tag{5.37}
$$

where in the last step I have used the fact that the rotation matrix
is orthogonal. This is the law of transformation for the moment of
inertia tensor[12]. A bit of thought should show you that the same
law should work for anything that can be written in the form of a
square matrix that describes the linear relationship between one
vector and another. Thus, the dielectric tensor (relating the
electric field vector $\vec{E}$ to the electric displacement
$\vec{D}$), the stress tensor (relating the area vector to the force
vector), etc. all transform according to this rule. In general, the
class of objects that transform like this are collectively called
tensors of rank two.
[12] In the language of matrices, this is an orthogonal
transformation, a special kind of similarity transformation in which
the transforming matrix happens to be orthogonal. A theorem of linear
algebra, which states that any symmetric matrix can always be
orthogonally diagonalised, ensures that you will always be able to
rotate your coordinate system into one in which the moment of inertia
tensor is diagonal. This, of course, is the principal axis system -
something which makes the study of rigid body dynamics very
convenient.

Why “rank two”? Because the rotation matrix appears twice
(once as $R$ and once as $R^T$) in the transformation law (5.37)!
This may be even more obvious if I write this transformation law in
terms of the components. Then

$$
\begin{aligned}
I'_{ij} &= \left(RIR^T\right)_{ij} \\
        &= \sum_{a,b=1}^{3} R_{ia}\, I_{ab} \left(R^T\right)_{bj} \\
        &= \sum_{a,b=1}^{3} R_{ia}\, R_{jb}\, I_{ab}
\end{aligned}
$$

where the two R matrix elements becomes explicit - especially if


you compare with the corresponding law for vectors

3
X
vi0 = Ria va
a=1

A second rank tensor, then, can be defined as a set of 3 × 3 = 9


components that transform under a coordinate rotation according
to the rule
3
X
Tij0 = Ria Rjb Tab , i, j = 1, . . . , 3 (5.38)
a,b=1

This form makes it easy to see how to generalize this to get the
third rank tensor - this is nothing but the set of $3^3 = 27$ components
that transform according to the rule

$$
T'_{ijk} = \sum_{a,b,c=1}^{3} R_{ia} R_{jb} R_{kc} T_{abc} , \qquad i, j, k = 1, 2, 3 \tag{5.39}
$$

You can keep on going higher and higher in rank by adding more
indices and more R matrix coefficients in the transformation law!
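
If you would like to see the matrix form $R I R^T$ and the component form (5.38) agree on actual numbers, the short Python sketch below does the comparison. The rotation angle, the sample symmetric matrix and all the function names are illustrative assumptions of mine, not anything fixed by the text.

```python
import math

def rotation_z(theta):
    """3x3 matrix for a rotation of the coordinate axes about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][a] * B[a][j] for a in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def transform_rank2(R, T):
    """Component form (5.38): T'_ij = sum over a, b of R_ia R_jb T_ab."""
    n = len(T)
    return [[sum(R[i][a] * R[j][b] * T[a][b]
                 for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]

R = rotation_z(0.3)                      # arbitrary sample angle (radians)
I = [[2.0, 0.5, 0.0],                    # made-up symmetric "inertia tensor"
     [0.5, 3.0, 0.1],
     [0.0, 0.1, 4.0]]

matrix_form = matmul(matmul(R, I), transpose(R))
component_form = transform_rank2(R, I)

# Largest difference between the two forms; should be zero up to rounding.
max_diff = max(abs(matrix_form[i][j] - component_form[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The same `transform_rank2` pattern extends to rank three by adding one more index and one more factor of `R`, exactly as in (5.39).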
All this was about tensors in three dimensions - or more precisely,
tensors under rotations. What is the scenario in four dimensional
spacetime? Our tensor of rank two (under rotations) was something
that acts linearly on a vector to yield another vector. You should
immediately see that something more complicated is going on here -
the fact that there is more than one kind of vector in spacetime
immediately tells us that there should be an even larger number of
kinds of tensors of rank two, namely those which change

• A covariant vector to a contravariant vector

• A contravariant vector to a covariant vector

• A covariant vector to a covariant vector

• A contravariant vector to a contravariant vector
Let me focus on a tensor that converts a covariant vector to a
contravariant one for the time being - the other cases are similar.
So, we have

$$
V = T W
$$

Since under a Lorentz transformation $x \to x' = Lx$, the two vectors
transform according to $V \to V' = LV$ and $W \to W' = (L^T)^{-1} W$, we
see that $T$ must transform to $T'$ given by

$$
L T W = T' (L^T)^{-1} W
$$

for all $W$, which readily shows that

$$
T \to T' = L T L^T
$$

which becomes, in terms of components

$$
T'^{\mu\nu} = \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho} L^{\nu}{}_{\sigma} T^{\rho\sigma} \tag{5.40}
$$
As you can see, this tensor has two $L$ matrix elements in its
transformation law - that’s twice as many as a contravariant vector.
It is easy to see, then, why it is called a contravariant tensor of
rank two. This also explains why I chose to put the indices up as
superscripts in this case.
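You should be able to show quite easily that a tensor that changes
a contravariant vector into a covariant one transforms according to
the law

$$
T'_{\mu\nu} = \sum_{\rho,\sigma=0}^{3} L_{\mu}{}^{\rho} L_{\nu}{}^{\sigma} T_{\rho\sigma} \tag{5.41}
$$

and is called a covariant tensor of rank 2. Here I write
$L_{\mu}{}^{\rho}$ for the elements of $(L^T)^{-1}$, the matrix that
transforms covariant vectors. As for the other two possibilities, I
will leave you to figure out that they yield mixed tensors with one
contravariant and one covariant index, transforming as

$$
T'^{\mu}{}_{\nu} = \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho} L_{\nu}{}^{\sigma} T^{\rho}{}_{\sigma} . \tag{5.42}
$$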

One way in which you can directly produce a tensor of rank two
from two vectors is by forming the so called direct product. As an
example, I will start with two contravariant 4-vectors, $A^\mu$ and
$B^\mu$, and form the set of 16 quantities

$$
T^{\mu\nu} = A^{\mu} B^{\nu} , \qquad \mu, \nu = 0, 1, 2, 3 \tag{5.43}
$$

It is very easy to check that this does satisfy the defining
transformation property of a contravariant tensor of rank two:

$$
\begin{aligned}
T'^{\mu\nu} &= A'^{\mu} B'^{\nu} \\
 &= \left( \sum_{\rho=0}^{3} L^{\mu}{}_{\rho} A^{\rho} \right)
    \left( \sum_{\sigma=0}^{3} L^{\nu}{}_{\sigma} B^{\sigma} \right) \\
 &= \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho} L^{\nu}{}_{\sigma} A^{\rho} B^{\sigma} \\
 &= \sum_{\rho,\sigma=0}^{3} L^{\mu}{}_{\rho} L^{\nu}{}_{\sigma} T^{\rho\sigma}
\end{aligned}
$$

A word of warning - although the direct product of two vectors
does give rise to a tensor of rank two, not every tensor of rank two
can be written as a direct product of two vectors. A simple counting
argument should be enough to understand this - two vectors give
you 8 numbers to play with - while a general tensor of rank two has
16 components! It is easy to see that the sum of the direct products
$A^\mu B^\nu$ and $C^\mu D^\nu$ is a tensor of rank two, but it can itself
be written as a direct product only if $C$ is proportional to $A$ or
$D$ is proportional to $B$.
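
Here is a small Python sketch of this warning. It relies on a standard fact of linear algebra that the text does not spell out: a set of numbers $T^{\mu\nu}$ can be written as a single direct product exactly when every $2 \times 2$ minor $T^{\mu\nu} T^{\rho\sigma} - T^{\mu\sigma} T^{\rho\nu}$ vanishes. The sample vectors and the helper names are made up for illustration.

```python
def direct_product(A, B):
    """T^{mu nu} = A^mu B^nu for two 4-component vectors."""
    return [[a * b for b in B] for a in A]

def add(S, T):
    return [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(S, T)]

def is_direct_product(T, tol=1e-12):
    """True if every 2x2 minor of T vanishes (T has matrix rank <= 1)."""
    n = len(T)
    return all(abs(T[m][v] * T[r][s] - T[m][s] * T[r][v]) <= tol
               for m in range(n) for v in range(n)
               for r in range(n) for s in range(n))

# Arbitrary sample vectors, chosen so that C is not proportional to A
# and D is not proportional to B.
A = [1.0, 2.0, 3.0, 4.0]
B = [0.5, -1.0, 2.0, 0.0]
C = [0.0, 1.0, 0.0, 2.0]
D = [1.0, 1.0, -1.0, 3.0]

single = direct_product(A, B)
generic_sum = add(direct_product(A, B), direct_product(C, D))
# Replace C by a multiple of A: the sum collapses to A x (B + 2 D).
special_sum = add(direct_product(A, B),
                  direct_product([2.0 * a for a in A], D))

print(is_direct_product(single))       # a direct product, by construction
print(is_direct_product(generic_sum))  # generic sum of two direct products
print(is_direct_product(special_sum))  # sum with C proportional to A
```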
A particularly important second rank tensor that you can form
from two vectors $U$ and $V$ is the antisymmetrized direct product

$$
A^{\mu\nu} = U^{\mu} V^{\nu} - U^{\nu} V^{\mu} \tag{5.44}
$$

It is easy to see that if you restrict yourself to the spatial
components $A^{ij}$, $i, j = 1, 2, 3$, you get three independent numbers
$A^{23}$, $A^{31}$ and $A^{12}$ which are old friends, the three components
of the cross product $\vec{U} \times \vec{V}$. Thus, (5.44) above is a
4 dimensional generalization of the cross product of two
three-vectors. Apart from the three spatial components that we have
just talked about, the three other independent components are the
time-space components $A^{0i}$, $i = 1, 2, 3$. They form, obviously, the
three components of the three-vector [13]

$$
\vec{A} = U^{0} \vec{V} - V^{0} \vec{U}
$$

- thus our antisymmetric product can be thought of as an entity
that packs in two three-vectors, the $\vec{A}$ above and
$\vec{B} = \vec{U} \times \vec{V}$. Note that under rotations these two
pieces do not mix with each other - intermixing only happens under a
more general Lorentz transformation, like a boost.

[13] This is obviously a three-vector under rotations, since $\vec{U}$
and $\vec{V}$ are three-vectors and $U^0$ and $V^0$ are three-scalars.
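Since we will meet entities like this quite often later, it may be a
good idea to figure out how this transforms under, say, the special
Lorentz transformation. Using the matrix form, we get

$$
\begin{pmatrix}
0 & A'_1 & A'_2 & A'_3 \\
-A'_1 & 0 & B'_3 & -B'_2 \\
-A'_2 & -B'_3 & 0 & B'_1 \\
-A'_3 & B'_2 & -B'_1 & 0
\end{pmatrix}
=
\begin{pmatrix}
\gamma & -\beta\gamma & 0 & 0 \\
-\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
0 & A_1 & A_2 & A_3 \\
-A_1 & 0 & B_3 & -B_2 \\
-A_2 & -B_3 & 0 & B_1 \\
-A_3 & B_2 & -B_1 & 0
\end{pmatrix}
\begin{pmatrix}
\gamma & -\beta\gamma & 0 & 0 \\
-\beta\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}^{T}
$$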

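The matrix multiplication above looks much more complicated
than it is - and with a bit of perseverance you can easily show that
this implies

\begin{align}
A'_1 &= A_1 \tag{5.45-a} \\
A'_2 &= \gamma \left( A_2 - \beta B_3 \right) \tag{5.45-b} \\
A'_3 &= \gamma \left( A_3 + \beta B_2 \right) \tag{5.45-c} \\
B'_1 &= B_1 \tag{5.45-d} \\
B'_2 &= \gamma \left( B_2 + \beta A_3 \right) \tag{5.45-e} \\
B'_3 &= \gamma \left( B_3 - \beta A_2 \right) \tag{5.45-f}
\end{align}

This result will be quite useful for us in the future.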
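
If you would rather let a computer do the perseverance, the Python sketch below carries out the transformation $L A L^T$ in component form and compares the outcome with the six relations (5.45). The values of $\beta$, $\vec{A}$ and $\vec{B}$ are arbitrary sample numbers and the function names are my own illustrative choices. Note the signs used in the expected values, in particular $A'_3 = \gamma (A_3 + \beta B_2)$, which is what the matrix product gives.

```python
import math

def boost_x(beta):
    """Special Lorentz transformation along x, returned with its gamma."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    L = [[g, -beta * g, 0.0, 0.0],
         [-beta * g, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    return L, g

def pack(A, B):
    """Antisymmetric tensor built from the two three-vectors A and B."""
    A1, A2, A3 = A
    B1, B2, B3 = B
    return [[0.0, A1, A2, A3],
            [-A1, 0.0, B3, -B2],
            [-A2, -B3, 0.0, B1],
            [-A3, B2, -B1, 0.0]]

def transform(L, T):
    """T'^{mu nu} = sum over rho, sigma of L^mu_rho L^nu_sigma T^{rho sigma}."""
    return [[sum(L[m][r] * L[n][s] * T[r][s]
                 for r in range(4) for s in range(4))
             for n in range(4)] for m in range(4)]

beta = 0.6                                  # sample boost speed, gamma = 1.25
L, gamma = boost_x(beta)
A = (1.0, 2.0, 3.0)                         # arbitrary sample components
B = (-0.5, 0.7, 1.1)

Tp = transform(L, pack(A, B))
A_new = (Tp[0][1], Tp[0][2], Tp[0][3])      # time-space components
B_new = (Tp[2][3], Tp[3][1], Tp[1][2])      # space-space components

expected_A = (A[0],
              gamma * (A[1] - beta * B[2]),
              gamma * (A[2] + beta * B[1]))
expected_B = (B[0],
              gamma * (B[1] + beta * A[2]),
              gamma * (B[2] - beta * A[1]))

print(A_new, expected_A)
print(B_new, expected_B)
```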


It is easy to see that you can form tensors of other kinds by
taking a direct product of vectors of the appropriate kind. Indeed,
playing this game with more than two vectors will give you
tensors of higher rank. For instance, taking the direct product
of three contravariant vectors produces a tensor of rank three, a
quantity that transforms according to

$$
T'^{\lambda\mu\nu} = \sum_{\vartheta,\rho,\sigma=0}^{3}
L^{\lambda}{}_{\vartheta} L^{\mu}{}_{\rho} L^{\nu}{}_{\sigma} T^{\vartheta\rho\sigma} \tag{5.46}
$$

Remember that, once again, a general third rank tensor will trans-
form according to this equation but will not be, in general, a direct
product of three vectors.
In general, it is easy to see that we can take the direct product
of $m$ contravariant vectors and $n$ covariant vectors to produce an
object that transforms like

$$
T'^{\mu_1 \ldots \mu_m}{}_{\nu_1 \ldots \nu_n} =
\sum_{\rho_1, \ldots, \rho_m, \sigma_1, \ldots, \sigma_n = 0}^{3}
L^{\mu_1}{}_{\rho_1} \cdots L^{\mu_m}{}_{\rho_m}
L_{\nu_1}{}^{\sigma_1} \cdots L_{\nu_n}{}^{\sigma_n}
T^{\rho_1 \ldots \rho_m}{}_{\sigma_1 \ldots \sigma_n} \tag{5.47}
$$

Any set of $4^{m+n}$ numbers that transform like the above is called
a mixed tensor having a contravariant rank of $m$ and a covariant
rank of $n$ or, in short, a tensor with a rank of $m + n$.

5.5.7 Why bother?


We will meet four-scalars, four-vectors and four-tensors in a big
way in the next few chapters. Are you wondering why they are so
important? The reason lies in the principle of relativity. This
implies that in order to qualify as a valid physical law a law has to
be equally valid for all inertial frames. In other words, the
equations that express these laws must be Lorentz covariant. Note the
choice of words here - I am saying covariant, not invariant. What
this means is that the two sides of an equation may change (not
remain invariant), but change in the same manner (co-vary) under a
Lorentz transformation - so that if the equation is correct for one
inertial frame of reference, it is correct for any other inertial frame.
So, given any proposed law of physics, it becomes important to
check whether it conforms to this principle of Lorentz covariance.
Now, this may be a difficult matter - you will have to take a look at
the way in which each entry on both sides of the equation changes
under a Lorentz transformation, carry out the necessary algebra
and often rather painfully verify that both sides do change in the
same manner.
This is exactly where four-tensors come in. If you write down
an equation in which both sides are four-tensors (of the same kind)
then Lorentz covariance is guaranteed - both sides of the equation
will change on changing the inertial observer, but will change in
exactly the same way. The point is - if we write equations in terms
of matching four dimensional entities, then there is no longer any
need to check for Lorentz covariance - such equations are
manifestly Lorentz covariant!
Chapter 6

Kinetics in relativity

6.1 Rewriting Newton

6.2 The momentum - kinetic energy connection

Let me remind you how we defined kinetic energy in Newtonian
mechanics. It is the work that has to be done on a stationary
particle to speed it up from rest. Thus,

$$
T = \int \vec{F} \cdot d\vec{r}
  = \int \frac{d\vec{p}}{dt} \cdot d\vec{r}
  = \int_{\vec{0}}^{\vec{p}} \vec{u} \cdot d\vec{p} \tag{6.1}
$$

This formula remains valid in relativity, too. Note that in classical
mechanics we have $\vec{p} = m\vec{u}$, which leads to the well known
result

$$
T = \frac{p^2}{2m} = \frac{1}{2} m u^2 \tag{6.2}
$$

Note that for a photon, which always travels with the speed of
light, the kinetic energy and the momentum are simply related by

$$
T_\gamma = \int_{0}^{p_\gamma} c \, dp = p_\gamma c \tag{6.3}
$$

We will borrow from quantum theory the result that the energy of
a photon of frequency $\nu$ is given by $T_\gamma = h\nu$. Here $h$ stands
for Planck’s constant. Then the photon must have a momentum given
by $p_\gamma = h\nu/c = h/\lambda$. I will make extensive use of these
results in the next few sections to show, firstly, that the classical
definitions for momentum (and hence kinetic energy) are inadequate
in relativity, and secondly, to find out what they should be
redefined as!
I told you before that the Doppler factor is in many senses (except
familiarity) a better measure of motion than the speed. Not
only is it easier to determine, it also makes the algebra much
simpler in relativistic calculations. In keeping with this, I will
write both the momentum and kinetic energy of a particle as functions
of its Doppler factor [1] $k$, namely, as $p(k)$ and $T(k)$. Classical
physics tells us that

$$
p(k) \equiv m u = m c \, \frac{k^2 - 1}{k^2 + 1}
$$

$$
T(k) \equiv \frac{1}{2} m u^2 = \frac{1}{2} m c^2 \left( \frac{k^2 - 1}{k^2 + 1} \right)^2
$$

Of course, the versions in terms of $u$ look a lot simpler, but that is
only because these are non-relativistic formulae! As we will see in
the next section, the demand that the law of momentum conservation
works in all inertial frames will tell us that these relations
must be wrong! They are nevertheless very important - any correct
formula must reduce to these when the velocity is small compared to
the speed of light (i.e. when $k$ is close to unity)!

[1] More precisely, it is the Doppler factor of an observer that shares
the particle’s velocity at that instant of time. Note that the
particle’s $k$ may change over time - which means that an observer
fixed to it can not be inertial. Instead we may think of
instantaneously comoving frames - a sequence of inertial frames in
each of which the particle is at rest at that instant.

[Figure 6.1: A photon bouncing off an electron. Two panels, each
showing the situation before and after the collision: (a) in the frame
where the electron is initially at rest, (b) in the frame where the
electron is at rest after the collision.]


6.3 The need to redefine momentum


Let’s try shining our flashlight at an electron at rest. I will
consider a very simple situation. Here, the photon hits the electron
and bounces straight back. In the process, it transfers some of its
energy to the electron. Thus the photon comes out lower in energy
and hence lower in frequency. The situation is shown in figure
6.1a.
Here, the initial net momentum is $h\nu_i/c + p(1)$ (remember that
$k = 1$ for an electron at rest), while the final one is
$-h\nu_f/c + p(k)$. As for kinetic energy, the initial and final values
for it are $h\nu_i + T(1)$ and $h\nu_f + T(k)$, respectively. Since a
particle at rest has neither momentum nor kinetic energy, we know
that $p(1) = 0$ and $T(1) = 0$. This means that in the frame where the
electron is initially at rest, the conservation of energy and
momentum leads to the equations

$$
\frac{h}{c} \left( \nu_i + \nu_f \right) = p(k) \tag{6.4}
$$

$$
h \left( \nu_i - \nu_f \right) = T(k) \tag{6.5}
$$

In this problem, the two unknowns are $\nu_f$, the final frequency of
the photon, and $k$, the final Doppler factor (i.e. speed) of the
electron. We have two equations - so it seems we have all the
ingredients needed to solve the problem. The only trouble is, we
don’t yet know the functions $p(k)$ and $T(k)$!
One result that follows immediately from (6.4) and (6.5) is that
for the ratio between the kinetic energy and momentum

$$
\frac{T(k)}{c \, p(k)} = \frac{\nu_i - \nu_f}{\nu_i + \nu_f} \tag{6.6}
$$

- a formula that will be of some use later.


What does this collision look like from another frame - the one in
which the electron is at rest after the collision? It is easy to see
that in this case, the initial electron is moving to the left with a
Doppler factor $k^{-1}$ (same speed, opposite direction of motion!), so
that the increase in momentum of the electron is given by
$p(1) - p(k^{-1}) = -p(k^{-1})$. Since momentum should reverse with
velocity, we demand that $p(k^{-1}) = -p(k)$, so that, in this frame
too, the electron’s momentum increases by $p(k)$. For momentum
conservation to work in this frame, this must be the decrease in the
momentum of the photon. This frame is receding from the incoming
photon, and hence it sees its frequency reduced by the Doppler effect
to $\nu'_i = \nu_i / k$, while the outgoing photon gets shifted the other
way, $\nu'_f = \nu_f k$. Thus, momentum conservation leads to

$$
\frac{h}{c} \left( \frac{\nu_i}{k} + \nu_f k \right) = p(k) = \frac{h}{c} \left( \nu_i + \nu_f \right)
$$

and since the electron will move ($k \neq 1$), we must have

$$
\nu_f = \frac{\nu_i}{k} \tag{6.7}
$$

Note the power of the relativity principle: by hopping to another
frame, we have already got some way to a solution, without worrying
about the precise forms of $p(k)$ and $T(k)$!
With this ratio between $\nu_i$ and $\nu_f$, it is easy to see that the
kinetic energy to momentum ratio becomes

$$
\frac{T}{pc} = \frac{k - 1}{k + 1} \tag{6.8}
$$

Let’s make an important point clear at this stage. The relations
(6.4), (6.5) and (6.6) that I had written down so far were equations
- they were to be solved to find the unknowns $k$ and $\nu_f$, given the
frequency of the incident photon. However, (6.8) is an identity - one
that must be satisfied by the kinetic energy and the momentum of
the electron, no matter what its speed or $k$ is. This is because,
firstly, (6.8) expresses a relation involving the electron alone - all
trace of the photon has cancelled from it! So, (6.8) is valid for
whatever $k$ factor the electron can have after the collision - but, and
here is the crux of the matter, this can be any value whatsoever -
all you have to do is to vary the incoming photon’s frequency!
Do our classical relations for the momentum and the kinetic
energy satisfy the identity (6.8)? The answer, as you can easily
check, is no! Classically, the kinetic energy to momentum ratio is

$$
\frac{T}{pc} = \frac{\frac{1}{2} m u^2}{c \, m u} = \frac{\beta}{2} \neq \frac{k - 1}{k + 1}
$$

Clearly the classical definition of the momentum needs revision!
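
To get a feel for how the two ratios compare, here is a small numerical sketch in Python. The function names are illustrative choices, and I have used $k = \sqrt{(1+\beta)/(1-\beta)}$, which is just the relation $\beta = (k^2-1)/(k^2+1)$ turned around.

```python
import math

def doppler_factor(beta):
    """k for a given beta, from inverting beta = (k^2 - 1) / (k^2 + 1)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def required_ratio(beta):
    """T / (p c) demanded by the identity (6.8)."""
    k = doppler_factor(beta)
    return (k - 1.0) / (k + 1.0)

def classical_ratio(beta):
    """T / (p c) from the classical formulae p = m u, T = m u^2 / 2."""
    return beta / 2.0

# Compare the two ratios from slow to fast motion.
for beta in (0.001, 0.1, 0.5, 0.9):
    print(beta, classical_ratio(beta), required_ratio(beta))
```

For small $\beta$ the two columns agree closely, while at $\beta = 0.9$ they differ in the first decimal place.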


Since this is a rather shocking conclusion (although, with what
you have been subjected to in the previous sections on kinematics
with a flashlight, you may have been expecting such shocks!), let
us take a deeper look at the argument above. Note that the classical
results were holding the fort up to (6.6) - till that point there
was no reason to believe that they could not fit into our scheme of
things. The switch occurs when we show that if momentum conser-
vation is to be valid in another frame, too, then we must have (6.7)
and this leads to (6.8) which really leads to the breakdown in the
classical formulae. In other words, the classical formula for the
momentum can not be valid if we want its conservation to be true
in all inertial frames!
One more point - in deriving (6.8) I had to assume that kinetic
energy, as well as momentum, is conserved. You may see a glimmer
of hope there. There are inelastic collisions in classical mechanics
too, where kinetic energy is not conserved. So, perhaps
there is nothing wrong with our classical formulae - the failure of
(6.8) is simply telling us that the electron-photon collision is, for
some reason, an inelastic one? As I will show later, this possibility,
too, is a no go - we have no option but to revise our age old formula
for the momentum!
You may be wondering why I am harping so much on the classical
formula for the momentum being wrong. After all, all I have
shown is that the classical ratio between the kinetic energy and
the momentum of the electron is wrong. Couldn’t it be that the
classical momentum formula survives the rigors of STR - and it is
the classical kinetic energy formula that fails? One look at (6.1)
will immediately tell you that this is impossible - I can derive the
formula for $T$ from the expression for $p$, so $p = mu$ will always lead
to $T = \frac{1}{2} m u^2$ and hence the wrong ratio. Of course, if we are
forced to change the formula for $p$, that for $T$ will have to change
also.
Now although relativity tells us that the ratio $\frac{T}{pc}$ is
$\frac{k-1}{k+1}$ and not really the classical $\frac{\beta}{2}$, it must be
close to this ratio for velocities much smaller than that of light,
i.e. for small values of $\beta$. You should check that this, really, is
true.

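If the electron moves slowly, we may use the classical
relations to solve the problem. Eliminating $\nu_f$ between (6.4)
and (6.5) gives

$$
m u + \frac{1}{2c} m u^2 = \frac{2 h \nu_i}{c}
$$

(it is more convenient to use $u$ than $k$ for the nonrelativistic
situation) giving, to the first approximation

$$
u \approx \frac{2 h \nu_i}{m c}
$$

which means that

$$
\nu_f \approx \nu_i - \frac{2h}{m c^2} \nu_i^2 .
$$

Are these values reasonable? If my flashlight emits yellow
light (like reasonable flashlights) of a wavelength 600 nm,
then the velocity of our electron comes out to be about
$2.4 \times 10^3$ m s$^{-1}$ - tiny compared to the speed of light, so
using the classical relations here is perfectly consistent. Even
so, a kick of a couple of kilometres per second from a single
photon is no gentle nudge. The energy of each photon may be
small - but to the electron, they are really like cannonballs!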
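
The arithmetic behind this estimate is easy to redo. With the classical $p = mu$ and $T = \frac{1}{2} m u^2$, multiplying (6.4) by $c$ and adding (6.5) gives $2 h \nu_i = m c u + \frac{1}{2} m u^2$, and for $u \ll c$ the first term on the right dominates, so $u \approx 2 h \nu_i / (m c)$. The Python sketch below evaluates this for 600 nm light. The constants are rounded standard values that I have typed in by hand, and the variable names are illustrative.

```python
# Rounded standard values of the constants (SI units).
h = 6.62607015e-34       # Planck's constant, J s
c = 2.99792458e8         # speed of light, m / s
m_e = 9.1093837e-31      # electron mass, kg

wavelength = 600e-9      # yellow light, m
nu_i = c / wavelength    # incident photon frequency, Hz

# First approximation for the recoil speed: u ~ 2 h nu_i / (m c).
u = 2.0 * h * nu_i / (m_e * c)

# Frequency of the reflected photon, from (6.5) with T = m u^2 / 2.
nu_f = nu_i - 2.0 * h * nu_i ** 2 / (m_e * c ** 2)

print("recoil speed (m/s):", u)
print("u / c:", u / c)
print("fractional frequency shift:", (nu_i - nu_f) / nu_i)
```

Since $u/c$ comes out far below unity, dropping the $\frac{1}{2} m u^2$ term and using the classical formulae is a safe approximation for this particular flashlight.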

6.4 The momentum in STR


How can we ever hope to solve the problem if we don’t know either
$p(k)$ or $T(k)$? The whole point is, we can use our central principle
of relativity - all inertial frames are equally good - to actually
deduce what these functions can be. For our first deduction, we won’t
have to look any further: we already have all the ingredients for
deducing $p(k)$ and $T(k)$ in the identity (6.8).
All we need is the relation, obvious from (6.1), that

$$
\frac{dT}{dp} = u = c\beta = c \, \frac{k^2 - 1}{k^2 + 1} \tag{6.9}
$$

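Taking the logarithm of both sides of (6.8) and differentiating gives
us

$$
\frac{1}{T} \frac{dT}{dp} = \left( \frac{1}{k - 1} - \frac{1}{k + 1} \right) \frac{dk}{dp} + \frac{1}{p}
$$

which, in conjunction with (6.9), leads to

$$
\frac{k^2 - 1}{k^2 + 1} \, \frac{k + 1}{k - 1} \, \frac{1}{p} = \frac{2}{k^2 - 1} \frac{dk}{dp} + \frac{1}{p}
$$

and finally to

$$
\frac{dp}{p} = \frac{k^2 + 1}{k \left( k^2 - 1 \right)} \, dk . \tag{6.10}
$$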
Equation (6.10) can be integrated rather trivially to get

$$
p = \frac{k^2 - 1}{k} A
$$

where $A$ is an arbitrary constant of integration. This result can
obviously be written in the form

$$
p = 2 \beta \gamma A
$$

and all that is left is to identify the constant $A$. For this, just
remember that for small velocities, this formula must reduce to
the classical result $mu = mc\beta$. This means that $A = \frac{1}{2} mc$
and the new relativistic formula for the momentum is

$$
p = \frac{k^2 - 1}{2k} \, mc = mc\beta\gamma = \gamma m u = \frac{m u}{\sqrt{1 - \frac{u^2}{c^2}}} . \tag{6.11}
$$

As for the kinetic energy, applying (6.8) immediately gives

$$
T = \frac{(k - 1)^2}{2k} \, mc^2 = (\gamma - 1) mc^2 = mc^2 \left( \frac{1}{\sqrt{1 - \frac{u^2}{c^2}}} - 1 \right) . \tag{6.12}
$$
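
As a sanity check on (6.11) and (6.12), the Python sketch below verifies three things numerically: that the ratio $T/(pc)$ reproduces the identity (6.8), that $dT/dp$ equals the speed as (6.9) demands, and that the momentum goes over to $mc\beta$ when $k$ is close to unity. I have set $m = c = 1$ for convenience and picked the sample value $k = 2$, both of which are my own choices, as are the function names.

```python
def momentum(k, m=1.0, c=1.0):
    """Relativistic momentum (6.11) as a function of the Doppler factor."""
    return m * c * (k * k - 1.0) / (2.0 * k)

def kinetic_energy(k, m=1.0, c=1.0):
    """Relativistic kinetic energy (6.12) as a function of the Doppler factor."""
    return m * c * c * (k - 1.0) ** 2 / (2.0 * k)

def speed(k, c=1.0):
    """u = c (k^2 - 1) / (k^2 + 1)."""
    return c * (k * k - 1.0) / (k * k + 1.0)

k = 2.0                                   # sample Doppler factor (beta = 0.6)

# Identity (6.8): T / (p c) = (k - 1) / (k + 1), here with c = 1.
ratio = kinetic_energy(k) / momentum(k)
identity = (k - 1.0) / (k + 1.0)

# Relation (6.9): dT/dp = u, estimated with a central difference in k.
dk = 1e-6
dT_dp = ((kinetic_energy(k + dk) - kinetic_energy(k - dk))
         / (momentum(k + dk) - momentum(k - dk)))

# Low speed limit: p should approach the classical m u.
k_slow = 1.0001
slow_ratio = momentum(k_slow) / speed(k_slow)

print(ratio, identity)
print(dT_dp, speed(k))
print(slow_ratio)
```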

6.4.1 The speed limit


This expression for the kinetic energy helps explain why you can’t
accelerate a massive particle to the speed of light. Note that as
the speed of the particle approaches the speed of light, its kinetic
energy increases without bound. Remember that the amount by
which the kinetic energy of a particle rises is equal to the work done
on it [2]. So, by keeping on doing work on a massive particle you can
keep on raising its kinetic energy. However, as the particle’s speed
gets closer and closer to the speed of light, its kinetic energy can
suffer a huge increase without any appreciable gain in the speed.
Since, no matter how hard and long you keep at it, you can do only a
finite amount of work, you can make the speed of a massive particle
approach that of light - but you can never exactly attain that speed!

[2] If you are wondering whether this result, which you knew as the
work-energy theorem in Newtonian mechanics, is still valid in STR, let
me hasten to assure you that it is. After all, this is built into the
definition of the kinetic energy, equation (6.1)!

What about the photon - the particle that has to move with the
speed of light? If (6.11) and (6.12) are valid for the photon, then
it would land up with infinite amounts of momentum and energy.
The only way out, then, is to assume that the mass $m$ of the photon
is - zero! Since the factor $\gamma$ is $\infty$ for a particle moving with
the speed of light, the product $m\gamma$ is not well defined - but is
consistent with any finite value. The momentum carried by a photon
has nothing to do with its speed - indeed, quantum mechanics tells us
that it is controlled by its wavelength. One point, though - if you
take the limit $m \to 0$ and $\gamma \to \infty$ in such a way that the ratio
$p/u \to p/c = m\gamma$ stays finite, then the expression for the kinetic
energy becomes

$$
T = mc^2 (\gamma - 1) \to \frac{p}{c} \, c^2 = pc
$$

which is exactly the expression that we used for the photon kinetic
energy, (6.3).
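
You can watch this limit happen numerically. The Python sketch below keeps the momentum fixed, makes the mass smaller and smaller, and evaluates the kinetic energy from (6.12). To find the Doppler factor for a given $p$ and $m$ I have solved (6.11), $p = mc (k^2 - 1)/(2k)$, as a quadratic in $k$, a step that is mine rather than the text's. The units ($c = 1$, $p = 1$) and the function names are likewise illustrative.

```python
import math

def doppler_from_momentum(p, m, c=1.0):
    """Positive root of p = m c (k^2 - 1) / (2 k), solved for k."""
    x = p / (m * c)
    return x + math.sqrt(x * x + 1.0)

def kinetic_energy(k, m, c=1.0):
    """Relativistic kinetic energy (6.12)."""
    return m * c * c * (k - 1.0) ** 2 / (2.0 * k)

p = 1.0                                   # fixed momentum, in units with c = 1
results = {}
for m in (1.0, 1e-2, 1e-4, 1e-6):
    k = doppler_from_momentum(p, m)
    results[m] = kinetic_energy(k, m)
    print(m, k, results[m])               # T creeps up towards p c = 1
```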
What about faster than light particles, tachyons - which used
to be (and still are, perhaps to a somewhat lesser extent) the staple
of science fiction writers everywhere? If you naively put in a
velocity larger than $c$ in the relativistic expressions for momentum
and kinetic energy, (6.11) and (6.12), you get imaginary values - which
seems to rule out faster than light travel. A better way to interpret
this is simply to follow what we started this section with - it takes
an unbounded amount of energy to speed up a particle moving slower
than light to the speed of light. Since you will not be able to speed a
particle up to the speed of light, the question of going beyond does
not arise!
If you think about this a bit, you will realize that there is nothing
in this argument about particles that have always been moving
faster than light. Such particles, on the other hand, can never
be decelerated below the speed of light. Even slowing them down
to the speed of light will require you to do an infinite amount of
negative work! So, the upshot of it all is, from the point of view of
kinetics, all that you can say is that any subluminal (slower than
light) particles will have to stay subluminal, superluminal (faster
than light) particles superluminal, and massless particles will have
to move with the speed of light!
A much stronger argument against tachyons is the one based
on causality that we saw in section 3.9. If a tachyonic particle
could carry the information about the occurrence of an event A to
the location of another event B, we saw there that it is possible
for a slower than light observer to see B happen before A - thus
leading to causality violation - which is a strict no-no. Of course, if
tachyonic particles exist which do not interact in any way with the
particles of our subluminal world, there will be no violation of
causality. However, such particles could never be detected
experimentally even if they did exist - and they would not belong to
the realm of physics!

6.4.2 A “better” derivation


The above derivation of the relativistic formulae for momentum and
kinetic energy has one problem that you may have noticed. As I
have warned before, one can, perhaps, object to it on the grounds
that (6.8), on which it is based, relies not only on momentum
conservation, but also on that of kinetic energy. I will now show you
a better derivation - one that does not rely on the conservation of
kinetic energy at all!
The idea behind this derivation is the same as always - if
momentum conservation is to be a valid law of physics, then it is going
to be equally valid in all reference frames. So far, we have applied
momentum conservation to our system of a photon bouncing off an
electron from two different inertial frames, one in which the
electron is initially at rest and one in which it is finally at rest.
Let us now write down the law of momentum conservation in an arbitrary
frame, one travelling to the left with respect to our original inertial
frame with a Doppler factor of $K^{-1}$. The reason I have chosen an
observer moving to the left instead of the right is simply to make
the algebra slightly simpler (you should verify that the conclusion
stays unchanged if the observer were travelling to the right). Again,
I have chosen to write the observer’s Doppler factor as $K^{-1}$ as
opposed to $K$ merely to ensure that $K$ is larger than 1 - a choice that
marginally helps in keeping track of the terms.
The whole point behind using the Doppler factor instead of the
speed should become clear now. In section 3.4, I showed you that
the velocity addition formula is much simpler in terms of the
former, rather than the latter. In our new frame, the electron
initially has a Doppler factor of $K$, while after collision its
Doppler factor becomes $kK$. What about the photon? The observer is
approaching the incident photon and hence she sees it blue-shifted to
a frequency $\nu_i K$, while the final photon is red-shifted to
$\nu_f / K$. Thus, in this new frame, the law of conservation of
momentum becomes

$$
p(kK) - p(K) = \frac{h}{c} \left( K \nu_i + \frac{\nu_f}{K} \right)
$$

Using (6.4) and (6.7), this becomes

$$
\frac{p(kK) - p(K)}{p(k)} = \frac{K^2 k + 1}{K (k + 1)} \tag{6.13}
$$

A moment’s thought will tell you that our function $p(k)$ must satisfy
(6.13) not only for all values of $K$, but also for all values of $k$! The
task, then, is to deduce the functional form for a function satisfying
this equation.

While solving functional equations is quite a difficult task in
general, there is a simple trick that helps us to solve (6.13). The
right hand side of equation (6.13) becomes simply $\frac{K^2 + 1}{2K}$
when $k = 1$, but the left hand side, for this value of $k$, is of the
form $\frac{0}{0}$. You can, of course, find the limit of the left hand
side as $k \to 1$, which is simply $\frac{K p'(K)}{p'(1)}$ from
l’Hospital’s rule. Thus

$$
p'(K) = \frac{K^2 + 1}{2K^2} \, p'(1) \tag{6.14}
$$

Since $p'(1)$ is a constant, this is easily solved to yield the formula
(6.11) that we have already found out before for the momentum.
Once I have shown you what the momentum is, you can simply
use (6.8) to find out the kinetic energy. In case you have qualms
about using (6.8) at all, you are welcome to carry out the
integration in (6.1) to check that you arrive at the same expression
for the kinetic energy as (6.12).
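
It is reassuring to check directly that the relativistic momentum (6.11) does satisfy the functional equation (6.13) for arbitrary $k$ and $K$, and that the classical momentum does not. The Python sketch below does this for a few sample values. I have set $m = c = 1$, and the sample values and function names are arbitrary choices of mine.

```python
def p_relativistic(k):
    """Momentum (6.11) with m = c = 1."""
    return (k * k - 1.0) / (2.0 * k)

def p_classical(k):
    """Classical momentum m u with m = c = 1, u = (k^2 - 1) / (k^2 + 1)."""
    return (k * k - 1.0) / (k * k + 1.0)

def lhs(p, k, K):
    """Left hand side of (6.13) for a candidate momentum function p."""
    return (p(k * K) - p(K)) / p(k)

def rhs(k, K):
    """Right hand side of (6.13)."""
    return (K * K * k + 1.0) / (K * (k + 1.0))

samples = [(1.5, 1.2), (2.0, 3.0), (4.0, 1.7)]   # (k, K) pairs

worst_relativistic = max(abs(lhs(p_relativistic, k, K) - rhs(k, K))
                         for k, K in samples)
worst_classical = max(abs(lhs(p_classical, k, K) - rhs(k, K))
                      for k, K in samples)

print(worst_relativistic)   # zero up to rounding
print(worst_classical)      # clearly non-zero
```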

6.5 Mass is energy


The expression for kinetic energy of a point particle is, then, not
$\frac{1}{2} m u^2$ but $(\gamma - 1) mc^2$. The two expressions match, as
they must, when the velocity is small and of course they both vanish
for a stationary particle. Now, the fact that the kinetic energy
vanishes for a stationary particle makes eminent sense - it is meant
to be the energy that a body has due to its motion, after all!
Remember, though, that only energy differences have a physical meaning
and not actual values of the energy. So, you are always free to add any
constant to the energy without worrying about the physical
consequences. Given the choice, what would you think is the most
natural constant to add to the expression (6.12) for the kinetic
energy? I am sure most of you would like to add the constant $mc^2$
so that the right hand side becomes a lot neater! Of course, this
means that the energy of a stationary particle is no longer zero but
$mc^2$ - so calling it the kinetic energy is not very justified.
Physicists call this quantity the total energy $E$, and hence
expression (6.12) becomes

$$
E = T + mc^2 = (\gamma - 1) mc^2 + mc^2 = \gamma mc^2 = \frac{mc^2}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{6.15}
$$

an equation which you must all know as the one that has made
Einstein a household name [3]!
[3] Perhaps you are more used to hearing $E = mc^2$ instead of
$E = \gamma mc^2$. Wait till section 6.9 to see how these two expressions
are really one and the same!

There you have it - $E = \gamma mc^2$ - arguably the most famous
equation in physics! Yet, I am sure you are feeling a bit disappointed
with the way in which I introduced it. “Is it true, then -” you must
be asking yourselves, “that physics’ most famous equation is a bit
of a sham - simply an attempt to window dress an expression for
kinetic energy so that it looks a little better?” If all collisions in
the world were elastic, then all this new expression would have
been is a mere rewrite of (6.12). Take heart, though: in the case
of processes in which the kinetic energy does not balance out,
equation (6.15) takes on a new, deeper significance - one that we will
now explore.
Let’s start by considering the simplest example of an inelastic
collision that I can think of - two particles of identical mass
$m$ colliding head on with equal and opposite velocities,
corresponding to Doppler factors of $k$ and $k^{-1}$, respectively, and
sticking together. You know that after the collision, the composite
particle will be standing still. From your high school education, you
would conclude that the final particle has a mass of $2m$, and the
total kinetic energy of $\frac{1}{2} m u^2 \times 2 = m u^2$ that the
incident pair had will be converted to heat energy that will, at
least initially, heat up the composite body. Let’s see what our
new-found laws of relativistic mechanics tell us about this scenario.
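To feel the real impact of STR on this simple problem, let me now
describe the same collision from the point of view of another inertial
frame, moving with a Doppler factor of $K^{-1}$ with respect to our first
one. In this frame, the two colliding particles have Doppler factors
of $kK$ and $k^{-1}K$, respectively, and the final, composite particle
one of $K$. Now, the net initial momentum of the system works out to be

$$
\begin{aligned}
p_i &= p(kK) + p\left( \frac{K}{k} \right) \\
 &= \frac{1}{2} mc \left( kK - \frac{1}{kK} \right) + \frac{1}{2} mc \left( \frac{K}{k} - \frac{k}{K} \right) \\
 &= \frac{mc}{2kK} \left( k^2 K^2 - 1 + K^2 - k^2 \right) \\
 &= \frac{mc}{2kK} \left( K^2 - 1 \right) \left( k^2 + 1 \right) \\
 &= \frac{1}{2} mc \left( K - \frac{1}{K} \right) \left( k + \frac{1}{k} \right) \\
 &= \frac{1}{2} Mc \left( K - \frac{1}{K} \right)
\end{aligned}
$$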

where I have chosen to write the combination $m \left( k + \frac{1}{k} \right)$
as $M$. Since the law of conservation of momentum, if it is to be a
valid physical law, must be valid in all inertial frames, this must be
the momentum of the composite particle in the new frame. But the
Doppler factor of the composite particle in this frame being $K$, the
mass of the composite particle must be $M$! Note that though we used a
new frame to find out the mass of the composite particle, this value
is independent of the frame of reference. Thus the final particle has
a mass of, not $2m$, but

$$
M = 2m \times \frac{1}{2} \left( k + \frac{1}{k} \right) = 2m\gamma \tag{6.16}
$$

which is larger than the net initial mass by $2 (\gamma - 1) m$. The
conservation of mass, one of the oldest laws that we had learned in
the physical sciences, is not valid in relativity!
In our first frame, the net kinetic energy of the incident particles
was $2 (\gamma - 1) mc^2$, while that of the final particle was, of course,
zero. Thus, what you lose in kinetic energy is exactly what you gain
in mass, once you take the factor of $c^2$ into account. You may
be suspicious that this nice result may be valid only in the initial
frame, so let us try out the calculation in the new arbitrary frame.

$$
\begin{aligned}
T_i &= T(kK) + T\left( \frac{K}{k} \right) \\
 &= \frac{1}{2} mc^2 \left( kK + \frac{1}{kK} \right) - mc^2 + \frac{1}{2} mc^2 \left( \frac{K}{k} + \frac{k}{K} \right) - mc^2 \\
 &= \frac{mc^2}{2kK} \left( k^2 K^2 + 1 + K^2 + k^2 \right) - 2mc^2 \\
 &= \frac{1}{2} Mc^2 \left( K + \frac{1}{K} \right) - 2mc^2
\end{aligned}
$$

where I have used (6.16). Now, the first term on the right hand side
above is the total energy $E_f = T_f + Mc^2$ of the final particle, and
thus

$$
T_i = T_f + (M - 2m) c^2
$$

and this means that even in this frame, the increase in mass is

$$
M - 2m = \frac{1}{c^2} \left( T_i - T_f \right) \tag{6.17}
$$

- i.e., the decrease in the kinetic energy, divided by $c^2$.
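
The bookkeeping for this sticky collision can be checked numerically in any frame you like. In the Python sketch below I use the relativistic formulae $p = \frac{1}{2} mc (k - 1/k)$ and $T = mc^2 (k-1)^2 / (2k)$, take the composite mass to be $M = m (k + 1/k)$, and compare momentum, kinetic energy loss and mass gain in a frame with Doppler factor $K^{-1}$. The units ($c = 1$), the sample values of $m$, $k$ and $K$, and the function names are illustrative assumptions.

```python
def momentum(k, m, c=1.0):
    """Relativistic momentum, p = (1/2) m c (k - 1/k)."""
    return 0.5 * m * c * (k - 1.0 / k)

def kinetic_energy(k, m, c=1.0):
    """Relativistic kinetic energy, T = m c^2 (k - 1)^2 / (2 k)."""
    return m * c * c * (k - 1.0) ** 2 / (2.0 * k)

def total_energy(k, m, c=1.0):
    """E = T + m c^2 = (1/2) m c^2 (k + 1/k)."""
    return 0.5 * m * c * c * (k + 1.0 / k)

m, k, K = 1.0, 3.0, 1.7        # sample mass and Doppler factors
M = m * (k + 1.0 / k)          # composite mass, as in (6.16)

# Momentum before and after, seen from the frame with Doppler factor 1/K.
p_initial = momentum(k * K, m) + momentum(K / k, m)
p_final = momentum(K, M)

# Kinetic energy lost versus mass gained (c = 1), as in (6.17).
ke_loss = (kinetic_energy(k * K, m) + kinetic_energy(K / k, m)
           - kinetic_energy(K, M))
mass_gain = M - 2.0 * m

# Total energy before and after.
E_initial = total_energy(k * K, m) + total_energy(K / k, m)
E_final = total_energy(K, M)

print(p_initial, p_final)
print(ke_loss, mass_gain)
print(E_initial, E_final)
```

Changing `K` moves you to a different inertial frame. The individual momenta and energies change, but the three pairs of numbers printed above keep agreeing.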


CHAPTER 6. KINETICS IN RELATIVITY 191

Another way of writing the equation above immediately suggests


itself. Equation (6.17) can be easily rewritten into

Ti + 2mc2 = Tf + M c2

which means that


Ei = Ef .

So, the quantity E that we defined by adding mc2 to the kinetic


energy has a great significance of its own. In this inelastic two body
process, it is E which is conserved - not T or the total mass! This
is a big departure from classical physics - in classical physics the
conservation of energy and that of mass were two separate laws -
here they are rolled up into one common conservation law! Note
that all that we needed to show this is that momentum conserva-
tion is a valid physical law (and hence equally valid in all inertial
frames). The only thing that we need to check now is whether this
is a one-off result - valid only for two particles colliding and sticking
together, or is this a much more general law of physics.
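The balance expressed by (6.16) and (6.17) is easy to check numerically. The following Python sketch is only an illustration: the function names and the choice of units with c = 1 are assumptions made here, not something taken from the text. It computes the initial and final kinetic energies for several observers (several values of K) and compares their difference with the gain in mass.

```python
C = 1.0  # speed of light; units with c = 1 are an assumption made for convenience


def energy(m, k):
    """Total energy E = (m c^2 / 2)(k + 1/k) of a particle with Doppler factor k."""
    return 0.5 * m * C**2 * (k + 1.0 / k)


def kinetic(m, k):
    """Kinetic energy T = E - m c^2."""
    return energy(m, k) - m * C**2


def collision_balance(m, k, K):
    """Two equal masses m with Doppler factors k and 1/k stick together.

    K is the Doppler factor of the composite particle as seen by the
    new observer. Returns (T_i, T_f, M).
    """
    M = m * (k + 1.0 / k)  # equation (6.16)
    T_i = kinetic(m, k * K) + kinetic(m, K / k)
    T_f = kinetic(M, K)
    return T_i, T_f, M


if __name__ == "__main__":
    m, k = 1.0, 3.0
    for K in (1.0, 2.0, 0.25):
        T_i, T_f, M = collision_balance(m, k, K)
        # The two printed numbers should agree for every K.
        print(K, (M - 2 * m) * C**2, T_i - T_f)
```

Whatever value of K is chosen, the loss in kinetic energy matches the gain in rest energy, which is the content of (6.17).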

6.6 The conservation of energy from that of momentum
Consider a general process where the initial state has particles of
mass $m_1, m_2, \ldots, m_r$ which are moving with Doppler factors
$k_1, k_2, \ldots, k_r$, respectively. In the final state, the number and
masses of particles may be different⁴. Let the masses and Doppler factors be
$\mu_1, \mu_2, \ldots, \mu_s$ and $\kappa_1, \kappa_2, \ldots, \kappa_s$ in the final state.
4
Imagine the decay of the neutron. The initial state has one particle only - the
neutron. The final state has three - the proton, the electron and the antineu-
trino.
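The law of conservation of momentum in this frame is, then

$$\begin{aligned}
& m_1 c\,\frac{1}{2}\left(k_1 - \frac{1}{k_1}\right) + m_2 c\,\frac{1}{2}\left(k_2 - \frac{1}{k_2}\right) + \ldots + m_r c\,\frac{1}{2}\left(k_r - \frac{1}{k_r}\right) = \\
& \qquad \mu_1 c\,\frac{1}{2}\left(\kappa_1 - \frac{1}{\kappa_1}\right) + \mu_2 c\,\frac{1}{2}\left(\kappa_2 - \frac{1}{\kappa_2}\right) + \ldots + \mu_s c\,\frac{1}{2}\left(\kappa_s - \frac{1}{\kappa_s}\right)
\end{aligned} \tag{6.18}$$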

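Allow me to bring in our old friend - the observer moving with a
Doppler factor $K^{-1}$. She sees the initial Doppler factors as
$k_1K, k_2K, \ldots, k_rK$ and the final ones as
$\kappa_1K, \kappa_2K, \ldots, \kappa_sK$. If momentum is to be conserved
for her, too, we have

$$\begin{aligned}
& m_1 c\,\frac{1}{2}\left(k_1K - \frac{1}{k_1K}\right) + \ldots + m_r c\,\frac{1}{2}\left(k_rK - \frac{1}{k_rK}\right) = \\
& \qquad \mu_1 c\,\frac{1}{2}\left(\kappa_1K - \frac{1}{\kappa_1K}\right) + \ldots + \mu_s c\,\frac{1}{2}\left(\kappa_sK - \frac{1}{\kappa_sK}\right)
\end{aligned} \tag{6.19}$$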

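which can be easily rearranged to give

$$\begin{aligned}
& \frac{c}{2}\left[\left(m_1k_1 + m_2k_2 + \ldots + m_rk_r\right)K - \left(\frac{m_1}{k_1} + \frac{m_2}{k_2} + \ldots + \frac{m_r}{k_r}\right)\frac{1}{K}\right] = \\
& \qquad \frac{c}{2}\left[\left(\mu_1\kappa_1 + \mu_2\kappa_2 + \ldots + \mu_s\kappa_s\right)K - \left(\frac{\mu_1}{\kappa_1} + \frac{\mu_2}{\kappa_2} + \ldots + \frac{\mu_s}{\kappa_s}\right)\frac{1}{K}\right]
\end{aligned} \tag{6.20}$$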

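Since this is to be equally true for all values of K, the coefficients
of K and of 1/K must match separately on the two sides, so we must have

$$m_1k_1 + m_2k_2 + \ldots + m_rk_r = \mu_1\kappa_1 + \mu_2\kappa_2 + \ldots + \mu_s\kappa_s \tag{6.21}$$

and

$$\frac{m_1}{k_1} + \frac{m_2}{k_2} + \ldots + \frac{m_r}{k_r} = \frac{\mu_1}{\kappa_1} + \frac{\mu_2}{\kappa_2} + \ldots + \frac{\mu_s}{\kappa_s} \tag{6.22}$$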

- a rather striking result!


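One immediate consequence of (6.21) and (6.22) is

$$\begin{aligned}
& \frac{c^2}{2}\left[\left(m_1k_1 + m_2k_2 + \ldots + m_rk_r\right) + \left(\frac{m_1}{k_1} + \frac{m_2}{k_2} + \ldots + \frac{m_r}{k_r}\right)\right] = \\
& \qquad \frac{c^2}{2}\left[\left(\mu_1\kappa_1 + \mu_2\kappa_2 + \ldots + \mu_s\kappa_s\right) + \left(\frac{\mu_1}{\kappa_1} + \frac{\mu_2}{\kappa_2} + \ldots + \frac{\mu_s}{\kappa_s}\right)\right]
\end{aligned} \tag{6.23}$$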

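which is nothing but the equation of conservation of energy,

$$m_1c^2\left(\frac{k_1^2 + 1}{2k_1}\right) + \ldots + m_rc^2\left(\frac{k_r^2 + 1}{2k_r}\right) = \mu_1c^2\left(\frac{\kappa_1^2 + 1}{2\kappa_1}\right) + \ldots + \mu_sc^2\left(\frac{\kappa_s^2 + 1}{2\kappa_s}\right) \tag{6.24}$$
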
Although I have written down this equation for the observer I started
out with, it is rather trivial to show that this conservation will work
for any arbitrary observer (show it!).
Thus, demanding that the law of conservation of momentum is
equally valid for all inertial observers leads to the law of conserva-
tion of energy as well, for a rather general process. You may have
already caught on to this - this works just as well the other way
round, too. Demanding that energy is conserved in all frames leads
to momentum conservation. Surely, then, there must be a deeper
connection between momentum and energy in relativity? There is!
I will discuss this connection in the very next section.
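The "show it!" exercise above can be explored numerically. The sketch below is a hedged illustration, not a proof: the decay chosen (a particle of mass 2.5 at rest breaking into two unit masses with Doppler factors 2 and 1/2), the function names and the units with c = 1 are all assumptions made for the example. The numbers are picked so that (6.21) and (6.22) hold, and the code then confirms that both momentum and energy balance for several other observers.

```python
C = 1.0  # units with c = 1 (an assumption for convenience)


def momentum(m, k):
    """p = (m c / 2)(k - 1/k)."""
    return 0.5 * m * C * (k - 1.0 / k)


def energy(m, k):
    """E = (m c^2 / 2)(k + 1/k)."""
    return 0.5 * m * C**2 * (k + 1.0 / k)


def totals(particles, K=1.0):
    """Total (p, E) of a list of (mass, Doppler factor) pairs.

    The observer moving with Doppler factor 1/K sees each factor k as k*K.
    """
    p = sum(momentum(m, k * K) for m, k in particles)
    E = sum(energy(m, k * K) for m, k in particles)
    return p, E


# Hypothetical decay, chosen so that (6.21) and (6.22) are satisfied:
# sum of m*k: 2.5 = 1*2 + 1*0.5, and sum of m/k: 2.5 = 1/2 + 1/0.5.
initial = [(2.5, 1.0)]
final = [(1.0, 2.0), (1.0, 0.5)]

if __name__ == "__main__":
    for K in (1.0, 1.5, 4.0, 0.2):
        print(K, totals(initial, K), totals(final, K))
```

For every K the two printed pairs coincide, as (6.19) and the boosted form of (6.24) require.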

6.7 The momentum four vector


So far, I have made use of the fact that any law of physics must
stay intact when you jump from one frame of reference to another
to deduce the form that momentum and energy must have. Let us
now take a look at how the relativistic momentum and energy do
transform under a Lorentz transformation.
Alice observes a particle to have the Doppler factor k. We have
already seen that this particle must have a momentum and energy
given by

$$p(k) = \frac{mc}{2}\left(k - \frac{1}{k}\right)$$

$$E(k) = \frac{mc^2}{2}\left(k + \frac{1}{k}\right)$$

which yields, rather trivially

$$k = \frac{1}{mc}\left(\frac{E}{c} + p\right) \tag{6.25-a}$$

$$\frac{1}{k} = \frac{1}{mc}\left(\frac{E}{c} - p\right) \tag{6.25-b}$$

One result that follows immediately by multiplying the two
equations above is

$$\frac{E^2}{c^2} - p^2 = m^2c^2$$

Although I have proven this for a one-dimensional world, the result is
perfectly valid in three space dimensions too - as we will see in a
while.
If Bob moves away from Alice with a Doppler factor K, he will
see the particle to have a Doppler factor of k/K. So, according to Bob,
the particle has momentum and energy given by

$$p'(k) = \frac{mc}{2}\left(\frac{k}{K} - \frac{K}{k}\right)$$

$$E'(k) = \frac{mc^2}{2}\left(\frac{k}{K} + \frac{K}{k}\right)$$

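You can use (6.25-a) and (6.25-b) to rewrite Bob’s measured values
in terms of those measured by Alice. This leads to

$$p' = \frac{1}{2}\left(K + \frac{1}{K}\right)p - \frac{1}{2}\left(K - \frac{1}{K}\right)\frac{E}{c}$$

$$\frac{E'}{c} = -\frac{1}{2}\left(K - \frac{1}{K}\right)p + \frac{1}{2}\left(K + \frac{1}{K}\right)\frac{E}{c}$$

and using our by now familiar values for the quantities
$\frac{1}{2}\left(K + \frac{1}{K}\right) = \gamma$ and
$\frac{1}{2}\left(K - \frac{1}{K}\right) = \gamma\beta$ leads to

$$p' = \gamma\left(p - \beta\frac{E}{c}\right) \tag{6.26-a}$$

$$\frac{E'}{c} = \gamma\left(-\beta p + \frac{E}{c}\right) \tag{6.26-b}$$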

Do the above equations look familiar? They should - they are
exactly the same as the transformations of x and ct! The all important
point right now, though, is that the value of the momentum in
Bob’s frame depends not only on the value of the momentum that
Alice measures - but also on the value of the energy according to
her. Thus, if only momentum is conserved according to Alice and
not the energy - then even the momentum will not be conserved
according to Bob. Thus, we can ultimately trace the connection
between the law of conservation of momentum and that of energy to
the fact that as far as the transformation from one inertial observer
to another is concerned, the pair (p, E/c) behaves just like the pair
(x, ct).
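The agreement between the Doppler-factor route and the Lorentz transformation (6.26) can be checked with a few lines of Python. This is a sketch under stated assumptions: units with c = 1, illustrative function names, and the standard k-calculus relation β = (K² − 1)/(K² + 1), which is consistent with γ = (K + 1/K)/2 and γβ = (K − 1/K)/2 used above.

```python
C = 1.0  # units with c = 1 (assumed)


def momentum(m, k):
    """p = (m c / 2)(k - 1/k)."""
    return 0.5 * m * C * (k - 1.0 / k)


def energy(m, k):
    """E = (m c^2 / 2)(k + 1/k)."""
    return 0.5 * m * C**2 * (k + 1.0 / k)


def boost(p, E, K):
    """Apply (6.26-a) and (6.26-b) for an observer receding with Doppler factor K."""
    gamma = 0.5 * (K + 1.0 / K)
    beta = (K * K - 1.0) / (K * K + 1.0)
    p_new = gamma * (p - beta * E / C)
    E_new = gamma * (E - beta * p * C)
    return p_new, E_new


if __name__ == "__main__":
    m, k, K = 2.0, 3.0, 1.5
    # Route 1: Bob simply sees the Doppler factor k/K.
    direct = (momentum(m, k / K), energy(m, k / K))
    # Route 2: Lorentz-transform Alice's values of p and E.
    transformed = boost(momentum(m, k), energy(m, k), K)
    print(direct, transformed)
```

Both routes give the same pair (p′, E′), and E²/c² − p² equals m²c² before and after the boost.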
To those of you who have not skipped the section on vectors and
scalars in spacetime, this should not come as a big surprise (if you
were the impatient sort, now is a good time to go back and read
section 5.5). After all, we have seen that four-vectors behave just
like spacetime coordinates under Lorentz transformations. So, it
seems that the pair (p, E/c) that we have been seeing so far is just a
part of some four-vector.
In Newtonian physics momentum is a three-vector - simply because
it is a product of a scalar - the mass, and a vector - the velocity.
The analogous four dimensional quantity is simple - all you
have to do to define the momentum four-vector is multiply mass
m by the four dimensional generalisation of the vector $\vec{u}$, which is
none other than the four-velocity that we met in subsection 5.5.4.
Thus, we have the four-momentum

$$p^\mu \equiv mU^\mu = m\gamma\left(c, \vec{u}\right) \tag{6.27}$$

Note that the spatial part of this four-vector is precisely $\gamma m\vec{u}$ - the
formula that we have found out a while ago for the accurate relativistic
form of the momentum (albeit in one dimensional form).
As an added bonus note that the temporal part of $p^\mu$ is γmc which
is just E/c - so that in one swoop we have found out the relativistic
versions of both the momentum and the energy!
In fact now you see why adding the rest energy mc² to the kinetic
energy to get the total energy E = γmc² was such a good idea!
Quite unwittingly, we had stumbled on the temporal part of the
four-vector whose spatial part is the momentum three-vector. This
also tells us why the demand that the conservation of momentum
be valid in all frames immediately leads to the conservation of en-
ergy as an added bonus. It is actually the four-momentum that is
conserved - the conservation of momentum and that of energy are
two pieces of this single conservation law!
One invariant quantity that can be built immediately, given
a four-vector, is its norm. Finding out the norm of the
energy-momentum four-vector is really trivial - because we already know
the norm of the four-velocity, $U^\mu$. So,

$$P \cdot P = P^\mu P_\mu = m^2 U^\mu U_\mu = m^2c^2 \tag{6.28}$$

- which is trivially an invariant. If you write out the norm of the
energy-momentum four-vector explicitly, this gives a very useful
identity

$$\frac{E^2}{c^2} - \vec{p}\cdot\vec{p} = m^2c^2$$

- a formula that we have seen a while ago, and which is often written
in the form

$$E^2 = p^2c^2 + m^2c^4 \tag{6.29}$$

often informally called, for obvious reasons, the relativistic
Pythagoras law.
Another invariant that is very often quite useful is the scalar
product of two four-momenta $P_1$ and $P_2$. A direct calculation yields

$$P_1 \cdot P_2 = m_1m_2\,\gamma(u_1)\,\gamma(u_2)\left(c^2 - \vec{u}_1\cdot\vec{u}_2\right)$$

However, a much more useful version of this invariant quantity can
be obtained if you jump over to the rest frame of one of the particles
- say the first one. Then, the two four-momenta are $\left(m_1c, \vec{0}\right)$ and
$m_2\gamma(v_{rel})\left(c, \vec{v}_{rel}\right)$, respectively, and the invariant product takes the
form

$$P_1 \cdot P_2 = m_1m_2\,\gamma(v_{rel})\,c^2 \tag{6.30}$$

Of course, this is identically equal to the expression that I found
by the direct calculation - all you have to do to realize this is look
at equation (5.24).
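The equality of the two forms of $P_1 \cdot P_2$ can be checked directly in one space dimension. The sketch below is illustrative only: it assumes units with c = 1, uses the usual one dimensional velocity subtraction formula for the relative velocity, and all function names are made up for the example.

```python
import math

C = 1.0  # units with c = 1 (assumed)


def gamma(v):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def four_momentum(m, u):
    """(E/c, p) for a particle of mass m moving with velocity u along x."""
    g = gamma(u)
    return (m * g * C, m * g * u)


def minkowski_dot(P, Q):
    """Scalar product with signature (+, -)."""
    return P[0] * Q[0] - P[1] * Q[1]


def relative_velocity(u1, u2):
    """Velocity of particle 2 as seen from the rest frame of particle 1."""
    return (u2 - u1) / (1.0 - u1 * u2 / C**2)


if __name__ == "__main__":
    m1, u1 = 1.0, 0.6
    m2, u2 = 2.0, -0.8
    direct = minkowski_dot(four_momentum(m1, u1), four_momentum(m2, u2))
    rest_frame = m1 * m2 * gamma(relative_velocity(u1, u2)) * C**2
    print(direct, rest_frame)
```

The two numbers agree, which is exactly what the invariance of the scalar product demands.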
A direct consequence of the above equation concerns the norm
of the net four-momentum of a system of particles

$$P = P_1 + P_2 + \ldots + P_n$$

Since the γ factor for any velocity obeys γ(v) ≥ 1 (with the equality
holding only for v = 0) we get

$$\begin{aligned}
P^2 &= P_1^2 + \ldots + P_n^2 + 2P_1\cdot P_2 + 2P_1\cdot P_3 + \ldots + 2P_{n-1}\cdot P_n \\
&= c^2\left[m_1^2 + \ldots + m_n^2 + 2m_1m_2\gamma(v_{12}) + \ldots + 2m_{n-1}m_n\gamma(v_{n-1,n})\right] \\
&\geq c^2\left[m_1^2 + \ldots + m_n^2 + 2m_1m_2 + \ldots + 2m_{n-1}m_n\right] \\
&= c^2\left(m_1 + \ldots + m_n\right)^2 = c^2M^2
\end{aligned}$$

where $M = m_1 + \ldots + m_n$ is the total mass of the system. Thus

$$P^2 \geq c^2M^2 \tag{6.31}$$

with equality holding only when all the particles are moving to-
gether and thus all the inter-particle relative velocities vanish.
This is as good a point as any to pay attention to a special case
- the massless particle. We have already seen one example - the
photon. As we have already seen, massless particles have to travel
at the speed of light and for them our energy-momentum formulae
need to be applied with some care. We have already seen that in
the massless limit

$$T = pc.$$

Now, for a photon, there is no rest energy - which means that E = T.
So, the photon obeys

$$E = pc \tag{6.32}$$

which, as you can easily verify, is what (6.29) reduces to for m = 0.
For a massless particle, the energy-momentum four-vector reduces
to

$$P^\mu \equiv \left(p, \vec{p}\right) \tag{6.33}$$

where $p = |\vec{p}|$. For a photon, quantum theory gives us the
momentum-wavevector relation

$$\vec{p} = \hbar\vec{k} \tag{6.34}$$

leading to the form

$$P_\gamma^\mu \equiv \hbar\left(k, \vec{k}\right) \tag{6.35}$$

One important property of the energy-momentum four-vector for
massless particles is obvious from (6.28) - the norm of such a
four-vector is

$$P^2 \equiv P^\mu P_\mu = 0. \tag{6.36}$$

What happens to our expression for $P_1 \cdot P_2$ if one of the two
particles is a photon? Of course, you can jump over to the frame
where the other particle is at rest, so that its four-momentum is
$\left(mc, \vec{0}\right)$, and this immediately gives

$$P_1 \cdot P_2 = \hbar k_0 mc \tag{6.37}$$

where $k_0$ is the wavenumber of the photon in the rest frame of the
other particle. This shortcut, of course, is not available when both
particles are photons - you can’t step into a photon’s “rest frame”!
In this case, we have to use the explicit expression

$$P_1 \cdot P_2 = \hbar^2\left(k_1k_2 - \vec{k}_1\cdot\vec{k}_2\right). \tag{6.38}$$

6.8 Covariance of the conservation of energy-momentum
The fact that the four-momentum is a bona fide four-vector may
lead you to think that this makes the conservation of momentum
manifestly covariant (as discussed in subsection 5.5.7) - and thus
I can avoid the issue of checking whether it stays equally valid
in all frames. There is a slightly subtle point to be wary of here,
though! It is true that the four-momentum of a single particle is a
four-vector, but is the total four-momentum of a system (after all,
this is what is conserved in a process) one? There seems to be no
doubt about this - the sum of two four-vectors is one, and this can
be easily extended to the sum of many four-vectors! The subtlety
comes in from the fact that when you change from one inertial
observer to another, you are not really summing up the same
four-momenta to get the total four-momentum! The point is, the total
four-momentum is the sum of all the four-momenta of the particles
in the system, but it has to be the sum of all the four-momenta at
a given instant. Jumping to another frame changes the meaning
of “a given instant” - which events are simultaneous and which are
not depends on the observer! So, the initial total four-momentum
$P_i$ (which is equal to the final total momentum $P_f$) goes over to $P'_i$
(which must equal $P'_f$) upon a switch of observers - but $P'_i$ is not
the same as the Lorentz-transformed $P_i$, neither is $P'_f$ the same as
the Lorentz-transformed $P_f$. If we denote the Lorentz-transformed
versions of $P_i$ and $P_f$ by $\bar{P}'_i$ and $\bar{P}'_f$, respectively, it is obvious that
$P_i = P_f$, which is the law of conservation of momentum for the
first observer, will imply $\bar{P}'_i = \bar{P}'_f$. Unfortunately, this is not the
law of conservation of momentum for the second observer, which
is $P'_i = P'_f$.
Why, then, do we believe in the conservation of four-momentum?
To explain this, I will take the help of a diagram, figure .

6.9 Do we need a relativistic mass?


You may be a bit confused with the equations I have written for
relativistic mechanics. Maybe the one that you will have the most
difficulty in digesting is E = γmc² - you must have heard from childhood
that Einstein’s famous formula for mass-energy equivalence
is E = mc² - so where did this extra factor of γ come from?
To see the reason behind this difference in what you always
knew and what I am claiming to be the correct expression we will
have to go back a bit - to the point where I introduced the correct
relativistic formula for the momentum.

$$p = \gamma mu = \frac{mu}{\sqrt{1 - \frac{u^2}{c^2}}}$$

At this point there are two possible courses of action open before
us. The first one, the one I have followed so far, is to gracefully
accept that the formula for momentum is not the old one of mass
times velocity, it is a different expression that, however, does re-
duce to the old formula in the case of slowly moving particles.
The other option is one that was very popular in the early days
of relativity. This one insisted that the momentum is still the same
old mass times velocity - only the mass is no longer the good old
constant quantity that we had been used to in the early days, but
one that increases with speed according to

$$m = \frac{m_0}{\sqrt{1 - \frac{u^2}{c^2}}} \tag{6.39}$$

Of course, all that has been done here is absorb the factor γ in m
to define this new quantity. In this second way of thinking about
things, the left hand side of the above equation is called the
relativistic mass - while $m_0$, which is what I have been calling the mass
m all this time, is the value of this relativistic mass at u = 0, and is
hence called the rest mass. This explains the mystery of the missing
γ in the arguably most famous equation of all physics - it is
simply that the m in E = mc² is the relativistic mass (our mγ).
Since bringing in the relativistic mass leads to simplification of two
of the major equations of relativistic mechanics - the definition of
the momentum and the formula for mass-energy equivalence - it
seems like an eminently good idea. The trouble here is that hiding
the factor γ in the definition of the mass only tends to conceal the
basic difference between Newtonian and Einsteinian kinetics - the
fact that the definition of momentum has changed! Perhaps more
importantly, it tends to make you think that all you have to do is to
replace the rest mass in each Newtonian formula by the relativistic
mass and you are done! That is not the case after all. Indeed
perhaps the best known equation of Newtonian mechanics

$$\vec{F} = m\vec{a}$$

does not generalize nicely at all to a new relativistic formula. I will
have a lot more to say about this in section 6.10. For the time
being, let me just say that in the context of the generalized second
law, we can talk of (at least) two kinds of mass - longitudinal mass
and transverse mass! Moreover, the transverse mass happens to
be numerically the same as the relativistic mass. Given all this
plethora of masses - the utility of the relativistic mass concept loses
some of its sheen.
Today, most people are of the opinion that the relativistic mass
is too much of a baggage to carry around for the little cosmetic
benefits it offers. What I have done here is conform to this trend.
Except in the equation (6.39) above, all my m’s denote the mass
of a particle - the parameter that once used to be called the rest
mass.
There is, perhaps, a sociological lesson here. When STR came
onto the scene - it brought with it the hope (or threat, depending
on your point of view) of sweeping all the old concepts away. At
that stage, the most striking feature of the theory was that certain
things that we had always held to be absolute actually turn out
to be observer dependent! This, after all, was summed up by the
catchphrase - “everything is relative!” At that point, the fact that
mass was seen as dependent on velocity was just one more addition
to the long list of “relative” entities. Today, perhaps, the emphasis
has shifted. We regard the theory of relativity today, not so much
as a theory of what is relative (observer dependent) but as a theory
in which the emphasis is on the fact that the laws of physics (all
of them) are observer independent! So much so, that many people
have strongly advocated a change in the name of the theory from the
“theory of relativity” to the “theory of invariance”! In the modern
way of looking at things, the stress is on quantities that do not
depend on the observer. The so called rest mass (also known as
the proper mass) of a particle is such a quantity. That is, to my
mind, the single most important reason for giving the (rest) mass
its rightful status as the mass.

6.10 Force and acceleration in STR


Now that we have been forced to accept a new definition of the mo-
mentum in order to ensure that the law of conservation of momen-
tum be Lorentz covariant, the natural question is - what happens
to the central equation of Newtonian kinetics - the second law?
Remarkably, the second law stays intact - not in the form

$$\vec{F} = m\vec{a} \tag{6.40}$$

that may be familiar to you, but in the form originally expressed by
Newton

$$\vec{F} = \frac{d\vec{p}}{dt}. \tag{6.41}$$

The only difference is, here we have to use the new definition of
the momentum, $\vec{p} = \gamma m\vec{u}$, instead of our old formula $\vec{p} = m\vec{u}$. Since
the factor γ depends on $\vec{u}$, and through it, is time dependent, it
is natural that the second law will take a more complicated form in
terms of the acceleration than the simple Newtonian form (6.40).
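The precise relation between the force and the acceleration is
easy to find out from (6.41).

$$\vec{F} = \frac{d}{dt}\left(m\gamma\vec{u}\right) = m\gamma\frac{d\vec{u}}{dt} + \frac{d}{dt}\left(m\gamma\right)\vec{u} = m\gamma\vec{a} + \frac{1}{c^2}\frac{dE}{dt}\vec{u}$$

Now, one would expect that

$$\frac{dE}{dt} = \frac{dK}{dt} = \nabla_{\vec{p}}K\cdot\frac{d\vec{p}}{dt} = \vec{u}\cdot\vec{F}, \tag{6.42}$$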

which is (perhaps surprisingly) the same as the formula for power
that we had learnt in classical physics. As it turns out, though,
this formula does not really work in all situations! It is easy to see
that it will not work, if the mass (which, remember, is what many
other people call the rest mass), instead of being a constant
time-independent quantity, changes with time, because then you would
have to take the rate of change of the rest energy into account, too!
For the time being, we will confine ourselves to the case where m is
a constant.
Using the above expression for the power, we can easily write

$$\vec{F} = m\gamma\vec{a} + \frac{1}{c^2}\left(\vec{F}\cdot\vec{u}\right)\vec{u} \tag{6.43}$$

which is our relativistic replacement for Newton’s second law (at
least for bodies with constant rest mass). As you can easily see,
the presence of the second term says that the acceleration of a
moving body is, in general, not in the same direction as the force.
There are only two cases where the force and the acceleration are
co-directional - one, when the second term vanishes because the
force is perpendicular to the velocity, and two, when the force is
parallel to the velocity. In the first case, we have

$$\vec{F} = m\gamma\vec{a}$$

while in the second,

$$\vec{F} = m\gamma\vec{a} + \frac{u^2}{c^2}\vec{F}$$

and hence

$$\vec{F} = m\gamma^3\vec{a}$$

This means that in the two cases where the relativistic relation
between the force and the acceleration mimics the classical one, the
ratio, which you would be inclined to call the mass, takes different
values. So, as far as generalizing $\vec{F} = m\vec{a}$ is concerned, we have
two kinds of mass - the transverse mass $m_t$, which happens to be
equal to the so called relativistic mass mγ, and the longitudinal
mass $m_l$ which is bigger by a further factor of γ².
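The two special cases can be seen at work by solving (6.43) for the acceleration, which gives $\vec{a} = \left[\vec{F} - (\vec{F}\cdot\vec{u})\vec{u}/c^2\right]/(m\gamma)$. The Python sketch below is an illustration under assumptions: constant m, units with c = 1, and function names invented for the example.

```python
import math

C = 1.0  # units with c = 1 (assumed)


def acceleration(F, u, m):
    """Acceleration produced by 3-force F on a particle of constant mass m
    moving with velocity u, obtained by solving (6.43) for a."""
    u_squared = sum(x * x for x in u)
    g = 1.0 / math.sqrt(1.0 - u_squared / C**2)
    F_dot_u = sum(f * x for f, x in zip(F, u))
    return [(f - F_dot_u * x / C**2) / (m * g) for f, x in zip(F, u)]


if __name__ == "__main__":
    m = 1.0
    u = [0.6, 0.0, 0.0]  # gamma = 1.25 for this speed
    a_parallel = acceleration([1.0, 0.0, 0.0], u, m)
    a_perpendicular = acceleration([0.0, 1.0, 0.0], u, m)
    a_oblique = acceleration([1.0, 1.0, 0.0], u, m)
    print("longitudinal mass:", 1.0 / a_parallel[0])     # m * gamma**3
    print("transverse mass:  ", 1.0 / a_perpendicular[1])  # m * gamma
    print("oblique force gives acceleration", a_oblique)
```

For the oblique force the acceleration is not parallel to the force, because the two components are divided by different effective masses.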
You may be slightly surprised at my assertion that (6.41) is
the correct generalization of Newton’s second law to the relativistic
case. After all, $d\vec{p}/dt$ can not be the spatial part of a four-vector, for the
same reason that $\vec{v}$ is not! Surely it would have been more sensible
to have

$$\vec{F}_M = \frac{d\vec{p}}{d\tau} = \gamma\vec{F} \tag{6.44}$$

as the relativistic force (the M stands for Minkowski)? One way to
answer this is simply to say that both Newton’s second law and its
relativistic versions are only half-laws! By itself Newton’s second
law gets reduced to merely the definition of what we call the force -
the only way it gets physical content is via teaming up with another
half-law - one which expresses the force in terms independent of
the acceleration⁵. Thus, from one point of view, which one of the
two ($\vec{F}$ and $\vec{F}_M$) to accept as the force, is a matter of convention -
you can always adjust the extra γ in the half-law that helps you
calculate the force!
Having said that, our vote still goes in favour of (6.41). One
reason for this is, as we will see in chapter 7, this will help us
to leave the Lorentz force law for the force acting on a charged
particle moving in an electromagnetic field intact (no need for an
extra γ factor). Another reason is that we have already used the
work done by this force to deduce the relativistic expression for
the kinetic energy - and this has already given us the convenient
notion of the total energy that forms a nice four-vector along with
the momentum.
Finally, we will be able to recover Newton’s third law, at least
for collisions⁶, with this version of the force! When two particles
collide, they stick together for a very short period of time - during
which they exert forces on each other.
5
Of course, this is true only for non-constraint forces. Constraint forces, on
the other hand, do adjust themselves to ensure that the constraints are main-
tained - depending on the acceleration of the particle if necessary (remember the
“loss of weight” of a body in a downward accelerating lift?)!
6
For forces acting between two particles at a distance, there is no chance of
getting something akin to the third law - simply because two observers will never
agree on simultaneity when the two bodies are separated!

6.11 The force and acceleration four vectors

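As we have already seen, the basic 4-vector is the position 4-vector

$$X = \left(ct, \vec{r}\right)$$

We have already met two other 4-vectors that follow very simply
from it

$$U \equiv \frac{dX}{d\tau} = \frac{dt}{d\tau}\frac{dX}{dt} = \gamma(u)\left(c, \vec{u}\right)$$

$$P \equiv mU = m\gamma(u)\left(c, \vec{u}\right) = \left(\frac{E}{c}, \vec{p}\right)$$

Of course, we can (and will) define the force four-vector as

$$F = \frac{dP}{d\tau} = \gamma(u)\left(\frac{1}{c}\frac{dE}{dt}, \frac{d\vec{p}}{dt}\right) = \left(\frac{\gamma(u)}{c}\frac{dE}{dt}, \gamma(u)\vec{F}\right)$$

where $\vec{F}$ is our old friend - the 3-force. Defining $\gamma(u)\vec{F}$ as the
Minkowski force $\vec{F}_M$ allows us to write

$$F = \left(\frac{\gamma(u)}{c}\frac{dE}{dt}, \vec{F}_M\right).$$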
c dt

At this point we may feel sorely tempted to replace the $dE/dt$ above
by $\vec{F}\cdot\vec{u}$. However, as I already hinted in the last section, this formula
is not valid in general - but only in the special case where m is a
constant. To see what the general result is, we will start from the
simple relation

$$P^2 = P\cdot P = m^2c^2$$

and differentiate both sides with respect to τ to get

$$\frac{d}{d\tau}P^2 = 2P\cdot\frac{dP}{d\tau} = 2P\cdot F = 2mc^2\frac{dm}{d\tau} = 2mc^2\gamma(u)\frac{dm}{dt}$$

and thus

$$P\cdot F = mc^2\gamma(u)\frac{dm}{dt}$$

Using $P = m\gamma(u)\left(c, \vec{u}\right)$ and $F = \gamma(u)\left(\frac{1}{c}\frac{dE}{dt}, \vec{F}\right)$ yields, however

$$P\cdot F = m\left[\gamma(u)\right]^2\left(\frac{dE}{dt} - \vec{F}\cdot\vec{u}\right)$$

Equating these two expressions for P · F yields

$$\frac{dE}{dt} = \vec{F}\cdot\vec{u} + \frac{c^2}{\gamma(u)}\frac{dm}{dt} \tag{6.45}$$

which is the complete expression for power in relativity. As you can
see, this differs from the classical expression due to the presence
of the term involving the rate of change of the rest mass.
Using $dE/dt = \vec{F}\cdot\vec{u}$ the four-force can be written as

$$F = \gamma\left(\frac{1}{c}\left(\vec{F}\cdot\vec{u}\right), \vec{F}\right) = \left(\frac{1}{c}\left(\vec{F}_M\cdot\vec{u}\right), \vec{F}_M\right) \tag{6.46}$$

which, of course, is valid only for the case where m is a constant.
The four-force does obey a relation that is very reminiscent
of Newton’s second law

$$F = \frac{dP}{d\tau} = \frac{d}{d\tau}\left(mU\right) = m\frac{dU}{d\tau} = mA \tag{6.47}$$

This formula, once again, is only valid for constant rest mass m.
Equation (6.47) gives us another way of deriving (6.43). Using our
expression for the four-acceleration (5.25) and equating each component
on the two sides of (6.47) gives

$$\frac{\gamma}{c}\left(\vec{F}\cdot\vec{u}\right) = \frac{m\gamma^4}{c}\,\vec{u}\cdot\vec{a}$$

$$\gamma\vec{F} = m\gamma^2\vec{a} + \frac{m\gamma^4}{c^2}\left(\vec{u}\cdot\vec{a}\right)\vec{u}$$

The first of these equations tells us that $\vec{F}\cdot\vec{u} = m\gamma^3\left(\vec{u}\cdot\vec{a}\right)$, a result
that you could have also derived by taking the dot product of both
sides of (6.43) with the velocity $\vec{u}$. Using this result in the second
equation immediately gives us (6.43).
One question that will be of some importance is the law of
transformation of forces. Since our force three-vector is not the spatial
part of the force four-vector, you should expect it to have a
more complicated transformation law than that of, say, the
three-momentum. However, though $\vec{F}$ is not the spatial part of a 4-vector,
$\gamma(u)\vec{F}$ is. Remember, we saw the same thing happen with $\vec{u}$ - it
is $\gamma(u)\vec{u}$ which forms the spatial part of a 4-vector, not $\vec{u}$ itself.
Thus, the 3-force should transform in a way that is similar to the
3-velocity!
Let’s work out the transformation equation for the 3-force for
a special Lorentz transformation. The spatial part of the position
4-vector transforms under this as

$$\begin{aligned}
x^{1\prime} &= \gamma\left(x^1 - \beta x^0\right) \\
x^{2\prime} &= x^2 \\
x^{3\prime} &= x^3
\end{aligned}$$

It is easy to write down the transformations for the spatial part of
the 4-vector F by comparison

$$\begin{aligned}
\gamma(u')F^{1\prime} &= \gamma\left[\gamma(u)F^1 - \beta\frac{\gamma(u)}{c}\frac{dE}{dt}\right] = \gamma\gamma(u)\left[F^1 - \frac{v}{c^2}\frac{dE}{dt}\right] \\
\gamma(u')F^{2\prime} &= \gamma(u)F^2 \\
\gamma(u')F^{3\prime} &= \gamma(u)F^3
\end{aligned}$$

Dividing both sides by the result $\gamma(u') = \gamma\gamma(u)\left(1 - u^1v/c^2\right)$ (which
follows from the transformation of the 0-th component of the
4-velocity) yields the force transformations

$$\begin{aligned}
F^{1\prime} &= \frac{F^1 - \frac{v}{c^2}\frac{dE}{dt}}{1 - \frac{u^1v}{c^2}} = \frac{F^1 - \frac{v}{c^2}\vec{F}\cdot\vec{u}}{1 - \frac{u^1v}{c^2}} \\
F^{2\prime} &= \frac{F^2\sqrt{1 - \frac{v^2}{c^2}}}{1 - \frac{u^1v}{c^2}} \\
F^{3\prime} &= \frac{F^3\sqrt{1 - \frac{v^2}{c^2}}}{1 - \frac{u^1v}{c^2}}
\end{aligned}$$

If, in particular, the particle happens to be instantaneously at rest
in the frame S, the force transformations become particularly simple

$$\begin{aligned}
F^{1\prime} &= F^1 \\
F^{2\prime} &= F^2\sqrt{1 - \frac{v^2}{c^2}} \\
F^{3\prime} &= F^3\sqrt{1 - \frac{v^2}{c^2}}
\end{aligned}$$

We will have occasion to use these transformations in a big way in
the next chapter.
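As a cross-check, the force transformations can be compared with what you get by boosting the four-force (6.46) component by component and then dividing by γ(u′). The sketch below is illustrative and rests on assumptions: constant m (so that dE/dt is the dot product of force and velocity), units with c = 1, relative velocity v along the first axis, and made-up function names.

```python
import math

C = 1.0  # units with c = 1 (assumed)


def gamma(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transform_force(F, u, v):
    """3-force in S' (moving with speed v along axis 1 relative to S),
    using the closed-form transformation for constant m."""
    denom = 1.0 - u[0] * v / C**2
    root = math.sqrt(1.0 - (v / C) ** 2)
    return [(F[0] - v * dot(F, u) / C**2) / denom,
            F[1] * root / denom,
            F[2] * root / denom]


def transform_force_via_four_vector(F, u, v):
    """Same quantity, obtained by Lorentz-boosting the four-force (6.46)."""
    g_u = gamma(math.sqrt(dot(u, u)))
    four_force = [g_u * dot(F, u) / C] + [g_u * f for f in F]
    g, beta = gamma(v), v / C
    boosted_spatial = [g * (four_force[1] - beta * four_force[0]),
                       four_force[2],
                       four_force[3]]
    # gamma(u') from the 0-th component of the transformed four-velocity
    g_u_prime = g * g_u * (1.0 - u[0] * v / C**2)
    return [f / g_u_prime for f in boosted_spatial]


if __name__ == "__main__":
    F, u, v = [1.0, 2.0, -0.5], [0.3, 0.4, 0.0], 0.6
    print(transform_force(F, u, v))
    print(transform_force_via_four_vector(F, u, v))
    # Particle instantaneously at rest in S: transverse components shrink by 0.8
    print(transform_force(F, [0.0, 0.0, 0.0], v))
```

The first two printed lists agree, and the third shows the particularly simple rest-frame case.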

6.12 Using energy-momentum conservation

6.12.1 Inelastic collision, again


I will start with a generalized version of the perfectly inelastic
collision that we discussed in section 6.5. This time the two particles
that collide initially have masses (no longer identical) $m_1$ and $m_2$
and Doppler factors of $k_1$ and $k_2$ respectively. They stick together
to produce a composite particle of mass μ and Doppler factor κ.
Equations (6.21) and (6.22) lead to

$$\mu\kappa = m_1k_1 + m_2k_2$$

$$\frac{\mu}{\kappa} = \frac{m_1}{k_1} + \frac{m_2}{k_2}$$

which can be easily solved to find

$$\mu = \sqrt{\left(m_1k_1 + m_2k_2\right)\left(\frac{m_1}{k_1} + \frac{m_2}{k_2}\right)}$$

$$\kappa = \sqrt{\frac{m_1k_1 + m_2k_2}{\frac{m_1}{k_1} + \frac{m_2}{k_2}}}$$

Check that in the special case $m_1 = m_2 = m$ and $k_1 = k_2^{-1} = k$, this
does reduce to the values μ = 2mγ and κ = 1 as expected.
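The solution for μ and κ is short enough to try out directly. The following Python sketch is an illustration only (the function name is invented for the example); it works purely with Doppler factors, so no choice of units for c is needed.

```python
import math


def stick_together(m1, k1, m2, k2):
    """Mass and Doppler factor of the composite particle formed when
    particles (m1, k1) and (m2, k2) collide and stick."""
    a = m1 * k1 + m2 * k2  # equals mu * kappa, from (6.21)
    b = m1 / k1 + m2 / k2  # equals mu / kappa, from (6.22)
    return math.sqrt(a * b), math.sqrt(a / b)


if __name__ == "__main__":
    # Symmetric case of section 6.5: expect mu = m (k + 1/k) = 2 m gamma, kappa = 1
    print(stick_together(1.0, 3.0, 1.0, 1.0 / 3.0))
    # An asymmetric case: the composite mass exceeds m1 + m2 = 3
    print(stick_together(2.0, 2.0, 1.0, 0.5))
```

In both cases the composite mass comes out at least as large as the sum of the initial masses, in line with (6.31).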

6.12.2 Decay and stability


Next, I will show you the consequences of energy and momentum
conservation for a very real process - the α decay of a heavy nucleus

$${}^{A}_{Z}X \to \alpha + {}^{A-4}_{Z-2}Y$$

If we write down the equation for the mass-energy balance for this
process, what we get is

$$E(X) = E(\alpha) + E(Y)$$

or, in terms of the kinetic and mass energies,

$$m_Xc^2 = m_\alpha c^2 + T(\alpha) + m_Yc^2 + T(Y)$$

where I have obviously chosen to write the equation in the frame
in which the parent nucleus is at rest before the decay (so that
T(X) = 0). Thus the net kinetic energy of the decay products is
given by

$$T(\alpha) + T(Y) = \left(m_X - m_Y - m_\alpha\right)c^2 = Q \tag{6.48}$$

and is thus equal to the mass excess that the decaying nucleus
has over the decay products, apart from a factor of c². Note that
the factor of c² arises only if we insist on using conventional units
for measuring mass - in nuclear physics, mass is usually measured
in energy units like the MeV⁷, in which case, the mass difference
is directly equal to the final kinetic energy. It should be easy for
you to see that if the frame being used is not the frame in which X
is at rest, then the mass difference gives the increase in net kinetic
energy in the process, T(Y) + T(α) − T(X). As I have written in
(6.48), the mass difference is denoted by Q. Why Q? The analogy
with heat released by a chemical reaction should make this clear!
7
The Mega-electronVolt, or 10⁶ eV. Remember, an eV is the amount of kinetic
energy gained by an electron accelerated by a potential difference of 1 Volt, and
is equal roughly to 1.6 × 10⁻¹⁹ J in terms of conventional units.
What I have said just now remains equally valid for all decay
processes. For example, the free neutron decays by the process

$$n \to p + e + \bar{\nu}$$

for which, the final kinetic energies in the rest frame of the neutron
obey⁸

$$T(p) + T(e) + T(\nu) = \left(m_n - m_p - m_e - m_\nu\right)c^2$$

Similarly, in a nuclear beta decay,

$${}^{A}_{Z}X \to {}^{A}_{Z+1}Y + e + \bar{\nu}$$

the net kinetic energy of the decay products is given by

$$T(Y) + T(e) + T(\nu) = \left(m_X - m_Y - m_e - m_\nu\right)c^2$$

In general, in any decay process, where a mother particle X decays
into a set of particles $P_1, P_2, \ldots, P_n$ you can easily see that we can
write

$$T(P_1) + \ldots + T(P_n) - T(X) = \left(m_X - m_{P_1} - \ldots - m_{P_n}\right)c^2$$

where the Q value of this reaction is given by
$\left(m_X - m_{P_1} - \ldots - m_{P_n}\right)c^2$
- the mass difference times c². If we work in the frame where the
mother particle X is at rest, then this is of course the net kinetic
energy of all the decay particles.
8
I have included the mass of the neutrino in the expression, simply because
current research shows that it is probably non-zero. However, the upper bound
that we have on the neutrino mass is so very small, compared to even the mass
of a light particle such as the electron that it can be safely ignored in these
expressions.
Let us now think of a process very much like the decay of the
free neutron - a decay of a free proton into a neutron, a positron
and a neutrino⁹

$$p \to n + e^+ + \nu$$

The Q value of this decay process is negative as is easily seen from
the fact that the neutron alone is more massive than the proton
(indeed, since the neutron decay process has a positive Q value,
it is more massive than the proton, the electron and the neutrino
put together!). In the rest frame of the proton, this implies that
the net kinetic energy of the decay products is negative - which is
absurd! The conclusion, then, is that, at least in the rest frame of
the proton, this decay can not occur!
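The signs of the two Q values are easy to see with numbers. The sketch below is a hedged illustration: the masses are approximate, rounded values in MeV (that is, mc² in MeV), the neutrino mass is taken as zero following footnote 8, and the dictionary and function names are invented for the example.

```python
# Approximate rest energies in MeV (rounded values, quoted here as assumptions)
MASS_MEV = {
    "n": 939.565,   # neutron
    "p": 938.272,   # proton
    "e": 0.511,     # electron or positron
    "nu": 0.0,      # neutrino or antineutrino, mass neglected
}


def q_value(parent, products):
    """Q = (mass of parent - total mass of products) c^2, in MeV."""
    return MASS_MEV[parent] - sum(MASS_MEV[name] for name in products)


if __name__ == "__main__":
    print("neutron decay Q:", q_value("n", ["p", "e", "nu"]), "MeV")
    print("proton decay Q: ", q_value("p", ["n", "e", "nu"]), "MeV")
```

With these values the neutron decay releases a little under 0.8 MeV of kinetic energy, while the hypothetical proton decay would need about 1.8 MeV that is simply not available in the proton's rest frame.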
It may strike you that this problem can be overcome by simply
speeding up the proton so that its kinetic energy is sufficient to
overcome the deficit of the negative Q value. Can this be done? All
you have to realize is that you can just as well analyze the situation
from the point of view of someone who is moving at the same speed
as the proton. In this frame all laws of physics are equally valid,
and here the proton is at rest and so can not decay according to
the argument given above. Now, if the proton does not decay in one
frame it does not decay in any other frame either! Thus, speeding
up the proton can not make it decay.
You may feel a bit cheated by the argument against proton decay
presented above - you could feel it more honorable to be able to
conclude the same while staying in the frame in which the proton
is moving very fast. However, the beauty of the whole argument
lies in the fact that we can judiciously choose the frame to simplify
it. Indeed this is one feature that makes applying STR a joy - the
9
The reason why you need to have a positron here in this case, instead of
the electron, is because the net electrical charge has to balance out on the two
sides. The third particle is the neutrino, rather than the anti-neutrino because
of a conservation law called the lepton number conservation - but all this will
take us too far afield.

principle of relativity allows you to use any frame whatsoever, while


a particular frame may make the calculation (or the argument) very
simple.
“All that is very well,” you could still say, “but couldn’t we have
come to the same conclusion sticking to the old frame?” The answer
to this is - we could, but only after a slightly more complicated
argument. The point is, not only does the proton’s decay have to
conserve energy - it also has to conserve momentum. Since the
proton has a huge momentum in our frame, the net momentum of
the three decay products has to be huge too. Thus, the final kinetic
energies not only have to be positive, they have to be very large in
order to ensure that they have the requisite momentum. As we can
show, the extra kinetic energy needed for this is actually greater
than the kinetic energy that the proton had in the first place - so
the demand of conservation of momentum more than offsets the
advantage that the high initial speed could have brought in!
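One way to see this frame-independence numerically is through the invariant mass $\sqrt{E^2 - p^2c^2}/c^2$ of the moving proton, which is the same in every frame, while any set of decay products has an invariant mass at least as large as the sum of their masses. The sketch below is only an illustration: the rest energies are rounded tabulated values that I am assuming, and the speeds are arbitrary.

```python
import math

# Rest energies in MeV: rounded tabulated values, assumed for this sketch.
M_PROTON = 938.272
M_NEUTRON = 939.565
M_ELECTRON = 0.511


def invariant_mass(energy, momentum_c):
    """sqrt(E^2 - (pc)^2) in MeV: the same in every inertial frame."""
    return math.sqrt(energy**2 - momentum_c**2)


# Smallest invariant mass that n + e+ + nu can have (neutrino mass neglected).
threshold = M_NEUTRON + M_ELECTRON

for beta in (0.0, 0.5, 0.9, 0.999):
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    energy = gamma * M_PROTON              # total energy of the proton, MeV
    momentum_c = gamma * beta * M_PROTON   # momentum times c, MeV
    w = invariant_mass(energy, momentum_c)
    print(f"beta = {beta:5.3f}   E = {energy:10.3f} MeV   "
          f"invariant mass = {w:8.3f} MeV   enough to decay: {w >= threshold}")
```

However large the energy E becomes, the invariant mass stays at the proton mass, below what the three products need.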
Thus, it is energy momentum conservation that ensures that
the proton can not decay in this particular manner. However, the
natural question is - why can not the proton decay into something
less massive - there are certainly many, many particles that can
play the role of possible decay products? The answer lies in the
realms of particle physics - but without going too far afield I can
tell you that it has to do with a law called baryon number conser-
vation. Roughly, it says that a baryon (like protons, neutrons, etc.)
can only decay into a baryon. Now, the proton happens to be the
lightest among the baryons - making it impossible for the proton to
decay10 .
10
Impossible, perhaps, is too strong a word to use here. If baryon number
conservation is an exact law of physics then, of course, there is no possibility
of proton decay. However, some theories of particle physics (the so called grand
unified theories) allow for the possibility of a very small violation of this law -

We can make the above into a general law - a particle cannot


decay into products that are, together, more massive than it. On
the other hand, if a proposed decay scheme obeys all conservation
laws and has a positive Q value, to boot - then the decay will occur!
Note, though, that although the rate at which the decay will occur
cannot be completely predicted from STR (quantum mechanics plays a
huge role), it is true that a larger Q value does enhance the decay rate.

6.12.3 Dividing up the energy


We can go a lot further than just predict the total kinetic energy
of the decay products. After all, there is one more conservation
law, the conservation of momentum (or rather, as we have seen,
one more portion of the law of conservation of the momentum four
vector) that I have not made use of as yet! For the case of the α
decay, momentum conservation allows us to write

\[ p(Y) + p(\alpha) = 0 \tag{6.49} \]

Now, the typical kinetic energies in an α decay process are a few MeV
- which is much less than the rest energies of the particles involved.
This means that we can safely use non-relativistic expressions for
the kinetic energy and momentum. Since $p^2 = 2mT$ for each particle,
and the two momenta are equal in magnitude, (6.49) becomes

\[ 2 m_\alpha T(\alpha) = 2 m_Y T(Y) \]


opening up a tiny possibility of proton decay. However, experiments running over
the last decade have not been able to reveal any sign of proton decay. Having
said that, one can not rule out the possibility that it occurs, but with such a
tiny probability that it is below the threshold of our current detection schemes.

which means that (6.48) can be written

\[ T(\alpha)\left(1 + \frac{m_\alpha}{m_Y}\right) = Q \]

leading to a precise prediction for the α particle kinetic energy

\[ T(\alpha) = \frac{Q}{1 + \dfrac{m_\alpha}{m_Y}} \tag{6.50} \]

To get an idea about the sort of energies that we are talking about
here, let me consider a real example - the α decay of $^{238}$U into $^{234}$Th.
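A rough numerical version of this example can be sketched as follows. The daughter of the α decay of uranium-238 is thorium-234; the atomic masses and the conversion factor below are approximate tabulated values that I am assuming for illustration, and the function name is my own.

```python
U_TO_MEV = 931.494   # one atomic mass unit expressed in MeV/c^2 (approximate)

# Atomic masses in u: approximate tabulated values, assumed for this sketch.
# Using atomic masses is fine here because the electron masses cancel.
M_U238 = 238.050788
M_TH234 = 234.043601
M_HE4 = 4.002603


def alpha_kinetic_energy(q, m_alpha, m_daughter):
    """Equation (6.50): T(alpha) = Q / (1 + m_alpha / m_Y)."""
    return q / (1.0 + m_alpha / m_daughter)


q = (M_U238 - M_TH234 - M_HE4) * U_TO_MEV          # Q value in MeV
t_alpha = alpha_kinetic_energy(q, M_HE4, M_TH234)  # alpha particle's share
t_recoil = q - t_alpha                             # daughter nucleus's share

print(f"Q        = {q:.3f} MeV")
print(f"T(alpha) = {t_alpha:.3f} MeV")
print(f"T(Th)    = {t_recoil:.3f} MeV")
```

The α particle takes all but a couple of percent of the few MeV that are available, simply because it is so much lighter than the recoiling nucleus.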
Note that the derivation of (6.50) was made very simple by the
fact that the energies involved were small enough to ensure that
non-relativistic approximations could be used. What if the speeds
involved are much larger (as happens, for example, in beta decay -
with electrons emerging at nearly the speed of light)?
A direct calculation of the energies carried away by the two de-
cay products can be carried out along the lines of the non-relativistic
calculations above. However, the algebra does tend to become a lit-
tle messy. Below we describe two ways of getting the result - the
first by using our k-calculus equations and the second by using
the energy-momentum 4-vector.

6.12.3.1 The k-calculus approach

Let the Doppler factors of the two particles in the rest frame of
the parent particle be k1 and k2 , respectively. In this case, the
equations (6.21-6.22) become

\[ M = m_1 k_1 + m_2 k_2 = \frac{m_1}{k_1} + \frac{m_2}{k_2}. \tag{6.51} \]

Eliminating $k_2$ between these two equations leads to the quadratic
equation
\[ k_1^2 + \frac{m_2^2 - m_1^2 - M^2}{m_1 M}\, k_1 + 1 = 0 \tag{6.52} \]
for $k_1$. Of course, we can solve this for the Doppler factor (and
hence the speed) of the first particle. However, if all we want is the
energy $E_1 = \frac{m_1 c^2}{2}\left(k_1 + \frac{1}{k_1}\right)$, all we have to do is merely rearrange
(6.52) to get
\[ k_1 + \frac{1}{k_1} = \frac{M^2 + m_1^2 - m_2^2}{m_1 M} \tag{6.53} \]
and thus

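\[ E_1 = \frac{m_1 c^2}{2} \times \frac{M^2 + m_1^2 - m_2^2}{m_1 M} = \frac{c^2}{2M}\left(M^2 + m_1^2 - m_2^2\right) \tag{6.54} \]

Of course, the interchange $m_1 \leftrightarrow m_2$ gives $E_2$ as

\[ E_2 = \frac{c^2}{2M}\left(M^2 - m_1^2 + m_2^2\right) \tag{6.55} \]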

One easily checks that this is consistent with the conservation of
energy, $E_1 + E_2 = M c^2$. The kinetic energies are given by

\[ T_1 = E_1 - m_1 c^2 = \frac{M - m_1 + m_2}{2M}\, Q \tag{6.56} \]
\[ T_2 = \frac{M + m_1 - m_2}{2M}\, Q \tag{6.57} \]

where $Q = (M - m_1 - m_2)\,c^2$ is the total available kinetic energy.


The momentum is given by

\[ |p_1| = \frac{m_1 c}{2}\left(k_1 - \frac{1}{k_1}\right) \]

and elementary algebra, once again, tells us from (6.53) that

\[ |p_1| = \frac{m_1 c}{2} \times \sqrt{\left(\frac{M^2 + m_1^2 - m_2^2}{m_1 M}\right)^2 - 4} \]

which can be rearranged into the rather nice symmetric form

\[ |p_1| = \frac{c}{2M}\sqrt{(M + m_1 + m_2)(M + m_1 - m_2)(M - m_1 + m_2)(M - m_1 - m_2)} \tag{6.58} \]
The symmetry shows that $|p_2|$ is the same, as it must be.
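The results of this subsection are easy to try out numerically. The following is a minimal sketch in units where c = 1; the function names are mine, and the example (a charged pion decaying into a muon and a neutrino, with rounded rest energies) is an assumption chosen only because it is a clean two-body decay with a fast daughter.

```python
import math


def two_body_decay(M, m1, m2):
    """Energies and common momentum (c = 1) of the products of M -> m1 + m2,
    in the rest frame of M, from equations (6.54), (6.55) and (6.58)."""
    if M < m1 + m2:
        raise ValueError("decay forbidden: products are heavier than the parent")
    e1 = (M**2 + m1**2 - m2**2) / (2.0 * M)
    e2 = (M**2 - m1**2 + m2**2) / (2.0 * M)
    p = math.sqrt((M + m1 + m2) * (M + m1 - m2)
                  * (M - m1 + m2) * (M - m1 - m2)) / (2.0 * M)
    return e1, e2, p


def doppler_factor(energy, momentum, mass):
    """k = gamma(k) + pi(k) = (E + p) / m for a massive particle (c = 1)."""
    return (energy + momentum) / mass


# Illustrative numbers: rounded rest energies in MeV, neutrino mass neglected.
M_PION, M_MUON, M_NU = 139.570, 105.658, 0.0

e_mu, e_nu, p = two_body_decay(M_PION, M_MUON, M_NU)
k_mu = doppler_factor(e_mu, p, M_MUON)

print(f"muon kinetic energy : {e_mu - M_MUON:.3f} MeV")
print(f"neutrino energy     : {e_nu:.3f} MeV")
print(f"common momentum     : {p:.3f} MeV/c")
print(f"muon Doppler factor : {k_mu:.4f}")
```

The Doppler factor recovered at the end can be substituted back into (6.53) as a consistency check on the whole chain.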

6.12.3.2 The 4-vector approach

There is a simple trick that uses the property of four-vectors to


solve the problem. Let me now show you how it works. In this
case, the law of four momentum conservation

\[ P_X = P_Y + P_\alpha \]

can be rewritten as
\[ P_Y = P_X - P_\alpha . \]

Taking the dot product of each side with itself leads to

\[ P_Y^2 = P_X^2 + P_\alpha^2 - 2 P_X \cdot P_\alpha \]

Note that this helps us get rid of the two unknown quantities involving
the Y nucleus - its momentum and energy - in one go! Now,
$P_X^2 = m_X^2 c^2$, etc. Also,
\[ P_X \cdot P_\alpha = \left(m_X c, \vec{0}\right) \cdot \left(\frac{E_\alpha}{c}, \vec{p}_\alpha\right) = m_X E_\alpha \]

which tells us that

\[ m_Y^2 c^2 = m_X^2 c^2 + m_\alpha^2 c^2 - 2 m_X E_\alpha \]

yielding the expression

\[ E_\alpha = \frac{m_X^2 + m_\alpha^2 - m_Y^2}{2 m_X}\, c^2 \]

for the total energy of the α particle. Of course, all you have to do to
find the total energy of the Y nucleus after the decay is interchange
the values of $m_\alpha$ and $m_Y$ - which yields

\[ E_Y = \frac{m_X^2 + m_Y^2 - m_\alpha^2}{2 m_X}\, c^2 \]

As you can easily verify, the sum of the energies of the decay products
is equal to the rest energy of the original particle,

\[ E_\alpha + E_Y = m_X c^2 \]

To get the kinetic energies of the decay products, all you have to do
is subtract the respective rest energies, yielding

\[ T_\alpha = \frac{(m_X - m_\alpha)^2 - m_Y^2}{2 m_X}\, c^2 \tag{6.59-a} \]
\[ T_Y = \frac{(m_X - m_Y)^2 - m_\alpha^2}{2 m_X}\, c^2 \tag{6.59-b} \]

You should be able to verify that these expressions reduce to the


non-relativistic ones we found out earlier in the limit of low speeds.
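One way this low-speed check could go is sketched below. It factorises (6.59-a), uses $Q = (m_X - m_\alpha - m_Y)c^2$, and then drops $Q/c^2$ against the masses, which is the same assumption ($Q$ much smaller than the rest energies) that went into (6.50).

```latex
\begin{align*}
T_\alpha &= \frac{(m_X - m_\alpha - m_Y)(m_X - m_\alpha + m_Y)}{2 m_X}\,c^2
          = Q\,\frac{m_X - m_\alpha + m_Y}{2 m_X} \\
         &= Q\,\frac{2 m_Y + Q/c^2}{2\,(m_Y + m_\alpha + Q/c^2)}
          \;\approx\; Q\,\frac{m_Y}{m_Y + m_\alpha}
          = \frac{Q}{1 + m_\alpha/m_Y}
\end{align*}
```

The last expression is (6.50); the same steps with $m_\alpha$ and $m_Y$ interchanged handle (6.59-b).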
Note that in this process the kinetic energy available for the de-
cay products (which is nothing but the Q value) gets distributed
among the two decay products in a well defined manner. We thus

expect the α particle to come out with a definite energy - and that is
precisely what has been observed11 ! Of course, this can be traced
back to the fact that α decay is a two body process - the available
kinetic energy gets shared between the daughter nucleus and the
α particle in a precise ratio dictated by the conservation of momentum.
The β decay spectrum, on the other hand, is another story
altogether - instead of a precise energy (or a set of sharp lines) the
β particles come out with all energies from zero up to a maximum.
Indeed, the maximum kinetic energy of the β particles (called the
endpoint energy of the β spectrum) turns out to be what (6.59-a)
would predict (of course, with $m_\alpha$ replaced by $m_\beta$). The
explanation is very simple - β decay is not a two body process - the
available Q value is shared between the three particles in the final state!
Thus, the β particle can come out with any energy below the endpoint
value - the rest is carried away by the neutrino, while maintaining
conservation of momentum. Note that the fact that a two
body decay could not explain the continuous decay spectrum in
beta decay is the precise reason why Wolfgang Pauli had proposed
the existence of a third decay product - the neutrino! Indeed, it
11
Actually, if you look at the α decay spectrum, which is nothing but a plot
of the number of α particles emitted against their respective energies, what you
usually see is not one, but several sharp lines. Thus, the α particles come out
with not one definite energy, but several well defined energies. Explaining this
is not too difficult, though - it is just that the daughter nucleus is not always
produced in its ground state. Thus the energy term that I wrote down for the
daughter nucleus is not just $m_Y c^2 + T_Y$, but has an extra term of $E_Y^*$ as well,
where $E_Y^*$ is the excitation energy of the state the nucleus is created in. As you
can easily check, this brings down the available kinetic energy from Q to $Q - E_Y^*$
- which explains where the α particles of energies lower than that predicted by
(6.50) come from. In fact, these α particles of lower energies are accompanied by
γ rays (photons) that carry away the extra energy that becomes available when
the excited daughter nucleus de-excites to the ground state - this provides us
with evidence that this explanation is correct! Since quantum mechanics tells
us that the excited states of the daughter nucleus have a few definite energies
only, there are only a few sharp lines in the α spectrum.

took thirty years for the neutrino to be experimentally detected af-


ter Pauli’s prediction.
Today, the idea that we need an extra particle to ensure that
energy-momentum conservation works out correctly may not seem
to be too far-fetched. After all, there are hundreds of so called ele-
mentary particles, so what’s the big deal about one particle more?
You should realize, though, that when Pauli first proposed the neutrino
to explain the continuous spectrum in beta decay, the number
of known elementary particles could be counted on the fingers
of your hand - the particle explosion was yet to take place! Adding
one more particle to the list just to ensure that energy-momentum
conservation holds good is a prime example of “changing the uni-
verse” to suit theory - the closest analogy that I can think of is
the proposal of the existence of Neptune to explain deviations in
the orbit of Uranus from the calculated path. While the latter emphasises
the degree of confidence that nineteenth century physicists
had in Newton’s laws, the fact that very few doubted the existence
of the neutrino in the thirty intervening years between the neu-
trino’s prediction and discovery is testimony to our degree of faith
in energy-momentum conservation!

6.12.4 Nuclear reactions


Energy-momentum conservation has a major role to play in nuclear
physics. I have already shown you the major role it plays in nuclear
decays, this time I will talk about its role in deciding when a nuclear
reaction will occur.
How is a nuclear reaction different from a decay? A decay has
only one particle in the initial state. In a reaction, several particles
(I am using the word particle in an extended sense, heavy nuclei
count as particles, too!) take part in the initial state - but the most

common situation is where two particles react - e.g. when you


shoot α particles at a nucleus in order to cause a transmutation.
In most cases, you shoot a projectile particle (which may be an α
particle, a proton ...) at a stationary target containing the nuclei
you want to transmute. Of course, I am describing this from a
particular frame of reference - the frame in which the target is at
rest. This is typically called the lab frame.
The concept of the “heat of reaction” Q holds good for reactions
too. In this case, its value is $c^2$ times the difference between the net mass
of the reactants and the net mass of the products of the reaction. Conservation
of energy tells us that this will be the increase in the net kinetic
energy of the products over those entering into the reaction. In
this case, we borrow terminology from chemistry and call a reaction
exothermic if Q > 0, and endothermic if Q < 0. Note that a decay
will never occur if Q < 0 for it, but it is possible to have a reaction
which is endothermic - all that will happen is that the products
will emerge with less kinetic energy than that carried in by the
reactants. Of course, the incoming particles must have enough
energy to make up the deficit.
If the Q value of an endothermic reaction

\[ X + \alpha \to Y_1 + Y_2 + \ldots \]

is, say, −4.0 MeV, you may be inclined to believe that the reaction
will occur if you shoot α particles with a minimum energy of
4.0 MeV at a stationary target of X nuclei. In reality, the α particles
have to be more energetic than this. The reason is the same as
the one that prevented the fast moving proton from decaying - in
this case, the fact that the incoming α particle has some momentum
ensures that the reaction products would also have to have some
minimum kinetic energy to begin with - just providing them with
enough to overcome the Q deficit will not do!


Just providing enough kinetic energy to overcome the Q deficit
will be enough to cause the reaction - but only if you can carry
it out under the most favourable situation for the reaction to occur
- one in which the products can be produced at rest! Conserva-
tion of momentum tells you that this is possible only if the initial
momentum is zero (this is why we looked at the rest frame of the
mother particle in the case of a decay). Thus, the lab frame (in
which the target has zero momentum but the projectile has a lot of
it) is not suitable for looking at the extreme case - we should look
at the so called center of momentum (CM) frame. For an endother-
mic reaction with Q = −4.0 MeV to occur, the net kinetic energy of
the reactants in the CM frame must be at least +4.0 MeV. What
is important for you to know, if you want to design an experiment
to carry out this reaction, is the minimum kinetic energy that the
projectile must have in the lab frame. Transforming from the CM
frame to the lab frame and vice versa is thus very important in
all studies of nuclear reactions. However, in what follows, we will
use two approaches, based on (what else!) the K-calculus and 4-
momenta, respectively, to calculate the threshold energy directly in
the lab frame.

6.12.4.1 The k-calculus approach

Let the Doppler factor of the projectile particle m1 be k in the lab


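frame. In this case, the equalities (6.21) and (6.22) take the form

\[ m_1 k + m_2 = \sum_{j=1}^{N} \mu_j \kappa_j \]
\[ \frac{m_1}{k} + m_2 = \sum_{j=1}^{N} \frac{\mu_j}{\kappa_j} \]

where $\mu_j$ and $\kappa_j$ are the masses and Doppler factors of the N products,
so that simply multiplying both sides gives us

\[ m_1^2 + m_2^2 + m_1 m_2 \left(k + \frac{1}{k}\right) = \left(\sum_{j=1}^{N} \mu_j \kappa_j\right)\left(\sum_{j=1}^{N} \frac{\mu_j}{\kappa_j}\right) = \sum_{j=1}^{N} \mu_j^2 + \sum_{j<k=1}^{N} \mu_j \mu_k \left(\frac{\kappa_j}{\kappa_k} + \frac{\kappa_k}{\kappa_j}\right) \]

(the summation index k on the right should not be confused with the
Doppler factor k of the projectile). Now,
$m_1 m_2 \left(k + \frac{1}{k}\right) = 2 m_2 E_1/c^2$, while elementary algebra tells us
that
\[ \frac{\kappa_j}{\kappa_k} + \frac{\kappa_k}{\kappa_j} = 2 + \left(\sqrt{\frac{\kappa_j}{\kappa_k}} - \sqrt{\frac{\kappa_k}{\kappa_j}}\right)^2 \geq 2 \]
with equality holding only for $\kappa_j = \kappa_k$. Thus

\[ m_1^2 + m_2^2 + \frac{2 m_2 E_1}{c^2} \geq \sum_{j=1}^{N} \mu_j^2 + \sum_{j<k=1}^{N} 2 \mu_j \mu_k = \left(\sum_{j=1}^{N} \mu_j\right)^2 = M^2 \]

where M is the net mass of the products of the reaction. The
equality is reached, of course, when all the product particles have
the same k factor - i.e., they move as a single lump after the
reaction. This inequality, which we will rederive using 4-momenta
arguments below, allows us to put a lower bound on the energy of
the incident projectile.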

6.12.4.2 The 4-momentum approach

Another way to look at this hinges around the result for P 2 that
is expressed in (6.30) and the inequality (6.31). If the net four-
momentum of the products of the reaction is P , then we can write

\[ P_1 + P_2 = P \]

where $P_1$ and $P_2$ are the four-momenta of the projectile and the
target, respectively. Now, the two incident 4-momenta take the form
$P_1 = (E_1/c, \vec{p}_1)$ and $P_2 = (m_2 c, \vec{0})$ in the lab frame, where $E_1$ is the
energy of the projectile in this frame - so that we have $P_1 \cdot P_2 = m_2 E_1$.
“Squaring” both sides of $P_1 + P_2 = P$, we get

\[ c^2\left(m_1^2 + m_2^2\right) + 2 m_2 E_1 = P^2 \geq c^2 M^2 \]

which is the same inequality that we have derived using the K-calculus
above.
Since the kinetic energy of the projectile in the lab frame, $K_{lab}$,
is given by $E_1 - m_1 c^2$, we must have

\[ 2 m_2 K_{lab} \geq c^2\left(M^2 - (m_1 + m_2)^2\right) = -Q\,(M + m_1 + m_2) \]

and thus an endothermic reaction becomes energetically feasible
only if
\[ K_{lab} \geq \frac{M + m_1 + m_2}{2 m_2}\,|Q| = \left(1 + \frac{m_1}{m_2} + \frac{|Q|}{2 m_2 c^2}\right)|Q| \approx \left(1 + \frac{m_1}{m_2}\right)|Q| \]

This shows that the lighter the target is, the harder the reaction
gets. In fact, if an endothermic reaction is to occur by a projectile
particle hitting an identical, stationary, particle, the projectile must
have more kinetic energy than twice the Q deficit!
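As an illustration of the threshold condition, here is a minimal sketch in units where c = 1, with $m_1$ the projectile and $m_2$ the stationary target as in the k-calculus derivation. The function names are mine, and the example reaction (a proton striking a proton at rest and producing an extra neutral pion, with rounded rest energies) is an assumption chosen because the two colliding particles are identical.

```python
import math


def threshold_kinetic_energy(m_projectile, m_target, product_masses):
    """Minimum lab-frame kinetic energy of the projectile (masses in MeV, c = 1)."""
    m_out = sum(product_masses)
    q = m_projectile + m_target - m_out     # Q value; negative if endothermic
    if q >= 0:
        return 0.0                          # exothermic: no threshold
    return -q * (m_out + m_projectile + m_target) / (2.0 * m_target)


def cm_kinetic_energy(k_lab, m_projectile, m_target):
    """Total kinetic energy in the CM frame for a given lab kinetic energy."""
    e_total = k_lab + m_projectile + m_target              # total lab energy
    p_lab = math.sqrt(k_lab * (k_lab + 2.0 * m_projectile))  # projectile momentum
    return math.sqrt(e_total**2 - p_lab**2) - (m_projectile + m_target)


# Rounded rest energies in MeV, assumed for this sketch.
M_P, M_PI0 = 938.272, 134.977

k_min = threshold_kinetic_energy(M_P, M_P, [M_P, M_P, M_PI0])
print(f"|Q|                       = {M_PI0:.1f} MeV")
print(f"lab threshold             = {k_min:.1f} MeV")
print(f"CM kinetic energy there   = {cm_kinetic_energy(k_min, M_P, M_P):.1f} MeV")
```

The lab threshold comes out a little more than twice |Q|, while the kinetic energy available in the CM frame at that point is exactly |Q|, which is the statement made earlier about the most favourable situation.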

6.12.5 The Compton effect


We have already met the Compton effect in its most extreme form in
section 6.3. There we talked about a photon bouncing back off a
stationary electron. In general, though, the photon would bounce
off at some angle from the incident direction. Since the photon

gives the electron a kick, giving up some of its energy - it ends


up with less energy than it started out with. This means that the
photon emerging out of the collision will have a lower frequency and
hence longer wavelength than when it came in. This is exactly what
was observed by Arthur Compton in 1922. The Compton effect was
the first direct observation of the particle nature of the photon, in so
far as it proved that not only do photons carry the energy of light,
they also carry momentum!

6.12.5.1 The shift

To calculate the shift in the wavelength of the photon, we start by


writing down the conservation laws. The photon-electron collision
is shown in figure .
We can immediately write down the law of conservation of energy

\[ \hbar c k_i + m c^2 = \hbar c k_f + \frac{m c^2}{\sqrt{1 - v^2/c^2}} \tag{6.60-a} \]

as well as the X and Y components of the law of conservation of
momentum

\[ \hbar k_i = \hbar k_f \cos\vartheta + \frac{m v}{\sqrt{1 - v^2/c^2}} \cos\phi \tag{6.60-b} \]
\[ 0 = \hbar k_f \sin\vartheta - \frac{m v}{\sqrt{1 - v^2/c^2}} \sin\phi \tag{6.60-c} \]

Since our aim is to find out the shift in the photon wavelength
against its deflection angle ϑ, we must get rid of the electron speed

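v and the angle φ. Eliminating φ from (6.60-b) and (6.60-c) gives

\[ \frac{m^2 v^2}{1 - v^2/c^2} = \hbar^2\left(k_i^2 + k_f^2 - 2 k_i k_f \cos\vartheta\right) \]

Again, (6.60-a) gives

\[ \frac{m^2 c^2}{1 - v^2/c^2} = m^2 c^2 + 2 m c \hbar\,(k_i - k_f) + \hbar^2 (k_i - k_f)^2 \]

Subtracting the first of these two equations from the second immediately
gives rise to

\[ m^2 c^2 = m^2 c^2 + 2 m c \hbar\,(k_i - k_f) - 2\hbar^2 k_i k_f (1 - \cos\vartheta) \]

leading to
\[ \frac{\hbar}{m c}(1 - \cos\vartheta) = \frac{1}{k_f} - \frac{1}{k_i} = \frac{1}{2\pi}(\lambda_f - \lambda_i) \]
This gives the formula

\[ \Delta\lambda = \frac{h}{m c}(1 - \cos\vartheta) \tag{6.61} \]

for the shift in wavelength in the Compton effect.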


This formula fits the experimental observations completely - ex-
cept for one small fact. Compton had found that along with a shift
in wavelength as predicted above, the light scattered in a particu-
lar direction also had a component that had the same wavelength
as the incident light. The question is, how could the photon de-
flect after kicking an electron without losing any of its energy? The
answer lies in the m in the denominator of the equation above. A
photon may collide with an electron that is so tightly bound with
its atom that it is not the electron, but the entire atom that recoils in
the collision. This means that the mass that should be used in the
equation is that of the entire atom rather than that of the electron

- no wonder the shift in wavelength is too small to detect in this


case.
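To see how much difference the recoiling mass makes, one can evaluate (6.61) for a free electron and for a whole atom. The sketch below assumes a carbon atom as the tightly bound case purely for illustration; the constants are standard values rounded to a few digits, and the function name is my own.

```python
import math

H = 6.62607015e-34           # Planck constant, J s
C = 2.99792458e8             # speed of light, m/s
M_ELECTRON = 9.1093837e-31   # electron mass, kg
ATOMIC_MASS_UNIT = 1.66053907e-27   # kg


def compton_shift(theta, mass):
    """Equation (6.61): wavelength shift in metres for deflection angle theta."""
    return H / (mass * C) * (1.0 - math.cos(theta))


theta = math.pi / 2                                   # photon deflected by 90 degrees
shift_free = compton_shift(theta, M_ELECTRON)         # recoiling free electron
shift_atom = compton_shift(theta, 12.0 * ATOMIC_MASS_UNIT)   # whole carbon atom recoils

print(f"free electron : {shift_free:.3e} m")
print(f"carbon atom   : {shift_atom:.3e} m")
print(f"ratio         : {shift_atom / shift_free:.2e}")
```

The shift for the atom is smaller by the ratio of the electron mass to the atomic mass, some tens of thousands of times, which is why that component appears unshifted.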

6.12.5.2 The four-vector approach

Let me now show you how simple this calculation becomes if we


make use of the properties of the energy-momentum four-vector.
Note that in what we did above, my aim was to eliminate the speed
and direction of motion of the electron in the final state. This can be
immediately achieved by taking the norm of the energy-momentum
four-vector for the electron - this will leave just $m^2 c^2$, wiping out all
traces of $\vec{v}$.
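The conservation of four-momentum

\[ P_{\gamma_i} + P_{e_i} = P_{\gamma_f} + P_{e_f} \]

can be rewritten as
\[ P_{e_f} = P_{\gamma_i} - P_{\gamma_f} + P_{e_i} \]

which when “squared” leads to

\[ m^2 c^2 = m^2 c^2 + 2\left(P_{\gamma_i} - P_{\gamma_f}\right) \cdot P_{e_i} - 2\, P_{\gamma_i} \cdot P_{\gamma_f} \]

where I have used $P_{\gamma_i}^2 = P_{\gamma_f}^2 = 0$ and $P_{e_i}^2 = P_{e_f}^2 = m^2 c^2$. This means
that

\[ P_{\gamma_i} \cdot P_{\gamma_f} = \left(P_{\gamma_i} - P_{\gamma_f}\right) \cdot P_{e_i} \]

Now, in the lab frame, we have

\[ P_{\gamma_i} = \hbar\left(k_i, \vec{k}_i\right), \quad P_{\gamma_f} = \hbar\left(k_f, \vec{k}_f\right), \quad P_{e_i} = \left(m c, \vec{0}\right) \]

which means that the above equation becomes

\[ \hbar^2\left(k_i k_f - \vec{k}_i \cdot \vec{k}_f\right) = \hbar\,(k_i - k_f)\, m c \]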

From this, you can immediately arrive at the Compton shift for-
mula, equation (6.61).
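For completeness, the remaining steps might be written out as follows; this is just the algebra implied above, using the fact that ϑ is the angle between the incoming and outgoing photon directions.

```latex
\begin{align*}
\hbar^2 k_i k_f\,(1 - \cos\vartheta) &= \hbar\,(k_i - k_f)\,m c
  && \text{since } \vec{k}_i \cdot \vec{k}_f = k_i k_f \cos\vartheta \\
\frac{\hbar}{m c}\,(1 - \cos\vartheta) &= \frac{k_i - k_f}{k_i k_f}
  = \frac{1}{k_f} - \frac{1}{k_i} \\
\Delta\lambda = \lambda_f - \lambda_i &= \frac{2\pi\hbar}{m c}\,(1 - \cos\vartheta)
  = \frac{h}{m c}\,(1 - \cos\vartheta)
  && \text{using } k = 2\pi/\lambda
\end{align*}
```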

6.12.6 Relativistic billiards


As every billiards player knows, if a ball strikes another of the same
mass at rest, after the collision, which is practically elastic, the two
move off in mutually perpendicular directions. This result is valid
for classical physics - here we wish to examine how it gets modi-
fied when the projectile ball is really fast - so fast that relativistic
mechanics needs to be applied. A difference between this prob-
lem and the previous ones is that unlike in those cases no simple,
elegant 4-vector argument presents itself. The method which we
will use here falls back directly on the principle of relativity itself:
since the laws of physics are the same in all frames, solve the prob-
lem in a frame where the problem becomes simple and transform
your results back to the frame in which you want the answer. To
illustrate this method, let me use it to solve this problem in the
non-relativistic case first12 .
12
I do not claim that this is the simplest way to solve the non-relativistic
version of this problem. Arguably the simplest solution could run as follows:
conservation of momentum and kinetic energy will lead to the equations
$\vec{v}_1 + \vec{v}_2 = \vec{u}$ and $v_1^2 + v_2^2 = u^2$ after cancellation of factors involving m.
If you are geometrically minded you could simply say that the first equation
implies that the vector $\vec{u}$ forms the third side of a triangle formed by
$\vec{v}_1$ and $\vec{v}_2$, while the second equation says that this triangle is
right-angled! The more algebraically inclined might “square” the first equation
to arrive at $u^2 = (\vec{v}_1 + \vec{v}_2)^2 = v_1^2 + v_2^2 + 2\,\vec{v}_1 \cdot \vec{v}_2$, which
implies $\vec{v}_1 \cdot \vec{v}_2 = 0$!

6.12.6.1 The non-relativistic problem

Since this is a purely mechanical problem, we do have a principle


of relativity to fall back upon in this case even for non-relativistic
physics. The frame in which the problem simplifies considerably
is the CM (center of momentum) frame where the net momentum
vanishes. This means that in this frame the two balls approach
each other with equal and opposite velocities $\vec{u}_{CM}$ and $-\vec{u}_{CM}$,
respectively. After the collision, the net momentum should again be
zero - which implies that the particles will again have equal and
opposite velocities, $\vec{v}_{CM}$ and $-\vec{v}_{CM}$. Conservation of energy then
tells us that $|\vec{u}_{CM}| = |\vec{v}_{CM}|$.
To find out the situation in the lab frame all we have to do is
apply the relevant Galilean transformation to the velocities. Since
the second ball is initially at rest in the lab frame, all you have to
do here is subtract the velocity $-\vec{u}_{CM}$ from every velocity. Thus,
the post-collision velocities of balls 1 and 2 are $\vec{v}_1 = \vec{v}_{CM} + \vec{u}_{CM}$
and $\vec{v}_2 = -\vec{v}_{CM} + \vec{u}_{CM}$. This means that

\[ \vec{v}_1 \cdot \vec{v}_2 = -\vec{v}_{CM}^{\,2} + \vec{u}_{CM}^{\,2} = 0 \]

since $|\vec{u}_{CM}| = |\vec{v}_{CM}|$!

6.12.6.2 The relativistic case

The first part of the solution is identical for relativistic billiards.


In the CM frame, the two balls do emerge with equal and opposite
velocities with the same magnitude v as the initial velocities. The
only reason why the relativistic case is somewhat more complicated
is that the velocity transformations back to the lab frame are more
complicated in the relativistic case.
Since all we are interested in are the directions of the post-collision
velocities, the problem simplifies somewhat. In the CM


frame, we take the initial direction of motion of the two particles
as the $X'$ axis. The two emerging particles make the angles $\theta'$ and
$\phi' = \pi - \theta'$ to the $X'$ axis in this frame. To find the corresponding
angles, θ and φ in the lab frame, we can use the particle aberration
formula to get:

\[ \tan\theta = \frac{v \sin\theta'}{\gamma(v)\,(v\cos\theta' + v)} = \frac{\sin\theta'}{\gamma(v)\,(\cos\theta' + 1)} \]
\[ \tan\phi = \frac{\sin\phi'}{\gamma(v)\,(\cos\phi' + 1)} = \frac{\sin\theta'}{\gamma(v)\,(-\cos\theta' + 1)} \]

The simplest way to eliminate the unknown (and unwanted) angles
$\theta'$ and $\phi'$ is to multiply these two equations to get

\[ \tan\theta \tan\phi = \frac{1}{\gamma^2(v)} \]

Note that in the non-relativistic limit $v \ll c$, $\gamma(v) \to 1$, which tells
us that $\tan\theta \tan\phi = 1$, and hence we recover the result $\theta + \phi = \pi/2$.
In the relativistic case, however, the product $\tan\theta \tan\phi$ is less than
one - implying that the angle between the final velocity directions
is less than $\pi/2$.
We are not done with the relativistic case yet, though. We still
have an unknown γ (v) to get rid of, in favor of the known quantity
γ (u), where u is the speed of the projectile in the lab frame. Though
we can calculate γ (v) in terms of γ (u) either directly or via the
equation (5.24), we will use a 4-vector argument to directly derive
this. The 4-velocities $U_1$ and $U_2$ have components $\gamma(v)\,(c, \vec{v}_{CM})$
and $\gamma(v)\,(c, -\vec{v}_{CM})$, respectively, in the CM frame, while in the
lab frame, the components are $\gamma(u)\,(c, \vec{u})$ and $(c, \vec{0})$, respectively.
Equating the values of the invariant $(U_1 + U_2)^2$ calculated in the
two frames gives us

\[ 4\gamma^2(v)\,c^2 = (\gamma(u) + 1)^2 c^2 - \gamma^2(u)\,u^2 = c^2\left[\gamma^2(u)\left(1 - \frac{u^2}{c^2}\right) + 2\gamma(u) + 1\right] = 2c^2\,(\gamma(u) + 1) \]

so that
\[ \gamma^2(v) = \frac{\gamma(u) + 1}{2} \]
and our formula becomes

\[ \tan\theta \tan\phi = \frac{2}{\gamma(u) + 1} \tag{6.62} \]

This shows that the faster the projectile, the smaller tan θ tan φ is,
and hence the smaller the angle between the emergent directions.
This is one of the predictions of relativity that can be directly tested
in cloud chamber type experiments. Observing the tracks when a
fast cosmic-ray electron hits a stationary electron shows this re-
duction in angle vividly. Indeed, by observing the angular distri-
bution of the outgoing tracks a lot of information can be gathered
about the incident electron’s speed.
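The formula (6.62) can be checked by brute force: pick a scattering angle in the CM frame, transform both outgoing velocities to the lab frame with the relativistic velocity addition law, and compare. The sketch below works in units where c = 1; the function name and the sample values of the Lorentz factor and angle are arbitrary choices of mine.

```python
import math


def lab_angles(gamma_u, theta_cm):
    """Lab-frame angles (to the beam axis) of two equal masses after an
    elastic collision, in units where c = 1.

    gamma_u  : Lorentz factor of the projectile in the lab frame (> 1)
    theta_cm : scattering angle of ball 1 in the CM frame (0 < theta_cm < pi)
    """
    gamma_v = math.sqrt((gamma_u + 1.0) / 2.0)   # Lorentz factor of each ball in the CM frame
    v = math.sqrt(1.0 - 1.0 / gamma_v**2)        # speed of each ball in the CM frame
    angles = []
    for sign in (+1.0, -1.0):                    # ball 1, then ball 2 (opposite velocity)
        vx = sign * v * math.cos(theta_cm)
        vy = sign * v * math.sin(theta_cm)
        # The CM frame moves at +v along x relative to the lab frame,
        # so apply relativistic velocity addition.
        denom = 1.0 + v * vx
        ux = (vx + v) / denom
        uy = vy / (gamma_v * denom)
        angles.append(math.atan2(abs(uy), ux))
    return angles


for gamma_u in (1.0001, 2.0, 10.0):
    theta, phi = lab_angles(gamma_u, 1.2)
    product = math.tan(theta) * math.tan(phi)
    print(f"gamma(u) = {gamma_u:7.4f}   tan(theta) tan(phi) = {product:.6f}   "
          f"2/(gamma(u)+1) = {2.0 / (gamma_u + 1.0):.6f}   "
          f"opening angle = {math.degrees(theta + phi):6.2f} degrees")
```

For a Lorentz factor barely above one the opening angle is essentially 90 degrees, and it shrinks steadily as the projectile gets faster.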

6.12.7 The relativistic rocket


Let us consider a rocket which propels itself by ejecting some of
its mass backwards - thereby gaining speed by the conservation of
momentum. Although the principle of the rocket is simplicity it-
self, the actual calculation is complicated, even for non-relativistic
physics, by the fact that the rocket has a variable mass. Here
we will calculate how the speed of a relativistic rocket varies with
its mass. We will make the simplifying assumption that mass is
ejected from the rocket at a constant speed U in the backward direction
with respect to the rocket. We will once again solve the problem
using two different approaches - one based on the K-calculus,
the other a more conventional one.

6.12.7.1 The K-calculus approach

To simplify the notation, we make use of our old friends


   
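\[ \gamma(k) = \frac{1}{2}\left(k + \frac{1}{k}\right), \qquad \pi(k) = \frac{1}{2}\left(k - \frac{1}{k}\right) \]

from which it immediately follows that

\[ \gamma(k) + \pi(k) = k \]
\[ \gamma(k) - \pi(k) = k^{-1} \]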

We will write the constant K-factor of the ejected mass with respect
to the rocket as $K^{-1}$ (since the mass moves backwards, its K-factor
is less than 1; K itself, however, is larger than 1), where
\[ K = \sqrt{\frac{c + U}{c - U}} \]

Now let the mass of the rocket change from M to M + dM in


the infinitesimal time interval from t to t + dt, in which time the
rocket’s K-factor with respect to some fixed inertial frame changes
from k to k + dk. In this frame, which we will take to be the frame
in which the rocket is initially at rest for convenience, the K-factor
of the ejected mass is k/K. Let the mass of the ejected gas in this
interval be δM (in classical physics, conservation of mass would
have told us that δM = −dM . In relativity, however, it is energy
that is conserved, and not mass!). Then, the laws of conservation

of energy and momentum applied to this immediately lead to

\[ M\gamma(k) = (M + dM)\,\gamma(k + dk) + \delta M\,\gamma(k/K) \]
\[ M\pi(k) = (M + dM)\,\pi(k + dk) + \delta M\,\pi(k/K) \]

which can be rewritten in the form

\[ -\delta M\,\gamma(k/K) = dM\,\gamma(k) + M\dot\gamma(k)\,dk \]
\[ -\delta M\,\pi(k/K) = dM\,\pi(k) + M\dot\pi(k)\,dk \]

where the dot denotes differentiation with respect to k.
We can easily eliminate the unknown δM by dividing the two equations
above to get

\[ \frac{\gamma(k/K)}{\pi(k/K)} = \frac{k^2 + K^2}{k^2 - K^2} = \frac{dM\,\gamma(k) + M\dot\gamma(k)\,dk}{dM\,\pi(k) + M\dot\pi(k)\,dk} \]

and using componendo and dividendo on this yields

\[ \frac{k^2}{K^2} = \frac{dM\,(\gamma(k) + \pi(k)) + M\,(\dot\gamma(k) + \dot\pi(k))\,dk}{dM\,(\gamma(k) - \pi(k)) + M\,(\dot\gamma(k) - \dot\pi(k))\,dk} = \frac{k\,dM + M\,dk}{k^{-1}\,dM - k^{-2} M\,dk} \]

and thus
\[ \frac{1}{K^2} = \frac{k\,dM + M\,dk}{k\,dM - M\,dk} \]
so that
\[ \frac{k\,dM}{M\,dk} = \frac{1 + K^2}{1 - K^2} = -\frac{c}{U} \]
which yields the differential equation

\[ \frac{dM}{M} = -\frac{c}{U}\,\frac{dk}{k} \]

Using the initial values $M = M_0$ when $k = 1$, we can immediately
solve this equation to get

\[ \frac{M}{M_0} = k^{-c/U} = \left(\frac{c + u}{c - u}\right)^{-c/2U} \tag{6.63} \]

which tells us how the final mass decreases with increasing speed
of the rocket. Thus, the higher the speed that you want your rocket
to achieve, the smaller can the payload carried by your rocket be!
To achieve a larger residual mass for a given final speed u, U must
be as large as possible. This gives us the photon ship - very pop-
ular in science fiction - which propels itself by ejecting a stream of
photons! For a photon ship (6.63) becomes
\[ \frac{M}{M_0} = \sqrt{\frac{c - u}{c + u}} \tag{6.64} \]
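To get a feel for what (6.63) and (6.64) demand, here is a small sketch in units where c = 1. The function names are mine, and the comparison with the classical rocket equation $M/M_0 = e^{-u/U}$ (the Tsiolkovsky result, which is not derived in this text) is added only as a sanity check at low speeds.

```python
import math


def mass_ratio(u, exhaust_speed):
    """Equation (6.63): M/M0 for a rocket that has reached speed u (c = 1)."""
    k = math.sqrt((1.0 + u) / (1.0 - u))   # Doppler factor of the rocket
    return k ** (-1.0 / exhaust_speed)


def classical_mass_ratio(u, exhaust_speed):
    """Non-relativistic rocket equation, for comparison at low speeds."""
    return math.exp(-u / exhaust_speed)


for u in (0.1, 0.5, 0.9, 0.99):
    print(f"u = {u:4.2f} c   exhaust at 0.1 c: M/M0 = {mass_ratio(u, 0.1):.3e}   "
          f"photon ship: M/M0 = {mass_ratio(u, 1.0):.3e}")
```

Setting the exhaust speed equal to 1 reproduces the photon ship formula (6.64), and even in that best case the surviving mass fraction falls off quickly as u approaches c.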

6.12.7.2 The conventional approach

Instead of using the K-calculus we can use the same physical


arguments directly in terms of the velocity. The velocity of the
ejected mass δM in the initial rest frame of the rocket, which I will
write as $\bar{U}$, is given by
$\bar{U} = (u - U)/(1 - uU/c^2)$. This is where the reason behind the simplicity
of the K-calculus approach should become evident! We can
bravely press forward and write the energy and momentum conservation
equations as

\[ M c^2 \gamma(u) = (M + dM)\,c^2 \gamma(u + du) + \delta M\,c^2 \gamma(\bar{U}) \]
\[ M \gamma(u)\,u = (M + dM)\,\gamma(u + du)\,(u + du) + \delta M\,\gamma(\bar{U})\,\bar{U} \]

from which it follows that

\[ \bar{U} = \frac{d\,(M\gamma(u)\,u)}{d\,(M\gamma(u))} = u + M\gamma(u)\,\frac{du}{d\,(M\gamma(u))} \]

This implies

\[ \frac{1}{M\gamma(u)}\frac{d}{du}\left(M\gamma(u)\right) = \frac{1}{\bar{U} - u} = -\frac{1 - uU/c^2}{(1 - u^2/c^2)\,U} = -\frac{1}{2U}\left[\frac{1 - U/c}{1 - u/c} + \frac{1 + U/c}{1 + u/c}\right] \]

leading to the solution (using $M = M_0$ at $u = 0$)

\[ \ln\frac{M\gamma(u)}{M_0\gamma(0)} = \ln\frac{M}{M_0} + \ln\gamma(u) = -\frac{c}{2U}\ln\frac{1 + u/c}{1 - u/c} - \frac{1}{2}\ln\left(1 - u^2/c^2\right) \]

which simplifies to
\[ \frac{M}{M_0} = \left(\frac{c - u}{c + u}\right)^{c/2U} \]
which is the solution we were after.
There is an arguably slightly simpler way of arriving at this so-
lution. This involves using a special frame in which the description
of the process becomes simple. This frame is the ICF, the frame in
which the rocket is stationary at a particular instant of time. At
a time $t$, the ICF moves at a velocity $u$ with respect to the ini-
tial rest frame of the rocket. After a time $dt$, the ejected gas (mass
$\delta M$) will have a velocity $-U$ while the rocket will have a velocity
$du'$ (remember, the ICF is not fixed to the rocket, and thus the
rocket is stationary in this frame only at the time $t$). Thus, energy
and momentum conservation in this frame lead to

$$M c^2 = (M + dM)\,\gamma(du')\,c^2 + \delta M\,\gamma(-U)\,c^2$$
$$0 = (M + dM)\,\gamma(du')\,du' + \delta M\,\gamma(-U)\,(-U)$$

Noting that $\gamma(du') = \gamma(0) + \dot{\gamma}(0)\,du' + O(du'^2) = 1 + O(du'^2)$, we get
$\delta M\,\gamma(-U) = -dM$ and thus

$$M\,du' = -dM\,U \tag{6.65}$$



Unfortunately, we cannot directly integrate $du'$ to get the increase in
the rocket's velocity. This is because the increments $du'$ at different
times are measured with respect to different frames. In order to find the
velocity, we need the velocity increments in a single frame
- say, the initial rest frame of the rocket. We can use the velocity
addition law
$$u = \frac{u' + v}{1 + u'v/c^2}$$
to find

$$du = \frac{\left(1 + u'v/c^2\right)du' - \left(u' + v\right)v\,du'/c^2}{\left(1 + u'v/c^2\right)^2} = \frac{\left(1 - v^2/c^2\right)du'}{\left(1 + u'v/c^2\right)^2}$$

Since in the ICF, $v = u$ and $u' = 0$, this means that

$$du = \left(1 - \frac{u^2}{c^2}\right)du' = -\left(1 - \frac{u^2}{c^2}\right)U\,\frac{dM}{M}$$

which can be easily solved to get (6.63). If you ask me, the
simplicity that one gains by going over to the ICF is ruined by the
fact that you have to convert $du'$ back to $du$ before you can integrate
it. Make your own choice!
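
The point about increments in different frames can be made concrete with a few lines of Python. This is only a sketch under stated assumptions: the function names, the units with c = 1 and the choice of equal steps in ln M are mine. Each small boost of size -U dM/M, taken from (6.65), is measured in the current ICF, so it is combined with the running velocity through the velocity addition law. A naive sum of the increments is computed alongside for contrast.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1


def final_speed_by_composition(mass_ratio, U, c=C, steps=10_000):
    """Compose many small ICF boosts du' = -U dM/M via velocity addition."""
    d_ln_m = math.log(mass_ratio) / steps  # negative: the rocket loses mass
    du_prime = -U * d_ln_m                 # speed gained in the ICF per step
    u = 0.0
    for _ in range(steps):
        # velocity addition with v = current u (the ICF velocity), u' = du'
        u = (u + du_prime) / (1.0 + u * du_prime / c**2)
    return u


def final_speed_naive_sum(mass_ratio, U):
    """What you would get by (wrongly) adding up the du' directly."""
    return -U * math.log(mass_ratio)


def final_speed_closed_form(mass_ratio, U, c=C):
    """Invert M/M0 = k**(-c/U) for k, then u = c (k^2 - 1)/(k^2 + 1)."""
    k = mass_ratio ** (-U / c)
    return c * (k**2 - 1.0) / (k**2 + 1.0)


if __name__ == "__main__":
    ratio, U = 0.25, 0.5
    print(final_speed_by_composition(ratio, U))
    print(final_speed_closed_form(ratio, U))
    print(final_speed_naive_sum(ratio, U))
```

With M/M0 = 1/4 and U = 0.5c, composing the boosts reproduces u = 0.6c, while the naive sum overshoots to about 0.69c. For a small enough residual mass the naive sum would even exceed c.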
As far as the photon rocket is concerned, we have already de-
rived the result (6.64) for it as a special case of the general result
(6.63). There is, however, a much simpler way of deriving this re-
sult, and this hinges on the fact that the energy-momentum
relation for photons is linear, which means that the net momentum
carried away by the photons is just the total energy of the photons
divided by $c$. (For a massive particle we have $|\vec{p}| = \sqrt{E^2 - m^2c^4}/c$,
so for a rocket ejecting massive particles the total momentum of the
gas, ejected at different speeds with respect to a given inertial
observer at different times, cannot be related to the total energy
in a simple way. In other words, you cannot calculate
the total momentum from the total energy of a bunch of massive
particles, while you can do this for a gas of photons!) This allows
us to directly use energy-momentum conservation between the ini-
tial and the final stages of the rocket to get ($E_\gamma$ is the net energy of
the photon gas)

$$E_\gamma + M\gamma(u)\,c^2 = M_0 c^2$$
$$p_\gamma + M\gamma(u)\,u = 0$$

Since $E_\gamma = -p_\gamma c$ (remember, $p_\gamma$ is negative and thus $|\vec{p}| = -p_\gamma$), the
two equations combine to give
$$M\gamma(u)\left(1 + \frac{u}{c}\right) = M_0$$

which immediately leads to (6.64).
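
The last step is short enough to spell out. Using the standard form $\gamma(u) = 1/\sqrt{1 - u^2/c^2}$ and factorising $1 - u^2/c^2 = (1 - u/c)(1 + u/c)$, we have

$$\gamma(u)\left(1 + \frac{u}{c}\right) = \frac{1 + u/c}{\sqrt{(1 - u/c)(1 + u/c)}} = \sqrt{\frac{c+u}{c-u}} \quad\Longrightarrow\quad \frac{M}{M_0} = \sqrt{\frac{c-u}{c+u}}$$

As a by-product, the energy conservation equation then gives the total energy carried off by the photons:

$$E_\gamma = M_0 c^2 - M\gamma(u)\,c^2 = M_0 c^2\,\frac{u}{c+u}$$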

6.13 Motion under force


So far we have made do with just the conservation laws of energy
and momentum. This is in stark contrast with the classical ap-
proach to elementary mechanics, where force and Newton’s sec-
ond law play a central role.
Chapter 7

Electrodynamics and
relativity

7.1 Faraday and Einstein

7.2 Why does a current produce a magnetic field?

Imagine that the history of physics in some other world is quite
different from the one in ours. In this world, the special theory of
relativity arrived a lot earlier - before the fact that currents
produce magnetic fields was discovered! All that is known of
what we call electromagnetism is Coulomb’s law for electrostatics.
In this imaginary world, let us follow the footsteps of a pioneer-
ing physicist who is trying to deduce the effect of a current in a
straight wire on a charge moving in its vicinity.


7.3 Transforming the fields

7.4 The field of moving charges

7.5 Potentials to the fore

7.6 Light - again!
