Syllabus SRT v024
LECTURE NOTES
Authors:
Nicolo DE GROOT and
Sijbrand DE JONG
Faculty of Science
Institute for Mathematics, Astrophysics and Particle Physics
Contents
Introduction
6 Relativistic collisions
  6.1 Natural units
  6.2 Two body decay of a particle
    6.2.1 Alternative method
    6.2.2 Example 1: the decay π+ → µ+ νµ
    6.2.3 Example 2: the decay K∗+ → K+ π0
    6.2.4 Boosting the parent particle
  6.3 Collisions
    6.3.1 Example 1: The Greisen-Zatsepin-Kuzmin (GZK) limit
    6.3.2 Example 2: Compton Scattering
7 Loose ends
  7.1 The Higgs mechanics or how heavy is wet light?
  7.2 Gravitational redshift, a first step towards general relativity
    7.2.1 Example: GPS corrections
Introduction
This reader is meant as additional material on special relativity. We have been using
the college physics book by Serway for many years. It is perfectly adequate as far
as classical mechanics (and thermodynamics, waves, and electricity and magnetism)
is concerned, but it is remarkably superficial where special relativity is concerned.
The mathematics of special relativity is not very complicated. Typically it does not
go beyond the use of a square root. What does make it feel complicated to first-year
students is that it is counterintuitive. This reader tries to focus on the conceptual
side of relativity first and the equations next, but without avoiding the equations as
Serway does.
This reader was first conceived by Nicolo de Groot. He has been inspired by several
sources of information, in particular Sander Bais' book de sublieme eenvoud van
relativiteit and the readers for the Special Relativity course by Sijbrand de Jong and
Paul Avery (University of Florida). The section of chapter 7 on the Higgs mechanism
was due to an idea by Ronald Kleiss and Sijbrand de Jong. For the 2022/2023 course,
Sijbrand de Jong has mildly edited the original version.
Mechanics is the study of the movement of objects. We are trying to describe the position
of an object as a function of time: x(t). Our starting point is Newton's laws. The
first one states that:
Every body perseveres in its state of rest, or of uniform motion in a right line, unless it is
compelled to change that state by forces impressed thereon.
It introduces the concept of a force and states that in the absence of an external force
objects keep moving in a straight line at constant velocity.
Conversely, the presence of a force implies a change in
velocity. This is the topic of Newton's second law, which states:
The alteration of motion is ever proportional to the motive force impressed; and is made
in the direction of the right line in which that force is impressed.
With motion, Newton means momentum, the product of mass and velocity: $\vec{p} = m\vec{v}$.
Putting the second law into a formula:
\[ \frac{d\vec{p}}{dt} = \vec{F}. \tag{1.1} \]
When mass is constant in time this reduces to the more familiar $\vec{F} = m\,d\vec{v}/dt = m\vec{a}$.
Newton's first law is not obviously true for every observer. For instance, an observer in
a rotating frame, as we are on earth, will observe a pseudo-force that seems to push
objects away from uniform straight-line motion in that frame. Think of the Coriolis force
or the centrifugal force. Frames in which Newton's first law is valid are called inertial
frames.
Every observer is free to choose the direction of the x, y and z axes and the origin
for the time and place coordinates. This leads to many possible inertial frames.
We can move (transform) from one inertial frame into another. Such a transformation
describes how a new observer S′ measures time and position $(t', \vec{r}\,')$, given the
coordinates $(t, \vec{r})$ measured by S.
Possible transformations from one inertial frame into the other are:
Translation Two observers S and S′ have the origin of their coordinate system in a
different place. The transformation can be written as:
\[ \begin{cases} \vec{r}\,' = \vec{r} + \vec{a} \\ t' = t \end{cases} \tag{1.6} \]
Rotation The axes of the coordinate system of S′ have been rotated around the origin
with respect to S. The transformation can be written as:
\[ \begin{cases} \vec{r}\,' = \mathbf{A}\,\vec{r} \\ t' = t \end{cases} \tag{1.7} \]
Spatial Inversion The coordinate axes of S′ are mirrored (flipped) in the origin. The
transformation can be written as:
\[ \begin{cases} \vec{r}\,' = -\vec{r} \\ t' = t \end{cases} \tag{1.8} \]
Time Shift The origin t = 0 of the time-axis of S′ has been shifted with respect to S.
The transformation can be written as:
\[ \begin{cases} \vec{r}\,' = \vec{r} \\ t' = t + b \end{cases} \tag{1.9} \]
\[ \vec{u}\,' = \frac{d\vec{r}\,'}{dt} = \frac{d\vec{r}}{dt} - \vec{v} = \vec{u} - \vec{v}. \tag{1.12} \]
The Lorentz force describes the force on a moving particle with charge q and velocity
v in a magnetic field B. It is given - in scalar form - by:
FL = B · q · v. (1.13)
The Lorentz force will push the particle in a direction orthogonal to the magnetic
field and the direction of flight. Imagine two particles, one charged and one neutral,
adjacent to each other. The Lorentz force will move them apart. It is possible
to choose an inertial frame which moves along with the charged particle. In that frame the
particle would have velocity zero and there would be no Lorentz force, regardless
of how B would transform, and the two particles would stay together. This is obviously
not what we observe. Einstein fixed this by rebuilding mechanics based on two
postulates.
2 While we will see that special relativity fixes many things, it does not fix this one. This issue is being
addressed in statistical mechanics and in certain gauge theories in relativistic quantum field theory.
Keep calm and you may encounter these theories later in your education.
3 We use u for the velocity of an object in a certain frame and v to indicate the relative velocity of
two observers.
1. The laws of nature are the same for all inertial observers;
2. The velocity of light in vacuum is the same for all inertial observers.
The first one seems a common sense extension of classical relativity. The second
could be interpreted the same way if one considers the velocity of light as a
fundamental constant (it is!), but it does go against our intuition that velocities are
additive. Experiments show that the velocity of light in vacuum does not depend
on the frequency or wavelength. Light of higher energy and shorter wavelength
travels at the same speed as light of lower energy and longer wavelength. This can
be observed in violent events in stars many lightyears away, where the high-energy
x-ray and the radio signal arrive within seconds of each other, with any time dispersion ascribed
to the physics at the source rather than to the propagation of the light through the
cosmos. The velocity of light is also independent of the velocity of its source. We can
see this in binary pulsars, where the pulses keep arriving regularly, regardless of whether
the pulsar (the source) moves away from or towards the earth. Finally, the speed of
light is independent of the velocity of the observer. This has been confirmed by the
Michelson-Morley experiment.
In classical mechanics with the Galilei transformation, time was absolute; time differences
had the same value for all observers. Once two observers synchronised their
identical watches they would always give the same time thereafter. One of the most
shocking consequences of special relativity is that this is no longer true. This can be
seen in the following thought experiment. We take a lightclock, an instrument that
marks time by bouncing a light pulse between two mirrors separated by a distance L, see Fig. 1.1.
The period of the clock is given by the path length divided by the velocity of light:
Ta = 2L/c.
FIGURE 1.1: The lightclock in the coordinate system of (a) the clock
(b) a moving observer (from https://stephenwhitt.wordpress.com/)
Now we consider the frame of an observer with a relative velocity v with respect
to the clock (diagram (b)). What is the period Tb according to this observer? In the
time the light crosses to the other mirror, the clock has moved by vTb /2. The path
Since γ is always larger than 1 it means that the period of the clock as observed by a
moving observer is always larger than that of an observer in the frame of the clock.
Moving clocks tick slower.
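The light clock argument can also be checked numerically. The following is a minimal sketch, not part of the original notes: the function names, the mirror separation of 1 m and the speed 0.6c are illustrative assumptions. It obtains the period for the moving observer from the diagonal light path (Pythagoras applied to the mirror separation L and the distance v·Tb/2 travelled by the clock) and compares the ratio Tb/Ta with γ.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """Lorentz factor for a relative speed v with |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def period_rest(L):
    """Period T_a = 2L/c in the frame of the clock."""
    return 2.0 * L / C

def period_moving(L, v):
    """Period T_b for an observer moving with speed v relative to the clock.

    One half period follows from (c*T_b/2)**2 = L**2 + (v*T_b/2)**2,
    which gives T_b = 2L / sqrt(c**2 - v**2).
    """
    return 2.0 * L / math.sqrt(C ** 2 - v ** 2)

if __name__ == "__main__":
    L = 1.0        # assumed mirror separation in metres
    v = 0.6 * C    # assumed relative speed
    print("T_b / T_a =", period_moving(L, v) / period_rest(L))
    print("gamma     =", gamma(v))
```

For v = 0.6c both printed numbers should be close to 1.25, independent of the chosen L.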
Another remarkable consequence is the relativity of simultaneity. We can see
this in the classical example of the lightning and the train. A train with an onboard
observer Anna, positioned in the exact middle of the train, is moving at constant speed
v. A second observer, Bob, standing still outside, sees the front and the back of the
train being struck by lightning at exactly the same time, just when Anna and the
center of the train are closest to him, see Fig. 1.2. Bob will see both flashes at the same
time and from the same distance and will conclude that they happened simultaneously.
Both flashes will move towards the center of the train, but since the train itself is
moving in the forward direction, the signal from the front will reach Anna before that from
the rear. For Anna, both signals have covered the same distance, d/2, with the same speed
of light, so she concludes that the strike at the front of the train happened before the one at the back.
Chapter
In this chapter we derive the Lorentz transformation and look at some of its conse-
quences.
and a specific place is called an event.1 Grid lines parallel to the x-axis represent
constant time: events that all happen at the same time. Events on lines parallel to the ct-axis
happen at the same position. Objects starting at the origin follow a path inside
the light cone known as a world line. This world line makes an angle smaller than 45°
with the ct-axis at all places, otherwise the object would move faster than the speed
of light. Space-time diagrams are also known as Minkowski diagrams.
1 Think about your favourite pop-star performing at your house on Saturday night, that's an event.
We consider the coordinates (x, ct) as measured by an observer S and seek the corresponding
coordinates (x′, ct′) measured by another observer S′ who moves with a
relative velocity v in the x direction with respect to S. Our guiding principle is that
the speed of light must be the same for all inertial observers. We start by looking
at the classical case, in which the time and place coordinates of the two observers are
connected by the Galilei transformation:
\[ \begin{cases} ct' = ct \\ x' = x - vt \\ y' = y \\ z' = z \end{cases} \tag{2.1} \]
Here time is absolute and the same for all observers, and the ct′ axis makes an angle
α with the ct axis which is given by: tan α = v/c = β (see Fig. 2.2).
One immediately notes that in this case the speed of light cannot be the same for
both observers. The only way to have the speed of light independent of the velocity
of the observer is for the x′ axis to also make an angle α with the x-axis,
albeit in the opposite direction (see Fig. 2.3).
This would imply a transformation like:
\[ \begin{cases} ct' = ct - \frac{v}{c}\,x \\ x' = x - vt \\ y' = y \\ z' = z \end{cases} \tag{2.2} \]
Now this is not the full story. It is possible that both t′ and x ′ are scaled with the
same factor. The spacing of the (ct, x ) grid lines would change, but c would stay the
same. (see Fig. 2.4)
FIGURE 2.3: Two observers with rotated x′ axis will observe an invariant speed of light
FIGURE 2.4: Two observers with rotated x′ axis and scaled x′ and t′
We actually know that this is the case from our example of the light clock in the
previous chapter, where we saw that the time of a moving clock gets scaled with a factor
γ:
\[ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{2.3} \]
Putting everything together, and using β = v/c, we get the Lorentz transformation:2
\[ \begin{cases} ct' = \gamma(ct - \beta x) \\ x' = \gamma(x - \beta ct) \\ y' = y \\ z' = z \end{cases} \tag{2.4} \]
2 Note that we speak of the Lorentz transformation in singular, while there are four items (one time
and three space coordinates) that transform. As has become obvious from the derivation the fact that
they transform together in a specific way defines the Lorentz transformation. Hence, there are not
multiple things that transform, but they change together in one transformation.
And y′ = y and z′ = z transform trivially, because scaling them would change the
speed of light again.
This is a good time to take a step back and think about what the set of equations 2.4
means. We see that the Lorentz transformation mixes space and time, and that what
one observer sees as a pure time difference may be a mix of time and distance for the
other. The lines of equal time for observer S′ are no longer horizontal and we can
now easily see what happened with the lightning on the train in the previous chapter.
Both strikes lie on a horizontal line and are simultaneous for S, but for the moving
observer S′ the lines of equal time make an angle with the horizontal, and the strike on the front (B) happens before
the one on the back (A).
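The mixing of space and time in eq. 2.4 can be made concrete with a few lines of code. The sketch below is an illustration under assumed values (units with c = 1, a train of half-length 1 and β = 0.5; the function and variable names are not from the notes). It applies the transformation to two lightning strikes that are simultaneous for S and shows that for S′ the strike at the front comes first, while a light ray keeps x′ = ct′.

```python
import math

def gamma(beta):
    """Lorentz factor for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def boost(ct, x, beta):
    """Lorentz transformation of eq. 2.4 for the (ct, x) coordinates."""
    g = gamma(beta)
    return g * (ct - beta * x), g * (x - beta * ct)

# Two strikes that are simultaneous for S (both at ct = 0).
back_A = (0.0, -1.0)    # (ct, x) of the strike at the back of the train
front_B = (0.0, 1.0)    # (ct, x) of the strike at the front of the train
beta_train = 0.5        # assumed speed of the train in units of c

ct_A, x_A = boost(back_A[0], back_A[1], beta_train)
ct_B, x_B = boost(front_B[0], front_B[1], beta_train)

if __name__ == "__main__":
    print("ct' of back strike A :", ct_A)
    print("ct' of front strike B:", ct_B)
    print("front first for S'   :", ct_B < ct_A)
```

The sign of ct′ for each strike follows directly from the term −βx in eq. 2.4: the larger x, the earlier the event is for S′.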
\[ \gamma = 1 + \frac{1}{2}\left(\frac{v}{c}\right)^2 + \frac{3}{8}\left(\frac{v}{c}\right)^4 + \ldots \tag{2.6} \]
When v/c is very small, the difference of γ from 1 is of order (v/c)² and can be neglected,
so γ = 1 is a good approximation. Putting this in the Lorentz transformation
for the relevant coordinates ct and x gives:
\[ \begin{cases} ct' = ct - (v/c)\,x \approx ct \\ x' = x - (v/c)\,ct = x - vt \end{cases} \tag{2.7} \]
where in the first line the term proportional to v/c is neglected compared to ct, and we get back the
Galilei transformation.
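How good the Galilei approximation is can be estimated numerically. The sketch below (an illustration with assumed sample values, in units where c = 1, and with hypothetical function names) compares the Lorentz and Galilei expressions for x′ and returns their relative difference, which equals 1 − 1/γ and is therefore approximately β²/2 at low speed.

```python
import math

def gamma(beta):
    """Lorentz factor for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def lorentz_x(x, ct, beta):
    """x' according to the Lorentz transformation."""
    return gamma(beta) * (x - beta * ct)

def galilei_x(x, ct, beta):
    """x' according to the Galilei transformation (x - vt written as x - beta*ct)."""
    return x - beta * ct

def relative_difference(x, ct, beta):
    """Relative difference between the Lorentz and the Galilei result for x'."""
    exact = lorentz_x(x, ct, beta)
    return abs(exact - galilei_x(x, ct, beta)) / abs(exact)

if __name__ == "__main__":
    for beta in (1e-3, 1e-2, 0.1, 0.8):   # assumed sample speeds
        print(beta, relative_difference(2.0, 1.0, beta))
```

At β = 0.001 the two transformations differ by less than one part in a million, while at β = 0.8 the difference is 40%.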
∆t = γ∆t′ . (2.10)
( x1′ − x2′ ) = γ( x1 − x2 ).
Please note that the corresponding times in S′ can be different, but since the rod is
at rest in S′ this does not prevent us from measuring its length. If we designate the
length in its own rest frame as L0, we find for the length in a frame moving at velocity
v:
\[ L = \frac{L_0}{\gamma} \tag{2.12} \]
Since γ > 1 for any nonzero velocity v, moving objects shrink in the direction of
motion. This is known as the Lorentz-Fitzgerald contraction.
Working out the part in the square brackets, the cross-terms 2β∆ct∆x cancel and we
find:
\[ (\Delta s')^2 = \gamma^2\left[(1 - \beta^2)(\Delta ct)^2 - (1 - \beta^2)(\Delta x)^2\right] \]
And finally using 1 − β² = 1/γ²:
\[ (\Delta s')^2 = (\Delta ct)^2 - (\Delta x)^2 = (\Delta s)^2 \]
So indeed the combination (∆s)² = (∆ct)² − (∆x)² is invariant under Lorentz transformations.
This is known as the spacetime interval.3 In case ∆x = 0, ∆s = ∆ct, which is c times the
time interval measured by an observer at rest with respect to the clock. This special time is called the proper
time or τ. It is the time as measured by an observer moving together with the clock.
3 The overall sign of this invariant is a matter of taste. Particle physicists use it the way it is presented.
Cosmologists give it an overall minus sign and declare (∆x)² − (∆ct)² to be the invariant. In the
literature therefore both conventions exist up to the present time. We stick to the convention that assigns
a + sign to time and a − sign to space.
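The invariance of the spacetime interval is easy to verify numerically. The following sketch (units with c = 1; the sample separation and the list of boost velocities are assumptions chosen for illustration) transforms one pair of coordinate differences to several frames and evaluates (∆ct)² − (∆x)² in each of them.

```python
import math

def boost(ct, x, beta):
    """Lorentz transformation of (ct, x) to a frame moving with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (ct - beta * x), g * (x - beta * ct)

def interval_squared(dct, dx):
    """Spacetime interval with the sign convention (+ for time, - for space)."""
    return dct ** 2 - dx ** 2

if __name__ == "__main__":
    dct, dx = 3.0, 1.0   # assumed time-like separation between two events
    for beta in (-0.9, -0.3, 0.0, 0.6, 0.99):
        dct_p, dx_p = boost(dct, dx, beta)
        print(beta, dct_p, dx_p, interval_squared(dct_p, dx_p))
```

The individual values of ∆ct′ and ∆x′ change from frame to frame, but the last column stays at 8 up to rounding errors.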
2.3.5 Causality
We can partition a spacetime diagram in different areas, see Fig. 2.5. The x-axis
(ct = 0) represents the present moment and separates the past from the future. The
light lines, or light cones in more dimensions, represent light signals incoming (from
the past) to and outgoing (to the future) from the origin. These points have
(∆s)² = 0 and are called light-like. Within the forward light cone (yellow area) are
points with (∆s)² > 0, which is called a time-like separation, since the time part is
larger than the space part. This is the possible future of the origin. Likewise the
downward cone is the possible past. The area outside the light cone has (∆s)² < 0, a
space-like interval. Since nothing can go faster than light, it is impossible for events
that have a space-like separation to send information from one to the other. This means
they cannot influence each other: one cannot be the cause of the other. This is
known as causality. Events that are inside the future light cone of an event can be
the consequence of that event.
Another way of looking at this is through the Lorentz transformation. For two
events A and B outside each other's light cone, you can always find observers for whom
A happens first, one for whom they are simultaneous and others for whom B is first. If it is
not possible to say which event happens first, it is not possible for one to be the cause
of the other. There are no observers for whom the events take place at the same
position. Events in each other's light cone (time-like separation) have the same order
for all observers and there is an observer who will measure them at the same position.
\[ u_x = \frac{dx}{dt} \]
\[ u'_x = \frac{dx/dt - \beta c}{1 - \frac{\beta}{c}\frac{dx}{dt}} = \frac{u_x - v}{1 - \frac{v u_x}{c^2}} \tag{2.14} \]
It is probably a good idea to look at equation 2.14 and consider what it means.
Observer S′ is moving with velocity v in the positive x-direction in the frame of S. An
object moving with $u_x$ in S will have a lower velocity in S′, which explains the − sign
in the numerator of Eq. 2.14. In case S′ is moving to the left, the two velocities in
the numerator have a relative plus sign, but the denominator ensures that the speed
of light is not exceeded. For example, v = −0.6c and $u_x$ = 0.8c gives $u'_x$ = 1.4c/1.48 ≈ 0.95c.
Also, if $u_x$ = c then $u'_x$ = c regardless of v, which is to be expected, since we derived
the Lorentz transformation from the principle that the speed of light is the same for
all observers.
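The two statements above (the example with v = −0.6c and the invariance of c) can be reproduced with a direct implementation of eq. 2.14. This is a small sketch in units where c = 1 by default; the function name is an assumption made for the illustration.

```python
def transform_ux(ux, v, c=1.0):
    """Velocity component along x in S', following eq. 2.14."""
    return (ux - v) / (1.0 - v * ux / c ** 2)

if __name__ == "__main__":
    # Example from the text: S' moves to the left with 0.6c, object moves with 0.8c.
    print("u'_x for v = -0.6c, u_x = 0.8c:", transform_ux(0.8, -0.6))
    # A light signal keeps the speed c for every observer.
    for v in (-0.9, -0.6, 0.0, 0.3, 0.9):
        print("v =", v, " u'_x of light:", transform_ux(1.0, v))
```

The first line should print a value close to 0.946, in agreement with the rounded 0.95c quoted in the text.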
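For the y and z directions the position coordinate does not change, but time still
transforms. This leads to:
\[ u'_y = \frac{dy'}{dt'} = \frac{dy}{\gamma\left(dt - \frac{\beta}{c}dx\right)} = \frac{dy/dt}{\gamma\left(1 - \frac{\beta}{c}\frac{dx}{dt}\right)} = \frac{1}{\gamma}\,\frac{u_y}{1 - \frac{v u_x}{c^2}} \tag{2.15} \]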
Note that if one changes inertial frames such that the velocity in one direction increases,
orthogonal components of the velocity decrease. This ensures that the
total velocity (or speed) of an object for an inertial observer never exceeds the speed
of light, even if in one direction it approaches the speed of light.
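The interplay between the longitudinal and the orthogonal component can be illustrated as follows. The sketch combines eqs. 2.14 and 2.15 in units where c = 1; the example of an object moving with 0.9c along y, seen from a frame moving with 0.9c along x, is an assumption chosen to make the effect visible.

```python
import math

def transform_velocity(ux, uy, v):
    """Return (u'_x, u'_y) in a frame moving with velocity v along +x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v ** 2)
    denom = 1.0 - v * ux
    return (ux - v) / denom, uy / (g * denom)

def speed(ux, uy):
    """Magnitude of the velocity vector."""
    return math.hypot(ux, uy)

if __name__ == "__main__":
    ux, uy = 0.0, 0.9          # assumed velocity of the object in S
    for v in (0.0, 0.5, 0.9, 0.99):
        upx, upy = transform_velocity(ux, uy, v)
        print("v =", v, " u' =", (upx, upy), " |u'| =", speed(upx, upy))
```

As |u′_x| grows with v, u′_y is reduced by the factor 1/γ, and the printed speed stays below 1 for every v.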
There are a number of paradoxes that follow from the Lorentz equations. Mostly
they are well documented and explained, so we limit ourselves to some comments
and cover some lesser known examples.
The muon paradox is not much of a paradox, more a logical but surprising measurement.
To the observer on earth, muons get to fly further because they live longer
by a factor γ because of time dilation. In this case γ is calculated with the velocity of
the muons, which come towards us. From the muon frame, the earth's atmosphere is
contracted by the same factor γ due to the Lorentz-Fitzgerald contraction.
The twin paradox exists only if we make the mistake of considering the situation
of both twins to be equivalent. A Minkowski diagram for any inertial observer will
show the twin that stays behind on a straight line and the traveller on a broken one.
One sometimes hears the explanation that the acceleration of the traveller causes
the time difference. This is a misleading interpretation. The acceleration makes the
traveller move from one inertial frame to the other, and the equal time lines of the
two frames have opposite angles, which causes a jump in the equal time event on
the world line of the twin who stayed behind. One way of looking at the twin paradox
is through the metric of Minkowski space. In ordinary 3D space the shortest route
between two points is a straight line. In Minkowski space with its hyperbolic metric
the shortest route (0) is achieved by following the light lines. When looking at the
Minkowski diagram for the twins, it is obvious that the traveller follows a path closer
to the light lines and therefore measures a shorter interval and proper time.
4 This may not have been explained to you in the Calculus course if that was taught by a proper
mathematician, but for physicists dt and dx are single variables that are infinitesimally small, but always
larger than zero. Hence, they can, more or less, be manipulated as if they were “ordinary” numbers and
also dividing by them is not a problem.
The essence of the train in the tunnel or pole or ladder in the barn paradox is
the breakdown of simultaneity in SRT, as illustrated by the train and the flashes. In
the frame of the barn (tunnel) the entry of the back and the exit of the front of the
pole (train) are simultaneous and the Lorentz-Fitzgerald contraction makes everything
fit. In the frame of the pole, the barn is contracted, but since the exit of the front
happens before the entry of the back, there is no problem.
A lesser known paradox is the rocket and string paradox, also known as
Bell's spaceship paradox.5 Here two spaceships at a distance L are connected by
a fragile string of the same length. Minimal stretching of the string would cause it
to break. The clocks of the spaceships are perfectly synchronised. At exactly the
same time they start accelerating by the same amount, until they reach some cruising
speed. In the original frame the spaceships maintain the same distance: since they
accelerate by the same amount at the same time, they have the same velocity at
any time. Still, the string will be contracted by the Lorentz-Fitzgerald contraction and
thus should break. Somehow this feels wrong: if the two rockets and the string all
accelerated at the same time, why would the string break? The solution becomes
clear when we draw a Minkowski diagram. In the starting frame, the equal time
lines for the cruising speed frame run at an angle. As a consequence, in the cruising
speed frame the first (leading) rocket started earlier and the distance between the rockets has
increased, causing the string indeed to break. This can be verified by applying the
Lorentz transformation to the starting events of both rockets into the cruising speed
frame.
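The verification mentioned above can be sketched in a few lines. The example makes simplifying assumptions that are not in the text: units with c = 1, an instantaneous jump to the cruising speed (so that each rocket is at rest in the cruising frame from its own start event onwards), a separation L = 1 and a cruising speed β = 0.6. The function names are illustrative.

```python
import math

def boost(ct, x, beta):
    """Lorentz transformation of (ct, x) to a frame moving with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (ct - beta * x), g * (x - beta * ct)

def start_events_in_cruising_frame(L, beta):
    """Start events of the rear (x = 0) and front (x = L) rocket, both at ct = 0 in
    the starting frame, expressed in the cruising speed frame."""
    rear = boost(0.0, 0.0, beta)
    front = boost(0.0, L, beta)
    return rear, front

if __name__ == "__main__":
    rear, front = start_events_in_cruising_frame(L=1.0, beta=0.6)
    print("rear  start (ct', x'):", rear)
    print("front start (ct', x'):", front)
    print("front starts earlier :", front[0] < rear[0])
    print("separation in cruising frame:", front[1] - rear[1])
```

Under these assumptions the front rocket starts earlier by γβL and the rockets end up a distance γL apart in the cruising frame, which is larger than the rest length L of the string.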
An interesting twist to the train in the tunnel is the train on the bridge paradox.6
Here our train of length S is on a bridge with a (larger) missing segment of length
L. The train is moving at relativistic speed and thanks to the Lorentz-Fitzgerald
contraction of the bridge an observer on the train measures the gap to be smaller
than the length of the train and thinks the train will make it across safely. For an
observer next to the bridge the train is contracted and will surely fall. Observers
can measure different values for coordinates, but not two different realities.7 The
solution to this paradox comes from a precise definition of falling. We take the center
of gravity of the train and state that the train will fall when the center of gravity is
no longer supported by either side of the train. The point is that since the train is a
rigid object, it takes time for the signal that the train is supported to reach the center.
If we assume that this signal travels with the speed of light (a perfectly rigid object),
we can track the signal from the front (stop falling) and from the back (start falling)
to the center of the train and see that for both observers the former arrives first and
the train is saved.
5 https://en.wikipedia.org/wiki/Bell%27s_spaceship_paradox
6 https://youtu.be/gH2mI0Oh9Zo
7 This is true in Special Relativity. When combining General Relativity with Quantum Mechanics, it is
predicted that different observers can have different realities. An example of this is the Fulling-Davies-
Unruh effect in which an accelerated observer is predicted to see the black body (Hawking) radiation at
a certain temperature, while an inertial observer does not observe this radiation.
Chapter
In this chapter we analyze the Doppler effect which is well known from classical wave
mechanics. We have seen that simultaneity depends on the observer in the theory of
special relativity. We start by defining distance, simultaneity and velocity by having
observers communicate through light signals.
We consider the period between two light signals. We start with two observers A and
B. At m, A sends a light signal to B, who receives it at p. After an interval τ, A sends a
second signal (at n), which arrives at q for B, see Fig. 3.1.
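FIGURE 3.1: Observers exchanging light pulses (space-time diagram with axes x and ct, showing events m and n on the world line of A, events p and q on the world line of B, and the intervals τ and τ′ = kτ)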
The time between events p and q in the reference frame of B is now given by
τ′ = kτ. The factor k is called the Doppler factor. Note that the situation between A
and B is symmetric. A will also measure a Doppler factor k between the signals from
B. When k is greater than 1 the period is stretched and the frequency lowered. In
astronomy this is called a redshift. If k is smaller than one the period is compressed and
the frequency higher. This is called blueshift.
Our next step is to define what equal times means for different observers. We have
seen that time and space are relative to the observer. The only thing two observers
agree about is the velocity of light c. We now consider an observer A looking at an
event q on the worldline of observer B at a different place, see Fig. 3.2. At some time
p in the past A and B were at the same place. What event n on its worldline would
A consider to be simultaneous with q? We can establish this using the speed of light.
Let A send out a light signal at m to reach B in q, where it is reflected back to reach A
in o. Then n is at equal time with q for A if $t_{mn} = t_{no}$.
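FIGURE 3.2: Event n is for A simultaneous with q if $t_{mn} = t_{no}$ (space-time diagram with the world lines of A and B starting at p, event n on A and event q on B, and the intervals τ, kτ and k²τ)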
From the relativity principle it can be seen that if q corresponds to time kτ for
observer B (with the signal sent by A at $t_m = \tau$), the time of the corresponding point o for observer A should be
k · kτ = k²τ. We can then see that for observer A:
\[ t_{mn} = \frac{t_o - t_m}{2} = \frac{k^2 - 1}{2}\,\tau \tag{3.1} \]
Now we can define the distance of q in the frame of A as the velocity of light times
half the time the pulse takes to come back:
\[ d_{qn} = c\,t_{mn} = c\,\frac{k^2 - 1}{2}\,\tau \]
And so:
\[ \beta = \frac{v}{c} = \frac{k^2 - 1}{k^2 + 1} \tag{3.3} \]
Inverting the equation gives for the Doppler factor:
\[ k = \sqrt{\frac{1 + \beta}{1 - \beta}} = \sqrt{\frac{1 + v/c}{1 - v/c}} \tag{3.4} \]
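Equations 3.3 and 3.4 are each other's inverse, which can be checked with a short script. This is a minimal sketch; the function names and the sample velocities are assumptions for the illustration. A positive β (observers moving apart) gives k > 1, a redshift, and a negative β gives k < 1, a blueshift.

```python
import math

def doppler_k(beta):
    """Doppler factor k as a function of beta = v/c (eq. 3.4)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def beta_from_k(k):
    """Relative velocity beta = v/c as a function of the Doppler factor (eq. 3.3)."""
    return (k ** 2 - 1.0) / (k ** 2 + 1.0)

if __name__ == "__main__":
    for beta in (-0.6, 0.0, 0.1, 0.6):   # assumed sample velocities
        k = doppler_k(beta)
        kind = "redshift" if k > 1 else "blueshift" if k < 1 else "no shift"
        print(beta, k, beta_from_k(k), kind)
```

For β = 0.6 the period is stretched by a factor of about 2, and for β = −0.6 it is compressed by the same factor.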
1 In the units of the Hubble constant, one Mpc, megaparsec, is 3.26 million lightyears.
We now consider 3 observers, A, B and C, see Fig. 3.3. The interval τ for A is measured
as $k_{AB}\tau$ by B and as $k_{AC}\tau = k_{AB}k_{BC}\tau$ by C. If we now consider the velocity of C
with respect to A we find:
\[ \frac{v_{AC}}{c} = \frac{k_{AC}^2 - 1}{k_{AC}^2 + 1} = \frac{k_{AB}^2 k_{BC}^2 - 1}{k_{AB}^2 k_{BC}^2 + 1} \tag{3.8} \]
Substituting eq. 3.4 for $k_{AB}$ and $k_{BC}$ gives
\[ v_{AC} = \frac{v_{AB} + v_{BC}}{1 + v_{AB}v_{BC}/c^2}, \]
the familiar formula for addition of velocities. Note that for v << c we get back the
classical limit $v_{AC} = v_{AB} + v_{BC}$.
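That multiplying Doppler factors is equivalent to the relativistic addition of velocities can be confirmed numerically. The sketch below works in units of c; the function names and the sample values are assumptions. It composes two velocities via k_AC = k_AB·k_BC and compares the result with (β_AB + β_BC)/(1 + β_AB β_BC).

```python
import math

def doppler_k(beta):
    """Doppler factor for beta = v/c (eq. 3.4)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def beta_from_k(k):
    """beta = v/c for a given Doppler factor (eq. 3.3)."""
    return (k ** 2 - 1.0) / (k ** 2 + 1.0)

def compose_via_doppler(beta_ab, beta_bc):
    """Velocity of C with respect to A from the product of the Doppler factors."""
    return beta_from_k(doppler_k(beta_ab) * doppler_k(beta_bc))

def compose_via_formula(beta_ab, beta_bc):
    """Relativistic addition of velocities, in units of c."""
    return (beta_ab + beta_bc) / (1.0 + beta_ab * beta_bc)

if __name__ == "__main__":
    for a, b in ((0.5, 0.5), (0.9, 0.9), (0.001, 0.002)):   # assumed sample values
        print(a, b, compose_via_doppler(a, b), compose_via_formula(a, b))
```

For the last pair both results are very close to the classical sum 0.003, while two velocities of 0.9c combine to less than c.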
It is now possible to derive the Lorentz transformation. In Fig. 3.4 we consider event
q having coordinates (ct, x ) for observer A and (ct′ , x ′ ) for B. In point o we choose
to = t′o = 0. A light signal sent by A at tm passes B at t′p and reaches q. The return
signal passes B at tr′ and is received by A at tn .
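According to A:
\[ \begin{cases} t = \dfrac{t_m + t_n}{2} \\[4pt] x = \dfrac{c\,t_n - c\,t_m}{2} \end{cases} \;\Leftrightarrow\; \begin{cases} ct - x = c\,t_m \\ ct + x = c\,t_n \end{cases} \tag{3.10} \]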
FIGURE 3.4: Event q for two observers A and B
Likewise for B: $ct' - x' = c\,t'_p$ and $ct' + x' = c\,t'_r$. From the Doppler factor we have:
$t'_r = k^{-1} t_n$ and $t'_p = k\,t_m$. Substituting we find:
\[ \begin{cases} k\,(ct - x) = ct' - x' \\ k^{-1}(ct + x) = ct' + x' \end{cases} \;\Leftrightarrow\; \begin{cases} ct' = \dfrac{k + k^{-1}}{2}\,ct - \dfrac{k - k^{-1}}{2}\,x \\[4pt] x' = \dfrac{k + k^{-1}}{2}\,x - \dfrac{k - k^{-1}}{2}\,ct \end{cases} \tag{3.11} \]
Now using eq. 3.4 we can write:
\[ k + k^{-1} = \frac{2}{\sqrt{1 - \beta^2}} = 2\gamma \]
and
\[ k - k^{-1} = \frac{2\beta}{\sqrt{1 - \beta^2}} = 2\beta\gamma \]
This finally gives us:
\[ \begin{cases} ct' = \gamma(ct - \beta x) \\ x' = \gamma(x - \beta ct) \end{cases} \tag{3.12} \]
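The last step of this derivation, from the Doppler-factor form to the familiar form with γ and β, can be cross-checked numerically. In the sketch below (function names and sample values are assumptions) the same event is transformed once with the combinations (k ± 1/k)/2 and once with γ and βγ.

```python
import math

def doppler_k(beta):
    """Doppler factor for beta = v/c (eq. 3.4)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def transform_with_k(ct, x, k):
    """Transformation written in terms of the Doppler factor k."""
    plus = 0.5 * (k + 1.0 / k)
    minus = 0.5 * (k - 1.0 / k)
    return plus * ct - minus * x, plus * x - minus * ct

def transform_with_gamma(ct, x, beta):
    """Lorentz transformation written in terms of gamma and beta."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (ct - beta * x), g * (x - beta * ct)

if __name__ == "__main__":
    ct, x, beta = 5.0, 2.0, 0.6   # assumed sample event and velocity
    print(transform_with_k(ct, x, doppler_k(beta)))
    print(transform_with_gamma(ct, x, beta))
```

Both lines should print the same pair of numbers up to rounding, which reflects k + 1/k = 2γ and k − 1/k = 2βγ.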
Chapter
The Lorentz transformation for the velocity is very different from those of position
and time and frankly it looks messy and ugly. In classical mechanics v = dx/dt.
Under a Galilei transformation x transforms but t is the same for all observers and
the transformations for position and velocity are very similar. In the relativistic case,
time is also transforming, which leads to the complicated expressions. It is tempting
to try to construct a velocity with more regular transformation properties. For this we
need a time which does not transform. This exists: the proper time, τ, is the same
for all observers. From the light clock we know dt = γdτ and with this we try for
our proper, or relativistic, velocity $\vec{\eta} = d\vec{x}/d\tau = \gamma\,\vec{v}$, and for the corresponding momentum:
\[ \vec{p} = m\,\vec{\eta} = \gamma\,m\,\vec{v} \tag{4.2} \]
Now, so far, these definitions have been based on an argument of elegance, which
is appealing but not necessarily experimentally correct. As it turns out, classical momentum
is not conserved under a Lorentz transformation, whereas relativistic momentum
is. This can be illustrated with an example of two particles scattering off each
other elastically. Consider two point particles A and B of equal mass m and without
any other structure, moving along the x-axis of an inertial observer O, which will be
called the center-of-mass observer, with equal speed u > 0 but in opposite directions,
i.e. velocities $u_x^A = -u_x^B = u$. The total momentum of the two particles adds
up to zero (in all directions). After they collide elastically (i.e. both particles remain
intact and no energy is absorbed or released by them), they move away from
the collision point in the y-direction and, because of momentum conservation, in opposite
directions with the same speed $\hat{u}_y$. Because of energy conservation, the kinetic energy
of the particles before and after the collision must be the same, hence their speed
must be the same before and after the collision, where we make an arbitrary sign
choice $\hat{u}_y^A = -\hat{u}_y^B = u$.1 At that point $\hat{u}_x^A = -\hat{u}_x^B = 0$. For observer O the total
momentum of particles A and B together sums up to zero in all directions before the
collision, as well as after the collision. Hence, in this particular case momentum is
conserved for observer O. In this frame this is true irrespective of the definition
of momentum, as long as momentum depends monotonically on velocity. Hence,
momentum is also conserved for the classical momentum definition $\vec{p} = m\vec{v}$.
Now consider another (inertial) observer O′, which will be called the fixed target
observer, who moves with speed v = u in the +x-direction, where the directions
of the x and x′ axes and the y and y′ axes are the same. I.e. O′ moves along with
particle A, or in other words particle A is at rest for observer O′. For this observer O′
the velocities of the two particles before the collision transform as follows, with the labels for
particles A and B assigned arbitrarily such that $u_x^A = -u_x^B = u$:
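\[ u_x^{\prime A} = \frac{u_x^A - v}{1 - u_x^A v/c^2} = \frac{u - v}{1 - u v/c^2} = 0 \tag{4.3} \]
\[ u_x^{\prime B} = \frac{u_x^B - v}{1 - u_x^B v/c^2} = \frac{-u - v}{1 + u v/c^2} = \frac{-2u}{1 + u^2/c^2} \tag{4.4} \]
\[ u_y^{\prime A} = \frac{u_y^A}{\gamma(1 - u_x^A v/c^2)} = 0 \tag{4.5} \]
\[ u_y^{\prime B} = \frac{u_y^B}{\gamma(1 - u_x^B v/c^2)} = 0 \tag{4.6} \]
with $\gamma = 1/\sqrt{1 - v^2/c^2} = 1/\sqrt{1 - u^2/c^2}$.
After the collision, for observer O′:
\[ \hat{u}_x^{\prime A} = \frac{\hat{u}_x^A - v}{1 - \hat{u}_x^A v/c^2} = -v = -u \tag{4.7} \]
\[ \hat{u}_x^{\prime B} = \frac{\hat{u}_x^B - v}{1 - \hat{u}_x^B v/c^2} = -v = -u \tag{4.8} \]
\[ \hat{u}_y^{\prime A} = \frac{\hat{u}_y^A}{\gamma(1 - \hat{u}_x^A v/c^2)} = \frac{u}{\gamma} \tag{4.9} \]
\[ \hat{u}_y^{\prime B} = \frac{\hat{u}_y^B}{\gamma(1 - \hat{u}_x^B v/c^2)} = \frac{-u}{\gamma} \tag{4.10} \]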
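Hence, before the collision the total momentum for O′ was (for x and y components separately):
\[ p'_{tot,x} = m\,u_x^{\prime A} + m\,u_x^{\prime B} = \frac{-2mu}{1 + u^2/c^2} \tag{4.11} \]
\[ p'_{tot,y} = m\,u_y^{\prime A} + m\,u_y^{\prime B} = 0 \tag{4.12} \]
whereas after the collision, using eqs. 4.7 to 4.10, it is:
\[ \hat{p}'_{tot,x} = m\,\hat{u}_x^{\prime A} + m\,\hat{u}_x^{\prime B} = -2mu \]
\[ \hat{p}'_{tot,y} = m\,\hat{u}_y^{\prime A} + m\,\hat{u}_y^{\prime B} = 0 \]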
identical, the energy can only depend on the particles’ velocities monotonically. Hence, equal energy
must result in equal velocity.
Now, the y components before and after the collision both sum up to zero, but for the x
component, clearly
\[ p'_{tot,x} \neq \hat{p}'_{tot,x} \tag{4.15} \]
so momentum with this (p = mv) definition is conserved
for inertial observer O but not for another inertial observer O′. This
means, superficially, that either Einstein's first postulate is violated or that momentum
is not conserved. Either conclusion is less than appealing.
But with the new definition of relativistic momentum $\vec{p} = \gamma m\vec{v}$, with
$\gamma = 1/\sqrt{1 - v^2/c^2}$, momentum is conserved. With the four-vector notation this will become
evident in a very trivial way. But before giving that proof, first we consider energy.
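Before turning to energy, the claim can be made plausible with a numerical check of the collision described above. The sketch below uses units with c = 1 and m = 1 and an assumed speed u = 0.6; the function names are illustrative. It transforms the velocities of A and B to the frame of the fixed target observer O′ with the velocity transformation of eqs. 2.14 and 2.15 and sums the x-momenta before and after the collision, once with p = mu and once with p = γmu, where γ is computed from the speed of each particle.

```python
import math

def gamma(ux, uy=0.0):
    """Lorentz factor of a particle with velocity (ux, uy), in units of c."""
    return 1.0 / math.sqrt(1.0 - ux ** 2 - uy ** 2)

def to_moving_frame(ux, uy, v):
    """Velocity in a frame moving with v along +x (eqs. 2.14 and 2.15, c = 1)."""
    denom = 1.0 - v * ux
    return (ux - v) / denom, uy / (gamma(v) * denom)

def total_px(velocities, relativistic, m=1.0):
    """Sum of x-momenta with p = m*u (classical) or p = gamma*m*u (relativistic)."""
    total = 0.0
    for ux, uy in velocities:
        factor = gamma(ux, uy) if relativistic else 1.0
        total += factor * m * ux
    return total

def collision_in_fixed_target_frame(u):
    """Velocities of A and B before and after the collision as seen by O' (v = u)."""
    before = [(u, 0.0), (-u, 0.0)]   # center-of-mass frame, before the collision
    after = [(0.0, u), (0.0, -u)]    # center-of-mass frame, after the collision
    before_p = [to_moving_frame(ux, uy, u) for ux, uy in before]
    after_p = [to_moving_frame(ux, uy, u) for ux, uy in after]
    return before_p, after_p

if __name__ == "__main__":
    before, after = collision_in_fixed_target_frame(0.6)   # assumed speed u = 0.6c
    for label, rel in (("classical", False), ("relativistic", True)):
        print(label, total_px(before, rel), total_px(after, rel))
```

With the classical definition the two sums differ, while with the relativistic definition they agree (both equal −2γ²mu for this configuration, with γ the Lorentz factor belonging to u).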
dK = ⃗F · d⃗x (4.17)
We are stuck with two integrals over different variables u and γ. To get back to a
simple form let us consider γ:
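\[ \gamma = \left(1 - \frac{u^2}{c^2}\right)^{-1/2} \;\Rightarrow \]
\[ d\gamma = -\frac{1}{2}\left(1 - \frac{u^2}{c^2}\right)^{-3/2} \cdot \left(-\frac{2u}{c^2}\right) du = \gamma^3\,\frac{u}{c^2}\,du \;\Rightarrow \]
\[ du = \frac{c^2}{u\gamma^3}\,d\gamma \tag{4.20} \]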
So:
\[
K = mc^{2}(\gamma - 1) \tag{4.22}
\]
Now take the Taylor series for K with $(1 - x)^{n} \approx 1 - nx$ for small x and hence
$\gamma = (1 - v^{2}/c^{2})^{-1/2} \approx 1 + \frac{1}{2} v^{2}/c^{2}$. Then:
\[
K \approx mc^{2} \cdot \frac{1}{2}\,v^{2}/c^{2} = \frac{1}{2}\,mv^{2} \tag{4.23}
\]
and we recover the classical kinetic energy.
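To see how good the low-speed approximation is, the following Python sketch compares equation 4.22 with the classical expression for a few speeds. The mass of 1 kg and the chosen speeds are arbitrary illustrative values, and the function names are our own.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def kinetic_energy_relativistic(mass, speed):
    """K = m c^2 (gamma - 1), equation 4.22."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return mass * C**2 * (gamma - 1.0)


def kinetic_energy_classical(mass, speed):
    """K = m v^2 / 2, the low-speed limit of equation 4.23."""
    return 0.5 * mass * speed**2


mass = 1.0  # kg, an arbitrary illustrative value
for fraction in (0.01, 0.1, 0.5, 0.9):
    speed = fraction * C
    ratio = kinetic_energy_relativistic(mass, speed) / kinetic_energy_classical(mass, speed)
    print(f"v = {fraction:4.2f} c  ->  K_rel / K_classical = {ratio:.4f}")
```

The ratio stays close to 1 at small speeds and grows quickly once v becomes a sizeable fraction of c.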
4.3 E = mc²
Our next step is to derive the famous equation $E = mc^{2}$. We do this using
conservation of energy in a thought experiment. We take a nucleus with mass m
at rest. At a certain moment it simultaneously emits two photons in exactly opposite
directions, each carrying the same energy $hf = E/2$. The nucleus loses energy E
but, because the photons are exactly balanced, does not change its velocity. Since the
nucleus does not have any kinetic energy, the energy lost has to come from some sort
of internal energy of the nucleus. We now assume that in fact the mass of the nucleus
is its internal energy reservoir. This is further supported by the fact that nuclei have a
lower mass after radioactive decay. The lost mass we call $\Delta m$ and the corresponding
energy loss $\Delta E_{m}$. In the frame of the nucleus, using conservation of energy, the energy
from the loss in mass must be equal to the total kinetic energy of the photons:²
\[
E = \Delta E_{m} \tag{4.24}
\]
² Note that we also assume here that the total energy of the photon is its kinetic energy and the photon
does not have an additional internal energy reservoir, i.e. with our earlier assumption the photon does
not have a mass.
Now we examine this system from the viewpoint of an observer moving with velocity
v in the direction of one of the photons. The observer will see one photon being
redshifted and the other being blueshifted. The total energy of the photons in the
moving frame is:
\[
\frac{E}{2}\sqrt{\frac{1 + v/c}{1 - v/c}} + \frac{E}{2}\sqrt{\frac{1 - v/c}{1 + v/c}} = \gamma E \tag{4.25}
\]
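Equation 4.25 can be verified numerically. The sketch below, with a function name of our own choosing and hypothetical values for E and β = v/c, adds the blueshifted and redshifted photon energies and compares the sum with γE.

```python
import math


def photon_energy_sum(total_energy, beta):
    """Sum of the two Doppler-shifted photon energies seen by the moving observer."""
    half = total_energy / 2.0
    blueshifted = half * math.sqrt((1.0 + beta) / (1.0 - beta))
    redshifted = half * math.sqrt((1.0 - beta) / (1.0 + beta))
    return blueshifted + redshifted


energy = 2.0  # total photon energy E in the rest frame, arbitrary units
beta = 0.6    # v/c of the moving observer, hypothetical value
gamma = 1.0 / math.sqrt(1.0 - beta**2)

print("sum of shifted photon energies:", photon_energy_sum(energy, beta))
print("gamma * E:                     ", gamma * energy)
```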
Also in the moving frame, before the decay the nucleus has kinetic energy
After the decay the nucleus has lost ∆m in mass and ∆Em in (mass) energy. Because
of the lower mass, in the moving frame, it also has lower kinetic energy:
For the moving observer conservation of energy takes the following form:
Chapter
In this chapter, we unite time and space in one single object, a four-vector, introduce
tensor notation and take a different look at the laws of nature.
5.1 Four-vectors
Since the Lorentz transformation mixes time and space, it makes sense to take them
together and form one 4-dimensional space-time. Vectors in this space are indicated
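by the (ct, x, y, z) coordinates or:
\[
r^{\mu} = \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} r^{0} \\ r^{1} \\ r^{2} \\ r^{3} \end{pmatrix}. \tag{5.1}
\]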
where “·” denotes the inner product, also called dot-product, which is implicitly
defined in the equation. We modify the 4D dot-product to produce the invariant interval
in space-time:
\[
|x| = \sqrt{x^{\mu} \cdot x^{\mu}} = \sqrt{c^{2}t^{2} - x^{2} - y^{2} - z^{2}} \tag{5.3}
\]
or in a more common notation with the Greek index:
\[
x_{\mu} x^{\mu} = c^{2}t^{2} - x^{2} - y^{2} - z^{2} = s^{2} \tag{5.4}
\]
Here, the Einstein summation convention is used, where the combination of the same
upper and lower index implies summation over that index with the terms corresponding
to all possible values of the index. Clearly, there is a difference between $x^{\mu}$, which
is called a contravariant vector, and the newly introduced $x_{\mu}$, which is a covariant
vector. This difference can be deduced by comparing equations 5.3 and 5.4, using the
relation between the $x^{\mu}$ components and ct, x, y and z, and is given by:
\[
\begin{aligned}
x_{0} &= +x^{0}, \\
x_{1} &= -x^{1}, \\
x_{2} &= -x^{2}, \\
x_{3} &= -x^{3}.
\end{aligned} \tag{5.5}
\]
This can be written compactly as
\[
x_{\mu} = g_{\mu\nu}\, x^{\nu}, \tag{5.6}
\]
where the sum runs over ν = 0, 1, 2, 3 and the metric tensor $g_{\mu\nu}$ is given by¹
\[
g_{\mu\nu} = \begin{cases} 1 & \text{for } \mu = \nu = 0 \\ -1 & \text{for } \mu = \nu = 1, 2, 3 \\ 0 & \text{for } \mu \neq \nu. \end{cases} \tag{5.7}
\]
With $g_{\mu\nu}$, a contravariant vector can be converted into its covariant version. This is
called lowering the index. Conversely, it is also possible to raise the index, making a
contravariant version of a covariant vector²
\[
x^{\mu} = g^{\mu\nu}\, x_{\nu}, \tag{5.8}
\]
Interestingly, with the same numerical values for the same set of indices for $g^{\mu\nu}$ as for
$g_{\mu\nu}$:
\[
g^{\mu\nu} = \begin{cases} 1 & \text{for } \mu = \nu = 0 \\ -1 & \text{for } \mu = \nu = 1, 2, 3 \\ 0 & \text{for } \mu \neq \nu. \end{cases} \tag{5.9}
\]
This metric with the relative minus signs is often called the Minkowski metric and
sometimes called a hyperbolic metric.
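Lowering an index and forming the invariant interval can be illustrated in a few lines of Python. This is only a sketch: the metric is stored as a nested list, the function names are our own, and the event coordinates are hypothetical values with ct and x in the same length units.

```python
# Minkowski metric g_{mu nu} with signature (+, -, -, -), equation 5.7.
METRIC = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]


def lower_index(contravariant):
    """x_mu = g_{mu nu} x^nu, summing over nu = 0, 1, 2, 3 (equation 5.6)."""
    return [
        sum(METRIC[mu][nu] * contravariant[nu] for nu in range(4))
        for mu in range(4)
    ]


def interval_squared(contravariant):
    """s^2 = x_mu x^mu, summing over mu (equation 5.4)."""
    covariant = lower_index(contravariant)
    return sum(covariant[mu] * contravariant[mu] for mu in range(4))


event = [5.0, 3.0, 0.0, 0.0]  # (ct, x, y, z), hypothetical values
print("x_mu =", lower_index(event))
print("s^2  =", interval_squared(event))
```

Only the spatial components change sign when the index is lowered, and the contraction gives $c^{2}t^{2} - x^{2} - y^{2} - z^{2}$.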
Using four-vectors and tensor notation, we can write the Lorentz transformation
as:
\[
x^{\mu'} = \Lambda^{\mu'}{}_{\mu}\, x^{\mu}, \tag{5.10}
\]
with the Lorentz transformation in the x-direction for velocity difference v = βc:
\[
\Lambda^{\mu'}{}_{\mu} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{5.11}
\]
the tensor representing the Lorentz transformation. The tensor $\Lambda^{\mu'}{}_{\mu}$ can be interpreted
as a matrix with the index µ′ running in the vertical direction and the index µ in the
horizontal direction and then using the normal matrix multiplication with a vector.
¹ Please note that while $g_{\mu\nu}$ has two indices it is not a matrix in the sense of the ones you encounter
in linear algebra. You can think of $g_{\mu\nu}$ as a row vector of row vectors,
i.e. $g_{\mu\nu}$ = ( (1, 0, 0, 0), (0, −1, 0, 0), (0, 0, −1, 0), (0, 0, 0, −1) )
² In the same sense that $g_{\mu\nu}$ can be interpreted as a row vector of row vectors, the metric with upper
indices $g^{\mu\nu}$ can be thought of as a column vector of column vectors, which would be very awkward to
write out in this footnote.
Looking at the top left 2 × 2 block of Λ in equation 5.11 for the proper Lorentz trans-
formation, one notes the similarity with the middle 2 × 2 block for the 2D rotation
matrix of equation 5.14. The difference is that here, both off-diagonal elements have
a minus sign, whereas for spatial rotations only one of the off-diagonal elements has
a minus sign. The Lorentz transformation is not a normal rotation since the ct and x
rotate in opposite directions, followed by a stretch with a factor γ.
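As an illustration of equations 5.10 and 5.11, the Python sketch below builds the boost matrix for the x-direction, applies it to a four-vector by ordinary matrix multiplication and checks that the interval is unchanged. The function names, the event coordinates and β = 0.6 are our own illustrative choices.

```python
import math


def boost_matrix(beta):
    """Lorentz transformation in the x-direction, equation 5.11."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(matrix, vector):
    """Row index mu' runs vertically, column index mu horizontally."""
    return [sum(matrix[row][col] * vector[col] for col in range(4)) for row in range(4)]


def interval_squared(x):
    """(ct)^2 - x^2 - y^2 - z^2."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


event = [5.0, 3.0, 1.0, 2.0]  # (ct, x, y, z), hypothetical values
event_prime = transform(boost_matrix(0.6), event)

print("x'  =", event_prime)
print("s^2 =", interval_squared(event), "->", interval_squared(event_prime))
```

The ct and x components mix, the y and z components are untouched, and the interval is the same for both observers up to rounding.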
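We see that there are tensors with more than one index, e.g. $g_{\mu\nu}$, $g^{\mu\nu}$ and $\Lambda^{\mu}{}_{\nu}$. When
a Lorentz transformation is applied, it should be applied for each of the indices, e.g.
\[
A^{\mu'\nu'}{}_{\rho'} = \Lambda^{\mu'}{}_{\mu}\, \Lambda^{\nu'}{}_{\nu}\, \Lambda^{\rho}{}_{\rho'}\, A^{\mu\nu}{}_{\rho}. \tag{5.15}
\]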
This also introduces the Lorentz transformation for lower (covariant) indices, which
is the inverse of the Lorentz transformation for upper (contravariant) indices:
\[
\Lambda^{\mu}{}_{\rho'}\, \Lambda^{\rho'}{}_{\nu} = \delta^{\mu}_{\nu} \;\Leftrightarrow\; \Lambda^{\mu}{}_{\rho'} = \left(\Lambda^{\rho'}{}_{\mu}\right)^{-1}, \tag{5.16}
\]
where, in this case, the superscript −1 means taking the matrix inverse (and not taking
the reciprocal of a number).
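For the boost in the x-direction the inverse transformation is simply the boost with the opposite velocity. The following Python sketch, with helper names of our own and a hypothetical β = 0.6, multiplies the two matrices and shows that the product is the Kronecker delta of equation 5.16.

```python
import math


def boost_matrix(beta):
    """Lorentz transformation in the x-direction for velocity beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matrix_product(a, b):
    """Ordinary 4x4 matrix product, summing over the shared index."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


beta = 0.6  # hypothetical value
product = matrix_product(boost_matrix(beta), boost_matrix(-beta))
for row in product:
    print([round(entry, 12) for entry in row])
```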
When a tensor transforms for each of its indices under a Lorentz transformation,
the indices are called Lorentz indices, and the tensor is called covariant.³ Laws of
nature, or any expression that is invariant under Lorentz transformations, can be
written in tensor notation. In that case, if the expression is for one inertial observer,
it is valid for all inertial observers. Also, the reverse is true. If a phenomenon is the
same for all observers, one must be able to express it in terms of tensors with Lorentz
indices.
5.4 Four-momentum
In chapter 2, we saw that we can combine position and time in one quantity, the
spacetime four-vector $x^{\mu} = (ct, x, y, z)$. In chapter 4 the three components of the
relativistic momentum were given by $\vec{p} = m\vec{\eta} = m\,d\vec{x}/d\tau = \gamma m\vec{v}$. We can now ask
ourselves whether we can make a four-momentum vector $p^{\mu}$ and, if so, what the zeroth
component $p^{0}$ would be. Generalising from the momentum three-vector to the four-momentum we
can write:
\[
p^{\mu} = m\,dx^{\mu}/d\tau . \tag{5.17}
\]
For the indices µ = 1, 2, 3 this gives the momentum introduced in chapter 4. New is
the 0th component, which is given by:
\[
p^{0} = m\,\frac{d(ct)}{d\tau} = mc\gamma = E/c \tag{5.18}
\]
So E/c is the zeroth component of our four-momentum:
\[
p^{\mu} = \begin{pmatrix} p^{0} \\ p^{1} \\ p^{2} \\ p^{3} \end{pmatrix} = \begin{pmatrix} E/c \\ p_{x} \\ p_{y} \\ p_{z} \end{pmatrix}, \tag{5.19}
\]
where $p^{\mu}$ behaves just like $x^{\mu}$ under a Lorentz transformation, e.g. under a Lorentz
transformation for a change in velocity v = βc in the x-direction:
\[
\begin{cases}
E'/c = \gamma\,(E/c - \beta p_{x}) \\
p'_{x} = \gamma\,(p_{x} - \beta E/c) \\
p'_{y} = p_{y} \\
p'_{z} = p_{z}
\end{cases}, \tag{5.20}
\]
or in four-vector notation:
\[
p^{\mu'} = \Lambda^{\mu'}{}_{\mu}\, p^{\mu}. \tag{5.21}
\]
It is now immediately evident that if energy and momentum are conserved for one
inertial observer O they are also conserved for another inertial observer O′ if the relative
coordinate transformation is a Lorentz transformation:
\[
x^{\mu'} = \Lambda^{\mu'}{}_{\mu}\, x^{\mu}, \tag{5.22}
\]
because if for total energy and momentum for O of a system of n particles with four-momenta
$p_{i}^{\mu}$ (i = 1, . . . , n):
\[
p^{\mu}_{\mathrm{tot}} = \sum_{i=1}^{n} p_{i}^{\mu}, \tag{5.23}
\]
where we made use of the fact that γ ≥ 1 and the equation being zero, then, by
applying the Lorentz transform (which does not depend on the time), we find:
\[
\frac{dp^{\mu}_{\mathrm{tot}}}{d\tau} = 0 \;\Leftrightarrow\; \Lambda^{\mu'}{}_{\mu}\,\frac{dp^{\mu}_{\mathrm{tot}}}{d\tau} = \frac{d\,\Lambda^{\mu'}{}_{\mu}\, p^{\mu}_{\mathrm{tot}}}{d\tau} = \frac{dp^{\mu'}_{\mathrm{tot}}}{d\tau} = 0 \;\Leftrightarrow\; \frac{dp^{\mu'}_{\mathrm{tot}}}{dt'} = 0, \tag{5.25}
\]
and energy and momentum are also conserved for the observer O′.
³ Even if it has (only) contravariant indices.
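The statement that conservation for O implies conservation for O′ can be made concrete with a small Python sketch. As an assumed example (not taken from the text above) we let a particle of mass M at rest decay into two photons along the x-axis, use natural units with c = 1 so that the zeroth component is simply E, and boost every four-momentum with equation 5.20. All names and numbers are illustrative.

```python
import math


def boost_x(four_momentum, beta):
    """Equation 5.20 with c = 1: boost (E, px, py, pz) along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    energy, px, py, pz = four_momentum
    return (gamma * (energy - beta * px), gamma * (px - beta * energy), py, pz)


def total(four_momenta):
    """Component-wise sum of a list of four-momenta, equation 5.23."""
    return tuple(sum(components) for components in zip(*four_momenta))


M = 2.0  # parent mass, arbitrary units (hypothetical value)
initial = [(M, 0.0, 0.0, 0.0)]
final = [(M / 2, M / 2, 0.0, 0.0), (M / 2, -M / 2, 0.0, 0.0)]

beta = 0.6  # velocity of O' relative to O, hypothetical value
initial_prime = [boost_x(p, beta) for p in initial]
final_prime = [boost_x(p, beta) for p in final]

print("O : before", total(initial), "after", total(final))
print("O': before", total(initial_prime), "after", total(final_prime))
```

Because the Lorentz transformation is linear, the totals before and after the decay agree for O′ whenever they agree for O.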
We already saw that the Lorentz transformation leaves the interval
$(c\tau)^{2} = (ct)^{2} - x^{2} - y^{2} - z^{2} = r_{\mu} r^{\mu}$ invariant. Since four-momenta transform
just as time-space four-vectors, we also have:
the invariant
\[
p_{\mathrm{tot}\,\mu}\, p^{\mu}_{\mathrm{tot}} = m^{2}_{\mathrm{inv}}\, c^{2} \tag{5.28}
\]
defines the invariant mass $m_{\mathrm{inv}}$. To investigate the meaning of the invariant mass,
one can resort to a particular inertial observer and generalise to an arbitrary inertial
observer because the invariant mass is independent of the inertial observer. The
observer of choice is the centre-of-mass observer for which the total momentum sums
to zero:
\[
\vec{p}_{\mathrm{tot}} = \sum_{i=1}^{n} \vec{p}_{i} = \vec{0}. \tag{5.29}
\]
hence, there is a rest mass part $m_{i}$ and a kinetic part $|\vec{p}_{i}|$ for each particle and, in the
energy, the momenta are summed quadratically and do not necessarily add up to zero.
However, if we consider that all the particles concerned would either originate from
one particle decaying or, alternatively, come together to form one particle, call this
particle P, the mass of this single particle is fixed:
The invariant mass is, therefore, the maximum mass of the single particle that can be
produced by the system under consideration. For other observers than the centre-of-
mass one, both the total energy and the total momentum will be larger, but the in-
variant mass stays the same. For a single particle that either originated in or resulted
from the multi-particle system, for those other observers, the total energy consists of
the invariant mass part and a kinetic energy part because the particle must have a
momentum equal to the momentum sum of the multi-particle state.
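A short Python sketch may help to get a feeling for the invariant mass. It uses natural units (c = 1) and two hypothetical two-photon systems of our own choosing: one with the photons moving head-on, one with the photons moving in the same direction. The function names are ours.

```python
import math


def invariant_mass(four_momenta):
    """m_inv from p_tot,mu p_tot^mu = m_inv^2 with c = 1 (equation 5.28)."""
    energy, px, py, pz = (sum(components) for components in zip(*four_momenta))
    mass_squared = energy**2 - px**2 - py**2 - pz**2
    return math.sqrt(max(mass_squared, 0.0))  # guard against tiny negative rounding


def boost_x(four_momentum, beta):
    """Boost (E, px, py, pz) along x with velocity beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    energy, px, py, pz = four_momentum
    return (gamma * (energy - beta * px), gamma * (px - beta * energy), py, pz)


head_on = [(1.0, 1.0, 0.0, 0.0), (1.0, -1.0, 0.0, 0.0)]   # centre-of-mass observer
parallel = [(1.0, 1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0)]   # photons moving together
boosted = [boost_x(p, 0.6) for p in head_on]              # another inertial observer

print("head-on photons: ", invariant_mass(head_on))
print("parallel photons:", invariant_mass(parallel))
print("head-on, boosted:", invariant_mass(boosted))
```

Two massless photons moving head-on have a non-zero invariant mass, two photons moving in the same direction have none, and a boost changes the total energy and momentum but not the invariant mass.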
Chapter
6 Relativistic collisions
6.1 Natural units
We often use natural units where we take c = 1 and use the electronvolt (eV)¹
as unit for energy, momentum and mass. The relation between energy, momentum
and mass then becomes $m^{2} = p_{\mu} p^{\mu} = E^{2} - p^{2}$ with $p = |\vec{p}|$ the magnitude of the
momentum in three-dimensional space.
6.2 Two body decay of a particle
We have already seen a particular case of a decay of a nucleus emitting two photons,
hence with one particle in the initial state and three particles in the final state. A case
that is encountered more often is a (parent) particle of mass M decaying into two
daughter particles with masses $m_{1}$ and $m_{2}$, see Fig. 6.1. In the parent's rest frame the
parent has zero momentum and we can write
\[
P^{\mu} = \begin{pmatrix} M \\ \vec{0} \end{pmatrix}. \tag{6.1}
\]
Using $p_{\mu} p^{\mu} = m^{2}$:
or
\[
E_{2} = \frac{M^{2} + m_{2}^{2} - m_{1}^{2}}{2M}, \tag{6.3}
\]
and swapping 1 ↔ 2:
\[
E_{1} = \frac{M^{2} + m_{1}^{2} - m_{2}^{2}}{2M}. \tag{6.4}
\]
The magnitude of the momentum can be found with $p^{2} = E^{2} - m^{2}$ and should be the
same for both daughters. This leaves only the two angles ϕ and θ in the parent's rest
frame as free parameters of the decay.
and cancelling the $p^{2}$ and bringing all mass terms to one side brings us to equation 6.4 again.
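$m_{\nu} = 0$. We find:
\[
E_{\mu} = \frac{m_{\pi}^{2} + m_{\mu}^{2}}{2m_{\pi}} = 109.8\ \mathrm{MeV}
\]
\[
E_{\nu} = \frac{m_{\pi}^{2} - m_{\mu}^{2}}{2m_{\pi}} = 29.8\ \mathrm{MeV}
\]
\[
p_{\mu} = p_{\nu} = E_{\nu} = 29.8\ \mathrm{MeV}
\]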
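The numbers above can be reproduced with a few lines of Python. In this sketch the function name is our own and the masses (139.57 MeV for the charged pion, 105.66 MeV for the muon) are assumed input values, since they are not quoted in the text above. It evaluates equations 6.3 and 6.4 in natural units.

```python
import math


def two_body_decay(parent_mass, m1, m2):
    """Daughter energies and common momentum in the parent's rest frame (c = 1)."""
    e1 = (parent_mass**2 + m1**2 - m2**2) / (2.0 * parent_mass)  # equation 6.4
    e2 = (parent_mass**2 + m2**2 - m1**2) / (2.0 * parent_mass)  # equation 6.3
    momentum = math.sqrt(e1**2 - m1**2)                          # p^2 = E^2 - m^2
    return e1, e2, momentum


M_PION = 139.57  # MeV, charged pion mass (assumed input value)
M_MUON = 105.66  # MeV, muon mass (assumed input value)

e_mu, e_nu, p = two_body_decay(M_PION, M_MUON, 0.0)  # massless neutrino
print(f"E_mu = {e_mu:.1f} MeV, E_nu = {e_nu:.1f} MeV, p = {p:.1f} MeV")
```

The two energies add up to the pion mass, and for the massless neutrino the energy equals the momentum.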
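and
\[
p^{\mu}_{\pi} = \begin{pmatrix} E_{\pi} \\ -p_{\pi}\cos\theta \\ -p_{\pi}\sin\theta \\ 0 \end{pmatrix} = \begin{pmatrix} 319.4 \\ -144.8 \\ -250.7 \\ 0 \end{pmatrix}\ \mathrm{MeV}
\]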
Now we need to use the Lorentz transformation to boost to the lab frame. The energy
of the K∗ in the lab frame we find with $E = \sqrt{p^{2} + m^{2}} = 3.13$ GeV.
\[
\beta = \frac{p_{K^{*}}}{E_{K^{*}}} = \frac{3.00}{3.13} = 0.959
\]
and
\[
\gamma = \frac{E_{K^{*}}}{m_{K^{*}}} = 3.51
\]
Next we apply the Lorentz transformation for E and $p_{x}$; note the +-sign because
the K∗ is moving in the positive x-direction and the lab frame is moving with −β in
the K∗ frame.
\[
\begin{cases}
E'_{K} = \gamma\,(E_{K} + \beta p_{Kx}) = 2.496\ \mathrm{GeV} \\
p'_{Kx} = \gamma\,(p_{Kx} + \beta E_{K}) = 2.435\ \mathrm{GeV}
\end{cases} \tag{6.5}
\]
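The boost of equation 6.5 can be retraced in Python. This is a sketch under stated assumptions: the K∗ mass of 0.8917 GeV is an assumed input (it is consistent with E/γ quoted above), and the kaon energy and x-momentum in the K∗ rest frame are obtained from energy and momentum conservation with the pion four-momentum given above. All variable names are our own.

```python
import math

M_KSTAR = 0.8917         # GeV, K* mass (assumed input value)
P_KSTAR_LAB = 3.00       # GeV, K* momentum in the lab frame (from the text)
E_PION_REST = 0.3194     # GeV, pion energy in the K* rest frame (from the text)
PX_PION_REST = -0.1448   # GeV, pion x-momentum in the K* rest frame (from the text)

# Kaon in the K* rest frame, from energy and momentum conservation with the pion.
e_kaon_rest = M_KSTAR - E_PION_REST
px_kaon_rest = -PX_PION_REST

# Boost parameters of the K* as seen from the lab.
e_kstar_lab = math.sqrt(P_KSTAR_LAB**2 + M_KSTAR**2)
beta = P_KSTAR_LAB / e_kstar_lab
gamma = e_kstar_lab / M_KSTAR

# Equation 6.5: boost from the K* rest frame to the lab frame.
e_kaon_lab = gamma * (e_kaon_rest + beta * px_kaon_rest)
px_kaon_lab = gamma * (px_kaon_rest + beta * e_kaon_rest)

print(f"beta = {beta:.3f}, gamma = {gamma:.2f}")
print(f"E'_K = {e_kaon_lab:.3f} GeV, p'_Kx = {px_kaon_lab:.3f} GeV")
```

Small differences in the last digit with respect to the values quoted above come from rounding β and γ before applying the transformation.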
6.3 Collisions
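\[
(p_{p}^{\mu} + p_{\gamma}^{\mu})(p_{p\,\mu} + p_{\gamma\,\mu}) > (m_{p} + m_{\pi})^{2} \;\Rightarrow\;
E_{p} > \frac{m_{p} m_{\pi} + m_{\pi}^{2}/2}{E_{\gamma}} = 2.2 \times 10^{20}\ \mathrm{eV}
\]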
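To get a feeling for the numbers, the threshold formula can be evaluated in Python. The inputs are assumptions of this sketch and are not quoted in the text above: a proton mass of 938.27 MeV, a neutral pion mass of 135.0 MeV and a typical cosmic microwave background photon energy of about 6.3 × 10⁻⁴ eV. The result is sensitive to the photon energy that is assumed.

```python
M_PROTON = 938.27e6  # eV, proton mass (assumed input value)
M_PION = 135.0e6     # eV, neutral pion mass (assumed input value)
E_PHOTON = 6.3e-4    # eV, assumed typical CMB photon energy


def threshold_proton_energy(m_proton, m_pion, e_photon):
    """E_p > (m_p m_pi + m_pi^2 / 2) / E_gamma, in natural units (c = 1)."""
    return (m_proton * m_pion + m_pion**2 / 2.0) / e_photon


threshold = threshold_proton_energy(M_PROTON, M_PION, E_PHOTON)
print(f"threshold proton energy: {threshold:.1e} eV")
```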
We indicate the electron four-momentum with $p^{\mu}$, the electron mass by m and the
photon four-momentum with $k^{\mu}$ before the collision, and $p^{\prime\mu}$ and $k^{\prime\mu}$ after. We take the
photon to be moving along the positive x-direction and the scattering in the x−y plane:
\[
\begin{aligned}
p^{\mu} &= (m, 0, 0, 0) \\
k^{\mu} &= (k, k, 0, 0) \\
p^{\prime\mu} &= (E', p'\cos\phi, p'\sin\phi, 0) \\
k^{\prime\mu} &= (k', k'\cos\theta, k'\sin\theta, 0)
\end{aligned}
\]
\[
p^{\mu} + k^{\mu} = p^{\prime\mu} + k^{\prime\mu}
\]
we move the electron and photon each to one side and take the inner product with
itself:
\[
(k^{\mu} - k^{\prime\mu})^{2} = (p^{\prime\mu} - p^{\mu})^{2} \;\Rightarrow\;
k_{\mu} k^{\mu} - 2 k_{\mu} k^{\prime\mu} + k'_{\mu} k^{\prime\mu} = p_{\mu} p^{\mu} - 2 p_{\mu} p^{\prime\mu} + p'_{\mu} p^{\prime\mu}
\]
and thus:
\[
(1 - \cos\theta) = m\left(\frac{1}{k'} - \frac{1}{k}\right) = \frac{mc}{h}\,(\lambda' - \lambda) \tag{6.7}
\]
where in the last step we have used that k = h/λ and we have put the factor c back
in to make the units match and to reproduce the usual classical form of the formula.
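Equation 6.7 is easily turned into numbers. The Python sketch below solves it for the wavelength shift λ′ − λ of a photon scattering off an electron at rest; the function name is our own and the constants are the standard SI values.

```python
import math

H = 6.62607015e-34      # Planck constant in J s
M_E = 9.1093837015e-31  # electron mass in kg
C = 299_792_458.0       # speed of light in m/s


def compton_shift(theta):
    """Wavelength shift lambda' - lambda = h / (m c) * (1 - cos theta), from equation 6.7."""
    return H / (M_E * C) * (1.0 - math.cos(theta))


for degrees in (0, 45, 90, 180):
    shift = compton_shift(math.radians(degrees))
    print(f"theta = {degrees:3d} deg  ->  shift = {shift:.3e} m")
```

The shift does not depend on the wavelength of the incoming photon, only on the scattering angle, and is largest for backscattering.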
Chapter
7 Loose ends
In this chapter we collect a number of topics that formally are not part of the theory
of special relativity, but are related.
7.1 The Higgs mechanics or how heavy is wet light?
We have seen that massless particles move at the speed of light and that particles that
move at the speed of light must be massless. The key example of a particle moving
at the speed of light in vacuum is the photon and hence the photon must have zero
mass.
Now consider shooting a photon through an aquarium, which for the sake of
simplicity of the argument will have no (glass) walls.¹ As can be checked with a
beam of photons, e.g. from a laser, the colour of the light that goes from the vacuum
into the water comes out again at the other end unchanged. Since the colour, i.e. the
frequency f, encodes the energy of the photon through E = hf, we must conclude
that the photons leave the water again with the same energy as they came in.
When the photons move in vacuum, they move at the speed of light, their mass is
zero and all their energy is kinetic energy E = p c.
We know that when a ray of light hits the water surface at an angle, refraction
occurs. The light makes a kink at the water surface and, according to Snell's
law of refraction, $\sin\theta_{\mathrm{in}} / \sin\theta_{\mathrm{refr}} = n$, where $\theta_{\mathrm{in}}$ is the angle of the light ray with the
normal of the water surface in the vacuum, $\theta_{\mathrm{refr}}$ is the angle of the light ray with the
normal of the water surface in the water and n is the refractive index. According to
the principle of shortest optical path length, the index of refraction is given by the
ratio of the speed of light in vacuum and in water, $n = c/v_{w}$, where $v_{w}$ is the speed
of light in water. Hence, from this we have to conclude that photons move slower in
water than in vacuum. This is supported by an argument of photon interaction with
matter. Photons typically interact with matter, even when the matter is transparent.
For transparent materials this can be viewed as absorption and re-emission of photons
due to the fact that their energy does not fit an electronic band gap or an atomic or
molecular excited-state transition in the material. But the absorption and re-emission
cause a delay in the propagation of the photons, hence their average speed goes down.
But if the photons move slower in water their momentum and kinetic energy go
down. On the other hand we just argued that their total energy remains unchanged.
1 The glass walls do not change the following argument, but taking them into account just means
that the arguments must be repeated for glass, then water and then glass again.
The only reasonable way out is that the excess energy, which remains after lowering
the kinetic energy and has to stay with the photon, is carried as internal energy, which
we call mass. We therefore conclude that photons acquire a mass when they move in
water, or more generally when they move in any transparent material that refracts
light.
A variant of this mechanism is the Brout-Englert-Higgs mechanism, which predicts
that all elementary particles are massless by nature, but interact with
the Brout-Englert-Higgs field that is omnipresent in space. If the Brout-Englert-Higgs
field has a zero value, the interaction of a particle with this background field
has no effect: the strength gets multiplied by zero. If, on the other hand, the
Brout-Englert-Higgs field has a non-zero value, there is interaction and the particles acquire
a mass, just like a photon in water. The value of the mass will be proportional to the
strength of the interaction. The stronger the interaction, the more the kinetic energy of
the particle goes down and the more energy has to be shifted to the mass.
The peculiarity of the Brout-Englert-Higgs field is that, due to its spin zero nature,
it spontaneously develops a non-zero vacuum expectation value at the lowest energy.
And this is how elementary particles acquire the mass they have.
A prediction of quantum field theories is that all quantum fields, also the Brout-
Englert-Higgs field, have excited states that we identify as particles. The Brout-
Englert-Higgs field turns out to have one of such excited states, which corresponds
to the Higgs particle.² Discovering such a particle actually shows that the field must
be there. The Higgs particle was discovered in 2012, confirming this picture of how
elementary particles acquire their mass.
² Only Higgs made this remark in his paper, hence the field is called Brout-Englert-Higgs field, but
the associated particle is called Higgs.
the photon is massless, it can climb back up to the top of the tower without losing
energy. It is obvious that this process violates conservation of energy, and that the
only way to conserve energy is to conclude that photons lose energy in a gravitational
field. This implies that their frequency is reduced and their wavelength becomes longer.³
Taking the wavelength at the point of emission as $\lambda_{e}$ and the observed wavelength as
$\lambda_{o}$, and putting the gravitational potential for a mass M at 0 at infinity, we find:
\[
\Delta U_{g} = -\frac{GMm}{r} = h\left(\frac{1}{\lambda_{e}} - \frac{1}{\lambda_{o}}\right) = h\,\frac{\lambda_{o} - \lambda_{e}}{\lambda_{o}\lambda_{e}} \tag{7.1}
\]
for an electron with mass m and using $mc^{2} = h/\lambda$ we find for the observed wavelength
away from the gravitational field:
\[
z = \frac{\lambda_{o} - \lambda_{e}}{\lambda_{e}} \approx \frac{GM}{rc^{2}}
\]
with z the gravitational redshift. Likewise for the change of frequency:
\[
\frac{f_{e} - f_{o}}{f_{e}} = \frac{GM}{rc^{2}} \tag{7.2}
\]
This is an approximation, since in both cases the classical Newtonian formula for the
gravitational force is used. The slowing of the frequency in a gravitational field means that
clocks in a strong gravitational field run slower than clocks in a weak gravitational
field or clocks in an inertial frame.
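The size of the effect can be estimated with a short Python sketch. As an assumed example, not taken from the text above, we evaluate GM/(rc²) for light emitted at the surface of the Earth and observed far away, using approximate values for the mass and radius of the Earth. The function name is our own.

```python
G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light in m/s
M_EARTH = 5.972e24   # kg, approximate mass of the Earth (assumed example)
R_EARTH = 6.371e6    # m, approximate mean radius of the Earth (assumed example)


def gravitational_redshift(mass, radius):
    """z = GM / (r c^2), the weak-field approximation of equation 7.2."""
    return G * mass / (radius * C**2)


z = gravitational_redshift(M_EARTH, R_EARTH)
print(f"z = {z:.3e}")
print(f"a clock at the surface loses about {z * 86400 * 1e6:.1f} microseconds per day")
```

The redshift is tiny, of the order of 10⁻⁹, but it accumulates to tens of microseconds per day for a clock at the surface compared to a clock far away.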
List of Figures
1.1 The lightclock in the coordinate system of (a) the clock (b) a moving
observer (from https://stephenwhitt.wordpress.com/)
1.2 Train carriage struck by lightning at both extreme ends. Anna is depicted
in the middle of the carriage. Bob is looking onto the train from
the viewpoint of you as the reader. The train is moving with a velocity
v along the tracks perpendicular to the line of sight between Anna and
Bob at the time they are at closest approach.