Short Topical Videos
- Lorentz Transforms and Special Relativity (Dhruv Muley)
- Length Contraction and Space-Time animation (Episode 42 of the Mechanical Universe)
Need to Review?
Basics of Special Relativity
- The laws of physics are the same in all inertial reference frames.
- The speed of light in vacuum is the same in all inertial reference frames.
Imagine you are on a train and sending a pulse of light vertically from the floor to the ceiling. The pulse bounces off the ceiling and returns to you. The pulse travels at the speed of light, $c$, and the time it takes to return to you is $\Delta\tau$. So the light pulse travels a total distance of $c\,\Delta\tau$.
Meanwhile, there is another observer on the train platform who sees the train traveling by him with a speed $v$. From his perspective, the light pulse does not travel vertically. Instead, it moves horizontally as well. Call the time that this observer measures for the pulse to travel to the ceiling and back $\Delta t$. Then, when the light pulse has returned to the floor of the train, the observer on the platform measures a horizontal displacement of $v\,\Delta t$, since this is the distance the train has moved. The path of the light pulse will be diagonal from his perspective: diagonally up and then diagonally down. Since the speed of light is the same for all observers, he will measure that the pulse has traveled a distance of $c\,\Delta t$.
The two diagonal paths together with the horizontal displacement form a triangle. Let’s split this triangle into two right triangles by drawing a vertical line through the center. The length of the vertical line is $c\,\Delta\tau/2$. Each of the two right triangles will have a horizontal side of length $v\,\Delta t/2$ and hypotenuse $c\,\Delta t/2$. Use the Pythagorean theorem to relate the sides, i.e., $(c\,\Delta t/2)^2 = (v\,\Delta t/2)^2 + (c\,\Delta\tau/2)^2$. We can rearrange this to get
$$\Delta t = \frac{\Delta\tau}{\sqrt{1 - v^2/c^2}} \equiv \gamma\,\Delta\tau .$$
We call $\Delta\tau$ the proper time. It is the time between two events in the reference frame in which the events occur at the same spatial location. In any other reference frame, the elapsed time will be longer. In our case, the two events are the sending and receiving of the light pulse.
The take-away point is that time appears to be progressing slower for moving objects.
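As a quick numerical check of the light-clock argument, the sketch below solves the Pythagorean relation for the platform time and compares the resulting ratio with the Lorentz factor. The car height, the train speed, and the function names are illustrative assumptions, and units with $c = 1$ are used for convenience.

```python
import math

def lorentz_gamma(v, c=1.0):
    """Lorentz factor 1 / sqrt(1 - v^2/c^2), valid for |v| < c."""
    beta = v / c
    if abs(beta) >= 1.0:
        raise ValueError("speed must be below the speed of light")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def platform_round_trip_time(height, v, c=1.0):
    """Round-trip time seen from the platform.

    Each diagonal leg has hypotenuse c*dt/2, horizontal side v*dt/2 and
    vertical side `height`, so (c*dt/2)^2 = (v*dt/2)^2 + height^2.
    """
    return 2.0 * height / math.sqrt(c * c - v * v)

height = 1.0                      # hypothetical car height (c = 1 units)
v = 0.6                           # hypothetical train speed, as a fraction of c
proper_time = 2.0 * height        # round trip measured on the train (c = 1)
platform_time = platform_round_trip_time(height, v)

print("gamma              :", lorentz_gamma(v))
print("platform / proper  :", platform_time / proper_time)
```

Both printed numbers should agree, since the geometric construction is exactly what defines the Lorentz factor.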
Suppose the length of the train car is $L_0$ at rest. This is the length that the observer on the train would measure. Suppose there is an observer on the platform. He is standing still on the platform letting the train pass by him, and he records the time at which the front of the train car passes him and the time at which the back of the train car passes him. Since these two events occur at the same spatial location in the reference frame of the platform, we will use $\Delta\tau$ to denote the time between the passing of the front and back of the train car. So $\Delta\tau$ is not the same as in the section on time dilation. Then the observer on the platform would infer that the length of the train car is $L = v\,\Delta\tau$. In the meantime, the observer on the train would measure a time difference of $\Delta t = L_0/v$.
Now we can use the time-dilation formula to relate $\Delta t$ and $\Delta\tau$, i.e., $\Delta t = \gamma\,\Delta\tau$ with $\gamma = 1/\sqrt{1 - v^2/c^2}$. Then we can relate $L$ and $L_0$ as
$$L = \frac{L_0}{\gamma} .$$
It’s easy to be confused by this logic. The proper length, $L_0$, is measured in the reference frame of the train, but the proper time, $\Delta\tau$, is measured in the reference frame of the platform. The proper length is the length measured in the frame in which the object is at rest. The proper time is the time between two events in the frame in which those events occur at the same spatial location.
The take-away point is that moving objects contract along the direction of motion.
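The stopwatch argument for length contraction can be followed step by step in a few lines. This is only a sketch: the rest length, the speed, and the function names are hypothetical, and $c = 1$ is assumed.

```python
import math

def lorentz_gamma(v, c=1.0):
    """Lorentz factor 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def contracted_length(rest_length, v, c=1.0):
    """Length inferred on the platform, following the stopwatch argument."""
    # Time between the two passing events as measured on the train.
    train_time = rest_length / v
    # The platform observer is present at both events, so his time is the
    # proper time, which is shorter by a factor of gamma.
    platform_time = train_time / lorentz_gamma(v, c)
    # He infers the length from the train speed and his own elapsed time.
    return v * platform_time

rest_length = 100.0   # hypothetical proper length of the car
v = 0.8               # hypothetical speed as a fraction of c
print(contracted_length(rest_length, v))   # shorter than the rest length
```

The returned value equals the rest length divided by the Lorentz factor, independent of which intermediate times are used to get there.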
When dealing with relativistic systems on a theoretical level, it is often useful to set $c = 1$. Most people find this totally ridiculous at first, but gradually they understand the legitimacy and the usefulness of this choice. This is standard practice in the general-relativity and particle-physics communities. It is such a common choice that virtually all papers in these fields do not even bother to state it. It is mostly only in textbooks that the choice is clearly stated, and usually this is done very early on. For instance, Sean Carroll sets $c = 1$ on page 8 of Spacetime and Geometry. Steven Weinberg sets $c = 1$ in the preface of Gravitation and Cosmology. In An Introduction to Quantum Field Theory, Peskin and Schroeder set $c = 1$ between the preface and the editor’s foreword. And so on...
When we set $c = 1$, we are just saying that $3\times10^{10}\ \mathrm{cm} = 1\ \mathrm{s}$. Distance and time can be measured in the same units. You can measure distance in seconds, and you can measure time in centimeters. If you don’t want your distances in seconds, then you can always convert to centimeters by multiplying by $c = 3\times10^{10}\ \mathrm{cm/s}$.
If you are ever given an expression in which $c$ has been set to $1$, you can always restore the factors of $c$ by requiring the units to come out the way you want. For example, we might write $E = m$, where $E$ is an energy and $m$ is a mass. If you want energy to be measured in ergs and mass to be measured in grams, then there is only one way to restore the factors of $c$, i.e., $E = mc^2$. On the other hand, if you’re content to measure both energy and mass in, for example, $\mathrm{eV}$, then you don’t need to restore any factors of $c$ at all. This is why people often quote the masses of fundamental particles in $\mathrm{MeV}$ or $\mathrm{GeV}$. Sometimes they go halfway toward restoring the factors of $c$ by quoting masses in $\mathrm{MeV}/c^2$.
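To make the bookkeeping concrete, here is a minimal sketch that restores the factors of $c$ for one familiar case: converting a particle mass quoted as a rest energy in eV into grams. The function and constant names are illustrative; the numerical constants are the standard values of the speed of light and the electron-volt in cgs units, and the electron rest energy of about 0.511 MeV is used as the example input.

```python
C_CM_PER_S = 2.99792458e10     # speed of light in cm/s
ERG_PER_EV = 1.602176634e-12   # one electron-volt expressed in erg

def mass_in_grams(rest_energy_ev):
    """Restore c in E = m: m [g] = E [erg] / c^2."""
    energy_erg = rest_energy_ev * ERG_PER_EV
    return energy_erg / C_CM_PER_S ** 2

def seconds_to_centimeters(duration_s):
    """A time quoted in seconds, re-expressed as a distance in cm."""
    return duration_s * C_CM_PER_S

electron_rest_energy_ev = 0.51099895e6   # about 0.511 MeV
print(mass_in_grams(electron_rest_energy_ev))   # electron mass in grams
print(seconds_to_centimeters(1.0))              # one light-second in cm
```

Only the unit requirement (ergs and grams) fixes where the factor of $c^2$ goes; nothing else about the physics changes.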
The theoretical advantage, besides simplifying expressions, is that time and space are put on the same footing. The Lorentz transformation is seen as a rotation of time and space into each other. Since we now have $E = m$ instead of $E = mc^2$, mass is now viewed as a form of energy. Momentum will also now have the same units as energy, so we see that momentum is yet another form of energy. Instead of $E^2 = m^2c^4 + p^2c^2$, we now have $E^2 = m^2 + p^2$. This is simpler, easier to remember and clearly shows that mass and momentum are the contributions to the energy of a free particle. The factors of $c$ in the first expression only serve to get the units right. They don’t contain any theoretical significance. Additionally, velocities will now be dimensionless and always less than or equal to $1$. This gives us a dimensionless parameter to measure how relativistic a particle is or to Taylor-expand in: higher powers of $v$ will be less and less significant. This takes the place of the parameter $\beta = v/c$ that is sometimes used in special relativity.
If you ever want to take the Newtonian limit of a relativistic expression, then it is a good idea to restore all factors of $c$. In the Newtonian limit, $c \to \infty$. This makes no sense at all if $c = 1$. Often, however, the $c \to \infty$ limit is equivalent to the $v \to 0$ limit, so it’s not always necessary to restore the factors of $c$. Just be careful.
In particle physics, it is typical to also set $\hbar = 1$. In the classical limit, $\hbar \to 0$, so you will also sometimes see particle physicists restore $\hbar$ and Taylor-expand in factors of $\hbar$ when they are interested in classical or semi-classical limits.
You cannot set every constant equal to unity. If your system of units has $n$ units, then you can set $n$ constants to unity as long as they are linearly independent in their dimensions. This ensures that there is a unique way of restoring all of these constants. In cgs units, it is typical to take $c = \hbar = 1$. Depending on the problem, you have the freedom to set some other prominent constant to unity. If Kelvin is included, it is common to set $k_B = 1$.
We can use the time-dilation and length-contraction thought experiments to derive the coordinate transformations between two frames traveling at a constant velocity with respect to each other. This is not the only type of Lorentz transformation. Typically, spatial rotations are also considered to be Lorentz transformations. Since spatial rotations are exactly the same in both the Einsteinian and Newtonian formulations, we will focus on Lorentz boosts, i.e., transformations to frames traveling at some constant velocity with respect to the original frame.
Let $S$ be the stationary reference frame and $S'$ be the frame moving with velocity $v$ which, without loss of generality, we can take to be along the $x$-axis. Unprimed coordinates refer to $S$; primed coordinates refer to $S'$.
Without loss of generality, assume the origins of the two coordinate systems coincide at $t = t' = 0$. When this is not the case, it will only introduce an overall offset. So we can always translate our coordinate system to make it the case that the origins coincide at $t = t' = 0$.
Suppose there is a train car of length $L$ traveling with the $S'$ frame. The back of the car is at $x = 0$ at $t = 0$ and is always at $x' = 0$. The trajectory of the front of the car is $x = L + vt$. This is just one-dimensional inertial motion with the initial condition $x(0) = L$. But in $S'$, the car is at rest and, therefore, its length is larger, i.e., $L' = \gamma L$ using the length-contraction formula. In $S'$, the front of the car is not moving, so its trajectory is given by $x' = \gamma L$. Then we can substitute in the expression for $L$ from the trajectory in $S$, i.e., $L = x - vt$. We can rearrange the terms to get
$$x' = \gamma\,(x - vt) .$$
We can use the same logic to get a transformation from $S'$ to $S$, but this time the velocity will be pointing in the opposite direction, i.e., $v \to -v$. So we have $x = \gamma\,(x' + vt')$. We can combine the two results to get an expression for $t'$ in terms of $t$ and $x$. Using the definition of $\gamma$, this turns out to be
$$t' = \gamma\,(t - vx)$$
in units with $c = 1$ (restoring the factors of $c$ gives $t' = \gamma\,(t - vx/c^2)$).
The expressions for $x'$ and $t'$ are a good example of the usefulness of setting $c = 1$. Notice that the expressions are symmetric under an exchange of space and time, i.e., $x \leftrightarrow t$ and $x' \leftrightarrow t'$. Not only is this easier to remember, but it also emphasizes the idea that time and space are part of the same geometry in relativity. The Lorentz boost can be thought of as a rotation of time and space into each other.
The $y$- and $z$-coordinates are not affected by this transformation, so
$$y' = y, \qquad z' = z .$$
The train car was only used to give us a spacetime point to follow. These transformations are general coordinate transformations. In particular, if a particle trajectory is given by $x(t)$, $y(t)$, and $z(t)$ in $S$, then the trajectory is given by $x'(t')$, $y'(t')$, and $z'(t')$ in the $S'$ frame.
What if the velocity boost is not along the $x$-axis? Don’t make your life more difficult than it has to be. Just rotate your spatial coordinates so that the boost is along the $x$-axis. You can use similar tricks to always reduce the Lorentz boost to the form above. If the origins of the two coordinate systems do not coincide, then translate one or both of the coordinate systems to make it so. Remember that you can translate the coordinate system in both space and time.
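A boost along the $x$-axis is short enough to sketch directly. The code below assumes units with $c = 1$ and the standard form $t' = \gamma(t - vx)$, $x' = \gamma(x - vt)$; the function names and the sample event are illustrative only. It checks that an event on the worldline of the moving frame's origin lands at $x' = 0$, and that boosting back with $-v$ recovers the original coordinates.

```python
import math

def lorentz_gamma(v):
    """Lorentz factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(t, x, v):
    """Coordinates (t', x') in a frame moving with velocity v along x."""
    g = lorentz_gamma(v)
    return g * (t - v * x), g * (x - v * t)

v = 0.6
t, x = 1.0, 0.6                 # an event on the line x = v t
t_prime, x_prime = boost_x(t, x, v)
t_back, x_back = boost_x(t_prime, x_prime, -v)   # inverse boost

print("primed event :", t_prime, x_prime)   # x' vanishes, t' = t / gamma
print("round trip   :", t_back, x_back)     # recovers (t, x)
```

Because the event sits at the spatial origin of the moving frame, its primed time is the proper time elapsed there, which is shorter than $t$ by a factor of $\gamma$.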
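We can use the differential form of the Lorentz transformations to see how velocities transform. In differential form (with $c = 1$), the Lorentz transformations are
$$dt' = \gamma\,(dt - v\,dx), \qquad dx' = \gamma\,(dx - v\,dt), \qquad dy' = dy, \qquad dz' = dz .$$
We can define the components of the velocity vector of a particle in $S$ by
$$u_x = \frac{dx}{dt}, \qquad u_y = \frac{dy}{dt}, \qquad u_z = \frac{dz}{dt},$$
and likewise for the primed coordinates in $S'$. With these definitions and a few algebraic manipulations we find
$$u'_x = \frac{u_x - v}{1 - v u_x}, \qquad u'_y = \frac{u_y}{\gamma\,(1 - v u_x)}, \qquad u'_z = \frac{u_z}{\gamma\,(1 - v u_x)} .$$
These are the velocity-transformation formulae.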
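The velocity-transformation formulae are easy to try out numerically. The sketch below assumes $c = 1$ and the standard form $u'_x = (u_x - v)/(1 - v u_x)$, $u'_{y,z} = u_{y,z}/[\gamma(1 - v u_x)]$; the function names and sample velocities are illustrative.

```python
import math

def lorentz_gamma(v):
    """Lorentz factor in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def transform_velocity(ux, uy, uz, v):
    """Velocity components in a frame moving with velocity v along x (c = 1)."""
    g = lorentz_gamma(v)
    denom = 1.0 - v * ux
    return (ux - v) / denom, uy / (g * denom), uz / (g * denom)

# Two speeds of 0.5 in opposite directions combine to less than 1.
print(transform_velocity(0.5, 0.0, 0.0, -0.5))

# A light ray along x still moves at the speed of light.
print(transform_velocity(1.0, 0.0, 0.0, 0.9))

# A light ray along y is tilted, but its speed is unchanged.
ux, uy, uz = transform_velocity(0.0, 1.0, 0.0, 0.6)
print(ux, uy, math.sqrt(ux * ux + uy * uy + uz * uz))
```

The last case already shows aberration: a ray that is perpendicular to the motion in one frame acquires a component along the boost axis in the other.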
Relativistic Aberration and Beaming
The velocity transformations can be used to derive two important results for radiation: relativistic aberration, the apparent change in emission direction from a moving source, and relativistic beaming, which follows from it.
Relativistic Aberration of Light
Imagine a spaceship moving with (relativistic) velocity $v$ with respect to the lab frame. One of the crew points a laser out the window at an angle $\theta'$ from the velocity vector (in the ship’s rest frame). We want to know the angle $\theta$ between the ship’s velocity and the direction of the laser beam in the lab frame.
Since we are dealing with light, we know the magnitude of the velocity in both frames must be $c$. For convenience, let the velocity of the spaceship be in the positive x direction, and let the laser beam lie in the x-y plane. Then,
$$u'_x = c\cos\theta', \qquad u'_y = c\sin\theta',$$
and similarly for the lab frame. We can plug these into the velocity transformations (taking $v \to -v$ to go from the ship frame to the lab frame, and restoring the factors of $c$), yielding
$$\cos\theta = \frac{\cos\theta' + \beta}{1 + \beta\cos\theta'}, \qquad \sin\theta = \frac{\sin\theta'}{\gamma\,(1 + \beta\cos\theta')},$$
where $\beta = v/c$. These two equations tell us how the angle of a light beam changes under a Lorentz transform.
Now, instead of a spaceship, imagine an isotropic emitter, such as a star. What does its radiation pattern look like in the lab frame? Let’s discuss this semi-qualitatively first. In the star’s rest frame, $\theta' = \pi/2$ splits the emitted power in half: one half going towards the direction of motion, one half going the opposite direction. If we examine the transformation for $\sin\theta$, we can see that when $\theta' = \pi/2$, $\sin\theta = 1/\gamma$ (equivalently, $\cos\theta = \beta$). So, as $\gamma$ increases, the “front half” of the power gets pushed into an increasingly small cone centered on the star’s velocity vector. That is, most of the light is pushed towards the front of the star. This is the relativistic beaming effect.
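To see how quickly the forward cone narrows, one can evaluate the aberration formula for a ray emitted at right angles in the source frame. This sketch assumes the standard aberration relations $\cos\theta = (\cos\theta' + \beta)/(1 + \beta\cos\theta')$ and $\sin\theta = \sin\theta'/[\gamma(1 + \beta\cos\theta')]$; the function name and the sample values of $\beta$ are illustrative.

```python
import math

def lab_angle(theta_rest, beta):
    """Lab-frame angle of a light ray emitted at theta_rest in the source frame."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    denom = 1.0 + beta * math.cos(theta_rest)
    cos_lab = (math.cos(theta_rest) + beta) / denom
    sin_lab = math.sin(theta_rest) / (gamma * denom)
    return math.atan2(sin_lab, cos_lab)

for beta in (0.0, 0.5, 0.9, 0.99):
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    half_power_angle = lab_angle(math.pi / 2.0, beta)
    # The second column is asin(1/gamma), which should match the first.
    print(beta,
          math.degrees(half_power_angle),
          math.degrees(math.asin(1.0 / gamma)))
```

For large $\gamma$ the half-power angle approaches $1/\gamma$ radians, which is the usual rule of thumb for the opening angle of a beamed source.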
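We can write down an analytic form for the relativistic beaming by examining a small patch of solid angle at $\theta'$, and see how it transforms to the lab frame.
$$d\Omega = \sin\theta\,d\theta\,d\phi$$
as usual. For convenience, let’s define $\mu = \cos\theta$ and $\mu' = \cos\theta'$. Then,
$$d\Omega = -d\mu\,d\phi,$$
and the relativistic aberration equation becomes:
$$\mu = \frac{\mu' + \beta}{1 + \beta\mu'},$$
where $\beta = v/c$. Differentiating both sides yields
$$d\mu = \frac{d\mu'}{\gamma^2\,(1 + \beta\mu')^2} .$$
Inserting this back into the infinitesimal solid angle element gives
$$d\Omega = -\frac{d\mu'\,d\phi}{\gamma^2\,(1 + \beta\mu')^2},$$
but, $d\phi = d\phi'$ and $d\Omega' = -d\mu'\,d\phi'$, giving us our final result:
$$d\Omega = \frac{d\Omega'}{\gamma^2\,(1 + \beta\mu')^2} .$$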
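A source that has an isotropic emission in its rest frame, with total power $P$, leads to emission in the lab frame:
$$\frac{dP}{d\Omega} = \frac{P}{4\pi}\,\frac{d\Omega'}{d\Omega} = \frac{P}{4\pi}\,\gamma^2\,(1 + \beta\mu')^2 = \frac{P}{4\pi}\,\frac{1}{\gamma^2\,(1 - \beta\mu)^2},$$
where $\mu = \cos\theta$ and $\mu' = \cos\theta'$ are the cosines of the emission angle in the lab and rest frames, $\beta = v/c$, and the last step uses $1 + \beta\mu' = 1/[\gamma^2\,(1 - \beta\mu)]$.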
Note that the total power emitted is conserved; it is simply remapped over the sphere.
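The statement that the power is only remapped over the sphere can be checked numerically. The sketch below assumes the beaming pattern takes the form $dP/d\Omega = (P/4\pi)\,/\,[\gamma^2(1 - \beta\mu)^2]$ with $\mu = \cos\theta$ in the lab frame, which is what the solid-angle mapping gives for an isotropic rest-frame emitter; the function names, the step count, and the value of $\beta$ are illustrative choices.

```python
import math

def beamed_pattern(mu, beta):
    """Lab-frame dP/dOmega in units of P / (4 pi), assuming pure solid-angle remapping."""
    gamma_sq = 1.0 / (1.0 - beta * beta)
    return 1.0 / (gamma_sq * (1.0 - beta * mu) ** 2)

def fraction_of_power(mu_min, mu_max, beta, steps=100000):
    """Midpoint-rule integral over mu, normalised so the full sphere gives 1."""
    dmu = (mu_max - mu_min) / steps
    total = 0.0
    for i in range(steps):
        total += beamed_pattern(mu_min + (i + 0.5) * dmu, beta)
    # Integrating over phi gives 2 pi, and dividing by 4 pi leaves a factor 1/2.
    return 0.5 * total * dmu

beta = 0.9
print("whole sphere      :", fraction_of_power(-1.0, 1.0, beta))
print("inside cos > beta :", fraction_of_power(beta, 1.0, beta))
```

The first number should be very close to one, and the second very close to one half: the cone bounded by $\cos\theta = \beta$ (that is, $\sin\theta = 1/\gamma$) contains the half of the power that was emitted into the forward hemisphere in the rest frame.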
It is often convenient to set relativistic phenomena in a 4-dimensional spacetime. We will number the dimensions of this spacetime $0$, $1$, $2$, and $3$. The $0$ component is time, and the rest are the spatial components. Then the spacetime location of a particle can be described by a 4-component object called a four-vector. We will use $x$ to represent the four-vector of a particle. Note that $x$ no longer represents position along the $x$-axis. The components of $x$ are denoted by $x^\mu$, where $\mu$ is meant to be an index, not an exponent. Typically, Greek indices are used when the index can take on any value from $0$ to $3$. Latin indices are used when the index can only take on values from $1$ to $3$. This is a very standard convention.
When the index is raised, $x^\mu$ is said to be contravariant. We can also define the covariant four-vector $x_\mu$ for which $x_0 = -x^0$ and $x_i = x^i$.
Usually we define a four-vector to be any object that satisfies the condition that $x^\mu x_\mu$ is invariant under Lorentz transformations. Note that a four-vector need not represent location in spacetime.
Einstein summation convention
We will often want to sum over the indices of four-vectors. Albert found it tiresome to have to write down so many capital sigmas, so he invented a convention for summing over indices. According to this convention
$$x^\mu y_\mu \equiv \sum_{\mu=0}^{3} x^\mu y_\mu = x^0 y_0 + x^1 y_1 + x^2 y_2 + x^3 y_3 .$$
The essence of the convention is that repeated indices are summed over. This is also a very standard convention.
It is useful to define a 2-index tensor $\eta_{\mu\nu}$ called the Minkowski metric. This tensor can be thought of as a matrix with
$$\eta_{00} = -1, \qquad \eta_{11} = \eta_{22} = \eta_{33} = 1, \qquad \eta_{\mu\nu} = 0$$
when $\mu \neq \nu$. So, as a matrix, $\eta_{\mu\nu}$ is diagonal.
We can use the Minkowski metric to write covariant four-vectors in terms of contravariant four-vectors, i.e.,
$$x_\mu = \eta_{\mu\nu}\,x^\nu .$$
We can also raise and lower indices on tensors with multiple indices, e.g., $T_\mu{}^\nu = \eta_{\mu\alpha}\,T^{\alpha\nu}$.
Different authors use different conventions for the Minkowski metric. Our definition is common among general relativists. Among particle physicists, it is common to take $\eta_{00} = 1$ and $\eta_{ii} = -1$. This only amounts to flipping the sign on various expressions. Be sure you know which convention you are using.
We can define the product of two four-vectors to be $x^\mu y_\mu$. This quantity is sometimes called an inner product (although it can be negative when $x = y$, which violates the conventional definition among mathematicians of an inner product). We are extremely interested in these kinds of products, because they are invariant under Lorentz transformations. Using the formulae for the Lorentz transformations, you can compute the components of $x'$ and $y'$ in terms of the components of $x$ and $y$. Then you can evaluate $x'^\mu y'_\mu$. You will find
$$x'^\mu y'_\mu = x^\mu y_\mu .$$
Of particular interest is the case for which $x = y$. Then we have
$$x^\mu x_\mu = -t^2 + x^2 + y^2 + z^2 \equiv s^2,$$
where $s^2$ is sometimes called the invariant spacetime interval and we have briefly gone back to using $x$ to mean the $x$-component of the spatial vector. The spacetime interval is invariant under all Lorentz transformations including both boosts and rotations. It is not invariant under translations. For that reason, we often write this equality in a differential form, i.e.,
$$ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 .$$
Differentials are unchanged by translations, so this expression is now invariant under boosts, rotations, and translations.
The invariant interval, $s$, should be thought of as the length of the four-vector. In ordinary 3-dimensional space, the length of a vector is conserved under rotations. That is why we are so interested in dot products and norms; they do not depend on the orientation of our coordinate system. A Lorentz scalar is an even more useful quantity, because it is invariant under rotations as well as velocity boosts.
We can also think of $ds^2$ as a generalization of the Pythagorean theorem. The distance we travel in 3-dimensional space is given by $dl^2 = dx^2 + dy^2 + dz^2$. This distance is independent of the coordinate system. The distance is a physical quantity; the coordinate system is just a set of labels. When we move in spacetime, we can also define a 4-dimensional spacetime triangle whose hypotenuse is the spacetime interval. The Pythagorean theorem in our 4-dimensional spacetime is not a straightforward generalization from 3 dimensions, but it has the property that the length of the hypotenuse is completely independent of the coordinate system.
In the rest frame of the particle, $dx = dy = dz = 0$ and $ds^2 = -dt^2$. We can define the proper time by
$$d\tau^2 = -ds^2 .$$
So we could also use $d\tau^2$ as an invariant interval. It differs from $ds^2$ only by a minus sign.
Lorentz transformation as a matrix operation
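With our new formalism, we can write the Lorentz transformation as a matrix acting on a vector. The Lorentz transformation will be denoted by the 2-index object $\Lambda^\mu{}_\nu$. The transformed four-vector is given by
$$x'^\mu = \Lambda^\mu{}_\nu\,x^\nu .$$
This is just matrix multiplication where
$$x = \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}$$
and, for example,
$$\Lambda = \begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
for a boost along the $x$-axis (with $c = 1$). Rotations can be implemented by using the lower-right $3\times3$ block as a rotation matrix for a 3-dimensional vector.
The covariant transformation is given by
$$x'_\mu = \Lambda_\mu{}^\nu\,x_\nu,$$
where $\Lambda_\mu{}^\nu = \eta_{\mu\alpha}\,\eta^{\nu\beta}\,\Lambda^\alpha{}_\beta$ and $\eta^{\nu\beta}$ is the inverse metric, which has the same components as $\eta_{\nu\beta}$. We can also write the Minkowski metric as a matrix, i.e.,
$$\eta = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} .$$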
So, when in doubt, all of these tensor manipulations can be done by simple matrix multiplication. Just make sure you’ve got the right matrix representation and that you’re multiplying the matrices in the right order.
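You can either check explicitly or infer from the Lorentz invariance of $x^\mu x_\mu$ that
$$\Lambda^\mu{}_\alpha\,\Lambda_\mu{}^\beta = \delta_\alpha^\beta,$$
where $\delta_\alpha^\beta$ is called the Kronecker delta. It is basically the identity matrix in that
$$\delta_\alpha^\beta = \begin{cases} 1 & \alpha = \beta \\ 0 & \alpha \neq \beta . \end{cases}$$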
Raising and lowering indices on the Kronecker delta has no real significance. The order of the indices also doesn’t matter.
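When in doubt, these index manipulations can be checked with plain matrix arithmetic. The sketch below uses only the Python standard library and assumes $c = 1$, the metric signature $(-,+,+,+)$, and a boost along the $x$-axis; the helper names are illustrative. It verifies the matrix form of the invariance condition, $\Lambda^T \eta \Lambda = \eta$, and that the interval of a sample four-vector is unchanged by the boost.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def boost_matrix_x(v):
    """Matrix of the boost along x with velocity v (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def apply(matrix, vector):
    """Matrix acting on a four-vector with an upper index."""
    return [sum(matrix[i][k] * vector[k] for k in range(4)) for i in range(4)]

def interval(vector):
    """x^mu x_mu with the (-,+,+,+) metric."""
    return sum(ETA[i][i] * vector[i] * vector[i] for i in range(4))

lam = boost_matrix_x(0.6)
should_be_eta = matmul(transpose(lam), matmul(ETA, lam))
event = [2.0, 1.0, -3.0, 0.5]          # arbitrary sample four-vector
boosted = apply(lam, event)

print(should_be_eta)
print(interval(event), interval(boosted))
```

Getting the transpose and the order of the factors right is the whole game here, which is why an explicit check of this kind is useful before trusting a longer calculation.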
We define the four-velocity as
$$U^\mu = \frac{dx^\mu}{d\tau} .$$
Since $ds^2$ and $d\tau$ are Lorentz scalars, the four-velocity is also a four-vector, i.e., $U^\mu U_\mu$ is Lorentz invariant. The $0$ component of $U^\mu$ does not have an intuitive interpretation. The spatial components of $U^\mu$ are not quite the same as the real velocity which would be $dx^i/dt$. Only in the non-relativistic limit, when $d\tau \approx dt$, do the spatial components of $U^\mu$ begin to approximate the real velocity.
Using $dt = \gamma\,d\tau$, where $\gamma = (1 - v^2)^{-1/2}$ and $v$ is the real velocity of the particle, we can express the four-velocity as
$$U^\mu = (\gamma,\ \gamma\,\vec{v}),$$
where $\vec{v}$ is the 3-dimensional real velocity vector. In particular, $\vec{v} = 0$ in the particle’s rest frame which means that
$$U^\mu = (1, 0, 0, 0) \qquad \text{and} \qquad U^\mu U_\mu = -1 .$$
The four-acceleration is defined as
$$a^\mu = \frac{dU^\mu}{d\tau} .$$
By the same arguments given in the section on four-velocity, $a^\mu$ is also a four-vector. Again, the spatial components approximate the real acceleration, $\vec{a} = d\vec{v}/dt$, only in the non-relativistic limit.
Interestingly, the four-acceleration is always orthogonal to the four-velocity, i.e., $a^\mu U_\mu = 0$.
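Both the normalization of the four-velocity and its orthogonality to the four-acceleration can be checked numerically. The sketch below assumes $c = 1$ and the $(-,+,+,+)$ metric. The explicit components of the four-acceleration used here, $a^0 = \gamma^4(\vec{v}\cdot\vec{a})$ and $a^i = \gamma^4(\vec{v}\cdot\vec{a})\,v^i + \gamma^2 a^i$, are not written out in the text; they follow from applying the chain rule to $U^\mu = (\gamma, \gamma\vec{v})$ with $d\gamma/dt = \gamma^3\,\vec{v}\cdot\vec{a}$. Function names and sample values are illustrative.

```python
import math

def dot3(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def minkowski_dot(a, b):
    """a^mu b_mu with the (-,+,+,+) metric."""
    return -a[0] * b[0] + dot3(a[1:], b[1:])

def lorentz_gamma(v):
    return 1.0 / math.sqrt(1.0 - dot3(v, v))

def four_velocity(v):
    g = lorentz_gamma(v)
    return [g] + [g * vi for vi in v]

def four_acceleration(v, a):
    """dU/dtau for 3-velocity v and 3-acceleration a = dv/dt (c = 1)."""
    g = lorentz_gamma(v)
    va = dot3(v, a)
    return [g ** 4 * va] + [g ** 4 * va * vi + g ** 2 * ai
                            for vi, ai in zip(v, a)]

v = (0.3, -0.4, 0.2)      # arbitrary sample 3-velocity, |v| < 1
a = (1.0, 0.5, -2.0)      # arbitrary sample 3-acceleration

U = four_velocity(v)
A = four_acceleration(v, a)
print(minkowski_dot(U, U))   # close to -1
print(minkowski_dot(A, U))   # close to 0
```

The orthogonality is just the derivative of the constraint $U^\mu U_\mu = -1$ with respect to proper time, so it holds for any velocity and acceleration.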
Energy and momentum
We will have to define what we mean by energy and momentum in special relativity. We will try to choose definitions that reduce to the well-known Newtonian expressions in the non-relativistic limit.
We define the four-momentum as
$$p^\mu = m\,U^\mu,$$
where $m$ is the mass of the particle. Sometimes people define mass so that it actually changes from one reference frame to another. That is where the term “rest mass” comes from, i.e., the mass measured in the rest frame of the particle. We are not going to take that approach. For us, the mass is a Lorentz scalar.
We will define the energy to be
$$E \equiv p^0$$
and the momentum to be the spatial part of $p^\mu$. Using $U^\mu = (\gamma, \gamma\vec{v})$, we find that
$$E = \gamma m, \qquad \vec{p} = \gamma m \vec{v} .$$
In the particle’s rest frame, we have $E = m$, which is just $E = mc^2$ with $c = 1$. So we have discovered that mass is the rest energy of a particle.
Since $U^\mu$ is a four-vector and $p^\mu$ is just proportional to $U^\mu$, we can conclude that $p^\mu$ is also a four-vector. In particular, we have $p^\mu p_\mu = -E^2 + |\vec{p}|^2$. In the rest frame, we have $p^\mu p_\mu = -m^2$. So we have just found that $-E^2 + |\vec{p}|^2 = -m^2$, which can be rearranged to read
$$E^2 = m^2 + |\vec{p}|^2 .$$
We were free to define energy and momentum however we liked, but it would be nice if those definitions were reasonable in the sense that they reduce to Newtonian energy and momentum in the non-relativistic limit. Our expression for energy was $E = \gamma m$. The non-relativistic limit is the $v \to 0$ limit. We can expand our expression for $E$ to second order in $v$ to get
$$E \approx mc^2 + \tfrac{1}{2}mv^2,$$
where the factors of $c$ have been restored since we are now in a pseudo-Newtonian regime. The energy looks like the usual kinetic energy for a non-relativistic particle but with some extra constant offset. This offset is not physical in Newtonian physics; only energy differences are relevant. In the fully Newtonian limit, $c \to \infty$ and the energy offset is infinite. That’s why it’s good to pause the limit at this point and redefine the zero of our energy scale so that $E = \tfrac{1}{2}mv^2$.
Expanding $\vec{p} = \gamma m\vec{v}$ to second order in $v$ gives
$$\vec{p} \approx m\vec{v},$$
which is the usual non-relativistic expression for momentum. There are no factors of $c$ to restore in this expression.
So it looks like our definitions of relativistic energy and momentum were reasonable after all.
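The two claims of this section, the energy-momentum relation and the Newtonian limit, can both be checked with a few lines of arithmetic. The sketch assumes $c = 1$, one-dimensional motion, and the definitions $E = \gamma m$ and $p = \gamma m v$; the function names and sample values are illustrative.

```python
import math

def lorentz_gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def energy(m, v):
    """Relativistic energy E = gamma m (c = 1)."""
    return lorentz_gamma(v) * m

def momentum(m, v):
    """Relativistic momentum p = gamma m v (c = 1)."""
    return lorentz_gamma(v) * m * v

m = 1.0

# Energy-momentum relation at a relativistic speed.
E, p = energy(m, 0.6), momentum(m, 0.6)
print(E * E - p * p, m * m)

# Newtonian limit: kinetic energy and momentum at a small speed.
v_slow = 1.0e-3
print((energy(m, v_slow) - m) / (0.5 * m * v_slow ** 2))   # close to 1
print(momentum(m, v_slow) / (m * v_slow))                  # close to 1
```

The small deviations from one in the last two lines are of order $v^2$, which is the size of the first relativistic correction.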
This section is based on Rybicki and Lightman’s treatment in Section 4.8.
We want to generalize the Larmor formula for dipole radiation to particles moving at relativistic speeds. Assuming we’ve already derived the dipole formula for non-relativistic motion, a good starting point is a frame in which the particle is moving, at least momentarily, at speeds which are small compared to the speed of light. So let’s start in the instantaneous rest frame of the particle. We can form a four-momentum representing the sum of all the four-momenta of all the photons emitted. In some small time interval $dt$, the particle emits an energy $dW$. This radiation is not emitted isotropically, but there is no net flux of momentum. For any given direction, the same amount of radiation is emitted in the opposite direction. The spatial components of the four-momentum of the radiation vanish, i.e., $d\vec{p} = 0$. So if we transform to another frame, we will have $dW' = \gamma\,dW$. At the same time, $dt' = \gamma\,dt$, since the unprimed frame is the one in which the particle is instantaneously at rest. Then the emitted power is
$$P' = \frac{dW'}{dt'} = \frac{\gamma\,dW}{\gamma\,dt} = \frac{dW}{dt} = P .$$
The factors of $\gamma$ cancel, and we find that the power is the same in both reference frames. But we could have transformed to any reference frame. So we have just proved that the radiated power is a Lorentz scalar so long as there is no net flux of momentum in the particle’s rest frame.
In particular, we can apply this result to the Larmor formula for dipole radiation:
$$P = \frac{2q^2}{3c^3}\,|\vec{a}|^2,$$
where $\vec{a}$ is the real 3-dimensional acceleration. This is definitely valid in the instantaneous rest frame of the particle where the speeds are very small compared with the speed of light. But this expression is not Lorentz invariant, since it depends on $\vec{a}$ instead of $a^\mu$. Because we are in the instantaneous rest frame of the particle, $U^0 = 1$ and $U^i = 0$. At the same time, recall that $a^\mu U_\mu = 0$ always. Since $U^i = 0$ and $U^0 = 1$, we must have $a^0 = 0$. But then $a^\mu a_\mu = |\vec{a}|^2$, since the spatial components of $a^\mu$ reduce to $\vec{a}$ in this frame. Then we can replace $|\vec{a}|^2$ in the Larmor formula with $a^\mu a_\mu$ to get
$$P = \frac{2q^2}{3c^3}\,a^\mu a_\mu .$$
All of the factors in this expression are Lorentz invariant, so this is a Lorentz invariant formula for the total power emitted by an accelerating charge.
In non-covariant form, this can be expressed as
$$P = \frac{2q^2}{3c^3}\,\gamma^4\left(a_\perp^2 + \gamma^2 a_\parallel^2\right),$$
where $a_\parallel$ represents the component of acceleration parallel to the velocity, and $a_\perp$ represents that perpendicular to the velocity, as seen in the lab frame. The case in which the total acceleration is entirely perpendicular to the velocity (i.e., $a_\parallel = 0$) gives rise to synchrotron radiation.
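The equivalence of the covariant and non-covariant forms of the radiated power can be confirmed numerically. The sketch below assumes $c = 1$, the $(-,+,+,+)$ metric, and the four-acceleration components $a^0 = \gamma^4(\vec{v}\cdot\vec{a})$, $a^i = \gamma^4(\vec{v}\cdot\vec{a})\,v^i + \gamma^2 a^i$, which follow from differentiating $U^\mu = (\gamma, \gamma\vec{v})$ with respect to proper time. The function names, the charge, and the sample vectors are illustrative.

```python
import math

def dot3(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def lorentz_gamma(v):
    return 1.0 / math.sqrt(1.0 - dot3(v, v))

def power_covariant(q, v, a):
    """P = (2 q^2 / 3) a^mu a_mu, with c = 1."""
    g = lorentz_gamma(v)
    va = dot3(v, a)
    a0 = g ** 4 * va
    a_spatial = [g ** 4 * va * vi + g ** 2 * ai for vi, ai in zip(v, a)]
    return (2.0 * q * q / 3.0) * (-a0 * a0 + dot3(a_spatial, a_spatial))

def power_lab(q, v, a):
    """P = (2 q^2 / 3) gamma^4 (a_perp^2 + gamma^2 a_par^2), with c = 1."""
    g = lorentz_gamma(v)
    speed = math.sqrt(dot3(v, v))
    a_par = dot3(v, a) / speed if speed > 0.0 else 0.0
    a_perp_sq = dot3(a, a) - a_par * a_par
    return (2.0 * q * q / 3.0) * g ** 4 * (a_perp_sq + g * g * a_par * a_par)

q = 1.0
v = (0.6, 0.0, 0.0)
for a in [(1.0, 0.0, 0.0),      # purely parallel acceleration
          (0.0, 1.0, 0.0),      # purely perpendicular (synchrotron-like)
          (1.0, 2.0, 0.0)]:     # mixed
    print(a, power_covariant(q, v, a), power_lab(q, v, a))
```

For the same magnitude of acceleration, the parallel case radiates more by a factor of $\gamma^2$ than the perpendicular case, as the non-covariant form makes explicit.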
Maxwell’s equations are Lorentz invariant. Unfortunately, the most familiar form of Maxwell’s equations ($\nabla\cdot\vec{E} = 4\pi\rho$, etc.) does not make the Lorentz invariance manifest. But we can define a few tensors and rewrite Maxwell’s equations in a manifestly Lorentz invariant form.
First we define the four-current as
$$J^\mu = (\rho,\ \vec{J}),$$
where $\rho$ is the charge density and $\vec{J}$ is the 3-dimensional current. Then the continuity equation can be written as
$$\partial_\mu J^\mu = 0,$$
where $\partial_\mu \equiv \partial/\partial x^\mu$. So already we’ve been able to write the continuity equation in a manifestly Lorentz invariant form.
Now let’s define the four-potential as
$$A^\mu = (\phi,\ \vec{A}),$$
where $\phi$ is the scalar potential and $\vec{A}$ is the vector potential. We want to work in the Lorenz gauge for which the condition is
$$\partial_\mu A^\mu = 0 .$$
This is a good gauge for us, because it will allow us to write the equations of motion for $A^\mu$ in a manifestly Lorentz invariant form. The Lorenz gauge actually originated with a physicist whose last name was Lorenz (not a typo), but unfortunately for him, Hendrik Lorentz became much more famous and the Lorenz gauge turns out to be associated with Lorentz invariance. In this gauge, the equations of motion are
$$\Box A^\mu = -4\pi J^\mu$$
in Gaussian units with $c = 1$. Note that $\Box \equiv \partial_\mu\partial^\mu = -\partial_t^2 + \nabla^2$ is the d’Alembertian operator.
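Now we can define the field-strength tensor as
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu .$$
Notice that $F_{\mu\nu}$ is antisymmetric in its indices, i.e., $F_{\mu\nu} = -F_{\nu\mu}$. Using this definition of the field-strength tensor we can write
$$\partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} = 0 .$$
Using the gauge condition, the equations of motion can be written in terms of $F^{\mu\nu}$ as
$$\partial_\mu F^{\mu\nu} = -4\pi J^\nu .$$
The previous two equations are Lorentz invariant and equivalent to the conventional form of Maxwell’s equations.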
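Now let’s try to recover Maxwell’s equations for the electric and magnetic fields. This will be a backwards argument, since we defined the potentials through the electric and magnetic fields and the field-strength tensor through the four-potential. That is, we shouldn’t be at all surprised to see the familiar form of Maxwell’s equations emerge from this formalism. Recall that $\vec{E} = -\nabla\phi - \partial_t\vec{A}$ and $\vec{B} = \nabla\times\vec{A}$. Then $F_{0i} = -E_i$ and $F_{ij} = \epsilon_{ijk}B_k$, where $\epsilon_{ijk}$ is the Levi-Civita symbol. So $\epsilon_{ijk} = 1$ when $ijk$ is an even permutation of $123$, and $\epsilon_{ijk} = -1$ when $ijk$ is an odd permutation. As a matrix, $F_{\mu\nu}$ can be written as
$$F_{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix} .$$
Now that we’ve related the field-strength tensor to the electric and magnetic fields, we can rewrite our equation of motion ($\partial_\mu F^{\mu\nu} = -4\pi J^\nu$) in terms of $\vec{E}$ and $\vec{B}$. We find that this equation of motion is equivalent to the two inhomogeneous Maxwell’s equations:
$$\nabla\cdot\vec{E} = 4\pi\rho, \qquad \nabla\times\vec{B} - \partial_t\vec{E} = 4\pi\vec{J} .$$
We can use $\partial_\lambda F_{\mu\nu} + \partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu} = 0$ to recover the two homogeneous Maxwell’s equations:
$$\nabla\cdot\vec{B} = 0, \qquad \nabla\times\vec{E} + \partial_t\vec{B} = 0 .$$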
So we have shown that Maxwell’s equations need only be written in terms of the field-strength tensor in order to make their Lorentz invariance manifest.
Lorentz transformation of the electric and magnetic fields
The field-strength tensor has two covariant indices. We saw in the section on Lorentz transformations how to perform a covariant transformation. Since $F_{\mu\nu}$ has two indices, we will need to perform a transformation on both. The transformation looks like
$$F'_{\mu\nu} = \Lambda_\mu{}^\alpha\,\Lambda_\nu{}^\beta\,F_{\alpha\beta} .$$
This can also be evaluated using ordinary matrix multiplication. You want to be a little careful, though. You should take the transpose of $\Lambda$ and put it all the way on the right, i.e., $F' = \Lambda F \Lambda^T$, where $\Lambda$ here stands for the matrix with entries $\Lambda_\mu{}^\nu$. Otherwise, you’re not performing matrix multiplication. Once you’ve done the multiplication, you can just read off the components of $F'_{\mu\nu}$ to see how the fields transformed. For a boost along the $x$-axis (with $c = 1$),
$$E'_x = E_x, \qquad E'_y = \gamma\,(E_y - vB_z), \qquad E'_z = \gamma\,(E_z + vB_y),$$
$$B'_x = B_x, \qquad B'_y = \gamma\,(B_y + vE_z), \qquad B'_z = \gamma\,(B_z - vE_y) .$$
Whereas a rotation would rotate the components of $\vec{E}$ into each other and the components of $\vec{B}$ into each other, the Lorentz boost actually rotates $\vec{E}$ into $\vec{B}$. This also means that a Lorentz boost can create magnetic fields. Suppose we only have $E_y$ in one frame and all other field components vanish. In a boosted frame we would pick up a non-zero $B'_z$ even though the original frame had no magnetic field at all. For this reason, people sometimes say that magnetism is merely a relativistic effect. Notice also that the fields parallel to the boost are not affected.
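The appearance of a magnetic field in the boosted frame is easy to reproduce numerically. The sketch below assumes Gaussian units with $c = 1$ and the standard field transformation for a boost along the $x$-axis; the function name and the sample field are illustrative. It also checks the two well-known field invariants, $\vec{E}\cdot\vec{B}$ and $E^2 - B^2$, which must take the same values in both frames.

```python
import math

def boost_fields_x(E, B, v):
    """Electric and magnetic fields in a frame moving with velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, g * (Ey - v * Bz), g * (Ez + v * By))
    B_new = (Bx, g * (By + v * Ez), g * (Bz - v * Ey))
    return E_new, B_new

def dot3(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

E = (0.0, 1.0, 0.0)    # pure electric field along y in the original frame
B = (0.0, 0.0, 0.0)    # no magnetic field in the original frame
E_new, B_new = boost_fields_x(E, B, 0.6)

print("E' =", E_new)
print("B' =", B_new)   # a z-component appears
print("E.B    :", dot3(E, B), dot3(E_new, B_new))
print("E^2-B^2:", dot3(E, E) - dot3(B, B),
      dot3(E_new, E_new) - dot3(B_new, B_new))
```

Since $E^2 - B^2$ is positive here, no boost can turn this configuration into a purely magnetic field; the boost can only mix in a magnetic component alongside a larger electric one.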
Rybicki and Lightman, Radiative Processes in Astrophysics, Ch. 4
Griffiths, Introduction to Electrodynamics, 3rd Ed., Chs. 10, 12
Carroll, Spacetime and Geometry, Ch. 1
Weinberg, Gravitation and Cosmology, Ch. 2
Jackson, Classical Electrodynamics, 3rd Ed., Ch. 11