# Lorentz transformations

## Basics of Special Relativity

### Postulates

1. The laws of physics are the same in all inertial reference frames.
2. The speed of light in vacuum is the same in all inertial reference frames.

### Thought experiments

#### Time dilation

Imagine you are on a train and sending a pulse of light vertically from the floor to the ceiling. The pulse bounces off the ceiling and returns to you. The pulse travels at the speed of light, ${\displaystyle c}$, and the time it takes to return to you is ${\displaystyle 2t_{0}}$. So the light pulse travels a total distance of ${\displaystyle 2ct_{0}}$.

Meanwhile, there is another observer on the train platform who sees the train traveling by him with a speed ${\displaystyle v}$. From his perspective, the light pulse does not travel vertically. Instead, it moves horizontally as well. Call the time that this observer measures for the pulse to travel to the ceiling and back ${\displaystyle 2t}$. Then, when the light pulse has returned to the floor of the train, the observer on the platform measures a horizontal displacement of ${\displaystyle 2vt}$, since this is the distance the train has moved. The path of the light pulse will be diagonal from his perspective – diagonally up and then diagonally down. Since the speed of light is the same for all observers, he will measure that the pulse has traveled a distance of ${\displaystyle 2ct}$.

The two diagonal paths together with the horizontal displacement form a triangle. Let’s split this triangle into two right triangles by drawing a vertical line through the center. The length of the vertical line is ${\displaystyle ct_{0}}$. Each of the two right triangles will have a horizontal side of length ${\displaystyle vt}$ and hypotenuse ${\displaystyle ct}$. Use the Pythagorean theorem to relate the sides, i.e., ${\displaystyle (ct)^{2}=(vt)^{2}+(ct_{0})^{2}}$. We can rearrange this to get

${\displaystyle t=t_{0}\gamma ,\,\!}$

where

${\displaystyle \gamma \equiv {\frac {1}{\sqrt {1-(v/c)^{2}}}}.\,\!}$

We call ${\displaystyle t_{0}}$ the proper time. It is the time between two events in the reference frame in which the events occur at the same spatial location. In any other reference frame, the elapsed time will be longer. In our case, the two events are the sending and receiving of the light pulse.

The take-away point is that time appears to be progressing slower for moving objects.
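As a quick numerical check of the light-clock argument, here is a minimal Python sketch. The function name `lorentz_gamma` and the sample values (a train at 0.6 of the speed of light, a one-second proper time) are illustrative choices, not part of the derivation above.

```python
import math

C = 2.99792458e10  # speed of light in cm/s

def lorentz_gamma(v, c=C):
    """Lorentz factor for speed v (same units as c); requires |v| < c."""
    if abs(v) >= c:
        raise ValueError("speed must be below c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

v = 0.6 * C   # illustrative train speed
t0 = 1.0      # one-way light travel time in the train frame, in seconds
t = lorentz_gamma(v) * t0   # one-way time measured from the platform

# Pythagorean relation from the thought experiment: (ct)^2 = (vt)^2 + (c t0)^2
lhs = (C * t) ** 2
rhs = (v * t) ** 2 + (C * t0) ** 2
print(t, math.isclose(lhs, rhs))
```

For this speed the Lorentz factor is 1.25, so the platform observer measures a time 25% longer than the proper time.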

#### Length contraction

Suppose the length of the train car is ${\displaystyle \ell _{0}}$ at rest. This is the length that the observer on the train would measure. Suppose there is an observer on the platform. He is standing still on the platform letting the train pass by him, and he records the time at which the front of the train car passes him and the time at which the back of the train car passes him. Since these two events occur at the same spatial location in the reference frame of the platform, we will use ${\displaystyle t_{0}}$ to denote the time between the passing of the front and back of the train car. So ${\displaystyle t_{0}}$ is not the same as in the section on time dilation. Then the observer on the platform would infer that the length of the train car is ${\displaystyle \ell =vt_{0}}$. In the meantime, the observer on the train would measure a time difference of ${\displaystyle t=\ell _{0}/v}$.

Now we can use the time-dilation formula to relate ${\displaystyle t}$ and ${\displaystyle t_{0}}$. Then we can relate ${\displaystyle \ell }$ and ${\displaystyle \ell _{0}}$ as

${\displaystyle \ell ={\frac {\ell _{0}}{\gamma }}.\,\!}$

It’s easy to be confused by this logic. The proper length, ${\displaystyle \ell _{0}}$, is measured in the reference frame of the train, but the proper time, ${\displaystyle t_{0}}$, is measured in the reference frame of the platform. The proper length is the length measured in the frame in which the object is at rest. The proper time is the time between two events in the frame in which those events occur at the same spatial location.

The take-away point is that moving objects contract along the direction of motion.
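The timing argument can be replayed with numbers. The sketch below is only an illustration: the speed, the rest length, and the function name `lorentz_gamma` are hypothetical, and speeds are written as fractions of the speed of light so that times come out in units of length divided by that speed.

```python
import math

def lorentz_gamma(beta):
    """Lorentz factor for a speed beta = v/c, with |beta| < 1."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

beta = 0.8            # illustrative train speed as a fraction of c
proper_length = 25.0  # hypothetical rest length of the car, arbitrary units

g = lorentz_gamma(beta)
contracted = proper_length / g   # length inferred on the platform

# Timing argument: the platform clock records the proper time t0 = l / v,
# the train observer records t = l0 / v, and time dilation demands t = gamma * t0.
t0 = contracted / beta
t = proper_length / beta
print(contracted, math.isclose(t, g * t0))
```

Note that the proper time here belongs to the platform while the proper length belongs to the train, exactly as in the discussion above.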

### Units (${\displaystyle c=1}$)

When dealing with relativistic systems on a theoretical level, it is often useful to set ${\displaystyle c=1}$. Most people find this totally ridiculous at first, but gradually they understand the legitimacy and the usefulness of this choice. This is standard practice in the general-relativity and particle-physics communities. It is such a common choice that most papers in these fields do not even bother to state it. It is mostly only in textbooks that the choice is clearly stated, and usually this is done very early on. For instance, Sean Carroll sets ${\displaystyle c=1}$ on page 8 of Spacetime and Geometry. Steven Weinberg sets ${\displaystyle c=1}$ in the preface of Gravitation and Cosmology. In An Introduction to Quantum Field Theory, Peskin and Schroeder set ${\displaystyle c=1}$ between the preface and the editor’s foreword. And so on...

When we set ${\displaystyle c=1}$, we are just saying that ${\displaystyle 3\times 10^{10}\,\mathrm {cm} =1\,\mathrm {s} }$. Distance and time can be measured in the same units. You can measure distance in seconds, and you can measure time in centimeters. If you don’t want your distances in seconds, then you can always convert to centimeters by multiplying by ${\displaystyle 3\times 10^{10}}$.

If you are ever given an expression in which ${\displaystyle c}$ has been set to ${\displaystyle 1}$, you can always restore the factors of ${\displaystyle c}$ by requiring the units to come out the way you want. For example, we might write ${\displaystyle E=m}$, where ${\displaystyle E}$ is an energy and ${\displaystyle m}$ is a mass. If you want energy to be measured in ergs and mass to be measured in grams, then there is only one way to restore the factors of ${\displaystyle c}$, i.e., ${\displaystyle E=mc^{2}}$. On the other hand, if you’re content to measure both energy and mass in, for example, ${\displaystyle \mathrm {MeV} }$, then you don’t need to restore any factors of ${\displaystyle c}$ at all. This is why people often quote the masses of fundamental particles in ${\displaystyle \mathrm {MeV} }$ or ${\displaystyle \mathrm {GeV} }$. Sometimes they go halfway toward restoring the factors of ${\displaystyle c}$ by quoting masses in ${\displaystyle \mathrm {MeV} /c^{2}}$.
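Restoring the factors of ${\displaystyle c}$ is just a unit conversion, which the following Python sketch makes explicit. The function names are hypothetical, and the electron mass of about 0.511 MeV is used only as a familiar example value.

```python
C_CM_PER_S = 2.99792458e10       # speed of light in cm/s (exact by definition)
ERG_PER_MEV = 1.602176634e-6     # 1 MeV in erg (exact by definition of the eV)

def mass_grams_to_energy_erg(mass_g):
    """Restore c in E = m: energy in erg for a mass in grams."""
    return mass_g * C_CM_PER_S ** 2

def mass_mev_to_grams(mass_mev):
    """Convert a mass quoted in MeV (i.e. MeV/c^2) to grams."""
    return mass_mev * ERG_PER_MEV / C_CM_PER_S ** 2

print(mass_grams_to_energy_erg(1.0))   # rest energy of one gram, in erg
print(mass_mev_to_grams(0.511))        # electron mass (about 0.511 MeV) in grams
```

The first number is roughly ${\displaystyle 9\times 10^{20}}$ erg and the second roughly ${\displaystyle 9.1\times 10^{-28}}$ g.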

The theoretical advantage – besides simplifying expressions – is that time and space are put on the same footing. The Lorentz transformation is seen as a rotation of time and space into each other. Since we now have ${\displaystyle E=m}$ instead of ${\displaystyle E=mc^{2}}$, mass is now viewed as a form of energy. Momentum will also now have the same units as energy, so we see that momentum is yet another form of energy. Instead of ${\displaystyle E^{2}=m^{2}c^{4}+p^{2}c^{2}}$, we now have ${\displaystyle E^{2}=m^{2}+p^{2}}$. This is simpler, easier to remember and clearly shows that mass and momentum are the contributions to the energy of a free particle. The factors of ${\displaystyle c}$ in the first expression only serve to get the units right. They don’t contain any theoretical significance. Additionally, velocities will now be dimensionless and always less than or equal to ${\displaystyle 1}$. This gives us a dimensionless parameter to measure how relativistic a particle is or to Taylor-expand in – higher powers of ${\displaystyle v}$ will be less and less significant. This takes the place of the parameter ${\displaystyle \beta =v/c}$ that is sometimes used in special relativity.

If you ever want to take the Newtonian limit of a relativistic expression, then it is a good idea to restore all factors of ${\displaystyle c}$. In the Newtonian limit, ${\displaystyle c\to \infty }$. This makes no sense at all if ${\displaystyle c=1}$. Often, however, the ${\displaystyle c\to \infty }$ limit is equivalent to the ${\displaystyle v\ll 1}$ limit, so it’s not always necessary to restore the factors of ${\displaystyle c}$. Just be careful.

In particle physics, it is typical to also set ${\displaystyle \hbar =1}$. In the classical limit, ${\displaystyle \hbar \to 0}$, so you will also sometimes see particle physicists restore and Taylor-expand in factors of ${\displaystyle \hbar }$ when they are interested in classical or semi-classical limits.

You cannot set every constant equal to unity. If your system of units has ${\displaystyle n}$ units, then you can set ${\displaystyle n}$ constants to unity as long as they are linearly independent in their dimensions. This ensures that there is a unique way of restoring all of these constants. In cgs units, it is typical to take ${\displaystyle c=\hbar =1}$. Depending on the problem, you have the freedom to set some other prominent constant to unity. If the kelvin is included as a unit, it is common to set Boltzmann’s constant ${\displaystyle k=1}$.

## Lorentz transformations

We can use the time-dilation and length-contraction thought experiments to derive the coordinate transformations between two frames traveling at a constant velocity with respect to each other. This is not the only type of Lorentz transformation. Typically, spatial rotations are also considered to be Lorentz transformations. Since spatial rotations are exactly the same in both the Einsteinian and Newtonian formulations, we will focus on Lorentz boosts, i.e., transformations to frames traveling at some constant velocity with respect to the original frame.

Let ${\displaystyle S}$ be the stationary reference frame and ${\displaystyle S^{\prime }}$ be the frame moving with velocity ${\displaystyle v}$ which, without loss of generality, we can take to be along the ${\displaystyle x}$-axis. Unprimed coordinates refer to ${\displaystyle S}$; primed coordinates refer to ${\displaystyle S^{\prime }}$.

Without loss of generality, assume the origins of the two coordinate systems coincide at ${\displaystyle t=t^{\prime }=0}$. When this is not the case, it will only introduce an overall offset. So we can always translate our coordinate system to make it the case that the origins coincide at ${\displaystyle t=0}$.

Suppose there is a train car of length ${\displaystyle \ell }$ traveling with the ${\displaystyle S^{\prime }}$ frame. The back of the car is at ${\displaystyle x=0}$ at ${\displaystyle t=0}$ and is always at ${\displaystyle x^{\prime }=0}$. The trajectory of the front of the car is ${\displaystyle x=\ell +vt}$. This is just one-dimensional inertial motion with the initial condition ${\displaystyle x(0)=\ell }$. But in ${\displaystyle S^{\prime }}$, the car is at rest and, therefore, its length is larger, i.e., ${\displaystyle \ell ^{\prime }=\ell \gamma }$ using the length-contraction formula. In ${\displaystyle S^{\prime }}$, the front of the car is not moving, so its trajectory is given by ${\displaystyle x^{\prime }=\ell ^{\prime }=\ell \gamma }$. Then we can substitute ${\displaystyle \ell =x^{\prime }/\gamma }$ in the expression for ${\displaystyle x(t)}$, the trajectory in ${\displaystyle S}$. We can rearrange the terms to get

${\displaystyle x^{\prime }=\gamma (x-vt).\,\!}$

We can use the same logic to get a transformation from ${\displaystyle S^{\prime }}$ to ${\displaystyle S}$, but this time the velocity will be pointing in the opposite direction, i.e., ${\displaystyle v\to -v}$. So we have ${\displaystyle x=\gamma (x^{\prime }+vt^{\prime })}$. We can combine the two results to get an expression for ${\displaystyle t^{\prime }}$ in terms of ${\displaystyle x}$ and ${\displaystyle t}$. Substituting the expression for ${\displaystyle x^{\prime }}$ and solving for ${\displaystyle t^{\prime }}$ gives ${\displaystyle vt^{\prime }=x(1/\gamma -\gamma )+\gamma vt}$. Using the definition of ${\displaystyle \gamma }$ in the form ${\displaystyle \gamma ^{2}-1=\gamma ^{2}v^{2}/c^{2}}$, this turns out to be

${\displaystyle t^{\prime }=\gamma (t-vx/c^{2}).\,\!}$

The expressions for ${\displaystyle x^{\prime }}$ and ${\displaystyle t^{\prime }}$ are a good example of the usefulness of setting ${\displaystyle c=1}$. With ${\displaystyle c=1}$, the second expression reads ${\displaystyle t^{\prime }=\gamma (t-vx)}$, and the pair is symmetric under an exchange of space and time, i.e., ${\displaystyle x\leftrightarrow t}$ and ${\displaystyle x^{\prime }\leftrightarrow t^{\prime }}$. Not only is this easier to remember, but it also emphasizes the idea that time and space are part of the same geometry in relativity. The Lorentz boost can be thought of as a rotation of time and space into each other.
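The boost and its inverse are easy to try out numerically. The following Python sketch works in units with ${\displaystyle c=1}$; the function name `boost` and the sample event are illustrative assumptions.

```python
import math

def boost(t, x, v):
    """Boost an event (t, x) into a frame moving with velocity v along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

t, x = 2.0, 1.0        # illustrative event in S
v = 0.6                # illustrative boost velocity
tp, xp = boost(t, x, v)

# Boosting back with -v should return the original event.
t_back, x_back = boost(tp, xp, -v)
print((tp, xp), (t_back, x_back))

# Exchanging t and x in the input exchanges t' and x' in the output.
swapped = boost(x, t, v)
print(swapped)
```

The last line illustrates the space-time exchange symmetry: feeding in the event with its two coordinates swapped returns the primed coordinates swapped.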

The ${\displaystyle y}$- and ${\displaystyle z}$-coordinates are not affected by this transformation, so

${\displaystyle y^{\prime }=y\,\!}$

and

${\displaystyle z^{\prime }=z.\,\!}$

The train car was only used to give us a spacetime point to follow. These transformations are general coordinate transformations. In particular, if a particle trajectory is given by ${\displaystyle x}$, ${\displaystyle y}$, ${\displaystyle z}$ and ${\displaystyle t}$ in ${\displaystyle S}$, then the trajectory is given by ${\displaystyle x^{\prime }}$, ${\displaystyle y^{\prime }}$, ${\displaystyle z^{\prime }}$ and ${\displaystyle t^{\prime }}$ in the ${\displaystyle S^{\prime }}$ frame.

What if the velocity boost is not along the ${\displaystyle x}$-axis? Don’t make your life more difficult than it has to be. Just rotate your spatial coordinates so that the boost is along the ${\displaystyle x}$-axis. You can use similar tricks to always reduce the Lorentz boost to the form above. If the origins of the two coordinate systems do not coincide, then translate one or both of the coordinate systems to make it so. Remember that you can translate the coordinate system in both space and time.

### Velocity transformations

We can use the differential form of the Lorentz transformations to see how velocities transform. In differential form, the Lorentz transformations are

${\displaystyle dx^{\prime }=\gamma (dx-vdt),\,\!}$
${\displaystyle dt^{\prime }=\gamma (dt-vdx),\,\!}$
${\displaystyle dy^{\prime }=dy,\,\!}$

and

${\displaystyle dz^{\prime }=dz.\,\!}$

We can define the components of the velocity vector of a particle in ${\displaystyle S}$ by

${\displaystyle v_{x}\equiv {\frac {dx}{dt}},\,\!}$
${\displaystyle v_{y}\equiv {\frac {dy}{dt}},\,\!}$
${\displaystyle v_{z}\equiv {\frac {dz}{dt}}\,\!}$

and likewise for the primed coordinates in ${\displaystyle S^{\prime }}$. With these definitions and a few algebraic manipulations we find

${\displaystyle v_{x}^{\prime }={\frac {v_{x}-v}{1-v_{x}v}},\,\!}$
${\displaystyle v_{y}^{\prime }={\frac {v_{y}}{\gamma (1-v_{x}v)}}\,\!}$

and

${\displaystyle v_{z}^{\prime }={\frac {v_{z}}{\gamma (1-v_{x}v)}}.\,\!}$

These are the velocity-transformation formulae.
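Here is a small Python sketch of these formulae in units with ${\displaystyle c=1}$. The function name `transform_velocity` and the sample velocities are hypothetical choices for illustration.

```python
import math

def transform_velocity(vx, vy, vz, v):
    """Velocity components in S' given those in S; S' moves with v along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    denom = 1.0 - vx * v
    return (vx - v) / denom, vy / (g * denom), vz / (g * denom)

# A particle moving at 0.9 in S, seen from a frame moving at -0.9 (head-on).
head_on = transform_velocity(0.9, 0.0, 0.0, -0.9)
print(head_on)   # the x-component stays below 1

# A light ray at 60 degrees to the x-axis still has unit speed in S'.
ray = transform_velocity(math.cos(math.pi / 3), math.sin(math.pi / 3), 0.0, 0.5)
ray_speed = math.sqrt(sum(component ** 2 for component in ray))
print(ray, ray_speed)
```

The head-on case shows that velocities do not simply add, and the light-ray case shows that the transformation changes the direction of the ray while leaving its speed equal to 1.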

### Relativistic Aberration and Beaming

The velocity transformations can be used to derive two important results for radiation: the apparent change in emission direction from a moving source (relativistic aberration) and, as a consequence, relativistic beaming.

#### Relativistic Aberration of Light

Imagine a spaceship moving with (relativistic) velocity ${\displaystyle \beta }$ with respect to the lab frame. One of the crew points a laser out the window at an angle ${\displaystyle \theta ^{\prime }}$ from the velocity vector (in the ship’s rest frame). We want to know the angle ${\displaystyle \theta }$, between the ship’s velocity and the direction of the laser beam in the lab frame.

Since we are dealing with light, we know the magnitude of the velocity in both frames must be ${\displaystyle v=v^{\prime }=c=1}$. For convenience, let the velocity of the spaceship be in the positive x direction, and let the laser beam lie in the x-y plane. Then,

${\displaystyle {\begin{aligned}v_{x}^{\prime }&=\cos \left(\theta ^{\prime }\right)\\v_{y}^{\prime }&=\sin \left(\theta ^{\prime }\right)\end{aligned}}\,\!}$

and similarly for the lab frame. We can plug these into the velocity transformations, yielding

${\displaystyle {\begin{aligned}\cos \left(\theta ^{\prime }\right)&={\frac {\cos \left(\theta \right)-\beta }{1-\beta \cos \left(\theta \right)}}\\\sin \left(\theta ^{\prime }\right)&={\frac {\sin \left(\theta \right)}{\gamma \left(1-\beta \cos \left(\theta \right)\right)}}\end{aligned}}\,\!}$

These two equations tell us how the angle of a light beam changes under a Lorentz transform.
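To get a feel for the size of the effect, the cosine equation can be solved for ${\displaystyle \cos \left(\theta \right)}$ and evaluated numerically. The Python sketch below does this; the function name `lab_angle` and the value ${\displaystyle \beta =0.9}$ are assumptions made for illustration.

```python
import math

def lab_angle(theta_prime, beta):
    """Lab-frame angle of a light ray emitted at theta_prime in the source frame.

    Obtained by solving the cosine aberration equation for cos(theta).
    """
    mu_prime = math.cos(theta_prime)
    mu = (mu_prime + beta) / (1.0 + beta * mu_prime)
    return math.acos(mu)

beta = 0.9
theta = lab_angle(math.pi / 2, beta)   # ray emitted sideways in the ship frame
print(math.degrees(theta))

# Consistency check against the sine equation: this should reproduce sin(pi/2).
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
sin_theta_prime = math.sin(theta) / (gamma * (1.0 - beta * math.cos(theta)))
print(sin_theta_prime)
```

A ray emitted perpendicular to the motion in the ship frame is tilted well forward in the lab frame, with ${\displaystyle \cos \left(\theta \right)=\beta }$.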

#### Relativistic Beaming

Now, instead of a spaceship, imagine an isotropic emitter, such as a star. What does its radiation pattern look like in the lab frame? Let’s discuss this semi-qualitatively first. In the star’s rest frame, the plane ${\displaystyle \theta ^{\prime }={\frac {\pi }{2}}}$ splits the emitted power in half: one half going towards the direction of motion, one half going the opposite direction. If we examine the transformation for ${\displaystyle \cos \left(\theta ^{\prime }\right)}$, we can see that when ${\displaystyle \theta ^{\prime }={\frac {\pi }{2}}}$, ${\displaystyle \cos \left(\theta \right)=\beta }$. So, as ${\displaystyle \beta }$ increases, the “front half” of the power gets pushed into an increasingly narrow cone centered on the star’s velocity vector. That is, most of the light is pushed towards the front of the star. This is the relativistic beaming effect.

We can write down an analytic form for the relativistic beaming by examining a small patch of solid angle ${\displaystyle d\Omega ^{\prime }}$ at ${\displaystyle (\theta ^{\prime },\phi ^{\prime })}$, and see how it transforms to the lab frame.

${\displaystyle d\Omega =\sin(\theta )\,d\theta \,d\phi }$ as usual. For convenience, let’s define ${\displaystyle \mu \equiv \cos(\theta )}$. Then, up to a sign that is absorbed by reversing the limits of integration,

${\displaystyle d\Omega =d\mu d\phi \,\!}$

and the relativistic aberration equation becomes:

${\displaystyle \mu ^{\prime }={\frac {\mu -\beta }{1-\beta \mu }}\,\!}$

Differentiating both sides yields

${\displaystyle d\mu ^{\prime }={\frac {1-\beta ^{2}}{\left(1-\beta \mu \right)^{2}}}\,d\mu \,\!}$

Inserting this back into the infinitesimal solid angle element gives

${\displaystyle d\Omega ^{\prime }={\frac {1-\beta ^{2}}{\left(1-\beta \mu \right)^{2}}}\,d\mu \,d\phi ^{\prime }\,\!}$

But ${\displaystyle d\phi =d\phi ^{\prime }}$ and ${\displaystyle d\mu \,d\phi =d\Omega }$, giving us our final result:

${\displaystyle d\Omega ^{\prime }={\frac {1-\beta ^{2}}{\left(1-\beta \cos \left(\theta \right)\right)^{2}}}\,d\Omega \,\!}$

A source that emits isotropically in its rest frame, ${\displaystyle I_{\nu }^{\prime }\left(\theta ^{\prime },\phi ^{\prime }\right)=\mathrm {const.} }$, leads to emission in the lab frame:

${\displaystyle {\begin{aligned}I_{\nu }&={\frac {d\Omega ^{\prime }}{d\Omega }}I_{\nu }^{\prime }\\I_{\nu }&={\frac {1-\beta ^{2}}{\left(1-\beta \cos \left(\theta \right)\right)^{2}}}I_{\nu }^{\prime }\end{aligned}}\,\!}$

Note that the total power emitted is conserved; it is simply remapped over the sphere.
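The remapping can be checked numerically by integrating the solid-angle factor over the lab-frame sphere. The Python sketch below uses a simple midpoint rule; the function names, the step count, and the choice ${\displaystyle \beta =0.9}$ are illustrative assumptions.

```python
import math

def beaming_factor(mu, beta):
    """d(Omega')/d(Omega) as a function of mu = cos(theta) in the lab frame."""
    return (1.0 - beta ** 2) / (1.0 - beta * mu) ** 2

def integrate_mu(beta, mu_lo, mu_hi, n=100_000):
    """Midpoint-rule integral of the beaming factor over mu, times 2*pi for phi."""
    h = (mu_hi - mu_lo) / n
    total = sum(beaming_factor(mu_lo + (i + 0.5) * h, beta) for i in range(n))
    return 2.0 * math.pi * total * h

beta = 0.9
whole_sphere = integrate_mu(beta, -1.0, 1.0)
forward_cone = integrate_mu(beta, beta, 1.0)   # lab-frame cone with cos(theta) >= beta

print(whole_sphere / (4.0 * math.pi))   # close to 1
print(forward_cone / whole_sphere)      # close to 0.5
```

The first ratio confirms that the full rest-frame sphere of ${\displaystyle 4\pi }$ steradians is recovered, and the second confirms that half of it lands inside the forward cone ${\displaystyle \cos \left(\theta \right)\geq \beta }$.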

### Four-vectors

It is often convenient to set relativistic phenomena in a 4-dimensional spacetime. We will number the dimensions of this spacetime ${\displaystyle 0}$, ${\displaystyle 1}$, ${\displaystyle 2}$ and ${\displaystyle 3}$. The ${\displaystyle 0^{\mathrm {th} }}$ component is time, and the rest are the spatial components. Then the spacetime location of a particle can be described by a 4-component object called a four-vector. We will use ${\displaystyle x}$ to represent the four-vector of a particle. Note that ${\displaystyle x}$ no longer represents position along the ${\displaystyle x}$-axis. The components of ${\displaystyle x}$ are denoted by ${\displaystyle x^{\mu }}$, where ${\displaystyle \mu }$ is meant to be an index, not an exponent. Typically, Greek indices are used when the index can take on any value from ${\displaystyle 0}$ to ${\displaystyle 3}$. Latin indices are used when the index can only take on values from ${\displaystyle 1}$ to ${\displaystyle 3}$. This is a very standard convention.

When the index is raised, ${\displaystyle x^{\mu }}$ is said to be contravariant. We can also define the covariant four-vector ${\displaystyle x_{\mu }}$ for which ${\displaystyle x_{0}=-x^{0}}$ and ${\displaystyle x_{i}=x^{i}}$.

Usually we define a four-vector to be any object ${\displaystyle x}$ that satisfies the condition that ${\displaystyle \sum \limits _{\mu =0}^{3}x^{\mu }x_{\mu }}$ is invariant under Lorentz transformations. Note that a four-vector need not represent location in spacetime.

#### Einstein summation convention

We will often want to sum over the indices of four-vectors. Einstein found it tiresome to have to write down so many capital sigmas, so he invented a convention for summing over indices. According to this convention

${\displaystyle x^{\mu }y_{\mu }=\sum \limits _{\mu =0}^{3}x^{\mu }y_{\mu }.\,\!}$

The essence of the convention is that an index that appears twice in a term, once raised and once lowered, is summed over. This is also a very standard convention.

#### Minkowski metric

It is useful to define a 2-index tensor ${\displaystyle \eta _{\mu \nu }}$ called the Minkowski metric. This tensor can be thought of as a matrix with

${\displaystyle \eta _{00}=-1,\,\!}$
${\displaystyle \eta _{ii}=1\,\!}$

and

${\displaystyle \eta _{\mu \nu }=0\,\!}$

when ${\displaystyle \mu \not =\nu }$. So, as a matrix, ${\displaystyle \eta _{\mu \nu }}$ is diagonal.

We can use the Minkowski metric to write covariant four-vectors in terms of contravariant four-vectors, i.e.,

${\displaystyle x_{\mu }=\eta _{\mu \nu }x^{\nu }.\,\!}$

We can also raise and lower indices on tensors with multiple indices, e.g., ${\displaystyle T_{\,\,\,\,\nu }^{\mu }=\eta ^{\mu \sigma }T_{\sigma \nu }}$.

Different authors use different conventions for the Minkowski metric. Our definition is common among general relativists. Among particle physicists, it is common to take ${\displaystyle \eta _{00}=+1}$ and ${\displaystyle \eta _{ii}=-1}$. This only amounts to flipping the sign on various expressions. Be sure you know which convention you are using.
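When in doubt about an index expression, it helps to write the sums out explicitly. The following Python sketch lowers an index with the metric and contracts two four-vectors; the function names `lower` and `contract` and the sample components are hypothetical.

```python
# Minkowski metric with signature (-, +, +, +), matching the convention above.
ETA = [
    [-1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

def lower(x_up):
    """x_mu = eta_{mu nu} x^nu, with the sum over nu written out explicitly."""
    return [sum(ETA[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]

def contract(x_up, y_up):
    """x^mu y_mu, summing the repeated index mu from 0 to 3."""
    y_down = lower(y_up)
    return sum(x_up[mu] * y_down[mu] for mu in range(4))

x = [2.0, 1.0, 0.0, 3.0]   # illustrative components (t, x, y, z)
y = [1.0, 4.0, 2.0, 0.0]
print(lower(x))        # only the time component flips sign
print(contract(x, y))  # -2*1 + 1*4 + 0*2 + 3*0
```

With the particle-physics sign convention, every entry of `ETA` would change sign and so would the result of `contract`.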

#### Lorentz scalar

We can define the product of two four-vectors to be ${\displaystyle x^{\mu }y_{\mu }}$. This quantity is sometimes called an inner product (although it can be negative when ${\displaystyle x=y}$ which violates the conventional definition among mathematicians of an inner product). We are extremely interested in these kinds of products, because they are invariant under Lorentz transformations. Using the formulae for the Lorentz transformations, you can compute the components of ${\displaystyle (x^{\prime })^{\mu }}$ and ${\displaystyle (y^{\prime })^{\mu }}$ in terms of the components of ${\displaystyle x^{\mu }}$ and ${\displaystyle y^{\mu }}$. Then you can evaluate ${\displaystyle (x^{\prime })^{\mu }(y^{\prime })_{\mu }}$. You will find

${\displaystyle x^{\mu }y_{\mu }=(x^{\prime })^{\mu }(y^{\prime })_{\mu }.\,\!}$

Of particular interest is the case for which ${\displaystyle x^{\mu }=y^{\mu }}$. Then we have

${\displaystyle s^{2}\equiv x^{\mu }x_{\mu }=x^{2}+y^{2}+z^{2}-t^{2},\,\!}$

where ${\displaystyle s}$ is sometimes called the invariant spacetime interval and we have briefly gone back to using ${\displaystyle x}$ to mean the ${\displaystyle x}$-component of the spatial vector. The spacetime interval is invariant under all Lorentz transformations including both boosts and rotations. It is not invariant under translations. For that reason, we often write this equality in a differential form, i.e.,

${\displaystyle ds^{2}=dx^{2}+dy^{2}+dz^{2}-dt^{2}.\,\!}$

Differentials are unchanged by translations, so this expression is invariant under translations as well as under Lorentz transformations.

The invariant interval, ${\displaystyle s}$, should be thought of as the length of the four-vector. In ordinary 3-dimensional space, the length of a vector is invariant under rotations. That is why we are so interested in dot products and norms; they do not depend on the orientation of our coordinate system. A Lorentz scalar is an even more useful quantity, because it is invariant under rotations as well as velocity boosts.

We can also think of ${\displaystyle ds^{2}=dx^{2}+dy^{2}+dz^{2}-dt^{2}}$ as a generalization of the Pythagorean theorem. The distance we travel in 3-dimensional space is given by ${\displaystyle dr^{2}=dx^{2}+dy^{2}+dz^{2}}$. This distance is independent of the coordinate system. The distance is a physical quantity; the coordinate system is just a set of labels. When we move in spacetime, we can also define a 4-dimensional spacetime triangle whose hypotenuse is the spacetime interval. The Pythagorean theorem in our 4-dimensional spacetime is not a straightforward generalization from 3 dimensions, but it has the property that the length of the hypotenuse is completely independent of the coordinate system.
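The invariance properties of the interval can be checked directly. The Python sketch below applies a boost, a rotation, and a translation to one event and compares the resulting values of ${\displaystyle s^{2}}$; all function names and numbers are illustrative assumptions, with ${\displaystyle c=1}$.

```python
import math

def interval(event):
    """s^2 = x^2 + y^2 + z^2 - t^2 for an event (t, x, y, z), with c = 1."""
    t, x, y, z = event
    return x * x + y * y + z * z - t * t

def boost_x(event, v):
    """Boost along the x-axis with velocity v."""
    t, x, y, z = event
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g * (t - v * x), g * (x - v * t), y, z)

def rotate_z(event, angle):
    """Spatial rotation about the z-axis."""
    t, x, y, z = event
    c, s = math.cos(angle), math.sin(angle)
    return (t, c * x + s * y, -s * x + c * y, z)

def translate(event, shift):
    """Shift every coordinate by a constant offset."""
    return tuple(a + b for a, b in zip(event, shift))

event = (3.0, 1.0, 2.0, -1.0)   # illustrative event
print(interval(event))
print(interval(boost_x(event, 0.8)))
print(interval(rotate_z(event, 0.3)))
print(interval(translate(event, (1.0, 0.0, 0.0, 0.0))))   # differs from the others
```

The first three values agree to rounding error, while the translated event gives a different value, which is why the differential form is used when translations matter.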

In the rest frame of the particle, ${\displaystyle dx_{i}dx^{i}=0}$ and ${\displaystyle ds^{2}=-dt^{2}}$. We can define the proper time by

${\displaystyle d\tau ^{2}\equiv -ds^{2}.\,\!}$

So we could also use ${\displaystyle d\tau }$ as an invariant interval. Its square differs from ${\displaystyle ds^{2}}$ only by a minus sign.

#### Lorentz transformation as a matrix operation

With our new formalism, we can write the Lorentz transformation as a matrix acting on a vector. The Lorentz transformation will be denoted by the 2-index object ${\displaystyle \Lambda _{\,\,\,\,\nu }^{\mu }}$. The transformed four-vector is given by

${\displaystyle (x^{\prime })^{\mu }=\Lambda _{\,\,\,\,\nu }^{\mu }x^{\nu }.\,\!}$

This is just matrix multiplication where

${\displaystyle x^{\mu }=\left({\begin{array}{c}t\\x\\y\\z\end{array}}\right)\,\!}$

and, for example,

${\displaystyle \Lambda _{\,\,\,\,\nu }^{\mu }=\left({\begin{array}{cccc}\gamma &-v\gamma &0&0\\-v\gamma &\gamma &0&0\\0&0&1&0\\0&0&0&1\end{array}}\right)\,\!}$

for a boost along the ${\displaystyle x}$-axis. Rotations can be implemented by using the lower-right ${\displaystyle 3\times 3}$ block as a rotation matrix for a 3-dimensional vector.

The covariant transformation is given by

${\displaystyle x_{\mu }^{\prime }=\Lambda _{\mu }^{\,\,\,\,\nu }x_{\nu },\,\!}$

where ${\displaystyle \Lambda _{\mu }^{\,\,\,\,\nu }=\eta _{\mu \sigma }\Lambda _{\,\,\,\,\kappa }^{\sigma }\eta ^{\kappa \nu }}$. We can also write the Minkowski metric as a matrix, i.e.,

${\displaystyle \eta ^{\mu \nu }=\eta _{\mu \nu }=\left({\begin{array}{cccc}-1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{array}}\right).\,\!}$

So, when in doubt, all of these tensor manipulations can be done by simple matrix multiplication. Just make sure you’ve got the right matrix representation and that you’re multiplying the matrices in the right order.

You can either check explicitly or infer from the Lorentz invariance of ${\displaystyle x^{\mu }x_{\mu }}$ that

${\displaystyle \Lambda _{\,\,\,\,\nu }^{\mu }\Lambda _{\mu }^{\,\,\,\,\sigma }=\delta _{\nu }^{\sigma },\,\!}$

where ${\displaystyle \delta _{\nu }^{\sigma }}$ is called the Kronecker delta. It is basically the identity matrix in that

${\displaystyle x^{\mu }=\delta _{\nu }^{\mu }x^{\nu }.\,\!}$

Raising and lowering indices on the Kronecker delta has no real significance. The order of the indices also doesn’t matter.
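The identity above can be verified with plain matrix multiplication. The following Python sketch does so for a boost along the ${\displaystyle x}$-axis using nested lists; the helper names and the boost velocity are hypothetical.

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def boost_matrix(v):
    """Lambda^mu_nu for a boost with velocity v along the x-axis (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -v * g, 0, 0], [-v * g, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    """Swap rows and columns."""
    return [list(row) for row in zip(*a)]

lam = boost_matrix(0.6)

# Lambda_mu^nu = eta Lambda eta (eta is its own inverse as a matrix).
lam_lowered = matmul(matmul(ETA, lam), ETA)

# Lambda^mu_nu Lambda_mu^sigma sums over the first index of both factors,
# which is transpose(lam) times lam_lowered in matrix language.
product = matmul(transpose(lam), lam_lowered)
for row in product:
    print([round(entry, 12) for entry in row])
```

Note the transpose: the summed index is the row index of both factors, so the matrices have to be multiplied in the right order, as cautioned above.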

#### Four-velocity

We define the four-velocity as

${\displaystyle U^{\mu }\equiv {\frac {dx^{\mu }}{d\tau }}.\,\!}$

Since ${\displaystyle d\tau }$ is a Lorentz scalar and ${\displaystyle dx^{\mu }}$ is a four-vector, the four-velocity is also a four-vector, i.e., ${\displaystyle U^{\mu }U_{\mu }}$ is Lorentz invariant. The ${\displaystyle 0^{\mathrm {th} }}$ component of ${\displaystyle U^{\mu }}$ does not have an intuitive interpretation. The spatial components of ${\displaystyle U^{\mu }}$ are not quite the same as the real velocity which would be ${\displaystyle dx^{i}/dt}$. Only in the non-relativistic limit, when ${\displaystyle d\tau \simeq dt}$, do the spatial components of ${\displaystyle U^{\mu }}$ begin to approximate the real velocity.

Using ${\displaystyle d\tau =dt/\gamma (v)}$, where ${\displaystyle v}$ is the real velocity of the particle, we can express the four-velocity as

${\displaystyle U^{\mu }=\gamma (v)(1,\mathbf {v} ),\,\!}$

where ${\displaystyle \mathbf {v} }$ is the 3-dimensional real velocity vector. In particular, ${\displaystyle U^{\mu }=(1,0,0,0)}$ in the particle’s rest frame which means that

${\displaystyle U^{\mu }U_{\mu }=-1.\,\!}$
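Because the normalization is frame independent, it should hold for any allowed velocity, not just in the rest frame. The Python sketch below checks this for a few sample velocities; the function names and the velocities are illustrative assumptions.

```python
import math

def four_velocity(vx, vy, vz):
    """U^mu = gamma * (1, v) for a 3-velocity v with |v| < 1 (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return (g, g * vx, g * vy, g * vz)

def minkowski_square(u):
    """u^mu u_mu with signature (-, +, +, +)."""
    return -u[0] ** 2 + u[1] ** 2 + u[2] ** 2 + u[3] ** 2

for v in [(0.0, 0.0, 0.0), (0.3, 0.4, 0.0), (0.5, 0.5, 0.5)]:
    print(v, minkowski_square(four_velocity(*v)))
```

Each line prints a value equal to ${\displaystyle -1}$ up to rounding error.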

#### Four-acceleration

The four-acceleration is defined as

${\displaystyle a^{\mu }\equiv {\frac {dU^{\mu }}{d\tau }}.\,\!}$

By the same arguments given in the section on four-velocity, ${\displaystyle a^{\mu }}$ is also a four-vector. Again, the spatial components approximate the real acceleration, ${\displaystyle d^{2}x^{i}/dt^{2}}$, only in the non-relativistic limit.

Interestingly, the four-acceleration is always orthogonal to the four-velocity, i.e., ${\displaystyle a^{\mu }U_{\mu }={\frac {dU^{\mu }}{d\tau }}U_{\mu }={\frac {1}{2}}{\frac {d}{d\tau }}(U^{\mu }U_{\mu })={\frac {1}{2}}{\frac {d(-1)}{d\tau }}=0}$.
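The orthogonality can be seen on a concrete worldline. The Python sketch below assumes, purely for illustration, a particle with uniform proper acceleration along ${\displaystyle x}$, for which the four-velocity is ${\displaystyle (\cosh g\tau ,\sinh g\tau ,0,0)}$; this worldline, the constant `G`, and the finite-difference step are not taken from the text above.

```python
import math

G = 0.7   # hypothetical proper acceleration, in units with c = 1

def four_velocity(tau):
    """U^mu along an assumed worldline: uniform proper acceleration along x."""
    return (math.cosh(G * tau), math.sinh(G * tau), 0.0, 0.0)

def four_acceleration(tau, h=1e-5):
    """a^mu = dU^mu/dtau, estimated with a central finite difference."""
    up, um = four_velocity(tau + h), four_velocity(tau - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(up, um))

def dot(a, b):
    """a^mu b_mu with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

for tau in (0.0, 0.5, 2.0):
    u, a = four_velocity(tau), four_acceleration(tau)
    print(tau, dot(u, u), dot(a, u))
```

At every sampled proper time the first product is ${\displaystyle -1}$ and the second vanishes to within the accuracy of the numerical derivative.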

## Energy and momentum

We will have to define what we mean by energy and momentum in special relativity. We will try to choose definitions that reduce to the well-known Newtonian expressions in the non-relativistic limit.

### Four-momentum

We define the four-momentum as

${\displaystyle p^{\mu }\equiv mU^{\mu },\,\!}$

where ${\displaystyle m}$ is the mass of the particle. Sometimes people define mass so that it actually changes from one reference frame to another. That is where the term “rest mass” comes from, i.e., the mass measured in the rest frame of the particle. We are not going to take that approach. For us, the mass is a Lorentz scalar.

We will define the energy to be

${\displaystyle E\equiv p^{0}\,\!}$

and the momentum to be the spatial part of ${\displaystyle p^{\mu }}$. Using ${\displaystyle U^{\mu }=\gamma (v)(1,\mathbf {v} )}$, we find that

${\displaystyle E=m\gamma \,\!}$

and

${\displaystyle p^{i}=mv^{i}\gamma .\,\!}$

In the particle’s rest frame, we have ${\displaystyle E=m}$, which is just ${\displaystyle E=mc^{2}}$ with ${\displaystyle c=1}$. So we have discovered that mass is the rest energy of a particle.

Since ${\displaystyle U^{\mu }}$ is a four-vector and ${\displaystyle p^{\mu }}$ is just proportional to ${\displaystyle U^{\mu }}$, we can conclude that ${\displaystyle p^{\mu }}$ is also a four-vector. In particular, we have ${\displaystyle p^{\mu }p_{\mu }=|\mathbf {p} |^{2}-E^{2}}$. In the rest frame, we have ${\displaystyle p^{\mu }p_{\mu }=-m^{2}}$. So we have just found that ${\displaystyle |\mathbf {p} |^{2}-E^{2}=-m^{2}}$ which can be rearranged to read

${\displaystyle E^{2}=|\mathbf {p} |^{2}+m^{2}.\,\!}$
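As a sanity check, the energy-momentum relation can be evaluated for a sample particle. The Python sketch below uses units with ${\displaystyle c=1}$; the function name `energy_momentum` and the chosen mass and velocity are hypothetical.

```python
import math

def energy_momentum(m, v):
    """Return (E, p) for mass m and 3-velocity v (a tuple), in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    return m * g, tuple(m * g * c for c in v)

m = 2.0                 # illustrative mass
v = (0.6, 0.0, 0.0)     # illustrative velocity
E, p = energy_momentum(m, v)
p_squared = sum(c * c for c in p)

print(E, p)
print(E ** 2, p_squared + m ** 2)   # the two sides of E^2 = |p|^2 + m^2
print(p_squared - E ** 2)           # p^mu p_mu = -m^2
```

The last line reproduces the rest-frame value of ${\displaystyle p^{\mu }p_{\mu }}$ even though the particle is moving, as expected for a Lorentz scalar.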

#### Non-relativistic limit

We were free to define energy and momentum however we liked, but it would be nice if those definitions were reasonable in the sense that they reduce to Newtonian energy and momentum in the non-relativistic limit. Our expression for energy was ${\displaystyle E=m\gamma =m/{\sqrt {1-v^{2}}}}$. The non-relativistic limit is the ${\displaystyle v\ll 1}$ limit. We can expand our expression for ${\displaystyle E}$ to second order in ${\displaystyle v}$ to get

${\displaystyle E\simeq mc^{2}+{\frac {1}{2}}mv^{2},\,\!}$

where the factors of ${\displaystyle c}$ have been restored since we are now in a pseudo-Newtonian regime. The energy looks like the usual kinetic energy for a non-relativistic particle but with some extra constant offset. This offset is not physical in Newtonian physics; only energy differences are relevant. In the fully Newtonian limit, ${\displaystyle c\to \infty }$ and the energy offset is infinite. That’s why it’s good to pause the ${\displaystyle c\to \infty }$ limit at this point and redefine the zero of our energy scale so that ${\displaystyle E=mv^{2}/2}$.

Expanding ${\displaystyle p^{i}=mv^{i}\gamma }$ to second order in ${\displaystyle v}$ gives

${\displaystyle p^{i}\simeq mv^{i},\,\!}$

which is the usual non-relativistic expression for momentum. There are no factors of ${\displaystyle c}$ to restore in this expression.

So it looks like our definitions of relativistic energy and momentum were reasonable after all.
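The quality of the Newtonian approximation can be seen by comparing the exact kinetic energy ${\displaystyle E-m}$ with ${\displaystyle mv^{2}/2}$ at several speeds. This Python sketch works in units with ${\displaystyle c=1}$; the function name and the list of speeds are illustrative assumptions.

```python
import math

def kinetic_energy(m, v):
    """Relativistic kinetic energy E - m = m * (gamma - 1), with c = 1."""
    return m * (1.0 / math.sqrt(1.0 - v * v) - 1.0)

m = 1.0
for v in (0.001, 0.01, 0.1, 0.5):
    exact = kinetic_energy(m, v)
    newtonian = 0.5 * m * v * v
    # Columns: speed, exact value, Newtonian value, fractional excess of the exact value.
    print(v, exact, newtonian, exact / newtonian - 1.0)
```

The fractional excess grows roughly as ${\displaystyle 3v^{2}/4}$ at small speeds, which is the next term in the Taylor expansion of ${\displaystyle \gamma }$.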

## Relativistic Larmor formula

This section is based on Rybicki and Lightman’s treatment in Section 4.8.

We want to generalize the Larmor formula for dipole radiation to particles moving at relativistic speeds. Assuming we’ve already derived the dipole formula for non-relativistic motion, a good starting point is a frame in which the particle is moving, at least momentarily, at speeds which are small compared to the speed of light. So let’s start in the instantaneous rest frame of the particle. We can form a four-momentum representing the sum of all the four-momenta of all the photons emitted. In some small time interval ${\displaystyle dt}$, the particle emits an energy ${\displaystyle dE}$. This radiation is not emitted isotropically, but there is no net flux of momentum. For any given direction, the same amount of radiation is emitted in the opposite direction. The spatial components of the four-momentum of the radiation vanish, i.e., ${\displaystyle dp^{i}=0}$. So if we transform to another frame, we will have ${\displaystyle dE^{\prime }=\gamma dE}$. At the same time, ${\displaystyle dt^{\prime }=\gamma dt}$, since the unprimed frame is the one in which the particle is instantaneously at rest. Then the emitted power is

${\displaystyle P={\frac {dE}{dt}}={\frac {dE^{\prime }}{dt^{\prime }}}=P^{\prime }.\,\!}$

The factors of ${\displaystyle \gamma }$ cancel, and we find that the power is the same in both reference frames. But we could have transformed to any reference frame. So we have just proved that the radiated power is a Lorentz scalar so long as there is no net flux of momentum in the particle’s rest frame.
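The cancellation can be illustrated numerically. The sketch below boosts the radiated four-momentum ${\displaystyle (dE,0,0,0)}$ and the time-like displacement ${\displaystyle (dt,0,0,0)}$ between two emission events from the rest frame to a frame moving along ${\displaystyle x}$, in units with ${\displaystyle c=1}$. The helper `boost_x`, the boost speed and the values of ${\displaystyle dE}$ and ${\displaystyle dt}$ are assumptions chosen only for illustration.

```python
import numpy as np


def boost_x(v):
    """Boost matrix acting on contravariant components (t, x, y, z), c = 1.

    Assumes the primed frame moves with velocity +v along x.
    """
    g = 1.0 / np.sqrt(1.0 - v**2)
    return np.array([
        [g, -g * v, 0.0, 0.0],
        [-g * v, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])


v = 0.6                 # illustrative boost speed
dE, dt = 2.0e-3, 1.0e-3  # energy emitted and elapsed time in the rest frame

# Radiated four-momentum in the rest frame: no net spatial momentum.
dp_rest = np.array([dE, 0.0, 0.0, 0.0])
# Displacement between emission events in the rest frame: particle does not move.
dx_rest = np.array([dt, 0.0, 0.0, 0.0])

dp_prime = boost_x(v) @ dp_rest
dx_prime = boost_x(v) @ dx_rest

P_rest = dE / dt
P_prime = dp_prime[0] / dx_prime[0]

print(P_rest, P_prime)
```

Note that the radiation does carry net spatial momentum in the boosted frame (`dp_prime[1]` is non-zero); only the ratio of the time components is unchanged.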

In particular, we can apply this result to the Larmor formula for dipole radiation:

${\displaystyle P={\frac {2}{3}}q^{2}|\mathbf {a} |^{2},\,\!}$

where ${\displaystyle |\mathbf {a} |}$ is the real 3-dimensional acceleration. This is definitely valid in the instantaneous rest frame of the particle where the speeds are very small compared with the speed of light. But this expression is not Lorentz invariant, since it depends on ${\displaystyle a^{i}a_{i}}$ instead of ${\displaystyle a^{\mu }a_{\mu }}$. Because we are in the instantaneous rest frame of the particle, ${\displaystyle U^{i}=0}$ and ${\displaystyle U^{0}=1}$. At the same time, recall that ${\displaystyle a^{\mu }U_{\mu }=0}$ always. Since ${\displaystyle U^{0}\not =0}$, we must have ${\displaystyle a^{0}=0}$. But then ${\displaystyle a^{\mu }a_{\mu }=a^{i}a_{i}}$. Then we can replace ${\displaystyle |\mathbf {a} |^{2}}$ in the Larmor formula with ${\displaystyle a^{\mu }a_{\mu }}$ to get

${\displaystyle P={\frac {2}{3}}q^{2}a^{\mu }a_{\mu }.\,\!}$

All of the factors in this expression are Lorentz invariant, so this is a Lorentz invariant formula for the total power emitted by a radiating dipole.

In non-covariant form, this can be expressed as

${\displaystyle P={\frac {2}{3}}q^{2}\left(a_{\parallel }^{2}\gamma ^{6}+a_{\bot }^{2}\gamma ^{4}\right)\,\!}$

where ${\displaystyle {\vec {a}}_{\parallel }}$ represents the component of acceleration parallel to the velocity, and ${\displaystyle {\vec {a}}_{\bot }}$ represents the component perpendicular to the velocity, both as seen in the lab frame. The case in which ${\displaystyle {\vec {a}}}$ is entirely perpendicular to the velocity (i.e. ${\displaystyle {\vec {a}}_{\parallel }=0}$) gives rise to synchrotron radiation.
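The equivalence of the covariant and non-covariant forms can be checked numerically. The sketch below assumes ${\displaystyle c=1}$, the metric signature ${\displaystyle (-,+,+,+)}$, and the standard lab-frame components of the four-acceleration, ${\displaystyle a^{0}=\gamma ^{4}(\mathbf {v} \cdot \mathbf {a} )}$ and ${\displaystyle a^{i}=\gamma ^{2}a^{i}_{\rm {lab}}+\gamma ^{4}(\mathbf {v} \cdot \mathbf {a} )v^{i}}$, which follow from differentiating ${\displaystyle U^{\mu }=\gamma (1,\mathbf {v} )}$ with respect to proper time. All function names and the sample values of ${\displaystyle \mathbf {v} }$ and ${\displaystyle \mathbf {a} }$ are illustrative.

```python
import numpy as np

# Assumed metric signature (-, +, +, +), units with c = 1.
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])


def lorentz_factor(v):
    v = np.asarray(v, dtype=float)
    return 1.0 / np.sqrt(1.0 - v @ v)


def four_velocity(v):
    """U^mu = gamma * (1, v)."""
    v = np.asarray(v, dtype=float)
    return lorentz_factor(v) * np.concatenate(([1.0], v))


def four_acceleration(v, a):
    """a^mu = dU^mu/dtau expressed through lab-frame velocity and acceleration."""
    v = np.asarray(v, dtype=float)
    a = np.asarray(a, dtype=float)
    g = lorentz_factor(v)
    va = v @ a
    return np.concatenate(([g**4 * va], g**2 * a + g**4 * va * v))


def power_covariant(q, v, a):
    """P = (2/3) q^2 a^mu a_mu."""
    A = four_acceleration(v, a)
    return (2.0 / 3.0) * q**2 * (A @ ETA @ A)


def power_lab(q, v, a):
    """P = (2/3) q^2 (gamma^6 a_par^2 + gamma^4 a_perp^2); requires |v| > 0."""
    v = np.asarray(v, dtype=float)
    a = np.asarray(a, dtype=float)
    g = lorentz_factor(v)
    v_hat = v / np.linalg.norm(v)
    a_par = (a @ v_hat) * v_hat   # component of a along v
    a_perp = a - a_par            # component of a perpendicular to v
    return (2.0 / 3.0) * q**2 * (g**6 * (a_par @ a_par) + g**4 * (a_perp @ a_perp))


# Illustrative lab-frame values.
q = 1.0
v = [0.3, 0.4, 0.5]
a = [1.0, -2.0, 0.5]

P_cov = power_covariant(q, v, a)
P_lab = power_lab(q, v, a)
orthogonality = four_acceleration(v, a) @ ETA @ four_velocity(v)

print(P_cov, P_lab, orthogonality)
```

The last quantity checks the identity ${\displaystyle a^{\mu }U_{\mu }=0}$ used in the argument above; it should vanish up to rounding error.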

## Relativistic electrodynamics

Maxwell’s equations are Lorentz invariant. Unfortunately, the most familiar form of Maxwell’s equations (${\displaystyle \nabla \cdot \mathbf {E} =4\pi \rho }$, etc.) does not make the Lorentz invariance manifest. But we can define a few tensors and rewrite Maxwell’s equations in a manifestly Lorentz invariant form.

First we define the four-current as

${\displaystyle j^{\mu }=(\rho ,\mathbf {j} ),\,\!}$

where ${\displaystyle \rho }$ is the charge density and ${\displaystyle \mathbf {j} }$ is the 3-dimensional current. Then the continuity equation can be written as

${\displaystyle \partial _{\mu }j^{\mu }={\dot {\rho }}+\nabla \cdot \mathbf {j} =0,\,\!}$

where ${\displaystyle \partial _{\mu }={\frac {\partial }{\partial x^{\mu }}}}$. So already we’ve been able to write the continuity equation in a manifestly Lorentz invariant form.

Now let’s define the four-potential as

${\displaystyle A^{\mu }=(\phi ,\mathbf {A} ),\,\!}$

where ${\displaystyle \phi }$ is the scalar potential and ${\displaystyle \mathbf {A} }$ is the vector potential. We want to work in the Lorentz gauge for which the condition is

${\displaystyle \partial _{\mu }A^{\mu }=0.\,\!}$

This is a good gauge for us, because it will allow us to write the equations of motion for ${\displaystyle A^{\mu }}$ in a manifestly Lorentz invariant form. The Lorentz gauge actually originated with a physicist whose last name was Lorenz (that’s not a typo). Unfortunately for Lorenz, Hendrik Lorentz became much more famous, and the Lorenz gauge turns out to be associated with Lorentz invariance. So the gauge seems to have gone down in history as the Lorentz gauge and not the Lorenz gauge. In this gauge, the equations of motion are

${\displaystyle \partial ^{\nu }\partial _{\nu }A^{\mu }=-4\pi j^{\mu }.\,\!}$

Note that ${\displaystyle \partial ^{\nu }\partial _{\nu }=\Box }$ is the d’Alembertian operator.

Now we can define the field-strength tensor as

${\displaystyle F_{\mu \nu }=\partial _{\mu }A_{\nu }-\partial _{\nu }A_{\mu }.\,\!}$

Notice that ${\displaystyle F_{\mu \nu }}$ is antisymmetric in its indices, i.e., ${\displaystyle F_{\mu \nu }=-F_{\nu \mu }}$. Using this definition of the field-strength tensor we can write

${\displaystyle \partial _{\sigma }F_{\mu \nu }+\partial _{\mu }F_{\nu \sigma }+\partial _{\nu }F_{\sigma \mu }=0.\,\!}$

Using the gauge condition, the equations of motion can be written in terms of ${\displaystyle F_{\mu \nu }}$ as

${\displaystyle \partial _{\nu }F^{\mu \nu }=4\pi j^{\mu }.\,\!}$

The previous two equations are Lorentz invariant and equivalent to the conventional form of Maxwell’s equations.

Now let’s try to recover Maxwell’s equations for the electric and magnetic fields. This will be a backwards argument, since we defined the potentials through the electric and magnetic fields and the field-strength tensor through the four-potential. That is, we shouldn’t be at all surprised to see the familiar form of Maxwell’s equations emerge from this formalism. Recall that ${\displaystyle \mathbf {E} =-\nabla \phi -{\dot {\mathbf {A} }}}$ and ${\displaystyle \mathbf {B} =\nabla \times \mathbf {A} }$. Then ${\displaystyle E_{i}=\partial _{i}A_{0}-\partial _{0}A_{i}=F_{i0}}$ and ${\displaystyle B_{i}=\epsilon _{ijk}\partial _{j}A_{k}}$, where ${\displaystyle \epsilon _{ijk}}$ is the Levi-Civita symbol. So ${\displaystyle B_{i}=F_{jk}}$ when ${\displaystyle ijk}$ is an even permutation of ${\displaystyle 123}$, and ${\displaystyle B_{i}=-F_{jk}}$ when ${\displaystyle ijk}$ is an odd permutation. As a matrix, ${\displaystyle F_{\mu \nu }}$ can be written as

${\displaystyle F_{\mu \nu }=\left({\begin{array}{cccc}0&-E_{x}&-E_{y}&-E_{z}\\E_{x}&0&B_{z}&-B_{y}\\E_{y}&-B_{z}&0&B_{x}\\E_{z}&B_{y}&-B_{x}&0\end{array}}\right).\,\!}$
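A minimal sketch of this bookkeeping in code: the hypothetical helper `field_tensor` fills in the matrix above from the components of ${\displaystyle \mathbf {E} }$ and ${\displaystyle \mathbf {B} }$, and `fields_from_tensor` reads them back using ${\displaystyle E_{i}=F_{i0}}$ and ${\displaystyle B_{x}=F_{23}}$, ${\displaystyle B_{y}=F_{31}}$, ${\displaystyle B_{z}=F_{12}}$. The index order ${\displaystyle (t,x,y,z)}$ and the sample field values are assumptions made for illustration.

```python
import numpy as np


def field_tensor(E, B):
    """Covariant field-strength tensor F_{mu nu}, indices ordered (t, x, y, z)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ])


def fields_from_tensor(F):
    """Read E and B back off F_{mu nu}: E_i = F_{i0}, B_i = F_{jk} for even ijk."""
    E = np.array([F[1, 0], F[2, 0], F[3, 0]])
    B = np.array([F[2, 3], F[3, 1], F[1, 2]])
    return E, B


# Illustrative field values.
E = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])

F = field_tensor(E, B)
E_back, B_back = fields_from_tensor(F)

print(F)
print(np.allclose(F, -F.T))  # antisymmetry check
```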

Now that we’ve related the field-strength tensor to the electric and magnetic fields, we can rewrite our equation of motion (${\displaystyle \partial _{\nu }F^{\mu \nu }=4\pi j^{\mu }}$) in terms of ${\displaystyle \mathbf {E} }$ and ${\displaystyle \mathbf {B} }$. We find that this equation of motion is equivalent to the two inhomogeneous Maxwell equations:

${\displaystyle \nabla \cdot \mathbf {E} =4\pi \rho \,\!}$

and

${\displaystyle \nabla \times \mathbf {B} =4\pi \mathbf {j} +{\dot {\mathbf {E} }}.\,\!}$
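As a sketch of how these two equations arise, assume the metric signature ${\displaystyle (-,+,+,+)}$, which is consistent with the matrix for ${\displaystyle F_{\mu \nu }}$ above. Raising a single time index then flips the sign, so ${\displaystyle F^{0i}=-F_{0i}=E_{i}}$, ${\displaystyle F^{i0}=-E_{i}}$ and ${\displaystyle F^{ij}=F_{ij}=\epsilon _{ijk}B_{k}}$. The ${\displaystyle \mu =0}$ component of the equation of motion reads

${\displaystyle \partial _{\nu }F^{0\nu }=\partial _{i}F^{0i}=\nabla \cdot \mathbf {E} =4\pi j^{0}=4\pi \rho ,\,\!}$

while the ${\displaystyle \mu =i}$ component reads

${\displaystyle \partial _{\nu }F^{i\nu }=\partial _{0}F^{i0}+\partial _{j}F^{ij}=-{\dot {E}}_{i}+\epsilon _{ijk}\partial _{j}B_{k}=4\pi j^{i},\,\!}$

which is the ${\displaystyle i}$-th component of ${\displaystyle \nabla \times \mathbf {B} =4\pi \mathbf {j} +{\dot {\mathbf {E} }}}$.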

We can use ${\displaystyle \partial _{\sigma }F_{\mu \nu }+\partial _{\mu }F_{\nu \sigma }+\partial _{\nu }F_{\sigma \mu }=0}$ to recover the two homogeneous Maxwell’s equations:

${\displaystyle \nabla \cdot \mathbf {B} =0\,\!}$

and

${\displaystyle \nabla \times \mathbf {E} =-{\dot {\mathbf {B} }}.\,\!}$

So we have shown that Maxwell’s equations need only be written in terms of the field-strength tensor in order to make their Lorentz invariance manifest.

### Lorentz transformation of the electric and magnetic fields

The field-strength tensor ${\displaystyle F_{\mu \nu }}$ has two covariant indices. We saw in the section on Lorentz transformations how to perform a covariant transformation. Since ${\displaystyle F_{\mu \nu }}$ has two indices, we will need to perform a transformation on both. The transformation looks like

${\displaystyle F_{\mu \nu }^{\prime }=\Lambda _{\mu }^{\,\,\,\,\sigma }\Lambda _{\nu }^{\,\,\,\,\kappa }F_{\sigma \kappa }.\,\!}$

This can also be evaluated using ordinary matrix multiplication. You want to be a little careful, though. You should take the transpose of ${\displaystyle \Lambda _{\nu }^{\,\,\,\,\kappa }}$ and put it all the way on the right. Otherwise, you’re not performing matrix multiplication. Once you’ve done the multiplication, you can just read off the components of ${\displaystyle F_{\mu \nu }^{\prime }}$ to see how the fields transformed. For a boost along the ${\displaystyle x}$-axis,

${\displaystyle {\begin{aligned}E_{x}^{\prime }&=E_{x},&B_{x}^{\prime }&=B_{x},\end{aligned}}\,\!}$

${\displaystyle {\begin{aligned}E_{y}^{\prime }&=\gamma (E_{y}-vB_{z}),&B_{y}^{\prime }&=\gamma (B_{y}+vE_{z}),\end{aligned}}\,\!}$

${\displaystyle {\begin{aligned}E_{z}^{\prime }&=\gamma (E_{z}+vB_{y}),&B_{z}^{\prime }&=\gamma (B_{z}-vE_{y}).\end{aligned}}\,\!}$

Whereas a rotation would rotate the components of ${\displaystyle \mathbf {E} }$ into each other and the components of ${\displaystyle \mathbf {B} }$ into each other, a Lorentz boost mixes ${\displaystyle \mathbf {E} }$ and ${\displaystyle \mathbf {B} }$. This also means that a Lorentz boost can create magnetic fields. Suppose we only have ${\displaystyle E_{y}}$ in one frame and all other field components vanish. In a boosted frame we would pick up a non-zero ${\displaystyle B_{z}}$ even though the original frame had no magnetic field at all. For this reason, people sometimes say that magnetism is merely a relativistic effect. Notice also that the field components parallel to the boost are not affected.
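The matrix form of the transformation, ${\displaystyle F^{\prime }=\Lambda F\Lambda ^{T}}$, can be sketched as follows. The code assumes ${\displaystyle c=1}$, the index order ${\displaystyle (t,x,y,z)}$, and that the primed frame moves with velocity ${\displaystyle +v}$ along ${\displaystyle x}$, so that the matrix acting on covariant indices is the usual boost matrix with ${\displaystyle v\to -v}$. The helper names are hypothetical, and the example reproduces the case discussed above in which only ${\displaystyle E_{y}}$ is non-zero in the original frame.

```python
import numpy as np


def field_tensor(E, B):
    """Covariant field-strength tensor F_{mu nu}, indices ordered (t, x, y, z)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([
        [0.0, -Ex, -Ey, -Ez],
        [Ex, 0.0, Bz, -By],
        [Ey, -Bz, 0.0, Bx],
        [Ez, By, -Bx, 0.0],
    ])


def boost_x_covariant(v):
    """Matrix Lambda_mu^sigma for a boost along x acting on covariant indices."""
    g = 1.0 / np.sqrt(1.0 - v**2)
    return np.array([
        [g, g * v, 0.0, 0.0],
        [g * v, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ])


def transform_fields(E, B, v):
    """Return (E', B') in the boosted frame via F' = L F L^T."""
    L = boost_x_covariant(v)
    Fp = L @ field_tensor(E, B) @ L.T
    Ep = np.array([Fp[1, 0], Fp[2, 0], Fp[3, 0]])
    Bp = np.array([Fp[2, 3], Fp[3, 1], Fp[1, 2]])
    return Ep, Bp


# Only E_y is non-zero in the original frame.
v = 0.6
E_prime, B_prime = transform_fields([0.0, 1.0, 0.0], [0.0, 0.0, 0.0], v)
print(E_prime, B_prime)

# Cross-check against the component formulas for generic fields.
E = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])
g = 1.0 / np.sqrt(1.0 - v**2)
E_formula = np.array([E[0], g * (E[1] - v * B[2]), g * (E[2] + v * B[1])])
B_formula = np.array([B[0], g * (B[1] + v * E[2]), g * (B[2] - v * E[1])])
E_matrix, B_matrix = transform_fields(E, B, v)
```

With ${\displaystyle v=0.6}$ (so ${\displaystyle \gamma =1.25}$), the pure ${\displaystyle E_{y}}$ field acquires a magnetic component ${\displaystyle B_{z}^{\prime }=-\gamma vE_{y}}$ in the boosted frame, while ${\displaystyle E_{y}^{\prime }=\gamma E_{y}}$.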

## External references

Rybicki and Lightman, Radiative Processes in Astrophysics, Ch. 4

Griffiths, Introduction to Electrodynamics, 3rd Ed., Chs. 10, 12

Carroll, Spacetime and Geometry, Ch. 1

Weinberg, Gravitation and Cosmology, Ch. 2

Jackson, Classical Electrodynamics, 3rd Ed., Ch. 11