Four-Vector

A four-vector is a mathematical object with four components that transform according to the rules of Lorentz transformations: $$ a^\mu = (a^0, a^1, a^2, a^3) $$.

Four-vectors provide the mathematical framework that lets us express the laws of physics in a covariant way - meaning they hold the same form for all inertial observers.

Four-vectors transform in the same way as space-time coordinates under Lorentz transformations.
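
As a concrete instance, consider a boost with speed $v$ along the $x$-axis. The symbols $\beta = v/c$ and $\gamma = 1/\sqrt{1-\beta^2}$ are introduced here only for illustration, and the choice of axis is an assumption. Under that boost the components of any four-vector would transform as:

$$ a'^0 = \gamma\,(a^0 - \beta a^1), \quad a'^1 = \gamma\,(a^1 - \beta a^0), \quad a'^2 = a^2, \quad a'^3 = a^3 $$

Setting $a^\mu = x^\mu = (ct, x, y, z)$ recovers the usual Lorentz transformation of the coordinates, which is the sense in which a four-vector "transforms like" them.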

There are two versions of four-vectors:

Contravariant
Contravariant components $ a^\mu $ (with upper indices) are associated with displacements or geometric vectors. Geometrically, they belong to the vector space itself. These are the familiar components of a vector, such as space-time coordinates: $$ x^\mu = (x^0, x^1, x^2, x^3) = (ct, x, y, z) $$. They’re called contravariant because they transform in the opposite way to the basis vectors. For example, if I stretch the basis vector along the $x$-axis by a factor of two (a rescaling), the corresponding contravariant component shrinks by half, compensating for the transformation so that the vector itself is unchanged.
Covariant
Covariant components $ a_\mu $ (with lower indices) are associated with gradients or linear forms. They belong to the dual space - the space of linear functionals acting on vectors. Covariant components are not “new coordinates,” but the same information expressed in dual form, with the signs adjusted by the metric. In Minkowski space: $$ x_0 = +x^0, \quad x_1 = -x^1, \quad x_2 = -x^2, \quad x_3 = -x^3 $$ so that $$ x_\mu = (x^0, -x^1, -x^2, -x^3) = (ct, -x, -y, -z). $$
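
The sign flip described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the metric signature $(+,-,-,-)$ used in this article; the names `METRIC_DIAG` and `lower_index` are hypothetical and chosen only for this example.

```python
# Minkowski metric with signature (+, -, -, -), stored as its diagonal.
# Assumption: the metric is diagonal, so lowering an index is a
# component-wise multiplication.
METRIC_DIAG = (1.0, -1.0, -1.0, -1.0)


def lower_index(a_upper):
    """Return covariant components a_mu = g_{mu nu} a^nu."""
    if len(a_upper) != 4:
        raise ValueError("a four-vector needs exactly four components")
    return tuple(g * a for g, a in zip(METRIC_DIAG, a_upper))


# Arbitrary illustrative values for (ct, x, y, z).
x_upper = (5.0, 1.0, 2.0, 3.0)
x_lower = lower_index(x_upper)
print(x_lower)  # (5.0, -1.0, -2.0, -3.0)
```

Applying `lower_index` twice returns the original components, which reflects the fact that with this metric raising and lowering an index are the same operation.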

The scalar product between a covariant and a contravariant vector is a Lorentz invariant, meaning it takes the same value in every inertial frame:

$$ x^\mu x_\mu = (x^0, x^1, x^2, x^3) \cdot (x^0, -x^1, -x^2, -x^3) $$

$$ x^\mu x_\mu = x^0 \cdot x_0 + x^1 \cdot x_1 + x^2 \cdot x_2 + x^3 \cdot x_3 $$

$$ x^\mu x_\mu = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 $$

This expression is not a “norm” in the Euclidean sense, because it can be zero or negative for a nonzero vector. It is the relativistic measure of separation between the event and the origin, the same for all inertial observers.

This quantity is called the squared space-time interval:

$$ s^2 = (ct)^2 - x^2 - y^2 - z^2 $$
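
The invariance can be checked numerically. The sketch below assumes a boost along the $x$-axis with speed $\beta = v/c$, using the standard boost formulas; the function names `boost_x` and `interval_squared` and the sample event are hypothetical, chosen for illustration.

```python
import math


def boost_x(a, beta):
    """Boost contravariant components (a0, a1, a2, a3) along x, beta = v/c."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    a0, a1, a2, a3 = a
    # The time and x components mix; y and z are untouched by this boost.
    return (gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3)


def interval_squared(a):
    """Return (a0)^2 - (a1)^2 - (a2)^2 - (a3)^2."""
    a0, a1, a2, a3 = a
    return a0 * a0 - a1 * a1 - a2 * a2 - a3 * a3


event = (5.0, 1.0, 2.0, 3.0)      # (ct, x, y, z), arbitrary values
boosted = boost_x(event, 0.6)     # the same event seen from a moving frame

print(interval_squared(event))    # 11.0
# The boosted components differ, but the interval agrees up to rounding.
print(math.isclose(interval_squared(boosted), interval_squared(event)))
```

The individual components change under the boost, while the combination $(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2$ does not, up to floating-point rounding.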

The sign of $ s^2 $ determines the type of separation between two events:

If $ s^2 > 0 $, the separation is timelike: the events can be connected by a particle moving at less than the speed of light. One event can influence the other - it’s just a matter of time.
If $ s^2 = 0 $, the separation is lightlike: the events are connected by a light signal. Again, one event can affect the other, but the signal propagates at the speed of light.
If $ s^2 < 0 $, the separation is spacelike: no causal link is possible, since that would require faster-than-light travel.
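
The three cases can be summarised as a small classifier. This is a sketch only: the function name `classify_separation` and the optional tolerance `tol` (useful when the inputs carry rounding error) are assumptions made for this example.

```python
def classify_separation(c_dt, dx, dy, dz, tol=0.0):
    """Classify the separation between two events from (c*dt, dx, dy, dz)."""
    s2 = c_dt ** 2 - dx ** 2 - dy ** 2 - dz ** 2
    if s2 > tol:
        return "timelike"
    if s2 < -tol:
        return "spacelike"
    return "lightlike"


print(classify_separation(2.0, 1.0, 0.0, 0.0))  # timelike
print(classify_separation(1.0, 1.0, 0.0, 0.0))  # lightlike
print(classify_separation(1.0, 2.0, 0.0, 0.0))  # spacelike
```

Because $s^2$ is invariant, the label returned for a pair of events would be the same in every inertial frame.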

Thus, the formula $ x^\mu x_\mu = (ct)^2 - x^2 - y^2 - z^2 $ encodes the space-time interval - the relativistic “distance” between events - which is the same for every observer.

Note. This is the relativistic analogue of Euclidean distance: invariant, meaning identical for all inertial observers.

A Worked Example

Suppose a signal leaves the origin at $t=0$ and, after 2 seconds, is found at a position along the $x$-axis of $3 \times 10^8 \,\text{m}$ - roughly one light-second, the distance light covers in one second (taking $c \approx 3 \times 10^8 \,\text{m/s}$).

For simplicity, set the other spatial coordinates to zero: $y=0$ and $z=0$.

The contravariant coordinates of the event are:

$$ x^\mu = (ct, x, y, z) = (6 \times 10^8, 3 \times 10^8, 0, 0). $$

So we have:

$x^0 = ct = 6 \times 10^8 \,\text{m}$
$x^1 = x = 3 \times 10^8 \,\text{m}$
$x^2 = 0 $
$x^3 = 0$

Now lower the index using the metric $g_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$:

$x_0 = +x^0 = 6 \times 10^8$
$x_1 = -x^1 = -3 \times 10^8$
$x_2 = -x^2 = 0$
$x_3 = -x^3 = 0$

So the covariant four-vector is:

$$ x_\mu = (6 \times 10^8, -3 \times 10^8, 0, 0). $$

What’s different? The time component stays the same, while the spatial components flip sign. That’s the effect of the Minkowski metric.

Now compute the scalar product between the contravariant and covariant vectors - the squared space-time interval:

$$ s^2 = x^\mu x_\mu $$

$$ s^2 = (6 \times 10^8)^2 - (3 \times 10^8)^2 $$

$$ s^2 = 2.7 \times 10^{17} \,\text{m}^2 $$

The result is positive, so the separation is timelike.

This tells us that between the origin and the event, the separation is dominated by time. In other words, a physical observer moving at less than the speed of light could actually make that journey.
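
To make "could actually make that journey" concrete, the sketch below recomputes the example and adds two derived quantities that the text does not state explicitly: the required speed as a fraction of $c$, and the time elapsed on the traveler's own clock (the proper time, $\tau = \sqrt{s^2}/c$ for uniform motion). The rounded value of $c$ and the variable names are assumptions carried over from the example for illustration.

```python
import math

C = 3.0e8    # speed of light in m/s, rounded as in the worked example
t = 2.0      # coordinate time of the event, in seconds
x = 3.0e8    # position of the event along the x-axis, in metres

# Squared interval between the origin and the event (y = z = 0).
s2 = (C * t) ** 2 - x ** 2

# Speed needed to reach the event, as a fraction of c.
beta = x / (C * t)

# Proper time for a traveler moving uniformly from the origin to the event.
proper_time = math.sqrt(s2) / C

print(beta)                   # 0.5
print(round(proper_time, 3))  # 1.732
```

Under these assumptions the traveler moves at half the speed of light, and about 1.73 seconds pass on the traveler's clock while 2 seconds pass in the original frame.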

Types of Four-Vectors

Several kinds of four-vectors are commonly used:

Four-position: $$ x^\mu = (ct, x, y, z) $$
Four-displacement: $$ \Delta x^\mu = (c\Delta t, \Delta x, \Delta y, \Delta z) $$
Four-velocity: $$ u^\mu = \dfrac{dx^\mu}{d\tau} $$
Four-momentum: $$ p^\mu = m u^\mu = \left(\tfrac{E}{c}, \vec{p}\right) $$
Four-force: $$ F^\mu = \dfrac{dp^\mu}{d\tau} $$
Four-current: $$ J^\mu = (c\rho, \vec{J}) $$
Four-potential: $$ A^\mu = (\phi, \vec{A}) $$ (in Gaussian units; in SI units the time component is $\phi/c$)

And others as well.
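
As an illustration of how these objects behave, the sketch below builds a four-velocity from an ordinary velocity using $u^\mu = \gamma\,(c, \vec{v})$, which follows from $u^\mu = dx^\mu/d\tau$, and checks two standard invariants: $u^\mu u_\mu = c^2$ and $p^\mu p_\mu = (mc)^2$. The function names, the 1 kg mass, and the chosen speed are hypothetical values picked for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def four_velocity(vx, vy, vz):
    """Return u^mu = gamma * (c, vx, vy, vz) for a speed below c."""
    v2 = vx * vx + vy * vy + vz * vz
    if v2 >= C * C:
        raise ValueError("speed must be below the speed of light")
    gamma = 1.0 / math.sqrt(1.0 - v2 / (C * C))
    return (gamma * C, gamma * vx, gamma * vy, gamma * vz)


def four_momentum(mass, u):
    """Return p^mu = m * u^mu."""
    return tuple(mass * component for component in u)


def minkowski_square(a):
    """Return a^mu a_mu with signature (+, -, -, -)."""
    a0, a1, a2, a3 = a
    return a0 * a0 - a1 * a1 - a2 * a2 - a3 * a3


mass = 1.0                              # kg, hypothetical
u = four_velocity(0.6 * C, 0.0, 0.0)    # particle moving at 0.6 c along x
p = four_momentum(mass, u)

# Both checks hold up to floating-point rounding.
print(math.isclose(minkowski_square(u), C ** 2, rel_tol=1e-9))
print(math.isclose(minkowski_square(p), (mass * C) ** 2, rel_tol=1e-9))
```

Both invariants are independent of the particle's velocity, which is what makes the four-vector form convenient: the frame-dependent pieces (energy, momentum) combine into a frame-independent quantity.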