# 17.5: Geometry of Space-time

## Four-Dimensional Space-Time

In 1906 Poincaré showed that the Lorentz transformation can be regarded as a rotation in a 4-dimensional Euclidean space-time introduced by adding an imaginary fourth space-time coordinate \(ict\) to the three real spatial coordinates. In 1908 Minkowski reformulated Einstein’s Special Theory of Relativity in this 4-dimensional Euclidean space-time vector space and concluded that the spatial variables \(q_i\), where \(i = 1, 2, 3\), plus the time \(q_0 = ict\), are equivalent variables and should be treated equally using a covariant representation of both space and time. The idea of using an imaginary time axis \(ict\) to make space-time Euclidean was elegant, but it obscured the non-Euclidean nature of space-time and caused difficulties when generalized to non-inertial accelerating frames in the General Theory of Relativity. As a consequence, the use of the imaginary \(ict\) has been abandoned in modern work. Minkowski developed an alternative representation in which all four coordinates \((ct, x, y, z)\) are real and a non-Euclidean metric, the Minkowski metric, introduces the required minus sign explicitly.

Analogous to the usual 3-dimensional cartesian coordinates, the displacement four vector \(d\mathbf{s}\) is defined using the four components along the **four unit vectors** in either the unprimed or primed coordinate frames.

\[\begin{align} d\mathbf{s} &= dx^0\mathbf{\hat{e}}_0 + dx^1\mathbf{\hat{e}}_1 + dx^2\mathbf{\hat{e}}_2 + dx^3\mathbf{\hat{e}}_3 \nonumber \\[4pt] &= dx^{\prime 0}\mathbf{\hat{e}}^{\prime}_0 + dx^{\prime 1}\mathbf{\hat{e}}^{\prime}_1 + dx^{\prime 2}\mathbf{\hat{e}}^{\prime}_2 + dx^{\prime 3}\mathbf{\hat{e}}^{\prime}_3 \label{17.31} \end{align}\]

The convention used is that Greek subscripts (covariant) or superscripts (contravariant) designate four-vector components, with the index taking the four values \(0 \leq \mu \leq 3\). The covariant unit vectors \(\mathbf{\hat{e}}_{\mu}\) are written with the subscript \(\mu\). As described in appendix \(19.5.3\), using the Einstein convention the components are written with the contravariant superscript \(dx^{\mu}\), where the time axis is \(x^0 = ct\) and the spatial coordinates, expressed in cartesian coordinates, are \(x^1 = x\), \(x^2 = y\), and \(x^3 = z\). With respect to a different (primed) unit vector basis \(\mathbf{\hat{e}}^{\prime}_{\mu}\), the displacement must be unchanged, as given by Equation \ref{17.31}. In addition, Equation \ref{17.43} shows that the magnitude \(|ds|^2\) of the displacement four vector is invariant to a Lorentz transformation.

The most general Lorentz transformation between inertial coordinate systems \(S\) and \(S^{\prime}\), in relative motion with velocity \(\mathbf{v}\), assuming that the two sets of axes are aligned, and that their origins overlap when \(t = t^{\prime} = 0\), is given by the symmetric matrix \(\lambda\) where

\[x^{\prime \mu} = \sum_{\nu} \lambda_{\mu\nu} x^{\nu} \label{17.32}\]

This Lorentz transformation of the *four vector* \(\mathbb{X}\) components can be written in matrix form as

\[\mathbb{X}^{\prime} = \boldsymbol{\lambda}\mathbb{X} \label{17.33}\]

With the two sets of axes aligned, the elements \(\lambda_{\mu\nu}\) of the Lorentz transformation are given by

\[\mathbb{X}^{\prime} = \begin{pmatrix} ct^{\prime} \\ x^{\prime 1} \\ x^{\prime 2} \\ x^{\prime 3} \end{pmatrix} = \begin{pmatrix} \gamma & −\gamma \beta_1 & −\gamma \beta_2 & −\gamma \beta_3 \\ −\gamma \beta_1 & 1+(\gamma − 1) \frac{\beta^2_1}{ \beta^2} & (\gamma − 1) \frac{\beta_1\beta_2}{ \beta^2} & (\gamma − 1) \frac{\beta_1\beta_3}{ \beta^2} \\ −\gamma \beta_2 & (\gamma − 1) \frac{\beta_1\beta_2}{ \beta^2} & 1+(\gamma − 1) \frac{\beta^2_2}{ \beta^2} & (\gamma − 1) \frac{\beta_2\beta_3}{ \beta^2} \\ −\gamma \beta_3 & (\gamma − 1) \frac{\beta_1\beta_3}{ \beta^2} & (\gamma − 1) \frac{\beta_2\beta_3}{ \beta^2} & 1+(\gamma − 1) \frac{\beta^2_3}{ \beta^2} \end{pmatrix} \cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \label{17.34}\]

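where \(\beta_i = \frac{v_i}{c}\) are the components of \(\boldsymbol{\beta} = \frac{\mathbf{v}}{c}\), \(\beta = |\boldsymbol{\beta}| = \frac{v}{c}\), and \(\gamma = \frac{1}{\sqrt{1-\beta^2}}\), and assuming that the origin of \(S\) transforms to the origin of \(S^{\prime}\) at \((0, 0, 0, 0)\).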

For the case illustrated in Figure \(17.2.1\), where the corresponding axes of the two frames are parallel and in relative motion with velocity \(v\) in the \(x^1\) direction, the Lorentz transformation matrix in Equation \ref{17.34} reduces to

\[\begin{pmatrix} ct^{\prime} \\ x^{\prime 1} \\ x^{\prime 2} \\ x^{\prime 3} \end{pmatrix} = \begin{pmatrix} \gamma & −\beta\gamma & 0 & 0 \\ −\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \label{17.35}\]

This Lorentz transformation matrix is called a *standard boost* since it only boosts from one frame to another parallel frame. In general a rotation matrix also is incorporated into the transformation matrix \(\lambda\) for the spatial variables.
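As a rough numerical illustration of Equation \ref{17.34}, the following Python sketch builds the boost matrix from the components \(\beta_i\) and evaluates it for motion along \(x^1\), where it should reduce to the standard boost of Equation \ref{17.35}. The function name `boost_matrix` and the value \(\beta = 0.6\) are assumptions chosen only for this example.

```python
import math

def boost_matrix(beta):
    """Return the 4x4 boost matrix for beta = (b1, b2, b3), the velocity in units of c."""
    b2 = sum(b * b for b in beta)
    if b2 >= 1.0:
        raise ValueError("the speed must be below c")
    if b2 == 0.0:
        # No relative motion: the transformation is the identity.
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    gamma = 1.0 / math.sqrt(1.0 - b2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        # Time-space elements: -gamma * beta_i (the matrix is symmetric).
        lam[0][i + 1] = lam[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            # Space-space elements: delta_ij + (gamma - 1) beta_i beta_j / beta^2.
            lam[i + 1][j + 1] = delta + (gamma - 1.0) * beta[i] * beta[j] / b2
    return lam

# Illustrative case: motion along x^1 with beta = 0.6, so gamma = 1.25.
standard = boost_matrix((0.6, 0.0, 0.0))
```

In this case the upper-left \(2 \times 2\) block holds \(\gamma\) and \(-\beta\gamma\), and the remaining diagonal elements are 1, as in Equation \ref{17.35}.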

## Four-vector scalar products

Scalar products of vectors and tensors usually are invariant to rotations in three-dimensional space providing an easy way to solve problems. The scalar, or inner, product of two four vectors is defined by

\[\begin{align} \mathbb{X} \cdot \mathbb{Y} &= g_{\mu\nu} X^{\mu} Y^{\nu} \\[4pt] &= \begin{pmatrix} X^0 & X^1 & X^2 & X^3 \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \cdot \begin{pmatrix} Y^0 \\ Y^1 \\ Y^2 \\ Y^3 \end{pmatrix} \\[4pt] &= X^0Y^0 - X^1Y^1 - X^2Y^2 - X^3Y^3 \label{17.36} \end{align}\]

The correct sign of the inner product is obtained by inclusion of the *Minkowski metric* \(g\) defined by

\[g_{\mu\nu} \equiv \mathbf{\hat{e}}_{\mu} \cdot \mathbf{\hat{e}}_{\nu} \label{17.37}\]

that is, it can be represented by the matrix

\[g \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & −1 & 0 & 0 \\ 0 & 0 & −1 & 0 \\ 0 & 0 & 0 & −1 \end{pmatrix} \label{17.38}\]

The sign convention used in the Minkowski metric, Equation \ref{17.38}, has been chosen with the time coordinate \((ct)^2\) positive which makes \((ds)^2 > 0\) for objects moving at less than the speed of light and corresponds to \(ds\) being real.^{1}

The presence of the Minkowski metric matrix, in the inner product of four vectors, complicates General Relativity and thus the Einstein convention has been adopted where the components of the *contravariant four-vector* \(\mathbb{X}\) are written with superscripts \(X^{\mu}\). See also appendix \(19.6\). The corresponding *covariant four-vector* components are written with the subscript \(X_{\mu}\) which is related to the contravariant four-vector components \(X^{\nu}\) using the \(\mu\nu \) component of the covariant Minkowski metric matrix \(\mathbf{g}\). That is

\[X_{\mu} = \sum^3_{\nu =0} g_{\mu\nu} X^{\nu} \label{17.39}\]

The contravariant metric component \(g^{\mu\nu}\) is defined as the \(\mu\nu\) component of the inverse metric matrix \(\mathbf{g}^{−1}\) where

\[\mathbf{gg}^{−1} = \mathbf{I} = \mathbf{g}^{−1} \mathbf{g} \label{17.40}\]

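where \(\mathbf{I}\) is the \(4 \times 4\) identity matrix. For the Minkowski metric of Equation \ref{17.38}, \(\mathbf{g}^{-1} = \mathbf{g}\), so \(g^{\mu\nu}\) and \(g_{\mu\nu}\) have the same numerical components. The contravariant components of the four vector can be expressed in terms of the covariant components as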

\[X^{\mu} = \sum^3_{\nu =0} g^{\mu\nu} X_{\nu} \label{17.41}\]

Thus equations \ref{17.39} and \ref{17.41} can be used to transform between covariant and contravariant four vectors, that is, to raise or lower the index \(\mu\).

The scalar inner product of two four vectors can be written compactly as the scalar product of a covariant four vector and a contravariant four vector. The Minkowski metric matrix can be absorbed into either \(\mathbb{X}\) or \(\mathbb{Y}\) thus

\[\begin{align} \mathbb{X} \cdot \mathbb{Y} &= \sum^3_{\mu=0} \sum^3_{\nu =0} g_{\mu\nu} X^{\mu}Y^{\nu} \\[4pt] &= \sum^3_{\nu =0} X_{\nu} Y^{\nu} \\[4pt] &= \sum^3_{\mu=0} X^{\mu} Y_{\mu} \label{17.42} \end{align}\]

If this covariant expression is Lorentz invariant in one coordinate system, then it is Lorentz invariant in all coordinate systems obtained by proper Lorentz transformations.
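The index lowering of Equation \ref{17.39} and the compact inner product of Equation \ref{17.42} can be sketched in a few lines of Python. This is only an illustration: the names `G`, `lower`, and `inner`, and the sample components, are assumptions made for the example.

```python
# Minkowski metric with the (+, -, -, -) sign convention of Equation 17.38.
G = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def lower(x_up):
    """Covariant components X_mu = sum_nu g_{mu nu} X^nu."""
    return [sum(G[mu][nu] * x_up[nu] for nu in range(4)) for mu in range(4)]

def inner(x_up, y_up):
    """Scalar product X . Y = sum_mu X_mu Y^mu."""
    x_down = lower(x_up)
    return sum(x_down[mu] * y_up[mu] for mu in range(4))

# Hypothetical contravariant components chosen for illustration.
X = [2.0, 1.0, 0.0, 3.0]
Y = [5.0, 4.0, 1.0, 1.0]

product = inner(X, Y)   # X^0 Y^0 - X^1 Y^1 - X^2 Y^2 - X^3 Y^3
```

Because this metric is its own inverse, applying `lower` twice returns the original contravariant components, which is the raising operation of Equation \ref{17.41}.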

The invariant space-time interval, which is the scalar inner product of the displacement four vector with itself, is an especially important example.

\[\begin{align} (ds)^2 \equiv d\mathbb{X}\cdot d\mathbb{X} &= c^2 (dt)^2 - (d\mathbf{r})^2 \\[4pt] &= (cdt)^2 - \sum^3_{i=1} dx^2_i = (cd\tau)^2 \label{17.43} \end{align}\]

This is invariant to a Lorentz transformation, as can be shown by applying the Lorentz standard boost transformation given above. In particular, if \(S^{\prime}\) is the rest frame of a clock, then the invariant space-time interval is simply \(ds = c\,d\tau\), where \(d\tau\) is the proper time interval measured by that clock.
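The invariance can be checked numerically for a single case. The sketch below applies the standard boost of Equation \ref{17.35} to one displacement and compares the interval before and after. The helper names and the sample values are assumptions for illustration, and lengths are in arbitrary units.

```python
import math

def standard_boost(event, beta):
    """Apply the standard boost along x^1 to event = (ct, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x, y, z = event
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)

def interval_squared(event):
    """(ds)^2 = (ct)^2 - (x^2 + y^2 + z^2)."""
    ct, x, y, z = event
    return ct * ct - (x * x + y * y + z * z)

# Hypothetical displacement from the origin, and a boost with beta = 0.6.
event = (5.0, 3.0, 1.0, 2.0)
boosted = standard_boost(event, 0.6)

before = interval_squared(event)
after = interval_squared(boosted)   # agrees with `before` up to rounding
```

The individual components \(ct\) and \(x\) change under the boost, but the combination \((ct)^2 - |\mathbf{r}|^2\) does not.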

## Minkowski Space-Time

Figure \(\PageIndex{1}\) illustrates a three-dimensional \(( ct, x^1, x^2)\) representation of the 4-dimensional space-time diagram, where it is assumed that \(x^3 = 0\). The fact that the speed of light has a fixed value leads to the concept of the light cone, defined by the locus \(|\mathbf{x}| = c|t|\).

### Inside the light cone

The vertex of the cones represents the present. Locations inside the upper cone represent the future, while the past is represented by locations inside the lower cone. Note that \((ds)^2 = c^2 (dt)^2 - (d\mathbf{r})^2 > 0\) inside both the future and past light cones. Thus the space-time interval is real, with \(c\Delta t\) positive for the future and negative for the past relative to the vertex of the light cone. A **world line** is the trajectory a particle follows as a function of time in Minkowski space. In the interior of the future light cone \(\Delta t > 0\) and, since the interval is real, it can be asserted unambiguously that any point inside this forward cone must occur later than the vertex of the cone, that is, it is the absolute future. A Lorentz transformation can rotate Minkowski space such that the axis \(x_0\) goes through any point within this light cone, and then the “world line” is purely **time like**. Similarly, any point inside the backward light cone unambiguously occurred before the vertex, that is, it is the absolute past.

### Outside the light cone

Outside the light cone one has

\[(ds)^2 = c^2 (dt)^2 - (d\mathbf{r})^2 < 0 \nonumber\]

and thus \(\Delta s\) is imaginary and the interval is called **space like**. A space-like plane hypersurface in spatial coordinates is shown for the present time in the unprimed frame. A rotation in Minkowski space can be made to a frame \(S^{\prime}\) such that the space-like hypersurface is tilted relative to the hypersurface shown, and thus any point \(P\) outside the light cone can be made to occur later than, simultaneous with, or earlier than the vertex, depending on the orientation of the space-like hypersurface. This startling situation implies that the time ordering of two points, each outside the other's light cone, can be reversed, which has profound implications for the concept of simultaneity and the notion of causality.
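The reversal of time ordering for a space-like separation can be seen in a small numerical sketch. It uses the time row of the standard boost, \(ct^{\prime} = \gamma (ct - \beta x)\), for an event \(P\) at \((ct, x) = (1, 2)\), which lies outside the light cone of the origin. The function name and the sample values are assumptions chosen for illustration.

```python
import math

def boosted_time(ct, x, beta):
    """Time coordinate ct' of the event (ct, x) after a standard boost along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (ct - beta * x)

# Event P is space-like relative to the origin: (ct)^2 - x^2 = 1 - 4 < 0.
ct, x = 1.0, 2.0

later = boosted_time(ct, x, 0.0)          # unboosted frame: P occurs after the vertex
simultaneous = boosted_time(ct, x, 0.5)   # beta = ct/x: P is simultaneous with the vertex
earlier = boosted_time(ct, x, 0.8)        # larger beta: P occurs before the vertex
```

For a time-like separation, where \(|x| < ct\), the factor \(ct - \beta x\) stays positive for every \(|\beta| < 1\), so no boost can reverse the ordering.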

For the special case of two events lying on the light cone:

\[\sum^3_{\mu=0} \sum^3_{\nu=0} g_{\mu\nu} x^{\mu} x^{\nu} = c^2t^2 - ( x^2_1 + x^2_2 + x^2_3 ) = 0 \nonumber\]

and thus these events are separated by a light ray travelling at velocity \(c\). Only events separated by time-like or light-like intervals can be connected causally. The world line of a particle must lie within its light cone. The division of intervals into space-like and time-like, because of their invariance, is an absolute concept. That is, it is independent of the frame of reference.
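The three cases can be summarized in a short Python sketch that classifies a separation by the sign of \((ds)^2\). The function name `classify_interval` and the tolerance argument are assumptions introduced for this example.

```python
def classify_interval(c_dt, dx, dy, dz, tol=1e-12):
    """Classify a separation by the sign of (ds)^2 = (c dt)^2 - |dr|^2."""
    ds2 = c_dt ** 2 - (dx ** 2 + dy ** 2 + dz ** 2)
    if ds2 > tol:
        return "time-like"    # inside the light cone
    if ds2 < -tol:
        return "space-like"   # outside the light cone
    return "light-like"       # on the light cone

inside = classify_interval(2.0, 1.0, 0.0, 0.0)
outside = classify_interval(1.0, 2.0, 0.0, 0.0)
on_cone = classify_interval(5.0, 3.0, 4.0, 0.0)
```

Since \((ds)^2\) is Lorentz invariant, the label returned for a given pair of events is the same in every inertial frame.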

The concept of proper time can be expanded by considering a clock at rest in frame \(S^{\prime}\), which is moving with uniform velocity \(v\) with respect to a rest frame \(S\). The clock at rest in the \(S^{\prime}\) frame measures the **proper time** \(\tau\); the time observed in the fixed frame can be obtained by looking at the interval \(ds\). Because of the invariance of the interval \(ds^2\),

\[\begin{align} ds^2 &= c^2d\tau^2 \\[4pt] &= c^2dt^2 − [ dx^2_1 + dx^2_2 + dx^2_3 ] \label{17.44} \end{align}\]

That is,

\[\begin{align} d\tau &= dt \sqrt{ 1 − \frac{ dx^2_1 + dx^2_2 + dx^2_3 }{ c^2dt^2} } \\[4pt] &= dt \sqrt { 1 − \frac{v^2}{ c^2} } = \frac{dt}{ \gamma} \label{17.45} \end{align}\]

that is, \(dt = \gamma d\tau\), which reproduces the usual expression for time dilation, \((17.3.5)\).
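As a quick worked example of Equation \ref{17.45}, the sketch below converts a time interval \(dt\) measured in the fixed frame into the proper time \(d\tau\) shown by the moving clock. The function name, the speed \(v = 0.8c\), and the 10-second interval are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def proper_time(dt, v, c=C):
    """Proper time d(tau) = dt * sqrt(1 - v^2/c^2) for a clock moving at speed v."""
    return dt * math.sqrt(1.0 - (v / c) ** 2)

# A clock moving at 0.8c, observed for 10 s in the fixed frame.
d_tau = proper_time(10.0, 0.8 * C)   # about 6 s, since gamma = 1/0.6
```

The moving clock records less elapsed time than the clocks of the fixed frame, by the factor \(1/\gamma\).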

## Momentum-energy four vector

The previous four-vector discussion can be elegantly exploited using the covariant Minkowski space-time representation. Separating the spatial and time components of the differential four vector gives

\[d\mathbb{X} = (cdt, d\mathbf{x}) \label{17.46}\]

Recall that the square of the four-dimensional space-time element of length \((ds)^2\) is invariant, Equation \ref{17.43}, and is simply related to the proper time element \(d\tau\). The scalar product is

\[d\mathbb{X}\cdot d\mathbb{X} = ds^2 = c^2d\tau^2 = c^2dt^2 − [ dx^2_1 + dx^2_2 + dx^2_3 ] \label{17.47}\]

Thus the proper time is an invariant.

The ratio of the four-vector element \(d\mathbb{X}\) and the invariant proper time interval \(d\tau\), is a four-vector called the *four-vector velocity* \(\mathbb{U}\) where

\[\mathbb{U} = \frac{d\mathbb{X}}{ d\tau} = \left( c \frac{dt}{ d\tau }, \frac{d\mathbf{x}}{ d\tau} \right) = \gamma_u \left( c, \frac{d\mathbf{x}}{ dt} \right) = \gamma_u (c, \mathbf{u}) \label{17.48}\]

where \(\mathbf{u}\) is the particle velocity, and \(\gamma_u = \frac{1}{\sqrt{(1 - \frac{u^2}{c^2})}}\).
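Dividing Equation \ref{17.47} by \(d\tau^2\) implies that \(\mathbb{U} \cdot \mathbb{U} = c^2\) for any particle velocity. The following Python sketch checks this for one case. The function names, the units with \(c = 1\), and the sample velocity are assumptions made for illustration.

```python
import math

def four_velocity(u, c=1.0):
    """Return U = gamma_u * (c, u) for a 3-velocity u = (ux, uy, uz)."""
    u2 = sum(ui * ui for ui in u)
    gamma_u = 1.0 / math.sqrt(1.0 - u2 / (c * c))
    return [gamma_u * c] + [gamma_u * ui for ui in u]

def minkowski_norm_squared(a):
    """a . a with the (+, -, -, -) sign convention."""
    return a[0] ** 2 - (a[1] ** 2 + a[2] ** 2 + a[3] ** 2)

# Hypothetical particle velocity of magnitude 0.5c, in units where c = 1.
U = four_velocity((0.3, 0.4, 0.0))
norm2 = minkowski_norm_squared(U)   # close to c**2 = 1, independent of u
```

Only three of the four components of \(\mathbb{U}\) are therefore independent, matching the three components of \(\mathbf{u}\).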

The four-vector momentum \(\mathbb{P}\) can be obtained from the four-vector velocity by multiplying it by the scalar rest mass \(m\)

\[\mathbb{P} = m\mathbb{U} = (\gamma_u mc, \gamma_u m\mathbf{u}) \label{17.49}\]

However,

\[\gamma_u mc = \frac{E}{c} \label{17.50}\]

thus the momentum four vector can be written as

\[\mathbb{P} = \left(\frac{E}{c} , \mathbf{p} \right) \label{17.51}\]

where the vector \(\mathbf{p}\) represents the three spatial components of the relativistic momentum. It is interesting to realize that the Theory of Relativity couples not only the spatial and time coordinates, but also their conjugate variables, the linear momentum \(\mathbf{p}\) and the total energy, which enters as \(\frac{E}{c}\).

An additional feature of this momentum-energy four vector \(\mathbb{P}\), is that the scalar inner product \(\mathbb{P} \cdot \mathbb{P}\) is invariant to Lorentz transformations and equals \((mc)^2\) in the rest frame

\[\begin{align} \mathbb{P} \cdot \mathbb{P} &= \sum^3_{\mu=0} \sum^3_{\nu =0} g_{\mu\nu} P^{ \mu} P^{\nu} \\[4pt] &= \sum^3_{\mu=0} P_{\mu}P^{\mu} \\[4pt] &= \left(\frac{E}{c} \right)^2 - |\mathbf{p}|^2 \\[4pt] &= m^2c^2 \label{17.52} \end{align}\]

which, with the rest energy \(E_0 = mc^2\), leads to the well-known equation

\[E^2 = p^2c^2 + E^2_0 \label{17.53}\]
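As a small worked example of Equation \ref{17.53}, the sketch below computes the total energy from the momentum, taking the rest energy to be \(E_0 = mc^2\) as implied by Equation \ref{17.52}. The function name, the units with \(c = 1\), and the sample values are assumptions for illustration.

```python
import math

def total_energy(p, m, c=1.0):
    """Total energy from E^2 = p^2 c^2 + E_0^2, with rest energy E_0 = m c^2."""
    rest_energy = m * c * c
    return math.sqrt((p * c) ** 2 + rest_energy ** 2)

# Hypothetical particle with m = 1 and p = 0.75 in units where c = 1.
E = total_energy(p=0.75, m=1.0)

# A massless particle has E = p c.
E_massless = total_energy(p=2.0, m=0.0)
```

For these values \(E = 1.25\), which corresponds to \(\gamma_u = 1.25\) and \(u = 0.6c\) in Equation \ref{17.49}.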

The Lorentz transformation matrix \(\lambda\) can be applied to \(\mathbb{P}\)

\[\mathbb{P}^{\prime} = \boldsymbol{\lambda}\mathbb{P} \label{17.54}\]

The four-vector representation is illustrated by applying the standard boost of Equation \ref{17.35}, for the relative motion shown in Figure \(17.2.1\), which gives \(p^{\prime}_1 = \gamma \left(p_1 - \frac{v}{c^2} E \right)\), \(p^{\prime}_2 = p_2\), \(p^{\prime}_3 = p_3\), and \(E^{\prime} = \gamma (E - vp_1)\).
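The transformation can be tried numerically by applying the standard boost matrix of Equation \ref{17.35} to the components \((E/c, p_1)\). In the sketch below, the function name, the units with \(c = 1\), and the sample particle (\(m = 1\), \(E = 1.25\), \(p_1 = 0.75\)) are assumptions chosen for illustration. The boost velocity is set equal to the particle velocity \(p_1 c^2 / E = 0.6c\), so the primed frame is the particle's rest frame.

```python
import math

def boost_energy_momentum(E, p1, v, c=1.0):
    """Apply the standard boost along x^1 to the components (E/c, p1)."""
    beta = v / c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    E_over_c_prime = gamma * (E / c - beta * p1)   # time row of the boost matrix
    p1_prime = gamma * (p1 - beta * E / c)         # x^1 row of the boost matrix
    return E_over_c_prime * c, p1_prime

# Hypothetical particle with m = 1 moving at 0.6c, in units where c = 1.
E, p1 = 1.25, 0.75
E_rest, p_rest = boost_energy_momentum(E, p1, v=0.6)

invariant_before = E ** 2 - p1 ** 2
invariant_after = E_rest ** 2 - p_rest ** 2   # both close to (m c^2)^2 = 1
```

In the rest frame the momentum vanishes and the energy reduces to the rest energy, while \(\mathbb{P} \cdot \mathbb{P}\) is unchanged.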

^{1}Older textbooks, such as all editions of Marion and the first two editions of Goldstein, use the Euclidean Poincaré 4-dimensional space-time with the imaginary time axis \(ict\). About half the scientific community, and modern physics textbooks including this textbook and the third edition of Goldstein, use the Bjorken-Drell \((+, -, -, -)\) sign convention given in Equation \ref{17.38}, where \(x_0 \equiv ct\), and \(x_1, x_2, x_3\) are the spatial coordinates. The other half of the community, including mathematicians and gravitation physicists, use the opposite \((-, +, +, +)\) sign convention. Further confusion is caused by a few books that assign the time axis \(ct\) to be \(x_4\) rather than \(x_0\).