11.2: Derivation of the Lorentz Transformations

    Instead of constant acceleration, in the theory of relativity we have a constant speed of light in each inertial reference frame. That means that the transformation rules (11.1.1) change. One thing that will not change is that spatial directions in which there is no motion are measured the same by all observers (our observers are after all both stationary in the \(y\) and \(z\) directions), so we’ll only consider the \(x\) direction, and time. To treat space and time on an equal footing, they should have the same dimensions. Fortunately, the one universal constant in special relativity, the speed of light \(c\), converts a time into a length (or vice versa), so we’ll consider a transformation on position \(x\) and ‘time’ \(ct\). A second thing that won’t change is that the transformations have to be linear. If they were not, they would violate the principle of relativity, because then the length of an object (or the duration of a time interval) would depend on the choice of origin of our coordinate frames \(S\) and \(S^\prime\).

    Hendrik Antoon Lorentz (1853-1928) was a Dutch physicist, widely considered the leading theoretical physicist of his time. At age 24, Lorentz became a professor at Leiden University, where he initially worked on electromagnetism. He provided the theoretical explanation for the recently discovered (quantum-mechanical) Zeeman effect, for which he and Zeeman shared the Nobel Prize in 1902. Around 1900, Lorentz developed the set of transformations now named after him in an attempt to interpret the results of the Michelson-Morley experiment. As Einstein built the theory of relativity on the mathematical tools provided by Lorentz, it was originally referred to as the Lorentz-Einstein theory; Lorentz himself quickly appreciated Einstein’s insights and consistently referred to ‘Einstein’s principle of relativity’. Lorentz resigned from his position in Leiden in 1912 to have more time to do research, moving to the Teylers Museum in Haarlem (still open today); Lorentz’ successor in Leiden, Paul Ehrenfest, founded an institute for theoretical physics there that is now known as the Lorentz Institute. From 1918 till 1926 Lorentz focused his efforts on maritime engineering, as chair of the committee charged with designing the Afsluitdijk, a 32 km dike that closes off the former Zuiderzee in the north of the Netherlands. Lorentz solved the various necessary hydrodynamic problems numerically by hand, one of the first engineering problems approached in this way; when construction was finished, it turned out that his calculations had been highly accurate. One of the two sets of locks in the dike is named after him.

    Figure \(\PageIndex{1}\): Hendrik Antoon Lorentz, around 1916 [28].

    For a general linear transformation, we write:

    \[
    \left(\begin{array}{c}
    {x^{\prime}} \\
    {c t^{\prime}}
    \end{array}\right)=A\left(\begin{array}{c}
    {x} \\
    {c t}
    \end{array}\right), \quad \text { with } \quad A=\left(\begin{array}{cc}
    {a_{11}} & {a_{12}} \\
    {a_{21}} & {a_{22}}
    \end{array}\label{eq:1}\right).
    \]

    We want our transformation to be invertible, so \(\det(A)\neq 0\) and

    \[\left(\begin{array}{c}
    {x} \\
    {c t}
    \end{array}\right)=\frac{1}{\operatorname{det}(A)}\left(\begin{array}{cc}
    {a_{22}} & {-a_{12}} \\
    {-a_{21}} & {a_{11}}
    \end{array}\right)\left(\begin{array}{c}
    {x^{\prime}} \\
    {c t^{\prime}}
    \end{array}\label{eq:2}\right).\]
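
    As a quick numerical check of the inverse formula above, the following Python sketch applies a matrix \(A\) to an event and then applies the adjugate-over-determinant inverse to recover it. The matrix entries and the event are arbitrary illustrative values (they are not taken from the text), and the helper names `apply` and `inverse` are chosen here for the example.

```python
def apply(m, vec):
    """Multiply a 2x2 matrix (nested tuples) with a 2-vector (x, ct)."""
    (a11, a12), (a21, a22) = m
    x, ct = vec
    return (a11 * x + a12 * ct, a21 * x + a22 * ct)


def inverse(m):
    """Inverse of a 2x2 matrix: swap the diagonal, negate the
    off-diagonal entries, and divide by the determinant."""
    (a11, a12), (a21, a22) = m
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is not invertible")
    return ((a22 / det, -a12 / det), (-a21 / det, a11 / det))


# Arbitrary invertible matrix (hypothetical values, det = 5.5).
A = ((2.0, 1.0), (0.5, 3.0))

event = (3.0, 5.0)                     # (x, ct) in S
primed = apply(A, event)               # (x', ct') in S'
recovered = apply(inverse(A), primed)  # back to (x, ct) in S

print(primed, recovered)
```

    Any matrix with non-zero determinant works here; the physical conditions that pin down the coefficients of \(A\) only enter in the steps that follow.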

    We can find\(^1\) the coefficients of \(A\) by demanding that \(S^\prime\) moves relative to \(S\) at constant speed \(u\), and that the value of \(c\) is the same in \(S\) and \(S^\prime\). To start with the first condition, consider a stationary point in \(S^\prime\), so \(x^\prime=b^\prime\). From (\ref{eq:1}) we have \(x^\prime = a_{11}x + a_{12}ct\), or \(x = (x^\prime - a_{12}ct)/a_{11}\). On the other hand, in \(S\) this point moves at speed \(u\), so the same point is described by an equation of the form \(x=ut+x_0\). Comparing the coefficients of \(t\), we have \(-a_{12}c/a_{11}=u\). Likewise, from (\ref{eq:2}), a stationary point in \(S\) satisfies \(x=b=(a_{22}x^\prime-a_{12}ct^\prime)/\det(A)\); it moves at speed \(-u\) in \(S^\prime\), so \(x^\prime = -ut^\prime+x_0^\prime\), which gives \(-u = a_{12}c/a_{22}\). It follows that \(a_{22} = -a_{12}c/u=a_{11}\), and we can rewrite our transformations (\ref{eq:1}) and (\ref{eq:2}) as:

    \[\begin{equation}\begin{aligned}\label{eq:3}
    x^{\prime} &=a_{11}\left(x+\frac{a_{12}}{a_{11}} c t\right)=a_{11}(x-u t) \\
    c t^{\prime} &=a_{21} x+a_{22} c t=a_{11}\left(\frac{a_{21}}{a_{11}} x+c t\right) \\
    x &=\frac{a_{11}}{\operatorname{det}(A)}\left(x^{\prime}+u t^{\prime}\right) \\
    c t &=\frac{a_{11}}{\operatorname{det}(A)}\left(-\frac{a_{21}}{a_{11}} x^{\prime}+c t^{\prime}\right)
    \end{aligned}\end{equation}\]

    Note that with \(a_{11} =1\) and \(a_{21} =0\), these are simply the Galilean transformations again. We’ll allow \(a_{21} \neq 0\) to accommodate the light postulate. To see how that works, we first calculate the velocity of a moving object in both reference frames, and relate the two to each other:

    \[
    v^{\prime}\equiv \frac{\mathrm{d} x^{\prime}}{\mathrm{d} t^{\prime}}=\frac{a_{11} \mathrm{d}(x-u t)}{a_{11} \mathrm{d}\left(\left(a_{21} / a_{11}\right)(x / c)+t\right)}=\frac{\mathrm{d} x-u \mathrm{d} t}{\left(a_{21} / c a_{11}\right) \mathrm{d} x+\mathrm{d} t} =\frac{\mathrm{d} x / \mathrm{d} t-u}{1+\left(a_{21} / c a_{11}\right) \mathrm{d} x / \mathrm{d} t}=\frac{v-u}{1+\left(a_{21} / a_{11}\right)(v / c)}
    \label{eq:4}\]

    where we used \(v \equiv \mathrm{d}x/\mathrm{d}t\). Inversely, we have:

    \[v=\frac{\mathrm{d} x}{\mathrm{d} t}=\frac{v^{\prime}+u}{1-\left(a_{21} / a_{11}\right)\left(v^{\prime} / c\right)}\]

    As the light postulate states, it doesn’t matter if we measure \(c\) in \(S\) or \(S^\prime\), we always get the same number. So for light, we should have \(v^\prime = c =v\), which we can use in equation (\ref{eq:4}) to determine \(a_{21}/a_{11}\):

    \[\begin{equation}\begin{aligned}
    &c=\frac{c-u}{1+\left(a_{21} / a_{11}\right)} \quad \text { so } \quad \frac{a_{21}}{a_{11}}=-\frac{u}{c}
    \end{aligned}\end{equation}.\]
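
    The effect of this choice can be illustrated numerically. The sketch below evaluates the velocity transformation (\ref{eq:4}) for light, once with the Galilean choice \(a_{21}/a_{11}=0\) and once with \(a_{21}/a_{11}=-u/c\). The function name `transform_velocity`, the parameter `ratio` (standing for \(a_{21}/a_{11}\)) and the frame speed \(u = c/2\) are assumptions made for the example.

```python
C = 299_792_458.0  # speed of light in m/s


def transform_velocity(v, u, ratio):
    """Velocity in S' of an object that moves at velocity v in S.

    `ratio` stands for a21/a11 in the general linear transformation,
    so this is v' = (v - u) / (1 + ratio * v / c).
    """
    return (v - u) / (1.0 + ratio * v / C)


u = 0.5 * C  # illustrative speed of S' relative to S

# Galilean choice a21 = 0: light would be measured at c - u in S'.
galilean = transform_velocity(C, u, ratio=0.0)

# Choice required by the light postulate: a21/a11 = -u/c.
relativistic = transform_velocity(C, u, ratio=-u / C)

print(galilean / C, relativistic / C)
```

    With the Galilean choice the speed of light in \(S^\prime\) comes out as \(c-u\); with \(a_{21}/a_{11}=-u/c\) it comes out as \(c\), as the light postulate demands.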

    The transforms now become:

    \[\begin{equation}\begin{aligned}
    x^{\prime} &=a_{11}(x-u t) \\
    c t^{\prime} &=a_{11}\left(c t-\frac{u x}{c}\right) \\
    x &=\frac{1}{a_{11}} \frac{1}{1-u^{2} / c^{2}}\left(x^{\prime}+u t^{\prime}\right) \\
    c t &=\frac{1}{a_{11}} \frac{1}{1-u^{2} / c^{2}}\left(c t^{\prime}+\frac{u x^{\prime}}{c}\right).
    \end{aligned}\label{eq:5}\end{equation}\]

    In equations (\(\ref{eq:5}\)) we used \( \det(A)=a^2_{11}(1-u^2/c^2)\). We are left with one undetermined parameter, the value of \(a_{11}\). We’ll use it to make the transformation symmetric - after all, we could have started with \(S^\prime\) as stationary and \(S\) as moving (with speed \(-u\)), and we should get the same transforms, except for the sign of \(u\). Equating the prefactors in equations (\(\ref{eq:5}a\)) and (\(\ref{eq:5}c\)) gives \(a_{11} = 1/[a_{11}(1-u^2/c^2)]\), or \(a_{11}^2 = 1/(1-u^2/c^2)\). Taking the positive root, so that the transformation reduces to the identity for \(u=0\), we find that \(a_{11} = \gamma(u)\), with \(\gamma(u)\) again defined as

    \[\gamma(u)=\frac{1}{\sqrt{1-(u / c)^{2}}}.\]

    Note that \(\gamma(u)=\gamma(-u)\), in accordance with the earlier notion that it doesn’t matter whether you are in \(S\) watching \(S^\prime\) move at \(u\), or in \(S^\prime\) watching \(S\) move at \(-u\). We have now arrived at the Lorentz transformations:

    \[\begin{equation}\begin{aligned}
    x^{\prime} &=\gamma(u)(x-u t) \\
    c t^{\prime} &=\gamma(u)\left(c t-\frac{u x}{c}\right) \\
    x &=\gamma(u)\left(x^{\prime}+u t^{\prime}\right) \\
    c t &=\gamma(u)\left(c t^{\prime}+\frac{u x^{\prime}}{c}\right)
    \end{aligned}\label{eq:6}\end{equation}\]
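
    A minimal Python sketch of equations (\ref{eq:6}) is given below, mainly to show that the forward and inverse transformations undo each other. The function names `to_primed` and `to_unprimed`, the frame speed \(u = 0.6c\) and the sample event are illustrative assumptions, not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(u):
    """Lorentz factor for frame speed u (requires |u| < c)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def to_primed(x, ct, u):
    """Map an event (x, ct) in S to (x', ct') in S'.

    Uses u * t = (u / c) * ct, so both inputs are lengths.
    """
    g = gamma(u)
    return g * (x - u * ct / C), g * (ct - u * x / C)


def to_unprimed(x_p, ct_p, u):
    """Map an event (x', ct') in S' back to (x, ct) in S."""
    g = gamma(u)
    return g * (x_p + u * ct_p / C), g * (ct_p + u * x_p / C)


u = 0.6 * C              # illustrative frame speed
x, ct = 4.0, 10.0        # illustrative event, both in metres

x_p, ct_p = to_primed(x, ct, u)
x_back, ct_back = to_unprimed(x_p, ct_p, u)

print(x_p, ct_p)
print(x_back, ct_back)
```

    For this event the sketch gives \(x^\prime \approx -2.5\) and \(ct^\prime \approx 9.5\) (with \(\gamma = 1.25\)), and the round trip returns the original \((x, ct)\) up to floating-point rounding. Note that `to_unprimed` is the same map as `to_primed` with \(u\) replaced by \(-u\).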

    The Lorentz transformations transform both space and time. Consequently, our two observers do not only measure space differently, as in the classical system (recall the stationary and comoving coordinates), but they also measure time differently! For small speeds, \(\gamma(u)\) is (very) close to one and the effect negligible, but for high speeds it certainly is not. As we have already seen in the previous section based on the train argument, and see again below, these different time measurements lead to potentially confusing results: the two observers no longer agree on which events are simultaneous, how long a meter stick is, or how long it took to travel from one place to another.
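
    To get a feeling for how close to one \(\gamma(u)\) is at everyday speeds, the following sketch evaluates \(\gamma(u)-1\) for a few speeds. The three speeds are illustrative choices made here, not values from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(u):
    """Lorentz factor for a frame moving at speed u (|u| < c)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


# Illustrative speeds (chosen for this example).
speeds = {
    "car at 30 m/s": 30.0,
    "10% of c": 0.1 * C,
    "90% of c": 0.9 * C,
}

for label, u in speeds.items():
    # Print the deviation from 1, which is what sets the size of the effect.
    print(f"{label}: gamma - 1 = {gamma(u) - 1.0:.3e}")
```

    For the car the deviation is of order \(10^{-15}\), which is at the limit of what double-precision arithmetic can resolve; at \(0.9c\) the factor \(\gamma\) exceeds 2.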

    Before going to the applications, we have a few closing remarks about the Lorentz transformations. First, we put in some effort to make the transformation symmetric between going from \(S\) to \(S^\prime\) and vice versa. We can do more though. Since time and space now both transform, and get mixed in the transformation, it is no longer appropriate to separate them; instead, we’ll consider a combined system of four dimensions known as spacetime. As proper physicists, we should however not compare apples and oranges, or time and space. We already converted the time coordinate to a space coordinate by multiplying it by \(c\). In equations (\(\ref{eq:6}a\)) and (\(\ref{eq:6}c\)) we canceled that \(c\) in front of \(t\) with a \(c\) in the denominator, but it is cleaner to put it back, so we get an even better sense of the equality of space and time in the Lorentz transformation:

    \[\begin{equation}\begin{aligned}
    x^{\prime} &=\gamma(u)(x-(u / c) c t) \\
    c t^{\prime} &=\gamma(u)(c t-(u / c) x)
    \end{aligned}\label{eq:7}\end{equation}\]

    Note that in equations (\(\ref{eq:7}\)) the velocity \(u\) only appears as a fraction of \(c\): we only have expressions of the form \(u/c\), making all the coefficients of our transforms nicely dimensionless.

    Second, all equations in this section are for a transformation between a stationary frame and one moving in the positive \(x\)-direction with speed \(u\). Since we’re in principle free to choose our coordinates, we can always re-label or construct our axes to match this setup. In practice, that may not always be convenient, so we could also consider movement in a different direction. Of course, moving in either the \(y\) or \(z\) direction just makes those axes swap with the \(x\) axis considered here, so we won’t bother to explicitly write down those transformations. We can also write down the transformation for movement with a general velocity \(\mathbf{u}\):

    \[\begin{equation}\begin{array}{l}
    {\mathbf{x}^{\prime}=\mathbf{x}+(\gamma(u)-1) \frac{\mathbf{u} \cdot \mathbf{x}}{\mathbf{u} \cdot \mathbf{u}} \mathbf{u}-\frac{\gamma(u)}{c} \mathbf{u} c t} \\
    {c t^{\prime}=\gamma(u)\left(c t-\frac{\mathbf{u} \cdot \mathbf{x}}{c}\right)}
    \end{array}\end{equation}\]

    where \(u=|\mathbf{u}|\) is the speed of the moving frame, and \(\mathbf{x} = (x,y,z)\).
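
    The general-direction transformation can be sketched in a few lines of Python. To keep the numbers dimensionless, the function below takes the frame velocity divided by \(c\) as `beta_vec`; that parametrization, the function name `boost` and the sample event are assumptions made for this example. As a sanity check, a velocity along \(x\) should reproduce the one-dimensional transformation and leave \(y\) and \(z\) untouched.

```python
import math


def boost(x_vec, ct, beta_vec):
    """Boost the event (x_vec, ct) into a frame moving with velocity c * beta_vec.

    x_vec and beta_vec are 3-tuples; beta_vec is the frame velocity
    divided by c and must have length smaller than 1.
    """
    beta_sq = sum(b * b for b in beta_vec)
    if beta_sq == 0.0:
        return tuple(x_vec), ct  # no relative motion, nothing to do
    g = 1.0 / math.sqrt(1.0 - beta_sq)
    beta_dot_x = sum(b * x for b, x in zip(beta_vec, x_vec))
    # x' = x + (gamma - 1) (u.x / u.u) u - (gamma / c) u ct, written with u = c * beta
    coeff = (g - 1.0) * beta_dot_x / beta_sq - g * ct
    x_new = tuple(x + coeff * b for x, b in zip(x_vec, beta_vec))
    # ct' = gamma (ct - u.x / c)
    ct_new = g * (ct - beta_dot_x)
    return x_new, ct_new


# Illustrative event and a frame moving along x at 0.6 c.
x_new, ct_new = boost((4.0, 1.0, 2.0), 10.0, (0.6, 0.0, 0.0))
print(x_new, ct_new)
```

    For this input the \(x\) and \(ct\) components come out as approximately \(-2.5\) and \(9.5\), matching the one-dimensional transformation with \(u=0.6c\), while the \(y\) and \(z\) components are unchanged.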

    Finally, we note that the collection of Lorentz transformations in the \(x\) direction\(^2\) forms a group under composition. If a system \(S^\prime\) moves with respect to \(S\) with velocity \(u\), and \(S^{\prime\prime}\) moves with respect to \(S^\prime\) with velocity \(v\), then you can make a Lorentz transformation from \(S\) to \(S^\prime\) and from \(S^\prime\) to \(S^{\prime\prime}\), but also from \(S\) to \(S^{\prime\prime}\) directly\(^3\). As you can check for yourself (Problem 11.1), the transformation from \(S\) to \(S^{\prime\prime}\) is indeed another Lorentz transformation. The velocity of \(S^{\prime\prime}\) with respect to \(S\) is again not \(u+v\), but \((u+v)/(1+uv/c^2)\), as given by the relativistic velocity addition equation (11.14) derived below.
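
    The closure under composition can be checked numerically. The sketch below writes a boost along \(x\) as a \(2\times 2\) matrix acting on \((x, ct)\), multiplies two such matrices, and compares the product with a single boost at the combined velocity. The helper names `boost_matrix` and `matmul`, and the two speeds (given as fractions of \(c\)), are illustrative assumptions; this is a numerical check, not a substitute for the algebra of Problem 11.1.

```python
import math


def boost_matrix(beta):
    """Matrix of a boost along x acting on (x, ct), with beta = u / c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return ((g, -g * beta), (-g * beta, g))


def matmul(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


beta_u, beta_v = 0.5, 0.6  # illustrative values of u/c and v/c

# S -> S' (speed u) followed by S' -> S'' (speed v).
composed = matmul(boost_matrix(beta_v), boost_matrix(beta_u))

# Single boost S -> S'' at the relativistically added velocity.
beta_total = (beta_u + beta_v) / (1.0 + beta_u * beta_v)
direct = boost_matrix(beta_total)

print(composed)
print(direct)
```

    The two matrices agree to rounding error, and the combined speed stays below \(c\) even though \(u+v\) would exceed it.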

    \(^1\) The derivation that follows is not mathematically difficult (these are all linear equations, after all), but it contains a fairly large number of steps. The easiest way to get your head around them is to take a piece of paper and do them yourself.

    \(^2\) These transformations in a given direction are sometimes also referred to as Lorentz boosts.

    \(^3\) Think, for example, of \(S\) as a stationary platform, \(S^\prime\) as a moving train, and \(S^{\prime\prime}\) as a toy train in the moving train.


    This page titled 11.2: Derivation of the Lorentz Transformations is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Timon Idema (TU Delft Open) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.